-
Notifications
You must be signed in to change notification settings - Fork 10
feat(config): Add fingerprint ignore tags to some config fields #46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| # Component Identity & Change Detection | ||
|
|
||
| `azldev` computes a fingerprint for each component based on its config and sources. This enables: | ||
|
|
||
| - **Change detection**: identify which components have changed between two commits or branches, even if the change is a non-obvious config inheritance. | ||
| - **Build optimization**: only rebuild changed components and their dependents, skipping unchanged ones. | ||
| - **Automatic release bumping**: increment the release tag of changed packages automatically, and include the commits that caused the change in the changelog. | ||
|
|
||
| ## Fingerprint Inputs | ||
|
|
||
| A component's fingerprint is a SHA256 combining: | ||
|
|
||
| 1. **Config hash** — `hashstructure.Hash()` of the resolved `ComponentConfig` (after all merging). Fields tagged `fingerprint:"-"` are excluded. | ||
| 2. **Source identity** — content hash for local specs (all files in the spec directory), commit hash for upstream. | ||
| 3. **Overlay file hashes** — SHA256 of each file referenced by overlay `Source` fields. | ||
| 4. **Distro name + version** | ||
| 5. **Manual release bump counter** — increments with each manual release bump, ensuring a new fingerprint even if there are no config or source changes. | ||
|
|
||
dmcilvaney marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| Global change propagation works automatically: the fingerprint operates on the fully-merged config, so a change to a distro or group default changes the resolved config of every inheriting component. | ||
|
|
||
| ## `fingerprint:"-"` Tag System | ||
|
|
||
| The `hashstructure` library uses `TagName: "fingerprint"`. Untagged fields are **included by default** (safe default: false positive > false negative). | ||
|
|
||
| A guard test (`TestAllFingerprintedFieldsHaveDecision`) reflects over all fingerprinted structs and maintains a bi-directional allowlist of exclusions. It fails if a `fingerprint:"-"` tag is added without registering it, or if a registered exclusion's tag is removed. | ||
|
|
||
| ### Adding a New Config Field | ||
|
|
||
| 1. Add the field to the struct in `internal/projectconfig/`. | ||
| 2. **If NOT a build input**: add `fingerprint:"-"` to the struct tag and register it in `expectedExclusions` in `internal/projectconfig/fingerprint_test.go`. | ||
| 3. **If a build input**: do nothing — included by default. | ||
| 4. Run `mage unit`. | ||
|
|
||
| ### Adding a New Source Type | ||
|
|
||
| 1. Implement `SourceIdentityProvider` on your provider (see `ResolveLocalSourceIdentity` in `localidentity.go` for a simple example). | ||
| 2. Add a case to `sourceManager.ResolveSourceIdentity()` in `sourcemanager.go`. | ||
| 3. Add tests in `identityprovider_test.go`. | ||
|
|
||
| ## Known Limitations | ||
|
|
||
| - It is difficult to determine WHY a diff occurred (e.g., which specific field changed) since the fingerprint is a single opaque hash. The JSON output includes an `inputs` breakdown (`configHash`, `sourceIdentity`, `overlayFileHashes`, etc.) that can help narrow it down by comparing the two identity files manually. | ||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -49,7 +49,7 @@ type Origin struct { | |||||
| // SourceFileReference encapsulates a reference to a specific source file artifact. | ||||||
| type SourceFileReference struct { | ||||||
| // Reference to the component to which the source file belongs. | ||||||
| Component ComponentReference `toml:"-" json:"-"` | ||||||
| Component ComponentReference `toml:"-" json:"-" fingerprint:"-"` | ||||||
|
|
||||||
| // Name of the source file; must be non-empty. | ||||||
| Filename string `toml:"filename" json:"filename"` | ||||||
|
|
@@ -61,7 +61,7 @@ type SourceFileReference struct { | |||||
| HashType fileutils.HashType `toml:"hash-type,omitempty" json:"hashType,omitempty" jsonschema:"enum=SHA256,enum=SHA512,title=Hash type,description=Hash algorithm used for the hash value"` | ||||||
|
|
||||||
| // Origin for this source file. When omitted, the file is resolved via the lookaside cache. | ||||||
| Origin Origin `toml:"origin,omitempty" json:"origin,omitempty"` | ||||||
| Origin Origin `toml:"origin,omitempty" json:"origin,omitempty" fingerprint:"-"` | ||||||
|
||||||
| Origin Origin `toml:"origin,omitempty" json:"origin,omitempty" fingerprint:"-"` | |
| Origin Origin `toml:"origin,omitempty" json:"origin,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add a check to the calculation step that errors on this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -18,7 +18,7 @@ type DistroReference struct { | |||||
| // Version of the referenced distro. | ||||||
| Version string `toml:"version,omitempty" json:"version,omitempty" jsonschema:"title=Version,description=Version of the referenced distro"` | ||||||
| // Snapshot date/time for source code if specified components will use source as it existed at this time. | ||||||
| Snapshot string `toml:"snapshot,omitempty" json:"snapshot,omitempty" jsonschema:"format=date-time,title=Snapshot,description=If specified use source code as it existed at this date/time"` | ||||||
| Snapshot string `toml:"snapshot,omitempty" json:"snapshot,omitempty" jsonschema:"format=date-time,title=Snapshot,description=If specified use source code as it existed at this date/time" fingerprint:"-"` | ||||||
|
||||||
| Snapshot string `toml:"snapshot,omitempty" json:"snapshot,omitempty" jsonschema:"format=date-time,title=Snapshot,description=If specified use source code as it existed at this date/time" fingerprint:"-"` | |
| Snapshot string `toml:"snapshot,omitempty" json:"snapshot,omitempty" jsonschema:"format=date-time,title=Snapshot,description=If specified use source code as it existed at this date/time"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fully resolved commit for each component is part of the fingerprint. That is more reliable than the snapshot date (also, updating the snapshot date would cause a rebuild of every single package, even if many of them didn't change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,115 @@ | ||
| // Copyright (c) Microsoft Corporation. | ||
| // Licensed under the MIT License. | ||
|
|
||
| package projectconfig_test | ||
|
|
||
| import ( | ||
| "reflect" | ||
| "testing" | ||
|
|
||
| "github.com/microsoft/azure-linux-dev-tools/internal/projectconfig" | ||
| "github.com/stretchr/testify/assert" | ||
| ) | ||
|
|
||
| // TestAllFingerprintedFieldsHaveDecision verifies that every field in every | ||
| // fingerprinted struct has been consciously categorized as either included | ||
| // (no fingerprint tag) or excluded (`fingerprint:"-"`). | ||
| // | ||
| // This test serves two purposes: | ||
| // 1. It ensures that newly added fields default to **included** in the fingerprint | ||
| // (the safe default — you get a false positive, never a false negative). | ||
| // 2. It catches accidental removal of `fingerprint:"-"` tags from excluded fields, | ||
| // since all exclusions are tracked in expectedExclusions. | ||
| func TestAllFingerprintedFieldsHaveDecision(t *testing.T) { | ||
dmcilvaney marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| // All struct types whose fields participate in component fingerprinting. | ||
| // When adding a new struct that feeds into the fingerprint, add it here. | ||
| fingerprintedStructs := []reflect.Type{ | ||
| reflect.TypeFor[projectconfig.ComponentConfig](), | ||
| reflect.TypeFor[projectconfig.ComponentBuildConfig](), | ||
| reflect.TypeFor[projectconfig.CheckConfig](), | ||
| reflect.TypeFor[projectconfig.PackageConfig](), | ||
| reflect.TypeFor[projectconfig.ComponentOverlay](), | ||
| reflect.TypeFor[projectconfig.SpecSource](), | ||
| reflect.TypeFor[projectconfig.DistroReference](), | ||
| reflect.TypeFor[projectconfig.SourceFileReference](), | ||
| } | ||
dmcilvaney marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| // Maps "StructName.FieldName" for every field that should carry a | ||
| // `fingerprint:"-"` tag. Catches accidental tag removal. | ||
| // | ||
| // Each entry documents WHY the field is excluded from the fingerprint: | ||
| expectedExclusions := map[string]bool{ | ||
| // ComponentConfig.Name — metadata, already the map key in project config. | ||
| "ComponentConfig.Name": true, | ||
| // ComponentConfig.SourceConfigFile — internal bookkeeping reference, not a build input. | ||
| "ComponentConfig.SourceConfigFile": true, | ||
|
|
||
| // ComponentBuildConfig.Failure — CI policy (expected failure tracking), not a build input. | ||
| "ComponentBuildConfig.Failure": true, | ||
| // ComponentBuildConfig.Hints — scheduling hints (e.g. expensive), not a build input. | ||
| "ComponentBuildConfig.Hints": true, | ||
|
|
||
| // CheckConfig.SkipReason — human documentation for why check is skipped, not a build input. | ||
| "CheckConfig.SkipReason": true, | ||
|
|
||
| // PackageConfig.Publish — post-build routing (where to publish), not a build input. | ||
| "PackageConfig.Publish": true, | ||
|
|
||
| // ComponentOverlay.Description — human-readable documentation for the overlay. | ||
| "ComponentOverlay.Description": true, | ||
|
|
||
| // SourceFileReference.Component — back-reference to parent, not a build input. | ||
| "SourceFileReference.Component": true, | ||
|
|
||
| // DistroReference.Snapshot — snapshot timestamp is not a build input; the resolved | ||
| // upstream commit hash (captured separately via SourceIdentity) is what matters. | ||
| // Excluding this prevents a snapshot bump from marking all upstream components as changed. | ||
| "DistroReference.Snapshot": true, | ||
|
|
||
| // SourceFileReference.Origin — download location metadata (URI, type), not a build input. | ||
| // The file content is already captured by Filename + Hash; changing a CDN URL should not | ||
| // trigger a rebuild. | ||
| "SourceFileReference.Origin": true, | ||
| } | ||
|
|
||
| // Collect all actual exclusions found via reflection, and flag invalid tag values. | ||
| actualExclusions := make(map[string]bool) | ||
|
|
||
| for _, st := range fingerprintedStructs { | ||
| for i := range st.NumField() { | ||
| field := st.Field(i) | ||
| key := st.Name() + "." + field.Name | ||
|
|
||
| tag := field.Tag.Get("fingerprint") | ||
|
|
||
| switch tag { | ||
| case "": | ||
| // No tag — included by default (the safe default). | ||
| case "-": | ||
| actualExclusions[key] = true | ||
| default: | ||
| // hashstructure only recognises "" (include) and "-" (exclude). | ||
| // Any other value is silently treated as included, which is | ||
| // almost certainly a typo. | ||
| assert.Failf(t, "invalid fingerprint tag", | ||
| "field %q has unrecognised fingerprint tag value %q — "+ | ||
| "only `fingerprint:\"-\"` (exclude) is valid; "+ | ||
| "remove the tag to include the field", key, tag) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Verify every expected exclusion is actually present. | ||
| for key := range expectedExclusions { | ||
| assert.Truef(t, actualExclusions[key], | ||
| "expected field %q to have `fingerprint:\"-\"` tag, but it does not — "+ | ||
| "was the tag accidentally removed?", key) | ||
| } | ||
|
|
||
| // Verify no unexpected exclusions exist. | ||
| for key := range actualExclusions { | ||
| assert.Truef(t, expectedExclusions[key], | ||
| "field %q has `fingerprint:\"-\"` tag but is not in expectedExclusions — "+ | ||
| "add it to expectedExclusions if the exclusion is intentional", key) | ||
| } | ||
| } | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get a general overview of this new tag added to some of the human-targeted docs too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Final PR has a section on it, figured I'd put the whole feature set in a doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should document as we go. It's clear directions change quickly here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved docs up to this PR