
feat(scanner): Add scan result parser#24

Merged
heliocastro merged 1 commit into main from
feat/scanner_result
Mar 16, 2026

Conversation

@heliocastro
Owner

No description provided.

@heliocastro heliocastro self-assigned this Mar 13, 2026
Copilot AI review requested due to automatic review settings March 13, 2026 15:15
@heliocastro heliocastro added the enhancement New feature or request label Mar 13, 2026

Copilot AI left a comment

Pull request overview

This PR extends the Python ORT result model layer to support scanner results, including parsing snippet findings from ORT YAML output, and updates project metadata/dependencies accordingly.

Changes:

  • Add scanner-related models (ScannerRun, ScanResult, ScanSummary, ScannerDetails) plus supporting types (snippet findings, provenance resolution results, file lists, storage configs).
  • Add SPDX license-expression validation for snippet and license findings.
  • Update YAML loader and project/tooling configs (version bump, dependencies, pre-commit hooks, README usage example).

Reviewed changes

Copilot reviewed 30 out of 31 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| uv.lock | Locks new runtime deps and bumps dev tools (ruff/ty). |
| tests/test_scan_result.py | Adds tests covering scanner run + scan result parsing from YAML. |
| src/ort/utils/yaml_loader.py | Simplifies loader selection (CSafeLoader fallback). |
| src/ort/models/vcstype.py | Renames validator to satisfy naming/linting. |
| src/ort/models/text_location.py | Makes TextLocation hashable/comparable for set usage. |
| src/ort/models/snippet_finding.py | Adds SnippetFinding model (hashable for sets). |
| src/ort/models/snippet.py | Adds Snippet model + SPDX expression validation. |
| src/ort/models/scanner_run.py | Adds ScannerRun model to parse scanner run output. |
| src/ort/models/scanner_details.py | Adds ScannerDetails model. |
| src/ort/models/scan_summary.py | Adds ScanSummary model (findings + issues). |
| src/ort/models/scan_result.py | Adds ScanResult model (provenance/scanner/summary). |
| src/ort/models/provenance_resolution_result.py | Adds provenance resolution result model for scanner run. |
| src/ort/models/provenance.py | Refactors provenance hierarchy for scan/snippet provenance parsing. |
| src/ort/models/ort_result.py | Adds optional scanner section to OrtResult. |
| src/ort/models/license_finding.py | Adds LicenseFinding model + SPDX expression validation. |
| src/ort/models/file_list.py | Adds FileList and Entry models for scanner file lists. |
| src/ort/models/copyright_finding.py | Adds CopyrightFinding model. |
| src/ort/models/config/scanner_configuration.py | Adds scanner configuration model. |
| src/ort/models/config/scan_storage_configuration.py | Adds scan storage configuration models + storage type enum. |
| src/ort/models/config/s3_file_storage_configuration.py | Adds S3 file storage config model. |
| src/ort/models/config/provenance_storage_configuration.py | Adds provenance storage config model. |
| src/ort/models/config/postgres_connection.py | Adds Postgres connection config model. |
| src/ort/models/config/local_file_storage_configuration.py | Adds local file storage config model. |
| src/ort/models/config/http_file_storage_configuration.py | Adds HTTP file storage config model. |
| src/ort/models/config/file_storage_configuration.py | Adds file storage root config model. |
| src/ort/models/config/file_list_storage_configuration.py | Adds file list storage config model. |
| src/ort/models/config/file_archiver_configuration.py | Adds file archiver config model. |
| src/ort/models/base_run.py | Updates SPDX header metadata. |
| pyproject.toml | Bumps package version, adds deps, updates tool versions/ruff ignores. |
| prek.toml | Bumps hook versions and adds license-expression to ty hook deps. |
| README.md | Expands docs with installation + YAML parsing example. |
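
The yaml_loader.py entry above describes preferring CSafeLoader with a fallback. A minimal sketch of that selection pattern follows, written against stand-in namespaces so it runs without PyYAML. `pick_safe_loader` is a hypothetical helper name, not the function in the PR; with PyYAML installed, passing the `yaml` module itself should behave the same way, since `CSafeLoader` is only present when PyYAML was built with the LibYAML bindings.

```python
from types import SimpleNamespace


def pick_safe_loader(yaml_module):
    """Return CSafeLoader when the module exposes it, else SafeLoader."""
    # getattr with a default avoids a try/except around the attribute lookup.
    return getattr(yaml_module, "CSafeLoader", yaml_module.SafeLoader)


# Stand-ins for a PyYAML build with and without the LibYAML bindings.
with_c = SimpleNamespace(SafeLoader="pure-python", CSafeLoader="libyaml")
without_c = SimpleNamespace(SafeLoader="pure-python")

fast = pick_safe_loader(with_c)
slow = pick_safe_loader(without_c)
```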


@heliocastro heliocastro force-pushed the feat/scanner_result branch from c0a1e89 to 04cd0be Compare March 13, 2026 15:25
Copilot AI review requested due to automatic review settings March 16, 2026 13:10
@heliocastro heliocastro force-pushed the feat/scanner_result branch from 04cd0be to b8e315c Compare March 16, 2026 13:10
@heliocastro heliocastro force-pushed the feat/scanner_result branch from b8e315c to 36910c6 Compare March 16, 2026 13:16

Copilot AI left a comment

Pull request overview

This PR adds scanner / scan-result parsing support to the Python ORT model layer (including snippet findings), updates YAML loading behavior, and wires the new scanner run into OrtResult.

Changes:

  • Add new scanner-related models (ScannerRun, ScanResult, ScanSummary, ScannerDetails) plus snippet-related models (Snippet, SnippetFinding) and supporting structures (e.g., FileList, provenance resolution results).
  • Add SPDX license-expression validation for snippet and license findings (new license-expression dependency).
  • Update docs / tooling metadata (README example, dependency bumps, pre-commit hook revs).

Reviewed changes

Copilot reviewed 33 out of 34 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| uv.lock | Locks new dependencies (license-expression, etc.) and bumps dev tooling versions. |
| pyproject.toml | Bumps project version and adds dependencies / dev dependency bumps. |
| prek.toml | Updates pre-commit hook revisions and adds license-expression to ty hook deps. |
| README.md | Adds installation + YAML parsing usage example. |
| tests/test_scan_result.py | Adds tests covering scan result parsing (including YAML fixture loading). |
| src/ort/utils/yaml_loader.py | Simplifies loader selection to prefer CSafeLoader when present. |
| src/ort/models/vcstype.py | Renames the model validator method to satisfy naming rules. |
| src/ort/models/text_location.py | Makes TextLocation hashable/comparable for use in sets. |
| src/ort/models/source_code_origin.py | Switches to ValidatedIntEnum for origin values. |
| src/ort/models/snippet_finding.py | Introduces SnippetFinding model with hashing/equality. |
| src/ort/models/snippet.py | Introduces Snippet model + SPDX expression validation. |
| src/ort/models/scanner_run.py | Introduces ScannerRun model representing scanner execution results. |
| src/ort/models/scanner_details.py | Introduces ScannerDetails model for scanner metadata. |
| src/ort/models/scan_summary.py | Introduces ScanSummary model with findings and timing info. |
| src/ort/models/scan_result.py | Introduces ScanResult model for per-provenance scan results. |
| src/ort/models/resolutions.py | Migrates resolution reason enums to ValidatedIntEnum. |
| src/ort/models/provenance_resolution_result.py | Introduces provenance resolution result model w/ hashing/equality. |
| src/ort/models/provenance.py | Refactors provenance typing to a discriminated union and adds hashing/equality. |
| src/ort/models/ort_result.py | Adds optional scanner run to the top-level OrtResult. |
| src/ort/models/license_finding.py | Introduces LicenseFinding model + SPDX expression validation. |
| src/ort/models/file_list.py | Introduces FileList + entries for scanned provenance file listings. |
| src/ort/models/copyright_finding.py | Introduces CopyrightFinding model. |
| src/ort/models/config/scanner_configuration.py | Introduces scanner configuration model used by ScannerRun. |
| src/ort/models/config/scan_storage_configuration.py | Adds scan storage configuration models + StorageType. |
| src/ort/models/config/s3_file_storage_configuration.py | Adds S3 storage configuration model. |
| src/ort/models/config/provenance_storage_configuration.py | Adds provenance storage configuration model. |
| src/ort/models/config/postgres_connection.py | Adds Postgres connection configuration model. |
| src/ort/models/config/local_file_storage_configuration.py | Adds local file storage configuration model. |
| src/ort/models/config/license_finding_curation_reason.py | Migrates curation reason enum to ValidatedIntEnum. |
| src/ort/models/config/http_file_storage_configuration.py | Adds HTTP file storage configuration model. |
| src/ort/models/config/file_storage_configuration.py | Adds file storage configuration wrapper model. |
| src/ort/models/config/file_list_storage_configuration.py | Adds file list storage configuration model. |
| src/ort/models/config/file_archiver_configuration.py | Adds file archiver configuration model. |
| src/ort/models/base_run.py | Updates SPDX header metadata. |


@@ -1,4 +1,5 @@
# SPDX-FileCopyrightText: 2025 Helio Chissini de Castro <heliocastro@gmail.com>
# SPDX-FileCopyrightText: 2025 Helio Chissini de Castro <dev@heliocastro.info>
# # SPDX-FileCopyrightText: 2026 CARIAD SE
Comment on lines +25 to +28
        description="The time the analyzer was started.",
    )
    end_time: datetime = Field(
        description="The time the analyzer has finished.",
@heliocastro heliocastro force-pushed the feat/scanner_result branch from 36910c6 to d9e1b10 Compare March 16, 2026 13:24
Signed-off-by: Helio Chissini de Castro <dev@heliocastro.info>
Signed-off-by: Helio Chissini de Castro <helio.chissini.de.castro@cariad.technology>
Copilot AI review requested due to automatic review settings March 16, 2026 13:32
@heliocastro heliocastro force-pushed the feat/scanner_result branch from d9e1b10 to be13bc8 Compare March 16, 2026 13:32
@heliocastro heliocastro merged commit d1f88bc into main Mar 16, 2026
18 checks passed
@heliocastro heliocastro deleted the feat/scanner_result branch March 16, 2026 13:33

Copilot AI left a comment

Pull request overview

This PR adds support for parsing ORT scanner results by introducing new scanner-related Pydantic models (scan results, summaries, snippet findings, scanner configuration/storage config) and wiring them into OrtResult, along with test coverage and dependency updates.

Changes:

  • Add new scanner domain models (ScannerRun, ScanResult, ScanSummary, snippet/license/copyright finding models, storage config models).
  • Update provenance parsing to use a Pydantic discriminator-based union.
  • Add tests for YAML-driven parsing of scan results; update dependencies / tooling versions (including license-expression).
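
The provenance change listed above relies on a discriminator to decide which variant a mapping should be parsed into. The sketch below shows the same idea with only the standard library; it is an analogue, not the PR's Pydantic code. The class names, the keys `source_artifact`, `vcs_info` and `resolved_revision`, and the helper functions are all assumptions made for illustration. In Pydantic v2 the equivalent role is played by a discriminator declared on the union type.

```python
from dataclasses import dataclass


@dataclass
class ArtifactProvenance:
    source_artifact: dict


@dataclass
class RepositoryProvenance:
    vcs_info: dict
    resolved_revision: str


@dataclass
class UnknownProvenance:
    pass


def provenance_tag(data: dict) -> str:
    """Discriminator: name the variant from the keys present in the mapping."""
    if "source_artifact" in data:
        return "artifact"
    if "vcs_info" in data:
        return "repository"
    return "unknown"


# Tag-to-class table; each tag selects exactly one variant.
VARIANTS = {
    "artifact": ArtifactProvenance,
    "repository": RepositoryProvenance,
    "unknown": UnknownProvenance,
}


def parse_provenance(data: dict):
    """Build the variant selected by the discriminator from the raw mapping."""
    return VARIANTS[provenance_tag(data)](**data)


repo = parse_provenance(
    {"vcs_info": {"url": "https://example.invalid/repo.git"}, "resolved_revision": "abc123"}
)
unknown = parse_provenance({})
```

Dispatching on a tag first means a malformed mapping fails against one specific variant, which tends to give clearer errors than trying every member of the union in turn.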

Reviewed changes

Copilot reviewed 34 out of 35 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| src/ort/models/ort_result.py | Adds optional scanner: ScannerRun to the ORT result model. |
| src/ort/models/scanner_run.py | Introduces the top-level scanner run model and its collections. |
| src/ort/models/scan_result.py | Adds a scan result model including hashing/equality behavior. |
| src/ort/models/scan_summary.py | Adds scan summary model including findings and issues. |
| src/ort/models/snippet.py / snippet_finding.py | Adds snippet models and SPDX license validation. |
| src/ort/models/provenance.py | Reworks provenance into a discriminator-based union; adds hashing/equality. |
| src/ort/models/config/* | Adds scanner configuration and storage config models for scanner run parsing. |
| tests/test_scan_result.py | Adds tests to validate model construction and YAML parsing. |
| pyproject.toml / uv.lock / prek.toml | Bumps versions and adds license-expression dependency + hook deps. |
| README.md | Expands install + usage example. |


Comment on lines +42 to +47
        return hash(self.provenance)

    def __eq__(self, other) -> bool:
        if not isinstance(other, ScanResult):
            return NotImplemented
        return self.provenance == other.provenance
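
The `__hash__`/`__eq__` pair above keys identity on `provenance` alone. A small sketch of what that implies for set membership, using a hypothetical stand-in class (`ScanResultSketch`, with plain strings for provenance and scanner) rather than the PR's Pydantic model:

```python
class ScanResultSketch:
    """Hypothetical stand-in that hashes and compares on provenance only."""

    def __init__(self, provenance: str, scanner: str) -> None:
        self.provenance = provenance
        self.scanner = scanner

    def __hash__(self) -> int:
        return hash(self.provenance)

    def __eq__(self, other) -> bool:
        if not isinstance(other, ScanResultSketch):
            return NotImplemented
        return self.provenance == other.provenance


# Same provenance, different scanners.
a = ScanResultSketch("git+https://example.invalid/repo@abc", scanner="ScanCode")
b = ScanResultSketch("git+https://example.invalid/repo@abc", scanner="SCANOSS")

# The set treats them as one element because hash and equality ignore `scanner`.
unique = {a, b}
```

Under these assumptions, two results for the same provenance from different scanners collapse into a single set entry, which may or may not be the intended behavior.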
Comment on lines +65 to +69
    ort_type: StorageType = Field(
        alias="type",
        default="PROVENANCE_BASED",
        description=("The way that scan results are stored, defaults to StorageType.PROVENANCE_BASED."),
    )
        description="The path of the file relative to the root of the provenance corresponding "
        "to the enclosing [FileList].",
    )
    sha1: str = Field(..., description="The sha1 checksum of the file, consisting of 40 lowercase hexadecimal digits.")
Comment on lines +39 to +44
        return hash(self.affected_path)

    def __eq__(self, other) -> bool:
        if not isinstance(other, Issue):
            return NotImplemented
        return self.affected_path == other.affected_path
Comment on lines +80 to +84
    ort_type: StorageType = Field(
        alias="type",
        default="PROVENANCE_BASED",
        description=("The way that scan results are stored, defaults to StorageType.PROVENANCE_BASED."),
    )
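
The field above is typed as `StorageType` but its default is the raw string `"PROVENANCE_BASED"`. The sketch below illustrates the difference between the two, assuming `StorageType` is a plain `Enum` with string values; the PR excerpt does not show its definition, so the members here are hypothetical. Pydantic does not validate field defaults unless `validate_default` is enabled, so a raw-string default may remain a string on instances that rely on it.

```python
from enum import Enum


class StorageType(Enum):
    # Hypothetical members; the PR's actual enum definition is not shown.
    PACKAGE_BASED = "PACKAGE_BASED"
    PROVENANCE_BASED = "PROVENANCE_BASED"


raw_default = "PROVENANCE_BASED"
member_default = StorageType.PROVENANCE_BASED

# A plain Enum member does not compare equal to its underlying value.
equal_without_coercion = raw_default == member_default

# Looking the string up by value returns the canonical member.
coerced_is_member = StorageType(raw_default) is member_default
```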
Comment on lines +13 to +35
class IssueResolutionReason(ValidatedIntEnum):
    BUILD_TOOL_ISSUE = 1
    CANT_FIX_ISSUE = 2
    SCANNER_ISSUE = 3

class RuleViolationResolutionReason(Enum):
    CANT_FIX_EXCEPTION = "CANT_FIX_EXCEPTION"
    DYNAMIC_LINKAGE_EXCEPTION = "DYNAMIC_LINKAGE_EXCEPTION"
    EXAMPLE_OF_EXCEPTION = "EXAMPLE_OF_EXCEPTION"
    LICENSE_ACQUIRED_EXCEPTION = "LICENSE_ACQUIRED_EXCEPTION"
    NOT_MODIFIED_EXCEPTION = "NOT_MODIFIED_EXCEPTION"
    PATENT_GRANT_EXCEPTION = "PATENT_GRANT_EXCEPTION"

class RuleViolationResolutionReason(ValidatedIntEnum):
    CANT_FIX_EXCEPTION = 1
    DYNAMIC_LINKAGE_EXCEPTION = 2
    EXAMPLE_OF_EXCEPTION = 3
    LICENSE_ACQUIRED_EXCEPTION = 4
    NOT_MODIFIED_EXCEPTION = 5
    PATENT_GRANT_EXCEPTION = 6

class VulnerabilityResolutionReason(Enum):
    CANT_FIX_VULNERABILITY = "CANT_FIX_VULNERABILITY"
    INEFFECTIVE_VULNERABILITY = "INEFFECTIVE_VULNERABILITY"
    INVALID_MATCH_VULNERABILITY = "INVALID_MATCH_VULNERABILITY"
    MITIGATED_VULNERABILITY = "MITIGATED_VULNERABILITY"
    NOT_A_VULNERABILITY = "NOT_A_VULNERABILITY"
    WILL_NOT_FIX_VULNERABILITY = "WILL_NOT_FIX_VULNERABILITY"
    WORKAROUND_FOR_VULNERABILITY = "WORKAROUND_FOR_VULNERABILITY"

class VulnerabilityResolutionReason(ValidatedIntEnum):
    CANT_FIX_VULNERABILITY = 1
    INEFFECTIVE_VULNERABILITY = 2
    INVALID_MATCH_VULNERABILITY = 3
    MITIGATED_VULNERABILITY = 4
    NOT_A_VULNERABILITY = 5
    WILL_NOT_FIX_VULNERABILITY = 6
    WORKAROUND_FOR_VULNERABILITY = 7
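
The diff above moves the reason enums from string-valued `Enum` classes to `ValidatedIntEnum`, whose implementation is not shown in this excerpt. One way such a base class could keep accepting the member names that appear in ORT YAML is sketched below. `NameOrValueIntEnum` is a hypothetical stand-in and should not be read as the project's actual `ValidatedIntEnum`.

```python
from enum import IntEnum


class NameOrValueIntEnum(IntEnum):
    """Hypothetical stand-in: integer members that can also be looked up by name."""

    @classmethod
    def _missing_(cls, value):
        # Enum calls this hook when the normal lookup by value fails.
        if isinstance(value, str):
            return cls.__members__.get(value.upper())
        return None


class IssueResolutionReason(NameOrValueIntEnum):
    BUILD_TOOL_ISSUE = 1
    CANT_FIX_ISSUE = 2
    SCANNER_ISSUE = 3


by_name = IssueResolutionReason("SCANNER_ISSUE")
by_value = IssueResolutionReason(3)
```

Returning `None` from `_missing_` lets the enum machinery raise its usual `ValueError` for unknown inputs, so invalid names still fail loudly.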
Comment on lines +45 to +51
    issues: list[Issue] = Field(
        default_factory=list,
        description=(
            "The list of issues that occurred during the scan. This property is "
            "not serialized if the list is empty to reduce the size of the result "
            "file."
        ),
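
The description above says an empty `issues` list is left out of the serialized output. A `default_factory=list` on its own only supplies the default; omitting the key needs an explicit step at serialization time. Whether the PR wires that up is not visible in this excerpt. The sketch below shows the idea with a standard-library dataclass; `ScanSummarySketch` and `to_dict` are hypothetical names.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ScanSummarySketch:
    license_findings: list = field(default_factory=list)
    issues: list = field(default_factory=list)

    def to_dict(self) -> dict:
        data = asdict(self)
        # Drop `issues` only when it is empty, as the field description states.
        if not data["issues"]:
            del data["issues"]
        return data


clean = ScanSummarySketch(license_findings=["MIT"]).to_dict()
noisy = ScanSummarySketch(issues=["timeout"]).to_dict()
```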
Comment on lines +64 to +70
    @field_validator("license", mode="before")
    @classmethod
    def validate_spdx(cls, value):
        try:
            licensing = get_spdx_licensing()
            licensing.parse(value)
            return value
Comment on lines +37 to +43
    @field_validator("license", mode="before")
    @classmethod
    def validate_spdx(cls, value):
        try:
            licensing = get_spdx_licensing()
            licensing.parse(value)
            return value
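
The validator fragments above are cut off before their `except` branch. The general shape of such a validator is: parse the value, return it unchanged on success, and convert the parser's error into a `ValueError` so the model reports a validation failure. The sketch below shows that shape with a toy parser so it runs without third-party packages. `parse_expression`, `KNOWN_IDS` and `OPERATORS` are illustrative assumptions and cover only a tiny subset of SPDX syntax; in the PR, `get_spdx_licensing().parse(...)` from the license-expression package plays the parser's role.

```python
KNOWN_IDS = {"MIT", "Apache-2.0", "GPL-2.0-only"}  # tiny illustrative subset
OPERATORS = {"AND", "OR"}


def parse_expression(expression: str) -> list[str]:
    """Toy parser: alternating license ids and AND/OR operators, no parentheses."""
    tokens = expression.split()
    if not tokens or len(tokens) % 2 == 0:
        raise SyntaxError(f"malformed expression: {expression!r}")
    for index, token in enumerate(tokens):
        expected = KNOWN_IDS if index % 2 == 0 else OPERATORS
        if token not in expected:
            raise SyntaxError(f"unexpected token {token!r}")
    return tokens


def validate_spdx(value: str) -> str:
    """Return the value unchanged if it parses; otherwise raise ValueError."""
    try:
        parse_expression(value)
    except SyntaxError as exc:
        raise ValueError(f"invalid license expression: {value!r}") from exc
    return value


accepted = validate_spdx("MIT OR Apache-2.0")
```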
DOCUMENTATION_OF = 3
INCORRECT = 4
NOT_DETECTED = 5
REFERENCE = 6
