feat: enhance meaning units and utility by asteier2026 · Pull Request #156 · NVIDIA-NeMo/Anonymizer

asteier2026 · 2026-05-12T22:43:49Z

Changes include:

Meaning unit prompt changes:
-- Meaning units should preserve only the minimum semantic detail
necessary to retain the text's functional utility.
-- <segmentation_rules> updated (emphasizing one idea per unit)
-- Each meaning unit now has an importance. We later use this to weight quality_qa failures. Eventually we may use it if we repair quality_qa. We may also use it if we implement rare semantic bundles.
Quality QA Compare: <grading_rules> updated (be more lenient on generalization and take importance into account)
Updates to do weighted utility and carry importance through

greptile-apps · 2026-05-12T22:47:40Z

Greptile Summary

This PR enhances the meaning-unit pipeline with a two-level importance field (critical / important) that flows from extraction through QA generation into a weighted utility score calculation. The grading prompt is updated to score abstracted answers more leniently for "important" units and more strictly for "critical" ones.

Adds MeaningUnitImportance enum and attaches it as a required field on MeaningUnitSchema and QualityQAItemSchema; both updated prompts instruct the LLM to carry the value through.
Introduces importance-weighted mean in compute_utility_score (critical = 2×, important = 1×), with a safe flat-mean fallback; _compute_metrics_columns is updated to look up importances from COL_QUALITY_QA and COL_QUALITY_QA is now correctly listed as a required column in its decorator.
Tests cover the weighted, mismatched-length, and None fallback cases for compute_utility_score.

Confidence Score: 5/5

Safe to merge. The importance weighting flows correctly through extraction, QA generation, and scoring with a safe flat-mean fallback when data is missing.

All code paths for the new importance field are correctly wired: the enum is declared, carried through both LLM prompt stages, stored in the schemas, looked up by ID in _compute_metrics_columns, and passed correctly to compute_utility_score. The required_columns decorator was updated to include COL_QUALITY_QA, preventing silent missing-data bugs. Tests cover the weighted, mismatch, and None fallback cases. The only observation is a prompt-level ambiguity about how strictly to score critical units with abstracted answers, which does not affect code correctness.

No files require special attention.

Important Files Changed

Filename	Overview
src/anonymizer/engine/rewrite/evaluate.py	Adds importance-weighted utility scoring; correctly updates required_columns, ids lookup, and fallback logic.
src/anonymizer/engine/schemas/rewrite.py	Adds MeaningUnitImportance enum and importance field to MeaningUnitSchema and QualityQAItemSchema; no backward-compat default was intentionally omitted per team decision.
src/anonymizer/engine/rewrite/qa_generation.py	Prompt additions correctly define importance criteria and instruct the LLM to carry importance through to output items.
tests/engine/test_evaluate.py	Good new tests covering weighted scoring, ordering, mismatch, and None fallback for compute_utility_score.
tests/engine/test_parsers.py	Parser tests updated to include the new required importance field in fixture payloads.
tests/engine/test_qa_generation.py	Stub meaning units updated to supply the required importance field on MeaningUnitSchema.
src/anonymizer/engine/schemas/init.py	MeaningUnitImportance exported correctly from the schemas package.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["LLM: Meaning Unit Extraction\n(MeaningUnitsSchema)"] -->|"units with importance:\ncritical | important"| B["Serialize to JSON\n(_serialize_meaning_units)"]
    B --> C["LLM: QA Generation\n(QualityQAPairsSchema)"]
    C -->|"items with importance\ncarried through"| D["LLM: Quality Re-answer\n(QualityAnswersSchema)"]
    D --> E["LLM: Quality Compare\n(QACompareResultsSchema)\ngrading rules now importance-aware"]
    E -->|"ids + scores"| F["_compute_metrics_columns"]
    C -->|"importance lookup\nby item.id"| F
    F -->|"importance_levels list"| G["compute_utility_score\ncritical=2.0x, important=1.0x\nweighted mean"]
    G --> H["COL_UTILITY_SCORE"]

_{Reviews (6): Last reviewed commit: "Update tests/engine/test_evaluate.py" | Re-trigger Greptile}

asteier2026 · 2026-05-13T15:32:22Z

We will never have the legacy problem greptile is referring to. There are couple small changes needed by the sensitivity PR that are duplicated in this PR, so that both will merge smoothly: Removing reference to "sensitiive_attribute" and adding checks in validate_protection_consistency for combined_risk_level.

lipikaramaswamy · 2026-05-15T19:57:25Z

I think this looks good, but to be sure, could you merge/rebase latest main into this PR? I did that locally and the diff becomes much clearer: the duplicated sensitivity-disposition changes are absorbed from main, and this PR is left with the intended quality/importance changes. But I wanna make sure that propagates correctly.

Also one small test ask: could we add direct coverage for the new weighted utility behavior in compute_utility_score? For example, [0.0, 1.0] with ["critical", "important"] should differ from the reversed ordering, and a mismatched importance list should fall back to the flat mean.

lipikaramaswamy

Looks good. The rebase makes the diff much cleaner: the sensitivity-disposition changes are coming from main, and this PR is focused on meaning-unit importance / weighted utility.

Thanks for adding the weighted utility tests.

Q/non-blocking: do we still want compute_utility_score to fall back to a flat mean when importance_levels is missing or length-mismatched? Since importance is now required and this is going into a new release (0.2.0), a mismatch could arguably fail loudly instead of silently using the old scoring behavior.

Co-authored-by: lipikaramaswamy <31832945+lipikaramaswamy@users.noreply.github.com>

asteier2026 requested a review from a team as a code owner May 12, 2026 22:43

asteier2026 requested a review from lipikaramaswamy May 12, 2026 22:46

greptile-apps Bot reviewed May 12, 2026

View reviewed changes

Comment thread src/anonymizer/engine/schemas/rewrite.py

Comment thread src/anonymizer/engine/schemas/rewrite.py

Comment thread src/anonymizer/engine/schemas/rewrite.py

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

Comment thread src/anonymizer/engine/schemas/rewrite.py

lipikaramaswamy reviewed May 15, 2026

View reviewed changes

Comment thread src/anonymizer/engine/rewrite/evaluate.py

asteier2026 requested a review from a team as a code owner May 15, 2026 20:43

asteier2026 added 5 commits May 15, 2026 13:46

feature: meaning unit and utility enhancements

921c648

fix: update test code

5c37321

fix: add importance field to MeaningUnitSchema test fixtures

2fbcd96

fix: add test updates

ac90884

feature: rebase with main and add more tests

99ed115

asteier2026 force-pushed the asteier2026/feature/quality-improvements branch from 8111523 to 99ed115 Compare May 15, 2026 20:51

lipikaramaswamy approved these changes May 15, 2026

View reviewed changes

Comment thread tests/engine/test_evaluate.py Outdated

Update tests/engine/test_evaluate.py

a93e1c0

Co-authored-by: lipikaramaswamy <31832945+lipikaramaswamy@users.noreply.github.com>

asteier2026 merged commit 4eefca9 into main May 15, 2026
11 checks passed

asteier2026 deleted the asteier2026/feature/quality-improvements branch May 15, 2026 22:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: enhance meaning units and utility#156

feat: enhance meaning units and utility#156
asteier2026 merged 6 commits into
mainfrom
asteier2026/feature/quality-improvements

asteier2026 commented May 12, 2026

Uh oh!

greptile-apps Bot commented May 12, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

asteier2026 commented May 13, 2026

Uh oh!

Uh oh!

Uh oh!

lipikaramaswamy commented May 15, 2026

Uh oh!

lipikaramaswamy left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

asteier2026 commented May 12, 2026

Uh oh!

greptile-apps Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

Uh oh!

asteier2026 commented May 13, 2026

Uh oh!

Uh oh!

Uh oh!

lipikaramaswamy commented May 15, 2026

Uh oh!

lipikaramaswamy left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented May 12, 2026 •

edited

Loading