Releases: microsoft/presidio
Release 2.2.360
Analyzer
Added
- Korean Resident Registration Number (RRN) recognizer with checksum validation for numbers issued prior to October 2020 (#1675) (Thanks @siwoo-jung)
- Azure Health Data Services (AHDS) de-identification service integration as a remote recognizer with Entra ID authentication (#1624) (Thanks @rishasurana)
- Comprehensive input validation methods for NlpEngineProvider to ensure valid arguments for engines, configuration, and file paths (#1653) (Thanks @siwoo-jung)
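The checksum validation mentioned for the Korean RRN recognizer can be illustrated with a minimal sketch. It assumes the widely documented weighted mod-11 check digit scheme for 13-digit RRNs and is not taken from the Presidio source; the function name `rrn_checksum_ok` is hypothetical.

```python
import re

# Weights applied to the first 12 digits (assumed scheme, see lead-in).
_WEIGHTS = (2, 3, 4, 5, 6, 7, 8, 9, 2, 3, 4, 5)


def rrn_checksum_ok(value: str) -> bool:
    """Return True if the 13th digit matches the weighted mod-11 check digit."""
    digits = re.sub(r"[^0-9]", "", value)  # drop the hyphen and any other separators
    if len(digits) != 13:
        return False
    total = sum(int(d) * w for d, w in zip(digits[:12], _WEIGHTS))
    expected = (11 - total % 11) % 10
    return expected == int(digits[12])


print(rrn_checksum_ok("900101-1234568"))  # synthetic number built to satisfy the checksum
```

As the release note says, a check like this only applies to numbers issued before October 2020, so it cannot be the sole criterion for newer numbers.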
Changed
- Updated Indian Aadhaar recognizer to support contextual delimiters (-, :, space) for improved detection accuracy (#1677) (Thanks @K3y5tr0ke)
- Fixed Italian Driver License recognizer regex to include missing characters per government requirements, excluding only A, O, Q, I (#1651) (Thanks @K3y5tr0ke)
- Refactored recognizers folder structure for better organization and maintainability (#1670) (Thanks @omri374)
Anonymizer
Added
- Azure Health Data Services (AHDS) Surrogate anonymization operator with medical domain expertise for realistic PHI surrogate generation (#1672) (Thanks @rishasurana)
General
Added
- Comprehensive GitHub Copilot instructions with development guidelines, build processes, and e2e testing procedures (#1693) (Thanks @Copilot)
- New GitHub Actions CI & release workflows with multi-platform Docker image support for AMD64 and ARM64 architectures (#1697) (Thanks @tamirkamara)
- Dual-path CI workflow to fix GitHub Actions failures for external contributors by auto-detecting fork vs. main repository PRs (#1708) (Thanks @Copilot)
- OIDC trusted publishing for PyPI releases eliminating manual API token management and enhancing security (#1702) (Thanks @Copilot)
- Comprehensive YAML and Python examples for context-aware recognizers documentation (#1710) (Thanks @MRADULTRIPATHI)
Changed
- Updated actions/checkout from v4 to v5 to support Node.js 24 runtime (#1699) (Thanks @dependabot)
- Fixed PR template to use proper GitHub issue linking syntax for automatic issue association and closing (#1701) (Thanks @Copilot)
- Updated LiteLLM documentation with detailed guide links for better integration guidance (#1698) (Thanks @BhargavDT)
- Fixed broken links in CONTRIBUTING.md and developing recognizers documentation after recognizers refactoring (#1674) (Thanks @siwoo-jung)
- Fixed OpenSSF badge embedding in README.MD for proper display (#1673) (Thanks @SharonHart)
- Removed Terrascan from Microsoft Defender for DevOps workflow to eliminate false positives on non-IAC repository (#1691) (Thanks @Copilot)
Security
- Updated Streamlit and PyTorch dependency versions to fix CVE vulnerabilities (#1685) (Thanks @SharonHart)
- Updated requests library to mitigate security vulnerability GHSA-9hjg-9r4m-mvj7 (#1683) (Thanks @SharonHart)
- Locked pandas dependency in Streamlit to prevent version conflicts (#1689) (Thanks @SharonHart)
2.2.359
Release 2.2.359
This is a periodic release with feature enhancements, bug fixes, documentation updates, and one configuration change.
Changes in Presidio's behavior
Country-specific recognizers are now disabled by default to avoid false positives when they are not needed.
Most country-specific recognizers that expect English text were made optional to avoid false positives, so they no longer work out of the box (#1586). Specifically:
- SgFinRecognizer
- AuAbnRecognizer
- AuAcnRecognizer
- AuTfnRecognizer
- AuMedicareRecognizer
- InPanRecognizer
- InAadhaarRecognizer
- InVehicleRegistrationRecognizer
- InPassportRecognizer
- EsNifRecognizer
- InVoterRecognizer
To re-enable them, either set enabled: true for them in the default YAML, or add them to the recognizer registry manually in code.
- YAML-based: see YAML based configuration.
- Code based:
from presidio_analyzer import AnalyzerEngine
from presidio_analyzer.predefined_recognizers import AuAbnRecognizer
# Initialize an analyzer engine with the recognizer registry
analyzer = AnalyzerEngine()
# Create an instance of the AuAbnRecognizer
au_abn_recognizer = AuAbnRecognizer()
# Add the recognizer to the registry
analyzer.registry.add_recognizer(au_abn_recognizer)
Changes:
Analyzer
- Allow loading of StanzaRecognizer when StanzaNlpEngine is configured, improving NLP engine flexibility (#1643) (Thanks @omri374)
- Excluded recognition_metadata attribute from REST Analyze Response DTO to clean up API responses (#1627) (Thanks @SharonHart)
- Added ISO 8601 support to DateRecognizer for improved date parsing (#1621) (Thanks @StefH)
- Prevented misidentification of 13-digit timestamps as credit cards (#1609) (Thanks @eagle-p)
- Updated analyzer_engine_provider.md for clarity and completeness (#1590) (Thanks @AvinandanBandyopadhyay)
- Bumped python from 3.9 to 3.12 in presidio-analyzer Dockerfile (#1583) (Thanks @dependabot)
- Bumped phonenumbers version for improved validation and parsing (#1579) (Thanks @omri374)
- Refactored InstanceCounterAnonymizer to simplify index retrieval logic (#1577) (Thanks @ShakutaiGit)
- Fixed issue #1574 to support as_tuples in relevant functions (#1575) (Thanks @omri374)
- Updated initial scores in IN_PAN for better recognition performance (#1565) (Thanks @omri374)
- Added accelerate as a missing build dependency to fix build failures (#1564) (Thanks @SharonHart)
- Don't set a default for LABELS_TO_IGNORE if not specified, to avoid unintended behavior (#1563) (Thanks @SharonHart)
- Updated 08_no_code.md for documentation improvements (#1561) (Thanks @alan-insam)
- Added the ability to disable the NLP recognizer via configuration (#1558) (Thanks @omri374)
- Removed 'class' from API documentation for clarity (#1554) (Thanks @omri374)
- Set country-specific default recognizers to enabled=false for safer defaults (#1586) (Thanks @omri374)
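One reason a 13-digit timestamp can be mistaken for a credit card number is that a Luhn checksum alone cannot exclude it: for any 12-digit prefix, exactly one final digit passes the check. The sketch below illustrates this with a plain Luhn implementation. `luhn_ok` is a hypothetical helper, and how #1609 actually prevents the misidentification is not shown here.

```python
def luhn_ok(number: str) -> bool:
    """Return True if the ASCII digits in `number` satisfy the Luhn checksum."""
    digits = [int(ch) for ch in number if "0" <= ch <= "9"]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


# Ten consecutive millisecond timestamps that share a 12-digit prefix.
candidates = [str(ts) for ts in range(1_700_000_000_000, 1_700_000_000_010)]
passing = [c for c in candidates if luhn_ok(c)]
print(len(passing))  # 1: one in ten such values passes by coincidence
```

A detector therefore needs more than the checksum, for example length or context rules, to keep timestamps out of credit card results.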
Anonymizer
- Update python base image to 3.13 (#1612) (Thanks @dependabot[bot])
- Bumped python from 3.12-windowsservercore to 3.13-windowsservercore in presidio-anonymizer Dockerfile (#1612) (Thanks @dependabot)
- Ensured anonymizer sorts analyzer results input by start and end for correct whitespace merging (#1588) (Thanks @mkh1991)
- Bumped python from 3.9 to 3.12 in presidio-anonymizer Dockerfile (#1582) (Thanks @dependabot)
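The effect of sorting analyzer results before merging whitespace-separated entities can be sketched as follows. This is a simplified illustration, not the anonymizer's implementation: the `Span` class and `merge_whitespace_separated` function are hypothetical stand-ins for Presidio's result objects and internal merge logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Span:
    entity_type: str
    start: int
    end: int


def merge_whitespace_separated(text: str, spans: List[Span]) -> List[Span]:
    """Merge same-type spans separated only by whitespace.

    Sorting by (start, end) first makes the result independent of input order.
    """
    ordered = sorted(spans, key=lambda s: (s.start, s.end))
    merged: List[Span] = []
    for span in ordered:
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.entity_type == span.entity_type
            and prev.end <= span.start
            and text[prev.end:span.start].strip() == ""
        ):
            prev.end = span.end  # extend the previous span over the gap
        else:
            merged.append(Span(span.entity_type, span.start, span.end))
    return merged


text = "John  Smith lives here"
unsorted = [Span("PERSON", 6, 11), Span("PERSON", 0, 4)]
print(merge_whitespace_separated(text, unsorted))
```

Without the sort, the second span would be compared against a span that starts after it, and the two parts of the name would stay separate.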
Image Redactor
- Bumped python from 3.12-slim to 3.13-slim in presidio-image-redactor Dockerfile (#1611) (Thanks @dependabot)
- Bumped python from 3.10 to 3.12 in presidio-image-redactor Dockerfile (#1581) (Thanks @dependabot)
General
- Fixed typographical errors in documentation files for better clarity (#1637) (Thanks @kilavvy)
- Corrected spelling mistakes across code comments and documentation for improved readability (#1636) (Thanks @leopardracer)
- Fixed typos in documentation and test descriptions, enhancing clarity and consistency in the codebase (#1631) (Thanks @zeevick10)
- Corrected typos in docstrings and comments to maintain documentation quality (#1630) (Thanks @kilavvy)
- Fixed typos in documentation and test descriptions, ensuring accurate references and descriptions (#1628) (Thanks @leopardracer)
- Removed unnecessary run.bat script from the repository (#1626) (Thanks @SharonHart)
- Added "/TestResults" to .gitignore file to prevent test result artifacts from being committed (#1622) (Thanks @StefH)
- Added links to the discussion board about Docker prebuilt images to documentation (#1614) (Thanks @omri374)
- Fixed spelling, grammar, and style issues in Presidio V2 documentation (#1610) (Thanks @Vruddhi18)
- Updated .gitignore to include the .vs folder (#1608) (Thanks @StefH)
- Fixed typo in api-docs.yml to improve documentation accuracy (#1602) (Thanks @StefH)
- Reverted a previous update to codeql-analysis.yml to restore earlier configuration (#1595) (Thanks @SharonHart)
- Updated codeql-analysis.yml for improved code scanning configuration (#1594) (Thanks @SharonHart)
- Fixed paths-ignore in codeql-analysis.yml to refine scanning scope (#1593) (Thanks @SharonHart)
- Ignored docs/ directory in CodeQL analysis to prevent unnecessary scanning (#1592) (Thanks @SharonHart)
- Fixed minor typos in code and documentation (#1585) (Thanks @omahs)
- Restored dependabot scanning for security and dependency updates (#1580) (Thanks @SharonHart)
- Added SUPPORT.md file to provide support information to users (#1568) (Thanks @omri374)
Version 2.2.358 (#1553)
Changes:
- 04920aa Version 2.2.358 (#1553)
- dcf1ae0 drop blake2b (#1552)
- a0484dd Replace MD5 with Blake2 (#1550)
- e9df3be Exclude closing single or double quote from URL in URL recogniser (#1532)
- cf75c5a Updated the Evaluating DICOM Redaction documentation to reflect changes in verify_dicom_instance() within the DicomImagePiiVerifyEngine class. (#1549)
- 6b10fd5 Updated the return type annotation of in function from to . (#1547)
- bacf23f Add multiprocessing parameters (#1521)
- 0856479 Migrate to poetry 2.0 (and PEP 621) (#1539)
- 8b288fa docs: Add environment setup and notebook execution guides for Presidio + Spark in Fabric (#1529)
- 5881d75 Update defender-for-devops.yml (#1544)
- eb72ae0 Update CodeQL and defender-for-devops workflows (#1540)
- 5152805 Replace pycryptodome with cryptography (#1537)
- 5fea5af Upgrade codeql (#1536)
- 26f2e92 Fix python 3.9 Builds (#1534)
- 65eabd4 add spacy_stanza into stanza_nlp_engine as it is no longer maintained (#1522)
- 35ab8ae Move sanitize_value to be common, Fix InPassportRecognizer (#1519)
- 6f840ea remove azure-core (#1517)
This list of changes was auto generated.
2.2.357
Version 2.2.356 (#1477)
Changes:
- 9fee330 Version 2.2.356 (#1477)
- a081c22 Update presidio containers to use gunicorn (#1497)
- ebf0ca5 Restricting spacy.cli for version 3.7.0 (#1495)
- 3d9cee9 Fix regex match_time output (#1488)
- a21a17c Add a link to model classes to simplify configuration (#1472)
- d238da9 Update community.md (#1469)
- a0a5f89 Use ACR service connection when pushing containers (#1484)
- fde30dd Add support for allow_list, allow_list_match, regex_flags in REST API (#1478)
- ce63783 Unlock numpy after dropping 3.8 (#1480)
- 33808c2 Removed python 3.8 support (EOL) and added 3.12 (#1479)
- cc31bb6 Add a link to HashiCorp vault operator resource (#1468)
- 71fa64d docs: clarify the docs on deploying presidio to k8s (#1453)
- 21361f9 Updates to the transformers conf docs and yaml file (#1467)
- 13ae328 Fix presidio-structured build - lock numpy version (#1465)
- 49f2b6a Fix space (#1459)
- b9f6cba Bug/azure ai language context (#1458)
- 89ccadb Update US_SSN CONTEXT and unit test (#1455)
- c54ce2b Add UK National Insurance Number Recognizer (#1446)
- 9321e14 Remove ignored labels from supported entities (#1454)
- 0721e36 Dev containers for: analyzer, analyzer+transformers, anonymizer and image redaction (#1450)
- 4aeb56b added batching support (#1449)
- 1bf22ed Update installation.md (#1439)
- e55300a Update defender-for-devops.yml (#1437)
- e08f44b Fix #1442 (#1445)
- 9696b9e Reduce memory usage of Analyzer test suite (#1429)
- 6c51464 added logic to handle phonenumbers with country code (#1426)
- 3e4a806 Update CI due to DockerCompose project name issue (#1428)
- 2fe6ad7 closing handles (#1424)
- cd7e547 (docs) Use Presidio across Anthropic, Bedrock, VertexAI, Azure OpenAI, etc. w/ LiteLLM Proxy (#1421)
- 8dc46e2 Make sure that configuration files are closed when loading them (#1423)
- ada5fce Do not release presidio-cli as part of the release pipeline (#1422)
- d85ba6e Typo fix added missing ":" after if condition (#1419)
- d46bacb minor notebook changes (#1420)
This list of changes was auto generated.
version 2.2.355 (#1410)
Changes:
- edd722d version 2.2.355 (#1410)
- c059131 changing predefined recognizers to use the config file (#1393)
- 56f0df2 Update Dockerfile.windows (#1414)
- ad77f2f Update Dockerfile.windows (#1413)
- ac38cca NLP engine sample + refresh on samples (#1388)
- 97a7e42 Fix the entity filtering of the transformer_recognizer.py analyze function (#1403)
- 2be6de1 Fix ports in docs (#1408)
- a3a609b Improve url detector (#1398)
- 4752166 From Pipenv to Poetry (#1391)
- ebbfd30 Added presidio structured downloads to readme (#1392)
- 67d5837 Feature/analyzer documentation (#1384)
- e65c89c Migrate Python Packaging to pyproject.toml (#1383)
- 2d92539 Fix N818, E721 (#1382)
- 51dc5c6 Auto-formatting, fix D rules (#1381)
- cb0184a Add Ruff linter + Apply Ruff fix (#1379)
- 2348fff Fix OverflowError in crypto_recognizer (#1377)
- ff31243 Align ports with documentation and postman collection. (#1375)
- 2805c86 Loading analyzer engine & recognizer registry from configuration file (#1367)
- 55bfb8f added regex functionality for allow lists in the analyzer (#1357)
- e64d8ec Spanish NIE (Foreigners ID card) recognizer (#1359)
- f29e112 Update conf files location (#1358)
- 41e0202 feat: Add new recognizer for IN_VOTER #1344 (#1345)
- 5ea004d New Predefined Recognizer for Indian Passport #1350 (#1351)
- c7fa825 Added Finnish Personal Identity Code Recognizer. (#1349)
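The difference between exact and regex matching for allow lists (#1357) can be sketched in plain Python. This is an illustration under assumptions, not Presidio's code: the helper `is_allowed` and its parameters are hypothetical, and joining the entries with `|` is one plausible way to treat the list as a single pattern.

```python
import re
from typing import List


def is_allowed(word: str, allow_list: List[str], match: str = "exact", flags: int = 0) -> bool:
    """Return True if a detected word should be skipped because it is allow-listed."""
    if not allow_list:
        return False
    if match == "exact":
        return word in allow_list
    if match == "regex":
        # Treat every allow-list entry as an alternative in one pattern.
        pattern = re.compile("|".join(allow_list), flags)
        return pattern.search(word) is not None
    raise ValueError(f"unsupported match mode: {match}")


print(is_allowed("www.bing.com", ["bing.com"]))                      # exact mode needs the full string
print(is_allowed("www.bing.com", [r"bing\.com$"], match="regex"))    # regex mode can match a suffix
```

Regex mode is more flexible but also easier to over-match, so patterns are best anchored where possible.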
This list of changes was auto generated.
2.2.354
Changes:
- ffa29f8 Fixed wrong condition for dicom metadata (#1347)
- 49a996d Changed default aggregation_strategy to max (#1342)
- db8ff82 feat: Implement user-defined entity selection strategies in Presidio Structured (#1319)
- 4db5278 Cache compiled regexes in analyzer (#1335)
- 9c3369d Bugfix - Fix for incorrectly referenced recognizer in analysis_explanation using PhoneRecognizer (#1332)
- 6a4135e Fix bug where "bank" and "check" wouldn't work (#1333)
- ea8d830 Added contributions to readme (#1331)
- 733cca2 Adding Span Marker Recognizer Sample (#1321)
- 1911a3d Update spacy_stanza.md (#1325)
- d71c5fb feat: Add Singapore UEN Recognizer (#1315)
- 59af84d Added tesseract to installation (#1312)
- 4c48b92 Addition of leniency parameter in predefined PhoneRecognizer (#1311)
- 173b527 Bugfix in tutorial (#1310)
- dee6562 predefined pattern recognizer : IN_VEHICLE_REGISTRATION (#1288)
- a8d2c90 feat: Add Bech32 and Bech32m Bitcoin Address Validation in Crypto Recognizer and expand tests (#1307)
- 45c418d feat: Support 'M' prefix in SG_NRIC_FIN Recognizer and expand tests (#1304)
- 5dfbf27 Analysis builder improvements (#1295)
- 7f09c95 added pseudonymization sample (#1296)
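Caching compiled regexes (#1335) avoids recompiling the same pattern on every call. A minimal sketch of the general idea, using the standard library's `functools.lru_cache`, is shown below; the helper name `get_compiled` and the cache size are assumptions, and the analyzer's actual caching mechanism may differ.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=256)
def get_compiled(pattern: str, flags: int = 0) -> "re.Pattern[str]":
    """Compile a pattern once and reuse the compiled object on later calls."""
    return re.compile(pattern, flags)


first = get_compiled(r"\b\d{3}-\d{2}-\d{4}\b")
second = get_compiled(r"\b\d{3}-\d{2}-\d{4}\b")
print(first is second)  # True: the second call is served from the cache
```

The benefit shows up when many texts are scanned with the same set of patterns, since compilation cost is paid once per pattern rather than once per call.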
This list of changes was auto generated.
2.2.353
- Add predefined_recognizer: IN_AADHAAR (#1256)
- Added the option to add custom operators + pseudonymization sample (#1284)
- Fix failing test due to optional package (#1258)
- Allow local Spacy Models to be loaded in NLP Engine (#1269)
- Upgrade pip in windows containers (#1272)
- Bugfix in ImageAnalyzerEngine #1274
2.2.352
Changes:
Added
Structured
- Added an alpha release of presidio-structured, a library that reuses logic from existing Presidio components to anonymize (semi-)structured data. (#1192)
Analyzer
- Add PL PESEL recognizer (#1209)
- Azure AI language recognizer (#1228)
- Add_conf_to_package_data (#1243)
Anonymizer
- Add keep operator as deanonymizer (#1255)
- Update anonymize_list type hints and document that sometimes items will be ignored. (#1252)
General
- Add Dockerfile for Windows containers (#1194)
Changed
Analyzer
- Drop WA driver license number (#1214)
- Change ner_model_configuration from list to map (#1222)
- Bugfix in SpacyRecognizer (#1221)
- Bugfix in NerModelConfiguration (#1230)
- Add_conf_to_package_data (#1243)
Anonymizer
- Improved the logic of conflict handling in AnonymizerEngine (#1196)
Image Redactor
- Change default score threshold in image redactor (#1210)
- Fixed bug #1227 (#1231)
- Added missing dependencies for opencv-python and azure forms recognizer (#1257)