Production-readiness cleanup: license, uv, CI, types, docs#3
Open
Samuelstein1224 wants to merge 18 commits into
Open
Production-readiness cleanup: license, uv, CI, types, docs#3Samuelstein1224 wants to merge 18 commits into
Samuelstein1224 wants to merge 18 commits into
Conversation
- Fix MIT/Apache mismatch (LICENSE is MIT; pyproject + README now match) - Set Python support to 3.10-3.12; drop 3.9 - Update classifiers, black target-version, add ruff target-version Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- docs/installation.md: Python 3.9+ -> 3.10+ - ci.yml: add Python 3.12 to test matrix Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
No behavioral changes; mechanical formatter output only. Establishes a clean baseline so pre-commit hooks (Task 4) produce zero diffs on existing code.
- Add uv.lock for reproducible installs - Add .python-version (3.11) - Switch README, docs/installation.md, CONTRIBUTING.md to uv-first install instructions; keep pip as a documented fallback
- .pre-commit-config.yaml runs ruff/black/isort + hygiene hooks - pre-commit added to dev extras - CONTRIBUTING.md documents opt-in install via `uv run pre-commit install` Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- mypy in dev extras with permissive settings (ignore_missing_imports, no disallow-untyped-defs); Task 7 will triage and tighten. - pytest-cov in dev extras with [tool.coverage] config so the CI coverage job (Task 5) has config to consume.
- test: pytest matrix on Python 3.10/3.11/3.12 - lint: ruff/black/isort --check - typecheck: mypy (continue-on-error until Task 7 cleans up errors) - coverage: pytest-cov + Codecov upload - Switch all installs to uv via astral-sh/setup-uv with cache Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reduce mypy error count from 149 to 0 across the package, mostly via implicit-Optional and missing-annotation fixes: - analyzer/clifford_t_analyzer: type stats dict and counters, mark union-attr ignores for the networkx-Graph union returns - analyzer/pbc_analyzer: implicit-Optional for pbc_conversion_stats, type defaultdict counters - analyzer/visualization: widen position arg to Any (matplotlib accepts str or tuple), type defaultdict counters - api: type the parameters dict as Dict[str, Any], make artifacts Optional, accumulate parameters via mutation rather than reassignment - benchmark_utils: implicit-Optional for column_alignments, type qasm_files dict, ignore Number->float conversion in fmt_float_cell - decomposer/decomposer: annotate ops_info / rz_jobs / allowed_globals - fidelity: widen Dict[str, Any] returns where the union grew unwieldy - parser/qasm_parser: implicit-Optional for basis_gates - pbc_converter/pbc_generator: implicit-Optional for max_workers and output_prefix - pbc_converter/r_pauli_circ: type tracking dict - transpilers/__init__: align signatures of cpp-fallback dummy functions with their real counterparts so mypy treats the conditional branches as compatible Defer the deep type issues in ftcircuitbench.pbc_converter.pbc_circuit_reader and ftcircuitbench.pbc_converter.pbc_generator via tool.mypy.overrides ignore_errors=true with TODO comments. Those modules carry object-typed parser returns and tableau-type mismatches that need an API/data-model review and were out of scope for this pass. With mypy at zero errors, remove continue-on-error from the typecheck job in .github/workflows/ci.yml so it now blocks CI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add CI status badge - Switch quick-start commands to uv run python - New "Reproducing paper results" section with verified smoke-test command - New "Troubleshooting" section - Forward-link CITATION.cff (created in Task 10) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- api.md: faithful reference for the public surface in ftcircuitbench/__init__.py and ftcircuitbench/api.py - examples.md: three worked examples (CLI single, CLI batch, Python API) verified locally - installation.md: final polish; uv-first with pip fallback and optional gridsynth binary section - index.md: navigation TOC pointing at all of the above Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- CHANGELOG.md in Keep a Changelog format with [Unreleased] reflecting the production-readiness cleanup - CITATION.cff matching the README BibTeX, validated by cffconvert
…tree - Smoke-test paragraph no longer points readers at the legacy circuit_outputs/qft_4q_gs_clifford_t.qasm as a comparison target; the JSON snippet in the README is now the canonical reference for the current nwqec backend. - Add circuit_outputs/ to the Repository structure tree, marked as archival/legacy.
- Set UV_PYTHON at job level so matrix value overrides .python-version - Pass python-version to setup-uv so the right interpreter is provisioned - Use uv run --no-sync to prevent implicit re-sync that dropped extras (which left pytest itself missing from .venv)
- Introduced end-to-end tests for the PBC converter to validate acceptance of the extended Clifford basis, including gates {cx, h, s, sdg, x, y, z}.
- Implemented parity tests for GS, SK, and NWQEC transpilers to ensure consistent pipeline shapes and structural parity across stages.
- Added unit tests for native Pauli/Sdg operations on the TableauForGate, covering direct conjugation, algebraic identities, and cross-checks against numpy-computed conjugation.
…ocessing fix
Remove stale MAX_QUBITS_FOR_FIDELITY SK short-circuit; SK fidelity now flows through rz_product_fidelity_sk for large circuits.
Make rz_product_fidelity_sk's per-theta helper module-level so multiprocessing.Pool can pickle it.
Add --pbc-backend {auto,cpp,python} flag, wire through PipelineConfig.use_nwqec_pbc.
argparse uses ArgumentDefaultsHelpFormatter; --sk-recursion default → 2.
Fix flaky parity test (gridsynth Python can emit X, so input-preservation is >= not ==).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Comprehensive cleanup making FTCircuitBench reproducible and contributor-friendly for outside readers of the paper.
pyproject.toml,README.md,CITATION.cff, and classifiers.requires-python = ">=3.10,<3.13", classifiers + black/ruff targets + CI matrix all match.uvwith committeduv.lockfor reproducible installs. Pip remains a documented fallback. Added.python-version(3.11).test(matrix 3.10/3.11/3.12),lint(ruff/black/isort),typecheck(mypy, blocking),coverage(pytest-cov + Codecov). All install viaastral-sh/setup-uvwith cache..pre-commit-config.yamlruns ruff/black/isort + standard hygiene hooks. Documented opt-in install inCONTRIBUTING.md.pbc_convertermodules deferred via documented[[tool.mypy.overrides]]withTODO(typecheck-followup)markers; everything else is properly typed.docs/api.md(22 public symbols), expandeddocs/examples.mdwith 3 verified examples, polisheddocs/installation.md, refresheddocs/index.md. New README sections: "Reproducing paper results" with a verified smoke-test command, and "Troubleshooting".CHANGELOG.md(Keep a Changelog format) andCITATION.cff(validates against schema 1.2.0; 13 authors transcribed from BibTeX).Test plan
All gates pass locally on this branch:
uv run pytest -v→ 56 passed, 2 skipped (pre-existing QASM3 skips)uv run mypy ftcircuitbench→ 0 errors across 27 source filesuv run ruff check ftcircuitbench tests→ cleanuv run black --check ftcircuitbench tests→ cleanuv run isort --check-only ftcircuitbench tests→ cleanuv run pre-commit run --all-files→ all 8 hooks passuv run --with cffconvert cffconvert --validate -i CITATION.cff→ validuv run python analyze_circuit.py qasm/qft/qft_4q.qasm --pipeline gs --gridsynth-precision 5 --skip-fidelity→ produces canonical reference numbers documented in READMENotes for reviewers
ftcircuitbench/is intentionally unchanged except for type annotations driven by the mypy clean-up. No behavioral refactoring was done.circuit_outputs/,circuit_stats_output/,circuit_benchmarks/) were left committed as you requested. The README now markscircuit_outputs/as archival (the legacy Python-Gridsynth output predates the currentnwqecC++ backend, so its T-counts diverge from current runs by ~250×; a follow-up should regenerate it).pbc_convertermodules (pbc_circuit_reader,pbc_generator) have mypyignore_errors = trueoverrides with documented TODOs — their type errors require API-level review and were out of scope for this cleanup.bf55c47); reviewinggit log --first-parentand skipping that commit makes the substantive changes much easier to read.