Common questions about PQF, with links to deeper answers when they exist. Open an issue if you have a question that should be here but isn't.
Probably not for irreplaceable data. The spec is still draft
v0.3.1; the wire format is not frozen. The reference implementation
has not undergone external cryptographic review. See
docs/COMPATIBILITY.md for the v1.0.0 freeze
contract and the blocker list. PQF is appropriate for:
- experimenting with hybrid post-quantum file formats,
- contributing to the spec or reference impl,
- exploring how PQ primitives compose,
- writing or running interop tests.
It is not yet appropriate for "I want to make sure my client's legal archive is decryptable in twenty years."
Not yet. The README's "Cryptographic review wanted" section enumerates the specific normative sections where review has the highest leverage. If you do this kind of work, please open an issue.
Because ML-KEM is roughly two years old as a standard. The
cryptographic community has been wrong about post-quantum
primitives before: NTRUSign and the NewHope mistakes on the lattice
side, and SIKE, which is isogeny-based rather than lattice-based and
was actually broken. Hybrid means a defect in either ML-KEM
or X25519 alone does not compromise the file. We get post-quantum
resistance without giving up classical assurance. See
docs/DESIGN.md for the longer argument.
Because CBOR has a deterministic-encoding profile in the standard
(RFC 8949 §4.2). Re-encoding a canonical CBOR value reproduces the
input byte for byte, which means tamper-detection on the header
becomes structural rather than hand-written. Protobuf and msgpack
don't have this property; FlatBuffers gets something similar at a
different level of the stack, but without a deterministic-encoding
policy of the same kind. See docs/DESIGN.md.
CBOR is a tighter wire encoding (binary), has a deterministic profile, and supports raw byte strings without base64 overhead. JSON would have worked but with more hand-written canonicalization rules.
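
To make the deterministic-profile point concrete, here is a minimal sketch of a deterministic encoder for a small CBOR subset (unsigned integers, byte strings, text strings, arrays, maps). It is not the PQF reference implementation, and the header field names in the usage lines are hypothetical. It applies two of the core rules from RFC 8949 §4.2: shortest-form heads and map keys sorted by their encoded bytes.

```python
def _head(major: int, n: int) -> bytes:
    """Encode a CBOR head (major type + argument) in the shortest form that fits n."""
    if n < 24:
        return bytes([(major << 5) | n])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([(major << 5) | additional_info]) + n.to_bytes(size, "big")
    raise ValueError("value does not fit in a CBOR head")


def encode(value) -> bytes:
    """Deterministically encode a small CBOR subset (sketch, not a full codec)."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside this sketch")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("negative integers are outside this sketch")
        return _head(0, value)
    if isinstance(value, bytes):
        return _head(2, len(value)) + value
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        # Sort entries by the bytewise order of their encoded keys.
        items = sorted(((encode(k), encode(v)) for k, v in value.items()),
                       key=lambda kv: kv[0])
        return _head(5, len(items)) + b"".join(k + v for k, v in items)
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical header fields: insertion order does not affect the bytes.
a = encode({"version": 1, "chunk_size": 65536})
b = encode({"chunk_size": 65536, "version": 1})
print(a == b)
```

A reader built on this idea would decode the header, re-encode it, and reject the file if the bytes differ from what was on disk. The decoder half is omitted here to keep the sketch short.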
Yes, via the WebAssembly reader at
bindings/wasm. It's reader-only; producing
files in a browser is not yet wired up. The
demo page lets you drop a file
in and see the parsed header with zero network calls.
Yes, in bindings/python, reader-only via
pyo3. Not yet on PyPI; build from source with maturin for now.
PyPI publication is gated on v1.0.0 wire-format freeze.
Historical: it's where the project started. The reference implementation is the artifact that proves the spec is implementable, and the goal of the spec-first approach is that a second-source implementation in another language exists too. That second source is the Rust reader, which catches divergence the .NET impl alone can't.
If a second-source production writer in a language other than .NET appeared, the project would be very happy to have it.
So the verifier doesn't need an out-of-band key directory to verify the file's signature. It also makes the signed-file contract very explicit: the signer field is the public statement "this is the key I am claiming signed this file." The verifier still has to decide whether to trust that key — PQF does not solve key binding.
The hybrid signature also covers the header bytes. If you mutate
the header (including setting signer to null), the header
signature stops verifying. The threat-model writeup in
THREAT-MODEL.md walks through this.
Some files don't fit in memory, and some pipelines truly need
"verified bytes as they arrive." For those, Streaming Mode trades
the "verify-before-release" property for "verify-per-chunk +
verify-once-at-end." The catch is that the caller has to actually
check the end-of-stream result; the [MustUseReturnValue] attribute on
the API surface catches casual misuse. See
STREAMING.md for the decision matrix.
4096 bytes minimum, 16 MiB maximum, must be a power of two. The chunk size is in the header so the reader knows how to frame the on-disk chunks.
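
The three constraints can be checked in a few lines. This is a sketch of the rule as stated here, with hypothetical names; the reference implementation may structure the check differently.

```python
MIN_CHUNK_SIZE = 4096               # 4 KiB minimum
MAX_CHUNK_SIZE = 16 * 1024 * 1024   # 16 MiB maximum


def is_valid_chunk_size(n: int) -> bool:
    """Return True if n is within the allowed range and is a power of two."""
    in_range = MIN_CHUNK_SIZE <= n <= MAX_CHUNK_SIZE
    # A positive power of two has exactly one bit set, so n & (n - 1) is zero.
    return in_range and (n & (n - 1)) == 0


for candidate in (2048, 4096, 5000, 65536, 32 * 1024 * 1024):
    print(candidate, is_valid_chunk_size(candidate))
```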
Because every chunk has a unique key derived via HKDF from the DEK
with the chunk index in the info string. "Same nonce, different
key" is fine for AES-GCM. This saves 12 bytes per chunk and
removes a whole class of nonce-reuse bugs. The full argument is
in spec/PQF-SPEC-v1.md §5.2.
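
As an illustration of the per-chunk key idea, the sketch below derives one key per chunk index with HKDF (RFC 5869) built from the Python standard library. The hash (SHA-256), the empty salt, the info label, and the 8-byte big-endian index encoding are all assumptions made for the example; the normative values are in the spec section cited above. The AES-GCM step itself is left out because it needs a third-party library.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand it to `length` bytes."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an empty salt is replaced by a string of zero bytes of hash length.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def chunk_key(dek: bytes, index: int) -> bytes:
    """Derive a 32-byte key for one chunk (hypothetical info-string layout)."""
    info = b"pqf-sketch/chunk-key" + index.to_bytes(8, "big")
    return hkdf_sha256(dek, info)


dek = bytes(32)  # placeholder DEK for demonstration only
keys = {chunk_key(dek, i) for i in range(4)}
print(len(keys))  # one distinct key per chunk index
```

Because the index is part of the HKDF input, each chunk is encrypted under its own key, so reusing a fixed nonce across chunks never repeats a (key, nonce) pair.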
Roughly:
- Header: ~2 KiB for a single-recipient unsigned file. Each extra recipient adds ~1700 bytes (the ML-KEM-1024 ciphertext dominates).
- Per chunk: 5 bytes frame + 16 bytes AEAD tag.
- Footer: 20 bytes.
- Hybrid signature (if signed): 4691 bytes × 2 (header sig + file sig) = 9382 bytes.
For a 16 MiB unsigned single-recipient file with the default 64 KiB chunk size: 2 KiB header + 256 chunks × 21 bytes + 20 byte footer = 7444 bytes ≈ 7.3 KiB overhead, or about 0.04% of the plaintext.
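
The same arithmetic as a small calculator, using the approximate figures listed above. Function and constant names are hypothetical, and the handling of a zero-length plaintext (zero chunks) is an assumption, not something this FAQ specifies.

```python
HEADER_BYTES = 2 * 1024        # ~2 KiB: single-recipient, unsigned header
EXTRA_RECIPIENT_BYTES = 1700   # approximate cost of each additional recipient
PER_CHUNK_BYTES = 5 + 16       # frame + AEAD tag
FOOTER_BYTES = 20
HYBRID_SIG_BYTES = 4691        # one hybrid signature; a signed file carries two


def overhead_bytes(plaintext_len: int, chunk_size: int = 64 * 1024,
                   recipients: int = 1, signed: bool = False) -> int:
    """Estimate PQF framing overhead in bytes from the approximate figures above."""
    chunks = -(-plaintext_len // chunk_size)  # ceiling division
    total = (HEADER_BYTES
             + (recipients - 1) * EXTRA_RECIPIENT_BYTES
             + chunks * PER_CHUNK_BYTES
             + FOOTER_BYTES)
    if signed:
        total += 2 * HYBRID_SIG_BYTES  # header signature + file signature
    return total


size = 16 * 1024 * 1024
extra = overhead_bytes(size)
print(extra, f"{100 * extra / size:.3f}%")
```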
The recipient public-key material is visible. You can tell from the file
which recipients were targeted. If you need to hide that, PQF is
not the right tool — layer a metadata-privacy system above it.
This is explicitly documented in THREAT-MODEL.md as out of
scope.
PQF uses ML-KEM-1024 (FIPS 203) and ML-DSA-87 (FIPS 204), both standardized by NIST in 2024. PQF does not invent new primitives. The contribution is the wire format and the combiner construction; the cryptography itself is off-the-shelf NIST-approved primitives.
For non-security bugs, file an issue using the bug template.
For exploitable issues, use the private security advisory channel
described in SECURITY.md. Response-time expectations are in
MAINTAINERS.md.