fix: honest precision tiers + chat-online boundary + README perf reconciliation#197
Open
rylinjames wants to merge 1 commit into
…rf reconciliation

Three honesty fixes for when VERIFICATION.md + README become sales artifacts:
(1) the verification ledger labeled everything 6e-7..4.25e-4 as 'machine precision'; added a _precision_tier() classifier to the generated receipt (<=2e-6 machine / <=1e-4 tight / <=5e-3 fp16 / else DIVERGENT) and retiered the Eagle VLM (4.25e-4 -> fp16 tolerance) + DiT (1.78e-5 -> tight) rows;
(2) reflex chat silently required network (routes to chat.fastcrest.com), which contradicts the offline/private pitch; it now raises OfflineError stating the offline guarantee is the serving path, not chat, and the README Blackwell 'chat-only' workaround is corrected;
(3) README perf was self-contradictory; the 'latency not in README' line now points at the real published Performance table and the 2.3-2.9x batching figure (abandoned decomposed-ONNX path) is marked deprecated.

155 chat/verification tests pass. No scope/behavior change to other verbs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three honesty fixes — the kind that matter once VERIFICATION.md and the README are sales artifacts a QA/compliance team reads.
1. Verification ledger over-labeled precision
The ledger called everything from 6e-7 to 4.25e-4 "machine precision": a ~700x spread under one green check. The Eagle VLM stack at 4.25e-4, the first row a QA reviewer pulls, is fp16 tolerance, not fp32 machine precision.
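As a rough illustration of the tiering this PR describes, the sketch below maps a fixture's maximum absolute error to a label using the thresholds quoted in the PR (`<=2e-6`, `<=1e-4`, `<=5e-3`, else divergent). The function name `precision_tier`, its signature, and the assumption that the input is a non-negative max-abs-error are hypothetical; the actual `_precision_tier()` in `verification_report.py` may differ.

```python
def precision_tier(max_abs_err: float) -> str:
    """Hypothetical stand-in for _precision_tier().

    Thresholds and labels are taken from the PR description; the
    signature and input convention are assumptions for illustration.
    """
    if max_abs_err <= 2e-6:
        return "machine precision (fp32)"
    if max_abs_err <= 1e-4:
        return "tight tolerance"
    if max_abs_err <= 5e-3:
        return "fp16 tolerance"
    # Anything larger (or NaN, for which every comparison above is False)
    # falls through to the review bucket.
    return "DIVERGENT - review"


# The error values cited in the PR, run through the sketch.
for name, err in [("lower bound", 6e-7), ("DiT", 1.78e-5), ("Eagle VLM", 4.25e-4)]:
    print(f"{name}: {err:g} -> {precision_tier(err)}")
```

Ordering the checks from strictest to loosest means each value lands in the tightest tier it satisfies, and a NaN error is never reported as passing.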
Added `_precision_tier()` to `verification_report.py` so the generated VERIFICATION.md self-classifies each fixture: `<=2e-6` machine precision (fp32) / `<=1e-4` tight tolerance / `<=5e-3` fp16 tolerance / else `DIVERGENT - review`.
… `measured_numbers.md` link.
2. `reflex chat` silently required network (contradicts offline/private pitch)
`chat` routes to `chat.fastcrest.com` (GPT-5 Mini). For the defense/legal/offline positioning that is the inverse of the promise.
`chat` now raises `OfflineError` on network failure, stating plainly that the offline/air-gap guarantee is the serving path (`reflex serve` / `/act`, fully on-device), not the chat helper; self-host via `FASTCREST_PROXY_URL`.
3. README perf self-contradiction
The Performance section publishes a 5.55x TRT table, but a later line said "latency numbers intentionally not in the README yet," and the only batching figure (2.3-2.9x) was on the abandoned decomposed-ONNX path.
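For the chat boundary in item 2, a minimal sketch of the fail-loudly behavior might look like the following. Everything here is an assumption made for illustration: the helper names `resolve_chat_url` and `send_chat`, the `https://` scheme, the injected `opener` parameter, the error wording, and the reading of `FASTCREST_PROXY_URL` as a full replacement URL. Only the `OfflineError` name, the `chat.fastcrest.com` host, and the environment variable name come from the PR.

```python
import os
import urllib.request

# Host named in the PR; the scheme is an assumption.
DEFAULT_CHAT_URL = "https://chat.fastcrest.com"


class OfflineError(RuntimeError):
    """Raised when the chat helper cannot reach its backend."""


def resolve_chat_url(env=None) -> str:
    # Assumption: FASTCREST_PROXY_URL, when set, replaces the hosted endpoint.
    env = os.environ if env is None else env
    return env.get("FASTCREST_PROXY_URL") or DEFAULT_CHAT_URL


def send_chat(payload: bytes, opener=urllib.request.urlopen, env=None,
              timeout: float = 10.0) -> bytes:
    url = resolve_chat_url(env)
    try:
        with opener(url, data=payload, timeout=timeout) as resp:
            return resp.read()
    except OSError as exc:  # URLError and socket timeouts subclass OSError
        raise OfflineError(
            "chat needs network access. The offline/air-gap guarantee covers "
            "the serving path (reflex serve, /act), not the chat helper. "
            "Set FASTCREST_PROXY_URL to use a self-hosted proxy."
        ) from exc
```

Passing the opener in as a parameter keeps the failure path testable without touching the network.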
Test plan
155 passed across chat/verification/backend/report tests
`_precision_tier` maps 4.25e-4 -> "fp16 tolerance"
Co-Authored-By: Claude Opus 4.7 (1M context)