Add in-browser OCR support with TXT/CSV/XLSX export #1
Merged
Add skeleton modules (ocr, receipt-parser, export, ocr-types), tesseract.js + xlsx + papaparse deps, jsdom test env, global tesseract mock, and Playwright webServer config. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Heuristic line-by-line parser detecting merchant, date, currency, line items (with optional quantity), subtotal, tax, and total. Pure function, fully unit tested. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
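A minimal sketch of what a line-by-line heuristic like this could look like. This is not the parser from the commit: the names `parseReceipt`, `ParsedReceipt`, and `LineItem`, the regular expressions, and the keyword lists are all assumptions for illustration, and currency detection is left out to keep it short.

```typescript
// Hypothetical result shapes; the real module's types are not shown in this PR text.
interface LineItem { description: string; quantity: number; amount: number; }
interface ParsedReceipt {
  merchant: string | null;
  date: string | null;
  items: LineItem[];
  subtotal: number | null;
  tax: number | null;
  total: number | null;
}

// Assumed patterns: an amount with two decimals at the end of a line,
// an ISO or slash/dot separated date, and a "2 x " style quantity prefix.
const AMOUNT_AT_END = /(-?\d+[.,]\d{2})\s*$/;
const DATE = /\b(\d{4}-\d{2}-\d{2}|\d{1,2}[\/.]\d{1,2}[\/.]\d{2,4})\b/;
const QTY_PREFIX = /^(\d+)\s*[xX]\s+/;

function parseReceipt(text: string): ParsedReceipt {
  const receipt: ParsedReceipt = {
    merchant: null, date: null, items: [], subtotal: null, tax: null, total: null,
  };
  const lines = text.split(/\r?\n/).map((l) => l.trim()).filter((l) => l.length > 0);

  for (const line of lines) {
    // The first line that looks like a date is taken as the receipt date.
    const dateMatch = receipt.date === null ? DATE.exec(line) : null;
    if (dateMatch) { receipt.date = dateMatch[0]; continue; }

    const amountMatch = AMOUNT_AT_END.exec(line);
    if (!amountMatch) {
      // The first line without an amount is treated as the merchant name.
      if (receipt.merchant === null) receipt.merchant = line;
      continue;
    }

    const amount = parseFloat(amountMatch[1].replace(",", "."));
    // Everything before the amount is the label; drop a trailing currency symbol.
    const label = line.slice(0, amountMatch.index).replace(/[$€£]\s*$/, "").trim();
    const lower = label.toLowerCase();

    if (/^sub\s*-?total/.test(lower)) receipt.subtotal = amount;
    else if (/^(tax|vat|gst)\b/.test(lower)) receipt.tax = amount;
    else if (/^total\b/.test(lower)) receipt.total = amount;
    else {
      const qty = QTY_PREFIX.exec(label);
      receipt.items.push({
        description: qty ? label.slice(qty[0].length) : label,
        quantity: qty ? Number(qty[1]) : 1,
        amount,
      });
    }
  }
  return receipt;
}
```

Because the function only maps a string to a plain object, it can be unit tested without a DOM or an OCR engine, which is presumably what "pure function" buys here.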
Pure transform from OcrResult[] to downloadable Blobs. Supports combined or per-file output, free-form text, and receipt mode with structured headers + line items. Lazy-loads xlsx package to keep main bundle slim. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
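The combined versus per-file behaviour described above can be sketched roughly as follows. This is an illustration only: `OcrResultLike`, `toCsv`, and `buildCsvExports` are hypothetical names, the CSV escaping is hand-rolled here rather than delegated to papaparse as the PR's dependencies suggest, and the XLSX and receipt-mode paths are omitted.

```typescript
// Hypothetical stand-in for the PR's OcrResult type.
interface OcrResultLike { fileName: string; text: string; confidence: number; }

// Quote a cell only when it contains a quote, comma, or line break (RFC 4180 style).
function csvCell(value: string | number): string {
  const s = String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Pure string transform: one header row, then one row per OCR result.
function toCsv(results: OcrResultLike[]): string {
  const rows: (string | number)[][] = [["file", "confidence", "text"]];
  for (const r of results) rows.push([r.fileName, r.confidence, r.text]);
  return rows.map((row) => row.map(csvCell).join(",")).join("\r\n");
}

// Wrap the CSV text in Blobs: a single file when combined, otherwise one per input.
function buildCsvExports(
  results: OcrResultLike[],
  combined: boolean,
): { name: string; blob: Blob }[] {
  const asBlob = (csv: string) => new Blob([csv], { type: "text/csv;charset=utf-8" });
  if (combined) return [{ name: "ocr-results.csv", blob: asBlob(toCsv(results)) }];
  return results.map((r) => ({ name: `${r.fileName}.csv`, blob: asBlob(toCsv([r])) }));
}
```

Keeping the string transform separate from the Blob wrapper means the formatting logic can be asserted on directly in unit tests, while the Blob step stays a thin shell.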
Lazy-loaded singleton worker with progress callback and clean termination. Pure interface over tesseract.js v5, fully tested via a global module mock that simulates the logger lifecycle. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
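One way a lazy singleton with a progress callback and termination might be structured is sketched below. It deliberately does not import tesseract.js: the worker is created through an injected factory, and `OcrWorkerLike`, `WorkerFactory`, and `createLazyWorker` are hypothetical names, not the module's actual interface.

```typescript
// Hypothetical minimal worker shape; a real tesseract.js worker exposes more than this.
interface OcrWorkerLike {
  recognize(image: unknown): Promise<{ text: string; confidence: number }>;
  terminate(): Promise<void>;
}

type ProgressFn = (fraction: number) => void;
type WorkerFactory = (onProgress: ProgressFn) => Promise<OcrWorkerLike>;

function createLazyWorker(factory: WorkerFactory) {
  // Cache the promise, not the worker, so concurrent callers share one creation.
  let pending: Promise<OcrWorkerLike> | null = null;
  let listener: ProgressFn = () => {};

  return {
    // Create the worker on first use; later calls reuse the same promise.
    get(onProgress?: ProgressFn): Promise<OcrWorkerLike> {
      if (onProgress) listener = onProgress;
      if (!pending) {
        // Forward through an indirection so the listener can be swapped later.
        pending = factory((p) => listener(p));
      }
      return pending;
    },

    // Shut the worker down and reset state so the next get() starts fresh.
    async terminate(): Promise<void> {
      if (!pending) return;
      const current = pending;
      pending = null;
      listener = () => {};
      const worker = await current;
      await worker.terminate();
    },
  };
}
```

In the real module the factory would presumably wrap tesseract.js `createWorker` and forward the `progress` field of its `logger` messages; injecting the factory is also what makes a module-level mock unnecessary in a sketch like this.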
Adds Convert/OCR mode toggle, OCR control panel (format, receipt mode, combined), per-item cards with editable text, confidence badges, progress bars, and download single/all via buildExport + JSZip for multi-file ZIPs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Runs typecheck, build, unit tests, and Playwright E2E on every push and pull request. Caches npm + Playwright browsers. Uploads dist artifact and playwright-report on failure. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Allows Playwright tests to substitute a synthetic OCR result instead of loading the 10 MB Tesseract WASM, keeping E2E hermetic and fast. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
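The PR text does not show the hook's shape, so the following is only a guess at how such a switch could work. It assumes `__ocrTestMode` is a global object carrying a canned result; `OcrTestMode`, `HookScope`, `Recognizer`, and `resolveRecognizer` are hypothetical names.

```typescript
// Hypothetical stand-in for the PR's OcrResult type.
interface OcrResultLike { fileName: string; text: string; confidence: number; }
type Recognizer = (fileName: string) => Promise<OcrResultLike>;

// Assumed hook shape: a canned result that replaces real OCR output.
interface OcrTestMode { result: OcrResultLike; }
interface HookScope { __ocrTestMode?: OcrTestMode; }

// Return the real recognizer unless a test has installed the hook on the scope.
function resolveRecognizer(
  real: Recognizer,
  scope: HookScope = globalThis as unknown as HookScope,
): Recognizer {
  const hook = scope.__ocrTestMode;
  if (!hook) return real;
  // Synthetic path: never touches the real recognizer, so no WASM is fetched.
  return async (fileName) => ({ ...hook.result, fileName });
}
```

In an E2E test, a hook like this would typically be installed before the page's scripts run, for example through Playwright's `page.addInitScript`, so the app sees it on first load.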
Hermetic test using __ocrTestMode hook to avoid loading the real Tesseract WASM. Covers TXT export, CSV receipt-mode export, and download verification. Also fixes playwright.config.ts webServer to bind to 127.0.0.1 so Playwright's URL health-check resolves. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Test plan
__ocrTestMode hook (hermetic, no WASM download)
🤖 Generated with Claude Code