Receipt Scanner.
Desktop tool that reconciles photo receipts against a 14-page bank statement, every month.
- Client
- Accounting team, EU operations
- Role
- Solo, end to end
- Stack
- Python · Apple Vision · Claude · pdfplumber · pydantic
- Year
- 2026
The problem.
An EU company runs operations in Slovakia and Hungary. Every month the accountant reconciles 30 to 50 photo receipts against a 14-page Fio banka statement, around 150 transactions in EUR with some HUF and CZK.
Manual reconciliation took 4 to 6 hours per month. Small receipts went missing. By the end of the quarter someone had to call the client and ask for documents nobody remembered.
The approach.
I started from a fork of an open-source linker built around Gemini and US receipts. Kept the parts that worked. Replaced everything else.
Statement parsing moved from a vision model to pdfplumber plus a text-only Claude call. Apple Vision runs OCR locally on macOS for simple receipts. Claude vision steps in only when the local engine is uncertain.
Decisions worth explaining.
- 01
Pluggable OCR engines
Three implementations behind one interface. Apple Vision: free, instant, runs on the accountant's laptop. Claude vision: high quality on hard documents, $0.005 to $0.01 per receipt. Hybrid: Apple first, Claude as fallback when confidence drops below 0.75. One YAML key switches modes.
- 02
Text-only LLM for the statement
Sending 14 PDF pages to a vision model costs around $0.20 per run. I replaced it with pdfplumber to extract text in 100 ms, then a text-only Claude call to parse the text into a pydantic schema. Cost dropped to $0.05 to $0.12 per statement.
- 03
Two-pass matcher
Exact first: amount within 0.05 EUR, date within 3 days. Fuzzy second: 5 percent amount tolerance plus mandatory vendor name similarity through rapidfuzz. Without the name check, a 2000 EUR receipt for electronics would silently match a 2093 EUR IKEA payment by amount alone. With it, the match is rejected.
- 04
SHA256 cache for OCR
Every recognized file is keyed by hash plus engine name. A second pass on the same receipts takes seconds and costs nothing, so iterating on prompts and thresholds stays free during development.
- 05
Structured output through pydantic
LiteLLM response_format with pydantic schemas. For Anthropic this becomes a tool call with an enforced JSON shape. No try/except json.loads, no trailing markdown to strip, no half-broken arrays from the model.
- 06
Excel report built for accountants
Four sheets. The first one is the only sheet the accountant opens day to day: card payments without a receipt, grouped by vendor, sorted by amount descending. A red header on top shows how much money is still unbacked.
Desktop interface.
Drop-zone on the left for receipts, on the right for the statement PDF. Side panel for API key, report path, model, and date window. Bottom row is the progress bar and an Open Excel button. customtkinter on top of tkinter, light theme, no terminal.
Outcomes.
- Monthly reconciliation dropped from 4-6 hours to 2-3 minutes.
- Hybrid OCR keeps 70 to 85 percent of receipts free and offline.
- Re-running on the same files is free thanks to the SHA256 cache.
- Missing-receipt discoveries happen the same week, not at quarter end.
What's next.
If the volume grows past 100 receipts a day, three switches sit ready: Anthropic Batch API for a 50 percent input discount, Gemini Flash through litellm with one config line, and prompt caching on the system prompt. The architecture takes all three without rewrites.