Intelligent OCR Document Analyzer
Vision-LLM OCR that turns handwritten paper forms into structured records and live analytics dashboards. Deployed for the local government unit (LGU) of Bocaue, Bulacan, as Smart Deep Scan, its household census hub.
What we set out to solve
Bocaue's barangay enumerators capture household census data on Tagalog paper forms: head of household, family composition, water and sanitation, housing tenure, flood exposure, and utilities. Encoding the backlog into spreadsheets was slow and error-prone, and it made cross-barangay reporting nearly impossible. Generic OCR could not read the handwriting or preserve the bilingual field structure.
How we built it
- 01 Built a category tree so enumerators file each scanned form under the correct barangay/sub-area before upload.
- 02 Compressed each image client-side (canvas, ≤100 KB JPEG) and routed it through an OpenAI vision endpoint with a strict JSON schema mirroring the paper form one-to-one.
- 03 Preserved Tagalog/Filipino field names verbatim (komposisyon_ng_pamilya, tubig_at_sanitasyon, tirahan) so downstream reports map back to the printed form.
- 04 Wired an editor for low-confidence fields with side-by-side image preview, plus Excel export and four live dashboards (Population Profile, WASH, Housing Vulnerability, Utilities).
- 05 Stood the whole stack on Firebase Auth + Realtime Database + Storage for a zero-ops shared workspace across LGU staff.
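Steps 02 and 03 can be sketched roughly as follows. This is a minimal illustration, not the production code: it assumes the Chat Completions request shape with structured outputs, and every function name, the prompt text, the schema name, and the sub-fields `pangalan` and `edad` are hypothetical. Only the three Tagalog section keys, the 100 KB budget, and the model name come from the description above.

```typescript
// Sketch only. Hypothetical names throughout, except the three Tagalog section
// keys, the 100 KB budget, and the model name, which come from the write-up.

const MAX_BYTES = 100 * 1024; // the <=100 KB JPEG budget from step 02

/** Decoded size in bytes of the base64 payload inside a data URL. */
function dataUrlBytes(dataUrl: string): number {
  const b64 = dataUrl.slice(dataUrl.indexOf(",") + 1);
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  return (b64.length * 3) / 4 - padding;
}

/**
 * Lower JPEG quality in 0.1 steps (0.9 down to 0.2) until the image fits.
 * `encode` is injected so the loop also runs outside a browser; in the client
 * it would wrap canvas.toDataURL("image/jpeg", quality). If nothing fits, the
 * last attempt is returned; a real client would also downscale the canvas.
 */
function compressToBudget(
  encode: (quality: number) => string,
  maxBytes: number = MAX_BYTES,
): { dataUrl: string; quality: number } {
  let quality = 0.9;
  let dataUrl = encode(quality);
  while (dataUrlBytes(dataUrl) > maxBytes && quality > 0.2) {
    quality = Math.round((quality - 0.1) * 10) / 10; // avoid float drift
    dataUrl = encode(quality);
  }
  return { dataUrl, quality };
}

// Strict schema keyed by the printed form's Tagalog section names.
// Sections are flattened here for brevity; `pangalan` (name) and `edad` (age)
// are assumed sub-fields, not taken from the actual form.
const householdSchema = {
  type: "object",
  additionalProperties: false,
  required: ["komposisyon_ng_pamilya", "tubig_at_sanitasyon", "tirahan"],
  properties: {
    komposisyon_ng_pamilya: {
      type: "array",
      items: {
        type: "object",
        additionalProperties: false,
        required: ["pangalan", "edad"],
        properties: {
          pangalan: { type: "string" },
          edad: { type: "integer" },
        },
      },
    },
    tubig_at_sanitasyon: { type: "string" },
    tirahan: { type: "string" },
  },
} as const;

/** Build the JSON body for one form image (assumed Chat Completions shape). */
function buildVisionRequest(dataUrl: string) {
  return {
    model: "gpt-5-mini",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Transcribe this handwritten census form. Use the schema's field names exactly.",
          },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
    response_format: {
      type: "json_schema",
      json_schema: { name: "household_form", strict: true, schema: householdSchema },
    },
  };
}
```

In the browser, a call such as `compressToBudget((q) => canvas.toDataURL("image/jpeg", q))` would produce the data URL passed to `buildVisionRequest`. Keeping the schema keys identical to the printed labels is what lets a reviewer compare the extracted JSON against the form image field by field.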
Tools & systems
- Next.js 16 (Pages Router) + React 19 + TypeScript
- Tailwind CSS 4 + shadcn/ui
- OpenAI vision (gpt-5-mini) via REST
- Firebase Auth, Realtime Database, Storage
- Tesseract.js & llama-ocr (fallback OCR)
- Chart.js, @tanstack/react-table, xlsx
- Vercel (iad1)
What it delivers
- Handwritten census forms become structured, searchable household records in seconds per page.
- Tagalog field names are preserved end-to-end, so enumerators, reviewers, and report consumers all work in the same vocabulary as the paper form.
- Four built-in dashboards turn the data into population, WASH, housing-risk, and utilities views the LGU can act on immediately.
- Repeatable pipeline: the same architecture can be reused for other LGUs and other Filipino-language paper forms.
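Each dashboard view is, in effect, an aggregation over the structured records. The sketch below shows one such aggregation for a WASH-style view. The record shape is an assumption: only the `tubig_at_sanitasyon` key comes from the form, while `barangay`, the flat string value, and the function name are hypothetical.

```typescript
// Hypothetical record shape; only `tubig_at_sanitasyon` is a field name from
// the printed form. The sample values used below are placeholders.
interface HouseholdRecord {
  barangay: string;
  tubig_at_sanitasyon: string; // the answer as written on the form
}

/** Count households per barangay, broken down by water/sanitation answer. */
function washBreakdown(
  records: HouseholdRecord[],
): Record<string, Record<string, number>> {
  const out: Record<string, Record<string, number>> = {};
  for (const r of records) {
    if (!out[r.barangay]) out[r.barangay] = {};
    const byAnswer = out[r.barangay];
    byAnswer[r.tubig_at_sanitasyon] = (byAnswer[r.tubig_at_sanitasyon] ?? 0) + 1;
  }
  return out;
}

// Placeholder data, not real census values.
const sample: HouseholdRecord[] = [
  { barangay: "Barangay A", tubig_at_sanitasyon: "gripo" },
  { barangay: "Barangay A", tubig_at_sanitasyon: "poso" },
  { barangay: "Barangay A", tubig_at_sanitasyon: "gripo" },
  { barangay: "Barangay B", tubig_at_sanitasyon: "poso" },
];
const breakdown = washBreakdown(sample);
```

For a chart, the keys and values of one barangay's entry (`Object.keys(...)` and `Object.values(...)`) would serve as the labels and the data series. Because the grouping key is the form's own field name, the same function shape would carry over to another section or another LGU's form with only the key changed.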
Let's scope Intelligent OCR Document Analyzer
Tell us your constraints and goals. We'll come back with a build plan, timeline, and price.
