All Services
LiveAI Integration · Automation

Intelligent OCR Document Analyzer

Vision-LLM OCR that digitizes handwritten paper forms into structured records and live analytics dashboards. Deployed for the LGU of Bocaue (Bulacan) as Smart Deep Scan — the household census hub.

01 · The Problem

What we set out to solve

Bocaue's barangay enumerators capture household census data on Tagalog paper forms — head of household, family composition, water and sanitation, housing tenure, flood exposure, utilities. Encoding the backlog into spreadsheets was slow, error-prone, and made cross-barangay reporting nearly impossible. Generic OCR could not read handwriting or preserve the bilingual field structure.

02 · The Approach

How we built it

  1. 01Built a category tree so enumerators file each scanned form under the correct barangay/sub-area before upload.
  2. 02Compressed each image client-side (canvas, ≤100 KB JPEG) and routed it through an OpenAI vision endpoint with a strict JSON schema mirroring the paper form one-to-one.
  3. 03Preserved Tagalog/Filipino field names verbatim (komposisyon_ng_pamilya, tubig_at_sanitasyon, tirahan) so downstream reports map back to the printed form.
  4. 04Wired an editor for low-confidence fields with side-by-side image preview, plus Excel export and four live dashboards (Population Profile, WASH, Housing Vulnerability, Utilities).
  5. 05Stood the whole stack on Firebase Auth + Realtime Database + Storage for a zero-ops shared workspace across LGU staff.
03 · The Stack

Tools & systems

  • Next.js 16 (Pages Router) + React 19 + TypeScript
  • Tailwind CSS 4 + shadcn/ui
  • OpenAI vision (gpt-5-mini) via REST
  • Firebase Auth, Realtime Database, Storage
  • Tesseract.js & llama-ocr (fallback OCR)
  • Chart.js, @tanstack/react-table, xlsx
  • Vercel (iad1)
04 · The Outcome

What it delivers

  • Handwritten census forms become structured, searchable household records in seconds per page.
  • Tagalog field names are preserved end-to-end, so enumerators, reviewers, and report consumers all work in the same vocabulary as the paper form.
  • Four built-in dashboards turn the data into population, WASH, housing-risk, and utilities views the LGU can act on immediately.
  • Repeatable pipeline — the same architecture scales to other LGUs and other Filipino-language paper forms.
Want this for your team?

Let's scope Intelligent OCR Document Analyzer

Tell us your constraints and goals. We'll come back with a build plan, timeline, and price.

Inquire Now