AI Agent

Document Extraction

From PDFs and Scans to Structured, Validated Data

Stock statements, invoices, distributor reports, KYC packs — the documents that power your operations rarely arrive as clean data. Our extraction agent reads them the way a careful accountant would, validates against a typed schema, and writes the result straight into your systems.

What This Agent Does

Document understanding is more than OCR. The agent reads layouts, infers field meaning, validates types, and tags low-confidence cells for review — so what reaches your downstream system is reliable.

Ingest Any PDF

Schema-Constrained Output

Header & Line-Item Split

Validate & Correct

Async Worker Queue

Reference Use Case — Stock Statements

Pharma distributors send monthly stock statements as PDFs. Each one has a different layout. Our agent reads them all into one canonical schema — distributor metadata up top, dozens of SKU rows in the body, validated types throughout.

Distributor Header

Identifying details extracted with the type and format the downstream system expects — not free-text strings.

  • Name, full address, city, state, pincode
  • GSTIN (validated format), PAN
  • Drug License & FSSAI numbers
  • Contact, email, website
  • Statement period start & end dates
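As a rough illustration of what "validated format" can mean for the header identifiers, the sketch below checks the published shapes of GSTIN, PAN, and pincode with regular expressions. The function name and the messages are hypothetical, and the check is format-only: it does not verify the GSTIN check character or confirm that the registration exists.

```python
import re
from typing import List

# Format-only patterns (assumed sufficient for a first-pass check).
GSTIN_RE = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")
PAN_RE = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")
PINCODE_RE = re.compile(r"[1-9][0-9]{5}")


def check_header_ids(gstin: str, pan: str, pincode: str) -> List[str]:
    """Return a list of problems; an empty list means the formats look fine."""
    problems = []
    gstin_ok = GSTIN_RE.fullmatch(gstin) is not None
    pan_ok = PAN_RE.fullmatch(pan) is not None
    if not gstin_ok:
        problems.append("gstin: unexpected format")
    if not pan_ok:
        problems.append("pan: unexpected format")
    if PINCODE_RE.fullmatch(pincode) is None:
        problems.append("pincode: expected 6 digits, not starting with 0")
    # Characters 3 to 12 of a GSTIN carry the holder's PAN.
    if gstin_ok and pan_ok and gstin[2:12] != pan:
        problems.append("gstin/pan: embedded PAN does not match")
    return problems
```

Cross-checking the PAN embedded in the GSTIN against the separately extracted PAN is a cheap way to catch a single misread character in either field.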

Line Items

Every SKU on the statement, parsed into a typed row — quantities as integers, values as floats, units normalised.

  • Product description & packing unit
  • Opening, purchase, sale, closing quantities
  • Opening, purchase, sale, closing values (₹)
  • Type-checked: integers, floats, dates
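The typed row described above might look something like the following. This is a minimal sketch using standard-library dataclasses as a dependency-free stand-in for the Pydantic models the pipeline is built on. The field names follow the sample output; the helper functions and the unit alias table are hypothetical.

```python
from dataclasses import dataclass

QTY_FIELDS = ("opening_qty", "purchase_qty", "sale_qty", "closing_qty")
VALUE_FIELDS = ("opening_value", "purchase_value", "sale_value", "closing_value")
# Hypothetical alias table; a real deployment would maintain its own.
UNIT_ALIASES = {"strip": "STRIP", "strips": "STRIP", "btl": "BOTTLE"}


def to_int(raw) -> int:
    """Parse a quantity such as '1,240'; reject fractional values."""
    value = float(str(raw).replace(",", "").strip())
    if not value.is_integer():
        raise ValueError(f"quantity is not a whole number: {raw!r}")
    return int(value)


def to_float(raw) -> float:
    """Parse a rupee value such as '18,600.00' or '₹ 18,600.00'."""
    return float(str(raw).replace(",", "").replace("₹", "").strip())


@dataclass
class LineItem:
    product_description: str
    packing: str
    opening_qty: int
    opening_value: float
    purchase_qty: int
    purchase_value: float
    sale_qty: int
    sale_value: float
    closing_qty: int
    closing_value: float

    @classmethod
    def from_raw(cls, row: dict) -> "LineItem":
        """Build a typed row from loosely typed cell values."""
        packing = str(row["packing"]).strip()
        return cls(
            product_description=str(row["product_description"]).strip(),
            # Known aliases map to a canonical unit; anything else is upper-cased.
            packing=UNIT_ALIASES.get(packing.lower(), packing.upper()),
            **{name: to_int(row[name]) for name in QTY_FIELDS},
            **{name: to_float(row[name]) for name in VALUE_FIELDS},
        )
```

Rejecting a fractional quantity outright, rather than rounding it, keeps a misread cell visible to the reviewer instead of silently producing a plausible number.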

The Pipeline

01

Upload

A PDF arrives at the upload endpoint from the UI, email ingest, or a watched folder. The document is given a record with its status set to "processing".

02

Enqueue

An RQ job is enqueued on a Redis-backed worker queue, so the API stays responsive while heavy extraction runs in the background.
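The upload-then-enqueue pattern can be sketched as follows. To keep the example runnable without Redis, it swaps in Python's standard-library queue and a background thread for the RQ queue and worker; the dictionary standing in for the documents table, the function names, and the "failed" status are all hypothetical.

```python
import queue
import threading
import uuid

documents = {}        # stand-in for the documents table
jobs = queue.Queue()  # stand-in for the Redis-backed RQ queue


def extract(document_id: str) -> None:
    """Placeholder for the slow extraction step."""
    documents[document_id]["status"] = "processed"


def worker() -> None:
    """Pull document ids off the queue and process them one at a time."""
    while True:
        document_id = jobs.get()
        try:
            extract(document_id)
        except Exception:
            documents[document_id]["status"] = "failed"  # hypothetical status
        finally:
            jobs.task_done()


def handle_upload(filename: str) -> str:
    """Create the record, enqueue the job, and return without waiting."""
    document_id = uuid.uuid4().hex
    documents[document_id] = {"filename": filename, "status": "processing"}
    jobs.put(document_id)
    return document_id


# Daemon thread so the worker does not keep the process alive on exit.
threading.Thread(target=worker, daemon=True).start()
```

With RQ itself, the `jobs.put(...)` call would typically become something like `Queue(connection=Redis()).enqueue(extract, document_id)`, and the worker would run as a separate process rather than a thread.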

03

VLM Extraction

A vision-language model runs against the typed Pydantic schema. Output is constrained to the shape your downstream code expects.

04

Validate

Field types are checked, totals are sanity-checked against line items, and low-confidence cells are flagged for review.
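A sketch of what the sanity checks in this step could look like, under three loudly stated assumptions: that closing stock equals opening plus purchases minus sales (statements with returns or adjustments would need extra terms), that the statement prints a closing-value total to compare against, and that the model reports a confidence score per cell. The function name, the shape of `confidences`, and the thresholds are hypothetical.

```python
from typing import Dict, List, Tuple


def review_flags(
    items: List[dict],
    stated_closing_total: float,
    confidences: Dict[Tuple[int, str], float],
    min_confidence: float = 0.9,
    tolerance: float = 0.01,
) -> List[str]:
    """Collect reasons a statement should be routed to human review."""
    flags = []
    # Per-row reconciliation: opening + purchase - sale should equal closing.
    for i, item in enumerate(items):
        expected = item["opening_qty"] + item["purchase_qty"] - item["sale_qty"]
        if expected != item["closing_qty"]:
            flags.append(
                f"row {i}: closing_qty {item['closing_qty']} != expected {expected}"
            )
    # Statement-level check: line items should add up to the printed total.
    total = sum(item["closing_value"] for item in items)
    if abs(total - stated_closing_total) > tolerance:
        flags.append(
            f"closing_value total {total:.2f} != stated {stated_closing_total:.2f}"
        )
    # Cell-level check: anything the model was unsure about gets flagged.
    for (i, field), score in sorted(confidences.items()):
        if score < min_confidence:
            flags.append(f"row {i}: {field} low confidence ({score:.2f})")
    return flags
```

An empty list means nothing needs attention; any entry names the row and field so the reviewer can jump straight to it.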

05

Persist

The structured result is saved alongside the source file with an audit trail, and the document's status is set to "processed".

06

Review & Edit

A side-by-side viewer lets a human compare the original PDF and the extracted fields, edit anything that needs correcting, and approve.

Sample Output

What you get back — a clean JSON document validated against the typed schema, ready to load into your warehouse, ERP, or analytics pipeline.

stock_statement_oct_2024.json

{
  "distributor_name": "Aarogya Pharma Distributors Pvt. Ltd.",
  "distributor_address_line": "Plot 14, MIDC Industrial Area, Phase II",
  "distributor_city": "Pune",
  "distributor_state": "Maharashtra",
  "distributor_pincode": "411019",
  "distributor_gstin": "27AABCA1234N1Z5",
  "distributor_pan": "AABCA1234N",
  "distributor_dl_no": "20B/MH-PUN/2019",
  "distributor_fssai_no": "11522003000324",
  "start_date": "Oct 2024",
  "end_date": "Oct 2024",
  "items": [
    {
      "product_description": "Paracetamol 500mg Tablets",
      "packing": "STRIP",
      "opening_qty": 1240,
      "opening_value": 18600.00,
      "purchase_qty": 5000,
      "purchase_value": 75000.00,
      "sale_qty": 4820,
      "sale_value": 96400.00,
      "closing_qty": 1420,
      "closing_value": 21300.00
    }
    // ... 187 more rows
  ]
}

Under the Hood

A boring, reliable pipeline — which is exactly what you want when you're processing thousands of documents a month.

Vision-Language Models

We use VLM Run for production extraction. The model layer is pluggable: we can swap in your preferred VLM provider or run it on-prem.

FastAPI + Postgres + Redis

A typed API surface, a relational store for documents and extractions, and an RQ-based async worker queue.

Pydantic-Typed Schemas

Schema-as-code. The same types validate the model output, the API response, and your downstream consumers.
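To make "schema-as-code" concrete, here is a minimal sketch in which one class definition yields both a JSON Schema that could be handed to the model as an output constraint and a conformance check for payloads. It uses standard-library dataclasses as a stand-in; Pydantic models can emit a JSON Schema directly. The class and function names are hypothetical, and only three fields are shown.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints

JSON_TYPES = {str: "string", int: "integer", float: "number"}


@dataclass
class ClosingStock:
    product_description: str
    closing_qty: int
    closing_value: float


def json_schema(model) -> dict:
    """Derive a JSON Schema object description from the class's type hints."""
    hints = get_type_hints(model)
    properties = {f.name: {"type": JSON_TYPES[hints[f.name]]} for f in fields(model)}
    return {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }


def conforms(payload: dict, model) -> bool:
    """Check a decoded JSON object against the same type hints."""
    hints = get_type_hints(model)
    if set(payload) != set(hints):
        return False
    for name, expected in hints.items():
        value = payload[name]
        if isinstance(value, bool):
            return False  # bool is an int subclass; no field here is boolean
        # A JSON number may decode as int even where a float is expected.
        accepted = (int, float) if expected is float else expected
        if not isinstance(value, accepted):
            return False
    return True
```

Because both functions read the same class, adding or retyping a field changes the constraint and the check together, which is the point of keeping the schema in code.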

Where It Fits

Pharma Distribution

Stock statements, secondary sales reports, distributor returns.

Finance & AP

Invoice and bill-of-lading capture into your ERP without manual entry.

KYC & Onboarding

Identity documents, address proofs, and licence packs into structured records.

Legal & Contracts

Key clause extraction and structured summaries from long-form contracts.
