INVOICE_OCR
Invoice OCR for scanned bills, line items, and AP exports
Turn invoice PDFs into structured data finance teams can review, export to Excel, or send downstream through the API.
{
"document_type": "invoice",
"invoice_number": "INV-10492",
"vendor_name": "Northstar Supply Co.",
"invoice_date": "2026-04-30",
"purchase_order": "PO-88421",
"subtotal": 4820.0,
"tax": 385.6,
"total_due": 5205.6
}
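A payload like the sample above can be sanity-checked before it reaches an AP system. The following is a minimal sketch, not part of the product: it assumes the JSON keys shown in the sample, and the helper name `totals_reconcile` and the one-cent tolerance are illustrative choices.

```python
import json
from decimal import Decimal

# The sample payload from this page, as a JSON string.
SAMPLE = """
{
  "document_type": "invoice",
  "invoice_number": "INV-10492",
  "vendor_name": "Northstar Supply Co.",
  "invoice_date": "2026-04-30",
  "purchase_order": "PO-88421",
  "subtotal": 4820.0,
  "tax": 385.6,
  "total_due": 5205.6
}
"""


def totals_reconcile(invoice: dict, tolerance: Decimal = Decimal("0.01")) -> bool:
    """Return True when subtotal + tax matches total_due within the tolerance.

    Hypothetical helper: the tolerance is an assumption, and invoices with
    discounts or freight would need those amounts included in the sum.
    """
    expected = invoice["subtotal"] + invoice["tax"]
    return abs(expected - invoice["total_due"]) <= tolerance


# parse_float=Decimal keeps money values exact instead of binary floats.
invoice = json.loads(SAMPLE, parse_float=Decimal)

if totals_reconcile(invoice):
    print(f"{invoice['invoice_number']}: totals reconcile")
else:
    print(f"{invoice['invoice_number']}: totals need review")
```

Parsing amounts as `Decimal` avoids rounding surprises when comparing currency values.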
FIELD_SCHEMA
Fields built for invoice processing
The parser focuses on invoice fields that decide payment routing: vendor identity, invoice dates, totals, taxes, and line items.
Invoice identity
Vendor and buyer
Amounts and lines
Handles invoice formats that break spreadsheets
Multi-page invoices with line items split across pages
Scanned bills where totals and tax rows are faint or rotated
Vendor layouts that place PO numbers outside the main table
Mixed currencies, discounts, and freight charges in the same PDF
From invoice upload to AP-ready data
Upload an invoice PDF, scan, or image
Extract vendor, invoice, line-item, tax, and total fields
Review low-confidence values before export
Export clean invoice data to Excel, CSV, JSON, or your API flow
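The review and export steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the product's actual schema: the per-field `value`/`confidence` shape, the `REVIEW_THRESHOLD` of 0.90, and the helper names are all hypothetical.

```python
import csv
import io

# Hypothetical cutoff: fields scoring below this are held for human review.
REVIEW_THRESHOLD = 0.90


def split_for_review(fields: dict, threshold: float = REVIEW_THRESHOLD):
    """Separate extracted fields into accepted values and names needing review."""
    accepted = {}
    needs_review = []
    for name, field in fields.items():
        if field["confidence"] < threshold:
            needs_review.append(name)
        else:
            accepted[name] = field["value"]
    return accepted, needs_review


def to_csv(row: dict) -> str:
    """Serialize one invoice record as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(row))
    writer.writeheader()
    writer.writerow(row)
    return buffer.getvalue()


# Assumed extraction result; the field shape is illustrative only.
extracted = {
    "invoice_number": {"value": "INV-10492", "confidence": 0.99},
    "vendor_name": {"value": "Northstar Supply Co.", "confidence": 0.97},
    "total_due": {"value": "5205.60", "confidence": 0.62},
}

accepted, needs_review = split_for_review(extracted)
print("Needs review:", needs_review)
print(to_csv(accepted))
```

In this sketch, held fields are simply left out of the export; in practice a reviewer would confirm or correct them first and then export the full record.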
FAQ
Common questions
Can PDF2TEXT extract invoice line items?
Yes. It extracts invoice header fields and line-item tables, then lets you export the result to Excel, CSV, or JSON.
Does invoice OCR work on scanned PDFs?
Yes. Upload scanned invoices or image files and review the extracted fields before sending data downstream.
Can I use invoice OCR with an API?
Yes. Use the API to return structured invoice JSON or export reviewed data from the workstation.
Turn invoice PDFs into structured AP data