Invoice OCR and parsing
How Financica extracts data from uploaded invoices using OCR.
When you upload a PDF or image of an invoice, Financica uses optical character recognition (OCR) to automatically extract the key information. This saves you from manually entering invoice details.
What gets extracted
The OCR engine identifies and extracts:
- Supplier or customer name — The company that issued or received the invoice.
- Invoice number — The reference number on the invoice.
- Invoice date — When the invoice was issued.
- Due date — When payment is expected.
- Line items — Individual products or services with descriptions, quantities, and prices.
- Subtotals and totals — Including any discounts applied.
- VAT details — Tax rates and VAT amounts per line item and in total.
- Payment information — Bank account details or payment references, when available.
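As an illustration, the extracted fields above could be modeled as a simple record. This is a minimal sketch, not Financica's actual schema: the class names, field names, and the totals check are all hypothetical, and discounts are left out to keep it short.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List, Optional


@dataclass
class LineItem:
    # Hypothetical shape for one extracted line item.
    description: str
    quantity: Decimal
    unit_price: Decimal
    vat_rate: Decimal  # e.g. Decimal("0.21") for 21% VAT

    @property
    def net(self) -> Decimal:
        return self.quantity * self.unit_price

    @property
    def vat(self) -> Decimal:
        # Round VAT to two decimal places per line (an assumption).
        return (self.net * self.vat_rate).quantize(Decimal("0.01"))


@dataclass
class ExtractedInvoice:
    # Hypothetical container for the fields listed above.
    party_name: str                      # supplier or customer name
    invoice_number: str
    invoice_date: str                    # ISO date, e.g. "2024-01-15"
    due_date: Optional[str]
    total: Decimal                       # total as printed on the invoice
    line_items: List[LineItem] = field(default_factory=list)
    payment_reference: Optional[str] = None

    def computed_total(self) -> Decimal:
        # Sum of net amounts plus VAT across all line items.
        net = sum((item.net for item in self.line_items), Decimal("0"))
        vat = sum((item.vat for item in self.line_items), Decimal("0"))
        return net + vat

    def totals_match(self) -> bool:
        # True when the line items add up to the printed total.
        return self.computed_total() == self.total


invoice = ExtractedInvoice(
    party_name="Example Supplies Ltd",
    invoice_number="INV-001",
    invoice_date="2024-01-15",
    due_date="2024-02-14",
    total=Decimal("145.20"),
    line_items=[
        LineItem("Paper", Decimal("2"), Decimal("50.00"), Decimal("0.21")),
        LineItem("Ink", Decimal("1"), Decimal("20.00"), Decimal("0.21")),
    ],
)
```

A check like `totals_match()` is one way to spot a misread digit: if the line items do not add up to the printed total, at least one extracted amount is probably wrong.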
How the process works
- Upload — You upload a PDF or image file from the expenses or revenue section.
- Processing — The file is sent to the OCR engine for analysis. This typically takes a few seconds.
- Review — The extracted data is presented for your review. Fields that the engine was less confident about may be highlighted.
- Correct and save — Make any necessary corrections and save the invoice record.
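The review and correction steps above can be sketched in code. This assumes each extracted field carries a confidence score between 0 and 1 and that fields below some threshold are flagged. The score, the threshold value, and all names here are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, NamedTuple


class ExtractedField(NamedTuple):
    value: str
    confidence: float  # hypothetical score from 0.0 to 1.0


REVIEW_THRESHOLD = 0.85  # assumed cut-off, not a documented setting


def fields_to_review(
    fields: Dict[str, ExtractedField],
    threshold: float = REVIEW_THRESHOLD,
) -> List[str]:
    """Return the names of fields whose confidence is below the threshold."""
    return [name for name, f in fields.items() if f.confidence < threshold]


def apply_corrections(
    fields: Dict[str, ExtractedField],
    corrections: Dict[str, str],
) -> Dict[str, ExtractedField]:
    """Return a copy of the fields with user corrections applied.

    A corrected value is treated as fully trusted (confidence 1.0).
    """
    updated = dict(fields)
    for name, value in corrections.items():
        updated[name] = ExtractedField(value=value, confidence=1.0)
    return updated


extracted = {
    "invoice_number": ExtractedField("INV-001", 0.99),
    "due_date": ExtractedField("2024-02-l4", 0.62),  # "l" misread for "1"
}
flagged = fields_to_review(extracted)
corrected = apply_corrections(extracted, {"due_date": "2024-02-14"})
```

Here only `due_date` is flagged, and after the correction is applied no fields remain below the threshold.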
Tips for better OCR results
- Use high-quality scans — Clear, well-lit images produce better results than blurry photos.
- PDF is preferred — Native PDF files (not scanned images saved as PDF) give the best results because the text is already machine-readable.
- Standard layouts — Invoices with conventional layouts are parsed more accurately than highly stylized designs.
- One invoice per file — Upload each invoice as a separate file for the cleanest results.
Supported file formats
- PDF (native and scanned)
- PNG and JPG images
- HEIC photos (from iPhone cameras)
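If you upload files in bulk, a quick pre-check against the formats listed above can catch unsupported files early. This is a minimal sketch. The function name is hypothetical, and accepting `.jpeg` alongside `.jpg` is an assumption.

```python
from pathlib import Path

# Extensions based on the supported formats listed above.
# ".jpeg" is assumed to be accepted as a variant of ".jpg".
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".heic"}


def is_supported(filename: str) -> bool:
    """Check the file extension, ignoring upper/lower case."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


candidates = ["invoice.PDF", "receipt.heic", "scan.jpg", "invoice.docx"]
uploadable = [name for name in candidates if is_supported(name)]
```

Note that this only looks at the file name. It does not verify that the file contents actually match the extension.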
When OCR is not enough
For invoices that OCR struggles with (handwritten invoices, unusual layouts, or very poor scan quality), you can always enter the details manually. The OCR extraction is a starting point, not a requirement: every field can be edited.
For structured electronic invoices (UBL XML), no OCR is needed at all. See Electronic invoicing.
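To see why structured invoices need no OCR, consider how the same fields can be read directly from UBL XML. The sketch below uses Python's standard library and the standard UBL 2 namespaces. The sample document is a heavily trimmed fragment for illustration, not a schema-valid invoice, and this is not how Financica itself processes these files.

```python
import xml.etree.ElementTree as ET

# Standard UBL 2 namespaces.
NS = {
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
}

# Trimmed illustrative fragment; a real UBL invoice has many more elements.
SAMPLE = """<Invoice
    xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2">
  <cbc:ID>INV-001</cbc:ID>
  <cbc:IssueDate>2024-01-15</cbc:IssueDate>
  <cbc:DueDate>2024-02-14</cbc:DueDate>
  <cac:LegalMonetaryTotal>
    <cbc:PayableAmount currencyID="EUR">145.20</cbc:PayableAmount>
  </cac:LegalMonetaryTotal>
</Invoice>"""


def read_ubl_summary(xml_text: str) -> dict:
    """Read a few header fields straight from the XML elements."""
    root = ET.fromstring(xml_text)
    return {
        "invoice_number": root.findtext("cbc:ID", namespaces=NS),
        "invoice_date": root.findtext("cbc:IssueDate", namespaces=NS),
        "due_date": root.findtext("cbc:DueDate", namespaces=NS),
        "total": root.findtext(
            "cac:LegalMonetaryTotal/cbc:PayableAmount", namespaces=NS
        ),
    }


summary = read_ubl_summary(SAMPLE)
```

Each value is read from a named element, so there is no character recognition step and nothing to guess.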