Which fields does the IDR recognise on a PDF invoice?

Which fields the IDR recognises on a PDF invoice: standard, Professional, and configurable references.

The IDR (Intelligent Document Recogniser) automatically recognises the key data on a PDF invoice and converts it into structured fields in the e-invoice. Which fields are recognised depends on your subscription and configuration.

Standard recognised fields

For every PDF conversion, the following fields are recognised automatically:

Field | Details
----- | -------
Supplier | Name, address, Chamber of Commerce number, VAT number. The supplier is identified through the eConnect party database (Purple Pages), not based on what appears on the PDF.
Buyer | Name and address as stated on the invoice. Included in an XML extension, not as primary identification.
Invoice number | Unique invoice number
Invoice date | Date of the invoice
Due date | Payment term (if stated)
Amounts | Subtotal, VAT amount and total amount
VAT rate | Percentage and category (standard, reverse charge, exempt)
IBAN | Supplier bank account
Payment reference | Structured reference (if present)
Currency | Currency of the invoice

Incorrectly recognised invoice number per supplier format

With some supplier formats, the IDR may pick up a different field instead of the invoice number, for example a transaction ID. A well-known example is Meta (Facebook) invoices, where the invoice number appears at the bottom of a later page of the PDF and the transaction ID, which is placed more prominently, is picked up instead.

This can be improved for the supplier format in question. Invoice number recognition cannot be configured by the customer; a support team member optimises it internally using outlier detection, regex and hints per supplier or format. No roadmap change is required: it is a targeted optimisation of existing functionality. Report such a case to support. Based on the report, the team adds a supplier-specific hint so that future invoices in that format receive the correct invoice number.

Professional fields

With the Professional subscription, additional fields are recognised:

Field | Details
----- | -------
Purchase order number | The most commonly used reference. The IDR recognises this field automatically.
Contract number | Reference number of the underlying contract
Project number | Reference number of the project
Buyer reference | Receiver reference field, configurable per Chamber of Commerce, OIN or VAT number
G-account IBAN | Recognition of G-account bank numbers (identifiable by the "099" series)
Structured payment references | Belgian OGM, Norwegian KID number, Swiss QR code
VAT base amounts (reverse charge) | Recognises the original VAT rate on reverse-charged invoices and assigns the correct VAT code (AE, K or G). Relevant for construction and housing associations.

Date recognition by country

The IDR recognises dates on PDF invoices and converts them to the standard UBL date format (YYYY-MM-DD). Because date notations differ by country, the IDR uses the supplier country to interpret ambiguous dates:

  • United States: MDY (month-day-year). The date 03/11 is interpreted as 11 March.
  • All other countries: DMY (day-month-year). The date 03/11 is interpreted as 3 November.

If automatic country detection is not enough, a specific hint for the date format can be added per supplier through the hint mechanism.
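The country rule above can be illustrated with a minimal Python sketch. The function name `to_ubl_date` and the assumption that the date arrives as a numeric `xx/xx/YYYY` string are illustrative only; this is not the IDR's actual implementation.

```python
from datetime import datetime


def to_ubl_date(raw: str, supplier_country: str) -> str:
    """Convert a numeric date such as '03/11/2024' to the UBL format YYYY-MM-DD.

    Hypothetical helper: US suppliers are read month-first (MDY),
    all other countries day-first (DMY), as described in the text.
    """
    fmt = "%m/%d/%Y" if supplier_country.upper() == "US" else "%d/%m/%Y"
    return datetime.strptime(raw, fmt).date().isoformat()


print(to_ubl_date("03/11/2024", "US"))  # 2024-03-11
print(to_ubl_date("03/11/2024", "NL"))  # 2024-11-03
```

The same input string yields two different UBL dates, which is why a wrong supplier country (or a supplier that deviates from its country's convention) calls for a date-format hint.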

Tip: wrongly interpreted dates (for example 03/11 as 11 March instead of 3 November) are almost always a recognition issue, not a platform issue. The platform always displays the date as it appears in the UBL.

IBAN validation

With the Professional subscription, the recognised IBAN is compared against the verification store: a database of IBANs that were previously validated manually per supplier. If the IBAN on the invoice differs from what was verified earlier, this is flagged. This helps detect ghost invoices or changed bank details.

Note: according to European standard EN 16931, the IBAN on an invoice is primarily for supplier identification, not a payment instruction. A changed IBAN must always be validated in your financial system's master data before payment is made.
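As a rough illustration of this comparison, the sketch below models the verification store as a dictionary of verified IBANs per supplier. The data structure, the supplier key and the function names are assumptions for the example, not the actual eConnect data model.

```python
def normalise_iban(iban: str) -> str:
    """Remove whitespace and upper-case, so 'nl91 abna ...' equals 'NL91ABNA...'."""
    return "".join(iban.split()).upper()


def check_iban(supplier_id: str, invoice_iban: str,
               verification_store: dict[str, set[str]]) -> str:
    """Return 'match', 'mismatch' or 'unverified' (hypothetical status labels)."""
    verified = verification_store.get(supplier_id)
    if not verified:
        return "unverified"  # nothing validated yet for this supplier
    if normalise_iban(invoice_iban) in verified:
        return "match"
    return "mismatch"  # differs from earlier verified IBANs: flag for review


# Hypothetical store; the IBANs are stored in normalised form.
store = {"supplier-001": {"NL91ABNA0417164300"}}

print(check_iban("supplier-001", "NL91 ABNA 0417 1643 00", store))  # match
print(check_iban("supplier-001", "NL02ABNA0123456789", store))      # mismatch
print(check_iban("supplier-999", "NL91ABNA0417164300", store))      # unverified
```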

Configurable references (custom)

In addition to the standard references (order number, contract number, project number), other references can be configured per supplier. Examples include budget codes, budget holder codes or internal references. This is custom work, set up on a strip card basis.

Configurable reference recognition uses a three-layer mechanism:

  1. Regex: format validation to ensure the extracted reference exactly matches the expected format
  2. Outlier detection: statistical deviation detection that catches unlikely values
  3. Hints: automatically generated training data based on corrections by the QC team
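How the three layers could work together is sketched below in Python. Everything here is an assumption made for illustration: the internals of the IDR are not documented in this article, so the length-based outlier test, the modelling of hints as per-supplier pattern overrides, and all names and sample values are hypothetical.

```python
import re
from statistics import mean, pstdev


def passes_regex(value: str, pattern: str) -> bool:
    # Layer 1: the entire value must match the expected format.
    return re.fullmatch(pattern, value) is not None


def is_outlier(value: str, accepted: list[str], max_z: float = 3.0) -> bool:
    # Layer 2 (simplified): flag values whose length deviates strongly
    # from the lengths of previously accepted values.
    lengths = [len(v) for v in accepted]
    if len(lengths) < 2:
        return False  # not enough history to judge
    spread = pstdev(lengths)
    if spread == 0:
        return len(value) != lengths[0]
    return abs(len(value) - mean(lengths)) / spread > max_z


def pick_reference(candidates, supplier_id, default_pattern, accepted, hints):
    # Layer 3 (assumed form): a hint overrides the default pattern per supplier.
    pattern = hints.get(supplier_id, default_pattern)
    for candidate in candidates:
        if passes_regex(candidate, pattern) and not is_outlier(candidate, accepted):
            return candidate
    return None


result = pick_reference(
    candidates=["TX-99812", "BC-2024-0017"],
    supplier_id="supplier-001",
    default_pattern=r"[A-Z]{2}-\d{4}-\d{4}",
    accepted=["BC-2023-0101", "BC-2024-0002"],
    hints={},
)
print(result)  # BC-2024-0017
```

In this sketch the transaction-like value `TX-99812` is rejected by the regex layer, and the second candidate passes both checks.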

Tip: the purchase order number is the most commonly used reference and is stated on the invoice by most suppliers. If a supplier cannot fill a certain reference field in their software, there is little point asking for it. In that case, use the purchase order number as the primary reference.

Reference obligation and rejection (EN 16931)

According to European standard EN 16931, a reference (buyer reference) is mandatory on an e-invoice; this is often a purchase order number or another reference from the recipient. The standard only requires that the field is filled, so a sender can still place an incorrect value in it.

eConnect does not reject an invoice on the reference by default. An invoice only fails to meet the basic rules of the European standard if there is no reference at all. Whether an invoice is rejected due to an unknown or incorrect reference depends on the recipient's configuration: in a specific recipient configuration, an invoice can be rejected if the reference is unknown. This is therefore a property of the receiving configuration, not of standard eConnect processing.

Tip: if you want to reject invoices on missing or unknown references, configure this via RBE (Rule Based Enrichment) on the receiving endpoint.

Missing invoice number: surrogate number (-NOTFOUND)

The invoice number (UBL field cbc:ID, BT-1) is mandatory in EN 16931, UBL BIS Billing 3.0 and NLCIUS for regular invoices. However, the IDR processes a mixed document stream: regular invoices, credit notes, expense claims and receipts. Receipts fall under the simplified invoice regime (transactions up to approximately €100 including VAT), for which the tax authority does not set an invoice number as a legal requirement. Rejecting on a missing invoice number would exclude all receipts and expense claims from processing.

For this reason, the IDR pipeline automatically generates a surrogate number when no invoice number can be extracted from the document during recognition. The cbc:ID field (BT-1) is then filled with the structure:

YYYYMMDDHHmmss-NOTFOUND

The timestamp is the moment of processing by the IDR pipeline, not the date on the document itself. Example: 20240315143022-NOTFOUND.
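The structure of the surrogate number can be reproduced with a short Python sketch. The function name is hypothetical and the code only mirrors the documented format; it is not the pipeline's own code.

```python
from datetime import datetime


def surrogate_invoice_number(processed_at: datetime) -> str:
    """Build YYYYMMDDHHmmss-NOTFOUND from the processing moment."""
    return processed_at.strftime("%Y%m%d%H%M%S") + "-NOTFOUND"


# Processing moment 15 March 2024, 14:30:22
print(surrogate_invoice_number(datetime(2024, 3, 15, 14, 30, 22)))
# 20240315143022-NOTFOUND
```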

Catching the surrogate number

Filtering is possible at two levels:

  1. RBE (Rule Based Enrichment): configure a rule that checks whether BT-1 (cbc:ID) ends with -NOTFOUND. Based on this, the document can be held for manual review, redirected to a separate inbox, or automatically rejected.
  2. Own systems (ERP/accounting package): a direct string match or regular expression on -NOTFOUND in the cbc:ID field is sufficient.
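
For the second option, a check in your own system could look like the following Python sketch. It assumes the received document is a UBL invoice with cbc:ID as a direct child of the root element; the function name and the sample XML are illustrative only.

```python
import re
import xml.etree.ElementTree as ET

# Standard UBL 2.x namespace for cbc: elements.
CBC = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
# 14-digit timestamp followed by the -NOTFOUND suffix.
SURROGATE = re.compile(r"\d{14}-NOTFOUND")


def has_surrogate_number(ubl_xml: str) -> bool:
    """Return True if the invoice number (BT-1, cbc:ID) is an IDR surrogate."""
    root = ET.fromstring(ubl_xml)
    invoice_id = root.find(f"{{{CBC}}}ID")  # direct child of the root only
    if invoice_id is None or invoice_id.text is None:
        return False
    return SURROGATE.fullmatch(invoice_id.text.strip()) is not None


sample = (
    '<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2" '
    f'xmlns:cbc="{CBC}"><cbc:ID>20240315143022-NOTFOUND</cbc:ID></Invoice>'
)
print(has_surrogate_number(sample))  # True
```

Matching the full pattern rather than only the suffix avoids false positives on a genuine invoice number that happens to contain the text NOTFOUND.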

Invoice lines (line recognition)

With line recognition, individual invoice lines are also recognised: description, unit price, quantity, line amount and reference fields per line. This is a separate feature you can activate yourself via My Environment.


Want to know how recognition works technically? Read "How does Scan & Recognise (IDR/OCR) work?"

View your conversion tasks