Core Concepts
This page explains key concepts and terminology used across the anyformat API.
Workflows
A Workflow defines what information should be extracted from your documents. Think of it as a template or schema for document processing. Each workflow contains:
- A unique identifier (UUID)
- A name and description
- A set of fields that define what data to extract
Fields
Fields define specific data points to extract from your documents. Each field has:
- A name (e.g., “invoice_number”)
- A description that helps the AI understand what to extract
- A data type (string, integer, float, date, etc.) - see Field Types for details
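As a rough illustration of how a workflow and its fields relate, the sketch below models them as Python dataclasses. The class names, attribute names, and the `to_payload` helper are assumptions made for this example and are not the API's actual request schema; see Create Workflow and Field Types for the real shapes.

```python
from dataclasses import dataclass, field, asdict
from uuid import UUID, uuid4


@dataclass
class Field:
    """One data point to extract (hypothetical shape)."""
    name: str
    description: str
    data_type: str = "string"  # e.g. "string", "integer", "float", "date"


@dataclass
class Workflow:
    """A template describing what to extract (hypothetical shape)."""
    name: str
    description: str
    fields: list[Field] = field(default_factory=list)
    # Generated locally for the example; a real identifier comes from the API.
    id: UUID = field(default_factory=uuid4)

    def to_payload(self) -> dict:
        """Illustrative serialisation only, not the documented request body."""
        payload = asdict(self)
        payload["id"] = str(self.id)
        return payload


invoice_workflow = Workflow(
    name="Invoices",
    description="Extract key data from supplier invoices",
    fields=[
        Field("invoice_number", "The invoice's unique reference number"),
        Field("total_amount", "Grand total including tax", "float"),
        Field("issue_date", "Date the invoice was issued", "date"),
    ],
)
```

Note how each field carries a description as well as a name: the description is what helps the AI decide what to extract, so it is worth writing it as carefully as the name.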
Runs
A Run is the process and result of applying a workflow to a document. When you submit a document to be processed with a workflow via `POST /v2/workflows/{id}/run/`, a file is created, which contains:
- A file UUID (the `id` in the response)
- A status tracking the processing lifecycle:
| Status | Description |
|---|---|
| `pending` | File created, processing not yet started |
| `queued` | Waiting for an available processing slot |
| `in_progress` | Processing is actively running |
| `processed` | Processing complete, results available |
| `error` | Processing failed |
| `cancelled` | Processing was cancelled (terminal state, stop polling) |
- Extracted data points (available when status is `processed`)
Retrieve the extracted data with `GET /v2/files/{file_id}/extraction/`, which returns 412 while processing and 200 when results are ready.
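The 412/200 behaviour lends itself to a simple polling loop. The following is a minimal sketch using only the Python standard library, not an official client: `make_http_fetch`, `wait_for_extraction`, the poll interval, and the attempt limit are all names and values chosen for this example, and the base URL and authentication headers are placeholders you would need to supply. How the `error` and `cancelled` states surface on this endpoint is not described here, so the sketch simply fails on any unexpected status code and caps the number of attempts.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable

# A fetcher returns (status_code, parsed_body) for
# GET /v2/files/{file_id}/extraction/
Fetch = Callable[[str], tuple[int, dict]]


def make_http_fetch(base_url: str, headers: dict[str, str]) -> Fetch:
    """Build a fetcher; base_url and headers are placeholders (assumptions)."""
    def fetch(file_id: str) -> tuple[int, dict]:
        url = f"{base_url}/v2/files/{file_id}/extraction/"
        request = urllib.request.Request(url, headers=headers)
        try:
            with urllib.request.urlopen(request) as response:
                return response.status, json.load(response)
        except urllib.error.HTTPError as err:
            # urllib raises on 4xx/5xx responses, so a 412 arrives here.
            return err.code, {}
    return fetch


def wait_for_extraction(fetch: Fetch, file_id: str,
                        interval: float = 2.0, max_attempts: int = 30) -> dict:
    """Poll until the endpoint returns 200, treating 412 as 'not ready yet'."""
    for _ in range(max_attempts):
        status_code, body = fetch(file_id)
        if status_code == 200:
            return body
        if status_code != 412:
            raise RuntimeError(f"Unexpected HTTP status {status_code}")
        time.sleep(interval)
    raise TimeoutError(f"File {file_id} not ready after {max_attempts} attempts")


# Offline demonstration with a stand-in fetcher: two 412s, then a 200.
# The response body here is made up for the example.
responses = iter([(412, {}), (412, {}), (200, {"invoice_number": "INV-001"})])
result = wait_for_extraction(lambda _file_id: next(responses),
                             "example-file-id", interval=0)
```

Passing the fetcher in as a parameter keeps the loop testable without network access; in real use you would pass the result of `make_http_fetch(...)` instead of the stand-in.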
Results
Results are the structured data extracted from your documents. Each data point includes:
- The field name
- The extracted value
- A confidence score (0-100)
- Evidence information (location in the document where the data was found)
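One way to use these data points is to route low-confidence values to manual review. The sketch below assumes each data point is a dictionary with `field`, `value`, `confidence`, and `evidence` keys; those key names and the threshold of 80 are illustrative assumptions, not the documented response format (see Response Formats for that).

```python
from typing import Any

REVIEW_THRESHOLD = 80  # arbitrary cut-off on the 0-100 scale


def split_by_confidence(data_points: list[dict[str, Any]],
                        threshold: int = REVIEW_THRESHOLD):
    """Return (accepted, needs_review) lists of data points."""
    accepted, needs_review = [], []
    for point in data_points:
        target = accepted if point["confidence"] >= threshold else needs_review
        target.append(point)
    return accepted, needs_review


# Made-up sample data using the assumed key names.
sample = [
    {"field": "invoice_number", "value": "INV-001", "confidence": 97,
     "evidence": [{"text": "Invoice No. INV-001", "page": 1}]},
    {"field": "total_amount", "value": 1250.0, "confidence": 62,
     "evidence": [{"text": "Total due 1,250.00", "page": 2}]},
]
accepted, needs_review = split_by_confidence(sample)
```

The right threshold depends on how costly a wrong value is for your use case, so treat 80 as a starting point rather than a recommendation.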
Evidence
Evidence is an array of metadata objects that indicate where in the document each piece of information was found. It is an array because, depending on the type of information requested, a value is sometimes inferred rather than extracted directly from one concrete place, so more than one location may support it. This makes the evidence array especially useful for validating results. Each evidence object provides:
- The actual snippet of text from which the data was extracted
- The page number in the document
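Because a value can be inferred rather than copied verbatim, one possible validation step is to check whether the extracted value literally appears in any evidence snippet, and on which pages. This is a minimal sketch, assuming evidence objects are dictionaries with hypothetical `text` and `page` keys; the function name is also invented for this example.

```python
def pages_quoting_value(value: str, evidence: list[dict]) -> list[int]:
    """Pages whose snippet contains the value verbatim (case-insensitive)."""
    needle = value.strip().lower()
    return sorted({item["page"] for item in evidence
                   if needle and needle in item["text"].lower()})


# Made-up evidence array using the assumed key names.
evidence = [
    {"text": "Invoice No. INV-001", "page": 1},
    {"text": "Ref: inv-001 (see first page)", "page": 3},
    {"text": "Payment due within 30 days", "page": 3},
]

quoted_on = pages_quoting_value("INV-001", evidence)      # verbatim matches
inferred = pages_quoting_value("net 30", evidence) == []  # no verbatim match
```

An empty result does not mean the value is wrong, only that it was probably inferred and may deserve a closer look at the snippets.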
Confidence
The Confidence score (0-100) indicates how certain the system is about an extracted value. Higher scores indicate greater confidence in the accuracy.
Additional Resources
For detailed information about specific topics, see:
- Field Types - Detailed field type definitions including objects and enums
- Response Formats - Output formats and the `as_lists=true` parameter
- Error Handling - Complete error handling guide with examples
- Create Workflow - API endpoint for creating workflows
