Every recipe and SDK reference defers to the shapes defined on this page.
Paginated List Responses
Endpoints that return multiple items use a paginated wrapper:
{
  "count": 25,
  "page": 1,
  "page_size": 20,
  "results": [ /* items */ ]
}
| Field | Description |
|---|---|
| count | Total number of items matching the query, across all pages. |
| page | Current page number (1-indexed). |
| page_size | Number of items per page. Maximum is 100; larger values are silently capped. |
| results | Array of items for the current page. |
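As an illustration of consuming the wrapper above, here is a minimal sketch that walks every page until `count` items have been seen. The `fetch_page` callable is hypothetical: it stands in for whatever HTTP client you use, and how the page number and page size are passed to the server (for example as query parameters) is an assumption not confirmed by this page.

```python
from typing import Any, Callable, Dict, Iterator

Page = Dict[str, Any]


def iter_items(fetch_page: Callable[[int, int], Page], page_size: int = 20) -> Iterator[Any]:
    """Yield every item from a paginated list endpoint.

    `fetch_page(page, page_size)` is a hypothetical callable that performs the
    request and returns the decoded paginated wrapper as a dict.
    """
    page = 1  # pages are 1-indexed
    seen = 0
    while True:
        body = fetch_page(page, page_size)
        results = body["results"]
        for item in results:
            yield item
        seen += len(results)
        # Stop once every item counted by the server has been seen, or on an
        # empty page (guards against `count` changing between requests).
        if not results or seen >= body["count"]:
            break
        page += 1
```

Stopping on `count` rather than on a short page avoids one extra request when the last page happens to be full.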
File-Collection Results
The shape returned by GET /v2/workflows/{workflow_id}/files/{collection_id}/results/ once processing completes (HTTP 200). While processing is still in progress the endpoint returns 412 Precondition Failed — see Error Handling for the polling pattern.
{
  "collection_id": "069dcc2c-e14c-7606-8000-2ee4fb17b4e1",
  "verification_url": "https://app.anyformat.ai/workflows/.../files/...",
  "parse": {
    "markdown": "<DOCUMENT id=\"1\" page=\"1\">..."
  },
  "classifications": [],
  "splits": [],
  "extractions": [
    {
      "split_name": null,
      "partition": null,
      "fields": {
        "invoice_number": {
          "value": "INV-2024-0847",
          "value_override": null,
          "verification_status": "not_verified",
          "confidence": 97.0,
          "evidence": [{"text": "Invoice #INV-2024-0847", "page_number": 1}]
        },
        "total_amount": {
          "value": 4087.50,
          "value_override": null,
          "verification_status": "not_verified",
          "confidence": 96.0,
          "evidence": [{"text": "Total: $4,087.50", "page_number": 2}]
        }
      }
    }
  ],
  "extraction": {
    "invoice_number": {"value": "INV-2024-0847", "confidence": 97.0, "evidence": [{"text": "Invoice #INV-2024-0847", "page_number": 1}], "value_override": null, "verification_status": "not_verified"},
    "total_amount": {"value": 4087.50, "confidence": 96.0, "evidence": [{"text": "Total: $4,087.50", "page_number": 2}], "value_override": null, "verification_status": "not_verified"}
  }
}
Top-level fields
| Field | Type | Description |
|---|---|---|
| collection_id | string (UUID) | Identifier of the file collection. Same value as the id returned by POST /v2/workflows/{wid}/run/. |
| verification_url | string | Link to the AnyFormat dashboard for human review of the results. |
| parse | object \| null | Output of the parse node. null if the workflow has no parse node. |
| extraction | object \| null | Deprecated: use extractions instead. Mirrors extractions[0].fields for linear (non-split) workflows; null for parse-only or split workflows. Will be removed in a future major version. |
| classifications | array | Per-classifier-node verdicts. Each entry has category, confidence, evidence. Empty when the workflow has no classifier. |
| splits | array | Splitter output: category-level geometry with optional partitions. Each entry has name, files[], confidence, partitions[]. Empty when the workflow has no splitter. |
| extractions | array | Flat list of extraction datapoints, one entry per (split_name, partition) pair. Linear workflows produce a single untagged entry (split_name=null, partition=null). Empty when no extraction has run yet. |
The keys parse and extraction are always present — they are explicitly null when the corresponding node didn’t produce results. New code should read extractions[] instead; extraction is retained for back-compat but will be removed.
parse object
| Field | Type | Description |
|---|---|---|
| markdown | string \| null | Document content rendered as structured markdown. Tables become `<table>` elements; figures become base64-encoded `<img>` tags; each page is wrapped in `<DOCUMENT>` / `<section>` blocks with bounding-box coordinates. |
See Parse-Only Workflow for an example of the markdown structure.
The extraction object is keyed by field name. Each value is an ExtractedField:
| Field | Type | Description |
|---|---|---|
| value | varies \| null | The extracted value. The runtime type depends on the field’s data_type: a string, a number for integer/float, an ISO-format string for date/datetime, and so on. null when no value could be extracted. |
| value_override | varies \| null | Human-supplied override of the extracted value, if one was set during verification. null when no override exists. To pick the most-trusted value, check explicitly for null (see the reading pattern below). Don’t use a truthy fallback such as or / \|\|: a legitimate override of 0, False, or "" is falsy and would be silently discarded. |
| verification_status | string \| null | Verification state for this datapoint. Common values: not_verified (default), verified. null when not yet reviewed. |
| confidence | number | Model confidence on a 0–100 scale. Higher means more certain. |
| evidence | array | Source-text snippets the model used to derive this value. May be empty. |
evidence entries
| Field | Type | Description |
|---|---|---|
| text | string | The exact source-text snippet that supports the extracted value. |
| page_number | integer | 1-indexed page number where the snippet was found. |
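The confidence, verification_status, and evidence fields can be combined to decide which datapoints deserve a human look. The sketch below is one possible triage rule, not an official recommendation: the function name `fields_needing_review` and the default threshold of 90 are assumptions chosen for illustration.

```python
from typing import Any, Dict, List


def fields_needing_review(
    fields: Dict[str, Dict[str, Any]],
    min_confidence: float = 90.0,  # assumed threshold on the 0-100 scale
) -> List[str]:
    """Return the names of ExtractedField entries worth sending to review."""
    flagged: List[str] = []
    for name, field in fields.items():
        if field.get("verification_status") == "verified":
            continue  # a reviewer has already signed off on this datapoint
        missing_value = field["value"] is None
        low_confidence = field["confidence"] < min_confidence
        no_evidence = not field.get("evidence")  # empty list or absent
        if missing_value or low_confidence or no_evidence:
            flagged.append(name)
    return flagged
```

Pass it the `fields` object of one `extractions[]` entry; flagged names can then be surfaced alongside the `verification_url`.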
Reading results — recommended pattern
result = response.json()
# Parse markdown (always present, may be null for non-parse workflows)
markdown = result["parse"]["markdown"] if result["parse"] else None
# Extraction values: read `extractions[]`. Linear workflows produce one
# untagged entry (split_name=None, partition=None); split workflows produce
# one entry per (split, partition).
for extraction in result["extractions"]:
    fields = extraction["fields"]
    field = fields["invoice_number"]
    # Prefer the human override if one exists, otherwise the model value.
    # `is not None` (not `or`): a legitimate override of 0, False, or "" is falsy.
    final = field["value_override"] if field["value_override"] is not None else field["value"]
    if extraction["split_name"]:
        print(f"{extraction['split_name']}/{extraction['partition']}: {final}")
    else:
        print(final)
# Deprecated singular form (kept for back-compat; only populated for linear
# workflows). Prefer `extractions[]` above.
legacy = result.get("extraction")
Run Response
POST /v2/workflows/{wid}/run/ returns this shape on 202 Accepted:
{
  "id": "069dcc2c-e14c-7606-8000-2ee4fb17b4e1",
  "status": "success",
  "workflow_id": "0686bb97-8c30-70f0-8000-97669e000eb8"
}
The id field is the collection_id: pass it as {collection_id} to the results endpoint. Use a single name for this value throughout your client code so the two are not mistaken for different identifiers.
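To connect the run response to the results endpoint, a client typically polls until the 412 described under File-Collection Results turns into a 200. The following is a rough sketch under stated assumptions: `get_results` is a hypothetical callable that issues the GET for one collection_id and returns `(status_code, decoded_body)`, and the interval and attempt limit are arbitrary defaults. See Error Handling for the documented polling guidance.

```python
import time
from typing import Any, Callable, Dict, Tuple

Response = Tuple[int, Dict[str, Any]]


def wait_for_results(
    get_results: Callable[[], Response],  # hypothetical: performs the GET
    interval: float = 2.0,                # assumed delay between attempts, seconds
    max_attempts: int = 30,               # assumed upper bound on attempts
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the results endpoint returns 200, treating 412 as 'not ready'."""
    for attempt in range(max_attempts):
        status, body = get_results()
        if status == 200:
            return body
        if status != 412:
            # Anything else is a real error; surface it instead of retrying.
            raise RuntimeError(f"unexpected status {status}: {body}")
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"results not ready after {max_attempts} attempts")
```

Injecting `sleep` keeps the function easy to test without real delays.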
Error Responses
All error responses share a single shape:
{
  "error": "Brief, human-readable error description",
  "detail": "Detailed explanation of what went wrong",
  "error_code": "MACHINE_READABLE_ERROR_CODE",
  "retryable": false,
  "request_id": "a1b2c3d4e5f67890abcdef1234567890"
}
See Error Handling for the complete error-code reference, polling guidance, and best-practice retry logic.
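Because every error shares one shape, a client can map it onto a single exception type. The class below is a hypothetical helper, not part of any AnyFormat SDK; it simply copies the five documented keys and falls back to empty values if a key is absent.

```python
from typing import Any, Dict


class ApiError(Exception):
    """Hypothetical exception wrapping the shared error response shape."""

    def __init__(self, status: int, payload: Dict[str, Any]) -> None:
        self.status = status
        self.error = payload.get("error", "")
        self.detail = payload.get("detail", "")
        self.error_code = payload.get("error_code", "")
        self.retryable = bool(payload.get("retryable", False))
        self.request_id = payload.get("request_id", "")
        super().__init__(
            f"[{status}] {self.error_code}: {self.error} "
            f"(request_id={self.request_id})"
        )
```

A caller can then branch on `err.retryable` to decide whether to retry, and include `err.request_id` in logs or support requests.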
All UUIDs in the API are hyphenated UUIDv7 (RFC 4122 form: 8-4-4-4-12):
069dcc2c-e14c-7606-8000-2ee4fb17b4e1
This applies to workflow IDs, collection IDs, file IDs, and webhook IDs alike. A few legacy code paths historically returned the same value as 32-character hex (no hyphens) — those are deprecated and will be removed. Always treat UUIDs as opaque strings.
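If your system still holds IDs captured from one of the legacy hex code paths, one option is to normalize them once at the boundary and treat them as opaque strings from then on. This is a small sketch using Python's standard `uuid` module; the function name `normalize_id` is an assumption for illustration.

```python
import uuid


def normalize_id(raw: str) -> str:
    """Return the hyphenated 8-4-4-4-12 form of a UUID string.

    Accepts both the hyphenated form and the legacy 32-character hex form.
    Raises ValueError if `raw` is not a well-formed UUID.
    """
    return str(uuid.UUID(raw))
```

For example, `normalize_id("069dcc2ce14c760680002ee4fb17b4e1")` yields the hyphenated value shown above, and an already hyphenated ID passes through unchanged.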
Supported Endpoints
| Endpoint | Returns |
|---|---|
| GET /v2/workflows/{workflow_id}/files/{collection_id}/results/ | File-collection results (this page’s shape) |
| GET /v2/workflows/ | Paginated list response |
| GET /v2/workflows/{workflow_id}/runs/ | Paginated list response |
| GET /v2/workflows/{workflow_id}/files/ | Paginated list response |