Every recipe and SDK reference defers to the shapes defined on this page.

Paginated List Responses

Endpoints that return multiple items use a paginated wrapper:
{
  "count": 25,
  "page": 1,
  "page_size": 20,
  "results": [ /* items */ ]
}

| Field | Description |
| --- | --- |
| count | Total number of items matching the query, across all pages. |
| page | Current page number (1-indexed). |
| page_size | Number of items per page. Maximum is 100; larger values are silently capped. |
| results | Array of items for the current page. |
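
A minimal sketch of walking every page of a paginated response follows. It assumes a hypothetical `fetch_page(page)` callable that performs the HTTP request and returns the decoded wrapper; how the page number is sent to the server is not specified on this page and is left to that callable.

```python
import math
from typing import Any, Callable, Dict, Iterator


def iter_items(fetch_page: Callable[[int], Dict[str, Any]]) -> Iterator[Any]:
    """Yield every item across all pages of a paginated list response.

    `fetch_page` is a hypothetical callable (not part of any SDK) that takes
    a 1-indexed page number and returns the decoded wrapper shown above.
    """
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["results"]
        # Trust the page_size the server reports: an oversized request
        # may have been capped.
        page_size = body["page_size"]
        total_pages = math.ceil(body["count"] / page_size) if page_size else 0
        if page >= total_pages:
            break
        page += 1
```

With the sample wrapper above (count 25, page_size 20), this loop would request two pages.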

File-Collection Results

The shape returned by GET /v2/workflows/{workflow_id}/files/{collection_id}/results/ once processing completes (HTTP 200). While processing is still in progress the endpoint returns 412 Precondition Failed — see Error Handling for the polling pattern.
{
  "collection_id": "069dcc2c-e14c-7606-8000-2ee4fb17b4e1",
  "verification_url": "https://app.anyformat.ai/workflows/.../files/...",
  "parse": {
    "markdown": "<DOCUMENT id=\"1\" page=\"1\">..."
  },
  "classifications": [],
  "splits": [],
  "extractions": [
    {
      "split_name": null,
      "partition": null,
      "fields": {
        "invoice_number": {
          "value": "INV-2024-0847",
          "value_override": null,
          "verification_status": "not_verified",
          "confidence": 97.0,
          "evidence": [{"text": "Invoice #INV-2024-0847", "page_number": 1}]
        },
        "total_amount": {
          "value": 4087.50,
          "value_override": null,
          "verification_status": "not_verified",
          "confidence": 96.0,
          "evidence": [{"text": "Total: $4,087.50", "page_number": 2}]
        }
      }
    }
  ],
  "extraction": {
    "invoice_number": {"value": "INV-2024-0847", "confidence": 97.0, "evidence": [{"text": "Invoice #INV-2024-0847", "page_number": 1}], "value_override": null, "verification_status": "not_verified"},
    "total_amount":   {"value": 4087.50,        "confidence": 96.0, "evidence": [{"text": "Total: $4,087.50",       "page_number": 2}], "value_override": null, "verification_status": "not_verified"}
  }
}
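
The 200/412 behavior described above suggests a polling loop. The following is only a sketch: `get_results` is a hypothetical callable returning `(status_code, body)`, and the interval and timeout defaults are arbitrary assumptions. The Error Handling page remains the reference for the recommended pattern.

```python
import time
from typing import Any, Callable, Dict, Tuple


def wait_for_results(
    get_results: Callable[[], Tuple[int, Dict[str, Any]]],
    interval_s: float = 2.0,    # assumed value, not an official recommendation
    timeout_s: float = 300.0,   # assumed value, not an official recommendation
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the results endpoint stops answering 412."""
    deadline = time.monotonic() + timeout_s
    while True:
        status, body = get_results()
        if status == 200:
            return body  # processing complete: this is the results shape
        if status != 412:
            raise RuntimeError(f"unexpected status {status}: {body}")
        if time.monotonic() >= deadline:
            raise TimeoutError("results not ready before timeout")
        sleep(interval_s)
```

Injecting `sleep` keeps the loop easy to exercise in tests without real delays.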

Top-level fields

| Field | Type | Description |
| --- | --- | --- |
| collection_id | string (UUID) | Identifier of the file collection. Same value as the id returned by POST /v2/workflows/{wid}/run/. |
| verification_url | string | Link to the AnyFormat dashboard for human review of the results. |
| parse | object or null | Output of the parse node. null if the workflow has no parse node. |
| extraction | object or null | Deprecated: use extractions instead. Mirrors extractions[0].fields for linear (non-split) workflows; null for parse-only or split workflows. Will be removed in a future major version. |
| classifications | array | Per-classifier-node verdicts. Each entry has category, confidence, evidence. Empty when the workflow has no classifier. |
| splits | array | Splitter output: category-level geometry with optional partitions. Each entry has name, files[], confidence, partitions[]. Empty when the workflow has no splitter. |
| extractions | array | Flat list of extraction datapoints, one entry per (split_name, partition) pair. Linear workflows produce a single untagged entry (split_name=null, partition=null). Empty when no extraction has run yet. |

The keys parse and extraction are always present — they are explicitly null when the corresponding node didn’t produce results. New code should read extractions[] instead; extraction is retained for back-compat but will be removed.
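
As a sketch of reading `extractions[]` without touching the deprecated key, the hypothetical helper below returns the fields of the single untagged entry that a linear workflow produces, or `None` when there is no such entry. The helper name is an assumption for illustration, not part of any SDK.

```python
from typing import Any, Dict, Optional


def linear_fields(result: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the fields of the untagged extraction entry, if there is one.

    Hypothetical helper: it relies only on the documented rule that linear
    workflows produce one entry with split_name=None and partition=None.
    """
    for entry in result.get("extractions") or []:
        if entry["split_name"] is None and entry["partition"] is None:
            return entry["fields"]
    return None
```

For split workflows every entry is tagged, so the helper returns `None` and the caller should iterate over the entries instead.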

parse object

| Field | Type | Description |
| --- | --- | --- |
| markdown | string or null | Document content rendered as structured markdown. Tables become `<table>` elements; figures become base64-encoded `<img>` tags; each page is wrapped in `<DOCUMENT>` / `<section>` blocks with bounding-box coordinates. |

See Parse-Only Workflow for an example of the markdown structure.

extraction object — ExtractedField shape

The extraction object is keyed by field name. Each value is an ExtractedField:

| Field | Type | Description |
| --- | --- | --- |
| value | varies or null | The extracted value. The runtime type depends on the field’s data_type: string, number for integer/float, ISO-format string for date/datetime, etc. null when no value could be extracted. |
| value_override | varies or null | Human-supplied override of the extracted value, if one was set during verification. null when no override exists. To pick the most-trusted value, check explicitly for null (see the reading pattern below). Don’t use a truthy fallback such as Python’s or: a legitimate override of 0, False, or "" is falsy and would be silently discarded. |
| verification_status | string or null | Verification state for this datapoint. Common values: not_verified (default), verified. null when not yet reviewed. |
| confidence | number | Model confidence on a 0–100 scale. Higher means more certain. |
| evidence | array | Source-text snippets the model used to derive this value. May be empty. |

evidence entries

| Field | Type | Description |
| --- | --- | --- |
| text | string | The exact source-text snippet that supports the extracted value. |
| page_number | integer | 1-indexed page number where the snippet was found. |

Reading pattern (Python), where response is the HTTP 200 response from the results endpoint:

result = response.json()

# parse markdown (always present, may be null for non-parse workflows)
markdown = result["parse"]["markdown"] if result["parse"] else None

# Extraction values — read `extractions[]`. Linear workflows produce one
# untagged entry (split_name=None, partition=None); split workflows produce
# one entry per (split, partition).
for extraction in result["extractions"]:
    fields = extraction["fields"]
    field = fields["invoice_number"]

    # prefer human override if one exists, otherwise the model value.
    # `is not None` (not `or`) — a legitimate override of 0, False, or "" is falsy.
    final = field["value_override"] if field["value_override"] is not None else field["value"]
    if extraction["split_name"]:
        print(f"{extraction['split_name']}/{extraction['partition']}: {final}")
    else:
        print(final)

# Deprecated singular form (kept for back-compat; only populated for linear
# workflows). Prefer `extractions[]` above.
legacy = result.get("extraction")
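
Beyond picking the final value, the `confidence`, `verification_status`, and `evidence` fields can drive a simple review queue. The sketch below is illustrative only: the helper name and the threshold of 90 are assumptions, not values recommended by the API.

```python
from typing import Any, Dict, List


def needs_review(
    fields: Dict[str, Dict[str, Any]], threshold: float = 90.0
) -> List[Dict[str, Any]]:
    """List unverified fields whose confidence is below `threshold` (0-100 scale)."""
    flagged = []
    for name, field in fields.items():
        if field["verification_status"] == "verified":
            continue  # a human has already confirmed this datapoint
        if field["confidence"] < threshold:
            # Collect the distinct pages cited as evidence (may be empty).
            pages = sorted({e["page_number"] for e in field["evidence"]})
            flagged.append(
                {"field": name, "confidence": field["confidence"], "pages": pages}
            )
    return flagged
```

Pass it the `fields` object of one `extractions[]` entry; the returned page numbers point a reviewer at the relevant pages.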

Run Response

POST /v2/workflows/{wid}/run/ returns this shape on 202 Accepted:
{
  "id": "069dcc2c-e14c-7606-8000-2ee4fb17b4e1",
  "status": "success",
  "workflow_id": "0686bb97-8c30-70f0-8000-97669e000eb8"
}
The id field is the collection_id: pass it as {collection_id} to the results endpoint, and refer to it by a single name (for example, collection_id) throughout your client code.
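
To illustrate the hand-off, here is a small sketch that builds the results path from a run response. It assumes the run response's `workflow_id` is the same value the results endpoint expects as `{workflow_id}`; the function name is hypothetical, and the base URL and authentication are deliberately left out.

```python
from typing import Any, Dict


def results_path(run_response: Dict[str, Any]) -> str:
    """Build the results endpoint path from a decoded run response body."""
    collection_id = run_response["id"]  # the run response's `id` is the collection_id
    workflow_id = run_response["workflow_id"]
    return f"/v2/workflows/{workflow_id}/files/{collection_id}/results/"
```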

Error Responses

All error responses share a single shape:
{
  "error": "Brief, human-readable error description",
  "detail": "Detailed explanation of what went wrong",
  "error_code": "MACHINE_READABLE_ERROR_CODE",
  "retryable": false,
  "request_id": "a1b2c3d4e5f67890abcdef1234567890"
}
See Error Handling for the complete error-code reference, polling guidance, and best-practice retry logic.
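
Because every error shares this shape, a client can map it onto one exception type. The class below is a hypothetical sketch (it is not part of any SDK) that copies the five documented keys onto attributes so callers can branch on `retryable` and log `request_id`.

```python
from typing import Any, Dict


class ApiError(Exception):
    """Hypothetical client-side exception mirroring the error body."""

    def __init__(self, body: Dict[str, Any]) -> None:
        self.error: str = body["error"]
        self.detail: str = body["detail"]
        self.error_code: str = body["error_code"]
        self.retryable: bool = bool(body["retryable"])
        self.request_id: str = body["request_id"]
        # Keep the message short; `detail` stays available as an attribute.
        super().__init__(
            f"{self.error_code}: {self.error} (request_id={self.request_id})"
        )
```

A caller might raise `ApiError(response_body)` for any error response and retry only when `err.retryable` is true.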

Identifier Format

All UUIDs in the API are hyphenated UUIDv7 (RFC 4122 form: 8-4-4-4-12):
069dcc2c-e14c-7606-8000-2ee4fb17b4e1
This applies to workflow IDs, collection IDs, file IDs, and webhook IDs alike. A few legacy code paths historically returned the same value as 32-character hex (no hyphens) — those are deprecated and will be removed. Always treat UUIDs as opaque strings.
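
If a client still holds identifiers from one of the legacy 32-character hex paths, they can be brought into the hyphenated form before comparison. This is a sketch using Python's standard `uuid` module; the helper name is an assumption, and identifiers should otherwise be passed through unchanged.

```python
import uuid


def normalize_id(value: str) -> str:
    """Return the hyphenated 8-4-4-4-12 form of a UUID string.

    uuid.UUID accepts both the hyphenated and the 32-character hex form,
    and str() always renders the hyphenated, lowercase form.
    """
    return str(uuid.UUID(value))
```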

Supported Endpoints

| Endpoint | Returns |
| --- | --- |
| GET /v2/workflows/{workflow_id}/files/{collection_id}/results/ | File-collection results (this page’s shape) |
| GET /v2/workflows/ | Paginated list response |
| GET /v2/workflows/{workflow_id}/runs/ | Paginated list response |
| GET /v2/workflows/{workflow_id}/files/ | Paginated list response |