A workflow is a graph. You compose it from a small set of node types, wire them together with edges, and run any document through the whole thing as a single call. Each workflow targets one kind of document and one shape of output. Need a different output? Build a different workflow.

The five node types

A workflow’s graph is built from five node types. Each one does one thing.
| Node | What it produces | Typical use |
|---|---|---|
| Parse | Structured markdown of the document (text + tables + figures, with bounding boxes and per-block confidence) | Always required; exactly one per workflow |
| Classify | A category verdict for the document | “Is this an invoice or a receipt?” |
| Splitter | Document segments (one logical document broken into multiple) | “This PDF contains four invoices stapled together” |
| Extract | Typed fields (data_type) pulled from upstream parsed content | The classic “give me invoice_number, total, date” use case |
| Validate | Pass/fail checks on extracted values | “Total must be ≥ 0 and within 1% of line-item sum” |

Edges (source → target) wire nodes into a pipeline. Branching nodes (classify and splitter) need a branch on each outgoing edge to say which category or split rule the downstream node handles. The full topology rules — exactly one parse node, no cycles, who can connect to whom — live on the Create workflow endpoint reference.
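
The three rules named above (exactly one parse node, a branch on every edge leaving a classify or splitter node, no cycles) can be checked locally before you submit a graph. The following is a minimal sketch, not the platform's own validator. The `source`, `target`, and `branch` keys come from the description above; the `id` and `type` keys, the lowercase type names, and the `check_topology` helper are assumptions made for illustration. The authoritative rules are on the Create workflow endpoint reference.

```python
from collections import defaultdict

# Node types whose outgoing edges must say which branch they handle.
BRANCHING_TYPES = {"classify", "splitter"}


def check_topology(nodes, edges):
    """Return a list of problems; an empty list means all three checks passed.

    Assumed shapes (hypothetical, for illustration only):
      node: {"id": str, "type": str}
      edge: {"source": str, "target": str, "branch": str (optional)}
    """
    problems = []
    types = {n["id"]: n["type"] for n in nodes}

    # Rule: exactly one parse node per workflow.
    parse_count = sum(1 for t in types.values() if t == "parse")
    if parse_count != 1:
        problems.append(f"expected exactly one parse node, found {parse_count}")

    # Rule: every edge leaving a classify or splitter node carries a branch.
    outgoing = defaultdict(list)
    for e in edges:
        outgoing[e["source"]].append(e["target"])
        if types.get(e["source"]) in BRANCHING_TYPES and not e.get("branch"):
            problems.append(f"edge {e['source']} -> {e['target']} needs a branch")

    # Rule: no cycles. Depth-first search with a visiting/done marker per node.
    state = {}

    def has_cycle(node):
        if state.get(node) == "visiting":
            return True  # reached a node that is still on the current path
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        found = any(has_cycle(nxt) for nxt in outgoing[node])
        state[node] = "done"
        return found

    if any(has_cycle(node_id) for node_id in types):
        problems.append("graph contains a cycle")
    return problems


nodes = [
    {"id": "parse", "type": "parse"},
    {"id": "classify", "type": "classify"},
    {"id": "extract_invoice", "type": "extract"},
]
edges = [
    {"source": "parse", "target": "classify"},
    {"source": "classify", "target": "extract_invoice"},  # branch left out on purpose
]
print(check_topology(nodes, edges))
```

Here the only reported problem is the missing branch on the edge out of the classify node. Rules about which node types may connect to which are not covered by this sketch.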

Three common shapes

You’ll see these three over and over. They’re the same five node types arranged differently.

Parse-only

One parse node, no edges. Produces markdown — useful when you want anyformat’s parsed output to feed your own pipeline (RAG, custom LLM, search index).
[parse]

Linear: parse → extract

A parse node into an extract node. The default for “give me structured fields from this document.”
[parse] → [extract]

Branched: classify → extract-per-type

A classifier picks the document type, then routes to an extract node tailored to that type. Same idea works for splitters.
                       ┌──> [extract_invoice]
[parse] → [classify] ──┤
                       └──> [extract_receipt]
See the recipes for end-to-end examples of each shape.
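
To make the three shapes concrete, here is a sketch that expresses each one as plain data. The top-level `nodes` and `edges` keys, the `id` and `type` fields, and the helper names are assumptions for illustration; only `source`, `target`, and `branch` are taken from the text above. Check the Create workflow endpoint reference for the real schema.

```python
def parse_only():
    """One parse node, no edges."""
    return {"nodes": [{"id": "parse", "type": "parse"}], "edges": []}


def linear():
    """A parse node feeding a single extract node."""
    return {
        "nodes": [
            {"id": "parse", "type": "parse"},
            {"id": "extract", "type": "extract"},
        ],
        "edges": [{"source": "parse", "target": "extract"}],
    }


def branched(categories):
    """Parse, then classify, then one extract node per category."""
    nodes = [
        {"id": "parse", "type": "parse"},
        {"id": "classify", "type": "classify"},
    ]
    edges = [{"source": "parse", "target": "classify"}]
    for category in categories:
        extract_id = f"extract_{category}"
        nodes.append({"id": extract_id, "type": "extract"})
        # Each edge leaving the classifier names the category it handles.
        edges.append({"source": "classify", "target": extract_id, "branch": category})
    return {"nodes": nodes, "edges": edges}


graph = branched(["invoice", "receipt"])
for edge in graph["edges"]:
    print(edge)
```

Calling `branched(["invoice", "receipt"])` reproduces the diagram above: one edge from parse to classify, then one branch-labelled edge per extract node.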

How to think about workflows

Most users follow the same lifecycle:
1. Create: define the graph (which nodes you need and how they connect).
2. Refine: run sample documents through it, inspect the output, and tighten the schema or instructions.
3. Publish: mark it ready for production use.
4. Run at scale: apply it to many documents, whether manually, via API, or via a cloud-storage integration.

In the web platform you usually build workflows visually from the home screen. Programmatically you submit a JSON graph to POST /v2/workflows/.
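
As a rough sketch of the programmatic path, the snippet below builds (but does not send) a `POST /v2/workflows/` request using only the Python standard library. The base URL `https://api.example.com` is a placeholder, the payload field names are assumptions, and authentication is left out because this page does not describe it. Treat the Create workflow endpoint reference as the source of truth for all three.

```python
import json
import urllib.request


def build_create_request(base_url, graph):
    """Build a POST request for the create-workflow endpoint without sending it."""
    body = json.dumps(graph).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v2/workflows/",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical payload for the linear parse -> extract shape.
graph = {
    "nodes": [
        {"id": "parse", "type": "parse"},
        {"id": "extract", "type": "extract"},
    ],
    "edges": [{"source": "parse", "target": "extract"}],
}

# Placeholder host; substitute the real API base URL and add credentials.
request = build_create_request("https://api.example.com", graph)
print(request.get_method(), request.full_url)

# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping the request construction separate from sending makes it easy to inspect the exact JSON body before anything reaches the API.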

What’s next?

Runs & results: What happens when you run a workflow, and how the output of each node type is returned

Field types: The data_type values an extract node’s schema can use

Build your first workflow: Linear parse → extract walkthrough in the UI, curl, and Python

Create workflow (API): The full graph schema, all five node types, and topology rules