You compose the graph from a small set of node types, wire them with edges, and run any document through the whole thing as a single call. Each workflow targets one kind of document and one shape of output. Need a different output? Build a different workflow.
The five node types
A workflow’s graph is built from five node types. Each one does one thing.

| Node | What it produces | Typical use |
|---|---|---|
| Parse | Structured markdown of the document (text + tables + figures, with bounding boxes and per-block confidence) | Always required; exactly one per workflow |
| Classify | A category verdict for the document | “Is this an invoice or a receipt?” |
| Splitter | Document segments (one logical document broken into multiple) | “This PDF contains four invoices stapled together” |
| Extract | Typed fields (data_type) pulled from upstream parsed content | The classic “give me invoice_number, total, date” use case |
| Validate | Pass/fail checks on extracted values | “Total must be ≥ 0 and within 1% of line-item sum” |

Edges (source → target) wire nodes into a pipeline. Branching nodes (classify and splitter) need a branch on each outgoing edge to say which category or split rule the downstream node handles.
The full topology rules — exactly one parse node, no cycles, who can connect to whom — live on the Create workflow endpoint reference.
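
The rules named on this page (exactly one parse node, no cycles, a branch on every edge leaving a classify or splitter node) can be sketched as a local pre-flight check. This is a minimal sketch, not anyformat's own validator: the dict layout (`id`, `type`, `source`, `target`, `branch`) is assumed purely for illustration, and the real schema and the full rule set, including who can connect to whom, are defined on the Create workflow endpoint reference.

```python
from collections import defaultdict

# Node types that route documents and therefore need a branch per outgoing edge.
BRANCHING_TYPES = {"classify", "splitter"}


def check_graph(nodes, edges):
    """Return a list of rule violations; an empty list means none were found.

    The node/edge dict layout here is hypothetical, not the real API schema.
    """
    problems = []
    types = {n["id"]: n["type"] for n in nodes}

    # Rule: exactly one parse node per workflow.
    parse_count = sum(1 for t in types.values() if t == "parse")
    if parse_count != 1:
        problems.append(f"expected exactly one parse node, found {parse_count}")

    outgoing = defaultdict(list)
    for e in edges:
        src, dst = e["source"], e["target"]
        if src not in types or dst not in types:
            problems.append(f"edge {src} -> {dst} references an unknown node")
            continue
        outgoing[src].append(dst)
        # Rule: edges leaving a branching node must carry a branch.
        if types[src] in BRANCHING_TYPES and not e.get("branch"):
            problems.append(f"edge {src} -> {dst} needs a branch")

    # Rule: no cycles. Depth-first search with three colours.
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node_id: WHITE for node_id in types}

    def has_cycle(node_id):
        colour[node_id] = GREY
        for nxt in outgoing[node_id]:
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and has_cycle(nxt):
                return True
        colour[node_id] = BLACK
        return False

    if any(colour[n] == WHITE and has_cycle(n) for n in types):
        problems.append("graph contains a cycle")
    return problems


nodes = [
    {"id": "parse", "type": "parse"},
    {"id": "classify", "type": "classify"},
    {"id": "invoice_fields", "type": "extract"},
]
edges = [
    {"source": "parse", "target": "classify"},
    {"source": "classify", "target": "invoice_fields"},  # branch left out on purpose
]
print(check_graph(nodes, edges))
```

Running a check like this before submitting a graph only catches the rules restated here; the server remains the authority on what is accepted.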
Three common shapes
You’ll see these three over and over. They’re the same five node types arranged differently.

Parse-only
One parse node, no edges. Produces markdown, which is useful when you want anyformat’s parsed output to feed your own pipeline (RAG, custom LLM, search index).

Linear: parse → extract
A parse node into an extract node. The default for “give me structured fields from this document.”

Branched: classify → extract-per-type
A classifier picks the document type, then routes to an extract node tailored to that type. The same idea works for splitters.

How to think about workflows
Most users follow the same lifecycle:
In the web platform, you usually build workflows visually from the home screen. Programmatically, you submit a JSON graph to POST /v2/workflows/.
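
As a rough illustration of what such a JSON graph might look like, the sketch below builds the three common shapes as Python dicts and serialises the branched one. The key names (`nodes`, `edges`, `id`, `type`, `source`, `target`, `branch`), the node ids, and the category labels are placeholders assumed for this example. The actual request body is defined on the Create workflow endpoint reference, and sending the request (host, authentication) is deliberately left out.

```python
import json


def node(node_id, node_type):
    # Hypothetical node shape; real nodes will carry more configuration.
    return {"id": node_id, "type": node_type}


def edge(source, target, branch=None):
    # A branch is only attached when the source is a classify or splitter node.
    e = {"source": source, "target": target}
    if branch is not None:
        e["branch"] = branch
    return e


# Parse-only: one parse node, no edges.
parse_only = {"nodes": [node("parse", "parse")], "edges": []}

# Linear: parse -> extract.
linear = {
    "nodes": [node("parse", "parse"), node("fields", "extract")],
    "edges": [edge("parse", "fields")],
}

# Branched: parse -> classify -> one extract node per document type.
branched = {
    "nodes": [
        node("parse", "parse"),
        node("doc_type", "classify"),
        node("invoice_fields", "extract"),
        node("receipt_fields", "extract"),
    ],
    "edges": [
        edge("parse", "doc_type"),
        edge("doc_type", "invoice_fields", branch="invoice"),
        edge("doc_type", "receipt_fields", branch="receipt"),
    ],
}

# This string would be the body of a request to POST /v2/workflows/.
body = json.dumps(branched, indent=2)
print(body)
```

All three payloads use the same node and edge helpers; only the arrangement changes, which mirrors how the three shapes reuse the same five node types.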
What’s next?
Runs & results
What happens when you run a workflow, and how the output of each node type is returned
Field types
The data_type values an extract node’s schema can use
Build your first workflow
Linear parse → extract walkthrough — UI, curl, and Python
Create workflow (API)
The full graph schema, all five node types, and topology rules
