The walkthrough below has tabs at each step — pick UI to drive everything from app.anyformat.ai, or curl / TypeScript / Python to drive the API directly. The four paths produce the same result.
Even if you plan to integrate via the API, we recommend building your first workflow in the UI. It’s faster to iterate on field definitions visually, and once it works you can copy the workflow ID and call it from code.
Python package and class names are provisional. pip install anyformat-sdk and from anyformat.sdk import Client work today, but both are expected to change before the official launch, so pin the version you ship with.
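Because the package name and import path may change, one way to act on the pinning advice is to check the installed version at startup and fail fast on a mismatch. The sketch below uses only the standard library. PINNED_VERSION is a placeholder (this page does not state a version number), and the helper names are illustrative, not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

PACKAGE = "anyformat-sdk"   # distribution name from the install command above
PINNED_VERSION = "0.0.0"    # placeholder: replace with the version you tested against


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(package: str, pinned: str) -> bool:
    """True only when `package` is installed at exactly the pinned version."""
    return installed_version(package) == pinned


def require_pin(package: str, pinned: str) -> None:
    """Raise RuntimeError when the installed version differs from the pin."""
    if not matches_pin(package, pinned):
        found = installed_version(package)
        raise RuntimeError(f"{package} is at {found!r}, expected {pinned!r}")
```

Calling require_pin(PACKAGE, PINNED_VERSION) early in your program turns a silent upgrade into an immediate, readable error.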
A workflow defines what data to extract. We’ll build a simple invoice processor with three fields: invoice_number, total_amount, issue_date.
UI
curl
TypeScript
Python
From the home screen, type a description of what you want to extract (e.g. “Invoice processing: extract invoice number, total, and issue date”).
Drag in a sample invoice PDF (optional but recommended — anyformat will suggest fields from the document).
Click Create.
anyformat opens the workflow workspace with the document on the left and the fields panel on the right. Review the suggested fields and adjust as needed.
Once your fields look right, copy the workflow ID from the URL or workflow settings; you’ll need it if you want to run this workflow via the API later.
A workflow is a typed graph of nodes. The minimal extraction shape is a parse node feeding an extract node — two nodes, one edge.
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Invoice Processing",
  "description": "Extract key data from invoices",
  "created_at": "2024-01-01T00:00:00.000Z",
  "updated_at": "2024-01-01T00:00:00.000Z"
}
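To make the "two nodes, one edge" shape concrete, here is a minimal sketch of such a graph as plain Python data. It is an illustration only: the Node, Edge, and WorkflowGraph names, the node ids, and the validation rule are assumptions made for this example, not the API's request schema. See Create workflow for the real topology rules.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Node:
    id: str
    type: str  # e.g. "parse" or "extract"


@dataclass(frozen=True)
class Edge:
    source: str  # id of the upstream node
    target: str  # id of the downstream node


@dataclass
class WorkflowGraph:
    nodes: List[Node] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def validate(self) -> None:
        """Check that node ids are unique and every edge joins known nodes."""
        ids = {n.id for n in self.nodes}
        if len(ids) != len(self.nodes):
            raise ValueError("duplicate node id")
        for e in self.edges:
            if e.source not in ids or e.target not in ids:
                raise ValueError(f"edge {e.source} -> {e.target} references an unknown node")


# The minimal extraction shape: a parse node feeding an extract node.
minimal = WorkflowGraph(
    nodes=[Node("parse_1", "parse"), Node("extract_1", "extract")],
    edges=[Edge("parse_1", "extract_1")],
)
minimal.validate()
```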
The TypeScript SDK exposes a fluent builder over the same typed graph.
import { Anyformat, Schema } from "@anyformat/sdk";

const af = new Anyformat({ apiKey: process.env.ANYFORMAT_API_KEY! });

// .create() persists the workflow and returns its id.
const workflowId = await af
  .workflow("Invoice Processing", "Extract key data from invoices")
  .parse()
  .extract([
    Schema.string("invoice_number", "The unique invoice identifier"),
    Schema.float("total_amount", "Total invoice amount"),
    Schema.date("issue_date", "Date when the invoice was issued"),
  ])
  .create();

console.log(`Created workflow: ${workflowId}`);
The Python SDK exposes a fluent builder over the same typed graph.
import os

from anyformat.sdk import Client
from anyformat.workflow import Schema

client = Client(api_key=os.environ["ANYFORMAT_API_KEY"])

workflow = (
    client.workflow("Invoice Processing")
    .parse()
    .extract([
        Schema.string("invoice_number", "The unique invoice identifier"),
        Schema.float("total_amount", "Total invoice amount"),
        Schema.date("issue_date", "Date when the invoice was issued"),
    ])
    .create()  # persists and returns a Workflow handle
)

print(f"Created workflow: {workflow.id}")
The same shape supports parse-only workflows (drop the extract node), classify-then-extract for mixed document types, and split workflows for multi-document files. See Create workflow for the full topology rules.
From the workflow workspace, drag in (or upload via the Add document button) the invoice you want to process. Processing starts automatically and usually completes in 10–60 seconds depending on the document.
Replace WORKFLOW_ID with the ID returned in step 1.
status: "pending" means the run was accepted, not that extraction has finished — poll the results endpoint. version_id records the workflow version your run was bound to, which lets you verify which schema produced the results (useful right after an edit).
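As a sketch of that check, the helper below inspects a submit response that has already been decoded into a dict. Only the status and version_id field names come from this page. The function name, the sample values, and the assumption that both fields sit at the top level of the JSON object are illustrative.

```python
from typing import Optional


def check_submission(response: dict, expected_version_id: Optional[str] = None) -> str:
    """Return the version_id the run was bound to, raising if the response looks wrong."""
    status = response.get("status")
    if status != "pending":
        # This sketch treats anything other than "pending" as unexpected at submit time.
        raise ValueError(f"unexpected status at submit time: {status!r}")
    version_id = response.get("version_id")
    if version_id is None:
        raise ValueError("response has no version_id")
    if expected_version_id is not None and version_id != expected_version_id:
        raise ValueError(f"run bound to {version_id!r}, expected {expected_version_id!r}")
    return version_id


sample = {"status": "pending", "version_id": "v-example"}  # made-up values
bound = check_submission(sample, expected_version_id="v-example")
```

Comparing against an expected version is most useful right after editing a workflow, when you want to confirm the run used the new schema rather than the previous one.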
The TS SDK collapses “submit + poll” into a single .run(file).wait() chain. Build the same workflow shape (or look it up by id with the low-level client) and call .run(file) on it:
// A File with .name set; `bytes` stands for the PDF contents you have already loaded.
const file = new File([bytes], "invoice.pdf");

const result = await af
  .workflow("Invoice Processing", "Extract key data from invoices")
  .parse()
  .extract([
    Schema.string("invoice_number", "The unique invoice identifier"),
    Schema.float("total_amount", "Total invoice amount"),
    Schema.date("issue_date", "Date when the invoice was issued"),
  ])
  .run(file)
  .wait(); // continues into step 3
run = workflow.run("invoice.pdf") # Path | str | bytes
Results appear in the workflow workspace as soon as processing finishes. Each extracted value is linked to its location in the document; click a field to highlight where it came from.
Export the results as CSV, Excel, or JSON from the workflow view. See Outputs for the differences.
Poll the results endpoint. It returns 412 while processing and 200 when the run is done.
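The polling loop can be sketched as follows. The HTTP call is injected as a fetch callable returning (status_code, body), so the sketch makes no assumption about the endpoint URL or auth headers, which this page does not show. The function name, the interval and timeout defaults, and the handling of other status codes are illustrative choices.

```python
import time
from typing import Any, Callable, Tuple


def poll_until_done(
    fetch: Callable[[], Tuple[int, Any]],
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call `fetch` until it reports 200, treating 412 as "still processing"."""
    deadline = time.monotonic() + timeout
    while True:
        status, body = fetch()
        if status == 200:
            return body  # run finished; body holds the results
        if status != 412:
            raise RuntimeError(f"unexpected status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError("run did not finish before the timeout")
        sleep(interval)
```

In real use, fetch would wrap a GET against the results endpoint with your API key. The sleep parameter exists so the loop can be exercised in tests without waiting.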
For production integrations, use webhooks instead of polling — webhooks deliver results immediately and don’t consume rate limit.
In TypeScript, wait() polls until the run completes (412 → still going, 200 → done) and returns a typed Result.
// `result` was awaited in step 2
console.log(result.field("invoice_number")?.value);
console.log(result.field("total_amount")?.value);
console.log(result.field("issue_date")?.value);
In Python, wait() polls until the run completes (412 → still going, 200 → done) and returns a typed Result.
result = run.wait()

print(result.fields["invoice_number"].value)
print(result.fields["total_amount"].value)
print(result.fields["issue_date"].value)
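The TypeScript snippet guards each read with ?., which suggests a field can be absent from a result. A comparable guard in Python might look like the sketch below. It relies only on what the snippet above shows: result.fields supports [] lookup and each entry has a .value attribute. That a missing field raises KeyError is an assumption, and field_value is a hypothetical helper, not part of the SDK.

```python
from types import SimpleNamespace
from typing import Any


def field_value(fields: Any, name: str, default: Any = None) -> Any:
    """Return fields[name].value, or `default` when the field is missing."""
    try:
        return fields[name].value
    except KeyError:
        return default


# Stand-in for result.fields so the sketch runs without the SDK; values are made up.
fields = {
    "invoice_number": SimpleNamespace(value="INV-001"),
    "total_amount": SimpleNamespace(value=125.5),
}

print(field_value(fields, "invoice_number"))
print(field_value(fields, "issue_date", default="(missing)"))
```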
For production integrations, use webhooks instead of polling.
A completed result looks like this (one section per node type that ran — parse and extractions here; classifications and splits are empty because this workflow has neither):
The three steps above are narrative slices of the same script. Here they are end-to-end as a single pasteable block.
TypeScript
Python
import { Anyformat, Schema } from "@anyformat/sdk";

const af = new Anyformat({ apiKey: process.env.ANYFORMAT_API_KEY! });

// A File with .name set; `bytes` stands for the PDF contents you have already loaded.
const file = new File([bytes], "invoice.pdf");

const result = await af
  .workflow("Invoice Processing", "Extract key data from invoices")
  .parse()
  .extract([
    Schema.string("invoice_number", "The unique invoice identifier"),
    Schema.float("total_amount", "Total invoice amount"),
    Schema.date("issue_date", "Date when the invoice was issued"),
  ])
  .run(file)
  .wait();

console.log(result.field("invoice_number")?.value);
console.log(result.field("total_amount")?.value);
console.log(result.field("issue_date")?.value);
import os

from anyformat.sdk import Client
from anyformat.workflow import Schema

client = Client(api_key=os.environ["ANYFORMAT_API_KEY"])

workflow = (
    client.workflow("Invoice Processing")
    .parse()
    .extract([
        Schema.string("invoice_number", "The unique invoice identifier"),
        Schema.float("total_amount", "Total invoice amount"),
        Schema.date("issue_date", "Date when the invoice was issued"),
    ])
    .create()
)

result = workflow.run("invoice.pdf").wait()

print(result.fields["invoice_number"].value)
print(result.fields["total_amount"].value)
print(result.fields["issue_date"].value)