Quickstart - anyformat

The walkthrough below has tabs at each step — pick UI to drive everything from app.anyformat.ai, or curl / TypeScript / Python to drive the API directly. The four paths produce the same result.

Even if you plan to integrate via the API, we recommend building your first workflow in the UI. It’s faster to iterate on field definitions visually, and once it works you can copy the workflow ID and call it from code.

The Python package and class names are provisional. pip install anyformat-sdk and from anyformat.sdk import Client work today, but both are expected to change before the official launch, so pin the version you ship with.

Before you start

UI
curl
TypeScript
Python

To use the UI, create an account at app.anyformat.ai. That’s it.

For curl, get an API key from app.anyformat.ai/api-key and export it for the snippets below:

export ANYFORMAT_API_KEY="your_api_key_here"

Install the TypeScript SDK (Node 18+):

npm install @anyformat/sdk

Get an API key from app.anyformat.ai/api-key and pass it to the constructor — or set ANYFORMAT_API_KEY and read it from the environment:

export ANYFORMAT_API_KEY="your_api_key_here"

Install the Python SDK (Python 3.13; see the SDK page for the pin):

pip install anyformat-sdk

Get an API key from app.anyformat.ai/api-key and pass it to the Client — or set ANYFORMAT_API_KEY and pick it up from the environment:

export ANYFORMAT_API_KEY="your_api_key_here"
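A fail-fast check on the key can save a confusing authentication error later. The sketch below reads ANYFORMAT_API_KEY from the environment and raises a clear error when it is missing or blank. The helper name load_api_key is illustrative and not part of the SDK.

```python
import os


def load_api_key(env=None):
    """Return the API key from ANYFORMAT_API_KEY, or raise if it is unset or blank.

    `load_api_key` is a hypothetical helper, not an SDK function.
    `env` defaults to os.environ; pass a dict to test without touching the process.
    """
    env = os.environ if env is None else env
    key = env.get("ANYFORMAT_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANYFORMAT_API_KEY is not set; export it before running the snippets"
        )
    return key
```

The returned value can be passed to Client(api_key=...) or used in the Authorization header of a raw HTTP request.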

1. Create a workflow

A workflow defines what data to extract. We’ll build a simple invoice processor with three fields: invoice_number, total_amount, issue_date.

UI
curl
TypeScript
Python

From the home screen, type a description of what you want to extract (e.g. “Invoice processing: extract invoice number, total, and issue date”).
Drag in a sample invoice PDF (optional but recommended — anyformat will suggest fields from the document).
Click Create.

anyformat opens the workflow workspace with the document on the left and the fields panel on the right. Review the suggested fields and adjust as needed.

Once your fields look right, copy the workflow ID from the URL or workflow settings — you’ll need it if you want to run this workflow via the API later.

A workflow is a typed graph of nodes. The minimal extraction shape is a parse node feeding an extract node — two nodes, one edge.

curl -X POST 'https://api.anyformat.ai/v2/workflows/' \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $ANYFORMAT_API_KEY" \
  -d '{
    "name": "Invoice Processing",
    "description": "Extract key data from invoices",
    "nodes": [
      {"id": "parse_1", "type": "parse"},
      {
        "id": "extract_1",
        "type": "extract",
        "extraction_schema": {
          "fields": [
            {"name": "invoice_number", "description": "The unique invoice identifier",       "data_type": "string"},
            {"name": "total_amount",   "description": "Total invoice amount",                "data_type": "float"},
            {"name": "issue_date",     "description": "Date when the invoice was issued",    "data_type": "date"}
          ]
        }
      }
    ],
    "edges": [{"source": "parse_1", "target": "extract_1"}]
  }'

The response contains your workflow ID:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "name": "Invoice Processing",
  "description": "Extract key data from invoices",
  "created_at": "2024-01-01T00:00:00.000Z",
  "updated_at": "2024-01-01T00:00:00.000Z"
}
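The same request body can also be assembled programmatically. The sketch below uses only the Python standard library to build the parse → extract payload from the curl example and to check that every edge points at a declared node before anything is sent. The helper names build_invoice_workflow and check_edges are illustrative, and the local check is an assumption about what is worth validating client-side, not a statement of the API's topology rules.

```python
import json


def build_invoice_workflow():
    """Return the request body from the curl example as a Python dict."""
    fields = [
        ("invoice_number", "The unique invoice identifier", "string"),
        ("total_amount", "Total invoice amount", "float"),
        ("issue_date", "Date when the invoice was issued", "date"),
    ]
    return {
        "name": "Invoice Processing",
        "description": "Extract key data from invoices",
        "nodes": [
            {"id": "parse_1", "type": "parse"},
            {
                "id": "extract_1",
                "type": "extract",
                "extraction_schema": {
                    "fields": [
                        {"name": name, "description": desc, "data_type": dtype}
                        for name, desc, dtype in fields
                    ]
                },
            },
        ],
        "edges": [{"source": "parse_1", "target": "extract_1"}],
    }


def check_edges(payload):
    """Return every edge endpoint that does not match a declared node id."""
    node_ids = {node["id"] for node in payload["nodes"]}
    return [
        endpoint
        for edge in payload["edges"]
        for endpoint in (edge["source"], edge["target"])
        if endpoint not in node_ids
    ]


payload = build_invoice_workflow()
dangling = check_edges(payload)  # empty list when all endpoints are declared
body = json.dumps(payload)       # string to send as the POST body
```

Building the field list from tuples keeps the three definitions on one line each, which makes it easier to see at a glance what the schema extracts.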

The TypeScript SDK exposes a fluent builder over the same typed graph.

import { Anyformat, Schema } from "@anyformat/sdk";

const af = new Anyformat({ apiKey: process.env.ANYFORMAT_API_KEY! });

// .create() persists the workflow and returns its id.
const workflowId = await af
  .workflow("Invoice Processing", "Extract key data from invoices")
  .parse()
  .extract([
    Schema.string("invoice_number", "The unique invoice identifier"),
    Schema.float("total_amount",    "Total invoice amount"),
    Schema.date("issue_date",       "Date when the invoice was issued"),
  ])
  .create();

console.log(`Created workflow: ${workflowId}`);

The Python SDK exposes a fluent builder over the same typed graph.

import os
from anyformat.sdk import Client
from anyformat.workflow import Schema

client = Client(api_key=os.environ["ANYFORMAT_API_KEY"])

workflow = (
    client.workflow("Invoice Processing")
    .parse()
    .extract([
        Schema.string("invoice_number", "The unique invoice identifier"),
        Schema.float("total_amount",    "Total invoice amount"),
        Schema.date("issue_date",       "Date when the invoice was issued"),
    ])
    .create()  # persists and returns a Workflow handle
)

print(f"Created workflow: {workflow.id}")

The same shape supports parse-only workflows (drop the extract node), classify-then-extract for mixed document types, and split workflows for multi-document files. See Create workflow for the full topology rules.

2. Run the workflow on a document

UI
curl
TypeScript
Python

From the workflow workspace, drag in the invoice you want to process, or upload it with the Add document button. Processing starts automatically and usually completes in 10–60 seconds depending on the document.

Replace WORKFLOW_ID with the ID returned in step 1.

curl -X POST 'https://api.anyformat.ai/v2/workflows/WORKFLOW_ID/run/' \
  -H "Authorization: Bearer $ANYFORMAT_API_KEY" \
  -F 'file=@invoice.pdf'

The response returns a file ID — keep it; you’ll use it to poll for results:

{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "pending",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000",
  "version_id": "FGaV4I2JAA"
}

status: "pending" means the run was accepted, not that extraction has finished — poll the results endpoint. version_id records the workflow version your run was bound to, which lets you verify which schema produced the results (useful right after an edit).
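A caller usually needs two things from this response: the file ID to poll with, and confirmation that the run was bound to the expected workflow version. The sketch below pulls both out of the sample response. The helper name summarize_run is hypothetical, and treating only "pending" as "needs polling" is an assumption, since that is the only status shown here.

```python
import json

# The sample run response from above, verbatim.
RUN_RESPONSE = """
{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "pending",
  "workflow_id": "550e8400-e29b-41d4-a716-446655440000",
  "version_id": "FGaV4I2JAA"
}
"""


def summarize_run(raw, expected_version=None):
    """Extract the file ID and check which workflow version the run used.

    `expected_version` is optional; when omitted, the version check passes.
    """
    run = json.loads(raw)
    return {
        "file_id": run["id"],
        "needs_polling": run["status"] == "pending",
        "version_matches": expected_version is None
        or run["version_id"] == expected_version,
    }


summary = summarize_run(RUN_RESPONSE, expected_version="FGaV4I2JAA")
```

A False value for version_matches right after an edit suggests the run was bound to an older schema than the one you intended to test.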

The TS SDK collapses “submit + poll” into a single .run(file).wait() chain. Build the same workflow shape (or look it up by id with the low-level client) and call .run(file) on it:

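import { readFile } from "node:fs/promises";

// Any File with .name set works. Reading from disk is one option; the global
// File constructor assumes Node 20+ (on Node 18, import File from "node:buffer").
const file = new File([await readFile("invoice.pdf")], "invoice.pdf");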

const result = await af
  .workflow("Invoice Processing", "Extract key data from invoices")
  .parse()
  .extract([
    Schema.string("invoice_number", "The unique invoice identifier"),
    Schema.float("total_amount",    "Total invoice amount"),
    Schema.date("issue_date",       "Date when the invoice was issued"),
  ])
  .run(file)
  .wait();  // continues into step 3

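In Python, call run() on the Workflow handle created in step 1. The call submits the document; waiting for the result happens in step 3.

run = workflow.run("invoice.pdf")  # Path | str | bytes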

3. Get the extracted data

UI
curl
TypeScript
Python

Results appear in the workflow workspace as soon as processing finishes. Each extracted value is linked to its location in the document — click a field to highlight where it came from.

Export the results as CSV, Excel, or JSON from the workflow view. See Outputs for the differences.

Poll the results endpoint. It returns 412 while processing and 200 when the run is done.

curl -H "Authorization: Bearer $ANYFORMAT_API_KEY" \
  "https://api.anyformat.ai/v2/workflows/WORKFLOW_ID/files/FILE_ID/results/"

For production integrations, use webhooks instead of polling — webhooks deliver results immediately and don’t consume rate limit.
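For scripts where polling is acceptable, the 412/200 contract maps onto a small loop. The sketch below is one way to write it with the Python standard library. The function names poll_results and make_fetch, the 2-second interval, and the 120-second timeout are all assumptions chosen for illustration, not values taken from the API documentation.

```python
import json
import time
import urllib.error
import urllib.request


def make_fetch(workflow_id, file_id, api_key):
    """Return a zero-argument callable that GETs the results endpoint once."""
    url = (
        "https://api.anyformat.ai/v2/workflows/"
        f"{workflow_id}/files/{file_id}/results/"
    )

    def fetch():
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {api_key}"}
        )
        try:
            with urllib.request.urlopen(request) as response:
                return response.status, json.loads(response.read())
        except urllib.error.HTTPError as err:
            # urllib raises on 4xx/5xx, so 412 arrives here.
            return err.code, None

    return fetch


def poll_results(fetch, interval=2.0, timeout=120.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until it returns 200, sleeping between 412 responses.

    Raises TimeoutError if the deadline passes while still receiving 412,
    and RuntimeError on any status other than 200 or 412.
    """
    deadline = clock() + timeout
    while True:
        status, body = fetch()
        if status == 200:
            return body
        if status != 412:
            raise RuntimeError(f"unexpected status {status}")
        if clock() >= deadline:
            raise TimeoutError("results were not ready before the timeout")
        sleep(interval)
```

Typical usage would be poll_results(make_fetch(workflow_id, file_id, api_key)). Passing fetch, sleep, and clock as parameters keeps the loop testable without network access or real delays.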

In the TypeScript SDK, wait() polls until the run completes (412 → still going, 200 → done) and returns a typed Result.

// `result` was awaited in step 2
console.log(result.field("invoice_number")?.value);
console.log(result.field("total_amount")?.value);
console.log(result.field("issue_date")?.value);

In the Python SDK, wait() polls until the run completes (412 → still going, 200 → done) and returns a typed Result.

result = run.wait()
print(result.fields["invoice_number"].value)
print(result.fields["total_amount"].value)
print(result.fields["issue_date"].value)

For production integrations, use webhooks instead of polling.

A completed result looks like this (one section per node type that ran — parse and extractions here; classifications and splits are empty because this workflow has neither):

{
  "collection_id": "069dcc2c-e14c-7606-8000-2ee4fb17b4e1",
  "verification_url": "https://app.anyformat.ai/workflows/.../files/...",
  "parse": { "markdown": "<DOCUMENT id=\"1\" page=\"1\">...", "text": "...", "parse_confidence": 94.2, "layout_confidence": 87.4, "blocks": [] },
  "classifications": [],
  "splits": [],
  "extractions": [
    {
      "split_name": null,
      "partition": null,
      "fields": {
        "invoice_number": {
          "value": "INV-2024-0847",
          "confidence": 97.0,
          "evidence": [{"text": "Invoice #INV-2024-0847", "page_number": 1}],
          "verification_status": "not_verified"
        },
        "total_amount": {
          "value": 4087.50,
          "confidence": 96.0,
          "evidence": [{"text": "Total: $4,087.50", "page_number": 2}],
          "verification_status": "not_verified"
        },
        "issue_date": {
          "value": "2024-03-15",
          "confidence": 93.0,
          "evidence": [{"text": "Date: March 15, 2024", "page_number": 1}],
          "verification_status": "not_verified"
        }
      }
    }
  ]
}

See Runs & results for the model behind these sections, and Response formats for every field in the envelope.
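When consuming the raw JSON rather than the SDK's Result, it helps to flatten the extractions section into rows and flag values that may deserve a manual look. The sketch below works on a trimmed copy of the sample above. The helper names and the 95.0 review threshold are assumptions for illustration, not recommended settings.

```python
# Trimmed copy of the sample result: only the keys this sketch reads.
RESULT = {
    "extractions": [
        {
            "split_name": None,
            "partition": None,
            "fields": {
                "invoice_number": {"value": "INV-2024-0847", "confidence": 97.0},
                "total_amount": {"value": 4087.50, "confidence": 96.0},
                "issue_date": {"value": "2024-03-15", "confidence": 93.0},
            },
        }
    ]
}


def extracted_fields(result):
    """Flatten every extraction into (split_name, field, value, confidence) rows."""
    rows = []
    for extraction in result.get("extractions", []):
        for name, field in extraction["fields"].items():
            rows.append(
                (extraction.get("split_name"), name, field["value"], field["confidence"])
            )
    return rows


def needs_review(result, threshold=95.0):
    """Return the names of fields whose confidence falls below the threshold."""
    return [
        name
        for _, name, _, confidence in extracted_fields(result)
        if confidence < threshold
    ]


rows = extracted_fields(RESULT)
flagged = needs_review(RESULT)  # ["issue_date"] for the sample (93.0 < 95.0)
```

Iterating over every entry in extractions, rather than indexing the first one, keeps the helper usable for workflows that produce more than one extraction per file.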

Complete script

The three steps above are narrative slices of the same script. Here they are end-to-end as a single pasteable block.

TypeScript
Python

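import { readFile } from "node:fs/promises";
import { Anyformat, Schema } from "@anyformat/sdk";

const af = new Anyformat({ apiKey: process.env.ANYFORMAT_API_KEY! });

// Any File with .name set works. Reading from disk is one option; the global
// File constructor assumes Node 20+ (on Node 18, import File from "node:buffer").
const file = new File([await readFile("invoice.pdf")], "invoice.pdf");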

const result = await af
  .workflow("Invoice Processing", "Extract key data from invoices")
  .parse()
  .extract([
    Schema.string("invoice_number", "The unique invoice identifier"),
    Schema.float("total_amount",    "Total invoice amount"),
    Schema.date("issue_date",       "Date when the invoice was issued"),
  ])
  .run(file)
  .wait();

console.log(result.field("invoice_number")?.value);
console.log(result.field("total_amount")?.value);
console.log(result.field("issue_date")?.value);

import os
from anyformat.sdk import Client
from anyformat.workflow import Schema

client = Client(api_key=os.environ["ANYFORMAT_API_KEY"])

workflow = (
    client.workflow("Invoice Processing")
    .parse()
    .extract([
        Schema.string("invoice_number", "The unique invoice identifier"),
        Schema.float("total_amount",    "Total invoice amount"),
        Schema.date("issue_date",       "Date when the invoice was issued"),
    ])
    .create()
)

result = workflow.run("invoice.pdf").wait()

print(result.fields["invoice_number"].value)
print(result.fields["total_amount"].value)
print(result.fields["issue_date"].value)

Where to go next

Build workflows

Deeper walkthrough of the Define → Refine → Publish lifecycle in the UI

Recipes

End-to-end examples — invoices, resumes, contracts, receipts, and more

Coding assistant

Let Claude Code build and run anyformat workflows from your editor

API reference

Every endpoint, every response, every error code
