Agentic Parse to Markdown

A complete cookbook for agentic parsing, the most powerful parse mode in anyformat. Instead of sending the whole document through a single batched call, agentic parsing routes each block to a strategy chosen for its type (text, table, or figure), which gives you better tables and richer figure descriptions. This recipe walks the full loop: create the workflow → upload a document → poll → retrieve markdown + confidence.

What is Agentic Parsing?

Standard parsing sends every block to one LLM call with one prompt. Agentic parsing first routes each block (using a fast LLM with the page image) to the right specialised strategy:

Block type	Strategy	Why
Plain text	`text-bytes-first`	Cheap, fast — bypasses LLM entirely when source bytes are extractable
Tables	`dense-table-agent`	Tool-calling agent that handles spans, merged cells, multi-page tables
Figures / charts	`single-call vision`	LLM with image input for structured figure descriptions
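The table above can be pictured as a lookup from block type to strategy. The sketch below only illustrates that mapping: the strategy strings are copied from the table, while the block-type keys and the `route_block` helper are hypothetical names. The real router uses a fast LLM with the page image, not a dictionary.

```python
# Illustrative only: strategy names come from the table above; the
# block-type keys and route_block() are hypothetical, not anyformat API.
STRATEGY_BY_BLOCK_TYPE = {
    "text": "text-bytes-first",
    "table": "dense-table-agent",
    "figure": "single-call vision",
}


def route_block(block_type: str) -> str:
    """Return the strategy name for a block type, or raise on unknown types."""
    try:
        return STRATEGY_BY_BLOCK_TYPE[block_type]
    except KeyError:
        raise ValueError(f"no strategy for block type {block_type!r}") from None


print(route_block("table"))
```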

You pick the effort preset, which controls the model mix:

`effort`	Best for
`"low"`	Simple documents, fast/cheap turnaround
`"mid"`	Balanced default — recommended for most documents
`"accurate"`	Highest quality on complex layouts; slowest
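Since `effort` is the one knob you choose, it can help to validate it before sending the request. The helper below is a minimal sketch: `build_agentic_parse_workflow` is a hypothetical name, and the field names simply mirror the request body shown in Step 1.

```python
import json

# Effort values listed in the table above.
VALID_EFFORTS = ("low", "mid", "accurate")


def build_agentic_parse_workflow(name: str, effort: str = "mid") -> dict:
    """Build the JSON body for a parse-only agentic workflow.

    Hypothetical helper; the fields mirror the Step 1 request body.
    """
    if effort not in VALID_EFFORTS:
        raise ValueError(f"effort must be one of {VALID_EFFORTS}, got {effort!r}")
    return {
        "name": name,
        "nodes": [
            {"id": "parse_1", "type": "parse", "mode": "agentic", "effort": effort}
        ],
        "edges": [],
    }


payload = build_agentic_parse_workflow("Agentic parse-only", effort="accurate")
print(json.dumps(payload, indent=2))
```

Rejecting an unknown preset locally avoids a round trip that would presumably fail validation on the server anyway.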

Step 1 — Create the Workflow

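Send POST /v2/workflows/ with a single parse node in agentic mode and an empty edges list. The response includes the workflow id (the complete recipe below reads it from the id field), which Step 2 needs.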

curl -X POST 'https://api.anyformat.ai/v2/workflows/' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
    "name": "Agentic parse-only",
    "nodes": [
      {
        "id": "parse_1",
        "type": "parse",
        "mode": "agentic",
        "effort": "mid"
      }
    ],
    "edges": []
  }'

Step 2 — Upload a Document

curl -X POST "https://api.anyformat.ai/v2/workflows/${workflow_id}/run/" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@document.pdf"

Step 3 — Poll Until Processed

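import time

import requests

# BASE, API_KEY, workflow_id and collection_id are defined as in the complete recipe below.
for attempt in range(60):
    response = requests.get(
        f"{BASE}/v2/workflows/{workflow_id}/files/{collection_id}/results/",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    if response.status_code == 200:
        break  # results are ready
    elif response.status_code == 412:
        # Not processed yet: back off 5s, 7.5s, 11.25s, ... capped at 30s.
        time.sleep(min(5 * 1.5 ** min(attempt, 5), 30))
    else:
        raise RuntimeError(f"Error {response.status_code}: {response.text}")
else:
    raise TimeoutError("Processing timed out")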
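To see what the backoff in the loop above adds up to, the sketch below pulls the delay formula into a function and sums it over all 60 attempts. `poll_delay` is a hypothetical helper name; the formula itself is the one used in the loop.

```python
def poll_delay(attempt: int, base: float = 5.0, factor: float = 1.5, cap: float = 30.0) -> float:
    """Delay in seconds before the next poll; same formula as the loop above."""
    return min(base * factor ** min(attempt, 5), cap)


delays = [poll_delay(a) for a in range(60)]
print([round(d, 2) for d in delays[:7]])
print(f"worst-case total sleep: {sum(delays) / 60:.1f} minutes")
```

The delays grow from 5s to the 30s cap by the sixth attempt, so 60 attempts give up after roughly 29 minutes of sleeping, not counting request time.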

Agentic mode takes longer than standard mode (30-90s for a typical 3-page document) because each block is routed individually and then processed by its own strategy. For production workloads, use webhooks (extraction.completed event) instead of polling.

Step 4 — Read Results

The response carries the parsed markdown plus a document-level confidence score:

data = response.json()
markdown = data["parse"]["markdown"]
confidence = data["parse"]["confidence"]  # 0-100, char-weighted aggregate

print(f"Document confidence: {confidence:.1f}/100")
print(markdown[:500])

Sample response:

{
  "collection_id": "069dcc2c-e14c-7606-8000-2ee4fb17b4e1",
  "verification_url": "https://app.anyformat.ai/workflows/.../files/...",
  "parse": {
    "markdown": "<DOCUMENT id=\"1\" page=\"1\">...",
    "confidence": 87.5
  },
  "classifications": [],
  "splits": [],
  "extractions": []
}
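Before using the payload downstream, it can be worth checking that it has the shape shown in the sample. This is a minimal sketch, assuming `parse.markdown` and `parse.confidence` are always present on a 200 response; `read_parse_result` is a hypothetical helper, not part of any anyformat SDK.

```python
def read_parse_result(data: dict) -> tuple[str, float]:
    """Return (markdown, confidence) from a payload shaped like the sample above."""
    parse = data.get("parse")
    if not isinstance(parse, dict):
        raise ValueError("response has no 'parse' object")
    markdown = parse.get("markdown")
    confidence = parse.get("confidence")
    if not isinstance(markdown, str):
        raise ValueError("'parse.markdown' is missing or not a string")
    if not isinstance(confidence, (int, float)) or not 0 <= confidence <= 100:
        raise ValueError("'parse.confidence' is missing or outside 0-100")
    return markdown, float(confidence)


# Trimmed copy of the sample response above.
sample = {
    "collection_id": "069dcc2c-e14c-7606-8000-2ee4fb17b4e1",
    "parse": {"markdown": '<DOCUMENT id="1" page="1">...', "confidence": 87.5},
    "classifications": [],
    "splits": [],
    "extractions": [],
}
markdown, confidence = read_parse_result(sample)
print(f"{confidence:.1f}/100, {len(markdown)} chars")
```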

Per-Block vs Document Confidence

Each <section> in the markdown carries its own data-confidence attribute (0-100), and the response has a top-level parse.confidence that aggregates them.

<section id="p1_b1" data-type="text" data-confidence="94.2" data-bbox="x0:0.034,y0:0.037,x1:0.436,y1:0.053">

# ACME CORPORATION

</section>
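Per-block scores can be pulled out of the markdown with a small amount of parsing. The sketch below uses regular expressions and assumes every opening tag looks like the example above (double-quoted attributes, no `>` inside attribute values); `block_confidences` is a hypothetical helper name.

```python
import re

# Matches an opening <section ...> tag and captures its attribute string.
SECTION_TAG = re.compile(r"<section\b([^>]*)>")
ATTRIBUTE = re.compile(r'([\w-]+)="([^"]*)"')


def block_confidences(markdown: str) -> list[dict]:
    """Collect id, type and confidence for every <section> tag in the markdown."""
    blocks = []
    for tag in SECTION_TAG.finditer(markdown):
        attrs = dict(ATTRIBUTE.findall(tag.group(1)))
        if "data-confidence" not in attrs:
            continue  # skip sections without a score
        blocks.append(
            {
                "id": attrs.get("id"),
                "type": attrs.get("data-type"),
                "confidence": float(attrs["data-confidence"]),
            }
        )
    return blocks


sample = (
    '<section id="p1_b1" data-type="text" data-confidence="94.2" '
    'data-bbox="x0:0.034,y0:0.037,x1:0.436,y1:0.053">\n\n'
    "# ACME CORPORATION\n\n</section>"
)
print(block_confidences(sample))
```

For markdown that strays from this shape, a real HTML parser would be a safer choice than regular expressions.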

When to use what
`parse.confidence` — triage at the document level: “is this doc worth processing further, or send to manual review?”
`data-confidence` per block — UI highlighting: dim or flag low-confidence regions inline so a human reviewer can focus their attention
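The two uses above can be combined in one small triage step. This is a sketch under stated assumptions: the `triage` helper, the route labels, and both thresholds (80 for the document, 70 for blocks) are hypothetical values chosen for illustration and should be tuned on your own documents.

```python
def triage(
    doc_confidence: float,
    block_scores: dict[str, float],
    doc_threshold: float = 80.0,    # hypothetical cut-off for manual review
    block_threshold: float = 70.0,  # hypothetical cut-off for UI highlighting
) -> dict:
    """Decide document routing and list the blocks worth highlighting."""
    flagged = sorted(
        block_id for block_id, score in block_scores.items() if score < block_threshold
    )
    return {
        "route": "auto" if doc_confidence >= doc_threshold else "manual_review",
        "flagged_blocks": flagged,
    }


result = triage(87.5, {"p1_b1": 94.2, "p1_b2": 61.0})
print(result)
```

A document can pass the document-level check and still have individual blocks flagged, which is why both signals are kept separate.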

The document-level value is a character-weighted mean: a 500-char paragraph at 80% counts ~100× more than a 5-char header at 99%.
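The arithmetic behind that example can be worked through directly. The function below is a sketch of a character-weighted mean, not anyformat's actual implementation; how characters are counted per block is an assumption.

```python
def char_weighted_confidence(blocks: list[tuple[int, float]]) -> float:
    """Mean of block confidences weighted by each block's character count."""
    total_chars = sum(chars for chars, _ in blocks)
    if total_chars == 0:
        raise ValueError("no characters to weight by")
    return sum(chars * conf for chars, conf in blocks) / total_chars


# (character count, confidence) for the two blocks in the example above
blocks = [(500, 80.0), (5, 99.0)]
print(round(char_weighted_confidence(blocks), 2))  # 80.19
```

The result, 80.19, sits almost exactly on the paragraph's score, whereas an unweighted mean of the two blocks would be 89.5.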

Mode caveat: in agentic mode, per-block strategies don’t always populate parser logprobs (e.g. the fast text-bytes-first strategy never calls an LLM). When logprobs are absent, the score falls back to the YOLO layout-segmentation confidence — useful, but a measure of “is this region a table?” not “is the parsed content accurate?” For calibrated parser confidence comparable to extraction confidence, use mode="standard" with visual_grounding_enabled=true.

Use with AI Coding Agents

Building an integration on top of anyformat? Install the anyformat agent skill so your AI coding agent (Claude Code, Cursor, etc.) knows the right endpoints, payloads, and gotchas out of the box.

npx skills add anyformat-ai/skills

The anyformat agent skill is launching soon. The install command above is the planned interface — until it ships, you can copy the system prompt from the agent skill repository and paste it into your agent’s instructions.

What the skill gives your agent:

✅ Recipes for typed-graph workflow creation (parse-only, linear, classify-branched, splitter)
✅ The right authentication header + endpoint URLs
✅ Polling/retry behavior for the results endpoint
✅ Confidence-aware processing patterns (when to trust parse.confidence, when to look per-block)
✅ Common pitfalls already baked in

Add a single line to your agent and skip the trial-and-error.

Complete Recipe

A single pasteable script that does the whole loop:

import requests
import time

API_KEY = "YOUR_API_KEY"
BASE = "https://api.anyformat.ai"
headers = {"Authorization": f"Bearer {API_KEY}"}

# 1. Create workflow
wf_resp = requests.post(
    f"{BASE}/v2/workflows/",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "name": "Agentic parse-only",
        "nodes": [{"id": "parse_1", "type": "parse", "mode": "agentic", "effort": "mid"}],
        "edges": [],
    },
)
wf_resp.raise_for_status()  # fail early on auth or validation errors
workflow_id = wf_resp.json()["id"]

# 2. Upload + run
with open("document.pdf", "rb") as f:
    run_resp = requests.post(
        f"{BASE}/v2/workflows/{workflow_id}/run/",
        headers=headers,
        files={"file": f},
    )
run_resp.raise_for_status()  # fail early if the upload was rejected
collection_id = run_resp.json()["id"]

# 3. Poll
for attempt in range(60):
    r = requests.get(
        f"{BASE}/v2/workflows/{workflow_id}/files/{collection_id}/results/",
        headers=headers,
    )
    if r.status_code == 200:
        break
    elif r.status_code == 412:
        time.sleep(min(5 * 1.5 ** min(attempt, 5), 30))
    else:
        raise RuntimeError(f"Error {r.status_code}: {r.text}")
else:
    raise TimeoutError("processing timed out")

# 4. Use results
data = r.json()
print(f"Confidence: {data['parse']['confidence']:.1f}/100")
print(data["parse"]["markdown"][:500])

Next Steps

Create Workflow Reference: Full reference for the typed-graph endpoint

Parse-Only Workflow: Standard parse-only cookbook with the same endpoint

Add Extraction: Add an extract node after the parse step to pull structured fields

Webhooks: Skip polling and receive extraction.completed events
