Introduction

anyformat provides a REST API for programmatically creating extraction workflows, processing documents, and retrieving extracted data. The API uses standard HTTP response codes and authenticates requests with API keys. All requests must use HTTPS, and every endpoint path must end with a trailing slash (/).

Base URL

https://api.anyformat.ai/
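
Because every endpoint path must end with a trailing slash, it can help to build URLs in one place. The helper below is a minimal sketch; the function name `endpoint` and the workflow ID `42` are illustrative and not part of the API.

```python
BASE_URL = "https://api.anyformat.ai"


def endpoint(*parts):
    """Join path segments onto the base URL, always ending with a slash."""
    # Strip stray slashes so "workflows/" and "/workflows" behave the same,
    # and drop segments that are empty after stripping.
    cleaned = [str(p).strip("/") for p in parts if str(p).strip("/")]
    return "/".join([BASE_URL, *cleaned]) + "/"


print(endpoint("workflows"))            # https://api.anyformat.ai/workflows/
print(endpoint("workflows", 42, "run"))  # https://api.anyformat.ai/workflows/42/run/
```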

Authentication

All API endpoints (except /docs/ and /schema/) require authentication using an API key passed in the x-api-key header:
curl -H "x-api-key: YOUR_API_KEY" https://api.anyformat.ai/workflows/
See Authentication for details on obtaining and managing API keys.
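
To keep the key out of source code, the header can be built from an environment variable. This is a sketch only: the variable name `ANYFORMAT_API_KEY` and the helper `auth_headers` are assumptions made for illustration, not names defined by the API.

```python
import os


def auth_headers(env=os.environ):
    """Build the x-api-key header from an environment mapping."""
    # ANYFORMAT_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = env.get("ANYFORMAT_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("ANYFORMAT_API_KEY is not set")
    return {"x-api-key": api_key}
```

The returned dictionary can be passed as the `headers` argument of any HTTP client call.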

Rate Limits

The API is rate-limited to ensure fair usage:
| Limit | Value |
|---|---|
| Documents per minute | 10 |
If you exceed this limit, the API returns a 429 Too Many Requests response. Wait a short time, then retry the request.
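
One way to handle a 429 response is to retry with exponential backoff. The sketch below assumes nothing about the API beyond the 429 status code: `send` is any zero-argument callable that performs the request and returns an object with a `status_code` attribute, and the attempt count and delays are arbitrary illustrative values.

```python
import time


def send_with_retry(send, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call send() until it returns something other than 429, or attempts run out."""
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        # Back off 2s, 4s, 8s, ... but do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Still rate-limited after all attempts: hand the last 429 back to the caller.
    return response
```

With the `requests` library this could be called as `send_with_retry(lambda: requests.get(url, headers=headers))`.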

API Endpoints

The API is organized around two main resources:

Workflows

Workflows define what information should be extracted from your documents.
| Method | Endpoint | Description |
|---|---|---|
| POST | /workflows/ | Create a new workflow |
| GET | /workflows/ | List all workflows |
| GET | /workflows/{id}/ | Get workflow details |
| DELETE | /workflows/{id}/ | Delete a workflow |
| POST | /workflows/{id}/run/ | Run extraction on a file |
| POST | /workflows/{id}/upload/ | Upload a file (no extraction) |
| GET | /workflows/{id}/results/ | Get all extraction results |

Jobs

Jobs represent individual extraction tasks and their results.
| Method | Endpoint | Description |
|---|---|---|
| GET | /jobs/{id}/ | Get job status and results |
| GET | /jobs/{id}/file/ | Download original file |
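
The endpoints listed for both resources can be captured in a small lookup table so that methods and paths are defined once. This is a sketch: the operation names (such as `run_workflow`) and the `resolve` helper are invented for illustration, while the methods and paths are the ones listed above.

```python
BASE_URL = "https://api.anyformat.ai"

# Operation name (hypothetical) -> (HTTP method, path template).
ROUTES = {
    "create_workflow": ("POST", "/workflows/"),
    "list_workflows": ("GET", "/workflows/"),
    "get_workflow": ("GET", "/workflows/{id}/"),
    "delete_workflow": ("DELETE", "/workflows/{id}/"),
    "run_workflow": ("POST", "/workflows/{id}/run/"),
    "upload_file": ("POST", "/workflows/{id}/upload/"),
    "workflow_results": ("GET", "/workflows/{id}/results/"),
    "get_job": ("GET", "/jobs/{id}/"),
    "job_file": ("GET", "/jobs/{id}/file/"),
}


def resolve(operation, resource_id=None):
    """Return (method, full URL) for an operation, filling in {id} if needed."""
    method, template = ROUTES[operation]
    if "{id}" in template:
        if resource_id is None:
            raise ValueError(f"{operation} requires a resource id")
        template = template.replace("{id}", str(resource_id))
    return method, BASE_URL + template
```

For example, `method, url = resolve("get_job", job_id)` yields a pair that can be passed to `requests.request(method, url, headers=headers)`.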

Quick Start

  1. Create a workflow with the fields you want to extract
  2. Run extraction by submitting a file to your workflow
  3. Poll the job endpoint until status is processed
  4. Retrieve results from the job response
import requests
import time

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.anyformat.ai"
headers = {"x-api-key": API_KEY}

# 1. Create a workflow
workflow = requests.post(
    f"{BASE_URL}/workflows/",
    headers={**headers, "Content-Type": "application/json"},
    json={
        "name": "Invoice Extraction",
        "fields": [
            {"name": "invoice_number", "description": "Invoice ID", "data_type": "string"},
            {"name": "total", "description": "Total amount", "data_type": "float"}
        ]
    }
).json()

# 2. Run extraction
with open("invoice.pdf", "rb") as f:
    job = requests.post(
        f"{BASE_URL}/workflows/{workflow['id']}/run/",
        headers=headers,
        files={"file": f}
    ).json()

# 3. Poll for results
while True:
    result = requests.get(
        f"{BASE_URL}/jobs/{job['extraction_id']}/",
        headers=headers
    ).json()

    if result["status"] == "processed":
        print(result["results"])
        break

    time.sleep(5)
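
The polling loop in step 3 runs until the job reaches the processed status, so a job that never gets there would keep it running indefinitely. A bounded variant might look like the sketch below. The helper `wait_for_job`, its default timeout, and the `fetch_job` callable are assumptions made for illustration; only the `processed` status value comes from this page.

```python
import time


def wait_for_job(fetch_job, timeout=300.0, interval=5.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job() until its status is "processed" or the timeout expires."""
    deadline = clock() + timeout
    while True:
        job = fetch_job()  # expected to return the parsed JSON job as a dict
        if job.get("status") == "processed":
            return job
        # Stop if the next poll would land after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"job not processed within {timeout} seconds")
        sleep(interval)
```

With the variables from the Quick Start, it could be called as `wait_for_job(lambda: requests.get(f"{BASE_URL}/jobs/{job['extraction_id']}/", headers=headers).json())`.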

OpenAPI Schema

The full OpenAPI specification is available at:
  • JSON: https://api.anyformat.ai/schema/?format=json
  • Swagger UI: https://api.anyformat.ai/docs/
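
Once downloaded, the JSON schema can be inspected programmatically. The sketch below lists every method and path in a parsed OpenAPI document; it relies only on the standard top-level `paths` object of the OpenAPI format, and the helper name `list_operations` is illustrative.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def list_operations(schema):
    """Return sorted (METHOD, path) pairs from a parsed OpenAPI document."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append((key.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))
```

The input could come from `requests.get("https://api.anyformat.ai/schema/?format=json").json()`, since the /schema/ endpoint does not require an API key.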