> ## Documentation Index
> Fetch the complete documentation index at: https://docs.anyformat.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Invoice Processing

> Extract line items, totals, and dates from invoices using nested objects

This recipe uses the `object` field type with `nested_fields` to capture tabular line-item data.

## Workflow fields

<Tip>
  We recommend creating this workflow in the [anyformat platform](https://app.anyformat.ai) where you can test with sample invoices and iterate on field descriptions. Copy the workflow ID to use with the API.
</Tip>

| Field            | Type   | Description                                                        |
| ---------------- | ------ | ------------------------------------------------------------------ |
| `invoice_number` | string | The unique invoice identifier                                      |
| `vendor_name`    | string | Name of the company issuing the invoice                            |
| `issue_date`     | date   | Date the invoice was issued                                        |
| `due_date`       | date   | Payment due date                                                   |
| `subtotal`       | float  | Amount before tax                                                  |
| `tax_amount`     | float  | Total tax applied                                                  |
| `total_amount`   | float  | Final amount due including tax                                     |
| `currency`       | enum   | Currency code (USD, EUR, GBP, …)                                   |
| `line_items`     | object | Individual line items (description, quantity, unit\_price, amount) |

## End-to-end

<Tabs>
  <Tab title="curl" icon="terminal">
    Three-step flow: create the workflow, run a document, poll the results endpoint until it returns `200`.

    ```bash theme={null}
    # 1. Create the workflow
    curl -X POST 'https://api.anyformat.ai/v2/workflows/' \
      -H 'Content-Type: application/json' \
      -H "Authorization: Bearer $ANYFORMAT_API_KEY" \
      -d '{
        "name": "Invoice Processor",
        "description": "Extract invoice header and line items",
        "nodes": [
          {"id": "parse_1", "type": "parse"},
          {
            "id": "extract_1",
            "type": "extract",
            "extraction_schema": {
              "fields": [
                {"name": "invoice_number", "description": "The unique invoice identifier", "data_type": "string"},
                {"name": "vendor_name",    "description": "Name of the company that issued the invoice", "data_type": "string"},
                {"name": "issue_date",     "description": "Date the invoice was issued", "data_type": "date"},
                {"name": "due_date",       "description": "Date by which payment is due", "data_type": "date"},
                {"name": "subtotal",       "description": "Amount before tax", "data_type": "float"},
                {"name": "tax_amount",     "description": "Total tax amount", "data_type": "float"},
                {"name": "total_amount",   "description": "Final total amount due including tax", "data_type": "float"},
                {
                  "name": "currency",
                  "description": "Currency of the invoice amounts",
                  "data_type": "enum",
                  "enum_options": [
                    {"name": "USD", "description": "US Dollar"},
                    {"name": "EUR", "description": "Euro"},
                    {"name": "GBP", "description": "British Pound"}
                  ]
                },
                {
                  "name": "line_items",
                  "description": "Individual line items listed on the invoice",
                  "data_type": "object",
                  "nested_fields": [
                    {"name": "description", "description": "Description of the item or service", "data_type": "string"},
                    {"name": "quantity",    "description": "Number of units",                    "data_type": "integer"},
                    {"name": "unit_price",  "description": "Price per unit",                     "data_type": "float"},
                    {"name": "amount",      "description": "Total amount for this line item",    "data_type": "float"}
                  ]
                }
              ]
            }
          }
        ],
        "edges": [{"source": "parse_1", "target": "extract_1"}]
      }'
    # → { "id": "550e8400-…", ... }   ← workflow_id

    # 2. Run a document
    curl -X POST 'https://api.anyformat.ai/v2/workflows/WORKFLOW_ID/run/' \
      -H "Authorization: Bearer $ANYFORMAT_API_KEY" \
      -F 'file=@invoice.pdf'
    # → { "id": "069dcc2c-…", "status": "pending", ... }   ← collection_id

    # 3. Poll for results (412 while processing, 200 when done)
    curl -H "Authorization: Bearer $ANYFORMAT_API_KEY" \
      'https://api.anyformat.ai/v2/workflows/WORKFLOW_ID/files/COLLECTION_ID/results/'
    ```

    For production integrations, use [webhooks](/api-reference/webhooks/overview) instead of polling — they deliver results immediately and don't consume your rate limit.
  </Tab>

  <Tab title="TypeScript" icon="js" iconType="brands">
    ```typescript theme={null}
    import { Anyformat, Schema } from "@anyformat/sdk";

    const af = new Anyformat({ apiKey: process.env.ANYFORMAT_API_KEY! });
    const file: File = /* a File with .name set (e.g. read from <input type="file"> or new File([bytes], "invoice.pdf")) */;

    const workflow = await af
      .workflow("Invoice Processor", "Extract invoice header and line items")
      .parse()
      .extract([
        Schema.string("invoice_number", "The unique invoice identifier"),
        Schema.string("vendor_name",    "Name of the company that issued the invoice"),
        Schema.date("issue_date",       "Date the invoice was issued"),
        Schema.date("due_date",         "Date by which payment is due"),
        Schema.float("subtotal",        "Amount before tax"),
        Schema.float("tax_amount",      "Total tax amount"),
        Schema.float("total_amount",    "Final total amount due including tax"),
        Schema.enum("currency", "Currency of the invoice amounts", [
          Schema.option("USD", "US Dollar"),
          Schema.option("EUR", "Euro"),
          Schema.option("GBP", "British Pound"),
        ]),
        Schema.object("line_items", "Individual line items listed on the invoice", [
          Schema.string("description", "Description of the item or service"),
          Schema.integer("quantity",   "Number of units"),
          Schema.float("unit_price",   "Price per unit"),
          Schema.float("amount",       "Total amount for this line item"),
        ]),
      ])
      .create();

    const run = await workflow.run(file);
    const result = await run.wait();

    // Scalar accessor — undefined for missing or list-valued fields.
    console.log(result.field("invoice_number")?.value);
    console.log(result.field("total_amount")?.value);

    // Nested rows live under extractions[0].fields.<name>
    console.log(result.extractions[0]?.fields.line_items);
    ```
  </Tab>

  <Tab title="Python" icon="python" iconType="brands">
    ```python theme={null}
    import os
    from anyformat.sdk import Client
    from anyformat.workflow import Schema

    client = Client(api_key=os.environ["ANYFORMAT_API_KEY"])

    workflow = (
        client.workflow("Invoice Processor")
        .parse()
        .extract([
            Schema.string("invoice_number", "The unique invoice identifier"),
            Schema.string("vendor_name",    "Name of the company that issued the invoice"),
            Schema.date("issue_date",       "Date the invoice was issued"),
            Schema.date("due_date",         "Date by which payment is due"),
            Schema.float("subtotal",        "Amount before tax"),
            Schema.float("tax_amount",      "Total tax amount"),
            Schema.float("total_amount",    "Final total amount due including tax"),
            Schema.enum("currency", "Currency of the invoice amounts", options=[
                Schema.option("USD", "US Dollar"),
                Schema.option("EUR", "Euro"),
                Schema.option("GBP", "British Pound"),
            ]),
            Schema.object("line_items", "Individual line items listed on the invoice", fields=[
                Schema.string("description", "Description of the item or service"),
                Schema.integer("quantity",   "Number of units"),
                Schema.float("unit_price",   "Price per unit"),
                Schema.float("amount",       "Total amount for this line item"),
            ]),
        ])
        .create()
    )

    result = workflow.run("invoice.pdf").wait()

    print(result.fields["invoice_number"].value)
    print(result.fields["total_amount"].value)
    # Nested rows: read from the raw extractions array
    print(result.raw["extractions"][0]["fields"]["line_items"])
    ```
  </Tab>
</Tabs>

### Example response

The full envelope is documented in [Response formats](/api-reference/response-formats). The top-level keys are `collection_id`, `verification_url`, `parse`, `classifications`, `splits`, and `extractions`. Each scalar field follows the `ExtractedField` shape (`value`, `value_override`, `verification_status`, `confidence`, `evidence`).

```json theme={null}
{
  "collection_id": "069dcc2c-e14c-7606-8000-2ee4fb17b4e1",
  "verification_url": "https://app.anyformat.ai/workflows/.../files/...",
  "parse": {"markdown": "<DOCUMENT id=\"1\" page=\"1\">...", "text": "...", "parse_confidence": 94.2, "layout_confidence": 87.4, "blocks": []},
  "classifications": [],
  "splits": [],
  "extractions": [
    {
      "split_name": null,
      "partition": null,
      "fields": {
        "invoice_number": {
          "value": "INV-2024-0847",
          "confidence": 97.0,
          "evidence": [{"text": "Invoice #INV-2024-0847", "page_number": 1}],
          "verification_status": "not_verified"
        },
        "total_amount": {
          "value": 4087.50,
          "confidence": 96.0,
          "evidence": [{"text": "Total: $4,087.50", "page_number": 2}],
          "verification_status": "not_verified"
        },
        "currency": {
          "value": "USD",
          "confidence": 98.0,
          "evidence": [{"text": "USD", "page_number": 1}],
          "verification_status": "not_verified"
        }
      }
    }
  ]
}
```

## Tips

* Write specific field descriptions. "Total amount due including tax" extracts better than just "total".
* If invoices span multiple currencies, add the `enum_options` for all currencies you expect.
* Multi-page invoices are handled automatically; no special configuration needed.

## Next steps

<CardGroup cols={2}>
  <Card title="Field types" icon="diagram-project" href="/concepts/field-types">
    Object, enum, and other complex field types
  </Card>

  <Card title="Response formats" icon="list" href="/api-reference/response-formats">
    Full schema of every section in the results envelope
  </Card>
</CardGroup>
