Creating a Workflow
You can also create a workflow programmatically by sending a POST request to /v2/workflows/. The workflow defines the structure of the data you want to extract.
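As a rough illustration, the request could be assembled as below. Only the POST method and the /v2/workflows/ path come from the text above. The host name, the bearer-token header, and the payload keys ("name", "fields", "description") are placeholders assumed for this sketch, so check them against the API reference before use.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute your real API base URL.
BASE_URL = "https://api.example.com"


def build_create_workflow_request(name, fields, api_key):
    """Build (but do not send) a POST request to /v2/workflows/.

    The payload keys and the Authorization header are illustrative
    assumptions, not the documented schema.
    """
    payload = {"name": name, "fields": fields}
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/workflows/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_create_workflow_request(
    name="invoices",
    fields=[{"name": "total", "description": "Invoice total including tax"}],
    api_key="YOUR_API_KEY",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Building the request separately from sending it keeps the sketch free of network calls and makes the payload easy to inspect.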
Processing a Document
Once you have created a workflow, you can submit documents for processing using the workflow ID.
Checking Status
Use the fileid to poll for results. The endpoint returns 412 while the document is still processing and 200 once results are ready.
Polling Example
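A minimal polling sketch follows, assuming a caller-supplied `fetch_status` function (hypothetical) that performs the HTTP GET for a given file ID and returns a `(status_code, body)` pair. The 412 and 200 codes are the ones described in this guide. The fixed interval and the attempt cap are illustrative choices. This simple version does not inspect the response body.

```python
import time


def poll_for_result(fetch_status, file_id, interval_seconds=5.0,
                    max_attempts=20, sleep=time.sleep):
    """Poll until the status endpoint returns 200 or attempts run out.

    fetch_status is a hypothetical callable that performs the HTTP GET
    for file_id and returns (status_code, body).
    """
    for attempt in range(max_attempts):
        status_code, body = fetch_status(file_id)
        if status_code == 200:
            return body  # results are ready
        if status_code != 412:
            raise RuntimeError(f"unexpected status {status_code}")
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # still processing; wait before retrying
    raise TimeoutError(f"no result for {file_id} after {max_attempts} attempts")
```

Passing `sleep` as a parameter lets tests run the loop without real delays.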
Prefer webhooks over polling for production integrations. Webhooks deliver results immediately without consuming your rate limit.
Webhook Notifications
Instead of polling, you can configure webhooks to receive notifications when processing completes or fails. The supported event types are:
- extraction.completed: processing finished successfully and results are available
- extraction.failed: processing encountered an error
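A webhook receiver typically branches on the event type. The sketch below shows one way to route the two events, assuming the decoded payload is a dictionary with "type" and "file_id" keys. Both key names are assumptions made for illustration and may differ from the real payload.

```python
def handle_webhook_event(event):
    """Route a decoded webhook payload by its event type.

    The "type" and "file_id" keys are assumed names, not confirmed
    by the documentation.
    """
    event_type = event.get("type")
    file_id = event.get("file_id")
    if event_type == "extraction.completed":
        return ("fetch_results", file_id)
    if event_type == "extraction.failed":
        return ("record_failure", file_id)
    # Unknown event types are acknowledged but not acted on.
    return ("ignore", file_id)
```

Returning an action tuple rather than performing the work inline keeps the handler quick to respond and easy to test.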
Error Handling
The endpoint returns 412 PRECONDITION_FAILED whenever results aren’t ready. The retryable flag tells you whether to keep polling:
- retryable: true means the job is still running (pending, queued, in_progress, processing). Keep polling.
- retryable: false means the job reached a terminal failure state (error, cancelled). Stop polling; retrying will not produce results.
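The retry decision can be sketched as follows, assuming the 412 response body is JSON decoded into a dictionary that carries the `retryable` flag. `fetch_status` is a hypothetical caller-supplied function returning `(status_code, body)`. The 5-second starting delay follows the polling advice under Best Practices, while the doubling factor and 60-second cap are illustrative assumptions.

```python
import time


def should_keep_polling(status_code, body):
    """Return True only for a 412 response whose retryable flag is true."""
    return (
        status_code == 412
        and isinstance(body, dict)
        and body.get("retryable") is True
    )


def poll_with_backoff(fetch_status, file_id, initial_delay=5.0, factor=2.0,
                      max_delay=60.0, max_attempts=10, sleep=time.sleep):
    """Poll with exponential backoff, stopping on terminal failures."""
    delay = initial_delay
    for _ in range(max_attempts):
        status_code, body = fetch_status(file_id)
        if status_code == 200:
            return body  # results are ready
        if not should_keep_polling(status_code, body):
            # retryable is false (or the response is unexpected): give up.
            raise RuntimeError(f"terminal failure for {file_id}: {body}")
        sleep(delay)
        delay = min(delay * factor, max_delay)  # grow the wait, up to the cap
    raise TimeoutError(f"no result for {file_id} after {max_attempts} attempts")
```

With the defaults, successive waits are 5, 10, 20, and 40 seconds, then 60 seconds for each later attempt.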
Best Practices
- Polling Strategy: Use exponential backoff when polling for results. Start with 5-second intervals and increase the delay between attempts. See the error handling retry pattern for a complete implementation.
- Webhook Security: If using webhooks, verify the signature using the secret returned when creating the webhook subscription.
- Error Handling: Always handle potential error states in your integration code. Check both the HTTP status code and the error_code field.
- Field Definitions: Provide clear, specific descriptions for each field to improve accuracy.
- File Types: Ensure your documents are in supported formats: PDF, DOC, DOCX, TXT, HTML, HTM, RTF, ODT, PPT, PPTX, EPUB, XLSX, XLS, MD, MARKDOWN, PNG, JPG, JPEG, GIF, BMP, TIFF, EML, MSG, MP3, WAV. Maximum file size is 20 MB.
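Checking the file type and size on the client before uploading avoids a wasted request. The sketch below is one way to do that. The extension list and the 20 MB limit are taken from the list above. Treating 1 MB as 1024 * 1024 bytes and judging the type by filename extension alone are assumptions of this example.

```python
import os

SUPPORTED_EXTENSIONS = {
    "pdf", "doc", "docx", "txt", "html", "htm", "rtf", "odt", "ppt", "pptx",
    "epub", "xlsx", "xls", "md", "markdown", "png", "jpg", "jpeg", "gif",
    "bmp", "tiff", "eml", "msg", "mp3", "wav",
}
# Assumption: 1 MB is counted as 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024


def validate_upload(filename, size_bytes):
    """Return None if the file looks acceptable, else a short error message."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return f"unsupported file type: {extension or '(none)'}"
    if size_bytes > MAX_FILE_SIZE_BYTES:
        return "file exceeds the 20 MB limit"
    return None
```

For a file on disk, `validate_upload(path, os.path.getsize(path))` supplies both arguments.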
