April 2026
April 6, 2026
- Agentic Document Splitter: Replaced the batch LLM splitter with an agentic workflow. You can now define custom split rules (name, description, optional partition key for grouping repeating sections) directly in the Studio SplitPages node, with a built-in “Other” catch-all rule.
- Figure Pipeline Fix: Fixed PDF-to-markdown figure detection —
OTHERblocks are no longer treated as pictures, so only YOLO-classified figures receive<figure-content>wrapping and screenshot crops. - Bug Fix: Restored system admin privileges for local logins by treating the org Owner role as system admin, matching production JWT-claim behavior.
- Product Docs Refresh: Added GIF/WEBP walkthroughs across guide pages, added billing contact (
info@anyformat.ai) to the Pricing section, corrected visual grounding descriptions, and removed the outdated thumbs up/down validation section. - API Cleanup: Removed a duplicate health check endpoint.
- IAM Cleanup: Removed dead
UpdateUserProfileandProvisionUserOnSignupuse cases along with orphaned domain code that was never wired to any endpoint.
April 1, 2026
- Workflow Usage Pagination: The workflow usage endpoint now supports server-side pagination (
page,page_size), ordering, and DB-level filtering with standardresultsresponses. - Unified Workflow Results:
GET /v2/workflows/{id}/results/always returns unified JSON (parse markdown + extraction data per file). Addedfile_idquery param to filter to a single file; removedoutput_format,as_lists, and file filter params. - Restored v1 Compatibility Routes: Reintroduced hidden v1 routes (workflows, jobs, uploads/results) with deprecation/sunset headers and
X-API-Keyauth fallback for backward compatibility.
March 2026
March 30, 2026
- Two-Tier Rate Limiting: Rate limits are now applied independently to file submission endpoints (60 RPM) and general endpoints (600 RPM). Submission operations (run, upload, create file collection) have a stricter limit, while read and management endpoints share a higher limit. Each tier uses an independent counter — consuming one does not affect the other.
March 26, 2026
- Image Paste in Workflow Creation: Paste images directly into the workflow creation chat for more contextual AI suggestions (up to 3 images, 3 MB each)
- Concurrent Edit Protection: Workflow configuration saves now detect concurrent edits, preventing accidental data loss
- API Cleanup: Removed deprecated
as_listsparameter from the v2 extraction endpoint - Bug Fix: Fixed image uploads in workflow suggestions being rejected when within the allowed size limit
- Bug Fix: Fixed workflow name loading indicator occasionally getting stuck
- Bug Fix: Fixed buttons not showing pointer cursor on hover
March 25, 2026
- API v2 Documentation: All API documentation updated for v2 endpoints with path-based versioning (
/v2/prefix) - Bearer Auth: Documentation now shows
Authorization: Beareras the authentication method in customer-facing docs - Collection-First API: Run extraction returns a file collection UUID; poll results via
GET /v2/files/{id}/extraction/(412 while pending, 200 when ready) - Immutable Collections: Removed
file_collection_idparameter from upload endpoint — collections cannot be modified after creation - Multi-File Upload:
POST /v2/files/accepts multiple files in a single request - Simplified Pagination: v2 pagination uses
count,page,page_size(removedtotal_pages,next,previous) - Removed Jobs Endpoints: Deprecated
/jobs/endpoints removed from documentation — use/v2/files/{id}/extraction/instead
March 19, 2026 (Part 2)
- MSG File Support: Added support for
.msg(Outlook) email files alongside existing.emlsupport - EML Inline Images: Email-to-PDF conversion now correctly inlines embedded images
- File Upload Naming: Fixed file names with special characters (commas, currency symbols) being corrupted on upload
- PDF Viewer: Fixed rendering issues when a document mixes different page sizes (e.g., US Letter and A4)
- Auth Callback: Moved auth callback outside the dashboard layout for a cleaner loading experience with full-page loader
- Logging: Suppressed verbose PaddleOCR and fontTools logs during email-to-PDF conversion and OCR initialization
March 19, 2026
- EML File Support: Added support for
.emlemail files across the extraction pipeline and file picker - Smart Table Resubmit: Replace semantics on resubmit instead of skip, with scoped worker context and consolidated briefing
- Submit Extend Mode: New
allow_extendparameter lets agents append rows to existing submissions - OCR Accuracy: Improved OCR text detection using word-count and spatial coverage heuristics instead of simple truthy checks, with screenshot always sent to LLM
- Table Grounding: HTML cell ID injection now gated behind
inject_cell_idsflag with minimum overlap ratio improvements - API Improvements: Updated v3 types, v2 file endpoints now delegate to v3, and graceful V3 error handling
- Rate Limiting: Separated auth from rate limiting, with per-user rate limits via API key comparison
- Billing Cleanup: Removed old billing system (usage limits, subscription tiers)
- Security: Added HTTP security headers per pentest findings
- CLI Fix:
anyformat endpointsnow correctly formats PostgreSQL endpoint as ODBC connection string - Bug Fix: Fixed bounding boxes misaligned on pages with different dimensions
- Bug Fix: Fixed static assets not loading on backend login page
- Bug Fix: Fixed
api-speccommand failing due to unnecessary backend dependencies - Infrastructure: Redis URL consolidation, fallback Redis URL for production, and environment naming consistency
March 18, 2026
- API v2: Full v2 API surface with versioned endpoints, pagination wrappers, workflow routes, list extractions, provenance API, and improved OpenAPI docs
- API Bearer Auth: Support for standard
Authorization: Bearerheader alongside API key authentication - Smart Table Improvements: Extraction planner with table-scoped delegates, dynamic agent limits that scale with document complexity, and improved sandbox tools
- Table Grounding: Robust row-level grounding via token overlap pre-mapping with numerous anchor finder accuracy fixes
- File Domain Refactor: Enforced aggregate invariants, non-nullable file collections, authorization hardening (role-aware checks), and v2 file endpoints now delegate to v3 use cases
- Improved Page Rotation: Cascade decision engine with landscape support and optimized center crop reuse for faster rotation detection
- File Details Toggle: New metadata visibility toggle on the table header with local storage persistence
- Improved Polling: File counts now reflect filtered results instead of selected, with loading state on action buttons
- API Rate Limiting: Redis-based rate limiting for API endpoints
- Bug Fix: Fixed migration that incorrectly backfilled all files with the same UUID
- Bug Fix: Fixed authorization returning 403 for all file collection creation in production
- Bug Fix: Fixed Content-Disposition doubling file extensions on markdown downloads
- Security: Updated vulnerable dependencies (Django, PyJWT, tornado, cairosvg, orjson, joserfc, black)
March 5, 2026
- Improved Document Processing: Enhanced layout analysis for more accurate table and section detection
- JSON Viewer: Raw JSON extraction results now available in a dedicated tab alongside validated results for easier inspection
- Field Hover Cards: Simple field results now display hover cards with additional context on interaction
- Resizable Chat: Workflow creation chat panel is now resizable for a more comfortable editing experience
- Improved Validation UI: Better identification of validated datapoints with refined result card styling
- Bug Fix: Fixed confidence score handling when values are undefined
- Bug Fix: Fixed clipboard copy errors with proper error handling
- Bug Fix: Fixed various UI styling inconsistencies in result cards and file viewer
March 4, 2026
- Webhook System: Production-grade webhook system for the External API — create subscriptions, receive real-time event notifications, and configure automatic retries on failure
- Improved Error Handling: Better error messages with clear categories, making it easier to understand and resolve extraction issues
- Schema Editor Auto-Save: Fields and schema changes now auto-save when switching tabs or editing nested fields, preventing accidental data loss
- Validation Page: New validated card and improved result cards with keyboard navigation and better loading states
- Faster OCR Processing: Upgraded to GPU-accelerated batch processing for significantly faster text recognition on large documents
- Bug Fix: Fixed race conditions in field form submission and save state management
- Bug Fix: Fixed a security vulnerability in a third-party dependency
February 2026
February 27, 2026
- Usage Date Range Filter: The usage page now lets you filter by a custom date range instead of a fixed year, giving more precise control over billing and usage reports
- Verification Status Fix: Editing a field value back to its originally extracted value now correctly marks it as human-verified rather than human-edited
- Worker Reliability: Added startup and liveness probes to extraction workers for improved health monitoring and faster recovery from failures
February 25, 2026
- Improved OCR Accuracy: Upgraded text recognition engine with better support for Latin languages
- LLM Cost Tracking: Billing transactions now include per-extraction LLM token counts and costs for full cost transparency
- Table Extraction Accuracy: Improved accuracy when extracting data from tables with closely spaced rows or surrounding text
- Faster Processing: Extraction workers now auto-scale to handle high volumes, reducing wait times during peak usage
- Sample Workflows: New accounts are prepopulated with sample workflows and documents for a guided onboarding experience
- Bug Fix: Fixed a crash in smart lookup when the search corpus was empty
- Bug Fix: Improved performance on extraction callbacks and field list endpoints
- Bug Fix: Fixed file download filenames not being readable by the frontend
February 21, 2026
- Visual Grounding 2.0: Completely revamped evidence highlighting with block-level and element-level precision, making it easier to trace extracted values back to their exact location in the original document
- Billing Dashboard: New usage dashboard with remaining credits, weekly summaries, per-workflow usage breakdown, and CSV export
- Document Segmentation: Advanced document layout analysis now enabled in production for improved table and section detection
- Bug Fix: Fixed an issue where bounding box highlights could remain stuck after navigating between fields
- Bug Fix: Fixed calibration configuration for confidence scoring
February 14, 2026
- Redesigned Home Page: New personalized home with recent workflows, at-a-glance file status indicators, and a streamlined workflow creation experience with templates
- New Schema Builder: Redesigned workflow configuration with a two-step Define & Refine flow, AI-powered field suggestions, inline file previews, and drag-and-drop uploads
- Faster Page Rotation: Upgraded page orientation detection engine for ~2x faster processing per page
- Improved Field Suggestions: More reliable AI field suggestions powered by an upgraded model
- Smart Lookup Fix: Fixed an issue where smart lookup settings were not saved when publishing a workflow version
- Bug Fix: Fixed sidebar overflow in the workflow studio
February 7, 2026
- New Dashboard Design: Refreshed visual design across the platform with updated components, colors, and layout
- Improved Visual Grounding: Evidence highlighting now uses an event-driven architecture, making interactions faster and more reliable across all views
- Secure File Access: Files and organization logos are now served securely through the backend, improving security and simplifying local development
- Workflow Name Suggestions: New AI-powered workflow name suggestions based on your description and uploaded files
- File Status Counts: Workflow listings now show at-a-glance counts of files by processing status
- Bug Fix: Fixed document element classification that could cause incorrect layout detection in certain documents
February 3, 2026
- Smart Lookup: New feature to enrich extracted data with external reference files using AI-powered matching
- Field Source Tracking: Fields now track their source (extraction vs. smart lookup) for better traceability
- OCR Performance: Improved startup times for OCR services with memory snapshots
- Bug Fix: Fixed field data processing to correctly handle manual field creation
January 2026
January 27, 2026
- Usage-Based Billing: Introduced billing for data extraction operations with transparent per-page pricing
- Improved Suggestions: Faster and more reliable field suggestions when creating workflows
- Bug Fix: Resolved an issue where certain field value formats could cause upload errors
January 20, 2026
- Faster OCR Processing: Improved document processing speed with optimized text recognition
- Extraction Review: Administrators can now review extractions directly in the dashboard
- Organization Credits: Added ability to manage organization credits for usage billing
January 13, 2026
- Faster Document Processing: Significantly improved processing speed for large documents
- Row Count Filters: Filter extraction results by the number of rows extracted
- Page Rotation Control: Configure custom page rotation strategies per workflow
January 6, 2026
- Results Filtering: Enhanced results download with additional filtering options
- API Documentation: Expanded API reference documentation with more examples
December 2025
December 15, 2025
- Improved Reliability: Enhanced system stability and faster deployments
December 1, 2025
- Performance Improvements: Optimized results retrieval for workflows with many documents
- Cost Tracking: Added detailed usage tracking for extraction operations
November 2025
November 24, 2025
- Datarow Creation: Create new data rows directly from specific extractions
November 17, 2025
- Table Extraction: Improved accuracy when extracting data from complex tables
- User Status Endpoint: New
/user/is-enabledendpoint to check account status - File Validation: File validator responses now include user information
November 10, 2025
- Large Document Support: Better handling of documents with many pages using smart chunking
- Validation Tracking: Track which values have been manually validated or edited
- Improved Accuracy: Enhanced extraction accuracy with updated default model
November 3, 2025
- Large Document Processing: Improved extraction quality for documents exceeding context limits
- Confidence Scoring: When merging extractions, the highest confidence value is preserved
October 2025
October 27, 2025
- Performance: General performance improvements across the platform
October 13, 2025
- CSV Export: Fixed field ordering in CSV exports to match workflow configuration
- Results Ordering: Extraction results now consistently respect field order
October 6, 2025
- Nested Field Reordering: New endpoint to reorder fields within object-type fields
- Verification URLs: Job results now include verification URLs for easy access
- Bug Fix: Fixed handling of fields with similar names in nested structures
September 30, 2025
- Improved Extraction Pipeline: Redesigned extraction workflow for better reliability
