This section explains what happens to your documents once they’re uploaded to anyformat. It focuses on:
  • Files and pages
  • How documents are processed
  • How usage is calculated
You don’t need to define schemas or fields here. This section covers the document side, not the data side.

What is a file?

A file is any document you upload to anyformat. Examples:
  • PDFs
  • Images
  • Scans
  • Multi-page documents
Files are the raw input. They contain information, but no structure yet. anyformat reads files to understand:
  • Text
  • Layout
  • Tables
  • Visual cues

Supported file formats

anyformat supports a wide range of document types:
Category | Formats
PDF | .pdf
Documents | .doc, .docx, .txt, .html, .htm, .rtf, .odt, .ppt, .pptx, .epub
Spreadsheets | .xlsx, .xls
Markdown | .md, .markdown
Images | .png, .jpg, .jpeg, .gif, .bmp, .tiff
Email | .eml, .msg
Audio | .mp3, .wav

File size limits

  • Maximum file size: 20 MB per file
  • Page count: No hard limit (usage-based billing applies)
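
The format list and size limit above can be checked locally before uploading. The following is a minimal sketch, not part of anyformat itself: the names `category_for` and `check_upload` are hypothetical, and it assumes "20 MB" means 20 × 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import Path
from typing import List, Optional

# Extensions copied from the supported-formats table on this page.
SUPPORTED_FORMATS = {
    "PDF": {".pdf"},
    "Documents": {".doc", ".docx", ".txt", ".html", ".htm", ".rtf",
                  ".odt", ".ppt", ".pptx", ".epub"},
    "Spreadsheets": {".xlsx", ".xls"},
    "Markdown": {".md", ".markdown"},
    "Images": {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tiff"},
    "Email": {".eml", ".msg"},
    "Audio": {".mp3", ".wav"},
}

# Assumption: "20 MB" is interpreted as binary megabytes.
MAX_FILE_SIZE_BYTES = 20 * 1024 * 1024


def category_for(filename: str) -> Optional[str]:
    """Return the format category for a filename, or None if unsupported."""
    ext = Path(filename).suffix.lower()  # case-insensitive extension match
    for category, extensions in SUPPORTED_FORMATS.items():
        if ext in extensions:
            return category
    return None


def check_upload(filename: str, size_bytes: int) -> List[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    if category_for(filename) is None:
        ext = Path(filename).suffix or "(none)"
        problems.append(f"unsupported file extension: {ext}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(
            f"file is {size_bytes} bytes; limit is {MAX_FILE_SIZE_BYTES} bytes"
        )
    return problems
```

For a file on disk, `check_upload(path, Path(path).stat().st_size)` returns an empty list when both checks pass. Page count is deliberately not checked, since the page states there is no hard limit.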

What happens when you upload a file?

When a file is uploaded:
  1. Document reading: anyformat reads the document.
  2. Page detection: Pages are detected.
  3. Content analysis: Content is analyzed and made searchable.
  4. Ready for processing: The file becomes ready for extraction or workflows.
You don’t need to configure OCR engines or preprocessing steps.

How files work with workflows

Files are always processed through workflows in anyformat. With workflows, you can:
  • Apply repeatable logic
  • Extract structured data consistently
  • Scale to many documents
Files are the starting point; workflows define what to extract.

What’s next?