Mistral OCR 4
5 hours ago
- #multilingual
- #OCR
- #document-processing
- Mistral releases OCR 4, featuring bounding boxes, block classification, and inline confidence scores with support for 170 languages.
- The model outperforms competitors in human preference evaluations (72% win rate) and on benchmarks such as OlmOCRBench (score of 85.20), especially for rare languages.
- OCR 4 provides structured document output for RAG, agentic workflows, and data pipelines, and can be deployed self-hosted for data sovereignty.
- Available via API (priced at $4 per 1,000 pages, with batch discounts) and Document AI, with integrations through Mistral Studio, Amazon SageMaker, and Microsoft Foundry.
- Benchmark scores have known limitations: recorded errors often stem from ground-truth issues or formatting differences rather than model mistakes.
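
To make the quoted API price concrete, here is a minimal sketch of a cost estimate at $4 per 1,000 pages. The function name `estimate_ocr_cost` and the `batch_discount` parameter are hypothetical. The summary mentions batch discounts but gives no rate, so the discount defaults to zero and must be supplied by the caller. The sketch also assumes billing scales linearly per page, which the summary does not confirm.

```python
# List price quoted in the summary: $4 per 1,000 pages.
PRICE_PER_1000_PAGES_USD = 4.00


def estimate_ocr_cost(pages: int, batch_discount: float = 0.0) -> float:
    """Estimate the USD cost of processing `pages` pages.

    `batch_discount` is a hypothetical fraction between 0.0 and 1.0.
    The actual batch discount rate is not stated in the summary.
    Assumes simple linear per-page billing (an assumption, not confirmed).
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if not 0.0 <= batch_discount <= 1.0:
        raise ValueError("batch_discount must be between 0.0 and 1.0")
    base_cost = pages * PRICE_PER_1000_PAGES_USD / 1000
    return base_cost * (1.0 - batch_discount)


if __name__ == "__main__":
    # 250,000 pages at list price, no discount applied.
    print(f"${estimate_ocr_cost(250_000):,.2f}")  # prints $1,000.00
```

At list price, a 250,000-page archive would come to about $1,000 under these assumptions. Any real estimate should use the provider's published discount terms.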