DocStrange: Open-source tool that turns PDFs, images, and docs into clean JSON/Markdown
2 days ago
- #document-processing
- #OCR
- #data-extraction
- DocStrange offers free cloud processing with no installation required, up to 10,000 documents per month for authenticated users.
- Supports local processing (CPU/GPU) for 100% privacy with no data sent externally.
- Extracts and converts data from various document formats (PDF, Word, Excel, images, URLs) into multiple output formats (Markdown, JSON, CSV, HTML).
- Includes a built-in web interface for drag-and-drop document conversion with a user-friendly UI.
- Features intelligent content extraction, advanced OCR, table processing, and LLM-optimized outputs.
- Provides both cloud and local processing modes with options for specific field extraction and JSON schema validation.
- The free tier is rate-limited; users who authenticate via 'docstrange login' or an API key get 10,000 documents per month.
- An optional MCP server, intended for local development, enables document processing from within Claude Desktop.
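
To make the "specific field extraction and JSON schema validation" point more concrete, here is a minimal sketch of what checking an extracted JSON record against a set of expected fields might look like in downstream code. It uses only the Python standard library and does not call DocStrange itself; `INVOICE_SCHEMA`, `validate_record`, and the sample record are hypothetical names and data made up for illustration, not part of the tool's API.

```python
import json

# Hypothetical schema: field name -> expected Python type.
# This is an illustrative structure, not a DocStrange format.
INVOICE_SCHEMA = {
    "invoice_number": str,
    "total_amount": float,
    "vendor_name": str,
}


def validate_record(record, schema):
    """Return a list of problems; an empty list means the record matches."""
    problems = []
    for field, expected_type in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            problems.append(
                f"{field}: expected {expected_type.__name__}, got {actual}"
            )
    return problems


# Stand-in for JSON text produced by some extraction step (made-up data).
raw = '{"invoice_number": "INV-001", "total_amount": "1250.00"}'
record = json.loads(raw)

problems = validate_record(record, INVOICE_SCHEMA)
for problem in problems:
    print(problem)
```

In this sample the amount arrives as a string and the vendor name is absent, so both are reported. A real pipeline would more likely rely on a full JSON Schema validator than on a hand-rolled type check like this one.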