Doc2MD: An LLM powered document to Markdown conversion utility
16 days ago
- #markdown-conversion
- #text-extraction
- #openai-api
- A utility extracts text from images or PDFs using a local or remote OpenAI-compatible API with vision-capable models.
- For PDFs, each page is rendered to an image and processed sequentially; outputs are concatenated into a single Markdown document.
- Requires Python 3.12+, uv package manager, and dependencies like requests and pymupdf.
- Supports configuration via TOML file, command-line flags, or environment variables.
- Handles various error conditions including missing files, unsupported formats, and API server errors.
- Supports multiple image formats (JPG, PNG, GIF, BMP, WebP) and PDFs.
- Provides options to specify model, endpoint, API key, and output file.