Doc2MD: An LLM powered document to Markdown conversion utility

16 days ago

Copy Link

A utility extracts text from images or PDFs using a local or remote OpenAI-compatible API with vision-capable models.
For PDFs, each page is rendered to an image and processed sequentially; outputs are concatenated into a single Markdown document.
Requires Python 3.12+, uv package manager, and dependencies like requests and pymupdf.
Supports configuration via TOML file, command-line flags, or environment variables.
Handles various error conditions including missing files, unsupported formats, and API server errors.
Supports multiple image formats (JPG, PNG, GIF, BMP, WebP) and PDFs.
Provides options to specify model, endpoint, API key, and output file.

Hasty Briefsbeta