Which Table Format Do LLMs Understand Best? (Results for 11 Formats)
15 hours ago
- #AI Performance
- #LLM
- #Data Formats
- The article compares data formats for passing tabular data to LLMs, measuring both accuracy and token efficiency.
- Markdown-KV format achieved the highest accuracy (60.7%) but used more tokens, while CSV was token-efficient but less accurate (44.3%).
- The study tested 11 formats, including JSON, XML, YAML, HTML, and natural language, using GPT-4.1-nano on 1,000 synthetic employee records.
- The practical guidance: consider Markdown-KV when accuracy matters most, use markdown tables as a balance between accuracy and token cost, and avoid CSV and JSONL in accuracy-critical applications.
- Limitations include testing only one model (GPT-4.1-nano) and one data pattern; the article suggests future research on other models and data structures.
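
To make the trade-off above concrete, here is a minimal Python sketch that renders the same records as CSV, as a markdown table, and in a Markdown-KV style. The summary does not define Markdown-KV, so the layout used here (one heading per record followed by `key: value` lines) is an assumption, as are the sample records and the function names. Character counts are only a rough stand-in for token counts, which depend on the model's tokenizer.

```python
import csv
import io

# Hypothetical sample records; the article's actual dataset is not reproduced here.
records = [
    {"id": 1, "name": "Alice", "department": "Engineering", "salary": 85000},
    {"id": 2, "name": "Bob", "department": "Sales", "salary": 62000},
]


def to_csv(rows):
    """Serialize rows as CSV: one header line, then one line per record."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_markdown_table(rows):
    """Serialize rows as a pipe-delimited markdown table."""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines) + "\n"


def to_markdown_kv(rows):
    """Assumed Markdown-KV layout: a heading per record, then key: value lines."""
    blocks = []
    for i, row in enumerate(rows, start=1):
        body = "\n".join(f"{key}: {value}" for key, value in row.items())
        blocks.append(f"## Record {i}\n\n{body}\n")
    return "\n".join(blocks)


# Print each rendering with its character count (a crude proxy for token cost).
for label, render in [
    ("CSV", to_csv),
    ("Markdown table", to_markdown_table),
    ("Markdown-KV (assumed layout)", to_markdown_kv),
]:
    text = render(records)
    print(f"--- {label}: {len(text)} characters ---")
    print(text)
```

In this sketch the Markdown-KV rendering repeats every field name for every record, while CSV states the field names once in the header. That repetition is one plausible reason for the higher token use reported for Markdown-KV, though the summary itself does not explain the cause.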