Hasty Briefs (beta)

Which Table Format Do LLMs Understand Best? (Results for 11 Formats)

15 hours ago
  • #AI Performance
  • #LLM
  • #Data Formats
  • The article investigates which data format works best for passing tabular data to LLMs, weighing accuracy against token efficiency.
  • Markdown-KV format achieved the highest accuracy (60.7%) but used more tokens, while CSV was token-efficient but less accurate (44.3%).
  • The study tested 11 formats, including JSON, XML, YAML, HTML, and natural language, using GPT-4.1 nano on 1,000 synthetic employee records.
  • Practical guidance: consider Markdown-KV when accuracy matters most, use markdown tables for a balance of accuracy and token cost, and avoid CSV/JSONL for critical applications.
  • Limitations: only one model (GPT-4.1 nano) and one data pattern were tested; the article suggests future research on other models and data structures.
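
To make the comparison above concrete, here is a minimal Python sketch that renders the same records as CSV, as a markdown table, and in a Markdown-KV style. The summary does not define Markdown-KV, so the layout used here (a heading per record followed by `key: value` lines) is an assumption, as are the sample records and the function names. Character counts are printed only as a rough stand-in for token cost; actual token counts depend on the model's tokenizer.

```python
import csv
import io

# Hypothetical sample records; field names are illustrative, not the study's schema.
RECORDS = [
    {"id": 1, "name": "Alice", "department": "Engineering", "salary": 98000},
    {"id": 2, "name": "Bob", "department": "Sales", "salary": 71000},
]


def to_csv(records):
    """Render records as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_markdown_table(records):
    """Render records as a pipe-delimited markdown table."""
    headers = list(records[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for record in records:
        lines.append("| " + " | ".join(str(record[h]) for h in headers) + " |")
    return "\n".join(lines) + "\n"


def to_markdown_kv(records):
    """Render each record as a heading plus key: value lines (assumed layout)."""
    blocks = []
    for index, record in enumerate(records, start=1):
        body = "\n".join(f"{key}: {value}" for key, value in record.items())
        blocks.append(f"## Record {index}\n\n{body}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    for name, render in [
        ("CSV", to_csv),
        ("Markdown table", to_markdown_table),
        ("Markdown-KV", to_markdown_kv),
    ]:
        text = render(RECORDS)
        # Character count is only a crude proxy for token usage.
        print(f"--- {name} ({len(text)} characters) ---")
        print(text)
```

The key-value layout repeats every field name in every record, which is one plausible reason it costs more tokens than CSV, where field names appear once in the header.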