Hasty Briefs (beta)

TaxCalcBench: Evaluating Frontier Models on the Tax Calculation Task

18 hours ago
  • #Benchmarking
  • #Tax Calculation
  • #Artificial Intelligence
  • AI currently cannot accurately file US personal income taxes.
  • TaxCalcBench is introduced as a benchmark to evaluate AI models on tax calculation tasks.
  • State-of-the-art models correctly calculate fewer than a third of federal income tax returns.
  • Common errors include misuse of tax tables, calculation mistakes, and incorrect eligibility determination.
  • Additional infrastructure is needed before AI can be applied reliably to tax calculation.