AI bots ignore evidence. Can we trust them with science?

14 hours ago
  • #Science Trust
  • #AI Reasoning
  • #Evidence Ignorance
  • AI chatbots like ChatGPT, Gemini, and Grok failed to update predictions based on experimental evidence in a pen demonstration, sticking to incorrect assumptions.
  • A study found that AI agents ignored evidence in 68% of scientific reasoning tasks, made unsupported claims in 53%, and revised their output in response to contradictory evidence only 26% of the time.
  • AI systems lack the iterative reasoning process that human scientists use and often refuse to revise hypotheses despite clear evidence, which limits their reliability in science and medicine.
  • Researchers developed a benchmark to evaluate AI agents' reasoning process rather than just outcomes, revealing gaps in their ability to incorporate new data transparently.
  • Reasoning models, which mimic step-by-step thinking, may not truly reason; they may instead imitate patterns without verifying them, which makes their problem-solving process hard to trust.
  • AI is best suited to well-defined tasks in science rather than open-ended reasoning, a finding that contradicts claims of emergent intelligence and raises concerns about knowledge erosion.
  • Understanding AI's limitations makes it possible to improve these systems so they can contribute to meaningful discoveries, but current systems risk undermining scientific trust and transparency.
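
The difference between scoring outcomes and scoring the reasoning process can be illustrated with a minimal Python sketch. It is not the researchers' benchmark: the `TraceAnnotation` fields, the `process_metrics` function, and the sample data are all hypothetical, and the sketch assumes each agent run has already been annotated by hand.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceAnnotation:
    """Hypothetical hand-made annotation of one agent run (not the study's schema)."""
    ignored_evidence: bool             # agent disregarded available evidence
    unsupported_claim: bool            # agent asserted something the evidence did not support
    saw_contradiction: bool            # evidence contradicted the agent's current hypothesis
    revised_after_contradiction: bool  # agent changed its output in response
    final_answer_correct: bool         # outcome-only score


def process_metrics(traces: list[TraceAnnotation]) -> dict[str, Optional[float]]:
    """Compute outcome accuracy alongside process-level rates."""
    if not traces:
        raise ValueError("need at least one annotated trace")
    n = len(traces)
    contradicted = [t for t in traces if t.saw_contradiction]
    return {
        "outcome_accuracy": sum(t.final_answer_correct for t in traces) / n,
        "evidence_ignored_rate": sum(t.ignored_evidence for t in traces) / n,
        "unsupported_claim_rate": sum(t.unsupported_claim for t in traces) / n,
        # Revision rate is only defined over runs where a contradiction occurred.
        "revision_rate": (
            sum(t.revised_after_contradiction for t in contradicted) / len(contradicted)
            if contradicted
            else None
        ),
    }


# Made-up sample data, for illustration only.
sample = [
    TraceAnnotation(True, True, True, False, True),
    TraceAnnotation(True, False, True, False, True),
    TraceAnnotation(False, True, True, True, True),
    TraceAnnotation(True, False, False, False, False),
]
metrics = process_metrics(sample)
```

In this made-up sample the outcome accuracy is high even though most runs ignored evidence, which is the kind of gap an outcome-only evaluation would miss.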