Hasty Briefs (beta)

Show HN: OSS implementation of Test-Time Diffusion that runs on a 24 GB GPU

15 days ago
  • #MMU-RAG
  • #deep-learning
  • #research-agent
  • TTD-RAG is a deep research agent submitted for the MMU-RAG Competition.
  • The system implements the 'Deep Researcher with Test-Time Diffusion (TTD-DR)' framework.
  • Report generation is modeled as an iterative 'denoising' process: a rough initial draft is revised repeatedly using retrieved information.
  • Features include Test-Time Diffusion Framework, Report-Level Denoising with Retrieval, Component-wise Self-Evolution, and High-Performance Serving.
  • The agent operates in three stages: Planning & Initial Drafting, Iterative Search & Denoising, and Final Report Generation.
  • Technologies used include FastAPI, vLLM, Qwen/Qwen3-4B-Instruct-2507, tomaarsen/Qwen3-Reranker-0.6B-seq-cls, FineWeb Search API, and Docker.
  • Setup requires Docker, an NVIDIA GPU with at least 24 GB of VRAM, and API keys supplied as FINEWEB_API_KEY and OPENROUTER_API_KEY.
  • The API includes endpoints for Health Check, Dynamic Evaluation (/run), and Static Evaluation (/evaluate).
  • AWS CLI commands are provided for pushing the Docker image to the competition's ECR repository.
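
The three stages listed above (planning and initial drafting, iterative search and denoising, final report generation) can be sketched as a small control loop. This is only an illustration under stated assumptions, not the TTD-RAG implementation: `run_agent`, `ResearchState`, the prompt prefixes, and the `steps` parameter are all hypothetical names, the language model and the search backend are passed in as plain callables, and the sketch leaves out the reranker and the component-wise self-evolution step.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interfaces, not taken from the TTD-RAG code base.
Generate = Callable[[str], str]        # prompt -> generated text (e.g. an LLM call)
Search = Callable[[str], List[str]]    # query -> retrieved passages


@dataclass
class ResearchState:
    question: str
    plan: str = ""
    draft: str = ""
    queries: List[str] = field(default_factory=list)  # search queries issued so far


def run_agent(question: str, generate: Generate, search: Search, steps: int = 3) -> str:
    """Draft a report, then refine ("denoise") it with retrieved passages."""
    state = ResearchState(question)

    # Stage 1: planning and initial drafting. The first draft is the "noisy" report.
    state.plan = generate(f"PLAN: {question}")
    state.draft = generate(f"DRAFT: {question}\n{state.plan}")

    # Stage 2: iterative search and denoising. Each pass asks what the draft
    # still lacks, retrieves passages for that query, and revises the draft.
    for _ in range(steps):
        query = generate(f"QUERY: {question}\n{state.draft}")
        state.queries.append(query)
        evidence = "\n".join(search(query))
        state.draft = generate(f"REVISE: {state.draft}\n{evidence}")

    # Stage 3: final report generation from the plan and the refined draft.
    return generate(f"REPORT: {question}\n{state.plan}\n{state.draft}")


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without a model or a search API.
    def echo_model(prompt: str) -> str:
        return prompt.split(":", 1)[0].lower()

    def no_results(query: str) -> List[str]:
        return []

    print(run_agent("What is test-time diffusion?", echo_model, no_results, steps=2))
```

Passing the model and the retriever in as callables keeps the loop independent of any particular serving stack; in a setup like the one described above, `generate` would wrap a call to the vLLM-served model and `search` would wrap the FineWeb Search API, but that wiring is an assumption and is not shown here.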