Show HN: OSS implementation of Test-Time Diffusion that runs on a 24 GB GPU
15 days ago
- #MMU-RAG
- #deep-learning
- #research-agent
- TTD-RAG is a deep research agent submitted for the MMU-RAG Competition.
- The system implements the 'Deep Researcher with Test-Time Diffusion (TTD-DR)' framework.
- Report generation is modeled as an iterative 'denoising' process.
- Features include Test-Time Diffusion Framework, Report-Level Denoising with Retrieval, Component-wise Self-Evolution, and High-Performance Serving.
- The agent operates in three stages: Planning & Initial Drafting, Iterative Search & Denoising, and Final Report Generation.
- Technologies used include FastAPI, vLLM, Qwen/Qwen3-4B-Instruct-2507, tomaarsen/Qwen3-Reranker-0.6B-seq-cls, FineWeb Search API, and Docker.
- Setup requires Docker, an NVIDIA GPU with at least 24 GB of VRAM, and two API keys: FINEWEB_API_KEY and OPENROUTER_API_KEY.
- The API includes endpoints for Health Check, Dynamic Evaluation (/run), and Static Evaluation (/evaluate).
- AWS CLI commands are provided for pushing the Docker image to the competition's ECR repository.
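
The three stages listed above (Planning & Initial Drafting, Iterative Search & Denoising, Final Report Generation) can be sketched roughly as the loop below. This is a minimal illustration, not the project's actual code: `run_ttd_loop`, `ResearchState`, the prompt prefixes, and the `stub_generate` / `stub_search` helpers are all hypothetical names, and component-wise self-evolution is left out. In a real deployment `generate` would presumably call the vLLM-served model and `search` the FineWeb Search API plus the reranker.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ResearchState:
    """Working state carried across denoising steps (hypothetical structure)."""
    question: str
    plan: List[str] = field(default_factory=list)
    draft: str = ""
    evidence: List[str] = field(default_factory=list)


def run_ttd_loop(
    question: str,
    generate: Callable[[str], str],      # assumed LLM call: prompt -> text
    search: Callable[[str], List[str]],  # assumed retriever: query -> documents
    steps: int = 3,
) -> Tuple[str, ResearchState]:
    state = ResearchState(question=question)

    # Stage 1: planning and a rough ("noisy") initial draft.
    plan_text = generate(f"PLAN: {question}")
    state.plan = [line for line in plan_text.splitlines() if line.strip()]
    state.draft = generate(f"DRAFT: {question}")

    # Stage 2: each step derives a search query from the current draft,
    # retrieves documents, and revises ("denoises") the draft with them.
    for _ in range(steps):
        query = generate(f"QUERY: {state.draft}")
        docs = search(query)
        state.evidence.extend(docs)
        joined = " | ".join(docs)
        state.draft = generate(f"REVISE: {state.draft}\nEVIDENCE: {joined}")

    # Stage 3: write the final report from the plan, draft, and evidence.
    final = generate(
        f"FINAL: {state.draft}\nPLAN: {state.plan}\nEVIDENCE: {state.evidence}"
    )
    return final, state


# Stand-ins so the sketch runs without a model or a search backend.
def stub_generate(prompt: str) -> str:
    # Returns a tag naming the kind of prompt it received, e.g. "<revise>".
    return "<" + prompt.split(":", 1)[0].lower() + ">"


def stub_search(query: str) -> List[str]:
    # Returns a single fake document per query.
    return [f"doc about {query}"]
```

Calling `run_ttd_loop("some question", stub_generate, stub_search, steps=2)` exercises all three stages with the stubs; swapping in real model and search callables should not require changes to the loop itself.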