Hasty Briefs

A case study in testing with 100+ Claude agents in parallel

2 days ago
  • #testing automation
  • #AI-assisted software engineering
  • #multi-agent workflows
  • mngr runs and improves itself by testing its own demo script through a multi-agent workflow.
  • The process starts with a tutorial script divided into blocks, each converted into pytest functions and assigned to an agent for execution, debugging, fixing, and improvement.
  • Coding agents generate examples from tutorial blocks; each attempt either produces a good example or surfaces an interface issue that helps refine mngr.
  • Tutorial blocks are transformed into pytest functions with a 1:N correspondence to cover various scenarios, and agents cite tutorial blocks for traceability.
  • A test framework built on Python's subprocess module allows concise test functions and generates CLI transcripts and TUI recordings via tools like asciinema.
  • Tests are orchestrated by collecting test names, launching agents to fix or improve each test, pulling results, and integrating changes into a single PR.
  • Integration involves separating implementation fixes from non-implementation fixes, merging the latter directly and ranking the former for review.
  • The workflow was developed locally with 10 agents and scaled to 100 agents on Modal by changing mngr create commands, maintaining consistency across environments.
  • mngr enables building custom map-reduce-like pipelines using its primitives, supporting both small-scale local runs and large-scale remote deployments without upfront costs.
  • The tool emphasizes scalability in both directions (up and down), allowing teams to start quickly locally and scale as needed, aligning with Imbue's mission to make tech serve humans.
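
The 1:N mapping from tutorial blocks to pytest functions can be sketched as follows. This is an illustrative assumption, not the article's actual code: the block IDs, the `CITES` registry, and the test bodies are hypothetical stand-ins for how an agent might tag each generated test with the tutorial block it covers.

```python
# Hypothetical sketch: one tutorial block expands into N test functions,
# each citing its source block so failures trace back to the tutorial.

CITES = {}  # test name -> tutorial block ID, for traceability


def cites(block_id):
    """Record which tutorial block a test function was derived from."""
    def mark(fn):
        CITES[fn.__name__] = block_id
        return fn
    return mark


@cites("block-2")
def test_create_default_agent():
    assert True  # happy path from the block (placeholder assertion)


@cites("block-2")
def test_create_with_explicit_name():
    assert True  # variation of the same block: custom agent name


@cites("block-2")
def test_create_duplicate_name_fails():
    assert True  # error path of the same block: duplicate name rejected
```

One block yields several scenarios (happy path, variation, error path), and the citation registry gives reviewers a direct pointer from a failing test back to the tutorial text it exercises.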
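
A subprocess-based helper like the one the summary describes might look like this. The `CliSession` class, its method names, and the transcript file layout are assumptions for illustration, not the article's actual framework; the real one also produces TUI recordings via asciinema, which is omitted here.

```python
# Minimal sketch of a subprocess-backed test helper that keeps test bodies
# concise and records a CLI transcript as a side effect (assumed design).
import subprocess
import tempfile
from pathlib import Path


class CliSession:
    def __init__(self, transcript_path: Path):
        self.transcript_path = transcript_path
        self.lines = []

    def run(self, *args: str) -> str:
        """Run a command, append it and its output to the transcript."""
        proc = subprocess.run(list(args), capture_output=True, text=True, check=True)
        self.lines.append(f"$ {' '.join(args)}")
        self.lines.append(proc.stdout.rstrip("\n"))
        return proc.stdout

    def save(self) -> None:
        self.transcript_path.write_text("\n".join(self.lines) + "\n")


# Usage: a test body stays to a few lines; the transcript is a free artifact.
with tempfile.TemporaryDirectory() as d:
    session = CliSession(Path(d) / "transcript.txt")
    out = session.run("echo", "hello")
    session.save()
```

Because every command and its output land in the transcript, a failing test leaves behind a replayable record of exactly what the agent ran.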
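
The orchestration loop (collect test names, fan out one agent per test, gather results, partition fixes) can be sketched as a small map-reduce. This is a hedged simulation: a thread pool stands in for launching mngr agents, `collect_tests` stands in for real pytest collection, and the fix classification is hard-coded for illustration.

```python
# Assumed sketch of the fan-out/fan-in workflow; real agent dispatch via
# mngr is replaced by a thread pool for illustration.
from concurrent.futures import ThreadPoolExecutor


def collect_tests():
    # Stand-in for collecting test names (e.g. via `pytest --collect-only`).
    return ["test_create", "test_list", "test_destroy"]


def run_agent(test_name):
    # Stand-in for launching one agent to run, debug, and fix one test.
    kind = "implementation" if test_name == "test_destroy" else "non-implementation"
    return {"test": test_name, "fix": kind}


def orchestrate():
    tests = collect_tests()
    # Map: one worker per test, all in parallel.
    with ThreadPoolExecutor(max_workers=len(tests)) as pool:
        results = list(pool.map(run_agent, tests))
    # Reduce: merge non-implementation fixes directly; rank implementation
    # fixes for human review before they enter the single integration PR.
    direct = [r for r in results if r["fix"] == "non-implementation"]
    review = [r for r in results if r["fix"] == "implementation"]
    return direct, review
```

The same shape scales in both directions: swapping the thread pool for local or remote agent creation is the only change between a 10-agent local run and a 100-agent run on Modal.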