Harness engineering: Leveraging Codex in an agent-first world

a day ago

#AI-assisted development
#software engineering
#agent-first workflow

An internal beta software product was developed over five months without any manually written code.
Codex generated all code including application logic, tests, CI configuration, documentation, observability, and internal tooling.
Estimated development time was about one-tenth of what manual coding would have required.
Human engineers focus on designing environments, specifying intent, and building feedback loops instead of writing code.
Initial repository scaffold and AGENTS.md file were generated by Codex CLI using GPT‑5.
Repository grew to about a million lines of code with 1,500 pull requests merged by a small team.
Human interaction primarily through prompts; agents handle tasks and open pull requests autonomously.
Agents review their own changes and respond to feedback in a loop until reviewers are satisfied.
Bottleneck shifted to human QA capacity, leading to enhancements for agent legibility in UI, logs, and metrics.
Agents can launch isolated app instances and interact via Chrome DevTools Protocol for bug reproduction and validation.
Observability tooling exposed logs, metrics, and traces to agents via local stacks for isolated worktrees.
Context management improved by structuring documentation and avoiding monolithic AGENTS.md files.
Knowledge base organized in a docs/ directory with AGENTS.md as a table of contents.
Progressive disclosure approach: agents start with small, stable entry points and are guided to deeper sources.
Mechanical enforcement via linters and CI jobs to maintain documentation freshness and structure.
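
One way a CI job could enforce the "AGENTS.md as table of contents" rule is sketched below. This is a minimal illustration, not the team's actual linter: the function name `check_docs_index`, the use of Markdown links, and the assumption that link targets are repository-relative paths are all hypothetical.

```python
import re
from pathlib import Path
from typing import List

# Captures the target of a Markdown link, e.g. "docs/setup.md" in "[Setup](docs/setup.md)".
LINK_RE = re.compile(r"\]\(([^)#\s]+)")


def check_docs_index(repo_root: Path) -> List[str]:
    """Return a list of problems; an empty list means index and docs/ agree."""
    index_text = (repo_root / "AGENTS.md").read_text(encoding="utf-8")
    # Assumption: local links are written relative to the repository root.
    linked = {
        target
        for target in LINK_RE.findall(index_text)
        if not target.startswith(("http://", "https://"))
    }

    problems = []
    # Every local link in the index must point at a real file.
    for target in sorted(linked):
        if not (repo_root / target).is_file():
            problems.append(f"AGENTS.md links to missing file: {target}")
    # Every Markdown file under docs/ must be reachable from the index.
    for doc in sorted((repo_root / "docs").rglob("*.md")):
        rel = doc.relative_to(repo_root).as_posix()
        if rel not in linked:
            problems.append(f"docs file not listed in AGENTS.md: {rel}")
    return problems
```

A CI step could call `check_docs_index(Path("."))` and fail the build when the returned list is non-empty, so a stale or orphaned document surfaces as a concrete error message an agent can act on.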
Codebase optimized for Codex legibility, pushing context into the repository for agent accessibility.
Dependencies and abstractions chosen for composability, API stability, and ease of agent modeling.
Invariants enforced (e.g., data shape parsing at boundaries) without micromanaging implementations.
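
Parsing data shapes at a boundary can be illustrated with a small sketch. The `User` type and `parse_user` function are hypothetical names chosen for this example; the idea is only that untyped input is validated once, at the edge, and everything behind the boundary receives a typed value regardless of how the rest is implemented.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class User:
    """Typed shape that code behind the boundary is allowed to rely on."""
    id: int
    email: str


def parse_user(raw: Any) -> User:
    """Validate an untrusted payload and convert it into a User, or raise ValueError."""
    if not isinstance(raw, dict):
        raise ValueError("user payload must be an object")
    user_id, email = raw.get("id"), raw.get("email")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        raise ValueError("user.id must be an integer")
    if not isinstance(email, str) or "@" not in email:
        raise ValueError("user.email must be a string containing '@'")
    return User(id=user_id, email=email)
```

The invariant is "nothing past this function sees an unparsed dict"; how the parser is written internally is left to whoever (or whatever) implements it.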
Rigid architectural model with fixed layers and dependency directions enforced mechanically via custom linters.
Constraints allow speed without architectural drift, similar to practices in large engineering organizations.
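
A custom linter for fixed layers and dependency directions might look roughly like the sketch below. The layer names in `LAYERS` and the rule "a module may import only from its own layer or a lower one" are assumptions made for illustration; the source does not describe the actual layers or the linter's implementation.

```python
import ast
from typing import List, Optional

# Hypothetical layer order, lowest first. The top-level package name decides the layer.
LAYERS = ["models", "storage", "services", "ui"]


def layer_of(module: str) -> Optional[int]:
    """Return the layer index for a dotted module name, or None if it is unlayered."""
    top = module.split(".")[0]
    return LAYERS.index(top) if top in LAYERS else None


def find_violations(module: str, source: str) -> List[str]:
    """List imports in `source` that reach from `module` into a higher layer."""
    own = layer_of(module)
    if own is None:
        return []
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            targets = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            targets = [node.module]  # absolute "from x import y" only
        else:
            continue
        for target in targets:
            dep = layer_of(target)
            if dep is not None and dep > own:
                violations.append(
                    f"{module}:{node.lineno}: layer '{LAYERS[own]}' must not "
                    f"import from higher layer '{LAYERS[dep]}' ({target})"
                )
    return violations
```

Because the check works on the parsed syntax tree rather than on running code, it can run on every pull request and report the exact line of an offending import.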
Human taste is fed back into the system through documentation updates or encoded directly into tooling.
Minimal blocking merge gates; short-lived pull requests and quick corrections prioritized over waiting.
Agents produce everything in the codebase, including product code, tools, documentation, and review comments.
Agents can drive features end to end: reproduce bugs, implement fixes, validate, open pull requests, and merge changes.
Golden principles encoded to maintain codebase consistency and legibility for future agent runs.
Recurring cleanup processes address technical debt continuously to prevent compounding issues.
Uncertainty remains about long-term architectural coherence in fully agent-generated systems over years.
Future challenges focus on designing environments, feedback loops, and control systems for agent scalability.
