Harness engineering: Leveraging Codex in an agent-first world
a day ago
- #AI-assisted development
- #software engineering
- #agent-first workflow
- An internal beta software product was developed over five months without any manually written code.
- Codex generated all code including application logic, tests, CI configuration, documentation, observability, and internal tooling.
- Estimated development time was about one-tenth of what manual coding would have required.
- Human engineers focus on designing environments, specifying intent, and building feedback loops instead of writing code.
- Initial repository scaffold and AGENTS.md file were generated by Codex CLI using GPT‑5.
- Repository grew to about a million lines of code with 1,500 pull requests merged by a small team.
- Human interaction primarily through prompts; agents handle tasks and open pull requests autonomously.
- Agents review their own changes and respond to feedback in a loop until it is resolved.
- The bottleneck shifted to human QA capacity, which led the team to make the UI, logs, and metrics legible to agents.
- Agents can launch isolated app instances and interact via Chrome DevTools Protocol for bug reproduction and validation.
- Observability tooling exposed logs, metrics, and traces to agents via local stacks for isolated worktrees.
- Context management improved by structuring documentation and avoiding monolithic AGENTS.md files.
- Knowledge base organized in a docs/ directory with AGENTS.md as a table of contents.
- Progressive disclosure approach: agents start with small, stable entry points and are guided to deeper sources.
- Mechanical enforcement via linters and CI jobs to maintain documentation freshness and structure.
- Codebase optimized for Codex legibility, pushing context into the repository for agent accessibility.
- Dependencies and abstractions chosen for composability, API stability, and ease of agent modeling.
- Invariants enforced (e.g., data shape parsing at boundaries) without micromanaging implementations.
- Rigid architectural model with fixed layers and dependency directions enforced mechanically via custom linters.
- Constraints allow speed without architectural drift, similar to practices in large engineering organizations.
- Human taste is fed back into the system through documentation updates or encoded into tooling.
- Minimal blocking merge gates; short-lived pull requests and quick corrections prioritized over waiting.
- Agents produce everything in the codebase, including product code, tools, documentation, and review comments.
- Agents can drive features end to end: reproduce bugs, implement fixes, validate, open pull requests, and merge changes.
- Golden principles encoded to maintain codebase consistency and legibility for future agent runs.
- Recurring cleanup processes address technical debt continuously to prevent compounding issues.
- Uncertainty remains about long-term architectural coherence in fully agent-generated systems over years.
- Future challenges focus on designing environments, feedback loops, and control systems for agent scalability.
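
The points about keeping AGENTS.md as a table of contents and enforcing documentation structure in CI could be sketched as a small check like the one below. This is only an illustration under stated assumptions: the function names, the markdown link convention, and the file paths are hypothetical and do not come from the source.

```python
import re

# Matches markdown links such as [Architecture](docs/architecture.md).
# Assumption: the index uses standard inline markdown links.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def linked_paths(index_text: str) -> list[str]:
    """Return local link targets in the index, skipping URLs and bare anchors."""
    targets = []
    for target in LINK_PATTERN.findall(index_text):
        if target.startswith(("http://", "https://", "#")):
            continue
        # Drop any "#section" suffix so only the file path is compared.
        targets.append(target.split("#", 1)[0])
    return targets


def check_index(index_text: str, existing_docs: set[str]) -> list[str]:
    """Report links to missing files and docs that the index does not list."""
    linked = linked_paths(index_text)
    problems = [
        f"AGENTS.md links to missing file: {path}"
        for path in linked
        if path not in existing_docs
    ]
    problems += [
        f"{path} is not listed in AGENTS.md"
        for path in sorted(existing_docs - set(linked))
    ]
    return problems
```

A CI job could read AGENTS.md, collect the files under docs/, and fail the build whenever `check_index` returns a non-empty list.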
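
The invariant about parsing data shapes at boundaries can be illustrated with a minimal sketch. The `UserEvent` type and its fields are invented for the example; the source does not describe the actual data model or the language used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserEvent:
    """Hypothetical internal type; code past the boundary only sees this shape."""
    user_id: int
    action: str


def parse_user_event(raw: object) -> UserEvent:
    """Validate untrusted input once, at the boundary, and return a typed value."""
    if not isinstance(raw, dict):
        raise ValueError("expected an object with 'user_id' and 'action'")
    user_id = raw.get("user_id")
    action = raw.get("action")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        raise ValueError("'user_id' must be an integer")
    if not isinstance(action, str) or not action:
        raise ValueError("'action' must be a non-empty string")
    return UserEvent(user_id=user_id, action=action)
```

The invariant constrains only where validation happens. How the code behind the boundary uses `UserEvent` is left to the implementer, which matches the summary's point about not micromanaging implementations.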
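
The fixed layers and dependency directions mentioned above might be enforced by a custom linter along these lines. The layer names and their order are hypothetical, since the source does not list them, and the sketch checks Python imports only for the sake of a runnable example.

```python
import ast
from typing import Optional

# Hypothetical layers, lowest first. A module may import from its own layer
# or from layers earlier in this list, never from later ones.
LAYERS = ["model", "data", "service", "ui"]


def layer_index(module: str) -> Optional[int]:
    """Map a dotted module name to its layer position, or None if unlayered."""
    top = module.split(".", 1)[0]
    return LAYERS.index(top) if top in LAYERS else None


def imported_modules(source: str) -> list[str]:
    """Collect absolute module names imported anywhere in the source text."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.append(node.module)
    return names


def check_module(module: str, source: str) -> list[str]:
    """Return one message per import that points at a higher layer."""
    own = layer_index(module)
    if own is None:
        return []
    violations = []
    for target in imported_modules(source):
        other = layer_index(target)
        if other is not None and other > own:
            violations.append(
                f"{module} is in layer '{LAYERS[own]}' and must not import "
                f"{target} from higher layer '{LAYERS[other]}'"
            )
    return violations
```

For example, `check_module("data.store", "from ui.widgets import Button")` returns one violation, while a `ui` module importing from `service` passes. Spelling out both layers in the message is a design choice in this sketch, intended to give an agent enough context to correct the import on its next attempt.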