Optimizing Tool Selection for LLM Workflows with Differentiable Programming
10 months ago
- #LLM
- #PyTorch
- #Differentiable Programming
- Modern agentic architectures rely on chaining LLM calls, which scales poorly due to latency, cost, and token overhead.
- Differentiable routing replaces LLM-based tool selection with a trainable function, offering benefits like local execution, determinism, and composability.
- A minimal example implements tool selection as a 4-layer PyTorch network, trained via backpropagation on rewards from downstream task outcomes.
- Context inflation in prompt-based planners imposes a token tax and increases truncation risk, attention dilution, and leakage, whereas differentiable routers keep context length constant.
- Differentiable programming decouples control logic from generative inference, leading to more modular, inspectable, and scalable architectures.
- A case study shows a 3× cost reduction by replacing LLM routing with differentiable controllers in a planner using search and calculator tools.
- Differentiable controllers are economically and architecturally efficient, marking a shift from prompt chains to program-like LLM systems.
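The 4-layer router described in the takeaways can be sketched as below. The post does not give implementation details, so the embedding dimension, hidden width, tool count, and the REINFORCE-style policy-gradient update (one common way to backpropagate from a non-differentiable downstream reward) are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class ToolRouter(nn.Module):
    """4-layer MLP mapping a query embedding to logits over available tools."""
    def __init__(self, embed_dim: int, num_tools: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_tools),   # one logit per tool
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Illustrative training step: sample a tool per query, then do a
# REINFORCE-style update weighted by the downstream task reward.
router = ToolRouter(embed_dim=64, num_tools=3)
opt = torch.optim.Adam(router.parameters(), lr=1e-3)

query_emb = torch.randn(8, 64)           # stand-in for query embeddings
logits = router(query_emb)
dist = torch.distributions.Categorical(logits=logits)
tool_ids = dist.sample()                  # which tool each query routes to
reward = torch.rand(8)                    # stand-in for downstream reward
loss = -(dist.log_prob(tool_ids) * reward).mean()  # policy-gradient loss
opt.zero_grad()
loss.backward()
opt.step()
```

Because routing is just a forward pass through a small network, it runs locally in microseconds, adds nothing to the LLM's context, and can be inspected or retrained independently of the generative model.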