Run to completion. The whole contract is a single method:
The agent updates run.trace in place; the answer it produces is run.trace.content, graded when the run exits. Agents are stateless per run, so one instance can drive many concurrent rollouts.
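As a rough illustration of that contract, the sketch below shares one agent instance across several concurrent rollouts. It does not import the library: Trace, Run, and EchoAgent are stand-ins invented for this example, and treating __call__ as async is an assumption rather than something stated above.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Stand-in for run.trace; only the content field is modeled."""
    content: str = ""


@dataclass
class Run:
    """Stand-in for a run: a prompt plus a trace the agent fills in."""
    prompt: str
    trace: Trace = field(default_factory=Trace)


class EchoAgent:
    """Hypothetical agent that keeps no per-run state on the instance."""

    async def __call__(self, run: Run) -> None:
        await asyncio.sleep(0)  # yield control, as a real model call would
        run.trace.content = run.prompt.upper()  # the answer is written in place


async def main() -> list:
    agent = EchoAgent()
    runs = [Run(prompt=p) for p in ("alpha", "beta", "gamma")]
    # One agent instance drives every rollout concurrently.
    await asyncio.gather(*(agent(r) for r in runs))
    return [r.trace.content for r in runs]


answers = asyncio.run(main())
print(answers)
```

Because nothing is stored on the agent between calls, each rollout only touches its own run object, which is what makes sharing the instance safe.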
create_agent
claude-..., gpt-..., gemini-..., grok-...). Extra kwargs pass through to the provider config.
Provider agents
Each provider agent takes an optional config from hud.agents.types:
| Agent | Config | Default model |
|---|---|---|
| ClaudeAgent | ClaudeConfig | claude-sonnet-4-6 |
| OpenAIAgent | OpenAIConfig | gpt-5.4 |
| GeminiAgent | GeminiConfig | gemini-3-pro-preview |
| OpenAIChatAgent | OpenAIChatConfig | gpt-5-mini |
| ClaudeSDKAgent | ClaudeSDKConfig | claude-sonnet-4-5 |
OpenAIChatAgent speaks OpenAI Chat Completions: point base_url at any compatible server (vLLM, local models). ClaudeSDKAgent runs the claude CLI (Claude Code) over an ssh capability.
How an agent uses capabilities
The bundled agents are catalog-driven: on each run they read the environment’s manifest, open the capabilities they support (run.client.open(protocol)), build their provider tools into fresh per-run state, then loop against run.prompt_messages. You don’t wire tools — declaring the capability on the environment is enough.
__call__(run) takes only the run; tuning like max_steps, system_prompt, and citations_enabled is read from the agent’s config:
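A minimal sketch of that shape, using stand-ins rather than the library's own classes. SketchConfig, FakeClient, FakeRun, and SketchAgent are hypothetical names; modeling the manifest as a plain list of protocol names and making __call__ async are both assumptions. Only max_steps, system_prompt, the open(protocol) call, and the default of 10 steps come from the text.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchConfig:
    """Hypothetical config; max_steps=10 mirrors the default named in the text."""
    max_steps: int = 10
    system_prompt: str = ""


@dataclass
class FakeClient:
    """Stand-in for run.client that records which protocols were opened."""
    opened: list = field(default_factory=list)

    def open(self, protocol: str) -> str:
        self.opened.append(protocol)
        return f"handle:{protocol}"


@dataclass
class FakeRun:
    """Stand-in run; the manifest is modeled as a list of protocol names."""
    manifest: list
    client: FakeClient = field(default_factory=FakeClient)
    steps_taken: int = 0


class SketchAgent:
    SUPPORTED = ("ssh", "cdp")  # protocols this sketch knows how to use

    def __init__(self, config: Optional[SketchConfig] = None):
        self.config = config or SketchConfig()

    async def __call__(self, run: FakeRun) -> None:
        # Per-run state is built fresh here and never stored on self.
        tools = {p: run.client.open(p) for p in run.manifest if p in self.SUPPORTED}
        # The step budget comes from the config, not from call arguments.
        for _ in range(self.config.max_steps):
            if not tools:
                break  # nothing to act with
            run.steps_taken += 1  # a real loop would call the model here


agent = SketchAgent(SketchConfig(max_steps=3))
run = FakeRun(manifest=["ssh", "openpi/0"])
asyncio.run(agent(run))
print(run.client.opened, run.steps_taken)
```

Here the agent opens only the ssh capability, since openpi/0 is not in its supported set, and stops after the three steps its config allows.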
Settings precedence
When the same knob (e.g. model, max_steps) is set in more than one place, the order is: explicit kwarg/config field > CLI flag > defaults. Concretely:
- create_agent("…", max_steps=30) and ClaudeConfig(max_steps=30) set the config field directly.
- hud eval … --max-steps 30 --model … overrides the config defaults for that run.
- Unset everywhere → the config’s built-in default (max_steps=10).
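The ordering above can be sketched as a small resolver. This is an illustration of the rule only, not the library's code: the resolve helper is hypothetical, and using None to mean "unset" is an assumption made for the example.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve(explicit: Optional[T], cli_flag: Optional[T], default: T) -> T:
    """Return the highest-precedence value that was actually set."""
    if explicit is not None:  # explicit kwarg or config field wins
        return explicit
    if cli_flag is not None:  # otherwise the CLI flag overrides the default
        return cli_flag
    return default  # unset everywhere


BUILT_IN_MAX_STEPS = 10  # the built-in default named in the text

both = resolve(explicit=30, cli_flag=50, default=BUILT_IN_MAX_STEPS)
cli_only = resolve(explicit=None, cli_flag=50, default=BUILT_IN_MAX_STEPS)
unset = resolve(explicit=None, cli_flag=None, default=BUILT_IN_MAX_STEPS)
print(both, cli_only, unset)  # prints: 30 50 10
```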
Bring your own harness
Subclass Agent and implement __call__. Write the answer to run.trace.content:
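A sketch of the pattern with stand-ins, so it runs without the library installed. The Agent base class below is a local placeholder for the real one, Trace and Run model only the fields used here, word_count_harness is a hypothetical existing harness, and an async __call__ is assumed.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Trace:
    """Stand-in for run.trace."""
    content: str = ""


@dataclass
class Run:
    """Stand-in for a run."""
    prompt: str
    trace: Trace = field(default_factory=Trace)


class Agent(ABC):
    """Local placeholder for the library's base class."""

    @abstractmethod
    async def __call__(self, run: Run) -> None:
        ...


def word_count_harness(text: str) -> str:
    """Hypothetical existing harness: turns a prompt into an answer string."""
    return str(len(text.split()))


class WordCountAgent(Agent):
    """Wraps the harness and reports its result through the trace."""

    async def __call__(self, run: Run) -> None:
        run.trace.content = word_count_harness(run.prompt)


run = Run(prompt="count these four words")
asyncio.run(WordCountAgent()(run))
print(run.trace.content)  # prints: 4
```

The harness itself needs no knowledge of runs or traces; the subclass is the only place where the two meet.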
BrowserUseAgent (in hud.agents.browser_use, config BrowserUseConfig) is this pattern wrapping browser-use on the cdp capability.
RobotAgent (in hud.agents.robot, beta — the robot extra) is the non-LLM version of the same pattern: it opens the openpi/0 capability and runs an observe → infer → act loop, with your policy plugged in through Model/Adapter seams. See Robots.