Studio
The studio command launches a web-based dashboard for browsing evaluation runs, inspecting individual test results, and reviewing scores. It shows both local runs and runs synced from a remote results repository.

    agentv studio

Studio auto-discovers run workspaces from .agentv/results/runs/ in the current directory and opens at http://localhost:3117.
You can also point it at a specific run workspace or index.jsonl manifest:

    agentv studio .agentv/results/runs/2026-03-30T11-45-56-989Z/index.jsonl
    # or
    agentv studio .agentv/results/runs/2026-03-30T11-45-56-989Z

Options
| Option | Description |
|---|---|
| --port, -p | Port to listen on (flag > PORT env var > 3117) |
| --dir, -d | Working directory (default: current directory) |
| --multi | Launch in multi-project dashboard mode |
| --add <path> | Register a project by path |
| --remove <id> | Unregister a project by ID |
| --discover <path> | Scan a directory tree for repos with .agentv/ |
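The port precedence in the table (flag > PORT env var > 3117) can be sketched as a small shell function. This is only an illustration of the documented order, not AgentV's actual implementation; `resolve_port` is a hypothetical helper name.

```shell
# Hypothetical helper that mirrors the documented precedence:
# --port flag first, then the PORT environment variable, then 3117.
resolve_port() {
  flag_port="${1:-}"
  if [ -n "$flag_port" ]; then
    echo "$flag_port"        # value passed via --port / -p
  elif [ -n "${PORT:-}" ]; then
    echo "$PORT"             # fallback: PORT environment variable
  else
    echo 3117                # documented default
  fi
}

# Corresponding invocations, based on the options table:
#   agentv studio --port 4000     # flag wins
#   PORT=4000 agentv studio       # env var is used when no flag is given
#   agentv studio                 # falls back to 3117
```

Under this reading, passing --port always overrides a PORT value already set in the environment.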
Features
- Recent Runs: table of all evaluation runs with source badge (local/remote), target, experiment, timestamp, test count, pass rate, and mean score
- Experiments: group and compare runs by experiment name
- Targets: group runs by target (model/agent)
- Run Detail: drill into a run to see per-test results, scores, and evaluator output
- Human Review: add feedback annotations to individual test results
- Remote Results: sync and browse runs pushed from other machines or CI (see Remote Results)
Run Detail
Click any run to see a breakdown by suite, per-test scores, target, duration, and cost. The source label (local or remote) tells you where the run came from.
Experiments
The Experiments tab groups runs by experiment name so you can compare the impact of changes, for example with_skills vs without_skills.
Multi-Project Dashboard
By default, Studio shows results for the current directory. The multi-project mode lets you view results across multiple repositories from a single dashboard.
Registering Projects
Register projects one at a time:

    agentv studio --add /path/to/project-a
    agentv studio --add /path/to/project-b

Each path must contain a .agentv/ directory. Projects are stored in ~/.agentv/projects.yaml.
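Because each registered path must contain a .agentv/ directory, it can help to check for it before calling --add. The following is a minimal sketch; `register_projects` is a hypothetical wrapper, and how AgentV itself reacts to a path without .agentv/ is not specified here.

```shell
# Hypothetical wrapper: register only directories that contain .agentv/.
register_projects() {
  for dir in "$@"; do
    if [ -d "$dir/.agentv" ]; then
      agentv studio --add "$dir"
    else
      # Report skipped paths on stderr so they do not mix with agentv output.
      echo "skipping $dir: no .agentv/ directory" >&2
    fi
  done
}

# Usage (paths are placeholders):
#   register_projects /path/to/project-a /path/to/project-b
```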
Auto-Discovery
Scan a parent directory to find and register all projects:

    agentv studio --discover /path/to/repos

This recursively searches (up to 2 levels deep) for directories containing .agentv/ and registers them.
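To preview roughly what a scan might pick up before registering anything, a `find` call can approximate the search. This sketch assumes that "2 levels deep" means a project directory sits at most two levels below the scanned root (so its .agentv/ directory is at depth 3 or less); the actual matching rules may differ, and `preview_discover` is a hypothetical helper. It relies on `-maxdepth`, which GNU and BSD find support.

```shell
# Hypothetical preview of --discover.
# Assumption: project directories are at most two levels below the root,
# which puts their .agentv/ directory at find depth <= 3.
preview_discover() {
  find "$1" -maxdepth 3 -type d -name .agentv -exec dirname {} \; | sort
}

# Usage (path is a placeholder):
#   preview_discover /path/to/repos
```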
Launching the Dashboard
Once projects are registered, launch the multi-project dashboard:

    agentv studio --multi

If you have any registered projects, --multi is automatically enabled. The landing page shows a card for each project with run count, pass rate, and last run time. Click a project to view its runs.
Removing Projects
Unregister a project by its ID:

    agentv studio --remove my-project

Project IDs are derived from the directory name (e.g., /home/user/repos/my-project becomes my-project).
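The directory-name rule can be approximated with `basename`. This sketch assumes the ID is exactly the final path component; AgentV may apply further normalization or handle name collisions differently, and `project_id` is a hypothetical helper.

```shell
# Hypothetical helper: derive a project ID from a project path.
# Assumption: the ID is simply the last path component.
project_id() {
  basename "$1"
}

# Usage:
#   agentv studio --remove "$(project_id /home/user/repos/my-project)"
```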
Remote Results
Studio can display runs pushed to a remote git repository by other machines or CI, alongside your local runs. Each run in the list carries a source badge: local (green) or remote (amber).
Configuration
Add a results.export block to .agentv/config.yaml:

    results:
      export:
        repo: EntityProcess/agentv-evals   # GitHub repo (owner/repo or full URL)
        path: runs                         # Directory within the repo
        auto_push: true                    # Push automatically after every eval run
        branch_prefix: eval-results        # Branch naming prefix (default: eval-results)

With auto_push: true, every agentv eval run or agentv pipeline bench automatically creates a draft PR in the configured repo with a structured results table.
Authentication
Export uses the gh CLI and git credentials already configured on the machine. If authentication is missing, AgentV warns and skips the export; the eval run itself is never blocked.
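Since a missing login only produces a warning, it can be useful to confirm the gh side of the setup before relying on auto-push. The sketch below uses `gh auth status`, which exits non-zero when no account is logged in. It does not check git credentials, it is not necessarily the same check AgentV performs, and `check_export_auth` is a hypothetical helper.

```shell
# Hypothetical pre-flight check for the gh CLI half of export authentication.
check_export_auth() {
  if ! command -v gh >/dev/null 2>&1; then
    echo "gh CLI not found; export would be skipped"
    return 1
  fi
  # gh auth status exits non-zero when no account is logged in.
  if ! gh auth status >/dev/null 2>&1; then
    echo "gh is not authenticated; export would be skipped"
    return 1
  fi
  echo "gh is authenticated"
}

# Usage:
#   check_export_auth || gh auth login
```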
Syncing in Studio
Once configured, Studio fetches remote runs on load. Use the Sync Remote Results button in the source toolbar to pull the latest. The toolbar also shows when results were last synced and the configured repo.
Use the All Sources / Local Only / Remote Only filter to narrow the run list by origin.