The Critiqor dashboard runs entirely on your machine: it reads diagnosis.json files directly from your runs/ directory and serves a web UI at http://127.0.0.1:<port>. Nothing is sent to a remote server, so the dashboard you use during development works identically in a completely offline environment.

Why Local-Only?

Running locally isn’t a limitation — it’s a deliberate design choice for agent observability:
  • Evidence stays local: your tool calls, inputs, outputs, and memory events never leave your machine. Agent sessions often contain sensitive context, and Critiqor never uploads it anywhere.
  • No cloud dependency: works offline, requires no API keys, no account, and no network access after install.
  • Immediate access: the dashboard is available as soon as critiqor finalize completes — there is no upload step or processing queue to wait for.
  • Auditable: all artifacts (session.json, diagnosis.json) are plain JSON files you can inspect, diff, and version-control directly.

How It Launches

Automatic launch after finalize

Running critiqor finalize is the most common path. After generating diagnosis.json it automatically starts the dashboard and opens it in your default browser:
critiqor finalize
# Stopping observer...
# Finalizing evidence...
# Generating diagnosis...
# Diagnosis saved: runs/run_003/diagnosis.json
# Starting local dashboard...
# Dashboard run: run_003
# Critiqor dashboard: http://127.0.0.1:52341/?run_id=run_003
To finalize without opening the dashboard:
critiqor finalize --no-dashboard

Manual launch

# Open the latest finalized run
critiqor dashboard

# Open a specific run
critiqor dashboard run_001

# Specify host, port, or a non-default runs directory
critiqor dashboard run_001 --host 127.0.0.1 --port 3000 --runs /path/to/runs

Port selection

Critiqor uses --port 0 by default, which lets the OS assign any available port. To use a fixed port, pass --port <number>. The assigned port is printed when the server starts and saved to runs/.critiqor_dashboard/server.json so subsequent critiqor dashboard calls can reuse the same server without restarting it.
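
The port 0 behaviour described above is a standard operating-system feature rather than something specific to Critiqor. As a minimal sketch (the helper name pick_free_port is hypothetical and is not part of Critiqor), binding a socket to port 0 and reading back the assigned port looks like this:

```python
import socket


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0.

    Hypothetical helper for illustration; not Critiqor's own code.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))            # port 0 means "any available port"
        return sock.getsockname()[1]    # the port the OS actually assigned


port = pick_free_port()
print(f"http://127.0.0.1:{port}")
```

Note that this sketch releases the port as soon as the socket closes, so another process could claim it in the meantime. A real server would keep the socket bound instead of closing it.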

How the dashboard app is found

The dashboard UI (critiqor-core-engine) is a local web app launched via bun run dev (preferred, if bun is installed) or npm run dev. Critiqor searches for it in a set of standard locations including the bundled copy inside the package, ~/Code/critiqor-core-engine, and ~/.critiqor/core-engine-dashboard. If your copy is in a non-standard location, set the environment variable:
export CRITIQOR_DASHBOARD_DIR=/path/to/critiqor-core-engine
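
To check which dev runner would be picked up on your machine, the stated preference (bun first, then npm) can be approximated with a PATH lookup. This is a sketch under that assumption only; the function name dev_command is hypothetical, and Critiqor's actual detection logic may differ:

```python
import shutil
from typing import List, Optional


def dev_command() -> Optional[List[str]]:
    """Return the dev-server command for the first runner found on PATH.

    Assumes the documented preference order: bun, then npm.
    Returns None when neither executable is available.
    """
    for runner in ("bun", "npm"):
        if shutil.which(runner):
            return [runner, "run", "dev"]
    return None


print(dev_command())
```

If this prints None, neither runner is visible on your PATH and the dashboard cannot be started until one is installed.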

Logs

Dashboard startup output is written to /tmp/critiqor-dashboard-<hash>.log, where the hash is derived from the absolute path of your runs/ directory. Check this file if the dashboard fails to start.
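
Because the hash in the filename is not something you normally know offhand, one practical approach is to look for the most recently modified log that matches the documented filename pattern. The sketch below relies only on that pattern; the function name latest_dashboard_log is hypothetical:

```python
import glob
import os
from typing import Optional


def latest_dashboard_log(tmp_dir: str = "/tmp") -> Optional[str]:
    """Return the newest critiqor-dashboard-<hash>.log in tmp_dir, or None."""
    pattern = os.path.join(tmp_dir, "critiqor-dashboard-*.log")
    candidates = glob.glob(pattern)
    if not candidates:
        return None
    # Most recently modified file is most likely the run you just started.
    return max(candidates, key=os.path.getmtime)


if __name__ == "__main__":
    print(latest_dashboard_log())
```

If you work with several runs/ directories, more than one log may exist; the newest one is only a heuristic for the directory you launched last.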

Dashboard Sections

Executive Summary

The first thing you see when the dashboard loads. It gives you an instant reliability verdict for the run:
| Field | Description |
| --- | --- |
| Trust Score | 0–100 reliability score. Green ≥ 80, yellow 60–79, red < 60. |
| Readiness Level | ready_for_runtime / review_recommended / unsafe_for_production |
| Primary Diagnosis | The highest-impact failure type detected (e.g. infinite_tool_loop) |
| Recommended Action | The single most important next step derived from the diagnosis |
A score in the green range with ready_for_runtime means the agent completed the session without significant reliability failures. Yellow or red with review_recommended or unsafe_for_production means failures were detected that warrant investigation before deploying.
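
The colour thresholds for the trust score can be written as a small function. This is an illustrative sketch of the documented bands only (the name trust_band is hypothetical); it says nothing about how Critiqor computes the score or the readiness level:

```python
def trust_band(score: float) -> str:
    """Map a 0-100 trust score to the dashboard's colour band."""
    if not 0 <= score <= 100:
        raise ValueError(f"trust score must be within 0-100, got {score}")
    if score >= 80:
        return "green"
    if score >= 60:
        return "yellow"
    return "red"


for value in (92, 79, 60, 41):
    print(value, trust_band(value))
```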

Primary Diagnosis

A deep-dive into the single most impactful failure cause detected from the runtime evidence:
  • Failure type name — the canonical failure category (e.g. infinite_tool_loop, cost_explosion, memory_degradation)
  • Severity — medium, high, or critical
  • Causal chain — a plain-language explanation of how the failure unfolded, for example: tool_call -> tool_failure_or_no_progress -> retry_same_action -> loop_flagged
  • Description — a detailed narrative of what was observed and why it was flagged
If no failures were detected, this section will reflect a clean diagnosis with no root cause failure type.

Runtime Timeline

A chronological list of every event Critiqor observed, with timestamps and source layer tags (e.g. extension_api, critiqor_finalize). Use the timeline to understand the exact sequence of agent actions — which tools were called, in what order, when retries occurred, and where the session deviated from expected behaviour.

Evidence

The technical audit panel. It surfaces the raw observations behind the diagnosis:
  • Tool calls — every tool invocation with name, call ID, arguments, and status
  • Tool outputs — the responses returned to the agent, including errors
  • Memory events — memory reads and writes (memory_search, memory_get)
  • Retries and errors — repeated calls, failure events, and state transitions
  • Source link — a direct link to the underlying session.json for complete trace inspection
The evidence panel is where you go to verify a diagnosis claim. If the Primary Diagnosis says a tool was called seven times with matching arguments, the Evidence panel will show exactly which calls those were.
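
The same check can be done by hand against exported evidence. The sketch below groups tool calls by name and arguments and reports the ones that repeat. The record shape (a list of dicts with "name" and "arguments" keys) is an assumption made for illustration; the actual session.json schema may differ, so adapt the field names to what your file contains:

```python
import json
from collections import Counter


def repeated_calls(tool_calls, min_repeats=2):
    """Count identical (tool name, arguments) pairs.

    Assumes each call is a dict with "name" and "arguments" keys
    (hypothetical shape, not a documented schema).
    """
    counts = Counter(
        # Serialise arguments with sorted keys so key order does not matter.
        (call["name"], json.dumps(call["arguments"], sort_keys=True))
        for call in tool_calls
    )
    return {key: n for key, n in counts.items() if n >= min_repeats}


calls = [{"name": "search", "arguments": {"q": "x"}}] * 7
calls.append({"name": "read_file", "arguments": {"path": "a.txt"}})
print(repeated_calls(calls))
```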

Recommendations

Actionable suggestions derived from the failure causes detected in this run. Each failure type in Critiqor’s diagnosis engine has a known remediation pattern — for example, infinite_tool_loop produces a recommendation to add retry caps or backoff logic to the tool execution layer. Recommendations are ordered by the severity of the failure that produced them.
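
Severity ordering of this kind can be sketched as a ranked sort. The example assumes the most severe failures come first and that each recommendation carries a "severity" field with one of the three documented values; both the record shape and the function name are hypothetical:

```python
# Lower rank sorts first; assumes "most severe first" ordering.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2}


def order_recommendations(recommendations):
    """Sort recommendations by the severity of the failure behind them."""
    return sorted(recommendations, key=lambda rec: SEVERITY_RANK[rec["severity"]])


recs = [
    {"severity": "medium", "text": "Trim unused memory reads"},
    {"severity": "critical", "text": "Add retry caps to the tool layer"},
    {"severity": "high", "text": "Add backoff between retries"},
]
for rec in order_recommendations(recs):
    print(rec["severity"], "-", rec["text"])
```

Python's sort is stable, so recommendations with equal severity keep their original relative order.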

Run History

A list of all completed runs for the current runs/ directory, each shown with its trust score. Click any row to switch the dashboard to that run’s diagnosis without restarting the server. Runs are listed in run ID order (which is also chronological), making it easy to spot reliability trends.

Causal Graph

A visual representation of how events in the session relate to each other and to the failures detected:
  • Event nodes — one node per significant runtime event
  • causes edges — link evidence events directly to the failure they produced
  • precedes edges — show the temporal ordering between events
  • Failure nodes — highlighted to distinguish them from normal events
The causal graph is the most compact view of why a failure occurred. A dense cluster of causes edges pointing at a single failure node usually indicates a clear, isolatable root cause.
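
The "dense cluster" heuristic amounts to counting incoming causes edges per failure node. A minimal sketch follows, assuming edges are dicts with "type", "source", and "target" keys; that shape is hypothetical and chosen only to illustrate the idea:

```python
from collections import Counter


def causes_in_degree(edges):
    """Count how many 'causes' edges point at each target node."""
    return Counter(edge["target"] for edge in edges if edge["type"] == "causes")


edges = [
    {"type": "precedes", "source": "e1", "target": "e2"},
    {"type": "causes", "source": "e1", "target": "failure_loop"},
    {"type": "causes", "source": "e2", "target": "failure_loop"},
    {"type": "causes", "source": "e3", "target": "failure_cost"},
]
print(causes_in_degree(edges).most_common())
```

A node with a much higher count than the others is a good first candidate for the root cause.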

Agent Health

A trust score trend chart across multiple runs. Use this section to answer the question “is my agent getting more or less reliable over time?” The chart shows whether trust is improving (scores trending upward), stable (flat), or declining (scores trending downward) as you iterate on the agent.
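
If you want to reproduce this classification outside the dashboard, one simple option is to fit a least-squares line through the scores and look at the sign of its slope. This is one possible approach, not necessarily what the chart does; the tolerance of one point per run is an arbitrary assumption:

```python
def trust_trend(scores, tolerance=1.0):
    """Classify a chronological list of trust scores by least-squares slope.

    tolerance is the slope (points per run) below which the trend is
    treated as flat; the default is an arbitrary illustrative value.
    """
    n = len(scores)
    if n < 2:
        return "stable"
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    numerator = sum((i - mean_x) * (s - mean_y) for i, s in enumerate(scores))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    slope = numerator / denominator
    if slope > tolerance:
        return "improving"
    if slope < -tolerance:
        return "declining"
    return "stable"


print(trust_trend([60, 70, 80]))
```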

Switching Runs

# Switch to a different run from the command line
critiqor dashboard run_003

# Or click any entry in the Run History section of the dashboard UI
When the server is already running (detected via runs/.critiqor_dashboard/server.json), Critiqor navigates to the new run by hitting the dashboard’s /api/runs/<run_id> endpoint — the server is not restarted and your browser tab stays open.

Troubleshooting

“Critiqor Core Engine dashboard not found”

Critiqor could not locate the critiqor-core-engine web app. Set the environment variable to its location:
export CRITIQOR_DASHBOARD_DIR=/path/to/critiqor-core-engine
critiqor dashboard
Critiqor also accepts CRITIQOR_CORE_ENGINE_DASHBOARD_DIR as an alias.

Dashboard doesn’t open / server fails to start

Check that bun or npm is installed and available on your PATH. Verify that the configured port isn’t already in use. Review the startup log at /tmp/critiqor-dashboard-<hash>.log for the full error output from the dev server.

“Diagnosis file invalid. Dashboard launch aborted.”

diagnosis.json is missing required fields (run_id and executive_summary.trust_score). This typically means finalization was interrupted before it completed. Re-run finalization to regenerate it:
critiqor finalize
If active_session.json no longer exists but the diagnosis is still incomplete, the session evidence in runs/<run_id>/session.json is still intact — you can re-trigger finalization logic against it without re-running the agent.
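
To confirm whether a diagnosis.json is affected before re-running anything, you can check for the two required fields named above. This is a minimal sketch that tests only for their presence; the function name is hypothetical, and Critiqor's own validation may check more than this:

```python
import json


def missing_required_fields(diagnosis: dict) -> list:
    """Return the documented required fields that are absent."""
    missing = []
    if "run_id" not in diagnosis:
        missing.append("run_id")
    summary = diagnosis.get("executive_summary")
    if not isinstance(summary, dict) or "trust_score" not in summary:
        missing.append("executive_summary.trust_score")
    return missing


# Example with an interrupted diagnosis; for a real file, load it with
# json.load(open("runs/<run_id>/diagnosis.json")) and pass the result in.
partial = json.loads('{"run_id": "run_003", "executive_summary": {}}')
print(missing_required_fields(partial))
```

An empty list means both fields are present, in which case the launch failure has a different cause.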