---
summary: "How to run tests locally (vitest) and when to use force/coverage modes"
read_when:
  - Running or fixing tests
title: "Tests"
---

# Tests

- Full testing kit (suites, live, Docker): [Testing](/help/testing)

- `pnpm test:force`: Kills any lingering gateway process holding the default control port, then runs the full Vitest suite with an isolated gateway port so server tests don’t collide with a running instance. Use this when a prior gateway run left port 18789 occupied.
- `pnpm test:coverage`: Runs the unit suite with V8 coverage (via `vitest.unit.config.ts`). Global thresholds are 70% lines/branches/functions/statements. Coverage excludes integration-heavy entrypoints (CLI wiring, gateway/telegram bridges, webchat static server) to keep the target focused on unit-testable logic.
- `pnpm test:coverage:changed`: Runs unit coverage only for files changed since `origin/main`.
- `pnpm test:changed`: Runs the native Vitest projects config with `--changed origin/main`. The base config treats the projects/config files as `forceRerunTriggers`, so wiring changes still rerun broadly when needed.
- `pnpm test`: Runs the native Vitest root projects config directly. File filters work natively across the configured projects.
- The base Vitest config defaults to `pool: "threads"` and `isolate: false`, with the shared non-isolated runner enabled across the repo configs.
- `pnpm test:channels`: Runs `vitest.channels.config.ts`.
- `pnpm test:extensions`: Runs the extension/plugin suites via `vitest.extensions.config.ts`.
- `pnpm test:perf:imports`: Enables Vitest import-duration and import-breakdown reporting for the native root projects run.
- `pnpm test:perf:imports:changed`: Same import profiling, but only for files changed since `origin/main`.
- `pnpm test:perf:profile:main`: Writes a CPU profile for the Vitest main thread (`.artifacts/vitest-main-profile`).
- `pnpm test:perf:profile:runner`: Writes CPU and heap profiles for the unit runner (`.artifacts/vitest-runner-profile`).
- Gateway integration: opt-in via `OPENCLAW_TEST_INCLUDE_GATEWAY=1 pnpm test` or `pnpm test:gateway`.
- `pnpm test:e2e`: Runs gateway end-to-end smoke tests (multi-instance WS/HTTP/node pairing). Defaults to `threads` + `isolate: false` with adaptive workers in `vitest.e2e.config.ts`; tune with `OPENCLAW_E2E_WORKERS=<n>` and set `OPENCLAW_E2E_VERBOSE=1` for verbose logs.
- `pnpm test:live`: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip.
- `pnpm test:docker:openwebui`: Starts Dockerized OpenClaw + Open WebUI, signs in through Open WebUI, checks `/api/models`, then runs a real proxied chat through `/api/chat/completions`. Requires a usable live model key (for example OpenAI in `~/.profile`), pulls an external Open WebUI image, and is not expected to be CI-stable like the normal unit/e2e suites.
- `pnpm test:docker:mcp-channels`: Starts a seeded Gateway container and a second client container that spawns `openclaw mcp serve`, then verifies routed conversation discovery, transcript reads, attachment metadata, live event queue behavior, outbound send routing, and Claude-style channel + permission notifications over the real stdio bridge. The Claude notification assertion reads the raw stdio MCP frames directly so the smoke reflects what the bridge actually emits.

## Local PR gate

For local PR land/gate checks, run:

- `pnpm check`
- `pnpm build`
- `pnpm test`
- `pnpm check:docs`

If `pnpm test` flakes on a loaded host, rerun once before treating it as a regression, then isolate with `pnpm test <path/to/test>`. For memory-constrained hosts, use:

- `OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test`
- `OPENCLAW_VITEST_FS_MODULE_CACHE_PATH=/tmp/openclaw-vitest-cache pnpm test:changed`
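
The rerun-once advice above can be wrapped in a small shell helper. This is a minimal sketch, not something shipped in the repository: `retry_once` is a hypothetical function name, and it only reruns the given command a single time when the first attempt exits non-zero.

```bash
# Hypothetical helper (not part of the repo): run a command and, if it
# fails, rerun it exactly once. The exit status is that of the last attempt.
retry_once() {
  "$@" && return 0
  echo "first attempt failed; rerunning once: $*" >&2
  "$@"
}

# Example usage with commands from this page:
#   retry_once pnpm test
#   retry_once env OPENCLAW_VITEST_MAX_WORKERS=1 pnpm test
```

If the second attempt also fails, treat it as a real failure and isolate it with `pnpm test <path/to/test>` as described above.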

## Model latency bench (local keys)

Script: [`scripts/bench-model.ts`](https://github.com/openclaw/openclaw/blob/main/scripts/bench-model.ts)

Usage:

- `source ~/.profile && pnpm tsx scripts/bench-model.ts --runs 10`
- Optional env: `MINIMAX_API_KEY`, `MINIMAX_BASE_URL`, `MINIMAX_MODEL`, `ANTHROPIC_API_KEY`
- Default prompt: “Reply with a single word: ok. No punctuation or extra text.”

Last run (2025-12-31, 20 runs):

- minimax median 1279ms (min 1114, max 2431)
- opus median 2454ms (min 1224, max 3170)

## CLI startup bench

Script: [`scripts/bench-cli-startup.ts`](https://github.com/openclaw/openclaw/blob/main/scripts/bench-cli-startup.ts)

Usage:

- `pnpm test:startup:bench`
- `pnpm test:startup:bench:smoke`
- `pnpm test:startup:bench:save`
- `pnpm test:startup:bench:update`
- `pnpm test:startup:bench:check`
- `pnpm tsx scripts/bench-cli-startup.ts`
- `pnpm tsx scripts/bench-cli-startup.ts --runs 12`
- `pnpm tsx scripts/bench-cli-startup.ts --preset real`
- `pnpm tsx scripts/bench-cli-startup.ts --preset real --case status --case gatewayStatus --runs 3`
- `pnpm tsx scripts/bench-cli-startup.ts --entry openclaw.mjs --entry-secondary dist/entry.js --preset all`
- `pnpm tsx scripts/bench-cli-startup.ts --preset all --output .artifacts/cli-startup-bench-all.json`
- `pnpm tsx scripts/bench-cli-startup.ts --preset real --case gatewayStatusJson --output .artifacts/cli-startup-bench-smoke.json`
- `pnpm tsx scripts/bench-cli-startup.ts --preset real --cpu-prof-dir .artifacts/cli-cpu`
- `pnpm tsx scripts/bench-cli-startup.ts --json`

Presets:

- `startup`: `--version`, `--help`, `health`, `health --json`, `status --json`, `status`
- `real`: `health`, `status`, `status --json`, `sessions`, `sessions --json`, `agents list --json`, `gateway status`, `gateway status --json`, `gateway health --json`, `config get gateway.port`
- `all`: both presets

Output includes `sampleCount`, avg, p50, p95, min/max, exit-code/signal distribution, and max RSS summaries for each command. Optional `--cpu-prof-dir` / `--heap-prof-dir` writes V8 profiles per run so timing and profile capture use the same harness.
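
To make the p50/p95 columns concrete, here is a minimal sketch of a percentile calculation over a handful of timing samples. It assumes the nearest-rank method, which may not be what `scripts/bench-cli-startup.ts` uses; the `percentile` function and the sample values are hypothetical and only illustrate how the summary relates to the raw runs.

```bash
# Nearest-rank percentile (assumed method; the bench script may differ).
# Usage: percentile <p> <sample>...   where p is an integer from 1 to 100.
percentile() {
  local p="$1"
  shift
  printf '%s\n' "$@" | sort -n | awk -v p="$p" '
    { v[NR] = $1 }
    END {
      rank = int((p * NR + 99) / 100)   # ceil(p * NR / 100) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Hypothetical startup timings in milliseconds for five runs.
percentile 50 120 95 110 300 105   # prints 110
percentile 95 120 95 110 300 105   # prints 300
```

With only five samples, p95 collapses to the slowest run, which is one reason to raise `--runs` when comparing tail latency.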

Saved output conventions:

- `pnpm test:startup:bench:smoke` writes the targeted smoke artifact at `.artifacts/cli-startup-bench-smoke.json`
- `pnpm test:startup:bench:save` writes the full-suite artifact at `.artifacts/cli-startup-bench-all.json` using `runs=5` and `warmup=1`
- `pnpm test:startup:bench:update` refreshes the checked-in baseline fixture at `test/fixtures/cli-startup-bench.json` using `runs=5` and `warmup=1`

Checked-in fixture:

- `test/fixtures/cli-startup-bench.json`
- Refresh with `pnpm test:startup:bench:update`
- Compare current results against the fixture with `pnpm test:startup:bench:check`

## Onboarding E2E (Docker)

Docker is optional; this is only needed for containerized onboarding smoke tests.

Full cold-start flow in a clean Linux container:

```bash
scripts/e2e/onboard-docker.sh
```

This script drives the interactive wizard via a pseudo-tty, verifies config/workspace/session files, then starts the gateway and runs `openclaw health`.

## QR import smoke (Docker)

Ensures `qrcode-terminal` loads under the supported Docker Node runtimes (Node 24 default, Node 22 compatible):

```bash
pnpm test:docker:qr
```