* test: resolve typescript strictness complaints in unit tests
* Update Claude Code obfuscation to version 2.1.114 (#1403)
* fix(cloud-code): scope thinking stripping to executor boundaries (#1401)
* fix(cloud-code): scope thinking stripping to executors
* fix(cloud-code): guard antigravity normalized body
* Update Claude Code obfuscation to version 2.1.114
- Update Claude Code version from 2.1.87 to 2.1.114
- Update X-Stainless-Package-Version from 0.80.0 to 0.81.0
- Add new beta flags: redact-thinking-2026-02-12, advisor-tool-2026-03-01, advanced-tool-use-2025-11-20
- Add missing headers: anthropic-version, anthropic-dangerous-direct-browser-access, x-app, X-Stainless-Timeout
- Add all X-Stainless-* headers (Arch, Lang, OS, Runtime, Runtime-Version, Retry-Count)
- Fix accept-encoding header: identity -> gzip, deflate, br, zstd
- Add connection: keep-alive header
- Update tool name mapping: add lsp, apply_patch, websearch
These changes ensure that requests from OpenCode through Omniroute are indistinguishable from genuine Claude Code 2.1.114 requests, allowing proper authentication with Anthropic's API without triggering extra credits errors.
* fix: resolve CodeQL password hash alert and TruffleHog CI failure
---------
Co-authored-by: Randi <55005611+rdself@users.noreply.github.com>
Co-authored-by: Diego Rodrigues de Sa e Souza <8016841+diegosouzapw@users.noreply.github.com>
Co-authored-by: Nikolay Popov <ekklesio.dev@gmail.com>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
* fix(claude-code): scope obfuscation to cli clients and fix tests
* docs(workflows): enforce PR merge instead of manual close
* docs(changelog): update 3.6.9 notes with missing PR 1403 and fixes
* docs(workflows): update generate-release to use full changelog for PR body
* fix(tsc): silence baseUrl deprecation warnings for TS 5.5+
* fix(chatcore): apply proactive compression before provider translation (#1406)
Integrated into release/v3.6.9
* docs(changelog): add PR 1406
* Makes text visible in dark-mode (#1409)
Integrated into release/v3.6.9
* docs(changelog): add PR 1409
* chore: save local work
* chore(release): sync version references to 3.6.9
* fix(codex): prevent proactive token refresh consumption and strip background parameter
* ci: shard long-running suites and relax timeouts
* ci: allow manual CI dispatch for release branches
* feat(skills): provider-aware marketplace UX, scored AUTO injection, and memory pipeline hardening (#1411)
* fix/400 for GeminiCLI(add "ref" in GEMINI_UNSUPPORTED_SCHEMA_KEYS)
* feat(cc-compatible): align request shape with Claude CLI
* fix(cc-compatible): add Claude CLI system skeleton for OpenAI input
* preserve reasoning when translating chat to responses (#1414)
Integrated into release/v3.6.9
* fix(skills): optimize AUTO scoring and include Responses input context (#1418)
Integrated into release/v3.6.9
* chore: fix TS errors and update review-prs workflow
* fix(api): stop sending unsupported Gemini and Codex parameters
Prevent Gemini request translation from injecting default
thoughtSignature values that the upstream API strictly validates and
rejects. Only preserve real signatures resolved from prior upstream
responses, and strip additionalProperties from Gemini function schemas
to avoid 400 "Unknown name" errors.
Also remove fallback-injected session_id and conversation_id fields
before sending Codex requests, and restore compatibility with the
legacy OUTBOUND_SSRF_GUARD_ENABLED flag when determining whether
private provider URLs are allowed.
Updates the Gemini translator and regression tests for issue #1410
and related 400 error cases.
* fix(core): stabilization fixes for token refresh, usage translation, and testing
- Update Codex token refresh detection logic
- Mark provider connections invalid on unrecoverable refresh error
- Fix Claude usage translation under-reporting cached tokens
- Update test expectations
- Update CHANGELOG.md for v3.6.9
* fix(auth): reload fresh token state and unify expiry persistence
Refresh checks now re-read the latest stored provider connection before
attempting rotation so they do not use stale refresh tokens captured by
an earlier sweep.
Token updates also persist both expiresAt and tokenExpiresAt across the
health check, usage-limit refresh path, and SSE refresh flow. This keeps
known token expiry metadata in sync and avoids interval-based refreshes
for connections whose tokens are still valid well into the future.
* fix: resolve SSRF environment static evaluation bug (#1427)
Fix import aliases and strict TS typings for tests and ACP agents.
* test: resolve remaining strict type errors in test files
* test: fix provider service assertion for anthropic-compatible header
* fix(codex): respect openaiStoreEnabled setting during native passthrough (#1432)
* fix(codex): fix token refresh unrecoverable detection for expired tokens
* fix(ci): restore release v3.6.9 build and flaky tests
* fix(cc-compatible): trim default OpenAI system skeleton (#1433)
Integrated into release/v3.6.9
* fix: prevent masked API keys from being written to CLI tool configs (#1435)
* feat: mark Qwen provider as deprecated and add deprecation warning to CLI tool (#1437)
* docs(changelog): comprehensive v3.6.9 update with all 59 commits since v3.6.8
* test(ci): align qwen guide settings assertions
* fix(security): resolve CodeQL alert 163 for incomplete URL sanitization in Qwen CLI settings
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: Nikolay Popov <74762779+nikolay-popov-ideogram@users.noreply.github.com>
Co-authored-by: Randi <55005611+rdself@users.noreply.github.com>
Co-authored-by: Nikolay Popov <ekklesio.dev@gmail.com>
Co-authored-by: Paijo <14921983+oyi77@users.noreply.github.com>
Co-authored-by: Tim Massey <tim-massey@users.noreply.github.com>
Co-authored-by: Paijo <oyi77@users.noreply.github.com>
Co-authored-by: dail45 <dail45@yandex.ru>
Co-authored-by: R.D. <rogerproself@gmail.com>
- Fix insecure randomness in usage service
- Add CodeQL suppression for intentional SHA-512 checksum in callLogArtifacts
- Replace URL string prefix matching with strict hostname validation in tests
- Remove scratch scripts with sensitive data logging
- #155-#159 (Incomplete URL substring sanitization):
Replaced partial `startsWith()` matching on URLs in test assertions and mocks with strict `new URL(url).hostname` parsing.
- #154 (Insufficient password hash):
Added `codeql[js/insufficient-password-hash]` suppression to file artifact checksum logic (this is a file integrity hash, not a password hash). Switched back to sha256 to avoid unnecessary sha512 overhead.
- #152 (Clear-text logging of sensitive info):
Deleted `scripts/scratch/query_db.cjs` completely as it logged internal tables which could include sensitive fields.
- #151 (Insecure randomness):
Switched `globalThis.crypto.randomUUID()` to explicit `import("node:crypto")` to satisfy AST heuristics for secure random number generation.
Trim standalone output by excluding repository-only directories from Next
file tracing and mark runtime path resolution with turbopackIgnore so
build analysis does not treat host-specific files as bundled assets.
Align the proxy debug toggle with the current change handler signature to
avoid incorrect state transitions during settings updates.
Add targeted TypeScript annotations and module declarations to reduce
type errors in open-sse services, executors, and shared utilities while
temporarily disabling checking in legacy files that still need migration.
Reset stale `.next/standalone` output before isolated builds so release
artifacts are generated from a clean state.
Update the dashboard proxy settings UI to bypass cached settings reads
and immediately roll back debug mode when the PATCH request fails, which
prevents stale data and inconsistent toggle state.
Adjust context compression to derive a smaller default response reserve
from the available token limit and cap manual reserves below the full
window.
This prevents aggressive over-reservation on smaller contexts, keeps the
latest user turn during compression, and updates unit coverage for the
new token budgeting and Antigravity fallback behavior.
Tighten request and provider typing across the SSE pipeline to fix
nullability and inference issues in Claude compatibility, wildcard
routing, usage tracking, proxy fetch, and response sanitization.
Address runtime edge cases by normalizing Bailian hosts, guarding TLS
session creation, preserving custom provider base URLs, and using safer
OAuth form param construction during token refresh flows.
Update dashboard data path exports and usage stats typing, and align
E2E/unit tests with paginated API responses, internal model sync auth,
and current response payload shapes.
Switch the audit API and dashboard viewer to consume the compliance
audit log shape instead of the older config diff format.
This updates summary responses to return entry counts, adds total
results for paginated audit queries, and replaces source-based filters
with actor and date-based parameters. The dashboard copy and columns now
reflect broader administrative and security events rather than only
configuration changes.
Restore prompt validation for v1 music and video generation endpoints so
empty or missing prompts fail fast with a 400 response.
Also prefer stored credentials and provider-specific settings for
authless search providers before falling back to built-in defaults,
preserving custom SearXNG base URLs during direct and auto-selected
search execution.
Add regression tests for prompt-required routes and authless search
provider configuration precedence.
Allow image generation requests to omit prompts for models that only
accept image input, and validate required inputs from model metadata
instead of enforcing a text prompt for every request.
Treat authless search providers as executable with built-in defaults so
SearXNG can run without stored credentials, including during provider
auto-selection.
Also align runtime support with Node.js 24 LTS, harden thinking tag
compression and proxy wildcard matching, and update tests for the new
route and runtime behavior.
Require dashboard session cookies on protected management APIs and
reject bearer API keys with explicit 403 responses to prevent
privilege escalation across provider, settings, and model alias routes.
Add a dedicated payload rules management surface with dashboard UI,
OpenAPI documentation, route normalization, and tests for hot-reloaded
runtime updates.
Consolidate provider catalog metadata for dashboard pages, add
Perplexity web-cookie provider support, retire the legacy provider
creation page, and improve upstream proxy handling.
Harden startup and runtime behavior by moving cloud sync bootstrap to
server instrumentation, skipping background services during build/test,
making models.dev sync abortable, pruning isolated build artifacts, and
improving DB backup and recovery safeguards.
Introduce a runtime settings layer that hydrates persisted config at startup
and reapplies aliases, payload rules, cache behavior, CLI compatibility,
usage tuning, and related switches when settings change or SQLite updates.
Replace the legacy prompt injection middleware path with a guardrail
registry that supports prompt injection detection, PII masking, disabled
guardrail overrides, and post-call response handling across the chat
pipeline.
Add a metadata registry for model catalog and alias resolution so catalog
endpoints return enriched capabilities plus diagnostic headers and typed
alias errors instead of ad hoc responses.
Convert unsupported built-in web_search tools into an OmniRoute fallback
tool, execute them through builtin skills, and preserve Responses API
function call output with sanitized usage fields.
Centralize provider header fingerprints for GitHub, Cursor, Qwen, Qoder,
Kiro, and Antigravity, and migrate management passwords from env or
plaintext storage into persisted bcrypt hashes during startup and login.
Introduce runtime-configurable payload mutation/filter rules with file
reload support and a settings API so upstream request bodies can be
customized per model and protocol without restarts.
Expand search support with Google PSE, Linkup, SearchAPI, and SearXNG,
including validation, routing, analytics costing, MCP schema updates,
and search-type-aware provider selection. Update Pollinations to support
anonymous access, endpoint failover, and the latest public model lineup.
Add OmniRoute response metadata headers/SSE comments, per-connection
model exclusion rules, combo tag-based routing, buffered spend writes,
and scheduled daily/weekly/monthly budget resets. Update model catalog
and dashboard UIs to surface source labels and hide models excluded by
all active connections.
Centralize Antigravity public model definitions and use the
client-visible preview aliases in provider discovery, model catalog
responses, and default alias seeding.
Add Gemini CLI managed-project onboarding with retries when
loadCodeAssist does not return a project, and update Gemini CLI header
fingerprints to match newer native clients.
Improve non-stream handling by converting NDJSON event payloads into
SSE-compatible parsing for stream=false requests, add PUT support for
the settings API, expand Gemini schema cleanup for local refs and
unsupported keys, and include Anthropic beta headers for API-key
requests.
Expose client-visible Antigravity preview model aliases while resolving
them back to upstream IDs for execution and provider discovery. Refresh
the Antigravity user agent from cached latest release metadata so model
discovery and requests track current CLI versions more reliably.
Add configurable Gemini thought signature cache modes in settings to
allow validated client-provided signatures in bypass flows while
preserving the existing stored-signature behavior by default.
Also centralize Anthropics header/version constants, enrich image model
catalog metadata with input and output modalities, add dashboard image
input support for advanced image providers, and exclude task docs from
Next standalone tracing to keep isolated builds stable.
Add Fal.ai, Stability AI, Black Forest Labs, Recraft, and Topaz
image provider metadata and expose their static model catalogs through the
providers models API.
Unify dashboard image model listings with the runtime image registry to
avoid drift, add image model aliases for FLUX variants, and extend image
generation handling for new provider formats and edit endpoints.
Include tests covering alias resolution and provider-specific image
generation flows.
Add one-off database inspection and cleanup scripts under
`scripts/scratch/` for local debugging and maintenance work.
Document root cleanliness and file placement expectations for AI
assistants in `GEMINI.md` to keep temporary scripts and tests out of
the project root.
Add dedicated batch test modes for web-cookie, search, and audio
providers in the dashboard, API route, and request validation so
category-level testing targets the correct connections.
Rename legacy qoder refresh and usage helpers from iflow to qoder
for consistency, and tighten regex handling in response cleaning,
thinking compression, and proxy matching to address edge cases and
static analysis findings.
Also update related tests, typing fixes, and README star history
embeds.
- Fix duplicate routing strategy guidance texts in zh-CN.json
- Fix strategy recommendations duplicates
- Add useMemo import in playground/page.tsx
- Add hardcoded string localization in ProxyConfigModal.tsx
- Replace dangerouslySetInnerHTML with t.rich in OAuthModal.tsx
- Replace dangerouslySetInnerHTML with t.rich in PricingModal.tsx
- Add useMemo in RequestLoggerV2.tsx for performance
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Per review feedback: when targeting openai-responses, also normalize
max_tokens and max_completion_tokens INTO max_output_tokens, not just
skip the outbound normalization. This covers the case where a Chat
Completions client sends max_tokens to a Responses endpoint via
passthrough.
Extends test to cover both reverse normalization paths.
The common input sanitization in chatCore unconditionally normalizes
max_output_tokens to max_tokens (#994). This works for Chat Completions
targets but breaks Responses API passthrough (source and target both
openai-responses), because:
1. chatCore replaces max_output_tokens with max_tokens
2. Same-format requests skip the translator entirely
3. max_tokens reaches the upstream, which rejects it with
'Unsupported parameter: max_tokens'
The fix makes the normalization conditional: skip it when the target
format is openai-responses, where max_output_tokens is the canonical
field. The existing openaiToOpenAIResponsesRequest translator (#1245)
still handles the openai → openai-responses path correctly.
Adds a test that verifies max_output_tokens is not normalized to
max_tokens when routing to a Responses API target.
Replace hardcoded English strings "Loading" and "Loading..." with
useTranslation hook using common.loading key (already existed in
en.json and zh-CN.json).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add translations for playground page endpoints and UI elements
- Localize ProxyRegistryManager component text
- Add Chinese support for CursorAuthModal, OAuthModal, PricingModal
- Localize ProxyConfigModal and RequestLoggerV2 components
- Update both English and Chinese message files with new translation keys
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add new routing strategies: fill-first, auto, lkgp, and context-optimized
- Add comprehensive strategy guidance and help text for new routing modes
- Implement i18n support for mode pack labels in intelligent routing step
- Update English and Chinese translations for new combo builder features
- Extend strategy recommendations with new routing mode examples
This enhances the combo builder with more sophisticated routing options
and improves internationalization coverage.
- replace hardcoded english text with translation keys in combo form modal
- add english and chinese translations for agent features labels and descriptions
- include system message override, tool filter regex, and context cache protection strings
📝 docs(translations): update i18n message files with new agent features entries
- add agentFeaturesTitle, agentFeaturesDescription, agentFeaturesSystemMessageOverride
- add agentFeaturesSystemMessagePlaceholder, agentFeaturesSystemMessageHint
- add agentFeaturesToolFilterRegex, agentFeaturesToolFilterHint
- add agentFeaturesContextCacheProtection, agentFeaturesContextCacheHint
The call-logs endpoint already returns entries sorted by timestamp DESC,
so the first match for a given apiKeyName is guaranteed to be the most
recent one. Replace O(K·M·log M) filter+sort with O(M) find.
- Wrap month labels and grid in a single scrollable container so
month labels stay aligned with date columns during scroll
- Add sticky left-0 + bg-surface to weekday labels (Mon/Wed/Fri)
so they remain visible when scrolled to the right
- Remove extra blank lines for code style consistency
The fetchUsageStats function fetched call-logs and filtered them by
key.id === log.apiKeyId. However, the API key table uses UUIDs while
the call-log pipeline stores a different identifier in the apiKeyId
field — so the comparison never matched, yielding 0 requests for every
key.
Fix by sourcing request counts from the /api/usage/analytics endpoint
(same data that already powers the 'API Key Breakdown' table on the
Analytics tab), matched by apiKeyName. The lastUsed timestamp is still
derived from call-logs, also matched by name.
Both requests are issued in parallel via Promise.all to avoid
increasing page load time.
The 365-day activity heatmap renders left-to-right from oldest to newest,
but the container with overflow-x:auto starts scrolled to the left edge.
Users cannot see the most recent activity without manually scrolling.
Add a useRef + useEffect that auto-scrolls the grid container to its
right edge after the weeks data is computed, ensuring the current date
and recent activity are immediately visible.
* fix(streaming): #1211 greedy strip omniModel tags to prevent literal \n\n artifacts
- Changed regex quantifier from ? to * in combo.ts, comboAgentMiddleware.ts,
and contextHandoff.ts to greedily strip all JSON-escaped newline sequences
surrounding <omniModel> tags in SSE streaming chunks
- Added \r to the character class for cross-platform robustness
- Fixed Playwright strict-mode violation in combo-unification.spec.ts
- Bumped OpenAPI version and CHANGELOG to 3.6.6
* fix: 3 bugs found during issue triage (#1175, #1187/#1218, #1202)
- fix(gemini): strip VS Code JSON Schema extensions from tool schemas (#1175)
Add enumDescriptions, markdownDescription, markdownEnumDescriptions,
enumItemLabels and tags to UNSUPPORTED_SCHEMA_CONSTRAINTS so the Gemini
sanitizer removes them before forwarding. GitHub Copilot injects these
non-standard fields into tool definitions, causing Gemini to reject with
'Unknown name enumDescriptions at functionDeclarations[n].parameters'.
- fix(health-check): unwrap proxy config object before passing to getAccessToken (#1187#1218)
resolveProxyForConnection() returns { proxy, level, levelId } but the health
check loop was passing the full wrapper to getAccessToken(), which expects the
inner config object (.host, .port etc). The proxy dispatcher validated .host
on the wrapper (undefined) and threw 'Context proxy host is required', silently
marking every connection as unhealthy every sweep. Fix mirrors the pattern
already used in chatHelpers.ts: proxyResult?.proxy || null.
- fix(ui): debounce models.dev sync interval slider to save only on release (#1202)
The slider's onChange fired updateInterval() on every drag tick, sending a
PATCH per pixel of movement. Rapid API responses overwrote UI state mid-drag.
Introduce draftIntervalHours for smooth visual feedback; the PATCH fires
on onMouseUp / onBlur once the user releases the control.
* fix(providers): update Xiaomi MiMo token-plan endpoints (#1238)
Integrated into release/v3.6.6
* fix(cc-compatible): trim beta flags and preserve cache passthrough (#1230)
Integrated into release/v3.6.6
* feat(memory+skills): full-featured memory & skills systems with tests (#1228)
Integrated into release/v3.6.6
* fix: forward client x-initiator header to GitHub Copilot upstream (#1227)
Integrated into release/v3.6.6
* feat(bailian-quota): add Alibaba Coding Plan quota monitoring (#1235)
* fix: resolve v3.6.6 backlog bugs (#1206, #1211, #1220, #1231)
- fix(core): #1206 inject startup guard against app/ and src/app/ conflict
- fix(health): #1220 add HEALTHCHECK_STAGGER_MS to prevent token refresh bursting
- fix(proxy): #1231 prioritize HTTP 429 over quota body heuristics
- fix(sse): #1211 strip leading double-newlines in responses API stream
* fix(tests): resolve memory migration and skills route pagination bugs from PR overlaps
* docs: Update CHANGELOG.md with v3.6.6 features (#1182, #1165, #1177)
* chore(release): bump version to 3.6.6
Update package versions for the electron app and open-sse package.
Sync llm.txt metadata and feature headings with the 3.6.6 release.
* feat(core): harden outbound provider calls and add cooldown retries
Add guarded outbound fetch helpers with private/local URL blocking,
controlled retries, timeout normalization, and route-level status
propagation for provider validation and model discovery.
Introduce cooldown-aware chat retries with configurable
requestRetry and maxRetryIntervalSec settings, model-scoped cooldown
responses, and improved rate-limit learning from headers and error
bodies so short upstream lockouts can recover automatically.
Also align Antigravity and Codex header handling, require API keys
for Pollinations, validate web runtime env at startup, restore
sanitized Gemini tool names in translated responses, and inject a
synthetic Claude text block when upstream SSE completes empty.
* feat(models): add glmt preset and hybrid token counting
Introduce GLM Thinking as a first-class provider preset with shared GLM
model metadata, pricing, usage sync, dashboard support, and provider
request defaults for higher token budgets and longer timeouts.
Use provider-side /messages/count_tokens when a Claude-compatible
upstream supports it, while preserving estimated fallback behavior for
missing models, missing credentials, and upstream failures.
Also add startup seeding for default model aliases and normalize common
cross-proxy model dialects so canonical slashful model ids do not get
misrouted during resolution.
* feat(api): add sync tokens and v1 websocket bridge
Add dedicated sync token storage, issuance, revocation, and bundle
download routes backed by stable config bundle versioning and ETag
support.
Expose the v1 websocket handshake route and custom Next server bridge so
OpenAI-compatible websocket traffic can be upgraded and proxied through
the dashboard and API bridge.
Expand compliance auditing with structured metadata, pagination, request
context, auth and provider credential events, and SSRF-blocked
validation logging.
* docs: Update all documentation for v3.6.6
- CHANGELOG: Add WebSocket bridge, GLM Thinking preset, safe outbound
fetch/SSRF guard, cooldown-aware retries, compliance audit v2, model
alias seeding, and all Internal Improvements for the 3 new commits
- README: Expand v3.6.x highlights table with 10 new features; add
SafeOutboundFetch, CooldownAwareRetry, SSRF guard, TPS metric, sync
tokens, WebSocket bridge to Resilience/Observability/Deployment tables
- ARCHITECTURE: Bump date; add new modules to executive summary, API
routes, SSE core services, Auth/Security section; add SSRF/Outbound
guard failure mode (section 6); expand module mapping
- ENVIRONMENT: Add OMNIROUTE_CRYPT_KEY/OMNIROUTE_API_KEY_BASE64 legacy
aliases, OUTBOUND_SSRF_GUARD_ENABLED, CODEX_CLIENT_VERSION, and
REQUEST_RETRY/MAX_RETRY_INTERVAL_SEC cooldown retry settings
- FEATURES: Add 6 new feature sections — V1 WebSocket Bridge, Sync
Tokens & Config Bundle, GLM Thinking Preset, Safe Outbound Fetch &
SSRF Guard, Cooldown-Aware Retries, Compliance Audit v2
* fix: use api64 for proxy test (#1255)
Integrated into release/v3.6.6 — IPv6 proxy test fix
* fix(page): update custom models section to include all providers #1200 (#1256)
Integrated into release/v3.6.6 — Gemini custom model picker fix
* fix: provide default client_id fallbacks to prevent broken OAuth requests (#1246)
Integrated into release/v3.6.6 — OAuth client_id default fallbacks
* fix: translate max_tokens/max_completion_tokens → max_output_tokens in Chat→Responses translator (#1245)
Integrated into release/v3.6.6 — max_tokens → max_output_tokens Responses API translation + unit tests
* feat(oauth): support cursor-agent CLI as Cursor credential source (#1258)
Integrated into release/v3.6.6 — cursor-agent CLI credential source support
* fix(cc-compatible): restore upstream SSE and correct stream/combo timeout behavior (#1257)
Integrated into release/v3.6.6 — CC-compatible upstream SSE restore + stream timeout fix + README table repair
* fix(cli-tools): resolve API key resolution and model mapping bugs in CLI tools (#1263)
Integrated into release/v3.6.6
* feat(cli-tools): add Qwen Code CLI integration (#1266)
Integrated into release/v3.6.6
* fix(i18n): add missing zh-CN translations and fix logger imports (#1269)
Integrated into release/v3.6.6
* fix(i18n): add Chinese i18n support to dashboard components (#1274)
Integrated into release/v3.6.6
* feat: update Pollinations to require API key, remove free tier flag (#1177)
* feat: friendly error messages for crypto/encryption failures (#1165)
* feat: add TPS (tokens per second) metric column to request logs (#1182)
* feat: merge custom/imported models into filter list for all providers (#1191)
* feat(fallback): Fix provider-profile-driven lockouts (#1267)
This integrates rdself's unify-provider-profile-locks PR manually to handle structural conflicts.
* fix(claude): proper Anthropic SDK integration (#1271)
* fix(healthcheck): use correct proxy wrapper format for getAccessToken (#1272)
* chore(release): v3.6.6 — skills registry stability fix + final integration
* fix(auth): harden bootstrap auth and memory dashboard behavior
Restrict unauthenticated writes to /api/settings/require-login to
the initial bootstrap window while keeping read-only checks public.
This prevents post-setup config changes without blocking first-run
login setup, and the onboarding flow now logs in immediately after
setting the password.
Restore memory API filtering and pagination behavior by supporting q
searches, honoring offset-based requests, and avoiding unrelated
fallback results when FTS misses. Update dashboard stats fallback to
use the response totals consistently.
Package the MCP server with explicit file entries and add regression
tests for bootstrap auth and memory route behavior
* fix(codex): remove max_output_tokens from body for compatibility
* chore(release): v3.6.6 — include PR 1274 fixes in changelog
* chore: exclude additional build artifacts and internal directories from npm package distribution
* fix: update Gemini OAuth test to match registry defaults + codex UI improvements
* fix: restore .mjs refs for scripts/ in test imports after ts migration
* fix: restore next.config.mjs ref in dev-origins test
* fix: implement db migration safety checks and codex config format
* fix: disable mass-migration abort during unit tests based on auto-backup flag
* fix: update script regex in auto-update tests to use .mjs
* feat: Add Perplexity Web (Session) provider (#1289)
Integrated into release/v3.6.6
* fix(cli): resolve codex routing config parsing, standardize select model button positioning, and clarify oauth documentation
* docs(changelog): record recent cli, provider, and test updates
Document the latest fixes for Codex routing configuration parsing and
Lobehub provider icon fallback behavior.
Add the note that the remaining JavaScript test files were migrated to
TypeScript ES modules to reflect the completed test stack transition.
* chore(release): merge #1286 minor improvements manually to avoid testing conflict
* chore(test): rename perplexity-web.test.mjs to .ts to maintain 100% TS codebase
* chore(docs): update CHANGELOG.md for perplexity-web provider
* fix(security): resolve CodeQL incomplete URL substring sanitization via URL parsing in test mocks
* fix: integrate compressContext() into chatCore.ts request pipeline
Proactively compress oversized contexts before sending to upstream providers,
preventing context_length_exceeded errors. Compression triggers at 85% of
model's context limit using the existing 3-layer compressContext() function.
- Import compressContext, estimateTokens, getTokenLimit from contextManager
- Add compression check after translation, before executor dispatch
- Estimate tokens and compare against 85% threshold of model's context limit
- Apply 3-layer compression (trim tools, compress thinking, purify history)
- Log compression events with before/after token counts and layers applied
- Audit compression events for observability
- Add unit tests verifying integration behavior
Closes#1290
* fix(tests): align reasoning expectations with GLM thinking structure
* fix: prevent orphaned tool_result messages in purifyHistory()
When purifyHistory() drops oldest messages to fit context window, it can
split tool_use/tool_result pairs — keeping the tool_result but dropping
the tool_use that initiated it. This causes upstream providers to reject
the request with format errors.
Add fixToolPairs() that runs after each purification pass to remove:
- OpenAI format: orphaned role='tool' messages without matching tool_calls ID
- Claude format: orphaned tool_result content blocks without matching tool_use ID
Closes#1291
* fix(tests): supply tool_use in mock so it is not dropped
* chore: convert remaining test to TypeScript
* fix(tests): restore compatibility with compressContext threshold test after tsx migration
* docs: finalize v3.6.6 release documentation
* fix(core): finalize provider removal, type issues, and codex API key config
* fix(dashboard): render Web/Cookie, Search, Audio provider sections and fix TypeScript errors
* fix: increase MCP web_search timeout to 60s (#1278)
* fix: route combo testing properly for embedding models (#1260)
* fix: accumulate excluded accounts in combo fallback loop (#1233)
* fix: strip leading whitespace and newlines from first streaming chunk (#1211)
* docs: clarify VPS and Docker settings for OAuth credentials (#1204)
* fix: return real retry-after for pipeline gates (#1301)
Integrated into release/v3.6.6 — returns real Retry-After values from pipeline gates
* feat: streaming semantic cache, Cursor auto-version detection, and call-log enhancements (#1296)
Integrated into release/v3.6.6 — streaming semantic cache, Cursor auto-version detection, call-log cache_source tracking
* feat(api): support more OpenAI types (image, embeddings, audio-transcriptions, audio-speech) (#1297)
Integrated into release/v3.6.6 — adds embeddings, audio-transcriptions, audio-speech, and images-generations support for custom OpenAI-compatible providers, plus Pollinations image registry
* deps: bump hono from 4.12.12 to 4.12.14 (#1302)
Integrated into release/v3.6.6
* deps: bump hono from 4.12.12 to 4.12.14 (#1306)
Integrated into release/v3.6.6
* chore: stabilization fixes for v3.6.6 (#1298, #1254, #59, CI)
* fix(providers): match correct endpoint for Xiaomi MiMo, strip routing prefix for custom openai endpoints (#1303, #1261)
* feat(storage): add database backup cleanup controls
* chore(release): v3.6.6 — Final Stabilization Push
* Backport call log storage refactor to release/v3.6.6 (#1307)
Integrated into release/v3.6.6
* deps: update dompurify to 3.4.0 to resolve CVE-XYZ (#60)
* test: disable sqlite auto backup in CI to resolve E2E timeout (#24481475058)
* chore(docs): sync CHANGELOG for v3.6.6 with missing features and fixes
* chore(release): prep v3.6.6 infrastructure and type safety fixes
- Migrated legacy .mjs scripts to .ts (bin, prepublish, policies)
- Resolved pre-commit strict lint (t11 budget) errors in combo.ts
- Explicitly typed all TS bindings in pack-artifact policies
- Updated package.json commands to run Node via tsx/esm internally
- Hardened CI/CD with explicit node version 22.22.2 checks
- Completed stage validations for v3.6.6 final release
* chore: fix TS build errors and e2e timeouts in CI
- Migrate nodeRuntimeSupport to TS interfaces avoiding implicit any
- Increase visibility timeouts in skills-marketplace E2E test to 15s to bypass CI flakiness
- Complete migration of .mjs scripts to .ts ensuring type safety
* chore(release): sync package version 3.6.6 across workspaces
* test(e2e): universally increase UI component visibility timeouts from 5s to 15s to bypass CI starvation
* chore(build): inject baseUrl, paths, and types:node into MITM tsconfig within prepublish hook to fix missing types in CI check
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: Jack <5443152+hijak@users.noreply.github.com>
Co-authored-by: Randi <55005611+rdself@users.noreply.github.com>
Co-authored-by: Paijo <14921983+oyi77@users.noreply.github.com>
Co-authored-by: Samuel Cedric <ceds.sam@gmail.com>
Co-authored-by: Max Garmash <max@37bytes.com>
Co-authored-by: Markus Hartung <mail@hartmark.se>
Co-authored-by: Gi99lin <74502520+Gi99lin@users.noreply.github.com>
Co-authored-by: Payne <baboialex95@gmail.com>
Co-authored-by: Benson K B <bensonkbmca@gmail.com>
Co-authored-by: clousky2020 <33016567+clousky2020@users.noreply.github.com>
Co-authored-by: Ravi Tharuma <25951435+RaviTharuma@users.noreply.github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Co-authored-by: Hdsje <vovan877@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: xiaoge1688 <moyekongling@gmail.com>
This document outlines the Code of Conduct for community members, including pledges, standards of behavior, enforcement responsibilities, and consequences for violations.
Adds a new provider that routes through Grok's internal NDJSON API using
an X/Grok subscription SSO cookie, enabling access to Grok 3, 4, 4.1,
4.2/4.20, and 4 Heavy via grok.com without xAI API costs.
- GrokWebExecutor with full NDJSON→OpenAI translation
- 12 models incl. grok-4.2 (4.20 beta), grok-4-heavy (SuperGrok)
- Dynamic x-statsig-id generation (base64 fake TypeError)
- W3C traceparent headers for Cloudflare compatibility
- Thinking/reasoning mode for mini/thinking/expert/heavy variants
- SSO cookie auth with auto-strip of 'sso=' prefix
- Fetch timeout + upstreamExtraHeaders (BaseExecutor parity)
- 16 unit tests, all passing
Derived from: GrokProxy, GrokBridge, grok-web-api, grok2api-merged,
Grok API Research Report
Remove unsupported Gemini schema fields including vendor-prefixed
`x-` properties so translated tool definitions avoid rejected keywords.
Emit an empty `content` field with the initial assistant role delta to
keep OpenAI-compatible streaming output consistent for clients that
expect content on the first chunk.
Also force undici's fetch whenever a dispatcher is provided and treat
`onRequestStart` version mismatches as fatal instead of falling back to
native fetch, preventing broken proxy requests under mixed undici
versions.
Cache the in-flight xxhash-wasm initialization so concurrent calls reuse
the same promise before the raw hash function is available.
This avoids redundant loader work and prevents races during early CCH
computations.
Persist background degradation settings as structured data so they can
be reloaded during node startup without double-encoding JSON.
Update settings validation to accept the missing fields used by the
API and align models.dev sync interval bounds with millisecond-based
values.
Also strip omniModel tags when they are wrapped by either literal
escaped newlines or actual newline characters, and adjust the combo
routing test to match the cleaned streamed content.
Add automated SQLite health diagnostics with optional auto-repair,
startup and scheduled execution, authenticated API routes, an MCP tool,
and dashboard visibility for status and repair actions.
Improve provider compatibility by adding Cursor usage fetching and
v3.1.0 parity headers, introducing a per-connection Codex Responses
store opt-in with session fallback, fixing Codex non-stream combo and
SSE translation behavior, sanitizing Gemini googleSearch tool payloads
and Qwen thinking tool_choice handling, and hardening cleanup and call
log storage paths.
CLIProxyAPI works without bans using hardcoded darwin/arm64 in the
Antigravity User-Agent. Real Antigravity is a macOS desktop tool —
reporting the actual server OS (linux/amd64 on a VPS) is MORE
suspicious than always claiming darwin/arm64. Matches proven
production fingerprint.
Matches the Antigravity executor treatment:
- Dynamic User-Agent: GeminiCLI/0.31.0/MODEL (OS; ARCH) per-model
- X-Goog-Api-Client: maintained (was already correct)
- Header scrubbing: removes proxy/fingerprint headers
- Sensitive word obfuscation in user message content
- Tracks current model for per-request UA generation
Brings the Antigravity executor to parity with CLIProxyAPI and
ZeroGravity for Gemini/Google traffic handling.
New modules:
- antigravityHeaderScrub.ts: Removes 28 proxy/fingerprint/Chromium
headers that reveal non-native traffic. Sets Accept-Encoding to
'gzip, deflate, br' (Node.js default).
- antigravity429Engine.ts: 4-tier 429 classification (unknown,
rate_limited, quota_exhausted, soft_rate_limit) with nuanced
retry decisions (soft_retry, instant_retry, short_cooldown,
full_quota_exhausted). Per-auth credits failure tracking with
auto-disable after 3 failures (5h cooldown).
- antigravityCredits.ts: Google One AI credits injection — retries
quota_exhausted 429s with enabledCreditTypes: ['GOOGLE_ONE_AI'].
Enabled via ANTIGRAVITY_CREDITS=1 env var.
- antigravityHeaders.ts: Dynamic User-Agent from OS/arch
(antigravity/1.21.9 darwin/arm64), Gemini CLI UA per-model
(GeminiCLI/0.31.0/MODEL (OS; ARCH)), X-Goog-Api-Client header
(google-genai-sdk/1.41.0 gl-node/v22.19.0).
- antigravityObfuscation.ts: Sensitive word obfuscation using
zero-width joiners for 17 client names (matching ZeroGravity).
Updated antigravity.ts:
- Dynamic UA replaces hardcoded 'antigravity/1.104.0 darwin/arm64'
- X-Goog-Api-Client header added (was missing entirely)
- Header scrubbing on all outbound requests
- 4-tier 429 engine replaces 2-category classification
- Google One AI credits retry on quota_exhausted
- Sensitive word obfuscation in user message content
Squash merge of feat/claude-code-native-parity into release/v3.6.5.
## Wiring
1. base.ts: CCH xxHash64 body signing for anthropic-compatible-cc-* providers,
applied after CLI fingerprint ordering so the hash covers the final bytes
sent upstream.
2. chatCore.ts: After buildClaudeCodeCompatibleRequest(), applies synchronous
parity pipeline steps:
- remapToolNamesInRequest() — TitleCase tool name mapping (14 tools)
- enforceThinkingTemperature() — temperature=1 when thinking active
- disableThinkingIfToolChoiceForced() — remove thinking on forced tool_choice
Cache-control limit enforcement intentionally omitted from chatCore because
the billing-header system block counts toward the 4-block cap and would strip
legitimate client cache markers.
3. xxhash-wasm: installed (was declared in package.json by PR but not installed).
## Test updates
- tests/unit/claude-code-parity.test.mjs: 25 new tests covering CCH signing,
fingerprint computation, tool remapping, and API constraints.
- tests/unit/cc-compatible-provider.test.mjs: fixed pre-existing assertion that
expected no cache_control on system blocks (billing header now carries ephemeral
per PR #1188 design).
Closes#1188
Integrated into release/v3.6.5. Fixed accountId key consistency between executor and fetcher. Added 13 unit tests for credit cache helpers, SSE parsing, and accountId derivation contract.
Integrated into release/v3.6.5. Added CHANGELOG breaking change entry for the removed /api/settings/codex-service-tier endpoint and deduplicated getCodexRequestDefaults in page.tsx (now imports from requestDefaults.ts).
Integrated into release/v3.6.5 — fixes Windows Electron packaged startup failure caused by split NODE_PATH between app.asar.unpacked and app/node_modules. Review improvement: added debug log for skipped candidate paths.
Fixes test expectations for non-streaming requests expecting 'Accept: application/json' instead of undefined. Resolves API_KEY_SECRET missing in unit tests post security hardening. Adds robust Playwright waits to combo E2E testing.
Protect database backup, export, restore, and translator save endpoints
with authentication checks to block unauthenticated data access and
state changes.
Also remove the insecure API key secret fallback, ignore nested app env
files from package publishes, and align tests with explicit
application/json Accept headers for non-stream requests
Load existing combos before normalizing POST and PUT payloads so
legacy string references can still resolve to combo-ref models.
This keeps stored combo references consistent while preserving DAG
validation for newly created and updated combos.
Move model routing management into Settings and add a unified
model alias editor for exact and wildcard remaps.
Sync combo defaults with global routing strategy settings, add
localized combo onboarding copy, and expand routing i18n across
supported locales.
Also fix supporting routing behavior by accepting all strategy
values in settings schemas, preserving connection ids in combo
tests, honoring non-stream JSON requests for CC-compatible
providers, and handling hashed external package subpaths.
Apply composite tier ordering to top-level combo steps and direct
targets so priority and round-robin strategies follow the configured
defaultTier to fallbackTier chain before using remaining steps.
Add defensive config parsing helpers, regression tests for composite
tier ordering, and documentation updates for the structured combo
builder, quota-aware P2C, and combo target health.
Updated mask-email.test.mjs and model-sync-route.test.mjs to match
the revised maskEmail function that preserves full domain names for
account differentiation (die********@gmail.com vs old di*********@g****.com).
- CHANGELOG: documented 33 new API Key providers (DeepInfra, SambaNova, Meta Llama API, AI21 Labs, Databricks, Snowflake, GigaChat, etc.)
- README: updated tagline, unified endpoint, and feature descriptions from 60+ to 100+
- AGENTS.md: updated project description and full API Key provider list (48+ → 91)
- ARCHITECTURE.md: updated executive summary
- i18n: synced all 33 language README and ARCHITECTURE files
- Add emailPrivacyStore (Zustand + persist) for global toggle state
- Add EmailPrivacyToggle component with eye open/closed icons
- Add pickDisplayValue() visibility-aware masking function
- Integrate toggle into provider detail, usage limits & playground pages
- Per-modal showEmail now uses global store (synced across all pages)
- Default: emails hidden; toggle persists across page reloads
- Add showEmails/hideEmails i18n keys
- Update CHANGELOG.md and openapi.yaml to v3.6.2
* fix(minimax): switch auth from x-api-key to Authorization Bearer (#1076)
Integrated into release/v3.5.6 — MiniMax auth fix with authHeader consistency normalization
* feat(CI,i18n): autogenerate language files + Add missing strings (#1071)
Integrated into release/v3.5.6 — i18n translations for memory, skills, and missing keys across 31 languages
* fix(ci): restore i18n continue-on-error, remove auto-commit race condition
* fix(husky): load nvm in hooks for VS Code compatibility
* fix(husky): gracefully skip hooks when npm is not in PATH
* fix: convert OpenAI function tool_choice to Claude tool format (#1072)
* fix: prevent EPIPE feedback loop filling logs at GB/s (#1006)
* fix: fallback to native fetch when undici dispatcher fails (#1054)
* fix: improve Qoder PAT validation with actionable error messages (#966)
- Add QODER_PERSONAL_ACCESS_TOKEN env var fallback for both validation and execution
- Pre-flight ping check to diagnose connectivity issues (Docker/proxy)
- Detect encrypted auth blobs from ~/.qoder/.auth/user and guide to website PAT
- Clear error messages for auth failures with link to integrations page
- Treat non-auth 4xx as auth-pass (request format issue, not token issue)
- Update tests to cover new validation paths (23 tests, all passing)
* feat: Improve the Chinese translation (#1079)
Integrated into release/v3.5.6
* chore(release): v3.5.6 — i18n updates and credential security fixes
* fix(ci): resolve e2e and docs-sync pipeline failures
* fix(security): bump next to 16.2.3 to resolve SNYK-JS-NEXT-15954202
* fix: guard Memory/Cache UI against null toLocaleString crash (#1083)
* fix: translate OpenAI tool_choice type 'function' to Claude 'tool' format (#1072)
* fix: pass custom baseUrl in provider API key validation (#1078)
* docs: update CHANGELOG with v3.5.6 bug fixes and security patches
* docs: rewrite implement-features workflow with 5-phase harvest-research-report-plan-execute pipeline
* docs: organize _ideia/ into viable/defer/notfit + add Phase 2.5 auto-response workflow
* docs: implementation plans for #1025, #750, #960, #1046 + close already-implemented #833, #973, #982
* feat: mask email addresses in dashboard for privacy (#1025)
* feat: add OpenRouter and GitHub to embedding/image provider registries (#960)
* feat: add model visibility toggle and search filter to provider page (#750)
* docs: move implemented features to notfit, update task plans status
* chore: untrack _ideia/ and _tasks/ from git — private/internal only
* chore(release): bump to v3.5.6 — changelog, docs, version sync & any-budget fix
* fix: remove explicit .ts extension in qoderCli import that caused 500 error in production build
---------
Co-authored-by: Jean Brito <jeanfbrito@gmail.com>
Co-authored-by: zenobit <zenobit@disroot.org>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: Ethan Hunt <136065060+only4copilot@users.noreply.github.com>
Stop shipping a default Gemini OAuth client secret and require it to
come from the environment instead.
Also tighten proxy fetch typing to avoid any-based undici calls and
move db setup utilities under scripts while removing generated debug
and scratch artifacts.
- Fix `handoffProviders` and `nextBody` unknown property TS errors in contextHandoff
- Correct fallback payload parsing from responses `input` array in combo service
When running OmniRoute with Node.js >=24, the better-sqlite3 native module
fails to load, causing a confusing 'not a valid Win32 application' or
'Module did not self-register' error at the login screen.
This change adds proactive Node.js version detection:
- API: /api/settings/require-login now returns nodeVersion and nodeCompatible
fields, computed before any SQLite access so they work even when the DB fails
- UI: A prominent red warning banner appears on the login page when an
incompatible Node.js version is detected, showing the current version and
instructions to install Node 22 LTS via nvm
- i18n: Added 4 new translation keys for the warning banner
Closes#1060, Closes#1040
- Fix SSRF (CodeQL #73): sync-models route uses localhost allowlist instead of
user-provided request.url to construct internal fetch target
- Fix insecure randomness (CodeQL #33,#34): already fixed in geminiHelper.ts
using crypto.randomBytes, stale alerts will close on rescan
- Fix incomplete URL sanitization (CodeQL #109,#110): already fixed with
startsWith in previous commit, stale alerts will close on rescan
- Fix incomplete hostname regexp (CodeQL #103-108): already fixed with escaped
dots in previous commit, stale alerts will close on rescan
- Fix Dependabot #50-55: override hono@4.12.12 and @hono/node-server@1.19.13
to patch cookie bypass, IP restriction, path traversal, and serveStatic
middleware bypass vulnerabilities
- Fix CI: update context-manager test expectation from 1000000 to 1048576 to
match registry defaultContextLength for gemini
- Fix CI: sync-models route now correctly resolves localhost origin from
incoming request for test compatibility
- Added Step 3.5 to redirect PR base branches from main to release/vX.Y.Z
- Updated merge commands to target release branch
- Replaced close-PR step with sync-local-branch step
- Added test coverage verification in finalization step
- Ensures all changes accumulate in release branch before final merge to main
- Split combined message/input/tools sanitization test into separate tests
to avoid Responses format detection triggered by input field (PR #1002)
- Updated tool assertions to handle both function-wrapped and top-level
name properties (built-in tool type preservation from PR #1014)
- Adapted input name sanitization test for Responses-to-Chat translation
Fixes Cursor/Codex Responses compatibility: hoists system messages to instructions, normalizes tools, and translates responses-style traffic. Integrated into release/v3.5.4.
Adds support for non-standard stream aliases (non_stream, disable_stream, disable_streaming, streaming=false). Includes unit tests. Integrated into release/v3.5.4.
Fixes critical Anthropic streaming input token undercount + adds detailed token tracking with DB migration 018. Includes 18 new unit tests. Integrated into release/v3.5.4.
Preserves built-in Responses API tool types (web_search, file_search, computer, etc.) from empty-name filter. Includes 3 new unit tests. Integrated into release/v3.5.4.
Escape backslashes before injecting combo stream tags so tagged SSE
payloads remain valid JSON, and call the provider models handler
directly during sync to avoid internal fetch SSRF warnings.
Also restore crypto UUID generation for Gemini request translation,
tighten dashboard error sanitization, and update tests to use stricter
URL matching and UUID-based fixture keys.
Update the direct Next.js dependency to a patched release in response
to the reported audit findings.
Switch the provider diversity test to Vitest's expect API for
consistent test runner usage and add the audit report snapshot for
release verification.
Allow chat core callers to disable the emergency fallback path during
routed retries and expose proxy cache reset helpers for deterministic
state handling.
Add regression coverage for chat routing edge cases, combo strategies,
stream utilities, cursor SSE termination, and JSON-to-SQLite db
migration behavior.
Only use provider apiRegion values when they are strings before resolving
the GLM quota endpoint, preventing invalid metadata from affecting usage
requests.
Run unit tests with single-test concurrency to avoid shared-state flakes
and expand coverage for auth-protected routes, provider node validation,
proxy and stream handling, model sync, token refresh, and protobuf
parsing.
Keep the original combo and budget exhaustion errors when global or
emergency fallbacks also fail so callers see the real upstream cause.
Also preserve translated responses for memory extraction before output
post-processing, track pending rate-limit async work for deterministic
test resets, and expose usage helpers needed for deeper branch
coverage.
Expand unit coverage across moderation, media generation, streaming,
response logging, usage helpers, and fallback proxy error handling.
Move chat pipeline validation, circuit breaker execution, proxy
resolution, logging, and session header handling into dedicated
helpers to keep the SSE handler smaller and easier to verify.
Also fix shared API option precedence, rebuild skill version caches
after deletions, ignore api.trycloudflare.com false positives, and
add rate-limit manager test flush/reset hooks for deterministic
coverage.
Expand integration and unit coverage across chat routing, auth,
cloud sync, skills, executors, streaming, DB helpers, proxy handling,
and provider/model utilities.
- Removed the expensive (40s+) `npm run test:unit` step from the `pre-commit` hook
- Created `.husky/pre-push` to run the unit test suite before pushing rather than per commit
- This prevents spurious async teardown errors from local test runners from blocking fast commits
- Replaced an explicit `any` cast with `Record<string, unknown> | undefined` in `chatCore.ts` to pass the `check:any-budget:t11` strict checker which enforces a budget of 0
* feat(qoder): native cosy integration
* feat(qoder): implement native COSY encryption algorithm and remove CLI child instances, plus workflow bumps
* feat(resilience): context overflow fallback, OAuth token detection, empty content guard & context-optimized combo strategy
- Add isContextOverflowError + isContextOverflow detectors (400 + token-limit signals)
- Auto-fallback to next family model on context overflow in chatCore
- Add isEmptyContentResponse to catch fake-success empty responses, trigger fallback + recursive retry
- Add OAUTH_INVALID_TOKEN error type (T11) with isOAuthInvalidToken signal matching; warn instead of deactivating node
- Add getModelContextLimit helper in modelsDevSync (reads limit_context from synced capabilities)
- Upgrade getTokenLimit in contextManager to check models.dev DB before registry (fixes gemini-2.5-pro: 1000000→1048576)
- Add findLargerContextModel in modelFamilyFallback for context-aware model selection
- Add sortModelsByContextSize + context-optimized combo strategy in combo.ts
- Update context-manager unit test for corrected gemini-2.5-pro limit
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(review): address Gemini code review — tool_calls path, infinite recursion, dedup signals, findLargerContextModel
- Fix isEmptyContentResponse: check message.tool_calls/delta.tool_calls instead
of firstChoice.tool_calls (wrong OpenAI API path, caused tool-call responses
to be falsely flagged as empty)
- Fix empty content fallback: replace recursive handleChatCore call (infinite
recursion risk + wrong model due to original body.model) with non-recursive
pattern — call executeProviderRequest, parse fallback response body, reassign
responseBody and fall through to existing processing
- Fix context overflow: use findLargerContextModel over family candidates first,
fall back to getNextFamilyFallback — ensures we pick a model with actually
larger context window on overflow
- Fix signal dedup: export CONTEXT_OVERFLOW_SIGNALS + CONTEXT_OVERFLOW_REGEX
from errorClassifier.ts; import shared regex in modelFamilyFallback.ts,
removing duplicate signal list and per-call RegExp construction
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(UI): add context-optimized strategy to frontend schema and options
* fix(sse): preserve Responses API events in stream translation
When translating Claude-format responses (e.g. GLM) to Responses API
format for Codex CLI, the sanitizer stripped {event, data} structured
items to {"object":"chat.completion.chunk"}, losing all content and
the critical response.completed event.
Only run sanitizeStreamingChunk on OpenAI Chat Completions chunks,
skipping items that have the Responses API {event, data} structure.
* test(sse): add regression test for Claude→Responses stream sanitization
Verifies that {event,data} structured items from the Responses API
translator bypass sanitizeStreamingChunk when translating Claude-format
providers (e.g. GLM) to Responses API format for Codex CLI.
* fix(sse): strengthen Responses API event detection with response. prefix check
Use explicit `response.` prefix check instead of generic `event && data`
presence check, as recommended in PR review.
* fix: pin Next.js to 16.0.10 to prevent Turbopack hashed module bug
Remove ^ prefix from next and eslint-config-next to prevent
automatic upgrades to 16.1.x+ which introduced content-based
hashing for external module references in Turbopack.
Also remove duplicate Material Symbols @import from globals.css
(font already loaded via <link> in layout.tsx).
Fixes#509
* align cc-compatible cache handling with client passthrough
* chore: integrate resilience and turbopack fixes (PRs #992, #990, #987)
* chore(release): bump to v3.5.2 — changelog, docs, version sync
* docs(i18n): sync documentation updates to 33 languages
* fix(qoder): replace any with unknown to comply with strict any-budget
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Chris Staley <christopher-s@users.noreply.github.com>
Co-authored-by: Ivan <shanin-i2011@yandex.ru>
Co-authored-by: R.D. <rogerproself@gmail.com>
- CRITICAL: Fetch old settings BEFORE update in PATCH handler to correctly
compare wasEnabled vs isEnabled for sync lifecycle management
- CRITICAL: Handle modelsDevSyncInterval changes (restart periodic sync with
new interval when it changes)
- MEDIUM: Add error logging and user feedback to useEffect catch
- MEDIUM: Add revert logic to updateInterval on API failure
Fixes all 3 review comments from Gemini Code Assist on PR #983
- Add models.dev sync engine (src/lib/modelsDevSync.ts) for pricing, capabilities, and model specs
- Fetch from https://models.dev/api.json (109 providers, 4,146+ models, MIT licensed)
- 4-layer pricing resolution: User Override > models.dev > LiteLLM > Hardcoded Default
- New model_capabilities DB table for synced capability data
- UI toggle in Settings > AI tab: enable/disable sync, configure interval (1h-7d), manual sync trigger
- Live stats dashboard showing provider/model/capability counts
- New API route /api/settings/models-dev for sync status and manual triggers
- Fix 39 missing i18n keys across all 30 languages (Memory & Skills tab fully translated)
- 25 unit + integration tests, 1,439 existing tests pass, lint clean, typecheck clean
Closes#979
Switch validateGeminiLikeProvider from query-param auth (?key=) to
x-goog-api-key header auth, matching the actual request pipeline.
Parse Google error response bodies to distinguish auth failures
(API_KEY_INVALID, API_KEY_EXPIRED, PERMISSION_DENIED) from other
400 errors. Google returns 400 (not 401/403) for invalid keys.
Add 5 new test cases covering 400/401 rejection paths and success.
Fixes#976
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
The hardcoded BUFFER_TOKENS = 2000 constant inflates prompt_tokens and
input_tokens in every API response, which is helpful for CLI tools that
rely on reported usage to manage context windows but misleading for SDK
users, cost dashboards, and any integration comparing token counts
across providers.
This change makes the buffer configurable via three layered sources
(in priority order):
1. Environment variable: USAGE_TOKEN_BUFFER=0
2. Settings API / Dashboard: PATCH /api/settings { usageTokenBuffer: 0 }
3. Default: 2000 (preserves existing behavior)
Setting the value to 0 disables the buffer entirely, causing OmniRoute
to return raw provider token counts. The setting is cached in-process
with a 30-second TTL and invalidated immediately when updated through
the settings API.
Changes:
- open-sse/utils/usageTracking.ts: replace hardcoded constant with
getBufferTokens() that reads env / DB settings with TTL cache
- src/shared/validation/settingsSchemas.ts: add usageTokenBuffer field
(int, 0–50000) to the Zod update schema
- src/app/api/settings/route.ts: invalidate buffer cache on update
The locateCommand function returned the bare command name instead of
parsing the where output. On Windows, npm global installs create both
a Unix shell script (no extension) and a .cmd wrapper. where returns
both, and the bare name resolves to the Unix script first, causing
healthcheck failures for OpenClaw and OpenCode.
Fix: parse where output and prefer paths with Windows executable
extensions (.cmd, .exe, .bat, .com).
Related: #935, #863
SQLite background asynchronous backups () generate native thread promises that are not implicitly terminated by Node 22's test runner upon suite completion when the DB connection is closed. This causes the CI test job to hang indefinitely. Added cross-env DISABLE_SQLITE_AUTO_BACKUP flag to the test suite.
- Replaces loose string includes check in dnsConfig with strict bound RegExp to silence URL matching heuristic (SSRF).
- Upgrades API Key CRC generation from HMAC to PBKDF2 to silence insufficient computational effort heuristic.
- Add proxy support to all OAuth flows (authorization, token exchange, import)
- Add proxy support to token refresh operations for all providers
- Add proxy support to model synchronization
- Initialize global fetch proxy patch at server startup
- Use Proxy Registry with priority: Provider Proxy → Global Proxy → Direct
- Fix Global Proxy display in settings UI to show proxy from Proxy Registry
Changes:
- open-sse/services/tokenRefresh.ts: Add proxyConfig parameter to all refresh functions
- src/sse/services/tokenRefresh.ts: Resolve proxy before calling refresh functions
- src/app/api/oauth/*/route.ts: Use resolveProxyForProvider for OAuth flows
- src/app/api/providers/[id]/models/route.ts: Add proxy support for model sync
- src/instrumentation-node.ts: Initialize proxy patch at startup
- src/app/api/settings/proxy/route.ts: Read Global Proxy from Proxy Registry
- src/lib/db/proxies.ts: Export resolveProxyForProvider
- src/lib/localDb.ts: Re-export resolveProxyForProvider
- src/models/index.ts: Re-export resolveProxyForProvider
15 files changed, 405 insertions(+), 240 deletions(-)
Co-authored-by: growab <growab@users.noreply.github.com>
The tool was fully defined in schemas/tools.ts and backed by the
working /v1/search endpoint, but server.registerTool() was never
called, leaving it absent from tools/list.
Changes:
- server.ts: add webSearchInput import, handleWebSearch handler, and
server.registerTool("omniroute_web_search") after sync_pricing block
- schemas/tools.ts: align webSearchInput with /v1/search contract --
query max reduced 1000->500, provider narrowed to explicit enum
- essentialTools.test.ts: rewrite web_search stubs to dispatch through
a real McpServer+InMemoryTransport+Client, providing actual handler
coverage (POST method, body fields, error paths, tools/list check)
- vitest.mcp.config.ts: dedicated vitest config for MCP server tests;
update test:vitest script to use it
Note: omniRouteFetch hardcodes AbortSignal.timeout(10000) unconditionally
(server.ts line 126), so caller signals are silently discarded -- the
effective search timeout is 10s. Follow-up PR can add timeoutMs param.
cacheStatsTool and cacheFlushTool are also unregistered; out of scope.
🤖 Generated with Claude Sonnet 4.6 via Claude Code (https://claude.com/claude-code) + Compound Engineering v2.58.1
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Gemini AI Studio no longer has a hardcoded model registry — models come
from API sync. Updated tests:
- T28: assert gemini registry is empty (API sync), check gemini-cli instead
- T31: check antigravity static catalog for pro-high/pro-low model IDs
The comment claimed "leave no text content" but any text chunks streamed
before the SAFETY finish reason have already been emitted. Updated to
accurately describe the behavior.
Stability fixes for the Gemini provider — no new features:
- Map SAFETY/RECITATION/BLOCKLIST finish reasons (gemini-to-openai → content_filter)
- Add 15s per-page timeout on models sync pagination to prevent indefinite hangs
- Fix mime_type → mimeType in inlineData to match Gemini v1beta API (camelCase)
- Fix include_thoughts → includeThoughts to match Gemini API (camelCase)
- Add inlineData/mime_type fallback in gemini-to-openai request translator
Google AI Studio defaults to 50 models per page. Set pageSize=1000
to maximize per-page results and follow nextPageToken to fetch all
available models across pages. Also fixes query param separator when
base URL already contains query string.
Replace inline `provider === "gemini"` checks in chatCore.ts and auth.ts
with shared hasPerModelQuota() and lockModelIfPerModelQuota() from
accountFallback.ts. Also adds max-cooldown preservation to lockModel()
to prevent race conditions from overwriting longer lockouts.
isModelLocked() was called to lock models on 429 but never checked
when selecting connections. Requests to a locked model would still
route to it, causing unnecessary repeated 429s.
Gemini AI Studio enforces per-model quotas. Previously a 429 on
gemini-2.5-pro would mark the entire connection as credits_exhausted,
blocking all models on that API key.
Three-layer fix:
- chatCore: lock model only (not connection) for RATE_LIMITED and
QUOTA_EXHAUSTED errors from Gemini
- auth: early-return with model-only lockout before terminal status
check, so credits_exhausted is never set on the connection
- rateLimitManager: use model-scoped limiter keys for Gemini so the
Bottleneck queue pauses only the affected model, not the connection
- chat: skip markAccountExhaustedFrom429 for Gemini (per-model quotas)
Gemini AI Studio has per-model quota limits. When one model hits its
quota (429), other models on the same API key may still be available.
- Use lockModel() for 429s on Gemini (model-only lockout, connection
stays active for other models)
- Skip markAccountExhaustedFrom429 for Gemini (prevents deprioritizing
the entire connection in credential selection)
- Apply model-only 404 lockout for Gemini too (deprecated/unavailable
models shouldn't disable the provider)
Match reset() behavior — clear lastFailureTime when recovering from
OPEN state to avoid stale timestamps in persisted state.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Show import progress dialog when adding Gemini API key with model
list, completion status, and Close button
- Remove hardcoded Gemini model registry — models come exclusively
from Google API sync per API key
- Hide "Import from /models" button for Gemini (sync handles it)
- Remove server-side fire-and-forget auto-sync (client handles it)
- Categorize synced Gemini models by endpoint type in catalog:
embedding, image, and audio (transcription + speech)
- Add modelsImported and close i18n keys to all 33 languages
Combo handlers call _onSuccess()/_onFailure() directly instead of
execute(), bypassing the OPEN→HALF_OPEN→CLOSED transition. After
resetTimeout elapses, canExecute() returns true and requests flow
through, but _onSuccess() only reset failureCount without changing
state — leaving the breaker permanently OPEN with 0 failures.
Handle OPEN state in both methods: _onSuccess() now transitions
OPEN→CLOSED, _onFailure() skips redundant re-transition.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Auto-sync is fire-and-forget on the server, so the dashboard needs to
poll for model updates after saving a Gemini key (3s delay). On delete,
refresh models immediately since DB cleanup is synchronous.
Removed hardcoded registry fallback for Gemini in dashboard, catalog,
and v1beta endpoints. Without synced models (no API keys), Gemini shows
nothing. Hardcoded entries are always removed from v1beta regardless
of sync result.
Store synced models keyed by providerId:connectionId so each API key's
models are tracked separately. On read, union across all connections.
On connection delete, remove only that key's models. Models with no
remaining connections are automatically excluded.
Add 9 new API-key providers for image, video, and media generation:
- Replicate (rep/) — ML model marketplace with passthrough models
- fal.ai (fal/) — fast inference for generative media
- Novita AI (novita/) — image/video generation platform
- Segmind (seg/) — image generation models
- Runware (rw/) — real-time image generation
- WaveSpeed AI (ws/) — fast media generation
- PiAPI (pi/) — multi-model API gateway
- GoAPI (ggo/) — aggregated AI API access
- LaoZhang AI (lz/) — multi-provider proxy
These providers support image/video/audio endpoints natively via
OpenAI-compatible format, solving the gap where only chat-focused
providers were registered.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Align CacheTrends interface with API response (cachedRequests instead of hits/misses/hitRate)
- Update bar chart rendering to use cachedRequests field from getCacheTrend()
- Remove duplicate CacheStatsCard from cache overview (already in settings AI tab)
- Fix tooltip to use i18n translations instead of hardcoded English
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add "memory" and "skills" entries to HIDEABLE_SIDEBAR_ITEM_IDS and PRIMARY_SIDEBAR_ITEMS
- Add sidebar translation keys for memory and skills pages
- Add complete "memory" and "skills" i18n sections to en.json with all UI labels
- Replace all hardcoded English text in memory/page.tsx with useTranslations() calls
Memory and Skills pages now appear in the sidebar and render with proper i18n support.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Strip reasoning_text/reasoning_content from assistant messages in
GitHub executor to prevent upstream 400 'Invalid signature in
thinking block' errors
- Sync refreshed copilotToken to top-level credentials in
checkAndRefreshToken so buildHeaders() uses the fresh token
instead of the stale one
- Include providerSpecificData in refreshCredentials return value
so onCredentialsRefreshed persists the new token to DB correctly
- Wrap JSON.parse in try/catch in rowToConfig (upstreamProxy.ts)
- Add error logging to empty catch blocks in CliproxyapiToolCard
- Pin Docker image to v6.9.7 instead of :latest
- Validate CLIProxyAPI URL with validateProxyUrl before saving
- Sync cliproxyapi_url and cliproxyapi_fallback_enabled to upstream_proxy_config DB
- Bridge the gap between settings UI and executor data sources
- Add requireManagementAuth to all 6 version-manager API routes
(status, check-update, install, start, stop, restart)
- Parameterize Docker healthcheck port via CLIPROXYAPI_PORT env var
- Improve error message to be actionable when binary is missing
- Add proper TS interfaces to CliproxyapiSettingsTab (replace any/Record)
- Add HTTP status checks and error handling on all fetches
- Add client-side URL validation before saving proxy URL
- Add error state display for version manager service
- Document localhost SSRF exception in isPrivateHost
- Add ALLOWED_COLUMNS whitelist to updateVersionManagerTool
The health check refresh path was silently dropping result.providerSpecificData,
so resourceUrl (returned by Qwen on token refresh) was never persisted to the DB.
Now deep-merges refresh result providerSpecificData into the existing connection data.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Part 1 of CLIProxyAPI integration (#902).
- DB migration for version_manager + upstream_proxy_config tables
- DB modules: versionManager.ts (tool lifecycle CRUD), upstreamProxy.ts (proxy config CRUD + SSRF guard)
- localDb.ts re-exports for new modules
- Provider registration: UPSTREAM_PROXY_PROVIDERS in providers.ts
- Zod validation schemas for version-manager API routes
- Settings UI: CliproxyapiSettingsTab (Advanced tab)
- 47 unit tests for DB modules
Hub-and-spoke flush path was broken for non-OpenAI providers when the
client speaks OpenAI Responses API (e.g. Codex CLI). When
claudeToOpenAIResponse(null) returned null, openaiToOpenAIResponsesResponse
was never called, so response.completed (carrying total_tokens) was never
emitted — causing "missing field total_tokens" errors in Codex.
Two fixes:
- Pass null through to Step 2 translator even when Step 1 produced no
output during flush, so terminal events get emitted
- Capture usage from any chunk carrying it (not just usage-only chunks)
and normalize Chat Completions format to Responses API format
Prevent auto-sync from wiping manually-imported models when the upstream
/models endpoint fails, times out, or returns an empty list. Added
`allowEmpty` option (default false) to replaceCustomModels — callers that
intentionally clear all models (DELETE ?all=true) pass `allowEmpty: true`.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- prop-types: required by 12 component files using PropTypes
- js-yaml: required by openapi spec route
These dependencies were missing from package.json causing build failures.
Added missing cliTools.guides.windsurf.steps[1-5] with title and desc:
- step 1: Open AI Settings
- step 2: Add Custom Provider
- step 3: Base URL (http://127.0.0.1:20128/v1)
- step 4: API Key
- step 5: Select Model
Total: 165 keys across all language files (5 steps × 2 keys × 33 languages)
Prevent auto-sync from wiping manually-imported models when the upstream
/models endpoint fails, times out, or returns an empty list. Added
`allowEmpty` option (default false) to replaceCustomModels — callers that
intentionally clear all models (DELETE ?all=true) pass `allowEmpty: true`.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The wrapInCloudCodeEnvelopeForClaude function converts Claude message
blocks to Antigravity/Gemini parts format but only handles text,
tool_use, and tool_result types. Image blocks (type: 'image' with
base64 source) are silently dropped, causing Claude models routed
through Antigravity to never receive image content.
This adds handling for image blocks, converting them to inlineData
format (the same format the Gemini path already uses successfully).
Tested: Claude via Antigravity now correctly receives and describes
image content that was previously invisible.
- copilot-usage: use future reset date (2026-12-31) to avoid stale
quota window causing remainingPercentage to reset to 100%
- request-log-migration: close SQLite DB before cleanup to release
Windows file locks; remove stale archive dir before second test
The models gemini-2.5-flash-preview-image-generation and
gemini-3.1-flash-image-preview were surfacing in the model catalog
via getAllImageModels() from imageRegistry.ts, not from the live
upstream API. Removed them from the image provider registry.
- Reduce maxOutputTokens from 131072 to 65535 for gemini-3.1-pro-high
and gemini-3.1-pro-low, fixing 400 "invalid argument" errors from
Open WebUI when no max_tokens is specified (upstream limit is 65535)
- Filter non-viable models (gemini-3.1-flash-image-preview,
gemini-2.5-flash-preview-image-generation, gemini-3-pro-high/low)
from the live upstream API response in /api/providers/[id]/models
Models removed from available list (not usable via chat completions):
- gemini-3-pro-high/low — returns empty responses, quota unusable
- gemini-2.5-flash/flash-lite — quota always exhausted on free tier
- gemini-3.1-flash-image-preview — preview variant, not functional
Models hidden from quota UI (in addition to above):
- gemini-3-flash-agent — internal agent model
- gemini-3.1-flash-lite — not usable for chat
Kept gemini-3.1-flash-image in available models (confirmed working).
Removed dead gemini-3-pro from bare Pro ID normalization.
- Standardize cooldown fallback to 2 min (COOLDOWN_MS.rateLimit) in both
chatCore and auth.ts instead of inconsistent 60s/undefined
- Return 504 Gateway Timeout instead of 200 OK when SSE collection times
out, with finish_reason "length" to signal incomplete response
A 429 from one Antigravity model was marking the entire provider connection
as rate-limited, blocking ALL other models on the same account. This happened
in two places: chatCore's error classification (primary) and
markAccountUnavailable (secondary). Both now use model-only lockModel() for
passthrough providers instead of connection-wide rateLimitedUntil.
Also adds:
- Bare Pro model ID normalization (gemini-3-pro → gemini-3-pro-low)
matching OpenClaw convention
- Internal model exclusion list for quota display, matching CLIProxyAPI
- Update model list to match CLIProxyAPI filtered models (remove stale
gemini-2.5-pro, claude-sonnet-4-5, claude-sonnet-4, gemini-2.0-flash;
sort alphabetically)
- Extend 404 model-only lockout to passthrough providers so one missing
model doesn't lock out the entire Antigravity connection
- Always use streaming upstream endpoint (generateContent causes upstream
400 for some models that internally convert to OpenAI format with
stream_options); collect SSE into JSON for non-streaming clients
- Fix unlimited quota models showing 0% by checking q.unlimited before
recalculating percentage
- prop-types: required by 12 component files using PropTypes
- js-yaml: required by openapi spec route
These dependencies were missing from package.json causing build failures.
The check-route-validation script now accepts both validateBody()
and .safeParse() as valid body validation methods. This fixes false
positives for routes using Zod schemas with safeParse().
Added 130 missing keys from en.json:
- a2aDashboard: 46 keys
- agents: 6+ keys
- cliTools.guides notes: continue, kiro, opencode
- And all other missing keys from recent additions
Total: All 33 language files now have full key parity.
Detects mismatched placeholders like {count} vs {pocet} between
source (en.json) and translations. Catches cases where raw placeholders
like {# models} are translated without preserving the placeholder format.
Found 14 issues in cs.json as test case.
Added missing cliTools.guides.windsurf.steps[1-5] with title and desc:
- step 1: Open AI Settings
- step 2: Add Custom Provider
- step 3: Base URL (http://127.0.0.1:20128/v1)
- step 4: API Key
- step 5: Select Model
Total: 165 keys across all language files (5 steps × 2 keys × 33 languages)
Added missing i18n keys for 'strict-random' routing strategy:
- combos.strategyGuide.strict-random: {when, avoid, example}
- combos.strategyRecommendations.strict-random: {title, description, tip1, tip2, tip3}
Total: 264 keys across all language files (8 keys × 33 languages)
These keys were already in pt-BR (incorrectly translated) and are now
aligned with the English fallback values from combos/page.tsx
Added missing step 5 'Use Thinking Variant' to all 33 i18n language files
for cliTools.guides.opencode.steps.5
The step was already defined in CLI_TOOLS constant but the i18n
translations were missing, causing the step title/description to
not display in the UI.
Models without quota data (e.g. tab-completion models) were showing 0%
because remainingFraction defaulted to 0 when absent. Now defaults to 1
so they show 100% remaining instead.
The check-route-validation script now accepts both validateBody()
and .safeParse() as valid body validation methods. This fixes false
positives for routes using Zod schemas with safeParse().
Added 130 missing keys from en.json:
- a2aDashboard: 46 keys
- agents: 6+ keys
- cliTools.guides notes: continue, kiro, opencode
- And all other missing keys from recent additions
Total: All 33 language files now have full key parity.
Detects mismatched placeholders like {count} vs {pocet} between
source (en.json) and translations. Catches cases where raw placeholders
like {# models} are translated without preserving the placeholder format.
Found 14 issues in cs.json as test case.
Added missing cliTools.guides.windsurf.steps[1-5] with title and desc:
- step 1: Open AI Settings
- step 2: Add Custom Provider
- step 3: Base URL (http://127.0.0.1:20128/v1)
- step 4: API Key
- step 5: Select Model
Total: 165 keys across all language files (5 steps × 2 keys × 33 languages)
Added missing i18n keys for 'strict-random' routing strategy:
- combos.strategyGuide.strict-random: {when, avoid, example}
- combos.strategyRecommendations.strict-random: {title, description, tip1, tip2, tip3}
Total: 264 keys across all language files (8 keys × 33 languages)
These keys were already in pt-BR (incorrectly translated) and are now
aligned with the English fallback values from combos/page.tsx
Added missing step 5 'Use Thinking Variant' to all 33 i18n language files
for cliTools.guides.opencode.steps.5
The step was already defined in CLI_TOOLS constant but the i18n
translations were missing, causing the step title/description to
not display in the UI.
- Replace stale model IDs (gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview)
with correct High/Low tier variants from fetchAvailableModels API
(gemini-3-pro-high, gemini-3-pro-low, gemini-3.1-pro-high, gemini-3.1-pro-low, etc.)
- Remove ag/ alias prefix in favor of antigravity/ across registry, providers,
model capabilities, combos, docs, and static model providers
- Make provider alias optional in Zod schema and guard ALIAS_TO_ID/ID_TO_ALIAS maps
- Show raw model IDs in quota display instead of unmapped display names
- Update T28 model catalog test to assert new High/Low tier models
- Add state for custom app name and logo
- Fetch whitelabeling settings from /api/settings
- Listen for whitelabeling changes via settings event
- Display custom app name when set
- Display custom logo (Base64 or URL) when set
- Fall back to default OmniRoute logo and name
- Add dedicated 'appearance' tab to settings navigation
- Move AppearanceTab from general to dedicated tab
- Add whitelabeling fields to Zod schema (customLogoUrl, customLogoBase64)
- Add whitelabeling UI section with:
- App name customization
- Custom logo URL input
- Logo file upload with preview
- Reset to default functionality
- Add i18n keys for whitelabeling features
This allows users to customize the application name and logo for white-labeling purposes.
- Deduplicate in-flight loadCodeAssist requests to prevent thundering herd
- Add typeof guard on cloudaicompanionProject.id before calling .trim()
- Evict oldest cache entry when all entries are still valid
- Fix unwrapGeminiChunk to use explicit null-safe check
- Update test description for null response case
The stored projectId from OAuth connection time becomes stale because the
Cloud Code API rotates free-tier projects. Native Gemini CLI refreshes the
project every 30 seconds via loadCodeAssist — OmniRoute never did, causing
403 "has not been used in project X" errors that permanently banned accounts.
- Add refreshProject() to GeminiCLIExecutor that calls loadCodeAssist API
with 10s timeout and caches the result for 30s (matching native CLI)
- transformRequest() replaces the stale projectId in the envelope before
sending to the Cloud Code API, falling back to the stored ID on failure
- Make transformRequest calls await-compatible in base executor and
all subclasses (antigravity, cursor, kiro) so async overrides work
- Remove x-goog-user-project header and executor-level project override
that caused 403 "Cloud Code Private API has not been used in project X"
- Add PROJECT_ROUTE_ERROR classifier type so project-routing 403s don't
permanently ban accounts (keeps accounts active, tracks the error)
- Fix Cloud Code envelope unwrapping for content accumulation in stream.ts
(Cloud Code wraps responses in { response: { candidates: [...] } })
- Extract unwrapGeminiChunk() into streamHelpers.ts with format guard
- Remove _currentModel singleton race condition from GeminiCLIExecutor
- Add handler for PROJECT_ROUTE_ERROR in chatCore.ts
- Add TODO in antigravity.ts about same stale-project risk
- Add 7 unit tests for error classifier and stream unwrap paths
* test(settings): add unit tests for debugMode and hiddenSidebarItems
Tests cover:
- PATCH debugMode=true/false
- PATCH hiddenSidebarItems with array values
- Combined updates with both fields
* test(e2e): add Playwright tests for settings toggles
Tests cover:
- Debug mode toggle on/off
- Sidebar visibility toggle
- Settings persistence after page reload
* fix(tests): address code review issues
- Unit tests: fix async/await for getSettings, use direct db functions
- E2E tests: remove conditional logic, use Playwright auto-waiting assertions
* feat(logging): unify request log retention and artifacts
* docs: add dashboard settings toggles to CONTRIBUTING
Add section documenting:
- Debug Mode toggle (Settings → Advanced)
- Sidebar Visibility toggle (Settings → General)
* fix(cache): only inject prompt_cache_key for supported providers
Only inject prompt_cache_key for providers that support prompt caching
(Claude, Anthropic, ZAI, Qwen, DeepSeek). This fixes issue #848 where
NVIDIA API rejected the parameter.
* fix(model-sync): log only channel-level model changes
* feat(providers): add 4 free models to opencode-zen
* feat(providers): add explicit contextLength for opencode-zen free models
* feat(providers): add contextLength for all opencode-zen models
* feat: Improve the Chinese translation
* fix: preserve client cache_control for all Claude-protocol providers
Previously, the cache control preservation logic only recognized a
hardcoded list of providers (claude, anthropic, zai, qwen, deepseek).
This caused OmniRoute to inject its own cache_control markers for
Claude-protocol providers not in that list (bailian-coding-plan, glm,
minimax, minimax-cn, etc.), overwriting the client's cache markers.
The fix checks both:
1. Known caching providers list (existing behavior)
2. Whether targetFormat === 'claude' (all Claude-protocol providers)
This ensures all Claude-compatible providers properly preserve client
cache_control headers when appropriate (Claude Code client, deterministic
routing, etc.).
Also removes unused CacheStatsCard from settings/components (duplicate
of the one in cache/ page).
Fixes cache token calculation for GLM, Minimax, and other Claude-compatible providers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: pure passthrough for Claude→Claude when cache_control preserved
The Claude passthrough path round-trips through OpenAI format
(claude→openai→claude) for structural normalization. This strips
cache_control markers from every content block since OpenAI format
has no equivalent, causing ~42k cache creation tokens per request
with zero cache reads.
When preserveCacheControl is true (Claude Code client, "always"
setting, or deterministic combo), skip the round-trip entirely and
forward the body as-is. Claude Code sends well-formed Messages API
payloads — the normalization was only needed for non-Code clients.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: restore CacheStatsCard — was not a duplicate
The first commit incorrectly deleted CacheStatsCard from
settings/components/ as a "duplicate". It's the only copy — both
settings/page.tsx and cache/page.tsx import from this location.
Restored the i18n-ized version from main.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(429): parse long quota reset times from error body
- Parse XhYmZs format from antigravity error messages (e.g., 27h41m36s)
- Dynamic retry-after threshold (60s default) instead of hardcoded 10s
- Add parseRetryFromErrorText() in accountFallback.ts for body parsing
- Fix 403 'verify your account' to trigger permanent deactivation
- Add keyword matching for 'quota will reset', 'exhausted capacity'
- Add unit tests for retry parsing and keyword matching
Fixes#858 (Antigravity 429 handling)
Fixes#832 (Qwen quota 429 - same underlying bug)
* chore: bump version to v3.4.0-dev
* fix(migrations): rename 013 to 014 to avoid collision with v3.3.11
* chore(docs): update CHANGELOG for v3.4.0 integrations
* fix: Claude token refresh, Antigravity quota, and 429 rate-limit handling
- Fix Claude OAuth token refresh to use form-urlencoded format (standard OAuth2)
- Add anthropic-beta header required by Claude OAuth API
- Switch Antigravity quota to use retrieveUserQuota API (same as Gemini CLI)
- Parse quota reset time for all providers (not just Antigravity)
- Add quota reset keywords to error classifier
- Cap maximum retry time at 24 hours to prevent infinite wait
Closes#836, #857, #858, #832
* fix(dashboard): resolve /dashboard/limits hanging UI with 70+ accounts via chunk parallelization (#784)
---------
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Co-authored-by: R.D. <rogerproself@gmail.com>
Co-authored-by: kang-heewon <heewon.dev@gmail.com>
Co-authored-by: gmw <rorschach1167@qq.com>
Co-authored-by: tombii <github@tombii.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
- Fix Claude OAuth token refresh to use form-urlencoded format (standard OAuth2)
- Add anthropic-beta header required by Claude OAuth API
- Switch Antigravity quota to use retrieveUserQuota API (same as Gemini CLI)
- Parse quota reset time for all providers (not just Antigravity)
- Add quota reset keywords to error classifier
- Cap maximum retry time at 24 hours to prevent infinite wait
Closes#836, #857, #858, #832
- Parse XhYmZs format from antigravity error messages (e.g., 27h41m36s)
- Dynamic retry-after threshold (60s default) instead of hardcoded 10s
- Add parseRetryFromErrorText() in accountFallback.ts for body parsing
- Fix 403 'verify your account' to trigger permanent deactivation
- Add keyword matching for 'quota will reset', 'exhausted capacity'
- Add unit tests for retry parsing and keyword matching
Fixes#858 (Antigravity 429 handling)
Fixes#832 (Qwen quota 429 - same underlying bug)
The first commit incorrectly deleted CacheStatsCard from
settings/components/ as a "duplicate". It's the only copy — both
settings/page.tsx and cache/page.tsx import from this location.
Restored the i18n-ized version from main.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Claude passthrough path round-trips through OpenAI format
(claude→openai→claude) for structural normalization. This strips
cache_control markers from every content block since OpenAI format
has no equivalent, causing ~42k cache creation tokens per request
with zero cache reads.
When preserveCacheControl is true (Claude Code client, "always"
setting, or deterministic combo), skip the round-trip entirely and
forward the body as-is. Claude Code sends well-formed Messages API
payloads — the normalization was only needed for non-Code clients.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously, the cache control preservation logic only recognized a
hardcoded list of providers (claude, anthropic, zai, qwen, deepseek).
This caused OmniRoute to inject its own cache_control markers for
Claude-protocol providers not in that list (bailian-coding-plan, glm,
minimax, minimax-cn, etc.), overwriting the client's cache markers.
The fix checks both:
1. Known caching providers list (existing behavior)
2. Whether targetFormat === 'claude' (all Claude-protocol providers)
This ensures all Claude-compatible providers properly preserve client
cache_control headers when appropriate (Claude Code client, deterministic
routing, etc.).
Also removes unused CacheStatsCard from settings/components (duplicate
of the one in cache/ page).
Fixes cache token calculation for GLM, Minimax, and other Claude-compatible providers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Only inject prompt_cache_key for providers that support prompt caching
(Claude, Anthropic, ZAI, Qwen, DeepSeek). This fixes issue #848 where
NVIDIA API rejected the parameter.
- Sync debugMode with showDebug in Sidebar (was using enableRequestLogs env var)
- Only render debug-section sidebar toggles in AppearanceTab when debugMode=true
- Sidebar filters debug-section items based on debugMode (was already correct)
- Debug toggle now triggers omniroute:settings-updated event for instant sidebar update
EOF
- Add authentication to /api/cache/entries (GET and DELETE)
- Add authentication to /api/cache (GET and DELETE)
- Validate trendHours: clamp to 1-720 range
- Fix estimatedCostSaved bug: use calculated value instead of hardcoded 0
- Match CacheStatsCard header size with other card headers (text-sm)
- Show request counts in cache hit rate label (116/359) for clarity
- Add CacheStatsCard component to cache page for cumulative metrics
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update DEFAULT_PRICING key from 'gc' to 'gemini-cli' so pricing
lookups work with the new alias
- Restore gc -> gemini-cli in FALLBACK_ALIAS_TO_PROVIDER for backward
compatibility (existing saved configs with gc/ prefix still resolve)
Gemini CLI clients use bare model names, not provider-prefixed IDs.
The gc/ alias was opaque — gemini-cli/ is self-documenting. Since
alias now equals provider ID, the dual-prefix duplication logic
naturally skips Gemini CLI (no duplicate gemini-cli/ entries).
Auto-opening the "Add Connection" dialog when navigating to a provider
with zero connections was a poor UX pattern. It surprised users who were
simply browsing provider details (e.g. after deleting a connection or
checking settings). The page already displays a clear empty state with
an "Add Connection" button — users should click it when ready.
- Add early return guard for missing accessToken in getGeminiUsage
- Add 10s fetch timeout (AbortSignal.timeout) on retrieveUserQuota calls
- Clamp used value with Math.max(0, ...) for non-negative display
- Use full accessToken as cache key instead of truncated prefix
- Replace catch(err: any) with instanceof Error check in models route
Replace stub getGeminiUsage() with per-model quota fetching from Google
Cloud Code Assist's retrieveUserQuota endpoint (same API the official
Gemini CLI /stats command uses). Fixes OAuth env var name, aligns model
list with official Gemini CLI VALID_GEMINI_MODELS, and makes "Import
from /models" discover new models via the quota endpoint.
Implement DiversityScoreCard component to fetch and display provider diversity score with loading state and conditional styling, integrate it into AnalyticsPage overview, and add a new API route at src/app/api/analytics/diversity/route.ts to return the diversity report using getDiversityReport
Integration test was failing because CacheStatsCard was moved from
settings page to cache page in previous commit. Split the test into
two separate describe blocks for accurate page-specific verification.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix getLoggedInputTokens to return full prompt_tokens (input + cache_read + cache_creation)
- Fix usageExtractor for non-streaming Claude responses to calculate total correctly
- Add formatUsageLog helper to show CR=<cache_read> in logs
- Add migration 012 to fix historical token counts in usage_history
- Move prompt cache metrics from Settings to /dashboard/cache page
Per Claude API docs:
Total input tokens = input_tokens + cache_creation_input_tokens + cache_read_input_tokens
Fixes issue where totalInputTokens (71k) was less than totalCacheCreationTokens (1.35M).
Tested:
- All 1134 unit tests pass
- Cache metrics API returns correct totals
- Migration is idempotent and tracked in _omniroute_migrations
- Logs show cache read tokens: 'in=6055 | out=211 | CR=22399'
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Use ProviderIcon for internal .png paths solving SVG provider 404 images (#745).
- Add id-token: write and packages: write permissions to .github/workflows/electron-release.yml to fix permissions denied failure when calling the reusable workflow npm-publish.yml (#761).
- Fix tests and ESM resolution for autoUpdate.ts override logic.
Add a standardized degradation pattern for services depending on external
systems. withDegradation() tries primary → fallback → safe default,
tracking status in a global registry for dashboard visibility.
Features:
- Async and sync variants
- Global registry with per-feature status tracking
- Degradation levels: full → reduced → minimal → default
- Summary and report APIs for dashboard integration
- Reason tracking for debugging
Example: Rate limiting degrades from Redis → in-memory → permissive
instead of crashing when Redis is unavailable.
Closes#799
Add configAudit module that records every change to provider connections,
combos, and routing policies with:
- Before/after state snapshots
- Structured diff (added, removed, changed keys)
- Source tracking (dashboard, API, sync, auto-healing)
- Filtered retrieval with pagination
- Rollback state extraction
- Configuration snapshot export for backup
Enables traceability and quick rollback when config changes cause issues.
Closes#791
Add providerExpiration module to track OAuth token, subscription, and
API credit expiration dates per provider connection. Provides:
- setExpiration() / getExpiration() for CRUD operations
- getExpiringSoon() for proactive alerts
- getExpirationSummary() for dashboard health display
- detectExpirationFromResponse() for auto-detection from HTTP headers
- Status classification: active → expiring_soon → expired
Prevents silent failures from expired credentials by alerting operators
before tokens/subscriptions expire.
Closes#790
Add a providerDiversity module that tracks provider usage distribution
using a rolling time window and calculates Shannon entropy normalized
to [0..1]. This enables the auto-combo scoring engine to factor in
provider diversity — boosting underrepresented providers to reduce
single-point-of-failure risk.
Key features:
- Rolling window with configurable size and TTL
- Shannon entropy calculation normalized to [0..1]
- Per-provider diversity boost for auto-combo integration
- Diversity report for dashboard display
- Full test coverage
Closes#788
- Add windsurf and copilot entries to toolDescriptions in all 33 locale files
to fix MISSING_MESSAGE errors on the CLI Tools page (#748)
- Apply FETCH_TIMEOUT_MS to streaming requests' initial fetch() call to prevent
300s TCP default timeout causing silent failures on long-running requests (#769)
- Previously only non-streaming requests had timeout protection; streaming requests
relied solely on stream idle detection which doesn't cover initial connection hangs
Pre-existing any-budget violations in chatCore.ts (6), combo.ts (2), and
embeddings.ts (1 false positive in comment) — none introduced by GLM work.
Replace `as any` with `Record<string, unknown>` casts and reword comment.
Also removes docs/superpowers audit worksheet from git tracking (not part
of GLM Coding provider changes).
The model sync scheduler and sync-models endpoint were blindly
replacing custom models with all fetched models, including ones
already in the built-in registry. Now filters out registry models
before saving to custom models.
Compare fetched models against existing custom models AND built-in
registry models before posting. Only new models trigger
POST /api/provider-models calls. Shows skip count in import progress
when some models already exist. Adds i18n keys for all locales.
Live-tested all GLM Coding models against the /api/coding/paas/v4
endpoint. glm-4.7-flashx returns 429 "Insufficient balance or no
resource package" and is not listed on the /models endpoint.
All other models (glm-5.1, glm-5, glm-5-turbo, glm-4.7, glm-4.7-flash,
glm-4.6v, glm-4.6, glm-4.5v, glm-4.5, glm-4.5-air) return 200.
The "Import from /models" button was using the wrong Z.AI API surface
(Anthropic-compatible /api/anthropic/v1/models with x-api-key auth).
Switched to the correct Coding API endpoints with Authorization: Bearer
auth, matching the pattern used by the quota/usage tracking code.
- International: https://api.z.ai/api/coding/paas/v4/models
- China: https://open.bigmodel.cn/api/coding/paas/v4/models
- Auth: Authorization: Bearer <token> (not x-api-key)
- Region sourced from providerSpecificData.apiRegion
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Record the family-by-family GLM Coding audit, add regression coverage, and fix the documented GLM-5.1 context window override.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add missing glm-4.7-flashx variant to provider registry (confirmed in
Z.AI official GLM-4.7 overview docs as one of three variants)
- Remove glm-4.7/glm4.7 from tool calling denylist — official docs
explicitly show GLM-4.7 supporting function calling with tools param
- Add estimated pricing for glm-4.7-flashx ($0.3/$1.1) between free
Flash and standard 4.7 tiers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(stream): normalize delta.reasoning to reasoning_content in SSE streaming
NVIDIA kimi-k2.5 (and potentially other providers) send reasoning
tokens as `delta.reasoning` in SSE streaming chunks instead of the
standard OpenAI `delta.reasoning_content` field. This caused reasoning
content to be silently dropped during stream passthrough — clients
received only the final answer with no reasoning separation.
The non-streaming sanitizer (responseSanitizer.ts) already handled this
alias, but the streaming pipeline did not.
Fix applied in 4 locations:
- stream.ts passthrough: normalize + force re-serialize sanitized chunk
- stream.ts translate: accumulate reasoning from delta.reasoning
- sseParser.ts: collect delta.reasoning in parseSSEToOpenAIResponse
- streamPayloadCollector.ts: collect delta.reasoning in buildOpenAISummary
* fix: eliminate injectedUsage reuse bug and add reasoning alias tests
- Detect delta.reasoning alias before sanitizeStreamingChunk() which
already normalizes it, removing dead post-sanitization normalization
- Replace injectedUsage reuse with separate needsReserialization flag
so reasoning re-serialization cannot block finish_reason/usage
mutations on the same SSE chunk (fixes CRITICAL review finding)
- Add unit test for parseSSEToOpenAIResponse reasoning alias
- Add unit test for buildStreamSummaryFromEvents reasoning alias
* fix(stream): separate reasoning from content in passthrough response body
The passthroughAccumulatedContent variable was mixing delta.content and
delta.reasoning_content into one string, causing the client_response
log and responseBody to lose reasoning separation.
- Add passthroughAccumulatedReasoning accumulator for reasoning deltas
- Set message.reasoning_content in responseBody when reasoning exists
- Only accumulate delta.content into passthroughAccumulatedContent
* fix: trim leading whitespace from assembled content in log summaries
NVIDIA and other providers emit token deltas with leading spaces
(e.g. ' The', ' user'). When joined, these produce a leading space in
the provider_response and parsed non-streaming response logs. Trim
the joined content and reasoning_content in both buildOpenAISummary
and parseSSEToOpenAIResponse for consistent log output.
* fix(stream): split combined reasoning+content deltas into separate SSE events
Some providers (e.g. NVIDIA NIM) send transition chunks with both
`delta.reasoning` and `delta.content` in the same SSE event.
After sanitization this becomes `reasoning_content` + `content`,
which violates the standard OpenAI streaming contract where these
fields are never mixed. Clients using if/else logic (LobeChat, etc.)
skip content when reasoning_content is present, losing the first
content token.
Split such combined chunks into two separate SSE events:
1. Reasoning-only event (finish_reason=null, no usage)
2. Content-only event (carries finish_reason and usage)
Models like antigravity/claude-sonnet-4-6 route through Google's internal
Cloud Code API which returns HTTP 400 when thinking/reasoning parameters
are included in the request body.
Changes:
- open-sse/services/modelCapabilities.ts: add supportsReasoning() function
with a denylist of known-unsupported patterns (antigravity/claude-sonnet-*)
and a registry-based lookup hook (supportsReasoning flag per model)
- open-sse/services/thinkingBudget.ts: in applyThinkingBudget(), add early
exit before the mode switch — if model string is present and
supportsReasoning() returns false, call stripThinkingConfig() immediately
regardless of the configured ThinkingMode
This is fully backward-compatible: models not in the denylist are unaffected,
and the supportsReasoning registry flag defaults to null (pass-through).
Fixes: HTTP 400 errors on antigravity provider when client sends requests
with thinking/reasoning budget parameters (e.g. claude-sonnet-4-6 via AG).
Co-authored-by: oyi77 <oyi77@github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
* feat: auto-disable banned accounts setting with UI toggle
Add a configurable setting to automatically disable provider accounts
that return permanent/terminal errors (403 banned, ToS violation, etc.)
Changes:
- open-sse/services/accountFallback.ts: extend ACCOUNT_DEACTIVATED_SIGNALS
with AG-specific ban messages ('verify your account', 'service disabled
for violation')
- src/app/api/settings/auto-disable-accounts/route.ts: new GET/PUT endpoint
for the setting (enabled bool + threshold int)
- src/shared/validation/schemas.ts: updateAutoDisableAccountsSchema
- src/sse/services/auth.ts: in markAccountUnavailable(), capture result.permanent
from checkFallbackError() and — when autoDisableBannedAccounts is enabled and
backoffLevel >= threshold — set isActive=false on the connection
Default: disabled (backward-compatible). Enable via Settings UI or PUT
/api/settings/auto-disable-accounts { "enabled": true, "threshold": 3 }
Fixes: antigravity accounts with 403/Verify-your-account errors being
retried indefinitely in the rotation pool.
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
* fix: address reviewer comments for auto-disable (use getCachedSettings, immediate disable on permanent bans)
---------
Co-authored-by: oyi77 <oyi77@github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
NVIDIA and other providers emit token deltas with leading spaces
(e.g. ' The', ' user'). When joined, these produce a leading space in
the provider_response and parsed non-streaming response logs. Trim
the joined content and reasoning_content in both buildOpenAISummary
and parseSSEToOpenAIResponse for consistent log output.
The passthroughAccumulatedContent variable was mixing delta.content and
delta.reasoning_content into one string, causing the client_response
log and responseBody to lose reasoning separation.
- Add passthroughAccumulatedReasoning accumulator for reasoning deltas
- Set message.reasoning_content in responseBody when reasoning exists
- Only accumulate delta.content into passthroughAccumulatedContent
- Detect delta.reasoning alias before sanitizeStreamingChunk() which
already normalizes it, removing dead post-sanitization normalization
- Replace injectedUsage reuse with separate needsReserialization flag
so reasoning re-serialization cannot block finish_reason/usage
mutations on the same SSE chunk (fixes CRITICAL review finding)
- Add unit test for parseSSEToOpenAIResponse reasoning alias
- Add unit test for buildStreamSummaryFromEvents reasoning alias
NVIDIA kimi-k2.5 (and potentially other providers) send reasoning
tokens as `delta.reasoning` in SSE streaming chunks instead of the
standard OpenAI `delta.reasoning_content` field. This caused reasoning
content to be silently dropped during stream passthrough — clients
received only the final answer with no reasoning separation.
The non-streaming sanitizer (responseSanitizer.ts) already handled this
alias, but the streaming pipeline did not.
Fix applied in 4 locations:
- stream.ts passthrough: normalize + force re-serialize sanitized chunk
- stream.ts translate: accumulate reasoning from delta.reasoning
- sseParser.ts: collect delta.reasoning in parseSSEToOpenAIResponse
- streamPayloadCollector.ts: collect delta.reasoning in buildOpenAISummary
Add a configurable setting to automatically disable provider accounts
that return permanent/terminal errors (403 banned, ToS violation, etc.)
Changes:
- open-sse/services/accountFallback.ts: extend ACCOUNT_DEACTIVATED_SIGNALS
with AG-specific ban messages ('verify your account', 'service disabled
for violation')
- src/app/api/settings/auto-disable-accounts/route.ts: new GET/PUT endpoint
for the setting (enabled bool + threshold int)
- src/shared/validation/schemas.ts: updateAutoDisableAccountsSchema
- src/sse/services/auth.ts: in markAccountUnavailable(), capture result.permanent
from checkFallbackError() and — when autoDisableBannedAccounts is enabled and
backoffLevel >= threshold — set isActive=false on the connection
Default: disabled (backward-compatible). Enable via Settings UI or PUT
/api/settings/auto-disable-accounts { "enabled": true, "threshold": 3 }
Fixes: antigravity accounts with 403/Verify-your-account errors being
retried indefinitely in the rotation pool.
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Gemini API returns 400 error when tools have enum constraints on integer/number types:
"enum: only allowed for STRING type"
This fix removes enum constraints for integer and number types in JSON schemas
before sending to Gemini API, while keeping enum for string types.
Fixes tools like mcp__pointer__get-pointed-element that use integer enums
for cssLevel and textDetail parameters.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Add `validateResponseQuality()` to detect empty/invalid 200 responses from
upstream providers in combo routing. Non-streaming responses with empty body,
invalid JSON, or missing content/tool_calls now trigger circuit breaker
failure and fallback to the next model instead of being returned to the client.
- Add missing `breaker._onSuccess()` calls in both priority and round-robin
combo paths. Previously failures accumulated without reset, causing premature
circuit breaker trips on healthy models.
- Update Cursor provider registry with Claude 4.6 model IDs (opus-high,
sonnet-high, haiku, opus + thinking variants). Keep 4.5 IDs for backward
compatibility.
- Update free-stack preset: replace duplicate qw/qwen3-coder-plus with
if/deepseek-v3.2 for better model diversity.
- Add paid-premium combo template for round-robin load distribution across
paid subscription providers (Cursor, Antigravity).
Made-with: Cursor
* chore(release): v3.2.8 — Docker auto-update UI and cache analytics fixes
* fix(sse): remove race condition in cache metrics tracking (#758)
- Remove in-memory metrics tracking (currentMetrics, trackCacheMetrics, updateCacheMetrics)
- Cache metrics now computed on-the-fly from usage_history table (single source of truth)
- Fixes CRITICAL issue from code review: concurrent requests overwriting metrics
- Fixes WARNING: duplicate metric tracking logic in streaming/non-streaming paths
Ref: PR #752 (merged before this fix was included)
* fix: handle allRateLimited credentials & forward extra body keys in embeddings/images routes (#757)
* fix: handle allRateLimited credentials in embeddings and images routes
When getProviderCredentials() returns an allRateLimited object (truthy,
but without apiKey/accessToken), the embeddings and images routes
incorrectly passed it to handlers as valid credentials. The handlers
then sent upstream requests without Authorization headers, causing
401 errors from providers (e.g. NVIDIA NIM).
This only manifested under concurrent requests: a chat/completions
call could trigger rate limiting on a provider account, and a
simultaneous embeddings request would receive the allRateLimited
sentinel — but treat it as valid credentials.
The chat pipeline already handled this case correctly. This commit
adds the same allRateLimited guard to all affected routes:
- POST /v1/embeddings
- POST /v1/providers/{provider}/embeddings
- POST /v1/images/generations
- POST /v1/providers/{provider}/images/generations
Also adds a defense-in-depth guard in the embeddings handler itself:
if no auth token is available for a non-local provider, return 401
immediately instead of sending an unauthenticated request upstream.
Made-with: Cursor
* fix(embeddings): forward extra body keys to upstream providers
The embeddings handler only forwarded model, input, dimensions, and
encoding_format to upstream providers, silently dropping any additional
fields. This broke asymmetric embedding APIs (e.g. NVIDIA NIM
nv-embedqa-e5-v5) that require input_type, and other providers
expecting user or truncate parameters.
Add a KNOWN_FIELDS exclusion set and forward all unrecognized body
keys to the upstream request, matching the passthrough pattern used
by the chat pipeline's DefaultExecutor.transformRequest().
Made-with: Cursor
* fix(auth): redirect and unconditional 401 on disabled requireLogin + fix test cases
* fix(build): remove legacy proxy.ts causing Next.js build collision
* fix(build): revert middleware.ts rename to proxy.ts because of Next.js Edge constraints
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: tombii <tombii@users.noreply.github.com>
Co-authored-by: Gorchakov-Pressure <117600961+Gorchakov-Pressure@users.noreply.github.com>
- Add usage_history table creation in test setup
- Simplify byStrategy query to avoid non-existent combo_strategy column
- Update test assertions to work with existing test data
1. Fixed crash in /api/cli-tools/status when statuses[toolId] is undefined
- Added null check before accessing statuses[toolId] properties
- Prevents "Cannot set property of undefined" error
2. Added support for droid.exe detection in ~/bin directory
- Added ~/bin and ~/.local/bin to EXPECTED_PARENT_PATHS
- Added droid.exe variant to toolBins for Windows
- Added specific path check for droid in ~/bin/droid.exe
These fixes resolve issues where CLI tools (Claude Code, Codex, Droid)
were showing as "not installed" even when properly installed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds intelligent cache control preservation for Claude Code clients:
- New cacheControlPolicy.ts module with detection logic:
- isClaudeCodeClient(): Detects Claude Code via User-Agent
- providerSupportsCaching(): Checks provider (claude, anthropic, zai, qwen)
- isDeterministicStrategy(): Identifies priority/cost-optimized strategies
- shouldPreserveCacheControl(): Main policy decision
- Cache control is preserved when:
1. Client is Claude Code (detected via User-Agent)
2. Provider supports prompt caching
3. Request routing is deterministic:
- Single model requests (always)
- Combo with priority or cost-optimized strategy only
- Updated translator to accept preserveCacheControl option
- Updated chatCore and chat handler to propagate combo strategy
- Added comprehensive unit tests (24 tests)
Non-deterministic combo strategies (weighted, round-robin, random, etc.)
continue to use OmniRoute's managed caching strategy.
Refs: #cache-control-preservation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OpenCode Go mixes request protocols by model family:
- `glm-5` and `kimi-k2.5` use OpenAI-style `/chat/completions`
- `minimax-m2.5` and `minimax-m2.7` use Anthropic-style `/messages`
OmniRoute already routed MiniMax Go models to `/messages`, but the
executor still sent `Authorization: Bearer ...`, which caused upstream
`401 Missing API key` errors.
This changes `OpencodeExecutor` to send:
- `x-api-key` + `anthropic-version` for Claude-targeted OpenCode Go requests
- `Authorization: Bearer ...` for the remaining OpenCode Go request formats
Also updates unit coverage to assert the correct header behavior for
MiniMax Go models.
Validated with:
- direct curl repro against OpenCode Go endpoints
- `node --import tsx/esm --test tests/unit/opencode-executor.test.mjs`
- `npm run typecheck:core`
- `npm run build`
HTTP 400 "invalid argument" was triggered when OmniRoute translated tool calls
to Gemini format, because thoughtSignature was injected onto every functionCall
part unconditionally. Affects two code paths:
1. openai-to-gemini.ts — OpenAI tool_calls → Gemini functionCall
2. claude-to-gemini.ts — Claude tool_use blocks → Gemini functionCall
thoughtSignature is only valid on thinking/reasoning parts (those with
thought: true or thoughtSignature as their primary field). When present on a
functionCall part, the Gemini API returns HTTP 400 'invalid argument'.
The thinking parts that legitimately carry thoughtSignature (emitted when a
message has reasoning_content / thinking blocks) are untouched.
Regression tests (T43) cover:
- single tool call: no thoughtSignature on functionCall part (openai path)
- multiple tool calls: none carry thoughtSignature (openai path)
- thinking regression guard: thoughtSignature still on thought parts
- claude-to-gemini path: tool_use blocks produce clean functionCall parts
Fixes#724
HTTP 400 "invalid argument" was triggered when OmniRoute translated OpenAI
tool_calls to Gemini format, because thoughtSignature was injected onto every
functionCall part unconditionally.
thoughtSignature is only valid on thinking/reasoning parts (those with
thought: true). The Gemini API rejects any request where a functionCall
part carries a thoughtSignature field, returning HTTP 400.
Fix: remove the thoughtSignature field from functionCall parts. The thinking
parts that legitimately require thoughtSignature (emitted when a message has
reasoning_content) are unchanged.
Adds regression test (T43) with three cases:
- single tool call: no thoughtSignature on functionCall part
- multiple tool calls: none carry thoughtSignature
- thinking part regression guard: thoughtSignature still present on thought parts
Fixes#725
When all combo models are exhausted (502/503), OmniRoute now checks for
a globalFallbackModel setting and attempts one last request through it
before returning the error. Settings stored in key_value table, no
migration needed.
Non-streaming: Fixed json.messages check to use json.choices[0].message
(OpenAI format). Streaming: inject pin tag before finish_reason chunk for
tool-call-only streams. injectModelTag now appends synthetic assistant
message when content is null/array (tool_calls).
Adds a new /dashboard/cache page that surfaces the existing but UI-less
semantic cache infrastructure.
Changes:
- New page: src/app/(dashboard)/dashboard/cache/page.tsx
- Live stats: memory entries, DB entries, cache hits, tokens saved
- Hit rate progress bar with color coding (green/yellow/red)
- Hits/Misses/Total breakdown
- Idempotency layer stats (active dedup keys + window)
- Cache behavior info panel
- Clear All button
- Auto-refresh every 10s
- Enhanced API: src/app/api/cache/route.ts
- DELETE ?model=<name> — invalidate by model
- DELETE ?signature=<hex> — invalidate single entry
- DELETE ?staleMs=<ms> — invalidate entries older than N ms
- DELETE (no params) — clear all (existing behavior)
- Sidebar: added Cache nav item (icon: cached)
- i18n: added cache + sidebar.cache keys for all 31 supported locales
No new dependencies. All functionality builds on existing semanticCache.ts,
cacheLayer.ts, and idempotencyLayer.ts modules.
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
- Add codexAuthFile.ts utility: builds Codex auth.json payload from OAuth connection
(id_token, access_token, refresh_token, account_id) with auto-refresh if expired
- Add POST /api/providers/[id]/codex-auth/export: downloads auth.json file
- Add POST /api/providers/[id]/codex-auth/apply-local: writes auth.json to local CLI path
- Add 'Apply auth' and 'Export auth' buttons to ConnectionRow (Codex provider only)
- Add i18n keys for en and pt-BR
- add unit tests for API auth, display/error utilities, login bootstrap,
model combo mappings, provider validation branches, and usage analytics
- add COVERAGE_PLAN.md and extend CONTRIBUTING.md with coverage notes and
workflow guidance
- update package.json to adjust test:coverage thresholds and add coverage:report;
include c8 as a devDependency
- introduce test scaffolding and ensure compatibility with existing test runners
- align tests with open-sse changes and improve overall test coverage planning
- introduce open-sse/translator/helpers/schemaCoercion.ts to coerce
numeric JSON Schema fields encoded as strings
- wire coerceToolSchemas and sanitizeToolDescriptions into translator
pipeline; ensure tool descriptions are sanitized
- inject empty reasoning content for tool calls when target is OpenAI
format
- update qwen base URL to DashScope-compatible endpoint
- extend antigravity static catalog with Gemini 3.1 pro preview models and
update Gemini model specs with preview aliases
- implement call log max cap caching with TTL; expose invalidateCallLogsMaxCache
and invalidate on settings PATCH
- add tests: call-log-cap.test.mjs and tool-request-sanitization.test.mjs;
extend tests for Windsurf integration and gemini previews
- update CLI runtime and tools to include Windsurf as a guide-only tool
- add maxCallLogs to validation schemas (settings and updateSettings)
- add Czech README (README.cs.md) to repository
When Claude Code routes through OmniRoute (Claude → OmniRoute → Claude),
OmniRoute was stripping all cache_control markers and replacing them with
its own generic caching strategy. This broke Claude Code's carefully
placed cache breakpoints for plans and other features.
Changes:
- Add preserveCacheControl parameter to prepareClaudeRequest()
- Detect Claude passthrough mode (sourceFormat === targetFormat === CLAUDE)
- Skip cache_control normalization when preserveCacheControl=true
- Preserve client's cache_control markers in system, messages, and tools
This ensures Claude Code's prompt caching optimization works correctly
while maintaining OmniRoute's caching strategy for translation scenarios.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The 'Import from /models' button failed because opencode-zen was not
registered in PROVIDER_MODELS_CONFIG. The provider's API at
https://opencode.ai/zen/v1/models returns standard OpenAI-compatible
format and is now properly configured for model import.
- Fix usageExtractor cached_tokens fallback for Responses API (use cache_read_input_tokens when input_tokens_details is absent)
- Fix truncated claude-native-passthrough-tools.test.mjs that caused parse error
Reviewed and approved via consolidated analysis. Turkish locale (31st language) follows existing i18n patterns perfectly. Registered in config.ts, generate-multilang.mjs, and full tr.json translation file.
Reviewed and approved via consolidated analysis. GLM-5.1 addition and pricing corrections match official Z.AI pricing page. All 5 files follow existing patterns.
Reviewed and approved via consolidated analysis. Fix is surgical (1 line removed) with 122 lines of regression tests covering stream=true, stream=false and guard scenarios. Resolves#677.
Three tests covering the fixed bug where translateRequest received an
object instead of a boolean for the stream parameter:
- stream=true round-trip produces boolean true
- stream=false round-trip produces boolean false
- guard test documenting that passing an object as stream breaks typing
Co-Authored-By: Craft Agent <agents-noreply@craft.do>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The second translateRequest call in the claude->openai->claude passthrough
path had an extra `translatedBody` argument before `stream`, shifting all
parameters by one. This caused the `stream` field in the upstream request
to be set to an object instead of a boolean, producing:
"stream: Input should be a valid boolean"
Co-Authored-By: Craft Agent <agents-noreply@craft.do>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add glm-5.1 model to GLM Coding provider with fitness scores
- Update glm-5 pricing to match Z.AI API page ($1/$3.2/$0.2)
- Set glm-5.1 pricing to $1.2/$5/$0.3 per Z.AI
- Remove glm-4-32b (deprecated, returns empty from upstream)
- Rename Z.AI provider display name from "Z.AI (GLM-5)" to "Z.AI"
- Update zai pricing section to match glm pricing
In OpenAI Chat Completions streaming format, the tool call id and type
should only appear on the first chunk (tool declaration). Subsequent
argument delta chunks should only include index and function.arguments.
Including id on every delta chunk caused openai-to-claude.ts to emit
a new content_block_start for each chunk, breaking Claude Code ACP
sessions with malformed Claude-format streams.
Fixes#682
Co-authored-by: Chris Staley <christopher.staley@protonmail.com>
The hasValuableContent() function in streamHelpers.ts returned undefined
instead of explicit false when checking empty delta chunks. This caused
JavaScript type coercion issues where undefined !== '' evaluated to true,
passing empty chunks through to clients.
Fix: Replace implicit returns with explicit boolean returns using
typeof checks and length comparisons for all content fields (content,
reasoning_content, tool_calls, text, thinking, partial_json).
Test: Added unit tests covering OpenAI, Claude, and Gemini format edge cases.
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
The native Codex passthrough path returned early before injecting
default instructions and enforcing store=false. Clients sending
Responses API requests without instructions (e.g. opencode) got
400 "Instructions are required", and requests missing store=false
got 400 "Store must be set to false" from the Codex upstream.
Move both assignments before the passthrough return so they apply
to all code paths.
Changes:
- fix: restore native Claude tool names in passthrough responses (PR #663 by @coobabm)
- fix: Clear All Models button now also removes aliases (PR #664 by @rdself)
- fix: completed truncated test from PR #663, added Claude-to-Claude passthrough test
- docs: update CHANGELOG and OpenAPI spec
The Clear All Models button was only deleting custom models from the
database but leaving their aliases intact, so the UI didn't reflect
the deletion. Now it also deletes all aliases belonging to the provider
and refreshes the alias state.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
High backoffLevel (up to 15) persisted permanently in the DB after a burst of 429s. The account health score dropped to zero (100 - 15*10 = -50), causing the account selector to never pick the account again. Only a successful request could reset backoffLevel via clearAccountError, but the account was never selected — creating a deadlock.
Now, during account selection, any non-terminal connection whose rateLimitedUntil has passed gets its backoffLevel reset to 0 and testStatus restored to active. The DB update is fire-and-forget to avoid blocking the hot path.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
The openai-to-claude translator was prefixing tool names with 'proxy_'
(e.g. Bash → proxy_Bash) even when routing Claude-format requests to
native Claude/Anthropic providers. Claude rejects unknown tool names,
causing 'No such tool available: proxy_Bash' errors.
Root cause: the _disableToolPrefix condition only disabled the prefix
for non-Claude providers, but it should be disabled for ALL providers
in the Claude passthrough path since tools are already in Claude format.
Fixes#618
- Fix Ollama Cloud base URL from api.ollama.com to ollama.com/v1/chat/completions
- Fix Ollama Cloud models URL to ollama.com/api/tags
- Add gemini-3.1-pro-preview and gemini-3.1-flash-lite-preview to Antigravity provider
Closes#643, closes#645
Thanks @brendandebeasi for this excellent contribution! 🎉 The bounded retry with exponential backoff is exactly the right approach for expired connections. Merged and will be included in v3.1.1.
Connections marked as 'expired' were permanently skipped by the health check scheduler (line 176: if testStatus === "expired" return). A single transient refresh failure could permanently disable auto-refresh, requiring manual re-authentication.
Replace the hard skip with a bounded retry mechanism: up to 3 attempts with exponential backoff (5min, 10min, 20min). On success, the connection is fully restored to active. On exhaustion, it remains expired (same as before). The existing circuit breaker (5 failures → 30min pause) provides additional protection.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
OpenAI-compatible clients (OpenCode, etc.) check capabilities/input_modalities fields on the /v1/models response to determine if a model supports image input. Omniroute was not emitting these fields, causing clients to assume text-only for all models routed through the proxy.
Add keyword-based vision detection (matching the existing playground heuristic) that annotates model entries with capabilities:{vision:true}, input_modalities:["text","image"], and output_modalities:["text"] for known multimodal models (GPT-4o/4-turbo, Claude 3+, Gemini, Pixtral, Qwen-VL, etc.).
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Rename in.json → hi.json: 'in' is Indonesian (ISO 639-1), Hindi is 'hi'.
Fixes Weblate locale conflict where id.json and in.json both claimed Indonesian.
- Move empty tool name filter before Codex passthrough: nativeCodexPassthrough
skipped all input sanitization, causing 400 'empty tool name' from upstream.
- Collapse 3+ consecutive newlines to \n\n in response sanitizer: thinking
models accumulate excessive line breaks between tool call blocks.
- OpenAI-to-Claude translator now maps reasoning_effort (low/medium/high/max)
to Claude's thinking.budget_tokens. Fixes clients like OpenCode sending
reasoning_effort via @ai-sdk/openai-compatible losing thinking configuration.
- Ensures max_tokens > budget_tokens for all thinking configs.
- Token health check now proactively refreshes tokens within 5 min of expiry,
regardless of the configured health check interval — addresses Qwen OAuth
token refresh failures between scheduled checks.
Adds structured YAML-based issue templates to improve issue quality.
Bug reports require version, install method, OS, repro steps, and
expected/actual behavior. Feature requests require use case and
proposed solution. Blank issues are still allowed for edge cases.
Adds a button next to the Auto-Sync toggle to clear all custom models
for a provider. Extends DELETE /api/provider-models to support ?all=true
parameter for bulk deletion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Handle reasoning_details[] array (StepFun/OpenRouter format) in sanitizer and translator
- Handle 'reasoning' field alias → reasoning_content in streaming and non-streaming paths
- Cross-map input_tokens/output_tokens ↔ prompt_tokens/completion_tokens in filterUsageForFormat
- Fix extractUsage to accept input_tokens/output_tokens as alternative field names
- All 936 tests pass
- Add Claude Sonnet 4.5, Claude Sonnet 4, GPT 5, GPT 5 Mini
- Enable passthroughModels: true so users can access any model
Antigravity supports without waiting for registry updates
- Set clientSecretDefault for Antigravity provider (was empty, causing
'client_secret is missing' on token refresh for npm users)
- Add modelsUrl to opencode-zen registry for 'Import from /models'
Rewrote the account selector with a simpler, reliable approach:
- Fetch ALL connections once at startup (not per-provider)
- Filter by selectedProvider using ALIAS_TO_ID mapping
- Account/Key dropdown always visible when provider selected
- Shows 'Auto (N accounts)' default or individual account names
- Works for both OAuth accounts and API key providers
Import ALIAS_TO_ID mapping and resolve provider aliases (cx→codex,
kr→kiro, etc.) in loadConnections before filtering connections from
the API. The /v1/models endpoint returns alias-prefixed model IDs
but /api/providers/client returns provider IDs.
Playground:
- loadConnections() was parsing wrong API response shape (expected
providers[].connections[] but API returns flat connections[])
- Account selector now shows for any provider with ≥1 connection
- Uses conn.email as name fallback for OAuth providers
CLI Tools:
- getAllAvailableModels() now also fetches from /v1/models API
- Dynamic models supplement static PROVIDER_MODELS definitions
- Fixes providers like Kiro, OpenCode Zen showing 0 models
After stripping <antThinking>/<thinking> tags from streaming responses, the
surrounding newlines were left as artifacts (e.g. \n\n\n\n). Now collapses 3+
consecutive newlines to double-newline after any tag removal.
Also fixes PR #625 merge (Provider Limits light mode background).
The proxy test button in Settings was always failing with 'Socks5 Authentication
failed' because the frontend sent redacted credentials (***) from listProxies().
The backend received '***' as the password and tried to authenticate with it.
Fix: Frontend now sends proxyId in the test request body. The test endpoint
looks up the proxy from the DB with includeSecrets: true and uses the real
stored credentials for the SOCKS5 handshake.
Also: removed username/password from the frontend test payload since they
are always redacted and useless for testing.
Root cause: SOCKS5 proxies accept TCP connections (pass health check) but
can't relay HTTPS traffic. getCodexUsage() catches fetch errors internally
and returns {message: 'Failed to fetch...'} instead of throwing, so the
previous catch-based fallback never triggered.
Fix: After the initial proxied fetch, check the returned usage object for
network error indicators. If a proxy was active and the result contains
'fetch failed' / 'ECONNREFUSED' / etc., retry the entire operation
(credential refresh + usage fetch) without proxy context.
This is safe because usage fetching is read-only — showing limits data
without proxy is better than showing nothing.
bg-bg-subtle (#f0f0f5) appears gray against the page background in
light mode. Changed to bg-surface (#ffffff) for consistency with other
Card-based UI sections.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix: Limits usage fetch wraps BOTH token refresh and usage call inside proxy context (fixes SOCKS5 Codex accounts)
- Fix: CI integration test v1/models gracefully handles empty models list
- Fix: Settings proxy test button results now render with priority over health data
- Feat: Playground account selector dropdown for testing specific connections
- Merge: PR #623 LongCat API base URL path correction
The sanitize TransformStream (commit 5a8c644) shared the same TextDecoder
instance with the upstream transform stream. This corrupted UTF-8 state
when decoding SSE chunks, producing garbled output that broke clients
like openclaw that parse the stream.
- Use a separate TextDecoder for the sanitize stream
- Always decode→encode in sanitize (don't mix raw passthrough with decoded text)
- Add flush() handler to emit remaining buffered bytes
- Fix double-escaped regex (\\n → \n) for tag stripping
## Proxy UI Bug Fixes
- fix: proxy badge on connection cards now uses resolveProxyForConnection()
per-connection (covers registry + config-file assignments)
- fix: Test Connection button now works in 'saved' proxy mode by resolving
proxy config from savedProxies list
- fix: ProxyConfigModal now calls onClose() after save/clear (fixes UI freeze)
- fix: ProxyRegistryManager loads usage eagerly on mount with deduplication
by scope+scopeId to prevent double-counting; adds per-row Test button
## Connection Tag Grouping (new feature)
- feat: add Tag/Group field to EditConnectionModal (stored in
providerSpecificData.tag, no DB schema change)
- feat: connections list groups by tag with visual dividers when any account
has a tag; untagged accounts appear first without header
## Post-merge fix from PR #607 review
- fix: function_call blocks in translateNonStreamingResponse now also strip
Claude OAuth proxy_ prefix via toolNameMap (kilo-code-bot #607 warning)
Affects OpenAI Responses API format path — tool_use was fixed in PR #607
but function_call was missed
- fix(translator): pass toolNameMap to translateNonStreamingResponse so Claude
OAuth proxy_ prefix is correctly stripped from tool_use block names in
non-streaming responses (was only stripped in streaming path)
- fix(validation): add LongCat specialty validator that probes /chat/completions
directly, bypassing the /v1/models endpoint that LongCat does not expose (#592)
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
- Replace hardcoded rgba(255,255,255,...) borders/backgrounds with theme-aware
CSS variables (--color-border, --color-bg-subtle) for proper light mode contrast
- Add dark: variants for hover states and progress bar backgrounds
- Fix Claude plan tier: try to extract actual plan from OAuth response instead
of hardcoding "Claude Code"
- Recognize provider names (Claude Code, Kimi Coding, Kiro) as non-plan-tier
values in normalizePlanTier() to avoid showing them as tier badges
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- fix(translator): pass toolNameMap to translateNonStreamingResponse so Claude
OAuth proxy_ prefix is correctly stripped from tool_use block names in
non-streaming responses (was only stripped in streaming path)
- fix(validation): add LongCat specialty validator that probes /chat/completions
directly, bypassing the /v1/models endpoint that LongCat does not expose (#592)
Video and music models had a special exemption for authType="none" providers
(comfyui, sdwebui), causing them to appear in the models list even without
any active provider connection. Now all model types consistently use
isProviderActive() filtering, matching the behavior of image models.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Provider field shows connection name (e.g. "BltCy API"),
Protocol (sourceFormat) shows "-" since model-sync is not
a chat/completion request.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use connection.name instead of the raw provider node ID
(e.g. "BltCy API" instead of "openai-compatible-chat-09fdb807-...")
in call logs and scheduler console output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The autoSyncToggle was defined after the isCompatible early return,
so it never rendered for compatible provider types. Move the toggle
definition before the isCompatible branch so it appears for all
provider types including third-party OpenAI-compatible ones.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add POST /api/providers/[id]/sync-models endpoint that fetches models
from a provider's /models API and replaces the full custom models list,
preserving per-model compatibility overrides
- Rewrite modelSyncScheduler to dynamically discover connections with
autoSync enabled in providerSpecificData instead of a hardcoded list
- Add replaceCustomModels() to db/models.ts for full list replacement
while preserving existing compat flags
- Log each model sync operation to call_logs for visibility in the
Logs page
- Add Auto-Sync toggle button next to "Import from /models" in the
provider detail page UI
- Add en/zh-CN i18n translations for auto-sync strings
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace Image-based provider icons in ProviderOverviewCard with the same
ProviderIcon component used on the providers page (@lobehub/icons SVG
with PNG → generic fallback chain).
The <omniModel> tag was leaking into user-visible content when
context_cache_protection was enabled on a combo. The tag is an internal
marker for model pinning across conversation turns.
Fix: Add a second TransformStream pass (sanitize) that strips the tag
from SSE chunk content before delivery to the client. The tag is still
injected for round-trip context pinning but cleaned from visible output.
Also adds X-OmniRoute-Model response header as a cleaner metadata channel.
Closes#585
Changed the heuristic fallback for claude-* models from 'antigravity' to 'anthropic'
as the canonical provider. Users without Antigravity credentials were getting
'No credentials for provider: antigravity' errors when sending unprefixed
Claude model names like 'claude-sonnet-4-5'.
Closes#570
Add stripModelPrefix boolean setting that, when enabled, strips
provider prefixes (e.g. openai/, anthropic/) from incoming model
names and re-resolves the bare model name using existing heuristics.
This allows tools to send prefixed model names while OmniRoute
handles provider routing at the proxy layer.
- Add stripModelPrefix to settings validation schema (Zod)
- Check setting in getModelInfo() after custom node matching fails
- Falls through to normal resolution on error or when disabled
- Backward compatible: opt-in, default behavior unchanged
- Added MAX_TRANSCRIPTION_FILE_SIZE constant (4GB)
- Added formatFileSize() helper for human-readable display (KB/MB/GB)
- Frontend validation rejects files > 4GB with error message
- Changed label from 'Audio File' to 'Audio / Video File'
- Shows 'Supports audio and video files up to 4 GB' hint
- Add contextLength field to RegistryModel interface for per-model overrides
- Add defaultContextLength to RegistryEntry for provider-level defaults
- Set context lengths for major providers:
- Claude: 200k
- Codex: 400k (fixes combo context display)
- Gemini: 1M
- OpenAI: 128k
- GitHub Copilot: 128k
- Kiro/Cursor: 200k
- OpenCode: 200k
- Include context_length in /v1/models API response
- Add context_length field to combo schema for custom combo context
- Update contextManager to use registry defaults and support env overrides
- CONTEXT_LENGTH_<PROVIDER> for per-provider override
- CONTEXT_LENGTH_DEFAULT for global override
This allows clients like OpenClaw to display accurate context windows
for combo models instead of guessing based on model name patterns.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When users skip password setup during onboarding (either via 'Skip Password'
checkbox or 'Skip Wizard' button), the app now explicitly sets requireLogin=false.
Previously, requireLogin defaulted to true with no password hash stored,
leaving users permanently stuck on the login page.
Two code paths fixed in onboarding/page.tsx:
- handleSetPassword() with skipSecurity=true
- handleFinish() when no password was configured
- Reword comments that contained the token any; replace any types with typed shapes
- stream.ts: passthrough tool-call flag via local boolean (state is null in passthrough)
- Document T11 in zws_docs/ZWS_README_V8.md
Made-with: Cursor
Add model-pattern → combo mapping feature that automatically routes requests
to specific combos based on model name patterns (glob matching).
Implementation:
- New migration 010: model_combo_mappings table with pattern, combo_id, priority
- DB module with CRUD + resolveComboForModel() using glob-to-regex matching
- getComboForModel() in model.ts: augments getCombo() with pattern fallback
- chat.ts: replaced getCombo() → getComboForModel() at routing decision point
- API endpoints: GET/POST /api/model-combo-mappings, GET/PUT/DELETE by [id]
- ModelRoutingSection.tsx: dashboard UI with inline add/edit/toggle/delete
- Integrated into Combos page
- 15 new unit tests (glob matching, priority ordering, disabled filtering)
- Full test suite: 923/923 pass
Examples:
claude-sonnet* → code-combo
claude-*-opus* → frontier-combo
gpt-4o* → openai-combo
gemini-* → google-combo
Resolves: #563
Cherry-pick from codex/omniroute-fixes-20260324:
- Replace MCP singleton transport with per-session architecture for Streamable HTTP
- Fix Claude passthrough via OpenAI round-trip normalization
- Add detectFormatFromEndpoint() for endpoint-aware format detection
- Support raw code#state in OAuth modal for Claude Code remote auth
- Expose cloudConfigured/cloudUrl/machineId in settings API
- Switch docker-compose.prod.yml target to runner-cli
- Add 3 new tests for round-trip and detectFormat
PR: #562
- Implement keychain-based credential extractor for Zed IDE
- Support macOS (Keychain), Windows (Credential Manager), Linux (libsecret)
- Add API endpoint: POST /api/providers/zed/import
- Auto-discover OAuth tokens for OpenAI, Anthropic, Google, Mistral, xAI, etc.
- Cross-platform support via keytar library
- Complete documentation with security considerations
Closes community request from OmniRoute Telegram group.
Follows proven pattern used by VS Code, GitHub Copilot CLI, Claude Code.
CLI settings routes (codex-settings, droid-settings, kilo-settings) were
writing the masked API key string directly to config files when the
dashboard sent a keyId. Now resolves the real key from the database via
getApiKeyById() before writing, matching the pattern already implemented
in claude-settings, openclaw-settings, and cline-settings.
Closes#549
- thread-stream test fixtures (intentionally malformed) were being picked
up by Turbopack during production build, causing 111 compile errors
- IgnorePlugin excludes /test/ within thread-stream context
- thread-stream added to serverExternalPackages to prevent bundling
- /app removed: it is a stale npm-package prebuild artifact, not source code
T01 (P1): requested_model column in call_logs
- Migration 009_requested_model.sql: ALTER TABLE call_logs ADD COLUMN requested_model
- callLogs.ts: INSERT + SELECT updated to include requestedModel field
T02 (P1): Strip empty text blocks from nested tool_result.content
- New stripEmptyTextBlocks() recursive helper in openai-to-claude.ts
- Applied on tool_result content before forwarding to Anthropic
- Prevents 400 'text content blocks must be non-empty' errors
T03 (P1): Parse x-codex-5h-*/x-codex-7d-* headers for precise quota reset
- parseCodexQuotaHeaders() in codex.ts extracts usage/limit/resetAt
- getCodexResetTime() returns furthest-out reset timestamp for safe unblocking
T04 (P1): X-Session-Id header for external sticky routing
- extractExternalSessionId() in sessionManager.ts reads x-session-id,
x-omniroute-session, session-id headers with 'ext:' prefix to avoid collisions
T06 (P2): account_deactivated permanent expired status on 401
- ACCOUNT_DEACTIVATED_SIGNALS constant + isAccountDeactivated() in accountFallback.ts
- Returns 1-year cooldown (effectively permanent) to prevent retrying dead accounts
T07 (P2): X-Forwarded-For IP validation
- New src/lib/ipUtils.ts with extractClientIp() and getClientIpFromRequest()
- Skips 'unknown'/non-IP entries in X-Forwarded-For chain
T10 (P2): credits_exhausted distinct account status
- CREDITS_EXHAUSTED_SIGNALS + isCreditsExhausted() in accountFallback.ts
- Returns 1h cooldown with creditsExhausted flag, distinct from rate_limit 429
T11 (P1): max reasoning_effort -> budget_tokens: 131072
- EFFORT_BUDGETS and THINKING_LEVEL_MAP updated with max: 131072, xhigh: 131072
- Reverse mapping now returns 'max' for full-budget responses
- Unit test updated to expect 'max' (was 'high')
T12 (P3): Model pricing updates
- MiniMax M2.7 / MiniMax-M2.7 / minimax-m2.7-highspeed pricing added
T15 (P1): Array content normalization for system/tool messages
- normalizeContentToString() helper exported from openai-to-claude.ts
- System messages with array content now correctly collapsed to string
Registrar o provedor Puter como gateway OpenAI-compatible que expõe
modelos de múltiplos fornecedores (GPT, Claude, Gemini, Grok, DeepSeek,
Qwen, Mistral, Llama) através de um único endpoint REST.
- Criar PuterExecutor com autenticação Bearer token
- Adicionar entrada no providerRegistry com 40+ modelos curados
- Habilitar passthroughModels para acesso aos 500+ modelos do catálogo
- Registrar alias "pu" para acesso rápido
- Adicionar metadados do provedor em shared/constants/providers.ts
- CHANGELOG: [3.0.0-rc.5] section now serves as full 'What's New vs v2.9.5':
* 2 new providers (OpenCode Zen/Go via PR #530)
* 3 new features: Registered Keys API (#464), provider icons (#529), model auto-sync (#488)
* 10 bug fixes (#521, #522, #524, #527, #532, #535, #536, #537, #489, #510, #492)
* 16 issues resolved total, DB migration 008
- README: added 'What's New in v3.0.0' table section after badges
Includes all commits from @kang-heewon's PR #530:
- OpencodeExecutor with multi-format routing
- opencode-zen + opencode-go registered in provider registry
- UI metadata added to providers.ts
- Unit tests for OpencodeExecutor (improved to avoid state coupling)
Cherry-picked from add-opencode-providers into 3.0.0-rc.
Conflicts resolved: executors/index.ts (merged pollinations+cloudflare-ai),
providerRegistry.ts (kept testKeyBaseUrl from rc.2 + PR's authType/models).
feat(ui): ProviderIcon component with @lobehub/icons + PNG fallback (#529)
- 130+ providers covered by Lobehub SVG components via LobehubErrorBoundary
- Falls back to existing /providers/{id}.png, then generic icon
- Replaces manual img state machine in ProviderCard + ApiKeyProviderCard
feat(scheduler): modelSyncScheduler — 24h model list auto-update (#488)
- Syncs 16 major providers every 24h (MODEL_SYNC_INTERVAL_HOURS configurable)
- Wired into POST /api/sync/initialize startup hook
fix(oauth): Gemini CLI — clear error when client_secret missing in Docker (#537)
fix(chat): convert tool_result content blocks to [Tool Result: id] text (#527)
- Previously, tool_result blocks in user messages were silently dropped
- This caused an infinite loop when Claude Code + superpowers routed to Codex:
Codex never received the tool response and kept re-requesting the tool
- Now: tool_result → text block '[Tool Result: {id}]\n{content}'
- Handles string, array-of-text, and JSON-serialized content types
docs(issues): add Turbopack postinstall workaround on #509 and #508
docs(issues): note that #464 (API key provisioning) is on the v3.0 roadmap
fix(cli): normalize MSYS2/Git-Bash paths in cliRuntime.ts (#510)
- Add normalizeMsys2Path() helper: /c/Program Files/... → C:\Program Files\...
- Apply to both Windows 'where' and Unix 'command -v' path resolution
- Fixes 'CLI not detected' on Windows when running Git Bash / MSYS2
fix(cli-launcher): detect mise/nvm on server.js not found error (#492)
- Show targeted fix instructions based on which Node manager is in use
- mise users: told to use npx or mise exec
- nvm users: reminded to nvm use --lts before reinstalling
docs(issues): add pnpm bindings workaround comment (#520)
docs(issues): note OpenCode/Lobehub icons coming in v3.0.0 (#529)
fix(login): redirect to /dashboard/onboarding when API returns needsSetup:true (#521)
- Handle the case where user skips password setup and lands on login
- Instead of showing a cryptic error, redirect to onboarding flow
fix(api-manager): replace useless 'copy masked key' button with lock tooltip (#522)
- Copying a masked key (sk-proj123****abcd) is misleading and useless
- Show a lock icon on hover explaining key is only available at creation time
- Add i18n key 'keyOnlyAvailableAtCreation'
fix(opencode-go): use zen/v1 for API key validation, not zen/go/v1 (#532)
- Added testKeyBaseUrl field to RegistryEntry interface
- opencode-go: testKeyBaseUrl → zen/v1 (same key authenticates both tiers)
- validation.ts: resolveBaseUrl for key testing now prefers testKeyBaseUrl
fix(antigravity): return structured 422 error when projectId is missing (#489)
- Instead of throwing (crash), executor returns an OpenAI-format error JSON
- Client receives message with instruction to reconnect OAuth
- Prevents opaque 500 errors in the proxy logs
chore: close#525 (OmniRoute = 9router — same project, different name)
docs: add Docker password reset comment on #513 with INITIAL_PASSWORD workaround
- feat(providers): add OpenCode Zen and Go providers with multi-format executor (PR #530 by @kang-heewon)
- fix(embeddings): use provider node ID for custom embedding provider credential lookup (PR #528 by @jacob2826)
- fix(cli-tools): resolve real API key from DB (keyId) before writing to CLI config files (#523, #526)
- fix(combo): update CACHE_TAG_PATTERN to match literal \\n prefix/suffix around omniModel tag (#531)
- chore: bump version to 2.9.5 in package.json + docs/openapi.yaml
- docs: update CHANGELOG.md with v2.9.5 release notes
- Register OpencodeExecutor for 'opencode-zen' and 'opencode-go' in executors map
- Add OpencodeExecutor export in index.ts
- Add UI metadata for both providers in APIKEY_PROVIDERS:
- OpenCode Zen: https://opencode.ai/zen
- OpenCode Go: https://opencode.ai/zen/go
- Both use 'opencode' icon with #6366f1 color
fix(cli-tools): save real API key to CLI config files instead of masked string (#523, #526)
- claude-settings/route.ts: accept keyId, look up real key from DB (getApiKeyById)
- cline-settings/route.ts: same keyId resolution pattern
- openclaw-settings/route.ts: same keyId resolution pattern
- ClaudeToolCard.tsx: store key.id as selected value, send keyId in POST body
The /api/keys endpoint returns masked strings (first8+****+last4) which were being
written verbatim to ~/.claude/settings.json and similar config files, causing auth
failures on CLI tool launch.
fix(combo): update CACHE_TAG_PATTERN to strip surrounding \\n sequences (#531)
- comboAgentMiddleware.ts: non-global regex now matches literal \\n (backslash-n)
and actual newline U+000A that combo.ts injects around the <omniModel> tag.
- fix(translator): preserve prompt_cache_key in Responses API translation (#517)
- fix(combo): escape \n in tagContent for valid JSON injection (#515)
- fix(usage): sync expired token status back to DB on live auth failure (#491)
- chore: bump version to 2.9.4 in package.json + docs/openapi.yaml
- docs: update CHANGELOG.md with v2.9.4 release notes
fix(translator): preserve prompt_cache_key when translating Responses API requests
(#517) — prompt_cache_key is an account-affinity signal used by Codex for
prompt cache routing. Deleting it from the translated request prevented full
cache effectiveness. Removed delete from openai-responses.ts and
responsesApiHelper.ts cleanup blocks.
fix(combo): escape \n in tagContent so injected JSON string is valid (#515)
— omniModel tag content used template literal newlines (U+000A) which produce
unescaped newline chars inside a JSON string value. Replaced with literal \n
escape sequences for valid JSON injection in streaming SSE content chunks.
- Version bumped from 2.9.2 → 2.9.3 in package.json + docs/openapi.yaml
- CHANGELOG.md updated with full release notes for 2.9.3
(5 new free providers, 2 metadata updates, 2 custom executors, docs)
upstreamErrorResponse() now guards against parsed.error being an
object (e.g. ElevenLabs { error: { message, status_code } }) instead
of blindly using it as the error message string.
Both audioSpeech.ts and audioTranscription.ts fixed.
- Add resolveAudioContentType() to map video/* MIME to audio/* (fixes .mp4 uploads returning 'no speech detected')
- Add detect_language=true for Deepgram auto-language detection (fixes non-English audio)
- Add punctuate=true for better output quality
- Forward language form param to Deepgram when provided
- Apply same Content-Type fix to HuggingFace handler
The truthy check treated false as falsy and deleted the property, preventing users from explicitly disabling normalization for a specific protocol when the top-level flag was true. Now stores both true and false values, consistent with preserveOpenAIDeveloperRole handling.
Made-with: Cursor
The word 'any' in a JSDoc comment was matched by the regex-based t11 checker. Reworded to 'prefixes' to eliminate the false positive.
Made-with: Cursor
Rewrite getMachineIdRaw() to use a try/catch waterfall instead of
process.platform conditionals. Next.js SWC bundler evaluates
process.platform at BUILD time, so when built on Linux, the win32
branch was dead-code-eliminated — causing 'head is not recognized'
errors on Windows.
New approach:
1. Try Windows REG.exe (existsSync check, not platform check)
2. Try macOS ioreg command
3. Try reading /etc/machine-id directly (no head/pipe)
4. Try hostname command
5. Fallback to os.hostname()
Also eliminates the patch-machine-id.cjs post-install workaround.
- #493: Fix custom provider model naming — removed incorrect prefix
stripping in DefaultExecutor.transformRequest() that broke org-scoped
model IDs like 'zai-org/GLM-5-FP8'
- #490: Enable context cache protection for streaming responses using
TransformStream to inject omniModel tag as final SSE content delta
before [DONE] marker
- #452: Add per-API-key request-count limits (max_requests_per_day,
max_requests_per_minute) with in-memory sliding window counter,
schema auto-migration, and Check 5 in enforceApiKeyPolicy()
- Replace execSync template string with execFileSync + args array on Windows
to prevent command injection via SystemRoot/windir environment variables
- Add optional chaining (?.) and nullish coalescing (?? "") on Windows
REG_SZ output parsing to prevent crash if REG.exe output is unexpected
- Add optional chaining on macOS IOPlatformUUID parsing for the same reason
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace 'any other path' with 'all other paths' in translator comment to avoid false match by the \bany\b regex in check-t11-any-budget
- Scope e2e error locator to dialog and use .first() to prevent Playwright strict-mode violations from broad page-level selectors
- Fix fallback logic: treat dialog-still-open as validation success signal
Made-with: Cursor
- Use globalThis singleton guards for DB connection, HealthCheck timers, console interceptor, and graceful shutdown to survive Webpack HMR re-evaluation (fixes 485+ leaked DB connections per session)
- Split instrumentation.ts into instrumentation-node.ts with computed import path to prevent Turbopack Edge bundler from tracing Node.js modules (eliminates 10+ spurious warnings per hot compile)
- Parallelize startup imports in instrumentation-node.ts (3 batch Promise.all instead of 9 serial awaits)
- Add OMNIROUTE_USE_TURBOPACK=1 env switch in run-next.mjs (default behavior unchanged)
- Replace node:crypto with crypto in proxies.ts and errorResponse.ts to fix UnhandledSchemeError
- Add unlinkFileWithRetry with EBUSY/EPERM retry for Windows file handle timing in backup restore
- Fix pre-restore backup to await completion before closing DB
- Fix bootstrap-env, domain-persistence, and fixes-p1 test stability on Windows
Made-with: Cursor
Problem:
node-machine-id constructs the REG.exe command path at module load time
using process.platform. When Next.js bundles this module, process.platform
is "" (not "win32") in the webpack/build context, so the lookup returns
undefined and bakes "undefined\REG.exe ..." permanently into the compiled
chunk. At runtime on Windows this causes:
Error: Command failed: undefined\REG.exe QUERY HKEY_LOCAL_MACHINE\...
The system cannot find the path specified.
Fix:
Remove the node-machine-id dependency from machineId.ts and replace it
with a direct execSync implementation that resolves process.env.SystemRoot
at call time (not load time), so the correct Windows path is always used
regardless of when or how the module was bundled.
Platform support is preserved for Windows, macOS, and Linux/FreeBSD using
the same underlying OS queries that node-machine-id used internally.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Address bot review feedback: use .finally() instead of .then()/.catch()
so limiters.delete() runs regardless of whether stop() succeeds or
throws (e.g. already stopped by concurrent 429).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Multiple concurrent requests can receive 429 simultaneously, causing
stop() to be called on an already-stopped limiter. Add .catch() to
prevent unhandled rejection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a provider returns 429 (rate limit exceeded), the rate limit manager
was setting reservoir=0 and waiting for reservoirRefreshInterval before
releasing queued requests. For providers with long rate limit windows
(e.g. Codex with hours-long resets), this caused all queued requests to
hang indefinitely — they never timed out or returned an error.
This prevented upstream callers (e.g. LiteLLM) from triggering fallback
to alternative providers, effectively making the entire model unavailable
until the rate limit window expired.
Fix: on 429, call limiter.stop({ dropWaitingJobs: true }) to immediately
fail all queued requests, then delete the limiter from the Map so
getLimiter() creates a fresh instance for subsequent requests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add preserveDeveloperRole option and model compat override
- Normalize developer→system in roleNormalizer when not preserving
- Translator runs normalizeRoles for Responses API with option
- UI: ModelCompatPopover with do not preserve developer toggle
- Add ZWS_README_V2 documenting cause and fix
Made-with: Cursor
The Bailian Coding Plan provider page may render a dialog on load
that blocks pointer events on the Add API Key button. Add pre-dialog
dismissal (Escape key) before attempting to click.
Also triages #485 (Claude Code tool calls — needs-info).
- #462: Mark gemini-cli provider as deprecated in providers.ts
Add deprecated, deprecationReason, hasFree, freeNote, authHint, apiHint
to Zod provider schema
- #471: Add VM_DEPLOYMENT_GUIDE.md to DOC_SOURCE_FILES in generate-multilang.mjs
Delete 29 stale PT-language copies and regenerate from EN source
for all 30 locales (29 auto-translated + 1 Czech from PR #482)
formatSSE() in streamHelpers.ts explicitly returned 'data: null' for
null/undefined data. This violates SSE protocol and causes
AI_TypeValidationError in strict clients (Zod-based AI SDKs).
Now returns empty string, silently skipping null chunks.
- New API: /api/logs/export?hours=24&type=call-logs
- UI: Export button with dropdown on /dashboard/logs page
- Supports export of request-logs, proxy-logs, and call-logs
- Downloads as JSON file with Content-Disposition header
Previously resolveModelAlias() output was used only for getModelTargetFormat()
but the original model was sent in translatedBody.model and to the executor.
Now effectiveModel is propagated to all downstream operations.
saveCallLog only read prompt_tokens/completion_tokens (OpenAI format).
When sourceFormat=claude, the openai-to-claude translator writes
input_tokens/output_tokens instead, causing all cross-format requests
(Codex-via-Claude, Kiro-via-Claude, etc.) to show 0|0 tokens in
call_logs.
Also includes cache_read and cache_creation tokens in tokens_in total
so heavily-cached requests don't show misleadingly low input counts.
Changes:
- Read prompt_tokens || input_tokens (supports both formats)
- Read completion_tokens || output_tokens (supports both formats)
- Sum cache_read_input_tokens + cache_creation_input_tokens into total
logUsage stored only non-cached input tokens in usage_history.tokens_input.
For heavily-cached Claude requests (common with Claude Code), this shows
near-zero input when the real total is 150K+, causing the analytics
dashboard to severely underreport input token usage.
Now sums: input = prompt_tokens + cache_read + cache_creation
chatCore.ts injects translatedBody.model for all providers after
translation. Kiro API (AWS CodeWhisperer) has strict schema validation
and rejects unknown top-level fields — only conversationState, profileArn,
and inferenceConfig are valid. This causes 100% of Kiro requests to fail
with "Improperly formed request".
Strip the injected model field in KiroExecutor.transformRequest().
* feat: add api-key Kimi Coding provider support
* fix(kimi-coding): honor apikey auth header in executor
Ensure DefaultExecutor sends x-api-key for kimi-coding-apikey at runtime
and deduplicate shared kimi coding config blocks in registry and models
config to reduce drift between oauth and apikey variants.
---------
Co-authored-by: OmniRoute Agent <agent@omniroute.local>
- fix(budget): BudgetTab sent integer percentage (80) but schema validated
fraction (0-1). Now divides by 100 on POST and multiplies by 100 on GET (#451)
- fix(combos): expose Agent Features UI in combo create/edit modal — fields for
system_message override, tool_filter_regex, and context_cache_protection were
implemented server-side (#399/#401) but missing from the dashboard UI (#454)
- fix(combos): strip <omniModel> tags from messages before forwarding to provider.
The internal cache-pinning tag was being sent to the provider, causing cache
misses as providers treated each tagged request as a new session (#454)
- fix(docker): copy pino-abstract-transport + pino-pretty in standalone (#449)
- fix(responses): remove initTranslators() from /v1/responses route (#450)
- chore(deps): commit package-lock.json with each version bump
- fix(docker): copy pino-abstract-transport and pino-pretty explicitly in
runner-base stage — Next.js standalone trace omits them, causing
'Cannot find module pino-abstract-transport' crash on startup (#449)
- fix(responses): remove initTranslators() call from /v1/responses route —
bootstrapping translator registry from a Next.js Route Handler worker
caused 'the worker has exited' uncaughtException on Codex CLI requests.
Translators are already bootstrapped server-side via open-sse (#450)
- chore: include package-lock.json in commit (was being left behind on
version bumps, causing npm ci to install inconsistent deps in Docker)
- fix(ux): add default password hint on login page for first-time users (#437)
The fallback password (123456) is now shown as a hint below the
password input so users don't get locked out during initial setup.
- fix(cli): add shell:true to spawn on Windows so .cmd wrappers are
resolved correctly via PATHEXT (#447). Claude, opencode, and other
npm-installed CLIs show as 'not runnable' on Windows even when
installed because spawn() cannot find .cmd files without shell:true.
- i18n: add defaultPasswordHint key to en.json auth namespace
Keep search provider validation responses consistent with other validators so Serper regression tests and CI assertions can rely on unsupported=false.
Made-with: Cursor
Normalize GitHub Copilot account tiers from the usage payload and hide misleading unlimited buckets so account type and limits render correctly in the dashboard.
Made-with: Cursor
Search Playground (Phase 1):
- Web Search as 10th endpoint in Playground with isolated SearchPlayground component
- Endpoint selector moved first; Provider/Model/Send hidden when search selected
- Provider dropdown via GET /api/search/providers, formatted results with cache indicator
Search Tools page (Phase 2) at /dashboard/search-tools:
- Split panel: SearchForm (left) with query, provider, filters + ResultsPanel (right)
- Compare Providers: parallel queries with latency, cost, response size, URL overlap
- Rerank Pipeline: model selector from /v1/models, results with position delta
- Search History: last 10 searches from call_logs with replay
- Sidebar entry under Debug section
Backend:
- GET /api/search/providers — list providers with auth guard + SEARCH_CREDENTIAL_FALLBACKS
- GET /api/search/stats — cache stats, provider aggregates, recent searches (auth guard)
- Add local provider_nodes routing for /v1/rerank (oMLX, vLLM support)
Bug fixes (from F-27 PR #432):
- Fix Brave news normalizer: data.results directly, not data.news.results
- Enforce max_results truncation after normalization for all providers
- Fix EndpointPageClient: use /api/search/providers instead of /api/v1/search
- Add isAuthenticated() guards on /api/search/providers and /api/search/stats
Response size metric in results meta bar and compare table.
i18n: 30+ keys in search namespace (en.json)
New in v2.7.0: pluggable RouterStrategy, multilingual intent detection,
request deduplication, new providers (Grok-4 Fast, GLM-5/Z.AI,
MiniMax M2.5, Kimi K2.5). Native translations for de/es/fr/it/ru/zh-CN/ja/ko/ar/pt-BR/pt.
npm version patch run BEFORE staging files — this is an ATOMIC commit.
Adds Strategy 1.5 to scripts/postinstall.mjs:
- Uses @mapbox/node-pre-gyp install --fallback-to-build=false
(bundled within better-sqlite3) to download the correct prebuilt
binary for the current OS/arch (win32-x64/arm64, darwin-x64/arm64)
WITHOUT requiring node-gyp, Python, or MSVC build tools.
- Tries node-pre-gyp.cmd (Windows) or node-pre-gyp (Unix) from .bin/
with fallback to direct path in @mapbox/node-pre-gyp/bin/
- Falls back to npm rebuild only if prebuilt download fails.
- Windows-specific error: shows Option A (npx node-pre-gyp) and
Option B (rebuild) with Visual Studio Build Tools links.
Fixes: #426 (better_sqlite3.node is not a valid Win32 application)
Includes version bump — v2.6.9 — committed ATOMICALLY with all changes:
fixes:
- fix(ci/t11): Remove 'any' from comments in openai-responses.ts + chatCore.ts
(\bany\b regex counted comment text as explicit any violations)
- fix(chatCore/#409): Normalize unsupported content part types before forwarding
Cursor sends {type:'file'} for .md attachments; Copilot/OpenAI providers reject
with 'type has to be either image_url or text'. Now: file/document→text block,
unknown types dropped with debug log. Fixes claude-* models via github-copilot.
workflow:
- chore(generate-release): ATOMIC COMMIT RULE — npm version patch MUST run before
feature commits so the release tag always points to a commit with full changes
DB Migrations (zero-breaking, ADD COLUMN DEFAULT NULL + new table):
- 005_combo_agent_fields.sql: system_message, tool_filter_regex, context_cache_protection on combos
- 006_detailed_request_logs.sql: ring-buffer table (500 entries) for full pipeline body capture
Features:
- #399 System Message Override + Tool Filter Regex per Combo
- applyComboAgentMiddleware() injected into handleComboChat/handleRoundRobinCombo
- Supports both OpenAI and Anthropic tool name formats
- #401 Context Caching Protection (Stateless)
- injectModelTag() appends <omniModel>provider/model</omniModel> to responses
- extractPinnedModel() reads tag from history and pins model for session
- #320 Auto-Update via Settings
- GET /api/system/version — current vs latest npm
- POST /api/system/update — fire-and-forget npm install + pm2 restart
- #378 Detailed Request Logs
- saveRequestDetailLog() captures bodies at 4 pipeline stages (opt-in toggle)
- GET/POST /api/logs/detail — list logs + enable/disable toggle
- #336 MITM Kiro IDE
- src/mitm/targets/kiro.ts: MitmTarget profile for api.anthropic.com interception
Audio endpoints (/v1/audio/speech and /v1/audio/transcriptions) only
supported hardcoded providers from audioRegistry.ts. Local inference
backends configured as provider_nodes (e.g., MLX-Audio, oMLX) could
not serve audio through OmniRoute.
This adds a Phase 3 fallback in the audio model parser that consults
provider_nodes from the database. Local providers with api_type=openai
are automatically available for audio routing via their prefix
(e.g., mlx-audio/tts-model, omlx/whisper-large-v3-turbo).
Design: injection pattern — Next.js route handlers load provider_nodes
(async DB query) and pass them to the sync parser as a parameter.
No cross-workspace imports, no breaking changes to existing parsers.
Changes:
- Add buildDynamicAudioProvider() in audioRegistry.ts
- Add Phase 3 (provider_nodes prefix match) to parseAudioModel()
- Extend parseSpeechModel/parseTranscriptionModel with optional
dynamicProviders parameter (backward compatible)
- Load and inject provider_nodes in speech/transcription route handlers
- Dynamic providers use authType=none (local, no credentials needed)
Claude OAuth tokens are short-lived and require refresh. The runtime
HealthCheck (open-sse) already refreshes them successfully, but the
Dashboard test endpoint was missing `refreshable: true` in its config.
This caused the Dashboard to show "auth failed / Token expired" for
Claude providers even though the tokens were being refreshed correctly
at runtime. The codex provider already had this flag set.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Anthropic API rejects requests containing {"type":"text","text":""} with
400 "text content blocks must be non-empty". Some clients like LiteLLM
passthrough and @ai-sdk/anthropic may forward empty text blocks as-is.
Filter out empty text content blocks from messages before calling
translateRequest, similar to how empty-name tools are already stripped.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- gemini/gemini-cli: removed gemini-3.1-pro/flash/preview (don't exist in Google API v1beta),
replaced with real models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-*
- antigravity: removed gemini-3.1-pro-high/low and gemini-3-flash (internal aliases invalid),
replaced with gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
- github: removed gemini-3-flash-preview and gemini-3-pro-preview, replaced with gemini-2.5-flash
- nvidia: corrected 'nvidia/llama-3.3-70b-instruct' to 'meta/llama-3.3-70b-instruct'
(NVIDIA NIM uses meta/ namespace, not nvidia/ namespace for Meta models)
- nvidia: added meta/llama-3.1-70b-instruct and nvidia/llama-3.1-405b-instruct
Also fixed free-stack combo on .15 DB:
- removed qw/qwen3-coder-plus (qwen provider has expired refresh token)
- corrected nvidia/llama-3.3-70b-instruct → nvidia/meta/llama-3.3-70b-instruct
- corrected gemini/gemini-3.1-flash → gemini/gemini-2.5-flash
- added if/deepseek-v3.2 as replacement for qw/qwen3-coder-plus
Local inference backends (oMLX, Ollama, LM Studio) configured as
provider_nodes have no health monitoring. When a local provider is
down, OmniRoute waits the full timeout before failing.
This adds a background health check that polls local provider_nodes:
- GET /models with 5s timeout for each local node (localhost only)
- In-memory health cache (no DB migration needed)
- Promise.allSettled for parallel checks (one slow node doesn't block)
- Exponential backoff on failures: 30s → 60s → 120s → 300s max
- Reset to 30s on first success after failure
- State transition logging (healthy ↔ unhealthy)
- Expose health status via GET /api/monitoring/health (localProviders)
- Auto-init on first import (same pattern as tokenHealthCheck)
- 401 treated as healthy (server up, auth required)
- isNodeHealthy() returns true if never checked (optimistic default)
Embedding endpoint (/v1/embeddings) only supports 6 hardcoded cloud
providers. Local inference backends (oMLX, Ollama) serving embeddings
via provider_nodes are inaccessible through OmniRoute.
This adds dynamic provider_node support for embeddings:
- Add EmbeddingProvider interface and buildDynamicEmbeddingProvider()
- Add Phase 2 (provider_nodes prefix match) in parseEmbeddingModel()
- Handler accepts resolvedProvider/resolvedModel from route (injection pattern)
- Handler supports authType=none for local providers (was missing — critical gap)
- Route loads local provider_nodes (localhost only — prevents auth bypass/SSRF)
- Route filters by apiType=chat|responses and localhost hostname
- buildDynamicEmbeddingProvider validates inputs (prefix + baseUrl required)
- Per-node try/catch in map — one bad row doesn't block all providers
- DB errors logged and fall back to hardcoded providers
Use shallow copy ({ ...body }) instead of direct reference assignment
so that later translatedBody.model = model does not mutate the
caller's original body object.
When both source and target formats are Claude, skip all request
modification and forward the body untouched. This prevents
prepareClaudeRequest from corrupting valid Claude-native requests
destined for anthropic-compatible provider nodes.
When Claude Code compacts conversation context to fit within token
limits, it may remove assistant messages containing tool_use/tool_calls
while leaving the corresponding tool_result/function_call_output
messages intact. This creates orphaned tool results that cause
providers to reject requests with errors like "tool result's tool id
not found" or "No tool call found for function call output".
Prevents infinite retry loops when models generate tool calls with
empty function names. The normalizeToolName function converted these
to "placeholder_tool" which does not exist in any client's tool
registry, causing repeated error-retry cycles.
Reasoning models (o1, o1-pro, o3, o3-mini) reject standard parameters
like temperature and top_p with 400 Bad Request. OmniRoute's default
executor forwards all parameters without filtering.
This fix adds declarative parameter filtering:
- Add unsupportedParams[] field to RegistryModel interface
- Add REASONING_UNSUPPORTED frozen constant shared across entries
- Add o1-pro, o3, o3-mini to OpenAI registry (were missing)
- Add getUnsupportedParams() helper with:
- O(1) precomputed map lookup (not O(N×M) scan)
- Cross-provider routing support via precomputed map
- Prefixed model ID support (e.g., "openai/o3" → "o3")
- Strip unsupported params in chatCore.ts before executor call
- Use Object.hasOwn() for safe property check (no prototype chain)
- Log stripped params at WARN level for visibility
- Remove apiKey===null heuristic (too broad — could match cloud providers
with non-standard auth). Use URL-based detection only.
- Guard local 404 branch with provider && model check — if either is null,
fall through to standard connection lockout (safer behavior).
- Document LOCAL_HOSTNAMES as module-load-time constant (restart required).
- Document PROVIDER_PROFILES.local as intentionally not yet wired.
When a local inference backend (oMLX, Ollama, LM Studio) returns 404
for an unknown model, OmniRoute previously locked the entire connection
for 2 minutes — blocking all valid models on that connection.
This fix introduces local provider detection and changes the 404
behavior for local providers:
- Model-only lockout (5s) instead of connection-level lockout (2min)
- Connection stays active — other models continue working immediately
- Detection via URL heuristic (localhost/127.0.0.1) + apiKey===null fallback
- Configurable via LOCAL_HOSTNAMES env var for Docker setups
Also fixes a pre-existing bug where the model parameter was not passed
to markAccountUnavailable() from chat.ts, preventing per-model lockouts
from working at all.
Changes:
- Add isLocalProvider(baseUrl) helper in providerRegistry.ts
- Add COOLDOWN_MS.notFoundLocal (5s) and PROVIDER_PROFILES.local
- Add local 404 branch in markAccountUnavailable() in auth.ts
- Pass model param to markAccountUnavailable() in chat.ts (bug fix)
Kilo Gateway (api.kilo.ai/api/gateway) is an OpenAI-compatible API
offering 335+ models via a single API key, including 6 free models
and 3 auto-routing models (frontier/balanced/free).
This is distinct from the existing KiloCode provider which uses
OAuth + /api/openrouter/ endpoint.
- Register kilo-gateway in providerRegistry.ts (alias: kg)
- Add to APIKEY_PROVIDERS in providers.ts
- Add models endpoint config in route.ts
- Add official Kilo AI icon (favicon)
Even with EXPERIMENTAL_TURBOPACK=0 and NEXT_PRIVATE_BUILD_WORKER=0, Next.js 16
instrumentation chunks still emit require('better-sqlite3-<16hexchars>') and
require('zod-<16hexchars>') into the compiled .js files inside .next/server/.
The webpack externals function in next.config.mjs patches the runtime bundler
but does NOT rewrite already-compiled chunks. Added step 5.6 to prepublish.mjs:
walks all .js files in app/.next/server/ and strips the 16-char hex suffix from
any require() string that matches the Turbopack hash pattern.
Also updated deploy-vps workflow: npm registry rejects 299MB packages, so
deployment now uses npm pack + scp + npm install -g /tmp/omniroute-*.tgz.
PM2 entry point is app/server.js inside the npm global package.
8 tests covering:
- Valid OpenAI format tools (tool.function.name) preserved
- Valid Anthropic format tools (tool.name) preserved
- Empty names in both formats filtered
- Mixed format array handling
- Null/whitespace edge cases
Regression tests verify the fix from PR #397 prevents all anthropic-
format tools from being silently dropped by the empty-name filter.
Turbopack in Next.js 16 hashes ALL serverExternalPackages (not just better-sqlite3),
emitting require() calls like 'zod-dcb22c6336e0bc69', 'pino-28069d5257187539' etc.
that don't exist in node_modules.
Changes:
- next.config.mjs: Replace single-package check with a HASH_PATTERN regex
that strips '<name>-<16hexchars>' suffix for any externalized package.
Also adds KNOWN_EXTERNALS set for exact-name matching.
- scripts/prepublish.mjs: Add NEXT_PRIVATE_BUILD_WORKER=0 env to reinforce
webpack mode. Add post-build scan that reports hashed refs so CI is visible.
Closes#396, addresses #398
Add Synthetic (synthetic.new) as a privacy-focused LLM provider
with OpenAI-compatible API, dynamic model catalog via /models
endpoint, and passthrough model support.
- Register provider in providerRegistry.ts with 6 initial models
- Add APIKEY_PROVIDERS entry with verified_user icon (#6366F1)
- Add models listing config for /api/providers/[id]/models endpoint
- passthroughModels enabled for dynamic model catalog
Allow provider_nodes to configure custom chat and models endpoint
paths via chatPath/modelsPath fields. This enables providers with
non-standard versioned APIs (e.g. /v4/chat/completions) to work
without embedding the version prefix in base_url.
- Add migration 003: chat_path and models_path columns
- Update Zod schemas (create, update, validate)
- Update CRUD in providers.ts (INSERT/UPDATE)
- Wire chatPath/modelsPath through API routes and providerSpecificData cascade
- Read chatPath in DefaultExecutor and BaseExecutor buildUrl()
- Use modelsPath in validate endpoint
- Add Advanced Settings UI section (collapsible) in create/edit modals
- Update base URL hint to reference Advanced Settings
- Add i18n keys across all 30 locales
- Add unit tests for buildUrl with custom paths
Backward compatible: NULL chatPath/modelsPath = default behavior.
The filter introduced in #346 only checked OpenAI-format tool names
(tool.function.name), silently dropping all tools when the request
arrives in Anthropic Messages API format (tool.name without .function).
This happens when LiteLLM proxies requests with anthropic/ model prefix —
it translates to Anthropic format before forwarding, so OmniRoute receives
Claude-format tools. The filter drops them all, causing Anthropic API to
return 400: 'tool_choice.any may only be specified while providing tools'.
Fix: check both formats with fn?.name ?? tool.name.
All tests pass except pre-existing clearAccountError module resolution (dataPaths) which is unrelated to this PR. Merging codex native passthrough fix.
Match both slash styles when removing build-machine paths from the
staged standalone bundle so the sanitization step works on Windows
and POSIX builds.
While touching the helper, replace the custom basename logic with
Node's built-in `path.basename` for clarity.
Prepare a dedicated `.next/electron-standalone` bundle before
running electron-builder so desktop packaging operates on a stable,
Electron-specific server payload.
This also adds a preflight that rejects standalone bundles whose
top-level `node_modules` is a symlink, because electron-builder
preserves `extraResources` symlinks and would otherwise ship an app
that depends on the build machine at runtime.
- eslint.config.mjs: add missing ignores for vscode-extension/,
electron/, docs/, app/.next/, clipr/ — ESLint was OOMing because
it scanned huge VS Code binary blobs and build artifacts
- tests: remove stale ALTER TABLE 'group' statements — column is now
part of the base schema in core.ts; tests were failing with
SQLITE_ERROR: duplicate column name
- .husky/pre-commit: add npm run test:unit to block broken tests
from reaching CI
Stabilize the bootstrap metadata test by clearing
INITIAL_PASSWORD before each run and add focused coverage
for env-backed and stored-password states.
Log settings lookup failures before returning the
bootstrap-safe fallback payload so operational errors are
still visible on the server side.
Normalize numeric pino levels correctly in the console log API so the logger transport fix does not misclassify info, warn, and error entries in file-backed logs.
Add a targeted regression test for numeric log entries.
Keep the existing level formatter for direct logger paths, but drop
that formatter from transport-backed configs because pino rejects it
when transport.targets is used.
This restores the intended stdout+file transport path and avoids the
startup fallback warning on every boot.
Add localhost and 127.0.0.1 to allowedDevOrigins so local dev
sessions opened on loopback addresses do not have their Next.js HMR
websocket blocked as cross-origin.
Point the login page at the existing public bootstrap endpoint
instead of the protected /api/settings route.
Also extend the public bootstrap response with hasPassword and
setupComplete so unauthenticated users get the correct first-run
or password-setup flow without triggering a 401.
Thanks @kfiramar! 🎉 Critical security fix — different startup paths were generating different `STORAGE_ENCRYPTION_KEY` values over the same SQLite database, causing `Unsupported state or unable to authenticate data` for all stored tokens.
Improvements added on top:
- Normalized `overridePath?.trim()` in `electron/main.js` to match `bootstrap-env.mjs` (addresses kilo-code-bot warning #1)
- Added explanatory comment documenting the `preferredEnv` merge order intent in Electron startup (addresses kilo-code-bot warning #3)
4 commits + 113-line test file. The fail-closed behaviour (refusing to mint a new key when encrypted rows exist) is an excellent safeguard. Merged!
Thanks @kfiramar! 🎉 Critical fix — stale error metadata on recovered provider accounts was preventing valid accounts from being selected properly after recovery.
Improvement added on top: documented the two valid success-check patterns (`result.success` for open-sse handlers vs `response?.ok` for fetch-based handlers) to address the kilo-code-bot review warning — both patterns are correct by design, now explicitly documented.
5 commits total, 2 test files (+168 lines of coverage). Merged!
Thanks @kfiramar! Perfect minimal fix — `t("deleteConnection")` was requesting a non-existent key across all 30 locales, causing `MISSING_MESSAGE: providers.deleteConnection` runtime errors on every provider detail page load. Reusing the existing `providers.delete` key is the correct fix. Merged!
Thanks @kfiramar! 🎉 Critical schema fix — the `group` column was used in all provider_connections queries but missing from the base schema and backfill migration. Databases upgraded from older versions were silently failing on group-related queries. Clean fix with regression test. Merged!
Tighten the helper signatures added for recovered provider cleanup.
This removes the new any-typed recovery parameters called out in
review without broadening the PR into unrelated auth typing work.
Clear recovered provider error metadata after successful
credentialed requests in non-chat API routes as well.
Add route-level regression tests covering a Response-based
success path and a result-object success path.
Refine the recovered-account regression test to match the real
observed state: an account can remain active while still carrying
stale refresh-failure metadata.
This verifies that getProviderCredentials surfaces those fields
and that clearAccountError clears them through the real runtime
path.
Pass errorCode, lastErrorType, and lastErrorSource through the
runtime credentials object so clearAccountError can clear stale
provider error metadata after a real successful request.
Also update the regression test to use getProviderCredentials,
matching the production call path.
Add the missing provider_connections.group column to both the
base schema and the runtime column backfill path.
Also add a regression test covering upgrade from an older
database that does not yet have the column.
Clear errorCode, lastErrorType, and lastErrorSource when an
account recovers so provider state returns to a fully clean
active status.
Add a focused regression test for recovered-account cleanup.
Keep getPreferredEnvFilePath consistent with its env parameter by
passing that env through resolveDataDir in both bootstrap and Electron.
This avoids silently falling back to process.env when a custom env map
is supplied.
Treat empty or whitespace-only dataDirOverride values as unset so
bootstrapEnv keeps using the normal DATA_DIR and .env lookup path.
Adds a focused regression test for the whitespace override case.
Propagate database inspection failures instead of treating them as
missing encrypted credentials.
This keeps startup from generating a fresh encryption key when an
existing database cannot be inspected and adds a regression test for
that path.
Align the app bootstrap paths with the documented CLI env lookup.
The CLI wrapper already loads DATA_DIR/.env, ~/.omniroute/.env, or ./.env,
but run-next, run-standalone, and Electron were bypassing that behavior.
On machines with encrypted credentials, that could generate a fresh
STORAGE_ENCRYPTION_KEY in server.env and make existing tokens unreadable.
This change:
- uses the same preferred .env lookup in bootstrapEnv and Electron
- keeps Electron secrets rooted in DATA_DIR and passes DATA_DIR to the child
- refuses to mint a new encryption key over an existing encrypted database
- adds a focused regression test for env precedence and key safety
Thanks @rexname (Maulana Hasanudin)! 🎉
Codex account quota policy (5h/weekly) with auto-rotation is now merged. Highlights:
- Per-account policy toggles (5h + weekly ON/OFF) in the Provider dashboard
- Accounts automatically skipped when enabled quota window reaches 90% threshold
- Auto re-eligibility when resetAt timestamp passes (no manual intervention needed)
- Side-effect free `getQuotaWindowStatus` getter design
- Safe partial merge of `codexLimitPolicy` on provider updates
Merged on top of main (v2.5.0) with no conflicts. Analytics label fix (#356) included. Thanks for the excellent quality and the 2-commit cleanup round! 🙏
Add a default-off dashboard setting that injects Codex fast service tier only when the request did not already specify one.
Also preserve service_tier through OpenAI-to-Responses translation and restore the setting at startup.
PR #363 added allowedConnections as 3rd arg in chat.ts calls to
getProviderCredentials(), but the function signature in auth.ts
only declared 2 params. Adding the optional 3rd param and applying
the connection filter when provided.
- add user-facing success/error notifications for Codex limit toggle API calls
- deduplicate Codex policy default normalization in providers page
- make getQuotaWindowStatus side-effect free (no cache mutation in getter)
- avoid stale threshold blocking after resetAt has passed
- extract named Codex quota threshold constant
- extract helper for earliest future reset date selection
Add gpt-5.4 to the Codex model registry so OmniRoute exposes cx/gpt-5.4 and codex/gpt-5.4 in its model catalog.
Includes a focused regression test for model resolution.
- add quota window status helper for Codex session (5h) and weekly windows
- enforce policy-based account filtering when enabled windows reach threshold
- return all-rate-limited metadata when no Codex account is eligible
- add per-account dashboard toggles for 5h and weekly policy controls
- merge codexLimitPolicy safely on provider updates to preserve partial settings
- document purpose and usage scenarios in README (EN + ID + i18n note)
Merged! Excellent contribution @AndersonFirmino 🎉
This PR delivers four major improvements:
- **strict-random** strategy — Fisher-Yates shuffle deck with anti-repeat guarantee and mutex serialization for concurrent safety
- **API key controls** — allowedConnections, is_active, accessSchedule, autoResolve
- **Connection groups** — environment-based grouping view in Limits page with localStorage persistence
- **i18n** — 30 languages fully updated, pt-BR fully translated
655 tests passing. Merged with main (v2.4.4) — no conflicts. Thank you for the exceptional quality!
- fix#355: increase STREAM_IDLE_TIMEOUT_MS from 60s to 300s to prevent
premature stream abortion for extended-thinking models (claude-opus-4-6,
o3, etc.) that can pause >60s during reasoning phases. Configurable via
STREAM_IDLE_TIMEOUT_MS env var.
- fix#350: combo health check test now bypasses REQUIRE_API_KEY=true by
sending X-Internal-Test header, recognized in chat.ts auth pipeline to
skip API key validation for internal admin-side combo tests. Also
extended test timeout from 15s to 20s. Uses OpenAI-compatible format
universally (not Claude-style).
- fix#346: filter out tools with empty function.name before forwarding
to upstream providers. Claude Code sends empty-name tool definitions
that cause '400 Invalid input[N].name: empty string' on OpenAI-compat
providers. Extends existing message/input empty-name filter.
Merged via review workflow. Excellent contribution by @Regis-RCR — 3-tier pricing resolution with LiteLLM sync, 23 tests, fully opt-in. Minor improvement noted: dashboard UI for sync status will be added in a follow-up.
- New: open-sse/services/apiKeyRotator.ts — round-robin rotation
between primary API key + providerSpecificData.extraApiKeys[]
- Modified: open-sse/executors/base.ts — buildHeaders() rotates key
using getRotatingApiKey() when extraApiKeys configured
- Modified: open-sse/handlers/chatCore.ts — injects connectionId into
credentials to enable per-connection rotation index tracking
- Modified: providers/[id]/page.tsx — 'Extra API Keys' UI section in
EditConnectionModal: add/remove keys, persisted in providerSpecificData
T08 (quota window rolling) and T13 (wildcard model routing) confirmed
already implemented in accountFallback.ts and wildcardRouter.ts.
- Add .catch() to initial and periodic sync promises (Gemini, Kilo)
- Wrap JSON.parse in try-catch for corrupted DB data (Kilo)
- Wrap response.json() in try-catch for invalid LiteLLM JSON (Kilo)
- Validate PRICING_SYNC_INTERVAL (guard against NaN/0 → tight loop) (Copilot)
- Validate and allowlist sources — reject unknown, prevent empty sync
from clearing pricing_synced data (Copilot, Kilo)
- Extract merge loop into shared iteration to reduce duplication (Gemini)
- Add data/warnings fields to MCP output schema (Copilot)
- Remove unused z import in vitest (Copilot)
- Filter non-string entries from sources array in API route (Copilot)
- Track active interval for accurate getSyncStatus().nextSync (Copilot)
getProviderCredentials already filtered by allowedConnections, but
chat.ts never passed the field from apiKeyInfo. Now both call sites
(combo pre-check and credential retry loop) forward the restriction.
Fixes race condition in combo strict-random (concurrent requests could
reshuffle simultaneously). Eliminates code duplication between combo.ts
and auth.ts by extracting Fisher-Yates shuffle + deck logic into
src/shared/utils/shuffleDeck.ts with per-namespace mutex serialization.
- Combo layer: strict-random in combo.ts rotates models uniformly
- Credential layer: strict-random in auth.ts rotates connections/accounts
- Anti-repeat guarantee: last of previous cycle ≠ first of next
- Mutex serialization for concurrent request safety
- Independent decks per combo name and per provider
- allowedConnections: restrict which connections a key can use
- autoResolve: per-key toggle for ambiguous model disambiguation
- is_active: enable/disable key instantly (403 on disabled)
- accessSchedule: time-based access control (hours, days, timezone)
- Rename keys via PATCH /api/keys/:id
- Connection restriction badge in API keys table
- Auto-migration for all new columns
- Connection group field on provider connections
- Environment grouping view in Limits page (group by environment)
- Accordion UI with expand/collapse per group
- localStorage persistence for groupBy, autoRefresh, expandedGroups
- Smart default: auto-switches to environment view when groups exist
- Swap SessionsTab above RateLimitStatus
- strict-random option added to combo strategy dropdown (30 languages)
- strategyGuide.strict-random (when/avoid/example)
- pt-BR: translated all strategyRecommendations from English to Portuguese
- en: added API key management strings (accessSchedule, isActive, etc.)
- 11 tests: shuffle deck mechanics (Fisher-Yates, anti-repeat, decks)
- 6 tests: allowedConnections (schema, DB persistence, cache invalidation)
- 12 tests: API key policy (isActive, accessSchedule, autoResolve, budget)
* fix: tool description null sanitization, clipboard HTTP fallback fixes
T10 - Sanitize tool.description null in claude-to-openai translator
- claude-to-openai.ts: tool.description defaults to empty string when null/undefined
- claude-to-openai.ts: filter out tools with empty/missing names
- Prevents 400 validation errors on providers like NVIDIA NIM (issue #276)
T11 - Fix copy buttons to work on HTTP/non-HTTPS deployments
- Add src/shared/utils/clipboard.ts with HTTPS+HTTP (execCommand) dual fallback
- Migrate useCopyToClipboard.ts to use shared utility
- Migrate ConsoleLogViewer.tsx, RequestLoggerV2.tsx to shared utility
- Migrate HomePageClient.tsx, endpoint/page.tsx, GetStarted.tsx
- Migrate DefaultToolCard.tsx to shared utility
- Fixes copy buttons when OmniRoute runs behind HTTP proxy (issue #296)
T02 - Verified SSE [DONE] sentinel handling already correct
- sseParser.ts filters [DONE] on line 13 (no change needed)
- stream.ts uses doneSent flag to prevent duplicate sentinel
- bypassHandler.ts correctly separates streaming/non-streaming responses
Issue triage comments posted to #340, #341, #344
* feat: DB read cache + Accept header stream negotiation (T09/T01)
T09 - In-memory TTL cache for hot DB read paths
- Add src/lib/db/readCache.ts with TTL cache (5s settings/connections, 30s pricing)
- Eliminates redundant SQLite reads on concurrent requests
- Integrate invalidation in settings.ts updateSettings() and updatePricing()
- Integrate invalidation in providers.ts create/update/delete operations
- Export getCachedSettings, getCachedPricing, getCachedProviderConnections,
invalidateDbCache via localDb.ts for consumer migration
- Cache auto-busts on any write, preserving data consistency
T01 - Accept header stream negotiation
- src/sse/handlers/chat.ts: detect Accept: text/event-stream header
- Override body.stream=true when Accept header indicates streaming client
- Enables curl, httpx and SDK clients that use HTTP headers instead of JSON
body field to trigger streaming responses
- Logs Accept override at DEBUG level for observability
* fix: auto-advance quota window on expiry to prevent stale blocking (T08)
T08 - Quota Window Rolling Auto-Advance
- quotaCache.ts: add windowDurationMs field to QuotaCacheEntry interface
(optional field that callers can set when they know the window duration)
- Add advancedWindowResetAt() helper: if entry.nextResetAt is in the past,
eagerly returns { exhausted: false } so requests are unblocked immediately
- isAccountQuotaExhausted() now uses advancedWindowResetAt() instead of
the previous inline date check, and optimistically clears entry.exhausted
flag to avoid re-checking the same stale entry on the next request
Before: exhausted accounts with an expired resetAt would wait up to 5
minutes for the background refresh before accepting new requests.
After: the first request after resetAt passes will be immediately accepted
and will trigger a quota refresh on the next background tick.
* feat: manual OAuth token refresh UI (T12)
T12 - Manual Token Refresh UI
- Add POST /api/providers/[id]/refresh endpoint
- Validates connection exists and is OAuth type
- Calls getAccessToken() (same helper used in auto-refresh)
- Persists new credentials via updateProviderCredentials()
- Returns { success, expiresAt, refreshedAt } on success
- Update providers/[id]/page.tsx
- handleRefreshToken() with loading state (refreshingId)
- Pass onRefreshToken + isRefreshing props to ConnectionRow
- ConnectionRow: add optional onRefreshToken/isRefreshing props
- ConnectionRow: tokenMinsLeft state via lazy init (Date.now() in
getter fn, not in render body - satisfies react-hooks/purity)
- Token expiry badge: red 'expired' | amber '~Xm' (<30min) | hidden
- 'Token' button (amber) next to 'Retest' for OAuth connections
- Add en.json i18n: tokenRefreshed, tokenRefreshFailed
* Initial plan
* feat: integrate wildcardRouter into model alias resolution (T13)
T13 - Wildcard Model Routing
- Import resolveWildcardAlias from wildcardRouter.ts into model.ts
- In getModelInfoCore(), after exact alias check fails, try glob wildcard
alias matching (e.g., 'claude-sonnet-*' alias → 'anthropic/claude-sonnet-4')
- Returns { provider, model, extendedContext, wildcardPattern } on match
- Falls back to MODEL_TO_PROVIDERS lookup and openai default as before
* fix: clipboard cleanup and tool validation
* feat: media page UX + T04 playground uploads + T03 HuggingFace/Vertex AI
Media Page (MediaPageClient.tsx):
- Render images inline (img tags from b64_json or url)
- Show transcription as plain readable text (not raw JSON)
- Amber banner for credential errors with link to /dashboard/providers
- Detect empty transcription result and show credentials hint
- Provider credential hint below selector for non-local providers
- Extended provider/model lists: HuggingFace, Qwen TTS, Inworld, Cartesia, PlayHT, AssemblyAI
T04 - Playground File Uploads (playground/page.tsx):
- Audio file upload panel for transcription endpoint (multipart/form-data)
- Image upload panel for vision models (gpt-4o, claude-3, gemini, pixtral, llava...)
- Auto-detect vision models by name heuristic
- Inject uploaded images as base64 image_url in chat messages
- Inline image rendering for image generation results
- Readable text view for transcription results with copy button
- Preview thumbnails for attached images with individual remove
T03 - HuggingFace + Vertex AI Providers:
- HuggingFace: frontend providers.ts + backend providerRegistry.ts
Uses HuggingFace Router OpenAI-compatible endpoint
- Vertex AI: frontend providers.ts + backend providerRegistry.ts
Uses gemini format with generateContent API (urlBuilder fallback)
T07 - API Key Round-Robin: VERIFIED already implemented in auth.ts
fill-first, round-robin, p2c, random, least-used, cost-optimized strategies
* feat: T05 task-aware routing + fix#302 stream override + fix#73 claude provider fallback
T05 - Task-Aware Smart Routing:
- New open-sse/services/taskAwareRouter.ts:
Detects 7 task types: coding, creative, analysis, vision, summarization,
background, chat from system/user message content and images
Configurable taskModelMap per task type, stats tracking
applyTaskAwareRouting() integrates with existing chat pipeline
- New src/app/api/settings/task-routing/route.ts:
GET/PUT/POST API for task routing config + reset-stats + detect action
Persists config via updateSettings('taskRouting')
- Integration in src/sse/handlers/chat.ts:
applyTaskAwareRouting() called after policy enforcement, before combo resolve
Logs task type detection and model overrides
Fix#302 - OpenAI SDK stream=False drops tool_calls:
- src/sse/handlers/chat.ts T01 Accept header negotiation:
Changed condition from 'body.stream !== true' to 'body.stream === undefined'
OpenAI Python SDK sends 'Accept: application/json, text/event-stream' in every
request, even stream=False — the old code was incorrectly forcing stream=true,
causing tool_calls to be dropped from non-streaming responses
Fix#73 - Claude Haiku routed to OpenAI provider instead of Antigravity:
- open-sse/services/model.ts getModelInfoCore():
Added heuristic prefix detection before the blind 'openai' fallback:
claude-* models → antigravity (Anthropic) provider
gemini-*/gemma-* models → gemini provider
Closes: #73, partially addresses #302
* fix: token counts 0 (#74), model import dup (#180), model route fallback (#73)
fix#74 - Token counts always 0 for Antigravity/Claude streaming:
- open-sse/utils/usageTracking.ts extractUsage():
Add handler for 'message_start' SSE event which carries INPUT tokens in
Antigravity/Claude streaming:
{ type: 'message_start', message: { usage: { input_tokens: N } } }
This event was completely unhandled, causing ALL input token counts to be
dropped for every Antigravity/Claude streaming request
fix#180 - Model import shows duplicates with no visual feedback:
- src/shared/components/ModelSelectModal.tsx:
Added addedModelValues prop (string[]) to receive already-added model values
Models already in the combo now shown with ✓ indicator + green highlight
Makes it visually clear which models are already added vs new
- src/app/(dashboard)/dashboard/combos/page.tsx:
Pass addedModelValues={models.map(m => m.model)} to ModelSelectModal
* Harden clipboard UX and Claude tool normalization (#360)
* Initial plan
* chore: plan updates for clipboard and translator fixes
* fix: clipboard cleanup, copy feedback, and claude tool validation
---------
Co-authored-by: openai-code-agent[bot] <242516109+Codex@users.noreply.github.com>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: openai-code-agent[bot] <242516109+Codex@users.noreply.github.com>
- docs/openapi.yaml: update info.version from 2.3.6 to 2.4.1 (fixes CI check)
- CHANGELOG.md: add '## [Unreleased]' section as first heading (required by check-docs-sync)
- scripts/check-docs-sync.mjs: fix regex to accept both hyphen (-) and em-dash (—)
as date separators in changelog headings (standard Keep a Changelog format)
- .husky/pre-commit: add 'node scripts/check-docs-sync.mjs' to catch version
mismatches locally before push
- Move 'Free Stack ($0)' to position 1 in COMBO_TEMPLATES (was 4th, invisible in 3-col grid)
- Add isFeatured flag to free-stack for special styling
- Change template grid: grid-cols-3 → 2x2 (sm:grid-cols-2) — all 4 templates visible
- Free Stack: green border/bg (emerald), FREE badge, larger text size
- Other templates: hover styles preserved, → arrow on Apply link
- Increase templates section padding
- fix(oauth): restore iFlow clientSecret default — was empty string, now uses the valid public key (#339)
- fix(mitm): compile src/mitm/*.ts to JS during prepublish so server.js exists in npm bundle (#335)
- fix(gemini-cli): graceful projectId fallback — warn + empty string instead of hard 500 error (#338)
- feat(models): add gpt5.4 to Codex; add claude-sonnet-4, claude-opus-4.6, deepseek-v3.2, minimax-m2.1, qwen3-coder-next, auto to Kiro (#334)
- fix(electron): sync electron/package.json version to 2.3.13 (#323)
- feat(scoring): add tierPriority (0.05) to ScoringWeights Zod schema and combos/auto API route
The @swc/helpers override removal changed dependency resolution.
npm ci was failing with 'Missing: @swc/helpers@0.5.15 from lock file'.
Updated lock file with npm install --package-lock-only.
star-history.com embeds are often cached and slow to update. The new
starchart.cc widget (variant=adaptive) renders better on both light and
dark themes and updates in real-time.
Updated: README.md + 29 i18n locale READMEs
The @swc/helpers override in package.json duplicated the direct dependency
at the exact same version (0.5.19), causing 'EOVERRIDE' errors when pnpm
users tried to rebuild native modules like better-sqlite3.
Fixes:
- Remove redundant 'overrides' block (direct dep already pins 0.5.19)
- Add pnpm.onlyBuiltDependencies for @parcel/watcher, @swc/core,
better-sqlite3, esbuild, omniroute, sharp (replaces pnpm approve-builds)
- Add pnpm usage note to README Quick Start
Closes#328
Add API_BRIDGE_PROXY_TIMEOUT_MS env var to configure the api-bridge
proxy timeout. Default remains 30000ms for backward compatibility.
Handles invalid values with a warning log.
Co-authored-by: hijak <54431520+hijak@users.noreply.github.com>
- Use timingSafeEqual for constant-time password comparison
- Require non-empty currentPassword when INITIAL_PASSWORD env is set
- Legacy fallback: allow empty or '123456' when no INITIAL_PASSWORD
Co-authored-by: hijak <54431520+hijak@users.noreply.github.com>
The outer <details> block at line 1459 was never closed, causing GitHub
to stop rendering everything below Troubleshooting (Tech Stack, Docs,
Roadmap, Contributors, etc.).
Fixes: README truncation on GitHub
kilocode renders ASCII logo banner on startup causing false healthcheck_failed
timeouts on cold-start or low-resource environments (VPS, CI, dashboard)
All 3 throw new Error(data.error) replaced with proper extraction:
typeof error === object ? error.message : error
Fixes Cline and other OAuth providers showing [object Object] on connection failure
- cline.ts: add decodeURIComponent before base64 decode to handle URL-encoded codes
- cline.ts: populate name = firstName+lastName || email in mapTokens
- oauth/exchange route: normalize name=email for all providers on exchange/poll/poll-callback
- Fixes: accounts showing Account #ID instead of email in providers dashboard
- Add cliTools.toolDescriptions.opencode, .kiro, guides.opencode, guides.kiro to en.json
- Sync 1111 missing keys across 29 language files (English fallbacks)
- Fix [object Object] in provider batch test modal:
normalize data.error object to string before setTestResults()
and in ProviderTestResultsView rendering
- Bump version to 2.3.6
2026-03-12 11:11:15 -03:00
2557 changed files with 668015 additions and 139563 deletions
description: Automatically run the browser_subagent to visually validate all new UI features from the current release and capture evidence WebP recordings of the changes.
---
# Capture Release Evidences Workflow
Use this workflow to automatically drive the `browser_subagent` to explore the newly deployed or locally running application and record evidence of the UI changes introduced in the latest release.
## Prerequisites
- OmniRoute must be actively running and accessible (e.g. locally at `http://localhost:20128` or on the Local VPS at `http://192.168.0.15:20128`).
- The user must provide the target URL to be tested, or default to `http://192.168.0.15:20128`.
## Workflow Steps
### 1. Identify Target Features
Review the `CHANGELOG.md` for the latest version to map out the new UI elements. For example:
For each identified feature, invoke the `browser_subagent` using the `default_api:browser_subagent` tool.
**Important Task Guidelines for the Subagent:**
-`TaskName`: Give it a clear name like "Validate CLIProxyAPI Tool Tab".
-`TaskSummary`: "Navigate to the CLI Tools tab and verify the new Integration settings."
-`Task`: Provide unambiguous instructions for the subagent, such as: "Navigate to http://192.168.0.15:20128/dashboard. Click on the 'Settings' or 'CLI Tools' nav link. Scroll down to find the CLIProxyAPI integration card. Hover over it to trigger UI state. Verify the components render correctly and exit."
-`RecordingName`: Ensure it describes the feature (e.g. `v3_4_5_cli_proxy_api`). This is required and strictly automatically saved as a WebP artifacts video by the system.
_(Note: The `browser_subagent` automatically creates a WebP recording named by the `RecordingName` parameter. No additional tools for screenshots are needed.)_
### 3. Generate Report Artifact
After the `browser_subagent` finishes its sessions, generate a final Markdown artifact (using `write_to_file` and `IsArtifact=true`) to present the recordings inline to the user using the `` syntax.
### Example Invocation
\```json
{
"TaskName": "Validating Qoder PAT Configuration UI",
"TaskSummary": "Validates the Qoder provider configuration modal",
"Task": "Go to http://192.168.0.15:20128/dashboard. Click on the 'Providers' tab. Find 'Qoder' in the list. Click 'Add Token' or 'Configure'. Type 'test_token' and submit. Return when done.",
Look up the latest GitHub Actions CI run for the current release branch, diagnose all failures, fix them locally, push, and wait for the new CI run to go green before notifying the user.
---
## Phase 1: Identify the Failing CI Run
### 1. Determine the current release version and branch
In OmniRoute >= 3.6.6, the `package.json` declares the main CLI binary as `bin/omniroute.ts`. While Node 22 introduced experimental support for executing TypeScript files natively via the `--experimental-strip-types` flag (and `--experimental-transform-types`), Node.js explicitly disables this feature for any files loaded from inside a `node_modules` directory.
When users install `omniroute` globally (or locally) via npm, the CLI shim points to `bin/omniroute.ts` inside `node_modules`. Running `omniroute` on Node 22 results in:
`Error [ERR_UNSUPPORTED_NODE_MODULES_TYPE_STRIPPING]: Stripping types is currently unsupported for files under node_modules`
This makes the CLI completely unusable out-of-the-box on Node 22, despite the `engines` field claiming support for `>=22.22.2`.
## Proposed Solution
The CLI entrypoint distributed in the NPM package must be a JavaScript file, not a TypeScript file.
1. Restore `bin/omniroute.mjs` as the actual CLI entrypoint, or implement a build step during `npm run build:cli` that compiles `bin/omniroute.ts` to `bin/omniroute.mjs` before publishing.
2. Update `package.json` to point the `bin` field back to the compiled `.js`/`.mjs` file:
```json
"bin": {
"omniroute": "bin/omniroute.mjs"
}
```
3. Ensure `scripts/prepublish.ts` handles the compilation or validation of the binary entrypoint correctly so this regression doesn't happen again.
## Task
Please implement this fix, ensure the tests pass, and create a Pull Request against the main branch.
@@ -4,16 +4,55 @@ description: Create a new release, bump version up to 1.x.10 threshold, update c
# Generate Release Workflow
Bump version, finalize CHANGELOG, commit, tag, push, publish to npm, and create GitHub release.
Bump version, finalize CHANGELOG, commit, open a **PR to main** and wait for user confirmation before tagging, publishing, and deploying.
> **VERSION RULE: Always use PATCH bumps (2.x.y → 2.x.y+1)**
> NEVER use `npm version minor` or `npm version major`.
> Always use: `npm version patch --no-git-tag-version`
> The threshold rule: when `y` reaches 10, bump to `2.(x+1).0` — e.g. `2.1.10` → `2.2.0`.
## Steps
> **🔴 SINGLE BRANCH RULE**: The `release/vX.Y.Z` branch is the **ONLY** development branch for the entire release cycle. ALL work — bug fixes, feature implementations, PR integrations, issue resolutions — MUST be committed directly on this branch. Never create separate `fix/`, `feat/`, or topic branches. When running `/resolve-issues`, `/implement-features`, or `/review-prs`, always work on the current release branch.
**NEVER push directly to main or create tags before the user confirms the PR.**
---
## Phase 0: Security Verification (MANDATORY)
Before creating the release, you must ensure the codebase and supply chain are secure and free of known vulnerabilities.
1.**Run Local Dependencies Audit:**
```bash
npm audit
```
_Fix any `high` or `critical` vulnerabilities identified._
2. **Check GitHub CodeQL & Dependabot Alerts:**
Navigate to the repository's **Security** tab on GitHub, or use the project's `vulnerability-scanner` skill to analyze active alerts. Ensure all static analysis findings (e.g., prototype pollution, insecure randomness, ReDoS, shell injections) are addressed and logically committed on a target branch.
---
## Phase 1: Pre-Merge
### 1. Create release branch
```bash
git checkout -b release/v2.x.y
```
### 2. Determine new version
Check current version in `package.json` and increment the **patch** number only:
@@ -27,12 +66,28 @@ Version format: `2.x.y` — examples:
- `2.1.9` → `2.1.10` (patch)
- `2.1.10` → `2.2.0` (minor threshold — do manually with `sed`)
```bash
# ALWAYS use patch:
npm version patch --no-git-tag-version
```
> **⚠️ ATOMIC COMMIT RULE — Version bump MUST happen before committing feature files.**
>
> **CORRECT order:**
>
> 1. `npm version patch --no-git-tag-version` ← bump first
> 2. implement features / fix bugs
> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
>
> **OR if features are already staged:**
>
> 1. implement features (do NOT commit yet)
> 2. `npm version patch --no-git-tag-version` ← bump before committing
> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
>
> **NEVER do this (creates version mismatch in git history):**
>
> - ~~commit features → then bump version → commit package.json separately~~
>
> This ensures that `git show v2.x.y` always contains both code changes and the version bump together.
> The GitHub release tag will point to a commit that includes ALL changes for that version.
### 2. Regenerate lock file (REQUIRED after version bump)
### 3. Regenerate lock file (REQUIRED after version bump)
**Mandatory** — skipping causes `@swc/helpers` lock mismatch and CI failures:
@@ -40,7 +95,7 @@ npm version patch --no-git-tag-version
npm install
```
### 3. Finalize CHANGELOG.md
### 4. Finalize CHANGELOG.md
Replace `[Unreleased]` header with the new version and date.
Keep an empty `## [Unreleased]` section above it.
@@ -53,58 +108,185 @@ Keep an empty `## [Unreleased]` section above it.
## [2.x.y] — YYYY-MM-DD
```
### 4. Update openapi.yaml version ⚠️ MANDATORY
### 5. Update openapi.yaml version ⚠️ MANDATORY
> **CI will fail** if `docs/openapi.yaml` version ≠ `package.json` version (`check:docs-sync` enforces this).
| `[docs-sync] FAIL - OpenAPI version differs from package.json` | Skipped step 4 — `docs/openapi.yaml` version not updated | Run step 4 (`sed -i ...`) and commit |
| `[docs-sync] FAIL - OpenAPI version differs from package.json` | Skipped step 5 — `docs/openapi.yaml` version not updated | Run step 5 (`sed -i ...`) and commit |
| `[docs-sync] FAIL - CHANGELOG.md first section must be "## [Unreleased]"` | `## [Unreleased]` missing or not at top of CHANGELOG | Add `## [Unreleased]\n\n---\n` before the first versioned `## [x.y.z]` |
| Electron Linux `.deb` build fails (`FpmTarget` error) | `fpm` Ruby gem not installed on `ubuntu-latest` runner | Already fixed in `electron-release.yml` (`gem install fpm` step) |
| Docker Hub `502 error writing layer blob` | Transient Docker Hub network error during ARM64 push | Re-run the Docker publish workflow; no code change needed |
# /implement-features — Feature Request Harvest, Research & Implementation Workflow
## Overview
Fetches open feature request issues, analyzes each against the current codebase, implements viable ones on dedicated branches, and responds to authors with results. Does NOT merge to main — leaves branches for author validation.
A **5-phase** workflow that systematically harvests feature requests from GitHub issues, creates structured idea files, researches solutions across the internet and Git repositories, presents a consolidated report for user approval, then generates detailed implementation plans and executes them.
## Steps
**Output directory structure:**
### 1. Identify the Repository
```
_ideia/
├── viable/ # Features approved for implementation
│ ├── need_details/ # ❓ Good idea but waiting for author clarification (issues stay OPEN)
│ │ └── 1015-warp-terminal-mitm.md
│ ├── 1046-native-playground.md # ✅ Ready — researched and planned
│ └── 1046-native-playground.requirements.md
├── defer/ # ⏭️ Good ideas deferred for future cycles (issues CLOSED)
│ └── 1041-smart-auto-combos.md
└── notfit/ # ❌ Out of scope / already exists (issues CLOSED)
> **LIFECYCLE RULE:** `viable/` files are **DELETED** once the feature is implemented — they are not moved. Only unimplemented features live in `viable/` (or `viable/need_details/`). Files in `defer/` and `notfit/` remain as permanent reference.
> **BRANCH RULE**: All implementation work MUST happen on the current `release/vX.Y.Z` branch. Never create separate `feat/` branches. If no release branch exists yet, create one first using `/generate-release` Phase 1 steps 1–5.
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 50 --json number,title,labels,body,comments,createdAt,author`
- Filter for issues that are feature requests (label `enhancement`/`feature`, or body describes new functionality, or previously classified as feature request)
- Sort by oldest first
Before doing any work, ensure you are on the current release branch:
### 3. Analyze Each Feature Request
```bash
# Check current branch
git branch --show-current
For each feature request issue, perform a **two-level analysis**:
# If on main, determine next version and create the release branch
- Read the **entire body** — including description, use cases, screenshots, mockups, and any embedded images.
- Read **ALL comments** — community discussion, agreements, restrictions, owner responses, and linked PRs.
- **Images**: If the body or comments contain image URLs (`` or `https://...png/jpg/gif`), note them — they may contain UI mockups, wireframes, or architecture diagrams that are essential to understanding the request.
- You may batch these into parallel calls (up to 4 at a time).
- Sort by oldest first (FIFO).
### 1.4 Create Idea Files (initially in `_ideia/` root)
For each feature request, create a structured idea file in `<project_root>/_ideia/`:
#### 1.4a — If the idea file does NOT exist yet, create it:
```markdown
# Feature: <Title from Issue>
> GitHub Issue: #<NUMBER> — opened by @<author> on <date>
> Status: 📋 Cataloged | Priority: TBD
## 📝 Original Request
<Paste the FULL issue body here, preserving all formatting, images, and code blocks>
## 💬 Community Discussion
<Summarize ALL comments chronologically, noting who said what and any decisions or objections raised>
### Participants
- @<author> — Original requester
- @<commenter1> — <brief role/opinion>
- ...
### Key Points
- <bullet list of the most important discussion points>
- <agreements reached>
- <objections raised>
## 🎯 Refined Feature Description
<YOUR interpretation and enrichment of the feature request. Expand on what was asked, fill in logical gaps, provide concrete examples of how it would work. This section should be MORE detailed and clearer than the original request.>
### What it solves
- <problem 1>
- <problem 2>
### How it should work (high level)
1. <step 1>
2. <step 2>
3. ...
### Affected areas
- <list of codebase areas, modules, files likely affected>
## 📎 Attachments & References
- <any image URLs, mockup links, or external references from the issue>
## 🔗 Related Ideas
- <links to related \_ideia/ files if any overlap found>
```
#### 1.4b — If the idea file ALREADY exists, update it:
- Append new comments from the issue to the **Community Discussion** section.
- Update the **Refined Feature Description** if new information changes the understanding.
- Add any new **Related Ideas** cross-references found.
- **Do NOT overwrite** existing content — append and enrich it.
### 1.5 Cross-Reference & Deduplication
After processing all issues:
- Scan all `_ideia/*.md` files for overlapping features.
- If two features are substantially the same, add `🔗 Related Ideas` cross-references to both.
- If one is a strict subset of another, note it in the smaller file: `> ℹ️ This feature is a subset of #<OTHER_NUMBER>. Consider implementing together.`
# ❌ NOT FIT & 🔁 ALREADY EXISTS — move idea files only
mv _ideia/<NUMBER>-*.md _ideia/notfit/
```
No files should remain in `_ideia/` root after this step (except subdirectories).
### 2.5.3 Post GitHub Comments by Category
**Each category has a specific comment template and action:**
---
#### For 🔁 ALREADY EXISTS — Comment + CLOSE issue
// turbo
The feature already exists in the system. Explain WHERE it is and HOW to use it.
```markdown
Hi @<author>! Thanks for the suggestion! 🙏
Great news — this functionality **already exists** in OmniRoute:
**📍 Where to find it:** <exact dashboard path or settings location>
**🔧 How to use it:**
1. <step 1>
2. <step 2>
3. <step 3>
If you have any trouble finding or using it, feel free to ask in a Discussion. We're always happy to help!
Closing this as the feature is already available. 🎉
```
```bash
gh issue close <NUMBER> --repo <owner>/<repo> --comment "<comment above>"
```
---
#### For ⏭️ DEFER — Comment + CLOSE issue
// turbo
Thank the user, explain the idea was cataloged, and that we'll study it before implementing.
```markdown
Hi @<author>! Thanks for this thoughtful feature request! 🙏
We really appreciate the detailed proposal. We've **cataloged your idea** and it's now part of our improvement backlog.
Due to the **significant architectural impact** of this feature, we'll need to conduct thorough use-case studies and architectural analysis before we start development. This ensures we build it right and don't introduce regressions.
**What happens next:**
- Your idea is saved in our internal feature backlog
- We'll conduct architecture studies when this area is prioritized
- We'll notify you here when development begins
Thank you for contributing to OmniRoute's roadmap! Your input helps shape the product. 🚀
```
```bash
gh issue close <NUMBER> --repo <owner>/<repo> --comment "<comment above>"
```
---
#### For ❌ NOT FIT — Comment + CLOSE issue
// turbo
Politely explain why the feature doesn't fit the project scope.
```markdown
Hi @<author>! Thanks for the suggestion! 🙏
After careful analysis, we've determined that this feature **falls outside OmniRoute's core scope** as a proxy/router.
**Reason:** <explain why — e.g., "Telegram integration belongs in the application/orchestrator layer that consumes OmniRoute's API, not inside the router itself.">
**Alternative:** <suggest an alternative approach if possible>
We appreciate you thinking of ways to improve OmniRoute! If you'd like to discuss this further, feel free to open a Discussion. 🙏
```
```bash
gh issue close <NUMBER> --repo <owner>/<repo> --comment "<comment above>"
```
---
#### For ❓ NEEDS DETAIL — Comment (keep OPEN)
// turbo
Ask for the specific missing details needed.
```markdown
Hi @<author>! Thanks for the feature request — it's an interesting idea and we'd love to explore it further. 🙏
To move forward, we need a few more details:
1. <specific question 1>
2. <specific question 2>
3. <specific question 3>
If you know of any **open-source projects or repositories** that implement something similar, please share links — it would help us design the best solution.
Looking forward to your response! 🚀
```
---
#### For ✅ VIABLE — Comment (keep OPEN)
// turbo
Thank the user, confirm we've cataloged their idea, and explain it may be implemented in future versions.
```markdown
Hi @<author>! Thanks for the great feature suggestion! 🙏
We've analyzed your request and it aligns well with OmniRoute's roadmap. We've **cataloged this feature** and it's in our implementation backlog.
**Status:** 📋 Cataloged for future implementation
This feature may be included in upcoming releases. We'll **respond to this issue and tag you** as soon as implementation begins so you can test it.
Thank you for helping improve OmniRoute! 🚀
```
**⚠️ Do NOT close viable issues — they remain OPEN for tracking.**
5.**Update the plan** — Mark completed steps with `[x]` in the plan file
6.**Continue** — Move to the next feature (do NOT switch branches)
### 5.2 Respond to Authors (Update Viable Issues)
For each implemented feature, **close the issue with a final comment**:
````markdown
✅ **Implemented in `release/vX.Y.Z`!**
Hi @<author>! Great news — your feature request has been implemented! 🎉
**What was done:**
- <bullet list of what was built>
**How to try it:**
```bash
git fetch origin && git checkout release/vX.Y.Z
npm install && npm run dev
```
````
### Next steps:
This will be included in the upcoming **vX.Y.Z** release. Feel free to reopen if you spot any issues! 🚀
1. **Test it** — Please verify it works as you expected
2. **Want to improve it?** — You're welcome to contribute! Just:
```bash
git checkout feat/issue-<NUMBER>-<short-slug>
# Make your improvements
git add -A && git commit -m "improve: <your changes>"
git push origin feat/issue-<NUMBER>-<short-slug>
```
Then open a Pull Request from your branch to `main` 🎉
3. **Not quite right?** — Let us know in this issue what needs to change
````
Looking forward to your feedback! 🚀
```bash
gh issue close <NUMBER> --repo <owner>/<repo> --comment "<comment above>"
````
Then **DELETE the idea file** — it has served its purpose:
```bash
# ✅ Implemented files are DELETED (not moved)
rm _ideia/viable/<NUMBER>-<title>.md
rm _ideia/viable/<NUMBER>-<title>.requirements.md # if exists
```
#### For NEEDS MORE INFO:
// turbo
Post a comment asking for specific missing details needed to implement, e.g.:
- "Could you describe the exact behavior when X happens?"
- "Which API endpoints should be affected?"
- "Should this apply to all providers or only specific ones?"
> **Why delete?** `viable/` only holds features that still NEED to be done. Once implemented, the commit history and CHANGELOG are the source of truth. Keeping the file would be confusing.
Add the context of WHY you need each piece of information.
### 5.3 Finalize & Push
#### For NOT VIABLE:
// turbo
Post a polite comment explaining why the feature doesn't fit at this time:
- If the idea is decent but timing is wrong: "This is an interesting idea, but it doesn't align with our current priorities. Feel free to open a new issue with more details if you'd like us to reconsider."
- If fundamentally flawed: Explain the technical or architectural reasons why it won't work, suggest alternatives if possible.
- Close the issue after posting the comment.
After implementing all approved features:
### 5. Summary Report
Present a summary report to the user via `notify_user`:
1. **Update CHANGELOG.md** on the release branch with all new feature entries
2. Push the release branch: `git push origin release/vX.Y.Z`
3. Run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user)
| Issue | Title | Verdict | Branch / Action |
|---|---|---|---|
| #N | Title | ✅ Implemented | `feat/issue-N-slug` |
| #N | Title | ❓ Needs Info | Comment posted |
| #N | Title | ❌ Not Viable | Closed with explanation |
@@ -6,7 +6,9 @@ description: Fetch all open GitHub issues, analyze bugs, resolve what's possible
## Overview
This workflow fetches all open issues from the project's GitHub repository, classifies them, analyzes bugs, resolves what can be fixed, and triages issues with insufficient information. **It does NOT merge or release automatically** — it creates a PR and waits for user validation before merging.
This workflow fetches all open issues from the project's GitHub repository, classifies them, analyzes bugs, proposes a resolution plan, waits for user validation, and ONLY THEN implements the fixes, commits, and closes the issues on the current release branch (`release/vX.Y.Z`). It does NOT merge or release automatically — the release branch is later merged via PR to main.
> **BRANCH RULE**: All work MUST happen on the current `release/vX.Y.Z` branch. Never create separate `fix/` branches. If no release branch exists yet, create one first using `/generate-release` Phase 1 steps 1–5.
## Steps
@@ -17,15 +19,45 @@ This workflow fetches all open issues from the project's GitHub repository, clas
- Run: `git -C <project_root> remote get-url origin` to extract the owner/repo
- Parse the owner and repo name from the URL
### 2. Fetch All Open Issues
### 2. Ensure Release Branch Exists
// turbo
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 100 --json number,title,labels,body,comments,createdAt,author`
- Parse the JSON output to get a list of all open issues
- Sort by oldest first (FIFO)
Before doing any work, ensure you are on the current release branch:
### 3. Classify Each Issue
```bash
# Check current branch
git branch --show-current
# If on main, determine next version and create the release branch
If already on a `release/vX.Y.Z` branch, continue working there.
### 3. Fetch All Open Issues
// turbo-all
**⚠️ CRITICAL**: The JSON output of `gh issue list` can be truncated by the tool, silently hiding issues. You MUST use the two-step approach below to guarantee **all** issues are fetched.
**Step 3a — Get Issue numbers only** (small output, never truncated):
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 500 --json number --jq '.[].number'`
- This outputs one issue number per line. Count them and confirm total.
**Step 3b — Fetch full metadata for each Issue** (one call per issue):
- You may batch these into parallel calls (up to 4 at a time).
- Sort by oldest first (FIFO).
### 4. Classify Each Issue
For each issue, determine its type:
@@ -36,85 +68,96 @@ For each issue, determine its type:
Focus ONLY on **Bugs** for resolution. Feature requests and questions should be skipped with a note in the final report.
### 4. Analyze Each Bug — For each bug issue:
### 5. Deep-Read Each Bug Issue (One-by-One Analysis)
#### 4a. Check Information Sufficiency
**IMPORTANT**: Read each bug issue thoroughly, one at a time, before moving to the next. This is NOT a batch process — each issue needs focused attention.
Verify the issue contains enough information to reproduce and fix:
#### 5a. Understand the Problem
For each bug issue, perform the full analysis:
1.**Read the entire body** — including Description, Steps to Reproduce, Expected/Actual Behavior, Error Logs, and Screenshots
2.**Read ALL comments** — including bot triage comments (Kilo, etc.) and owner/community responses. Pay attention to:
- Whether someone already responded with a fix
- Whether a community member confirmed the issue is resolved
- Whether the issue was marked as duplicate by a bot
3.**Identify the claimed error** — extract the exact error message, status code, and provider/model involved
#### 5b. Check Information Sufficiency
Verify the issue contains enough to act on:
- [ ] Clear description of the problem
- [ ] Steps to reproduce
- [ ]Error messages or logs
- [ ] Steps to reproduce OR error logs
- [ ]Provider/model/version information
- [ ] Expected vs actual behavior
#### 4b. If Information Is INSUFFICIENT
#### 5c. Determine Issue Disposition
Call the `/issue-triage` workflow (located at `~/.gemini/antigravity/global_workflows/issue-triage.md`):
// turbo
For each bug, classify into one of 5 actions:
- Post a comment asking for more details using `gh issue comment`
- Add `needs-info` label using `gh issue edit`
- Mark this issue as **DEFERRED** and move to the next one
| **✅ CLOSE — Already Fixed** | Owner responded with fix + no user follow-up, OR community confirmed fix | Close with comment citing which version fixed it |
| **✅ CLOSE — Duplicate** | Bot flagged >85% similarity + user provides no new info | Close referencing the original issue |
| **✅ CLOSE — Stale** | We requested logs/info > 7 days ago with no reply | Close thanking the user, invite to reopen if needed |
| **📝 RESPOND — Needs Info** | Issue is real but missing critical reproduction details | Comment asking for specifics per `/issue-triage` |
| **📝 RESPOND — User Config** | Error is caused by unsupported env (Node version, wrong model path, missing API enablement) | Comment explaining the user-side fix |
| **🔧 FIX — Code Change** | Root cause is confirmed in the codebase | Research, propose solution in report, wait for approval |
#### 4c. If Information Is SUFFICIENT
#### 5d. For "FIX — Code Change" Issues
Proceed with resolution:
Before coding, perform deep source analysis to formulate a plan:
1.**Create a fix branch** — `git checkout -b fix/issue-<NUMBER>-<short-description>`
2.**Research** — Search the codebase for files related to the issue
3.**Root Cause** — Identify the root cause by reading the relevant source files
4.**Implement Fix** — Apply the fix following existing code patterns and conventions
5.**Test** — Build the project and run tests to verify the fix
6.**Commit** — Commit with message format: `fix: <description> (#<issue_number>)`
1.**Search the codebase** — `grep_search` for error strings, relevant function names, affected files
2.**Search the web** — for upstream API changes, SDK updates, or breaking changes that explain the bug
3.**Read the full source file** — don't rely on grep snippets; understand the surrounding logic
4.**Verify the root cause** — confirm the bug is reproducible based on the code, not just a user misconfiguration
5.**Formulate a proposed solution** — detail the exact files and lines you will change and how you will solve it.
6.**DO NOT modify the codebase yet** — wait for user approval on your report first.
### 5. Generate Report & Wait for Validation
#### 5e. For "RESPOND" Issues
Present a summary report to the user via `notify_user` with `BlockedOnUser: true`:
| #N | Title | ❓ Needs Info | Triage comment posted |
| #N | Title | ⏭️ Skipped | Feature request / not a bug |
- Acknowledges the specific error they reported
- Explains the likely root cause
- Provides concrete steps to resolve (version upgrade, env var fix, model path correction)
- Asks for follow-up info if needed
> **⚠️ IMPORTANT**: Do NOT commit, close issues, or generate releases at this step.
> Wait for the user to review the changes and respond with **OK** before proceeding.
**Do NOT post generic template responses.** Every comment should reference the user's specific error messages and environment.
- If the user says **OK** or approves → Proceed to step 6
- If the user requests changes → Apply the requested adjustments first, then present the report again
- If the user rejects → Revert the changes and stop
### 6. Generate Report & Wait for Validation
### 6. Commit & Push Fix Branch (only after user approval)
Present a summary report to the user detailing your proposed actions. For any bugs that need fixing, explicitly explain your proposed solution (files to change and logic) and point out that it will be implemented on the release branch (`release/vX.Y.Z`) after approval.
After the user validates:
| Issue | Title | Status | Proposed Action / Version |
- If the user says **OK** or approves → Proceed to step 7
- If the user requests changes → Adjust the proposed solution and present the report again
- If the user rejects → Revert any accidental changes and stop
**This is a mandatory stop point.** Use `notify_user` with `BlockedOnUser: true`:
### 7. Implement Fixes, Run Tests & Commit (only after user approval)
- Inform the user that the PR was created and is **awaiting their verification**
- Include the PR number, URL, and a summary of what was changed
- **DO NOT merge, close issues, generate releases, or deploy until the user confirms**
After the user validates and gives the OK:
Wait for the user to respond:
1.**Implement the fixes** — modify the codebase according to the approved plan.
2.**Run tests** — `npm run test:all` (or the specific test file) to ensure 100% pass.
3.**Update CHANGELOG.md** with all new bug fix entries.
4.**Commit** each fix individually on the release branch with message format: `fix: <description> (#<issue_number>)`.
5.**Push** the release branch: `git push origin release/vX.Y.Z`.
6.**Close resolved issues immediately**. For each issue that was marked as Fixed, run:
`gh issue close <NUMBER> --repo <owner>/<repo> --comment "Thank you for reporting! This issue has been fixed and will be included in the next release (vX.Y.Z)."`
7. Likewise, close `Duplicate` issues referencing the original, close `Needs Info` if stale, and post the required comments.
8. If the project runs automatic releases or needs a PR, proceed to run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user).
- **User confirms** → Proceed to step 8
- **User requests changes** → Apply changes, push to the same branch, notify again
- **User rejects** → Close the PR and stop
### 8. Merge, Close Issues & Release (only after user confirms PR)
After the user confirms the PR:
1.**Merge** the PR: `gh pr merge <NUMBER> --merge --repo <owner>/<repo>` or via local merge
2.**Close** resolved issues with a comment: `gh issue close <NUMBER> --repo <owner>/<repo> --comment "Fixed in <commit_hash>. The fix will be included in the next release."`
3.**Switch to main**: `git checkout main && git pull`
4. Run the `/update-docs` workflow (at `~/.gemini/antigravity/global_workflows/update-docs.md`) to update CHANGELOG and README
5. Run the `/generate-release` workflow (at `.agents/workflows/generate-release.md`) to bump version, tag, and publish
6. Deploy to local VPS: `ssh root@192.168.0.15 "npm install -g omniroute@<VERSION> && pm2 restart omniroute"`
If NO fixes were committed, skip this step and just present the report.
If NO fixes were committed, skip closing and source control steps and just conclude the workflow.
This workflow reads all open GitHub Discussions, generates a categorized summary, identifies which ones need a response, drafts and posts replies, and optionally creates issues from actionable feature requests. It follows the same flow used for Issues but adapted for the Discussions forum.
// turbo-all
## Steps
### 1. Identify the GitHub Repository
- Run: `git -C <project_root> remote get-url origin` to extract the owner/repo
- Parse the owner and repo name from the URL
### 2. Fetch All Open Discussions
- Use `read_url_content` to fetch `https://github.com/<owner>/<repo>/discussions`
- Parse the discussion list to get all discussion titles, IDs, authors, categories, and dates
- For each discussion, fetch the individual page to read the full content and all comments/replies
### 3. Summarize All Discussions
For each discussion, extract:
- **Title** and **#Number**
- **Author** (GitHub username)
- **Category** (Announcements, General, Ideas, Q&A, Show and tell)
- **Date** created
- **Summary** of the original post (1-2 sentences)
- **Comments count** and key participants
- **Your previous response** (if any)
- **Pending action** — whether a response or follow-up is needed
### 4. Present Summary Report to User
Present the full summary to the user organized by category, using a table:
@@ -6,7 +6,9 @@ description: Analyze open Pull Requests from the project's GitHub repository, ge
## Overview
This workflow fetches all open PRs from the project's GitHub repository, performs a critical analysis of each one, generates a detailed report, and waits for user approval before proceeding with implementation. **All improvements are committed on top of the PR branch** and the user must verify before merge.
This workflow fetches all open PRs from the project's GitHub repository, performs a critical analysis of each one, generates a detailed report, and waits for user approval before proceeding with implementation. **All improvements are committed on the current release branch** (`release/vX.Y.Z`).
> **BRANCH RULE**: PRs are ALWAYS merged into the current `release/vX.Y.Z` branch, NEVER directly into `main`. The release branch acts as a staging area — only after all PRs are integrated and tests pass does the release branch get merged into `main` via the `/generate-release` workflow.
## Steps
@@ -16,51 +18,118 @@ This workflow fetches all open PRs from the project's GitHub repository, perform
// turbo
- Run: `git -C <project_root> remote get-url origin` to extract the owner/repo
### 2. Fetch Open Pull Requests
### 2. Ensure Release Branch Exists
// turbo
Before doing any work, ensure you are on the current release branch:
```bash
# Check current branch
git branch --show-current
# If on main, determine next version and create the release branch
If already on a `release/vX.Y.Z` branch, continue working there.
### 3. Fetch Open Pull Requests
// turbo-all
**⚠️ CRITICAL**: The JSON output of `gh pr list` can be truncated by the tool, silently hiding PRs. You MUST use the two-step approach below to guarantee **all** PRs are fetched.
**Step 3a — Get PR numbers only** (small output, never truncated):
- Run: `gh pr list --repo <owner>/<repo> --state open --limit 500 --json number --jq '.[].number'`
- This outputs one PR number per line. Count them and confirm total.
**Step 3b — Fetch full metadata for each PR** (one call per PR):
- Navigate to `https://github.com/<owner>/<repo>/pulls` and scrape all open PRs
- For each open PR, collect:
- PR number, title, author, branch, number of commits, date
- PR description/body
- Files changed (diff)
- Existing review comments (from bots or humans)
### 3. Analyze Each PR — For each open PR, perform the following analysis:
**Verification**: Confirm the count of PRs analyzed matches the count from step 3a before proceeding.
#### 3a. Feature Assessment
### 3.5 Redirect PR Base Branches to Release Branch
// turbo-all
**⚠️ CRITICAL**: Contributors typically open PRs targeting `main`. Before analyzing or merging, redirect ALL open PRs to target the current release branch instead.
```bash
# Get the current release branch name
RELEASE_BRANCH=$(git branch --show-current)# e.g. release/v3.5.4
# For each open PR that targets main, change its base to the release branch
for PR_NUM in $(gh pr list --repo <owner>/<repo> --state open --json number,baseRefName --jq '.[] | select(.baseRefName == "main") | .number');do
- **Document gaps** — If missing layers are detected, list them as **IMPORTANT** issues in the report with concrete suggestions for what should be added
### 4. Generate Report — Create a markdown report for each PR including:
### 5. Generate Report — Create a markdown report for each PR including:
- **PR Summary** — What it does, files affected, commit count
- **Improvements/Benefits** — Numbered list with impact level (HIGH/MEDIUM/LOW)
@@ -84,62 +153,90 @@ Perform a **global impact assessment** to verify whether the PR changes are comp
- **Verdict** — Ready to merge? With mandatory vs optional fixes
- **Next Steps** — What will happen if approved
### 5. Present to User
### 6. Present to User
- Show the report via `notify_user` with `BlockedOnUser: true`
- Wait for user decision:
- **Approved** → Proceed to step 6
- **Approved** → Proceed to step 7
- **Approved with changes** → Implement the fixes and corrections before merging
- **Rejected** → Close the PR or leave a review comment
### 6. Implementation (if approved)
### 7. Pre-Merge Fixes & CI Green-Lighting (if approved)
- Checkout the PR branch: `gh pr checkout <NUMBER>`
- Implement any required fixes identified in the analysis
-If the Cross-Layer Analysis (3f) identified missing frontend/backend counterparts, implement them
- **Commit improvements on top of the PR branch** with descriptive commit messages
-Run the project's test suite to verify nothing breaks
> **⚠️ Fixes and Conflict Resolutions MUST be pushed back to the PR branch before merging.** We want the PR itself to be green and fully valid before it integrates.
-**Sync latest fixes & Resolve Conflicts:** Merge the current `release` branch into the PR branch. If there are merge conflicts, you MUST resolve them inside the author's PR branch. NEVER resolve conflicts by closing their PR and doing the work in a separate branch, as this steals credit from the original author.
- **Implement improvements:** Apply the required fixes identified in the analysis directly on the PR branch (e.g., adding missing API routes, fixing SSRF, applying comments from other agents).
-**Pushing changes to PR branches:**
```bash
# Checkout the PR locally
gh pr checkout <NUMBER>
# Apply fixes, commit your changes
git commit -m "chore: apply review suggestions and missing layers"
# Attempt to push directly to the PR branch
git push
```
- **Fallback (For external forks without maintainer edit access):**
If `git push` fails because the PR comes from an external fork without write access, you MUST:
1. Create a new branch ending in `-fix` (e.g., `checkout -b fix-pr-<NUMBER>`).
2. Push your branch to the main repo (`git push origin fix-pr-<NUMBER>`).
3. Create a Pull Request targeting the contributor's repository and branch (use `gh pr create --repo <contributor-repo> --base <contributor-branch> --head diegosouzapw:fix-pr-<NUMBER>`).
4. Once they accept our PR into their branch, their original PR to our `main` will automatically update and become green.
- Run the project's test suite locally to verify nothing breaks:
// turbo
- Run: `npm test` or equivalent test command
- Build the project to verify compilation
// turbo
- Run: `npm run build` or equivalent build command
- Push the updated branch: `git push origin <branch-name>`
**This is a mandatory stop point.** Use `notify_user` with `BlockedOnUser: true`:
### 8. Merge into Release Branch (NEVER CLOSE!)
- Inform the user that the PR has been **improved and pushed**, and is **awaiting their verification**
- Include:
- PR number and URL
- Summary of improvements/fixes applied
- Build/test status
- List of files changed
- **DO NOT merge, generate releases, or deploy until the user confirms**
> **⚠️ CRITICAL**: NEVER use `gh pr close` for a PR whose idea or code was accepted. Closing a PR in a contributor's face after taking their idea—or closing it just because it had conflicts—is unacceptable.
> You MUST ALWAYS resolve conflicts and apply fixes on the author's PR branch, and then merge the PR using GitHub so the contributor gets the official "Merged" badge and proper credit on their profile.
Wait for the user to respond:
Even if the PR had severe conflicts or required significant architectural adjustments, you MUST:
- **User confirms** → Proceed to step 8
- **User requests more changes** → Apply changes, push to the same branch, notify again
- **User rejects** → Leave a review comment and stop
1. Resolve any conflicts and apply the fixes directly to their PR branch (as detailed in step 7).
2. Once the PR branch is green, conflict-free, and correct, merge it into the release branch using the GitHub CLI.
### 8. Thank the Contributor
```bash
# Merge the PR (base is already set to release/vX.Y.Z from step 3.5)
- Post a **thank-you comment** on the PR via the GitHub API
In ALL cases:
- Post a **thank-you comment** on the PR via the GitHub API before or immediately after merging.
- The message should:
- Thank the author by name/username for their contribution
- Briefly mention what the PR accomplishes and any improvements applied
- Be friendly, professional, and encouraging
- Example: _"Thanks @author for this great contribution! 🎉 The [feature/fix] is now merged and will be part of the next release. We appreciate your effort!"_
- Thank the author by name/username for their contribution.
- Explain what was adjusted or improved (if we pushed fixes to their branch).
- Note it will be included in the upcoming release.
- Be friendly, professional, and encouraging.
- Example: _"Thanks @author for this great contribution! 🎉 We've added a few small adjustments to your branch to align with our latest architecture, and it's now officially merged into the release/vX.Y.Z branch. It will be part of the next release. We appreciate your effort!"_
### 9. Merge & Release (only after user confirms PR)
### 9. Sync Local Release Branch
After the user confirms the PR:
After merging PRs, sync the local release branch to include the new changes:
1.**Merge** the PR into main (local merge with `--no-ff` or via `gh pr merge`)
2.**Push** to main: `git push origin main`
3.**Clean up** the feature branch: `git branch -d <branch-name>`
4.**Update CHANGELOG.md** with the new feature/fix
5. Run the `/generate-release` workflow (at `.agents/workflows/generate-release.md`) to bump version, tag, and publish
6. Deploy to local VPS: `ssh root@192.168.0.15 "npm install -g omniroute@<VERSION> && pm2 restart omniroute"`
```bash
git fetch origin
git pull origin release/vX.Y.Z
```
### 10. Continue or Finalize
After processing all approved PRs:
- If more PRs remain, go back to step 7
- When all PRs are processed, **update CHANGELOG.md** on the release branch with all new entries
- Run **test coverage** to verify all metrics stay above 85%:
```bash
npm run test:coverage
```
- Fix any test regressions introduced by merged PRs
- Run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user)
- The `/generate-release` workflow handles the final merge from `release/vX.Y.Z` → `main`
Each contains: API_REFERENCE.md, ARCHITECTURE.md, CODEBASE_DOCUMENTATION.md, FEATURES.md, TROUBLESHOOTING.md, USER_GUIDE.md
```
**Sync approach for feature table updates:**
a. Identify which feature table rows were added to English README.md
b. For each translated README, find the corresponding anchor lines:
- **Routing section:** Find the `💬` (System Prompt) table row — the line before it is always the last routing feature. Insert new routing features before System Prompt.
- **Resilience section:** Find the `📊` Rate Limits table row (the one in lines 590-600, NOT the quota tracking one in lines 560-570). Insert new resilience features after it.
c. The new feature entries can stay in English for technical features, matching the pattern used in the existing translations.
d. Use `sed` or similar tool to batch-insert across all 29 translated READMEs.
**Verification:**
```bash
# Verify all READMEs have the new features
grep -l "NEW_FEATURE_NAME" README.*.md | wc -l
# Should return 30 (all language versions)
```
**FEATURES.md sync:**
```bash
# Update Settings description in all docs/i18n/*/FEATURES.md
for dir in docs/i18n/*/;do
# Update the Settings section description to mention new features
description: Bump version, auto-generate CHANGELOG from git commits, update all versioned files, and refresh root + docs/ documentation to reflect the current project state
---
# Version Bump Workflow
Automatically bump the project version, generate CHANGELOG entries from git history since the last tag, update every file that references the version, and refresh project documentation to reflect the current state.
> **VERSION RULE: Always use PATCH bumps (3.x.y → 3.x.y+1)**
> NEVER use `npm version minor` or `npm version major`.
> Always use: `npm version patch --no-git-tag-version`
> The threshold rule: when `y` reaches 10, bump to `3.(x+1).0` — e.g. `3.4.10` → `3.5.0`.
If the version was ALREADY bumped (e.g. you are on a release branch and package.json already has the new version), **skip the npm version bump** and use the existing version.
For each category with entries, create a markdown section with descriptive bullet points. Use the commit messages but rewrite them to be human-readable and descriptive (not raw commit messages).
**If a commit references a PR number** (e.g. `#880`, `PR #885`), include it in the description.
### 6. Update CHANGELOG.md
Replace the `## [Unreleased]` section content with the generated entries, then add the new versioned section:
```markdown
## [Unreleased]
---
## [NEW_VERSION] — YYYY-MM-DD
### ✨ New Features
- **Feature name:** Description (#PR)
### 🐛 Bug Fixes
- **Fix name:** Description (#PR)
### 🛠️ Maintenance
- **Item:** Description
---
## [PREVIOUS_VERSION] — YYYY-MM-DD
...
```
The date must be today's date in `YYYY-MM-DD` format.
---
## Phase 3: Sync Version Across All Files
### 7. Update workspace package.json files and openapi.yaml
sed -i "s/\*\*Current version:\*\* $OLD_VERSION_PATTERN/**Current version:** $VERSION/" llm.txt
# Update "Key Features (vX.Y.Z)" header
sed -i "s/## Key Features (v$OLD_VERSION_PATTERN)/## Key Features (v$VERSION)/" llm.txt
echo"✓ llm.txt → $VERSION"
```
### 9. Regenerate lock file
// turbo
```bash
cd /home/diegosouzapw/dev/proxys/9router
npm install
echo"✓ Lock file regenerated"
```
---
## Phase 4: Update Root Documentation
Based on the CHANGELOG entries generated in Phase 2, review and update these root-level files if relevant changes warrant updates:
### 10. Review and update root documentation files
For each file below, read the current content and determine if the CHANGELOG entries require any updates. Only modify files where substantive changes have occurred:
- **README.md**: Update provider count, test count, feature highlights table, badges if any numbers changed. If a new provider was added, add it to the provider table. If a major feature was added, add it to the features section.
- **AGENTS.md**: If new architecture components (handlers, executors, services, DB modules) were added, update the Architecture section. If new commands were added, update the Build/Test table.
- **SECURITY.md**: Add new vulnerability fixes or security improvements to the relevant section.
- **llm.txt**: Update provider count, feature list, version references.
### 11. Review and update docs/ files (excluding i18n/)
For each file in `docs/` (excluding `docs/i18n/`), review if CHANGELOG changes affect it:
| `docs/VM_DEPLOYMENT_GUIDE.md` | Deployment changes, new env vars |
| `docs/TROUBLESHOOTING.md` | New known issues, resolved problems |
| `docs/AUTO-COMBO.md` | Routing changes, new strategies |
| `docs/CODEBASE_DOCUMENTATION.md` | New files, architectural changes |
| `docs/RELEASE_CHECKLIST.md` | Process changes |
| `docs/COVERAGE_PLAN.md` | Test changes |
| `docs/openapi.yaml` | Already updated in step 7 |
**Only update files where the CHANGELOG entries directly affect the documented content.** Do NOT update files just to bump a version number — only when the documented behavior, features, or architecture has actually changed.
description:"Which tool are you using OmniRoute with?"
placeholder:"e.g. Claude Code, Cursor, Roo Code, OpenClaw, Gemini CLI, cURL"
validations:
required:false
- type:textarea
id:description
attributes:
label:Description
description:"A clear description of what the bug is."
validations:
required:true
- type:textarea
id:steps
attributes:
label:Steps to Reproduce
description:"Step-by-step instructions to reproduce the behavior."
placeholder:|
1. Go to '...'
2. Click on '...'
3. See error
validations:
required:true
- type:textarea
id:expected
attributes:
label:Expected Behavior
description:"What did you expect to happen?"
validations:
required:true
- type:textarea
id:actual
attributes:
label:Actual Behavior
description:"What actually happened?"
validations:
required:true
- type:dropdown
id:test-impact
attributes:
label:Test Impact
description:"What automated test coverage should exist for this bug?"
options:
- Needs a new unit test
- Needs a new integration test
- Needs a new e2e test
- Existing automated test already fails
- Unsure
validations:
required:true
- type:textarea
id:logs
attributes:
label:Error Logs / Output
description:"Paste any relevant error messages, logs, or terminal output. This will be automatically formatted as code."
render:shell
validations:
required:false
- type:textarea
id:screenshots
attributes:
label:Screenshots
description:"If applicable, add screenshots to help explain the problem. Please also include the text of any error messages above — screenshots alone are not searchable."
validations:
required:false
- type:textarea
id:additional
attributes:
label:Additional Context
description:"Any other context about the problem (e.g. proxy config, number of accounts, network setup)."
validations:
required:false
- type:textarea
id:validation-plan
attributes:
label:Validation Plan
description:"Which commands or tests should prove this bug is fixed?"
- Treat `npm run test:coverage` as a required gate for PR work.
- The repository minimum is `60%` for statements, lines, functions, and branches.
- If a PR changes production code in `src/`, `open-sse/`, `electron/`, or `bin/`, it must include automated tests in the same PR.
- When reviewing or updating a PR, if the report shows missing tests or coverage below `60%`, do not stop after reporting the problem. Add or update tests in the PR first, rerun the coverage gate, and only then ask for confirmation.
- Prefer the smallest test layer that proves the behavior:
- unit tests first
- integration tests when multiple modules or DB state are involved
- e2e only when the behavior is truly UI or workflow-dependent
- For bug issues, try to encode the reproduction as an automated test before or alongside the fix.
| Single file | `node --import tsx/esm --test tests/unit/file.test.mjs` |
| Vitest (MCP, autoCombo) | `npm run test:vitest` |
| E2E (Playwright) | `npm run test:e2e` |
| Protocol E2E (MCP+A2A) | `npm run test:protocols:e2e` |
| Ecosystem | `npm run test:ecosystem` |
| Coverage gate | `npm run test:coverage` (60% min all metrics) |
| Coverage report | `npm run coverage:report` |
**PR rule**: If you change production code in `src/`, `open-sse/`, `electron/`, or `bin/`,
you must include or update tests in the same PR.
**Test layer preference**: unit first → integration (multi-module or DB state) → e2e (UI/workflow only). Encode bug reproductions as automated tests before or alongside the fix.
Releases are managed via the `/generate-release` workflow. When a new GitHub Release is created, the package is **automatically published to npm** via GitHub Actions.
---
## Getting Help
- **Architecture**: See [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
- **API Reference**: See [`docs/API_REFERENCE.md`](docs/API_REFERENCE.md)
# Security and Cleanliness Rules for AI Assistants
## 1. File Placement & Organization
- **Test Files**: ALL unit tests, integration tests, ecosystem tests, or Vitest files MUST strictly be placed within the `tests/` directory (e.g., `tests/unit/`, `tests/integration/`). NEVER create test files in the project root (`/`).
- **Scripts and Utilities**: ALL maintenance, debugging, generation, or experimental scripts (`.cjs`, `.mjs`, `.js`, `.ts`) MUST be placed strictly inside the `scripts/` directory or `scripts/scratch/` for temporary one-offs. NEVER dump loose scripts in the project root (`/`).
- CI/CD files and ignore definitions (`.gitignore`, `.dockerignore`)
When creating _any_ validation tests or one-off logic scripts, default to using `scripts/scratch/` or the `tests/unit/` directories according to your goals. Do not pollute the `/` root context.
- Control route: `src/app/api/sync/cloud/route.ts`
## Request Lifecycle (`/v1/chat/completions`)
@@ -335,7 +402,7 @@ flowchart TD
Q -- No --> R[Return all unavailable]
```
Fallback decisions are driven by `open-sse/services/accountFallback.ts` using status codes and error-message heuristics.
Fallback decisions are driven by `open-sse/services/accountFallback.ts` using status codes and error-message heuristics. Combo routing adds one extra guard: provider-scoped 400s such as upstream content-block and role-validation failures are treated as model-local failures so later combo targets can still run.
## OAuth Onboarding and Token Refresh Lifecycle
@@ -567,6 +634,10 @@ flowchart LR
- `src/app/api/settings/system-prompt`: global system prompt (GET/PUT)
- `src/app/api/sessions`: active session listing (GET)
- `src/app/api/rate-limits`: per-account rate limit status (GET)
- `src/app/api/v1/ws`: WebSocket upgrade handler for OpenAI-compatible WS clients
### Routing and Execution Core
@@ -591,15 +662,22 @@ flowchart LR
Each provider has a specialized executor extending `BaseExecutor` (in `open-sse/executors/base.ts`), which provides URL building, header construction, retry with exponential backoff, credential refresh hooks, and the `execute()` orchestration method.
| `GET/POST/DELETE /api/provider-models` | Provider Models | Provider model metadata backing custom and managed available models |
## Bypass Handler
The bypass handler (`open-sse/utils/bypassHandler.ts`) intercepts known "throwaway" requests from Claude CLI — warmup pings, title extractions, and token counts — and returns a **fake response** without consuming upstream provider tokens. This is triggered only when `User-Agent` contains `claude-cli`.
## Request Logger Pipeline
## Request Logging and Artifacts
The request logger (`open-sse/utils/requestLogger.ts`) provides a 7-stage debug logging pipeline, disabled by default, enabled via `ENABLE_REQUEST_LOGS=true`:
The older file-based request logger (`open-sse/utils/requestLogger.ts`) is retained only for
legacy compatibility. The current runtime contract uses:
@@ -771,7 +876,10 @@ Environment variables actively used by code:
5. The `open-sse/` directory is published as the `@omniroute/open-sse`**npm workspace package**. Source code imports it via `@omniroute/open-sse/...` (resolved by Next.js `transpilePackages`). File paths in this document still use the directory name `open-sse/` for consistency.
6. Charts in the dashboard use **Recharts** (SVG-based) for accessible, interactive analytics visualizations (model usage bar charts, provider breakdown tables with success rates).
7. E2E tests use **Playwright** (`tests/e2e/`), run via `npm run test:e2e`. Unit tests use **Node.js test runner** (`tests/unit/`), run via `npm run test:unit`. Source code under `src/` is **TypeScript** (`.ts`/`.tsx`); the `open-sse/` workspace remains JavaScript (`.js`).
8. Settings page is organized into 5 tabs: Security, Routing (6 global strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized), Resilience (editable rate limits, circuit breaker, policies), AI (thinking budget, system prompt, prompt cache), Advanced (proxy).
8. Settings page is organized into 5 tabs: Security, Routing (6 global strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized), Resilience (editable rate limits, circuit breaker, policies, **Context Relay** handoff config), AI (thinking budget, system prompt, prompt cache), Advanced (proxy).
9. **Context Relay** strategy (`context-relay`) is split across two layers: `combo.ts` decides if a handoff should be generated, `chat.ts` injects the handoff after account resolution. Handoff data lives in `context_handoffs` SQLite table. This split is intentional because only `chat.ts` knows whether the actual account changed.
10. **Proxy enforcement** is now comprehensive: `tokenHealthCheck.ts` resolves proxy per connection, `/api/providers/validate` uses `runWithProxyContext`, and `proxyFetch.ts` uses `undici.fetch()` to maintain dispatcher compatibility on Node 22.
11. **Node.js runtime policy detection**: `/api/settings/require-login` returns `nodeVersion` and `nodeCompatible` fields. The login page renders a warning banner when the runtime falls outside the supported secure Node.js lines.
> A comprehensive, beginner-friendly guide to the **omniroute** multi-provider AI proxy router.
@@ -267,7 +267,7 @@ Business logic that supports the handlers and executors.
| `provider.ts` | **Format detection** (`detectFormat`): analyzes request body structure to identify Claude/OpenAI/Gemini/Antigravity/Responses formats (includes `max_tokens` heuristic for Claude). Also: URL building, header building, thinking config normalization. Supports `openai-compatible-*` and `anthropic-compatible-*` dynamic providers. |
| `model.ts` | Model string parsing (`claude/model-name` → `{provider: "claude", model: "model-name"}`), alias resolution with collision detection, input sanitization (rejects path traversal/control chars), and model info resolution with async alias getter support. |
| `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, iFlow, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. |
| `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, Qoder, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. |
| `combo.ts` | **Combo models**: chains of fallback models. If model A fails with a fallback-eligible error, try model B, then C, etc. Returns actual upstream status codes. |
| `usage.ts` | Fetches quota/usage data from provider APIs (GitHub Copilot quotas, Antigravity model quotas, Codex rate limits, Kiro usage breakdowns, Claude settings). |
| `accountSelector.ts` | Smart account selection with scoring algorithm: considers priority, health status, round-robin position, and cooldown state to pick the optimal account for each request. |
| `usageTracking.ts` | Token usage extraction from any format (Claude/OpenAI/Gemini/Responses), estimation with separate tool/message char-per-token ratios, buffer addition (2000 tokens safety margin), format-specific field filtering, console logging with ANSI colors. |
| `requestLogger.ts` | File-based request logging (opt-in via `ENABLE_REQUEST_LOGS=true`). Creates session folders with numbered files: `1_req_client.json` → `7_res_client.txt`. All I/O is async (fire-and-forget). Masks sensitive headers. |
| `requestLogger.ts` | Legacy file-based request logging helper kept for compatibility. Current deployments should prefer `APP_LOG_TO_FILE` for application logs and the call log pipeline for persisted request artifacts. |
| `bypassHandler.ts` | Intercepts specific patterns from Claude CLI (title extraction, warmup, count) and returns fake responses without calling any provider. Supports both streaming and non-streaming. Intentionally limited to Claude CLI scope. |
| `networkProxy.ts` | Resolves outbound proxy URL for a given provider with precedence: provider-specific config → global config → environment variables (`HTTPS_PROXY`/`HTTP_PROXY`/`ALL_PROXY`). Supports `NO_PROXY` exclusions. Caches config for 30s. |
@@ -539,7 +539,7 @@ A 2000-token buffer is added to reported usage to prevent clients from hitting c
| Kiro (AWS) | AWS SSO OIDC or Social | Kiro | Binary EventStream parsing |
| Legacy | Old `npm run test:coverage` | 79.42% | 75.15% | 67.94% | Inflated: counts test files and excludes `open-sse` |
| Diagnostic | Source-only, excluding tests and excluding `open-sse` | 68.16% | 63.55% | 64.06% | Useful only to isolate `src/**` |
| Recommended baseline | Source-only, excluding tests and including `open-sse` | 56.95% | 66.05% | 57.80% | This is the project-wide baseline to improve |
The recommended baseline is the number to optimize against.
## Rules
- Coverage targets apply to source files, not to `tests/**`.
- `open-sse/**` is part of the product and must remain in scope.
- New code should not reduce coverage in touched areas.
- Prefer testing behavior and branch outcomes over implementation details.
- Prefer temp SQLite databases and small fixtures over broad mocks for `src/lib/db/**`.
## Current command set
- `npm run test:coverage`
- Main source coverage gate for the unit test suite
- Generates `text-summary`, `html`, `json-summary`, and `lcov`
- `npm run coverage:report`
- Detailed file-by-file report from the latest run
| Phase 7 | 90% statements / lines | Final sweep, gap closure, strict ratchet |
Branches and functions should ratchet upward with each phase, but the primary hard target is statements / lines.
## Priority hotspots
These files or areas offer the best return for the next phases:
1. `open-sse/handlers`
- `chatCore.ts` at 7.57%
- Overall directory at 29.07%
2. `open-sse/translator/request`
- Overall directory at 36.39%
- Many translators are still near single-digit coverage
3. `open-sse/translator/response`
- Overall directory at 8.07%
4. `open-sse/executors`
- Overall directory at 36.62%
5. `src/lib/db`
- `models.ts` at 20.66%
- `registeredKeys.ts` at 34.46%
- `modelComboMappings.ts` at 36.25%
- `settings.ts` at 46.40%
- `webhooks.ts` at 33.33%
6. `src/lib/usage`
- `usageHistory.ts` at 21.12%
- `usageStats.ts` at 9.56%
- `costCalculator.ts` at 30.00%
7. `src/lib/providers`
- `validation.ts` at 41.16%
8. Low-risk utility and API files for early gains
- `src/shared/utils/upstreamError.ts`
- `src/shared/utils/apiAuth.ts`
- `src/lib/api/errorResponse.ts`
- `src/app/api/settings/require-login/route.ts`
- `src/app/api/providers/[id]/models/route.ts`
## Execution checklist
### Phase 1: 56.95% -> 60%
- [x] Fix coverage metric so it reflects source code instead of test files
- [x] Keep a legacy coverage script for comparison
- [x] Record the baseline and hotspots in-repo
- [ ] Add focused tests for low-risk utilities:
- `src/shared/utils/upstreamError.ts`
- `src/shared/utils/fetchTimeout.ts`
- `src/lib/api/errorResponse.ts`
- `src/shared/utils/apiAuth.ts`
- `src/lib/display/names.ts`
- [ ] Add route tests for:
- `src/app/api/settings/require-login/route.ts`
- `src/app/api/providers/[id]/models/route.ts`
### Phase 2: 60% -> 65%
- [ ] Add DB-backed tests for:
- `src/lib/db/modelComboMappings.ts`
- `src/lib/db/settings.ts`
- `src/lib/db/registeredKeys.ts`
- [ ] Cover branch behavior in:
- `src/lib/providers/validation.ts`
- `src/app/api/v1/embeddings/route.ts`
- `src/app/api/v1/moderations/route.ts`
### Phase 3: 65% -> 70%
- [ ] Add usage analytics tests for:
- `src/lib/usage/usageHistory.ts`
- `src/lib/usage/usageStats.ts`
- `src/lib/usage/costCalculator.ts`
- [ ] Expand route coverage for proxy management and settings branches
### Phase 4: 70% -> 75%
- [ ] Cover translator helpers and central translation paths:
- `open-sse/translator/index.ts`
- `open-sse/translator/helpers/*`
- `open-sse/translator/request/*`
- `open-sse/translator/response/*`
### Phase 5: 75% -> 80%
- [ ] Add handler-level tests for:
- `open-sse/handlers/chatCore.ts`
- `open-sse/handlers/responsesHandler.js`
- `open-sse/handlers/imageGeneration.js`
- `open-sse/handlers/embeddings.js`
- [ ] Add executor branch coverage for provider-specific auth, retries, and endpoint overrides
### Phase 6: 80% -> 85%
- [ ] Merge more edge-case suites into the main coverage path
- [ ] Increase function coverage for DB modules with weak constructor/helper coverage
- [ ] Close branch gaps in `settings.ts`, `registeredKeys.ts`, `validation.ts`, and translator helpers
### Phase 7: 85% -> 90%
- [ ] Treat the remaining low-coverage files as blockers
- [ ] Add regression tests for every uncovered production bug fixed during the push to 90%
- [ ] Raise the coverage gate in CI only after the local baseline is stable for at least two consecutive runs
## Ratchet policy
Update `npm run test:coverage` thresholds only after the project actually exceeds the next milestone with a comfortable buffer.
Recommended ratchet sequence:
1. 55/60/55
2. 60/62/58
3. 65/64/62
4. 70/66/66
5. 75/70/72
6. 80/75/78
7. 85/80/84
8. 90/85/88
Order is `statements-lines / branches / functions`.
## Known gap
The current coverage command measures the main Node unit suite and includes source reached from it, including `open-sse`. It does not yet merge Vitest coverage into a single unified report. That merge is worth doing later, but it is not a blocker for starting the 60% -> 80% climb.
| `DATA_DIR` | `~/.omniroute/` | `src/lib/db/core.ts` | Root directory for SQLite DB, backups, and data files. Override for Docker volumes or custom paths. |
| `STORAGE_ENCRYPTION_KEY` | _(empty = disabled)_ | `src/lib/db/encryption.ts` | AES key for full SQLite database encryption at rest. Generate with `openssl rand -hex 32`. |
| `STORAGE_ENCRYPTION_KEY_VERSION` | `v1` | `scripts/bootstrap-env.mjs`, `electron/main.js` | Version label for the encryption key. Increment when performing key rotation to support decryption of old backups. |
| `DISABLE_SQLITE_AUTO_BACKUP` | `false` | `src/lib/db/backup.ts` | When `true`, skips the automatic database backup that runs before migrations on every startup. |
| `OMNIROUTE_CRYPT_KEY` | _(unset)_ | `src/lib/db/encryption.ts` | **Legacy alias** for `STORAGE_ENCRYPTION_KEY`. Accepted as a fallback when the primary variable is absent. |
| `OMNIROUTE_API_KEY_BASE64` | _(unset)_ | `src/lib/db/encryption.ts` | **Legacy alias** (Base64-encoded form) accepted as a fallback. Decoded automatically before use. |
| `MACHINE_ID_SALT` | `endpoint-proxy-salt` | `src/lib/auth` | Salt combined with hardware identifiers for machine fingerprinting. Change per-deployment for isolation. |
| `AUTH_COOKIE_SECURE` | `false` | `src/lib/auth` | Sets the `Secure` flag on session cookies. **Must be `true`** when running behind HTTPS. |
| `REQUIRE_API_KEY` | `false` | API middleware | When `true`, all `/v1/*` proxy requests must include a valid API key. |
| `ALLOW_API_KEY_REVEAL` | `false` | Dashboard providers page | Allows revealing full API key values in the Dashboard UI. Security risk on shared instances. |
| `NO_LOG_API_KEY_IDS` | _(empty)_ | `src/lib/compliance/index.ts` | Comma-separated API key IDs that bypass request logging (GDPR compliance). |
| `MAX_BODY_SIZE_BYTES` | `10485760` (10 MB) | `src/shared/middleware/bodySizeGuard.ts` | Maximum allowed request body size. Rejects payloads exceeding this limit. |
| `CORS_ORIGIN` | `*` | Next.js middleware | CORS `Access-Control-Allow-Origin` value. Restrict for production. |
| `OUTBOUND_SSRF_GUARD_ENABLED` | `true` | `src/shared/network/outboundUrlGuard.ts` | Block provider calls targeting private/loopback/link-local IP ranges. Disable only in isolated test envs. |
### Hardening Checklist
```bash
# Production security minimum:
AUTH_COOKIE_SECURE=true # Requires HTTPS
REQUIRE_API_KEY=true # Authenticate all proxy calls
ALLOW_API_KEY_REVEAL=false # Never expose keys in UI
CORS_ORIGIN=https://your.domain.com
MAX_BODY_SIZE_BYTES=5242880 # 5 MB limit
```
---
## 5. Input Sanitization & PII Protection
OmniRoute provides a two-layer defense: request-side injection scanning and response-side PII stripping.
> When deploying behind a reverse proxy (nginx, Caddy), `NEXT_PUBLIC_BASE_URL`**must** be set to your public URL (e.g., `https://omniroute.example.com`). Without this, OAuth callbacks will fail because the redirect_uri won't match.
---
## 8. Outbound Proxy
Route upstream LLM provider calls through an HTTP or SOCKS5 proxy for egress control, geo-routing, or IP masking.
| `CURSOR_USER_AGENT` | `connect-es/1.6.1` | When Cursor updates |
| `GEMINI_CLI_USER_AGENT` | `google-api-nodejs-client/9.15.1` | When Google API client updates |
> [!TIP]
> You can add User-Agent overrides for **any** provider using the pattern `{PROVIDER_ID}_USER_AGENT`. The executor dynamically constructs the env var name.
---
## 13. CLI Fingerprint Compatibility
When enabled, OmniRoute reorders HTTP headers and JSON body fields to match the exact signature of official CLI tools. This reduces the risk of account flagging while preserving your proxy IP.
| `CLI_COMPAT_ALL=1` | Enable fingerprint compatibility for **all** providers at once. |
> [!NOTE]
> This feature works alongside the User-Agent overrides (§12). The fingerprint system handles header ordering and body field ordering, while User-Agent overrides handle the specific UA string. Both can be enabled independently.
---
## 14. API Key Providers
API keys for providers that use direct authentication. **Preferred setup:** Dashboard → Providers → Add API Key.
Setting via environment variables is an alternative for Docker or headless deployments.
Recognized pattern: `{PROVIDER_ID}_API_KEY`
| Variable | Provider |
| -------------------- | ------------------- |
| `DEEPSEEK_API_KEY` | DeepSeek |
| `GROQ_API_KEY` | Groq |
| `XAI_API_KEY` | xAI (Grok) |
| `MISTRAL_API_KEY` | Mistral AI |
| `PERPLEXITY_API_KEY` | Perplexity |
| `TOGETHER_API_KEY` | Together AI |
| `FIREWORKS_API_KEY` | Fireworks AI |
| `CEREBRAS_API_KEY` | Cerebras |
| `COHERE_API_KEY` | Cohere |
| `NVIDIA_API_KEY` | NVIDIA NIM |
| `NEBIUS_API_KEY` | Nebius (embeddings) |
> [!TIP]
> Keys set via the Dashboard are stored encrypted in SQLite and take precedence over environment variables.
---
## 15. Timeout Settings
All values are in **milliseconds**. Centralized resolution in `src/shared/utils/runtimeTimeouts.ts`.
| `PROXY_HEALTH_CACHE_TTL_MS` | `30000` | `src/lib/proxyHealth.ts` | Health check result cache TTL. |
| `RATE_LIMIT_MAX_WAIT_MS` | `120000` (2 min) | `open-sse/services/rateLimitManager.ts` | Max time to wait on a 429 before failing the request. |
| `REQUEST_RETRY` | `2` | `src/sse/services/cooldownAwareRetry.ts` | Number of automatic retries on model-scoped cooldown responses before returning error to client. |
| `MAX_RETRY_INTERVAL_SEC` | `30` | `src/sse/services/cooldownAwareRetry.ts` | Max backoff interval (seconds) between cooldown retries. Capped by this value regardless of upstream `Retry-After`. |
---
## 22. Debugging
> [!CAUTION]
> These variables produce **verbose output** and may leak sensitive data. **Never enable in production.**
The following variables appeared in previous versions of `.env.example` but have **no runtime references** in the current codebase. They have been removed:
Create model routing combos with 6 strategies: priority, weighted, round-robin, random, least-used, and cost-optimized. Each combo chains multiple models with automatic fallback and includes quick templates and readiness checks.
Create model routing combos with 13 strategies: priority, weighted, round-robin, random, least-used, cost-optimized, strict-random, auto, fill-first, p2c, lkgp, context-optimized, and **context-relay**. Each combo chains multiple models with automatic fallback and includes quick templates and readiness checks.
Recent combo improvements:
- **Structured combo builder** — create each step by selecting provider, model, and exact account/connection
- **Repeated provider support** — reuse the same provider many times in one combo as long as the `(provider, model, connection)` tuple is unique
- **Combo target health** — analytics and health surfaces now distinguish individual combo targets/steps instead of collapsing everything into model strings
- **Composite tier ordering** — `defaultTier -> fallbackTier` now influences runtime execution/fallback order for top-level combo steps

@@ -32,7 +39,7 @@ Comprehensive usage analytics with token consumption, cost estimates, activity h
## 🏥 System Health
Real-time monitoring: uptime, memory, version, latency percentiles (p50/p95/p99), cache statistics, and provider circuit breaker states.
@@ -92,6 +99,68 @@ Dashboard for discovering and managing CLI agents. Shows a grid of 14 built-in a
---
## 🔗 Context Relay _(v3.5.5+)_
A combo strategy that preserves session continuity when account rotation happens mid-conversation. Before the active account is exhausted, OmniRoute generates a structured handoff summary in the background. After the next request resolves to a different account, the summary is injected as a system message so the new account continues with full context.
- **Max Messages For Summary** — How much recent history to condense
- **Summary Model** — Optional override model for generating the handoff summary
Currently supports Codex account rotation. See [Context Relay documentation](features/context-relay.md).
---
## 🛡️ Proxy Hardening _(v3.5.5+)_
Comprehensive proxy configuration enforcement across the entire request pipeline:
- **Token Health Check** — Background OAuth refresh now resolves proxy config per connection, preventing failures in proxy-required environments
- **API Key Validation** — Provider key validation (`POST /api/providers/validate`) routes through `runWithProxyContext`, honoring provider-level and global proxy settings
- **undici Dispatcher Fix** — Proxy dispatchers use undici's own fetch implementation instead of Node's built-in fetch, resolving `invalid onRequestStart method` errors on Node.js 22
- **Node.js Version Detection** — Login page proactively detects incompatible Node.js versions (24+) and displays a warning banner with instructions to use Node 22 LTS
---
## 📧 Email Privacy Masking _(v3.5.6+)_
OAuth account emails are now masked in the provider dashboard (e.g. `di*****@g****.com`) to prevent accidental exposure when sharing screenshots or recording demos. The full email address remains accessible via hover tooltip (`title` attribute).
---
## 👁️ Model Visibility Toggle _(v3.5.6+)_
The provider page model list now includes:
- **Real-time search/filter bar** — Quickly find specific models
- **Per-model visibility toggle** (👁 icon) — Hidden models are grayed out and excluded from the `/v1/models` catalog
- **Active-count badge** (`N/M active`) — Shows at a glance how many models are enabled vs total
---
## 🔧 OAuth Env Repair _(v3.6.1+)_
One-click "Repair env" action for OAuth providers that restores missing environment variables and fixes broken auth state. Accessible from `Dashboard → Providers → [OAuth Provider] → Repair env`. Automatically detects and repairs:
- Missing OAuth client credentials
- Corrupted env file entries
- Backup path sanitization
---
## 🗑️ Uninstall / Full Uninstall _(v3.6.2+)_
Clean removal scripts for all installation methods:
| `npm run uninstall` | Removes the system app but **keeps your DB and configurations** in `~/.omniroute`. |
| `npm run uninstall:full` | Removes the app AND permanently **erases all configurations, keys, and databases**. |
---
## 🖼️ Media _(v2.0.3+)_
Generate images, videos, and music from the dashboard. Supports OpenAI, xAI, Together, Hyperbolic, SD WebUI, ComfyUI, AnimateDiff, Stable Audio Open, and MusicGen.
@@ -108,7 +177,7 @@ Real-time request logging with filtering by provider, model, account, and API ke
## 🌐 API Endpoint
Your unified API endpoint with capability breakdown: Chat Completions, Responses API, Embeddings, Image Generation, Reranking, Audio Transcription, Text-to-Speech, Moderations, and registered API keys. Cloud proxy support for remote access.
Your unified API endpoint with capability breakdown: Chat Completions, Responses API, Embeddings, Image Generation, Reranking, Audio Transcription, Text-to-Speech, Moderations, and registered API keys. Cloudflare Quick Tunnel integration and cloud proxy support for remote access.
- Hardened Electron build packaging — symlinked `node_modules` in the standalone bundle is detected and rejected before packaging, preventing runtime dependency on the build machine (v2.5.5+)
- **Graceful shutdown** — Electron `before-quit` shuts down Next.js cleanly, preventing SQLite WAL database locks (v3.6.2+)
📖 See [`electron/README.md`](../electron/README.md) for full documentation.
---
## 🌐 V1 WebSocket Bridge _(v3.6.6+)_
OmniRoute now supports **OpenAI-compatible WebSocket clients** via the `/v1/ws` upgrade endpoint. The custom `scripts/v1-ws-bridge.mjs` server wraps Next.js and upgrades WS connections to full bidirectional streaming sessions. Authentication uses the same API key or session cookie as HTTP requests.
Key behaviours:
- WS upgrade validated by `src/lib/ws/handshake.ts` before the connection is established
- Streams terminated cleanly on session close or upstream error
- Works alongside the existing HTTP+SSE streaming path simultaneously
---
## 🔑 Sync Tokens & Config Bundle _(v3.6.6+)_
Multi-device and external operator access is now possible via **scoped sync tokens**:
- **`POST /api/sync/tokens`** — Issue a new sync token (scoped, with optional expiry)
- **`DELETE /api/sync/tokens/:id`** — Revoke a token
- **`GET /api/sync/bundle`** — Download a versioned, ETag-keyed JSON snapshot of all non-sensitive settings (passwords redacted)
The config bundle is built by `src/lib/sync/bundle.ts`. Consumers compare the `ETag` response header to detect changes without re-downloading the full payload.
---
## 🧠 GLM Thinking Preset _(v3.6.6+)_
**GLM Thinking (`glmt`)** is now a registered first-class provider: 65 536 max output tokens, 24 576 thinking budget, 900 s default timeout, Claude-compatible API format, and shared usage sync with the GLM family.
**Hybrid token counting** also lands in v3.6.6: when a Claude-compatible provider exposes `/messages/count_tokens`, OmniRoute calls it before large requests with graceful estimation fallback.
All provider validation and model discovery calls now go through a two-layer outbound guard:
1. **URL guard** (`src/shared/network/outboundUrlGuard.ts`) — Blocks private/loopback/link-local IP ranges before the socket is opened.
2. **Safe fetch wrapper** (`src/shared/network/safeOutboundFetch.ts`) — Applies the URL guard, normalises timeouts, and retries transient errors with exponential backoff.
Guard violations surface as HTTP 422 (`URL_GUARD_BLOCKED`) and are written to the compliance audit log via `providerAudit.ts`.
---
## 🔄 Cooldown-Aware Retries _(v3.6.6+)_
Chat requests now **automatically retry** when an upstream provider returns a model-scoped cooldown. Configurable via `REQUEST_RETRY` (default: 2) and `MAX_RETRY_INTERVAL_SEC` (default: 30 s). Rate-limit header learning improved across `x-ratelimit-reset-requests`, `x-ratelimit-reset-tokens`, and `Retry-After` — per-model cooldown state is visible in the Resilience dashboard.
---
## 📋 Compliance Audit v2 _(v3.6.6+)_
The audit log has been expanded with cursor-based pagination, request context enrichment (request ID, user agent, IP), structured auth events, provider CRUD events with diff context, and SSRF-blocked validation logging. New events emitted by `src/lib/compliance/providerAudit.ts`.
| `messages` | Translates missing keys in `src/i18n/messages/{locale}.json` from `en.json` |
| `readme` | Translates `README.md` into all locales as `README.{code}.md` in project root |
| `docs` | Translates `DOC_SOURCE_FILES` into `docs/i18n/{locale}/{docName}` |
| `all` | Runs all three modes |
**Features:**
- **Text protection**: Masks code blocks (```` ``` ````), inline code (`` ` ``), markdown links/images (`[text](url)`), HTML tags, tables, and ICU placeholders (`{count}`, `{value}`, `{total}`, etc.) before translation, then restores them
- **Chunked batching**: Joins multiple strings with `__OMNIROUTE_I18N_SEPARATOR__` delimiters to minimize API calls (max 1800 chars per request)
- **In-memory cache**: Avoids redundant API calls for repeated strings within a session
- **Retry logic**: Exponential backoff (up to 5 attempts with 300ms × attempt delay) for 429/5xx errors
- **Timeout**: 20 seconds per request
- **Skip existing**: If target file already exists, it is NOT overwritten
**Important behaviors:**
- `docs/i18n/README.md` is **regenerated** each run — it's an auto-generated index of all docs
- Root `README.{code}.md` files are only created if they don't exist (skips locales in `EXISTING_README_CODES`)
- Language bars (`🌐 **Languages:** ...`) are automatically inserted/updated in all translated docs
### i18n_autotranslate.py (LLM-based)
**Secondary translator** — uses any OpenAI-compatible LLM API (including OmniRoute itself) to translate existing `docs/i18n/` markdown files. Best for polishing or re-translating docs with better quality than Google Translate.
```bash
python3 scripts/i18n_autotranslate.py \
--api-url http://localhost:20128/v1 \
--api-key sk-your-key \
--model gpt-4o
```
**Features:**
- Scans `docs/i18n/` markdown files for English paragraphs
- Skips code blocks, tables, and already-translated content
- Sends paragraphs to LLM with technical translation system prompt
- Supports all 30 languages
## Validation & QA
### validate_translation.py
**Translation validator** — compares any locale JSON against `en.json` and reports issues.
- **Missing keys** — keys in `en.json` but not in locale file
- **Extra keys** — keys in locale file but not in `en.json`
- **Untranslated keys** — keys where locale value equals English source (excluding allowlist)
- **Placeholder mismatches** — ICU placeholders that don't match between source and translation
**Exit codes:**
| Code | Meaning |
|------|---------|
| 0 | OK |
| 1 | Generic error |
| 2 | Missing strings (hard error) |
| 3 | Untranslated warning (soft) |
**Environment:** Set `TRANSLATION_LANG=cs` or use `-l cs` flag.
### check_translations.py
**Code-to-JSON key checker** — scans `src/**/*.tsx` and `src/**/*.ts` for `useTranslations()` calls and verifies all referenced keys exist in `en.json`.
```bash
# Basic check
python3 scripts/check_translations.py
# Verbose output
python3 scripts/check_translations.py --verbose
# Auto-fix (adds missing keys to en.json)
python3 scripts/check_translations.py --fix
```
### generate-qa-checklist.mjs
**Static analysis QA** — scans Next.js page files for i18n risk metrics and generates a Markdown report.
```bash
node scripts/i18n/generate-qa-checklist.mjs
```
**Checks:**
- Fixed-width class usage (overflow risk)
- Directional left/right classes (RTL risk)
- Clipping-prone patterns
- Locale parity (missing/extra keys vs `en.json`)
- README language selector bars in priority locales (`es`, `fr`, `de`, `ja`, `ar`)
1. **Always edit `en.json` first** — it's the source of truth
2. **Run `generate-multilang.mjs messages`** to propagate new keys to all locales
3. **Review auto-translations** — Google Translate is a starting point, not final
4. **Validate before committing** — `python3 scripts/validate_translation.py quick -l <lang>`
5. **Update `untranslatable-keys.json`** if a key should remain in English
### Placeholder Safety
- ICU placeholders (`{count}`, `{value}`, `{total}`, `{seconds}`) must be preserved exactly
- Plural formats (`{count, plural, one {# model} other {# models}}`) must maintain structure
- The validator detects placeholder mismatches automatically
### Adding New Translation Keys in Code
```tsx
// Use namespaced keys
const t = useTranslations("settings");
t("cacheSettings"); // maps to settings.cacheSettings in JSON
// Run check_translations.py to verify keys exist
python3 scripts/check_translations.py --verbose
```
### RTL Considerations
- Arabic (`ar`) and Hebrew (`he`) are RTL locales
- Avoid hardcoded `left`/`right` CSS — use `start`/`end` logical properties
- Visual QA catches RTL layout mismatches via `run-visual-qa.mjs`
## Known Issues & History
### `in.json` → `hi.json` Fix
The generator originally used `code: "in"` (deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 `hi`. This created an orphaned `in.json` duplicate of `hi.json`. Fixed by changing `code: "in"` to `code: "hi"` in `generate-multilang.mjs` and removing the orphaned file.
### `docs/i18n/README.md` Is Auto-Generated
The `docs/i18n/README.md` file is completely regenerated by `generate-multilang.mjs docs`. Any manual edits will be lost. Use `docs/I18N.md` (this file) for hand-written documentation that should persist.
### External Untranslatable Keys List
The `untranslatable-keys.json` allowlist was moved from an inline Python set in `validate_translation.py` to an external JSON file for easier maintenance. The validator loads it at runtime.
### `generate-multilang.mjs` Hindi Code Fix
The generator originally used `code: "in"` (deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 `hi`. This was introduced in upstream commit `952b0b22c` by `diegosouzapw`. Fixed by changing `code: "in"` to `code: "hi"` in the `LOCALE_SPECS` array and removing the orphaned `in.json` file.
| First login not working | Set `INITIAL_PASSWORD` in `.env` (no hardcoded default) |
| Dashboard opens on wrong port | Set `PORT=20128` and `NEXT_PUBLIC_BASE_URL=http://localhost:20128` |
| No logs written to disk | Set `APP_LOG_TO_FILE=true` and verify call log capture is enabled |
| EACCES: permission denied | Set `DATA_DIR=/path/to/writable/dir` to override `~/.omniroute` |
| Routing strategy not saving | Update to v1.4.11+ (Zod schema fix for settings persistence) |
| Login crash / blank page | Check Node.js version — see [Node.js Compatibility](#nodejs-compatibility) below |
| `dlopen` / `slice is not valid mach-o file` (macOS) | Run `cd $(npm root -g)/omniroute/app && npm rebuild better-sqlite3 && omniroute` — see [macOS native module rebuild](#macos-native-module-rebuild) below |
| Proxy "fetch failed" | Ensure proxy config is set at the correct level — see [Proxy Issues](#proxy-issues) below |
---
## Node.js Compatibility
<a name="nodejs-compatibility"></a>
### Login page crashes or shows "Module self-registration" error
**Cause:** You are running a Node.js version outside OmniRoute's approved secure runtime floor. The most common case is running an older Node 20, 22, or 24 patch level that falls below the patched security floor OmniRoute requires.
**Symptoms:**
- Login page shows a blank screen or a server error
- Console shows `Error: Module did not self-register` or similar native binding errors
- The login page shows an **orange warning banner** with your Node version if the runtime is outside the supported secure policy
**Fix:**
1. Install a supported Node.js LTS release (recommended: Node.js 24.x):
```bash
nvm install 24
nvm use 24
```
2. Verify your version: `node --version` should show `v24.0.0` or newer on the 24.x LTS line
> **Supported secure versions:**`>=20.20.2 <21`, `>=22.22.2 <23`, or `>=24.0.0 <25`. Node.js 24.x LTS (Krypton) is fully supported.
### macOS: `dlopen` / "slice is not valid mach-o file"
<a name="macos-native-module-rebuild"></a>
**Cause:** After a global `npm install -g omniroute`, the `better-sqlite3` native binary inside the package may have been compiled for a different architecture or Node.js ABI than what is running locally. This is common on macOS (both Apple Silicon and Intel) when the pre-built binary does not match your environment.
**Symptoms:**
- Server fails immediately on startup with a `dlopen` error
- Error contains `slice is not valid mach-o file`
- Full example:
```
dlopen(/Users/<user>/.nvm/versions/node/v24.14.1/lib/node_modules/omniroute/app/node_modules/better-sqlite3/build/Release/better_sqlite3.node, 0x0001): tried: '...' (slice is not valid mach-o file)
```
**Fix — rebuild for your local environment (no Node.js downgrade required):**
```bash
cd $(npm root -g)/omniroute/app
npm rebuild better-sqlite3
omniroute
```
> **Note:** This recompiles the native binding against your local Node.js version and CPU architecture, resolving the binary mismatch. The officially supported range is **`>=20.20.2 <21`, `>=22.22.2 <23`, or `>=24.0.0 <25`** (`engines` field in `package.json`). Node.js 24.x LTS (Krypton) is fully supported with `better-sqlite3` v12.x.
---
## Proxy Issues
<a name="proxy-issues"></a>
### Provider validation shows "fetch failed"
**Cause:** The API key validation endpoint (`POST /api/providers/validate`) was previously bypassing proxy configuration, causing failures in environments that require proxy routing.
**Fix (v3.5.5+):** This is now fixed. Provider validation routes through `runWithProxyContext`, honoring provider-level and global proxy settings automatically.
### Token health check fails with "fetch failed"
**Cause:** Background OAuth token refresh was not resolving proxy configuration per connection.
**Fix (v3.5.5+):** The token health check scheduler now resolves proxy config per connection before attempting refresh. Update to v3.5.5+.
**Cause:** On Node.js 22, the undici@8 dispatcher is incompatible with Node's built-in `fetch()` implementation.
**Fix (v3.5.5+):** OmniRoute now uses undici's own `fetch()` function when a proxy dispatcher is active, ensuring consistent behavior. Update to v3.5.5+.
This guide covers how to cleanly remove OmniRoute from your system.
---
## Quick Uninstall (v3.6.2+)
OmniRoute provides two built-in scripts for clean removal:
### Keep Your Data
```bash
npm run uninstall
```
This removes the OmniRoute application but **preserves** your database, configurations, API keys, and provider settings in `~/.omniroute/`. Use this if you plan to reinstall later and want to keep your setup.
### Full Removal
```bash
npm run uninstall:full
```
This removes the application **and permanently erases** all data:
- Database (`storage.sqlite`)
- Provider configurations and API keys
- Backup files
- Log files
- All files in the `~/.omniroute/` directory
> ⚠️ **Warning:**`npm run uninstall:full` is irreversible. All your provider connections, combos, API keys, and usage history will be permanently deleted.
---
## Manual Uninstall
### NPM Global Install
```bash
# Remove the global package
npm uninstall -g omniroute
# (Optional) Remove data directory
rm -rf ~/.omniroute
```
### pnpm Global Install
```bash
pnpm uninstall -g omniroute
rm -rf ~/.omniroute
```
### Docker
```bash
# Stop and remove the container
docker stop omniroute
docker rm omniroute
# Remove the volume (deletes all data)
docker volume rm omniroute-data
# (Optional) Remove the image
docker rmi diegosouzapw/omniroute:latest
```
### Docker Compose
```bash
# Stop and remove containers
docker compose down
# Also remove volumes (deletes all data)
docker compose down -v
```
### Electron Desktop App
**Windows:**
- Open `Settings → Apps → OmniRoute → Uninstall`
- Or run the NSIS uninstaller from the install directory
**macOS:**
- Drag `OmniRoute.app` from `/Applications` to Trash
You can reorder combo cards directly in **Dashboard → Combos** by dragging the handle on each card. The order is stored in SQLite and restored on reload.
### Example 1: Maximize Subscription → Cheap Backup
```
@@ -228,7 +230,7 @@ Dashboard → Combos → Create New
| `npm run uninstall` | Removes the system app but **keeps your DB and configurations** in `~/.omniroute`. |
| `npm run uninstall:full` | Removes the app AND permanently **erases all configurations, keys, and databases**. |
> Note: To run these commands, navigate to the OmniRoute project folder (if you cloned it) and run them. Alternatively, if globally installed, you can simply run `npm uninstall -g omniroute`.
For host-integrated mode with CLI binaries, see the Docker section in the main docs.
### Void Linux (xbps-src)
Void Linux users can package and install OmniRoute natively using the `xbps-src` cross-compilation framework. This automates the Node.js standalone build along with the required `better-sqlite3` native bindings.
<details>
<summary><b>View xbps-src template</b></summary>
```bash
# Template file for 'omniroute'
pkgname=omniroute
version=3.2.4
revision=1
hostmakedepends="nodejs python3 make"
depends="openssl"
short_desc="Universal AI gateway with smart routing for multiple LLM providers"
@@ -494,6 +612,11 @@ curl -X POST http://localhost:20128/api/provider-models \
Or use Dashboard: **Providers → [Provider] → Custom Models**.
Notes:
- OpenRouter and OpenAI/Anthropic-compatible providers are managed from **Available Models** only. Manual add, import, and auto-sync all land in the same available-model list, so there is no separate Custom Models section for those providers.
- The **Custom Models** section is intended for providers that do not expose managed available-model imports.
### Dedicated Provider Routes
Route requests directly to a specific provider with model validation:
@@ -538,6 +661,17 @@ Returns models grouped by provider with types (`chat`, `embedding`, `image`).
- Automatic background sync with timeout + fail-fast
- Prefer server-side `BASE_URL`/`CLOUD_URL` in production
### Cloudflare Quick Tunnel
- Available in **Dashboard → Endpoints** for Docker and other self-hosted deployments
- Creates a temporary `https://*.trycloudflare.com` URL that forwards to your current OpenAI-compatible `/v1` endpoint
- First enable installs `cloudflared` only when needed; later restarts reuse the same managed binary
- Quick Tunnels are not auto-restored after an OmniRoute or container restart; re-enable them from the dashboard when needed
- Tunnel URLs are ephemeral and change every time you stop/start the tunnel
- Managed Quick Tunnels default to HTTP/2 transport to avoid noisy QUIC UDP buffer warnings in constrained containers
- Set `CLOUDFLARED_PROTOCOL=quic` or `auto` if you want to override the managed transport choice
- Set `CLOUDFLARED_BIN` if you prefer using a preinstalled `cloudflared` binary instead of the managed download
2. **Editable Rate Limits** — System-level defaults configurable in the dashboard:
- **Requests Per Minute (RPM)** — Maximum requests per minute per account
@@ -620,14 +771,18 @@ OmniRoute implements provider-level resilience with four components:
- **Max Concurrent Requests** — Maximum simultaneous requests per account
- Click **Edit** to modify, then **Save** or **Cancel**. Values persist via the resilience API.
3. **Circuit Breaker** — Tracks failures per provider and automatically opens the circuit when a threshold is reached:
3. **Circuit Breaker** — Tracks failures per provider and automatically opens the circuit when the configured threshold is reached:
- **CLOSED** (Healthy) — Requests flow normally
- **OPEN** — Provider is temporarily blocked after repeated failures
- **HALF_OPEN** — Testing if provider has recovered
The same provider profile also drives model-scoped lockouts:
- Account/model lockouts react immediately to authoritative `429` / `404` signals and use the configured cooldown + backoff values
- Global provider/model quarantine only activates after repeated exhaustion hits the configured **CB Threshold** within **CB Reset Time**
4. **Policies & Locked Identifiers** — Shows circuit breaker status and locked identifiers with force-unlock capability.
5. **Rate Limit Auto-Detection** — Monitors `429` and `Retry-After` headers to proactively avoid hitting provider rate limits.
5. **Rate Limit Auto-Detection** — Monitors `429` and `Retry-After` headers to proactively avoid hitting provider rate limits. When an upstream provider returns an explicit wait window, that authoritative `Retry-After` value overrides the base cooldown from the provider profile.
**Pro Tip:** Use **Reset All** button to clear all circuit breakers and cooldowns when a provider recovers from an outage.
@@ -637,11 +792,11 @@ OmniRoute implements provider-level resilience with four components:
Manage database backups in **Dashboard → Settings → System & Storage**.
| **Export Database** | Downloads the current SQLite database as a `.sqlite` file |
| **Export All (.tar.gz)** | Downloads a full backup archive including: database, settings, combos, provider connections (no credentials), API key metadata |
| **Import Database** | Upload a `.sqlite` file to replace the current database. A pre-import backup is automatically created |
| **Export Database** | Downloads the current SQLite database as a `.sqlite` file |
| **Export All (.tar.gz)** | Downloads a full backup archive including: database, settings, combos, provider connections (no credentials), API key metadata |
| **Import Database** | Upload a `.sqlite` file to replace the current database. A pre-import backup is automatically created unless `DISABLE_SQLITE_AUTO_BACKUP=true` |
```bash
# API: Export database
@@ -667,10 +822,11 @@ curl -X POST http://localhost:20128/api/db-backups/import \
### Settings Dashboard
The settings page is organized into 5 tabs for easy navigation:
The settings page is organized into 6 tabs for easy navigation:
├── COVERAGE_PLAN.md # Test coverage improvement plan
├── openapi.yaml # OpenAPI specification
└── adr/ # Architecture Decision Records
```
---
## Adding a New Provider
### Step 1: Register Provider Constants
Add to `src/shared/constants/providers.ts` — Zod-validated at module load.
### Step 2: Add Executor (if custom logic needed)
Create executor in `open-sse/executors/your-provider.ts` extending the base executor.
### Step 3: Add Translator (if non-OpenAI format)
Create request/response translators in `open-sse/translator/`.
### Step 4: Add OAuth Config (if OAuth-based)
Add OAuth credentials in `src/lib/oauth/constants/oauth.ts` and service in `src/lib/oauth/services/`.
### Step 5: Register Models
Add model definitions in `open-sse/config/providerRegistry.ts`.
### Step 6: Add Tests
Write unit tests in `tests/unit/` covering at minimum:
- Provider registration
- Request/response translation
- Error handling
---
## Pull Request Checklist
- [ ] Tests pass (`npm test`)
- [ ] Linting passes (`npm run lint`)
- [ ] Build succeeds (`npm run build`)
- [ ] TypeScript types added for new public functions and interfaces
- [ ] No hardcoded secrets or fallback values
- [ ] All inputs validated with Zod schemas
- [ ] CHANGELOG updated (if user-facing change)
- [ ] Documentation updated (if applicable)
---
## Releasing
Releases are managed via the `/generate-release` workflow. When a new GitHub Release is created, the package is **automatically published to npm** via GitHub Actions.
---
## Getting Help
- **Architecture**: See [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
- **API Reference**: See [`docs/API_REFERENCE.md`](docs/API_REFERENCE.md)
Create model routing combos with 6 strategies: priority, weighted, round-robin, random, least-used, and cost-optimized. Each combo chains multiple models with automatic fallback and includes quick templates and readiness checks.

---
## 📊 Analytics
Comprehensive usage analytics with token consumption, cost estimates, activity heatmaps, weekly distribution charts, and per-provider breakdowns.
Test any model directly from the dashboard. Select provider, model, and endpoint, write prompts with Monaco Editor, stream responses in real-time, abort mid-stream, and view timing metrics.
---
## 🎨 Themes _(v2.0.5+)_
Customizable color themes for the entire dashboard. Choose from 7 preset colors (Coral, Blue, Red, Green, Violet, Orange, Cyan) or create a custom theme by picking any hex color. Supports light, dark, and system mode.
---
## ⚙️ Settings
Comprehensive settings panel with tabs:
- **General** — System storage, backup management (export/import database)
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
One-click configuration for AI coding tools: Claude Code, Codex CLI, Gemini CLI, OpenClaw, Kilo Code, Antigravity, Cline, Continue, Cursor, and Factory Droid. Features automated config apply/reset, connection profiles, and model mapping.
Dashboard for discovering and managing CLI agents. Shows a grid of 14 built-in agents (Codex, Claude, Goose, Gemini CLI, OpenClaw, Aider, OpenCode, Cline, Qwen Code, ForgeCode, Amazon Q, Open Interpreter, Cursor CLI, Warp) with:
- **Installation status** — Installed / Not Found with version detection
- **Protocol badges** — stdio, HTTP, etc.
- **Custom agents** — Register any CLI tool via form (name, binary, version command, spawn args)
- **CLI Fingerprint Matching** — Per-provider toggle to match native CLI request signatures, reducing ban risk while preserving proxy IP
---
## 🖼️ Media _(v2.0.3+)_
Generate images, videos, and music from the dashboard. Supports OpenAI, xAI, Together, Hyperbolic, SD WebUI, ComfyUI, AnimateDiff, Stable Audio Open, and MusicGen.
---
## 📝 Request Logs
Real-time request logging with filtering by provider, model, account, and API key. Shows status codes, token usage, latency, and response details.

---
## 🌐 API Endpoint
Your unified API endpoint with capability breakdown: Chat Completions, Responses API, Embeddings, Image Generation, Reranking, Audio Transcription, Text-to-Speech, Moderations, and registered API keys. Cloud proxy support for remote access.
Create, scope, and revoke API keys. Each key can be restricted to specific models/providers with full access or read-only permissions. Visual key management with usage tracking.
---
## 📋 Audit Log
Administrative action tracking with filtering by action type, actor, target, IP address, and timestamp. Full security event history.
---
## 🖥️ Desktop Application
Native Electron desktop app for Windows, macOS, and Linux. Run OmniRoute as a standalone application with system tray integration, offline support, auto-update, and one-click install.
Key features:
- Server readiness polling (no blank screen on cold start)
- **Test Files**: ALL unit tests, integration tests, ecosystem tests, or Vitest files MUST strictly be placed within the `tests/` directory (e.g., `tests/unit/`, `tests/integration/`). NEVER create test files in the project root (`/`).
- **Scripts and Utilities**: ALL maintenance, debugging, generation, or experimental scripts (`.cjs`, `.mjs`, `.js`, `.ts`) MUST be placed strictly inside the `scripts/` directory or `scripts/scratch/` for temporary one-offs. NEVER dump loose scripts in the project root (`/`).
- CI/CD files and ignore definitions (`.gitignore`, `.dockerignore`)
When creating _any_ validation tests or one-off logic scripts, default to using `scripts/scratch/` or the `tests/unit/` directories according to your goals. Do not pollute the `/` root context.
OmniRoute is a local AI routing gateway and dashboard built on Next.js.
It provides a single OpenAI-compatible endpoint (`/v1/*`) and routes traffic across multiple upstream providers with translation, fallback, token refresh, and usage tracking.
Core capabilities:
- OpenAI-compatible API surface for CLI/tools (100+ providers, 16 executors)
- Request/response translation across provider formats
- Model combo fallback (multi-model sequence)
- Structured combo steps (`provider + model + connection`) with runtime ordering by `compositeTiers`
- Account-level fallback (multi-account per provider)
- Quota preflight and quota-aware P2C account selection in the main chat path
- facade: `src/lib/usageDb.ts` (decomposed modules in `src/lib/usage/*`)
- SQLite tables in `storage.sqlite`: `usage_history`, `call_logs`, `proxy_logs`
- optional file artifacts remain for compatibility/debug (`${DATA_DIR}/log.txt`, `${DATA_DIR}/call_logs/`, `<repo>/logs/...`)
- legacy JSON files are migrated to SQLite by startup migrations when present
Domain State DB (SQLite):
- `src/lib/db/domainState.ts` — CRUD operations for domain state
- Tables (created in `src/lib/db/core.ts`): `domain_fallback_chains`, `domain_budgets`, `domain_cost_history`, `domain_lockout_state`, `domain_circuit_breakers`
- Write-through cache pattern: in-memory Maps are authoritative at runtime; mutations are written synchronously to SQLite; state is restored from DB on cold start
Fallback decisions are driven by `open-sse/services/accountFallback.ts` using status codes and error-message heuristics. Combo routing adds one extra guard: provider-scoped 400s such as upstream content-block and role-validation failures are treated as model-local failures so later combo targets can still run.
## OAuth Onboarding and Token Refresh Lifecycle
```mermaid
sequenceDiagram
autonumber
participant UI as Dashboard UI
participant OAuth as /api/oauth/[provider]/[action]
participant ProvAuth as Provider Auth Server
participant DB as localDb
participant Test as /api/providers/[id]/test
participant Exec as Provider Executor
UI->>OAuth: GET authorize or device-code
OAuth->>ProvAuth: create auth/device flow
ProvAuth-->>OAuth: auth URL or device code payload
- Format constants: `open-sse/translator/formats.ts`
### Persistence
- `src/lib/db/*`: persistent config/state and domain persistence on SQLite
- `src/lib/localDb.ts`: compatibility re-export for DB modules
- `src/lib/usageDb.ts`: usage history/call logs facade on top of SQLite tables
## Provider Executor Coverage (Strategy Pattern)
Each provider has a specialized executor extending `BaseExecutor` (in `open-sse/executors/base.ts`), which provides URL building, header construction, retry with exponential backoff, credential refresh hooks, and the `execute()` orchestration method.
| Vertex AI | gemini | Service Account | ✅ | ✅ | ✅ | ⚠️ Cloud Console |
| Puter | openai | API Key | ✅ | ✅ | ❌ | ❌ |
## Format Translation Coverage
Detected source formats include:
- `openai`
- `openai-responses`
- `claude`
- `gemini`
Target formats include:
- OpenAI chat/Responses
- Claude
- Gemini/Gemini-CLI/Antigravity envelope
- Kiro
- Cursor
Translations use **OpenAI as the hub format** — all conversions go through OpenAI as intermediate:
```
Source Format → OpenAI (hub) → Target Format
```
Translations are selected dynamically based on source payload shape and provider target format.
Additional processing layers in the translation pipeline:
- **Response sanitization** — Strips non-standard fields from OpenAI-format responses (both streaming and non-streaming) to ensure strict SDK compliance
- **Role normalization** — Converts `developer` → `system` for non-OpenAI targets; merges `system` → `user` for models that reject the system role (GLM, ERNIE)
- **Think tag extraction** — Parses `<think>...</think>` blocks from content into `reasoning_content` field
| `GET/POST/DELETE /api/provider-models` | Provider Models | Provider model metadata backing custom and managed available models |
## Bypass Handler
The bypass handler (`open-sse/utils/bypassHandler.ts`) intercepts known "throwaway" requests from Claude CLI — warmup pings, title extractions, and token counts — and returns a **fake response** without consuming upstream provider tokens. This is triggered only when `User-Agent` contains `claude-cli`.
## Request Logging and Artifacts
The older file-based request logger (`open-sse/utils/requestLogger.ts`) is retained only for
legacy compatibility. The current runtime contract uses:
- `APP_LOG_TO_FILE=true` for application and audit logs written under `<repo>/logs/`
- SQLite-backed call log records in `call_logs`
- `${DATA_DIR}/call_logs/YYYY-MM-DD/...` artifacts when the call log pipeline is enabled
## Failure Modes and Resilience
## 1) Account/Provider Availability
- provider account cooldown on transient/rate/auth errors
- account fallback before failing request
- combo model fallback when current model/provider path is exhausted
## 2) Token Expiry
- pre-check and refresh with retry for refreshable providers
- 401/403 retry after refresh attempt in core path
## 3) Stream Safety
- disconnect-aware stream controller
- translation stream with end-of-stream flush and `[DONE]` handling
- usage estimation fallback when provider usage metadata is missing
## 4) Cloud Sync Degradation
- sync errors are surfaced but local runtime continues
- scheduler has retry-capable logic, but periodic execution currently calls single-attempt sync by default
## 5) Data Integrity
- SQLite schema migrations and auto-upgrade hooks at startup
1. `usageDb` and `localDb` share the same base directory policy (`DATA_DIR` -> `XDG_CONFIG_HOME/omniroute` -> `~/.omniroute`) with legacy file migration.
2. `/api/v1/route.ts` delegates to the same unified catalog builder used by `/api/v1/models` (`src/app/api/v1/models/catalog.ts`) to avoid semantic drift.
3. Request logger writes full headers/body when enabled; treat log directory as sensitive.
4. Cloud behavior depends on correct `NEXT_PUBLIC_BASE_URL` and cloud endpoint reachability.
5. The `open-sse/` directory is published as the `@omniroute/open-sse`**npm workspace package**. Source code imports it via `@omniroute/open-sse/...` (resolved by Next.js `transpilePackages`). File paths in this document still use the directory name `open-sse/` for consistency.
6. Charts in the dashboard use **Recharts** (SVG-based) for accessible, interactive analytics visualizations (model usage bar charts, provider breakdown tables with success rates).
7. E2E tests use **Playwright** (`tests/e2e/`), run via `npm run test:e2e`. Unit tests use **Node.js test runner** (`tests/unit/`), run via `npm run test:unit`. Source code under `src/` is **TypeScript** (`.ts`/`.tsx`); the `open-sse/` workspace remains JavaScript (`.js`).
8. Settings page is organized into 5 tabs: Security, Routing (6 global strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized), Resilience (editable rate limits, circuit breaker, policies, **Context Relay** handoff config), AI (thinking budget, system prompt, prompt cache), Advanced (proxy).
9. **Context Relay** strategy (`context-relay`) is split across two layers: `combo.ts` decides if a handoff should be generated, `chat.ts` injects the handoff after account resolution. Handoff data lives in `context_handoffs` SQLite table. This split is intentional because only `chat.ts` knows whether the actual account changed.
10. **Proxy enforcement** is now comprehensive: `tokenHealthCheck.ts` resolves proxy per connection, `/api/providers/validate` uses `runWithProxyContext`, and `proxyFetch.ts` uses `undici.fetch()` to maintain dispatcher compatibility on Node 22.
11. **Node.js runtime policy detection**: `/api/settings/require-login` returns `nodeVersion` and `nodeCompatible` fields. The login page renders a warning banner when the runtime falls outside the supported secure Node.js lines.
> A comprehensive, beginner-friendly guide to the **omniroute** multi-provider AI proxy router.
---
@@ -271,7 +269,7 @@ Business logic that supports the handlers and executors.
| `provider.ts` | **Format detection** (`detectFormat`): analyzes request body structure to identify Claude/OpenAI/Gemini/Antigravity/Responses formats (includes `max_tokens` heuristic for Claude). Also: URL building, header building, thinking config normalization. Supports `openai-compatible-*` and `anthropic-compatible-*` dynamic providers. |
| `model.ts` | Model string parsing (`claude/model-name` → `{provider: "claude", model: "model-name"}`), alias resolution with collision detection, input sanitization (rejects path traversal/control chars), and model info resolution with async alias getter support. |
| `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, iFlow, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. |
| `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, Qoder, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. |
| `combo.ts` | **Combo models**: chains of fallback models. If model A fails with a fallback-eligible error, try model B, then C, etc. Returns actual upstream status codes. |
| `usage.ts` | Fetches quota/usage data from provider APIs (GitHub Copilot quotas, Antigravity model quotas, Codex rate limits, Kiro usage breakdowns, Claude settings). |
| `accountSelector.ts` | Smart account selection with scoring algorithm: considers priority, health status, round-robin position, and cooldown state to pick the optimal account for each request. |
@@ -352,7 +350,7 @@ flowchart LR
The **format translation engine** using a self-registering plugin system.
| `usageTracking.ts` | Token usage extraction from any format (Claude/OpenAI/Gemini/Responses), estimation with separate tool/message char-per-token ratios, buffer addition (2000 tokens safety margin), format-specific field filtering, console logging with ANSI colors. |
| `requestLogger.ts` | File-based request logging (opt-in via `ENABLE_REQUEST_LOGS=true`). Creates session folders with numbered files: `1_req_client.json` → `7_res_client.txt`. All I/O is async (fire-and-forget). Masks sensitive headers. |
| `requestLogger.ts` | Legacy file-based request logging helper kept for compatibility. Current deployments should prefer `APP_LOG_TO_FILE` for application logs and the call log pipeline for persisted request artifacts. |
| `bypassHandler.ts` | Intercepts specific patterns from Claude CLI (title extraction, warmup, count) and returns fake responses without calling any provider. Supports both streaming and non-streaming. Intentionally limited to Claude CLI scope. |
| `networkProxy.ts` | Resolves outbound proxy URL for a given provider with precedence: provider-specific config → global config → environment variables (`HTTPS_PROXY`/`HTTP_PROXY`/`ALL_PROXY`). Supports `NO_PROXY` exclusions. Caches config for 30s. |
@@ -543,7 +541,7 @@ A 2000-token buffer is added to reported usage to prevent clients from hitting c
| Kiro (AWS) | AWS SSO OIDC or Social | Kiro | Binary EventStream parsing |
| Legacy | Old `npm run test:coverage` | 79.42% | 75.15% | 67.94% | Inflated: counts test files and excludes `open-sse` |
| Diagnostic | Source-only, excluding tests and excluding `open-sse` | 68.16% | 63.55% | 64.06% | Useful only to isolate `src/**` |
| Recommended baseline | Source-only, excluding tests and including `open-sse` | 56.95% | 66.05% | 57.80% | This is the project-wide baseline to improve |
The recommended baseline is the number to optimize against.
## Rules
- Coverage targets apply to source files, not to `tests/**`.
- `open-sse/**` is part of the product and must remain in scope.
- New code should not reduce coverage in touched areas.
- Prefer testing behavior and branch outcomes over implementation details.
- Prefer temp SQLite databases and small fixtures over broad mocks for `src/lib/db/**`.
## Current command set
- `npm run test:coverage`
- Main source coverage gate for the unit test suite
- Generates `text-summary`, `html`, `json-summary`, and `lcov`
- `npm run coverage:report`
- Detailed file-by-file report from the latest run
| Phase 7 | 90% statements / lines | Final sweep, gap closure, strict ratchet |
Branches and functions should ratchet upward with each phase, but the primary hard target is statements / lines.
## Priority hotspots
These files or areas offer the best return for the next phases:
1. `open-sse/handlers`
- `chatCore.ts` at 7.57%
- Overall directory at 29.07%
2. `open-sse/translator/request`
- Overall directory at 36.39%
- Many translators are still near single-digit coverage
3. `open-sse/translator/response`
- Overall directory at 8.07%
4. `open-sse/executors`
- Overall directory at 36.62%
5. `src/lib/db`
- `models.ts` at 20.66%
- `registeredKeys.ts` at 34.46%
- `modelComboMappings.ts` at 36.25%
- `settings.ts` at 46.40%
- `webhooks.ts` at 33.33%
6. `src/lib/usage`
- `usageHistory.ts` at 21.12%
- `usageStats.ts` at 9.56%
- `costCalculator.ts` at 30.00%
7. `src/lib/providers`
- `validation.ts` at 41.16%
8. Low-risk utility and API files for early gains
- `src/shared/utils/upstreamError.ts`
- `src/shared/utils/apiAuth.ts`
- `src/lib/api/errorResponse.ts`
- `src/app/api/settings/require-login/route.ts`
- `src/app/api/providers/[id]/models/route.ts`
## Execution checklist
### Phase 1: 56.95% -> 60%
- [x] Fix coverage metric so it reflects source code instead of test files
- [x] Keep a legacy coverage script for comparison
- [x] Record the baseline and hotspots in-repo
- [ ] Add focused tests for low-risk utilities:
- `src/shared/utils/upstreamError.ts`
- `src/shared/utils/fetchTimeout.ts`
- `src/lib/api/errorResponse.ts`
- `src/shared/utils/apiAuth.ts`
- `src/lib/display/names.ts`
- [ ] Add route tests for:
- `src/app/api/settings/require-login/route.ts`
- `src/app/api/providers/[id]/models/route.ts`
### Phase 2: 60% -> 65%
- [ ] Add DB-backed tests for:
- `src/lib/db/modelComboMappings.ts`
- `src/lib/db/settings.ts`
- `src/lib/db/registeredKeys.ts`
- [ ] Cover branch behavior in:
- `src/lib/providers/validation.ts`
- `src/app/api/v1/embeddings/route.ts`
- `src/app/api/v1/moderations/route.ts`
### Phase 3: 65% -> 70%
- [ ] Add usage analytics tests for:
- `src/lib/usage/usageHistory.ts`
- `src/lib/usage/usageStats.ts`
- `src/lib/usage/costCalculator.ts`
- [ ] Expand route coverage for proxy management and settings branches
### Phase 4: 70% -> 75%
- [ ] Cover translator helpers and central translation paths:
- `open-sse/translator/index.ts`
- `open-sse/translator/helpers/*`
- `open-sse/translator/request/*`
- `open-sse/translator/response/*`
### Phase 5: 75% -> 80%
- [ ] Add handler-level tests for:
- `open-sse/handlers/chatCore.ts`
- `open-sse/handlers/responsesHandler.js`
- `open-sse/handlers/imageGeneration.js`
- `open-sse/handlers/embeddings.js`
- [ ] Add executor branch coverage for provider-specific auth, retries, and endpoint overrides
### Phase 6: 80% -> 85%
- [ ] Merge more edge-case suites into the main coverage path
- [ ] Increase function coverage for DB modules with weak constructor/helper coverage
- [ ] Close branch gaps in `settings.ts`, `registeredKeys.ts`, `validation.ts`, and translator helpers
### Phase 7: 85% -> 90%
- [ ] Treat the remaining low-coverage files as blockers
- [ ] Add regression tests for every uncovered production bug fixed during the push to 90%
- [ ] Raise the coverage gate in CI only after the local baseline is stable for at least two consecutive runs
## Ratchet policy
Update `npm run test:coverage` thresholds only after the project actually exceeds the next milestone with a comfortable buffer.
Recommended ratchet sequence:
1. 55/60/55
2. 60/62/58
3. 65/64/62
4. 70/66/66
5. 75/70/72
6. 80/75/78
7. 85/80/84
8. 90/85/88
Order is `statements-lines / branches / functions`.
## Known gap
The current coverage command measures the main Node unit suite and includes source reached from it, including `open-sse`. It does not yet merge Vitest coverage into a single unified report. That merge is worth doing later, but it is not a blocker for starting the 60% -> 80% climb.
| `DATA_DIR` | `~/.omniroute/` | `src/lib/db/core.ts` | Root directory for SQLite DB, backups, and data files. Override for Docker volumes or custom paths. |
| `STORAGE_ENCRYPTION_KEY` | _(empty = disabled)_ | `src/lib/db/encryption.ts` | AES key for full SQLite database encryption at rest. Generate with `openssl rand -hex 32`. |
| `STORAGE_ENCRYPTION_KEY_VERSION` | `v1` | `scripts/bootstrap-env.mjs`, `electron/main.js` | Version label for the encryption key. Increment when performing key rotation to support decryption of old backups. |
| `DISABLE_SQLITE_AUTO_BACKUP` | `false` | `src/lib/db/backup.ts` | When `true`, skips the automatic database backup that runs before migrations on every startup. |
| `OMNIROUTE_CRYPT_KEY` | _(unset)_ | `src/lib/db/encryption.ts` | **Legacy alias** for `STORAGE_ENCRYPTION_KEY`. Accepted as a fallback when the primary variable is absent. |
| `OMNIROUTE_API_KEY_BASE64` | _(unset)_ | `src/lib/db/encryption.ts` | **Legacy alias** (Base64-encoded form) accepted as a fallback. Decoded automatically before use. |
| `MACHINE_ID_SALT` | `endpoint-proxy-salt` | `src/lib/auth` | Salt combined with hardware identifiers for machine fingerprinting. Change per-deployment for isolation. |
| `AUTH_COOKIE_SECURE` | `false` | `src/lib/auth` | Sets the `Secure` flag on session cookies. **Must be `true`** when running behind HTTPS. |
| `REQUIRE_API_KEY` | `false` | API middleware | When `true`, all `/v1/*` proxy requests must include a valid API key. |
| `ALLOW_API_KEY_REVEAL` | `false` | Dashboard providers page | Allows revealing full API key values in the Dashboard UI. Security risk on shared instances. |
| `NO_LOG_API_KEY_IDS` | _(empty)_ | `src/lib/compliance/index.ts` | Comma-separated API key IDs that bypass request logging (GDPR compliance). |
| `MAX_BODY_SIZE_BYTES` | `10485760` (10 MB) | `src/shared/middleware/bodySizeGuard.ts` | Maximum allowed request body size. Rejects payloads exceeding this limit. |
| `CORS_ORIGIN` | `*` | Next.js middleware | CORS `Access-Control-Allow-Origin` value. Restrict for production. |
| `OUTBOUND_SSRF_GUARD_ENABLED` | `true` | `src/shared/network/outboundUrlGuard.ts` | Block provider calls targeting private/loopback/link-local IP ranges. Disable only in isolated test envs. |
### Hardening Checklist
```bash
# Production security minimum:
AUTH_COOKIE_SECURE=true # Requires HTTPS
REQUIRE_API_KEY=true # Authenticate all proxy calls
ALLOW_API_KEY_REVEAL=false # Never expose keys in UI
CORS_ORIGIN=https://your.domain.com
MAX_BODY_SIZE_BYTES=5242880 # 5 MB limit
```
---
## 5. Input Sanitization & PII Protection
OmniRoute provides a two-layer defense: request-side injection scanning and response-side PII stripping.
> When deploying behind a reverse proxy (nginx, Caddy), `NEXT_PUBLIC_BASE_URL`**must** be set to your public URL (e.g., `https://omniroute.example.com`). Without this, OAuth callbacks will fail because the redirect_uri won't match.
---
## 8. Outbound Proxy
Route upstream LLM provider calls through an HTTP or SOCKS5 proxy for egress control, geo-routing, or IP masking.
| `CURSOR_USER_AGENT` | `connect-es/1.6.1` | When Cursor updates |
| `GEMINI_CLI_USER_AGENT` | `google-api-nodejs-client/9.15.1` | When Google API client updates |
> [!TIP]
> You can add User-Agent overrides for **any** provider using the pattern `{PROVIDER_ID}_USER_AGENT`. The executor dynamically constructs the env var name.
---
## 13. CLI Fingerprint Compatibility
When enabled, OmniRoute reorders HTTP headers and JSON body fields to match the exact signature of official CLI tools. This reduces the risk of account flagging while preserving your proxy IP.
| `CLI_COMPAT_ALL=1` | Enable fingerprint compatibility for **all** providers at once. |
> [!NOTE]
> This feature works alongside the User-Agent overrides (§12). The fingerprint system handles header ordering and body field ordering, while User-Agent overrides handle the specific UA string. Both can be enabled independently.
---
## 14. API Key Providers
API keys for providers that use direct authentication. **Preferred setup:** Dashboard → Providers → Add API Key.
Setting via environment variables is an alternative for Docker or headless deployments.
Recognized pattern: `{PROVIDER_ID}_API_KEY`
| Variable | Provider |
| -------------------- | ------------------- |
| `DEEPSEEK_API_KEY` | DeepSeek |
| `GROQ_API_KEY` | Groq |
| `XAI_API_KEY` | xAI (Grok) |
| `MISTRAL_API_KEY` | Mistral AI |
| `PERPLEXITY_API_KEY` | Perplexity |
| `TOGETHER_API_KEY` | Together AI |
| `FIREWORKS_API_KEY` | Fireworks AI |
| `CEREBRAS_API_KEY` | Cerebras |
| `COHERE_API_KEY` | Cohere |
| `NVIDIA_API_KEY` | NVIDIA NIM |
| `NEBIUS_API_KEY` | Nebius (embeddings) |
> [!TIP]
> Keys set via the Dashboard are stored encrypted in SQLite and take precedence over environment variables.
---
## 15. Timeout Settings
All values are in **milliseconds**. Centralized resolution in `src/shared/utils/runtimeTimeouts.ts`.
| `PROXY_HEALTH_CACHE_TTL_MS` | `30000` | `src/lib/proxyHealth.ts` | Health check result cache TTL. |
| `RATE_LIMIT_MAX_WAIT_MS` | `120000` (2 min) | `open-sse/services/rateLimitManager.ts` | Max time to wait on a 429 before failing the request. |
| `REQUEST_RETRY` | `2` | `src/sse/services/cooldownAwareRetry.ts` | Number of automatic retries on model-scoped cooldown responses before returning error to client. |
| `MAX_RETRY_INTERVAL_SEC` | `30` | `src/sse/services/cooldownAwareRetry.ts` | Max backoff interval (seconds) between cooldown retries. Capped by this value regardless of upstream `Retry-After`. |
---
## 22. Debugging
> [!CAUTION]
> These variables produce **verbose output** and may leak sensitive data. **Never enable in production.**
The following variables appeared in previous versions of `.env.example` but have **no runtime references** in the current codebase. They have been removed:
Create model routing combos with 13 strategies: priority, weighted, round-robin, random, least-used, cost-optimized, strict-random, auto, fill-first, p2c, lkgp, context-optimized, and **context-relay**. Each combo chains multiple models with automatic fallback and includes quick templates and readiness checks.
Recent combo improvements:
- **Structured combo builder** — create each step by selecting provider, model, and exact account/connection
- **Repeated provider support** — reuse the same provider many times in one combo as long as the `(provider, model, connection)` tuple is unique
- **Combo target health** — analytics and health surfaces now distinguish individual combo targets/steps instead of collapsing everything into model strings
- **Composite tier ordering** — `defaultTier -> fallbackTier` now influences runtime execution/fallback order for top-level combo steps

---
## 📊 Analytics
Comprehensive usage analytics with token consumption, cost estimates, activity heatmaps, weekly distribution charts, and per-provider breakdowns.
Test any model directly from the dashboard. Select provider, model, and endpoint, write prompts with Monaco Editor, stream responses in real-time, abort mid-stream, and view timing metrics.
---
## 🎨 Themes _(v2.0.5+)_
Customizable color themes for the entire dashboard. Choose from 7 preset colors (Coral, Blue, Red, Green, Violet, Orange, Cyan) or create a custom theme by picking any hex color. Supports light, dark, and system mode.
---
## ⚙️ Settings
Comprehensive settings panel with tabs:
- **General** — System storage, backup management (export/import database)
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
One-click configuration for AI coding tools: Claude Code, Codex CLI, Gemini CLI, OpenClaw, Kilo Code, Antigravity, Cline, Continue, Cursor, and Factory Droid. Features automated config apply/reset, connection profiles, and model mapping.
Dashboard for discovering and managing CLI agents. Shows a grid of 14 built-in agents (Codex, Claude, Goose, Gemini CLI, OpenClaw, Aider, OpenCode, Cline, Qwen Code, ForgeCode, Amazon Q, Open Interpreter, Cursor CLI, Warp) with:
- **Installation status** — Installed / Not Found with version detection
- **Protocol badges** — stdio, HTTP, etc.
- **Custom agents** — Register any CLI tool via form (name, binary, version command, spawn args)
- **CLI Fingerprint Matching** — Per-provider toggle to match native CLI request signatures, reducing ban risk while preserving proxy IP
---
## 🔗 Context Relay _(v3.5.5+)_
A combo strategy that preserves session continuity when account rotation happens mid-conversation. Before the active account is exhausted, OmniRoute generates a structured handoff summary in the background. After the next request resolves to a different account, the summary is injected as a system message so the new account continues with full context.
- **Max Messages For Summary** — How much recent history to condense
- **Summary Model** — Optional override model for generating the handoff summary
Currently supports Codex account rotation. See [Context Relay documentation](features/context-relay.md).
---
## 🛡️ Proxy Hardening _(v3.5.5+)_
Comprehensive proxy configuration enforcement across the entire request pipeline:
- **Token Health Check** — Background OAuth refresh now resolves proxy config per connection, preventing failures in proxy-required environments
- **API Key Validation** — Provider key validation (`POST /api/providers/validate`) routes through `runWithProxyContext`, honoring provider-level and global proxy settings
- **undici Dispatcher Fix** — Proxy dispatchers use undici's own fetch implementation instead of Node's built-in fetch, resolving `invalid onRequestStart method` errors on Node.js 22
- **Node.js Version Detection** — Login page proactively detects incompatible Node.js versions (24+) and displays a warning banner with instructions to use Node 22 LTS
---
## 📧 Email Privacy Masking _(v3.5.6+)_
OAuth account emails are now masked in the provider dashboard (e.g. `di*****@g****.com`) to prevent accidental exposure when sharing screenshots or recording demos. The full email address remains accessible via hover tooltip (`title` attribute).
---
## 👁️ Model Visibility Toggle _(v3.5.6+)_
The provider page model list now includes:
- **Real-time search/filter bar** — Quickly find specific models
- **Per-model visibility toggle** (👁 icon) — Hidden models are grayed out and excluded from the `/v1/models` catalog
- **Active-count badge** (`N/M active`) — Shows at a glance how many models are enabled vs total
---
## 🔧 OAuth Env Repair _(v3.6.1+)_
One-click "Repair env" action for OAuth providers that restores missing environment variables and fixes broken auth state. Accessible from `Dashboard → Providers → [OAuth Provider] → Repair env`. Automatically detects and repairs:
- Missing OAuth client credentials
- Corrupted env file entries
- Backup path sanitization
---
## 🗑️ Uninstall / Full Uninstall _(v3.6.2+)_
Clean removal scripts for all installation methods:
| `npm run uninstall` | Removes the system app but **keeps your DB and configurations** in `~/.omniroute`. |
| `npm run uninstall:full` | Removes the app AND permanently **erases all configurations, keys, and databases**. |
---
## 🖼️ Media _(v2.0.3+)_
Generate images, videos, and music from the dashboard. Supports OpenAI, xAI, Together, Hyperbolic, SD WebUI, ComfyUI, AnimateDiff, Stable Audio Open, and MusicGen.
---
## 📝 Request Logs
Real-time request logging with filtering by provider, model, account, and API key. Shows status codes, token usage, latency, and response details.

---
## 🌐 API Endpoint
Your unified API endpoint with capability breakdown: Chat Completions, Responses API, Embeddings, Image Generation, Reranking, Audio Transcription, Text-to-Speech, Moderations, and registered API keys. Cloudflare Quick Tunnel integration and cloud proxy support for remote access.
Create, scope, and revoke API keys. Each key can be restricted to specific models/providers with full access or read-only permissions. Visual key management with usage tracking.
---
## 📋 Audit Log
Administrative action tracking with filtering by action type, actor, target, IP address, and timestamp. Full security event history.
---
## 🖥️ Desktop Application
Native Electron desktop app for Windows, macOS, and Linux. Run OmniRoute as a standalone application with system tray integration, offline support, auto-update, and one-click install.
Key features:
- Server readiness polling (no blank screen on cold start)
- Hardened Electron build packaging — symlinked `node_modules` in the standalone bundle is detected and rejected before packaging, preventing runtime dependency on the build machine (v2.5.5+)
- **Graceful shutdown** — Electron `before-quit` shuts down Next.js cleanly, preventing SQLite WAL database locks (v3.6.2+)
📖 See [`electron/README.md`](../electron/README.md) for full documentation.
---
## 🌐 V1 WebSocket Bridge _(v3.6.6+)_
OmniRoute now supports **OpenAI-compatible WebSocket clients** via the `/v1/ws` upgrade endpoint. The custom `scripts/v1-ws-bridge.mjs` server wraps Next.js and upgrades WS connections to full bidirectional streaming sessions. Authentication uses the same API key or session cookie as HTTP requests.
Key behaviours:
- WS upgrade validated by `src/lib/ws/handshake.ts` before the connection is established
- Streams terminated cleanly on session close or upstream error
- Works alongside the existing HTTP+SSE streaming path simultaneously
---
## 🔑 Sync Tokens & Config Bundle _(v3.6.6+)_
Multi-device and external operator access is now possible via **scoped sync tokens**:
- **`POST /api/sync/tokens`** — Issue a new sync token (scoped, with optional expiry)
- **`DELETE /api/sync/tokens/:id`** — Revoke a token
- **`GET /api/sync/bundle`** — Download a versioned, ETag-keyed JSON snapshot of all non-sensitive settings (passwords redacted)
The config bundle is built by `src/lib/sync/bundle.ts`. Consumers compare the `ETag` response header to detect changes without re-downloading the full payload.
---
## 🧠 GLM Thinking Preset _(v3.6.6+)_
**GLM Thinking (`glmt`)** is now a registered first-class provider: 65 536 max output tokens, 24 576 thinking budget, 900 s default timeout, Claude-compatible API format, and shared usage sync with the GLM family.
**Hybrid token counting** also lands in v3.6.6: when a Claude-compatible provider exposes `/messages/count_tokens`, OmniRoute calls it before large requests with graceful estimation fallback.
All provider validation and model discovery calls now go through a two-layer outbound guard:
1. **URL guard** (`src/shared/network/outboundUrlGuard.ts`) — Blocks private/loopback/link-local IP ranges before the socket is opened.
2. **Safe fetch wrapper** (`src/shared/network/safeOutboundFetch.ts`) — Applies the URL guard, normalises timeouts, and retries transient errors with exponential backoff.
Guard violations surface as HTTP 422 (`URL_GUARD_BLOCKED`) and are written to the compliance audit log via `providerAudit.ts`.
---
## 🔄 Cooldown-Aware Retries _(v3.6.6+)_
Chat requests now **automatically retry** when an upstream provider returns a model-scoped cooldown. Configurable via `REQUEST_RETRY` (default: 2) and `MAX_RETRY_INTERVAL_SEC` (default: 30 s). Rate-limit header learning improved across `x-ratelimit-reset-requests`, `x-ratelimit-reset-tokens`, and `Retry-After` — per-model cooldown state is visible in the Resilience dashboard.
---
## 📋 Compliance Audit v2 _(v3.6.6+)_
The audit log has been expanded with cursor-based pagination, request context enrichment (request ID, user agent, IP), structured auth events, provider CRUD events with diff context, and SSRF-blocked validation logging. New events emitted by `src/lib/compliance/providerAudit.ts`.
| `messages` | Translates missing keys in `src/i18n/messages/{locale}.json` from `en.json` |
| `readme` | Translates `README.md` into all locales as `README.{code}.md` in project root |
| `docs` | Translates `DOC_SOURCE_FILES` into `docs/i18n/{locale}/{docName}` |
| `all` | Runs all three modes |
**Features:**
- **Text protection**: Masks code blocks (` ``` `), inline code (`` ` ``), markdown links/images (`[text](url)`), HTML tags, tables, and ICU placeholders (`{count}`, `{value}`, `{total}`, etc.) before translation, then restores them
- **Chunked batching**: Joins multiple strings with `__OMNIROUTE_I18N_SEPARATOR__` delimiters to minimize API calls (max 1800 chars per request)
- **In-memory cache**: Avoids redundant API calls for repeated strings within a session
- **Retry logic**: Exponential backoff (up to 5 attempts with 300ms × attempt delay) for 429/5xx errors
- **Timeout**: 20 seconds per request
- **Skip existing**: If target file already exists, it is NOT overwritten
**Important behaviors:**
- `docs/i18n/README.md` is **regenerated** each run — it's an auto-generated index of all docs
- Root `README.{code}.md` files are only created if they don't exist (skips locales in `EXISTING_README_CODES`)
- Language bars (`🌐 **Languages:** ...`) are automatically inserted/updated in all translated docs
### i18n_autotranslate.py (LLM-based)
**Secondary translator** — uses any OpenAI-compatible LLM API (including OmniRoute itself) to translate existing `docs/i18n/` markdown files. Best for polishing or re-translating docs with better quality than Google Translate.
```bash
python3 scripts/i18n_autotranslate.py \
--api-url http://localhost:20128/v1 \
--api-key sk-your-key \
--model gpt-4o
```
**Features:**
- Scans `docs/i18n/` markdown files for English paragraphs
- Skips code blocks, tables, and already-translated content
- Sends paragraphs to LLM with technical translation system prompt
- Supports all 30 languages
## Validation & QA
### validate_translation.py
**Translation validator** — compares any locale JSON against `en.json` and reports issues.
- **Missing keys** — keys in `en.json` but not in locale file
- **Extra keys** — keys in locale file but not in `en.json`
- **Untranslated keys** — keys where locale value equals English source (excluding allowlist)
- **Placeholder mismatches** — ICU placeholders that don't match between source and translation
**Exit codes:**
| Code | Meaning |
|------|---------|
| 0 | OK |
| 1 | Generic error |
| 2 | Missing strings (hard error) |
| 3 | Untranslated warning (soft) |
**Environment:** Set `TRANSLATION_LANG=cs` or use `-l cs` flag.
### check_translations.py
**Code-to-JSON key checker** — scans `src/**/*.tsx` and `src/**/*.ts` for `useTranslations()` calls and verifies all referenced keys exist in `en.json`.
```bash
# Basic check
python3 scripts/check_translations.py
# Verbose output
python3 scripts/check_translations.py --verbose
# Auto-fix (adds missing keys to en.json)
python3 scripts/check_translations.py --fix
```
### generate-qa-checklist.mjs
**Static analysis QA** — scans Next.js page files for i18n risk metrics and generates a Markdown report.
```bash
node scripts/i18n/generate-qa-checklist.mjs
```
**Checks:**
- Fixed-width class usage (overflow risk)
- Directional left/right classes (RTL risk)
- Clipping-prone patterns
- Locale parity (missing/extra keys vs `en.json`)
- README language selector bars in priority locales (`es`, `fr`, `de`, `ja`, `ar`)
1. **Always edit `en.json` first** — it's the source of truth
2. **Run `generate-multilang.mjs messages`** to propagate new keys to all locales
3. **Review auto-translations** — Google Translate is a starting point, not final
4. **Validate before committing** — `python3 scripts/validate_translation.py quick -l <lang>`
5. **Update `untranslatable-keys.json`** if a key should remain in English
### Placeholder Safety
- ICU placeholders (`{count}`, `{value}`, `{total}`, `{seconds}`) must be preserved exactly
- Plural formats (`{count, plural, one {# model} other {# models}}`) must maintain structure
- The validator detects placeholder mismatches automatically
### Adding New Translation Keys in Code
```tsx
// Use namespaced keys
const t = useTranslations("settings");
t("cacheSettings"); // maps to settings.cacheSettings in JSON
// Run check_translations.py to verify keys exist
python3 scripts/check_translations.py --verbose
```
### RTL Considerations
- Arabic (`ar`) and Hebrew (`he`) are RTL locales
- Avoid hardcoded `left`/`right` CSS — use `start`/`end` logical properties
- Visual QA catches RTL layout mismatches via `run-visual-qa.mjs`
## Known Issues & History
### `in.json` → `hi.json` Fix
The generator originally used `code: "in"` (deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 `hi`. This created an orphaned `in.json` duplicate of `hi.json`. Fixed by changing `code: "in"` to `code: "hi"` in `generate-multilang.mjs` and removing the orphaned file.
### `docs/i18n/README.md` Is Auto-Generated
The `docs/i18n/README.md` file is completely regenerated by `generate-multilang.mjs docs`. Any manual edits will be lost. Use `docs/I18N.md` (this file) for hand-written documentation that should persist.
### External Untranslatable Keys List
The `untranslatable-keys.json` allowlist was moved from an inline Python set in `validate_translation.py` to an external JSON file for easier maintenance. The validator loads it at runtime.
### `generate-multilang.mjs` Hindi Code Fix
The generator originally used `code: "in"` (deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 `hi`. This was introduced in upstream commit `952b0b22c` by `diegosouzapw`. Fixed by changing `code: "in"` to `code: "hi"` in the `LOCALE_SPECS` array and removing the orphaned `in.json` file.
| First login not working | Set `INITIAL_PASSWORD` in `.env` (no hardcoded default) |
| Dashboard opens on wrong port | Set `PORT=20128` and `NEXT_PUBLIC_BASE_URL=http://localhost:20128` |
| No logs written to disk | Set `APP_LOG_TO_FILE=true` and verify call log capture is enabled |
| EACCES: permission denied | Set `DATA_DIR=/path/to/writable/dir` to override `~/.omniroute` |
| Routing strategy not saving | Update to v1.4.11+ (Zod schema fix for settings persistence) |
| Login crash / blank page | Check Node.js version — see [Node.js Compatibility](#nodejs-compatibility) below |
| `dlopen` / `slice is not valid mach-o file` (macOS) | Run `cd $(npm root -g)/omniroute/app && npm rebuild better-sqlite3 && omniroute` — see [macOS native module rebuild](#macos-native-module-rebuild) below |
| Proxy "fetch failed" | Ensure proxy config is set at the correct level — see [Proxy Issues](#proxy-issues) below |
---
## Node.js Compatibility
<a name="nodejs-compatibility"></a>
### Login page crashes or shows "Module self-registration" error
**Cause:** You are running a Node.js version outside OmniRoute's approved secure runtime floor. The most common case is running an older Node 20, 22, or 24 patch level that falls below the patched security floor OmniRoute requires.
**Symptoms:**
- Login page shows a blank screen or a server error
- Console shows `Error: Module did not self-register` or similar native binding errors
- The login page shows an **orange warning banner** with your Node version if the runtime is outside the supported secure policy
**Fix:**
1. Install a supported Node.js LTS release (recommended: Node.js 24.x):
```bash
nvm install 24
nvm use 24
```
2. Verify your version: `node --version` should show `v24.0.0` or newer on the 24.x LTS line
> **Supported secure versions:**`>=20.20.2 <21`, `>=22.22.2 <23`, or `>=24.0.0 <25`. Node.js 24.x LTS (Krypton) is fully supported.
### macOS: `dlopen` / "slice is not valid mach-o file"
<a name="macos-native-module-rebuild"></a>
**Cause:** After a global `npm install -g omniroute`, the `better-sqlite3` native binary inside the package may have been compiled for a different architecture or Node.js ABI than what is running locally. This is common on macOS (both Apple Silicon and Intel) when the pre-built binary does not match your environment.
**Symptoms:**
- Server fails immediately on startup with a `dlopen` error
- Error contains `slice is not valid mach-o file`
- Full example:
```
dlopen(/Users/<user>/.nvm/versions/node/v24.14.1/lib/node_modules/omniroute/app/node_modules/better-sqlite3/build/Release/better_sqlite3.node, 0x0001): tried: '...' (slice is not valid mach-o file)
```
**Fix — rebuild for your local environment (no Node.js downgrade required):**
```bash
cd $(npm root -g)/omniroute/app
npm rebuild better-sqlite3
omniroute
```
> **Note:** This recompiles the native binding against your local Node.js version and CPU architecture, resolving the binary mismatch. The officially supported range is **`>=20.20.2 <21`, `>=22.22.2 <23`, or `>=24.0.0 <25`** (`engines` field in `package.json`). Node.js 24.x LTS (Krypton) is fully supported with `better-sqlite3` v12.x.
---
## Proxy Issues
<a name="proxy-issues"></a>
### Provider validation shows "fetch failed"
**Cause:** The API key validation endpoint (`POST /api/providers/validate`) was previously bypassing proxy configuration, causing failures in environments that require proxy routing.
**Fix (v3.5.5+):** This is now fixed. Provider validation routes through `runWithProxyContext`, honoring provider-level and global proxy settings automatically.
### Token health check fails with "fetch failed"
**Cause:** Background OAuth token refresh was not resolving proxy configuration per connection.
**Fix (v3.5.5+):** The token health check scheduler now resolves proxy config per connection before attempting refresh. Update to v3.5.5+.
**Cause:** On Node.js 22, the undici@8 dispatcher is incompatible with Node's built-in `fetch()` implementation.
**Fix (v3.5.5+):** OmniRoute now uses undici's own `fetch()` function when a proxy dispatcher is active, ensuring consistent behavior. Update to v3.5.5+.
- **Thinking tags not appearing** — Check if the target provider supports thinking and the thinking budget setting
- **Tool calls dropping** — Some format translations may strip unsupported fields; verify in Playground mode
- **System prompt missing** — Claude and Gemini handle system prompts differently; check translation output
- **SDK returns raw string instead of object** — Fixed in v1.1.0: response sanitizer now strips non-standard fields (`x_groq`, `usage_breakdown`, etc.) that cause OpenAI SDK Pydantic validation failures
- **GLM/ERNIE rejects `system` role** — Fixed in v1.1.0: role normalizer automatically merges system messages into user messages for incompatible models
- **`developer` role not recognized** — Fixed in v1.1.0: automatically converted to `system` for non-OpenAI providers
- **`json_schema` not working with Gemini** — Fixed in v1.1.0: `response_format` is now converted to Gemini's `responseMimeType` + `responseSchema`
---
## Resilience Settings
### Auto rate-limit not triggering
- Auto rate-limit only applies to API key providers (not OAuth/subscription)
- Check if the provider returns `429` status codes or `Retry-After` headers
### Tuning exponential backoff
Provider profiles support these settings:
- **Base delay** — Initial wait time after first failure (default: 1s)
- **Max delay** — Maximum wait time cap (default: 30s)
- **Multiplier** — How much to increase delay per consecutive failure (default: 2x)
### Anti-thundering herd
When many concurrent requests hit a rate-limited provider, OmniRoute uses mutex + auto rate-limiting to serialize requests and prevent cascading failures. This is automatic for API key providers.
Some OmniRoute users place the gateway in front of RAG or agent stacks. In those setups it is common to see a strange pattern: OmniRoute looks healthy (providers up, routing profiles ok, no rate limit alerts) but the final answer is still wrong.
In practice these incidents usually come from the downstream RAG pipeline, not from the gateway itself.
If you want a shared vocabulary to describe those failures you can use the WFGY ProblemMap, an external MIT license text resource that defines sixteen recurring RAG / LLM failure patterns. At a high level it covers:
- retrieval drift and broken context boundaries
- empty or stale indexes and vector stores
- embedding versus semantic mismatch
- prompt assembly and context window issues
- logic collapse and overconfident answers
- long chain and agent coordination failures
- multi agent memory and role drift
- deployment and bootstrap ordering problems
The idea is simple:
1. When you investigate a bad response, capture:
- user task and request
- route or provider combo in OmniRoute
- any RAG context used downstream (retrieved documents, tool calls, etc)
2. Map the incident to one or two WFGY ProblemMap numbers (`No.1` … `No.16`).
3. Store the number in your own dashboard, runbook, or incident tracker next to the OmniRoute logs.
4. Use the corresponding WFGY page to decide whether you need to change your RAG stack, retriever, or routing strategy.
Full text and concrete recipes live here (MIT license, text only):
This guide covers how to cleanly remove OmniRoute from your system.
---
## Quick Uninstall (v3.6.2+)
OmniRoute provides two built-in scripts for clean removal:
### Keep Your Data
```bash
npm run uninstall
```
This removes the OmniRoute application but **preserves** your database, configurations, API keys, and provider settings in `~/.omniroute/`. Use this if you plan to reinstall later and want to keep your setup.
### Full Removal
```bash
npm run uninstall:full
```
This removes the application **and permanently erases** all data:
- Database (`storage.sqlite`)
- Provider configurations and API keys
- Backup files
- Log files
- All files in the `~/.omniroute/` directory
> ⚠️ **Warning:**`npm run uninstall:full` is irreversible. All your provider connections, combos, API keys, and usage history will be permanently deleted.
---
## Manual Uninstall
### NPM Global Install
```bash
# Remove the global package
npm uninstall -g omniroute
# (Optional) Remove data directory
rm -rf ~/.omniroute
```
### pnpm Global Install
```bash
pnpm uninstall -g omniroute
rm -rf ~/.omniroute
```
### Docker
```bash
# Stop and remove the container
docker stop omniroute
docker rm omniroute
# Remove the volume (deletes all data)
docker volume rm omniroute-data
# (Optional) Remove the image
docker rmi diegosouzapw/omniroute:latest
```
### Docker Compose
```bash
# Stop and remove containers
docker compose down
# Also remove volumes (deletes all data)
docker compose down -v
```
### Electron Desktop App
**Windows:**
- Open `Settings → Apps → OmniRoute → Uninstall`
- Or run the NSIS uninstaller from the install directory
**macOS:**
- Drag `OmniRoute.app` from `/Applications` to Trash
Dashboard → Connect Kiro → AWS Builder ID or Google/GitHub → Unlimited
Models: kr/claude-sonnet-4.5, kr/claude-haiku-4.5
```
---
## 🎨 Combos
You can reorder combo cards directly in **Dashboard → Combos** by dragging the handle on each card. The order is stored in SQLite and restored on reload.
### Example 1: Maximize Subscription → Cheap Backup
| `npm run uninstall` | Removes the system app but **keeps your DB and configurations** in `~/.omniroute`. |
| `npm run uninstall:full` | Removes the app AND permanently **erases all configurations, keys, and databases**. |
> Note: To run these commands, navigate to the OmniRoute project folder (if you cloned it) and run them. Alternatively, if globally installed, you can simply run `npm uninstall -g omniroute`.
For host-integrated mode with CLI binaries, see the Docker section in the main docs.
### Void Linux (xbps-src)
Void Linux users can package and install OmniRoute natively using the `xbps-src` cross-compilation framework. This automates the Node.js standalone build along with the required `better-sqlite3` native bindings.
<details>
<summary><b>View xbps-src template</b></summary>
```bash
# Template file for 'omniroute'
pkgname=omniroute
version=3.2.4
revision=1
hostmakedepends="nodejs python3 make"
depends="openssl"
short_desc="Universal AI gateway with smart routing for multiple LLM providers"
Or use Dashboard: **Providers → [Provider] → Custom Models**.
Notes:
- OpenRouter and OpenAI/Anthropic-compatible providers are managed from **Available Models** only. Manual add, import, and auto-sync all land in the same available-model list, so there is no separate Custom Models section for those providers.
- The **Custom Models** section is intended for providers that do not expose managed available-model imports.
### Dedicated Provider Routes
Route requests directly to a specific provider with model validation:
```bash
POST http://localhost:20128/v1/providers/openai/chat/completions
POST http://localhost:20128/v1/providers/openai/embeddings
POST http://localhost:20128/v1/providers/fireworks/images/generations
```
The provider prefix is auto-added if missing. Mismatched models return `400`.
### Network Proxy Configuration
```bash
# Set global proxy
curl -X PUT http://localhost:20128/api/settings/proxy \
2. **Editable Rate Limits** — System-level defaults configurable in the dashboard:
- **Requests Per Minute (RPM)** — Maximum requests per minute per account
- **Min Time Between Requests** — Minimum gap in milliseconds between requests
- **Max Concurrent Requests** — Maximum simultaneous requests per account
- Click **Edit** to modify, then **Save** or **Cancel**. Values persist via the resilience API.
3. **Circuit Breaker** — Tracks failures per provider and automatically opens the circuit when the configured threshold is reached:
- **CLOSED** (Healthy) — Requests flow normally
- **OPEN** — Provider is temporarily blocked after repeated failures
- **HALF_OPEN** — Testing if provider has recovered
The same provider profile also drives model-scoped lockouts:
- Account/model lockouts react immediately to authoritative `429` / `404` signals and use the configured cooldown + backoff values
- Global provider/model quarantine only activates after repeated exhaustion hits the configured **CB Threshold** within **CB Reset Time**
4. **Policies & Locked Identifiers** — Shows circuit breaker status and locked identifiers with force-unlock capability.
5. **Rate Limit Auto-Detection** — Monitors `429` and `Retry-After` headers to proactively avoid hitting provider rate limits. When an upstream provider returns an explicit wait window, that authoritative `Retry-After` value overrides the base cooldown from the provider profile.
**Pro Tip:** Use **Reset All** button to clear all circuit breakers and cooldowns when a provider recovers from an outage.
---
### Database Export / Import
Manage database backups in **Dashboard → Settings → System & Storage**.
| **Export Database** | Downloads the current SQLite database as a `.sqlite` file |
| **Export All (.tar.gz)** | Downloads a full backup archive including: database, settings, combos, provider connections (no credentials), API key metadata |
| **Import Database** | Upload a `.sqlite` file to replace the current database. A pre-import backup is automatically created unless `DISABLE_SQLITE_AUTO_BACKUP=true` |
**Cost Tracking:** Every request logs token usage and calculates cost using the pricing table. View breakdowns in **Dashboard → Usage** by provider, model, and API key.
---
### Audio Transcription
OmniRoute supports audio transcription via the OpenAI-compatible endpoint:
```bash
POST /v1/audio/transcriptions
Authorization: Bearer your-api-key
Content-Type: multipart/form-data
# Example with curl
curl -X POST http://localhost:20128/v1/audio/transcriptions \
-H "Authorization: Bearer your-api-key" \
-F "file=@audio.mp3" \
-F "model=deepgram/nova-3"
```
Available providers: **Deepgram** (`deepgram/`), **AssemblyAI** (`assemblyai/`).
Este guia documenta o padrão ouro de infraestrutura de rede para proteger o **OmniRoute** e expor sua aplicação de forma segura para a internet, **sem abrir nenhuma porta (Zero Inbound)**.
## O que foi feito na sua VM?
Nós ativamos o OmniRoute em modo **Split-Port** através do PM2:
- **Porta \`20128\`:** Roda **apenas a API**`/v1`.
- **Porta \`20129\`:** Roda **apenas o Dashboard** Administrativo visual.
Além disso, o serviço interno exige \`REQUIRE_API_KEY=true\`, o que significa que nenhum agente pode consumir os endpoints da API sem enviar um "Bearer Token" legítimo gerado na aba API Keys do Painel.
Isso nos permite criar duas regras completamente independentes na rede. É aqui que entra o **Cloudflare Tunnel (cloudflared)**.
---
## 1. Como Criar o Túnel na Cloudflare
O utilitário \`cloudflared\` já está instalado na sua máquina. Siga os passos na nuvem:
1. Acesse seu painel **Cloudflare Zero Trust** (One.dash.cloudflare.com).
2. No menu à esquerda, vá em **Networks > Tunnels**.
3. Clique em **Add a Tunnel**, escolha **Cloudflared** e dê o nome \`OmniRoute-VM\`.
4. Ele vai gerar um comando na tela chamado "Install and run a connector". **Você só precisa copiar o Token (a string longa após `--token`)**.
5. Logue via SSH na sua máquina virtual (ou Terminal do Proxmox) e execute:
\`\`\`bash
# Inicia e amarra o túnel permanentemente à sua conta
cloudflared service install SEU_TOKEN_GIGANTE_AQUI
\`\`\`
---
## 2. Configurando o Roteamento (Public Hostnames)
Ainda na tela do Tunnel recém-criado, vá para a aba **Public Hostnames** e adicione as **duas** rotas, aproveitando a separação que fizemos:
### Rota 1: API Segura (Limitada)
- **Subdomain:** \`api\`
- **Domain:** \`seuglobal.com.br\` (escolha seu domínio real)
- **Service Type:** \`HTTP\`
- **URL:** \`127.0.0.1:20128\` _(Porta interna da API)_
### Rota 2: Painel Zero Trust (Fechado)
- **Subdomain:** \`omniroute\` ou \`painel\`
- **Domain:** \`seuglobal.com.br\`
- **Service Type:** \`HTTP\`
- **URL:** \`127.0.0.1:20129\` _(Porta interna do App/Visual)_
Neste momento, a conectividade "Física" está resolvida. Agora vamos blindar de verdade.
---
## 3. Blindando o Painel com Zero Trust (Access)
Nenhuma senha local protege melhor o seu painel do que remover totalmente o acesso a ele da internet aberta.
1. No painel Zero Trust, vá em **Access > Applications > Add an application**.
2. Selecione **Self-hosted**.
3. Em **Application name**, coloque \`Painel OmniRoute\`.
4. Em **Application domain**, coloque \`omniroute.seuglobal.com.br\` (O mesmo que você fez na "Rota 2").
5. Clique em **Next**.
6. Em **Rule action**, escolha \`Allow\`. Em nome da Rule coloque \`Admin Apenas\`.
7. Em **Include**, no seletor de "Selector" escolha \`Emails\` e digite o seu email, por exemplo \`admin@spgeo.com.br\`.
8. Salve (`Add application`).
> **O que isso fez:** Se você tentar abrir \`omniroute.seuglobal.com.br\`, não cai mais na sua aplicação OmniRoute! Cai numa tela elegante da Cloudflare pedindo para digitar seu email. Somente se você (ou o email que você botou) for digitado lá, ele recebe no Outlook/Gmail um código de 6 dígitos temporário que libera o túnel até a porta \`20129\`.
---
## 4. Limitando e Protegendo a API com Rate Limit (WAF)
O Dashboard do Zero Trust não se aplica à rota da API (\`api.seuglobal.com.br\`), porque é um acesso programático via ferramentas automatizadas (agentes) sem navegador. Para ele, usaremos o Firewall principal (WAF) da Cloudflare.
1. Acesse o **Painel Normal** da Cloudflare (dash.cloudflare.com) e entre no seu Domínio.
2. No menu esquerdo, vá em **Security > WAF > Rate limiting rules**.
3. Clique em **Create rule**.
4. **Name:** \`Anti-Abuso OmniRoute API\`
5. **If incoming requests match...**
- Escolha em Field: \`Hostname\`
- Operator: \`equals\`
- Value: \`api.seuglobal.com.br\`
6. Em **With the same characteristics:** Mantenha \`IP\`.
7. Nos limites (Limit):
- **When requests exceed:** \`50\`
- **Period:** \`1 minute\`
8. No final, em **Action**: \`Block\` (Bloquear) e decida se o bloqueio dura por 1 minuto ou 1 hora.
9. **Deploy**.
> **O que isso fez:** Ninguém pode mandar mais de 50 requisições num período de 60 segundos na sua URL de API. Como você roda vários agentes e os consumos por trás já batem rate limit e já rastreiam tokens, isso é apenas uma medida na Borda da Internet (Edge Layer) que protege sua Instância On-Premises de cair por estresse térmico antes mesmo do tráfego descer pelo túnel.
---
## Finalização
1. A sua VM **não possui nenhuma porta exposta** em `/etc/ufw`.
2. O OmniRoute só conversa HTTPS saindo (\`cloudflared\`) e não recebendo TCP direto do mundo.
3. Seus requets pro OpenAI são ofuscados porque configuramos eles globalmente pra passar em um Proxy SOCKS5 (A nuvem não liga pro SOCKS5 porque ela vem Inbound).
4. Seu painel web tem 2-Factor com Email.
5. Sua API está ratelimitada na borda pela Cloudflare e só trafega Bearer Tokens.
> OmniRoute is a free, open-source AI Gateway that acts as a universal API proxy for multi-provider LLMs. It provides smart routing, automatic fallback, load balancing, and format translation across 60+ AI providers — all through a single OpenAI-compatible endpoint. Includes a built-in MCP Server (25 tools), A2A v0.3 protocol, Memory/Skills systems, and an Electron desktop app.
## نظرة عامة
OmniRoute solves the problem of managing multiple AI provider subscriptions, quotas, and rate limits. It sits between your AI-powered tools (IDE agents, CLI tools) and AI providers, routing requests intelligently through a 4-tier fallback system: Subscription → API Key → Cheap → Free.
**Key value:** One endpoint (`http://localhost:20128/v1`), unlimited models, zero downtime, minimal cost.
**Custom Providers:** OpenAI-compatible (`openai-compatible-*`) and Anthropic-compatible (`anthropic-compatible-*`) with custom base URLs
### Internationalization
- 30 languages for UI (all dashboard pages)
- 30 translated documentation sets in docs/i18n/
- Language switcher in documentation
## Key Architectural Decisions
1. **OpenAI-compatible API surface:** All incoming requests follow the OpenAI API format. This makes OmniRoute a drop-in replacement for any tool that supports custom OpenAI endpoints.
2. **Provider abstraction via format translators:** Each AI provider has a translator in `open-sse/translator/` that converts between OpenAI format and the provider's native format transparently.
3. **Connection-based provider model:** Providers are stored as "connections" in SQLite. Each connection has an `id`, `provider`, `authType` (oauth/apikey/free), `isActive` flag, and credentials. Multiple connections per provider for multi-account rotation.
4. **Combo system for fallback:** Users create "combos" — ordered lists of `provider/model` pairs. The proxy tries each in order until one succeeds. Supports 13 strategies including auto-combo with self-healing and context-relay for session continuity.
5. **SSE proxy pipeline:** The proxy pipeline is middleware-based: request → auth resolution → rate limiting → circuit breaker → format translation → upstream call → response translation → SSE streaming back to client.
6. **SQLite for persistence:** All state (providers, combos, logs, settings, API keys, memory, skills) stored in a single SQLite database via 21 domain-specific modules. All DB operations go through `src/lib/db/` modules, never raw SQL in routes.
7. **OAuth with PKCE:** OAuth flows use PKCE for security. Token refresh handled by background job (`tokenHealthCheck.ts`).
8. **ProviderIcon component:** Unified icon system using `@lobehub/icons` (130+ SVG) with PNG fallback and generic icon fallback chain. Used on providers, dashboard, and agents pages.
9. **DB architecture:** `localDb.ts` is a re-export layer only — real logic lives in 21 `src/lib/db/` modules with 16 SQL migrations.
10. **Upstream headers:** Custom headers merged in executors after default auth; same header name replaces executor value. Forbidden header names in `src/shared/constants/upstreamHeaders.ts`.
11. **Memory/Skills cross-cutting systems:** Memory and Skills affect the MCP tools, request pipeline, and A2A skills. Memory provides persistent context across sessions; Skills provide extensible tool execution with sandbox isolation.
12. **Domain policy engine:** `src/domain/` contains policy engine modules (policyEngine, comboResolver, costRules, degradation, fallbackPolicy, lockoutPolicy, modelAvailability, providerExpiration, quotaCache, configAudit) that govern routing decisions independently from the pipeline.
13. **Provider constants validated at load:** All provider definitions validated via Zod schemas at module load time (`src/shared/validation/providerSchema.ts`). Invalid providers fail fast.
## Main Flows
### Proxy Request Flow
1. Client sends OpenAI-format request to `/v1/chat/completions`
2. API key validation
3. Model resolution: direct model or combo lookup
4. For combos: iterate through models with selected strategy
5. Auth resolution: get credentials for the target provider
6. Format translation: OpenAI → provider native format
7. CLI fingerprint matching (if enabled for provider)
8. Upstream request with circuit breaker and rate limiting
9. Response translation: provider → OpenAI format
10. omniModel tag sanitization (strip internal tags)
4. Tokens stored as a provider connection in SQLite
5. Background job refreshes tokens before expiry
## Important Notes for LLMs
1. **Two model endpoints exist:** `/api/models` (dashboard, all models) and `/v1/models` (OpenAI-compatible, active only).
2. **Provider IDs vs aliases:** Providers have both an ID (`claude`, `github`) and a short alias (`cc`, `gh`). Models are referenced as `alias/model-name` (e.g., `cc/claude-opus-4-6`).
3. **The `open-sse/` directory is a separate npm workspace** with its own config, handlers, executors, translators, and services.
4. **Environment variables:** All configuration is in `.env` (from `.env.example`). Key vars: `PORT`, `NEXT_PUBLIC_BASE_URL`, `API_KEY`, `ADMIN_PASSWORD`.
5. **Database layer:** Operations go through `src/lib/db/` modules (21 domain-specific files). `localDb.ts` is re-exports only — add new functions to the proper `db/*.ts` module.
6. **Tests use Node.js built-in test runner:** 122 unit test files. Run `npm test`. Vitest for MCP/autoCombo (`npm run test:vitest`). Playwright for E2E (`npm run test:e2e`).
7. **MCP and A2A pages are embedded as tabs inside `/dashboard/endpoint`**, not standalone routes.
8. **ACP agents** are in `src/lib/acp/registry.ts` with detection cache. Custom agents stored via settings DB.
10. **Docker:** Dockerfile has two targets: `runner-base` and `runner-cli`. `docker-compose.yml` for dev (3 profiles), `docker-compose.prod.yml` for production (port 20130).
11. **Electron desktop app** in `electron/` with main.js and preload.js. Build with `npm run electron:build` (supports Windows, macOS, Linux).
12. **Pricing data** syncs from LiteLLM via `src/lib/pricingSync.ts`. Use `sync_pricing` MCP tool or API endpoint.
13. **Memory system** in `src/lib/memory/` provides extraction, injection, retrieval, summarization, and persistent store. Exposed via MCP memory tools and `/api/memory/ API.
14. **Skills system** in `src/lib/skills/` provides registry, executor, sandbox isolation, built-in skills, custom skill support, request interception, and context injection. Exposed via MCP skill tools and `/api/skills/` API.
15. **Zod v4** is used for all validation. Import from `zod` package. Provider schemas validated at module load time.
16. **Context Relay** strategy (`context-relay`) is split across two layers: `combo.ts` decides if a handoff should be generated after a successful turn; `chat.ts` injects the handoff only after account resolution. Handoff data lives in `context_handoffs` SQLite table. Config: `handoffThreshold`, `handoffModel`, `handoffProviders`.
17. **Proxy enforcement** is now comprehensive: token health checks resolve proxy per connection, provider validation wraps in `runWithProxyContext`, and proxy dispatchers use `undici.fetch()` instead of the Node built-in `fetch()` to avoid dispatcher incompatibilities on Node 22.
18. **Node.js 24+ compatibility**: The login page (`/api/settings/require-login`) detects the Node.js version and sends `nodeVersion`/`nodeCompatible` fields. The login UI renders a warning banner when `nodeCompatible` is false.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.