Escape backslashes before injecting combo stream tags so tagged SSE
payloads remain valid JSON, and call the provider models handler
directly during sync to avoid internal fetch SSRF warnings.
Also restore crypto UUID generation for Gemini request translation,
tighten dashboard error sanitization, and update tests to use stricter
URL matching and UUID-based fixture keys.
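A minimal sketch of the backslash-escaping fix, assuming the tag is spliced into an already-serialized JSON string. `escapeForJsonString`, `injectTag`, and the `content` field are hypothetical names for illustration, not the actual combo stream code:

```typescript
// Hypothetical helper: make a raw tag safe to splice into a JSON string literal.
function escapeForJsonString(raw: string): string {
  // Escape backslashes first so the escapes added for quotes are not doubled.
  return raw.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Hypothetical injection: prepend the tag to the "content" string of an
// already-serialized SSE data payload without re-parsing it.
function injectTag(serialized: string, tag: string): string {
  const marker = '"content":"';
  const start = serialized.indexOf(marker);
  if (start === -1) return serialized; // nothing to tag, leave payload untouched
  const insertAt = start + marker.length;
  return (
    serialized.slice(0, insertAt) +
    escapeForJsonString(tag) +
    serialized.slice(insertAt)
  );
}
```

Control characters such as newlines are not handled here; a fuller version would escape those as well.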
Update the direct Next.js dependency to a patched release in response
to the reported audit findings.
Switch the provider diversity test to Vitest's expect API for
consistent test runner usage and add the audit report snapshot for
release verification.
Allow chat core callers to disable the emergency fallback path during
routed retries and expose proxy cache reset helpers for deterministic
state handling.
Add regression coverage for chat routing edge cases, combo strategies,
stream utilities, cursor SSE termination, and JSON-to-SQLite db
migration behavior.
Only use provider apiRegion values when they are strings before resolving
the GLM quota endpoint, preventing invalid metadata from affecting usage
requests.
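A sketch of the string guard, assuming the region is read from free-form provider metadata. The endpoint URLs and region names below are placeholders; the real GLM quota endpoints are not given in this log:

```typescript
// Placeholder endpoints (assumption): keyed by region name.
const QUOTA_ENDPOINTS = new Map<string, string>([
  ["default", "https://quota.example.invalid/default"],
  ["cn", "https://quota.example.invalid/cn"],
]);
const DEFAULT_ENDPOINT = "https://quota.example.invalid/default";

function resolveQuotaEndpoint(
  providerSpecificData: Record<string, unknown> | undefined
): string {
  const raw = providerSpecificData?.apiRegion;
  // Only trust apiRegion when it is actually a string; numbers, objects,
  // and null fall back to the default region.
  const region = typeof raw === "string" ? raw : "default";
  return QUOTA_ENDPOINTS.get(region) ?? DEFAULT_ENDPOINT;
}
```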
Run unit tests with single-test concurrency to avoid shared-state flakes
and expand coverage for auth-protected routes, provider node validation,
proxy and stream handling, model sync, token refresh, and protobuf
parsing.
Keep the original combo and budget exhaustion errors when global or
emergency fallbacks also fail so callers see the real upstream cause.
Also preserve translated responses for memory extraction before output
post-processing, track pending rate-limit async work for deterministic
test resets, and expose usage helpers needed for deeper branch
coverage.
Expand unit coverage across moderation, media generation, streaming,
response logging, usage helpers, and fallback proxy error handling.
Move chat pipeline validation, circuit breaker execution, proxy
resolution, logging, and session header handling into dedicated
helpers to keep the SSE handler smaller and easier to verify.
Also fix shared API option precedence, rebuild skill version caches
after deletions, ignore api.trycloudflare.com false positives, and
add rate-limit manager test flush/reset hooks for deterministic
coverage.
Expand integration and unit coverage across chat routing, auth,
cloud sync, skills, executors, streaming, DB helpers, proxy handling,
and provider/model utilities.
- Removed the expensive (40s+) `npm run test:unit` step from the `pre-commit` hook
- Created `.husky/pre-push` to run the unit test suite before pushing rather than per commit
- This prevents spurious async teardown errors from local test runners from blocking fast commits
- Replaced an explicit `any` cast with `Record<string, unknown> | undefined` in `chatCore.ts` to pass the `check:any-budget:t11` strict checker which enforces a budget of 0
* feat(qoder): native cosy integration
* feat(qoder): implement native COSY encryption algorithm and remove CLI child instances, plus workflow bumps
* feat(resilience): context overflow fallback, OAuth token detection, empty content guard & context-optimized combo strategy
- Add isContextOverflowError + isContextOverflow detectors (400 + token-limit signals)
- Auto-fallback to next family model on context overflow in chatCore
- Add isEmptyContentResponse to catch fake-success empty responses, trigger fallback + recursive retry
- Add OAUTH_INVALID_TOKEN error type (T11) with isOAuthInvalidToken signal matching; warn instead of deactivating node
- Add getModelContextLimit helper in modelsDevSync (reads limit_context from synced capabilities)
- Upgrade getTokenLimit in contextManager to check models.dev DB before registry (fixes gemini-2.5-pro: 1000000→1048576)
- Add findLargerContextModel in modelFamilyFallback for context-aware model selection
- Add sortModelsByContextSize + context-optimized combo strategy in combo.ts
- Update context-manager unit test for corrected gemini-2.5-pro limit
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(review): address Gemini code review — tool_calls path, infinite recursion, dedup signals, findLargerContextModel
- Fix isEmptyContentResponse: check message.tool_calls/delta.tool_calls instead
  of firstChoice.tool_calls (wrong OpenAI API path, caused tool-call responses
  to be falsely flagged as empty)
- Fix empty content fallback: replace recursive handleChatCore call (infinite
  recursion risk + wrong model due to original body.model) with a non-recursive
  pattern: call executeProviderRequest, parse the fallback response body,
  reassign responseBody, and fall through to existing processing
- Fix context overflow: use findLargerContextModel over family candidates first,
  then fall back to getNextFamilyFallback, so that we pick a model with an
  actually larger context window on overflow
- Fix signal dedup: export CONTEXT_OVERFLOW_SIGNALS + CONTEXT_OVERFLOW_REGEX
  from errorClassifier.ts; import shared regex in modelFamilyFallback.ts,
  removing duplicate signal list and per-call RegExp construction
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(UI): add context-optimized strategy to frontend schema and options
* fix(sse): preserve Responses API events in stream translation
When translating Claude-format responses (e.g. GLM) to Responses API
format for Codex CLI, the sanitizer stripped {event, data} structured
items to {"object":"chat.completion.chunk"}, losing all content and
the critical response.completed event.
Only run sanitizeStreamingChunk on OpenAI Chat Completions chunks,
skipping items that have the Responses API {event, data} structure.
* test(sse): add regression test for Claude→Responses stream sanitization
Verifies that {event,data} structured items from the Responses API
translator bypass sanitizeStreamingChunk when translating Claude-format
providers (e.g. GLM) to Responses API format for Codex CLI.
* fix(sse): strengthen Responses API event detection with response. prefix check
Use explicit `response.` prefix check instead of generic `event && data`
presence check, as recommended in PR review.
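A sketch of the prefix-based detection, assuming stream items are plain objects. `sanitizeChatChunk` is a stand-in for sanitizeStreamingChunk and only mimics the stripping behavior described above:

```typescript
type StreamItem = Record<string, unknown>;

// Responses API items carry an event name such as "response.completed".
function isResponsesApiEvent(item: StreamItem): boolean {
  const event = item.event;
  return typeof event === "string" && event.startsWith("response.");
}

// Stand-in sanitizer (assumption): keeps only Chat Completions chunk fields.
function sanitizeChatChunk(item: StreamItem): StreamItem {
  return {
    object: "chat.completion.chunk",
    choices: Array.isArray(item.choices) ? item.choices : [],
  };
}

// Responses API events bypass the sanitizer; everything else is sanitized.
function sanitizeForClient(item: StreamItem): StreamItem {
  return isResponsesApiEvent(item) ? item : sanitizeChatChunk(item);
}
```

With the earlier generic `event && data` check, an unrelated item that merely had both fields would also have bypassed sanitization; the prefix check narrows the bypass to Responses API events.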
* fix: pin Next.js to 16.0.10 to prevent Turbopack hashed module bug
Remove ^ prefix from next and eslint-config-next to prevent
automatic upgrades to 16.1.x+ which introduced content-based
hashing for external module references in Turbopack.
Also remove duplicate Material Symbols @import from globals.css
(font already loaded via <link> in layout.tsx).
Fixes #509
* align cc-compatible cache handling with client passthrough
* chore: integrate resilience and turbopack fixes (PRs #992, #990, #987)
* chore(release): bump to v3.5.2 — changelog, docs, version sync
* docs(i18n): sync documentation updates to 33 languages
* fix(qoder): replace any with unknown to comply with strict any-budget
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Chris Staley <christopher-s@users.noreply.github.com>
Co-authored-by: Ivan <shanin-i2011@yandex.ru>
Co-authored-by: R.D. <rogerproself@gmail.com>
- CRITICAL: Fetch old settings BEFORE update in PATCH handler to correctly
compare wasEnabled vs isEnabled for sync lifecycle management
- CRITICAL: Handle modelsDevSyncInterval changes (restart periodic sync with
new interval when it changes)
- MEDIUM: Add error logging and user feedback to useEffect catch
- MEDIUM: Add revert logic to updateInterval on API failure
Fixes all 3 review comments from Gemini Code Assist on PR #983
- Add models.dev sync engine (src/lib/modelsDevSync.ts) for pricing, capabilities, and model specs
- Fetch from https://models.dev/api.json (109 providers, 4,146+ models, MIT licensed)
- 4-layer pricing resolution: User Override > models.dev > LiteLLM > Hardcoded Default
- New model_capabilities DB table for synced capability data
- UI toggle in Settings > AI tab: enable/disable sync, configure interval (1h-7d), manual sync trigger
- Live stats dashboard showing provider/model/capability counts
- New API route /api/settings/models-dev for sync status and manual triggers
- Fix 39 missing i18n keys across all 30 languages (Memory & Skills tab fully translated)
- 25 unit + integration tests, 1,439 existing tests pass, lint clean, typecheck clean
Closes #979
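The 4-layer pricing resolution listed above can be sketched as a first-match lookup. The `Price` shape and field names are assumptions for illustration, not the actual modelsDevSync types:

```typescript
type Price = { inputPerMillion: number; outputPerMillion: number };
type PriceSource = "user-override" | "models.dev" | "litellm" | "default";
type PriceLayers = { userOverride?: Price; modelsDev?: Price; liteLLM?: Price };

// Assumed fallback value; the real hardcoded default is not given in this log.
const HARDCODED_DEFAULT: Price = { inputPerMillion: 0, outputPerMillion: 0 };

// Priority: User Override > models.dev > LiteLLM > Hardcoded Default.
function resolvePrice(layers: PriceLayers): { price: Price; source: PriceSource } {
  if (layers.userOverride) return { price: layers.userOverride, source: "user-override" };
  if (layers.modelsDev) return { price: layers.modelsDev, source: "models.dev" };
  if (layers.liteLLM) return { price: layers.liteLLM, source: "litellm" };
  return { price: HARDCODED_DEFAULT, source: "default" };
}
```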
Switch validateGeminiLikeProvider from query-param auth (?key=) to
x-goog-api-key header auth, matching the actual request pipeline.
Parse Google error response bodies to distinguish auth failures
(API_KEY_INVALID, API_KEY_EXPIRED, PERMISSION_DENIED) from other
400 errors. Google returns 400 (not 401/403) for invalid keys.
Add 5 new test cases covering 400/401 rejection paths and success.
Fixes #976
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
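A simplified sketch of the 400 classification. The actual change parses the error body; this version only scans the raw text for the three reason strings named above, and `isGeminiAuthFailure` is a hypothetical name:

```typescript
const AUTH_FAILURE_REASONS = ["API_KEY_INVALID", "API_KEY_EXPIRED", "PERMISSION_DENIED"];

// Treat 401/403 as auth failures outright; treat 400 as an auth failure
// only when the body mentions one of the known auth reasons.
function isGeminiAuthFailure(status: number, bodyText: string): boolean {
  if (status === 401 || status === 403) return true;
  if (status !== 400) return false;
  return AUTH_FAILURE_REASONS.some((reason) => bodyText.includes(reason));
}
```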
The hardcoded BUFFER_TOKENS = 2000 constant inflates prompt_tokens and
input_tokens in every API response, which is helpful for CLI tools that
rely on reported usage to manage context windows but misleading for SDK
users, cost dashboards, and any integration comparing token counts
across providers.
This change makes the buffer configurable via three layered sources
(in priority order):
1. Environment variable: USAGE_TOKEN_BUFFER=0
2. Settings API / Dashboard: PATCH /api/settings { usageTokenBuffer: 0 }
3. Default: 2000 (preserves existing behavior)
Setting the value to 0 disables the buffer entirely, causing OmniRoute
to return raw provider token counts. The setting is cached in-process
with a 30-second TTL and invalidated immediately when updated through
the settings API.
Changes:
- open-sse/utils/usageTracking.ts: replace hardcoded constant with
getBufferTokens() that reads env / DB settings with TTL cache
- src/shared/validation/settingsSchemas.ts: add usageTokenBuffer field
(int, 0–50000) to the Zod update schema
- src/app/api/settings/route.ts: invalidate buffer cache on update
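A sketch of the layered lookup with a 30-second cache. The `BufferSources` parameter is an assumption made so the example is self-contained (the real getBufferTokens reads env and DB directly), and applying the 0 to 50000 range to the env value is also an assumption:

```typescript
const DEFAULT_BUFFER_TOKENS = 2000;
const CACHE_TTL_MS = 30_000;

type BufferSources = {
  env: Record<string, string | undefined>;
  loadSetting: () => number | undefined; // stand-in for the DB settings read
  now: () => number;
};

let cached: { value: number; expiresAt: number } | null = null;

function parseBuffer(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const n = Number(raw);
  return Number.isInteger(n) && n >= 0 && n <= 50_000 ? n : undefined;
}

function getBufferTokens(src: BufferSources): number {
  // 1. The environment variable wins and is never cached.
  const fromEnv = parseBuffer(src.env.USAGE_TOKEN_BUFFER);
  if (fromEnv !== undefined) return fromEnv;
  // 2. Settings value, cached for 30 seconds; 3. the default of 2000.
  const t = src.now();
  if (cached && cached.expiresAt > t) return cached.value;
  const value = src.loadSetting() ?? DEFAULT_BUFFER_TOKENS;
  cached = { value, expiresAt: t + CACHE_TTL_MS };
  return value;
}

// Called by the settings update path so a new value applies immediately.
function invalidateBufferCache(): void {
  cached = null;
}
```

Using `??` rather than `||` matters here: a stored value of 0 must disable the buffer instead of falling through to the default.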
The locateCommand function returned the bare command name instead of
parsing the where output. On Windows, npm global installs create both
a Unix shell script (no extension) and a .cmd wrapper. where returns
both, and the bare name resolves to the Unix script first, causing
healthcheck failures for OpenClaw and OpenCode.
Fix: parse where output and prefer paths with Windows executable
extensions (.cmd, .exe, .bat, .com).
Related: #935, #863
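A sketch of the `where` output parsing. `pickWindowsExecutable` is a hypothetical name, and falling back to the first candidate when no extension matches is an assumption:

```typescript
const WINDOWS_EXEC_EXTENSIONS = [".cmd", ".exe", ".bat", ".com"];

// `where` prints one matching path per line; prefer a path that Windows
// can execute directly over the extensionless Unix shell script.
function pickWindowsExecutable(whereOutput: string): string | null {
  const candidates = whereOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const preferred = candidates.find((path) =>
    WINDOWS_EXEC_EXTENSIONS.some((ext) => path.toLowerCase().endsWith(ext))
  );
  return preferred ?? candidates[0] ?? null;
}
```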
SQLite background asynchronous backups generate native thread promises
that Node 22's test runner does not implicitly terminate on suite
completion when the DB connection is closed, which causes the CI test
job to hang indefinitely. Added a cross-env DISABLE_SQLITE_AUTO_BACKUP
flag to the test suite.
- Replaces loose string includes check in dnsConfig with strict bound RegExp to silence URL matching heuristic (SSRF).
- Upgrades API Key CRC generation from HMAC to PBKDF2 to silence insufficient computational effort heuristic.
- Add proxy support to all OAuth flows (authorization, token exchange, import)
- Add proxy support to token refresh operations for all providers
- Add proxy support to model synchronization
- Initialize global fetch proxy patch at server startup
- Use Proxy Registry with priority: Provider Proxy → Global Proxy → Direct
- Fix Global Proxy display in settings UI to show proxy from Proxy Registry
Changes:
- open-sse/services/tokenRefresh.ts: Add proxyConfig parameter to all refresh functions
- src/sse/services/tokenRefresh.ts: Resolve proxy before calling refresh functions
- src/app/api/oauth/*/route.ts: Use resolveProxyForProvider for OAuth flows
- src/app/api/providers/[id]/models/route.ts: Add proxy support for model sync
- src/instrumentation-node.ts: Initialize proxy patch at startup
- src/app/api/settings/proxy/route.ts: Read Global Proxy from Proxy Registry
- src/lib/db/proxies.ts: Export resolveProxyForProvider
- src/lib/localDb.ts: Re-export resolveProxyForProvider
- src/models/index.ts: Re-export resolveProxyForProvider
15 files changed, 405 insertions(+), 240 deletions(-)
Co-authored-by: growab <growab@users.noreply.github.com>
The tool was fully defined in schemas/tools.ts and backed by the
working /v1/search endpoint, but server.registerTool() was never
called, leaving it absent from tools/list.
Changes:
- server.ts: add webSearchInput import, handleWebSearch handler, and
  server.registerTool("omniroute_web_search") after sync_pricing block
- schemas/tools.ts: align webSearchInput with /v1/search contract --
  query max reduced 1000->500, provider narrowed to explicit enum
- essentialTools.test.ts: rewrite web_search stubs to dispatch through
  a real McpServer+InMemoryTransport+Client, providing actual handler
  coverage (POST method, body fields, error paths, tools/list check)
- vitest.mcp.config.ts: dedicated vitest config for MCP server tests;
  update test:vitest script to use it
Note: omniRouteFetch hardcodes AbortSignal.timeout(10000) unconditionally
(server.ts line 126), so caller signals are silently discarded -- the
effective search timeout is 10s. Follow-up PR can add timeoutMs param.
cacheStatsTool and cacheFlushTool are also unregistered; out of scope.
🤖 Generated with Claude Sonnet 4.6 via Claude Code (https://claude.com/claude-code) + Compound Engineering v2.58.1
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Gemini AI Studio no longer has a hardcoded model registry — models come
from API sync. Updated tests:
- T28: assert gemini registry is empty (API sync), check gemini-cli instead
- T31: check antigravity static catalog for pro-high/pro-low model IDs
The comment claimed "leave no text content" but any text chunks streamed
before the SAFETY finish reason have already been emitted. Updated to
accurately describe the behavior.
Stability fixes for the Gemini provider — no new features:
- Map SAFETY/RECITATION/BLOCKLIST finish reasons (gemini-to-openai → content_filter)
- Add 15s per-page timeout on models sync pagination to prevent indefinite hangs
- Fix mime_type → mimeType in inlineData to match Gemini v1beta API (camelCase)
- Fix include_thoughts → includeThoughts to match Gemini API (camelCase)
- Add inlineData/mime_type fallback in gemini-to-openai request translator
Google AI Studio defaults to 50 models per page. Set pageSize=1000
to maximize per-page results and follow nextPageToken to fetch all
available models across pages. Also fixes query param separator when
base URL already contains query string.
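A sketch of the pagination and separator handling, assuming the page fetcher is injected. `buildModelsPageUrl`, `fetchAllModels`, and the `ModelsPage` shape are illustrative names; the per-page timeout from the stability fixes above is omitted:

```typescript
type ModelsPage = { models?: { name: string }[]; nextPageToken?: string };

// Use "&" when the base URL already carries a query string, "?" otherwise.
function buildModelsPageUrl(baseUrl: string, pageToken?: string): string {
  const separator = baseUrl.includes("?") ? "&" : "?";
  let url = `${baseUrl}${separator}pageSize=1000`;
  if (pageToken) url += `&pageToken=${encodeURIComponent(pageToken)}`;
  return url;
}

// Follow nextPageToken until the API stops returning one.
async function fetchAllModels(
  baseUrl: string,
  fetchPage: (url: string) => Promise<ModelsPage>
): Promise<{ name: string }[]> {
  const all: { name: string }[] = [];
  let token: string | undefined;
  do {
    const page = await fetchPage(buildModelsPageUrl(baseUrl, token));
    all.push(...(page.models ?? []));
    token = page.nextPageToken;
  } while (token);
  return all;
}
```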
Replace inline `provider === "gemini"` checks in chatCore.ts and auth.ts
with shared hasPerModelQuota() and lockModelIfPerModelQuota() from
accountFallback.ts. Also adds max-cooldown preservation to lockModel()
to prevent race conditions from overwriting longer lockouts.
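A sketch of max-cooldown preservation, assuming locks are stored as expiry timestamps in a map. The signatures below are illustrative and do not mirror the real lockModel() in accountFallback.ts:

```typescript
// key -> epoch milliseconds until which the model stays locked
const modelLocks = new Map<string, number>();

// Never shorten an existing lockout: keep whichever expiry is later.
function lockModel(key: string, cooldownMs: number, now: number): number {
  const proposed = now + cooldownMs;
  const existing = modelLocks.get(key) ?? 0;
  const lockedUntil = Math.max(existing, proposed);
  modelLocks.set(key, lockedUntil);
  return lockedUntil;
}

function isModelLocked(key: string, now: number): boolean {
  return (modelLocks.get(key) ?? 0) > now;
}
```

If two 429 handlers race, the one carrying the shorter cooldown can no longer overwrite the longer lockout.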
isModelLocked() was called to lock models on 429 but never checked
when selecting connections. Requests to a locked model would still
route to it, causing unnecessary repeated 429s.
Gemini AI Studio enforces per-model quotas. Previously a 429 on
gemini-2.5-pro would mark the entire connection as credits_exhausted,
blocking all models on that API key.
Three-layer fix:
- chatCore: lock model only (not connection) for RATE_LIMITED and
QUOTA_EXHAUSTED errors from Gemini
- auth: early-return with model-only lockout before terminal status
check, so credits_exhausted is never set on the connection
- rateLimitManager: use model-scoped limiter keys for Gemini so the
Bottleneck queue pauses only the affected model, not the connection
- chat: skip markAccountExhaustedFrom429 for Gemini (per-model quotas)
Gemini AI Studio has per-model quota limits. When one model hits its
quota (429), other models on the same API key may still be available.
- Use lockModel() for 429s on Gemini (model-only lockout, connection
stays active for other models)
- Skip markAccountExhaustedFrom429 for Gemini (prevents deprioritizing
the entire connection in credential selection)
- Apply model-only 404 lockout for Gemini too (deprecated/unavailable
models shouldn't disable the provider)
Match reset() behavior — clear lastFailureTime when recovering from
OPEN state to avoid stale timestamps in persisted state.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Show import progress dialog when adding Gemini API key with model
list, completion status, and Close button
- Remove hardcoded Gemini model registry — models come exclusively
from Google API sync per API key
- Hide "Import from /models" button for Gemini (sync handles it)
- Remove server-side fire-and-forget auto-sync (client handles it)
- Categorize synced Gemini models by endpoint type in catalog:
embedding, image, and audio (transcription + speech)
- Add modelsImported and close i18n keys to all 33 languages
Combo handlers call _onSuccess()/_onFailure() directly instead of
execute(), bypassing the OPEN→HALF_OPEN→CLOSED transition. After
resetTimeout elapses, canExecute() returns true and requests flow
through, but _onSuccess() only reset failureCount without changing
state — leaving the breaker permanently OPEN with 0 failures.
Handle OPEN state in both methods: _onSuccess() now transitions
OPEN→CLOSED, _onFailure() skips redundant re-transition.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
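A minimal sketch of the state handling described above, using a simplified breaker. The class, its public method names, and the injected clock are assumptions; the real breaker has more states of bookkeeping than shown:

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreakerSketch {
  private state: BreakerState = "CLOSED";
  private failureCount = 0;
  private lastFailureTime: number | null = null;
  private readonly threshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(threshold: number, resetTimeoutMs: number, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  canExecute(): boolean {
    if (this.state !== "OPEN") return true;
    return this.lastFailureTime !== null && this.now() - this.lastFailureTime >= this.resetTimeoutMs;
  }

  onSuccess(): void {
    this.failureCount = 0;
    if (this.state !== "CLOSED") {
      // Callers that skip execute() can succeed while still OPEN: close here
      // and clear the stale failure timestamp.
      this.state = "CLOSED";
      this.lastFailureTime = null;
    }
  }

  onFailure(): void {
    this.failureCount += 1;
    this.lastFailureTime = this.now();
    if (this.state === "OPEN") return; // already open, skip re-transition
    if (this.state === "HALF_OPEN" || this.failureCount >= this.threshold) this.state = "OPEN";
  }

  snapshot(): { state: BreakerState; failureCount: number; lastFailureTime: number | null } {
    return { state: this.state, failureCount: this.failureCount, lastFailureTime: this.lastFailureTime };
  }
}
```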
Auto-sync is fire-and-forget on the server, so the dashboard needs to
poll for model updates after saving a Gemini key (3s delay). On delete,
refresh models immediately since DB cleanup is synchronous.
Removed hardcoded registry fallback for Gemini in dashboard, catalog,
and v1beta endpoints. Without synced models (no API keys), Gemini shows
nothing. Hardcoded entries are always removed from v1beta regardless
of sync result.
Store synced models keyed by providerId:connectionId so each API key's
models are tracked separately. On read, union across all connections.
On connection delete, remove only that key's models. Models with no
remaining connections are automatically excluded.
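A sketch of the per-connection storage, assuming an in-memory map in place of the DB. The function names are hypothetical:

```typescript
// "providerId:connectionId" -> model IDs synced for that API key
const syncedModels = new Map<string, string[]>();

function storeSyncedModels(providerId: string, connectionId: string, models: string[]): void {
  syncedModels.set(`${providerId}:${connectionId}`, [...models]);
}

// On read, union the models of every connection of the provider.
function listProviderModels(providerId: string): string[] {
  const union = new Set<string>();
  syncedModels.forEach((models, key) => {
    if (key.startsWith(`${providerId}:`)) models.forEach((m) => union.add(m));
  });
  return Array.from(union).sort();
}

// On delete, drop only that connection's entry; models still provided by
// another connection survive, the rest disappear from the union.
function removeConnectionModels(providerId: string, connectionId: string): void {
  syncedModels.delete(`${providerId}:${connectionId}`);
}
```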
Add 9 new API-key providers for image, video, and media generation:
- Replicate (rep/) — ML model marketplace with passthrough models
- fal.ai (fal/) — fast inference for generative media
- Novita AI (novita/) — image/video generation platform
- Segmind (seg/) — image generation models
- Runware (rw/) — real-time image generation
- WaveSpeed AI (ws/) — fast media generation
- PiAPI (pi/) — multi-model API gateway
- GoAPI (ggo/) — aggregated AI API access
- LaoZhang AI (lz/) — multi-provider proxy
These providers support image/video/audio endpoints natively via
OpenAI-compatible format, solving the gap where only chat-focused
providers were registered.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Align CacheTrends interface with API response (cachedRequests instead of hits/misses/hitRate)
- Update bar chart rendering to use cachedRequests field from getCacheTrend()
- Remove duplicate CacheStatsCard from cache overview (already in settings AI tab)
- Fix tooltip to use i18n translations instead of hardcoded English
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Add "memory" and "skills" entries to HIDEABLE_SIDEBAR_ITEM_IDS and PRIMARY_SIDEBAR_ITEMS
- Add sidebar translation keys for memory and skills pages
- Add complete "memory" and "skills" i18n sections to en.json with all UI labels
- Replace all hardcoded English text in memory/page.tsx with useTranslations() calls
Memory and Skills pages now appear in the sidebar and render with proper i18n support.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Strip reasoning_text/reasoning_content from assistant messages in
GitHub executor to prevent upstream 400 'Invalid signature in
thinking block' errors
- Sync refreshed copilotToken to top-level credentials in
checkAndRefreshToken so buildHeaders() uses the fresh token
instead of the stale one
- Include providerSpecificData in refreshCredentials return value
so onCredentialsRefreshed persists the new token to DB correctly
- Wrap JSON.parse in try/catch in rowToConfig (upstreamProxy.ts)
- Add error logging to empty catch blocks in CliproxyapiToolCard
- Pin Docker image to v6.9.7 instead of :latest
- Validate CLIProxyAPI URL with validateProxyUrl before saving
- Sync cliproxyapi_url and cliproxyapi_fallback_enabled to upstream_proxy_config DB
- Bridge the gap between settings UI and executor data sources
- Add requireManagementAuth to all 6 version-manager API routes
(status, check-update, install, start, stop, restart)
- Parameterize Docker healthcheck port via CLIPROXYAPI_PORT env var
- Improve error message to be actionable when binary is missing
- Add proper TS interfaces to CliproxyapiSettingsTab (replace any/Record)
- Add HTTP status checks and error handling on all fetches
- Add client-side URL validation before saving proxy URL
- Add error state display for version manager service
- Document localhost SSRF exception in isPrivateHost
- Add ALLOWED_COLUMNS whitelist to updateVersionManagerTool
The health check refresh path was silently dropping result.providerSpecificData,
so resourceUrl (returned by Qwen on token refresh) was never persisted to the DB.
Now deep-merges refresh result providerSpecificData into the existing connection data.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Part 1 of CLIProxyAPI integration (#902).
- DB migration for version_manager + upstream_proxy_config tables
- DB modules: versionManager.ts (tool lifecycle CRUD), upstreamProxy.ts (proxy config CRUD + SSRF guard)
- localDb.ts re-exports for new modules
- Provider registration: UPSTREAM_PROXY_PROVIDERS in providers.ts
- Zod validation schemas for version-manager API routes
- Settings UI: CliproxyapiSettingsTab (Advanced tab)
- 47 unit tests for DB modules
Hub-and-spoke flush path was broken for non-OpenAI providers when the
client speaks OpenAI Responses API (e.g. Codex CLI). When
claudeToOpenAIResponse(null) returned null, openaiToOpenAIResponsesResponse
was never called, so response.completed (carrying total_tokens) was never
emitted — causing "missing field total_tokens" errors in Codex.
Two fixes:
- Pass null through to Step 2 translator even when Step 1 produced no
output during flush, so terminal events get emitted
- Capture usage from any chunk carrying it (not just usage-only chunks)
and normalize Chat Completions format to Responses API format
Prevent auto-sync from wiping manually-imported models when the upstream
/models endpoint fails, times out, or returns an empty list. Added
`allowEmpty` option (default false) to replaceCustomModels — callers that
intentionally clear all models (DELETE ?all=true) pass `allowEmpty: true`.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
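A sketch of the `allowEmpty` guard, assuming an in-memory store. The parameter list is illustrative; only the option name and its default come from the description above:

```typescript
type ReplaceOptions = { allowEmpty?: boolean };

const customModels = new Map<string, string[]>();

// Returns true when the stored list was replaced, false when an empty
// list was rejected and the existing models were kept.
function replaceCustomModels(
  providerId: string,
  models: string[],
  options: ReplaceOptions = {}
): boolean {
  if (models.length === 0 && !options.allowEmpty) return false;
  customModels.set(providerId, [...models]);
  return true;
}
```

Auto-sync would call this without options, so a failed or empty upstream response leaves the manual imports intact, while an explicit delete-all passes `{ allowEmpty: true }`.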
- prop-types: required by 12 component files using PropTypes
- js-yaml: required by openapi spec route
These dependencies were missing from package.json causing build failures.
Added missing cliTools.guides.windsurf.steps[1-5] with title and desc:
- step 1: Open AI Settings
- step 2: Add Custom Provider
- step 3: Base URL (http://127.0.0.1:20128/v1)
- step 4: API Key
- step 5: Select Model
Total: 165 keys across all language files (5 steps × 2 keys × 33 languages)
The wrapInCloudCodeEnvelopeForClaude function converts Claude message
blocks to Antigravity/Gemini parts format but only handles text,
tool_use, and tool_result types. Image blocks (type: 'image' with
base64 source) are silently dropped, causing Claude models routed
through Antigravity to never receive image content.
This adds handling for image blocks, converting them to inlineData
format (the same format the Gemini path already uses successfully).
Tested: Claude via Antigravity now correctly receives and describes
image content that was previously invisible.
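A sketch of the image block conversion, covering only text and image blocks. The type names and `claudeBlockToPart` are illustrative; the real wrapInCloudCodeEnvelopeForClaude also handles tool_use and tool_result:

```typescript
type ClaudeImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};
type ClaudeBlock = ClaudeImageBlock | { type: "text"; text: string };
type GeminiPart = { text: string } | { inlineData: { mimeType: string; data: string } };

function claudeBlockToPart(block: ClaudeBlock): GeminiPart {
  if (block.type === "image") {
    // Claude uses snake_case media_type; Gemini parts expect camelCase mimeType.
    return { inlineData: { mimeType: block.source.media_type, data: block.source.data } };
  }
  return { text: block.text };
}
```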
- copilot-usage: use future reset date (2026-12-31) to avoid stale
quota window causing remainingPercentage to reset to 100%
- request-log-migration: close SQLite DB before cleanup to release
Windows file locks; remove stale archive dir before second test
The models gemini-2.5-flash-preview-image-generation and
gemini-3.1-flash-image-preview were surfacing in the model catalog
via getAllImageModels() from imageRegistry.ts, not from the live
upstream API. Removed them from the image provider registry.
- Reduce maxOutputTokens from 131072 to 65535 for gemini-3.1-pro-high
and gemini-3.1-pro-low, fixing 400 "invalid argument" errors from
Open WebUI when no max_tokens is specified (upstream limit is 65535)
- Filter non-viable models (gemini-3.1-flash-image-preview,
gemini-2.5-flash-preview-image-generation, gemini-3-pro-high/low)
from the live upstream API response in /api/providers/[id]/models
Models removed from available list (not usable via chat completions):
- gemini-3-pro-high/low — returns empty responses, quota unusable
- gemini-2.5-flash/flash-lite — quota always exhausted on free tier
- gemini-3.1-flash-image-preview — preview variant, not functional
Models hidden from quota UI (in addition to above):
- gemini-3-flash-agent — internal agent model
- gemini-3.1-flash-lite — not usable for chat
Kept gemini-3.1-flash-image in available models (confirmed working).
Removed dead gemini-3-pro from bare Pro ID normalization.
- Standardize cooldown fallback to 2 min (COOLDOWN_MS.rateLimit) in both
chatCore and auth.ts instead of inconsistent 60s/undefined
- Return 504 Gateway Timeout instead of 200 OK when SSE collection times
out, with finish_reason "length" to signal incomplete response
A 429 from one Antigravity model was marking the entire provider connection
as rate-limited, blocking ALL other models on the same account. This happened
in two places: chatCore's error classification (primary) and
markAccountUnavailable (secondary). Both now use model-only lockModel() for
passthrough providers instead of connection-wide rateLimitedUntil.
Also adds:
- Bare Pro model ID normalization (gemini-3-pro → gemini-3-pro-low)
matching OpenClaw convention
- Internal model exclusion list for quota display, matching CLIProxyAPI
- Update model list to match CLIProxyAPI filtered models (remove stale
gemini-2.5-pro, claude-sonnet-4-5, claude-sonnet-4, gemini-2.0-flash;
sort alphabetically)
- Extend 404 model-only lockout to passthrough providers so one missing
model doesn't lock out the entire Antigravity connection
- Always use streaming upstream endpoint (generateContent causes upstream
400 for some models that internally convert to OpenAI format with
stream_options); collect SSE into JSON for non-streaming clients
- Fix unlimited quota models showing 0% by checking q.unlimited before
recalculating percentage
The check-route-validation script now accepts both validateBody()
and .safeParse() as valid body validation methods. This fixes false
positives for routes using Zod schemas with safeParse().
Added 130 missing keys from en.json:
- a2aDashboard: 46 keys
- agents: 6+ keys
- cliTools.guides notes: continue, kiro, opencode
- And all other missing keys from recent additions
Total: All 33 language files now have full key parity.
Detects mismatched placeholders like {count} vs {pocet} between
source (en.json) and translations. Catches cases where raw placeholders
like {# models} are translated without preserving the placeholder format.
Found 14 issues in cs.json as test case.
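A sketch of the placeholder comparison, assuming flat key-to-message maps. The function names are hypothetical, and the simple brace regex would not cope with nested ICU plural syntax:

```typescript
// Collect the distinct {placeholder} tokens of a message, sorted.
function extractPlaceholders(message: string): string[] {
  const found: string[] = message.match(/\{[^{}]+\}/g) ?? [];
  return Array.from(new Set(found)).sort();
}

// Return the keys whose translation uses a different set of placeholders
// than the source message.
function findPlaceholderMismatches(
  source: Record<string, string>,
  translation: Record<string, string>
): string[] {
  return Object.keys(source).filter((key) => {
    const translated = translation[key];
    if (translated === undefined) return false; // missing keys are a separate check
    return extractPlaceholders(source[key]).join("|") !== extractPlaceholders(translated).join("|");
  });
}
```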
Added missing i18n keys for 'strict-random' routing strategy:
- combos.strategyGuide.strict-random: {when, avoid, example}
- combos.strategyRecommendations.strict-random: {title, description, tip1, tip2, tip3}
Total: 264 keys across all language files (8 keys × 33 languages)
These keys were already in pt-BR (incorrectly translated) and are now
aligned with the English fallback values from combos/page.tsx
Added missing step 5 'Use Thinking Variant' to all 33 i18n language files
for cliTools.guides.opencode.steps.5
The step was already defined in CLI_TOOLS constant but the i18n
translations were missing, causing the step title/description to
not display in the UI.
Models without quota data (e.g. tab-completion models) were showing 0%
because remainingFraction defaulted to 0 when absent. Now defaults to 1
so they show 100% remaining instead.
- Replace stale model IDs (gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview)
with correct High/Low tier variants from fetchAvailableModels API
(gemini-3-pro-high, gemini-3-pro-low, gemini-3.1-pro-high, gemini-3.1-pro-low, etc.)
- Remove ag/ alias prefix in favor of antigravity/ across registry, providers,
model capabilities, combos, docs, and static model providers
- Make provider alias optional in Zod schema and guard ALIAS_TO_ID/ID_TO_ALIAS maps
- Show raw model IDs in quota display instead of unmapped display names
- Update T28 model catalog test to assert new High/Low tier models
- Add state for custom app name and logo
- Fetch whitelabeling settings from /api/settings
- Listen for whitelabeling changes via settings event
- Display custom app name when set
- Display custom logo (Base64 or URL) when set
- Fall back to default OmniRoute logo and name
- Add dedicated 'appearance' tab to settings navigation
- Move AppearanceTab from general to dedicated tab
- Add whitelabeling fields to Zod schema (customLogoUrl, customLogoBase64)
- Add whitelabeling UI section with:
  - App name customization
  - Custom logo URL input
  - Logo file upload with preview
  - Reset to default functionality
- Add i18n keys for whitelabeling features
This allows users to customize the application name and logo for white-labeling purposes.
- Deduplicate in-flight loadCodeAssist requests to prevent thundering herd
- Add typeof guard on cloudaicompanionProject.id before calling .trim()
- Evict oldest cache entry when all entries are still valid
- Fix unwrapGeminiChunk to use explicit null-safe check
- Update test description for null response case
The stored projectId from OAuth connection time becomes stale because the
Cloud Code API rotates free-tier projects. Native Gemini CLI refreshes the
project every 30 seconds via loadCodeAssist — OmniRoute never did, causing
403 "has not been used in project X" errors that permanently banned accounts.
- Add refreshProject() to GeminiCLIExecutor that calls loadCodeAssist API
with 10s timeout and caches the result for 30s (matching native CLI)
- transformRequest() replaces the stale projectId in the envelope before
sending to the Cloud Code API, falling back to the stored ID on failure
- Make transformRequest calls await-compatible in base executor and
all subclasses (antigravity, cursor, kiro) so async overrides work
- Remove x-goog-user-project header and executor-level project override
that caused 403 "Cloud Code Private API has not been used in project X"
- Add PROJECT_ROUTE_ERROR classifier type so project-routing 403s don't
permanently ban accounts (keeps accounts active, tracks the error)
- Fix Cloud Code envelope unwrapping for content accumulation in stream.ts
(Cloud Code wraps responses in { response: { candidates: [...] } })
- Extract unwrapGeminiChunk() into streamHelpers.ts with format guard
- Remove _currentModel singleton race condition from GeminiCLIExecutor
- Add handler for PROJECT_ROUTE_ERROR in chatCore.ts
- Add TODO in antigravity.ts about same stale-project risk
- Add 7 unit tests for error classifier and stream unwrap paths
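The in-flight deduplication and 30s caching described above can be sketched roughly as follows. This is a minimal illustration, not the GeminiCLIExecutor code: `createProjectRefresher`, the `loader` callback, and the cache key are hypothetical names, and the 10s timeout and cache eviction are left out.

```typescript
// Hypothetical sketch: names and shapes are illustrative only.
type Loader = (key: string) => Promise<string>;

interface CacheEntry {
  projectId: string;
  expiresAt: number;
}

function createProjectRefresher(
  loader: Loader,
  ttlMs = 30_000,
  now: () => number = Date.now,
) {
  const cache = new Map<string, CacheEntry>();
  const inFlight = new Map<string, Promise<string>>();

  return function refreshProject(key: string, storedProjectId: string): Promise<string> {
    // Serve a cached project ID while it is still fresh.
    const hit = cache.get(key);
    if (hit && hit.expiresAt > now()) return Promise.resolve(hit.projectId);

    // Concurrent callers for the same key share one pending request.
    const pending = inFlight.get(key);
    if (pending) return pending;

    const request = loader(key)
      .then((projectId) => {
        cache.set(key, { projectId, expiresAt: now() + ttlMs });
        return projectId;
      })
      // Fall back to the stored ID when the refresh fails.
      .catch(() => storedProjectId)
      .finally(() => {
        inFlight.delete(key);
      });

    inFlight.set(key, request);
    return request;
  };
}
```

Two calls made before the first one settles receive the same promise, so only one upstream request is issued per key.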
* test(settings): add unit tests for debugMode and hiddenSidebarItems
Tests cover:
- PATCH debugMode=true/false
- PATCH hiddenSidebarItems with array values
- Combined updates with both fields
* test(e2e): add Playwright tests for settings toggles
Tests cover:
- Debug mode toggle on/off
- Sidebar visibility toggle
- Settings persistence after page reload
* fix(tests): address code review issues
- Unit tests: fix async/await for getSettings, use direct db functions
- E2E tests: remove conditional logic, use Playwright auto-waiting assertions
* feat(logging): unify request log retention and artifacts
* docs: add dashboard settings toggles to CONTRIBUTING
Add section documenting:
- Debug Mode toggle (Settings → Advanced)
- Sidebar Visibility toggle (Settings → General)
* fix(cache): only inject prompt_cache_key for supported providers
Only inject prompt_cache_key for providers that support prompt caching
(Claude, Anthropic, ZAI, Qwen, DeepSeek). This fixes issue #848 where
NVIDIA API rejected the parameter.
* fix(model-sync): log only channel-level model changes
* feat(providers): add 4 free models to opencode-zen
* feat(providers): add explicit contextLength for opencode-zen free models
* feat(providers): add contextLength for all opencode-zen models
* feat: Improve the Chinese translation
* fix: preserve client cache_control for all Claude-protocol providers
Previously, the cache control preservation logic only recognized a
hardcoded list of providers (claude, anthropic, zai, qwen, deepseek).
This caused OmniRoute to inject its own cache_control markers for
Claude-protocol providers not in that list (bailian-coding-plan, glm,
minimax, minimax-cn, etc.), overwriting the client's cache markers.
The fix checks both:
1. Known caching providers list (existing behavior)
2. Whether targetFormat === 'claude' (all Claude-protocol providers)
This ensures all Claude-compatible providers properly preserve client
cache_control headers when appropriate (Claude Code client, deterministic
routing, etc.).
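A minimal sketch of the two-part check, assuming a hypothetical helper name (`isCachingTarget`); the provider list is the one quoted above, and the real policy code may combine this with other conditions:

```typescript
// Hypothetical helper; the provider list is taken from the commit message.
const KNOWN_CACHING_PROVIDERS = new Set(["claude", "anthropic", "zai", "qwen", "deepseek"]);

function isCachingTarget(provider: string, targetFormat: string): boolean {
  // 1. Known caching providers (existing behavior)
  // 2. Any provider that speaks the Claude protocol
  return KNOWN_CACHING_PROVIDERS.has(provider) || targetFormat === "claude";
}
```

With this check, a provider such as glm qualifies through its target format even though it is not in the hardcoded list.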
Also removes unused CacheStatsCard from settings/components (duplicate
of the one in cache/ page).
Fixes cache token calculation for GLM, Minimax, and other Claude-compatible providers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: pure passthrough for Claude→Claude when cache_control preserved
The Claude passthrough path round-trips through OpenAI format
(claude→openai→claude) for structural normalization. This strips
cache_control markers from every content block since OpenAI format
has no equivalent, causing ~42k cache creation tokens per request
with zero cache reads.
When preserveCacheControl is true (Claude Code client, "always"
setting, or deterministic combo), skip the round-trip entirely and
forward the body as-is. Claude Code sends well-formed Messages API
payloads — the normalization was only needed for non-Code clients.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: restore CacheStatsCard — was not a duplicate
The first commit incorrectly deleted CacheStatsCard from
settings/components/ as a "duplicate". It's the only copy — both
settings/page.tsx and cache/page.tsx import from this location.
Restored the i18n-ized version from main.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(429): parse long quota reset times from error body
- Parse XhYmZs format from antigravity error messages (e.g., 27h41m36s)
- Dynamic retry-after threshold (60s default) instead of hardcoded 10s
- Add parseRetryFromErrorText() in accountFallback.ts for body parsing
- Fix 403 'verify your account' to trigger permanent deactivation
- Add keyword matching for 'quota will reset', 'exhausted capacity'
- Add unit tests for retry parsing and keyword matching
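As an illustration of the XhYmZs parsing, here is a minimal sketch. The function name `parseRetryMs` and the regular expression are assumptions; the actual parseRetryFromErrorText() in accountFallback.ts may work differently.

```typescript
// Hypothetical sketch of duration parsing; not the actual parseRetryFromErrorText().
function parseRetryMs(text: string): number | null {
  // Matches "27h41m36s", "41m36s", or "36s"; the lookahead requires at least one unit.
  const pattern = /\b(?=\d+[hms])(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?\b/g;
  for (const match of text.matchAll(pattern)) {
    if (match[0] === "") continue; // skip empty matches such as "5ms"
    const hours = Number(match[1] ?? 0);
    const minutes = Number(match[2] ?? 0);
    const seconds = Number(match[3] ?? 0);
    return ((hours * 60 + minutes) * 60 + seconds) * 1000;
  }
  return null;
}
```

For the example in the commit message, 27h41m36s corresponds to 99,696 seconds.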
Fixes #858 (Antigravity 429 handling)
Fixes #832 (Qwen quota 429 - same underlying bug)
* chore: bump version to v3.4.0-dev
* fix(migrations): rename 013 to 014 to avoid collision with v3.3.11
* chore(docs): update CHANGELOG for v3.4.0 integrations
* fix: Claude token refresh, Antigravity quota, and 429 rate-limit handling
- Fix Claude OAuth token refresh to use form-urlencoded format (standard OAuth2)
- Add anthropic-beta header required by Claude OAuth API
- Switch Antigravity quota to use retrieveUserQuota API (same as Gemini CLI)
- Parse quota reset time for all providers (not just Antigravity)
- Add quota reset keywords to error classifier
- Cap maximum retry time at 24 hours to prevent infinite wait
Closes #836, #857, #858, #832
* fix(dashboard): resolve /dashboard/limits hanging UI with 70+ accounts via chunk parallelization (#784)
---------
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Co-authored-by: R.D. <rogerproself@gmail.com>
Co-authored-by: kang-heewon <heewon.dev@gmail.com>
Co-authored-by: gmw <rorschach1167@qq.com>
Co-authored-by: tombii <github@tombii.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
- Sync debugMode with showDebug in Sidebar (was using enableRequestLogs env var)
- Only render debug-section sidebar toggles in AppearanceTab when debugMode=true
- Sidebar filters debug-section items based on debugMode (was already correct)
- Debug toggle now triggers omniroute:settings-updated event for instant sidebar update
- Add authentication to /api/cache/entries (GET and DELETE)
- Add authentication to /api/cache (GET and DELETE)
- Validate trendHours: clamp to 1-720 range
- Fix estimatedCostSaved bug: use calculated value instead of hardcoded 0
- Match CacheStatsCard header size with other card headers (text-sm)
- Show request counts in cache hit rate label (116/359) for clarity
- Add CacheStatsCard component to cache page for cumulative metrics
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update DEFAULT_PRICING key from 'gc' to 'gemini-cli' so pricing
lookups work with the new alias
- Restore gc -> gemini-cli in FALLBACK_ALIAS_TO_PROVIDER for backward
compatibility (existing saved configs with gc/ prefix still resolve)
Gemini CLI clients use bare model names, not provider-prefixed IDs.
The gc/ alias was opaque — gemini-cli/ is self-documenting. Since
alias now equals provider ID, the dual-prefix duplication logic
naturally skips Gemini CLI (no duplicate gemini-cli/ entries).
Auto-opening the "Add Connection" dialog when navigating to a provider
with zero connections was a poor UX pattern. It surprised users who were
simply browsing provider details (e.g. after deleting a connection or
checking settings). The page already displays a clear empty state with
an "Add Connection" button — users should click it when ready.
- Add early return guard for missing accessToken in getGeminiUsage
- Add 10s fetch timeout (AbortSignal.timeout) on retrieveUserQuota calls
- Clamp used value with Math.max(0, ...) for non-negative display
- Use full accessToken as cache key instead of truncated prefix
- Replace catch(err: any) with instanceof Error check in models route
Replace stub getGeminiUsage() with per-model quota fetching from Google
Cloud Code Assist's retrieveUserQuota endpoint (same API the official
Gemini CLI /stats command uses). Fixes OAuth env var name, aligns model
list with official Gemini CLI VALID_GEMINI_MODELS, and makes "Import
from /models" discover new models via the quota endpoint.
Implement a DiversityScoreCard component that fetches and displays the
provider diversity score with a loading state and conditional styling,
and integrate it into the AnalyticsPage overview. Add a new API route at
src/app/api/analytics/diversity/route.ts that returns the diversity
report from getDiversityReport.
Integration test was failing because CacheStatsCard was moved from
settings page to cache page in previous commit. Split the test into
two separate describe blocks for accurate page-specific verification.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix getLoggedInputTokens to return full prompt_tokens (input + cache_read + cache_creation)
- Fix usageExtractor for non-streaming Claude responses to calculate total correctly
- Add formatUsageLog helper to show CR=<cache_read> in logs
- Add migration 012 to fix historical token counts in usage_history
- Move prompt cache metrics from Settings to /dashboard/cache page
Per Claude API docs:
Total input tokens = input_tokens + cache_creation_input_tokens + cache_read_input_tokens
Fixes issue where totalInputTokens (71k) was less than totalCacheCreationTokens (1.35M).
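The formula above can be written as a small helper. This is a sketch under assumptions: the `ClaudeUsage` interface and the function name are illustrative, and missing cache fields are treated as zero.

```typescript
// Illustrative shape; field names follow the formula quoted above.
interface ClaudeUsage {
  input_tokens: number;
  cache_creation_input_tokens?: number;
  cache_read_input_tokens?: number;
}

function totalInputTokens(usage: ClaudeUsage): number {
  return (
    usage.input_tokens +
    (usage.cache_creation_input_tokens ?? 0) +
    (usage.cache_read_input_tokens ?? 0)
  );
}
```

Summing all three fields guarantees the total is never smaller than either cache component, which is the inconsistency this commit fixes.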
Tested:
- All 1134 unit tests pass
- Cache metrics API returns correct totals
- Migration is idempotent and tracked in _omniroute_migrations
- Logs show cache read tokens: 'in=6055 | out=211 | CR=22399'
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Use ProviderIcon for internal .png paths, fixing 404 images for SVG providers (#745).
- Add id-token: write and packages: write permissions to .github/workflows/electron-release.yml to fix the permission-denied failure when calling the reusable workflow npm-publish.yml (#761).
- Fix tests and ESM resolution for autoUpdate.ts override logic.
Add a standardized degradation pattern for services depending on external
systems. withDegradation() tries primary → fallback → safe default,
tracking status in a global registry for dashboard visibility.
Features:
- Async and sync variants
- Global registry with per-feature status tracking
- Degradation levels: full → reduced → minimal → default
- Summary and report APIs for dashboard integration
- Reason tracking for debugging
Example: Rate limiting degrades from Redis → in-memory → permissive
instead of crashing when Redis is unavailable.
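A minimal sketch of the sync variant, assuming a hypothetical signature and registry shape. The mapping of primary, fallback, and safe default onto the "full", "reduced", and "default" levels is a guess for illustration; the "minimal" level is not modeled here.

```typescript
// Hypothetical sketch; the real withDegradation() signature may differ.
type DegradationLevel = "full" | "reduced" | "minimal" | "default";

const degradationRegistry = new Map<string, { level: DegradationLevel; reason?: string }>();

function withDegradationSync<T>(
  feature: string,
  primary: () => T,
  fallback: () => T,
  safeDefault: T,
): T {
  try {
    const value = primary();
    degradationRegistry.set(feature, { level: "full" });
    return value;
  } catch (primaryError) {
    try {
      const value = fallback();
      // Record why the primary path was abandoned.
      degradationRegistry.set(feature, { level: "reduced", reason: String(primaryError) });
      return value;
    } catch (fallbackError) {
      degradationRegistry.set(feature, { level: "default", reason: String(fallbackError) });
      return safeDefault;
    }
  }
}
```

The registry entry is what a dashboard could read to show that a feature is running in a degraded mode and why.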
Closes #799
Add configAudit module that records every change to provider connections,
combos, and routing policies with:
- Before/after state snapshots
- Structured diff (added, removed, changed keys)
- Source tracking (dashboard, API, sync, auto-healing)
- Filtered retrieval with pagination
- Rollback state extraction
- Configuration snapshot export for backup
Enables traceability and quick rollback when config changes cause issues.
Closes #791
Add providerExpiration module to track OAuth token, subscription, and
API credit expiration dates per provider connection. Provides:
- setExpiration() / getExpiration() for CRUD operations
- getExpiringSoon() for proactive alerts
- getExpirationSummary() for dashboard health display
- detectExpirationFromResponse() for auto-detection from HTTP headers
- Status classification: active → expiring_soon → expired
Prevents silent failures from expired credentials by alerting operators
before tokens/subscriptions expire.
Closes #790
Add a providerDiversity module that tracks provider usage distribution
using a rolling time window and calculates Shannon entropy normalized
to [0..1]. This enables the auto-combo scoring engine to factor in
provider diversity — boosting underrepresented providers to reduce
single-point-of-failure risk.
Key features:
- Rolling window with configurable size and TTL
- Shannon entropy calculation normalized to [0..1]
- Per-provider diversity boost for auto-combo integration
- Diversity report for dashboard display
- Full test coverage
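The normalized Shannon entropy can be sketched as below. The function name and the choice to normalize by the log of the number of observed providers are assumptions; the providerDiversity module may normalize against a different provider count and also applies a rolling window that is not shown.

```typescript
// Hypothetical sketch: entropy of the usage distribution, scaled to [0..1].
function normalizedEntropy(counts: Record<string, number>): number {
  const values = Object.values(counts).filter((count) => count > 0);
  const total = values.reduce((sum, count) => sum + count, 0);
  // A single provider (or no traffic) has no diversity.
  if (values.length <= 1 || total === 0) return 0;

  const entropy = values.reduce((acc, count) => {
    const p = count / total;
    return acc - p * Math.log2(p);
  }, 0);

  // Maximum entropy for n providers is log2(n), reached when usage is uniform.
  return entropy / Math.log2(values.length);
}
```

A perfectly even split scores 1, while traffic concentrated on one provider moves the score toward 0.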
Closes #788
- Add windsurf and copilot entries to toolDescriptions in all 33 locale files
to fix MISSING_MESSAGE errors on the CLI Tools page (#748)
- Apply FETCH_TIMEOUT_MS to streaming requests' initial fetch() call to prevent
300s TCP default timeout causing silent failures on long-running requests (#769)
- Previously only non-streaming requests had timeout protection; streaming requests
relied solely on stream idle detection which doesn't cover initial connection hangs
Pre-existing any-budget violations in chatCore.ts (6), combo.ts (2), and
embeddings.ts (1 false positive in comment) — none introduced by GLM work.
Replace `as any` with `Record<string, unknown>` casts and reword comment.
Also removes docs/superpowers audit worksheet from git tracking (not part
of GLM Coding provider changes).
The model sync scheduler and sync-models endpoint were blindly
replacing custom models with all fetched models, including ones
already in the built-in registry. Now filters out registry models
before saving to custom models.
Compare fetched models against existing custom models AND built-in
registry models before posting. Only new models trigger
POST /api/provider-models calls. Shows skip count in import progress
when some models already exist. Adds i18n keys for all locales.
Live-tested all GLM Coding models against the /api/coding/paas/v4
endpoint. glm-4.7-flashx returns 429 "Insufficient balance or no
resource package" and is not listed on the /models endpoint.
All other models (glm-5.1, glm-5, glm-5-turbo, glm-4.7, glm-4.7-flash,
glm-4.6v, glm-4.6, glm-4.5v, glm-4.5, glm-4.5-air) return 200.
The "Import from /models" button was using the wrong Z.AI API surface
(Anthropic-compatible /api/anthropic/v1/models with x-api-key auth).
Switched to the correct Coding API endpoints with Authorization: Bearer
auth, matching the pattern used by the quota/usage tracking code.
- International: https://api.z.ai/api/coding/paas/v4/models
- China: https://open.bigmodel.cn/api/coding/paas/v4/models
- Auth: Authorization: Bearer <token> (not x-api-key)
- Region sourced from providerSpecificData.apiRegion
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Record the family-by-family GLM Coding audit, add regression coverage, and fix the documented GLM-5.1 context window override.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add missing glm-4.7-flashx variant to provider registry (confirmed in
Z.AI official GLM-4.7 overview docs as one of three variants)
- Remove glm-4.7/glm4.7 from tool calling denylist — official docs
explicitly show GLM-4.7 supporting function calling with tools param
- Add estimated pricing for glm-4.7-flashx ($0.3/$1.1) between free
Flash and standard 4.7 tiers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(stream): normalize delta.reasoning to reasoning_content in SSE streaming
NVIDIA kimi-k2.5 (and potentially other providers) send reasoning
tokens as `delta.reasoning` in SSE streaming chunks instead of the
standard OpenAI `delta.reasoning_content` field. This caused reasoning
content to be silently dropped during stream passthrough — clients
received only the final answer with no reasoning separation.
The non-streaming sanitizer (responseSanitizer.ts) already handled this
alias, but the streaming pipeline did not.
Fix applied in 4 locations:
- stream.ts passthrough: normalize + force re-serialize sanitized chunk
- stream.ts translate: accumulate reasoning from delta.reasoning
- sseParser.ts: collect delta.reasoning in parseSSEToOpenAIResponse
- streamPayloadCollector.ts: collect delta.reasoning in buildOpenAISummary
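The alias normalization itself is small. A minimal sketch, assuming a hypothetical helper name and a simplified delta shape; it only renames the field when the standard one is absent:

```typescript
// Hypothetical sketch; the real streaming code handles more fields.
interface StreamDelta {
  content?: string;
  reasoning?: string;
  reasoning_content?: string;
}

function normalizeReasoningAlias(delta: StreamDelta): StreamDelta {
  if (typeof delta.reasoning === "string" && delta.reasoning_content === undefined) {
    const { reasoning, ...rest } = delta;
    // Move the alias value onto the standard OpenAI field name.
    return { ...rest, reasoning_content: reasoning };
  }
  return delta;
}
```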
* fix: eliminate injectedUsage reuse bug and add reasoning alias tests
- Detect delta.reasoning alias before sanitizeStreamingChunk() which
already normalizes it, removing dead post-sanitization normalization
- Replace injectedUsage reuse with separate needsReserialization flag
so reasoning re-serialization cannot block finish_reason/usage
mutations on the same SSE chunk (fixes CRITICAL review finding)
- Add unit test for parseSSEToOpenAIResponse reasoning alias
- Add unit test for buildStreamSummaryFromEvents reasoning alias
* fix(stream): separate reasoning from content in passthrough response body
The passthroughAccumulatedContent variable was mixing delta.content and
delta.reasoning_content into one string, causing the client_response
log and responseBody to lose reasoning separation.
- Add passthroughAccumulatedReasoning accumulator for reasoning deltas
- Set message.reasoning_content in responseBody when reasoning exists
- Only accumulate delta.content into passthroughAccumulatedContent
* fix: trim leading whitespace from assembled content in log summaries
NVIDIA and other providers emit token deltas with leading spaces
(e.g. ' The', ' user'). When joined, these produce a leading space in
the provider_response and parsed non-streaming response logs. Trim
the joined content and reasoning_content in both buildOpenAISummary
and parseSSEToOpenAIResponse for consistent log output.
* fix(stream): split combined reasoning+content deltas into separate SSE events
Some providers (e.g. NVIDIA NIM) send transition chunks with both
`delta.reasoning` and `delta.content` in the same SSE event.
After sanitization this becomes `reasoning_content` + `content`,
which violates the standard OpenAI streaming contract where these
fields are never mixed. Clients using if/else logic (LobeChat, etc.)
skip content when reasoning_content is present, losing the first
content token.
Split such combined chunks into two separate SSE events:
1. Reasoning-only event (finish_reason=null, no usage)
2. Content-only event (carries finish_reason and usage)
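The split can be sketched as follows. The chunk types and the function name are simplified assumptions, and only the first choice is considered; the actual stream.ts logic is not reproduced here.

```typescript
// Hypothetical, simplified chunk shapes for illustration.
interface Delta {
  content?: string;
  reasoning_content?: string;
}
interface Choice {
  index: number;
  delta: Delta;
  finish_reason: string | null;
}
interface Chunk {
  id?: string;
  choices: Choice[];
  usage?: unknown;
}

function splitCombinedChunk(chunk: Chunk): Chunk[] {
  const choice = chunk.choices[0];
  if (!choice || !choice.delta.reasoning_content || !choice.delta.content) {
    return [chunk]; // nothing to split
  }
  const { reasoning_content, content, ...restDelta } = choice.delta;
  const { usage, ...chunkWithoutUsage } = chunk;

  // 1. Reasoning-only event: no finish_reason, no usage.
  const reasoningEvent: Chunk = {
    ...chunkWithoutUsage,
    choices: [{ ...choice, delta: { ...restDelta, reasoning_content }, finish_reason: null }],
  };
  // 2. Content-only event: carries finish_reason and usage.
  const contentEvent: Chunk = {
    ...chunk,
    choices: [{ ...choice, delta: { ...restDelta, content } }],
  };
  return [reasoningEvent, contentEvent];
}
```

Each returned chunk would then be serialized as its own SSE `data:` line, so a client using if/else logic sees the content token in a separate event.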
Models like antigravity/claude-sonnet-4-6 route through Google's internal
Cloud Code API which returns HTTP 400 when thinking/reasoning parameters
are included in the request body.
Changes:
- open-sse/services/modelCapabilities.ts: add supportsReasoning() function
with a denylist of known-unsupported patterns (antigravity/claude-sonnet-*)
and a registry-based lookup hook (supportsReasoning flag per model)
- open-sse/services/thinkingBudget.ts: in applyThinkingBudget(), add early
exit before the mode switch — if model string is present and
supportsReasoning() returns false, call stripThinkingConfig() immediately
regardless of the configured ThinkingMode
This is fully backward-compatible: models not in the denylist are unaffected,
and the supportsReasoning registry flag defaults to null (pass-through).
Fixes: HTTP 400 errors on antigravity provider when client sends requests
with thinking/reasoning budget parameters (e.g. claude-sonnet-4-6 via AG).
Co-authored-by: oyi77 <oyi77@github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
* feat: auto-disable banned accounts setting with UI toggle
Add a configurable setting to automatically disable provider accounts
that return permanent/terminal errors (403 banned, ToS violation, etc.)
Changes:
- open-sse/services/accountFallback.ts: extend ACCOUNT_DEACTIVATED_SIGNALS
with AG-specific ban messages ('verify your account', 'service disabled
for violation')
- src/app/api/settings/auto-disable-accounts/route.ts: new GET/PUT endpoint
for the setting (enabled bool + threshold int)
- src/shared/validation/schemas.ts: updateAutoDisableAccountsSchema
- src/sse/services/auth.ts: in markAccountUnavailable(), capture result.permanent
from checkFallbackError() and — when autoDisableBannedAccounts is enabled and
backoffLevel >= threshold — set isActive=false on the connection
Default: disabled (backward-compatible). Enable via Settings UI or PUT
/api/settings/auto-disable-accounts { "enabled": true, "threshold": 3 }
Fixes: antigravity accounts with 403/Verify-your-account errors being
retried indefinitely in the rotation pool.
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
* fix: address reviewer comments for auto-disable (use getCachedSettings, immediate disable on permanent bans)
---------
Co-authored-by: oyi77 <oyi77@github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Gemini API returns 400 error when tools have enum constraints on integer/number types:
"enum: only allowed for STRING type"
This fix removes enum constraints for integer and number types in JSON schemas
before sending to Gemini API, while keeping enum for string types.
Fixes tools like mcp__pointer__get-pointed-element that use integer enums
for cssLevel and textDetail parameters.
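A minimal sketch of the schema cleanup, assuming a hypothetical function name and a plain recursive walk; the real translator may scope this differently:

```typescript
// Hypothetical sketch: drop `enum` from integer/number schema nodes, keep it for strings.
function stripNumericEnums(schema: unknown): unknown {
  if (Array.isArray(schema)) return schema.map((item) => stripNumericEnums(item));
  if (schema === null || typeof schema !== "object") return schema;

  const node = schema as Record<string, unknown>;
  const isNumeric = node.type === "integer" || node.type === "number";
  const cleaned: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === "enum" && isNumeric) continue; // Gemini only allows enum on STRING
    cleaned[key] = stripNumericEnums(value);
  }
  return cleaned;
}
```

The input object is not mutated, so the original tool definition stays intact for providers that accept numeric enums.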
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Add `validateResponseQuality()` to detect empty/invalid 200 responses from
upstream providers in combo routing. Non-streaming responses with empty body,
invalid JSON, or missing content/tool_calls now trigger circuit breaker
failure and fallback to the next model instead of being returned to the client.
- Add missing `breaker._onSuccess()` calls in both priority and round-robin
combo paths. Previously failures accumulated without reset, causing premature
circuit breaker trips on healthy models.
- Update Cursor provider registry with Claude 4.6 model IDs (opus-high,
sonnet-high, haiku, opus + thinking variants). Keep 4.5 IDs for backward
compatibility.
- Update free-stack preset: replace duplicate qw/qwen3-coder-plus with
if/deepseek-v3.2 for better model diversity.
- Add paid-premium combo template for round-robin load distribution across
paid subscription providers (Cursor, Antigravity).
Made-with: Cursor
* chore(release): v3.2.8 — Docker auto-update UI and cache analytics fixes
* fix(sse): remove race condition in cache metrics tracking (#758)
- Remove in-memory metrics tracking (currentMetrics, trackCacheMetrics, updateCacheMetrics)
- Cache metrics now computed on-the-fly from usage_history table (single source of truth)
- Fixes CRITICAL issue from code review: concurrent requests overwriting metrics
- Fixes WARNING: duplicate metric tracking logic in streaming/non-streaming paths
Ref: PR #752 (merged before this fix was included)
* fix: handle allRateLimited credentials & forward extra body keys in embeddings/images routes (#757)
* fix: handle allRateLimited credentials in embeddings and images routes
When getProviderCredentials() returns an allRateLimited object (truthy,
but without apiKey/accessToken), the embeddings and images routes
incorrectly passed it to handlers as valid credentials. The handlers
then sent upstream requests without Authorization headers, causing
401 errors from providers (e.g. NVIDIA NIM).
This only manifested under concurrent requests: a chat/completions
call could trigger rate limiting on a provider account, and a
simultaneous embeddings request would receive the allRateLimited
sentinel — but treat it as valid credentials.
The chat pipeline already handled this case correctly. This commit
adds the same allRateLimited guard to all affected routes:
- POST /v1/embeddings
- POST /v1/providers/{provider}/embeddings
- POST /v1/images/generations
- POST /v1/providers/{provider}/images/generations
Also adds a defense-in-depth guard in the embeddings handler itself:
if no auth token is available for a non-local provider, return 401
immediately instead of sending an unauthenticated request upstream.
Made-with: Cursor
* fix(embeddings): forward extra body keys to upstream providers
The embeddings handler only forwarded model, input, dimensions, and
encoding_format to upstream providers, silently dropping any additional
fields. This broke asymmetric embedding APIs (e.g. NVIDIA NIM
nv-embedqa-e5-v5) that require input_type, and other providers
expecting user or truncate parameters.
Add a KNOWN_FIELDS exclusion set and forward all unrecognized body
keys to the upstream request, matching the passthrough pattern used
by the chat pipeline's DefaultExecutor.transformRequest().
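The exclusion-set passthrough can be sketched like this. The function name and the exact contents of the set are assumptions based on the four fields named above:

```typescript
// Hypothetical sketch; the set mirrors the fields listed in the commit message.
const KNOWN_FIELDS = new Set(["model", "input", "dimensions", "encoding_format"]);

function buildEmbeddingsBody(
  body: Record<string, unknown>,
  upstreamModel: string,
): Record<string, unknown> {
  const upstream: Record<string, unknown> = { model: upstreamModel, input: body.input };
  if (body.dimensions !== undefined) upstream.dimensions = body.dimensions;
  if (body.encoding_format !== undefined) upstream.encoding_format = body.encoding_format;

  // Forward everything else untouched (e.g. input_type, user, truncate).
  for (const [key, value] of Object.entries(body)) {
    if (!KNOWN_FIELDS.has(key)) upstream[key] = value;
  }
  return upstream;
}
```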
Made-with: Cursor
* fix(auth): redirect and unconditional 401 on disabled requireLogin + fix test cases
* fix(build): remove legacy proxy.ts causing Next.js build collision
* fix(build): revert middleware.ts rename to proxy.ts because of Next.js Edge constraints
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: tombii <tombii@users.noreply.github.com>
Co-authored-by: Gorchakov-Pressure <117600961+Gorchakov-Pressure@users.noreply.github.com>
- Add usage_history table creation in test setup
- Simplify byStrategy query to avoid non-existent combo_strategy column
- Update test assertions to work with existing test data
1. Fixed crash in /api/cli-tools/status when statuses[toolId] is undefined
- Added null check before accessing statuses[toolId] properties
- Prevents "Cannot set property of undefined" error
2. Added support for droid.exe detection in ~/bin directory
- Added ~/bin and ~/.local/bin to EXPECTED_PARENT_PATHS
- Added droid.exe variant to toolBins for Windows
- Added specific path check for droid in ~/bin/droid.exe
These fixes resolve issues where CLI tools (Claude Code, Codex, Droid)
were showing as "not installed" even when properly installed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds intelligent cache control preservation for Claude Code clients:
- New cacheControlPolicy.ts module with detection logic:
- isClaudeCodeClient(): Detects Claude Code via User-Agent
- providerSupportsCaching(): Checks provider (claude, anthropic, zai, qwen)
- isDeterministicStrategy(): Identifies priority/cost-optimized strategies
- shouldPreserveCacheControl(): Main policy decision
- Cache control is preserved when:
1. Client is Claude Code (detected via User-Agent)
2. Provider supports prompt caching
3. Request routing is deterministic:
- Single model requests (always)
- Combo with priority or cost-optimized strategy only
- Updated translator to accept preserveCacheControl option
- Updated chatCore and chat handler to propagate combo strategy
- Added comprehensive unit tests (24 tests)
Non-deterministic combo strategies (weighted, round-robin, random, etc.)
continue to use OmniRoute's managed caching strategy.
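The policy above can be sketched roughly as follows. The function names come from the commit; the bodies, the User-Agent pattern, and the argument shapes are assumptions for illustration only:

```typescript
const CACHING_PROVIDERS = new Set(["claude", "anthropic", "zai", "qwen"]);
const DETERMINISTIC_STRATEGIES = new Set(["priority", "cost-optimized"]);

function isClaudeCodeClient(userAgent: string | null): boolean {
  // Assumption: Claude Code sends a User-Agent containing "claude-code" or "claude-cli".
  return userAgent !== null && /claude-(code|cli)/i.test(userAgent);
}

function providerSupportsCaching(provider: string): boolean {
  return CACHING_PROVIDERS.has(provider);
}

function isDeterministicStrategy(comboStrategy: string | undefined): boolean {
  // undefined models a single-model request, which is always deterministic.
  return comboStrategy === undefined || DETERMINISTIC_STRATEGIES.has(comboStrategy);
}

function shouldPreserveCacheControl(
  userAgent: string | null,
  provider: string,
  comboStrategy?: string
): boolean {
  return (
    isClaudeCodeClient(userAgent) &&
    providerSupportsCaching(provider) &&
    isDeterministicStrategy(comboStrategy)
  );
}
```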
Refs: #cache-control-preservation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
OpenCode Go mixes request protocols by model family:
- `glm-5` and `kimi-k2.5` use OpenAI-style `/chat/completions`
- `minimax-m2.5` and `minimax-m2.7` use Anthropic-style `/messages`
OmniRoute already routed MiniMax Go models to `/messages`, but the
executor still sent `Authorization: Bearer ...`, which caused upstream
`401 Missing API key` errors.
This changes `OpencodeExecutor` to send:
- `x-api-key` + `anthropic-version` for Claude-targeted OpenCode Go requests
- `Authorization: Bearer ...` for the remaining OpenCode Go request formats
Also updates unit coverage to assert the correct header behavior for
MiniMax Go models.
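A minimal sketch of the header selection, assuming a hypothetical `buildOpencodeGoHeaders` helper and the commonly used `anthropic-version` value `2023-06-01` (the value the executor actually sends is not stated here):

```typescript
type TargetFormat = "claude" | "openai";

// Hypothetical helper: picks auth headers by the request format of the target model.
function buildOpencodeGoHeaders(apiKey: string, targetFormat: TargetFormat): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (targetFormat === "claude") {
    // Anthropic-style /messages endpoints expect x-api-key, not a Bearer token.
    headers["x-api-key"] = apiKey;
    headers["anthropic-version"] = "2023-06-01"; // assumed version string
  } else {
    headers["Authorization"] = `Bearer ${apiKey}`;
  }
  return headers;
}
```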
Validated with:
- direct curl repro against OpenCode Go endpoints
- `node --import tsx/esm --test tests/unit/opencode-executor.test.mjs`
- `npm run typecheck:core`
- `npm run build`
HTTP 400 "invalid argument" was triggered when OmniRoute translated tool calls
to Gemini format, because thoughtSignature was injected onto every functionCall
part unconditionally. Affects two code paths:
1. openai-to-gemini.ts — OpenAI tool_calls → Gemini functionCall
2. claude-to-gemini.ts — Claude tool_use blocks → Gemini functionCall
thoughtSignature is only valid on thinking/reasoning parts (those with
thought: true or thoughtSignature as their primary field). When present on a
functionCall part, the Gemini API returns HTTP 400 'invalid argument'.
The thinking parts that legitimately carry thoughtSignature (emitted when a
message has reasoning_content / thinking blocks) are untouched.
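As a rough illustration of the corrected output shape, the sketch below converts one OpenAI tool call into a Gemini part that carries only `functionCall`. The helper name and the simplified types are hypothetical, not the translator's actual code:

```typescript
interface OpenAIToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

interface GeminiPart {
  text?: string;
  thought?: boolean;
  thoughtSignature?: string;
  functionCall?: { name: string; args: Record<string, unknown> };
}

// Hypothetical helper: emits a functionCall part with no thoughtSignature field.
function toGeminiFunctionCallPart(call: OpenAIToolCall): GeminiPart {
  const args: Record<string, unknown> = JSON.parse(call.function.arguments || "{}");
  return { functionCall: { name: call.function.name, args } };
}
```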
Regression tests (T43) cover:
- single tool call: no thoughtSignature on functionCall part (openai path)
- multiple tool calls: none carry thoughtSignature (openai path)
- thinking regression guard: thoughtSignature still on thought parts
- claude-to-gemini path: tool_use blocks produce clean functionCall parts
Fixes #724
HTTP 400 "invalid argument" was triggered when OmniRoute translated OpenAI
tool_calls to Gemini format, because thoughtSignature was injected onto every
functionCall part unconditionally.
thoughtSignature is only valid on thinking/reasoning parts (those with
thought: true). The Gemini API rejects any request where a functionCall
part carries a thoughtSignature field, returning HTTP 400.
Fix: remove the thoughtSignature field from functionCall parts. The thinking
parts that legitimately require thoughtSignature (emitted when a message has
reasoning_content) are unchanged.
Adds regression test (T43) with three cases:
- single tool call: no thoughtSignature on functionCall part
- multiple tool calls: none carry thoughtSignature
- thinking part regression guard: thoughtSignature still present on thought parts
Fixes #725
When all combo models are exhausted (502/503), OmniRoute now checks for
a globalFallbackModel setting and attempts one last request through it
before returning the error. The setting is stored in the key_value table,
so no migration is needed.
Non-streaming: Fixed json.messages check to use json.choices[0].message
(OpenAI format). Streaming: inject pin tag before finish_reason chunk for
tool-call-only streams. injectModelTag now appends synthetic assistant
message when content is null/array (tool_calls).
Adds a new /dashboard/cache page that surfaces the existing but UI-less
semantic cache infrastructure.
Changes:
- New page: src/app/(dashboard)/dashboard/cache/page.tsx
- Live stats: memory entries, DB entries, cache hits, tokens saved
- Hit rate progress bar with color coding (green/yellow/red)
- Hits/Misses/Total breakdown
- Idempotency layer stats (active dedup keys + window)
- Cache behavior info panel
- Clear All button
- Auto-refresh every 10s
- Enhanced API: src/app/api/cache/route.ts
- DELETE ?model=<name> — invalidate by model
- DELETE ?signature=<hex> — invalidate single entry
- DELETE ?staleMs=<ms> — invalidate entries older than N ms
- DELETE (no params) — clear all (existing behavior)
- Sidebar: added Cache nav item (icon: cached)
- i18n: added cache + sidebar.cache keys for all 31 supported locales
No new dependencies. All functionality builds on existing semanticCache.ts,
cacheLayer.ts, and idempotencyLayer.ts modules.
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
- Add codexAuthFile.ts utility: builds Codex auth.json payload from OAuth connection
(id_token, access_token, refresh_token, account_id) with auto-refresh if expired
- Add POST /api/providers/[id]/codex-auth/export: downloads auth.json file
- Add POST /api/providers/[id]/codex-auth/apply-local: writes auth.json to local CLI path
- Add 'Apply auth' and 'Export auth' buttons to ConnectionRow (Codex provider only)
- Add i18n keys for en and pt-BR
- add unit tests for API auth, display/error utilities, login bootstrap,
model combo mappings, provider validation branches, and usage analytics
- add COVERAGE_PLAN.md and extend CONTRIBUTING.md with coverage notes and
workflow guidance
- update package.json to adjust test:coverage thresholds and add coverage:report;
include c8 as a devDependency
- introduce test scaffolding and ensure compatibility with existing test runners
- align tests with open-sse changes and improve overall test coverage planning
- introduce open-sse/translator/helpers/schemaCoercion.ts to coerce
numeric JSON Schema fields encoded as strings
- wire coerceToolSchemas and sanitizeToolDescriptions into translator
pipeline; ensure tool descriptions are sanitized
- inject empty reasoning content for tool calls when target is OpenAI
format
- update qwen base URL to DashScope-compatible endpoint
- extend antigravity static catalog with Gemini 3.1 pro preview models and
update Gemini model specs with preview aliases
- implement call log max cap caching with TTL; expose invalidateCallLogsMaxCache
and invalidate on settings PATCH
- add tests: call-log-cap.test.mjs and tool-request-sanitization.test.mjs;
extend tests for Windsurf integration and gemini previews
- update CLI runtime and tools to include Windsurf as a guide-only tool
- add maxCallLogs to validation schemas (settings and updateSettings)
- add Czech README (README.cs.md) to repository
When Claude Code routes through OmniRoute (Claude → OmniRoute → Claude),
OmniRoute was stripping all cache_control markers and replacing them with
its own generic caching strategy. This broke Claude Code's carefully
placed cache breakpoints for plans and other features.
Changes:
- Add preserveCacheControl parameter to prepareClaudeRequest()
- Detect Claude passthrough mode (sourceFormat === targetFormat === CLAUDE)
- Skip cache_control normalization when preserveCacheControl=true
- Preserve client's cache_control markers in system, messages, and tools
This ensures Claude Code's prompt caching optimization works correctly
while maintaining OmniRoute's caching strategy for translation scenarios.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The 'Import from /models' button failed because opencode-zen was not
registered in PROVIDER_MODELS_CONFIG. The provider's API at
https://opencode.ai/zen/v1/models returns standard OpenAI-compatible
format and is now properly configured for model import.
- Fix usageExtractor cached_tokens fallback for Responses API (use cache_read_input_tokens when input_tokens_details is absent)
- Fix truncated claude-native-passthrough-tools.test.mjs that caused parse error
Reviewed and approved via consolidated analysis. Turkish locale (31st language) follows existing i18n patterns perfectly. Registered in config.ts, generate-multilang.mjs, and full tr.json translation file.
Reviewed and approved via consolidated analysis. GLM-5.1 addition and pricing corrections match official Z.AI pricing page. All 5 files follow existing patterns.
Reviewed and approved via consolidated analysis. Fix is surgical (1 line removed) with 122 lines of regression tests covering stream=true, stream=false and guard scenarios. Resolves #677.
Three tests covering the fixed bug where translateRequest received an
object instead of a boolean for the stream parameter:
- stream=true round-trip produces boolean true
- stream=false round-trip produces boolean false
- guard test documenting that passing an object as stream breaks typing
Co-Authored-By: Craft Agent <agents-noreply@craft.do>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The second translateRequest call in the claude->openai->claude passthrough
path had an extra `translatedBody` argument before `stream`, shifting all
parameters by one. This caused the `stream` field in the upstream request
to be set to an object instead of a boolean, producing:
"stream: Input should be a valid boolean"
Co-Authored-By: Craft Agent <agents-noreply@craft.do>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add glm-5.1 model to GLM Coding provider with fitness scores
- Update glm-5 pricing to match Z.AI API page ($1/$3.2/$0.2)
- Set glm-5.1 pricing to $1.2/$5/$0.3 per Z.AI
- Remove glm-4-32b (deprecated, returns empty from upstream)
- Rename Z.AI provider display name from "Z.AI (GLM-5)" to "Z.AI"
- Update zai pricing section to match glm pricing
In OpenAI Chat Completions streaming format, the tool call id and type
should only appear on the first chunk (tool declaration). Subsequent
argument delta chunks should only include index and function.arguments.
Including id on every delta chunk caused openai-to-claude.ts to emit
a new content_block_start for each chunk, breaking Claude Code ACP
sessions with malformed Claude-format streams.
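The expected chunk layout can be sketched like this, with a hypothetical `buildToolCallDeltas` helper and simplified types. Only the first delta carries `id`, `type`, and the function name:

```typescript
interface ToolCallDelta {
  index: number;
  id?: string;
  type?: "function";
  function: { name?: string; arguments: string };
}

// Hypothetical helper: one declaration chunk followed by argument-only chunks.
function buildToolCallDeltas(
  index: number,
  id: string,
  name: string,
  argumentFragments: string[]
): ToolCallDelta[] {
  const declaration: ToolCallDelta = {
    index,
    id,
    type: "function",
    function: { name, arguments: "" },
  };
  const argumentDeltas: ToolCallDelta[] = argumentFragments.map((fragment) => ({
    index,
    function: { arguments: fragment },
  }));
  return [declaration, ...argumentDeltas];
}
```

A consumer that starts a new block whenever it sees an `id` then starts exactly one block per tool call.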
Fixes #682
Co-authored-by: Chris Staley <christopher.staley@protonmail.com>
The hasValuableContent() function in streamHelpers.ts returned undefined
instead of explicit false when checking empty delta chunks. This caused
JavaScript type coercion issues where undefined !== '' evaluated to true,
passing empty chunks through to clients.
Fix: Replace implicit returns with explicit boolean returns using
typeof checks and length comparisons for all content fields (content,
reasoning_content, tool_calls, text, thinking, partial_json).
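A simplified sketch of the explicit-boolean approach. The real function inspects format-specific chunk shapes; here a flat delta object is assumed for brevity:

```typescript
const TEXT_FIELDS = ["content", "reasoning_content", "text", "thinking", "partial_json"];

function isNonEmptyString(value: unknown): boolean {
  return typeof value === "string" && value.length > 0;
}

// Always returns a real boolean, never undefined.
function hasValuableContent(delta: Record<string, unknown> | null | undefined): boolean {
  if (delta === null || delta === undefined) return false;
  const toolCalls = delta["tool_calls"];
  if (Array.isArray(toolCalls) && toolCalls.length > 0) return true;
  return TEXT_FIELDS.some((field) => isNonEmptyString(delta[field]));
}
```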
Test: Added unit tests covering OpenAI, Claude, and Gemini format edge cases.
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
The native Codex passthrough path returned early before injecting
default instructions and enforcing store=false. Clients sending
Responses API requests without instructions (e.g. opencode) got
400 "Instructions are required", and requests missing store=false
got 400 "Store must be set to false" from the Codex upstream.
Move both assignments before the passthrough return so they apply
to all code paths.
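A minimal sketch of applying both defaults before any early return, assuming a hypothetical `applyCodexDefaults` helper and caller-supplied default instructions:

```typescript
type ResponsesBody = Record<string, unknown>;

// Hypothetical helper: run this before deciding whether to take the passthrough path.
function applyCodexDefaults(body: ResponsesBody, defaultInstructions: string): ResponsesBody {
  const result: ResponsesBody = { ...body, store: false };
  const instructions = result["instructions"];
  if (typeof instructions !== "string" || instructions.length === 0) {
    result["instructions"] = defaultInstructions;
  }
  return result;
}
```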
Changes:
- fix: restore native Claude tool names in passthrough responses (PR #663 by @coobabm)
- fix: Clear All Models button now also removes aliases (PR #664 by @rdself)
- fix: completed truncated test from PR #663, added Claude-to-Claude passthrough test
- docs: update CHANGELOG and OpenAPI spec
The Clear All Models button was only deleting custom models from the
database but leaving their aliases intact, so the UI didn't reflect
the deletion. Now it also deletes all aliases belonging to the provider
and refreshes the alias state.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
High backoffLevel (up to 15) persisted permanently in the DB after a burst of 429s. The account health score dropped below zero (100 - 15*10 = -50), causing the account selector to never pick the account again. Only a successful request could reset backoffLevel via clearAccountError, but the account was never selected, creating a deadlock.
Now, during account selection, any non-terminal connection whose rateLimitedUntil has passed gets its backoffLevel reset to 0 and testStatus restored to active. The DB update is fire-and-forget to avoid blocking the hot path.
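The reset step can be sketched as below. The connection shape, the set of terminal statuses, and the `persist` callback are assumptions for illustration; skipping connections whose backoffLevel is already 0 is a choice made here to avoid redundant writes:

```typescript
interface Connection {
  id: string;
  backoffLevel: number;
  testStatus: string;
  rateLimitedUntil: number | null; // epoch milliseconds
}

// Hypothetical set of statuses that must never be auto-healed.
const TERMINAL_STATUSES = new Set(["expired", "deactivated"]);

function healCooledDownConnections(
  connections: Connection[],
  now: number,
  persist: (connection: Connection) => Promise<void>
): Connection[] {
  return connections.map((connection) => {
    const cooledDown = connection.rateLimitedUntil !== null && connection.rateLimitedUntil <= now;
    if (TERMINAL_STATUSES.has(connection.testStatus) || !cooledDown || connection.backoffLevel === 0) {
      return connection;
    }
    const healed: Connection = { ...connection, backoffLevel: 0, testStatus: "active" };
    // Fire-and-forget: selection does not wait for the DB write.
    void persist(healed).catch(() => undefined);
    return healed;
  });
}
```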
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
The openai-to-claude translator was prefixing tool names with 'proxy_'
(e.g. Bash → proxy_Bash) even when routing Claude-format requests to
native Claude/Anthropic providers. Claude rejects unknown tool names,
causing 'No such tool available: proxy_Bash' errors.
Root cause: the _disableToolPrefix condition only disabled the prefix
for non-Claude providers, but it should be disabled for ALL providers
in the Claude passthrough path since tools are already in Claude format.
Fixes #618
- Fix Ollama Cloud base URL from api.ollama.com to ollama.com/v1/chat/completions
- Fix Ollama Cloud models URL to ollama.com/api/tags
- Add gemini-3.1-pro-preview and gemini-3.1-flash-lite-preview to Antigravity provider
Closes #643, closes #645
Thanks @brendandebeasi for this excellent contribution! 🎉 The bounded retry with exponential backoff is exactly the right approach for expired connections. Merged and will be included in v3.1.1.
Connections marked as 'expired' were permanently skipped by the health check scheduler (line 176: if testStatus === "expired" return). A single transient refresh failure could permanently disable auto-refresh, requiring manual re-authentication.
Replace the hard skip with a bounded retry mechanism: up to 3 attempts with exponential backoff (5min, 10min, 20min). On success, the connection is fully restored to active. On exhaustion, it remains expired (same as before). The existing circuit breaker (5 failures → 30min pause) provides additional protection.
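The retry schedule described above (5, 10, then 20 minutes, at most 3 attempts) can be expressed as a small pure function. The constant and function names are hypothetical:

```typescript
const MAX_EXPIRED_RETRIES = 3;
const BASE_RETRY_DELAY_MS = 5 * 60 * 1000; // 5 minutes

// Returns the wait before the next retry, or null once the attempts are exhausted.
function nextExpiredRetryDelayMs(attemptsSoFar: number): number | null {
  if (attemptsSoFar >= MAX_EXPIRED_RETRIES) return null;
  return BASE_RETRY_DELAY_MS * 2 ** attemptsSoFar;
}
```

A scheduler would keep the connection in the expired state when this returns null.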
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
OpenAI-compatible clients (OpenCode, etc.) check capabilities/input_modalities fields on the /v1/models response to determine if a model supports image input. OmniRoute was not emitting these fields, causing clients to assume text-only for all models routed through the proxy.
Add keyword-based vision detection (matching the existing playground heuristic) that annotates model entries with capabilities:{vision:true}, input_modalities:["text","image"], and output_modalities:["text"] for known multimodal models (GPT-4o/4-turbo, Claude 3+, Gemini, Pixtral, Qwen-VL, etc.).
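A rough sketch of keyword-based annotation. The keyword list here is illustrative and much shorter than whatever the playground heuristic uses; `annotateVision` is a hypothetical name:

```typescript
interface ModelEntry {
  id: string;
  capabilities?: { vision: boolean };
  input_modalities?: string[];
  output_modalities?: string[];
}

// Illustrative keywords only; the real heuristic may match more model families.
const VISION_KEYWORDS = ["gpt-4o", "gpt-4-turbo", "claude-3", "gemini", "pixtral", "qwen-vl"];

function annotateVision(model: ModelEntry): ModelEntry {
  const id = model.id.toLowerCase();
  if (!VISION_KEYWORDS.some((keyword) => id.includes(keyword))) return model;
  return {
    ...model,
    capabilities: { vision: true },
    input_modalities: ["text", "image"],
    output_modalities: ["text"],
  };
}
```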
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Rename in.json → hi.json: 'in' is the legacy code for Indonesian (now 'id'), while Hindi is 'hi' (ISO 639-1).
Fixes Weblate locale conflict where id.json and in.json both claimed Indonesian.
- Move empty tool name filter before Codex passthrough: nativeCodexPassthrough
skipped all input sanitization, causing 400 'empty tool name' from upstream.
- Collapse 3+ consecutive newlines to \n\n in response sanitizer: thinking
models accumulate excessive line breaks between tool call blocks.
- OpenAI-to-Claude translator now maps reasoning_effort (low/medium/high/max)
to Claude's thinking.budget_tokens. Fixes clients like OpenCode sending
reasoning_effort via @ai-sdk/openai-compatible losing thinking configuration.
- Ensures max_tokens > budget_tokens for all thinking configs.
- Token health check now proactively refreshes tokens within 5 min of expiry,
regardless of the configured health check interval — addresses Qwen OAuth
token refresh failures between scheduled checks.
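The reasoning_effort mapping in the list above could look roughly like this. The low/medium/high budgets are placeholder values, and raising max_tokens (rather than lowering the budget) is an assumption about how the constraint is enforced:

```typescript
interface ClaudeThinkingConfig {
  max_tokens: number;
  thinking?: { type: "enabled"; budget_tokens: number };
}

// Placeholder budgets for low/medium/high; only the general shape is intended.
const EFFORT_BUDGETS: Record<string, number> = {
  low: 1024,
  medium: 8192,
  high: 32768,
  max: 131072,
};

function toClaudeThinking(effort: string | undefined, maxTokens: number): ClaudeThinkingConfig {
  const budget = effort === undefined ? undefined : EFFORT_BUDGETS[effort];
  if (budget === undefined) return { max_tokens: maxTokens };
  return {
    // Keep max_tokens strictly greater than budget_tokens.
    max_tokens: Math.max(maxTokens, budget + 1),
    thinking: { type: "enabled", budget_tokens: budget },
  };
}
```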
Adds structured YAML-based issue templates to improve issue quality.
Bug reports require version, install method, OS, repro steps, and
expected/actual behavior. Feature requests require use case and
proposed solution. Blank issues are still allowed for edge cases.
Adds a button next to the Auto-Sync toggle to clear all custom models
for a provider. Extends DELETE /api/provider-models to support ?all=true
parameter for bulk deletion.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Handle reasoning_details[] array (StepFun/OpenRouter format) in sanitizer and translator
- Handle 'reasoning' field alias → reasoning_content in streaming and non-streaming paths
- Cross-map input_tokens/output_tokens ↔ prompt_tokens/completion_tokens in filterUsageForFormat
- Fix extractUsage to accept input_tokens/output_tokens as alternative field names
- All 936 tests pass
- Add Claude Sonnet 4.5, Claude Sonnet 4, GPT 5, GPT 5 Mini
- Enable passthroughModels: true so users can access any model
Antigravity supports without waiting for registry updates
- Set clientSecretDefault for Antigravity provider (was empty, causing
'client_secret is missing' on token refresh for npm users)
- Add modelsUrl to opencode-zen registry for 'Import from /models'
Rewrote the account selector with a simpler, reliable approach:
- Fetch ALL connections once at startup (not per-provider)
- Filter by selectedProvider using ALIAS_TO_ID mapping
- Account/Key dropdown always visible when provider selected
- Shows 'Auto (N accounts)' default or individual account names
- Works for both OAuth accounts and API key providers
Import ALIAS_TO_ID mapping and resolve provider aliases (cx→codex,
kr→kiro, etc.) in loadConnections before filtering connections from
the API. The /v1/models endpoint returns alias-prefixed model IDs
but /api/providers/client returns provider IDs.
Playground:
- loadConnections() was parsing wrong API response shape (expected
providers[].connections[] but API returns flat connections[])
- Account selector now shows for any provider with ≥1 connection
- Uses conn.email as name fallback for OAuth providers
CLI Tools:
- getAllAvailableModels() now also fetches from /v1/models API
- Dynamic models supplement static PROVIDER_MODELS definitions
- Fixes providers like Kiro, OpenCode Zen showing 0 models
After stripping <antThinking>/<thinking> tags from streaming responses, the
surrounding newlines were left as artifacts (e.g. \n\n\n\n). Now collapses 3+
consecutive newlines to double-newline after any tag removal.
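The collapse step amounts to a single replacement. A sketch, with a hypothetical helper name:

```typescript
// Collapses any run of three or more newlines into exactly two.
function collapseExcessNewlines(text: string): string {
  return text.replace(/\n{3,}/g, "\n\n");
}
```

Paragraph breaks (two newlines) are left as they are, so ordinary formatting is not affected.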
Also fixes PR #625 merge (Provider Limits light mode background).
The proxy test button in Settings was always failing with 'Socks5 Authentication
failed' because the frontend sent redacted credentials (***) from listProxies().
The backend received '***' as the password and tried to authenticate with it.
Fix: Frontend now sends proxyId in the test request body. The test endpoint
looks up the proxy from the DB with includeSecrets: true and uses the real
stored credentials for the SOCKS5 handshake.
Also: removed username/password from the frontend test payload since they
are always redacted and useless for testing.
Root cause: SOCKS5 proxies accept TCP connections (pass health check) but
can't relay HTTPS traffic. getCodexUsage() catches fetch errors internally
and returns {message: 'Failed to fetch...'} instead of throwing, so the
previous catch-based fallback never triggered.
Fix: After the initial proxied fetch, check the returned usage object for
network error indicators. If a proxy was active and the result contains
'fetch failed' / 'ECONNREFUSED' / etc., retry the entire operation
(credential refresh + usage fetch) without proxy context.
This is safe because usage fetching is read-only — showing limits data
without proxy is better than showing nothing.
bg-bg-subtle (#f0f0f5) appears gray against the page background in
light mode. Changed to bg-surface (#ffffff) for consistency with other
Card-based UI sections.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix: Limits usage fetch wraps BOTH token refresh and usage call inside proxy context (fixes SOCKS5 Codex accounts)
- Fix: CI integration test v1/models gracefully handles empty models list
- Fix: Settings proxy test button results now render with priority over health data
- Feat: Playground account selector dropdown for testing specific connections
- Merge: PR #623 LongCat API base URL path correction
The sanitize TransformStream (commit 5a8c644) shared the same TextDecoder
instance with the upstream transform stream. This corrupted UTF-8 state
when decoding SSE chunks, producing garbled output that broke clients
like openclaw that parse the stream.
- Use a separate TextDecoder for the sanitize stream
- Always decode→encode in sanitize (don't mix raw passthrough with decoded text)
- Add flush() handler to emit remaining buffered bytes
- Fix double-escaped regex (\\n → \n) for tag stripping
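Why the decoder must not be shared: a streaming `TextDecoder` keeps partial multi-byte sequences between calls, so two streams feeding the same instance interleave that state. The sketch below shows the per-stream pattern with a final flush; `decodeChunks` is a hypothetical helper, not the actual TransformStream:

```typescript
// Each stream gets its own decoder so buffered partial bytes are never mixed.
function decodeChunks(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder();
  let output = "";
  for (const chunk of chunks) {
    // stream: true holds back an incomplete trailing sequence until the next chunk.
    output += decoder.decode(chunk, { stream: true });
  }
  // Final call without arguments flushes anything still buffered.
  output += decoder.decode();
  return output;
}
```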
## Proxy UI Bug Fixes
- fix: proxy badge on connection cards now uses resolveProxyForConnection()
per-connection (covers registry + config-file assignments)
- fix: Test Connection button now works in 'saved' proxy mode by resolving
proxy config from savedProxies list
- fix: ProxyConfigModal now calls onClose() after save/clear (fixes UI freeze)
- fix: ProxyRegistryManager loads usage eagerly on mount with deduplication
by scope+scopeId to prevent double-counting; adds per-row Test button
## Connection Tag Grouping (new feature)
- feat: add Tag/Group field to EditConnectionModal (stored in
providerSpecificData.tag, no DB schema change)
- feat: connections list groups by tag with visual dividers when any account
has a tag; untagged accounts appear first without header
## Post-merge fix from PR #607 review
- fix: function_call blocks in translateNonStreamingResponse now also strip
Claude OAuth proxy_ prefix via toolNameMap (kilo-code-bot #607 warning)
Affects OpenAI Responses API format path — tool_use was fixed in PR #607
but function_call was missed
- fix(translator): pass toolNameMap to translateNonStreamingResponse so Claude
OAuth proxy_ prefix is correctly stripped from tool_use block names in
non-streaming responses (was only stripped in streaming path)
- fix(validation): add LongCat specialty validator that probes /chat/completions
directly, bypassing the /v1/models endpoint that LongCat does not expose (#592)
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
- Replace hardcoded rgba(255,255,255,...) borders/backgrounds with theme-aware
CSS variables (--color-border, --color-bg-subtle) for proper light mode contrast
- Add dark: variants for hover states and progress bar backgrounds
- Fix Claude plan tier: try to extract actual plan from OAuth response instead
of hardcoding "Claude Code"
- Recognize provider names (Claude Code, Kimi Coding, Kiro) as non-plan-tier
values in normalizePlanTier() to avoid showing them as tier badges
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Video and music models had a special exemption for authType="none" providers
(comfyui, sdwebui), causing them to appear in the models list even without
any active provider connection. Now all model types consistently use
isProviderActive() filtering, matching the behavior of image models.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Provider field now shows the connection name (e.g. "BltCy API").
The Protocol (sourceFormat) field shows "-" because model-sync is not
a chat/completion request.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use connection.name instead of the raw provider node ID
(e.g. "BltCy API" instead of "openai-compatible-chat-09fdb807-...")
in call logs and scheduler console output.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The autoSyncToggle was defined after the isCompatible early return,
so it never rendered for compatible provider types. Move the toggle
definition before the isCompatible branch so it appears for all
provider types including third-party OpenAI-compatible ones.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add POST /api/providers/[id]/sync-models endpoint that fetches models
from a provider's /models API and replaces the full custom models list,
preserving per-model compatibility overrides
- Rewrite modelSyncScheduler to dynamically discover connections with
autoSync enabled in providerSpecificData instead of a hardcoded list
- Add replaceCustomModels() to db/models.ts for full list replacement
while preserving existing compat flags
- Log each model sync operation to call_logs for visibility in the
Logs page
- Add Auto-Sync toggle button next to "Import from /models" in the
provider detail page UI
- Add en/zh-CN i18n translations for auto-sync strings
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace Image-based provider icons in ProviderOverviewCard with the same
ProviderIcon component used on the providers page (@lobehub/icons SVG
with PNG → generic fallback chain).
The <omniModel> tag was leaking into user-visible content when
context_cache_protection was enabled on a combo. The tag is an internal
marker for model pinning across conversation turns.
Fix: Add a second TransformStream pass (sanitize) that strips the tag
from SSE chunk content before delivery to the client. The tag is still
injected for round-trip context pinning but cleaned from visible output.
Also adds X-OmniRoute-Model response header as a cleaner metadata channel.
Closes #585
Changed the heuristic fallback for claude-* models from 'antigravity' to 'anthropic'
as the canonical provider. Users without Antigravity credentials were getting
'No credentials for provider: antigravity' errors when sending unprefixed
Claude model names like 'claude-sonnet-4-5'.
Closes #570
Add stripModelPrefix boolean setting that, when enabled, strips
provider prefixes (e.g. openai/, anthropic/) from incoming model
names and re-resolves the bare model name using existing heuristics.
This allows tools to send prefixed model names while OmniRoute
handles provider routing at the proxy layer.
- Add stripModelPrefix to settings validation schema (Zod)
- Check setting in getModelInfo() after custom node matching fails
- Falls through to normal resolution on error or when disabled
- Backward compatible: opt-in, default behavior unchanged
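The stripping step itself might look like the following sketch. `stripProviderPrefix` is a hypothetical name, and the re-resolution through the existing heuristics is not shown:

```typescript
// Removes a leading "provider/" segment when the setting is enabled.
function stripProviderPrefix(model: string, stripModelPrefix: boolean): string {
  if (!stripModelPrefix) return model;
  const slash = model.indexOf("/");
  // Leave the name alone if there is no prefix or nothing after the slash.
  if (slash <= 0 || slash === model.length - 1) return model;
  return model.slice(slash + 1);
}
```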
- Added MAX_TRANSCRIPTION_FILE_SIZE constant (4GB)
- Added formatFileSize() helper for human-readable display (KB/MB/GB)
- Frontend validation rejects files > 4GB with error message
- Changed label from 'Audio File' to 'Audio / Video File'
- Shows 'Supports audio and video files up to 4 GB' hint
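A sketch of the size helper and the limit check, assuming binary units (1 GB = 1024^3 bytes) and one decimal place; the actual rounding and unit base are not stated in the commit:

```typescript
const MAX_TRANSCRIPTION_FILE_SIZE = 4 * 1024 ** 3; // 4 GB, assuming binary units

function formatFileSize(bytes: number): string {
  if (bytes < 1024) return `${bytes} B`;
  let value = bytes;
  let unit = "B";
  for (const next of ["KB", "MB", "GB"]) {
    if (value < 1024) break;
    value /= 1024;
    unit = next;
  }
  return `${value.toFixed(1)} ${unit}`;
}

// Hypothetical validation helper mirroring the frontend check.
function isWithinTranscriptionLimit(bytes: number): boolean {
  return bytes <= MAX_TRANSCRIPTION_FILE_SIZE;
}
```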
- Add contextLength field to RegistryModel interface for per-model overrides
- Add defaultContextLength to RegistryEntry for provider-level defaults
- Set context lengths for major providers:
- Claude: 200k
- Codex: 400k (fixes combo context display)
- Gemini: 1M
- OpenAI: 128k
- GitHub Copilot: 128k
- Kiro/Cursor: 200k
- OpenCode: 200k
- Include context_length in /v1/models API response
- Add context_length field to combo schema for custom combo context
- Update contextManager to use registry defaults and support env overrides
- CONTEXT_LENGTH_<PROVIDER> for per-provider override
- CONTEXT_LENGTH_DEFAULT for global override
This allows clients like OpenClaw to display accurate context windows
for combo models instead of guessing based on model name patterns.
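One possible resolution order is sketched below. The precedence (provider env override, then per-model value, then provider default, then global env override, then a hard-coded fallback), the env key normalization, and the fallback value are all assumptions; the commit does not spell them out:

```typescript
const FALLBACK_CONTEXT_LENGTH = 128000; // hypothetical last-resort value

function parsePositiveInt(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const parsed = Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : undefined;
}

function resolveContextLength(
  provider: string,
  env: Record<string, string | undefined>,
  modelContextLength?: number,
  providerDefaultContextLength?: number
): number {
  // Assumed key format: CONTEXT_LENGTH_<PROVIDER>, upper-cased with non-alphanumerics as "_".
  const providerKey = `CONTEXT_LENGTH_${provider.toUpperCase().replace(/[^A-Z0-9]/g, "_")}`;
  return (
    parsePositiveInt(env[providerKey]) ??
    modelContextLength ??
    providerDefaultContextLength ??
    parsePositiveInt(env["CONTEXT_LENGTH_DEFAULT"]) ??
    FALLBACK_CONTEXT_LENGTH
  );
}
```

Passing the environment in as a plain object keeps the function easy to test without touching process state.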
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When users skip password setup during onboarding (either via 'Skip Password'
checkbox or 'Skip Wizard' button), the app now explicitly sets requireLogin=false.
Previously, requireLogin defaulted to true with no password hash stored,
leaving users permanently stuck on the login page.
Two code paths fixed in onboarding/page.tsx:
- handleSetPassword() with skipSecurity=true
- handleFinish() when no password was configured
- Reword comments that contained the token `any`; replace `any` types with typed shapes
- stream.ts: passthrough tool-call flag via local boolean (state is null in passthrough)
- Document T11 in zws_docs/ZWS_README_V8.md
Made-with: Cursor
Add model-pattern → combo mapping feature that automatically routes requests
to specific combos based on model name patterns (glob matching).
Implementation:
- New migration 010: model_combo_mappings table with pattern, combo_id, priority
- DB module with CRUD + resolveComboForModel() using glob-to-regex matching
- getComboForModel() in model.ts: augments getCombo() with pattern fallback
- chat.ts: replaced getCombo() → getComboForModel() at routing decision point
- API endpoints: GET/POST /api/model-combo-mappings, GET/PUT/DELETE by [id]
- ModelRoutingSection.tsx: dashboard UI with inline add/edit/toggle/delete
- Integrated into Combos page
- 15 new unit tests (glob matching, priority ordering, disabled filtering)
- Full test suite: 923/923 pass
Examples:
claude-sonnet* → code-combo
claude-*-opus* → frontier-combo
gpt-4o* → openai-combo
gemini-* → google-combo
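The glob-to-regex matching could be sketched as follows. Only `*` is treated as a wildcard, matching is case-sensitive, and a higher `priority` number is assumed to win; the real `resolveComboForModel()` may differ on all three points:

```typescript
interface ModelComboMapping {
  pattern: string;
  comboId: string;
  priority: number;
  enabled: boolean;
}

function globToRegex(pattern: string): RegExp {
  // Escape regex metacharacters except "*", then expand "*" to ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

function resolveComboForModel(model: string, mappings: ModelComboMapping[]): string | null {
  const candidates = mappings
    .filter((mapping) => mapping.enabled)
    .sort((a, b) => b.priority - a.priority); // assumption: higher number first
  for (const mapping of candidates) {
    if (globToRegex(mapping.pattern).test(model)) return mapping.comboId;
  }
  return null;
}
```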
Resolves: #563
Cherry-pick from codex/omniroute-fixes-20260324:
- Replace MCP singleton transport with per-session architecture for Streamable HTTP
- Fix Claude passthrough via OpenAI round-trip normalization
- Add detectFormatFromEndpoint() for endpoint-aware format detection
- Support raw code#state in OAuth modal for Claude Code remote auth
- Expose cloudConfigured/cloudUrl/machineId in settings API
- Switch docker-compose.prod.yml target to runner-cli
- Add 3 new tests for round-trip and detectFormat
PR: #562
- Implement keychain-based credential extractor for Zed IDE
- Support macOS (Keychain), Windows (Credential Manager), Linux (libsecret)
- Add API endpoint: POST /api/providers/zed/import
- Auto-discover OAuth tokens for OpenAI, Anthropic, Google, Mistral, xAI, etc.
- Cross-platform support via keytar library
- Complete documentation with security considerations
Closes community request from OmniRoute Telegram group.
Follows proven pattern used by VS Code, GitHub Copilot CLI, Claude Code.
CLI settings routes (codex-settings, droid-settings, kilo-settings) were
writing the masked API key string directly to config files when the
dashboard sent a keyId. Now resolves the real key from the database via
getApiKeyById() before writing, matching the pattern already implemented
in claude-settings, openclaw-settings, and cline-settings.
Closes #549
- thread-stream test fixtures (intentionally malformed) were being picked
up by Turbopack during production build, causing 111 compile errors
- IgnorePlugin excludes /test/ within thread-stream context
- thread-stream added to serverExternalPackages to prevent bundling
- /app removed: it is a stale npm-package prebuild artifact, not source code
T01 (P1): requested_model column in call_logs
- Migration 009_requested_model.sql: ALTER TABLE call_logs ADD COLUMN requested_model
- callLogs.ts: INSERT + SELECT updated to include requestedModel field
T02 (P1): Strip empty text blocks from nested tool_result.content
- New stripEmptyTextBlocks() recursive helper in openai-to-claude.ts
- Applied on tool_result content before forwarding to Anthropic
- Prevents 400 'text content blocks must be non-empty' errors
T03 (P1): Parse x-codex-5h-*/x-codex-7d-* headers for precise quota reset
- parseCodexQuotaHeaders() in codex.ts extracts usage/limit/resetAt
- getCodexResetTime() returns furthest-out reset timestamp for safe unblocking
T04 (P1): X-Session-Id header for external sticky routing
- extractExternalSessionId() in sessionManager.ts reads x-session-id,
x-omniroute-session, session-id headers with 'ext:' prefix to avoid collisions
T06 (P2): account_deactivated permanent expired status on 401
- ACCOUNT_DEACTIVATED_SIGNALS constant + isAccountDeactivated() in accountFallback.ts
- Returns 1-year cooldown (effectively permanent) to prevent retrying dead accounts
T07 (P2): X-Forwarded-For IP validation
- New src/lib/ipUtils.ts with extractClientIp() and getClientIpFromRequest()
- Skips 'unknown'/non-IP entries in X-Forwarded-For chain
T10 (P2): credits_exhausted distinct account status
- CREDITS_EXHAUSTED_SIGNALS + isCreditsExhausted() in accountFallback.ts
- Returns 1h cooldown with creditsExhausted flag, distinct from rate_limit 429
T11 (P1): max reasoning_effort -> budget_tokens: 131072
- EFFORT_BUDGETS and THINKING_LEVEL_MAP updated with max: 131072, xhigh: 131072
- Reverse mapping now returns 'max' for full-budget responses
- Unit test updated to expect 'max' (was 'high')
T12 (P3): Model pricing updates
- MiniMax M2.7 / MiniMax-M2.7 / minimax-m2.7-highspeed pricing added
T15 (P1): Array content normalization for system/tool messages
- normalizeContentToString() helper exported from openai-to-claude.ts
- System messages with array content now correctly collapsed to string
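As an illustration of T15, a content normalizer might collapse array content as follows. The part shape and the newline separator are assumptions; the exported helper may join parts differently.

```typescript
// Hypothetical content part shape (OpenAI-style).
type ContentPart = { type: string; text?: string };

// Collapse string-or-array message content into a single string.
// Non-text parts are ignored; text parts are joined with newlines (assumed).
function normalizeContentToString(
  content: string | ContentPart[] | null | undefined
): string {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return "";
  const texts: string[] = [];
  for (const part of content) {
    if (part.type === "text" && typeof part.text === "string") {
      texts.push(part.text);
    }
  }
  return texts.join("\n");
}
```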
Register the Puter provider as an OpenAI-compatible gateway that exposes
models from multiple vendors (GPT, Claude, Gemini, Grok, DeepSeek,
Qwen, Mistral, Llama) through a single REST endpoint.
- Create PuterExecutor with Bearer token authentication
- Add a providerRegistry entry with 40+ curated models
- Enable passthroughModels for access to the 500+ models in the catalog
- Register the alias "pu" for quick access
- Add provider metadata in shared/constants/providers.ts
- CHANGELOG: [3.0.0-rc.5] section now serves as full 'What's New vs v2.9.5':
* 2 new providers (OpenCode Zen/Go via PR #530)
* 3 new features: Registered Keys API (#464), provider icons (#529), model auto-sync (#488)
* 10 bug fixes (#521, #522, #524, #527, #532, #535, #536, #537, #489, #510, #492)
* 16 issues resolved total, DB migration 008
- README: added 'What's New in v3.0.0' table section after badges
Includes all commits from @kang-heewon's PR #530:
- OpencodeExecutor with multi-format routing
- opencode-zen + opencode-go registered in provider registry
- UI metadata added to providers.ts
- Unit tests for OpencodeExecutor (improved to avoid state coupling)
Cherry-picked from add-opencode-providers into 3.0.0-rc.
Conflicts resolved: executors/index.ts (merged pollinations+cloudflare-ai),
providerRegistry.ts (kept testKeyBaseUrl from rc.2 + PR's authType/models).
feat(ui): ProviderIcon component with @lobehub/icons + PNG fallback (#529)
- 130+ providers covered by Lobehub SVG components via LobehubErrorBoundary
- Falls back to existing /providers/{id}.png, then generic icon
- Replaces manual img state machine in ProviderCard + ApiKeyProviderCard
feat(scheduler): modelSyncScheduler — 24h model list auto-update (#488)
- Syncs 16 major providers every 24h (MODEL_SYNC_INTERVAL_HOURS configurable)
- Wired into POST /api/sync/initialize startup hook
fix(oauth): Gemini CLI — clear error when client_secret missing in Docker (#537)
fix(chat): convert tool_result content blocks to [Tool Result: id] text (#527)
- Previously, tool_result blocks in user messages were silently dropped
- This caused an infinite loop when Claude Code + superpowers routed to Codex:
Codex never received the tool response and kept re-requesting the tool
- Now: tool_result → text block '[Tool Result: {id}]\n{content}'
- Handles string, array-of-text, and JSON-serialized content types
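The conversion described for #527 can be sketched as below. The block types are assumptions based on the Anthropic tool_result shape, and the handling of each content type is a guess at what "string, array-of-text, and JSON-serialized" means in practice.

```typescript
type TextPart = { type: "text"; text: string };
type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: unknown };

// Convert a tool_result block into a plain text block so it is not dropped.
function toolResultToText(block: ToolResultBlock): TextPart {
  let body: string;
  if (typeof block.content === "string") {
    body = block.content;
  } else if (Array.isArray(block.content)) {
    // Array of text parts: keep the text of each part that has one.
    const texts: string[] = [];
    for (const part of block.content as unknown[]) {
      if (typeof part === "object" && part !== null) {
        const text = (part as { text?: unknown }).text;
        if (typeof text === "string") texts.push(text);
      }
    }
    body = texts.join("\n");
  } else {
    // Anything else is serialized as JSON (null when content is missing).
    body = JSON.stringify(block.content ?? null);
  }
  return { type: "text", text: `[Tool Result: ${block.tool_use_id}]\n${body}` };
}
```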
docs(issues): add Turbopack postinstall workaround on #509 and #508
docs(issues): note that #464 (API key provisioning) is on the v3.0 roadmap
fix(cli): normalize MSYS2/Git-Bash paths in cliRuntime.ts (#510)
- Add normalizeMsys2Path() helper: /c/Program Files/... → C:\Program Files\...
- Apply to both Windows 'where' and Unix 'command -v' path resolution
- Fixes 'CLI not detected' on Windows when running Git Bash / MSYS2
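A possible shape for the MSYS2 path normalization in #510, assuming the helper only rewrites paths that begin with a single drive letter segment. The real normalizeMsys2Path() in cliRuntime.ts may apply further checks.

```typescript
// Convert an MSYS2/Git-Bash style path (/c/Program Files/...) into a
// Windows path (C:\Program Files\...). Other paths are returned unchanged.
function normalizeMsys2Path(input: string): string {
  const match = /^\/([a-zA-Z])(\/.*)?$/.exec(input);
  if (!match) return input;
  const drive = (match[1] ?? "").toUpperCase();
  const rest = (match[2] ?? "/").replace(/\//g, "\\");
  return `${drive}:${rest}`;
}
```

A genuine Unix directory with a one-letter name (for example /a/bin) would also be rewritten by this sketch, so it only makes sense on the Windows code path.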
fix(cli-launcher): detect mise/nvm on server.js not found error (#492)
- Show targeted fix instructions based on which Node manager is in use
- mise users: told to use npx or mise exec
- nvm users: reminded to nvm use --lts before reinstalling
docs(issues): add pnpm bindings workaround comment (#520)
docs(issues): note OpenCode/Lobehub icons coming in v3.0.0 (#529)
fix(login): redirect to /dashboard/onboarding when API returns needsSetup:true (#521)
- Handle the case where user skips password setup and lands on login
- Instead of showing a cryptic error, redirect to onboarding flow
fix(api-manager): replace useless 'copy masked key' button with lock tooltip (#522)
- Copying a masked key (sk-proj123****abcd) is misleading and useless
- Show a lock icon on hover explaining key is only available at creation time
- Add i18n key 'keyOnlyAvailableAtCreation'
fix(opencode-go): use zen/v1 for API key validation, not zen/go/v1 (#532)
- Added testKeyBaseUrl field to RegistryEntry interface
- opencode-go: testKeyBaseUrl → zen/v1 (same key authenticates both tiers)
- validation.ts: resolveBaseUrl for key testing now prefers testKeyBaseUrl
fix(antigravity): return structured 422 error when projectId is missing (#489)
- Instead of throwing (crash), executor returns an OpenAI-format error JSON
- Client receives message with instruction to reconnect OAuth
- Prevents opaque 500 errors in the proxy logs
chore: close #525 (OmniRoute = 9router: same project, different name)
docs: add Docker password reset comment on #513 with INITIAL_PASSWORD workaround
- feat(providers): add OpenCode Zen and Go providers with multi-format executor (PR #530 by @kang-heewon)
- fix(embeddings): use provider node ID for custom embedding provider credential lookup (PR #528 by @jacob2826)
- fix(cli-tools): resolve real API key from DB (keyId) before writing to CLI config files (#523, #526)
- fix(combo): update CACHE_TAG_PATTERN to match literal \\n prefix/suffix around omniModel tag (#531)
- chore: bump version to 2.9.5 in package.json + docs/openapi.yaml
- docs: update CHANGELOG.md with v2.9.5 release notes
- Register OpencodeExecutor for 'opencode-zen' and 'opencode-go' in executors map
- Add OpencodeExecutor export in index.ts
- Add UI metadata for both providers in APIKEY_PROVIDERS:
- OpenCode Zen: https://opencode.ai/zen
- OpenCode Go: https://opencode.ai/zen/go
- Both use 'opencode' icon with #6366f1 color
fix(cli-tools): save real API key to CLI config files instead of masked string (#523, #526)
- claude-settings/route.ts: accept keyId, look up real key from DB (getApiKeyById)
- cline-settings/route.ts: same keyId resolution pattern
- openclaw-settings/route.ts: same keyId resolution pattern
- ClaudeToolCard.tsx: store key.id as selected value, send keyId in POST body
The /api/keys endpoint returns masked strings (first8+****+last4) which were being
written verbatim to ~/.claude/settings.json and similar config files, causing auth
failures on CLI tool launch.
fix(combo): update CACHE_TAG_PATTERN to strip surrounding \\n sequences (#531)
- comboAgentMiddleware.ts: non-global regex now matches literal \\n (backslash-n)
and actual newline U+000A that combo.ts injects around the <omniModel> tag.
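A sketch of a tag pattern that tolerates both forms of newline around the tag. The exact regex in comboAgentMiddleware.ts is not shown in this log, so the pattern and the helper names stripModelTag and readPinnedModel are assumptions.

```typescript
// Assumed pattern: any run of literal "\n" (backslash + n) or real newlines
// on either side of the tag. Non-global, so only the first tag is matched.
const CACHE_TAG_PATTERN = /(?:\\n|\n)*<omniModel>([^<]+)<\/omniModel>(?:\\n|\n)*/;

// Remove the tag and its surrounding newline sequences from a message.
function stripModelTag(text: string): string {
  return text.replace(CACHE_TAG_PATTERN, "");
}

// Read the pinned provider/model out of the tag, if present.
function readPinnedModel(text: string): string | null {
  const match = CACHE_TAG_PATTERN.exec(text);
  return match?.[1] ?? null;
}
```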
- fix(translator): preserve prompt_cache_key in Responses API translation (#517)
- fix(combo): escape \n in tagContent for valid JSON injection (#515)
- fix(usage): sync expired token status back to DB on live auth failure (#491)
- chore: bump version to 2.9.4 in package.json + docs/openapi.yaml
- docs: update CHANGELOG.md with v2.9.4 release notes
fix(translator): preserve prompt_cache_key when translating Responses API requests
(#517) — prompt_cache_key is an account-affinity signal used by Codex for
prompt cache routing. Deleting it from the translated request prevented full
cache effectiveness. Removed delete from openai-responses.ts and
responsesApiHelper.ts cleanup blocks.
fix(combo): escape \n in tagContent so injected JSON string is valid (#515)
— omniModel tag content used template literal newlines (U+000A) which produce
unescaped newline chars inside a JSON string value. Replaced with literal \n
escape sequences for valid JSON injection in streaming SSE content chunks.
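The difference between the two forms can be demonstrated with a small sketch. The placeholder-based injectTag helper and the chunk layout are hypothetical; only the newline behaviour is the point.

```typescript
// Splice tag content into an already-serialized JSON chunk at a placeholder.
// Using a function as the replacement avoids "$" pattern expansion.
function injectTag(chunkJson: string, placeholder: string, tagContent: string): string {
  return chunkJson.replace(placeholder, () => tagContent);
}

function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

const chunkTemplate = '{"choices":[{"delta":{"content":"__TAG__"}}]}';

// Raw U+000A inside a JSON string value is not allowed, so this chunk breaks.
const brokenChunk = injectTag(chunkTemplate, "__TAG__", "\n<omniModel>prov/model-x</omniModel>\n");

// Literal backslash-n is a valid JSON escape and decodes back to a newline.
const validChunk = injectTag(chunkTemplate, "__TAG__", "\\n<omniModel>prov/model-x</omniModel>\\n");
```

Building the object first and calling JSON.stringify on it would sidestep manual escaping entirely, where the pipeline allows it.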
- Version bumped from 2.9.2 → 2.9.3 in package.json + docs/openapi.yaml
- CHANGELOG.md updated with full release notes for 2.9.3
(5 new free providers, 2 metadata updates, 2 custom executors, docs)
upstreamErrorResponse() now guards against parsed.error being an
object (e.g. ElevenLabs { error: { message, status_code } }) instead
of blindly using it as the error message string.
Both audioSpeech.ts and audioTranscription.ts fixed.
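A guard of this kind might look like the following. The helper name extractErrorMessage and the fallback parameter are illustrative assumptions, not the actual upstreamErrorResponse() signature.

```typescript
// Pull a human-readable message out of a parsed upstream error body.
// Handles both { error: "text" } and { error: { message: "text" } }.
function extractErrorMessage(parsed: unknown, fallback: string): string {
  if (typeof parsed !== "object" || parsed === null) return fallback;
  const err = (parsed as { error?: unknown }).error;
  if (typeof err === "string") return err;
  if (typeof err === "object" && err !== null) {
    const message = (err as { message?: unknown }).message;
    if (typeof message === "string") return message;
  }
  return fallback;
}
```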
- Add resolveAudioContentType() to map video/* MIME to audio/* (fixes .mp4 uploads returning 'no speech detected')
- Add detect_language=true for Deepgram auto-language detection (fixes non-English audio)
- Add punctuate=true for better output quality
- Forward language form param to Deepgram when provided
- Apply same Content-Type fix to HuggingFace handler
The truthy check treated an explicit false the same as an unset value and deleted the property, so users could not disable normalization for a specific protocol while the top-level flag was true. Both true and false are now stored, consistent with preserveOpenAIDeveloperRole handling.
Made-with: Cursor
The word 'any' in a JSDoc comment was matched by the regex-based t11 checker. Reworded to 'prefixes' to eliminate the false positive.
Made-with: Cursor
Rewrite getMachineIdRaw() to use a try/catch waterfall instead of
process.platform conditionals. Next.js SWC bundler evaluates
process.platform at BUILD time, so when built on Linux, the win32
branch was dead-code-eliminated — causing 'head is not recognized'
errors on Windows.
New approach:
1. Try Windows REG.exe (existsSync check, not platform check)
2. Try macOS ioreg command
3. Try reading /etc/machine-id directly (no head/pipe)
4. Try hostname command
5. Fallback to os.hostname()
Also eliminates the patch-machine-id.cjs post-install workaround.
- #493: Fix custom provider model naming — removed incorrect prefix
stripping in DefaultExecutor.transformRequest() that broke org-scoped
model IDs like 'zai-org/GLM-5-FP8'
- #490: Enable context cache protection for streaming responses using
TransformStream to inject omniModel tag as final SSE content delta
before [DONE] marker
- #452: Add per-API-key request-count limits (max_requests_per_day,
max_requests_per_minute) with in-memory sliding window counter,
schema auto-migration, and Check 5 in enforceApiKeyPolicy()
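For the per-key request limits in #452, an in-memory sliding window counter could be sketched as follows. The class name and method are hypothetical, and the real implementation may store its state differently.

```typescript
// Tracks request timestamps per API key and rejects once the window is full.
class SlidingWindowCounter {
  private readonly windowMs: number;
  private readonly maxRequests: number;
  private readonly hits = new Map<string, number[]>();

  constructor(windowMs: number, maxRequests: number) {
    this.windowMs = windowMs;
    this.maxRequests = maxRequests;
  }

  // Returns true and records the request if the key is under its limit.
  tryAcquire(keyId: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(keyId) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.maxRequests;
    if (allowed) recent.push(now);
    this.hits.set(keyId, recent);
    return allowed;
  }
}
```

One instance per limit (for example a 60 000 ms window for max_requests_per_minute) keeps the two limits independent.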
- Replace execSync template string with execFileSync + args array on Windows
to prevent command injection via SystemRoot/windir environment variables
- Add optional chaining (?.) and nullish coalescing (?? "") on Windows
REG_SZ output parsing to prevent crash if REG.exe output is unexpected
- Add optional chaining on macOS IOPlatformUUID parsing for the same reason
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace 'any other path' with 'all other paths' in translator comment to avoid false match by the \bany\b regex in check-t11-any-budget
- Scope e2e error locator to dialog and use .first() to prevent Playwright strict-mode violations from broad page-level selectors
- Fix fallback logic: treat dialog-still-open as validation success signal
Made-with: Cursor
- Use globalThis singleton guards for DB connection, HealthCheck timers, console interceptor, and graceful shutdown to survive Webpack HMR re-evaluation (fixes 485+ leaked DB connections per session)
- Split instrumentation.ts into instrumentation-node.ts with computed import path to prevent Turbopack Edge bundler from tracing Node.js modules (eliminates 10+ spurious warnings per hot compile)
- Parallelize startup imports in instrumentation-node.ts (3 batch Promise.all instead of 9 serial awaits)
- Add OMNIROUTE_USE_TURBOPACK=1 env switch in run-next.mjs (default behavior unchanged)
- Replace node:crypto with crypto in proxies.ts and errorResponse.ts to fix UnhandledSchemeError
- Add unlinkFileWithRetry with EBUSY/EPERM retry for Windows file handle timing in backup restore
- Fix pre-restore backup to await completion before closing DB
- Fix bootstrap-env, domain-persistence, and fixes-p1 test stability on Windows
Made-with: Cursor
Problem:
node-machine-id constructs the REG.exe command path at module load time
using process.platform. When Next.js bundles this module, process.platform
is "" (not "win32") in the webpack/build context, so the lookup returns
undefined and bakes "undefined\REG.exe ..." permanently into the compiled
chunk. At runtime on Windows this causes:
Error: Command failed: undefined\REG.exe QUERY HKEY_LOCAL_MACHINE\...
The system cannot find the path specified.
Fix:
Remove the node-machine-id dependency from machineId.ts and replace it
with a direct execSync implementation that resolves process.env.SystemRoot
at call time (not load time), so the correct Windows path is always used
regardless of when or how the module was bundled.
Platform support is preserved for Windows, macOS, and Linux/FreeBSD using
the same underlying OS queries that node-machine-id used internally.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Address bot review feedback: use .finally() instead of .then()/.catch()
so limiters.delete() runs regardless of whether stop() succeeds or
throws (e.g. already stopped by concurrent 429).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Multiple concurrent requests can receive 429 simultaneously, causing
stop() to be called on an already-stopped limiter. Add .catch() to
prevent unhandled rejection.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a provider returns 429 (rate limit exceeded), the rate limit manager
was setting reservoir=0 and waiting for reservoirRefreshInterval before
releasing queued requests. For providers with long rate limit windows
(e.g. Codex with hours-long resets), this caused all queued requests to
hang indefinitely — they never timed out or returned an error.
This prevented upstream callers (e.g. LiteLLM) from triggering fallback
to alternative providers, effectively making the entire model unavailable
until the rate limit window expired.
Fix: on 429, call limiter.stop({ dropWaitingJobs: true }) to immediately
fail all queued requests, then delete the limiter from the Map so
getLimiter() creates a fresh instance for subsequent requests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
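The three rate-limit commits above combine into roughly the pattern below. The StoppableLimiter interface is a stand-in for the limiter library's object, and dropQueuedAndReset is a hypothetical name; only stop({ dropWaitingJobs: true }) and the Map deletion come from the commit text.

```typescript
// Minimal stand-in for the limiter API used in the commit message.
interface StoppableLimiter {
  stop(options: { dropWaitingJobs: boolean }): Promise<void>;
}

const limiters = new Map<string, StoppableLimiter>();

// On a 429: fail queued jobs immediately, then forget the limiter so the
// next request gets a fresh instance.
function dropQueuedAndReset(key: string): Promise<void> {
  const limiter = limiters.get(key);
  if (!limiter) return Promise.resolve();
  return limiter
    .stop({ dropWaitingJobs: true })
    .catch(() => undefined) // already stopped by a concurrent 429
    .finally(() => {
      limiters.delete(key); // runs whether stop() resolved or rejected
    });
}
```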
- Add preserveDeveloperRole option and model compat override
- Normalize developer→system in roleNormalizer when not preserving
- Translator runs normalizeRoles for Responses API with option
- UI: ModelCompatPopover with a "do not preserve developer role" toggle
- Add ZWS_README_V2 documenting cause and fix
Made-with: Cursor
The Bailian Coding Plan provider page may render a dialog on load
that blocks pointer events on the Add API Key button. Add pre-dialog
dismissal (Escape key) before attempting to click.
Also triages #485 (Claude Code tool calls — needs-info).
- #462: Mark gemini-cli provider as deprecated in providers.ts
Add deprecated, deprecationReason, hasFree, freeNote, authHint, apiHint
to Zod provider schema
- #471: Add VM_DEPLOYMENT_GUIDE.md to DOC_SOURCE_FILES in generate-multilang.mjs
Delete 29 stale PT-language copies and regenerate from EN source
for all 30 locales (29 auto-translated + 1 Czech from PR #482)
formatSSE() in streamHelpers.ts explicitly returned 'data: null' for
null/undefined data. This violates SSE protocol and causes
AI_TypeValidationError in strict clients (Zod-based AI SDKs).
Now returns empty string, silently skipping null chunks.
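In outline, the corrected behaviour is the following. This is a simplified sketch; the real formatSSE() in streamHelpers.ts likely handles more cases than shown here.

```typescript
// Serialize one SSE data event; null/undefined chunks produce no output
// instead of an invalid "data: null" line.
function formatSSE(data: unknown): string {
  if (data === null || data === undefined) return "";
  return `data: ${JSON.stringify(data)}\n\n`;
}
```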
- New API: /api/logs/export?hours=24&type=call-logs
- UI: Export button with dropdown on /dashboard/logs page
- Supports export of request-logs, proxy-logs, and call-logs
- Downloads as JSON file with Content-Disposition header
Previously resolveModelAlias() output was used only for getModelTargetFormat()
but the original model was sent in translatedBody.model and to the executor.
Now effectiveModel is propagated to all downstream operations.
saveCallLog only read prompt_tokens/completion_tokens (OpenAI format).
When sourceFormat=claude, the openai-to-claude translator writes
input_tokens/output_tokens instead, causing all cross-format requests
(Codex-via-Claude, Kiro-via-Claude, etc.) to show 0|0 tokens in
call_logs.
Also includes cache_read and cache_creation tokens in tokens_in total
so heavily-cached requests don't show misleadingly low input counts.
Changes:
- Read prompt_tokens || input_tokens (supports both formats)
- Read completion_tokens || output_tokens (supports both formats)
- Sum cache_read_input_tokens + cache_creation_input_tokens into total
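The three changes listed above amount to something like this sketch. The function name resolveTokenCounts and the return shape are assumptions; the field names are the ones quoted in the commit.

```typescript
type Usage = {
  prompt_tokens?: number;
  completion_tokens?: number;
  input_tokens?: number;
  output_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
};

// Read token counts from either OpenAI-style or Claude-style usage objects,
// folding cached input tokens into the input total.
function resolveTokenCounts(usage: Usage): { tokensIn: number; tokensOut: number } {
  const baseIn = usage.prompt_tokens || usage.input_tokens || 0;
  const cached =
    (usage.cache_read_input_tokens || 0) + (usage.cache_creation_input_tokens || 0);
  const tokensOut = usage.completion_tokens || usage.output_tokens || 0;
  return { tokensIn: baseIn + cached, tokensOut };
}
```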
logUsage stored only non-cached input tokens in usage_history.tokens_input.
For heavily-cached Claude requests (common with Claude Code), this shows
near-zero input when the real total is 150K+, causing the analytics
dashboard to severely underreport input token usage.
Now sums: input = prompt_tokens + cache_read + cache_creation
chatCore.ts injects translatedBody.model for all providers after
translation. Kiro API (AWS CodeWhisperer) has strict schema validation
and rejects unknown top-level fields — only conversationState, profileArn,
and inferenceConfig are valid. This causes 100% of Kiro requests to fail
with "Improperly formed request".
Strip the injected model field in KiroExecutor.transformRequest().
* feat: add api-key Kimi Coding provider support
* fix(kimi-coding): honor apikey auth header in executor
Ensure DefaultExecutor sends x-api-key for kimi-coding-apikey at runtime
and deduplicate shared kimi coding config blocks in registry and models
config to reduce drift between oauth and apikey variants.
---------
Co-authored-by: OmniRoute Agent <agent@omniroute.local>
- fix(budget): BudgetTab sent integer percentage (80) but schema validated
fraction (0-1). Now divides by 100 on POST and multiplies by 100 on GET (#451)
- fix(combos): expose Agent Features UI in combo create/edit modal — fields for
system_message override, tool_filter_regex, and context_cache_protection were
implemented server-side (#399/#401) but missing from the dashboard UI (#454)
- fix(combos): strip <omniModel> tags from messages before forwarding to provider.
The internal cache-pinning tag was being sent to the provider, causing cache
misses as providers treated each tagged request as a new session (#454)
- fix(docker): copy pino-abstract-transport + pino-pretty in standalone (#449)
- fix(responses): remove initTranslators() from /v1/responses route (#450)
- chore(deps): commit package-lock.json with each version bump
- fix(docker): copy pino-abstract-transport and pino-pretty explicitly in
runner-base stage — Next.js standalone trace omits them, causing
'Cannot find module pino-abstract-transport' crash on startup (#449)
- fix(responses): remove initTranslators() call from /v1/responses route —
bootstrapping translator registry from a Next.js Route Handler worker
caused 'the worker has exited' uncaughtException on Codex CLI requests.
Translators are already bootstrapped server-side via open-sse (#450)
- chore: include package-lock.json in commit (was being left behind on
version bumps, causing npm ci to install inconsistent deps in Docker)
- fix(ux): add default password hint on login page for first-time users (#437)
The fallback password (123456) is now shown as a hint below the
password input so users don't get locked out during initial setup.
- fix(cli): add shell:true to spawn on Windows so .cmd wrappers are
resolved correctly via PATHEXT (#447). Claude, opencode, and other
npm-installed CLIs show as 'not runnable' on Windows even when
installed because spawn() cannot find .cmd files without shell:true.
- i18n: add defaultPasswordHint key to en.json auth namespace
Keep search provider validation responses consistent with other validators so Serper regression tests and CI assertions can rely on unsupported=false.
Made-with: Cursor
Normalize GitHub Copilot account tiers from the usage payload and hide misleading unlimited buckets so account type and limits render correctly in the dashboard.
Made-with: Cursor
Search Playground (Phase 1):
- Web Search as 10th endpoint in Playground with isolated SearchPlayground component
- Endpoint selector moved first; Provider/Model/Send hidden when search selected
- Provider dropdown via GET /api/search/providers, formatted results with cache indicator
Search Tools page (Phase 2) at /dashboard/search-tools:
- Split panel: SearchForm (left) with query, provider, filters + ResultsPanel (right)
- Compare Providers: parallel queries with latency, cost, response size, URL overlap
- Rerank Pipeline: model selector from /v1/models, results with position delta
- Search History: last 10 searches from call_logs with replay
- Sidebar entry under Debug section
Backend:
- GET /api/search/providers — list providers with auth guard + SEARCH_CREDENTIAL_FALLBACKS
- GET /api/search/stats — cache stats, provider aggregates, recent searches (auth guard)
- Add local provider_nodes routing for /v1/rerank (oMLX, vLLM support)
Bug fixes (from F-27 PR #432):
- Fix Brave news normalizer: data.results directly, not data.news.results
- Enforce max_results truncation after normalization for all providers
- Fix EndpointPageClient: use /api/search/providers instead of /api/v1/search
- Add isAuthenticated() guards on /api/search/providers and /api/search/stats
Response size metric in results meta bar and compare table.
i18n: 30+ keys in search namespace (en.json)
New in v2.7.0: pluggable RouterStrategy, multilingual intent detection,
request deduplication, new providers (Grok-4 Fast, GLM-5/Z.AI,
MiniMax M2.5, Kimi K2.5). Native translations for de/es/fr/it/ru/zh-CN/ja/ko/ar/pt-BR/pt.
npm version patch run BEFORE staging files — this is an ATOMIC commit.
Adds Strategy 1.5 to scripts/postinstall.mjs:
- Uses @mapbox/node-pre-gyp install --fallback-to-build=false
(bundled within better-sqlite3) to download the correct prebuilt
binary for the current OS/arch (win32-x64/arm64, darwin-x64/arm64)
WITHOUT requiring node-gyp, Python, or MSVC build tools.
- Tries node-pre-gyp.cmd (Windows) or node-pre-gyp (Unix) from .bin/
with fallback to direct path in @mapbox/node-pre-gyp/bin/
- Falls back to npm rebuild only if prebuilt download fails.
- Windows-specific error: shows Option A (npx node-pre-gyp) and
Option B (rebuild) with Visual Studio Build Tools links.
Fixes: #426 (better_sqlite3.node is not a valid Win32 application)
Includes version bump — v2.6.9 — committed ATOMICALLY with all changes:
fixes:
- fix(ci/t11): Remove 'any' from comments in openai-responses.ts + chatCore.ts
(\bany\b regex counted comment text as explicit any violations)
- fix(chatCore/#409): Normalize unsupported content part types before forwarding
Cursor sends {type:'file'} for .md attachments; Copilot/OpenAI providers reject
with 'type has to be either image_url or text'. Now: file/document→text block,
unknown types dropped with debug log. Fixes claude-* models via github-copilot.
workflow:
- chore(generate-release): ATOMIC COMMIT RULE — npm version patch MUST run before
feature commits so the release tag always points to a commit with full changes
DB Migrations (zero-breaking, ADD COLUMN DEFAULT NULL + new table):
- 005_combo_agent_fields.sql: system_message, tool_filter_regex, context_cache_protection on combos
- 006_detailed_request_logs.sql: ring-buffer table (500 entries) for full pipeline body capture
Features:
- #399 System Message Override + Tool Filter Regex per Combo
- applyComboAgentMiddleware() injected into handleComboChat/handleRoundRobinCombo
- Supports both OpenAI and Anthropic tool name formats
- #401 Context Caching Protection (Stateless)
- injectModelTag() appends <omniModel>provider/model</omniModel> to responses
- extractPinnedModel() reads tag from history and pins model for session
- #320 Auto-Update via Settings
- GET /api/system/version — current vs latest npm
- POST /api/system/update — fire-and-forget npm install + pm2 restart
- #378 Detailed Request Logs
- saveRequestDetailLog() captures bodies at 4 pipeline stages (opt-in toggle)
- GET/POST /api/logs/detail — list logs + enable/disable toggle
- #336 MITM Kiro IDE
- src/mitm/targets/kiro.ts: MitmTarget profile for api.anthropic.com interception
Audio endpoints (/v1/audio/speech and /v1/audio/transcriptions) only
supported hardcoded providers from audioRegistry.ts. Local inference
backends configured as provider_nodes (e.g., MLX-Audio, oMLX) could
not serve audio through OmniRoute.
This adds a Phase 3 fallback in the audio model parser that consults
provider_nodes from the database. Local providers with api_type=openai
are automatically available for audio routing via their prefix
(e.g., mlx-audio/tts-model, omlx/whisper-large-v3-turbo).
Design: injection pattern — Next.js route handlers load provider_nodes
(async DB query) and pass them to the sync parser as a parameter.
No cross-workspace imports, no breaking changes to existing parsers.
Changes:
- Add buildDynamicAudioProvider() in audioRegistry.ts
- Add Phase 3 (provider_nodes prefix match) to parseAudioModel()
- Extend parseSpeechModel/parseTranscriptionModel with optional
dynamicProviders parameter (backward compatible)
- Load and inject provider_nodes in speech/transcription route handlers
- Dynamic providers use authType=none (local, no credentials needed)
Claude OAuth tokens are short-lived and require refresh. The runtime
HealthCheck (open-sse) already refreshes them successfully, but the
Dashboard test endpoint was missing `refreshable: true` in its config.
This caused the Dashboard to show "auth failed / Token expired" for
Claude providers even though the tokens were being refreshed correctly
at runtime. The codex provider already had this flag set.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Anthropic API rejects requests containing {"type":"text","text":""} with
400 "text content blocks must be non-empty". Some clients like LiteLLM
passthrough and @ai-sdk/anthropic may forward empty text blocks as-is.
Filter out empty text content blocks from messages before calling
translateRequest, similar to how empty-name tools are already stripped.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- gemini/gemini-cli: removed gemini-3.1-pro/flash/preview (don't exist in Google API v1beta),
replaced with real models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-*
- antigravity: removed gemini-3.1-pro-high/low and gemini-3-flash (internal aliases invalid),
replaced with gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
- github: removed gemini-3-flash-preview and gemini-3-pro-preview, replaced with gemini-2.5-flash
- nvidia: corrected 'nvidia/llama-3.3-70b-instruct' to 'meta/llama-3.3-70b-instruct'
(NVIDIA NIM uses meta/ namespace, not nvidia/ namespace for Meta models)
- nvidia: added meta/llama-3.1-70b-instruct and nvidia/llama-3.1-405b-instruct
Also fixed free-stack combo on .15 DB:
- removed qw/qwen3-coder-plus (qwen provider has expired refresh token)
- corrected nvidia/llama-3.3-70b-instruct → nvidia/meta/llama-3.3-70b-instruct
- corrected gemini/gemini-3.1-flash → gemini/gemini-2.5-flash
- added if/deepseek-v3.2 as replacement for qw/qwen3-coder-plus
Local inference backends (oMLX, Ollama, LM Studio) configured as
provider_nodes have no health monitoring. When a local provider is
down, OmniRoute waits the full timeout before failing.
This adds a background health check that polls local provider_nodes:
- GET /models with 5s timeout for each local node (localhost only)
- In-memory health cache (no DB migration needed)
- Promise.allSettled for parallel checks (one slow node doesn't block)
- Exponential backoff on failures: 30s → 60s → 120s → 300s max
- Reset to 30s on first success after failure
- State transition logging (healthy ↔ unhealthy)
- Expose health status via GET /api/monitoring/health (localProviders)
- Auto-init on first import (same pattern as tokenHealthCheck)
- 401 treated as healthy (server up, auth required)
- isNodeHealthy() returns true if never checked (optimistic default)
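The polling schedule in the list above can be expressed as a small lookup. The step values come from the commit text; the function name and the use of a table rather than a doubling formula are assumptions.

```typescript
// Poll interval per consecutive failure count: 30s, 60s, 120s, then 300s max.
const BACKOFF_STEPS_MS: readonly number[] = [30000, 60000, 120000, 300000];

// Zero failures (or a success after failures) resets to the first step.
function nextPollIntervalMs(consecutiveFailures: number): number {
  const lastIndex = BACKOFF_STEPS_MS.length - 1;
  const index = Math.min(Math.max(consecutiveFailures, 0), lastIndex);
  return BACKOFF_STEPS_MS[index] ?? 300000;
}
```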
Embedding endpoint (/v1/embeddings) only supports 6 hardcoded cloud
providers. Local inference backends (oMLX, Ollama) serving embeddings
via provider_nodes are inaccessible through OmniRoute.
This adds dynamic provider_node support for embeddings:
- Add EmbeddingProvider interface and buildDynamicEmbeddingProvider()
- Add Phase 2 (provider_nodes prefix match) in parseEmbeddingModel()
- Handler accepts resolvedProvider/resolvedModel from route (injection pattern)
- Handler supports authType=none for local providers (was missing — critical gap)
- Route loads local provider_nodes (localhost only — prevents auth bypass/SSRF)
- Route filters by apiType=chat|responses and localhost hostname
- buildDynamicEmbeddingProvider validates inputs (prefix + baseUrl required)
- Per-node try/catch in map — one bad row doesn't block all providers
- DB errors logged and fall back to hardcoded providers
Use shallow copy ({ ...body }) instead of direct reference assignment
so that later translatedBody.model = model does not mutate the
caller's original body object.
When both source and target formats are Claude, skip all request
modification and forward the body untouched. This prevents
prepareClaudeRequest from corrupting valid Claude-native requests
destined for anthropic-compatible provider nodes.
When Claude Code compacts conversation context to fit within token
limits, it may remove assistant messages containing tool_use/tool_calls
while leaving the corresponding tool_result/function_call_output
messages intact. This creates orphaned tool results that cause
providers to reject requests with errors like "tool result's tool id
not found" or "No tool call found for function call output".
Prevents infinite retry loops when models generate tool calls with
empty function names. The normalizeToolName function converted these
to "placeholder_tool" which does not exist in any client's tool
registry, causing repeated error-retry cycles.
Reasoning models (o1, o1-pro, o3, o3-mini) reject standard parameters
like temperature and top_p with 400 Bad Request. OmniRoute's default
executor forwards all parameters without filtering.
This fix adds declarative parameter filtering:
- Add unsupportedParams[] field to RegistryModel interface
- Add REASONING_UNSUPPORTED frozen constant shared across entries
- Add o1-pro, o3, o3-mini to OpenAI registry (were missing)
- Add getUnsupportedParams() helper with:
- O(1) precomputed map lookup (not O(N×M) scan)
- Cross-provider routing support via precomputed map
- Prefixed model ID support (e.g., "openai/o3" → "o3")
- Strip unsupported params in chatCore.ts before executor call
- Use Object.hasOwn() for safe property check (no prototype chain)
- Log stripped params at WARN level for visibility
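A reduced sketch of the declarative filtering described above. The map contents and function bodies are illustrative; the real registry precomputes its map from RegistryModel entries. Object.prototype.hasOwnProperty.call is used here in place of Object.hasOwn() only so the sketch compiles on older targets.

```typescript
// Parameters the commit names as rejected by reasoning models.
const REASONING_UNSUPPORTED: readonly string[] = Object.freeze(["temperature", "top_p"]);

// Precomputed model -> unsupported params lookup (illustrative entries).
const UNSUPPORTED_BY_MODEL = new Map<string, readonly string[]>([
  ["o1", REASONING_UNSUPPORTED],
  ["o1-pro", REASONING_UNSUPPORTED],
  ["o3", REASONING_UNSUPPORTED],
  ["o3-mini", REASONING_UNSUPPORTED],
]);

// Accepts bare ("o3") and prefixed ("openai/o3") model IDs.
function getUnsupportedParams(modelId: string): readonly string[] {
  const bare = modelId.slice(modelId.lastIndexOf("/") + 1);
  return UNSUPPORTED_BY_MODEL.get(bare) ?? [];
}

// Delete unsupported own properties from the body; return what was removed
// so the caller can log it.
function stripUnsupportedParams(body: Record<string, unknown>, modelId: string): string[] {
  const stripped: string[] = [];
  for (const param of getUnsupportedParams(modelId)) {
    if (Object.prototype.hasOwnProperty.call(body, param)) {
      delete body[param];
      stripped.push(param);
    }
  }
  return stripped;
}
```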
- Remove apiKey===null heuristic (too broad — could match cloud providers
with non-standard auth). Use URL-based detection only.
- Guard local 404 branch with provider && model check — if either is null,
fall through to standard connection lockout (safer behavior).
- Document LOCAL_HOSTNAMES as module-load-time constant (restart required).
- Document PROVIDER_PROFILES.local as intentionally not yet wired.
When a local inference backend (oMLX, Ollama, LM Studio) returns 404
for an unknown model, OmniRoute previously locked the entire connection
for 2 minutes — blocking all valid models on that connection.
This fix introduces local provider detection and changes the 404
behavior for local providers:
- Model-only lockout (5s) instead of connection-level lockout (2min)
- Connection stays active — other models continue working immediately
- Detection via URL heuristic (localhost/127.0.0.1) + apiKey===null fallback
- Configurable via LOCAL_HOSTNAMES env var for Docker setups
Also fixes a pre-existing bug where the model parameter was not passed
to markAccountUnavailable() from chat.ts, preventing per-model lockouts
from working at all.
Changes:
- Add isLocalProvider(baseUrl) helper in providerRegistry.ts
- Add COOLDOWN_MS.notFoundLocal (5s) and PROVIDER_PROFILES.local
- Add local 404 branch in markAccountUnavailable() in auth.ts
- Pass model param to markAccountUnavailable() in chat.ts (bug fix)
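URL-based local detection might be sketched as below. Passing extra hostnames as a parameter stands in for the LOCAL_HOSTNAMES env var and is an assumption, as is the exact default host list.

```typescript
const DEFAULT_LOCAL_HOSTNAMES: readonly string[] = ["localhost", "127.0.0.1"];

// True when the provider base URL points at a local inference backend.
// extraHostnames stands in for values configured for Docker setups.
function isLocalProvider(
  baseUrl: string | null | undefined,
  extraHostnames: readonly string[] = []
): boolean {
  if (!baseUrl) return false;
  try {
    const hostname = new URL(baseUrl).hostname;
    return DEFAULT_LOCAL_HOSTNAMES.includes(hostname) || extraHostnames.includes(hostname);
  } catch {
    return false; // malformed URLs are treated as non-local
  }
}
```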
Kilo Gateway (api.kilo.ai/api/gateway) is an OpenAI-compatible API
offering 335+ models via a single API key, including 6 free models
and 3 auto-routing models (frontier/balanced/free).
This is distinct from the existing KiloCode provider which uses
OAuth + /api/openrouter/ endpoint.
- Register kilo-gateway in providerRegistry.ts (alias: kg)
- Add to APIKEY_PROVIDERS in providers.ts
- Add models endpoint config in route.ts
- Add official Kilo AI icon (favicon)
Even with EXPERIMENTAL_TURBOPACK=0 and NEXT_PRIVATE_BUILD_WORKER=0, Next.js 16
instrumentation chunks still emit require('better-sqlite3-<16hexchars>') and
require('zod-<16hexchars>') into the compiled .js files inside .next/server/.
The webpack externals function in next.config.mjs patches the runtime bundler
but does NOT rewrite already-compiled chunks. Added step 5.6 to prepublish.mjs:
walks all .js files in app/.next/server/ and strips the 16-char hex suffix from
any require() string that matches the Turbopack hash pattern.
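As an illustration of the rewrite step, here is a sketch of a hash-stripping function. The regex and the `stripHashedRequires` name are assumptions, not the contents of prepublish.mjs; the real script also walks the directory tree, which is omitted here.

```typescript
// Matches require('<name>-<16 hex chars>') with either quote style,
// capturing the quote character and the bare package name (optionally scoped).
const HASHED_REQUIRE = /require\((['"])((?:@[\w.-]+\/)?[\w.-]+?)-[0-9a-f]{16}\1\)/g;

// Rewrite every hashed require() in a compiled chunk back to the bare package name.
function stripHashedRequires(source: string): string {
  return source.replace(
    HASHED_REQUIRE,
    (_match, quote: string, name: string) => `require(${quote}${name}${quote})`,
  );
}
```

A package whose real name happens to end in a dash plus 16 hex characters would be rewritten incorrectly by this pattern, so an allowlist of known externals may be a safer complement.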
Also updated deploy-vps workflow: npm registry rejects 299MB packages, so
deployment now uses npm pack + scp + npm install -g /tmp/omniroute-*.tgz.
PM2 entry point is app/server.js inside the npm global package.
8 tests covering:
- Valid OpenAI format tools (tool.function.name) preserved
- Valid Anthropic format tools (tool.name) preserved
- Empty names in both formats filtered
- Mixed format array handling
- Null/whitespace edge cases
Regression tests verify the fix from PR #397 prevents all anthropic-
format tools from being silently dropped by the empty-name filter.
Turbopack in Next.js 16 hashes ALL serverExternalPackages (not just better-sqlite3),
emitting require() calls like 'zod-dcb22c6336e0bc69', 'pino-28069d5257187539' etc.
that don't exist in node_modules.
Changes:
- next.config.mjs: Replace single-package check with a HASH_PATTERN regex
that strips '<name>-<16hexchars>' suffix for any externalized package.
Also adds KNOWN_EXTERNALS set for exact-name matching.
- scripts/prepublish.mjs: Add NEXT_PRIVATE_BUILD_WORKER=0 env to reinforce
webpack mode. Add post-build scan that reports hashed refs so they are visible in CI.
Closes #396, addresses #398
Add Synthetic (synthetic.new) as a privacy-focused LLM provider
with OpenAI-compatible API, dynamic model catalog via /models
endpoint, and passthrough model support.
- Register provider in providerRegistry.ts with 6 initial models
- Add APIKEY_PROVIDERS entry with verified_user icon (#6366F1)
- Add models listing config for /api/providers/[id]/models endpoint
- passthroughModels enabled for dynamic model catalog
Allow provider_nodes to configure custom chat and models endpoint
paths via chatPath/modelsPath fields. This enables providers with
non-standard versioned APIs (e.g. /v4/chat/completions) to work
without embedding the version prefix in base_url.
- Add migration 003: chat_path and models_path columns
- Update Zod schemas (create, update, validate)
- Update CRUD in providers.ts (INSERT/UPDATE)
- Wire chatPath/modelsPath through API routes and providerSpecificData cascade
- Read chatPath in DefaultExecutor and BaseExecutor buildUrl()
- Use modelsPath in validate endpoint
- Add Advanced Settings UI section (collapsible) in create/edit modals
- Update base URL hint to reference Advanced Settings
- Add i18n keys across all 30 locales
- Add unit tests for buildUrl with custom paths
Backward compatible: NULL chatPath/modelsPath = default behavior.
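A sketch of how a custom chat path might be joined to the base URL. The default path and the function signature are assumptions; the actual `buildUrl()` in DefaultExecutor and BaseExecutor may differ.

```typescript
// Assumed default when no custom chatPath is configured.
const DEFAULT_CHAT_PATH = "/chat/completions";

// NULL, undefined, or blank chatPath falls back to the default (backward compatible).
function buildUrl(baseUrl: string, chatPath?: string | null): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const custom = chatPath?.trim();
  const path = custom ? custom : DEFAULT_CHAT_PATH;
  return base + (path.startsWith("/") ? path : `/${path}`);
}
```

With this shape, a provider at a hypothetical `https://api.example.com` can set chatPath to `/v4/chat/completions` without embedding `/v4` in base_url.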
The filter introduced in #346 only checked OpenAI-format tool names
(tool.function.name), silently dropping all tools when the request
arrives in Anthropic Messages API format (tool.name without .function).
This happens when LiteLLM proxies requests with anthropic/ model prefix —
it translates to Anthropic format before forwarding, so OmniRoute receives
Claude-format tools. The filter drops them all, causing Anthropic API to
return 400: 'tool_choice.any may only be specified while providing tools'.
Fix: check both formats with fn?.name ?? tool.name.
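The dual-format check can be sketched as follows. The `ToolLike` type and helper names are illustrative; only the `fn?.name ?? tool.name` expression comes from the commit.

```typescript
// Covers both shapes: OpenAI ({ function: { name } }) and Anthropic ({ name }).
type ToolLike = { name?: unknown; function?: { name?: unknown } | null };

function hasUsableName(tool: ToolLike): boolean {
  const fn = tool.function;
  // ?? only falls through on null/undefined, so an empty OpenAI name is still rejected.
  const name = fn?.name ?? tool.name;
  return typeof name === "string" && name.trim().length > 0;
}

function filterTools(tools: ToolLike[]): ToolLike[] {
  return tools.filter(hasUsableName);
}
```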
All tests pass except pre-existing clearAccountError module resolution (dataPaths) which is unrelated to this PR. Merging codex native passthrough fix.
Match both slash styles when removing build-machine paths from the
staged standalone bundle so the sanitization step works on Windows
and POSIX builds.
While touching the helper, replace the custom basename logic with
Node's built-in `path.basename` for clarity.
Prepare a dedicated `.next/electron-standalone` bundle before
running electron-builder so desktop packaging operates on a stable,
Electron-specific server payload.
This also adds a preflight that rejects standalone bundles whose
top-level `node_modules` is a symlink, because electron-builder
preserves `extraResources` symlinks and would otherwise ship an app
that depends on the build machine at runtime.
- eslint.config.mjs: add missing ignores for vscode-extension/,
electron/, docs/, app/.next/, clipr/ — ESLint was OOMing because
it scanned huge VS Code binary blobs and build artifacts
- tests: remove stale ALTER TABLE 'group' statements — column is now
part of the base schema in core.ts; tests were failing with
SQLITE_ERROR: duplicate column name
- .husky/pre-commit: add npm run test:unit to block broken tests
from reaching CI
Stabilize the bootstrap metadata test by clearing
INITIAL_PASSWORD before each run and add focused coverage
for env-backed and stored-password states.
Log settings lookup failures before returning the
bootstrap-safe fallback payload so operational errors are
still visible on the server side.
Normalize numeric pino levels correctly in the console log API so the logger transport fix does not misclassify info, warn, and error entries in file-backed logs.
Add a targeted regression test for numeric log entries.
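For reference, pino's default numeric levels are 10 (trace), 20 (debug), 30 (info), 40 (warn), 50 (error), and 60 (fatal). A sketch of the normalization, where the `normalizeLevel` name and the "info" fallback for unknown values are assumptions:

```typescript
// pino's default numeric level values.
const PINO_LEVELS: Record<number, string> = {
  10: "trace",
  20: "debug",
  30: "info",
  40: "warn",
  50: "error",
  60: "fatal",
};

// Accept numeric levels (file-backed transport output) and string labels alike.
function normalizeLevel(level: unknown): string {
  if (typeof level === "number") return PINO_LEVELS[level] ?? "info";
  if (typeof level === "string") {
    const trimmed = level.trim();
    const asNumber = Number(trimmed);
    if (trimmed !== "" && Number.isFinite(asNumber)) {
      return PINO_LEVELS[asNumber] ?? "info";
    }
    return trimmed.toLowerCase();
  }
  return "info"; // assumed fallback for missing levels
}
```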
Keep the existing level formatter for direct logger paths, but drop
that formatter from transport-backed configs because pino rejects it
when transport.targets is used.
This restores the intended stdout+file transport path and avoids the
startup fallback warning on every boot.
Add localhost and 127.0.0.1 to allowedDevOrigins so local dev
sessions opened on loopback addresses do not have their Next.js HMR
websocket blocked as cross-origin.
Point the login page at the existing public bootstrap endpoint
instead of the protected /api/settings route.
Also extend the public bootstrap response with hasPassword and
setupComplete so unauthenticated users get the correct first-run
or password-setup flow without triggering a 401.
Thanks @kfiramar! 🎉 Critical security fix — different startup paths were generating different `STORAGE_ENCRYPTION_KEY` values over the same SQLite database, causing `Unsupported state or unable to authenticate data` for all stored tokens.
Improvements added on top:
- Normalized `overridePath?.trim()` in `electron/main.js` to match `bootstrap-env.mjs` (addresses kilo-code-bot warning #1)
- Added explanatory comment documenting the `preferredEnv` merge order intent in Electron startup (addresses kilo-code-bot warning #3)
4 commits + 113-line test file. The fail-closed behaviour (refusing to mint a new key when encrypted rows exist) is an excellent safeguard. Merged!
Thanks @kfiramar! 🎉 Critical fix — stale error metadata on recovered provider accounts was preventing valid accounts from being selected properly after recovery.
Improvement added on top: documented the two valid success-check patterns (`result.success` for open-sse handlers vs `response?.ok` for fetch-based handlers) to address the kilo-code-bot review warning — both patterns are correct by design, now explicitly documented.
5 commits total, 2 test files (+168 lines of coverage). Merged!
Thanks @kfiramar! Perfect minimal fix — `t("deleteConnection")` was requesting a non-existent key across all 30 locales, causing `MISSING_MESSAGE: providers.deleteConnection` runtime errors on every provider detail page load. Reusing the existing `providers.delete` key is the correct fix. Merged!
Thanks @kfiramar! 🎉 Critical schema fix — the `group` column was used in all provider_connections queries but missing from the base schema and backfill migration. Databases upgraded from older versions were silently failing on group-related queries. Clean fix with regression test. Merged!
Tighten the helper signatures added for recovered provider cleanup.
This removes the new any-typed recovery parameters called out in
review without broadening the PR into unrelated auth typing work.
Clear recovered provider error metadata after successful
credentialed requests in non-chat API routes as well.
Add route-level regression tests covering a Response-based
success path and a result-object success path.
Refine the recovered-account regression test to match the real
observed state: an account can remain active while still carrying
stale refresh-failure metadata.
This verifies that getProviderCredentials surfaces those fields
and that clearAccountError clears them through the real runtime
path.
Pass errorCode, lastErrorType, and lastErrorSource through the
runtime credentials object so clearAccountError can clear stale
provider error metadata after a real successful request.
Also update the regression test to use getProviderCredentials,
matching the production call path.
Add the missing provider_connections.group column to both the
base schema and the runtime column backfill path.
Also add a regression test covering upgrade from an older
database that does not yet have the column.
Clear errorCode, lastErrorType, and lastErrorSource when an
account recovers so provider state returns to a fully clean
active status.
Add a focused regression test for recovered-account cleanup.
Keep getPreferredEnvFilePath consistent with its env parameter by
passing that env through resolveDataDir in both bootstrap and Electron.
This avoids silently falling back to process.env when a custom env map
is supplied.
Treat empty or whitespace-only dataDirOverride values as unset so
bootstrapEnv keeps using the normal DATA_DIR and .env lookup path.
Adds a focused regression test for the whitespace override case.
Propagate database inspection failures instead of treating them as
missing encrypted credentials.
This keeps startup from generating a fresh encryption key when an
existing database cannot be inspected and adds a regression test for
that path.
Align the app bootstrap paths with the documented CLI env lookup.
The CLI wrapper already loads DATA_DIR/.env, ~/.omniroute/.env, or ./.env,
but run-next, run-standalone, and Electron were bypassing that behavior.
On machines with encrypted credentials, that could generate a fresh
STORAGE_ENCRYPTION_KEY in server.env and make existing tokens unreadable.
This change:
- uses the same preferred .env lookup in bootstrapEnv and Electron
- keeps Electron secrets rooted in DATA_DIR and passes DATA_DIR to the child
- refuses to mint a new encryption key over an existing encrypted database
- adds a focused regression test for env precedence and key safety
Thanks @rexname (Maulana Hasanudin)! 🎉
Codex account quota policy (5h/weekly) with auto-rotation is now merged. Highlights:
- Per-account policy toggles (5h + weekly ON/OFF) in the Provider dashboard
- Accounts automatically skipped when enabled quota window reaches 90% threshold
- Auto re-eligibility when resetAt timestamp passes (no manual intervention needed)
- Side-effect free `getQuotaWindowStatus` getter design
- Safe partial merge of `codexLimitPolicy` on provider updates
Merged on top of main (v2.5.0) with no conflicts. Analytics label fix (#356) included. Thanks for the excellent quality and the 2-commit cleanup round! 🙏
Add a default-off dashboard setting that injects Codex fast service tier only when the request did not already specify one.
Also preserve service_tier through OpenAI-to-Responses translation and restore the setting at startup.
PR #363 added allowedConnections as 3rd arg in chat.ts calls to
getProviderCredentials(), but the function signature in auth.ts
only declared 2 params. Adding the optional 3rd param and applying
the connection filter when provided.
- add user-facing success/error notifications for Codex limit toggle API calls
- deduplicate Codex policy default normalization in providers page
- make getQuotaWindowStatus side-effect free (no cache mutation in getter)
- avoid stale threshold blocking after resetAt has passed
- extract named Codex quota threshold constant
- extract helper for earliest future reset date selection
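A sketch of a side-effect-free window status getter and the earliest-reset helper. The `QuotaWindow` shape, the argument lists, and the `earliestFutureReset` name are assumptions; the 90% threshold is taken from the merged PR summary.

```typescript
// Named threshold constant (90%).
const CODEX_QUOTA_THRESHOLD = 0.9;

// Hypothetical shape: fraction used in [0, 1] and reset time in epoch ms.
type QuotaWindow = { usedFraction: number; resetAt: number | null };
type WindowStatus = { blocked: boolean; resetAt: number | null };

// Pure getter: derives status from its inputs and never mutates a cache.
function getQuotaWindowStatus(w: QuotaWindow, now: number): WindowStatus {
  const expired = w.resetAt !== null && w.resetAt <= now;
  if (expired) return { blocked: false, resetAt: null }; // stale threshold no longer blocks
  return { blocked: w.usedFraction >= CODEX_QUOTA_THRESHOLD, resetAt: w.resetAt };
}

// Pick the soonest reset that is still in the future, if any.
function earliestFutureReset(resets: Array<number | null>, now: number): number | null {
  const future = resets.filter((r): r is number => r !== null && r > now);
  return future.length > 0 ? Math.min(...future) : null;
}
```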
Add gpt-5.4 to the Codex model registry so OmniRoute exposes cx/gpt-5.4 and codex/gpt-5.4 in its model catalog.
Includes a focused regression test for model resolution.
- add quota window status helper for Codex session (5h) and weekly windows
- enforce policy-based account filtering when enabled windows reach threshold
- return all-rate-limited metadata when no Codex account is eligible
- add per-account dashboard toggles for 5h and weekly policy controls
- merge codexLimitPolicy safely on provider updates to preserve partial settings
- document purpose and usage scenarios in README (EN + ID + i18n note)
Merged! Excellent contribution @AndersonFirmino 🎉
This PR delivers four major improvements:
- **strict-random** strategy — Fisher-Yates shuffle deck with anti-repeat guarantee and mutex serialization for concurrent safety
- **API key controls** — allowedConnections, is_active, accessSchedule, autoResolve
- **Connection groups** — environment-based grouping view in Limits page with localStorage persistence
- **i18n** — 30 languages fully updated, pt-BR fully translated
655 tests passing. Merged with main (v2.4.4) — no conflicts. Thank you for the exceptional quality!
- fix #355: increase STREAM_IDLE_TIMEOUT_MS from 60s to 300s to prevent
premature stream abortion for extended-thinking models (claude-opus-4-6,
o3, etc.) that can pause >60s during reasoning phases. Configurable via
STREAM_IDLE_TIMEOUT_MS env var.
- fix #350: combo health check test now bypasses REQUIRE_API_KEY=true by
sending X-Internal-Test header, recognized in chat.ts auth pipeline to
skip API key validation for internal admin-side combo tests. Also
extended test timeout from 15s to 20s. Uses OpenAI-compatible format
universally (not Claude-style).
- fix #346: filter out tools with empty function.name before forwarding
to upstream providers. Claude Code sends empty-name tool definitions
that cause '400 Invalid input[N].name: empty string' on OpenAI-compat
providers. Extends existing message/input empty-name filter.
Merged via review workflow. Excellent contribution by @Regis-RCR — 3-tier pricing resolution with LiteLLM sync, 23 tests, fully opt-in. Minor improvement noted: dashboard UI for sync status will be added in a follow-up.
- New: open-sse/services/apiKeyRotator.ts — round-robin rotation
between primary API key + providerSpecificData.extraApiKeys[]
- Modified: open-sse/executors/base.ts — buildHeaders() rotates key
using getRotatingApiKey() when extraApiKeys configured
- Modified: open-sse/handlers/chatCore.ts — injects connectionId into
credentials to enable per-connection rotation index tracking
- Modified: providers/[id]/page.tsx — 'Extra API Keys' UI section in
EditConnectionModal: add/remove keys, persisted in providerSpecificData
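A minimal round-robin sketch of the rotation described above. `getRotatingApiKey` is named in the commit, but this signature and the in-memory index map are assumptions.

```typescript
// Per-connection rotation index (in-memory; resets on restart).
const rotationIndex = new Map<string, number>();

function getRotatingApiKey(
  connectionId: string,
  primaryKey: string,
  extraApiKeys: string[] = [],
): string {
  const keys = [primaryKey, ...extraApiKeys.filter((k) => k.trim() !== "")];
  if (keys.length === 1) return primaryKey; // nothing to rotate
  const current = (rotationIndex.get(connectionId) ?? 0) % keys.length;
  rotationIndex.set(connectionId, (current + 1) % keys.length);
  return keys[current];
}
```

Keying the index by connectionId is what makes injecting that ID into the credentials necessary: without it, all connections would share one rotation position.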
T08 (quota window rolling) and T13 (wildcard model routing) confirmed
already implemented in accountFallback.ts and wildcardRouter.ts.
- Add .catch() to initial and periodic sync promises (Gemini, Kilo)
- Wrap JSON.parse in try-catch for corrupted DB data (Kilo)
- Wrap response.json() in try-catch for invalid LiteLLM JSON (Kilo)
- Validate PRICING_SYNC_INTERVAL (guard against NaN/0 → tight loop) (Copilot)
- Validate and allowlist sources — reject unknown, prevent empty sync
from clearing pricing_synced data (Copilot, Kilo)
- Extract merge loop into shared iteration to reduce duplication (Gemini)
- Add data/warnings fields to MCP output schema (Copilot)
- Remove unused z import in vitest (Copilot)
- Filter non-string entries from sources array in API route (Copilot)
- Track active interval for accurate getSyncStatus().nextSync (Copilot)
getProviderCredentials already filtered by allowedConnections, but
chat.ts never passed the field from apiKeyInfo. Now both call sites
(combo pre-check and credential retry loop) forward the restriction.
Fixes race condition in combo strict-random (concurrent requests could
reshuffle simultaneously). Eliminates code duplication between combo.ts
and auth.ts by extracting Fisher-Yates shuffle + deck logic into
src/shared/utils/shuffleDeck.ts with per-namespace mutex serialization.
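The deck mechanics can be sketched as below. This is an illustrative, synchronous version with hypothetical names (`drawFromDeck`, `fisherYates`), so it does not need the mutex that the real shuffleDeck.ts uses to serialize concurrent async callers; it also assumes the items are distinct.

```typescript
type Deck<T> = { cards: T[]; last: T | undefined };

// One independent deck per namespace (e.g. per combo name or per provider).
const decks = new Map<string, Deck<unknown>>();

// Standard Fisher-Yates shuffle on a copy of the input.
function fisherYates<T>(items: readonly T[], rand: () => number): T[] {
  const a = [...items];
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

function drawFromDeck<T>(
  namespace: string,
  items: readonly T[],
  rand: () => number = Math.random,
): T {
  if (items.length === 0) throw new Error("cannot draw from an empty item list");
  let deck = decks.get(namespace) as Deck<T> | undefined;
  if (!deck) {
    deck = { cards: [], last: undefined };
    decks.set(namespace, deck);
  }
  if (deck.cards.length === 0) {
    const fresh = fisherYates(items, rand);
    // Anti-repeat: the new cycle must not open with the previous cycle's last card.
    if (fresh.length > 1 && fresh[0] === deck.last) {
      const end = fresh.length - 1;
      [fresh[0], fresh[end]] = [fresh[end], fresh[0]];
    }
    deck.cards = fresh;
  }
  const card = deck.cards.shift() as T;
  deck.last = card;
  return card;
}
```

Every item is served exactly once per cycle, which gives the uniform rotation that plain random selection does not guarantee over short runs.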
- Combo layer: strict-random in combo.ts rotates models uniformly
- Credential layer: strict-random in auth.ts rotates connections/accounts
- Anti-repeat guarantee: last of previous cycle ≠ first of next
- Mutex serialization for concurrent request safety
- Independent decks per combo name and per provider
- allowedConnections: restrict which connections a key can use
- autoResolve: per-key toggle for ambiguous model disambiguation
- is_active: enable/disable key instantly (403 on disabled)
- accessSchedule: time-based access control (hours, days, timezone)
- Rename keys via PATCH /api/keys/:id
- Connection restriction badge in API keys table
- Auto-migration for all new columns
- Connection group field on provider connections
- Environment grouping view in Limits page (group by environment)
- Accordion UI with expand/collapse per group
- localStorage persistence for groupBy, autoRefresh, expandedGroups
- Smart default: auto-switches to environment view when groups exist
- Swap SessionsTab above RateLimitStatus
- strict-random option added to combo strategy dropdown (30 languages)
- strategyGuide.strict-random (when/avoid/example)
- pt-BR: translated all strategyRecommendations from English to Portuguese
- en: added API key management strings (accessSchedule, isActive, etc.)
- 11 tests: shuffle deck mechanics (Fisher-Yates, anti-repeat, decks)
- 6 tests: allowedConnections (schema, DB persistence, cache invalidation)
- 12 tests: API key policy (isActive, accessSchedule, autoResolve, budget)
* fix: tool description null sanitization, clipboard HTTP fallback fixes
T10 - Sanitize tool.description null in claude-to-openai translator
- claude-to-openai.ts: tool.description defaults to empty string when null/undefined
- claude-to-openai.ts: filter out tools with empty/missing names
- Prevents 400 validation errors on providers like NVIDIA NIM (issue #276)
T11 - Fix copy buttons to work on HTTP/non-HTTPS deployments
- Add src/shared/utils/clipboard.ts with HTTPS+HTTP (execCommand) dual fallback
- Migrate useCopyToClipboard.ts to use shared utility
- Migrate ConsoleLogViewer.tsx, RequestLoggerV2.tsx to shared utility
- Migrate HomePageClient.tsx, endpoint/page.tsx, GetStarted.tsx
- Migrate DefaultToolCard.tsx to shared utility
- Fixes copy buttons when OmniRoute runs behind HTTP proxy (issue #296)
T02 - Verified SSE [DONE] sentinel handling already correct
- sseParser.ts filters [DONE] on line 13 (no change needed)
- stream.ts uses doneSent flag to prevent duplicate sentinel
- bypassHandler.ts correctly separates streaming/non-streaming responses
Issue triage comments posted to #340, #341, #344
* feat: DB read cache + Accept header stream negotiation (T09/T01)
T09 - In-memory TTL cache for hot DB read paths
- Add src/lib/db/readCache.ts with TTL cache (5s settings/connections, 30s pricing)
- Eliminates redundant SQLite reads on concurrent requests
- Integrate invalidation in settings.ts updateSettings() and updatePricing()
- Integrate invalidation in providers.ts create/update/delete operations
- Export getCachedSettings, getCachedPricing, getCachedProviderConnections,
invalidateDbCache via localDb.ts for consumer migration
- Cache auto-busts on any write, preserving data consistency
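A sketch of a TTL read cache with write-time invalidation. The `ReadCache` class and its method names are assumptions; the actual readCache.ts exports helpers such as `getCachedSettings` and `invalidateDbCache`.

```typescript
type CacheEntry = { value: unknown; expiresAt: number };

class ReadCache {
  private entries = new Map<string, CacheEntry>();
  private now: () => number;

  // The clock is injectable so tests can control time.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Return the cached value while fresh; otherwise run the loader and cache its result.
  get<T>(key: string, ttlMs: number, load: () => T): T {
    const t = this.now();
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > t) return hit.value as T;
    const value = load();
    this.entries.set(key, { value, expiresAt: t + ttlMs });
    return value;
  }

  // Call from every write path so readers never see stale rows.
  invalidate(key?: string): void {
    if (key === undefined) this.entries.clear();
    else this.entries.delete(key);
  }
}
```

A short TTL (5s for settings and connections, 30s for pricing in the commit) bounds staleness even if some write path forgets to invalidate.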
T01 - Accept header stream negotiation
- src/sse/handlers/chat.ts: detect Accept: text/event-stream header
- Override body.stream=true when Accept header indicates streaming client
- Enables curl, httpx and SDK clients that use HTTP headers instead of JSON
body field to trigger streaming responses
- Logs Accept override at DEBUG level for observability
* fix: auto-advance quota window on expiry to prevent stale blocking (T08)
T08 - Quota Window Rolling Auto-Advance
- quotaCache.ts: add windowDurationMs field to QuotaCacheEntry interface
(optional field that callers can set when they know the window duration)
- Add advancedWindowResetAt() helper: if entry.nextResetAt is in the past,
eagerly returns { exhausted: false } so requests are unblocked immediately
- isAccountQuotaExhausted() now uses advancedWindowResetAt() instead of
the previous inline date check, and optimistically clears entry.exhausted
flag to avoid re-checking the same stale entry on the next request
Before: exhausted accounts with an expired resetAt would wait up to 5
minutes for the background refresh before accepting new requests.
After: the first request after resetAt passes will be immediately accepted
and will trigger a quota refresh on the next background tick.
* feat: manual OAuth token refresh UI (T12)
T12 - Manual Token Refresh UI
- Add POST /api/providers/[id]/refresh endpoint
- Validates connection exists and is OAuth type
- Calls getAccessToken() (same helper used in auto-refresh)
- Persists new credentials via updateProviderCredentials()
- Returns { success, expiresAt, refreshedAt } on success
- Update providers/[id]/page.tsx
- handleRefreshToken() with loading state (refreshingId)
- Pass onRefreshToken + isRefreshing props to ConnectionRow
- ConnectionRow: add optional onRefreshToken/isRefreshing props
- ConnectionRow: tokenMinsLeft state via lazy init (Date.now() in
getter fn, not in render body - satisfies react-hooks/purity)
- Token expiry badge: red 'expired' | amber '~Xm' (<30min) | hidden
- 'Token' button (amber) next to 'Retest' for OAuth connections
- Add en.json i18n: tokenRefreshed, tokenRefreshFailed
* Initial plan
* feat: integrate wildcardRouter into model alias resolution (T13)
T13 - Wildcard Model Routing
- Import resolveWildcardAlias from wildcardRouter.ts into model.ts
- In getModelInfoCore(), after exact alias check fails, try glob wildcard
alias matching (e.g., 'claude-sonnet-*' alias → 'anthropic/claude-sonnet-4')
- Returns { provider, model, extendedContext, wildcardPattern } on match
- Falls back to MODEL_TO_PROVIDERS lookup and openai default as before
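A sketch of glob-style alias matching. `resolveWildcardAlias` is named in the commit, but this signature, the return shape, and the first-match-wins ordering are assumptions.

```typescript
// Convert a glob with '*' wildcards into an anchored regular expression.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except '*'
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Try each wildcard alias in insertion order; the first match wins.
function resolveWildcardAlias(
  model: string,
  aliases: Record<string, string>,
): { wildcardPattern: string; target: string } | null {
  for (const [pattern, target] of Object.entries(aliases)) {
    if (pattern.includes("*") && globToRegExp(pattern).test(model)) {
      return { wildcardPattern: pattern, target };
    }
  }
  return null;
}
```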
* fix: clipboard cleanup and tool validation
* feat: media page UX + T04 playground uploads + T03 HuggingFace/Vertex AI
Media Page (MediaPageClient.tsx):
- Render images inline (img tags from b64_json or url)
- Show transcription as plain readable text (not raw JSON)
- Amber banner for credential errors with link to /dashboard/providers
- Detect empty transcription result and show credentials hint
- Provider credential hint below selector for non-local providers
- Extended provider/model lists: HuggingFace, Qwen TTS, Inworld, Cartesia, PlayHT, AssemblyAI
T04 - Playground File Uploads (playground/page.tsx):
- Audio file upload panel for transcription endpoint (multipart/form-data)
- Image upload panel for vision models (gpt-4o, claude-3, gemini, pixtral, llava...)
- Auto-detect vision models by name heuristic
- Inject uploaded images as base64 image_url in chat messages
- Inline image rendering for image generation results
- Readable text view for transcription results with copy button
- Preview thumbnails for attached images with individual remove
T03 - HuggingFace + Vertex AI Providers:
- HuggingFace: frontend providers.ts + backend providerRegistry.ts
Uses HuggingFace Router OpenAI-compatible endpoint
- Vertex AI: frontend providers.ts + backend providerRegistry.ts
Uses gemini format with generateContent API (urlBuilder fallback)
T07 - API Key Round-Robin: VERIFIED already implemented in auth.ts
fill-first, round-robin, p2c, random, least-used, cost-optimized strategies
* feat: T05 task-aware routing + fix #302 stream override + fix #73 claude provider fallback
T05 - Task-Aware Smart Routing:
- New open-sse/services/taskAwareRouter.ts:
Detects 7 task types: coding, creative, analysis, vision, summarization,
background, chat from system/user message content and images
Configurable taskModelMap per task type, stats tracking
applyTaskAwareRouting() integrates with existing chat pipeline
- New src/app/api/settings/task-routing/route.ts:
GET/PUT/POST API for task routing config + reset-stats + detect action
Persists config via updateSettings('taskRouting')
- Integration in src/sse/handlers/chat.ts:
applyTaskAwareRouting() called after policy enforcement, before combo resolve
Logs task type detection and model overrides
Fix #302 - OpenAI SDK stream=False drops tool_calls:
- src/sse/handlers/chat.ts T01 Accept header negotiation:
Changed condition from 'body.stream !== true' to 'body.stream === undefined'
OpenAI Python SDK sends 'Accept: application/json, text/event-stream' in every
request, even stream=False — the old code was incorrectly forcing stream=true,
causing tool_calls to be dropped from non-streaming responses
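The corrected negotiation rule can be sketched as a small pure function. The `resolveStream` name is hypothetical; the `body.stream === undefined` condition is the one stated above.

```typescript
// An explicit body.stream (true or false) always wins; the Accept header is
// consulted only when the client did not set the field at all.
function resolveStream(bodyStream: boolean | undefined, acceptHeader: string | null): boolean {
  if (bodyStream !== undefined) return bodyStream;
  return (acceptHeader ?? "").toLowerCase().includes("text/event-stream");
}
```

Under the old `body.stream !== true` condition, the first case in the usage below would have been forced to streaming.

```typescript
resolveStream(false, "application/json, text/event-stream"); // false: SDK asked for non-streaming
resolveStream(undefined, "text/event-stream");               // true: header-only client
```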
Fix #73 - Claude Haiku routed to OpenAI provider instead of Antigravity:
- open-sse/services/model.ts getModelInfoCore():
Added heuristic prefix detection before the blind 'openai' fallback:
claude-* models → antigravity (Anthropic) provider
gemini-*/gemma-* models → gemini provider
Closes: #73, partially addresses #302
* fix: token counts 0 (#74), model import dup (#180), model route fallback (#73)
fix #74 - Token counts always 0 for Antigravity/Claude streaming:
- open-sse/utils/usageTracking.ts extractUsage():
Add handler for 'message_start' SSE event which carries INPUT tokens in
Antigravity/Claude streaming:
{ type: 'message_start', message: { usage: { input_tokens: N } } }
This event was completely unhandled, causing ALL input token counts to be
dropped for every Antigravity/Claude streaming request
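A sketch of the added branch, using the event shape quoted above. The function here is a simplified stand-in for `extractUsage()` in usageTracking.ts; the `message_delta` branch reflects the Anthropic streaming format and is included as an assumption about what the existing code already handled.

```typescript
type Usage = { input_tokens?: number; output_tokens?: number };

type StreamEvent = {
  type?: string;
  message?: { usage?: Usage };
  usage?: Usage;
};

function extractUsage(event: unknown): Usage | null {
  if (typeof event !== "object" || event === null) return null;
  const e = event as StreamEvent;
  // message_start carries the input token count for the request.
  if (e.type === "message_start") return e.message?.usage ?? null;
  // Assumed pre-existing path: message_delta carries output token counts.
  if (e.type === "message_delta") return e.usage ?? null;
  return null;
}
```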
fix #180 - Model import shows duplicates with no visual feedback:
- src/shared/components/ModelSelectModal.tsx:
Added addedModelValues prop (string[]) to receive already-added model values
Models already in the combo now shown with ✓ indicator + green highlight
Makes it visually clear which models are already added vs new
- src/app/(dashboard)/dashboard/combos/page.tsx:
Pass addedModelValues={models.map(m => m.model)} to ModelSelectModal
* Harden clipboard UX and Claude tool normalization (#360)
* Initial plan
* chore: plan updates for clipboard and translator fixes
* fix: clipboard cleanup, copy feedback, and claude tool validation
---------
Co-authored-by: openai-code-agent[bot] <242516109+Codex@users.noreply.github.com>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
- docs/openapi.yaml: update info.version from 2.3.6 to 2.4.1 (fixes CI check)
- CHANGELOG.md: add '## [Unreleased]' section as first heading (required by check-docs-sync)
- scripts/check-docs-sync.mjs: fix regex to accept both hyphen (-) and em-dash (—)
as date separators in changelog headings (standard Keep a Changelog format)
- .husky/pre-commit: add 'node scripts/check-docs-sync.mjs' to catch version
mismatches locally before push
- Move 'Free Stack ($0)' to position 1 in COMBO_TEMPLATES (was 4th, invisible in 3-col grid)
- Add isFeatured flag to free-stack for special styling
- Change template grid: grid-cols-3 → 2x2 (sm:grid-cols-2) — all 4 templates visible
- Free Stack: green border/bg (emerald), FREE badge, larger text size
- Other templates: hover styles preserved, → arrow on Apply link
- Increase templates section padding
- fix(oauth): restore iFlow clientSecret default — was empty string, now uses the valid public key (#339)
- fix(mitm): compile src/mitm/*.ts to JS during prepublish so server.js exists in npm bundle (#335)
- fix(gemini-cli): graceful projectId fallback — warn + empty string instead of hard 500 error (#338)
- feat(models): add gpt-5.4 to Codex; add claude-sonnet-4, claude-opus-4.6, deepseek-v3.2, minimax-m2.1, qwen3-coder-next, auto to Kiro (#334)
- fix(electron): sync electron/package.json version to 2.3.13 (#323)
- feat(scoring): add tierPriority (0.05) to ScoringWeights Zod schema and combos/auto API route
The @swc/helpers override removal changed dependency resolution.
npm ci was failing with 'Missing: @swc/helpers@0.5.15 from lock file'.
Updated lock file with npm install --package-lock-only.
star-history.com embeds are often cached and slow to update. The new
starchart.cc widget (variant=adaptive) renders better on both light and
dark themes and updates in real-time.
Updated: README.md + 29 i18n locale READMEs
The @swc/helpers override in package.json duplicated the direct dependency
at the exact same version (0.5.19), causing 'EOVERRIDE' errors when pnpm
users tried to rebuild native modules like better-sqlite3.
Fixes:
- Remove redundant 'overrides' block (direct dep already pins 0.5.19)
- Add pnpm.onlyBuiltDependencies for @parcel/watcher, @swc/core,
better-sqlite3, esbuild, omniroute, sharp (replaces pnpm approve-builds)
- Add pnpm usage note to README Quick Start
Closes #328
Add API_BRIDGE_PROXY_TIMEOUT_MS env var to configure the api-bridge
proxy timeout. Default remains 30000ms for backward compatibility.
Handles invalid values with a warning log.
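A sketch of the env parsing with a safe default. The `resolveProxyTimeout` name, the injected `warn` callback, and the rejection of non-positive values are assumptions; the 30000ms default is from the commit.

```typescript
const DEFAULT_PROXY_TIMEOUT_MS = 30_000;

// raw would typically be process.env.API_BRIDGE_PROXY_TIMEOUT_MS.
function resolveProxyTimeout(
  raw: string | undefined,
  warn: (message: string) => void = () => {},
): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_PROXY_TIMEOUT_MS;
  const parsed = Number(raw);
  if (!Number.isFinite(parsed) || parsed <= 0) {
    warn(`Invalid API_BRIDGE_PROXY_TIMEOUT_MS "${raw}", using ${DEFAULT_PROXY_TIMEOUT_MS}ms`);
    return DEFAULT_PROXY_TIMEOUT_MS;
  }
  return Math.floor(parsed);
}
```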
Co-authored-by: hijak <54431520+hijak@users.noreply.github.com>
- Use timingSafeEqual for constant-time password comparison
- Require non-empty currentPassword when INITIAL_PASSWORD env is set
- Legacy fallback: allow empty or '123456' when no INITIAL_PASSWORD
Co-authored-by: hijak <54431520+hijak@users.noreply.github.com>
The outer <details> block at line 1459 was never closed, causing GitHub
to stop rendering everything below Troubleshooting (Tech Stack, Docs,
Roadmap, Contributors, etc.).
Fixes: README truncation on GitHub
kilocode renders ASCII logo banner on startup causing false healthcheck_failed
timeouts on cold-start or low-resource environments (VPS, CI, dashboard)
All 3 occurrences of throw new Error(data.error) replaced with proper message extraction:
typeof error === 'object' ? error.message : error
Fixes Cline and other OAuth providers showing [object Object] on connection failure
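A slightly more defensive sketch of the same extraction. The `toErrorMessage` name and the fallback string are assumptions; it also guards against `null`, for which `typeof` returns "object".

```typescript
// Turn an API error payload (string or { message }) into a displayable string.
function toErrorMessage(error: unknown, fallback = "Unknown error"): string {
  if (typeof error === "string" && error !== "") return error;
  if (typeof error === "object" && error !== null) {
    const message = (error as { message?: unknown }).message;
    if (typeof message === "string" && message !== "") return message;
  }
  return fallback;
}
```

Passing the result to `new Error(...)` avoids the implicit string coercion that produced `[object Object]`.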
- cline.ts: add decodeURIComponent before base64 decode to handle URL-encoded codes
- cline.ts: populate name = firstName+lastName || email in mapTokens
- oauth/exchange route: normalize name=email for all providers on exchange/poll/poll-callback
- Fixes: accounts showing Account #ID instead of email in providers dashboard
- Add cliTools.toolDescriptions.opencode, .kiro, guides.opencode, guides.kiro to en.json
- Sync 1111 missing keys across 29 language files (English fallbacks)
- Fix [object Object] in provider batch test modal:
normalize data.error object to string before setTestResults()
and in ProviderTestResultsView rendering
- Bump version to 2.3.6
prepublish.mjs: explicitly copy @swc/helpers into standalone app/node_modules
before packaging. npm tarball will always include it.
postinstall.mjs: fallback copy of @swc/helpers from root node_modules into
app/node_modules/@swc/ when missing after npm install -g.
Fixes server crash after npm install -g omniroute.
Endpoints page:
- Add Music Generation section (/v1/music/generations) in Media & Multi-Modal category
- Include music models (type=music) in endpointData and total model count
- Transcription section already shows Deepgram/AssemblyAI via allModels filter
Provider action buttons:
- Remove hover-only behavior from connection action buttons (edit/delete/reauth/proxy)
- Remove hover-only behavior from combo action buttons (test/duplicate/proxy/edit/delete)
- Buttons now always visible for better UX
Provider logos (SVG fallback):
- ProviderCard now tries .svg before showing text initials when .png not found
- Add SVG logos: ElevenLabs, Hyperbolic, AssemblyAI, PlayHT, Inworld, NanoBanana
- Add ollama-cloud.png (official Ollama icon)
postinstall.mjs imports native-binary-compat.mjs but the Dockerfile
only copied postinstall.mjs, causing ERR_MODULE_NOT_FOUND during npm ci:
Cannot find module '/app/scripts/native-binary-compat.mjs'
imported from /app/scripts/postinstall.mjs
The '🔐 OAuth on a Remote Server' guide existed only in Portuguese (#oauth-em-servidor-remoto).
Multiple users (@hijak, @ldsgroups225, @vipinpg) couldn't find it in English.
Changes:
- Full English step-by-step guide added above the existing PT content
- Added 'oauth-on-a-remote-server' anchor (EN) alongside 'oauth-em-servidor-remoto' (PT)
- Portuguese version moved into a collapsible <details> section
- OAuthModal.tsx already updated in v2.3.1 to link to #oauth-on-a-remote-server
- Read only first 4096 bytes of binary header instead of entire file
- Add error logging to all catch blocks with specific failure messages
- Separate copy vs dlopen catch blocks in postinstall Strategy 1
- Add archCount sanity cap (max 30) for fat Mach-O parsing
- Distinguish timeout vs rebuild failure in Strategy 2
Add native-binary-compat module that reads ELF/Mach-O/PE headers to
determine the actual target platform/arch of the .node binary. This
eliminates the macOS false-positive where dlopen loads a linux-x64
binary without throwing.
- Parse ELF (linux), Mach-O (darwin), and PE (win32) binary formats
- Use header-based check as primary signal, dlopen as secondary
- Update pre-flight check in CLI to use the new module
- Add unit tests for all binary formats and cross-platform scenarios
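As an illustration of the header-based check, the sketch below classifies a binary by its first four bytes only. It is not the native-binary-compat module itself: `detectBinaryFormat` is a hypothetical name, and the real module also reads the architecture fields, which this sketch leaves out.

```typescript
type BinaryFormat = "elf" | "macho" | "pe" | "unknown";

// Mach-O magics: 32-bit, 64-bit, and fat (universal) binaries.
function isMachoMagic(value: number): boolean {
  return value === 0xfeedface || value === 0xfeedfacf || value === 0xcafebabe;
}

// Hypothetical sketch: classify a binary from the first bytes of its header.
function detectBinaryFormat(header: Uint8Array): BinaryFormat {
  if (header.length < 4) return "unknown";
  const b0 = header[0], b1 = header[1], b2 = header[2], b3 = header[3];
  // ELF starts with 0x7F 'E' 'L' 'F'.
  if (b0 === 0x7f && b1 === 0x45 && b2 === 0x4c && b3 === 0x46) return "elf";
  // PE files start with the DOS stub signature 'MZ'.
  if (b0 === 0x4d && b1 === 0x5a) return "pe";
  // Mach-O magic may be stored in either byte order, so test both readings.
  const bigEndian = ((b0 << 24) | (b1 << 16) | (b2 << 8) | b3) >>> 0;
  const littleEndian = ((b3 << 24) | (b2 << 16) | (b1 << 8) | b0) >>> 0;
  if (isMachoMagic(bigEndian) || isMachoMagic(littleEndian)) return "macho";
  return "unknown";
}
```

Because the check never loads the file, it cannot hit the macOS case where dlopen accepts a foreign binary without throwing.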
The standalone app/ directory created by Next.js only contains runtime
files for better-sqlite3 (no binding.gyp, no source, no prebuild-install),
so `npm rebuild` inside app/ is a no-op. The previous fix (#312) added
exit(1) on rebuild failure, which caused npm to rollback the entire
package installation — leaving users with nothing to fix manually.
New approach:
1. Check if existing binary is already compatible (dlopen)
2. Copy the correctly-built binary from root node_modules/ (npm already
compiles it for the correct platform during install)
3. Fall back to npm rebuild if root binary is unavailable
4. Warn but don't fail the install if nothing works — the package stays
installed and the CLI pre-flight check gives a clear error at startup
Parse [1m] suffix from model name (e.g. claude-sonnet-4-6[1m]) and
propagate extendedContext flag through the request pipeline to append
context-1m-2025-08-07 to the Anthropic-Beta header.
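A rough sketch of the suffix handling described above, assuming the flag is carried next to the cleaned model name. `parseExtendedContext` and `withExtendedContextBeta` are hypothetical names, and merging into a comma-separated header value is an assumption about how existing beta flags are kept.

```typescript
const EXTENDED_CONTEXT_BETA = "context-1m-2025-08-07";
const EXTENDED_CONTEXT_SUFFIX = "[1m]";

interface ParsedModel {
  model: string;
  extendedContext: boolean;
}

// Hypothetical: strip a trailing "[1m]" and report whether it was present.
function parseExtendedContext(name: string): ParsedModel {
  if (name.endsWith(EXTENDED_CONTEXT_SUFFIX)) {
    return { model: name.slice(0, -EXTENDED_CONTEXT_SUFFIX.length), extendedContext: true };
  }
  return { model: name, extendedContext: false };
}

// Hypothetical: append the beta flag to an existing header value, without duplicates.
function withExtendedContextBeta(existing: string | undefined, extendedContext: boolean): string | undefined {
  if (!extendedContext) return existing;
  const flags = existing ? existing.split(",").map((s) => s.trim()).filter(Boolean) : [];
  if (flags.indexOf(EXTENDED_CONTEXT_BETA) === -1) flags.push(EXTENDED_CONTEXT_BETA);
  return flags.join(",");
}
```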
fix(ui): translate hardcoded PT-BR text in OAuthModal to English (#314, PR #325)
fix(ts): wrap unknown dataObj fields with toRecord() in usage.ts (Kimi parser)
fix(instrumentation): await getSettings() — property access on Promise (#316 follow-up)
Two strings were hardcoded in Portuguese regardless of the user's language setting:
1. The redirect_uri_mismatch error message (line ~101)
2. The remote access info banner for Google OAuth providers (line ~515)
Both are now in English. The anchor href is updated from
'#oauth-em-servidor-remoto' to '#oauth-on-a-remote-server' to match
the EN README anchor.
Six TypeScript errors on lines 921/922/925/926/939/948:
- dataObj.five_hour / seven_day are 'unknown', can't be passed directly to
hasUtilization/createQuotaObject which expect JsonRecord — wrap with toRecord()
- dataObj.user is 'unknown', can't chain .membership?.level — use toRecord() first
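For context, a helper like `toRecord()` usually narrows an `unknown` value to a plain object and falls back to an empty record. The sketch below is an assumed implementation for illustration only; the actual `toRecord` and `JsonRecord` in usage.ts may differ.

```typescript
type JsonRecord = Record<string, unknown>;

// Assumed shape of the helper: non-objects (and arrays) become an empty record.
function toRecord(value: unknown): JsonRecord {
  return value !== null && typeof value === "object" && !Array.isArray(value)
    ? (value as JsonRecord)
    : {};
}

// Usage: chaining through unknown fields without type errors.
const dataObj: JsonRecord = { user: { membership: { level: "pro" } } };
const level = toRecord(toRecord(dataObj.user).membership).level;
```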
getSettings() is declared async so calling it without await left
settings as a Promise<Record<string, unknown>>, causing 4 TS errors
when accessing settings.modelAliases in the alias restore block.
#315: Import and call resolveModelAlias() in chatCore.ts before the
getModelTargetFormat() lookup so that custom aliases configured in
Settings → Model Aliases → Pattern→Target are actually applied during
routing instead of being silently ignored.
#316: Load persisted custom model aliases from settings DB at server
startup (instrumentation.ts). Previously _customAliases started as an
empty object after every restart since setCustomAliases() was only
called by the PUT /api/settings/model-aliases handler — never at init.
Now aliases are restored from settings.modelAliases JSON field on boot.
Replace unreliable process.dlopen() platform detection with explicit
platform/arch comparison against the build target (linux-x64). On macOS,
dlopen can load an incompatible binary without throwing, causing the
postinstall script to skip the rebuild entirely.
- Detect platform mismatch via process.platform/arch instead of dlopen
- Fail the install (exit 1) if rebuild fails, instead of warning silently
- Verify rebuilt binary loads correctly after rebuild
- Add pre-flight binary check in CLI entry point as a safety net
The Claude Code OAuth API returns 'utilization' as percent USED,
not percent remaining. The createQuotaObject function had them swapped:
it set remainingPercentage = utilization, which inverted the quota bar.
Confirmed by reporter: Claude.ai shows 87% used → OmniRoute was showing
87% remaining (green bar), should show 13% remaining (yellow/red bar).
Fix: used = utilization; remaining = 100 - utilization.
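The corrected arithmetic, as a small sketch. `quotaFromUtilization` is a hypothetical stand-in for the relevant part of createQuotaObject, and clamping to the 0..100 range is an added assumption.

```typescript
interface QuotaView {
  usedPercentage: number;
  remainingPercentage: number;
}

// `utilization` is percent USED, so remaining is its complement.
function quotaFromUtilization(utilization: number): QuotaView {
  const used = Math.min(100, Math.max(0, utilization)); // clamp: assumption, not from the fix
  return { usedPercentage: used, remainingPercentage: 100 - used };
}
```

With the reporter's numbers, `quotaFromUtilization(87)` gives 87% used and 13% remaining.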
next@16 lists @swc/helpers@0.5.15 in its own dependencies but npm's
deduplication during global install fails to place it in the omniroute
app's node_modules when hoisted. This causes MODULE_NOT_FOUND for
@swc/helpers/esm/_interop_require_default.js on startup.
Fix: add @swc/helpers@0.5.19 to omniroute's top-level dependencies and
overrides so npm guarantees its presence regardless of hoisting strategy.
Reproducible on Windows (Node 22) and Linux.
When all provider quotas are exhausted (reservoir=0 after repeated 429s),
Bottleneck's schedule() would queue requests indefinitely since no maxWait
was configured. Clients (Cursor, Claude Code, VS Code) would hang forever.
Fix: add maxWait=120000 (2min, configurable via RATE_LIMIT_MAX_WAIT_MS env)
to DEFAULT_SETTINGS and all three Bottleneck constructors. When a job waits
longer than maxWait, Bottleneck rejects with a BottleneckError which
propagates as a 502/503 error to the client — a clean fail-fast instead
of infinite hang.
The healthcheck script was querying /api/settings which returns config
data rather than system health. Updated to /api/monitoring/health which
is the canonical health endpoint used across tests, SystemMonitor.tsx,
MaintenanceBanner.tsx, playwright config, and MCP tools.
OpenAI-compatible providers (OpenAI, Codex) reject name:'' with 400 errors:
- 'Unknown parameter: input[1].name'
- 'Invalid tools[0].name: empty string'
Some clients (e.g. PocketPaw) forward assistant turns with name:'' in
the OpenAI Responses API input[] and chat completions messages[].
Fix: filter out name:'' from messages[] and input[] before translateRequest.
Non-empty non-null name values are preserved per OpenAI spec.
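A minimal sketch of the filter, assuming it runs over both `messages[]` and `input[]` before translation. `stripEmptyName` is a hypothetical name; it only drops the exact empty string and leaves every other `name` value untouched.

```typescript
// Hypothetical helper: remove `name` only when it is exactly the empty string.
function stripEmptyName<T extends { name?: unknown }>(items: T[]): T[] {
  return items.map((item) => {
    if (item.name !== "") return item;
    // Copy the item without its `name` key rather than mutating the input.
    const { name: _dropped, ...rest } = item;
    return rest as unknown as T;
  });
}
```

Usage would look like `body.messages = stripEmptyName(body.messages)`, where `body` stands for the parsed request.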
When proxying Claude responses through OmniRoute, thinking blocks were being
emitted as regular content (delta.content) with <think>...</think> XML tags.
Clients like Claude Code, Cursor, and Windsurf look for delta.reasoning_content
to render the thinking panel — not <think> tags inside content.
Root cause (claude-to-openai.ts):
- content_block_start type:thinking → emitted { content: '<think>' }
- content_block_delta thinking_delta → emitted { content: delta.thinking }
- content_block_stop thinking block → emitted { content: '</think>' }
Fix:
- content_block_start → emits { reasoning_content: '' } (signals block start)
- thinking_delta → emits { reasoning_content: delta.thinking }
- content_block_stop → no extra chunk needed (thinking streamed via reasoning_content)
This fix applies when sourceFormat=CLAUDE targetFormat=OPENAI (Antigravity OAuth,
direct Claude API providers). The user reported 'Thinking Budget: passthrough'
was enabled but thinking was invisible — this is the root cause.
Fixes #289
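The event mapping in this fix can be sketched as a pure function from one Claude stream event to an OpenAI-style delta. This is a simplified illustration, not the code in claude-to-openai.ts: `toOpenAIDelta` and the trimmed-down event types are assumptions, and tool-use blocks and chunk envelopes are ignored.

```typescript
type ClaudeEvent =
  | { type: "content_block_start"; content_block: { type: string } }
  | { type: "content_block_delta"; delta: { type: string; thinking?: string; text?: string } }
  | { type: "content_block_stop" };

type OpenAIDelta = { content?: string; reasoning_content?: string };

// Returns the delta to emit, or null when no chunk is needed.
function toOpenAIDelta(event: ClaudeEvent): OpenAIDelta | null {
  switch (event.type) {
    case "content_block_start":
      // An empty reasoning_content signals the start of a thinking block.
      return event.content_block.type === "thinking" ? { reasoning_content: "" } : null;
    case "content_block_delta":
      if (event.delta.type === "thinking_delta") {
        return { reasoning_content: event.delta.thinking ?? "" };
      }
      if (event.delta.type === "text_delta") {
        return { content: event.delta.text ?? "" };
      }
      return null;
    case "content_block_stop":
      // No closing chunk: thinking was already streamed via reasoning_content.
      return null;
    default:
      return null;
  }
}
```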
Resolves root cause of #252 (Electron black screen) and #249 (OAuth fail)
for users running with zero configuration (no .env needed).
New: scripts/bootstrap-env.mjs
- Auto-generates JWT_SECRET (64 bytes), STORAGE_ENCRYPTION_KEY (32 bytes),
API_KEY_SECRET (32 bytes) if missing or empty
- Persists to {DATA_DIR}/server.env — survives restarts, Docker volume
remounts, and upgrades without changing secrets
- Reads .env from CWD (user overrides), then merges process.env (highest prio)
- Logs friendly warnings for missing optional OAuth secrets
Updated: run-standalone.mjs + run-next.mjs
- Call bootstrapEnv() before spawning server — covers npm + Docker paths
Updated: electron/main.js (synchronous inline — CJS cannot await import ESM)
- Reads userData/server.env, generates missing secrets with crypto.randomBytes()
- Persists back to server.env, sets OMNIROUTE_BOOTSTRAPPED=true
New: BootstrapBanner.tsx + page.tsx update
- Dismissable amber banner on dashboard home when running in zero-config mode
- Shows where server.env is located and how to customize secrets
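The secret generation step might look roughly like the sketch below. It is not the contents of bootstrap-env.mjs: `fillMissingSecrets` is a hypothetical name, hex encoding is an assumption, and reading or writing server.env is left out.

```typescript
import { randomBytes } from "node:crypto";

// Byte lengths as listed in the commit message.
const REQUIRED_SECRETS: Record<string, number> = {
  JWT_SECRET: 64,
  STORAGE_ENCRYPTION_KEY: 32,
  API_KEY_SECRET: 32,
};

// Returns only the secrets that had to be generated (missing or empty in `env`).
function fillMissingSecrets(env: Record<string, string | undefined>): Record<string, string> {
  const generated: Record<string, string> = {};
  for (const name of Object.keys(REQUIRED_SECRETS)) {
    if (!env[name]) {
      generated[name] = randomBytes(REQUIRED_SECRETS[name]).toString("hex"); // encoding is an assumption
    }
  }
  return generated;
}
```

Returning only the newly generated values makes it easy to persist them once and leave user-supplied secrets unchanged on later runs.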
In packaged Electron on macOS/Windows/Linux, there is no .env file.
The Next.js server needs JWT_SECRET and STORAGE_ENCRYPTION_KEY to start —
without them it crashes silently, causing ERR_CONNECTION_REFUSED
and a black screen in the Electron window.
Fix: Generate cryptographically random values with crypto.randomBytes()
on first launch, persist them in userData/electron-env.json, and pass
them to the spawned server.js process via the env option.
Root cause: macOS users reported 'app black screen' (#252) and
ERR_CONNECTION_REFUSED — this was the Next.js server crashing at startup
because these env vars don't exist in the desktop OS environment.
- Step 4 now marked ⚠️ MANDATORY with a "CI will fail" warning
- Command now auto-extracts the version from package.json (no manual substitution)
- Step 4 has // turbo annotation for auto-execution
- Added 'Known CI Pitfalls' table: docs-sync failures, Electron fpm, Docker 502
check:docs-sync fails when openapi.yaml version != package.json version.
Updating to match after v2.2.4 release.
Systematic fix: openapi.yaml version must always be updated alongside
package.json during releases (see generate-release workflow step 4).
2026-03-10 14:43:17 -03:00
2002 changed files with 464028 additions and 142308 deletions
description: Automatically run the browser_subagent to visually validate all new UI features from the current release and capture evidence WebP recordings of the changes.
---
# Capture Release Evidences Workflow
Use this workflow to automatically drive the `browser_subagent` to explore the newly deployed or locally running application and record evidence of the UI changes introduced in the latest release.
## Prerequisites
- OmniRoute must be actively running and accessible (e.g. locally at `http://localhost:20128` or on the Local VPS at `http://192.168.0.15:20128`).
- The user must provide the target URL to be tested, or default to `http://192.168.0.15:20128`.
## Workflow Steps
### 1. Identify Target Features
Review the `CHANGELOG.md` for the latest version to map out the new UI elements. For example:
For each identified feature, invoke the `browser_subagent` using the `default_api:browser_subagent` tool.
**Important Task Guidelines for the Subagent:**
- `TaskName`: Give it a clear name like "Validate CLIProxyAPI Tool Tab".
- `TaskSummary`: "Navigate to the CLI Tools tab and verify the new Integration settings."
- `Task`: Provide unambiguous instructions for the subagent, such as: "Navigate to http://192.168.0.15:20128/dashboard. Click on the 'Settings' or 'CLI Tools' nav link. Scroll down to find the CLIProxyAPI integration card. Hover over it to trigger UI state. Verify the components render correctly and exit."
- `RecordingName`: Ensure it describes the feature (e.g. `v3_4_5_cli_proxy_api`). This parameter is required; the system automatically saves the recording as a WebP video artifact.
_(Note: The `browser_subagent` automatically creates a WebP recording named by the `RecordingName` parameter. No additional tools for screenshots are needed.)_
### 3. Generate Report Artifact
After the `browser_subagent` finishes its sessions, generate a final Markdown artifact (using `write_to_file` and `IsArtifact=true`) to present the recordings inline to the user using Markdown image syntax.
### Example Invocation
\```json
{
"TaskName": "Validating Qoder PAT Configuration UI",
"TaskSummary": "Validates the Qoder provider configuration modal",
"Task": "Go to http://192.168.0.15:20128/dashboard. Click on the 'Providers' tab. Find 'Qoder' in the list. Click 'Add Token' or 'Configure'. Type 'test_token' and submit. Return when done.",
@@ -4,16 +4,55 @@ description: Create a new release, bump version up to 1.x.10 threshold, update c
# Generate Release Workflow
Bump version, finalize CHANGELOG, commit, tag, push, publish to npm, and create GitHub release.
Bump version, finalize CHANGELOG, commit, open a **PR to main** and wait for user confirmation before tagging, publishing, and deploying.
> **VERSION RULE: Always use PATCH bumps (2.x.y → 2.x.y+1)**
> NEVER use `npm version minor` or `npm version major`.
> Always use: `npm version patch --no-git-tag-version`
> The threshold rule: when `y` reaches 10, bump to `2.(x+1).0` — e.g. `2.1.10` → `2.2.0`.
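The threshold rule can be written down as a small function. This is only a sketch of the arithmetic; `nextVersion` is a hypothetical helper and not part of the release tooling, which uses `npm version patch` and a manual edit at the threshold.

```typescript
// Hypothetical: compute the next version under the "patch until 10" rule.
function nextVersion(current: string): string {
  const parts = current.split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`Unexpected version format: ${current}`);
  }
  const major = parts[0], minor = parts[1], patch = parts[2];
  // Once the patch number has reached 10, roll over to the next minor.
  return patch >= 10 ? `${major}.${minor + 1}.0` : `${major}.${minor}.${patch + 1}`;
}
```

For example, `nextVersion("2.1.9")` yields `2.1.10` and `nextVersion("2.1.10")` yields `2.2.0`, matching the examples in the rule.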
## Steps
> **🔴 SINGLE BRANCH RULE**: The `release/vX.Y.Z` branch is the **ONLY** development branch for the entire release cycle. ALL work — bug fixes, feature implementations, PR integrations, issue resolutions — MUST be committed directly on this branch. Never create separate `fix/`, `feat/`, or topic branches. When running `/resolve-issues`, `/implement-features`, or `/review-prs`, always work on the current release branch.
**NEVER push directly to main or create tags before the user confirms the PR.**
---
## Phase 0: Security Verification (MANDATORY)
Before creating the release, you must ensure the codebase and supply chain are secure and free of known vulnerabilities.
1. **Run Local Dependencies Audit:**
```bash
npm audit
```
_Fix any `high` or `critical` vulnerabilities identified._
2. **Check GitHub CodeQL & Dependabot Alerts:**
Navigate to the repository's **Security** tab on GitHub, or use the project's `vulnerability-scanner` skill to analyze active alerts. Ensure all static analysis findings (e.g., prototype pollution, insecure randomness, ReDoS, shell injections) are addressed and logically committed on a target branch.
---
## Phase 1: Pre-Merge
### 1. Create release branch
```bash
git checkout -b release/v2.x.y
```
### 2. Determine new version
Check current version in `package.json` and increment the **patch** number only:
@@ -27,12 +66,28 @@ Version format: `2.x.y` — examples:
- `2.1.9` → `2.1.10` (patch)
- `2.1.10` → `2.2.0` (minor threshold — do manually with `sed`)
```bash
# ALWAYS use patch:
npm version patch --no-git-tag-version
```
> **⚠️ ATOMIC COMMIT RULE — Version bump MUST happen before committing feature files.**
>
> **CORRECT order:**
>
> 1. `npm version patch --no-git-tag-version` ← bump first
> 2. implement features / fix bugs
> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
>
> **OR if features are already staged:**
>
> 1. implement features (do NOT commit yet)
> 2. `npm version patch --no-git-tag-version` ← bump before committing
> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
>
> **NEVER do this (creates version mismatch in git history):**
>
> - ~~commit features → then bump version → commit package.json separately~~
>
> This ensures that `git show v2.x.y` always contains both code changes and the version bump together.
> The GitHub release tag will point to a commit that includes ALL changes for that version.
### 2. Regenerate lock file (REQUIRED after version bump)
### 3. Regenerate lock file (REQUIRED after version bump)
**Mandatory** — skipping causes `@swc/helpers` lock mismatch and CI failures:
@@ -40,7 +95,7 @@ npm version patch --no-git-tag-version
npm install
```
### 3. Finalize CHANGELOG.md
### 4. Finalize CHANGELOG.md
Replace `[Unreleased]` header with the new version and date.
Keep an empty `## [Unreleased]` section above it.
@@ -53,45 +108,172 @@ Keep an empty `## [Unreleased]` section above it.
## [2.x.y] — YYYY-MM-DD
```
### 4. Update openapi.yaml version
### 5. Update openapi.yaml version ⚠️ MANDATORY
> **CI will fail** if `docs/openapi.yaml` version ≠ `package.json` version (`check:docs-sync` enforces this).
// turbo
```bash
sed -i 's/version: OLD/version: NEW/' docs/openapi.yaml
| `[docs-sync] FAIL - OpenAPI version differs from package.json` | Skipped step 5 — `docs/openapi.yaml` version not updated | Run step 5 (`sed -i ...`) and commit |
| `[docs-sync] FAIL - CHANGELOG.md first section must be "## [Unreleased]"` | `## [Unreleased]` missing or not at top of CHANGELOG | Add `## [Unreleased]\n\n---\n` before the first versioned `## [x.y.z]` |
| Electron Linux `.deb` build fails (`FpmTarget` error) | `fpm` Ruby gem not installed on `ubuntu-latest` runner | Already fixed in `electron-release.yml` (`gem install fpm` step) |
| Docker Hub `502 error writing layer blob` | Transient Docker Hub network error during ARM64 push | Re-run the Docker publish workflow; no code change needed |
@@ -6,7 +6,9 @@ description: Analyze open feature request issues, implement viable ones on dedic
## Overview
Fetches open feature request issues, analyzes each against the current codebase, implements viable ones on dedicated branches, and responds to authors with results. Does NOT merge to main — leaves branches for author validation.
Fetches open feature request issues, analyzes each against the current codebase, implements viable ones **on the current release branch** (`release/vX.Y.Z`), and responds to authors with results. Does NOT merge to main — the release branch is later merged via PR.
> **BRANCH RULE**: All work MUST happen on the current `release/vX.Y.Z` branch. Never create separate `feat/` branches. If no release branch exists yet, create one first using `/generate-release` Phase 1 steps 1–5.
## Steps
@@ -16,15 +18,48 @@ Fetches open feature request issues, analyzes each against the current codebase,
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 50 --json number,title,labels,body,comments,createdAt,author`
- Filter for issues that are feature requests (label `enhancement`/`feature`, or body describes new functionality, or previously classified as feature request)
- Sort by oldest first
Before doing any work, ensure you are on the current release branch:
### 3. Analyze Each Feature Request
```bash
# Check current branch
git branch --show-current
# If on main, determine next version and create the release branch
If already on a `release/vX.Y.Z` branch, continue working there.
### 3. Fetch Open Feature Request Issues
// turbo-all
**⚠️ CRITICAL**: The JSON output of `gh issue list` can be truncated by the tool, silently hiding issues and their comments. You MUST use the two-step approach below to guarantee **all** feature requests and their full conversations are fetched.
**Step 3a — Get Issue numbers only** (small output, never truncated):
- Run: `gh issue list --repo <owner>/<repo> --state open --label "enhancement" --limit 500 --json number --jq '.[].number'`
- (Also run the same for `--label "feature"` if they are separated, or filter all open issues if labels are not strictly used).
- This outputs one issue number per line. Count them and confirm total.
**Step 3b — Fetch full metadata & conversations for each Issue** (one call per issue):
- Read not just the body, but **ALL comments (`comments` array)** completely to understand the full context, agreements, and restrictions discussed by the community.
- You may batch these into parallel calls (up to 4 at a time).
- Filter for issues that are feature requests (if not already filtered by label).
- Sort by oldest first.
### 4. Analyze Each Feature Request
For each feature request issue, perform a **two-level analysis**:
@@ -46,21 +81,16 @@ Ask yourself:
#### Level 2 — Implementation (only for VIABLE features)
> **⚠️ ALL implementation happens on the release branch.**
1. **Research** — Read all related source files to understand the current architecture
2. **Design** — Plan the implementation, filling gaps in the original request
3. **Create branch** — Name format: `feat/issue-<NUMBER>-<short-slug>`
```bash
git checkout main
git pull origin main
git checkout -b feat/issue-<NUMBER>-<short-slug>
```
4. **Implement** — Build the complete solution following project patterns
5. **Build** — Run `npm run build` to verify compilation
6. **Continue** — Move to the next feature (do not switch branches)
### 4. Respond to Authors
### 5. Respond to Authors
#### For VIABLE (implemented) features:
@@ -70,9 +100,9 @@ Post a comment on the issue:
````markdown
## ✅ Feature Implemented!
Hi @<author>! We've analyzed your request and implemented it on a dedicated branch.
Hi @<author>! We've analyzed your request and implemented it.
**Branch:** `feat/issue-<NUMBER>-<short-slug>`
**Branch:** `release/vX.Y.Z` (upcoming release)
### What was implemented:
@@ -82,31 +112,24 @@ Hi @<author>! We've analyzed your request and implemented it on a dedicated bran
```bash
git fetch origin
git checkout feat/issue-<NUMBER>-<short-slug>
git checkout release/vX.Y.Z
npm install && npm run dev
```
````
### Next steps:
1. **Test it** — Please verify it works as you expected
2. **Want to improve it?** — You're welcome to contribute! Just:
```bash
git checkout feat/issue-<NUMBER>-<short-slug>
# Make your improvements
git add -A && git commit -m "improve: <your changes>"
git push origin feat/issue-<NUMBER>-<short-slug>
```
Then open a Pull Request from your branch to `main` 🎉
2. **Want to improve it?** — Feel free to open a follow-up PR targeting `release/vX.Y.Z`
3. **Not quite right?** — Let us know in this issue what needs to change
Looking forward to your feedback! 🚀
```
This will be included in the next release. Looking forward to your feedback! 🚀
````
#### For NEEDS MORE INFO:
// turbo
Post a comment asking for specific missing details needed to implement, e.g.:
- "Could you describe the exact behavior when X happens?"
- "Which API endpoints should be affected?"
- "Should this apply to all providers or only specific ones?"
@@ -114,18 +137,28 @@ Post a comment asking for specific missing details needed to implement, e.g.:
Add the context of WHY you need each piece of information.
#### For NOT VIABLE:
// turbo
Post a polite comment explaining why the feature doesn't fit at this time:
- If the idea is decent but timing is wrong: "This is an interesting idea, but it doesn't align with our current priorities. Feel free to open a new issue with more details if you'd like us to reconsider."
- If fundamentally flawed: Explain the technical or architectural reasons why it won't work, suggest alternatives if possible.
- Close the issue after posting the comment.
### 5. Summary Report
### 6. Finalize & Push
After implementing all viable features:
1. **Update CHANGELOG.md** on the release branch with all new feature entries
2. Push the release branch: `git push origin release/vX.Y.Z`
3. Run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user)
### 7. Summary Report
Present a summary report to the user via `notify_user`:
| Issue | Title | Verdict | Branch / Action |
|---|---|---|---|
| #N | Title | ✅ Implemented | `feat/issue-N-slug` |
| #N | Title | ❓ Needs Info | Comment posted |
| #N | Title | ❌ Not Viable | Closed with explanation |
@@ -6,7 +6,9 @@ description: Fetch all open GitHub issues, analyze bugs, resolve what's possible
## Overview
This workflow fetches all open issues from the project's GitHub repository, classifies them, analyzes bugs, resolves what can be fixed, and triages issues with insufficient information. **It does NOT merge or release automatically** — it creates a PR and waits for user validation before merging.
This workflow fetches all open issues from the project's GitHub repository, classifies them, analyzes bugs, resolves what can be fixed, and triages issues with insufficient information. **All fixes are committed on the current release branch** (`release/vX.Y.Z`). It does NOT merge or release automatically — the release branch is later merged via PR to main.
> **BRANCH RULE**: All work MUST happen on the current `release/vX.Y.Z` branch. Never create separate `fix/` branches. If no release branch exists yet, create one first using `/generate-release` Phase 1 steps 1–5.
## Steps
@@ -17,15 +19,45 @@ This workflow fetches all open issues from the project's GitHub repository, clas
- Run: `git -C <project_root> remote get-url origin` to extract the owner/repo
- Parse the owner and repo name from the URL
### 2. Fetch All Open Issues
### 2. Ensure Release Branch Exists
// turbo
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 100 --json number,title,labels,body,comments,createdAt,author`
- Parse the JSON output to get a list of all open issues
- Sort by oldest first (FIFO)
Before doing any work, ensure you are on the current release branch:
### 3. Classify Each Issue
```bash
# Check current branch
git branch --show-current
# If on main, determine next version and create the release branch
If already on a `release/vX.Y.Z` branch, continue working there.
### 3. Fetch All Open Issues
// turbo-all
**⚠️ CRITICAL**: The JSON output of `gh issue list` can be truncated by the tool, silently hiding issues. You MUST use the two-step approach below to guarantee **all** issues are fetched.
**Step 3a — Get Issue numbers only** (small output, never truncated):
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 500 --json number --jq '.[].number'`
- This outputs one issue number per line. Count them and confirm total.
**Step 3b — Fetch full metadata for each Issue** (one call per issue):
- You may batch these into parallel calls (up to 4 at a time).
- Sort by oldest first (FIFO).
### 4. Classify Each Issue
For each issue, determine its type:
@@ -36,85 +68,111 @@ For each issue, determine its type:
Focus ONLY on **Bugs** for resolution. Feature requests and questions should be skipped with a note in the final report.
### 4. Analyze Each Bug — For each bug issue:
### 5. Deep-Read Each Bug Issue (One-by-One Analysis)
#### 4a. Check Information Sufficiency
**IMPORTANT**: Read each bug issue thoroughly, one at a time, before moving to the next. This is NOT a batch process — each issue needs focused attention.
Verify the issue contains enough information to reproduce and fix:
#### 5a. Understand the Problem
For each bug issue, perform the full analysis:
1. **Read the entire body** — including Description, Steps to Reproduce, Expected/Actual Behavior, Error Logs, and Screenshots
2. **Read ALL comments** — including bot triage comments (Kilo, etc.) and owner/community responses. Pay attention to:
- Whether someone already responded with a fix
- Whether a community member confirmed the issue is resolved
- Whether the issue was marked as duplicate by a bot
3. **Identify the claimed error** — extract the exact error message, status code, and provider/model involved
#### 5b. Check Information Sufficiency
Verify the issue contains enough to act on:
- [ ] Clear description of the problem
- [ ] Steps to reproduce
- [ ] Error messages or logs
- [ ] Steps to reproduce OR error logs
- [ ] Provider/model/version information
- [ ] Expected vs actual behavior
#### 4b. If Information Is INSUFFICIENT
#### 5c. Determine Issue Disposition
Call the `/issue-triage` workflow (located at `~/.gemini/antigravity/global_workflows/issue-triage.md`):
// turbo
For each bug, classify into one of 5 actions:
- Post a comment asking for more details using `gh issue comment`
- Add `needs-info` label using `gh issue edit`
- Mark this issue as **DEFERRED** and move to the next one
| **✅ CLOSE — Already Fixed** | Owner responded with fix + no user follow-up, OR community confirmed fix | Close with comment citing which version fixed it |
| **✅ CLOSE — Duplicate** | Bot flagged >85% similarity + user provides no new info | Close referencing the original issue |
| **📝 RESPOND — Needs Info** | Issue is real but missing critical reproduction details | Comment asking for specifics per `/issue-triage` |
| **📝 RESPOND — User Config** | Error is caused by unsupported env (Node version, wrong model path, missing API enablement) | Comment explaining the user-side fix |
| **🔧 FIX — Code Change** | Root cause is confirmed in the codebase | Research, implement, test, commit on release branch |
#### 4c. If Information Is SUFFICIENT
#### 5d. For "FIX — Code Change" Issues
Proceed with resolution:
Before coding, perform deep source analysis:
1. **Create a fix branch** — `git checkout -b fix/issue-<NUMBER>-<short-description>`
2. **Research** — Search the codebase for files related to the issue
3. **Root Cause** — Identify the root cause by reading the relevant source files
4. **Implement Fix** — Apply the fix following existing code patterns and conventions
5. **Test** — Build the project and run tests to verify the fix
6. **Commit** — Commit with message format: `fix: <description> (#<issue_number>)`
1. **Search the codebase** — `grep_search` for error strings, relevant function names, affected files
2. **Search the web** — for upstream API changes, SDK updates, or breaking changes that explain the bug
3. **Read the full source file** — don't rely on grep snippets; understand the surrounding logic
4. **Verify the root cause** — confirm the bug is reproducible based on the code, not just a user misconfiguration
5. **Implement the fix** — follow existing code patterns and conventions
**This is a mandatory stop point.** Use `notify_user` with `BlockedOnUser: true`:
- Inform the user that the PR was created and is **awaiting their verification**
- Include the PR number, URL, and a summary of what was changed
- Inform the user that fixes have been **committed and pushed to the release branch**
- Include summary of fixes, test status, and files changed
- **DO NOT merge, close issues, generate releases, or deploy until the user confirms**
Wait for the user to respond:
- **User confirms** → Proceed to step 8
- **User confirms** → Proceed to step 9
- **User requests changes** → Apply changes, push to the same branch, notify again
- **User rejects** → Close the PR and stop
- **User rejects** → Revert and stop
### 8. Merge, Close Issues & Release (only after user confirms PR)
### 9. Close Issues & Finalize (only after user confirms)
After the user confirms the PR:
After the user confirms:
1. **Merge** the PR: `gh pr merge <NUMBER> --merge --repo <owner>/<repo>` or via local merge
2. **Close** resolved issues with a comment: `gh issue close <NUMBER> --repo <owner>/<repo> --comment "Fixed in <commit_hash>. The fix will be included in the next release."`
3. **Switch to main**: `git checkout main && git pull`
4. Run the `/update-docs` workflow (at `~/.gemini/antigravity/global_workflows/update-docs.md`) to update CHANGELOG and README
5. Run the `/generate-release` workflow (at `.agents/workflows/generate-release.md`) to bump version, tag, and publish
6. Deploy to local VPS: `ssh root@192.168.0.15 "npm install -g omniroute@<VERSION> && pm2 restart omniroute"`
1. **Close** resolved issues with a comment: `gh issue close <NUMBER> --repo <owner>/<repo> --comment "Fixed in release/vX.Y.Z. The fix will be included in the next release."`
2. Run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user)
If NO fixes were committed, skip this step and just present the report.
This workflow reads all open GitHub Discussions, generates a categorized summary, identifies which ones need a response, drafts and posts replies, and optionally creates issues from actionable feature requests. It follows the same flow used for Issues but adapted for the Discussions forum.
// turbo-all
## Steps
### 1. Identify the GitHub Repository
- Run: `git -C <project_root> remote get-url origin` to extract the owner/repo
- Parse the owner and repo name from the URL
### 2. Fetch All Open Discussions
- Use `read_url_content` to fetch `https://github.com/<owner>/<repo>/discussions`
- Parse the discussion list to get all discussion titles, IDs, authors, categories, and dates
- For each discussion, fetch the individual page to read the full content and all comments/replies
### 3. Summarize All Discussions
For each discussion, extract:
- **Title** and **#Number**
- **Author** (GitHub username)
- **Category** (Announcements, General, Ideas, Q&A, Show and tell)
- **Date** created
- **Summary** of the original post (1-2 sentences)
- **Comments count** and key participants
- **Your previous response** (if any)
- **Pending action** — whether a response or follow-up is needed
### 4. Present Summary Report to User
Present the full summary to the user organized by category, using a table:
@@ -6,7 +6,9 @@ description: Analyze open Pull Requests from the project's GitHub repository, ge
## Overview
This workflow fetches all open PRs from the project's GitHub repository, performs a critical analysis of each one, generates a detailed report, and waits for user approval before proceeding with implementation. **All improvements are committed on top of the PR branch** and the user must verify before merge.
This workflow fetches all open PRs from the project's GitHub repository, performs a critical analysis of each one, generates a detailed report, and waits for user approval before proceeding with implementation. **All improvements are committed on the current release branch** (`release/vX.Y.Z`).
> **BRANCH RULE**: All work MUST happen on the current `release/vX.Y.Z` branch. Never create separate feature or fix branches. If no release branch exists yet, create one first using `/generate-release` Phase 1 steps 1–5.
## Steps
@@ -16,51 +18,94 @@ This workflow fetches all open PRs from the project's GitHub repository, perform
// turbo
- Run: `git -C <project_root> remote get-url origin` to extract the owner/repo
### 2. Fetch Open Pull Requests
### 2. Ensure Release Branch Exists
// turbo
Before doing any work, ensure you are on the current release branch:
```bash
# Check current branch
git branch --show-current
# If on main, determine next version and create the release branch
If already on a `release/vX.Y.Z` branch, continue working there.
### 3. Fetch Open Pull Requests
// turbo-all
**⚠️ CRITICAL**: The JSON output of `gh pr list` can be truncated by the tool, silently hiding PRs. You MUST use the two-step approach below to guarantee **all** PRs are fetched.
**Step 3a — Get PR numbers only** (small output, never truncated):
- Run: `gh pr list --repo <owner>/<repo> --state open --limit 500 --json number --jq '.[].number'`
- This outputs one PR number per line. Count them and confirm total.
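The two-step approach can be sketched as follows: parse the one-number-per-line output of Step 3a, confirm the count, then build one `gh pr view` call per PR. The helper names and the default field list are illustrative assumptions; the workflow's own Step 3b field list may differ.

```python
def parse_pr_numbers(output: str) -> list[int]:
    """Turn the one-number-per-line output of Step 3a into integers."""
    return [int(line) for line in output.splitlines() if line.strip()]


def build_view_commands(
    numbers: list[int],
    repo: str,
    # Assumed field selection, not taken from the workflow text.
    fields: tuple[str, ...] = ("number", "title", "author", "headRefName", "baseRefName"),
) -> list[list[str]]:
    """Build one `gh pr view` invocation per PR so no single output grows large."""
    return [
        ["gh", "pr", "view", str(number), "--repo", repo, "--json", ",".join(fields)]
        for number in numbers
    ]
```

Each command list can be passed to `subprocess.run(..., capture_output=True, text=True)`; comparing `len(numbers)` with the number of metadata records collected is a cheap way to detect a silently dropped PR.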
**Step 3b — Fetch full metadata for each PR** (one call per PR):
- **Document gaps** — If missing layers are detected, list them as **IMPORTANT** issues in the report with concrete suggestions for what should be added
### 4. Generate Report — Create a markdown report for each PR including:
### 5. Generate Report — Create a markdown report for each PR including:
- **PR Summary** — What it does, files affected, commit count
- **Improvements/Benefits** — Numbered list with impact level (HIGH/MEDIUM/LOW)
@@ -84,62 +129,71 @@ Perform a **global impact assessment** to verify whether the PR changes are comp
- **Verdict** — Ready to merge? With mandatory vs optional fixes
- **Next Steps** — What will happen if approved
### 5. Present to User
### 6. Present to User
- Show the report via `notify_user` with `BlockedOnUser: true`
- Wait for user decision:
  - **Approved** → Proceed to step 6
  - **Approved** → Proceed to step 7
  - **Approved with changes** → Implement the fixes and corrections before merging
  - **Rejected** → Close the PR or leave a review comment
### 6. Implementation (if approved)
### 7. Pre-Merge Fixes & CI Green-Lighting (if approved)
- Checkout the PR branch: `gh pr checkout <NUMBER>`
- Implement any required fixes identified in the analysis
- If the Cross-Layer Analysis (3f) identified missing frontend/backend counterparts, implement them
- **Commit improvements on top of the PR branch** with descriptive commit messages
- Run the project's test suite to verify nothing breaks
> **⚠️ Fixes should be pushed back to the PR branch before merging.** We want the PR itself to be green and fully valid before it integrates.
- **Sync latest fixes:** Merge `main` or the current `release` branch into the PR branch so the PR inherits any latest CI or integration test fixes (preventing false-positive failures).
- **Implement improvements:** Apply the required fixes identified in the analysis directly on the PR branch (e.g., adding missing API routes, fixing SSRF, applying comments from other agents).
- **Pushing changes to PR branches:**
```bash
# Checkout the PR locally
gh pr checkout <NUMBER>
# Apply fixes, commit your changes
git commit -m "chore: apply review suggestions and missing layers"
# Attempt to push directly to the PR branch
git push
```
- **Fallback (For external forks without maintainer edit access):**
If `git push` fails because the PR comes from an external fork without write access, you MUST:
1. Create a new branch named `fix-pr-<NUMBER>` (e.g., `git checkout -b fix-pr-<NUMBER>`).
2. Push your branch to the main repo (`git push origin fix-pr-<NUMBER>`).
3. Create a Pull Request targeting the contributor's repository and branch (use `gh pr create --repo <contributor-repo> --base <contributor-branch> --head diegosouzapw:fix-pr-<NUMBER>`).
4. Once they accept our PR into their branch, their original PR to our `main` will automatically update and become green.
- Run the project's test suite locally to verify nothing breaks:
// turbo
- Run: `npm test` or equivalent test command
- Build the project to verify compilation
// turbo
- Run: `npm run build` or equivalent build command
- Push the updated branch: `git push origin <branch-name>`
**This is a mandatory stop point.** Use `notify_user` with `BlockedOnUser: true`:
- Once the PR is green (you can check with `gh pr status`), proceed to merge the PR into the current release branch (`release/vX.Y.Z`).
- Inform the user that the PR has been **improved and pushed**, and is **awaiting their verification**
- Include:
  - PR number and URL
  - Summary of improvements/fixes applied
  - Build/test status
  - List of files changed
- **DO NOT merge, generate releases, or deploy until the user confirms**
Wait for the user to respond:
- **User confirms** → Proceed to step 8
- **User requests more changes** → Apply changes, push to the same branch, notify again
- **User rejects** → Leave a review comment and stop
### 8. Thank the Contributor
```bash
gh pr merge <NUMBER> --repo <owner>/<repo>
```
- Post a **thank-you comment** on the PR via the GitHub API
- The message should:
  - Thank the author by name/username for their contribution
  - Briefly mention what the PR accomplishes and any improvements applied
  - Note it will be included in the upcoming release
  - Be friendly, professional, and encouraging
- Example: _"Thanks @author for this great contribution! 🎉 The [feature/fix] is now merged and will be part of the next release. We appreciate your effort!"_
- Example: _"Thanks @author for this great contribution! 🎉 The [feature/fix] has been integrated into the release/vX.Y.Z branch and will be part of the next release. We appreciate your effort!"_
### 9. Merge & Release (only after user confirms PR)
### 9. Close the Original PR
After the user confirms the PR:
- Close the original PR with a comment explaining it was integrated into the release branch:
```bash
gh pr close <NUMBER> --repo <owner>/<repo> --comment "Integrated into release/vX.Y.Z. Will be released as part of v3.X.Y. Thank you!"
```
1. **Merge** the PR into main (local merge with `--no-ff` or via `gh pr merge`)
2. **Push** to main: `git push origin main`
3. **Clean up** the feature branch: `git branch -d <branch-name>`
4.**Update CHANGELOG.md** with the new feature/fix
5. Run the `/generate-release` workflow (at `.agents/workflows/generate-release.md`) to bump version, tag, and publish
6. Deploy to local VPS: `ssh root@192.168.0.15 "npm install -g omniroute@<VERSION> && pm2 restart omniroute"`
### 10. Continue or Finalize
After processing all approved PRs:
- If more PRs remain, go back to step 7
- When all PRs are processed, **update CHANGELOG.md** on the release branch with all new entries
- Run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user)
Each contains: API_REFERENCE.md, ARCHITECTURE.md, CODEBASE_DOCUMENTATION.md, FEATURES.md, TROUBLESHOOTING.md, USER_GUIDE.md
```
**Sync approach for feature table updates:**
a. Identify which feature table rows were added to English README.md
b. For each translated README, find the corresponding anchor lines:
- **Routing section:** Find the `💬` (System Prompt) table row — the line before it is always the last routing feature. Insert new routing features before System Prompt.
- **Resilience section:** Find the `📊` Rate Limits table row (the one in lines 590-600, NOT the quota tracking one in lines 560-570). Insert new resilience features after it.
c. The new feature entries can stay in English for technical features, matching the pattern used in the existing translations.
d. Use `sed` or similar tool to batch-insert across all 29 translated READMEs.
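As an alternative to `sed`, the anchor-based insert described in the steps above can be sketched in Python. The function name `insert_before_anchor` is hypothetical, and the sketch only covers the "insert before the anchor row" case used for the routing section.

```python
def insert_before_anchor(text: str, anchor: str, new_rows: list[str]) -> str:
    """Insert table rows immediately before the first table row containing `anchor`."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if anchor in line and line.lstrip().startswith("|"):
            # Reruns should be harmless: skip when every row is already present.
            if all(row in lines for row in new_rows):
                return text
            lines[index:index] = new_rows
            trailing_newline = "\n" if text.endswith("\n") else ""
            return "\n".join(lines) + trailing_newline
    raise ValueError(f"Anchor {anchor!r} not found in any table row")
```

Looping this over every `README.*.md` file with the `💬` anchor would reproduce step (d); the resilience case needs the mirror-image "insert after" variant.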
**Verification:**
```bash
# Verify all READMEs have the new features
grep -l "NEW_FEATURE_NAME" README.md README.*.md | wc -l
# Should return 30 (English plus the 29 translated versions)
```
**FEATURES.md sync:**
```bash
# Update Settings description in all docs/i18n/*/FEATURES.md
for dir in docs/i18n/*/; do
# Update the Settings section description to mention new features
description: Bump version, auto-generate CHANGELOG from git commits, update all versioned files, and refresh root + docs/ documentation to reflect the current project state
---
# Version Bump Workflow
Automatically bump the project version, generate CHANGELOG entries from git history since the last tag, update every file that references the version, and refresh project documentation to reflect the current state.
> **VERSION RULE: Always use PATCH bumps (3.x.y → 3.x.y+1)**
> NEVER use `npm version minor` or `npm version major`.
> Always use: `npm version patch --no-git-tag-version`
> The threshold rule: when `y` reaches 10, bump to `3.(x+1).0` — e.g. `3.4.10` → `3.5.0`.
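
The version rule above can be sketched as a small function. This is one reading of the threshold rule, based on the `3.4.10` → `3.5.0` example: the patch number is allowed to reach 10, and the next bump rolls over. The name `next_version` is hypothetical.

```python
def next_version(current: str, threshold: int = 10) -> str:
    """Return the next version under the patch-only rule with rollover."""
    major, minor, patch = (int(part) for part in current.split("."))
    if patch >= threshold:
        # Patch has reached the threshold: roll over to the next minor line.
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

Under this reading, `3.4.9` becomes `3.4.10`, and `3.4.10` becomes `3.5.0`.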
If the version was ALREADY bumped (e.g. you are on a release branch and package.json already has the new version), **skip the npm version bump** and use the existing version.
For each category with entries, create a markdown section with descriptive bullet points. Use the commit messages but rewrite them to be human-readable and descriptive (not raw commit messages).
**If a commit references a PR number** (e.g. `#880`, `PR #885`), include it in the description.
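A short sketch of pulling PR references out of a commit message so they can be appended to the CHANGELOG bullet. It treats every `#<digits>` token as a reference, which is an assumption: the same syntax is also used for issues, so a real run may need to confirm each number.

```python
import re

# Matches "#880" or "PR #885"; the lookbehind skips things like "abc#12" or "&#39;".
PR_REF = re.compile(r"(?<![\w&])#(\d+)\b")


def extract_pr_numbers(message: str) -> list[int]:
    """Return referenced numbers in order of first appearance, without duplicates."""
    seen: list[int] = []
    for match in PR_REF.finditer(message):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen
```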
### 6. Update CHANGELOG.md
Replace the `## [Unreleased]` section content with the generated entries, then add the new versioned section:
```markdown
## [Unreleased]
---
## [NEW_VERSION] — YYYY-MM-DD
### ✨ New Features
- **Feature name:** Description (#PR)
### 🐛 Bug Fixes
- **Fix name:** Description (#PR)
### 🛠️ Maintenance
- **Item:** Description
---
## [PREVIOUS_VERSION] — YYYY-MM-DD
...
```
The date must be today's date in `YYYY-MM-DD` format.
---
## Phase 3: Sync Version Across All Files
### 7. Update workspace package.json files and openapi.yaml
sed -i "s/\*\*Current version:\*\* $OLD_VERSION_PATTERN/**Current version:** $VERSION/" llm.txt
# Update "Key Features (vX.Y.Z)" header
sed -i "s/## Key Features (v$OLD_VERSION_PATTERN)/## Key Features (v$VERSION)/" llm.txt
echo "✓ llm.txt → $VERSION"
```
### 9. Regenerate lock file
// turbo
```bash
cd /home/diegosouzapw/dev/proxys/9router
npm install
echo "✓ Lock file regenerated"
```
---
## Phase 4: Update Root Documentation
Based on the CHANGELOG entries generated in Phase 2, review and update these root-level files if relevant changes warrant updates:
### 10. Review and update root documentation files
For each file below, read the current content and determine if the CHANGELOG entries require any updates. Only modify files where substantive changes have occurred:
- **README.md**: Update provider count, test count, feature highlights table, badges if any numbers changed. If a new provider was added, add it to the provider table. If a major feature was added, add it to the features section.
- **AGENTS.md**: If new architecture components (handlers, executors, services, DB modules) were added, update the Architecture section. If new commands were added, update the Build/Test table.
- **SECURITY.md**: Add new vulnerability fixes or security improvements to the relevant section.
- **llm.txt**: Update provider count, feature list, version references.
### 11. Review and update docs/ files (excluding i18n/)
For each file in `docs/` (excluding `docs/i18n/`), review if CHANGELOG changes affect it:
| `docs/VM_DEPLOYMENT_GUIDE.md` | Deployment changes, new env vars |
| `docs/TROUBLESHOOTING.md` | New known issues, resolved problems |
| `docs/AUTO-COMBO.md` | Routing changes, new strategies |
| `docs/CODEBASE_DOCUMENTATION.md` | New files, architectural changes |
| `docs/RELEASE_CHECKLIST.md` | Process changes |
| `docs/COVERAGE_PLAN.md` | Test changes |
| `docs/openapi.yaml` | Already updated in step 7 |
**Only update files where the CHANGELOG entries directly affect the documented content.** Do NOT update files just to bump a version number — only when the documented behavior, features, or architecture has actually changed.
      description: "Which tool are you using OmniRoute with?"
      placeholder: "e.g. Claude Code, Cursor, Roo Code, OpenClaw, Gemini CLI, cURL"
    validations:
      required: false
  - type: textarea
    id: description
    attributes:
      label: Description
      description: "A clear description of what the bug is."
    validations:
      required: true
  - type: textarea
    id: steps
    attributes:
      label: Steps to Reproduce
      description: "Step-by-step instructions to reproduce the behavior."
      placeholder: |
        1. Go to '...'
        2. Click on '...'
        3. See error
    validations:
      required: true
  - type: textarea
    id: expected
    attributes:
      label: Expected Behavior
      description: "What did you expect to happen?"
    validations:
      required: true
  - type: textarea
    id: actual
    attributes:
      label: Actual Behavior
      description: "What actually happened?"
    validations:
      required: true
  - type: dropdown
    id: test-impact
    attributes:
      label: Test Impact
      description: "What automated test coverage should exist for this bug?"
      options:
        - Needs a new unit test
        - Needs a new integration test
        - Needs a new e2e test
        - Existing automated test already fails
        - Unsure
    validations:
      required: true
  - type: textarea
    id: logs
    attributes:
      label: Error Logs / Output
      description: "Paste any relevant error messages, logs, or terminal output. This will be automatically formatted as code."
      render: shell
    validations:
      required: false
  - type: textarea
    id: screenshots
    attributes:
      label: Screenshots
      description: "If applicable, add screenshots to help explain the problem. Please also include the text of any error messages above — screenshots alone are not searchable."
    validations:
      required: false
  - type: textarea
    id: additional
    attributes:
      label: Additional Context
      description: "Any other context about the problem (e.g. proxy config, number of accounts, network setup)."
    validations:
      required: false
  - type: textarea
    id: validation-plan
    attributes:
      label: Validation Plan
      description: "Which commands or tests should prove this bug is fixed?"
- Treat `npm run test:coverage` as a required gate for PR work.
- The repository minimum is `60%` for statements, lines, functions, and branches.
- If a PR changes production code in `src/`, `open-sse/`, `electron/`, or `bin/`, it must include automated tests in the same PR.
- When reviewing or updating a PR, if the report shows missing tests or coverage below `60%`, do not stop after reporting the problem. Add or update tests in the PR first, rerun the coverage gate, and only then ask for confirmation.
- Prefer the smallest test layer that proves the behavior:
  - unit tests first
  - integration tests when multiple modules or DB state are involved
  - e2e only when the behavior is truly UI or workflow-dependent
- For bug issues, try to encode the reproduction as an automated test before or alongside the fix.
[ 32698ms] [ERROR] Failed to load resource: the server responded with a status of 404 (Not Found) @ http://localhost:20130/dashboard/usage?_rsc=18t7j:0
- paragraph [ref=e145]:OmniRoute is your local AI API proxy. It routes requests to multiple AI providers with load balancing, failover, and usage tracking.
# Coverage (60% minimum for statements, lines, functions, and branches)
npm run test:coverage
```
### PR Coverage Policy
- `npm run test:coverage` is the PR coverage gate in CI.
- The repository minimum is **60%** for statements, lines, functions, and branches.
- If a PR changes production code in `src/`, `open-sse/`, `electron/`, or `bin/`, it must include or update automated tests in the same PR.
- For agent-driven review or coding flows: if coverage is below the gate or source changes ship without tests, do not stop at reporting. Add or update tests first, rerun the gate, and only then ask for confirmation.
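The "production change requires tests in the same PR" rule can be checked mechanically from a list of changed paths. A minimal sketch follows; the production directories come from the policy above, while the way test files are recognized (`tests/` prefix or `.test.` in the name) is an assumption, not the CI's actual logic.

```python
PRODUCTION_DIRS = ("src/", "open-sse/", "electron/", "bin/")
TEST_DIRS = ("tests/",)  # assumed location of automated tests


def is_test_file(path: str) -> bool:
    """Heuristic: anything under tests/ or named like *.test.* counts as a test."""
    return path.startswith(TEST_DIRS) or ".test." in path


def violates_test_policy(changed_files: list[str]) -> bool:
    """True when production code changed but no test file changed alongside it."""
    production = [
        path for path in changed_files
        if path.startswith(PRODUCTION_DIRS) and not is_test_file(path)
    ]
    tests = [path for path in changed_files if is_test_file(path)]
    return bool(production) and not tests
```

The input could come from `gh pr diff <NUMBER> --name-only`, one path per line.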
---
## Code Style Guidelines
### Formatting (Prettier — enforced via lint-staged)
Releases are managed via the `/generate-release` workflow. When a new GitHub Release is created, the package is **automatically published to npm** via GitHub Actions.
---
## Getting Help
- **Architecture**: See [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
- **API Reference**: See [`docs/API_REFERENCE.md`](docs/API_REFERENCE.md)
- Control route: `src/app/api/sync/cloud/route.ts`
## Request Lifecycle (`/v1/chat/completions`)
@@ -335,7 +356,7 @@ flowchart TD
Q -- No --> R[Return all unavailable]
```
Fallback decisions are driven by `open-sse/services/accountFallback.ts` using status codes and error-message heuristics.
Fallback decisions are driven by `open-sse/services/accountFallback.ts` using status codes and error-message heuristics. Combo routing adds one extra guard: provider-scoped 400s such as upstream content-block and role-validation failures are treated as model-local failures so later combo targets can still run.
## OAuth Onboarding and Token Refresh Lifecycle
@@ -593,7 +614,7 @@ Each provider has a specialized executor extending `BaseExecutor` (in `open-sse/
> A comprehensive, beginner-friendly guide to the **omniroute** multi-provider AI proxy router.
@@ -267,7 +267,7 @@ Business logic that supports the handlers and executors.
| `provider.ts` | **Format detection** (`detectFormat`): analyzes request body structure to identify Claude/OpenAI/Gemini/Antigravity/Responses formats (includes `max_tokens` heuristic for Claude). Also: URL building, header building, thinking config normalization. Supports `openai-compatible-*` and `anthropic-compatible-*` dynamic providers. |
| `model.ts` | Model string parsing (`claude/model-name` → `{provider: "claude", model: "model-name"}`), alias resolution with collision detection, input sanitization (rejects path traversal/control chars), and model info resolution with async alias getter support. |
| `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, iFlow, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. |
| `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, Qoder, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. |
| `combo.ts` | **Combo models**: chains of fallback models. If model A fails with a fallback-eligible error, try model B, then C, etc. Returns actual upstream status codes. |
| `usage.ts` | Fetches quota/usage data from provider APIs (GitHub Copilot quotas, Antigravity model quotas, Codex rate limits, Kiro usage breakdowns, Claude settings). |
| `accountSelector.ts` | Smart account selection with scoring algorithm: considers priority, health status, round-robin position, and cooldown state to pick the optimal account for each request. |
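To illustrate the model string parsing described for `model.ts`, here is a rough Python sketch of the `claude/model-name` split with basic input rejection. It is not the project's TypeScript implementation: the exact sanitization rules, the alias resolution, and the behavior for strings without a provider prefix are all assumptions.

```python
import re

_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def parse_model(model_str: str) -> dict:
    """Split "provider/model" into its parts, rejecting suspicious input."""
    # Assumed sanitization: refuse control characters and path traversal.
    if _CONTROL_CHARS.search(model_str) or ".." in model_str:
        raise ValueError("Rejected model string")
    provider, separator, model = model_str.partition("/")
    if not separator:
        # No prefix: leave provider unresolved (alias lookup would go here).
        return {"provider": None, "model": model_str}
    return {"provider": provider, "model": model}
```

Splitting only on the first `/` keeps model names that themselves contain slashes intact.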
@@ -539,7 +539,7 @@ A 2000-token buffer is added to reported usage to prevent clients from hitting c
| Kiro (AWS) | AWS SSO OIDC or Social | Kiro | Binary EventStream parsing |
| Legacy | Old `npm run test:coverage` | 79.42% | 75.15% | 67.94% | Inflated: counts test files and excludes `open-sse` |
| Diagnostic | Source-only, excluding tests and excluding `open-sse` | 68.16% | 63.55% | 64.06% | Useful only to isolate `src/**` |
| Recommended baseline | Source-only, excluding tests and including `open-sse` | 56.95% | 66.05% | 57.80% | This is the project-wide baseline to improve |
The recommended baseline is the number to optimize against.
## Rules
- Coverage targets apply to source files, not to `tests/**`.
- `open-sse/**` is part of the product and must remain in scope.
- New code should not reduce coverage in touched areas.
- Prefer testing behavior and branch outcomes over implementation details.
- Prefer temp SQLite databases and small fixtures over broad mocks for `src/lib/db/**`.
## Current command set
- `npm run test:coverage`
- Main source coverage gate for the unit test suite
- Generates `text-summary`, `html`, `json-summary`, and `lcov`
- `npm run coverage:report`
- Detailed file-by-file report from the latest run
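Since the coverage run emits a `json-summary` report, the gate can also be checked from a script. The sketch below assumes the Istanbul-style layout, where a top-level `total` object holds a `pct` value per metric; the file location (commonly `coverage/coverage-summary.json`) is an assumption about this repository.

```python
METRICS = ("statements", "lines", "functions", "branches")


def failing_metrics(summary: dict, minimum: float = 60.0) -> dict[str, float]:
    """Return the metrics whose total percentage is below the gate."""
    totals = summary["total"]
    return {
        metric: totals[metric]["pct"]
        for metric in METRICS
        if totals[metric]["pct"] < minimum
    }
```

Loading the report with `json.load` and passing the result in gives an empty dict when the gate passes, or the offending metrics with their percentages when it does not.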
| Phase 7 | 90% statements / lines | Final sweep, gap closure, strict ratchet |
Branches and functions should ratchet upward with each phase, but the primary hard target is statements / lines.
## Priority hotspots
These files or areas offer the best return for the next phases:
1. `open-sse/handlers`
   - `chatCore.ts` at 7.57%
   - Overall directory at 29.07%
2. `open-sse/translator/request`
   - Overall directory at 36.39%
   - Many translators are still near single-digit coverage
3. `open-sse/translator/response`
   - Overall directory at 8.07%
4. `open-sse/executors`
   - Overall directory at 36.62%
5. `src/lib/db`
   - `models.ts` at 20.66%
   - `registeredKeys.ts` at 34.46%
   - `modelComboMappings.ts` at 36.25%
   - `settings.ts` at 46.40%
   - `webhooks.ts` at 33.33%
6. `src/lib/usage`
   - `usageHistory.ts` at 21.12%
   - `usageStats.ts` at 9.56%
   - `costCalculator.ts` at 30.00%
7. `src/lib/providers`
   - `validation.ts` at 41.16%
8. Low-risk utility and API files for early gains
   - `src/shared/utils/upstreamError.ts`
   - `src/shared/utils/apiAuth.ts`
   - `src/lib/api/errorResponse.ts`
   - `src/app/api/settings/require-login/route.ts`
   - `src/app/api/providers/[id]/models/route.ts`
## Execution checklist
### Phase 1: 56.95% -> 60%
- [x] Fix coverage metric so it reflects source code instead of test files
- [x] Keep a legacy coverage script for comparison
- [x] Record the baseline and hotspots in-repo
- [ ] Add focused tests for low-risk utilities:
  - `src/shared/utils/upstreamError.ts`
  - `src/shared/utils/fetchTimeout.ts`
  - `src/lib/api/errorResponse.ts`
  - `src/shared/utils/apiAuth.ts`
  - `src/lib/display/names.ts`
- [ ] Add route tests for:
  - `src/app/api/settings/require-login/route.ts`
  - `src/app/api/providers/[id]/models/route.ts`
### Phase 2: 60% -> 65%
- [ ] Add DB-backed tests for:
  - `src/lib/db/modelComboMappings.ts`
  - `src/lib/db/settings.ts`
  - `src/lib/db/registeredKeys.ts`
- [ ] Cover branch behavior in:
  - `src/lib/providers/validation.ts`
  - `src/app/api/v1/embeddings/route.ts`
  - `src/app/api/v1/moderations/route.ts`
### Phase 3: 65% -> 70%
- [ ] Add usage analytics tests for:
  - `src/lib/usage/usageHistory.ts`
  - `src/lib/usage/usageStats.ts`
  - `src/lib/usage/costCalculator.ts`
- [ ] Expand route coverage for proxy management and settings branches
### Phase 4: 70% -> 75%
- [ ] Cover translator helpers and central translation paths:
  - `open-sse/translator/index.ts`
  - `open-sse/translator/helpers/*`
  - `open-sse/translator/request/*`
  - `open-sse/translator/response/*`
### Phase 5: 75% -> 80%
- [ ] Add handler-level tests for:
  - `open-sse/handlers/chatCore.ts`
  - `open-sse/handlers/responsesHandler.js`
  - `open-sse/handlers/imageGeneration.js`
  - `open-sse/handlers/embeddings.js`
- [ ] Add executor branch coverage for provider-specific auth, retries, and endpoint overrides
### Phase 6: 80% -> 85%
- [ ] Merge more edge-case suites into the main coverage path
- [ ] Increase function coverage for DB modules with weak constructor/helper coverage
- [ ] Close branch gaps in `settings.ts`, `registeredKeys.ts`, `validation.ts`, and translator helpers
### Phase 7: 85% -> 90%
- [ ] Treat the remaining low-coverage files as blockers
- [ ] Add regression tests for every uncovered production bug fixed during the push to 90%
- [ ] Raise the coverage gate in CI only after the local baseline is stable for at least two consecutive runs
## Ratchet policy
Update `npm run test:coverage` thresholds only after the project actually exceeds the next milestone with a comfortable buffer.
Recommended ratchet sequence:
1. 55/60/55
2. 60/62/58
3. 65/64/62
4. 70/66/66
5. 75/70/72
6. 80/75/78
7. 85/80/84
8. 90/85/88
Order is `statements-lines / branches / functions`.
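The ratchet policy can be expressed as a lookup: given the measured coverage, pick the highest step in the sequence that is exceeded on all three numbers by some buffer. The sequence is copied from the list above; the 1.0-point default buffer and the function name are assumptions, since the policy only says "comfortable buffer".

```python
# (statements-lines, branches, functions), in ratchet order.
RATCHET = [
    (55, 60, 55), (60, 62, 58), (65, 64, 62), (70, 66, 66),
    (75, 70, 72), (80, 75, 78), (85, 80, 84), (90, 85, 88),
]


def highest_safe_step(statements_lines, branches, functions, buffer=1.0):
    """Return the highest ratchet step cleared with `buffer` points to spare, or None."""
    chosen = None
    for step in RATCHET:
        cleared = (
            statements_lines >= step[0] + buffer
            and branches >= step[1] + buffer
            and functions >= step[2] + buffer
        )
        if not cleared:
            break  # steps are ordered, so later ones cannot be cleared either
        chosen = step
    return chosen
```

The returned tuple is the set of thresholds that could safely be written into the `npm run test:coverage` configuration.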
## Known gap
The current coverage command measures the main Node unit suite and includes source reached from it, including `open-sse`. It does not yet merge Vitest coverage into a single unified report. That merge is worth doing later, but it is not a blocker for starting the 60% -> 80% climb.
@@ -108,7 +108,7 @@ Real-time request logging with filtering by provider, model, account, and API ke
## 🌐 API Endpoint
Your unified API endpoint with capability breakdown: Chat Completions, Responses API, Embeddings, Image Generation, Reranking, Audio Transcription, Text-to-Speech, Moderations, and registered API keys. Cloud proxy support for remote access.
Your unified API endpoint with capability breakdown: Chat Completions, Responses API, Embeddings, Image Generation, Reranking, Audio Transcription, Text-to-Speech, Moderations, and registered API keys. Cloudflare Quick Tunnel integration and cloud proxy support for remote access.
- Hardened Electron build packaging — symlinked `node_modules` in the standalone bundle is detected and rejected before packaging, preventing runtime dependency on the build machine (v2.5.5+)
📖 See [`electron/README.md`](../electron/README.md) for full documentation.
| `messages` | Translates missing keys in `src/i18n/messages/{locale}.json` from `en.json` |
| `readme` | Translates `README.md` into all locales as `README.{code}.md` in project root |
| `docs` | Translates `DOC_SOURCE_FILES` into `docs/i18n/{locale}/{docName}` |
| `all` | Runs all three modes |
**Features:**
- **Text protection**: Masks code blocks (```` ``` ````), inline code (`` ` ``), markdown links/images (`[text](url)`), HTML tags, tables, and ICU placeholders (`{count}`, `{value}`, `{total}`, etc.) before translation, then restores them
- **Chunked batching**: Joins multiple strings with `__OMNIROUTE_I18N_SEPARATOR__` delimiters to minimize API calls (max 1800 chars per request)
- **In-memory cache**: Avoids redundant API calls for repeated strings within a session
- **Retry logic**: Backoff that grows with each attempt (up to 5 attempts, with a delay of 300ms × attempt number) for 429/5xx errors
- **Timeout**: 20 seconds per request
- **Skip existing**: If target file already exists, it is NOT overwritten
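The chunked batching and retry delay features listed above can be sketched in Python for illustration. The real generator is a Node script, so this is not its code: how it counts characters, and whether it waits after the final attempt, are assumptions here.

```python
SEPARATOR = "__OMNIROUTE_I18N_SEPARATOR__"


def batch_strings(strings, max_chars=1800, separator=SEPARATOR):
    """Greedily join strings into batches that stay within max_chars."""
    batches, current, size = [], [], 0
    for text in strings:
        added = len(text) + (len(separator) if current else 0)
        if current and size + added > max_chars:
            # Current batch is full: flush it and start a new one.
            batches.append(separator.join(current))
            current, size = [], 0
            added = len(text)
        current.append(text)
        size += added
    if current:
        batches.append(separator.join(current))
    return batches


def retry_delays_ms(attempts=5, base_ms=300):
    """Delay associated with each attempt: base_ms multiplied by the attempt number."""
    return [base_ms * attempt for attempt in range(1, attempts + 1)]
```

A single string longer than `max_chars` ends up alone in an oversized batch in this sketch; splitting such strings is left out.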
**Important behaviors:**
- `docs/i18n/README.md` is **regenerated** each run — it's an auto-generated index of all docs
- Root `README.{code}.md` files are only created if they don't exist (skips locales in `EXISTING_README_CODES`)
- Language bars (`🌐 **Languages:** ...`) are automatically inserted/updated in all translated docs
### i18n_autotranslate.py (LLM-based)
**Secondary translator** — uses any OpenAI-compatible LLM API (including OmniRoute itself) to translate existing `docs/i18n/` markdown files. Best for polishing or re-translating docs with better quality than Google Translate.
```bash
python3 scripts/i18n_autotranslate.py \
--api-url http://localhost:20128/v1 \
--api-key sk-your-key \
--model gpt-4o
```
**Features:**
- Scans `docs/i18n/` markdown files for English paragraphs
- Skips code blocks, tables, and already-translated content
- Sends paragraphs to LLM with technical translation system prompt
- Supports all 30 languages
## Validation & QA
### validate_translation.py
**Translation validator** — compares any locale JSON against `en.json` and reports issues.
- **Missing keys** — keys in `en.json` but not in locale file
- **Extra keys** — keys in locale file but not in `en.json`
- **Untranslated keys** — keys where locale value equals English source (excluding allowlist)
- **Placeholder mismatches** — ICU placeholders that don't match between source and translation
**Exit codes:**
| Code | Meaning |
|------|---------|
| 0 | OK |
| 1 | Generic error |
| 2 | Missing strings (hard error) |
| 3 | Untranslated warning (soft) |
**Environment:** Set `TRANSLATION_LANG=cs` or use `-l cs` flag.
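The four checks and the exit codes can be sketched as follows. This is not the code of `validate_translation.py`: it assumes flat key/value dictionaries (nested message files would need flattening first), a simple regex for ICU placeholder names, and that missing keys take precedence over untranslated ones when choosing the exit code.

```python
import re

PLACEHOLDER = re.compile(r"\{\s*(\w+)")


def placeholders(value: str) -> set:
    """Collect ICU placeholder names such as {count} or {count, plural, ...}."""
    return set(PLACEHOLDER.findall(value))


def compare_locale(source: dict, target: dict, allowlist=frozenset()) -> dict:
    """Compare a locale against the English source and report the four issue types."""
    shared = source.keys() & target.keys()
    return {
        "missing": sorted(source.keys() - target.keys()),
        "extra": sorted(target.keys() - source.keys()),
        "untranslated": sorted(
            key for key in shared
            if source[key] == target[key] and key not in allowlist
        ),
        "placeholder_mismatch": sorted(
            key for key in shared
            if placeholders(source[key]) != placeholders(target[key])
        ),
    }


def exit_code(report: dict) -> int:
    """Map a report to the documented codes: 2 = missing, 3 = untranslated, 0 = OK."""
    if report["missing"]:
        return 2
    if report["untranslated"]:
        return 3
    return 0
```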
### check_translations.py
**Code-to-JSON key checker** — scans `src/**/*.tsx` and `src/**/*.ts` for `useTranslations()` calls and verifies all referenced keys exist in `en.json`.
```bash
# Basic check
python3 scripts/check_translations.py
# Verbose output
python3 scripts/check_translations.py --verbose
# Auto-fix (adds missing keys to en.json)
python3 scripts/check_translations.py --fix
```
### generate-qa-checklist.mjs
**Static analysis QA** — scans Next.js page files for i18n risk metrics and generates a Markdown report.
```bash
node scripts/i18n/generate-qa-checklist.mjs
```
**Checks:**
- Fixed-width class usage (overflow risk)
- Directional left/right classes (RTL risk)
- Clipping-prone patterns
- Locale parity (missing/extra keys vs `en.json`)
- README language selector bars in priority locales (`es`, `fr`, `de`, `ja`, `ar`)
1. **Always edit `en.json` first** — it's the source of truth
2. **Run `generate-multilang.mjs messages`** to propagate new keys to all locales
3. **Review auto-translations** — Google Translate is a starting point, not final
4. **Validate before committing** — `python3 scripts/validate_translation.py quick -l <lang>`
5. **Update `untranslatable-keys.json`** if a key should remain in English
### Placeholder Safety
- ICU placeholders (`{count}`, `{value}`, `{total}`, `{seconds}`) must be preserved exactly
- Plural formats (`{count, plural, one {# model} other {# models}}`) must maintain structure
- The validator detects placeholder mismatches automatically
### Adding New Translation Keys in Code
```tsx
// Use namespaced keys
const t = useTranslations("settings");
t("cacheSettings"); // maps to settings.cacheSettings in JSON
// Then, from a shell, run check_translations.py to verify the keys exist:
//   python3 scripts/check_translations.py --verbose
```
### RTL Considerations
- Arabic (`ar`) and Hebrew (`he`) are RTL locales
- Avoid hardcoded `left`/`right` CSS — use `start`/`end` logical properties
- Visual QA catches RTL layout mismatches via `run-visual-qa.mjs`
## Known Issues & History
### `in.json` → `hi.json` Fix
The generator originally used `code: "in"` (deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 `hi`. This created an orphaned `in.json` duplicate of `hi.json`. Fixed by changing `code: "in"` to `code: "hi"` in `generate-multilang.mjs` and removing the orphaned file.
### `docs/i18n/README.md` Is Auto-Generated
The `docs/i18n/README.md` file is completely regenerated by `generate-multilang.mjs docs`. Any manual edits will be lost. Use `docs/I18N.md` (this file) for hand-written documentation that should persist.
### External Untranslatable Keys List
The `untranslatable-keys.json` allowlist was moved from an inline Python set in `validate_translation.py` to an external JSON file for easier maintenance. The validator loads it at runtime.
### `generate-multilang.mjs` Hindi Code Fix
The generator originally used `code: "in"` (deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 `hi`. This was introduced in upstream commit `952b0b22c` by `diegosouzapw`. Fixed by changing `code: "in"` to `code: "hi"` in the `LOCALE_SPECS` array and removing the orphaned `in.json` file.
For host-integrated mode with CLI binaries, see the Docker section in the main docs.
### Void Linux (xbps-src)
Void Linux users can package and install OmniRoute natively using the `xbps-src` cross-compilation framework. This automates the Node.js standalone build along with the required `better-sqlite3` native bindings.
<details>
<summary><b>View xbps-src template</b></summary>
```bash
# Template file for 'omniroute'
pkgname=omniroute
version=3.2.4
revision=1
hostmakedepends="nodejs python3 make"
depends="openssl"
short_desc="Universal AI gateway with smart routing for multiple LLM providers"
@@ -494,6 +599,11 @@ curl -X POST http://localhost:20128/api/provider-models \
Or use Dashboard: **Providers → [Provider] → Custom Models**.
Notes:
- OpenRouter and OpenAI/Anthropic-compatible providers are managed from **Available Models** only. Manual add, import, and auto-sync all land in the same available-model list, so there is no separate Custom Models section for those providers.
- The **Custom Models** section is intended for providers that do not expose managed available-model imports.
### Dedicated Provider Routes
Route requests directly to a specific provider with model validation:
@@ -538,6 +648,17 @@ Returns models grouped by provider with types (`chat`, `embedding`, `image`).
- Automatic background sync with timeout + fail-fast
- Prefer server-side `BASE_URL`/`CLOUD_URL` in production
### Cloudflare Quick Tunnel
- Available in **Dashboard → Endpoints** for Docker and other self-hosted deployments
- Creates a temporary `https://*.trycloudflare.com` URL that forwards to your current OpenAI-compatible `/v1` endpoint
- First enable installs `cloudflared` only when needed; later restarts reuse the same managed binary
- Quick Tunnels are not auto-restored after an OmniRoute or container restart; re-enable them from the dashboard when needed
- Tunnel URLs are ephemeral and change every time you stop/start the tunnel
- Managed Quick Tunnels default to HTTP/2 transport to avoid noisy QUIC UDP buffer warnings in constrained containers
- Set `CLOUDFLARED_PROTOCOL=quic` or `auto` if you want to override the managed transport choice
- Set `CLOUDFLARED_BIN` if you prefer using a preinstalled `cloudflared` binary instead of the managed download
| **Export Database** | Downloads the current SQLite database as a `.sqlite` file |
| **Export All (.tar.gz)** | Downloads a full backup archive including: database, settings, combos, provider connections (no credentials), API key metadata |
| **Import Database** | Upload a `.sqlite` file to replace the current database. A pre-import backup is automatically created unless `DISABLE_SQLITE_AUTO_BACKUP=true` |
```bash
# API: Export database
@@ -667,10 +804,11 @@ curl -X POST http://localhost:20128/api/db-backups/import \
### Settings Dashboard
The settings page is organized into 6 tabs for easy navigation:
| Settings → General | Sidebar visibility | Show/hide sidebar sections |
These settings are stored in the database and persist across restarts, overriding env var defaults when set.
### Running Locally
```bash
# Development mode (hot reload)
npm run dev
# Production build
npm run build
npm run start
# Common port configuration
PORT=20128 NEXT_PUBLIC_BASE_URL=http://localhost:20128 npm run dev
# Coverage (60% minimum for statements/lines/functions/branches)
npm run test:coverage
npm run coverage:report
# Lint + format check
npm run lint
npm run check
```
Coverage notes:
- `npm run test:coverage` measures source coverage for the main unit test suite, excludes `tests/**`, and includes `open-sse/**`
- Pull requests must keep the overall coverage gate at **60% or higher** for statements, lines, functions, and branches
- If a PR changes production code in `src/`, `open-sse/`, `electron/`, or `bin/`, it must add or update automated tests in the same PR
- `npm run coverage:report` prints the detailed per-file report from the latest coverage run
- `npm run test:coverage:legacy` preserves the older metric for historical comparison
- See `docs/COVERAGE_PLAN.md` for the phased coverage improvement roadmap
### Pull Request Requirements
Before opening or merging a PR:
- Run `npm run test:unit`
- Run `npm run test:coverage`
- Make sure the coverage gate stays at **60%+** for all metrics
- List the changed or added test files in the PR description when production code changes
- Check the SonarQube result on the PR when the project secrets are configured in CI
Add to `src/shared/constants/providers.ts` - Zod-validated at module load.
### Step 2: Add Executor (if custom logic is needed)
Create an executor in `open-sse/executors/your-provider.ts` that extends the base executor.
### Step 3: Add Translator (if non-OpenAI format)
Create request/response translators in `open-sse/translator/`.
### Step 4: Add OAuth Config (if OAuth-based)
Add OAuth credentials in `src/lib/oauth/constants/oauth.ts` and the service implementation in `src/lib/oauth/services/`.
### Step 5: Register Models
Add model definitions in `open-sse/config/providerRegistry.ts`.
### Step 6: Add Tests
Write unit tests in `tests/unit/` covering at minimum:
- Provider registration
- Request/response translation
- Error handling
---
## Pull Request Checklist
- [ ] Tests pass (`npm test`)
- [ ] Linting passes (`npm run lint`)
- [ ] Build succeeds (`npm run build`)
- [ ] TypeScript types added for new public functions and interfaces
- [ ] No hardcoded secrets or fallback values
- [ ] All inputs validated with Zod schemas
- [ ] Changelog updated (if the change is user-facing)
- [ ] Documentation updated (if applicable)
---
## Releasing
Releases are managed via the `/generate-release` workflow. When a new GitHub Release is created, the package is **published to npm** via GitHub Actions.
---
## Getting Help
Create model routing combos with 6 strategies: priority, weighted, round-robin, random, least-used, and cost-optimized. Each combo chains multiple models with automatic fallback and includes quick templates and readiness checks.
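
As a rough illustration of how a strategy could order a combo's targets before fallback kicks in, here is a minimal TypeScript sketch covering only two of the six strategies (priority and round-robin). The `ComboTarget` shape, the `orderTargets` and `runCombo` names, and the rule that a lower `priority` number is tried first are all assumptions made for this example, not OmniRoute's actual implementation.

```typescript
// Hypothetical types and helpers, for illustration only.
type Strategy = "priority" | "round-robin";

interface ComboTarget {
  model: string; // e.g. "provider/model"
  priority: number; // assumption: lower number is tried first
}

function orderTargets(strategy: Strategy, targets: ComboTarget[], cursor = 0): ComboTarget[] {
  if (targets.length === 0) return [];
  if (strategy === "priority") {
    return [...targets].sort((a, b) => a.priority - b.priority);
  }
  // Round-robin: rotate the list so each request starts one position later.
  const start = cursor % targets.length;
  return [...targets.slice(start), ...targets.slice(0, start)];
}

// Try each target in order and fall back to the next one on failure.
async function runCombo<T>(ordered: ComboTarget[], call: (model: string) => Promise<T>): Promise<T> {
  let lastError: unknown = new Error("combo has no targets");
  for (const target of ordered) {
    try {
      return await call(target.model);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next target
    }
  }
  throw lastError;
}
```

A caller would keep a per-combo counter, pass it as `cursor`, and hand the ordered list to `runCombo`.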

---
## 📊 Analytics
Comprehensive usage analytics with token consumption, cost estimates, activity heatmaps, weekly distribution charts, and per-provider breakdowns.
Test any model directly from the dashboard. Select provider, model, and endpoint, write prompts with Monaco Editor, stream responses in real-time, abort mid-stream, and view timing metrics.
---
## 🎨 Themes _(v2.0.5+)_
Customizable color themes for the entire dashboard. Choose from 7 preset colors (Coral, Blue, Red, Green, Violet, Orange, Cyan) or create a custom theme by picking any hex color. Supports light, dark, and system mode.
---
## ⚙️ Settings
Comprehensive settings panel with tabs:
- **General** — System storage, backup management (export/import database)
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
One-click configuration for AI coding tools: Claude Code, Codex CLI, Gemini CLI, OpenClaw, Kilo Code, Antigravity, Cline, Continue, Cursor, and Factory Droid. Features automated config apply/reset, connection profiles, and model mapping.
Dashboard for discovering and managing CLI agents. Shows a grid of 14 built-in agents (Codex, Claude, Goose, Gemini CLI, OpenClaw, Aider, OpenCode, Cline, Qwen Code, ForgeCode, Amazon Q, Open Interpreter, Cursor CLI, Warp) with:
- **Installation status** — Installed / Not Found with version detection
- **Protocol badges** — stdio, HTTP, etc.
- **Custom agents** — Register any CLI tool via form (name, binary, version command, spawn args)
- **CLI Fingerprint Matching** — Per-provider toggle to match native CLI request signatures, reducing ban risk while preserving proxy IP
---
## 🖼️ Media _(v2.0.3+)_
Generate images, videos, and music from the dashboard. Supports OpenAI, xAI, Together, Hyperbolic, SD WebUI, ComfyUI, AnimateDiff, Stable Audio Open, and MusicGen.
---
## 📝 Request Logs
Real-time request logging with filtering by provider, model, account, and API key. Shows status codes, token usage, latency, and response details.

---
## 🌐 API Endpoint
Your unified API endpoint with capability breakdown: Chat Completions, Responses API, Embeddings, Image Generation, Reranking, Audio Transcription, Text-to-Speech, Moderations, and registered API keys. Cloud proxy support for remote access.
Create, scope, and revoke API keys. Each key can be restricted to specific models/providers with full access or read-only permissions. Visual key management with usage tracking.
---
## 📋 Audit Log
Administrative action tracking with filtering by action type, actor, target, IP address, and timestamp. Full security event history.
---
## 🖥️ Desktop Application
Native Electron desktop app for Windows, macOS, and Linux. Run OmniRoute as a standalone application with system tray integration, offline support, auto-update, and one-click install.
Key features:
- Server readiness polling (no blank screen on cold start)
If you discover a security vulnerability in OmniRoute, please report it responsibly:
1. **Do not** open a public GitHub issue
2. Use [GitHub Security Advisories](https://github.com/diegosouzapw/OmniRoute/security/advisories/new)
3. Include: description, reproduction steps, and potential impact
## Response Timeline
| **Log Retention** | Automatic cleanup after `CALL_LOG_RETENTION_DAYS` |
| **No-Log Opt-out** | The per-API-key `noLog` flag disables request logging |
| **Audit Log** | Administrative actions tracked in the `audit_log` table |
| **MCP Audit** | SQLite-backed audit logging for all MCP tool calls |
| **Zod Validation** | All API inputs validated with Zod v4 schemas at module load |
---
## Required Environment Variables
All secrets must be set before starting the server. The server will **fail fast** if they are missing or weak.
```bash
# REQUIRED - the server will not start without these:
JWT_SECRET=$(openssl rand -base64 48) # min 32 chars
API_KEY_SECRET=$(openssl rand -hex 32) # min 16 chars
# RECOMMENDED - enables encryption at rest:
STORAGE_ENCRYPTION_KEY=$(openssl rand -hex 32)
```
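The fail-fast check can be pictured with the following TypeScript sketch. It is an assumption-laden illustration, not the server's real startup code: the function names are hypothetical, while the minimum lengths (32 and 16 characters) and the weak values (`changeme`, `secret`, `password`) are the ones stated in this section.

```typescript
// Hypothetical startup check, for illustration only.
const WEAK_VALUES = new Set(["changeme", "secret", "password"]);
const REQUIRED_SECRETS: Record<string, number> = {
  JWT_SECRET: 32, // minimum length in characters
  API_KEY_SECRET: 16,
};

function findSecretProblems(env: Record<string, string | undefined>): string[] {
  const problems: string[] = [];
  for (const [name, minLength] of Object.entries(REQUIRED_SECRETS)) {
    const value = env[name];
    if (!value) {
      problems.push(`${name} is missing`);
    } else if (WEAK_VALUES.has(value.toLowerCase())) {
      problems.push(`${name} uses a known-weak value`);
    } else if (value.length < minLength) {
      problems.push(`${name} must be at least ${minLength} characters`);
    }
  }
  return problems;
}

// Throwing before the server binds a port is what makes the check "fail fast".
function assertSecrets(env: Record<string, string | undefined>): void {
  const problems = findSecretProblems(env);
  if (problems.length > 0) {
    throw new Error(`Refusing to start: ${problems.join("; ")}`);
  }
}
```

Calling `assertSecrets(process.env)` at the very top of the entry point would surface every problem in a single error message.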
The server actively rejects known-weak values such as `changeme`, `secret`, or `password`.
---
## Docker Security
- Use a non-root user in production
- Mount secrets as read-only volumes
- Never copy `.env` files into Docker images
- Use `.dockerignore` to exclude sensitive files
- Set `AUTH_COOKIE_SECURE=true` when behind HTTPS
```bash
| **Export Database** | Downloads the current SQLite database as a `.sqlite` file |
| **Export All (.tar.gz)** | Downloads a full backup archive including: database, settings, combos, provider connections (no credentials), API key metadata |
| **Import Database** | Upload a `.sqlite` file to replace the current database. A pre-import backup is automatically created |
**Cost Tracking:** Every request logs token usage and calculates cost using the pricing table. View breakdowns in **Dashboard → Usage** by provider, model, and API key.
---
### Audio Transcription
OmniRoute supports audio transcription via the OpenAI-compatible endpoint:
```bash
POST /v1/audio/transcriptions
Authorization: Bearer your-api-key
Content-Type: multipart/form-data
# Example with curl
curl -X POST http://localhost:20128/v1/audio/transcriptions \
-H "Authorization: Bearer your-api-key" \
-F "file=@audio.mp3" \
-F "model=deepgram/nova-3"
```
Available providers: **Deepgram** (`deepgram/`), **AssemblyAI** (`assemblyai/`).
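
For callers that prefer code over curl, the same request can be assembled in TypeScript. This is a minimal sketch assuming a runtime with global `FormData`, `Blob`, and `fetch` (for example Node.js 18+ or a browser); the `buildTranscriptionRequest` helper is hypothetical, and the response shape is deliberately not assumed.

```typescript
// Hypothetical helper, for illustration only.
interface TranscriptionRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: FormData };
}

function buildTranscriptionRequest(
  baseUrl: string,
  apiKey: string,
  audio: Blob,
  model: string,
  filename = "audio.mp3",
): TranscriptionRequest {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", model);
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/audio/transcriptions`,
    init: {
      method: "POST",
      // Content-Type is left unset on purpose: fetch adds the multipart boundary itself.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    },
  };
}

// Usage against a running server (not executed here):
//   const { url, init } = buildTranscriptionRequest(
//     "http://localhost:20128", "your-api-key", audioBlob, "deepgram/nova-3");
//   const response = await fetch(url, init);
```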
OmniRoute is a local AI routing gateway and dashboard built on Next.js.
It provides a single OpenAI-compatible endpoint (`/v1/*`) and routes traffic across multiple upstream providers with translation, fallback, token refresh, and usage tracking.
Core capabilities:
- OpenAI-compatible API surface for CLI/tools (28 providers)
- Request/response translation across provider formats
- Model combo fallback (multi-model sequence)
- Account-level fallback (multiple accounts per provider)
- OAuth + API-key provider connection management
- Embedding generation via `/v1/embeddings` (6 providers, 9 models)
- Image generation via `/v1/images/generations` (4 providers, 9 models)
- Think tag parsing (`<think>...</think>`) for reasoning models
- Response sanitization for strict OpenAI SDK compatibility
- Role normalization (developer → system, system → user) for cross-provider compatibility
- Structured output conversion (json_schema → Gemini responseSchema)
- Local persistence for providers, keys, aliases, combos, settings, and pricing
- Usage/cost tracking and request logging
- Optional cloud sync for multi-device/state sync
- IP allowlist/blocklist for API access control
- Thinking budget management (passthrough / auto / custom / adaptive)
- Global system prompt injection
- Session tracking and fingerprinting
- Enhanced per-account rate limiting with provider-specific profiles
- Circuit breaker pattern for provider resilience
- Thundering-herd protection with mutex locking
- Signature-based request deduplication cache
- Domain layer: model availability, cost rules, fallback policy, lockout policy
- Domain state persistence (SQLite write-through cache for fallbacks, budgets, lockouts, and circuit breakers)
- Policy engine for centralized request evaluation (lockout → budget → fallback)
- Request telemetry with p50/p95/p99 latency aggregation
- Correlation ID (X-Request-Id) for end-to-end tracing
- Compliance audit logging with per-API-key opt-out
- Evaluation framework for LLM quality assurance
- Resilience UI dashboard with real-time circuit breaker status
- Modular OAuth providers (12 modules under `src/lib/oauth/providers/`)
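To make the think-tag capability from the list above concrete, here is a minimal, non-streaming TypeScript sketch. The function name `extractThinkTags` and the `ParsedMessage` type are hypothetical; only the `<think>...</think>` tag is taken from this document, and `reasoning_content` is assumed here as the destination field. A streaming implementation would additionally have to track tags that are split across chunks.

```typescript
// Hypothetical helper, for illustration only.
interface ParsedMessage {
  content: string;
  reasoning_content?: string;
}

function extractThinkTags(raw: string): ParsedMessage {
  const reasoning: string[] = [];
  // Non-greedy match, so several separate <think> blocks are each captured.
  const content = raw
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      reasoning.push(inner.trim());
      return "";
    })
    .trim();
  return reasoning.length > 0
    ? { content, reasoning_content: reasoning.join("\n") }
    : { content };
}
```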
Primary runtime model:
- Next.js app routes under `src/app/api/*` implement both the Dashboard APIs and the compatibility APIs
- A shared SSE/routing core in `src/sse/*` + `open-sse/*` handles provider execution, translation, streaming, fallback, and usage
## Scope and Boundaries
### In Scope
- Local gateway runtime
- Dashboard management APIs
- Provider authentication and token refresh
- Request translation and SSE streaming
- Local state + usage persistence
- Optional cloud sync orchestration
### Out of Scope
- Cloud service implementation behind `NEXT_PUBLIC_CLOUD_URL`
- Provider SLA/control plane outside the local process
- External CLI binaries themselves (Claude CLI, Codex CLI, etc.)
## Dashboard Surface (Current)
Main pages under `src/app/(dashboard)/dashboard/`:
- `/dashboard` - quick start + provider overview
- `/dashboard/endpoint` - endpoint proxy + MCP + A2A + API endpoint tabs
- Facade: `src/lib/usageDb.ts` (decomposed modules in `src/lib/usage/*`)
- SQLite tables in `storage.sqlite`: `usage_history`, `call_logs`, `proxy_logs`
- Optional file artifacts remain for compatibility/debugging (`${DATA_DIR}/log.txt`, `${DATA_DIR}/call_logs/`, `<repo>/logs/...`)
- Legacy JSON files are migrated to SQLite by startup migrations when present
Domain State DB (SQLite):
- `src/lib/db/domainState.ts` - CRUD operations for domain state
- Tables (created in `src/lib/db/core.ts`): `domain_fallback_chains`, `domain_budgets`, `domain_cost_history`, `domain_lockout_state`, `domain_circuit_breakers`
- Write-through cache pattern: in-memory Maps are authoritative at runtime; mutations are written synchronously to SQLite; state is restored from the database on cold start
## 4) Auth + Security Surfaces
- Dashboard cookie auth: `src/proxy.ts`, `src/app/api/auth/login/route.ts`
- API key generation/verification: `src/shared/utils/apiKey.ts`
- Provider secrets are persisted in `providerConnections` entries
- Outbound proxy support via `open-sse/utils/proxyFetch.ts` (env vars) and `open-sse/utils/networkProxy.ts` (configurable per provider or globally)
## 5) Cloud Sync
Fallback decisions are driven by `open-sse/services/accountFallback.ts` using status codes and error-message heuristics. Combo routing adds one extra guard: provider-scoped 400s, such as upstream content-block and role-validation failures, are treated as model-local failures, so the next combo target can still run.
## OAuth Onboarding and Token Refresh Lifecycle
```mermaid
sequenceDiagram
autonumber
participant UI as Dashboard UI
participant OAuth as /api/oauth/[provider]/[action]
participant ProvAuth as Provider Auth Server
participant DB as localDb
participant Test as /api/providers/[id]/test
participant Exec as Provider Executor
UI->>OAuth: GET authorize or device-code
OAuth->>ProvAuth: create auth/device flow
ProvAuth-->>OAuth: auth URL or device code payload
Refresh during live traffic is executed inside `open-sse/handlers/chatCore.ts` via the executor's `refreshCredentials()`.
## Cloud Sync Lifecycle (Enable / Sync / Disable)
```mermaid
sequenceDiagram
autonumber
participant UI as Endpoint Page UI
participant Sync as /api/sync/cloud
participant DB as localDb
participant Cloud as External Cloud Sync
participant Claude as ~/.claude/settings.json
UI->>Sync: POST action=enable
Sync->>DB: set cloudEnabled=true
Sync->>DB: ensure API key exists
Sync->>Cloud: POST /sync/{machineId} (providers/aliases/combos/keys)
Cloud-->>Sync: sync result
Sync->>Cloud: GET /{machineId}/v1/verify
Sync-->>UI: enabled + verification status
UI->>Sync: POST action=sync
Sync->>Cloud: POST /sync/{machineId}
Cloud-->>Sync: remote data
Sync->>DB: update newer local tokens/status
Sync-->>UI: synced
UI->>Sync: POST action=disable
Sync->>DB: set cloudEnabled=false
Sync->>Cloud: DELETE /sync/{machineId}
Sync->>Claude: switch ANTHROPIC_BASE_URL back to local (if needed)
Sync-->>UI: disabled
````
Periodic sync is triggered by `CloudSyncScheduler` when cloud is enabled.
## Data Model and Storage Map
```mermaid
Each provider has a specialized executor extending `BaseExecutor` (in `open-sse/executors/base.ts`), which provides URL building, header construction, retry with exponential backoff, credential refresh hooks, and the `execute()` orchestration method.
| Executor | Provider(s) | Special Handling |
Translations use **OpenAI as the hub format**: every conversion passes through OpenAI as the intermediate format:
`Source Format → OpenAI (hub) → Target Format`
Translators are selected dynamically based on the shape of the source payload and the provider's target format.
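The hub-and-spoke idea can be sketched in a few lines of TypeScript. Everything below is a toy under stated assumptions: the `Format` names, the two lookup tables, and the conversions themselves are simplified stand-ins, not the real translators in `open-sse/translator/` and not the providers' actual wire formats.

```typescript
// Toy hub-and-spoke translation, for illustration only.
type Format = "openai" | "claude" | "gemini";
type Message = { role: string; content: string };
type Payload = Record<string, unknown>;
type Translator = (body: Payload) => Payload;

const toHub = new Map<Format, Translator>(); // spoke -> OpenAI
const fromHub = new Map<Format, Translator>(); // OpenAI -> spoke

function messagesOf(body: Payload): Message[] {
  const value = body["messages"];
  return Array.isArray(value) ? (value as Message[]) : [];
}

// Toy spoke: a Claude-style body keeps the system prompt in a top-level field.
toHub.set("claude", (body) => {
  const { system, ...rest } = body;
  const lead = typeof system === "string" ? [{ role: "system", content: system }] : [];
  return { ...rest, messages: [...lead, ...messagesOf(body)] };
});

// Toy spoke: a Gemini-style body carries `contents` with `parts`.
fromHub.set("gemini", (body) => {
  const { messages: _dropped, ...rest } = body;
  return { ...rest, contents: messagesOf(body).map((m) => ({ role: m.role, parts: [{ text: m.content }] })) };
});

function requireSpoke(table: Map<Format, Translator>, format: Format): Translator {
  const spoke = table.get(format);
  if (!spoke) throw new Error(`no translator registered for ${format}`);
  return spoke;
}

// Every conversion goes source -> OpenAI -> target, so N formats need N pairs, not N×N.
function translate(source: Format, target: Format, body: Payload): Payload {
  if (source === target) return body;
  const hubBody = source === "openai" ? body : requireSpoke(toHub, source)(body);
  return target === "openai" ? hubBody : requireSpoke(fromHub, target)(hubBody);
}
```

Supporting one more format in this sketch means registering a single `toHub`/`fromHub` pair; the existing spokes stay unchanged.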
Additional processing layers in the translation path:
- **Response sanitization**: strips non-standard fields from OpenAI-format responses (both streaming and non-streaming) to ensure strict SDK compliance
- **Role normalization**: converts `developer` → `system` for non-OpenAI targets; merges `system` → `user` for models that reject the system role (GLM, ERNIE)
- **Think tag extraction**: parses `<think>...</think>` blocks from the content into the `reasoning_content` field
- **Structured output**: converts OpenAI `response_format.json_schema` to Gemini's `responseMimeType` + `responseSchema`
## Supported API Endpoints
The bypass handler (`open-sse/utils/bypassHandler.ts`) intercepts known "throwaway" requests from Claude CLI (warmup pings, title extractions, and token counts) and returns a **fake response** without consuming upstream provider tokens. It is triggered only when the `User-Agent` contains `claude-cli`.
## Request Logger Pipeline
The request logger (`open-sse/utils/requestLogger.ts`) provides a 7-stage debug logging pipeline. It is disabled by default and enabled via `ENABLE_REQUEST_LOGS=true`:
```
- Platform/runtime helpers (not app-specific config): `APPDATA`, `NODE_ENV`, `PORT`, `HOSTNAME`
## Known Architectural Notes
1. `usageDb` and `localDb` share the same base directory policy (`DATA_DIR` -> `XDG_CONFIG_HOME/omniroute` -> `~/.omniroute`) with legacy file migration.
2. `/api/v1/route.ts` delegates to the same unified catalog builder used by `/api/v1/models` (`src/app/api/v1/models/catalog.ts`) to avoid semantic drift.
3. The request logger writes full headers/bodies when enabled; treat the log directory as sensitive.
4. Cloud behavior depends on a correct `NEXT_PUBLIC_BASE_URL` and on the cloud endpoint being reachable.
5. The `open-sse/` directory is published as the `@omniroute/open-sse` **npm workspace package**. Source code imports it via `@omniroute/open-sse/...` (resolved by Next.js `transpilePackages`). File paths in this document still use the directory name `open-sse/` for consistency.
6. Charts in the dashboard use **Recharts** (SVG-based) for accessible, interactive analytics visualizations (model usage bar charts, provider breakdown tables with success rates).
7. E2E tests use **Playwright** (`tests/e2e/`) and run via `npm run test:e2e`. Unit tests use the **Node.js test runner** (`tests/unit/`) and run via `npm run test:unit`. Source code under `src/` is **TypeScript** (`.ts`/`.tsx`); the `open-sse/` workspace remains JavaScript (`.js`).
8. The settings page is organized into 5 tabs: Security, Routing (6 global strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized), Resilience (editable rate limits, circuit breaker, policies), AI (thinking budget, system prompt, prompt cache), and Advanced (proxy).
## Operational Verification Checklist
- Build from source: `npm run build`
- Build the Docker image: `docker build -t omniroute .`
- Start the service and verify:
- `GET /api/settings`
- `GET /api/v1/models`
- The CLI target base URL should be `http://<host>:20128/v1` when `PORT=20128`
More than 30 models are scored across 6 task types (`coding`, `review`, `planning`, `analysis`, `debugging`, `documentation`). Wildcard patterns are supported (e.g., `*-coder` → high coding score).
## Files
> A comprehensive, beginner-friendly guide to the **omniroute** multi-provider AI proxy router.
---
## 1. What Is omniroute?
omniroute is a **proxy router** that sits between AI clients (Claude CLI, Codex, Cursor IDE, etc.) and AI providers (Anthropic, Google, OpenAI, AWS, GitHub, etc.). It solves one big problem:
> **Different AI clients speak different "languages" (API formats), and different AI providers expect different "languages" too.** omniroute translates between them automatically.
Think of it as a universal translator at the United Nations: any delegate can speak any language, and the translator converts it for any other delegate.
---
## 2. Architecture Overview
| `constants.ts` | `PROVIDERS` object with base URLs, OAuth credentials (defaults), headers, and default system prompts for every provider. Also defines `HTTP_STATUS`, `ERROR_TYPES`, `COOLDOWN_MS`, `BACKOFF_CONFIG`, and `SKIP_PATTERNS`. |
| `credentialLoader.ts` | Loads external credentials from `data/provider-credentials.json` and merges them over the built-in defaults in `PROVIDERS`. Keeps secrets out of source control while maintaining backward compatibility. |
| `providerModels.ts` | Central model registry: maps provider aliases → model IDs. Functions such as `getModels()` and `getProviderByAlias()`. |
| `codexInstructions.ts` | System instructions injected into Codex requests (editing constraints, sandbox rules, approval policies). |
| `base.ts` | - | Abstract base: URL building, headers, retry logic, credential refresh |
| `default.ts` | Claude, Gemini, OpenAI, GLM, Kimi, MiniMax | Generic OAuth token refresh for standard providers |
| `antigravity.ts` | Google Cloud Code | Project/session ID generation, multi-URL fallback, custom retry parsing from error messages ("reset after 2h7m23s") |
| `cursor.ts` | Cursor IDE | **Most complex**: SHA-256 checksum auth, Protobuf request encoding, binary EventStream → SSE response parsing |
| `codex.ts` | OpenAI Codex | Injects system instructions, manages thinking levels, removes unsupported parameters |
| `gemini-cli.ts` | Google Gemini CLI | Custom URL building (`streamGenerateContent`), Google OAuth token refresh |
| `github.ts` | GitHub Copilot | Dual token system (GitHub OAuth + Copilot token), VSCode header mimicking |
| `kiro.ts` | AWS CodeWhisperer | AWS EventStream binary parsing, AMZN event frames, token estimation |
| `index.ts` | - | Factory: maps provider name → executor class, with a default fallback |
---
### 4.3 Handlers (`open-sse/handlers/`)
**Orchestration layer**: coordinates translation, execution, streaming, and error handling.
| `chatCore.ts` | **Central orchestrator** (~600 lines). Handles the complete request lifecycle: format detection → translation → executor dispatch → streaming/non-streaming response → token refresh → error handling → usage logging. |
| `responsesHandler.ts` | Adapter for OpenAI's Responses API: converts Responses format → Chat Completions → sends to `chatCore` → converts SSE back to Responses format. |
| `embeddings.ts` | Embedding generation handler: resolves embedding model → provider, dispatches to the provider API, and returns an OpenAI-compatible embedding response. Supports 6+ providers. |
| `imageGeneration.ts` | Image generation handler: resolves image model → provider and supports OpenAI-compatible, Gemini-image (Antigravity), and fallback (Nebius) modes. Returns base64 or URL images. |
#### Request Lifecycle (chatCore.ts)
```mermaid
| `provider.ts` | **Format detection** (`detectFormat`): analyzes request body structure to identify Claude/OpenAI/Gemini/Antigravity/Responses formats (includes `max_tokens` heuristic for Claude). Also: URL building, header building, thinking config normalization. Supports `openai-compatible-*` and `anthropic-compatible-*` dynamic providers. |
| `model.ts` | Model string parsing (`claude/model-name` → `{provider: "claude", model: "model-name"}`), alias resolution with collision detection, input sanitization (rejects path traversal/control chars), and model info resolution with async alias getter support. |
| `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, Qoder, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. |
| `combo.ts` | **Combo models**: chains of fallback models. If model A fails with a fallback-eligible error, try model B, then C, etc. Returns actual upstream status codes. |
| `usage.ts` | Fetches quota/usage data from provider APIs (GitHub Copilot quotas, Antigravity model quotas, Codex rate limits, Kiro usage breakdowns, Claude settings). |
| `accountSelector.ts` | Smart account selection with scoring algorithm: considers priority, health status, round-robin position, and cooldown state to pick the optimal account for each request. |
| `contextManager.ts` | Request context lifecycle management: creates and tracks per-request context objects with metadata (request ID, timestamps, provider info) for debugging and logging. |
| `ipFilter.ts` | IP-based access control: supports allowlist and blocklist modes. Validates client IP against configured rules before processing API requests. |
| `sessionManager.ts` | Session tracking with client fingerprinting: tracks active sessions using hashed client identifiers, monitors request counts, and provides session metrics. |
| `signatureCache.ts` | Request signature-based deduplication cache: prevents duplicate requests by caching recent request signatures and returning cached responses for identical requests within a time window. |
| `systemPrompt.ts` | Global system prompt injection: prepends or appends a configurable system prompt to all requests, with per-provider compatibility handling. |
| `thinkingBudget.ts` | Reasoning token budget management: supports passthrough, auto (strip thinking config), custom (fixed budget), and adaptive (complexity-scaled) modes for controlling thinking/reasoning tokens. |
| `wildcardRouter.ts` | Wildcard model pattern routing: resolves wildcard patterns (e.g., `*/claude-*`) to concrete provider/model pairs based on availability and priority. |
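
The model-string parsing described for `model.ts` above can be pictured with the following TypeScript sketch. It is a simplified, hypothetical version: the real module also resolves aliases and detects collisions, which are omitted here, and the exact rejection rules are assumptions.

```typescript
// Hypothetical parser, for illustration only.
interface ParsedModel {
  provider: string | null; // null means "no prefix": left to alias resolution
  model: string;
}

function parseModelString(input: string): ParsedModel {
  const value = input.trim();
  // Reject empty input, path traversal, and control characters.
  if (value === "" || value.includes("..") || /[\u0000-\u001f\u007f]/.test(value)) {
    throw new Error("invalid model string");
  }
  const slash = value.indexOf("/");
  if (slash === -1) return { provider: null, model: value };
  if (slash === 0 || slash === value.length - 1) {
    throw new Error("invalid model string");
  }
  // Split on the first slash only, so the model part may itself contain slashes.
  return { provider: value.slice(0, slash), model: value.slice(slash + 1) };
}
```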
#### Token Refresh Deduplication
```mermaid
sequenceDiagram
participant R1 as Request 1
participant R2 as Request 2
participant Cache as refreshPromiseCache
participant OAuth as OAuth Provider
R1->>Cache: getAccessToken("gemini", token)
Cache->>Cache: No in-flight promise
Cache->>OAuth: Start refresh
R2->>Cache: getAccessToken("gemini", token)
Cache->>Cache: Found in-flight promise
Cache-->>R2: Return existing promise
OAuth-->>Cache: New access token
Cache-->>R1: New access token
Cache-->>R2: Same access token (shared)
Cache->>Cache: Delete cache entry
````
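
The sequence above boils down to caching the in-flight promise. Below is a minimal TypeScript sketch of that pattern; `refreshPromiseCache` and `getAccessToken` echo the names in the diagram, but the signature (in particular the injected `refresh` callback) and the cache key format are assumptions made for this example.

```typescript
// In-flight promise deduplication, for illustration only.
type Refresher = (provider: string, refreshToken: string) => Promise<string>;

const refreshPromiseCache = new Map<string, Promise<string>>();

function getAccessToken(provider: string, refreshToken: string, refresh: Refresher): Promise<string> {
  const key = `${provider}:${refreshToken}`;

  // A refresh is already running for this credential: share its promise.
  const inFlight = refreshPromiseCache.get(key);
  if (inFlight) return inFlight;

  // Start the refresh and drop the cache entry once it settles,
  // whether it succeeded or failed, so the next caller starts fresh.
  const pending = refresh(provider, refreshToken).finally(() => {
    refreshPromiseCache.delete(key);
  });
  refreshPromiseCache.set(key, pending);
  return pending;
}
```

Because the entry is removed in `finally`, a failed refresh is not cached: concurrent callers all see the same rejection, and a later request simply retries.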
#### Account Fallback State Machine
```mermaid
stateDiagram-v2
[*] --> Active
Active --> Error: Request fails (401/429/500)
Error --> Cooldown: Apply backoff
Cooldown --> Active: Cooldown expires
Active --> Active: Request succeeds (reset backoff)
| `error.ts` | Error response building (OpenAI-compatible format), upstream error parsing, Antigravity retry-time extraction from error messages, SSE error streaming. |
| `stream.ts` | **SSE Transform Stream**: the core streaming pipeline. Two modes: "translate" (full format translation) and "passthrough" (normalize + extract usage). Handles chunk buffering, usage estimation, and content length tracking. Per-stream encoder/decoder instances avoid shared state. |
| `streamHelpers.ts` | Low-level SSE utilities: `parseSSELine` (whitespace-tolerant), `hasValuableContent` (filters empty chunks for OpenAI/Claude/Gemini), `fixInvalidId`, `formatSSE` (format-aware SSE serialization with `perf_metrics`). |
| `usageTracking.ts` | Token usage extraction from any format (Claude/OpenAI/Gemini/Responses), estimation with separate tool/message characters-per-token ratios, buffer addition (2000-token safety margin), format-specific field filtering, and console logging with ANSI colors. |
| `requestLogger.ts` | File-based request logging (opt in via `ENABLE_REQUEST_LOGS=true`). Creates session folders with numbered files: `1_req_client.json` → `7_res_client.txt`. All I/O is asynchronous (fire-and-forget). Masks sensitive headers. |
| `bypassHandler.ts` | Intercepts specific Claude CLI patterns (title extraction, warmup, count) and returns fake responses without calling any provider. Supports both streaming and non-streaming. Intentionally limited to Claude CLI. |
| `networkProxy.ts` | Resolves the outbound proxy URL for a given provider with this precedence: provider-specific config → global config → environment variables (`HTTPS_PROXY`/`HTTP_PROXY`/`ALL_PROXY`). Supports `NO_PROXY` exclusions. Caches the config for 30 seconds. |
#### SSE Streaming Pipeline
```mermaid
| `/api/settings/system-prompt` | GET/PUT | Global system prompt settings |
| `/api/sessions` | GET | Active session tracking and metrics |
| `/api/rate-limits` | GET | Per-account rate limit status |
---
## 5. Key Design Patterns
### 5.1 Hub-and-Spoke Translation
All formats translate through the **OpenAI format as the hub**. Adding a new provider only requires writing **one pair** of translators (to/from OpenAI), not N pairs.
### 5.2 Executor Strategy Pattern
Each provider has a dedicated executor class that inherits from `BaseExecutor`. The factory in `executors/index.ts` selects the right one at runtime.
### 5.3 Self-Registering Plugin System
Translator modules register themselves on import via `register()`. Adding a new translator only means creating a file and importing it.
### 5.4 Account Fallback with Exponential Backoff
When a provider returns 429/401/500, the system can switch to the next account while applying exponential cooldowns (1s → 2s → 4s → max 2 min).
### 5.5 Combo Model Chains
A "combo" groups multiple `provider/model` strings. If the first one fails, the request automatically falls back to the next.
### 5.6 Stateful Streaming Translation
Response translation keeps state across SSE chunks (thinking block tracking, tool call accumulation, content block indexing) via the `initState()` mechanism.
### 5.7 Usage Safety Buffer
A 2000-token buffer is added to reported usage so that clients do not hit context window limits because of overhead from system prompts and format translation.
---
## 6. Supported Formats
- `open-sse/translator/response/*`
### Phase 5: 75% -> 80%
- [ ] Add handler-level tests for:
  - `open-sse/handlers/chatCore.ts`
  - `open-sse/handlers/responsesHandler.js`
  - `open-sse/handlers/imageGeneration.js`
  - `open-sse/handlers/embeddings.js`
- [ ] Add executor branch tests for provider-specific auth, retry, and endpoint overrides
### Phase 6: 80% -> 85%
- [ ] Merge more edge-case suites into the main coverage path
- [ ] Increase function coverage for database modules with weak constructor/helper coverage
- [ ] Close branch gaps in `settings.ts`, `registeredKeys.ts`, `validation.ts`, and the translator helpers
### Phase 7: 85% -> 90%
- [ ] Treat the remaining low-coverage files as blockers
- [ ] Add a regression test for every production bug found and fixed during the push to 90%
- [ ] Raise the coverage gate in CI only after the local baseline has held for at least two consecutive runs
## Ratchet Policy
Raise the `npm run test:coverage` thresholds only after actual coverage exceeds the next milestone with a comfortable buffer.
Recommended ratchet sequence:
1. 55/60/55
2. 60/62/58
3. 65/64/62
4. 70/66/66
5. 75/70/72
6. 80/75/78
7. 85/80/84
8. 90/85/88
The order is "statement lines / branches / functions".
## Known Gap
The current coverage command measures the main Node test suite against the source it reaches, including `open-sse`. Vitest coverage is not yet merged into a single unified report. That can be done later, but it does not block the 60% -> 80% ramp.
A visual guide to every section of the OmniRoute dashboard.

---

## 🔌 Providers
Manage AI provider connections: OAuth providers (Claude Code, Codex, Gemini CLI), API key providers (Groq, DeepSeek, OpenRouter), and free providers (Qoder, Qwen, Kiro). Kiro accounts include credit tracking, with balances shown under Dashboard → Usage.

---
## 🎨 Combos
Create routing combos with 6 strategies: priority, weighted, round-robin, random, least-used, and cost-optimized. Each combo chains multiple models, with quick templates and readiness checks.

---
## 📊 Analytics
Comprehensive usage analytics with token consumption, cost estimates, activity heatmaps, weekly distribution charts, and per-provider breakdowns.

---
## 🏥 System Health
Real-time monitoring: uptime, memory, version, latency percentiles (p50/p95/p99), cache statistics, and provider circuit breaker states.

---
## 🔧 Translator Playground
Tools for debugging API translations: **Playground** (format converter), **Chat Tester** (live requests), **Test Bench** (batch tests), and **Live Monitor** (real-time stream).

---
## 🎮 Model Playground _(v2.0.9+)_
Test any model directly from the dashboard. Select the provider, model, and endpoint, write prompts in the Monaco editor, stream responses in real time, abort mid-stream, and view timing metrics.

---

## 🎨 Themes _(v2.0.5+)_
Customizable color themes for the entire dashboard. Choose from 7 preset colors (coral, blue, green, violet, red, cyan) or create a custom theme by picking any hex color. Supports light, dark, and system modes.

---

## ⚙️ Settings
Comprehensive settings panel with tabs:
- **General**: system storage, backup management (database export/import)
- **Appearance**: theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security**: API endpoint protection, custom provider blocking, IP filtering, session information
- **Routing**: model aliases, background task degradation
- **Resilience**: rate limit persistence, circuit breaker tuning, auto-disabling of banned accounts, provider expiration monitoring
- **Advanced**: configuration overrides, configuration audit trail, fallback degradation mode

---
## 🔧 CLI Tools
One-click configuration for AI coding tools: Claude Code, Codex CLI, Gemini CLI, OpenClaw, Kilo Code, Antigravity, Cline, Continue, Cursor, and Factory Droid. Includes automatic config apply/reset, connection profiles, and live results.

---
## 🤖 CLI Agents _(v2.0.11+)_
Dashboard for managing CLI agents. Shows a grid of 14 built-in agents (Codex, Claude, Goose, Gemini CLI, OpenClaw, Aider, OpenCode, Cline, Qwen Code, ForgeCode, Amazon Q, Open Interpreter, Cursor CLI, and Warp) with:
- **Installation status**: Installed / Not Found, with version detection
- **Protocol badges**: stdio, HTTP, etc.
- **Custom agents**: register any CLI tool via a form (name, binary, version command, spawn args)
- **CLI fingerprint matching**: per-provider toggle to match native CLI request signatures, reducing ban risk while preserving the proxy IP

---

## 🖼️ Media _(v2.0.3+)_
Generate images, videos, and music from the dashboard. Supports OpenAI, xAI, Together, Hyperbolic, SD WebUI, ComfyUI, AnimateDiff, Stable Audio Open, and MusicGen.

---

## 📝 Request Logs
Real-time request logging with filtering by provider, model, account, and API key. Shows status codes, token usage, latency, and response details.

---
## 🌐 API Endpoint
Your unified API endpoint with a capability breakdown: Chat Completions, Responses API, Embeddings, Image Generation, Reranking, Audio Transcription, Text-to-Speech, Moderations, and registered API keys. Includes Cloudflare Quick Tunnel integration and cloud proxy support for remote access.

---
## 🔑 API Key Management
Create, scope, and revoke API keys. Each key can be restricted to specific models/providers, with full access or read-only permissions. Visual key management with usage tracking.

---

## 📋 Audit Log
Administrative action tracking with filtering by action type, actor, target, IP address, and timestamp. Full security event history.

---

## 🖥️ Desktop Application
Native Electron desktop app for Windows, macOS, and Linux. Run OmniRoute as a standalone application with system tray integration, offline support, auto-update, and one-click install.
Key features:
- Server readiness polling (no blank screen on cold start)
- System tray with port management
- Content Security Policy
- Single-instance lock
- Auto-update on restart
- Platform-conditional UI (macOS traffic lights, default title bar on Windows/Linux)
- Hardened Electron build: a symlinked `node_modules` in the standalone bundle is detected and rejected before packaging, preventing a runtime dependency on the build machine (v2.5.5+)

📖 See [`electron/README.md`](../electron/README.md) for the full documentation.
| `messages` | Translates missing keys in `src/i18n/messages/{locale}.json` from `en.json` |
| `readme` | Translates `README.md` into all languages as `README.{code}.md` in the project root |
| `docs` | Translates `DOC_SOURCE_FILES` into `docs/i18n/{locale}/{docName}` |
| `all` | Runs all three modes |
**Features:**
- **Text protection**: masks fenced code blocks (triple backticks), inline code (`` ` ``), markdown links/images (`[text](url)`), HTML tags, tables, and ICU placeholders (`{count}`, `{value}`, `{total}`, etc.) before translation, then restores them
- **Chunked batching**: joins multiple strings with `__OMNIROUTE_I18N_SEPARATOR__` delimiters to reduce API calls (max 1800 characters per request)
- **In-memory cache**: avoids repeated API calls for duplicate strings within a session
- **Retry logic**: exponential backoff (up to 5 attempts with a 300 ms × attempt delay) for 429/5xx errors
- **Timeout**: 20 seconds per request
- **Skip existing files**: if the target file already exists, it is not overwritten
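
The text-protection step above can be sketched roughly as follows. This is a minimal illustration, not the generator's actual code: the `__P0__` token format and the `protect`/`restore` names are assumptions, and only fenced code, inline code, links/images, and ICU placeholders are covered.

```typescript
// Hypothetical token format; the real script may use a different marker.
const TOKEN = (i: number): string => `__P${i}__`;

interface ProtectedText {
  masked: string;
  tokens: string[];
}

// Replace protected spans with numbered tokens before translation.
function protect(text: string): ProtectedText {
  const tokens: string[] = [];
  // Alternation order matters: fenced blocks, inline code, links/images, ICU placeholders.
  const pattern = /```[\s\S]*?```|`[^`\n]+`|!?\[[^\]]*\]\([^)]*\)|\{[A-Za-z_]\w*\}/g;
  const masked = text.replace(pattern, (match: string) => {
    tokens.push(match);
    return TOKEN(tokens.length - 1);
  });
  return { masked, tokens };
}

// Put the original spans back after translation; unknown tokens are left as-is.
function restore(masked: string, tokens: string[]): string {
  return masked.replace(/__P(\d+)__/g, (whole: string, index: string) => tokens[Number(index)] ?? whole);
}
```

Only the `masked` string would be sent to the translation API. Because the tokens contain no natural-language words, a translator is likely, though not guaranteed, to leave them intact.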
**Important behaviors:**
- `docs/i18n/README.md` is **regenerated** on every run; it is an auto-generated index of all documents
- Root `README.{code}.md` files are created only if they do not exist (the locales in `EXISTING_README_CODES` are skipped)
- Language bars (`🌐 **Languages:** ...`) are automatically inserted/updated in all translated documents
### i18n_autotranslate.py (LLM-based)
**Secondary translator**: uses any OpenAI-compatible LLM API (including OmniRoute itself) to translate the markdown files in `docs/i18n/`. Best for polishing documents or re-translating them with better quality than Google Translate.
```bash
python3 scripts/i18n_autotranslate.py \
  --api-url http://localhost:20128/v1 \
  --api-key sk-your-key \
  --model gpt-4o
```
**Features:**
- Scans the markdown files in `docs/i18n/` for English paragraphs
- **Missing keys**: keys present in `en.json` but not in the locale file
- **Extra keys**: keys in the locale file but not in `en.json`
- **Untranslated keys**: keys whose value is identical to the English source (excluding the allowlist)
- **Placeholder mismatches**: ICU placeholders that do not match between source and translation

**Exit codes:**

| Code | Meaning |
|------|---------|
| 0 | OK |
| 1 | General error |
| 2 | Missing strings (fatal) |
| 3 | Untranslated warning (soft) |
**Environment:** Set `TRANSLATION_LANG=cs` or use the `-l cs` flag.
### check_translations.py
**Code-to-JSON key checker**: scans `src/**/*.tsx` and `src/**/*.ts` for `useTranslations()` calls and verifies that all referenced keys exist in `en.json`.
```bash
# Basic check
python3 scripts/check_translations.py
# Verbose output
python3 scripts/check_translations.py --verbose
# Auto-fix (adds missing keys to en.json)
python3 scripts/check_translations.py --fix
```
### generate-qa-checklist.mjs
**Static analysis and QA**: scans Next.js page files for i18n risk metrics and generates a Markdown report.
```bash
node scripts/i18n/generate-qa-checklist.mjs
```
**Checks:**
- Fixed-width class usage (overflow risk)
- Left/right directional classes (RTL risk)
- Truncation-prone patterns
- Locale parity (missing/extra keys vs `en.json`)
- README language selector bars in priority locales (`es`, `fr`, `de`, `ja`, `ar`)

**Output:** `docs/reports/i18n-visual-qa-{date}.md` + JSON report
## Managing Untranslatable Keys
### untranslatable-keys.json
**File:** `scripts/i18n/untranslatable-keys.json`
Allowlist of keys that must stay identical to the English source. Used by `validate_translation.py` to suppress "untranslated" warnings.
```json
{
  "description": "Keys that should remain untranslated...",
  "keys": [
    "common.model",
    "common.oauth",
    "health.cpu",
    ...
  ]
}
```
**What belongs here:**
- Brand/product names: `landing.brandName`, `common.social-github`
- Avoid hardcoded `left`/`right` CSS; use the logical properties `start`/`end`
- Visual QA detects RTL layout mismatches via `run-visual-qa.mjs`.
## Known Issues & History
### `in.json` → `hi.json` Fix
The generator originally used `code: "in"` (a deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 code `hi`. This created an orphaned `in.json` copy of `hi.json`. Fixed by changing `code: "in"` to `code: "hi"` in `generate-multilang.mjs` and removing the orphaned file.
### `docs/i18n/README.md` Is Auto-Generated
The file `docs/i18n/README.md` is fully regenerated by `generate-multilang.mjs docs`. Any manual edits will be lost. Use `docs/I18N.md` (this file) for hand-written documentation that must persist.
### External Untranslatable Keys List
The `untranslatable-keys.json` allowlist was moved from an inline Python set in `validate_translation.py` to an external JSON file for easier maintenance. The validator loads it at runtime.
### `generate-multilang.mjs` Hindi Code Fix
The generator originally used `code: "in"` (a deprecated Google Translate code) for Hindi instead of the correct ISO 639-1 code `hi`. This was introduced in the initial commit `952b0b22c` by diegosouzapw. Fixed by changing `code: "in"` to `code: "hi"` in the `LOCALE_SPECS` array and removing the orphaned `in.json` file.
### `validate_translation.py` Ignored Count Output
The "quick" check now shows the number of keys ignored via `untranslatable-keys.json`.
1. Verify that `BASE_URL` points to your running instance (e.g., `http://localhost:20128`)
2. Verify that `CLOUD_URL` points to your cloud endpoint (e.g., `https://omniroute.dev`)
3. Keep `NEXT_PUBLIC_*` values aligned with the server-side values
### Cloud `stream=false` Returns 500
**Symptom:** `Unexpected token 'd'...` on the cloud endpoint for non-streaming calls.
**Cause:** The upstream returns an SSE payload while the client expects JSON.
**Workaround:** Use `stream=true` for direct cloud calls. The local runtime includes an SSE → JSON fallback.
### Cloud Says Connected but "Invalid API key"
1. Create a new key from the local dashboard (`/api/keys`)
| **Playground** | Compare input/output formats side by side; paste a failing request to see how it translates |
| **Chat Tester** | Send live messages and inspect the full request/response payload, including headers |
| **Test Bench** | Run batch tests across format combinations to find which translations are broken |
| **Live Monitor** | Watch the real-time request flow to catch intermittent translation issues |

### Common Format Issues
- **Thinking tags not appearing**: check whether the target provider supports thinking and how the thinking budget is set
- **Tool calls dropping**: some format translations may strip unsupported fields; verify in Playground mode
- **System prompt missing**: Claude and Gemini handle system prompts differently; check the translation output
- **SDK returns a raw string instead of an object**: fixed in v1.1.0. The response sanitizer now strips non-standard fields (`x_groq`, `usage_breakdown`, etc.) that caused OpenAI SDK Pydantic validation failures
- **GLM/ERNIE rejects the `system` role**: fixed in v1.1.0. The role normalizer merges system messages into user messages for incompatible models
- **`developer` role not recognized**: fixed in v1.1.0. It is converted automatically to `system` for non-OpenAI providers
- **`json_schema` not working with Gemini**: fixed in v1.1.0. `response_format` is now converted to Gemini's `responseMimeType` + `responseSchema`

---

## Resilience Settings
### Auto rate-limit not triggering
- Auto rate-limit applies only to API key providers (not OAuth/subscription)
- Verify that **Settings → Resilience → Provider Profiles** has auto rate-limit enabled
- Check whether the provider returns `429` status codes or `Retry-After` headers
### Tuning exponential backoff
Provider profiles support these settings:
- **Base delay**: initial wait time after the first failure (default: 1s)
- **Max delay**: maximum wait time cap (default: 30s)
- **Multiplier**: how much the delay grows per consecutive failure (default: 2x)
### Anti-thundering herd
When many concurrent requests hit a rate-limited provider, OmniRoute uses a mutex plus automatic rate limiting to serialize requests and prevent cascading failures. This is automatic for API key providers.

---

## Optional RAG / LLM failure taxonomy (16 problems)
Some OmniRoute users place the gateway in front of RAG or agent stacks. In those setups it is common to see a strange pattern: OmniRoute looks healthy (providers up, routing profiles fine, no rate-limit alerts) but the final answer is still wrong.
In practice, these incidents usually come from the downstream RAG pipeline, not from the gateway itself.
If you want a shared vocabulary for describing those failures, you can use the WFGY ProblemMap, an external MIT-licensed resource that defines sixteen recurring RAG / LLM failure patterns. At a high level it covers:
- retrieval drift and broken context boundaries
- empty or stale indexes and vector stores
- embedding versus semantic mismatch
- prompt assembly and context window issues
- logic collapse and overconfident answers
- long-chain and agent coordination failures
- multi-agent memory and role drift
- deployment and bootstrap ordering problems

The idea is simple:
1. When you investigate a bad response, capture:
   - the user task and request
   - the route or provider combo in OmniRoute
   - any RAG context used downstream (retrieved documents, tool calls, etc.)
2. Map the incident to one or two WFGY ProblemMap numbers (`No.1` ... `No.16`).
3. Store the number in your own dashboard, runbook, or issue tracker next to the OmniRoute logs.
4. Use the corresponding WFGY page to decide whether you need to change your RAG stack, retriever, or routing strategy.

Full text and concrete recipes live here (MIT license, text only):
[WFGY ProblemMap README](https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md)
You can ignore this section if you do not run RAG or agent pipelines behind OmniRoute.

---

## Still Stuck?
- **GitHub Issues**: [github.com/diegosouzapw/OmniRoute/issues](https://github.com/diegosouzapw/OmniRoute/issues)
- **Architecture**: see [`docs/ARCHITECTURE.md`](ARCHITECTURE.md) for internal details
- **API Reference**: see [`docs/API_REFERENCE.md`](API_REFERENCE.md)
- **Health Dashboard**: check **Dashboard → Health** for real-time system status
- **Translator**: use **Dashboard → Translator** to debug format issues
2. Get an API key → Dashboard → Add API key

**Use:** `kimi/kimi-latest`. **Pro tip:** a fixed $9/month for 10M tokens works out to an effective $0.90 per 1M tokens!
### 🆓 Free Providers
#### Qoder (8 FREE models)
For the mode integrated with CLI tools, see the Docker section in the main documentation.
### Void Linux (xbps-src)
Void Linux users can package and install OmniRoute natively using the `xbps-src` cross-compilation framework. This automates the standalone Node.js build along with the required `better-sqlite3` native bindings.
| `NODE_ENV` | runtime default | Set `production` for deployment |
| `BASE_URL` | `http://localhost:20128` | Server-side internal base URL |
| `CLOUD_URL` | `https://omniroute.dev` | Base URL of the cloud sync endpoint |
| `API_KEY_SECRET` | `endpoint-proxy-api-key-secret` | HMAC secret for generated API keys |
| `REQUIRE_API_KEY` | `false` | Enforce a Bearer API key on `/v1/*` |
| `ALLOW_API_KEY_REVEAL` | `false` | Allow Api Manager to copy full API keys on demand |
| `PROVIDER_LIMITS_SYNC_INTERVAL_MINUTES` | `70` | Server-side refresh cadence for cached provider limits data; the UI refresh buttons still trigger a manual sync |
| `DISABLE_SQLITE_AUTO_BACKUP` | `false` | Disable automatic SQLite snapshots before writes/import/restore; manual backups still work |
Or use the dashboard: **Providers → [Provider] → Custom Models**.
Notes:
- OpenRouter and OpenAI/Anthropic-compatible providers are managed from **Available Models** only. Manual additions, imports, and automatic sync all land in the same available-models list, so there is no separate Custom Models section for those providers.
- The **Custom Models** section is intended for providers that do not expose managed imports of available models.
### Dedicated Provider Routes
Route requests directly to a specific provider with model validation:
```bash
POST http://localhost:20128/v1/providers/openai/chat/completions
```
| **Export Database** | Downloads the current SQLite database as a `.sqlite` file |
| **Export All (.tar.gz)** | Downloads a full backup archive including: database, settings, combos, provider connections (without credentials), and API key metadata |
| **Import Database** | Upload a `.sqlite` file to replace the current database. A pre-import backup is created automatically unless `DISABLE_SQLITE_AUTO_BACKUP=true` |

```bash
curl -X POST http://localhost:20128/api/db-backups/import \
  -F "file=@backup.sqlite"
```
**Import validation:** The imported file is validated for integrity, required tables (`provider_connections`, `provider_nodes`, `combos`, `api_keys`), and size (max 100 MB).
**Use cases:**
- Migrate OmniRoute between machines
- Create external backups for disaster recovery
- Share configurations between team members (export all → share the archive)

---

### Settings Dashboard
The settings page is organized into 6 tabs:
**Cost tracking:** Each request records token usage and calculates the cost using the pricing table. View breakdowns in **Dashboard → Usage** by provider, model, and API key.

---
### Audio Transcription
OmniRoute supports audio transcription via the OpenAI-compatible endpoint:
```bash
POST /v1/audio/transcriptions
Authorization: Bearer your-api-key
Content-Type: multipart/form-data
# Example with curl
curl -X POST http://localhost:20128/v1/audio/transcriptions
```
> **Tip**: For maximum security, restrict ports 80 and 443 to Cloudflare IP addresses only. See the [Advanced Security](#advanced-security) section.

---

## 2. Install OmniRoute
> **Agent-to-Agent Protocol v0.3**: enables any AI agent to use OmniRoute as an intelligent routing agent via JSON-RPC 2.0.

The OmniRoute A2A server exposes a **first-class agent** that other agents can discover and call using the [A2A protocol](https://google.github.io/A2A/).

---

## Architecture
OmniRoute is a local AI routing gateway and dashboard built on Next.js.
It provides a single OpenAI-compatible endpoint (`/v1/*`) and routes traffic across multiple upstream providers with translation, fallback, token refresh, and usage tracking.
Core capabilities:
- OpenAI-compatible API surface for CLI/tools (28 providers)
- Request/response translation across provider formats
- Model combo fallback (multi-model sequence)
- Account-level fallback (multi-account per provider)
- OAuth + API-key provider connection management
- Embedding generation via `/v1/embeddings` (6 providers, 9 models)
- Image generation via `/v1/images/generations` (4 providers, 9 models)
- Think tag parsing (`<think>...</think>`) for reasoning models
- Response sanitization for strict OpenAI SDK compatibility
- Role normalization (developer→system, system→user) for cross-provider compatibility
- facade: `src/lib/usageDb.ts` (decomposed modules in `src/lib/usage/*`)
- SQLite tables in `storage.sqlite`: `usage_history`, `call_logs`, `proxy_logs`
- optional file artifacts remain for compatibility/debug (`${DATA_DIR}/log.txt`, `${DATA_DIR}/call_logs/`, `<repo>/logs/...`)
- legacy JSON files are migrated to SQLite by startup migrations when present
Domain State DB (SQLite):
- `src/lib/db/domainState.ts` — CRUD operations for domain state
- Tables (created in `src/lib/db/core.ts`): `domain_fallback_chains`, `domain_budgets`, `domain_cost_history`, `domain_lockout_state`, `domain_circuit_breakers`
- Write-through cache pattern: in-memory Maps are authoritative at runtime; mutations are written synchronously to SQLite; state is restored from DB on cold start
- Format constants: `open-sse/translator/formats.ts`
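
As a rough illustration of the write-through cache pattern listed under Domain State DB, the sketch below keeps an in-memory `Map` authoritative and mirrors every mutation to a backing store. The `StateStore` interface, `WriteThroughMap`, and `createFakeStore` are hypothetical stand-ins, not the API of `src/lib/db/domainState.ts`.

```typescript
// Hypothetical persistence interface standing in for the SQLite layer.
interface StateStore {
  loadAll(): Array<[string, number]>;
  save(key: string, value: number): void;
}

class WriteThroughMap {
  private readonly cache = new Map<string, number>();

  constructor(private readonly store: StateStore) {
    // Cold start: restore the in-memory state from the database.
    for (const [key, value] of store.loadAll()) {
      this.cache.set(key, value);
    }
  }

  // Reads never touch the database; the Map is authoritative at runtime.
  get(key: string): number | undefined {
    return this.cache.get(key);
  }

  // Every mutation is written synchronously to the backing store.
  set(key: string, value: number): void {
    this.cache.set(key, value);
    this.store.save(key, value);
  }
}

// In-memory stand-in for the database, used only to exercise the sketch.
function createFakeStore(initial: Array<[string, number]> = []): StateStore & { rows: Map<string, number> } {
  const rows = new Map<string, number>(initial);
  return {
    rows,
    loadAll: () => Array.from(rows.entries()),
    save: (key: string, value: number) => {
      rows.set(key, value);
    },
  };
}
```

Writing synchronously keeps the Map and the table from diverging after a crash, at the cost of a database write on every mutation.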
### Persistence
- `src/lib/db/*`: persistent config/state and domain persistence on SQLite
- `src/lib/localDb.ts`: compatibility re-export for DB modules
- `src/lib/usageDb.ts`: usage history/call logs facade on top of SQLite tables
## Provider Executor Coverage (Strategy Pattern)
Each provider has a specialized executor extending `BaseExecutor` (in `open-sse/executors/base.ts`), which provides URL building, header construction, retry with exponential backoff, credential refresh hooks, and the `execute()` orchestration method.
| GLM/Kimi/MiniMax | claude | API Key | ✅ | ✅ | ❌ | ❌ |
| DeepSeek | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Groq | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| xAI (Grok) | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Mistral | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Perplexity | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Together AI | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Fireworks AI | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Cerebras | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| Cohere | openai | API Key | ✅ | ✅ | ❌ | ❌ |
| NVIDIA NIM | openai | API Key | ✅ | ✅ | ❌ | ❌ |
## Format Translation Coverage
Detected source formats include:
- `openai`
- `openai-responses`
- `claude`
- `gemini`
Target formats include:
- OpenAI chat/Responses
- Claude
- Gemini/Gemini-CLI/Antigravity envelope
- Kiro
- Cursor
Translations use **OpenAI as the hub format** — all conversions go through OpenAI as intermediate:
```
Source Format → OpenAI (hub) → Target Format
```
Translations are selected dynamically based on source payload shape and provider target format.
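
The hub-and-spoke flow above might look roughly like this in code. It is a simplified sketch: the message shapes are reduced to plain text, and `TranslatorPair`, `translators`, and `translate` are hypothetical names rather than the real `open-sse/translator` API.

```typescript
// Simplified, hypothetical hub shape (OpenAI-like chat messages).
type HubMessage = { role: "system" | "user" | "assistant"; content: string };
type HubRequest = { model: string; messages: HubMessage[] };
type Format = "openai" | "claude" | "gemini";

// One translator pair per format: into the hub and out of the hub.
interface TranslatorPair {
  toHub(body: any): HubRequest;
  fromHub(req: HubRequest): any;
}

const translators: Record<Format, TranslatorPair> = {
  openai: {
    toHub: (body) => body as HubRequest,
    fromHub: (req) => req,
  },
  claude: {
    // Claude-style bodies carry the system prompt as a top-level field.
    toHub: (body) => ({
      model: body.model,
      messages: [
        ...(body.system ? [{ role: "system" as const, content: body.system }] : []),
        ...body.messages,
      ],
    }),
    fromHub: (req) => ({
      model: req.model,
      system: req.messages.filter((m) => m.role === "system").map((m) => m.content).join("\n"),
      messages: req.messages.filter((m) => m.role !== "system"),
    }),
  },
  gemini: {
    // Gemini-style bodies use `contents` with `parts`, and "model" instead of "assistant".
    toHub: (body) => ({
      model: body.model,
      messages: body.contents.map((c: any) => ({
        role: c.role === "model" ? "assistant" : "user",
        content: c.parts.map((p: any) => p.text).join(""),
      })),
    }),
    // System messages are simply dropped here to keep the sketch short.
    fromHub: (req) => ({
      model: req.model,
      contents: req.messages
        .filter((m) => m.role !== "system")
        .map((m) => ({ role: m.role === "assistant" ? "model" : "user", parts: [{ text: m.content }] })),
    }),
  },
};

// Any source → hub → any target: N pairs cover every source/target combination.
function translate(source: Format, target: Format, body: any): any {
  return translators[target].fromHub(translators[source].toHub(body));
}
```

Adding a fourth format to this sketch means adding one more entry to `translators`; none of the existing pairs change.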
Additional processing layers in the translation pipeline:
- **Response sanitization** — Strips non-standard fields from OpenAI-format responses (both streaming and non-streaming) to ensure strict SDK compliance
- **Role normalization** — Converts `developer` → `system` for non-OpenAI targets; merges `system` → `user` for models that reject the system role (GLM, ERNIE)
- **Think tag extraction** — Parses `<think>...</think>` blocks from content into `reasoning_content` field
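
Two of the layers above, think tag extraction and role normalization, can be sketched as small pure functions. This is an illustration under assumptions: `extractThinking`, `normalizeRoles`, and `RoleOptions` are hypothetical names, and prepending system text to the first user message is one plausible merge strategy, not necessarily the one OmniRoute uses.

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

// Move `<think>...</think>` blocks out of the visible content.
function extractThinking(content: string): { content: string; reasoning_content: string | null } {
  const blocks: string[] = [];
  const visible = content.replace(/<think>([\s\S]*?)<\/think>/g, (_whole: string, inner: string) => {
    blocks.push(inner.trim());
    return "";
  });
  return {
    content: visible.trim(),
    reasoning_content: blocks.length > 0 ? blocks.join("\n") : null,
  };
}

interface RoleOptions {
  targetIsOpenAI: boolean; // `developer` is only kept for OpenAI targets
  rejectsSystemRole: boolean; // models that refuse `system` messages
}

function normalizeRoles(messages: ChatMessage[], options: RoleOptions): ChatMessage[] {
  // developer → system for non-OpenAI targets
  const mapped = messages.map((m) =>
    !options.targetIsOpenAI && m.role === "developer" ? { ...m, role: "system" } : m
  );
  if (!options.rejectsSystemRole) return mapped;

  // Assumed merge strategy: prepend all system text to the first user message.
  const systemText = mapped.filter((m) => m.role === "system").map((m) => m.content).join("\n");
  const rest = mapped.filter((m) => m.role !== "system");
  if (systemText === "") return rest;
  const firstUser = rest.findIndex((m) => m.role === "user");
  if (firstUser === -1) return [{ role: "user", content: systemText }, ...rest];
  return rest.map((m, i) => (i === firstUser ? { ...m, content: `${systemText}\n\n${m.content}` } : m));
}
```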
| `GET/POST/DELETE /api/provider-models` | Custom Models | Custom model management per provider |
## Bypass Handler
The bypass handler (`open-sse/utils/bypassHandler.ts`) intercepts known "throwaway" requests from Claude CLI — warmup pings, title extractions, and token counts — and returns a **fake response** without consuming upstream provider tokens. This is triggered only when `User-Agent` contains `claude-cli`.
## Request Logger Pipeline
The request logger (`open-sse/utils/requestLogger.ts`) provides a 7-stage debug logging pipeline, disabled by default, enabled via `ENABLE_REQUEST_LOGS=true`:
1. `usageDb` and `localDb` share the same base directory policy (`DATA_DIR` -> `XDG_CONFIG_HOME/omniroute` -> `~/.omniroute`) with legacy file migration.
2. `/api/v1/route.ts` delegates to the same unified catalog builder used by `/api/v1/models` (`src/app/api/v1/models/catalog.ts`) to avoid semantic drift.
3. Request logger writes full headers/body when enabled; treat log directory as sensitive.
4. Cloud behavior depends on correct `NEXT_PUBLIC_BASE_URL` and cloud endpoint reachability.
5. The `open-sse/` directory is published as the `@omniroute/open-sse` **npm workspace package**. Source code imports it via `@omniroute/open-sse/...` (resolved by Next.js `transpilePackages`). File paths in this document still use the directory name `open-sse/` for consistency.
6. Charts in the dashboard use **Recharts** (SVG-based) for accessible, interactive analytics visualizations (model usage bar charts, provider breakdown tables with success rates).
7. E2E tests use **Playwright** (`tests/e2e/`), run via `npm run test:e2e`. Unit tests use **Node.js test runner** (`tests/unit/`), run via `npm run test:unit`. Source code under `src/` is **TypeScript** (`.ts`/`.tsx`); the `open-sse/` workspace remains JavaScript (`.js`).
8. Settings page is organized into 5 tabs: Security, Routing (6 global strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized), Resilience (editable rate limits, circuit breaker, policies), AI (thinking budget, system prompt, prompt cache), Advanced (proxy).
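
The base directory policy from note 1 (`DATA_DIR` -> `XDG_CONFIG_HOME/omniroute` -> `~/.omniroute`) can be expressed as a short lookup chain. The function name `resolveDataDir` and the treatment of empty values as unset are assumptions made for this sketch; legacy file migration is not shown.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Resolve the data directory using the documented precedence.
function resolveDataDir(env: Env = process.env, homeDir: string = os.homedir()): string {
  const explicit = env["DATA_DIR"];
  if (explicit) return path.resolve(explicit); // 1. explicit override

  const xdg = env["XDG_CONFIG_HOME"];
  if (xdg) return path.join(xdg, "omniroute"); // 2. XDG config directory

  return path.join(homeDir, ".omniroute"); // 3. fallback in the home directory
}
```

Passing `env` and `homeDir` as parameters keeps the function easy to test without touching the real environment.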
> A comprehensive, beginner-friendly guide to the **omniroute** multi-provider AI proxy router.
---
## 1. What Is omniroute?
omniroute is a **proxy router** that sits between AI clients (Claude CLI, Codex, Cursor IDE, etc.) and AI providers (Anthropic, Google, OpenAI, AWS, GitHub, etc.). It solves one big problem:
> **Different AI clients speak different "languages" (API formats), and different AI providers expect different "languages" too.** omniroute translates between them automatically.
Think of it like a universal translator at the United Nations — any delegate can speak any language, and the translator converts it for any other delegate.
---
## 2. Architecture Overview
```mermaid
graph LR
subgraph Clients
A[Claude CLI]
B[Codex]
C[Cursor IDE]
D[OpenAI-compatible]
end
subgraph omniroute
E[Handler Layer]
F[Translator Layer]
G[Executor Layer]
H[Services Layer]
end
subgraph Providers
I[Anthropic Claude]
J[Google Gemini]
K[OpenAI / Codex]
L[GitHub Copilot]
M[AWS Kiro]
N[Antigravity]
O[Cursor API]
end
A --> E
B --> E
C --> E
D --> E
E --> F
F --> G
G --> I
G --> J
G --> K
G --> L
G --> M
G --> N
G --> O
H -.-> E
H -.-> G
```
### Core Principle: Hub-and-Spoke Translation
All format translation passes through **OpenAI format as the hub**:
```
Client Format → [OpenAI Hub] → Provider Format (request)
Provider Format → [OpenAI Hub] → Client Format (response)
```
This means you only need **N translators** (one per format) instead of **N²** (every pair).
| `constants.ts` | `PROVIDERS` object with base URLs, OAuth credentials (defaults), headers, and default system prompts for every provider. Also defines `HTTP_STATUS`, `ERROR_TYPES`, `COOLDOWN_MS`, `BACKOFF_CONFIG`, and `SKIP_PATTERNS`. |
| `credentialLoader.ts` | Loads external credentials from `data/provider-credentials.json` and merges them over the hardcoded defaults in `PROVIDERS`. Keeps secrets out of source control while maintaining backwards compatibility. |
| `providerModels.ts` | Central model registry: maps provider aliases → model IDs. Functions like `getModels()`, `getProviderByAlias()`. |
| `codexInstructions.ts` | System instructions injected into Codex requests (editing constraints, sandbox rules, approval policies). |
| `defaultThinkingSignature.ts` | Default "thinking" signatures for Claude and Gemini models. |
| `ollamaModels.ts` | Schema definition for local Ollama models (name, size, family, quantization). |
| `provider.ts` | **Format detection** (`detectFormat`): analyzes request body structure to identify Claude/OpenAI/Gemini/Antigravity/Responses formats (includes `max_tokens` heuristic for Claude). Also: URL building, header building, thinking config normalization. Supports `openai-compatible-*` and `anthropic-compatible-*` dynamic providers. |
| `model.ts` | Model string parsing (`claude/model-name` → `{provider: "claude", model: "model-name"}`), alias resolution with collision detection, input sanitization (rejects path traversal/control chars), and model info resolution with async alias getter support. |
| `tokenRefresh.ts` | OAuth token refresh for **every provider**: Google (Gemini, Antigravity), Claude, Codex, Qwen, iFlow, GitHub (OAuth + Copilot dual-token), Kiro (AWS SSO OIDC + Social Auth). Includes in-flight promise deduplication cache and retry with exponential backoff. |
| `combo.ts` | **Combo models**: chains of fallback models. If model A fails with a fallback-eligible error, try model B, then C, etc. Returns actual upstream status codes. |
| `usage.ts` | Fetches quota/usage data from provider APIs (GitHub Copilot quotas, Antigravity model quotas, Codex rate limits, Kiro usage breakdowns, Claude settings). |
| `accountSelector.ts` | Smart account selection with scoring algorithm: considers priority, health status, round-robin position, and cooldown state to pick the optimal account for each request. |
| `contextManager.ts` | Request context lifecycle management: creates and tracks per-request context objects with metadata (request ID, timestamps, provider info) for debugging and logging. |
| `ipFilter.ts` | IP-based access control: supports allowlist and blocklist modes. Validates client IP against configured rules before processing API requests. |
| `sessionManager.ts` | Session tracking with client fingerprinting: tracks active sessions using hashed client identifiers, monitors request counts, and provides session metrics. |
| `signatureCache.ts` | Request signature-based deduplication cache: prevents duplicate requests by caching recent request signatures and returning cached responses for identical requests within a time window. |
| `systemPrompt.ts` | Global system prompt injection: prepends or appends a configurable system prompt to all requests, with per-provider compatibility handling. |
| `thinkingBudget.ts` | Reasoning token budget management: supports passthrough, auto (strip thinking config), custom (fixed budget), and adaptive (complexity-scaled) modes for controlling thinking/reasoning tokens. |
| `wildcardRouter.ts` | Wildcard model pattern routing: resolves wildcard patterns (e.g., `*/claude-*`) to concrete provider/model pairs based on availability and priority. |
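
The model string parsing described in the `model.ts` row can be sketched as below. Splitting on the first slash and the exact rejection rules are assumptions for this illustration; alias resolution and collision detection are left out.

```typescript
interface ParsedModel {
  provider: string | null; // null when the input is a bare model name or alias
  model: string;
}

function parseModel(input: string): ParsedModel {
  // Reject control characters and path traversal sequences.
  if (/[\x00-\x1f\x7f]/.test(input) || input.includes("..")) {
    throw new Error("Invalid model string");
  }
  const slash = input.indexOf("/");
  if (slash === -1) return { provider: null, model: input };
  // Split on the first slash only, so model IDs that contain slashes stay intact.
  return { provider: input.slice(0, slash), model: input.slice(slash + 1) };
}
```

For example, `parseModel("claude/model-name")` yields `{ provider: "claude", model: "model-name" }`.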
#### Token Refresh Deduplication
```mermaid
sequenceDiagram
participant R1 as Request 1
participant R2 as Request 2
participant Cache as refreshPromiseCache
participant OAuth as OAuth Provider
R1->>Cache: getAccessToken("gemini", token)
Cache->>Cache: No in-flight promise
Cache->>OAuth: Start refresh
R2->>Cache: getAccessToken("gemini", token)
Cache->>Cache: Found in-flight promise
Cache-->>R2: Return existing promise
OAuth-->>Cache: New access token
Cache-->>R1: New access token
Cache-->>R2: Same access token (shared)
Cache->>Cache: Delete cache entry
```
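
The sequence above boils down to caching the in-flight promise. The sketch below borrows the names `refreshPromiseCache` and `getAccessToken` from the diagram, but the signature, the cache key, and the injected `refresh` callback are assumptions.

```typescript
type Refresher = (provider: string, refreshToken: string) => Promise<string>;

// Maps "provider:token" to the refresh currently in flight.
const refreshPromiseCache = new Map<string, Promise<string>>();

function getAccessToken(provider: string, refreshToken: string, refresh: Refresher): Promise<string> {
  const key = `${provider}:${refreshToken}`;

  // A refresh is already running: share its promise instead of starting another.
  const inFlight = refreshPromiseCache.get(key);
  if (inFlight) return inFlight;

  // Start the refresh and remove the cache entry once it settles (success or failure).
  const promise = refresh(provider, refreshToken).finally(() => {
    refreshPromiseCache.delete(key);
  });
  refreshPromiseCache.set(key, promise);
  return promise;
}
```

Deleting the entry in `finally` means a failed refresh is not cached, so the next request starts a fresh attempt.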
#### Account Fallback State Machine
```mermaid
stateDiagram-v2
[*] --> Active
Active --> Error: Request fails (401/429/500)
Error --> Cooldown: Apply backoff
Cooldown --> Active: Cooldown expires
    Active --> Active: Request succeeds (reset backoff)
```
| `usageTracking.ts` | Token usage extraction from any format (Claude/OpenAI/Gemini/Responses), estimation with separate tool/message char-per-token ratios, buffer addition (2000 tokens safety margin), format-specific field filtering, console logging with ANSI colors. |
| `requestLogger.ts` | File-based request logging (opt-in via `ENABLE_REQUEST_LOGS=true`). Creates session folders with numbered files: `1_req_client.json` → `7_res_client.txt`. All I/O is async (fire-and-forget). Masks sensitive headers. |
| `bypassHandler.ts` | Intercepts specific patterns from Claude CLI (title extraction, warmup, count) and returns fake responses without calling any provider. Supports both streaming and non-streaming. Intentionally limited to Claude CLI scope. |
| `networkProxy.ts` | Resolves outbound proxy URL for a given provider with precedence: provider-specific config → global config → environment variables (`HTTPS_PROXY`/`HTTP_PROXY`/`ALL_PROXY`). Supports `NO_PROXY` exclusions. Caches config for 30s. |
| `/api/settings/system-prompt` | GET/PUT | Global system prompt injection for all requests |
| `/api/sessions` | GET | Active session tracking and metrics |
| `/api/rate-limits` | GET | Per-account rate limit status |
---
## 5. Key Design Patterns
### 5.1 Hub-and-Spoke Translation
All formats translate through **OpenAI format as the hub**. Adding a new provider only requires writing **one pair** of translators (to/from OpenAI), not N pairs.
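As a rough sketch of the hub idea, each format supplies one pair of functions and any source/target combination is composed through the OpenAI shape. The `FormatTranslator` interface, the `toy` format, and the trimmed-down `OpenAIRequest` type are hypothetical and exist only for illustration.

```typescript
// Simplified, assumed shape; real requests carry many more fields.
type OpenAIRequest = { model: string; messages: { role: string; content: string }[] };
type ToyRequest = { model: string; contents: { role: string; text: string }[] };

interface FormatTranslator {
  toOpenAI(body: unknown): OpenAIRequest;
  fromOpenAI(body: OpenAIRequest): unknown;
}

const translators: Record<string, FormatTranslator> = {
  openai: {
    toOpenAI: (body) => body as OpenAIRequest,
    fromOpenAI: (body) => body,
  },
  // Made-up format, used only to show that one pair per format is enough.
  toy: {
    toOpenAI: (body) => {
      const b = body as ToyRequest;
      return {
        model: b.model,
        messages: b.contents.map((c) => ({ role: c.role, content: c.text })),
      };
    },
    fromOpenAI: (body) => ({
      model: body.model,
      contents: body.messages.map((m) => ({ role: m.role, text: m.content })),
    }),
  },
};

function translateRequest(source: string, target: string, body: unknown): unknown {
  const from = translators[source];
  const to = translators[target];
  if (!from || !to) throw new Error(`Unknown format: ${!from ? source : target}`);
  return to.fromOpenAI(from.toOpenAI(body)); // always routed through the hub
}

const translated = translateRequest("toy", "openai", {
  model: "m",
  contents: [{ role: "user", text: "hi" }],
}) as OpenAIRequest;
```

With N formats this needs 2N functions rather than a translator for every ordered pair.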
### 5.2 Executor Strategy Pattern
Each provider has a dedicated executor class inheriting from `BaseExecutor`. The factory in `executors/index.ts` selects the right one at runtime.
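A minimal sketch of the strategy pattern described here follows. Only the `BaseExecutor` name comes from the text; the method names, the `ExampleExecutor` subclass, the URLs, and the `getExecutor` helper are assumptions.

```typescript
// Hypothetical methods; the real BaseExecutor API is not shown in this document.
class BaseExecutor {
  protected readonly baseUrl: string;
  constructor(baseUrl: string) {
    this.baseUrl = baseUrl;
  }
  buildUrl(_model: string): string {
    return `${this.baseUrl}/chat/completions`;
  }
  buildHeaders(apiKey: string): Record<string, string> {
    return { Authorization: `Bearer ${apiKey}` };
  }
}

// A provider with its own URL scheme and auth header overrides only what differs.
class ExampleExecutor extends BaseExecutor {
  override buildUrl(model: string): string {
    return `${this.baseUrl}/models/${model}:generate`;
  }
  override buildHeaders(apiKey: string): Record<string, string> {
    return { "x-api-key": apiKey };
  }
}

const defaultExecutor = new BaseExecutor("https://default.invalid/v1");
const executors = new Map<string, BaseExecutor>([
  ["example", new ExampleExecutor("https://example.invalid/v1")],
]);

// Factory: pick the provider-specific executor, or fall back to the default one.
function getExecutor(provider: string): BaseExecutor {
  return executors.get(provider) ?? defaultExecutor;
}

const exampleUrl = getExecutor("example").buildUrl("m1");
```

Callers only depend on the base class, so the request pipeline stays the same regardless of which provider is selected.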
### 5.3 Self-Registering Plugin System
Translator modules register themselves on import via `register()`. Adding a new translator is just creating a file and importing it.
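Self-registration could look roughly like the sketch below. The `register()` signature, the key format, and the `getTranslator` lookup are assumptions; in a real module the `register(...)` call would sit at the top level of the translator file so that importing it is enough.

```typescript
type TranslateFn = (body: unknown) => unknown;

const registry = new Map<string, TranslateFn>();

// Assumed signature: source format, target format, translation function.
function register(from: string, to: string, fn: TranslateFn): void {
  registry.set(`${from}:${to}`, fn);
}

function getTranslator(from: string, to: string): TranslateFn | undefined {
  return registry.get(`${from}:${to}`);
}

// What a translator module would run as a side effect of being imported.
register("toy", "openai", (body) => body);

const found = getTranslator("toy", "openai");
const missing = getTranslator("other", "openai");
```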
### 5.4 Account Fallback with Exponential Backoff
When a provider returns 429/401/500, the system can switch to the next account, applying exponential cooldowns (1s → 2s → 4s → max 2min).
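The cooldown schedule above amounts to a capped doubling. The sketch below assumes a 1 s base and a 120 s cap taken from the figures in the text; the `AccountState` shape and the helper names are hypothetical.

```typescript
const BASE_COOLDOWN_MS = 1_000;
const MAX_COOLDOWN_MS = 120_000; // "max 2min"

// 1s, 2s, 4s, ... doubling per consecutive failure, capped at the maximum.
function cooldownMs(backoffLevel: number): number {
  return Math.min(BASE_COOLDOWN_MS * 2 ** backoffLevel, MAX_COOLDOWN_MS);
}

interface AccountState {
  id: string;
  backoffLevel: number;
  cooldownUntil: number; // epoch milliseconds; 0 means available
}

function markFailure(account: AccountState, now: number): AccountState {
  return {
    ...account,
    backoffLevel: account.backoffLevel + 1,
    cooldownUntil: now + cooldownMs(account.backoffLevel),
  };
}

function markSuccess(account: AccountState): AccountState {
  return { ...account, backoffLevel: 0, cooldownUntil: 0 }; // success resets the backoff
}

// Pick the first account whose cooldown has expired.
function pickAccount(accounts: AccountState[], now: number): AccountState | undefined {
  return accounts.find((a) => a.cooldownUntil <= now);
}

const accounts: AccountState[] = [
  markFailure({ id: "a", backoffLevel: 0, cooldownUntil: 0 }, 0), // cooling down until t=1000
  { id: "b", backoffLevel: 0, cooldownUntil: 0 },
];
const chosen = pickAccount(accounts, 500);
```

At `t=500` account `a` is still cooling down, so the request goes to `b`; once the cooldown expires, `a` becomes eligible again.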
### 5.5 Combo Model Chains
A "combo" groups multiple `provider/model` strings. If the first fails, the request falls back to the next one automatically.
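One way to picture a combo chain is a loop that tries each `provider/model` entry in order and returns the first success. The `runCombo` helper and the `Attempt` callback are hypothetical, and rethrowing the last error is a simplification chosen for this sketch.

```typescript
type Attempt<T> = (provider: string, model: string) => Promise<T>;

async function runCombo<T>(targets: string[], attempt: Attempt<T>): Promise<T> {
  let lastError: unknown = new Error("Combo is empty");
  for (const target of targets) {
    // Split on the first slash only, since model names may contain slashes.
    const slash = target.indexOf("/");
    if (slash < 0) {
      lastError = new Error(`Invalid combo entry: ${target}`);
      continue;
    }
    const provider = target.slice(0, slash);
    const model = target.slice(slash + 1);
    try {
      return await attempt(provider, model);
    } catch (err) {
      lastError = err; // remember the failure and move on to the next entry
    }
  }
  throw lastError;
}

// Usage: the first entry fails, so the second one serves the request.
const comboResult = runCombo(["alpha/model-a", "beta/model-b"], async (provider, model) => {
  if (provider === "alpha") throw new Error("429 from alpha");
  return `${provider} served ${model}`;
});
```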
### 5.6 Stateful Streaming Translation
Response translation maintains state across SSE chunks (thinking block tracking, tool call accumulation, content block indexing) via the `initState()` mechanism.
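To illustrate why state has to survive between chunks, the sketch below accumulates tool call arguments that arrive in fragments. Only the `initState()` name comes from the text; the `StreamState` fields, the `Chunk` union, and `applyChunk` are assumptions.

```typescript
interface ToolCallBuffer {
  name: string;
  args: string;
}

// Assumed state shape; the real fields are not shown in this document.
interface StreamState {
  inThinkingBlock: boolean;
  contentBlockIndex: number;
  toolCalls: Map<number, ToolCallBuffer>;
}

function initState(): StreamState {
  return { inThinkingBlock: false, contentBlockIndex: 0, toolCalls: new Map() };
}

type Chunk =
  | { kind: "thinking_start" }
  | { kind: "thinking_end" }
  | { kind: "tool_call"; index: number; name?: string; argsFragment: string };

function applyChunk(state: StreamState, chunk: Chunk): void {
  switch (chunk.kind) {
    case "thinking_start":
      state.inThinkingBlock = true;
      break;
    case "thinking_end":
      state.inThinkingBlock = false;
      state.contentBlockIndex += 1; // the next block gets a fresh index
      break;
    case "tool_call": {
      const call = state.toolCalls.get(chunk.index) ?? { name: "", args: "" };
      if (chunk.name) call.name = chunk.name;
      call.args += chunk.argsFragment; // JSON arguments arrive in pieces
      state.toolCalls.set(chunk.index, call);
      break;
    }
  }
}

const state = initState();
applyChunk(state, { kind: "thinking_start" });
applyChunk(state, { kind: "thinking_end" });
applyChunk(state, { kind: "tool_call", index: 0, name: "search", argsFragment: '{"q":' });
applyChunk(state, { kind: "tool_call", index: 0, argsFragment: '"hi"}' });
const toolArgs = state.toolCalls.get(0)?.args;
```

No single chunk contains the full arguments, so a stateless per-chunk translator could not emit a valid tool call.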
### 5.7 Usage Safety Buffer
A 2000-token buffer is added to reported usage to prevent clients from hitting context window limits due to overhead from system prompts and format translation.
| Settings → Advanced | Debug mode | Enable request log files for debugging (UI) |
| Settings → General | Sidebar visibility | Show/hide sidebar sections |
These settings are stored in the database and persist across restarts, overriding env var defaults when set.
### Running Locally
```bash
# Development mode (hot reload)
npm run dev
# Production build
npm run build
npm run start
# Common port configuration
PORT=20128 NEXT_PUBLIC_BASE_URL=http://localhost:20128 npm run dev
```
Default URLs:
- **Dashboard**: `http://localhost:20128/dashboard`
```bash
# Coverage (60% min statements/lines/functions/branches)
npm run test:coverage
npm run coverage:report
# Lint + format check
npm run lint
npm run check
```
Coverage notes:
- `npm run test:coverage` measures source coverage for the main unit test suite, excluding `tests/**` and including `open-sse/**`
- Pull requests must keep the overall coverage gate at **60% or higher** for statements, lines, functions, and branches
- If a PR changes production code in `src/`, `open-sse/`, `electron/`, or `bin/`, it must add or update automated tests in the same PR
- `npm run coverage:report` prints the detailed file-by-file report from the latest coverage run
- `npm run test:coverage:legacy` keeps the older metric for historical comparison
- See `docs/COVERAGE_PLAN.md` for the phased coverage improvement roadmap
### Pull Request Requirements
Before opening or merging a PR:
- Run `npm run test:unit`
- Run `npm run test:coverage`
- Ensure the coverage gate stays at **60%+** for all metrics
- Include the changed or added test files in the PR description when changing production code
- Check the SonarQube results on the PR when the project secrets are configured in CI
Current test status: **122 unit test files** covering:
- Provider translators and format conversion
- Rate limiting, circuit breaker, and resilience
- Semantic cache, idempotency, progress tracking
- Database operations and schema (21 DB modules)
- OAuth flows and authentication
- API endpoint validation (Zod v4)
- MCP server tools and scope enforcement
- Memory and skills systems
---
## Code Style
- **ESLint**: Run `npm run lint` before committing
- **Prettier**: Auto-formatted via `lint-staged` on commit (2 spaces, semicolons, double quotes, 100-character width, es5 trailing commas)
- **TypeScript**: All `src/` code uses `.ts`/`.tsx`; `open-sse/` uses `.ts`/`.js`; document with TSDoc (`@param`, `@returns`, `@throws`)
- **No `eval()`**: ESLint enforces `no-eval`, `no-implied-eval`, `no-new-func`
- **Zod validation**: Use Zod v4 schemas for all API input validation
- **Naming**: Files = camelCase/kebab-case, components = PascalCase, constants = UPPER_SNAKE
---
## Project Structure
```
├── COVERAGE_PLAN.md # Test coverage improvement plan
├── openapi.yaml # OpenAPI specification
└── adr/ # Architecture Decision Records
```
---
## Adding a New Provider
### Step 1: Register Provider Constants
Add the provider to `src/shared/constants/providers.ts`; it is Zod-validated at module load.
### Step 2: Add an Executor (if custom logic is needed)
Create an executor in `open-sse/executors/your-provider.ts`, extending the base executor.
### Step 3: Add a Translator (if the format is not OpenAI)
Create request/response translators in `open-sse/translator/`.
### Step 4: Add OAuth Config (if OAuth-based)
Add OAuth credentials in `src/lib/oauth/constants/oauth.ts` and a service in `src/lib/oauth/services/`.
### Step 5: Register Models
Add model definitions in `open-sse/config/providerRegistry.ts`.
### Step 6: Add Tests
Write unit tests in `tests/unit/` covering at minimum:
- Provider registration
- Request/response translation
- Error handling
---
## Pull Request Checklist
- [ ] Tests pass (`npm test`)
- [ ] Linting passes (`npm run lint`)
- [ ] Build succeeds (`npm run build`)
- [ ] TypeScript types added for new public functions and interfaces
- [ ] No hardcoded secrets or fallback values
- [ ] All inputs validated with Zod schemas
- [ ] CHANGELOG updated (if the change is user-facing)
- [ ] Documentation updated (if applicable)
---
## Releasing
Releases are managed via the `/generate-release` workflow. When a new GitHub release is created, the package is **automatically published to npm** via GitHub Actions.
---
## Getting Help
- **Architecture**: See [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
- **API Reference**: See [`docs/API_REFERENCE.md`](docs/API_REFERENCE.md)
- **Issues**: [github.com/diegosouzapw/OmniRoute/issues](https://github.com/diegosouzapw/OmniRoute/issues)
- **ADRs**: See `docs/adr/` for architecture decision records