Compare commits

...

65 Commits

Author SHA1 Message Date
diegosouzapw 659e2b414d feat(release): v2.8.2 — model alias routing fix, log export, 2 merged PRs
Build Electron Desktop App / Validate version (push) Failing after 25s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-19 11:13:49 -03:00
diegosouzapw 7bcb58e3db feat(logs): add export button with time range dropdown (1h, 6h, 12h, 24h)
- New API: /api/logs/export?hours=24&type=call-logs
- UI: Export button with dropdown on /dashboard/logs page
- Supports export of request-logs, proxy-logs, and call-logs
- Downloads as JSON file with Content-Disposition header
2026-03-19 11:11:07 -03:00
diegosouzapw 2d7d7776a6 fix(routing): model aliases now affect routing, not just format detection (#472)
Previously resolveModelAlias() output was used only for getModelTargetFormat()
but the original model was sent in translatedBody.model and to the executor.
Now effectiveModel is propagated to all downstream operations.
2026-03-19 11:07:29 -03:00
Prakersh Maheshwari c5f429521c fix(pricing): add missing Codex 5.3/5.4 and Anthropic model ID entries (#479)
* fix(pricing): add missing Codex 5.3/5.4 and Anthropic model ID entries

Missing pricing entries cause $0.00 cost for:
- GPT 5.3 Codex family (gpt-5.3-codex, -high, -xhigh, -low, -none)
- GPT 5.4 (with hyphen: gpt-5.4)
- GPT 5.1 Codex Mini High
- Common Anthropic model IDs without dates (claude-opus-4-6,
  claude-sonnet-4-6, claude-opus-4, claude-sonnet-4)
- Dated variants used by Claude Code (claude-opus-4-5-20251101,
  claude-sonnet-4-5-20250929)

* refactor: extract shared pricing constants to reduce duplication

Address review feedback: extract duplicated pricing objects into
named constants (GPT_5_3_CODEX_PRICING, CLAUDE_OPUS_4_PRICING, etc.)
and add clarifying comment about intentional hyphen/dot variant entries.
2026-03-19 11:04:30 -03:00
diegosouzapw 426d8636bc fix(stream): extract usage from remaining buffer in flush handler (#480) 2026-03-19 11:02:13 -03:00
diegosouzapw a265c7096e feat(release): v2.8.1 — streaming log fix, Kiro compat, cache tokens, Chinese i18n, configurable tool call ID
Build Electron Desktop App / Validate version (push) Failing after 31s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-19 08:45:54 -03:00
diegosouzapw 1c9953b1ba chore: remove ZWS_README_V1.md (internal contributor doc) 2026-03-19 08:43:17 -03:00
diegosouzapw 601cc21a44 feat: call log response content, per-model tool call ID, key PATCH & validation (#470) 2026-03-19 08:41:01 -03:00
Ethan Hunt 102c42dfe4 feat: Improve the Chinese translation (#475)
Co-authored-by: gmw <rorschach1167@qq.com>
2026-03-19 08:37:51 -03:00
Prakersh Maheshwari 4953727aa7 fix(callLogs): support Claude format usage and include cache tokens (#476)
saveCallLog only read prompt_tokens/completion_tokens (OpenAI format).
When sourceFormat=claude, the openai-to-claude translator writes
input_tokens/output_tokens instead, causing all cross-format requests
(Codex-via-Claude, Kiro-via-Claude, etc.) to show 0|0 tokens in
call_logs.

Also includes cache_read and cache_creation tokens in tokens_in total
so heavily-cached requests don't show misleadingly low input counts.

Changes:
- Read prompt_tokens || input_tokens (supports both formats)
- Read completion_tokens || output_tokens (supports both formats)
- Sum cache_read_input_tokens + cache_creation_input_tokens into total
2026-03-19 08:37:49 -03:00
Prakersh Maheshwari e6af874b47 fix(usage): include cache tokens in usage history input total (#477)
logUsage stored only non-cached input tokens in usage_history.tokens_input.
For heavily-cached Claude requests (common with Claude Code), this shows
near-zero input when the real total is 150K+, causing the analytics
dashboard to severely underreport input token usage.

Now sums: input = prompt_tokens + cache_read + cache_creation
2026-03-19 08:37:46 -03:00
Prakersh Maheshwari 801b4eef4c fix(kiro): strip injected model field from request body (#478)
chatCore.ts injects translatedBody.model for all providers after
translation. Kiro API (AWS CodeWhisperer) has strict schema validation
and rejects unknown top-level fields — only conversationState, profileArn,
and inferenceConfig are valid. This causes 100% of Kiro requests to fail
with "Improperly formed request".

Strip the injected model field in KiroExecutor.transformRequest().
2026-03-19 08:37:44 -03:00
diegosouzapw fe5c20a04e feat(release): v2.8.0 — Bailian Coding Plan, editable provider URLs, 812 tests
Build Electron Desktop App / Validate version (push) Failing after 34s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-19 02:28:45 -03:00
diegosouzapw 246fd05fae feat(providers): add Bailian Coding Plan provider with editable base URL (#467) 2026-03-19 02:25:29 -03:00
diegosouzapw a09b298127 feat(release): v2.7.10 — Alibaba Cloud Coding, Kimi Coding API-key, Docker pino fix
Build Electron Desktop App / Validate version (push) Failing after 34s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-19 01:50:00 -03:00
Jefferson Nunn f89f40778f feat: add API-key Kimi Coding provider path (#463)
* feat: add api-key Kimi Coding provider support

* fix(kimi-coding): honor apikey auth header in executor

Ensure DefaultExecutor sends x-api-key for kimi-coding-apikey at runtime
and deduplicate shared kimi coding config blocks in registry and models
config to reduce drift between oauth and apikey variants.

---------

Co-authored-by: OmniRoute Agent <agent@omniroute.local>
2026-03-19 01:48:26 -03:00
dtk 3d0c8d8d45 feat: add alibaba cloud coding plan provider support (#465)
Co-authored-by: dtk <git@derzsi.cloud>
2026-03-19 01:48:23 -03:00
diegosouzapw 0e5e8bf14e fix(docker): add missing split2 dependency to container image (#459) 2026-03-19 01:46:26 -03:00
diegosouzapw ce34d329d3 chore(release): v2.7.9
Build Electron Desktop App / Validate version (push) Failing after 28s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-18 17:19:42 -03:00
diegosouzapw eaf4a5805c "fix: resolved UI combo setting schema strip (#458)"
"fix: safe crypto fallback for MITM on windows (#456)"
2026-03-18 17:18:31 -03:00
Sergey Morozov 8420e565d4 feat: add responses subpath passthrough for codex (#457) 2026-03-18 17:18:29 -03:00
diegosouzapw 1b68deb0f6 feat(release): v2.7.8 — budget save fix + combo agent UI + omniModel tag strip
Build Electron Desktop App / Validate version (push) Failing after 32s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- fix(budget): warningThreshold sent as fraction 0-1 not percentage 0-100 (#451)
- feat(combos): Agent Features UI in combo modal (system_message, tool_filter_regex,
  context_cache_protection) — previously server-only (#454)
- fix(combos): strip <omniModel> tags before forwarding to provider (#454)
2026-03-18 15:38:04 -03:00
Diego Rodrigues de Sa e Souza d1497c9ac8 Merge pull request #455 from diegosouzapw/fix/issue-451-454-budget-combo-ui
fix: budget warningThreshold + combo agent UI fields + omniModel tag strip
2026-03-18 15:37:17 -03:00
diegosouzapw 03d4cbf6d5 fix: budget warningThreshold fraction mismatch + combo agent UI fields + omniModel tag strip
- fix(budget): BudgetTab sent integer percentage (80) but schema validated
  fraction (0-1). Now divides by 100 on POST and multiplies by 100 on GET (#451)

- fix(combos): expose Agent Features UI in combo create/edit modal — fields for
  system_message override, tool_filter_regex, and context_cache_protection were
  implemented server-side (#399/#401) but missing from the dashboard UI (#454)

- fix(combos): strip <omniModel> tags from messages before forwarding to provider.
  The internal cache-pinning tag was being sent to the provider, causing cache
  misses as providers treated each tagged request as a new session (#454)
2026-03-18 15:32:47 -03:00
diegosouzapw 718be831af feat(release): v2.7.7 — Docker pino crash fix + Codex responses worker fix
Build Electron Desktop App / Validate version (push) Failing after 35s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- fix(docker): copy pino-abstract-transport + pino-pretty in standalone (#449)
- fix(responses): remove initTranslators() from /v1/responses route (#450)
- chore(deps): commit package-lock.json with each version bump
2026-03-18 15:13:26 -03:00
Diego Rodrigues de Sa e Souza 9d5ec523be Merge pull request #453 from diegosouzapw/fix/issue-449-450-pino-docker-responses-worker
fix: pino Docker crash + Codex /v1/responses worker exit + package-lock sync
2026-03-18 15:11:38 -03:00
diegosouzapw 81c43b45fb fix: pino-abstract-transport missing in Docker + responses worker crash + lock sync
- fix(docker): copy pino-abstract-transport and pino-pretty explicitly in
  runner-base stage — Next.js standalone trace omits them, causing
  'Cannot find module pino-abstract-transport' crash on startup (#449)

- fix(responses): remove initTranslators() call from /v1/responses route —
  bootstrapping translator registry from a Next.js Route Handler worker
  caused 'the worker has exited' uncaughtException on Codex CLI requests.
  Translators are already bootstrapped server-side via open-sse (#450)

- chore: include package-lock.json in commit (was being left behind on
  version bumps, causing npm ci to install inconsistent deps in Docker)
2026-03-18 15:08:57 -03:00
diegosouzapw 146a491769 feat(release): v2.7.5 — login UX + Windows CLI healthcheck
Build Electron Desktop App / Validate version (push) Failing after 34s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- fix(ux): show default password hint on login page (#437)
- fix(cli): spawn shell:true on Windows for .cmd CLI resolution (#447)
2026-03-18 14:52:05 -03:00
Diego Rodrigues de Sa e Souza 4c53388579 Merge pull request #448 from diegosouzapw/fix/issue-437-447-435-login-healthcheck-gemini
fix: login default password hint + Windows CLI healthcheck shell resolution
2026-03-18 14:51:19 -03:00
diegosouzapw 3403ddcc6e fix: login password hint + Windows CLI healthcheck + i18n key
- fix(ux): add default password hint on login page for first-time users (#437)
  The fallback password (123456) is now shown as a hint below the
  password input so users don't get locked out during initial setup.

- fix(cli): add shell:true to spawn on Windows so .cmd wrappers are
  resolved correctly via PATHEXT (#447). Claude, opencode, and other
  npm-installed CLIs show as 'not runnable' on Windows even when
  installed because spawn() cannot find .cmd files without shell:true.

- i18n: add defaultPasswordHint key to en.json auth namespace
2026-03-18 14:44:49 -03:00
diegosouzapw 684b81d835 feat(release): v2.7.4 — search playground, i18n fixes, Copilot limits, Serper validation
Build Electron Desktop App / Validate version (push) Failing after 34s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- feat(search): search playground + search tools page + local rerank (#443 @Regis-RCR)
- fix(analytics): localize day/date labels with Intl.DateTimeFormat (#444 @hijak)
- fix(copilot): correct account type display, filter unlimited quotas (#445 @hijak)
- fix(providers): stop rejecting valid Serper API keys on non-4xx (#446 @hijak)
2026-03-18 12:11:00 -03:00
Diego Rodrigues de Sa e Souza 4f32da57fd Merge pull request #443 from Regis-RCR/feat/search-playground
feat(search): add search playground, search tools, and local rerank routing
2026-03-18 12:09:51 -03:00
Diego Rodrigues de Sa e Souza 97265e48b3 Merge pull request #444 from hijak/fix/analytics-day-date-translations
fix: localize analytics day and date labels
2026-03-18 12:07:03 -03:00
Diego Rodrigues de Sa e Souza 64797158e2 Merge pull request #445 from hijak/fix/copilot-account-type-limits
fix: correct GitHub Copilot account type and limits
2026-03-18 12:06:59 -03:00
Diego Rodrigues de Sa e Souza 8359293dcd Merge pull request #446 from hijak/fix/serper-api-key-validation
fix: stop rejecting valid Serper API keys
2026-03-18 12:06:36 -03:00
Jack Cowey b2dc53d18b fix(search): return consistent validation result shape
Keep search provider validation responses consistent with other validators so Serper regression tests and CI assertions can rely on unsupported=false.

Made-with: Cursor
2026-03-18 12:55:25 +00:00
Jack Cowey edf8dd2a12 fix(search): accept authenticated serper validation responses
Treat non-auth Serper validation errors as successful authentication so valid API keys are not rejected during provider setup.

Made-with: Cursor
2026-03-18 12:29:14 +00:00
Jack Cowey 5a777bd598 fix(github): correct copilot plan and quota mapping
Normalize GitHub Copilot account tiers from the usage payload and hide misleading unlimited buckets so account type and limits render correctly in the dashboard.

Made-with: Cursor
2026-03-18 12:25:17 +00:00
Jack Cowey bd39e01ee1 fix(analytics): localize most active day and weekly labels
Use the active app locale for analytics weekday and date formatting so the dashboard no longer shows hardcoded Portuguese labels.

Made-with: Cursor
2026-03-18 12:17:56 +00:00
Regis e3ed29aab6 feat(search): add search playground, search tools, and local rerank routing
Search Playground (Phase 1):
- Web Search as 10th endpoint in Playground with isolated SearchPlayground component
- Endpoint selector moved first; Provider/Model/Send hidden when search selected
- Provider dropdown via GET /api/search/providers, formatted results with cache indicator

Search Tools page (Phase 2) at /dashboard/search-tools:
- Split panel: SearchForm (left) with query, provider, filters + ResultsPanel (right)
- Compare Providers: parallel queries with latency, cost, response size, URL overlap
- Rerank Pipeline: model selector from /v1/models, results with position delta
- Search History: last 10 searches from call_logs with replay
- Sidebar entry under Debug section

Backend:
- GET /api/search/providers — list providers with auth guard + SEARCH_CREDENTIAL_FALLBACKS
- GET /api/search/stats — cache stats, provider aggregates, recent searches (auth guard)
- Add local provider_nodes routing for /v1/rerank (oMLX, vLLM support)

Bug fixes (from F-27 PR #432):
- Fix Brave news normalizer: data.results directly, not data.news.results
- Enforce max_results truncation after normalization for all providers
- Fix EndpointPageClient: use /api/search/providers instead of /api/v1/search
- Add isAuthenticated() guards on /api/search/providers and /api/search/stats

Response size metric in results meta bar and compare table.
i18n: 30+ keys in search namespace (en.json)
2026-03-18 12:43:24 +01:00
diegosouzapw 896ce9c0e2 feat(release): v2.7.3 — fix Codex direct API weekly quota fallback
Build Electron Desktop App / Validate version (push) Failing after 36s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- fix(codex): resolveQuotaWindow() prefix-matches 'weekly' → 'weekly (7d)' cache keys
- fix(codex): applyCodexWindowPolicy() enforces useWeekly/use5h toggles in direct API
- 4 new regression tests, 766 total passing
- Closes #440
2026-03-18 08:41:13 -03:00
Diego Rodrigues de Sa e Souza 82934132e9 Merge pull request #441 from rexname/fix/issue-440-direct-api-fallback
fix(codex): block weekly-exhausted accounts in direct API fallback
2026-03-18 08:40:19 -03:00
rexname a2012b70de chore(review): harden window normalization and deterministic quota matching 2026-03-18 14:17:37 +07:00
rexname bcfeba8a57 fix(codex): enforce weekly quota blocking for direct API fallback 2026-03-18 13:57:25 +07:00
diegosouzapw d3dfd9ce57 feat(release): v2.7.2 — fix light mode contrast in logs UI
Build Electron Desktop App / Validate version (push) Failing after 38s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- fix(logs): text colors in filter buttons + combo badge now have dark: variants
- Bumped version to 2.7.2
- Updated CHANGELOG and openapi.yaml
2026-03-18 00:42:22 -03:00
Diego Rodrigues de Sa e Souza aa06d5d356 Merge pull request #433 from diegosouzapw/fix/issue-378-logs-light-mode-contrast
Merged fix for light mode contrast in filter buttons and combo badge. Thanks @rdself for the great bug report!
2026-03-18 00:41:28 -03:00
diegosouzapw 448c8a29e1 fix(logs): fix light mode contrast in filter buttons and combo badge (#378)
- text-red-400 → text-red-700 dark:text-red-400 (error filter, recording button)
- text-emerald-400 → text-emerald-700 dark:text-emerald-400 (ok filter)
- text-violet-300 → text-violet-700 dark:text-violet-300 (combo filter)
- combo row badge: violet-700 → violet-800 dark:violet-300, stronger border

Fixes #378
2026-03-17 16:46:27 -03:00
diegosouzapw 928b7120f4 feat(release): v2.7.1 — unified web search routing + Next.js 16.1.7 security
Build Electron Desktop App / Validate version (push) Failing after 35s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- POST /v1/search: 5 providers (Serper, Brave, Perplexity, Exa, Tavily), 6,500+ free/mo
- Search analytics dashboard tab + GET /api/v1/search/analytics
- db: request_type column on call_logs (migration 007)
- Next.js 16.1.7: 6 CVEs fixed (critical: CVE-2026-29057 HTTP request smuggling)
- docs/openapi.yaml: bumped to 2.7.1
2026-03-17 16:27:31 -03:00
diegosouzapw a3deacd718 feat: Implement historical model latency and success rate tracking for auto-combo routing and update Claude and Deepseek pricing and model registrations. 2026-03-17 16:18:36 -03:00
diegosouzapw 78959fffbd Merge branch 'main' of https://github.com/diegosouzapw/OmniRoute 2026-03-17 16:18:12 -03:00
Diego Rodrigues de Sa e Souza 1788616e52 Merge pull request #431 from diegosouzapw/dependabot/npm_and_yarn/next-16.1.7
Security update merged: Next.js 16.1.7 fixes 6 CVEs including critical CVE-2026-29057 (HTTP request smuggling). No breaking changes.
2026-03-17 16:18:01 -03:00
Diego Rodrigues de Sa e Souza c61e6d0777 Merge pull request #432 from Regis-RCR/feat/search-provider-routing
Merged with dashboard improvements: SearchAnalyticsTab + /api/v1/search/analytics endpoint — PR review complete by Antigravity.
2026-03-17 16:17:39 -03:00
diegosouzapw a3bc7620b1 feat(integration): integrate ClawRouter services into active pipeline
- intentClassifier → engine.ts selectProvider()
  When taskType is 'default', classifies prompt via multilingual keyword
  detection (9 langs) and uses detected intent (code/reasoning/simple/medium)
  for 6-factor task fitness scoring.

- emergencyFallback → chatCore.ts error path (after T5 intra-family fallback)
  On HTTP 402 or budget-exhaustion keywords, attempts one redirect to
  nvidia/gpt-oss-120b ($0.00/M) before returning error to combo router.
  Skipped for streaming requests and tool-calling requests.

- AutoComboConfig.routerStrategy field added
  Allows per-combo strategy override ('rules' | 'cost' | 'latency')

Note: requestDedup was already integrated in chatCore.ts (line 387-430)
Branch: feat/clawrouter-improvements
2026-03-17 15:22:12 -03:00
diegosouzapw 8064c588dc docs(i18n): sync v2.7.0 release notes to 29 language READMEs
New in v2.7.0: pluggable RouterStrategy, multilingual intent detection,
request deduplication, new providers (Grok-4 Fast, GLM-5/Z.AI,
MiniMax M2.5, Kimi K2.5). Native translations for de/es/fr/it/ru/zh-CN/ja/ko/ar/pt-BR/pt.
2026-03-17 15:11:09 -03:00
Regis 564e983c68 feat(search): add unified web search routing with 5 providers
Add POST /v1/search — a unified search endpoint routing queries across
5 providers (Serper, Brave, Perplexity Search, Exa, Tavily) with
automatic failover, in-memory caching, and request coalescing.

No open-source AI gateway offers unified search routing. This chains
free tiers for 5,500+ searches/month with zero downtime.

Providers: Serper ($0.001/q, 2500/mo free), Brave ($0.005/q, 1000/mo),
Perplexity Search ($0.005/q), Exa ($0.007/q, 1000/mo), Tavily
($0.008/q, 1000/mo). Auto-select picks cheapest with credentials.

Architecture follows existing patterns:
- searchRegistry.ts (same as embeddingRegistry.ts)
- search.ts handler (same as embeddings.ts)
- route.ts (same as /v1/embeddings/route.ts)
- searchCache.ts (bounded TTL cache + request coalescing)

Schema finalized — all future fields defined as optional with safe
defaults. No breaking changes when implementing content extraction,
answer synthesis, or ranking.

Key features:
- Per-provider request builders and response normalizers
- Enriched response: display_url, score, favicon_url, content block,
  metadata, answer block, errors array, upstream_latency_ms metrics
- Cost-sorted auto-select with failover on 429/5xx/timeout
- Credential fallback (perplexity-search reuses perplexity chat key)
- Cache key includes all result-affecting parameters
- max_results clamped to provider limits, sanitized error responses
- Factored validators (validateSearchProvider factory)
- CORS headers on all responses
- Dashboard: Search & Discovery section, search provider template
- DB migration 007: request_type column in call_logs
- 28 unit tests (registry, cache, coalescing, validation)
2026-03-17 18:28:35 +01:00
diegosouzapw e1da181740 fix(publish): also remove app/electron/ (contains app.asar binary) to prevent Z_DATA_ERROR 2026-03-17 14:25:48 -03:00
diegosouzapw c63209200e fix(publish): remove app/vscode-extension/ after build to prevent Z_DATA_ERROR in npm pack 2026-03-17 14:13:15 -03:00
diegosouzapw 737808cf53 fix(npm): exclude app/vscode-extension/ from package to prevent Z_DATA_ERROR during publish 2026-03-17 13:50:06 -03:00
diegosouzapw a197bb7736 fix(routerStrategy): use .ts extension in imports for Next.js App Router bundle compatibility 2026-03-17 13:15:47 -03:00
dependabot[bot] f9dd967bc5 deps: bump next from 16.1.6 to 16.1.7
Bumps [next](https://github.com/vercel/next.js) from 16.1.6 to 16.1.7.
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v16.1.6...v16.1.7)

---
updated-dependencies:
- dependency-name: next
  dependency-version: 16.1.7
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-17 16:14:44 +00:00
diegosouzapw 44e4d55a66 feat(release): merge feat/clawrouter-improvements — v2.7.0
Build Electron Desktop App / Validate version (push) Failing after 40s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-17 13:12:41 -03:00
diegosouzapw 095c84ac16 fix(providerRegistry): remove duplicate claude-haiku-4-5-20251001 from anthropic provider to prevent ambiguous model resolution 2026-03-17 13:10:23 -03:00
diegosouzapw e063eae727 feat(clawrouter): implement 14 ClawRouter-inspired features
PRICING UPDATES (01-09):
- xAI Grok-4 family: grok-4-fast-non-reasoning (/usr/bin/bash.20/$0.50/M, 1143ms),
  grok-4-fast-reasoning, grok-4-1-fast-*, grok-4-0709, grok-3, grok-3-mini
- Z.AI GLM-5 family: glm-5 + glm-5-turbo (128k maxOutput, $1.00/$3.20/M)
- Gemini Flash Lite: price corrected $0.15→$0.10 / $1.25→$0.40 (per ClawRouter)
- Gemini 3.1 Pro: new flagship (1.05M context, aliased as gemini-3.1-pro)
- Anthropic Claude 4.5/4.6: haiku-4.5 ($1/$5), sonnet-4.6 ($3/$15), opus-4.6 ($5/$25)
- DeepSeek native section: deepseek-chat/v3/v3.2 ($0.28/$0.42), deepseek-reasoner ($0.55/$2.19)
- Kimi K2.5 Moonshot: kimi-k2.5 ($0.60/$3.00, 262k ctx), moonshot-kimi-k2.5 alias
- MiniMax M2.5: minimax-m2.5 ($0.30/$1.20, 204k ctx, reasoning+tools)
- NVIDIA free tier: gpt-oss-120b at $0.00/M via emergencyFallback.ts

INFRASTRUCTURE FEATURES (10-14):
- feat(router): add intentClassifier.ts for multilingual intent detection (9 langs)
  Detects code/reasoning/simple in EN, PT-BR, ES, ZH, JA, RU, DE, KO, AR
- feat(dedup): add requestDedup.ts for concurrent request deduplication
  SHA-256 hash, skip streaming, skip high-temperature, 60s failsafe TTL
- feat(autoCombo): add routerStrategy.ts pluggable strategy system
  RouterStrategy interface, RulesStrategy (6-factor) + CostStrategy, registry
- feat(fallback): add emergencyFallback.ts budget-exhaustion detector
  Triggers on HTTP 402 or budget keywords, redirects to nvidia/gpt-oss-120b
- feat(taskFitness): add fitness scores for Grok-4, Kimi K2.5, GLM-5,
  MiniMax M2.5, DeepSeek V3.2, Gemini 3.1 Pro across all task categories

PROVIDERS:
- providers.ts: add Z.AI (zai) provider entry for GLM-5 API key connections

All features on branch: feat/clawrouter-improvements
Source: github.com/BlockRunAI/ClawRouter analysis (2026-03-17)
2026-03-17 10:43:12 -03:00
diegosouzapw f02c5b5c69 fix(install/v2.6.10): Windows better-sqlite3 prebuilt download (#426)
Build Electron Desktop App / Validate version (push) Failing after 35s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
npm version patch run BEFORE staging files — this is an ATOMIC commit.

Adds Strategy 1.5 to scripts/postinstall.mjs:
- Uses @mapbox/node-pre-gyp install --fallback-to-build=false
  (bundled within better-sqlite3) to download the correct prebuilt
  binary for the current OS/arch (win32-x64/arm64, darwin-x64/arm64)
  WITHOUT requiring node-gyp, Python, or MSVC build tools.
- Tries node-pre-gyp.cmd (Windows) or node-pre-gyp (Unix) from .bin/
  with fallback to direct path in @mapbox/node-pre-gyp/bin/
- Falls back to npm rebuild only if prebuilt download fails.
- Windows-specific error: shows Option A (npx node-pre-gyp) and
  Option B (rebuild) with Visual Studio Build Tools links.

Fixes: #426 (better_sqlite3.node is not a valid Win32 application)
2026-03-17 10:09:45 -03:00
diegosouzapw 838f1d645c fix(v2.6.9): CI budget checks, #409 file attachments, atomic release workflow
Build Electron Desktop App / Validate version (push) Failing after 38s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
Includes version bump — v2.6.9 — committed ATOMICALLY with all changes:

fixes:
- fix(ci/t11): Remove 'any' from comments in openai-responses.ts + chatCore.ts
  (\bany\b regex counted comment text as explicit any violations)
- fix(chatCore/#409): Normalize unsupported content part types before forwarding
  Cursor sends {type:'file'} for .md attachments; Copilot/OpenAI providers reject
  with 'type has to be either image_url or text'. Now: file/document→text block,
  unknown types dropped with debug log. Fixes claude-* models via github-copilot.

workflow:
- chore(generate-release): ATOMIC COMMIT RULE — npm version patch MUST run before
  feature commits so the release tag always points to a commit with full changes
2026-03-17 09:09:01 -03:00
146 changed files with 10802 additions and 1505 deletions
+21
View File
@@ -32,6 +32,27 @@ Version format: `2.x.y` — examples:
npm version patch --no-git-tag-version
```
> **⚠️ ATOMIC COMMIT RULE — Version bump MUST happen before committing feature files.**
>
> **CORRECT order:**
>
> 1. `npm version patch --no-git-tag-version` ← bump first
> 2. implement features / fix bugs
> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
>
> **OR if features are already staged:**
>
> 1. implement features (do NOT commit yet)
> 2. `npm version patch --no-git-tag-version` ← bump before committing
> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
>
> **NEVER do this (creates version mismatch in git history):**
>
> - ~~commit features → then bump version → commit package.json separately~~
>
> This ensures that `git show v2.x.y` always contains both code changes and the version bump together.
> The GitHub release tag will point to a commit that includes ALL changes for that version.
### 2. Regenerate lock file (REQUIRED after version bump)
**Mandatory** — skipping causes `@swc/helpers` lock mismatch and CI failures:
+2
View File
@@ -55,6 +55,8 @@ logs/*
# analysis directories (generated, not tracked)
.analysis/
antigravity-manager-analysis/
.sisyphus/
.plans/
# docs (allow specific tracked files)
docs/*
+5
View File
@@ -3,6 +3,11 @@ data/
**/data/
**/db.json
# VS Code extension test runtime (large binary, not needed in npm package)
app/vscode-extension/
**/data/
**/db.json
# Source code (pre-built app/ is published instead)
src/
open-sse/
+278
View File
@@ -4,6 +4,284 @@
---
## [2.8.2] — 2026-03-19
> Sprint: 2 merged PRs, model aliases routing fix, log export, and issue triage.
### Features
- **Log Export**: New Export button on `/dashboard/logs` with time range dropdown (1h, 6h, 12h, 24h). Downloads JSON of request/proxy/call logs via `/api/logs/export` API (#user-request)
### Bug Fixes
- **Model Aliases Routing** (#472): Settings → Model Aliases now correctly affect provider routing, not just format detection. Previously `resolveModelAlias()` output was only used for `getModelTargetFormat()` but the original model ID was sent to the provider
- **Stream Flush Usage** (#480): Usage data from the last SSE event in the buffer is now correctly extracted during stream flush (merged from @prakersh)
### Merged PRs
- #480 — Extract usage from remaining buffer in flush handler (@prakersh)
- #479 — Add missing Codex 5.3/5.4 and Anthropic model ID pricing entries (@prakersh)
---
## [2.8.1] — 2026-03-19
> Sprint: Five community PRs — streaming call log fixes, Kiro compatibility, cache token analytics, Chinese translation, and configurable tool call IDs.
### ✨ Features
- **feat(logs)**: Call log response content now correctly accumulated from raw provider chunks (OpenAI/Claude/Gemini) before translation, fixing empty response payloads in streaming mode (#470, @zhangqiang8vip)
- **feat(providers)**: Per-model configurable 9-char tool call ID normalization (Mistral-style) — only models with the option enabled get truncated IDs (#470)
- **feat(api)**: Key PATCH API expanded to support `allowedConnections`, `name`, `autoResolve`, `isActive`, and `accessSchedule` fields (#470)
- **feat(dashboard)**: Response-first layout in request log detail UI (#470)
- **feat(i18n)**: Improved Chinese (zh-CN) translation — complete retranslation (#475, @only4copilot)
### 🐛 Bug Fixes
- **fix(kiro)**: Strip injected `model` field from request body — Kiro API rejects unknown top-level fields (#478, @prakersh)
- **fix(usage)**: Include cache read + cache creation tokens in usage history input totals for accurate analytics (#477, @prakersh)
- **fix(callLogs)**: Support Claude format usage fields (`input_tokens`/`output_tokens`) alongside OpenAI format, include all cache token variants (#476, @prakersh)
---
## [2.8.0] — 2026-03-19
> Sprint: Bailian Coding Plan provider with editable base URLs, plus community contributions for Alibaba Cloud and Kimi Coding.
### ✨ Features
- **feat(providers)**: Added Bailian Coding Plan (`bailian-coding-plan`) — Alibaba Model Studio with Anthropic-compatible API. Static catalog of 8 models including Qwen3.5 Plus, Qwen3 Coder, MiniMax M2.5, GLM 5, and Kimi K2.5. Includes custom auth validation (400=valid, 401/403=invalid) (#467, @Mind-Dragon)
- **feat(admin)**: Editable default URL in Provider Admin create/edit flows — users can configure custom base URLs per connection. Persisted in `providerSpecificData.baseUrl` with Zod schema validation rejecting non-http(s) schemes (#467)
### 🧪 Tests
- Added 30+ unit tests and 2 e2e scenarios for Bailian Coding Plan provider covering auth validation, schema hardening, route-level behavior, and cross-layer integration
---
## [2.7.10] — 2026-03-19
> Sprint: Two new community-contributed providers (Alibaba Cloud Coding, Kimi Coding API-key) and Docker pino fix.
### ✨ Features
- **feat(providers)**: Added Alibaba Cloud Coding Plan support with two OpenAI-compatible endpoints — `alicode` (China) and `alicode-intl` (International), each with 8 models (#465, @dtk1985)
- **feat(providers)**: Added dedicated `kimi-coding-apikey` provider path — API-key-based Kimi Coding access is no longer forced through OAuth-only `kimi-coding` route. Includes registry, constants, models API, config, and validation test (#463, @Mind-Dragon)
### 🐛 Bug Fixes
- **fix(docker)**: Added missing `split2` dependency to Docker image — `pino-abstract-transport` requires it at runtime but it was not being copied into the standalone container, causing `Cannot find module 'split2'` crashes (#459)
---
## [2.7.9] — 2026-03-18
> Sprint: Codex responses subpath passthrough natively supported, Windows MITM crash fixed, and Combos agent schemas adjusted.
### ✨ Features
- **feat(codex)**: Native responses subpath passthrough for Codex — natively routes `POST /v1/responses/compact` to Codex upstream, maintaining Claude Code compatibility without stripping the `/compact` suffix (#457)
### 🐛 Bug Fixes
- **fix(combos)**: Zod schemas (`updateComboSchema` and `createComboSchema`) now include `system_message`, `tool_filter_regex`, and `context_cache_protection`. Fixes bug where agent-specific settings created via the dashboard were silently discarded by the backend validation layer (#458)
- **fix(mitm)**: Kiro MITM profile crash on Windows fixed — `node-machine-id` failed due to missing `REG.exe` env, and the fallback threw a fatal `crypto is not defined` error. Fallback now safely and correctly imports crypto (#456)
---
## [2.7.8] — 2026-03-18
> Sprint: Budget save bug + combo agent features UI + omniModel tag security fix.
### 🐛 Bug Fixes
- **fix(budget)**: "Save Limits" no longer returns 422 — `warningThreshold` is now correctly sent as fraction (01) instead of percentage (0100) (#451)
- **fix(combos)**: `<omniModel>` internal cache tag is now stripped before forwarding requests to providers, preventing cache session breaks (#454)
### ✨ Features
- **feat(combos)**: Agent Features section added to combo create/edit modal — expose `system_message` override, `tool_filter_regex`, and `context_cache_protection` directly from the dashboard (#454)
---
## [2.7.7] — 2026-03-18
> Sprint: Docker pino crash, Codex CLI responses worker fix, package-lock sync.
### 🐛 Bug Fixes
- **fix(docker)**: `pino-abstract-transport` and `pino-pretty` now explicitly copied in Docker runner stage — Next.js standalone trace misses these peer deps, causing `Cannot find module pino-abstract-transport` crash on startup (#449)
- **fix(responses)**: Remove `initTranslators()` from `/v1/responses` route — was crashing Next.js worker with `the worker has exited` uncaughtException on Codex CLI requests (#450)
### 🔧 Maintenance
- **chore(deps)**: `package-lock.json` now committed on every version bump to ensure Docker `npm ci` uses exact dependency versions
---
## [2.7.5] — 2026-03-18
> Sprint: UX improvements and Windows CLI healthcheck fix.
### 🐛 Bug Fixes
- **fix(ux)**: Show default password hint on login page — new users now see `"Default password: 123456"` below the password input (#437)
- **fix(cli)**: Claude CLI and other npm-installed tools now correctly detected as runnable on Windows — spawn uses `shell:true` to resolve `.cmd` wrappers via PATHEXT (#447)
---
## [2.7.4] — 2026-03-18
> Sprint: Search Tools dashboard, i18n fixes, Copilot limits, Serper validation fix.
### 🚀 Features
- **feat(search)**: Add Search Playground (10th endpoint), Search Tools page with Compare Providers/Rerank Pipeline/Search History, local rerank routing, auth guards on search API (#443 by @Regis-RCR)
- New route: `/dashboard/search-tools`
- Sidebar entry under Debug section
- `GET /api/search/providers` and `GET /api/search/stats` with auth guards
- Local provider_nodes routing for `/v1/rerank`
- 30+ i18n keys in search namespace
### 🐛 Bug Fixes
- **fix(search)**: Fix Brave news normalizer (was returning 0 results), enforce max_results truncation post-normalization, fix Endpoints page fetch URL (#443 by @Regis-RCR)
- **fix(analytics)**: Localize analytics day/date labels — replace hardcoded Portuguese strings with `Intl.DateTimeFormat(locale)` (#444 by @hijak)
- **fix(copilot)**: Correct GitHub Copilot account type display, filter misleading unlimited quota rows from limits dashboard (#445 by @hijak)
- **fix(providers)**: Stop rejecting valid Serper API keys — treat non-4xx responses as valid authentication (#446 by @hijak)
---
## [2.7.3] — 2026-03-18
> Sprint: Codex direct API quota fallback fix.
### 🐛 Bug Fixes
- **fix(codex)**: Block weekly-exhausted accounts in direct API fallback (#440)
- `resolveQuotaWindow()` prefix matching: `"weekly"` now matches `"weekly (7d)"` cache keys
- `applyCodexWindowPolicy()` enforces `useWeekly`/`use5h` toggles correctly
- 4 new regression tests (766 total)
---
## [2.7.2] — 2026-03-18
> Sprint: Light mode UI contrast fixes.
### 🐛 Bug Fixes
- **fix(logs)**: Fix light mode contrast in request logs filter buttons and combo badge (#378)
- Error/Success/Combo filter buttons now readable in light mode
- Combo row badge uses stronger violet in light mode
---
## [2.7.1] — 2026-03-17
> Sprint: Unified web search routing (POST /v1/search) with 5 providers + Next.js 16.1.7 security fixes (6 CVEs).
### ✨ New Features
- **feat(search)**: Unified web search routing — `POST /v1/search` with 5 providers (Serper, Brave, Perplexity, Exa, Tavily)
- Auto-failover across providers, 6,500+ free searches/month
- In-memory cache with request coalescing (configurable TTL)
- Dashboard: Search Analytics tab in `/dashboard/analytics` with provider breakdown, cache hit rate, cost tracking
- New API: `GET /api/v1/search/analytics` for search request statistics
- DB migration: `request_type` column on `call_logs` for non-chat request tracking
- Zod validation (`v1SearchSchema`), auth-gated, cost recorded via `recordCost()`
### 🔒 Security
- **deps**: Next.js 16.1.6 → 16.1.7 — fixes 6 CVEs:
- **Critical**: CVE-2026-29057 (HTTP request smuggling via http-proxy)
- **High**: CVE-2026-27977, CVE-2026-27978 (WebSocket + Server Actions)
- **Medium**: CVE-2026-27979, CVE-2026-27980, CVE-2026-jcc7
### 📁 New Files
| File | Purpose |
| ---------------------------------------------------------------- | ------------------------------------------ |
| `open-sse/handlers/search.ts` | Search handler with 5-provider routing |
| `open-sse/config/searchRegistry.ts` | Provider registry (auth, cost, quota, TTL) |
| `open-sse/services/searchCache.ts` | In-memory cache with request coalescing |
| `src/app/api/v1/search/route.ts` | Next.js route (POST + GET) |
| `src/app/api/v1/search/analytics/route.ts` | Search stats API |
| `src/app/(dashboard)/dashboard/analytics/SearchAnalyticsTab.tsx` | Analytics dashboard tab |
| `src/lib/db/migrations/007_search_request_type.sql` | DB migration |
| `tests/unit/search-registry.test.mjs` | 277 lines of unit tests |
---
## [2.7.0] — 2026-03-17
> Sprint: ClawRouter-inspired features — toolCalling flag, multilingual intent detection, benchmark-driven fallback, request deduplication, pluggable RouterStrategy, Grok-4 Fast + GLM-5 + MiniMax M2.5 + Kimi K2.5 pricing.
### ✨ New Models & Pricing
- **feat(pricing)**: xAI Grok-4 Fast — `$0.20/$0.50 per 1M tokens`, 1143ms p50 latency, tool calling supported
- **feat(pricing)**: xAI Grok-4 (standard) — `$0.20/$1.50 per 1M tokens`, reasoning flagship
- **feat(pricing)**: GLM-5 via Z.AI — `$0.5/1M`, 128K output context
- **feat(pricing)**: MiniMax M2.5 — `$0.30/1M input`, reasoning + agentic tasks
- **feat(pricing)**: DeepSeek V3.2 — updated pricing `$0.27/$1.10 per 1M`
- **feat(pricing)**: Kimi K2.5 via Moonshot API — direct Moonshot API access
- **feat(providers)**: Z.AI provider added (`zai` alias) — GLM-5 family with 128K output
### 🧠 Routing Intelligence
- **feat(registry)**: `toolCalling` flag per model in provider registry — combos can now prefer/require tool-calling capable models
- **feat(scoring)**: Multilingual intent detection for AutoCombo scoring — PT/ZH/ES/AR script/language patterns influence model selection per request context
- **feat(fallback)**: Benchmark-driven fallback chains — real latency data (p50 from `comboMetrics`) used to re-order fallback priority dynamically
- **feat(dedup)**: Request deduplication via content-hash — 5-second idempotency window prevents duplicate provider calls from retrying clients
- **feat(router)**: Pluggable `RouterStrategy` interface in `autoCombo/routerStrategy.ts` — custom routing logic can be injected without modifying core
### 🔧 MCP Server Improvements
- **feat(mcp)**: 2 new advanced tool schemas: `omniroute_get_provider_metrics` (p50/p95/p99 per provider) and `omniroute_explain_route` (routing decision explanation)
- **feat(mcp)**: MCP tool auth scopes updated — `metrics:read` scope added for provider metrics tools
- **feat(mcp)**: `omniroute_best_combo_for_task` now accepts `languageHint` parameter for multilingual routing
### 📊 Observability
- **feat(metrics)**: `comboMetrics.ts` extended with real-time latency percentile tracking per provider/account
- **feat(health)**: Health API (`/api/monitoring/health`) now returns per-provider `p50Latency` and `errorRate` fields
- **feat(usage)**: Usage history migration for per-model latency tracking
### 🗄️ DB Migrations
- **feat(migrations)**: New column `latency_p50` in `combo_metrics` table — zero-breaking, safe for existing users
### 🐛 Bug Fixes / Closures
- **close(#411)**: better-sqlite3 hashed module resolution on Windows — fixed in v2.6.10 (f02c5b5)
- **close(#409)**: GitHub Copilot chat completions fail with Claude models when files attached — fixed in v2.6.9 (838f1d6)
- **close(#405)**: Duplicate of #411 — resolved
## [2.6.10] — 2026-03-17
> Windows fix: better-sqlite3 prebuilt download without node-gyp/Python/MSVC (#426).
### 🐛 Bug Fixes
- **fix(install/#426)**: On Windows, `npm install -g omniroute` used to fail with `better_sqlite3.node is not a valid Win32 application` because the bundled native binary was compiled for Linux. Adds **Strategy 1.5** to `scripts/postinstall.mjs`: uses `@mapbox/node-pre-gyp install --fallback-to-build=false` (bundled within `better-sqlite3`) to download the correct prebuilt binary for the current OS/arch without requiring any build tools (no node-gyp, no Python, no MSVC). Falls back to `npm rebuild` only if the download fails. Adds platform-specific error messages with clear manual fix instructions.
---
## [2.6.9] — 2026-03-17
> CI fixes (t11 any-budget), bug fix #409 (file attachments via Copilot+Claude), release workflow correction.
### 🐛 Bug Fixes
- **fix(ci)**: Remove word "any" from comments in `openai-responses.ts` and `chatCore.ts` that were failing the t11 `\bany\b` budget check (false positive from regex counting comments)
- **fix(chatCore)**: Normalize unsupported content part types before forwarding to providers (#409 — Cursor sends `{type:"file"}` when `.md` files are attached; Copilot and other OpenAI-compat providers reject with "type has to be either 'image_url' or 'text'"; fix converts `file`/`document` blocks to `text` and drops unknown types)
### 🔧 Workflow
- **chore(generate-release)**: Add ATOMIC COMMIT RULE — version bump (`npm version patch`) MUST happen before committing feature files to ensure tag always points to a commit containing all version changes together
---
## [2.6.8] — 2026-03-17
> Sprint: Combo as Agent (system prompt + tool filter), Context Caching Protection, Auto-Update, Detailed Logs, MITM Kiro IDE.
+5
View File
@@ -32,6 +32,11 @@ COPY --from=builder /app/.next/static ./.next/static
COPY --from=builder /app/.next/standalone ./
# Explicitly copy @swc/helpers — not always traced by standalone output but needed at runtime
COPY --from=builder /app/node_modules/@swc/helpers ./node_modules/@swc/helpers
# Explicitly copy pino transport dependencies — pino spawns a worker that requires
# pino-abstract-transport at runtime; Next.js standalone trace does not capture it (#449)
COPY --from=builder /app/node_modules/pino-abstract-transport ./node_modules/pino-abstract-transport
COPY --from=builder /app/node_modules/pino-pretty ./node_modules/pino-pretty
COPY --from=builder /app/node_modules/split2 ./node_modules/split2
COPY --from=builder /app/scripts/run-standalone.mjs ./run-standalone.mjs
COPY --from=builder /app/scripts/runtime-env.mjs ./runtime-env.mjs
COPY --from=builder /app/scripts/bootstrap-env.mjs ./bootstrap-env.mjs
+63 -32
View File
@@ -4,7 +4,7 @@
_Your universal API proxy — one endpoint, 44+ providers, zero downtime. Now with **MCP & A2A** agent orchestration._
**Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • MCP Server • A2A Protocol • 100% TypeScript**
**Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • **Web Search** MCP Server • A2A Protocol • 100% TypeScript**
---
@@ -898,27 +898,44 @@ When minimized, OmniRoute lives in your system tray with quick actions:
## 💰 Pricing at a Glance
| Tier | Provider | Cost | Quota Reset | Best For |
| ------------------- | ----------------- | ---------------------- | ---------------- | ----------------------- |
| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| | Gemini CLI | **FREE** | 180K/mo + 1K/day | Everyone! |
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| **🔑 API KEY** | NVIDIA NIM | **FREE** (dev forever) | ~40 RPM | 70+ open models |
| | Cerebras | **FREE** (1M tok/day) | 60K TPM / 30 RPM | World's fastest |
| | Groq | **FREE** (30 RPM) | 14.4K RPD | Ultra-fast Llama/Gemma |
| | DeepSeek | Pay-per-use | None | Best price/quality |
| | xAI (Grok) | Pay-per-use | None | Grok models |
| | Mistral | Free trial + paid | Rate limited | European AI |
| | OpenRouter | Pay-per-use | None | 100+ models aggr. |
| **💰 CHEAP** | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| **🆓 FREE** | iFlow | **$0** | Unlimited | 5 models unlimited |
| | Qwen | **$0** | Unlimited | 4 models unlimited |
| | Kiro | **$0** | Unlimited | Claude (AWS Builder ID) |
| Tier | Provider | Cost | Quota Reset | Best For |
| ------------------- | --------------------------- | ------------------------- | ---------------- | --------------------------------- |
| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| | Gemini CLI | **FREE** | 180K/mo + 1K/day | Everyone! |
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| **🔑 API KEY** | NVIDIA NIM | **FREE** (dev forever) | ~40 RPM | 70+ open models |
| | Cerebras | **FREE** (1M tok/day) | 60K TPM / 30 RPM | World's fastest |
| | Groq | **FREE** (30 RPM) | 14.4K RPD | Ultra-fast Llama/Gemma |
| | DeepSeek V3.2 | $0.27/$1.10 per 1M | None | Best price/quality reasoning |
| | xAI Grok-4 Fast | **$0.20/$0.50 per 1M** 🆕 | None | Fastest + tool calling, ultralow |
| | xAI Grok-4 (standard) | $0.20/$1.50 per 1M 🆕 | None | Reasoning flagship from xAI |
| | Mistral | Free trial + paid | Rate limited | European AI |
| | OpenRouter | Pay-per-use | None | 100+ models aggr. |
| **💰 CHEAP** | GLM-5 (via Z.AI) 🆕 | $0.5/1M | Daily 10AM | 128K output, newest flagship |
| | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| | MiniMax M2.5 🆕 | $0.3/1M input | 5-hour rolling | Reasoning + agentic tasks |
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
| | Kimi K2.5 (Moonshot API) 🆕 | Pay-per-use | None | Direct Moonshot API access |
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| **🆓 FREE** | iFlow | **$0** | Unlimited | 5 models unlimited |
| | Qwen | **$0** | Unlimited | 4 models unlimited |
| | Kiro | **$0** | Unlimited | Claude Sonnet/Haiku (AWS Builder) |
**💡 $0 Combo Stack:** Gemini CLI (180K/mo) → iFlow (unlimited: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1) → Kiro (Claude for free) → Qwen (4 models, unlimited) — **Zero cost, never stops coding.** When Gemini quota runs out, OmniRoute auto-falls back to iFlow or Kiro with zero config.
> 🆕 **New models added (Mar 2026):** Grok-4 Fast family at $0.20/$0.50/M (benchmarked at 1143ms — 30% faster than Gemini 2.5 Flash), GLM-5 via Z.AI with 128K output, MiniMax M2.5 reasoning, DeepSeek V3.2 updated pricing, Kimi K2.5 via Moonshot direct API.
**💡 $0 Combo Stack — The Complete Free Setup:**
```
Gemini CLI (180K/mo free)
→ iFlow (unlimited: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1)
→ Kiro (Claude Sonnet 4.5 + Haiku — unlimited, via AWS Builder ID)
→ Qwen (4 models — unlimited)
→ Groq (14.4K req/day — ultra-fast)
→ NVIDIA NIM (70+ models — 40 RPM forever)
```
**Zero cost. Never stops coding.** Configure this as one OmniRoute combo and all fallbacks happen automatically — no manual switching ever.
---
@@ -1027,7 +1044,20 @@ Then in `/dashboard/media` → **Transcription** tab: upload any audio or video
OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
### 🚀 New in v2.0.9+Playground, CLI Fingerprints & ACP
### 🆕 New — ClawRouter-Inspired Improvements (Mar 2026)
| Feature | What It Does |
| ------------------------------------ | ------------------------------------------------------------------------------------------- |
| ⚡ **Grok-4 Fast Family** | xAI models at $0.20/$0.50/M — benchmarked 1143ms (30% faster than Gemini 2.5 Flash) |
| 🧠 **GLM-5 via Z.AI** | 128K output context, $0.5/1M — newest flagship from the GLM family |
| 🔮 **MiniMax M2.5** | Reasoning + agentic tasks at $0.30/1M — significant upgrade from M2.1 |
| 🎯 **toolCalling Flag per Model** | Per-model `toolCalling: true/false` in registry — AutoCombo skips non-tool-capable models |
| 🌍 **Multilingual Intent Detection** | PT/ZH/ES/AR keywords in AutoCombo scoring — better model selection for non-English content |
| 📊 **Benchmark-Driven Fallbacks** | Real p95 latency from live requests feeds combo scoring — AutoCombo learns from actual data |
| 🔁 **Request Deduplication** | Content-hash based dedup window — multi-agent safe, prevents duplicate charges |
| 🔌 **Pluggable RouterStrategy** | Extensible `RouterStrategy` interface — add custom routing logic as plugins |
### 🚀 Previous v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
| ------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
@@ -1075,16 +1105,17 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
### 🎵 Multi-Modal APIs
| Feature | What It Does |
| -------------------------- | ------------------------------------------------------------- |
| 🖼️ **Image Generation** | `/v1/images/generations` with cloud and local backends |
| 📐 **Embeddings** | `/v1/embeddings` for search and RAG pipelines |
| 🎤 **Audio Transcription** | `/v1/audio/transcriptions` (Whisper and additional providers) |
| 🔊 **Text-to-Speech** | `/v1/audio/speech` (multiple engines/providers) |
| 🎬 **Video Generation** | `/v1/videos/generations` (ComfyUI + SD WebUI workflows) |
| 🎵 **Music Generation** | `/v1/music/generations` (ComfyUI workflows) |
| 🛡️ **Moderations** | `/v1/moderations` safety checks |
| 🔀 **Reranking** | `/v1/rerank` for relevance scoring |
| Feature | What It Does |
| -------------------------- | ------------------------------------------------------------------------------------------------------------ |
| 🖼️ **Image Generation** | `/v1/images/generations` with cloud and local backends |
| 📐 **Embeddings** | `/v1/embeddings` for search and RAG pipelines |
| 🎤 **Audio Transcription** | `/v1/audio/transcriptions` (Whisper and additional providers) |
| 🔊 **Text-to-Speech** | `/v1/audio/speech` (multiple engines/providers) |
| 🎬 **Video Generation** | `/v1/videos/generations` (ComfyUI + SD WebUI workflows) |
| 🎵 **Music Generation** | `/v1/music/generations` (ComfyUI workflows) |
| 🛡️ **Moderations** | `/v1/moderations` safety checks |
| 🔀 **Reranking** | `/v1/rerank` for relevance scoring |
| 🔍 **Web Search** 🆕 | `/v1/search` — 5 providers (Serper, Brave, Perplexity, Exa, Tavily), 6,500+ free/month, auto-failover, cache |
### 🛡️ Resilience, Security & Governance
+10
View File
@@ -8,6 +8,16 @@ _وكيل API العالمي الخاص بك - نقطة نهاية واحدة،
---
### 🆕 الجديد في v2.7.0
- **RouterStrategy قابل للتوصيل** — استراتيجيات القواعد والتكلفة والكمون
- **كشف النية متعدد اللغات** — تسجيل التوجيه بأكثر من 30 لغة
- **إلغاء تكرار الطلبات** — تجنب مكالمات API المكررة عبر تجزئة المحتوى
- **مزودون جدد:** Grok-4 Fast (xAI) وGLM-5 / Z.AI وMiniMax M2.5 وKimi K2.5
- **أسعار محدثة:** Grok-4 Fast $0.20/$0.50/M، GLM-5 $0.50/M، MiniMax M2.5 $0.30/M
---
<div align="center">
[![إصدار npm](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
+10
View File
@@ -8,6 +8,16 @@ _Вашият универсален API прокси — една крайна
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
<div align="center">
[![npm версия](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
+10
View File
@@ -8,6 +8,16 @@ _Din universelle API-proxy — ét slutpunkt, 36+ udbydere, ingen nedetid. Nu me
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
<div align="center">
[![npm version](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
+10
View File
@@ -8,6 +8,16 @@ _Ihr universeller API-Proxy ein Endpunkt, mehr als 36 Anbieter, keine Ausfal
---
### 🆕 Neu in v2.7.0
- **Erweiterbare RouterStrategy** — Regeln-, Kosten- und Latenzstrategien
- **Mehrsprachige Absichtserkennung** — Routing-Scoring in 30+ Sprachen
- **Anfrage-Deduplizierung** — doppelte API-Aufrufe per Content-Hash vermeiden
- **Neue Anbieter:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Aktualisierte Preise:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
<div align="center">
[![npm-Version](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
+10
View File
@@ -11,6 +11,16 @@ _Tu proxy de API universal — un endpoint, 36+ proveedores, cero tiempo de inac
---
### 🆕 Novedades en v2.7.0
- **RouterStrategy enchufable** — estrategias de reglas, costo y latencia
- **Detección de intención multilingüe** — puntuación de enrutamiento en 30+ idiomas
- **Deduplicación de solicitudes** — evita llamadas duplicadas por hash de contenido
- **Nuevos proveedores:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Precios actualizados:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Universaali API-välityspalvelin yksi päätepiste, yli 36 palveluntarjoaja
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Votre proxy API universel — un endpoint, 36+ fournisseurs, zéro temps d'arr
---
### 🆕 Nouveautés dans v2.7.0
- **RouterStrategy extensible** — stratégies de règles, coût et latence
- **Détection d'intention multilingue** — scoring de routage en 30+ langues
- **Déduplication des requêtes** — évite les appels dupliqués via hash de contenu
- **Nouveaux fournisseurs :** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Tarifs mis à jour :** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _שרת ה-API האוניברסלי שלך - נקודת קצה אחת, 36+ ספ
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Az univerzális API-proxy egy végpont, 36+ szolgáltató, nulla állásid
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Proksi API universal Anda — satu titik akhir, 36+ penyedia, tanpa waktu henti
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -13,6 +13,16 @@ _आपका सार्वभौमिक एपीआई प्रॉक्
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Il tuo proxy API universale — un endpoint, 36+ provider, zero downtime._
---
### 🆕 Novità in v2.7.0
- **RouterStrategy estensibile** — strategie per regole, costo e latenza
- **Rilevamento intento multilingue** — scoring di routing in 30+ lingue
- **Deduplicazione richieste** — evita chiamate duplicate tramite hash del contenuto
- **Nuovi provider:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Prezzi aggiornati:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _ユニバーサル API プロキシ — 1 つのエンドポイント、36 以
---
### 🆕 v2.7.0 の新機能
- **プラガブル RouterStrategy** — ルール・コスト・レイテンシ戦略をサポート
- **多言語インテント検出** — 30以上の言語でルーティングスコアリング
- **リクエスト重複排除** — コンテンツハッシュで重複 API 呼び出しを防止
- **新しいプロバイダー:** Grok-4 Fast (xAI)、GLM-5 / Z.AI、MiniMax M2.5、Kimi K2.5
- **価格更新:** Grok-4 Fast $0.20/$0.50/M、GLM-5 $0.50/M、MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _범용 API 프록시 — 하나의 엔드포인트, 36개 이상의 공급자,
---
### 🆕 v2.7.0 새로운 기능
- **플러그형 RouterStrategy** — 규칙, 비용, 지연 전략 지원
- **다국어 의도 감지** — 30개 이상 언어로 라우팅 스코어링
- **요청 중복 제거** — 콘텐츠 해시로 중복 API 호출 방지
- **새 공급자:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **가격 업데이트:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Proksi API universal anda — satu titik akhir, 36+ pembekal, masa henti sifar.
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Uw universele API-proxy: één eindpunt, meer dan 36 providers, geen downtime._
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Din universelle API-proxy ett endepunkt, 36+ leverandører, null nedetid._
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Iyong unibersal na API proxy — isang endpoint, 36+ provider, zero downtime._
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Twój uniwersalny serwer proxy API — jeden punkt końcowy, ponad 36 dostawcó
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Seu proxy de API universal — um endpoint, 36+ provedores, zero tempo de inati
---
### 🆕 Novidades na v2.7.0
- **RouterStrategy plugável** — estratégias de regras, custo e latência
- **Detecção de intenção multilíngue** — scoring de roteamento em 30+ idiomas
- **Deduplicação de requisições** — evita chamadas duplicadas por hash de conteúdo
- **Novos provedores:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Preços atualizados:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Seu proxy de API universal — um endpoint, mais de 36 provedores, tempo de ina
---
### 🆕 Novidades na v2.7.0
- **RouterStrategy extensível** — estratégias de regras, custo e latência
- **Deteção de intenção multilíngue** — scoring de encaminhamento em 30+ idiomas
- **Deduplicação de pedidos** — evita chamadas duplicadas por hash de conteúdo
- **Novos fornecedores:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Preços atualizados:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Proxy-ul dvs. universal API - un punct final, peste 36 de furnizori, zero timpi
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Ваш универсальный API-прокси — одна точка до
---
### 🆕 Новое в v2.7.0
- **Подключаемая RouterStrategy** — стратегии по правилам, стоимости и задержке
- **Многоязычное распознавание намерений** — маршрутизация на 30+ языках
- **Дедупликация запросов** — устранение дублей по хэшу содержимого
- **Новые провайдеры:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Обновлённые цены:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Váš univerzálny proxy server API jeden koncový bod, 36+ poskytovateľov
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Din universella API-proxy — en slutpunkt, 36+ leverantörer, noll driftstopp.
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _พร็อกซี API สากลของคุณ — จุดสิ้
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Ваш універсальний API-проксі — одна кінцева
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _Proxy API phổ quát của bạn — một điểm cuối, hơn 36 nhà cung c
---
### 🆕 What's New in v2.7.0
- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
- **Multilingual intent detection** — routing scoring in 30+ languages
- **Request deduplication** — prevent duplicate API calls via content hash
- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+10
View File
@@ -11,6 +11,16 @@ _您的通用 API 代理 — 一个端点,36+ 提供商,零停机时间。_
---
### 🆕 v2.7.0 新功能
- **可插拔 RouterStrategy** — 支持规则、成本和延迟策略
- **多语言意图检测** — 支持 30+ 语言的路由评分
- **请求去重** — 基于内容哈希避免重复 API 调用
- **新增提供商:** Grok-4 Fast (xAI)、GLM-5 / Z.AI、MiniMax M2.5、Kimi K2.5
- **价格更新:** Grok-4 Fast $0.20/$0.50/MGLM-5 $0.50/MMiniMax M2.5 $0.30/M
---
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
| Feature | What It Does |
+1 -1
View File
@@ -1,7 +1,7 @@
openapi: 3.1.0
info:
title: OmniRoute API
version: 2.6.8
version: 2.8.2
description: |
OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
endpoint that routes requests to multiple AI providers with load balancing,
+4
View File
@@ -121,6 +121,10 @@ const nextConfig = {
source: "/responses",
destination: "/api/v1/responses",
},
{
source: "/responses/:path*",
destination: "/api/v1/responses/:path*",
},
{
source: "/models",
destination: "/api/v1/models",
+140 -16
View File
@@ -11,6 +11,7 @@
export interface RegistryModel {
id: string;
name: string;
toolCalling?: boolean;
targetFormat?: string;
unsupportedParams?: readonly string[];
}
@@ -77,6 +78,22 @@ interface LegacyProvider {
clientVersion?: string;
}
const KIMI_CODING_SHARED = {
format: "claude",
executor: "default",
baseUrl: "https://api.kimi.com/coding/v1/messages",
authHeader: "x-api-key",
headers: {
"Anthropic-Version": "2023-06-01",
"Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
},
models: [
{ id: "kimi-k2.5", name: "Kimi K2.5" },
{ id: "kimi-k2.5-thinking", name: "Kimi K2.5 Thinking" },
{ id: "kimi-latest", name: "Kimi Latest" },
] as RegistryModel[],
} as const;
// ── Registry ──────────────────────────────────────────────────────────────
export const REGISTRY: Record<string, RegistryEntry> = {
@@ -114,6 +131,7 @@ export const REGISTRY: Record<string, RegistryEntry> = {
},
models: [
{ id: "claude-opus-4-6", name: "Claude Opus 4.6" },
{ id: "claude-sonnet-4-6", name: "Claude 4.6 Sonnet" },
{ id: "claude-opus-4-5-20251101", name: "Claude 4.5 Opus" },
{ id: "claude-sonnet-4-5-20250929", name: "Claude 4.5 Sonnet" },
{ id: "claude-haiku-4-5-20251001", name: "Claude 4.5 Haiku" },
@@ -139,6 +157,9 @@ export const REGISTRY: Record<string, RegistryEntry> = {
clientSecretDefault: "",
},
models: [
{ id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
{ id: "gemini-3-1-pro", name: "Gemini 3.1 Pro (Alt ID)" },
{ id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" },
{ id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
{ id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
{ id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
@@ -168,6 +189,9 @@ export const REGISTRY: Record<string, RegistryEntry> = {
clientSecretDefault: "",
},
models: [
{ id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
{ id: "gemini-3-1-pro", name: "Gemini 3.1 Pro (Alt ID)" },
{ id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" },
{ id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
{ id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
{ id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
@@ -460,8 +484,13 @@ export const REGISTRY: Record<string, RegistryEntry> = {
"Anthropic-Version": "2023-06-01",
},
models: [
{ id: "claude-haiku-4.5", name: "Claude Haiku 4.5" },
{ id: "claude-sonnet-4-20250514", name: "Claude Sonnet 4" },
{ id: "claude-sonnet-4-6-20251031", name: "Claude Sonnet 4.6 (Dated)" },
{ id: "claude-sonnet-4.6", name: "Claude Sonnet 4.6" },
{ id: "claude-opus-4-20250514", name: "Claude Opus 4" },
{ id: "claude-opus-4-6-20251031", name: "Claude Opus 4.6 (Dated)" },
{ id: "claude-opus-4.6", name: "Claude Opus 4.6" },
{ id: "claude-3-5-sonnet-20241022", name: "Claude 3.5 Sonnet" },
],
},
@@ -495,6 +524,8 @@ export const REGISTRY: Record<string, RegistryEntry> = {
"Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
},
models: [
{ id: "glm-5", name: "GLM 5" },
{ id: "glm-5-turbo", name: "GLM 5 Turbo" },
{ id: "glm-4.7-flash", name: "GLM 4.7 Flash" },
{ id: "glm-4.7", name: "GLM 4.7" },
{ id: "glm-4.6v", name: "GLM 4.6V (Vision)" },
@@ -506,6 +537,51 @@ export const REGISTRY: Record<string, RegistryEntry> = {
],
},
"bailian-coding-plan": {
id: "bailian-coding-plan",
alias: "bcp",
format: "claude",
executor: "default",
baseUrl: "https://coding-intl.dashscope.aliyuncs.com/apps/anthropic/v1/messages",
chatPath: "/messages",
urlSuffix: "?beta=true",
authType: "apikey",
authHeader: "x-api-key",
headers: {
"Anthropic-Version": "2023-06-01",
"Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
},
models: [
{ id: "qwen3.5-plus", name: "Qwen3.5 Plus" },
{ id: "qwen3-max-2026-01-23", name: "Qwen3 Max (2026-01-23)" },
{ id: "qwen3-coder-next", name: "Qwen3 Coder Next" },
{ id: "qwen3-coder-plus", name: "Qwen3 Coder Plus" },
{ id: "MiniMax-M2.5", name: "MiniMax M2.5" },
{ id: "glm-5", name: "GLM 5" },
{ id: "glm-4.7", name: "GLM 4.7" },
{ id: "kimi-k2.5", name: "Kimi K2.5" },
],
},
zai: {
id: "zai",
alias: "zai",
format: "claude",
executor: "default",
baseUrl: "https://api.z.ai/api/anthropic/v1/messages",
urlSuffix: "?beta=true",
authType: "apikey",
authHeader: "x-api-key",
headers: {
"Anthropic-Version": "2023-06-01",
"Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
},
models: [
{ id: "glm-5", name: "GLM 5" },
{ id: "glm-5-turbo", name: "GLM 5 Turbo" },
],
},
kimi: {
id: "kimi",
alias: "kimi",
@@ -525,16 +601,9 @@ export const REGISTRY: Record<string, RegistryEntry> = {
"kimi-coding": {
id: "kimi-coding",
alias: "kmc",
format: "claude",
executor: "default",
baseUrl: "https://api.kimi.com/coding/v1/messages",
...KIMI_CODING_SHARED,
urlSuffix: "?beta=true",
authType: "oauth",
authHeader: "x-api-key",
headers: {
"Anthropic-Version": "2023-06-01",
"Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
},
oauth: {
clientIdEnv: "KIMI_CODING_OAUTH_CLIENT_ID",
clientIdDefault: "17e5f671-d194-4dfb-9706-5516cb48c098",
@@ -542,11 +611,13 @@ export const REGISTRY: Record<string, RegistryEntry> = {
refreshUrl: "https://auth.kimi.com/api/oauth/token",
authUrl: "https://auth.kimi.com/api/oauth/device_authorization",
},
models: [
{ id: "kimi-k2.5", name: "Kimi K2.5" },
{ id: "kimi-k2.5-thinking", name: "Kimi K2.5 Thinking" },
{ id: "kimi-latest", name: "Kimi Latest" },
],
},
"kimi-coding-apikey": {
id: "kimi-coding-apikey",
alias: "kmca",
...KIMI_CODING_SHARED,
authType: "apikey",
},
kilocode: {
@@ -637,7 +708,11 @@ export const REGISTRY: Record<string, RegistryEntry> = {
"Anthropic-Version": "2023-06-01",
"Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
},
models: [{ id: "MiniMax-M2.1", name: "MiniMax M2.1" }],
models: [
{ id: "minimax-m2.5", name: "MiniMax M2.5" },
{ id: "MiniMax-M2.5", name: "MiniMax M2.5 (Legacy Alias)" },
{ id: "MiniMax-M2.1", name: "MiniMax M2.1" },
],
},
"minimax-cn": {
@@ -655,10 +730,52 @@ export const REGISTRY: Record<string, RegistryEntry> = {
},
models: [
// Keep parity with minimax to ensure model discovery works for minimax-cn connections.
{ id: "minimax-m2.5", name: "MiniMax M2.5" },
{ id: "MiniMax-M2.5", name: "MiniMax M2.5 (Legacy Alias)" },
{ id: "MiniMax-M2.1", name: "MiniMax M2.1" },
],
},
alicode: {
id: "alicode",
alias: "alicode",
format: "openai",
executor: "default",
baseUrl: "https://coding.dashscope.aliyuncs.com/v1/chat/completions",
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "qwen3.5-plus", name: "Qwen3.5 Plus" },
{ id: "kimi-k2.5", name: "Kimi K2.5" },
{ id: "glm-5", name: "GLM 5" },
{ id: "MiniMax-M2.5", name: "MiniMax M2.5" },
{ id: "qwen3-max-2026-01-23", name: "Qwen3 Max" },
{ id: "qwen3-coder-next", name: "Qwen3 Coder Next" },
{ id: "qwen3-coder-plus", name: "Qwen3 Coder Plus" },
{ id: "glm-4.7", name: "GLM 4.7" },
],
},
"alicode-intl": {
id: "alicode-intl",
alias: "alicode-intl",
format: "openai",
executor: "default",
baseUrl: "https://coding-intl.dashscope.aliyuncs.com/v1/chat/completions",
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "qwen3.5-plus", name: "Qwen3.5 Plus" },
{ id: "kimi-k2.5", name: "Kimi K2.5" },
{ id: "glm-5", name: "GLM 5" },
{ id: "MiniMax-M2.5", name: "MiniMax M2.5" },
{ id: "qwen3-max-2026-01-23", name: "Qwen3 Max" },
{ id: "qwen3-coder-next", name: "Qwen3 Coder Next" },
{ id: "qwen3-coder-plus", name: "Qwen3 Coder Plus" },
{ id: "glm-4.7", name: "GLM 4.7" },
],
},
deepseek: {
id: "deepseek",
alias: "ds",
@@ -717,10 +834,14 @@ export const REGISTRY: Record<string, RegistryEntry> = {
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "grok-4", name: "Grok 4" },
{ id: "grok-4-fast-non-reasoning", name: "Grok 4 Fast" },
{ id: "grok-4-fast-reasoning", name: "Grok 4 Fast Reasoning" },
{ id: "grok-code-fast-1", name: "Grok Code Fast" },
{ id: "grok-4-1-fast-non-reasoning", name: "Grok 4.1 Fast" },
{ id: "grok-4-1-fast-reasoning", name: "Grok 4.1 Fast Reasoning" },
{ id: "grok-4-0709", name: "Grok 4 (0709)" },
{ id: "grok-4", name: "Grok 4" },
{ id: "grok-3", name: "Grok 3" },
{ id: "grok-3-mini", name: "Grok 3 Mini" },
],
},
@@ -849,7 +970,10 @@ export const REGISTRY: Record<string, RegistryEntry> = {
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "gpt-oss-120b", name: "GPT OSS 120B", toolCalling: false },
{ id: "openai/gpt-oss-120b", name: "GPT OSS 120B (OpenAI Prefix)", toolCalling: false },
{ id: "meta/llama-3.3-70b-instruct", name: "Llama 3.3 70B" },
{ id: "nvidia/llama-3.3-70b-instruct", name: "Llama 3.3 70B (NVIDIA Prefix)" },
{ id: "meta/llama-4-maverick-17b-128e-instruct", name: "Llama 4 Maverick" },
{ id: "moonshotai/kimi-k2.5", name: "Kimi K2.5" },
{ id: "z-ai/glm4.7", name: "GLM 4.7" },
+155
View File
@@ -0,0 +1,155 @@
/**
* Search Provider Registry
*
* Defines providers that support the /v1/search endpoint.
* Unlike LLM/embedding providers, search providers don't have "models"
* a provider IS the model (Serper = Google SERP, Brave = Brave index).
*
* API keys are stored in the same provider credentials system,
* keyed by provider ID (e.g. "serper-search", "brave-search").
* perplexity-search reuses credentials from the "perplexity" chat provider.
*/
export interface SearchProviderConfig {
id: string;
name: string;
baseUrl: string;
method: "GET" | "POST";
authType: "apikey";
authHeader: string;
costPerQuery: number;
freeMonthlyQuota: number;
searchTypes: string[];
defaultMaxResults: number;
maxMaxResults: number;
timeoutMs: number;
cacheTTLMs: number;
}
export const SEARCH_PROVIDERS: Record<string, SearchProviderConfig> = {
"serper-search": {
id: "serper-search",
name: "Serper Search",
baseUrl: "https://google.serper.dev",
method: "POST",
authType: "apikey",
authHeader: "x-api-key",
costPerQuery: 0.001,
freeMonthlyQuota: 2500,
searchTypes: ["web", "news"],
defaultMaxResults: 5,
maxMaxResults: 100,
timeoutMs: 10_000,
cacheTTLMs: 5 * 60 * 1000,
},
"brave-search": {
id: "brave-search",
name: "Brave Search",
baseUrl: "https://api.search.brave.com/res/v1",
method: "GET",
authType: "apikey",
authHeader: "x-subscription-token",
costPerQuery: 0.005,
freeMonthlyQuota: 1000,
searchTypes: ["web", "news"],
defaultMaxResults: 5,
maxMaxResults: 20,
timeoutMs: 10_000,
cacheTTLMs: 5 * 60 * 1000,
},
"perplexity-search": {
id: "perplexity-search",
name: "Perplexity Search",
baseUrl: "https://api.perplexity.ai/search",
method: "POST",
authType: "apikey",
authHeader: "bearer",
costPerQuery: 0.005,
freeMonthlyQuota: 0,
searchTypes: ["web"],
defaultMaxResults: 5,
maxMaxResults: 20,
timeoutMs: 10_000,
cacheTTLMs: 5 * 60 * 1000,
},
"exa-search": {
id: "exa-search",
name: "Exa Search",
baseUrl: "https://api.exa.ai/search",
method: "POST",
authType: "apikey",
authHeader: "x-api-key",
costPerQuery: 0.007,
freeMonthlyQuota: 1000,
searchTypes: ["web", "news"],
defaultMaxResults: 5,
maxMaxResults: 100,
timeoutMs: 10_000,
cacheTTLMs: 5 * 60 * 1000,
},
"tavily-search": {
id: "tavily-search",
name: "Tavily Search",
baseUrl: "https://api.tavily.com/search",
method: "POST",
authType: "apikey",
authHeader: "bearer",
costPerQuery: 0.008,
freeMonthlyQuota: 1000,
searchTypes: ["web", "news"],
defaultMaxResults: 5,
maxMaxResults: 20,
timeoutMs: 10_000,
cacheTTLMs: 5 * 60 * 1000,
},
};
/**
* Credential fallback mapping search providers that can reuse credentials
* from a related provider (e.g., perplexity-search uses the same API key as perplexity chat).
*/
export const SEARCH_CREDENTIAL_FALLBACKS: Record<string, string> = {
"perplexity-search": "perplexity",
};
/**
* Get search provider config by ID
*/
export function getSearchProvider(providerId: string): SearchProviderConfig | null {
return SEARCH_PROVIDERS[providerId] || null;
}
/**
* Get all search providers as a flat list
*/
export function getAllSearchProviders(): Array<{
id: string;
name: string;
searchTypes: string[];
}> {
return Object.values(SEARCH_PROVIDERS).map((p) => ({
id: p.id,
name: p.name,
searchTypes: p.searchTypes,
}));
}
/**
* Select the cheapest available provider.
* If an explicit provider is given, validate and return it.
* Otherwise, return the cheapest by costPerQuery.
*/
export function selectProvider(explicitProvider?: string): SearchProviderConfig | null {
if (explicitProvider) {
return SEARCH_PROVIDERS[explicitProvider] || null;
}
const providers = Object.values(SEARCH_PROVIDERS);
if (providers.length === 0) return null;
return providers.reduce((cheapest, p) => (p.costPerQuery < cheapest.costPerQuery ? p : cheapest));
}
+1
View File
@@ -26,6 +26,7 @@ export type ProviderCredentials = {
expiresAt?: string;
connectionId?: string; // T07: used for API key rotation index
providerSpecificData?: JsonRecord;
requestEndpointPath?: string;
};
export type ExecutorLog = {
+38 -3
View File
@@ -9,6 +9,17 @@ type EffortLevel = (typeof EFFORT_ORDER)[number];
const CODEX_FAST_WIRE_VALUE = "priority";
let defaultFastServiceTierEnabled = false;
function getResponsesSubpath(endpointPath: unknown): string | null {
const normalizedEndpoint = String(endpointPath || "").replace(/\/+$/, "");
const match = normalizedEndpoint.match(/(?:^|\/)responses(?:(\/.*))?$/i);
if (!match) return null;
return match[1] || "";
}
function isCompactResponsesEndpoint(endpointPath: unknown): boolean {
return getResponsesSubpath(endpointPath)?.toLowerCase() === "/compact";
}
function normalizeServiceTierValue(value: unknown): string | undefined {
if (typeof value !== "string") return undefined;
const normalized = value.trim().toLowerCase();
@@ -60,13 +71,31 @@ export class CodexExecutor extends BaseExecutor {
super("codex", PROVIDERS.codex);
}
buildUrl(model, stream, urlIndex = 0, credentials = null) {
void model;
void stream;
void urlIndex;
const responsesSubpath = getResponsesSubpath(credentials?.requestEndpointPath);
if (responsesSubpath !== null) {
const baseUrl = String(this.config.baseUrl || "").replace(/\/$/, "");
if (baseUrl.endsWith("/responses")) {
return `${baseUrl}${responsesSubpath}`;
}
return `${baseUrl}/responses${responsesSubpath}`;
}
return super.buildUrl(model, stream, urlIndex, credentials);
}
/**
* Codex Responses endpoint is SSE-first.
* Always request event-stream from upstream, even when client requested stream=false.
* Includes chatgpt-account-id header for strict workspace binding.
*/
buildHeaders(credentials, stream = true) {
const headers = super.buildHeaders(credentials, true);
const isCompactRequest = isCompactResponsesEndpoint(credentials?.requestEndpointPath);
const headers = super.buildHeaders(credentials, isCompactRequest ? false : true);
// Add workspace binding header if workspaceId is persisted
const workspaceId = credentials?.providerSpecificData?.workspaceId;
@@ -107,9 +136,15 @@ export class CodexExecutor extends BaseExecutor {
*/
transformRequest(model, body, stream, credentials) {
const nativeCodexPassthrough = body?._nativeCodexPassthrough === true;
const isCompactRequest = isCompactResponsesEndpoint(credentials?.requestEndpointPath);
// Codex /responses rejects stream=false; we aggregate SSE back to JSON when needed.
body.stream = true;
// Codex /responses rejects stream=false, but /responses/compact rejects the stream field entirely.
if (isCompactRequest) {
delete body.stream;
delete body.stream_options;
} else {
body.stream = true;
}
delete body._nativeCodexPassthrough;
const requestServiceTier = normalizeServiceTierValue(body.service_tier);
+2
View File
@@ -54,6 +54,8 @@ export class DefaultExecutor extends BaseExecutor {
break;
case "glm":
case "kimi-coding":
case "bailian-coding-plan":
case "kimi-coding-apikey":
case "minimax":
case "minimax-cn":
headers["x-api-key"] = credentials.apiKey || credentials.accessToken;
+5 -2
View File
@@ -77,10 +77,13 @@ export class KiroExecutor extends BaseExecutor {
}
transformRequest(model: string, body: unknown, stream: boolean, credentials: unknown): unknown {
void model;
void stream;
void credentials;
return body;
// Kiro uses conversationState.currentMessage.userInputMessage.modelId,
// not a top-level "model" field. chatCore injects translatedBody.model
// which Kiro API rejects as unknown top-level field.
const { model: _model, ...rest } = body as Record<string, unknown>;
return rest;
}
/**
+213 -43
View File
@@ -23,6 +23,7 @@ import {
appendRequestLog,
saveCallLog,
} from "@/lib/usageDb";
import { getModelNormalizeToolCallId } from "@/lib/db/models";
import { getExecutor } from "../executors/index.ts";
import { translateNonStreamingResponse } from "./responseTranslator.ts";
import { extractUsageFromResponse } from "./usageExtractor.ts";
@@ -42,6 +43,12 @@ import {
import { getIdempotencyKey, checkIdempotency, saveIdempotency } from "@/lib/idempotencyLayer";
import { createProgressTransform, wantsProgress } from "../utils/progressTracker.ts";
import { isModelUnavailableError, getNextFamilyFallback } from "../services/modelFamilyFallback.ts";
import { computeRequestHash, deduplicate, shouldDeduplicate } from "../services/requestDedup.ts";
import {
shouldUseFallback,
isFallbackDecision,
EMERGENCY_FALLBACK_CONFIG,
} from "../services/emergencyFallback.ts";
export function shouldUseNativeCodexPassthrough({
provider,
@@ -54,9 +61,8 @@ export function shouldUseNativeCodexPassthrough({
}): boolean {
if (provider !== "codex") return false;
if (sourceFormat !== FORMATS.OPENAI_RESPONSES) return false;
return String(endpointPath || "")
.toLowerCase()
.endsWith("/responses");
const normalizedEndpoint = String(endpointPath || "").replace(/\/+$/, "");
return /(?:^|\/)responses(?:\/.*)?$/i.test(normalizedEndpoint);
}
/**
@@ -89,6 +95,22 @@ export async function handleChatCore({
}) {
const { provider, model, extendedContext } = modelInfo;
const startTime = Date.now();
const persistFailureUsage = (statusCode: number, errorCode?: string | null) => {
saveRequestUsage({
provider: provider || "unknown",
model: model || "unknown",
tokens: { input: 0, output: 0, cacheRead: 0, cacheCreation: 0, reasoning: 0 },
status: String(statusCode),
success: false,
latencyMs: Date.now() - startTime,
timeToFirstTokenMs: 0,
errorCode: errorCode || String(statusCode),
timestamp: new Date().toISOString(),
connectionId: connectionId || undefined,
apiKeyId: apiKeyInfo?.id || undefined,
apiKeyName: apiKeyInfo?.name || undefined,
}).catch(() => {});
};
// ── Phase 9.2: Idempotency check ──
const idempotencyKey = getIdempotencyKey(clientRawRequest?.headers);
@@ -118,8 +140,8 @@ export async function handleChatCore({
}
const sourceFormat = detectFormat(body);
const endpointPath = (clientRawRequest?.endpoint || "").toLowerCase();
const isResponsesEndpoint = endpointPath.endsWith("/responses");
const endpointPath = String(clientRawRequest?.endpoint || "");
const isResponsesEndpoint = /(?:^|\/)responses(?:\/.*)?$/i.test(endpointPath);
const nativeCodexPassthrough = shouldUseNativeCodexPassthrough({
provider,
sourceFormat,
@@ -135,10 +157,16 @@ export async function handleChatCore({
// Detect source format and get target format
// Model-specific targetFormat takes priority over provider default
// Apply custom model aliases (Settings → Model Aliases → Pattern→Target) before routing (#315)
// Apply custom model aliases (Settings → Model Aliases → Pattern→Target) before routing (#315, #472)
// Custom aliases take priority over built-in and must be resolved here so the
// downstream getModelTargetFormat() lookup uses the correct, aliased model ID.
// downstream getModelTargetFormat() lookup AND the actual provider request use
// the correct, aliased model ID. Without this, aliases only affect format detection.
const resolvedModel = resolveModelAlias(model);
// Use resolvedModel for all downstream operations (routing, provider requests, logging)
const effectiveModel = resolvedModel !== model ? resolvedModel : model;
if (resolvedModel !== model) {
log?.info?.("ALIAS", `Model alias applied: ${model}${resolvedModel}`);
}
const alias = PROVIDER_ID_TO_ALIAS[provider] || provider;
const modelTargetFormat = getModelTargetFormat(alias, resolvedModel);
@@ -193,7 +221,7 @@ export async function handleChatCore({
} else if (isClaudePassthrough) {
// Claude-to-Claude passthrough: forward body completely untouched.
// No translation, no field stripping, no thinking normalization.
// We are just a gateway -- do not interfere with the request in any way.
// We are just a gateway -- do not interfere with the request in the slightest.
translatedBody = { ...body };
log?.debug?.("FORMAT", "claude->claude passthrough -- forwarding untouched");
} else {
@@ -246,13 +274,50 @@ export async function handleChatCore({
if (Array.isArray(translatedBody.messages)) {
for (const msg of translatedBody.messages) {
if (Array.isArray(msg.content)) {
msg.content = msg.content.filter((block: Record<string, unknown>) =>
block.type !== "text" || (typeof block.text === "string" && block.text.length > 0)
msg.content = msg.content.filter(
(block: Record<string, unknown>) =>
block.type !== "text" || (typeof block.text === "string" && block.text.length > 0)
);
}
}
}
// ── #409: Normalize unsupported content part types ──
// Cursor and other clients send {type:"file"} when attaching .md or other files.
// Providers (Copilot, OpenAI) only accept "text" and "image_url" in content arrays.
// Convert: file → text (extract content), drop unrecognized types with a warning.
if (Array.isArray(translatedBody.messages)) {
for (const msg of translatedBody.messages) {
if (msg.role === "user" && Array.isArray(msg.content)) {
msg.content = (msg.content as Record<string, unknown>[]).flatMap(
(block: Record<string, unknown>) => {
if (block.type === "text" || block.type === "image_url" || block.type === "image") {
return [block];
}
// file / document → extract text content
if (block.type === "file" || block.type === "document") {
const fileContent =
(block.file as Record<string, unknown>)?.content ??
(block.file as Record<string, unknown>)?.text ??
block.content ??
block.text;
const fileName =
(block.file as Record<string, unknown>)?.name ?? block.name ?? "attachment";
if (typeof fileContent === "string" && fileContent.length > 0) {
return [{ type: "text", text: `[${fileName}]\n${fileContent}` }];
}
return [];
}
// Unknown types: drop silently
log?.debug?.("CONTENT", `Dropped unsupported content part type="${block.type}"`);
return [];
}
);
}
}
}
const normalizeToolCallId = getModelNormalizeToolCallId(provider || "", model || "");
translatedBody = translateRequest(
sourceFormat,
targetFormat,
@@ -261,7 +326,8 @@ export async function handleChatCore({
stream,
credentials,
provider,
reqLogger
reqLogger,
{ normalizeToolCallId }
);
}
} catch (error) {
@@ -307,8 +373,8 @@ export async function handleChatCore({
delete translatedBody._toolNameMap;
delete translatedBody._disableToolPrefix;
// Update model in body
translatedBody.model = model;
// Update model in body — use resolved alias so the provider gets the correct model ID (#472)
translatedBody.model = effectiveModel;
// Strip unsupported parameters for reasoning models (o1, o3, etc.)
const unsupported = getUnsupportedParams(provider, model);
@@ -327,13 +393,66 @@ export async function handleChatCore({
// Get executor for this provider
const executor = getExecutor(provider);
const getExecutionCredentials = () =>
nativeCodexPassthrough ? { ...credentials, requestEndpointPath: endpointPath } : credentials;
// Create stream controller for disconnect detection
const streamController = createStreamController({ onDisconnect, log, provider, model });
const dedupRequestBody = { ...translatedBody, model: `${provider}/${model}` };
const dedupEnabled = shouldDeduplicate(dedupRequestBody);
const dedupHash = dedupEnabled ? computeRequestHash(dedupRequestBody) : null;
const executeProviderRequest = async (modelToCall = effectiveModel, allowDedup = false) => {
const execute = async () => {
const bodyToSend =
translatedBody.model === modelToCall
? translatedBody
: { ...translatedBody, model: modelToCall };
const rawResult = await withRateLimit(provider, connectionId, modelToCall, () =>
executor.execute({
model: modelToCall,
body: bodyToSend,
stream,
credentials: getExecutionCredentials(),
signal: streamController.signal,
log,
extendedContext,
})
);
if (stream) return rawResult;
// Non-stream responses need cloning for shared dedup consumers.
const status = rawResult.response.status;
const statusText = rawResult.response.statusText;
const headers = Array.from(rawResult.response.headers.entries());
const payload = await rawResult.response.text();
return {
...rawResult,
response: new Response(payload, { status, statusText, headers }),
};
};
if (allowDedup && dedupEnabled && dedupHash) {
const dedupResult = await deduplicate(dedupHash, execute);
if (dedupResult.wasDeduplicated) {
log?.debug?.("DEDUP", `Joined in-flight request hash=${dedupHash}`);
}
return dedupResult.result;
}
return execute();
};
// Track pending request
trackPendingRequest(model, provider, connectionId, true);
// T5: track which models we've tried for intra-family fallback
const triedModels = new Set<string>([model]);
let currentModel = model;
const triedModels = new Set<string>([effectiveModel]);
let currentModel = effectiveModel;
// Log start
appendRequestLog({ model, provider, connectionId, status: "PENDING" }).catch(() => {});
@@ -345,9 +464,6 @@ export async function handleChatCore({
0;
log?.debug?.("REQUEST", `${provider.toUpperCase()} | ${model} | ${msgCount} msgs`);
// Create stream controller for disconnect detection
const streamController = createStreamController({ onDisconnect, log, provider, model });
// Execute request using executor (handles URL building, headers, fallback, transform)
let providerResponse;
let providerUrl;
@@ -355,17 +471,7 @@ export async function handleChatCore({
let finalBody;
try {
const result = await withRateLimit(provider, connectionId, model, () =>
executor.execute({
model,
body: translatedBody,
stream,
credentials,
signal: streamController.signal,
log,
extendedContext,
})
);
const result = await executeProviderRequest(effectiveModel, true);
providerResponse = result.response;
providerUrl = result.url;
@@ -412,6 +518,7 @@ export async function handleChatCore({
streamController.handleError(error);
return createErrorResult(499, "Request aborted");
}
persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, error?.name || "upstream_error");
const errMsg = formatProviderError(error, provider, model, HTTP_STATUS.BAD_GATEWAY);
console.log(`${COLORS.red}[ERROR] ${errMsg}${COLORS.reset}`);
return createErrorResult(HTTP_STATUS.BAD_GATEWAY, errMsg);
@@ -448,7 +555,7 @@ export async function handleChatCore({
model,
body: translatedBody,
stream,
credentials,
credentials: getExecutionCredentials(),
signal: streamController.signal,
log,
extendedContext,
@@ -521,17 +628,7 @@ export async function handleChatCore({
log?.info?.("MODEL_FALLBACK", `${model} unavailable (${statusCode}) → trying ${nextModel}`);
// Re-execute with the fallback model
try {
const fallbackResult = await withRateLimit(provider, connectionId, nextModel, () =>
executor.execute({
model: nextModel,
body: translatedBody,
stream,
credentials,
signal: streamController.signal,
log,
extendedContext,
})
);
const fallbackResult = await executeProviderRequest(nextModel, false);
if (fallbackResult.response.ok) {
providerResponse = fallbackResult.response;
providerUrl = fallbackResult.url;
@@ -543,18 +640,79 @@ export async function handleChatCore({
// We fall through by NOT returning here
} else {
// Fallback also failed — return original error
persistFailureUsage(statusCode, "model_unavailable");
return createErrorResult(statusCode, errMsg, retryAfterMs);
}
} catch {
persistFailureUsage(statusCode, "model_unavailable");
return createErrorResult(statusCode, errMsg, retryAfterMs);
}
} else {
persistFailureUsage(statusCode, "model_unavailable");
return createErrorResult(statusCode, errMsg, retryAfterMs);
}
} else {
persistFailureUsage(statusCode, `upstream_${statusCode}`);
return createErrorResult(statusCode, errMsg, retryAfterMs);
}
// ── End T5 ───────────────────────────────────────────────────────────────
// ── Emergency Fallback (ClawRouter Feature #09/017) ────────────────────
// When a non-streaming request fails with a budget-related error (402 or
// budget keywords), redirect to nvidia/gpt-oss-120b ($0.00/M) before
// returning the error to the combo router. This gives one last free-tier
// attempt so the user's session stays alive.
const requestHasTools = Array.isArray(translatedBody.tools) && translatedBody.tools.length > 0;
if (!stream) {
const fbDecision = shouldUseFallback(
statusCode,
message,
requestHasTools,
EMERGENCY_FALLBACK_CONFIG
);
if (isFallbackDecision(fbDecision)) {
log?.info?.("EMERGENCY_FALLBACK", fbDecision.reason);
try {
// Build a minimal fallback request using the original body but with
// the NVIDIA free-tier model and max_tokens capped to avoid overuse.
const fbExecutor = getExecutor(fbDecision.provider);
const fbResult = await fbExecutor.execute({
model: fbDecision.model,
body: {
...translatedBody,
model: fbDecision.model,
max_tokens: Math.min(
typeof translatedBody.max_tokens === "number"
? translatedBody.max_tokens
: fbDecision.maxOutputTokens,
fbDecision.maxOutputTokens
),
},
stream: false,
credentials: credentials,
signal: streamController.signal,
log,
extendedContext,
});
if (fbResult.response.ok) {
providerResponse = fbResult.response;
log?.info?.(
"EMERGENCY_FALLBACK",
`Serving ${fbDecision.provider}/${fbDecision.model} as budget fallback for ${provider}/${model}`
);
// Fall through to non-streaming handler — providerResponse is now OK
} else {
log?.warn?.(
"EMERGENCY_FALLBACK",
`Emergency fallback also failed (${fbResult.response.status})`
);
}
} catch (fbErr) {
log?.warn?.("EMERGENCY_FALLBACK", `Emergency fallback error: ${fbErr?.message}`);
}
}
}
// ── End Emergency Fallback ────────────────────────────────────────────
}
// Non-streaming response
@@ -580,6 +738,7 @@ export async function handleChatCore({
connectionId,
status: `FAILED ${HTTP_STATUS.BAD_GATEWAY}`,
}).catch(() => {});
persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, "invalid_sse_payload");
return createErrorResult(
HTTP_STATUS.BAD_GATEWAY,
"Invalid SSE response for non-streaming request"
@@ -597,6 +756,7 @@ export async function handleChatCore({
connectionId,
status: `FAILED ${HTTP_STATUS.BAD_GATEWAY}`,
}).catch(() => {});
persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, "invalid_json_payload");
return createErrorResult(HTTP_STATUS.BAD_GATEWAY, "Invalid JSON response from provider");
}
}
@@ -639,6 +799,11 @@ export async function handleChatCore({
provider: provider || "unknown",
model: model || "unknown",
tokens: usage,
status: "200",
success: true,
latencyMs: Date.now() - startTime,
timeToFirstTokenMs: Date.now() - startTime,
errorCode: null,
timestamp: new Date().toISOString(),
connectionId: connectionId || undefined,
apiKeyId: apiKeyInfo?.id || undefined,
@@ -715,8 +880,12 @@ export async function handleChatCore({
// Create transform stream with logger for streaming response
let transformStream;
// Callback to save call log when stream completes (streaming calls were never logged before!)
const onStreamComplete = ({ status: streamStatus, usage: streamUsage }) => {
// Callback to save call log when stream completes (include responseBody when provided by stream)
const onStreamComplete = ({
status: streamStatus,
usage: streamUsage,
responseBody: streamResponseBody,
}) => {
saveCallLog({
method: "POST",
path: clientRawRequest?.endpoint || "/v1/chat/completions",
@@ -727,6 +896,7 @@ export async function handleChatCore({
duration: Date.now() - startTime,
tokens: streamUsage || {},
requestBody: body,
responseBody: streamResponseBody ?? undefined,
sourceFormat,
targetFormat,
comboName,
+680
View File
@@ -0,0 +1,680 @@
/**
* Search Handler
*
* Handles POST /v1/search requests.
* Routes to 5 search providers with automatic failover:
* serper-search, brave-search, perplexity-search, exa-search, tavily-search
*
* Request format:
* {
* "query": "search query",
* "provider": "serper-search" | "brave-search" | ... // optional, auto-selects cheapest
* "max_results": 5,
* "search_type": "web" | "news"
* }
*/
import { getSearchProvider, type SearchProviderConfig } from "../config/searchRegistry.ts";
import { saveCallLog } from "@/lib/usageDb";
// ── Types ────────────────────────────────────────────────────────────────
export interface SearchResult {
title: string;
url: string;
display_url?: string;
snippet: string;
position: number;
score: number | null;
published_at: string | null;
favicon_url: string | null;
content: { format: string; text: string; length: number } | null;
metadata: {
author: string | null;
language: string | null;
source_type: string | null;
image_url: string | null;
} | null;
citation: {
provider: string;
retrieved_at: string;
rank: number;
};
provider_raw: Record<string, unknown> | null;
}
export interface SearchResponse {
provider: string;
query: string;
results: SearchResult[];
answer: { source: string; text: string | null; model: string | null } | null;
usage: { queries_used: number; search_cost_usd: number; llm_tokens?: number };
metrics: {
response_time_ms: number;
upstream_latency_ms: number;
gateway_latency_ms?: number;
total_results_available: number | null;
};
errors: Array<{ provider: string; code: string; message: string }>;
}
interface SearchHandlerResult {
success: boolean;
status?: number;
error?: string;
data?: SearchResponse;
}
interface SearchHandlerOptions {
query: string;
provider: string;
maxResults: number;
searchType: string;
country?: string;
language?: string;
timeRange?: string;
offset?: number;
domainFilter?: string[];
contentOptions?: {
snippet?: boolean;
full_page?: boolean;
format?: string;
max_characters?: number;
};
strictFilters?: boolean;
providerOptions?: Record<string, unknown>;
credentials: Record<string, any>;
alternateProvider?: string;
alternateCredentials?: Record<string, any> | null;
log?: any;
}
// ── Constants ────────────────────────────────────────────────────────────
const GLOBAL_TIMEOUT_MS = 15_000;
// Non-retriable HTTP status codes — fail immediately, don't try alternate
const NON_RETRIABLE = new Set([400, 401, 403, 404]);
// ── Input Sanitization ──────────────────────────────────────────────────
// Control characters that should never appear in search queries
const CONTROL_CHAR_RE = /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/;
function sanitizeQuery(query: string): { clean: string; error?: string } {
if (CONTROL_CHAR_RE.test(query)) {
return { clean: "", error: "Query contains invalid control characters" };
}
const clean = query.normalize("NFKC").trim().replace(/\s+/g, " ");
if (clean.length === 0) {
return { clean: "", error: "Query is empty after normalization" };
}
return { clean };
}
// ── Response Normalizers ────────────────────────────────────────────────
function makeResult(
providerId: string,
item: {
title?: string;
url?: string;
snippet?: string;
score?: number;
published_at?: string;
favicon_url?: string;
author?: string;
source_type?: string;
image_url?: string;
full_text?: string;
text_format?: string;
},
idx: number,
now: string
): SearchResult {
const url = item.url || "";
return {
title: item.title || "",
url,
display_url: url ? url.replace(/^https?:\/\/(www\.)?/, "").split("?")[0] : undefined,
snippet: item.snippet || "",
position: idx + 1,
score: typeof item.score === "number" ? Math.min(1, Math.max(0, item.score)) : null,
published_at: item.published_at || null,
favicon_url: item.favicon_url || null,
content: item.full_text
? { format: item.text_format || "text", text: item.full_text, length: item.full_text.length }
: null,
metadata: {
author: item.author || null,
language: null,
source_type: item.source_type || null,
image_url: item.image_url || null,
},
citation: { provider: providerId, retrieved_at: now, rank: idx + 1 },
provider_raw: null,
};
}
function normalizeSerperResponse(
data: any,
_query: string,
searchType: string
): { results: SearchResult[]; totalResults: number | null } {
const now = new Date().toISOString();
const items = searchType === "news" ? data.news : data.organic;
if (!Array.isArray(items)) return { results: [], totalResults: null };
const results = items.map((item: any, idx: number) =>
makeResult(
"serper-search",
{
title: item.title,
url: item.link,
snippet: item.snippet || item.description,
published_at: item.date,
},
idx,
now
)
);
return {
results,
totalResults:
typeof data.searchParameters?.totalResults === "number"
? data.searchParameters.totalResults
: null,
};
}
function normalizeBraveResponse(
data: any,
_query: string,
searchType: string
): { results: SearchResult[]; totalResults: number | null } {
const now = new Date().toISOString();
// Brave news endpoint returns { results: [...] } directly,
// while web endpoint returns { web: { results: [...] } }
const container = searchType === "news" ? data.news || data : data.web;
const items = container?.results;
if (!Array.isArray(items)) return { results: [], totalResults: null };
const results = items.map((item: any, idx: number) =>
makeResult(
"brave-search",
{
title: item.title,
url: item.url,
snippet: item.description,
published_at: item.page_age || item.age,
favicon_url: item.meta_url?.favicon || item.favicon,
},
idx,
now
)
);
return { results, totalResults: container?.totalCount ?? null };
}
// ── Helpers ─────────────────────────────────────────────────────────────
function parseDomainFilter(domainFilter?: string[]): {
includes: string[];
excludes: string[];
} {
if (!domainFilter?.length) return { includes: [], excludes: [] };
const includes = domainFilter.filter((d) => !d.startsWith("-"));
const excludes = domainFilter.filter((d) => d.startsWith("-")).map((d) => d.slice(1));
return { includes, excludes };
}
// ── Provider Request Builders ───────────────────────────────────────────
interface SearchRequestParams {
query: string;
searchType: string;
maxResults: number;
token: string;
country?: string;
language?: string;
domainFilter?: string[];
}
function buildSerperRequest(
config: SearchProviderConfig,
params: SearchRequestParams
): { url: string; init: RequestInit } {
const endpoint = params.searchType === "news" ? "/news" : "/search";
const body: Record<string, unknown> = { q: params.query, num: params.maxResults };
if (params.country) body.gl = params.country.toLowerCase();
if (params.language) body.hl = params.language;
return {
url: `${config.baseUrl}${endpoint}`,
init: {
method: "POST",
headers: { "Content-Type": "application/json", "X-API-Key": params.token },
body: JSON.stringify(body),
},
};
}
function buildBraveRequest(
config: SearchProviderConfig,
params: SearchRequestParams
): { url: string; init: RequestInit } {
const endpoint = params.searchType === "news" ? "/news/search" : "/web/search";
const qp = new URLSearchParams({ q: params.query, count: String(params.maxResults) });
if (params.country) qp.set("country", params.country);
if (params.language) qp.set("search_lang", params.language);
return {
url: `${config.baseUrl}${endpoint}?${qp}`,
init: {
method: "GET",
headers: { Accept: "application/json", "X-Subscription-Token": params.token },
},
};
}
function buildPerplexityRequest(
config: SearchProviderConfig,
params: SearchRequestParams
): { url: string; init: RequestInit } {
const body: Record<string, unknown> = { query: params.query, max_results: params.maxResults };
if (params.country) body.country = params.country;
if (params.language) body.search_language_filter = [params.language];
if (params.domainFilter?.length) body.search_domain_filter = params.domainFilter;
return {
url: config.baseUrl,
init: {
method: "POST",
headers: { "Content-Type": "application/json", Authorization: `Bearer ${params.token}` },
body: JSON.stringify(body),
},
};
}
function buildExaRequest(
config: SearchProviderConfig,
params: SearchRequestParams
): { url: string; init: RequestInit } {
const { includes, excludes } = parseDomainFilter(params.domainFilter);
const body: Record<string, unknown> = {
query: params.query,
numResults: params.maxResults,
type: "auto",
text: true,
highlights: true,
};
if (includes.length) body.includeDomains = includes;
if (excludes.length) body.excludeDomains = excludes;
if (params.searchType === "news") body.category = "news";
return {
url: config.baseUrl,
init: {
method: "POST",
headers: { "Content-Type": "application/json", "x-api-key": params.token },
body: JSON.stringify(body),
},
};
}
function buildTavilyRequest(
config: SearchProviderConfig,
params: SearchRequestParams
): { url: string; init: RequestInit } {
const { includes, excludes } = parseDomainFilter(params.domainFilter);
const body: Record<string, unknown> = {
query: params.query,
max_results: params.maxResults,
topic: params.searchType === "news" ? "news" : "general",
};
if (includes.length) body.include_domains = includes;
if (excludes.length) body.exclude_domains = excludes;
if (params.country) body.country = params.country;
return {
url: config.baseUrl,
init: {
method: "POST",
headers: { "Content-Type": "application/json", Authorization: `Bearer ${params.token}` },
body: JSON.stringify(body),
},
};
}
function buildRequest(
config: SearchProviderConfig,
params: SearchRequestParams
): { url: string; init: RequestInit } {
if (config.id === "serper-search") return buildSerperRequest(config, params);
if (config.id === "brave-search") return buildBraveRequest(config, params);
if (config.id === "perplexity-search") return buildPerplexityRequest(config, params);
if (config.id === "exa-search") return buildExaRequest(config, params);
if (config.id === "tavily-search") return buildTavilyRequest(config, params);
// Fallback for future providers: POST with bearer auth
return {
url: config.baseUrl,
init: {
method: config.method,
headers: { "Content-Type": "application/json", Authorization: `Bearer ${params.token}` },
body: JSON.stringify({
query: params.query,
max_results: params.maxResults,
search_type: params.searchType,
}),
},
};
}
function normalizePerplexityResponse(
data: any,
_query: string,
_searchType: string
): { results: SearchResult[]; totalResults: number | null } {
const now = new Date().toISOString();
const items = data.results;
if (!Array.isArray(items)) return { results: [], totalResults: null };
const results = items.map((item: any, idx: number) =>
makeResult(
"perplexity-search",
{
title: item.title,
url: item.url,
snippet: item.snippet,
published_at: item.date || item.last_updated,
},
idx,
now
)
);
return { results, totalResults: results.length };
}
function normalizeExaResponse(
data: any,
_query: string,
_searchType: string
): { results: SearchResult[]; totalResults: number | null } {
const now = new Date().toISOString();
const items = data.results;
if (!Array.isArray(items)) return { results: [], totalResults: null };
const results = items.map((item: any, idx: number) =>
makeResult(
"exa-search",
{
title: item.title,
url: item.url,
snippet: item.highlights?.[0] || item.text?.slice(0, 300) || "",
score: item.score,
published_at: item.publishedDate,
favicon_url: item.favicon,
author: item.author,
image_url: item.image,
full_text: item.text,
text_format: "text",
},
idx,
now
)
);
return { results, totalResults: results.length };
}
function normalizeTavilyResponse(
data: any,
_query: string,
_searchType: string
): { results: SearchResult[]; totalResults: number | null } {
const now = new Date().toISOString();
const items = data.results;
if (!Array.isArray(items)) return { results: [], totalResults: null };
const results = items.map((item: any, idx: number) =>
makeResult(
"tavily-search",
{
title: item.title,
url: item.url,
snippet: item.content || "",
score: item.score,
published_at: item.published_date,
full_text: item.raw_content,
text_format: "text",
},
idx,
now
)
);
return { results, totalResults: results.length };
}
function normalizeResponse(
providerId: string,
data: any,
query: string,
searchType: string
): { results: SearchResult[]; totalResults: number | null } {
if (providerId === "serper-search") return normalizeSerperResponse(data, query, searchType);
if (providerId === "brave-search") return normalizeBraveResponse(data, query, searchType);
if (providerId === "perplexity-search")
return normalizePerplexityResponse(data, query, searchType);
if (providerId === "exa-search") return normalizeExaResponse(data, query, searchType);
if (providerId === "tavily-search") return normalizeTavilyResponse(data, query, searchType);
return { results: [], totalResults: null };
}
// ── Main Handler ────────────────────────────────────────────────────────
export async function handleSearch(options: SearchHandlerOptions): Promise<SearchHandlerResult> {
const {
query,
provider: providerId,
maxResults,
searchType,
country,
language,
domainFilter,
credentials,
alternateProvider,
alternateCredentials,
log,
} = options;
const startTime = Date.now();
// 1. Sanitize input
const { clean: cleanQuery, error: sanitizeError } = sanitizeQuery(query);
if (sanitizeError) {
return { success: false, status: 400, error: sanitizeError };
}
// 2. Use resolved provider from route (no re-resolution)
const primaryConfig = getSearchProvider(providerId);
if (!primaryConfig) {
return {
success: false,
status: 400,
error: `Unknown search provider: ${providerId}`,
};
}
// 3. Get alternate config for failover (pre-resolved by route)
const alternateConfig = alternateProvider ? getSearchProvider(alternateProvider) : null;
const requestParams = {
query: cleanQuery,
searchType,
maxResults,
country,
language,
domainFilter,
};
// 4. Try primary provider
const result = await tryProvider(primaryConfig, requestParams, credentials, startTime, log);
if (result.success) return result;
// 5. Failover to alternate (only for retriable errors and auto-select mode)
if (
alternateConfig &&
alternateCredentials &&
!NON_RETRIABLE.has(result.status || 0) &&
Date.now() - startTime < GLOBAL_TIMEOUT_MS
) {
if (log) {
log.warn(
"SEARCH",
`${primaryConfig.id} failed (${result.status}), trying ${alternateConfig.id}`
);
}
const fallbackResult = await tryProvider(
alternateConfig,
requestParams,
alternateCredentials,
startTime,
log
);
if (fallbackResult.success) return fallbackResult;
}
return result;
}
async function tryProvider(
config: SearchProviderConfig,
params: Omit<SearchRequestParams, "token">,
credentials: Record<string, any>,
globalStartTime: number,
log?: any
): Promise<SearchHandlerResult> {
const startTime = Date.now();
const token = credentials.apiKey || credentials.accessToken;
if (!token) {
return {
success: false,
status: 401,
error: `No credentials for search provider: ${config.id}`,
};
}
const { query, searchType, maxResults } = params;
const { url, init } = buildRequest(config, { ...params, token });
// Timeout: min of provider timeout and remaining global timeout
const remainingGlobal = GLOBAL_TIMEOUT_MS - (Date.now() - globalStartTime);
const timeout = Math.min(config.timeoutMs, Math.max(remainingGlobal, 1000));
const controller = new AbortController();
const timer = setTimeout(() => controller.abort(), timeout);
if (log) {
log.info("SEARCH", `${config.id} | query: "${query.slice(0, 80)}" | type: ${searchType}`);
}
try {
const response = await fetch(url, { ...init, signal: controller.signal });
clearTimeout(timer);
if (!response.ok) {
const errorText = await response.text();
if (log) {
log.error("SEARCH", `${config.id} error ${response.status}: ${errorText.slice(0, 200)}`);
}
saveCallLog({
method: config.method,
path: "/v1/search",
status: response.status,
model: config.id,
provider: config.id,
duration: Date.now() - startTime,
requestType: "search",
error: errorText.slice(0, 500),
requestBody: {
query: query.slice(0, 200),
search_type: searchType,
max_results: maxResults,
},
}).catch(() => {
/* non-critical — logging must not block search response */
});
return {
success: false,
status: response.status,
error: `Search provider ${config.id} returned ${response.status}`,
};
}
const data = await response.json();
const normalized = normalizeResponse(config.id, data, query, searchType);
// Enforce max_results — some providers return more than requested
const results = normalized.results.slice(0, maxResults);
const totalResults = normalized.totalResults;
const duration = Date.now() - startTime;
saveCallLog({
method: config.method,
path: "/v1/search",
status: 200,
model: config.id,
provider: config.id,
duration,
requestType: "search",
tokens: { prompt_tokens: 0, completion_tokens: 0 },
requestBody: { query: query.slice(0, 200), search_type: searchType, max_results: maxResults },
responseBody: { results_count: results.length, cached: false },
}).catch(() => {
/* non-critical — logging must not block search response */
});
return {
success: true,
data: {
provider: config.id,
query,
results,
answer: null,
usage: { queries_used: 1, search_cost_usd: config.costPerQuery },
metrics: {
response_time_ms: duration,
upstream_latency_ms: duration,
total_results_available: totalResults,
},
errors: [],
},
};
} catch (err: any) {
clearTimeout(timer);
const isTimeout = err.name === "AbortError";
if (log) {
log.error("SEARCH", `${config.id} ${isTimeout ? "timeout" : "fetch error"}: ${err.message}`);
}
saveCallLog({
method: config.method,
path: "/v1/search",
status: isTimeout ? 504 : 502,
model: config.id,
provider: config.id,
duration: Date.now() - startTime,
requestType: "search",
error: err.message,
requestBody: { query: query.slice(0, 200), search_type: searchType, max_results: maxResults },
}).catch(() => {
/* non-critical — logging must not block search response */
});
return {
success: false,
status: isTimeout ? 504 : 502,
error: `Search provider ${isTimeout ? "timeout" : "error"}: ${err.message}`,
};
}
}
@@ -0,0 +1,48 @@
import { describe, it, expect } from "vitest";
import {
MCP_TOOLS,
MCP_TOOL_MAP,
setRoutingStrategyInput,
setRoutingStrategyTool,
} from "../schemas/tools.ts";
describe("omniroute_set_routing_strategy MCP tool schema", () => {
it("should be registered in MCP_TOOLS", () => {
const tool = MCP_TOOLS.find((t) => t.name === "omniroute_set_routing_strategy");
expect(tool).toBeDefined();
expect(tool?.phase).toBe(2);
});
it("should be available in MCP_TOOL_MAP", () => {
expect(MCP_TOOL_MAP["omniroute_set_routing_strategy"]).toBeDefined();
});
it("should require write:combos scope", () => {
expect(setRoutingStrategyTool.scopes).toContain("write:combos");
});
it("should validate a standard strategy payload", () => {
const result = setRoutingStrategyInput.safeParse({
comboId: "my-combo",
strategy: "cost-optimized",
});
expect(result.success).toBe(true);
});
it("should validate auto strategy with autoRoutingStrategy", () => {
const result = setRoutingStrategyInput.safeParse({
comboId: "my-combo",
strategy: "auto",
autoRoutingStrategy: "latency",
});
expect(result.success).toBe(true);
});
it("should reject unknown strategy", () => {
const result = setRoutingStrategyInput.safeParse({
comboId: "my-combo",
strategy: "unknown-strategy",
});
expect(result.success).toBe(false);
});
});
+55 -7
View File
@@ -107,6 +107,7 @@ export const listCombosOutput = z.object({
"priority",
"weighted",
"round-robin",
"strict-random",
"random",
"least-used",
"cost-optimized",
@@ -470,7 +471,53 @@ export const setBudgetGuardTool: McpToolDefinition<
sourceEndpoints: ["/api/usage/budget"],
};
// --- Tool 11: omniroute_set_resilience_profile ---
// --- Tool 11: omniroute_set_routing_strategy ---
export const setRoutingStrategyInput = z.object({
comboId: z.string().describe("Combo ID or name to update"),
strategy: z
.enum([
"priority",
"weighted",
"round-robin",
"strict-random",
"random",
"least-used",
"cost-optimized",
"auto",
])
.describe("Routing strategy to apply"),
autoRoutingStrategy: z
.enum(["rules", "cost", "eco", "latency", "fast"])
.optional()
.describe("Optional strategy used by auto mode (only used when strategy='auto')"),
});
export const setRoutingStrategyOutput = z.object({
success: z.boolean(),
combo: z.object({
id: z.string(),
name: z.string(),
strategy: z.string(),
autoRoutingStrategy: z.string().nullable(),
}),
});
export const setRoutingStrategyTool: McpToolDefinition<
typeof setRoutingStrategyInput,
typeof setRoutingStrategyOutput
> = {
name: "omniroute_set_routing_strategy",
description:
"Updates a combo routing strategy (priority/weighted/auto/etc.) at runtime. Supports selecting the sub-strategy used by auto mode (rules/cost/latency).",
inputSchema: setRoutingStrategyInput,
outputSchema: setRoutingStrategyOutput,
scopes: ["write:combos"],
auditLevel: "full",
phase: 2,
sourceEndpoints: ["/api/combos", "/api/combos/{id}"],
};
// --- Tool 12: omniroute_set_resilience_profile ---
export const setResilienceProfileInput = z.object({
profile: z
.enum(["aggressive", "balanced", "conservative"])
@@ -502,7 +549,7 @@ export const setResilienceProfileTool: McpToolDefinition<
sourceEndpoints: ["/api/resilience"],
};
// --- Tool 12: omniroute_test_combo ---
// --- Tool 13: omniroute_test_combo ---
export const testComboInput = z.object({
comboId: z.string().describe("ID of the combo to test"),
testPrompt: z.string().max(500).describe("Short test prompt (max 500 chars)"),
@@ -540,7 +587,7 @@ export const testComboTool: McpToolDefinition<typeof testComboInput, typeof test
sourceEndpoints: ["/api/combos/test", "/v1/chat/completions"],
};
// --- Tool 13: omniroute_get_provider_metrics ---
// --- Tool 14: omniroute_get_provider_metrics ---
export const getProviderMetricsInput = z.object({
provider: z.string().describe("Provider name (e.g., 'claude', 'gemini-cli', 'codex')"),
});
@@ -583,7 +630,7 @@ export const getProviderMetricsTool: McpToolDefinition<
sourceEndpoints: ["/api/provider-metrics", "/api/resilience"],
};
// --- Tool 14: omniroute_best_combo_for_task ---
// --- Tool 15: omniroute_best_combo_for_task ---
export const bestComboForTaskInput = z.object({
taskType: z
.enum(["coding", "review", "planning", "analysis", "debugging", "documentation"])
@@ -628,7 +675,7 @@ export const bestComboForTaskTool: McpToolDefinition<
sourceEndpoints: ["/api/combos", "/api/combos/metrics", "/api/monitoring/health"],
};
// --- Tool 15: omniroute_explain_route ---
// --- Tool 16: omniroute_explain_route ---
export const explainRouteInput = z.object({
requestId: z.string().describe("Request ID from the X-Request-Id header"),
});
@@ -674,7 +721,7 @@ export const explainRouteTool: McpToolDefinition<
sourceEndpoints: [],
};
// --- Tool 16: omniroute_get_session_snapshot ---
// --- Tool 17: omniroute_get_session_snapshot ---
export const getSessionSnapshotInput = z.object({}).describe("No parameters required");
export const getSessionSnapshotOutput = z.object({
@@ -723,7 +770,7 @@ export const getSessionSnapshotTool: McpToolDefinition<
sourceEndpoints: ["/api/usage/analytics", "/api/telemetry/summary"],
};
// --- Tool 17: omniroute_sync_pricing ---
// --- Tool 18: omniroute_sync_pricing ---
export const syncPricingInput = z.object({
sources: z
.array(z.string())
@@ -775,6 +822,7 @@ export const MCP_TOOLS = [
// Phase 2: Advanced
simulateRouteTool,
setBudgetGuardTool,
setRoutingStrategyTool,
setResilienceProfileTool,
testComboTool,
getProviderMetricsTool,
+14
View File
@@ -25,6 +25,7 @@ import {
listModelsCatalogInput,
simulateRouteInput,
setBudgetGuardInput,
setRoutingStrategyInput,
setResilienceProfileInput,
testComboInput,
getProviderMetricsInput,
@@ -45,6 +46,7 @@ import {
import {
handleSimulateRoute,
handleSetBudgetGuard,
handleSetRoutingStrategy,
handleSetResilienceProfile,
handleTestCombo,
handleGetProviderMetrics,
@@ -593,6 +595,18 @@ export function createMcpServer(): McpServer {
)
);
server.registerTool(
"omniroute_set_routing_strategy",
{
description:
"Updates combo routing strategy at runtime (priority/weighted/round-robin/auto/etc.)",
inputSchema: setRoutingStrategyInput,
},
withScopeEnforcement("omniroute_set_routing_strategy", (args) =>
handleSetRoutingStrategy(setRoutingStrategyInput.parse(args))
)
);
server.registerTool(
"omniroute_set_resilience_profile",
{
+111 -7
View File
@@ -1,16 +1,18 @@
/**
* OmniRoute MCP Advanced Tools 8 intelligence tools that differentiate
* OmniRoute MCP Advanced Tools 10 intelligence tools that differentiate
* OmniRoute from all other AI gateways.
*
* Tools:
* 1. omniroute_simulate_route Dry-run routing simulation
* 2. omniroute_set_budget_guard Session budget with degrade/block/alert
* 3. omniroute_set_resilience_profile Circuit breaker/retry profiles
* 4. omniroute_test_combo Live test each provider in a combo
* 5. omniroute_get_provider_metrics Detailed per-provider metrics
* 6. omniroute_best_combo_for_task AI-powered combo recommendation
* 7. omniroute_explain_route Post-hoc routing decision explainer
* 8. omniroute_get_session_snapshot Full session state snapshot
* 3. omniroute_set_routing_strategy Runtime strategy switch for combos
* 4. omniroute_set_resilience_profile Circuit breaker/retry profiles
* 5. omniroute_test_combo Live test each provider in a combo
* 6. omniroute_get_provider_metrics Detailed per-provider metrics
* 7. omniroute_best_combo_for_task AI-powered combo recommendation
* 8. omniroute_explain_route Post-hoc routing decision explainer
* 9. omniroute_get_session_snapshot Full session state snapshot
* 10. omniroute_sync_pricing Sync provider pricing from external source
*/
import { logToolCall } from "../audit.ts";
@@ -335,6 +337,108 @@ export async function handleSetBudgetGuard(args: {
}
}
export async function handleSetRoutingStrategy(args: {
comboId: string;
strategy:
| "priority"
| "weighted"
| "round-robin"
| "strict-random"
| "random"
| "least-used"
| "cost-optimized"
| "auto";
autoRoutingStrategy?: "rules" | "cost" | "eco" | "latency" | "fast";
}) {
const start = Date.now();
try {
const combos = normalizeCombosResponse(await apiFetch("/api/combos"));
const combo = combos.find(
(comboEntry) =>
toString(comboEntry.id) === args.comboId || toString(comboEntry.name) === args.comboId
);
if (!combo) {
const msg = `Combo '${args.comboId}' not found`;
await logToolCall(
"omniroute_set_routing_strategy",
args,
null,
Date.now() - start,
false,
msg
);
return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
}
const comboId = toString(combo.id);
if (!comboId) {
const msg = "Matched combo has no id";
await logToolCall(
"omniroute_set_routing_strategy",
args,
null,
Date.now() - start,
false,
msg
);
return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
}
const comboData = toRecord(combo.data);
const currentConfig = toRecord(
Object.keys(toRecord(combo.config)).length > 0 ? combo.config : comboData.config
);
let nextConfig: JsonRecord | undefined = undefined;
if (args.strategy === "auto" && args.autoRoutingStrategy) {
const currentAutoConfig = toRecord(currentConfig.auto);
nextConfig = {
...currentConfig,
auto: {
...currentAutoConfig,
routingStrategy: args.autoRoutingStrategy,
},
};
}
const payload: JsonRecord = { strategy: args.strategy };
if (nextConfig && Object.keys(nextConfig).length > 0) {
payload.config = nextConfig;
}
const updatedCombo = toRecord(
await apiFetch(`/api/combos/${encodeURIComponent(comboId)}`, {
method: "PUT",
body: JSON.stringify(payload),
})
);
const updatedConfig = toRecord(updatedCombo.config);
const resolvedAutoStrategy =
toString(toRecord(updatedConfig.auto).routingStrategy) ||
(args.strategy === "auto" ? (args.autoRoutingStrategy ?? "rules") : "");
const result = {
success: true,
combo: {
id: toString(updatedCombo.id, comboId),
name: toString(updatedCombo.name, toString(combo.name, comboId)),
strategy: toString(updatedCombo.strategy, args.strategy),
autoRoutingStrategy:
toString(updatedCombo.strategy, args.strategy) === "auto" ? resolvedAutoStrategy : null,
},
};
await logToolCall("omniroute_set_routing_strategy", args, result, Date.now() - start, true);
return { content: [{ type: "text" as const, text: JSON.stringify(result, null, 2) }] };
} catch (err) {
const msg = err instanceof Error ? err.message : String(err);
await logToolCall("omniroute_set_routing_strategy", args, null, Date.now() - start, false, msg);
return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
}
}
export async function handleSetResilienceProfile(args: {
profile: "aggressive" | "balanced" | "conservative";
}) {
+36 -3
View File
@@ -20,6 +20,7 @@ import {
import { getTaskFitness } from "./taskFitness";
import { getModePack } from "./modePacks";
import { getSelfHealingManager } from "./selfHealing";
import { classifyPromptIntent } from "../intentClassifier";
export interface AutoComboConfig {
id: string;
@@ -30,6 +31,8 @@ export interface AutoComboConfig {
modePack?: string;
budgetCap?: number; // max cost per request in USD
explorationRate: number; // 0.05 = 5% exploratory
/** If set, RouterStrategy name to use for selection ('rules' | 'cost' | 'latency') */
routerStrategy?: string;
}
export interface SelectionResult {
@@ -43,14 +46,44 @@ export interface SelectionResult {
/**
* Select the best provider from an auto-combo pool.
*
* @param config - AutoCombo configuration
* @param candidates - Provider candidates to score
* @param taskType - Task type hint. When "default" or omitted, the engine will attempt
* to infer the intent from `promptMessages` using multilingual classification.
* @param promptMessages - Optional raw messages for intent classification
*/
export function selectProvider(
config: AutoComboConfig,
candidates: ProviderCandidate[],
taskType: string = "default"
taskType: string = "default",
promptMessages?: Array<{ role: string; content: unknown }>
): SelectionResult {
const healer = getSelfHealingManager();
// ── Intent classification (ClawRouter Feature #10/11) ────────────────────
// When taskType is generic ('default'), attempt to classify the prompt intent
// using the multilingual intentClassifier for better task fitness scoring.
let effectiveTaskType = taskType;
if ((taskType === "default" || taskType === "") && promptMessages?.length) {
// Extract text from last user message for classification
const lastUserMsg = [...promptMessages].reverse().find((m) => m.role === "user");
if (lastUserMsg) {
const text =
typeof lastUserMsg.content === "string"
? lastUserMsg.content
: Array.isArray(lastUserMsg.content)
? (lastUserMsg.content as Array<{ type: string; text?: string }>)
.filter((b) => b.type === "text")
.map((b) => b.text || "")
.join(" ")
: "";
if (text.length > 10) {
const intent = classifyPromptIntent(text);
effectiveTaskType = intent; // 'code' | 'reasoning' | 'simple' | 'medium'
}
}
}
// Resolve weights from mode pack or config
let weights = config.weights;
if (config.modePack) {
@@ -80,8 +113,8 @@ export function selectProvider(
excluded.length = 0;
}
// Score all providers
const scored = scorePool(pool, taskType, weights, getTaskFitness);
// Score all providers (using classified intent if available)
const scored = scorePool(pool, effectiveTaskType, weights, getTaskFitness);
// Apply self-healing re-evaluation with actual scores
const finalCandidates = scored.filter((s) => {
@@ -0,0 +1,159 @@
/**
* RouterStrategy Pluggable Routing Strategy System
*
* Inspired by ClawRouter commit 14c83c258 "refactor: extract routing into pluggable RouterStrategy system".
* Provides a RouterStrategy interface and two built-in implementations:
* - RulesStrategy (default): wraps the existing 6-factor scoring engine
* - CostStrategy: always picks cheapest available model
*/
import type { ProviderCandidate, ScoredProvider } from "./scoring.ts";
import { scorePool } from "./scoring.ts";
import { getTaskFitness } from "./taskFitness.ts";
export interface RoutingContext {
taskType: string;
requestHasTools?: boolean;
requestHasVision?: boolean;
estimatedInputTokens?: number;
}
export interface RoutingDecision {
provider: string;
model: string;
strategy: string;
reason: string;
candidatesConsidered: number;
finalScore: number;
}
export interface RouterStrategy {
readonly name: string;
readonly description: string;
select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision;
}
// ── RulesStrategy: wraps 6-factor scoring engine ────────────────────────────
class RulesStrategyImpl implements RouterStrategy {
readonly name = "rules";
readonly description =
"6-factor weighted scoring: quota, health, cost, latency, taskFit, stability";
select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
const eligible = pool.filter((c) => c.circuitBreakerState !== "OPEN");
const ranked: ScoredProvider[] = scorePool(
eligible.length > 0 ? eligible : pool,
context.taskType,
undefined,
getTaskFitness
);
const best = ranked[0];
if (!best) throw new Error("[RulesStrategy] No candidates to score");
return {
provider: best.provider,
model: best.model,
strategy: this.name,
reason: `RulesStrategy: score=${best.score.toFixed(3)} (quota=${best.factors.quota.toFixed(2)}, health=${best.factors.health.toFixed(2)}, cost=${best.factors.costInv.toFixed(2)}, taskFit=${best.factors.taskFit.toFixed(2)})`,
candidatesConsidered: ranked.length,
finalScore: best.score,
};
}
}
// ── CostStrategy: always picks cheapest healthy provider ─────────────────────
class CostStrategyImpl implements RouterStrategy {
readonly name = "cost";
readonly description = "Always selects cheapest available provider (by costPer1MTokens)";
select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
const healthy = pool.filter((c) => c.circuitBreakerState !== "OPEN");
const candidates = healthy.length > 0 ? healthy : pool;
const sorted = [...candidates].sort((a, b) => a.costPer1MTokens - b.costPer1MTokens);
const best = sorted[0];
if (!best) throw new Error("[CostStrategy] No candidates available");
return {
provider: best.provider,
model: best.model,
strategy: this.name,
reason: `CostStrategy: cheapest at $${best.costPer1MTokens.toFixed(3)}/1M tokens`,
candidatesConsidered: candidates.length,
finalScore: best.costPer1MTokens === 0 ? 1.0 : 1 / best.costPer1MTokens,
};
}
}
// ── LatencyStrategy: prioritize low latency + reliability ───────────────────
class LatencyStrategyImpl implements RouterStrategy {
readonly name = "latency";
readonly description = "Prioritizes lowest p95 latency with reliability weighting";
select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
const healthy = pool.filter((c) => c.circuitBreakerState !== "OPEN");
const candidates = healthy.length > 0 ? healthy : pool;
const sorted = [...candidates].sort((a, b) => {
const aPenalty = a.errorRate * 1000;
const bPenalty = b.errorRate * 1000;
return a.p95LatencyMs + aPenalty - (b.p95LatencyMs + bPenalty);
});
const best = sorted[0];
if (!best) throw new Error("[LatencyStrategy] No candidates available");
const latencyScore = best.p95LatencyMs > 0 ? Math.max(0.001, 10_000 / best.p95LatencyMs) : 1;
const reliability = Math.max(0, 1 - best.errorRate);
const finalScore = latencyScore * 0.7 + reliability * 0.3;
return {
provider: best.provider,
model: best.model,
strategy: this.name,
reason: `LatencyStrategy: p95=${best.p95LatencyMs}ms, errorRate=${(best.errorRate * 100).toFixed(2)}%`,
candidatesConsidered: candidates.length,
finalScore,
};
}
}
// ── Registry ──────────────────────────────────────────────────────────────────
const strategyRegistry = new Map<string, RouterStrategy>();
const rulesStrategy = new RulesStrategyImpl();
const costStrategy = new CostStrategyImpl();
const latencyStrategy = new LatencyStrategyImpl();
strategyRegistry.set("rules", rulesStrategy);
strategyRegistry.set("cost", costStrategy);
strategyRegistry.set("eco", costStrategy); // alias
strategyRegistry.set("latency", latencyStrategy);
strategyRegistry.set("fast", latencyStrategy); // alias
export function getStrategy(name: string): RouterStrategy {
const strategy = strategyRegistry.get(name);
if (!strategy) {
console.warn(`[RouterStrategy] Strategy '${name}' not found, falling back to 'rules'`);
return rulesStrategy;
}
return strategy;
}
export function registerStrategy(name: string, strategy: RouterStrategy): void {
if (strategyRegistry.has(name)) {
console.warn(`[RouterStrategy] Overwriting strategy '${name}'`);
}
strategyRegistry.set(name, strategy);
}
export function listStrategies(): Array<{ name: string; description: string }> {
return [...strategyRegistry.entries()].map(([name, s]) => ({ name, description: s.description }));
}
export function selectWithStrategy(
pool: ProviderCandidate[],
context: RoutingContext,
strategyName = "rules"
): RoutingDecision {
return getStrategy(strategyName).select(pool, context);
}
+2 -1
View File
@@ -74,7 +74,8 @@ export function calculateScore(factors: ScoringFactors, weights: ScoringWeights)
weights.costInv * factors.costInv +
weights.latencyInv * factors.latencyInv +
weights.taskFit * factors.taskFit +
weights.stability * factors.stability
weights.stability * factors.stability +
weights.tierPriority * factors.tierPriority
);
}
@@ -24,10 +24,23 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
"deepseek-coder": 0.9,
"deepseek-v3": 0.85,
"deepseek-r1": 0.88,
"deepseek-chat": 0.84, // DeepSeek V3.2 Chat — strong code performance
"deepseek-v3.2": 0.86, // Explicit V3.2 alias
qwen: 0.78,
llama: 0.72,
mistral: 0.75,
mixtral: 0.77,
// Grok-4 fast — good code, ultra-low latency (1143ms P50)
"grok-4-fast": 0.8,
"grok-4": 0.82,
"grok-3": 0.8,
// Kimi K2.5 — agentic with tool calling, good at code tasks
"kimi-k2": 0.82,
// GLM-5 — Z.AI model with 128k output
"glm-5": 0.78,
// MiniMax M2.5 — reasoning support helps complex code
"minimax-m2.5": 0.75,
"minimax-m2": 0.72,
},
review: {
"claude-sonnet": 0.92,
@@ -58,10 +71,15 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
"claude-sonnet": 0.92,
"gemini-2.5-pro": 0.95,
"gemini-pro": 0.88,
"gemini-3.1-pro": 0.95, // Gemini 3.1 Pro — 1M context, ideal for long analysis
"gpt-4o": 0.85,
o1: 0.9,
o3: 0.93,
"deepseek-r1": 0.88,
"deepseek-chat": 0.8,
"kimi-k2": 0.82, // Kimi K2.5 agentic — good for analysis
"glm-5": 0.78, // GLM-5 with 128k output for long analysis
"minimax-m2.5": 0.76,
},
debugging: {
"claude-sonnet": 0.93,
@@ -87,8 +105,17 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
"claude-opus": 0.85,
"gpt-4o": 0.85,
"gemini-pro": 0.8,
"gemini-3.1-pro": 0.85,
"deepseek-v3": 0.75,
"deepseek-chat": 0.74,
"gemini-flash": 0.72,
// New models from ClawRouter analysis (2026-03-17):
"grok-4-fast": 0.72, // ultra-fast, suitable for all tasks
"grok-4": 0.74,
"grok-3": 0.73,
"kimi-k2": 0.76, // agentic multi-step tasks
"glm-5": 0.7,
"minimax-m2.5": 0.7,
},
};
+331 -2
View File
@@ -5,19 +5,37 @@
import { checkFallbackError, formatRetryAfter, getProviderProfile } from "./accountFallback.ts";
import { unavailableResponse } from "../utils/error.ts";
import { recordComboRequest, getComboMetrics } from "./comboMetrics.ts";
import { recordComboIntent, recordComboRequest, getComboMetrics } from "./comboMetrics.ts";
import { resolveComboConfig, getDefaultComboConfig } from "./comboConfig.ts";
import * as semaphore from "./rateLimitSemaphore.ts";
import { getCircuitBreaker } from "../../src/shared/utils/circuitBreaker";
import { fisherYatesShuffle, getNextFromDeck } from "../../src/shared/utils/shuffleDeck";
import { parseModel } from "./model.ts";
import { applyComboAgentMiddleware, injectModelTag } from "./comboAgentMiddleware.ts";
import { classifyWithConfig, DEFAULT_INTENT_CONFIG } from "./intentClassifier.ts";
import { selectProvider as selectAutoProvider } from "./autoCombo/engine.ts";
import { selectWithStrategy } from "./autoCombo/routerStrategy.ts";
import { DEFAULT_WEIGHTS, scorePool } from "./autoCombo/scoring.ts";
import { supportsToolCalling } from "./modelCapabilities.ts";
// Status codes that should mark semaphore + record circuit breaker failures
const TRANSIENT_FOR_BREAKER = [429, 502, 503, 504];
const MAX_COMBO_DEPTH = 3;
// Bootstrap defaults from ClawRouter benchmark (used when no local latency history exists yet)
const DEFAULT_MODEL_P95_MS = {
"grok-4-fast-non-reasoning": 1143,
"grok-4-1-fast-non-reasoning": 1244,
"gemini-2.5-flash": 1238,
"kimi-k2.5": 1646,
"gpt-4o-mini": 2764,
"claude-sonnet-4.6": 4000,
"claude-opus-4.6": 6000,
"deepseek-chat": 2000,
};
const MIN_HISTORY_SAMPLES = 10;
// In-memory atomic counter per combo for round-robin distribution
// Resets on server restart (by design — no stale state)
const rrCounters = new Map();
@@ -202,6 +220,193 @@ function sortModelsByUsage(models, comboName) {
return withUsage.map((e) => e.modelStr);
}
function toTextContent(content) {
if (typeof content === "string") return content;
if (!Array.isArray(content)) return "";
return content
.map((part) => {
if (!part || typeof part !== "object") return "";
if (typeof part.text === "string") return part.text;
return "";
})
.join("\n");
}
function extractPromptForIntent(body) {
if (!body || typeof body !== "object") return "";
const fromMessages = Array.isArray(body.messages)
? [...body.messages].reverse().find((m) => m && typeof m === "object" && m.role === "user")
: null;
if (fromMessages) return toTextContent(fromMessages.content);
if (typeof body.input === "string") return body.input;
if (Array.isArray(body.input)) {
const text = body.input
.map((item) => {
if (!item || typeof item !== "object") return "";
if (typeof item.content === "string") return item.content;
if (typeof item.text === "string") return item.text;
return "";
})
.filter(Boolean)
.join("\n");
if (text) return text;
}
if (typeof body.prompt === "string") return body.prompt;
return "";
}
function mapIntentToTaskType(intent) {
switch (intent) {
case "code":
return "coding";
case "reasoning":
return "analysis";
case "simple":
return "default";
case "medium":
default:
return "default";
}
}
function toStringArray(input) {
if (Array.isArray(input)) {
return input.map((v) => (typeof v === "string" ? v.trim() : "")).filter(Boolean);
}
if (typeof input === "string") {
return input
.split(",")
.map((v) => v.trim())
.filter(Boolean);
}
return [];
}
function getIntentConfig(settings, combo) {
const comboIntentConfig =
combo?.autoConfig?.intentConfig ||
combo?.config?.auto?.intentConfig ||
combo?.config?.intentConfig ||
{};
return {
...DEFAULT_INTENT_CONFIG,
...comboIntentConfig,
...(typeof settings?.intentDetectionEnabled === "boolean"
? { enabled: settings.intentDetectionEnabled }
: {}),
...(Number.isFinite(Number(settings?.intentSimpleMaxWords))
? { simpleMaxWords: Number(settings.intentSimpleMaxWords) }
: {}),
...(toStringArray(settings?.intentExtraCodeKeywords).length > 0
? { extraCodeKeywords: toStringArray(settings.intentExtraCodeKeywords) }
: {}),
...(toStringArray(settings?.intentExtraReasoningKeywords).length > 0
? { extraReasoningKeywords: toStringArray(settings.intentExtraReasoningKeywords) }
: {}),
...(toStringArray(settings?.intentExtraSimpleKeywords).length > 0
? { extraSimpleKeywords: toStringArray(settings.intentExtraSimpleKeywords) }
: {}),
};
}
function getBootstrapLatencyMs(modelId) {
const normalized = String(modelId || "").toLowerCase();
return DEFAULT_MODEL_P95_MS[normalized] ?? 1500;
}
async function buildAutoCandidates(modelStrings, comboName) {
const metrics = getComboMetrics(comboName);
const { getPricingForModel } = await import("../../src/lib/localDb");
let historicalLatencyStats = {};
try {
const { getModelLatencyStats } = await import("../../src/lib/usageDb");
historicalLatencyStats = await getModelLatencyStats({
windowHours: 24,
minSamples: 3,
maxRows: 10000,
});
} catch {
// keep empty stats — auto-combo will use runtime + bootstrap signals
}
const candidates = await Promise.all(
modelStrings.map(async (modelStr) => {
const parsed = parseModel(modelStr);
const provider = parsed.provider || parsed.providerAlias || "unknown";
const model = parsed.model || modelStr;
const historicalKey = `${provider}/${model}`;
const historicalModelMetric = historicalLatencyStats[historicalKey] || null;
const historicalTotal = Number(historicalModelMetric?.totalRequests);
const hasHistoricalSignal =
Number.isFinite(historicalTotal) && historicalTotal >= MIN_HISTORY_SAMPLES;
let costPer1MTokens = 1;
try {
const pricing = await getPricingForModel(provider, model);
const inputPrice = Number(pricing?.input);
if (Number.isFinite(inputPrice) && inputPrice >= 0) {
costPer1MTokens = inputPrice;
}
} catch {
// keep default cost
}
const modelMetric = metrics?.byModel?.[modelStr] || null;
const avgLatency = Number(modelMetric?.avgLatencyMs);
const successRate = Number(modelMetric?.successRate);
const historicalP95Latency = Number(historicalModelMetric?.p95LatencyMs);
const historicalStdDev = Number(historicalModelMetric?.latencyStdDev);
const historicalSuccessRate = Number(historicalModelMetric?.successRate); // 0..1
const p95LatencyMs = hasHistoricalSignal
? Number.isFinite(historicalP95Latency) && historicalP95Latency > 0
? historicalP95Latency
: getBootstrapLatencyMs(model)
: Number.isFinite(avgLatency) && avgLatency > 0
? avgLatency
: getBootstrapLatencyMs(model);
const errorRate = hasHistoricalSignal
? Number.isFinite(historicalSuccessRate) &&
historicalSuccessRate >= 0 &&
historicalSuccessRate <= 1
? 1 - historicalSuccessRate
: 0.05
: Number.isFinite(successRate) && successRate >= 0 && successRate <= 100
? 1 - successRate / 100
: 0.05;
const latencyStdDev =
hasHistoricalSignal && Number.isFinite(historicalStdDev) && historicalStdDev > 0
? Math.max(10, historicalStdDev)
: Math.max(10, p95LatencyMs * 0.1);
const breakerStateRaw = getCircuitBreaker(`combo:${modelStr}`)?.getStatus?.()?.state;
const circuitBreakerState =
breakerStateRaw === "OPEN" || breakerStateRaw === "HALF_OPEN" ? breakerStateRaw : "CLOSED";
return {
provider,
model,
quotaRemaining: 100,
quotaTotal: 100,
circuitBreakerState,
costPer1MTokens,
p95LatencyMs,
latencyStdDev,
errorRate,
accountTier: "standard",
quotaResetIntervalSecs: 86400,
};
})
);
return candidates;
}
/**
* Handle combo chat with fallback
* Supports all 6 strategies: priority, weighted, round-robin, random, least-used, cost-optimized
@@ -316,7 +521,131 @@ export async function handleComboChat({
}
// Apply strategy-specific ordering
if (strategy === "strict-random") {
if (strategy === "auto") {
const requestHasTools = Array.isArray(body?.tools) && body.tools.length > 0;
let eligibleModels = [...orderedModels];
if (requestHasTools) {
const filtered = eligibleModels.filter((m) => supportsToolCalling(m));
if (filtered.length > 0) {
eligibleModels = filtered;
} else {
log.warn(
"COMBO",
"Auto strategy: all candidates filtered by tool-calling policy, falling back to full pool"
);
}
}
const prompt = extractPromptForIntent(body);
const systemPrompt =
typeof combo?.system_message === "string" ? combo.system_message : undefined;
const intentConfig = getIntentConfig(settings, combo);
const intent = classifyWithConfig(prompt, intentConfig, systemPrompt);
recordComboIntent(combo.name, intent);
const taskType = mapIntentToTaskType(intent);
const autoConfigSource = combo?.autoConfig || combo?.config?.auto || combo?.config || {};
const routingStrategy =
typeof autoConfigSource.routingStrategy === "string"
? autoConfigSource.routingStrategy
: typeof autoConfigSource.strategyName === "string"
? autoConfigSource.strategyName
: "rules";
const candidatePool = Array.isArray(autoConfigSource.candidatePool)
? autoConfigSource.candidatePool
: [
...new Set(
eligibleModels.map((m) => {
const parsed = parseModel(m);
return parsed.provider || parsed.providerAlias || "unknown";
})
),
];
const weights =
autoConfigSource.weights && typeof autoConfigSource.weights === "object"
? autoConfigSource.weights
: DEFAULT_WEIGHTS;
const explorationRate = Number.isFinite(Number(autoConfigSource.explorationRate))
? Number(autoConfigSource.explorationRate)
: 0.05;
const budgetCap = Number.isFinite(Number(autoConfigSource.budgetCap))
? Number(autoConfigSource.budgetCap)
: undefined;
const modePack =
typeof autoConfigSource.modePack === "string" ? autoConfigSource.modePack : undefined;
const candidates = await buildAutoCandidates(eligibleModels, combo.name);
if (candidates.length > 0) {
let selectedProvider = null;
let selectedModel = null;
let selectionReason = "";
if (routingStrategy !== "rules") {
try {
const decision = selectWithStrategy(
candidates,
{ taskType, requestHasTools },
routingStrategy
);
selectedProvider = decision.provider;
selectedModel = decision.model;
selectionReason = decision.reason;
} catch (err) {
log.warn(
"COMBO",
`Auto strategy '${routingStrategy}' failed (${err?.message || "unknown"}), falling back to rules`
);
}
}
if (!selectedProvider || !selectedModel) {
const selection = selectAutoProvider(
{
id: combo.id || combo.name,
name: combo.name,
type: "auto",
candidatePool,
weights,
modePack,
budgetCap,
explorationRate,
},
candidates,
taskType
);
selectedProvider = selection.provider;
selectedModel = selection.model;
selectionReason = `score=${selection.score.toFixed(3)}${selection.isExploration ? " (exploration)" : ""}`;
}
const modelLookup = new Map();
for (const modelStr of eligibleModels) {
const parsed = parseModel(modelStr);
const provider = parsed.provider || parsed.providerAlias || "unknown";
const modelId = parsed.model || modelStr;
modelLookup.set(`${provider}/${modelId}`, modelStr);
}
const ranked = scorePool(candidates, taskType, weights)
.map((r) => modelLookup.get(`${r.provider}/${r.model}`) || `${r.provider}/${r.model}`)
.filter(Boolean);
const selectedModelStr =
modelLookup.get(`${selectedProvider}/${selectedModel}`) ||
`${selectedProvider}/${selectedModel}`;
orderedModels = [...new Set([selectedModelStr, ...ranked, ...eligibleModels])];
log.info(
"COMBO",
`Auto selection: ${selectedModelStr} | intent=${intent} task=${taskType} | strategy=${routingStrategy} | ${selectionReason}`
);
} else {
log.warn("COMBO", "Auto strategy has no candidates, keeping default ordering");
}
} else if (strategy === "strict-random") {
const selectedId = await getNextFromDeck(`combo:${combo.name}`, orderedModels);
// Put selected model first so the fallback loop tries it first
const rest = orderedModels.filter((m) => m !== selectedId);
+19
View File
@@ -123,6 +123,20 @@ export function applyToolFilter(
});
}
/**
* Strip all <omniModel> tags from message content before forwarding to the provider.
* The tag is an internal OmniRoute marker; providers must never see it or their
* cache will treat every tagged request as a new session (#454).
*/
export function stripModelTags(messages: Message[]): Message[] {
return messages.map((msg) => {
if (typeof msg.content === "string" && CACHE_TAG_PATTERN.test(msg.content)) {
return { ...msg, content: msg.content.replace(CACHE_TAG_PATTERN, "").trimEnd() };
}
return msg;
});
}
// ── Main Middleware ──────────────────────────────────────────────────────────
/**
@@ -158,6 +172,11 @@ export function applyComboAgentMiddleware(
comboConfig.tool_filter_regex
);
// 4. Strip internal <omniModel> tags before forwarding to provider (#454)
// These tags are OmniRoute-internal markers and must never reach the provider
// since providers would treat each tagged request as a new cache session.
messages = stripModelTags(messages);
return {
body: {
...body,
+27
View File
@@ -21,6 +21,7 @@ interface ComboMetricsEntry {
totalLatencyMs: number;
strategy: string;
lastUsedAt: string | null;
intentCounts: Record<string, number>;
byModel: Record<string, ModelMetrics>;
}
@@ -69,6 +70,7 @@ export function recordComboRequest(
totalLatencyMs: 0,
strategy,
lastUsedAt: null,
intentCounts: {},
byModel: {},
});
}
@@ -131,6 +133,7 @@ export function getComboMetrics(comboName: string): ComboMetricsView | null {
combo.totalRequests > 0 ? Math.round((combo.totalSuccesses / combo.totalRequests) * 100) : 0,
fallbackRate:
combo.totalRequests > 0 ? Math.round((combo.totalFallbacks / combo.totalRequests) * 100) : 0,
intentCounts: { ...combo.intentCounts },
byModel: Object.fromEntries(
Object.entries(combo.byModel).map(([model, m]) => [
model,
@@ -156,6 +159,30 @@ export function getAllComboMetrics(): Record<string, ComboMetricsView | null> {
return result;
}
/**
* Record detected prompt intent for a combo (used by multilingual routing analytics).
*/
export function recordComboIntent(comboName: string, intent: string): void {
if (!metrics.has(comboName)) {
metrics.set(comboName, {
totalRequests: 0,
totalSuccesses: 0,
totalFailures: 0,
totalFallbacks: 0,
totalLatencyMs: 0,
strategy: "priority",
lastUsedAt: null,
intentCounts: {},
byModel: {},
});
}
const combo = metrics.get(comboName);
if (!combo) return;
const key = String(intent || "unknown");
combo.intentCounts[key] = (combo.intentCounts[key] || 0) + 1;
}
/**
* Reset metrics for a specific combo
*/
+103
View File
@@ -0,0 +1,103 @@
/**
* Emergency Fallback Budget Exhaustion Redirect
*
* When a request fails due to budget exhaustion (HTTP 402 or budget keywords
* in the error body), optionally redirect to a free-tier model
* (default provider/model: nvidia + openai/gpt-oss-120b at $0.00/M tokens).
*
* Inspired by ClawRouter: "gpt-oss-120b costs nothing and serves as
* automatic fallback when wallet is empty."
*/
export interface EmergencyFallbackConfig {
enabled: boolean;
provider: string;
model: string;
triggerOn402: boolean;
triggerOnBudgetKeywords: boolean;
budgetKeywords: string[];
/** Skip fallback for tool requests (gpt-oss-120b may not support structured tool calling) */
skipForToolRequests: boolean;
maxOutputTokens: number;
}
export const EMERGENCY_FALLBACK_CONFIG: EmergencyFallbackConfig = {
enabled: true,
provider: "nvidia",
model: "openai/gpt-oss-120b",
triggerOn402: true,
triggerOnBudgetKeywords: true,
budgetKeywords: [
"insufficient funds",
"insufficient_funds",
"budget exceeded",
"budget_exceeded",
"quota exceeded",
"quota_exceeded",
"billing",
"payment required",
"out of credits",
"no credits",
"credit limit",
"spending limit",
"saldo insuficiente",
"limite de gastos",
"cota excedida",
],
skipForToolRequests: true,
maxOutputTokens: 4096,
};
export interface FallbackDecision {
shouldFallback: true;
reason: string;
provider: string;
model: string;
maxOutputTokens: number;
}
export interface NoFallbackDecision {
shouldFallback: false;
reason: string;
}
export type FallbackResult = FallbackDecision | NoFallbackDecision;
export function shouldUseFallback(
status: number,
errorBody: string,
requestHasTools: boolean,
config: EmergencyFallbackConfig = EMERGENCY_FALLBACK_CONFIG
): FallbackResult {
if (!config.enabled) return { shouldFallback: false, reason: "emergency fallback disabled" };
if (config.skipForToolRequests && requestHasTools) {
return { shouldFallback: false, reason: "skipped: request has tools" };
}
if (config.triggerOn402 && status === 402) {
return {
shouldFallback: true,
reason: `HTTP 402 → emergency fallback to ${config.provider}/${config.model}`,
provider: config.provider,
model: config.model,
maxOutputTokens: config.maxOutputTokens,
};
}
if (config.triggerOnBudgetKeywords && errorBody) {
const lowerBody = errorBody.toLowerCase();
const matched = config.budgetKeywords.find((kw) => lowerBody.includes(kw.toLowerCase()));
if (matched) {
return {
shouldFallback: true,
reason: `Budget error detected ('${matched}') → emergency fallback to ${config.provider}/${config.model}`,
provider: config.provider,
model: config.model,
maxOutputTokens: config.maxOutputTokens,
};
}
}
return { shouldFallback: false, reason: "no budget error detected" };
}
export function isFallbackDecision(result: FallbackResult): result is FallbackDecision {
return result.shouldFallback === true;
}
+375
View File
@@ -0,0 +1,375 @@
/**
* Multilingual Intent Detection for AutoCombo
*
* Classifies prompts as: code | reasoning | simple | medium
* using keywords in 9 languages (EN, PT-BR, ES, ZH, JA, RU, DE, KO, AR).
*
* Inspired by ClawRouter (BlockRunAI) multilingual routing system.
* Execution: purely synchronous, <1ms, no I/O.
*/
export type IntentType = "code" | "reasoning" | "simple" | "medium";
export const CODE_KEYWORDS: readonly string[] = [
// English
"function",
"class",
"import",
"def",
"SELECT",
"async",
"await",
"const",
"let",
"var",
"return",
"```",
"algorithm",
"compile",
"debug",
"refactor",
"typescript",
"python",
"javascript",
"code",
"implement",
"write a",
"create a component",
"endpoint",
"repository",
"deploy",
"install",
"script",
"api",
"database",
"query",
"schema",
"interface",
"generic",
"enum",
"module",
"package",
"dependency",
// Português (PT-BR)
"função",
"classe",
"importar",
"definir",
"consulta",
"assíncrono",
"aguardar",
"constante",
"variável",
"retornar",
"algoritmo",
"compilar",
"depurar",
"refatorar",
"código",
"implementar",
"criar um",
"componente",
"como fazer",
"repositório",
"configurar",
"instalar",
"banco de dados",
"escrever uma função",
"criar uma classe",
// Español
"función",
"clase",
"importar",
"definir",
"consulta",
"asíncrono",
"esperar",
"constante",
"variable",
"retornar",
"algoritmo",
"compilar",
"depurar",
"refactorizar",
"código",
"implementar",
// 中文
"函数",
"类",
"导入",
"定义",
"查询",
"异步",
"等待",
"常量",
"变量",
"返回",
"算法",
"编译",
"调试",
"代码",
// 日本語
"関数",
"クラス",
"インポート",
"非同期",
"定数",
"変数",
"コード",
"アルゴリズム",
// Русский
"функция",
"класс",
"импорт",
"запрос",
"асинхронный",
"константа",
"переменная",
"алгоритм",
"код",
// Deutsch
"funktion",
"klasse",
"importieren",
"abfrage",
"asynchron",
"konstante",
"variable",
"algorithmus",
"code",
// 한국어
"함수",
"클래스",
"가져오기",
"정의",
"쿼리",
"비동기",
"대기",
"상수",
"변수",
"반환",
"코드",
// العربية
"دالة",
"فئة",
"استيراد",
"استعلام",
"غير متزامن",
"ثابت",
"متغير",
"كود",
"خوارزمية",
];
export const REASONING_KEYWORDS: readonly string[] = [
// English
"prove",
"theorem",
"derive",
"step by step",
"chain of thought",
"formally",
"mathematical",
"proof",
"logically",
"analyze",
"reasoning",
"deduce",
"infer",
"hypothesis",
"convergence",
// Português (PT-BR)
"provar",
"teorema",
"derivar",
"passo a passo",
"cadeia de pensamento",
"formalmente",
"matemático",
"prova",
"logicamente",
"analisar",
"raciocínio",
"deduzir",
"inferir",
"hipótese",
"demonstrar",
"cálculo",
"equação diferencial",
"integral",
"otimização",
// Español
"demostrar",
"teorema",
"derivar",
"paso a paso",
"formalmente",
"matemático",
"lógicamente",
// 中文
"证明",
"定理",
"推导",
"逐步",
"思维链",
"数学",
"逻辑",
"分析",
// 日本語
"証明",
"定理",
"導出",
"論理的",
"分析",
// Русский
"доказать",
"теорема",
"шаг за шагом",
"математически",
"логически",
// Deutsch
"beweisen",
"theorem",
"schritt für schritt",
"mathematisch",
"logisch",
// 한국어
"증명",
"정리",
"단계별",
"수학적",
"논리적",
// العربية
"إثبات",
"نظرية",
"خطوة بخطوة",
"رياضي",
"منطقياً",
];
export const SIMPLE_KEYWORDS: readonly string[] = [
// English
"what is",
"define",
"translate",
"hello",
"yes or no",
"summarize",
"list",
"tell me",
"who is",
// Português (PT-BR)
"o que é",
"definir",
"traduzir",
"olá",
"oi",
"sim ou não",
"resumir",
"listar",
"me diga",
"quem é",
"quando foi",
"onde fica",
"explique brevemente",
"de forma simples",
// Español
"qué es",
"definir",
"traducir",
"hola",
"resumir",
"listar",
// 中文
"什么是",
"定义",
"翻译",
"你好",
"总结",
"列出",
// Русский
"что такое",
"определить",
"перевести",
"привет",
"резюмировать",
// Deutsch
"was ist",
"definieren",
"übersetzen",
"hallo",
"zusammenfassen",
// 한국어
"이란",
"정의",
"번역",
"안녕",
"요약",
// العربية
"ما هو",
"تعريف",
"ترجمة",
"مرحبا",
"ملخص",
];
/**
* Classify a prompt's intent using multilingual keyword matching.
* Priority: code > reasoning > simple > medium (default)
*/
export function classifyPromptIntent(prompt: string, systemPrompt?: string): IntentType {
const fullText = `${systemPrompt ?? ""} ${prompt}`.toLowerCase();
const wordCount = prompt.trim().split(/\s+/).length;
for (const kw of CODE_KEYWORDS) {
if (fullText.includes(kw.toLowerCase())) return "code";
}
for (const kw of REASONING_KEYWORDS) {
if (fullText.includes(kw.toLowerCase())) return "reasoning";
}
if (wordCount < 60) {
for (const kw of SIMPLE_KEYWORDS) {
if (fullText.includes(kw.toLowerCase())) return "simple";
}
}
return "medium";
}
export interface IntentClassifierConfig {
enabled: boolean;
extraCodeKeywords?: string[];
extraReasoningKeywords?: string[];
extraSimpleKeywords?: string[];
simpleMaxWords?: number;
}
export const DEFAULT_INTENT_CONFIG: IntentClassifierConfig = {
enabled: true,
simpleMaxWords: 60,
};
export function classifyWithConfig(
prompt: string,
config: IntentClassifierConfig,
systemPrompt?: string
): IntentType {
if (!config.enabled) return "medium";
const fullText = `${systemPrompt ?? ""} ${prompt}`.toLowerCase();
const wordCount = prompt.trim().split(/\s+/).length;
const maxSimpleWords = config.simpleMaxWords ?? 60;
const codeKws = [...CODE_KEYWORDS, ...(config.extraCodeKeywords ?? [])];
const reasoningKws = [...REASONING_KEYWORDS, ...(config.extraReasoningKeywords ?? [])];
const simpleKws = [...SIMPLE_KEYWORDS, ...(config.extraSimpleKeywords ?? [])];
for (const kw of codeKws) {
if (fullText.includes(kw.toLowerCase())) return "code";
}
for (const kw of reasoningKws) {
if (fullText.includes(kw.toLowerCase())) return "reasoning";
}
if (wordCount < maxSimpleWords) {
for (const kw of simpleKws) {
if (fullText.includes(kw.toLowerCase())) return "simple";
}
}
return "medium";
}
+12
View File
@@ -23,6 +23,18 @@ const PROVIDER_MODEL_ALIASES = {
"gemini-3-flash": "gemini-3-flash-preview",
"raptor-mini": "oswe-vscode-prime",
},
gemini: {
"gemini-3.1-pro-preview": "gemini-3.1-pro",
"gemini-3-1-pro": "gemini-3.1-pro",
},
"gemini-cli": {
"gemini-3.1-pro-preview": "gemini-3.1-pro",
"gemini-3-1-pro": "gemini-3.1-pro",
},
nvidia: {
"gpt-oss-120b": "openai/gpt-oss-120b",
"nvidia/gpt-oss-120b": "openai/gpt-oss-120b",
},
antigravity: {},
};
+50
View File
@@ -0,0 +1,50 @@
import { PROVIDER_ID_TO_ALIAS, PROVIDER_MODELS } from "../config/providerModels.ts";
import { parseModel } from "./model.ts";
// Conservative denylist fallback used when registry metadata is absent.
// Keep small and explicit to avoid false negatives.
const TOOL_CALLING_UNSUPPORTED_PATTERNS = [
"gpt-oss-120b",
"deepseek-reasoner",
"glm-4.7",
"glm4.7",
];
function getRegistryToolCallingFlag(providerIdOrAlias: string, modelId: string): boolean | null {
const providerAlias = PROVIDER_ID_TO_ALIAS[providerIdOrAlias] || providerIdOrAlias;
const models = PROVIDER_MODELS[providerAlias];
if (!Array.isArray(models)) return null;
const found = models.find((m) => m?.id === modelId);
if (!found) return null;
return typeof found.toolCalling === "boolean" ? found.toolCalling : null;
}
/**
* Returns whether a model should be considered safe for structured function/tool calling.
*
* Decision order:
* 1) Provider registry metadata (toolCalling flag) when available.
* 2) Conservative denylist fallback for known problematic model families.
* 3) Default true.
*/
export function supportsToolCalling(modelStr: string): boolean {
const parsed = parseModel(modelStr);
const provider = parsed.provider || parsed.providerAlias || "";
const model = parsed.model || modelStr;
if (provider) {
const fromRegistry = getRegistryToolCallingFlag(provider, model);
if (fromRegistry !== null) return fromRegistry;
}
const normalized = String(modelStr || "").toLowerCase();
if (!normalized) return false;
const blocked = TOOL_CALLING_UNSUPPORTED_PATTERNS.some((pattern) => {
if (normalized === pattern) return true;
if (normalized.endsWith(`/${pattern}`)) return true;
return normalized.includes(pattern);
});
return !blocked;
}
+120
View File
@@ -0,0 +1,120 @@
/**
* Request Deduplication Service
*
* Deduplicates **concurrent** identical requests to the same upstream.
* Inspired by ClawRouter's dedup.ts (BlockRunAI / github.com/BlockRunAI/ClawRouter).
*
* IMPORTANT: In-memory only does NOT persist across restarts and does NOT
* work across multiple process instances (no cross-instance dedup).
*/
import { createHash } from "node:crypto";
export interface DedupConfig {
enabled: boolean;
maxTemperatureForDedup: number;
timeoutMs: number;
}
export const DEFAULT_DEDUP_CONFIG: DedupConfig = {
enabled: true,
maxTemperatureForDedup: 0.1,
timeoutMs: 60_000,
};
export interface DedupResult<T> {
result: T;
wasDeduplicated: boolean;
hash: string;
}
const inflight = new Map<string, Promise<unknown>>();
/**
* Compute a deterministic hash for a request body.
* Includes: model, messages, temperature, tools, tool_choice, max_tokens, response_format
* Excludes: stream, user, metadata (don't affect LLM output)
*/
export function computeRequestHash(requestBody: unknown): string {
const body = requestBody as Record<string, unknown>;
const canonical = {
model: body.model ?? null,
messages: body.messages ?? null,
temperature: typeof body.temperature === "number" ? body.temperature : 1.0,
tools: body.tools ?? null,
tool_choice: body.tool_choice ?? null,
max_tokens: body.max_tokens ?? null,
response_format: body.response_format ?? null,
top_p: body.top_p ?? null,
frequency_penalty: body.frequency_penalty ?? null,
presence_penalty: body.presence_penalty ?? null,
};
return createHash("sha256").update(JSON.stringify(canonical)).digest("hex").slice(0, 16);
}
/** Determine whether a request should be deduplicated */
export function shouldDeduplicate(
requestBody: unknown,
config: DedupConfig = DEFAULT_DEDUP_CONFIG
): boolean {
if (!config.enabled) return false;
const body = requestBody as Record<string, unknown>;
if (body.stream === true) return false;
const temperature = typeof body.temperature === "number" ? body.temperature : 1.0;
if (temperature > config.maxTemperatureForDedup) return false;
return true;
}
/**
* Execute a request with deduplication.
* Concurrent identical requests share one upstream call.
*/
export async function deduplicate<T>(
hash: string,
fn: () => Promise<T>,
config: DedupConfig = DEFAULT_DEDUP_CONFIG
): Promise<DedupResult<T>> {
if (!config.enabled) {
return { result: await fn(), wasDeduplicated: false, hash };
}
const existing = inflight.get(hash);
if (existing) {
const result = (await existing) as T;
return { result, wasDeduplicated: true, hash };
}
let resolve!: (value: T) => void;
let reject!: (reason: unknown) => void;
const sharedPromise = new Promise<T>((res, rej) => {
resolve = res;
reject = rej;
});
inflight.set(hash, sharedPromise as Promise<unknown>);
const timer = setTimeout(() => {
if (inflight.get(hash) === sharedPromise) inflight.delete(hash);
}, config.timeoutMs);
try {
const result = await fn();
resolve(result);
return { result, wasDeduplicated: false, hash };
} catch (err) {
reject(err);
throw err;
} finally {
clearTimeout(timer);
if (inflight.get(hash) === sharedPromise) inflight.delete(hash);
}
}
export function getInflightCount(): number {
return inflight.size;
}
export function getInflightHashes(): string[] {
return [...inflight.keys()];
}
export function clearInflight(): void {
inflight.clear();
}
+142
View File
@@ -0,0 +1,142 @@
/**
* Search Cache in-memory TTL cache with request coalescing
*
* Bounded at MAX_CACHE_ENTRIES to prevent OOM.
* Request coalescing deduplicates concurrent identical queries
* to prevent cache stampede (critical for agentic tools).
*/
import { createHash } from "crypto";
const MAX_CACHE_ENTRIES = 5000;
const DEFAULT_TTL_MS = parseInt(process.env.SEARCH_CACHE_TTL_MS || String(5 * 60 * 1000), 10);
interface CacheEntry<T> {
data: T;
expiresAt: number;
}
const cache = new Map<string, CacheEntry<unknown>>();
const inflight = new Map<string, Promise<unknown>>();
let hits = 0;
let misses = 0;
/**
* Normalize a query for cache key computation.
* NFKC normalization, lowercase, trim, collapse whitespace.
*/
function normalizeQuery(query: string): string {
return query.normalize("NFKC").toLowerCase().trim().replace(/\s+/g, " ");
}
/**
* Compute a deterministic cache key from search parameters.
*/
export function computeCacheKey(
query: string,
provider: string,
searchType: string,
maxResults: number,
country?: string,
language?: string,
filters?: unknown
): string {
const normalized = normalizeQuery(query);
const payload = JSON.stringify({
q: normalized,
p: provider,
t: searchType,
n: maxResults,
c: country || null,
l: language || null,
f: filters || null,
});
return createHash("sha256").update(payload).digest("hex");
}
/**
* Evict expired entries and enforce size bound.
* Called lazily on writes. O(n) worst case but amortized O(1).
*/
function evictIfNeeded(): void {
const now = Date.now();
// Remove expired entries first
for (const [key, entry] of cache) {
if (entry.expiresAt <= now) {
cache.delete(key);
}
}
// FIFO eviction if still over limit
while (cache.size >= MAX_CACHE_ENTRIES) {
const firstKey = cache.keys().next().value;
if (firstKey !== undefined) {
cache.delete(firstKey);
} else {
break;
}
}
}
/**
* Get or coalesce: return cached data, join an inflight request,
* or execute the fetch function and cache the result.
*
* @param key - Cache key from computeCacheKey()
* @param ttlMs - TTL in milliseconds (0 to bypass cache)
* @param fetchFn - Function to execute on cache miss
* @returns The cached or freshly fetched data
*/
export async function getOrCoalesce<T>(
key: string,
ttlMs: number,
fetchFn: () => Promise<T>
): Promise<{ data: T; cached: boolean }> {
// 1. Check cache
const cached = cache.get(key) as CacheEntry<T> | undefined;
if (cached && cached.expiresAt > Date.now()) {
hits++;
return { data: cached.data, cached: true };
}
// 2. Join inflight request if one exists (request coalescing)
const existing = inflight.get(key) as Promise<T> | undefined;
if (existing) {
hits++;
const data = await existing;
return { data, cached: true };
}
// 3. Cache miss — execute fetch
misses++;
const promise = fetchFn();
inflight.set(key, promise);
try {
const data = await promise;
// Store in cache
if (ttlMs > 0) {
evictIfNeeded();
cache.set(key, { data, expiresAt: Date.now() + ttlMs });
}
return { data, cached: false };
} finally {
inflight.delete(key);
}
}
/**
* Get cache statistics for monitoring.
*/
export function getCacheStats(): { size: number; hits: number; misses: number } {
return { size: cache.size, hits, misses };
}
/**
* Default TTL for search cache entries.
*/
export const SEARCH_CACHE_DEFAULT_TTL_MS = DEFAULT_TTL_MS;
+166 -39
View File
@@ -75,6 +75,30 @@ function getFieldValue(source: unknown, snakeKey: string, camelKey: string): unk
return obj[snakeKey] ?? obj[camelKey] ?? null;
}
function clampPercentage(value: number): number {
return Math.max(0, Math.min(100, value));
}
function toDisplayLabel(value: string): string {
return value
.replace(/^copilot[_\s-]*/i, "")
.split(/[\s_-]+/)
.filter(Boolean)
.map((part) => {
if (/^pro\+$/i.test(part)) return "Pro+";
if (/^[a-z]{2,}$/.test(part)) return part.charAt(0).toUpperCase() + part.slice(1).toLowerCase();
return part;
})
.join(" ")
.trim();
}
function shouldDisplayGitHubQuota(quota: UsageQuota | null): quota is UsageQuota {
if (!quota) return false;
if (quota.unlimited && quota.total <= 0) return false;
return quota.total > 0 || quota.remainingPercentage !== undefined;
}
/**
* Get usage data for a provider connection
* @param {Object} connection - Provider connection with accessToken
@@ -170,48 +194,65 @@ async function getGitHubUsage(accessToken, providerSpecificData) {
}
const data = await response.json();
const dataRecord = toRecord(data);
// Handle different response formats (paid vs free)
if (data.quota_snapshots) {
if (dataRecord.quota_snapshots) {
// Paid plan format
const snapshots = data.quota_snapshots;
const resetAt = parseResetTime(data.quota_reset_date);
const snapshots = toRecord(dataRecord.quota_snapshots);
const resetAt = parseResetTime(getFieldValue(dataRecord, "quota_reset_date", "quotaResetDate"));
const premiumQuota = formatGitHubQuotaSnapshot(snapshots.premium_interactions, resetAt);
const chatQuota = formatGitHubQuotaSnapshot(snapshots.chat, resetAt);
const completionsQuota = formatGitHubQuotaSnapshot(snapshots.completions, resetAt);
const quotas: Record<string, UsageQuota> = {};
if (shouldDisplayGitHubQuota(premiumQuota)) {
quotas.premium_interactions = premiumQuota;
}
if (shouldDisplayGitHubQuota(chatQuota)) {
quotas.chat = chatQuota;
}
if (shouldDisplayGitHubQuota(completionsQuota)) {
quotas.completions = completionsQuota;
}
return {
plan: data.copilot_plan,
resetDate: data.quota_reset_date,
quotas: {
chat: { ...formatGitHubQuotaSnapshot(snapshots.chat), resetAt },
completions: { ...formatGitHubQuotaSnapshot(snapshots.completions), resetAt },
premium_interactions: {
...formatGitHubQuotaSnapshot(snapshots.premium_interactions),
resetAt,
},
},
plan: inferGitHubPlanName(dataRecord, premiumQuota),
resetDate: getFieldValue(dataRecord, "quota_reset_date", "quotaResetDate"),
quotas,
};
} else if (data.monthly_quotas || data.limited_user_quotas) {
} else if (dataRecord.monthly_quotas || dataRecord.limited_user_quotas) {
// Free/limited plan format
const monthlyQuotas = data.monthly_quotas || {};
const usedQuotas = data.limited_user_quotas || {};
const resetAt = parseResetTime(data.limited_user_reset_date);
const monthlyQuotas = toRecord(dataRecord.monthly_quotas);
const usedQuotas = toRecord(dataRecord.limited_user_quotas);
const resetDate = getFieldValue(dataRecord, "limited_user_reset_date", "limitedUserResetDate");
const resetAt = parseResetTime(resetDate);
const quotas: Record<string, UsageQuota> = {};
const addLimitedQuota = (name: string) => {
const total = toNumber(getFieldValue(monthlyQuotas, name, name), 0);
const used = Math.max(0, toNumber(getFieldValue(usedQuotas, name, name), 0));
if (total <= 0) return null;
const clampedUsed = Math.min(used, total);
quotas[name] = {
used: clampedUsed,
total,
remaining: Math.max(total - clampedUsed, 0),
remainingPercentage: clampPercentage(((total - clampedUsed) / total) * 100),
unlimited: false,
resetAt,
};
return quotas[name];
};
const premiumQuota = addLimitedQuota("premium_interactions");
addLimitedQuota("chat");
addLimitedQuota("completions");
return {
plan: data.copilot_plan || data.access_type_sku,
resetDate: data.limited_user_reset_date,
quotas: {
chat: {
used: usedQuotas.chat || 0,
total: monthlyQuotas.chat || 0,
unlimited: false,
resetAt,
},
completions: {
used: usedQuotas.completions || 0,
total: monthlyQuotas.completions || 0,
unlimited: false,
resetAt,
},
},
plan: inferGitHubPlanName(dataRecord, premiumQuota),
resetDate,
quotas,
};
}
@@ -221,17 +262,103 @@ async function getGitHubUsage(accessToken, providerSpecificData) {
}
}
function formatGitHubQuotaSnapshot(quota) {
if (!quota) return { used: 0, total: 0, unlimited: true };
function formatGitHubQuotaSnapshot(quota, resetAt: string | null = null): UsageQuota | null {
const source = toRecord(quota);
if (Object.keys(source).length === 0) return null;
const unlimited = source.unlimited === true;
const entitlement = toNumber(source.entitlement, Number.NaN);
const totalValue = toNumber(source.total, Number.NaN);
const remainingValue = toNumber(source.remaining, Number.NaN);
const usedValue = toNumber(source.used, Number.NaN);
const percentRemainingValue = toNumber(
getFieldValue(source, "percent_remaining", "percentRemaining"),
Number.NaN
);
let total = Number.isFinite(totalValue)
? Math.max(0, totalValue)
: Number.isFinite(entitlement)
? Math.max(0, entitlement)
: 0;
let remaining = Number.isFinite(remainingValue) ? Math.max(0, remainingValue) : undefined;
let used = Number.isFinite(usedValue) ? Math.max(0, usedValue) : undefined;
let remainingPercentage = Number.isFinite(percentRemainingValue)
? clampPercentage(percentRemainingValue)
: undefined;
if (used === undefined && total > 0 && remaining !== undefined) {
used = Math.max(total - remaining, 0);
}
if (remaining === undefined && total > 0 && used !== undefined) {
remaining = Math.max(total - used, 0);
}
if (remainingPercentage === undefined && total > 0 && remaining !== undefined) {
remainingPercentage = clampPercentage((remaining / total) * 100);
}
if (total <= 0 && remainingPercentage !== undefined) {
total = 100;
used = 100 - remainingPercentage;
remaining = remainingPercentage;
}
return {
used: quota.entitlement - quota.remaining,
total: quota.entitlement,
remaining: quota.remaining,
unlimited: quota.unlimited || false,
used: Math.max(0, used ?? 0),
total,
remaining,
remainingPercentage,
resetAt,
unlimited,
};
}
function inferGitHubPlanName(data: JsonRecord, premiumQuota: UsageQuota | null): string {
const rawPlan = getFieldValue(data, "copilot_plan", "copilotPlan");
const rawSku = getFieldValue(data, "access_type_sku", "accessTypeSku");
const planText = typeof rawPlan === "string" ? rawPlan.trim() : "";
const skuText = typeof rawSku === "string" ? rawSku.trim() : "";
const combined = `${skuText} ${planText}`.trim().toUpperCase();
const monthlyQuotas = toRecord(getFieldValue(data, "monthly_quotas", "monthlyQuotas"));
const premiumTotal =
premiumQuota?.total ||
toNumber(getFieldValue(monthlyQuotas, "premium_interactions", "premiumInteractions"), 0);
const chatTotal = toNumber(getFieldValue(monthlyQuotas, "chat", "chat"), 0);
if (
combined.includes("PRO+") ||
combined.includes("PRO_PLUS") ||
combined.includes("PROPLUS")
) {
return "Copilot Pro+";
}
if (combined.includes("ENTERPRISE")) return "Copilot Enterprise";
if (combined.includes("BUSINESS")) return "Copilot Business";
if (combined.includes("STUDENT")) return "Copilot Student";
if (combined.includes("FREE")) return "Copilot Free";
if (combined.includes("PRO")) return "Copilot Pro";
if (premiumTotal >= 1400) return "Copilot Pro+";
if (premiumTotal >= 900) return "Copilot Enterprise";
if (premiumTotal >= 250) {
if (combined.includes("INDIVIDUAL")) return "Copilot Pro";
return "Copilot Business";
}
if (premiumTotal > 0 || chatTotal === 50) return "Copilot Free";
if (skuText) {
const label = toDisplayLabel(skuText);
return label ? `Copilot ${label}` : "GitHub Copilot";
}
if (planText) {
const label = toDisplayLabel(planText);
return label ? `Copilot ${label}` : "GitHub Copilot";
}
return "GitHub Copilot";
}
/**
* Gemini CLI Usage (Google Cloud)
*/
+58 -15
View File
@@ -1,26 +1,69 @@
// Tool call helper functions for translator
// Generate unique tool call ID
const ALPHANUM9 = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
// Generate unique tool call ID (default long form)
export function generateToolCallId() {
return `call_${Date.now().toString(36)}_${Math.random().toString(36).slice(2, 9)}`;
}
// Ensure all tool_calls have id field and arguments is string (some providers require it)
export function ensureToolCallIds(body) {
// Generate 9-char [a-zA-Z0-9] id for providers that require it (e.g. Mistral)
function generateToolCallId9(): string {
let s = "";
for (let i = 0; i < 9; i++) {
s += ALPHANUM9[Math.floor(Math.random() * ALPHANUM9.length)];
}
return s;
}
/** @param options.use9CharId - When true, normalize ids to 9-char [a-zA-Z0-9] (e.g. Mistral); when false, only fix type/arguments, leave ids as-is */
export function ensureToolCallIds(body, options?: { use9CharId?: boolean }) {
if (!body.messages || !Array.isArray(body.messages)) return body;
for (const msg of body.messages) {
if (msg.role === "assistant" && msg.tool_calls && Array.isArray(msg.tool_calls)) {
for (const tc of msg.tool_calls) {
if (!tc.id) {
tc.id = generateToolCallId();
}
if (!tc.type) {
tc.type = "function";
}
// Ensure arguments is JSON string, not object
if (tc.function?.arguments && typeof tc.function.arguments !== "string") {
tc.function.arguments = JSON.stringify(tc.function.arguments);
const use9CharId = options?.use9CharId === true;
for (let i = 0; i < body.messages.length; i++) {
const msg = body.messages[i];
if (msg.role !== "assistant" || !msg.tool_calls || !Array.isArray(msg.tool_calls)) continue;
const used9 = new Set<string>();
const newIdsInOrder: string[] = [];
for (const tc of msg.tool_calls) {
if (!tc.type) {
tc.type = "function";
}
if (tc.function?.arguments && typeof tc.function.arguments !== "string") {
tc.function.arguments = JSON.stringify(tc.function.arguments);
}
if (use9CharId) {
let newId: string;
do {
newId = generateToolCallId9();
} while (used9.has(newId));
used9.add(newId);
newIdsInOrder.push(newId);
tc.id = newId;
} else {
// Leave id as-is, only ensure it exists for later tool message matching
const id =
tc.id != null && String(tc.id).trim() !== "" ? String(tc.id) : generateToolCallId();
tc.id = id;
newIdsInOrder.push(id);
}
}
// Tool responses (role "tool") follow in same order as tool_calls; set tool_call_id by index.
// Stop when we hit another assistant so we only link tool messages that immediately follow this one.
if (newIdsInOrder.length > 0) {
let idx = 0;
for (let j = i + 1; j < body.messages.length; j++) {
const later = body.messages[j];
if (later.role === "assistant") break;
if (later.role !== "tool") continue;
if (idx < newIdsInOrder.length) {
later.tool_call_id = newIdsInOrder[idx];
idx++;
}
}
}
+10 -3
View File
@@ -66,6 +66,7 @@ function normalizeOpenAIResponsesRequest(body) {
return normalized;
}
/** @param options.normalizeToolCallId - When true, use 9-char tool call ids (e.g. Mistral); when false, leave ids as-is */
// Translate request: source -> openai -> target
export function translateRequest(
sourceFormat,
@@ -75,9 +76,11 @@ export function translateRequest(
stream = true,
credentials = null,
provider = null,
reqLogger = null
reqLogger = null,
options?: { normalizeToolCallId?: boolean }
) {
let result = body;
const use9CharId = options?.normalizeToolCallId === true;
// Phase 2: Apply thinking budget control before normalization
result = applyThinkingBudget(result);
@@ -85,8 +88,8 @@ export function translateRequest(
// Normalize thinking config: remove if lastMessage is not user
normalizeThinkingConfig(result);
// Always ensure tool_calls have id (some providers require it)
ensureToolCallIds(result);
// Ensure tool_calls have id; optionally normalize to 9-char for providers like Mistral
ensureToolCallIds(result, { use9CharId });
// Fix missing tool responses (insert empty tool_result if needed)
fixMissingToolResponses(result);
@@ -140,6 +143,10 @@ export function translateRequest(
result = normalizeOpenAIResponsesRequest(result);
}
// Ensure unique tool_call ids on final payload (translators may have introduced duplicates)
ensureToolCallIds(result, { use9CharId });
fixMissingToolResponses(result);
return result;
}
@@ -208,7 +208,7 @@ export function openaiResponsesToOpenAIRequest(
});
}
// Filter orphaned tool results (no matching tool_call in any assistant message)
// Filter orphaned tool results (no matching tool_call in assistant messages)
const allToolCallIds = new Set<string>();
for (const m of messages) {
const rec = toRecord(m);
+156 -19
View File
@@ -30,6 +30,8 @@ type StreamLogger = {
type StreamCompletePayload = {
status: number;
usage: unknown;
/** Minimal response body for call log (streaming: usage + note; non-streaming not used) */
responseBody?: unknown;
};
type StreamOptions = {
@@ -51,6 +53,8 @@ type TranslateState = ReturnType<typeof initState> & {
toolNameMap?: unknown;
usage?: unknown;
finishReason?: unknown;
/** Accumulated message content for call log response body */
accumulatedContent?: string;
};
function getOpenAIIntermediateChunks(value: unknown): unknown[] {
@@ -106,14 +110,21 @@ export function createSSEStream(options: StreamOptions = {}) {
let buffer = "";
let usage = null;
// State for translate mode
// State for translate mode (accumulatedContent for call log response body)
const state: TranslateState | null =
mode === STREAM_MODE.TRANSLATE
? { ...(initState(sourceFormat) as TranslateState), provider, toolNameMap }
? {
...(initState(sourceFormat) as TranslateState),
provider,
toolNameMap,
accumulatedContent: "",
}
: null;
// Track content length for usage estimation (both modes)
let totalContentLength = 0;
// Passthrough: accumulate content for call log response body
let passthroughAccumulatedContent = "";
// Guard against duplicate [DONE] events — ensures exactly one per stream
let doneSent = false;
@@ -201,9 +212,10 @@ export function createSSEStream(options: StreamOptions = {}) {
if (extracted) {
usage = extracted;
}
// Track content length from Responses format
// Track content length and accumulate for call log
if (parsed.delta && typeof parsed.delta === "string") {
totalContentLength += parsed.delta.length;
passthroughAccumulatedContent += parsed.delta;
}
} else if (isClaudeSSE) {
// Claude SSE: extract usage, track content, forward as-is
@@ -213,14 +225,23 @@ export function createSSEStream(options: StreamOptions = {}) {
// message_start carries input_tokens, message_delta carries output_tokens
if (!usage) usage = {};
if (extracted.prompt_tokens > 0) usage.prompt_tokens = extracted.prompt_tokens;
if (extracted.completion_tokens > 0) usage.completion_tokens = extracted.completion_tokens;
if (extracted.completion_tokens > 0)
usage.completion_tokens = extracted.completion_tokens;
if (extracted.total_tokens > 0) usage.total_tokens = extracted.total_tokens;
if (extracted.cache_read_input_tokens) usage.cache_read_input_tokens = extracted.cache_read_input_tokens;
if (extracted.cache_creation_input_tokens) usage.cache_creation_input_tokens = extracted.cache_creation_input_tokens;
if (extracted.cache_read_input_tokens)
usage.cache_read_input_tokens = extracted.cache_read_input_tokens;
if (extracted.cache_creation_input_tokens)
usage.cache_creation_input_tokens = extracted.cache_creation_input_tokens;
}
// Track content length and accumulate from Claude format
if (parsed.delta?.text) {
totalContentLength += parsed.delta.text.length;
passthroughAccumulatedContent += parsed.delta.text;
}
if (parsed.delta?.thinking) {
totalContentLength += parsed.delta.thinking.length;
passthroughAccumulatedContent += parsed.delta.thinking;
}
// Track content length from Claude format
if (parsed.delta?.text) totalContentLength += parsed.delta.text.length;
if (parsed.delta?.thinking) totalContentLength += parsed.delta.thinking.length;
} else {
// Chat Completions: full sanitization pipeline
parsed = sanitizeStreamingChunk(parsed);
@@ -246,6 +267,10 @@ export function createSSEStream(options: StreamOptions = {}) {
if (content && typeof content === "string") {
totalContentLength += content.length;
}
if (typeof delta?.content === "string")
passthroughAccumulatedContent += delta.content;
if (typeof delta?.reasoning_content === "string")
passthroughAccumulatedContent += delta.reasoning_content;
const extracted = extractUsage(parsed);
if (extracted) {
@@ -301,23 +326,45 @@ export function createSSEStream(options: StreamOptions = {}) {
continue;
}
// Track content length for estimation (from various formats)
// Include both regular content and reasoning/thinking content
// Track content length and accumulate for call log (from raw provider chunk, so content is never missed)
// Do this before translation so we capture content regardless of translator output shape
// Claude format
if (parsed.delta?.text) {
totalContentLength += parsed.delta.text.length;
const t = parsed.delta.text;
totalContentLength += t.length;
if (state?.accumulatedContent !== undefined && typeof t === "string")
state.accumulatedContent += t;
}
if (parsed.delta?.thinking) {
totalContentLength += parsed.delta.thinking.length;
const t = parsed.delta.thinking;
totalContentLength += t.length;
if (state?.accumulatedContent !== undefined && typeof t === "string")
state.accumulatedContent += t;
}
// OpenAI format
if (parsed.choices?.[0]?.delta?.content) {
totalContentLength += parsed.choices[0].delta.content.length;
const c = parsed.choices[0].delta.content;
if (typeof c === "string") {
totalContentLength += c.length;
if (state?.accumulatedContent !== undefined) state.accumulatedContent += c;
} else if (Array.isArray(c)) {
for (const part of c) {
if (part?.text && typeof part.text === "string") {
totalContentLength += part.text.length;
if (state?.accumulatedContent !== undefined)
state.accumulatedContent += part.text;
}
}
}
}
if (parsed.choices?.[0]?.delta?.reasoning_content) {
totalContentLength += parsed.choices[0].delta.reasoning_content.length;
const r = parsed.choices[0].delta.reasoning_content;
if (typeof r === "string") {
totalContentLength += r.length;
if (state?.accumulatedContent !== undefined) state.accumulatedContent += r;
}
}
// Gemini format - may have multiple parts
@@ -325,10 +372,30 @@ export function createSSEStream(options: StreamOptions = {}) {
for (const part of parsed.candidates[0].content.parts) {
if (part.text && typeof part.text === "string") {
totalContentLength += part.text.length;
if (state?.accumulatedContent !== undefined) state.accumulatedContent += part.text;
}
}
}
// Generic fallback: delta string, top-level content/text (e.g. some SSE payloads)
if (state?.accumulatedContent !== undefined) {
if (typeof (parsed as JsonRecord).delta === "string") {
const d = (parsed as JsonRecord).delta as string;
state.accumulatedContent += d;
totalContentLength += d.length;
}
if (typeof (parsed as JsonRecord).content === "string") {
const c = (parsed as JsonRecord).content as string;
state.accumulatedContent += c;
totalContentLength += c.length;
}
if (typeof (parsed as JsonRecord).text === "string") {
const t = (parsed as JsonRecord).text as string;
state.accumulatedContent += t;
totalContentLength += t.length;
}
}
// Extract usage
const extracted = extractUsage(parsed);
if (extracted) state.usage = extracted; // Keep original usage for logging
@@ -344,6 +411,9 @@ export function createSSEStream(options: StreamOptions = {}) {
if (translated?.length > 0) {
for (const item of translated) {
// Content for call log is accumulated only from parsed (above) to avoid double-counting;
// do not add again from item here.
// Filter empty chunks
if (!hasValuableContent(item, sourceFormat)) {
continue; // Skip this empty chunk
@@ -415,10 +485,30 @@ export function createSSEStream(options: StreamOptions = {}) {
status: "200 OK",
}).catch(() => {});
}
// Notify caller for call log persistence
// Notify caller for call log persistence (include full response body with accumulated content)
if (onComplete) {
try {
onComplete({ status: 200, usage });
const u = usage as Record<string, unknown> | null;
const prompt = Number(u?.prompt_tokens ?? u?.input_tokens ?? 0);
const completion = Number(u?.completion_tokens ?? u?.output_tokens ?? 0);
const content = passthroughAccumulatedContent.trim() || "";
const responseBody = {
choices: [
{
message: {
role: "assistant",
content,
},
},
],
usage: {
prompt_tokens: prompt,
completion_tokens: completion,
total_tokens: prompt + completion,
},
_streamed: true,
};
onComplete({ status: 200, usage, responseBody });
} catch {}
}
return;
@@ -428,6 +518,33 @@ export function createSSEStream(options: StreamOptions = {}) {
if (buffer.trim()) {
const parsed = parseSSELine(buffer.trim());
if (parsed && !parsed.done) {
// Extract usage from remaining buffer — if the usage-bearing event
// (e.g. response.completed) is the last SSE line, it ends up here
// in the flush handler where extractUsage was not called.
// Non-destructive merge: some providers send usage across multiple
// events (e.g. prompt_tokens in message_start, completion_tokens
// in message_delta). Direct assignment would lose earlier data.
const extracted = extractUsage(parsed);
if (extracted) {
if (!state.usage) {
state.usage = extracted;
} else {
if (extracted.prompt_tokens > 0)
state.usage.prompt_tokens = extracted.prompt_tokens;
if (extracted.completion_tokens > 0)
state.usage.completion_tokens = extracted.completion_tokens;
if (extracted.total_tokens > 0) state.usage.total_tokens = extracted.total_tokens;
if (extracted.cache_read_input_tokens > 0)
state.usage.cache_read_input_tokens = extracted.cache_read_input_tokens;
if (extracted.cache_creation_input_tokens > 0)
state.usage.cache_creation_input_tokens = extracted.cache_creation_input_tokens;
if (extracted.cached_tokens > 0)
state.usage.cached_tokens = extracted.cached_tokens;
if (extracted.reasoning_tokens > 0)
state.usage.reasoning_tokens = extracted.reasoning_tokens;
}
}
const translated = translateResponse(targetFormat, sourceFormat, parsed, state);
// Log OpenAI intermediate chunks
@@ -497,10 +614,30 @@ export function createSSEStream(options: StreamOptions = {}) {
status: "200 OK",
}).catch(() => {});
}
// Notify caller for call log persistence
// Notify caller for call log persistence (include full response body with accumulated content)
if (onComplete) {
try {
onComplete({ status: 200, usage: state?.usage });
const u = state?.usage as Record<string, unknown> | null | undefined;
const prompt = Number(u?.prompt_tokens ?? u?.input_tokens ?? 0);
const completion = Number(u?.completion_tokens ?? u?.output_tokens ?? 0);
const content = (state?.accumulatedContent ?? "").trim() || "";
const responseBody = {
choices: [
{
message: {
role: "assistant",
content,
},
},
],
usage: {
prompt_tokens: prompt,
completion_tokens: completion,
total_tokens: prompt + completion,
},
_streamed: true,
};
onComplete({ status: 200, usage: state?.usage, responseBody });
} catch {}
}
} catch (error) {
+3 -1
View File
@@ -400,8 +400,10 @@ export function logUsage(provider, usage, model = null, connectionId = null, api
console.log(msg);
// Save to usage DB
// input = total input tokens (non-cached + cache_read + cache_creation)
// This ensures analytics show correct totals for heavily-cached requests
const tokens = {
input: inTokens,
input: inTokens + (cacheRead || 0) + (cacheCreation || 0),
output: outTokens,
cacheRead: cacheRead || 0,
cacheCreation: cacheCreation || 0,
+42 -43
View File
@@ -1,12 +1,12 @@
{
"name": "omniroute",
"version": "2.6.8",
"version": "2.8.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "omniroute",
"version": "2.6.8",
"version": "2.8.0",
"hasInstallScript": true,
"license": "MIT",
"workspaces": [
@@ -1725,9 +1725,9 @@
}
},
"node_modules/@next/env": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/env/-/env-16.1.6.tgz",
"integrity": "sha512-N1ySLuZjnAtN3kFnwhAwPvZah8RJxKasD7x1f8shFqhncnWZn4JMfg37diLNuoHsLAlrDfM3g4mawVdtAG8XLQ==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/env/-/env-16.1.7.tgz",
"integrity": "sha512-rJJbIdJB/RQr2F1nylZr/PJzamvNNhfr3brdKP6s/GW850jbtR70QlSfFselvIBbcPUOlQwBakexjFzqLzF6pg==",
"license": "MIT"
},
"node_modules/@next/eslint-plugin-next": {
@@ -1741,9 +1741,9 @@
}
},
"node_modules/@next/swc-darwin-arm64": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-arm64/-/swc-darwin-arm64-16.1.6.tgz",
"integrity": "sha512-wTzYulosJr/6nFnqGW7FrG3jfUUlEf8UjGA0/pyypJl42ExdVgC6xJgcXQ+V8QFn6niSG2Pb8+MIG1mZr2vczw==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-arm64/-/swc-darwin-arm64-16.1.7.tgz",
"integrity": "sha512-b2wWIE8sABdyafc4IM8r5Y/dS6kD80JRtOGrUiKTsACFQfWWgUQ2NwoUX1yjFMXVsAwcQeNpnucF2ZrujsBBPg==",
"cpu": [
"arm64"
],
@@ -1757,9 +1757,9 @@
}
},
"node_modules/@next/swc-darwin-x64": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-x64/-/swc-darwin-x64-16.1.6.tgz",
"integrity": "sha512-BLFPYPDO+MNJsiDWbeVzqvYd4NyuRrEYVB5k2N3JfWncuHAy2IVwMAOlVQDFjj+krkWzhY2apvmekMkfQR0CUQ==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/swc-darwin-x64/-/swc-darwin-x64-16.1.7.tgz",
"integrity": "sha512-zcnVaaZulS1WL0Ss38R5Q6D2gz7MtBu8GZLPfK+73D/hp4GFMrC2sudLky1QibfV7h6RJBJs/gOFvYP0X7UVlQ==",
"cpu": [
"x64"
],
@@ -1773,9 +1773,9 @@
}
},
"node_modules/@next/swc-linux-arm64-gnu": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-gnu/-/swc-linux-arm64-gnu-16.1.6.tgz",
"integrity": "sha512-OJYkCd5pj/QloBvoEcJ2XiMnlJkRv9idWA/j0ugSuA34gMT6f5b7vOiCQHVRpvStoZUknhl6/UxOXL4OwtdaBw==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-gnu/-/swc-linux-arm64-gnu-16.1.7.tgz",
"integrity": "sha512-2ant89Lux/Q3VyC8vNVg7uBaFVP9SwoK2jJOOR0L8TQnX8CAYnh4uctAScy2Hwj2dgjVHqHLORQZJ2wH6VxhSQ==",
"cpu": [
"arm64"
],
@@ -1789,9 +1789,9 @@
}
},
"node_modules/@next/swc-linux-arm64-musl": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-musl/-/swc-linux-arm64-musl-16.1.6.tgz",
"integrity": "sha512-S4J2v+8tT3NIO9u2q+S0G5KdvNDjXfAv06OhfOzNDaBn5rw84DGXWndOEB7d5/x852A20sW1M56vhC/tRVbccQ==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-musl/-/swc-linux-arm64-musl-16.1.7.tgz",
"integrity": "sha512-uufcze7LYv0FQg9GnNeZ3/whYfo+1Q3HnQpm16o6Uyi0OVzLlk2ZWoY7j07KADZFY8qwDbsmFnMQP3p3+Ftprw==",
"cpu": [
"arm64"
],
@@ -1805,9 +1805,9 @@
}
},
"node_modules/@next/swc-linux-x64-gnu": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-gnu/-/swc-linux-x64-gnu-16.1.6.tgz",
"integrity": "sha512-2eEBDkFlMMNQnkTyPBhQOAyn2qMxyG2eE7GPH2WIDGEpEILcBPI/jdSv4t6xupSP+ot/jkfrCShLAa7+ZUPcJQ==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-gnu/-/swc-linux-x64-gnu-16.1.7.tgz",
"integrity": "sha512-KWVf2gxYvHtvuT+c4MBOGxuse5TD7DsMFYSxVxRBnOzok/xryNeQSjXgxSv9QpIVlaGzEn/pIuI6Koosx8CGWA==",
"cpu": [
"x64"
],
@@ -1821,9 +1821,9 @@
}
},
"node_modules/@next/swc-linux-x64-musl": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-musl/-/swc-linux-x64-musl-16.1.6.tgz",
"integrity": "sha512-oicJwRlyOoZXVlxmIMaTq7f8pN9QNbdes0q2FXfRsPhfCi8n8JmOZJm5oo1pwDaFbnnD421rVU409M3evFbIqg==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/swc-linux-x64-musl/-/swc-linux-x64-musl-16.1.7.tgz",
"integrity": "sha512-HguhaGwsGr1YAGs68uRKc4aGWxLET+NevJskOcCAwXbwj0fYX0RgZW2gsOCzr9S11CSQPIkxmoSbuVaBp4Z3dA==",
"cpu": [
"x64"
],
@@ -1837,9 +1837,9 @@
}
},
"node_modules/@next/swc-win32-arm64-msvc": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/swc-win32-arm64-msvc/-/swc-win32-arm64-msvc-16.1.6.tgz",
"integrity": "sha512-gQmm8izDTPgs+DCWH22kcDmuUp7NyiJgEl18bcr8irXA5N2m2O+JQIr6f3ct42GOs9c0h8QF3L5SzIxcYAAXXw==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/swc-win32-arm64-msvc/-/swc-win32-arm64-msvc-16.1.7.tgz",
"integrity": "sha512-S0n3KrDJokKTeFyM/vGGGR8+pCmXYrjNTk2ZozOL1C/JFdfUIL9O1ATaJOl5r2POe56iRChbsszrjMAdWSv7kQ==",
"cpu": [
"arm64"
],
@@ -1853,9 +1853,9 @@
}
},
"node_modules/@next/swc-win32-x64-msvc": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/@next/swc-win32-x64-msvc/-/swc-win32-x64-msvc-16.1.6.tgz",
"integrity": "sha512-NRfO39AIrzBnixKbjuo2YiYhB6o9d8v/ymU9m/Xk8cyVk+k7XylniXkHwjs4s70wedVffc6bQNbufk5v0xEm0A==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/@next/swc-win32-x64-msvc/-/swc-win32-x64-msvc-16.1.7.tgz",
"integrity": "sha512-mwgtg8CNZGYm06LeEd+bNnOUfwOyNem/rOiP14Lsz+AnUY92Zq/LXwtebtUiaeVkhbroRCQ0c8GlR4UT1U+0yg==",
"cpu": [
"x64"
],
@@ -6817,7 +6817,6 @@
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz",
"integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"optional": true,
@@ -8812,14 +8811,14 @@
}
},
"node_modules/next": {
"version": "16.1.6",
"resolved": "https://registry.npmjs.org/next/-/next-16.1.6.tgz",
"integrity": "sha512-hkyRkcu5x/41KoqnROkfTm2pZVbKxvbZRuNvKXLRXxs3VfyO0WhY50TQS40EuKO9SW3rBj/sF3WbVwDACeMZyw==",
"version": "16.1.7",
"resolved": "https://registry.npmjs.org/next/-/next-16.1.7.tgz",
"integrity": "sha512-WM0L7WrSvKwoLegLYr6V+mz+RIofqQgVAfHhMp9a88ms0cFX8iX9ew+snpWlSBwpkURJOUdvCEt3uLl3NNzvWg==",
"license": "MIT",
"dependencies": {
"@next/env": "16.1.6",
"@next/env": "16.1.7",
"@swc/helpers": "0.5.15",
"baseline-browser-mapping": "^2.8.3",
"baseline-browser-mapping": "^2.9.19",
"caniuse-lite": "^1.0.30001579",
"postcss": "8.4.31",
"styled-jsx": "5.1.6"
@@ -8831,14 +8830,14 @@
"node": ">=20.9.0"
},
"optionalDependencies": {
"@next/swc-darwin-arm64": "16.1.6",
"@next/swc-darwin-x64": "16.1.6",
"@next/swc-linux-arm64-gnu": "16.1.6",
"@next/swc-linux-arm64-musl": "16.1.6",
"@next/swc-linux-x64-gnu": "16.1.6",
"@next/swc-linux-x64-musl": "16.1.6",
"@next/swc-win32-arm64-msvc": "16.1.6",
"@next/swc-win32-x64-msvc": "16.1.6",
"@next/swc-darwin-arm64": "16.1.7",
"@next/swc-darwin-x64": "16.1.7",
"@next/swc-linux-arm64-gnu": "16.1.7",
"@next/swc-linux-arm64-musl": "16.1.7",
"@next/swc-linux-x64-gnu": "16.1.7",
"@next/swc-linux-x64-musl": "16.1.7",
"@next/swc-win32-arm64-msvc": "16.1.7",
"@next/swc-win32-x64-msvc": "16.1.7",
"sharp": "^0.34.4"
},
"peerDependencies": {
+1 -1
View File
@@ -1,6 +1,6 @@
{
"name": "omniroute",
"version": "2.6.8",
"version": "2.8.2",
"description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
"type": "module",
"bin": {
Binary file not shown.

After

Width:  |  Height:  |  Size: 4.7 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 4.7 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.2 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.2 KiB

+1
View File
@@ -0,0 +1 @@
<svg width="56" height="64" viewBox="0 0 56 64" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M53.292 15.321l1.5-3.676s-1.909-2.043-4.227-4.358c-2.317-2.315-7.225-.953-7.225-.953L37.751 0H18.12l-5.589 6.334s-4.908-1.362-7.225.953C2.988 9.602 1.08 11.645 1.08 11.645l1.5 3.676-1.91 5.447s5.614 21.236 6.272 23.83c1.295 5.106 2.181 7.08 5.862 9.668 3.68 2.587 10.36 7.08 11.45 7.762 1.091.68 2.455 1.84 3.682 1.84 1.227 0 2.59-1.16 3.68-1.84 1.091-.681 7.77-5.175 11.452-7.762 3.68-2.587 4.567-4.562 5.862-9.668.657-2.594 6.27-23.83 6.27-23.83l-1.908-5.447z" fill="url(#paint0_linear)"/><path fill-rule="evenodd" clip-rule="evenodd" d="M34.888 11.508c.818 0 6.885-1.157 6.885-1.157s7.189 8.68 7.189 10.536c0 1.534-.619 2.134-1.347 2.842-.152.148-.31.3-.467.468l-5.39 5.717a9.42 9.42 0 01-.176.18c-.538.54-1.33 1.336-.772 2.658l.115.269c.613 1.432 1.37 3.2.407 4.99-1.025 1.906-2.78 3.178-3.905 2.967-1.124-.21-3.766-1.589-4.737-2.218-.971-.63-4.05-3.166-4.05-4.137 0-.809 2.214-2.155 3.29-2.81.214-.13.383-.232.48-.298.111-.075.297-.19.526-.332.981-.61 2.754-1.71 2.799-2.197.055-.602.034-.778-.758-2.264-.168-.316-.365-.654-.568-1.004-.754-1.295-1.598-2.745-1.41-3.784.21-1.173 2.05-1.845 3.608-2.415.194-.07.385-.14.567-.209l1.623-.609c1.556-.582 3.284-1.229 3.57-1.36.394-.181.292-.355-.903-.468a54.655 54.655 0 01-.58-.06c-1.48-.157-4.209-.446-5.535-.077-.261.073-.553.152-.86.235-1.49.403-3.317.897-3.493 1.182-.03.05-.06.093-.089.133-.168.238-.277.394-.091 1.406.055.302.169.895.31 1.629.41 2.148 1.053 5.498 1.134 6.25.011.106.024.207.036.305.103.84.171 1.399-.805 1.622l-.255.058c-1.102.252-2.717.623-3.3.623-.584 0-2.2-.37-3.302-.623l-.254-.058c-.976-.223-.907-.782-.804-1.622.012-.098.024-.2.035-.305.081-.753.725-4.112 1.137-6.259.14-.73.253-1.32.308-1.62.185-1.012.076-1.168-.092-1.406a3.743 3.743 0 01-.09-.133c-.174-.285-2-.779-3.491-1.182-.307-.083-.6-.162-.86-.235-1.327-.37-4.055-.08-5.535.077-.226.024-.422.045-.58.06-1.196.113-1.297.287-.903.468.285.131 2.013.778 3.568 1.36.597.223 1.17.437 1.624.609.183.069.373.138.568.21 1.558.57 3.398 1.241 3.608 2.414.187 1.039-.657 2.489-1.41 3.784-.204.35-.4.688-.569 1.004-.791 1.486-.812 1.662-.757 2.264.044.488 1.816 1.587 2.798 2.197.229.142.415.257.526.332.098.066.266.168.48.298 1.076.654 3.29 2 3.29 2.81 0 .97-3.078 3.507-4.05 4.137-.97.63-3.612 2.008-4.737 2.218-1.124.21-2.88-1.061-3.904-2.966-.963-1.791-.207-3.559.406-4.99l.115-.27c.559-1.322-.233-2.118-.772-2.658a9.377 9.377 0 01-.175-.18l-5.39-5.717c-.158-.167-.316-.32-.468-.468-.728-.707-1.346-1.308-1.346-2.842 0-1.855 7.189-10.536 7.189-10.536s6.066 1.157 6.884 1.157c.653 0 1.913-.433 3.227-.885.333-.114.669-.23 1-.34 1.635-.545 2.726-.549 2.726-.549s1.09.004 2.726.549c.33.11.667.226 1 .34 1.313.452 2.574.885 3.226.885zm-1.041 30.706c1.282.66 2.192 1.128 2.536 1.343.445.278.174.803-.232 1.09-.405.285-5.853 4.499-6.381 4.965l-.215.191c-.509.459-1.159 1.044-1.62 1.044-.46 0-1.11-.586-1.62-1.044l-.213-.191c-.53-.466-5.977-4.68-6.382-4.966-.405-.286-.677-.81-.232-1.09.344-.214 1.255-.683 2.539-1.344l1.22-.629c1.92-.992 4.315-1.837 4.689-1.837.373 0 2.767.844 4.689 1.837.436.226.845.437 1.222.63z" fill="#fff"/><path fill-rule="evenodd" clip-rule="evenodd" d="M43.34 6.334L37.751 0H18.12l-5.589 6.334s-4.908-1.362-7.225.953c0 0 6.544-.59 8.793 3.064 0 0 6.066 1.157 6.884 1.157.818 0 2.59-.68 4.226-1.225 1.636-.545 2.727-.549 2.727-.549s1.09.004 2.726.549 3.408 1.225 4.226 1.225c.818 0 6.885-1.157 6.885-1.157 2.249-3.654 8.792-3.064 8.792-3.064-2.317-2.315-7.225-.953-7.225-.953z" fill="url(#paint1_linear)"/><defs><linearGradient id="paint0_linear" x1=".671" y1="64.319" x2="55.2" y2="64.319" gradientUnits="userSpaceOnUse"><stop stop-color="#F50"/><stop offset=".41" stop-color="#F50"/><stop offset=".582" stop-color="#FF2000"/><stop offset="1" stop-color="#FF2000"/></linearGradient><linearGradient id="paint1_linear" x1="6.278" y1="11.466" x2="50.565" y2="11.466" gradientUnits="userSpaceOnUse"><stop stop-color="#FF452A"/><stop offset="1" stop-color="#FF2000"/></linearGradient></defs></svg>

After

Width:  |  Height:  |  Size: 4.0 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 6.6 KiB

+4
View File
@@ -0,0 +1,4 @@
<svg xmlns="http://www.w3.org/2000/svg" width="48" height="48" viewBox="0 0 48 48">
<rect width="48" height="48" rx="8" fill="#1E40AF"/>
<text x="24" y="32" text-anchor="middle" font-family="system-ui,-apple-system,sans-serif" font-size="22" font-weight="700" fill="white">exa</text>
</svg>

After

Width:  |  Height:  |  Size: 295 B

+4
View File
@@ -0,0 +1,4 @@
<svg xmlns="http://www.w3.org/2000/svg" width="48" height="48" viewBox="0 0 48 48">
<rect width="48" height="48" rx="8" fill="#1E40AF"/>
<text x="24" y="32" text-anchor="middle" font-family="system-ui,-apple-system,sans-serif" font-size="22" font-weight="700" fill="white">exa</text>
</svg>

After

Width:  |  Height:  |  Size: 295 B

Binary file not shown.

After

Width:  |  Height:  |  Size: 7.0 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.1 KiB

After

Width:  |  Height:  |  Size: 7.0 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.7 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.7 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.3 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 1.3 KiB

+63 -7
View File
@@ -14,6 +14,7 @@
*
* Fixes: https://github.com/diegosouzapw/OmniRoute/issues/129
* Fixes: https://github.com/diegosouzapw/OmniRoute/issues/321
* Fixes: https://github.com/diegosouzapw/OmniRoute/issues/426
*/
import { existsSync, copyFileSync, mkdirSync } from "node:fs";
@@ -80,8 +81,54 @@ if (existsSync(rootBinary)) {
}
}
// Strategy 1.5: Use node-pre-gyp to download the correct prebuilt binary
// This works on Windows without requiring node-gyp, Python, or MSVC.
// better-sqlite3 ships prebuilts for win32-x64, win32-arm64, darwin-x64/arm64.
console.log(" 📥 Attempting to download prebuilt binary via node-pre-gyp...");
try {
const { execSync } = await import("node:child_process");
// better-sqlite3 bundles @mapbox/node-pre-gyp — use it directly
const preGypBin = join(
ROOT,
"app",
"node_modules",
".bin",
process.platform === "win32" ? "node-pre-gyp.cmd" : "node-pre-gyp"
);
const preGypFallback = join(
ROOT,
"app",
"node_modules",
"@mapbox",
"node-pre-gyp",
"bin",
"node-pre-gyp"
);
const preGypCmd = existsSync(preGypBin) ? preGypBin : preGypFallback;
if (existsSync(preGypCmd)) {
execSync(`"${process.execPath}" "${preGypCmd}" install --fallback-to-build=false`, {
cwd: join(ROOT, "app", "node_modules", "better-sqlite3"),
stdio: "inherit",
timeout: 60_000,
});
mkdirSync(dirname(appBinary), { recursive: true });
try {
process.dlopen({ exports: {} }, appBinary);
console.log(" ✅ Prebuilt binary downloaded and loaded successfully!\n");
process.exit(0);
} catch (loadErr) {
console.warn(` ⚠️ Downloaded binary failed to load: ${loadErr.message}`);
}
} else {
console.warn(" ⚠️ node-pre-gyp not found, skipping prebuilt download.");
}
} catch (err) {
console.warn(` ⚠️ node-pre-gyp download failed: ${err.message.split("\n")[0]}`);
}
// Strategy 2: Fall back to npm rebuild (may work if build tools are available)
console.log(" ⚠️ Root binary not available or incompatible, attempting npm rebuild...");
console.log(" ⚠️ Attempting npm rebuild (requires build tools)...");
try {
const { execSync } = await import("node:child_process");
@@ -103,14 +150,23 @@ try {
}
}
// If nothing worked, warn but don't fail the install — let the package stay
// installed so users can fix manually or use the pre-flight check in the CLI
console.warn(" ⚠️ Could not fix better-sqlite3 native module automatically.");
// If nothing worked, warn but don't fail the install
console.warn("\n ⚠️ Could not fix better-sqlite3 native module automatically.");
console.warn(" The server may not start correctly.");
console.warn(" Try manually:");
console.warn(` cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
if (process.platform === "darwin") {
console.warn(" Manual fix options:");
if (process.platform === "win32") {
console.warn(" Option A (easiest — no build tools needed):");
console.warn(` cd "${join(ROOT, "app", "node_modules", "better-sqlite3")}"`);
console.warn(" npx @mapbox/node-pre-gyp install --fallback-to-build=false");
console.warn(" Option B (requires Build Tools for Visual Studio):");
console.warn(` cd "${join(ROOT, "app")}" && npm rebuild better-sqlite3`);
console.warn(" Install from: https://visualstudio.microsoft.com/visual-cpp-build-tools/");
console.warn(" Also ensure Python is installed: https://python.org");
} else if (process.platform === "darwin") {
console.warn(` cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
console.warn(" If build tools are missing: xcode-select --install");
} else {
console.warn(` cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
}
console.warn("");
+13
View File
@@ -278,6 +278,19 @@ if (existsSync(swcHelpersSrc) && !existsSync(swcHelpersDst)) {
console.log(" ✅ @swc/helpers included in standalone build.");
}
// ── Step 10.6: Remove large binaries from standalone build ──
// These directories contain platform-native binaries (.node, .asar) that
// trigger Z_DATA_ERROR during npm pack. They are not needed in the npm package.
const binaryDirsToRemove = ["vscode-extension", "electron"];
for (const dir of binaryDirsToRemove) {
const targetDir = join(APP_DIR, dir);
if (existsSync(targetDir)) {
console.log(` 🧹 Removing app/${dir}/ (not needed in npm package)...`);
rmSync(targetDir, { recursive: true, force: true });
console.log(` ✅ app/${dir}/ removed.`);
}
}
// ── Done ───────────────────────────────────────────────────
const appPkg = join(APP_DIR, "package.json");
if (existsSync(appPkg)) {
@@ -1181,6 +1181,12 @@ function ComboFormModal({ isOpen, combo, onClose, onSave, activeProviders }) {
const [config, setConfig] = useState(combo?.config || {});
const [showStrategyNudge, setShowStrategyNudge] = useState(false);
const strategyChangeMountedRef = useRef(false);
// Agent features (#399 / #401 / #454)
const [agentSystemMessage, setAgentSystemMessage] = useState<string>(combo?.system_message || "");
const [agentToolFilter, setAgentToolFilter] = useState<string>(combo?.tool_filter_regex || "");
const [agentContextCache, setAgentContextCache] = useState<boolean>(
!!combo?.context_cache_protection
);
// DnD state
const hasPricingForModel = useCallback(
@@ -1532,6 +1538,14 @@ function ComboFormModal({ isOpen, combo, onClose, onSave, activeProviders }) {
saveData.config = configToSave;
}
// Agent features (#399 / #401 / #454)
if (agentSystemMessage.trim()) saveData.system_message = agentSystemMessage.trim();
else delete saveData.system_message;
if (agentToolFilter.trim()) saveData.tool_filter_regex = agentToolFilter.trim();
else delete saveData.tool_filter_regex;
if (agentContextCache) saveData.context_cache_protection = true;
else delete saveData.context_cache_protection;
await onSave(saveData);
setSaving(false);
};
@@ -2052,6 +2066,72 @@ function ComboFormModal({ isOpen, combo, onClose, onSave, activeProviders }) {
</div>
)}
{/* Agent Features (#399 / #401 / #454) */}
<div className="flex flex-col gap-2 p-3 bg-black/[0.02] dark:bg-white/[0.02] rounded-lg border border-black/5 dark:border-white/5">
<div className="flex items-center gap-1.5 mb-1">
<span className="material-symbols-outlined text-[14px] text-primary">smart_toy</span>
<p className="text-xs font-medium">Agent Features</p>
<span className="text-[10px] text-text-muted">
optional, for agent/tool workflows
</span>
</div>
{/* System Message Override */}
<div>
<label className="text-[11px] font-medium text-text-muted block mb-0.5">
System Message Override
</label>
<textarea
rows={2}
value={agentSystemMessage}
onChange={(e) => setAgentSystemMessage(e.target.value)}
placeholder="Override the system prompt for all requests routed through this combo…"
className="w-full text-xs py-1.5 px-2 rounded border border-black/10 dark:border-white/10 bg-transparent focus:border-primary focus:outline-none resize-none"
/>
<p className="text-[10px] text-text-muted mt-0.5">
Replaces any system message sent by the client. Leave empty to pass through client
system messages.
</p>
</div>
{/* Tool Filter Regex */}
<div>
<label className="text-[11px] font-medium text-text-muted block mb-0.5">
Tool Filter Regex
</label>
<input
type="text"
value={agentToolFilter}
onChange={(e) => setAgentToolFilter(e.target.value)}
placeholder="e.g. ^(bash|computer)$"
className="w-full text-xs py-1.5 px-2 rounded border border-black/10 dark:border-white/10 bg-transparent focus:border-primary focus:outline-none font-mono"
/>
<p className="text-[10px] text-text-muted mt-0.5">
Only tools whose name matches this regex are forwarded to the provider. Leave empty
to forward all tools.
</p>
</div>
{/* Context Cache Protection */}
<div className="flex items-center justify-between gap-2">
<div>
<label className="text-[11px] font-medium text-text-muted block">
Context Cache Protection
</label>
<p className="text-[10px] text-text-muted">
Pins the provider/model across turns to preserve cache sessions. Internal tags are
stripped before forwarding to the provider.
</p>
</div>
<input
type="checkbox"
checked={agentContextCache}
onChange={(e) => setAgentContextCache(e.target.checked)}
className="accent-primary shrink-0"
/>
</div>
</div>
{/* Actions */}
<div className="flex gap-2 pt-1">
<Button onClick={onClose} variant="ghost" fullWidth size="sm">
@@ -33,11 +33,29 @@ export default function APIPageClient({ machineId }) {
const [viewTab, setViewTab] = useState("api");
const [mcpStatus, setMcpStatus] = useState<any>(null);
const [a2aStatus, setA2aStatus] = useState<any>(null);
const [searchProviders, setSearchProviders] = useState<any[]>([]);
const { copied, copy } = useCopyToClipboard();
const fetchSearchProviders = async () => {
try {
const res = await fetch("/api/search/providers");
if (res.ok) {
const data = await res.json();
setSearchProviders(data.providers || []);
}
} catch {
// Search endpoint may not be available
}
};
useEffect(() => {
Promise.allSettled([loadCloudSettings(), fetchModels(), fetchProtocolStatus()]).finally(() => {
Promise.allSettled([
loadCloudSettings(),
fetchModels(),
fetchProtocolStatus(),
fetchSearchProviders(),
]).finally(() => {
setLoading(false);
});
}, []);
@@ -575,6 +593,47 @@ export default function APIPageClient({ machineId }) {
</div>
</div>
{/* Search & Discovery */}
{searchProviders.length > 0 && (
<div className="mb-6">
<div className="flex items-center gap-2 mb-3">
<span className="material-symbols-outlined text-sm text-cyan-400">
travel_explore
</span>
<h3 className="text-xs font-semibold text-text-muted uppercase tracking-wider">
{t("categorySearch") || "Search & Discovery"}
</h3>
<div className="flex-1 h-px bg-border/50" />
</div>
<div className="flex flex-col gap-3">
<EndpointSection
icon="search"
iconColor="text-cyan-500"
iconBg="bg-cyan-500/10"
title={t("webSearch") || "Web Search"}
path="/v1/search"
description={
t("webSearchDesc") ||
"Unified web search across multiple providers with automatic failover and caching"
}
models={searchProviders.map((p) => ({
id: p.id,
name: p.name,
owned_by: p.id,
type: "search",
}))}
expanded={expandedEndpoint === "search"}
onToggle={() =>
setExpandedEndpoint(expandedEndpoint === "search" ? null : "search")
}
copy={copy}
copied={copied}
baseUrl={currentEndpoint}
/>
</div>
</div>
)}
{/* Utility & Management */}
<div>
<div className="flex items-center gap-2 mb-3">
+119 -11
View File
@@ -1,27 +1,135 @@
"use client";
import { useState } from "react";
import { useState, useRef, useEffect } from "react";
import { RequestLoggerV2, ProxyLogger, SegmentedControl } from "@/shared/components";
import ConsoleLogViewer from "@/shared/components/ConsoleLogViewer";
import AuditLogTab from "./AuditLogTab";
import { useTranslations } from "next-intl";
const TIME_RANGES = [
{ label: "1h", hours: 1 },
{ label: "6h", hours: 6 },
{ label: "12h", hours: 12 },
{ label: "24h", hours: 24 },
];
const TAB_TO_LOG_TYPE: Record<string, string> = {
"request-logs": "request-logs",
"proxy-logs": "proxy-logs",
"audit-logs": "call-logs",
console: "call-logs",
};
export default function LogsPage() {
const [activeTab, setActiveTab] = useState("request-logs");
const [showExport, setShowExport] = useState(false);
const [exporting, setExporting] = useState(false);
const dropdownRef = useRef<HTMLDivElement>(null);
const t = useTranslations("logs");
useEffect(() => {
function handleClickOutside(e: MouseEvent) {
if (dropdownRef.current && !dropdownRef.current.contains(e.target as Node)) {
setShowExport(false);
}
}
document.addEventListener("mousedown", handleClickOutside);
return () => document.removeEventListener("mousedown", handleClickOutside);
}, []);
async function handleExport(hours: number) {
setExporting(true);
setShowExport(false);
try {
const logType = TAB_TO_LOG_TYPE[activeTab] || "call-logs";
const res = await fetch(`/api/logs/export?hours=${hours}&type=${logType}`);
if (!res.ok) throw new Error("Export failed");
const blob = await res.blob();
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
a.download = `omniroute-${logType}-${hours}h-${new Date().toISOString().slice(0, 10)}.json`;
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
URL.revokeObjectURL(url);
} catch (err) {
console.error("Export failed:", err);
} finally {
setExporting(false);
}
}
return (
<div className="flex flex-col gap-6">
<SegmentedControl
options={[
{ value: "request-logs", label: t("requestLogs") },
{ value: "proxy-logs", label: t("proxyLogs") },
{ value: "audit-logs", label: t("auditLog") },
{ value: "console", label: t("console") },
]}
value={activeTab}
onChange={setActiveTab}
/>
<div className="flex items-center justify-between gap-4 flex-wrap">
<SegmentedControl
options={[
{ value: "request-logs", label: t("requestLogs") },
{ value: "proxy-logs", label: t("proxyLogs") },
{ value: "audit-logs", label: t("auditLog") },
{ value: "console", label: t("console") },
]}
value={activeTab}
onChange={setActiveTab}
/>
<div className="relative" ref={dropdownRef}>
<button
id="export-logs-btn"
onClick={() => setShowExport(!showExport)}
disabled={exporting}
className="flex items-center gap-2 px-4 py-2 text-sm font-medium rounded-lg
bg-[var(--card-bg,#1e1e2e)] border border-[var(--border,#333)]
text-[var(--text-secondary,#aaa)] hover:text-[var(--text-primary,#fff)]
hover:border-[var(--accent,#7c3aed)] transition-all duration-200
disabled:opacity-50 disabled:cursor-not-allowed"
>
<svg
width="16"
height="16"
viewBox="0 0 16 16"
fill="none"
stroke="currentColor"
strokeWidth="1.5"
>
<path
d="M8 2v8m0 0l-3-3m3 3l3-3M3 12h10"
strokeLinecap="round"
strokeLinejoin="round"
/>
</svg>
{exporting ? "Exporting..." : "Export"}
</button>
{showExport && (
<div
className="absolute right-0 top-full mt-1 z-50 min-w-[140px] rounded-lg
bg-[var(--card-bg,#1e1e2e)] border border-[var(--border,#333)]
shadow-xl overflow-hidden animate-in fade-in"
>
<div className="px-3 py-2 text-xs text-[var(--text-muted,#666)] border-b border-[var(--border,#333)] font-medium">
Time Range
</div>
{TIME_RANGES.map((range) => (
<button
key={range.hours}
id={`export-${range.hours}h-btn`}
onClick={() => handleExport(range.hours)}
className="w-full px-3 py-2 text-sm text-left hover:bg-[var(--hover-bg,#2a2a3e)]
text-[var(--text-secondary,#aaa)] hover:text-[var(--text-primary,#fff)]
transition-colors flex items-center justify-between"
>
<span>Last {range.label}</span>
<span className="text-xs text-[var(--text-muted,#666)]">
{range.hours === 24 ? "default" : ""}
</span>
</button>
))}
</div>
)}
</div>
</div>
{/* Content */}
{activeTab === "request-logs" && <RequestLoggerV2 />}
@@ -0,0 +1,406 @@
"use client";
import { useState, useEffect, useRef } from "react";
import dynamic from "next/dynamic";
import { useTranslations } from "next-intl";
import { Card, Button, Select, Badge } from "@/shared/components";
const Editor = dynamic(() => import("@monaco-editor/react"), { ssr: false });
interface SearchProvider {
id: string;
name: string;
status: "active" | "no_credentials";
cost_per_query: number;
}
interface SearchResult {
title: string;
url: string;
snippet: string;
score?: number;
date?: string;
}
interface SearchResponse {
id: string;
provider: string;
results: SearchResult[];
query: string;
answer: string | null;
cached: boolean;
usage: {
queries_used: number;
search_cost_usd: number;
};
metrics: {
response_time_ms: number;
upstream_latency_ms: number;
total_results_available: number | null;
};
}
function formatBytes(bytes: number): string {
if (bytes < 1024) return `${bytes} B`;
return `${(bytes / 1024).toFixed(1)} KB`;
}
export default function SearchPlayground() {
const t = useTranslations("search");
const [providers, setProviders] = useState<SearchProvider[]>([]);
const [selectedProvider, setSelectedProvider] = useState("");
const [requestBody, setRequestBody] = useState(
JSON.stringify(
{
query: "latest AI developments",
max_results: 5,
search_type: "web",
},
null,
2
)
);
const [response, setResponse] = useState<SearchResponse | null>(null);
const [rawResponse, setRawResponse] = useState("");
const [loading, setLoading] = useState(false);
const [error, setError] = useState("");
const [duration, setDuration] = useState(0);
const [statusCode, setStatusCode] = useState(0);
const [showJson, setShowJson] = useState(false);
const abortRef = useRef<AbortController | null>(null);
useEffect(() => {
fetch("/api/search/providers")
.then((res) => res.json())
.then((data) => {
const allProviders = data.providers || [];
setProviders(allProviders);
const firstActive = allProviders.find((p: SearchProvider) => p.status === "active");
if (firstActive) setSelectedProvider(firstActive.id);
})
.catch(() => {});
}, []);
const handleSend = async () => {
setLoading(true);
setError("");
setResponse(null);
setRawResponse("");
setStatusCode(0);
const controller = new AbortController();
abortRef.current = controller;
const timeout = setTimeout(() => controller.abort(), 15_000);
const start = Date.now();
try {
let body: any;
try {
body = JSON.parse(requestBody);
} catch {
setError("Invalid JSON in request body");
setLoading(false);
clearTimeout(timeout);
return;
}
if (selectedProvider) body.provider = selectedProvider;
const res = await fetch("/api/v1/search", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(body),
signal: controller.signal,
});
setDuration(Date.now() - start);
setStatusCode(res.status);
const data = await res.json();
setRawResponse(JSON.stringify(data, null, 2));
if (res.ok) {
setResponse(data);
} else {
setError(data.error?.message || data.error || `Error ${res.status}`);
}
} catch (err: any) {
setDuration(Date.now() - start);
if (err.name === "AbortError") {
setError("Request timed out (15s)");
} else {
setError(err.message || "Network error");
}
} finally {
setLoading(false);
clearTimeout(timeout);
}
};
const handleCancel = () => {
abortRef.current?.abort();
};
const getScoreColor = (score: number) => {
if (score >= 0.9) return "text-success";
if (score >= 0.7) return "text-warning";
return "text-error";
};
const getScoreBg = (score: number) => {
if (score >= 0.9) return "bg-green-500/10";
if (score >= 0.7) return "bg-yellow-500/10";
return "bg-red-500/10";
};
const noProviders = providers.filter((p) => p.status === "active").length === 0;
const editorTheme =
typeof document !== "undefined" && document.documentElement.classList.contains("dark")
? "vs-dark"
: "light";
return (
<div className="grid grid-cols-1 lg:grid-cols-2 gap-4">
{/* Request panel */}
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">upload</span>
<h3 className="text-sm font-semibold text-text-main">Request</h3>
<Badge variant="info" size="sm">
POST /v1/search
</Badge>
</div>
<div className="flex items-center gap-1">
<button
onClick={() => navigator.clipboard.writeText(requestBody)}
className="p-1.5 rounded hover:bg-black/5 dark:hover:bg-white/5 text-text-muted hover:text-text-main transition-colors"
title="Copy"
>
<span className="material-symbols-outlined text-[16px]">content_copy</span>
</button>
<button
onClick={() =>
setRequestBody(
JSON.stringify(
{
query: "latest AI developments",
max_results: 5,
search_type: "web",
},
null,
2
)
)
}
className="p-1.5 rounded hover:bg-black/5 dark:hover:bg-white/5 text-text-muted hover:text-text-main transition-colors"
title="Reset to default"
>
<span className="material-symbols-outlined text-[16px]">restart_alt</span>
</button>
</div>
</div>
<div className="border border-border rounded-lg overflow-hidden">
<Editor
height="400px"
defaultLanguage="json"
value={requestBody}
onChange={(value: string | undefined) => setRequestBody(value || "")}
theme={editorTheme}
options={{
minimap: { enabled: false },
fontSize: 12,
lineNumbers: "on",
scrollBeyondLastLine: false,
wordWrap: "on",
automaticLayout: true,
formatOnPaste: true,
}}
/>
</div>
<div className="flex items-center gap-3">
<div className="flex-1">
<Select
value={selectedProvider}
onChange={(e: any) => setSelectedProvider(e.target.value)}
options={providers.map((p) => ({
value: p.id,
label: `${p.name}${p.status === "no_credentials" ? " (no key)" : ""}`,
}))}
className="w-full"
/>
</div>
{loading ? (
<Button icon="stop" variant="secondary" onClick={handleCancel}>
Cancel
</Button>
) : (
<Button
icon="search"
onClick={handleSend}
disabled={noProviders || !requestBody.trim()}
>
{t("webSearch")}
</Button>
)}
</div>
{noProviders && <p className="text-xs text-text-muted">{t("noSearchProviders")}</p>}
</div>
</Card>
{/* Response panel */}
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">
download
</span>
<h3 className="text-sm font-semibold text-text-main">Response</h3>
{statusCode > 0 && (
<>
<Badge variant={statusCode < 400 ? "success" : "error"} size="sm">
{statusCode}
</Badge>
<span className="text-xs text-text-muted">{duration}ms</span>
</>
)}
{loading && (
<span className="material-symbols-outlined text-[14px] text-primary animate-spin">
progress_activity
</span>
)}
</div>
{response && (
<div className="flex gap-1">
<button
className={`text-xs px-3 py-1 rounded-md ${
!showJson
? "bg-primary/15 text-primary font-medium"
: "bg-black/5 dark:bg-white/5 text-text-muted"
}`}
onClick={() => setShowJson(false)}
>
{t("formatted")}
</button>
<button
className={`text-xs px-3 py-1 rounded-md ${
showJson
? "bg-primary/15 text-primary font-medium"
: "bg-black/5 dark:bg-white/5 text-text-muted"
}`}
onClick={() => setShowJson(true)}
>
{t("rawJson")}
</button>
</div>
)}
</div>
<div className="border border-border rounded-lg overflow-hidden min-h-[400px]">
{loading && (
<div className="flex items-center justify-center h-[400px]">
<span className="material-symbols-outlined text-[24px] text-primary animate-spin">
progress_activity
</span>
</div>
)}
{error && !loading && (
<div className="p-4">
<div className="text-error text-sm">{error}</div>
</div>
)}
{response && !showJson && !loading && (
<div className="p-4 space-y-3">
{/* Meta bar */}
<div className="flex justify-between items-center p-2 bg-bg-alt rounded-lg">
<div className="flex items-center gap-3 text-xs text-text-muted">
<span>
{response.results.length} {t("searchResults").toLowerCase()}
</span>
<span className="flex items-center gap-1">
<span className="w-1.5 h-1.5 rounded-full bg-primary" />
{response.provider}
</span>
<span>${response.usage?.search_cost_usd?.toFixed(4)}</span>
<span>{formatBytes(rawResponse.length)}</span>
</div>
<span
className={`text-xs flex items-center gap-1 ${
response.cached ? "text-success" : "text-warning"
}`}
>
<span
className={`w-1.5 h-1.5 rounded-full ${
response.cached ? "bg-success" : "bg-warning"
}`}
/>
{response.cached ? t("cacheHit") : t("cacheMiss")}
</span>
</div>
{/* Results */}
{response.results.map((r, i) => (
<div
key={i}
className="border-l-[3px] border-l-primary p-3 bg-surface rounded-r-lg border border-border"
>
<div className="flex justify-between items-start">
<span className="text-sm font-medium text-text-main">
{i + 1}. {r.title}
</span>
{r.score != null && (
<span
className={`text-[10px] px-2 py-0.5 rounded-md ml-2 whitespace-nowrap ${getScoreBg(r.score)} ${getScoreColor(r.score)}`}
>
{r.score.toFixed(2)}
</span>
)}
</div>
<a
href={r.url}
target="_blank"
rel="noopener noreferrer"
className="text-accent text-[11px] block mt-0.5"
>
{r.url}
</a>
<p className="text-xs text-text-muted mt-1 leading-relaxed">{r.snippet}</p>
</div>
))}
</div>
)}
{response && showJson && !loading && (
<Editor
height="400px"
defaultLanguage="json"
value={rawResponse}
theme={editorTheme}
options={{
readOnly: true,
minimap: { enabled: false },
fontSize: 12,
lineNumbers: "on",
scrollBeyondLastLine: false,
wordWrap: "on",
automaticLayout: true,
}}
/>
)}
{!loading && !error && !response && (
<div className="flex items-center justify-center h-[400px] text-text-muted text-sm">
{t("emptyState")}
</div>
)}
</div>
</div>
</Card>
</div>
);
}
+305 -279
View File
@@ -5,6 +5,9 @@ import { Card, Button, Select, Badge } from "@/shared/components";
import dynamic from "next/dynamic";
const Editor = dynamic(() => import("@monaco-editor/react"), { ssr: false });
const SearchPlayground = dynamic(() => import("./SearchPlayground"), {
ssr: false,
});
interface ModelInfo {
id: string;
@@ -27,6 +30,7 @@ const ENDPOINT_OPTIONS = [
{ value: "video", label: "Video Generation" },
{ value: "music", label: "Music Generation" },
{ value: "rerank", label: "Rerank" },
{ value: "search", label: "Web Search" },
];
const DEFAULT_BODIES: Record<string, object> = {
@@ -83,6 +87,11 @@ const DEFAULT_BODIES: Record<string, object> = {
],
top_n: 2,
},
search: {
query: "latest AI developments",
max_results: 5,
search_type: "web",
},
};
const ENDPOINT_PATHS: Record<string, string> = {
@@ -95,6 +104,7 @@ const ENDPOINT_PATHS: Record<string, string> = {
video: "/v1/videos/generations",
music: "/v1/music/generations",
rerank: "/v1/rerank",
search: "/v1/search",
};
// Models known to support vision (image input)
@@ -189,6 +199,7 @@ export default function PlaygroundPage() {
const [uploadedFile, setUploadedFile] = useState<File | null>(null);
const [uploadedImages, setUploadedImages] = useState<string[]>([]); // base64 URIs for vision
const isSearchEndpoint = selectedEndpoint === "search";
const isTranscriptionEndpoint = selectedEndpoint === "transcription";
const isChatEndpoint = selectedEndpoint === "chat";
const isImageEndpoint = selectedEndpoint === "images";
@@ -419,33 +430,7 @@ export default function PlaygroundPage() {
{/* Controls */}
<Card>
<div className="p-4 flex flex-col sm:flex-row items-end gap-4">
{/* Provider */}
<div className="flex-1 w-full">
<label className="block text-xs font-medium text-text-muted mb-1.5 uppercase tracking-wider">
Provider
</label>
<Select
value={selectedProvider}
onChange={(e: any) => handleProviderChange(e.target.value)}
options={providers}
className="w-full"
/>
</div>
{/* Model */}
<div className="flex-1 w-full">
<label className="block text-xs font-medium text-text-muted mb-1.5 uppercase tracking-wider">
Model
</label>
<Select
value={selectedModel}
onChange={(e: any) => handleModelChange(e.target.value)}
options={filteredModels}
className="w-full"
/>
</div>
{/* Endpoint */}
{/* Endpoint — always first */}
<div className="flex-1 w-full">
<label className="block text-xs font-medium text-text-muted mb-1.5 uppercase tracking-wider">
Endpoint
@@ -458,274 +443,315 @@ export default function PlaygroundPage() {
/>
</div>
{/* Send Button */}
<div className="shrink-0">
{loading ? (
<Button icon="stop" variant="secondary" onClick={handleCancel}>
Cancel
</Button>
) : (
<Button
icon="send"
onClick={handleSend}
disabled={
(!requestBody.trim() && !isTranscriptionEndpoint) ||
(!selectedModel && !isTranscriptionEndpoint)
}
>
Send
</Button>
)}
</div>
{/* Provider — hidden in search mode */}
{!isSearchEndpoint && (
<div className="flex-1 w-full">
<label className="block text-xs font-medium text-text-muted mb-1.5 uppercase tracking-wider">
Provider
</label>
<Select
value={selectedProvider}
onChange={(e: any) => handleProviderChange(e.target.value)}
options={providers}
className="w-full"
/>
</div>
)}
{/* Model — hidden in search mode */}
{!isSearchEndpoint && (
<div className="flex-1 w-full">
<label className="block text-xs font-medium text-text-muted mb-1.5 uppercase tracking-wider">
Model
</label>
<Select
value={selectedModel}
onChange={(e: any) => handleModelChange(e.target.value)}
options={filteredModels}
className="w-full"
/>
</div>
)}
{/* Send Button — hidden in search mode (SearchPlayground has its own) */}
{!isSearchEndpoint && (
<div className="shrink-0">
{loading ? (
<Button icon="stop" variant="secondary" onClick={handleCancel}>
Cancel
</Button>
) : (
<Button
icon="send"
onClick={handleSend}
disabled={
(!requestBody.trim() && !isTranscriptionEndpoint) ||
(!selectedModel && !isTranscriptionEndpoint)
}
>
Send
</Button>
)}
</div>
)}
</div>
</Card>
{/* File Upload Zone — shown for transcription and vision models */}
{(isTranscriptionEndpoint || supportsVision) && (
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">
attach_file
</span>
<h3 className="text-sm font-semibold text-text-main">
{isTranscriptionEndpoint ? "Audio File" : "Attach Images (Vision)"}
</h3>
{isTranscriptionEndpoint && (
<Badge variant="info" size="sm">
multipart/form-data
</Badge>
)}
{supportsVision && (
<Badge variant="info" size="sm">
up to 4 images
</Badge>
)}
</div>
{isTranscriptionEndpoint && (
<div>
<input
type="file"
accept="audio/*,video/*"
onChange={handleAudioFileChange}
className="w-full px-3 py-2 rounded-lg bg-surface border border-border text-text-main text-sm focus:outline-none focus:ring-2 focus:ring-primary/30 file:mr-3 file:py-1 file:px-3 file:rounded file:border-0 file:bg-primary/10 file:text-primary file:text-sm"
/>
{uploadedFile && (
<p className="text-xs text-text-muted mt-1 flex items-center gap-1">
<span className="material-symbols-outlined text-[12px] text-green-500">
check_circle
</span>
{uploadedFile.name} ({(uploadedFile.size / 1024).toFixed(0)} KB)
</p>
{/* Search mode — isolated sub-component */}
{isSearchEndpoint ? (
<SearchPlayground />
) : (
<>
{/* File Upload Zone — shown for transcription and vision models */}
{(isTranscriptionEndpoint || supportsVision) && (
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">
attach_file
</span>
<h3 className="text-sm font-semibold text-text-main">
{isTranscriptionEndpoint ? "Audio File" : "Attach Images (Vision)"}
</h3>
{isTranscriptionEndpoint && (
<Badge variant="info" size="sm">
multipart/form-data
</Badge>
)}
{supportsVision && (
<Badge variant="info" size="sm">
up to 4 images
</Badge>
)}
</div>
{isTranscriptionEndpoint && (
<div>
<input
type="file"
accept="audio/*,video/*"
onChange={handleAudioFileChange}
className="w-full px-3 py-2 rounded-lg bg-surface border border-border text-text-main text-sm focus:outline-none focus:ring-2 focus:ring-primary/30 file:mr-3 file:py-1 file:px-3 file:rounded file:border-0 file:bg-primary/10 file:text-primary file:text-sm"
/>
{uploadedFile && (
<p className="text-xs text-text-muted mt-1 flex items-center gap-1">
<span className="material-symbols-outlined text-[12px] text-green-500">
check_circle
</span>
{uploadedFile.name} ({(uploadedFile.size / 1024).toFixed(0)} KB)
</p>
)}
{!uploadedFile && (
<p className="text-xs text-amber-500 mt-1 flex items-center gap-1">
<span className="material-symbols-outlined text-[12px]">info</span>
Select an audio file to transcribe (mp3, wav, m4a, ogg, flac)
</p>
)}
</div>
)}
{!uploadedFile && (
<p className="text-xs text-amber-500 mt-1 flex items-center gap-1">
<span className="material-symbols-outlined text-[12px]">info</span>
Select an audio file to transcribe (mp3, wav, m4a, ogg, flac)
</p>
)}
</div>
)}
{supportsVision && (
<div>
<input
type="file"
accept="image/*"
multiple
onChange={handleImageFileChange}
className="w-full px-3 py-2 rounded-lg bg-surface border border-border text-text-main text-sm focus:outline-none focus:ring-2 focus:ring-primary/30 file:mr-3 file:py-1 file:px-3 file:rounded file:border-0 file:bg-primary/10 file:text-primary file:text-sm"
/>
{uploadedImages.length > 0 && (
<div className="flex gap-2 mt-2 flex-wrap">
{uploadedImages.map((src, i) => (
<div
key={i}
className="relative group size-16 rounded overflow-hidden border border-border"
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={src}
alt={`Attached ${i + 1}`}
className="w-full h-full object-cover"
/>
{supportsVision && (
<div>
<input
type="file"
accept="image/*"
multiple
onChange={handleImageFileChange}
className="w-full px-3 py-2 rounded-lg bg-surface border border-border text-text-main text-sm focus:outline-none focus:ring-2 focus:ring-primary/30 file:mr-3 file:py-1 file:px-3 file:rounded file:border-0 file:bg-primary/10 file:text-primary file:text-sm"
/>
{uploadedImages.length > 0 && (
<div className="flex gap-2 mt-2 flex-wrap">
{uploadedImages.map((src, i) => (
<div
key={i}
className="relative group size-16 rounded overflow-hidden border border-border"
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={src}
alt={`Attached ${i + 1}`}
className="w-full h-full object-cover"
/>
<button
onClick={() =>
setUploadedImages((prev) => prev.filter((_, idx) => idx !== i))
}
className="absolute inset-0 bg-black/50 text-white opacity-0 group-hover:opacity-100 transition-opacity flex items-center justify-center"
>
<span className="material-symbols-outlined text-[16px]">close</span>
</button>
</div>
))}
<button
onClick={() =>
setUploadedImages((prev) => prev.filter((_, idx) => idx !== i))
}
className="absolute inset-0 bg-black/50 text-white opacity-0 group-hover:opacity-100 transition-opacity flex items-center justify-center"
onClick={() => setUploadedImages([])}
className="text-xs text-text-muted hover:text-red-500 self-center ml-1"
>
<span className="material-symbols-outlined text-[16px]">close</span>
Clear all
</button>
</div>
))}
)}
</div>
)}
</div>
</Card>
)}
{/* Split Editor View */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-4">
{/* Request Panel */}
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">
upload
</span>
<h3 className="text-sm font-semibold text-text-main">Request</h3>
<Badge variant="info" size="sm">
POST {ENDPOINT_PATHS[selectedEndpoint]}
</Badge>
</div>
<div className="flex items-center gap-1">
<button
onClick={() => setUploadedImages([])}
className="text-xs text-text-muted hover:text-red-500 self-center ml-1"
onClick={() => handleCopy(requestBody)}
className="p-1.5 rounded hover:bg-black/5 dark:hover:bg-white/5 text-text-muted hover:text-text-main transition-colors"
title="Copy"
>
Clear all
<span className="material-symbols-outlined text-[16px]">content_copy</span>
</button>
<button
onClick={() => {
const template = { ...DEFAULT_BODIES[selectedEndpoint] };
if ("model" in template) (template as any).model = selectedModel;
setRequestBody(JSON.stringify(template, null, 2));
}}
className="p-1.5 rounded hover:bg-black/5 dark:hover:bg-white/5 text-text-muted hover:text-text-main transition-colors"
title="Reset to default"
>
<span className="material-symbols-outlined text-[16px]">restart_alt</span>
</button>
</div>
)}
</div>
)}
</div>
</Card>
)}
{/* Split Editor View */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-4">
{/* Request Panel */}
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">
upload
</span>
<h3 className="text-sm font-semibold text-text-main">Request</h3>
<Badge variant="info" size="sm">
POST {ENDPOINT_PATHS[selectedEndpoint]}
</Badge>
</div>
<div className="flex items-center gap-1">
<button
onClick={() => handleCopy(requestBody)}
className="p-1.5 rounded hover:bg-black/5 dark:hover:bg-white/5 text-text-muted hover:text-text-main transition-colors"
title="Copy"
>
<span className="material-symbols-outlined text-[16px]">content_copy</span>
</button>
<button
onClick={() => {
const template = { ...DEFAULT_BODIES[selectedEndpoint] };
if ("model" in template) (template as any).model = selectedModel;
setRequestBody(JSON.stringify(template, null, 2));
}}
className="p-1.5 rounded hover:bg-black/5 dark:hover:bg-white/5 text-text-muted hover:text-text-main transition-colors"
title="Reset to default"
>
<span className="material-symbols-outlined text-[16px]">restart_alt</span>
</button>
</div>
</div>
{isTranscriptionEndpoint && (
<p className="text-xs text-text-muted bg-amber-500/10 border border-amber-500/20 rounded px-2 py-1.5 flex items-start gap-1">
<span className="material-symbols-outlined text-[12px] text-amber-500 mt-0.5">
info
</span>
Transcription uses multipart/form-data. Upload the audio file above JSON below
controls extra params (model, language).
</p>
)}
<div className="border border-border rounded-lg overflow-hidden">
<Editor
height="400px"
defaultLanguage="json"
value={requestBody}
onChange={(value: string | undefined) => setRequestBody(value || "")}
theme="vs-dark"
options={{
minimap: { enabled: false },
fontSize: 12,
lineNumbers: "on",
scrollBeyondLastLine: false,
wordWrap: "on",
automaticLayout: true,
formatOnPaste: true,
}}
/>
</div>
</div>
</Card>
{/* Response Panel */}
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">
download
</span>
<h3 className="text-sm font-semibold text-text-main">Response</h3>
{responseStatus !== null && (
<Badge
variant={responseStatus >= 200 && responseStatus < 300 ? "success" : "error"}
size="sm"
>
{responseStatus}
</Badge>
)}
{responseDuration !== null && (
<span className="text-xs text-text-muted">{responseDuration}ms</span>
)}
{loading && (
<span className="material-symbols-outlined text-[14px] text-primary animate-spin">
progress_activity
</span>
)}
</div>
<div className="flex items-center gap-1">
<button
onClick={() => handleCopy(responseBody)}
className="p-1.5 rounded hover:bg-black/5 dark:hover:bg-white/5 text-text-muted hover:text-text-main transition-colors"
title="Copy"
>
<span className="material-symbols-outlined text-[16px]">content_copy</span>
</button>
</div>
</div>
<div className="border border-border rounded-lg overflow-hidden">
{audioUrl ? (
<div className="p-4 space-y-3">
<audio controls src={audioUrl} className="w-full rounded-lg" autoPlay />
<a
href={audioUrl}
download="speech.mp3"
className="inline-flex items-center gap-2 text-sm text-primary hover:underline"
>
<span className="material-symbols-outlined text-[16px]">download</span>
Download audio
</a>
</div>
) : imageData ? (
<ImageResultsInline data={imageData} />
) : transcriptionText !== null ? (
<div className="p-4 space-y-2">
<p className="text-xs text-text-muted font-medium uppercase tracking-wider">
Transcription
{isTranscriptionEndpoint && (
<p className="text-xs text-text-muted bg-amber-500/10 border border-amber-500/20 rounded px-2 py-1.5 flex items-start gap-1">
<span className="material-symbols-outlined text-[12px] text-amber-500 mt-0.5">
info
</span>
Transcription uses multipart/form-data. Upload the audio file above JSON below
controls extra params (model, language).
</p>
<div className="bg-surface/50 rounded p-3 text-sm text-text-main leading-relaxed whitespace-pre-wrap">
{transcriptionText}
</div>
<button
onClick={() => handleCopy(transcriptionText)}
className="text-xs text-primary hover:underline flex items-center gap-1"
>
<span className="material-symbols-outlined text-[12px]">content_copy</span>
Copy text
</button>
)}
<div className="border border-border rounded-lg overflow-hidden">
<Editor
height="400px"
defaultLanguage="json"
value={requestBody}
onChange={(value: string | undefined) => setRequestBody(value || "")}
theme="vs-dark"
options={{
minimap: { enabled: false },
fontSize: 12,
lineNumbers: "on",
scrollBeyondLastLine: false,
wordWrap: "on",
automaticLayout: true,
formatOnPaste: true,
}}
/>
</div>
) : (
<Editor
height="400px"
defaultLanguage="json"
value={responseBody}
theme="vs-dark"
options={{
minimap: { enabled: false },
fontSize: 12,
lineNumbers: "on",
scrollBeyondLastLine: false,
wordWrap: "on",
automaticLayout: true,
readOnly: true,
}}
/>
)}
</div>
</div>
</Card>
{/* Response Panel */}
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">
download
</span>
<h3 className="text-sm font-semibold text-text-main">Response</h3>
{responseStatus !== null && (
<Badge
variant={
responseStatus >= 200 && responseStatus < 300 ? "success" : "error"
}
size="sm"
>
{responseStatus}
</Badge>
)}
{responseDuration !== null && (
<span className="text-xs text-text-muted">{responseDuration}ms</span>
)}
{loading && (
<span className="material-symbols-outlined text-[14px] text-primary animate-spin">
progress_activity
</span>
)}
</div>
<div className="flex items-center gap-1">
<button
onClick={() => handleCopy(responseBody)}
className="p-1.5 rounded hover:bg-black/5 dark:hover:bg-white/5 text-text-muted hover:text-text-main transition-colors"
title="Copy"
>
<span className="material-symbols-outlined text-[16px]">content_copy</span>
</button>
</div>
</div>
<div className="border border-border rounded-lg overflow-hidden">
{audioUrl ? (
<div className="p-4 space-y-3">
<audio controls src={audioUrl} className="w-full rounded-lg" autoPlay />
<a
href={audioUrl}
download="speech.mp3"
className="inline-flex items-center gap-2 text-sm text-primary hover:underline"
>
<span className="material-symbols-outlined text-[16px]">download</span>
Download audio
</a>
</div>
) : imageData ? (
<ImageResultsInline data={imageData} />
) : transcriptionText !== null ? (
<div className="p-4 space-y-2">
<p className="text-xs text-text-muted font-medium uppercase tracking-wider">
Transcription
</p>
<div className="bg-surface/50 rounded p-3 text-sm text-text-main leading-relaxed whitespace-pre-wrap">
{transcriptionText}
</div>
<button
onClick={() => handleCopy(transcriptionText)}
className="text-xs text-primary hover:underline flex items-center gap-1"
>
<span className="material-symbols-outlined text-[12px]">content_copy</span>
Copy text
</button>
</div>
) : (
<Editor
height="400px"
defaultLanguage="json"
value={responseBody}
theme="vs-dark"
options={{
minimap: { enabled: false },
fontSize: 12,
lineNumbers: "on",
scrollBeyondLastLine: false,
wordWrap: "on",
automaticLayout: true,
readOnly: true,
}}
/>
)}
</div>
</div>
</Card>
</div>
</Card>
</div>
</>
)}
</div>
);
}
@@ -101,6 +101,7 @@ export default function ProviderDetailPage() {
const isOpenAICompatible = isOpenAICompatibleProvider(providerId);
const isAnthropicCompatible = isAnthropicCompatibleProvider(providerId);
const isCompatible = isOpenAICompatible || isAnthropicCompatible;
const isSearchProvider = providerId.endsWith("-search");
const providerStorageAlias = isCompatible ? providerId : providerAlias;
const providerDisplayAlias = isCompatible ? providerNode?.prefix || providerId : providerAlias;
@@ -285,9 +286,13 @@ export default function ProviderDetailPage() {
if (res.ok) {
await fetchConnections();
setShowEditModal(false);
return null;
}
const data = await res.json().catch(() => ({}));
return data.error?.message || data.error || t("failedSaveConnection");
} catch (error) {
console.log("Error updating connection:", error);
return t("failedSaveConnectionRetry");
}
};
@@ -1060,21 +1065,43 @@ export default function ProviderDetailPage() {
)}
</Card>
{/* Models */}
<Card>
<h2 className="text-lg font-semibold mb-4">{t("availableModels")}</h2>
{renderModelsSection()}
{/* Models — hidden for search providers (they don't have models) */}
{!isSearchProvider && (
<Card>
<h2 className="text-lg font-semibold mb-4">{t("availableModels")}</h2>
{renderModelsSection()}
{/* Custom Models — available for ALL providers */}
{!isCompatible && (
<CustomModelsSection
providerId={providerId}
providerAlias={providerDisplayAlias}
copied={copied}
onCopy={copy}
/>
)}
</Card>
{/* Custom Models — available for non-compatible, non-search providers */}
{!isCompatible && (
<CustomModelsSection
providerId={providerId}
providerAlias={providerDisplayAlias}
copied={copied}
onCopy={copy}
/>
)}
</Card>
)}
{/* Search provider info */}
{isSearchProvider && (
<Card>
<h2 className="text-lg font-semibold mb-4">{t("searchProvider") || "Search Provider"}</h2>
<p className="text-sm text-text-muted">
{t("searchProviderDesc") ||
"This provider is used for web search via POST /v1/search. No model configuration needed — search providers are ready to use once an API key is connected."}
</p>
{providerId === "perplexity-search" && (
<div className="mt-3 flex items-center gap-2 px-3 py-2 rounded-lg bg-blue-500/10 border border-blue-500/20">
<span className="material-symbols-outlined text-sm text-blue-400">link</span>
<p className="text-xs text-blue-300">
Uses the same API key as <strong>Perplexity</strong> (chat provider). If you already
have Perplexity configured, no additional setup is needed.
</p>
</div>
)}
</Card>
)}
{/* Modals */}
{providerId === "kiro" ? (
@@ -1450,6 +1477,7 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
const [editingModelId, setEditingModelId] = useState<string | null>(null);
const [editingApiFormat, setEditingApiFormat] = useState("chat-completions");
const [editingEndpoints, setEditingEndpoints] = useState<string[]>(["chat"]);
const [editingNormalizeToolCallId, setEditingNormalizeToolCallId] = useState(false);
const [savingModelId, setSavingModelId] = useState<string | null>(null);
const fetchCustomModels = useCallback(async () => {
@@ -1521,12 +1549,14 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
? model.supportedEndpoints
: ["chat"]
);
setEditingNormalizeToolCallId(Boolean(model.normalizeToolCallId));
};
const cancelEdit = () => {
setEditingModelId(null);
setEditingApiFormat("chat-completions");
setEditingEndpoints(["chat"]);
setEditingNormalizeToolCallId(false);
setSavingModelId(null);
};
@@ -1550,6 +1580,7 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
source: model?.source || "manual",
apiFormat: editingApiFormat,
supportedEndpoints: editingEndpoints,
normalizeToolCallId: editingNormalizeToolCallId,
}),
});
@@ -1711,6 +1742,14 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
🔊 Audio
</span>
)}
{model.normalizeToolCallId && (
<span
className="text-[10px] px-1.5 py-0.5 rounded-full bg-slate-500/15 text-slate-400 font-medium"
title="9-char tool call ID (Mistral)"
>
ID×9
</span>
)}
</div>
{editingModelId === model.id && (
@@ -1763,6 +1802,16 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
))}
</div>
</div>
<label className="flex items-center gap-2 text-xs text-text-main cursor-pointer">
<input
type="checkbox"
checked={editingNormalizeToolCallId}
onChange={(e) => setEditingNormalizeToolCallId(e.target.checked)}
className="rounded border-border"
/>
Normalize Tool Call ID (9 chars, Mistral)
</label>
</div>
<div className="mt-3 flex items-center gap-2">
<Button
@@ -2595,10 +2644,14 @@ function AddApiKeyModal({
onClose,
}) {
const t = useTranslations("providers");
const isBailian = provider === "bailian-coding-plan";
const defaultBailianUrl = "https://coding-intl.dashscope.aliyuncs.com/apps/anthropic/v1";
const [formData, setFormData] = useState({
name: "",
apiKey: "",
priority: 1,
baseUrl: isBailian ? defaultBailianUrl : "",
});
const [validating, setValidating] = useState(false);
const [validationResult, setValidationResult] = useState(null);
@@ -2629,6 +2682,16 @@ function AddApiKeyModal({
setSaving(true);
setSaveError(null);
try {
let validatedBailianBaseUrl = null;
if (isBailian) {
const checked = normalizeAndValidateHttpBaseUrl(formData.baseUrl, defaultBailianUrl);
if (checked.error) {
setSaveError(checked.error);
return;
}
validatedBailianBaseUrl = checked.value;
}
let isValid = false;
try {
setValidating(true);
@@ -2652,12 +2715,22 @@ function AddApiKeyModal({
return;
}
const error = await onSave({
const payload = {
name: formData.name,
apiKey: formData.apiKey,
priority: formData.priority,
testStatus: "active",
});
providerSpecificData: undefined,
};
// Include baseUrl in providerSpecificData for bailian-coding-plan
if (isBailian) {
payload.providerSpecificData = {
baseUrl: validatedBailianBaseUrl,
};
}
const error = await onSave(payload);
if (error) {
setSaveError(typeof error === "string" ? error : t("failedSaveConnection"));
}
@@ -2728,6 +2801,15 @@ function AddApiKeyModal({
setFormData({ ...formData, priority: Number.parseInt(e.target.value) || 1 })
}
/>
{isBailian && (
<Input
label="Base URL"
value={formData.baseUrl}
onChange={(e) => setFormData({ ...formData, baseUrl: e.target.value })}
placeholder={defaultBailianUrl}
hint="Optional: Custom base URL for bailian-coding-plan provider"
/>
)}
<div className="flex gap-2">
<Button
onClick={handleSubmit}
@@ -2755,6 +2837,19 @@ AddApiKeyModal.propTypes = {
onClose: PropTypes.func.isRequired,
};
function normalizeAndValidateHttpBaseUrl(rawValue, fallbackUrl) {
const value = (typeof rawValue === "string" ? rawValue.trim() : "") || fallbackUrl;
try {
const parsed = new URL(value);
if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
return { value: null, error: "Base URL must use http or https" };
}
return { value, error: null };
} catch {
return { value: null, error: "Base URL must be a valid URL" };
}
}
function EditConnectionModal({ isOpen, connection, onSave, onClose }) {
const t = useTranslations("providers");
const [formData, setFormData] = useState({
@@ -2762,22 +2857,29 @@ function EditConnectionModal({ isOpen, connection, onSave, onClose }) {
priority: 1,
apiKey: "",
healthCheckInterval: 60,
baseUrl: "",
});
const [testing, setTesting] = useState(false);
const [testResult, setTestResult] = useState(null);
const [validating, setValidating] = useState(false);
const [validationResult, setValidationResult] = useState(null);
const [saving, setSaving] = useState(false);
const [saveError, setSaveError] = useState<string | null>(null);
const [extraApiKeys, setExtraApiKeys] = useState<string[]>([]);
const [newExtraKey, setNewExtraKey] = useState("");
const isBailian = connection?.provider === "bailian-coding-plan";
const defaultBailianUrl = "https://coding-intl.dashscope.aliyuncs.com/apps/anthropic/v1";
useEffect(() => {
if (connection) {
const existingBaseUrl = connection.providerSpecificData?.baseUrl;
setFormData({
name: connection.name || "",
priority: connection.priority || 1,
apiKey: "",
healthCheckInterval: connection.healthCheckInterval ?? 60,
baseUrl: existingBaseUrl || (isBailian ? defaultBailianUrl : ""),
});
// Load existing extra keys from providerSpecificData
const existing = connection.providerSpecificData?.extraApiKeys;
@@ -2785,8 +2887,9 @@ function EditConnectionModal({ isOpen, connection, onSave, onClose }) {
setNewExtraKey("");
setTestResult(null);
setValidationResult(null);
setSaveError(null);
}
}, [connection]);
}, [connection, isBailian]);
const handleTest = async () => {
if (!connection?.provider) return;
@@ -2832,12 +2935,24 @@ function EditConnectionModal({ isOpen, connection, onSave, onClose }) {
const handleSubmit = async () => {
setSaving(true);
setSaveError(null);
try {
const updates: any = {
name: formData.name,
priority: formData.priority,
healthCheckInterval: formData.healthCheckInterval,
};
let validatedBailianBaseUrl = null;
if (isBailian) {
const checked = normalizeAndValidateHttpBaseUrl(formData.baseUrl, defaultBailianUrl);
if (checked.error) {
setSaveError(checked.error);
return;
}
validatedBailianBaseUrl = checked.value;
}
if (!isOAuth && formData.apiKey) {
updates.apiKey = formData.apiKey;
let isValid = validationResult === "success";
@@ -2869,14 +2984,21 @@ function EditConnectionModal({ isOpen, connection, onSave, onClose }) {
updates.rateLimitedUntil = null;
}
}
// Persist extra API keys in providerSpecificData
// Persist extra API keys and baseUrl in providerSpecificData
if (!isOAuth) {
updates.providerSpecificData = {
...(connection.providerSpecificData || {}),
extraApiKeys: extraApiKeys.filter((k) => k.trim().length > 0),
};
// Update baseUrl for bailian-coding-plan
if (isBailian) {
updates.providerSpecificData.baseUrl = validatedBailianBaseUrl;
}
}
const error = await onSave(updates);
if (error) {
setSaveError(typeof error === "string" ? error : t("failedSaveConnection"));
}
await onSave(updates);
} finally {
setSaving(false);
}
@@ -2957,9 +3079,24 @@ function EditConnectionModal({ isOpen, connection, onSave, onClose }) {
{validationResult === "success" ? t("valid") : t("invalid")}
</Badge>
)}
{saveError && (
<div className="text-sm text-red-500 bg-red-500/10 border border-red-500/20 rounded-lg px-3 py-2">
{saveError}
</div>
)}
</>
)}
{isBailian && (
<Input
label="Base URL"
value={formData.baseUrl}
onChange={(e) => setFormData({ ...formData, baseUrl: e.target.value })}
placeholder={defaultBailianUrl}
hint="Custom base URL for bailian-coding-plan provider"
/>
)}
{/* T07: Extra API Keys for round-robin rotation */}
{!isOAuth && (
<div className="flex flex-col gap-2">
@@ -0,0 +1,297 @@
"use client";
import { useState, useEffect, useRef } from "react";
import { useTranslations } from "next-intl";
import dynamic from "next/dynamic";
const SearchForm = dynamic(() => import("./components/SearchForm"), {
ssr: false,
});
const SearchHistory = dynamic(() => import("./components/SearchHistory"), {
ssr: false,
});
const ResultsPanel = dynamic(() => import("./components/ResultsPanel"), {
ssr: false,
});
const ProviderComparison = dynamic(() => import("./components/ProviderComparison"), { ssr: false });
const RerankPanel = dynamic(() => import("./components/RerankPanel"), {
ssr: false,
});
import type { SearchFormData } from "./components/SearchForm";
import type { CompareResult } from "./components/ProviderComparison";
interface SearchProvider {
id: string;
name: string;
status: "active" | "no_credentials";
cost_per_query: number;
}
interface SearchResult {
title: string;
url: string;
snippet: string;
score?: number;
}
interface SearchResponse {
id: string;
provider: string;
query: string;
results: SearchResult[];
cached: boolean;
usage: {
queries_used: number;
search_cost_usd: number;
};
metrics: {
response_time_ms: number;
upstream_latency_ms: number;
total_results_available: number | null;
};
}
export default function SearchToolsClient() {
const t = useTranslations("search");
const [providers, setProviders] = useState<SearchProvider[]>([]);
const [response, setResponse] = useState<SearchResponse | null>(null);
const [rawJson, setRawJson] = useState("");
const [loading, setLoading] = useState(false);
const [error, setError] = useState("");
const [statusCode, setStatusCode] = useState(0);
const [duration, setDuration] = useState(0);
const [lastQuery, setLastQuery] = useState<SearchFormData | null>(null);
const abortRef = useRef<AbortController | null>(null);
const [showCompare, setShowCompare] = useState(false);
const [compareLoading, setCompareLoading] = useState(false);
const [compareResults, setCompareResults] = useState<CompareResult[]>([]);
const [initialCompareResult, setInitialCompareResult] = useState<CompareResult | null>(null);
const [showRerank, setShowRerank] = useState(false);
useEffect(() => {
fetch("/api/search/providers")
.then((res) => res.json())
.then((data) => setProviders(data.providers || []))
.catch(() => {});
}, []);
const handleSearch = async (formData: SearchFormData) => {
setLoading(true);
setError("");
setResponse(null);
setRawJson("");
setStatusCode(0);
setShowCompare(false);
setShowRerank(false);
setCompareResults([]);
const controller = new AbortController();
abortRef.current = controller;
const timeout = setTimeout(() => controller.abort(), 15_000);
const start = Date.now();
try {
const body: any = { ...formData };
if (!body.provider) delete body.provider;
const res = await fetch("/api/v1/search", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(body),
signal: controller.signal,
});
setDuration(Date.now() - start);
setStatusCode(res.status);
const data = await res.json();
setRawJson(JSON.stringify(data, null, 2));
setLastQuery(formData);
if (res.ok) {
setResponse(data);
} else {
setError(data.error?.message || data.error || `Error ${res.status}`);
}
} catch (err: any) {
setDuration(Date.now() - start);
if (err.name === "AbortError") {
setError("Request timed out (15s)");
} else {
setError(err.message || "Network error");
}
} finally {
setLoading(false);
clearTimeout(timeout);
}
};
const handleCompare = async () => {
if (!response || !lastQuery) return;
const usedProvider = response.provider;
const otherProviders = providers
.filter((p) => p.status === "active" && p.id !== usedProvider)
.map((p) => p.id);
if (otherProviders.length === 0) return;
const initial: CompareResult = {
provider: usedProvider,
latency: response.metrics.response_time_ms,
cost: response.usage.search_cost_usd,
resultCount: response.results.length,
responseSize: rawJson.length,
urls: response.results.map((r) => r.url),
};
setInitialCompareResult(initial);
setShowCompare(true);
setCompareLoading(true);
const promises = otherProviders.map(async (providerId) => {
const start = Date.now();
try {
const res = await fetch("/api/v1/search", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ ...lastQuery, provider: providerId }),
});
const data = await res.json();
const elapsed = Date.now() - start;
if (!res.ok) {
return {
provider: providerId,
latency: elapsed,
cost: 0,
resultCount: 0,
responseSize: 0,
urls: [],
error: data.error?.message || `Error ${res.status}`,
} as CompareResult;
}
const respJson = JSON.stringify(data);
return {
provider: providerId,
latency: data.metrics?.response_time_ms || elapsed,
cost: data.usage?.search_cost_usd || 0,
resultCount: data.results?.length || 0,
responseSize: respJson.length,
urls: (data.results || []).map((r: any) => r.url),
} as CompareResult;
} catch (err: any) {
return {
provider: providerId,
latency: Date.now() - start,
cost: 0,
resultCount: 0,
responseSize: 0,
urls: [],
error: err.message,
} as CompareResult;
}
});
const results = await Promise.allSettled(promises);
setCompareResults(
results.map((r) =>
r.status === "fulfilled"
? r.value
: {
provider: "unknown",
latency: 0,
cost: 0,
resultCount: 0,
responseSize: 0,
urls: [],
error: "Failed",
}
)
);
setCompareLoading(false);
};
const handleCancel = () => {
abortRef.current?.abort();
};
const handleHistoryReplay = (entry: any) => {
handleSearch({
query: entry.query,
provider: entry.provider || "",
search_type: entry.filters?.search_type || "web",
max_results: entry.filters?.max_results || 5,
...entry.filters,
});
};
return (
<div className="flex h-[calc(100vh-120px)]">
<div className="w-[340px] flex-shrink-0 bg-bg-alt border-r border-border overflow-y-auto flex flex-col">
<SearchForm
onSearch={handleSearch}
loading={loading}
onCancel={handleCancel}
providers={providers}
/>
<SearchHistory onReplay={handleHistoryReplay} />
</div>
<div className="flex-1 overflow-y-auto">
<ResultsPanel
response={response}
rawJson={rawJson}
loading={loading}
error={error}
statusCode={statusCode}
duration={duration}
/>
{response && (
<div className="px-4 py-2 flex gap-2">
<button
className="flex-1 bg-surface border border-border rounded-lg p-2 text-center hover:border-accent/30 transition-colors flex items-center justify-center gap-2"
onClick={handleCompare}
disabled={compareLoading}
>
<span className="text-accent text-sm">&#8693;</span>
<span className="text-xs text-text-muted">{t("compareProviders")}</span>
</button>
<button
className="flex-1 bg-surface border border-border rounded-lg p-2 text-center hover:border-primary/30 transition-colors flex items-center justify-center gap-2"
onClick={() => setShowRerank(!showRerank)}
>
<span className="text-primary text-sm">&#8645;</span>
<span className="text-xs text-text-muted">{t("rerankResults")}</span>
</button>
</div>
)}
{showCompare && initialCompareResult && (
<div className="px-4 pb-3">
<ProviderComparison
initialProvider={response!.provider}
initialResult={initialCompareResult}
otherResults={compareResults}
loading={compareLoading}
onClose={() => setShowCompare(false)}
/>
</div>
)}
{showRerank && response && (
<div className="px-4 pb-3">
<RerankPanel
query={response.query}
results={response.results}
onClose={() => setShowRerank(false)}
/>
</div>
)}
</div>
</div>
);
}
@@ -0,0 +1,168 @@
"use client";
import { useTranslations } from "next-intl";
export interface CompareResult {
provider: string;
latency: number;
cost: number;
resultCount: number;
responseSize: number;
urls: string[];
error?: string;
}
interface ProviderComparisonProps {
initialProvider: string;
initialResult: CompareResult;
otherResults: CompareResult[];
loading: boolean;
onClose: () => void;
}
function formatBytes(bytes: number): string {
if (bytes < 1024) return `${bytes} B`;
return `${(bytes / 1024).toFixed(1)} KB`;
}
export default function ProviderComparison({
initialProvider,
initialResult,
otherResults,
loading,
onClose,
}: ProviderComparisonProps) {
const t = useTranslations("search");
const allResults = [initialResult, ...otherResults];
const initialUrls = new Set(initialResult.urls);
const valid = allResults.filter((r) => !r.error);
const latencies = valid.map((r) => r.latency);
const costs = valid.map((r) => r.cost);
const sizes = valid.map((r) => r.responseSize);
const bestLatency = Math.min(...latencies);
const worstLatency = Math.max(...latencies);
const bestCost = Math.min(...costs);
const worstCost = Math.max(...costs);
const bestSize = Math.min(...sizes);
const worstSize = Math.max(...sizes);
const getLatencyColor = (val: number) => {
if (val === bestLatency) return "text-success font-medium";
if (val === worstLatency) return "text-warning";
return "text-text-main";
};
const getCostColor = (val: number) => {
if (val === bestCost) return "text-success font-medium";
if (val === worstCost) return "text-warning";
return "text-text-main";
};
const getSizeColor = (val: number) => {
if (val === bestSize) return "text-success font-medium";
if (val === worstSize) return "text-warning";
return "text-text-main";
};
return (
<div className="bg-surface border border-accent/20 rounded-lg overflow-hidden">
<div className="flex justify-between items-center px-4 py-2.5 bg-accent/5 border-b border-accent/15">
<span className="text-xs font-semibold text-accent flex items-center gap-1.5">
{t("compareProviders")}
</span>
<button onClick={onClose} className="text-text-muted text-xs hover:text-text-main">
</button>
</div>
<div className="p-3 overflow-x-auto">
{loading ? (
<div className="flex items-center justify-center py-8">
<span className="material-symbols-outlined text-[20px] text-accent animate-spin">
progress_activity
</span>
<span className="text-xs text-text-muted ml-2">{t("compareProviders")}...</span>
</div>
) : (
<table className="w-full text-xs">
<thead>
<tr className="border-b border-border">
<th className="text-left p-2 text-text-muted font-semibold" />
{allResults.map((r) => (
<th
key={r.provider}
className={`text-center p-2 font-semibold ${
r.provider === initialProvider ? "text-primary" : "text-text-muted"
}`}
>
{r.provider.replace("-search", "")}
{r.provider === initialProvider && " ✓"}
</th>
))}
</tr>
</thead>
<tbody>
<tr className="border-b border-border/50">
<td className="p-2 text-text-muted">{t("latency")}</td>
{allResults.map((r) => (
<td
key={r.provider}
className={`text-center p-2 ${r.error ? "text-error" : getLatencyColor(r.latency)}`}
>
{r.error ? "Error" : `${r.latency}ms`}
</td>
))}
</tr>
<tr className="border-b border-border/50">
<td className="p-2 text-text-muted">{t("cost")}</td>
{allResults.map((r) => (
<td
key={r.provider}
className={`text-center p-2 ${r.error ? "text-error" : getCostColor(r.cost)}`}
>
{r.error ? "Error" : `$${r.cost.toFixed(4)}`}
</td>
))}
</tr>
<tr className="border-b border-border/50">
<td className="p-2 text-text-muted">{t("results")}</td>
{allResults.map((r) => (
<td
key={r.provider}
className={`text-center p-2 ${r.error ? "text-error" : "text-text-main"}`}
>
{r.error ? "Error" : r.resultCount}
</td>
))}
</tr>
<tr className="border-b border-border/50">
<td className="p-2 text-text-muted">Size</td>
{allResults.map((r) => (
<td
key={r.provider}
className={`text-center p-2 ${r.error ? "text-error" : getSizeColor(r.responseSize)}`}
>
{r.error ? "Error" : formatBytes(r.responseSize)}
</td>
))}
</tr>
<tr>
<td className="p-2 text-text-muted">{t("urlOverlap")}</td>
{allResults.map((r) => (
<td key={r.provider} className="text-center p-2 text-text-main">
{r.provider === initialProvider
? "—"
: r.error
? "Error"
: `${r.urls.filter((u) => initialUrls.has(u)).length}/${r.resultCount}`}
</td>
))}
</tr>
</tbody>
</table>
)}
</div>
</div>
);
}
@@ -0,0 +1,152 @@
"use client";
import { useState, useEffect } from "react";
import { useTranslations } from "next-intl";
import { Button, Select } from "@/shared/components";
interface RerankResult {
index: number;
originalIndex: number;
title: string;
snippet: string;
score: number;
delta: number;
}
interface RerankPanelProps {
query: string;
results: { title: string; snippet: string; url: string }[];
onClose: () => void;
}
export default function RerankPanel({ query, results, onClose }: RerankPanelProps) {
const t = useTranslations("search");
const [models, setModels] = useState<{ value: string; label: string }[]>([]);
const [selectedModel, setSelectedModel] = useState("");
const [reranked, setReranked] = useState<RerankResult[]>([]);
const [loading, setLoading] = useState(false);
const [error, setError] = useState("");
useEffect(() => {
fetch("/v1/models")
.then((res) => res.json())
.then((data) => {
const rerankModels = (data?.data || [])
.filter((m: any) => m.id.toLowerCase().includes("rerank"))
.map((m: any) => ({ value: m.id, label: m.id }));
setModels(rerankModels);
if (rerankModels.length > 0) setSelectedModel(rerankModels[0].value);
})
.catch(() => {});
}, []);
const handleRerank = async () => {
setLoading(true);
setError("");
try {
const res = await fetch("/api/v1/rerank", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
model: selectedModel,
query,
documents: results.map((r) => r.snippet),
top_n: results.length,
}),
});
const data = await res.json();
if (!res.ok) {
setError(data.error?.message || data.error || `Error ${res.status}`);
return;
}
const rerankedResults: RerankResult[] = (data.results || []).map(
(r: any, newIndex: number) => {
const origIndex = r.index;
return {
index: newIndex,
originalIndex: origIndex,
title: results[origIndex]?.title || "",
snippet: results[origIndex]?.snippet || "",
score: r.relevance_score,
delta: origIndex - newIndex,
};
}
);
setReranked(rerankedResults);
} catch (err: any) {
setError(err.message || "Rerank failed");
} finally {
setLoading(false);
}
};
const getDeltaDisplay = (delta: number) => {
if (delta > 0) return <span className="text-success">{delta}</span>;
if (delta < 0) return <span className="text-error">{Math.abs(delta)}</span>;
return <span className="text-text-muted">=</span>;
};
const noModels = models.length === 0;
return (
<div className="bg-surface border border-border rounded-lg overflow-hidden">
<div className="flex justify-between items-center px-4 py-2.5 border-b border-border">
<span className="text-xs font-semibold text-text-main flex items-center gap-1.5">
{t("rerankResults")}
</span>
<button onClick={onClose} className="text-text-muted text-xs hover:text-text-main">
</button>
</div>
<div className="p-4">
{noModels ? (
<p className="text-xs text-text-muted">{t("noRerankModels")}</p>
) : (
<>
<div className="flex gap-2 items-end mb-3">
<div className="flex-1">
<label className="block text-[10px] text-text-muted uppercase tracking-wider mb-1">
{t("rerankModel")}
</label>
<Select
value={selectedModel}
onChange={(e: any) => setSelectedModel(e.target.value)}
options={models}
className="w-full"
/>
</div>
<Button variant="primary" onClick={handleRerank} disabled={loading || !selectedModel}>
{loading ? "Reranking..." : t("rerank")}
</Button>
</div>
{error && <p className="text-xs text-error mb-2">{error}</p>}
{reranked.length > 0 && (
<div className="space-y-2">
{reranked.map((r) => (
<div key={r.index} className="flex items-start gap-3 p-2 bg-bg-alt rounded-lg">
<div className="flex flex-col items-center min-w-[32px]">
<span className="text-xs font-medium text-text-main">#{r.index + 1}</span>
<span className="text-[10px]">{getDeltaDisplay(r.delta)}</span>
</div>
<div className="flex-1">
<div className="text-xs font-medium text-text-main">{r.title}</div>
<div className="text-[10px] text-text-muted mt-0.5 line-clamp-2">
{r.snippet}
</div>
</div>
<span className="text-[10px] text-accent whitespace-nowrap">
{r.score.toFixed(4)}
</span>
</div>
))}
</div>
)}
</>
)}
</div>
</div>
);
}
@@ -0,0 +1,223 @@
"use client";
import { useState } from "react";
import { useTranslations } from "next-intl";
import dynamic from "next/dynamic";
import { Badge } from "@/shared/components";
const Editor = dynamic(() => import("@monaco-editor/react"), { ssr: false });
interface SearchResult {
title: string;
url: string;
snippet: string;
score?: number;
date?: string;
}
interface SearchResponse {
id: string;
provider: string;
results: SearchResult[];
query: string;
answer: string | null;
cached: boolean;
usage: {
queries_used: number;
search_cost_usd: number;
};
metrics: {
response_time_ms: number;
upstream_latency_ms: number;
total_results_available: number | null;
};
}
interface ResultsPanelProps {
response: SearchResponse | null;
rawJson: string;
loading: boolean;
error: string;
statusCode: number;
duration: number;
}
function formatBytes(bytes: number): string {
if (bytes < 1024) return `${bytes} B`;
return `${(bytes / 1024).toFixed(1)} KB`;
}
export default function ResultsPanel({
response,
rawJson,
loading,
error,
statusCode,
duration,
}: ResultsPanelProps) {
const t = useTranslations("search");
const [showJson, setShowJson] = useState(false);
const getScoreColor = (score: number) => {
if (score >= 0.9) return "text-success";
if (score >= 0.7) return "text-warning";
return "text-error";
};
const getScoreBg = (score: number) => {
if (score >= 0.9) return "bg-green-500/10";
if (score >= 0.7) return "bg-yellow-500/10";
return "bg-red-500/10";
};
const editorTheme =
typeof document !== "undefined" && document.documentElement.classList.contains("dark")
? "vs-dark"
: "light";
return (
<div className="flex flex-col">
{/* Header */}
<div className="flex justify-between items-center p-3 border-b border-border">
<div className="flex items-center gap-2">
<span className="text-xs font-semibold text-text-muted uppercase tracking-wider">
{t("searchResults")}
</span>
{statusCode > 0 && (
<>
<Badge variant={statusCode < 400 ? "success" : "error"} size="sm">
{statusCode}
</Badge>
<span className="text-xs text-text-muted">{duration}ms</span>
</>
)}
</div>
{response && (
<div className="flex gap-1">
<button
className={`text-xs px-3 py-1 rounded-md ${
!showJson
? "bg-primary/15 text-primary font-medium"
: "bg-black/5 dark:bg-white/5 text-text-muted"
}`}
onClick={() => setShowJson(false)}
>
{t("formatted")}
</button>
<button
className={`text-xs px-3 py-1 rounded-md ${
showJson
? "bg-primary/15 text-primary font-medium"
: "bg-black/5 dark:bg-white/5 text-text-muted"
}`}
onClick={() => setShowJson(true)}
>
{t("rawJson")}
</button>
</div>
)}
</div>
{/* Content */}
{loading && (
<div className="flex items-center justify-center py-20">
<span className="material-symbols-outlined text-[24px] text-primary animate-spin">
progress_activity
</span>
</div>
)}
{error && !loading && (
<div className="p-4">
<div className="text-error text-sm">{error}</div>
</div>
)}
{response && !showJson && !loading && (
<div className="p-4 space-y-3">
{/* Meta bar */}
<div className="flex justify-between items-center p-2 bg-bg-alt rounded-lg">
<div className="flex items-center gap-3 text-xs text-text-muted">
<span>
{response.results.length} {t("results").toLowerCase()}
</span>
<span className="flex items-center gap-1">
<span className="w-1.5 h-1.5 rounded-full bg-primary" />
{response.provider}
</span>
<span>{response.metrics?.response_time_ms}ms</span>
<span>${response.usage?.search_cost_usd?.toFixed(4)}</span>
<span>{formatBytes(rawJson.length)}</span>
</div>
<span
className={`text-xs flex items-center gap-1 ${
response.cached ? "text-success" : "text-warning"
}`}
>
<span
className={`w-1.5 h-1.5 rounded-full ${
response.cached ? "bg-success" : "bg-warning"
}`}
/>
{response.cached ? t("cacheHit") : t("cacheMiss")}
</span>
</div>
{/* Results list */}
{response.results.map((r, i) => (
<div
key={i}
className="border-l-[3px] border-l-primary p-3 bg-surface rounded-r-lg border border-border"
>
<div className="flex justify-between items-start">
<span className="text-sm font-medium text-text-main">
{i + 1}. {r.title}
</span>
{r.score != null && (
<span
className={`text-[10px] px-2 py-0.5 rounded-md ml-2 whitespace-nowrap ${getScoreBg(r.score)} ${getScoreColor(r.score)}`}
>
{r.score.toFixed(2)}
</span>
)}
</div>
<a
href={r.url}
target="_blank"
rel="noopener noreferrer"
className="text-accent text-[11px] block mt-0.5"
>
{r.url}
</a>
<p className="text-xs text-text-muted mt-1 leading-relaxed">{r.snippet}</p>
</div>
))}
</div>
)}
{response && showJson && !loading && (
<div className="h-64">
<Editor
height="100%"
language="json"
value={rawJson}
theme={editorTheme}
options={{
readOnly: true,
minimap: { enabled: false },
fontSize: 12,
automaticLayout: true,
wordWrap: "on",
}}
/>
</div>
)}
{!loading && !error && !response && (
<div className="flex items-center justify-center py-20 text-text-muted text-sm">
{t("emptyState")}
</div>
)}
</div>
);
}
@@ -0,0 +1,308 @@
"use client";
import { useState } from "react";
import { useTranslations } from "next-intl";
import { Button, Select } from "@/shared/components";
interface SearchProvider {
id: string;
name: string;
status: "active" | "no_credentials";
cost_per_query: number;
}
export interface SearchFormData {
query: string;
provider: string;
search_type: string;
max_results: number;
country?: string;
language?: string;
time_range?: string;
include_domains?: string[];
exclude_domains?: string[];
safe_search?: string;
}
interface SearchFormProps {
onSearch: (data: SearchFormData) => void;
loading: boolean;
onCancel: () => void;
providers: SearchProvider[];
}
export default function SearchForm({ onSearch, loading, onCancel, providers }: SearchFormProps) {
const t = useTranslations("search");
const [query, setQuery] = useState("");
const [provider, setProvider] = useState("auto");
const [searchType, setSearchType] = useState("web");
const [maxResults, setMaxResults] = useState(5);
const [showFilters, setShowFilters] = useState(false);
const [country, setCountry] = useState("");
const [language, setLanguage] = useState("");
const [timeRange, setTimeRange] = useState("");
const [includeDomains, setIncludeDomains] = useState<string[]>([]);
const [excludeDomains, setExcludeDomains] = useState<string[]>([]);
const [safeSearch, setSafeSearch] = useState("moderate");
const [domainInput, setDomainInput] = useState("");
const [excludeDomainInput, setExcludeDomainInput] = useState("");
const activeProviders = providers.filter((p) => p.status === "active");
const noProviders = activeProviders.length === 0;
const handleSubmit = () => {
const data: SearchFormData = {
query,
provider: provider === "auto" ? "" : provider,
search_type: searchType,
max_results: maxResults,
};
if (country) data.country = country;
if (language) data.language = language;
if (timeRange) data.time_range = timeRange;
if (includeDomains.length > 0) data.include_domains = includeDomains;
if (excludeDomains.length > 0) data.exclude_domains = excludeDomains;
if (safeSearch !== "moderate") data.safe_search = safeSearch;
onSearch(data);
};
const addDomain = (type: "include" | "exclude") => {
const input = type === "include" ? domainInput : excludeDomainInput;
const setter = type === "include" ? setIncludeDomains : setExcludeDomains;
const list = type === "include" ? includeDomains : excludeDomains;
if (input.trim() && !list.includes(input.trim())) {
setter([...list, input.trim()]);
}
type === "include" ? setDomainInput("") : setExcludeDomainInput("");
};
const removeDomain = (domain: string, type: "include" | "exclude") => {
const setter = type === "include" ? setIncludeDomains : setExcludeDomains;
const list = type === "include" ? includeDomains : excludeDomains;
setter(list.filter((d) => d !== domain));
};
return (
<div className="flex flex-col h-full">
{/* Query */}
<div className="p-4 border-b border-border">
<label className="block text-[10px] font-semibold text-text-muted uppercase tracking-wider mb-2">
{t("searchQuery")}
</label>
<textarea
value={query}
onChange={(e) => setQuery(e.target.value)}
placeholder="Enter search query..."
className="w-full bg-surface border border-border rounded-lg p-2.5 text-sm text-text-main resize-none h-16 focus:outline-none focus:ring-2 focus:ring-primary/30"
onKeyDown={(e) => {
if (e.key === "Enter" && !e.shiftKey) {
e.preventDefault();
if (!noProviders && query.trim()) handleSubmit();
}
}}
/>
</div>
{/* Provider + Type + Max Results */}
<div className="p-4 border-b border-border space-y-2">
<div className="flex gap-2">
<div className="flex-1">
<label className="block text-[10px] text-text-muted uppercase tracking-wider mb-1">
{t("provider")}
</label>
<Select
value={provider}
onChange={(e: any) => setProvider(e.target.value)}
options={[
{ value: "auto", label: "auto (cheapest)" },
...activeProviders.map((p) => ({
value: p.id,
label: p.name,
})),
]}
className="w-full"
/>
</div>
<div className="flex-1">
<label className="block text-[10px] text-text-muted uppercase tracking-wider mb-1">
{t("searchType")}
</label>
<Select
value={searchType}
onChange={(e: any) => setSearchType(e.target.value)}
options={[
{ value: "web", label: "web" },
{ value: "news", label: "news" },
]}
className="w-full"
/>
</div>
</div>
<div className="w-20">
<label className="block text-[10px] text-text-muted uppercase tracking-wider mb-1">
{t("maxResults")}
</label>
<input
type="number"
value={maxResults}
onChange={(e) => setMaxResults(parseInt(e.target.value) || 5)}
min={1}
max={100}
className="w-full bg-surface border border-border rounded-lg px-2.5 py-1.5 text-sm text-text-main focus:outline-none focus:ring-2 focus:ring-primary/30"
/>
</div>
</div>
{/* Filters (collapsible) */}
<div className="p-4 border-b border-border">
<button
className="flex justify-between items-center w-full"
onClick={() => setShowFilters(!showFilters)}
>
<span className="text-[10px] font-semibold text-text-muted uppercase tracking-wider">
{t("filters")}
</span>
<span className="text-text-muted text-xs">{showFilters ? "▼" : "▶"}</span>
</button>
{showFilters && (
<div className="mt-3 space-y-2">
<div className="flex gap-2">
<div className="flex-1">
<label className="block text-[10px] text-text-muted mb-1">{t("country")}</label>
<input
value={country}
onChange={(e) => setCountry(e.target.value)}
placeholder="any"
className="w-full bg-surface border border-border rounded-md px-2 py-1.5 text-xs text-text-main focus:outline-none"
/>
</div>
<div className="flex-1">
<label className="block text-[10px] text-text-muted mb-1">{t("language")}</label>
<input
value={language}
onChange={(e) => setLanguage(e.target.value)}
placeholder="any"
className="w-full bg-surface border border-border rounded-md px-2 py-1.5 text-xs text-text-main focus:outline-none"
/>
</div>
</div>
<div>
<label className="block text-[10px] text-text-muted mb-1">{t("timeRange")}</label>
<Select
value={timeRange}
onChange={(e: any) => setTimeRange(e.target.value)}
options={[
{ value: "", label: "any" },
{ value: "day", label: "Past day" },
{ value: "week", label: "Past week" },
{ value: "month", label: "Past month" },
{ value: "year", label: "Past year" },
]}
className="w-full"
/>
</div>
<div>
<label className="block text-[10px] text-text-muted mb-1">
{t("includeDomains")}
</label>
<div className="flex gap-1">
<input
value={domainInput}
onChange={(e) => setDomainInput(e.target.value)}
placeholder="example.com"
className="flex-1 bg-surface border border-border rounded-md px-2 py-1.5 text-xs text-text-main focus:outline-none"
onKeyDown={(e) => e.key === "Enter" && addDomain("include")}
/>
<button onClick={() => addDomain("include")} className="text-primary text-lg px-1">
+
</button>
</div>
{includeDomains.length > 0 && (
<div className="flex flex-wrap gap-1 mt-1">
{includeDomains.map((d) => (
<span
key={d}
className="text-[10px] bg-primary/10 text-primary px-2 py-0.5 rounded-full flex items-center gap-1"
>
{d}
<button
onClick={() => removeDomain(d, "include")}
className="text-primary/60"
>
×
</button>
</span>
))}
</div>
)}
</div>
<div>
<label className="block text-[10px] text-text-muted mb-1">
{t("excludeDomains")}
</label>
<div className="flex gap-1">
<input
value={excludeDomainInput}
onChange={(e) => setExcludeDomainInput(e.target.value)}
placeholder="example.com"
className="flex-1 bg-surface border border-border rounded-md px-2 py-1.5 text-xs text-text-main focus:outline-none"
onKeyDown={(e) => e.key === "Enter" && addDomain("exclude")}
/>
<button onClick={() => addDomain("exclude")} className="text-primary text-lg px-1">
+
</button>
</div>
{excludeDomains.length > 0 && (
<div className="flex flex-wrap gap-1 mt-1">
{excludeDomains.map((d) => (
<span
key={d}
className="text-[10px] bg-error/10 text-error px-2 py-0.5 rounded-full flex items-center gap-1"
>
{d}
<button onClick={() => removeDomain(d, "exclude")} className="text-error/60">
×
</button>
</span>
))}
</div>
)}
</div>
<div>
<label className="block text-[10px] text-text-muted mb-1">{t("safeSearch")}</label>
<Select
value={safeSearch}
onChange={(e: any) => setSafeSearch(e.target.value)}
options={[
{ value: "off", label: "Off" },
{ value: "moderate", label: "Moderate" },
{ value: "strict", label: "Strict" },
]}
className="w-full"
/>
</div>
</div>
)}
</div>
{/* Search button */}
<div className="p-4 border-b border-border">
{loading ? (
<Button variant="danger" onClick={onCancel} className="w-full">
Cancel
</Button>
) : (
<Button
variant="primary"
onClick={handleSubmit}
disabled={noProviders || !query.trim()}
className="w-full"
>
Search
</Button>
)}
{noProviders && <p className="text-xs text-text-muted mt-2">{t("noSearchProviders")}</p>}
</div>
</div>
);
}
@@ -0,0 +1,67 @@
"use client";
import { useState, useEffect } from "react";
import { useTranslations } from "next-intl";
interface HistoryEntry {
query: string;
provider: string;
timestamp: string;
filters: Record<string, any>;
}
interface SearchHistoryProps {
onReplay: (entry: HistoryEntry) => void;
}
function timeAgo(timestamp: string): string {
try {
const rtf = new Intl.RelativeTimeFormat("en", { numeric: "auto" });
const diff = Date.now() - new Date(timestamp).getTime();
const minutes = Math.floor(diff / 60_000);
if (minutes < 1) return rtf.format(0, "minute");
if (minutes < 60) return rtf.format(-minutes, "minute");
const hours = Math.floor(minutes / 60);
if (hours < 24) return rtf.format(-hours, "hour");
return rtf.format(-Math.floor(hours / 24), "day");
} catch {
return new Date(timestamp).toLocaleString();
}
}
export default function SearchHistory({ onReplay }: SearchHistoryProps) {
const t = useTranslations("search");
const [entries, setEntries] = useState<HistoryEntry[]>([]);
useEffect(() => {
fetch("/api/search/stats")
.then((res) => res.json())
.then((data) => setEntries(data.recent_searches || []))
.catch(() => {});
}, []);
if (entries.length === 0) return null;
return (
<div className="p-4 flex-1">
<span className="text-[10px] font-semibold text-text-muted uppercase tracking-wider">
{t("searchHistory")}
</span>
<div className="mt-2 space-y-1.5">
{entries.map((entry, i) => (
<button
key={`${entry.timestamp}:${entry.provider}:${entry.query}`}
onClick={() => onReplay(entry)}
className="w-full text-left p-2 bg-surface border border-border rounded-lg hover:border-primary/30 transition-colors"
>
<div className="text-xs text-text-main truncate">{entry.query}</div>
<div className="flex justify-between mt-0.5">
<span className="text-[10px] text-text-muted">{entry.provider}</span>
<span className="text-[10px] text-text-muted">{timeAgo(entry.timestamp)}</span>
</div>
</button>
))}
</div>
</div>
);
}
@@ -0,0 +1,5 @@
import SearchToolsClient from "./SearchToolsClient";
export default function SearchToolsPage() {
return <SearchToolsClient />;
}
@@ -83,7 +83,11 @@ export default function BudgetTab() {
if (data.monthlyLimitUsd)
setForm((f) => ({ ...f, monthlyLimitUsd: String(data.monthlyLimitUsd) }));
if (data.warningThreshold)
setForm((f) => ({ ...f, warningThreshold: String(data.warningThreshold) }));
// stored as fraction (01), display as percentage (0100)
setForm((f) => ({
...f,
warningThreshold: String(Math.round(data.warningThreshold * 100)),
}));
}
} catch {
// silent
@@ -104,7 +108,8 @@ export default function BudgetTab() {
apiKeyId: selectedKey,
dailyLimitUsd: form.dailyLimitUsd ? parseFloat(form.dailyLimitUsd) : null,
monthlyLimitUsd: form.monthlyLimitUsd ? parseFloat(form.monthlyLimitUsd) : null,
warningThreshold: parseInt(form.warningThreshold) || 80,
// schema expects a fraction (01); UI shows percentage (0100)
warningThreshold: (parseInt(form.warningThreshold) || 80) / 100,
}),
});
if (res.ok) {

Some files were not shown because too many files have changed in this diff Show More