Compare commits

13 Commits

Author SHA1 Message Date
diegosouzapw 75a6d850fc chore: release v2.4.3
Build Electron Desktop App / Validate version (push) Failing after 31s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- fix: Codex/GitHub limits page HTTP 500 → graceful 401/403 messages
- fix: MaintenanceBanner false-positive on page load (stale closure)
- fix: add title tooltips to edit/delete buttons in ConnectionCard
- feat: add fill-first and p2c routing strategies to combo picker
- feat: Free Stack template pre-fills 7 free provider models
- feat: combo create/edit modal wider (max-w-4xl)
2026-03-14 12:49:36 -03:00
diegosouzapw b0f5f92f1a feat(release): v2.4.2 — task-aware routing, HuggingFace/Vertex providers, streaming fixes, token tracking, playground uploads
Build Electron Desktop App / Validate version (push) Failing after 43s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- feat: Task-Aware Smart Routing (T05) — auto-select model by task type
- feat: HuggingFace and Vertex AI provider support
- feat: Playground audio/image file uploads for transcription and vision
- feat: ModelSelectModal shows ✓ for already-added models (#180)
- fix: Claude Haiku routed to OpenAI without provider prefix (#73)
- fix: Token counts always 0 for Antigravity/Claude streaming (#74)
- fix: OpenAI SDK stream=False drops tool_calls (#302)
- fix: Media page generation errors — inline rendering for images/transcription
- fix: Round-robin state management for excluded accounts (#349)
- fix: Qwen user agent and CLI fingerprint compatibility (#352)
- deps: undici→7.24.2, dompurify→3.3.3, docker actions v4
- docs: CHANGELOG 2.4.2 with full feature/fix list
- docs: README with Task-Aware Routing table entry
2026-03-14 11:04:09 -03:00
Diego Rodrigues de Sa e Souza eaddb6f0fa feat: improvements from 9router analysis (T01/T08-T13) (#351)
* fix: tool description null sanitization, clipboard HTTP fallback fixes

T10 - Sanitize tool.description null in claude-to-openai translator
- claude-to-openai.ts: tool.description defaults to empty string when null/undefined
- claude-to-openai.ts: filter out tools with empty/missing names
- Prevents 400 validation errors on providers like NVIDIA NIM (issue #276)
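
The T10 sanitization above can be sketched roughly as follows. This is a minimal illustration, not the code in claude-to-openai.ts: the `ClaudeTool` and `OpenAITool` shapes, the `sanitizeTools` name, and the empty-object default for `parameters` are all assumptions made for the example.

```typescript
// Hypothetical shapes; the real translator types are not shown in this commit.
interface ClaudeTool {
  name?: string | null;
  description?: string | null;
  input_schema?: Record<string, unknown>;
}

interface OpenAITool {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

// Drop tools without a usable name and coerce a null/undefined description to "".
function sanitizeTools(tools: ClaudeTool[]): OpenAITool[] {
  const result: OpenAITool[] = [];
  for (const tool of tools) {
    const name = typeof tool.name === "string" ? tool.name : "";
    if (name.length === 0) continue; // skip tools with empty or missing names
    result.push({
      type: "function",
      function: {
        name,
        description: tool.description ?? "", // never forward null to the provider
        parameters: tool.input_schema ?? { type: "object", properties: {} }, // assumed default
      },
    });
  }
  return result;
}
```

Filtering before translation keeps strict validators from rejecting the whole request because of one malformed tool entry.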

T11 - Fix copy buttons to work on HTTP/non-HTTPS deployments
- Add src/shared/utils/clipboard.ts with HTTPS+HTTP (execCommand) dual fallback
- Migrate useCopyToClipboard.ts to use shared utility
- Migrate ConsoleLogViewer.tsx, RequestLoggerV2.tsx to shared utility
- Migrate HomePageClient.tsx, endpoint/page.tsx, GetStarted.tsx
- Migrate DefaultToolCard.tsx to shared utility
- Fixes copy buttons when OmniRoute runs behind HTTP proxy (issue #296)
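
A minimal sketch of the dual fallback described in T11, under stated assumptions: the `ClipboardEnv` interface and `copyText` name are hypothetical, and the browser calls are injected rather than referenced directly so the logic can run outside a DOM. In a browser, `writeText` would wrap `navigator.clipboard.writeText` and `legacyCopy` would wrap a hidden textarea plus `document.execCommand("copy")`.

```typescript
// Injected environment so the sketch does not depend on DOM typings (assumption).
interface ClipboardEnv {
  isSecureContext: boolean;
  writeText?: (text: string) => Promise<void>; // async Clipboard API, HTTPS only
  legacyCopy: (text: string) => boolean; // execCommand-style fallback for HTTP
}

async function copyText(text: string, env: ClipboardEnv): Promise<boolean> {
  // Prefer the async Clipboard API when the page is served from a secure context.
  if (env.isSecureContext && env.writeText) {
    try {
      await env.writeText(text);
      return true;
    } catch {
      // Permission denied or API failure: fall through to the legacy path.
    }
  }
  try {
    return env.legacyCopy(text);
  } catch {
    return false;
  }
}
```

Returning a boolean instead of throwing lets each copy button show success or failure feedback without its own error handling.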

T02 - Verified SSE [DONE] sentinel handling already correct
- sseParser.ts filters [DONE] on line 13 (no change needed)
- stream.ts uses doneSent flag to prevent duplicate sentinel
- bypassHandler.ts correctly separates streaming/non-streaming responses

Issue triage comments posted to #340, #341, #344

* feat: DB read cache + Accept header stream negotiation (T09/T01)

T09 - In-memory TTL cache for hot DB read paths
- Add src/lib/db/readCache.ts with TTL cache (5s settings/connections, 30s pricing)
- Eliminates redundant SQLite reads on concurrent requests
- Integrate invalidation in settings.ts updateSettings() and updatePricing()
- Integrate invalidation in providers.ts create/update/delete operations
- Export getCachedSettings, getCachedPricing, getCachedProviderConnections,
  invalidateDbCache via localDb.ts for consumer migration
- Cache auto-busts on any write, preserving data consistency
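
The caching pattern in T09 might look roughly like the sketch below. The `TtlCache` class and its method names are hypothetical and are not the API of readCache.ts; only the TTL values (5 s for settings and connections, 30 s for pricing) and the invalidate-on-write behaviour come from the commit message.

```typescript
// Hypothetical minimal TTL cache; the real readCache.ts API may differ.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now; // injectable clock, handy for tests
  }

  // Return the cached value while it is fresh; otherwise load, store, and return it.
  getOrLoad(key: string, ttlMs: number, load: () => V): V {
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.value;
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
    return value;
  }

  // Called from write paths so readers never see stale data after an update.
  invalidate(key?: string): void {
    if (key === undefined) this.entries.clear();
    else this.entries.delete(key);
  }
}
```

A read path would call something like `cache.getOrLoad("settings", 5000, readSettingsFromDb)`, and every write path would call `cache.invalidate("settings")` after committing.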

T01 - Accept header stream negotiation
- src/sse/handlers/chat.ts: detect Accept: text/event-stream header
- Override body.stream=true when Accept header indicates streaming client
- Enables curl, httpx and SDK clients that use HTTP headers instead of JSON
  body field to trigger streaming responses
- Logs Accept override at DEBUG level for observability

* fix: auto-advance quota window on expiry to prevent stale blocking (T08)

T08 - Quota Window Rolling Auto-Advance
- quotaCache.ts: add windowDurationMs field to QuotaCacheEntry interface
  (optional field that callers can set when they know the window duration)
- Add advancedWindowResetAt() helper: if entry.nextResetAt is in the past,
  eagerly returns { exhausted: false } so requests are unblocked immediately
- isAccountQuotaExhausted() now uses advancedWindowResetAt() instead of
  the previous inline date check, and optimistically clears entry.exhausted
  flag to avoid re-checking the same stale entry on the next request

Before: exhausted accounts with an expired resetAt would wait up to 5
minutes for the background refresh before accepting new requests.
After:  the first request after resetAt passes will be immediately accepted
and will trigger a quota refresh on the next background tick.
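
The before/after behaviour of T08 can be illustrated with a small sketch. The function names follow the commit message, but the signatures, the epoch-millisecond representation of `nextResetAt`, and the explicit `now` parameter are assumptions; `windowDurationMs` is declared only to mirror the described interface and is not used here.

```typescript
interface QuotaCacheEntry {
  exhausted: boolean;
  nextResetAt?: number; // epoch ms (assumed representation)
  windowDurationMs?: number; // optional; set by callers that know the window length
}

// If the reset time is already in the past, treat the window as rolled over.
function advancedWindowResetAt(entry: QuotaCacheEntry, now: number): { exhausted: boolean } {
  if (entry.nextResetAt !== undefined && entry.nextResetAt <= now) {
    return { exhausted: false };
  }
  return { exhausted: entry.exhausted };
}

function isAccountQuotaExhausted(entry: QuotaCacheEntry | undefined, now: number = Date.now()): boolean {
  if (!entry) return false; // unknown accounts are assumed usable
  const { exhausted } = advancedWindowResetAt(entry, now);
  if (!exhausted && entry.exhausted) {
    entry.exhausted = false; // optimistic clear so the stale entry is not re-checked
  }
  return exhausted;
}
```

The optimistic clear means the first request after the reset time unblocks the account without waiting for the background refresh to confirm the new window.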

* feat: manual OAuth token refresh UI (T12)

T12 - Manual Token Refresh UI
- Add POST /api/providers/[id]/refresh endpoint
  - Validates connection exists and is OAuth type
  - Calls getAccessToken() (same helper used in auto-refresh)
  - Persists new credentials via updateProviderCredentials()
  - Returns { success, expiresAt, refreshedAt } on success

- Update providers/[id]/page.tsx
  - handleRefreshToken() with loading state (refreshingId)
  - Pass onRefreshToken + isRefreshing props to ConnectionRow
  - ConnectionRow: add optional onRefreshToken/isRefreshing props
  - ConnectionRow: tokenMinsLeft state via lazy init (Date.now() in
    getter fn, not in render body - satisfies react-hooks/purity)
  - Token expiry badge: red 'expired' | amber '~Xm' (<30min) | hidden
  - 'Token' button (amber) next to 'Retest' for OAuth connections

- Add en.json i18n: tokenRefreshed, tokenRefreshFailed
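
The token expiry badge rule from T12 (red 'expired', amber '~Xm' under 30 minutes, otherwise hidden) could be expressed as a pure function like the one below. The `tokenBadge` name, the `TokenBadge` type, and rounding the remaining minutes up are assumptions for illustration; the component itself is not shown in this commit message.

```typescript
type TokenBadge = { color: "red" | "amber"; label: string } | null;

const THIRTY_MINUTES_MS = 30 * 60 * 1000;

// Pure helper: takes the clock as an argument so it never calls Date.now() during render.
function tokenBadge(expiresAtMs: number, nowMs: number): TokenBadge {
  const remainingMs = expiresAtMs - nowMs;
  if (remainingMs <= 0) return { color: "red", label: "expired" };
  if (remainingMs < THIRTY_MINUTES_MS) {
    // Round up so a token with 30 seconds left reads "~1m" rather than "~0m".
    return { color: "amber", label: `~${Math.ceil(remainingMs / 60000)}m` };
  }
  return null; // plenty of time left: badge hidden
}
```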

* Initial plan

* feat: integrate wildcardRouter into model alias resolution (T13)

T13 - Wildcard Model Routing
- Import resolveWildcardAlias from wildcardRouter.ts into model.ts
- In getModelInfoCore(), after exact alias check fails, try glob wildcard
  alias matching (e.g., 'claude-sonnet-*' alias → 'anthropic/claude-sonnet-4')
- Returns { provider, model, extendedContext, wildcardPattern } on match
- Falls back to MODEL_TO_PROVIDERS lookup and openai default as before
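
As a rough illustration of the glob matching in T13, the sketch below converts a wildcard alias into an anchored regular expression. It is not the code in wildcardRouter.ts: the helper names, the flat alias map, and the first-match-wins rule are assumptions.

```typescript
// Turn a glob such as "claude-sonnet-*" into an anchored RegExp ("*" matches any run of characters).
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&"); // escape regex metacharacters except "*"
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Return the target of the first wildcard alias that matches, or null (first match wins: assumption).
function resolveWildcardAliasSketch(
  model: string,
  aliases: Record<string, string>
): { target: string; wildcardPattern: string } | null {
  for (const [pattern, target] of Object.entries(aliases)) {
    if (pattern.includes("*") && globToRegExp(pattern).test(model)) {
      return { target, wildcardPattern: pattern };
    }
  }
  return null;
}
```

With the alias from the commit message, `resolveWildcardAliasSketch("claude-sonnet-4-5", { "claude-sonnet-*": "anthropic/claude-sonnet-4" })` resolves to the `anthropic/claude-sonnet-4` target, while an unrelated model name yields `null` and falls through to the existing lookup.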

* fix: clipboard cleanup and tool validation

* feat: media page UX + T04 playground uploads + T03 HuggingFace/Vertex AI

Media Page (MediaPageClient.tsx):
- Render images inline (img tags from b64_json or url)
- Show transcription as plain readable text (not raw JSON)
- Amber banner for credential errors with link to /dashboard/providers
- Detect empty transcription result and show credentials hint
- Provider credential hint below selector for non-local providers
- Extended provider/model lists: HuggingFace, Qwen TTS, Inworld, Cartesia, PlayHT, AssemblyAI

T04 - Playground File Uploads (playground/page.tsx):
- Audio file upload panel for transcription endpoint (multipart/form-data)
- Image upload panel for vision models (gpt-4o, claude-3, gemini, pixtral, llava...)
- Auto-detect vision models by name heuristic
- Inject uploaded images as base64 image_url in chat messages
- Inline image rendering for image generation results
- Readable text view for transcription results with copy button
- Preview thumbnails for attached images with individual remove

T03 - HuggingFace + Vertex AI Providers:
- HuggingFace: frontend providers.ts + backend providerRegistry.ts
  Uses HuggingFace Router OpenAI-compatible endpoint
- Vertex AI: frontend providers.ts + backend providerRegistry.ts
  Uses gemini format with generateContent API (urlBuilder fallback)

T07 - API Key Round-Robin: VERIFIED already implemented in auth.ts
  fill-first, round-robin, p2c, random, least-used, cost-optimized strategies
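
Two of the strategies listed above, fill-first and p2c (power of two choices), can be sketched as follows. This is illustrative only and not the logic in auth.ts: the `Account` shape and the use of observed latency as the p2c comparison metric are assumptions.

```typescript
interface Account {
  id: string;
  exhausted: boolean;
  latencyMs: number; // assumed load metric for p2c
}

// fill-first: keep using the first account that still has quota.
function pickFillFirst(accounts: Account[]): Account | undefined {
  return accounts.find((a) => !a.exhausted);
}

// p2c: sample two distinct candidates at random and keep the better one.
function pickP2C(accounts: Account[], random: () => number = Math.random): Account | undefined {
  const pool = accounts.filter((a) => !a.exhausted);
  if (pool.length <= 1) return pool[0];
  const i = Math.floor(random() * pool.length);
  let j = Math.floor(random() * (pool.length - 1));
  if (j >= i) j += 1; // shift so the second index always differs from the first
  return pool[i].latencyMs <= pool[j].latencyMs ? pool[i] : pool[j];
}
```

Comparing only two random candidates avoids scanning every account on each request while still steering traffic away from the slowest ones.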

* feat: T05 task-aware routing + fix #302 stream override + fix #73 claude provider fallback

T05 - Task-Aware Smart Routing:
- New open-sse/services/taskAwareRouter.ts:
  Detects 7 task types: coding, creative, analysis, vision, summarization,
  background, chat from system/user message content and images
  Configurable taskModelMap per task type, stats tracking
  applyTaskAwareRouting() integrates with existing chat pipeline
- New src/app/api/settings/task-routing/route.ts:
  GET/PUT/POST API for task routing config + reset-stats + detect action
  Persists config via updateSettings('taskRouting')
- Integration in src/sse/handlers/chat.ts:
  applyTaskAwareRouting() called after policy enforcement, before combo resolve
  Logs task type detection and model overrides
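
A simplified sketch of the T05 flow, assuming keyword-based detection: the seven task type names come from the commit message, but the keyword lists, function names, and signatures below are invented for illustration and are not the rules in taskAwareRouter.ts. The `background` type is declared but never detected in this sketch.

```typescript
type TaskType = "coding" | "creative" | "analysis" | "vision" | "summarization" | "background" | "chat";

// Illustrative keyword rules only; the real detector's heuristics are not shown in the commit.
const TASK_KEYWORDS: Array<[TaskType, RegExp]> = [
  ["coding", /\b(function|bug|refactor|stack trace|compile)\b/i],
  ["summarization", /\b(summarize|summary)\b/i],
  ["analysis", /\b(analyze|analysis|compare|evaluate)\b/i],
  ["creative", /\b(story|poem|lyrics)\b/i],
];

function detectTask(text: string, hasImages: boolean): TaskType {
  if (hasImages) return "vision"; // attached images take priority over text cues
  for (const [task, pattern] of TASK_KEYWORDS) {
    if (pattern.test(text)) return task;
  }
  return "chat"; // default when nothing matches
}

// Override the requested model only when the map has an entry for the detected task.
function applyTaskRoutingSketch(
  requestedModel: string,
  text: string,
  hasImages: boolean,
  taskModelMap: Partial<Record<TaskType, string>>
): { task: TaskType; model: string } {
  const task = detectTask(text, hasImages);
  return { task, model: taskModelMap[task] ?? requestedModel };
}
```

Leaving the requested model untouched when no mapping exists keeps the feature opt-in per task type.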

Fix #302 - OpenAI SDK stream=False drops tool_calls:
- src/sse/handlers/chat.ts T01 Accept header negotiation:
  Changed condition from 'body.stream !== true' to 'body.stream === undefined'
  OpenAI Python SDK sends 'Accept: application/json, text/event-stream' in every
  request, even stream=False — the old code was incorrectly forcing stream=true,
  causing tool_calls to be dropped from non-streaming responses
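
The corrected condition can be summarized in a few lines. The `resolveStream` helper below is hypothetical (the real check is inline in src/sse/handlers/chat.ts); it only illustrates that an explicit `stream` value in the body now always wins over the Accept header.

```typescript
// Hypothetical helper mirroring the 'body.stream === undefined' condition.
function resolveStream(body: { stream?: boolean }, acceptHeader: string | null): boolean {
  // An explicit true/false in the JSON body is never overridden.
  if (body.stream !== undefined) return body.stream;
  // Only when the client said nothing is streaming inferred from the Accept header.
  return (acceptHeader ?? "").toLowerCase().includes("text/event-stream");
}
```

Under the old `body.stream !== true` check, a request with `stream: false` and an Accept header listing `text/event-stream` was forced into streaming; with this rule it stays non-streaming.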

Fix #73 - Claude Haiku routed to OpenAI provider instead of Antigravity:
- open-sse/services/model.ts getModelInfoCore():
  Added heuristic prefix detection before the blind 'openai' fallback:
  claude-* models → antigravity (Anthropic) provider
  gemini-*/gemma-* models → gemini provider
  Closes: #73, partially addresses #302
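
The prefix heuristic above amounts to a short chain of checks before the final default. The `inferProvider` name is hypothetical; the provider identifiers are the ones named in the commit message.

```typescript
// Hypothetical helper; in the project this logic sits inside getModelInfoCore().
function inferProvider(model: string): string {
  if (model.startsWith("claude-")) return "antigravity";
  if (model.startsWith("gemini-") || model.startsWith("gemma-")) return "gemini";
  return "openai"; // the previous blind default, now the last resort
}
```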

* fix: token counts 0 (#74), model import dup (#180), model route fallback (#73)

fix #74 - Token counts always 0 for Antigravity/Claude streaming:
- open-sse/utils/usageTracking.ts extractUsage():
  Add handler for 'message_start' SSE event which carries INPUT tokens in
  Antigravity/Claude streaming:
  { type: 'message_start', message: { usage: { input_tokens: N } } }
  This event was completely unhandled, causing ALL input token counts to be
  dropped for every Antigravity/Claude streaming request
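
A reduced sketch of the fix, assuming a Claude-style event stream in which `message_start` carries input tokens and `message_delta` carries output tokens. The `SseEvent` shape and both function names are assumptions; the real `extractUsage()` in usageTracking.ts handles more formats than shown here.

```typescript
interface Usage {
  input_tokens?: number;
  output_tokens?: number;
}

// Trimmed-down event shape (assumption); real events carry many more fields.
interface SseEvent {
  type: string;
  message?: { usage?: Usage };
  usage?: Usage;
}

function extractUsageSketch(event: SseEvent): Usage | null {
  // Previously unhandled: input tokens arrive once, on message_start.
  if (event.type === "message_start" && event.message?.usage) {
    return { input_tokens: event.message.usage.input_tokens ?? 0 };
  }
  // Output tokens are reported on message_delta.
  if (event.type === "message_delta" && event.usage) {
    return { output_tokens: event.usage.output_tokens ?? 0 };
  }
  return null;
}

// Fold per-event usage into totals for one streamed response.
function totalUsage(events: SseEvent[]): { input: number; output: number } {
  const total = { input: 0, output: 0 };
  for (const event of events) {
    const usage = extractUsageSketch(event);
    if (!usage) continue;
    if (usage.input_tokens !== undefined) total.input = usage.input_tokens;
    if (usage.output_tokens !== undefined) total.output = usage.output_tokens;
  }
  return total;
}
```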

fix #180 - Model import shows duplicates with no visual feedback:
- src/shared/components/ModelSelectModal.tsx:
  Added addedModelValues prop (string[]) to receive already-added model values
  Models already in the combo now shown with ✓ indicator + green highlight
  Makes it visually clear which models are already added vs new
- src/app/(dashboard)/dashboard/combos/page.tsx:
  Pass addedModelValues={models.map(m => m.model)} to ModelSelectModal

* Harden clipboard UX and Claude tool normalization (#360)

* Initial plan

* chore: plan updates for clipboard and translator fixes

* fix: clipboard cleanup, copy feedback, and claude tool validation

---------

Co-authored-by: openai-code-agent[bot] <242516109+Codex@users.noreply.github.com>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>

---------

Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: openai-code-agent[bot] <242516109+Codex@users.noreply.github.com>
2026-03-14 10:59:15 -03:00
Nyaru Toru 5cff98ea75 feat: add Qwen compatibility with updated user agent and CLI fingerprint settings (#352)
Co-authored-by: nyatoru <nyarutoru0002@outlook.co.th>
2026-03-14 10:58:50 -03:00
Nyaru Toru 76127415a4 fix(account-selector): enhance round-robin logic to handle excluded accounts and maintain state (#349)
Co-authored-by: nyatoru <nyarutoru0002@outlook.co.th>
2026-03-14 10:58:48 -03:00
dependabot[bot] 56936fe0e3 deps: bump undici from 7.24.1 to 7.24.2 (#361)
Bumps [undici](https://github.com/nodejs/undici) from 7.24.1 to 7.24.2.
- [Release notes](https://github.com/nodejs/undici/releases)
- [Commits](https://github.com/nodejs/undici/compare/v7.24.1...v7.24.2)

---
updated-dependencies:
- dependency-name: undici
  dependency-version: 7.24.2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-14 10:58:46 -03:00
dependabot[bot] dfbbbeb1b4 chore(deps): bump docker/setup-buildx-action from 3 to 4 (#343)
* chore(deps): bump docker/setup-buildx-action from 3 to 4

Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 3 to 4.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](https://github.com/docker/setup-buildx-action/compare/v3...v4)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Initial plan

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: openai-code-agent[bot] <242516109+Codex@users.noreply.github.com>
Co-authored-by: Diego Rodrigues de Sa e Souza <8016841+diegosouzapw@users.noreply.github.com>
2026-03-14 10:56:20 -03:00
dependabot[bot] 7f3ffd935e chore(deps): bump docker/setup-qemu-action from 3 to 4 (#342)
Bumps [docker/setup-qemu-action](https://github.com/docker/setup-qemu-action) from 3 to 4.
- [Release notes](https://github.com/docker/setup-qemu-action/releases)
- [Commits](https://github.com/docker/setup-qemu-action/compare/v3...v4)

---
updated-dependencies:
- dependency-name: docker/setup-qemu-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-14 10:56:18 -03:00
dependabot[bot] 29cf462d8f deps: bump undici from 7.22.0 to 7.24.1 (#348)
* deps: bump undici from 7.22.0 to 7.24.1

Bumps [undici](https://github.com/nodejs/undici) from 7.22.0 to 7.24.1.
- [Release notes](https://github.com/nodejs/undici/releases)
- [Commits](https://github.com/nodejs/undici/compare/v7.22.0...v7.24.1)

---
updated-dependencies:
- dependency-name: undici
  dependency-version: 7.24.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Initial plan

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: openai-code-agent[bot] <242516109+Codex@users.noreply.github.com>
Co-authored-by: Diego Rodrigues de Sa e Souza <8016841+diegosouzapw@users.noreply.github.com>
2026-03-14 10:56:16 -03:00
dependabot[bot] 5e1693e1f7 deps: bump dompurify from 3.3.2 to 3.3.3 (#347) 2026-03-14 10:55:45 -03:00
diegosouzapw 45424ca226 fix(ci): docs-sync, openapi version, changelog format, pre-commit hook
Build Electron Desktop App / Validate version (push) Failing after 38s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- docs/openapi.yaml: update info.version from 2.3.6 to 2.4.1 (fixes CI check)
- CHANGELOG.md: add '## [Unreleased]' section as first heading (required by check-docs-sync)
- scripts/check-docs-sync.mjs: fix regex to accept both hyphen (-) and em-dash (—)
  as date separators in changelog headings (standard Keep a Changelog format)
- .husky/pre-commit: add 'node scripts/check-docs-sync.mjs' to catch version
  mismatches locally before push
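
The separator fix in check-docs-sync.mjs could look roughly like the pattern below. This is a sketch under assumptions: the actual regex in the script is not shown in this commit, and the `parseChangelogHeading` helper is hypothetical. The em-dash is written as the escape `\u2014` so the pattern stays unambiguous.

```typescript
// Accepts "## [2.4.1] - 2026-03-13" and the same heading with an em-dash (\u2014) separator.
const CHANGELOG_HEADING = /^## \[(\d+\.\d+\.\d+)\]\s+[-\u2014]\s+(\d{4}-\d{2}-\d{2})\s*$/;

function parseChangelogHeading(line: string): { version: string; date: string } | null {
  const match = CHANGELOG_HEADING.exec(line);
  return match ? { version: match[1], date: match[2] } : null;
}
```

A heading such as `## [Unreleased]` does not match and returns `null`, so a caller would have to handle that section separately.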
2026-03-13 11:45:32 -03:00
diegosouzapw d976abb5e0 chore: v2.4.1 — combos free-stack always visible 2026-03-13 11:29:51 -03:00
diegosouzapw 92d302aed3 fix(combos): free-stack template first, 2x2 grid, green highlight badge
- Move 'Free Stack ($0)' to position 1 in COMBO_TEMPLATES (was 4th, invisible in 3-col grid)
- Add isFeatured flag to free-stack for special styling
- Change template grid: grid-cols-3 → 2x2 (sm:grid-cols-2) — all 4 templates visible
- Free Stack: green border/bg (emerald), FREE badge, larger text size
- Other templates: hover styles preserved, → arrow on Apply link
- Increase templates section padding
2026-03-13 11:26:18 -03:00
43 changed files with 2005 additions and 327 deletions
+23 -1
@@ -142,10 +142,32 @@ GITHUB_USER_AGENT=GitHubCopilotChat/0.26.7
ANTIGRAVITY_USER_AGENT=antigravity/1.104.0 darwin/arm64
KIRO_USER_AGENT=AWS-SDK-JS/3.0.0 kiro-ide/1.0.0
IFLOW_USER_AGENT=iFlow-Cli
-QWEN_USER_AGENT=google-api-nodejs-client/9.15.1
+QWEN_USER_AGENT=QwenCode/0.12.3 (linux; x64)
CURSOR_USER_AGENT=connect-es/1.6.1
GEMINI_CLI_USER_AGENT=google-api-nodejs-client/9.15.1
+# ─────────────────────────────────────────────────────────────────────────────
+# CLI Fingerprint Compatibility (optional — match native CLI binary signatures)
+# ─────────────────────────────────────────────────────────────────────────────
+# When enabled, OmniRoute reorders HTTP headers and JSON body fields to match
+# the exact signature of official CLI tools, reducing account flagging risk.
+# Your proxy IP is preserved — you get both stealth AND IP masking.
+#
+# Enable per-provider:
+# CLI_COMPAT_CODEX=1
+# CLI_COMPAT_CLAUDE=1
+# CLI_COMPAT_GITHUB=1
+# CLI_COMPAT_ANTIGRAVITY=1
+# CLI_COMPAT_KIRO=1
+# CLI_COMPAT_CURSOR=1
+# CLI_COMPAT_KIMI_CODING=1
+# CLI_COMPAT_KILOCODE=1
+# CLI_COMPAT_CLINE=1
+# CLI_COMPAT_QWEN=1
+#
+# Or enable for all providers at once:
+# CLI_COMPAT_ALL=1
# API Key Providers (Phase 1 + Phase 4)
# Add via Dashboard → Providers → Add API Key, or set here
# DEEPSEEK_API_KEY=
+2 -2
@@ -18,10 +18,10 @@ jobs:
uses: actions/checkout@v6
- name: Set up QEMU (for multi-arch builds)
-uses: docker/setup-qemu-action@v3
+uses: docker/setup-qemu-action@v4
- name: Set up Docker Buildx
-uses: docker/setup-buildx-action@v3
+uses: docker/setup-buildx-action@v4
- name: Login to Docker Hub
uses: docker/login-action@v4
+1
@@ -1 +1,2 @@
npx lint-staged
node scripts/check-docs-sync.mjs
+69 -1
@@ -1,5 +1,74 @@
# Changelog
## [Unreleased]
## [2.4.3] - 2026-03-14
> UI polish, routing strategy additions, and graceful error handling for usage limits.
### ✨ New Features
- **Fill-First & P2C Routing Strategies**: Added `fill-first` (drain quota before moving on) and `p2c` (Power-of-Two-Choices low-latency selection) to combo strategy picker, with full guidance panels and color-coded badges.
- **Free Stack Preset Models**: Creating a combo with the Free Stack template now auto-fills 7 best-in-class free provider models (Gemini CLI, Kiro, iFlow×2, Qwen, NVIDIA NIM, Groq). Users just activate the providers and get a $0/month combo out-of-the-box.
- **Wider Combo Modal**: Create/Edit combo modal now uses `max-w-4xl` for comfortable editing of large combos.
### 🐛 Bug Fixes
- **Limits page HTTP 500 for Codex & GitHub**: `getCodexUsage()` and `getGitHubUsage()` now return a user-friendly message when the provider returns 401/403 (expired token), instead of throwing and causing a 500 error on the Limits page.
- **MaintenanceBanner false-positive**: Banner no longer shows "Server is unreachable" spuriously on page load. Fixed by calling `checkHealth()` immediately on mount and removing stale `show`-state closure.
- **Provider icon tooltips**: Edit (pencil) and delete icon buttons in the provider connection row now have native HTML tooltips — all 6 action icons are now self-documented.
## [2.4.2] - 2026-03-14
> Multiple improvements from community issue analysis, new provider support, bug fixes for token tracking, model routing, and streaming reliability.
### ✨ New Features
- **Task-Aware Smart Routing (T05)**: Automatic model selection based on request content type — coding → deepseek-chat, analysis → gemini-2.5-pro, vision → gpt-4o, summarization → gemini-2.5-flash. Configurable via Settings. New `GET/PUT/POST /api/settings/task-routing` API.
- **HuggingFace Provider**: Added HuggingFace Router as an OpenAI-compatible provider with Llama 3.1 70B/8B, Qwen 2.5 72B, Mistral 7B, Phi-3.5 Mini.
- **Vertex AI Provider**: Added Vertex AI (Google Cloud) provider with Gemini 2.5 Pro/Flash, Gemma 2 27B, Claude via Vertex.
- **Playground File Uploads**: Audio upload for transcription, image upload for vision models (auto-detect by model name), inline image rendering for image generation results.
- **Model Select Visual Feedback**: Already-added models in combo picker now show ✓ green badge — prevents duplicate confusion.
- **Qwen Compatibility (PR #352)**: Updated User-Agent and CLI fingerprint settings for Qwen provider compatibility.
- **Round-Robin State Management (PR #349)**: Enhanced round-robin logic to handle excluded accounts and maintain rotation state correctly.
- **Clipboard UX (PR #360)**: Hardened clipboard operations with fallback for non-secure contexts; Claude tool normalization improvements.
### 🐛 Bug Fixes
- **Fix #302 — OpenAI SDK stream=False drops tool_calls**: T01 Accept header negotiation no longer forces streaming when `body.stream` is explicitly `false`. Was causing tool_calls to be silently dropped when using the OpenAI Python SDK in non-streaming mode.
- **Fix #73 — Claude Haiku routed to OpenAI without provider prefix**: `claude-*` models sent without a provider prefix now correctly route to the `antigravity` (Anthropic) provider. Added `gemini-*`/`gemma-*` → `gemini` heuristic as well.
- **Fix #74 — Token counts always 0 for Antigravity/Claude streaming**: The `message_start` SSE event which carries `input_tokens` was not being parsed by `extractUsage()`, causing all input token counts to drop. Input/output token tracking now works correctly for streaming responses.
- **Fix #180 — Model import duplicates with no feedback**: `ModelSelectModal` now shows ✓ green highlight for models already in the combo, making it obvious they're already added.
- **Media page generation errors**: Image results now render as `<img>` tags instead of raw JSON. Transcription results shown as readable text. Credential errors show an amber banner instead of silent failure.
- **Token refresh button on provider page**: Manual token refresh UI added for OAuth providers.
### 🔧 Improvements
- **Provider Registry**: HuggingFace and Vertex AI added to `providerRegistry.ts` and `providers.ts` (frontend).
- **Read Cache**: New `src/lib/db/readCache.ts` for efficient DB read caching.
- **Quota Cache**: Improved quota cache with TTL-based eviction.
### 📦 Dependencies
- `dompurify` → 3.3.3 (PR #347)
- `undici` → 7.24.2 (PR #348, #361)
- `docker/setup-qemu-action` → v4 (PR #342)
- `docker/setup-buildx-action` → v4 (PR #343)
### 📁 New Files
| File | Purpose |
| --------------------------------------------- | --------------------------------------- |
| `open-sse/services/taskAwareRouter.ts` | Task-aware routing logic (7 task types) |
| `src/app/api/settings/task-routing/route.ts` | Task routing config API |
| `src/app/api/providers/[id]/refresh/route.ts` | Manual OAuth token refresh |
| `src/lib/db/readCache.ts` | Efficient DB read cache |
| `src/shared/utils/clipboard.ts` | Hardened clipboard with fallback |
## [2.4.1] - 2026-03-13
### 🐛 Fix
- **Combos modal: Free Stack visible and prominent** — Free Stack template was hidden (4th in 3-column grid). Fixed: moved to position 1, switched to 2x2 grid so all 4 templates are visible, green border + FREE badge highlight.
## [2.4.0] - 2026-03-13
> **Major release** — Free Stack ecosystem, transcription playground overhaul, 44+ providers, comprehensive free tier documentation, and UI improvements across the board.
@@ -32,7 +101,6 @@
## [2.3.14] - 2026-03-13
### 🐛 Bug Fixes
- **iFlow OAuth (#339)**: Restored the valid default `clientSecret` — was previously an empty string, causing "Bad client credentials" on every connect attempt. The public credential is now the default fallback (overridable via `IFLOW_OAUTH_CLIENT_SECRET` env var).
+76 -76
@@ -706,19 +706,18 @@ Outcome: deep fallback depth for deadline-critical workloads
> Setup AI coding in minutes at **$0/month**. Connect these free accounts and use the built-in **Free Stack** combo.
| Step | Action | Providers Unlocked |
|---|---|---|
| 1 | Connect **Kiro** (AWS Builder ID OAuth) | Claude Sonnet 4.5, Haiku 4.5 — **unlimited** |
| 2 | Connect **iFlow** (Google OAuth) | kimi-k2-thinking, qwen3-coder-plus, deepseek-r1... — **unlimited** |
| 3 | Connect **Qwen** (Device Code) | qwen3-coder-plus, qwen3-coder-flash... — **unlimited** |
| 4 | Connect **Gemini CLI** (Google OAuth) | gemini-3-flash, gemini-2.5-pro — **180K/mo free** |
| 5 | `/dashboard/combos` → **Free Stack ($0)** template | Round-robin all free providers automatically |
| Step | Action | Providers Unlocked |
| ---- | -------------------------------------------------- | ------------------------------------------------------------------ |
| 1 | Connect **Kiro** (AWS Builder ID OAuth) | Claude Sonnet 4.5, Haiku 4.5 — **unlimited** |
| 2 | Connect **iFlow** (Google OAuth) | kimi-k2-thinking, qwen3-coder-plus, deepseek-r1... — **unlimited** |
| 3 | Connect **Qwen** (Device Code) | qwen3-coder-plus, qwen3-coder-flash... — **unlimited** |
| 4 | Connect **Gemini CLI** (Google OAuth) | gemini-3-flash, gemini-2.5-pro — **180K/mo free** |
| 5 | `/dashboard/combos` → **Free Stack ($0)** template | Round-robin all free providers automatically |
**Point any IDE/CLI to:** `http://localhost:20128/v1` · API Key: `any-string` · Done.
> **Optional extra coverage (also free):** Groq API key (30 RPM free), NVIDIA NIM (40 RPM free, 70+ models), Cerebras (1M tok/day).
## ⚡ Quick Start
### 1) Install and run
@@ -899,25 +898,25 @@ When minimized, OmniRoute lives in your system tray with quick actions:
## 💰 Pricing at a Glance
| Tier | Provider | Cost | Quota Reset | Best For |
| ------------------- | ----------------- | ----------------------- | ---------------- | -------------------- |
| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| | Gemini CLI | **FREE** | 180K/mo + 1K/day | Everyone! |
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| **🔑 API KEY** | NVIDIA NIM | **FREE** (dev forever) | ~40 RPM | 70+ open models |
| | Cerebras | **FREE** (1M tok/day) | 60K TPM / 30 RPM | World's fastest |
| | Groq | **FREE** (30 RPM) | 14.4K RPD | Ultra-fast Llama/Gemma |
| | DeepSeek | Pay-per-use | None | Best price/quality |
| | xAI (Grok) | Pay-per-use | None | Grok models |
| | Mistral | Free trial + paid | Rate limited | European AI |
| | OpenRouter | Pay-per-use | None | 100+ models aggr. |
| **💰 CHEAP** | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| **🆓 FREE** | iFlow | **$0** | Unlimited | 5 models unlimited |
| | Qwen | **$0** | Unlimited | 4 models unlimited |
| | Kiro | **$0** | Unlimited | Claude (AWS Builder ID) |
| Tier | Provider | Cost | Quota Reset | Best For |
| ------------------- | ----------------- | ---------------------- | ---------------- | ----------------------- |
| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| | Gemini CLI | **FREE** | 180K/mo + 1K/day | Everyone! |
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| **🔑 API KEY** | NVIDIA NIM | **FREE** (dev forever) | ~40 RPM | 70+ open models |
| | Cerebras | **FREE** (1M tok/day) | 60K TPM / 30 RPM | World's fastest |
| | Groq | **FREE** (30 RPM) | 14.4K RPD | Ultra-fast Llama/Gemma |
| | DeepSeek | Pay-per-use | None | Best price/quality |
| | xAI (Grok) | Pay-per-use | None | Grok models |
| | Mistral | Free trial + paid | Rate limited | European AI |
| | OpenRouter | Pay-per-use | None | 100+ models aggr. |
| **💰 CHEAP** | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| **🆓 FREE** | iFlow | **$0** | Unlimited | 5 models unlimited |
| | Qwen | **$0** | Unlimited | 4 models unlimited |
| | Kiro | **$0** | Unlimited | Claude (AWS Builder ID) |
**💡 $0 Combo Stack:** Gemini CLI (180K/mo) → iFlow (unlimited: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1) → Kiro (Claude for free) → Qwen (4 models, unlimited) — **Zero cost, never stops coding.** When Gemini quota runs out, OmniRoute auto-falls back to iFlow or Kiro with zero config.
@@ -931,63 +930,64 @@ When minimized, OmniRoute lives in your system tray with quick actions:
### 🔵 CLAUDE MODELS (via Kiro — AWS Builder ID)
| Model | Prefix | Limit | Rate Limit |
|---|---|---|---|
| `claude-sonnet-4.5` | `kr/` | **Unlimited** | No reported daily cap |
| `claude-haiku-4.5` | `kr/` | **Unlimited** | No reported daily cap |
| `claude-opus-4.6` | `kr/` | **Unlimited** | Latest Opus via Kiro |
| Model | Prefix | Limit | Rate Limit |
| ------------------- | ------ | ------------- | --------------------- |
| `claude-sonnet-4.5` | `kr/` | **Unlimited** | No reported daily cap |
| `claude-haiku-4.5` | `kr/` | **Unlimited** | No reported daily cap |
| `claude-opus-4.6` | `kr/` | **Unlimited** | Latest Opus via Kiro |
### 🟢 IFLOW MODELS (Free OAuth — No Credit Card)
| Model | Prefix | Limit | Rate Limit |
|---|---|---|---|
| `kimi-k2-thinking` | `if/` | **Unlimited** | No reported cap |
| `qwen3-coder-plus` | `if/` | **Unlimited** | No reported cap |
| `deepseek-r1` | `if/` | **Unlimited** | No reported cap |
| `minimax-m2.1` | `if/` | **Unlimited** | No reported cap |
| `kimi-k2` | `if/` | **Unlimited** | No reported cap |
| Model | Prefix | Limit | Rate Limit |
| ------------------ | ------ | ------------- | --------------- |
| `kimi-k2-thinking` | `if/` | **Unlimited** | No reported cap |
| `qwen3-coder-plus` | `if/` | **Unlimited** | No reported cap |
| `deepseek-r1` | `if/` | **Unlimited** | No reported cap |
| `minimax-m2.1` | `if/` | **Unlimited** | No reported cap |
| `kimi-k2` | `if/` | **Unlimited** | No reported cap |
### 🟡 QWEN MODELS (Device Code Auth)
| Model | Prefix | Limit | Rate Limit |
|---|---|---|---|
| `qwen3-coder-plus` | `qw/` | **Unlimited** | No reported cap |
| `qwen3-coder-flash` | `qw/` | **Unlimited** | No reported cap |
| `qwen3-coder-next` | `qw/` | **Unlimited** | No reported cap |
| `vision-model` | `qw/` | **Unlimited** | Multimodal (images) |
| Model | Prefix | Limit | Rate Limit |
| ------------------- | ------ | ------------- | ------------------- |
| `qwen3-coder-plus` | `qw/` | **Unlimited** | No reported cap |
| `qwen3-coder-flash` | `qw/` | **Unlimited** | No reported cap |
| `qwen3-coder-next` | `qw/` | **Unlimited** | No reported cap |
| `vision-model` | `qw/` | **Unlimited** | Multimodal (images) |
### 🟣 GEMINI CLI (Google OAuth)
| Model | Prefix | Limit | Rate Limit |
|---|---|---|---|
| `gemini-3-flash-preview` | `gc/` | **180K tok/month** + 1K/day | Monthly reset |
| `gemini-2.5-pro` | `gc/` | 180K/month (shared pool) | High quality |
| Model | Prefix | Limit | Rate Limit |
| ------------------------ | ------ | --------------------------- | ------------- |
| `gemini-3-flash-preview` | `gc/` | **180K tok/month** + 1K/day | Monthly reset |
| `gemini-2.5-pro` | `gc/` | 180K/month (shared pool) | High quality |
### ⚫ NVIDIA NIM (Free API Key — build.nvidia.com)
| Tier | Daily Limit | Rate Limit | Notes |
|---|---|---|---|
| Tier | Daily Limit | Rate Limit | Notes |
| ---------- | ------------ | ----------- | ------------------------------------------------------ |
| Free (Dev) | No token cap | **~40 RPM** | 70+ models; transitioning to pure rate limits mid-2025 |
Popular free models: `moonshotai/kimi-k2.5` (Kimi K2.5), `z-ai/glm4.7` (GLM 4.7), `deepseek-ai/deepseek-v3.2` (DeepSeek V3.2), `nvidia/llama-3.3-70b-instruct`, `deepseek/deepseek-r1`
### ⚪ CEREBRAS (Free API Key — inference.cerebras.ai)
| Tier | Daily Limit | Rate Limit | Notes |
|---|---|---|---|
| Tier | Daily Limit | Rate Limit | Notes |
| ---- | ----------------- | ---------------- | ------------------------------------------- |
| Free | **1M tokens/day** | 60K TPM / 30 RPM | World's fastest LLM inference; resets daily |
Available free: `llama-3.3-70b`, `llama-3.1-8b`, `deepseek-r1-distill-llama-70b`
### 🔴 GROQ (Free API Key — console.groq.com)
| Tier | Daily Limit | Rate Limit | Notes |
|---|---|---|---|
| Tier | Daily Limit | Rate Limit | Notes |
| ---- | ------------- | ---------------- | ----------------------------------------- |
| Free | **14.4K RPD** | 30 RPM per model | No credit card; 429 on limit, not charged |
Available free: `llama-3.3-70b-versatile`, `gemma2-9b-it`, `mixtral-8x7b`, `whisper-large-v3`
> **💡 The Ultimate Free Stack:**
>
> ```
> Kiro (Claude, unlimited)
> → iFlow (5 models, unlimited)
@@ -997,18 +997,18 @@ Available free: `llama-3.3-70b-versatile`, `gemma2-9b-it`, `mixtral-8x7b`, `whis
> → Groq (14.4K req/day)
> → NVIDIA NIM (40 RPM, 70+ models)
> ```
>
> Configure this as an OmniRoute combo and you'll never pay for AI again.
## 🎙️ Free Transcription Combo
> Transcribe any audio/video for **$0** — Deepgram leads with $200 free, AssemblyAI $50 fallback, Groq Whisper as unlimited emergency backup.
| Provider | Free Credits | Best Model | Rate Limit |
|---|---|---|---|
| 🟢 **Deepgram** | **$200 free** (signup) | `nova-3` — best accuracy, 30+ languages | No RPM limit on free credits |
| 🔵 **AssemblyAI** | **$50 free** (signup) | `universal-3-pro` — chapters, sentiment, PII | No RPM limit on free credits |
| 🔴 **Groq** | **Free forever** | `whisper-large-v3` — OpenAI Whisper | 30 RPM (rate limited) |
| Provider | Free Credits | Best Model | Rate Limit |
| ----------------- | ---------------------- | -------------------------------------------- | ---------------------------- |
| 🟢 **Deepgram** | **$200 free** (signup) | `nova-3` — best accuracy, 30+ languages | No RPM limit on free credits |
| 🔵 **AssemblyAI** | **$50 free** (signup) | `universal-3-pro` — chapters, sentiment, PII | No RPM limit on free credits |
| 🔴 **Groq** | **Free forever** | `whisper-large-v3` — OpenAI Whisper | 30 RPM (rate limited) |
**Suggested combo in `/dashboard/combos`:**
@@ -1023,7 +1023,6 @@ Nodes:
Then in the `/dashboard/media` **Transcription** tab: upload any audio or video file → select your combo endpoint → get the transcription in one of the supported formats.
## 💡 Key Features
OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
@@ -1058,20 +1057,21 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
### 🧠 Routing & Intelligence
| Feature | What It Does |
| ---------------------------------- | --------------------------------------------------------------------- |
| 🎯 **Smart 4-Tier Fallback** | Auto-route: Subscription → API Key → Cheap → Free |
| 📊 **Real-Time Quota Tracking** | Live token count + reset countdown per provider |
| 🔄 **Format Translation** | OpenAI ↔ Claude ↔ Gemini ↔ Responses with schema-safe conversions |
| 👥 **Multi-Account Support** | Multiple accounts per provider with intelligent selection |
| 🔄 **Auto Token Refresh** | OAuth tokens refresh automatically with retry |
| 🎨 **Custom Combos** | 6 balancing strategies + fallback chain control |
| 🌐 **Wildcard Router** | `provider/*` dynamic routing |
| 🧠 **Thinking Budget Controls** | Passthrough, auto, custom, and adaptive reasoning limits |
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
| Feature | What It Does |
| ---------------------------------- | ------------------------------------------------------------------------ |
| 🎯 **Smart 4-Tier Fallback** | Auto-route: Subscription → API Key → Cheap → Free |
| 📊 **Real-Time Quota Tracking** | Live token count + reset countdown per provider |
| 🔄 **Format Translation** | OpenAI ↔ Claude ↔ Gemini ↔ Responses with schema-safe conversions |
| 👥 **Multi-Account Support** | Multiple accounts per provider with intelligent selection |
| 🔄 **Auto Token Refresh** | OAuth tokens refresh automatically with retry |
| 🎨 **Custom Combos** | 6 balancing strategies + fallback chain control |
| 🌐 **Wildcard Router** | `provider/*` dynamic routing |
| 🧠 **Thinking Budget Controls** | Passthrough, auto, custom, and adaptive reasoning limits |
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
### 🎵 Multi-Modal APIs
+1 -1
@@ -1,7 +1,7 @@
openapi: 3.1.0
info:
title: OmniRoute API
version: 2.3.6
version: 2.4.3
description: |
OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
endpoint that routes requests to multiple AI providers with load balancing,
+52
@@ -118,6 +118,58 @@ export const CLI_FINGERPRINTS: Record<string, CliFingerprint> = {
bodyFieldOrder: ["project", "model", "userAgent", "requestType", "requestId", "request"],
userAgent: "antigravity",
},
qwen: {
headerOrder: [
"Host",
"Content-Type",
"Authorization",
"User-Agent",
"X-Dashscope-AuthType",
"X-Dashscope-CacheControl",
"X-Dashscope-UserAgent",
"X-Stainless-Arch",
"X-Stainless-Lang",
"X-Stainless-Os",
"X-Stainless-Package-Version",
"X-Stainless-Retry-Count",
"X-Stainless-Runtime",
"X-Stainless-Runtime-Version",
"Connection",
"Accept",
"Accept-Language",
"Sec-Fetch-Mode",
"Accept-Encoding",
],
bodyFieldOrder: [
"model",
"messages",
"temperature",
"top_p",
"max_tokens",
"stream",
"tools",
"tool_choice",
"response_format",
"n",
"stop",
],
userAgent: "QwenCode/0.12.3 (linux; x64)",
extraHeaders: {
"X-Dashscope-AuthType": "qwen-oauth",
"X-Dashscope-CacheControl": "enable",
"X-Dashscope-UserAgent": "QwenCode/0.12.3 (linux; x64)",
"X-Stainless-Arch": "x64",
"X-Stainless-Lang": "js",
"X-Stainless-Os": "Linux",
"X-Stainless-Package-Version": "5.11.0",
"X-Stainless-Retry-Count": "1",
"X-Stainless-Runtime": "node",
"X-Stainless-Runtime-Version": "v18.19.1",
Connection: "keep-alive",
"Accept-Language": "*",
"Sec-Fetch-Mode": "cors",
},
},
};
/**
+63 -2
@@ -212,8 +212,20 @@ export const REGISTRY: Record<string, RegistryEntry> = {
authType: "oauth",
authHeader: "bearer",
headers: {
"User-Agent": "google-api-nodejs-client/9.15.1",
"X-Goog-Api-Client": "gl-node/22.17.0",
"User-Agent": "QwenCode/0.12.3 (linux; x64)",
"X-Dashscope-AuthType": "qwen-oauth",
"X-Dashscope-CacheControl": "enable",
"X-Dashscope-UserAgent": "QwenCode/0.12.3 (linux; x64)",
"X-Stainless-Arch": "x64",
"X-Stainless-Lang": "js",
"X-Stainless-Os": "Linux",
"X-Stainless-Package-Version": "5.11.0",
"X-Stainless-Retry-Count": "1",
"X-Stainless-Runtime": "node",
"X-Stainless-Runtime-Version": "v18.19.1",
Connection: "keep-alive",
"Accept-Language": "*",
"Sec-Fetch-Mode": "cors",
},
oauth: {
clientIdEnv: "QWEN_OAUTH_CLIENT_ID",
@@ -884,6 +896,55 @@ export const REGISTRY: Record<string, RegistryEntry> = {
{ id: "NousResearch/Hermes-3-Llama-3.1-70B", name: "Hermes 3 70B" },
],
},
huggingface: {
id: "huggingface",
alias: "hf",
format: "openai",
executor: "default",
// HuggingFace Inference API — OpenAI-compatible endpoint
// Users must set their provider-specific baseUrl (model endpoint) in providerSpecificData.baseUrl
// or use a fixed model like: https://router.huggingface.co/ngc/nvidia/llama-3_1-nemotron-51b-instruct
baseUrl:
"https://router.huggingface.co/hf-inference/models/meta-llama/Meta-Llama-3.1-70B-Instruct/v1/chat/completions",
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "meta-llama/Meta-Llama-3.1-70B-Instruct", name: "Llama 3.1 70B Instruct" },
{ id: "meta-llama/Meta-Llama-3.1-8B-Instruct", name: "Llama 3.1 8B Instruct" },
{ id: "Qwen/Qwen2.5-72B-Instruct", name: "Qwen 2.5 72B" },
{ id: "mistralai/Mistral-7B-Instruct-v0.3", name: "Mistral 7B v0.3" },
{ id: "microsoft/Phi-3.5-mini-instruct", name: "Phi-3.5 Mini" },
],
},
vertex: {
id: "vertex",
alias: "vertex",
// Vertex AI uses Google's generateContent format (same as Gemini)
format: "gemini",
executor: "default",
// URL uses {project_id} and {region} from providerSpecificData — handled by custom executor or fallback
// Default to us-central1 / generic endpoint; users configure project via providerSpecificData
baseUrl: "https://us-central1-aiplatform.googleapis.com/v1/projects",
urlBuilder: (base, model, stream) => {
// Full URL: {base}/{project}/locations/{region}/publishers/google/models/{model}:{action}
// For a generic fallback, we build a Gemini-compatible URL
// The actual project/region are configured via providerSpecificData in the DB connection
const action = stream ? "streamGenerateContent?alt=sse" : "generateContent";
return `https://generativelanguage.googleapis.com/v1beta/models/${model}:${action}`;
},
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "gemini-2.5-pro", name: "Gemini 2.5 Pro (Vertex)" },
{ id: "gemini-2.5-flash", name: "Gemini 2.5 Flash (Vertex)" },
{ id: "gemini-2.0-flash-thinking-exp", name: "Gemini 2.0 Flash Thinking Exp (Vertex)" },
{ id: "gemma-2-27b-it", name: "Gemma 2 27B (Vertex)" },
{ id: "claude-opus-4-5@20251101", name: "Claude Opus 4.5 (Vertex)" },
{ id: "claude-sonnet-4-5@20251101", name: "Claude Sonnet 4.5 (Vertex)" },
],
},
};
// ── Generator Functions ───────────────────────────────────────────────────
+37 -2
@@ -1,4 +1,5 @@
import { PROVIDER_ID_TO_ALIAS, PROVIDER_MODELS } from "../config/providerModels.ts";
import { resolveWildcardAlias } from "./wildcardRouter.ts";
// Derive alias→provider mapping from the single source of truth (PROVIDER_ID_TO_ALIAS)
// This prevents the two maps from drifting out of sync
@@ -158,7 +159,7 @@ export async function getModelInfoCore(modelStr, aliasesOrGetter) {
// Get aliases (from object or function)
const aliases = typeof aliasesOrGetter === "function" ? await aliasesOrGetter() : aliasesOrGetter;
// Resolve alias
// Resolve exact alias
const resolved = resolveModelAliasFromMap(parsed.model, aliases);
if (resolved) {
const canonicalModel = resolveProviderModelAlias(resolved.provider, resolved.model);
@@ -169,6 +170,28 @@ export async function getModelInfoCore(modelStr, aliasesOrGetter) {
};
}
// T13: Try wildcard alias (glob patterns like "claude-sonnet-*" → "anthropic/claude-sonnet-4-...")
if (aliases && typeof aliases === "object") {
const aliasEntries = Object.entries(aliases).map(([pattern, target]) => ({ pattern, target }));
const wildcardMatch = resolveWildcardAlias(parsed.model, aliasEntries);
if (wildcardMatch) {
const target = wildcardMatch.target as string;
if (target.includes("/")) {
const firstSlash = target.indexOf("/");
const providerOrAlias = target.slice(0, firstSlash);
const targetModel = target.slice(firstSlash + 1);
const provider = resolveProviderAlias(providerOrAlias);
const canonicalModel = resolveProviderModelAlias(provider, targetModel);
return {
provider,
model: canonicalModel,
extendedContext,
wildcardPattern: wildcardMatch.pattern,
};
}
}
}
const modelId = parsed.model;
const providers = MODEL_TO_PROVIDERS.get(modelId) || [];
@@ -203,7 +226,19 @@ export async function getModelInfoCore(modelStr, aliasesOrGetter) {
};
}
// Fallback: treat as openai model
// Fallback: infer provider from known model name prefixes before defaulting to openai
// FIX #73: Models like claude-haiku-4-5-20251001 sent without provider prefix
// would incorrectly route to OpenAI. Use heuristic prefix detection first.
if (/^claude-/i.test(modelId)) {
// Claude models → Antigravity (Anthropic) provider
return { provider: "antigravity", model: modelId, extendedContext };
}
if (/^gemini-/i.test(modelId) || /^gemma-/i.test(modelId)) {
// Gemini/Gemma models → Gemini provider
return { provider: "gemini", model: modelId, extendedContext };
}
// Last resort: treat as openai model
return {
provider: "openai",
model: modelId,
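The fallback order added in this hunk can be read in isolation. Below is a minimal sketch, not the project's implementation: `inferProviderFromModelId` and `ProviderGuess` are hypothetical names, and only the three rules visible in the diff are reproduced (the `extendedContext` field is left out).

```typescript
// Hypothetical helper mirroring the prefix heuristic shown in the hunk above.
type ProviderGuess = { provider: string; model: string };

function inferProviderFromModelId(modelId: string): ProviderGuess {
  // Claude models sent without a provider prefix are routed to "antigravity".
  if (/^claude-/i.test(modelId)) {
    return { provider: "antigravity", model: modelId };
  }
  // Gemini and Gemma models are routed to "gemini".
  if (/^gemini-/i.test(modelId) || /^gemma-/i.test(modelId)) {
    return { provider: "gemini", model: modelId };
  }
  // Last resort: treat the ID as an OpenAI model.
  return { provider: "openai", model: modelId };
}
```

Because the checks are anchored with `^`, an ID such as `my-claude-proxy` still falls through to the OpenAI default in this sketch.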
+326
@@ -0,0 +1,326 @@
/**
* Task-Aware Smart Router T05
*
* Detects the semantic type of an incoming chat request and routes
* to the most appropriate (optimal cost/quality) model for that task type.
*
* Task types:
* - coding fast reasoning models (deepseek, codex, claude-sonnet)
* - creative expressive models (claude-opus, gpt-5)
* - analysis long-context + smart models (gemini-2.5-pro, claude-opus)
* - vision multimodal models (gpt-4o, gemini-2.5-flash, claude-3.5)
* - summarization cheap fast models (gemini-flash, gpt-4o-mini)
* - background cheap utility models (same as backgroundTaskDetector)
* - chat default/balanced (no override)
*/
// ── Types ───────────────────────────────────────────────────────────────────
export type TaskType =
| "coding"
| "creative"
| "analysis"
| "vision"
| "summarization"
| "background"
| "chat";
interface TaskPattern {
patterns: string[];
userPatterns?: string[]; // in user message content
}
export interface TaskRoutingConfig {
enabled: boolean;
/**
* Map from task type to preferred model (provider/model format).
* Empty string = use whatever was requested (no override).
*/
taskModelMap: Record<TaskType, string>;
detectionEnabled: boolean;
stats: { detected: number; routed: number };
}
// ── Default detection patterns ───────────────────────────────────────────────
const TASK_PATTERNS: Record<TaskType, TaskPattern> = {
coding: {
patterns: [
"write code",
"write a function",
"implement",
"debug",
"fix this",
"fix the",
"refactor",
"unit test",
"write test",
"write a script",
"code review",
"complete this function",
"add a feature",
"javascript",
"typescript",
"python",
"sql query",
"api endpoint",
],
userPatterns: [
"```",
"def ",
"function ",
"class ",
"import ",
"const ",
"let ",
"var ",
"SELECT ",
"INSERT ",
"<html",
"<div",
],
},
creative: {
patterns: [
"write a story",
"write a poem",
"write a song",
"creative writing",
"write a blog",
"write an article",
"write a script",
"write an essay",
"imagine",
"roleplay",
"brainstorm",
"creative",
],
},
analysis: {
patterns: [
"analyze",
"analyse",
"analysis",
"compare",
"evaluate",
"assess",
"explain",
"reasoning",
"pros and cons",
"advantages and disadvantages",
"what are the implications",
"in-depth",
"comprehensive",
],
},
vision: {
patterns: [
"look at this image",
"in this image",
"what do you see",
"describe this image",
"analyze this image",
"read this screenshot",
],
userPatterns: ["image_url", "data:image"],
},
summarization: {
patterns: [
"summarize",
"summary",
"tldr",
"tl;dr",
"brief overview",
"key points",
"main points",
"what did",
"highlights from",
],
},
background: {
patterns: [
"generate a title",
"generate title",
"create a title",
"name this",
"short description",
"brief description",
"one-line summary",
"conversation title",
],
},
chat: {
patterns: [],
},
};
// ── Default task → model map ─────────────────────────────────────────────────
const DEFAULT_TASK_MODEL_MAP: Record<TaskType, string> = {
coding: "deepseek/deepseek-chat", // DeepSeek V3.2 — best coding OSS
creative: "", // No override — use requested model
analysis: "gemini/gemini-2.5-pro", // Best long-context reasoning
vision: "openai/gpt-4o", // Best vision baseline
summarization: "gemini/gemini-2.5-flash", // Fast + cheap for summarization
background: "gemini/gemini-2.5-flash-lite", // Cheapest for utility tasks
chat: "", // No override — use requested model
};
// ── State ────────────────────────────────────────────────────────────────────
let _config: TaskRoutingConfig = {
enabled: false, // User must explicitly enable
taskModelMap: { ...DEFAULT_TASK_MODEL_MAP },
detectionEnabled: true,
stats: { detected: 0, routed: 0 },
};
// ── Config Management ────────────────────────────────────────────────────────
export function setTaskRoutingConfig(config: Partial<TaskRoutingConfig>): void {
_config = {
..._config,
...config,
stats: _config.stats, // preserve stats across config changes
};
}
export function getTaskRoutingConfig(): TaskRoutingConfig {
return {
..._config,
taskModelMap: { ..._config.taskModelMap },
stats: { ..._config.stats },
};
}
export function resetTaskRoutingStats(): void {
_config.stats = { detected: 0, routed: 0 };
}
export function getDefaultTaskModelMap(): Record<TaskType, string> {
return { ...DEFAULT_TASK_MODEL_MAP };
}
// ── Detection ────────────────────────────────────────────────────────────────
interface RequestMessage {
role?: string;
content?: unknown;
}
function extractText(content: unknown): string {
if (typeof content === "string") return content.toLowerCase();
if (Array.isArray(content)) {
return content
.map((part: any) =>
typeof part === "string" ? part.toLowerCase() : part?.text?.toLowerCase() || ""
)
.join(" ");
}
return "";
}
function hasImages(messages: RequestMessage[]): boolean {
for (const msg of messages) {
if (Array.isArray(msg.content)) {
for (const part of msg.content as any[]) {
if (part?.type === "image_url" || part?.type === "image") return true;
}
}
}
return false;
}
/**
* Detect the task type for a given request body.
* Returns 'chat' (no-op) if nothing specific is detected.
*/
export function detectTaskType(body: any): TaskType {
if (!body || typeof body !== "object") return "chat";
const messages: RequestMessage[] = Array.isArray(body.messages)
? body.messages
: Array.isArray(body.input)
? body.input
: [];
if (messages.length === 0) return "chat";
// 1. Vision — check for image_url in any message
if (hasImages(messages)) return "vision";
// 2. System prompt patterns (background first — most specific)
const systemMsg = messages.find((m) => m.role === "system" || m.role === "developer");
const systemText = systemMsg ? extractText(systemMsg.content) : "";
const lastUserMsg = [...messages].reverse().find((m) => m.role === "user");
const userText = lastUserMsg ? extractText(lastUserMsg.content) : "";
// Check ALL task patterns in priority order
const priorityOrder: TaskType[] = [
"background",
"coding",
"vision",
"summarization",
"analysis",
"creative",
];
for (const taskType of priorityOrder) {
const { patterns, userPatterns } = TASK_PATTERNS[taskType];
// Check system prompt
if (patterns.some((p) => systemText.includes(p.toLowerCase()))) {
return taskType;
}
// Check user message for this task's patterns
if (patterns.some((p) => userText.includes(p.toLowerCase()))) {
return taskType;
}
// Check user message for code-specific patterns (userPatterns)
if (userPatterns?.some((p) => userText.includes(p.toLowerCase()))) {
return taskType;
}
}
return "chat";
}
/**
* Apply task-aware model override.
* Returns the original model if routing is disabled or no override found.
*
* @param originalModel - The model from the request (e.g. "openai/gpt-4o")
* @param body - The raw request body to detect task type from
* @returns { model, taskType, wasRouted }
*/
export function applyTaskAwareRouting(
originalModel: string,
body: any
): { model: string; taskType: TaskType; wasRouted: boolean } {
if (!_config.enabled || !_config.detectionEnabled) {
return { model: originalModel, taskType: "chat", wasRouted: false };
}
const taskType = detectTaskType(body);
_config.stats.detected++;
const preferred = _config.taskModelMap[taskType];
// No override configured for this task type
if (!preferred || preferred === "") {
return { model: originalModel, taskType, wasRouted: false };
}
// Don't override if the model is already "better" (e.g. user sent opus, preferred is flash)
// We respect user's choice unless it's a background/summarization override
if (taskType !== "background" && taskType !== "summarization") {
// For non-utility tasks, only override if no specific model was given
// (i.e., model came from a combo default, not user-selected)
// This is a conservative heuristic; full override can be enabled via a setting
}
_config.stats.routed++;
return { model: preferred, taskType, wasRouted: true };
}
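One consequence of the priority loop in `detectTaskType` is worth spelling out: a phrase listed under more than one task type resolves to whichever type comes first in `priorityOrder`. The reduced sketch below illustrates this with two task types and a few patterns copied from the diff. The names `SketchTask`, `SKETCH_PATTERNS`, `SKETCH_PRIORITY`, and `detectSketchTask` are hypothetical and exist only for this illustration.

```typescript
// Reduced sketch of priority-ordered substring matching (not the full detector).
type SketchTask = "coding" | "creative" | "chat";
type MatchableTask = Exclude<SketchTask, "chat">;

// "write a script" appears in both lists, as it does in TASK_PATTERNS.
const SKETCH_PATTERNS: Record<MatchableTask, string[]> = {
  coding: ["write a script", "refactor", "debug"],
  creative: ["write a script", "write a poem", "brainstorm"],
};

// Earlier entries win when a phrase is listed under more than one task type.
const SKETCH_PRIORITY: MatchableTask[] = ["coding", "creative"];

function detectSketchTask(userText: string): SketchTask {
  const text = userText.toLowerCase();
  for (const task of SKETCH_PRIORITY) {
    if (SKETCH_PATTERNS[task].some((p) => text.includes(p))) return task;
  }
  return "chat"; // nothing matched: no override
}
```

With this ordering, a request such as "Write a script for a short film" is classified as `coding`, since `coding` is checked before `creative`.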
+10
@@ -161,6 +161,11 @@ async function getGitHubUsage(accessToken, providerSpecificData) {
if (!response.ok) {
const error = await response.text();
if (response.status === 401 || response.status === 403) {
return {
message: `GitHub token expired or permission denied. Please re-authenticate the connection.`,
};
}
throw new Error(`GitHub API error: ${error}`);
}
@@ -620,6 +625,11 @@ async function getCodexUsage(accessToken, providerSpecificData: Record<string, u
});
if (!response.ok) {
if (response.status === 401 || response.status === 403) {
return {
message: `Codex token expired or access denied. Please re-authenticate the connection.`,
};
}
throw new Error(`Codex API error: ${response.status}`);
}
@@ -63,14 +63,32 @@ export function claudeToOpenAIRequest(model, body, stream) {
// Tools
if (body.tools && Array.isArray(body.tools)) {
result.tools = body.tools.map((tool) => ({
type: "function",
function: {
name: tool.name,
description: tool.description,
parameters: tool.input_schema || { type: "object", properties: {} },
},
}));
const normalizedTools = body.tools
.map((tool) => {
const name = typeof tool.name === "string" ? tool.name.trim() : "";
if (!name) return null; // skip tools with empty/invalid name
return {
type: "function",
function: {
name,
description: typeof tool.description === "string" ? tool.description : "", // fix: never null (#276)
parameters: tool.input_schema || { type: "object", properties: {} },
},
};
})
.filter(
(
tool
): tool is {
type: "function";
function: { name: string; description: string; parameters: unknown };
} => Boolean(tool)
);
if (normalizedTools.length > 0) {
result.tools = normalizedTools;
}
}
// Tool choice
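The tool normalization introduced in this hunk can be exercised on its own. The following is a sketch under stated assumptions: `normalizeTools`, `ClaudeToolInput`, and `OpenAIFunctionTool` are hypothetical names, and a plain loop stands in for the `map` + type-guard `filter` used in the diff. The rules themselves (trim the name, skip tools with an empty name, coerce a non-string description to `""`, default the schema) follow the diff.

```typescript
// Hypothetical standalone version of the normalization rules in the hunk above.
interface ClaudeToolInput {
  name?: unknown;
  description?: unknown;
  input_schema?: unknown;
}

interface OpenAIFunctionTool {
  type: "function";
  function: { name: string; description: string; parameters: unknown };
}

function normalizeTools(tools: ClaudeToolInput[]): OpenAIFunctionTool[] {
  const normalized: OpenAIFunctionTool[] = [];
  for (const tool of tools) {
    const name = typeof tool.name === "string" ? tool.name.trim() : "";
    if (!name) continue; // skip tools with an empty or non-string name
    normalized.push({
      type: "function",
      function: {
        name,
        // description is always a string, never null or undefined
        description: typeof tool.description === "string" ? tool.description : "",
        parameters: tool.input_schema || { type: "object", properties: {} },
      },
    });
  }
  return normalized;
}
```

As in the diff, a caller would attach the result to the request only when the returned array is non-empty.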
+17 -1
@@ -188,7 +188,23 @@ export function hasValidUsage(usage) {
export function extractUsage(chunk) {
if (!chunk || typeof chunk !== "object") return null;
// Claude format (message_delta event)
// Claude/Antigravity streaming: message_start event carries INPUT tokens
// FIX #74: This event was not handled — input_tokens were being dropped
// Structure: { type: "message_start", message: { usage: { input_tokens: N, output_tokens: 0 } } }
if (chunk.type === "message_start" && chunk.message?.usage) {
const u = chunk.message.usage;
const inputTokens = u.input_tokens || u.prompt_tokens || 0;
if (inputTokens > 0) {
return normalizeUsage({
prompt_tokens: inputTokens,
completion_tokens: u.output_tokens || u.completion_tokens || 0,
cache_read_input_tokens: u.cache_read_input_tokens,
cache_creation_input_tokens: u.cache_creation_input_tokens,
});
}
}
// Claude format (message_delta event) — carries OUTPUT tokens
if (chunk.type === "message_delta" && chunk.usage && typeof chunk.usage === "object") {
return normalizeUsage({
prompt_tokens: chunk.usage.input_tokens || 0,
+8 -11
@@ -1,12 +1,12 @@
{
"name": "omniroute",
"version": "2.3.17",
"version": "2.4.2",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "omniroute",
"version": "2.3.17",
"version": "2.4.2",
"hasInstallScript": true,
"license": "MIT",
"workspaces": [
@@ -5676,13 +5676,10 @@
}
},
"node_modules/dompurify": {
"version": "3.3.2",
"resolved": "https://registry.npmjs.org/dompurify/-/dompurify-3.3.2.tgz",
"integrity": "sha512-6obghkliLdmKa56xdbLOpUZ43pAR6xFy1uOrxBaIDjT+yaRuuybLjGS9eVBoSR/UPU5fq3OXClEHLJNGvbxKpQ==",
"version": "3.3.3",
"resolved": "https://registry.npmjs.org/dompurify/-/dompurify-3.3.3.tgz",
"integrity": "sha512-Oj6pzI2+RqBfFG+qOaOLbFXLQ90ARpcGG6UePL82bJLtdsa6CYJD7nmiU8MW9nQNOtCHV3lZ/Bzq1X0QYbBZCA==",
"license": "(MPL-2.0 OR Apache-2.0)",
"engines": {
"node": ">=20"
},
"optionalDependencies": {
"@types/trusted-types": "^2.0.7"
}
@@ -11527,9 +11524,9 @@
}
},
"node_modules/undici": {
"version": "7.22.0",
"resolved": "https://registry.npmjs.org/undici/-/undici-7.22.0.tgz",
"integrity": "sha512-RqslV2Us5BrllB+JeiZnK4peryVTndy9Dnqq62S3yYRRTj0tFQCwEniUy2167skdGOy3vqRzEvl1Dm4sV2ReDg==",
"version": "7.24.2",
"resolved": "https://registry.npmjs.org/undici/-/undici-7.24.2.tgz",
"integrity": "sha512-P9J1HWYV/ajFr8uCqk5QixwiRKmB1wOamgS0e+o2Z4A44Ej2+thFVRLG/eA7qprx88XXhnV5Bl8LHXTURpzB3Q==",
"license": "MIT",
"engines": {
"node": ">=20.18.1"
+1 -1
@@ -1,6 +1,6 @@
{
"name": "omniroute",
"version": "2.4.0",
"version": "2.4.3",
"description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
"type": "module",
"bin": {
+1 -1
@@ -43,7 +43,7 @@ function extractOpenApiVersion(content) {
}
function extractChangelogSections(content) {
const headings = [...content.matchAll(/^##\s+\[([^\]]+)\](?:\s+—\s+.*)?$/gm)];
const headings = [...content.matchAll(/^##\s+\[([^\]]+)\](?:\s+[-—–].*)?$/gm)];
return headings.map((match) => match[1]);
}
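The widened character class `[-—–]` lets a changelog heading use a hyphen, an em dash, or an en dash as the separator after the version. A small check of the new pattern follows; `extractVersions` is a hypothetical wrapper, and the sample headings and dates are made up for illustration.

```typescript
// Same pattern as the updated extractChangelogSections regex.
function extractVersions(content: string): string[] {
  const headings = [...content.matchAll(/^##\s+\[([^\]]+)\](?:\s+[-—–].*)?$/gm)];
  return headings.map((match) => match[1]);
}

// Illustrative headings only; not taken from the real changelog.
const sampleChangelog = [
  "## [2.4.3] — 2026-03-14", // em dash (matched by the old pattern too)
  "## [2.4.2] - 2026-03-13", // plain hyphen (newly matched)
  "## [2.4.1] – 2026-03-12", // en dash (newly matched)
  "## [2.4.0]", // no separator at all
  "## Unreleased notes", // no bracketed version: ignored
].join("\n");

const versions = extractVersions(sampleChangelog);
console.log(versions);
```

Only headings with a bracketed version are captured, so the last sample line is skipped.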
@@ -10,6 +10,7 @@ import { useRouter } from "next/navigation";
import { Card, CardSkeleton, Button, Modal } from "@/shared/components";
import { AI_PROVIDERS, FREE_PROVIDERS, OAUTH_PROVIDERS } from "@/shared/constants/providers";
import { useNotificationStore } from "@/store/notificationStore";
import { copyToClipboard } from "@/shared/utils/clipboard";
export default function HomePageClient({ machineId }) {
const t = useTranslations("home");
@@ -418,8 +419,8 @@ function ProviderModelsModal({ provider, models, onClose }) {
router.push(path);
};
const handleCopy = (text) => {
navigator.clipboard.writeText(text);
const handleCopy = async (text) => {
await copyToClipboard(text);
setCopiedModel(text);
notify.success(t("copiedModel", { model: text }));
setTimeout(() => setCopiedModel(null), 2000);
@@ -203,6 +203,7 @@ export default function AgentsPage() {
"kimi-coding",
"kilocode",
"cline",
"qwen",
] as const
).map((providerId) => {
const providerMeta = Object.values(AI_PROVIDERS).find(
@@ -4,6 +4,7 @@ import { useEffect, useRef, useState, useCallback } from "react";
import { Card, Button, ModelSelectModal } from "@/shared/components";
import Image from "next/image";
import { useTranslations } from "next-intl";
import { copyToClipboard } from "@/shared/utils/clipboard";
export default function DefaultToolCard({
toolId,
@@ -100,7 +101,7 @@ export default function DefaultToolCard({
};
const handleCopy = async (text, field) => {
await navigator.clipboard.writeText(replaceVars(text));
await copyToClipboard(replaceVars(text));
setCopiedField(field);
setTimeout(() => setCopiedField(null), 2000);
};
+99 -26
@@ -27,6 +27,13 @@ const STRATEGY_OPTIONS = [
{ value: "random", labelKey: "random", descKey: "randomDesc", icon: "shuffle" },
{ value: "least-used", labelKey: "leastUsed", descKey: "leastUsedDesc", icon: "low_priority" },
{ value: "cost-optimized", labelKey: "costOpt", descKey: "costOptimizedDesc", icon: "savings" },
{
value: "fill-first",
labelKey: "fillFirst",
descKey: "fillFirstDesc",
icon: "stacked_bar_chart",
},
{ value: "p2c", labelKey: "p2c", descKey: "p2cDesc", icon: "compare_arrows" },
];
const STRATEGY_GUIDANCE_FALLBACK = {
@@ -60,6 +67,16 @@ const STRATEGY_GUIDANCE_FALLBACK = {
avoid: "Avoid when pricing data is missing or outdated.",
example: "Example: Batch or background jobs where lower cost matters most.",
},
"fill-first": {
when: "Use when you want to drain one provider's quota fully before moving to the next.",
avoid: "Avoid when you need request-level load balancing across providers.",
example: "Example: Use all $200 Deepgram credits before falling to Groq.",
},
p2c: {
when: "Use when you want low-latency selection using Power-of-Two-Choices algorithm.",
avoid: "Avoid for small combos with 2 or fewer models — no benefit over round-robin.",
example: "Example: High-throughput inference across 4+ equivalent model endpoints.",
},
};
const ADVANCED_FIELD_HELP_FALLBACK = {
@@ -126,6 +143,25 @@ const STRATEGY_RECOMMENDATIONS_FALLBACK = {
"Use for batch/background jobs where cost is the main KPI.",
],
},
"fill-first": {
title: "Quota drain strategy",
description: "Exhausts one provider's quota before moving to the next in chain.",
tips: [
"Order models by free quota size — biggest first.",
"Enable health checks to skip drained providers.",
"Ideal for free-tier stacking (Deepgram → Groq → NIM).",
],
},
p2c: {
title: "Power-of-Two-Choices",
description:
"Picks the less-loaded of two random candidates per request — low latency at scale.",
tips: [
"Use with 4+ models for best effect.",
"Requires latency telemetry enabled in Settings.",
"Great replacement for round-robin in high-throughput combos.",
],
},
};
const COMBO_USAGE_GUIDE_STORAGE_KEY = "omniroute:combos:hide-usage-guide";
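For readers unfamiliar with the `p2c` option, the generic Power-of-Two-Choices idea can be sketched as below. This is the textbook algorithm, not OmniRoute's selection code, which does not appear in this diff: `Candidate`, `inFlight`, and `pickP2C` are hypothetical names, and using the in-flight request count as the load signal is an assumption (the guidance text above mentions latency telemetry instead).

```typescript
// Generic Power-of-Two-Choices: sample two distinct candidates, keep the less loaded one.
interface Candidate {
  model: string;
  inFlight: number; // assumed load signal for this sketch
}

function pickP2C(candidates: Candidate[], rand: () => number = Math.random): Candidate {
  if (candidates.length === 0) throw new Error("no candidates");
  if (candidates.length === 1) return candidates[0];

  const first = Math.floor(rand() * candidates.length);
  // Draw the second index from the remaining n-1 slots so it always differs from the first.
  let second = Math.floor(rand() * (candidates.length - 1));
  if (second >= first) second += 1;

  const a = candidates[first];
  const b = candidates[second];
  return a.inFlight <= b.inFlight ? a : b;
}
```

With exactly two candidates both are always sampled, so every request goes to the currently less-loaded one. The random sampling only starts to matter with larger pools, which fits the "4+ models" tip in the guidance text.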
@@ -141,10 +177,27 @@ const COMBO_TEMPLATE_FALLBACK = {
balancedTitle: "Balanced load",
balancedDesc: "Least-used routing to spread demand over time.",
freeStackTitle: "Free Stack ($0)",
freeStackDesc: "Round-robin across all free providers: Kiro, iFlow, Qwen, Gemini CLI. Zero cost, never stops.",
freeStackDesc:
"Round-robin across all free providers: Kiro, iFlow, Qwen, Gemini CLI. Zero cost, never stops.",
};
const COMBO_TEMPLATES = [
{
id: "free-stack",
icon: "volunteer_activism",
titleKey: "templateFreeStack",
descKey: "templateFreeStackDesc",
fallbackTitle: COMBO_TEMPLATE_FALLBACK.freeStackTitle,
fallbackDesc: COMBO_TEMPLATE_FALLBACK.freeStackDesc,
strategy: "round-robin",
suggestedName: "free-stack",
isFeatured: true,
config: {
maxRetries: 3,
retryDelayMs: 500,
healthCheckEnabled: true,
},
},
{
id: "high-availability",
icon: "shield",
@@ -190,21 +243,6 @@ const COMBO_TEMPLATES = [
healthCheckEnabled: true,
},
},
{
id: "free-stack",
icon: "volunteer_activism",
titleKey: "templateFreeStack",
descKey: "templateFreeStackDesc",
fallbackTitle: COMBO_TEMPLATE_FALLBACK.freeStackTitle,
fallbackDesc: COMBO_TEMPLATE_FALLBACK.freeStackDesc,
strategy: "round-robin",
suggestedName: "free-stack",
config: {
maxRetries: 3,
retryDelayMs: 500,
healthCheckEnabled: true,
},
},
];
function getStrategyMeta(strategy) {
@@ -225,6 +263,8 @@ function getStrategyBadgeClass(strategy) {
if (strategy === "random") return "bg-purple-500/15 text-purple-600 dark:text-purple-400";
if (strategy === "least-used") return "bg-cyan-500/15 text-cyan-600 dark:text-cyan-400";
if (strategy === "cost-optimized") return "bg-teal-500/15 text-teal-600 dark:text-teal-400";
if (strategy === "fill-first") return "bg-orange-500/15 text-orange-600 dark:text-orange-400";
if (strategy === "p2c") return "bg-indigo-500/15 text-indigo-600 dark:text-indigo-400";
return "bg-blue-500/15 text-blue-600 dark:text-blue-400";
}
@@ -1363,10 +1403,24 @@ function ComboFormModal({ isOpen, combo, onClose, onSave, activeProviders }) {
);
};
const FREE_STACK_PRESET_MODELS = [
{ model: "gc/gemini-3-flash-preview", weight: 0 },
{ model: "kr/claude-sonnet-4.5", weight: 0 },
{ model: "if/kimi-k2-thinking", weight: 0 },
{ model: "if/qwen3-coder-plus", weight: 0 },
{ model: "qw/qwen3-coder-plus", weight: 0 },
{ model: "nvidia/llama-3.3-70b-instruct", weight: 0 },
{ model: "groq/llama-3.3-70b-versatile", weight: 0 },
];
const applyTemplate = (template) => {
setStrategy(template.strategy);
setConfig((prev) => ({ ...prev, ...template.config }));
if (!name.trim()) setName(template.suggestedName);
// Pre-fill Free Stack with 7 real free provider models
if (template.id === "free-stack") {
setModels(FREE_STACK_PRESET_MODELS);
}
};
// Format model display name with readable provider name
@@ -1471,7 +1525,12 @@ function ComboFormModal({ isOpen, combo, onClose, onSave, activeProviders }) {
return (
<>
<Modal isOpen={isOpen} onClose={onClose} title={isEdit ? t("editCombo") : t("createCombo")}>
<Modal
isOpen={isOpen}
onClose={onClose}
title={isEdit ? t("editCombo") : t("createCombo")}
size="full"
>
<div className="flex flex-col gap-3">
{/* Name */}
<div>
@@ -1486,7 +1545,7 @@ function ComboFormModal({ isOpen, combo, onClose, onSave, activeProviders }) {
</div>
{!isEdit && (
<div className="rounded-lg border border-black/10 dark:border-white/10 bg-black/[0.02] dark:bg-white/[0.02] p-2.5">
<div className="rounded-lg border border-black/8 dark:border-white/8 bg-black/[0.02] dark:bg-white/[0.02] p-3">
<div className="mb-2">
<p className="text-xs font-medium">
{getI18nOrFallback(t, "templatesTitle", COMBO_TEMPLATE_FALLBACK.title)}
@@ -1499,27 +1558,40 @@ function ComboFormModal({ isOpen, combo, onClose, onSave, activeProviders }) {
)}
</p>
</div>
<div className="grid grid-cols-1 md:grid-cols-3 gap-1.5">
<div className="grid grid-cols-1 sm:grid-cols-2 gap-2 mt-1">
{COMBO_TEMPLATES.map((template) => (
<button
type="button"
key={template.id}
onClick={() => applyTemplate(template)}
className="text-left rounded-md border border-black/10 dark:border-white/10 bg-white/70 dark:bg-white/[0.03] px-2 py-1.5 hover:border-primary/40 hover:bg-primary/5 transition-colors"
className={`text-left rounded-md border px-3 py-2 transition-all ${
template.isFeatured
? "border-emerald-500/50 bg-emerald-500/5 hover:border-emerald-500/80 hover:bg-emerald-500/10 ring-1 ring-emerald-500/20"
: "border-black/10 dark:border-white/10 bg-white/70 dark:bg-white/[0.03] hover:border-primary/40 hover:bg-primary/5"
}`}
>
<div className="flex items-center gap-1.5">
<span className="material-symbols-outlined text-[14px] text-primary">
<div className="flex items-center gap-2">
<span
className={`material-symbols-outlined text-[16px] ${template.isFeatured ? "text-emerald-500" : "text-primary"}`}
>
{template.icon}
</span>
<span className="text-[11px] font-medium text-text-main">
<span className="text-[12px] font-semibold text-text-main">
{getI18nOrFallback(t, template.titleKey, template.fallbackTitle)}
</span>
{template.isFeatured && (
<span className="ml-auto text-[9px] font-bold uppercase tracking-wide bg-emerald-500/20 text-emerald-600 dark:text-emerald-400 px-1.5 py-0.5 rounded">
FREE
</span>
)}
</div>
<p className="text-[10px] text-text-muted mt-1 leading-4">
<p className="text-[10px] text-text-muted mt-1.5 leading-[1.5]">
{getI18nOrFallback(t, template.descKey, template.fallbackDesc)}
</p>
<p className="text-[10px] text-primary mt-1">
{getI18nOrFallback(t, "templateApply", COMBO_TEMPLATE_FALLBACK.apply)}
<p
className={`text-[10px] mt-1.5 font-medium ${template.isFeatured ? "text-emerald-500" : "text-primary"}`}
>
{getI18nOrFallback(t, "templateApply", COMBO_TEMPLATE_FALLBACK.apply)}
</p>
</button>
))}
@@ -1986,6 +2058,7 @@ function ComboFormModal({ isOpen, combo, onClose, onSave, activeProviders }) {
modelAliases={modelAliases}
title={t("addModelToCombo")}
selectedModel={null}
addedModelValues={models.map((m) => m.model)}
/>
</>
);
+12 -14
@@ -7,6 +7,7 @@ import McpDashboardPage from "../mcp/page";
import A2ADashboardPage from "../a2a/page";
import ApiEndpointsTab from "./ApiEndpointsTab";
import { useTranslations } from "next-intl";
import { copyToClipboard } from "@/shared/utils/clipboard";
type ServiceStatus = {
online: boolean;
@@ -111,7 +112,11 @@ function TransportSelector({
const options: { value: McpTransport; label: string; desc: string }[] = [
{ value: "stdio", label: "stdio", desc: "Local — IDE spawns process via omniroute --mcp" },
{ value: "sse", label: "SSE", desc: "Remote — Server-Sent Events over HTTP" },
{ value: "streamable-http", label: "Streamable HTTP", desc: "Remote — Modern bidirectional HTTP" },
{
value: "streamable-http",
label: "Streamable HTTP",
desc: "Remote — Modern bidirectional HTTP",
},
];
const urlMap: Record<McpTransport, string> = {
@@ -145,8 +150,7 @@ function TransportSelector({
disabled={disabled}
className="flex flex-col items-start px-4 py-2.5 rounded-lg border transition-all duration-200 text-left"
style={{
borderColor:
value === opt.value ? "var(--color-primary)" : "var(--color-border)",
borderColor: value === opt.value ? "var(--color-primary)" : "var(--color-border)",
background:
value === opt.value
? "rgba(var(--color-primary-rgb, 99,102,241), 0.1)"
@@ -163,10 +167,7 @@ function TransportSelector({
>
{opt.label}
</span>
<span
className="text-xs mt-0.5"
style={{ color: "var(--color-text-muted)" }}
>
<span className="text-xs mt-0.5" style={{ color: "var(--color-text-muted)" }}>
{opt.desc}
</span>
</button>
@@ -184,10 +185,7 @@ function TransportSelector({
>
{value === "stdio" ? "terminal" : "link"}
</span>
<code
className="text-xs break-all"
style={{ color: "var(--color-text-muted)" }}
>
<code className="text-xs break-all" style={{ color: "var(--color-text-muted)" }}>
{urlMap[value]}
</code>
{value !== "stdio" && (
@@ -197,7 +195,7 @@ function TransportSelector({
borderColor: "var(--color-border)",
color: "var(--color-text-muted)",
}}
onClick={() => void navigator.clipboard.writeText(urlMap[value])}
onClick={() => void copyToClipboard(urlMap[value])}
title="Copy URL"
>
Copy
@@ -276,7 +274,7 @@ export default function EndpointPage() {
setToggling(false);
}
},
[mcpEnabled, a2aEnabled, patchSetting],
[mcpEnabled, a2aEnabled, patchSetting]
);
const changeTransport = useCallback(
@@ -291,7 +289,7 @@ export default function EndpointPage() {
setTransportSaving(false);
}
},
[patchSetting],
[patchSetting]
);
const refreshMcpStatus = useCallback(async () => {
@@ -1,7 +1,8 @@
"use client";
import { useState } from "react";
import { useState, useEffect, useRef } from "react";
import { useTranslations } from "next-intl";
import Link from "next/link";
type Modality = "image" | "video" | "music" | "speech" | "transcription";
type GenerationResult = {
@@ -20,6 +21,7 @@ const MODALITY_CONFIG: Record<
placeholder?: string;
color: string;
textLabel?: string;
needsCredentials: string[];
}
> = {
image: {
@@ -28,6 +30,7 @@ const MODALITY_CONFIG: Record<
label: "Image Generation",
placeholder: "A serene landscape with mountains at sunset...",
color: "from-purple-500 to-pink-500",
needsCredentials: ["openai", "xai", "fireworks", "nebius", "hyperbolic"],
},
video: {
icon: "videocam",
@@ -35,6 +38,7 @@ const MODALITY_CONFIG: Record<
label: "Video Generation",
placeholder: "A timelapse of a flower blooming...",
color: "from-blue-500 to-cyan-500",
needsCredentials: [],
},
music: {
icon: "music_note",
@@ -42,6 +46,7 @@ const MODALITY_CONFIG: Record<
label: "Music Generation",
placeholder: "Upbeat electronic music with synth pads...",
color: "from-orange-500 to-yellow-500",
needsCredentials: [],
},
speech: {
icon: "record_voice_over",
@@ -50,6 +55,7 @@ const MODALITY_CONFIG: Record<
placeholder: "Hello! Welcome to OmniRoute, your intelligent AI gateway...",
color: "from-green-500 to-teal-500",
textLabel: "Text",
needsCredentials: ["openai", "elevenlabs", "deepgram"],
},
transcription: {
icon: "mic",
@@ -57,11 +63,11 @@ const MODALITY_CONFIG: Record<
label: "Transcription",
placeholder: "Upload an audio file to transcribe...",
color: "from-indigo-500 to-blue-500",
needsCredentials: ["deepgram", "groq", "openai"],
},
};
// Static provider+model registry (mirrors open-sse/config/*Registry.ts)
// — kept client-side so no API round-trip needed.
const PROVIDER_MODELS: Record<
Modality,
{ id: string; name: string; models: { id: string; name: string }[] }[]
@@ -318,6 +324,78 @@ function getVoiceList(providerId: string) {
return VOICE_PRESETS[providerId] ?? VOICE_PRESETS.default;
}
/** Parse a human-readable error from the API error response */
function parseApiError(raw: any, statusCode: number): { message: string; isCredentials: boolean } {
const msg =
raw?.error?.message ||
raw?.error ||
raw?.message ||
raw?.detail ||
(typeof raw === "string" ? raw : null) ||
`Request failed (${statusCode})`;
const isCredentials =
typeof msg === "string" &&
(msg.toLowerCase().includes("no credentials") ||
msg.toLowerCase().includes("invalid api key") ||
msg.toLowerCase().includes("unauthorized") ||
msg.toLowerCase().includes("authentication") ||
statusCode === 401 ||
statusCode === 403);
return { message: String(msg), isCredentials };
}
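The fallback chain in `parseApiError` can be exercised outside the component. The following is a minimal standalone sketch, not the committed code: the name `classifyApiError` and the type `ParsedApiError` are hypothetical, and unlike the function above it only accepts string candidates and checks the status code regardless of the message type.

```typescript
type ParsedApiError = { message: string; isCredentials: boolean };

// Hypothetical standalone variant of the classification logic, for illustration only.
function classifyApiError(raw: unknown, statusCode: number): ParsedApiError {
  const body = (raw ?? {}) as Record<string, unknown>;
  const nested = body.error;

  // Candidates in the same priority order as the diff:
  // error.message, error, message, detail, then the raw value itself.
  const candidates: unknown[] = [
    typeof nested === "object" && nested !== null
      ? (nested as Record<string, unknown>).message
      : undefined,
    nested,
    body.message,
    body.detail,
    raw,
  ];

  // Only non-empty strings qualify, so an object-valued `error` never
  // ends up rendered as "[object Object]".
  const found = candidates.find(
    (c): c is string => typeof c === "string" && c.length > 0
  );
  const message = found ?? `Request failed (${statusCode})`;

  const lower = message.toLowerCase();
  const keywords = ["no credentials", "invalid api key", "unauthorized", "authentication"];
  const isCredentials =
    statusCode === 401 ||
    statusCode === 403 ||
    keywords.some((k) => lower.includes(k));

  return { message, isCredentials };
}

console.log(classifyApiError({ error: { message: "Invalid API key" } }, 400));
console.log(classifyApiError({}, 500));
```

Checking the status code separately from the message type is a design choice of this sketch: a 401 or 403 is still flagged as a credentials problem when the body carries no usable string.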
/** Render image result thumbnails */
function ImageResults({ data }: { data: any }) {
const images: Array<{ url?: string; b64_json?: string; revised_prompt?: string }> =
data?.data || [];
if (images.length === 0) {
return (
<p className="text-sm text-text-muted italic">
No images returned. The provider might have accepted the request but returned empty data.
</p>
);
}
return (
<div className="grid grid-cols-1 sm:grid-cols-2 gap-3">
{images.map((img, i) => {
const src = img.url || (img.b64_json ? `data:image/png;base64,${img.b64_json}` : null);
if (!src) return null;
return (
<div
key={i}
className="relative group rounded-lg overflow-hidden border border-black/10 dark:border-white/10"
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={src}
alt={img.revised_prompt || `Generated image ${i + 1}`}
className="w-full"
/>
<a
href={src}
download={`image-${i + 1}.png`}
className="absolute bottom-2 right-2 bg-black/60 text-white text-xs px-2 py-1 rounded opacity-0 group-hover:opacity-100 transition-opacity flex items-center gap-1"
>
<span className="material-symbols-outlined text-[13px]">download</span>
Save
</a>
{img.revised_prompt && (
<p
className="text-[11px] text-text-muted px-2 py-1 bg-surface/80 truncate"
title={img.revised_prompt}
>
{img.revised_prompt}
</p>
)}
</div>
);
})}
</div>
);
}
export default function MediaPageClient() {
const t = useTranslations("media");
const [activeTab, setActiveTab] = useState<Modality>("image");
@@ -330,6 +408,7 @@ export default function MediaPageClient() {
const [loading, setLoading] = useState(false);
const [result, setResult] = useState<GenerationResult | null>(null);
const [error, setError] = useState<string | null>(null);
const [isCredentialsError, setIsCredentialsError] = useState(false);
// Speech-specific
const [speechVoice, setSpeechVoice] = useState("alloy");
@@ -346,6 +425,7 @@ export default function MediaPageClient() {
setPrompt("");
setResult(null);
setError(null);
setIsCredentialsError(false);
setAudioFile(null);
// Pick first provider and first model automatically
const providers = PROVIDER_MODELS[tab] ?? [];
@@ -369,9 +449,9 @@ export default function MediaPageClient() {
};
// Initialize on mount — pick first provider/model for image tab
const [initialized, setInitialized] = useState(false);
if (!initialized) {
setInitialized(true);
const initialized = useRef(false);
if (!initialized.current) {
initialized.current = true;
const providers = PROVIDER_MODELS["image"] ?? [];
const firstProvider = providers[0];
setSelectedProvider(firstProvider?.id ?? "");
@@ -381,6 +461,7 @@ export default function MediaPageClient() {
const handleGenerate = async () => {
setLoading(true);
setError(null);
setIsCredentialsError(false);
setResult(null);
try {
@@ -404,8 +485,10 @@ export default function MediaPageClient() {
}),
});
if (!res.ok) {
const e = await res.json().catch(() => ({}));
throw new Error(e?.error?.message || `TTS failed (${res.status})`);
const raw = await res.json().catch(() => ({}));
const { message, isCredentials } = parseApiError(raw, res.status);
setIsCredentialsError(isCredentials);
throw new Error(message);
}
const blob = await res.blob();
const audioUrl = URL.createObjectURL(blob);
@@ -430,10 +513,21 @@ export default function MediaPageClient() {
form.append("model", modelId);
const res = await fetch(config.endpoint, { method: "POST", body: form });
if (!res.ok) {
const e = await res.json().catch(() => ({}));
throw new Error(e?.error?.message || `Transcription failed (${res.status})`);
const raw = await res.json().catch(() => ({}));
const { message, isCredentials } = parseApiError(raw, res.status);
setIsCredentialsError(isCredentials);
throw new Error(message);
}
const data = await res.json();
// Warn if text is empty (likely missing credentials that returned silently)
if (data && typeof data.text === "string" && data.text.trim() === "") {
setError(
`Transcription returned empty text. Make sure you have a valid API key for "${selectedProvider}" configured in /dashboard/providers.`
);
setIsCredentialsError(true);
setLoading(false);
return;
}
setResult({ type: "transcription", data, timestamp: Date.now() });
setLoading(false);
return;
@@ -454,8 +548,10 @@ export default function MediaPageClient() {
}),
});
if (!res.ok) {
const e = await res.json().catch(() => ({}));
throw new Error(e?.error?.message || `Generation failed (${res.status})`);
const raw = await res.json().catch(() => ({}));
const { message, isCredentials } = parseApiError(raw, res.status);
setIsCredentialsError(isCredentials);
throw new Error(message);
}
const data = await res.json();
setResult({ type: activeTab, data, timestamp: Date.now() });
@@ -535,6 +631,20 @@ export default function MediaPageClient() {
</div>
</div>
{/* Credential hint */}
{selectedProvider && !["sdwebui", "comfyui", "qwen"].includes(selectedProvider) && (
<p className="text-xs text-text-muted flex items-center gap-1.5">
<span className="material-symbols-outlined text-[14px] text-amber-500">info</span>
Requires <strong className="capitalize">{selectedProvider}</strong> API key in{" "}
<Link
href="/dashboard/providers"
className="text-primary underline underline-offset-2 hover:text-primary/80"
>
Providers
</Link>
</p>
)}
{/* Speech: voice + format */}
{activeTab === "speech" && (
<div className="grid grid-cols-2 gap-4">
@@ -643,11 +753,30 @@ export default function MediaPageClient() {
{/* Error */}
{error && (
<div className="bg-red-500/10 border border-red-500/20 rounded-xl p-4 flex items-start gap-3">
<span className="material-symbols-outlined text-red-500 text-[20px] mt-0.5">error</span>
<div>
<p className="text-sm font-medium text-red-500">{t("error")}</p>
<p className="text-sm text-text-muted mt-1">{error}</p>
<div
className={`rounded-xl p-4 flex items-start gap-3 ${isCredentialsError ? "bg-amber-500/10 border border-amber-500/20" : "bg-red-500/10 border border-red-500/20"}`}
>
<span
className={`material-symbols-outlined text-[20px] mt-0.5 ${isCredentialsError ? "text-amber-500" : "text-red-500"}`}
>
{isCredentialsError ? "key" : "error"}
</span>
<div className="flex-1 min-w-0">
<p
className={`text-sm font-medium ${isCredentialsError ? "text-amber-500" : "text-red-500"}`}
>
{isCredentialsError ? "API Key Required" : t("error")}
</p>
<p className="text-sm text-text-muted mt-1 break-words">{error}</p>
{isCredentialsError && (
<Link
href="/dashboard/providers"
className="inline-flex items-center gap-1 mt-2 text-xs text-primary hover:underline"
>
<span className="material-symbols-outlined text-[13px]">open_in_new</span>
Configure API keys in Providers
</Link>
)}
</div>
</div>
)}
@@ -679,6 +808,26 @@ export default function MediaPageClient() {
Download {result.data?.format?.toUpperCase() || "MP3"}
</a>
</div>
) : result.type === "image" ? (
<ImageResults data={result.data} />
) : result.type === "transcription" ? (
<div className="space-y-3">
<div className="bg-surface rounded-lg p-4 text-sm text-text-main leading-relaxed whitespace-pre-wrap">
{result.data?.text || (
<span className="text-text-muted italic">No text returned</span>
)}
</div>
{result.data?.words && (
<details className="mt-2">
<summary className="text-xs text-text-muted cursor-pointer hover:text-text-main">
Word-level timestamps ({result.data.words.length} words)
</summary>
<pre className="bg-surface rounded mt-2 p-3 text-xs text-text-muted overflow-auto max-h-48 custom-scrollbar">
{JSON.stringify(result.data.words, null, 2)}
</pre>
</details>
)}
</div>
) : (
<pre className="bg-surface rounded-lg p-4 text-xs text-text-muted overflow-auto max-h-96 custom-scrollbar">
{JSON.stringify(result.data, null, 2)}
+300 -33
@@ -59,9 +59,8 @@ const DEFAULT_BODIES: Record<string, object> = {
response_format: "mp3",
},
transcription: {
// Note: /v1/audio/transcriptions requires multipart/form-data with a file.
// Use curl or the Media page to upload audio files.
model: "openai/whisper-1",
// Note: this endpoint requires multipart/form-data — use the file upload below
model: "deepgram/nova-3",
language: "en",
},
video: {
@@ -98,6 +97,78 @@ const ENDPOINT_PATHS: Record<string, string> = {
rerank: "/v1/rerank",
};
// Models known to support vision (image input)
const VISION_MODELS = [
"gpt-4o",
"gpt-4o-mini",
"gpt-4-turbo",
"gpt-4-vision",
"claude-3",
"claude-sonnet",
"claude-opus",
"claude-haiku",
"gemini",
"llava",
"bakllava",
"pixtral",
"qwen-vl",
"qvq",
"mistral-pixtral",
];
function isVisionModel(modelId: string): boolean {
const lower = modelId.toLowerCase();
return VISION_MODELS.some((k) => lower.includes(k));
}
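`isVisionModel` is a case-insensitive substring match against the keyword list, so it is a heuristic and not a capability lookup. A standalone copy for illustration, using the same keywords as the diff; the model ids passed in are examples only:

```typescript
// Same keyword list as VISION_MODELS in the diff.
const VISION_KEYWORDS: string[] = [
  "gpt-4o", "gpt-4o-mini", "gpt-4-turbo", "gpt-4-vision",
  "claude-3", "claude-sonnet", "claude-opus", "claude-haiku",
  "gemini", "llava", "bakllava", "pixtral", "qwen-vl", "qvq", "mistral-pixtral",
];

// Returns true when any keyword appears anywhere in the lowercased model id.
function isVisionModel(modelId: string): boolean {
  const lower = modelId.toLowerCase();
  return VISION_KEYWORDS.some((k) => lower.includes(k));
}

console.log(isVisionModel("openai/gpt-4o-mini"));
console.log(isVisionModel("groq/llama-3.3-70b-versatile"));
```

Because the match is by substring, any id containing `gemini` or `claude-3` passes, whether or not that particular model accepts image input.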
/** Convert a File to base64 data URI */
async function fileToBase64(file: File): Promise<string> {
return new Promise((resolve, reject) => {
const reader = new FileReader();
reader.onload = () => resolve(reader.result as string);
reader.onerror = reject;
reader.readAsDataURL(file);
});
}
/** Render image results from OpenAI-compatible format */
function ImageResultsInline({ data }: { data: any }) {
const images: Array<{ url?: string; b64_json?: string; revised_prompt?: string }> =
data?.data || [];
if (images.length === 0) return null;
return (
<div className="p-4 space-y-3">
<p className="text-xs text-text-muted font-medium uppercase tracking-wider">
{images.length} image{images.length > 1 ? "s" : ""} generated
</p>
<div className="grid grid-cols-1 sm:grid-cols-2 gap-3">
{images.map((img, i) => {
const src = img.url || (img.b64_json ? `data:image/png;base64,${img.b64_json}` : null);
if (!src) return null;
return (
<div key={i} className="relative group rounded-lg overflow-hidden border border-border">
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={src}
alt={img.revised_prompt || `Generated image ${i + 1}`}
className="w-full"
/>
<a
href={src}
download={`image-${i + 1}.png`}
className="absolute bottom-2 right-2 bg-black/60 text-white text-xs px-2 py-1 rounded opacity-0 group-hover:opacity-100 transition-opacity flex items-center gap-1"
>
<span className="material-symbols-outlined text-[13px]">download</span>
Save
</a>
</div>
);
})}
</div>
</div>
);
}
export default function PlaygroundPage() {
const [models, setModels] = useState<ModelInfo[]>([]);
const [providers, setProviders] = useState<ProviderOption[]>([]);
@@ -107,11 +178,22 @@ export default function PlaygroundPage() {
const [requestBody, setRequestBody] = useState("");
const [responseBody, setResponseBody] = useState("");
const [audioUrl, setAudioUrl] = useState<string | null>(null);
const [imageData, setImageData] = useState<any>(null);
const [transcriptionText, setTranscriptionText] = useState<string | null>(null);
const [loading, setLoading] = useState(false);
const [responseStatus, setResponseStatus] = useState<number | null>(null);
const [responseDuration, setResponseDuration] = useState<number | null>(null);
const abortRef = useRef<AbortController | null>(null);
// File upload state
const [uploadedFile, setUploadedFile] = useState<File | null>(null);
const [uploadedImages, setUploadedImages] = useState<string[]>([]); // base64 URIs for vision
const isTranscriptionEndpoint = selectedEndpoint === "transcription";
const isChatEndpoint = selectedEndpoint === "chat";
const isImageEndpoint = selectedEndpoint === "images";
const supportsVision = isChatEndpoint && isVisionModel(selectedModel);
// Fetch models
useEffect(() => {
fetch("/v1/models")
@@ -120,7 +202,6 @@ export default function PlaygroundPage() {
const modelList = (data?.data || []) as ModelInfo[];
setModels(modelList);
// Extract unique providers from model ids (provider/model format)
const providerSet = new Set<string>();
modelList.forEach((m) => {
const parts = m.id.split("/");
@@ -135,12 +216,10 @@ export default function PlaygroundPage() {
.catch(() => {});
}, []);
// Filter models by selected provider
const filteredModels = models
.filter((m) => !selectedProvider || m.id.startsWith(selectedProvider + "/"))
.map((m) => ({ value: m.id, label: m.id }));
// Helper to generate default body for a given endpoint and model
const generateDefaultBody = (endpoint: string, model: string) => {
const template = { ...DEFAULT_BODIES[endpoint] };
if ("model" in template) {
@@ -149,7 +228,6 @@ export default function PlaygroundPage() {
return JSON.stringify(template, null, 2);
};
// When provider changes, auto-select first model and reset body
const handleProviderChange = (newProvider: string) => {
setSelectedProvider(newProvider);
const providerModels = models
@@ -158,63 +236,122 @@ export default function PlaygroundPage() {
const firstModel = providerModels[0] || "";
setSelectedModel(firstModel);
setRequestBody(generateDefaultBody(selectedEndpoint, firstModel));
setResponseBody("");
setResponseStatus(null);
setResponseDuration(null);
clearResults();
};
// When model changes, update body
const handleModelChange = (newModel: string) => {
setSelectedModel(newModel);
setRequestBody(generateDefaultBody(selectedEndpoint, newModel));
setResponseBody("");
setResponseStatus(null);
setResponseDuration(null);
clearResults();
};
// When endpoint changes, update body
const handleEndpointChange = (newEndpoint: string) => {
setSelectedEndpoint(newEndpoint);
setRequestBody(generateDefaultBody(newEndpoint, selectedModel));
setResponseBody("");
setResponseStatus(null);
setResponseDuration(null);
setUploadedFile(null);
setUploadedImages([]);
clearResults();
};
const handleSend = useCallback(async () => {
if (!requestBody.trim()) return;
setLoading(true);
const clearResults = () => {
setResponseBody("");
setAudioUrl(null);
setResponseStatus(null);
setResponseDuration(null);
setAudioUrl(null);
setImageData(null);
setTranscriptionText(null);
};
/** Handle audio file select for transcription endpoint */
const handleAudioFileChange = (e: React.ChangeEvent<HTMLInputElement>) => {
const file = e.target.files?.[0] ?? null;
setUploadedFile(file);
};
/** Handle image file select for vision models */
const handleImageFileChange = async (e: React.ChangeEvent<HTMLInputElement>) => {
const files = Array.from(e.target.files || []);
const base64s = await Promise.all(files.map(fileToBase64));
setUploadedImages((prev) => [...prev, ...base64s].slice(0, 4)); // max 4 images
};
/** Inject uploaded images into chat messages body */
const buildChatBodyWithImages = (parsed: any, imageBase64s: string[]): any => {
if (!imageBase64s.length) return parsed;
const messages = [...(parsed.messages || [])];
if (messages.length === 0) return parsed;
const lastMsg = messages[messages.length - 1];
const currentContent = typeof lastMsg.content === "string" ? lastMsg.content : "";
messages[messages.length - 1] = {
...lastMsg,
content: [
{ type: "text", text: currentContent },
...imageBase64s.map((b64) => ({
type: "image_url",
image_url: { url: b64 },
})),
],
};
return { ...parsed, messages };
};
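The shape produced by `buildChatBodyWithImages` can be checked in isolation. A typed sketch under assumptions: `ContentPart`, `ChatMessage`, `ChatBody`, and `attachImages` are hypothetical names, and the body is assumed to follow the OpenAI-style `messages` layout used elsewhere in this file.

```typescript
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };
type ChatMessage = { role: string; content: string | ContentPart[] };
type ChatBody = { model: string; messages: ChatMessage[] };

// Rewrites only the last message: its text becomes the first content part,
// followed by one image_url part per data URI. The input is not mutated.
function attachImages(body: ChatBody, dataUris: string[]): ChatBody {
  if (dataUris.length === 0 || body.messages.length === 0) return body;
  const messages = [...body.messages];
  const last = messages[messages.length - 1];
  if (!last) return body;

  // As in the diff, non-string content is replaced by an empty text part.
  const text = typeof last.content === "string" ? last.content : "";
  const parts: ContentPart[] = [
    { type: "text", text },
    ...dataUris.map((url): ContentPart => ({ type: "image_url", image_url: { url } })),
  ];
  messages[messages.length - 1] = { ...last, content: parts };
  return { ...body, messages };
}

const example = attachImages(
  { model: "openai/gpt-4o", messages: [{ role: "user", content: "What is this?" }] },
  ["data:image/png;base64,AAAA"]
);
console.log(JSON.stringify(example, null, 2));
```

Note that when the last message already carries an array of content parts, both this sketch and the function above discard them and start from an empty text part.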
const handleSend = async () => {
if (!requestBody.trim() && !isTranscriptionEndpoint) return;
setLoading(true);
clearResults();
const controller = new AbortController();
abortRef.current = controller;
const startTime = Date.now();
try {
const parsed = JSON.parse(requestBody);
const path = ENDPOINT_PATHS[selectedEndpoint];
const res = await fetch(path, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(parsed),
signal: controller.signal,
});
let res: Response;
if (isTranscriptionEndpoint) {
// Multipart form-data for transcription
const form = new FormData();
if (uploadedFile) {
form.append("file", uploadedFile);
}
// Parse extra params from JSON editor
try {
const extra = JSON.parse(requestBody || "{}");
for (const [k, v] of Object.entries(extra)) {
if (k !== "file") form.append(k, String(v));
}
} catch {
/* ignore parse errors */
}
res = await fetch(`/api${path}`, {
method: "POST",
body: form,
signal: controller.signal,
});
} else {
let parsed = JSON.parse(requestBody);
// Inject vision images if available
if (supportsVision && uploadedImages.length > 0) {
parsed = buildChatBodyWithImages(parsed, uploadedImages);
}
res = await fetch(`/api${path}`, {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(parsed),
signal: controller.signal,
});
}
setResponseStatus(res.status);
setResponseDuration(Date.now() - startTime);
const contentType = res.headers.get("content-type") || "";
if (contentType.startsWith("audio/")) {
// TTS binary response — create a Blob URL and show inline audio player
const blob = await res.blob();
const url = URL.createObjectURL(blob);
setAudioUrl(url);
setResponseBody(`// Audio response (${contentType})\n// Click play below to listen.`);
} else if (contentType.includes("text/event-stream")) {
// Handle streaming
const reader = res.body?.getReader();
const decoder = new TextDecoder();
let accumulated = "";
@@ -229,6 +366,14 @@ export default function PlaygroundPage() {
} else {
const data = await res.json();
setResponseBody(JSON.stringify(data, null, 2));
// Detect image generation result → render inline
if (isImageEndpoint && data?.data && Array.isArray(data.data) && res.ok) {
setImageData(data);
}
// Detect transcription result → render plain text
if (isTranscriptionEndpoint && typeof data?.text === "string") {
setTranscriptionText(data.text || "(empty result — check provider credentials)");
}
}
} catch (err: any) {
if (err.name === "AbortError") {
@@ -239,7 +384,7 @@ export default function PlaygroundPage() {
setResponseDuration(Date.now() - startTime);
}
setLoading(false);
}, [requestBody, selectedEndpoint]);
};
const handleCancel = () => {
if (abortRef.current) {
@@ -323,7 +468,10 @@ export default function PlaygroundPage() {
<Button
icon="send"
onClick={handleSend}
disabled={!requestBody.trim() || !selectedModel}
disabled={
(!requestBody.trim() && !isTranscriptionEndpoint) ||
(!selectedModel && !isTranscriptionEndpoint)
}
>
Send
</Button>
@@ -332,6 +480,98 @@ export default function PlaygroundPage() {
</div>
</Card>
{/* File Upload Zone — shown for transcription and vision models */}
{(isTranscriptionEndpoint || supportsVision) && (
<Card>
<div className="p-4 space-y-3">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-[18px] text-text-muted">
attach_file
</span>
<h3 className="text-sm font-semibold text-text-main">
{isTranscriptionEndpoint ? "Audio File" : "Attach Images (Vision)"}
</h3>
{isTranscriptionEndpoint && (
<Badge variant="info" size="sm">
multipart/form-data
</Badge>
)}
{supportsVision && (
<Badge variant="info" size="sm">
up to 4 images
</Badge>
)}
</div>
{isTranscriptionEndpoint && (
<div>
<input
type="file"
accept="audio/*,video/*"
onChange={handleAudioFileChange}
className="w-full px-3 py-2 rounded-lg bg-surface border border-border text-text-main text-sm focus:outline-none focus:ring-2 focus:ring-primary/30 file:mr-3 file:py-1 file:px-3 file:rounded file:border-0 file:bg-primary/10 file:text-primary file:text-sm"
/>
{uploadedFile && (
<p className="text-xs text-text-muted mt-1 flex items-center gap-1">
<span className="material-symbols-outlined text-[12px] text-green-500">
check_circle
</span>
{uploadedFile.name} ({(uploadedFile.size / 1024).toFixed(0)} KB)
</p>
)}
{!uploadedFile && (
<p className="text-xs text-amber-500 mt-1 flex items-center gap-1">
<span className="material-symbols-outlined text-[12px]">info</span>
Select an audio file to transcribe (mp3, wav, m4a, ogg, flac)
</p>
)}
</div>
)}
{supportsVision && (
<div>
<input
type="file"
accept="image/*"
multiple
onChange={handleImageFileChange}
className="w-full px-3 py-2 rounded-lg bg-surface border border-border text-text-main text-sm focus:outline-none focus:ring-2 focus:ring-primary/30 file:mr-3 file:py-1 file:px-3 file:rounded file:border-0 file:bg-primary/10 file:text-primary file:text-sm"
/>
{uploadedImages.length > 0 && (
<div className="flex gap-2 mt-2 flex-wrap">
{uploadedImages.map((src, i) => (
<div
key={i}
className="relative group size-16 rounded overflow-hidden border border-border"
>
{/* eslint-disable-next-line @next/next/no-img-element */}
<img
src={src}
alt={`Attached ${i + 1}`}
className="w-full h-full object-cover"
/>
<button
onClick={() =>
setUploadedImages((prev) => prev.filter((_, idx) => idx !== i))
}
className="absolute inset-0 bg-black/50 text-white opacity-0 group-hover:opacity-100 transition-opacity flex items-center justify-center"
>
<span className="material-symbols-outlined text-[16px]">close</span>
</button>
</div>
))}
<button
onClick={() => setUploadedImages([])}
className="text-xs text-text-muted hover:text-red-500 self-center ml-1"
>
Clear all
</button>
</div>
)}
</div>
)}
</div>
</Card>
)}
{/* Split Editor View */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-4">
{/* Request Panel */}
@@ -368,6 +608,15 @@ export default function PlaygroundPage() {
</button>
</div>
</div>
{isTranscriptionEndpoint && (
<p className="text-xs text-text-muted bg-amber-500/10 border border-amber-500/20 rounded px-2 py-1.5 flex items-start gap-1">
<span className="material-symbols-outlined text-[12px] text-amber-500 mt-0.5">
info
</span>
Transcription uses multipart/form-data. Upload the audio file above. The JSON below
controls extra params (model, language).
</p>
)}
<div className="border border-border rounded-lg overflow-hidden">
<Editor
height="400px"
@@ -438,6 +687,24 @@ export default function PlaygroundPage() {
Download audio
</a>
</div>
) : imageData ? (
<ImageResultsInline data={imageData} />
) : transcriptionText !== null ? (
<div className="p-4 space-y-2">
<p className="text-xs text-text-muted font-medium uppercase tracking-wider">
Transcription
</p>
<div className="bg-surface/50 rounded p-3 text-sm text-text-main leading-relaxed whitespace-pre-wrap">
{transcriptionText}
</div>
<button
onClick={() => handleCopy(transcriptionText)}
className="text-xs text-primary hover:underline flex items-center gap-1"
>
<span className="material-symbols-outlined text-[12px]">content_copy</span>
Copy text
</button>
</div>
) : (
<Editor
height="400px"
@@ -329,6 +329,29 @@ export default function ProviderDetailPage() {
}
};
// T12: Manual token refresh
const [refreshingId, setRefreshingId] = useState<string | null>(null);
const notify = useNotificationStore();
const handleRefreshToken = async (connectionId: string) => {
if (refreshingId) return;
setRefreshingId(connectionId);
try {
const res = await fetch(`/api/providers/${connectionId}/refresh`, { method: "POST" });
const data = await res.json().catch(() => ({}));
if (res.ok && data.success) {
notify.success(t("tokenRefreshed"));
await fetchConnections();
} else {
notify.error(data.error || t("tokenRefreshFailed"));
}
} catch (error) {
console.error("Error refreshing token:", error);
notify.error(t("tokenRefreshFailed"));
} finally {
setRefreshingId(null);
}
};
const handleSwapPriority = async (conn1, conn2) => {
if (!conn1 || !conn2) return;
try {
@@ -926,6 +949,8 @@ export default function ProviderDetailPage() {
}}
onDelete={() => handleDelete(conn.id)}
onReauth={isOAuth ? () => setShowOAuthModal(true) : undefined}
onRefreshToken={isOAuth ? () => handleRefreshToken(conn.id) : undefined}
isRefreshing={refreshingId === conn.id}
onProxy={() =>
setProxyTarget({
level: "key",
@@ -2165,6 +2190,8 @@ function ConnectionRow({
hasProxy,
proxySource,
proxyHost,
onRefreshToken,
isRefreshing,
}) {
const t = useTranslations("providers");
const displayName = isOAuth
@@ -2173,6 +2200,24 @@ function ConnectionRow({
// Use useState + useEffect for impure Date.now() to avoid calling during render
const [isCooldown, setIsCooldown] = useState(false);
// T12: token expiry status — lazy init avoids calling Date.now() during render;
// updates every 30s via interval only (no sync setState in effect body).
const getTokenMinsLeft = () => {
if (!isOAuth || !connection.expiresAt) return null;
const expiresMs = new Date(connection.expiresAt).getTime();
return Math.floor((expiresMs - Date.now()) / 60000);
};
const [tokenMinsLeft, setTokenMinsLeft] = useState<number | null>(getTokenMinsLeft);
useEffect(() => {
if (!isOAuth || !connection.expiresAt) return;
const update = () => {
const expiresMs = new Date(connection.expiresAt).getTime();
setTokenMinsLeft(Math.floor((expiresMs - Date.now()) / 60000));
};
const iv = setInterval(update, 30000);
return () => clearInterval(iv);
}, [isOAuth, connection.expiresAt]);
useEffect(() => {
const checkCooldown = () => {
@@ -2229,6 +2274,25 @@ function ConnectionRow({
<Badge variant={statusPresentation.statusVariant as any} size="sm" dot>
{statusPresentation.statusLabel}
</Badge>
{/* T12: Token expiry status indicator (state-driven, no Date.now in render) */}
{tokenMinsLeft !== null &&
(tokenMinsLeft < 0 ? (
<span
className="inline-flex items-center gap-0.5 px-1.5 py-0.5 rounded text-xs font-medium bg-red-500/15 text-red-500"
title={`Token expired: ${connection.expiresAt}`}
>
<span className="material-symbols-outlined text-[11px]">error</span>
expired
</span>
) : tokenMinsLeft < 30 ? (
<span
className="inline-flex items-center gap-0.5 px-1.5 py-0.5 rounded text-xs font-medium bg-amber-500/15 text-amber-500"
title={`Token expires in ${tokenMinsLeft}m`}
>
<span className="material-symbols-outlined text-[11px]">warning</span>
{`~${tokenMinsLeft}m`}
</span>
) : null)}
{isCooldown && connection.isActive !== false && (
<CooldownTimer until={connection.rateLimitedUntil} />
)}
@@ -2313,6 +2377,21 @@ function ConnectionRow({
>
{t("retest")}
</Button>
{/* T12: Manual token refresh for OAuth accounts */}
{onRefreshToken && (
<Button
size="sm"
variant="ghost"
icon="token"
loading={isRefreshing}
disabled={connection.isActive === false || isRefreshing}
onClick={onRefreshToken}
className="!h-7 !px-2 text-xs text-amber-500 hover:text-amber-400"
title="Refresh OAuth token manually"
>
Token
</Button>
)}
<Toggle
size="sm"
checked={connection.isActive ?? true}
@@ -2332,6 +2411,7 @@ function ConnectionRow({
<button
onClick={onEdit}
className="p-2 hover:bg-black/5 dark:hover:bg-white/5 rounded text-text-muted hover:text-primary"
title={t("edit")}
>
<span className="material-symbols-outlined text-[18px]">edit</span>
</button>
@@ -2342,7 +2422,11 @@ function ConnectionRow({
>
<span className="material-symbols-outlined text-[18px]">vpn_lock</span>
</button>
<button onClick={onDelete} className="p-2 hover:bg-red-500/10 rounded text-red-500">
<button
onClick={onDelete}
className="p-2 hover:bg-red-500/10 rounded text-red-500"
title={t("deleteConnection")}
>
<span className="material-symbols-outlined text-[18px]">delete</span>
</button>
</div>
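The expiry badge in `ConnectionRow` reduces to a small pure classification. A minimal sketch, assuming a hypothetical helper name `classifyTokenExpiry` (it does not appear in the diff) and the same thresholds the component uses: negative minutes means expired, under 30 minutes means a warning. Unlike the component, this sketch also guards against an unparseable timestamp.

```typescript
// Hypothetical helper (not part of the diff); mirrors the ConnectionRow thresholds.
type TokenStatus = "expired" | "expiring" | "ok";

function minutesLeft(expiresAt: string, nowMs: number): number | null {
  const expiresMs = new Date(expiresAt).getTime();
  if (Number.isNaN(expiresMs)) return null; // unparseable timestamp
  return Math.floor((expiresMs - nowMs) / 60000);
}

function classifyTokenExpiry(expiresAt: string | null, nowMs: number): TokenStatus | null {
  if (!expiresAt) return null; // nothing to show for connections without an expiry
  const mins = minutesLeft(expiresAt, nowMs);
  if (mins === null) return null;
  if (mins < 0) return "expired"; // red badge in the component
  if (mins < 30) return "expiring"; // amber badge in the component
  return "ok"; // the component renders nothing in this case
}
```

Passing `nowMs` in explicitly keeps the function pure, which is the same concern the component addresses by keeping `Date.now()` out of the render path.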
@@ -0,0 +1,79 @@
import { NextResponse } from "next/server";
import { getProviderConnectionById } from "@/models";
import { getAccessToken, updateProviderCredentials } from "@/sse/services/tokenRefresh";
/**
* POST /api/providers/[id]/refresh
* Manually trigger an OAuth token refresh for a provider connection.
* Useful when the dashboard shows a stale/expired token and the user
* doesn't want to wait for the next auto-refresh cycle.
*
* T12: Manual Token Refresh UI
*/
export async function POST(_request: Request, { params }: { params: Promise<{ id: string }> }) {
try {
const { id } = await params;
const connection = await getProviderConnectionById(id);
if (!connection) {
return NextResponse.json({ error: "Connection not found" }, { status: 404 });
}
if (connection.authType !== "oauth") {
return NextResponse.json(
{ error: "Only OAuth connections support manual token refresh" },
{ status: 400 }
);
}
if (!connection.refreshToken && !connection.accessToken) {
return NextResponse.json(
{ error: "No token credentials available for refresh" },
{ status: 422 }
);
}
const provider = connection.provider as string;
const credentials = {
connectionId: id,
accessToken: connection.accessToken,
refreshToken: connection.refreshToken,
expiresAt: connection.expiresAt,
expiresIn: connection.expiresIn,
idToken: connection.idToken,
providerSpecificData: connection.providerSpecificData,
};
// Use the existing getAccessToken helper which knows how to refresh
// tokens for each provider type (Claude, GitHub, Gemini, etc.)
const newCredentials = await getAccessToken(provider, credentials);
if (!newCredentials?.accessToken) {
return NextResponse.json(
{ error: "Token refresh failed — provider returned no new token" },
{ status: 502 }
);
}
// Persist new credentials to DB
await updateProviderCredentials(id, newCredentials);
const expiresAt = newCredentials.expiresIn
? new Date(Date.now() + newCredentials.expiresIn * 1000).toISOString()
: null;
return NextResponse.json({
success: true,
connectionId: id,
provider,
expiresAt,
refreshedAt: new Date().toISOString(),
});
} catch (error) {
console.error("[T12] Token refresh failed:", error);
return NextResponse.json(
{ error: "Token refresh failed", details: (error as Error).message },
{ status: 500 }
);
}
}
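A client could call this route as follows. This is a sketch under assumptions: `requestTokenRefresh`, `FetchLike`, and `MinimalResponse` are hypothetical names introduced for illustration, and the fetch function is injected so the example does not depend on a browser. The endpoint path and the `success`, `expiresAt`, and `error` fields are taken from the route above.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface MinimalResponse {
  ok: boolean;
  json(): Promise<unknown>;
}
type FetchLike = (url: string, init: { method: string }) => Promise<MinimalResponse>;

interface RefreshResult {
  ok: boolean;
  expiresAt: string | null;
  error: string | null;
}

// Hypothetical client helper (not part of the diff).
async function requestTokenRefresh(
  fetchImpl: FetchLike,
  connectionId: string
): Promise<RefreshResult> {
  const url = `/api/providers/${encodeURIComponent(connectionId)}/refresh`;
  const res = await fetchImpl(url, { method: "POST" });
  // A non-JSON error body is treated as an empty object, as the page component does.
  const data = (await res.json().catch(() => ({}))) as Record<string, unknown>;
  if (res.ok && data.success === true) {
    const expiresAt = typeof data.expiresAt === "string" ? data.expiresAt : null;
    return { ok: true, expiresAt, error: null };
  }
  const error = typeof data.error === "string" ? data.error : "Token refresh failed";
  return { ok: false, expiresAt: null, error };
}
```

In a browser, the global `fetch` can be passed as `fetchImpl`; in tests, a stub returning a fixed body is enough.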
@@ -0,0 +1,90 @@
import { NextResponse } from "next/server";
import {
getTaskRoutingConfig,
setTaskRoutingConfig,
resetTaskRoutingStats,
getDefaultTaskModelMap,
} from "@omniroute/open-sse/services/taskAwareRouter.ts";
import { updateSettings } from "@/lib/db/settings";
/**
* GET /api/settings/task-routing
* Returns the current task-aware routing configuration.
*/
export async function GET() {
try {
return NextResponse.json({
...getTaskRoutingConfig(),
defaultTaskModelMap: getDefaultTaskModelMap(),
});
} catch (error) {
console.error("[API ERROR] /api/settings/task-routing GET:", error);
return NextResponse.json({ error: "Failed to get config" }, { status: 500 });
}
}
/**
* PUT /api/settings/task-routing
* Update the task-aware routing configuration.
* Body: { enabled?: boolean, taskModelMap?: { coding?: "...", ... }, detectionEnabled?: boolean }
*/
export async function PUT(request: Request) {
let rawBody: Record<string, unknown>;
try {
rawBody = await request.json();
} catch {
return NextResponse.json({ error: { message: "Invalid JSON body" } }, { status: 400 });
}
try {
setTaskRoutingConfig(rawBody as any);
// Persist to database (excluding stats)
const { stats, ...persistable } = getTaskRoutingConfig();
await updateSettings({ taskRouting: JSON.stringify(persistable) });
return NextResponse.json({ success: true, ...getTaskRoutingConfig() });
} catch (error) {
console.error("[API ERROR] /api/settings/task-routing PUT:", error);
return NextResponse.json({ error: "Failed to update config" }, { status: 500 });
}
}
/**
* POST /api/settings/task-routing
* Actions: { action: "reset-stats" | "detect" }
* For "detect": pass { action: "detect", body: <request-body> } to test detection
*/
export async function POST(request: Request) {
let rawBody: any;
try {
rawBody = await request.json();
} catch {
return NextResponse.json({ error: { message: "Invalid JSON body" } }, { status: 400 });
}
try {
if (rawBody.action === "reset-stats") {
resetTaskRoutingStats();
return NextResponse.json({
success: true,
stats: getTaskRoutingConfig().stats,
});
}
if (rawBody.action === "detect") {
const { detectTaskType } = await import("@omniroute/open-sse/services/taskAwareRouter.ts");
const taskType = detectTaskType(rawBody.body || {});
const config = getTaskRoutingConfig();
return NextResponse.json({
taskType,
preferredModel: config.taskModelMap[taskType] || "(no override)",
});
}
return NextResponse.json({ error: "Unknown action" }, { status: 400 });
} catch (error) {
console.error("[API ERROR] /api/settings/task-routing POST:", error);
return NextResponse.json({ error: "Failed to execute action" }, { status: 500 });
}
}
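The PUT handler persists the configuration without its runtime `stats`. A minimal sketch of that pattern, assuming a hypothetical `TaskRoutingConfig` shape: only the field names `enabled`, `taskModelMap`, `detectionEnabled`, and `stats` come from the diff, and the type of `stats` is a guess.

```typescript
// Hypothetical config shape; the real type lives in taskAwareRouter.ts and may differ.
interface TaskRoutingConfig {
  enabled: boolean;
  detectionEnabled: boolean;
  taskModelMap: Record<string, string>;
  stats: Record<string, number>; // assumed shape, for illustration only
}

function toPersistable(config: TaskRoutingConfig): string {
  // Rest destructuring drops `stats`, so runtime counters never reach the settings store.
  const { stats: _stats, ...persistable } = config;
  return JSON.stringify(persistable);
}
```

Keeping counters out of the persisted value means a restart resets statistics but keeps the routing rules.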
+3 -2
@@ -1,6 +1,7 @@
"use client";
import { useState } from "react";
import { useTranslations } from "next-intl";
import { copyToClipboard } from "@/shared/utils/clipboard";
export default function GetStarted() {
const t = useTranslations("landing");
@@ -10,8 +11,8 @@ export default function GetStarted() {
const dashboardUrl = `${endpoint}/dashboard`;
const command = "npx omniroute";
const handleCopy = (text: string) => {
navigator.clipboard.writeText(text);
const handleCopy = async (text: string) => {
await copyToClipboard(text);
setCopied(true);
setTimeout(() => setCopied(false), 2000);
};
+39 -5
@@ -32,6 +32,7 @@ interface QuotaCacheEntry {
fetchedAt: number;
exhausted: boolean;
nextResetAt: string | null;
windowDurationMs?: number | null; // T08: optional rolling window duration
}
// ─── Constants ──────────────────────────────────────────────────────────────
@@ -56,6 +57,33 @@ function isExhausted(quotas: Record<string, QuotaInfo>): boolean {
return entries.every((q) => q.remainingPercentage <= 0);
}
/**
* T08: Auto-advance quota window.
* If we know the window duration, advance past the expired window(s) to
* avoid blocking requests when the quota reset already happened but the
* background refresh hasn't run yet.
*/
function advancedWindowResetAt(entry: QuotaCacheEntry, now: number): { exhausted: false } | null {
if (!entry.nextResetAt) return null;
const resetMs = parseDate(entry.nextResetAt);
if (resetMs === null) return null;
// If the window's resetAt is in the past, the quota has been renewed.
// Eagerly mark as available so requests don't wait for the 5-min TTL.
if (resetMs <= now) {
return { exhausted: false };
}
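// If we know the window duration, check if the *next* window also passed.
// NOTE: resetMs > now at this point (the branch above already returned), so
// `elapsed` is always negative and this check never fires as written.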
if (entry.windowDurationMs && entry.windowDurationMs > 0) {
const elapsed = now - resetMs;
if (elapsed >= 0) return { exhausted: false };
}
return null;
}
function parseDate(value: string): number | null {
const ms = new Date(value).getTime();
return Number.isNaN(ms) ? null : ms;
@@ -128,14 +156,20 @@ export function isAccountQuotaExhausted(connectionId: string): boolean {
if (!entry) return false;
if (!entry.exhausted) return false;
// If resetAt has passed, assume available until refresh confirms
if (entry.nextResetAt) {
const resetMs = parseDate(entry.nextResetAt);
if (resetMs !== null && resetMs <= Date.now()) return false;
const now = Date.now();
// T08 — Auto window advance: if resetAt is in the past, eagerly treat as not exhausted.
// This prevents stale exhaustion blocking when background refresh hasn't run yet.
const advanced = advancedWindowResetAt(entry, now);
if (advanced) {
// Optimistically clear the exhausted flag so we unblock requests immediately.
// The next background refresh will update with the real quota state.
entry.exhausted = false;
return false;
}
// Exhausted entries without resetAt expire after fixed TTL
const age = Date.now() - entry.fetchedAt;
const age = now - entry.fetchedAt;
if (!entry.nextResetAt && age > EXHAUSTED_TTL_MS) return false;
return true;
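The T08 helper only reports that the reset time has passed; it does not compute where the next window ends. If the cache wanted to roll `nextResetAt` forward using `windowDurationMs`, the arithmetic could look like the sketch below. `nextResetAfter` is a hypothetical name, and nothing in the diff confirms that the project advances the timestamp this way.

```typescript
// Hypothetical helper (not part of the diff): roll an expired reset time
// forward by whole windows until it lies strictly in the future.
function nextResetAfter(resetMs: number, windowMs: number, nowMs: number): number {
  // Nothing to advance if the window length is unknown or the reset is still ahead.
  if (windowMs <= 0 || resetMs > nowMs) return resetMs;
  const windowsElapsed = Math.floor((nowMs - resetMs) / windowMs) + 1;
  return resetMs + windowsElapsed * windowMs;
}
```

With a reset at `t = 1000`, a window of `500`, and `now = 2200`, two full windows have passed and the result is the end of the third window after the original reset.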
+3 -1
@@ -1397,7 +1397,9 @@
"editCompatibleTitle": "Edit {type} Compatible",
"compatibleBaseUrlHint": "Use the base URL (ending in /v1) for your {type}-compatible API.",
"apiKeyForCheck": "API Key (for Check)",
"compatibleProdPlaceholder": "{type} Compatible (Prod)"
"compatibleProdPlaceholder": "{type} Compatible (Prod)",
"tokenRefreshed": "Token refreshed successfully",
"tokenRefreshFailed": "Token refresh failed"
},
"settings": {
"title": "Settings",
+4
@@ -6,6 +6,7 @@ import { v4 as uuidv4 } from "uuid";
import { getDbInstance, rowToCamel, cleanNulls } from "./core";
import { backupDbFile } from "./backup";
import { encryptConnectionFields, decryptConnectionFields } from "./encryption";
import { invalidateDbCache } from "./readCache";
type JsonRecord = Record<string, unknown>;
@@ -200,6 +201,7 @@ export async function createProviderConnection(data: JsonRecord) {
_reorderConnections(db, providerId);
}
backupDbFile("pre-write");
invalidateDbCache("connections"); // Bust connections read cache
return cleanNulls(connection);
}
@@ -344,6 +346,7 @@ export async function updateProviderConnection(id: string, data: JsonRecord) {
const merged = { ...rowToCamel(existing), ...data, updatedAt: new Date().toISOString() };
_updateConnectionRow(db, id, encryptConnectionFields({ ...merged }));
backupDbFile("pre-write");
invalidateDbCache("connections"); // Bust connections read cache
if (data.priority !== undefined) {
const existingRecord = toRecord(existing);
@@ -370,6 +373,7 @@ export async function deleteProviderConnection(id: string) {
: String(existingRecord.provider || "");
_reorderConnections(db, providerId);
backupDbFile("pre-write");
invalidateDbCache("connections"); // Bust connections read cache
return true;
}
+118
@@ -0,0 +1,118 @@
/**
* DB Read Cache: in-memory TTL cache for hot read paths.
*
* SQLite reads are already fast since better-sqlite3 is synchronous and
* memory-mapped. However, some functions (getSettings, getPricing,
* getProviderConnections) are called on every request by multiple callers.
* A short TTL cache (5s) eliminates redundant I/O without letting data go
* stale for long enough to matter (settings changes are applied within one
* cache cycle).
*
* Usage:
* import { getCachedSettings } from '@/lib/db/readCache';
* const settings = await getCachedSettings();
*/
type CacheEntry<T> = {
value: T;
expiresAt: number;
};
class TTLCache<T> {
private cache = new Map<string, CacheEntry<T>>();
private readonly ttlMs: number;
constructor(ttlMs: number) {
this.ttlMs = ttlMs;
}
get(key: string): T | undefined {
const entry = this.cache.get(key);
if (!entry) return undefined;
if (Date.now() > entry.expiresAt) {
this.cache.delete(key);
return undefined;
}
return entry.value;
}
set(key: string, value: T): void {
this.cache.set(key, { value, expiresAt: Date.now() + this.ttlMs });
}
invalidate(key?: string): void {
if (key) {
this.cache.delete(key);
} else {
this.cache.clear();
}
}
}
// Short TTLs (5s; 30s for pricing): short enough to pick up dashboard changes quickly,
// long enough to serve request bursts without hammering SQLite.
const SETTINGS_TTL_MS = 5_000;
const PRICING_TTL_MS = 30_000;
const CONNECTIONS_TTL_MS = 5_000;
const settingsCache = new TTLCache<Record<string, unknown>>(SETTINGS_TTL_MS);
const pricingCache = new TTLCache<Record<string, unknown>>(PRICING_TTL_MS);
const connectionsCache = new TTLCache<unknown[]>(CONNECTIONS_TTL_MS);
/**
* Cached wrapper for getSettings.
* Invalidated on every updateSettings() call.
*/
export async function getCachedSettings(): Promise<Record<string, unknown>> {
const cached = settingsCache.get("settings");
if (cached) return cached;
const { getSettings } = await import("@/lib/db/settings");
const value = await getSettings();
settingsCache.set("settings", value);
return value;
}
/**
* Cached wrapper for getPricing.
* Longer TTL since pricing rarely changes mid-session.
*/
export async function getCachedPricing(): Promise<Record<string, unknown>> {
const cached = pricingCache.get("pricing");
if (cached) return cached as Record<string, unknown>;
const { getPricing } = await import("@/lib/db/settings");
const value = await getPricing();
pricingCache.set("pricing", value);
return value;
}
/**
* Cached wrapper for getProviderConnections.
* Used in request hot-paths (usageStats, callLogs, usageHistory).
*/
export async function getCachedProviderConnections(
filter?: Record<string, unknown>
): Promise<unknown[]> {
// Only cache the unfiltered "all connections" query (most common)
if (filter && Object.keys(filter).length > 0) {
const { getProviderConnections } = await import("@/lib/db/providers");
return getProviderConnections(filter);
}
const cached = connectionsCache.get("all");
if (cached) return cached;
const { getProviderConnections } = await import("@/lib/db/providers");
const value = await getProviderConnections();
connectionsCache.set("all", value);
return value;
}
/**
* Invalidate the given cache scope, or all caches when no scope is passed
* (call after writes to settings, pricing, or connections).
*/
export function invalidateDbCache(scope?: "settings" | "pricing" | "connections"): void {
if (!scope || scope === "settings") settingsCache.invalidate();
if (!scope || scope === "pricing") pricingCache.invalidate();
if (!scope || scope === "connections") connectionsCache.invalidate();
}
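The read-through behavior of this module can be shown in isolation. The sketch below is a standalone approximation rather than the project's code: `SketchTTLCache` and `readThrough` are hypothetical names, and the clock is injectable (an addition not present in the diff) so that expiry can be demonstrated without waiting.

```typescript
// Standalone sketch; names are hypothetical and the injectable clock is an assumption.
class SketchTTLCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() > entry.expiresAt) {
      this.entries.delete(key); // expired: evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  invalidate(key?: string): void {
    if (key) this.entries.delete(key);
    else this.entries.clear();
  }
}

// Read-through wrapper: serve from cache, otherwise load and remember the value.
async function readThrough<T>(
  cache: SketchTTLCache<T>,
  key: string,
  load: () => Promise<T>
): Promise<T> {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const value = await load();
  cache.set(key, value);
  return value;
}
```

The sketch compares against `undefined` rather than testing truthiness, so falsy cached values would still count as hits. For the object and array values cached in the diff, both checks behave the same.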
+3 -1
@@ -5,6 +5,7 @@
import { getDbInstance } from "./core";
import { backupDbFile } from "./backup";
import { PROVIDER_ID_TO_ALIAS } from "@omniroute/open-sse/config/providerModels.ts";
import { invalidateDbCache } from "./readCache";
type JsonRecord = Record<string, unknown>;
type PricingModels = Record<string, JsonRecord>;
@@ -80,6 +81,7 @@ export async function updateSettings(updates: Record<string, unknown>) {
});
tx();
backupDbFile("pre-write");
invalidateDbCache("settings"); // Bust the read cache immediately
return getSettings();
}
@@ -169,7 +171,7 @@ export async function updatePricing(pricingData: PricingByProvider) {
});
tx();
backupDbFile("pre-write");
invalidateDbCache("pricing"); // Bust the pricing read cache
const updated: PricingByProvider = {};
const allRows = db.prepare("SELECT key, value FROM key_value WHERE namespace = 'pricing'").all();
for (const row of allRows) {
+8
@@ -95,3 +95,11 @@ export {
listDbBackups,
restoreDbBackup,
} from "./db/backup";
export {
// Read Cache (cached wrappers for hot read paths)
getCachedSettings,
getCachedPricing,
getCachedProviderConnections,
invalidateDbCache,
} from "./db/readCache";
+11 -5
@@ -11,6 +11,7 @@ import { useTranslations } from "next-intl";
*/
import { useState, useEffect, useRef, useCallback } from "react";
import { copyToClipboard } from "@/shared/utils/clipboard";
interface LogEntry {
timestamp: string;
@@ -89,12 +90,17 @@ export default function ConsoleLogViewer() {
}
}, [logs, autoScroll]);
const handleCopy = (entry: LogEntry, idx: number) => {
const handleCopy = async (entry: LogEntry, idx: number) => {
const text = JSON.stringify(entry, null, 2);
navigator.clipboard.writeText(text).then(() => {
setCopiedIdx(idx);
setTimeout(() => setCopiedIdx(null), 2000);
});
const success = await copyToClipboard(text);
if (!success) {
setError("Failed to copy log entry");
return;
}
setError(null);
setCopiedIdx(idx);
setTimeout(() => setCopiedIdx(null), 2000);
};
const formatTime = (ts: string) => {
+17 -19
@@ -8,38 +8,36 @@
* comes back online.
*/
import { useState, useEffect, useCallback } from "react";
import { useState, useEffect } from "react";
export default function MaintenanceBanner() {
const [show, setShow] = useState(false);
const [message, setMessage] = useState("");
const checkHealth = useCallback(async () => {
try {
const res = await fetch("/api/monitoring/health", {
signal: AbortSignal.timeout(3000),
});
if (res.ok) {
// Server is healthy — hide banner if shown
if (show) {
useEffect(() => {
const checkHealth = async () => {
try {
const res = await fetch("/api/monitoring/health", {
signal: AbortSignal.timeout(3000),
});
if (res.ok) {
setShow(false);
setMessage("");
} else {
setShow(true);
setMessage("Server is experiencing issues. Some features may be unavailable.");
}
} else {
} catch {
setShow(true);
setMessage("Server is experiencing issues. Some features may be unavailable.");
setMessage("Server is unreachable. Reconnecting...");
}
} catch {
setShow(true);
setMessage("Server is unreachable. Reconnecting...");
}
}, [show]);
};
useEffect(() => {
// Check health every 10 seconds
// Run immediately on mount, then every 10 seconds
checkHealth();
const interval = setInterval(checkHealth, 10000);
return () => clearInterval(interval);
}, [checkHealth]);
}, []); // empty deps — checkHealth is defined inside effect, no stale closure
if (!show) return null;
+7 -1
@@ -27,6 +27,7 @@ export default function ModelSelectModal({
activeProviders = [],
title = "Select Model",
modelAliases = {},
addedModelValues = [],
}) {
const [searchQuery, setSearchQuery] = useState("");
const [combos, setCombos] = useState<any[]>([]);
@@ -330,6 +331,7 @@ export default function ModelSelectModal({
<div className="flex flex-wrap gap-1.5">
{group.models.map((model) => {
const isSelected = selectedModel === model.value;
const isAdded = addedModelValues.includes(model.value);
return (
<button
key={model.id}
@@ -339,10 +341,13 @@ export default function ModelSelectModal({
${
isSelected
? "bg-primary text-white border-primary"
: "bg-surface border-border text-text-main hover:border-primary/50 hover:bg-primary/5"
: isAdded
? "bg-emerald-500/15 border-emerald-500/30 text-emerald-700 dark:text-emerald-400"
: "bg-surface border-border text-text-main hover:border-primary/50 hover:bg-primary/5"
}
`}
>
{isAdded && <span className="mr-0.5 opacity-70">✓</span>}
{model.name}
{model.isCustom ? " ★" : ""}
</button>
@@ -375,4 +380,5 @@ ModelSelectModal.propTypes = {
),
title: PropTypes.string,
modelAliases: PropTypes.object,
addedModelValues: PropTypes.arrayOf(PropTypes.string),
};
+2 -23
@@ -3,6 +3,7 @@
import { useState, useEffect, useCallback, useMemo, useRef } from "react";
import Card from "./Card";
import RequestLoggerDetail from "./RequestLoggerDetail";
import { copyToClipboard } from "@/shared/utils/clipboard";
import {
PROTOCOL_COLORS,
PROVIDER_COLORS,
@@ -230,30 +231,8 @@ export default function RequestLoggerV2() {
setDetailData(null);
};
// Copy to clipboard
const copyToClipboard = async (text) => {
try {
await navigator.clipboard.writeText(text);
return true;
} catch {
// Fallback for non-HTTPS or older browsers
try {
const textarea = document.createElement("textarea");
textarea.value = text;
textarea.style.position = "fixed";
textarea.style.left = "-9999px";
document.body.appendChild(textarea);
textarea.select();
document.execCommand("copy");
document.body.removeChild(textarea);
return true;
} catch {
return false;
}
}
};
// Unique accounts and providers for dropdowns
const uniqueAccounts = [...new Set(logs.map((l) => l.account).filter((a) => a && a !== "-"))];
const uniqueModels = [...new Set(logs.map((l) => l.model).filter(Boolean))].sort();
const uniqueProviders = [
+21
@@ -349,6 +349,27 @@ export const APIKEY_PROVIDERS = {
textIcon: "CF",
website: "https://github.com/comfyanonymous/ComfyUI",
},
huggingface: {
id: "huggingface",
alias: "hf",
name: "HuggingFace",
icon: "face",
color: "#FFD21E",
textIcon: "HF",
website: "https://huggingface.co",
hasFree: true,
freeNote: "Free Inference API for thousands of models (Whisper, VITS, SDXL…)",
},
vertex: {
id: "vertex",
alias: "vertex",
name: "Vertex AI",
icon: "cloud",
color: "#4285F4",
textIcon: "VA",
website: "https://cloud.google.com/vertex-ai",
authHint: "Provide Service Account JSON or OAuth access_token",
},
};
export const OPENAI_COMPATIBLE_PREFIX = "openai-compatible-";
+19 -42
@@ -1,58 +1,35 @@
"use client";
import { useState, useCallback, useRef } from "react";
import { copyToClipboard } from "@/shared/utils/clipboard";
/**
* Fallback copy using legacy execCommand (works on HTTP)
*/
function fallbackCopy(text) {
const textarea = document.createElement("textarea");
textarea.value = text;
textarea.style.position = "fixed";
textarea.style.left = "-9999px";
textarea.style.top = "-9999px";
textarea.style.opacity = "0";
document.body.appendChild(textarea);
textarea.focus();
textarea.select();
try {
document.execCommand("copy");
} catch {
// ignore
}
document.body.removeChild(textarea);
}
/**
* Hook for copy to clipboard with feedback
* Hook for copy to clipboard with feedback.
* Uses shared copyToClipboard utility that works on both HTTP and HTTPS.
* @param {number} resetDelay - Time in ms before resetting copied state (default: 2000)
* @returns {{ copied: string|null, copy: (text: string, id?: string) => void }}
* @returns {{ copied: string|null, copy: (text: string, id?: string) => Promise<boolean> }}
*/
export function useCopyToClipboard(resetDelay = 2000) {
const [copied, setCopied] = useState(null);
const timeoutRef = useRef(null);
const [copied, setCopied] = useState<string | null>(null);
const timeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null);
const copy = useCallback(
async (text, id = "default") => {
try {
if (navigator.clipboard && window.isSecureContext) {
await navigator.clipboard.writeText(text);
} else {
fallbackCopy(text);
async (text: string, id = "default"): Promise<boolean> => {
const success = await copyToClipboard(text);
if (success) {
setCopied(id);
if (timeoutRef.current) {
clearTimeout(timeoutRef.current);
}
} catch {
fallbackCopy(text);
timeoutRef.current = setTimeout(() => {
setCopied(null);
}, resetDelay);
}
setCopied(id);
if (timeoutRef.current) {
clearTimeout(timeoutRef.current);
}
timeoutRef.current = setTimeout(() => {
setCopied(null);
}, resetDelay);
return success;
},
[resetDelay]
);
+52
@@ -0,0 +1,52 @@
/**
* Clipboard utility with HTTP/HTTPS fallback.
* navigator.clipboard.writeText() requires HTTPS (secure context).
* For HTTP deployments, falls back to execCommand('copy').
*/
/**
* Copy text to clipboard with automatic HTTPS/HTTP fallback.
* Works in both secure (HTTPS) and non-secure (HTTP) contexts.
* @param text - Text to copy to clipboard
* @returns true if copy succeeded, false otherwise
*/
export async function copyToClipboard(text: string): Promise<boolean> {
// Method 1: Clipboard API (requires HTTPS / secure context)
if (
typeof navigator !== "undefined" &&
navigator.clipboard &&
typeof window !== "undefined" &&
window.isSecureContext
) {
try {
await navigator.clipboard.writeText(text);
return true;
} catch {
// Fall through to execCommand fallback
}
}
// Method 2: Legacy execCommand fallback (works on HTTP)
if (typeof document !== "undefined" && document.body) {
const textArea = document.createElement("textarea");
textArea.value = text;
textArea.style.cssText = "position:fixed;top:0;left:-9999px;opacity:0;pointer-events:none;";
let appended = false;
try {
document.body.appendChild(textArea);
appended = true;
textArea.focus();
textArea.select();
return document.execCommand("copy");
} catch {
return false;
} finally {
if (appended && document.body.contains(textArea)) {
document.body.removeChild(textArea);
}
}
}
return false;
}
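The utility above is an instance of a general "try each method in order" pattern. A DOM-free sketch of that pattern, assuming hypothetical names `CopyStrategy` and `copyWithFallback` that are not part of the diff:

```typescript
// Hypothetical generalization of the Clipboard API -> execCommand fallback.
type CopyStrategy = (text: string) => Promise<boolean> | boolean;

async function copyWithFallback(text: string, strategies: CopyStrategy[]): Promise<boolean> {
  for (const strategy of strategies) {
    try {
      if (await strategy(text)) return true; // stop at the first success
    } catch {
      // A throwing strategy is treated like a failed one; try the next.
    }
  }
  return false; // every strategy failed or none was available
}
```

Returning a boolean rather than throwing lets callers such as `ConsoleLogViewer` decide how to surface a failed copy.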
+44 -1
@@ -42,6 +42,10 @@ import { generateRequestId } from "../../shared/utils/requestId";
import { recordCost } from "../../domain/costRules";
import { logAuditEvent } from "../../lib/compliance/index";
import { enforceApiKeyPolicy } from "../../shared/utils/apiKeyPolicy";
import {
applyTaskAwareRouting,
getTaskRoutingConfig,
} from "@omniroute/open-sse/services/taskAwareRouter.ts";
/**
* Handle chat completion request
@@ -77,6 +81,24 @@ export async function handleChat(request: any, clientRawRequest: any = null) {
}
telemetry.endPhase();
// T01 — Accept header negotiation
// If client asks for text/event-stream via the Accept header AND the JSON body
// does not explicitly set stream=false, treat it as stream=true.
// This ensures compatibility with curl/httpx and similar non-OpenAI clients.
//
// FIX #302: OpenAI Python SDK sends Accept: application/json, text/event-stream
// in every request — even when called with stream=False. We must NOT override
// an explicit stream=false body field, as that silently breaks tool_calls and
// structured completions for SDK users who rely on non-streaming mode.
const acceptHeader = request.headers.get("accept") || "";
if (acceptHeader.includes("text/event-stream") && body.stream === undefined) {
body = { ...body, stream: true };
log.debug(
"STREAM",
"Accept: text/event-stream header → overriding stream=true (body had no stream field)"
);
}
// Build clientRawRequest for logging (if not provided)
if (!clientRawRequest) {
const url = new URL(request.url);
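The T01 Accept-header rule in this hunk can be stated as a pure function. A minimal sketch, assuming a hypothetical helper name `resolveStream` (the handler itself inlines this logic):

```typescript
// Hypothetical helper (not part of the diff) restating the T01 negotiation rule.
function resolveStream(
  acceptHeader: string | null,
  bodyStream: boolean | undefined
): boolean | undefined {
  const accept = acceptHeader ?? "";
  // Only infer streaming when the body is silent; an explicit value always wins.
  if (bodyStream === undefined && accept.includes("text/event-stream")) return true;
  return bodyStream;
}
```

The second case in the fix for #302 is the important one: an `Accept` header that lists `text/event-stream` alongside an explicit `stream: false` body must stay non-streaming.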
@@ -141,9 +163,30 @@ export async function handleChat(request: any, clientRawRequest: any = null) {
const apiKeyInfo = policy.apiKeyInfo;
telemetry.endPhase();
// T05 — Task-Aware Smart Routing
// Detect the semantic task type and optionally route to the optimal model
let resolvedModelStr = modelStr;
let taskRouteInfo: { taskType: string; wasRouted: boolean } | null = null;
if (getTaskRoutingConfig().enabled) {
telemetry.startPhase("task-route");
const tr = applyTaskAwareRouting(modelStr, body);
if (tr.wasRouted) {
resolvedModelStr = tr.model;
body = { ...body, model: tr.model };
log.info(
"T05",
`Task-Aware: detected="${tr.taskType}" → model override: ${modelStr} → ${tr.model}`
);
} else if (tr.taskType !== "chat") {
log.debug("T05", `Task-Aware: detected="${tr.taskType}" (no override configured)`);
}
taskRouteInfo = { taskType: tr.taskType, wasRouted: tr.wasRouted };
telemetry.endPhase();
}
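The override step can be pictured as a lookup in `taskModelMap`. The sketch below is an assumption-heavy illustration: `routeByTask` is a hypothetical name, it takes an already-detected task type because the detection logic is not shown in the diff, and treating "override equals requested model" as not routed is a guess. Only the result fields `model`, `taskType`, and `wasRouted` come from the handler code.

```typescript
// Result shape as consumed by handleChat; everything else here is hypothetical.
interface TaskRouteResult {
  model: string;
  taskType: string;
  wasRouted: boolean;
}

function routeByTask(
  requestedModel: string,
  taskType: string,
  taskModelMap: Record<string, string | undefined>
): TaskRouteResult {
  const override = taskModelMap[taskType];
  // Assumption: an empty or identical override leaves the request untouched.
  if (override && override !== requestedModel) {
    return { model: override, taskType, wasRouted: true };
  }
  return { model: requestedModel, taskType, wasRouted: false };
}
```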
// Check if model is a combo (has multiple models with fallback)
telemetry.startPhase("resolve");
const combo = await getCombo(modelStr);
const combo = await getCombo(resolvedModelStr);
if (combo) {
log.info(
"CHAT",
+53 -18
@@ -218,27 +218,58 @@ export async function getProviderCredentials(
if (strategy === "round-robin") {
const stickyLimit = toNumber((settings as Record<string, unknown>).stickyRoundRobinLimit, 3);
// Sort by lastUsed (most recent first) to find current candidate
const byRecency = [...orderedConnections].sort((a: any, b: any) => {
if (!a.lastUsedAt && !b.lastUsedAt) return (a.priority || 999) - (b.priority || 999);
if (!a.lastUsedAt) return 1;
if (!b.lastUsedAt) return -1;
return new Date(b.lastUsedAt).getTime() - new Date(a.lastUsedAt).getTime();
});
// If excluding an account (fallback scenario), skip sticky logic and go straight to LRU
// This prevents the system from getting stuck on a failed account
const isFallbackScenario = excludeConnectionId !== null;
const current = byRecency[0];
const currentCount = current?.consecutiveUseCount || 0;
if (current && current.lastUsedAt && currentCount < stickyLimit) {
// Stay with current account
connection = current;
// Update lastUsedAt and increment count (await to ensure persistence)
await updateProviderConnection(connection.id, {
lastUsedAt: new Date().toISOString(),
consecutiveUseCount: (connection.consecutiveUseCount || 0) + 1,
if (!isFallbackScenario) {
// Sort by lastUsed (most recent first) to find current candidate
const byRecency = [...orderedConnections].sort((a: any, b: any) => {
if (!a.lastUsedAt && !b.lastUsedAt) return (a.priority || 999) - (b.priority || 999);
if (!a.lastUsedAt) return 1;
if (!b.lastUsedAt) return -1;
return new Date(b.lastUsedAt).getTime() - new Date(a.lastUsedAt).getTime();
});
const current = byRecency[0];
const currentCount = current?.consecutiveUseCount || 0;
if (current && current.lastUsedAt && currentCount < stickyLimit) {
// Stay with current account
connection = current;
log.debug(
"AUTH",
`${provider} round-robin: staying with ${current.id?.slice(0, 8)}... (count=${currentCount}/${stickyLimit})`
);
// Update lastUsedAt and increment count (await to ensure persistence)
await updateProviderConnection(connection.id, {
lastUsedAt: new Date().toISOString(),
consecutiveUseCount: (connection.consecutiveUseCount || 0) + 1,
});
} else {
// Pick the least recently used (excluding current if possible)
const sortedByOldest = [...orderedConnections].sort((a: any, b: any) => {
if (!a.lastUsedAt && !b.lastUsedAt) return (a.priority || 999) - (b.priority || 999);
if (!a.lastUsedAt) return -1;
if (!b.lastUsedAt) return 1;
return new Date(a.lastUsedAt).getTime() - new Date(b.lastUsedAt).getTime();
});
connection = sortedByOldest[0];
log.debug(
"AUTH",
`${provider} round-robin: switching to LRU ${connection.id?.slice(0, 8)}... (current count=${currentCount} >= limit=${stickyLimit} or no lastUsedAt)`
);
// Update lastUsedAt and reset count to 1 (await to ensure persistence)
await updateProviderConnection(connection.id, {
lastUsedAt: new Date().toISOString(),
consecutiveUseCount: 1,
});
}
} else {
// Pick the least recently used (excluding current if possible)
// Fallback scenario: excluded an account due to failure
// Always pick the least recently used to ensure proper cycling
const sortedByOldest = [...orderedConnections].sort((a: any, b: any) => {
if (!a.lastUsedAt && !b.lastUsedAt) return (a.priority || 999) - (b.priority || 999);
if (!a.lastUsedAt) return -1;
@@ -247,6 +278,10 @@ export async function getProviderCredentials(
});
connection = sortedByOldest[0];
log.info(
"AUTH",
`${provider} round-robin: FALLBACK MODE - excluded ${excludeConnectionId?.slice(0, 8)}..., picked LRU ${connection.id?.slice(0, 8)}...`
);
// Update lastUsedAt and reset count to 1 (await to ensure persistence)
await updateProviderConnection(connection.id, {
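The selection rule in this file (stay on the current account until the sticky limit is reached, otherwise take the least recently used, and skip stickiness entirely in a fallback) can be isolated from the database writes. A sketch under assumptions: `Conn`, `RoundRobinChoice`, and `pickRoundRobin` are hypothetical names, and the caller is assumed to have already removed the excluded connection from the list.

```typescript
// Hypothetical types and helper; field names follow the diff.
interface Conn {
  id: string;
  priority?: number;
  lastUsedAt?: string | null;
  consecutiveUseCount?: number;
}
interface RoundRobinChoice {
  connection: Conn;
  nextCount: number; // value to persist as consecutiveUseCount
}

const usedAt = (c: Conn): number | null => (c.lastUsedAt ? new Date(c.lastUsedAt).getTime() : null);
const byPriority = (a: Conn, b: Conn): number => (a.priority || 999) - (b.priority || 999);

function pickRoundRobin(
  connections: Conn[],
  stickyLimit: number,
  isFallback: boolean
): RoundRobinChoice | null {
  if (connections.length === 0) return null;

  if (!isFallback) {
    // Most recently used first; never-used connections sort last.
    const byRecency = [...connections].sort((a, b) => {
      const ta = usedAt(a);
      const tb = usedAt(b);
      if (ta === null && tb === null) return byPriority(a, b);
      if (ta === null) return 1;
      if (tb === null) return -1;
      return tb - ta;
    });
    const current = byRecency[0];
    const count = current.consecutiveUseCount || 0;
    if (current.lastUsedAt && count < stickyLimit) {
      return { connection: current, nextCount: count + 1 }; // stay sticky
    }
  }

  // Least recently used first; never-used connections sort first.
  const byOldest = [...connections].sort((a, b) => {
    const ta = usedAt(a);
    const tb = usedAt(b);
    if (ta === null && tb === null) return byPriority(a, b);
    if (ta === null) return -1;
    if (tb === null) return 1;
    return ta - tb;
  });
  return { connection: byOldest[0], nextCount: 1 };
}
```

Returning `nextCount` instead of writing it keeps the function pure; the real code persists the counter with `updateProviderConnection`.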
+52 -9
@@ -1,10 +1,8 @@
import test from "node:test";
import assert from "node:assert/strict";
-const {
-  selectAccountP2C,
-  selectAccount,
-} = await import("../../open-sse/services/accountSelector.ts");
+const { selectAccountP2C, selectAccount } =
+  await import("../../open-sse/services/accountSelector.ts");
// ─── selectAccountP2C ───────────────────────────────────────────────────────
@@ -19,11 +17,7 @@ test("selectAccountP2C: returns single account", () => {
});
test("selectAccountP2C: returns one of the candidates", () => {
-const accounts = [
-  { id: "a1" },
-  { id: "a2" },
-  { id: "a3" },
-];
+const accounts = [{ id: "a1" }, { id: "a2" }, { id: "a3" }];
const selected = selectAccountP2C(accounts);
assert.ok(accounts.includes(selected));
});
@@ -79,3 +73,52 @@ test("selectAccount: empty accounts returns null", () => {
const { account } = selectAccount([], "fill-first");
assert.equal(account, null);
});
// ─── Round-robin fallback scenario (Issue #340) ─────────────────────────────
test("selectAccount: round-robin with excludeConnectionId skips excluded", () => {
const accounts = [{ id: "a" }, { id: "b" }, { id: "c" }];
// Simulate: the first request picks 'a', which then fails.
// The second request excludes 'a' and must pick one of the remaining accounts ('b' or 'c').
let state = {};
// First request - no exclusion
const { account: acc1 } = selectAccount(accounts, "round-robin", state);
state = { lastIndex: 0 }; // Simulate 'a' was picked
// 'a' fails, exclude it
const { account: acc2 } = selectAccount(
accounts.filter((a) => a.id !== "a"),
"round-robin",
state
);
// Should pick 'b' or 'c', not 'a'
assert.notEqual(acc2.id, "a", "Should not pick excluded account");
assert.ok(["b", "c"].includes(acc2.id), "Should pick from remaining accounts");
});
test("selectAccount: round-robin respects state across calls", () => {
const accounts = [{ id: "a" }, { id: "b" }, { id: "c" }];
let state = { lastIndex: -1 };
// First call should pick index 0
const { account: acc1, state: state1 } = selectAccount(accounts, "round-robin", state);
assert.equal(acc1.id, "a");
assert.equal(state1.lastIndex, 0);
// Second call should pick index 1
const { account: acc2, state: state2 } = selectAccount(accounts, "round-robin", state1);
assert.equal(acc2.id, "b");
assert.equal(state2.lastIndex, 1);
// Third call should pick index 2
const { account: acc3, state: state3 } = selectAccount(accounts, "round-robin", state2);
assert.equal(acc3.id, "c");
assert.equal(state3.lastIndex, 2);
// Fourth call should wrap to index 0
const { account: acc4 } = selectAccount(accounts, "round-robin", state3);
assert.equal(acc4.id, "a");
});
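
The round-robin tests above depend on a selector that advances an index stored in `state` and wraps at the end of the list. Below is a minimal sketch of that behaviour, assuming the state is a plain `{ lastIndex }` object as the tests suggest. The name `pickRoundRobin` and the `Account` and `RoundRobinState` types are hypothetical and used only for illustration; this is not the actual `selectAccount` implementation in `accountSelector.ts`.

```typescript
// Hypothetical sketch of a stateful round-robin picker (not the project's code).
interface Account {
  id: string;
}

interface RoundRobinState {
  lastIndex?: number; // index picked on the previous call, if any
}

function pickRoundRobin<T extends Account>(
  accounts: T[],
  state: RoundRobinState = {}
): { account: T | null; state: RoundRobinState } {
  // Nothing to choose from: report null and leave the state unchanged.
  if (accounts.length === 0) {
    return { account: null, state };
  }
  // Start before the first element when no previous index is recorded.
  const last = state.lastIndex ?? -1;
  // Advance by one and wrap around; the modulo also keeps the index valid
  // when the list shrank since the previous call (e.g. a failed account
  // was filtered out before retrying).
  const next = (last + 1) % accounts.length;
  return { account: accounts[next], state: { lastIndex: next } };
}
```

Because the caller filters the failed account out of the list before calling again, the excluded account can never be returned, which is the property the Issue #340 test asserts. Which of the remaining accounts gets picked depends on the stored index, so that test accepts either one.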