Compare commits

40 Commits

Author SHA1 Message Date
diegosouzapw df23162e9d chore(release): v3.3.5 - all changes in ONE commit
Build Electron Desktop App / Validate version (push) Failing after 31s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
Build Electron Desktop App / Publish to npm (push) Has been skipped
2026-03-30 17:35:51 -03:00
dependabot[bot] 2c12f18b44 deps: bump the production group with 8 updates
Bumps the production group with 8 updates:

| Package | From | To |
| --- | --- | --- |
| [@lobehub/icons](https://github.com/lobehub/lobe-icons) | `5.0.1` | `5.2.0` |
| [@modelcontextprotocol/sdk](https://github.com/modelcontextprotocol/typescript-sdk) | `1.27.1` | `1.29.0` |
| [@swc/helpers](https://github.com/swc-project/swc/tree/HEAD/packages/helpers) | `0.5.19` | `0.5.20` |
| [jose](https://github.com/panva/jose) | `6.2.1` | `6.2.2` |
| [next](https://github.com/vercel/next.js) | `16.1.7` | `16.2.1` |
| [recharts](https://github.com/recharts/recharts) | `3.8.0` | `3.8.1` |
| [undici](https://github.com/nodejs/undici) | `7.24.4` | `7.24.6` |
| [wreq-js](https://github.com/sqdshguy/wreq-js) | `2.2.0` | `2.2.2` |


Updates `@lobehub/icons` from 5.0.1 to 5.2.0
- [Release notes](https://github.com/lobehub/lobe-icons/releases)
- [Changelog](https://github.com/lobehub/lobe-icons/blob/master/CHANGELOG.md)
- [Commits](https://github.com/lobehub/lobe-icons/compare/v5.0.1...v5.2.0)

Updates `@modelcontextprotocol/sdk` from 1.27.1 to 1.29.0
- [Release notes](https://github.com/modelcontextprotocol/typescript-sdk/releases)
- [Commits](https://github.com/modelcontextprotocol/typescript-sdk/compare/v1.27.1...v1.29.0)

Updates `@swc/helpers` from 0.5.19 to 0.5.20
- [Release notes](https://github.com/swc-project/swc/releases)
- [Changelog](https://github.com/swc-project/swc/blob/main/CHANGELOG-CORE.md)
- [Commits](https://github.com/swc-project/swc/commits/HEAD/packages/helpers)

Updates `jose` from 6.2.1 to 6.2.2
- [Release notes](https://github.com/panva/jose/releases)
- [Changelog](https://github.com/panva/jose/blob/main/CHANGELOG.md)
- [Commits](https://github.com/panva/jose/compare/v6.2.1...v6.2.2)

Updates `next` from 16.1.7 to 16.2.1
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/compare/v16.1.7...v16.2.1)

Updates `recharts` from 3.8.0 to 3.8.1
- [Release notes](https://github.com/recharts/recharts/releases)
- [Changelog](https://github.com/recharts/recharts/blob/main/CHANGELOG.md)
- [Commits](https://github.com/recharts/recharts/compare/v3.8.0...v3.8.1)

Updates `undici` from 7.24.4 to 7.24.6
- [Release notes](https://github.com/nodejs/undici/releases)
- [Commits](https://github.com/nodejs/undici/compare/v7.24.4...v7.24.6)

Updates `wreq-js` from 2.2.0 to 2.2.2
- [Release notes](https://github.com/sqdshguy/wreq-js/releases)
- [Commits](https://github.com/sqdshguy/wreq-js/compare/v2.2.0...v2.2.2)

---
updated-dependencies:
- dependency-name: "@lobehub/icons"
  dependency-version: 5.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: production
- dependency-name: "@modelcontextprotocol/sdk"
  dependency-version: 1.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: production
- dependency-name: "@swc/helpers"
  dependency-version: 0.5.20
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production
- dependency-name: jose
  dependency-version: 6.2.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production
- dependency-name: next
  dependency-version: 16.2.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: production
- dependency-name: recharts
  dependency-version: 3.8.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production
- dependency-name: undici
  dependency-version: 7.24.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production
- dependency-name: wreq-js
  dependency-version: 2.2.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-30 17:32:55 -03:00
dependabot[bot] eaeb28b4e1 deps: bump the development group with 7 updates
Bumps the development group with 7 updates:

| Package | From | To |
| --- | --- | --- |
| [@tailwindcss/postcss](https://github.com/tailwindlabs/tailwindcss/tree/HEAD/packages/@tailwindcss-postcss) | `4.2.1` | `4.2.2` |
| [@types/keytar](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/keytar) | `4.4.0` | `4.4.2` |
| [eslint-config-next](https://github.com/vercel/next.js/tree/HEAD/packages/eslint-config-next) | `16.1.6` | `16.2.1` |
| [tailwindcss](https://github.com/tailwindlabs/tailwindcss/tree/HEAD/packages/tailwindcss) | `4.2.1` | `4.2.2` |
| [typescript](https://github.com/microsoft/TypeScript) | `5.9.3` | `6.0.2` |
| [typescript-eslint](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/typescript-eslint) | `8.57.1` | `8.58.0` |
| [vitest](https://github.com/vitest-dev/vitest/tree/HEAD/packages/vitest) | `4.1.0` | `4.1.2` |


Updates `@tailwindcss/postcss` from 4.2.1 to 4.2.2
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/commits/v4.2.2/packages/@tailwindcss-postcss)

Updates `@types/keytar` from 4.4.0 to 4.4.2
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/keytar)

Updates `eslint-config-next` from 16.1.6 to 16.2.1
- [Release notes](https://github.com/vercel/next.js/releases)
- [Changelog](https://github.com/vercel/next.js/blob/canary/release.js)
- [Commits](https://github.com/vercel/next.js/commits/v16.2.1/packages/eslint-config-next)

Updates `tailwindcss` from 4.2.1 to 4.2.2
- [Release notes](https://github.com/tailwindlabs/tailwindcss/releases)
- [Changelog](https://github.com/tailwindlabs/tailwindcss/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tailwindlabs/tailwindcss/commits/v4.2.2/packages/tailwindcss)

Updates `typescript` from 5.9.3 to 6.0.2
- [Release notes](https://github.com/microsoft/TypeScript/releases)
- [Commits](https://github.com/microsoft/TypeScript/compare/v5.9.3...v6.0.2)

Updates `typescript-eslint` from 8.57.1 to 8.58.0
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/typescript-eslint/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v8.58.0/packages/typescript-eslint)

Updates `vitest` from 4.1.0 to 4.1.2
- [Release notes](https://github.com/vitest-dev/vitest/releases)
- [Commits](https://github.com/vitest-dev/vitest/commits/v4.1.2/packages/vitest)

---
updated-dependencies:
- dependency-name: "@tailwindcss/postcss"
  dependency-version: 4.2.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: development
- dependency-name: "@types/keytar"
  dependency-version: 4.4.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: development
- dependency-name: eslint-config-next
  dependency-version: 16.2.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: development
- dependency-name: tailwindcss
  dependency-version: 4.2.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: development
- dependency-name: typescript
  dependency-version: 6.0.2
  dependency-type: direct:development
  update-type: version-update:semver-major
  dependency-group: development
- dependency-name: typescript-eslint
  dependency-version: 8.58.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: development
- dependency-name: vitest
  dependency-version: 4.1.2
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: development
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-30 17:32:51 -03:00
Chris Staley d5647eab33 fix: remove dead userDismissed ref after auto-open removal
The userDismissed ref was only read by the removed auto-open useEffect.
Remove the ref declaration and the three onClose assignments that set it.
2026-03-30 17:32:49 -03:00
Chris Staley 89eb8885b1 fix: remove unnecessary comment from previous commit 2026-03-30 17:32:49 -03:00
Chris Staley a5dc5687f8 fix: remove auto-opening OAuth/API key modal on provider detail page
Auto-opening the "Add Connection" dialog when navigating to a provider
with zero connections was a poor UX pattern. It surprised users who were
simply browsing provider details (e.g. after deleting a connection or
checking settings). The page already displays a clear empty state with
an "Add Connection" button — users should click it when ready.
2026-03-30 17:32:49 -03:00
oyi77 6780485051 feat(cache): persistent metrics, cache entry browser, settings UI, MCP tools, prefix analyzer
Implements remaining features from #813:

Phase 1 - Persistent Metrics:
- Add cache_metrics table for persistent hit/miss tracking
- Semantic cache stats now survive server restarts

Phase 2 - Cache Entry Browser:
- /api/cache/entries endpoint with search, pagination, delete
- CacheEntriesTab component for browsing cached entries

Phase 3 - Settings UI:
- CacheSettingsTab for semantic/prompt cache configuration
- /api/settings/cache-config endpoint

Phase 4 - Prefix Analyzer:
- src/lib/promptCache/prefixAnalyzer.ts for intelligent caching
- Analyzes message arrays to find stable prefixes

Phase 5 - Provider Support:
- Added deepseek to CACHING_PROVIDERS

Phase 6 - MCP Tools:
- omniroute_cache_stats tool
- omniroute_cache_flush tool

Phase 7 - Retention:
- cleanOldMetrics() for auto-purge of old entries

Closes #813
2026-03-30 17:32:45 -03:00
oyi77 d043e7a242 feat(cache): fix cache page to display prompt cache metrics and trend data
Closes #813
2026-03-30 17:32:45 -03:00
Chris Staley c5d9b5f51d fix: apply PR review feedback for Gemini CLI quota
- Add early return guard for missing accessToken in getGeminiUsage
- Add 10s fetch timeout (AbortSignal.timeout) on retrieveUserQuota calls
- Clamp used value with Math.max(0, ...) for non-negative display
- Use full accessToken as cache key instead of truncated prefix
- Replace catch(err: any) with instanceof Error check in models route
2026-03-30 17:32:42 -03:00
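The review fixes in c5d9b5f51d (token guard, 10s timeout, non-negative clamp, `instanceof Error` check) could look roughly like the sketch below. The function names, the `Authorization` header format, and the idea of passing the endpoint URL as a parameter are assumptions made for illustration; the real `getGeminiUsage` and the `retrieveUserQuota` request shape are not shown in this log.

```typescript
// Hypothetical helpers illustrating the four review fixes; not the project's code.

// Clamp so a quota reporting more remaining than the limit never yields negative usage.
function usedQuota(limit: number, remaining: number): number {
  return Math.max(0, limit - remaining);
}

// Narrow an unknown caught value instead of typing it as `any`.
function errorMessage(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Assumption: the endpoint accepts a bearer token; the URL is supplied by the caller.
async function fetchQuotaJson(
  url: string,
  accessToken: string | undefined
): Promise<unknown | null> {
  // Early-return guard: without a token there is nothing to query.
  if (!accessToken) return null;

  const response = await fetch(url, {
    headers: { Authorization: `Bearer ${accessToken}` },
    // Abort the request if it has not completed within 10 seconds.
    signal: AbortSignal.timeout(10_000),
  });
  if (!response.ok) {
    throw new Error(`Quota request failed with status ${response.status}`);
  }
  return response.json();
}
```

`AbortSignal.timeout` is a standard static method, so no manual `AbortController` and `setTimeout` pairing is needed for a simple deadline.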
Chris Staley 35e2892b98 feat: add real Gemini CLI quota tracking via retrieveUserQuota API
Replace stub getGeminiUsage() with per-model quota fetching from Google
Cloud Code Assist's retrieveUserQuota endpoint (same API the official
Gemini CLI /stats command uses). Fixes OAuth env var name, aligns model
list with official Gemini CLI VALID_GEMINI_MODELS, and makes "Import
from /models" discover new models via the quota endpoint.
2026-03-30 17:32:42 -03:00
diegosouzapw 11dfdbb7a3 feat(analytics): add diversity score card UI and diversity API route
Implement the DiversityScoreCard component, which fetches and displays the provider diversity score with a loading state and conditional styling, and integrate it into the AnalyticsPage overview. Add a new API route at src/app/api/analytics/diversity/route.ts that returns the diversity report via getDiversityReport.
2026-03-30 16:37:49 -03:00
diegosouzapw a864258cb8 feat(ui): integrate FSM, adaptive routing, and provider diversity
Build Electron Desktop App / Validate version (push) Failing after 35s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
Build Electron Desktop App / Publish to npm (push) Has been skipped
2026-03-30 12:58:45 -03:00
Diego Rodrigues de Sa e Souza 8a9c15c874 Merge pull request #819 from diegosouzapw/release/v3.3.4
Release v3.3.4
2026-03-30 11:26:17 -03:00
diegosouzapw 7a666526b7 chore(release): bump version to 3.3.4 2026-03-30 11:23:59 -03:00
diegosouzapw 3fc1cac015 docs(i18n): update CHANGELOG, README and sync multi-language FEATURES docs 2026-03-30 11:21:47 -03:00
Diego Rodrigues de Sa e Souza 04a0b07bf6 Merge pull request #793 from igormorais123/feat/provider-diversity-scoring
feat(sse): add provider diversity scoring via Shannon entropy
2026-03-30 11:07:03 -03:00
Diego Rodrigues de Sa e Souza 59e48ca91a Merge pull request #794 from igormorais123/feat/adaptive-volume-routing
feat(sse): add adaptive volume/complexity detector for routing strategy override
2026-03-30 11:07:00 -03:00
Diego Rodrigues de Sa e Souza 8ff562c5af Merge pull request #795 from igormorais123/feat/provider-expiration-tracking
feat(domain): add provider expiration tracking with proactive alerts
2026-03-30 11:06:56 -03:00
Diego Rodrigues de Sa e Souza b502a93728 Merge pull request #796 from igormorais123/feat/config-audit-trail
feat(domain): add configuration audit trail with diff detection and rollback
2026-03-30 11:06:53 -03:00
Diego Rodrigues de Sa e Souza b6afa6c2c7 Merge pull request #803 from igormorais123/feat/graceful-degradation-wrapper
feat(domain): add graceful degradation framework with multi-layer fallback
2026-03-30 11:06:50 -03:00
Diego Rodrigues de Sa e Souza 5887da0229 Merge pull request #805 from igormorais123/feat/fsm-workflow-orchestrator
feat(sse): add deterministic FSM orchestrator for multi-step workflows
2026-03-30 11:06:46 -03:00
Diego Rodrigues de Sa e Souza a7d833d96a Merge pull request #817 from diegosouzapw/feat/auto-disable-banned-accounts-setting
Feat/auto disable banned accounts setting
2026-03-30 11:06:42 -03:00
Diego Rodrigues de Sa e Souza db3753d611 Merge PR 811: fix UI fallbacks and Electron release workflow
fix: UI fallbacks and Electron release workflow
2026-03-30 11:04:02 -03:00
diegosouzapw f810b13bca fix: complete bugfixes for UI, OAuth fallbacks, cliRuntime Windows constraints and Codex non-streaming integration 2026-03-30 11:01:55 -03:00
diegosouzapw 5ad687c6d8 fix(ui/ci): use ProviderIcon for Provider header breadcrumbs and add permissions to electron-release.yml (#745, #761)
- Use ProviderIcon for internal .png paths solving SVG provider 404 images (#745).
- Add id-token: write and packages: write permissions to .github/workflows/electron-release.yml to fix permissions denied failure when calling the reusable workflow npm-publish.yml (#761).
- Fix tests and ESM resolution for autoUpdate.ts override logic.
2026-03-30 07:38:30 -03:00
Diego Rodrigues de Sa e Souza 6ad0910790 Merge pull request #810 from oyi77/main
feat(settings): add debug toggle and sidebar visibility toggle
2026-03-30 07:07:54 -03:00
Diego Rodrigues de Sa e Souza 4d8c0546cf Merge pull request #783 from rdself/coder/cloudflared-exit1-fix
Fix cloudflared quick tunnel startup in Docker
2026-03-30 07:07:39 -03:00
oyi77 35f96d4a40 feat(settings): add debug toggle and sidebar visibility toggle
feat(ui): replace hide-sidebar toggle with dynamic visibility toggle
2026-03-30 15:15:02 +07:00
igormorais123 ae96fb6f63 feat(sse): add deterministic FSM orchestrator for multi-step workflows
Risk-based phase skipping: high=all 9 phases, medium=skip planner, low=execute+test.
Veto authority, pause/resume, retry limits, full audit trail.

Closes #800, closes #802
2026-03-30 01:28:45 -03:00
igormorais123 67592d80aa feat(domain): add graceful degradation framework with multi-layer fallback
Add a standardized degradation pattern for services depending on external
systems. withDegradation() tries primary → fallback → safe default,
tracking status in a global registry for dashboard visibility.

Features:
- Async and sync variants
- Global registry with per-feature status tracking
- Degradation levels: full → reduced → minimal → default
- Summary and report APIs for dashboard integration
- Reason tracking for debugging

Example: Rate limiting degrades from Redis → in-memory → permissive
instead of crashing when Redis is unavailable.

Closes #799
2026-03-30 01:23:10 -03:00
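The primary → fallback → safe default chain described in 67592d80aa can be sketched as follows. This is a minimal illustration of the sync variant only; the signature, the registry shape, and the mapping of layers to levels (it uses three of the four listed levels) are assumptions, not the module's actual API.

```typescript
// Hypothetical sketch of a synchronous degradation wrapper.
type DegradationLevel = "full" | "reduced" | "default";

interface DegradationResult<T> {
  value: T;
  level: DegradationLevel;
  reason?: string; // why the wrapper had to degrade, for debugging
}

// Assumed global registry: latest status per feature, for dashboard display.
const degradationRegistry: Record<string, DegradationResult<unknown>> = {};

function withDegradationSync<T>(
  feature: string,
  primary: () => T,
  fallback: () => T,
  safeDefault: T
): DegradationResult<T> {
  const attempt = (): DegradationResult<T> => {
    try {
      return { value: primary(), level: "full" };
    } catch (primaryError) {
      try {
        return { value: fallback(), level: "reduced", reason: String(primaryError) };
      } catch (fallbackError) {
        return { value: safeDefault, level: "default", reason: String(fallbackError) };
      }
    }
  };

  const result = attempt();
  degradationRegistry[feature] = result; // record status for later reporting
  return result;
}
```

In the rate-limiting example from the commit message, `primary` would query Redis, `fallback` would consult an in-memory counter, and `safeDefault` would be a permissive "allow" decision.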
igormorais123 94a5e43e5d feat(domain): add configuration audit trail with diff detection and rollback
Add configAudit module that records every change to provider connections,
combos, and routing policies with:

- Before/after state snapshots
- Structured diff (added, removed, changed keys)
- Source tracking (dashboard, API, sync, auto-healing)
- Filtered retrieval with pagination
- Rollback state extraction
- Configuration snapshot export for backup

Enables traceability and quick rollback when config changes cause issues.

Closes #791
2026-03-30 00:49:22 -03:00
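The structured diff (added, removed, changed keys) mentioned in 94a5e43e5d can be illustrated with a shallow comparison of before/after snapshots. The names below are hypothetical, and comparing values through `JSON.stringify` is a simplifying assumption; it is sensitive to key order inside nested objects.

```typescript
// Hypothetical sketch of a key-level configuration diff.
type ConfigSnapshot = Record<string, unknown>;

interface ConfigDiff {
  added: string[];
  removed: string[];
  changed: string[];
}

function diffConfig(before: ConfigSnapshot, after: ConfigSnapshot): ConfigDiff {
  const diff: ConfigDiff = { added: [], removed: [], changed: [] };

  for (const key of Object.keys(after)) {
    if (!(key in before)) {
      diff.added.push(key);
    } else if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      // Assumption: serialized equality is good enough for plain JSON config values.
      diff.changed.push(key);
    }
  }
  for (const key of Object.keys(before)) {
    if (!(key in after)) diff.removed.push(key);
  }
  return diff;
}
```

Storing the `before` snapshot alongside the diff is what makes rollback possible: restoring it undoes every key listed in the three arrays.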
igormorais123 26958f8f70 feat(domain): add provider expiration tracking with proactive alerts
Add providerExpiration module to track OAuth token, subscription, and
API credit expiration dates per provider connection. Provides:

- setExpiration() / getExpiration() for CRUD operations
- getExpiringSoon() for proactive alerts
- getExpirationSummary() for dashboard health display
- detectExpirationFromResponse() for auto-detection from HTTP headers
- Status classification: active → expiring_soon → expired

Prevents silent failures from expired credentials by alerting operators
before tokens/subscriptions expire.

Closes #790
2026-03-30 00:48:06 -03:00
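The status classification active → expiring_soon → expired from 26958f8f70 might be computed along these lines. The function name and the 7-day warning window are assumptions chosen for the example; the commit does not state the actual threshold.

```typescript
// Hypothetical sketch of expiration status classification.
type ExpirationStatus = "active" | "expiring_soon" | "expired";

const DAY_MS = 24 * 60 * 60 * 1000;

function classifyExpiration(
  expiresAt: Date,
  now: Date,
  warningWindowMs: number = 7 * DAY_MS // assumed default, not taken from the source
): ExpirationStatus {
  const remainingMs = expiresAt.getTime() - now.getTime();
  if (remainingMs <= 0) return "expired";
  if (remainingMs <= warningWindowMs) return "expiring_soon";
  return "active";
}
```

Passing `now` explicitly keeps the function deterministic, which makes the boundary cases easy to test.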
igormorais123 a427d215e3 feat(sse): add adaptive volume/complexity detector for routing strategy override
Add volumeDetector module that analyzes request characteristics (batch
size, token count, tool usage, browser signals, complexity keywords)
and recommends routing strategy overrides.

Rules:
- Batch >= 50 items → round-robin with economy models
- Critical complexity (many tools, browser, deploy) → priority premium-first
- Browser/UI interaction → force premium priority
- Short requests (<200 tokens) → flag for economy tier

Closes #789
2026-03-30 00:46:55 -03:00
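The four rules listed in a427d215e3 can be read as a small decision function. The sketch below is an illustration only: the field names, the "many tools" threshold of 5, and the order in which rules take precedence are assumptions, since the commit message specifies none of them.

```typescript
// Hypothetical sketch of the routing override rules.
interface RequestSignals {
  batchSize: number;
  tokenCount: number;
  toolCount: number;
  usesBrowser: boolean;
  mentionsDeploy: boolean;
}

interface Recommendation {
  strategy: "round-robin" | "priority" | null; // null = keep the configured strategy
  tier: "economy" | "premium" | null;
}

const MANY_TOOLS = 5; // assumed threshold for "many tools"

function recommendStrategy(s: RequestSignals): Recommendation {
  const critical = s.toolCount >= MANY_TOOLS || s.usesBrowser || s.mentionsDeploy;
  // Assumed precedence: critical or browser work wins over cost-saving rules.
  if (critical) return { strategy: "priority", tier: "premium" };
  // Batch >= 50 items: spread load across economy models.
  if (s.batchSize >= 50) return { strategy: "round-robin", tier: "economy" };
  // Short requests (< 200 tokens): only flag the economy tier.
  if (s.tokenCount < 200) return { strategy: null, tier: "economy" };
  return { strategy: null, tier: null };
}
```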
igormorais123 271cf37b8a feat(sse): add provider diversity scoring via Shannon entropy
Add a providerDiversity module that tracks provider usage distribution
using a rolling time window and calculates Shannon entropy normalized
to [0..1]. This enables the auto-combo scoring engine to factor in
provider diversity — boosting underrepresented providers to reduce
single-point-of-failure risk.

Key features:
- Rolling window with configurable size and TTL
- Shannon entropy calculation normalized to [0..1]
- Per-provider diversity boost for auto-combo integration
- Diversity report for dashboard display
- Full test coverage

Closes #788
2026-03-30 00:45:17 -03:00
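The normalized Shannon entropy named in 271cf37b8a is H = -Σ pᵢ ln pᵢ divided by its maximum ln(n). A minimal sketch follows, assuming the input is a per-provider request count for the current rolling window and that n is the number of providers seen in that window; the function name is hypothetical and the module's real normalization base may differ.

```typescript
// Hypothetical sketch: normalized Shannon entropy over provider usage counts.
function normalizedShannonEntropy(counts: Record<string, number>): number {
  const values = Object.keys(counts)
    .map((provider) => counts[provider])
    .filter((count) => count > 0);
  const total = values.reduce((sum, count) => sum + count, 0);

  // With zero or one active provider there is no diversity to measure.
  if (values.length <= 1 || total === 0) return 0;

  let entropy = 0;
  for (const count of values) {
    const p = count / total;
    entropy -= p * Math.log(p);
  }
  // Assumption: divide by ln(n) so a perfectly even split scores 1.
  return entropy / Math.log(values.length);
}
```

A score near 0 means traffic is concentrated on one provider, while a score near 1 means it is spread evenly, which is the signal a diversity boost would use to favor underrepresented providers.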
R.D. 179c03e79d Isolate cloudflared runtime environment 2026-03-29 22:30:07 -04:00
Diego Rodrigues de Sa e Souza 0a1b68639b Merge pull request #782 from diegosouzapw/release/v3.3.3
chore(release): v3.3.3 — UI bugfixes and AutoUpdate repairs
2026-03-29 22:51:52 -03:00
diegosouzapw d69e7ec850 chore(release): v3.3.3 — Core UI bugfixes and AutoUpdate repairs 2026-03-29 21:18:07 -03:00
diegosouzapw d0c172830c feat(ui): add AutoDisableCard to Resilience settings (#765) 2026-03-29 15:57:19 -03:00
oyi77 d5bf0d1199 fix: address reviewer comments for auto-disable (use getCachedSettings, immediate disable on permanent bans) 2026-03-30 01:47:28 +07:00
oyi77 82dd4aa403 feat: auto-disable banned accounts setting with UI toggle
Add a configurable setting to automatically disable provider accounts
that return permanent/terminal errors (403 banned, ToS violation, etc.)

Changes:
- open-sse/services/accountFallback.ts: extend ACCOUNT_DEACTIVATED_SIGNALS
  with AG-specific ban messages ('verify your account', 'service disabled
  for violation')
- src/app/api/settings/auto-disable-accounts/route.ts: new GET/PUT endpoint
  for the setting (enabled bool + threshold int)
- src/shared/validation/schemas.ts: updateAutoDisableAccountsSchema
- src/sse/services/auth.ts: in markAccountUnavailable(), capture result.permanent
  from checkFallbackError() and — when autoDisableBannedAccounts is enabled and
  backoffLevel >= threshold — set isActive=false on the connection

Default: disabled (backward-compatible). Enable via Settings UI or PUT
/api/settings/auto-disable-accounts { "enabled": true, "threshold": 3 }

Fixes: antigravity accounts with 403/Verify-your-account errors being
retried indefinitely in the rotation pool.

Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
2026-03-29 23:24:27 +07:00
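The disable condition described in 82dd4aa403 can be sketched as a single predicate. The type and function names are hypothetical, and treating `permanent` as a required condition is an assumption drawn from the commit's description of capturing `result.permanent`; the follow-up commit d5bf0d1199 mentions immediate disabling on permanent bans, which this sketch does not model.

```typescript
// Hypothetical sketch of the auto-disable decision.
interface AutoDisableSettings {
  enabled: boolean;  // default false, so existing deployments are unaffected
  threshold: number; // minimum backoff level before disabling
}

function shouldAutoDisable(
  settings: AutoDisableSettings,
  permanent: boolean,
  backoffLevel: number
): boolean {
  // Assumption: only permanent/terminal errors count toward disabling.
  return settings.enabled && permanent && backoffLevel >= settings.threshold;
}
```

When the predicate returns true, the caller would set `isActive=false` on the connection so it leaves the rotation pool instead of being retried.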
136 changed files with 5185 additions and 646 deletions
+14 -4
@@ -19,11 +19,21 @@ This workflow fetches all open issues from the project's GitHub repository, clas
### 2. Fetch All Open Issues
// turbo
// turbo-all
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 500 --json number,title,labels,body,comments,createdAt,author`
- Parse the JSON output to get a list of **all** open issues
- Sort by oldest first (FIFO)
**⚠️ CRITICAL**: The JSON output of `gh issue list` can be truncated by the tool, silently hiding issues. You MUST use the two-step approach below to guarantee **all** issues are fetched.
**Step 2a — Get Issue numbers only** (small output, never truncated):
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 500 --json number --jq '.[].number'`
- This outputs one issue number per line. Count them and confirm total.
**Step 2b — Fetch full metadata for each Issue** (one call per issue):
- For each issue number from step 2a, run:
`gh issue view <NUMBER> --repo <owner>/<repo> --json number,title,labels,body,comments,createdAt,author`
- You may batch these into parallel calls (up to 4 at a time).
- Sort by oldest first (FIFO).
### 3. Classify Each Issue
+22 -4
@@ -18,17 +18,35 @@ This workflow fetches all open PRs from the project's GitHub repository, perform
### 2. Fetch Open Pull Requests
// turbo
// turbo-all
**⚠️ CRITICAL**: The JSON output of `gh pr list` can be truncated by the tool, silently hiding PRs. You MUST use the two-step approach below to guarantee **all** PRs are fetched.
**Step 2a — Get PR numbers only** (small output, never truncated):
- Run: `gh pr list --repo <owner>/<repo> --state open --limit 500 --json number --jq '.[].number'`
- This outputs one PR number per line. Count them and confirm total.
**Step 2b — Fetch full metadata for each PR** (one call per PR):
- For each PR number from step 2a, run:
`gh pr view <NUMBER> --repo <owner>/<repo> --json number,title,author,headRefName,body,createdAt,additions,deletions,files`
- You may batch these into parallel calls (up to 4 at a time).
**Step 2c — Fetch diffs for each PR** (one call per PR, saved to /tmp):
- For each PR number, run:
`gh pr diff <NUMBER> --repo <owner>/<repo> > /tmp/pr<NUMBER>.diff`
- Then read each diff file with `view_file`.
- Run: `gh pr list --repo <owner>/<repo> --state open --limit 500 --json number,title,author,headRefName,body,createdAt,additions,deletions,files`
- This fetches **all** open PRs without restriction. Get the diff for each with:
`gh pr diff <NUMBER> --repo <owner>/<repo>`
- For each open PR, collect:
- PR number, title, author, branch, number of commits, date
- PR description/body
- Files changed (diff)
- Existing review comments (from bots or humans)
**Verification**: Confirm the count of PRs analyzed matches the count from step 2a before proceeding.
### 3. Analyze Each PR — For each open PR, perform the following analysis:
#### 3a. Feature Assessment
+16 -16
@@ -18,8 +18,8 @@ jobs:
name: Lint
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
@@ -36,8 +36,8 @@ jobs:
name: Security Audit
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
@@ -55,8 +55,8 @@ jobs:
matrix:
node-version: [20, 22]
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: ${{ matrix.node-version }}
cache: npm
@@ -74,8 +74,8 @@ jobs:
JWT_SECRET: ci-test-secret-with-sufficient-length-for-validation
API_KEY_SECRET: ci-test-api-key-secret-long
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: ${{ matrix.node-version }}
cache: npm
@@ -90,8 +90,8 @@ jobs:
JWT_SECRET: ci-test-secret-with-sufficient-length-for-validation
API_KEY_SECRET: ci-test-api-key-secret-long
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
@@ -109,8 +109,8 @@ jobs:
JWT_SECRET: ci-test-secret-with-sufficient-length-for-validation
API_KEY_SECRET: ci-test-api-key-secret-long
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
@@ -129,8 +129,8 @@ jobs:
INITIAL_PASSWORD: ci-test-password-for-integration
DATA_DIR: /tmp/omniroute-ci
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
@@ -145,8 +145,8 @@ jobs:
JWT_SECRET: ci-test-secret-with-sufficient-length-for-validation
API_KEY_SECRET: ci-test-api-key-secret-long
steps:
- uses: actions/checkout@v6
- uses: actions/setup-node@v6
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
+7 -7
@@ -25,24 +25,24 @@ jobs:
IMAGE_NAME: diegosouzapw/omniroute
steps:
- name: Checkout
uses: actions/checkout@v6
uses: actions/checkout@v4
with:
ref: ${{ github.event_name == 'workflow_dispatch' && format('refs/tags/v{0}', inputs.version) || '' }}
- name: Set up QEMU (for multi-arch builds)
uses: docker/setup-qemu-action@v4
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v4
uses: docker/setup-buildx-action@v3
- name: Login to Docker Hub
uses: docker/login-action@v4
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Login to GitHub Container Registry
uses: docker/login-action@v4
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
@@ -61,7 +61,7 @@ jobs:
echo "Publishing Docker image: $IMAGE_NAME:$VERSION"
- name: Build and push multi-arch image
uses: docker/build-push-action@v7
uses: docker/build-push-action@v6
with:
context: .
target: runner-base
@@ -83,7 +83,7 @@ jobs:
docker buildx imagetools inspect "${{ env.IMAGE_NAME }}:${{ steps.version.outputs.version }}"
- name: Update Docker Hub description
uses: peter-evans/dockerhub-description@v5
uses: peter-evans/dockerhub-description@v4
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
+9 -7
@@ -13,6 +13,8 @@ on:
permissions:
contents: write
id-token: write
packages: write
jobs:
validate:
@@ -22,7 +24,7 @@ jobs:
version: ${{ steps.validate.outputs.version }}
steps:
- name: Checkout code
uses: actions/checkout@v6
uses: actions/checkout@v4
with:
fetch-depth: 0
@@ -70,16 +72,16 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v6
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v6
uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
- name: Cache node_modules
uses: actions/cache@v5
uses: actions/cache@v4
with:
path: node_modules
key: ${{ runner.os }}-node-${{ hashFiles('package-lock.json') }}
@@ -146,7 +148,7 @@ jobs:
fi
- name: Upload artifacts
uses: actions/upload-artifact@v7
uses: actions/upload-artifact@v4
with:
name: electron-${{ matrix.platform }}
path: release-assets/
@@ -157,12 +159,12 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v6
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Download all artifacts
uses: actions/download-artifact@v8
uses: actions/download-artifact@v4
with:
path: release-assets
merge-multiple: true
+4 -4
@@ -43,10 +43,10 @@ jobs:
environment: NPM_TOKEN
steps:
- name: Checkout
uses: actions/checkout@v6
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v6
uses: actions/setup-node@v4
with:
node-version: 22
registry-url: https://registry.npmjs.org
@@ -111,11 +111,11 @@ jobs:
run: |
VERSION="${{ steps.resolve.outputs.version }}"
TAG="${{ steps.resolve.outputs.tag }}"
echo "Configuring for GitHub Packages..."
echo "//npm.pkg.github.com/:_authToken=${{ secrets.GITHUB_TOKEN }}" > .npmrc
npm pkg set name="@diegosouzapw/omniroute"
if [ "$TAG" = "latest" ]; then
npm publish --registry=https://npm.pkg.github.com || echo "⚠️ Version ${VERSION} might already be published on GitHub."
else
+50
@@ -4,6 +4,56 @@
---
## [3.3.5] - 2026-03-30
### ✨ New Features
- **Gemini Quota Tracking:** Added real-time Gemini CLI quota tracking via the `retrieveUserQuota` API (PR #825)
- **Cache Dashboard:** Enhanced the Cache Dashboard to display prompt cache metrics, 24h trends, and estimated cost savings (PR #824)
### 🐛 Bug Fixes
- **Token Accounting:** Included prompt cache tokens in historical usage input calculations so that quota deductions are correct (PR #822)
- **User Experience:** Removed the intrusive auto-opening OAuth modal on provider detail pages that have no connections (PR #820)
- **Dependency Updates:** Bumped and locked down dependencies for development and production trees including Next.js 16.2.1, Recharts, and TailwindCSS 4.2.2 (PR #826, #827)
---
## [3.3.4] - 2026-03-30
### ✨ New Features
- **A2A Workflows:** Added deterministic FSM orchestrator for multi-step agent workflows.
- **Graceful Degradation:** Added a new multi-layer fallback framework to preserve core functionality during partial system outages.
- **Config Audit:** Added an audit trail with diff detection to track changes and enable configuration rollbacks.
- **Provider Health:** Added provider expiration tracking with proactive UI alerts for expiring API keys.
- **Adaptive Routing:** Added an adaptive volume and complexity detector to override routing strategies dynamically based on load.
- **Provider Diversity:** Implemented provider diversity scoring via Shannon entropy to improve load distribution.
- **Auto-Disable Bounds:** Added an Auto-Disable Banned Accounts setting toggle to the Resilience dashboard.
### 🐛 Bug Fixes
- **Codex & Claude Compatibility:** Fixed UI fallbacks, patched Codex non-streaming integration issues, and resolved CLI runtime detection on Windows.
- **Release Automation:** Expanded permissions required for the Electron App build in GitHub Actions.
- **Cloudflare Runtime:** Isolated the cloudflared runtime environment to fix quick tunnel startup in Docker.
### 🧪 Tests
- **Test Suite Updates:** Expanded test coverage for volume detectors, provider diversity, configuration audit, and FSM.
---
## [3.3.3] - 2026-03-29
### 🐛 Bug Fixes
- **CI/CD Reliability:** Patched GitHub Actions to stable dependency versions (`actions/checkout@v4`, `actions/upload-artifact@v4`) to mitigate unannounced builder environment deprecations.
- **Image Fallbacks:** Replaced arbitrary fallback chains in `ProviderIcon.tsx` with explicit asset validation so the UI no longer loads `<Image>` components for files that don't exist, eliminating `404` errors in dashboard console logs (#745).
- **Admin Updater:** Added dynamic source-installation detection to the dashboard Updater. The `Update Now` button is now disabled when OmniRoute is built locally rather than installed through npm, and the user is prompted to run `git pull` instead (#743).
- **Update ERESOLVE Error:** Injected `package.json` overrides for `react`/`react-dom` and enabled `--legacy-peer-deps` within the internal automatic updater scripts to resolve breaking dependency tree conflicts with `@lobehub/ui`.
---
## [3.3.2] - 2026-03-29
### ✨ New Features
@@ -1218,6 +1218,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1248,6 +1251,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
@@ -66,8 +66,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
-- **Resilience** — Rate limit persistence, circuit breaker tuning
-- **Advanced** — Configuration overrides
+- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
+- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
-- **Resilience** — Rate limit persistence, circuit breaker tuning
-- **Advanced** — Configuration overrides
+- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
+- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
+| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
+| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
+| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
+| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
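
The **Provider Diversity** row mentions Shannon entropy scoring without showing what such a score looks like. The sketch below is one plausible reading, not code from the OmniRoute source: the function name `diversityScore` and the input shape (request counts keyed by provider) are assumptions made for illustration. It computes the entropy of the traffic distribution and normalizes it by the maximum possible entropy for the pool, so the result falls between 0 and 1.

```typescript
// Hypothetical helper, not taken from the OmniRoute codebase.
// Input: request counts per provider in a combo (assumed shape).
// Output: normalized Shannon entropy in [0, 1].
function diversityScore(requestCounts: Record<string, number>): number {
  // Providers with zero traffic contribute nothing to the entropy sum.
  const counts = Object.values(requestCounts).filter((c) => c > 0);
  const total = counts.reduce((sum, c) => sum + c, 0);

  // With at most one active provider there is no diversity to measure.
  if (counts.length <= 1 || total === 0) return 0;

  let entropy = 0;
  for (const c of counts) {
    const p = c / total;
    entropy -= p * Math.log2(p);
  }

  // Normalize by the maximum entropy for the whole pool, so providers
  // that receive no traffic pull the score down.
  const poolSize = Object.keys(requestCounts).length;
  return entropy / Math.log2(poolSize);
}
```

Under these assumptions, an even split across all providers scores 1 and traffic pinned to a single provider scores 0. A router could compare this value against a threshold to decide when to rebalance, though how OmniRoute actually uses the score is not stated in the diff.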
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
View File
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
+2 -2
View File
@@ -68,8 +68,8 @@ Comprehensive settings panel with tabs:
- **Appearance** — Theme selector (dark/light/system), color theme presets and custom colors, health log visibility, sidebar item visibility controls
- **Security** — API endpoint protection, custom provider blocking, IP filtering, session info
- **Routing** — Model aliases, background task degradation
- **Resilience** — Rate limit persistence, circuit breaker tuning
- **Advanced** — Configuration overrides
- **Resilience** — Rate limit persistence, circuit breaker tuning, auto-disable banned accounts, provider expiration monitoring
- **Advanced** — Configuration overrides, configuration audit trail, fallback degradation mode
![Settings Dashboard](screenshots/06-settings.png)
+7
@@ -1222,6 +1222,9 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔀 **Model Aliases** | Built-in + custom model aliasing and migration safety |
| ⚡ **Background Degradation** | Route low-priority background tasks to cheaper models |
| 🧪 **Task-Aware Smart Routing** | Auto-select model by content type (coding/vision/analysis/summarization) |
| 🔄 **A2A Agent Workflows** | Deterministic FSM orchestrator for stateful multi-step agent executions |
| 🔀 **Adaptive Routing** | Dynamic strategy override based on token volume and prompt complexity |
| 🎲 **Provider Diversity** | Shannon entropy scoring balancing auto-combo traffic distribution |
| 💬 **System Prompt Injection** | Global behavior controls applied consistently |
| 📄 **Responses API Compatibility** | Full `/v1/responses` support for Codex and advanced agentic workflows |
@@ -1252,6 +1255,10 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
| 🔏 **CLI Fingerprint Matching** | Matches native CLI request signatures — **reduces ban risk while preserving proxy IP** |
| 🌐 **IP Filtering** | Allowlist/blocklist control for exposed deployments |
| 📊 **Editable Rate Limits** | Configurable global/provider-level limits with persistence |
| 📉 **Graceful Degradation** | Multi-layer capability fallbacks protecting core gateway operations |
| 📜 **Config Audit Trail** | Diff-based change tracking preventing operational drift with simple rollbacks |
| ⏳ **Provider Health Sync** | Proactive token expiration monitoring triggering alerts before authorization failures |
| 🚪 **Auto-Disable Banned Accounts** | Operational circuit breaker sealing permanently blocked token accounts automatically |
| 🔑 **API Key Management + Scoping** | Secure key issuance/rotation and model/provider controls |
| 👁️ **Scoped API Key Reveal** 🆕 | Opt-in recovery of API keys via `ALLOW_API_KEY_REVEAL` |
| 🛡️ **Protected `/models`** | Optional auth gating and provider hiding for model catalog |
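Note on the "Provider Diversity" row: the README only says that Shannon entropy scoring balances auto-combo traffic. The sketch below shows one way such a score could be computed; it is not taken from the OmniRoute codebase. The function name `diversityScore`, the per-provider request-count input, and the normalization by `log2(n)` are all assumptions made for illustration.

```typescript
// Hypothetical helper, not part of OmniRoute.
// Returns normalized Shannon entropy in [0, 1] for per-provider request counts:
// 0 means all traffic goes to one provider, 1 means a perfectly even split.
function diversityScore(counts: Record<string, number>): number {
  const all = Object.values(counts);
  const total = all.reduce((sum, c) => sum + Math.max(c, 0), 0);
  // With fewer than two providers, or no traffic, there is nothing to balance.
  if (all.length <= 1 || total === 0) return 0;

  let entropy = 0;
  for (const c of all) {
    if (c <= 0) continue; // a provider with no traffic contributes 0 to the sum
    const p = c / total;
    entropy -= p * Math.log2(p);
  }
  // Divide by the maximum possible entropy so pools of different sizes compare fairly.
  return entropy / Math.log2(all.length);
}
```

A router could, under these assumptions, prefer the candidate whose selection raises the score, which nudges traffic away from a provider that is already dominant.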
+1 -1
@@ -1,7 +1,7 @@
openapi: 3.1.0
info:
title: OmniRoute API
version: 3.3.2
version: 3.3.5
description: |
OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
endpoint that routes requests to multiple AI providers with load balancing,
+4 -9
@@ -226,23 +226,18 @@ export const REGISTRY: Record<string, RegistryEntry> = {
oauth: {
clientIdEnv: "GEMINI_CLI_OAUTH_CLIENT_ID",
clientIdDefault: "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com",
clientSecretEnv: "GEMINI_CLI_OAUTH_CLIENT_SECRET",
clientSecretEnv: "GEMINI_OAUTH_CLIENT_SECRET",
clientSecretDefault: "",
},
models: [
{ id: "gemini-3.1-pro-high", name: "Gemini 3.1 Pro High" },
{ id: "gemini-3.1-pro-low", name: "Gemini 3.1 Pro Low" },
{ id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
{ id: "gemini-3-1-pro", name: "Gemini 3.1 Pro (Alt ID)" },
{ id: "gemini-3-pro-preview", name: "Gemini 3 Pro Preview" },
{ id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" },
{ id: "gemini-3.1-flash-lite-preview", name: "Gemini 3.1 Flash Lite Preview" },
{ id: "gemini-3.1-pro-preview-customtools", name: "Gemini 3.1 Pro Preview Custom Tools" },
{ id: "gemini-3-flash-preview", name: "Gemini 3 Flash Preview" },
{ id: "gemini-3.1-flash-lite-preview", name: "Gemini 3.1 Flash Lite Preview" },
{ id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
{ id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
{ id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
{ id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
{ id: "gemini-1.5-pro", name: "Gemini 1.5 Pro" },
{ id: "gemini-1.5-flash", name: "Gemini 1.5 Flash" },
],
},
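Note on the `clientSecretEnv` rename: reading the pair of lines as old-then-new, the secret is now looked up under `GEMINI_OAUTH_CLIENT_SECRET`. The sketch below shows how a registry entry of this shape could be resolved against the environment. The `OAuthConfig` interface mirrors the four fields visible in the hunk; `resolveOAuthCredentials` and the empty-string handling are hypothetical and are not the project's actual resolver.

```typescript
// Field names mirror the `oauth` block in the hunk above; the resolver is hypothetical.
interface OAuthConfig {
  clientIdEnv: string;
  clientIdDefault: string;
  clientSecretEnv: string;
  clientSecretDefault: string;
}

function resolveOAuthCredentials(
  oauth: OAuthConfig,
  env: Record<string, string | undefined>
): { clientId: string; clientSecret: string } {
  // Assumption: an empty environment value counts as unset, so the default applies.
  const pick = (name: string, fallback: string): string => {
    const value = env[name];
    return value !== undefined && value !== "" ? value : fallback;
  };
  return {
    clientId: pick(oauth.clientIdEnv, oauth.clientIdDefault),
    clientSecret: pick(oauth.clientSecretEnv, oauth.clientSecretDefault),
  };
}
```

Under this sketch, a deployment that still exports only the old variable name would silently fall back to the empty `clientSecretDefault`, so the rename may be worth calling out in the release notes.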
+232 -164
@@ -77,11 +77,13 @@ export function translateNonStreamingResponse(
sourceFormat: string,
toolNameMap?: Map<string, string> | null
): unknown {
// If already in source format (usually OpenAI), return as-is
if (targetFormat === sourceFormat || targetFormat === FORMATS.OPENAI) {
// If already in source format, return as-is
if (targetFormat === sourceFormat) {
return responseBody;
}
let intermediateOpenAI = responseBody;
// Handle OpenAI Responses API format
if (targetFormat === FORMATS.OPENAI_RESPONSES) {
const responseRoot = toRecord(responseBody);
@@ -126,7 +128,7 @@ export function translateNonStreamingResponse(
? itemObj.arguments
: JSON.stringify(itemObj.arguments || {});
const rawName = toString(itemObj.name);
// Strip Claude OAuth proxy_ prefix using toolNameMap (mirrors tool_use fix for #605)
// Strip Claude OAuth proxy_ prefix using toolNameMap
const resolvedName = toolNameMap?.get(rawName) ?? rawName;
toolCalls.push({
id: callId,
@@ -149,7 +151,7 @@ export function translateNonStreamingResponse(
if (toolCalls.length > 0) {
message.tool_calls = toolCalls;
}
if (!message.content && !message.tool_calls) {
if (message.content === undefined) {
message.content = "";
}
@@ -212,11 +214,11 @@ export function translateNonStreamingResponse(
}
}
return result;
intermediateOpenAI = result;
}
// Handle Gemini/Antigravity format
if (
else if (
targetFormat === FORMATS.GEMINI ||
targetFormat === FORMATS.ANTIGRAVITY ||
targetFormat === FORMATS.GEMINI_CLI
@@ -224,183 +226,249 @@ export function translateNonStreamingResponse(
const root = toRecord(responseBody);
const response = toRecord(root.response ?? root);
const candidates = Array.isArray(response.candidates) ? response.candidates : [];
if (!candidates[0]) {
return responseBody; // Can't translate, return raw
if (candidates[0]) {
const candidate = toRecord(candidates[0]);
const content = toRecord(candidate.content);
const usage = toRecord(response.usageMetadata ?? root.usageMetadata);
let textContent = "";
const toolCalls: JsonRecord[] = [];
let reasoningContent = "";
if (Array.isArray(content.parts)) {
for (const part of content.parts) {
const partObj = toRecord(part);
if (partObj.thought === true && typeof partObj.text === "string") {
reasoningContent += partObj.text;
} else if (typeof partObj.text === "string") {
textContent += partObj.text;
}
if (partObj.functionCall) {
const fn = toRecord(partObj.functionCall);
toolCalls.push({
id: `call_${toString(fn.name, "unknown")}_${Date.now()}_${toolCalls.length}`,
type: "function",
function: {
name: toString(fn.name),
arguments: JSON.stringify(fn.args || {}),
},
});
}
}
}
const message: JsonRecord = { role: "assistant" };
if (textContent) {
message.content = textContent;
}
if (reasoningContent) {
message.reasoning_content = reasoningContent;
}
if (toolCalls.length > 0) {
message.tool_calls = toolCalls;
}
if (!message.content && !message.tool_calls) {
message.content = "";
}
let finishReason = toString(candidate.finishReason, "stop").toLowerCase();
if (finishReason === "stop" && toolCalls.length > 0) {
finishReason = "tool_calls";
}
const createdMs = Date.parse(toString(response.createTime));
const created = Number.isFinite(createdMs)
? Math.floor(createdMs / 1000)
: Math.floor(Date.now() / 1000);
const result: JsonRecord = {
id: `chatcmpl-${toString(response.responseId, String(Date.now()))}`,
object: "chat.completion",
created,
model: toString(response.modelVersion, "gemini"),
choices: [
{
index: 0,
message,
finish_reason: finishReason,
},
],
};
if (Object.keys(usage).length > 0) {
result.usage = {
prompt_tokens:
toNumber(usage.promptTokenCount, 0) + toNumber(usage.thoughtsTokenCount, 0),
completion_tokens: toNumber(usage.candidatesTokenCount, 0),
total_tokens: toNumber(usage.totalTokenCount, 0),
};
if (toNumber(usage.thoughtsTokenCount, 0) > 0) {
(result.usage as JsonRecord).completion_tokens_details = {
reasoning_tokens: toNumber(usage.thoughtsTokenCount, 0),
};
}
}
intermediateOpenAI = result;
}
}
const candidate = toRecord(candidates[0]);
const content = toRecord(candidate.content);
const usage = toRecord(response.usageMetadata ?? root.usageMetadata);
// Handle Claude format
else if (targetFormat === FORMATS.CLAUDE) {
const root = toRecord(responseBody);
const contentBlocks = Array.isArray(root.content) ? root.content : [];
if (contentBlocks.length > 0) {
let textContent = "";
let thinkingContent = "";
const toolCalls: JsonRecord[] = [];
// Build message content
let textContent = "";
const toolCalls: JsonRecord[] = [];
let reasoningContent = "";
if (Array.isArray(content.parts)) {
for (const part of content.parts) {
const partObj = toRecord(part);
// Handle thinking/reasoning
if (partObj.thought === true && typeof partObj.text === "string") {
reasoningContent += partObj.text;
}
// Regular text
else if (typeof partObj.text === "string") {
textContent += partObj.text;
}
// Function calls
if (partObj.functionCall) {
const fn = toRecord(partObj.functionCall);
for (const block of contentBlocks) {
const blockObj = toRecord(block);
if (blockObj.type === "text") {
textContent += toString(blockObj.text);
} else if (blockObj.type === "thinking") {
thinkingContent += toString(blockObj.thinking);
} else if (blockObj.type === "tool_use") {
const rawName = toString(blockObj.name);
const strippedName = toolNameMap?.get(rawName) ?? rawName;
toolCalls.push({
id: `call_${toString(fn.name, "unknown")}_${Date.now()}_${toolCalls.length}`,
id: toString(blockObj.id, `call_${Date.now()}_${toolCalls.length}`),
type: "function",
function: {
name: toString(fn.name),
arguments: JSON.stringify(fn.args || {}),
name: strippedName,
arguments: JSON.stringify(blockObj.input || {}),
},
});
}
}
}
// Build OpenAI format message
const message: JsonRecord = { role: "assistant" };
if (textContent) {
message.content = textContent;
}
if (reasoningContent) {
message.reasoning_content = reasoningContent;
}
if (toolCalls.length > 0) {
message.tool_calls = toolCalls;
}
// If no content at all, set content to empty string
if (!message.content && !message.tool_calls) {
message.content = "";
}
const message: JsonRecord = { role: "assistant" };
if (textContent) {
message.content = textContent;
}
if (thinkingContent) {
message.reasoning_content = thinkingContent;
}
if (toolCalls.length > 0) {
message.tool_calls = toolCalls;
}
if (message.content === undefined) {
message.content = "";
}
// Determine finish reason
let finishReason = toString(candidate.finishReason, "stop").toLowerCase();
if (finishReason === "stop" && toolCalls.length > 0) {
finishReason = "tool_calls";
}
let finishReason = toString(root.stop_reason, "stop");
if (finishReason === "end_turn") finishReason = "stop";
if (finishReason === "tool_use") finishReason = "tool_calls";
const createdMs = Date.parse(toString(response.createTime));
const created = Number.isFinite(createdMs)
? Math.floor(createdMs / 1000)
: Math.floor(Date.now() / 1000);
const result: JsonRecord = {
id: `chatcmpl-${toString(response.responseId, String(Date.now()))}`,
object: "chat.completion",
created,
model: toString(response.modelVersion, "gemini"),
choices: [
{
index: 0,
message,
finish_reason: finishReason,
},
],
};
// Add usage if available (match streaming translator: add thoughtsTokenCount to prompt_tokens)
if (Object.keys(usage).length > 0) {
result.usage = {
prompt_tokens: toNumber(usage.promptTokenCount, 0) + toNumber(usage.thoughtsTokenCount, 0),
completion_tokens: toNumber(usage.candidatesTokenCount, 0),
total_tokens: toNumber(usage.totalTokenCount, 0),
const result: JsonRecord = {
id: `chatcmpl-${toString(root.id, String(Date.now()))}`,
object: "chat.completion",
created: Math.floor(Date.now() / 1000),
model: toString(root.model, "claude"),
choices: [
{
index: 0,
message,
finish_reason: finishReason,
},
],
};
if (toNumber(usage.thoughtsTokenCount, 0) > 0) {
(result.usage as JsonRecord).completion_tokens_details = {
reasoning_tokens: toNumber(usage.thoughtsTokenCount, 0),
const usage = toRecord(root.usage);
if (Object.keys(usage).length > 0) {
const promptTokens = toNumber(usage.input_tokens, 0);
const completionTokens = toNumber(usage.output_tokens, 0);
result.usage = {
prompt_tokens: promptTokens,
completion_tokens: completionTokens,
total_tokens: promptTokens + completionTokens,
};
}
}
return result;
intermediateOpenAI = result;
}
}
// Handle Claude format
if (targetFormat === FORMATS.CLAUDE) {
const root = toRecord(responseBody);
const contentBlocks = Array.isArray(root.content) ? root.content : [];
if (contentBlocks.length === 0) {
return responseBody; // Can't translate, return raw
}
let textContent = "";
let thinkingContent = "";
const toolCalls: JsonRecord[] = [];
for (const block of contentBlocks) {
const blockObj = toRecord(block);
if (blockObj.type === "text") {
textContent += toString(blockObj.text);
} else if (blockObj.type === "thinking") {
thinkingContent += toString(blockObj.thinking);
} else if (blockObj.type === "tool_use") {
// Strip Claude OAuth tool name prefix (proxy_) using the map from request translation.
// Fallback to raw name if block wasn't prefixed (disableToolPrefix path).
const rawName = toString(blockObj.name);
const strippedName = toolNameMap?.get(rawName) ?? rawName;
toolCalls.push({
id: toString(blockObj.id, `call_${Date.now()}_${toolCalls.length}`),
type: "function",
function: {
name: strippedName,
arguments: JSON.stringify(blockObj.input || {}),
},
});
}
}
const message: JsonRecord = { role: "assistant" };
if (textContent) {
message.content = textContent;
}
if (thinkingContent) {
message.reasoning_content = thinkingContent;
}
if (toolCalls.length > 0) {
message.tool_calls = toolCalls;
}
if (!message.content && !message.tool_calls) {
message.content = "";
}
let finishReason = toString(root.stop_reason, "stop");
if (finishReason === "end_turn") finishReason = "stop";
if (finishReason === "tool_use") finishReason = "tool_calls";
const result: JsonRecord = {
id: `chatcmpl-${toString(root.id, String(Date.now()))}`,
object: "chat.completion",
created: Math.floor(Date.now() / 1000),
model: toString(root.model, "claude"),
choices: [
{
index: 0,
message,
finish_reason: finishReason,
},
],
};
const usage = toRecord(root.usage);
if (Object.keys(usage).length > 0) {
const promptTokens = toNumber(usage.input_tokens, 0);
const completionTokens = toNumber(usage.output_tokens, 0);
result.usage = {
prompt_tokens: promptTokens,
completion_tokens: completionTokens,
total_tokens: promptTokens + completionTokens,
};
}
return result;
// Phase 3: Translate from OpenAI back to Client Source format
if (sourceFormat === FORMATS.CLAUDE && sourceFormat !== targetFormat) {
return convertOpenAINonStreamingToClaude(toRecord(intermediateOpenAI));
}
// Unknown format, return as-is
return responseBody;
// Return intermediateOpenAI: the raw response when targetFormat is unknown or could not be translated, otherwise an OpenAI-compatible payload
return intermediateOpenAI;
}
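Note on the restructuring: the function now pivots through OpenAI. Each provider branch writes into `intermediateOpenAI`, and a final step converts to Claude when the client's source format is Claude. The finish-reason handling is the easiest part to check in isolation, so here is a minimal sketch of it. The mappings are copied from the hunks above; the three function names are hypothetical and do not exist in the diff.

```typescript
// Mappings copied from the diff; function names are hypothetical.

// Claude stop_reason -> OpenAI finish_reason (other values pass through, as in the diff).
function claudeStopToOpenAI(stopReason: string): string {
  if (stopReason === "end_turn") return "stop";
  if (stopReason === "tool_use") return "tool_calls";
  return stopReason;
}

// OpenAI finish_reason -> Claude stop_reason (inverse of the mapping above).
function openAIFinishToClaude(finishReason: string): string {
  if (finishReason === "stop") return "end_turn";
  if (finishReason === "tool_calls") return "tool_use";
  return finishReason;
}

// Gemini finishReason -> OpenAI finish_reason: lower-case it, then promote
// "stop" to "tool_calls" when the candidate produced function calls.
function geminiFinishToOpenAI(finishReason: string, toolCallCount: number): string {
  const lowered = finishReason.toLowerCase();
  return lowered === "stop" && toolCallCount > 0 ? "tool_calls" : lowered;
}
```

For a Gemini response with one function call that is headed to a Claude client, the two steps compose as `openAIFinishToClaude(geminiFinishToOpenAI("STOP", 1))`, which yields `"tool_use"`. Reasons the diff does not map explicitly are passed through unchanged in both directions.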
/**
* Helper to convert an OpenAI chat.completion JSON object to Claude format for non-streaming.
*/
function convertOpenAINonStreamingToClaude(openaiResponse: JsonRecord): JsonRecord {
const choice = Array.isArray(openaiResponse.choices) ? openaiResponse.choices[0] : null;
if (!choice) return openaiResponse; // If it doesn't look like OpenAI, return as-is
const choiceObj = toRecord(choice);
const messageObj = toRecord(choiceObj.message);
const content = [];
let hasTextOrReasoning = false;
if (messageObj.reasoning_content) {
hasTextOrReasoning = true;
content.push({
type: "thinking",
thinking: toString(messageObj.reasoning_content),
});
}
// Include a text block whenever content is present (even an empty string); otherwise fall back to an empty text block when no reasoning block was emitted
const hasToolCalls = Array.isArray(messageObj.tool_calls) && messageObj.tool_calls.length > 0;
if (messageObj.content !== undefined && messageObj.content !== null) {
hasTextOrReasoning = true;
content.push({
type: "text",
text: toString(messageObj.content),
});
} else if (!hasTextOrReasoning) {
// Claude format expects a text block even before tool calls (or if empty)
content.push({
type: "text",
text: "",
});
}
if (Array.isArray(messageObj.tool_calls)) {
for (const tool of messageObj.tool_calls) {
const toolObj = toRecord(tool);
const fn = toRecord(toolObj.function);
content.push({
type: "tool_use",
id: toString(toolObj.id, `call_${Date.now()}_${content.length}`), // index suffix keeps fallback ids unique within one millisecond
name: toString(fn.name),
input:
typeof fn.arguments === "string" ? JSON.parse(fn.arguments || "{}") : fn.arguments || {},
});
}
}
let stopReason = toString(choiceObj.finish_reason, "end_turn");
if (stopReason === "stop") stopReason = "end_turn";
if (stopReason === "tool_calls") stopReason = "tool_use";
const usageSrc = toRecord(openaiResponse.usage);
const claudeResponse: JsonRecord = {
id: toString(openaiResponse.id, `msg_${Date.now()}`),
type: "message",
role: "assistant",
model: toString(openaiResponse.model, "claude"),
content,
stop_reason: stopReason,
stop_sequence: null,
usage: {
input_tokens: toNumber(usageSrc.prompt_tokens, 0),
output_tokens: toNumber(usageSrc.completion_tokens, 0),
},
};
return claudeResponse;
}
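One thing to watch in `convertOpenAINonStreamingToClaude`: `JSON.parse(fn.arguments || "{}")` throws when an upstream returns malformed or truncated JSON in `arguments`. The sketch below shows one tolerant way to parse it. `parseToolArguments` is a hypothetical helper that is not part of this diff, and falling back to `{}` is an assumption about the desired behavior.

```typescript
type JsonRecord = Record<string, unknown>;

// Hypothetical helper: parse OpenAI-style tool-call arguments without throwing.
function parseToolArguments(args: unknown): JsonRecord {
  if (typeof args === "string") {
    if (args.trim() === "") return {};
    try {
      const parsed: unknown = JSON.parse(args);
      // Only plain objects are usable as tool input; arrays and primitives fall back to {}.
      return parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)
        ? (parsed as JsonRecord)
        : {};
    } catch {
      // Assumption: malformed JSON degrades to an empty input object.
      return {};
    }
  }
  if (args !== null && typeof args === "object" && !Array.isArray(args)) {
    return args as JsonRecord;
  }
  return {};
}
```

With a helper like this, the `input:` field could be built from `parseToolArguments(fn.arguments)` and a single bad tool call would no longer fail the whole response translation.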
+6
@@ -60,6 +60,12 @@ export {
getSessionSnapshotInput,
getSessionSnapshotOutput,
getSessionSnapshotTool,
cacheStatsInput,
cacheStatsOutput,
cacheStatsTool,
cacheFlushInput,
cacheFlushOutput,
cacheFlushTool,
} from "./tools.ts";
// A2A schemas
+65 -2
@@ -806,11 +806,73 @@ export const syncPricingTool: McpToolDefinition<typeof syncPricingInput, typeof
sourceEndpoints: ["/api/pricing/sync"],
};
// ============ Cache Tools ============
export const cacheStatsInput = z.object({}).describe("No parameters required");
export const cacheStatsOutput = z.object({
semanticCache: z.object({
memoryEntries: z.number(),
dbEntries: z.number(),
hits: z.number(),
misses: z.number(),
hitRate: z.string(),
tokensSaved: z.number(),
}),
promptCache: z
.object({
totalRequests: z.number(),
requestsWithCacheControl: z.number(),
totalCachedTokens: z.number(),
totalCacheCreationTokens: z.number(),
estimatedCostSaved: z.number(),
})
.nullable(),
idempotency: z.object({
activeKeys: z.number(),
windowMs: z.number(),
}),
});
export const cacheStatsTool: McpToolDefinition<typeof cacheStatsInput, typeof cacheStatsOutput> = {
name: "omniroute_cache_stats",
description:
"Returns cache statistics including semantic cache hit rate, prompt cache metrics by provider, and idempotency layer stats.",
inputSchema: cacheStatsInput,
outputSchema: cacheStatsOutput,
scopes: ["read:cache"],
auditLevel: "basic",
phase: 2,
sourceEndpoints: ["/api/cache"],
};
export const cacheFlushInput = z.object({
signature: z.string().optional().describe("Specific cache signature to invalidate"),
model: z.string().optional().describe("Invalidate all entries for a specific model"),
});
export const cacheFlushOutput = z.object({
ok: z.boolean(),
invalidated: z.number().optional(),
scope: z.string().optional(),
});
export const cacheFlushTool: McpToolDefinition<typeof cacheFlushInput, typeof cacheFlushOutput> = {
name: "omniroute_cache_flush",
description:
"Flush cache entries. Provide signature to invalidate a single entry, model to invalidate all entries for a model, or omit both to clear all.",
inputSchema: cacheFlushInput,
outputSchema: cacheFlushOutput,
scopes: ["write:cache"],
auditLevel: "full",
phase: 2,
sourceEndpoints: ["/api/cache"],
};
// ============ Tool Registry ============
/** All MCP tool definitions, ordered by phase then name */
export const MCP_TOOLS = [
// Phase 1: Essential
getHealthTool,
listCombosTool,
getComboMetricsTool,
@@ -819,7 +881,6 @@ export const MCP_TOOLS = [
routeRequestTool,
costReportTool,
listModelsCatalogTool,
// Phase 2: Advanced
simulateRouteTool,
setBudgetGuardTool,
setRoutingStrategyTool,
@@ -830,6 +891,8 @@ export const MCP_TOOLS = [
explainRouteTool,
getSessionSnapshotTool,
syncPricingTool,
cacheStatsTool,
cacheFlushTool,
] as const;
/** Essential tools only (Phase 1) */
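The `omniroute_cache_flush` description above implies a three-way choice driven by which optional inputs are present. A minimal sketch of that decision, assuming `signature` takes precedence when both fields are supplied (the diff does not say) and using hypothetical scope labels:

```typescript
interface CacheFlushInput {
  signature?: string;
  model?: string;
}

type FlushScope =
  | { scope: "signature"; signature: string }
  | { scope: "model"; model: string }
  | { scope: "all" };

// Hypothetical helper: decide what a flush request targets.
// Assumption: a signature, when present, wins over a model filter.
function resolveFlushScope(input: CacheFlushInput): FlushScope {
  if (input.signature) return { scope: "signature", signature: input.signature };
  if (input.model) return { scope: "model", model: input.model };
  return { scope: "all" };
}
```

Because omitting both fields clears everything, a caller that builds the input dynamically may want to check for empty strings before invoking the tool.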
@@ -0,0 +1,118 @@
import { describe, it } from "node:test";
import assert from "node:assert/strict";
import { detectVolumeSignals, recommendStrategyOverride } from "../volumeDetector";
describe("volumeDetector", async () => {
describe("detectVolumeSignals", async () => {
it("detects simple single-message request", async () => {
const body = {
messages: [{ role: "user", content: "Hello" }],
};
const signals = detectVolumeSignals(body);
assert.equal(signals.batchSize, 1);
assert.ok(signals.estimatedTokens < 100);
assert.equal(signals.toolCount, 0);
assert.equal(signals.hasBrowser, false);
assert.equal(signals.complexity, "trivial");
});
it("detects tool-heavy request as critical complexity", async () => {
const body = {
messages: [{ role: "user", content: "Deploy the app to production" }],
tools: [
{ type: "function", function: { name: "run_command" } },
{ type: "function", function: { name: "read_file" } },
{ type: "function", function: { name: "write_file" } },
{ type: "function", function: { name: "browser_action" } },
],
};
const signals = detectVolumeSignals(body);
assert.equal(signals.toolCount, 4);
assert.equal(signals.complexity, "critical");
});
it("detects browser keywords", async () => {
const body = {
messages: [{ role: "user", content: "Navigate to the page and take a screenshot" }],
};
const signals = detectVolumeSignals(body);
assert.equal(signals.hasBrowser, true);
});
it("detects batch from multi-part content", async () => {
const parts = Array.from({ length: 20 }, (_, i) => ({
type: "text",
text: `Item ${i}`,
}));
const body = {
messages: [{ role: "user", content: parts }],
};
const signals = detectVolumeSignals(body);
assert.equal(signals.batchSize, 20);
});
it("detects security keywords as high complexity", async () => {
const body = {
messages: [{ role: "user", content: "Refactor the authentication module for production" }],
};
const signals = detectVolumeSignals(body);
assert.ok(
signals.complexity === "critical" || signals.complexity === "high",
`expected critical or high, got ${signals.complexity}`
);
});
});
describe("recommendStrategyOverride", async () => {
it("recommends round-robin for large batches", async () => {
const signals = detectVolumeSignals({ input: Array(60).fill("item") });
const override = await recommendStrategyOverride(signals, "priority");
assert.equal(override.shouldOverride, true);
assert.equal(override.strategy, "round-robin");
assert.equal(override.preferEconomy, true);
});
it("recommends premium-first for browser tasks", async () => {
const signals = {
batchSize: 1,
estimatedTokens: 500,
toolCount: 2,
hasBrowser: true,
hasImages: false,
complexity: "high" as const,
};
const override = await recommendStrategyOverride(signals, "round-robin");
assert.equal(override.shouldOverride, true);
assert.equal(override.strategy, "priority");
assert.equal(override.forcePremium, true);
});
it("flags economy for tiny requests without changing strategy", async () => {
const signals = {
batchSize: 1,
estimatedTokens: 100,
toolCount: 0,
hasBrowser: false,
hasImages: false,
complexity: "trivial" as const,
};
const override = await recommendStrategyOverride(signals, "priority");
assert.equal(override.shouldOverride, false);
assert.equal(override.preferEconomy, true);
});
it("no override for normal medium requests", async () => {
const signals = {
batchSize: 1,
estimatedTokens: 1000,
toolCount: 0,
hasBrowser: false,
hasImages: false,
complexity: "low" as const,
};
const override = await recommendStrategyOverride(signals, "priority");
assert.equal(override.shouldOverride, false);
assert.equal(override.preferEconomy, false);
});
});
});
@@ -0,0 +1,125 @@
import { describe, it, beforeEach } from "node:test";
import assert from "node:assert/strict";
import {
recordProviderUsage,
calculateDiversityScore,
getProviderDiversityBoost,
getDiversityReport,
resetDiversity,
configureDiversity,
} from "../providerDiversity";
describe("providerDiversity", () => {
beforeEach(() => {
resetDiversity();
});
describe("calculateDiversityScore", () => {
it("returns 1.0 when no data is recorded", () => {
assert.equal(calculateDiversityScore(), 1.0);
});
it("returns 0.0 when all requests go to one provider", () => {
for (let i = 0; i < 20; i++) {
recordProviderUsage("claude");
}
assert.equal(calculateDiversityScore(), 0.0);
});
it("returns 1.0 for perfectly even distribution across 2 providers", () => {
for (let i = 0; i < 10; i++) {
recordProviderUsage("claude");
recordProviderUsage("openai");
}
assert.equal(calculateDiversityScore(), 1.0);
});
it("returns value between 0 and 1 for uneven distribution", () => {
for (let i = 0; i < 15; i++) recordProviderUsage("claude");
for (let i = 0; i < 5; i++) recordProviderUsage("openai");
const score = calculateDiversityScore();
assert.ok(score > 0, "should be > 0 (not single provider)");
assert.ok(score < 1, "should be < 1 (not perfectly even)");
});
it("scores 1.0 for an even split regardless of provider count", () => {
// 2 providers
resetDiversity();
for (let i = 0; i < 10; i++) {
recordProviderUsage("claude");
recordProviderUsage("openai");
}
const score2 = calculateDiversityScore();
// 4 providers (same total requests)
resetDiversity();
for (let i = 0; i < 5; i++) {
recordProviderUsage("claude");
recordProviderUsage("openai");
recordProviderUsage("google");
recordProviderUsage("together");
}
const score4 = calculateDiversityScore();
// Both should be 1.0 (perfectly distributed within their pool)
assert.equal(score2, 1.0);
assert.equal(score4, 1.0);
});
});
describe("getProviderDiversityBoost", () => {
it("returns 0.5 when no data is recorded", () => {
assert.equal(getProviderDiversityBoost("claude"), 0.5);
});
it("returns low boost for heavily used provider", () => {
for (let i = 0; i < 18; i++) recordProviderUsage("claude");
for (let i = 0; i < 2; i++) recordProviderUsage("openai");
const claudeBoost = getProviderDiversityBoost("claude");
const openaiBoost = getProviderDiversityBoost("openai");
assert.ok(claudeBoost < openaiBoost, "heavily used provider should have lower boost");
assert.ok(claudeBoost < 0.2, "90% used provider should have very low boost");
assert.ok(openaiBoost > 0.8, "10% used provider should have high boost");
});
it("returns 1.0 for never-used provider", () => {
for (let i = 0; i < 10; i++) recordProviderUsage("claude");
const boost = getProviderDiversityBoost("google");
assert.equal(boost, 1.0);
});
});
describe("getDiversityReport", () => {
it("returns structured report", () => {
recordProviderUsage("claude");
recordProviderUsage("claude");
recordProviderUsage("openai");
const report = getDiversityReport();
assert.equal(report.totalRequests, 3);
assert.ok(report.score > 0);
assert.ok(report.score < 1);
assert.equal(report.providers["claude"].count, 2);
assert.equal(report.providers["openai"].count, 1);
assert.ok(Math.abs(report.providers["claude"].share - 2 / 3) < 0.01);
});
});
describe("window management", () => {
it("respects windowSize limit", () => {
configureDiversity({ windowSize: 10, ttlMs: 3_600_000 });
for (let i = 0; i < 20; i++) {
recordProviderUsage("claude");
}
const report = getDiversityReport();
assert.ok(report.totalRequests <= 10, "should not exceed window size");
});
});
});
@@ -0,0 +1,170 @@
/**
* Provider Diversity Tracking via Shannon Entropy
*
* Measures and tracks how evenly distributed requests are across providers.
* A system routing 90% of traffic to one provider has a catastrophic single
* point of failure. This module provides a diversity score [0..1] that can
* be used as a scoring factor in auto-combo selection.
*
* Shannon entropy normalized to [0..1]:
* - 0.0 = all requests go to one provider (maximum risk)
* - 1.0 = perfectly even distribution (minimum risk)
*
* @see https://en.wikipedia.org/wiki/Entropy_(information_theory)
*/
/** Rolling window entry for provider usage tracking */
interface UsageEntry {
provider: string;
timestamp: number;
}
/** Configuration for the diversity tracker */
export interface DiversityConfig {
/** Maximum entries in the rolling window (default: 200) */
windowSize: number;
/** Time-to-live in ms for entries — older entries are pruned (default: 1 hour) */
ttlMs: number;
}
const DEFAULT_CONFIG: DiversityConfig = {
windowSize: 200,
ttlMs: 3_600_000, // 1 hour
};
/** In-memory rolling window of recent provider usage */
let usageWindow: UsageEntry[] = [];
let config: DiversityConfig = { ...DEFAULT_CONFIG };
/**
* Configure the diversity tracker.
*/
export function configureDiversity(userConfig: Partial<DiversityConfig>): void {
config = { ...DEFAULT_CONFIG, ...userConfig };
}
/**
* Record that a provider was used for a request.
* Call this after a successful request completes.
*/
export function recordProviderUsage(provider: string): void {
const now = Date.now();
usageWindow.push({ provider, timestamp: now });
// Prune by window size
if (usageWindow.length > config.windowSize) {
usageWindow = usageWindow.slice(-config.windowSize);
}
// Prune by TTL
const cutoff = now - config.ttlMs;
usageWindow = usageWindow.filter((e) => e.timestamp >= cutoff);
}
/**
* Calculate Shannon entropy normalized to [0..1] for the current usage window.
*
* @returns Normalized entropy where 0 = single provider, 1 = perfect distribution
*/
export function calculateDiversityScore(): number {
if (usageWindow.length === 0) return 1.0; // No data = assume diverse
const now = Date.now();
const cutoff = now - config.ttlMs;
const recent = usageWindow.filter((e) => e.timestamp >= cutoff);
if (recent.length === 0) return 1.0;
// Count occurrences per provider
const counts = new Map<string, number>();
for (const entry of recent) {
counts.set(entry.provider, (counts.get(entry.provider) || 0) + 1);
}
const total = recent.length;
const nUnique = counts.size;
if (nUnique <= 1) return 0.0;
// Shannon entropy
let entropy = 0;
for (const count of counts.values()) {
const p = count / total;
entropy -= p * Math.log2(p);
}
// Normalize by maximum possible entropy
const maxEntropy = Math.log2(nUnique);
return maxEntropy > 0 ? entropy / maxEntropy : 0;
}
/**
* Get the diversity score for a specific provider.
* Returns a boost value [0..1] where underrepresented providers score higher.
* This can be used as a per-candidate factor in auto-combo scoring.
*
* @param provider - The provider to score
* @returns Diversity boost in [0..1]: 1.0 = provider absent from the window (maximum boost), 0.0 = provider handled every request in the window
*/
export function getProviderDiversityBoost(provider: string): number {
if (usageWindow.length === 0) return 0.5; // No data = neutral
const now = Date.now();
const cutoff = now - config.ttlMs;
const recent = usageWindow.filter((e) => e.timestamp >= cutoff);
if (recent.length === 0) return 0.5;
const total = recent.length;
const providerCount = recent.filter((e) => e.provider === provider).length;
// Inverse usage share: providers used less get higher boost
const usageShare = providerCount / total;
return Math.max(0, 1 - usageShare);
}
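The doc comment above says the boost can serve as a per-candidate factor in auto-combo scoring. How it is actually combined is not shown in this diff; the sketch below assumes a simple linear blend. `blendScore`, `baseScore`, and `diversityWeight` are hypothetical names.

```typescript
// Hypothetical blend of a candidate's base score with its diversity boost.
// Assumption: both inputs are already normalized to [0..1].
function blendScore(baseScore: number, diversityBoost: number, diversityWeight = 0.1): number {
  const w = Math.min(1, Math.max(0, diversityWeight)); // clamp the weight to [0..1]
  return (1 - w) * baseScore + w * diversityBoost;
}

// A provider holding 90% of the window (boost 0.1) versus one holding 10% (boost 0.9).
const heavy = blendScore(0.8, 0.1);
const light = blendScore(0.75, 0.9);
console.log(light > heavy);
```

With a small weight the boost only breaks near-ties, so a clearly better candidate still wins even when it is overused.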
/**
* Get a summary of the current provider distribution.
* Useful for dashboard display and debugging.
*/
export function getDiversityReport(): {
score: number;
totalRequests: number;
providers: Record<string, { count: number; share: number }>;
windowSize: number;
ttlMs: number;
} {
const now = Date.now();
const cutoff = now - config.ttlMs;
const recent = usageWindow.filter((e) => e.timestamp >= cutoff);
const counts = new Map<string, number>();
for (const entry of recent) {
counts.set(entry.provider, (counts.get(entry.provider) || 0) + 1);
}
const providers: Record<string, { count: number; share: number }> = {};
for (const [provider, count] of counts) {
providers[provider] = {
count,
share: recent.length > 0 ? count / recent.length : 0,
};
}
return {
score: calculateDiversityScore(),
totalRequests: recent.length,
providers,
windowSize: config.windowSize,
ttlMs: config.ttlMs,
};
}
/**
* Reset the diversity tracker. Useful for testing.
*/
export function resetDiversity(): void {
usageWindow = [];
config = { ...DEFAULT_CONFIG };
}
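As a worked example of the normalized Shannon entropy used by `calculateDiversityScore`, the sketch below applies the same formula to plain per-provider counts, independent of the rolling window. `normalizedEntropy` is a hypothetical standalone function. It mirrors the module's conventions (no data gives 1.0, a single provider gives 0.0).

```typescript
// Hypothetical standalone version of the score: H / log2(n), where
// H = -sum(p_i * log2(p_i)) and n is the number of providers seen.
function normalizedEntropy(counts: number[]): number {
  const positive = counts.filter((c) => c > 0);
  const total = positive.reduce((sum, c) => sum + c, 0);
  if (total === 0) return 1; // no data: assume diverse, as the module does
  if (positive.length === 1) return 0; // single provider: maximum risk
  let entropy = 0;
  for (const c of positive) {
    const p = c / total;
    entropy -= p * Math.log2(p);
  }
  return entropy / Math.log2(positive.length);
}

console.log(normalizedEntropy([10, 10])); // even split across 2 providers
console.log(normalizedEntropy([15, 5]).toFixed(3)); // 75% / 25% split, roughly 0.811
```

Normalizing by `log2(n)` is why two evenly used providers and four evenly used providers both score 1.0: the score measures evenness within the observed pool, not the size of the pool.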
+172 -27
@@ -159,13 +159,13 @@ async function getGlmUsage(apiKey: string, providerSpecificData?: Record<string,
* @returns {Promise<unknown>} Usage data with quotas
*/
export async function getUsageForProvider(connection) {
const { provider, accessToken, apiKey, providerSpecificData } = connection;
const { provider, accessToken, apiKey, providerSpecificData, projectId } = connection;
switch (provider) {
case "github":
return await getGitHubUsage(accessToken, providerSpecificData);
case "gemini-cli":
return await getGeminiUsage(accessToken);
return await getGeminiUsage(accessToken, providerSpecificData, projectId);
case "antigravity":
return await getAntigravityUsage(accessToken, undefined);
case "claude":
@@ -195,24 +195,22 @@ function parseResetTime(resetValue) {
if (!resetValue) return null;
try {
// If it's already a Date object
let date;
if (resetValue instanceof Date) {
return resetValue.toISOString();
date = resetValue;
} else if (typeof resetValue === "number") {
date = new Date(resetValue);
} else if (typeof resetValue === "string") {
date = new Date(resetValue);
} else {
return null;
}
// If it's a number (Unix timestamp in milliseconds)
if (typeof resetValue === "number") {
return new Date(resetValue).toISOString();
}
// Epoch-zero (1970-01-01) means no scheduled reset — treat as null
if (date.getTime() <= 0) return null;
// If it's a string (ISO date or parseable date string)
if (typeof resetValue === "string") {
return new Date(resetValue).toISOString();
}
return null;
return date.toISOString();
} catch (error) {
console.warn(`Failed to parse reset time: ${resetValue}`, error);
return null;
}
}
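The hunk above interleaves removed and added lines, so the resulting control flow is hard to read. A minimal sketch of what the new version appears to do: normalize the input to a `Date` first, then treat epoch-zero or earlier as "no reset scheduled". `parseResetTimeSketch` is a hypothetical name, and the explicit `NaN` check is a substitution for the `try/catch` the original relies on when `toISOString()` throws on an invalid date.

```typescript
// Hypothetical restatement of the added logic in parseResetTime().
function parseResetTimeSketch(resetValue: unknown): string | null {
  if (!resetValue) return null;

  let date: Date;
  if (resetValue instanceof Date) {
    date = resetValue;
  } else if (typeof resetValue === "number" || typeof resetValue === "string") {
    date = new Date(resetValue);
  } else {
    return null;
  }

  const ms = date.getTime();
  // Invalid dates yield NaN; epoch-zero (1970-01-01) or earlier means no scheduled reset.
  if (Number.isNaN(ms) || ms <= 0) return null;
  return date.toISOString();
}

console.log(parseResetTimeSketch("1970-01-01T00:00:00Z")); // null
console.log(parseResetTimeSketch(1700000000000)); // "2023-11-14T22:13:20.000Z"
```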
@@ -417,36 +415,183 @@ function inferGitHubPlanName(data: JsonRecord, premiumQuota: UsageQuota | null):
return "GitHub Copilot";
}
// ── Gemini CLI subscription info cache ──────────────────────────────────────
// Prevents duplicate loadCodeAssist calls within the same quota cycle.
// Key: accessToken → { data, fetchedAt }
const _geminiCliSubCache = new Map();
const GEMINI_CLI_CACHE_TTL_MS = 5 * 60 * 1000; // 5 minutes
/**
* Gemini CLI Usage (Google Cloud)
* Gemini CLI Usage: fetch per-model quota from the Cloud Code Assist API.
* Gemini CLI and Antigravity share the same upstream (cloudcode-pa.googleapis.com),
* so this follows the same pattern as getAntigravityUsage().
*/
async function getGeminiUsage(accessToken) {
async function getGeminiUsage(accessToken, providerSpecificData?, connectionProjectId?) {
if (!accessToken) {
return { plan: "Free", message: "Gemini CLI access token not available." };
}
try {
// Gemini CLI uses Google Cloud quotas
// Try to get quota info from Cloud Resource Manager
const subscriptionInfo = await getGeminiCliSubscriptionInfoCached(accessToken);
const projectId =
connectionProjectId ||
providerSpecificData?.projectId ||
subscriptionInfo?.cloudaicompanionProject ||
null;
const plan = getGeminiCliPlanLabel(subscriptionInfo);
if (!projectId) {
return { plan, message: "Gemini CLI project ID not available." };
}
// Use retrieveUserQuota (same endpoint as Gemini CLI /stats command).
// Returns per-model buckets with remainingFraction and resetTime.
const response = await fetch(
"https://cloudresourcemanager.googleapis.com/v1/projects?filter=lifecycleState:ACTIVE",
"https://cloudcode-pa.googleapis.com/v1internal:retrieveUserQuota",
{
method: "POST",
headers: {
Authorization: `Bearer ${accessToken}`,
Accept: "application/json",
"Content-Type": "application/json",
},
body: JSON.stringify({ project: projectId }),
signal: AbortSignal.timeout(10000),
}
);
if (!response.ok) {
// Quota API may not be accessible, return generic message
return {
message: "Gemini CLI uses Google Cloud quotas. Check Google Cloud Console for details.",
};
return { plan, message: `Gemini CLI quota error (${response.status}).` };
}
return { message: "Gemini CLI connected. Usage tracked via Google Cloud Console." };
const data = await response.json();
const quotas: Record<string, UsageQuota> = {};
if (Array.isArray(data.buckets)) {
for (const bucket of data.buckets) {
if (!bucket.modelId || bucket.remainingFraction == null) continue;
const remainingFraction = toNumber(bucket.remainingFraction, 0);
const remainingPercentage = remainingFraction * 100;
const QUOTA_NORMALIZED_BASE = 1000;
const total = QUOTA_NORMALIZED_BASE;
const remaining = Math.round(total * remainingFraction);
const used = Math.max(0, total - remaining);
quotas[bucket.modelId] = {
used,
total,
resetAt: parseResetTime(bucket.resetTime),
remainingPercentage,
unlimited: false,
};
}
}
return { plan, quotas };
} catch (error) {
return { message: "Unable to fetch Gemini usage. Check Google Cloud Console." };
return { message: `Gemini CLI error: ${(error as Error).message}` };
}
}
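The quota API reports only a `remainingFraction` per model, so `getGeminiUsage` projects it onto a fixed base of 1000 units to fill the `used` and `total` fields. The arithmetic can be sketched on its own as follows; `normalizeBucket` and `NormalizedQuota` are hypothetical names, while the constant and the rounding follow the diff.

```typescript
interface NormalizedQuota {
  used: number;
  total: number;
  remainingPercentage: number;
}

const QUOTA_NORMALIZED_BASE = 1000;

// Hypothetical helper: turn a remaining fraction (0..1) into used/total counts.
function normalizeBucket(remainingFraction: number): NormalizedQuota {
  const total = QUOTA_NORMALIZED_BASE;
  const remaining = Math.round(total * remainingFraction);
  return {
    used: Math.max(0, total - remaining), // never negative, even if the fraction exceeds 1
    total,
    remainingPercentage: remainingFraction * 100,
  };
}

console.log(normalizeBucket(0.25)); // { used: 750, total: 1000, remainingPercentage: 25 }
```

The `used` and `total` values are therefore synthetic: they express the ratio only and are not real request or token counts.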
/**
* Get Gemini CLI subscription info (cached, 5 min TTL)
*/
async function getGeminiCliSubscriptionInfoCached(accessToken) {
const cacheKey = accessToken;
const cached = _geminiCliSubCache.get(cacheKey);
if (cached && Date.now() - cached.fetchedAt < GEMINI_CLI_CACHE_TTL_MS) {
return cached.data;
}
const data = await getGeminiCliSubscriptionInfo(accessToken);
_geminiCliSubCache.set(cacheKey, { data, fetchedAt: Date.now() });
return data;
}
/**
* Get Gemini CLI subscription info using correct headers.
*/
async function getGeminiCliSubscriptionInfo(accessToken) {
try {
const response = await fetch(
"https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist",
{
method: "POST",
headers: {
Authorization: `Bearer ${accessToken}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
metadata: {
ideType: "IDE_UNSPECIFIED",
platform: "PLATFORM_UNSPECIFIED",
pluginType: "GEMINI",
},
}),
}
);
if (!response.ok) return null;
return await response.json();
} catch {
return null;
}
}
/**
* Map Gemini CLI subscription tier to display label (same tiers as Antigravity).
*/
function getGeminiCliPlanLabel(subscriptionInfo) {
if (!subscriptionInfo || Object.keys(subscriptionInfo).length === 0) return "Free";
let tierId = "";
if (Array.isArray(subscriptionInfo.allowedTiers)) {
for (const tier of subscriptionInfo.allowedTiers) {
if (tier.isDefault && tier.id) {
tierId = tier.id.trim().toUpperCase();
break;
}
}
}
if (!tierId) {
tierId = (subscriptionInfo.currentTier?.id || "").toUpperCase();
}
if (tierId) {
if (tierId.includes("ULTRA")) return "Ultra";
if (tierId.includes("PRO")) return "Pro";
if (tierId.includes("ENTERPRISE")) return "Enterprise";
if (tierId.includes("BUSINESS") || tierId.includes("STANDARD")) return "Business";
if (tierId.includes("FREE") || tierId.includes("INDIVIDUAL") || tierId.includes("LEGACY"))
return "Free";
}
const tierName =
subscriptionInfo.currentTier?.name ||
subscriptionInfo.currentTier?.displayName ||
subscriptionInfo.subscriptionType ||
subscriptionInfo.tier ||
"";
const upper = tierName.toUpperCase();
if (upper.includes("ULTRA")) return "Ultra";
if (upper.includes("PRO")) return "Pro";
if (upper.includes("ENTERPRISE")) return "Enterprise";
if (upper.includes("STANDARD") || upper.includes("BUSINESS")) return "Business";
if (upper.includes("INDIVIDUAL") || upper.includes("FREE")) return "Free";
if (subscriptionInfo.currentTier?.upgradeSubscriptionType) return "Free";
if (tierName) {
return tierName.charAt(0).toUpperCase() + tierName.slice(1).toLowerCase();
}
return "Free";
}
// ── Antigravity subscription info cache ──────────────────────────────────────
// Prevents duplicate loadCodeAssist calls within the same quota cycle.
// Key: truncated accessToken → { data, fetchedAt }
+224
@@ -0,0 +1,224 @@
/**
* Volume & Complexity Detector for Adaptive Routing
*
* Detects request characteristics (batch size, token estimate, tool count,
* complexity signals) and recommends routing strategy overrides.
*
* When a request clearly belongs to a different routing profile than the
* combo's default strategy, this module suggests an override. For example:
* - Batch of 500 items → round-robin (prevent throttling)
* - 3 tools + browser → priority with premium-first (needs best model)
* - 50 tokens → keep strategy but flag for economy tier
*/
/** Signals extracted from a request for routing decisions */
export interface VolumeSignals {
/** Number of items in a batch (1 for single requests) */
batchSize: number;
/** Estimated input tokens, derived from the serialized messages (~4 chars per token) */
estimatedTokens: number;
/** Number of tools defined in the request */
toolCount: number;
/** Whether the request involves browser/UI interaction */
hasBrowser: boolean;
/** Whether the request includes image/screenshot content */
hasImages: boolean;
/** Rough complexity level derived from signals */
complexity: "trivial" | "low" | "medium" | "high" | "critical";
}
/** Strategy override recommendation */
export interface StrategyOverride {
/** Whether an override is recommended */
shouldOverride: boolean;
/** Recommended strategy (null if no override) */
strategy: "priority" | "round-robin" | "cost-optimized" | "weighted" | null;
/** Whether to prefer economy models */
preferEconomy: boolean;
/** Whether to force premium models first */
forcePremium: boolean;
/** Reason for the override (for logging) */
reason: string;
}
// Tool-related keywords that signal browser/UI interaction
const BROWSER_KEYWORDS = [
"browser",
"playwright",
"puppeteer",
"screenshot",
"navigate",
"click",
"form",
"page",
"tab",
"window",
"computer_use",
"computer-use",
];
// Keywords that signal high complexity
const HIGH_COMPLEXITY_KEYWORDS = [
"deploy",
"migration",
"security",
"auth",
"database",
"refactor",
"production",
"incident",
];
/**
* Detect volume and complexity signals from a chat request body.
*
* @param body - The raw request body (OpenAI or Claude format)
* @returns Extracted signals
*/
export function detectVolumeSignals(body: Record<string, unknown>): VolumeSignals {
const messages = (body.messages || body.input || []) as unknown[];
const tools = (body.tools || []) as unknown[];
const toolCount = tools.length;
// Estimate batch size from array structures
let batchSize = 1;
if (Array.isArray(body.input) && body.input.length > 1) {
batchSize = body.input.length;
} else if (Array.isArray(messages)) {
// Check if the last message (whatever its role) contains multiple content parts (common batch pattern)
const lastMsg = messages[messages.length - 1] as Record<string, unknown> | undefined;
if (lastMsg && Array.isArray(lastMsg.content)) {
const contentParts = lastMsg.content as unknown[];
batchSize = Math.max(1, contentParts.length);
}
}
// Estimate tokens from serialized message size
const serialized = JSON.stringify(messages);
const estimatedTokens = Math.ceil(serialized.length / 4); // rough: 4 chars ≈ 1 token
// Detect browser/UI signals
const lowerSerialized = serialized.toLowerCase();
const hasBrowser = BROWSER_KEYWORDS.some((kw) => lowerSerialized.includes(kw));
// Detect image content
const hasImages =
lowerSerialized.includes("image_url") ||
lowerSerialized.includes("image/") ||
lowerSerialized.includes("base64") ||
lowerSerialized.includes("screenshot");
// Determine complexity
const hasHighKeywords = HIGH_COMPLEXITY_KEYWORDS.some((kw) => lowerSerialized.includes(kw));
let complexity: VolumeSignals["complexity"];
if (toolCount > 3 || (hasBrowser && toolCount > 1) || hasHighKeywords) {
complexity = "critical";
} else if (toolCount > 1 || hasBrowser || hasImages || estimatedTokens > 10000) {
complexity = "high";
} else if (toolCount === 1 || estimatedTokens > 2000) {
complexity = "medium";
} else if (estimatedTokens > 500) {
complexity = "low";
} else {
complexity = "trivial";
}
return {
batchSize,
estimatedTokens,
toolCount,
hasBrowser,
hasImages,
complexity,
};
}
/**
* Recommend a routing strategy override based on detected volume signals.
*
* @param signals - Volume signals from detectVolumeSignals()
* @param currentStrategy - The combo's configured strategy
* @returns Override recommendation
*/
export async function recommendStrategyOverride(
signals: VolumeSignals,
currentStrategy: string
): Promise<StrategyOverride> {
const noOverride: StrategyOverride = {
shouldOverride: false,
strategy: null,
preferEconomy: false,
forcePremium: false,
reason: "no override needed",
};
// Check if adaptive routing is enabled globally
try {
const { getSettings } = await import("@/lib/localDb");
const settings = await getSettings();
if (!settings.adaptiveVolumeRouting) {
return noOverride;
}
} catch (error) {
console.error("Failed to check adaptiveVolumeRouting setting:", error);
return noOverride;
}
// Rule 1: Large batch → round-robin to distribute load
if (signals.batchSize >= 50) {
return {
shouldOverride: true,
strategy: "round-robin",
preferEconomy: true,
forcePremium: false,
reason: `batch size ${signals.batchSize} >= 50: distribute load via round-robin with economy models`,
};
}
// Rule 2: Medium batch with low complexity → cost-optimized
if (signals.batchSize >= 10 && signals.complexity === "low") {
return {
shouldOverride: currentStrategy !== "cost-optimized",
strategy: "cost-optimized",
preferEconomy: true,
forcePremium: false,
reason: `batch size ${signals.batchSize} with low complexity: use cost-optimized routing`,
};
}
// Rule 3: Critical complexity → force priority with premium
if (signals.complexity === "critical") {
return {
shouldOverride: true,
strategy: "priority",
preferEconomy: false,
forcePremium: true,
reason: `critical complexity (tools=${signals.toolCount}, browser=${signals.hasBrowser}): force premium-first priority`,
};
}
// Rule 4: Browser/UI interaction → force priority with premium
if (signals.hasBrowser) {
return {
shouldOverride: currentStrategy !== "priority",
strategy: "priority",
preferEconomy: false,
forcePremium: true,
reason: "browser/UI interaction detected: force premium-first priority",
};
}
// Rule 5: Very short request → flag for economy (but don't change strategy)
if (signals.estimatedTokens <= 200) {
return {
shouldOverride: false,
strategy: null,
preferEconomy: true,
forcePremium: false,
reason: `short request (${signals.estimatedTokens} tokens): prefer economy tier`,
};
}
return noOverride;
}
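The rules in `recommendStrategyOverride` are evaluated top to bottom and the first match wins, so their order decides the outcome when several signals are present. The sketch below isolates that precedence using the thresholds from the diff. `firstMatchingRule` and the trimmed `Signals` type are hypothetical, and the settings lookup is omitted.

```typescript
type Complexity = "trivial" | "low" | "medium" | "high" | "critical";

interface Signals {
  batchSize: number;
  estimatedTokens: number;
  hasBrowser: boolean;
  complexity: Complexity;
}

// Returns the number of the first matching rule (1-5), or 0 when none applies.
function firstMatchingRule(s: Signals): number {
  if (s.batchSize >= 50) return 1; // large batch: round-robin + economy
  if (s.batchSize >= 10 && s.complexity === "low") return 2; // medium batch: cost-optimized
  if (s.complexity === "critical") return 3; // critical: premium-first priority
  if (s.hasBrowser) return 4; // browser/UI: premium-first priority
  if (s.estimatedTokens <= 200) return 5; // tiny request: economy flag only
  return 0;
}

// A large batch wins over critical complexity because rule 1 is checked first.
console.log(firstMatchingRule({ batchSize: 60, estimatedTokens: 5000, hasBrowser: true, complexity: "critical" })); // 1
// A tiny browser request is treated as premium (rule 4), not economy (rule 5).
console.log(firstMatchingRule({ batchSize: 1, estimatedTokens: 50, hasBrowser: true, complexity: "high" })); // 4
```

One consequence of this ordering is that a batch of 50 or more items is routed to economy models even when its content is classified as critical.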
+103
@@ -0,0 +1,103 @@
/**
* Deterministic FSM for Multi-Step Workflows
*
* Orchestrates plan -> review -> execute -> verify using rules, not LLM decisions.
* Risk-based phase skipping: high=all phases, medium=skip planner, low=execute+test only.
*/
export type Phase = "classify"|"plan"|"plan_review"|"execute"|"code_review"|"quality_review"|"security"|"test"|"output_review"|"done"|"failed"|"paused";
export type RiskLevel = "low"|"medium"|"high";
export type Verdict = "approve"|"approve_with_notes"|"request_changes"|"reject"|"block";
export interface PhaseRecord {
phase: Phase; enteredAt: string; exitedAt: string|null;
verdict: Verdict|null; provider: string|null; model: string|null;
retryCount: number; notes: string|null;
}
export interface WorkflowContext {
id: string; currentPhase: Phase; risk: RiskLevel;
lastVerdict: Verdict|null; retries: Record<string,number>;
maxRetries: number; testsPass: boolean;
history: PhaseRecord[]; createdAt: string;
metadata: Record<string,unknown>;
}
interface Transition { from: Phase; to: Phase; condition: (ctx: WorkflowContext) => boolean; description: string; }
const HIGH_KEYWORDS = ["schema","migration","deploy","delete","drop","env","database","refactor","security","auth","production","secrets","credentials","permission"];
const MED_KEYWORDS = ["endpoint","feature","service","model","api","integration","webhook","middleware","route"];
export function classifyRisk(desc: string): RiskLevel {
const l = desc.toLowerCase();
if (HIGH_KEYWORDS.some(k => l.includes(k))) return "high";
if (MED_KEYWORDS.some(k => l.includes(k))) return "medium";
return "low";
}
const PHASE_ORDER: Phase[] = ["classify","plan","plan_review","execute","code_review","quality_review","security","test","output_review"];
const T: Transition[] = [
{from:"classify",to:"plan",condition:c=>c.risk==="high",description:"High risk -> full planning"},
{from:"classify",to:"execute",condition:c=>c.risk==="medium",description:"Medium risk -> skip planner"},
{from:"classify",to:"execute",condition:c=>c.risk==="low",description:"Low risk -> direct execute"},
{from:"plan",to:"plan_review",condition:()=>true,description:"Plan -> review"},
{from:"plan_review",to:"execute",condition:c=>c.lastVerdict==="approve"||c.lastVerdict==="approve_with_notes",description:"Plan approved -> execute"},
{from:"plan_review",to:"plan",condition:c=>(c.lastVerdict==="reject"||c.lastVerdict==="request_changes")&&(c.retries["plan"]??0)<c.maxRetries,description:"Plan rejected -> retry"},
{from:"plan_review",to:"failed",condition:c=>(c.lastVerdict==="reject"||c.lastVerdict==="request_changes")&&(c.retries["plan"]??0)>=c.maxRetries,description:"Plan rejected max retries"},
{from:"execute",to:"code_review",condition:c=>c.risk!=="low",description:"Non-low -> code review"},
{from:"execute",to:"test",condition:c=>c.risk==="low",description:"Low -> skip reviews"},
{from:"code_review",to:"quality_review",condition:c=>c.lastVerdict==="approve"||c.lastVerdict==="approve_with_notes",description:"Code approved -> quality"},
{from:"code_review",to:"execute",condition:c=>(c.lastVerdict==="reject"||c.lastVerdict==="request_changes")&&(c.retries["execute"]??0)<c.maxRetries,description:"Code rejected -> re-execute"},
{from:"code_review",to:"failed",condition:c=>(c.lastVerdict==="reject"||c.lastVerdict==="request_changes")&&(c.retries["execute"]??0)>=c.maxRetries,description:"Code rejected max retries"},
{from:"quality_review",to:"security",condition:c=>c.risk==="high",description:"High -> security audit"},
{from:"quality_review",to:"test",condition:c=>c.risk!=="high",description:"Non-high -> skip security"},
{from:"security",to:"failed",condition:c=>c.lastVerdict==="block",description:"Security BLOCK -> failed"},
{from:"security",to:"test",condition:c=>c.lastVerdict!=="block",description:"Security passed -> test"},
{from:"test",to:"output_review",condition:c=>c.testsPass,description:"Tests pass -> output review"},
{from:"test",to:"execute",condition:c=>!c.testsPass&&(c.retries["execute"]??0)<c.maxRetries,description:"Tests fail -> re-execute"},
{from:"test",to:"failed",condition:c=>!c.testsPass&&(c.retries["execute"]??0)>=c.maxRetries,description:"Tests fail max retries"},
{from:"output_review",to:"done",condition:()=>true,description:"Output reviewed -> done"},
];
export function createWorkflow(id: string, description: string, opts?: {maxRetries?: number; metadata?: Record<string,unknown>}): WorkflowContext {
const risk = classifyRisk(description);
return {id, currentPhase:"classify", risk, lastVerdict:null, retries:{}, maxRetries:opts?.maxRetries??3, testsPass:false,
history:[{phase:"classify",enteredAt:new Date().toISOString(),exitedAt:null,verdict:null,provider:null,model:null,retryCount:0,notes:`Risk: ${risk}`}],
createdAt:new Date().toISOString(), metadata:opts?.metadata??{}};
}
export function advance(ctx: WorkflowContext, result?: {verdict?:Verdict;testsPass?:boolean;provider?:string;model?:string;notes?:string}): Phase|null {
if(result?.verdict!=null) ctx.lastVerdict=result.verdict;
if(result?.testsPass!=null) ctx.testsPass=result.testsPass;
const cur=ctx.history[ctx.history.length-1];
if(cur){cur.exitedAt=new Date().toISOString();cur.verdict=result?.verdict??null;cur.provider=result?.provider??null;cur.model=result?.model??null;cur.notes=result?.notes??null;}
for(const t of T){
if(t.from===ctx.currentPhase&&t.condition(ctx)){
const fi=PHASE_ORDER.indexOf(t.from),ti=PHASE_ORDER.indexOf(t.to);
if(ti>=0&&fi>=0&&ti<=fi) ctx.retries[t.to]=(ctx.retries[t.to]??0)+1;
ctx.currentPhase=t.to;
ctx.history.push({phase:t.to,enteredAt:new Date().toISOString(),exitedAt:null,verdict:null,provider:null,model:null,retryCount:ctx.retries[t.to]??0,notes:t.description});
return t.to;
}
}
return null;
}
export function pause(ctx: WorkflowContext, reason: string): void {
ctx.currentPhase="paused";
ctx.history.push({phase:"paused",enteredAt:new Date().toISOString(),exitedAt:null,verdict:null,provider:null,model:null,retryCount:0,notes:reason});
}
export function resume(ctx: WorkflowContext, phase: Phase): void {
const p=ctx.history[ctx.history.length-1];if(p)p.exitedAt=new Date().toISOString();
ctx.currentPhase=phase;
ctx.history.push({phase,enteredAt:new Date().toISOString(),exitedAt:null,verdict:null,provider:null,model:null,retryCount:ctx.retries[phase]??0,notes:"Resumed"});
}
export function isTerminated(ctx: WorkflowContext): boolean { return ctx.currentPhase==="done"||ctx.currentPhase==="failed"; }
export function getPhaseSequence(ctx: WorkflowContext): Phase[] { return ctx.history.map(r=>r.phase); }
export function getLLMCallCount(ctx: WorkflowContext): number {
const sys: Phase[]=["classify","paused","done","failed"];
return ctx.history.filter(r=>!sys.includes(r.phase)&&r.exitedAt!=null).length;
}
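Reviewer note: the table-driven approach above can be illustrated with a reduced sketch. It re-declares a smaller set of phases so that it is self-contained; `MiniPhase`, `MiniCtx`, `TABLE`, `step`, and `run` are hypothetical names that do not appear in the commit, and retry counting and verdicts are left out on purpose.

```typescript
// Minimal, self-contained sketch of a table-driven FSM with risk-based phase skipping.
// All names are illustrative and not part of the commit; retries and verdicts are omitted.
type MiniPhase = "classify" | "plan" | "execute" | "review" | "test" | "done" | "failed";
type MiniRisk = "low" | "medium" | "high";

interface MiniCtx {
  phase: MiniPhase;
  risk: MiniRisk;
  testsPass: boolean;
}

interface MiniTransition {
  from: MiniPhase;
  to: MiniPhase;
  when: (c: MiniCtx) => boolean;
}

// The first matching row wins, so row order matters.
const TABLE: MiniTransition[] = [
  { from: "classify", to: "plan", when: (c) => c.risk === "high" },
  { from: "classify", to: "execute", when: (c) => c.risk !== "high" },
  { from: "plan", to: "execute", when: () => true },
  { from: "execute", to: "review", when: (c) => c.risk !== "low" },
  { from: "execute", to: "test", when: (c) => c.risk === "low" },
  { from: "review", to: "test", when: () => true },
  { from: "test", to: "done", when: (c) => c.testsPass },
  { from: "test", to: "failed", when: (c) => !c.testsPass },
];

// Move to the next phase, or return null when no row matches (terminal phase).
function step(ctx: MiniCtx): MiniPhase | null {
  const row = TABLE.find((t) => t.from === ctx.phase && t.when(ctx));
  if (!row) return null;
  ctx.phase = row.to;
  return row.to;
}

// Run to completion and return every visited phase, including the start phase.
function run(risk: MiniRisk, testsPass: boolean): MiniPhase[] {
  const ctx: MiniCtx = { phase: "classify", risk, testsPass };
  const seen: MiniPhase[] = [ctx.phase];
  while (step(ctx) !== null) seen.push(ctx.phase);
  return seen;
}
```

Because every decision is a plain predicate over the context, the path a workflow takes depends only on its risk level and recorded results, which is what makes this style of orchestration deterministic.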
+1 -6
@@ -72,12 +72,7 @@ const DETERMINISTIC_STRATEGIES: Set<RoutingStrategyValue> = new Set(["priority",
/**
* Providers that support prompt caching
*/
-const CACHING_PROVIDERS = new Set([
-"claude",
-"anthropic",
-"zai",
-"qwen", // Alibaba Qwen Coding Plan International
-]);
+const CACHING_PROVIDERS = new Set(["claude", "anthropic", "zai", "qwen", "deepseek"]);
/**
* Detect if the client is Claude Code or another caching-aware client
+2 -2
@@ -1,12 +1,12 @@
{
"name": "omniroute",
-"version": "3.3.2",
+"version": "3.3.5",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "omniroute",
-"version": "3.3.2",
+"version": "3.3.5",
"hasInstallScript": true,
"license": "MIT",
"workspaces": [
+4 -2
@@ -1,6 +1,6 @@
{
"name": "omniroute",
-"version": "3.3.2",
+"version": "3.3.5",
"description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
"type": "module",
"bin": {
@@ -162,6 +162,8 @@
},
"overrides": {
"dompurify": "^3.3.2",
-"path-to-regexp": "^8.4.0"
+"path-to-regexp": "^8.4.0",
+"react": "$react",
+"react-dom": "$react-dom"
}
}
@@ -546,18 +546,22 @@ export default function HomePageClient({ machineId }) {
<div>
<p className="font-semibold text-sm">Update Available: v{versionInfo.latest}</p>
<p className="text-xs opacity-80 mt-0.5">
-{t("updateAvailableDesc") ||
-`You are currently using v${versionInfo.current}. Update to access the latest features and bug fixes.`}
+{versionInfo.autoUpdateSupported
+? t("updateAvailableDesc") ||
+`You are currently using v${versionInfo.current}. Update to access the latest features and bug fixes.`
+: versionInfo.autoUpdateError ||
+"Manual update required for this installation type."}
</p>
</div>
</div>
<Button
size="sm"
-onClick={handleUpdate}
-disabled={updating}
+onClick={versionInfo.autoUpdateSupported ? handleUpdate : undefined}
+disabled={updating || !versionInfo.autoUpdateSupported}
className="shrink-0 ml-4 font-semibold"
+title={versionInfo.autoUpdateError || ""}
>
-{t("updateNow") || "Update Now"}
+{versionInfo.autoUpdateSupported ? t("updateNow") || "Update Now" : "Manual Update"}
</Button>
</div>
)}
+67 -32
View File
@@ -433,43 +433,78 @@ export default function A2ADashboardPage() {
<th className="text-left py-2 pr-2">{t("tableTask")}</th>
<th className="text-left py-2 pr-2">{t("tableSkill")}</th>
<th className="text-left py-2 pr-2">{t("tableState")}</th>
<th className="text-left py-2 pr-2">{t("tablePhase") || "FSM Status"}</th>
<th className="text-left py-2 pr-2">{t("tableUpdated")}</th>
<th className="text-left py-2">{t("tableActions")}</th>
</tr>
</thead>
<tbody>
{tasksData.tasks.map((task) => (
<tr key={task.id} className="border-b border-border/40">
<td className="py-2 pr-2 font-mono text-xs">{task.id}</td>
<td className="py-2 pr-2">{task.skill}</td>
<td className="py-2 pr-2">
<span className={`text-xs px-2 py-1 rounded-full ${stateClass(task.state)}`}>
{t(`state.${task.state}`)}
</span>
</td>
<td className="py-2 pr-2 text-xs">
{new Date(task.updatedAt).toLocaleString()}
</td>
<td className="py-2 flex gap-2">
<Button size="sm" variant="secondary" onClick={() => handleLoadTask(task.id)}>
{t("view")}
</Button>
<Button
size="sm"
variant="secondary"
onClick={() => handleCancelTask(task.id)}
disabled={
task.state === "completed" ||
task.state === "failed" ||
task.state === "cancelled" ||
actionBusy === "cancel"
}
>
{t("cancel")}
</Button>
</td>
</tr>
))}
{tasksData.tasks.map((task) => {
const fsmPhase =
task.metadata?.fsmPhase || (task.metadata?.workflowFSM as any)?.currentPhase;
let fsmBadgeColor = "bg-gray-500/15 text-gray-500";
if (fsmPhase === "plan" || fsmPhase === "plan_review")
fsmBadgeColor = "bg-purple-500/15 text-purple-500";
else if (fsmPhase === "execute") fsmBadgeColor = "bg-blue-500/15 text-blue-500";
else if (
["code_review", "quality_review", "security", "test", "output_review"].includes(
fsmPhase
)
)
fsmBadgeColor = "bg-amber-500/15 text-amber-500";
else if (fsmPhase === "done") fsmBadgeColor = "bg-green-500/15 text-green-500";
else if (fsmPhase === "failed") fsmBadgeColor = "bg-red-500/15 text-red-500";
return (
<tr key={task.id} className="border-b border-border/40">
<td className="py-2 pr-2 font-mono text-xs">{task.id}</td>
<td className="py-2 pr-2">{task.skill}</td>
<td className="py-2 pr-2">
<span
className={`text-xs px-2 py-1 rounded-full ${stateClass(task.state)}`}
>
{t(`state.${task.state}`)}
</span>
</td>
<td className="py-2 pr-2">
{fsmPhase ? (
<span
className={`text-xs px-2 py-1 rounded border border-current/20 font-medium ${fsmBadgeColor}`}
>
{fsmPhase}
</span>
) : (
<span className="text-xs text-text-muted"></span>
)}
</td>
<td className="py-2 pr-2 text-xs">
{new Date(task.updatedAt).toLocaleString()}
</td>
<td className="py-2 flex gap-2">
<Button
size="sm"
variant="secondary"
onClick={() => handleLoadTask(task.id)}
>
{t("view")}
</Button>
<Button
size="sm"
variant="secondary"
onClick={() => handleCancelTask(task.id)}
disabled={
task.state === "completed" ||
task.state === "failed" ||
task.state === "cancelled" ||
actionBusy === "cancel"
}
>
{t("cancel")}
</Button>
</td>
</tr>
);
})}
</tbody>
</table>
</div>
@@ -0,0 +1,136 @@
"use client";
import { useEffect, useState } from "react";
import { Card } from "@/shared/components";
import { useTranslations } from "next-intl";
export default function DiversityScoreCard() {
const [data, setData] = useState<any>(null);
const [loading, setLoading] = useState(true);
const t = useTranslations("analytics");
useEffect(() => {
fetch("/api/analytics/diversity")
.then((res) => res.json())
.then((json) => {
setData(json);
setLoading(false);
})
.catch((err) => {
console.error(err);
setLoading(false);
});
}, []);
if (loading || !data) {
return (
<Card className="p-5 flex flex-col justify-center items-center h-full min-h-[200px]">
<div className="animate-spin rounded-full h-8 w-8 border-b-2 border-primary"></div>
</Card>
);
}
const scorePercentage = Math.round((data.score || 0) * 100);
let riskColor = "text-green-500";
let gaugeColor = "bg-green-500";
let riskLabel = "Healthy Distribution";
if (scorePercentage < 40) {
riskColor = "text-red-500";
gaugeColor = "bg-red-500";
riskLabel = "High Vendor Lock-in Risk";
} else if (scorePercentage < 70) {
riskColor = "text-amber-500";
gaugeColor = "bg-amber-500";
riskLabel = "Moderate Distribution";
}
return (
<Card className="p-5 flex flex-col h-full bg-[var(--card-bg,#1e1e2e)] relative overflow-hidden group">
<div className="flex items-center gap-2 mb-4">
<span className="material-symbols-outlined text-[20px] text-cyan-400">pie_chart</span>
<h3 className="font-semibold text-[var(--text-primary,#fff)] flex-1">
Provider Diversity Score
</h3>
<span
className={`text-xs px-2 py-0.5 rounded-md border ${gaugeColor.replace("bg-", "border-").replace("500", "500/20")} ${gaugeColor.replace("bg-", "bg-").replace("500", "500/10")} ${riskColor}`}
>
Shannon Entropy
</span>
</div>
<div className="flex items-center justify-between mt-2 mb-6">
<div className="flex flex-col">
<span className={`text-4xl font-bold tabular-nums tracking-tight ${riskColor}`}>
{scorePercentage}%
</span>
<span className="text-sm text-[var(--text-muted,#aaaaaa)] mt-1">{riskLabel}</span>
</div>
{/* Simple CSS Donut */}
<div className="relative w-20 h-20 flex-shrink-0">
<svg className="w-full h-full transform -rotate-90" viewBox="0 0 36 36">
<path
className="text-[var(--border,#333)]"
strokeWidth="4"
stroke="currentColor"
fill="none"
d="M18 2.0845 a 15.9155 15.9155 0 0 1 0 31.831 a 15.9155 15.9155 0 0 1 0 -31.831"
/>
<path
className={riskColor}
strokeWidth="4"
strokeDasharray={`${scorePercentage}, 100`}
stroke="currentColor"
fill="none"
strokeLinecap="round"
d="M18 2.0845 a 15.9155 15.9155 0 0 1 0 31.831 a 15.9155 15.9155 0 0 1 0 -31.831"
/>
</svg>
</div>
</div>
<div className="space-y-4 flex-1">
<p className="text-xs uppercase tracking-wider font-semibold text-[var(--text-muted,#888)]">
Provider Share
</p>
{Object.keys(data.providers || {}).length === 0 ? (
<div className="text-sm text-[var(--text-secondary,#666)] py-2">
No recent usage data available.
</div>
) : (
<div className="space-y-3">
{Object.entries(data.providers)
.sort(([, a]: any, [, b]: any) => b.share - a.share)
.slice(0, 4) // Top 4 providers
.map(([provider, stat]: [string, any]) => (
<div key={provider} className="flex flex-col gap-1.5">
<div className="flex items-center justify-between text-sm">
<span className="font-medium text-[var(--text-primary,#ddd)] capitalize">
{provider}
</span>
<span className="font-mono text-[var(--text-muted,#aaa)]">
{Math.round(stat.share * 100)}%
</span>
</div>
<div className="w-full h-1.5 bg-[var(--surface,#333)] rounded-full overflow-hidden">
<div
className={`h-full ${gaugeColor} rounded-full`}
style={{ width: `${Math.round(stat.share * 100)}%` }}
/>
</div>
</div>
))}
</div>
)}
</div>
<div className="mt-4 pt-4 border-t border-[var(--border,#333)] flex justify-between text-[11px] text-[var(--text-muted,#777)]">
<span>Window: {data.windowSize} reqs</span>
<span>Based on Last {Math.round(data.ttlMs / 60000)} mins</span>
</div>
</Card>
);
}
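Reviewer note: the card labels its score "Shannon Entropy", but the `/api/analytics/diversity` handler is not part of this hunk. The sketch below shows one common way such a score could be derived, assuming normalized Shannon entropy over per-provider request counts; `diversityScore` and its input shape are hypothetical, and the real endpoint may compute the value differently.

```typescript
// Sketch: normalized Shannon entropy over per-provider request counts.
// Assumption: score = H / ln(n) with n = number of active providers, so the result
// lies in [0, 1]. The actual backend formula is not shown in this diff.
function diversityScore(counts: Record<string, number>): number {
  const values = Object.values(counts).filter((n) => n > 0);
  const total = values.reduce((sum, n) => sum + n, 0);
  // With zero or one active provider there is no diversity to measure.
  if (values.length < 2 || total === 0) return 0;
  let entropy = 0;
  for (const n of values) {
    const p = n / total;
    entropy -= p * Math.log(p);
  }
  return entropy / Math.log(values.length);
}
```

Under this assumption an even split across providers scores 1 and a single dominant provider pushes the score toward 0, which matches the red, amber, and green thresholds the card applies at 40% and 70%.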
@@ -4,6 +4,7 @@ import { useState, Suspense } from "react";
import { UsageAnalytics, CardSkeleton, SegmentedControl } from "@/shared/components";
import EvalsTab from "../usage/components/EvalsTab";
import SearchAnalyticsTab from "./SearchAnalyticsTab";
import DiversityScoreCard from "./components/DiversityScoreCard";
import { useTranslations } from "next-intl";
export default function AnalyticsPage() {
@@ -38,9 +39,14 @@ export default function AnalyticsPage() {
/>
{activeTab === "overview" && (
-<Suspense fallback={<CardSkeleton />}>
-<UsageAnalytics />
-</Suspense>
+<div className="flex flex-col gap-6">
+<div className="grid grid-cols-1 md:grid-cols-3 gap-6">
+<DiversityScoreCard />
+</div>
+<Suspense fallback={<CardSkeleton />}>
+<UsageAnalytics />
+</Suspense>
+</div>
)}
{activeTab === "evals" && <EvalsTab />}
{activeTab === "search" && <SearchAnalyticsTab />}
@@ -0,0 +1,272 @@
"use client";
import { useState, useEffect } from "react";
import { useTranslations } from "next-intl";
interface ConfigDiff {
added: string[];
removed: string[];
changed: Array<{ key: string; from: any; to: any }>;
isEmpty: boolean;
}
interface AuditEntry {
id: string;
timestamp: string;
action: string;
target: string;
targetId: string;
targetName: string;
source: string;
before: any;
after: any;
diff: ConfigDiff;
note: string | null;
}
export default function ConfigAuditViewer() {
const t = useTranslations("logs");
const [entries, setEntries] = useState<AuditEntry[]>([]);
const [loading, setLoading] = useState(true);
const [selectedEntry, setSelectedEntry] = useState<AuditEntry | null>(null);
useEffect(() => {
fetchLogs();
}, []);
const fetchLogs = async () => {
setLoading(true);
try {
const res = await fetch("/api/audit");
const data = await res.json();
setEntries(data.entries || []);
} catch (e) {
console.error(e);
} finally {
setLoading(false);
}
};
const getActionColor = (action: string) => {
switch (action) {
case "create":
return "text-green-400 bg-green-400/10 border-green-500/20";
case "update":
return "text-blue-400 bg-blue-400/10 border-blue-500/20";
case "delete":
return "text-red-400 bg-red-400/10 border-red-500/20";
default:
return "text-gray-400 bg-gray-400/10 border-gray-500/20";
}
};
if (loading) {
return (
<div className="flex justify-center p-8">
<div className="w-6 h-6 border-2 border-[var(--accent)] border-t-transparent rounded-full animate-spin"></div>
</div>
);
}
if (entries.length === 0) {
return (
<div className="flex flex-col items-center justify-center p-12 text-[var(--text-muted,#666)]">
<svg
fill="none"
viewBox="0 0 24 24"
stroke="currentColor"
className="w-12 h-12 mb-4 opacity-50"
>
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={1.5}
d="M9 12h6m-6 4h6m2 5H7a2 2 0 01-2-2V5a2 2 0 012-2h5.586a1 1 0 01.707.293l5.414 5.414a1 1 0 01.293.707V19a2 2 0 01-2 2z"
/>
</svg>
<p>No Configuration Audit Logs found.</p>
</div>
);
}
return (
<div className="w-full">
<div className="overflow-x-auto">
<table className="w-full text-left border-collapse">
<thead>
<tr className="border-b border-[var(--border,#333)] text-[var(--text-secondary,#aaa)] text-sm">
<th className="px-6 py-4 font-medium">Timestamp</th>
<th className="px-6 py-4 font-medium">Action</th>
<th className="px-6 py-4 font-medium">Target</th>
<th className="px-6 py-4 font-medium">Resource</th>
<th className="px-6 py-4 font-medium">Source</th>
<th className="px-6 py-4 font-medium text-right">Details</th>
</tr>
</thead>
<tbody className="divide-y divide-[var(--border,#333)]">
{entries.map((entry) => (
<tr
key={entry.id}
className="hover:bg-[var(--hover-bg,#2a2a3e)] transition-colors group"
>
<td className="px-6 py-3 whitespace-nowrap text-sm text-[var(--text-secondary,#aaa)]">
{new Date(entry.timestamp).toLocaleString()}
</td>
<td className="px-6 py-3 whitespace-nowrap">
<span
className={`px-2 py-1 text-xs rounded-md border capitalize ${getActionColor(entry.action)}`}
>
{entry.action}
</span>
</td>
<td className="px-6 py-3 whitespace-nowrap text-sm text-[var(--text-primary,#fff)] font-medium capitalize">
{entry.target}
</td>
<td className="px-6 py-3 text-sm text-[var(--text-secondary,#aaa)] font-mono">
{entry.targetName}
</td>
<td className="px-6 py-3 whitespace-nowrap text-sm text-[var(--text-muted,#666)] capitalize">
{entry.source}
</td>
<td className="px-6 py-3 whitespace-nowrap text-right">
<button
onClick={() => setSelectedEntry(entry)}
className="px-3 py-1 text-xs font-medium text-[var(--text-primary,#fff)] bg-[var(--accent,#7c3aed)] hover:bg-opacity-80 rounded-md transition-colors invisible group-hover:visible"
>
View Diff
</button>
</td>
</tr>
))}
</tbody>
</table>
</div>
{selectedEntry && (
<div className="fixed inset-0 z-50 flex items-center justify-center bg-black/60 backdrop-blur-sm p-4 animate-in fade-in duration-200">
<div className="bg-[var(--card-bg,#1e1e2e)] border border-[var(--border,#333)] rounded-2xl w-full max-w-4xl max-h-[85vh] flex flex-col shadow-2xl overflow-hidden scale-in">
{/* Modal Header */}
<div className="flex items-center justify-between px-6 py-5 border-b border-[var(--border,#333)] bg-[#15151f]">
<div>
<h3 className="text-xl font-semibold text-[var(--text-primary,#fff)] capitalize">
{selectedEntry.action} {selectedEntry.target}
</h3>
<p className="text-sm text-[var(--text-secondary,#aaa)] font-mono mt-1">
ID: {selectedEntry.targetId} {selectedEntry.targetName}
</p>
</div>
<button
onClick={() => setSelectedEntry(null)}
className="p-2 text-[var(--text-muted,#666)] hover:text-white bg-[var(--hover-bg,#2a2a3e)] hover:bg-[#333] rounded-full transition-colors"
title="Close"
>
<svg fill="none" viewBox="0 0 24 24" stroke="currentColor" className="w-5 h-5">
<path
strokeLinecap="round"
strokeLinejoin="round"
strokeWidth={2}
d="M6 18L18 6M6 6l12 12"
/>
</svg>
</button>
</div>
{/* Modal Content */}
<div className="p-6 overflow-y-auto custom-scrollbar flex-1 bg-[#1a1a24]">
{selectedEntry.note && (
<div className="mb-6 p-4 bg-yellow-500/10 border border-yellow-500/20 text-yellow-300 rounded-xl text-sm italic">
📝 {selectedEntry.note}
</div>
)}
{selectedEntry.diff?.isEmpty ? (
<div className="text-center p-8 text-[var(--text-muted,#666)]">
No changes detected in Diff
</div>
) : (
<div className="space-y-6">
{/* Added Keys */}
{selectedEntry.diff?.added?.length > 0 && (
<div>
<h4 className="text-sm font-semibold text-green-400 mb-2 uppercase tracking-wider">
++ Added Properties
</h4>
<pre className="bg-[#111116] border border-green-500/20 rounded-xl p-4 overflow-x-auto text-xs font-mono text-green-300 shadow-inner">
{JSON.stringify(
selectedEntry.diff.added.reduce(
(acc, key) => ({ ...acc, [key]: selectedEntry.after?.[key] }),
{}
),
null,
2
)}
</pre>
</div>
)}
{/* Removed Keys */}
{selectedEntry.diff?.removed?.length > 0 && (
<div>
<h4 className="text-sm font-semibold text-red-400 mb-2 uppercase tracking-wider">
-- Removed Properties
</h4>
<pre className="bg-[#111116] border border-red-500/20 rounded-xl p-4 overflow-x-auto text-xs font-mono text-red-300 shadow-inner">
{JSON.stringify(
selectedEntry.diff.removed.reduce(
(acc, key) => ({ ...acc, [key]: selectedEntry.before?.[key] }),
{}
),
null,
2
)}
</pre>
</div>
)}
{/* Changed Keys */}
{selectedEntry.diff?.changed?.length > 0 && (
<div>
<h4 className="text-sm font-semibold text-yellow-400 mb-2 uppercase tracking-wider">
~ Modified Properties
</h4>
<div className="space-y-2">
{selectedEntry.diff.changed.map((change, idx) => (
<div
key={idx}
className="bg-[#111116] border border-yellow-500/20 rounded-xl overflow-hidden shadow-inner text-sm font-mono flex flex-col"
>
<div className="px-4 py-2 bg-[#1b1b22] border-b border-[#2d2d3a] text-yellow-500/80 font-semibold">
{change.key}
</div>
<div className="grid grid-cols-2 divide-x divide-[#2d2d3a]">
<div className="p-4 bg-red-500/5 text-red-300/80">
<div className="text-[10px] text-red-400/50 mb-1 uppercase">
Before
</div>
<pre className="whitespace-pre-wrap break-words">
{JSON.stringify(change.from, null, 2)}
</pre>
</div>
<div className="p-4 bg-green-500/5 text-green-300/80">
<div className="text-[10px] text-green-400/50 mb-1 uppercase">
After
</div>
<pre className="whitespace-pre-wrap break-words">
{JSON.stringify(change.to, null, 2)}
</pre>
</div>
</div>
</div>
))}
</div>
</div>
)}
</div>
)}
</div>
</div>
</div>
)}
</div>
);
}
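Reviewer note: the viewer consumes a `ConfigDiff` with `added`, `removed`, `changed`, and `isEmpty`, but the code that produces it is outside this hunk. A minimal sketch of how such a diff could be built is shown below, assuming a shallow comparison of top-level keys; `shallowDiff` and `Config` are hypothetical names, and the server-side implementation may differ (for example by diffing nested keys).

```typescript
// Sketch: shallow diff of two config objects into the ConfigDiff shape the viewer renders.
// Assumptions: only top-level keys are compared, and values are compared through
// JSON.stringify, which is sensitive to key order inside nested objects.
interface ConfigDiff {
  added: string[];
  removed: string[];
  changed: Array<{ key: string; from: unknown; to: unknown }>;
  isEmpty: boolean;
}

type Config = Record<string, unknown>;

function shallowDiff(before: Config | null, after: Config | null): ConfigDiff {
  const b = before ?? {};
  const a = after ?? {};
  // Keys present only in the new config.
  const added = Object.keys(a).filter((k) => !(k in b));
  // Keys present only in the old config.
  const removed = Object.keys(b).filter((k) => !(k in a));
  // Keys present in both whose serialized values differ.
  const changed = Object.keys(a)
    .filter((k) => k in b && JSON.stringify(b[k]) !== JSON.stringify(a[k]))
    .map((k) => ({ key: k, from: b[k], to: a[k] }));
  const isEmpty = added.length + removed.length + changed.length === 0;
  return { added, removed, changed, isEmpty };
}
```

Treating a missing `before` or `after` as an empty object means a create shows every key as added and a delete shows every key as removed.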
@@ -0,0 +1,44 @@
"use client";
import { useState, useEffect } from "react";
import ConfigAuditViewer from "./ConfigAuditViewer";
export default function ConfigAuditPage() {
const [summary, setSummary] = useState<any>(null);
useEffect(() => {
fetch("/api/audit?summary=true")
.then((res) => res.json())
.then((data) => setSummary(data))
.catch((err) => console.error(err));
}, []);
return (
<div className="flex flex-col gap-6 w-full max-w-6xl mx-auto">
<div className="flex items-center justify-between">
<div>
<h1 className="text-2xl font-semibold text-[var(--text-primary,#fff)]">
Configuration Audit
</h1>
<p className="text-sm text-[var(--text-secondary,#aaa)] mt-1">
Track and diff changes made to routing policies, combos, and connections.
</p>
</div>
{summary && (
<div className="flex items-center gap-4 text-sm bg-[var(--card-bg,#1e1e2e)] px-4 py-2 rounded-xl border border-[var(--border,#333)]">
<div className="flex flex-col">
<span className="text-[var(--text-muted,#666)]">Total Audits</span>
<span className="font-mono text-[var(--text-primary,#fff)] font-semibold">
{summary.totalEntries}
</span>
</div>
</div>
)}
</div>
<div className="bg-[var(--card-bg,#1e1e2e)] border border-[var(--border,#333)] rounded-xl overflow-hidden shadow-sm">
<ConfigAuditViewer />
</div>
</div>
);
}
@@ -0,0 +1,174 @@
"use client";
import { useState, useEffect, useCallback } from "react";
import { Button } from "@/shared/components";
import { useTranslations } from "next-intl";
interface CacheEntry {
id: string;
signature: string;
model: string;
hit_count: number;
tokens_saved: number;
created_at: string;
expires_at: string;
}
interface Pagination {
page: number;
limit: number;
total: number;
totalPages: number;
}
export default function CacheEntriesTab() {
const t = useTranslations("cache");
const [entries, setEntries] = useState<CacheEntry[]>([]);
const [pagination, setPagination] = useState<Pagination>({
page: 1,
limit: 20,
total: 0,
totalPages: 0,
});
const [loading, setLoading] = useState(true);
const [search, setSearch] = useState("");
const [deleting, setDeleting] = useState<string | null>(null);
const fetchEntries = useCallback(
async (page = 1) => {
setLoading(true);
try {
const params = new URLSearchParams({ page: String(page), limit: String(pagination.limit) });
if (search) params.set("search", search);
const res = await fetch(`/api/cache/entries?${params}`);
if (res.ok) {
const data = await res.json();
setEntries(data.entries);
setPagination(data.pagination);
}
} catch {
// ignore
} finally {
setLoading(false);
}
},
[search, pagination.limit]
);
useEffect(() => {
fetchEntries();
}, [fetchEntries]);
const handleDelete = async (signature: string) => {
setDeleting(signature);
try {
await fetch(`/api/cache/entries?signature=${encodeURIComponent(signature)}`, {
method: "DELETE",
});
await fetchEntries(pagination.page);
} finally {
setDeleting(null);
}
};
const formatDate = (dateStr: string) => {
return new Date(dateStr).toLocaleString();
};
return (
<div className="flex flex-col gap-4">
<div className="flex items-center gap-3">
<input
type="text"
placeholder={t("searchEntries")}
value={search}
onChange={(e) => setSearch(e.target.value)}
onKeyDown={(e) => e.key === "Enter" && fetchEntries()}
className="flex-1 px-3 py-2 text-sm rounded-lg border border-border bg-surface text-text-main placeholder:text-text-muted"
/>
<Button variant="secondary" size="sm" onClick={() => fetchEntries()}>
{t("search")}
</Button>
</div>
{loading ? (
<div className="text-sm text-text-muted">{t("loading")}</div>
) : entries.length === 0 ? (
<div className="text-sm text-text-muted text-center py-8">{t("noEntries")}</div>
) : (
<>
<div className="overflow-x-auto">
<table className="w-full text-sm">
<thead>
<tr className="text-left text-xs text-text-muted border-b border-border/30">
<th className="pb-2 pr-4">{t("signature")}</th>
<th className="pb-2 pr-4">{t("model")}</th>
<th className="pb-2 pr-4">{t("hits")}</th>
<th className="pb-2 pr-4">{t("tokensSaved")}</th>
<th className="pb-2 pr-4">{t("created")}</th>
<th className="pb-2 pr-4">{t("expires")}</th>
<th className="pb-2">{t("actions")}</th>
</tr>
</thead>
<tbody>
{entries.map((entry) => (
<tr key={entry.id} className="border-b border-border/20">
<td className="py-2 pr-4 font-mono text-xs">
{entry.signature.slice(0, 12)}...
</td>
<td className="py-2 pr-4">{entry.model}</td>
<td className="py-2 pr-4 tabular-nums">{entry.hit_count}</td>
<td className="py-2 pr-4 tabular-nums text-green-500">
{entry.tokens_saved.toLocaleString()}
</td>
<td className="py-2 pr-4 text-xs text-text-muted">
{formatDate(entry.created_at)}
</td>
<td className="py-2 pr-4 text-xs text-text-muted">
{formatDate(entry.expires_at)}
</td>
<td className="py-2">
<button
onClick={() => handleDelete(entry.signature)}
disabled={deleting === entry.signature}
className="text-xs text-red-400 hover:text-red-300 disabled:opacity-50"
>
{deleting === entry.signature ? "..." : "🗑️"}
</button>
</td>
</tr>
))}
</tbody>
</table>
</div>
{/* Pagination */}
{pagination.totalPages > 1 && (
<div className="flex items-center justify-center gap-2 pt-2">
<Button
variant="secondary"
size="sm"
onClick={() => fetchEntries(pagination.page - 1)}
disabled={pagination.page <= 1}
>
</Button>
<span className="text-sm text-text-muted">
{pagination.page} / {pagination.totalPages}
</span>
<Button
variant="secondary"
size="sm"
onClick={() => fetchEntries(pagination.page + 1)}
disabled={pagination.page >= pagination.totalPages}
>
</Button>
</div>
)}
</>
)}
</div>
);
}
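Reviewer note: the request that `fetchEntries` sends can be checked in isolation with a small helper like the one below. `buildEntriesUrl` is a hypothetical function that is not part of the commit; it only mirrors the `URLSearchParams` usage in the component.

```typescript
// Sketch: build the paginated entries URL the tab requests.
// `buildEntriesUrl` is illustrative; the component builds the same string inline.
function buildEntriesUrl(page: number, limit: number, search: string): string {
  const params = new URLSearchParams({ page: String(page), limit: String(limit) });
  // The search parameter is only sent when the input is non-empty.
  if (search) params.set("search", search);
  return `/api/cache/entries?${params.toString()}`;
}
```

`URLSearchParams` handles the encoding of the search term, so the component does not need a separate `encodeURIComponent` call for the list request.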
+362 -139
@@ -4,6 +4,7 @@ import { useState, useEffect, useCallback } from "react";
import { Card, Button, EmptyState } from "@/shared/components";
import { useNotificationStore } from "@/store/notificationStore";
import { useTranslations } from "next-intl";
import CacheEntriesTab from "./components/CacheEntriesTab";
// ─── Types ───────────────────────────────────────────────────────────────────
@@ -16,13 +17,44 @@ interface SemanticCacheStats {
tokensSaved: number;
}
interface PromptCacheProviderStats {
requests: number;
inputTokens: number;
cachedTokens: number;
cacheCreationTokens: number;
}
interface PromptCacheMetrics {
totalRequests: number;
requestsWithCacheControl: number;
totalInputTokens: number;
totalCachedTokens: number;
totalCacheCreationTokens: number;
tokensSaved: number;
estimatedCostSaved: number;
byProvider: Record<string, PromptCacheProviderStats>;
byStrategy: Record<string, PromptCacheProviderStats>;
lastUpdated: string;
}
interface IdempotencyStats {
activeKeys: number;
windowMs: number;
}
interface CacheTrendPoint {
timestamp: string;
requests: number;
cachedRequests: number;
inputTokens: number;
cachedTokens: number;
cacheCreationTokens: number;
}
interface CacheStats {
semanticCache: SemanticCacheStats;
promptCache: PromptCacheMetrics | null;
trend: CacheTrendPoint[];
idempotency: IdempotencyStats;
}
@@ -107,6 +139,7 @@ export default function CachePage() {
const [stats, setStats] = useState<CacheStats | null>(null);
const [loading, setLoading] = useState(true);
const [clearing, setClearing] = useState(false);
const [activeTab, setActiveTab] = useState<"overview" | "entries">("overview");
const notify = useNotificationStore();
const fetchStats = useCallback(async () => {
@@ -136,27 +169,32 @@ export default function CachePage() {
const res = await fetch("/api/cache", { method: "DELETE" });
if (res.ok) {
const data = await res.json();
-notify.add({
-type: "success",
-message: t("clearSuccess", { count: data.expiredRemoved ?? 0 }),
-});
+notify.success(t("clearSuccess", { count: data.expiredRemoved ?? 0 }));
await fetchStats();
} else {
-notify.add({ type: "error", message: t("clearError") });
+notify.error(t("clearError"));
}
} catch (error) {
console.error("[CachePage] Failed to clear cache:", error);
-notify.add({ type: "error", message: t("clearError") });
+notify.error(t("clearError"));
} finally {
setClearing(false);
}
};
const sc = stats?.semanticCache;
const pc = stats?.promptCache;
const trend = stats?.trend ?? [];
const idp = stats?.idempotency;
const hitRate = sc ? parseFloat(sc.hitRate) : 0;
const totalRequests = sc ? sc.hits + sc.misses : 0;
const promptCacheHitRate =
pc && pc.totalRequests > 0 ? (pc.requestsWithCacheControl / pc.totalRequests) * 100 : 0;
const providerEntries = pc ? Object.entries(pc.byProvider) : [];
const maxTrendRequests = Math.max(1, ...trend.map((p) => p.requests));
return (
<div className="flex flex-col gap-6">
{/* Header */}
@@ -190,149 +228,334 @@ export default function CachePage() {
</div>
</div>
{/* Loading skeleton */}
{loading && (
<div
className="grid grid-cols-2 md:grid-cols-4 gap-4"
aria-busy="true"
aria-label="Loading cache statistics"
{/* Tab navigation */}
<div className="flex gap-1 p-1 rounded-lg bg-black/5 dark:bg-white/5 w-fit">
<button
onClick={() => setActiveTab("overview")}
className={`px-4 py-2 rounded-md text-sm font-medium transition-all ${
activeTab === "overview"
? "bg-white dark:bg-white/10 text-text-main shadow-sm"
: "text-text-muted hover:text-text-main"
}`}
>
{Array.from({ length: 4 }).map((_, i) => (
<div key={i} className="h-24 rounded-xl bg-surface-raised animate-pulse" />
))}
</div>
)}
{t("overview")}
</button>
<button
onClick={() => setActiveTab("entries")}
className={`px-4 py-2 rounded-md text-sm font-medium transition-all ${
activeTab === "entries"
? "bg-white dark:bg-white/10 text-text-main shadow-sm"
: "text-text-muted hover:text-text-main"
}`}
>
{t("entries")}
</button>
</div>
{/* Error / empty state */}
{!loading && !stats && (
<EmptyState
icon="cached"
title={t("unavailable")}
description={t("unavailableDesc")}
actionLabel={t("refresh")}
onAction={() => void fetchStats()}
/>
)}
{/* Entries tab */}
{activeTab === "entries" && <CacheEntriesTab />}
{/* Main content */}
{!loading && stats && (
{/* Overview tab content */}
{activeTab === "overview" && (
<>
{/* Stats grid */}
<div className="grid grid-cols-2 md:grid-cols-4 gap-4">
<StatCard
icon="memory"
label={t("memoryEntries")}
value={sc?.memoryEntries ?? 0}
sub={t("memoryEntriesSub")}
/>
<StatCard
icon="storage"
label={t("dbEntries")}
value={sc?.dbEntries ?? 0}
sub={t("dbEntriesSub")}
/>
<StatCard
icon="trending_up"
label={t("cacheHits")}
value={sc?.hits ?? 0}
sub={t("cacheHitsSub", { total: totalRequests })}
valueClass="text-green-500"
/>
<StatCard
icon="token"
label={t("tokensSaved")}
value={(sc?.tokensSaved ?? 0).toLocaleString()}
sub={t("tokensSavedSub")}
valueClass="text-blue-400"
/>
</div>
{/* Hit rate + breakdown */}
<Card>
<div className="p-5 flex flex-col gap-4">
<div className="flex items-center justify-between">
<h2 className="font-medium text-sm">{t("performance")}</h2>
<span className="text-xs text-text-muted">
{t("autoRefresh", { seconds: REFRESH_INTERVAL_SECONDS })}
</span>
</div>
<HitRateBar hitRate={hitRate} label={t("hitRate")} />
<div className="grid grid-cols-3 gap-4 pt-3 border-t border-border/30 text-center">
<div>
<div className="text-lg font-semibold tabular-nums text-green-500">
{sc?.hits ?? 0}
</div>
<div className="text-xs text-text-muted mt-0.5">{t("hits")}</div>
</div>
<div>
<div className="text-lg font-semibold tabular-nums text-red-400">
{sc?.misses ?? 0}
</div>
<div className="text-xs text-text-muted mt-0.5">{t("misses")}</div>
</div>
<div>
<div className="text-lg font-semibold tabular-nums">{totalRequests}</div>
<div className="text-xs text-text-muted mt-0.5">{t("total")}</div>
</div>
</div>
{/* Loading skeleton */}
{loading && (
<div
className="grid grid-cols-2 md:grid-cols-4 gap-4"
aria-busy="true"
aria-label="Loading cache statistics"
>
{Array.from({ length: 4 }).map((_, i) => (
<div key={i} className="h-24 rounded-xl bg-surface-raised animate-pulse" />
))}
</div>
</Card>
)}
{/* Cache behavior */}
<Card>
<div className="p-5 flex flex-col gap-3">
<h2 className="font-medium text-sm">{t("behavior")}</h2>
<div className="grid grid-cols-1 md:grid-cols-2 gap-3">
<InfoRow icon="info">{t("behaviorDeterministic")}</InfoRow>
<InfoRow icon="info">
{t.rich("behaviorBypass", {
header: () => (
<code className="bg-surface px-1 py-0.5 rounded text-xs font-mono">
X-OmniRoute-No-Cache: true
</code>
),
})}
</InfoRow>
<InfoRow icon="info">{t("behaviorTwoTier")}</InfoRow>
<InfoRow icon="info">
{t.rich("behaviorTtl", {
envVar: () => (
<code className="bg-surface px-1 py-0.5 rounded text-xs font-mono">
SEMANTIC_CACHE_TTL_MS
</code>
),
})}
</InfoRow>
</div>
</div>
</Card>
{/* Error / empty state */}
{!loading && !stats && (
<EmptyState
icon="cached"
title={t("unavailable")}
description={t("unavailableDesc")}
actionLabel={t("refresh")}
onAction={() => void fetchStats()}
/>
)}
{/* Idempotency */}
<Card>
<div className="p-5 flex flex-col gap-3">
<div className="flex items-center gap-2">
<span
className="material-symbols-outlined text-base text-text-muted"
aria-hidden="true"
>
fingerprint
</span>
<h2 className="font-medium text-sm">{t("idempotency")}</h2>
{/* Main content */}
{!loading && stats && (
<>
{/* Stats grid */}
<div className="grid grid-cols-2 md:grid-cols-4 gap-4">
<StatCard
icon="memory"
label={t("memoryEntries")}
value={sc?.memoryEntries ?? 0}
sub={t("memoryEntriesSub")}
/>
<StatCard
icon="storage"
label={t("dbEntries")}
value={sc?.dbEntries ?? 0}
sub={t("dbEntriesSub")}
/>
<StatCard
icon="trending_up"
label={t("cacheHits")}
value={sc?.hits ?? 0}
sub={t("cacheHitsSub", { total: totalRequests })}
valueClass="text-green-500"
/>
<StatCard
icon="token"
label={t("tokensSaved")}
value={(sc?.tokensSaved ?? 0).toLocaleString()}
sub={t("tokensSavedSub")}
valueClass="text-blue-400"
/>
</div>
<div className="grid grid-cols-2 gap-4">
<div className="p-3 rounded-lg bg-surface/50">
<div className="text-lg font-semibold tabular-nums">{idp?.activeKeys ?? 0}</div>
<div className="text-xs text-text-muted mt-0.5">{t("activeDedupKeys")}</div>
</div>
<div className="p-3 rounded-lg bg-surface/50">
<div className="text-lg font-semibold tabular-nums">
{idp ? `${(idp.windowMs / 1000).toFixed(0)}s` : "—"}
{/* Hit rate + breakdown */}
<Card>
<div className="p-5 flex flex-col gap-4">
<div className="flex items-center justify-between">
<h2 className="font-medium text-sm">{t("performance")}</h2>
<span className="text-xs text-text-muted">
{t("autoRefresh", { seconds: REFRESH_INTERVAL_SECONDS })}
</span>
</div>
<HitRateBar hitRate={hitRate} label={t("hitRate")} />
<div className="grid grid-cols-3 gap-4 pt-3 border-t border-border/30 text-center">
<div>
<div className="text-lg font-semibold tabular-nums text-green-500">
{sc?.hits ?? 0}
</div>
<div className="text-xs text-text-muted mt-0.5">{t("hits")}</div>
</div>
<div>
<div className="text-lg font-semibold tabular-nums text-red-400">
{sc?.misses ?? 0}
</div>
<div className="text-xs text-text-muted mt-0.5">{t("misses")}</div>
</div>
<div>
<div className="text-lg font-semibold tabular-nums">{totalRequests}</div>
<div className="text-xs text-text-muted mt-0.5">{t("total")}</div>
</div>
</div>
<div className="text-xs text-text-muted mt-0.5">{t("dedupWindow")}</div>
</div>
</div>
</div>
</Card>
</Card>
{/* Prompt Cache Stats */}
{pc && (
<Card>
<div className="p-5 flex flex-col gap-4">
<div className="flex items-center gap-2">
<span
className="material-symbols-outlined text-base text-text-muted"
aria-hidden="true"
>
bolt
</span>
<h2 className="font-medium text-sm">{t("promptCache")}</h2>
</div>
<div className="grid grid-cols-2 md:grid-cols-4 gap-4">
<div className="p-3 rounded-lg bg-surface/50">
<div className="text-lg font-semibold tabular-nums">
{pc.requestsWithCacheControl.toLocaleString()}
</div>
<div className="text-xs text-text-muted mt-0.5">{t("cachedRequests")}</div>
</div>
<div className="p-3 rounded-lg bg-surface/50">
<div className="text-lg font-semibold tabular-nums text-green-500">
{promptCacheHitRate.toFixed(1)}%
</div>
<div className="text-xs text-text-muted mt-0.5">{t("cacheHitRate")}</div>
</div>
<div className="p-3 rounded-lg bg-surface/50">
<div className="text-lg font-semibold tabular-nums text-blue-400">
{pc.totalCachedTokens.toLocaleString()}
</div>
<div className="text-xs text-text-muted mt-0.5">{t("cachedTokens")}</div>
</div>
<div className="p-3 rounded-lg bg-surface/50">
<div className="text-lg font-semibold tabular-nums text-purple-400">
{pc.totalCacheCreationTokens.toLocaleString()}
</div>
<div className="text-xs text-text-muted mt-0.5">
{t("cacheCreationTokens")}
</div>
</div>
</div>
{providerEntries.length > 0 && (
<div className="pt-3 border-t border-border/30">
<h3 className="text-xs font-medium text-text-muted mb-3">
{t("byProvider")}
</h3>
<div className="overflow-x-auto">
<table className="w-full text-sm">
<thead>
<tr className="text-left text-xs text-text-muted border-b border-border/30">
<th className="pb-2 pr-4">{t("provider")}</th>
<th className="pb-2 pr-4">{t("requests")}</th>
<th className="pb-2 pr-4">{t("inputTokens")}</th>
<th className="pb-2 pr-4">{t("cachedTokensCol")}</th>
<th className="pb-2">{t("cacheCreation")}</th>
</tr>
</thead>
<tbody>
{providerEntries.map(([provider, data]) => (
<tr key={provider} className="border-b border-border/20">
<td className="py-2 pr-4 font-medium">{provider}</td>
<td className="py-2 pr-4 tabular-nums">
{data.requests.toLocaleString()}
</td>
<td className="py-2 pr-4 tabular-nums">
{data.inputTokens.toLocaleString()}
</td>
<td className="py-2 pr-4 tabular-nums text-green-500">
{data.cachedTokens.toLocaleString()}
</td>
<td className="py-2 tabular-nums text-purple-400">
{data.cacheCreationTokens.toLocaleString()}
</td>
</tr>
))}
</tbody>
</table>
</div>
</div>
)}
</div>
</Card>
)}
{/* Cache Trend (24h) */}
{trend.length > 0 && (
<Card>
<div className="p-5 flex flex-col gap-4">
<div className="flex items-center gap-2">
<span
className="material-symbols-outlined text-base text-text-muted"
aria-hidden="true"
>
timeline
</span>
<h2 className="font-medium text-sm">{t("trend24h")}</h2>
</div>
<div className="flex items-end gap-1 h-32">
{trend.map((point) => {
const height = Math.max(4, (point.requests / maxTrendRequests) * 100);
const cachedHeight =
point.requests > 0
? Math.max(2, (point.cachedRequests / point.requests) * height)
: 0;
const hour = new Date(point.timestamp).toLocaleTimeString([], {
hour: "2-digit",
minute: "2-digit",
hour12: false,
});
return (
<div
key={point.timestamp}
className="flex-1 flex flex-col items-center gap-1 group relative"
>
<div className="absolute bottom-full mb-1 hidden group-hover:block bg-surface-raised border border-border rounded px-2 py-1 text-xs whitespace-nowrap z-10">
{hour}: {point.requests} {t("requests").toLowerCase()},{" "}
{point.cachedRequests} {t("cached").toLowerCase()}
</div>
<div className="w-full flex flex-col justify-end h-full gap-px">
<div
className="w-full bg-green-500/30 rounded-t"
style={{ height: `${cachedHeight}%` }}
/>
<div
className="w-full bg-text-muted/20 rounded-t"
style={{ height: `${height - cachedHeight}%` }}
/>
</div>
<span className="text-[10px] text-text-muted truncate w-full text-center">
{hour.split(":")[0]}
</span>
</div>
);
})}
</div>
<div className="flex items-center gap-4 text-xs text-text-muted">
<div className="flex items-center gap-1.5">
<div className="w-3 h-3 rounded bg-text-muted/20" />
<span>{t("total")}</span>
</div>
<div className="flex items-center gap-1.5">
<div className="w-3 h-3 rounded bg-green-500/30" />
<span>{t("cached")}</span>
</div>
</div>
</div>
</Card>
)}
{/* Cache behavior */}
<Card>
<div className="p-5 flex flex-col gap-3">
<h2 className="font-medium text-sm">{t("behavior")}</h2>
<div className="grid grid-cols-1 md:grid-cols-2 gap-3">
<InfoRow icon="info">{t("behaviorDeterministic")}</InfoRow>
<InfoRow icon="info">
{t.rich("behaviorBypass", {
header: () => (
<code className="bg-surface px-1 py-0.5 rounded text-xs font-mono">
X-OmniRoute-No-Cache: true
</code>
),
})}
</InfoRow>
<InfoRow icon="info">{t("behaviorTwoTier")}</InfoRow>
<InfoRow icon="info">
{t.rich("behaviorTtl", {
envVar: () => (
<code className="bg-surface px-1 py-0.5 rounded text-xs font-mono">
SEMANTIC_CACHE_TTL_MS
</code>
),
})}
</InfoRow>
</div>
</div>
</Card>
{/* Idempotency */}
<Card>
<div className="p-5 flex flex-col gap-3">
<div className="flex items-center gap-2">
<span
className="material-symbols-outlined text-base text-text-muted"
aria-hidden="true"
>
fingerprint
</span>
<h2 className="font-medium text-sm">{t("idempotency")}</h2>
</div>
<div className="grid grid-cols-2 gap-4">
<div className="p-3 rounded-lg bg-surface/50">
<div className="text-lg font-semibold tabular-nums">
{idp?.activeKeys ?? 0}
</div>
<div className="text-xs text-text-muted mt-0.5">{t("activeDedupKeys")}</div>
</div>
<div className="p-3 rounded-lg bg-surface/50">
<div className="text-lg font-semibold tabular-nums">
{idp ? `${(idp.windowMs / 1000).toFixed(0)}s` : "—"}
</div>
<div className="text-xs text-text-muted mt-0.5">{t("dedupWindow")}</div>
</div>
</div>
</div>
</Card>
</>
)}
</>
)}
</div>
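The bar sizing in the 24h trend chart can be pulled out into a pure helper, which makes the arithmetic easier to check. This is a minimal sketch and not code from this commit: `TrendPoint`, `BarHeights`, and `trendBarHeights` are hypothetical names, and the 4% and 2% floors are copied from the inline JSX in the hunk above.

```typescript
// Hypothetical shape: only the two fields the chart math reads are modelled.
interface TrendPoint {
  requests: number;
  cachedRequests: number;
}

interface BarHeights {
  cached: number; // percent of the column drawn as the "cached" segment
  uncached: number; // percent of the column drawn as the remaining segment
}

// Mirrors the inline math: the total height is scaled against the busiest
// bucket, with a 4% floor so empty buckets remain visible.
function trendBarHeights(point: TrendPoint, maxRequests: number): BarHeights {
  const total = Math.max(4, (point.requests / Math.max(1, maxRequests)) * 100);
  const cached =
    point.requests > 0
      ? Math.max(2, (point.cachedRequests / point.requests) * total)
      : 0;
  return { cached, uncached: total - cached };
}
```

Because the 2% floor applies whenever `requests > 0`, a bucket with traffic but zero cache hits still gets a thin cached segment. Whether that is intended is not clear from the diff.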
@@ -48,6 +48,7 @@ export default function HealthPage() {
const [telemetry, setTelemetry] = useState(null);
const [cache, setCache] = useState(null);
const [signatureCache, setSignatureCache] = useState(null);
const [degradation, setDegradation] = useState(null);
const [resetting, setResetting] = useState(false);
const fetchHealth = useCallback(async () => {
@@ -69,12 +70,14 @@ export default function HealthPage() {
fetch("/api/telemetry/summary").then((r) => r.json()),
fetch("/api/cache/stats").then((r) => r.json()),
fetch("/api/rate-limits").then((r) => r.json()),
fetch("/api/health/degradation").then((r) => r.json()),
]);
if (results[0].status === "fulfilled") setTelemetry(results[0].value);
if (results[1].status === "fulfilled") setCache(results[1].value);
if (results[2].status === "fulfilled" && results[2].value.cacheStats) {
setSignatureCache(results[2].value.cacheStats);
}
if (results[3].status === "fulfilled") setDegradation(results[3].value);
}, []);
useEffect(() => {
@@ -270,6 +273,80 @@ export default function HealthPage() {
</Card>
</div>
{/* Graceful Degradation Status */}
{degradation && degradation.features && degradation.features.length > 0 && (
<Card className="p-5" role="region" aria-label="Graceful Degradation Status">
<div className="flex items-center justify-between mb-4">
<h2 className="text-lg font-semibold text-text-main flex items-center gap-2">
<span className="material-symbols-outlined text-[20px] text-primary">healing</span>
Graceful Degradation Status
</h2>
<div className="flex items-center gap-3 text-xs text-text-muted font-medium">
<span className="px-2 py-0.5 rounded bg-green-500/10 text-green-400">
Full: {degradation.summary.full}
</span>
<span className="px-2 py-0.5 rounded bg-amber-500/10 text-amber-500">
Reduced: {degradation.summary.reduced}
</span>
<span className="px-2 py-0.5 rounded bg-orange-500/10 text-orange-500">
Minimal: {degradation.summary.minimal}
</span>
<span className="px-2 py-0.5 rounded bg-red-500/10 text-red-500">
Default: {degradation.summary.default}
</span>
</div>
</div>
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-3">
{degradation.features.map((feat: any) => {
const bg =
feat.level === "full"
? "bg-green-500/5 border-green-500/10"
: feat.level === "reduced"
? "bg-amber-500/5 border-amber-500/20"
: feat.level === "minimal"
? "bg-orange-500/5 border-orange-500/20"
: "bg-red-500/5 border-red-500/20";
const dot =
feat.level === "full"
? "bg-green-500"
: feat.level === "reduced"
? "bg-amber-500"
: feat.level === "minimal"
? "bg-orange-500"
: "bg-red-500";
return (
<div
key={feat.feature}
className={`rounded-lg p-3 border ${bg} flex flex-col gap-2`}
>
<div className="flex items-center justify-between">
<span className="text-sm font-semibold capitalize flex items-center gap-2 text-[var(--text-primary,#fff)]">
<span className={`w-2 h-2 rounded-full ${dot}`}></span>
{feat.feature}
</span>
<span className="text-xs uppercase tracking-wider font-bold opacity-70">
{feat.level}
</span>
</div>
<div className="text-xs text-[var(--text-secondary,#aaa)]">{feat.capability}</div>
{feat.reason && (
<div
className="text-[10px] text-red-300 mt-1 bg-red-900/20 p-1.5 rounded"
title={feat.reason}
>
{feat.reason.length > 80 ? feat.reason.substring(0, 80) + "..." : feat.reason}
</div>
)}
<div className="text-[10px] text-[var(--text-muted,#666)] text-right mt-1">
Since {new Date(feat.since).toLocaleTimeString()}
</div>
</div>
);
})}
</div>
</Card>
)}
{/* Telemetry Cards — Latency & Prompt Cache */}
<div className="grid grid-cols-1 md:grid-cols-3 gap-4">
{/* Latency Card */}
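The two nested ternaries that pick `bg` and `dot` for each degradation level could be expressed as a single lookup. The sketch below is one possible shape, not the commit's implementation: `LevelStyle`, `LEVEL_STYLES`, and `levelStyle` are hypothetical names, while the class strings are taken from the hunk above. Any level other than `full`, `reduced`, or `minimal` falls back to the red styling, matching the final ternary branch.

```typescript
interface LevelStyle {
  bg: string; // card background and border classes
  dot: string; // status dot class
}

// Fallback used for "default" and for any unrecognised level.
const DEFAULT_STYLE: LevelStyle = {
  bg: "bg-red-500/5 border-red-500/20",
  dot: "bg-red-500",
};

// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const LEVEL_STYLES = new Map<string, LevelStyle>([
  ["full", { bg: "bg-green-500/5 border-green-500/10", dot: "bg-green-500" }],
  ["reduced", { bg: "bg-amber-500/5 border-amber-500/20", dot: "bg-amber-500" }],
  ["minimal", { bg: "bg-orange-500/5 border-orange-500/20", dot: "bg-orange-500" }],
]);

function levelStyle(level: string): LevelStyle {
  return LEVEL_STYLES.get(level) ?? DEFAULT_STYLE;
}
```

Keeping both class strings in one entry per level means a new level only has to be added in one place.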
@@ -802,8 +802,6 @@ export default function ProviderDetailPage() {
const { copied, copy } = useCopyToClipboard();
const t = useTranslations("providers");
const notify = useNotificationStore();
const hasAutoOpened = useRef(false);
const userDismissed = useRef(false);
const [proxyTarget, setProxyTarget] = useState(null);
const [proxyConfig, setProxyConfig] = useState(null);
const [connProxyMap, setConnProxyMap] = useState<
@@ -989,25 +987,6 @@ export default function ProviderDetailPage() {
}
}, [loading, connections, loadConnProxies]);
// Auto-open Add Connection modal when no connections exist (better UX)
// Only fires once on initial load, not on HMR remounts or after user dismissal
useEffect(() => {
if (
!loading &&
connections.length === 0 &&
providerInfo &&
!isCompatible &&
!hasAutoOpened.current &&
!userDismissed.current
) {
hasAutoOpened.current = true;
if (isOAuth) {
setShowOAuthModal(true);
} else {
setShowAddApiKeyModal(true);
}
}
}, [loading]); // eslint-disable-line react-hooks/exhaustive-deps
const handleSetAlias = async (modelId, alias, providerAliasOverride = providerAlias) => {
const fullModel = `${providerAliasOverride}/${modelId}`;
@@ -2428,7 +2407,6 @@ export default function ProviderDetailPage() {
providerInfo={providerInfo}
onSuccess={handleOAuthSuccess}
onClose={() => {
userDismissed.current = true;
setShowOAuthModal(false);
}}
/>
@@ -2437,7 +2415,6 @@ export default function ProviderDetailPage() {
isOpen={showOAuthModal}
onSuccess={handleOAuthSuccess}
onClose={() => {
userDismissed.current = true;
setShowOAuthModal(false);
}}
/>
@@ -2448,7 +2425,6 @@ export default function ProviderDetailPage() {
providerInfo={providerInfo}
onSuccess={handleOAuthSuccess}
onClose={() => {
userDismissed.current = true;
setShowOAuthModal(false);
}}
/>
@@ -95,6 +95,7 @@ function getConnectionErrorTag(connection) {
export default function ProvidersPage() {
const [connections, setConnections] = useState<any[]>([]);
const [providerNodes, setProviderNodes] = useState<any[]>([]);
const [expirations, setExpirations] = useState<any>(null);
const [loading, setLoading] = useState(true);
const [showAddCompatibleModal, setShowAddCompatibleModal] = useState(false);
const [showAddAnthropicCompatibleModal, setShowAddAnthropicCompatibleModal] = useState(false);
@@ -108,14 +109,17 @@ export default function ProvidersPage() {
useEffect(() => {
const fetchData = async () => {
try {
const [connectionsRes, nodesRes] = await Promise.all([
const [connectionsRes, nodesRes, expirationsRes] = await Promise.all([
fetch("/api/providers"),
fetch("/api/provider-nodes"),
fetch("/api/providers/expiration"),
]);
const connectionsData = await connectionsRes.json();
const nodesData = await nodesRes.json();
const expirationsData = await expirationsRes.json();
if (connectionsRes.ok) setConnections(connectionsData.connections || []);
if (nodesRes.ok) setProviderNodes(nodesData.nodes || []);
if (expirationsRes.ok && expirationsData) setExpirations(expirationsData);
} catch (error) {
console.error("Error fetching data:", error);
} finally {
@@ -188,7 +192,16 @@ export default function ProvidersPage() {
const errorCode = latestError ? getConnectionErrorTag(latestError) : null;
const errorTime = latestError?.lastErrorAt ? getRelativeTime(latestError.lastErrorAt) : null;
return { connected, error, total, errorCode, errorTime, allDisabled };
// Check expirations
const providerExpirations =
expirations?.list?.filter((e: any) => e.provider === providerId) || [];
const hasExpired = providerExpirations.some((e: any) => e.status === "expired");
const hasExpiringSoon = providerExpirations.some((e: any) => e.status === "expiring_soon");
let expiryStatus = null;
if (hasExpired) expiryStatus = "expired";
else if (hasExpiringSoon) expiryStatus = "expiring_soon";
return { connected, error, total, errorCode, errorTime, allDisabled, expiryStatus };
};
// Toggle all connections for a provider on/off
@@ -289,6 +302,40 @@ export default function ProvidersPage() {
return (
<div className="flex flex-col gap-6">
{/* Expiration Banner */}
{expirations?.summary &&
(expirations.summary.expired > 0 || expirations.summary.expiringSoon > 0) && (
<div
className={`p-4 rounded-xl flex items-start gap-3 border ${
expirations.summary.expired > 0
? "bg-red-500/10 border-red-500/20"
: "bg-amber-500/10 border-amber-500/20"
}`}
>
<span
className={`material-symbols-outlined text-[24px] ${
expirations.summary.expired > 0 ? "text-red-500" : "text-amber-500"
}`}
>
{expirations.summary.expired > 0 ? "error" : "warning"}
</span>
<div className="flex-1">
<h3
className={`font-semibold ${expirations.summary.expired > 0 ? "text-red-500" : "text-amber-500"}`}
>
{expirations.summary.expired > 0
? `${expirations.summary.expired} Provider connection(s) expired`
: `${expirations.summary.expiringSoon} Provider connection(s) expiring soon`}
</h3>
<p className="text-sm mt-1 opacity-80 text-text-main">
{expirations.summary.expired > 0
? "Immediate action required. Expired connections will permanently fail."
: "Please review and renew expiring connections to avoid disruption."}
</p>
</div>
</div>
)}
{/* OAuth Providers */}
<div className="flex flex-col gap-4">
<div className="flex flex-wrap items-center gap-2">
@@ -582,6 +629,16 @@ function ProviderCard({ providerId, provider, stats, authType, onToggle }) {
) : (
<>
{getStatusDisplay(connected, error, errorCode, t)}
{stats.expiryStatus === "expired" && (
<Badge variant="error" size="sm" dot>
Expired
</Badge>
)}
{stats.expiryStatus === "expiring_soon" && (
<Badge variant="warning" size="sm" dot>
Expiring Soon
</Badge>
)}
{errorTime && <span className="text-text-muted"> {errorTime}</span>}
</>
)}
@@ -709,6 +766,16 @@ function ApiKeyProviderCard({ providerId, provider, stats, authType, onToggle })
) : (
<>
{getStatusDisplay(connected, error, errorCode, t)}
{stats.expiryStatus === "expired" && (
<Badge variant="error" size="sm" dot>
Expired
</Badge>
)}
{stats.expiryStatus === "expiring_soon" && (
<Badge variant="warning" size="sm" dot>
Expiring Soon
</Badge>
)}
{isCompatible && (
<Badge variant="default" size="sm">
{provider.apiType === "responses" ? t("responses") : t("chat")}
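The per-provider expiry check added to `ProvidersPage` can be read as a small pure function: filter the expiration list by provider, then let `expired` take precedence over `expiring_soon`. The sketch below restates that logic under assumptions: `ExpirationEntry`, `ExpiryStatus`, and `getExpiryStatus` are hypothetical names, and only the `provider` and `status` fields used in the diff are modelled, since the full response shape of `/api/providers/expiration` is not shown.

```typescript
type ExpiryStatus = "expired" | "expiring_soon" | null;

// Hypothetical entry shape: the diff only reads these two fields.
interface ExpirationEntry {
  provider: string;
  status: string;
}

function getExpiryStatus(
  list: ExpirationEntry[] | undefined,
  providerId: string
): ExpiryStatus {
  const entries = (list ?? []).filter((e) => e.provider === providerId);
  // "expired" wins over "expiring_soon" when a provider has both.
  if (entries.some((e) => e.status === "expired")) return "expired";
  if (entries.some((e) => e.status === "expiring_soon")) return "expiring_soon";
  return null;
}
```

A typed helper like this would also remove the `any` annotations on the filter callbacks.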
@@ -221,9 +221,7 @@ export default function AppearanceTab() {
<div className="pt-4 border-t border-border">
<div className="mb-3">
<p className="font-medium">
{getSettingsLabel("sidebarVisibility", "Hide sidebar items")}
</p>
<p className="font-medium">{t("sidebarVisibilityToggle")}</p>
<p className="text-sm text-text-muted">
{getSettingsLabel(
"sidebarVisibilityDesc",
@@ -249,7 +247,7 @@ export default function AppearanceTab() {
>
<p className="font-medium">{item.label}</p>
<Toggle
checked={hiddenSidebarSet.has(item.id)}
checked={!hiddenSidebarSet.has(item.id)}
onChange={() => toggleSidebarItem(item.id)}
disabled={loading}
/>
@@ -0,0 +1,132 @@
"use client";
import { useState, useEffect } from "react";
import { Card, Button, Input } from "@/shared/components";
import { useTranslations } from "next-intl";
import { useNotificationStore } from "@/store/notificationStore";
export default function AutoDisableCard() {
const [data, setData] = useState({ enabled: false, threshold: 3 });
const [draft, setDraft] = useState({ enabled: false, threshold: 3 });
const [loading, setLoading] = useState(true);
const [saving, setSaving] = useState(false);
const [editMode, setEditMode] = useState(false);
const t = useTranslations("settings");
const tc = useTranslations("common");
const notify = useNotificationStore();
useEffect(() => {
fetch("/api/settings/auto-disable-accounts")
.then((res) => res.json())
.then((json) => {
setData(json);
setDraft(json);
setLoading(false);
})
.catch(() => setLoading(false));
}, []);
const handleSave = async () => {
setSaving(true);
try {
const res = await fetch("/api/settings/auto-disable-accounts", {
method: "PUT",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(draft),
});
if (!res.ok) throw new Error("Failed to save auto-disable config");
const savedData = await res.json();
setData(savedData);
setEditMode(false);
notify.success(t("savedSuccessfully") || "Saved successfully");
} catch (err) {
notify.error(err instanceof Error ? err.message : "Error saving");
} finally {
setSaving(false);
}
};
if (loading) return null;
return (
<Card className="p-0 overflow-hidden">
<div className="p-6">
<div className="flex items-center justify-between mb-4">
<div className="flex items-center gap-2">
<span className="material-symbols-outlined text-xl text-primary" aria-hidden="true">
block
</span>
<h2 className="text-lg font-bold">{t("autoDisableBannedAccounts")}</h2>
</div>
{editMode ? (
<div className="flex gap-2">
<Button
size="sm"
variant="secondary"
onClick={() => {
setDraft(data);
setEditMode(false);
}}
>
{tc("cancel")}
</Button>
<Button
size="sm"
variant="primary"
icon="save"
onClick={handleSave}
disabled={saving}
>
{tc("save")}
</Button>
</div>
) : (
<Button size="sm" variant="secondary" icon="edit" onClick={() => setEditMode(true)}>
{tc("edit")}
</Button>
)}
</div>
<p className="text-sm text-text-muted mb-4">{t("autoDisableDescription")}</p>
<div className="grid grid-cols-1 sm:grid-cols-2 gap-4">
<div className="rounded-lg bg-black/5 dark:bg-white/5 p-4 flex flex-col justify-center">
<label className="flex items-center gap-3 cursor-pointer">
<input
type="checkbox"
checked={editMode ? draft.enabled : data.enabled}
onChange={(e) => setDraft((prev) => ({ ...prev, enabled: e.target.checked }))}
disabled={!editMode}
className="w-4 h-4 text-primary bg-surface/50 border-white/20 rounded focus:ring-primary/50"
/>
<span className="text-sm font-medium">{t("autoDisableBannedAccounts")}</span>
</label>
</div>
<div className="rounded-lg bg-black/5 dark:bg-white/5 p-4">
<h3 className="text-xs font-bold uppercase tracking-wider mb-2 flex items-center gap-2">
{t("autoDisableThreshold")}
</h3>
{editMode ? (
<Input
type="number"
min="1"
max="10"
value={draft.threshold}
onChange={(e) =>
setDraft((prev) => ({ ...prev, threshold: parseInt(e.target.value) || 1 }))
}
disabled={!draft.enabled}
/>
) : (
<span className={`text-sm font-mono ${data.enabled ? "" : "opacity-50"}`}>
{data.threshold} {t("failures", { count: data.threshold })}
</span>
)}
<p className="text-xs text-text-muted mt-2">{t("autoDisableThresholdDesc")}</p>
</div>
</div>
</div>
</Card>
);
}
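The threshold input in `AutoDisableCard` declares `min="1"` and `max="10"`, but those attributes do not stop a user from typing a value outside the range, and `parseInt(e.target.value) || 1` passes values such as `99` or `-3` straight into the draft. A clamping parser is one way to keep the draft within bounds. This is a sketch only: `parseThreshold` is a hypothetical helper, and the API route may already validate the value server-side.

```typescript
// Parses a raw input string and clamps it to [min, max].
// Non-numeric input falls back to the lower bound.
function parseThreshold(raw: string, min = 1, max = 10): number {
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) return min;
  return Math.min(max, Math.max(min, parsed));
}
```

In the component this would replace the inline expression, for example `threshold: parseThreshold(e.target.value)`.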
@@ -0,0 +1,191 @@
"use client";
import { useState, useEffect } from "react";
import { Card, Button } from "@/shared/components";
import { useTranslations } from "next-intl";
interface CacheConfig {
semanticCacheEnabled: boolean;
semanticCacheMaxSize: number;
semanticCacheTTL: number;
promptCacheEnabled: boolean;
promptCacheStrategy: "auto" | "system-only" | "manual";
alwaysPreserveClientCache: "auto" | "always" | "never";
}
export default function CacheSettingsTab() {
const t = useTranslations("settings");
const [config, setConfig] = useState<CacheConfig>({
semanticCacheEnabled: true,
semanticCacheMaxSize: 100,
semanticCacheTTL: 1800000,
promptCacheEnabled: true,
promptCacheStrategy: "auto",
alwaysPreserveClientCache: "auto",
});
const [saving, setSaving] = useState(false);
const [loading, setLoading] = useState(true);
useEffect(() => {
fetch("/api/settings/cache-config")
.then((r) => (r.ok ? r.json() : null))
.then((data) => {
if (data) setConfig(data);
})
.catch(() => {})
.finally(() => setLoading(false));
}, []);
const handleSave = async () => {
setSaving(true);
try {
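const res = await fetch("/api/settings/cache-config", {
method: "PUT",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(config),
});
// Surface failed saves instead of silently ignoring them.
if (!res.ok) console.error(`[CacheSettingsTab] Save failed: HTTP ${res.status}`);
} catch (err) {
console.error("[CacheSettingsTab] Save failed:", err);
} finally {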
setSaving(false);
}
};
if (loading) {
return (
<Card className="p-6">
<p className="text-sm text-text-muted">{t("loading")}</p>
</Card>
);
}
return (
<Card className="p-6">
<h3 className="text-lg font-semibold text-text-main flex items-center gap-2 mb-4">
<span className="material-symbols-outlined text-[20px]">cached</span>
{t("cacheSettings")}
</h3>
<div className="space-y-6">
{/* Semantic Cache */}
<div className="space-y-3">
<h4 className="text-sm font-medium text-text-main">{t("semanticCache")}</h4>
<label className="flex items-center justify-between">
<span className="text-sm text-text-muted">{t("enabled")}</span>
<button
onClick={() =>
setConfig((c) => ({ ...c, semanticCacheEnabled: !c.semanticCacheEnabled }))
}
className={`relative w-10 h-5 rounded-full transition-colors ${
config.semanticCacheEnabled ? "bg-green-500" : "bg-border"
}`}
>
<span
className={`absolute top-0.5 w-4 h-4 rounded-full bg-white transition-transform ${
config.semanticCacheEnabled ? "left-5" : "left-0.5"
}`}
/>
</button>
</label>
<label className="flex items-center justify-between">
<span className="text-sm text-text-muted">{t("maxEntries")}</span>
<input
type="number"
min={1}
max={1000}
value={config.semanticCacheMaxSize}
onChange={(e) =>
setConfig((c) => ({ ...c, semanticCacheMaxSize: parseInt(e.target.value) || 100 }))
}
className="w-24 px-2 py-1 text-sm rounded border border-border bg-surface text-text-main"
/>
</label>
<label className="flex items-center justify-between">
<span className="text-sm text-text-muted">{t("ttlMinutes")}</span>
<input
type="number"
min={1}
max={1440}
value={Math.round(config.semanticCacheTTL / 60000)}
onChange={(e) =>
setConfig((c) => ({
...c,
semanticCacheTTL: (parseInt(e.target.value) || 30) * 60000,
}))
}
className="w-24 px-2 py-1 text-sm rounded border border-border bg-surface text-text-main"
/>
</label>
</div>
{/* Prompt Cache */}
<div className="space-y-3 pt-4 border-t border-border/30">
<h4 className="text-sm font-medium text-text-main">{t("promptCache")}</h4>
<label className="flex items-center justify-between">
<span className="text-sm text-text-muted">{t("enabled")}</span>
<button
onClick={() =>
setConfig((c) => ({ ...c, promptCacheEnabled: !c.promptCacheEnabled }))
}
className={`relative w-10 h-5 rounded-full transition-colors ${
config.promptCacheEnabled ? "bg-green-500" : "bg-border"
}`}
>
<span
className={`absolute top-0.5 w-4 h-4 rounded-full bg-white transition-transform ${
config.promptCacheEnabled ? "left-5" : "left-0.5"
}`}
/>
</button>
</label>
<label className="flex items-center justify-between">
<span className="text-sm text-text-muted">{t("strategy")}</span>
<select
value={config.promptCacheStrategy}
onChange={(e) =>
setConfig((c) => ({
...c,
promptCacheStrategy: e.target.value as CacheConfig["promptCacheStrategy"],
}))
}
className="px-2 py-1 text-sm rounded border border-border bg-surface text-text-main"
>
<option value="auto">Auto</option>
<option value="system-only">System Only</option>
<option value="manual">Manual</option>
</select>
</label>
<label className="flex items-center justify-between">
<span className="text-sm text-text-muted">{t("preserveClientCache")}</span>
<select
value={config.alwaysPreserveClientCache}
onChange={(e) =>
setConfig((c) => ({
...c,
alwaysPreserveClientCache: e.target
.value as CacheConfig["alwaysPreserveClientCache"],
}))
}
className="px-2 py-1 text-sm rounded border border-border bg-surface text-text-main"
>
<option value="auto">Auto</option>
<option value="always">Always</option>
<option value="never">Never</option>
</select>
</label>
</div>
{/* Save */}
<div className="pt-4 border-t border-border/30">
<Button onClick={handleSave} disabled={saving} size="sm">
{saving ? t("saving") : t("save")}
</Button>
</div>
</div>
</Card>
);
}
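`CacheSettingsTab` stores the semantic cache TTL in milliseconds but edits it in minutes, converting inline in both directions. The sketch below names those two conversions so the round trip is explicit. `ttlMsToMinutes` and `minutesInputToTtlMs` are hypothetical helpers; the 30-minute fallback mirrors the `|| 30` in the component and, as there, also applies when the input parses to `0`.

```typescript
const MS_PER_MINUTE = 60_000;

// Value shown in the number input.
function ttlMsToMinutes(ttlMs: number): number {
  return Math.round(ttlMs / MS_PER_MINUTE);
}

// Value written back to the config; empty, non-numeric, or zero input
// falls back to the default number of minutes.
function minutesInputToTtlMs(raw: string, fallbackMinutes = 30): number {
  const minutes = Number.parseInt(raw, 10);
  const effective = Number.isNaN(minutes) || minutes === 0 ? fallbackMinutes : minutes;
  return effective * MS_PER_MINUTE;
}
```

With the default state from the component, `ttlMsToMinutes(1800000)` yields `30`, and typing `45` stores `2700000`.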
@@ -1,7 +1,7 @@
"use client";
import { useState, useEffect, useRef } from "react";
import { Card, Button, ProxyConfigModal } from "@/shared/components";
import { Card, Button, ProxyConfigModal, Toggle } from "@/shared/components";
import { useTranslations } from "next-intl";
import ProxyRegistryManager from "./ProxyRegistryManager";
@@ -11,6 +11,8 @@ export default function ProxyTab() {
const mountedRef = useRef(true);
const t = useTranslations("settings");
const tc = useTranslations("common");
const [debugMode, setDebugMode] = useState(false);
const [loading, setLoading] = useState(true);
const loadGlobalProxy = async () => {
try {
@@ -22,6 +24,21 @@ export default function ProxyTab() {
} catch {}
};
const updateDebugMode = async (value: boolean) => {
try {
const res = await fetch("/api/settings", {
method: "PATCH",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ debugMode: value }),
});
if (res.ok) {
setDebugMode(value);
}
} catch (err) {
console.error("Failed to update debugMode:", err);
}
};
useEffect(() => {
mountedRef.current = true;
async function init() {
@@ -40,6 +57,19 @@ export default function ProxyTab() {
};
}, []);
useEffect(() => {
fetch("/api/settings")
.then((res) => {
if (!res.ok) throw new Error(`HTTP error ${res.status}`);
return res.json();
})
.then((data) => {
setDebugMode(data.debugMode === true);
setLoading(false);
})
.catch(() => setLoading(false));
}, []);
return (
<>
<div className="flex flex-col gap-6">
@@ -78,6 +108,18 @@ export default function ProxyTab() {
</Card>
<ProxyRegistryManager />
<Card className="p-6 mt-4">
<div className="flex items-center justify-between">
<div>
<p className="font-medium">{t("debugToggle")}</p>
</div>
<Toggle
checked={debugMode}
onChange={() => updateDebugMode(!debugMode)}
disabled={loading}
/>
</div>
</Card>
</div>
<ProxyConfigModal
@@ -4,6 +4,7 @@ import { useState, useEffect, useCallback } from "react";
import { Card, Button } from "@/shared/components";
import { useNotificationStore } from "@/store/notificationStore";
import { useLocale, useTranslations } from "next-intl";
import AutoDisableCard from "./AutoDisableCard";
// ─── State colors and labels ──────────────────────────────────────────────
const STATE_STYLES = {
@@ -656,6 +657,8 @@ export default function ResilienceTab() {
onSave={handleSaveProfiles}
saving={saving}
/>
{/* 1.5 Auto Disable Banned Accounts */}
<AutoDisableCard />
{/* 2. Rate Limiting (editable defaults + active limiters) */}
<RateLimitCard
rateLimitStatus={data?.rateLimitStatus || []}
@@ -152,6 +152,40 @@ export default function RoutingTab() {
</p>
</Card>
{/* Adaptive Volume Routing */}
<Card>
<div className="flex items-start justify-between gap-4">
<div className="flex gap-3">
<div className="p-2 rounded-lg bg-emerald-500/10 text-emerald-500 h-fit">
<span className="material-symbols-outlined text-[20px]" aria-hidden="true">
network_ping
</span>
</div>
<div>
<h3 className="text-lg font-semibold">
{t("adaptiveVolumeRouting") || "Adaptive Volume Routing"}
</h3>
<p className="text-sm text-text-muted mt-1">
{t("adaptiveVolumeRoutingDesc") ||
"Automatically adjusts traffic volume between providers based on real-time latency and error rates."}
</p>
</div>
</div>
<div className="pt-1">
<label className="relative inline-flex items-center cursor-pointer">
<input
type="checkbox"
className="sr-only peer"
checked={!!settings.adaptiveVolumeRouting}
onChange={(e) => updateSetting({ adaptiveVolumeRouting: e.target.checked })}
disabled={loading}
/>
<div className="w-11 h-6 bg-border peer-focus:outline-none rounded-full peer peer-checked:after:translate-x-full peer-checked:after:border-white after:content-[''] after:absolute after:top-[2px] after:left-[2px] after:bg-white after:border-gray-300 after:border after:rounded-full after:h-5 after:w-5 after:transition-all peer-checked:bg-primary"></div>
</label>
</div>
</div>
</Card>
{/* Wildcard Aliases */}
<Card>
<div className="flex items-center gap-3 mb-4">
@@ -18,6 +18,7 @@ import ModelAliasesTab from "./components/ModelAliasesTab";
import BackgroundDegradationTab from "./components/BackgroundDegradationTab";
import CacheStatsCard from "./components/CacheStatsCard";
import CacheSettingsTab from "./components/CacheSettingsTab";
import ResilienceTab from "./components/ResilienceTab";
const tabs = [
@@ -89,6 +90,7 @@ export default function SettingsPage() {
<CodexServiceTierTab />
<SystemPromptTab />
<CacheStatsCard />
<CacheSettingsTab />
</div>
)}
@@ -7,6 +7,7 @@ import Card from "@/shared/components/Card";
import Badge from "@/shared/components/Badge";
import QuotaProgressBar from "./QuotaProgressBar";
import { calculatePercentage } from "./utils";
import ProviderIcon from "@/shared/components/ProviderIcon";
const planVariants = {
free: "default",
@@ -70,15 +71,7 @@ export default function ProviderLimitCard({
{provider?.slice(0, 2).toUpperCase() || "PR"}
</span>
) : (
<Image
src={`/providers/${provider}.png`}
alt={provider || t("providerLimits")}
width={40}
height={40}
className="object-contain rounded-lg"
sizes="40px"
onError={() => setImgError(true)}
/>
<ProviderIcon providerId={provider} size={40} />
)}
</div>

Some files were not shown because too many files have changed in this diff