Compare commits
54 Commits
| SHA1 | Author | Date | |
|---|---|---|---|
| 44e4d55a66 | |||
| 095c84ac16 | |||
| e063eae727 | |||
| f02c5b5c69 | |||
| 838f1d645c | |||
| ce2c30c437 | |||
| d56fae0a7b | |||
| e45ef00bef | |||
| e9f31f7394 | |||
| 7c10a98eb2 | |||
| f260483101 | |||
| 389e6e5c9e | |||
| 1cfd5866be | |||
| c7ceac7f41 | |||
| cd6eca0424 | |||
| 8c6136fea0 | |||
| 9644444028 | |||
| 9c4154291d | |||
| 533f5f6da6 | |||
| 1b8de756cd | |||
| 650b415537 | |||
| 04b50329fc | |||
| 25aab8c55c | |||
| ceda2e70c1 | |||
| 2908303d4b | |||
| a9f69711c6 | |||
| a8ab16a720 | |||
| 8091b6b508 | |||
| a00ef0fc7e | |||
| 5ce6d615a4 | |||
| e06b69cdac | |||
| d261ae7883 | |||
| 6fa77a63d7 | |||
| f76c1b32d6 | |||
| 0aede2ef63 | |||
| 1e3a2e0a27 | |||
| 1bdabf43db | |||
| 05e568feb0 | |||
| 81e2519436 | |||
| ef623c9bb5 | |||
| da581525a6 | |||
| 6ff7b6570c | |||
| 8b2081837e | |||
| ce978b602a | |||
| 9b00f5d550 | |||
| d98ec59c79 | |||
| d79b55be5a | |||
| 1f9a402dcd | |||
| f9bcc9418b | |||
| 08256a3502 | |||
| 9b255e643a | |||
| ca1f918e9e | |||
| bb3fe1cd48 | |||
| d139b4557f |
@@ -4,73 +4,81 @@ description: Deploy the latest OmniRoute code to the Akamai VPS (69.164.221.35)
|
||||
|
||||
# Deploy to VPS Workflow
|
||||
|
||||
Deploy OmniRoute to the production VPS using `npm install -g` + PM2.
|
||||
Deploy OmniRoute to the production VPS using `npm pack + scp` + PM2.
|
||||
|
||||
**VPS:** `69.164.221.35` (Akamai, Ubuntu 24.04, 1GB RAM + 2.5GB swap)
|
||||
**Local VPS:** `192.168.0.15` (same setup)
|
||||
**Process manager:** PM2 (`omniroute`)
|
||||
**Port:** `20128`
|
||||
**PM2 entry:** `/usr/lib/node_modules/omniroute/app/server.js`
|
||||
|
||||
> [!IMPORTANT]
|
||||
> PM2 runs from the global npm package at `/usr/lib/node_modules/omniroute`.
|
||||
> **DO NOT** use git clone or local copies. The `npm install -g` command handles
|
||||
> building, publishing, and installing the standalone app in one step.
|
||||
> The Next.js standalone build is at `app/server.js` inside that directory.
|
||||
> The npm registry rejects packages > 100MB, so deployment uses **npm pack + scp**.
|
||||
|
||||
## Steps
|
||||
|
||||
### 1. Publish to npm
|
||||
### 1. Build + pack locally
|
||||
|
||||
Ensure the version in `package.json` is bumped and the package is published:
|
||||
Run the full build (includes hash-strip patch) and create the .tgz:
|
||||
|
||||
// turbo
|
||||
|
||||
```bash
|
||||
npm publish
|
||||
cd /home/diegosouzapw/dev/proxys/9router && npm run build:cli && npm pack --ignore-scripts
|
||||
```
|
||||
|
||||
### 2. Install on VPS and restart PM2
|
||||
### 2. Copy to both VPS and install
|
||||
|
||||
// turbo-all
|
||||
|
||||
```bash
|
||||
ssh root@69.164.221.35 "npm install -g omniroute@latest && pm2 restart omniroute && pm2 save && echo '✅ Deploy complete!'"
|
||||
scp omniroute-*.tgz root@69.164.221.35:/tmp/ && scp omniroute-*.tgz root@192.168.0.15:/tmp/
|
||||
```
|
||||
|
||||
For the local VPS:
|
||||
```bash
|
||||
ssh root@69.164.221.35 "npm install -g /tmp/omniroute-*.tgz --ignore-scripts && pm2 restart omniroute && pm2 save && echo '✅ Akamai done'"
|
||||
```
|
||||
|
||||
```bash
|
||||
ssh root@192.168.0.15 "npm install -g omniroute@latest && pm2 restart omniroute && pm2 save && echo '✅ Deploy complete!'"
|
||||
ssh root@192.168.0.15 "npm install -g /tmp/omniroute-*.tgz --ignore-scripts && pm2 restart omniroute && pm2 save && echo '✅ Local done'"
|
||||
```
|
||||
|
||||
### 3. Verify the deployment
|
||||
|
||||
```bash
|
||||
ssh root@69.164.221.35 "pm2 list && cat \$(npm root -g)/omniroute/package.json | grep version | head -1 && curl -s -o /dev/null -w 'HTTP %{http_code}' http://localhost:20128/"
|
||||
ssh root@69.164.221.35 "pm2 list && cat \$(npm root -g)/omniroute/app/package.json | grep version | head -1 && curl -s -o /dev/null -w 'HTTP %{http_code}' http://localhost:20128/"
|
||||
```
|
||||
|
||||
Expected: PM2 shows `online`, version matches published, HTTP returns `307` (redirect to login).
|
||||
Expected: PM2 shows `online`, version matches, HTTP returns `307`.
|
||||
|
||||
## How it works
|
||||
|
||||
1. `npm publish` builds Next.js standalone + bundles everything into the npm package
|
||||
2. `npm install -g omniroute@latest` downloads and installs to `/usr/lib/node_modules/omniroute/`
|
||||
3. PM2 is registered to run `npm start` from that directory (cwd: `/usr/lib/node_modules/omniroute`)
|
||||
4. `pm2 restart omniroute` picks up the new code immediately
|
||||
1. `npm run build:cli` builds Next.js standalone → `app/` and strips Turbopack hashed require() calls from chunks
|
||||
2. `npm pack --ignore-scripts` packages without re-running the build
|
||||
3. `scp` transfers the .tgz to each VPS (~286MB)
|
||||
4. `npm install -g /tmp/omniroute-*.tgz --ignore-scripts` installs pre-built package
|
||||
5. PM2 runs `app/server.js` from `/usr/lib/node_modules/omniroute`
|
||||
|
||||
## PM2 Setup (one-time)
|
||||
|
||||
If PM2 needs to be reconfigured from scratch:
|
||||
## PM2 Setup (one-time — if reconfiguring from scratch)
|
||||
|
||||
```bash
|
||||
ssh root@<VPS> "
|
||||
cd /usr/lib/node_modules/omniroute &&
|
||||
PORT=20128 pm2 start app/server.js --name omniroute --env PORT=20128 &&
|
||||
pm2 save &&
|
||||
pm2 startup
|
||||
pm2 delete omniroute ;
|
||||
cp /opt/omniroute-app/.env /usr/lib/node_modules/omniroute/.env &&
|
||||
PORT=20128 pm2 start /usr/lib/node_modules/omniroute/app/server.js --name omniroute --cwd /usr/lib/node_modules/omniroute/app &&
|
||||
pm2 save && pm2 startup
|
||||
"
|
||||
```
|
||||
|
||||
> [!NOTE]
|
||||
> Copy `.env` from the old installation first. For Akamai it was at `/opt/omniroute-app/.env`,
|
||||
> for the local VPS it was at `/root/omniroute-fresh/.env`.
|
||||
|
||||
## Notes
|
||||
|
||||
- The `.env` file is at `/usr/lib/node_modules/omniroute/.env`. Back it up before major npm updates.
|
||||
- PM2 is configured with `pm2 startup` to auto-restart on reboot.
|
||||
- Nginx proxies `omniroute.online` → `localhost:20128`.
|
||||
- The VPS has only 1GB RAM — builds happen locally via `npm publish`, not on the VPS.
|
||||
- `.env` should be placed at `/usr/lib/node_modules/omniroute/app/.env`
|
||||
- PM2 is configured with `pm2 startup` to auto-restart on reboot
|
||||
- Nginx proxies `omniroute.online` → `localhost:20128`
|
||||
- The VPS has only 1GB RAM — builds happen locally, never on the VPS
|
||||
|
||||
@@ -32,6 +32,27 @@ Version format: `2.x.y` — examples:
|
||||
npm version patch --no-git-tag-version
|
||||
```
|
||||
|
||||
> **⚠️ ATOMIC COMMIT RULE — Version bump MUST happen before committing feature files.**
|
||||
>
|
||||
> **CORRECT order:**
|
||||
>
|
||||
> 1. `npm version patch --no-git-tag-version` ← bump first
|
||||
> 2. implement features / fix bugs
|
||||
> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
|
||||
>
|
||||
> **OR if features are already staged:**
|
||||
>
|
||||
> 1. implement features (do NOT commit yet)
|
||||
> 2. `npm version patch --no-git-tag-version` ← bump before committing
|
||||
> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
|
||||
>
|
||||
> **NEVER do this (creates version mismatch in git history):**
|
||||
>
|
||||
> - ~~commit features → then bump version → commit package.json separately~~
|
||||
>
|
||||
> This ensures that `git show v2.x.y` always contains both code changes and the version bump together.
|
||||
> The GitHub release tag will point to a commit that includes ALL changes for that version.
|
||||
|
||||
### 2. Regenerate lock file (REQUIRED after version bump)
|
||||
|
||||
**Mandatory** — skipping causes `@swc/helpers` lock mismatch and CI failures:
|
||||
@@ -85,12 +106,49 @@ git push origin main --tags
|
||||
gh release create v2.x.y --title "v2.x.y — summary" --notes "..."
|
||||
```
|
||||
|
||||
### 8. Deploy to VPS (if requested)
|
||||
### 8. 🐳 Trigger Docker Hub build (MANDATORY — keep npm and Docker in sync)
|
||||
|
||||
See `/deploy-vps` workflow for Akamai VPS or use npm for local VPS:
|
||||
> **CRITICAL**: Docker Hub and npm MUST always publish the same version.
|
||||
> The Docker image is built automatically via GitHub Actions when a new tag is pushed.
|
||||
> After pushing the tag in steps 5-6, **verify the workflow runs**:
|
||||
|
||||
```bash
|
||||
ssh root@<VPS_IP> "npm install -g omniroute@2.x.y && pm2 restart omniroute"
|
||||
# Verify the Docker workflow triggered
|
||||
gh run list --repo diegosouzapw/OmniRoute --workflow docker-publish.yml --limit 3
|
||||
|
||||
# Wait for the Docker build to complete (usually 5–10 min)
|
||||
gh run watch --repo diegosouzapw/OmniRoute
|
||||
|
||||
# After completion, verify on Docker Hub:
|
||||
# https://hub.docker.com/r/diegosouzapw/omniroute/tags
|
||||
```
|
||||
|
||||
If the Docker build was not triggered automatically, trigger it manually:
|
||||
|
||||
```bash
|
||||
gh workflow run docker-publish.yml --repo diegosouzapw/OmniRoute --ref v2.x.y
|
||||
```
|
||||
|
||||
### 9. Deploy to BOTH VPS environments (MANDATORY)
|
||||
|
||||
> Always deploy to **both** environments after every release.
|
||||
> See `/deploy-vps` workflow for detailed steps.
|
||||
|
||||
```bash
|
||||
# Build and pack locally
|
||||
cd /home/diegosouzapw/dev/proxys/9router && npm run build:cli && npm pack --ignore-scripts
|
||||
|
||||
# Deploy to LOCAL VPS (192.168.0.15)
|
||||
scp omniroute-*.tgz root@192.168.0.15:/tmp/
|
||||
ssh root@192.168.0.15 "npm install -g /tmp/omniroute-*.tgz --ignore-scripts && pm2 restart omniroute && pm2 save"
|
||||
|
||||
# Deploy to AKAMAI VPS (69.164.221.35)
|
||||
scp omniroute-*.tgz root@69.164.221.35:/tmp/
|
||||
ssh root@69.164.221.35 "npm install -g /tmp/omniroute-*.tgz --ignore-scripts && pm2 restart omniroute && pm2 save"
|
||||
|
||||
# Verify both
|
||||
curl -s -o /dev/null -w "LOCAL: HTTP %{http_code}\n" http://192.168.0.15:20128/
|
||||
curl -s -o /dev/null -w "AKAMAI: HTTP %{http_code}\n" http://69.164.221.35:20128/
|
||||
```
|
||||
|
||||
## Notes
|
||||
|
||||
@@ -21,8 +21,8 @@ This workflow fetches all open issues from the project's GitHub repository, clas
|
||||
|
||||
// turbo
|
||||
|
||||
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 100 --json number,title,labels,body,comments,createdAt,author`
|
||||
- Parse the JSON output to get a list of all open issues
|
||||
- Run: `gh issue list --repo <owner>/<repo> --state open --limit 500 --json number,title,labels,body,comments,createdAt,author`
|
||||
- Parse the JSON output to get a list of **all** open issues
|
||||
- Sort by oldest first (FIFO)
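The parse-and-sort step can be sketched as follows. This is a minimal illustration that assumes the JSON shape produced by the `--json` fields in the command above; the `Issue` type and `sortOldestFirst` helper are hypothetical names, not part of the workflow.

```typescript
// Hypothetical shape: only the fields needed for ordering are declared.
interface Issue {
  number: number;
  title: string;
  createdAt: string; // ISO 8601 timestamp from `--json createdAt`
}

// Parse the raw `gh issue list` JSON output and return the issues oldest-first (FIFO).
function sortOldestFirst(rawJson: string): Issue[] {
  const issues = JSON.parse(rawJson) as Issue[];
  return issues
    .slice() // copy so the parsed array is not mutated
    .sort((a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt));
}
```

Sorting on the parsed timestamp rather than the issue number keeps the order correct even if issues were transferred or reopened.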
|
||||
|
||||
### 3. Classify Each Issue
|
||||
|
||||
@@ -18,7 +18,11 @@ This workflow fetches all open PRs from the project's GitHub repository, perform
|
||||
|
||||
### 2. Fetch Open Pull Requests
|
||||
|
||||
- Navigate to `https://github.com/<owner>/<repo>/pulls` and scrape all open PRs
|
||||
// turbo
|
||||
|
||||
- Run: `gh pr list --repo <owner>/<repo> --state open --limit 500 --json number,title,author,headRefName,body,createdAt,additions,deletions,files`
|
||||
- This fetches **all** open PRs without restriction. Get the diff for each with:
|
||||
`gh pr diff <NUMBER> --repo <owner>/<repo>`
|
||||
- For each open PR, collect:
|
||||
- PR number, title, author, branch, number of commits, date
|
||||
- PR description/body
|
||||
|
||||
+183
@@ -4,6 +4,189 @@
|
||||
|
||||
---
|
||||
|
||||
## [2.7.0] — 2026-03-17
|
||||
|
||||
> Sprint: ClawRouter-inspired features — toolCalling flag, multilingual intent detection, benchmark-driven fallback, request deduplication, pluggable RouterStrategy, Grok-4 Fast + GLM-5 + MiniMax M2.5 + Kimi K2.5 pricing.
|
||||
|
||||
### ✨ New Models & Pricing
|
||||
|
||||
- **feat(pricing)**: xAI Grok-4 Fast — `$0.20/$0.50 per 1M tokens`, 1143ms p50 latency, tool calling supported
|
||||
- **feat(pricing)**: xAI Grok-4 (standard) — `$0.20/$1.50 per 1M tokens`, reasoning flagship
|
||||
- **feat(pricing)**: GLM-5 via Z.AI — `$0.5/1M`, 128K output context
|
||||
- **feat(pricing)**: MiniMax M2.5 — `$0.30/1M input`, reasoning + agentic tasks
|
||||
- **feat(pricing)**: DeepSeek V3.2 — updated pricing `$0.27/$1.10 per 1M`
|
||||
- **feat(pricing)**: Kimi K2.5 via Moonshot API — direct Moonshot API access
|
||||
- **feat(providers)**: Z.AI provider added (`zai` alias) — GLM-5 family with 128K output
|
||||
|
||||
### 🧠 Routing Intelligence
|
||||
|
||||
- **feat(registry)**: `toolCalling` flag per model in provider registry — combos can now prefer/require tool-calling capable models
|
||||
- **feat(scoring)**: Multilingual intent detection for AutoCombo scoring — PT/ZH/ES/AR script/language patterns influence model selection per request context
|
||||
- **feat(fallback)**: Benchmark-driven fallback chains — real latency data (p50 from `comboMetrics`) used to re-order fallback priority dynamically
|
||||
- **feat(dedup)**: Request deduplication via content-hash — 5-second idempotency window prevents duplicate provider calls from retrying clients
|
||||
- **feat(router)**: Pluggable `RouterStrategy` interface in `autoCombo/routerStrategy.ts` — custom routing logic can be injected without modifying core
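The request deduplication entry above can be sketched roughly as below. This is not the project's implementation: the `RequestDeduper` class, the FNV-1a hash (a stand-in for whatever content hash the real code uses), and the injected clock are all assumptions made for illustration.

```typescript
// FNV-1a 32-bit hash. Assumption: a simple stand-in for the real content hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Hypothetical deduper: identical request bodies inside the window share one provider call.
class RequestDeduper<T> {
  private readonly recent = new Map<string, { startedAt: number; result: Promise<T> }>();
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(windowMs: number = 5000, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
  }

  run(body: unknown, call: () => Promise<T>): Promise<T> {
    // Note: JSON.stringify is key-order sensitive, so reordered keys hash differently.
    const key = fnv1a(JSON.stringify(body));
    const startedAt = this.now();
    const existing = this.recent.get(key);
    if (existing && startedAt - existing.startedAt < this.windowMs) {
      return existing.result; // duplicate inside the window: reuse the first call
    }
    const result = call();
    this.recent.set(key, { startedAt, result }); // a real version would also evict stale keys
    return result;
  }
}
```

A retrying client that resends the same body within five seconds receives the promise of the first call, so the provider is contacted once.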
|
||||
|
||||
### 🔧 MCP Server Improvements
|
||||
|
||||
- **feat(mcp)**: 2 new advanced tool schemas: `omniroute_get_provider_metrics` (p50/p95/p99 per provider) and `omniroute_explain_route` (routing decision explanation)
|
||||
- **feat(mcp)**: MCP tool auth scopes updated — `metrics:read` scope added for provider metrics tools
|
||||
- **feat(mcp)**: `omniroute_best_combo_for_task` now accepts `languageHint` parameter for multilingual routing
|
||||
|
||||
### 📊 Observability
|
||||
|
||||
- **feat(metrics)**: `comboMetrics.ts` extended with real-time latency percentile tracking per provider/account
|
||||
- **feat(health)**: Health API (`/api/monitoring/health`) now returns per-provider `p50Latency` and `errorRate` fields
|
||||
- **feat(usage)**: Usage history migration for per-model latency tracking
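For readers unfamiliar with latency percentiles, the sketch below shows one common way to derive p50/p95/p99 from raw samples (the nearest-rank method). It is an illustration only; how `comboMetrics.ts` actually computes its figures is not stated here, and the function names are hypothetical.

```typescript
// Nearest-rank percentile over latency samples in milliseconds.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) return NaN;
  const sorted = samples.slice().sort((a, b) => a - b);
  // Integer arithmetic first, to avoid floating-point drift in the rank.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical summary shape mirroring the p50/p95/p99 figures mentioned in this release.
function summarizeLatency(samples: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```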
|
||||
|
||||
### 🗄️ DB Migrations
|
||||
|
||||
- **feat(migrations)**: New column `latency_p50` in `combo_metrics` table — zero-breaking, safe for existing users
|
||||
|
||||
### 🐛 Bug Fixes / Closures
|
||||
|
||||
- **close(#411)**: better-sqlite3 hashed module resolution on Windows — fixed in v2.6.10 (f02c5b5)
|
||||
- **close(#409)**: GitHub Copilot chat completions fail with Claude models when files attached — fixed in v2.6.9 (838f1d6)
|
||||
- **close(#405)**: Duplicate of #411 — resolved
|
||||
|
||||
## [2.6.10] — 2026-03-17
|
||||
|
||||
> Windows fix: better-sqlite3 prebuilt download without node-gyp/Python/MSVC (#426).
|
||||
|
||||
### 🐛 Bug Fixes
|
||||
|
||||
- **fix(install/#426)**: On Windows, `npm install -g omniroute` used to fail with `better_sqlite3.node is not a valid Win32 application` because the bundled native binary was compiled for Linux. Adds **Strategy 1.5** to `scripts/postinstall.mjs`: uses `@mapbox/node-pre-gyp install --fallback-to-build=false` (bundled within `better-sqlite3`) to download the correct prebuilt binary for the current OS/arch without requiring any build tools (no node-gyp, no Python, no MSVC). Falls back to `npm rebuild` only if the download fails. Adds platform-specific error messages with clear manual fix instructions.
|
||||
|
||||
---
|
||||
|
||||
## [2.6.9] — 2026-03-17
|
||||
|
||||
> CI fixes (t11 any-budget), bug fix #409 (file attachments via Copilot+Claude), release workflow correction.
|
||||
|
||||
### 🐛 Bug Fixes
|
||||
|
||||
- **fix(ci)**: Remove the word "any" from comments in `openai-responses.ts` and `chatCore.ts` that were failing the t11 `\bany\b` budget check (a false positive: the regex also counts comments)
|
||||
- **fix(chatCore)**: Normalize unsupported content part types before forwarding to providers (#409 — Cursor sends `{type:"file"}` when `.md` files are attached; Copilot and other OpenAI-compat providers reject with "type has to be either 'image_url' or 'text'"; fix converts `file`/`document` blocks to `text` and drops unknown types)
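The content-part normalization described in the `chatCore` fix could look roughly like this. It is a sketch under assumptions: the `ContentPart` type is simplified, and reading the attachment text from a `text` field is a guess at the payload shape, not the confirmed format Cursor sends.

```typescript
// Simplified content part: a `type` tag plus arbitrary extra fields.
type ContentPart = { type: string; [key: string]: unknown };

// Keep what OpenAI-compatible providers accept, convert file/document to text, drop the rest.
function normalizeContentParts(parts: ContentPart[]): ContentPart[] {
  const normalized: ContentPart[] = [];
  for (const part of parts) {
    if (part.type === "text" || part.type === "image_url") {
      normalized.push(part);
    } else if (part.type === "file" || part.type === "document") {
      // Assumption: the attachment's textual payload is available under `text`.
      const raw = part.text;
      normalized.push({ type: "text", text: typeof raw === "string" ? raw : "" });
    }
    // Unknown types fall through and are dropped.
  }
  return normalized;
}
```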
|
||||
|
||||
### 🔧 Workflow
|
||||
|
||||
- **chore(generate-release)**: Add ATOMIC COMMIT RULE — version bump (`npm version patch`) MUST happen before committing feature files to ensure tag always points to a commit containing all version changes together
|
||||
|
||||
---
|
||||
|
||||
## [2.6.8] — 2026-03-17
|
||||
|
||||
> Sprint: Combo as Agent (system prompt + tool filter), Context Caching Protection, Auto-Update, Detailed Logs, MITM Kiro IDE.
|
||||
|
||||
### 🗄️ DB Migrations (zero-breaking — safe for existing users)
|
||||
|
||||
- **005_combo_agent_fields.sql**: `ALTER TABLE combos ADD COLUMN system_message TEXT DEFAULT NULL`, `tool_filter_regex TEXT DEFAULT NULL`, `context_cache_protection INTEGER DEFAULT 0`
|
||||
- **006_detailed_request_logs.sql**: New `request_detail_logs` table with 500-entry ring-buffer trigger, opt-in via settings toggle
|
||||
|
||||
### ✨ Features
|
||||
|
||||
- **feat(combo)**: System Message Override per Combo (#399 — `system_message` field replaces or injects system prompt before forwarding to provider)
|
||||
- **feat(combo)**: Tool Filter Regex per Combo (#399 — `tool_filter_regex` keeps only tools matching pattern; supports OpenAI + Anthropic formats)
|
||||
- **feat(combo)**: Context Caching Protection (#401 — `context_cache_protection` tags responses with `<omniModel>provider/model</omniModel>` and pins model for session continuity)
|
||||
- **feat(settings)**: Auto-Update via Settings (#320 — `GET /api/system/version` + `POST /api/system/update` — checks npm registry and updates in background with pm2 restart)
|
||||
- **feat(logs)**: Detailed Request Logs (#378 — captures full pipeline bodies at 4 stages: client request, translated request, provider response, client response — opt-in toggle, 64KB trim, 500-entry ring-buffer)
|
||||
- **feat(mitm)**: MITM Kiro IDE profile (#336 — `src/mitm/targets/kiro.ts` targets api.anthropic.com, reuses existing MITM infrastructure)
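As an illustration of the Tool Filter Regex feature above, the sketch below keeps only tools whose name matches a pattern, for both OpenAI-style and Anthropic-style tool definitions. The type and function names are hypothetical, and the real implementation may differ.

```typescript
// OpenAI-style tools nest the name under `function`; Anthropic-style tools carry it at the top level.
type OpenAITool = { type: "function"; function: { name: string } };
type AnthropicTool = { name: string; input_schema?: unknown };
type Tool = OpenAITool | AnthropicTool;

function toolName(tool: Tool): string {
  return "function" in tool ? tool.function.name : tool.name;
}

// `pattern` mirrors the nullable `tool_filter_regex` column: null means no filtering.
function filterTools(tools: Tool[], pattern: string | null): Tool[] {
  if (!pattern) return tools;
  const regex = new RegExp(pattern);
  return tools.filter((tool) => regex.test(toolName(tool)));
}
```

With the pattern `^(read|write)_`, for example, a combo would forward `read_file` and `write_file` but not `shell_exec`.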
|
||||
|
||||
---
|
||||
|
||||
## [2.6.7] — 2026-03-17
|
||||
|
||||
> Sprint: SSE improvements, local provider_nodes extensions, proxy registry, Claude passthrough fixes.
|
||||
|
||||
### ✨ Features
|
||||
|
||||
- **feat(health)**: Background health check for local `provider_nodes` with exponential backoff (30s→300s) and `Promise.allSettled` to avoid blocking (#423, @Regis-RCR)
|
||||
- **feat(embeddings)**: Route `/v1/embeddings` to local `provider_nodes` — `buildDynamicEmbeddingProvider()` with hostname validation (#422, @Regis-RCR)
|
||||
- **feat(audio)**: Route TTS/STT to local `provider_nodes` — `buildDynamicAudioProvider()` with SSRF protection (#416, @Regis-RCR)
|
||||
- **feat(proxy)**: Proxy registry, management APIs, and quota-limit generalization (#429, @Regis-RCR)
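The backoff schedule in the health-check entry (30s growing to a 300s cap) can be sketched as below. Doubling per failure is an assumption, as are the `NodeState` shape and the `probe` callback; only the 30s/300s bounds and the use of `Promise.allSettled` come from the entry itself.

```typescript
// Delay before the next check: 30s, 60s, 120s, 240s, then capped at 300s.
function nextDelayMs(consecutiveFailures: number, baseMs: number = 30000, maxMs: number = 300000): number {
  return Math.min(baseMs * Math.pow(2, consecutiveFailures), maxMs);
}

// Hypothetical per-node bookkeeping.
interface NodeState {
  url: string;
  failures: number;
  nextCheckInMs: number;
}

// Probe every node concurrently; allSettled means one failing node never blocks the others.
async function checkAll(nodes: NodeState[], probe: (url: string) => Promise<void>): Promise<void> {
  const results = await Promise.allSettled(nodes.map((node) => probe(node.url)));
  results.forEach((result, index) => {
    const node = nodes[index];
    node.failures = result.status === "fulfilled" ? 0 : node.failures + 1;
    node.nextCheckInMs = nextDelayMs(node.failures);
  });
}
```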
|
||||
|
||||
### 🐛 Bug Fixes
|
||||
|
||||
- **fix(sse)**: Strip Claude-specific fields (`metadata`, `anthropic_version`) when target is OpenAI-compat (#421, @prakersh)
|
||||
- **fix(sse)**: Extract Claude SSE usage (`input_tokens`, `output_tokens`, cache tokens) in passthrough stream mode (#420, @prakersh)
|
||||
- **fix(sse)**: Generate fallback `call_id` for tool calls with missing/empty IDs (#419, @prakersh)
|
||||
- **fix(sse)**: Claude-to-Claude passthrough — forward body completely untouched, no re-translation (#418, @prakersh)
|
||||
- **fix(sse)**: Filter orphaned `tool_result` items after Claude Code context compaction to avoid 400 errors (#417, @prakersh)
|
||||
- **fix(sse)**: Skip empty-name tool calls in Responses API translator to prevent `placeholder_tool` infinite loops (#415, @prakersh)
|
||||
- **fix(sse)**: Strip empty text content blocks before translation (#427, @prakersh)
|
||||
- **fix(api)**: Add `refreshable: true` to Claude OAuth test config (#428, @prakersh)
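The orphaned `tool_result` fix listed above can be pictured with the following sketch. It assumes Anthropic-style messages where `tool_use` blocks carry an `id` and `tool_result` blocks reference it through `tool_use_id`; the types are simplified and the function name is hypothetical.

```typescript
// Simplified content block; extra fields are allowed but ignored here.
type Block = { type: string; id?: string; tool_use_id?: string; [key: string]: unknown };
interface Message {
  role: "user" | "assistant";
  content: Block[];
}

// Drop tool_result blocks whose matching tool_use no longer appears earlier in the history.
function dropOrphanedToolResults(messages: Message[]): Message[] {
  const seenToolUseIds = new Set<string>();
  const cleaned: Message[] = [];
  for (const message of messages) {
    const content = message.content.filter((block) => {
      if (block.type === "tool_use" && block.id) seenToolUseIds.add(block.id);
      if (block.type !== "tool_result") return true;
      return block.tool_use_id !== undefined && seenToolUseIds.has(block.tool_use_id);
    });
    // Messages left with no content are removed entirely.
    if (content.length > 0) cleaned.push({ role: message.role, content });
  }
  return cleaned;
}
```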
|
||||
|
||||
### 📦 Dependencies
|
||||
|
||||
- Bump `vitest`, `@vitest/*` and related devDependencies (#414, @dependabot)
|
||||
|
||||
---
|
||||
|
||||
## [2.6.6] — 2026-03-17
|
||||
|
||||
> Hotfix: Turbopack/Docker compatibility — remove `node:` protocol from all `src/` imports.
|
||||
|
||||
### 🐛 Bug Fixes
|
||||
|
||||
- **fix(build)**: Removed `node:` protocol prefix from `import` statements in 17 files under `src/`. The `node:fs`, `node:path`, `node:url`, `node:os` etc. imports caused `Ecmascript file had an error` on Turbopack builds (Next.js 15 Docker) and on upgrades from older npm global installs. Affected files: `migrationRunner.ts`, `core.ts`, `backup.ts`, `prompts.ts`, `dataPaths.ts`, and 12 others in `src/app/api/` and `src/lib/`.
|
||||
- **chore(workflow)**: Updated `generate-release.md` to make Docker Hub sync and dual-VPS deploy **mandatory** steps in every release.
|
||||
|
||||
---
|
||||
|
||||
## [2.6.5] — 2026-03-17
|
||||
|
||||
> Sprint: reasoning model param filtering, local provider 404 fix, Kilo Gateway provider, dependency bumps.
|
||||
|
||||
### ✨ New Features
|
||||
|
||||
- **feat(api)**: Added **Kilo Gateway** (`api.kilo.ai`) as a new API Key provider (alias `kg`) — 335+ models, 6 free models, 3 auto-routing models (`kilo-auto/frontier`, `kilo-auto/balanced`, `kilo-auto/free`). Passthrough models supported via `/api/gateway/models` endpoint. (PR #408 by @Regis-RCR)
|
||||
|
||||
### 🐛 Bug Fixes
|
||||
|
||||
- **fix(sse)**: Strip unsupported parameters for reasoning models (o1, o1-mini, o1-pro, o3, o3-mini). Models in the `o1`/`o3` family reject `temperature`, `top_p`, `frequency_penalty`, `presence_penalty`, `logprobs`, `top_logprobs`, and `n` with HTTP 400. Parameters are now stripped at the `chatCore` layer before forwarding. Uses a declarative `unsupportedParams` field per model and a precomputed O(1) Map for lookup. (PR #412 by @Regis-RCR)
|
||||
- **fix(sse)**: Local provider 404 now results in a **model-only lockout (5 seconds)** instead of a connection-level lockout (2 minutes). When a local inference backend (Ollama, LM Studio, oMLX) returns 404 for an unknown model, the connection remains active and other models continue working immediately. Also fixes a pre-existing bug where `model` was not passed to `markAccountUnavailable()`. Local providers detected via hostname (`localhost`, `127.0.0.1`, `::1`, extensible via `LOCAL_HOSTNAMES` env var). (PR #410 by @Regis-RCR)
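The reasoning-model fix above describes a declarative `unsupportedParams` field plus a precomputed Map. A minimal sketch of that idea follows; the registry entries, the `gpt-4o` contrast model, and the function name are illustrative assumptions rather than the project's actual registry.

```typescript
// Hypothetical registry entry with the declarative field named in the changelog.
interface ModelEntry {
  id: string;
  unsupportedParams?: string[];
}

const REASONING_UNSUPPORTED = [
  "temperature", "top_p", "frequency_penalty", "presence_penalty", "logprobs", "top_logprobs", "n",
];

// Illustrative registry; the real one lives in the provider registry.
const modelRegistry: ModelEntry[] = [
  { id: "o1", unsupportedParams: REASONING_UNSUPPORTED },
  { id: "o3-mini", unsupportedParams: REASONING_UNSUPPORTED },
  { id: "gpt-4o" },
];

// Precomputed once so each request pays only an O(1) Map lookup.
const unsupportedByModel = new Map<string, Set<string>>();
modelRegistry.forEach((entry) => {
  if (entry.unsupportedParams) unsupportedByModel.set(entry.id, new Set(entry.unsupportedParams));
});

// Return a copy of the request body without the parameters the model rejects.
function stripUnsupportedParams(model: string, body: Record<string, unknown>): Record<string, unknown> {
  const blocked = unsupportedByModel.get(model);
  if (!blocked) return body;
  const cleaned: Record<string, unknown> = {};
  Object.keys(body).forEach((key) => {
    if (!blocked.has(key)) cleaned[key] = body[key];
  });
  return cleaned;
}
```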
|
||||
|
||||
### 📦 Dependencies
|
||||
|
||||
- `better-sqlite3` 12.6.2 → 12.8.0
|
||||
- `undici` 7.24.2 → 7.24.4
|
||||
- `https-proxy-agent` 7 → 8
|
||||
- `agent-base` 7 → 8
|
||||
|
||||
---
|
||||
|
||||
## [2.6.4] — 2026-03-17
|
||||
|
||||
### 🐛 Bug Fixes
|
||||
|
||||
- **fix(providers)**: Removed non-existent model names across 5 providers:
|
||||
- **gemini / gemini-cli**: removed `gemini-3.1-pro/flash` and `gemini-3-*-preview` (don't exist in Google API v1beta); replaced with `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-1.5-pro/flash`
|
||||
- **antigravity**: removed `gemini-3.1-pro-high/low` and `gemini-3-flash` (invalid internal aliases); replaced with real 2.x models
|
||||
- **github (Copilot)**: removed `gemini-3-flash-preview` and `gemini-3-pro-preview`; replaced with `gemini-2.5-flash`
|
||||
- **nvidia**: corrected `nvidia/llama-3.3-70b-instruct` → `meta/llama-3.3-70b-instruct` (NVIDIA NIM uses `meta/` namespace for Meta models); added `nvidia/llama-3.1-70b-instruct` and `nvidia/llama-3.1-405b-instruct`
|
||||
- **fix(db/combo)**: Updated `free-stack` combo on remote DB: removed `qw/qwen3-coder-plus` (expired refresh token), corrected `nvidia/llama-3.3-70b-instruct` → `nvidia/meta/llama-3.3-70b-instruct`, corrected `gemini/gemini-3.1-flash` → `gemini/gemini-2.5-flash`, added `if/deepseek-v3.2`
|
||||
|
||||
---
|
||||
|
||||
## [2.6.3] — 2026-03-16
|
||||
|
||||
> Sprint: zod/pino hash-strip baked into build pipeline, Synthetic provider added, VPS PM2 path corrected.
|
||||
|
||||
### 🐛 Bug Fixes
|
||||
|
||||
- **fix(build)**: Turbopack hash-strip now runs at **compile time** for ALL packages — not just `better-sqlite3`. Step 5.6 in `prepublish.mjs` walks every `.js` in `app/.next/server/` and strips the 16-char hex suffix from any hashed `require()`. Fixes `zod-dcb22c...`, `pino-...`, etc. MODULE_NOT_FOUND on global npm installs. Closes #398
|
||||
- **fix(deploy)**: PM2 on both VPS was pointing to stale git-clone directories. Reconfigured to `app/server.js` in the npm global package. Updated `/deploy-vps` workflow to use `npm pack + scp` (npm registry rejects 299MB packages).
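To make the hash-strip fix concrete, here is a sketch of the string transformation it describes: removing a 16-character hex suffix from hashed `require()` calls. The regular expression and function name are assumptions for illustration; the real logic lives in `prepublish.mjs` and may match differently.

```typescript
// Matches require("name-<16 hex chars>") with either quote style.
const HASHED_REQUIRE = /require\((["'])([^"']+)-[0-9a-f]{16}\1\)/g;

// Rewrite hashed module names back to the plain package name.
function stripHashedRequires(source: string): string {
  return source.replace(HASHED_REQUIRE, "require($1$2$1)");
}
```

Package names that merely contain a hyphen, such as `better-sqlite3`, are left alone because the suffix must be exactly 16 hex characters.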
|
||||
|
||||
### ✨ Features
|
||||
|
||||
- **feat(provider)**: Synthetic ([synthetic.new](https://synthetic.new)) — privacy-focused OpenAI-compatible inference. `passthroughModels: true` for dynamic HuggingFace model catalog. Initial models: Kimi K2.5, MiniMax M2.5, GLM 4.7, DeepSeek V3.2. (PR #404 by @Regis-RCR)
|
||||
|
||||
### 📋 Issues Closed
|
||||
|
||||
- **close #398**: npm hash regression — fixed by compile-time hash-strip in prepublish
|
||||
- **triage #324**: Bug screenshot without steps — requested reproduction details
|
||||
|
||||
---
|
||||
|
||||
## [2.6.2] — 2026-03-16
|
||||
|
||||
> Sprint: module hashing fully fixed, 2 PRs merged (Anthropic tools filter + custom endpoint paths), Alibaba Cloud DashScope provider added, 3 stale issues closed.
|
||||
|
||||
@@ -898,27 +898,44 @@ When minimized, OmniRoute lives in your system tray with quick actions:
|
||||
|
||||
## 💰 Pricing at a Glance
|
||||
|
||||
| Tier | Provider | Cost | Quota Reset | Best For |
|
||||
| ------------------- | ----------------- | ---------------------- | ---------------- | ----------------------- |
|
||||
| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
|
||||
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
|
||||
| | Gemini CLI | **FREE** | 180K/mo + 1K/day | Everyone! |
|
||||
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
|
||||
| **🔑 API KEY** | NVIDIA NIM | **FREE** (dev forever) | ~40 RPM | 70+ open models |
|
||||
| | Cerebras | **FREE** (1M tok/day) | 60K TPM / 30 RPM | World's fastest |
|
||||
| | Groq | **FREE** (30 RPM) | 14.4K RPD | Ultra-fast Llama/Gemma |
|
||||
| | DeepSeek | Pay-per-use | None | Best price/quality |
|
||||
| | xAI (Grok) | Pay-per-use | None | Grok models |
|
||||
| | Mistral | Free trial + paid | Rate limited | European AI |
|
||||
| | OpenRouter | Pay-per-use | None | 100+ models aggr. |
|
||||
| **💰 CHEAP** | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
|
||||
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
|
||||
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
|
||||
| **🆓 FREE** | iFlow | **$0** | Unlimited | 5 models unlimited |
|
||||
| | Qwen | **$0** | Unlimited | 4 models unlimited |
|
||||
| | Kiro | **$0** | Unlimited | Claude (AWS Builder ID) |
|
||||
| Tier | Provider | Cost | Quota Reset | Best For |
|
||||
| ------------------- | --------------------------- | ------------------------- | ---------------- | --------------------------------- |
|
||||
| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
|
||||
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
|
||||
| | Gemini CLI | **FREE** | 180K/mo + 1K/day | Everyone! |
|
||||
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
|
||||
| **🔑 API KEY** | NVIDIA NIM | **FREE** (dev forever) | ~40 RPM | 70+ open models |
|
||||
| | Cerebras | **FREE** (1M tok/day) | 60K TPM / 30 RPM | World's fastest |
|
||||
| | Groq | **FREE** (30 RPM) | 14.4K RPD | Ultra-fast Llama/Gemma |
|
||||
| | DeepSeek V3.2 | $0.27/$1.10 per 1M | None | Best price/quality reasoning |
|
||||
| | xAI Grok-4 Fast | **$0.20/$0.50 per 1M** 🆕 | None | Fastest + tool calling, ultralow |
|
||||
| | xAI Grok-4 (standard) | $0.20/$1.50 per 1M 🆕 | None | Reasoning flagship from xAI |
|
||||
| | Mistral | Free trial + paid | Rate limited | European AI |
|
||||
| | OpenRouter | Pay-per-use | None | 100+ models aggr. |
|
||||
| **💰 CHEAP** | GLM-5 (via Z.AI) 🆕 | $0.5/1M | Daily 10AM | 128K output, newest flagship |
|
||||
| | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
|
||||
| | MiniMax M2.5 🆕 | $0.3/1M input | 5-hour rolling | Reasoning + agentic tasks |
|
||||
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
|
||||
| | Kimi K2.5 (Moonshot API) 🆕 | Pay-per-use | None | Direct Moonshot API access |
|
||||
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
|
||||
| **🆓 FREE** | iFlow | **$0** | Unlimited | 5 models unlimited |
|
||||
| | Qwen | **$0** | Unlimited | 4 models unlimited |
|
||||
| | Kiro | **$0** | Unlimited | Claude Sonnet/Haiku (AWS Builder) |
|
||||
|
||||
**💡 $0 Combo Stack:** Gemini CLI (180K/mo) → iFlow (unlimited: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1) → Kiro (Claude for free) → Qwen (4 models, unlimited) — **Zero cost, never stops coding.** When Gemini quota runs out, OmniRoute auto-falls back to iFlow or Kiro with zero config.
|
||||
> 🆕 **New models added (Mar 2026):** Grok-4 Fast family at $0.20/$0.50/M (benchmarked at 1143ms — 30% faster than Gemini 2.5 Flash), GLM-5 via Z.AI with 128K output, MiniMax M2.5 reasoning, DeepSeek V3.2 updated pricing, Kimi K2.5 via Moonshot direct API.
|
||||
|
||||
**💡 $0 Combo Stack — The Complete Free Setup:**
|
||||
|
||||
```
|
||||
Gemini CLI (180K/mo free)
|
||||
→ iFlow (unlimited: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1)
|
||||
→ Kiro (Claude Sonnet 4.5 + Haiku — unlimited, via AWS Builder ID)
|
||||
→ Qwen (4 models — unlimited)
|
||||
→ Groq (14.4K req/day — ultra-fast)
|
||||
→ NVIDIA NIM (70+ models — 40 RPM forever)
|
||||
```
|
||||
|
||||
**Zero cost. Never stops coding.** Configure this as one OmniRoute combo and all fallbacks happen automatically — no manual switching ever.
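Conceptually, the fallback behaves like walking the chain in order and taking the first provider that still has quota. The sketch below illustrates only that idea; the `ComboStep` shape and `pickStep` function are hypothetical and do not reflect OmniRoute's internal combo format.

```typescript
// Hypothetical step: `remainingTokens` is Infinity for unlimited providers.
interface ComboStep {
  provider: string;
  remainingTokens: number;
}

// Return the first step in priority order that can still serve the request.
function pickStep(chain: ComboStep[], tokensNeeded: number): ComboStep | undefined {
  for (const step of chain) {
    if (step.remainingTokens >= tokensNeeded) return step;
  }
  return undefined; // every provider in the chain is exhausted
}
```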
|
||||
|
||||
---
|
||||
|
||||
@@ -1027,7 +1044,20 @@ Then in `/dashboard/media` → **Transcription** tab: upload any audio or video
|
||||
|
||||
OmniRoute v2.0 is built as an operational platform, not just a relay proxy.
|
||||
|
||||
### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
|
||||
### 🆕 New — ClawRouter-Inspired Improvements (Mar 2026)
|
||||
|
||||
| Feature | What It Does |
| ------------------------------------ | ------------------------------------------------------------------------------------------- |
| ⚡ **Grok-4 Fast Family** | xAI models at $0.20/$0.50/M — benchmarked 1143ms (30% faster than Gemini 2.5 Flash) |
| 🧠 **GLM-5 via Z.AI** | 128K output context, $0.5/1M — newest flagship from the GLM family |
| 🔮 **MiniMax M2.5** | Reasoning + agentic tasks at $0.30/1M — significant upgrade from M2.1 |
| 🎯 **toolCalling Flag per Model** | Per-model `toolCalling: true/false` in registry — AutoCombo skips non-tool-capable models |
| 🌍 **Multilingual Intent Detection** | PT/ZH/ES/AR keywords in AutoCombo scoring — better model selection for non-English content |
| 📊 **Benchmark-Driven Fallbacks** | Real p95 latency from live requests feeds combo scoring — AutoCombo learns from actual data |
| 🔁 **Request Deduplication** | Content-hash based dedup window — multi-agent safe, prevents duplicate charges |
| 🔌 **Pluggable RouterStrategy** | Extensible `RouterStrategy` interface — add custom routing logic as plugins |
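As a rough picture of how the `toolCalling` flag and measured latency could combine, the sketch below filters out models that cannot call tools and orders the rest by latency. The `Candidate` shape and `orderCandidates` function are assumptions for illustration, not the AutoCombo scoring code.

```typescript
// Hypothetical candidate: `latencyMs` is undefined until real measurements exist.
interface Candidate {
  model: string;
  toolCalling: boolean;
  latencyMs?: number;
}

// Skip non-tool-capable models when tools are required, then prefer lower measured latency.
function orderCandidates(candidates: Candidate[], needsTools: boolean): Candidate[] {
  const unknownLatency = Number.MAX_VALUE; // unmeasured models sort last
  const latencyOf = (c: Candidate): number => (c.latencyMs === undefined ? unknownLatency : c.latencyMs);
  return candidates
    .filter((c) => !needsTools || c.toolCalling)
    .sort((a, b) => latencyOf(a) - latencyOf(b));
}
```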
|
||||
|
||||
### 🚀 Previous v2.0.9+ — Playground, CLI Fingerprints & ACP
|
||||
|
||||
| Feature | What It Does |
|
||||
| ------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
|
||||
@@ -0,0 +1,46 @@
|
||||
# ADR-0001: Proxy Registry + Usage Control Generalization
|
||||
|
||||
Date: 2026-03-17
|
||||
Status: Accepted
|
||||
|
||||
## Context
|
||||
|
||||
OmniRoute already has:

- Config-map-based proxy assignment (`global`, `providers`, `combos`, `keys`).
- Quota-aware selection for specific providers only (notably `codex`).

Main gaps:

- Proxies are not yet reusable assets that can be managed as entities (metadata, where-used, safe delete).
- Usage policy is not yet consistent across providers.
- The API error contract is not yet uniform across management endpoints.
|
||||
|
||||
## Decision

1. Add a **Proxy Registry** as a new domain in the DB (`proxy_registry`, `proxy_assignments`).
2. Preserve compatibility with legacy assignments (fall back to the old `proxyConfig`).
3. The runtime resolver uses this priority order:
   - account -> provider -> global (registry)
   - fall back to the legacy resolver if the registry has no assignment yet
4. Credentials must be redacted in the registry list output by default.
5. Standardize the error JSON for proxy management endpoints so that it is consistent and carries a `requestId`.
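
The priority order in item 3 can be sketched as a small lookup. This is a minimal illustration under stated assumptions, not the project's resolver: the `ProxyAssignment` shape, the `scope`/`scopeId` field names, and the `legacyResolve` callback are hypothetical, since the ADR names the `proxy_assignments` table but not its columns.

```typescript
// Hypothetical shapes: the ADR names the tables but not their columns.
type Scope = "account" | "provider" | "global";

interface ProxyAssignment {
  scope: Scope;
  scopeId: string | null; // null for the global scope
  proxyId: string;
}

interface ResolveInput {
  accountId?: string;
  providerId?: string;
}

// Walk the registry from the most specific scope to the least specific one.
function resolveFromRegistry(assignments: ProxyAssignment[], input: ResolveInput): string | null {
  const candidates: Array<[Scope, string | null | undefined]> = [
    ["account", input.accountId],
    ["provider", input.providerId],
    ["global", null],
  ];
  for (const [scope, scopeId] of candidates) {
    if (scope !== "global" && !scopeId) continue; // nothing to match at this level
    const hit = assignments.find(
      (a) => a.scope === scope && (scope === "global" || a.scopeId === scopeId)
    );
    if (hit) return hit.proxyId;
  }
  return null;
}

// Registry first; the legacy resolver is consulted only when the registry has no assignment.
function resolveProxy(
  assignments: ProxyAssignment[],
  input: ResolveInput,
  legacyResolve: (input: ResolveInput) => string | null
): { proxy: string | null; source: "registry" | "legacy" | "none" } {
  const fromRegistry = resolveFromRegistry(assignments, input);
  if (fromRegistry) return { proxy: fromRegistry, source: "registry" };
  const fromLegacy = legacyResolve(input);
  return fromLegacy ? { proxy: fromLegacy, source: "legacy" } : { proxy: null, source: "none" };
}
```

Returning the `source` alongside the proxy is a design choice of this sketch: while both sources coexist, it makes it easy to see which requests still depend on the legacy config.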

## Consequences

Positive:

- Proxies become reusable and their usage can be traced.
- Safe delete can be enforced (409 while the proxy is still in use).
- Migration is incremental, with no breaking change at runtime.

Negative:

- There are temporarily two sources (registry + legacy config) until the migration is complete.
- Additional assignment endpoints and a consistent scope mapping are required.

## Follow-up

- Migrate the provider/account UI from raw proxy input to a registry selector.
- Add per-proxy health telemetry and alerting.
- Generalize usage control to other providers through the same policy interface.

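
The safe-delete consequence in this ADR (409 while a proxy is still in use) could look roughly like the following. It is a sketch only: `safeDeleteProxy`, the in-memory `Map`, and the `Assignment` fields are hypothetical stand-ins for whatever the real DB layer provides.

```typescript
// Hypothetical in-memory stand-ins for the `proxy_registry` and `proxy_assignments` tables.
interface Assignment {
  proxyId: string;
  scope: string;
  scopeId: string | null;
}

type DeleteResult =
  | { status: 200; deleted: true }
  | { status: 404; deleted: false }
  | { status: 409; deleted: false; usedBy: Assignment[] };

function safeDeleteProxy(
  proxies: Map<string, { id: string }>,
  assignments: Assignment[],
  proxyId: string
): DeleteResult {
  if (!proxies.has(proxyId)) return { status: 404, deleted: false };
  // Where-used check: refuse to delete while any assignment still points at the proxy.
  const usedBy = assignments.filter((a) => a.proxyId === proxyId);
  if (usedBy.length > 0) return { status: 409, deleted: false, usedBy };
  proxies.delete(proxyId);
  return { status: 200, deleted: true };
}
```

Returning the blocking assignments with the 409 lets a caller show where the proxy is still used before retrying.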
@@ -0,0 +1,32 @@
# ADR-0002: Error Contract for Management Endpoints

Date: 2026-03-17
Status: Accepted

## Decision

Management endpoints (proxy config, proxy registry, and proxy assignments) return a uniform error body:

```json
{
  "error": {
    "message": "Human-readable summary",
    "type": "invalid_request | not_found | conflict | server_error",
    "details": {}
  },
  "requestId": "uuid"
}
```

## Status Mapping

- 400: invalid request / validation failure
- 404: resource not found
- 409: resource conflict (for example, proxy still assigned)
- 500: unexpected server error
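
A minimal sketch of how the error body and the status mapping above could be produced together. The helper name `buildManagementError` and the `STATUS_BY_TYPE` table are assumptions for illustration; only the field names, error types, and status codes come from this ADR.

```typescript
type ErrorType = "invalid_request" | "not_found" | "conflict" | "server_error";

// Status codes taken from the Status Mapping list.
const STATUS_BY_TYPE: Record<ErrorType, number> = {
  invalid_request: 400,
  not_found: 404,
  conflict: 409,
  server_error: 500,
};

interface ManagementErrorBody {
  error: { message: string; type: ErrorType; details?: Record<string, unknown> };
  requestId: string;
}

// Hypothetical helper: returns the HTTP status and the uniform body together.
function buildManagementError(
  type: ErrorType,
  message: string,
  requestId: string,
  details?: Record<string, unknown>
): { status: number; body: ManagementErrorBody } {
  const error: ManagementErrorBody["error"] = { message, type };
  if (details !== undefined) error.details = details; // `details` is optional
  return { status: STATUS_BY_TYPE[type], body: { error, requestId } };
}

const conflict = buildManagementError("conflict", "Proxy is still assigned", "req-123", {
  assignments: 2,
});
console.log(conflict.status, JSON.stringify(conflict.body));
```

Deriving the status from the error type in one table keeps the two from drifting apart across endpoints.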

## Notes

- `requestId` is mandatory for log correlation.
- `details` is optional and only used for safe validation details.
- Sensitive secrets (proxy credentials, tokens) must never appear in `message` or `details`.
@@ -0,0 +1,16 @@
# ADR-0003: Security Checklist for Proxy Registry and Usage Controls

Date: 2026-03-17
Status: Accepted

## Checklist

- Validate all management payloads with Zod.
- Reject malformed scope assignment updates with status 400.
- Reject deleting an in-use proxy with status 409 unless forced.
- Never expose proxy username/password in list responses by default.
- Never log raw credentials or token values.
- Keep error responses free from internal stack traces.
- Protect management endpoints with existing auth middleware policy.
- Audit mutating operations: create/update/delete/assign/migrate.
- Ensure resolver fallback to legacy config while migration is in transition.
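
The checklist item about never exposing proxy username/password in list responses can be illustrated with a small redaction step. This is a sketch under assumptions: the `ProxyRecord` fields and the `hasCredentials` flag are hypothetical, because the diff does not show the real `proxy_registry` columns.

```typescript
// Hypothetical record shape; the real registry columns are not shown in the diff.
interface ProxyRecord {
  id: string;
  host: string;
  port: number;
  username?: string;
  password?: string;
}

type RedactedProxy = Omit<ProxyRecord, "username" | "password"> & { hasCredentials: boolean };

// Drop the secret fields and keep only a boolean hint that credentials exist.
function redactProxy(record: ProxyRecord): RedactedProxy {
  const { username, password, ...safe } = record;
  return { ...safe, hasCredentials: Boolean(username || password) };
}

function listProxies(records: ProxyRecord[]): RedactedProxy[] {
  return records.map(redactProxy);
}
```

Removing the fields entirely, rather than masking them with placeholder characters, means a serialized list response cannot leak the length or any part of a secret.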
+1
-1
@@ -1,7 +1,7 @@
openapi: 3.1.0
info:
  title: OmniRoute API
  version: 2.6.2
  version: 2.7.0
  description: |
    OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
    endpoint that routes requests to multiple AI providers with load balancing,
@@ -11,7 +11,7 @@ interface AudioModel {
  name: string;
}

interface AudioProvider {
export interface AudioProvider {
  id: string;
  baseUrl: string;
  authType: string;
@@ -262,36 +262,74 @@ export function getSpeechProvider(providerId: string): AudioProvider | null {
  return AUDIO_SPEECH_PROVIDERS[providerId] || null;
}

export interface ProviderNodeRow {
  prefix: string;
  name: string;
  baseUrl: string;
  apiType?: string;
}

/**
 * Parse audio model string (format: "provider/model" or just "model")
 * Build a dynamic AudioProvider from a provider_node DB entry.
 * Only used for local providers (localhost/127.0.0.1) — remote nodes are
 * excluded by the caller to prevent auth bypass and SSRF.
 */
export function buildDynamicAudioProvider(node: ProviderNodeRow, audioPath: string): AudioProvider {
  if (!node.prefix || !node.baseUrl) {
    throw new Error(`Invalid provider_node: missing prefix or baseUrl`);
  }
  const baseUrl = node.baseUrl.replace(/\/+$/, "");
  return {
    id: node.prefix,
    baseUrl: `${baseUrl}${audioPath}`,
    authType: "none",
    authHeader: "none",
    models: [],
  };
}

function parseAudioModel(
  modelStr: string | null,
  registry: Record<string, AudioProvider>
  registry: Record<string, AudioProvider>,
  dynamicProviders?: AudioProvider[]
): { provider: string | null; model: string | null } {
  if (!modelStr) return { provider: null, model: null };

  for (const [providerId, config] of Object.entries(registry)) {
  // Phase 1: prefix match in hardcoded registry
  for (const [providerId] of Object.entries(registry)) {
    if (modelStr.startsWith(providerId + "/")) {
      return { provider: providerId, model: modelStr.slice(providerId.length + 1) };
    }
  }

  // Phase 2: bare model lookup in hardcoded registry
  for (const [providerId, config] of Object.entries(registry)) {
    if (config.models.some((m) => m.id === modelStr)) {
      return { provider: providerId, model: modelStr };
    }
  }

  // Phase 3: prefix match in dynamic providers (provider_nodes)
  if (dynamicProviders) {
    for (const dp of dynamicProviders) {
      if (modelStr.startsWith(dp.id + "/")) {
        return { provider: dp.id, model: modelStr.slice(dp.id.length + 1) };
      }
    }
  }

  return { provider: null, model: modelStr };
}

export function parseTranscriptionModel(modelStr: string | null) {
  return parseAudioModel(modelStr, AUDIO_TRANSCRIPTION_PROVIDERS);
export function parseTranscriptionModel(
  modelStr: string | null,
  dynamicProviders?: AudioProvider[]
) {
  return parseAudioModel(modelStr, AUDIO_TRANSCRIPTION_PROVIDERS, dynamicProviders);
}

export function parseSpeechModel(modelStr: string | null) {
  return parseAudioModel(modelStr, AUDIO_SPEECH_PROVIDERS);
export function parseSpeechModel(modelStr: string | null, dynamicProviders?: AudioProvider[]) {
  return parseAudioModel(modelStr, AUDIO_SPEECH_PROVIDERS, dynamicProviders);
}

/**
@@ -135,6 +135,7 @@ export const COOLDOWN_MS = {
  unauthorized: 2 * 60 * 1000, // 401 → 2 min
  paymentRequired: 2 * 60 * 1000, // 402/403 → 2 min
  notFound: 2 * 60 * 1000, // 404 → 2 minutes
  notFoundLocal: 5 * 1000, // 404 on local provider → 5s model-only lockout (connection stays active)
  transientInitial: 5 * 1000, // 408/500/502/503/504 first hit → 5s (backoff from here)
  transientMax: 60 * 1000, // 502/503/504 backoff ceiling → 60s
  transient: 5 * 1000, // Legacy alias → points to transientInitial
@@ -162,6 +163,16 @@ export const PROVIDER_PROFILES = {
    circuitBreakerThreshold: 5, // More tolerant (occasional 502 is normal)
    circuitBreakerReset: 30000, // 30s reset
  },
  // Local providers (localhost inference backends like Ollama, LM Studio, oMLX).
  // Not yet wired into getProviderProfile() — will be used when local provider_nodes
  // are integrated into the resilience layer. Kept here to avoid a second constants change.
  local: {
    transientCooldown: 2000, // 2s (local — very fast recovery)
    rateLimitCooldown: 5000, // 5s (local — no real rate limits)
    maxBackoffLevel: 3, // Low ceiling (local either works or doesn't)
    circuitBreakerThreshold: 2, // Opens fast (if local is down, it's down)
    circuitBreakerReset: 15000, // 15s reset (check again quickly)
  },
};

// Default rate limit values for API Key providers (auto-enabled safety net)
@@ -8,7 +8,43 @@
 * keyed by provider ID (e.g. "nebius", "openai").
 */

export const EMBEDDING_PROVIDERS = {
export interface EmbeddingProvider {
  id: string;
  baseUrl: string;
  authType: string;
  authHeader: string;
  models: { id: string; name: string; dimensions?: number }[];
}

export interface EmbeddingProviderNodeRow {
  prefix: string;
  name: string;
  baseUrl: string;
  apiType?: string;
}

/**
 * Build a dynamic EmbeddingProvider from a local provider_node.
 * Only used for local providers (localhost) — caller must filter by hostname.
 */
export function buildDynamicEmbeddingProvider(node: EmbeddingProviderNodeRow): EmbeddingProvider {
  if (!node.prefix || !node.baseUrl) {
    throw new Error(`Invalid provider_node: missing prefix or baseUrl`);
  }
  if (node.prefix.includes("/") || node.prefix.includes(" ")) {
    throw new Error(`Invalid provider_node prefix "${node.prefix}": must not contain / or spaces`);
  }
  const baseUrl = node.baseUrl.replace(/\/+$/, "");
  return {
    id: node.prefix,
    baseUrl: `${baseUrl}/embeddings`,
    authType: "none",
    authHeader: "none",
    models: [],
  };
}

export const EMBEDDING_PROVIDERS: Record<string, EmbeddingProvider> = {
  nebius: {
    id: "nebius",
    baseUrl: "https://api.tokenfactory.nebius.com/v1/embeddings",
@@ -70,7 +106,7 @@ export const EMBEDDING_PROVIDERS = {
/**
 * Get embedding provider config by ID
 */
export function getEmbeddingProvider(providerId) {
export function getEmbeddingProvider(providerId: string): EmbeddingProvider | null {
  return EMBEDDING_PROVIDERS[providerId] || null;
}

@@ -78,26 +114,36 @@ export function getEmbeddingProvider(providerId) {
 * Parse embedding model string (format: "provider/model" or just "model")
 * Returns { provider, model }
 */
export function parseEmbeddingModel(modelStr) {
export function parseEmbeddingModel(
  modelStr: string | null,
  dynamicProviders?: EmbeddingProvider[]
): { provider: string | null; model: string | null } {
  if (!modelStr) return { provider: null, model: null };

  // Check for "provider/model" format
  const slashIdx = modelStr.indexOf("/");
  if (slashIdx > 0) {
    // Handle nested model IDs like "nebius/Qwen/Qwen3-Embedding-8B"
    // Try each provider prefix
    for (const [providerId, config] of Object.entries(EMBEDDING_PROVIDERS)) {
    // Phase 1: Try each hardcoded provider prefix
    for (const [providerId] of Object.entries(EMBEDDING_PROVIDERS)) {
      if (modelStr.startsWith(providerId + "/")) {
        return { provider: providerId, model: modelStr.slice(providerId.length + 1) };
      }
    }
    // Fallback: first segment is provider
    // Phase 2: Try dynamic provider_nodes prefix
    if (dynamicProviders) {
      for (const dp of dynamicProviders) {
        if (modelStr.startsWith(dp.id + "/")) {
          return { provider: dp.id, model: modelStr.slice(dp.id.length + 1) };
        }
      }
    }
    // Phase 3: Fallback — first segment is provider
    const provider = modelStr.slice(0, slashIdx);
    const model = modelStr.slice(slashIdx + 1);
    return { provider, model };
  }

  // No provider prefix — search all providers for the model
  // No provider prefix — search hardcoded providers for the model
  for (const [providerId, config] of Object.entries(EMBEDDING_PROVIDERS)) {
    if (config.models.some((m) => m.id === modelStr)) {
      return { provider: providerId, model: modelStr };
@@ -11,9 +11,23 @@
export interface RegistryModel {
  id: string;
  name: string;
  toolCalling?: boolean;
  targetFormat?: string;
  unsupportedParams?: readonly string[];
}

// Reasoning models reject temperature, top_p, penalties, logprobs, n.
// Frozen to prevent accidental mutation (shared across all model entries).
const REASONING_UNSUPPORTED: readonly string[] = Object.freeze([
  "temperature",
  "top_p",
  "frequency_penalty",
  "presence_penalty",
  "logprobs",
  "top_logprobs",
  "n",
]);

export interface RegistryOAuth {
  clientIdEnv?: string;
  clientIdDefault?: string;
@@ -127,12 +141,15 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    },
    models: [
      { id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
      { id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
      { id: "gemini-3-pro-preview", name: "Gemini 3.0 Pro Preview" },
      { id: "gemini-3-flash-preview", name: "Gemini 3.0 Flash Preview" },
      { id: "gemini-3-1-pro", name: "Gemini 3.1 Pro (Alt ID)" },
      { id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" },
      { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
      { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
      { id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
      { id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
      { id: "gemini-2.0-flash-exp", name: "Gemini 2.0 Flash Exp" },
      { id: "gemini-1.5-pro", name: "Gemini 1.5 Pro" },
      { id: "gemini-1.5-flash", name: "Gemini 1.5 Flash" },
    ],
  },

@@ -156,12 +173,14 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    },
    models: [
      { id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
      { id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
      { id: "gemini-3-flash-preview", name: "Gemini 3.0 Flash Preview" },
      { id: "gemini-3-pro-preview", name: "Gemini 3.0 Pro Preview" },
      { id: "gemini-3-1-pro", name: "Gemini 3.1 Pro (Alt ID)" },
      { id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" },
      { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
      { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
      { id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
      { id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
      { id: "gemini-1.5-pro", name: "Gemini 1.5 Pro" },
      { id: "gemini-1.5-flash", name: "Gemini 1.5 Flash" },
    ],
  },

@@ -305,10 +324,9 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    models: [
      { id: "claude-opus-4-6-thinking", name: "Claude Opus 4.6 Thinking" },
      { id: "claude-sonnet-4-6", name: "Claude Sonnet 4.6" },
      { id: "gemini-3.1-pro-high", name: "Gemini 3.1 Pro High" },
      { id: "gemini-3.1-pro-low", name: "Gemini 3.1 Pro Low" },
      { id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
      { id: "gemini-3-flash", name: "Gemini 3.0 Flash" },
      { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
      { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
      { id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
      { id: "gpt-oss-120b-medium", name: "GPT OSS 120B Medium" },
    ],
  },
@@ -356,8 +374,7 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      { id: "claude-sonnet-4", name: "Claude Sonnet 4" },
      { id: "claude-sonnet-4.5", name: "Claude Sonnet 4.5" },
      { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
      { id: "gemini-3-flash-preview", name: "Gemini 3 Flash Preview" },
      { id: "gemini-3-pro-preview", name: "Gemini 3 Pro Preview" },
      { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
      { id: "grok-code-fast-1", name: "Grok Code Fast 1" },
      { id: "oswe-vscode-prime", name: "Raptor Mini" },
    ],
@@ -429,8 +446,11 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      { id: "gpt-4o", name: "GPT-4o" },
      { id: "gpt-4o-mini", name: "GPT-4o Mini" },
      { id: "gpt-4-turbo", name: "GPT-4 Turbo" },
      { id: "o1", name: "O1" },
      { id: "o1-mini", name: "O1 Mini" },
      { id: "o1", name: "O1", unsupportedParams: REASONING_UNSUPPORTED },
      { id: "o1-mini", name: "O1 Mini", unsupportedParams: REASONING_UNSUPPORTED },
      { id: "o1-pro", name: "O1 Pro", unsupportedParams: REASONING_UNSUPPORTED },
      { id: "o3", name: "O3", unsupportedParams: REASONING_UNSUPPORTED },
      { id: "o3-mini", name: "O3 Mini", unsupportedParams: REASONING_UNSUPPORTED },
    ],
  },

@@ -447,8 +467,13 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      "Anthropic-Version": "2023-06-01",
    },
    models: [
      { id: "claude-haiku-4.5", name: "Claude Haiku 4.5" },
      { id: "claude-sonnet-4-20250514", name: "Claude Sonnet 4" },
      { id: "claude-sonnet-4-6-20251031", name: "Claude Sonnet 4.6 (Dated)" },
      { id: "claude-sonnet-4.6", name: "Claude Sonnet 4.6" },
      { id: "claude-opus-4-20250514", name: "Claude Opus 4" },
      { id: "claude-opus-4-6-20251031", name: "Claude Opus 4.6 (Dated)" },
      { id: "claude-opus-4.6", name: "Claude Opus 4.6" },
      { id: "claude-3-5-sonnet-20241022", name: "Claude 3.5 Sonnet" },
    ],
  },
@@ -482,6 +507,8 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
    },
    models: [
      { id: "glm-5", name: "GLM 5" },
      { id: "glm-5-turbo", name: "GLM 5 Turbo" },
      { id: "glm-4.7-flash", name: "GLM 4.7 Flash" },
      { id: "glm-4.7", name: "GLM 4.7" },
      { id: "glm-4.6v", name: "GLM 4.6V (Vision)" },
@@ -493,6 +520,25 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    ],
  },

  zai: {
    id: "zai",
    alias: "zai",
    format: "claude",
    executor: "default",
    baseUrl: "https://api.z.ai/api/anthropic/v1/messages",
    urlSuffix: "?beta=true",
    authType: "apikey",
    authHeader: "x-api-key",
    headers: {
      "Anthropic-Version": "2023-06-01",
      "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
    },
    models: [
      { id: "glm-5", name: "GLM 5" },
      { id: "glm-5-turbo", name: "GLM 5 Turbo" },
    ],
  },

  kimi: {
    id: "kimi",
    alias: "kimi",
@@ -624,7 +670,11 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      "Anthropic-Version": "2023-06-01",
      "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
    },
    models: [{ id: "MiniMax-M2.1", name: "MiniMax M2.1" }],
    models: [
      { id: "minimax-m2.5", name: "MiniMax M2.5" },
      { id: "MiniMax-M2.5", name: "MiniMax M2.5 (Legacy Alias)" },
      { id: "MiniMax-M2.1", name: "MiniMax M2.1" },
    ],
  },

  "minimax-cn": {
@@ -642,6 +692,8 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    },
    models: [
      // Keep parity with minimax to ensure model discovery works for minimax-cn connections.
      { id: "minimax-m2.5", name: "MiniMax M2.5" },
      { id: "MiniMax-M2.5", name: "MiniMax M2.5 (Legacy Alias)" },
      { id: "MiniMax-M2.1", name: "MiniMax M2.1" },
    ],
  },
@@ -704,10 +756,14 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    authType: "apikey",
    authHeader: "bearer",
    models: [
      { id: "grok-4", name: "Grok 4" },
      { id: "grok-4-fast-non-reasoning", name: "Grok 4 Fast" },
      { id: "grok-4-fast-reasoning", name: "Grok 4 Fast Reasoning" },
      { id: "grok-code-fast-1", name: "Grok Code Fast" },
      { id: "grok-4-1-fast-non-reasoning", name: "Grok 4.1 Fast" },
      { id: "grok-4-1-fast-reasoning", name: "Grok 4.1 Fast Reasoning" },
      { id: "grok-4-0709", name: "Grok 4 (0709)" },
      { id: "grok-4", name: "Grok 4" },
      { id: "grok-3", name: "Grok 3" },
      { id: "grok-3-mini", name: "Grok 3 Mini" },
    ],
  },

@@ -836,12 +892,17 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    authType: "apikey",
    authHeader: "bearer",
    models: [
      { id: "gpt-oss-120b", name: "GPT OSS 120B", toolCalling: false },
      { id: "openai/gpt-oss-120b", name: "GPT OSS 120B (OpenAI Prefix)", toolCalling: false },
      { id: "meta/llama-3.3-70b-instruct", name: "Llama 3.3 70B" },
      { id: "nvidia/llama-3.3-70b-instruct", name: "Llama 3.3 70B (NVIDIA Prefix)" },
      { id: "meta/llama-4-maverick-17b-128e-instruct", name: "Llama 4 Maverick" },
      { id: "moonshotai/kimi-k2.5", name: "Kimi K2.5" },
      { id: "z-ai/glm4.7", name: "GLM 4.7" },
      { id: "deepseek-ai/deepseek-v3.2", name: "DeepSeek V3.2" },
      { id: "nvidia/llama-3.3-70b-instruct", name: "Llama 3.3 70B" },
      { id: "meta/llama-4-maverick-17b-128e-instruct", name: "Llama 4 Maverick" },
      { id: "deepseek/deepseek-r1", name: "DeepSeek R1" },
      { id: "nvidia/llama-3.1-70b-instruct", name: "Llama 3.1 70B" },
      { id: "nvidia/llama-3.1-405b-instruct", name: "Llama 3.1 405B" },
    ],
  },

@@ -919,6 +980,46 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    ],
  },

  synthetic: {
    id: "synthetic",
    alias: "synthetic",
    format: "openai",
    executor: "default",
    baseUrl: "https://api.synthetic.new/openai/v1/chat/completions",
    modelsUrl: "https://api.synthetic.new/openai/v1/models",
    authType: "apikey",
    authHeader: "bearer",
    models: [
      { id: "hf:nvidia/Kimi-K2.5-NVFP4", name: "Kimi K2.5 (NVFP4)" },
      { id: "hf:MiniMaxAI/MiniMax-M2.5", name: "MiniMax M2.5" },
      { id: "hf:zai-org/GLM-4.7-Flash", name: "GLM 4.7 Flash" },
      { id: "hf:zai-org/GLM-4.7", name: "GLM 4.7" },
      { id: "hf:moonshotai/Kimi-K2.5", name: "Kimi K2.5" },
      { id: "hf:deepseek-ai/DeepSeek-V3.2", name: "DeepSeek V3.2" },
    ],
    passthroughModels: true,
  },

  "kilo-gateway": {
    id: "kilo-gateway",
    alias: "kg",
    format: "openai",
    executor: "default",
    baseUrl: "https://api.kilo.ai/api/gateway/chat/completions",
    modelsUrl: "https://api.kilo.ai/api/gateway/models",
    authType: "apikey",
    authHeader: "bearer",
    models: [
      { id: "kilo-auto/frontier", name: "Kilo Auto Frontier" },
      { id: "kilo-auto/balanced", name: "Kilo Auto Balanced" },
      { id: "kilo-auto/free", name: "Kilo Auto Free" },
      { id: "nvidia/nemotron-3-super-120b-a12b:free", name: "Nemotron 3 Super 120B (Free)" },
      { id: "minimax/minimax-m2.5:free", name: "MiniMax M2.5 (Free)" },
      { id: "arcee-ai/trinity-large-preview:free", name: "Trinity Large Preview (Free)" },
    ],
    passthroughModels: true,
  },

  vertex: {
    id: "vertex",
    alias: "vertex",
@@ -1022,6 +1123,38 @@ export function generateAliasMap(): Record<string, string> {
  return map;
}

// ── Local Provider Detection ──────────────────────────────────────────────

// Evaluated once at module load time — process restart required for env var changes.
const LOCAL_HOSTNAMES = new Set([
  "localhost",
  "127.0.0.1",
  "::1",
  "[::1]",
  ...(typeof process !== "undefined" && process.env.LOCAL_HOSTNAMES
    ? process.env.LOCAL_HOSTNAMES.split(",")
        .map((h) => h.trim())
        .filter(Boolean)
    : []),
]);

/**
 * Detect if a base URL points to a local inference backend.
 * Used for shorter 404 cooldowns (model-only, not connection) and health check targets.
 *
 * Operators can extend via LOCAL_HOSTNAMES env var (comma-separated) for Docker
 * hostnames (e.g., LOCAL_HOSTNAMES=omlx,mlx-audio).
 */
export function isLocalProvider(baseUrl?: string | null): boolean {
  if (!baseUrl) return false;
  try {
    const url = new URL(baseUrl);
    return LOCAL_HOSTNAMES.has(url.hostname);
  } catch {
    return false;
  }
}

// ── Registry Lookup Helpers ───────────────────────────────────────────────

const _byAlias = new Map<string, RegistryEntry>();
@@ -1041,6 +1174,43 @@ export function getRegisteredProviders(): string[] {
  return Object.keys(REGISTRY);
}

// Precomputed map: modelId → unsupportedParams (O(1) lookup instead of O(N×M) scan).
// Built once at module load from all registry entries.
const _unsupportedParamsMap = new Map<string, readonly string[]>();
for (const entry of Object.values(REGISTRY)) {
  for (const model of entry.models) {
    if (model.unsupportedParams && !_unsupportedParamsMap.has(model.id)) {
      _unsupportedParamsMap.set(model.id, model.unsupportedParams);
    }
  }
}

/**
 * Get unsupported parameters for a specific model.
 * Uses O(1) precomputed lookup. Also handles prefixed model IDs
 * (e.g., "openai/o3" → strips prefix and looks up "o3").
 * Returns empty array if no restrictions are defined.
 */
export function getUnsupportedParams(provider: string, modelId: string): readonly string[] {
  // 1. Check current provider's registry (exact match)
  const entry = getRegistryEntry(provider);
  const modelEntry = entry?.models.find((m) => m.id === modelId);
  if (modelEntry?.unsupportedParams) return modelEntry.unsupportedParams;

  // 2. O(1) lookup in precomputed map (handles cross-provider routing)
  const cached = _unsupportedParamsMap.get(modelId);
  if (cached) return cached;

  // 3. Handle prefixed model IDs (e.g., "openai/o3" → "o3")
  if (modelId.includes("/")) {
    const bareId = modelId.split("/").pop() || "";
    const bare = _unsupportedParamsMap.get(bareId);
    if (bare) return bare;
  }

  return [];
}

/**
 * Get provider category: "oauth" or "apikey"
 * Used by the resilience layer to apply different cooldown/backoff profiles.
@@ -381,7 +381,12 @@ async function handleTortoiseSpeech(providerConfig, body) {
 * @returns {Response}
 */
/** @returns {Promise<unknown>} */
export async function handleAudioSpeech({ body, credentials }) {
export async function handleAudioSpeech({
  body,
  credentials,
  resolvedProvider = null,
  resolvedModel = null,
}) {
  if (!body.model) {
    return errorResponse(400, "model is required");
  }
@@ -389,8 +394,15 @@ export async function handleAudioSpeech({ body, credentials }) {
    return errorResponse(400, "input is required");
  }

  const { provider: providerId, model: modelId } = parseSpeechModel(body.model);
  const providerConfig = providerId ? getSpeechProvider(providerId) : null;
  // Use pre-resolved provider/model from route handler if available (supports dynamic provider_nodes).
  // Falls back to hardcoded registry lookup for backward compatibility.
  let providerConfig = resolvedProvider;
  let modelId = resolvedModel;
  if (!providerConfig) {
    const parsed = parseSpeechModel(body.model);
    providerConfig = parsed.provider ? getSpeechProvider(parsed.provider) : null;
    modelId = parsed.model;
  }

  if (!providerConfig) {
    return errorResponse(
@@ -403,7 +415,7 @@ export async function handleAudioSpeech({ body, credentials }) {
  const token =
    providerConfig.authType === "none" ? null : credentials?.apiKey || credentials?.accessToken;
  if (providerConfig.authType !== "none" && !token) {
    return errorResponse(401, `No credentials for speech provider: ${providerId}`);
    return errorResponse(401, `No credentials for speech provider: ${providerConfig.id}`);
  }

  try {
@@ -13,7 +13,11 @@ import { getCorsOrigin } from "../utils/cors.ts";
 * - HuggingFace Inference: POST raw binary to /models/{model_id}
 */

import { getTranscriptionProvider, parseTranscriptionModel } from "../config/audioRegistry.ts";
import {
  getTranscriptionProvider,
  parseTranscriptionModel,
  type AudioProvider,
} from "../config/audioRegistry.ts";
import { buildAuthHeaders } from "../config/registryUtils.ts";
import { errorResponse } from "../utils/error.ts";

@@ -235,9 +239,13 @@ async function handleHuggingFaceTranscription(providerConfig, file, modelId, tok
export async function handleAudioTranscription({
  formData,
  credentials,
  resolvedProvider = null,
  resolvedModel = null,
}: {
  formData: FormData;
  credentials?: TranscriptionCredentials | null;
  resolvedProvider?: AudioProvider | null;
  resolvedModel?: string | null;
}): Promise<Response> {
  const model = formData.get("model");
  if (typeof model !== "string" || !model) {
@@ -250,8 +258,14 @@ export async function handleAudioTranscription({
  }
  const file = fileEntry as Blob & { name?: unknown };

  const { provider: providerId, model: modelId } = parseTranscriptionModel(model);
  const providerConfig = providerId ? getTranscriptionProvider(providerId) : null;
  // Use pre-resolved provider/model from route handler if available (supports dynamic provider_nodes).
  let providerConfig = resolvedProvider;
  let modelId = resolvedModel;
  if (!providerConfig) {
    const parsed = parseTranscriptionModel(model);
    providerConfig = parsed.provider ? getTranscriptionProvider(parsed.provider) : null;
    modelId = parsed.model;
  }

  if (!providerConfig) {
    return errorResponse(
@@ -264,7 +278,7 @@ export async function handleAudioTranscription({
  const token =
    providerConfig.authType === "none" ? null : credentials?.apiKey || credentials?.accessToken;
  if (providerConfig.authType !== "none" && !token) {
    return errorResponse(401, `No credentials for transcription provider: ${providerId}`);
    return errorResponse(401, `No credentials for transcription provider: ${providerConfig.id}`);
  }

  // Route to provider-specific handler
+157
-26
@@ -13,6 +13,7 @@ import { refreshWithRetry } from "../services/tokenRefresh.ts";
import { createRequestLogger } from "../utils/requestLogger.ts";
import { getModelTargetFormat, PROVIDER_ID_TO_ALIAS } from "../config/providerModels.ts";
import { resolveModelAlias } from "../services/modelDeprecation.ts";
import { getUnsupportedParams } from "../config/providerRegistry.ts";
import { createErrorResult, parseUpstreamError, formatProviderError } from "../utils/error.ts";
import { HTTP_STATUS } from "../config/constants.ts";
import { handleBypassRequest } from "../utils/bypassHandler.ts";
@@ -41,6 +42,7 @@ import {
import { getIdempotencyKey, checkIdempotency, saveIdempotency } from "@/lib/idempotencyLayer";
import { createProgressTransform, wantsProgress } from "../utils/progressTracker.ts";
import { isModelUnavailableError, getNextFamilyFallback } from "../services/modelFamilyFallback.ts";
import { computeRequestHash, deduplicate, shouldDeduplicate } from "../services/requestDedup.ts";

export function shouldUseNativeCodexPassthrough({
  provider,
@@ -53,7 +55,9 @@ export function shouldUseNativeCodexPassthrough({
|
||||
}): boolean {
|
||||
if (provider !== "codex") return false;
|
||||
if (sourceFormat !== FORMATS.OPENAI_RESPONSES) return false;
|
||||
return String(endpointPath || "").toLowerCase().endsWith("/responses");
|
||||
return String(endpointPath || "")
|
||||
.toLowerCase()
|
||||
.endsWith("/responses");
|
||||
}
|
||||
|
||||
/**
|
||||
@@ -86,6 +90,22 @@ export async function handleChatCore({
|
||||
}) {
|
||||
const { provider, model, extendedContext } = modelInfo;
|
||||
const startTime = Date.now();
|
||||
const persistFailureUsage = (statusCode: number, errorCode?: string | null) => {
|
||||
saveRequestUsage({
|
||||
provider: provider || "unknown",
|
||||
model: model || "unknown",
|
||||
tokens: { input: 0, output: 0, cacheRead: 0, cacheCreation: 0, reasoning: 0 },
|
||||
status: String(statusCode),
|
||||
success: false,
|
||||
latencyMs: Date.now() - startTime,
|
||||
timeToFirstTokenMs: 0,
|
||||
errorCode: errorCode || String(statusCode),
|
||||
timestamp: new Date().toISOString(),
|
||||
connectionId: connectionId || undefined,
|
||||
apiKeyId: apiKeyInfo?.id || undefined,
|
||||
apiKeyName: apiKeyInfo?.name || undefined,
|
||||
}).catch(() => {});
|
||||
};
|
||||
|
||||
// ── Phase 9.2: Idempotency check ──
|
||||
const idempotencyKey = getIdempotencyKey(clientRawRequest?.headers);
|
||||
@@ -182,10 +202,17 @@ export async function handleChatCore({
|
||||
|
||||
// Translate request (pass reqLogger for intermediate logging)
|
||||
let translatedBody = body;
|
||||
const isClaudePassthrough = sourceFormat === FORMATS.CLAUDE && targetFormat === FORMATS.CLAUDE;
|
||||
try {
|
||||
if (nativeCodexPassthrough) {
|
||||
translatedBody = { ...body, _nativeCodexPassthrough: true };
|
||||
log?.debug?.("FORMAT", "native codex passthrough enabled");
|
||||
} else if (isClaudePassthrough) {
|
||||
// Claude-to-Claude passthrough: forward body completely untouched.
|
||||
// No translation, no field stripping, no thinking normalization.
|
||||
// We are just a gateway -- do not interfere with the request in the slightest.
|
||||
translatedBody = { ...body };
|
||||
log?.debug?.("FORMAT", "claude->claude passthrough -- forwarding untouched");
|
||||
} else {
|
||||
translatedBody = { ...body };
|
||||
|
||||
@@ -230,6 +257,55 @@ export async function handleChatCore({
|
||||
});
|
||||
}
|
||||
|
||||
// Strip empty text content blocks from messages.
|
||||
// Anthropic API rejects {"type":"text","text":""} with 400 "text content blocks must be non-empty".
|
||||
// Some clients (LiteLLM passthrough, @ai-sdk/anthropic) may forward these empty blocks as-is.
|
||||
if (Array.isArray(translatedBody.messages)) {
|
||||
for (const msg of translatedBody.messages) {
|
||||
if (Array.isArray(msg.content)) {
|
||||
msg.content = msg.content.filter(
|
||||
(block: Record<string, unknown>) =>
|
||||
block.type !== "text" || (typeof block.text === "string" && block.text.length > 0)
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// ── #409: Normalize unsupported content part types ──
|
||||
// Cursor and other clients send {type:"file"} when attaching .md or other files.
|
||||
// Providers (Copilot, OpenAI) only accept "text" and "image_url" in content arrays.
|
||||
// Convert: file → text (extract content); drop unrecognized types with a debug log.
|
||||
if (Array.isArray(translatedBody.messages)) {
|
||||
for (const msg of translatedBody.messages) {
|
||||
if (msg.role === "user" && Array.isArray(msg.content)) {
|
||||
msg.content = (msg.content as Record<string, unknown>[]).flatMap(
|
||||
(block: Record<string, unknown>) => {
|
||||
if (block.type === "text" || block.type === "image_url" || block.type === "image") {
|
||||
return [block];
|
||||
}
|
||||
// file / document → extract text content
|
||||
if (block.type === "file" || block.type === "document") {
|
||||
const fileContent =
|
||||
(block.file as Record<string, unknown>)?.content ??
|
||||
(block.file as Record<string, unknown>)?.text ??
|
||||
block.content ??
|
||||
block.text;
|
||||
const fileName =
|
||||
(block.file as Record<string, unknown>)?.name ?? block.name ?? "attachment";
|
||||
if (typeof fileContent === "string" && fileContent.length > 0) {
|
||||
return [{ type: "text", text: `[${fileName}]\n${fileContent}` }];
|
||||
}
|
||||
return [];
|
||||
}
|
||||
// Unknown types: drop and log at debug level
|
||||
log?.debug?.("CONTENT", `Dropped unsupported content part type="${block.type}"`);
|
||||
return [];
|
||||
}
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
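The two normalization passes above (dropping empty text blocks, then converting file parts to text) can be exercised as standalone functions. This is a rough sketch assuming a plain array of content blocks; `Block`, `stripEmptyText`, and `normalizeParts` are names invented here, and unlike the diff the sketch does no logging and is not restricted to user messages.

```typescript
type Block = Record<string, unknown>;

// Keep non-text blocks, and text blocks whose text is a non-empty string.
function stripEmptyText(content: Block[]): Block[] {
  return content.filter(
    (b) => b.type !== "text" || (typeof b.text === "string" && b.text.length > 0)
  );
}

// Pass through text and images, convert file/document parts to text, drop the rest.
function normalizeParts(content: Block[]): Block[] {
  return content.flatMap((b): Block[] => {
    if (b.type === "text" || b.type === "image_url" || b.type === "image") return [b];
    if (b.type === "file" || b.type === "document") {
      const file = (b.file ?? {}) as Block;
      const body = file.content ?? file.text ?? b.content ?? b.text;
      const name = file.name ?? b.name ?? "attachment";
      return typeof body === "string" && body.length > 0
        ? [{ type: "text", text: `[${String(name)}]\n${body}` }]
        : [];
    }
    return []; // unknown part types are dropped
  });
}
```

Using `flatMap` lets each input part map to zero or one output parts, so dropping and converting happen in a single pass without a separate filter step.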
translatedBody = translateRequest(
|
||||
sourceFormat,
|
||||
targetFormat,
|
||||
@@ -287,9 +363,75 @@ export async function handleChatCore({
|
||||
// Update model in body
|
||||
translatedBody.model = model;
|
||||
|
||||
// Strip unsupported parameters for reasoning models (o1, o3, etc.)
|
||||
const unsupported = getUnsupportedParams(provider, model);
|
||||
if (unsupported.length > 0) {
|
||||
const stripped: string[] = [];
|
||||
for (const param of unsupported) {
|
||||
if (Object.hasOwn(translatedBody, param)) {
|
||||
stripped.push(param);
|
||||
delete translatedBody[param];
|
||||
}
|
||||
}
|
||||
if (stripped.length > 0) {
|
||||
log?.warn?.("PARAMS", `Stripped unsupported params for ${model}: ${stripped.join(", ")}`);
|
||||
}
|
||||
}
|
||||
|
||||
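The parameter stripping above can be illustrated without the provider registry. A minimal sketch, assuming the unsupported list is already known: `stripUnsupported` is a hypothetical helper, and the parameter names in the usage note are examples only, not the registry's actual contents. It uses `Object.prototype.hasOwnProperty.call` instead of `Object.hasOwn` so it also compiles against older `lib` targets.

```typescript
// Removes the listed keys from a request body in place and returns
// the names of the keys that were actually present.
function stripUnsupported(body: Record<string, unknown>, unsupported: string[]): string[] {
  const stripped: string[] = [];
  for (const param of unsupported) {
    if (Object.prototype.hasOwnProperty.call(body, param)) {
      stripped.push(param);
      delete body[param];
    }
  }
  return stripped;
}
```

For a body containing `temperature` but not `top_p`, calling `stripUnsupported(body, ["temperature", "top_p"])` would remove only `temperature` and report it, which is why the diff logs a warning only when the stripped list is non-empty.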
// Get executor for this provider
|
||||
const executor = getExecutor(provider);
|
||||
|
||||
// Create stream controller for disconnect detection
|
||||
const streamController = createStreamController({ onDisconnect, log, provider, model });
|
||||
|
||||
const dedupRequestBody = { ...translatedBody, model: `${provider}/${model}` };
|
||||
const dedupEnabled = shouldDeduplicate(dedupRequestBody);
|
||||
const dedupHash = dedupEnabled ? computeRequestHash(dedupRequestBody) : null;
|
||||
|
||||
const executeProviderRequest = async (modelToCall = model, allowDedup = false) => {
|
||||
const execute = async () => {
|
||||
const bodyToSend =
|
||||
translatedBody.model === modelToCall
|
||||
? translatedBody
|
||||
: { ...translatedBody, model: modelToCall };
|
||||
|
||||
const rawResult = await withRateLimit(provider, connectionId, modelToCall, () =>
|
||||
executor.execute({
|
||||
model: modelToCall,
|
||||
body: bodyToSend,
|
||||
stream,
|
||||
credentials,
|
||||
signal: streamController.signal,
|
||||
log,
|
||||
extendedContext,
|
||||
})
|
||||
);
|
||||
|
||||
if (stream) return rawResult;
|
||||
|
||||
// Non-stream responses need cloning for shared dedup consumers.
|
||||
const status = rawResult.response.status;
|
||||
const statusText = rawResult.response.statusText;
|
||||
const headers = Array.from(rawResult.response.headers.entries());
|
||||
const payload = await rawResult.response.text();
|
||||
|
||||
return {
|
||||
...rawResult,
|
||||
response: new Response(payload, { status, statusText, headers }),
|
||||
};
|
||||
};
|
||||
|
||||
if (allowDedup && dedupEnabled && dedupHash) {
|
||||
const dedupResult = await deduplicate(dedupHash, execute);
|
||||
if (dedupResult.wasDeduplicated) {
|
||||
log?.debug?.("DEDUP", `Joined in-flight request hash=${dedupHash}`);
|
||||
}
|
||||
return dedupResult.result;
|
||||
}
|
||||
|
||||
return execute();
|
||||
};
|
||||
|
||||
// Track pending request
|
||||
trackPendingRequest(model, provider, connectionId, true);
|
||||
|
||||
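The in-flight deduplication that `executeProviderRequest` relies on can be approximated as follows. This is a sketch of the general technique, not the project's `requestDedup` service: `inFlight` and `deduplicateSketch` are hypothetical names, and the result shape (`result`, `wasDeduplicated`) is inferred only from how the diff reads the return value.

```typescript
// Map from request hash to the promise of the first caller's result.
const inFlight = new Map<string, Promise<unknown>>();

async function deduplicateSketch<T>(
  hash: string,
  run: () => Promise<T>
): Promise<{ result: T; wasDeduplicated: boolean }> {
  const existing = inFlight.get(hash) as Promise<T> | undefined;
  if (existing) {
    // A request with the same hash is already running: join it.
    return { result: await existing, wasDeduplicated: true };
  }
  // First caller: start the work and forget the entry once it settles.
  const pending = run().finally(() => {
    inFlight.delete(hash);
  });
  inFlight.set(hash, pending);
  return { result: await pending, wasDeduplicated: false };
}
```

Joined callers receive the same resolved value as the first caller, which is presumably why the diff buffers non-stream responses with `response.text()` and rebuilds a `Response` from the payload before sharing the result.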
@@ -307,9 +449,6 @@ export async function handleChatCore({
|
||||
0;
|
||||
log?.debug?.("REQUEST", `${provider.toUpperCase()} | ${model} | ${msgCount} msgs`);
|
||||
|
||||
// Create stream controller for disconnect detection
|
||||
const streamController = createStreamController({ onDisconnect, log, provider, model });
|
||||
|
||||
// Execute request using executor (handles URL building, headers, fallback, transform)
|
||||
let providerResponse;
|
||||
let providerUrl;
|
||||
@@ -317,17 +456,7 @@ export async function handleChatCore({
|
||||
let finalBody;
|
||||
|
||||
try {
|
||||
const result = await withRateLimit(provider, connectionId, model, () =>
|
||||
executor.execute({
|
||||
model,
|
||||
body: translatedBody,
|
||||
stream,
|
||||
credentials,
|
||||
signal: streamController.signal,
|
||||
log,
|
||||
extendedContext,
|
||||
})
|
||||
);
|
||||
const result = await executeProviderRequest(model, true);
|
||||
|
||||
providerResponse = result.response;
|
||||
providerUrl = result.url;
|
||||
@@ -374,6 +503,7 @@ export async function handleChatCore({
|
||||
streamController.handleError(error);
|
||||
return createErrorResult(499, "Request aborted");
|
||||
}
|
||||
persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, error?.name || "upstream_error");
|
||||
const errMsg = formatProviderError(error, provider, model, HTTP_STATUS.BAD_GATEWAY);
|
||||
console.log(`${COLORS.red}[ERROR] ${errMsg}${COLORS.reset}`);
|
||||
return createErrorResult(HTTP_STATUS.BAD_GATEWAY, errMsg);
|
||||
@@ -483,17 +613,7 @@ export async function handleChatCore({
|
||||
log?.info?.("MODEL_FALLBACK", `${model} unavailable (${statusCode}) → trying ${nextModel}`);
|
||||
// Re-execute with the fallback model
|
||||
try {
|
||||
const fallbackResult = await withRateLimit(provider, connectionId, nextModel, () =>
|
||||
executor.execute({
|
||||
model: nextModel,
|
||||
body: translatedBody,
|
||||
stream,
|
||||
credentials,
|
||||
signal: streamController.signal,
|
||||
log,
|
||||
extendedContext,
|
||||
})
|
||||
);
|
||||
const fallbackResult = await executeProviderRequest(nextModel, false);
|
||||
if (fallbackResult.response.ok) {
|
||||
providerResponse = fallbackResult.response;
|
||||
providerUrl = fallbackResult.url;
|
||||
@@ -505,15 +625,19 @@ export async function handleChatCore({
|
||||
// We fall through by NOT returning here
|
||||
} else {
|
||||
// Fallback also failed — return original error
|
||||
persistFailureUsage(statusCode, "model_unavailable");
|
||||
return createErrorResult(statusCode, errMsg, retryAfterMs);
|
||||
}
|
||||
} catch {
|
||||
persistFailureUsage(statusCode, "model_unavailable");
|
||||
return createErrorResult(statusCode, errMsg, retryAfterMs);
|
||||
}
|
||||
} else {
|
||||
persistFailureUsage(statusCode, "model_unavailable");
|
||||
return createErrorResult(statusCode, errMsg, retryAfterMs);
|
||||
}
|
||||
} else {
|
||||
persistFailureUsage(statusCode, `upstream_${statusCode}`);
|
||||
return createErrorResult(statusCode, errMsg, retryAfterMs);
|
||||
}
|
||||
// ── End T5 ───────────────────────────────────────────────────────────────
|
||||
@@ -542,6 +666,7 @@ export async function handleChatCore({
|
||||
connectionId,
|
||||
status: `FAILED ${HTTP_STATUS.BAD_GATEWAY}`,
|
||||
}).catch(() => {});
|
||||
persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, "invalid_sse_payload");
|
||||
return createErrorResult(
|
||||
HTTP_STATUS.BAD_GATEWAY,
|
||||
"Invalid SSE response for non-streaming request"
|
||||
@@ -559,6 +684,7 @@ export async function handleChatCore({
|
||||
connectionId,
|
||||
status: `FAILED ${HTTP_STATUS.BAD_GATEWAY}`,
|
||||
}).catch(() => {});
|
||||
persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, "invalid_json_payload");
|
||||
return createErrorResult(HTTP_STATUS.BAD_GATEWAY, "Invalid JSON response from provider");
|
||||
}
|
||||
}
|
||||
@@ -601,6 +727,11 @@ export async function handleChatCore({
|
||||
provider: provider || "unknown",
|
||||
model: model || "unknown",
|
||||
tokens: usage,
|
||||
status: "200",
|
||||
success: true,
|
||||
latencyMs: Date.now() - startTime,
|
||||
timeToFirstTokenMs: Date.now() - startTime,
|
||||
errorCode: null,
|
||||
timestamp: new Date().toISOString(),
|
||||
connectionId: connectionId || undefined,
|
||||
apiKeyId: apiKeyInfo?.id || undefined,
|
||||
|
||||
@@ -13,18 +13,48 @@
|
||||
* }
|
||||
*/
|
||||
|
||||
import { getEmbeddingProvider, parseEmbeddingModel } from "../config/embeddingRegistry.ts";
|
||||
import {
|
||||
getEmbeddingProvider,
|
||||
parseEmbeddingModel,
|
||||
type EmbeddingProvider,
|
||||
} from "../config/embeddingRegistry.ts";
|
||||
import { saveCallLog } from "@/lib/usageDb";
|
||||
|
||||
/**
|
||||
* Handle embedding request
|
||||
* @param {object} options
|
||||
* @param {object} options.body - Request body
|
||||
* @param {object} options.credentials - Provider credentials { apiKey, accessToken }
|
||||
* @param {object} options.log - Logger
|
||||
* Handle embedding request.
|
||||
* Supports both hardcoded cloud providers and dynamic local provider_nodes.
|
||||
* When resolvedProvider is passed, uses it directly (injection pattern from route handler).
|
||||
* Falls back to hardcoded registry lookup for backward compatibility.
|
||||
*/
|
||||
export async function handleEmbedding({ body, credentials, log }) {
|
||||
const { provider, model } = parseEmbeddingModel(body.model);
|
||||
export async function handleEmbedding({
|
||||
body,
|
||||
credentials,
|
||||
log,
|
||||
resolvedProvider = null,
|
||||
resolvedModel = null,
|
||||
}: {
|
||||
body: Record<string, unknown>;
|
||||
credentials: { apiKey?: string; accessToken?: string } | null;
|
||||
log?: { info: (...args: unknown[]) => void; error: (...args: unknown[]) => void };
|
||||
resolvedProvider?: EmbeddingProvider | null;
|
||||
resolvedModel?: string | null;
|
||||
}) {
|
||||
// Use pre-resolved provider/model from route handler if available (supports dynamic provider_nodes).
|
||||
let provider: string | null;
|
||||
let model: string | null;
|
||||
let providerConfig: EmbeddingProvider | null;
|
||||
|
||||
if (resolvedProvider) {
|
||||
provider = resolvedProvider.id;
|
||||
model = resolvedModel;
|
||||
providerConfig = resolvedProvider;
|
||||
} else {
|
||||
const parsed = parseEmbeddingModel(body.model as string);
|
||||
provider = parsed.provider;
|
||||
model = parsed.model;
|
||||
providerConfig = provider ? getEmbeddingProvider(provider) : null;
|
||||
}
|
||||
|
||||
const startTime = Date.now();
|
||||
|
||||
// Summarized request body for call log (avoid storing large embedding input arrays)
|
||||
@@ -42,7 +72,6 @@ export async function handleEmbedding({ body, credentials, log }) {
|
||||
};
|
||||
}
|
||||
|
||||
const providerConfig = getEmbeddingProvider(provider);
|
||||
if (!providerConfig) {
|
||||
return {
|
||||
success: false,
|
||||
@@ -66,11 +95,15 @@ export async function handleEmbedding({ body, credentials, log }) {
|
||||
"Content-Type": "application/json",
|
||||
};
|
||||
|
||||
const token = credentials.apiKey || credentials.accessToken;
|
||||
if (providerConfig.authHeader === "bearer") {
|
||||
headers["Authorization"] = `Bearer ${token}`;
|
||||
} else if (providerConfig.authHeader === "x-api-key") {
|
||||
headers["x-api-key"] = token;
|
||||
// Skip credential injection for local providers (authType: "none")
|
||||
const token =
|
||||
providerConfig.authType === "none" ? null : credentials?.apiKey || credentials?.accessToken;
|
||||
if (token) {
|
||||
if (providerConfig.authHeader === "bearer") {
|
||||
headers["Authorization"] = `Bearer ${token}`;
|
||||
} else if (providerConfig.authHeader === "x-api-key") {
|
||||
headers["x-api-key"] = token;
|
||||
}
|
||||
}
|
||||
|
||||
if (log) {
|
||||
|
||||
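The credential handling introduced for embeddings can be summarized in a small helper. A minimal sketch, assuming only the two header styles visible in the diff: `EmbeddingProviderSketch`, `Credentials`, and `buildHeaders` are hypothetical names, and the `authType` values other than `"none"` are assumptions.

```typescript
interface EmbeddingProviderSketch {
  authType: "apikey" | "oauth" | "none"; // only "none" is confirmed by the diff
  authHeader: "bearer" | "x-api-key";
}

type Credentials = { apiKey?: string; accessToken?: string } | null;

// Builds request headers, skipping credential injection for local providers.
function buildHeaders(
  provider: EmbeddingProviderSketch,
  credentials: Credentials
): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  const token =
    provider.authType === "none" ? null : credentials?.apiKey || credentials?.accessToken;
  if (token) {
    if (provider.authHeader === "bearer") {
      headers["Authorization"] = `Bearer ${token}`;
    } else if (provider.authHeader === "x-api-key") {
      headers["x-api-key"] = token;
    }
  }
  return headers;
}
```

Guarding on `token` rather than on `authType` alone also avoids sending a literal `Bearer undefined` header when credentials are missing, which the previous unconditional assignment could produce.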
@@ -0,0 +1,48 @@
|
||||
import { describe, it, expect } from "vitest";
|
||||
import {
|
||||
MCP_TOOLS,
|
||||
MCP_TOOL_MAP,
|
||||
setRoutingStrategyInput,
|
||||
setRoutingStrategyTool,
|
||||
} from "../schemas/tools.ts";
|
||||
|
||||
describe("omniroute_set_routing_strategy MCP tool schema", () => {
|
||||
it("should be registered in MCP_TOOLS", () => {
|
||||
const tool = MCP_TOOLS.find((t) => t.name === "omniroute_set_routing_strategy");
|
||||
expect(tool).toBeDefined();
|
||||
expect(tool?.phase).toBe(2);
|
||||
});
|
||||
|
||||
it("should be available in MCP_TOOL_MAP", () => {
|
||||
expect(MCP_TOOL_MAP["omniroute_set_routing_strategy"]).toBeDefined();
|
||||
});
|
||||
|
||||
it("should require write:combos scope", () => {
|
||||
expect(setRoutingStrategyTool.scopes).toContain("write:combos");
|
||||
});
|
||||
|
||||
it("should validate a standard strategy payload", () => {
|
||||
const result = setRoutingStrategyInput.safeParse({
|
||||
comboId: "my-combo",
|
||||
strategy: "cost-optimized",
|
||||
});
|
||||
expect(result.success).toBe(true);
|
||||
});
|
||||
|
||||
it("should validate auto strategy with autoRoutingStrategy", () => {
|
||||
const result = setRoutingStrategyInput.safeParse({
|
||||
comboId: "my-combo",
|
||||
strategy: "auto",
|
||||
autoRoutingStrategy: "latency",
|
||||
});
|
||||
expect(result.success).toBe(true);
|
||||
});
|
||||
|
||||
it("should reject unknown strategy", () => {
|
||||
const result = setRoutingStrategyInput.safeParse({
|
||||
comboId: "my-combo",
|
||||
strategy: "unknown-strategy",
|
||||
});
|
||||
expect(result.success).toBe(false);
|
||||
});
|
||||
});
|
||||
@@ -107,6 +107,7 @@ export const listCombosOutput = z.object({
|
||||
"priority",
|
||||
"weighted",
|
||||
"round-robin",
|
||||
"strict-random",
|
||||
"random",
|
||||
"least-used",
|
||||
"cost-optimized",
|
||||
@@ -470,7 +471,53 @@ export const setBudgetGuardTool: McpToolDefinition<
|
||||
sourceEndpoints: ["/api/usage/budget"],
|
||||
};
|
||||
|
||||
// --- Tool 11: omniroute_set_resilience_profile ---
|
||||
// --- Tool 11: omniroute_set_routing_strategy ---
|
||||
export const setRoutingStrategyInput = z.object({
|
||||
comboId: z.string().describe("Combo ID or name to update"),
|
||||
strategy: z
|
||||
.enum([
|
||||
"priority",
|
||||
"weighted",
|
||||
"round-robin",
|
||||
"strict-random",
|
||||
"random",
|
||||
"least-used",
|
||||
"cost-optimized",
|
||||
"auto",
|
||||
])
|
||||
.describe("Routing strategy to apply"),
|
||||
autoRoutingStrategy: z
|
||||
.enum(["rules", "cost", "eco", "latency", "fast"])
|
||||
.optional()
|
||||
.describe("Optional strategy used by auto mode (only used when strategy='auto')"),
|
||||
});
|
||||
|
||||
export const setRoutingStrategyOutput = z.object({
|
||||
success: z.boolean(),
|
||||
combo: z.object({
|
||||
id: z.string(),
|
||||
name: z.string(),
|
||||
strategy: z.string(),
|
||||
autoRoutingStrategy: z.string().nullable(),
|
||||
}),
|
||||
});
|
||||
|
||||
export const setRoutingStrategyTool: McpToolDefinition<
|
||||
typeof setRoutingStrategyInput,
|
||||
typeof setRoutingStrategyOutput
|
||||
> = {
|
||||
name: "omniroute_set_routing_strategy",
|
||||
description:
|
||||
"Updates a combo routing strategy (priority/weighted/auto/etc.) at runtime. Supports selecting the sub-strategy used by auto mode (rules/cost/latency).",
|
||||
inputSchema: setRoutingStrategyInput,
|
||||
outputSchema: setRoutingStrategyOutput,
|
||||
scopes: ["write:combos"],
|
||||
auditLevel: "full",
|
||||
phase: 2,
|
||||
sourceEndpoints: ["/api/combos", "/api/combos/{id}"],
|
||||
};
|
||||
|
||||
// --- Tool 12: omniroute_set_resilience_profile ---
|
||||
export const setResilienceProfileInput = z.object({
|
||||
profile: z
|
||||
.enum(["aggressive", "balanced", "conservative"])
|
||||
@@ -502,7 +549,7 @@ export const setResilienceProfileTool: McpToolDefinition<
|
||||
sourceEndpoints: ["/api/resilience"],
|
||||
};
|
||||
|
||||
// --- Tool 12: omniroute_test_combo ---
|
||||
// --- Tool 13: omniroute_test_combo ---
|
||||
export const testComboInput = z.object({
|
||||
comboId: z.string().describe("ID of the combo to test"),
|
||||
testPrompt: z.string().max(500).describe("Short test prompt (max 500 chars)"),
|
||||
@@ -540,7 +587,7 @@ export const testComboTool: McpToolDefinition<typeof testComboInput, typeof test
|
||||
sourceEndpoints: ["/api/combos/test", "/v1/chat/completions"],
|
||||
};
|
||||
|
||||
// --- Tool 13: omniroute_get_provider_metrics ---
|
||||
// --- Tool 14: omniroute_get_provider_metrics ---
|
||||
export const getProviderMetricsInput = z.object({
|
||||
provider: z.string().describe("Provider name (e.g., 'claude', 'gemini-cli', 'codex')"),
|
||||
});
|
||||
@@ -583,7 +630,7 @@ export const getProviderMetricsTool: McpToolDefinition<
|
||||
sourceEndpoints: ["/api/provider-metrics", "/api/resilience"],
|
||||
};
|
||||
|
||||
// --- Tool 14: omniroute_best_combo_for_task ---
|
||||
// --- Tool 15: omniroute_best_combo_for_task ---
|
||||
export const bestComboForTaskInput = z.object({
|
||||
taskType: z
|
||||
.enum(["coding", "review", "planning", "analysis", "debugging", "documentation"])
|
||||
@@ -628,7 +675,7 @@ export const bestComboForTaskTool: McpToolDefinition<
|
||||
sourceEndpoints: ["/api/combos", "/api/combos/metrics", "/api/monitoring/health"],
|
||||
};
|
||||
|
||||
// --- Tool 15: omniroute_explain_route ---
|
||||
// --- Tool 16: omniroute_explain_route ---
|
||||
export const explainRouteInput = z.object({
|
||||
requestId: z.string().describe("Request ID from the X-Request-Id header"),
|
||||
});
|
||||
@@ -674,7 +721,7 @@ export const explainRouteTool: McpToolDefinition<
|
||||
sourceEndpoints: [],
|
||||
};
|
||||
|
||||
// --- Tool 16: omniroute_get_session_snapshot ---
|
||||
// --- Tool 17: omniroute_get_session_snapshot ---
|
||||
export const getSessionSnapshotInput = z.object({}).describe("No parameters required");
|
||||
|
||||
export const getSessionSnapshotOutput = z.object({
|
||||
@@ -723,7 +770,7 @@ export const getSessionSnapshotTool: McpToolDefinition<
|
||||
sourceEndpoints: ["/api/usage/analytics", "/api/telemetry/summary"],
|
||||
};
|
||||
|
||||
// --- Tool 17: omniroute_sync_pricing ---
|
||||
// --- Tool 18: omniroute_sync_pricing ---
|
||||
export const syncPricingInput = z.object({
|
||||
sources: z
|
||||
.array(z.string())
|
||||
@@ -775,6 +822,7 @@ export const MCP_TOOLS = [
|
||||
// Phase 2: Advanced
|
||||
simulateRouteTool,
|
||||
setBudgetGuardTool,
|
||||
setRoutingStrategyTool,
|
||||
setResilienceProfileTool,
|
||||
testComboTool,
|
||||
getProviderMetricsTool,
|
||||
|
||||
@@ -25,6 +25,7 @@ import {
|
||||
listModelsCatalogInput,
|
||||
simulateRouteInput,
|
||||
setBudgetGuardInput,
|
||||
setRoutingStrategyInput,
|
||||
setResilienceProfileInput,
|
||||
testComboInput,
|
||||
getProviderMetricsInput,
|
||||
@@ -45,6 +46,7 @@ import {
|
||||
import {
|
||||
handleSimulateRoute,
|
||||
handleSetBudgetGuard,
|
||||
handleSetRoutingStrategy,
|
||||
handleSetResilienceProfile,
|
||||
handleTestCombo,
|
||||
handleGetProviderMetrics,
|
||||
@@ -593,6 +595,18 @@ export function createMcpServer(): McpServer {
|
||||
)
|
||||
);
|
||||
|
||||
server.registerTool(
|
||||
"omniroute_set_routing_strategy",
|
||||
{
|
||||
description:
|
||||
"Updates combo routing strategy at runtime (priority/weighted/round-robin/auto/etc.)",
|
||||
inputSchema: setRoutingStrategyInput,
|
||||
},
|
||||
withScopeEnforcement("omniroute_set_routing_strategy", (args) =>
|
||||
handleSetRoutingStrategy(setRoutingStrategyInput.parse(args))
|
||||
)
|
||||
);
|
||||
|
||||
server.registerTool(
|
||||
"omniroute_set_resilience_profile",
|
||||
{
|
||||
|
||||
@@ -1,16 +1,18 @@
|
||||
/**
|
||||
* OmniRoute MCP Advanced Tools — 8 intelligence tools that differentiate
|
||||
* OmniRoute MCP Advanced Tools — 10 intelligence tools that differentiate
|
||||
* OmniRoute from all other AI gateways.
|
||||
*
|
||||
* Tools:
|
||||
* 1. omniroute_simulate_route — Dry-run routing simulation
|
||||
* 2. omniroute_set_budget_guard — Session budget with degrade/block/alert
|
||||
* 3. omniroute_set_resilience_profile — Circuit breaker/retry profiles
|
||||
* 4. omniroute_test_combo — Live test each provider in a combo
|
||||
* 5. omniroute_get_provider_metrics — Detailed per-provider metrics
|
||||
* 6. omniroute_best_combo_for_task — AI-powered combo recommendation
|
||||
* 7. omniroute_explain_route — Post-hoc routing decision explainer
|
||||
* 8. omniroute_get_session_snapshot — Full session state snapshot
|
||||
* 3. omniroute_set_routing_strategy — Runtime strategy switch for combos
|
||||
* 4. omniroute_set_resilience_profile — Circuit breaker/retry profiles
|
||||
* 5. omniroute_test_combo — Live test each provider in a combo
|
||||
* 6. omniroute_get_provider_metrics — Detailed per-provider metrics
|
||||
* 7. omniroute_best_combo_for_task — AI-powered combo recommendation
|
||||
* 8. omniroute_explain_route — Post-hoc routing decision explainer
|
||||
* 9. omniroute_get_session_snapshot — Full session state snapshot
|
||||
* 10. omniroute_sync_pricing — Sync provider pricing from external source
|
||||
*/
|
||||
|
||||
import { logToolCall } from "../audit.ts";
|
||||
@@ -335,6 +337,108 @@ export async function handleSetBudgetGuard(args: {
|
||||
}
|
||||
}
|
||||
|
||||
export async function handleSetRoutingStrategy(args: {
|
||||
comboId: string;
|
||||
strategy:
|
||||
| "priority"
|
||||
| "weighted"
|
||||
| "round-robin"
|
||||
| "strict-random"
|
||||
| "random"
|
||||
| "least-used"
|
||||
| "cost-optimized"
|
||||
| "auto";
|
||||
autoRoutingStrategy?: "rules" | "cost" | "eco" | "latency" | "fast";
|
||||
}) {
|
||||
const start = Date.now();
|
||||
try {
|
||||
const combos = normalizeCombosResponse(await apiFetch("/api/combos"));
|
||||
const combo = combos.find(
|
||||
(comboEntry) =>
|
||||
toString(comboEntry.id) === args.comboId || toString(comboEntry.name) === args.comboId
|
||||
);
|
||||
|
||||
if (!combo) {
|
||||
const msg = `Combo '${args.comboId}' not found`;
|
||||
await logToolCall(
|
||||
"omniroute_set_routing_strategy",
|
||||
args,
|
||||
null,
|
||||
Date.now() - start,
|
||||
false,
|
||||
msg
|
||||
);
|
||||
return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
|
||||
}
|
||||
|
||||
const comboId = toString(combo.id);
|
||||
if (!comboId) {
|
||||
const msg = "Matched combo has no id";
|
||||
await logToolCall(
|
||||
"omniroute_set_routing_strategy",
|
||||
args,
|
||||
null,
|
||||
Date.now() - start,
|
||||
false,
|
||||
msg
|
||||
);
|
||||
return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
|
||||
}
|
||||
|
||||
const comboData = toRecord(combo.data);
|
||||
const currentConfig = toRecord(
|
||||
Object.keys(toRecord(combo.config)).length > 0 ? combo.config : comboData.config
|
||||
);
|
||||
|
||||
let nextConfig: JsonRecord | undefined = undefined;
|
||||
if (args.strategy === "auto" && args.autoRoutingStrategy) {
|
||||
const currentAutoConfig = toRecord(currentConfig.auto);
|
||||
nextConfig = {
|
||||
...currentConfig,
|
||||
auto: {
|
||||
...currentAutoConfig,
|
||||
routingStrategy: args.autoRoutingStrategy,
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
const payload: JsonRecord = { strategy: args.strategy };
|
||||
if (nextConfig && Object.keys(nextConfig).length > 0) {
|
||||
payload.config = nextConfig;
|
||||
}
|
||||
|
||||
const updatedCombo = toRecord(
|
||||
await apiFetch(`/api/combos/${encodeURIComponent(comboId)}`, {
|
||||
method: "PUT",
|
||||
body: JSON.stringify(payload),
|
||||
})
|
||||
);
|
||||
|
||||
const updatedConfig = toRecord(updatedCombo.config);
|
||||
const resolvedAutoStrategy =
|
||||
toString(toRecord(updatedConfig.auto).routingStrategy) ||
|
||||
(args.strategy === "auto" ? (args.autoRoutingStrategy ?? "rules") : "");
|
||||
|
||||
const result = {
|
||||
success: true,
|
||||
combo: {
|
||||
id: toString(updatedCombo.id, comboId),
|
||||
name: toString(updatedCombo.name, toString(combo.name, comboId)),
|
||||
strategy: toString(updatedCombo.strategy, args.strategy),
|
||||
autoRoutingStrategy:
|
||||
toString(updatedCombo.strategy, args.strategy) === "auto" ? resolvedAutoStrategy : null,
|
||||
},
|
||||
};
|
||||
|
||||
await logToolCall("omniroute_set_routing_strategy", args, result, Date.now() - start, true);
|
||||
return { content: [{ type: "text" as const, text: JSON.stringify(result, null, 2) }] };
|
||||
} catch (err) {
|
||||
const msg = err instanceof Error ? err.message : String(err);
|
||||
await logToolCall("omniroute_set_routing_strategy", args, null, Date.now() - start, false, msg);
|
||||
return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
|
||||
}
|
||||
}
|
||||
|
||||
export async function handleSetResilienceProfile(args: {
|
||||
profile: "aggressive" | "balanced" | "conservative";
|
||||
}) {
|
||||
|
||||
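The payload that `handleSetRoutingStrategy` sends to the combo endpoint can be isolated as a pure function. This is a sketch under assumptions: `toRecord` here is a guess at the project helper of the same name (its body is not shown in the diff), `buildStrategyPayload` is a hypothetical name, and the config fields in the usage note are invented for illustration.

```typescript
type JsonRecord = Record<string, unknown>;

// Assumed behavior of the project's toRecord helper: non-objects become {}.
function toRecord(value: unknown): JsonRecord {
  return value !== null && typeof value === "object" && !Array.isArray(value)
    ? (value as JsonRecord)
    : {};
}

// Builds the update body: config is included only when switching the auto sub-strategy.
function buildStrategyPayload(
  currentConfig: JsonRecord,
  strategy: string,
  autoRoutingStrategy?: string
): JsonRecord {
  const payload: JsonRecord = { strategy };
  if (strategy === "auto" && autoRoutingStrategy) {
    payload["config"] = {
      ...currentConfig,
      auto: { ...toRecord(currentConfig["auto"]), routingStrategy: autoRoutingStrategy },
    };
  }
  return payload;
}
```

With a current config of `{ retries: 2, auto: { explorationRate: 0.1 } }`, requesting `auto` with `latency` keeps both existing fields and adds `routingStrategy: "latency"` under `auto`; for any non-auto strategy the payload carries only `strategy`, so the stored config is left untouched.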
@@ -0,0 +1,159 @@
|
||||
/**
|
||||
* RouterStrategy — Pluggable Routing Strategy System
|
||||
*
|
||||
* Inspired by ClawRouter commit 14c83c258 "refactor: extract routing into pluggable RouterStrategy system".
|
||||
 * Provides a RouterStrategy interface and three built-in implementations:
 * - RulesStrategy (default): wraps the existing 6-factor scoring engine
 * - CostStrategy: always picks cheapest available model
 * - LatencyStrategy: picks the lowest p95 latency, penalized by error rate
|
||||
*/
|
||||
|
||||
import type { ProviderCandidate, ScoredProvider } from "./scoring.js";
|
||||
import { scorePool } from "./scoring.js";
|
||||
import { getTaskFitness } from "./taskFitness.js";
|
||||
|
||||
export interface RoutingContext {
|
||||
taskType: string;
|
||||
requestHasTools?: boolean;
|
||||
requestHasVision?: boolean;
|
||||
estimatedInputTokens?: number;
|
||||
}
|
||||
|
||||
export interface RoutingDecision {
|
||||
provider: string;
|
||||
model: string;
|
||||
strategy: string;
|
||||
reason: string;
|
||||
candidatesConsidered: number;
|
||||
finalScore: number;
|
||||
}
|
||||
|
||||
export interface RouterStrategy {
|
||||
readonly name: string;
|
||||
readonly description: string;
|
||||
select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision;
|
||||
}
|
||||
|
||||
// ── RulesStrategy: wraps 6-factor scoring engine ────────────────────────────
|
||||
|
||||
class RulesStrategyImpl implements RouterStrategy {
|
||||
readonly name = "rules";
|
||||
readonly description =
|
||||
"6-factor weighted scoring: quota, health, cost, latency, taskFit, stability";
|
||||
|
||||
select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
|
||||
const eligible = pool.filter((c) => c.circuitBreakerState !== "OPEN");
|
||||
const ranked: ScoredProvider[] = scorePool(
|
||||
eligible.length > 0 ? eligible : pool,
|
||||
context.taskType,
|
||||
undefined,
|
||||
getTaskFitness
|
||||
);
|
||||
const best = ranked[0];
|
||||
if (!best) throw new Error("[RulesStrategy] No candidates to score");
|
||||
return {
|
||||
provider: best.provider,
|
||||
model: best.model,
|
||||
strategy: this.name,
|
||||
reason: `RulesStrategy: score=${best.score.toFixed(3)} (quota=${best.factors.quota.toFixed(2)}, health=${best.factors.health.toFixed(2)}, cost=${best.factors.costInv.toFixed(2)}, taskFit=${best.factors.taskFit.toFixed(2)})`,
|
||||
candidatesConsidered: ranked.length,
|
||||
finalScore: best.score,
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
// ── CostStrategy: always picks cheapest healthy provider ─────────────────────
|
||||
|
||||
class CostStrategyImpl implements RouterStrategy {
|
||||
readonly name = "cost";
|
||||
readonly description = "Always selects cheapest available provider (by costPer1MTokens)";
|
||||
|
||||
select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
|
||||
const healthy = pool.filter((c) => c.circuitBreakerState !== "OPEN");
|
||||
const candidates = healthy.length > 0 ? healthy : pool;
|
||||
const sorted = [...candidates].sort((a, b) => a.costPer1MTokens - b.costPer1MTokens);
|
||||
const best = sorted[0];
|
||||
if (!best) throw new Error("[CostStrategy] No candidates available");
|
||||
return {
|
||||
provider: best.provider,
|
||||
model: best.model,
|
||||
strategy: this.name,
|
||||
reason: `CostStrategy: cheapest at $${best.costPer1MTokens.toFixed(3)}/1M tokens`,
|
||||
candidatesConsidered: candidates.length,
|
||||
finalScore: best.costPer1MTokens === 0 ? 1.0 : 1 / best.costPer1MTokens,
|
||||
};
|
||||
}
|
||||
}

// ── LatencyStrategy: prioritize low latency + reliability ───────────────────

class LatencyStrategyImpl implements RouterStrategy {
  readonly name = "latency";
  readonly description = "Prioritizes lowest p95 latency with reliability weighting";

  select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
    const healthy = pool.filter((c) => c.circuitBreakerState !== "OPEN");
    const candidates = healthy.length > 0 ? healthy : pool;
    const sorted = [...candidates].sort((a, b) => {
      const aPenalty = a.errorRate * 1000;
      const bPenalty = b.errorRate * 1000;
      return a.p95LatencyMs + aPenalty - (b.p95LatencyMs + bPenalty);
    });
    const best = sorted[0];
    if (!best) throw new Error("[LatencyStrategy] No candidates available");

    const latencyScore = best.p95LatencyMs > 0 ? Math.max(0.001, 10_000 / best.p95LatencyMs) : 1;
    const reliability = Math.max(0, 1 - best.errorRate);
    const finalScore = latencyScore * 0.7 + reliability * 0.3;

    return {
      provider: best.provider,
      model: best.model,
      strategy: this.name,
      reason: `LatencyStrategy: p95=${best.p95LatencyMs}ms, errorRate=${(best.errorRate * 100).toFixed(2)}%`,
      candidatesConsidered: candidates.length,
      finalScore,
    };
  }
}

// ── Registry ──────────────────────────────────────────────────────────────────

const strategyRegistry = new Map<string, RouterStrategy>();

const rulesStrategy = new RulesStrategyImpl();
const costStrategy = new CostStrategyImpl();
const latencyStrategy = new LatencyStrategyImpl();

strategyRegistry.set("rules", rulesStrategy);
strategyRegistry.set("cost", costStrategy);
strategyRegistry.set("eco", costStrategy); // alias
strategyRegistry.set("latency", latencyStrategy);
strategyRegistry.set("fast", latencyStrategy); // alias

export function getStrategy(name: string): RouterStrategy {
  const strategy = strategyRegistry.get(name);
  if (!strategy) {
    console.warn(`[RouterStrategy] Strategy '${name}' not found, falling back to 'rules'`);
    return rulesStrategy;
  }
  return strategy;
}

export function registerStrategy(name: string, strategy: RouterStrategy): void {
  if (strategyRegistry.has(name)) {
    console.warn(`[RouterStrategy] Overwriting strategy '${name}'`);
  }
  strategyRegistry.set(name, strategy);
}

export function listStrategies(): Array<{ name: string; description: string }> {
  return [...strategyRegistry.entries()].map(([name, s]) => ({ name, description: s.description }));
}

export function selectWithStrategy(
  pool: ProviderCandidate[],
  context: RoutingContext,
  strategyName = "rules"
): RoutingDecision {
  return getStrategy(strategyName).select(pool, context);
}
|
||||
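The cost and latency strategies above share one pattern: drop candidates whose circuit breaker is `OPEN` (unless that empties the pool), then sort by a single key. The following is a minimal sketch of that pattern, not the project's implementation. `SketchCandidate`, `healthyOrAll`, `pickCheapest`, `pickFastest`, and the sample pool are hypothetical names and data introduced for illustration; the real `ProviderCandidate` carries more fields.

```typescript
// Hypothetical, trimmed-down candidate shape (assumption, not the real ProviderCandidate).
type BreakerState = "CLOSED" | "HALF_OPEN" | "OPEN";

interface SketchCandidate {
  provider: string;
  model: string;
  circuitBreakerState: BreakerState;
  costPer1MTokens: number;
  p95LatencyMs: number;
  errorRate: number; // fraction in [0, 1]
}

// Drop candidates whose breaker is OPEN, unless that would leave nothing to pick from.
function healthyOrAll(pool: SketchCandidate[]): SketchCandidate[] {
  const healthy = pool.filter((c) => c.circuitBreakerState !== "OPEN");
  return healthy.length > 0 ? healthy : pool;
}

// Cost ordering: cheapest candidate first.
function pickCheapest(pool: SketchCandidate[]): SketchCandidate {
  const sorted = [...healthyOrAll(pool)].sort((a, b) => a.costPer1MTokens - b.costPer1MTokens);
  const best = sorted[0];
  if (!best) throw new Error("no candidates available");
  return best;
}

// Latency ordering: p95 latency plus a penalty of 1000 ms per unit of error rate.
function pickFastest(pool: SketchCandidate[]): SketchCandidate {
  const effective = (c: SketchCandidate): number => c.p95LatencyMs + c.errorRate * 1000;
  const sorted = [...healthyOrAll(pool)].sort((a, b) => effective(a) - effective(b));
  const best = sorted[0];
  if (!best) throw new Error("no candidates available");
  return best;
}

// Illustrative data only.
const samplePool: SketchCandidate[] = [
  { provider: "alpha", model: "a-1", circuitBreakerState: "OPEN", costPer1MTokens: 0.1, p95LatencyMs: 500, errorRate: 0 },
  { provider: "beta", model: "b-1", circuitBreakerState: "CLOSED", costPer1MTokens: 0.5, p95LatencyMs: 1200, errorRate: 0.4 },
  { provider: "gamma", model: "g-1", circuitBreakerState: "CLOSED", costPer1MTokens: 2.0, p95LatencyMs: 1500, errorRate: 0.01 },
];

console.log(pickCheapest(samplePool).provider, pickFastest(samplePool).provider);
```

In this sample, `alpha` is skipped while its breaker is open, `beta` wins on cost, and `gamma` wins on latency because `beta`'s 40% error rate adds 400 ms to its effective latency. If every breaker were open, both helpers would fall back to the full pool, as the strategies in the diff do.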
@@ -74,7 +74,8 @@ export function calculateScore(factors: ScoringFactors, weights: ScoringWeights)
|
||||
weights.costInv * factors.costInv +
|
||||
weights.latencyInv * factors.latencyInv +
|
||||
weights.taskFit * factors.taskFit +
|
||||
- weights.stability * factors.stability
+ weights.stability * factors.stability +
+ weights.tierPriority * factors.tierPriority
|
||||
);
|
||||
}
|
||||
|
||||
|
||||
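The hunk above extends the weighted sum in `calculateScore` with a `tierPriority` term. The sketch below shows the shape of such a sum. Only `costInv`, `latencyInv`, `taskFit`, `stability`, and `tierPriority` are visible in the hunk; `quota` and `health` are assumed from the reason string of the rules strategy. The weight values and the names `SketchFactors`, `SketchWeights`, and `weightedScore` are hypothetical, not the project's `DEFAULT_WEIGHTS`.

```typescript
// Hypothetical factor shape: each factor is assumed to be normalized to [0, 1].
interface SketchFactors {
  quota: number;
  health: number;
  costInv: number;
  latencyInv: number;
  taskFit: number;
  stability: number;
  tierPriority: number;
}

// One weight per factor (assumption: weights sum to 1 so the score stays in [0, 1]).
type SketchWeights = SketchFactors;

// Plain weighted sum over all seven factors.
function weightedScore(factors: SketchFactors, weights: SketchWeights): number {
  return (
    weights.quota * factors.quota +
    weights.health * factors.health +
    weights.costInv * factors.costInv +
    weights.latencyInv * factors.latencyInv +
    weights.taskFit * factors.taskFit +
    weights.stability * factors.stability +
    weights.tierPriority * factors.tierPriority
  );
}

// Illustrative weights only.
const sketchWeights: SketchWeights = {
  quota: 0.2,
  health: 0.25,
  costInv: 0.2,
  latencyInv: 0.15,
  taskFit: 0.1,
  stability: 0.05,
  tierPriority: 0.05,
};

const baseFactors: SketchFactors = {
  quota: 1,
  health: 1,
  costInv: 0.5,
  latencyInv: 0.5,
  taskFit: 0.8,
  stability: 0.9,
  tierPriority: 0,
};
const premiumFactors: SketchFactors = { ...baseFactors, tierPriority: 1 };

console.log(weightedScore(baseFactors, sketchWeights).toFixed(3));
console.log(weightedScore(premiumFactors, sketchWeights).toFixed(3));
```

With these sample weights, two otherwise identical candidates differ by exactly the `tierPriority` weight, so the new term acts as a tie-breaker rather than a dominant factor.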
@@ -24,10 +24,23 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
|
||||
"deepseek-coder": 0.9,
|
||||
"deepseek-v3": 0.85,
|
||||
"deepseek-r1": 0.88,
|
||||
"deepseek-chat": 0.84, // DeepSeek V3.2 Chat — strong code performance
|
||||
"deepseek-v3.2": 0.86, // Explicit V3.2 alias
|
||||
qwen: 0.78,
|
||||
llama: 0.72,
|
||||
mistral: 0.75,
|
||||
mixtral: 0.77,
|
||||
// Grok-4 fast — good code, ultra-low latency (1143ms P50)
|
||||
"grok-4-fast": 0.8,
|
||||
"grok-4": 0.82,
|
||||
"grok-3": 0.8,
|
||||
// Kimi K2.5 — agentic with tool calling, good at code tasks
|
||||
"kimi-k2": 0.82,
|
||||
// GLM-5 — Z.AI model with 128k output
|
||||
"glm-5": 0.78,
|
||||
// MiniMax M2.5 — reasoning support helps complex code
|
||||
"minimax-m2.5": 0.75,
|
||||
"minimax-m2": 0.72,
|
||||
},
|
||||
review: {
|
||||
"claude-sonnet": 0.92,
|
||||
@@ -58,10 +71,15 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
|
||||
"claude-sonnet": 0.92,
|
||||
"gemini-2.5-pro": 0.95,
|
||||
"gemini-pro": 0.88,
|
||||
"gemini-3.1-pro": 0.95, // Gemini 3.1 Pro — 1M context, ideal for long analysis
|
||||
"gpt-4o": 0.85,
|
||||
o1: 0.9,
|
||||
o3: 0.93,
|
||||
"deepseek-r1": 0.88,
|
||||
"deepseek-chat": 0.8,
|
||||
"kimi-k2": 0.82, // Kimi K2.5 agentic — good for analysis
|
||||
"glm-5": 0.78, // GLM-5 with 128k output for long analysis
|
||||
"minimax-m2.5": 0.76,
|
||||
},
|
||||
debugging: {
|
||||
"claude-sonnet": 0.93,
|
||||
@@ -87,8 +105,17 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
|
||||
"claude-opus": 0.85,
|
||||
"gpt-4o": 0.85,
|
||||
"gemini-pro": 0.8,
|
||||
"gemini-3.1-pro": 0.85,
|
||||
"deepseek-v3": 0.75,
|
||||
"deepseek-chat": 0.74,
|
||||
"gemini-flash": 0.72,
|
||||
// New models from ClawRouter analysis (2026-03-17):
|
||||
"grok-4-fast": 0.72, // ultra-fast, suitable for all tasks
|
||||
"grok-4": 0.74,
|
||||
"grok-3": 0.73,
|
||||
"kimi-k2": 0.76, // agentic multi-step tasks
|
||||
"glm-5": 0.7,
|
||||
"minimax-m2.5": 0.7,
|
||||
},
|
||||
};
|
||||
|
||||
|
||||
+335
-4
@@ -5,18 +5,36 @@
|
||||
|
||||
import { checkFallbackError, formatRetryAfter, getProviderProfile } from "./accountFallback.ts";
|
||||
import { unavailableResponse } from "../utils/error.ts";
|
||||
- import { recordComboRequest, getComboMetrics } from "./comboMetrics.ts";
+ import { recordComboIntent, recordComboRequest, getComboMetrics } from "./comboMetrics.ts";
|
||||
import { resolveComboConfig, getDefaultComboConfig } from "./comboConfig.ts";
|
||||
import * as semaphore from "./rateLimitSemaphore.ts";
|
||||
import { getCircuitBreaker } from "../../src/shared/utils/circuitBreaker";
|
||||
import { fisherYatesShuffle, getNextFromDeck } from "../../src/shared/utils/shuffleDeck";
|
||||
import { parseModel } from "./model.ts";
|
||||
import { applyComboAgentMiddleware, injectModelTag } from "./comboAgentMiddleware.ts";
|
||||
import { classifyWithConfig, DEFAULT_INTENT_CONFIG } from "./intentClassifier.ts";
|
||||
import { selectProvider as selectAutoProvider } from "./autoCombo/engine.ts";
|
||||
import { selectWithStrategy } from "./autoCombo/routerStrategy.ts";
|
||||
import { DEFAULT_WEIGHTS, scorePool } from "./autoCombo/scoring.ts";
|
||||
import { supportsToolCalling } from "./modelCapabilities.ts";
|
||||
|
||||
// Status codes that should mark semaphore + record circuit breaker failures
|
||||
const TRANSIENT_FOR_BREAKER = [429, 502, 503, 504];
|
||||
|
||||
const MAX_COMBO_DEPTH = 3;
|
||||
|
||||
// Bootstrap defaults from ClawRouter benchmark (used when no local latency history exists yet)
|
||||
const DEFAULT_MODEL_P95_MS = {
|
||||
"grok-4-fast-non-reasoning": 1143,
|
||||
"grok-4-1-fast-non-reasoning": 1244,
|
||||
"gemini-2.5-flash": 1238,
|
||||
"kimi-k2.5": 1646,
|
||||
"gpt-4o-mini": 2764,
|
||||
"claude-sonnet-4.6": 4000,
|
||||
"claude-opus-4.6": 6000,
|
||||
"deepseek-chat": 2000,
|
||||
};
|
||||
|
||||
// In-memory atomic counter per combo for round-robin distribution
|
||||
// Resets on server restart (by design — no stale state)
|
||||
const rrCounters = new Map();
|
||||
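The bootstrap table above is only useful together with a fallback rule: use observed latency when there is any, otherwise the benchmark default, otherwise a flat constant. The sketch below illustrates that lookup order. `SKETCH_P95_MS` (a subset of the table), `bootstrapLatencyMs`, and `effectiveLatencyMs` are hypothetical names; the 1500 ms default mirrors the value used later in this diff.

```typescript
// Subset of the bootstrap table, copied for illustration.
const SKETCH_P95_MS: Record<string, number> = {
  "gemini-2.5-flash": 1238,
  "kimi-k2.5": 1646,
  "deepseek-chat": 2000,
};

// Exact-key lookup after lowercasing; unknown models get a flat default.
function bootstrapLatencyMs(modelId: unknown, fallbackMs = 1500): number {
  const key = String(modelId || "").toLowerCase();
  return SKETCH_P95_MS[key] ?? fallbackMs;
}

// Prefer an observed positive latency; otherwise fall back to the bootstrap value.
function effectiveLatencyMs(observedAvgMs: unknown, modelId: string): number {
  const observed = Number(observedAvgMs);
  return Number.isFinite(observed) && observed > 0 ? observed : bootstrapLatencyMs(modelId);
}

console.log(effectiveLatencyMs(undefined, "deepseek-chat"));
console.log(effectiveLatencyMs(820, "deepseek-chat"));
console.log(bootstrapLatencyMs("some-unlisted-model"));
```

Because the lookup is an exact key match, a model ID that differs from a table key by a suffix or version (for example a shortened alias) falls through to the flat default rather than to the nearest entry.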
@@ -201,6 +219,158 @@ function sortModelsByUsage(models, comboName) {
|
||||
return withUsage.map((e) => e.modelStr);
|
||||
}
|
||||
|
||||
function toTextContent(content) {
|
||||
if (typeof content === "string") return content;
|
||||
if (!Array.isArray(content)) return "";
|
||||
return content
|
||||
.map((part) => {
|
||||
if (!part || typeof part !== "object") return "";
|
||||
if (typeof part.text === "string") return part.text;
|
||||
return "";
|
||||
})
|
||||
.join("\n");
|
||||
}
|
||||
|
||||
function extractPromptForIntent(body) {
|
||||
if (!body || typeof body !== "object") return "";
|
||||
|
||||
const fromMessages = Array.isArray(body.messages)
|
||||
? [...body.messages].reverse().find((m) => m && typeof m === "object" && m.role === "user")
|
||||
: null;
|
||||
if (fromMessages) return toTextContent(fromMessages.content);
|
||||
|
||||
if (typeof body.input === "string") return body.input;
|
||||
if (Array.isArray(body.input)) {
|
||||
const text = body.input
|
||||
.map((item) => {
|
||||
if (!item || typeof item !== "object") return "";
|
||||
if (typeof item.content === "string") return item.content;
|
||||
if (typeof item.text === "string") return item.text;
|
||||
return "";
|
||||
})
|
||||
.filter(Boolean)
|
||||
.join("\n");
|
||||
if (text) return text;
|
||||
}
|
||||
|
||||
if (typeof body.prompt === "string") return body.prompt;
|
||||
return "";
|
||||
}
|
||||
|
||||
function mapIntentToTaskType(intent) {
|
||||
switch (intent) {
|
||||
case "code":
|
||||
return "coding";
|
||||
case "reasoning":
|
||||
return "analysis";
|
||||
case "simple":
|
||||
return "default";
|
||||
case "medium":
|
||||
default:
|
||||
return "default";
|
||||
}
|
||||
}
|
||||
|
||||
function toStringArray(input) {
|
||||
if (Array.isArray(input)) {
|
||||
return input.map((v) => (typeof v === "string" ? v.trim() : "")).filter(Boolean);
|
||||
}
|
||||
if (typeof input === "string") {
|
||||
return input
|
||||
.split(",")
|
||||
.map((v) => v.trim())
|
||||
.filter(Boolean);
|
||||
}
|
||||
return [];
|
||||
}
|
||||
|
||||
function getIntentConfig(settings, combo) {
|
||||
const comboIntentConfig =
|
||||
combo?.autoConfig?.intentConfig ||
|
||||
combo?.config?.auto?.intentConfig ||
|
||||
combo?.config?.intentConfig ||
|
||||
{};
|
||||
|
||||
return {
|
||||
...DEFAULT_INTENT_CONFIG,
|
||||
...comboIntentConfig,
|
||||
...(typeof settings?.intentDetectionEnabled === "boolean"
|
||||
? { enabled: settings.intentDetectionEnabled }
|
||||
: {}),
|
||||
...(Number.isFinite(Number(settings?.intentSimpleMaxWords))
|
||||
? { simpleMaxWords: Number(settings.intentSimpleMaxWords) }
|
||||
: {}),
|
||||
...(toStringArray(settings?.intentExtraCodeKeywords).length > 0
|
||||
? { extraCodeKeywords: toStringArray(settings.intentExtraCodeKeywords) }
|
||||
: {}),
|
||||
...(toStringArray(settings?.intentExtraReasoningKeywords).length > 0
|
||||
? { extraReasoningKeywords: toStringArray(settings.intentExtraReasoningKeywords) }
|
||||
: {}),
|
||||
...(toStringArray(settings?.intentExtraSimpleKeywords).length > 0
|
||||
? { extraSimpleKeywords: toStringArray(settings.intentExtraSimpleKeywords) }
|
||||
: {}),
|
||||
};
|
||||
}
|
||||
|
||||
function getBootstrapLatencyMs(modelId) {
|
||||
const normalized = String(modelId || "").toLowerCase();
|
||||
return DEFAULT_MODEL_P95_MS[normalized] ?? 1500;
|
||||
}
|
||||
|
||||
async function buildAutoCandidates(modelStrings, comboName) {
|
||||
const metrics = getComboMetrics(comboName);
|
||||
const { getPricingForModel } = await import("../../src/lib/localDb");
|
||||
|
||||
const candidates = await Promise.all(
|
||||
modelStrings.map(async (modelStr) => {
|
||||
const parsed = parseModel(modelStr);
|
||||
const provider = parsed.provider || parsed.providerAlias || "unknown";
|
||||
const model = parsed.model || modelStr;
|
||||
|
||||
let costPer1MTokens = 1;
|
||||
try {
|
||||
const pricing = await getPricingForModel(provider, model);
|
||||
const inputPrice = Number(pricing?.input);
|
||||
if (Number.isFinite(inputPrice) && inputPrice >= 0) {
|
||||
costPer1MTokens = inputPrice;
|
||||
}
|
||||
} catch {
|
||||
// keep default cost
|
||||
}
|
||||
|
||||
const modelMetric = metrics?.byModel?.[modelStr] || null;
|
||||
const avgLatency = Number(modelMetric?.avgLatencyMs);
|
||||
const successRate = Number(modelMetric?.successRate);
|
||||
const p95LatencyMs =
|
||||
Number.isFinite(avgLatency) && avgLatency > 0 ? avgLatency : getBootstrapLatencyMs(model);
|
||||
const errorRate =
|
||||
Number.isFinite(successRate) && successRate >= 0 && successRate <= 100
|
||||
? 1 - successRate / 100
|
||||
: 0.05;
|
||||
|
||||
const breakerStateRaw = getCircuitBreaker(`combo:${modelStr}`)?.getStatus?.()?.state;
|
||||
const circuitBreakerState =
|
||||
breakerStateRaw === "OPEN" || breakerStateRaw === "HALF_OPEN" ? breakerStateRaw : "CLOSED";
|
||||
|
||||
return {
|
||||
provider,
|
||||
model,
|
||||
quotaRemaining: 100,
|
||||
quotaTotal: 100,
|
||||
circuitBreakerState,
|
||||
costPer1MTokens,
|
||||
p95LatencyMs,
|
||||
latencyStdDev: Math.max(10, p95LatencyMs * 0.1),
|
||||
errorRate,
|
||||
accountTier: "standard",
|
||||
quotaResetIntervalSecs: 86400,
|
||||
};
|
||||
})
|
||||
);
|
||||
|
||||
return candidates;
|
||||
}
|
||||
|
||||
/**
|
||||
* Handle combo chat with fallback
|
||||
* Supports all 6 strategies: priority, weighted, round-robin, random, least-used, cost-optimized
|
||||
@@ -225,12 +395,49 @@ export async function handleComboChat({
|
||||
const strategy = combo.strategy || "priority";
|
||||
const models = combo.models || [];
|
||||
|
||||
// ── Combo Agent Middleware (#399 + #401) ────────────────────────────────
|
||||
// Apply system_message override, tool_filter_regex, and extract pinned model
|
||||
// from context caching tag. These are all opt-in per combo config.
|
||||
const { body: agentBody, pinnedModel } = applyComboAgentMiddleware(
|
||||
body,
|
||||
combo,
|
||||
"" // provider/model not yet known — resolved per-model in loop
|
||||
);
|
||||
body = agentBody;
|
||||
if (pinnedModel) {
|
||||
log.info("COMBO", `[#401] Context caching: pinned model=${pinnedModel}`);
|
||||
}
|
||||
// Wrap handleSingleModel to inject context caching tag on response (#401)
|
||||
const handleSingleModelWrapped = combo.context_cache_protection
|
||||
? async (b, modelStr) => {
|
||||
const res = await handleSingleModel(b, modelStr);
|
||||
// Inject tag only on success and only for non-streaming non-binary responses
|
||||
if (res.ok && !b.stream) {
|
||||
try {
|
||||
const json = await res.clone().json();
|
||||
const msgs = Array.isArray(json?.messages) ? json.messages : [];
|
||||
if (msgs.length > 0) {
|
||||
const tagged = injectModelTag(msgs, modelStr);
|
||||
return new Response(JSON.stringify({ ...json, messages: tagged }), {
|
||||
status: res.status,
|
||||
headers: res.headers,
|
||||
});
|
||||
}
|
||||
} catch {
|
||||
/* non-JSON or stream — skip tagging */
|
||||
}
|
||||
}
|
||||
return res;
|
||||
}
|
||||
: handleSingleModel;
|
||||
// ─────────────────────────────────────────────────────────────────────────
|
||||
|
||||
// Route to round-robin handler if strategy matches
|
||||
if (strategy === "round-robin") {
|
||||
return handleRoundRobinCombo({
|
||||
body,
|
||||
combo,
|
||||
- handleSingleModel,
+ handleSingleModel: handleSingleModelWrapped,
|
||||
isModelAvailable,
|
||||
log,
|
||||
settings,
|
||||
@@ -278,7 +485,131 @@ export async function handleComboChat({
|
||||
}
|
||||
|
||||
// Apply strategy-specific ordering
|
||||
- if (strategy === "strict-random") {
+ if (strategy === "auto") {
|
||||
const requestHasTools = Array.isArray(body?.tools) && body.tools.length > 0;
|
||||
let eligibleModels = [...orderedModels];
|
||||
|
||||
if (requestHasTools) {
|
||||
const filtered = eligibleModels.filter((m) => supportsToolCalling(m));
|
||||
if (filtered.length > 0) {
|
||||
eligibleModels = filtered;
|
||||
} else {
|
||||
log.warn(
|
||||
"COMBO",
|
||||
"Auto strategy: all candidates filtered by tool-calling policy, falling back to full pool"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
const prompt = extractPromptForIntent(body);
|
||||
const systemPrompt =
|
||||
typeof combo?.system_message === "string" ? combo.system_message : undefined;
|
||||
const intentConfig = getIntentConfig(settings, combo);
|
||||
const intent = classifyWithConfig(prompt, intentConfig, systemPrompt);
|
||||
recordComboIntent(combo.name, intent);
|
||||
const taskType = mapIntentToTaskType(intent);
|
||||
|
||||
const autoConfigSource = combo?.autoConfig || combo?.config?.auto || combo?.config || {};
|
||||
const routingStrategy =
|
||||
typeof autoConfigSource.routingStrategy === "string"
|
||||
? autoConfigSource.routingStrategy
|
||||
: typeof autoConfigSource.strategyName === "string"
|
||||
? autoConfigSource.strategyName
|
||||
: "rules";
|
||||
|
||||
const candidatePool = Array.isArray(autoConfigSource.candidatePool)
|
||||
? autoConfigSource.candidatePool
|
||||
: [
|
||||
...new Set(
|
||||
eligibleModels.map((m) => {
|
||||
const parsed = parseModel(m);
|
||||
return parsed.provider || parsed.providerAlias || "unknown";
|
||||
})
|
||||
),
|
||||
];
|
||||
|
||||
const weights =
|
||||
autoConfigSource.weights && typeof autoConfigSource.weights === "object"
|
||||
? autoConfigSource.weights
|
||||
: DEFAULT_WEIGHTS;
|
||||
const explorationRate = Number.isFinite(Number(autoConfigSource.explorationRate))
|
||||
? Number(autoConfigSource.explorationRate)
|
||||
: 0.05;
|
||||
const budgetCap = Number.isFinite(Number(autoConfigSource.budgetCap))
|
||||
? Number(autoConfigSource.budgetCap)
|
||||
: undefined;
|
||||
const modePack =
|
||||
typeof autoConfigSource.modePack === "string" ? autoConfigSource.modePack : undefined;
|
||||
|
||||
const candidates = await buildAutoCandidates(eligibleModels, combo.name);
|
||||
if (candidates.length > 0) {
|
||||
let selectedProvider = null;
|
||||
let selectedModel = null;
|
||||
let selectionReason = "";
|
||||
|
||||
if (routingStrategy !== "rules") {
|
||||
try {
|
||||
const decision = selectWithStrategy(
|
||||
candidates,
|
||||
{ taskType, requestHasTools },
|
||||
routingStrategy
|
||||
);
|
||||
selectedProvider = decision.provider;
|
||||
selectedModel = decision.model;
|
||||
selectionReason = decision.reason;
|
||||
} catch (err) {
|
||||
log.warn(
|
||||
"COMBO",
|
||||
`Auto strategy '${routingStrategy}' failed (${err?.message || "unknown"}), falling back to rules`
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
if (!selectedProvider || !selectedModel) {
|
||||
const selection = selectAutoProvider(
|
||||
{
|
||||
id: combo.id || combo.name,
|
||||
name: combo.name,
|
||||
type: "auto",
|
||||
candidatePool,
|
||||
weights,
|
||||
modePack,
|
||||
budgetCap,
|
||||
explorationRate,
|
||||
},
|
||||
candidates,
|
||||
taskType
|
||||
);
|
||||
selectedProvider = selection.provider;
|
||||
selectedModel = selection.model;
|
||||
selectionReason = `score=${selection.score.toFixed(3)}${selection.isExploration ? " (exploration)" : ""}`;
|
||||
}
|
||||
|
||||
const modelLookup = new Map();
|
||||
for (const modelStr of eligibleModels) {
|
||||
const parsed = parseModel(modelStr);
|
||||
const provider = parsed.provider || parsed.providerAlias || "unknown";
|
||||
const modelId = parsed.model || modelStr;
|
||||
modelLookup.set(`${provider}/${modelId}`, modelStr);
|
||||
}
|
||||
|
||||
const ranked = scorePool(candidates, taskType, weights)
|
||||
.map((r) => modelLookup.get(`${r.provider}/${r.model}`) || `${r.provider}/${r.model}`)
|
||||
.filter(Boolean);
|
||||
|
||||
const selectedModelStr =
|
||||
modelLookup.get(`${selectedProvider}/${selectedModel}`) ||
|
||||
`${selectedProvider}/${selectedModel}`;
|
||||
orderedModels = [...new Set([selectedModelStr, ...ranked, ...eligibleModels])];
|
||||
|
||||
log.info(
|
||||
"COMBO",
|
||||
`Auto selection: ${selectedModelStr} | intent=${intent} task=${taskType} | strategy=${routingStrategy} | ${selectionReason}`
|
||||
);
|
||||
} else {
|
||||
log.warn("COMBO", "Auto strategy has no candidates, keeping default ordering");
|
||||
}
|
||||
} else if (strategy === "strict-random") {
|
||||
const selectedId = await getNextFromDeck(`combo:${combo.name}`, orderedModels);
|
||||
// Put selected model first so the fallback loop tries it first
|
||||
const rest = orderedModels.filter((m) => m !== selectedId);
|
||||
@@ -348,7 +679,7 @@ export async function handleComboChat({
|
||||
`Trying model ${i + 1}/${orderedModels.length}: ${modelStr}${retry > 0 ? ` (retry ${retry})` : ""}`
|
||||
);
|
||||
|
||||
- const result = await handleSingleModel(body, modelStr);
+ const result = await handleSingleModelWrapped(body, modelStr);
|
||||
|
||||
// Success — return response
|
||||
if (result.ok) {
|
||||
|
||||
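In the `auto` branch above, the fallback order is built by putting the selected model first, then the score-ranked models, then any remaining eligible models, with duplicates removed. The sketch below isolates that step. `buildFallbackOrder` is a hypothetical helper name, and the model strings are made up for illustration.

```typescript
// Selected model first, then ranked models, then the rest of the eligible pool.
// A Set keeps the first occurrence of each entry and preserves insertion order,
// so later duplicates are dropped without reordering.
function buildFallbackOrder(selected: string, ranked: string[], eligible: string[]): string[] {
  return Array.from(new Set([selected, ...ranked, ...eligible]));
}

const fallbackOrder = buildFallbackOrder(
  "b/model-2",
  ["a/model-1", "b/model-2", "c/model-3"],
  ["c/model-3", "a/model-1", "b/model-2", "d/model-4"]
);

console.log(fallbackOrder);
```

One consequence of this ordering is that a model chosen through exploration still leads the list, while the rest of the list follows the deterministic score ranking.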
@@ -0,0 +1,169 @@
|
||||
/**
|
||||
* comboAgentMiddleware.ts — Combo Agent Features
|
||||
*
|
||||
* Implements the "combo as agent" features from issues #399 and #401:
|
||||
*
|
||||
* 1. **System Message Override** (#399): If the combo defines a `system_message`,
|
||||
* it is injected as the first system message, replacing any existing system message.
|
||||
*
|
||||
* 2. **Tool Filter Regex** (#399): If the combo defines a `tool_filter_regex`,
|
||||
* only tools whose name matches the pattern are forwarded to the provider.
|
||||
*
|
||||
* 3. **Context Caching Protection** (#401): If the combo enables
|
||||
* `context_cache_protection`, the proxy:
|
||||
* a. On response: injects an `<omniModel>provider/model</omniModel>` tag into
* the last assistant message, provided its content is a string.
|
||||
* b. On request: scans the message history for the tag, and if found,
|
||||
* overrides the requested model with the pinned one.
|
||||
*
|
||||
* All features are opt-in per combo and backward compatible with existing setups.
|
||||
*/
|
||||
|
||||
interface ComboConfig {
|
||||
system_message?: string | null;
|
||||
tool_filter_regex?: string | null;
|
||||
context_cache_protection?: number | boolean;
|
||||
[key: string]: unknown;
|
||||
}
|
||||
|
||||
interface Message {
|
||||
role?: string;
|
||||
content?: unknown;
|
||||
[key: string]: unknown;
|
||||
}
|
||||
|
||||
// ── Context Caching Tag ─────────────────────────────────────────────────────
|
||||
|
||||
const CACHE_TAG_PATTERN = /<omniModel>([^<]+)<\/omniModel>/;
|
||||
|
||||
/**
|
||||
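* Inject the model tag into the last assistant message. If there is no assistant message, or its content is not a string, the messages are returned with stale tags stripped and no new tag added.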
|
||||
* Only modifies string content — does not touch array content to avoid breaking
|
||||
* Claude/Gemini multi-part message formats.
|
||||
*/
|
||||
export function injectModelTag(messages: Message[], providerModel: string): Message[] {
|
||||
// Remove any existing tag first to avoid duplication on context compaction
|
||||
const cleaned = messages.map((msg) => {
|
||||
if (msg.role === "assistant" && typeof msg.content === "string") {
|
||||
return { ...msg, content: msg.content.replace(CACHE_TAG_PATTERN, "").trimEnd() };
|
||||
}
|
||||
return msg;
|
||||
});
|
||||
|
||||
// Find the last assistant message; bail out below if its content is not a string
|
||||
const lastAssistantIdx = cleaned.map((m) => m.role).lastIndexOf("assistant");
|
||||
if (lastAssistantIdx === -1) return cleaned;
|
||||
|
||||
const msg = cleaned[lastAssistantIdx];
|
||||
if (typeof msg.content !== "string") return cleaned;
|
||||
|
||||
const tagged = [...cleaned];
|
||||
tagged[lastAssistantIdx] = {
|
||||
...msg,
|
||||
content: `${msg.content}\n<omniModel>${providerModel}</omniModel>`,
|
||||
};
|
||||
return tagged;
|
||||
}
|
||||
|
||||
/**
|
||||
* Scan message history for the model tag injected by a previous response.
|
||||
* Returns the pinned "provider/model" string, or null if not found.
|
||||
*/
|
||||
export function extractPinnedModel(messages: Message[]): string | null {
|
||||
// Scan from newest to oldest for efficiency
|
||||
for (let i = messages.length - 1; i >= 0; i--) {
|
||||
const msg = messages[i];
|
||||
if (msg.role === "assistant" && typeof msg.content === "string") {
|
||||
const match = CACHE_TAG_PATTERN.exec(msg.content);
|
||||
if (match) return match[1];
|
||||
}
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
// ── System Message Override ──────────────────────────────────────────────────
|
||||
|
||||
/**
|
||||
* Replace or inject a system message at the beginning of the messages array.
|
||||
* Existing system messages are removed if a combo override is set.
|
||||
*/
|
||||
export function applySystemMessageOverride(messages: Message[], systemMessage: string): Message[] {
|
||||
// Remove all existing system messages
|
||||
const filtered = messages.filter((m) => m.role !== "system");
|
||||
// Inject combo system message at start
|
||||
return [{ role: "system", content: systemMessage }, ...filtered];
|
||||
}
|
||||
|
||||
// ── Tool Filter Regex ────────────────────────────────────────────────────────
|
||||
|
||||
/**
|
||||
* Filter the tools array, keeping only tools whose name matches the regex.
|
||||
* Returns the original array unchanged if pattern is null/empty.
|
||||
*/
|
||||
export function applyToolFilter(
|
||||
tools: unknown[] | undefined,
|
||||
pattern: string | null | undefined
|
||||
): unknown[] | undefined {
|
||||
if (!tools || !pattern) return tools;
|
||||
|
||||
let regex: RegExp;
|
||||
try {
|
||||
regex = new RegExp(pattern);
|
||||
} catch {
|
||||
// Invalid regex — return tools unchanged rather than crashing
|
||||
console.warn(`[ComboAgent] Invalid tool_filter_regex: "${pattern}"`);
|
||||
return tools;
|
||||
}
|
||||
|
||||
return tools.filter((tool) => {
|
||||
const t = tool as Record<string, unknown>;
|
||||
// Support both OpenAI format ({ function: { name } }) and Anthropic ({ name })
|
||||
const name = (t.function as Record<string, unknown> | undefined)?.name ?? t.name ?? "";
|
||||
return regex.test(String(name));
|
||||
});
|
||||
}
|
||||
|
||||
// ── Main Middleware ──────────────────────────────────────────────────────────
|
||||
|
||||
/**
|
||||
* Apply all combo agent features to the request body.
|
||||
* Safe to call with null/undefined comboConfig — returns body unchanged.
|
||||
*/
|
||||
export function applyComboAgentMiddleware(
|
||||
body: Record<string, unknown>,
|
||||
comboConfig: ComboConfig | null | undefined,
|
||||
providerModel: string // "provider/model" string for context caching
|
||||
): { body: Record<string, unknown>; pinnedModel: string | null } {
|
||||
if (!comboConfig) return { body, pinnedModel: null };
|
||||
|
||||
let messages: Message[] = Array.isArray(body.messages) ? [...body.messages] : [];
|
||||
let pinnedModel: string | null = null;
|
||||
|
||||
// 1. Context caching: check for pinned model in history
|
||||
if (comboConfig.context_cache_protection) {
|
||||
pinnedModel = extractPinnedModel(messages);
|
||||
if (pinnedModel) {
|
||||
// Model is pinned — caller should override model selection
|
||||
}
|
||||
}
|
||||
|
||||
// 2. System message override
|
||||
if (comboConfig.system_message && comboConfig.system_message.trim()) {
|
||||
messages = applySystemMessageOverride(messages, comboConfig.system_message);
|
||||
}
|
||||
|
||||
// 3. Tool filter
|
||||
const filteredTools = applyToolFilter(
|
||||
body.tools as unknown[] | undefined,
|
||||
comboConfig.tool_filter_regex
|
||||
);
|
||||
|
||||
return {
|
||||
body: {
|
||||
...body,
|
||||
messages,
|
||||
...(filteredTools !== body.tools && { tools: filteredTools }),
|
||||
},
|
||||
pinnedModel,
|
||||
};
|
||||
}
|
||||
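The context-caching feature in the middleware above relies on a round trip: a tag is written into the assistant reply, and on the next request it is read back from the history. The sketch below shows that round trip in simplified form. It is not the file's implementation: `ChatMessage`, `tagLastAssistant`, and `findPinnedModel` are hypothetical names, and unlike `injectModelTag` this version strips a stale tag only from the message it re-tags.

```typescript
// Hypothetical minimal message shape.
interface ChatMessage {
  role?: string;
  content?: unknown;
}

const SKETCH_TAG = /<omniModel>([^<]+)<\/omniModel>/;

// Append the tag to the last assistant message when its content is a string.
function tagLastAssistant(messages: ChatMessage[], providerModel: string): ChatMessage[] {
  let idx = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "assistant") {
      idx = i;
      break;
    }
  }
  if (idx === -1) return messages;

  const target = messages[idx];
  if (typeof target.content !== "string") return messages;

  // Remove a previous tag and trailing whitespace so tags do not pile up.
  const stripped = target.content.replace(SKETCH_TAG, "").replace(/\s+$/, "");
  const out = [...messages];
  out[idx] = { ...target, content: `${stripped}\n<omniModel>${providerModel}</omniModel>` };
  return out;
}

// Scan newest to oldest for a tagged assistant message.
function findPinnedModel(messages: ChatMessage[]): string | null {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.role === "assistant" && typeof msg.content === "string") {
      const match = SKETCH_TAG.exec(msg.content);
      if (match) return match[1];
    }
  }
  return null;
}

const conversation: ChatMessage[] = [
  { role: "user", content: "hi" },
  { role: "assistant", content: "hello" },
];
const taggedConversation = tagLastAssistant(conversation, "openai/gpt-4o");
console.log(findPinnedModel(taggedConversation));
```

The round trip only works if the client sends the assistant message back unmodified; a client that strips unknown markup from history would silently disable the pinning.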
@@ -21,6 +21,7 @@ interface ComboMetricsEntry {
|
||||
totalLatencyMs: number;
|
||||
strategy: string;
|
||||
lastUsedAt: string | null;
|
||||
intentCounts: Record<string, number>;
|
||||
byModel: Record<string, ModelMetrics>;
|
||||
}
|
||||
|
||||
@@ -69,6 +70,7 @@ export function recordComboRequest(
|
||||
totalLatencyMs: 0,
|
||||
strategy,
|
||||
lastUsedAt: null,
|
||||
intentCounts: {},
|
||||
byModel: {},
|
||||
});
|
||||
}
|
||||
@@ -131,6 +133,7 @@ export function getComboMetrics(comboName: string): ComboMetricsView | null {
|
||||
combo.totalRequests > 0 ? Math.round((combo.totalSuccesses / combo.totalRequests) * 100) : 0,
|
||||
fallbackRate:
|
||||
combo.totalRequests > 0 ? Math.round((combo.totalFallbacks / combo.totalRequests) * 100) : 0,
|
||||
intentCounts: { ...combo.intentCounts },
|
||||
byModel: Object.fromEntries(
|
||||
Object.entries(combo.byModel).map(([model, m]) => [
|
||||
model,
|
||||
@@ -156,6 +159,30 @@ export function getAllComboMetrics(): Record<string, ComboMetricsView | null> {
|
||||
return result;
|
||||
}
|
||||
|
||||
/**
|
||||
* Record detected prompt intent for a combo (used by multilingual routing analytics).
|
||||
*/
|
||||
export function recordComboIntent(comboName: string, intent: string): void {
|
||||
if (!metrics.has(comboName)) {
|
||||
metrics.set(comboName, {
|
||||
totalRequests: 0,
|
||||
totalSuccesses: 0,
|
||||
totalFailures: 0,
|
||||
totalFallbacks: 0,
|
||||
totalLatencyMs: 0,
|
||||
strategy: "priority",
|
||||
lastUsedAt: null,
|
||||
intentCounts: {},
|
||||
byModel: {},
|
||||
});
|
||||
}
|
||||
|
||||
const combo = metrics.get(comboName);
|
||||
if (!combo) return;
|
||||
const key = String(intent || "unknown");
|
||||
combo.intentCounts[key] = (combo.intentCounts[key] || 0) + 1;
|
||||
}
|
||||
|
||||
/**
|
||||
* Reset metrics for a specific combo
|
||||
*/
|
||||
|
||||
@@ -0,0 +1,103 @@
|
||||
/**
|
||||
* Emergency Fallback — Budget Exhaustion Redirect
|
||||
*
|
||||
* When a request fails due to budget exhaustion (HTTP 402 or budget keywords
|
||||
* in the error body), optionally redirect to a free-tier model
|
||||
* (default provider/model: nvidia + openai/gpt-oss-120b at $0.00/M tokens).
|
||||
*
|
||||
* Inspired by ClawRouter: "gpt-oss-120b costs nothing and serves as
|
||||
* automatic fallback when wallet is empty."
|
||||
*/
|
||||
|
||||
export interface EmergencyFallbackConfig {
|
||||
enabled: boolean;
|
||||
provider: string;
|
||||
model: string;
|
||||
triggerOn402: boolean;
|
||||
triggerOnBudgetKeywords: boolean;
|
||||
budgetKeywords: string[];
|
||||
/** Skip fallback for tool requests (gpt-oss-120b may not support structured tool calling) */
|
||||
skipForToolRequests: boolean;
|
||||
maxOutputTokens: number;
|
||||
}
|
||||
|
||||
export const EMERGENCY_FALLBACK_CONFIG: EmergencyFallbackConfig = {
|
||||
enabled: true,
|
||||
provider: "nvidia",
|
||||
model: "openai/gpt-oss-120b",
|
||||
triggerOn402: true,
|
||||
triggerOnBudgetKeywords: true,
|
||||
budgetKeywords: [
|
||||
"insufficient funds",
|
||||
"insufficient_funds",
|
||||
"budget exceeded",
|
||||
"budget_exceeded",
|
||||
"quota exceeded",
|
||||
"quota_exceeded",
|
||||
"billing",
|
||||
"payment required",
|
||||
"out of credits",
|
||||
"no credits",
|
||||
"credit limit",
|
||||
"spending limit",
|
||||
"saldo insuficiente",
|
||||
"limite de gastos",
|
||||
"cota excedida",
|
||||
],
|
||||
skipForToolRequests: true,
|
||||
maxOutputTokens: 4096,
|
||||
};
|
||||
|
||||
export interface FallbackDecision {
|
||||
shouldFallback: true;
|
||||
reason: string;
|
||||
provider: string;
|
||||
model: string;
|
||||
maxOutputTokens: number;
|
||||
}
|
||||
|
||||
export interface NoFallbackDecision {
|
||||
shouldFallback: false;
|
||||
reason: string;
|
||||
}
|
||||
|
||||
export type FallbackResult = FallbackDecision | NoFallbackDecision;
|
||||
|
||||
export function shouldUseFallback(
|
||||
status: number,
|
||||
errorBody: string,
|
||||
requestHasTools: boolean,
|
||||
config: EmergencyFallbackConfig = EMERGENCY_FALLBACK_CONFIG
|
||||
): FallbackResult {
|
||||
if (!config.enabled) return { shouldFallback: false, reason: "emergency fallback disabled" };
|
||||
if (config.skipForToolRequests && requestHasTools) {
|
||||
return { shouldFallback: false, reason: "skipped: request has tools" };
|
||||
}
|
||||
if (config.triggerOn402 && status === 402) {
|
||||
return {
|
||||
shouldFallback: true,
|
||||
reason: `HTTP 402 → emergency fallback to ${config.provider}/${config.model}`,
|
||||
provider: config.provider,
|
||||
model: config.model,
|
||||
maxOutputTokens: config.maxOutputTokens,
|
||||
};
|
||||
}
|
||||
if (config.triggerOnBudgetKeywords && errorBody) {
|
||||
const lowerBody = errorBody.toLowerCase();
|
||||
const matched = config.budgetKeywords.find((kw) => lowerBody.includes(kw.toLowerCase()));
|
||||
if (matched) {
|
||||
return {
|
||||
shouldFallback: true,
|
||||
reason: `Budget error detected ('${matched}') → emergency fallback to ${config.provider}/${config.model}`,
|
||||
provider: config.provider,
|
||||
model: config.model,
|
||||
maxOutputTokens: config.maxOutputTokens,
|
||||
};
|
||||
}
|
||||
}
|
||||
return { shouldFallback: false, reason: "no budget error detected" };
|
||||
}
|
||||
|
||||
export function isFallbackDecision(result: FallbackResult): result is FallbackDecision {
|
||||
return result.shouldFallback === true;
|
||||
}
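
The decision flow above can be exercised in isolation. The following is a minimal sketch, not the project's module: `FallbackConfig`, `demoConfig`, `decideFallback` and `describe` are hypothetical stand-ins that mirror the order of checks in `shouldUseFallback` (disabled, tool requests, HTTP 402, budget keywords) with a trimmed keyword list.

```typescript
// Trimmed, hypothetical stand-in for the config shown in the diff.
interface FallbackConfig {
  enabled: boolean;
  provider: string;
  model: string;
  triggerOn402: boolean;
  triggerOnBudgetKeywords: boolean;
  budgetKeywords: string[];
  skipForToolRequests: boolean;
}

type Decision =
  | { shouldFallback: true; reason: string; provider: string; model: string }
  | { shouldFallback: false; reason: string };

const demoConfig: FallbackConfig = {
  enabled: true,
  provider: "nvidia",
  model: "openai/gpt-oss-120b",
  triggerOn402: true,
  triggerOnBudgetKeywords: true,
  budgetKeywords: ["quota exceeded", "out of credits"], // illustrative subset
  skipForToolRequests: true,
};

function decideFallback(
  status: number,
  errorBody: string,
  hasTools: boolean,
  cfg: FallbackConfig = demoConfig
): Decision {
  if (!cfg.enabled) return { shouldFallback: false, reason: "disabled" };
  // Tool requests are skipped before any status or keyword check.
  if (cfg.skipForToolRequests && hasTools) {
    return { shouldFallback: false, reason: "skipped: request has tools" };
  }
  if (cfg.triggerOn402 && status === 402) {
    return { shouldFallback: true, reason: "HTTP 402", provider: cfg.provider, model: cfg.model };
  }
  if (cfg.triggerOnBudgetKeywords && errorBody) {
    const lower = errorBody.toLowerCase();
    const matched = cfg.budgetKeywords.find((kw) => lower.includes(kw.toLowerCase()));
    if (matched) {
      return { shouldFallback: true, reason: `keyword '${matched}'`, provider: cfg.provider, model: cfg.model };
    }
  }
  return { shouldFallback: false, reason: "no budget error detected" };
}

// Hypothetical call site: narrowing on the discriminant gives typed access to provider/model.
function describe(d: Decision): string {
  return d.shouldFallback ? `retry on ${d.provider}/${d.model}` : `no fallback (${d.reason})`;
}
```

Because the tool check runs first, a 402 on a request that carries tools does not trigger the fallback in this sketch, which matches the ordering in the diff.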
|
||||
@@ -0,0 +1,375 @@
|
||||
/**
|
||||
* Multilingual Intent Detection for AutoCombo
|
||||
*
|
||||
* Classifies prompts as: code | reasoning | simple | medium
|
||||
* using keywords in 9 languages (EN, PT-BR, ES, ZH, JA, RU, DE, KO, AR).
|
||||
*
|
||||
* Inspired by ClawRouter (BlockRunAI) multilingual routing system.
|
||||
* Execution: purely synchronous, <1ms, no I/O.
|
||||
*/
|
||||
|
||||
export type IntentType = "code" | "reasoning" | "simple" | "medium";
|
||||
|
||||
export const CODE_KEYWORDS: readonly string[] = [
|
||||
// English
|
||||
"function",
|
||||
"class",
|
||||
"import",
|
||||
"def",
|
||||
"SELECT",
|
||||
"async",
|
||||
"await",
|
||||
"const",
|
||||
"let",
|
||||
"var",
|
||||
"return",
|
||||
"```",
|
||||
"algorithm",
|
||||
"compile",
|
||||
"debug",
|
||||
"refactor",
|
||||
"typescript",
|
||||
"python",
|
||||
"javascript",
|
||||
"code",
|
||||
"implement",
|
||||
"write a",
|
||||
"create a component",
|
||||
"endpoint",
|
||||
"repository",
|
||||
"deploy",
|
||||
"install",
|
||||
"script",
|
||||
"api",
|
||||
"database",
|
||||
"query",
|
||||
"schema",
|
||||
"interface",
|
||||
"generic",
|
||||
"enum",
|
||||
"module",
|
||||
"package",
|
||||
"dependency",
|
||||
// Português (PT-BR)
|
||||
"função",
|
||||
"classe",
|
||||
"importar",
|
||||
"definir",
|
||||
"consulta",
|
||||
"assíncrono",
|
||||
"aguardar",
|
||||
"constante",
|
||||
"variável",
|
||||
"retornar",
|
||||
"algoritmo",
|
||||
"compilar",
|
||||
"depurar",
|
||||
"refatorar",
|
||||
"código",
|
||||
"implementar",
|
||||
"criar um",
|
||||
"componente",
|
||||
"como fazer",
|
||||
"repositório",
|
||||
"configurar",
|
||||
"instalar",
|
||||
"banco de dados",
|
||||
"escrever uma função",
|
||||
"criar uma classe",
|
||||
// Español
|
||||
"función",
|
||||
"clase",
|
||||
"importar",
|
||||
"definir",
|
||||
"consulta",
|
||||
"asíncrono",
|
||||
"esperar",
|
||||
"constante",
|
||||
"variable",
|
||||
"retornar",
|
||||
"algoritmo",
|
||||
"compilar",
|
||||
"depurar",
|
||||
"refactorizar",
|
||||
"código",
|
||||
"implementar",
|
||||
// 中文
|
||||
"函数",
|
||||
"类",
|
||||
"导入",
|
||||
"定义",
|
||||
"查询",
|
||||
"异步",
|
||||
"等待",
|
||||
"常量",
|
||||
"变量",
|
||||
"返回",
|
||||
"算法",
|
||||
"编译",
|
||||
"调试",
|
||||
"代码",
|
||||
// 日本語
|
||||
"関数",
|
||||
"クラス",
|
||||
"インポート",
|
||||
"非同期",
|
||||
"定数",
|
||||
"変数",
|
||||
"コード",
|
||||
"アルゴリズム",
|
||||
// Русский
|
||||
"функция",
|
||||
"класс",
|
||||
"импорт",
|
||||
"запрос",
|
||||
"асинхронный",
|
||||
"константа",
|
||||
"переменная",
|
||||
"алгоритм",
|
||||
"код",
|
||||
// Deutsch
|
||||
"funktion",
|
||||
"klasse",
|
||||
"importieren",
|
||||
"abfrage",
|
||||
"asynchron",
|
||||
"konstante",
|
||||
"variable",
|
||||
"algorithmus",
|
||||
"code",
|
||||
// 한국어
|
||||
"함수",
|
||||
"클래스",
|
||||
"가져오기",
|
||||
"정의",
|
||||
"쿼리",
|
||||
"비동기",
|
||||
"대기",
|
||||
"상수",
|
||||
"변수",
|
||||
"반환",
|
||||
"코드",
|
||||
// العربية
|
||||
"دالة",
|
||||
"فئة",
|
||||
"استيراد",
|
||||
"استعلام",
|
||||
"غير متزامن",
|
||||
"ثابت",
|
||||
"متغير",
|
||||
"كود",
|
||||
"خوارزمية",
|
||||
];
|
||||
|
||||
export const REASONING_KEYWORDS: readonly string[] = [
|
||||
// English
|
||||
"prove",
|
||||
"theorem",
|
||||
"derive",
|
||||
"step by step",
|
||||
"chain of thought",
|
||||
"formally",
|
||||
"mathematical",
|
||||
"proof",
|
||||
"logically",
|
||||
"analyze",
|
||||
"reasoning",
|
||||
"deduce",
|
||||
"infer",
|
||||
"hypothesis",
|
||||
"convergence",
|
||||
// Português (PT-BR)
|
||||
"provar",
|
||||
"teorema",
|
||||
"derivar",
|
||||
"passo a passo",
|
||||
"cadeia de pensamento",
|
||||
"formalmente",
|
||||
"matemático",
|
||||
"prova",
|
||||
"logicamente",
|
||||
"analisar",
|
||||
"raciocínio",
|
||||
"deduzir",
|
||||
"inferir",
|
||||
"hipótese",
|
||||
"demonstrar",
|
||||
"cálculo",
|
||||
"equação diferencial",
|
||||
"integral",
|
||||
"otimização",
|
||||
// Español
|
||||
"demostrar",
|
||||
"teorema",
|
||||
"derivar",
|
||||
"paso a paso",
|
||||
"formalmente",
|
||||
"matemático",
|
||||
"lógicamente",
|
||||
// 中文
|
||||
"证明",
|
||||
"定理",
|
||||
"推导",
|
||||
"逐步",
|
||||
"思维链",
|
||||
"数学",
|
||||
"逻辑",
|
||||
"分析",
|
||||
// 日本語
|
||||
"証明",
|
||||
"定理",
|
||||
"導出",
|
||||
"論理的",
|
||||
"分析",
|
||||
// Русский
|
||||
"доказать",
|
||||
"теорема",
|
||||
"шаг за шагом",
|
||||
"математически",
|
||||
"логически",
|
||||
// Deutsch
|
||||
"beweisen",
|
||||
"theorem",
|
||||
"schritt für schritt",
|
||||
"mathematisch",
|
||||
"logisch",
|
||||
// 한국어
|
||||
"증명",
|
||||
"정리",
|
||||
"단계별",
|
||||
"수학적",
|
||||
"논리적",
|
||||
// العربية
|
||||
"إثبات",
|
||||
"نظرية",
|
||||
"خطوة بخطوة",
|
||||
"رياضي",
|
||||
"منطقياً",
|
||||
];
|
||||
|
||||
export const SIMPLE_KEYWORDS: readonly string[] = [
|
||||
// English
|
||||
"what is",
|
||||
"define",
|
||||
"translate",
|
||||
"hello",
|
||||
"yes or no",
|
||||
"summarize",
|
||||
"list",
|
||||
"tell me",
|
||||
"who is",
|
||||
// Português (PT-BR)
|
||||
"o que é",
|
||||
"definir",
|
||||
"traduzir",
|
||||
"olá",
|
||||
"oi",
|
||||
"sim ou não",
|
||||
"resumir",
|
||||
"listar",
|
||||
"me diga",
|
||||
"quem é",
|
||||
"quando foi",
|
||||
"onde fica",
|
||||
"explique brevemente",
|
||||
"de forma simples",
|
||||
// Español
|
||||
"qué es",
|
||||
"definir",
|
||||
"traducir",
|
||||
"hola",
|
||||
"resumir",
|
||||
"listar",
|
||||
// 中文
|
||||
"什么是",
|
||||
"定义",
|
||||
"翻译",
|
||||
"你好",
|
||||
"总结",
|
||||
"列出",
|
||||
// Русский
|
||||
"что такое",
|
||||
"определить",
|
||||
"перевести",
|
||||
"привет",
|
||||
"резюмировать",
|
||||
// Deutsch
|
||||
"was ist",
|
||||
"definieren",
|
||||
"übersetzen",
|
||||
"hallo",
|
||||
"zusammenfassen",
|
||||
// 한국어
|
||||
"이란",
|
||||
"정의",
|
||||
"번역",
|
||||
"안녕",
|
||||
"요약",
|
||||
// العربية
|
||||
"ما هو",
|
||||
"تعريف",
|
||||
"ترجمة",
|
||||
"مرحبا",
|
||||
"ملخص",
|
||||
];
|
||||
|
||||
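/**
 * Classify a prompt's intent using multilingual keyword matching.
 * Matching is a case-insensitive substring test over the system prompt plus the prompt.
 * Priority: code > reasoning > simple > medium (default).
 * "simple" is only considered when the prompt has fewer than 60 whitespace-separated words.
 */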
|
||||
export function classifyPromptIntent(prompt: string, systemPrompt?: string): IntentType {
|
||||
const fullText = `${systemPrompt ?? ""} ${prompt}`.toLowerCase();
|
||||
const wordCount = prompt.trim().split(/\s+/).length;
|
||||
|
||||
for (const kw of CODE_KEYWORDS) {
|
||||
if (fullText.includes(kw.toLowerCase())) return "code";
|
||||
}
|
||||
for (const kw of REASONING_KEYWORDS) {
|
||||
if (fullText.includes(kw.toLowerCase())) return "reasoning";
|
||||
}
|
||||
if (wordCount < 60) {
|
||||
for (const kw of SIMPLE_KEYWORDS) {
|
||||
if (fullText.includes(kw.toLowerCase())) return "simple";
|
||||
}
|
||||
}
|
||||
return "medium";
|
||||
}
|
||||
|
||||
export interface IntentClassifierConfig {
|
||||
enabled: boolean;
|
||||
extraCodeKeywords?: string[];
|
||||
extraReasoningKeywords?: string[];
|
||||
extraSimpleKeywords?: string[];
|
||||
simpleMaxWords?: number;
|
||||
}
|
||||
|
||||
export const DEFAULT_INTENT_CONFIG: IntentClassifierConfig = {
|
||||
enabled: true,
|
||||
simpleMaxWords: 60,
|
||||
};
|
||||
|
||||
export function classifyWithConfig(
|
||||
prompt: string,
|
||||
config: IntentClassifierConfig,
|
||||
systemPrompt?: string
|
||||
): IntentType {
|
||||
if (!config.enabled) return "medium";
|
||||
const fullText = `${systemPrompt ?? ""} ${prompt}`.toLowerCase();
|
||||
const wordCount = prompt.trim().split(/\s+/).length;
|
||||
const maxSimpleWords = config.simpleMaxWords ?? 60;
|
||||
const codeKws = [...CODE_KEYWORDS, ...(config.extraCodeKeywords ?? [])];
|
||||
const reasoningKws = [...REASONING_KEYWORDS, ...(config.extraReasoningKeywords ?? [])];
|
||||
const simpleKws = [...SIMPLE_KEYWORDS, ...(config.extraSimpleKeywords ?? [])];
|
||||
for (const kw of codeKws) {
|
||||
if (fullText.includes(kw.toLowerCase())) return "code";
|
||||
}
|
||||
for (const kw of reasoningKws) {
|
||||
if (fullText.includes(kw.toLowerCase())) return "reasoning";
|
||||
}
|
||||
if (wordCount < maxSimpleWords) {
|
||||
for (const kw of simpleKws) {
|
||||
if (fullText.includes(kw.toLowerCase())) return "simple";
|
||||
}
|
||||
}
|
||||
return "medium";
|
||||
}
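
A small sketch may help show how the priority order and the substring test interact. It assumes tiny illustrative keyword subsets (`CODE`, `REASONING`, `SIMPLE`) and a hypothetical `classify` helper. The real lists in the diff are far longer, and the sketch ignores the system prompt.

```typescript
type Intent = "code" | "reasoning" | "simple" | "medium";

// Illustrative subsets only; the lists in the diff are much longer.
const CODE = ["function", "def", "```"];
const REASONING = ["prove", "step by step"];
const SIMPLE = ["what is", "define", "hello"];

function classify(prompt: string, simpleMaxWords = 60): Intent {
  const text = prompt.toLowerCase();
  const wordCount = prompt.trim().split(/\s+/).length;
  // Checks run in priority order; the first list with a substring hit wins.
  if (CODE.some((kw) => text.includes(kw))) return "code";
  if (REASONING.some((kw) => text.includes(kw))) return "reasoning";
  if (wordCount < simpleMaxWords && SIMPLE.some((kw) => text.includes(kw))) return "simple";
  return "medium";
}

console.log(classify("hello there")); // "simple"
console.log(classify("Prove that the square root of 2 is irrational")); // "reasoning"
console.log(classify("Define entropy")); // "code": "define" contains the code keyword "def"
```

The last call shows a consequence of substring matching: a short keyword in a higher-priority list can shadow a longer keyword in a lower-priority one. Whether that matters in practice depends on the traffic being routed.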
|
||||
@@ -23,6 +23,18 @@ const PROVIDER_MODEL_ALIASES = {
|
||||
"gemini-3-flash": "gemini-3-flash-preview",
|
||||
"raptor-mini": "oswe-vscode-prime",
|
||||
},
|
||||
gemini: {
|
||||
"gemini-3.1-pro-preview": "gemini-3.1-pro",
|
||||
"gemini-3-1-pro": "gemini-3.1-pro",
|
||||
},
|
||||
"gemini-cli": {
|
||||
"gemini-3.1-pro-preview": "gemini-3.1-pro",
|
||||
"gemini-3-1-pro": "gemini-3.1-pro",
|
||||
},
|
||||
nvidia: {
|
||||
"gpt-oss-120b": "openai/gpt-oss-120b",
|
||||
"nvidia/gpt-oss-120b": "openai/gpt-oss-120b",
|
||||
},
|
||||
antigravity: {},
|
||||
};
|
||||
|
||||
|
||||
@@ -0,0 +1,50 @@
|
||||
import { PROVIDER_ID_TO_ALIAS, PROVIDER_MODELS } from "../config/providerModels.ts";
|
||||
import { parseModel } from "./model.ts";
|
||||
|
||||
// Conservative denylist fallback used when registry metadata is absent.
|
||||
// Keep small and explicit to avoid false negatives.
|
||||
const TOOL_CALLING_UNSUPPORTED_PATTERNS = [
|
||||
"gpt-oss-120b",
|
||||
"deepseek-reasoner",
|
||||
"glm-4.7",
|
||||
"glm4.7",
|
||||
];
|
||||
|
||||
function getRegistryToolCallingFlag(providerIdOrAlias: string, modelId: string): boolean | null {
|
||||
const providerAlias = PROVIDER_ID_TO_ALIAS[providerIdOrAlias] || providerIdOrAlias;
|
||||
const models = PROVIDER_MODELS[providerAlias];
|
||||
if (!Array.isArray(models)) return null;
|
||||
const found = models.find((m) => m?.id === modelId);
|
||||
if (!found) return null;
|
||||
return typeof found.toolCalling === "boolean" ? found.toolCalling : null;
|
||||
}
|
||||
|
||||
/**
|
||||
* Returns whether a model should be considered safe for structured function/tool calling.
|
||||
*
|
||||
* Decision order:
|
||||
* 1) Provider registry metadata (toolCalling flag) when available.
|
||||
* 2) Conservative denylist fallback for known problematic model families.
|
||||
* 3) Default true.
|
||||
*/
|
||||
export function supportsToolCalling(modelStr: string): boolean {
|
||||
const parsed = parseModel(modelStr);
|
||||
const provider = parsed.provider || parsed.providerAlias || "";
|
||||
const model = parsed.model || modelStr;
|
||||
|
||||
if (provider) {
|
||||
const fromRegistry = getRegistryToolCallingFlag(provider, model);
|
||||
if (fromRegistry !== null) return fromRegistry;
|
||||
}
|
||||
|
||||
const normalized = String(modelStr || "").toLowerCase();
|
||||
if (!normalized) return false;
|
||||
|
||||
const blocked = TOOL_CALLING_UNSUPPORTED_PATTERNS.some((pattern) => {
|
||||
if (normalized === pattern) return true;
|
||||
if (normalized.endsWith(`/${pattern}`)) return true;
|
||||
return normalized.includes(pattern);
|
||||
});
|
||||
|
||||
return !blocked;
|
||||
}
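
The three-step decision order documented above (registry flag, then denylist, then default `true`) can be sketched as follows. `REGISTRY`, `registryFlag` and `canUseTools` are hypothetical names, and the parsing rule (text before the first `/` is the provider) is an assumption of this sketch rather than the behaviour of the project's `parseModel`.

```typescript
// Hypothetical registry: provider -> models with an optional toolCalling flag.
const REGISTRY: Record<string, { id: string; toolCalling?: boolean }[]> = {
  nvidia: [{ id: "openai/gpt-oss-120b", toolCalling: false }],
  demo: [{ id: "glm-4.7", toolCalling: true }, { id: "plain-model" }],
};

const DENYLIST = ["gpt-oss-120b", "deepseek-reasoner", "glm-4.7", "glm4.7"];

function registryFlag(provider: string, model: string): boolean | null {
  const found = REGISTRY[provider]?.find((m) => m.id === model);
  // null means "registry has no opinion", which lets the denylist decide.
  return typeof found?.toolCalling === "boolean" ? found.toolCalling : null;
}

function canUseTools(modelStr: string): boolean {
  // Assumed parsing rule for this sketch: text before the first "/" is the provider.
  const slash = modelStr.indexOf("/");
  const provider = slash > 0 ? modelStr.slice(0, slash) : "";
  const model = slash > 0 ? modelStr.slice(slash + 1) : modelStr;

  if (provider) {
    const flag = registryFlag(provider, model);
    if (flag !== null) return flag; // 1) explicit registry metadata wins
  }
  const normalized = modelStr.toLowerCase();
  if (!normalized) return false;
  // 2) denylist substring match, 3) default to true
  return !DENYLIST.some((pattern) => normalized.includes(pattern));
}
```

In this sketch an explicit registry flag overrides the denylist, so `demo/glm-4.7` is allowed while `other/glm-4.7` is blocked by the pattern match.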
|
||||
@@ -0,0 +1,120 @@
|
||||
/**
|
||||
* Request Deduplication Service
|
||||
*
|
||||
* Deduplicates **concurrent** identical requests to the same upstream.
|
||||
* Inspired by ClawRouter's dedup.ts (BlockRunAI / github.com/BlockRunAI/ClawRouter).
|
||||
*
|
||||
* IMPORTANT: In-memory only — does NOT persist across restarts and does NOT
|
||||
* work across multiple process instances (no cross-instance dedup).
|
||||
*/
|
||||
|
||||
import { createHash } from "node:crypto";
|
||||
|
||||
export interface DedupConfig {
|
||||
enabled: boolean;
|
||||
maxTemperatureForDedup: number;
|
||||
timeoutMs: number;
|
||||
}
|
||||
|
||||
export const DEFAULT_DEDUP_CONFIG: DedupConfig = {
|
||||
enabled: true,
|
||||
maxTemperatureForDedup: 0.1,
|
||||
timeoutMs: 60_000,
|
||||
};
|
||||
|
||||
export interface DedupResult<T> {
|
||||
result: T;
|
||||
wasDeduplicated: boolean;
|
||||
hash: string;
|
||||
}
|
||||
|
||||
const inflight = new Map<string, Promise<unknown>>();
|
||||
|
||||
/**
 * Compute a deterministic hash for a request body.
 * Includes: model, messages, temperature, tools, tool_choice, max_tokens,
 * response_format, top_p, frequency_penalty, presence_penalty.
 * Excludes: stream, user, metadata (these don't affect LLM output).
 * A missing temperature is hashed as 1.0; the digest is truncated to 16 hex chars.
 */
|
||||
export function computeRequestHash(requestBody: unknown): string {
|
||||
const body = requestBody as Record<string, unknown>;
|
||||
const canonical = {
|
||||
model: body.model ?? null,
|
||||
messages: body.messages ?? null,
|
||||
temperature: typeof body.temperature === "number" ? body.temperature : 1.0,
|
||||
tools: body.tools ?? null,
|
||||
tool_choice: body.tool_choice ?? null,
|
||||
max_tokens: body.max_tokens ?? null,
|
||||
response_format: body.response_format ?? null,
|
||||
top_p: body.top_p ?? null,
|
||||
frequency_penalty: body.frequency_penalty ?? null,
|
||||
presence_penalty: body.presence_penalty ?? null,
|
||||
};
|
||||
return createHash("sha256").update(JSON.stringify(canonical)).digest("hex").slice(0, 16);
|
||||
}
|
||||
|
||||
/** Determine whether a request should be deduplicated */
|
||||
export function shouldDeduplicate(
|
||||
requestBody: unknown,
|
||||
config: DedupConfig = DEFAULT_DEDUP_CONFIG
|
||||
): boolean {
|
||||
if (!config.enabled) return false;
|
||||
const body = requestBody as Record<string, unknown>;
|
||||
if (body.stream === true) return false;
|
||||
const temperature = typeof body.temperature === "number" ? body.temperature : 1.0;
|
||||
if (temperature > config.maxTemperatureForDedup) return false;
|
||||
return true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Execute a request with deduplication.
|
||||
* Concurrent identical requests share one upstream call.
|
||||
*/
|
||||
export async function deduplicate<T>(
|
||||
hash: string,
|
||||
fn: () => Promise<T>,
|
||||
config: DedupConfig = DEFAULT_DEDUP_CONFIG
|
||||
): Promise<DedupResult<T>> {
|
||||
if (!config.enabled) {
|
||||
return { result: await fn(), wasDeduplicated: false, hash };
|
||||
}
|
||||
|
||||
const existing = inflight.get(hash);
|
||||
if (existing) {
|
||||
const result = (await existing) as T;
|
||||
return { result, wasDeduplicated: true, hash };
|
||||
}
|
||||
|
||||
let resolve!: (value: T) => void;
|
||||
let reject!: (reason: unknown) => void;
|
||||
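  const sharedPromise = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  // Attach a no-op handler: if fn() rejects while no duplicate caller is awaiting
  // sharedPromise, the rejection would otherwise be reported as unhandled.
  sharedPromise.catch(() => {});
  inflight.set(hash, sharedPromise as Promise<unknown>);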
|
||||
|
||||
const timer = setTimeout(() => {
|
||||
if (inflight.get(hash) === sharedPromise) inflight.delete(hash);
|
||||
}, config.timeoutMs);
|
||||
|
||||
try {
|
||||
const result = await fn();
|
||||
resolve(result);
|
||||
return { result, wasDeduplicated: false, hash };
|
||||
} catch (err) {
|
||||
reject(err);
|
||||
throw err;
|
||||
} finally {
|
||||
clearTimeout(timer);
|
||||
if (inflight.get(hash) === sharedPromise) inflight.delete(hash);
|
||||
}
|
||||
}
|
||||
|
||||
export function getInflightCount(): number {
|
||||
return inflight.size;
|
||||
}
|
||||
export function getInflightHashes(): string[] {
|
||||
return [...inflight.keys()];
|
||||
}
|
||||
export function clearInflight(): void {
|
||||
inflight.clear();
|
||||
}
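
The core idea of the service (one canonical key per request, one shared promise per key) can be shown in a few lines. This is a simplified sketch under stated assumptions: `canonicalKey`, `isDedupCandidate` and `share` are hypothetical helpers, the key is the raw JSON string rather than a SHA-256 digest, only a subset of fields is included, and there is no timeout.

```typescript
type Body = Record<string, unknown>;

// Build a canonical key from output-affecting fields only.
// (The diff hashes a longer field list with SHA-256; the sketch keeps the raw JSON string.)
function canonicalKey(body: Body): string {
  return JSON.stringify({
    model: body.model ?? null,
    messages: body.messages ?? null,
    temperature: typeof body.temperature === "number" ? body.temperature : 1.0,
    max_tokens: body.max_tokens ?? null,
  });
}

// Streaming and high-temperature requests are not candidates for sharing.
function isDedupCandidate(body: Body, maxTemperature = 0.1): boolean {
  if (body.stream === true) return false;
  const t = typeof body.temperature === "number" ? body.temperature : 1.0;
  return t <= maxTemperature;
}

const pending = new Map<string, Promise<unknown>>();

// Reuse the in-flight promise for the same key; forget it once it settles.
function share<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const existing = pending.get(key);
  if (existing) return existing as Promise<T>;
  const p = fn().finally(() => pending.delete(key));
  pending.set(key, p);
  return p;
}
```

Two calls to `share` with the same key while the first is still pending return the same promise, so the upstream function runs once. Fields such as `user` or `metadata` do not change the key, which is what allows otherwise identical requests from different callers to be merged.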
|
||||
@@ -91,6 +91,10 @@ export function filterToOpenAIFormat(body) {
|
||||
delete body.tools;
|
||||
}
|
||||
|
||||
// Strip Claude-specific fields that OpenAI-compatible providers reject
|
||||
delete body.metadata;
|
||||
delete body.anthropic_version;
|
||||
|
||||
// Normalize tools to OpenAI format (from Claude, Gemini, etc.)
|
||||
if (body.tools && Array.isArray(body.tools) && body.tools.length > 0) {
|
||||
body.tools = body.tools
|
||||
|
||||
@@ -131,7 +131,7 @@ export function translateRequest(
   }

   // Final step: prepare request for Claude format endpoints
-  if (targetFormat === FORMATS.CLAUDE) {
+  if (targetFormat === FORMATS.CLAUDE && sourceFormat !== FORMATS.CLAUDE) {
     result = prepareClaudeRequest(result, provider);
   }

|
||||
@@ -6,6 +6,7 @@
|
||||
*/
|
||||
import { register } from "../registry.ts";
|
||||
import { FORMATS } from "../formats.ts";
|
||||
import { generateToolCallId } from "../helpers/toolCallHelper.ts";
|
||||
|
||||
type JsonRecord = Record<string, unknown>;
|
||||
|
||||
@@ -120,6 +121,12 @@ export function openaiResponsesToOpenAIRequest(
|
||||
}
|
||||
|
||||
if (itemType === "function_call") {
|
||||
// Skip tool calls with empty names to avoid infinite placeholder_tool loops
|
||||
const fnName = toString(item.name).trim();
|
||||
if (!fnName) {
|
||||
continue;
|
||||
}
|
||||
|
||||
// Start or append assistant message with tool_calls
|
||||
if (!currentAssistantMsg) {
|
||||
currentAssistantMsg = {
|
||||
@@ -136,7 +143,7 @@ export function openaiResponsesToOpenAIRequest(
         id: toString(item.call_id),
         type: "function",
         function: {
-          name: toString(item.name),
+          name: fnName,
           arguments: item.arguments,
         },
       });
|
||||
@@ -201,6 +208,24 @@ export function openaiResponsesToOpenAIRequest(
|
||||
});
|
||||
}
|
||||
|
||||
// Filter orphaned tool results (no matching tool_call in assistant messages)
|
||||
const allToolCallIds = new Set<string>();
|
||||
for (const m of messages) {
|
||||
const rec = toRecord(m);
|
||||
if (Array.isArray(rec.tool_calls)) {
|
||||
for (const tc of rec.tool_calls as { id?: string }[]) {
|
||||
if (tc.id) allToolCallIds.add(String(tc.id));
|
||||
}
|
||||
}
|
||||
}
|
||||
result.messages = messages.filter((m) => {
|
||||
const rec = toRecord(m);
|
||||
if (rec.role === "tool" && rec.tool_call_id) {
|
||||
return allToolCallIds.has(String(rec.tool_call_id));
|
||||
}
|
||||
return true;
|
||||
});
|
||||
|
||||
// Cleanup Responses API specific fields
|
||||
delete result.input;
|
||||
delete result.instructions;
|
||||
@@ -319,10 +344,15 @@ export function openaiToOpenAIResponsesRequest(
       for (const toolCallValue of msg.tool_calls) {
         const toolCall = toRecord(toolCallValue);
         const fn = toRecord(toolCall.function);
+        // Skip tool calls with empty names to avoid infinite placeholder_tool loops
+        const fnName = toString(fn.name).trim();
+        if (!fnName) {
+          continue;
+        }
         input.push({
           type: "function_call",
-          call_id: toString(toolCall.id),
-          name: toString(fn.name),
+          call_id: toString(toolCall.id).trim() || generateToolCallId(),
+          name: fnName,
           arguments: toString(fn.arguments, "{}"),
         });
       }
|
||||
@@ -339,6 +369,22 @@ export function openaiToOpenAIResponsesRequest(
|
||||
}
|
||||
}
|
||||
|
||||
// Filter orphaned function_call_output items (no matching function_call)
|
||||
// This happens when Claude Code compaction removes messages but leaves tool results
|
||||
const knownCallIds = new Set(
|
||||
input
|
||||
.filter(
|
||||
(item: { type?: string; call_id?: string }) => item.type === "function_call" && item.call_id
|
||||
)
|
||||
.map((item: { type?: string; call_id?: string }) => item.call_id)
|
||||
);
|
||||
result.input = input.filter((item: { type?: string; call_id?: string }) => {
|
||||
if (item.type === "function_call_output" && item.call_id) {
|
||||
return knownCallIds.has(item.call_id);
|
||||
}
|
||||
return true;
|
||||
});
|
||||
|
||||
// If no system message, keep empty instructions
|
||||
if (!hasSystemMessage) {
|
||||
result.instructions = "";
|
||||
|
||||
@@ -123,6 +123,43 @@ export function openaiToClaudeRequest(model, body, stream) {
|
||||
|
||||
flushCurrentMessage();
|
||||
|
||||
// Remove assistant messages with empty content (can happen when all tool_use blocks were skipped)
|
||||
result.messages = result.messages.filter((msg) => {
|
||||
if (msg.role === "assistant" && Array.isArray(msg.content) && msg.content.length === 0) {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
});
|
||||
|
||||
// Filter orphaned tool_result blocks whose tool_use_id has no matching tool_use
|
||||
const allToolUseIds = new Set<string>();
|
||||
for (const msg of result.messages) {
|
||||
if (msg.role === "assistant" && Array.isArray(msg.content)) {
|
||||
for (const block of msg.content) {
|
||||
if (block.type === "tool_use" && block.id) {
|
||||
allToolUseIds.add(String(block.id));
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
for (const msg of result.messages) {
|
||||
if (msg.role === "user" && Array.isArray(msg.content)) {
|
||||
msg.content = msg.content.filter((block) => {
|
||||
if (block.type === "tool_result" && block.tool_use_id) {
|
||||
return allToolUseIds.has(String(block.tool_use_id));
|
||||
}
|
||||
return true;
|
||||
});
|
||||
}
|
||||
}
|
||||
// Remove user messages that became empty after orphan filtering
|
||||
result.messages = result.messages.filter((msg) => {
|
||||
if (msg.role === "user" && Array.isArray(msg.content) && msg.content.length === 0) {
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
});
|
||||
|
||||
// Add cache_control to last assistant message
|
||||
for (let i = result.messages.length - 1; i >= 0; i--) {
|
||||
const message = result.messages[i];
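
The orphan filter in this hunk is a two-pass operation: collect every `tool_use` id, then keep a `tool_result` only if its `tool_use_id` is in that set. A self-contained sketch follows. The `Block` and `Msg` types and the `dropOrphanToolResults` name are hypothetical, and unlike the diff (which mutates `msg.content` in place) this version returns new objects.

```typescript
type Block = { type: string; id?: string; tool_use_id?: string; text?: string };
type Msg = { role: "user" | "assistant"; content: Block[] };

function dropOrphanToolResults(messages: Msg[]): Msg[] {
  // Pass 1: collect every tool_use id emitted by assistant messages.
  const toolUseIds = new Set<string>();
  for (const msg of messages) {
    if (msg.role !== "assistant") continue;
    for (const block of msg.content) {
      if (block.type === "tool_use" && block.id) toolUseIds.add(block.id);
    }
  }

  // Pass 2: keep tool_result blocks only when their tool_use_id is known,
  // then drop user messages that ended up with no content.
  return messages
    .map((msg): Msg =>
      msg.role === "user"
        ? {
            ...msg,
            content: msg.content.filter(
              (b) => b.type !== "tool_result" || !b.tool_use_id || toolUseIds.has(b.tool_use_id)
            ),
          }
        : msg
    )
    .filter((msg) => !(msg.role === "user" && msg.content.length === 0));
}
```

A user message that carried only an orphaned result disappears entirely, while a message that mixed text with an orphaned result keeps its text block.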
|
||||
|
||||
@@ -184,6 +184,17 @@ export function createSSEStream(options: StreamOptions = {}) {
|
||||
typeof parsed.type === "string" &&
|
||||
parsed.type.startsWith("response.");
|
||||
|
||||
// Detect Claude SSE payloads. Includes "ping" and "error" to ensure
|
||||
// they bypass the Chat Completions sanitization path which would
|
||||
// incorrectly process or drop them.
|
||||
const isClaudeSSE =
|
||||
parsed.type &&
|
||||
typeof parsed.type === "string" &&
|
||||
(parsed.type.startsWith("message") ||
|
||||
parsed.type.startsWith("content_block") ||
|
||||
parsed.type === "ping" ||
|
||||
parsed.type === "error");
|
||||
|
||||
if (isResponsesSSE) {
|
||||
// Responses SSE: only extract usage, forward payload as-is
|
||||
const extracted = extractUsage(parsed);
|
||||
@@ -194,6 +205,22 @@ export function createSSEStream(options: StreamOptions = {}) {
|
||||
if (parsed.delta && typeof parsed.delta === "string") {
|
||||
totalContentLength += parsed.delta.length;
|
||||
}
|
||||
} else if (isClaudeSSE) {
|
||||
// Claude SSE: extract usage, track content, forward as-is
|
||||
const extracted = extractUsage(parsed);
|
||||
if (extracted) {
|
||||
// Non-destructive merge: never overwrite a positive value with 0
|
||||
// message_start carries input_tokens, message_delta carries output_tokens
|
||||
if (!usage) usage = {};
|
||||
if (extracted.prompt_tokens > 0) usage.prompt_tokens = extracted.prompt_tokens;
|
||||
if (extracted.completion_tokens > 0) usage.completion_tokens = extracted.completion_tokens;
|
||||
if (extracted.total_tokens > 0) usage.total_tokens = extracted.total_tokens;
|
||||
if (extracted.cache_read_input_tokens) usage.cache_read_input_tokens = extracted.cache_read_input_tokens;
|
||||
if (extracted.cache_creation_input_tokens) usage.cache_creation_input_tokens = extracted.cache_creation_input_tokens;
|
||||
}
|
||||
// Track content length from Claude format
|
||||
if (parsed.delta?.text) totalContentLength += parsed.delta.text.length;
|
||||
if (parsed.delta?.thinking) totalContentLength += parsed.delta.thinking.length;
|
||||
} else {
|
||||
// Chat Completions: full sanitization pipeline
|
||||
parsed = sanitizeStreamingChunk(parsed);
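
The non-destructive merge in the Claude branch exists because the token counts arrive in separate events. A minimal sketch of that merge rule, with a hypothetical `Usage` type and `mergeUsage` helper (cache-related fields omitted), might look like this:

```typescript
type Usage = { prompt_tokens?: number; completion_tokens?: number; total_tokens?: number };

// Merge a newly extracted usage fragment without letting zeros erase earlier positive counts.
function mergeUsage(current: Usage | null, extracted: Usage): Usage {
  const merged: Usage = { ...(current ?? {}) };
  if ((extracted.prompt_tokens ?? 0) > 0) merged.prompt_tokens = extracted.prompt_tokens;
  if ((extracted.completion_tokens ?? 0) > 0) merged.completion_tokens = extracted.completion_tokens;
  if ((extracted.total_tokens ?? 0) > 0) merged.total_tokens = extracted.total_tokens;
  return merged;
}

// Hypothetical event sequence: input tokens first, output tokens later.
const afterStart = mergeUsage(null, { prompt_tokens: 120, completion_tokens: 0 });
const afterDelta = mergeUsage(afterStart, { prompt_tokens: 0, completion_tokens: 35 });
console.log(afterDelta); // { prompt_tokens: 120, completion_tokens: 35 }
```

A plain overwrite would leave `prompt_tokens` at 0 after the second event; the guard on positive values is what preserves the earlier count.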
|
||||
@@ -372,9 +399,9 @@ export function createSSEStream(options: StreamOptions = {}) {
           controller.enqueue(encoder.encode(output));
         }

-        // Estimate usage if provider didn't return valid usage (PASSTHROUGH is always OpenAI format)
+        // Estimate usage if provider didn't return valid usage
         if (!hasValidUsage(usage) && totalContentLength > 0) {
-          usage = estimateUsage(body, totalContentLength, FORMATS.OPENAI);
+          usage = estimateUsage(body, totalContentLength, sourceFormat || FORMATS.OPENAI);
         }

         if (hasValidUsage(usage)) {
|
||||
|
||||
Generated
+639
-435
File diff suppressed because it is too large
+2
-2
@@ -1,6 +1,6 @@
 {
   "name": "omniroute",
-  "version": "2.6.2",
+  "version": "2.7.0",
   "description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
   "type": "module",
   "bin": {
@@ -90,7 +90,7 @@
     "express": "^5.2.1",
     "fetch-socks": "^1.3.2",
     "http-proxy-middleware": "^3.0.5",
-    "https-proxy-agent": "^7.0.6",
+    "https-proxy-agent": "^8.0.0",
     "jose": "^6.1.3",
     "lowdb": "^7.0.1",
     "monaco-editor": "^0.55.1",
|
||||
|
||||
Binary file not shown.
|
After Width: | Height: | Size: 472 B |
+63
-7
@@ -14,6 +14,7 @@
|
||||
*
|
||||
* Fixes: https://github.com/diegosouzapw/OmniRoute/issues/129
|
||||
* Fixes: https://github.com/diegosouzapw/OmniRoute/issues/321
|
||||
* Fixes: https://github.com/diegosouzapw/OmniRoute/issues/426
|
||||
*/
|
||||
|
||||
import { existsSync, copyFileSync, mkdirSync } from "node:fs";
|
||||
@@ -80,8 +81,54 @@ if (existsSync(rootBinary)) {
|
||||
}
|
||||
}
|
||||
|
||||
// Strategy 1.5: Use node-pre-gyp to download the correct prebuilt binary
|
||||
// This works on Windows without requiring node-gyp, Python, or MSVC.
|
||||
// better-sqlite3 ships prebuilts for win32-x64, win32-arm64, darwin-x64/arm64.
|
||||
console.log(" 📥 Attempting to download prebuilt binary via node-pre-gyp...");
|
||||
try {
|
||||
const { execSync } = await import("node:child_process");
|
||||
// better-sqlite3 bundles @mapbox/node-pre-gyp — use it directly
|
||||
const preGypBin = join(
|
||||
ROOT,
|
||||
"app",
|
||||
"node_modules",
|
||||
".bin",
|
||||
process.platform === "win32" ? "node-pre-gyp.cmd" : "node-pre-gyp"
|
||||
);
|
||||
const preGypFallback = join(
|
||||
ROOT,
|
||||
"app",
|
||||
"node_modules",
|
||||
"@mapbox",
|
||||
"node-pre-gyp",
|
||||
"bin",
|
||||
"node-pre-gyp"
|
||||
);
|
||||
const preGypCmd = existsSync(preGypBin) ? preGypBin : preGypFallback;
|
||||
|
||||
if (existsSync(preGypCmd)) {
|
||||
execSync(`"${process.execPath}" "${preGypCmd}" install --fallback-to-build=false`, {
|
||||
cwd: join(ROOT, "app", "node_modules", "better-sqlite3"),
|
||||
stdio: "inherit",
|
||||
timeout: 60_000,
|
||||
});
|
||||
mkdirSync(dirname(appBinary), { recursive: true });
|
||||
try {
|
||||
process.dlopen({ exports: {} }, appBinary);
|
||||
console.log(" ✅ Prebuilt binary downloaded and loaded successfully!\n");
|
||||
process.exit(0);
|
||||
} catch (loadErr) {
|
||||
console.warn(` ⚠️ Downloaded binary failed to load: ${loadErr.message}`);
|
||||
}
|
||||
} else {
|
||||
console.warn(" ⚠️ node-pre-gyp not found, skipping prebuilt download.");
|
||||
}
|
||||
} catch (err) {
|
||||
console.warn(` ⚠️ node-pre-gyp download failed: ${err.message.split("\n")[0]}`);
|
||||
}
|
||||
|
||||
// Strategy 2: Fall back to npm rebuild (may work if build tools are available)
|
||||
console.log(" ⚠️ Root binary not available or incompatible, attempting npm rebuild...");
|
||||
console.log(" ⚠️ Attempting npm rebuild (requires build tools)...");
|
||||
|
||||
try {
|
||||
const { execSync } = await import("node:child_process");
|
||||
@@ -103,14 +150,23 @@ try {
   }
 }

-// If nothing worked, warn but don't fail the install — let the package stay
-// installed so users can fix manually or use the pre-flight check in the CLI
-console.warn(" ⚠️ Could not fix better-sqlite3 native module automatically.");
+// If nothing worked, warn but don't fail the install
+console.warn("\n ⚠️ Could not fix better-sqlite3 native module automatically.");
 console.warn(" The server may not start correctly.");
-console.warn(" Try manually:");
-console.warn(` cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
-if (process.platform === "darwin") {
+console.warn(" Manual fix options:");
+if (process.platform === "win32") {
+  console.warn(" Option A (easiest — no build tools needed):");
+  console.warn(` cd "${join(ROOT, "app", "node_modules", "better-sqlite3")}"`);
+  console.warn(" npx @mapbox/node-pre-gyp install --fallback-to-build=false");
+  console.warn(" Option B (requires Build Tools for Visual Studio):");
+  console.warn(` cd "${join(ROOT, "app")}" && npm rebuild better-sqlite3`);
+  console.warn(" Install from: https://visualstudio.microsoft.com/visual-cpp-build-tools/");
+  console.warn(" Also ensure Python is installed: https://python.org");
+} else if (process.platform === "darwin") {
+  console.warn(` cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
   console.warn(" If build tools are missing: xcode-select --install");
+} else {
+  console.warn(` cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
 }
 console.warn("");
|
||||
|
||||
|
||||
@@ -142,6 +142,62 @@ if (sanitisedCount > 0) {
|
||||
console.log(" ℹ️ No hardcoded paths found to sanitise");
|
||||
}
|
||||
|
||||
// ── Step 5.6: Strip Turbopack hashed externals from compiled chunks ─────────
|
||||
// Even when Turbopack is disabled at build time, some instrumentation chunks
|
||||
// may still emit require('package-<16hexchars>') instead of require('package').
|
||||
// These hashed names don't exist in node_modules and cause MODULE_NOT_FOUND at
|
||||
// runtime. We strip the hex suffix from all .js files in app/.next/server/
|
||||
// to ensure all require() calls use the real package names.
|
||||
{
|
||||
const serverOutput = join(APP_DIR, ".next", "server");
|
||||
const HASH_RE = /(['"\\])([a-z@][a-z0-9@./_-]+-[0-9a-f]{16})\1/g;
|
||||
let patchedFiles = 0;
|
||||
let patchedMatches = 0;
|
||||
const walkDir = (dir) => {
|
||||
let entries = [];
|
||||
try {
|
||||
entries = readdirSync(dir);
|
||||
} catch {
|
||||
return;
|
||||
}
|
||||
for (const entry of entries) {
|
||||
const full = join(dir, entry);
|
||||
try {
|
||||
const st = statSync(full);
|
||||
if (st.isDirectory()) {
|
||||
walkDir(full);
|
||||
continue;
|
||||
}
|
||||
if (!entry.endsWith(".js")) continue;
|
||||
const src = readFileSync(full, "utf8");
|
||||
let count = 0;
|
||||
const patched = src.replace(HASH_RE, (_, q, name) => {
|
||||
const base = name.replace(/-[0-9a-f]{16}$/, "");
|
||||
count++;
|
||||
return `${q}${base}${q}`;
|
||||
});
|
||||
if (count > 0) {
|
||||
writeFileSync(full, patched);
|
||||
patchedFiles++;
|
||||
patchedMatches += count;
|
||||
}
|
||||
} catch {
|
||||
/* skip unreadable files */
|
||||
}
|
||||
}
|
||||
};
|
||||
if (existsSync(serverOutput)) {
|
||||
walkDir(serverOutput);
|
||||
if (patchedMatches > 0) {
|
||||
console.log(
|
||||
` 🔧 Hash-strip: patched ${patchedMatches} hashed require() in ${patchedFiles} server chunk file(s)`
|
||||
);
|
||||
} else {
|
||||
console.log(" ✅ Hash-strip: no hashed externals found in compiled chunks.");
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// ── Step 6: Copy static assets ─────────────────────────────
|
||||
const staticSrc = join(ROOT, ".next", "static");
|
||||
const staticDest = join(APP_DIR, ".next", "static");
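
The rewrite performed in Step 5.6 comes down to one regular-expression replacement per file. The sketch below isolates that replacement on a string. `HASHED_SPECIFIER` and `stripHashedExternals` are hypothetical names, and the pattern is an assumption of this sketch (a quoted specifier ending in `-` plus 16 lowercase hex characters), not a copy of the script's `HASH_RE`.

```typescript
// Assumed pattern: opening quote, package-like name, "-" + 16 hex chars, same closing quote.
const HASHED_SPECIFIER = /(['"`])([a-z@][a-z0-9@./_-]+)-[0-9a-f]{16}\1/g;

function stripHashedExternals(source: string): { code: string; count: number } {
  let count = 0;
  const code = source.replace(
    HASHED_SPECIFIER,
    (_match: string, quote: string, base: string) => {
      count++;
      // Re-emit the specifier without the hash suffix, keeping the original quote style.
      return `${quote}${base}${quote}`;
    }
  );
  return { code, count };
}

const sample = 'const db = require("better-sqlite3-90e2652d1716b047"); const e = require("express");';
console.log(stripHashedExternals(sample).code);
// const db = require("better-sqlite3"); const e = require("express");
```

One caveat of any pattern of this shape is that it cannot distinguish a generated suffix from a package whose real name happens to end in a dash and 16 hex characters. It also rewrites matching string literals that are not module specifiers, so it is a heuristic rather than a parser.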
|
||||
|
||||
@@ -81,29 +81,36 @@ const PROVIDER_MODELS: Record<
       { id: "openai/dall-e-2", name: "DALL-E 2" },
     ],
   },
-  { id: "xai", name: "xAI (Grok)", models: [{ id: "xai/grok-2-image", name: "Grok 2 Image" }] },
+  {
+    id: "xai",
+    name: "xAI (Grok)",
+    models: [{ id: "xai/grok-2-image-1212", name: "Grok 2 Image" }],
+  },
   {
     id: "together",
     name: "Together AI",
     models: [
-      { id: "together/stable-diffusion-xl", name: "SDXL" },
-      { id: "together/FLUX.1-schnell-Free", name: "FLUX.1 Schnell" },
+      { id: "together/stabilityai/stable-diffusion-xl-base-1.0", name: "SDXL" },
+      { id: "together/black-forest-labs/FLUX.1-schnell-Free", name: "FLUX.1 Schnell" },
     ],
   },
   {
     id: "fireworks",
     name: "Fireworks AI",
     models: [
-      { id: "fireworks/stable-diffusion-xl-1024-v1-0", name: "SDXL 1024" },
-      { id: "fireworks/flux-1-dev-fp8", name: "FLUX.1 Dev" },
+      {
+        id: "fireworks/accounts/fireworks/models/stable-diffusion-xl-1024-v1-0",
+        name: "SDXL 1024",
+      },
+      { id: "fireworks/accounts/fireworks/models/flux-1-dev-fp8", name: "FLUX.1 Dev" },
     ],
   },
   {
     id: "nebius",
     name: "Nebius AI",
     models: [
-      { id: "nebius/flux-dev", name: "FLUX Dev" },
-      { id: "nebius/sdxl", name: "SDXL" },
+      { id: "nebius/black-forest-labs/flux-dev", name: "FLUX Dev" },
+      { id: "nebius/black-forest-labs/flux-schnell", name: "FLUX Schnell" },
     ],
   },
   {
|
||||
@@ -117,7 +124,10 @@ const PROVIDER_MODELS: Record<
|
||||
{
|
||||
id: "nanobanana",
|
||||
name: "NanoBanana",
|
||||
models: [{ id: "nanobanana/flux-schnell", name: "FLUX Schnell" }],
|
||||
models: [
|
||||
{ id: "nanobanana/nanobanana-flash", name: "NanoBanana Flash" },
|
||||
{ id: "nanobanana/nanobanana-pro", name: "NanoBanana Pro" },
|
||||
],
|
||||
},
|
||||
{
|
||||
id: "sdwebui",
|
||||
|
||||
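The updated model ids in `PROVIDER_MODELS` above now carry upstream paths that contain slashes themselves (for example `together/black-forest-labs/FLUX.1-schnell-Free`). Assuming the segment before the first slash is the OmniRoute provider id and the remainder is the model name sent upstream (the diff does not show the parsing code, so this is an assumption), a split would have to cut at the first slash only. `splitModelId` is a hypothetical helper for illustration.

```typescript
// Hypothetical helper: splits "provider/model" at the FIRST slash so that
// upstream model names containing further slashes stay intact.
function splitModelId(fullId: string): { provider: string; model: string } {
  const slash = fullId.indexOf("/");
  if (slash <= 0 || slash === fullId.length - 1) {
    throw new Error(`Malformed model id: ${fullId}`);
  }
  return { provider: fullId.slice(0, slash), model: fullId.slice(slash + 1) };
}

const fireworksId = splitModelId("fireworks/accounts/fireworks/models/flux-1-dev-fp8");
console.log(fireworksId.provider); // "fireworks"
console.log(fireworksId.model); // "accounts/fireworks/models/flux-1-dev-fp8"
```

Splitting on every slash (for example with `split("/")` and taking the second element) would truncate these ids, which is why the sketch uses `indexOf`.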
@@ -0,0 +1,614 @@
|
||||
"use client";
|
||||
|
||||
import { useCallback, useEffect, useMemo, useState } from "react";
|
||||
import { Button, Card, Modal } from "@/shared/components";
|
||||
|
||||
type ProxyItem = {
|
||||
id: string;
|
||||
name: string;
|
||||
type: string;
|
||||
host: string;
|
||||
port: number;
|
||||
region?: string | null;
|
||||
notes?: string | null;
|
||||
status?: string;
|
||||
};
|
||||
|
||||
type UsageInfo = {
|
||||
count: number;
|
||||
assignments: Array<{ scope: string; scopeId: string | null }>;
|
||||
};
|
||||
|
||||
type HealthInfo = {
|
||||
proxyId: string;
|
||||
totalRequests: number;
|
||||
successRate: number | null;
|
||||
avgLatencyMs: number | null;
|
||||
lastSeenAt: string | null;
|
||||
};
|
||||
|
||||
const EMPTY_FORM = {
|
||||
id: "",
|
||||
name: "",
|
||||
type: "http",
|
||||
host: "",
|
||||
port: "8080",
|
||||
username: "",
|
||||
password: "",
|
||||
region: "",
|
||||
notes: "",
|
||||
status: "active",
|
||||
};
|
||||
|
||||
export default function ProxyRegistryManager() {
|
||||
const [items, setItems] = useState<ProxyItem[]>([]);
|
||||
const [loading, setLoading] = useState(false);
|
||||
const [error, setError] = useState<string | null>(null);
|
||||
|
||||
const [modalOpen, setModalOpen] = useState(false);
|
||||
const [saving, setSaving] = useState(false);
|
||||
const [form, setForm] = useState(EMPTY_FORM);
|
||||
|
||||
const [usageById, setUsageById] = useState<Record<string, UsageInfo>>({});
|
||||
const [healthById, setHealthById] = useState<Record<string, HealthInfo>>({});
|
||||
const [migrating, setMigrating] = useState(false);
|
||||
const [bulkOpen, setBulkOpen] = useState(false);
|
||||
const [bulkSaving, setBulkSaving] = useState(false);
|
||||
const [bulkScope, setBulkScope] = useState("provider");
|
||||
const [bulkScopeIds, setBulkScopeIds] = useState("");
|
||||
const [bulkProxyId, setBulkProxyId] = useState("");
|
||||
|
||||
const editingId = useMemo(() => form.id || "", [form.id]);
|
||||
|
||||
const loadHealth = useCallback(async () => {
|
||||
try {
|
||||
const res = await fetch("/api/settings/proxies/health?hours=24");
|
||||
const data = await res.json().catch(() => ({}));
|
||||
if (!res.ok) return;
|
||||
const entries = Array.isArray(data?.items) ? data.items : [];
|
||||
const mapped = Object.fromEntries(
|
||||
entries.map((entry: HealthInfo) => [entry.proxyId, entry])
|
||||
) as Record<string, HealthInfo>;
|
||||
setHealthById(mapped);
|
||||
} catch {
|
||||
// ignore health loading errors in UI
|
||||
}
|
||||
}, []);
|
||||
|
||||
const load = useCallback(async () => {
|
||||
setLoading(true);
|
||||
setError(null);
|
||||
try {
|
||||
const res = await fetch("/api/settings/proxies");
|
||||
const data = await res.json().catch(() => ({}));
|
||||
if (!res.ok) {
|
||||
setError(data?.error?.message || "Failed to load proxy registry");
|
||||
setItems([]);
|
||||
return;
|
||||
}
|
||||
setItems(Array.isArray(data?.items) ? data.items : []);
|
||||
void loadHealth();
|
||||
} catch (e: any) {
|
||||
setError(e?.message || "Failed to load proxy registry");
|
||||
setItems([]);
|
||||
} finally {
|
||||
setLoading(false);
|
||||
}
|
||||
}, [loadHealth]);
|
||||
|
||||
useEffect(() => {
|
||||
void load();
|
||||
}, [load]);
|
||||
|
||||
useEffect(() => {
|
||||
if (items.length > 0 && !bulkProxyId) {
|
||||
setBulkProxyId(items[0].id);
|
||||
}
|
||||
}, [items, bulkProxyId]);
|
||||
|
||||
const openCreate = () => {
|
||||
setForm(EMPTY_FORM);
|
||||
setModalOpen(true);
|
||||
};
|
||||
|
||||
const openEdit = (item: ProxyItem) => {
|
||||
setForm({
|
||||
id: item.id,
|
||||
name: item.name || "",
|
||||
type: item.type || "http",
|
||||
host: item.host || "",
|
||||
port: String(item.port || 8080),
|
||||
username: "",
|
||||
password: "",
|
||||
region: item.region || "",
|
||||
notes: item.notes || "",
|
||||
status: item.status || "active",
|
||||
});
|
||||
setModalOpen(true);
|
||||
};
|
||||
|
||||
const loadUsage = async (proxyId: string) => {
|
||||
try {
|
||||
const res = await fetch(
|
||||
`/api/settings/proxies?id=${encodeURIComponent(proxyId)}&whereUsed=1`
|
||||
);
|
||||
const data = await res.json().catch(() => ({}));
|
||||
if (!res.ok) return;
|
||||
setUsageById((prev) => ({
|
||||
...prev,
|
||||
[proxyId]: {
|
||||
count: Number(data?.count || 0),
|
||||
assignments: Array.isArray(data?.assignments) ? data.assignments : [],
|
||||
},
|
||||
}));
|
||||
} catch {
|
||||
// ignore usage loading errors in UI
|
||||
}
|
||||
};
|
||||
|
||||
const handleSave = async () => {
|
||||
if (!form.name.trim() || !form.host.trim()) {
|
||||
setError("Name and host are required");
|
||||
return;
|
||||
}
|
||||
|
||||
setSaving(true);
|
||||
setError(null);
|
||||
|
||||
const normalizedUsername = form.username.trim();
|
||||
const normalizedPassword = form.password.trim();
|
||||
|
||||
const payload: Record<string, unknown> = {
|
||||
...(editingId ? { id: editingId } : {}),
|
||||
name: form.name.trim(),
|
||||
type: form.type,
|
||||
host: form.host.trim(),
|
||||
port: Number(form.port || 8080),
|
||||
region: form.region.trim() || null,
|
||||
notes: form.notes.trim() || null,
|
||||
status: form.status,
|
||||
};
|
||||
if (!editingId || normalizedUsername.length > 0) {
|
||||
payload.username = normalizedUsername;
|
||||
}
|
||||
if (!editingId || normalizedPassword.length > 0) {
|
||||
payload.password = normalizedPassword;
|
||||
}
|
||||
|
||||
try {
|
||||
const res = await fetch("/api/settings/proxies", {
|
||||
method: editingId ? "PATCH" : "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify(payload),
|
||||
});
|
||||
const data = await res.json().catch(() => ({}));
|
||||
if (!res.ok) {
|
||||
setError(data?.error?.message || "Failed to save proxy");
|
||||
return;
|
||||
}
|
||||
|
||||
setModalOpen(false);
|
||||
setForm(EMPTY_FORM);
|
||||
await load();
|
||||
} catch (e: any) {
|
||||
setError(e?.message || "Failed to save proxy");
|
||||
} finally {
|
||||
setSaving(false);
|
||||
}
|
||||
};
|
||||
|
||||
const handleDelete = async (id: string) => {
|
||||
try {
|
||||
const res = await fetch(`/api/settings/proxies?id=${encodeURIComponent(id)}`, {
|
||||
method: "DELETE",
|
||||
});
|
||||
|
||||
if (res.ok) {
|
||||
await load();
|
||||
return;
|
||||
}
|
||||
|
||||
const payload = await res.json().catch(() => ({}));
|
||||
const inUse = res.status === 409;
|
||||
if (inUse) {
|
||||
const ok = window.confirm(
|
||||
"This proxy is still assigned. Force delete and remove all assignments?"
|
||||
);
|
||||
if (!ok) return;
|
||||
|
||||
const forceRes = await fetch(`/api/settings/proxies?id=${encodeURIComponent(id)}&force=1`, {
|
||||
method: "DELETE",
|
||||
});
|
||||
|
||||
if (!forceRes.ok) {
|
||||
const forcePayload = await forceRes.json().catch(() => ({}));
|
||||
setError(forcePayload?.error?.message || "Failed to force delete proxy");
|
||||
return;
|
||||
}
|
||||
|
||||
await load();
|
||||
return;
|
||||
}
|
||||
|
||||
setError(payload?.error?.message || "Failed to delete proxy");
|
||||
} catch (e: any) {
|
||||
setError(e?.message || "Failed to delete proxy");
|
||||
}
|
||||
};
|
||||
|
||||
const handleMigrate = async () => {
|
||||
setMigrating(true);
|
||||
setError(null);
|
||||
try {
|
||||
const res = await fetch("/api/settings/proxies/migrate", {
|
||||
method: "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({ force: false }),
|
||||
});
|
||||
const data = await res.json().catch(() => ({}));
|
||||
if (!res.ok) {
|
||||
setError(data?.error?.message || "Failed to migrate legacy proxy config");
|
||||
return;
|
||||
}
|
||||
await load();
|
||||
} catch (e: any) {
|
||||
setError(e?.message || "Failed to migrate legacy proxy config");
|
||||
} finally {
|
||||
setMigrating(false);
|
||||
}
|
||||
};
|
||||
|
||||
const handleBulkAssign = async () => {
|
||||
setBulkSaving(true);
|
||||
setError(null);
|
||||
try {
|
||||
const scopeIds =
|
||||
bulkScope === "global"
|
||||
? []
|
||||
: bulkScopeIds
|
||||
.split(/[\n,]/g)
|
||||
.map((part) => part.trim())
|
||||
.filter(Boolean);
|
||||
|
||||
const res = await fetch("/api/settings/proxies/bulk-assign", {
|
||||
method: "PUT",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({
|
||||
scope: bulkScope,
|
||||
scopeIds,
|
||||
proxyId: bulkProxyId || null,
|
||||
}),
|
||||
});
|
||||
const payload = await res.json().catch(() => ({}));
|
||||
if (!res.ok) {
|
||||
setError(payload?.error?.message || "Failed to run bulk assignment");
|
||||
return;
|
||||
}
|
||||
|
||||
setBulkOpen(false);
|
||||
setBulkScopeIds("");
|
||||
await load();
|
||||
} catch (e: any) {
|
||||
setError(e?.message || "Failed to run bulk assignment");
|
||||
} finally {
|
||||
setBulkSaving(false);
|
||||
}
|
||||
};
|
||||
|
||||
return (
|
||||
<>
|
||||
<Card className="p-6">
|
||||
<div className="flex items-center justify-between gap-3 mb-4">
|
||||
<div>
|
||||
<h3 className="text-lg font-semibold">Proxy Registry</h3>
|
||||
<p className="text-sm text-text-muted">Store reusable proxies and track assignments.</p>
|
||||
</div>
|
||||
<div className="flex items-center gap-2">
|
||||
<Button
|
||||
size="sm"
|
||||
variant="secondary"
|
||||
icon="upgrade"
|
||||
onClick={handleMigrate}
|
||||
loading={migrating}
|
||||
data-testid="proxy-registry-import-legacy"
|
||||
>
|
||||
Import Legacy
|
||||
</Button>
|
||||
<Button
|
||||
size="sm"
|
||||
variant="secondary"
|
||||
icon="account_tree"
|
||||
onClick={() => setBulkOpen(true)}
|
||||
data-testid="proxy-registry-open-bulk"
|
||||
>
|
||||
Bulk Assign
|
||||
</Button>
|
||||
<Button
|
||||
size="sm"
|
||||
icon="add"
|
||||
onClick={openCreate}
|
||||
data-testid="proxy-registry-open-create"
|
||||
>
|
||||
Add Proxy
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{error && (
|
||||
<div className="mb-3 px-3 py-2 rounded border border-red-500/30 bg-red-500/10 text-sm text-red-400">
|
||||
{error}
|
||||
</div>
|
||||
)}
|
||||
|
||||
{loading ? (
|
||||
<div className="text-sm text-text-muted">Loading proxies...</div>
|
||||
) : items.length === 0 ? (
|
||||
<div className="text-sm text-text-muted">No saved proxies yet.</div>
|
||||
) : (
|
||||
<div className="overflow-x-auto">
|
||||
<table className="w-full text-sm">
|
||||
<thead>
|
||||
<tr className="text-left text-text-muted border-b border-border">
|
||||
<th className="py-2 pr-3">Name</th>
|
||||
<th className="py-2 pr-3">Endpoint</th>
|
||||
<th className="py-2 pr-3">Status</th>
|
||||
<th className="py-2 pr-3">Health (24h)</th>
|
||||
<th className="py-2 pr-3">Usage</th>
|
||||
<th className="py-2">Actions</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
{items.map((item) => {
|
||||
const usage = usageById[item.id];
|
||||
const health = healthById[item.id];
|
||||
return (
|
||||
<tr key={item.id} className="border-b border-border/60">
|
||||
<td className="py-2 pr-3">
|
||||
<div className="font-medium text-text-main">{item.name}</div>
|
||||
{item.region && (
|
||||
<div className="text-xs text-text-muted">{item.region}</div>
|
||||
)}
|
||||
</td>
|
||||
<td className="py-2 pr-3 font-mono text-xs text-text-muted">
|
||||
{item.type}://{item.host}:{item.port}
|
||||
</td>
|
||||
<td className="py-2 pr-3">
|
||||
<span className="text-xs px-2 py-1 rounded border border-border bg-bg-subtle">
|
||||
{item.status || "active"}
|
||||
</span>
|
||||
</td>
|
||||
<td className="py-2 pr-3 text-xs text-text-muted">
|
||||
{health ? (
|
||||
<div className="flex flex-col gap-0.5">
|
||||
<span>{health.successRate ?? 0}% success</span>
|
||||
<span>{health.avgLatencyMs ?? "-"} ms avg</span>
|
||||
</div>
|
||||
) : (
|
||||
"-"
|
||||
)}
|
||||
</td>
|
||||
<td className="py-2 pr-3 text-xs text-text-muted">
|
||||
{usage ? `${usage.count} assignment(s)` : "-"}
|
||||
</td>
|
||||
<td className="py-2">
|
||||
<div className="flex items-center gap-1">
|
||||
<Button
|
||||
size="sm"
|
||||
variant="ghost"
|
||||
icon="visibility"
|
||||
onClick={() => void loadUsage(item.id)}
|
||||
>
|
||||
Usage
|
||||
</Button>
|
||||
<Button
|
||||
size="sm"
|
||||
variant="ghost"
|
||||
icon="edit"
|
||||
onClick={() => openEdit(item)}
|
||||
>
|
||||
Edit
|
||||
</Button>
|
||||
<Button
|
||||
size="sm"
|
||||
variant="ghost"
|
||||
icon="delete"
|
||||
onClick={() => void handleDelete(item.id)}
|
||||
className="!text-red-400"
|
||||
>
|
||||
Delete
|
||||
</Button>
|
||||
</div>
|
||||
</td>
|
||||
</tr>
|
||||
);
|
||||
})}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
)}
|
||||
</Card>
|
||||
|
||||
<Modal
|
||||
isOpen={modalOpen}
|
||||
onClose={() => {
|
||||
if (!saving) setModalOpen(false);
|
||||
}}
|
||||
title={editingId ? "Edit Proxy" : "Create Proxy"}
|
||||
maxWidth="lg"
|
||||
>
|
||||
<div className="flex flex-col gap-3">
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Name</label>
|
||||
<input
|
||||
data-testid="proxy-registry-name-input"
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.name}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, name: e.target.value }))}
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Type</label>
|
||||
<select
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.type}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, type: e.target.value }))}
|
||||
>
|
||||
<option value="http">HTTP</option>
|
||||
<option value="https">HTTPS</option>
|
||||
<option value="socks5">SOCKS5</option>
|
||||
</select>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Host</label>
|
||||
<input
|
||||
data-testid="proxy-registry-host-input"
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.host}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, host: e.target.value }))}
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Port</label>
|
||||
<input
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.port}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, port: e.target.value }))}
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Username</label>
|
||||
<input
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.username}
|
||||
placeholder={editingId ? "Leave blank to keep current username" : ""}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, username: e.target.value }))}
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Password</label>
|
||||
<input
|
||||
type="password"
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.password}
|
||||
placeholder={editingId ? "Leave blank to keep current password" : ""}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, password: e.target.value }))}
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Region</label>
|
||||
<input
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.region}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, region: e.target.value }))}
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Status</label>
|
||||
<select
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.status}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, status: e.target.value }))}
|
||||
>
|
||||
<option value="active">active</option>
|
||||
<option value="inactive">inactive</option>
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Notes</label>
|
||||
<textarea
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={form.notes}
|
||||
onChange={(e) => setForm((prev) => ({ ...prev, notes: e.target.value }))}
|
||||
rows={3}
|
||||
/>
|
||||
</div>
|
||||
|
||||
<div className="flex items-center justify-end gap-2 pt-2 border-t border-border">
|
||||
<Button size="sm" variant="secondary" onClick={() => setModalOpen(false)}>
|
||||
Cancel
|
||||
</Button>
|
||||
<Button size="sm" icon="save" onClick={handleSave} loading={saving}>
|
||||
Save
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
</Modal>
|
||||
|
||||
<Modal
|
||||
isOpen={bulkOpen}
|
||||
onClose={() => {
|
||||
if (!bulkSaving) setBulkOpen(false);
|
||||
}}
|
||||
title="Bulk Proxy Assignment"
|
||||
maxWidth="lg"
|
||||
>
|
||||
<div className="flex flex-col gap-3">
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Scope</label>
|
||||
<select
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={bulkScope}
|
||||
onChange={(e) => setBulkScope(e.target.value)}
|
||||
>
|
||||
<option value="global">global</option>
|
||||
<option value="provider">provider</option>
|
||||
<option value="account">account</option>
|
||||
<option value="combo">combo</option>
|
||||
</select>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">Proxy</label>
|
||||
<select
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
value={bulkProxyId}
|
||||
onChange={(e) => setBulkProxyId(e.target.value)}
|
||||
>
|
||||
<option value="">(clear assignment)</option>
|
||||
{items.map((item) => (
|
||||
<option key={item.id} value={item.id}>
|
||||
{item.name} ({item.type}://{item.host}:{item.port})
|
||||
</option>
|
||||
))}
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{bulkScope !== "global" && (
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1 block">
|
||||
Scope IDs (comma or newline)
|
||||
</label>
|
||||
<textarea
|
||||
data-testid="proxy-registry-bulk-scopeids-input"
|
||||
className="w-full px-3 py-2 rounded bg-bg-subtle border border-border"
|
||||
rows={5}
|
||||
value={bulkScopeIds}
|
||||
onChange={(e) => setBulkScopeIds(e.target.value)}
|
||||
placeholder="provider-openai,provider-anthropic"
|
||||
/>
|
||||
</div>
|
||||
)}
|
||||
|
||||
<div className="flex items-center justify-end gap-2 pt-2 border-t border-border">
|
||||
<Button size="sm" variant="secondary" onClick={() => setBulkOpen(false)}>
|
||||
Cancel
|
||||
</Button>
|
||||
<Button
|
||||
size="sm"
|
||||
icon="done_all"
|
||||
onClick={handleBulkAssign}
|
||||
loading={bulkSaving}
|
||||
data-testid="proxy-registry-bulk-apply"
|
||||
>
|
||||
Apply
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
</Modal>
|
||||
</>
|
||||
);
|
||||
}
|
||||
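The `handleSave` logic in `ProxyRegistryManager` above always sends `username` and `password` when creating a proxy, but on edit it only sends them when the field is non-blank, so a blank field keeps the stored credential. A minimal sketch of that payload rule as a pure function follows; `buildProxyPayload`, `ProxyForm`, and `BLANK_FORM` are hypothetical names introduced here so the rule can be exercised without React.

```typescript
// Form shape mirroring EMPTY_FORM in the component (all fields are strings).
type ProxyForm = {
  id: string; name: string; type: string; host: string; port: string;
  username: string; password: string; region: string; notes: string; status: string;
};

const BLANK_FORM: ProxyForm = {
  id: "", name: "", type: "http", host: "", port: "8080",
  username: "", password: "", region: "", notes: "", status: "active",
};

// Hypothetical extraction of the payload rule from handleSave.
function buildProxyPayload(form: ProxyForm): Record<string, unknown> {
  const editingId = form.id || "";
  const username = form.username.trim();
  const password = form.password.trim();
  const payload: Record<string, unknown> = {
    ...(editingId ? { id: editingId } : {}),
    name: form.name.trim(),
    type: form.type,
    host: form.host.trim(),
    port: Number(form.port || 8080),
    region: form.region.trim() || null,
    notes: form.notes.trim() || null,
    status: form.status,
  };
  // Create: always send credentials. Edit: send only when the user typed something.
  if (!editingId || username.length > 0) payload.username = username;
  if (!editingId || password.length > 0) payload.password = password;
  return payload;
}

const editPayload = buildProxyPayload({ ...BLANK_FORM, id: "p1", name: "edge", host: "10.0.0.1" });
console.log("username" in editPayload); // false: blank on edit keeps the stored value
```

One consequence of this rule is that an existing credential cannot be cleared through the edit form, since an empty field is indistinguishable from "leave unchanged".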
@@ -3,6 +3,7 @@
 import { useState, useEffect, useRef } from "react";
 import { Card, Button, ProxyConfigModal } from "@/shared/components";
 import { useTranslations } from "next-intl";
+import ProxyRegistryManager from "./ProxyRegistryManager";
 
 export default function ProxyTab() {
   const [proxyModalOpen, setProxyModalOpen] = useState(false);
@@ -41,39 +42,43 @@ export default function ProxyTab() {
 
   return (
     <>
-      <Card className="p-0 overflow-hidden">
-        <div className="p-6">
-          <div className="flex items-center gap-2 mb-4">
-            <span className="material-symbols-outlined text-xl text-primary" aria-hidden="true">
-              vpn_lock
-            </span>
-            <h2 className="text-lg font-bold">{t("globalProxy")}</h2>
+      <div className="flex flex-col gap-6">
+        <Card className="p-0 overflow-hidden">
+          <div className="p-6">
+            <div className="flex items-center gap-2 mb-4">
+              <span className="material-symbols-outlined text-xl text-primary" aria-hidden="true">
+                vpn_lock
+              </span>
+              <h2 className="text-lg font-bold">{t("globalProxy")}</h2>
+            </div>
+            <p className="text-sm text-text-muted mb-4">{t("globalProxyDesc")}</p>
+            <div className="flex items-center gap-3">
+              {globalProxy ? (
+                <div className="flex items-center gap-2">
+                  <span className="px-2.5 py-1 rounded text-xs font-bold uppercase bg-emerald-500/15 text-emerald-400 border border-emerald-500/30">
+                    {globalProxy.type}://{globalProxy.host}:{globalProxy.port}
+                  </span>
+                </div>
+              ) : (
+                <span className="text-sm text-text-muted">{t("noGlobalProxy")}</span>
+              )}
+              <Button
+                size="sm"
+                variant={globalProxy ? "secondary" : "primary"}
+                icon="settings"
+                onClick={() => {
+                  loadGlobalProxy();
+                  setProxyModalOpen(true);
+                }}
+              >
+                {globalProxy ? tc("edit") : t("configure")}
+              </Button>
+            </div>
           </div>
-          <p className="text-sm text-text-muted mb-4">{t("globalProxyDesc")}</p>
-          <div className="flex items-center gap-3">
-            {globalProxy ? (
-              <div className="flex items-center gap-2">
-                <span className="px-2.5 py-1 rounded text-xs font-bold uppercase bg-emerald-500/15 text-emerald-400 border border-emerald-500/30">
-                  {globalProxy.type}://{globalProxy.host}:{globalProxy.port}
-                </span>
-              </div>
-            ) : (
-              <span className="text-sm text-text-muted">{t("noGlobalProxy")}</span>
-            )}
-            <Button
-              size="sm"
-              variant={globalProxy ? "secondary" : "primary"}
-              icon="settings"
-              onClick={() => {
-                loadGlobalProxy();
-                setProxyModalOpen(true);
-              }}
-            >
-              {globalProxy ? tc("edit") : t("configure")}
-            </Button>
-          </div>
-        </div>
-      </Card>
+        </Card>
+
+        <ProxyRegistryManager />
+      </div>
 
       <ProxyConfigModal
         isOpen={proxyModalOpen}
@@ -1,7 +1,7 @@
 import { NextResponse } from "next/server";
-import path from "node:path";
-import fs from "node:fs";
-import os from "node:os";
+import path from "path";
+import fs from "fs";
+import os from "os";
 import { getDbInstance, SQLITE_FILE } from "@/lib/db/core";
 import { isAuthRequired, isAuthenticated } from "@/shared/utils/apiAuth";
 
@@ -1,8 +1,8 @@
 import { NextResponse } from "next/server";
 import { getDbInstance, SQLITE_FILE } from "@/lib/db/core";
-import fs from "node:fs";
-import path from "node:path";
-import os from "node:os";
+import fs from "fs";
+import path from "path";
+import os from "os";
 
 /**
  * GET /api/db-backups/exportAll
@@ -1,8 +1,8 @@
 import { NextResponse } from "next/server";
 import Database from "better-sqlite3";
-import path from "node:path";
-import fs from "node:fs";
-import os from "node:os";
+import path from "path";
+import fs from "fs";
+import os from "os";
 import { getDbInstance, resetDbInstance, SQLITE_FILE } from "@/lib/db/core";
 import { backupDbFile } from "@/lib/db/backup";
 import { isAuthRequired, isAuthenticated } from "@/shared/utils/apiAuth";
@@ -0,0 +1,50 @@
+/**
+ * GET /api/logs/detail — List detailed request logs
+ * GET /api/logs/detail/:id — Get specific detailed log
+ * POST /api/logs/detail/toggle — Enable/disable detailed logging
+ */
+import { NextRequest, NextResponse } from "next/server";
+import { isAuthenticated } from "@/shared/utils/apiAuth";
+import {
+  getRequestDetailLogs,
+  getRequestDetailLogCount,
+  isDetailedLoggingEnabled,
+} from "@/lib/db/detailedLogs";
+import { updateSettings } from "@/lib/db/settings";
+
+export const dynamic = "force-dynamic";
+
+export async function GET(req: NextRequest) {
+  if (!isAuthenticated(req)) {
+    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
+  }
+
+  const url = new URL(req.url);
+  const limit = Math.min(Number(url.searchParams.get("limit") ?? 50), 200);
+  const offset = Number(url.searchParams.get("offset") ?? 0);
+
+  const logs = getRequestDetailLogs(limit, offset);
+  const total = getRequestDetailLogCount();
+  const enabled = await isDetailedLoggingEnabled();
+
+  return NextResponse.json({ enabled, total, logs });
+}
+
+export async function POST(req: NextRequest) {
+  if (!isAuthenticated(req)) {
+    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
+  }
+
+  const body = await req.json();
+  const enabled = body.enabled === true || body.enabled === "1";
+
+  await updateSettings({ detailed_logs_enabled: enabled });
+
+  return NextResponse.json({
+    success: true,
+    enabled,
+    message: enabled
+      ? "Detailed logging enabled. Pipeline bodies will be captured for new requests."
+      : "Detailed logging disabled.",
+  });
+}
|
||||
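The `GET /api/logs/detail` handler above reads `limit` and `offset` with `Number(...)`. A non-numeric value such as `limit=abc` yields `NaN` (and `Math.min(NaN, 200)` is still `NaN`), and negative values pass through unchanged; how `getRequestDetailLogs` reacts to those is not shown in the diff. A minimal sketch of a stricter parse follows, where `parseBoundedInt` is a hypothetical helper and not part of the change set.

```typescript
// Hypothetical helper: parse a query value as an integer, fall back when it is
// missing or not a finite number, and clamp the result to [min, max].
function parseBoundedInt(raw: string | null, fallback: number, min: number, max: number): number {
  if (raw === null || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!Number.isFinite(n)) return fallback;
  return Math.min(Math.max(Math.trunc(n), min), max);
}

// Example with the same defaults the handler uses (limit 50, capped at 200; offset 0).
const exampleUrl = new URL("http://localhost/api/logs/detail?limit=abc&offset=-5");
const safeLimit = parseBoundedInt(exampleUrl.searchParams.get("limit"), 50, 1, 200);
const safeOffset = parseBoundedInt(exampleUrl.searchParams.get("offset"), 0, 0, Number.MAX_SAFE_INTEGER);
console.log(safeLimit, safeOffset);
```

The lower bound of 1 for `limit` is a design choice of this sketch; a deployment that wants `limit=0` to mean "count only" would pass 0 instead.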
@@ -13,11 +13,13 @@ export async function GET() {
     const { getAllCircuitBreakerStatuses } = await import("@/shared/utils/circuitBreaker");
     const { getAllRateLimitStatus } = await import("@omniroute/open-sse/services/rateLimitManager");
     const { getAllModelLockouts } = await import("@omniroute/open-sse/services/accountFallback");
+    const { getInflightCount } = await import("@omniroute/open-sse/services/requestDedup.ts");
 
     const settings = await getSettings();
     const circuitBreakers = getAllCircuitBreakerStatuses();
     const rateLimitStatus = getAllRateLimitStatus();
     const lockouts = getAllModelLockouts();
+    const { getAllHealthStatuses } = await import("@/lib/localHealthCheck");
 
     // System info
     const system = {
@@ -46,8 +48,12 @@ export async function GET() {
       timestamp: new Date().toISOString(),
       system,
       providerHealth,
+      localProviders: getAllHealthStatuses(),
       rateLimitStatus,
       lockouts,
+      dedup: {
+        inflightRequests: getInflightCount(),
+      },
       setupComplete: settings?.setupComplete || false,
     });
   } catch (error) {
@@ -1,5 +1,5 @@
 import { NextResponse } from "next/server";
-import { timingSafeEqual } from "node:crypto";
+import { timingSafeEqual } from "crypto";
 import {
   getProvider,
   generateAuthData,
@@ -255,6 +255,22 @@ const PROVIDER_MODELS_CONFIG: Record<string, ProviderModelsConfigEntry> = {
     authPrefix: "Bearer ",
     parseResponse: (data) => data.models || data.data || [],
   },
+  synthetic: {
+    url: "https://api.synthetic.new/openai/v1/models",
+    method: "GET",
+    headers: { "Content-Type": "application/json" },
+    authHeader: "Authorization",
+    authPrefix: "Bearer ",
+    parseResponse: (data) => data.data || data.models || [],
+  },
+  "kilo-gateway": {
+    url: "https://api.kilo.ai/api/gateway/models",
+    method: "GET",
+    headers: { "Content-Type": "application/json" },
+    authHeader: "Authorization",
+    authPrefix: "Bearer ",
+    parseResponse: (data) => data.data || data.models || [],
+  },
 };
 
 /**
|
||||
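Both new `PROVIDER_MODELS_CONFIG` entries above use `parseResponse: (data) => data.data || data.models || []`, which prefers an OpenAI-style `data` array and falls back to `models`. A minimal sketch of the same fallback with an added array check follows; `parseModelList` and `ModelsPayload` are hypothetical names, and the `Array.isArray` guard is an addition of this sketch rather than something the diff contains.

```typescript
// Loose shape of a models-list response; either key may be present.
type ModelsPayload = { data?: unknown; models?: unknown };

// Hypothetical variant of parseResponse: same precedence (data, then models),
// but anything that is not an array is treated as an empty list.
function parseModelList(payload: ModelsPayload): unknown[] {
  const candidate = payload.data || payload.models || [];
  return Array.isArray(candidate) ? candidate : [];
}

console.log(parseModelList({ data: [{ id: "m1" }] }).length); // 1
console.log(parseModelList({ models: [{ id: "m1" }, { id: "m2" }] }).length); // 2
console.log(parseModelList({ data: { unexpected: true } }).length); // 0
```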
@@ -20,6 +20,7 @@ const OAUTH_TEST_CONFIG = {
   claude: {
     // Claude doesn't have userinfo, we verify token exists and not expired
     checkExpiry: true,
+    refreshable: true,
   },
   codex: {
     // Codex OAuth tokens are ChatGPT session tokens, NOT standard OpenAI API keys.
@@ -0,0 +1,63 @@
+import { assignProxyToScope, getProxyAssignments, resolveProxyForConnection } from "@/lib/localDb";
+import { proxyAssignmentSchema } from "@/shared/validation/schemas";
+import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
+import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
+import { clearDispatcherCache } from "@omniroute/open-sse/utils/proxyDispatcher";
+
+export async function GET(request: Request) {
+  try {
+    const { searchParams } = new URL(request.url);
+    const proxyId = searchParams.get("proxyId");
+    const scope = searchParams.get("scope");
+    const scopeId = searchParams.get("scopeId");
+    const resolveConnectionId = searchParams.get("resolveConnectionId");
+
+    if (resolveConnectionId) {
+      const resolved = await resolveProxyForConnection(resolveConnectionId);
+      return Response.json(resolved);
+    }
+
+    const assignments = await getProxyAssignments({
+      proxyId: proxyId || undefined,
+      scope: scope || undefined,
+    });
+    const filtered = scopeId
+      ? assignments.filter((entry) => entry.scopeId === scopeId)
+      : assignments;
+    return Response.json({ items: filtered, total: filtered.length });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to load proxy assignments");
+  }
+}
+
+export async function PUT(request: Request) {
+  let rawBody: unknown;
+  try {
+    rawBody = await request.json();
+  } catch {
+    return createErrorResponse({
+      status: 400,
+      message: "Invalid JSON body",
+      type: "invalid_request",
+    });
+  }
+
+  try {
+    const validation = validateBody(proxyAssignmentSchema, rawBody);
+    if (isValidationFailure(validation)) {
+      return createErrorResponse({
+        status: 400,
+        message: validation.error.message,
+        details: validation.error.details,
+        type: "invalid_request",
+      });
+    }
+
+    const { scope, scopeId, proxyId } = validation.data;
+    const assigned = await assignProxyToScope(scope, scopeId || null, proxyId || null);
+    clearDispatcherCache();
+    return Response.json({ success: true, assignment: assigned });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to update assignment");
+  }
+}
@@ -0,0 +1,45 @@
+import { bulkAssignProxyToScope } from "@/lib/localDb";
+import { bulkProxyAssignmentSchema } from "@/shared/validation/schemas";
+import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
+import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
+import { clearDispatcherCache } from "@omniroute/open-sse/utils/proxyDispatcher";
+
+export async function PUT(request: Request) {
+  let rawBody: unknown;
+  try {
+    rawBody = await request.json();
+  } catch {
+    return createErrorResponse({
+      status: 400,
+      message: "Invalid JSON body",
+      type: "invalid_request",
+    });
+  }
+
+  try {
+    const validation = validateBody(bulkProxyAssignmentSchema, rawBody);
+    if (isValidationFailure(validation)) {
+      return createErrorResponse({
+        status: 400,
+        message: validation.error.message,
+        details: validation.error.details,
+        type: "invalid_request",
+      });
+    }
+
+    const { scope, scopeIds, proxyId } = validation.data;
+    const normalizedScope = scope === "key" ? "account" : scope;
+    const result = await bulkAssignProxyToScope(normalizedScope, scopeIds || [], proxyId || null);
+    clearDispatcherCache();
+
+    return Response.json({
+      success: true,
+      scope: normalizedScope,
+      requested: normalizedScope === "global" ? 1 : (scopeIds || []).length,
+      updated: result.updated,
+      failed: result.failed,
+    });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to run bulk assignment");
+  }
+}
|
||||
@@ -0,0 +1,13 @@
import { getProxyHealthStats } from "@/lib/localDb";
import { createErrorResponseFromUnknown } from "@/lib/api/errorResponse";

export async function GET(request: Request) {
  try {
    const { searchParams } = new URL(request.url);
    const hours = Number(searchParams.get("hours") || 24);
    const items = await getProxyHealthStats({ hours });
    return Response.json({ items, total: items.length, windowHours: hours });
  } catch (error) {
    return createErrorResponseFromUnknown(error, "Failed to load proxy health stats");
  }
}
@@ -0,0 +1,40 @@
import { migrateLegacyProxyConfigToRegistry } from "@/lib/localDb";
import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
import { z } from "zod";

const migrateLegacyProxySchema = z.object({
  force: z.boolean().optional(),
});

export async function POST(request: Request) {
  let rawBody: unknown;

  try {
    rawBody = await request.json();
  } catch {
    return createErrorResponse({
      status: 400,
      message: "Invalid JSON body",
      type: "invalid_request",
    });
  }

  try {
    const validation = validateBody(migrateLegacyProxySchema, rawBody);
    if (isValidationFailure(validation)) {
      return createErrorResponse({
        status: 400,
        message: validation.error.message,
        details: validation.error.details,
        type: "invalid_request",
      });
    }

    const force = validation.data.force === true;
    const result = await migrateLegacyProxyConfigToRegistry({ force });
    return Response.json(result);
  } catch (error) {
    return createErrorResponseFromUnknown(error, "Failed to migrate legacy proxy config");
  }
}
@@ -0,0 +1,127 @@
import {
  createProxy,
  deleteProxyById,
  getProxyById,
  getProxyWhereUsed,
  listProxies,
  updateProxy,
} from "@/lib/localDb";
import { createProxyRegistrySchema, updateProxyRegistrySchema } from "@/shared/validation/schemas";
import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";

export async function GET(request: Request) {
  try {
    const { searchParams } = new URL(request.url);
    const id = searchParams.get("id");
    const whereUsed = searchParams.get("whereUsed") === "1";

    if (id && whereUsed) {
      const usage = await getProxyWhereUsed(id);
      return Response.json(usage);
    }

    if (id) {
      const proxy = await getProxyById(id, { includeSecrets: false });
      if (!proxy) {
        return createErrorResponse({ status: 404, message: "Proxy not found", type: "not_found" });
      }
      return Response.json(proxy);
    }

    const proxies = await listProxies({ includeSecrets: false });
    return Response.json({ items: proxies, total: proxies.length });
  } catch (error) {
    return createErrorResponseFromUnknown(error, "Failed to load proxies");
  }
}

export async function POST(request: Request) {
  let rawBody: unknown;
  try {
    rawBody = await request.json();
  } catch {
    return createErrorResponse({
      status: 400,
      message: "Invalid JSON body",
      type: "invalid_request",
    });
  }

  try {
    const validation = validateBody(createProxyRegistrySchema, rawBody);
    if (isValidationFailure(validation)) {
      return createErrorResponse({
        status: 400,
        message: validation.error.message,
        details: validation.error.details,
        type: "invalid_request",
      });
    }

    const created = await createProxy(validation.data);
    return Response.json(created, { status: 201 });
  } catch (error) {
    return createErrorResponseFromUnknown(error, "Failed to create proxy");
  }
}

export async function PATCH(request: Request) {
  let rawBody: unknown;
  try {
    rawBody = await request.json();
  } catch {
    return createErrorResponse({
      status: 400,
      message: "Invalid JSON body",
      type: "invalid_request",
    });
  }

  try {
    const validation = validateBody(updateProxyRegistrySchema, rawBody);
    if (isValidationFailure(validation)) {
      return createErrorResponse({
        status: 400,
        message: validation.error.message,
        details: validation.error.details,
        type: "invalid_request",
      });
    }

    const { id, ...changes } = validation.data;
    const updated = await updateProxy(id, changes);
    if (!updated) {
      return createErrorResponse({ status: 404, message: "Proxy not found", type: "not_found" });
    }

    return Response.json(updated);
  } catch (error) {
    return createErrorResponseFromUnknown(error, "Failed to update proxy");
  }
}

export async function DELETE(request: Request) {
  try {
    const { searchParams } = new URL(request.url);
    const id = searchParams.get("id");
    const force = searchParams.get("force") === "1";

    if (!id) {
      return createErrorResponse({
        status: 400,
        message: "id is required",
        type: "invalid_request",
      });
    }

    const deleted = await deleteProxyById(id, { force });
    if (!deleted) {
      return createErrorResponse({ status: 404, message: "Proxy not found", type: "not_found" });
    }

    return Response.json({ success: true });
  } catch (error) {
    return createErrorResponseFromUnknown(error, "Failed to delete proxy");
  }
}
@@ -8,6 +8,7 @@ import {
 import { clearDispatcherCache } from "@omniroute/open-sse/utils/proxyDispatcher";
 import { updateProxyConfigSchema } from "@/shared/validation/schemas";
 import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
+import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
 import type { z } from "zod";
 
 const BASE_SUPPORTED_PROXY_TYPES = new Set(["http", "https"]);
@@ -135,11 +136,7 @@ export async function GET(request: Request) {
     const config = await getProxyConfig();
     return Response.json(config);
   } catch (error) {
-    const routeError = toApiRouteError(error);
-    return Response.json(
-      { error: { message: routeError.message, type: "server_error" } },
-      { status: 500 }
-    );
+    return createErrorResponseFromUnknown(error, "Failed to load proxy config");
   }
 }
 
@@ -152,25 +149,22 @@ export async function PUT(request: Request) {
   try {
     rawBody = await request.json();
   } catch {
-    return Response.json(
-      { error: { message: "Invalid JSON body", type: "invalid_request" } },
-      { status: 400 }
-    );
+    return createErrorResponse({
+      status: 400,
+      message: "Invalid JSON body",
+      type: "invalid_request",
+    });
   }
 
   try {
     const validation = validateBody(updateProxyConfigSchema, rawBody);
     if (isValidationFailure(validation)) {
-      return Response.json(
-        {
-          error: {
-            message: validation.error.message,
-            details: validation.error.details,
-            type: "invalid_request",
-          },
-        },
-        { status: 400 }
-      );
+      return createErrorResponse({
+        status: 400,
+        message: validation.error.message,
+        details: validation.error.details,
+        type: "invalid_request",
+      });
     }
     const body = validation.data;
     const normalizedBody = normalizeProxyPayload(body);
@@ -181,7 +175,7 @@ export async function PUT(request: Request) {
     const routeError = toApiRouteError(error);
     const status = Number(routeError.status) || 500;
     const type = routeError.type || (status === 400 ? "invalid_request" : "server_error");
-    return Response.json({ error: { message: routeError.message, type } }, { status });
+    return createErrorResponse({ status, message: routeError.message, type });
   }
 }
 
@@ -196,20 +190,17 @@ export async function DELETE(request: Request) {
     const id = searchParams.get("id");
 
     if (!level) {
-      return Response.json(
-        { error: { message: "level is required", type: "invalid_request" } },
-        { status: 400 }
-      );
+      return createErrorResponse({
+        status: 400,
+        message: "level is required",
+        type: "invalid_request",
+      });
     }
 
     const updated = await deleteProxyForLevel(level, id);
     clearDispatcherCache();
     return Response.json(updated);
   } catch (error) {
-    const routeError = toApiRouteError(error);
-    return Response.json(
-      { error: { message: routeError.message, type: "server_error" } },
-      { status: 500 }
-    );
+    return createErrorResponseFromUnknown(error, "Failed to delete proxy");
   }
 }
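The lines removed in this route all built the same `{ error: { message, type, details } }` envelope by hand before calling `Response.json`. The following is only a rough sketch of what a shared helper with that shape might look like. The real `createErrorResponse` in `@/lib/api/errorResponse` is not part of this diff, so `ErrorResponseOptions`, `ErrorEnvelope`, `buildErrorEnvelope`, and `createErrorResponseSketch` are hypothetical names used for illustration.

```typescript
// Hypothetical sketch: the envelope shape is inferred from the inline
// Response.json(...) calls that this diff removes, not from the real helper.
interface ErrorResponseOptions {
  status: number;
  message: string;
  type: string;
  details?: unknown;
}

interface ErrorEnvelope {
  error: { message: string; type: string; details?: unknown };
}

// Build the JSON body; `details` is only included when the caller supplies it.
function buildErrorEnvelope({ message, type, details }: ErrorResponseOptions): ErrorEnvelope {
  const error: ErrorEnvelope["error"] = { message, type };
  if (details !== undefined) error.details = details;
  return { error };
}

// Wrap the envelope in a Response carrying the requested HTTP status.
function createErrorResponseSketch(options: ErrorResponseOptions): Response {
  return new Response(JSON.stringify(buildErrorEnvelope(options)), {
    status: options.status,
    headers: { "Content-Type": "application/json" },
  });
}
```

Keeping the body builder separate from the `Response` wrapper makes the envelope easy to unit-test without reading a response stream.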
@@ -7,6 +7,7 @@ import {
 } from "@omniroute/open-sse/utils/proxyDispatcher.ts";
 import { testProxySchema } from "@/shared/validation/schemas";
 import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
+import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
 
 const BASE_SUPPORTED_PROXY_TYPES = new Set(["http", "https"]);
 
@@ -38,61 +39,46 @@ export async function POST(request: Request) {
   try {
     rawBody = await request.json();
   } catch {
-    return Response.json(
-      { error: { message: "Invalid JSON body", type: "invalid_request" } },
-      { status: 400 }
-    );
+    return createErrorResponse({
+      status: 400,
+      message: "Invalid JSON body",
+      type: "invalid_request",
+    });
   }
 
   try {
     const validation = validateBody(testProxySchema, rawBody);
     if (isValidationFailure(validation)) {
-      return Response.json(
-        {
-          error: {
-            message: validation.error.message,
-            details: validation.error.details,
-            type: "invalid_request",
-          },
-        },
-        { status: 400 }
-      );
+      return createErrorResponse({
+        status: 400,
+        message: validation.error.message,
+        details: validation.error.details,
+        type: "invalid_request",
+      });
     }
     const { proxy } = validation.data;
 
     const proxyType = String(proxy.type || "http").toLowerCase();
     if (proxyType === "socks5" && !isSocks5ProxyEnabled()) {
-      return Response.json(
-        {
-          error: {
-            message: "SOCKS5 proxy is disabled (set ENABLE_SOCKS5_PROXY=true to enable)",
-            type: "invalid_request",
-          },
-        },
-        { status: 400 }
-      );
+      return createErrorResponse({
+        status: 400,
+        message: "SOCKS5 proxy is disabled (set ENABLE_SOCKS5_PROXY=true to enable)",
+        type: "invalid_request",
+      });
     }
     if (proxyType.startsWith("socks") && proxyType !== "socks5") {
-      return Response.json(
-        {
-          error: {
-            message: `proxy.type must be ${supportedTypesMessage()}`,
-            type: "invalid_request",
-          },
-        },
-        { status: 400 }
-      );
+      return createErrorResponse({
+        status: 400,
+        message: `proxy.type must be ${supportedTypesMessage()}`,
+        type: "invalid_request",
+      });
     }
     if (!getSupportedProxyTypes().has(proxyType)) {
-      return Response.json(
-        {
-          error: {
-            message: `proxy.type must be ${supportedTypesMessage()}`,
-            type: "invalid_request",
-          },
-        },
-        { status: 400 }
-      );
+      return createErrorResponse({
+        status: 400,
+        message: `proxy.type must be ${supportedTypesMessage()}`,
+        type: "invalid_request",
+      });
     }
 
     let proxyUrl: string;
@@ -108,27 +94,19 @@ export async function POST(request: Request) {
         { allowSocks5: isSocks5ProxyEnabled() }
       );
       if (!normalizedProxyUrl) {
-        return Response.json(
-          {
-            error: {
-              message: "Invalid proxy configuration",
-              type: "invalid_request",
-            },
-          },
-          { status: 400 }
-        );
+        return createErrorResponse({
+          status: 400,
+          message: "Invalid proxy configuration",
+          type: "invalid_request",
+        });
       }
       proxyUrl = normalizedProxyUrl;
     } catch (proxyError) {
-      return Response.json(
-        {
-          error: {
-            message: getErrorMessage(proxyError, "Invalid proxy configuration"),
-            type: "invalid_request",
-          },
-        },
-        { status: 400 }
-      );
+      return createErrorResponse({
+        status: 400,
+        message: getErrorMessage(proxyError, "Invalid proxy configuration"),
+        type: "invalid_request",
+      });
     }
 
     const publicProxyUrl = proxyUrlForLogs(proxyUrl);
@@ -180,7 +158,6 @@ export async function POST(request: Request) {
       clearTimeout(timeout);
     }
   } catch (error) {
-    const message = getErrorMessage(error, "Unexpected server error");
-    return Response.json({ error: { message, type: "server_error" } }, { status: 500 });
+    return createErrorResponseFromUnknown(error, "Unexpected server error");
   }
 }
@@ -2,7 +2,7 @@ import { NextResponse } from "next/server";
 import { getSettings, updateSettings } from "@/lib/localDb";
 import { clearHealthCheckLogCache } from "@/lib/tokenHealthCheck";
 import bcrypt from "bcryptjs";
-import { timingSafeEqual } from "node:crypto";
+import { timingSafeEqual } from "crypto";
 import { getRuntimePorts } from "@/lib/runtime/ports";
 import { updateSettingsSchema } from "@/shared/validation/settingsSchemas";
 import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
@@ -1,6 +1,6 @@
 import { NextResponse } from "next/server";
-import path from "node:path";
-import fs from "node:fs";
+import path from "path";
+import fs from "fs";
 import { resolveDataDir } from "@/lib/dataPaths";
 
 /**
@@ -0,0 +1,115 @@
/**
 * GET /api/system/version — Returns current version and latest available on npm
 * POST /api/system/update — Triggers npm install -g omniroute@latest + pm2 restart
 *
 * Security: Requires admin authentication (same as other management routes).
 * Safety: Update only runs if a newer version is available on npm.
 */
import { NextRequest, NextResponse } from "next/server";
import { execFile } from "child_process";
import { promisify } from "util";
import { isAuthenticated } from "@/shared/utils/apiAuth";

const execFileAsync = promisify(execFile);

export const dynamic = "force-dynamic";

/** Fetch latest version from npm registry (no install, just metadata) */
async function getLatestNpmVersion(): Promise<string | null> {
  try {
    const { stdout } = await execFileAsync("npm", ["info", "omniroute", "version", "--json"], {
      timeout: 10000,
    });
    const parsed = JSON.parse(stdout.trim());
    return typeof parsed === "string" ? parsed : null;
  } catch {
    return null;
  }
}

/** Current installed version from package.json */
function getCurrentVersion(): string {
  try {

    return require("../../../../../package.json").version as string;
  } catch {
    return "unknown";
  }
}

/** Compare semver strings — returns true if a > b */
function isNewer(a: string | null, b: string): boolean {
  if (!a) return false;
  const parse = (v: string) => v.split(".").map(Number);
  const [aMaj, aMin, aPat] = parse(a);
  const [bMaj, bMin, bPat] = parse(b);
  if (aMaj !== bMaj) return aMaj > bMaj;
  if (aMin !== bMin) return aMin > bMin;
  return aPat > bPat;
}

export async function GET(req: NextRequest) {
  if (!isAuthenticated(req)) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }

  const current = getCurrentVersion();
  const latest = await getLatestNpmVersion();
  const updateAvailable = isNewer(latest, current);

  return NextResponse.json({
    current,
    latest: latest ?? "unavailable",
    updateAvailable,
    channel: "npm",
  });
}

export async function POST(req: NextRequest) {
  if (!isAuthenticated(req)) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }

  const current = getCurrentVersion();
  const latest = await getLatestNpmVersion();

  if (!latest) {
    return NextResponse.json(
      { success: false, error: "Could not reach npm registry" },
      { status: 503 }
    );
  }

  if (!isNewer(latest, current)) {
    return NextResponse.json({
      success: false,
      error: `Already on latest version (${current})`,
      current,
      latest,
    });
  }

  // Run update in background — client gets immediate acknowledgment
  const install = async () => {
    try {
      await execFileAsync("npm", ["install", "-g", `omniroute@${latest}`, "--ignore-scripts"], {
        timeout: 300000, // 5 minutes
      });
      // Restart PM2 — non-fatal if pm2 not available (Docker/manual setups)
      await execFileAsync("pm2", ["restart", "omniroute"]).catch(() => null);
      console.log(`[AutoUpdate] Successfully updated to v${latest}`);
    } catch (err) {
      console.error(`[AutoUpdate] Update failed:`, err);
    }
  };

  // Fire-and-forget
  install();

  return NextResponse.json({
    success: true,
    message: `Update to v${latest} started. Restarting in ~30 seconds.`,
    from: current,
    to: latest,
  });
}
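The `isNewer` helper in this file splits on `.` and converts each part with `Number`, so a version such as `1.2.3-beta.1` yields `NaN` for the patch component and the final comparison returns `false`. As a hedged sketch of one stricter alternative (not the route's actual code; `parseVersion` and `isNewerVersion` are hypothetical names, and prerelease identifiers are only flagged here, not ordered against each other):

```typescript
// Hypothetical sketch: accepts "1.2.3", "v1.2.3", "1.2.3-beta.1", "1.2.3+build.5".
interface ParsedVersion {
  core: [number, number, number];
  prerelease: boolean;
}

function parseVersion(v: string): ParsedVersion | null {
  const match = /^v?(\d+)\.(\d+)\.(\d+)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/.exec(v.trim());
  if (!match) return null;
  return {
    core: [Number(match[1]), Number(match[2]), Number(match[3])],
    prerelease: match[4] !== undefined,
  };
}

// Returns true only when `a` is strictly newer than `b`.
function isNewerVersion(a: string | null, b: string): boolean {
  if (!a) return false;
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  // Unparseable input (for example "unknown") never triggers an update.
  if (!pa || !pb) return false;
  for (let i = 0; i < 3; i++) {
    if (pa.core[i] !== pb.core[i]) return pa.core[i] > pb.core[i];
  }
  // Same core version: a release outranks a prerelease of that version.
  return pb.prerelease && !pa.prerelease;
}
```

Rejecting unparseable strings up front keeps the update endpoint conservative: when either side cannot be read as a version, no install is started.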
@@ -6,10 +6,16 @@ import {
   extractApiKey,
   isValidApiKey,
 } from "@/sse/services/auth";
-import { parseSpeechModel, getSpeechProvider } from "@omniroute/open-sse/config/audioRegistry.ts";
+import {
+  parseSpeechModel,
+  getSpeechProvider,
+  buildDynamicAudioProvider,
+  type ProviderNodeRow,
+} from "@omniroute/open-sse/config/audioRegistry.ts";
 import { errorResponse } from "@omniroute/open-sse/utils/error.ts";
 import { HTTP_STATUS } from "@omniroute/open-sse/config/constants.ts";
 import { enforceApiKeyPolicy } from "@/shared/utils/apiKeyPolicy";
+import { getProviderNodes } from "@/lib/localDb";
 import { v1AudioSpeechSchema } from "@/shared/validation/schemas";
 import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
 
@@ -55,7 +61,31 @@ export async function POST(request) {
   const policy = await enforceApiKeyPolicy(request, body.model);
   if (policy.rejection) return policy.rejection;
 
-  const { provider } = parseSpeechModel(body.model);
+  // Load local provider_nodes for audio routing (only localhost — prevents auth bypass/SSRF)
+  let dynamicProviders: ReturnType<typeof buildDynamicAudioProvider>[] = [];
+  try {
+    const nodes = await getProviderNodes();
+    dynamicProviders = (Array.isArray(nodes) ? nodes : [])
+      .filter((n: ProviderNodeRow) => {
+        if (n.apiType !== "chat" && n.apiType !== "responses") return false;
+        try {
+          const hostname = new URL(n.baseUrl).hostname;
+          return (
+            hostname === "localhost" ||
+            hostname === "127.0.0.1" ||
+            hostname === "::1" ||
+            hostname === "[::1]"
+          );
+        } catch {
+          return false;
+        }
+      })
+      .map((n) => buildDynamicAudioProvider(n, "/audio/speech"));
+  } catch {
+    // DB error — fall back to hardcoded providers only
+  }
+
+  const { provider, model: resolvedModel } = parseSpeechModel(body.model, dynamicProviders);
   if (!provider) {
     return errorResponse(
       HTTP_STATUS.BAD_REQUEST,
@@ -63,8 +93,9 @@ export async function POST(request) {
     );
   }
 
-  // Check provider config for auth bypass
-  const providerConfig = getSpeechProvider(provider);
+  // Check provider config — hardcoded first, then dynamic
+  const providerConfig =
+    getSpeechProvider(provider) || dynamicProviders.find((dp) => dp.id === provider) || null;
 
   // Get credentials — skip for local providers (authType: "none")
   let credentials = null;
@@ -75,7 +106,12 @@ export async function POST(request) {
     }
   }
 
-  const response = await handleAudioSpeech({ body, credentials });
+  const response = await handleAudioSpeech({
+    body,
+    credentials,
+    resolvedProvider: providerConfig,
+    resolvedModel,
+  });
   if (response?.ok) {
     await clearRecoveredProviderState(credentials);
   }
@@ -6,10 +6,16 @@ import {
   extractApiKey,
   isValidApiKey,
 } from "@/sse/services/auth";
-import { parseTranscriptionModel, getTranscriptionProvider } from "@omniroute/open-sse/config/audioRegistry.ts";
+import {
+  parseTranscriptionModel,
+  getTranscriptionProvider,
+  buildDynamicAudioProvider,
+  type ProviderNodeRow,
+} from "@omniroute/open-sse/config/audioRegistry.ts";
 import { errorResponse } from "@omniroute/open-sse/utils/error.ts";
 import { HTTP_STATUS } from "@omniroute/open-sse/config/constants.ts";
 import { enforceApiKeyPolicy } from "@/shared/utils/apiKeyPolicy";
+import { getProviderNodes } from "@/lib/localDb";
 
 /**
  * Handle CORS preflight
@@ -53,7 +59,34 @@ export async function POST(request) {
   const policy = await enforceApiKeyPolicy(request, model as string);
   if (policy.rejection) return policy.rejection;
 
-  const { provider } = parseTranscriptionModel(model);
+  // Load local provider_nodes for audio routing (only localhost — prevents auth bypass/SSRF)
+  let dynamicProviders: ReturnType<typeof buildDynamicAudioProvider>[] = [];
+  try {
+    const nodes = await getProviderNodes();
+    dynamicProviders = (Array.isArray(nodes) ? nodes : [])
+      .filter((n: ProviderNodeRow) => {
+        if (n.apiType !== "chat" && n.apiType !== "responses") return false;
+        try {
+          const hostname = new URL(n.baseUrl).hostname;
+          return (
+            hostname === "localhost" ||
+            hostname === "127.0.0.1" ||
+            hostname === "::1" ||
+            hostname === "[::1]"
+          );
+        } catch {
+          return false;
+        }
+      })
+      .map((n) => buildDynamicAudioProvider(n, "/audio/transcriptions"));
+  } catch {
+    // DB error — fall back to hardcoded providers only
+  }
+
+  const { provider, model: resolvedModel } = parseTranscriptionModel(
+    model as string,
+    dynamicProviders
+  );
   if (!provider) {
     return errorResponse(
       HTTP_STATUS.BAD_REQUEST,
@@ -61,8 +94,9 @@ export async function POST(request) {
     );
   }
 
-  // Check provider config for auth bypass
-  const providerConfig = getTranscriptionProvider(provider);
+  // Check provider config — hardcoded first, then dynamic
+  const providerConfig =
+    getTranscriptionProvider(provider) || dynamicProviders.find((dp) => dp.id === provider) || null;
 
   // Get credentials — skip for local providers (authType: "none")
   let credentials = null;
@@ -73,7 +107,12 @@ export async function POST(request) {
     }
   }
 
-  const response = await handleAudioTranscription({ formData, credentials });
+  const response = await handleAudioTranscription({
+    formData,
+    credentials,
+    resolvedProvider: providerConfig,
+    resolvedModel,
+  });
   if (response?.ok) {
     await clearRecoveredProviderState(credentials);
   }
@@ -9,6 +9,9 @@ import {
 import {
   parseEmbeddingModel,
   getAllEmbeddingModels,
+  getEmbeddingProvider,
+  buildDynamicEmbeddingProvider,
+  type EmbeddingProviderNodeRow,
 } from "@omniroute/open-sse/config/embeddingRegistry.ts";
 import { errorResponse } from "@omniroute/open-sse/utils/error.ts";
 import { HTTP_STATUS } from "@omniroute/open-sse/config/constants.ts";
@@ -18,7 +21,7 @@ import { enforceApiKeyPolicy } from "@/shared/utils/apiKeyPolicy";
 import { v1EmbeddingsSchema } from "@/shared/validation/schemas";
 import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
 
-import { getAllCustomModels } from "@/lib/localDb";
+import { getAllCustomModels, getProviderNodes } from "@/lib/localDb";
 
 /**
  * Handle CORS preflight
@@ -110,8 +113,42 @@ export async function POST(request) {
   const policy = await enforceApiKeyPolicy(request, body.model);
   if (policy.rejection) return policy.rejection;
 
+  // Load local provider_nodes for embedding routing (only localhost — prevents auth bypass/SSRF)
+  let dynamicProviders: ReturnType<typeof buildDynamicEmbeddingProvider>[] = [];
+  try {
+    const nodes = await getProviderNodes();
+    dynamicProviders = (Array.isArray(nodes) ? nodes : [])
+      .filter((n: EmbeddingProviderNodeRow) => {
+        // provider_nodes apiType is "chat" or "responses" (not "embeddings") — local OpenAI-compatible
+        // backends expose /embeddings under the same base URL as chat, so we build the URL as baseUrl + /embeddings.
+        if (n.apiType !== "chat" && n.apiType !== "responses") return false;
+        try {
+          const hostname = new URL(n.baseUrl).hostname;
+          return (
+            hostname === "localhost" ||
+            hostname === "127.0.0.1" ||
+            hostname === "::1" ||
+            hostname === "[::1]"
+          );
+        } catch {
+          return false;
+        }
+      })
+      .map((n) => {
+        try {
+          return buildDynamicEmbeddingProvider(n);
+        } catch (err) {
+          log.error("EMBED", `Skipping invalid provider_node ${n.prefix}: ${err}`);
+          return null;
+        }
+      })
+      .filter((p): p is NonNullable<typeof p> => p !== null);
+  } catch (err) {
+    log.error("EMBED", `Failed to load provider_nodes for embeddings: ${err}`);
+  }
+
   // Parse model to get provider
-  const { provider } = parseEmbeddingModel(body.model);
+  const { provider, model: resolvedModel } = parseEmbeddingModel(body.model, dynamicProviders);
   if (!provider) {
     return errorResponse(
       HTTP_STATUS.BAD_REQUEST,
@@ -119,19 +156,39 @@ export async function POST(request) {
     );
   }
 
-  // Get credentials for the embedding provider
-  const credentials = await getProviderCredentials(provider);
-  if (!credentials) {
+  // Resolve provider config — dynamic first (local override), then hardcoded
+  const providerConfig =
+    dynamicProviders.find((dp) => dp.id === provider) || getEmbeddingProvider(provider) || null;
+
+  if (!providerConfig) {
     return errorResponse(
       HTTP_STATUS.BAD_REQUEST,
-      `No credentials for embedding provider: ${provider}`
+      `Unknown embedding provider: ${provider}. No matching hardcoded or local provider found.`
     );
   }
 
-  const result = await handleEmbedding({ body, credentials, log });
+  // Get credentials — skip for local providers (authType: "none")
+  let credentials = null;
+  if (providerConfig && providerConfig.authType !== "none") {
+    credentials = await getProviderCredentials(provider);
+    if (!credentials) {
+      return errorResponse(
+        HTTP_STATUS.BAD_REQUEST,
+        `No credentials for embedding provider: ${provider}`
+      );
+    }
+  }
+
+  const result = await handleEmbedding({
+    body,
+    credentials,
+    log,
+    resolvedProvider: providerConfig,
+    resolvedModel,
+  });
 
   if (result.success) {
-    await clearRecoveredProviderState(credentials);
+    if (credentials) await clearRecoveredProviderState(credentials);
     return new Response(JSON.stringify(result.data), {
       status: 200,
       headers: { "Content-Type": "application/json" },
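The speech, transcription, and embeddings routes above each repeat the same inline hostname check before a `provider_nodes` row may be used without credentials. A minimal sketch of how that check could be factored into one helper, assuming the same allow-list as the diff (`isLoopbackBaseUrl` is a hypothetical name and does not appear in the changed files):

```typescript
// Hypothetical helper mirroring the inline filter used in the three routes.
// Only exact loopback hostnames pass; anything unparseable is rejected.
const LOOPBACK_HOSTNAMES = new Set(["localhost", "127.0.0.1", "::1", "[::1]"]);

function isLoopbackBaseUrl(baseUrl: string): boolean {
  try {
    // new URL() throws on relative or malformed input, which we treat as "not local".
    const hostname = new URL(baseUrl).hostname;
    return LOOPBACK_HOSTNAMES.has(hostname);
  } catch {
    return false;
  }
}
```

Because the comparison is an exact match on the parsed hostname, a look-alike such as `localhost.example.com` is rejected, while other loopback addresses such as `127.0.0.2` are also rejected, matching the behaviour of the inline checks.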
@@ -0,0 +1,82 @@
+import { assignProxyToScope, getProxyAssignments, resolveProxyForConnection } from "@/lib/localDb";
+import { proxyAssignmentSchema } from "@/shared/validation/schemas";
+import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
+import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
+import { requireManagementAuth } from "@/lib/api/requireManagementAuth";
+import { clearDispatcherCache } from "@omniroute/open-sse/utils/proxyDispatcher";
+
+function toPagination(searchParams: URLSearchParams) {
+  const limit = Math.max(1, Math.min(200, Number(searchParams.get("limit") || 100)));
+  const offset = Math.max(0, Number(searchParams.get("offset") || 0));
+  return { limit, offset };
+}
+
+export async function GET(request: Request) {
+  const authError = await requireManagementAuth(request);
+  if (authError) return authError;
+
+  try {
+    const { searchParams } = new URL(request.url);
+    const proxyId = searchParams.get("proxy_id");
+    const scope = searchParams.get("scope");
+    const scopeId = searchParams.get("scope_id");
+    const resolveConnectionId = searchParams.get("resolve_connection_id");
+
+    if (resolveConnectionId) {
+      const resolved = await resolveProxyForConnection(resolveConnectionId);
+      return Response.json(resolved);
+    }
+
+    const all = await getProxyAssignments({
+      proxyId: proxyId || undefined,
+      scope: scope || undefined,
+    });
+
+    const filtered = scopeId ? all.filter((entry) => entry.scopeId === scopeId) : all;
+    const { limit, offset } = toPagination(searchParams);
+    const items = filtered.slice(offset, offset + limit);
+
+    return Response.json({
+      items,
+      page: { limit, offset, total: filtered.length },
+    });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to load proxy assignments");
+  }
+}
+
+export async function PUT(request: Request) {
+  const authError = await requireManagementAuth(request);
+  if (authError) return authError;
+
+  let rawBody: unknown;
+  try {
+    rawBody = await request.json();
+  } catch {
+    return createErrorResponse({
+      status: 400,
+      message: "Invalid JSON body",
+      type: "invalid_request",
+    });
+  }
+
+  try {
+    const validation = validateBody(proxyAssignmentSchema, rawBody);
+    if (isValidationFailure(validation)) {
+      return createErrorResponse({
+        status: 400,
+        message: validation.error.message,
+        details: validation.error.details,
+        type: "invalid_request",
+      });
+    }
+
+    const { scope, scopeId, proxyId } = validation.data;
+    const assignment = await assignProxyToScope(scope, scopeId || null, proxyId || null);
+    clearDispatcherCache();
+
+    return Response.json({ success: true, assignment });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to update proxy assignment");
+  }
+}
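Review note on `toPagination` in the route above: the query value goes straight through `Number(...)`, so a non-numeric `limit` such as `abc` becomes `NaN`, `Math.min`/`Math.max` propagate it, and `slice(offset, offset + NaN)` then returns an empty page. A minimal sketch of a stricter variant follows, assuming the same defaults and bounds as the diff; `parseBoundedInt` and `toPaginationStrict` are hypothetical names that do not appear in the change set.

```typescript
// Hypothetical helper: parse an integer query value, fall back on bad input, then clamp.
function parseBoundedInt(raw: string | null, fallback: number, min: number, max: number): number {
  const parsed = raw === null || raw.trim() === "" ? NaN : Number(raw);
  const value = Number.isFinite(parsed) ? Math.trunc(parsed) : fallback;
  return Math.min(max, Math.max(min, value));
}

// Same bounds as the route above (limit 1..200, offset >= 0); the default limit is a parameter
// because the two routes in this diff use different defaults (100 and 50).
function toPaginationStrict(searchParams: URLSearchParams, defaultLimit = 100) {
  return {
    limit: parseBoundedInt(searchParams.get("limit"), defaultLimit, 1, 200),
    offset: parseBoundedInt(searchParams.get("offset"), 0, 0, Number.MAX_SAFE_INTEGER),
  };
}

// Usage: malformed values fall back to the defaults instead of producing NaN.
const examplePage = toPaginationStrict(new URLSearchParams("limit=abc&offset=-5"));
console.log(examplePage);
```

In-range numeric input behaves as in the original helper (for example `limit=0` still clamps to 1); only unparseable input is treated differently.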
@@ -0,0 +1,49 @@
+import { bulkAssignProxyToScope } from "@/lib/localDb";
+import { bulkProxyAssignmentSchema } from "@/shared/validation/schemas";
+import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
+import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
+import { requireManagementAuth } from "@/lib/api/requireManagementAuth";
+import { clearDispatcherCache } from "@omniroute/open-sse/utils/proxyDispatcher";
+
+export async function PUT(request: Request) {
+  const authError = await requireManagementAuth(request);
+  if (authError) return authError;
+
+  let rawBody: unknown;
+  try {
+    rawBody = await request.json();
+  } catch {
+    return createErrorResponse({
+      status: 400,
+      message: "Invalid JSON body",
+      type: "invalid_request",
+    });
+  }
+
+  try {
+    const validation = validateBody(bulkProxyAssignmentSchema, rawBody);
+    if (isValidationFailure(validation)) {
+      return createErrorResponse({
+        status: 400,
+        message: validation.error.message,
+        details: validation.error.details,
+        type: "invalid_request",
+      });
+    }
+
+    const { scope, scopeIds, proxyId } = validation.data;
+    const normalizedScope = scope === "key" ? "account" : scope;
+    const result = await bulkAssignProxyToScope(normalizedScope, scopeIds || [], proxyId || null);
+    clearDispatcherCache();
+
+    return Response.json({
+      success: true,
+      scope: normalizedScope,
+      requested: normalizedScope === "global" ? 1 : (scopeIds || []).length,
+      updated: result.updated,
+      failed: result.failed,
+    });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to run bulk assignment");
+  }
+}
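The bulk handler above does two small pieces of bookkeeping before it responds: it treats `"key"` as an alias for the `"account"` scope, and it reports `requested` as 1 for the global scope regardless of how many IDs were sent. A minimal sketch of that logic as pure functions, for illustration only; the `BulkScope` union is an assumption (the real set of accepted values is defined by `bulkProxyAssignmentSchema`, which this diff does not show), and `normalizeScope` and `requestedCount` are hypothetical names.

```typescript
// Assumed scope values: the four in ProxyScope plus the "key" alias the handler checks for.
type BulkScope = "global" | "provider" | "account" | "combo" | "key";

// Mirrors the handler's normalization: "key" is stored as "account".
function normalizeScope(scope: BulkScope): Exclude<BulkScope, "key"> {
  return scope === "key" ? "account" : scope;
}

// Mirrors the handler's `requested` field: the global scope counts as a single target.
function requestedCount(scope: BulkScope, scopeIds: string[] | undefined): number {
  return normalizeScope(scope) === "global" ? 1 : (scopeIds ?? []).length;
}

// Usage: two account IDs sent under the legacy "key" scope.
console.log(normalizeScope("key"), requestedCount("key", ["acc-1", "acc-2"]));
```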
@@ -0,0 +1,17 @@
+import { getProxyHealthStats } from "@/lib/localDb";
+import { createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
+import { requireManagementAuth } from "@/lib/api/requireManagementAuth";
+
+export async function GET(request: Request) {
+  const authError = await requireManagementAuth(request);
+  if (authError) return authError;
+
+  try {
+    const { searchParams } = new URL(request.url);
+    const hours = Number(searchParams.get("hours") || 24);
+    const items = await getProxyHealthStats({ hours });
+    return Response.json({ items, total: items.length, windowHours: hours });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to load proxy health stats");
+  }
+}
@@ -0,0 +1,151 @@
+import {
+  createProxy,
+  deleteProxyById,
+  getProxyById,
+  getProxyWhereUsed,
+  listProxies,
+  updateProxy,
+} from "@/lib/localDb";
+import { createProxyRegistrySchema, updateProxyRegistrySchema } from "@/shared/validation/schemas";
+import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
+import { createErrorResponse, createErrorResponseFromUnknown } from "@/lib/api/errorResponse";
+import { requireManagementAuth } from "@/lib/api/requireManagementAuth";
+
+function toPagination(searchParams: URLSearchParams) {
+  const limit = Math.max(1, Math.min(200, Number(searchParams.get("limit") || 50)));
+  const offset = Math.max(0, Number(searchParams.get("offset") || 0));
+  return { limit, offset };
+}
+
+export async function GET(request: Request) {
+  const authError = await requireManagementAuth(request);
+  if (authError) return authError;
+
+  try {
+    const { searchParams } = new URL(request.url);
+    const id = searchParams.get("id");
+    const whereUsed = searchParams.get("where_used") === "1";
+
+    if (id && whereUsed) {
+      const usage = await getProxyWhereUsed(id);
+      return Response.json(usage);
+    }
+
+    if (id) {
+      const proxy = await getProxyById(id, { includeSecrets: false });
+      if (!proxy) {
+        return createErrorResponse({ status: 404, message: "Proxy not found", type: "not_found" });
+      }
+      return Response.json(proxy);
+    }
+
+    const { limit, offset } = toPagination(searchParams);
+    const items = await listProxies({ includeSecrets: false });
+    const paged = items.slice(offset, offset + limit);
+    return Response.json({
+      items: paged,
+      page: { limit, offset, total: items.length },
+    });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to load proxies");
+  }
+}
+
+export async function POST(request: Request) {
+  const authError = await requireManagementAuth(request);
+  if (authError) return authError;
+
+  let rawBody: unknown;
+  try {
+    rawBody = await request.json();
+  } catch {
+    return createErrorResponse({
+      status: 400,
+      message: "Invalid JSON body",
+      type: "invalid_request",
+    });
+  }
+
+  try {
+    const validation = validateBody(createProxyRegistrySchema, rawBody);
+    if (isValidationFailure(validation)) {
+      return createErrorResponse({
+        status: 400,
+        message: validation.error.message,
+        details: validation.error.details,
+        type: "invalid_request",
+      });
+    }
+
+    const created = await createProxy(validation.data);
+    return Response.json(created, { status: 201 });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to create proxy");
+  }
+}
+
+export async function PATCH(request: Request) {
+  const authError = await requireManagementAuth(request);
+  if (authError) return authError;
+
+  let rawBody: unknown;
+  try {
+    rawBody = await request.json();
+  } catch {
+    return createErrorResponse({
+      status: 400,
+      message: "Invalid JSON body",
+      type: "invalid_request",
+    });
+  }
+
+  try {
+    const validation = validateBody(updateProxyRegistrySchema, rawBody);
+    if (isValidationFailure(validation)) {
+      return createErrorResponse({
+        status: 400,
+        message: validation.error.message,
+        details: validation.error.details,
+        type: "invalid_request",
+      });
+    }
+
+    const { id, ...changes } = validation.data;
+    const updated = await updateProxy(id, changes);
+    if (!updated) {
+      return createErrorResponse({ status: 404, message: "Proxy not found", type: "not_found" });
+    }
+
+    return Response.json(updated);
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to update proxy");
+  }
+}
+
+export async function DELETE(request: Request) {
+  const authError = await requireManagementAuth(request);
+  if (authError) return authError;
+
+  try {
+    const { searchParams } = new URL(request.url);
+    const id = searchParams.get("id");
+    const force = searchParams.get("force") === "1";
+
+    if (!id) {
+      return createErrorResponse({
+        status: 400,
+        message: "id is required",
+        type: "invalid_request",
+      });
+    }
+
+    const deleted = await deleteProxyById(id, { force });
+    if (!deleted) {
+      return createErrorResponse({ status: 404, message: "Proxy not found", type: "not_found" });
+    }
+
+    return Response.json({ success: true });
+  } catch (error) {
+    return createErrorResponseFromUnknown(error, "Failed to delete proxy");
+  }
+}
@@ -14,6 +14,36 @@ const ENDPOINT_ROWS = [
   { path: "/models", method: "GET", noteKey: "endpointRewriteModelsNote" },
 ] as const;

+const MANAGEMENT_ENDPOINT_ROWS = [
+  { path: "/api/v1/management/proxies", method: "GET", noteKey: "mgmtProxiesListNote" },
+  { path: "/api/v1/management/proxies", method: "POST", noteKey: "mgmtProxiesCreateNote" },
+  {
+    path: "/api/v1/management/proxies/health",
+    method: "GET",
+    noteKey: "mgmtProxiesHealthNote",
+  },
+  {
+    path: "/api/v1/management/proxies/bulk-assign",
+    method: "PUT",
+    noteKey: "mgmtProxiesBulkAssignNote",
+  },
+  {
+    path: "/api/v1/management/proxies/assignments",
+    method: "GET",
+    noteKey: "mgmtAssignmentsListNote",
+  },
+  {
+    path: "/api/v1/management/proxies/assignments",
+    method: "PUT",
+    noteKey: "mgmtAssignmentsUpdateNote",
+  },
+  {
+    path: "/api/settings/proxies/migrate",
+    method: "POST",
+    noteKey: "mgmtLegacyMigrationNote",
+  },
+] as const;
+
 const FEATURE_ITEMS = [
   { icon: "hub", titleKey: "featureRoutingTitle", textKey: "featureRoutingText" },
   { icon: "layers", titleKey: "featureCombosTitle", textKey: "featureCombosText" },
@@ -48,6 +78,7 @@ const TOC_ITEMS = [
   { href: "#client-compatibility", labelKey: "clientCompatibility" },
   { href: "#protocols", labelKey: "protocolsToc" },
   { href: "#api-reference", labelKey: "apiReference" },
+  { href: "#management-api", labelKey: "managementApiReference" },
   { href: "#model-prefixes", labelKey: "modelPrefixes" },
   { href: "#troubleshooting", labelKey: "troubleshooting" },
 ] as const;
@@ -102,6 +133,10 @@ export default function DocsPage() {
     ...row,
     note: t(row.noteKey),
   }));
+  const managementEndpointRows = MANAGEMENT_ENDPOINT_ROWS.map((row) => ({
+    ...row,
+    note: t(row.noteKey),
+  }));

   const featureItems = FEATURE_ITEMS.map((item) => ({
     ...item,
@@ -490,6 +525,35 @@ POST /a2a (JSON-RPC: message/send | message/stream)`}</code>
         </div>
       </section>

+      <section id="management-api" className="rounded-2xl border border-border bg-bg-subtle p-6">
+        <h2 className="text-xl font-semibold">{t("managementApiReference")}</h2>
+        <p className="text-sm text-text-muted mt-2">{t("managementApiDescription")}</p>
+        <div className="mt-4 overflow-x-auto">
+          <table className="w-full text-sm border-collapse">
+            <thead>
+              <tr className="border-b border-border">
+                <th className="text-left py-2 pr-4">{t("method")}</th>
+                <th className="text-left py-2 pr-4">{t("path")}</th>
+                <th className="text-left py-2">{t("notes")}</th>
+              </tr>
+            </thead>
+            <tbody>
+              {managementEndpointRows.map((row) => (
+                <tr key={`${row.method}:${row.path}`} className="border-b border-border/60">
+                  <td className="py-2 pr-4">
+                    <code className="px-1.5 py-0.5 rounded bg-bg text-xs font-semibold">
+                      {row.method}
+                    </code>
+                  </td>
+                  <td className="py-2 pr-4 font-mono">{row.path}</td>
+                  <td className="py-2 text-text-muted">{row.note}</td>
+                </tr>
+              ))}
+            </tbody>
+          </table>
+        </div>
+      </section>
+
       <section id="troubleshooting" className="rounded-2xl border border-border bg-bg-subtle p-6">
         <h2 className="text-xl font-semibold">{t("troubleshooting")}</h2>
         <ul className="mt-4 list-disc list-inside text-sm text-text-muted space-y-2">
@@ -2345,6 +2345,8 @@
     "clientCompatibility": "Client Compatibility",
     "protocolsToc": "Protocols",
     "apiReference": "API Reference",
+    "managementApiReference": "Management API Reference",
+    "managementApiDescription": "Automation endpoints for proxy registry, scope assignments, and legacy proxy migration.",
     "method": "Method",
     "path": "Path",
     "notes": "Notes",
@@ -2440,6 +2442,13 @@
     "endpointRewriteChatNote": "Rewrite helper for clients without /v1.",
     "endpointRewriteResponsesNote": "Rewrite helper for Responses without /v1.",
     "endpointRewriteModelsNote": "Rewrite helper for model discovery without /v1.",
+    "mgmtProxiesListNote": "List saved proxy registry items (supports pagination).",
+    "mgmtProxiesCreateNote": "Create a reusable proxy item in the registry.",
+    "mgmtProxiesHealthNote": "Get 24h/rolling health metrics per saved proxy from proxy logs.",
+    "mgmtProxiesBulkAssignNote": "Assign or clear one proxy across many scope IDs in one request.",
+    "mgmtAssignmentsListNote": "List proxy assignments by scope, scope_id, or proxy_id.",
+    "mgmtAssignmentsUpdateNote": "Assign or clear proxy for global/provider/account/combo scope.",
+    "mgmtLegacyMigrationNote": "Import legacy proxyConfig maps into registry assignments.",
     "modelPrefixesDescriptionStart": "Use the provider prefix before the model name to route to a specific provider. Example:",
     "modelPrefixesDescriptionEnd": "routes to GitHub Copilot.",
     "provider": "Provider",
@@ -2310,6 +2310,8 @@
     "clientCompatibility": "Kompatibilitas Klien",
     "protocolsToc": "Protocols",
     "apiReference": "Referensi API",
+    "managementApiReference": "Referensi API Manajemen",
+    "managementApiDescription": "Endpoint automasi untuk registry proxy, assignment scope, dan migrasi proxy lama.",
     "method": "Metode",
     "path": "Jalan",
     "notes": "Catatan",
@@ -2405,6 +2407,13 @@
     "endpointRewriteChatNote": "Tulis ulang pembantu untuk klien tanpa /v1.",
     "endpointRewriteResponsesNote": "Tulis ulang pembantu untuk Respons tanpa /v1.",
     "endpointRewriteModelsNote": "Tulis ulang pembantu untuk penemuan model tanpa /v1.",
+    "mgmtProxiesListNote": "Daftar item proxy registry tersimpan (mendukung pagination).",
+    "mgmtProxiesCreateNote": "Buat item proxy reusable di registry.",
+    "mgmtProxiesHealthNote": "Ambil metrik kesehatan proxy tersimpan (24 jam/window) dari proxy logs.",
+    "mgmtProxiesBulkAssignNote": "Set atau hapus satu proxy ke banyak scope ID dalam satu request.",
+    "mgmtAssignmentsListNote": "Daftar assignment proxy berdasarkan scope, scope_id, atau proxy_id.",
+    "mgmtAssignmentsUpdateNote": "Set atau hapus proxy untuk scope global/provider/account/combo.",
+    "mgmtLegacyMigrationNote": "Impor map proxyConfig lama ke assignment registry.",
     "modelPrefixesDescriptionStart": "Gunakan awalan penyedia sebelum nama model untuk merutekan ke penyedia tertentu. Contoh:",
     "modelPrefixesDescriptionEnd": "rute ke GitHub Copilot.",
     "provider": "Penyedia",
@@ -0,0 +1,54 @@
+import { randomUUID } from "node:crypto";
+
+export type ApiErrorType = "invalid_request" | "not_found" | "conflict" | "server_error";
+
+interface ApiErrorPayload {
+  status: number;
+  message: string;
+  type?: ApiErrorType;
+  details?: unknown;
+}
+
+export function createErrorResponse(payload: ApiErrorPayload): Response {
+  const requestId = randomUUID();
+  const resolvedType =
+    payload.type ||
+    (payload.status >= 500
+      ? "server_error"
+      : payload.status === 404
+        ? "not_found"
+        : payload.status === 409
+          ? "conflict"
+          : "invalid_request");
+
+  return Response.json(
+    {
+      error: {
+        message: payload.message,
+        type: resolvedType,
+        details: payload.details,
+      },
+      requestId,
+    },
+    { status: payload.status }
+  );
+}
+
+export function createErrorResponseFromUnknown(
+  error: unknown,
+  fallbackMessage = "Unexpected server error"
+): Response {
+  const anyError = error as {
+    message?: string;
+    status?: number;
+    type?: ApiErrorType;
+    details?: unknown;
+  };
+  const status = Number(anyError?.status) || 500;
+  return createErrorResponse({
+    status,
+    message: typeof anyError?.message === "string" ? anyError.message : fallbackMessage,
+    type: anyError?.type,
+    details: anyError?.details,
+  });
+}
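The error helper above resolves the `type` field with a fixed precedence (an explicit type wins, otherwise it is derived from the status code) and defaults any missing or non-numeric status to 500. The sketch below restates those two rules as standalone functions so the mapping can be checked in isolation; `resolveErrorType` and `resolveStatus` are hypothetical names, not exports of the module in the diff.

```typescript
type ApiErrorType = "invalid_request" | "not_found" | "conflict" | "server_error";

// Same precedence as createErrorResponse: explicit type first, then status-based mapping.
function resolveErrorType(status: number, explicit?: ApiErrorType): ApiErrorType {
  if (explicit) return explicit;
  if (status >= 500) return "server_error";
  if (status === 404) return "not_found";
  if (status === 409) return "conflict";
  return "invalid_request";
}

// Same fallback as createErrorResponseFromUnknown: missing, zero, or non-numeric status becomes 500.
function resolveStatus(error: unknown): number {
  const status = (error as { status?: unknown } | null)?.status;
  return Number(status) || 500;
}

// Usage: a plain Error has no status, so it is reported as a 500 server_error.
const thrown = new Error("db unavailable");
console.log(resolveStatus(thrown), resolveErrorType(resolveStatus(thrown)));
```

One consequence of this mapping is that a 401, as returned by `requireManagementAuth`, has no dedicated type and is reported as `invalid_request`.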
@@ -0,0 +1,18 @@
+import { isAuthenticated, isAuthRequired } from "@/shared/utils/apiAuth";
+import { createErrorResponse } from "@/lib/api/errorResponse";
+
+export async function requireManagementAuth(request: Request): Promise<Response | null> {
+  if (!(await isAuthRequired())) {
+    return null;
+  }
+
+  if (await isAuthenticated(request)) {
+    return null;
+  }
+
+  return createErrorResponse({
+    status: 401,
+    message: "Authentication required",
+    type: "invalid_request",
+  });
+}
@@ -1,5 +1,5 @@
-import http from "node:http";
-import type { IncomingMessage, ServerResponse } from "node:http";
+import http from "http";
+import type { IncomingMessage, ServerResponse } from "http";
 import { getRuntimePorts } from "@/lib/runtime/ports";

 const DEFAULT_PROXY_TIMEOUT_MS = 30_000;
@@ -8,7 +8,7 @@
  * @module lib/cacheLayer
  */

-import crypto from "node:crypto";
+import crypto from "crypto";

 /**
  * @typedef {Object} CacheEntry
@@ -1,5 +1,5 @@
-import path from "node:path";
-import os from "node:os";
+import path from "path";
+import os from "os";

 export const APP_NAME = "omniroute";

@@ -3,8 +3,8 @@
  */

 import Database from "better-sqlite3";
-import path from "node:path";
-import fs from "node:fs";
+import path from "path";
+import fs from "fs";
 import {
   getDbInstance,
   resetDbInstance,
@@ -5,8 +5,8 @@
  */

 import Database from "better-sqlite3";
-import path from "node:path";
-import fs from "node:fs";
+import path from "path";
+import fs from "fs";
 import { resolveDataDir, getLegacyDotDataDir } from "../dataPaths";
 import { runMigrations } from "./migrationRunner";

@@ -143,6 +143,10 @@ const SCHEMA_SQL = `
     tokens_cache_creation INTEGER DEFAULT 0,
     tokens_reasoning INTEGER DEFAULT 0,
     status TEXT,
+    success INTEGER DEFAULT 1,
+    latency_ms INTEGER DEFAULT 0,
+    ttft_ms INTEGER DEFAULT 0,
+    error_code TEXT,
     timestamp TEXT NOT NULL
   );
   CREATE INDEX IF NOT EXISTS idx_uh_timestamp ON usage_history(timestamp);
@@ -327,6 +331,35 @@ function ensureProviderConnectionsColumns(db: SqliteDatabase) {
   }
 }

+function ensureUsageHistoryColumns(db: SqliteDatabase) {
+  try {
+    const columns = db.prepare("PRAGMA table_info(usage_history)").all() as Array<{
+      name?: string;
+    }>;
+    const columnNames = new Set(columns.map((column) => String(column.name ?? "")));
+
+    if (!columnNames.has("success")) {
+      db.exec("ALTER TABLE usage_history ADD COLUMN success INTEGER DEFAULT 1");
+      console.log("[DB] Added usage_history.success column");
+    }
+    if (!columnNames.has("latency_ms")) {
+      db.exec("ALTER TABLE usage_history ADD COLUMN latency_ms INTEGER DEFAULT 0");
+      console.log("[DB] Added usage_history.latency_ms column");
+    }
+    if (!columnNames.has("ttft_ms")) {
+      db.exec("ALTER TABLE usage_history ADD COLUMN ttft_ms INTEGER DEFAULT 0");
+      console.log("[DB] Added usage_history.ttft_ms column");
+    }
+    if (!columnNames.has("error_code")) {
+      db.exec("ALTER TABLE usage_history ADD COLUMN error_code TEXT");
+      console.log("[DB] Added usage_history.error_code column");
+    }
+  } catch (error: unknown) {
+    const message = error instanceof Error ? error.message : String(error);
+    console.warn("[DB] Failed to verify usage_history schema:", message);
+  }
+}
+
 export function getDbInstance(): SqliteDatabase {
   if (_db) return _db;

@@ -337,6 +370,7 @@ export function getDbInstance(): SqliteDatabase {
     const memoryDb = new Database(":memory:");
     memoryDb.pragma("journal_mode = WAL");
     memoryDb.exec(SCHEMA_SQL);
+    ensureUsageHistoryColumns(memoryDb);
     _db = memoryDb;
     return memoryDb;
   }
@@ -420,6 +454,7 @@ export function getDbInstance(): SqliteDatabase {
   db.pragma("synchronous = NORMAL");
   db.exec(SCHEMA_SQL);
   ensureProviderConnectionsColumns(db);
+  ensureUsageHistoryColumns(db);

   // ── Versioned Migrations ──
   // Auto-seed 001 as applied (the inline SCHEMA_SQL already created these tables)
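The `ensureUsageHistoryColumns` guard added in this file checks `PRAGMA table_info` before each `ALTER TABLE`, which is what makes the upgrade safe to run on every startup. The same decision can be sketched as a pure function that is testable without a database. `USAGE_HISTORY_COLUMNS` and `pendingUsageHistoryAlters` are hypothetical names introduced for illustration; the column definitions are copied from the statements in the diff.

```typescript
// Column name and definition pairs, taken from the ALTER TABLE statements in the diff.
const USAGE_HISTORY_COLUMNS: ReadonlyArray<[string, string]> = [
  ["success", "INTEGER DEFAULT 1"],
  ["latency_ms", "INTEGER DEFAULT 0"],
  ["ttft_ms", "INTEGER DEFAULT 0"],
  ["error_code", "TEXT"],
];

// Hypothetical pure helper: given the column names reported by PRAGMA table_info,
// return only the ALTER TABLE statements that are still needed.
function pendingUsageHistoryAlters(existing: Iterable<string>): string[] {
  const have = new Set(existing);
  return USAGE_HISTORY_COLUMNS.filter(([name]) => !have.has(name)).map(
    ([name, ddl]) => `ALTER TABLE usage_history ADD COLUMN ${name} ${ddl}`
  );
}

// Usage: a database that already has `success` only needs the other three columns.
console.log(pendingUsageHistoryAlters(["id", "timestamp", "success"]));
```

Once every column exists the function returns an empty list, so a second run issues no statements.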
@@ -0,0 +1,101 @@
+/**
+ * Detailed Request Logs DB Layer (#378)
+ *
+ * Saves full request/response bodies at each pipeline stage.
+ * Ring-buffer of 500 entries enforced by SQL trigger in migration 006.
+ * Only active when settings.detailed_logs_enabled = "1".
+ */
+import { v4 as uuidv4 } from "uuid";
+import { getDbInstance } from "./core";
+import { getSettings } from "./settings";
+
+export interface RequestDetailLog {
+  id?: string;
+  call_log_id?: string | null;
+  timestamp?: string;
+  client_request?: string | null;
+  translated_request?: string | null;
+  provider_response?: string | null;
+  client_response?: string | null;
+  provider?: string | null;
+  model?: string | null;
+  source_format?: string | null;
+  target_format?: string | null;
+  duration_ms?: number;
+}
+
+/** Returns true if detailed logging is enabled in settings */
+export async function isDetailedLoggingEnabled(): Promise<boolean> {
+  try {
+    const settings = await getSettings();
+    const val = settings.detailed_logs_enabled;
+    return val === true || val === "1" || val === "true";
+  } catch {
+    return false;
+  }
+}
+
+/** Save a detailed log entry — caller must verify isDetailedLoggingEnabled() first */
+export function saveRequestDetailLog(entry: RequestDetailLog): void {
+  const db = getDbInstance();
+  const id = entry.id ?? uuidv4();
+  const timestamp = entry.timestamp ?? new Date().toISOString();
+
+  // Trim large bodies to avoid excessive disk usage (max 64KB each)
+  const trim = (s: string | null | undefined, max = 65536): string | null => {
+    if (!s) return null;
+    return s.length > max ? s.slice(0, max) + "…[truncated]" : s;
+  };
+
+  db.prepare(
+    `
+    INSERT INTO request_detail_logs
+      (id, call_log_id, timestamp, client_request, translated_request,
+       provider_response, client_response, provider, model, source_format, target_format, duration_ms)
+    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+  `
+  ).run(
+    id,
+    entry.call_log_id ?? null,
+    timestamp,
+    trim(entry.client_request),
+    trim(entry.translated_request),
+    trim(entry.provider_response),
+    trim(entry.client_response),
+    entry.provider ?? null,
+    entry.model ?? null,
+    entry.source_format ?? null,
+    entry.target_format ?? null,
+    entry.duration_ms ?? 0
+  );
+}
+
+/** Fetch detailed logs (latest first) */
+export function getRequestDetailLogs(limit = 50, offset = 0): RequestDetailLog[] {
+  const db = getDbInstance();
+  return db
+    .prepare(
+      `
+    SELECT * FROM request_detail_logs
+    ORDER BY timestamp DESC
+    LIMIT ? OFFSET ?
+  `
+    )
+    .all(limit, offset) as RequestDetailLog[];
+}
+
+/** Get a single detailed log by ID */
+export function getRequestDetailLogById(id: string): RequestDetailLog | null {
+  const db = getDbInstance();
+  return (db.prepare("SELECT * FROM request_detail_logs WHERE id = ?").get(id) ??
+    null) as RequestDetailLog | null;
+}
+
+/** Get total count of detailed logs */
+export function getRequestDetailLogCount(): number {
+  const db = getDbInstance();
+  const row = db.prepare("SELECT COUNT(*) as cnt FROM request_detail_logs").get() as {
+    cnt: number;
+  };
+  return row?.cnt ?? 0;
+}
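The `trim` closure inside `saveRequestDetailLog` is the only size guard on stored bodies, so its exact behavior is worth pinning down. The sketch below lifts it into a standalone function with the same logic; `trimBody` and `TRUNCATION_MARKER` are hypothetical names used only for this illustration.

```typescript
const TRUNCATION_MARKER = "…[truncated]";

// Same rule as the `trim` closure in the diff: empty or missing values become null,
// longer values are cut at `max` and get the marker appended.
function trimBody(s: string | null | undefined, max = 65536): string | null {
  if (!s) return null;
  return s.length > max ? s.slice(0, max) + TRUNCATION_MARKER : s;
}

// Usage with a small limit so the effect is visible.
console.log(trimBody("abcdef", 3));
```

Two details follow from the code as written: `length` and `slice` count UTF-16 code units rather than bytes, so the "64KB" in the source comment is approximate for non-ASCII bodies, and a truncated value is stored at `max` plus the marker length, slightly over the limit.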
@@ -9,9 +9,9 @@
  * All migrations run within a single transaction — all-or-nothing per file.
  */

-import fs from "node:fs";
-import path from "node:path";
-import { fileURLToPath } from "node:url";
+import fs from "fs";
+import path from "path";
+import { fileURLToPath } from "url";
 import type Database from "better-sqlite3";

 /**
@@ -98,6 +98,10 @@ CREATE TABLE IF NOT EXISTS usage_history (
   tokens_cache_creation INTEGER DEFAULT 0,
   tokens_reasoning INTEGER DEFAULT 0,
   status TEXT,
+  success INTEGER DEFAULT 1,
+  latency_ms INTEGER DEFAULT 0,
+  ttft_ms INTEGER DEFAULT 0,
+  error_code TEXT,
   timestamp TEXT NOT NULL
 );
 CREATE INDEX IF NOT EXISTS idx_uh_timestamp ON usage_history(timestamp);
@@ -0,0 +1,31 @@
+CREATE TABLE IF NOT EXISTS proxy_registry (
+  id TEXT PRIMARY KEY,
+  name TEXT NOT NULL,
+  type TEXT NOT NULL,
+  host TEXT NOT NULL,
+  port INTEGER NOT NULL,
+  username TEXT,
+  password TEXT,
+  region TEXT,
+  notes TEXT,
+  status TEXT NOT NULL DEFAULT 'active',
+  created_at TEXT NOT NULL DEFAULT (datetime('now')),
+  updated_at TEXT NOT NULL DEFAULT (datetime('now'))
+);
+
+CREATE INDEX IF NOT EXISTS idx_proxy_registry_status ON proxy_registry(status);
+CREATE INDEX IF NOT EXISTS idx_proxy_registry_host ON proxy_registry(host);
+
+CREATE TABLE IF NOT EXISTS proxy_assignments (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  proxy_id TEXT NOT NULL,
+  scope TEXT NOT NULL,
+  scope_id TEXT,
+  created_at TEXT NOT NULL DEFAULT (datetime('now')),
+  updated_at TEXT NOT NULL DEFAULT (datetime('now')),
+  UNIQUE(scope, scope_id),
+  FOREIGN KEY (proxy_id) REFERENCES proxy_registry(id) ON DELETE RESTRICT
+);
+
+CREATE INDEX IF NOT EXISTS idx_proxy_assignments_proxy_id ON proxy_assignments(proxy_id);
+CREATE INDEX IF NOT EXISTS idx_proxy_assignments_scope ON proxy_assignments(scope, scope_id);
@@ -0,0 +1,19 @@
+-- 005_combo_agent_fields.sql
+-- Safe migration for existing users: adds optional agent fields to combos.
+-- Uses ADD COLUMN with DEFAULT NULL (SQLite compatible) — existing rows are untouched.
+-- New fields are read as NULL by old code versions (backward compatible).
+
+-- System prompt override: when set, injected as the first system message before
+-- forwarding to the provider. Overrides any system message from the client.
+ALTER TABLE combos ADD COLUMN system_message TEXT DEFAULT NULL;
+
+-- Regex-based tool filter: when set, only tool calls whose "name" matches this
+-- regex pattern are forwarded to the provider. Others are stripped silently.
+-- Example: "^(gh_|create_file|web_fetch)" — allows only GitHub and web tools.
+ALTER TABLE combos ADD COLUMN tool_filter_regex TEXT DEFAULT NULL;
+
+-- Context caching protection: when 1, the proxy tags assistant responses with
+-- <omniModel>provider/model</omniModel> and pins the model for the session.
+ALTER TABLE combos ADD COLUMN context_cache_protection INTEGER DEFAULT 0;
+
+CREATE INDEX IF NOT EXISTS idx_combos_cache_protection ON combos(context_cache_protection);
@@ -0,0 +1,42 @@
+-- 006_detailed_request_logs.sql
+-- Stores full request/response bodies at each pipeline stage for debugging.
+-- Only populated when detailed_logs_enabled = 1 in settings (off by default).
+-- Ring-buffer enforced via trigger: keeps only the last 500 entries.
+-- Existing users are not impacted (table is new, feature is opt-in).
+
+CREATE TABLE IF NOT EXISTS request_detail_logs (
+  id TEXT PRIMARY KEY,
+  call_log_id TEXT, -- FK to call_logs.id (optional, nullable)
+  timestamp TEXT NOT NULL,
+  -- The 4 pipeline stages (all nullable — only populated when available)
+  client_request TEXT, -- Raw body received from the client (JSON)
+  translated_request TEXT, -- Body after format translation (JSON)
+  provider_response TEXT, -- Raw body from the provider (JSON)
+  client_response TEXT, -- Final body sent to the client (JSON)
+  -- Metadata
+  provider TEXT,
+  model TEXT,
+  source_format TEXT,
+  target_format TEXT,
+  duration_ms INTEGER DEFAULT 0
+);
+
+CREATE INDEX IF NOT EXISTS idx_rdl_timestamp ON request_detail_logs(timestamp);
+CREATE INDEX IF NOT EXISTS idx_rdl_call_log_id ON request_detail_logs(call_log_id);
+
+-- Ring-buffer trigger: auto-delete oldest records beyond 500
+CREATE TRIGGER IF NOT EXISTS trg_rdl_ring_buffer
+AFTER INSERT ON request_detail_logs
+BEGIN
+  DELETE FROM request_detail_logs
+  WHERE id IN (
+    SELECT id FROM request_detail_logs
+    ORDER BY timestamp ASC
+    LIMIT MAX(0, (SELECT COUNT(*) FROM request_detail_logs) - 500)
+  );
+END;
+
+-- Settings key for enabling/disabling detailed logs (default: disabled)
+-- Inserted only if not already present (safe for existing installs)
+INSERT OR IGNORE INTO key_value (namespace, key, value)
+VALUES ('settings', 'detailed_logs_enabled', '0');
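The `trg_rdl_ring_buffer` trigger in this migration keeps the table at a fixed size by deleting the oldest rows, ordered by `timestamp`, after every insert. An in-memory analogue of that rule, as a rough sketch rather than a description of how SQLite executes the trigger, may make the semantics easier to reason about. `LogRow` and `insertWithRingBuffer` are hypothetical names, and the sketch assumes ISO-8601 timestamps so that string order matches chronological order.

```typescript
interface LogRow {
  id: string;
  timestamp: string; // assumed ISO-8601, so plain string comparison is chronological
}

// After each insert, drop the oldest rows beyond `capacity` (500 in the migration).
function insertWithRingBuffer(rows: LogRow[], row: LogRow, capacity = 500): LogRow[] {
  const next = [...rows, row].sort((a, b) =>
    a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0
  );
  // Mirrors LIMIT MAX(0, COUNT(*) - 500): nothing is deleted until the capacity is exceeded.
  const overflow = Math.max(0, next.length - capacity);
  return next.slice(overflow);
}

// Usage with a capacity of 3: five inserts leave only the three newest rows.
let buffer: LogRow[] = [];
for (let day = 1; day <= 5; day++) {
  buffer = insertWithRingBuffer(buffer, { id: `r${day}`, timestamp: `2024-01-0${day}T00:00:00.000Z` }, 3);
}
console.log(buffer.map((r) => r.id));
```

Because eviction is keyed on `timestamp` rather than insertion order, a row written with a caller-supplied timestamp older than everything already stored would be the first one evicted.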
@@ -9,7 +9,7 @@
  * @module lib/db/prompts
  */

-import crypto from "node:crypto";
+import crypto from "crypto";
 import { getDbInstance } from "./core";

 interface StatementLike<TRow = unknown> {
@@ -0,0 +1,588 @@
|
||||
import { randomUUID } from "node:crypto";
|
||||
import { getDbInstance } from "./core";
|
||||
import { backupDbFile } from "./backup";
|
||||
|
||||
type JsonRecord = Record<string, unknown>;
|
||||
type ProxyScope = "global" | "provider" | "account" | "combo";
|
||||
|
||||
interface ProxyRegistryRecord {
|
||||
id: string;
|
||||
name: string;
|
||||
type: string;
|
||||
host: string;
|
||||
port: number;
|
||||
username: string;
|
||||
password: string;
|
||||
region: string | null;
|
||||
notes: string | null;
|
||||
status: string;
|
||||
createdAt: string;
|
||||
updatedAt: string;
|
||||
}
|
||||
|
||||
interface ProxyAssignmentRecord {
|
||||
id: number;
|
||||
proxyId: string;
|
||||
scope: ProxyScope;
|
||||
scopeId: string | null;
|
||||
createdAt: string;
|
||||
updatedAt: string;
|
||||
}
|
||||
|
||||
interface ProxyPayload {
|
||||
name: string;
|
||||
type: string;
|
||||
host: string;
|
||||
port: number;
|
||||
username?: string;
|
||||
password?: string;
|
||||
region?: string | null;
|
||||
notes?: string | null;
|
||||
status?: string;
|
||||
}
|
||||
|
||||
interface LegacyProxyConfig {
|
||||
global?: unknown;
|
||||
providers?: Record<string, unknown>;
|
||||
combos?: Record<string, unknown>;
|
||||
keys?: Record<string, unknown>;
|
||||
}
|
||||
|
||||
function toRecord(value: unknown): JsonRecord {
|
||||
return value && typeof value === "object" ? (value as JsonRecord) : {};
|
||||
}
|
||||
|
||||
function mapProxyRow(row: unknown): ProxyRegistryRecord {
|
||||
const r = toRecord(row);
|
||||
return {
|
||||
id: typeof r.id === "string" ? r.id : "",
|
||||
name: typeof r.name === "string" ? r.name : "",
|
||||
type: typeof r.type === "string" ? r.type : "http",
|
||||
host: typeof r.host === "string" ? r.host : "",
|
||||
port: Number(r.port) || 0,
|
||||
username: typeof r.username === "string" ? r.username : "",
|
||||
password: typeof r.password === "string" ? r.password : "",
|
||||
region: typeof r.region === "string" ? r.region : null,
|
||||
notes: typeof r.notes === "string" ? r.notes : null,
|
||||
status: typeof r.status === "string" ? r.status : "active",
|
||||
createdAt: typeof r.created_at === "string" ? r.created_at : "",
|
||||
updatedAt: typeof r.updated_at === "string" ? r.updated_at : "",
|
||||
};
|
||||
}
|
||||
|
||||
function mapAssignmentRow(row: unknown): ProxyAssignmentRecord {
|
||||
const r = toRecord(row);
|
||||
const scope = (typeof r.scope === "string" ? r.scope : "global") as ProxyScope;
|
||||
const rawScopeId = typeof r.scope_id === "string" ? r.scope_id : null;
|
||||
return {
|
||||
id: Number(r.id) || 0,
|
||||
proxyId: typeof r.proxy_id === "string" ? r.proxy_id : "",
|
||||
scope,
|
||||
scopeId: scope === "global" && rawScopeId === "__global__" ? null : rawScopeId,
|
||||
createdAt: typeof r.created_at === "string" ? r.created_at : "",
|
||||
updatedAt: typeof r.updated_at === "string" ? r.updated_at : "",
|
||||
};
|
||||
}
|
||||
|
||||
function normalizeScope(scope: string): ProxyScope {
|
||||
const value = String(scope || "").toLowerCase();
|
||||
if (value === "key") return "account";
|
||||
if (value === "provider") return "provider";
|
||||
if (value === "account") return "account";
|
||||
if (value === "combo") return "combo";
|
||||
return "global";
|
||||
}
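To illustrate how `normalizeScope` treats its input, here is a standalone copy of the function with a few sample calls. The sample inputs are made up for illustration; the function body follows the diff above.

```typescript
type ProxyScope = "global" | "provider" | "account" | "combo";

// Standalone copy of the normalization logic shown in the diff.
function normalizeScope(scope: string): ProxyScope {
  const value = String(scope || "").toLowerCase();
  if (value === "key") return "account"; // legacy name for the account scope
  if (value === "provider") return "provider";
  if (value === "account") return "account";
  if (value === "combo") return "combo";
  return "global"; // anything unrecognized falls back to global
}

const normalized = ["key", "Provider", "COMBO", "", "typo-scope"].map((s) => normalizeScope(s));
console.log(normalized);
```

Note that an unrecognized value is not rejected; it silently becomes `"global"`, so callers that accept user input may want to validate the scope before calling.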
|
||||
|
||||
function coerceProxyPayload(value: unknown, fallbackName: string): ProxyPayload | null {
|
||||
if (!value) return null;
|
||||
|
||||
if (typeof value === "string") {
|
||||
try {
|
||||
const parsed = new URL(value);
|
||||
return {
|
||||
name: fallbackName,
|
||||
type: parsed.protocol.replace(":", "") || "http",
|
||||
host: parsed.hostname,
|
||||
port: Number(parsed.port || (parsed.protocol === "https:" ? "443" : "8080")),
|
||||
username: parsed.username ? decodeURIComponent(parsed.username) : "",
|
||||
password: parsed.password ? decodeURIComponent(parsed.password) : "",
|
||||
status: "active",
|
||||
};
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
}
|
||||
|
||||
if (typeof value !== "object" || Array.isArray(value)) return null;
|
||||
const record = toRecord(value);
|
||||
const host = typeof record.host === "string" ? record.host.trim() : "";
|
||||
if (!host) return null;
|
||||
const port = Number(record.port) || 8080;
|
||||
|
||||
return {
|
||||
name: fallbackName,
|
||||
type: typeof record.type === "string" ? record.type : "http",
|
||||
host,
|
||||
port,
|
||||
username: typeof record.username === "string" ? record.username : "",
|
||||
password: typeof record.password === "string" ? record.password : "",
|
||||
status: "active",
|
||||
};
|
||||
}
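The string branch of `coerceProxyPayload` relies on the WHATWG `URL` parser, which has a couple of behaviors worth seeing in isolation. This is a sketch of that branch only; `parseLegacyProxyUrl` and `ParsedProxy` are hypothetical names, and the example hosts are invented.

```typescript
// Sketch of the string branch only (illustration, not the project's function).
interface ParsedProxy {
  type: string;
  host: string;
  port: number;
  username: string;
  password: string;
}

function parseLegacyProxyUrl(value: string): ParsedProxy | null {
  try {
    const parsed = new URL(value);
    return {
      type: parsed.protocol.replace(":", "") || "http",
      host: parsed.hostname,
      // URL.port is "" when the port is omitted OR equals the scheme's default.
      port: Number(parsed.port || (parsed.protocol === "https:" ? "443" : "8080")),
      // Credentials come back percent-encoded, hence the decode step.
      username: parsed.username ? decodeURIComponent(parsed.username) : "",
      password: parsed.password ? decodeURIComponent(parsed.password) : "",
    };
  } catch {
    return null; // new URL() throws on unparseable input
  }
}

const withAuth = parseLegacyProxyUrl("http://user:p%40ss@proxy.example.com:3128");
const explicitDefault = parseLegacyProxyUrl("http://proxy.example.com:80");
const invalid = parseLegacyProxyUrl("not a url");
console.log(withAuth, explicitDefault, invalid);
```

One consequence of the fallback: because the parser drops a port that matches the scheme default, an `http://` URL written with an explicit `:80` ends up with port `8080` under this logic. Whether that matters depends on how legacy configs were written.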
|
||||
|
||||
export function redactProxySecrets(proxy: ProxyRegistryRecord): ProxyRegistryRecord {
|
||||
return {
|
||||
...proxy,
|
||||
username: proxy.username ? "***" : "",
|
||||
password: proxy.password ? "***" : "",
|
||||
};
|
||||
}
|
||||
|
||||
export async function listProxies(options?: { includeSecrets?: boolean }) {
|
||||
const includeSecrets = options?.includeSecrets === true;
|
||||
const db = getDbInstance();
|
||||
const rows = db
|
||||
.prepare(
|
||||
"SELECT id, name, type, host, port, username, password, region, notes, status, created_at, updated_at FROM proxy_registry ORDER BY datetime(updated_at) DESC, name ASC"
|
||||
)
|
||||
.all();
|
||||
|
||||
const proxies = rows.map(mapProxyRow);
|
||||
return includeSecrets ? proxies : proxies.map(redactProxySecrets);
|
||||
}
|
||||
|
||||
export async function getProxyById(id: string, options?: { includeSecrets?: boolean }) {
|
||||
const includeSecrets = options?.includeSecrets === true;
|
||||
const db = getDbInstance();
|
||||
const row = db
|
||||
.prepare(
|
||||
"SELECT id, name, type, host, port, username, password, region, notes, status, created_at, updated_at FROM proxy_registry WHERE id = ?"
|
||||
)
|
||||
.get(id);
|
||||
if (!row) return null;
|
||||
const proxy = mapProxyRow(row);
|
||||
return includeSecrets ? proxy : redactProxySecrets(proxy);
|
||||
}
|
||||
|
||||
export async function createProxy(payload: ProxyPayload) {
|
||||
const db = getDbInstance();
|
||||
const id = randomUUID();
|
||||
const now = new Date().toISOString();
|
||||
|
||||
db.prepare(
|
||||
`INSERT INTO proxy_registry
|
||||
(id, name, type, host, port, username, password, region, notes, status, created_at, updated_at)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`
|
||||
).run(
|
||||
id,
|
||||
payload.name,
|
||||
payload.type,
|
||||
payload.host,
|
||||
Number(payload.port),
|
||||
payload.username || "",
|
||||
payload.password || "",
|
||||
payload.region || null,
|
||||
payload.notes || null,
|
||||
payload.status || "active",
|
||||
now,
|
||||
now
|
||||
);
|
||||
|
||||
backupDbFile("pre-write");
|
||||
return getProxyById(id, { includeSecrets: false });
|
||||
}
|
||||
|
||||
export async function updateProxy(id: string, payload: Partial<ProxyPayload>) {
|
||||
const db = getDbInstance();
|
||||
const existing = await getProxyById(id, { includeSecrets: true });
|
||||
if (!existing) return null;
|
||||
|
||||
const incomingUsername =
|
||||
typeof payload.username === "string" ? payload.username.trim() : undefined;
|
||||
const incomingPassword =
|
||||
typeof payload.password === "string" ? payload.password.trim() : undefined;
|
||||
|
||||
const merged = {
|
||||
...existing,
|
||||
...payload,
|
||||
// Preserve stored credentials unless caller explicitly sends non-empty replacements.
|
||||
username:
|
||||
incomingUsername === undefined || incomingUsername.length === 0
|
||||
? existing.username
|
||||
: incomingUsername,
|
||||
password:
|
||||
incomingPassword === undefined || incomingPassword.length === 0
|
||||
? existing.password
|
||||
: incomingPassword,
|
||||
updatedAt: new Date().toISOString(),
|
||||
};
|
||||
|
||||
db.prepare(
|
||||
`UPDATE proxy_registry
|
||||
SET name = ?, type = ?, host = ?, port = ?, username = ?, password = ?, region = ?, notes = ?, status = ?, updated_at = ?
|
||||
WHERE id = ?`
|
||||
).run(
|
||||
merged.name,
|
||||
merged.type,
|
||||
merged.host,
|
||||
Number(merged.port),
|
||||
merged.username || "",
|
||||
merged.password || "",
|
||||
merged.region || null,
|
||||
merged.notes || null,
|
||||
merged.status || "active",
|
||||
merged.updatedAt,
|
||||
id
|
||||
);
|
||||
|
||||
backupDbFile("pre-write");
|
||||
return getProxyById(id, { includeSecrets: false });
|
||||
}
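The credential handling in `updateProxy` boils down to one rule per field: keep the stored value unless the caller sends a non-empty string. A minimal sketch of that rule, using a hypothetical helper name `mergeCredential`:

```typescript
// Hypothetical helper isolating the merge rule used for username/password.
function mergeCredential(existing: string, incoming: unknown): string {
  const trimmed = typeof incoming === "string" ? incoming.trim() : undefined;
  // undefined, non-strings, "" and whitespace-only all preserve the stored value.
  return trimmed === undefined || trimmed.length === 0 ? existing : trimmed;
}

const kept = mergeCredential("stored-secret", undefined);
const keptOnBlank = mergeCredential("stored-secret", "   ");
const replaced = mergeCredential("stored-secret", "  new-secret ");
console.log(kept, keptOnBlank, replaced);
```

Two things follow from this rule as written: credentials cannot be cleared through an update (an empty string is treated as "no change"), and a caller that echoes back the redacted `***` placeholder would have it stored literally, since it is a non-empty string.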
|
||||
|
||||
export async function getProxyAssignments(filters?: { proxyId?: string; scope?: string }) {
|
||||
const db = getDbInstance();
|
||||
|
||||
if (filters?.proxyId) {
|
||||
return db
|
||||
.prepare(
|
||||
"SELECT id, proxy_id, scope, scope_id, created_at, updated_at FROM proxy_assignments WHERE proxy_id = ? ORDER BY scope, scope_id"
|
||||
)
|
||||
.all(filters.proxyId)
|
||||
.map(mapAssignmentRow);
|
||||
}
|
||||
|
||||
if (filters?.scope) {
|
||||
return db
|
||||
.prepare(
|
||||
"SELECT id, proxy_id, scope, scope_id, created_at, updated_at FROM proxy_assignments WHERE scope = ? ORDER BY scope_id"
|
||||
)
|
||||
.all(normalizeScope(filters.scope))
|
||||
.map(mapAssignmentRow);
|
||||
}
|
||||
|
||||
return db
|
||||
.prepare(
|
||||
"SELECT id, proxy_id, scope, scope_id, created_at, updated_at FROM proxy_assignments ORDER BY scope, scope_id"
|
||||
)
|
||||
.all()
|
||||
.map(mapAssignmentRow);
|
||||
}
|
||||
|
||||
export async function getProxyWhereUsed(proxyId: string) {
|
||||
const db = getDbInstance();
|
||||
const rows = db
|
||||
.prepare(
|
||||
"SELECT id, proxy_id, scope, scope_id, created_at, updated_at FROM proxy_assignments WHERE proxy_id = ? ORDER BY scope, scope_id"
|
||||
)
|
||||
.all(proxyId)
|
||||
.map(mapAssignmentRow);
|
||||
|
||||
return {
|
||||
count: rows.length,
|
||||
assignments: rows,
|
||||
};
|
||||
}
|
||||
|
||||
export async function assignProxyToScope(
|
||||
scope: string,
|
||||
scopeId: string | null,
|
||||
proxyId: string | null
|
||||
): Promise<ProxyAssignmentRecord | null> {
|
||||
const normalizedScope = normalizeScope(scope);
|
||||
const normalizedScopeId = normalizedScope === "global" ? "__global__" : scopeId;
|
||||
const db = getDbInstance();
|
||||
|
||||
if (!proxyId) {
|
||||
db.prepare("DELETE FROM proxy_assignments WHERE scope = ? AND scope_id IS ?").run(
|
||||
normalizedScope,
|
||||
normalizedScopeId
|
||||
);
|
||||
backupDbFile("pre-write");
|
||||
return null;
|
||||
}
|
||||
|
||||
const proxy = await getProxyById(proxyId, { includeSecrets: true });
|
||||
if (!proxy) {
|
||||
const err = new Error(`Proxy not found: ${proxyId}`) as Error & { status?: number };
|
||||
err.status = 404;
|
||||
throw err;
|
||||
}
|
||||
|
||||
const now = new Date().toISOString();
|
||||
db.prepare(
|
||||
`INSERT INTO proxy_assignments (proxy_id, scope, scope_id, created_at, updated_at)
|
||||
VALUES (?, ?, ?, ?, ?)
|
||||
ON CONFLICT(scope, scope_id)
|
||||
DO UPDATE SET proxy_id = excluded.proxy_id, updated_at = excluded.updated_at`
|
||||
).run(proxyId, normalizedScope, normalizedScopeId, now, now);
|
||||
|
||||
backupDbFile("pre-write");
|
||||
|
||||
const row = db
|
||||
.prepare(
|
||||
"SELECT id, proxy_id, scope, scope_id, created_at, updated_at FROM proxy_assignments WHERE scope = ? AND scope_id IS ?"
|
||||
)
|
||||
.get(normalizedScope, normalizedScopeId);
|
||||
return row ? mapAssignmentRow(row) : null;
|
||||
}
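`assignProxyToScope` stores the global assignment under the sentinel `"__global__"` rather than `NULL`. A likely reason, stated here as an assumption, is that the `ON CONFLICT(scope, scope_id)` upsert depends on a unique index, and SQLite treats `NULL` values as distinct in unique indexes, so a `NULL` scope id would never conflict. The sketch below models the resulting upsert/delete semantics with a `Map`; `keyFor`, `assign`, and `assignments` are hypothetical names.

```typescript
type ProxyScope = "global" | "provider" | "account" | "combo";

const GLOBAL_SENTINEL = "__global__";
const assignments = new Map<string, string>(); // composite key -> proxyId

// One key per (scope, scopeId); global always maps to the sentinel.
function keyFor(scope: ProxyScope, scopeId: string | null): string {
  const id = scope === "global" ? GLOBAL_SENTINEL : scopeId;
  return JSON.stringify([scope, id]);
}

function assign(scope: ProxyScope, scopeId: string | null, proxyId: string | null): void {
  const key = keyFor(scope, scopeId);
  if (proxyId === null) {
    assignments.delete(key); // a null proxyId clears the assignment
    return;
  }
  assignments.set(key, proxyId); // insert or overwrite, like ON CONFLICT DO UPDATE
}

assign("global", null, "proxy-a");
assign("global", null, "proxy-b"); // replaces, does not add a second global row
assign("provider", "openai", "proxy-a");
assign("provider", "openai", null); // removes the provider assignment
console.log(assignments.size, assignments.get(keyFor("global", null)));
```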
|
||||
|
||||
export async function deleteProxyById(id: string, options?: { force?: boolean }) {
|
||||
const force = options?.force === true;
|
||||
const db = getDbInstance();
|
||||
const usage = await getProxyWhereUsed(id);
|
||||
|
||||
if (!force && usage.count > 0) {
|
||||
const err = new Error(
|
||||
"Proxy is still assigned. Remove assignments first or use force=true"
|
||||
) as Error & {
|
||||
status?: number;
|
||||
code?: string;
|
||||
};
|
||||
err.status = 409;
|
||||
err.code = "proxy_in_use";
|
||||
throw err;
|
||||
}
|
||||
|
||||
if (force && usage.count > 0) {
|
||||
db.prepare("DELETE FROM proxy_assignments WHERE proxy_id = ?").run(id);
|
||||
}
|
||||
|
||||
const result = db.prepare("DELETE FROM proxy_registry WHERE id = ?").run(id);
|
||||
backupDbFile("pre-write");
|
||||
return result.changes > 0;
|
||||
}
|
||||
|
||||
export async function resolveProxyForConnectionFromRegistry(connectionId: string) {
|
||||
const db = getDbInstance();
|
||||
|
||||
const accountAssignment = db
|
||||
.prepare(
|
||||
"SELECT p.id, p.type, p.host, p.port, p.username, p.password FROM proxy_assignments a JOIN proxy_registry p ON p.id = a.proxy_id WHERE a.scope = 'account' AND a.scope_id = ? LIMIT 1"
|
||||
)
|
||||
.get(connectionId);
|
||||
if (accountAssignment) {
|
||||
const record = toRecord(accountAssignment);
|
||||
return {
|
||||
proxy: {
|
||||
type: record.type,
|
||||
host: record.host,
|
||||
port: record.port,
|
||||
username: record.username,
|
||||
password: record.password,
|
||||
},
|
||||
level: "account",
|
||||
levelId: connectionId,
|
||||
source: "registry",
|
||||
};
|
||||
}
|
||||
|
||||
const connection = db
|
||||
.prepare("SELECT provider FROM provider_connections WHERE id = ?")
|
||||
.get(connectionId) as { provider?: string } | undefined;
|
||||
|
||||
if (connection?.provider) {
|
||||
const providerAssignment = db
|
||||
.prepare(
|
||||
"SELECT p.id, p.type, p.host, p.port, p.username, p.password FROM proxy_assignments a JOIN proxy_registry p ON p.id = a.proxy_id WHERE a.scope = 'provider' AND a.scope_id = ? LIMIT 1"
|
||||
)
|
||||
.get(connection.provider);
|
||||
if (providerAssignment) {
|
||||
const record = toRecord(providerAssignment);
|
||||
return {
|
||||
proxy: {
|
||||
type: record.type,
|
||||
host: record.host,
|
||||
port: record.port,
|
||||
username: record.username,
|
||||
password: record.password,
|
||||
},
|
||||
level: "provider",
|
||||
levelId: connection.provider,
|
||||
source: "registry",
|
||||
};
|
||||
}
|
||||
}
|
||||
|
||||
const globalAssignment = db
|
||||
.prepare(
|
||||
"SELECT p.id, p.type, p.host, p.port, p.username, p.password FROM proxy_assignments a JOIN proxy_registry p ON p.id = a.proxy_id WHERE a.scope = 'global' LIMIT 1"
|
||||
)
|
||||
.get();
|
||||
if (globalAssignment) {
|
||||
const record = toRecord(globalAssignment);
|
||||
return {
|
||||
proxy: {
|
||||
type: record.type,
|
||||
host: record.host,
|
||||
port: record.port,
|
||||
username: record.username,
|
||||
password: record.password,
|
||||
},
|
||||
level: "global",
|
||||
levelId: null,
|
||||
source: "registry",
|
||||
};
|
||||
}
|
||||
|
||||
return null;
|
||||
}
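The resolver above checks three levels in a fixed order: account, then provider, then global. The following in-memory sketch shows that precedence without a database; `ScopedAssignment`, `Resolved`, `resolveProxy`, and the sample hosts are all hypothetical.

```typescript
interface ScopedAssignment {
  scope: "account" | "provider" | "global";
  scopeId: string | null;
  proxyHost: string;
}

interface Resolved {
  level: "account" | "provider" | "global";
  levelId: string | null;
  proxyHost: string;
}

// Most specific match wins: account > provider > global.
function resolveProxy(
  rows: ScopedAssignment[],
  connectionId: string,
  provider: string | undefined
): Resolved | null {
  const account = rows.find((a) => a.scope === "account" && a.scopeId === connectionId);
  if (account) return { level: "account", levelId: connectionId, proxyHost: account.proxyHost };

  if (provider) {
    const byProvider = rows.find((a) => a.scope === "provider" && a.scopeId === provider);
    if (byProvider) return { level: "provider", levelId: provider, proxyHost: byProvider.proxyHost };
  }

  const globalRow = rows.find((a) => a.scope === "global");
  return globalRow ? { level: "global", levelId: null, proxyHost: globalRow.proxyHost } : null;
}

const registry: ScopedAssignment[] = [
  { scope: "global", scopeId: "__global__", proxyHost: "global.proxy.example" },
  { scope: "provider", scopeId: "openai", proxyHost: "provider.proxy.example" },
  { scope: "account", scopeId: "conn-1", proxyHost: "account.proxy.example" },
];

const forAccount = resolveProxy(registry, "conn-1", "openai");
const forProvider = resolveProxy(registry, "conn-2", "openai");
const forGlobal = resolveProxy(registry, "conn-3", undefined);
const forNone = resolveProxy([], "conn-1", "openai");
console.log(forAccount, forProvider, forGlobal, forNone);
```

The `combo` scope can be assigned through the registry, but the resolver in this diff does not consult it; combo-level lookups presumably happen elsewhere or are not yet wired in.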
|
||||
|
||||
export async function migrateLegacyProxyConfigToRegistry(options?: { force?: boolean }) {
|
||||
const force = options?.force === true;
|
||||
const db = getDbInstance();
|
||||
|
||||
const existingCountRow = db.prepare("SELECT COUNT(*) AS cnt FROM proxy_registry").get() as
|
||||
| { cnt?: number }
|
||||
| undefined;
|
||||
const existingCount = Number(existingCountRow?.cnt || 0);
|
||||
if (!force && existingCount > 0) {
|
||||
return { migrated: 0, skipped: true, reason: "registry_not_empty" as const };
|
||||
}
|
||||
|
||||
const rows = db
|
||||
.prepare("SELECT key, value FROM key_value WHERE namespace = 'proxyConfig'")
|
||||
.all() as Array<{ key?: string; value?: string }>;
|
||||
|
||||
const raw: LegacyProxyConfig = {};
|
||||
for (const row of rows) {
|
||||
if (!row?.key || typeof row.value !== "string") continue;
|
||||
try {
|
||||
raw[row.key as keyof LegacyProxyConfig] = JSON.parse(row.value);
|
||||
} catch {
|
||||
// ignore malformed legacy entry
|
||||
}
|
||||
}
|
||||
|
||||
let migrated = 0;
|
||||
|
||||
if (raw.global) {
|
||||
const payload = coerceProxyPayload(raw.global, "Legacy Global Proxy");
|
||||
if (payload) {
|
||||
const created = await createProxy(payload);
|
||||
if (created?.id) {
|
||||
await assignProxyToScope("global", null, created.id);
|
||||
migrated++;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
for (const [providerId, proxyValue] of Object.entries(raw.providers || {})) {
|
||||
const payload = coerceProxyPayload(proxyValue, `Legacy Provider Proxy (${providerId})`);
|
||||
if (!payload) continue;
|
||||
const created = await createProxy(payload);
|
||||
if (created?.id) {
|
||||
await assignProxyToScope("provider", providerId, created.id);
|
||||
migrated++;
|
||||
}
|
||||
}
|
||||
|
||||
for (const [comboId, proxyValue] of Object.entries(raw.combos || {})) {
|
||||
const payload = coerceProxyPayload(proxyValue, `Legacy Combo Proxy (${comboId})`);
|
||||
if (!payload) continue;
|
||||
const created = await createProxy(payload);
|
||||
if (created?.id) {
|
||||
await assignProxyToScope("combo", comboId, created.id);
|
||||
migrated++;
|
||||
}
|
||||
}
|
||||
|
||||
for (const [connectionId, proxyValue] of Object.entries(raw.keys || {})) {
|
||||
const payload = coerceProxyPayload(proxyValue, `Legacy Account Proxy (${connectionId})`);
|
||||
if (!payload) continue;
|
||||
const created = await createProxy(payload);
|
||||
if (created?.id) {
|
||||
await assignProxyToScope("account", connectionId, created.id);
|
||||
migrated++;
|
||||
}
|
||||
}
|
||||
|
||||
return { migrated, skipped: false as const };
|
||||
}
|
||||
|
||||
export async function getProxyHealthStats(options?: { hours?: number }) {
|
||||
const db = getDbInstance();
|
||||
const hours = Math.max(1, Math.min(24 * 30, Number(options?.hours || 24)));
|
||||
const sinceIso = new Date(Date.now() - hours * 60 * 60 * 1000).toISOString();
|
||||
|
||||
const rows = db
|
||||
.prepare(
|
||||
`SELECT
|
||||
p.id as proxy_id,
|
||||
p.name as proxy_name,
|
||||
p.type as proxy_type,
|
||||
p.host as proxy_host,
|
||||
p.port as proxy_port,
|
||||
COUNT(l.id) as total_requests,
|
||||
SUM(CASE WHEN l.status = 'success' THEN 1 ELSE 0 END) as success_count,
|
||||
SUM(CASE WHEN l.status = 'error' THEN 1 ELSE 0 END) as error_count,
|
||||
SUM(CASE WHEN l.status = 'timeout' THEN 1 ELSE 0 END) as timeout_count,
|
||||
AVG(CASE WHEN l.latency_ms IS NOT NULL THEN l.latency_ms END) as avg_latency_ms,
|
||||
MAX(l.timestamp) as last_seen_at
|
||||
FROM proxy_registry p
|
||||
LEFT JOIN proxy_logs l
|
||||
ON l.proxy_host = p.host
|
||||
AND l.proxy_type = p.type
|
||||
AND l.proxy_port = p.port
|
||||
AND l.timestamp >= ?
|
||||
GROUP BY p.id, p.name, p.type, p.host, p.port
|
||||
ORDER BY p.name ASC`
|
||||
)
|
||||
.all(sinceIso) as Array<Record<string, unknown>>;
|
||||
|
||||
return rows.map((row) => {
|
||||
const total = Number(row.total_requests || 0);
|
||||
const success = Number(row.success_count || 0);
|
||||
const error = Number(row.error_count || 0);
|
||||
const timeout = Number(row.timeout_count || 0);
|
||||
const successRate = total > 0 ? Math.round((success / total) * 10000) / 100 : null;
|
||||
|
||||
return {
|
||||
proxyId: String(row.proxy_id || ""),
|
||||
name: String(row.proxy_name || ""),
|
||||
type: String(row.proxy_type || "http"),
|
||||
host: String(row.proxy_host || ""),
|
||||
port: Number(row.proxy_port || 0),
|
||||
totalRequests: total,
|
||||
successCount: success,
|
||||
errorCount: error,
|
||||
timeoutCount: timeout,
|
||||
successRate,
|
||||
avgLatencyMs:
|
||||
row.avg_latency_ms === null || row.avg_latency_ms === undefined
|
||||
? null
|
||||
: Math.round(Number(row.avg_latency_ms)),
|
||||
lastSeenAt: row.last_seen_at ? String(row.last_seen_at) : null,
|
||||
};
|
||||
});
|
||||
}
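Two small calculations in `getProxyHealthStats` are easy to misread: the clamp on the `hours` window and the two-decimal success rate. The sketch below isolates both; `clampHours` and `successRate` are hypothetical helper names wrapping the same expressions.

```typescript
// Window is clamped to [1 hour, 30 days]; missing/zero/NaN falls back to 24.
function clampHours(hours?: number): number {
  return Math.max(1, Math.min(24 * 30, Number(hours || 24)));
}

// Percentage rounded to two decimals; null when there were no requests.
function successRate(success: number, total: number): number | null {
  return total > 0 ? Math.round((success / total) * 10000) / 100 : null;
}

console.log(clampHours(), clampHours(0.5), clampHours(10_000));
console.log(successRate(2, 3), successRate(0, 0));
```

Returning `null` instead of `0` for a proxy with no logged requests lets a dashboard distinguish "no data" from "every request failed".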
|
||||
|
||||
export async function bulkAssignProxyToScope(
|
||||
scope: string,
|
||||
scopeIds: string[],
|
||||
proxyId: string | null
|
||||
): Promise<{ updated: number; failed: Array<{ scopeId: string; reason: string }> }> {
|
||||
const uniqueScopeIds = [
|
||||
...new Set((scopeIds || []).map((id) => String(id).trim()).filter(Boolean)),
|
||||
];
|
||||
const failed: Array<{ scopeId: string; reason: string }> = [];
|
||||
let updated = 0;
|
||||
|
||||
if (scope === "global") {
|
||||
await assignProxyToScope("global", null, proxyId);
|
||||
return { updated: 1, failed: [] };
|
||||
}
|
||||
|
||||
for (const scopeId of uniqueScopeIds) {
|
||||
try {
|
||||
await assignProxyToScope(scope, scopeId, proxyId);
|
||||
updated++;
|
||||
} catch (error) {
|
||||
failed.push({
|
||||
scopeId,
|
||||
reason: error instanceof Error ? error.message : "Unknown error",
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
return { updated, failed };
|
||||
}
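`bulkAssignProxyToScope` does two things before and during the loop: it de-duplicates the incoming ids, and it records per-id failures instead of aborting. A minimal sketch of that pattern follows; `uniqueScopeIds`, `bulkApply`, and the failing callback are hypothetical stand-ins for the real assignment call.

```typescript
// Trim, drop empties, and de-duplicate while keeping first-seen order.
function uniqueScopeIds(scopeIds: string[] | null | undefined): string[] {
  return Array.from(new Set((scopeIds || []).map((id) => String(id).trim()).filter(Boolean)));
}

// Apply an operation per id, collecting failures rather than stopping at the first.
function bulkApply(
  scopeIds: string[],
  apply: (scopeId: string) => void
): { updated: number; failed: Array<{ scopeId: string; reason: string }> } {
  const failed: Array<{ scopeId: string; reason: string }> = [];
  let updated = 0;
  for (const scopeId of uniqueScopeIds(scopeIds)) {
    try {
      apply(scopeId);
      updated++;
    } catch (error) {
      failed.push({ scopeId, reason: error instanceof Error ? error.message : "Unknown error" });
    }
  }
  return { updated, failed };
}

const bulkResult = bulkApply([" a", "a ", "", "b", "bad", "  "], (id) => {
  if (id === "bad") throw new Error("Proxy not found: demo");
});
console.log(bulkResult);
```

Since each id is handled independently, a partial failure leaves the successful assignments in place; callers need to inspect `failed` rather than assume all-or-nothing behavior.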
|
||||
@@ -6,6 +6,7 @@ import { getDbInstance } from "./core";
|
||||
import { backupDbFile } from "./backup";
|
||||
import { PROVIDER_ID_TO_ALIAS } from "@omniroute/open-sse/config/providerModels.ts";
|
||||
import { invalidateDbCache } from "./readCache";
|
||||
import { resolveProxyForConnectionFromRegistry } from "./proxies";
|
||||
|
||||
type JsonRecord = Record<string, unknown>;
|
||||
type PricingModels = Record<string, JsonRecord>;
|
||||
@@ -389,6 +390,11 @@ export async function deleteProxyForLevel(level: string, id: string | null) {
|
||||
}
|
||||
|
||||
export async function resolveProxyForConnection(connectionId: string) {
|
||||
const registryResolved = await resolveProxyForConnectionFromRegistry(connectionId);
|
||||
if (registryResolved?.proxy) {
|
||||
return registryResolved;
|
||||
}
|
||||
|
||||
const config = await getProxyConfig();
|
||||
|
||||
if (connectionId && config.keys?.[connectionId]) {
|
||||
|
||||
@@ -89,6 +89,22 @@ export {
|
||||
setProxyConfig,
|
||||
} from "./db/settings";
|
||||
|
||||
export {
|
||||
// Proxy Registry
|
||||
listProxies,
|
||||
getProxyById,
|
||||
createProxy,
|
||||
updateProxy,
|
||||
deleteProxyById,
|
||||
getProxyAssignments,
|
||||
getProxyWhereUsed,
|
||||
assignProxyToScope,
|
||||
resolveProxyForConnectionFromRegistry,
|
||||
migrateLegacyProxyConfigToRegistry,
|
||||
getProxyHealthStats,
|
||||
bulkAssignProxyToScope,
|
||||
} from "./db/proxies";
|
||||
|
||||
export {
|
||||
// Pricing Sync
|
||||
getSyncedPricing,
|
||||
|
||||
@@ -0,0 +1,218 @@
|
||||
/**
 * Local Provider Health Check
 *
 * Background polling of local provider_nodes (localhost) to detect
 * when they are up or down. Uses GET /models with a 5s timeout.
 *
 * Health status is stored in-memory (no DB migration needed).
 * Backoff schedule: 30s → 60s → 120s → 300s max, chosen from the highest
 * consecutive-failure count across all nodes. The sweep interval returns
 * to 30s once every node has passed its latest check.
 *
 * Each node is checked in parallel via Promise.allSettled, so one slow or
 * down node doesn't block the others.
 */
|
||||
|
||||
import { getProviderNodes } from "@/lib/localDb";
|
||||
|
||||
// ── Types ────────────────────────────────────────────────────────────────
|
||||
|
||||
export interface HealthStatus {
|
||||
nodeId: string;
|
||||
prefix: string;
|
||||
isHealthy: boolean;
|
||||
lastCheck: Date;
|
||||
lastError?: string;
|
||||
consecutiveFailures: number;
|
||||
responseTimeMs?: number;
|
||||
}
|
||||
|
||||
// ── Config ───────────────────────────────────────────────────────────────
|
||||
|
||||
const BACKOFF_SCHEDULE = [30_000, 60_000, 120_000, 300_000];
|
||||
const CHECK_TIMEOUT_MS = 5_000;
|
||||
const INITIAL_DELAY_MS = 15_000; // Wait for server boot before first sweep
|
||||
const LOG_PREFIX = "[LocalHealthCheck]";
|
||||
|
||||
// ── State ────────────────────────────────────────────────────────────────
|
||||
|
||||
const healthCache = new Map<string, HealthStatus>();
|
||||
let initialized = false;
|
||||
let sweepTimer: ReturnType<typeof setTimeout> | null = null;
|
||||
|
||||
// ── Helpers ──────────────────────────────────────────────────────────────
|
||||
|
||||
function isLocalhostUrl(baseUrl: string): boolean {
|
||||
try {
|
||||
const u = new URL(baseUrl);
|
||||
// Block credentials in URL to prevent SSRF via user@host (e.g., http://localhost@evil.com)
|
||||
if (u.username || u.password) return false;
|
||||
// Note: URL.hostname returns "[::1]" WITH brackets for IPv6 — both forms checked.
|
||||
// Verified: node -e "new URL('http://[::1]:8080').hostname" → "[::1]"
|
||||
return (
|
||||
u.hostname === "localhost" ||
|
||||
u.hostname === "127.0.0.1" ||
|
||||
u.hostname === "::1" ||
|
||||
u.hostname === "[::1]"
|
||||
);
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
}
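To show what `isLocalhostUrl` accepts and rejects, here is a standalone copy of the check with a handful of sample URLs. The URLs are invented for illustration; the IPv6 result assumes a runtime whose `URL.hostname` keeps the brackets, as the comment in the diff notes for Node.

```typescript
// Standalone copy of the localhost check from the diff.
function isLocalhostUrl(baseUrl: string): boolean {
  try {
    const u = new URL(baseUrl);
    // Reject userinfo so "http://localhost@evil.example" cannot pass as local.
    if (u.username || u.password) return false;
    return (
      u.hostname === "localhost" ||
      u.hostname === "127.0.0.1" ||
      u.hostname === "::1" ||
      u.hostname === "[::1]"
    );
  } catch {
    return false; // unparseable input is never treated as local
  }
}

const localhostChecks = {
  plain: isLocalhostUrl("http://localhost:11434/v1"),
  loopbackV4: isLocalhostUrl("http://127.0.0.1:8080"),
  loopbackV6: isLocalhostUrl("http://[::1]:8080"),
  userinfoTrick: isLocalhostUrl("http://localhost@evil.example/"),
  remote: isLocalhostUrl("https://api.example.com/v1"),
  garbage: isLocalhostUrl("not a url"),
};
console.log(localhostChecks);
```

In the userinfo case the parser reads `localhost` as the username and `evil.example` as the host, so the URL would fail the hostname comparison even without the explicit credentials check; the early return makes the intent obvious. Other loopback addresses such as `127.0.0.2` are not matched by this list.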
|
||||
|
||||
function getNextInterval(failures: number): number {
|
||||
return BACKOFF_SCHEDULE[Math.min(failures, BACKOFF_SCHEDULE.length - 1)];
|
||||
}
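The backoff lookup indexes the schedule by failure count and clamps at the last entry. A quick standalone check of the resulting intervals, using the same constants as the diff:

```typescript
const BACKOFF_SCHEDULE = [30_000, 60_000, 120_000, 300_000];

// Index by consecutive failures, clamped to the last (longest) interval.
function getNextInterval(failures: number): number {
  return BACKOFF_SCHEDULE[Math.min(failures, BACKOFF_SCHEDULE.length - 1)];
}

const intervals = [0, 1, 2, 3, 10].map((f) => getNextInterval(f));
console.log(intervals);
```

Because the scheduler elsewhere in this file feeds in the maximum failure count across all nodes, a single node that stays down pushes the whole sweep to the 300s interval, which also delays detection of changes on the healthy nodes.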
|
||||
|
||||
// ── Core ─────────────────────────────────────────────────────────────────
|
||||
|
||||
async function checkNode(node: {
|
||||
id: string;
|
||||
prefix: string;
|
||||
baseUrl: string;
|
||||
}): Promise<HealthStatus> {
|
||||
const url = `${node.baseUrl.replace(/\/+$/, "")}/models`;
|
||||
const start = Date.now();
|
||||
const prev = healthCache.get(node.id);
|
||||
|
||||
try {
|
||||
const res = await fetch(url, { signal: AbortSignal.timeout(CHECK_TIMEOUT_MS) });
|
||||
// Consume/cancel response body to free resources
|
||||
res.body?.cancel().catch(() => {});
|
||||
const isHealthy = res.ok || res.status === 401; // 401 = server up but auth required
|
||||
return {
|
||||
nodeId: node.id,
|
||||
prefix: node.prefix,
|
||||
isHealthy,
|
||||
lastCheck: new Date(),
|
||||
consecutiveFailures: isHealthy ? 0 : (prev?.consecutiveFailures ?? 0) + 1,
|
||||
responseTimeMs: Date.now() - start,
|
||||
lastError: isHealthy ? undefined : `HTTP ${res.status}`,
|
||||
};
|
||||
} catch (err: unknown) {
|
||||
const message = err instanceof Error ? err.message : "Connection failed";
|
||||
return {
|
||||
nodeId: node.id,
|
||||
prefix: node.prefix,
|
||||
isHealthy: false,
|
||||
lastCheck: new Date(),
|
||||
consecutiveFailures: (prev?.consecutiveFailures ?? 0) + 1,
|
||||
responseTimeMs: Date.now() - start,
|
||||
lastError: message,
|
||||
};
|
||||
}
|
||||
}
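The health decision inside `checkNode` can be separated from the network call. The sketch below, with the hypothetical name `classifyResponse`, applies the same rule to a bare status code: any 2xx or a 401 counts as "up", everything else increments the failure counter.

```typescript
// Hypothetical pure version of the status-to-health mapping.
function classifyResponse(httpStatus: number, prevFailures: number) {
  const ok = httpStatus >= 200 && httpStatus <= 299; // what Response.ok reports
  const isHealthy = ok || httpStatus === 401; // 401: server is up, auth required
  return {
    isHealthy,
    consecutiveFailures: isHealthy ? 0 : prevFailures + 1,
    lastError: isHealthy ? undefined : `HTTP ${httpStatus}`,
  };
}

const afterOk = classifyResponse(200, 3);
const afterUnauthorized = classifyResponse(401, 2);
const afterUnavailable = classifyResponse(503, 2);
console.log(afterOk, afterUnauthorized, afterUnavailable);
```

Only 401 gets this exemption; a 403 or 404 from a running server is still recorded as unhealthy under this rule.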
|
||||
|
||||
let sweepInProgress = false;
|
||||
|
||||
/** Single sweep: check all local provider_nodes in parallel. */
|
||||
export async function sweep(): Promise<void> {
|
||||
if (sweepInProgress) return; // Prevent concurrent sweeps
|
||||
sweepInProgress = true;
|
||||
|
||||
try {
|
||||
let nodes: Array<{ id: string; prefix: string; baseUrl: string }>;
|
||||
try {
|
||||
const raw = await getProviderNodes();
|
||||
nodes = (Array.isArray(raw) ? raw : []).filter(
|
||||
(n: Record<string, unknown>) =>
|
||||
typeof n.baseUrl === "string" && isLocalhostUrl(n.baseUrl as string)
|
||||
) as Array<{ id: string; prefix: string; baseUrl: string }>;
|
||||
} catch (err) {
|
||||
console.error(LOG_PREFIX, "Failed to load provider_nodes:", err);
|
||||
return;
|
||||
}
|
||||
|
||||
// Prune stale entries for deleted nodes
|
||||
const currentNodeIds = new Set(nodes.map((n) => n.id));
|
||||
for (const key of healthCache.keys()) {
|
||||
if (!currentNodeIds.has(key)) healthCache.delete(key);
|
||||
}
|
||||
|
||||
if (nodes.length === 0) return;
|
||||
|
||||
const results = await Promise.allSettled(nodes.map((node) => checkNode(node)));
|
||||
|
||||
for (const result of results) {
|
||||
if (result.status === "fulfilled") {
|
||||
const status = result.value;
|
||||
const prev = healthCache.get(status.nodeId);
|
||||
|
||||
// Log state transitions
|
||||
if (prev && prev.isHealthy !== status.isHealthy) {
|
||||
const emoji = status.isHealthy ? "✅" : "❌";
|
||||
console.log(
|
||||
LOG_PREFIX,
|
||||
`${emoji} ${status.prefix} is now ${status.isHealthy ? "healthy" : "unhealthy"}${status.lastError ? ` (${status.lastError})` : ""} [${status.responseTimeMs}ms]`
|
||||
);
|
||||
}
|
||||
|
||||
healthCache.set(status.nodeId, status);
|
||||
}
|
||||
}
|
||||
} finally {
|
||||
sweepInProgress = false;
|
||||
// Schedule next sweep with backoff based on worst-case failure count
|
||||
scheduleSweep();
|
||||
}
|
||||
}
|
||||
|
||||
function scheduleSweep(): void {
|
||||
if (!initialized) return; // Don't schedule if stopped
|
||||
if (sweepTimer) clearTimeout(sweepTimer);
|
||||
|
||||
// Use the maximum consecutive failures across all nodes to determine interval
|
||||
let maxFailures = 0;
|
||||
for (const status of healthCache.values()) {
|
||||
if (status.consecutiveFailures > maxFailures) {
|
||||
maxFailures = status.consecutiveFailures;
|
||||
}
|
||||
}
|
||||
|
||||
const interval = getNextInterval(maxFailures);
|
||||
sweepTimer = setTimeout(sweep, interval);
|
||||
}
|
||||
|
||||
// ── Public API ───────────────────────────────────────────────────────────
|
||||
|
||||
/** Get health status for a specific provider_node. */
|
||||
export function getHealthStatus(nodeId: string): HealthStatus | undefined {
|
||||
return healthCache.get(nodeId);
|
||||
}
|
||||
|
||||
/** Check if a provider_node is healthy. Returns true if never checked (optimistic). */
|
||||
export function isNodeHealthy(nodeId: string): boolean {
|
||||
const status = healthCache.get(nodeId);
|
||||
return status?.isHealthy ?? true;
|
||||
}
|
||||
|
||||
/** Get all health statuses (for monitoring API). */
|
||||
export function getAllHealthStatuses(): Record<string, HealthStatus> {
|
||||
return Object.fromEntries(healthCache);
|
||||
}
|
||||
|
||||
/** Start the health check scheduler (idempotent). */
|
||||
export function initLocalHealthCheck(): void {
|
||||
if (initialized) return;
|
||||
initialized = true;
|
||||
|
||||
console.log(
|
||||
LOG_PREFIX,
|
||||
`Starting local provider health check (initial delay ${INITIAL_DELAY_MS / 1000}s)`
|
||||
);
|
||||
|
||||
// Delay first sweep to let the server finish booting
|
||||
sweepTimer = setTimeout(() => {
|
||||
sweep().catch((err) => console.error(LOG_PREFIX, "Initial sweep failed:", err));
|
||||
}, INITIAL_DELAY_MS);
|
||||
}
|
||||
|
||||
/** Stop the scheduler (for tests / hot-reload). */
|
||||
export function stopLocalHealthCheck(): void {
|
||||
if (sweepTimer) {
|
||||
clearTimeout(sweepTimer);
|
||||
sweepTimer = null;
|
||||
}
|
||||
initialized = false;
|
||||
}
|
||||
|
||||
// Auto-initialize on first import (same pattern as tokenHealthCheck.ts:272)
|
||||
initLocalHealthCheck();
|
||||
@@ -1,6 +1,6 @@
|
||||
import { KIMI_CODING_CONFIG } from "../constants/oauth";
|
||||
-import { randomUUID } from "node:crypto";
-import { hostname } from "node:os";
+import { randomUUID } from "crypto";
+import { hostname } from "os";
|
||||
|
||||
// Generate device ID (persistent per installation)
|
||||
const DEVICE_ID = randomUUID();
|
||||
|
||||
@@ -10,7 +10,7 @@
|
||||
* @module lib/semanticCache
|
||||
*/
|
||||
|
||||
-import crypto from "node:crypto";
+import crypto from "crypto";
|
||||
import { LRUCache } from "./cacheLayer";
|
||||
import { getDbInstance } from "./db/core";
|
||||
|
||||
|
||||
@@ -24,8 +24,7 @@ export const CALL_LOGS_DIR = isCloud ? null : path.join(DATA_DIR, "call_logs");
|
||||
// Legacy paths
|
||||
const LEGACY_DB_FILE =
|
||||
isCloud || !LEGACY_DATA_DIR ? null : path.join(LEGACY_DATA_DIR, "usage.json");
|
||||
-const LEGACY_LOG_FILE =
-isCloud || !LEGACY_DATA_DIR ? null : path.join(LEGACY_DATA_DIR, "log.txt");
+const LEGACY_LOG_FILE = isCloud || !LEGACY_DATA_DIR ? null : path.join(LEGACY_DATA_DIR, "log.txt");
|
||||
const LEGACY_CALL_LOGS_DB_FILE =
|
||||
isCloud || !LEGACY_DATA_DIR ? null : path.join(LEGACY_DATA_DIR, "call_logs.json");
|
||||
const LEGACY_CALL_LOGS_DIR =
|
||||
@@ -82,10 +81,10 @@ export function migrateUsageJsonToSqlite() {
|
||||
const insert = db.prepare(`
|
||||
INSERT INTO usage_history (provider, model, connection_id, api_key_id, api_key_name,
|
||||
tokens_input, tokens_output, tokens_cache_read, tokens_cache_creation, tokens_reasoning,
|
||||
-status, timestamp)
+status, success, latency_ms, ttft_ms, error_code, timestamp)
|
||||
VALUES (@provider, @model, @connectionId, @apiKeyId, @apiKeyName,
|
||||
@tokensInput, @tokensOutput, @tokensCacheRead, @tokensCacheCreation, @tokensReasoning,
|
||||
-@status, @timestamp)
+@status, @success, @latencyMs, @ttftMs, @errorCode, @timestamp)
|
||||
`);
|
||||
|
||||
const tx = db.transaction(() => {
|
||||
@@ -103,6 +102,14 @@ export function migrateUsageJsonToSqlite() {
|
||||
entry.tokens?.cacheCreation ?? entry.tokens?.cache_creation_input_tokens ?? 0,
|
||||
tokensReasoning: entry.tokens?.reasoning ?? entry.tokens?.reasoning_tokens ?? 0,
|
||||
status: entry.status || null,
|
||||
success: entry.success === false ? 0 : 1,
|
||||
latencyMs: Number.isFinite(Number(entry.latencyMs)) ? Number(entry.latencyMs) : 0,
|
||||
ttftMs: Number.isFinite(Number(entry.timeToFirstTokenMs))
|
||||
? Number(entry.timeToFirstTokenMs)
|
||||
: Number.isFinite(Number(entry.latencyMs))
|
||||
? Number(entry.latencyMs)
|
||||
: 0,
|
||||
errorCode: entry.errorCode || null,
|
||||
timestamp: entry.timestamp || new Date().toISOString(),
|
||||
});
|
||||
}
|
||||
|
||||
@@ -107,6 +107,10 @@ export async function getUsageDb() {
|
||||
reasoning: toNumber(r.tokens_reasoning),
|
||||
},
|
||||
status: toStringOrNull(r.status),
|
||||
success: toNumber(r.success) === 1,
|
||||
latencyMs: toNumber(r.latency_ms),
|
||||
timeToFirstTokenMs: toNumber(r.ttft_ms),
|
||||
errorCode: toStringOrNull(r.error_code),
|
||||
timestamp: toStringOrNull(r.timestamp),
|
||||
};
|
||||
});
|
||||
@@ -130,8 +134,8 @@ export async function saveRequestUsage(entry: any) {
|
||||
`
|
||||
INSERT INTO usage_history (provider, model, connection_id, api_key_id, api_key_name,
|
||||
tokens_input, tokens_output, tokens_cache_read, tokens_cache_creation, tokens_reasoning,
|
||||
status, timestamp)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||
status, success, latency_ms, ttft_ms, error_code, timestamp)
|
||||
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
|
||||
`
|
||||
).run(
|
||||
entry.provider || null,
|
||||
@@ -145,6 +149,14 @@ export async function saveRequestUsage(entry: any) {
|
||||
entry.tokens?.cacheCreation ?? entry.tokens?.cache_creation_input_tokens ?? 0,
|
||||
entry.tokens?.reasoning ?? entry.tokens?.reasoning_tokens ?? 0,
|
||||
entry.status || null,
|
||||
entry.success === false ? 0 : 1,
|
||||
Number.isFinite(Number(entry.latencyMs)) ? Number(entry.latencyMs) : 0,
|
||||
Number.isFinite(Number(entry.timeToFirstTokenMs))
|
||||
? Number(entry.timeToFirstTokenMs)
|
||||
: Number.isFinite(Number(entry.latencyMs))
|
||||
? Number(entry.latencyMs)
|
||||
: 0,
|
||||
entry.errorCode || null,
|
||||
timestamp
|
||||
);
|
||||
} catch (error) {
|
||||
@@ -202,6 +214,10 @@ export async function getUsageHistory(filter: any = {}) {
|
||||
reasoning: toNumber(r.tokens_reasoning),
|
||||
},
|
||||
status: toStringOrNull(r.status),
|
||||
success: toNumber(r.success) === 1,
|
||||
latencyMs: toNumber(r.latency_ms),
|
||||
timeToFirstTokenMs: toNumber(r.ttft_ms),
|
||||
errorCode: toStringOrNull(r.error_code),
|
||||
timestamp: toStringOrNull(r.timestamp),
|
||||
};
|
||||
});
|
||||
|
||||
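The hunks above repeat the same normalization for the new `success`, `latency_ms`, `ttft_ms`, and `error_code` columns in both `migrateUsageJsonToSqlite` and `saveRequestUsage`. A minimal sketch of that logic as one helper follows; the names `finiteOr` and `toUsageMetricColumns` are hypothetical and do not appear in the diff.

```typescript
// Hypothetical helper; the diff inlines this logic at each call site.
interface RawUsageEntry {
  success?: boolean;
  latencyMs?: unknown;
  timeToFirstTokenMs?: unknown;
  errorCode?: string | null;
}

interface UsageMetricColumns {
  success: 0 | 1;
  latencyMs: number;
  ttftMs: number;
  errorCode: string | null;
}

// Coerce to a number; use the fallback when the result is NaN or infinite.
function finiteOr(value: unknown, fallback: number): number {
  const n = Number(value);
  return Number.isFinite(n) ? n : fallback;
}

function toUsageMetricColumns(entry: RawUsageEntry): UsageMetricColumns {
  const latencyMs = finiteOr(entry.latencyMs, 0);
  return {
    // Only an explicit `false` counts as a failure, matching the diff.
    success: entry.success === false ? 0 : 1,
    latencyMs,
    // TTFT falls back to total latency, and then to 0.
    ttftMs: finiteOr(entry.timeToFirstTokenMs, latencyMs),
    errorCode: entry.errorCode || null,
  };
}
```

One consequence of this shape, shared with the inlined version: an entry with no `success` field is stored as successful, so legacy rows migrated from `usage.json` all read back with `success: true`.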
@@ -0,0 +1,54 @@
|
||||
/**
|
||||
* Kiro IDE MITM Configuration (#336)
|
||||
*
|
||||
* Kiro IDE removed the Base URL / API Key configuration UI.
|
||||
* To route Kiro's traffic through OmniRoute, we intercept it using MITM,
|
||||
* similar to the existing Antigravity/Claude Code implementation.
|
||||
*
|
||||
* Kiro IDE uses the Anthropic API at https://api.anthropic.com:
|
||||
* - Main endpoint: POST /v1/messages
|
||||
* - Auth header: x-api-key: <key>
|
||||
* - User-Agent contains: "kiro" or "Kiro"
|
||||
*
|
||||
* To use: Install OmniRoute's MITM certificate, then run:
|
||||
* omniroute mitm start --targets kiro
|
||||
*
|
||||
* The MITM server intercepts requests to api.anthropic.com and forwards
|
||||
* them to the OmniRoute proxy (localhost:20128) instead.
|
||||
*/
|
||||
|
||||
export interface MitmTarget {
|
||||
id: string;
|
||||
name: string;
|
||||
description: string;
|
||||
targetHost: string;
|
||||
targetPort: number;
|
||||
localPort: number;
|
||||
userAgentPattern: string | null;
|
||||
apiEndpoints: string[];
|
||||
authHeader: string;
|
||||
instructions: string[];
|
||||
referenceIde?: string;
|
||||
}
|
||||
|
||||
/** Kiro IDE MITM profile */
|
||||
export const KIRO_MITM_PROFILE: MitmTarget = {
|
||||
id: "kiro",
|
||||
name: "Kiro IDE",
|
||||
description:
|
||||
"Intercepts Kiro IDE requests to api.anthropic.com and routes them through OmniRoute.",
|
||||
targetHost: "api.anthropic.com",
|
||||
targetPort: 443,
|
||||
localPort: 20130,
|
||||
userAgentPattern: null, // Kiro does not expose a stable User-Agent
|
||||
apiEndpoints: ["/v1/messages"],
|
||||
authHeader: "x-api-key",
|
||||
instructions: [
|
||||
"1. Install OmniRoute's root certificate: run `omniroute cert install` or go to Settings → MITM Certificates",
|
||||
"2. Start the MITM proxy: `omniroute mitm start --target kiro`",
|
||||
"3. Set your system HTTP proxy to 127.0.0.1:20130 (or use transparent MITM via DNS override)",
|
||||
"4. Open Kiro IDE — API calls will be automatically routed through OmniRoute.",
|
||||
"5. Verify: check the Proxy Logs in OmniRoute dashboard and look for provider=anthropic source=mitm",
|
||||
],
|
||||
referenceIde: "antigravity", // Same MITM infrastructure as Antigravity
|
||||
};
|
||||
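The profile above only declares data; the diff does not show how the MITM server consumes it. As a rough sketch, a matcher over those fields might look like the following. The `InterceptedRequest` shape, the `matchesTarget` function, and the choice to treat `userAgentPattern` as a case-insensitive regular expression are all assumptions made for illustration.

```typescript
// Subset of the MitmTarget fields from the diff that matter for matching.
interface MitmTargetLike {
  targetHost: string;
  targetPort: number;
  userAgentPattern: string | null;
  apiEndpoints: string[];
}

// Hypothetical description of an intercepted request.
interface InterceptedRequest {
  host: string;
  port: number;
  path: string;
  userAgent?: string;
}

function matchesTarget(target: MitmTargetLike, req: InterceptedRequest): boolean {
  if (req.host.toLowerCase() !== target.targetHost.toLowerCase()) return false;
  if (req.port !== target.targetPort) return false;

  // Compare the path without its query string against the declared endpoints.
  const pathname = req.path.split("?")[0];
  const endpointHit = target.apiEndpoints.some(
    (ep) => pathname === ep || pathname.startsWith(ep + "/")
  );
  if (!endpointHit) return false;

  // A null pattern means "do not filter on User-Agent", as in the Kiro profile.
  if (target.userAgentPattern === null) return true;
  return new RegExp(target.userAgentPattern, "i").test(req.userAgent ?? "");
}
```

With `userAgentPattern: null`, every `POST /v1/messages` call to `api.anthropic.com:443` would match, including traffic from tools other than Kiro, which is worth keeping in mind when this profile is enabled.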
@@ -33,7 +33,24 @@ const LEVEL_LABELS = {
|
||||
* @param {string} [props.levelLabel] — display name for the level
|
||||
* @param {Function} [props.onSaved] — callback after save
|
||||
*/
|
||||
export default function ProxyConfigModal({ isOpen, onClose, level, levelId, levelLabel, onSaved }: { isOpen: any; onClose: any; level: any; levelId?: any; levelLabel?: any; onSaved?: any }) {
|
||||
export default function ProxyConfigModal({
|
||||
isOpen,
|
||||
onClose,
|
||||
level,
|
||||
levelId,
|
||||
levelLabel,
|
||||
onSaved,
|
||||
}: {
|
||||
isOpen: any;
|
||||
onClose: any;
|
||||
level: any;
|
||||
levelId?: any;
|
||||
levelLabel?: any;
|
||||
onSaved?: any;
|
||||
}) {
|
||||
const [mode, setMode] = useState("saved");
|
||||
const [savedProxies, setSavedProxies] = useState([]);
|
||||
const [selectedProxyId, setSelectedProxyId] = useState("");
|
||||
const [proxyType, setProxyType] = useState(PROXY_TYPES[0]?.value || "http");
|
||||
const [host, setHost] = useState("");
|
||||
const [port, setPort] = useState("");
|
||||
@@ -63,6 +80,36 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
|
||||
const loadProxy = async () => {
|
||||
try {
|
||||
let hasSavedAssignment = false;
|
||||
const registryRes = await fetch("/api/settings/proxies");
|
||||
if (registryRes.ok) {
|
||||
const registryPayload = await registryRes.json();
|
||||
setSavedProxies(Array.isArray(registryPayload?.items) ? registryPayload.items : []);
|
||||
} else {
|
||||
setSavedProxies([]);
|
||||
}
|
||||
|
||||
const scope = level === "key" ? "account" : level;
|
||||
const assignmentParams = new URLSearchParams({ scope });
|
||||
if (level !== "global" && levelId) {
|
||||
assignmentParams.set("scopeId", levelId);
|
||||
}
|
||||
const assignmentRes = await fetch(`/api/settings/proxies/assignments?${assignmentParams}`);
|
||||
if (assignmentRes.ok) {
|
||||
const assignmentPayload = await assignmentRes.json();
|
||||
const items = Array.isArray(assignmentPayload?.items) ? assignmentPayload.items : [];
|
||||
const target = items[0];
|
||||
if (target?.proxyId) {
|
||||
setMode("saved");
|
||||
setSelectedProxyId(target.proxyId);
|
||||
setHasOwnProxy(true);
|
||||
hasSavedAssignment = true;
|
||||
} else {
|
||||
setMode("custom");
|
||||
setSelectedProxyId("");
|
||||
}
|
||||
}
|
||||
|
||||
// Load own proxy
|
||||
const params = new URLSearchParams({ level });
|
||||
if (levelId) params.set("id", levelId);
|
||||
@@ -85,9 +132,12 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
"SOCKS5 is configured but hidden because NEXT_PUBLIC_ENABLE_SOCKS5_PROXY=false."
|
||||
);
|
||||
}
|
||||
if (!hasSavedAssignment) setMode("custom");
|
||||
} else {
|
||||
resetFields();
|
||||
setHasOwnProxy(false);
|
||||
if (!hasSavedAssignment) {
|
||||
setHasOwnProxy(false);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -130,28 +180,70 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
};
|
||||
|
||||
const handleSave = async () => {
|
||||
if (!host.trim()) return;
|
||||
if (mode === "saved" && !selectedProxyId) {
|
||||
setFormError("Select a saved proxy before saving.");
|
||||
return;
|
||||
}
|
||||
if (mode === "custom" && !host.trim()) return;
|
||||
setFormError(null);
|
||||
setSaving(true);
|
||||
try {
|
||||
const proxy = {
|
||||
type: proxyType,
|
||||
host: host.trim(),
|
||||
port: port.trim() || getDefaultPort(proxyType),
|
||||
username: username.trim(),
|
||||
password: password.trim(),
|
||||
};
|
||||
const res = await fetch("/api/settings/proxy", {
|
||||
method: "PUT",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({ level, id: levelId, proxy }),
|
||||
});
|
||||
const scope = level === "key" ? "account" : level;
|
||||
let res;
|
||||
if (mode === "saved") {
|
||||
res = await fetch("/api/settings/proxies/assignments", {
|
||||
method: "PUT",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({
|
||||
scope,
|
||||
scopeId: level === "global" ? null : levelId,
|
||||
proxyId: selectedProxyId,
|
||||
}),
|
||||
});
|
||||
|
||||
if (res.ok) {
|
||||
const clearParams = new URLSearchParams({ level });
|
||||
if (levelId) clearParams.set("id", levelId);
|
||||
await fetch(`/api/settings/proxy?${clearParams.toString()}`, { method: "DELETE" });
|
||||
}
|
||||
} else {
|
||||
const clearAssignmentRes = await fetch("/api/settings/proxies/assignments", {
|
||||
method: "PUT",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({
|
||||
scope,
|
||||
scopeId: level === "global" ? null : levelId,
|
||||
proxyId: null,
|
||||
}),
|
||||
});
|
||||
const clearAssignmentPayload = await clearAssignmentRes.json().catch(() => ({}));
|
||||
if (!clearAssignmentRes.ok) {
|
||||
setFormError(clearAssignmentPayload?.error?.message || "Failed to clear saved proxy");
|
||||
return;
|
||||
}
|
||||
|
||||
const proxy = {
|
||||
type: proxyType,
|
||||
host: host.trim(),
|
||||
port: port.trim() || getDefaultPort(proxyType),
|
||||
username: username.trim(),
|
||||
password: password.trim(),
|
||||
};
|
||||
res = await fetch("/api/settings/proxy", {
|
||||
method: "PUT",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({ level, id: levelId, proxy }),
|
||||
});
|
||||
}
|
||||
const payload = await res.json().catch(() => ({}));
|
||||
if (!res.ok) {
|
||||
setFormError(payload?.error?.message || "Failed to save proxy configuration");
|
||||
return;
|
||||
}
|
||||
setHasOwnProxy(true);
|
||||
if (mode === "custom") {
|
||||
setSelectedProxyId("");
|
||||
}
|
||||
onSaved?.();
|
||||
} catch (error) {
|
||||
console.error("Error saving proxy:", error);
|
||||
@@ -165,6 +257,17 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
setFormError(null);
|
||||
setSaving(true);
|
||||
try {
|
||||
const scope = level === "key" ? "account" : level;
|
||||
await fetch("/api/settings/proxies/assignments", {
|
||||
method: "PUT",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({
|
||||
scope,
|
||||
scopeId: level === "global" ? null : levelId,
|
||||
proxyId: null,
|
||||
}),
|
||||
});
|
||||
|
||||
const params = new URLSearchParams({ level });
|
||||
if (levelId) params.set("id", levelId);
|
||||
const res = await fetch(`/api/settings/proxy?${params}`, { method: "DELETE" });
|
||||
@@ -175,6 +278,7 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
}
|
||||
resetFields();
|
||||
setHasOwnProxy(false);
|
||||
setSelectedProxyId("");
|
||||
setTestResult(null);
|
||||
onSaved?.();
|
||||
} catch (error) {
|
||||
@@ -186,6 +290,10 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
};
|
||||
|
||||
const handleTest = async () => {
|
||||
if (mode === "saved") {
|
||||
setFormError("Use custom mode to run manual connection test.");
|
||||
return;
|
||||
}
|
||||
if (!host.trim()) return;
|
||||
setFormError(null);
|
||||
setTesting(true);
|
||||
@@ -248,93 +356,145 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
{/* Proxy Type Selector */}
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1.5 block uppercase tracking-wider font-medium">
|
||||
Proxy Type
|
||||
Source
|
||||
</label>
|
||||
<div className="flex gap-1 bg-bg-subtle rounded-lg p-1 border border-border">
|
||||
{PROXY_TYPES.map((t) => (
|
||||
<button
|
||||
key={t.value}
|
||||
onClick={() => setProxyType(t.value)}
|
||||
className={`flex-1 px-4 py-2 rounded-md text-sm font-medium transition-all ${
|
||||
proxyType === t.value
|
||||
? "bg-primary text-white shadow-sm"
|
||||
: "text-text-muted hover:text-text-primary hover:bg-black/5 dark:hover:bg-white/5"
|
||||
}`}
|
||||
>
|
||||
{t.label}
|
||||
</button>
|
||||
))}
|
||||
<div className="flex gap-2">
|
||||
<button
|
||||
onClick={() => setMode("saved")}
|
||||
className={`px-3 py-2 rounded text-sm border transition-colors ${
|
||||
mode === "saved"
|
||||
? "bg-primary text-white border-primary"
|
||||
: "bg-bg-subtle text-text-muted border-border"
|
||||
}`}
|
||||
>
|
||||
Saved Proxy
|
||||
</button>
|
||||
<button
|
||||
onClick={() => setMode("custom")}
|
||||
className={`px-3 py-2 rounded text-sm border transition-colors ${
|
||||
mode === "custom"
|
||||
? "bg-primary text-white border-primary"
|
||||
: "bg-bg-subtle text-text-muted border-border"
|
||||
}`}
|
||||
>
|
||||
Custom
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Host + Port */}
|
||||
<div className="grid grid-cols-3 gap-3">
|
||||
<div className="col-span-2">
|
||||
<label className="text-xs text-text-muted mb-1.5 block uppercase tracking-wider font-medium">
|
||||
Host
|
||||
</label>
|
||||
<input
|
||||
type="text"
|
||||
value={host}
|
||||
onChange={(e) => setHost(e.target.value)}
|
||||
placeholder="1.2.3.4 or proxy.example.com"
|
||||
className="w-full px-3 py-2.5 rounded-lg bg-bg-subtle border border-border text-sm text-text-primary placeholder:text-text-muted/50 focus:outline-none focus:border-primary transition-colors"
|
||||
/>
|
||||
</div>
|
||||
{mode === "saved" && (
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1.5 block uppercase tracking-wider font-medium">
|
||||
Port
|
||||
Saved Proxy
|
||||
</label>
|
||||
<input
|
||||
type="text"
|
||||
value={port}
|
||||
onChange={(e) => setPort(e.target.value)}
|
||||
placeholder={getDefaultPort(proxyType)}
|
||||
className="w-full px-3 py-2.5 rounded-lg bg-bg-subtle border border-border text-sm text-text-primary placeholder:text-text-muted/50 focus:outline-none focus:border-primary transition-colors"
|
||||
/>
|
||||
<select
|
||||
value={selectedProxyId}
|
||||
onChange={(e) => setSelectedProxyId(e.target.value)}
|
||||
className="w-full px-3 py-2.5 rounded-lg bg-bg-subtle border border-border text-sm text-text-primary"
|
||||
>
|
||||
<option value="">Select saved proxy...</option>
|
||||
{savedProxies.map((item: any) => (
|
||||
<option key={item.id} value={item.id}>
|
||||
{item.name} ({item.type}://{item.host}:{item.port})
|
||||
</option>
|
||||
))}
|
||||
</select>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Auth Toggle */}
|
||||
<div>
|
||||
<button
|
||||
onClick={() => setShowAuth(!showAuth)}
|
||||
className="flex items-center gap-2 text-sm text-text-muted hover:text-text-primary transition-colors"
|
||||
>
|
||||
<span className="material-symbols-outlined text-base">
|
||||
{showAuth ? "expand_less" : "expand_more"}
|
||||
</span>
|
||||
Authentication (optional)
|
||||
</button>
|
||||
{showAuth && (
|
||||
<div className="grid grid-cols-2 gap-3 mt-3">
|
||||
<div>
|
||||
{mode === "custom" && (
|
||||
<>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1.5 block uppercase tracking-wider font-medium">
|
||||
Proxy Type
|
||||
</label>
|
||||
<div className="flex gap-1 bg-bg-subtle rounded-lg p-1 border border-border">
|
||||
{PROXY_TYPES.map((t) => (
|
||||
<button
|
||||
key={t.value}
|
||||
onClick={() => setProxyType(t.value)}
|
||||
className={`flex-1 px-4 py-2 rounded-md text-sm font-medium transition-all ${
|
||||
proxyType === t.value
|
||||
? "bg-primary text-white shadow-sm"
|
||||
: "text-text-muted hover:text-text-primary hover:bg-black/5 dark:hover:bg-white/5"
|
||||
}`}
|
||||
>
|
||||
{t.label}
|
||||
</button>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{/* Host + Port */}
|
||||
<div className="grid grid-cols-3 gap-3">
|
||||
<div className="col-span-2">
|
||||
<label className="text-xs text-text-muted mb-1.5 block uppercase tracking-wider font-medium">
|
||||
Username
|
||||
Host
|
||||
</label>
|
||||
<input
|
||||
type="text"
|
||||
value={username}
|
||||
onChange={(e) => setUsername(e.target.value)}
|
||||
placeholder="Username"
|
||||
value={host}
|
||||
onChange={(e) => setHost(e.target.value)}
|
||||
placeholder="1.2.3.4 or proxy.example.com"
|
||||
className="w-full px-3 py-2.5 rounded-lg bg-bg-subtle border border-border text-sm text-text-primary placeholder:text-text-muted/50 focus:outline-none focus:border-primary transition-colors"
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1.5 block uppercase tracking-wider font-medium">
|
||||
Password
|
||||
Port
|
||||
</label>
|
||||
<input
|
||||
type="password"
|
||||
value={password}
|
||||
onChange={(e) => setPassword(e.target.value)}
|
||||
placeholder="Password"
|
||||
type="text"
|
||||
value={port}
|
||||
onChange={(e) => setPort(e.target.value)}
|
||||
placeholder={getDefaultPort(proxyType)}
|
||||
className="w-full px-3 py-2.5 rounded-lg bg-bg-subtle border border-border text-sm text-text-primary placeholder:text-text-muted/50 focus:outline-none focus:border-primary transition-colors"
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Auth Toggle */}
|
||||
<div>
|
||||
<button
|
||||
onClick={() => setShowAuth(!showAuth)}
|
||||
className="flex items-center gap-2 text-sm text-text-muted hover:text-text-primary transition-colors"
|
||||
>
|
||||
<span className="material-symbols-outlined text-base">
|
||||
{showAuth ? "expand_less" : "expand_more"}
|
||||
</span>
|
||||
Authentication (optional)
|
||||
</button>
|
||||
{showAuth && (
|
||||
<div className="grid grid-cols-2 gap-3 mt-3">
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1.5 block uppercase tracking-wider font-medium">
|
||||
Username
|
||||
</label>
|
||||
<input
|
||||
type="text"
|
||||
value={username}
|
||||
onChange={(e) => setUsername(e.target.value)}
|
||||
placeholder="Username"
|
||||
className="w-full px-3 py-2.5 rounded-lg bg-bg-subtle border border-border text-sm text-text-primary placeholder:text-text-muted/50 focus:outline-none focus:border-primary transition-colors"
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<label className="text-xs text-text-muted mb-1.5 block uppercase tracking-wider font-medium">
|
||||
Password
|
||||
</label>
|
||||
<input
|
||||
type="password"
|
||||
value={password}
|
||||
onChange={(e) => setPassword(e.target.value)}
|
||||
placeholder="Password"
|
||||
className="w-full px-3 py-2.5 rounded-lg bg-bg-subtle border border-border text-sm text-text-primary placeholder:text-text-muted/50 focus:outline-none focus:border-primary transition-colors"
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
</>
|
||||
)}
|
||||
|
||||
{/* Test Result */}
|
||||
{formError && (
|
||||
@@ -390,7 +550,7 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
icon="speed"
|
||||
onClick={handleTest}
|
||||
loading={testing}
|
||||
disabled={!host.trim()}
|
||||
disabled={mode !== "custom" || !host.trim()}
|
||||
>
|
||||
Test Connection
|
||||
</Button>
|
||||
@@ -416,7 +576,7 @@ export default function ProxyConfigModal({ isOpen, onClose, level, levelId, leve
|
||||
icon="save"
|
||||
onClick={handleSave}
|
||||
loading={saving}
|
||||
disabled={!host.trim()}
|
||||
disabled={mode === "saved" ? !selectedProxyId : !host.trim()}
|
||||
>
|
||||
Save
|
||||
</Button>
|
||||
|
||||
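`ProxyConfigModal` now builds the same `/api/settings/proxies/assignments` request body in three places (save in saved mode, save in custom mode, and clear). A minimal sketch of a shared builder follows. `buildAssignmentPayload` is a hypothetical name, and normalizing an undefined `levelId` to `null` is an assumption; the diff passes `levelId` through unchanged.

```typescript
interface AssignmentPayload {
  scope: string;
  scopeId: string | null;
  proxyId: string | null;
}

// Hypothetical helper mirroring the inline logic in the diff.
function buildAssignmentPayload(
  level: string,
  levelId: string | undefined,
  proxyId: string | null
): AssignmentPayload {
  // The "key" level is stored under the "account" scope, as in the diff.
  const scope = level === "key" ? "account" : level;
  return {
    scope,
    // Global assignments carry no scope id.
    scopeId: level === "global" ? null : (levelId ?? null),
    // A null proxyId clears the saved-proxy assignment.
    proxyId,
  };
}

// Usage: the JSON body for assigning, then clearing, a saved proxy.
const assignBody = JSON.stringify(buildAssignmentPayload("key", "conn-1", "proxy-42"));
const clearBody = JSON.stringify(buildAssignmentPayload("key", "conn-1", null));
```

Centralizing the payload would keep the `key` to `account` mapping in one place if more levels are added later.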
+268
-12
@@ -115,6 +115,13 @@ export const DEFAULT_PRICING = {
|
||||
reasoning: 18.0,
|
||||
cache_creation: 2.0,
|
||||
},
|
||||
"gemini-3.1-pro-preview": {
|
||||
input: 2.0,
|
||||
output: 12.0,
|
||||
cached: 0.25,
|
||||
reasoning: 18.0,
|
||||
cache_creation: 2.0,
|
||||
},
|
||||
"gemini-2.5-pro": {
|
||||
input: 2.0,
|
||||
output: 12.0,
|
||||
@@ -129,12 +136,13 @@ export const DEFAULT_PRICING = {
|
||||
reasoning: 3.75,
|
||||
cache_creation: 0.3,
|
||||
},
|
||||
// Gemini 2.5 Flash Lite: price corrected via ClawRouter to $0.10/$0.40 (was $0.15/$1.25)
|
||||
"gemini-2.5-flash-lite": {
|
||||
input: 0.15,
|
||||
output: 1.25,
|
||||
cached: 0.015,
|
||||
reasoning: 1.875,
|
||||
cache_creation: 0.15,
|
||||
input: 0.1,
|
||||
output: 0.4,
|
||||
cached: 0.025,
|
||||
reasoning: 0.6,
|
||||
cache_creation: 0.1,
|
||||
},
|
||||
},
|
||||
|
||||
@@ -208,6 +216,13 @@ export const DEFAULT_PRICING = {
|
||||
reasoning: 3.0,
|
||||
cache_creation: 0.5,
|
||||
},
|
||||
"deepseek-v3.2": {
|
||||
input: 0.5,
|
||||
output: 2.0,
|
||||
cached: 0.25,
|
||||
reasoning: 3.0,
|
||||
cache_creation: 0.5,
|
||||
},
|
||||
"deepseek-v3.2-reasoner": {
|
||||
input: 0.75,
|
||||
output: 3.0,
|
||||
@@ -451,10 +466,71 @@ export const DEFAULT_PRICING = {
|
||||
reasoning: 15.0,
|
||||
cache_creation: 3.0,
|
||||
},
|
||||
// Claude 4.5 Haiku: Anthropic's most recent economy-tier model (2025-10)
|
||||
"claude-haiku-4-5-20251001": {
|
||||
input: 1.0,
|
||||
output: 5.0,
|
||||
cached: 0.5,
|
||||
reasoning: 7.5,
|
||||
cache_creation: 1.0,
|
||||
},
|
||||
"claude-haiku-4.5": {
|
||||
input: 1.0,
|
||||
output: 5.0,
|
||||
cached: 0.5,
|
||||
reasoning: 7.5,
|
||||
cache_creation: 1.0,
|
||||
},
|
||||
// Claude Sonnet 4.6 — maxOutput 64k tokens, $3/$15/M
|
||||
"claude-sonnet-4-6-20251031": {
|
||||
input: 3.0,
|
||||
output: 15.0,
|
||||
cached: 1.5,
|
||||
reasoning: 22.5,
|
||||
cache_creation: 3.0,
|
||||
},
|
||||
"claude-sonnet-4.6": {
|
||||
input: 3.0,
|
||||
output: 15.0,
|
||||
cached: 1.5,
|
||||
reasoning: 22.5,
|
||||
cache_creation: 3.0,
|
||||
},
|
||||
// Claude Opus 4.6: cheaper than Opus 4 ($5/$25 vs $15/$75)
|
||||
"claude-opus-4-6-20251031": {
|
||||
input: 5.0,
|
||||
output: 25.0,
|
||||
cached: 2.5,
|
||||
reasoning: 37.5,
|
||||
cache_creation: 5.0,
|
||||
},
|
||||
"claude-opus-4.6": {
|
||||
input: 5.0,
|
||||
output: 25.0,
|
||||
cached: 2.5,
|
||||
reasoning: 37.5,
|
||||
cache_creation: 5.0,
|
||||
},
|
||||
},
|
||||
|
||||
// Gemini
|
||||
gemini: {
|
||||
// Gemini 3.1 Pro: new Google flagship (2026-03-17)
|
||||
// Context: 1,050,000 tokens | Max Output: 65,536
|
||||
"gemini-3.1-pro": {
|
||||
input: 2.0,
|
||||
output: 12.0,
|
||||
cached: 0.25,
|
||||
reasoning: 18.0,
|
||||
cache_creation: 2.0,
|
||||
},
|
||||
"gemini-3-1-pro": {
|
||||
input: 2.0,
|
||||
output: 12.0,
|
||||
cached: 0.25,
|
||||
reasoning: 18.0,
|
||||
cache_creation: 2.0,
|
||||
},
|
||||
"gemini-3-pro-preview": {
|
||||
input: 2.0,
|
||||
output: 12.0,
|
||||
@@ -462,6 +538,13 @@ export const DEFAULT_PRICING = {
|
||||
reasoning: 18.0,
|
||||
cache_creation: 2.0,
|
||||
},
|
||||
"gemini-3.1-pro-preview": {
|
||||
input: 2.0,
|
||||
output: 12.0,
|
||||
cached: 0.25,
|
||||
reasoning: 18.0,
|
||||
cache_creation: 2.0,
|
||||
},
|
||||
"gemini-2.5-pro": {
|
||||
input: 2.0,
|
||||
output: 12.0,
|
||||
@@ -476,12 +559,53 @@ export const DEFAULT_PRICING = {
|
||||
reasoning: 3.75,
|
||||
cache_creation: 0.3,
|
||||
},
|
||||
// Gemini 2.5 Flash Lite: price corrected to $0.10/$0.40 (ClawRouter)
|
||||
"gemini-2.5-flash-lite": {
|
||||
input: 0.15,
|
||||
output: 1.25,
|
||||
cached: 0.015,
|
||||
reasoning: 1.875,
|
||||
cache_creation: 0.15,
|
||||
input: 0.1,
|
||||
output: 0.4,
|
||||
cached: 0.025,
|
||||
reasoning: 0.6,
|
||||
cache_creation: 0.1,
|
||||
},
|
||||
},
|
||||
|
||||
// DeepSeek: native API (V3.2 Chat), separate from the free providers
|
||||
// Price: $0.28/$0.42 per million tokens (verified via ClawRouter 2026-03-17)
|
||||
deepseek: {
|
||||
"deepseek-chat": {
|
||||
input: 0.28,
|
||||
output: 0.42,
|
||||
cached: 0.014,
|
||||
reasoning: 0.42,
|
||||
cache_creation: 0.28,
|
||||
},
|
||||
"deepseek-v3": {
|
||||
input: 0.28,
|
||||
output: 0.42,
|
||||
cached: 0.014,
|
||||
reasoning: 0.42,
|
||||
cache_creation: 0.28,
|
||||
},
|
||||
"deepseek-v3.2": {
|
||||
input: 0.28,
|
||||
output: 0.42,
|
||||
cached: 0.014,
|
||||
reasoning: 0.42,
|
||||
cache_creation: 0.28,
|
||||
},
|
||||
"deepseek-reasoner": {
|
||||
input: 0.55,
|
||||
output: 2.19,
|
||||
cached: 0.14,
|
||||
reasoning: 2.19,
|
||||
cache_creation: 0.55,
|
||||
},
|
||||
"deepseek-r1": {
|
||||
input: 0.55,
|
||||
output: 2.19,
|
||||
cached: 0.14,
|
||||
reasoning: 2.19,
|
||||
cache_creation: 0.55,
|
||||
},
|
||||
},
|
||||
|
||||
@@ -498,6 +622,20 @@ export const DEFAULT_PRICING = {
|
||||
|
||||
// GLM
|
||||
glm: {
|
||||
"glm-5": {
|
||||
input: 1.0,
|
||||
output: 3.2,
|
||||
cached: 0.5,
|
||||
reasoning: 4.8,
|
||||
cache_creation: 1.0,
|
||||
},
|
||||
"glm-5-turbo": {
|
||||
input: 1.2,
|
||||
output: 4.0,
|
||||
cached: 0.6,
|
||||
reasoning: 6.0,
|
||||
cache_creation: 1.2,
|
||||
},
|
||||
"glm-4.7": {
|
||||
input: 0.75,
|
||||
output: 3.0,
|
||||
@@ -521,7 +659,7 @@ export const DEFAULT_PRICING = {
|
||||
},
|
||||
},
|
||||
|
||||
// Kimi
|
||||
// Kimi (Moonshot)
|
||||
kimi: {
|
||||
"kimi-latest": {
|
||||
input: 1.0,
|
||||
@@ -530,10 +668,33 @@ export const DEFAULT_PRICING = {
|
||||
reasoning: 6.0,
|
||||
cache_creation: 1.0,
|
||||
},
|
||||
// Kimi K2.5: direct access via the Moonshot API
|
||||
// Context: 262,144 tokens | Capabilities: reasoning, vision, agentic, tools
|
||||
"kimi-k2.5": {
|
||||
input: 0.6,
|
||||
output: 3.0,
|
||||
cached: 0.3,
|
||||
reasoning: 4.5,
|
||||
cache_creation: 0.6,
|
||||
},
|
||||
"moonshot-kimi-k2.5": {
|
||||
input: 0.6,
|
||||
output: 3.0,
|
||||
cached: 0.3,
|
||||
reasoning: 4.5,
|
||||
cache_creation: 0.6,
|
||||
},
|
||||
},
|
||||
|
||||
// MiniMax
|
||||
minimax: {
|
||||
"minimax-m2.1": {
|
||||
input: 0.5,
|
||||
output: 2.0,
|
||||
cached: 0.25,
|
||||
reasoning: 3.0,
|
||||
cache_creation: 0.5,
|
||||
},
|
||||
"MiniMax-M2.1": {
|
||||
input: 0.5,
|
||||
output: 2.0,
|
||||
@@ -541,6 +702,22 @@ export const DEFAULT_PRICING = {
|
||||
reasoning: 3.0,
|
||||
cache_creation: 0.5,
|
||||
},
|
||||
// MiniMax M2.5: cheaper than M2.1, reasoning + tools
|
||||
// Context: 204,800 tokens | Max Output: 16,384 tokens
|
||||
"minimax-m2.5": {
|
||||
input: 0.3,
|
||||
output: 1.2,
|
||||
cached: 0.15,
|
||||
reasoning: 1.8,
|
||||
cache_creation: 0.3,
|
||||
},
|
||||
"MiniMax-M2.5": {
|
||||
input: 0.3,
|
||||
output: 1.2,
|
||||
cached: 0.15,
|
||||
reasoning: 1.8,
|
||||
cache_creation: 0.3,
|
||||
},
|
||||
},
|
||||
|
||||
// ─── Free-tier API Key Providers (nominal $0 pricing) ───
|
||||
@@ -627,6 +804,7 @@ export const DEFAULT_PRICING = {
|
||||
|
||||
// Nvidia
|
||||
nvidia: {
|
||||
"nvidia/gpt-oss-120b": { input: 0, output: 0, cached: 0, reasoning: 0, cache_creation: 0 },
|
||||
"openai/gpt-oss-120b": { input: 0, output: 0, cached: 0, reasoning: 0, cache_creation: 0 },
|
||||
"gpt-oss-120b": { input: 0, output: 0, cached: 0, reasoning: 0, cache_creation: 0 },
|
||||
"moonshotai/kimi-k2.5": { input: 0, output: 0, cached: 0, reasoning: 0, cache_creation: 0 },
|
||||
@@ -757,7 +935,85 @@ export const DEFAULT_PRICING = {
|
||||
},
|
||||
},
|
||||
|
||||
// Kiro (AWS)
|
||||
// ─────────────────────────────────────────────────────────────────────
|
||||
// xAI (Grok) — Grok-3 + Grok-4 Family
|
||||
// Source: ClawRouter benchmarks 2026-03-17
|
||||
// Grok-4-fast-non-reasoning: 1143ms P50 (fastest in the benchmark)
|
||||
// ─────────────────────────────────────────────────────────────────────
|
||||
xai: {
|
||||
"grok-3": {
|
||||
input: 3.0,
|
||||
output: 15.0,
|
||||
cached: 1.5,
|
||||
reasoning: 22.5,
|
||||
cache_creation: 3.0,
|
||||
},
|
||||
"grok-3-mini": {
|
||||
input: 0.3,
|
||||
output: 0.5,
|
||||
cached: 0.15,
|
||||
reasoning: 0.75,
|
||||
cache_creation: 0.3,
|
||||
},
|
||||
// Grok-4 Fast Family: ultra-cheap ($0.20/$0.50/M)
|
||||
"grok-4-fast-non-reasoning": {
|
||||
input: 0.2,
|
||||
output: 0.5,
|
||||
cached: 0.1,
|
||||
reasoning: 0.0,
|
||||
cache_creation: 0.2,
|
||||
},
|
||||
"grok-4-fast-reasoning": {
|
||||
input: 0.2,
|
||||
output: 0.5,
|
||||
cached: 0.1,
|
||||
reasoning: 0.75,
|
||||
cache_creation: 0.2,
|
||||
},
|
||||
"grok-4-1-fast-non-reasoning": {
|
||||
input: 0.2,
|
||||
output: 0.5,
|
||||
cached: 0.1,
|
||||
reasoning: 0.0,
|
||||
cache_creation: 0.2,
|
||||
},
|
||||
"grok-4-1-fast-reasoning": {
|
||||
input: 0.2,
|
||||
output: 0.5,
|
||||
cached: 0.1,
|
||||
reasoning: 0.75,
|
||||
cache_creation: 0.2,
|
||||
},
|
||||
"grok-4-0709": {
|
||||
input: 0.2,
|
||||
output: 1.5,
|
||||
cached: 0.1,
|
||||
reasoning: 2.25,
|
||||
cache_creation: 0.2,
|
||||
},
|
||||
},
|
||||
|
||||
// ─────────────────────────────────────────────────────────────────────
|
||||
// Z.AI / ZhipuAI — GLM-5 Family
|
||||
// Added via ClawRouter 2026-03-17 | maxOutput: 128k tokens!
|
||||
// ─────────────────────────────────────────────────────────────────────
|
||||
zai: {
|
||||
"glm-5": {
|
||||
input: 1.0,
|
||||
output: 3.2,
|
||||
cached: 0.5,
|
||||
reasoning: 4.8,
|
||||
cache_creation: 1.0,
|
||||
},
|
||||
"glm-5-turbo": {
|
||||
input: 1.2,
|
||||
output: 4.0,
|
||||
cached: 0.6,
|
||||
reasoning: 6.0,
|
||||
cache_creation: 1.2,
|
||||
},
|
||||
},
|
||||
|
||||
kiro: {
|
||||
"claude-sonnet-4.5": {
|
||||
input: 3.0,
|
||||
|
||||
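The pricing entries above carry five rates per model. The inline comments (for example `$3/$15/M`) suggest these are USD per million tokens. Under that assumption, and assuming the token categories are counted separately (input does not already include cached reads), a request cost could be estimated as sketched below. `estimateCostUsd` and the `TokenUsage` shape are hypothetical; the diff does not show how OmniRoute actually computes cost.

```typescript
// Rate shape used by each DEFAULT_PRICING entry in the diff.
interface ModelPricing {
  input: number;
  output: number;
  cached: number;
  reasoning: number;
  cache_creation: number;
}

// Hypothetical per-request token counts.
interface TokenUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheCreation: number;
  reasoning: number;
}

const TOKENS_PER_PRICING_UNIT = 1_000_000; // assumption: rates are per million tokens

function estimateCostUsd(pricing: ModelPricing, tokens: TokenUsage): number {
  const total =
    tokens.input * pricing.input +
    tokens.output * pricing.output +
    tokens.cacheRead * pricing.cached +
    tokens.cacheCreation * pricing.cache_creation +
    tokens.reasoning * pricing.reasoning;
  return total / TOKENS_PER_PRICING_UNIT;
}

// Usage with the corrected gemini-2.5-flash-lite rates from the diff.
const flashLite: ModelPricing = {
  input: 0.1,
  output: 0.4,
  cached: 0.025,
  reasoning: 0.6,
  cache_creation: 0.1,
};
const exampleCost = estimateCostUsd(flashLite, {
  input: 1_000_000,
  output: 500_000,
  cacheRead: 0,
  cacheCreation: 0,
  reasoning: 0,
});
```

For one million input tokens and half a million output tokens, this works out to 0.1 + 0.2 = 0.3 USD under the stated assumptions.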
@@ -360,6 +360,26 @@ export const APIKEY_PROVIDERS = {
|
||||
hasFree: true,
|
||||
freeNote: "Free Inference API for thousands of models (Whisper, VITS, SDXL…)",
|
||||
},
|
||||
synthetic: {
|
||||
id: "synthetic",
|
||||
alias: "synthetic",
|
||||
name: "Synthetic",
|
||||
icon: "verified_user",
|
||||
color: "#6366F1",
|
||||
textIcon: "SY",
|
||||
website: "https://synthetic.new",
|
||||
passthroughModels: true,
|
||||
},
|
||||
"kilo-gateway": {
|
||||
id: "kilo-gateway",
|
||||
alias: "kg",
|
||||
name: "Kilo Gateway",
|
||||
icon: "hub",
|
||||
color: "#617A91",
|
||||
textIcon: "KG",
|
||||
website: "https://kilo.ai",
|
||||
passthroughModels: true,
|
||||
},
|
||||
vertex: {
|
||||
id: "vertex",
|
||||
alias: "vertex",
|
||||
@@ -370,6 +390,18 @@ export const APIKEY_PROVIDERS = {
|
||||
website: "https://cloud.google.com/vertex-ai",
|
||||
authHint: "Provide Service Account JSON or OAuth access_token",
|
||||
},
|
||||
// Z.AI (formerly ZhipuAI) — GLM-5 family with 128k output
|
||||
// Added 2026-03-17 based on ClawRouter feature analysis
|
||||
zai: {
|
||||
id: "zai",
|
||||
alias: "zai",
|
||||
name: "Z.AI (GLM-5)",
|
||||
icon: "psychology",
|
||||
color: "#2563EB",
|
||||
textIcon: "ZA",
|
||||
website: "https://open.bigmodel.cn",
|
||||
apiHint: "API key from https://open.bigmodel.cn/usercenter/apikeys",
|
||||
},
|
||||
};
|
||||
|
||||
export const OPENAI_COMPATIBLE_PREFIX = "openai-compatible-";
|
||||
|
||||
Some files were not shown because too many files have changed in this diff.