Compare commits

16 Commits

Author SHA1 Message Date
diegosouzapw a8ab16a720 chore(release): v2.6.5 — reasoning params filter, local 404 fix, Kilo Gateway, dep bumps
Build Electron Desktop App / Validate version (push) Failing after 24s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
- fix(sse): strip unsupported params for o1/o1-mini/o1-pro/o3/o3-mini (PR #412 @Regis-RCR)
- fix(sse): model-only lockout (5s) for local provider 404 (PR #410 @Regis-RCR)
- feat(api): Kilo Gateway provider — 335+ models, alias 'kg' (PR #408 @Regis-RCR)
- deps: better-sqlite3 12.8, undici 7.24.4, https-proxy-agent 8 (PR #413)
2026-03-17 03:05:45 -03:00
Diego Rodrigues de Sa e Souza a00ef0fc7e Merge pull request #413 from diegosouzapw/dependabot/npm_and_yarn/production-4d4ff746af
deps: bump the production group with 5 updates
2026-03-17 03:03:49 -03:00
Diego Rodrigues de Sa e Souza 5ce6d615a4 Merge pull request #408 from Regis-RCR/feat/kilo-gateway-provider
feat(api): add Kilo Gateway provider
2026-03-17 03:03:47 -03:00
Diego Rodrigues de Sa e Souza e06b69cdac Merge pull request #410 from Regis-RCR/fix/local-404-cascade
fix(sse): model-only lockout for local provider 404
2026-03-17 03:03:31 -03:00
Diego Rodrigues de Sa e Souza d261ae7883 Merge pull request #412 from Regis-RCR/fix/param-filter-reasoning
fix(sse): strip unsupported params for reasoning models (o1/o3)
2026-03-17 03:03:28 -03:00
diegosouzapw 6fa77a63d7 chore(release): v2.6.4 — model name fixes across providers 2026-03-17 01:59:25 -03:00
diegosouzapw f76c1b32d6 fix(providers): remove non-existent model names and fix incorrect model IDs
- gemini/gemini-cli: removed gemini-3.1-pro/flash/preview (don't exist in Google API v1beta),
  replaced with real models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-*
- antigravity: removed gemini-3.1-pro-high/low and gemini-3-flash (internal aliases invalid),
  replaced with gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash
- github: removed gemini-3-flash-preview and gemini-3-pro-preview, replaced with gemini-2.5-flash
- nvidia: corrected 'nvidia/llama-3.3-70b-instruct' to 'meta/llama-3.3-70b-instruct'
  (NVIDIA NIM uses meta/ namespace, not nvidia/ namespace for Meta models)
- nvidia: added meta/llama-3.1-70b-instruct and nvidia/llama-3.1-405b-instruct

Also fixed free-stack combo on .15 DB:
- removed qw/qwen3-coder-plus (qwen provider has expired refresh token)
- corrected nvidia/llama-3.3-70b-instruct → nvidia/meta/llama-3.3-70b-instruct
- corrected gemini/gemini-3.1-flash → gemini/gemini-2.5-flash
- added if/deepseek-v3.2 as replacement for qw/qwen3-coder-plus
2026-03-17 01:48:40 -03:00
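The combo correction above (`nvidia/llama-3.3-70b-instruct` → `nvidia/meta/llama-3.3-70b-instruct`) is easier to read if a combo entry is taken as `<provider alias>/<provider model ID>`, where the model ID may itself contain a slash. A minimal sketch under that assumption follows; `splitComboEntry` is a hypothetical helper written for illustration, not a function from the OmniRoute codebase.

```typescript
// Hypothetical helper (assumption): only the FIRST "/" separates the provider
// alias from the model ID; everything after it is passed to the provider as-is.
function splitComboEntry(entry: string): { alias: string; modelId: string } {
  const slash = entry.indexOf("/");
  if (slash <= 0 || slash === entry.length - 1) {
    throw new Error(`Expected "<alias>/<model>", got "${entry}"`);
  }
  return { alias: entry.slice(0, slash), modelId: entry.slice(slash + 1) };
}

// The corrected combo entry: alias "nvidia", model ID in NVIDIA NIM's meta/ namespace.
const parsed = splitComboEntry("nvidia/meta/llama-3.3-70b-instruct");
console.log(parsed.alias, parsed.modelId); // prints "nvidia meta/llama-3.3-70b-instruct"
```

Read this way, the old entry resolved to the model ID `llama-3.3-70b-instruct` on the `nvidia` provider, which lacks the `meta/` namespace the commit says NVIDIA NIM expects.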
dependabot[bot] d98ec59c79 deps: bump the production group with 5 updates
Bumps the production group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| [better-sqlite3](https://github.com/WiseLibs/better-sqlite3) | `12.6.2` | `12.8.0` |
| [https-proxy-agent](https://github.com/TooTallNate/proxy-agents/tree/HEAD/packages/https-proxy-agent) | `7.0.6` | `8.0.0` |
| [undici](https://github.com/nodejs/undici) | `7.24.2` | `7.24.4` |
| [wreq-js](https://github.com/sqdshguy/wreq-js) | `2.1.1` | `2.2.0` |
| [zustand](https://github.com/pmndrs/zustand) | `5.0.11` | `5.0.12` |


Updates `better-sqlite3` from 12.6.2 to 12.8.0
- [Release notes](https://github.com/WiseLibs/better-sqlite3/releases)
- [Commits](https://github.com/WiseLibs/better-sqlite3/compare/v12.6.2...v12.8.0)

Updates `https-proxy-agent` from 7.0.6 to 8.0.0
- [Release notes](https://github.com/TooTallNate/proxy-agents/releases)
- [Changelog](https://github.com/TooTallNate/proxy-agents/blob/main/packages/https-proxy-agent/CHANGELOG.md)
- [Commits](https://github.com/TooTallNate/proxy-agents/commits/https-proxy-agent@8.0.0/packages/https-proxy-agent)

Updates `undici` from 7.24.2 to 7.24.4
- [Release notes](https://github.com/nodejs/undici/releases)
- [Commits](https://github.com/nodejs/undici/compare/v7.24.2...v7.24.4)

Updates `wreq-js` from 2.1.1 to 2.2.0
- [Release notes](https://github.com/sqdshguy/wreq-js/releases)
- [Commits](https://github.com/sqdshguy/wreq-js/compare/v2.1.1...v2.2.0)

Updates `zustand` from 5.0.11 to 5.0.12
- [Release notes](https://github.com/pmndrs/zustand/releases)
- [Commits](https://github.com/pmndrs/zustand/compare/v5.0.11...v5.0.12)

---
updated-dependencies:
- dependency-name: better-sqlite3
  dependency-version: 12.8.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: production
- dependency-name: https-proxy-agent
  dependency-version: 8.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: production
- dependency-name: undici
  dependency-version: 7.24.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production
- dependency-name: wreq-js
  dependency-version: 2.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: production
- dependency-name: zustand
  dependency-version: 5.0.12
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: production
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-16 19:03:12 +00:00
Regis d79b55be5a fix(sse): strip unsupported params for reasoning models (o1/o3)
Reasoning models (o1, o1-pro, o3, o3-mini) reject standard parameters
like temperature and top_p with 400 Bad Request. OmniRoute's default
executor forwards all parameters without filtering.

This fix adds declarative parameter filtering:
- Add unsupportedParams[] field to RegistryModel interface
- Add REASONING_UNSUPPORTED frozen constant shared across entries
- Add o1-pro, o3, o3-mini to OpenAI registry (were missing)
- Add getUnsupportedParams() helper with:
  - O(1) precomputed map lookup (not O(N×M) scan)
  - Cross-provider routing support via precomputed map
  - Prefixed model ID support (e.g., "openai/o3" → "o3")
- Strip unsupported params in chatCore.ts before executor call
- Use Object.hasOwn() for safe property check (no prototype chain)
- Log stripped params at WARN level for visibility
2026-03-16 19:41:55 +01:00
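The filtering described in this commit can be sketched in isolation as follows. This is a simplified illustration, not the project's code: the map contents, `lookupUnsupported`, and `stripUnsupported` are assumptions made for the example, and it uses `Object.prototype.hasOwnProperty.call` where the commit uses `Object.hasOwn()` so that it compiles on older TypeScript targets.

```typescript
// Shared, frozen list of parameters that reasoning models reject (from the commit).
const REASONING_UNSUPPORTED: readonly string[] = Object.freeze([
  "temperature", "top_p", "frequency_penalty", "presence_penalty",
  "logprobs", "top_logprobs", "n",
]);

// Precomputed modelId -> unsupported params map, so each request is one lookup.
// (Assumption: only three models are listed here to keep the sketch short.)
const unsupportedByModel = new Map<string, readonly string[]>([
  ["o1", REASONING_UNSUPPORTED],
  ["o3", REASONING_UNSUPPORTED],
  ["o3-mini", REASONING_UNSUPPORTED],
]);

// Exact match first, then retry with the part after the last "/" ("openai/o3" -> "o3").
function lookupUnsupported(modelId: string): readonly string[] {
  const exact = unsupportedByModel.get(modelId);
  if (exact) return exact;
  const bareId = modelId.split("/").pop() ?? "";
  return unsupportedByModel.get(bareId) ?? [];
}

// Deletes unsupported own properties from the body and reports what was removed.
function stripUnsupported(body: Record<string, unknown>, modelId: string): string[] {
  const stripped: string[] = [];
  for (const param of lookupUnsupported(modelId)) {
    if (Object.prototype.hasOwnProperty.call(body, param)) {
      delete body[param];
      stripped.push(param);
    }
  }
  return stripped;
}

const body: Record<string, unknown> = { model: "o3", temperature: 0.2, top_p: 0.9, messages: [] };
const removed = stripUnsupported(body, "openai/o3");
console.log(removed, Object.keys(body));
```

Returning the list of stripped names keeps the logging decision (WARN level in the commit) with the caller rather than inside the helper.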
Regis 1f9a402dcd fix(sse): address bot review — tighten local detection, guard null model
- Remove apiKey===null heuristic (too broad — could match cloud providers
  with non-standard auth). Use URL-based detection only.
- Guard local 404 branch with provider && model check — if either is null,
  fall through to standard connection lockout (safer behavior).
- Document LOCAL_HOSTNAMES as module-load-time constant (restart required).
- Document PROVIDER_PROFILES.local as intentionally not yet wired.
2026-03-16 19:03:47 +01:00
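The decision this follow-up commit describes (URL-based detection only, and a fall-through to the standard lockout when `provider` or `model` is missing) might look roughly like the sketch below. `lockoutFor404` and the `Lockout` type are hypothetical names for illustration; the actual branch lives in `markAccountUnavailable()` and may differ. The cooldown values and the default hostname set are taken from the diff, without the `LOCAL_HOSTNAMES` env var extension.

```typescript
// Default local hostnames (the real set can be extended via the LOCAL_HOSTNAMES env var).
const LOCAL_HOSTNAMES = new Set(["localhost", "127.0.0.1", "::1", "[::1]"]);

const COOLDOWN_MS = {
  notFound: 2 * 60 * 1000, // 404 on a cloud provider: 2 min connection lockout
  notFoundLocal: 5 * 1000, // 404 on a local provider: 5 s model-only lockout
};

// URL-based detection only; malformed or missing URLs are treated as non-local.
function isLocalProvider(baseUrl?: string | null): boolean {
  if (!baseUrl) return false;
  try {
    return LOCAL_HOSTNAMES.has(new URL(baseUrl).hostname);
  } catch {
    return false;
  }
}

type Lockout = { scope: "model" | "connection"; cooldownMs: number };

// Hypothetical decision helper: model-only lockout requires a known provider
// AND model; otherwise fall through to the safer connection-level lockout.
function lockoutFor404(
  baseUrl: string | null,
  provider: string | null,
  model: string | null,
): Lockout {
  if (provider && model && isLocalProvider(baseUrl)) {
    return { scope: "model", cooldownMs: COOLDOWN_MS.notFoundLocal };
  }
  return { scope: "connection", cooldownMs: COOLDOWN_MS.notFound };
}

console.log(lockoutFor404("http://localhost:11434/v1", "ollama", "llama3"));
console.log(lockoutFor404("http://localhost:11434/v1", "ollama", null));
```

The second call shows the guard: even on a local URL, a missing model falls back to the connection-level lockout.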
Regis f9bcc9418b fix(sse): model-only lockout for local provider 404 (connection stays active)
When a local inference backend (oMLX, Ollama, LM Studio) returns 404
for an unknown model, OmniRoute previously locked the entire connection
for 2 minutes — blocking all valid models on that connection.

This fix introduces local provider detection and changes the 404
behavior for local providers:
- Model-only lockout (5s) instead of connection-level lockout (2min)
- Connection stays active — other models continue working immediately
- Detection via URL heuristic (localhost/127.0.0.1) + apiKey===null fallback
- Configurable via LOCAL_HOSTNAMES env var for Docker setups

Also fixes a pre-existing bug where the model parameter was not passed
to markAccountUnavailable() from chat.ts, preventing per-model lockouts
from working at all.

Changes:
- Add isLocalProvider(baseUrl) helper in providerRegistry.ts
- Add COOLDOWN_MS.notFoundLocal (5s) and PROVIDER_PROFILES.local
- Add local 404 branch in markAccountUnavailable() in auth.ts
- Pass model param to markAccountUnavailable() in chat.ts (bug fix)
2026-03-16 18:55:41 +01:00
Regis 08256a3502 feat(api): add Kilo Gateway provider (335+ models, 6 free, auto-routing)
Kilo Gateway (api.kilo.ai/api/gateway) is an OpenAI-compatible API
offering 335+ models via a single API key, including 6 free models
and 3 auto-routing models (frontier/balanced/free).

This is distinct from the existing KiloCode provider which uses
OAuth + /api/openrouter/ endpoint.

- Register kilo-gateway in providerRegistry.ts (alias: kg)
- Add to APIKEY_PROVIDERS in providers.ts
- Add models endpoint config in route.ts
- Add official Kilo AI icon (favicon)
2026-03-16 17:26:27 +01:00
diegosouzapw 9b255e643a chore(release): v2.6.3 — compile-time hash-strip fix, Synthetic provider (PR #404), VPS PM2 path fix
Build Electron Desktop App / Validate version (push) Failing after 42s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-16 11:00:43 -03:00
Diego Rodrigues de Sa e Souza ca1f918e9e Merge pull request #404 from Regis-RCR/feat/synthetic-provider
feat(api): add Synthetic as a new API key provider
2026-03-16 10:59:13 -03:00
diegosouzapw bb3fe1cd48 fix(build): strip Turbopack hashed require() from compiled server chunks in prepublish
Even with EXPERIMENTAL_TURBOPACK=0 and NEXT_PRIVATE_BUILD_WORKER=0, Next.js 16
instrumentation chunks still emit require('better-sqlite3-<16hexchars>') and
require('zod-<16hexchars>') into the compiled .js files inside .next/server/.

The webpack externals function in next.config.mjs patches the runtime bundler
but does NOT rewrite already-compiled chunks. Added step 5.6 to prepublish.mjs:
walks all .js files in app/.next/server/ and strips the 16-char hex suffix from
any require() string that matches the Turbopack hash pattern.

Also updated deploy-vps workflow: npm registry rejects 299MB packages, so
deployment now uses npm pack + scp + npm install -g /tmp/omniroute-*.tgz.
PM2 entry point is app/server.js inside the npm global package.
2026-03-16 10:46:27 -03:00
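The hash-strip step described in this commit can be illustrated on a single source string. The regular expression below is an assumption (lowercase hex, single or double quotes); the actual pattern and file walk in `prepublish.mjs` are not shown in this diff, and `stripHashedRequires` is a hypothetical name.

```typescript
// Matches require('<name>-<16 hex chars>') with either quote style and captures
// the quote character and the bare module name.
const HASHED_REQUIRE = /require\((['"])([^'"]+)-[0-9a-f]{16}\1\)/g;

// Rewrites hashed require() calls back to the bare module name; everything
// else in the source is returned unchanged.
function stripHashedRequires(source: string): string {
  return source.replace(
    HASHED_REQUIRE,
    (_match: string, quote: string, name: string) => `require(${quote}${name}${quote})`,
  );
}

const chunk = "const db = require('better-sqlite3-0123456789abcdef');";
console.log(stripHashedRequires(chunk)); // prints "const db = require('better-sqlite3');"
```

In a build script, the same function would be applied to the contents of each `.js` file under `app/.next/server/` and the result written back.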
Regis d139b4557f feat(api): add Synthetic as a new API key provider
Add Synthetic (synthetic.new) as a privacy-focused LLM provider
with OpenAI-compatible API, dynamic model catalog via /models
endpoint, and passthrough model support.

- Register provider in providerRegistry.ts with 6 initial models
- Add APIKEY_PROVIDERS entry with verified_user icon (#6366F1)
- Add models listing config for /api/providers/[id]/models endpoint
- passthroughModels enabled for dynamic model catalog
2026-03-16 12:39:23 +01:00
14 changed files with 401 additions and 74 deletions
+35 -27
@@ -4,73 +4,81 @@ description: Deploy the latest OmniRoute code to the Akamai VPS (69.164.221.35)
# Deploy to VPS Workflow
Deploy OmniRoute to the production VPS using `npm install -g` + PM2.
Deploy OmniRoute to the production VPS using `npm pack + scp` + PM2.
**VPS:** `69.164.221.35` (Akamai, Ubuntu 24.04, 1GB RAM + 2.5GB swap)
**Local VPS:** `192.168.0.15` (same setup)
**Process manager:** PM2 (`omniroute`)
**Port:** `20128`
**PM2 entry:** `/usr/lib/node_modules/omniroute/app/server.js`
> [!IMPORTANT]
> PM2 runs from the global npm package at `/usr/lib/node_modules/omniroute`.
> **DO NOT** use git clone or local copies. The `npm install -g` command handles
> building, publishing, and installing the standalone app in one step.
> The Next.js standalone build is at `app/server.js` inside that directory.
> The npm registry rejects packages > 100MB, so deployment uses **npm pack + scp**.
## Steps
### 1. Publish to npm
### 1. Build + pack locally
Ensure the version in `package.json` is bumped and the package is published:
Run the full build (includes hash-strip patch) and create the .tgz:
// turbo
```bash
npm publish
cd /home/diegosouzapw/dev/proxys/9router && npm run build:cli && npm pack --ignore-scripts
```
### 2. Install on VPS and restart PM2
### 2. Copy to both VPS and install
// turbo-all
```bash
ssh root@69.164.221.35 "npm install -g omniroute@latest && pm2 restart omniroute && pm2 save && echo '✅ Deploy complete!'"
scp omniroute-*.tgz root@69.164.221.35:/tmp/ && scp omniroute-*.tgz root@192.168.0.15:/tmp/
```
For the local VPS:
```bash
ssh root@69.164.221.35 "npm install -g /tmp/omniroute-*.tgz --ignore-scripts && pm2 restart omniroute && pm2 save && echo '✅ Akamai done'"
```
```bash
ssh root@192.168.0.15 "npm install -g omniroute@latest && pm2 restart omniroute && pm2 save && echo '✅ Deploy complete!'"
ssh root@192.168.0.15 "npm install -g /tmp/omniroute-*.tgz --ignore-scripts && pm2 restart omniroute && pm2 save && echo '✅ Local done'"
```
### 3. Verify the deployment
```bash
ssh root@69.164.221.35 "pm2 list && cat \$(npm root -g)/omniroute/package.json | grep version | head -1 && curl -s -o /dev/null -w 'HTTP %{http_code}' http://localhost:20128/"
ssh root@69.164.221.35 "pm2 list && cat \$(npm root -g)/omniroute/app/package.json | grep version | head -1 && curl -s -o /dev/null -w 'HTTP %{http_code}' http://localhost:20128/"
```
Expected: PM2 shows `online`, version matches published, HTTP returns `307` (redirect to login).
Expected: PM2 shows `online`, version matches, HTTP returns `307`.
## How it works
1. `npm publish` builds Next.js standalone + bundles everything into the npm package
2. `npm install -g omniroute@latest` downloads and installs to `/usr/lib/node_modules/omniroute/`
3. PM2 is registered to run `npm start` from that directory (cwd: `/usr/lib/node_modules/omniroute`)
4. `pm2 restart omniroute` picks up the new code immediately
1. `npm run build:cli` builds Next.js standalone `app/` and strips Turbopack hashed require() calls from chunks
2. `npm pack --ignore-scripts` packages without re-running the build
3. `scp` transfers the .tgz to each VPS (~286MB)
4. `npm install -g /tmp/omniroute-*.tgz --ignore-scripts` installs pre-built package
5. PM2 runs `app/server.js` from `/usr/lib/node_modules/omniroute`
## PM2 Setup (one-time)
If PM2 needs to be reconfigured from scratch:
## PM2 Setup (one-time — if reconfiguring from scratch)
```bash
ssh root@<VPS> "
cd /usr/lib/node_modules/omniroute &&
PORT=20128 pm2 start app/server.js --name omniroute --env PORT=20128 &&
pm2 save &&
pm2 startup
pm2 delete omniroute ;
cp /opt/omniroute-app/.env /usr/lib/node_modules/omniroute/.env &&
PORT=20128 pm2 start /usr/lib/node_modules/omniroute/app/server.js --name omniroute --cwd /usr/lib/node_modules/omniroute/app &&
pm2 save && pm2 startup
"
```
> [!NOTE]
> Copy `.env` from the old installation first. For Akamai it was at `/opt/omniroute-app/.env`,
> for the local VPS it was at `/root/omniroute-fresh/.env`.
## Notes
- The `.env` file is at `/usr/lib/node_modules/omniroute/.env`. Back it up before major npm updates.
- PM2 is configured with `pm2 startup` to auto-restart on reboot.
- Nginx proxies `omniroute.online` → `localhost:20128`.
- The VPS has only 1GB RAM — builds happen locally via `npm publish`, not on the VPS.
- `.env` should be placed at `/usr/lib/node_modules/omniroute/app/.env`
- PM2 is configured with `pm2 startup` to auto-restart on reboot
- Nginx proxies `omniroute.online` → `localhost:20128`
- The VPS has only 1GB RAM — builds happen locally, never on the VPS
+55
@@ -4,6 +4,61 @@
---
## [2.6.5] — 2026-03-17
> Sprint: reasoning model param filtering, local provider 404 fix, Kilo Gateway provider, dependency bumps.
### ✨ New Features
- **feat(api)**: Added **Kilo Gateway** (`api.kilo.ai`) as a new API Key provider (alias `kg`) — 335+ models, 6 free models, 3 auto-routing models (`kilo-auto/frontier`, `kilo-auto/balanced`, `kilo-auto/free`). Passthrough models supported via `/api/gateway/models` endpoint. (PR #408 by @Regis-RCR)
### 🐛 Bug Fixes
- **fix(sse)**: Strip unsupported parameters for reasoning models (o1, o1-mini, o1-pro, o3, o3-mini). Models in the `o1`/`o3` family reject `temperature`, `top_p`, `frequency_penalty`, `presence_penalty`, `logprobs`, `top_logprobs`, and `n` with HTTP 400. Parameters are now stripped at the `chatCore` layer before forwarding. Uses a declarative `unsupportedParams` field per model and a precomputed O(1) Map for lookup. (PR #412 by @Regis-RCR)
- **fix(sse)**: Local provider 404 now results in a **model-only lockout (5 seconds)** instead of a connection-level lockout (2 minutes). When a local inference backend (Ollama, LM Studio, oMLX) returns 404 for an unknown model, the connection remains active and other models continue working immediately. Also fixes a pre-existing bug where `model` was not passed to `markAccountUnavailable()`. Local providers detected via hostname (`localhost`, `127.0.0.1`, `::1`, extensible via `LOCAL_HOSTNAMES` env var). (PR #410 by @Regis-RCR)
### 📦 Dependencies
- `better-sqlite3` 12.6.2 → 12.8.0
- `undici` 7.24.2 → 7.24.4
- `https-proxy-agent` 7 → 8
- `agent-base` 7 → 8
---
## [2.6.4] — 2026-03-17
### 🐛 Bug Fixes
- **fix(providers)**: Removed non-existent model names across 5 providers:
- **gemini / gemini-cli**: removed `gemini-3.1-pro/flash` and `gemini-3-*-preview` (don't exist in Google API v1beta); replaced with `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-1.5-pro/flash`
- **antigravity**: removed `gemini-3.1-pro-high/low` and `gemini-3-flash` (invalid internal aliases); replaced with real 2.x models
- **github (Copilot)**: removed `gemini-3-flash-preview` and `gemini-3-pro-preview`; replaced with `gemini-2.5-flash`
- **nvidia**: corrected `nvidia/llama-3.3-70b-instruct` → `meta/llama-3.3-70b-instruct` (NVIDIA NIM uses `meta/` namespace for Meta models); added `nvidia/llama-3.1-70b-instruct` and `nvidia/llama-3.1-405b-instruct`
- **fix(db/combo)**: Updated `free-stack` combo on remote DB: removed `qw/qwen3-coder-plus` (expired refresh token), corrected `nvidia/llama-3.3-70b-instruct` → `nvidia/meta/llama-3.3-70b-instruct`, corrected `gemini/gemini-3.1-flash` → `gemini/gemini-2.5-flash`, added `if/deepseek-v3.2`
---
## [2.6.3] — 2026-03-16
> Sprint: zod/pino hash-strip baked into build pipeline, Synthetic provider added, VPS PM2 path corrected.
### 🐛 Bug Fixes
- **fix(build)**: Turbopack hash-strip now runs at **compile time** for ALL packages, not just `better-sqlite3`. Step 5.6 in `prepublish.mjs` walks every `.js` file in `app/.next/server/` and strips the 16-char hex suffix from any hashed `require()`. Fixes MODULE_NOT_FOUND errors for `zod-dcb22c...`, `pino-...`, etc. on global npm installs. Closes #398
- **fix(deploy)**: PM2 on both VPS was pointing to stale git-clone directories. Reconfigured to `app/server.js` in the npm global package. Updated `/deploy-vps` workflow to use `npm pack + scp` (npm registry rejects 299MB packages).
### ✨ Features
- **feat(provider)**: Synthetic ([synthetic.new](https://synthetic.new)) — privacy-focused OpenAI-compatible inference. `passthroughModels: true` for dynamic HuggingFace model catalog. Initial models: Kimi K2.5, MiniMax M2.5, GLM 4.7, DeepSeek V3.2. (PR #404 by @Regis-RCR)
### 📋 Issues Closed
- **close #398**: npm hash regression — fixed by compile-time hash-strip in prepublish
- **triage #324**: Bug screenshot without steps — requested reproduction details
---
## [2.6.2] — 2026-03-16
> Sprint: module hashing fully fixed, 2 PRs merged (Anthropic tools filter + custom endpoint paths), Alibaba Cloud DashScope provider added, 3 stale issues closed.
+1 -1
@@ -1,7 +1,7 @@
openapi: 3.1.0
info:
title: OmniRoute API
version: 2.6.2
version: 2.6.5
description: |
OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
endpoint that routes requests to multiple AI providers with load balancing,
+11
@@ -135,6 +135,7 @@ export const COOLDOWN_MS = {
unauthorized: 2 * 60 * 1000, // 401 → 2 min
paymentRequired: 2 * 60 * 1000, // 402/403 → 2 min
notFound: 2 * 60 * 1000, // 404 → 2 minutes
notFoundLocal: 5 * 1000, // 404 on local provider → 5s model-only lockout (connection stays active)
transientInitial: 5 * 1000, // 408/500/502/503/504 first hit → 5s (backoff from here)
transientMax: 60 * 1000, // 502/503/504 backoff ceiling → 60s
transient: 5 * 1000, // Legacy alias → points to transientInitial
@@ -162,6 +163,16 @@ export const PROVIDER_PROFILES = {
circuitBreakerThreshold: 5, // More tolerant (occasional 502 is normal)
circuitBreakerReset: 30000, // 30s reset
},
// Local providers (localhost inference backends like Ollama, LM Studio, oMLX).
// Not yet wired into getProviderProfile() — will be used when local provider_nodes
// are integrated into the resilience layer. Kept here to avoid a second constants change.
local: {
transientCooldown: 2000, // 2s (local — very fast recovery)
rateLimitCooldown: 5000, // 5s (local — no real rate limits)
maxBackoffLevel: 3, // Low ceiling (local either works or doesn't)
circuitBreakerThreshold: 2, // Opens fast (if local is down, it's down)
circuitBreakerReset: 15000, // 15s reset (check again quickly)
},
};
// Default rate limit values for API Key providers (auto-enabled safety net)
+142 -18
@@ -12,8 +12,21 @@ export interface RegistryModel {
id: string;
name: string;
targetFormat?: string;
unsupportedParams?: readonly string[];
}
// Reasoning models reject temperature, top_p, penalties, logprobs, n.
// Frozen to prevent accidental mutation (shared across all model entries).
const REASONING_UNSUPPORTED: readonly string[] = Object.freeze([
"temperature",
"top_p",
"frequency_penalty",
"presence_penalty",
"logprobs",
"top_logprobs",
"n",
]);
export interface RegistryOAuth {
clientIdEnv?: string;
clientIdDefault?: string;
@@ -126,13 +139,13 @@ export const REGISTRY: Record<string, RegistryEntry> = {
clientSecretDefault: "",
},
models: [
{ id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
{ id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
{ id: "gemini-3-pro-preview", name: "Gemini 3.0 Pro Preview" },
{ id: "gemini-3-flash-preview", name: "Gemini 3.0 Flash Preview" },
{ id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
{ id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
{ id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
{ id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
{ id: "gemini-2.0-flash-exp", name: "Gemini 2.0 Flash Exp" },
{ id: "gemini-1.5-pro", name: "Gemini 1.5 Pro" },
{ id: "gemini-1.5-flash", name: "Gemini 1.5 Flash" },
],
},
@@ -155,13 +168,12 @@ export const REGISTRY: Record<string, RegistryEntry> = {
clientSecretDefault: "",
},
models: [
{ id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
{ id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
{ id: "gemini-3-flash-preview", name: "Gemini 3.0 Flash Preview" },
{ id: "gemini-3-pro-preview", name: "Gemini 3.0 Pro Preview" },
{ id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
{ id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
{ id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
{ id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
{ id: "gemini-1.5-pro", name: "Gemini 1.5 Pro" },
{ id: "gemini-1.5-flash", name: "Gemini 1.5 Flash" },
],
},
@@ -305,10 +317,9 @@ export const REGISTRY: Record<string, RegistryEntry> = {
models: [
{ id: "claude-opus-4-6-thinking", name: "Claude Opus 4.6 Thinking" },
{ id: "claude-sonnet-4-6", name: "Claude Sonnet 4.6" },
{ id: "gemini-3.1-pro-high", name: "Gemini 3.1 Pro High" },
{ id: "gemini-3.1-pro-low", name: "Gemini 3.1 Pro Low" },
{ id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
{ id: "gemini-3-flash", name: "Gemini 3.0 Flash" },
{ id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
{ id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
{ id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
{ id: "gpt-oss-120b-medium", name: "GPT OSS 120B Medium" },
],
},
@@ -356,8 +367,7 @@ export const REGISTRY: Record<string, RegistryEntry> = {
{ id: "claude-sonnet-4", name: "Claude Sonnet 4" },
{ id: "claude-sonnet-4.5", name: "Claude Sonnet 4.5" },
{ id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
{ id: "gemini-3-flash-preview", name: "Gemini 3 Flash Preview" },
{ id: "gemini-3-pro-preview", name: "Gemini 3 Pro Preview" },
{ id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
{ id: "grok-code-fast-1", name: "Grok Code Fast 1" },
{ id: "oswe-vscode-prime", name: "Raptor Mini" },
],
@@ -429,8 +439,11 @@ export const REGISTRY: Record<string, RegistryEntry> = {
{ id: "gpt-4o", name: "GPT-4o" },
{ id: "gpt-4o-mini", name: "GPT-4o Mini" },
{ id: "gpt-4-turbo", name: "GPT-4 Turbo" },
{ id: "o1", name: "O1" },
{ id: "o1-mini", name: "O1 Mini" },
{ id: "o1", name: "O1", unsupportedParams: REASONING_UNSUPPORTED },
{ id: "o1-mini", name: "O1 Mini", unsupportedParams: REASONING_UNSUPPORTED },
{ id: "o1-pro", name: "O1 Pro", unsupportedParams: REASONING_UNSUPPORTED },
{ id: "o3", name: "O3", unsupportedParams: REASONING_UNSUPPORTED },
{ id: "o3-mini", name: "O3 Mini", unsupportedParams: REASONING_UNSUPPORTED },
],
},
@@ -836,12 +849,14 @@ export const REGISTRY: Record<string, RegistryEntry> = {
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "meta/llama-3.3-70b-instruct", name: "Llama 3.3 70B" },
{ id: "meta/llama-4-maverick-17b-128e-instruct", name: "Llama 4 Maverick" },
{ id: "moonshotai/kimi-k2.5", name: "Kimi K2.5" },
{ id: "z-ai/glm4.7", name: "GLM 4.7" },
{ id: "deepseek-ai/deepseek-v3.2", name: "DeepSeek V3.2" },
{ id: "nvidia/llama-3.3-70b-instruct", name: "Llama 3.3 70B" },
{ id: "meta/llama-4-maverick-17b-128e-instruct", name: "Llama 4 Maverick" },
{ id: "deepseek/deepseek-r1", name: "DeepSeek R1" },
{ id: "nvidia/llama-3.1-70b-instruct", name: "Llama 3.1 70B" },
{ id: "nvidia/llama-3.1-405b-instruct", name: "Llama 3.1 405B" },
],
},
@@ -919,6 +934,46 @@ export const REGISTRY: Record<string, RegistryEntry> = {
],
},
synthetic: {
id: "synthetic",
alias: "synthetic",
format: "openai",
executor: "default",
baseUrl: "https://api.synthetic.new/openai/v1/chat/completions",
modelsUrl: "https://api.synthetic.new/openai/v1/models",
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "hf:nvidia/Kimi-K2.5-NVFP4", name: "Kimi K2.5 (NVFP4)" },
{ id: "hf:MiniMaxAI/MiniMax-M2.5", name: "MiniMax M2.5" },
{ id: "hf:zai-org/GLM-4.7-Flash", name: "GLM 4.7 Flash" },
{ id: "hf:zai-org/GLM-4.7", name: "GLM 4.7" },
{ id: "hf:moonshotai/Kimi-K2.5", name: "Kimi K2.5" },
{ id: "hf:deepseek-ai/DeepSeek-V3.2", name: "DeepSeek V3.2" },
],
passthroughModels: true,
},
"kilo-gateway": {
id: "kilo-gateway",
alias: "kg",
format: "openai",
executor: "default",
baseUrl: "https://api.kilo.ai/api/gateway/chat/completions",
modelsUrl: "https://api.kilo.ai/api/gateway/models",
authType: "apikey",
authHeader: "bearer",
models: [
{ id: "kilo-auto/frontier", name: "Kilo Auto Frontier" },
{ id: "kilo-auto/balanced", name: "Kilo Auto Balanced" },
{ id: "kilo-auto/free", name: "Kilo Auto Free" },
{ id: "nvidia/nemotron-3-super-120b-a12b:free", name: "Nemotron 3 Super 120B (Free)" },
{ id: "minimax/minimax-m2.5:free", name: "MiniMax M2.5 (Free)" },
{ id: "arcee-ai/trinity-large-preview:free", name: "Trinity Large Preview (Free)" },
],
passthroughModels: true,
},
vertex: {
id: "vertex",
alias: "vertex",
@@ -1022,6 +1077,38 @@ export function generateAliasMap(): Record<string, string> {
return map;
}
// ── Local Provider Detection ──────────────────────────────────────────────
// Evaluated once at module load time — process restart required for env var changes.
const LOCAL_HOSTNAMES = new Set([
"localhost",
"127.0.0.1",
"::1",
"[::1]",
...(typeof process !== "undefined" && process.env.LOCAL_HOSTNAMES
? process.env.LOCAL_HOSTNAMES.split(",")
.map((h) => h.trim())
.filter(Boolean)
: []),
]);
/**
* Detect if a base URL points to a local inference backend.
* Used for shorter 404 cooldowns (model-only, not connection) and health check targets.
*
* Operators can extend via LOCAL_HOSTNAMES env var (comma-separated) for Docker
* hostnames (e.g., LOCAL_HOSTNAMES=omlx,mlx-audio).
*/
export function isLocalProvider(baseUrl?: string | null): boolean {
if (!baseUrl) return false;
try {
const url = new URL(baseUrl);
return LOCAL_HOSTNAMES.has(url.hostname);
} catch {
return false;
}
}
// ── Registry Lookup Helpers ───────────────────────────────────────────────
const _byAlias = new Map<string, RegistryEntry>();
@@ -1041,6 +1128,43 @@ export function getRegisteredProviders(): string[] {
return Object.keys(REGISTRY);
}
// Precomputed map: modelId → unsupportedParams (O(1) lookup instead of O(N×M) scan).
// Built once at module load from all registry entries.
const _unsupportedParamsMap = new Map<string, readonly string[]>();
for (const entry of Object.values(REGISTRY)) {
for (const model of entry.models) {
if (model.unsupportedParams && !_unsupportedParamsMap.has(model.id)) {
_unsupportedParamsMap.set(model.id, model.unsupportedParams);
}
}
}
/**
* Get unsupported parameters for a specific model.
* Uses O(1) precomputed lookup. Also handles prefixed model IDs
* (e.g., "openai/o3" → strips prefix and looks up "o3").
* Returns empty array if no restrictions are defined.
*/
export function getUnsupportedParams(provider: string, modelId: string): readonly string[] {
// 1. Check current provider's registry (exact match)
const entry = getRegistryEntry(provider);
const modelEntry = entry?.models.find((m) => m.id === modelId);
if (modelEntry?.unsupportedParams) return modelEntry.unsupportedParams;
// 2. O(1) lookup in precomputed map (handles cross-provider routing)
const cached = _unsupportedParamsMap.get(modelId);
if (cached) return cached;
// 3. Handle prefixed model IDs (e.g., "openai/o3" → "o3")
if (modelId.includes("/")) {
const bareId = modelId.split("/").pop() || "";
const bare = _unsupportedParamsMap.get(bareId);
if (bare) return bare;
}
return [];
}
/**
* Get provider category: "oauth" or "apikey"
* Used by the resilience layer to apply different cooldown/backoff profiles.
+19 -1
@@ -13,6 +13,7 @@ import { refreshWithRetry } from "../services/tokenRefresh.ts";
import { createRequestLogger } from "../utils/requestLogger.ts";
import { getModelTargetFormat, PROVIDER_ID_TO_ALIAS } from "../config/providerModels.ts";
import { resolveModelAlias } from "../services/modelDeprecation.ts";
import { getUnsupportedParams } from "../config/providerRegistry.ts";
import { createErrorResult, parseUpstreamError, formatProviderError } from "../utils/error.ts";
import { HTTP_STATUS } from "../config/constants.ts";
import { handleBypassRequest } from "../utils/bypassHandler.ts";
@@ -53,7 +54,9 @@ export function shouldUseNativeCodexPassthrough({
}): boolean {
if (provider !== "codex") return false;
if (sourceFormat !== FORMATS.OPENAI_RESPONSES) return false;
return String(endpointPath || "").toLowerCase().endsWith("/responses");
return String(endpointPath || "")
.toLowerCase()
.endsWith("/responses");
}
/**
@@ -287,6 +290,21 @@ export async function handleChatCore({
// Update model in body
translatedBody.model = model;
// Strip unsupported parameters for reasoning models (o1, o3, etc.)
const unsupported = getUnsupportedParams(provider, model);
if (unsupported.length > 0) {
const stripped: string[] = [];
for (const param of unsupported) {
if (Object.hasOwn(translatedBody, param)) {
stripped.push(param);
delete translatedBody[param];
}
}
if (stripped.length > 0) {
log?.warn?.("PARAMS", `Stripped unsupported params for ${model}: ${stripped.join(", ")}`);
}
}
// Get executor for this provider
const executor = getExecutor(provider);
+23 -24
@@ -1,12 +1,12 @@
{
"name": "omniroute",
"version": "2.6.2",
"version": "2.6.5",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "omniroute",
"version": "2.6.2",
"version": "2.6.5",
"hasInstallScript": true,
"license": "MIT",
"workspaces": [
@@ -23,7 +23,7 @@
"express": "^5.2.1",
"fetch-socks": "^1.3.2",
"http-proxy-middleware": "^3.0.5",
-"https-proxy-agent": "^7.0.6",
+"https-proxy-agent": "^8.0.0",
"jose": "^6.1.3",
"lowdb": "^7.0.1",
"monaco-editor": "^0.55.1",
@@ -4236,9 +4236,9 @@
}
},
"node_modules/agent-base": {
-"version": "7.1.4",
-"resolved": "https://registry.npmjs.org/agent-base/-/agent-base-7.1.4.tgz",
-"integrity": "sha512-MnA+YT8fwfJPgBx3m60MNqakm30XOkyIoH1y6huTQvC0PwZG7ki8NacLBcrPbNoo8vEZy7Jpuk7+jMO+CUovTQ==",
+"version": "8.0.0",
+"resolved": "https://registry.npmjs.org/agent-base/-/agent-base-8.0.0.tgz",
+"integrity": "sha512-QT8i0hCz6C/KQ+KTAbSNwCHDGdmUJl2tp2ZpNlGSWCfhUNVbYG2WLE3MdZGBAgXPV4GAvjGMxo+C1hroyxmZEg==",
"license": "MIT",
"engines": {
"node": ">= 14"
@@ -4672,9 +4672,9 @@
}
},
"node_modules/better-sqlite3": {
-"version": "12.6.2",
-"resolved": "https://registry.npmjs.org/better-sqlite3/-/better-sqlite3-12.6.2.tgz",
-"integrity": "sha512-8VYKM3MjCa9WcaSAI3hzwhmyHVlH8tiGFwf0RlTsZPWJ1I5MkzjiudCo4KC4DxOaL/53A5B1sI/IbldNFDbsKA==",
+"version": "12.8.0",
+"resolved": "https://registry.npmjs.org/better-sqlite3/-/better-sqlite3-12.8.0.tgz",
+"integrity": "sha512-RxD2Vd96sQDjQr20kdP+F+dK/1OUNiVOl200vKBZY8u0vTwysfolF6Hq+3ZK2+h8My9YvZhHsF+RSGZW2VYrPQ==",
"hasInstallScript": true,
"license": "MIT",
"dependencies": {
@@ -6866,7 +6866,6 @@
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz",
"integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
-"dev": true,
"hasInstallScript": true,
"license": "MIT",
"optional": true,
@@ -7270,13 +7269,13 @@
}
},
"node_modules/https-proxy-agent": {
-"version": "7.0.6",
-"resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-7.0.6.tgz",
-"integrity": "sha512-vK9P5/iUfdl95AI+JVyUuIcVtd4ofvtrOr3HNtM2yxC9bnMbEdp3x01OhQNnjb8IJYi38VlTE3mBXwcfvywuSw==",
+"version": "8.0.0",
+"resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-8.0.0.tgz",
+"integrity": "sha512-YYeW+iCnAS3xhvj2dvVoWgsbca3RfQy/IlaNHHOtDmU0jMqPI9euIq3Y9BJETdxk16h9NHHCKqp/KB9nIMStCQ==",
"license": "MIT",
"dependencies": {
-"agent-base": "^7.1.2",
-"debug": "4"
+"agent-base": "8.0.0",
+"debug": "^4.3.4"
},
"engines": {
"node": ">= 14"
@@ -11524,9 +11523,9 @@
}
},
"node_modules/undici": {
-"version": "7.24.2",
-"resolved": "https://registry.npmjs.org/undici/-/undici-7.24.2.tgz",
-"integrity": "sha512-P9J1HWYV/ajFr8uCqk5QixwiRKmB1wOamgS0e+o2Z4A44Ej2+thFVRLG/eA7qprx88XXhnV5Bl8LHXTURpzB3Q==",
+"version": "7.24.4",
+"resolved": "https://registry.npmjs.org/undici/-/undici-7.24.4.tgz",
+"integrity": "sha512-BM/JzwwaRXxrLdElV2Uo6cTLEjhSb3WXboncJamZ15NgUURmvlXvxa6xkwIOILIjPNo9i8ku136ZvWV0Uly8+w==",
"license": "MIT",
"engines": {
"node": ">=20.18.1"
@@ -12129,9 +12128,9 @@
"license": "ISC"
},
"node_modules/wreq-js": {
-"version": "2.1.1",
-"resolved": "https://registry.npmjs.org/wreq-js/-/wreq-js-2.1.1.tgz",
-"integrity": "sha512-nJBOMBTczqcyHpF8a8YdPyxb30htK2RxuAfr6O8a6oyKHj2nRPjXbZcGXrquIdZx1b+6NV/GHweD3OqWwE7n4A==",
+"version": "2.2.0",
+"resolved": "https://registry.npmjs.org/wreq-js/-/wreq-js-2.2.0.tgz",
+"integrity": "sha512-lXW1/bvdPTpFMdfBftkJIp6OzxkAqAON4dlrKrmaFNT86eu60VCEVmEdK3nWY1ZyiEZ6IXQPRrc1uXG394BoBA==",
"cpu": [
"x64",
"arm64"
@@ -12336,9 +12335,9 @@
}
},
"node_modules/zustand": {
-"version": "5.0.11",
-"resolved": "https://registry.npmjs.org/zustand/-/zustand-5.0.11.tgz",
-"integrity": "sha512-fdZY+dk7zn/vbWNCYmzZULHRrss0jx5pPFiOuMZ/5HJN6Yv3u+1Wswy/4MpZEkEGhtNH+pwxZB8OKgUBPzYAGg==",
+"version": "5.0.12",
+"resolved": "https://registry.npmjs.org/zustand/-/zustand-5.0.12.tgz",
+"integrity": "sha512-i77ae3aZq4dhMlRhJVCYgMLKuSiZAaUPAct2AksxQ+gOtimhGMdXljRT21P5BNpeT4kXlLIckvkPM029OljD7g==",
"license": "MIT",
"engines": {
"node": ">=12.20.0"
+2 -2
@@ -1,6 +1,6 @@
{
"name": "omniroute",
-"version": "2.6.2",
+"version": "2.6.5",
"description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
"type": "module",
"bin": {
@@ -90,7 +90,7 @@
"express": "^5.2.1",
"fetch-socks": "^1.3.2",
"http-proxy-middleware": "^3.0.5",
-"https-proxy-agent": "^7.0.6",
+"https-proxy-agent": "^8.0.0",
"jose": "^6.1.3",
"lowdb": "^7.0.1",
"monaco-editor": "^0.55.1",
Binary file not shown.


+56
@@ -142,6 +142,62 @@ if (sanitisedCount > 0) {
console.log("  No hardcoded paths found to sanitise");
}
// ── Step 5.6: Strip Turbopack hashed externals from compiled chunks ─────────
// Even when Turbopack is disabled at build time, some instrumentation chunks
// may still emit require('package-<16hexchars>') instead of require('package').
// These hashed names don't exist in node_modules and cause MODULE_NOT_FOUND at
// runtime. We strip the hex suffix from all .js files in app/.next/server/
// to ensure all require() calls use the real package names.
{
const serverOutput = join(APP_DIR, ".next", "server");
const HASH_RE = /(['"\\])([a-z@][a-z0-9@./_-]+-[0-9a-f]{16})\1/g;
let patchedFiles = 0;
let patchedMatches = 0;
const walkDir = (dir) => {
let entries = [];
try {
entries = readdirSync(dir);
} catch {
return;
}
for (const entry of entries) {
const full = join(dir, entry);
try {
const st = statSync(full);
if (st.isDirectory()) {
walkDir(full);
continue;
}
if (!entry.endsWith(".js")) continue;
const src = readFileSync(full, "utf8");
let count = 0;
const patched = src.replace(HASH_RE, (_, q, name) => {
const base = name.replace(/-[0-9a-f]{16}$/, "");
count++;
return `${q}${base}${q}`;
});
if (count > 0) {
writeFileSync(full, patched);
patchedFiles++;
patchedMatches += count;
}
} catch {
/* skip unreadable files */
}
}
};
if (existsSync(serverOutput)) {
walkDir(serverOutput);
if (patchedMatches > 0) {
console.log(
` 🔧 Hash-strip: patched ${patchedMatches} hashed require() in ${patchedFiles} server chunk file(s)`
);
} else {
console.log(" ✅ Hash-strip: no hashed externals found in compiled chunks.");
}
}
}
// ── Step 6: Copy static assets ─────────────────────────────
const staticSrc = join(ROOT, ".next", "static");
const staticDest = join(APP_DIR, ".next", "static");
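To see what the Step 5.6 hash-strip pattern does to a single chunk, the replacement can be run on an in-memory string. The regular expression is copied from the hunk above; the wrapper function name and the sample chunk text are made up for illustration and do not come from a real build.

```typescript
// Same pattern as the build step: a quoted specifier ending in "-" plus 16 hex chars.
const HASH_RE = /(['"\\])([a-z@][a-z0-9@./_-]+-[0-9a-f]{16})\1/g;

// Returns the rewritten source and how many specifiers were rewritten.
function stripHashedExternals(src: string): { patched: string; count: number } {
  let count = 0;
  const patched = src.replace(HASH_RE, (_match: string, q: string, name: string) => {
    count++;
    // Drop only the trailing "-<16 hex chars>" and keep the original quote character.
    return `${q}${name.replace(/-[0-9a-f]{16}$/, "")}${q}`;
  });
  return { patched, count };
}

// Hypothetical chunk content: one hashed external and one ordinary require().
const sampleChunk =
  'const a = require("better-sqlite3-0123456789abcdef"); const b = require("express");';
const stripResult = stripHashedExternals(sampleChunk);
// stripResult.patched:
//   const a = require("better-sqlite3"); const b = require("express");
// stripResult.count: 1
```

One consequence of matching on shape alone: any quoted string that happens to end in a hyphen followed by 16 lowercase hex characters is rewritten, whether or not it is a module specifier.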
@@ -255,6 +255,22 @@ const PROVIDER_MODELS_CONFIG: Record<string, ProviderModelsConfigEntry> = {
authPrefix: "Bearer ",
parseResponse: (data) => data.models || data.data || [],
},
synthetic: {
url: "https://api.synthetic.new/openai/v1/models",
method: "GET",
headers: { "Content-Type": "application/json" },
authHeader: "Authorization",
authPrefix: "Bearer ",
parseResponse: (data) => data.data || data.models || [],
},
"kilo-gateway": {
url: "https://api.kilo.ai/api/gateway/models",
method: "GET",
headers: { "Content-Type": "application/json" },
authHeader: "Authorization",
authPrefix: "Bearer ",
parseResponse: (data) => data.data || data.models || [],
},
};
/**
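A config entry like the two added above carries enough information to build a model-list request and normalise the response. The sketch below shows one plausible way such an entry could be consumed; the interface name, `buildModelsRequest`, and the sample payloads are assumptions, and the code that actually reads `PROVIDER_MODELS_CONFIG` is not part of this diff.

```typescript
// Assumed shape, inferred from the fields visible in the diff.
interface ModelsEndpointSketch {
  url: string;
  method: string;
  headers: Record<string, string>;
  authHeader: string;
  authPrefix: string;
  parseResponse: (data: any) => unknown[];
}

// Entry copied from the diff.
const kiloGatewayModels: ModelsEndpointSketch = {
  url: "https://api.kilo.ai/api/gateway/models",
  method: "GET",
  headers: { "Content-Type": "application/json" },
  authHeader: "Authorization",
  authPrefix: "Bearer ",
  parseResponse: (data) => data.data || data.models || [],
};

// Hypothetical helper: turns an entry plus an API key into fetch-style arguments.
function buildModelsRequest(entry: ModelsEndpointSketch, apiKey: string) {
  return {
    url: entry.url,
    init: {
      method: entry.method,
      headers: { ...entry.headers, [entry.authHeader]: `${entry.authPrefix}${apiKey}` },
    },
  };
}

const modelsRequest = buildModelsRequest(kiloGatewayModels, "test-key");
// parseResponse accepts either an OpenAI-style { data: [...] } or a { models: [...] } payload.
const fromData = kiloGatewayModels.parseResponse({ data: [{ id: "example-model" }] });
const fromModels = kiloGatewayModels.parseResponse({ models: [{ id: "a" }, { id: "b" }] });
const fromEmpty = kiloGatewayModels.parseResponse({});
```

The order of the fallbacks matters only when a response contains both keys: the new entries prefer `data`, while the existing entry shown as context at the top of the hunk prefers `models`.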
+20
@@ -360,6 +360,26 @@ export const APIKEY_PROVIDERS = {
hasFree: true,
freeNote: "Free Inference API for thousands of models (Whisper, VITS, SDXL…)",
},
synthetic: {
id: "synthetic",
alias: "synthetic",
name: "Synthetic",
icon: "verified_user",
color: "#6366F1",
textIcon: "SY",
website: "https://synthetic.new",
passthroughModels: true,
},
"kilo-gateway": {
id: "kilo-gateway",
alias: "kg",
name: "Kilo Gateway",
icon: "hub",
color: "#617A91",
textIcon: "KG",
website: "https://kilo.ai",
passthroughModels: true,
},
vertex: {
id: "vertex",
alias: "vertex",
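Each provider entry carries both an `id` and a shorter `alias` (`kg` for Kilo Gateway). The following sketch shows how an alias index could map either form back to the provider id. It assumes models are addressed as `<alias>/<model>`, which this diff does not show; the helper names and the model id used below are hypothetical.

```typescript
interface ProviderEntrySketch {
  id: string;
  alias: string;
  name: string;
}

// Subset of the fields from the two entries added in the diff.
const PROVIDERS_SKETCH: Record<string, ProviderEntrySketch> = {
  synthetic: { id: "synthetic", alias: "synthetic", name: "Synthetic" },
  "kilo-gateway": { id: "kilo-gateway", alias: "kg", name: "Kilo Gateway" },
};

// Maps both the alias and the full id to the provider id.
function buildAliasIndex(providers: Record<string, ProviderEntrySketch>): Map<string, string> {
  const index = new Map<string, string>();
  for (const key of Object.keys(providers)) {
    const entry = providers[key];
    index.set(entry.alias, entry.id);
    index.set(entry.id, entry.id);
  }
  return index;
}

// Splits on the first "/" only, because upstream model ids may contain slashes themselves.
function resolveModelRef(
  ref: string,
  index: Map<string, string>
): { providerId: string; model: string } | null {
  const slash = ref.indexOf("/");
  if (slash <= 0) return null;
  const providerId = index.get(ref.slice(0, slash));
  const model = ref.slice(slash + 1);
  return providerId && model ? { providerId, model } : null;
}

const aliasIndex = buildAliasIndex(PROVIDERS_SKETCH);
const resolved = resolveModelRef("kg/vendor/example-model", aliasIndex);
// resolved: { providerId: "kilo-gateway", model: "vendor/example-model" }
```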
+2 -1
@@ -382,7 +382,8 @@ async function handleSingleModelChat(
credentials.connectionId,
result.status,
result.error,
-provider
+provider,
+model
);
if (shouldFallback) {
+19
@@ -14,6 +14,8 @@ import {
isModelLocked,
lockModel,
} from "@omniroute/open-sse/services/accountFallback.ts";
import { isLocalProvider } from "@omniroute/open-sse/config/providerRegistry.ts";
import { COOLDOWN_MS } from "@omniroute/open-sse/config/constants.ts";
import * as log from "../utils/logger";
import { fisherYatesShuffle, getNextFromDeckSync } from "@/shared/utils/shuffleDeck";
@@ -563,6 +565,23 @@ export async function markAccountUnavailable(
);
if (!shouldFallback) return { shouldFallback: false, cooldownMs: 0 };
// ── Local provider 404: model-only lockout, connection stays active ──
// Detection: URL-based only (apiKey===null heuristic was too broad — could match
// cloud providers with non-standard auth stored in providerSpecificData).
const connBaseUrl = (conn?.providerSpecificData as Record<string, unknown>)?.baseUrl as
| string
| undefined;
if (isLocalProvider(connBaseUrl) && status === 404 && provider && model) {
const localCooldown = COOLDOWN_MS.notFoundLocal;
lockModel(provider, connectionId, model, "local_not_found", localCooldown);
log.info(
"AUTH",
`Local 404 for ${model} — model-only lockout ${localCooldown / 1000}s (connection stays active)`
);
return { shouldFallback: true, cooldownMs: localCooldown };
}
const rateLimitedUntil = getUnavailableUntil(cooldownMs);
const errorMsg = typeof errorText === "string" ? errorText.slice(0, 100) : "Provider error";
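The local-404 branch above boils down to one rule: when the connection's base URL points at a local server and the upstream returns 404, lock only the (provider, connection, model) triple for a short time and leave the connection usable. The sketch below models that decision with an in-memory map. `isLocalProvider`, `lockModel`, and `COOLDOWN_MS.notFoundLocal` are not shown in this diff, so the URL pattern, the 5000 ms value (taken from the "5s" in the release notes), the provider and model names, and every function name here are assumptions.

```typescript
// Hypothetical URL-based check for loopback hosts; the real isLocalProvider may differ.
const LOCAL_URL_RE = /^https?:\/\/(localhost|127\.0\.0\.1|\[::1\])(:\d+)?(\/|$)/i;

function isLocalBaseUrl(baseUrl?: string): boolean {
  return typeof baseUrl === "string" && LOCAL_URL_RE.test(baseUrl);
}

// Assumed value, based on the "5s" lockout mentioned in the release notes.
const NOT_FOUND_LOCAL_MS = 5000;

// Lock expiry timestamps keyed by provider, connection, and model.
const modelLocks = new Map<string, number>();
const lockKey = (provider: string, connectionId: string, model: string) =>
  `${provider}:${connectionId}:${model}`;

function lockModelSketch(
  provider: string, connectionId: string, model: string, cooldownMs: number, now: number
): void {
  modelLocks.set(lockKey(provider, connectionId, model), now + cooldownMs);
}

function isModelLockedSketch(
  provider: string, connectionId: string, model: string, now: number
): boolean {
  const until = modelLocks.get(lockKey(provider, connectionId, model));
  return until !== undefined && now < until;
}

interface FailureDecision {
  shouldFallback: boolean;
  cooldownMs: number | null; // null: left to the regular connection-level path (not modeled)
  connectionDisabled: boolean;
}

function decideOnFailure(
  baseUrl: string | undefined, status: number,
  provider: string, connectionId: string, model: string, now: number
): FailureDecision {
  if (isLocalBaseUrl(baseUrl) && status === 404) {
    // Model-only lockout: other models on the same local connection stay available.
    lockModelSketch(provider, connectionId, model, NOT_FOUND_LOCAL_MS, now);
    return { shouldFallback: true, cooldownMs: NOT_FOUND_LOCAL_MS, connectionDisabled: false };
  }
  return { shouldFallback: true, cooldownMs: null, connectionDisabled: true };
}

const localDecision = decideOnFailure("http://localhost:11434/v1", 404, "local", "conn-1", "model-a", 1000);
const cloudDecision = decideOnFailure("https://api.kilo.ai/api/gateway", 404, "kilo-gateway", "conn-2", "model-a", 1000);
```

Passing `now` explicitly keeps the sketch deterministic; the time source used by the real `lockModel` is not visible in this diff.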