Compare commits
16 Commits
| SHA1 |
|---|
| a8ab16a720 |
| a00ef0fc7e |
| 5ce6d615a4 |
| e06b69cdac |
| d261ae7883 |
| 6fa77a63d7 |
| f76c1b32d6 |
| d98ec59c79 |
| d79b55be5a |
| 1f9a402dcd |
| f9bcc9418b |
| 08256a3502 |
| 9b255e643a |
| ca1f918e9e |
| bb3fe1cd48 |
| d139b4557f |
@@ -4,73 +4,81 @@ description: Deploy the latest OmniRoute code to the Akamai VPS (69.164.221.35)

# Deploy to VPS Workflow

Deploy OmniRoute to the production VPS using `npm install -g` + PM2.
Deploy OmniRoute to the production VPS using `npm pack + scp` + PM2.

**VPS:** `69.164.221.35` (Akamai, Ubuntu 24.04, 1GB RAM + 2.5GB swap)
**Local VPS:** `192.168.0.15` (same setup)
**Process manager:** PM2 (`omniroute`)
**Port:** `20128`
**PM2 entry:** `/usr/lib/node_modules/omniroute/app/server.js`

> [!IMPORTANT]
> PM2 runs from the global npm package at `/usr/lib/node_modules/omniroute`.
> **DO NOT** use git clone or local copies. The `npm install -g` command handles
> building, publishing, and installing the standalone app in one step.
> The Next.js standalone build is at `app/server.js` inside that directory.
> The npm registry rejects packages > 100MB, so deployment uses **npm pack + scp**.

## Steps

### 1. Publish to npm
### 1. Build + pack locally

Ensure the version in `package.json` is bumped and the package is published:
Run the full build (includes hash-strip patch) and create the .tgz:

// turbo

```bash
npm publish
cd /home/diegosouzapw/dev/proxys/9router && npm run build:cli && npm pack --ignore-scripts
```

### 2. Install on VPS and restart PM2
### 2. Copy to both VPS and install

// turbo-all

```bash
ssh root@69.164.221.35 "npm install -g omniroute@latest && pm2 restart omniroute && pm2 save && echo '✅ Deploy complete!'"
scp omniroute-*.tgz root@69.164.221.35:/tmp/ && scp omniroute-*.tgz root@192.168.0.15:/tmp/
```

For the local VPS:
```bash
ssh root@69.164.221.35 "npm install -g /tmp/omniroute-*.tgz --ignore-scripts && pm2 restart omniroute && pm2 save && echo '✅ Akamai done'"
```

```bash
ssh root@192.168.0.15 "npm install -g omniroute@latest && pm2 restart omniroute && pm2 save && echo '✅ Deploy complete!'"
ssh root@192.168.0.15 "npm install -g /tmp/omniroute-*.tgz --ignore-scripts && pm2 restart omniroute && pm2 save && echo '✅ Local done'"
```

### 3. Verify the deployment

```bash
ssh root@69.164.221.35 "pm2 list && cat \$(npm root -g)/omniroute/package.json | grep version | head -1 && curl -s -o /dev/null -w 'HTTP %{http_code}' http://localhost:20128/"
ssh root@69.164.221.35 "pm2 list && cat \$(npm root -g)/omniroute/app/package.json | grep version | head -1 && curl -s -o /dev/null -w 'HTTP %{http_code}' http://localhost:20128/"
```

Expected: PM2 shows `online`, version matches published, HTTP returns `307` (redirect to login).
Expected: PM2 shows `online`, version matches, HTTP returns `307`.

## How it works

1. `npm publish` builds Next.js standalone + bundles everything into the npm package
2. `npm install -g omniroute@latest` downloads and installs to `/usr/lib/node_modules/omniroute/`
3. PM2 is registered to run `npm start` from that directory (cwd: `/usr/lib/node_modules/omniroute`)
4. `pm2 restart omniroute` picks up the new code immediately
1. `npm run build:cli` builds Next.js standalone → `app/` and strips Turbopack hashed require() calls from chunks
2. `npm pack --ignore-scripts` packages without re-running the build
3. `scp` transfers the .tgz to each VPS (~286MB)
4. `npm install -g /tmp/omniroute-*.tgz --ignore-scripts` installs pre-built package
5. PM2 runs `app/server.js` from `/usr/lib/node_modules/omniroute`

## PM2 Setup (one-time)

If PM2 needs to be reconfigured from scratch:
## PM2 Setup (one-time — if reconfiguring from scratch)

```bash
ssh root@<VPS> "
cd /usr/lib/node_modules/omniroute &&
PORT=20128 pm2 start app/server.js --name omniroute --env PORT=20128 &&
pm2 save &&
pm2 startup
pm2 delete omniroute ;
cp /opt/omniroute-app/.env /usr/lib/node_modules/omniroute/.env &&
PORT=20128 pm2 start /usr/lib/node_modules/omniroute/app/server.js --name omniroute --cwd /usr/lib/node_modules/omniroute/app &&
pm2 save && pm2 startup
"
```

> [!NOTE]
> Copy `.env` from the old installation first. For Akamai it was at `/opt/omniroute-app/.env`,
> for the local VPS it was at `/root/omniroute-fresh/.env`.

## Notes

- The `.env` file is at `/usr/lib/node_modules/omniroute/.env`. Back it up before major npm updates.
- PM2 is configured with `pm2 startup` to auto-restart on reboot.
- Nginx proxies `omniroute.online` → `localhost:20128`.
- The VPS has only 1GB RAM — builds happen locally via `npm publish`, not on the VPS.
- `.env` should be placed at `/usr/lib/node_modules/omniroute/app/.env`
- PM2 is configured with `pm2 startup` to auto-restart on reboot
- Nginx proxies `omniroute.online` → `localhost:20128`
- The VPS has only 1GB RAM — builds happen locally, never on the VPS

@@ -4,6 +4,61 @@

---

## [2.6.5] — 2026-03-17

> Sprint: reasoning model param filtering, local provider 404 fix, Kilo Gateway provider, dependency bumps.

### ✨ New Features

- **feat(api)**: Added **Kilo Gateway** (`api.kilo.ai`) as a new API Key provider (alias `kg`) — 335+ models, 6 free models, 3 auto-routing models (`kilo-auto/frontier`, `kilo-auto/balanced`, `kilo-auto/free`). Passthrough models supported via `/api/gateway/models` endpoint. (PR #408 by @Regis-RCR)

### 🐛 Bug Fixes

- **fix(sse)**: Strip unsupported parameters for reasoning models (o1, o1-mini, o1-pro, o3, o3-mini). Models in the `o1`/`o3` family reject `temperature`, `top_p`, `frequency_penalty`, `presence_penalty`, `logprobs`, `top_logprobs`, and `n` with HTTP 400. Parameters are now stripped at the `chatCore` layer before forwarding. Uses a declarative `unsupportedParams` field per model and a precomputed O(1) Map for lookup. (PR #412 by @Regis-RCR)
- **fix(sse)**: Local provider 404 now results in a **model-only lockout (5 seconds)** instead of a connection-level lockout (2 minutes). When a local inference backend (Ollama, LM Studio, oMLX) returns 404 for an unknown model, the connection remains active and other models continue working immediately. Also fixes a pre-existing bug where `model` was not passed to `markAccountUnavailable()`. Local providers detected via hostname (`localhost`, `127.0.0.1`, `::1`, extensible via `LOCAL_HOSTNAMES` env var). (PR #410 by @Regis-RCR)

### 📦 Dependencies

- `better-sqlite3` 12.6.2 → 12.8.0
- `undici` 7.24.2 → 7.24.4
- `https-proxy-agent` 7 → 8
- `agent-base` 7 → 8

---

## [2.6.4] — 2026-03-17

### 🐛 Bug Fixes

- **fix(providers)**: Removed non-existent model names across 5 providers:
  - **gemini / gemini-cli**: removed `gemini-3.1-pro/flash` and `gemini-3-*-preview` (don't exist in Google API v1beta); replaced with `gemini-2.5-pro`, `gemini-2.5-flash`, `gemini-2.0-flash`, `gemini-1.5-pro/flash`
  - **antigravity**: removed `gemini-3.1-pro-high/low` and `gemini-3-flash` (invalid internal aliases); replaced with real 2.x models
  - **github (Copilot)**: removed `gemini-3-flash-preview` and `gemini-3-pro-preview`; replaced with `gemini-2.5-flash`
  - **nvidia**: corrected `nvidia/llama-3.3-70b-instruct` → `meta/llama-3.3-70b-instruct` (NVIDIA NIM uses `meta/` namespace for Meta models); added `nvidia/llama-3.1-70b-instruct` and `nvidia/llama-3.1-405b-instruct`
- **fix(db/combo)**: Updated `free-stack` combo on remote DB: removed `qw/qwen3-coder-plus` (expired refresh token), corrected `nvidia/llama-3.3-70b-instruct` → `nvidia/meta/llama-3.3-70b-instruct`, corrected `gemini/gemini-3.1-flash` → `gemini/gemini-2.5-flash`, added `if/deepseek-v3.2`

---

## [2.6.3] — 2026-03-16

> Sprint: zod/pino hash-strip baked into build pipeline, Synthetic provider added, VPS PM2 path corrected.

### 🐛 Bug Fixes

- **fix(build)**: Turbopack hash-strip now runs at **compile time** for ALL packages — not just `better-sqlite3`. Step 5.6 in `prepublish.mjs` walks every `.js` in `app/.next/server/` and strips the 16-char hex suffix from any hashed `require()`. Fixes `zod-dcb22c...`, `pino-...`, etc. MODULE_NOT_FOUND on global npm installs. Closes #398
- **fix(deploy)**: PM2 on both VPS was pointing to stale git-clone directories. Reconfigured to `app/server.js` in the npm global package. Updated `/deploy-vps` workflow to use `npm pack + scp` (npm registry rejects 299MB packages).

### ✨ Features

- **feat(provider)**: Synthetic ([synthetic.new](https://synthetic.new)) — privacy-focused OpenAI-compatible inference. `passthroughModels: true` for dynamic HuggingFace model catalog. Initial models: Kimi K2.5, MiniMax M2.5, GLM 4.7, DeepSeek V3.2. (PR #404 by @Regis-RCR)

### 📋 Issues Closed

- **close #398**: npm hash regression — fixed by compile-time hash-strip in prepublish
- **triage #324**: Bug screenshot without steps — requested reproduction details

---

## [2.6.2] — 2026-03-16

> Sprint: module hashing fully fixed, 2 PRs merged (Anthropic tools filter + custom endpoint paths), Alibaba Cloud DashScope provider added, 3 stale issues closed.

+1
-1
@@ -1,7 +1,7 @@
 openapi: 3.1.0
 info:
   title: OmniRoute API
-  version: 2.6.2
+  version: 2.6.5
   description: |
     OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
     endpoint that routes requests to multiple AI providers with load balancing,

@@ -135,6 +135,7 @@ export const COOLDOWN_MS = {
   unauthorized: 2 * 60 * 1000, // 401 → 2 min
   paymentRequired: 2 * 60 * 1000, // 402/403 → 2 min
   notFound: 2 * 60 * 1000, // 404 → 2 minutes
+  notFoundLocal: 5 * 1000, // 404 on local provider → 5s model-only lockout (connection stays active)
   transientInitial: 5 * 1000, // 408/500/502/503/504 first hit → 5s (backoff from here)
   transientMax: 60 * 1000, // 502/503/504 backoff ceiling → 60s
   transient: 5 * 1000, // Legacy alias → points to transientInitial
@@ -162,6 +163,16 @@ export const PROVIDER_PROFILES = {
     circuitBreakerThreshold: 5, // More tolerant (occasional 502 is normal)
     circuitBreakerReset: 30000, // 30s reset
   },
+  // Local providers (localhost inference backends like Ollama, LM Studio, oMLX).
+  // Not yet wired into getProviderProfile() — will be used when local provider_nodes
+  // are integrated into the resilience layer. Kept here to avoid a second constants change.
+  local: {
+    transientCooldown: 2000, // 2s (local — very fast recovery)
+    rateLimitCooldown: 5000, // 5s (local — no real rate limits)
+    maxBackoffLevel: 3, // Low ceiling (local either works or doesn't)
+    circuitBreakerThreshold: 2, // Opens fast (if local is down, it's down)
+    circuitBreakerReset: 15000, // 15s reset (check again quickly)
+  },
 };

 // Default rate limit values for API Key providers (auto-enabled safety net)

@@ -12,8 +12,21 @@ export interface RegistryModel {
   id: string;
   name: string;
   targetFormat?: string;
+  unsupportedParams?: readonly string[];
 }

+// Reasoning models reject temperature, top_p, penalties, logprobs, n.
+// Frozen to prevent accidental mutation (shared across all model entries).
+const REASONING_UNSUPPORTED: readonly string[] = Object.freeze([
+  "temperature",
+  "top_p",
+  "frequency_penalty",
+  "presence_penalty",
+  "logprobs",
+  "top_logprobs",
+  "n",
+]);
+
 export interface RegistryOAuth {
   clientIdEnv?: string;
   clientIdDefault?: string;
@@ -126,13 +139,13 @@ export const REGISTRY: Record<string, RegistryEntry> = {
       clientSecretDefault: "",
     },
     models: [
-      { id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
-      { id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
-      { id: "gemini-3-pro-preview", name: "Gemini 3.0 Pro Preview" },
-      { id: "gemini-3-flash-preview", name: "Gemini 3.0 Flash Preview" },
       { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
       { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
       { id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
+      { id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
+      { id: "gemini-2.0-flash-exp", name: "Gemini 2.0 Flash Exp" },
+      { id: "gemini-1.5-pro", name: "Gemini 1.5 Pro" },
+      { id: "gemini-1.5-flash", name: "Gemini 1.5 Flash" },
     ],
   },

@@ -155,13 +168,12 @@ export const REGISTRY: Record<string, RegistryEntry> = {
       clientSecretDefault: "",
     },
     models: [
-      { id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
-      { id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
-      { id: "gemini-3-flash-preview", name: "Gemini 3.0 Flash Preview" },
-      { id: "gemini-3-pro-preview", name: "Gemini 3.0 Pro Preview" },
       { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
       { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
       { id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
+      { id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
+      { id: "gemini-1.5-pro", name: "Gemini 1.5 Pro" },
+      { id: "gemini-1.5-flash", name: "Gemini 1.5 Flash" },
     ],
   },

@@ -305,10 +317,9 @@ export const REGISTRY: Record<string, RegistryEntry> = {
     models: [
       { id: "claude-opus-4-6-thinking", name: "Claude Opus 4.6 Thinking" },
       { id: "claude-sonnet-4-6", name: "Claude Sonnet 4.6" },
-      { id: "gemini-3.1-pro-high", name: "Gemini 3.1 Pro High" },
-      { id: "gemini-3.1-pro-low", name: "Gemini 3.1 Pro Low" },
-      { id: "gemini-3.1-flash", name: "Gemini 3.1 Flash" },
-      { id: "gemini-3-flash", name: "Gemini 3.0 Flash" },
+      { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
+      { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
+      { id: "gemini-2.0-flash", name: "Gemini 2.0 Flash" },
       { id: "gpt-oss-120b-medium", name: "GPT OSS 120B Medium" },
     ],
   },
@@ -356,8 +367,7 @@ export const REGISTRY: Record<string, RegistryEntry> = {
       { id: "claude-sonnet-4", name: "Claude Sonnet 4" },
       { id: "claude-sonnet-4.5", name: "Claude Sonnet 4.5" },
       { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
-      { id: "gemini-3-flash-preview", name: "Gemini 3 Flash Preview" },
-      { id: "gemini-3-pro-preview", name: "Gemini 3 Pro Preview" },
+      { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
       { id: "grok-code-fast-1", name: "Grok Code Fast 1" },
       { id: "oswe-vscode-prime", name: "Raptor Mini" },
     ],
@@ -429,8 +439,11 @@ export const REGISTRY: Record<string, RegistryEntry> = {
       { id: "gpt-4o", name: "GPT-4o" },
       { id: "gpt-4o-mini", name: "GPT-4o Mini" },
       { id: "gpt-4-turbo", name: "GPT-4 Turbo" },
-      { id: "o1", name: "O1" },
-      { id: "o1-mini", name: "O1 Mini" },
+      { id: "o1", name: "O1", unsupportedParams: REASONING_UNSUPPORTED },
+      { id: "o1-mini", name: "O1 Mini", unsupportedParams: REASONING_UNSUPPORTED },
+      { id: "o1-pro", name: "O1 Pro", unsupportedParams: REASONING_UNSUPPORTED },
+      { id: "o3", name: "O3", unsupportedParams: REASONING_UNSUPPORTED },
+      { id: "o3-mini", name: "O3 Mini", unsupportedParams: REASONING_UNSUPPORTED },
     ],
   },

@@ -836,12 +849,14 @@ export const REGISTRY: Record<string, RegistryEntry> = {
     authType: "apikey",
     authHeader: "bearer",
     models: [
+      { id: "meta/llama-3.3-70b-instruct", name: "Llama 3.3 70B" },
+      { id: "meta/llama-4-maverick-17b-128e-instruct", name: "Llama 4 Maverick" },
       { id: "moonshotai/kimi-k2.5", name: "Kimi K2.5" },
       { id: "z-ai/glm4.7", name: "GLM 4.7" },
       { id: "deepseek-ai/deepseek-v3.2", name: "DeepSeek V3.2" },
-      { id: "nvidia/llama-3.3-70b-instruct", name: "Llama 3.3 70B" },
-      { id: "meta/llama-4-maverick-17b-128e-instruct", name: "Llama 4 Maverick" },
       { id: "deepseek/deepseek-r1", name: "DeepSeek R1" },
+      { id: "nvidia/llama-3.1-70b-instruct", name: "Llama 3.1 70B" },
+      { id: "nvidia/llama-3.1-405b-instruct", name: "Llama 3.1 405B" },
     ],
   },

@@ -919,6 +934,46 @@ export const REGISTRY: Record<string, RegistryEntry> = {
     ],
   },

+  synthetic: {
+    id: "synthetic",
+    alias: "synthetic",
+    format: "openai",
+    executor: "default",
+    baseUrl: "https://api.synthetic.new/openai/v1/chat/completions",
+    modelsUrl: "https://api.synthetic.new/openai/v1/models",
+    authType: "apikey",
+    authHeader: "bearer",
+    models: [
+      { id: "hf:nvidia/Kimi-K2.5-NVFP4", name: "Kimi K2.5 (NVFP4)" },
+      { id: "hf:MiniMaxAI/MiniMax-M2.5", name: "MiniMax M2.5" },
+      { id: "hf:zai-org/GLM-4.7-Flash", name: "GLM 4.7 Flash" },
+      { id: "hf:zai-org/GLM-4.7", name: "GLM 4.7" },
+      { id: "hf:moonshotai/Kimi-K2.5", name: "Kimi K2.5" },
+      { id: "hf:deepseek-ai/DeepSeek-V3.2", name: "DeepSeek V3.2" },
+    ],
+    passthroughModels: true,
+  },
+
+  "kilo-gateway": {
+    id: "kilo-gateway",
+    alias: "kg",
+    format: "openai",
+    executor: "default",
+    baseUrl: "https://api.kilo.ai/api/gateway/chat/completions",
+    modelsUrl: "https://api.kilo.ai/api/gateway/models",
+    authType: "apikey",
+    authHeader: "bearer",
+    models: [
+      { id: "kilo-auto/frontier", name: "Kilo Auto Frontier" },
+      { id: "kilo-auto/balanced", name: "Kilo Auto Balanced" },
+      { id: "kilo-auto/free", name: "Kilo Auto Free" },
+      { id: "nvidia/nemotron-3-super-120b-a12b:free", name: "Nemotron 3 Super 120B (Free)" },
+      { id: "minimax/minimax-m2.5:free", name: "MiniMax M2.5 (Free)" },
+      { id: "arcee-ai/trinity-large-preview:free", name: "Trinity Large Preview (Free)" },
+    ],
+    passthroughModels: true,
+  },
+
   vertex: {
     id: "vertex",
     alias: "vertex",
@@ -1022,6 +1077,38 @@ export function generateAliasMap(): Record<string, string> {
   return map;
 }

+// ── Local Provider Detection ──────────────────────────────────────────────
+
+// Evaluated once at module load time — process restart required for env var changes.
+const LOCAL_HOSTNAMES = new Set([
+  "localhost",
+  "127.0.0.1",
+  "::1",
+  "[::1]",
+  ...(typeof process !== "undefined" && process.env.LOCAL_HOSTNAMES
+    ? process.env.LOCAL_HOSTNAMES.split(",")
+        .map((h) => h.trim())
+        .filter(Boolean)
+    : []),
+]);
+
+/**
+ * Detect if a base URL points to a local inference backend.
+ * Used for shorter 404 cooldowns (model-only, not connection) and health check targets.
+ *
+ * Operators can extend via LOCAL_HOSTNAMES env var (comma-separated) for Docker
+ * hostnames (e.g., LOCAL_HOSTNAMES=omlx,mlx-audio).
+ */
+export function isLocalProvider(baseUrl?: string | null): boolean {
+  if (!baseUrl) return false;
+  try {
+    const url = new URL(baseUrl);
+    return LOCAL_HOSTNAMES.has(url.hostname);
+  } catch {
+    return false;
+  }
+}
+
 // ── Registry Lookup Helpers ───────────────────────────────────────────────

 const _byAlias = new Map<string, RegistryEntry>();
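The hostname check in the hunk above can be tried in isolation. The following is a minimal sketch, not the project's module: `buildLocalHostnames` and `isLocal` are hypothetical names, and the `LOCAL_HOSTNAMES` environment variable is modeled as a plain string argument so the snippet has no dependency on `process`. It illustrates why the set lists both `::1` and `[::1]`: the WHATWG `URL` parser reports IPv6 hostnames with their brackets.

```typescript
// Hypothetical stand-in for the LOCAL_HOSTNAMES set; the env var is passed as a string.
function buildLocalHostnames(extra?: string): Set<string> {
  const extras = extra
    ? extra
        .split(",")
        .map((h) => h.trim())
        .filter(Boolean)
    : [];
  return new Set(["localhost", "127.0.0.1", "::1", "[::1]", ...extras]);
}

// Same shape as isLocalProvider(): parse the URL, then test its hostname.
function isLocal(hosts: Set<string>, baseUrl?: string | null): boolean {
  if (!baseUrl) return false;
  try {
    return hosts.has(new URL(baseUrl).hostname);
  } catch {
    return false; // unparsable URL is treated as non-local
  }
}

const hosts = buildLocalHostnames("omlx, mlx-audio");
console.log(new URL("http://[::1]:1234/v1").hostname); // "[::1]" (brackets kept)
console.log(isLocal(hosts, "http://localhost:11434/v1")); // true
console.log(isLocal(hosts, "http://[::1]:1234/v1")); // true
console.log(isLocal(hosts, "http://omlx:8000/v1")); // true
console.log(isLocal(hosts, "https://api.kilo.ai/api/gateway")); // false
console.log(isLocal(hosts, "not a url")); // false
```

Because the hostname comes from a parsed URL, the bare `::1` entry would only match a value that never passed through `URL`; the bracketed entry is the one that matches here.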
@@ -1041,6 +1128,43 @@ export function getRegisteredProviders(): string[] {
   return Object.keys(REGISTRY);
 }

+// Precomputed map: modelId → unsupportedParams (O(1) lookup instead of O(N×M) scan).
+// Built once at module load from all registry entries.
+const _unsupportedParamsMap = new Map<string, readonly string[]>();
+for (const entry of Object.values(REGISTRY)) {
+  for (const model of entry.models) {
+    if (model.unsupportedParams && !_unsupportedParamsMap.has(model.id)) {
+      _unsupportedParamsMap.set(model.id, model.unsupportedParams);
+    }
+  }
+}
+
+/**
+ * Get unsupported parameters for a specific model.
+ * Uses O(1) precomputed lookup. Also handles prefixed model IDs
+ * (e.g., "openai/o3" → strips prefix and looks up "o3").
+ * Returns empty array if no restrictions are defined.
+ */
+export function getUnsupportedParams(provider: string, modelId: string): readonly string[] {
+  // 1. Check current provider's registry (exact match)
+  const entry = getRegistryEntry(provider);
+  const modelEntry = entry?.models.find((m) => m.id === modelId);
+  if (modelEntry?.unsupportedParams) return modelEntry.unsupportedParams;
+
+  // 2. O(1) lookup in precomputed map (handles cross-provider routing)
+  const cached = _unsupportedParamsMap.get(modelId);
+  if (cached) return cached;
+
+  // 3. Handle prefixed model IDs (e.g., "openai/o3" → "o3")
+  if (modelId.includes("/")) {
+    const bareId = modelId.split("/").pop() || "";
+    const bare = _unsupportedParamsMap.get(bareId);
+    if (bare) return bare;
+  }
+
+  return [];
+}
+
 /**
  * Get provider category: "oauth" or "apikey"
  * Used by the resilience layer to apply different cooldown/backoff profiles.

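The three-step lookup above (provider entry, precomputed map, bare ID after the last `/`) can be exercised with a toy registry. This is a sketch under stated assumptions: the two-provider `registry`, the shortened `REASONING` list, and the `lookup` name are all made up for illustration and are far smaller than the real `REGISTRY`.

```typescript
interface Model {
  id: string;
  unsupportedParams?: readonly string[];
}

// Shortened, hypothetical parameter list shared by reasoning models.
const REASONING: readonly string[] = Object.freeze(["temperature", "top_p", "n"]);

// Hypothetical registry: only "openai" declares restrictions.
const registry: Record<string, { models: Model[] }> = {
  openai: { models: [{ id: "gpt-4o" }, { id: "o3", unsupportedParams: REASONING }] },
  other: { models: [{ id: "some-model" }] },
};

// Built once: first declaration of a model ID wins.
const byModel = new Map<string, readonly string[]>();
for (const entry of Object.values(registry)) {
  for (const m of entry.models) {
    if (m.unsupportedParams && !byModel.has(m.id)) byModel.set(m.id, m.unsupportedParams);
  }
}

function lookup(provider: string, modelId: string): readonly string[] {
  // 1. exact match inside the requesting provider
  const own = registry[provider]?.models.find((m) => m.id === modelId);
  if (own?.unsupportedParams) return own.unsupportedParams;
  // 2. any provider that declared this model ID
  const direct = byModel.get(modelId);
  if (direct) return direct;
  // 3. prefixed ID: keep only the part after the last "/"
  if (modelId.includes("/")) {
    const bare = byModel.get(modelId.split("/").pop() || "");
    if (bare) return bare;
  }
  return [];
}

console.log(lookup("openai", "o3").length); // 3
console.log(lookup("other", "o3").length); // 3 (cross-provider hit via the map)
console.log(lookup("other", "openai/o3").length); // 3 (prefix stripped)
console.log(lookup("openai", "gpt-4o").length); // 0
```

One consequence of step 3 worth keeping in mind: any prefixed ID whose last segment equals a restricted model ID inherits that model's list, whatever the prefix is.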
@@ -13,6 +13,7 @@ import { refreshWithRetry } from "../services/tokenRefresh.ts";
 import { createRequestLogger } from "../utils/requestLogger.ts";
 import { getModelTargetFormat, PROVIDER_ID_TO_ALIAS } from "../config/providerModels.ts";
 import { resolveModelAlias } from "../services/modelDeprecation.ts";
+import { getUnsupportedParams } from "../config/providerRegistry.ts";
 import { createErrorResult, parseUpstreamError, formatProviderError } from "../utils/error.ts";
 import { HTTP_STATUS } from "../config/constants.ts";
 import { handleBypassRequest } from "../utils/bypassHandler.ts";
@@ -53,7 +54,9 @@ export function shouldUseNativeCodexPassthrough({
 }): boolean {
   if (provider !== "codex") return false;
   if (sourceFormat !== FORMATS.OPENAI_RESPONSES) return false;
-  return String(endpointPath || "").toLowerCase().endsWith("/responses");
+  return String(endpointPath || "")
+    .toLowerCase()
+    .endsWith("/responses");
 }

 /**
@@ -287,6 +290,21 @@ export async function handleChatCore({
   // Update model in body
   translatedBody.model = model;

+  // Strip unsupported parameters for reasoning models (o1, o3, etc.)
+  const unsupported = getUnsupportedParams(provider, model);
+  if (unsupported.length > 0) {
+    const stripped: string[] = [];
+    for (const param of unsupported) {
+      if (Object.hasOwn(translatedBody, param)) {
+        stripped.push(param);
+        delete translatedBody[param];
+      }
+    }
+    if (stripped.length > 0) {
+      log?.warn?.("PARAMS", `Stripped unsupported params for ${model}: ${stripped.join(", ")}`);
+    }
+  }
+
   // Get executor for this provider
   const executor = getExecutor(provider);

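The stripping loop in the last hunk can be illustrated on a plain request body. The sketch below assumes a hypothetical helper name, `stripParams`, and uses `Object.prototype.hasOwnProperty.call`, which behaves like `Object.hasOwn` for plain objects but does not require an ES2022 library target.

```typescript
// Removes each unsupported key that is present on the body (mutates in place)
// and returns the names that were actually removed, for logging.
function stripParams(body: Record<string, unknown>, unsupported: readonly string[]): string[] {
  const stripped: string[] = [];
  for (const param of unsupported) {
    if (Object.prototype.hasOwnProperty.call(body, param)) {
      stripped.push(param);
      delete body[param];
    }
  }
  return stripped;
}

const body: Record<string, unknown> = { model: "o3", temperature: 0, n: 1, messages: [] };
const removed = stripParams(body, ["temperature", "top_p", "n"]);

console.log(removed); // [ 'temperature', 'n' ]
console.log(Object.keys(body)); // [ 'model', 'messages' ]
```

The check is on key presence rather than truthiness, so `temperature: 0` is removed as well; a truthiness test would have let it through to the upstream API.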
Generated
+23
-24
@@ -1,12 +1,12 @@
|
||||
{
|
||||
"name": "omniroute",
|
||||
"version": "2.6.2",
|
||||
"version": "2.6.5",
|
||||
"lockfileVersion": 3,
|
||||
"requires": true,
|
||||
"packages": {
|
||||
"": {
|
||||
"name": "omniroute",
|
||||
"version": "2.6.2",
|
||||
"version": "2.6.5",
|
||||
"hasInstallScript": true,
|
||||
"license": "MIT",
|
||||
"workspaces": [
|
||||
@@ -23,7 +23,7 @@
|
||||
"express": "^5.2.1",
|
||||
"fetch-socks": "^1.3.2",
|
||||
"http-proxy-middleware": "^3.0.5",
|
||||
"https-proxy-agent": "^7.0.6",
|
||||
"https-proxy-agent": "^8.0.0",
|
||||
"jose": "^6.1.3",
|
||||
"lowdb": "^7.0.1",
|
||||
"monaco-editor": "^0.55.1",
|
||||
@@ -4236,9 +4236,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/agent-base": {
|
||||
"version": "7.1.4",
|
||||
"resolved": "https://registry.npmjs.org/agent-base/-/agent-base-7.1.4.tgz",
|
||||
"integrity": "sha512-MnA+YT8fwfJPgBx3m60MNqakm30XOkyIoH1y6huTQvC0PwZG7ki8NacLBcrPbNoo8vEZy7Jpuk7+jMO+CUovTQ==",
|
||||
"version": "8.0.0",
|
||||
"resolved": "https://registry.npmjs.org/agent-base/-/agent-base-8.0.0.tgz",
|
||||
"integrity": "sha512-QT8i0hCz6C/KQ+KTAbSNwCHDGdmUJl2tp2ZpNlGSWCfhUNVbYG2WLE3MdZGBAgXPV4GAvjGMxo+C1hroyxmZEg==",
|
||||
"license": "MIT",
|
||||
"engines": {
|
||||
"node": ">= 14"
|
||||
@@ -4672,9 +4672,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/better-sqlite3": {
|
||||
"version": "12.6.2",
|
||||
"resolved": "https://registry.npmjs.org/better-sqlite3/-/better-sqlite3-12.6.2.tgz",
|
||||
"integrity": "sha512-8VYKM3MjCa9WcaSAI3hzwhmyHVlH8tiGFwf0RlTsZPWJ1I5MkzjiudCo4KC4DxOaL/53A5B1sI/IbldNFDbsKA==",
|
||||
"version": "12.8.0",
|
||||
"resolved": "https://registry.npmjs.org/better-sqlite3/-/better-sqlite3-12.8.0.tgz",
|
||||
"integrity": "sha512-RxD2Vd96sQDjQr20kdP+F+dK/1OUNiVOl200vKBZY8u0vTwysfolF6Hq+3ZK2+h8My9YvZhHsF+RSGZW2VYrPQ==",
|
||||
"hasInstallScript": true,
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
@@ -6866,7 +6866,6 @@
|
||||
"version": "2.3.2",
|
||||
"resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz",
|
||||
"integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
|
||||
"dev": true,
|
||||
"hasInstallScript": true,
|
||||
"license": "MIT",
|
||||
"optional": true,
|
||||
@@ -7270,13 +7269,13 @@
|
||||
}
|
||||
},
|
||||
"node_modules/https-proxy-agent": {
|
||||
"version": "7.0.6",
|
||||
"resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-7.0.6.tgz",
|
||||
"integrity": "sha512-vK9P5/iUfdl95AI+JVyUuIcVtd4ofvtrOr3HNtM2yxC9bnMbEdp3x01OhQNnjb8IJYi38VlTE3mBXwcfvywuSw==",
|
||||
"version": "8.0.0",
|
||||
"resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-8.0.0.tgz",
|
||||
"integrity": "sha512-YYeW+iCnAS3xhvj2dvVoWgsbca3RfQy/IlaNHHOtDmU0jMqPI9euIq3Y9BJETdxk16h9NHHCKqp/KB9nIMStCQ==",
|
||||
"license": "MIT",
|
||||
"dependencies": {
|
||||
"agent-base": "^7.1.2",
|
||||
"debug": "4"
|
||||
"agent-base": "8.0.0",
|
||||
"debug": "^4.3.4"
|
||||
},
|
||||
"engines": {
|
||||
"node": ">= 14"
|
||||
@@ -11524,9 +11523,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/undici": {
|
||||
"version": "7.24.2",
|
||||
"resolved": "https://registry.npmjs.org/undici/-/undici-7.24.2.tgz",
|
||||
"integrity": "sha512-P9J1HWYV/ajFr8uCqk5QixwiRKmB1wOamgS0e+o2Z4A44Ej2+thFVRLG/eA7qprx88XXhnV5Bl8LHXTURpzB3Q==",
|
||||
"version": "7.24.4",
|
||||
"resolved": "https://registry.npmjs.org/undici/-/undici-7.24.4.tgz",
|
||||
"integrity": "sha512-BM/JzwwaRXxrLdElV2Uo6cTLEjhSb3WXboncJamZ15NgUURmvlXvxa6xkwIOILIjPNo9i8ku136ZvWV0Uly8+w==",
|
||||
"license": "MIT",
|
||||
"engines": {
|
||||
"node": ">=20.18.1"
|
||||
@@ -12129,9 +12128,9 @@
|
||||
"license": "ISC"
|
||||
},
|
||||
"node_modules/wreq-js": {
|
||||
"version": "2.1.1",
|
||||
"resolved": "https://registry.npmjs.org/wreq-js/-/wreq-js-2.1.1.tgz",
|
||||
"integrity": "sha512-nJBOMBTczqcyHpF8a8YdPyxb30htK2RxuAfr6O8a6oyKHj2nRPjXbZcGXrquIdZx1b+6NV/GHweD3OqWwE7n4A==",
|
||||
"version": "2.2.0",
|
||||
"resolved": "https://registry.npmjs.org/wreq-js/-/wreq-js-2.2.0.tgz",
|
||||
"integrity": "sha512-lXW1/bvdPTpFMdfBftkJIp6OzxkAqAON4dlrKrmaFNT86eu60VCEVmEdK3nWY1ZyiEZ6IXQPRrc1uXG394BoBA==",
|
||||
"cpu": [
|
||||
"x64",
|
||||
"arm64"
|
||||
@@ -12336,9 +12335,9 @@
|
||||
}
|
||||
},
|
||||
"node_modules/zustand": {
|
||||
"version": "5.0.11",
|
||||
"resolved": "https://registry.npmjs.org/zustand/-/zustand-5.0.11.tgz",
|
||||
"integrity": "sha512-fdZY+dk7zn/vbWNCYmzZULHRrss0jx5pPFiOuMZ/5HJN6Yv3u+1Wswy/4MpZEkEGhtNH+pwxZB8OKgUBPzYAGg==",
|
||||
"version": "5.0.12",
|
||||
"resolved": "https://registry.npmjs.org/zustand/-/zustand-5.0.12.tgz",
|
||||
"integrity": "sha512-i77ae3aZq4dhMlRhJVCYgMLKuSiZAaUPAct2AksxQ+gOtimhGMdXljRT21P5BNpeT4kXlLIckvkPM029OljD7g==",
|
||||
"license": "MIT",
|
||||
"engines": {
|
||||
"node": ">=12.20.0"
|
||||
|
||||
+2
-2
@@ -1,6 +1,6 @@
 {
   "name": "omniroute",
-  "version": "2.6.2",
+  "version": "2.6.5",
   "description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
   "type": "module",
   "bin": {
@@ -90,7 +90,7 @@
     "express": "^5.2.1",
     "fetch-socks": "^1.3.2",
     "http-proxy-middleware": "^3.0.5",
-    "https-proxy-agent": "^7.0.6",
+    "https-proxy-agent": "^8.0.0",
     "jose": "^6.1.3",
     "lowdb": "^7.0.1",
     "monaco-editor": "^0.55.1",

Binary file not shown.
After: Size: 472 B
@@ -142,6 +142,62 @@ if (sanitisedCount > 0) {
   console.log(" ℹ️ No hardcoded paths found to sanitise");
 }

+// ── Step 5.6: Strip Turbopack hashed externals from compiled chunks ─────────
+// Even when Turbopack is disabled at build time, some instrumentation chunks
+// may still emit require('package-<16hexchars>') instead of require('package').
+// These hashed names don't exist in node_modules and cause MODULE_NOT_FOUND at
+// runtime. We strip the hex suffix from all .js files in app/.next/server/
+// to ensure all require() calls use the real package names.
+{
+  const serverOutput = join(APP_DIR, ".next", "server");
+  const HASH_RE = /(['"\\])([a-z@][a-z0-9@./_-]+-[0-9a-f]{16})\1/g;
+  let patchedFiles = 0;
+  let patchedMatches = 0;
+  const walkDir = (dir) => {
+    let entries = [];
+    try {
+      entries = readdirSync(dir);
+    } catch {
+      return;
+    }
+    for (const entry of entries) {
+      const full = join(dir, entry);
+      try {
+        const st = statSync(full);
+        if (st.isDirectory()) {
+          walkDir(full);
+          continue;
+        }
+        if (!entry.endsWith(".js")) continue;
+        const src = readFileSync(full, "utf8");
+        let count = 0;
+        const patched = src.replace(HASH_RE, (_, q, name) => {
+          const base = name.replace(/-[0-9a-f]{16}$/, "");
+          count++;
+          return `${q}${base}${q}`;
+        });
+        if (count > 0) {
+          writeFileSync(full, patched);
+          patchedFiles++;
+          patchedMatches += count;
+        }
+      } catch {
+        /* skip unreadable files */
+      }
+    }
+  };
+  if (existsSync(serverOutput)) {
+    walkDir(serverOutput);
+    if (patchedMatches > 0) {
+      console.log(
+        ` 🔧 Hash-strip: patched ${patchedMatches} hashed require() in ${patchedFiles} server chunk file(s)`
+      );
+    } else {
+      console.log(" ✅ Hash-strip: no hashed externals found in compiled chunks.");
+    }
+  }
+}
+
 // ── Step 6: Copy static assets ─────────────────────────────
 const staticSrc = join(ROOT, ".next", "static");
 const staticDest = join(APP_DIR, ".next", "static");

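To see what the `HASH_RE` replacement in Step 5.6 does to a chunk, the regex can be run against an in-memory string instead of files on disk. This is a minimal sketch: `stripHashes` is a hypothetical wrapper, and the package names and 16-character suffixes in `sample` are invented for the demonstration, not taken from a real build.

```typescript
// Same pattern as in the hunk above: a quoted token ending in "-<16 hex chars>".
const HASH_RE = /(['"\\])([a-z@][a-z0-9@./_-]+-[0-9a-f]{16})\1/g;

function stripHashes(src: string): { out: string; count: number } {
  let count = 0;
  const out = src.replace(HASH_RE, (_m: string, q: string, name: string) => {
    count++;
    // Drop the trailing "-<16 hex chars>" and keep the original quote character.
    return `${q}${name.replace(/-[0-9a-f]{16}$/, "")}${q}`;
  });
  return { out, count };
}

// Invented sample chunk content.
const sample = [
  'const a = require("somepkg-0123456789abcdef");',
  "const b = require('@scope/pkg-fedcba9876543210');",
  'const c = require("lodash");',
].join("\n");

const result = stripHashes(sample);
console.log(result.count); // 2
console.log(result.out);
// const a = require("somepkg");
// const b = require('@scope/pkg');
// const c = require("lodash");
```

The pattern keys on the quoted token alone and does not look for a surrounding `require(`, so any quoted string of that shape in a chunk would be rewritten too; that appears to be an accepted trade-off of the build step rather than something this sketch changes.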
@@ -255,6 +255,22 @@ const PROVIDER_MODELS_CONFIG: Record<string, ProviderModelsConfigEntry> = {
     authPrefix: "Bearer ",
     parseResponse: (data) => data.models || data.data || [],
   },
+  synthetic: {
+    url: "https://api.synthetic.new/openai/v1/models",
+    method: "GET",
+    headers: { "Content-Type": "application/json" },
+    authHeader: "Authorization",
+    authPrefix: "Bearer ",
+    parseResponse: (data) => data.data || data.models || [],
+  },
+  "kilo-gateway": {
+    url: "https://api.kilo.ai/api/gateway/models",
+    method: "GET",
+    headers: { "Content-Type": "application/json" },
+    authHeader: "Authorization",
+    authPrefix: "Bearer ",
+    parseResponse: (data) => data.data || data.models || [],
+  },
 };

 /**

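The `parseResponse` fallback used by the two new entries (`data.data || data.models || []`) is short enough to try directly. The sketch below assumes a hypothetical `ModelsPayload` type and made-up payloads; the real response shapes of either provider are not shown in this diff.

```typescript
// Hypothetical payload type: either key may carry the model list.
type ModelsPayload = { data?: unknown[]; models?: unknown[] };

// Same expression as in the config entries above.
const parseResponse = (data: ModelsPayload): unknown[] => data.data || data.models || [];

console.log(parseResponse({ data: [{ id: "a" }] }).length); // 1
console.log(parseResponse({ models: [{ id: "b" }, { id: "c" }] }).length); // 2
console.log(parseResponse({}).length); // 0
console.log(parseResponse({ data: [], models: [{ id: "d" }] }).length); // 0
```

The last call shows a property of `||` on arrays: an empty `data` array is truthy, so it wins over a populated `models` array.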
@@ -360,6 +360,26 @@ export const APIKEY_PROVIDERS = {
     hasFree: true,
     freeNote: "Free Inference API for thousands of models (Whisper, VITS, SDXL…)",
   },
+  synthetic: {
+    id: "synthetic",
+    alias: "synthetic",
+    name: "Synthetic",
+    icon: "verified_user",
+    color: "#6366F1",
+    textIcon: "SY",
+    website: "https://synthetic.new",
+    passthroughModels: true,
+  },
+  "kilo-gateway": {
+    id: "kilo-gateway",
+    alias: "kg",
+    name: "Kilo Gateway",
+    icon: "hub",
+    color: "#617A91",
+    textIcon: "KG",
+    website: "https://kilo.ai",
+    passthroughModels: true,
+  },
   vertex: {
     id: "vertex",
     alias: "vertex",

@@ -382,7 +382,8 @@ async function handleSingleModelChat(
       credentials.connectionId,
       result.status,
       result.error,
-      provider
+      provider,
+      model
     );

     if (shouldFallback) {

@@ -14,6 +14,8 @@ import {
   isModelLocked,
   lockModel,
 } from "@omniroute/open-sse/services/accountFallback.ts";
+import { isLocalProvider } from "@omniroute/open-sse/config/providerRegistry.ts";
+import { COOLDOWN_MS } from "@omniroute/open-sse/config/constants.ts";
 import * as log from "../utils/logger";
 import { fisherYatesShuffle, getNextFromDeckSync } from "@/shared/utils/shuffleDeck";

@@ -563,6 +565,23 @@ export async function markAccountUnavailable(
   );
   if (!shouldFallback) return { shouldFallback: false, cooldownMs: 0 };

+  // ── Local provider 404: model-only lockout, connection stays active ──
+  // Detection: URL-based only (apiKey===null heuristic was too broad — could match
+  // cloud providers with non-standard auth stored in providerSpecificData).
+  const connBaseUrl = (conn?.providerSpecificData as Record<string, unknown>)?.baseUrl as
+    | string
+    | undefined;
+
+  if (isLocalProvider(connBaseUrl) && status === 404 && provider && model) {
+    const localCooldown = COOLDOWN_MS.notFoundLocal;
+    lockModel(provider, connectionId, model, "local_not_found", localCooldown);
+    log.info(
+      "AUTH",
+      `Local 404 for ${model} — model-only lockout ${localCooldown / 1000}s (connection stays active)`
+    );
+    return { shouldFallback: true, cooldownMs: localCooldown };
+  }
+
   const rateLimitedUntil = getUnavailableUntil(cooldownMs);
   const errorMsg = typeof errorText === "string" ? errorText.slice(0, 100) : "Provider error";

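The branch added to `markAccountUnavailable` decides between two lockout scopes for a 404. The sketch below reduces that decision to a pure function so it can be run on its own. It is a simplification under stated assumptions: `lockoutFor404`, `isLocalUrl`, and the `Lockout` type are hypothetical, the two cooldown values are copied from the `COOLDOWN_MS` hunk, and the real function also performs locking, logging, and persistence that are omitted here.

```typescript
// Values copied from the COOLDOWN_MS hunk earlier in this diff.
const COOLDOWN_MS = { notFound: 2 * 60 * 1000, notFoundLocal: 5 * 1000 };

type Lockout = { scope: "model" | "connection"; cooldownMs: number };

// Simplified local check (no env var extension), assumed for this sketch only.
const LOCAL = ["localhost", "127.0.0.1", "[::1]"];
function isLocalUrl(baseUrl?: string): boolean {
  if (!baseUrl) return false;
  try {
    return LOCAL.includes(new URL(baseUrl).hostname);
  } catch {
    return false;
  }
}

// A 404 from a local backend locks only the model, and only when the model is known.
function lockoutFor404(baseUrl: string | undefined, model?: string): Lockout {
  if (isLocalUrl(baseUrl) && model) {
    return { scope: "model", cooldownMs: COOLDOWN_MS.notFoundLocal };
  }
  return { scope: "connection", cooldownMs: COOLDOWN_MS.notFound };
}

console.log(lockoutFor404("http://localhost:11434", "llama3")); // { scope: 'model', cooldownMs: 5000 }
console.log(lockoutFor404("http://localhost:11434")); // { scope: 'connection', cooldownMs: 120000 }
console.log(lockoutFor404("https://api.example.com", "o3")); // { scope: 'connection', cooldownMs: 120000 }
```

The second call shows why the first hunk of this file matters: without `model` being passed through by the caller, the local branch cannot be taken and the connection-level cooldown applies.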