Compare commits

5 Commits

Author SHA1 Message Date
diegosouzapw 8df24c855b chore(release): v2.2.8
Build Electron Desktop App / Validate version (push) Failing after 32s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
fix(docker): healthcheck now uses /api/monitoring/health (#296, PR #301)
fix(rate-limit): maxWait=120s on Bottleneck prevents endless queue (#297, PR #302)
2026-03-11 00:20:57 -03:00
Diego Rodrigues de Sa e Souza f25882c0e9 Merge pull request #302 from diegosouzapw/fix/issue-296-healthcheck-endpoint
fix(docker): use /api/monitoring/health for Docker healthcheck (#296)
2026-03-11 00:20:17 -03:00
Diego Rodrigues de Sa e Souza be6c769192 Merge pull request #301 from diegosouzapw/fix/issue-297-rate-limit-maxwait
fix(rate-limit): prevent endless queue with maxWait (#297)
2026-03-11 00:20:14 -03:00
diegosouzapw a4276444b5 fix(rate-limit): add maxWait to Bottleneck to prevent endless queuing (#297)
When all provider quotas are exhausted (reservoir=0 after repeated 429s),
Bottleneck's schedule() would queue requests indefinitely since no maxWait
was configured. Clients (Cursor, Claude Code, VS Code) would hang forever.

Fix: add maxWait=120000 (2min, configurable via RATE_LIMIT_MAX_WAIT_MS env)
to DEFAULT_SETTINGS and all three Bottleneck constructors. When a job waits
longer than maxWait, Bottleneck rejects with a BottleneckError which
propagates as a 502/503 error to the client — a clean fail-fast instead
of infinite hang.
2026-03-10 23:58:36 -03:00
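
The fail-fast behaviour this commit describes can be illustrated without the limiter itself. The sketch below is neither Bottleneck's API nor the project's code: `withMaxWait`, `QueueTimeoutError`, and `statusForError` are hypothetical names, and the 503/502 mapping is an assumption based only on the "502/503" wording in the commit message. It races a pending job against a timer so the caller receives a rejection after `maxWaitMs` instead of waiting indefinitely.

```javascript
// Hypothetical error type for illustration; the commit says the real
// rejection surfaces as a BottleneckError.
class QueueTimeoutError extends Error {
  constructor(maxWaitMs) {
    super(`job waited longer than ${maxWaitMs} ms`);
    this.name = "QueueTimeoutError";
  }
}

// Reject if `job` has not settled within maxWaitMs; otherwise pass its
// result (or its own rejection) through unchanged.
function withMaxWait(job, maxWaitMs) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new QueueTimeoutError(maxWaitMs)), maxWaitMs);
  });
  // Clear the timer in every case so it does not keep the process alive.
  return Promise.race([job, timeout]).finally(() => clearTimeout(timer));
}

// Assumed mapping: a queue timeout becomes 503, anything else 502.
function statusForError(err) {
  return err instanceof QueueTimeoutError ? 503 : 502;
}
```

A caller would wrap the scheduled request, for example `withMaxWait(scheduledRequest, 120000)`, and translate a rejection with `statusForError` before responding to the client.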
diegosouzapw 0af27b8d8a fix(docker): use /api/monitoring/health for healthcheck (#296)
The healthcheck script was querying /api/settings which returns config
data rather than system health. Updated to /api/monitoring/health which
is the canonical health endpoint used across tests, SystemMonitor.tsx,
MaintenanceBanner.tsx, playwright config, and MCP tools.
2026-03-10 23:57:17 -03:00
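
A minimal sketch of a healthcheck against the new endpoint, assuming the usual Docker convention that exit code 0 means healthy and 1 means unhealthy. `checkHealth`, the injectable `fetchImpl` parameter, and the 5-second timeout are illustrative assumptions, not part of the project's script; only the port resolution and the URL path come from the diff.

```javascript
// Resolve to a process exit code: 0 when the endpoint answers 2xx, 1 otherwise.
// `fetchImpl` is injectable so the function can be exercised without a server.
async function checkHealth(url, fetchImpl = fetch, timeoutMs = 5000) {
  const controller = new AbortController();
  // Abort the request if the server does not answer in time (assumed limit).
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    return res.ok ? 0 : 1;
  } catch {
    return 1; // network error, abort, or refused connection
  } finally {
    clearTimeout(timer);
  }
}

// Same port resolution and path as the script changed in this commit.
const port = process.env.DASHBOARD_PORT || process.env.PORT || "20128";
const healthUrl = `http://127.0.0.1:${port}/api/monitoring/health`;
```

As a script entry point this could be invoked as `checkHealth(healthUrl).then((code) => process.exit(code));`.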
6 changed files with 25 additions and 6 deletions
+11
@@ -11,6 +11,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
---
## [2.2.8] — 2026-03-11
> ### Bug Fixes
### Bug Fixes
- **Docker healthcheck wrong endpoint (#296)** — `scripts/healthcheck.mjs` now queries `/api/monitoring/health` instead of `/api/settings`. Aligns the healthcheck with all other health monitoring components (PR #301).
- **429 causes endless queue / requests hang forever (#297)** — Added `maxWait=120000` (2 min) to all Bottleneck instances. When all provider quotas are exhausted, requests now fail-fast with a clean error instead of queueing indefinitely. Configurable via `RATE_LIMIT_MAX_WAIT_MS` env var (PR #302).
---
## [2.2.7] — 2026-03-10
> ### Bug Fixes & Dependency Updates
+1 -1
@@ -1,7 +1,7 @@
openapi: 3.1.0
info:
title: OmniRoute API
- version: 2.2.7
+ version: 2.2.8
description: |
OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
endpoint that routes requests to multiple AI providers with load balancing,
+8
@@ -59,6 +59,11 @@ const PERSIST_DEBOUNCE_MS = 60_000; // Debounce persistence to every 60s max
// Track initialization
let initialized = false;
+ // Max time (ms) a job can wait in queue before failing with a timeout error.
+ // Prevents infinite queuing when all providers are exhausted after a 429.
+ // Configurable via RATE_LIMIT_MAX_WAIT_MS env var (default: 2 minutes).
+ const MAX_WAIT_MS = parseInt(process.env.RATE_LIMIT_MAX_WAIT_MS || "120000", 10);
// Default conservative settings (before we learn from headers)
const DEFAULT_SETTINGS = {
maxConcurrent: 10,
@@ -66,6 +71,7 @@ const DEFAULT_SETTINGS = {
reservoir: null, // No initial reservoir — unlimited until we learn
reservoirRefreshAmount: null,
reservoirRefreshInterval: null,
+ maxWait: MAX_WAIT_MS, // Fail-fast: don't queue forever on 429 exhaustion
};
/**
@@ -111,6 +117,7 @@ export async function initializeRateLimits() {
reservoir: rpm,
reservoirRefreshAmount: rpm,
reservoirRefreshInterval: 60 * 1000,
+ maxWait: MAX_WAIT_MS,
id: key,
})
);
@@ -135,6 +142,7 @@ export async function initializeRateLimits() {
reservoir: DEFAULT_API_LIMITS.requestsPerMinute,
reservoirRefreshAmount: DEFAULT_API_LIMITS.requestsPerMinute,
reservoirRefreshInterval: 60 * 1000, // Refresh every minute
+ maxWait: MAX_WAIT_MS,
id: key,
})
);
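
One detail worth noting about the `MAX_WAIT_MS` line above: `parseInt` returns `NaN` for a non-numeric string, so a mistyped `RATE_LIMIT_MAX_WAIT_MS` would be passed on as `NaN`. A hedged sketch of a stricter parse follows; `parseMaxWaitMs` and `DEFAULT_MAX_WAIT_MS` are hypothetical names, and falling back on zero or negative values is an assumption about the desired behaviour, not something the diff specifies.

```javascript
const DEFAULT_MAX_WAIT_MS = 120000; // 2 minutes, the default named in the diff

// Parse a millisecond value from an env string, falling back when it is
// missing, non-numeric, or not positive.
function parseMaxWaitMs(raw, fallback = DEFAULT_MAX_WAIT_MS) {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}
```

It would be called as `parseMaxWaitMs(process.env.RATE_LIMIT_MAX_WAIT_MS)` in place of the inline `parseInt`.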
+2 -2
@@ -1,12 +1,12 @@
{
"name": "omniroute",
- "version": "2.2.7",
+ "version": "2.2.8",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "omniroute",
- "version": "2.2.7",
+ "version": "2.2.8",
"hasInstallScript": true,
"license": "MIT",
"workspaces": [
+1 -1
@@ -1,6 +1,6 @@
{
"name": "omniroute",
- "version": "2.2.7",
+ "version": "2.2.8",
"description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
"type": "module",
"bin": {
+2 -2
@@ -2,12 +2,12 @@
/**
* Docker healthcheck script for OmniRoute.
- * Checks the /api/settings endpoint on the dashboard port.
+ * Checks the /api/monitoring/health endpoint on the dashboard port.
* Used by Dockerfile and docker-compose files.
*/
const port = process.env.DASHBOARD_PORT || process.env.PORT || "20128";
- fetch(`http://127.0.0.1:${port}/api/settings`)
+ fetch(`http://127.0.0.1:${port}/api/monitoring/health`)
.then((r) => {
if (!r.ok) throw new Error(`HTTP ${r.status}`);
})