chore(release): v2.2.8

fix(docker): healthcheck now uses /api/monitoring/health (#296, PR #301)
fix(rate-limit): maxWait=120s on Bottleneck prevents endless queue (#297, PR #302)
6 changed files with 25 additions and 6 deletions
@@ -11,6 +11,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ---

+## [2.2.8] — 2026-03-11
+
+> ### Bug Fixes
+
+### Bug Fixes
+
+- **Docker healthcheck wrong endpoint (#296)** — `scripts/healthcheck.mjs` now queries `/api/monitoring/health` instead of `/api/settings`. Aligns the healthcheck with all other health monitoring components (PR #301).
+- **429 causes endless queue / requests hang forever (#297)** — Added `maxWait=120000` (2 min) to all Bottleneck instances. When all provider quotas are exhausted, requests now fail-fast with a clean error instead of queueing indefinitely. Configurable via `RATE_LIMIT_MAX_WAIT_MS` env var (PR #302).
+
+---
+
 ## [2.2.7] — 2026-03-10

 > ### Bug Fixes & Dependency Updates
@@ -1,7 +1,7 @@
 openapi: 3.1.0
 info:
  title: OmniRoute API
-  version: 2.2.7
+  version: 2.2.8
  description: |
    OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
    endpoint that routes requests to multiple AI providers with load balancing,
@@ -59,6 +59,11 @@ const PERSIST_DEBOUNCE_MS = 60_000; // Debounce persistence to every 60s max
 // Track initialization
 let initialized = false;

+// Max time (ms) a job can wait in queue before failing with a timeout error.
+// Prevents infinite queuing when all providers are exhausted after a 429.
+// Configurable via RATE_LIMIT_MAX_WAIT_MS env var (default: 2 minutes).
+const MAX_WAIT_MS = parseInt(process.env.RATE_LIMIT_MAX_WAIT_MS || "120000", 10);
+
 // Default conservative settings (before we learn from headers)
 const DEFAULT_SETTINGS = {
  maxConcurrent: 10,
@@ -66,6 +71,7 @@ const DEFAULT_SETTINGS = {
  reservoir: null, // No initial reservoir — unlimited until we learn
  reservoirRefreshAmount: null,
  reservoirRefreshInterval: null,
+  maxWait: MAX_WAIT_MS, // Fail-fast: don't queue forever on 429 exhaustion
 };

 /**
@@ -111,6 +117,7 @@ export async function initializeRateLimits() {
              reservoir: rpm,
              reservoirRefreshAmount: rpm,
              reservoirRefreshInterval: 60 * 1000,
+              maxWait: MAX_WAIT_MS,
              id: key,
            })
          );
@@ -135,6 +142,7 @@ export async function initializeRateLimits() {
              reservoir: DEFAULT_API_LIMITS.requestsPerMinute,
              reservoirRefreshAmount: DEFAULT_API_LIMITS.requestsPerMinute,
              reservoirRefreshInterval: 60 * 1000, // Refresh every minute
+              maxWait: MAX_WAIT_MS,
              id: key,
            })
          );
@@ -1,12 +1,12 @@
 {
  "name": "omniroute",
-  "version": "2.2.7",
+  "version": "2.2.8",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "omniroute",
-      "version": "2.2.7",
+      "version": "2.2.8",
      "hasInstallScript": true,
      "license": "MIT",
      "workspaces": [
@@ -1,6 +1,6 @@
 {
  "name": "omniroute",
-  "version": "2.2.7",
+  "version": "2.2.8",
  "description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
  "type": "module",
  "bin": {
@@ -2,12 +2,12 @@

 /**
 * Docker healthcheck script for OmniRoute.
- * Checks the /api/settings endpoint on the dashboard port.
+ * Checks the /api/monitoring/health endpoint on the dashboard port.
 * Used by Dockerfile and docker-compose files.
 */
 const port = process.env.DASHBOARD_PORT || process.env.PORT || "20128";

-fetch(`http://127.0.0.1:${port}/api/settings`)
+fetch(`http://127.0.0.1:${port}/api/monitoring/health`)
  .then((r) => {
    if (!r.ok) throw new Error(`HTTP ${r.status}`);
  })
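The healthcheck change above can be sketched as a complete script. This is a minimal sketch, not the project's actual file: the diff shows only the first lines of `scripts/healthcheck.mjs`, so the helper names `resolveHealthUrl` and `checkHealth`, the 5-second timeout, and the injectable `fetchImpl` parameter are all assumptions made for illustration. The exit codes follow Docker's HEALTHCHECK convention (0 for healthy, 1 for unhealthy).

```javascript
// Build the health URL from the same env vars the original script reads.
// `env` is a parameter (assumption) so the function can be tested in isolation.
function resolveHealthUrl(env = process.env) {
  const port = env.DASHBOARD_PORT || env.PORT || "20128";
  return `http://127.0.0.1:${port}/api/monitoring/health`;
}

// Resolve to an exit code: 0 when the endpoint answers with a 2xx status,
// 1 on a non-2xx status, a network error, or a timeout.
// `fetchImpl` and `timeoutMs` are hypothetical parameters, not from the source.
async function checkHealth(url, fetchImpl = fetch, timeoutMs = 5000) {
  try {
    const res = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok ? 0 : 1;
  } catch {
    return 1;
  }
}

// Usage in a standalone script:
// checkHealth(resolveHealthUrl()).then((code) => process.exit(code));
```

Returning an exit code instead of throwing keeps the failure path explicit, which matters because the container runtime only sees the process exit status.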
Author	SHA1	Message	Date
diegosouzapw	8df24c855b	chore(release): v2.2.8; fix(docker): healthcheck now uses /api/monitoring/health (#296, PR #301); fix(rate-limit): maxWait=120s on Bottleneck prevents endless queue (#297, PR #302)	2026-03-11 00:20:57 -03:00
Diego Rodrigues de Sa e Souza	f25882c0e9	Merge pull request #302 from diegosouzapw/fix/issue-296-healthcheck-endpoint fix(docker): use /api/monitoring/health for Docker healthcheck (#296)	2026-03-11 00:20:17 -03:00
Diego Rodrigues de Sa e Souza	be6c769192	Merge pull request #301 from diegosouzapw/fix/issue-297-rate-limit-maxwait fix(rate-limit): prevent endless queue with maxWait (#297)	2026-03-11 00:20:14 -03:00
diegosouzapw	a4276444b5	fix(rate-limit): add maxWait to Bottleneck to prevent endless queuing (#297). When all provider quotas are exhausted (reservoir=0 after repeated 429s), Bottleneck's schedule() would queue requests indefinitely because no maxWait was configured, and clients (Cursor, Claude Code, VS Code) would hang forever. Fix: add maxWait=120000 (2 min, configurable via the RATE_LIMIT_MAX_WAIT_MS env var) to DEFAULT_SETTINGS and all three Bottleneck constructors. When a job waits longer than maxWait, Bottleneck rejects with a BottleneckError, which propagates to the client as a 502/503 error: a clean fail-fast instead of an infinite hang.	2026-03-10 23:58:36 -03:00
diegosouzapw	0af27b8d8a	fix(docker): use /api/monitoring/health for healthcheck (#296). The healthcheck script was querying /api/settings, which returns config data rather than system health. It now queries /api/monitoring/health, the canonical health endpoint used across tests, SystemMonitor.tsx, MaintenanceBanner.tsx, the Playwright config, and MCP tools.	2026-03-10 23:57:17 -03:00
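
The fail-fast behavior described in the commit messages above rests on two ideas: read a wait limit from the environment with a safe default, and reject a queued job once it has waited longer than that limit. The sketch below illustrates both with plain promises. It is not Bottleneck's API and not the project's code; `parseMaxWait` and `waitForSlot` are hypothetical helpers. One difference from the diff is deliberate: `parseInt` on a non-numeric string returns `NaN`, so this version falls back to the default in that case.

```javascript
// Parse a millisecond value from an env-style string. Falls back when the
// value is missing, non-numeric, zero, or negative.
function parseMaxWait(raw, fallbackMs = 120000) {
  const n = Number.parseInt(raw ?? "", 10);
  return Number.isFinite(n) && n > 0 ? n : fallbackMs;
}

// Reject if `slot` (a promise that settles when capacity frees up) has not
// settled within maxWaitMs. Plain promises only; this is an illustration of
// the max-wait idea, not how Bottleneck implements it.
function waitForSlot(slot, maxWaitMs) {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Queue wait exceeded ${maxWaitMs} ms`)),
      maxWaitMs
    );
  });
  // Clear the timer either way so a settled slot does not keep the process alive.
  return Promise.race([slot, deadline]).finally(() => clearTimeout(timer));
}

// Usage: const maxWaitMs = parseMaxWait(process.env.RATE_LIMIT_MAX_WAIT_MS);
```

For example, `waitForSlot(new Promise(() => {}), 50)` rejects after roughly 50 ms because the slot promise never settles, which mirrors a queue whose quota never refills.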