feat(release): v2.0.8 — custom image model handler resolution

Merge pull request #239 from diegosouzapw/fix/issue-238-image-handler
fix: pass resolved provider to image handler for custom models (#238)
2026-03-07 10:05:20 -03:00 · 2026-03-07 10:04:24 -03:00 · 2026-03-07 10:03:48 -03:00 · 2026-03-07 06:58:07 -03:00 · 2026-03-07 06:56:49 -03:00 · 2026-03-07 06:56:09 -03:00
31 changed files with 899 additions and 1041 deletions
@@ -31,14 +31,14 @@ jobs:
        uses: docker/setup-buildx-action@v3

      - name: Login to Docker Hub
-        uses: docker/login-action@v3
+        uses: docker/login-action@v4
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      - name: Build and push by digest
        id: build
-        uses: docker/build-push-action@v6
+        uses: docker/build-push-action@v7
        with:
          context: .
          target: runner-base
@@ -87,7 +87,7 @@ jobs:
          merge-multiple: true

      - name: Login to Docker Hub
-        uses: docker/login-action@v3
+        uses: docker/login-action@v4
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}
@@ -78,7 +78,7 @@ jobs:
          cache: npm

      - name: Cache node_modules
-        uses: actions/cache@v4
+        uses: actions/cache@v5
        with:
          path: node_modules
          key: ${{ runner.os }}-node-${{ hashFiles('package-lock.json') }}
@@ -120,7 +120,7 @@ jobs:
          fi

      - name: Upload artifacts
-        uses: actions/upload-artifact@v4
+        uses: actions/upload-artifact@v7
        with:
          name: electron-${{ matrix.platform }}
          path: release-assets/
@@ -136,7 +136,7 @@ jobs:
          fetch-depth: 0

      - name: Download all artifacts
-        uses: actions/download-artifact@v4
+        uses: actions/download-artifact@v8
        with:
          path: release-assets
          merge-multiple: true
@@ -172,6 +172,5 @@ jobs:
            release-assets/*.blockmap
            release-assets/*.source.tar.gz
            release-assets/*.source.zip
-            release-assets/OmniRoute.exe
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -117,3 +117,8 @@ icon.iconset/

 # VS Code Extension (independent Git repo)
 vscode-extension/
+
+# SQLite residual files
+*.sqlite-shm
+*.sqlite-wal
+*.sqlite-journal
@@ -7,6 +7,168 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ---

+## [2.0.8] — 2026-03-07
+
+> ### 🐛 Bug Fix — Custom Image Model Handler Resolution
+
+### 🐛 Bug Fixes
+
+- **#238 — Custom image models still fail in handler layer** — v2.0.7 fixed the route-layer validation, but the handler (`handleImageGeneration()`) called `parseImageModel()` again internally, rejecting custom models a second time. Fix: handler now accepts an optional `resolvedProvider` parameter; when provided, it skips re-validation and routes custom models to the OpenAI-compatible handler with a synthetic config. PR #239
+
+### 📁 Files Changed
+
+| File                                         | Change                                                                           |
+| -------------------------------------------- | -------------------------------------------------------------------------------- |
+| `open-sse/handlers/imageGeneration.ts`       | Added `resolvedProvider` param + custom model fallback                           |
+| `src/app/api/v1/images/generations/route.ts` | Tracks `isCustomModel`, passes `resolvedProvider`, credentials for custom models |
+
+---
+
+## [2.0.7] — 2026-03-07
+
+> ### 🐛 Bug Fixes — Custom Image Models + Codex OAuth Workspace Isolation
+
+### 🐛 Bug Fixes
+
+- **#232 — Custom Gemini image models fail on `/v1/images/generations`** — Custom models tagged with `supportedEndpoints: ["images"]` appeared in the model listing (GET) but were rejected by the POST handler. `parseImageModel()` only checked the built-in `IMAGE_PROVIDERS` registry. Fix: added a custom model DB fallback for models with the `images` endpoint tag. PR #237
+- **#236 — Codex OAuth overwrites existing connection when same email added to another workspace** — The OAuth callback route had 3 upsert blocks matching connections by email-only, bypassing the workspace-aware logic in `createProviderConnection()`. When the same email authenticated to a new workspace, the existing connection's `workspaceId` was silently overwritten. Fix: for Codex, the match now also checks `providerSpecificData.workspaceId`, allowing separate connections per workspace. PR #237
+
+### 📁 Files Changed
+
+| File                                             | Change                                               |
+| ------------------------------------------------ | ---------------------------------------------------- |
+| `src/app/api/v1/images/generations/route.ts`     | Custom model DB fallback in POST handler             |
+| `src/app/api/oauth/[provider]/[action]/route.ts` | Workspace-aware Codex matching in 3 upsert locations |
+
+### ⏭️ Issues Triaged
+
+- **#234** — Playground feature request — Acknowledged, added to roadmap
+- **#235** — ACP support feature request — Acknowledged, added to roadmap
+
+---
+
+## [2.0.6] — 2026-03-07
+
+> ### 🐛 Bug Fix — Custom Model API Format Routing
+
+### 🐛 Bug Fixes
+
+- **#204 — Custom model `apiFormat` not used in routing** — Custom models configured with `apiFormat: "responses"` in the dashboard were still being routed through the Chat Completions translator. The `apiFormat` field was stored in the DB and displayed in the UI, but never consumed by the routing layer. Fix: `getModelInfo()` now returns `apiFormat` from the custom model DB, and both `resolveModelOrError()` functions override `targetFormat` to `openai-responses` when set. PR #233
+
+### ✅ Issues Closed
+
+- **#205** — Combo endpoint support — Already implemented in v2.0.2
+- **#206** — Manual model→endpoint mapping — Already implemented in v2.0.2
+- **#223** — CLI fingerprint parity — Responded with 4-phase roadmap
+
+### 📁 Files Changed
+
+| File                              | Change                                                                 |
+| --------------------------------- | ---------------------------------------------------------------------- |
+| `src/sse/services/model.ts`       | Added `lookupCustomModelApiFormat()`, enriched `getModelInfo()` return |
+| `src/sse/handlers/chat.ts`        | Override `targetFormat` when `apiFormat === "responses"`               |
+| `src/sse/handlers/chatHelpers.ts` | Same override in duplicate `resolveModelOrError()`                     |
+
+---
+
+## [2.0.5] — 2026-03-06
+
+> ### 🐛 Bug Fix, Electron Auto-Update & Dependency Bumps
+
+### 🐛 Bug Fixes
+
+- **#224 — Chat→Responses translation creates invalid reasoning IDs** — Removed synthetic reasoning item generation in `openaiToOpenAIResponsesRequest()`. The translator was creating reasoning items with IDs like `reasoning_15`, but OpenAI's Responses API requires server-generated `rs_*` IDs, causing `400 Invalid Request` errors from Responses-compatible upstreams. Fix: omit reasoning items entirely during translation
+- **CI: duplicate OmniRoute.exe in release workflow** — Removed redundant explicit `release-assets/OmniRoute.exe` entry that caused `softprops/action-gh-release` to fail with 404 on duplicate upload. PR #222 by @benzntech
+
+### ✨ New Features
+
+- **Electron Auto-Update** — Added auto-update functionality to the desktop app using `electron-updater`. Includes IPC handlers for check/download/install, "Check for Updates" in system tray menu, desktop notification when update is ready, and silent startup check (3s delay). PR #221 by @benzntech
+
+### 📦 Dependencies
+
+- Bump `actions/cache` from 4 to 5 (#225)
+- Bump `actions/download-artifact` from 4 to 8 (#226)
+- Bump `docker/login-action` from 3 to 4 (#227)
+- Bump `actions/upload-artifact` from 4 to 7 (#228)
+- Bump `docker/build-push-action` from 6 to 7 (#229)
+- Bump `express-rate-limit` from 8.2.1 to 8.3.0 (#230)
+
+### 📁 Files Changed
+
+| File                                              | Change                                               |
+| ------------------------------------------------- | ---------------------------------------------------- |
+| `open-sse/translator/request/openai-responses.ts` | Remove synthetic reasoning item generation           |
+| `.github/workflows/electron-release.yml`          | Remove duplicate exe entry, bump GH Actions          |
+| `.github/workflows/docker-publish.yml`            | Bump docker/login-action and build-push-action       |
+| `electron/main.js`                                | Auto-updater setup, IPC handlers, tray menu          |
+| `electron/package.json`                           | Added electron-updater dep and GitHub publish config |
+| `electron/preload.js`                             | Exposed update APIs via contextBridge                |
+| `package-lock.json`                               | Updated express-rate-limit                           |
+
+---
+
+## [2.0.4] — 2026-03-06
+
+> ### 🐛 Bug Fixes — Round-Robin Persistence & Docker Compatibility
+
+### 🐛 Bug Fixes
+
+- **#218 — Round-robin sticks to one account** — Added `last_used_at` column to `provider_connections` schema. Round-robin routing relied on `lastUsedAt` to rotate between accounts, but the column was missing from the database — the value was always `null`, causing selection to fall back to the same account. Includes auto-migration for existing databases
+- **#217 — `Cannot find module 'zod'` in Docker/standalone builds** — Added `zod` to `serverExternalPackages` in `next.config.mjs`. Next.js standalone builds weren't tracing `zod` through dynamic imports, causing crashes on Docker startup. Data is **not lost** — the crash prevented the server from reading the existing database
+
+### 📁 Files Changed
+
+| File                      | Change                                                 |
+| ------------------------- | ------------------------------------------------------ |
+| `src/lib/db/core.ts`      | Schema + migration + JSON migration for `last_used_at` |
+| `src/lib/db/providers.ts` | INSERT + UPDATE SQL for `last_used_at`                 |
+| `next.config.mjs`         | `serverExternalPackages: ['better-sqlite3', 'zod']`    |
+
+---
+
+## [2.0.3] — 2026-03-05
+
+> ### 🐛 Bug Fixes & Quota System Hardening
+
+### 🐛 Bug Fixes
+
+- **#215 — Deferred tools 400 error** — Skip `cache_control` on tools with `defer_loading=true` when assigning prompt caching to the last tool. Previously, the API rejected requests with 400 when MCP tools (Playwright, etc.) had deferred loading enabled. Fix applied in both `claudeHelper.ts` and `openai-to-claude.ts` translation layers. PR #216 by @DavyMassoneto
+- **Stale compiled schemas.js** — Deleted stale compiled `schemas.js` (912 lines) that shadowed the TypeScript `.ts` source, causing `cloudSyncActionSchema` warnings in the dashboard. PR #216 by @DavyMassoneto
+- **#202 — False quota exhaustion** — Fixed empty API responses (`{}`) creating quota objects with `utilization ?? 0` = 0% remaining, incorrectly marking accounts as exhausted. Added `hasUtilization()` guard. PR #214 by @DavyMassoneto
+- **Invalid date crash** — `parseDate()` now validates dates before comparison, handling `Invalid Date` from malformed `resetAt` values gracefully. PR #214 by @DavyMassoneto
+- **`total=0` false infinite quota** — `normalizeQuotas` now defaults to 0% remaining when `total` is zero (was incorrectly reporting 100%). PR #214 by @DavyMassoneto
+- **Tailwind v4 build failure** — Tailwind v4 scanned `*.sqlite-shm`/`.sqlite-wal` binary files, triggering "Invalid code point" errors. Added `@source not` exclusions in `globals.css`. PR #214 by @DavyMassoneto
+
+### ✨ Improvements
+
+- **Quota-aware account selection** — All load-balancing strategies (sticky, round-robin, p2c, random, least-used, cost-optimized, fill-first) now prioritize accounts with available quota over exhausted ones. PR #214 by @DavyMassoneto
+- **Concurrent refresh protection** — `tickRunning` flag prevents overlapping background quota refresh ticks; `refreshingSet` deduplicates per-connection refreshes. Thundering herd prevention with `MAX_CONCURRENT_REFRESHES=5`. PR #214 by @DavyMassoneto
+- **`clearModelUnavailability` on success** — Model unavailability is now cleared on every successful request, not only on fallback paths. PR #214 by @DavyMassoneto
+- **Centralized `anthropic-version`** — Hardcoded `anthropic-version` header (3 occurrences) centralized into `CLAUDE_CONFIG.apiVersion`. PR #214 by @DavyMassoneto
+- **Extracted `safePercentage()` utility** — Shared percentage validation function extracted to `src/shared/utils/formatting.ts`, eliminating duplication between backend and frontend. PR #214 by @DavyMassoneto
+- **`isRecord()` type guard** — Replaces inline `typeof` chain in usage API route. PR #214 by @DavyMassoneto
+
+### 📁 Files Changed
+
+| File                                                                                  | Change                                                     |
+| ------------------------------------------------------------------------------------- | ---------------------------------------------------------- |
+| `open-sse/translator/helpers/claudeHelper.ts`                                         | Skip `cache_control` on deferred tools                     |
+| `open-sse/translator/request/openai-to-claude.ts`                                     | Same fix in translator layer                               |
+| `src/shared/validation/schemas.js`                                                    | **DELETED** — stale compiled JS                            |
+| `.gitignore`                                                                          | Ignore SQLite residual files (avoids Tailwind scanning)    |
+| `open-sse/services/usage.ts`                                                          | Legacy endpoint fallback logging                           |
+| `src/domain/quotaCache.ts`                                                            | **NEW** — Core quota cache with hardening                  |
+| `src/shared/utils/formatting.ts`                                                      | **NEW** — `safePercentage()` utility                       |
+| `src/instrumentation.ts`                                                              | Startup log for quota cache                                |
+| `src/sse/handlers/chat.ts`                                                            | `clearModelUnavailability` + `markAccountExhaustedFrom429` |
+| `src/sse/services/auth.ts`                                                            | Quota-aware account selection                              |
+| `src/app/globals.css`                                                                 | Tailwind `@source not` exclusions                          |
+| `src/app/api/usage/[connectionId]/route.ts`                                           | `isRecord()` type guard                                    |
+| `src/app/(dashboard)/dashboard/usage/components/ProviderLimits/ProviderLimitCard.tsx` | Use `remainingPercentage` directly                         |
+| `src/app/(dashboard)/dashboard/usage/components/ProviderLimits/utils.tsx`             | Use shared `safePercentage()`                              |
+
+---
+
 ## [2.0.2] — 2026-03-05

 > ### 🐛 Bug Fixes & ✨ Endpoint-Aware Model Management
@@ -26,10 +26,12 @@ const {
  nativeImage,
  shell,
  session,
+  Notification,
 } = require("electron");
 const path = require("path");
 const { spawn } = require("child_process");
 const fs = require("fs");
+const { autoUpdater } = require("electron-updater");

 // ── Single Instance Lock ───────────────────────────────────
 const gotTheLock = app.requestSingleInstanceLock();
@@ -62,6 +64,11 @@ let serverPort = 20128;

 const getServerUrl = () => `http://localhost:${serverPort}`;

+// ── Auto-Updater Configuration ──────────────────────────────
+autoUpdater.autoDownload = false;
+autoUpdater.autoInstallOnAppQuit = true;
+autoUpdater.logger = console;
+
 // ── Helper: Send IPC event to renderer (#5) ────────────────
 function sendToRenderer(channel, data) {
  if (mainWindow && !mainWindow.isDestroyed()) {
@@ -103,6 +110,77 @@ async function waitForServerExit(proc, timeoutMs = 5000) {
  ]);
 }

+// ── Auto-Updater Event Handlers ─────────────────────────────
+function setupAutoUpdater() {
+  autoUpdater.on("checking-for-update", () => {
+    sendToRenderer("update-status", { status: "checking" });
+    console.log("[Electron] Checking for updates...");
+  });
+
+  autoUpdater.on("update-available", (info) => {
+    sendToRenderer("update-status", { status: "available", version: info.version });
+    console.log("[Electron] Update available:", info.version);
+  });
+
+  autoUpdater.on("update-not-available", (info) => {
+    sendToRenderer("update-status", { status: "not-available", version: info.version });
+    console.log("[Electron] No update available");
+  });
+
+  autoUpdater.on("download-progress", (progress) => {
+    sendToRenderer("update-status", {
+      status: "downloading",
+      percent: Math.round(progress.percent),
+      transferred: progress.transferred,
+      total: progress.total,
+    });
+  });
+
+  autoUpdater.on("update-downloaded", (info) => {
+    sendToRenderer("update-status", { status: "downloaded", version: info.version });
+    console.log("[Electron] Update downloaded:", info.version);
+
+    if (Notification.isSupported()) {
+      const notification = new Notification({
+        title: "OmniRoute Update Ready",
+        body: `Version ${info.version} is ready to install. Click to restart.`,
+      });
+      notification.on("click", () => {
+        autoUpdater.quitAndInstall();
+      });
+      notification.show();
+    }
+  });
+
+  autoUpdater.on("error", (error) => {
+    sendToRenderer("update-status", { status: "error", message: error.message });
+    console.error("[Electron] Update error:", error);
+  });
+}
+
+async function checkForUpdates(silent = false) {
+  if (isDev) {
+    console.log("[Electron] Dev mode — skipping auto-update");
+    if (!silent) {
+      sendToRenderer("update-status", { status: "error", message: "Updates disabled in dev mode" });
+    }
+    return;
+  }
+  await autoUpdater.checkForUpdates();
+}
+
+async function downloadUpdate() {
+  await autoUpdater.downloadUpdate();
+}
+
+function installUpdate() {
+  if (nextServer) {
+    nextServer.kill("SIGTERM");
+    nextServer = null;
+  }
+  autoUpdater.quitAndInstall();
+}
+
 // ── Content Security Policy (#15) ──────────────────────────
 function setupContentSecurityPolicy() {
  session.defaultSession.webRequest.onHeadersReceived((details, callback) => {
@@ -236,6 +314,11 @@ function createTray() {
      ],
    },
    { type: "separator" },
+    {
+      label: "Check for Updates",
+      click: () => checkForUpdates(false),
+    },
+    { type: "separator" },
    {
      label: "Quit",
      click: () => {
@@ -391,6 +474,36 @@ function setupIpcHandlers() {
  });

  ipcMain.on("window-close", () => mainWindow?.close());
+
+  // Auto-update IPC handlers
+  ipcMain.handle("check-for-updates", async () => {
+    try {
+      await checkForUpdates(false);
+      return { success: true };
+    } catch (error) {
+      console.error("[Electron] Check for updates failed:", error);
+      sendToRenderer("update-status", { status: "error", message: error.message });
+      return { success: false, error: error.message };
+    }
+  });
+
+  ipcMain.handle("download-update", async () => {
+    try {
+      await downloadUpdate();
+      return { success: true };
+    } catch (error) {
+      console.error("[Electron] Download update failed:", error);
+      sendToRenderer("update-status", { status: "error", message: error.message });
+      return { success: false, error: error.message };
+    }
+  });
+
+  ipcMain.handle("install-update", () => {
+    installUpdate();
+    // No return value — app will quit and restart
+  });
+
+  ipcMain.handle("get-app-version", () => app.getVersion());
 }

 // ── App Lifecycle ──────────────────────────────────────────
@@ -407,6 +520,14 @@ app.whenReady().then(async () => {
  createWindow();
  createTray();
  setupIpcHandlers();
+  setupAutoUpdater();
+
+  // Check for updates after a short delay (don't block startup)
+  if (!isDev) {
+    setTimeout(() => {
+      checkForUpdates(true);
+    }, 3000);
+  }

  // macOS: recreate window when dock icon clicked
  app.on("activate", () => {
@@ -15,7 +15,9 @@
    "build:linux": "electron-builder --linux",
    "pack": "electron-builder --dir"
  },
-  "dependencies": {},
+  "dependencies": {
+    "electron-updater": "^6.8.3"
+  },
  "devDependencies": {
    "electron": "^40.6.1",
    "electron-builder": "^25.1.8"
@@ -28,6 +30,11 @@
      "output": "dist-electron",
      "buildResources": "assets"
    },
+    "publish": {
+      "provider": "github",
+      "owner": "diegosouzapw",
+      "repo": "OmniRoute"
+    },
    "files": [
      "main.js",
      "preload.js",
@@ -13,9 +13,18 @@ const { contextBridge, ipcRenderer } = require("electron");

 // ── Channel Whitelist ──────────────────────────────────────
 const VALID_CHANNELS = {
-  invoke: ["get-app-info", "open-external", "get-data-dir", "restart-server"],
+  invoke: [
+    "get-app-info",
+    "open-external",
+    "get-data-dir",
+    "restart-server",
+    "check-for-updates",
+    "download-update",
+    "install-update",
+    "get-app-version",
+  ],
  send: ["window-minimize", "window-maximize", "window-close"],
-  receive: ["server-status", "port-changed"],
+  receive: ["server-status", "port-changed", "update-status"],
 };

 // ── Fix #16: Generic IPC wrappers ──────────────────────────
@@ -48,6 +57,12 @@ contextBridge.exposeInMainWorld("electronAPI", {
  openExternal: (url) => safeInvoke("open-external", url),
  getDataDir: () => safeInvoke("get-data-dir"),
  restartServer: () => safeInvoke("restart-server"),
+  getAppVersion: () => safeInvoke("get-app-version"),
+
+  // ── Auto-Update ──────────────────────────────────────────
+  checkForUpdates: () => safeInvoke("check-for-updates"),
+  downloadUpdate: () => safeInvoke("download-update"),
+  installUpdate: () => safeInvoke("install-update"),

  // ── Send (fire-and-forget) ───────────────────────────────
  minimizeWindow: () => safeSend("window-minimize"),
@@ -58,6 +73,7 @@ contextBridge.exposeInMainWorld("electronAPI", {
  // Fix #6: Returns a disposer function for precise cleanup
  onServerStatus: (callback) => safeOn("server-status", callback),
  onPortChanged: (callback) => safeOn("port-changed", callback),
+  onUpdateStatus: (callback) => safeOn("update-status", callback),

  // ── Static Properties ────────────────────────────────────
  isElectron: true,
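
The `update-status` payloads emitted by the main process above can be consumed on the renderer side roughly as follows. This is a minimal sketch, not code from this PR: the `UpdateStatus` type and `describeUpdateStatus` helper are hypothetical names, and the payload shapes are inferred only from the `sendToRenderer("update-status", ...)` calls in `electron/main.js`.

```typescript
// Payload shapes inferred from the sendToRenderer("update-status", ...) calls.
// The type name and helper below are illustrative, not part of the codebase.
type UpdateStatus =
  | { status: "checking" }
  | { status: "available" | "not-available" | "downloaded"; version: string }
  | { status: "downloading"; percent: number; transferred: number; total: number }
  | { status: "error"; message: string };

// Map each status to a short user-facing label.
function describeUpdateStatus(s: UpdateStatus): string {
  switch (s.status) {
    case "checking":
      return "Checking for updates...";
    case "available":
      return `Version ${s.version} is available`;
    case "not-available":
      return "You are up to date";
    case "downloading":
      return `Downloading: ${s.percent}%`;
    case "downloaded":
      return `Version ${s.version} is ready to install`;
    case "error":
      return `Update failed: ${s.message}`;
  }
}
```

A renderer would presumably pass a function like this to `window.electronAPI.onUpdateStatus(...)`; whether the callback receives the payload as its first argument depends on `safeOn`, which is not shown in this diff.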
@@ -6,7 +6,7 @@ const withNextIntl = createNextIntlPlugin("./src/i18n/request.ts");
 const nextConfig = {
  turbopack: {},
  output: "standalone",
-  serverExternalPackages: ["better-sqlite3"],
+  serverExternalPackages: ["better-sqlite3", "zod"],
  transpilePackages: ["@omniroute/open-sse"],
  allowedDevOrigins: ["192.168.*"],
  typescript: {
@@ -30,9 +30,23 @@ import {
 * @param {object} options.body - Request body
 * @param {object} options.credentials - Provider credentials { apiKey, accessToken }
 * @param {object} options.log - Logger
+ * @param {string} [options.resolvedProvider] - Pre-resolved provider ID (from route layer custom model resolution)
 */
-export async function handleImageGeneration({ body, credentials, log }) {
-  const { provider, model } = parseImageModel(body.model);
+export async function handleImageGeneration({ body, credentials, log, resolvedProvider = null }) {
+  let provider, model;
+
+  if (resolvedProvider) {
+    // Provider was already resolved by the route layer (custom model from DB)
+    // Extract model name from the full "provider/model" string
+    provider = resolvedProvider;
+    const modelStr = body.model || "";
+    model = modelStr.startsWith(provider + "/") ? modelStr.slice(provider.length + 1) : modelStr;
+  } else {
+    // Standard path: resolve from built-in image registry
+    const parsed = parseImageModel(body.model);
+    provider = parsed.provider;
+    model = parsed.model;
+  }

  if (!provider) {
    return {
@@ -43,12 +57,42 @@ export async function handleImageGeneration({ body, credentials, log }) {
  }

  const providerConfig = getImageProvider(provider);
+
+  // For custom models without a built-in provider config, use OpenAI-compatible handler
+  // with a synthetic config based on the provider's credentials
  if (!providerConfig) {
-    return {
-      success: false,
-      status: 400,
-      error: `Unknown image provider: ${provider}`,
+    if (!resolvedProvider) {
+      return {
+        success: false,
+        status: 400,
+        error: `Unknown image provider: ${provider}`,
+      };
+    }
+
+    // Custom model: use OpenAI-compatible format with provider's base URL
+    // The credentials were already resolved by the route layer
+    if (log) {
+      log.info("IMAGE", `Custom model ${provider}/${model} — using OpenAI-compatible handler`);
+    }
+
+    const syntheticConfig = {
+      id: provider,
+      baseUrl:
+        credentials?.baseUrl ||
+        `https://generativelanguage.googleapis.com/v1beta/openai/images/generations`,
+      authType: "apikey",
+      authHeader: "bearer",
+      format: "openai",
    };
+
+    return handleOpenAIImageGeneration({
+      model,
+      provider,
+      providerConfig: syntheticConfig,
+      body,
+      credentials,
+      log,
+    });
  }

  // Route to format-specific handler
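
The provider/model resolution added in this hunk can be sketched in isolation. The example below is illustrative only: `resolveImageModel` and `parseBuiltIn` are hypothetical names, `parseBuiltIn` is a stub standing in for `parseImageModel()`, and the `BUILT_IN` set is an assumed registry, not the project's `IMAGE_PROVIDERS`.

```typescript
type Parsed = { provider: string | null; model: string | null };

// Assumed registry contents, for illustration only.
const BUILT_IN = new Set(["openai", "gemini"]);

// Stub for parseImageModel(): resolves only providers in the built-in registry.
function parseBuiltIn(modelStr: string): Parsed {
  const idx = modelStr.indexOf("/");
  if (idx === -1) return { provider: null, model: null };
  const provider = modelStr.slice(0, idx);
  return BUILT_IN.has(provider)
    ? { provider, model: modelStr.slice(idx + 1) }
    : { provider: null, model: null };
}

// When the route layer has already resolved the provider, trust it and only
// strip the "provider/" prefix; otherwise fall back to the built-in registry.
function resolveImageModel(modelStr: string, resolvedProvider: string | null = null): Parsed {
  if (resolvedProvider) {
    const prefix = resolvedProvider + "/";
    const model = modelStr.startsWith(prefix) ? modelStr.slice(prefix.length) : modelStr;
    return { provider: resolvedProvider, model };
  }
  return parseBuiltIn(modelStr);
}
```

Without `resolvedProvider`, a custom provider prefix is unknown to the registry and resolution fails, which matches the behavior described in #238.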
@@ -3,6 +3,7 @@
 */

 import { PROVIDERS } from "../config/constants.ts";
+import { safePercentage } from "@/shared/utils/formatting";

 // GitHub API config
 const GITHUB_CONFIG = {
@@ -34,6 +35,7 @@ const CLAUDE_CONFIG = {
  oauthUsageUrl: "https://api.anthropic.com/api/oauth/usage",
  usageUrl: "https://api.anthropic.com/v1/organizations/{org_id}/usage",
  settingsUrl: "https://api.anthropic.com/v1/settings",
+  apiVersion: "2023-06-01",
 };

 type JsonRecord = Record<string, unknown>;
@@ -469,7 +471,7 @@ async function getClaudeUsage(accessToken) {
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "anthropic-beta": "oauth-2025-04-20",
-        "anthropic-version": "2023-06-01",
+        "anthropic-version": CLAUDE_CONFIG.apiVersion,
      },
    });

@@ -477,36 +479,34 @@ async function getClaudeUsage(accessToken) {
      const data = await oauthResponse.json();
      const quotas: Record<string, any> = {};

-      // utilization = percentage USED (e.g., 22 means 22% used, 78% remaining)
+      // utilization = percentage REMAINING (e.g., 90 means 90% remaining, 10% used)
+      const hasUtilization = (window: any) =>
+        window && typeof window === "object" && safePercentage(window.utilization) !== undefined;
+
      const createQuotaObject = (window: any) => {
-        const used = window?.utilization ?? 0;
-        const remaining = 100 - used;
+        const remaining = safePercentage(window.utilization) as number;
+        const used = 100 - remaining;
        return {
          used,
          total: 100,
          remaining,
-          resetAt: parseResetTime(window?.resets_at),
+          resetAt: parseResetTime(window.resets_at),
          remainingPercentage: remaining,
          unlimited: false,
        };
      };

-      if (data.five_hour && typeof data.five_hour === "object") {
+      if (hasUtilization(data.five_hour)) {
        quotas["session (5h)"] = createQuotaObject(data.five_hour);
      }

-      if (data.seven_day && typeof data.seven_day === "object") {
+      if (hasUtilization(data.seven_day)) {
        quotas["weekly (7d)"] = createQuotaObject(data.seven_day);
      }

      // Parse model-specific weekly windows (e.g., seven_day_sonnet, seven_day_opus)
      for (const [key, value] of Object.entries(data)) {
-        if (
-          key.startsWith("seven_day_") &&
-          key !== "seven_day" &&
-          value &&
-          typeof value === "object"
-        ) {
+        if (key.startsWith("seven_day_") && key !== "seven_day" && hasUtilization(value)) {
          const modelName = key.replace("seven_day_", "");
          quotas[`weekly ${modelName} (7d)`] = createQuotaObject(value);
        }
@@ -519,7 +519,10 @@ async function getClaudeUsage(accessToken) {
      };
    }

-    // Fallback: Try legacy settings/org endpoint (for API key users with org admin access)
+    // Fallback: OAuth endpoint returned non-OK, try legacy settings/org endpoint
+    console.warn(
+      `[Claude Usage] OAuth endpoint returned ${oauthResponse.status}, falling back to legacy`
+    );
    return await getClaudeUsageLegacy(accessToken);
  } catch (error) {
    return { message: `Claude connected. Unable to fetch usage: ${(error as any).message}` };
@@ -536,7 +539,7 @@ async function getClaudeUsageLegacy(accessToken) {
      method: "GET",
      headers: {
        Authorization: `Bearer ${accessToken}`,
-        "anthropic-version": "2023-06-01",
+        "anthropic-version": CLAUDE_CONFIG.apiVersion,
      },
    });

@@ -550,7 +553,7 @@ async function getClaudeUsageLegacy(accessToken) {
            method: "GET",
            headers: {
              Authorization: `Bearer ${accessToken}`,
-              "anthropic-version": "2023-06-01",
+              "anthropic-version": CLAUDE_CONFIG.apiVersion,
            },
          }
        );
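
The `hasUtilization()` guard in the hunks above can be illustrated with a self-contained sketch. It follows the diff's updated comment that `utilization` is the percentage remaining. The local `safePercentage` is an assumed stand-in (finite number clamped to 0..100, otherwise `undefined`); the real helper lives in `src/shared/utils/formatting.ts` and may behave differently. `buildQuotas` is a hypothetical name.

```typescript
// Assumed behaviour of safePercentage(): finite number clamped to 0..100, else undefined.
function safePercentage(value: unknown): number | undefined {
  if (typeof value !== "number" || !Number.isFinite(value)) return undefined;
  return Math.min(100, Math.max(0, value));
}

type Quota = {
  used: number;
  total: number;
  remaining: number;
  remainingPercentage: number;
  unlimited: boolean;
};

// A window counts only if it is an object carrying a valid utilization value.
function hasUtilization(win: unknown): win is { utilization: number } {
  return (
    typeof win === "object" &&
    win !== null &&
    safePercentage((win as { utilization?: unknown }).utilization) !== undefined
  );
}

// Build quota objects, skipping empty `{}` or malformed windows entirely
// instead of defaulting them to a made-up percentage.
function buildQuotas(data: Record<string, unknown>): Record<string, Quota> {
  const quotas: Record<string, Quota> = {};
  for (const [key, value] of Object.entries(data)) {
    if (!hasUtilization(value)) continue;
    const remaining = safePercentage(value.utilization) as number;
    quotas[key] = {
      used: 100 - remaining,
      total: 100,
      remaining,
      remainingPercentage: remaining,
      unlimited: false,
    };
  }
  return quotas;
}
```

The design point is that a missing value produces no quota entry at all, so downstream account selection cannot mistake an empty response for an exhausted account.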
@@ -185,15 +185,19 @@ export function prepareClaudeRequest(body, provider = null) {
    }
  }

-  // 3. Tools: remove all cache_control, add only to last tool with ttl 1h
+  // 3. Tools: remove all cache_control, add only to last non-deferred tool with ttl 1h
+  // Tools with defer_loading=true cannot have cache_control (API rejects it)
  if (body.tools && Array.isArray(body.tools)) {
-    body.tools = body.tools.map((tool, i) => {
+    body.tools = body.tools.map((tool) => {
      const { cache_control, ...rest } = tool;
-      if (i === body.tools.length - 1) {
-        return { ...rest, cache_control: { type: "ephemeral", ttl: "1h" } };
-      }
      return rest;
    });
+    for (let i = body.tools.length - 1; i >= 0; i--) {
+      if (!body.tools[i].defer_loading) {
+        body.tools[i].cache_control = { type: "ephemeral", ttl: "1h" };
+        break;
+      }
+    }
  }

  return body;
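
The rule in this hunk (strip every `cache_control`, then mark only the last tool that is not deferred) can be sketched as a pure function. This is a minimal illustration: the `Tool` type and `assignToolCacheControl` name are hypothetical, and unlike `prepareClaudeRequest()` it returns a new array rather than mutating the request body.

```typescript
type Tool = {
  name: string;
  defer_loading?: boolean;
  cache_control?: { type: string; ttl: string };
};

function assignToolCacheControl(tools: Tool[]): Tool[] {
  // Drop any existing cache_control so at most one tool carries it afterwards.
  const cleaned: Tool[] = tools.map(({ cache_control: _dropped, ...rest }) => rest);

  // Walk backwards and mark the last tool that does not have defer_loading set.
  for (let i = cleaned.length - 1; i >= 0; i--) {
    const tool = cleaned[i];
    if (tool && !tool.defer_loading) {
      cleaned[i] = { ...tool, cache_control: { type: "ephemeral", ttl: "1h" } };
      break;
    }
  }
  return cleaned;
}
```

If every tool is deferred, no tool receives `cache_control`, which is the case that previously triggered the 400 described in #215.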
@@ -275,30 +275,11 @@ export function openaiToOpenAIResponsesRequest(

    // Convert assistant messages
    if (role === "assistant") {
-      // Add reasoning content before assistant output
-      if (msg.reasoning_content) {
-        input.push({
-          type: "reasoning",
-          id: `reasoning_${input.length}`,
-          summary: [{ type: "summary_text", text: toString(msg.reasoning_content) }],
-        });
-      }
+      // Skip reasoning_content — OpenAI Responses API requires server-generated
+      // rs_* IDs for reasoning items. Synthesizing client-side IDs (e.g. reasoning_N)
+      // causes 400 errors from Responses-compatible upstreams. (#224)

-      // Handle thinking blocks in array content
-      if (Array.isArray(msg.content)) {
-        for (const blockValue of msg.content) {
-          const block = toRecord(blockValue);
-          if (block.type === "thinking" || block.type === "redacted_thinking") {
-            input.push({
-              type: "reasoning",
-              id: `reasoning_${input.length}`,
-              summary: [
-                { type: "summary_text", text: toString(block.thinking || block.data, "...") },
-              ],
-            });
-          }
-        }
-      }
+      // Skip thinking blocks in array content — same rs_* ID constraint applies

      // Build assistant output content
      const outputContent: unknown[] = [];
@@ -175,8 +175,13 @@ export function openaiToClaudeRequest(model, body, stream) {
      };
    });

-    if (result.tools.length > 0) {
-      result.tools[result.tools.length - 1].cache_control = { type: "ephemeral", ttl: "1h" };
+    // Add cache_control to the last tool that doesn't set defer_loading
+    // Tools with defer_loading=true cannot have cache_control (API rejects it)
+    for (let i = result.tools.length - 1; i >= 0; i--) {
+      if (!result.tools[i].defer_loading) {
+        result.tools[i].cache_control = { type: "ephemeral", ttl: "1h" };
+        break;
+      }
    }
  }

@@ -1,12 +1,12 @@
 {
  "name": "omniroute",
-  "version": "2.0.1",
+  "version": "2.0.7",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "omniroute",
-      "version": "2.0.1",
+      "version": "2.0.7",
      "hasInstallScript": true,
      "license": "MIT",
      "workspaces": [
@@ -6596,12 +6596,12 @@
      }
    },
    "node_modules/express-rate-limit": {
-      "version": "8.2.1",
-      "resolved": "https://registry.npmjs.org/express-rate-limit/-/express-rate-limit-8.2.1.tgz",
-      "integrity": "sha512-PCZEIEIxqwhzw4KF0n7QF4QqruVTcF73O5kFKUnGOyjbCCgizBBiFaYpd/fnBLUMPw/BWw9OsiN7GgrNYr7j6g==",
+      "version": "8.3.0",
+      "resolved": "https://registry.npmjs.org/express-rate-limit/-/express-rate-limit-8.3.0.tgz",
+      "integrity": "sha512-KJzBawY6fB9FiZGdE/0aftepZ91YlaGIrV8vgblRM3J8X+dHx/aiowJWwkx6LIGyuqGiANsjSwwrbb8mifOJ4Q==",
      "license": "MIT",
      "dependencies": {
-        "ip-address": "10.0.1"
+        "ip-address": "10.1.0"
      },
      "engines": {
        "node": ">= 16"
@@ -6613,15 +6613,6 @@
        "express": ">= 4.11"
      }
    },
-    "node_modules/express-rate-limit/node_modules/ip-address": {
-      "version": "10.0.1",
-      "resolved": "https://registry.npmjs.org/ip-address/-/ip-address-10.0.1.tgz",
-      "integrity": "sha512-NWv9YLW4PoW2B7xtzaS3NCot75m6nK7Icdv0o3lfMceJVRfSoQwqD4wEH5rLwoKJwUiZ/rfpiVBhnaF0FK4HoA==",
-      "license": "MIT",
-      "engines": {
-        "node": ">= 12"
-      }
-    },
    "node_modules/fast-copy": {
      "version": "4.0.2",
      "resolved": "https://registry.npmjs.org/fast-copy/-/fast-copy-4.0.2.tgz",
@@ -1,6 +1,6 @@
 {
  "name": "omniroute",
-  "version": "2.0.2",
+  "version": "2.0.8",
  "description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
  "type": "module",
  "bin": {
@@ -150,10 +150,9 @@ export default function ProviderLimitCard({
      {!loading && !error && !message && quotas?.length > 0 && (
        <div className="space-y-4">
          {quotas.map((quota, index) => {
-            // For Antigravity, use remainingPercentage if available, otherwise calculate
            const percentage =
              quota.remainingPercentage !== undefined
-                ? Math.round(((quota.total - quota.used) / quota.total) * 100)
+                ? Math.round(quota.remainingPercentage)
                : calculatePercentage(quota.used, quota.total);
            const unlimited = quota.total === 0 || quota.total === null;

@@ -1,4 +1,5 @@
 import { getModelsByProviderId } from "@omniroute/open-sse/config/providerModels.ts";
+import { safePercentage } from "@/shared/utils/formatting";

 /**
 * Format ISO date string to countdown format (inspired by vscode-antigravity-cockpit)
@@ -110,7 +111,7 @@ export function parseQuotaData(provider, data) {
              used: quota.used || 0,
              total: quota.total || 0,
              resetAt: quota.resetAt || null,
-              remainingPercentage: quota.remainingPercentage,
+              remainingPercentage: safePercentage(quota.remainingPercentage),
            });
          });
        }
@@ -159,7 +160,7 @@ export function parseQuotaData(provider, data) {
              used: quota.used || 0,
              total: quota.total || 0,
              resetAt: quota.resetAt || null,
-              remainingPercentage: quota.remainingPercentage,
+              remainingPercentage: safePercentage(quota.remainingPercentage),
            });
          });
        }
@@ -221,9 +221,15 @@ export async function POST(
      let connection: any;
      if (tokenData.email) {
        const existing = await getProviderConnections({ provider });
-        const match = existing.find(
-          (c: any) => c.email === tokenData.email && c.authType === "oauth"
-        );
+        const match = existing.find((c: any) => {
+          if (c.email !== tokenData.email || c.authType !== "oauth") return false;
+          // For Codex, also check workspaceId to avoid overwriting different workspace connections
+          if (provider === "codex" && tokenData.providerSpecificData?.workspaceId) {
+            const existingWorkspace = c.providerSpecificData?.workspaceId;
+            return existingWorkspace === tokenData.providerSpecificData.workspaceId;
+          }
+          return true;
+        });
        const matchId = typeof match?.id === "string" ? match.id : null;
        if (matchId) {
          connection = await updateProviderConnection(matchId, {
@@ -285,9 +291,15 @@ export async function POST(
        let connection: any;
        if (result.tokens.email) {
          const existing = await getProviderConnections({ provider });
-          const match = existing.find(
-            (c: any) => c.email === result.tokens.email && c.authType === "oauth"
-          );
+          const match = existing.find((c: any) => {
+            if (c.email !== result.tokens.email || c.authType !== "oauth") return false;
+            // For Codex, also check workspaceId to avoid overwriting different workspace connections
+            if (provider === "codex" && result.tokens.providerSpecificData?.workspaceId) {
+              const existingWorkspace = c.providerSpecificData?.workspaceId;
+              return existingWorkspace === result.tokens.providerSpecificData.workspaceId;
+            }
+            return true;
+          });
          const matchId = typeof match?.id === "string" ? match.id : null;
          if (matchId) {
            connection = await updateProviderConnection(matchId, {
@@ -399,9 +411,15 @@ export async function POST(
        let connection: any;
        if (tokenData.email) {
          const existing = await getProviderConnections({ provider });
-          const match = existing.find(
-            (c: any) => c.email === tokenData.email && c.authType === "oauth"
-          );
+          const match = existing.find((c: any) => {
+            if (c.email !== tokenData.email || c.authType !== "oauth") return false;
+            // For Codex, also check workspaceId to avoid overwriting different workspace connections
+            if (provider === "codex" && tokenData.providerSpecificData?.workspaceId) {
+              const existingWorkspace = c.providerSpecificData?.workspaceId;
+              return existingWorkspace === tokenData.providerSpecificData.workspaceId;
+            }
+            return true;
+          });
          const matchId = typeof match?.id === "string" ? match.id : null;
          if (matchId) {
            connection = await updateProviderConnection(matchId, {
@@ -4,6 +4,11 @@ import { getUsageForProvider } from "@omniroute/open-sse/services/usage.ts";
 import { getExecutor } from "@omniroute/open-sse/executors/index.ts";
 import { syncToCloud } from "@/lib/cloudSync";
 import { runWithProxyContext } from "@omniroute/open-sse/utils/proxyFetch.ts";
+import { setQuotaCache } from "@/domain/quotaCache";
+
+function isRecord(value: unknown): value is Record<string, any> {
+  return value !== null && typeof value === "object" && !Array.isArray(value);
+}

 /**
 * Sync to cloud if enabled
@@ -147,6 +152,12 @@ export async function GET(request: Request, { params }: { params: Promise<{ conn
    const usage = await runWithProxyContext(proxyInfo?.proxy || null, () =>
      getUsageForProvider(connection)
    );
+
+    // Populate quota cache for quota-aware account selection
+    if (isRecord(usage?.quotas)) {
+      setQuotaCache(connectionId, connection.provider, usage.quotas);
+    }
+
    return Response.json(usage);
  } catch (error) {
    console.error("[Usage API] Error fetching usage:", error);
@@ -107,7 +107,30 @@ export async function POST(request) {
  if (policy.rejection) return policy.rejection;

  // Parse model to get provider
-  const { provider } = parseImageModel(body.model);
+  let { provider } = parseImageModel(body.model);
+  let isCustomModel = false;
+
+  // If not in built-in registry, check custom models tagged for images
+  if (!provider) {
+    try {
+      const customModelsMap = (await getAllCustomModels()) as Record<string, any>;
+      for (const [providerId, models] of Object.entries(customModelsMap)) {
+        if (!Array.isArray(models)) continue;
+        for (const model of models) {
+          if (!model?.id || !Array.isArray(model.supportedEndpoints)) continue;
+          if (!model.supportedEndpoints.includes("images")) continue;
+          const fullId = `${providerId}/${model.id}`;
+          if (fullId === body.model) {
+            provider = providerId;
+            isCustomModel = true;
+            break;
+          }
+        }
+        if (provider) break;
+      }
+    } catch { /* best-effort lookup: on failure, fall through to the BAD_REQUEST response below */ }
+  }
+
  if (!provider) {
    return errorResponse(
      HTTP_STATUS.BAD_REQUEST,
@@ -128,9 +151,23 @@ export async function POST(request) {
        `No credentials for image provider: ${provider}`
      );
    }
+  } else if (isCustomModel) {
+    // Custom models need credentials from the provider connection
+    credentials = await getProviderCredentials(provider);
+    if (!credentials) {
+      return errorResponse(
+        HTTP_STATUS.BAD_REQUEST,
+        `No credentials for custom image provider: ${provider}`
+      );
+    }
  }

-  const result = await handleImageGeneration({ body, credentials, log });
+  const result = await handleImageGeneration({
+    body,
+    credentials,
+    log,
+    ...(isCustomModel && { resolvedProvider: provider }),
+  });

  if (result.success) {
    return new Response(JSON.stringify((result as any).data), {
@@ -6,6 +6,9 @@
   directives ensure all utility classes in route groups are included. */
@source "../app/(dashboard)";
@source "../../open-sse";
+@source not "../../*.sqlite*";
+@source not "../../.claude*";
+@source not "../../.claude-memory";

@custom-variant dark (&:where(.dark, .dark *));

@@ -0,0 +1,264 @@
+/**
+ * Quota Cache — Domain Layer
+ *
+ * In-memory cache of provider quota data per connectionId.
+ * Populated by:
+ *   - Dashboard usage endpoint (GET /api/usage/[connectionId])
+ *   - 429 responses marking account as exhausted
+ *
+ * Background refresh ticks every 1 minute:
+ *   - Active accounts (at least one quota above 0%): refetch every 5 minutes
+ *   - Exhausted accounts: refetch every 5 minutes, or on the first tick after resetAt passes
+ *
+ * @module domain/quotaCache
+ */
+
+import { getUsageForProvider } from "@omniroute/open-sse/services/usage.ts";
+import { getProviderConnectionById, resolveProxyForConnection } from "@/lib/localDb";
+import { runWithProxyContext } from "@omniroute/open-sse/utils/proxyFetch.ts";
+import { safePercentage } from "@/shared/utils/formatting";
+
+// ─── Types ──────────────────────────────────────────────────────────────────
+
+interface QuotaInfo {
+  remainingPercentage: number;
+  resetAt: string | null;
+}
+
+interface QuotaCacheEntry {
+  connectionId: string;
+  provider: string;
+  quotas: Record<string, QuotaInfo>;
+  fetchedAt: number;
+  exhausted: boolean;
+  nextResetAt: string | null;
+}
+
+// ─── Constants ──────────────────────────────────────────────────────────────
+
+const ACTIVE_TTL_MS = 5 * 60 * 1000; // 5 minutes for active accounts
+const EXHAUSTED_TTL_MS = 5 * 60 * 1000; // 5 minutes for 429-sourced entries (no resetAt)
+const EXHAUSTED_REFRESH_MS = 5 * 60 * 1000; // 5 minutes: recheck exhausted accounts (aligned with TTL)
+const REFRESH_INTERVAL_MS = 60 * 1000; // Background tick every 1 minute
+
+// ─── State ──────────────────────────────────────────────────────────────────
+
+const cache = new Map<string, QuotaCacheEntry>();
+const MAX_CONCURRENT_REFRESHES = 5;
+let refreshTimer: ReturnType<typeof setInterval> | null = null;
+let tickRunning = false;
+
+// ─── Helpers ────────────────────────────────────────────────────────────────
+
+function isExhausted(quotas: Record<string, QuotaInfo>): boolean {
+  const entries = Object.values(quotas);
+  if (entries.length === 0) return false;
+  return entries.every((q) => q.remainingPercentage <= 0);
+}
+
+function parseDate(value: string): number | null {
+  const ms = new Date(value).getTime();
+  return Number.isNaN(ms) ? null : ms;
+}
+
+function earliestResetAt(quotas: Record<string, QuotaInfo>): string | null {
+  let earliest: string | null = null;
+  let earliestMs = Infinity;
+  for (const q of Object.values(quotas)) {
+    if (!q.resetAt) continue;
+    const ms = parseDate(q.resetAt);
+    if (ms !== null && ms < earliestMs) {
+      earliestMs = ms;
+      earliest = q.resetAt;
+    }
+  }
+  return earliest;
+}
+
+function normalizeQuotas(rawQuotas: Record<string, any>): Record<string, QuotaInfo> {
+  const result: Record<string, QuotaInfo> = {};
+  for (const [key, q] of Object.entries(rawQuotas)) {
+    if (q && typeof q === "object") {
+      result[key] = {
+        remainingPercentage:
+          safePercentage(q.remainingPercentage) ??
+          (q.total > 0 ? Math.round(((q.total - (q.used || 0)) / q.total) * 100) : 0),
+        resetAt: q.resetAt || null,
+      };
+    }
+  }
+  return result;
+}
+
+// ─── Public API ─────────────────────────────────────────────────────────────
+
+/**
+ * Store quota data for a connection (called by usage endpoint and background refresh).
+ */
+export function setQuotaCache(
+  connectionId: string,
+  provider: string,
+  rawQuotas: Record<string, any>
+) {
+  const quotas = normalizeQuotas(rawQuotas);
+  const exhausted = isExhausted(quotas);
+  cache.set(connectionId, {
+    connectionId,
+    provider,
+    quotas,
+    fetchedAt: Date.now(),
+    exhausted,
+    nextResetAt: exhausted ? earliestResetAt(quotas) : null,
+  });
+}
+
+/**
+ * Get cached quota entry (returns null if not cached).
+ */
+export function getQuotaCache(connectionId: string): QuotaCacheEntry | null {
+  return cache.get(connectionId) || null;
+}
+
+/**
+ * Check if an account's quota is exhausted based on cached data.
+ * Returns false if no cache entry exists (unknown = assume available).
+ */
+export function isAccountQuotaExhausted(connectionId: string): boolean {
+  const entry = cache.get(connectionId);
+  if (!entry) return false;
+  if (!entry.exhausted) return false;
+
+  // If resetAt has passed, assume available until refresh confirms
+  if (entry.nextResetAt) {
+    const resetMs = parseDate(entry.nextResetAt);
+    if (resetMs !== null && resetMs <= Date.now()) return false;
+  }
+
+  // Exhausted entries without resetAt expire after fixed TTL
+  const age = Date.now() - entry.fetchedAt;
+  if (!entry.nextResetAt && age > EXHAUSTED_TTL_MS) return false;
+
+  return true;
+}
+
+/**
+ * Mark an account as quota-exhausted from a 429 response (no quota data available).
+ * Uses 5-minute fixed TTL since we don't know the actual resetAt.
+ */
+export function markAccountExhaustedFrom429(connectionId: string, provider: string) {
+  cache.set(connectionId, {
+    connectionId,
+    provider,
+    quotas: {},
+    fetchedAt: Date.now(),
+    exhausted: true,
+    nextResetAt: null,
+  });
+}
+
+// ─── Background Refresh ─────────────────────────────────────────────────────
+
+const refreshingSet = new Set<string>();
+
+async function refreshEntry(entry: QuotaCacheEntry) {
+  if (refreshingSet.has(entry.connectionId)) return;
+  refreshingSet.add(entry.connectionId);
+
+  try {
+    const connection = await getProviderConnectionById(entry.connectionId);
+    if (!connection || connection.authType !== "oauth" || !connection.isActive) {
+      cache.delete(entry.connectionId);
+      return;
+    }
+
+    const proxyInfo = await resolveProxyForConnection(entry.connectionId);
+    const usage = await runWithProxyContext(proxyInfo?.proxy || null, () =>
+      getUsageForProvider(connection)
+    );
+
+    if (usage?.quotas) {
+      setQuotaCache(entry.connectionId, entry.provider, usage.quotas);
+    }
+  } catch (err) {
+    console.warn(
+      `[QuotaCache] Refresh failed for ${entry.connectionId.slice(0, 8)}:`,
+      (err as any)?.message || err
+    );
+  } finally {
+    refreshingSet.delete(entry.connectionId);
+  }
+}
+
+function needsRefresh(entry: QuotaCacheEntry, now: number): boolean {
+  const age = now - entry.fetchedAt;
+  if (entry.exhausted) {
+    if (entry.nextResetAt) {
+      const resetMs = parseDate(entry.nextResetAt);
+      if (resetMs !== null && resetMs <= now) return true;
+    }
+    return age >= EXHAUSTED_REFRESH_MS;
+  }
+  return age >= ACTIVE_TTL_MS;
+}
+
+async function backgroundRefreshTick() {
+  if (tickRunning) return;
+  tickRunning = true;
+
+  try {
+    const now = Date.now();
+    const pending = [...cache.values()].filter((e) => needsRefresh(e, now));
+
+    // Refresh in batches to avoid thundering herd
+    for (let i = 0; i < pending.length; i += MAX_CONCURRENT_REFRESHES) {
+      const batch = pending.slice(i, i + MAX_CONCURRENT_REFRESHES);
+      await Promise.allSettled(batch.map(refreshEntry));
+    }
+  } finally {
+    tickRunning = false;
+  }
+}
+
+/**
+ * Start the background refresh timer.
+ */
+export function startBackgroundRefresh() {
+  if (refreshTimer) return;
+  refreshTimer = setInterval(backgroundRefreshTick, REFRESH_INTERVAL_MS);
+  refreshTimer?.unref?.();
+}
+
+/**
+ * Stop the background refresh timer.
+ */
+export function stopBackgroundRefresh() {
+  if (refreshTimer) {
+    clearInterval(refreshTimer);
+    refreshTimer = null;
+  }
+}
+
+/**
+ * Get cache stats (for debugging/dashboard).
+ */
+export function getQuotaCacheStats() {
+  const entries: Array<{
+    connectionId: string;
+    provider: string;
+    exhausted: boolean;
+    nextResetAt: string | null;
+    ageMs: number;
+  }> = [];
+
+  for (const entry of cache.values()) {
+    entries.push({
+      connectionId: entry.connectionId.slice(0, 8) + "...",
+      provider: entry.provider,
+      exhausted: entry.exhausted,
+      nextResetAt: entry.nextResetAt,
+      ageMs: Date.now() - entry.fetchedAt,
+    });
+  }
+
+  return { total: cache.size, entries };
+}
@@ -39,6 +39,13 @@ export async function register() {
    const { initApiBridgeServer } = await import("@/lib/apiBridgeServer");
    initApiBridgeServer();

+    // Quota cache: start background refresh for quota-aware account selection
+    // Dynamic import required — quotaCache depends on better-sqlite3 (Node-only),
+    // and instrumentation.ts is bundled for all runtimes including Edge.
+    const { startBackgroundRefresh } = await import("@/domain/quotaCache");
+    startBackgroundRefresh();
+    console.log("[STARTUP] Quota cache background refresh started");
+
    // Compliance: Initialize audit_log table + cleanup expired logs
    try {
      const { initAuditLog, cleanupExpiredLogs } = await import("@/lib/compliance/index");
@@ -79,6 +79,7 @@ const SCHEMA_SQL = `
    token_type TEXT,
    consecutive_use_count INTEGER DEFAULT 0,
    rate_limit_protection INTEGER DEFAULT 0,
+    last_used_at TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
  );
@@ -311,6 +312,10 @@ function ensureProviderConnectionsColumns(db: SqliteDatabase) {
      );
      console.log("[DB] Added provider_connections.rate_limit_protection column");
    }
+    if (!columnNames.has("last_used_at")) {
+      db.exec("ALTER TABLE provider_connections ADD COLUMN last_used_at TEXT");
+      console.log("[DB] Added provider_connections.last_used_at column");
+    }
  } catch (error: unknown) {
    const message = error instanceof Error ? error.message : String(error);
    console.warn("[DB] Failed to verify provider_connections schema:", message);
@@ -483,7 +488,7 @@ function migrateFromJson(db: SqliteDatabase, jsonPath: string) {
          rate_limited_until, health_check_interval, last_health_check_at,
          last_tested, api_key, id_token, provider_specific_data,
          expires_in, display_name, global_priority, default_model,
-          token_type, consecutive_use_count, rate_limit_protection, created_at, updated_at
+          token_type, consecutive_use_count, rate_limit_protection, last_used_at, created_at, updated_at
        ) VALUES (
          @id, @provider, @authType, @name, @email, @priority, @isActive,
          @accessToken, @refreshToken, @expiresAt, @tokenExpiresAt,
@@ -492,7 +497,7 @@ function migrateFromJson(db: SqliteDatabase, jsonPath: string) {
          @rateLimitedUntil, @healthCheckInterval, @lastHealthCheckAt,
          @lastTested, @apiKey, @idToken, @providerSpecificData,
          @expiresIn, @displayName, @globalPriority, @defaultModel,
-          @tokenType, @consecutiveUseCount, @rateLimitProtection, @createdAt, @updatedAt
+          @tokenType, @consecutiveUseCount, @rateLimitProtection, @lastUsedAt, @createdAt, @updatedAt
        )
      `);

@@ -533,6 +538,7 @@ function migrateFromJson(db: SqliteDatabase, jsonPath: string) {
          defaultModel: conn.defaultModel || null,
          tokenType: conn.tokenType || null,
          consecutiveUseCount: conn.consecutiveUseCount || 0,
+          lastUsedAt: conn.lastUsedAt || null,
          rateLimitProtection:
            conn.rateLimitProtection === true || conn.rateLimitProtection === 1 ? 1 : 0,
          createdAt: conn.createdAt || new Date().toISOString(),
@@ -217,7 +217,7 @@ function _insertConnectionRow(db: DbLike, conn: JsonRecord) {
      rate_limited_until, health_check_interval, last_health_check_at,
      last_tested, api_key, id_token, provider_specific_data,
      expires_in, display_name, global_priority, default_model,
-      token_type, consecutive_use_count, rate_limit_protection, created_at, updated_at
+      token_type, consecutive_use_count, rate_limit_protection, last_used_at, created_at, updated_at
    ) VALUES (
      @id, @provider, @authType, @name, @email, @priority, @isActive,
      @accessToken, @refreshToken, @expiresAt, @tokenExpiresAt,
@@ -226,7 +226,7 @@ function _insertConnectionRow(db: DbLike, conn: JsonRecord) {
      @rateLimitedUntil, @healthCheckInterval, @lastHealthCheckAt,
      @lastTested, @apiKey, @idToken, @providerSpecificData,
      @expiresIn, @displayName, @globalPriority, @defaultModel,
-      @tokenType, @consecutiveUseCount, @rateLimitProtection, @createdAt, @updatedAt
+      @tokenType, @consecutiveUseCount, @rateLimitProtection, @lastUsedAt, @createdAt, @updatedAt
    )
  `
  ).run({
@@ -267,6 +267,7 @@ function _insertConnectionRow(db: DbLike, conn: JsonRecord) {
    consecutiveUseCount: conn.consecutiveUseCount || 0,
    rateLimitProtection:
      conn.rateLimitProtection === true || conn.rateLimitProtection === 1 ? 1 : 0,
+    lastUsedAt: conn.lastUsedAt || null,
    createdAt: conn.createdAt,
    updatedAt: conn.updatedAt,
  });
@@ -290,6 +291,7 @@ function _updateConnectionRow(db: DbLike, id: string, data: JsonRecord) {
      default_model = @defaultModel, token_type = @tokenType,
      consecutive_use_count = @consecutiveUseCount,
      rate_limit_protection = @rateLimitProtection,
+      last_used_at = @lastUsedAt,
      updated_at = @updatedAt
    WHERE id = @id
  `
@@ -331,6 +333,7 @@ function _updateConnectionRow(db: DbLike, id: string, data: JsonRecord) {
    consecutiveUseCount: data.consecutiveUseCount || 0,
    rateLimitProtection:
      data.rateLimitProtection === true || data.rateLimitProtection === 1 ? 1 : 0,
+    lastUsedAt: data.lastUsedAt || null,
    updatedAt: now,
  });
 }
@@ -148,3 +148,11 @@ export function truncateUrl(url, max = 50) {
    return url.length > max ? url.slice(0, max) + "…" : url;
  }
 }
+
+/**
+ * Return the value as-is when it is a finite number, otherwise undefined (NaN, Infinity, non-numbers).
+ * Used by quota normalization in both backend (quotaCache) and frontend (ProviderLimits).
+ */
+export function safePercentage(value: unknown): number | undefined {
+  return typeof value === "number" && Number.isFinite(value) ? value : undefined;
+}
@@ -1,912 +0,0 @@
-"use strict";
-Object.defineProperty(exports, "__esModule", { value: true });
-exports.dbBackupRestoreSchema =
-  exports.testComboSchema =
-  exports.updateComboSchema =
-  exports.cloudSyncActionSchema =
-  exports.cloudModelAliasUpdateSchema =
-  exports.cloudResolveAliasSchema =
-  exports.cloudCredentialUpdateSchema =
-  exports.kiroSocialExchangeSchema =
-  exports.kiroImportSchema =
-  exports.cursorImportSchema =
-  exports.oauthPollSchema =
-  exports.oauthExchangeSchema =
-  exports.translatorTranslateSchema =
-  exports.translatorSendSchema =
-  exports.translatorSaveSchema =
-  exports.translatorDetectSchema =
-  exports.testProxySchema =
-  exports.updateProxyConfigSchema =
-  exports.removeModelAliasSchema =
-  exports.addModelAliasSchema =
-  exports.updateModelAliasesSchema =
-  exports.updateIpFilterSchema =
-  exports.updateThinkingBudgetSchema =
-  exports.updateSystemPromptSchema =
-  exports.updateRequireLoginSchema =
-  exports.updateComboDefaultsSchema =
-  exports.resetStatsActionSchema =
-  exports.jsonObjectSchema =
-  exports.updateResilienceSchema =
-  exports.toggleRateLimitSchema =
-  exports.updatePricingSchema =
-  exports.providerModelMutationSchema =
-  exports.clearModelAvailabilitySchema =
-  exports.updateModelAliasSchema =
-  exports.removeFallbackSchema =
-  exports.registerFallbackSchema =
-  exports.policyActionSchema =
-  exports.setBudgetSchema =
-  exports.v1CountTokensSchema =
-  exports.providerChatCompletionSchema =
-  exports.v1RerankSchema =
-  exports.v1ModerationSchema =
-  exports.v1AudioSpeechSchema =
-  exports.v1ImageGenerationSchema =
-  exports.v1EmbeddingsSchema =
-  exports.loginSchema =
-  exports.updateSettingsSchema =
-  exports.createComboSchema =
-  exports.createKeySchema =
-  exports.createProviderSchema =
-    void 0;
-exports.guideSettingsSaveSchema =
-  exports.codexProfileIdSchema =
-  exports.codexProfileNameSchema =
-  exports.cliModelConfigSchema =
-  exports.cliSettingsEnvSchema =
-  exports.cliBackupMutationSchema =
-  exports.cliMitmAliasUpdateSchema =
-  exports.cliMitmStopSchema =
-  exports.cliMitmStartSchema =
-  exports.v1betaGeminiGenerateSchema =
-  exports.validateProviderApiKeySchema =
-  exports.providersBatchTestSchema =
-  exports.updateProviderConnectionSchema =
-  exports.providerNodeValidateSchema =
-  exports.updateProviderNodeSchema =
-  exports.createProviderNodeSchema =
-  exports.updateKeyPermissionsSchema =
-  exports.evalRunSuiteSchema =
-    void 0;
-exports.validateBody = validateBody;
-var zod_1 = require("zod");
-// ──── Provider Schemas ────
-exports.createProviderSchema = zod_1.z.object({
-  provider: zod_1.z.string().min(1).max(100),
-  apiKey: zod_1.z.string().min(1).max(10000),
-  name: zod_1.z.string().min(1).max(200),
-  priority: zod_1.z.number().int().min(1).max(100).optional(),
-  globalPriority: zod_1.z.number().int().min(1).max(100).nullable().optional(),
-  defaultModel: zod_1.z.string().max(200).nullable().optional(),
-  testStatus: zod_1.z.string().max(50).optional(),
-});
-// ──── API Key Schemas ────
-exports.createKeySchema = zod_1.z.object({
-  name: zod_1.z.string().min(1, "Name is required").max(200),
-});
-// ──── Combo Schemas ────
-// A model entry can be a plain string (legacy) or an object with weight
-var comboModelEntry = zod_1.z.union([
-  zod_1.z.string(),
-  zod_1.z.object({
-    model: zod_1.z.string().min(1),
-    weight: zod_1.z.number().min(0).max(100).default(0),
-  }),
-]);
-// Per-combo config overrides
-var comboConfigSchema = zod_1.z
-  .object({
-    maxRetries: zod_1.z.number().int().min(0).max(10).optional(),
-    retryDelayMs: zod_1.z.number().int().min(0).max(60000).optional(),
-    timeoutMs: zod_1.z.number().int().min(1000).max(600000).optional(),
-    healthCheckEnabled: zod_1.z.boolean().optional(),
-  })
-  .optional();
-var comboStrategySchema = zod_1.z.enum([
-  "priority",
-  "weighted",
-  "round-robin",
-  "random",
-  "least-used",
-  "cost-optimized",
-]);
-var comboRuntimeConfigSchema = zod_1.z
-  .object({
-    strategy: comboStrategySchema.optional(),
-    maxRetries: zod_1.z.coerce.number().int().min(0).max(10).optional(),
-    retryDelayMs: zod_1.z.coerce.number().int().min(0).max(60000).optional(),
-    timeoutMs: zod_1.z.coerce.number().int().min(1000).max(600000).optional(),
-    concurrencyPerModel: zod_1.z.coerce.number().int().min(1).max(20).optional(),
-    queueTimeoutMs: zod_1.z.coerce.number().int().min(1000).max(120000).optional(),
-    healthCheckEnabled: zod_1.z.boolean().optional(),
-    healthCheckTimeoutMs: zod_1.z.coerce.number().int().min(100).max(30000).optional(),
-    maxComboDepth: zod_1.z.coerce.number().int().min(1).max(10).optional(),
-    trackMetrics: zod_1.z.boolean().optional(),
-  })
-  .strict();
-exports.createComboSchema = zod_1.z.object({
-  name: zod_1.z
-    .string()
-    .min(1, "Name is required")
-    .max(100)
-    .regex(/^[a-zA-Z0-9_/.-]+$/, "Name can only contain letters, numbers, -, _, / and ."),
-  models: zod_1.z.array(comboModelEntry).optional().default([]),
-  strategy: comboStrategySchema.optional().default("priority"),
-  config: comboConfigSchema,
-});
-// ──── Settings Schemas ────
-// FASE-01: Removed .passthrough() — only explicitly listed fields are accepted
-exports.updateSettingsSchema = zod_1.z.object({
-  newPassword: zod_1.z.string().min(1).max(200).optional(),
-  currentPassword: zod_1.z.string().max(200).optional(),
-  theme: zod_1.z.string().max(50).optional(),
-  language: zod_1.z.string().max(10).optional(),
-  requireLogin: zod_1.z.boolean().optional(),
-  enableRequestLogs: zod_1.z.boolean().optional(),
-  enableSocks5Proxy: zod_1.z.boolean().optional(),
-  instanceName: zod_1.z.string().max(100).optional(),
-  corsOrigins: zod_1.z.string().max(500).optional(),
-  logRetentionDays: zod_1.z.number().int().min(1).max(365).optional(),
-  cloudUrl: zod_1.z.string().max(500).optional(),
-  baseUrl: zod_1.z.string().max(500).optional(),
-  setupComplete: zod_1.z.boolean().optional(),
-  requireAuthForModels: zod_1.z.boolean().optional(),
-  blockedProviders: zod_1.z.array(zod_1.z.string().max(100)).optional(),
-  hideHealthCheckLogs: zod_1.z.boolean().optional(),
-  // Routing settings (#134)
-  fallbackStrategy: zod_1.z
-    .enum(["fill-first", "round-robin", "p2c", "random", "least-used", "cost-optimized"])
-    .optional(),
-  wildcardAliases: zod_1.z
-    .array(zod_1.z.object({ pattern: zod_1.z.string(), target: zod_1.z.string() }))
-    .optional(),
-  stickyRoundRobinLimit: zod_1.z.number().int().min(0).max(1000).optional(),
-});
-// ──── Auth Schemas ────
-exports.loginSchema = zod_1.z.object({
-  password: zod_1.z.string().min(1, "Password is required").max(200),
-});
-// ──── API Route Payload Schemas (T06) ────
-var modelIdSchema = zod_1.z.string().trim().min(1, "Model is required").max(200);
-var nonEmptyStringSchema = zod_1.z.string().trim().min(1, "Field is required");
-var embeddingTokenArraySchema = zod_1.z
-  .array(zod_1.z.number().int().min(0))
-  .min(1, "input token array must contain at least one item");
-var embeddingInputSchema = zod_1.z.union([
-  nonEmptyStringSchema,
-  zod_1.z.array(nonEmptyStringSchema).min(1, "input must contain at least one item"),
-  embeddingTokenArraySchema,
-  zod_1.z.array(embeddingTokenArraySchema).min(1, "input must contain at least one item"),
-]);
-var chatMessageSchema = zod_1.z
-  .object({
-    role: zod_1.z.string().trim().min(1, "messages[].role is required"),
-    content: zod_1.z
-      .union([nonEmptyStringSchema, zod_1.z.array(zod_1.z.unknown()).min(1), zod_1.z.null()])
-      .optional(),
-  })
-  .catchall(zod_1.z.unknown());
-var countTokensMessageSchema = zod_1.z
-  .object({
-    content: zod_1.z.union([
-      nonEmptyStringSchema,
-      zod_1.z
-        .array(
-          zod_1.z
-            .object({
-              type: zod_1.z.string().optional(),
-              text: zod_1.z.string().optional(),
-            })
-            .catchall(zod_1.z.unknown())
-        )
-        .min(1, "messages[].content must contain at least one item"),
-    ]),
-  })
-  .catchall(zod_1.z.unknown());
-exports.v1EmbeddingsSchema = zod_1.z
-  .object({
-    model: modelIdSchema,
-    input: embeddingInputSchema,
-    dimensions: zod_1.z.coerce.number().int().positive().optional(),
-    encoding_format: zod_1.z.enum(["float", "base64"]).optional(),
-  })
-  .catchall(zod_1.z.unknown());
-exports.v1ImageGenerationSchema = zod_1.z
-  .object({
-    model: modelIdSchema,
-    prompt: nonEmptyStringSchema,
-  })
-  .catchall(zod_1.z.unknown());
-exports.v1AudioSpeechSchema = zod_1.z
-  .object({
-    model: modelIdSchema,
-    input: nonEmptyStringSchema,
-  })
-  .catchall(zod_1.z.unknown());
-exports.v1ModerationSchema = zod_1.z
-  .object({
-    model: modelIdSchema.optional(),
-    input: zod_1.z.unknown().refine(function (value) {
-      if (value === undefined || value === null) return false;
-      if (typeof value === "string") return value.trim().length > 0;
-      if (Array.isArray(value)) return value.length > 0;
-      return true;
-    }, "Input is required"),
-  })
-  .catchall(zod_1.z.unknown());
-exports.v1RerankSchema = zod_1.z
-  .object({
-    model: modelIdSchema,
-    query: nonEmptyStringSchema,
-    documents: zod_1.z.array(zod_1.z.unknown()).min(1, "documents must contain at least one item"),
-  })
-  .catchall(zod_1.z.unknown());
-exports.providerChatCompletionSchema = zod_1.z
-  .object({
-    model: modelIdSchema,
-    messages: zod_1.z.array(chatMessageSchema).min(1).optional(),
-    input: zod_1.z
-      .union([nonEmptyStringSchema, zod_1.z.array(zod_1.z.unknown()).min(1)])
-      .optional(),
-    prompt: nonEmptyStringSchema.optional(),
-  })
-  .catchall(zod_1.z.unknown())
-  .superRefine(function (value, ctx) {
-    if (value.messages === undefined && value.input === undefined && value.prompt === undefined) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "messages, input or prompt is required",
-        path: [],
-      });
-    }
-  });
-exports.v1CountTokensSchema = zod_1.z
-  .object({
-    messages: zod_1.z
-      .array(countTokensMessageSchema)
-      .min(1, "messages must contain at least one item"),
-  })
-  .catchall(zod_1.z.unknown());
-exports.setBudgetSchema = zod_1.z.object({
-  apiKeyId: zod_1.z.string().trim().min(1, "apiKeyId is required"),
-  dailyLimitUsd: zod_1.z.coerce.number().positive("dailyLimitUsd must be greater than zero"),
-  monthlyLimitUsd: zod_1.z.coerce
-    .number()
-    .positive("monthlyLimitUsd must be greater than zero")
-    .optional(),
-  warningThreshold: zod_1.z.coerce.number().min(0).max(1).optional(),
-});
-exports.policyActionSchema = zod_1.z
-  .object({
-    action: zod_1.z.enum(["unlock"]),
-    identifier: zod_1.z.string().trim().min(1).optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (value.action === "unlock" && !value.identifier) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "identifier is required for unlock action",
-        path: ["identifier"],
-      });
-    }
-  });
-var fallbackChainEntrySchema = zod_1.z
-  .object({
-    provider: zod_1.z.string().trim().min(1, "provider is required"),
-    priority: zod_1.z.number().int().min(1).max(100).optional(),
-    enabled: zod_1.z.boolean().optional(),
-  })
-  .catchall(zod_1.z.unknown());
-exports.registerFallbackSchema = zod_1.z.object({
-  model: modelIdSchema,
-  chain: zod_1.z.array(fallbackChainEntrySchema).min(1, "chain must contain at least one provider"),
-});
-exports.removeFallbackSchema = zod_1.z.object({
-  model: modelIdSchema,
-});
-exports.updateModelAliasSchema = zod_1.z.object({
-  model: modelIdSchema,
-  alias: zod_1.z.string().trim().min(1, "Alias is required").max(200),
-});
-exports.clearModelAvailabilitySchema = zod_1.z.object({
-  provider: zod_1.z.string().trim().min(1, "provider is required").max(120),
-  model: modelIdSchema,
-});
-exports.providerModelMutationSchema = zod_1.z.object({
-  provider: zod_1.z.string().trim().min(1, "provider is required").max(120),
-  modelId: zod_1.z.string().trim().min(1, "modelId is required").max(240),
-  modelName: zod_1.z.string().trim().max(240).optional(),
-  source: zod_1.z.string().trim().max(80).optional(),
-});
-var pricingFieldsSchema = zod_1.z
-  .object({
-    input: zod_1.z.number().min(0).optional(),
-    output: zod_1.z.number().min(0).optional(),
-    cached: zod_1.z.number().min(0).optional(),
-    reasoning: zod_1.z.number().min(0).optional(),
-    cache_creation: zod_1.z.number().min(0).optional(),
-  })
-  .strict();
-exports.updatePricingSchema = zod_1.z.record(
-  zod_1.z.string().trim().min(1),
-  zod_1.z.record(zod_1.z.string().trim().min(1), pricingFieldsSchema)
-);
-exports.toggleRateLimitSchema = zod_1.z.object({
-  connectionId: zod_1.z.string().trim().min(1, "connectionId is required"),
-  enabled: zod_1.z.boolean(),
-});
-var resilienceProfileSchema = zod_1.z.object({
-  transientCooldown: zod_1.z.number().min(0),
-  rateLimitCooldown: zod_1.z.number().min(0),
-  maxBackoffLevel: zod_1.z.number().int().min(0),
-  circuitBreakerThreshold: zod_1.z.number().int().min(0),
-  circuitBreakerReset: zod_1.z.number().min(0),
-});
-var resilienceDefaultsSchema = zod_1.z
-  .object({
-    requestsPerMinute: zod_1.z.number().int().min(1).optional(),
-    minTimeBetweenRequests: zod_1.z.number().int().min(1).optional(),
-    concurrentRequests: zod_1.z.number().int().min(1).optional(),
-  })
-  .strict();
-exports.updateResilienceSchema = zod_1.z
-  .object({
-    profiles: zod_1.z
-      .object({
-        oauth: resilienceProfileSchema.optional(),
-        apikey: resilienceProfileSchema.optional(),
-      })
-      .strict()
-      .optional(),
-    defaults: resilienceDefaultsSchema.optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (!value.profiles && !value.defaults) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "Must provide profiles or defaults",
-        path: [],
-      });
-    }
-  });
-exports.jsonObjectSchema = zod_1.z.record(zod_1.z.string(), zod_1.z.unknown());
-exports.resetStatsActionSchema = zod_1.z.object({
-  action: zod_1.z.literal("reset-stats"),
-});
-exports.updateComboDefaultsSchema = zod_1.z
-  .object({
-    comboDefaults: comboRuntimeConfigSchema.optional(),
-    providerOverrides: zod_1.z
-      .record(zod_1.z.string().trim().min(1), comboRuntimeConfigSchema)
-      .optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (!value.comboDefaults && !value.providerOverrides) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "Nothing to update",
-        path: [],
-      });
-    }
-  });
-exports.updateRequireLoginSchema = zod_1.z
-  .object({
-    requireLogin: zod_1.z.boolean().optional(),
-    password: zod_1.z.string().min(4, "Password must be at least 4 characters").optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (value.requireLogin === undefined && !value.password) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "No valid fields to update",
-        path: [],
-      });
-    }
-  });
-exports.updateSystemPromptSchema = zod_1.z
-  .object({
-    prompt: zod_1.z.string().max(50000).optional(),
-    enabled: zod_1.z.boolean().optional(),
-  })
-  .strict()
-  .superRefine(function (value, ctx) {
-    if (value.prompt === undefined && value.enabled === undefined) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "No valid fields to update",
-        path: [],
-      });
-    }
-  });
-exports.updateThinkingBudgetSchema = zod_1.z
-  .object({
-    mode: zod_1.z.enum(["passthrough", "auto", "custom", "adaptive"]).optional(),
-    customBudget: zod_1.z.coerce.number().int().min(0).max(131072).optional(),
-    effortLevel: zod_1.z.enum(["none", "low", "medium", "high"]).optional(),
-    baseBudget: zod_1.z.coerce.number().int().min(0).max(131072).optional(),
-    complexityMultiplier: zod_1.z.coerce.number().min(0).optional(),
-  })
-  .strict()
-  .superRefine(function (value, ctx) {
-    if (
-      value.mode === undefined &&
-      value.customBudget === undefined &&
-      value.effortLevel === undefined &&
-      value.baseBudget === undefined &&
-      value.complexityMultiplier === undefined
-    ) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "No valid fields to update",
-        path: [],
-      });
-    }
-  });
-var ipFilterModeSchema = zod_1.z.enum(["blacklist", "whitelist"]);
-var tempBanSchema = zod_1.z.object({
-  ip: zod_1.z.string().trim().min(1),
-  durationMs: zod_1.z.coerce.number().int().min(1).optional(),
-  reason: zod_1.z.string().max(200).optional(),
-});
-exports.updateIpFilterSchema = zod_1.z
-  .object({
-    enabled: zod_1.z.boolean().optional(),
-    mode: ipFilterModeSchema.optional(),
-    blacklist: zod_1.z.array(zod_1.z.string()).optional(),
-    whitelist: zod_1.z.array(zod_1.z.string()).optional(),
-    addBlacklist: zod_1.z.string().optional(),
-    removeBlacklist: zod_1.z.string().optional(),
-    addWhitelist: zod_1.z.string().optional(),
-    removeWhitelist: zod_1.z.string().optional(),
-    tempBan: tempBanSchema.optional(),
-    removeBan: zod_1.z.string().optional(),
-  })
-  .strict()
-  .superRefine(function (value, ctx) {
-    if (Object.keys(value).length === 0) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "No valid fields to update",
-        path: [],
-      });
-    }
-  });
-exports.updateModelAliasesSchema = zod_1.z.object({
-  aliases: zod_1.z.record(zod_1.z.string().trim().min(1), zod_1.z.string().trim().min(1)),
-});
-exports.addModelAliasSchema = zod_1.z.object({
-  from: zod_1.z.string().trim().min(1),
-  to: zod_1.z.string().trim().min(1),
-});
-exports.removeModelAliasSchema = zod_1.z.object({
-  from: zod_1.z.string().trim().min(1),
-});
-var proxyConfigSchema = zod_1.z
-  .object({
-    type: zod_1.z
-      .preprocess(
-        function (value) {
-          return typeof value === "string" ? value.trim().toLowerCase() : value;
-        },
-        zod_1.z.enum(["http", "https", "socks5"])
-      )
-      .optional(),
-    host: zod_1.z.string().trim().min(1).optional(),
-    port: zod_1.z.coerce.number().int().min(1).max(65535).optional(),
-    username: zod_1.z.string().optional(),
-    password: zod_1.z.string().optional(),
-  })
-  .strict();
-exports.updateProxyConfigSchema = zod_1.z
-  .object({
-    proxy: proxyConfigSchema.nullable().optional(),
-    global: proxyConfigSchema.nullable().optional(),
-    providers: zod_1.z
-      .record(zod_1.z.string().trim().min(1), proxyConfigSchema.nullable())
-      .optional(),
-    combos: zod_1.z.record(zod_1.z.string().trim().min(1), proxyConfigSchema.nullable()).optional(),
-    keys: zod_1.z.record(zod_1.z.string().trim().min(1), proxyConfigSchema.nullable()).optional(),
-    level: zod_1.z.enum(["global", "provider", "combo", "key"]).optional(),
-    id: zod_1.z.string().optional(),
-  })
-  .strict()
-  .superRefine(function (value, ctx) {
-    var _a;
-    var hasPayload =
-      value.proxy !== undefined ||
-      value.global !== undefined ||
-      value.providers !== undefined ||
-      value.combos !== undefined ||
-      value.keys !== undefined ||
-      value.level !== undefined;
-    if (!hasPayload) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "No valid fields to update",
-        path: [],
-      });
-    }
-    if (value.level !== undefined && value.proxy === undefined) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "proxy is required when level is provided",
-        path: ["proxy"],
-      });
-    }
-    if (
-      value.level &&
-      value.level !== "global" &&
-      !((_a = value.id) === null || _a === void 0 ? void 0 : _a.trim())
-    ) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "id is required for provider/combo/key level updates",
-        path: ["id"],
-      });
-    }
-  });
-exports.testProxySchema = zod_1.z.object({
-  proxy: zod_1.z.object({
-    type: zod_1.z.string().optional(),
-    host: zod_1.z.string().trim().min(1, "proxy.host is required"),
-    port: zod_1.z.union([zod_1.z.string(), zod_1.z.number()]),
-    username: zod_1.z.string().optional(),
-    password: zod_1.z.string().optional(),
-  }),
-});
-var jsonRecordSchema = zod_1.z.record(zod_1.z.string(), zod_1.z.unknown());
-var nonEmptyJsonRecordSchema = jsonRecordSchema.refine(function (value) {
-  return Object.keys(value).length > 0;
-}, "Body must be a non-empty object");
-var translatorLogFileSchema = zod_1.z.enum([
-  "1_req_client.json",
-  "2_req_source.json",
-  "3_req_openai.json",
-  "4_req_target.json",
-  "5_res_provider.txt",
-]);
-exports.translatorDetectSchema = zod_1.z.object({
-  body: nonEmptyJsonRecordSchema,
-});
-exports.translatorSaveSchema = zod_1.z.object({
-  file: translatorLogFileSchema,
-  content: zod_1.z.string().min(1, "Content is required").max(1000000, "Content is too large"),
-});
-exports.translatorSendSchema = zod_1.z.object({
-  provider: zod_1.z.string().trim().min(1, "Provider is required"),
-  body: nonEmptyJsonRecordSchema,
-});
-exports.translatorTranslateSchema = zod_1.z
-  .object({
-    step: zod_1.z.union([zod_1.z.number().int().min(1).max(4), zod_1.z.literal("direct")]),
-    provider: zod_1.z.string().trim().min(1).optional(),
-    body: nonEmptyJsonRecordSchema,
-    sourceFormat: zod_1.z.string().optional(),
-    targetFormat: zod_1.z.string().optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (value.step !== "direct" && !value.provider) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "Step and provider are required",
-        path: ["provider"],
-      });
-    }
-  });
-exports.oauthExchangeSchema = zod_1.z.object({
-  code: zod_1.z.string().trim().min(1),
-  redirectUri: zod_1.z.string().trim().min(1),
-  codeVerifier: zod_1.z.string().trim().min(1),
-  state: zod_1.z.string().optional(),
-});
-exports.oauthPollSchema = zod_1.z.object({
-  deviceCode: zod_1.z.string().trim().min(1),
-  codeVerifier: zod_1.z.string().optional(),
-  extraData: zod_1.z.unknown().optional(),
-});
-exports.cursorImportSchema = zod_1.z.object({
-  accessToken: zod_1.z.string().trim().min(1, "Access token is required"),
-  machineId: zod_1.z.string().trim().min(1, "Machine ID is required"),
-});
-exports.kiroImportSchema = zod_1.z.object({
-  refreshToken: zod_1.z.string().trim().min(1, "Refresh token is required"),
-});
-exports.kiroSocialExchangeSchema = zod_1.z.object({
-  code: zod_1.z.string().trim().min(1, "Code is required"),
-  codeVerifier: zod_1.z.string().trim().min(1, "Code verifier is required"),
-  provider: zod_1.z.enum(["google", "github"]),
-});
-exports.cloudCredentialUpdateSchema = zod_1.z.object({
-  provider: zod_1.z.string().trim().min(1, "Provider is required"),
-  credentials: zod_1.z
-    .object({
-      accessToken: zod_1.z.string().optional(),
-      refreshToken: zod_1.z.string().optional(),
-      expiresIn: zod_1.z.coerce.number().positive().optional(),
-    })
-    .strict()
-    .superRefine(function (value, ctx) {
-      if (
-        value.accessToken === undefined &&
-        value.refreshToken === undefined &&
-        value.expiresIn === undefined
-      ) {
-        ctx.addIssue({
-          code: zod_1.z.ZodIssueCode.custom,
-          message: "At least one credential field must be provided",
-          path: [],
-        });
-      }
-    }),
-});
-exports.cloudResolveAliasSchema = zod_1.z.object({
-  alias: zod_1.z.string().trim().min(1, "Missing alias"),
-});
-exports.cloudModelAliasUpdateSchema = zod_1.z.object({
-  model: zod_1.z.string().trim().min(1, "Model and alias required"),
-  alias: zod_1.z.string().trim().min(1, "Model and alias required"),
-});
-exports.cloudSyncActionSchema = zod_1.z.object({
-  action: zod_1.z.enum(["enable", "sync", "disable"]),
-});
-exports.updateComboSchema = zod_1.z
-  .object({
-    name: zod_1.z
-      .string()
-      .min(1, "Name is required")
-      .max(100)
-      .regex(/^[a-zA-Z0-9_/.-]+$/, "Name can only contain letters, numbers, -, _, / and .")
-      .optional(),
-    models: zod_1.z.array(comboModelEntry).optional(),
-    strategy: comboStrategySchema.optional(),
-    config: comboRuntimeConfigSchema.optional(),
-    isActive: zod_1.z.boolean().optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (
-      value.name === undefined &&
-      value.models === undefined &&
-      value.strategy === undefined &&
-      value.config === undefined &&
-      value.isActive === undefined
-    ) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "No valid fields to update",
-        path: [],
-      });
-    }
-  });
-exports.testComboSchema = zod_1.z.object({
-  comboName: zod_1.z.string().trim().min(1, "comboName is required"),
-});
-exports.dbBackupRestoreSchema = zod_1.z.object({
-  backupId: zod_1.z.string().trim().min(1, "backupId is required"),
-});
-exports.evalRunSuiteSchema = zod_1.z.object({
-  suiteId: zod_1.z.string().trim().min(1, "suiteId is required"),
-  outputs: zod_1.z.record(zod_1.z.string(), zod_1.z.unknown()),
-});
-exports.updateKeyPermissionsSchema = zod_1.z
-  .object({
-    allowedModels: zod_1.z.array(zod_1.z.string().trim().min(1)).max(1000).optional(),
-    noLog: zod_1.z.boolean().optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (value.allowedModels === undefined && value.noLog === undefined) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "No valid fields to update",
-        path: [],
-      });
-    }
-  });
-exports.createProviderNodeSchema = zod_1.z
-  .object({
-    name: zod_1.z.string().trim().min(1, "Name is required"),
-    prefix: zod_1.z.string().trim().min(1, "Prefix is required"),
-    apiType: zod_1.z.enum(["chat", "responses"]).optional(),
-    baseUrl: zod_1.z.string().trim().min(1).optional(),
-    type: zod_1.z.enum(["openai-compatible", "anthropic-compatible"]).optional(),
-  })
-  .superRefine(function (value, ctx) {
-    var nodeType = value.type || "openai-compatible";
-    if (nodeType === "openai-compatible" && !value.apiType) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "Invalid OpenAI compatible API type",
-        path: ["apiType"],
-      });
-    }
-  });
-exports.updateProviderNodeSchema = zod_1.z.object({
-  name: zod_1.z.string().trim().min(1, "Name is required"),
-  prefix: zod_1.z.string().trim().min(1, "Prefix is required"),
-  apiType: zod_1.z.enum(["chat", "responses"]).optional(),
-  baseUrl: zod_1.z.string().trim().min(1, "Base URL is required"),
-});
-exports.providerNodeValidateSchema = zod_1.z.object({
-  baseUrl: zod_1.z.string().trim().min(1, "Base URL and API key required"),
-  apiKey: zod_1.z.string().trim().min(1, "Base URL and API key required"),
-  type: zod_1.z.enum(["openai-compatible", "anthropic-compatible"]).optional(),
-});
-exports.updateProviderConnectionSchema = zod_1.z
-  .object({
-    name: zod_1.z.string().max(200).optional(),
-    priority: zod_1.z.coerce.number().int().min(1).max(100).optional(),
-    globalPriority: zod_1.z
-      .union([zod_1.z.coerce.number().int().min(1).max(100), zod_1.z.null()])
-      .optional(),
-    defaultModel: zod_1.z.union([zod_1.z.string().max(200), zod_1.z.null()]).optional(),
-    isActive: zod_1.z.boolean().optional(),
-    apiKey: zod_1.z.string().max(10000).optional(),
-    testStatus: zod_1.z.string().max(50).optional(),
-    lastError: zod_1.z.union([zod_1.z.string(), zod_1.z.null()]).optional(),
-    lastErrorAt: zod_1.z.union([zod_1.z.string(), zod_1.z.null()]).optional(),
-    lastErrorType: zod_1.z.union([zod_1.z.string(), zod_1.z.null()]).optional(),
-    lastErrorSource: zod_1.z.union([zod_1.z.string(), zod_1.z.null()]).optional(),
-    errorCode: zod_1.z.union([zod_1.z.string(), zod_1.z.null()]).optional(),
-    rateLimitedUntil: zod_1.z.union([zod_1.z.string(), zod_1.z.null()]).optional(),
-    lastTested: zod_1.z.union([zod_1.z.string(), zod_1.z.null()]).optional(),
-    healthCheckInterval: zod_1.z.coerce.number().int().min(0).optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (Object.keys(value).length === 0) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "No valid fields to update",
-        path: [],
-      });
-    }
-  });
-exports.providersBatchTestSchema = zod_1.z
-  .object({
-    mode: zod_1.z.enum(["provider", "oauth", "free", "apikey", "compatible", "all"]),
-    providerId: zod_1.z.string().trim().min(1).optional(),
-  })
-  .superRefine(function (value, ctx) {
-    if (value.mode === "provider" && !value.providerId) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "providerId is required when mode=provider",
-        path: ["providerId"],
-      });
-    }
-  });
-exports.validateProviderApiKeySchema = zod_1.z.object({
-  provider: zod_1.z.string().trim().min(1, "Provider and API key required"),
-  apiKey: zod_1.z.string().trim().min(1, "Provider and API key required"),
-});
-var geminiPartSchema = zod_1.z
-  .object({
-    text: zod_1.z.string().optional(),
-  })
-  .catchall(zod_1.z.unknown());
-var geminiContentSchema = zod_1.z
-  .object({
-    role: zod_1.z.string().optional(),
-    parts: zod_1.z.array(geminiPartSchema).optional(),
-  })
-  .catchall(zod_1.z.unknown());
-exports.v1betaGeminiGenerateSchema = zod_1.z
-  .object({
-    contents: zod_1.z.array(geminiContentSchema).optional(),
-    systemInstruction: zod_1.z
-      .object({
-        parts: zod_1.z.array(geminiPartSchema).optional(),
-      })
-      .catchall(zod_1.z.unknown())
-      .optional(),
-    generationConfig: zod_1.z
-      .object({
-        stream: zod_1.z.boolean().optional(),
-        maxOutputTokens: zod_1.z.coerce.number().int().min(1).optional(),
-        temperature: zod_1.z.coerce.number().optional(),
-        topP: zod_1.z.coerce.number().optional(),
-      })
-      .catchall(zod_1.z.unknown())
-      .optional(),
-  })
-  .catchall(zod_1.z.unknown())
-  .superRefine(function (value, ctx) {
-    if (!value.contents && !value.systemInstruction) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "contents or systemInstruction is required",
-        path: [],
-      });
-    }
-  });
-exports.cliMitmStartSchema = zod_1.z.object({
-  apiKey: zod_1.z.string().trim().min(1, "Missing apiKey"),
-  sudoPassword: zod_1.z.string().optional(),
-});
-exports.cliMitmStopSchema = zod_1.z.object({
-  sudoPassword: zod_1.z.string().optional(),
-});
-exports.cliMitmAliasUpdateSchema = zod_1.z.object({
-  tool: zod_1.z.string().trim().min(1, "tool and mappings required"),
-  mappings: zod_1.z.record(zod_1.z.string(), zod_1.z.string().optional()),
-});
-exports.cliBackupMutationSchema = zod_1.z
-  .object({
-    tool: zod_1.z.string().trim().min(1).optional(),
-    toolId: zod_1.z.string().trim().min(1).optional(),
-    backupId: zod_1.z.string().trim().min(1, "tool and backupId are required"),
-  })
-  .superRefine(function (value, ctx) {
-    if (!value.tool && !value.toolId) {
-      ctx.addIssue({
-        code: zod_1.z.ZodIssueCode.custom,
-        message: "tool and backupId are required",
-        path: ["tool"],
-      });
-    }
-  });
-var envKeySchema = zod_1.z
-  .string()
-  .trim()
-  .min(1, "Environment key is required")
-  .max(120)
-  .regex(/^[A-Z_][A-Z0-9_]*$/, "Invalid environment key format");
-var envValueSchema = zod_1.z
-  .union([zod_1.z.string(), zod_1.z.number(), zod_1.z.boolean()])
-  .transform(function (value) {
-    return String(value);
-  })
-  .refine(function (value) {
-    return value.length > 0;
-  }, "Environment value is required")
-  .refine(function (value) {
-    return value.length <= 10000;
-  }, "Environment value is too long");
-exports.cliSettingsEnvSchema = zod_1.z.object({
-  env: zod_1.z.record(envKeySchema, envValueSchema).refine(function (value) {
-    return Object.keys(value).length > 0;
-  }, "env must contain at least one key"),
-});
-exports.cliModelConfigSchema = zod_1.z.object({
-  baseUrl: zod_1.z.string().trim().min(1, "baseUrl and model are required"),
-  apiKey: zod_1.z.string().optional(),
-  model: zod_1.z.string().trim().min(1, "baseUrl and model are required"),
-});
-exports.codexProfileNameSchema = zod_1.z.object({
-  name: zod_1.z.string().trim().min(1, "Profile name is required"),
-});
-exports.codexProfileIdSchema = zod_1.z.object({
-  profileId: zod_1.z.string().trim().min(1, "profileId is required"),
-});
-exports.guideSettingsSaveSchema = zod_1.z.object({
-  baseUrl: zod_1.z.string().trim().min(1).optional(),
-  apiKey: zod_1.z.string().optional(),
-  model: zod_1.z.string().trim().min(1, "Model is required"),
-});
-// ──── Helper ────
-/**
- * Parse and validate request body with a Zod schema.
- * Returns { success: true, data } or { success: false, error }.
- */
-function validateBody(schema, body) {
-  var _a;
-  var result = schema.safeParse(body);
-  if (result.success) {
-    return { success: true, data: result.data };
-  }
-  var issues = Array.isArray((_a = result.error) === null || _a === void 0 ? void 0 : _a.issues)
-    ? result.error.issues
-    : [];
-  return {
-    success: false,
-    error: {
-      message: "Invalid request",
-      details: issues.map(function (e) {
-        return {
-          field: e.path.join("."),
-          message: e.message,
-        };
-      }),
-    },
-  };
-}
@@ -31,7 +31,12 @@ import { sanitizeRequest } from "../../shared/utils/inputSanitizer";

 // Pipeline integration — wired modules
 import { getCircuitBreaker, CircuitBreakerOpenError } from "../../shared/utils/circuitBreaker";
-import { isModelAvailable, setModelUnavailable } from "../../domain/modelAvailability";
+import {
+  isModelAvailable,
+  setModelUnavailable,
+  clearModelUnavailability,
+} from "../../domain/modelAvailability";
+import { markAccountExhaustedFrom429 } from "../../domain/quotaCache";
 import { RequestTelemetry, recordTelemetry } from "../../shared/utils/requestTelemetry";
 import { generateRequestId } from "../../shared/utils/requestId";
 import { recordCost } from "../../domain/costRules";
@@ -127,7 +132,10 @@ export async function handleChat(request: any, clientRawRequest: any = null) {
  telemetry.startPhase("policy");
  const policy = await enforceApiKeyPolicy(request, modelStr);
  if (policy.rejection) {
-    log.warn("POLICY", `API key policy rejected: ${modelStr} (key=${policy.apiKeyInfo?.id || "unknown"})`);
+    log.warn(
+      "POLICY",
+      `API key policy rejected: ${modelStr} (key=${policy.apiKeyInfo?.id || "unknown"})`
+    );
    return policy.rejection;
  }
  const apiKeyInfo = policy.apiKeyInfo;
@@ -243,6 +251,13 @@ async function handleSingleModelChat(
    const credentials = await getProviderCredentials(provider, excludeConnectionId);

    if (!credentials || credentials.allRateLimited) {
+      if (lastStatus === 429 || lastStatus === 503) {
+        setModelUnavailable(provider, model, 60000, `HTTP ${lastStatus}`);
+        log.info(
+          "AVAILABILITY",
+          `${provider}/${model} marked unavailable — all accounts exhausted (HTTP ${lastStatus})`
+        );
+      }
      return handleNoCredentials(
        credentials,
        excludeConnectionId,
@@ -296,22 +311,19 @@ async function handleSingleModelChat(
    });

    if (result.success) {
+      clearModelUnavailability(provider, model);
      recordCostIfNeeded(apiKeyInfo, result);
      if (telemetry) telemetry.startPhase("finalize");
      if (telemetry) telemetry.endPhase();
      return result.response;
    }

-    // Pipeline: Mark model unavailable on repeated failures
-    if (result.status === 429 || result.status === 503) {
-      setModelUnavailable(provider, model, 60000, `HTTP ${result.status}`);
-      log.info(
-        "AVAILABILITY",
-        `${provider}/${model} marked unavailable for 60s (HTTP ${result.status})`
-      );
+    // 6. Mark account as quota-exhausted on 429 response
+    if (result.status === 429) {
+      markAccountExhaustedFrom429(credentials.connectionId, provider);
    }

-    // 6. Fallback to next account
+    // 7. Fallback to next account
    const { shouldFallback } = await markAccountUnavailable(
      credentials.connectionId,
      result.status,
@@ -357,7 +369,14 @@ async function resolveModelOrError(modelStr: string, body: any) {
  const { provider, model } = modelInfo;
  const sourceFormat = detectFormat(body);
  const providerAlias = PROVIDER_ID_TO_ALIAS[provider] || provider;
-  const targetFormat = getModelTargetFormat(providerAlias, model) || getTargetFormat(provider);
+
+  // If the custom model specifies apiFormat="responses", override targetFormat
+  // to route through the Responses API translator instead of Chat Completions
+  let targetFormat = getModelTargetFormat(providerAlias, model) || getTargetFormat(provider);
+  if ((modelInfo as any).apiFormat === "responses") {
+    targetFormat = "openai-responses";
+    log.info("ROUTING", "Custom model apiFormat=responses → targetFormat=openai-responses");
+  }

  if (modelStr !== `${provider}/${model}`) {
    log.info("ROUTING", `${modelStr} → ${provider}/${model}`);
@@ -34,7 +34,12 @@ const HTTP_STATUS = {
 * @param {Function} errorResponse - Error response factory
 * @returns {Promise<{ error?: Response, provider: string, model: string, sourceFormat: string, targetFormat: string }>}
 */
-export async function resolveModelOrError(modelStr: string, body: any, log: any, errorResponse: Function) {
+export async function resolveModelOrError(
+  modelStr: string,
+  body: any,
+  log: any,
+  errorResponse: Function
+) {
  const modelInfo = await getModelInfo(modelStr);

  if (!modelInfo.provider) {
@@ -44,7 +49,8 @@ export async function resolveModelOrError(modelStr: string, body: any, log: any,
        `Ambiguous model '${modelStr}'. Use provider/model prefix (ex: gh/${modelStr} or cc/${modelStr}).`;
      log.warn("CHAT", message, {
        model: modelStr,
-        candidates: (modelInfo as any).candidateAliases || (modelInfo as any).candidateProviders || [],
+        candidates:
+          (modelInfo as any).candidateAliases || (modelInfo as any).candidateProviders || [],
      });
      return { error: errorResponse(HTTP_STATUS.BAD_REQUEST, message) };
    }
@@ -56,7 +62,14 @@ export async function resolveModelOrError(modelStr: string, body: any, log: any,
  const { provider, model } = modelInfo;
  const sourceFormat = detectFormat(body);
  const providerAlias = PROVIDER_ID_TO_ALIAS[provider] || provider;
-  const targetFormat = getModelTargetFormat(providerAlias, model) || getTargetFormat(provider);
+
+  // If the custom model specifies apiFormat="responses", override targetFormat
+  // to route through the Responses API translator instead of Chat Completions
+  let targetFormat = getModelTargetFormat(providerAlias, model) || getTargetFormat(provider);
+  if ((modelInfo as any).apiFormat === "responses") {
+    targetFormat = "openai-responses";
+    log.info("ROUTING", "Custom model apiFormat=responses → targetFormat=openai-responses");
+  }

  // Log routing
  if (modelStr !== `${provider}/${model}`) {
@@ -4,6 +4,7 @@ import {
  updateProviderConnection,
  getSettings,
 } from "@/lib/localDb";
+import { isAccountQuotaExhausted } from "@/domain/quotaCache";
 import {
  isAccountUnavailable,
  getUnavailableUntil,
@@ -197,6 +198,19 @@ export async function getProviderCredentials(
      return null;
    }

+    // Quota-aware: prioritize accounts with available quota
+    const withQuota = availableConnections.filter((c) => !isAccountQuotaExhausted(c.id));
+    const exhaustedQuota = availableConnections.filter((c) => isAccountQuotaExhausted(c.id));
+    const orderedConnections =
+      withQuota.length > 0 ? [...withQuota, ...exhaustedQuota] : availableConnections;
+
+    if (exhaustedQuota.length > 0) {
+      log.debug(
+        "AUTH",
+        `${provider} | quota-aware: ${withQuota.length} with quota, ${exhaustedQuota.length} exhausted`
+      );
+    }
+
    const settings = await getSettings();
    const strategy = settings.fallbackStrategy || "fill-first";

@@ -205,7 +219,7 @@ export async function getProviderCredentials(
      const stickyLimit = toNumber((settings as Record<string, unknown>).stickyRoundRobinLimit, 3);

      // Sort by lastUsed (most recent first) to find current candidate
-      const byRecency = [...availableConnections].sort((a: any, b: any) => {
+      const byRecency = [...orderedConnections].sort((a: any, b: any) => {
        if (!a.lastUsedAt && !b.lastUsedAt) return (a.priority || 999) - (b.priority || 999);
        if (!a.lastUsedAt) return 1;
        if (!b.lastUsedAt) return -1;
@@ -225,7 +239,7 @@ export async function getProviderCredentials(
        });
      } else {
        // Pick the least recently used (excluding current if possible)
-        const sortedByOldest = [...availableConnections].sort((a: any, b: any) => {
+        const sortedByOldest = [...orderedConnections].sort((a: any, b: any) => {
          if (!a.lastUsedAt && !b.lastUsedAt) return (a.priority || 999) - (b.priority || 999);
          if (!a.lastUsedAt) return -1;
          if (!b.lastUsedAt) return 1;
@@ -242,14 +256,14 @@ export async function getProviderCredentials(
      }
    } else if (strategy === "p2c") {
      // Power of Two Choices: pick 2 random, choose the one with fewer failures
-      if (availableConnections.length <= 2) {
-        connection = availableConnections[0];
+      if (orderedConnections.length <= 2) {
+        connection = orderedConnections[0];
      } else {
-        const i = Math.floor(Math.random() * availableConnections.length);
-        let j = Math.floor(Math.random() * (availableConnections.length - 1));
+        const i = Math.floor(Math.random() * orderedConnections.length);
+        let j = Math.floor(Math.random() * (orderedConnections.length - 1));
        if (j >= i) j++;
-        const a = availableConnections[i];
-        const b = availableConnections[j];
+        const a = orderedConnections[i];
+        const b = orderedConnections[j];
        // Prefer the one with fewer consecutive uses / better health
        const scoreA = (a.consecutiveUseCount || 0) + (a.lastError ? 10 : 0);
        const scoreB = (b.consecutiveUseCount || 0) + (b.lastError ? 10 : 0);
@@ -257,11 +271,11 @@ export async function getProviderCredentials(
      }
    } else if (strategy === "random") {
      // Random: Fisher-Yates-inspired random pick
-      const idx = Math.floor(Math.random() * availableConnections.length);
-      connection = availableConnections[idx];
+      const idx = Math.floor(Math.random() * orderedConnections.length);
+      connection = orderedConnections[idx];
    } else if (strategy === "least-used") {
      // Least Used: pick the one with oldest lastUsedAt
-      const sorted = [...availableConnections].sort((a, b) => {
+      const sorted = [...orderedConnections].sort((a, b) => {
        if (!a.lastUsedAt && !b.lastUsedAt) return (a.priority || 999) - (b.priority || 999);
        if (!a.lastUsedAt) return -1;
        if (!b.lastUsedAt) return 1;
@@ -271,13 +285,13 @@ export async function getProviderCredentials(
    } else if (strategy === "cost-optimized") {
      // Cost Optimized: sort by priority ascending (lower = cheaper/preferred)
      // Future: can be enhanced with actual cost data per provider
-      const sorted = [...availableConnections].sort(
+      const sorted = [...orderedConnections].sort(
        (a, b) => (a.priority || 999) - (b.priority || 999)
      );
      connection = sorted[0];
    } else {
      // Default: fill-first (already sorted by priority in getProviderConnections)
-      connection = availableConnections[0];
+      connection = orderedConnections[0];
    }

    return {
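
The quota-aware ordering in this hunk puts accounts with quota ahead of exhausted ones. A minimal sketch of that step follows, assuming a hypothetical `Conn` type and an in-memory `exhaustedIds` set in place of `isAccountQuotaExhausted`. It also shows that a strategy which re-sorts by another key keeps the quota order only as a tie-breaker, because `Array.prototype.sort` is stable but the comparator decides first. The last comparator is one possible variation, not something the diff contains.

```typescript
interface Conn {
  id: string;
  priority?: number;
}

// Hypothetical stand-in for isAccountQuotaExhausted from "@/domain/quotaCache".
const exhaustedIds = new Set<string>(["b"]);
const isExhausted = (id: string): boolean => exhaustedIds.has(id);

// Same shape as the diff: accounts with quota first, exhausted ones last,
// falling back to the original list when every account is exhausted.
function orderByQuota<T extends Conn>(conns: T[]): T[] {
  const withQuota = conns.filter((c) => !isExhausted(c.id));
  const exhausted = conns.filter((c) => isExhausted(c.id));
  return withQuota.length > 0 ? [...withQuota, ...exhausted] : conns;
}

const conns: Conn[] = [
  { id: "a", priority: 2 },
  { id: "b", priority: 1 },
  { id: "c", priority: 3 },
];

// Quota-first order: ["a", "c", "b"]
const ordered = orderByQuota(conns).map((c) => c.id);

// Re-sorting by priority alone moves the exhausted account "b" back to the front.
const byPriority = [...orderByQuota(conns)]
  .sort((x, y) => (x.priority || 999) - (y.priority || 999))
  .map((c) => c.id);

// One possible variation: compare quota state first, then priority.
const quotaThenPriority = [...conns]
  .sort(
    (x, y) =>
      Number(isExhausted(x.id)) - Number(isExhausted(y.id)) ||
      (x.priority || 999) - (y.priority || 999)
  )
  .map((c) => c.id);
```

Only strategies that read `orderedConnections[0]` directly see the quota-first order as is.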
@@ -1,5 +1,5 @@
 // Re-export from open-sse with localDb integration
-import { getModelAliases, getComboByName, getProviderNodes } from "@/lib/localDb";
+import { getModelAliases, getComboByName, getProviderNodes, getCustomModels } from "@/lib/localDb";
 import {
  parseModel,
  resolveModelAliasFromMap,
@@ -16,13 +16,30 @@ export async function resolveModelAlias(alias) {
  return resolveModelAliasFromMap(alias, aliases);
 }

+/**
+ * Look up the apiFormat for a custom model from the DB.
+ * Returns "responses" if the model is configured for the Responses API, otherwise undefined.
+ */
+async function lookupCustomModelApiFormat(
+  providerId: string,
+  modelId: string
+): Promise<string | undefined> {
+  try {
+    const models = await getCustomModels(providerId);
+    if (!Array.isArray(models)) return undefined;
+    const match = models.find((m: any) => m.id === modelId);
+    return match?.apiFormat === "responses" ? "responses" : undefined;
+  } catch {
+    return undefined;
+  }
+}
+
 /**
 * Get full model info (parse or resolve)
 */
 export async function getModelInfo(modelStr) {
  const parsed = parseModel(modelStr);

-  // Check custom provider nodes first (for both alias and non-alias formats)
  // Check custom provider nodes first (for both alias and non-alias formats)
  if (parsed.providerAlias || parsed.provider) {
    // Ensure prefixToCheck is always a concise identifier, not a full model string
@@ -32,14 +49,26 @@ export async function getModelInfo(modelStr) {
    const openaiNodes = await getProviderNodes({ type: "openai-compatible" });
    const matchedOpenAI = openaiNodes.find((node) => node.prefix === prefixToCheck);
    if (matchedOpenAI) {
-      return { provider: matchedOpenAI.id, model: parsed.model };
+      const apiFormat = await lookupCustomModelApiFormat(
+        matchedOpenAI.id as string,
+        parsed.model as string
+      );
+      return { provider: matchedOpenAI.id, model: parsed.model, ...(apiFormat && { apiFormat }) };
    }

    // Check Anthropic Compatible nodes
    const anthropicNodes = await getProviderNodes({ type: "anthropic-compatible" });
    const matchedAnthropic = anthropicNodes.find((node) => node.prefix === prefixToCheck);
    if (matchedAnthropic) {
-      return { provider: matchedAnthropic.id, model: parsed.model };
+      const apiFormat = await lookupCustomModelApiFormat(
+        matchedAnthropic.id as string,
+        parsed.model as string
+      );
+      return {
+        provider: matchedAnthropic.id,
+        model: parsed.model,
+        ...(apiFormat && { apiFormat }),
+      };
    }
  }
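
The lookup and conditional spread in the hunks above add `apiFormat` to the result only when the custom model is configured for the Responses API. A minimal sketch of that behavior follows. The `customModels` map is a hypothetical in-memory stand-in for `getCustomModels`, the functions are synchronous for brevity, and the spread is written with a ternary, which yields the same object as the `&&` form in the diff.

```typescript
type CustomModel = { id: string; apiFormat?: string };

// Hypothetical in-memory stand-in for getCustomModels from "@/lib/localDb".
const customModels: Record<string, CustomModel[]> = {
  "node-1": [{ id: "m-responses", apiFormat: "responses" }, { id: "m-chat" }],
};

// Returns "responses" only for models explicitly configured that way.
function lookupApiFormat(providerId: string, modelId: string): string | undefined {
  const models = customModels[providerId];
  if (!Array.isArray(models)) return undefined;
  const match = models.find((m) => m.id === modelId);
  return match?.apiFormat === "responses" ? "responses" : undefined;
}

function buildModelInfo(providerId: string, modelId: string) {
  const apiFormat = lookupApiFormat(providerId, modelId);
  // The key is added only when a value was found, so the result never
  // carries an explicit apiFormat: undefined entry.
  return { provider: providerId, model: modelId, ...(apiFormat ? { apiFormat } : {}) };
}

const withFormat = buildModelInfo("node-1", "m-responses");
const withoutFormat = buildModelInfo("node-1", "m-chat");
const unknownProvider = buildModelInfo("missing", "m-chat");
```

Omitting the key keeps the returned shape identical to the pre-change result for every model that does not opt in.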
Author	SHA1	Message	Date
diegosouzapw	7cb420d8e6	feat(release): v2.0.8 — custom image model handler resolution	2026-03-07 10:05:20 -03:00
Diego Rodrigues de Sa e Souza	d3919d441f	Merge pull request #239 from diegosouzapw/fix/issue-238-image-handler fix: pass resolved provider to image handler for custom models (#238)	2026-03-07 10:04:24 -03:00
diegosouzapw	4b5824babc	fix: pass resolved provider to image handler for custom models (#238)	2026-03-07 10:03:48 -03:00
diegosouzapw	fb87df14fd	feat(release): v2.0.7 — custom image model routing + Codex OAuth workspace isolation	2026-03-07 06:58:07 -03:00
Diego Rodrigues de Sa e Souza	da9e4e929b	Merge pull request #237 from diegosouzapw/fix/issue-232-236-image-oauth fix: custom image model routing + Codex OAuth workspace isolation (#232, #236)	2026-03-07 06:56:49 -03:00
diegosouzapw	10b23b15ae	fix: custom image model routing + Codex OAuth workspace isolation (#232, #236)	2026-03-07 06:56:09 -03:00
diegosouzapw	30fba39b35	feat(release): v2.0.6 — custom model apiFormat routing fix	2026-03-07 01:36:21 -03:00
Diego Rodrigues de Sa e Souza	5a75ff67c9	Merge pull request #233 from diegosouzapw/fix/issue-204-apiformat-routing fix: wire apiFormat from custom model DB into routing layer (#204)	2026-03-07 01:35:30 -03:00
diegosouzapw	358828b617	fix: wire apiFormat from custom model DB into routing layer (#204)	2026-03-07 01:26:59 -03:00
diegosouzapw	e080c4a16a	feat(release): v2.0.5 — fix Chat→Responses reasoning IDs, electron auto-update, dependency bumps	2026-03-06 18:51:24 -03:00
Diego Rodrigues de Sa e Souza	04b7e38baf	Merge pull request #221 from benzntech/feat/electron-auto-update feat(electron): add auto-update functionality with electron-updater	2026-03-06 18:49:54 -03:00
Diego Rodrigues de Sa e Souza	7ee23fbe19	Merge pull request #230 from diegosouzapw/dependabot/npm_and_yarn/express-rate-limit-8.3.0 deps: bump express-rate-limit from 8.2.1 to 8.3.0	2026-03-06 18:49:51 -03:00
Diego Rodrigues de Sa e Souza	c49bdb4ebb	Merge pull request #229 from diegosouzapw/dependabot/github_actions/docker/build-push-action-7 chore(deps): bump docker/build-push-action from 6 to 7	2026-03-06 18:49:48 -03:00
Diego Rodrigues de Sa e Souza	0f7efed8d5	Merge pull request #228 from diegosouzapw/dependabot/github_actions/actions/upload-artifact-7 chore(deps): bump actions/upload-artifact from 4 to 7	2026-03-06 18:49:46 -03:00
Diego Rodrigues de Sa e Souza	d07bc6dcf3	Merge pull request #227 from diegosouzapw/dependabot/github_actions/docker/login-action-4 chore(deps): bump docker/login-action from 3 to 4	2026-03-06 18:49:43 -03:00
Diego Rodrigues de Sa e Souza	d607d46fa3	Merge pull request #226 from diegosouzapw/dependabot/github_actions/actions/download-artifact-8 chore(deps): bump actions/download-artifact from 4 to 8	2026-03-06 18:49:40 -03:00
Diego Rodrigues de Sa e Souza	2225dd14aa	Merge pull request #225 from diegosouzapw/dependabot/github_actions/actions/cache-5 chore(deps): bump actions/cache from 4 to 5	2026-03-06 18:49:37 -03:00
Diego Rodrigues de Sa e Souza	f6c0e7bbbe	Merge pull request #222 from benzntech/fix/electron-release-duplicate-asset fix(ci): remove duplicate OmniRoute.exe entry in electron release workflow	2026-03-06 18:49:28 -03:00
Diego Rodrigues de Sa e Souza	c4675c5219	Merge pull request #231 from diegosouzapw/fix/issue-224-reasoning-ids fix: omit synthesized reasoning items in Chat→Responses translation (#224)	2026-03-06 18:49:25 -03:00
diegosouzapw	2d977a3c4d	fix: omit synthesized reasoning items in Chat→Responses translation (#224)	2026-03-06 18:48:34 -03:00
dependabot[bot]	9405918258	deps: bump express-rate-limit from 8.2.1 to 8.3.0 Bumps [express-rate-limit](https://github.com/express-rate-limit/express-rate-limit) from 8.2.1 to 8.3.0. - [Release notes](https://github.com/express-rate-limit/express-rate-limit/releases) - [Commits](https://github.com/express-rate-limit/express-rate-limit/compare/v8.2.1...v8.3.0) --- updated-dependencies: - dependency-name: express-rate-limit dependency-version: 8.3.0 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-06 18:46:36 +00:00
dependabot[bot]	a69d7dd4b5	chore(deps): bump docker/build-push-action from 6 to 7 Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 6 to 7. - [Release notes](https://github.com/docker/build-push-action/releases) - [Commits](https://github.com/docker/build-push-action/compare/v6...v7) --- updated-dependencies: - dependency-name: docker/build-push-action dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-06 18:27:03 +00:00
dependabot[bot]	428e6cb53f	chore(deps): bump actions/upload-artifact from 4 to 7 Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 7. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/v4...v7) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-06 18:26:59 +00:00
dependabot[bot]	c9a2955d28	chore(deps): bump docker/login-action from 3 to 4 Bumps [docker/login-action](https://github.com/docker/login-action) from 3 to 4. - [Release notes](https://github.com/docker/login-action/releases) - [Commits](https://github.com/docker/login-action/compare/v3...v4) --- updated-dependencies: - dependency-name: docker/login-action dependency-version: '4' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-06 18:26:54 +00:00
dependabot[bot]	7aefcd3437	chore(deps): bump actions/download-artifact from 4 to 8 Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 8. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](https://github.com/actions/download-artifact/compare/v4...v8) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-version: '8' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-06 18:26:51 +00:00
dependabot[bot]	79f4f79c46	chore(deps): bump actions/cache from 4 to 5 Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5. - [Release notes](https://github.com/actions/cache/releases) - [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md) - [Commits](https://github.com/actions/cache/compare/v4...v5) --- updated-dependencies: - dependency-name: actions/cache dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-06 18:26:46 +00:00
benzntech	c11c275678	fix(electron): address auto-updater review issues - Remove unused dialog import - Stop Next.js server before quitAndInstall() to prevent data loss - Propagate errors from checkForUpdates/downloadUpdate to IPC handlers so renderer can distinguish success from failure - Remove meaningless return value from install-update handler	2026-03-06 19:22:41 +05:30
benzntech	bbcd1d3a08	fix(ci): remove duplicate OmniRoute.exe entry in electron release workflow Duplicate release-assets/OmniRoute.exe glob caused softprops/action-gh-release to attempt a second upload of the same asset, triggering a 404 Not Found error on the GitHub release asset update API. The file is already covered by the *.exe glob pattern above it.	2026-03-06 19:18:41 +05:30
benzntech	3342d5b931	feat(electron): add auto-update functionality with electron-updater	2026-03-06 18:54:00 +05:30
diegosouzapw	f96ee44213	feat(release): v2.0.4 — round-robin lastUsedAt persistence, zod standalone build fix	2026-03-05 23:24:56 -03:00
Diego Rodrigues de Sa e Souza	bc53fe0cd9	Merge pull request #219 from diegosouzapw/fix/issue-218-round-robin-lastUsedAt fix: persist lastUsedAt for round-robin + zod in standalone build (#218, #217)	2026-03-05 23:24:13 -03:00
diegosouzapw	97a67b5d3e	fix: persist lastUsedAt in provider_connections schema for round-robin (#218) - Add last_used_at column to provider_connections CREATE TABLE schema - Add ensureProviderConnectionsColumns migration for existing databases - Add last_used_at to INSERT and UPDATE SQL in providers.ts - Add last_used_at to JSON migration INSERT in core.ts - Add zod to serverExternalPackages in next.config.mjs (#217) Fixes #218: Round-robin routing strategy now correctly persists the lastUsedAt timestamp, allowing rotation between accounts. Fixes #217: zod module is now properly included in standalone/Docker builds by declaring it as a server external package.	2026-03-05 23:22:10 -03:00
diegosouzapw	1ffa58be76	feat(release): v2.0.3 — deferred tools cache_control fix, quota system hardening	2026-03-05 21:57:04 -03:00
Diego Rodrigues de Sa e Souza	a5cf51c0b9	Merge pull request #214 from DavyMassoneto/fix/claude-oauth-usage-endpoint fix: harden quota system — code review fixes + build fix	2026-03-05 21:55:23 -03:00
Diego Rodrigues de Sa e Souza	3d38cbf70f	Merge pull request #216 from DavyMassoneto/fix/defer-loading-cache-control-conflict fix: skip cache_control on deferred tools + remove stale schemas.js	2026-03-05 21:55:14 -03:00
DavyMassoneto	196a4e037c	fix: skip cache_control on deferred tools + remove stale schemas.js - Skip tools with defer_loading=true when assigning cache_control (Anthropic API rejects the combination with 400) - Delete stale schemas.js that shadowed the .ts source, causing missing cloudSyncActionSchema export Fixes #215	2026-03-05 20:19:58 -03:00
DavyMassoneto	bfe495931f	fix(claude): correct utilization semantics, harden quota cache, fix premature model unavailability - Fix inverted Claude OAuth utilization (remaining, not used) - Add hasUtilization() guard to prevent false exhaustion from empty responses - Centralize anthropic-version into CLAUDE_CONFIG.apiVersion - Add parseDate() for safe date validation in quota cache - Batch background refresh with MAX_CONCURRENT_REFRESHES=5 - Move setModelUnavailable to after all accounts exhausted, not first 429 - Extract safePercentage() to shared utils (dedup) - Use isRecord() type guard in usage API route - Exclude binary files from Tailwind v4 source scanning	2026-03-05 19:39:59 -03:00
DavyMassoneto	11bcdd810a	feat: quota-aware account selection + fix premature model unavailability - Move setModelUnavailable from per-account loop to all-accounts-exhausted path - Clear model unavailability on successful fallback - Add in-memory quota cache with background refresh (5min active, 20min exhausted) - Integrate quota cache in account selection to skip exhausted accounts - Mark accounts as exhausted from 429 when no cached quota data exists - Populate quota cache from dashboard usage endpoint	2026-03-05 18:49:56 -03:00