Release v3.4.0 (Integration) (#861)

* test(settings): add unit tests for debugMode and hiddenSidebarItems

  Tests cover:
  - PATCH debugMode=true/false
  - PATCH hiddenSidebarItems with array values
  - Combined updates with both fields

* test(e2e): add Playwright tests for settings toggles

  Tests cover:
  - Debug mode toggle on/off
  - Sidebar visibility toggle
  - Settings persistence after page reload

* fix(tests): address code review issues

  - Unit tests: fix async/await for getSettings, use direct db functions
  - E2E tests: remove conditional logic, use Playwright auto-waiting assertions

* feat(logging): unify request log retention and artifacts

* docs: add dashboard settings toggles to CONTRIBUTING

  Add section documenting:
  - Debug Mode toggle (Settings → Advanced)
  - Sidebar Visibility toggle (Settings → General)

* fix(cache): only inject prompt_cache_key for supported providers

  Only inject prompt_cache_key for providers that support prompt caching (Claude, Anthropic, ZAI, Qwen, DeepSeek). This fixes issue #848 where NVIDIA API rejected the parameter.

* fix(model-sync): log only channel-level model changes

* feat(providers): add 4 free models to opencode-zen

* feat(providers): add explicit contextLength for opencode-zen free models

* feat(providers): add contextLength for all opencode-zen models

* feat: Improve the Chinese translation

* fix: preserve client cache_control for all Claude-protocol providers

  Previously, the cache control preservation logic only recognized a hardcoded list of providers (claude, anthropic, zai, qwen, deepseek). This caused OmniRoute to inject its own cache_control markers for Claude-protocol providers not in that list (bailian-coding-plan, glm, minimax, minimax-cn, etc.), overwriting the client's cache markers.

  The fix checks both:
  1. Known caching providers list (existing behavior)
  2. Whether targetFormat === 'claude' (all Claude-protocol providers)

  This ensures all Claude-compatible providers properly preserve client cache_control headers when appropriate (Claude Code client, deterministic routing, etc.).

  Also removes unused CacheStatsCard from settings/components (duplicate of the one in cache/ page).

  Fixes cache token calculation for GLM, Minimax, and other Claude-compatible providers.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: pure passthrough for Claude→Claude when cache_control preserved

  The Claude passthrough path round-trips through OpenAI format (claude→openai→claude) for structural normalization. This strips cache_control markers from every content block since OpenAI format has no equivalent, causing ~42k cache creation tokens per request with zero cache reads.

  When preserveCacheControl is true (Claude Code client, "always" setting, or deterministic combo), skip the round-trip entirely and forward the body as-is. Claude Code sends well-formed Messages API payloads — the normalization was only needed for non-Code clients.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: restore CacheStatsCard — was not a duplicate

  The first commit incorrectly deleted CacheStatsCard from settings/components/ as a "duplicate". It's the only copy — both settings/page.tsx and cache/page.tsx import from this location. Restored the i18n-ized version from main.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(429): parse long quota reset times from error body

  - Parse XhYmZs format from antigravity error messages (e.g., 27h41m36s)
  - Dynamic retry-after threshold (60s default) instead of hardcoded 10s
  - Add parseRetryFromErrorText() in accountFallback.ts for body parsing
  - Fix 403 'verify your account' to trigger permanent deactivation
  - Add keyword matching for 'quota will reset', 'exhausted capacity'
  - Add unit tests for retry parsing and keyword matching

  Fixes #858 (Antigravity 429 handling)
  Fixes #832 (Qwen quota 429 - same underlying bug)

* chore: bump version to v3.4.0-dev

* fix(migrations): rename 013 to 014 to avoid collision with v3.3.11

* chore(docs): update CHANGELOG for v3.4.0 integrations

* fix: Claude token refresh, Antigravity quota, and 429 rate-limit handling

  - Fix Claude OAuth token refresh to use form-urlencoded format (standard OAuth2)
  - Add anthropic-beta header required by Claude OAuth API
  - Switch Antigravity quota to use retrieveUserQuota API (same as Gemini CLI)
  - Parse quota reset time for all providers (not just Antigravity)
  - Add quota reset keywords to error classifier
  - Cap maximum retry time at 24 hours to prevent infinite wait

  Closes #836, #857, #858, #832

* fix(dashboard): resolve /dashboard/limits hanging UI with 70+ accounts via chunk parallelization (#784)

---------

Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Co-authored-by: R.D. <rogerproself@gmail.com>
Co-authored-by: kang-heewon <heewon.dev@gmail.com>
Co-authored-by: gmw <rorschach1167@qq.com>
Co-authored-by: tombii <github@tombii.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
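The `fix(429)` entries in the commit message mention parsing `XhYmZs` reset times (e.g. `27h41m36s`) and capping the retry time at 24 hours. As a rough illustration of that arithmetic only, here is a bash sketch. It is not OmniRoute's `parseRetryFromErrorText()`; the function name `parse_reset_seconds` and the exact matching rules are assumptions made for this example.

```bash
# Hypothetical helper (NOT the project's implementation): converts a token such
# as 27h41m36s into seconds and caps the result at 24 hours (86400 seconds).
parse_reset_seconds() {
  local token="$1" max=86400 total
  local re='^(([0-9]+)h)?(([0-9]+)m)?(([0-9]+)s)?$'
  # Reject empty input and anything that is not an h/m/s duration token.
  [[ -n "$token" && "$token" =~ $re ]] || return 1
  # 10# forces base 10 so a value like "08" is not read as octal.
  total=$(( 10#${BASH_REMATCH[2]:-0} * 3600 + 10#${BASH_REMATCH[4]:-0} * 60 + 10#${BASH_REMATCH[6]:-0} ))
  if (( total > max )); then
    total=$max
  fi
  echo "$total"
}
```

With this sketch, `parse_reset_seconds 27h41m36s` hits the 24-hour cap, while shorter tokens such as `1h2m3s` are returned as their plain second count.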
Merge pull request #860 from diegosouzapw/release/v3.3.11
2026-03-31 10:22:52 -03:00 · 2026-03-31 08:20:04 -03:00 · 2026-03-31 08:17:07 -03:00 · 2026-03-31 07:57:43 -03:00 · 2026-03-31 00:16:51 -03:00 · 2026-03-31 00:15:20 -03:00
153 changed files with 74274 additions and 1442 deletions
@@ -11,6 +11,8 @@ Bump version, finalize CHANGELOG, commit, open a **PR to main** and wait for use
 > Always use: `npm version patch --no-git-tag-version`
 > The threshold rule: when `y` reaches 10, bump to `2.(x+1).0` — e.g. `2.1.10` → `2.2.0`.

+> **🔴 SINGLE BRANCH RULE**: The `release/vX.Y.Z` branch is the **ONLY** development branch for the entire release cycle. ALL work — bug fixes, feature implementations, PR integrations, issue resolutions — MUST be committed directly on this branch. Never create separate `fix/`, `feat/`, or topic branches. When running `/resolve-issues`, `/implement-features`, or `/review-prs`, always work on the current release branch.
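One way to make the single-branch rule harder to violate is a small guard that runs before any commit. The sketch below is an assumption layered on top of the workflow, not something the PR adds: `is_release_branch` and `require_release_branch` are hypothetical helper names, and the pattern is only a loose shape check for `release/vX.Y.Z`.

```bash
# Hypothetical helper: succeeds only for names shaped like release/v<digits>.<digits>.<digits>.
# The glob is a loose shape check, not a strict semver validation.
is_release_branch() {
  case "$1" in
    release/v[0-9]*.[0-9]*.[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical guard: stops when the checked-out branch is not a release branch.
require_release_branch() {
  local current
  current=$(git branch --show-current)
  if ! is_release_branch "$current"; then
    echo "Refusing to continue: '$current' is not a release/vX.Y.Z branch" >&2
    return 1
  fi
}
```

Calling `require_release_branch || exit 1` at the start of a session would catch accidental work on `main` or on a leftover `fix/` or `feat/` branch.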
+
 ---

 ## ⚠️ Two-Phase Flow
@@ -176,24 +178,17 @@ Inform the user:

 > Run these steps only AFTER the user has merged the PR.

-### 11. Pull main and create tag
+### 11. Create Git Tag and GitHub Release (MANDATORY)
+
+// turbo

 ```bash
 git checkout main
 git pull origin main
-git tag -a v2.x.y -m "Release v2.x.y"
-```
-
-### 12. Push tag to GitHub
-
-```bash
+VERSION=$(node -p "require('./package.json').version")
+git tag -a "v$VERSION" -m "Release v$VERSION"
 git push origin --tags
-```
-
-### 13. Create GitHub release
-
-```bash
-gh release create v2.x.y --title "v2.x.y — summary" --notes "..."
+gh release create "v$VERSION" --title "v$VERSION" --notes "OmniRoute v$VERSION Release" --target main
 ```
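Because the step above derives the tag from `package.json`, re-running it after a partial failure would try to create the same tag twice. A minimal sketch of an idempotent variant follows; `release_tag_name` and `create_release_tag` are hypothetical names introduced here, and the strict `MAJOR.MINOR.PATCH` check is an assumption about the version format.

```bash
# Hypothetical helper: prints "v<version>" for a plain MAJOR.MINOR.PATCH string
# and fails for anything else (pre-release suffixes are rejected in this sketch).
release_tag_name() {
  local re='^[0-9]+\.[0-9]+\.[0-9]+$'
  [[ "$1" =~ $re ]] || return 1
  echo "v$1"
}

# Hypothetical wrapper: creates the annotated tag only if it does not exist yet.
create_release_tag() {
  local tag
  tag=$(release_tag_name "$1") || { echo "Unexpected version: $1" >&2; return 1; }
  if git rev-parse -q --verify "refs/tags/$tag" >/dev/null; then
    echo "Tag $tag already exists; skipping" >&2
    return 0
  fi
  git tag -a "$tag" -m "Release $tag"
}
```

Usage would look like `create_release_tag "$(node -p "require('./package.json').version")"`. Pushing with `git push origin "v$VERSION"` instead of `--tags` is another option if only the new tag should be published.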

 ### 14. 🐳 Trigger Docker Hub build (MANDATORY — keep npm and Docker in sync)
@@ -6,7 +6,9 @@ description: Analyze open feature request issues, implement viable ones on dedic

 ## Overview

-Fetches open feature request issues, analyzes each against the current codebase, implements viable ones on dedicated branches, and responds to authors with results. Does NOT merge to main — leaves branches for author validation.
+Fetches open feature request issues, analyzes each against the current codebase, implements viable ones **on the current release branch** (`release/vX.Y.Z`), and responds to authors with results. Does NOT merge to main — the release branch is later merged via PR.
+
+> **BRANCH RULE**: All work MUST happen on the current `release/vX.Y.Z` branch. Never create separate `feat/` branches. If no release branch exists yet, create one first using `/generate-release` Phase 1 steps 1–5.

 ## Steps

@@ -16,15 +18,48 @@ Fetches open feature request issues, analyzes each against the current codebase,

 - Run: `git -C <project_root> remote get-url origin` to extract owner/repo

-### 2. Fetch Open Feature Request Issues
+### 2. Ensure Release Branch Exists

 // turbo

- Run: `gh issue list --repo <owner>/<repo> --state open --limit 50 --json number,title,labels,body,comments,createdAt,author`
- Filter for issues that are feature requests (label `enhancement`/`feature`, or body describes new functionality, or previously classified as feature request)
- Sort by oldest first
+Before doing any work, ensure you are on the current release branch:

-### 3. Analyze Each Feature Request
+```bash
+# Check current branch
+git branch --show-current
+
+# If on main, determine next version and create the release branch
+VERSION=$(node -p "require('./package.json').version")
+NEXT=$(node -p "const [a,b,c]=('$VERSION').split('.').map(Number); c>=9?a+'.'+(b+1)+'.0':a+'.'+b+'.'+(c+1)")
+git checkout -b release/v$NEXT
+npm version patch --no-git-tag-version
+npm install
+```
+
+If already on a `release/vX.Y.Z` branch, continue working there.
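The inline `node -p` expression above is dense, so here is the same rollover rule as a standalone bash function. It is only a sketch that mirrors that expression; `next_release_version` is a hypothetical name, and the input is assumed to be a plain `MAJOR.MINOR.PATCH` string.

```bash
# Hypothetical helper mirroring the inline node expression: roll over to the
# next minor when the patch number is 9 or higher, otherwise bump the patch.
next_release_version() {
  local major minor patch
  IFS=. read -r major minor patch <<< "$1"
  if (( patch >= 9 )); then
    echo "$major.$((minor + 1)).0"
  else
    echo "$major.$minor.$((patch + 1))"
  fi
}
```

Note that for `3.3.11` this rule yields `3.4.0`, whereas `npm version patch` on its own would write `3.3.12` into `package.json`. If the branch name and the package version should always agree, passing the computed value explicitly (`npm version "$NEXT" --no-git-tag-version`) is one possible approach.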
+
+### 3. Fetch Open Feature Request Issues
+
+// turbo-all
+
+**⚠️ CRITICAL**: The JSON output of `gh issue list` can be truncated by the tool, silently hiding issues and their comments. You MUST use the two-step approach below to guarantee **all** feature requests and their full conversations are fetched.
+
+**Step 3a — Get Issue numbers only** (small output, never truncated):
+
+- Run: `gh issue list --repo <owner>/<repo> --state open --label "enhancement" --limit 500 --json number --jq '.[].number'`
+- If `feature` is a separate label, run the same command with `--label "feature"`. If labels are not used consistently, list all open issues and filter them in step 3b.
+- This outputs one issue number per line. Count them and confirm total.
+
+**Step 3b — Fetch full metadata & conversations for each Issue** (one call per issue):
+
+- For each issue number from step 3a, run:
+  `gh issue view <NUMBER> --repo <owner>/<repo> --json number,title,labels,body,comments,createdAt,author`
+- Read not just the body, but **ALL comments (`comments` array)** completely to understand the full context, agreements, and restrictions discussed by the community.
+- You may batch these into parallel calls (up to 4 at a time).
+- Filter for issues that are feature requests (if not already filtered by label).
+- Sort by oldest first.
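The two-step fetch and the count check can be sketched as a pair of bash functions. This is an illustration under assumptions: `fetch_issues` and `counts_match` are hypothetical names, the output directory layout is invented for the example, and `xargs -P 4` is used to model the "up to 4 at a time" batching.

```bash
# Hypothetical helper: step 3a writes the issue numbers to numbers.txt, then
# step 3b stores one JSON file per issue, running up to 4 fetches in parallel.
fetch_issues() {
  local repo="$1" label="$2" outdir="$3"
  mkdir -p "$outdir"
  gh issue list --repo "$repo" --state open --label "$label" --limit 500 \
    --json number --jq '.[].number' > "$outdir/numbers.txt"
  xargs -P 4 -I{} sh -c \
    'gh issue view "$1" --repo "$2" --json number,title,labels,body,comments,createdAt,author > "$3/issue-$1.json"' \
    _ {} "$repo" "$outdir" < "$outdir/numbers.txt"
}

# Hypothetical check: succeeds only when every listed issue has a JSON file.
counts_match() {
  local outdir="$1" expected fetched
  expected=$(grep -c . "$outdir/numbers.txt" || true)
  fetched=$(find "$outdir" -name 'issue-*.json' | wc -l | tr -d ' ')
  [ "$expected" -eq "$fetched" ]
}
```

Writing each issue to its own file keeps every payload small, which addresses the truncation concern, and `counts_match` gives a quick way to confirm nothing was dropped before analysis starts.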
+
+### 4. Analyze Each Feature Request

 For each feature request issue, perform a **two-level analysis**:

@@ -46,21 +81,16 @@ Ask yourself:

 #### Level 2 — Implementation (only for VIABLE features)

+> **⚠️ ALL implementation happens on the release branch.**
+
 1. **Research** — Read all related source files to understand the current architecture
 2. **Design** — Plan the implementation, filling gaps in the original request
-3. **Create branch** — Name format: `feat/issue-<NUMBER>-<short-slug>`
-   ```bash
-   git checkout main
-   git pull origin main
-   git checkout -b feat/issue-<NUMBER>-<short-slug>
-   ```
-4. **Implement** — Build the complete solution following project patterns
-5. **Build** — Run `npm run build` to verify compilation
-6. **Commit** — Commit with: `feat: <description> (#<NUMBER>)`
-7. **Push** — Push the branch: `git push -u origin feat/issue-<NUMBER>-<short-slug>`
-8. **Return to main** — `git checkout main`
+3. **Implement** — Build the complete solution following project patterns, **on the release branch**
+4. **Build** — Run `npm run build` to verify compilation
+5. **Commit** — Commit with: `feat: <description> (#<NUMBER>)`
+6. **Continue** — Move to the next feature (do not switch branches)

-### 4. Respond to Authors
+### 5. Respond to Authors

 #### For VIABLE (implemented) features:

@@ -70,9 +100,9 @@ Post a comment on the issue:
 ````markdown
 ## ✅ Feature Implemented!

-Hi @<author>! We've analyzed your request and implemented it on a dedicated branch.
+Hi @<author>! We've analyzed your request and implemented it.

-**Branch:** `feat/issue-<NUMBER>-<short-slug>`
+**Branch:** `release/vX.Y.Z` (upcoming release)

 ### What was implemented:

@@ -82,31 +112,24 @@ Hi @<author>! We've analyzed your request and implemented it on a dedicated bran

 ```bash
 git fetch origin
-git checkout feat/issue-<NUMBER>-<short-slug>
+git checkout release/vX.Y.Z
 npm install && npm run dev
 ```
-````

 ### Next steps:

 1. **Test it** — Please verify it works as you expected
-2. **Want to improve it?** — You're welcome to contribute! Just:
-   ```bash
-   git checkout feat/issue-<NUMBER>-<short-slug>
-   # Make your improvements
-   git add -A && git commit -m "improve: <your changes>"
-   git push origin feat/issue-<NUMBER>-<short-slug>
-   ```
-   Then open a Pull Request from your branch to `main` 🎉
+2. **Want to improve it?** — Feel free to open a follow-up PR targeting `release/vX.Y.Z`
 3. **Not quite right?** — Let us know in this issue what needs to change

-Looking forward to your feedback! 🚀
-
-```
+This will be included in the next release. Looking forward to your feedback! 🚀
+````

 #### For NEEDS MORE INFO:
+
 // turbo
 Post a comment asking for specific missing details needed to implement, e.g.:
+
 - "Could you describe the exact behavior when X happens?"
 - "Which API endpoints should be affected?"
 - "Should this apply to all providers or only specific ones?"
@@ -114,18 +137,28 @@ Post a comment asking for specific missing details needed to implement, e.g.:
 Add the context of WHY you need each piece of information.

 #### For NOT VIABLE:
+
 // turbo
 Post a polite comment explaining why the feature doesn't fit at this time:
+
 - If the idea is decent but timing is wrong: "This is an interesting idea, but it doesn't align with our current priorities. Feel free to open a new issue with more details if you'd like us to reconsider."
 - If fundamentally flawed: Explain the technical or architectural reasons why it won't work, suggest alternatives if possible.
 - Close the issue after posting the comment.

-### 5. Summary Report
+### 6. Finalize & Push
+
+After implementing all viable features:
+
+1. **Update CHANGELOG.md** on the release branch with all new feature entries
+2. Push the release branch: `git push origin release/vX.Y.Z`
+3. Run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user)
+
+### 7. Summary Report
+
 Present a summary report to the user via `notify_user`:

-| Issue | Title | Verdict | Branch / Action |
-|---|---|---|---|
-| #N | Title | ✅ Implemented | `feat/issue-N-slug` |
-| #N | Title | ❓ Needs Info | Comment posted |
-| #N | Title | ❌ Not Viable | Closed with explanation |
-```
+| Issue | Title | Verdict        | Action                  |
+| ----- | ----- | -------------- | ----------------------- |
+| #N    | Title | ✅ Implemented | Committed on release/vX |
+| #N    | Title | ❓ Needs Info  | Comment posted          |
+| #N    | Title | ❌ Not Viable  | Closed with explanation |
@@ -6,7 +6,9 @@ description: Fetch all open GitHub issues, analyze bugs, resolve what's possible

 ## Overview

-This workflow fetches all open issues from the project's GitHub repository, classifies them, analyzes bugs, resolves what can be fixed, and triages issues with insufficient information. **It does NOT merge or release automatically** — it creates a PR and waits for user validation before merging.
+This workflow fetches all open issues from the project's GitHub repository, classifies them, analyzes bugs, resolves what can be fixed, and triages issues with insufficient information. **All fixes are committed on the current release branch** (`release/vX.Y.Z`). It does NOT merge or release automatically — the release branch is later merged via PR to main.
+
+> **BRANCH RULE**: All work MUST happen on the current `release/vX.Y.Z` branch. Never create separate `fix/` branches. If no release branch exists yet, create one first using `/generate-release` Phase 1 steps 1–5.

 ## Steps

@@ -17,25 +19,45 @@ This workflow fetches all open issues from the project's GitHub repository, clas
 - Run: `git -C <project_root> remote get-url origin` to extract the owner/repo
 - Parse the owner and repo name from the URL

-### 2. Fetch All Open Issues
+### 2. Ensure Release Branch Exists
+
+// turbo
+
+Before doing any work, ensure you are on the current release branch:
+
+```bash
+# Check current branch
+git branch --show-current
+
+# If on main, determine next version and create the release branch
+VERSION=$(node -p "require('./package.json').version")
+NEXT=$(node -p "const [a,b,c]=('$VERSION').split('.').map(Number); c>=9?a+'.'+(b+1)+'.0':a+'.'+b+'.'+(c+1)")
+git checkout -b release/v$NEXT
+npm version patch --no-git-tag-version
+npm install
+```
+
+If already on a `release/vX.Y.Z` branch, continue working there.
+
+### 3. Fetch All Open Issues

 // turbo-all

 **⚠️ CRITICAL**: The JSON output of `gh issue list` can be truncated by the tool, silently hiding issues. You MUST use the two-step approach below to guarantee **all** issues are fetched.

-**Step 2a — Get Issue numbers only** (small output, never truncated):
+**Step 3a — Get Issue numbers only** (small output, never truncated):

 - Run: `gh issue list --repo <owner>/<repo> --state open --limit 500 --json number --jq '.[].number'`
 - This outputs one issue number per line. Count them and confirm total.

-**Step 2b — Fetch full metadata for each Issue** (one call per issue):
+**Step 3b — Fetch full metadata for each Issue** (one call per issue):

- For each issue number from step 2a, run:
+- For each issue number from step 3a, run:
  `gh issue view <NUMBER> --repo <owner>/<repo> --json number,title,labels,body,comments,createdAt,author`
 - You may batch these into parallel calls (up to 4 at a time).
 - Sort by oldest first (FIFO).

-### 3. Classify Each Issue
+### 4. Classify Each Issue

 For each issue, determine its type:

@@ -46,9 +68,9 @@ For each issue, determine its type:

 Focus ONLY on **Bugs** for resolution. Feature requests and questions should be skipped with a note in the final report.

-### 4. Analyze Each Bug — For each bug issue:
+### 5. Analyze Each Bug — For each bug issue:

-#### 4a. Check Information Sufficiency
+#### 5a. Check Information Sufficiency

 Verify the issue contains enough information to reproduce and fix:

@@ -57,7 +79,7 @@ Verify the issue contains enough information to reproduce and fix:
 - [ ] Error messages or logs
 - [ ] Expected vs actual behavior

-#### 4b. If Information Is INSUFFICIENT
+#### 5b. If Information Is INSUFFICIENT

 Call the `/issue-triage` workflow (located at `~/.gemini/antigravity/global_workflows/issue-triage.md`):
 // turbo
@@ -66,18 +88,19 @@ Call the `/issue-triage` workflow (located at `~/.gemini/antigravity/global_work
 - Add `needs-info` label using `gh issue edit`
 - Mark this issue as **DEFERRED** and move to the next one

-#### 4c. If Information Is SUFFICIENT
+#### 5c. If Information Is SUFFICIENT

-Proceed with resolution:
+Proceed with resolution **on the release branch**:

-1. **Create a fix branch** — `git checkout -b fix/issue-<NUMBER>-<short-description>`
-2. **Research** — Search the codebase for files related to the issue
-3. **Root Cause** — Identify the root cause by reading the relevant source files
-4. **Implement Fix** — Apply the fix following existing code patterns and conventions
-5. **Test** — Build the project and run tests to verify the fix
-6. **Commit** — Commit with message format: `fix: <description> (#<issue_number>)`
+1. **Research** — Search the codebase for files related to the issue
+2. **Root Cause** — Identify the root cause by reading the relevant source files
+3. **Implement Fix** — Apply the fix following existing code patterns and conventions
+4. **Test** — Build the project and run tests to verify the fix
+5. **Commit** — Commit with message format: `fix: <description> (#<issue_number>)`

-### 5. Generate Report & Wait for Validation
+> **⚠️ Do NOT create a separate branch.** All commits go directly on the release branch.
+
+### 6. Generate Report & Wait for Validation

 Present a summary report to the user via `notify_user` with `BlockedOnUser: true`:

@@ -90,41 +113,37 @@ Present a summary report to the user via `notify_user` with `BlockedOnUser: true
 > **⚠️ IMPORTANT**: Do NOT commit, close issues, or generate releases at this step.
 > Wait for the user to review the changes and respond with **OK** before proceeding.

- If the user says **OK** or approves → Proceed to step 6
+- If the user says **OK** or approves → Proceed to step 7
 - If the user requests changes → Apply the requested adjustments first, then present the report again
 - If the user rejects → Revert the changes and stop

-### 6. Commit & Push Fix Branch (only after user approval)
+### 7. Commit & Push (only after user approval)

 After the user validates:

- Commit each fix individually with message format: `fix: <description> (#<issue_number>)`
- Push the fix branch: `git push origin fix/issue-<NUMBER>-<short-description>`
- Create a PR: `gh pr create --title "fix: <description> (#<issue_number>)" --body "<details>" --base main`
+- Commit each fix individually on the release branch with message format: `fix: <description> (#<issue_number>)`
+- Push the release branch: `git push origin release/vX.Y.Z`
+- **Update CHANGELOG.md** with all new bug fix entries

-### 7. 🛑 WAIT — Notify User & Await PR Verification
+### 8. 🛑 WAIT — Notify User & Await Verification

 **This is a mandatory stop point.** Use `notify_user` with `BlockedOnUser: true`:

- Inform the user that the PR was created and is **awaiting their verification**
- Include the PR number, URL, and a summary of what was changed
+- Inform the user that fixes have been **committed and pushed to the release branch**
+- Include summary of fixes, test status, and files changed
 - **DO NOT merge, close issues, generate releases, or deploy until the user confirms**

 Wait for the user to respond:

- **User confirms** → Proceed to step 8
+- **User confirms** → Proceed to step 9
 - **User requests changes** → Apply changes, push to the same branch, notify again
- **User rejects** → Close the PR and stop
+- **User rejects** → Revert and stop

-### 8. Merge, Close Issues & Release (only after user confirms PR)
+### 9. Close Issues & Finalize (only after user confirms)

-After the user confirms the PR:
+After the user confirms:

-1. **Merge** the PR: `gh pr merge <NUMBER> --merge --repo <owner>/<repo>` or via local merge
-2. **Close** resolved issues with a comment: `gh issue close <NUMBER> --repo <owner>/<repo> --comment "Fixed in <commit_hash>. The fix will be included in the next release."`
-3. **Switch to main**: `git checkout main && git pull`
-4. Run the `/update-docs` workflow (at `~/.gemini/antigravity/global_workflows/update-docs.md`) to update CHANGELOG and README
-5. Run the `/generate-release` workflow (at `.agents/workflows/generate-release.md`) to bump version, tag, and publish
-6. Deploy to local VPS: `ssh root@192.168.0.15 "npm install -g omniroute@<VERSION> && pm2 restart omniroute"`
+1. **Close** resolved issues with a comment: `gh issue close <NUMBER> --repo <owner>/<repo> --comment "Fixed in release/vX.Y.Z. The fix will be included in the next release."`
+2. Run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user)

 If NO fixes were committed, skip this step and just present the report.
@@ -6,7 +6,9 @@ description: Analyze open Pull Requests from the project's GitHub repository, ge

 ## Overview

-This workflow fetches all open PRs from the project's GitHub repository, performs a critical analysis of each one, generates a detailed report, and waits for user approval before proceeding with implementation. **All improvements are committed on top of the PR branch** and the user must verify before merge.
+This workflow fetches all open PRs from the project's GitHub repository, performs a critical analysis of each one, generates a detailed report, and waits for user approval before proceeding with implementation. **All improvements are committed on the current release branch** (`release/vX.Y.Z`).
+
+> **BRANCH RULE**: All work MUST happen on the current `release/vX.Y.Z` branch. Never create separate feature or fix branches. If no release branch exists yet, create one first using `/generate-release` Phase 1 steps 1–5.

 ## Steps

@@ -16,24 +18,45 @@ This workflow fetches all open PRs from the project's GitHub repository, perform
  // turbo
 - Run: `git -C <project_root> remote get-url origin` to extract the owner/repo

-### 2. Fetch Open Pull Requests
+### 2. Ensure Release Branch Exists
+
+// turbo
+
+Before doing any work, ensure you are on the current release branch:
+
+```bash
+# Check current branch
+git branch --show-current
+
+# If on main, determine next version and create the release branch
+VERSION=$(node -p "require('./package.json').version")
+# Bump patch, rolling over to the next minor once patch is 9 or higher: e.g. 3.3.4 → 3.3.5, 3.3.11 → 3.4.0
+NEXT=$(node -p "const [a,b,c]=('$VERSION').split('.').map(Number); c>=9?a+'.'+(b+1)+'.0':a+'.'+b+'.'+(c+1)")
+git checkout -b release/v$NEXT
+npm version patch --no-git-tag-version
+npm install
+```
+
+If already on a `release/vX.Y.Z` branch, continue working there.
+
+### 3. Fetch Open Pull Requests

 // turbo-all

 **⚠️ CRITICAL**: The JSON output of `gh pr list` can be truncated by the tool, silently hiding PRs. You MUST use the two-step approach below to guarantee **all** PRs are fetched.

-**Step 2a — Get PR numbers only** (small output, never truncated):
+**Step 3a — Get PR numbers only** (small output, never truncated):

 - Run: `gh pr list --repo <owner>/<repo> --state open --limit 500 --json number --jq '.[].number'`
 - This outputs one PR number per line. Count them and confirm total.

-**Step 2b — Fetch full metadata for each PR** (one call per PR):
+**Step 3b — Fetch full metadata for each PR** (one call per PR):

- For each PR number from step 2a, run:
+- For each PR number from step 3a, run:
  `gh pr view <NUMBER> --repo <owner>/<repo> --json number,title,author,headRefName,body,createdAt,additions,deletions,files`
 - You may batch these into parallel calls (up to 4 at a time).

-**Step 2c — Fetch diffs for each PR** (one call per PR, saved to /tmp):
+**Step 3c — Fetch diffs for each PR** (one call per PR, saved to /tmp):

 - For each PR number, run:
  `gh pr diff <NUMBER> --repo <owner>/<repo> > /tmp/pr<NUMBER>.diff`
@@ -45,44 +68,44 @@ This workflow fetches all open PRs from the project's GitHub repository, perform
  - Files changed (diff)
  - Existing review comments (from bots or humans)

-**Verification**: Confirm the count of PRs analyzed matches the count from step 2a before proceeding.
+**Verification**: Confirm the count of PRs analyzed matches the count from step 3a before proceeding.

-### 3. Analyze Each PR — For each open PR, perform the following analysis:
+### 4. Analyze Each PR — For each open PR, perform the following analysis:

-#### 3a. Feature Assessment
+#### 4a. Feature Assessment

 - **Does it make sense?** Evaluate if the feature fills a real gap or solves a valid problem
 - **Alignment** — Check if it aligns with the project's architecture and roadmap
 - **Complexity** — Assess if the scope is reasonable or if it should be split

-#### 3b. Code Quality Review
+#### 4b. Code Quality Review

 - Check for code duplication
 - Evaluate error handling patterns (consistent with existing codebase?)
 - Check naming conventions and code style
 - Verify TypeScript types (any `any` usage, missing types?)

-#### 3c. Security Review
+#### 4c. Security Review

 - Check for missing authentication/authorization on new endpoints
 - Check for injection vulnerabilities (URL params, SQL, XSS)
 - Verify input validation on all user-controlled data
 - Check for hardcoded secrets or credentials

-#### 3d. Architecture Review
+#### 4d. Architecture Review

 - Does the change follow existing patterns?
 - Are there any breaking changes to public APIs?
 - Is the database schema affected? Migration needed?
 - Impact on performance (N+1 queries, missing indexes?)

-#### 3e. Test Coverage
+#### 4e. Test Coverage

 - Does the PR include tests?
 - Are edge cases covered?
 - Would existing tests break?

-#### 3f. Cross-Layer (Global) Analysis
+#### 4f. Cross-Layer (Global) Analysis

 Perform a **global impact assessment** to verify whether the PR changes are complete across all layers of the application:

@@ -97,7 +120,7 @@ Perform a **global impact assessment** to verify whether the PR changes are comp
 - **Cross-cutting concerns**: Check shared layers (types, DTOs, validation schemas, routes, middleware) for completeness
 - **Document gaps** — If missing layers are detected, list them as **IMPORTANT** issues in the report with concrete suggestions for what should be added

-### 4. Generate Report — Create a markdown report for each PR including:
+### 5. Generate Report — Create a markdown report for each PR including:

 - **PR Summary** — What it does, files affected, commit count
 - **Improvements/Benefits** — Numbered list with impact level (HIGH/MEDIUM/LOW)
@@ -106,45 +129,35 @@ Perform a **global impact assessment** to verify whether the PR changes are comp
 - **Verdict** — Ready to merge? With mandatory vs optional fixes
 - **Next Steps** — What will happen if approved

-### 5. Present to User
+### 6. Present to User

 - Show the report via `notify_user` with `BlockedOnUser: true`
 - Wait for user decision:
-  - **Approved** → Proceed to step 6
+  - **Approved** → Proceed to step 7
  - **Approved with changes** → Implement the fixes and corrections before merging
  - **Rejected** → Close the PR or leave a review comment

-### 6. Implementation (if approved)
+### 7. Implementation (if approved)

- Checkout the PR branch: `gh pr checkout <NUMBER>`
- Implement any required fixes identified in the analysis
- If the Cross-Layer Analysis (3f) identified missing frontend/backend counterparts, implement them
- **Commit improvements on top of the PR branch** with descriptive commit messages
+> **⚠️ ALL work happens on the release branch, NOT the PR branch.**
+
+- Cherry-pick or merge the PR's changes into the current release branch:
+
+  ```bash
+  # Option A: Merge PR branch into release branch
+  git merge --no-ff <pr-branch> -m "Merge PR #<NUMBER>: <title>"
+
+  # Option B: Cherry-pick if cleaner
+  git cherry-pick <commit-hash>
+  ```
+
+- Implement any required fixes identified in the analysis **on the release branch**
+- If the Cross-Layer Analysis (4f) identified missing frontend/backend counterparts, implement them
 - Run the project's test suite to verify nothing breaks
  // turbo
 - Run: `npm test` or equivalent test command
- Build the project to verify compilation
-  // turbo
- Run: `npm run build` or equivalent build command
- Push the updated branch: `git push origin <branch-name>`
-
-### 7. 🛑 WAIT — Notify User & Await PR Verification
-
-**This is a mandatory stop point.** Use `notify_user` with `BlockedOnUser: true`:
-
- Inform the user that the PR has been **improved and pushed**, and is **awaiting their verification**
- Include:
-  - PR number and URL
-  - Summary of improvements/fixes applied
-  - Build/test status
-  - List of files changed
- **DO NOT merge, generate releases, or deploy until the user confirms**
-
-Wait for the user to respond:
-
- **User confirms** → Proceed to step 8
- **User requests more changes** → Apply changes, push to the same branch, notify again
- **User rejects** → Leave a review comment and stop
+- Commit improvements with descriptive messages
+- Push the release branch: `git push origin release/vX.Y.Z`
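The merge option in this step assumes `<pr-branch>` is already available locally, which is often not the case for PRs opened from forks. A minimal sketch of fetching the PR head first, using the `pull/<NUMBER>/head` ref that GitHub exposes, is shown below. `pr_refspec` and `integrate_pr` are hypothetical helper names, and the local branch name `pr-<NUMBER>` is an arbitrary choice for this example.

```bash
# Hypothetical helper: refspec that copies a GitHub PR head into a local branch.
pr_refspec() {
  echo "pull/$1/head:pr-$1"
}

# Hypothetical wrapper: fetch PR <number> and merge it into the current branch,
# which is expected to be the release branch.
integrate_pr() {
  local number="$1" title="$2"
  git fetch origin "$(pr_refspec "$number")" || return 1
  git merge --no-ff "pr-$number" -m "Merge PR #$number: $title"
}
```

For example, `integrate_pr 123 "Add provider X"` would fetch `pull/123/head` into `pr-123` and create a merge commit on whichever branch is checked out, so it is worth confirming the current branch before calling it.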

 ### 8. Thank the Contributor

@@ -152,16 +165,21 @@ Wait for the user to respond:
 - The message should:
  - Thank the author by name/username for their contribution
  - Briefly mention what the PR accomplishes and any improvements applied
+  - Note it will be included in the upcoming release
  - Be friendly, professional, and encouraging
- Example: _"Thanks @author for this great contribution! 🎉 The [feature/fix] is now merged and will be part of the next release. We appreciate your effort!"_
+- Example: _"Thanks @author for this great contribution! 🎉 The [feature/fix] has been integrated into the release/vX.Y.Z branch and will be part of the next release. We appreciate your effort!"_

-### 9. Merge & Release (only after user confirms PR)
+### 9. Close the Original PR

-After the user confirms the PR:
+- Close the original PR with a comment explaining it was integrated into the release branch:
+  ```bash
+  gh pr close <NUMBER> --repo <owner>/<repo> --comment "Integrated into release/vX.Y.Z. Will be released as part of vX.Y.Z. Thank you!"
+  ```

-1. **Merge** the PR into main (local merge with `--no-ff` or via `gh pr merge`)
-2. **Push** to main: `git push origin main`
-3. **Clean up** the feature branch: `git branch -d <branch-name>`
-4. **Update CHANGELOG.md** with the new feature/fix
-5. Run the `/generate-release` workflow (at `.agents/workflows/generate-release.md`) to bump version, tag, and publish
-6. Deploy to local VPS: `ssh root@192.168.0.15 "npm install -g omniroute@<VERSION> && pm2 restart omniroute"`
+### 10. Continue or Finalize
+
+After processing all approved PRs:
+
+- If more PRs remain, go back to step 7
+- When all PRs are processed, **update CHANGELOG.md** on the release branch with all new entries
+- Run `/generate-release` workflow Phase 1 steps 7–10 (tests → commit → push → open PR to main → wait for user)
@@ -21,6 +21,7 @@ STORAGE_ENCRYPTION_KEY_VERSION=v1
 LOG_RETENTION_DAYS=90
 SQLITE_MAX_SIZE_MB=2048
 SQLITE_CLEAN_LEGACY_FILES=true
+DISABLE_SQLITE_AUTO_BACKUP=false

 # Recommended runtime variables
 # Canonical/base port (keeps backward compatibility)
@@ -135,3 +135,6 @@ vscode-extension/

 # Compiled npm-package build artifact (not source, should not be in git)
 /app
+
+# IDEA
+.idea/
@@ -2,6 +2,69 @@

 ## [Unreleased]

+> [!WARNING]
+> **BREAKING CHANGE: request logging, retention, and logging environment variables have been redesigned.**
+> On the first startup after upgrading, OmniRoute archives legacy request logs from `DATA_DIR/logs/`, legacy `DATA_DIR/call_logs/`, and `DATA_DIR/log.txt` into `DATA_DIR/log_archives/*.zip`, then removes the deprecated layout and switches to the new unified artifact format under `DATA_DIR/call_logs/`.
+
+### ✨ New Features
+
+- **Unified Request Log Artifacts:** Request logging now stores one SQLite index row plus one JSON artifact per request under `DATA_DIR/call_logs/`, with optional pipeline capture embedded in the same file.
+- **Language:** Improved the Chinese translation (#855)
+- **Opencode-Zen Models:** Added 4 free models to opencode-zen registry (#854)
+- **Tests:** Added unit and E2E tests for settings toggles and bug fixes (#850)
+
+### 🐛 Bug Fixes
+
+- **429 Quota Parsing:** Parsed long quota reset times from error bodies to honor correct backoffs and prevent rate-limited account bans (#859)
+- **Prompt Caching:** Preserved client `cache_control` markers for all Claude-protocol providers (such as Minimax, GLM, and Bailian) by treating any provider whose target format is `claude` as caching-capable (#856)
+- **Model Sync Logs:** Reduced log spam by recording `sync-models` only when the channel actually modifies the list (#853)
+- **Provider Quota & Token Parsing:** Switched Antigravity limits to use `retrieveUserQuota` natively and correctly mapped Claude token refresh payloads to URL-encoded forms (#862)
+- **Rate-Limiting Stability:** Unified 429 `Retry-After` parsing across providers and capped provider-induced cooldowns at 24 hours (#862)
+- **Dashboard Limit Rendering:** Reworked `/dashboard/limits` quota mapping to render progressively in chunks, fixing a long UI freeze on accounts with more than 70 active connections (#784)
+
+### ⚠️ Breaking Changes
+
+- **Request Log Layout:** Removed the old multi-file `DATA_DIR/logs/` request log sessions and the `DATA_DIR/log.txt` summary file. New requests are written as single JSON artifacts in `DATA_DIR/call_logs/YYYY-MM-DD/`.
+- **Logging Environment Variables:** Replaced `LOG_*`, `ENABLE_REQUEST_LOGS`, `CALL_LOGS_MAX`, `CALL_LOG_PAYLOAD_MODE`, and `PROXY_LOG_MAX_ENTRIES` with the new `APP_LOG_*` and `CALL_LOG_RETENTION_DAYS` configuration model.
+- **Pipeline Toggle Setting:** Replaced the legacy `detailed_logs_enabled` setting with `call_log_pipeline_enabled`. New pipeline details are embedded inside the request artifact instead of being stored as separate `request_detail_logs` records.
+
+### 🛠️ Maintenance
+
+- **Legacy Request Log Upgrade Backup:** Upgrades now archive old `data/logs/`, legacy `data/call_logs/`, and `data/log.txt` layouts into `DATA_DIR/log_archives/*.zip` before removing the deprecated structure.
+- **Streaming Usage Persistence:** Streaming requests now write a single `usage_history` row on completion instead of emitting a duplicate in-progress usage row with empty status metadata.
+
+---
+
+## [3.3.11] - 2026-03-31
+
+### 🚀 Features
+
+- **Subscription Utilization Analytics:** Added quota snapshot time-series tracking, Provider Utilization and Combo Health tabs with recharts visualizations, and corresponding API endpoints (#847)
+- **SQLite Backup Control:** New `DISABLE_SQLITE_AUTO_BACKUP` env flag to disable automatic SQLite backups (#846)
+- **Model Registry Update:** Injected `gpt-5.4-mini` into the Codex provider's array of models (#756)
+- **Provider Limit Tracking:** Track and display when provider rate limits were last refreshed per account (#843)
+
+### 🐛 Bug Fixes
+
+- **Qwen Auth Routing:** Re-routed Qwen OAuth completions from the DashScope API to the Web Inference API (`chat.qwen.ai`), resolving authorization failures (#844, #807, #832)
+- **Qwen Auto-Retry Loop:** Added targeted 429 Quota Exceeded backoff handling inside `chatCore` to protect burst requests
+- **Codex OAuth Fallback:** Modern browser popup blocking no longer traps the user; it automatically falls back to manual URL entry (#808)
+- **Claude Token Refresh:** Token refresh requests now send an `application/json` body, as Anthropic's endpoint requires, instead of a URL-encoded form (#836)
+- **Codex Messages Schema:** Stripped client-injected `messages` and `prompt` fields from native passthrough requests so the ChatGPT upstream no longer rejects them on schema grounds (#806)
+- **CLI Detection Size Limit:** Safely bumped the Node binary scanning upper bound from 100MB to 350MB, allowing heavy standalone tools like Claude Code (229MB) and OpenCode (153MB) to be correctly detected by the VPS runtime (#809)
+- **CLI Runtime Environment:** Restored ability for CLI configurations to respect user override paths (`CLI_{PROVIDER}_BIN`) bypassing strict path-bound discovery rules
+- **Nvidia Parameter Conflicts:** Stopped injecting `prompt_cache_key` for providers that do not support prompt caching, after the NVIDIA API rejected the parameter (#848)
+- **Codex Fast Tier Toggle:** Restored Codex service tier toggle contrast in light mode (#842)
+- **Test Infrastructure:** Updated `t28-model-catalog-updates` test that incorrectly expected the outdated DashScope endpoint for the Qwen native registry
+
+---
+
+## [3.3.9] - 2026-03-31
+
+### 🐛 Bug Fixes
+
+- **Custom Provider Rotation:** Integrated `getRotatingApiKey` internally inside DefaultExecutor, ensuring `extraApiKeys` rotation triggers correctly for custom and compatible upstream providers (#815)
+
 ---

 ## [3.3.8] - 2026-03-30
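Reviewer note on the changelog hunk above: the new layout stores one JSON artifact per request under `DATA_DIR/call_logs/YYYY-MM-DD/`. The sketch below shows one way such a path could be derived. The helper name `callLogArtifactPath`, the `<requestId>.json` file name, and the use of the UTC date are assumptions for illustration; only the dated directory layout comes from the release notes.

```typescript
// Hypothetical helper, not taken from the OmniRoute source.
// Only the `call_logs/YYYY-MM-DD/` directory layout is stated in the changelog;
// the file name and the UTC day boundary are assumptions.
function callLogArtifactPath(dataDir: string, requestId: string, when: Date): string {
  // toISOString() always yields a UTC timestamp such as 2026-03-31T12:00:00.000Z.
  const day = when.toISOString().slice(0, 10);
  // Drop trailing slashes so the joined path has no doubled separator.
  const base = dataDir.replace(/\/+$/, "");
  return `${base}/call_logs/${day}/${requestId}.json`;
}
```

Grouping artifacts by day keeps retention cleanup simple, since whole directories older than the retention window can be removed at once.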
@@ -41,6 +41,17 @@ Key variables for development:
 | `INITIAL_PASSWORD`     | `123456`                | First login password      |
 | `ENABLE_REQUEST_LOGS`  | `false`                 | Enable debug request logs |

+### Dashboard Settings
+
+The dashboard provides UI toggles for features that can also be configured via environment variables:
+
+| Setting Location    | Toggle             | Description                    |
+| ------------------- | ------------------ | ------------------------------ |
+| Settings → Advanced | Debug Mode         | Enable debug request logs (UI) |
+| Settings → General  | Sidebar Visibility | Show/hide sidebar sections     |
+
+These settings are stored in the database and persist across restarts, overriding env var defaults when set.
+
 ### Running Locally

 ```bash
@@ -26,6 +26,30 @@ _Your universal API proxy — one endpoint, 67+ providers, zero downtime. Now wi

 ---

+## Breaking Change: Unified Logging Upgrade
+
+> [!WARNING]
+> **This release changes both the on-disk request log layout and the logging environment variables.**
+>
+> If you are upgrading an existing instance:
+>
+> - Request logs now live in `DATA_DIR/call_logs/YYYY-MM-DD/` as **one JSON artifact per request**.
+> - The old `DATA_DIR/logs/` session folders and `DATA_DIR/log.txt` summary file are removed.
+> - On the first startup after upgrading, OmniRoute creates a safety backup at `DATA_DIR/log_archives/*.zip` before removing the deprecated request log layout.
+> - Legacy logging env vars such as `LOG_TO_FILE`, `LOG_FILE_PATH`, `LOG_MAX_FILE_SIZE`, `LOG_RETENTION_DAYS`, `LOG_LEVEL`, `LOG_FORMAT`, `ENABLE_REQUEST_LOGS`, `CALL_LOGS_MAX`, `CALL_LOG_PAYLOAD_MODE`, and `PROXY_LOG_MAX_ENTRIES` are no longer supported.
+> - Use the new env model instead:
+>   - `APP_LOG_TO_FILE`
+>   - `APP_LOG_FILE_PATH`
+>   - `APP_LOG_MAX_FILE_SIZE`
+>   - `APP_LOG_RETENTION_DAYS`
+>   - `APP_LOG_LEVEL`
+>   - `APP_LOG_FORMAT`
+>   - `CALL_LOG_RETENTION_DAYS`
+>
+> For release details and upgrade notes, see the [CHANGELOG](CHANGELOG.md).
+
+---
+
 ## 🆕 What's New in v3.0.0

 > **Upgrading from v2.9.5?** — See the [full CHANGELOG](CHANGELOG.md#300--2026-03-22-release-candidate--not-yet-merged-to-main) for all changes.
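Reviewer note on the Unified Logging Upgrade notice in this hunk: operators upgrading an existing instance may want to check their environment for the retired variables. The sketch below is a hypothetical audit helper, not part of OmniRoute. The one-to-one `LOG_*` to `APP_LOG_*` pairing is inferred from the matching names in the notice and should be treated as an assumption.

```typescript
// Assumed mapping: inferred from the matching names in the README notice,
// not confirmed by the source.
const LEGACY_TO_NEW: Record<string, string> = {
  LOG_TO_FILE: "APP_LOG_TO_FILE",
  LOG_FILE_PATH: "APP_LOG_FILE_PATH",
  LOG_MAX_FILE_SIZE: "APP_LOG_MAX_FILE_SIZE",
  LOG_RETENTION_DAYS: "APP_LOG_RETENTION_DAYS",
  LOG_LEVEL: "APP_LOG_LEVEL",
  LOG_FORMAT: "APP_LOG_FORMAT",
};

// Listed in the notice as no longer supported, with no named replacement.
const REMOVED_WITHOUT_REPLACEMENT = [
  "ENABLE_REQUEST_LOGS",
  "CALL_LOGS_MAX",
  "CALL_LOG_PAYLOAD_MODE",
  "PROXY_LOG_MAX_ENTRIES",
];

interface LoggingEnvAudit {
  renames: Array<{ from: string; to: string }>;
  unsupported: string[];
}

// Hypothetical helper: reports which legacy logging variables are still set.
function auditLoggingEnv(env: Record<string, string | undefined>): LoggingEnvAudit {
  const renames: Array<{ from: string; to: string }> = [];
  for (const [from, to] of Object.entries(LEGACY_TO_NEW)) {
    if (env[from] !== undefined) renames.push({ from, to });
  }
  const unsupported = REMOVED_WITHOUT_REPLACEMENT.filter((name) => env[name] !== undefined);
  return { renames, unsupported };
}
```

Passing the environment in as a plain record keeps the helper testable without touching the real process environment.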
@@ -409,7 +433,7 @@ Installing, configuring, and maintaining an AI proxy across different environmen
 - **Electron Desktop App** — Native app for Windows/macOS/Linux with system tray, auto-start, offline mode
 - **Split-Port Mode** — API and Dashboard on separate ports for advanced scenarios (reverse proxy, container networking)
 - **Cloud Sync** — Config synchronization across devices via Cloudflare Workers
- **DB Backups** — Automatic backup, restore, export and import of all settings
+- **DB Backups** — Automatic backup, restore, export and import of all settings, with `DISABLE_SQLITE_AUTO_BACKUP` for externally managed backups

 </details>

@@ -1276,21 +1300,22 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.

 ### ☁️ Deployment & Platform

-| Feature                       | What It Does                                              |
-| ----------------------------- | --------------------------------------------------------- |
-| 🌐 **Deploy Anywhere**        | Localhost, VPS, Docker, Cloud environments                |
-| 🚇 **Cloudflare Tunnel** 🆕   | One-click Quick Tunnel integration from the dashboard     |
-| 💾 **Cloud Sync**             | Configuration sync via cloud worker                       |
-| 🔄 **Backup/Restore**         | Export/import and disaster recovery flows                 |
-| 🧙 **Onboarding Wizard**      | First-run guided setup                                    |
-| 🔧 **CLI Tools Dashboard**    | One-click setup for popular coding tools                  |
-| 🎮 **Model Playground**       | Test any provider/model/endpoint from the dashboard       |
-| 🔏 **CLI Fingerprint Toggle** | Per-provider fingerprint matching in Settings > Security  |
-| 🌐 **i18n (30 languages)**    | Full dashboard + docs language support with RTL coverage  |
-| 🧹 **Clear All Models**       | One-click model list clearing in provider details         |
-| 👁️ **Sidebar Controls** 🆕    | Hide components and integrations from Appearance Settings |
-| 📋 **Issue Templates**        | Standardized GitHub templates for bugs and features       |
-| 📂 **Custom Data Directory**  | `DATA_DIR` override for storage location                  |
+| Feature                        | What It Does                                                          |
+| ------------------------------ | --------------------------------------------------------------------- |
+| 🌐 **Deploy Anywhere**         | Localhost, VPS, Docker, Cloud environments                            |
+| 🚇 **Cloudflare Tunnel** 🆕    | One-click Quick Tunnel integration from the dashboard                 |
+| 🔑 **API Key Model Filtering** | Native /v1/models response filtered via assigned Bearer context roles |
+| ⚡ **Smart Cache Bypass**      | Configurable TTL heuristics and forced refetch controls               |
+| 🔄 **Backup/Restore**          | Export/import and disaster recovery flows                             |
+| 🧙 **Onboarding Wizard**       | First-run guided setup                                                |
+| 🔧 **CLI Tools Dashboard**     | One-click setup for popular coding tools                              |
+| 🎮 **Model Playground**        | Test any provider/model/endpoint from the dashboard                   |
+| 🔏 **CLI Fingerprint Toggle**  | Per-provider fingerprint matching in Settings > Security              |
+| 🌐 **i18n (30 languages)**     | Full dashboard + docs language support with RTL coverage              |
+| 🧹 **Clear All Models**        | One-click model list clearing in provider details                     |
+| 👁️ **Sidebar Controls** 🆕     | Hide components and integrations from Appearance Settings             |
+| 📋 **Issue Templates**         | Standardized GitHub templates for bugs and features                   |
+| 📂 **Custom Data Directory**   | `DATA_DIR` override for storage location                              |

 ### Feature Deep Dive

@@ -1810,7 +1835,9 @@ opencode

 **No request logs**

- Set `ENABLE_REQUEST_LOGS=true` in `.env`
+- Request artifacts are written to `DATA_DIR/call_logs/` as one JSON file per request
+- Enable pipeline capture from Dashboard → Logs → Request Logs if you need detailed per-stage payloads
+- Set `APP_LOG_TO_FILE=true` if you also want application console logs in `logs/application/app.log`

 **Connection test shows "Invalid" for OpenAI-compatible providers**

@@ -507,25 +507,26 @@ post_install() {

 ### Environment Variables

-| Variable                  | Default                              | Description                                                      |
-| ------------------------- | ------------------------------------ | ---------------------------------------------------------------- |
-| `JWT_SECRET`              | `omniroute-default-secret-change-me` | JWT signing secret (**change in production**)                    |
-| `INITIAL_PASSWORD`        | `123456`                             | First login password                                             |
-| `DATA_DIR`                | `~/.omniroute`                       | Data directory (db, usage, logs)                                 |
-| `PORT`                    | framework default                    | Service port (`20128` in examples)                               |
-| `HOSTNAME`                | framework default                    | Bind host (Docker defaults to `0.0.0.0`)                         |
-| `NODE_ENV`                | runtime default                      | Set `production` for deploy                                      |
-| `BASE_URL`                | `http://localhost:20128`             | Server-side internal base URL                                    |
-| `CLOUD_URL`               | `https://omniroute.dev`              | Cloud sync endpoint base URL                                     |
-| `API_KEY_SECRET`          | `endpoint-proxy-api-key-secret`      | HMAC secret for generated API keys                               |
-| `REQUIRE_API_KEY`         | `false`                              | Enforce Bearer API key on `/v1/*`                                |
-| `ALLOW_API_KEY_REVEAL`    | `false`                              | Allow Api Manager to copy full API keys on demand                |
-| `ENABLE_REQUEST_LOGS`     | `false`                              | Enables request/response logs                                    |
-| `AUTH_COOKIE_SECURE`      | `false`                              | Force `Secure` auth cookie (behind HTTPS reverse proxy)          |
-| `CLOUDFLARED_BIN`         | unset                                | Use an existing `cloudflared` binary instead of managed download |
-| `OMNIROUTE_MEMORY_MB`     | `512`                                | Node.js heap limit in MB                                         |
-| `PROMPT_CACHE_MAX_SIZE`   | `50`                                 | Max prompt cache entries                                         |
-| `SEMANTIC_CACHE_MAX_SIZE` | `100`                                | Max semantic cache entries                                       |
+| Variable                               | Default                              | Description                                                                                |
+| -------------------------------------- | ------------------------------------ | ------------------------------------------------------------------------------------------ |
+| `JWT_SECRET`                           | `omniroute-default-secret-change-me` | JWT signing secret (**change in production**)                                              |
+| `INITIAL_PASSWORD`                     | `123456`                             | First login password                                                                       |
+| `DATA_DIR`                             | `~/.omniroute`                       | Data directory (db, usage, logs)                                                           |
+| `PORT`                                 | framework default                    | Service port (`20128` in examples)                                                         |
+| `HOSTNAME`                             | framework default                    | Bind host (Docker defaults to `0.0.0.0`)                                                   |
+| `NODE_ENV`                             | runtime default                      | Set `production` for deploy                                                                |
+| `BASE_URL`                             | `http://localhost:20128`             | Server-side internal base URL                                                              |
+| `CLOUD_URL`                            | `https://omniroute.dev`              | Cloud sync endpoint base URL                                                               |
+| `API_KEY_SECRET`                       | `endpoint-proxy-api-key-secret`      | HMAC secret for generated API keys                                                         |
+| `REQUIRE_API_KEY`                      | `false`                              | Enforce Bearer API key on `/v1/*`                                                          |
+| `ALLOW_API_KEY_REVEAL`                 | `false`                              | Allow Api Manager to copy full API keys on demand                                          |
+| `DISABLE_SQLITE_AUTO_BACKUP`           | `false`                              | Disable automatic SQLite snapshots before writes/import/restore; manual backups still work |
+| `ENABLE_REQUEST_LOGS`                  | `false`                              | Enables request/response logs                                                              |
+| `AUTH_COOKIE_SECURE`                   | `false`                              | Force `Secure` auth cookie (behind HTTPS reverse proxy)                                    |
+| `CLOUDFLARED_BIN`                      | unset                                | Use an existing `cloudflared` binary instead of managed download                           |
+| `OMNIROUTE_MEMORY_MB`                  | `512`                                | Node.js heap limit in MB                                                                   |
+| `PROMPT_CACHE_MAX_SIZE`                | `50`                                 | Max prompt cache entries                                                                   |
+| `SEMANTIC_CACHE_MAX_SIZE`              | `100`                                | Max semantic cache entries                                                                 |

 For the full environment variable reference, see the [README](../README.md).

@@ -768,11 +769,11 @@ OmniRoute implements provider-level resilience with four components:

 Manage database backups in **Dashboard → Settings → System & Storage**.

-| Action                   | Description                                                                                                                    |
-| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------ |
-| **Export Database**      | Downloads the current SQLite database as a `.sqlite` file                                                                      |
-| **Export All (.tar.gz)** | Downloads a full backup archive including: database, settings, combos, provider connections (no credentials), API key metadata |
-| **Import Database**      | Upload a `.sqlite` file to replace the current database. A pre-import backup is automatically created                          |
+| Action                   | Description                                                                                                                                              |
+| ------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Export Database**      | Downloads the current SQLite database as a `.sqlite` file                                                                                                |
+| **Export All (.tar.gz)** | Downloads a full backup archive including: database, settings, combos, provider connections (no credentials), API key metadata                           |
+| **Import Database**      | Upload a `.sqlite` file to replace the current database. A pre-import backup is automatically created unless `DISABLE_SQLITE_AUTO_BACKUP=true`           |

 ```bash
 # API: Export database
@@ -1,7 +1,7 @@
 openapi: 3.1.0
 info:
  title: OmniRoute API
-  version: 3.3.8
+  version: 3.3.11
  description: |
    OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
    endpoint that routes requests to multiple AI providers with load balancing,
@@ -1,6 +1,6 @@
 {
  "name": "omniroute-desktop",
-  "version": "3.3.7",
+  "version": "3.3.11",
  "description": "OmniRoute Desktop Application",
  "main": "main.js",
  "author": {
@@ -264,6 +264,7 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    },
    models: [
      { id: "gpt-5.4", name: "GPT 5.4" },
+      { id: "gpt-5.4-mini", name: "GPT 5.4 Mini" },
      { id: "gpt-5.3-codex", name: "GPT 5.3 Codex" },
      { id: "gpt-5.3-codex-xhigh", name: "GPT 5.3 Codex (xHigh)" },
      { id: "gpt-5.3-codex-high", name: "GPT 5.3 Codex (High)" },
@@ -286,7 +287,7 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    alias: "qw",
    format: "openai",
    executor: "default",
-    baseUrl: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
+    baseUrl: "https://chat.qwen.ai/api/v1/services/aigc/text-generation/generation",
    authType: "oauth",
    authHeader: "bearer",
    headers: {
@@ -590,9 +591,13 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    authPrefix: "Bearer",
    defaultContextLength: 200000,
    models: [
-      { id: "minimax-m2.5-free", name: "MiniMax M2.5 Free" },
-      { id: "big-pickle", name: "Big Pickle" },
-      { id: "gpt-5-nano", name: "GPT 5 Nano" },
+      { id: "minimax-m2.5-free", name: "MiniMax M2.5 Free", contextLength: 204800 },
+      { id: "big-pickle", name: "Big Pickle", contextLength: 200000 },
+      { id: "gpt-5-nano", name: "GPT 5 Nano", contextLength: 400000 },
+      { id: "mimo-v2-omni-free", name: "MiMo V2 Omni Free", contextLength: 262144 },
+      { id: "mimo-v2-pro-free", name: "MiMo V2 Pro Free", contextLength: 1048576 },
+      { id: "nemotron-3-super-free", name: "Nemotron 3 Super Free", contextLength: 1000000 },
+      { id: "qwen3.6-plus-free", name: "Qwen 3.6 Plus Free", contextLength: 1048576 },
    ],
  },

@@ -2,7 +2,8 @@ import crypto from "crypto";
 import { BaseExecutor, mergeUpstreamExtraHeaders } from "./base.ts";
 import { PROVIDERS, OAUTH_ENDPOINTS, HTTP_STATUS } from "../config/constants.ts";

-const MAX_RETRY_AFTER_MS = 10000;
+const MAX_RETRY_AFTER_MS = 60_000;
+const LONG_RETRY_THRESHOLD_MS = 60_000;

 /**
 * Strip provider prefixes (e.g. "antigravity/model" → "model").
@@ -224,12 +225,15 @@ export class AntigravityExecutor extends BaseExecutor {
          signal,
        });

+        // Parse retry time for 429/503 responses
+        let retryMs = null;
+
        if (
          response.status === HTTP_STATUS.RATE_LIMITED ||
          response.status === HTTP_STATUS.SERVICE_UNAVAILABLE
        ) {
          // Try to get retry time from headers first
-          let retryMs = this.parseRetryHeaders(response.headers);
+          retryMs = this.parseRetryHeaders(response.headers);

          // If no retry time in headers, try to parse from error message body
          if (!retryMs) {
@@ -243,12 +247,13 @@ export class AntigravityExecutor extends BaseExecutor {
            }
          }

-          if (retryMs && retryMs <= MAX_RETRY_AFTER_MS) {
+          if (retryMs && retryMs <= LONG_RETRY_THRESHOLD_MS) {
+            const effectiveRetryMs = Math.min(retryMs, MAX_RETRY_AFTER_MS);
            log?.debug?.(
              "RETRY",
-              `${response.status} with Retry-After: ${Math.ceil(retryMs / 1000)}s, waiting...`
+              `${response.status} with Retry-After: ${Math.ceil(effectiveRetryMs / 1000)}s, waiting...`
            );
-            await new Promise((resolve) => setTimeout(resolve, retryMs));
+            await new Promise((resolve) => setTimeout(resolve, effectiveRetryMs));
            urlIndex--;
            continue;
          }
@@ -291,6 +296,33 @@ export class AntigravityExecutor extends BaseExecutor {
          continue;
        }

+        // If we have a 429 with long retry time, embed it in response body
+        if (
+          response.status === HTTP_STATUS.RATE_LIMITED &&
+          retryMs &&
+          retryMs > LONG_RETRY_THRESHOLD_MS
+        ) {
+          try {
+            const respBody = await response.clone().text();
+            let obj;
+            try {
+              obj = JSON.parse(respBody);
+            } catch {
+              obj = {};
+            }
+            obj.retryAfterMs = retryMs;
+            const modifiedBody = JSON.stringify(obj);
+            const modifiedResponse = new Response(modifiedBody, {
+              status: response.status,
+              headers: response.headers,
+            });
+            return { response: modifiedResponse, url, headers, transformedBody };
+          } catch (err) {
+            log?.warn?.("RETRY", `Failed to embed retryAfterMs: ${err}`);
+            // Fall back to original response
+          }
+        }
+
        return { response, url, headers, transformedBody };
      } catch (error) {
        lastError = error;
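Reviewer note on the Antigravity hunks above: the change splits 429/503 handling into two branches, waiting in place for short retry hints and surfacing long ones to the caller via `retryAfterMs`. The sketch below restates that decision as pure functions so the branching is easier to review. `decideRetry`, `embedRetryAfter`, and the `RetryDecision` type are hypothetical names. Unlike the diff, `embedRetryAfter` also guards against non-object JSON bodies, which is an addition made for this sketch.

```typescript
// Constants mirror the values introduced in the diff.
const MAX_RETRY_AFTER_MS = 60_000;
const LONG_RETRY_THRESHOLD_MS = 60_000;

type RetryDecision =
  | { kind: "wait"; delayMs: number } // sleep, then retry the same URL
  | { kind: "surface"; retryAfterMs: number } // return the 429 with retryAfterMs embedded
  | { kind: "none" }; // fall through to the rest of the handler

// Hypothetical restatement of the branching in the diff.
function decideRetry(status: number, retryMs: number | null): RetryDecision {
  const retryable = status === 429 || status === 503;
  if (!retryable || !retryMs) return { kind: "none" };
  if (retryMs <= LONG_RETRY_THRESHOLD_MS) {
    return { kind: "wait", delayMs: Math.min(retryMs, MAX_RETRY_AFTER_MS) };
  }
  // Only 429 responses get the long retry time embedded; 503 falls through.
  return status === 429 ? { kind: "surface", retryAfterMs: retryMs } : { kind: "none" };
}

// Hypothetical helper: adds retryAfterMs to a response body, tolerating invalid JSON.
function embedRetryAfter(rawBody: string, retryAfterMs: number): string {
  let obj: Record<string, unknown> = {};
  try {
    const parsed: unknown = JSON.parse(rawBody);
    if (parsed && typeof parsed === "object" && !Array.isArray(parsed)) {
      obj = parsed as Record<string, unknown>;
    }
  } catch {
    // Body was not JSON: start from an empty object, as the diff does.
  }
  obj.retryAfterMs = retryAfterMs;
  return JSON.stringify(obj);
}
```

Because both constants are currently 60 seconds, the `Math.min` clamp never changes the delay; it only matters if the threshold is later raised above the maximum wait.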
@@ -270,6 +270,11 @@ export class CodexExecutor extends BaseExecutor {
    // Ensure store is false (Codex requirement)
    body.store = false;

+    // Issue #806: Even for native passthrough, some clients (purist completions) might indiscriminately inject
+    // a `messages` or `prompt` array which the strict Codex Responses schema rejects.
+    delete body.messages;
+    delete body.prompt;
+
    if (nativeCodexPassthrough) {
      return body;
    }
@@ -1,6 +1,7 @@
 import { BaseExecutor } from "./base.ts";
 import { PROVIDERS, OAUTH_ENDPOINTS } from "../config/constants.ts";
 import { getAccessToken } from "../services/tokenRefresh.ts";
+import { getRotatingApiKey } from "../services/apiKeyRotator.ts";

 export class DefaultExecutor extends BaseExecutor {
  constructor(provider) {
@@ -41,15 +42,23 @@ export class DefaultExecutor extends BaseExecutor {
  buildHeaders(credentials, stream = true) {
    const headers = { "Content-Type": "application/json", ...this.config.headers };

+    // T07: resolve extra keys round-robin locally since DefaultExecutor overrides BaseExecutor buildHeaders
+    const extraKeys =
+      (credentials.providerSpecificData?.extraApiKeys as string[] | undefined) ?? [];
+    const effectiveKey =
+      extraKeys.length > 0 && credentials.connectionId && credentials.apiKey
+        ? getRotatingApiKey(credentials.connectionId, credentials.apiKey, extraKeys)
+        : credentials.apiKey;
+
    switch (this.provider) {
      case "gemini":
-        credentials.apiKey
-          ? (headers["x-goog-api-key"] = credentials.apiKey)
+        effectiveKey
+          ? (headers["x-goog-api-key"] = effectiveKey)
          : (headers["Authorization"] = `Bearer ${credentials.accessToken}`);
        break;
      case "claude":
-        credentials.apiKey
-          ? (headers["x-api-key"] = credentials.apiKey)
+        effectiveKey
+          ? (headers["x-api-key"] = effectiveKey)
          : (headers["Authorization"] = `Bearer ${credentials.accessToken}`);
        break;
      case "glm":
@@ -58,12 +67,12 @@ export class DefaultExecutor extends BaseExecutor {
      case "kimi-coding-apikey":
      case "minimax":
      case "minimax-cn":
-        headers["x-api-key"] = credentials.apiKey || credentials.accessToken;
+        headers["x-api-key"] = effectiveKey || credentials.accessToken;
        break;
      default:
        if (this.provider?.startsWith?.("anthropic-compatible-")) {
-          if (credentials.apiKey) {
-            headers["x-api-key"] = credentials.apiKey;
+          if (effectiveKey) {
+            headers["x-api-key"] = effectiveKey;
          } else if (credentials.accessToken) {
            headers["Authorization"] = `Bearer ${credentials.accessToken}`;
          }
@@ -71,7 +80,7 @@ export class DefaultExecutor extends BaseExecutor {
            headers["anthropic-version"] = "2023-06-01";
          }
        } else {
-          headers["Authorization"] = `Bearer ${credentials.apiKey || credentials.accessToken}`;
+          headers["Authorization"] = `Bearer ${effectiveKey || credentials.accessToken}`;
        }
    }
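Reviewer note on the `DefaultExecutor` hunks above: `effectiveKey` is resolved once through `getRotatingApiKey` and then used in every header branch. The implementation in `services/apiKeyRotator.ts` is not part of this diff, so the sketch below uses a hypothetical stand-in, `rotateApiKey`, that assumes a simple per-connection round-robin over the primary key plus `extraApiKeys`. The real rotator may behave differently.

```typescript
// Hypothetical stand-in for getRotatingApiKey. Round-robin per connection
// is an assumption; the real rotator is not shown in this diff.
const rotationCounters = new Map<string, number>();

function rotateApiKey(connectionId: string, primaryKey: string, extraKeys: string[]): string {
  const pool = [primaryKey, ...extraKeys.filter((key) => key.length > 0)];
  const index = (rotationCounters.get(connectionId) ?? 0) % pool.length;
  rotationCounters.set(connectionId, (index + 1) % pool.length);
  return pool[index] ?? primaryKey;
}

interface SketchCredentials {
  connectionId?: string;
  apiKey?: string;
  providerSpecificData?: { extraApiKeys?: string[] };
}

// Mirrors the guard in the diff: rotate only when extra keys, a connection id,
// and a primary key are all present; otherwise use the primary key unchanged.
function resolveEffectiveKey(credentials: SketchCredentials): string | undefined {
  const extraKeys = credentials.providerSpecificData?.extraApiKeys ?? [];
  return extraKeys.length > 0 && credentials.connectionId && credentials.apiKey
    ? rotateApiKey(credentials.connectionId, credentials.apiKey, extraKeys)
    : credentials.apiKey;
}
```

Note that when no primary `apiKey` is set the result stays `undefined`, so the existing `accessToken` fallbacks in each `case` still apply.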

@@ -23,7 +23,7 @@ import {
 import { HTTP_STATUS, PROVIDER_MAX_TOKENS } from "../config/constants.ts";
 import { classifyProviderError, PROVIDER_ERROR_TYPES } from "../services/errorClassifier.ts";
 import { updateProviderConnection } from "@/lib/db/providers";
-import { isDetailedLoggingEnabled, saveRequestDetailLog } from "@/lib/db/detailedLogs";
+import { isDetailedLoggingEnabled } from "@/lib/db/detailedLogs";
 import { logAuditEvent } from "@/lib/compliance";
 import { handleBypassRequest } from "../utils/bypassHandler.ts";
 import {
@@ -47,7 +47,10 @@ import {
 } from "@/lib/localDb";
 import { getExecutor } from "../executors/index.ts";
 import { getCacheControlSettings } from "@/lib/cacheControlSettings";
-import { shouldPreserveCacheControl } from "../utils/cacheControlPolicy.ts";
+import {
+  shouldPreserveCacheControl,
+  providerSupportsCaching,
+} from "../utils/cacheControlPolicy.ts";
 import { getCacheMetrics } from "@/lib/db/settings.ts";

 import {
@@ -533,6 +536,24 @@ export async function handleChatCore({
    claudeCacheUsageMeta?: Record<string, unknown>;
  }) => {
    const callLogId = generateRequestId();
+    const pipelinePayloads = detailedLoggingEnabled ? reqLogger?.getPipelinePayloads?.() : null;
+
+    if (pipelinePayloads) {
+      if (providerResponse !== undefined) {
+        pipelinePayloads.providerResponse = providerResponse as Record<string, unknown>;
+      }
+      if (clientResponse !== undefined) {
+        pipelinePayloads.clientResponse = clientResponse as Record<string, unknown>;
+      }
+      if (error) {
+        pipelinePayloads.error = {
+          ...(typeof pipelinePayloads.error === "object" && pipelinePayloads.error
+            ? (pipelinePayloads.error as Record<string, unknown>)
+            : {}),
+          message: error,
+        };
+      }
+    }

    saveCallLog({
      id: callLogId,
@@ -565,31 +586,8 @@ export async function handleChatCore({
      apiKeyId: apiKeyInfo?.id || null,
      apiKeyName: apiKeyInfo?.name || null,
      noLog: noLogEnabled,
+      pipelinePayloads,
    }).catch(() => {});
-
-    if (!detailedLoggingEnabled) {
-      return;
-    }
-
-    try {
-      saveRequestDetailLog({
-        call_log_id: callLogId,
-        client_request: clientRawRequest?.body ?? body,
-        translated_request: providerRequest ?? null,
-        provider_response: providerResponse ?? null,
-        client_response: clientResponse ?? null,
-        provider,
-        model,
-        source_format: sourceFormat,
-        target_format: targetFormat,
-        duration_ms: Date.now() - startTime,
-        api_key_id: apiKeyInfo?.id || null,
-        no_log: noLogEnabled,
-      });
-    } catch (err) {
-      const errMessage = err instanceof Error ? err.message : String(err);
-      log?.debug?.("DETAIL_LOG", `Failed to save detailed log: ${errMessage}`);
-    }
  };

  // Primary path: merge client model id + alias target so config on either key applies; resolved
@@ -697,6 +695,7 @@ export async function handleChatCore({
    isCombo,
    comboStrategy,
    targetProvider: provider,
+    targetFormat,
    settings: { alwaysPreserveClientCache: cacheControlMode },
  });

@@ -711,6 +710,15 @@ export async function handleChatCore({
    if (nativeCodexPassthrough) {
      translatedBody = { ...body, _nativeCodexPassthrough: true };
      log?.debug?.("FORMAT", "native codex passthrough enabled");
+    } else if (isClaudePassthrough && preserveCacheControl) {
+      // Pure passthrough: when preserveCacheControl is true, forward the body
+      // as-is without any normalization. The OpenAI round-trip would strip
+      // cache_control markers; even prepareClaudeRequest can alter structure.
+      // Claude Code sends well-formed Messages API payloads — trust it.
+      translatedBody = { ...body };
+      translatedBody._disableToolPrefix = true;
+
+      log?.debug?.("FORMAT", "claude passthrough with cache_control preservation");
    } else if (isClaudePassthrough) {
      // Claude OAuth expects the same Claude Code prompt + structural normalization
      // as the OpenAI-compatible chat path. Round-trip through OpenAI to reuse the
@@ -955,11 +963,13 @@ export async function handleChatCore({
          ? translatedBody
          : { ...translatedBody, model: modelToCall };

-      // Inject prompt_cache_key for OpenAI providers if not already set
+      // Inject prompt_cache_key only for providers that support it
      if (
        targetFormat === FORMATS.OPENAI &&
+        providerSupportsCaching(provider) &&
        !bodyToSend.prompt_cache_key &&
-        Array.isArray(bodyToSend.messages)
+        Array.isArray(bodyToSend.messages) &&
+        !["nvidia", "codex", "xai"].includes(provider)
      ) {
        const { generatePromptCacheKey } = await import("@/lib/promptCache");
        const cacheKey = generatePromptCacheKey(bodyToSend.messages);
@@ -968,18 +978,39 @@ export async function handleChatCore({
        }
      }

-      const rawResult = await withRateLimit(provider, connectionId, modelToCall, () =>
-        executor.execute({
-          model: modelToCall,
-          body: bodyToSend,
-          stream,
-          credentials: getExecutionCredentials(),
-          signal: streamController.signal,
-          log,
-          extendedContext,
-          upstreamExtraHeaders: buildUpstreamHeadersForExecute(modelToCall),
-        })
-      );
+      const rawResult = await withRateLimit(provider, connectionId, modelToCall, async () => {
+        let attempts = 0;
+        const maxAttempts = provider === "qwen" ? 3 : 1;
+
+        while (attempts < maxAttempts) {
+          const res = await executor.execute({
+            model: modelToCall,
+            body: bodyToSend,
+            stream,
+            credentials: getExecutionCredentials(),
+            signal: streamController.signal,
+            log,
+            extendedContext,
+            upstreamExtraHeaders: buildUpstreamHeadersForExecute(modelToCall),
+          });
+
+          // Qwen quota 429: retry after 1.5s, then 3s (up to 3 attempts in total)
+          if (provider === "qwen" && res.response.status === 429 && attempts < maxAttempts - 1) {
+            const bodyPeek = await res.response
+              .clone()
+              .text()
+              .catch(() => "");
+            if (bodyPeek.toLowerCase().includes("exceeded your current quota")) {
+              const delay = 1500 * (attempts + 1);
+              log?.warn?.("QWEN_RETRY", `Quota 429 hit. Retrying in ${delay}ms...`);
+              await new Promise((r) => setTimeout(r, delay));
+              attempts++;
+              continue;
+            }
+          }
+          return res;
+        }
+      });

      if (stream) return rawResult;

@@ -1168,6 +1199,31 @@ export async function handleChatCore({
          console.warn(
            `[provider] Node ${connectionId} banned (${statusCode}) — disabling permanently`
          );
+        } else if (errorType === PROVIDER_ERROR_TYPES.ACCOUNT_DEACTIVATED) {
+          await updateProviderConnection(connectionId, {
+            isActive: false,
+            testStatus: "deactivated",
+            lastErrorType: errorType,
+            lastError: message,
+            errorCode: statusCode,
+          });
+          console.warn(
+            `[provider] Node ${connectionId} account deactivated (${statusCode}) — disabling permanently`
+          );
+        } else if (errorType === PROVIDER_ERROR_TYPES.RATE_LIMITED) {
+          const rateLimitedUntil = new Date(Date.now() + retryAfterMs).toISOString();
+          await updateProviderConnection(connectionId, {
+            rateLimitedUntil: rateLimitedUntil,
+            testStatus: "credits_exhausted",
+            lastErrorType: errorType,
+            lastError: message,
+            errorCode: statusCode,
+            healthCheckInterval: null,
+            lastHealthCheckAt: null,
+          });
+          console.warn(
+            `[provider] Node ${connectionId} rate limited (${statusCode}) - Next available at ${rateLimitedUntil}`
+          );
        } else if (errorType === PROVIDER_ERROR_TYPES.QUOTA_EXHAUSTED) {
          await updateProviderConnection(connectionId, {
            testStatus: "credits_exhausted",
@@ -1,6 +1,6 @@
 {
  "name": "@omniroute/open-sse",
-  "version": "3.3.7",
+  "version": "3.3.11",
  "description": "Express SSE sidecar for OmniRoute — handles streaming, protocol translation, and provider orchestration",
  "type": "module",
  "main": "index.js",
@@ -229,6 +229,39 @@ function parseDelayString(value) {
  return isNaN(num) ? null : num * 1000;
 }

+/**
+ * T07: Parse retry time from error text body with combined "XhYmZs" format.
+ * Examples: "Your quota will reset after 2h30m14s", "reset after 45m", "reset after 30s"
+ * Returns milliseconds or null if not parseable.
+ *
+ * @param {string} errorText - Error message text from response body
+ * @returns {number|null} Retry duration in milliseconds
+ */
+export function parseRetryFromErrorText(errorText) {
+  if (!errorText || typeof errorText !== "string") return null;
+
+  // A single pattern suffices: "reset after" is a substring of "will reset after",
+  // so both phrasings are matched here. Every duration group is optional, which
+  // means the pattern can match with no duration at all; computeDurationMs
+  // returns null in that case.
+  const match = errorText.match(/reset after (\d+h)?(\d+m)?(\d+s)?/i);
+  if (!match) return null;
+
+  // Sum the captured hour, minute and second groups into milliseconds.
+  return computeDurationMs(match);
+}
+
+/**
+ * Compute total milliseconds from regex match groups (Xh)(Ym)(Zs)
+ */
+function computeDurationMs(match) {
+  let totalMs = 0;
+  if (match[1]) totalMs += parseInt(match[1], 10) * 3600 * 1000; // hours
+  if (match[2]) totalMs += parseInt(match[2], 10) * 60 * 1000; // minutes
+  if (match[3]) totalMs += parseInt(match[3], 10) * 1000; // seconds
+  return totalMs > 0 ? totalMs : null;
+}
+
 // ─── Error Classification ───────────────────────────────────────────────────

 /**
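Reviewer note: the "XhYmZs" parsing added in this hunk can be exercised in isolation. The following is a minimal standalone sketch, not the project's module. The name `parseResetDurationMs` is hypothetical, and it folds the `computeDurationMs` helper into a single function for brevity.

```typescript
// Hypothetical standalone sketch of the "reset after XhYmZs" parsing.
// Returns milliseconds, or null when no usable duration is present.
function parseResetDurationMs(errorText: string): number | null {
  if (!errorText) return null;

  // Each duration group is optional, so "reset after 45m" and
  // "reset after 2h30m14s" are both accepted.
  const match = errorText.match(/reset after (\d+h)?(\d+m)?(\d+s)?/i);
  if (!match) return null;

  // parseInt stops at the first non-digit, so "2h" parses as 2.
  const hours = match[1] ? parseInt(match[1], 10) : 0;
  const minutes = match[2] ? parseInt(match[2], 10) : 0;
  const seconds = match[3] ? parseInt(match[3], 10) : 0;

  const totalMs = (hours * 3600 + minutes * 60 + seconds) * 1000;
  return totalMs > 0 ? totalMs : null;
}

// Example inputs taken from the doc comment in the diff.
const fullForm = parseResetDurationMs("Your quota will reset after 2h30m14s");
const minutesOnly = parseResetDurationMs("reset after 45m");
const noDuration = parseResetDurationMs("reset after a while");
console.log(fullForm, minutesOnly, noDuration);
```

Note that a message such as "reset after a while" still matches the pattern with all groups empty, which is why the zero-total check is needed before returning.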
@@ -241,6 +274,8 @@ export function classifyErrorText(errorText) {
  if (
    lower.includes("quota exceeded") ||
    lower.includes("quota depleted") ||
+    lower.includes("quota will reset") ||
+    lower.includes("your quota will reset") ||
    lower.includes("billing")
  ) {
    return RateLimitReason.QUOTA_EXHAUSTED;
@@ -428,6 +463,9 @@ export function checkFallbackError(
      lowerError.includes("rate limit") ||
      lowerError.includes("too many requests") ||
      lowerError.includes("quota exceeded") ||
+      lowerError.includes("quota will reset") ||
+      lowerError.includes("exhausted your capacity") ||
+      lowerError.includes("quota exhausted") ||
      lowerError.includes("capacity") ||
      lowerError.includes("overloaded")
    ) {
@@ -443,6 +481,15 @@ export function checkFallbackError(
          };
        }
      }
+      const retryFromBody = parseRetryFromErrorText(errorStr);
+      if (retryFromBody && retryFromBody > 60_000) {
+        return {
+          shouldFallback: true,
+          cooldownMs: retryFromBody,
+          newBackoffLevel: 0,
+          reason: RateLimitReason.RATE_LIMIT_EXCEEDED,
+        };
+      }
      const newLevel = Math.min(backoffLevel + 1, BACKOFF_CONFIG.maxLevel);
      const reason = classifyErrorText(errorStr);
      return {
@@ -46,6 +46,9 @@ export function classifyProviderError(statusCode: number, responseBody: unknown)
  }

  if (statusCode === 402) return PROVIDER_ERROR_TYPES.QUOTA_EXHAUSTED;
+  if (statusCode === 403 && accountDeactivated) {
+    return PROVIDER_ERROR_TYPES.ACCOUNT_DEACTIVATED;
+  }
  if (statusCode === 403) return PROVIDER_ERROR_TYPES.FORBIDDEN;
  if (statusCode >= 500) return PROVIDER_ERROR_TYPES.SERVER_ERROR;

@@ -207,17 +207,21 @@ export async function refreshKimiCodingToken(refreshToken, log) {
 */
 export async function refreshClaudeOAuthToken(refreshToken, log) {
  try {
+    // Standard OAuth2 token refresh uses form-urlencoded (not JSON)
+    const params = new URLSearchParams({
+      grant_type: "refresh_token",
+      refresh_token: refreshToken,
+      client_id: PROVIDERS.claude.clientId,
+    });
+
    const response = await fetch(OAUTH_ENDPOINTS.anthropic.token, {
      method: "POST",
      headers: {
-        "Content-Type": "application/json",
+        "Content-Type": "application/x-www-form-urlencoded",
        Accept: "application/json",
+        "anthropic-beta": "oauth-2025-04-20",
      },
-      body: JSON.stringify({
-        grant_type: "refresh_token",
-        refresh_token: refreshToken,
-        client_id: PROVIDERS.claude.clientId,
-      }),
+      body: params.toString(),
    });

    if (!response.ok) {
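Reviewer note: the switch from a JSON body to `URLSearchParams` changes how the refresh token is serialized on the wire. The sketch below shows only the encoding step, under the assumption that a global `URLSearchParams` is available (as in Node.js and browsers). `buildRefreshBody` and the sample values are hypothetical and no request is sent.

```typescript
// Hypothetical helper: builds the form-urlencoded body for a refresh_token grant.
function buildRefreshBody(refreshToken: string, clientId: string): string {
  const params = new URLSearchParams({
    grant_type: "refresh_token",
    refresh_token: refreshToken,
    client_id: clientId,
  });
  // toString() percent-encodes reserved characters and joins pairs with "&",
  // preserving insertion order.
  return params.toString();
}

// Made-up values; characters such as "/" and "+" are escaped in the output.
const encodedBody = buildRefreshBody("tok/1+2", "client-123");
console.log(encodedBody);
```

Building the body this way avoids hand-written escaping, which matters because refresh tokens can contain characters that are not safe in a form body.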
@@ -658,23 +658,33 @@ function getAntigravityPlanLabel(subscriptionInfo) {
 /**
 * Antigravity Usage - Fetch quota from Google Cloud Code API
 * Now calls loadCodeAssist ONCE (cached) and reuses for projectId + plan.
+ * Uses retrieveUserQuota API (same as Gemini CLI) for accurate quota data across all tiers.
 */
 async function getAntigravityUsage(accessToken, providerSpecificData) {
  try {
-    // Single cached call for subscription info (provides both projectId and plan)
    const subscriptionInfo = await getAntigravitySubscriptionInfoCached(accessToken);
    const projectId = subscriptionInfo?.cloudaicompanionProject || null;

-    // Fetch quota data
-    const response = await fetch(ANTIGRAVITY_CONFIG.quotaApiUrl, {
-      method: "POST",
-      headers: {
-        Authorization: `Bearer ${accessToken}`,
-        "User-Agent": ANTIGRAVITY_CONFIG.userAgent,
-        "Content-Type": "application/json",
-      },
-      body: JSON.stringify(projectId ? { project: projectId } : {}),
-    });
+    if (!projectId) {
+      return {
+        plan: getAntigravityPlanLabel(subscriptionInfo),
+        message: "Antigravity project ID not available.",
+      };
+    }
+
+    // Use retrieveUserQuota API (same as Gemini CLI) - works correctly for both Free and Pro tiers
+    const response = await fetch(
+      "https://cloudcode-pa.googleapis.com/v1internal:retrieveUserQuota",
+      {
+        method: "POST",
+        headers: {
+          Authorization: `Bearer ${accessToken}`,
+          "Content-Type": "application/json",
+        },
+        body: JSON.stringify({ project: projectId }),
+        signal: AbortSignal.timeout(10000),
+      }
+    );

    if (response.status === 403) {
      return { message: "Antigravity access forbidden. Check subscription." };
@@ -685,54 +695,26 @@ async function getAntigravityUsage(accessToken, providerSpecificData) {
    }

    const data = await response.json();
-    const dataObj = toRecord(data);
-    const modelEntries = toRecord(dataObj.models);
    const quotas: Record<string, UsageQuota> = {};

-    // Parse model quotas (inspired by vscode-antigravity-cockpit)
-    if (Object.keys(modelEntries).length > 0) {
-      // Filter only recommended/important models (must match PROVIDER_MODELS ag ids)
-      const importantModels = [
-        "claude-opus-4-6-thinking",
-        "claude-sonnet-4-6",
-        "gemini-3.1-pro-high",
-        "gemini-3.1-pro-low",
-        "gemini-3-flash",
-        "gpt-oss-120b-medium",
-      ];
+    // Parse buckets from retrieveUserQuota response (same format as Gemini CLI)
+    if (Array.isArray(data.buckets)) {
+      for (const bucket of data.buckets) {
+        if (!bucket.modelId || bucket.remainingFraction == null) continue;

-      for (const [modelKey, infoValue] of Object.entries(modelEntries)) {
-        const info = toRecord(infoValue);
-        const quotaInfo = toRecord(info.quotaInfo);
-        // Skip models without quota info
-        if (Object.keys(quotaInfo).length === 0) {
-          continue;
-        }
-
-        // Skip internal models and non-important models
-        if (info.isInternal === true || !importantModels.includes(modelKey)) {
-          continue;
-        }
-
-        const remainingFraction = toNumber(quotaInfo.remainingFraction, 0);
+        const remainingFraction = toNumber(bucket.remainingFraction, 0);
        const remainingPercentage = remainingFraction * 100;
-
-        // Convert percentage to used/total for UI compatibility
-        // QUOTA_NORMALIZED_BASE is an arbitrary base for converting fractions
-        // to integer used/total pairs that the dashboard UI can display as bars.
        const QUOTA_NORMALIZED_BASE = 1000;
        const total = QUOTA_NORMALIZED_BASE;
        const remaining = Math.round(total * remainingFraction);
-        const used = total - remaining;
+        const used = Math.max(0, total - remaining);

-        // Use modelKey as key (matches PROVIDER_MODELS id)
-        quotas[modelKey] = {
+        quotas[bucket.modelId] = {
          used,
          total,
-          resetAt: parseResetTime(quotaInfo.resetTime),
+          resetAt: parseResetTime(bucket.resetTime),
          remainingPercentage,
          unlimited: false,
-          displayName: typeof info.displayName === "string" ? info.displayName : modelKey,
        };
      }
    }
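Reviewer note: the bucket loop converts a `remainingFraction` into integer `used`/`total` values on a fixed base of 1000. A minimal sketch of that arithmetic follows. The `QuotaBar` type and `normalizeFraction` name are hypothetical, and the reset time is left out.

```typescript
// Hypothetical result shape; the real UsageQuota type has more fields.
interface QuotaBar {
  used: number;
  total: number;
  remainingPercentage: number;
}

// Arbitrary base for turning a fraction into an integer used/total pair.
const QUOTA_NORMALIZED_BASE = 1000;

function normalizeFraction(remainingFraction: number): QuotaBar {
  const total = QUOTA_NORMALIZED_BASE;
  const remaining = Math.round(total * remainingFraction);
  // Clamp at zero so a fraction above 1 cannot yield a negative "used" value.
  const used = Math.max(0, total - remaining);
  return { used, total, remainingPercentage: remainingFraction * 100 };
}

const quarterLeft = normalizeFraction(0.25);
const overFull = normalizeFraction(1.2);
console.log(quarterLeft, overFull);
```

The `Math.max(0, ...)` clamp is the behavioural change in this hunk: without it, a fraction greater than 1 would produce a negative `used` count.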
@@ -743,7 +725,7 @@ async function getAntigravityUsage(accessToken, providerSpecificData) {
      subscriptionInfo,
    };
  } catch (error) {
-    return { message: `Antigravity error: ${error.message}` };
+    return { message: `Antigravity error: ${(error as Error).message}` };
  }
 }

@@ -90,10 +90,19 @@ export function isClaudeCodeClient(userAgent: string | null | undefined): boolea

 /**
 * Check if a provider supports prompt caching
+ * Supports caching if:
+ * 1. Provider is in the known caching providers list, OR
+ * 2. Provider uses Claude protocol (detected via targetFormat)
 */
-export function providerSupportsCaching(provider: string | null | undefined): boolean {
+export function providerSupportsCaching(
+  provider: string | null | undefined,
+  targetFormat?: string | null
+): boolean {
  if (!provider) return false;
-  return CACHING_PROVIDERS.has(provider.toLowerCase());
+  if (CACHING_PROVIDERS.has(provider.toLowerCase())) return true;
+  // All Claude-protocol providers support prompt caching
+  if (targetFormat === "claude") return true;
+  return false;
 }

 /**
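Reviewer note: the two-branch check above can be tried out with a small standalone sketch. The provider set below is an assumption taken from the list in the commit message (claude, anthropic, zai, qwen, deepseek); the real `CACHING_PROVIDERS` constant may differ, and `supportsCachingSketch` is a hypothetical name.

```typescript
// Assumed contents, based on the commit message; the real constant may differ.
const KNOWN_CACHING_PROVIDERS = new Set(["claude", "anthropic", "zai", "qwen", "deepseek"]);

function supportsCachingSketch(
  provider: string | null | undefined,
  targetFormat?: string | null
): boolean {
  if (!provider) return false;
  // Branch 1: provider is on the explicit allow-list.
  if (KNOWN_CACHING_PROVIDERS.has(provider.toLowerCase())) return true;
  // Branch 2: any provider reached through the Claude protocol.
  return targetFormat === "claude";
}

// "glm" is not on the list but is treated as caching-capable via the Claude protocol.
const glmViaClaude = supportsCachingSketch("glm", "claude");
// "nvidia" over the OpenAI format is not, matching the fix for issue #848.
const nvidiaViaOpenAI = supportsCachingSketch("nvidia", "openai");
console.log(glmViaClaude, nvidiaViaOpenAI);
```

Callers that omit `targetFormat` get the original allow-list behaviour, so the extra parameter is backward compatible.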
@@ -121,12 +130,14 @@ export function shouldPreserveCacheControl({
  isCombo,
  comboStrategy,
  targetProvider,
+  targetFormat,
  settings,
 }: {
  userAgent: string | null | undefined;
  isCombo: boolean;
  comboStrategy?: RoutingStrategyValue | null;
  targetProvider: string | null | undefined;
+  targetFormat?: string | null;
  settings?: CacheControlSettings;
 }): boolean {
  // User override takes precedence
@@ -144,7 +155,7 @@ export function shouldPreserveCacheControl({
  }

  // Target provider must support caching
-  if (!providerSupportsCaching(targetProvider)) {
+  if (!providerSupportsCaching(targetProvider, targetFormat)) {
    return false;
  }

@@ -122,6 +122,17 @@ export async function parseUpstreamError(response, provider = null) {
    retryAfterMs = parseAntigravityRetryTime(messageStr);
  }

+  // Also parse retry time for other providers (Qwen, etc.) with "quota will reset after XhYmZs" format
+  if (response.status === 429 && !retryAfterMs) {
+    retryAfterMs = parseAntigravityRetryTime(messageStr);
+  }
+
+  // Cap retry time at 24 hours so an oversized upstream hint cannot sideline a connection for days
+  const MAX_RETRY_MS = 24 * 60 * 60 * 1000;
+  if (retryAfterMs && retryAfterMs > MAX_RETRY_MS) {
+    retryAfterMs = MAX_RETRY_MS;
+  }
+
  return {
    statusCode: response.status,
    message: messageStr,
@@ -2,7 +2,7 @@
 * Structured console logger utility for omniroute.
 *
 * Provides consistent, machine-parseable log output across the codebase.
- * Supports two output formats controlled by LOG_FORMAT env var:
+ * Supports two output formats controlled by APP_LOG_FORMAT env var:
 *   - "text" (default): [LEVEL] [TAG] message {metadata}
 *   - "json": Single-line JSON objects for log aggregators
 *
@@ -18,17 +18,16 @@
 *   reqLog.info("AUTH", "Token refreshed", { provider: "claude" });
 *
 * Environment variables:
- *   LOG_LEVEL  — minimum level: debug | info | warn | error (default: info)
- *   LOG_FORMAT — output format: text | json (default: text)
+ *   APP_LOG_LEVEL  — minimum level: debug | info | warn | error (default: info)
+ *   APP_LOG_FORMAT — output format: text | json (default: text)
 */
+import { getAppLogFormat, getAppLogLevel } from "../../src/lib/logEnv";

 const LEVELS = { debug: 0, info: 1, warn: 2, error: 3 };

-const currentLevel =
-  LEVELS[((typeof process !== "undefined" && process.env?.LOG_LEVEL) || "info").toLowerCase()] ??
-  LEVELS.info;
+const currentLevel = LEVELS[getAppLogLevel("info").toLowerCase()] ?? LEVELS.info;

-const jsonFormat = typeof process !== "undefined" && process.env?.LOG_FORMAT === "json";
+const jsonFormat = getAppLogFormat("text") === "json";

 let requestCounter = 0;

@@ -1,92 +1,117 @@
-// Check if running in Node.js environment (has fs module)
-const isNode =
-  typeof process !== "undefined" && process.versions?.node && typeof window === "undefined";
+type JsonRecord = Record<string, unknown>;

-// Check if logging is enabled via environment variable (default: false)
-const LOGGING_ENABLED =
-  typeof process !== "undefined" && process.env?.ENABLE_REQUEST_LOGS === "true";
+type HeaderInput =
+  | Headers
+  | Record<string, unknown>
+  | { entries?: () => IterableIterator<[string, string]> }
+  | null
+  | undefined;

-let fs = null;
-let path = null;
-let LOGS_DIR = null;
+export type RequestPipelinePayloads = {
+  clientRawRequest?: JsonRecord;
+  sourceRequest?: JsonRecord;
+  openaiRequest?: JsonRecord;
+  providerRequest?: JsonRecord;
+  providerResponse?: JsonRecord;
+  clientResponse?: JsonRecord;
+  error?: JsonRecord;
+  streamChunks?: {
+    provider?: string[];
+    openai?: string[];
+    client?: string[];
+  };
+};

-// Lazy load Node.js modules (avoid top-level await)
-async function ensureNodeModules() {
-  if (!isNode || !LOGGING_ENABLED || fs) return;
-  try {
-    fs = await import("fs");
-    path = await import("path");
-    const { resolveDataDir } = await import("../../src/lib/dataPaths");
-    LOGS_DIR = path.join(resolveDataDir(), "logs");
-  } catch {
-    // Running in non-Node environment (Worker, Browser, etc.)
-  }
-}
+type RequestLogger = {
+  sessionPath: null;
+  logClientRawRequest: (endpoint: unknown, body: unknown, headers?: HeaderInput) => void;
+  logRawRequest: (body: unknown, headers?: HeaderInput) => void;
+  logOpenAIRequest: (body: unknown) => void;
+  logTargetRequest: (url: unknown, headers: HeaderInput, body: unknown) => void;
+  logProviderResponse: (
+    status: unknown,
+    statusText: unknown,
+    headers: HeaderInput,
+    body: unknown
+  ) => void;
+  appendProviderChunk: (chunk: string) => void;
+  appendOpenAIChunk: (chunk: string) => void;
+  logConvertedResponse: (body: unknown) => void;
+  appendConvertedChunk: (chunk: string) => void;
+  logError: (error: unknown, requestBody?: unknown) => void;
+  getPipelinePayloads: () => RequestPipelinePayloads | null;
+};

-// Format timestamp for folder name: 20251228_143045
-function formatTimestamp(date = new Date()) {
-  const pad = (n) => String(n).padStart(2, "0");
-  const y = date.getFullYear();
-  const m = pad(date.getMonth() + 1);
-  const d = pad(date.getDate());
-  const h = pad(date.getHours());
-  const min = pad(date.getMinutes());
-  const s = pad(date.getSeconds());
-  return `${y}${m}${d}_${h}${min}${s}`;
-}
-
-// Create log session folder: {sourceFormat}_{targetFormat}_{model}_{timestamp}
-async function createLogSession(sourceFormat, targetFormat, model) {
-  await ensureNodeModules();
-  if (!fs || !LOGS_DIR) return null;
-
-  try {
-    await fs.promises.mkdir(LOGS_DIR, { recursive: true });
-
-    const timestamp = formatTimestamp();
-    const safeModel = (model || "unknown").replace(/[/:]/g, "-");
-    const folderName = `${sourceFormat}_${targetFormat}_${safeModel}_${timestamp}`;
-    const sessionPath = path.join(LOGS_DIR, folderName);
-
-    await fs.promises.mkdir(sessionPath, { recursive: true });
-
-    return sessionPath;
-  } catch (err) {
-    console.log("[LOG] Failed to create log session:", err.message);
-    return null;
-  }
-}
-
-// Write JSON file (async, fire-and-forget)
-function writeJsonFile(sessionPath, filename, data) {
-  if (!fs || !sessionPath) return;
-
-  const filePath = path.join(sessionPath, filename);
-  fs.promises
-    .writeFile(filePath, JSON.stringify(data, null, 2))
-    .catch((err) => console.log(`[LOG] Failed to write ${filename}:`, err.message));
-}
-
-// Mask sensitive data in headers before writing to log files
-function maskSensitiveHeaders(headers) {
+function maskSensitiveHeaders(headers: HeaderInput): Record<string, unknown> {
  if (!headers) return {};
-  const masked = { ...headers };
+
+  const headerEntries =
+    typeof (headers as Headers).entries === "function"
+      ? Object.fromEntries((headers as Headers).entries())
+      : { ...(headers as Record<string, unknown>) };
+
+  const masked = { ...headerEntries };
  const sensitiveKeys = ["authorization", "x-api-key", "cookie", "token"];

  for (const key of Object.keys(masked)) {
    const lowerKey = key.toLowerCase();
-    if (sensitiveKeys.some((sk) => lowerKey.includes(sk))) {
-      const value = masked[key];
-      if (value && value.length > 20) {
-        masked[key] = value.slice(0, 10) + "..." + value.slice(-5);
-      }
+    if (!sensitiveKeys.some((candidate) => lowerKey.includes(candidate))) {
+      continue;
+    }
+
+    const value = masked[key];
+    if (typeof value === "string" && value.length > 20) {
+      masked[key] = `${value.slice(0, 10)}...${value.slice(-5)}`;
+    } else if (value) {
+      masked[key] = "[REDACTED]";
    }
  }
+
  return masked;
 }

-// No-op logger when logging is disabled
-function createNoOpLogger() {
+function createEmptyStreamChunks() {
+  return {
+    provider: [] as string[],
+    openai: [] as string[],
+    client: [] as string[],
+  };
+}
+
+function hasOwnValues(value: unknown): boolean {
+  return Boolean(value && typeof value === "object" && Object.keys(value as JsonRecord).length > 0);
+}
+
+function compactPipelinePayloads(
+  payloads: RequestPipelinePayloads
+): RequestPipelinePayloads | null {
+  const result: RequestPipelinePayloads = {};
+
+  for (const [key, value] of Object.entries(payloads)) {
+    if (value === null || value === undefined) {
+      continue;
+    }
+
+    if (key === "streamChunks" && value && typeof value === "object") {
+      const chunkRecord = value as Record<string, unknown>;
+      const compactedChunks = Object.fromEntries(
+        Object.entries(chunkRecord).filter(
+          ([, chunkValue]) => Array.isArray(chunkValue) && chunkValue.length > 0
+        )
+      );
+      if (Object.keys(compactedChunks).length > 0) {
+        result.streamChunks = compactedChunks as RequestPipelinePayloads["streamChunks"];
+      }
+      continue;
+    }
+
+    result[key as keyof RequestPipelinePayloads] = value as never;
+  }
+
+  return hasOwnValues(result) ? result : null;
+}
+
+function createNoOpLogger(): RequestLogger {
  return {
    sessionPath: null,
    logClientRawRequest() {},
@@ -99,151 +124,106 @@ function createNoOpLogger() {
    logConvertedResponse() {},
    appendConvertedChunk() {},
    logError() {},
+    getPipelinePayloads() {
+      return null;
+    },
  };
 }

-/**
- * Create a new log session and return logger functions
- * @param {string} sourceFormat - Source format from client (claude, openai, etc.)
- * @param {string} targetFormat - Target format to provider (antigravity, gemini-cli, etc.)
- * @param {string} model - Model name
- * @returns {Promise<object>} Promise that resolves to logger object with methods to log each stage
- */
-export async function createRequestLogger(sourceFormat, targetFormat, model) {
-  // Return no-op logger if logging is disabled
-  if (!LOGGING_ENABLED) {
-    return createNoOpLogger();
-  }
-
-  // Wait for session to be created before returning logger
-  const sessionPath = await createLogSession(sourceFormat, targetFormat, model);
+export async function createRequestLogger(
+  _sourceFormat?: string,
+  _targetFormat?: string,
+  _model?: string
+): Promise<RequestLogger> {
+  const streamChunks = createEmptyStreamChunks();
+  const payloads: RequestPipelinePayloads = {
+    streamChunks,
+  };

  return {
-    get sessionPath() {
-      return sessionPath;
-    },
+    sessionPath: null,

-    // 1. Log client raw request (before all conversion steps)
    logClientRawRequest(endpoint, body, headers = {}) {
-      writeJsonFile(sessionPath, "1_req_client.json", {
+      payloads.clientRawRequest = {
        timestamp: new Date().toISOString(),
        endpoint,
        headers: maskSensitiveHeaders(headers),
        body,
-      });
+      };
    },

-    // 2. Log raw request from client (after initial conversion like responsesApi)
    logRawRequest(body, headers = {}) {
-      writeJsonFile(sessionPath, "2_req_source.json", {
+      payloads.sourceRequest = {
        timestamp: new Date().toISOString(),
        headers: maskSensitiveHeaders(headers),
        body,
-      });
+      };
    },

-    // 3. Log OpenAI intermediate format (source → openai)
    logOpenAIRequest(body) {
-      writeJsonFile(sessionPath, "3_req_openai.json", {
+      payloads.openaiRequest = {
        timestamp: new Date().toISOString(),
        body,
-      });
+      };
    },

-    // 4. Log target format request (openai → target)
    logTargetRequest(url, headers, body) {
-      writeJsonFile(sessionPath, "4_req_target.json", {
+      payloads.providerRequest = {
        timestamp: new Date().toISOString(),
        url,
        headers: maskSensitiveHeaders(headers),
        body,
-      });
+      };
    },

-    // 5. Log provider response (for non-streaming or error)
    logProviderResponse(status, statusText, headers, body) {
-      const filename = "5_res_provider.json";
-      writeJsonFile(sessionPath, filename, {
+      payloads.providerResponse = {
        timestamp: new Date().toISOString(),
        status,
        statusText,
-        headers: headers
-          ? typeof headers.entries === "function"
-            ? Object.fromEntries(headers.entries())
-            : headers
-          : {},
+        headers: maskSensitiveHeaders(headers),
        body,
-      });
+      };
    },

-    // 5. Append streaming chunk to provider response (async)
    appendProviderChunk(chunk) {
-      if (!fs || !sessionPath) return;
-      const filePath = path.join(sessionPath, "5_res_provider.txt");
-      fs.promises.appendFile(filePath, chunk).catch(() => {});
+      if (typeof chunk === "string" && chunk.length > 0) {
+        streamChunks.provider.push(chunk);
+      }
    },

-    // 6. Append OpenAI intermediate chunks (async)
    appendOpenAIChunk(chunk) {
-      if (!fs || !sessionPath) return;
-      const filePath = path.join(sessionPath, "6_res_openai.txt");
-      fs.promises.appendFile(filePath, chunk).catch(() => {});
+      if (typeof chunk === "string" && chunk.length > 0) {
+        streamChunks.openai.push(chunk);
+      }
    },

-    // 7. Log converted response to client (for non-streaming)
    logConvertedResponse(body) {
-      writeJsonFile(sessionPath, "7_res_client.json", {
+      payloads.clientResponse = {
        timestamp: new Date().toISOString(),
        body,
-      });
+      };
    },

-    // 7b. Append streaming chunk to converted response (async)
    appendConvertedChunk(chunk) {
-      if (!fs || !sessionPath) return;
-      const filePath = path.join(sessionPath, "7_res_client.txt");
-      fs.promises.appendFile(filePath, chunk).catch(() => {});
+      if (typeof chunk === "string" && chunk.length > 0) {
+        streamChunks.client.push(chunk);
+      }
    },

-    // 8. Log error
    logError(error, requestBody = null) {
-      writeJsonFile(sessionPath, "6_error.json", {
+      payloads.error = {
        timestamp: new Date().toISOString(),
-        error: error?.message || String(error),
-        stack: error?.stack,
-        requestBody,
-      });
-    },
-  };
-}
-
-// Legacy logError (kept for backward compatibility, converted to async)
-export function logError(provider, { error, url, model, requestBody }) {
-  if (!fs || !LOGS_DIR) return;
-
-  const writeLog = async () => {
-    try {
-      await fs.promises.mkdir(LOGS_DIR, { recursive: true });
-
-      const date = new Date().toISOString().split("T")[0];
-      const logPath = path.join(LOGS_DIR, `${provider}-${date}.log`);
-
-      const logEntry = {
-        timestamp: new Date().toISOString(),
-        type: "error",
-        provider,
-        model,
-        url,
-        error: error?.message || String(error),
-        stack: error?.stack,
+        error: error instanceof Error ? error.message : String(error),
+        stack: error instanceof Error ? error.stack : undefined,
        requestBody,
      };
+    },

-      await fs.promises.appendFile(logPath, JSON.stringify(logEntry) + "\n");
-    } catch (err) {
-      console.log("[LOG] Failed to write error log:", err.message);
-    }
+    getPipelinePayloads() {
+      return compactPipelinePayloads(payloads);
+    },
  };
-
-  writeLog();
 }
+
+export function logError(_provider: string, _entry: unknown) {}
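Reviewer note: the rewritten `maskSensitiveHeaders` now redacts short secrets instead of leaving them untouched. The sketch below reproduces that rule for plain object headers only. `maskHeadersSketch` is a hypothetical name, and the `Headers`-instance branch is omitted.

```typescript
// Hypothetical standalone version of the masking rule for plain-object headers.
function maskHeadersSketch(headers: Record<string, unknown>): Record<string, unknown> {
  const sensitiveKeys = ["authorization", "x-api-key", "cookie", "token"];
  const masked: Record<string, unknown> = { ...headers };

  for (const key of Object.keys(masked)) {
    const lowerKey = key.toLowerCase();
    if (!sensitiveKeys.some((candidate) => lowerKey.includes(candidate))) continue;

    const value = masked[key];
    if (typeof value === "string" && value.length > 20) {
      // Long secrets keep a short prefix and suffix for debugging.
      masked[key] = `${value.slice(0, 10)}...${value.slice(-5)}`;
    } else if (value) {
      // Short or non-string secrets are replaced entirely.
      masked[key] = "[REDACTED]";
    }
  }
  return masked;
}

const maskedSample = maskHeadersSketch({
  Authorization: "Bearer sk-abcdefghijklmnopqrstuvwxyz",
  "x-api-key": "short",
  "Content-Type": "application/json",
});
console.log(maskedSample);
```

Before this change a sensitive value of 20 characters or fewer was logged verbatim; the `[REDACTED]` branch closes that gap.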
@@ -2,7 +2,7 @@
 * Token Usage Tracking - Extract, normalize, estimate and log token usage
 */

-import { saveRequestUsage, appendRequestLog } from "@/lib/usageDb";
+import { appendRequestLog } from "@/lib/usageDb";
 import {
  getLoggedInputTokens,
  getLoggedOutputTokens,
@@ -444,7 +444,8 @@ export function logUsage(provider, usage, model = null, connectionId = null, api

  console.log(msg);

-  // Save to usage DB with cache-read tracked separately from the main input counter.
+  // Streaming requests persist usage once in chatCore's completion callback.
+  // Keep this helper side-effect free apart from console visibility.
  const tokens = {
    input: inTokens,
    output: outTokens,
@@ -452,13 +453,5 @@ export function logUsage(provider, usage, model = null, connectionId = null, api
    cacheCreation: cacheCreation || 0,
    reasoning: reasoning || 0,
  };
-  saveRequestUsage({
-    model,
-    provider,
-    connectionId,
-    apiKeyId: apiKeyInfo?.id || undefined,
-    apiKeyName: apiKeyInfo?.name || undefined,
-    tokens,
-  }).catch(() => {});
  appendRequestLog({ model, provider, connectionId, tokens, status: "200 OK" }).catch(() => {});
 }
@@ -1,12 +1,12 @@
 {
  "name": "omniroute",
-  "version": "3.3.7",
+  "version": "3.4.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "omniroute",
-      "version": "3.3.7",
+      "version": "3.4.0",
      "hasInstallScript": true,
      "license": "MIT",
      "workspaces": [
@@ -44,6 +44,7 @@
        "undici": "^7.19.2",
        "uuid": "^13.0.0",
        "wreq-js": "^2.0.1",
+        "yazl": "^3.3.1",
        "zod": "^4.3.6",
        "zustand": "^5.0.10"
      },
@@ -7654,6 +7655,15 @@
        "ieee754": "^1.1.13"
      }
    },
+    "node_modules/buffer-crc32": {
+      "version": "1.0.0",
+      "resolved": "https://registry.npmjs.org/buffer-crc32/-/buffer-crc32-1.0.0.tgz",
+      "integrity": "sha512-Db1SbgBS/fg/392AblrMJk97KggmvYhr4pB5ZIMTWtaivCPMWLkmb7m21cJvpvgK+J3nsU2CmmixNBZx4vFj/w==",
+      "license": "MIT",
+      "engines": {
+        "node": ">=8.0.0"
+      }
+    },
    "node_modules/bundle-name": {
      "version": "4.1.0",
      "resolved": "https://registry.npmjs.org/bundle-name/-/bundle-name-4.1.0.tgz",
@@ -20226,6 +20236,15 @@
        "node": ">=8"
      }
    },
+    "node_modules/yazl": {
+      "version": "3.3.1",
+      "resolved": "https://registry.npmjs.org/yazl/-/yazl-3.3.1.tgz",
+      "integrity": "sha512-BbETDVWG+VcMUle37k5Fqp//7SDOK2/1+T7X8TD96M3D9G8jK5VLUdQVdVjGi8im7FGkazX7kk5hkU8X4L5Bng==",
+      "license": "MIT",
+      "dependencies": {
+        "buffer-crc32": "^1.0.0"
+      }
+    },
    "node_modules/yocto-queue": {
      "version": "0.1.0",
      "resolved": "https://registry.npmjs.org/yocto-queue/-/yocto-queue-0.1.0.tgz",
@@ -20324,7 +20343,7 @@
    },
    "open-sse": {
      "name": "@omniroute/open-sse",
-      "version": "3.3.7"
+      "version": "3.3.11"
    }
  }
 }
@@ -1,6 +1,6 @@
 {
  "name": "omniroute",
-  "version": "3.3.8",
+  "version": "3.4.0",
  "description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
  "type": "module",
  "bin": {
@@ -115,6 +115,7 @@
    "undici": "^7.19.2",
    "uuid": "^13.0.0",
    "wreq-js": "^2.0.1",
+    "yazl": "^3.3.1",
    "zod": "^4.3.6",
    "zustand": "^5.0.10"
  },
@@ -1,16 +0,0 @@
-## [3.2.8] - 2026-03-29
-
-### ✨ Enhancements & Refactoring
-
- **Docker Auto-Update UI** — Integrated a detached background update process for Docker Compose deployments. The Dashboard UI now seamlessly tracks update lifecycle events combining JSON REST responses with SSE streaming progress overlays for robust cross-environment reliability.
- **Cache Analytics** — Repaired zero-metrics visualization mapping by migrating Semantic Cache telemetry logs directly into the centralized tracking SQLite module.
-
-### 🐛 Bug Fixes
-
- **Authentication Logic** — Fixed a bug where saving dashboard settings or adding models failed with a 401 Unauthorized error when `requireLogin` was disabled. API endpoints now correctly evaluate the global authentication toggle. Resolved global redirection by reactivating `src/middleware.ts`.
- **CLI Tool Detection (Windows)** — Prevented fatal initialization exceptions during CLI environment detection by catching `cross-spawn` ENOENT errors correctly. Adds explicit detection paths for `\AppData\Local\droid\droid.exe`.
- **Codex Native Passthrough** — Normalized model translation parameters preventing context poisoning in proxy pass-through mode, enforcing generic `store: false` constraints explicitly for all Codex-originated requests.
- **SSE Token Reporting** — Normalized provider tool-call chunk `finish_reason` detection, fixing 0% Usage analytics for stream-only responses missing strict `<DONE>` indicators.
- **DeepSeek <think> Tags** — Implemented an explicit `<think>` extraction mapping inside `responsesHandler.ts`, ensuring DeepSeek reasoning streams map equivalently to native Anthropic `<thinking>` structures.
-
---
@@ -0,0 +1,452 @@
+"use client";
+
+import { useCallback, useEffect, useMemo, useState } from "react";
+import Card from "@/shared/components/Card";
+import Badge from "@/shared/components/Badge";
+import { Skeleton, Spinner } from "@/shared/components/Loading";
+import TimeRangeSelector from "@/shared/components/analytics/TimeRangeSelector";
+import type {
+  ComboHealthMetrics,
+  ComboHealthResponse,
+  UtilizationTimeRange,
+} from "@/shared/types/utilization";
+import { cn } from "@/shared/utils/cn";
+
+function formatPercent(value: number, digits = 0) {
+  return `${value.toFixed(digits)}%`;
+}
+
+function formatShare(value: number) {
+  return formatPercent(value * 100, 1);
+}
+
+function formatLatency(value: number) {
+  return `${Math.round(value).toLocaleString()}ms`;
+}
+
+function getTrendMeta(trend: ComboHealthMetrics["quotaHealth"]["providers"][number]["trend"]) {
+  if (trend === "improving") {
+    return {
+      icon: "trending_up",
+      label: "Improving",
+      variant: "success" as const,
+    };
+  }
+
+  if (trend === "declining") {
+    return {
+      icon: "trending_down",
+      label: "Declining",
+      variant: "warning" as const,
+    };
+  }
+
+  return {
+    icon: "trending_flat",
+    label: "Stable",
+    variant: "default" as const,
+  };
+}
+
+function MetricBlock({
+  icon,
+  label,
+  value,
+  subValue,
+}: {
+  icon: string;
+  label: string;
+  value: string;
+  subValue?: string;
+}) {
+  return (
+    <div className="rounded-lg border border-black/5 bg-black/[0.02] p-4 dark:border-white/5 dark:bg-white/[0.02]">
+      <div className="flex items-center gap-2 text-xs font-semibold uppercase tracking-wider text-text-muted">
+        <span className="material-symbols-outlined text-[16px]">{icon}</span>
+        {label}
+      </div>
+      <div className="mt-2 text-2xl font-semibold text-text-main">{value}</div>
+      {subValue ? <div className="mt-1 text-xs text-text-muted">{subValue}</div> : null}
+    </div>
+  );
+}
+
+function DistributionBar({ label, value, meta }: { label: string; value: number; meta: string }) {
+  const width = `${Math.max(value * 100, value > 0 ? 6 : 0)}%`;
+
+  return (
+    <div className="flex flex-col gap-2 rounded-lg border border-black/5 bg-black/[0.02] p-3 dark:border-white/5 dark:bg-white/[0.02]">
+      <div className="flex items-center justify-between gap-3 text-sm">
+        <span className="truncate font-medium text-text-main">{label}</span>
+        <span className="shrink-0 text-xs text-text-muted">{meta}</span>
+      </div>
+      <div className="h-2 overflow-hidden rounded-full bg-black/5 dark:bg-white/5">
+        <div className="h-full rounded-full bg-primary transition-all" style={{ width }} />
+      </div>
+    </div>
+  );
+}
+
+function ComboHealthCard({ combo }: { combo: ComboHealthMetrics }) {
+  const sortedDistribution = useMemo(
+    () =>
+      [...combo.usageSkew.modelDistribution].sort(
+        (left, right) => right.requestShare - left.requestShare
+      ),
+    [combo.usageSkew.modelDistribution]
+  );
+
+  return (
+    <Card className="overflow-hidden p-0">
+      <div className="border-b border-black/5 px-6 py-5 dark:border-white/5">
+        <div className="flex flex-col gap-4 lg:flex-row lg:items-start lg:justify-between">
+          <div className="min-w-0">
+            <div className="flex flex-wrap items-center gap-2">
+              <h3 className="text-lg font-semibold text-text-main">{combo.comboName}</h3>
+              <Badge variant="primary" size="sm">
+                {combo.strategy}
+              </Badge>
+            </div>
+            <p className="mt-2 text-sm text-text-muted">
+              {combo.models.length} models across {combo.quotaHealth.providers.length} providers
+            </p>
+          </div>
+          <div className="grid grid-cols-1 gap-3 sm:grid-cols-3 lg:min-w-[420px]">
+            <MetricBlock
+              icon="battery_status_good"
+              label="Worst quota left"
+              value={formatPercent(combo.quotaHealth.worstRemainingPct)}
+            />
+            <MetricBlock
+              icon="balance"
+              label="Usage skew"
+              value={combo.usageSkew.giniCoefficient.toFixed(2)}
+              subValue="Gini coefficient"
+            />
+            <MetricBlock
+              icon="bolt"
+              label="Success rate"
+              value={formatPercent(combo.performance.successRate * 100, 1)}
+              subValue={`${combo.performance.totalRequests.toLocaleString()} requests`}
+            />
+          </div>
+        </div>
+      </div>
+
+      <div className="grid gap-6 px-6 py-5 xl:grid-cols-[1.1fr_1fr_0.95fr]">
+        <section className="flex flex-col gap-4">
+          <div>
+            <div className="text-sm font-semibold text-text-main">Quota health</div>
+            <div className="mt-1 text-xs text-text-muted">
+              Lowest remaining quota across providers with short trend signals.
+            </div>
+          </div>
+
+          <div className="flex flex-col gap-3">
+            {combo.quotaHealth.providers.map((provider) => {
+              const trendMeta = getTrendMeta(provider.trend);
+              const width = `${Math.max(provider.remainingPct, provider.remainingPct > 0 ? 6 : 0)}%`;
+
+              return (
+                <div
+                  key={provider.provider}
+                  className="rounded-lg border border-black/5 bg-black/[0.02] p-4 dark:border-white/5 dark:bg-white/[0.02]"
+                >
+                  <div className="flex flex-wrap items-center justify-between gap-3">
+                    <div>
+                      <div className="text-sm font-medium text-text-main">{provider.provider}</div>
+                      <div className="mt-1 text-xs text-text-muted">
+                        Remaining quota {formatPercent(provider.remainingPct, 1)}
+                      </div>
+                    </div>
+                    <div className="flex items-center gap-2">
+                      <Badge variant={trendMeta.variant} size="sm" icon={trendMeta.icon}>
+                        {trendMeta.label}
+                      </Badge>
+                      {provider.isExhausted ? (
+                        <Badge variant="error" size="sm">
+                          Exhausted
+                        </Badge>
+                      ) : null}
+                    </div>
+                  </div>
+                  <div className="mt-3 h-2 overflow-hidden rounded-full bg-black/5 dark:bg-white/5">
+                    <div
+                      className="h-full rounded-full bg-primary transition-all"
+                      style={{ width }}
+                    />
+                  </div>
+                </div>
+              );
+            })}
+          </div>
+        </section>
+
+        <section className="flex flex-col gap-4">
+          <div>
+            <div className="text-sm font-semibold text-text-main">Usage skew</div>
+            <div className="mt-1 text-xs text-text-muted">
+              Model request share and token share within this combo.
+            </div>
+          </div>
+
+          <div className="flex flex-col gap-3">
+            {sortedDistribution.map((entry) => (
+              <div
+                key={entry.model}
+                className="rounded-lg border border-black/5 bg-black/[0.02] p-4 dark:border-white/5 dark:bg-white/[0.02]"
+              >
+                <div className="flex items-start justify-between gap-3">
+                  <div className="min-w-0">
+                    <div className="truncate text-sm font-medium text-text-main">{entry.model}</div>
+                    <div className="mt-1 text-xs text-text-muted">
+                      Request share {formatShare(entry.requestShare)} · Token share{" "}
+                      {formatShare(entry.tokenShare)}
+                    </div>
+                  </div>
+                  <Badge size="sm">{formatShare(entry.requestShare)}</Badge>
+                </div>
+
+                <div className="mt-3 grid gap-2">
+                  <DistributionBar
+                    label="Requests"
+                    value={entry.requestShare}
+                    meta={formatShare(entry.requestShare)}
+                  />
+                  <DistributionBar
+                    label="Tokens"
+                    value={entry.tokenShare}
+                    meta={formatShare(entry.tokenShare)}
+                  />
+                </div>
+              </div>
+            ))}
+          </div>
+        </section>
+
+        <section className="flex flex-col gap-4">
+          <div>
+            <div className="text-sm font-semibold text-text-main">Performance</div>
+            <div className="mt-1 text-xs text-text-muted">
+              Reliability and throughput for routed combo traffic.
+            </div>
+          </div>
+
+          <div className="grid gap-3 sm:grid-cols-3 xl:grid-cols-1">
+            <MetricBlock
+              icon="timer"
+              label="Avg latency"
+              value={formatLatency(combo.performance.avgLatencyMs)}
+            />
+            <MetricBlock
+              icon="task_alt"
+              label="Success rate"
+              value={formatPercent(combo.performance.successRate * 100, 1)}
+            />
+            <MetricBlock
+              icon="stacked_line_chart"
+              label="Total requests"
+              value={combo.performance.totalRequests.toLocaleString()}
+            />
+          </div>
+        </section>
+      </div>
+    </Card>
+  );
+}
+
+function ComboHealthSkeleton() {
+  return (
+    <div className="flex flex-col gap-4">
+      {[0, 1].map((index) => (
+        <Card key={index} className="p-6">
+          <div className="flex flex-col gap-6">
+            <div className="flex flex-col gap-4 xl:flex-row xl:items-center xl:justify-between">
+              <div className="space-y-2">
+                <Skeleton className="h-6 w-40" />
+                <Skeleton className="h-4 w-52" />
+              </div>
+              <div className="grid grid-cols-1 gap-3 sm:grid-cols-3 xl:min-w-[420px]">
+                {[0, 1, 2].map((item) => (
+                  <Skeleton key={item} className="h-24 rounded-lg" />
+                ))}
+              </div>
+            </div>
+            <div className="grid gap-4 xl:grid-cols-3">
+              {[0, 1, 2].map((item) => (
+                <div key={item} className="space-y-3">
+                  <Skeleton className="h-5 w-28" />
+                  <Skeleton className="h-24 rounded-lg" />
+                  <Skeleton className="h-24 rounded-lg" />
+                </div>
+              ))}
+            </div>
+          </div>
+        </Card>
+      ))}
+    </div>
+  );
+}
+
+export default function ComboHealthTab() {
+  const [range, setRange] = useState<UtilizationTimeRange>("24h");
+  const [data, setData] = useState<ComboHealthResponse | null>(null);
+  const [loading, setLoading] = useState(true);
+  const [error, setError] = useState<string | null>(null);
+  const [retrying, setRetrying] = useState(false);
+
+  const fetchData = useCallback(
+    async (controller: AbortController, isRetry = false) => {
+      if (isRetry) {
+        setRetrying(true);
+      } else {
+        setLoading(true);
+      }
+      setError(null);
+
+      try {
+        const response = await fetch(`/api/usage/combo-health?range=${range}`, {
+          signal: controller.signal,
+        });
+
+        if (!response.ok) {
+          throw new Error("Failed to fetch combo health data");
+        }
+
+        const result = (await response.json()) as ComboHealthResponse;
+        setData(result);
+        setError(null);
+      } catch (fetchError) {
+        if (fetchError instanceof Error && fetchError.name === "AbortError") {
+          return;
+        }
+        setError(fetchError instanceof Error ? fetchError.message : "Unknown error");
+        if (!isRetry) setData(null);
+      } finally {
+        if (!controller.signal.aborted) {
+          setLoading(false);
+          if (isRetry) setRetrying(false);
+        }
+      }
+    },
+    [range]
+  );
+
+  useEffect(() => {
+    const controller = new AbortController();
+    fetchData(controller, false);
+    return () => controller.abort();
+  }, [fetchData]);
+
+  const combos = data?.combos ?? [];
+
+  const handleRetry = useCallback(() => {
+    const controller = new AbortController();
+    fetchData(controller, true);
+  }, [fetchData]);
+
+  return (
+    <div className="flex flex-col gap-6">
+      <div className="flex flex-col gap-4 rounded-xl border border-black/5 bg-surface p-5 shadow-sm dark:border-white/5 md:flex-row md:items-center md:justify-between">
+        <div>
+          <h2 className="text-lg font-semibold text-text-main">Combo health</h2>
+          <p className="mt-1 text-sm text-text-muted">
+            Monitor quota pressure, skewed model usage, and delivery performance by combo.
+          </p>
+        </div>
+        <TimeRangeSelector value={range} onChange={setRange} />
+      </div>
+
+      {loading ? <ComboHealthSkeleton /> : null}
+
+      {!loading && error ? (
+        <Card className="p-8">
+          <div className="flex flex-col items-center justify-center gap-4 text-center">
+            <span className="material-symbols-outlined text-[40px] text-error">sync_problem</span>
+            <div className="flex flex-col gap-1">
+              <div className="font-medium text-text-main">Unable to load combo health</div>
+              <div className="text-sm text-text-muted">{error}</div>
+            </div>
+            <button
+              type="button"
+              onClick={handleRetry}
+              disabled={retrying}
+              className="inline-flex items-center gap-2 rounded-lg bg-primary px-4 py-2 text-sm font-medium text-white hover:bg-primary-hover disabled:opacity-50 disabled:cursor-not-allowed"
+            >
+              {retrying ? (
+                <>
+                  <span className="material-symbols-outlined animate-spin text-[18px]">
+                    progress_activity
+                  </span>
+                  Retrying…
+                </>
+              ) : (
+                <>
+                  <span className="material-symbols-outlined text-[18px]">refresh</span>
+                  Retry
+                </>
+              )}
+            </button>
+          </div>
+        </Card>
+      ) : null}
+
+      {!loading && !error && combos.length === 0 ? (
+        <Card className="p-10">
+          <div className="flex flex-col items-center justify-center gap-4 text-center">
+            <span className="material-symbols-outlined text-[40px] text-text-muted/70">
+              monitor_heart
+            </span>
+            <div className="text-base font-medium text-text-main">
+              No combo health data available
+            </div>
+            <div className="max-w-md text-sm text-text-muted">
+              Combo quota snapshots and routed requests will appear here after traffic starts
+              flowing.
+            </div>
+            <div className="rounded-lg border border-black/5 bg-black/[0.02] p-4 dark:border-white/5 dark:bg-white/[0.02]">
+              <p className="text-xs font-medium text-text-main">Getting started</p>
+              <ul className="mt-2 text-left text-xs text-text-muted">
+                <li className="flex items-start gap-2">
+                  <span className="material-symbols-outlined text-[14px] text-primary">
+                    check_circle
+                  </span>
+                  <span>
+                    Create combos in <strong>Combos</strong> with multiple providers
+                  </span>
+                </li>
+                <li className="mt-1 flex items-start gap-2">
+                  <span className="material-symbols-outlined text-[14px] text-primary">
+                    check_circle
+                  </span>
+                  <span>Send requests to combo endpoints to generate traffic data</span>
+                </li>
+                <li className="mt-1 flex items-start gap-2">
+                  <span className="material-symbols-outlined text-[14px] text-primary">
+                    check_circle
+                  </span>
+                  <span>Health metrics will appear automatically as requests are routed</span>
+                </li>
+              </ul>
+            </div>
+          </div>
+        </Card>
+      ) : null}
+
+      {!loading && !error && combos.length > 0 ? (
+        <div className="flex flex-col gap-4">
+          <div className="flex items-center gap-2 text-sm text-text-muted">
+            <Spinner
+              size="sm"
+              className={cn("text-primary", "[&_.material-symbols-outlined]:text-[16px]")}
+            />
+            Tracking {combos.length} {combos.length === 1 ? "combo" : "combos"} for {range}
+          </div>
+          {combos.map((combo) => (
+            <ComboHealthCard key={combo.comboId} combo={combo} />
+          ))}
+        </div>
+      ) : null}
+    </div>
+  );
+}
@@ -0,0 +1,389 @@
+"use client";
+
+import { useCallback, useEffect, useMemo, useState } from "react";
+import {
+  CartesianGrid,
+  Legend,
+  Line,
+  LineChart,
+  ResponsiveContainer,
+  Tooltip,
+  XAxis,
+  YAxis,
+} from "recharts";
+import Card from "@/shared/components/Card";
+import ProviderIcon from "@/shared/components/ProviderIcon";
+import TimeRangeSelector from "@/shared/components/analytics/TimeRangeSelector";
+import type {
+  ProviderUtilizationPoint,
+  ProviderUtilizationResponse,
+  UtilizationTimeRange,
+} from "@/shared/types/utilization";
+
+const RANGE_LABELS: Record<UtilizationTimeRange, string> = {
+  "1h": "Last hour",
+  "24h": "Last 24 hours",
+  "7d": "Last 7 days",
+  "30d": "Last 30 days",
+};
+
+const PROVIDER_COLORS = [
+  "var(--color-primary)",
+  "var(--color-accent)",
+  "var(--color-success)",
+  "var(--color-warning)",
+  "var(--color-error)",
+  "var(--color-text-muted)",
+];
+
+function formatTimestamp(value: string, range: UtilizationTimeRange) {
+  const date = new Date(value);
+
+  if (Number.isNaN(date.getTime())) {
+    return value;
+  }
+
+  if (range === "1h" || range === "24h") {
+    return new Intl.DateTimeFormat(undefined, {
+      hour: "2-digit",
+      minute: "2-digit",
+    }).format(date);
+  }
+
+  return new Intl.DateTimeFormat(undefined, {
+    month: "short",
+    day: "numeric",
+  }).format(date);
+}
+
+function formatTooltipTimestamp(value: string, range: UtilizationTimeRange) {
+  const date = new Date(value);
+
+  if (Number.isNaN(date.getTime())) {
+    return value;
+  }
+
+  return new Intl.DateTimeFormat(undefined, {
+    month: "short",
+    day: "numeric",
+    hour: range === "1h" || range === "24h" ? "2-digit" : undefined,
+    minute: range === "1h" || range === "24h" ? "2-digit" : undefined,
+  }).format(date);
+}
+
+function formatPercent(value: number) {
+  return `${Math.round(value)}%`;
+}
+
+function getLatestPoints(points: ProviderUtilizationPoint[]) {
+  const latestByProvider = new Map<string, ProviderUtilizationPoint>();
+
+  for (const point of points) {
+    const current = latestByProvider.get(point.provider);
+    if (!current || new Date(point.timestamp).getTime() > new Date(current.timestamp).getTime()) {
+      latestByProvider.set(point.provider, point);
+    }
+  }
+
+  return Array.from(latestByProvider.values()).sort((a, b) => b.remainingPct - a.remainingPct);
+}
+
+export default function ProviderUtilizationTab() {
+  const [range, setRange] = useState<UtilizationTimeRange>("24h");
+  const [data, setData] = useState<ProviderUtilizationResponse | null>(null);
+  const [loading, setLoading] = useState(true);
+  const [error, setError] = useState<string | null>(null);
+
+  const fetchUtilization = useCallback(
+    async (selectedRange: UtilizationTimeRange, signal?: AbortSignal) => {
+      setLoading(true);
+
+      try {
+        const response = await fetch(`/api/usage/utilization?range=${selectedRange}`, {
+          signal,
+          cache: "no-store",
+        });
+
+        if (!response.ok) {
+          throw new Error("Failed to fetch utilization data");
+        }
+
+        const json = (await response.json()) as ProviderUtilizationResponse;
+        setData(json);
+        setError(null);
+      } catch (fetchError) {
+        if (fetchError instanceof DOMException && fetchError.name === "AbortError") {
+          return;
+        }
+
+        setError(
+          fetchError instanceof Error ? fetchError.message : "Failed to fetch utilization data"
+        );
+        setData(null);
+      } finally {
+        if (!signal?.aborted) {
+          setLoading(false);
+        }
+      }
+    },
+    []
+  );
+
+  useEffect(() => {
+    const controller = new AbortController();
+
+    fetchUtilization(range, controller.signal);
+
+    return () => controller.abort();
+  }, [fetchUtilization, range]);
+
+  const providerColors = useMemo(() => {
+    const colors = new Map<string, string>();
+
+    for (const [index, provider] of (data?.providers ?? []).entries()) {
+      colors.set(provider, PROVIDER_COLORS[index % PROVIDER_COLORS.length]);
+    }
+
+    return colors;
+  }, [data?.providers]);
+
+  const chartData = useMemo(() => {
+    if (!data?.data.length) {
+      return [];
+    }
+
+    const byTimestamp = new Map<string, Record<string, number | string>>();
+
+    for (const point of data.data) {
+      const entry = byTimestamp.get(point.timestamp) ?? {
+        timestamp: point.timestamp,
+        label: formatTimestamp(point.timestamp, data.timeRange),
+      };
+
+      entry[point.provider] = Number(point.remainingPct.toFixed(2));
+      byTimestamp.set(point.timestamp, entry);
+    }
+
+    return Array.from(byTimestamp.entries())
+      .sort(([left], [right]) => new Date(left).getTime() - new Date(right).getTime())
+      .map(([, value]) => value);
+  }, [data]);
+
+  const latestPoints = useMemo(() => getLatestPoints(data?.data ?? []), [data?.data]);
+
+  const hasData = Boolean(data?.data.length);
+  const [retrying, setRetrying] = useState(false);
+
+  const handleRetry = useCallback(() => {
+    setRetrying(true);
+    setError(null);
+    fetchUtilization(range).finally(() => setRetrying(false));
+  }, [range, fetchUtilization]);
+
+  return (
+    <div className="flex flex-col gap-6">
+      <Card
+        title="Provider utilization"
+        subtitle={RANGE_LABELS[range]}
+        icon="monitoring"
+        action={<TimeRangeSelector value={range} onChange={setRange} />}
+        className="overflow-hidden"
+      >
+        {loading && !hasData ? (
+          <div className="flex min-h-80 items-center justify-center text-sm text-text-muted">
+            <span className="material-symbols-outlined mr-2 animate-spin text-[18px]">
+              progress_activity
+            </span>
+            Loading utilization data…
+          </div>
+        ) : error ? (
+          <div className="flex min-h-80 flex-col items-center justify-center gap-4 text-center">
+            <span className="material-symbols-outlined text-[32px] text-error">error</span>
+            <div className="flex flex-col gap-1">
+              <p className="text-sm font-medium text-text-main">Failed to load utilization data</p>
+              <p className="text-sm text-text-muted">{error}</p>
+            </div>
+            <button
+              type="button"
+              onClick={handleRetry}
+              disabled={retrying}
+              className="inline-flex items-center gap-2 rounded-lg bg-primary px-4 py-2 text-sm font-medium text-white hover:bg-primary-hover disabled:opacity-50 disabled:cursor-not-allowed"
+            >
+              {retrying ? (
+                <>
+                  <span className="material-symbols-outlined animate-spin text-[18px]">
+                    progress_activity
+                  </span>
+                  Retrying…
+                </>
+              ) : (
+                <>
+                  <span className="material-symbols-outlined text-[18px]">refresh</span>
+                  Retry
+                </>
+              )}
+            </button>
+          </div>
+        ) : !hasData ? (
+          <div className="flex min-h-80 flex-col items-center justify-center gap-4 text-center">
+            <span className="material-symbols-outlined text-[40px] text-text-muted/70">
+              timeline
+            </span>
+            <div className="flex flex-col gap-2">
+              <p className="text-sm font-medium text-text-main">No utilization data available</p>
+              <p className="max-w-md text-sm text-text-muted">
+                Provider quota snapshots will appear here after utilization data is collected.
+              </p>
+            </div>
+            <div className="rounded-lg border border-black/5 bg-black/[0.02] p-4 dark:border-white/5 dark:bg-white/[0.02]">
+              <p className="text-xs font-medium text-text-main">Getting started</p>
+              <ul className="mt-2 text-left text-xs text-text-muted">
+                <li className="flex items-start gap-2">
+                  <span className="material-symbols-outlined text-[14px] text-primary">
+                    check_circle
+                  </span>
+                  <span>
+                    Connect providers via OAuth or API keys in <strong>Providers</strong>
+                  </span>
+                </li>
+                <li className="mt-1 flex items-start gap-2">
+                  <span className="material-symbols-outlined text-[14px] text-primary">
+                    check_circle
+                  </span>
+                  <span>
+                    Enable quota tracking by using the provider in a combo or direct request
+                  </span>
+                </li>
+                <li className="mt-1 flex items-start gap-2">
+                  <span className="material-symbols-outlined text-[14px] text-primary">
+                    check_circle
+                  </span>
+                  <span>Data will appear automatically as quota snapshots are collected</span>
+                </li>
+              </ul>
+            </div>
+          </div>
+        ) : (
+          <div className="flex flex-col gap-5">
+            <div className="h-80 w-full rounded-xl border border-black/5 bg-black/[0.02] px-3 py-4 dark:border-white/5 dark:bg-white/[0.02]">
+              <ResponsiveContainer width="100%" height="100%">
+                <LineChart data={chartData} margin={{ top: 8, right: 16, bottom: 0, left: 0 }}>
+                  <CartesianGrid
+                    stroke="var(--color-border)"
+                    strokeDasharray="3 3"
+                    vertical={false}
+                  />
+                  <XAxis
+                    dataKey="timestamp"
+                    tickFormatter={(value) => formatTimestamp(String(value), range)}
+                    tick={{ fill: "var(--color-text-muted)", fontSize: 12 }}
+                    axisLine={{ stroke: "var(--color-border)" }}
+                    tickLine={{ stroke: "var(--color-border)" }}
+                    minTickGap={24}
+                  />
+                  <YAxis
+                    domain={[0, 100]}
+                    tickFormatter={formatPercent}
+                    tick={{ fill: "var(--color-text-muted)", fontSize: 12 }}
+                    axisLine={{ stroke: "var(--color-border)" }}
+                    tickLine={{ stroke: "var(--color-border)" }}
+                    width={44}
+                  />
+                  <Tooltip
+                    labelFormatter={(value) => formatTooltipTimestamp(String(value), range)}
+                    formatter={(value: number, name: string) => [formatPercent(value), name]}
+                    contentStyle={{
+                      backgroundColor: "var(--color-surface)",
+                      borderColor: "var(--color-border)",
+                      borderRadius: 12,
+                      color: "var(--color-text-main)",
+                      boxShadow: "var(--shadow-soft)",
+                    }}
+                    itemStyle={{ color: "var(--color-text-main)" }}
+                    labelStyle={{ color: "var(--color-text-main)", fontWeight: 600 }}
+                  />
+                  <Legend />
+                  {data?.providers.map((provider) => (
+                    <Line
+                      key={provider}
+                      type="monotone"
+                      dataKey={provider}
+                      name={provider}
+                      stroke={providerColors.get(provider) ?? "var(--color-primary)"}
+                      strokeWidth={2.5}
+                      dot={false}
+                      activeDot={{ r: 4, strokeWidth: 0 }}
+                      connectNulls
+                    />
+                  ))}
+                </LineChart>
+              </ResponsiveContainer>
+            </div>
+
+            <div className="grid grid-cols-1 gap-4 md:grid-cols-2 xl:grid-cols-3">
+              {latestPoints.map((point) => {
+                const isLow = point.remainingPct <= 20;
+
+                return (
+                  <Card.Section key={point.provider} className="flex h-full flex-col gap-4">
+                    <div className="flex items-start justify-between gap-3">
+                      <div className="flex items-center gap-3">
+                        <div className="flex h-11 w-11 items-center justify-center rounded-xl border border-black/5 bg-surface text-text-main dark:border-white/5">
+                          <ProviderIcon providerId={point.provider} size={22} />
+                        </div>
+                        <div>
+                          <p className="text-sm font-semibold text-text-main">{point.provider}</p>
+                          <p className="text-xs text-text-muted">Latest quota snapshot</p>
+                        </div>
+                      </div>
+                      <span
+                        className={`inline-flex items-center rounded-full px-2.5 py-1 text-xs font-medium ${
+                          point.isExhausted
+                            ? "bg-error/10 text-error"
+                            : isLow
+                              ? "bg-warning/10 text-warning"
+                              : "bg-success/10 text-success"
+                        }`}
+                      >
+                        {point.isExhausted ? "Exhausted" : isLow ? "Low" : "Healthy"}
+                      </span>
+                    </div>
+
+                    <div className="flex items-end justify-between gap-3">
+                      <div>
+                        <p className="text-3xl font-bold text-text-main">
+                          {point.remainingPct.toFixed(point.remainingPct < 10 ? 1 : 0)}%
+                        </p>
+                        <p className="mt-1 text-xs text-text-muted">Remaining capacity</p>
+                      </div>
+                      <div className="text-right text-xs text-text-muted">
+                        <p>{formatTooltipTimestamp(point.timestamp, range)}</p>
+                        <p className="mt-1 uppercase tracking-[0.14em]">{point.windowKey}</p>
+                      </div>
+                    </div>
+
+                    <div className="flex flex-col gap-2">
+                      <div className="h-2 overflow-hidden rounded-full bg-black/5 dark:bg-white/5">
+                        <div
+                          className={`h-full rounded-full transition-all ${
+                            point.isExhausted ? "bg-error" : isLow ? "bg-warning" : "bg-primary"
+                          }`}
+                          style={{ width: `${Math.min(Math.max(point.remainingPct, 0), 100)}%` }}
+                        />
+                      </div>
+                      <div className="flex items-center justify-between text-xs text-text-muted">
+                        <span>0%</span>
+                        <span>Remaining quota</span>
+                        <span>100%</span>
+                      </div>
+                    </div>
+                  </Card.Section>
+                );
+              })}
+            </div>
+          </div>
+        )}
+      </Card>
+    </div>
+  );
+}
@@ -5,6 +5,8 @@ import { UsageAnalytics, CardSkeleton, SegmentedControl } from "@/shared/compone
 import EvalsTab from "../usage/components/EvalsTab";
 import SearchAnalyticsTab from "./SearchAnalyticsTab";
 import DiversityScoreCard from "./components/DiversityScoreCard";
+import ProviderUtilizationTab from "./ProviderUtilizationTab";
+import ComboHealthTab from "./ComboHealthTab";
 import { useTranslations } from "next-intl";

 export default function AnalyticsPage() {
@@ -15,6 +17,8 @@ export default function AnalyticsPage() {
    overview: t("overviewDescription"),
    evals: t("evalsDescription"),
    search: "Search request analytics — provider breakdown, cache hit rate, and cost tracking.",
+    utilization: t("utilizationDescription"),
+    comboHealth: t("comboHealthDescription"),
  };

  return (
@@ -33,6 +37,8 @@ export default function AnalyticsPage() {
          { value: "overview", label: t("overview") },
          { value: "evals", label: t("evals") },
          { value: "search", label: "Search" },
+          { value: "utilization", label: t("utilization") },
+          { value: "comboHealth", label: t("comboHealth") },
        ]}
        value={activeTab}
        onChange={setActiveTab}
@@ -50,6 +56,8 @@ export default function AnalyticsPage() {
      )}
      {activeTab === "evals" && <EvalsTab />}
      {activeTab === "search" && <SearchAnalyticsTab />}
+      {activeTab === "utilization" && <ProviderUtilizationTab />}
+      {activeTab === "comboHealth" && <ComboHealthTab />}
    </div>
  );
 }
@@ -84,8 +84,10 @@ export default function CodexServiceTierTab() {
        <button
          onClick={() => save(!enabled)}
          disabled={loading || saving}
-          className={`relative inline-flex h-6 w-11 items-center rounded-full transition-colors ${
-            enabled ? "bg-sky-500" : "bg-white/10"
+          className={`relative inline-flex h-6 w-11 items-center rounded-full border transition-colors ${
+            enabled
+              ? "bg-sky-500 border-sky-500"
+              : "bg-black/10 border-black/10 dark:bg-white/10 dark:border-white/10"
          }`}
        >
          <span
@@ -18,11 +18,6 @@ export default function SystemStorageTab() {
  const [importStatus, setImportStatus] = useState({ type: "", message: "" });
  const [confirmImport, setConfirmImport] = useState(false);
  const [pendingImportFile, setPendingImportFile] = useState<File | null>(null);
-  const [maxCallLogs, setMaxCallLogs] = useState(10000);
-  const [maxCallLogsDraft, setMaxCallLogsDraft] = useState("10000");
-  const [settingsLoading, setSettingsLoading] = useState(true);
-  const [maxCallLogsSaving, setMaxCallLogsSaving] = useState(false);
-  const [maxCallLogsStatus, setMaxCallLogsStatus] = useState({ type: "", message: "" });
  const fileInputRef = useRef<HTMLInputElement>(null);
  const locale = useLocale();
  const t = useTranslations("settings");
@@ -31,7 +26,10 @@ export default function SystemStorageTab() {
    driver: "sqlite",
    dbPath: "~/.omniroute/storage.sqlite",
    sizeBytes: 0,
-    retentionDays: 90,
+    retentionDays: {
+      app: 7,
+      call: 7,
+    },
    lastBackupAt: null,
  });

@@ -59,27 +57,6 @@ export default function SystemStorageTab() {
    }
  };

-  const loadSettings = async () => {
-    setSettingsLoading(true);
-    try {
-      const res = await fetch("/api/settings");
-      if (!res.ok) return;
-      const data = await res.json();
-      const value =
-        typeof data.maxCallLogs === "number" &&
-        Number.isInteger(data.maxCallLogs) &&
-        data.maxCallLogs > 0
-          ? data.maxCallLogs
-          : 10000;
-      setMaxCallLogs(value);
-      setMaxCallLogsDraft(String(value));
-    } catch (err) {
-      console.error("Failed to fetch settings:", err);
-    } finally {
-      setSettingsLoading(false);
-    }
-  };
-
  const handleManualBackup = async () => {
    setManualBackupLoading(true);
    setManualBackupStatus({ type: "", message: "" });
@@ -145,47 +122,8 @@ export default function SystemStorageTab() {

  useEffect(() => {
    loadStorageHealth();
-    loadSettings();
  }, []);

-  const handleSaveMaxCallLogs = async () => {
-    const parsed = Number.parseInt(maxCallLogsDraft, 10);
-    if (!Number.isInteger(parsed) || parsed <= 0) {
-      setMaxCallLogsStatus({
-        type: "error",
-        message: "Enter a positive integer for the call log limit.",
-      });
-      return;
-    }
-
-    setMaxCallLogsSaving(true);
-    setMaxCallLogsStatus({ type: "", message: "" });
-    try {
-      const res = await fetch("/api/settings", {
-        method: "PATCH",
-        headers: { "Content-Type": "application/json" },
-        body: JSON.stringify({ maxCallLogs: parsed }),
-      });
-      const data = await res.json();
-      if (!res.ok) {
-        throw new Error(data.error || "Failed to save call log limit");
-      }
-      setMaxCallLogs(parsed);
-      setMaxCallLogsDraft(String(parsed));
-      setMaxCallLogsStatus({
-        type: "success",
-        message: "Call log retention limit saved.",
-      });
-    } catch (err) {
-      setMaxCallLogsStatus({
-        type: "error",
-        message: (err as Error).message || "Failed to save call log limit",
-      });
-    } finally {
-      setMaxCallLogsSaving(false);
-    }
-  };
-
  const handleExport = async () => {
    setExportLoading(true);
    try {
@@ -345,53 +283,23 @@ export default function SystemStorageTab() {
      </div>

      <div className="p-3 rounded-lg bg-bg border border-border mb-4">
-        <div className="flex items-start justify-between gap-3">
+        <div className="flex items-start justify-between gap-3 flex-wrap">
          <div>
-            <p className="text-sm font-medium text-text-main">Call log retention limit</p>
+            <p className="text-sm font-medium text-text-main">Log retention policy</p>
            <p className="text-xs text-text-muted">
-              Keep only the most recent call log entries in SQLite. Older entries are pruned
-              automatically after each new request log is saved.
+              Request logs follow <code>CALL_LOG_RETENTION_DAYS</code>. Application and audit logs
+              follow <code>APP_LOG_RETENTION_DAYS</code>.
            </p>
          </div>
-          <Badge variant="default" size="sm">
-            {maxCallLogs.toLocaleString()}
-          </Badge>
-        </div>
-
-        <div className="flex flex-wrap items-center gap-2 mt-3">
-          <input
-            type="number"
-            min="1"
-            step="1"
-            value={maxCallLogsDraft}
-            onChange={(e) => setMaxCallLogsDraft(e.target.value)}
-            disabled={settingsLoading || maxCallLogsSaving}
-            className="w-40 rounded-lg border border-border bg-bg-secondary px-3 py-2 text-sm text-text-main focus:outline-none focus:ring-1 focus:ring-primary/40"
-            aria-label="Call log retention limit"
-          />
-          <Button
-            variant="outline"
-            size="sm"
-            onClick={handleSaveMaxCallLogs}
-            loading={maxCallLogsSaving}
-            disabled={settingsLoading}
-          >
-            Save limit
-          </Button>
-        </div>
-
-        {maxCallLogsStatus.message && (
-          <div
-            className={`mt-3 rounded-lg border px-3 py-2 text-sm ${
-              maxCallLogsStatus.type === "success"
-                ? "border-green-500/20 bg-green-500/10 text-green-500"
-                : "border-red-500/20 bg-red-500/10 text-red-500"
-            }`}
-            role="alert"
-          >
-            {maxCallLogsStatus.message}
+          <div className="flex items-center gap-2">
+            <Badge variant="default" size="sm">
+              Call {storageHealth.retentionDays.call}d
+            </Badge>
+            <Badge variant="default" size="sm">
+              App {storageHealth.retentionDays.app}d
+            </Badge>
          </div>
-        )}
+        </div>
      </div>

      {/* Export / Import */}
@@ -88,7 +88,7 @@ export default function ProviderLimits() {
    if (typeof window === "undefined") return false;
    return localStorage.getItem(LS_AUTO_REFRESH) === "true";
  });
-  const [lastUpdated, setLastUpdated] = useState(null);
+  const [lastRefreshedAt, setLastRefreshedAt] = useState<Record<string, string>>({});
  const [refreshingAll, setRefreshingAll] = useState(false);
  const [countdown, setCountdown] = useState(120);
  const [initialLoading, setInitialLoading] = useState(true);
@@ -181,6 +181,10 @@ export default function ProviderLimits() {
            raw: data,
          },
        }));
+        setLastRefreshedAt((prev) => ({
+          ...prev,
+          [connectionId]: new Date().toISOString(),
+        }));
      } catch (error) {
        setErrors((prev) => ({
          ...prev,
@@ -195,8 +199,7 @@ export default function ProviderLimits() {

  const refreshProvider = useCallback(
    async (connectionId, provider) => {
-      await fetchQuota(connectionId, provider);
-      setLastUpdated(new Date());
+      await fetchQuota(connectionId, provider, { force: true });
    },
    [fetchQuota]
  );
@@ -207,30 +210,34 @@ export default function ProviderLimits() {
    setCountdown(120);
    try {
      const conns = await fetchConnections();
+
+      // Show table layout immediately once connections are loaded (Issue #784)
+      setInitialLoading(false);
+
      const usageConnections = conns.filter(
        (conn) =>
          USAGE_SUPPORTED_PROVIDERS.includes(conn.provider) &&
          (conn.authType === "oauth" || conn.authType === "apikey")
      );
-      // Fix Issue #784: Fetch quotas in chunks of 5 to avoid spamming the backend/provider APIs and hanging the UI.
+      // Fix: Fetch quotas in chunks of 5 to avoid spamming the backend/provider APIs and hanging the UI.
      const chunkSize = 5;
      for (let i = 0; i < usageConnections.length; i += chunkSize) {
        const chunk = usageConnections.slice(i, i + chunkSize);
-        await Promise.all(chunk.map((conn) => fetchQuota(conn.id, conn.provider)));
+        await Promise.all(chunk.map((conn) => fetchQuota(conn.id, conn.provider, { force: true })));
      }
-      setLastUpdated(new Date());
    } catch (error) {
      console.error("Error refreshing all:", error);
    } finally {
      setRefreshingAll(false);
+      setInitialLoading(false); // Fallback to ensure skeleton is cleared
    }
  }, [refreshingAll, fetchConnections, fetchQuota]);

  useEffect(() => {
    const init = async () => {
      setInitialLoading(true);
-      await refreshAll();
-      setInitialLoading(false);
+      // No longer await refreshAll here so we don't block the UI
+      refreshAll();
    };
    init();
  }, []); // eslint-disable-line react-hooks/exhaustive-deps
@@ -520,7 +527,7 @@ export default function ProviderLimits() {
        {/* Table header */}
        <div
          className="items-center px-4 py-2.5 border-b border-border text-[11px] font-semibold uppercase tracking-wider text-text-muted"
-          style={{ display: "grid", gridTemplateColumns: "280px 1fr 100px 48px" }}
+          style={{ display: "grid", gridTemplateColumns: "280px 1fr 128px 48px" }}
        >
          <div>{t("account")}</div>
          <div>{t("modelQuotas")}</div>
@@ -539,6 +546,7 @@ export default function ProviderLimits() {
            };
            const tierMeta = tierByConnection[conn.id] || normalizePlanTier(null);
            const resolvedPlan = resolvedPlanByConnection[conn.id];
+            const refreshedAt = lastRefreshedAt[conn.id];

            return (
              <div
@@ -546,7 +554,7 @@ export default function ProviderLimits() {
                className="items-center px-4 py-3.5 transition-[background] duration-150 hover:bg-black/[0.03] dark:hover:bg-white/[0.02]"
                style={{
                  display: "grid",
-                  gridTemplateColumns: "280px 1fr 100px 48px",
+                  gridTemplateColumns: "280px 1fr 128px 48px",
                  borderBottom: !isLast ? "1px solid var(--color-border)" : "none",
                }}
              >
@@ -670,11 +678,16 @@ export default function ProviderLimits() {
                  )}
                </div>

-                {/* Last Used */}
+                {/* Last Refreshed */}
                <div className="text-center text-[11px] text-text-muted">
-                  {lastUpdated ? (
+                  {refreshedAt ? (
                    <span>
-                      {lastUpdated.toLocaleTimeString([], { hour: "2-digit", minute: "2-digit" })}
+                      {new Date(refreshedAt).toLocaleTimeString([], {
+                        hour: "2-digit",
+                        minute: "2-digit",
+                        second: "2-digit",
+                        hour12: false,
+                      })}
                    </span>
                  ) : (
                    "-"
@@ -12,7 +12,7 @@

 import { NextRequest, NextResponse } from "next/server";
 import { readFileSync, existsSync } from "fs";
-import { join } from "path";
+import { getAppLogFilePath } from "@/lib/logEnv";

 const LEVEL_ORDER: Record<string, number> = {
  trace: 5,
@@ -34,7 +34,7 @@ const NUMERIC_LEVEL_MAP: Record<number, string> = {
 };

 function getLogFilePath(): string {
-  return process.env.LOG_FILE_PATH || join(process.cwd(), "logs", "application", "app.log");
+  return getAppLogFilePath();
 }

 function parseLevel(raw: string | number): string {
@@ -1,6 +1,6 @@
 /**
- * GET  /api/logs/detail  — List detailed request logs + current enabled flag
- * POST /api/logs/detail — Enable/disable detailed logging
+ * GET  /api/logs/detail  — List legacy detailed request logs + current enabled flag
+ * POST /api/logs/detail — Enable/disable pipeline capture for unified call log artifacts
 */
 import { NextRequest, NextResponse } from "next/server";
 import { requireManagementAuth } from "@/lib/api/requireManagementAuth";
@@ -35,13 +35,13 @@ export async function POST(req: NextRequest) {
  const body = await req.json();
  const enabled = body.enabled === true || body.enabled === "1";

-  await updateSettings({ detailed_logs_enabled: enabled });
+  await updateSettings({ call_log_pipeline_enabled: enabled });

  return NextResponse.json({
    success: true,
    enabled,
    message: enabled
-      ? "Detailed logging enabled. Pipeline bodies will be captured for new requests."
-      : "Detailed logging disabled.",
+      ? "Pipeline capture enabled. New request artifacts will include per-stage payloads."
+      : "Pipeline capture disabled.",
  });
 }
@@ -17,18 +17,12 @@ export async function GET(request: Request) {
    let rows: unknown[] = [];
    let tableName = "";

-    if (logType === "call-logs") {
+    if (logType === "call-logs" || logType === "request-logs") {
      tableName = "call_logs";
      const stmt = db.prepare(
        "SELECT * FROM call_logs WHERE timestamp >= @since ORDER BY timestamp DESC"
      );
      rows = stmt.all({ since });
-    } else if (logType === "request-logs") {
-      tableName = "request_logs";
-      const stmt = db.prepare(
-        "SELECT * FROM request_logs WHERE timestamp >= @since ORDER BY timestamp DESC"
-      );
-      rows = stmt.all({ since });
    } else if (logType === "proxy-logs") {
      tableName = "proxy_logs";
      const stmt = db.prepare(
@@ -1,6 +1,6 @@
 import { NextResponse } from "next/server";
 import { getProviderConnectionById } from "@/models";
-import { replaceCustomModels } from "@/lib/db/models";
+import { getCustomModels, replaceCustomModels } from "@/lib/db/models";
 import {
  syncManagedAvailableModelAliases,
  usesManagedAvailableModels,
@@ -13,11 +13,108 @@ import {
 } from "@/shared/services/modelSyncScheduler";
 import { getModelsByProviderId } from "@/shared/constants/models";

+type JsonRecord = Record<string, unknown>;
+
+function asRecord(value: unknown): JsonRecord {
+  return value && typeof value === "object" && !Array.isArray(value) ? (value as JsonRecord) : {};
+}
+
+function toNonEmptyString(value: unknown): string | null {
+  return typeof value === "string" && value.trim().length > 0 ? value.trim() : null;
+}
+
+function normalizeModelForComparison(model: unknown) {
+  const record = asRecord(model);
+  const id = toNonEmptyString(record.id) || "";
+  const name = toNonEmptyString(record.name) || id;
+  const source = toNonEmptyString(record.source) || "auto-sync";
+  const apiFormat = toNonEmptyString(record.apiFormat) || "chat-completions";
+  const supportedEndpoints = Array.isArray(record.supportedEndpoints)
+    ? Array.from(
+        new Set(
+          record.supportedEndpoints
+            .map((endpoint) => toNonEmptyString(endpoint))
+            .filter((endpoint): endpoint is string => Boolean(endpoint))
+        )
+      ).sort()
+    : ["chat"];
+
+  return {
+    id,
+    name,
+    source,
+    apiFormat,
+    supportedEndpoints,
+  };
+}
+
+function summarizeModelChanges(previousModels: unknown, nextModels: unknown) {
+  const previousList = Array.isArray(previousModels) ? previousModels : [];
+  const nextList = Array.isArray(nextModels) ? nextModels : [];
+
+  const previousMap = new Map(
+    previousList
+      .map((model) => normalizeModelForComparison(model))
+      .filter((model) => model.id)
+      .map((model) => [model.id, JSON.stringify(model)])
+  );
+  const nextMap = new Map(
+    nextList
+      .map((model) => normalizeModelForComparison(model))
+      .filter((model) => model.id)
+      .map((model) => [model.id, JSON.stringify(model)])
+  );
+
+  let added = 0;
+  let removed = 0;
+  let updated = 0;
+
+  for (const [id, nextValue] of nextMap.entries()) {
+    const previousValue = previousMap.get(id);
+    if (!previousValue) {
+      added += 1;
+      continue;
+    }
+    if (previousValue !== nextValue) {
+      updated += 1;
+    }
+  }
+
+  for (const id of previousMap.keys()) {
+    if (!nextMap.has(id)) {
+      removed += 1;
+    }
+  }
+
+  return {
+    added,
+    removed,
+    updated,
+    total: added + removed + updated,
+  };
+}
+
+function getModelSyncChannelLabel(connection: unknown) {
+  const record = asRecord(connection);
+  const providerSpecificData = asRecord(record.providerSpecificData);
+
+  return (
+    toNonEmptyString(record.displayName) ||
+    toNonEmptyString(record.email) ||
+    toNonEmptyString(providerSpecificData.tag) ||
+    toNonEmptyString(record.name) ||
+    toNonEmptyString(record.provider) ||
+    (toNonEmptyString(record.id) ? `connection:${String(record.id).slice(0, 8)}` : null) ||
+    "unknown"
+  );
+}
+
 /**
 * POST /api/providers/[id]/sync-models
 *
 * Fetches the model list from a provider's /models endpoint and replaces the
- * full custom models list for that provider. Logs the operation to call_logs.
+ * full custom models list for that provider. A successful sync writes a call
+ * log only when the fetched models actually change the stored model list.
 *
 * Used by:
 * - modelSyncScheduler (auto-sync on interval)
@@ -40,8 +137,7 @@ export async function POST(request: Request, { params }: { params: Promise<{ id:
      return NextResponse.json({ error: "Connection not found" }, { status: 404 });
    }

-    // Use a human-readable provider name for logs
-    const providerLabel = connection.name || connection.provider || "unknown";
+    const providerLabel = getModelSyncChannelLabel(connection);

    // Fetch models from the existing /api/providers/[id]/models endpoint
    const origin = new URL(request.url).origin;
@@ -92,7 +188,9 @@ export async function POST(request: Request, { params }: { params: Promise<{ id:
      }))
      .filter((m: any) => m.id && !registryIds.has(m.id));

+    const previousModels = await getCustomModels(connection.provider);
    const replaced = await replaceCustomModels(connection.provider, models);
+    const modelChanges = summarizeModelChanges(previousModels, replaced);

    let syncedAliases = 0;
    if (usesManagedAvailableModels(connection.provider)) {
@@ -103,29 +201,34 @@ export async function POST(request: Request, { params }: { params: Promise<{ id:
      syncedAliases = aliasSync.assignedAliases.length;
    }

-    // Log the successful sync
-    await saveCallLog({
-      method: "GET",
-      path: `/api/providers/${id}/models`,
-      status: 200,
-      model: "model-sync",
-      provider: providerLabel,
-      sourceFormat: "-",
-      connectionId: id,
-      duration: Date.now() - start,
-      requestType: "model-sync",
-      responseBody: {
-        syncedModels: models.length,
-        syncedAliases,
-        provider: connection.provider,
-      },
-    });
+    if (modelChanges.total > 0) {
+      await saveCallLog({
+        method: "GET",
+        path: `/api/providers/${id}/models`,
+        status: 200,
+        model: "model-sync",
+        provider: providerLabel,
+        sourceFormat: "-",
+        connectionId: id,
+        duration: Date.now() - start,
+        requestType: "model-sync",
+        responseBody: {
+          syncedModels: models.length,
+          syncedAliases,
+          provider: connection.provider,
+          channel: providerLabel,
+          modelChanges,
+        },
+      });
+    }

    return NextResponse.json({
      ok: true,
      provider: connection.provider,
      syncedModels: replaced.length,
      syncedAliases,
+      modelChanges,
+      logged: modelChanges.total > 0,
      models: replaced,
    });
  } catch (error: any) {
@@ -19,14 +19,12 @@ export async function GET() {
      setCliCompatProviders(settings.cliCompatProviders as string[]);
    }

-    const enableRequestLogs = process.env.ENABLE_REQUEST_LOGS === "true";
    const runtimePorts = getRuntimePorts();
    const cloudUrl = process.env.CLOUD_URL || process.env.NEXT_PUBLIC_CLOUD_URL || null;
    const machineId = await getConsistentMachineId();

    return NextResponse.json({
      ...safeSettings,
-      enableRequestLogs,
      hasPassword: !!password || !!process.env.INITIAL_PASSWORD,
      runtimePorts,
      apiPort: runtimePorts.apiPort,
@@ -114,11 +112,6 @@ export async function PATCH(request) {
      setCliCompatProviders(body.cliCompatProviders || []);
    }

-    if ("maxCallLogs" in body) {
-      const { invalidateCallLogsMaxCache } = await import("@/lib/usage/callLogs");
-      invalidateCallLogsMaxCache();
-    }
-
    // Sync cache control settings to runtime cache
    if ("alwaysPreserveClientCache" in body) {
      const { invalidateCacheControlSettingsCache } = await import("@/lib/cacheControlSettings");
@@ -2,6 +2,7 @@ import { NextResponse } from "next/server";
 import path from "path";
 import fs from "fs";
 import { resolveDataDir } from "@/lib/dataPaths";
+import { getAppLogRetentionDays, getCallLogRetentionDays } from "@/lib/logEnv";

 /**
 * GET /api/storage/health — Return database storage information.
@@ -56,7 +57,10 @@ export async function GET() {
      sizeBytes,
      lastBackupAt,
      backupCount,
-      retentionDays: 90,
+      retentionDays: {
+        app: getAppLogRetentionDays(),
+        call: getCallLogRetentionDays(),
+      },
      dataDir: dataDir.startsWith(homeDir) ? "~" + dataDir.slice(homeDir.length) : dataDir,
    });
  } catch (error) {
@@ -0,0 +1,366 @@
+import { NextResponse } from "next/server";
+import { z } from "zod";
+import { getDbInstance } from "@/lib/db/core";
+import { getComboById, getCombos } from "@/lib/db/combos";
+import { getQuotaSnapshots } from "@/lib/db/quotaSnapshots";
+import type {
+  ComboHealthMetrics,
+  ComboHealthResponse,
+  QuotaSnapshotRow,
+  UtilizationTimeRange,
+} from "@/shared/types/utilization";
+
+type ComboModelNode = string | { model?: string | null };
+
+type ComboRecord = {
+  id?: string;
+  name?: string;
+  strategy?: string;
+  models?: ComboModelNode[];
+};
+
+type ModelUsageRow = {
+  model: string | null;
+  requests: number | null;
+  totalTokens: number | null;
+};
+
+type PerformanceRow = {
+  totalRequests: number | null;
+  successCount: number | null;
+  avgLatencyMs: number | null;
+};
+
+type QuotaSnapshotView = {
+  connectionId?: string;
+  remainingPercentage?: number | null;
+  isExhausted?: number;
+  createdAt?: string;
+};
+
+type ProviderHealth = {
+  provider: string;
+  remainingPct: number;
+  isExhausted: boolean;
+  trend: "improving" | "stable" | "declining";
+};
+
+const querySchema = z.object({
+  range: z.enum(["1h", "24h", "7d", "30d"]),
+  comboId: z
+    .string()
+    .regex(/^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i)
+    .optional(),
+});
+
+function getRangeStartIso(range: UtilizationTimeRange): string {
+  const end = new Date();
+  const start = new Date(end);
+
+  switch (range) {
+    case "1h":
+      start.setUTCHours(start.getUTCHours() - 1);
+      break;
+    case "24h":
+      start.setUTCDate(start.getUTCDate() - 1);
+      break;
+    case "7d":
+      start.setUTCDate(start.getUTCDate() - 7);
+      break;
+    case "30d":
+      start.setUTCDate(start.getUTCDate() - 30);
+      break;
+  }
+
+  return start.toISOString();
+}
+
+function roundNumber(value: number, digits = 2): number {
+  if (!Number.isFinite(value)) return 0;
+  return Number(value.toFixed(digits));
+}
+
+function toSafeNumber(value: number | null | undefined): number {
+  return typeof value === "number" && Number.isFinite(value) ? value : 0;
+}
+
+function normalizeComboModels(models: ComboModelNode[] | undefined): string[] {
+  if (!Array.isArray(models)) return [];
+
+  return models
+    .map((entry) => {
+      if (typeof entry === "string") return entry;
+      if (entry && typeof entry === "object" && typeof entry.model === "string") {
+        return entry.model;
+      }
+      return "";
+    })
+    .filter((entry): entry is string => entry.trim().length > 0);
+}
+
+function extractProvider(model: string): string {
+  const [provider] = model.split("/");
+  return provider?.trim() || "unknown";
+}
+
+function calculateGini(values: number[]): number {
+  if (values.length === 0) return 0;
+  const sorted = [...values].sort((a, b) => a - b);
+  const count = sorted.length;
+  const sum = sorted.reduce((accumulator, value) => accumulator + value, 0);
+  if (sum === 0) return 0;
+
+  let weightedSum = 0;
+  for (let index = 0; index < count; index += 1) {
+    weightedSum += (index + 1) * sorted[index];
+  }
+
+  return (2 * weightedSum) / (count * sum) - (count + 1) / count;
+}
+
+function buildProviderHealth(provider: string, snapshots: QuotaSnapshotRow[]): ProviderHealth {
+  if (snapshots.length === 0) {
+    return {
+      provider,
+      remainingPct: 0,
+      isExhausted: false,
+      trend: "stable",
+    };
+  }
+
+  const histories = new Map<string, QuotaSnapshotRow[]>();
+  for (const snapshot of snapshots) {
+    const snapshotView = snapshot as unknown as QuotaSnapshotView;
+    const connectionId = snapshotView.connectionId || "unknown";
+    const existing = histories.get(connectionId) ?? [];
+    existing.push(snapshot);
+    histories.set(connectionId, existing);
+  }
+
+  const firstValues: number[] = [];
+  const lastValues: number[] = [];
+  let isExhausted = false;
+
+  for (const history of histories.values()) {
+    const ordered = [...history].sort((left, right) => {
+      const leftView = left as unknown as QuotaSnapshotView;
+      const rightView = right as unknown as QuotaSnapshotView;
+      return (leftView.createdAt || "").localeCompare(rightView.createdAt || "");
+    });
+    const firstSnapshot = ordered.find((entry) => {
+      const entryView = entry as unknown as QuotaSnapshotView;
+      return entryView.remainingPercentage !== null && entryView.remainingPercentage !== undefined;
+    });
+    const lastSnapshot = [...ordered].reverse().find((entry) => {
+      const entryView = entry as unknown as QuotaSnapshotView;
+      return entryView.remainingPercentage !== null && entryView.remainingPercentage !== undefined;
+    });
+    const firstSnapshotView = firstSnapshot as unknown as QuotaSnapshotView | undefined;
+    const lastSnapshotView = lastSnapshot as unknown as QuotaSnapshotView | undefined;
+
+    if (
+      firstSnapshotView?.remainingPercentage !== null &&
+      firstSnapshotView?.remainingPercentage !== undefined
+    ) {
+      firstValues.push(firstSnapshotView.remainingPercentage);
+    }
+
+    if (
+      lastSnapshotView?.remainingPercentage !== null &&
+      lastSnapshotView?.remainingPercentage !== undefined
+    ) {
+      lastValues.push(lastSnapshotView.remainingPercentage);
+    }
+
+    const latestEntry = ordered[ordered.length - 1] as unknown as QuotaSnapshotView | undefined;
+    isExhausted = isExhausted || latestEntry?.isExhausted === 1;
+  }
+
+  const firstAverage =
+    firstValues.length > 0
+      ? firstValues.reduce((accumulator, value) => accumulator + value, 0) / firstValues.length
+      : 0;
+  const lastAverage =
+    lastValues.length > 0
+      ? lastValues.reduce((accumulator, value) => accumulator + value, 0) / lastValues.length
+      : 0;
+  const delta = lastAverage - firstAverage;
+
+  let trend: ProviderHealth["trend"] = "stable";
+  if (delta >= 5) trend = "improving";
+  if (delta <= -5) trend = "declining";
+
+  return {
+    provider,
+    remainingPct: roundNumber(lastAverage),
+    isExhausted,
+    trend,
+  };
+}
+
+function buildUsageSkew(
+  comboName: string,
+  comboModels: string[],
+  since: string
+): ComboHealthMetrics["usageSkew"] {
+  const db = getDbInstance();
+  const rows = db
+    .prepare(
+      `SELECT
+         model,
+         COUNT(*) as requests,
+         SUM(COALESCE(tokens_in, 0) + COALESCE(tokens_out, 0)) as totalTokens
+       FROM call_logs
+       WHERE combo_name = ?
+         AND timestamp >= ?
+       GROUP BY model`
+    )
+    .all(comboName, since) as ModelUsageRow[];
+
+  const usageByModel = new Map<string, { requests: number; tokens: number }>();
+  for (const model of comboModels) {
+    usageByModel.set(model, { requests: 0, tokens: 0 });
+  }
+
+  for (const row of rows) {
+    const model =
+      typeof row.model === "string" && row.model.trim().length > 0 ? row.model : "unknown";
+    usageByModel.set(model, {
+      requests: toSafeNumber(row.requests),
+      tokens: toSafeNumber(row.totalTokens),
+    });
+  }
+
+  const modelDistributionEntries = Array.from(usageByModel.entries());
+  const totalRequests = modelDistributionEntries.reduce(
+    (accumulator, [, usage]) => accumulator + usage.requests,
+    0
+  );
+  const totalTokens = modelDistributionEntries.reduce(
+    (accumulator, [, usage]) => accumulator + usage.tokens,
+    0
+  );
+
+  return {
+    modelDistribution: modelDistributionEntries.map(([model, usage]) => ({
+      model,
+      requestShare: totalRequests > 0 ? roundNumber(usage.requests / totalRequests, 4) : 0,
+      tokenShare: totalTokens > 0 ? roundNumber(usage.tokens / totalTokens, 4) : 0,
+    })),
+    giniCoefficient: roundNumber(
+      calculateGini(modelDistributionEntries.map(([, usage]) => usage.requests)),
+      4
+    ),
+  };
+}
+
+function buildPerformance(comboName: string, since: string): ComboHealthMetrics["performance"] {
+  const db = getDbInstance();
+  const row = db
+    .prepare(
+      `SELECT
+         COUNT(*) as totalRequests,
+         SUM(CASE WHEN status >= 200 AND status < 400 THEN 1 ELSE 0 END) as successCount,
+         AVG(duration) as avgLatencyMs
+       FROM call_logs
+       WHERE combo_name = ?
+         AND timestamp >= ?`
+    )
+    .get(comboName, since) as PerformanceRow | undefined;
+
+  const totalRequests = toSafeNumber(row?.totalRequests);
+  const successCount = toSafeNumber(row?.successCount);
+  const avgLatencyMs = toSafeNumber(row?.avgLatencyMs);
+
+  return {
+    avgLatencyMs: roundNumber(avgLatencyMs),
+    successRate: totalRequests > 0 ? roundNumber(successCount / totalRequests, 4) : 0,
+    totalRequests,
+  };
+}
+
+function buildQuotaHealth(providers: string[], since: string): ComboHealthMetrics["quotaHealth"] {
+  const providerHealth = providers.map((provider) =>
+    buildProviderHealth(provider, getQuotaSnapshots({ provider, since }))
+  );
+
+  const worstRemainingPct =
+    providerHealth.length > 0
+      ? providerHealth.reduce(
+          (lowest, entry) => Math.min(lowest, entry.remainingPct),
+          providerHealth[0].remainingPct
+        )
+      : 0;
+
+  return {
+    providers: providerHealth,
+    worstRemainingPct: roundNumber(worstRemainingPct),
+  };
+}
+
+function buildComboHealth(combo: ComboRecord, since: string): ComboHealthMetrics | null {
+  const comboId = typeof combo.id === "string" ? combo.id : "";
+  const comboName = typeof combo.name === "string" ? combo.name : "";
+  if (!comboId || !comboName) return null;
+
+  const models = normalizeComboModels(combo.models);
+  const providers = Array.from(new Set(models.map(extractProvider)));
+
+  return {
+    comboId,
+    comboName,
+    strategy:
+      typeof combo.strategy === "string" && combo.strategy.trim().length > 0
+        ? combo.strategy
+        : "priority",
+    models,
+    quotaHealth: buildQuotaHealth(providers, since),
+    usageSkew: buildUsageSkew(comboName, models, since),
+    performance: buildPerformance(comboName, since),
+  };
+}
+
+export async function GET(request: Request) {
+  try {
+    const { searchParams } = new URL(request.url);
+    const parsedQuery = querySchema.safeParse({
+      range: searchParams.get("range"),
+      comboId: searchParams.get("comboId") || undefined,
+    });
+
+    if (!parsedQuery.success) {
+      return NextResponse.json(
+        {
+          error: parsedQuery.error.issues[0]?.message ?? "Invalid query parameters",
+        },
+        { status: 400 }
+      );
+    }
+
+    const { range, comboId } = parsedQuery.data;
+    const since = getRangeStartIso(range);
+
+    let combos: ComboRecord[] = [];
+    if (comboId) {
+      const combo = (await getComboById(comboId)) as ComboRecord | null;
+      if (!combo) {
+        return NextResponse.json({ error: "Combo not found" }, { status: 404 });
+      }
+      combos = [combo];
+    } else {
+      combos = (await getCombos()) as ComboRecord[];
+    }
+
+    const response: ComboHealthResponse = {
+      timeRange: range,
+      combos: combos
+        .map((combo) => buildComboHealth(combo, since))
+        .filter((combo): combo is ComboHealthMetrics => combo !== null),
+    };
+
+    return NextResponse.json(response);
+  } catch (error) {
+    console.error("Error fetching combo health:", error);
+    return NextResponse.json({ error: "Failed to fetch combo health" }, { status: 500 });
+  }
+}
@@ -0,0 +1,67 @@
+import { NextResponse } from "next/server";
+import { getAggregatedSnapshots } from "@/lib/db/quotaSnapshots";
+import type { ProviderUtilizationResponse, UtilizationTimeRange } from "@/shared/types/utilization";
+import { BUCKET_SIZES } from "@/shared/types/utilization";
+
+const VALID_RANGES: UtilizationTimeRange[] = ["1h", "24h", "7d", "30d"];
+
+function getRangeStartIso(range: UtilizationTimeRange): string {
+  const end = new Date();
+  const start = new Date(end);
+
+  switch (range) {
+    case "1h":
+      start.setHours(start.getHours() - 1);
+      break;
+    case "24h":
+      start.setDate(start.getDate() - 1);
+      break;
+    case "7d":
+      start.setDate(start.getDate() - 7);
+      break;
+    case "30d":
+      start.setDate(start.getDate() - 30);
+      break;
+  }
+
+  return start.toISOString();
+}
+
+export async function GET(request: Request) {
+  try {
+    const { searchParams } = new URL(request.url);
+    const rangeParam = searchParams.get("range");
+    const providerParam = searchParams.get("provider");
+
+    if (!rangeParam || !VALID_RANGES.includes(rangeParam as UtilizationTimeRange)) {
+      return NextResponse.json(
+        { error: "Invalid range. Must be one of: 1h, 24h, 7d, 30d" },
+        { status: 400 }
+      );
+    }
+
+    const range = rangeParam as UtilizationTimeRange;
+    const since = getRangeStartIso(range);
+    const bucketMinutes = BUCKET_SIZES[range];
+
+    const data = getAggregatedSnapshots({
+      provider: providerParam || undefined,
+      since,
+      bucketMinutes,
+    });
+
+    const providers = Array.from(new Set(data.map((d) => d.provider)));
+
+    const response: ProviderUtilizationResponse = {
+      timeRange: range,
+      bucketSizeMinutes: bucketMinutes,
+      providers,
+      data,
+    };
+
+    return NextResponse.json(response);
+  } catch (error) {
+    console.error("Error fetching utilization data:", error);
+    return NextResponse.json({ error: "Failed to fetch utilization data" }, { status: 500 });
+  }
+}
@@ -17,6 +17,7 @@ import { getUsageForProvider } from "@omniroute/open-sse/services/usage.ts";
 import { getProviderConnectionById, resolveProxyForConnection } from "@/lib/localDb";
 import { runWithProxyContext } from "@omniroute/open-sse/utils/proxyFetch.ts";
 import { safePercentage } from "@/shared/utils/formatting";
+import { saveQuotaSnapshot, cleanupOldSnapshots } from "@/lib/db/quotaSnapshots";

 // ─── Types ──────────────────────────────────────────────────────────────────

@@ -181,14 +182,40 @@ export function setQuotaCache(
 ) {
  const quotas = normalizeQuotas(rawQuotas);
  const exhausted = isExhausted(quotas);
-  cache.set(connectionId, {
+  const entry: QuotaCacheEntry = {
    connectionId,
    provider,
    quotas,
    fetchedAt: Date.now(),
    exhausted,
    nextResetAt: exhausted ? earliestResetAt(quotas) : null,
-  });
+  };
+  cache.set(connectionId, entry);
+
+  if (rawQuotas) {
+    for (const [windowKey, quotaInfo] of Object.entries(rawQuotas)) {
+      if (!quotaInfo || typeof quotaInfo !== "object") continue;
+      const remainingPercentage =
+        safePercentage(quotaInfo.remainingPercentage) ??
+        (quotaInfo.total > 0
+          ? Math.round(((quotaInfo.total - (quotaInfo.used || 0)) / quotaInfo.total) * 100)
+          : 0);
+      try {
+        saveQuotaSnapshot({
+          provider,
+          connection_id: connectionId,
+          window_key: windowKey,
+          remaining_percentage: remainingPercentage,
+          is_exhausted: entry.exhausted ? 1 : 0,
+          next_reset_at: quotaInfo.resetAt ?? null,
+          window_duration_ms: entry.windowDurationMs ?? null,
+          raw_data: null,
+        });
+      } catch (error) {
+        console.error("[quotaCache] Failed to save snapshot:", error);
+      }
+    }
+  }
 }

 /**
@@ -330,6 +357,7 @@ async function backgroundRefreshTick() {
  tickRunning = true;

  try {
+    cleanupOldSnapshots();
    const now = Date.now();
    const pending = [...cache.values()].filter((e) => needsRefresh(e, now));

@@ -96,11 +96,10 @@
 * @property {boolean} hasPassword - Whether a password has been set
 * @property {string} [theme] - UI theme
 * @property {string} [language] - UI language
- * @property {boolean} [enableRequestLogs] - Whether request logging is enabled
 * @property {boolean} [enableSocks5Proxy] - Whether SOCKS5 proxy is allowed
 * @property {string} [instanceName] - Instance display name
 * @property {string} [corsOrigins] - Allowed CORS origins
- * @property {number} [logRetentionDays] - Log retention in days
+ * @property {boolean} [call_log_pipeline_enabled] - Whether per-request pipeline capture is enabled
 * @property {string[]} [hiddenSidebarItems] - Sidebar entry ids hidden for visual decluttering
 */

@@ -36,7 +36,7 @@
    "time": "الوقت",
    "details": "التفاصيل",
    "created": "تم إنشاؤها",
-    "lastUsed": "آخر استخدام",
+    "lastUsed": "Last Refreshed",
    "loadMore": "تحميل المزيد",
    "noResults": "لم يتم العثور على نتائج",
    "reloadPage": "إعادة تحميل الصفحة",
@@ -270,7 +270,11 @@
    "overviewDescription": "راقب أنماط استخدام واجهة برمجة التطبيقات (API) واستهلاك الرمز المميز والتكاليف واتجاهات النشاط عبر جميع مقدمي الخدمات والنماذج.",
    "evalsDescription": "قم بتشغيل مجموعات التقييم لاختبار نقاط نهاية LLM الخاصة بك والتحقق من صحتها. مقارنة جودة النموذج، واكتشاف الانحدارات، وقياس وقت الاستجابة.",
    "overview": "نظرة عامة",
-    "evals": "التقييمات"
+    "evals": "التقييمات",
+    "utilization": "الاستخدام",
+    "utilizationDescription": "اتجاهات استخدام حصة المزود وتتبع حدود المعدل",
+    "comboHealth": "صحة المجموعة",
+    "comboHealthDescription": "الحصة على مستوى المجموعة وتوزيع الاستخدام ومقاييس الأداء"
  },
  "apiManager": {
    "title": "مفاتيح واجهة برمجة التطبيقات",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "جار التحميل...",
    "account": "الحساب",
    "modelQuotas": "الحصص النموذجية",
-    "lastUsed": "آخر استخدام",
+    "lastUsed": "Last Refreshed",
    "actions": "الإجراءات",
    "refreshQuota": "تحديث الحصة",
    "today": "اليوم",
@@ -36,7 +36,7 @@
    "time": "време",
    "details": "Подробности",
    "created": "Създаден",
-    "lastUsed": "Последно използвано",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Зареди още",
    "noResults": "Няма намерени резултати",
    "reloadPage": "Презареди страницата",
@@ -268,9 +268,13 @@
  "analytics": {
    "title": "Анализ",
    "overviewDescription": "Наблюдавайте своите модели на използване на API, потреблението на токени, разходите и тенденциите в дейността при всички доставчици и модели.",
-    "evalsDescription": "Изпълнете пакети за оценка, за да тествате и валидирате вашите крайни точки на LLM. Сравнете качеството на модела, открийте регресии и сравните латентността.",
+    "evalsDescription": "Изпълнете пакети за оценка, за да тествате и валидирате вашите крайни точки на LLM. Сравнете качеството на модела, открийте регресии и сравнете латентността.",
    "overview": "Преглед",
-    "evals": "Оценки"
+    "evals": "Оценки",
+    "utilization": "Използване",
+    "utilizationDescription": "Тенденции в използването на квотата на доставчика и проследяване на ограниченията на скоростта",
+    "comboHealth": "Здраве на комбинацията",
+    "comboHealthDescription": "Квота на ниво комбинация, разпределение на използването и метрики на производителността"
  },
  "apiManager": {
    "title": "API ключове",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Зареждане...",
    "account": "акаунт",
    "modelQuotas": "Моделни квоти",
-    "lastUsed": "Последно използвано",
+    "lastUsed": "Last Refreshed",
    "actions": "Действия",
    "refreshQuota": "Опресняване на квотата",
    "today": "Днес",
@@ -36,7 +36,7 @@
    "time": "Čas",
    "details": "Podrobnosti",
    "created": "Vytvořeno",
-    "lastUsed": "Naposledy použité",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Načíst více",
    "noResults": "Nenalezeny žádné výsledky",
    "reloadPage": "Znovu načíst stránku",
@@ -270,7 +270,11 @@
    "overviewDescription": "Sledujte vzorce používání API, spotřebu tokenů, náklady a trendy aktivity napříč všemi poskytovateli a modely.",
    "evalsDescription": "Spusťte sady vyhodnocovacích programů pro test a ověření LLM koncových bodů. Porovnejte kvalitu modelů, detekujte zhoršení a porovnejte latenci.",
    "overview": "Přehled",
-    "evals": "Hodnocení"
+    "evals": "Hodnocení",
+    "utilization": "Využití",
+    "utilizationDescription": "Trendy využití kvót poskytovatele a sledování limitů rychlosti",
+    "comboHealth": "Zdraví Combo",
+    "comboHealthDescription": "Kvóta na úrovni kombinace, distribuce využití a metriky výkonu"
  },
  "apiManager": {
    "title": "API Klíče",
@@ -2333,7 +2337,7 @@
    "loadingQuotas": "Načítám...",
    "account": "Účet",
    "modelQuotas": "Kvóty Modelu",
-    "lastUsed": "Naposledy použito",
+    "lastUsed": "Last Refreshed",
    "actions": "Akce",
    "refreshQuota": "Obnovit Kvótu",
    "today": "Dnes",
@@ -36,7 +36,7 @@
    "time": "Tid",
    "details": "Detaljer",
    "created": "Oprettet",
-    "lastUsed": "Sidst brugt",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Indlæs mere",
    "noResults": "Ingen resultater fundet",
    "reloadPage": "Genindlæs siden",
@@ -270,7 +270,11 @@
    "overviewDescription": "Overvåg dine API-brugsmønstre, tokenforbrug, omkostninger og aktivitetstendenser på tværs af alle udbydere og modeller.",
    "evalsDescription": "Kør evalueringspakker for at teste og validere dine LLM-endepunkter. Sammenlign modelkvalitet, detekter regressioner og benchmark-forsinkelse.",
    "overview": "Oversigt",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "Udnyttelse",
+    "utilizationDescription": "Leverandørkvoteforbrugstendenser og hastighedsgrænseovervågning",
+    "comboHealth": "Kombosundhed",
+    "comboHealthDescription": "Komboniveaukvote, fordeling af brug og ydeevnemålinger"
  },
  "apiManager": {
    "title": "API nøgler",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Indlæser...",
    "account": "Konto",
    "modelQuotas": "Modelkvoter",
-    "lastUsed": "Sidst brugt",
+    "lastUsed": "Last Refreshed",
    "actions": "Handlinger",
    "refreshQuota": "Opdater kvote",
    "today": "I dag",
@@ -36,7 +36,7 @@
    "time": "Zeit",
    "details": "Einzelheiten",
    "created": "Erstellt",
-    "lastUsed": "Zuletzt verwendet",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Mehr laden",
    "noResults": "Keine Ergebnisse gefunden",
    "reloadPage": "Seite neu laden",
@@ -270,7 +270,11 @@
    "overviewDescription": "Überwachen Sie Ihre API-Nutzungsmuster, Token-Verbrauch, Kosten und Aktivitätstrends bei allen Anbietern und Modellen.",
    "evalsDescription": "Führen Sie Evaluierungssuiten aus, um Ihre LLM-Endpunkte zu testen und zu validieren. Vergleichen Sie die Modellqualität, erkennen Sie Regressionen und messen Sie die Latenz.",
    "overview": "Übersicht",
-    "evals": "Bewertungen"
+    "evals": "Bewertungen",
+    "utilization": "Auslastung",
+    "utilizationDescription": "Anbieter-Kontingent-Nutzungstrends und Ratenlimit-Verfolgung",
+    "comboHealth": "Combo-Gesundheit",
+    "comboHealthDescription": "Kontingent auf Combo-Ebene, Nutzungsverteilung und Leistungsmetriken"
  },
  "apiManager": {
    "title": "API-Schlüssel",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Laden...",
    "account": "Konto",
    "modelQuotas": "Modellkontingente",
-    "lastUsed": "Zuletzt verwendet",
+    "lastUsed": "Last Refreshed",
    "actions": "Aktionen",
    "refreshQuota": "Kontingent aktualisieren",
    "today": "Heute",
@@ -36,7 +36,7 @@
    "time": "Time",
    "details": "Details",
    "created": "Created",
-    "lastUsed": "Last Used",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Load More",
    "noResults": "No results found",
    "reloadPage": "Reload Page",
@@ -275,7 +275,11 @@
    "overviewDescription": "Monitor your API usage patterns, token consumption, costs, and activity trends across all providers and models.",
    "evalsDescription": "Run evaluation suites to test and validate your LLM endpoints. Compare model quality, detect regressions, and benchmark latency.",
    "overview": "Overview",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "Utilization",
+    "utilizationDescription": "Provider quota usage trends and rate limit tracking",
+    "comboHealth": "Combo Health",
+    "comboHealthDescription": "Combo-level quota, usage distribution, and performance metrics"
  },
  "apiManager": {
    "title": "API Keys",
@@ -2382,7 +2386,7 @@
    "loadingQuotas": "Loading...",
    "account": "Account",
    "modelQuotas": "Model Quotas",
-    "lastUsed": "Last Used",
+    "lastUsed": "Last Refreshed",
    "actions": "Actions",
    "refreshQuota": "Refresh quota",
    "today": "Today",
@@ -36,7 +36,7 @@
    "time": "Hora",
    "details": "Detalles",
    "created": "Creado",
-    "lastUsed": "Usado por última vez",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Cargar más",
    "noResults": "No se encontraron resultados",
    "reloadPage": "Recargar página",
@@ -270,7 +270,11 @@
    "overviewDescription": "Supervise sus patrones de uso de API, consumo de tokens, costos y tendencias de actividad en todos los proveedores y modelos.",
    "evalsDescription": "Ejecute conjuntos de evaluación para probar y validar sus puntos finales de LLM. Compare la calidad del modelo, detecte regresiones y compare la latencia.",
    "overview": "Descripción general",
-    "evals": "evaluaciones"
+    "evals": "evaluaciones",
+    "utilization": "Utilización",
+    "utilizationDescription": "Tendencias de uso de cuota del proveedor y seguimiento de límites de tasa",
+    "comboHealth": "Salud del combo",
+    "comboHealthDescription": "Cuota a nivel de combo, distribución de uso y métricas de rendimiento"
  },
  "apiManager": {
    "title": "Claves API",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Cargando...",
    "account": "cuenta",
    "modelQuotas": "Cuotas modelo",
-    "lastUsed": "Usado por última vez",
+    "lastUsed": "Last Refreshed",
    "actions": "Acciones",
    "refreshQuota": "Actualizar cuota",
    "today": "hoy",
@@ -36,7 +36,7 @@
    "time": "Aika",
    "details": "Yksityiskohdat",
    "created": "Luotu",
-    "lastUsed": "Viimeksi käytetty",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Lataa lisää",
    "noResults": "Tuloksia ei löytynyt",
    "reloadPage": "Lataa sivu uudelleen",
@@ -270,7 +270,11 @@
    "overviewDescription": "Tarkkaile sovellusliittymän käyttötapoja, tunnuksen kulutusta, kustannuksia ja toimintatrendejä kaikilla palveluntarjoajilla ja malleilla.",
    "evalsDescription": "Testaa ja vahvista LLM-päätepisteesi suorittamalla arviointipaketteja. Vertaile mallin laatua, havaitse regressiot ja vertaile viivettä.",
    "overview": "Yleiskatsaus",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "Käyttöaste",
+    "utilizationDescription": "Palveluntarjoajan kiintiön käyttötrendit ja nopeusrajoitusten seuranta",
+    "comboHealth": "Yhdistelmän kunto",
+    "comboHealthDescription": "Yhdistelmätason kiintiö, käytön jakautuminen ja suorituskykymittarit"
  },
  "apiManager": {
    "title": "API-avaimet",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Ladataan...",
    "account": "Tili",
    "modelQuotas": "Mallikiintiöt",
-    "lastUsed": "Viimeksi käytetty",
+    "lastUsed": "Last Refreshed",
    "actions": "Toiminnot",
    "refreshQuota": "Päivitä kiintiö",
    "today": "Tänään",
@@ -36,7 +36,7 @@
    "time": "Temps",
    "details": "Détails",
    "created": "Créé",
-    "lastUsed": "Dernière utilisation",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Charger plus",
    "noResults": "Aucun résultat trouvé",
    "reloadPage": "Recharger la page",
@@ -270,7 +270,11 @@
    "overviewDescription": "Surveillez vos modèles d'utilisation des API, la consommation de jetons, les coûts et les tendances d'activité sur tous les fournisseurs et modèles.",
    "evalsDescription": "Exécutez des suites d'évaluation pour tester et valider vos points de terminaison LLM. Comparez la qualité des modèles, détectez les régressions et évaluez la latence.",
    "overview": "Aperçu",
-    "evals": "Évaluations"
+    "evals": "Évaluations",
+    "utilization": "Utilisation",
+    "utilizationDescription": "Tendances d'utilisation du quota fournisseur et suivi des limites de débit",
+    "comboHealth": "Santé du combo",
+    "comboHealthDescription": "Quota au niveau combo, distribution de l'utilisation et métriques de performance"
  },
  "apiManager": {
    "title": "Clés API",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Chargement...",
    "account": "Compte",
    "modelQuotas": "Quotas de modèles",
-    "lastUsed": "Dernière utilisation",
+    "lastUsed": "Last Refreshed",
    "actions": "Actions",
    "refreshQuota": "Actualiser le quota",
    "today": "Aujourd'hui",
@@ -36,7 +36,7 @@
    "time": "זמן",
    "details": "פרטים",
    "created": "נוצר",
-    "lastUsed": "בשימוש אחרון",
+    "lastUsed": "Last Refreshed",
    "loadMore": "טען עוד",
    "noResults": "לא נמצאו תוצאות",
    "reloadPage": "טען מחדש את הדף",
@@ -270,7 +270,11 @@
    "overviewDescription": "עקוב אחר דפוסי השימוש שלך ב-API, צריכת אסימונים, עלויות ומגמות פעילות בכל הספקים והדגמים.",
    "evalsDescription": "הפעל חבילות הערכה כדי לבדוק ולאמת את נקודות הקצה שלך ב-LLM. השווה את איכות המודל, זיהוי רגרסיות והשהייה בהשוואה.",
    "overview": "סקירה כללית",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "ניצול",
+    "utilizationDescription": "מגמות שימוש במכסת הספק ומעקב אחר מגבלות קצב",
+    "comboHealth": "בריאות הקומבו",
+    "comboHealthDescription": "מכסה ברמת הקומבו, הפצת שימוש ומדדי ביצוע"
  },
  "apiManager": {
    "title": "מפתחות API",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "טוען...",
    "account": "חשבון",
    "modelQuotas": "מכסות דגם",
-    "lastUsed": "בשימוש אחרון",
+    "lastUsed": "Last Refreshed",
    "actions": "פעולות",
    "refreshQuota": "רענון מכסה",
    "today": "היום",
@@ -36,7 +36,7 @@
    "time": "समय",
    "details": "विवरण",
    "created": "बनाया गया",
-    "lastUsed": "अंतिम बार उपयोग किया गया",
+    "lastUsed": "Last Refreshed",
    "loadMore": "अधिक लोड करें",
    "noResults": "कोई परिणाम नहीं मिला",
    "reloadPage": "पृष्ठ पुनः लोड करें",
@@ -188,7 +188,11 @@
    "overviewDescription": "सभी प्रदाताओं और मॉडलों में अपने एपीआई उपयोग पैटर्न, टोकन खपत, लागत और गतिविधि रुझान की निगरानी करें।",
    "evalsDescription": "अपने एलएलएम समापन बिंदुओं का परीक्षण और सत्यापन करने के लिए मूल्यांकन सुइट चलाएँ। मॉडल गुणवत्ता की तुलना करें, प्रतिगमन और बेंचमार्क विलंबता का पता लगाएं।",
    "overview": "सिंहावलोकन",
-    "evals": "एवल्स"
+    "evals": "एवल्स",
+    "utilization": "उपयोग",
+    "utilizationDescription": "प्रदाता कोटा उपयोग रुझान और दर सीमा ट्रैकिंग",
+    "comboHealth": "कॉम्बो स्वास्थ्य",
+    "comboHealthDescription": "कॉम्बो-स्तरीय कोटा, उपयोग वितरण और प्रदर्शन मेट्रिक्स"
  },
  "apiManager": {
    "title": "एपीआई कुंजी",
@@ -2169,7 +2173,7 @@
    "loadingQuotas": "लोड हो रहा है...",
    "account": "खाता",
    "modelQuotas": "मॉडल कोटा",
-    "lastUsed": "अंतिम बार उपयोग किया गया",
+    "lastUsed": "Last Refreshed",
    "actions": "क्रियाएँ",
    "refreshQuota": "ताज़ा कोटा",
    "today": "आज",
@@ -36,7 +36,7 @@
    "time": "Idő",
    "details": "Részletek",
    "created": "Létrehozva",
-    "lastUsed": "Utoljára használt",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Load More",
    "noResults": "Nincs találat",
    "reloadPage": "Oldal újratöltése",
@@ -270,7 +270,11 @@
    "overviewDescription": "Kövesse nyomon API-használati mintáit, tokenfelhasználását, költségeit és tevékenységi trendjeit az összes szolgáltatónál és modellnél.",
    "evalsDescription": "Futtasson kiértékelő csomagokat az LLM-végpontok teszteléséhez és érvényesítéséhez. Hasonlítsa össze a modell minőségét, észlelje a regressziókat és mérje fel a késleltetést.",
    "overview": "Áttekintés",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "Kihasználtság",
+    "utilizationDescription": "Szolgáltatói kvótahasználati trendek és sebességkorlát-követés",
+    "comboHealth": "Kombó egészsége",
+    "comboHealthDescription": "Kombó szintű kvóta, használateloszlás és teljesítménymutatók"
  },
  "apiManager": {
    "title": "API kulcsok",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Betöltés...",
    "account": "fiók",
    "modelQuotas": "Modellkvóták",
-    "lastUsed": "Utoljára használt",
+    "lastUsed": "Last Refreshed",
    "actions": "Akciók",
    "refreshQuota": "Kvóta frissítése",
    "today": "Ma",
@@ -36,7 +36,7 @@
    "time": "Waktu",
    "details": "Detail",
    "created": "Dibuat",
-    "lastUsed": "Terakhir Digunakan",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Muat Lebih Banyak",
    "noResults": "Tidak ada hasil yang ditemukan",
    "reloadPage": "Muat Ulang Halaman",
@@ -270,7 +270,11 @@
    "overviewDescription": "Pantau pola penggunaan API Anda, konsumsi token, biaya, dan tren aktivitas di semua penyedia dan model.",
    "evalsDescription": "Jalankan rangkaian evaluasi untuk menguji dan memvalidasi titik akhir LLM Anda. Bandingkan kualitas model, deteksi regresi, dan latensi tolok ukur.",
    "overview": "Ikhtisar",
-    "evals": "Evaluasi"
+    "evals": "Evaluasi",
+    "utilization": "Pemanfaatan",
+    "utilizationDescription": "Tren penggunaan kuota penyedia dan pelacakan batas tarif",
+    "comboHealth": "Kesehatan Kombinasi",
+    "comboHealthDescription": "Kuota tingkat kombinasi, distribusi penggunaan, dan metrik kinerja"
  },
  "apiManager": {
    "title": "Kunci API",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Memuat...",
    "account": "Akun",
    "modelQuotas": "Kuota Model",
-    "lastUsed": "Terakhir Digunakan",
+    "lastUsed": "Last Refreshed",
    "actions": "Tindakan",
    "refreshQuota": "Segarkan kuota",
    "today": "Hari ini",
@@ -36,7 +36,7 @@
    "time": "समय",
    "details": "विवरण",
    "created": "बनाया गया",
-    "lastUsed": "अंतिम बार उपयोग किया गया",
+    "lastUsed": "Last Refreshed",
    "loadMore": "अधिक लोड करें",
    "noResults": "कोई परिणाम नहीं मिला",
    "reloadPage": "पृष्ठ पुनः लोड करें",
@@ -270,7 +270,11 @@
    "overviewDescription": "सभी प्रदाताओं और मॉडलों में अपने एपीआई उपयोग पैटर्न, टोकन खपत, लागत और गतिविधि रुझान की निगरानी करें।",
    "evalsDescription": "अपने एलएलएम समापन बिंदुओं का परीक्षण और सत्यापन करने के लिए मूल्यांकन सुइट चलाएँ। मॉडल गुणवत्ता की तुलना करें, प्रतिगमन और बेंचमार्क विलंबता का पता लगाएं।",
    "overview": "सिंहावलोकन",
-    "evals": "एवल्स"
+    "evals": "एवल्स",
+    "utilization": "उपयोग",
+    "utilizationDescription": "प्रदाता कोटा उपयोग रुझान और दर सीमा ट्रैकिंग",
+    "comboHealth": "कॉम्बो स्वास्थ्य",
+    "comboHealthDescription": "कॉम्बो-स्तरीय कोटा, उपयोग वितरण और प्रदर्शन मेट्रिक्स"
  },
  "apiManager": {
    "title": "एपीआई कुंजी",
@@ -2333,7 +2337,7 @@
    "loadingQuotas": "लोड हो रहा है...",
    "account": "खाता",
    "modelQuotas": "मॉडल कोटा",
-    "lastUsed": "अंतिम बार उपयोग किया गया",
+    "lastUsed": "Last Refreshed",
    "actions": "क्रियाएँ",
    "refreshQuota": "ताज़ा कोटा",
    "today": "आज",
@@ -36,7 +36,7 @@
    "time": "Tempo",
    "details": "Dettagli",
    "created": "Creato",
-    "lastUsed": "Ultimo utilizzo",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Carica altro",
    "noResults": "Nessun risultato trovato",
    "reloadPage": "Ricarica la pagina",
@@ -270,7 +270,11 @@
    "overviewDescription": "Monitora i modelli di utilizzo delle API, il consumo di token, i costi e le tendenze delle attività su tutti i provider e modelli.",
    "evalsDescription": "Esegui suite di valutazione per testare e convalidare i tuoi endpoint LLM. Confronta la qualità del modello, rileva le regressioni e confronta la latenza.",
    "overview": "Panoramica",
-    "evals": "Valutazioni"
+    "evals": "Valutazioni",
+    "utilization": "Utilizzo",
+    "utilizationDescription": "Tendenze di utilizzo della quota del provider e monitoraggio dei limiti di frequenza",
+    "comboHealth": "Salute Combo",
+    "comboHealthDescription": "Quota a livello combo, distribuzione dell'utilizzo e metriche delle prestazioni"
  },
  "apiManager": {
    "title": "Chiavi API",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Caricamento...",
    "account": "Account",
    "modelQuotas": "Quote modello",
-    "lastUsed": "Ultimo utilizzo",
+    "lastUsed": "Last Refreshed",
    "actions": "Azioni",
    "refreshQuota": "Aggiorna quota",
    "today": "Oggi",
@@ -36,7 +36,7 @@
    "time": "時間",
    "details": "詳細",
    "created": "作成されました",
-    "lastUsed": "最後に使用したもの",
+    "lastUsed": "Last Refreshed",
    "loadMore": "もっと読み込む",
    "noResults": "結果が見つかりませんでした",
    "reloadPage": "ページをリロードする",
@@ -270,7 +270,11 @@
    "overviewDescription": "すべてのプロバイダーとモデルにわたる API 使用パターン、トークン消費量、コスト、アクティビティの傾向を監視します。",
    "evalsDescription": "評価スイートを実行して、LLM エンドポイントをテストおよび検証します。モデルの品質を比較し、回帰を検出し、待ち時間をベンチマークします。",
    "overview": "概要",
-    "evals": "エヴァルス"
+    "evals": "エヴァルス",
+    "utilization": "活用率",
+    "utilizationDescription": "プロバイダーのクォータ使用傾向とレート制限の追跡",
+    "comboHealth": "コンボ健全性",
+    "comboHealthDescription": "コンボレベルのクォータ、使用分布、パフォーマンスメトリクス"
  },
  "apiManager": {
    "title": "APIキー",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "読み込み中...",
    "account": "アカウント",
    "modelQuotas": "モデルのクォータ",
-    "lastUsed": "最後に使用したもの",
+    "lastUsed": "Last Refreshed",
    "actions": "アクション",
    "refreshQuota": "クォータの更新",
    "today": "今日",
@@ -36,7 +36,7 @@
    "time": "시간",
    "details": "세부정보",
    "created": "생성됨",
-    "lastUsed": "마지막으로 사용됨",
+    "lastUsed": "Last Refreshed",
    "loadMore": "더 로드하기",
    "noResults": "검색결과가 없습니다",
    "reloadPage": "페이지 새로고침",
@@ -270,7 +270,11 @@
    "overviewDescription": "모든 공급자와 모델 전반에 걸쳐 API 사용 패턴, 토큰 소비, 비용, 활동 추세를 모니터링하세요.",
    "evalsDescription": "평가 모음을 실행하여 LLM 엔드포인트를 테스트하고 검증하세요. 모델 품질을 비교하고, 회귀를 감지하고, 벤치마크 대기 시간을 측정합니다.",
    "overview": "개요",
-    "evals": "평가"
+    "evals": "평가",
+    "utilization": "활용률",
+    "utilizationDescription": "공급자 할당량 사용 추세 및 속도 제한 추적",
+    "comboHealth": "콤보 건강 상태",
+    "comboHealthDescription": "콤보 수준 할당량, 사용 분포 및 성능 메트릭"
  },
  "apiManager": {
    "title": "API 키",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "로드 중...",
    "account": "계정",
    "modelQuotas": "모델 할당량",
-    "lastUsed": "마지막으로 사용됨",
+    "lastUsed": "Last Refreshed",
    "actions": "작업",
    "refreshQuota": "새로고침 할당량",
    "today": "오늘",
@@ -36,7 +36,7 @@
    "time": "Masa",
    "details": "Butiran",
    "created": "Dicipta",
-    "lastUsed": "Terakhir Digunakan",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Muatkan Lagi",
    "noResults": "Tiada hasil ditemui",
    "reloadPage": "Muat Semula Halaman",
@@ -270,7 +270,11 @@
    "overviewDescription": "Pantau corak penggunaan API anda, penggunaan token, kos dan aliran aktiviti merentas semua pembekal dan model.",
    "evalsDescription": "Jalankan suite penilaian untuk menguji dan mengesahkan titik akhir LLM anda. Bandingkan kualiti model, mengesan regresi dan kependaman penanda aras.",
    "overview": "Gambaran keseluruhan",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "Pemanfaatan",
+    "utilizationDescription": "Tren penggunaan kuota pembekal dan penjejakan had kadar",
+    "comboHealth": "Kesihatan Combo",
+    "comboHealthDescription": "Kuota peringkat combo, pengagihan penggunaan dan metrik prestasi"
  },
  "apiManager": {
    "title": "Kunci API",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Memuatkan...",
    "account": "Akaun",
    "modelQuotas": "Kuota Model",
-    "lastUsed": "Terakhir Digunakan",
+    "lastUsed": "Last Refreshed",
    "actions": "Tindakan",
    "refreshQuota": "Muat semula kuota",
    "today": "Hari ini",
@@ -36,7 +36,7 @@
    "time": "Tijd",
    "details": "Details",
    "created": "Gemaakt",
-    "lastUsed": "Laatst gebruikt",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Laad meer",
    "noResults": "Geen resultaten gevonden",
    "reloadPage": "Pagina opnieuw laden",
@@ -270,7 +270,11 @@
    "overviewDescription": "Houd toezicht op uw API-gebruikspatronen, tokenverbruik, kosten en activiteitstrends voor alle providers en modellen.",
    "evalsDescription": "Voer evaluatiesuites uit om uw LLM-eindpunten te testen en te valideren. Vergelijk de modelkwaliteit, detecteer regressies en benchmark de latentie.",
    "overview": "Overzicht",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "Benutting",
+    "utilizationDescription": "Trends in quotagebruik van de provider en bijhouden van snelheidslimieten",
+    "comboHealth": "Combo-gezondheid",
+    "comboHealthDescription": "Quota op combo-niveau, gebruiksdistributie en prestatiegegevens"
  },
  "apiManager": {
    "title": "API-sleutels",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Laden...",
    "account": "Rekening",
    "modelQuotas": "Modelquota",
-    "lastUsed": "Laatst gebruikt",
+    "lastUsed": "Last Refreshed",
    "actions": "Acties",
    "refreshQuota": "Quota vernieuwen",
    "today": "Vandaag",
@@ -36,7 +36,7 @@
    "time": "Tid",
    "details": "Detaljer",
    "created": "Opprettet",
-    "lastUsed": "Sist brukt",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Last inn mer",
    "noResults": "Ingen resultater funnet",
    "reloadPage": "Last inn siden på nytt",
@@ -270,7 +270,11 @@
    "overviewDescription": "Overvåk API-bruksmønstre, tokenforbruk, kostnader og aktivitetstrender på tvers av alle leverandører og modeller.",
    "evalsDescription": "Kjør evalueringspakker for å teste og validere LLM-endepunktene dine. Sammenlign modellkvalitet, registrer regresjoner og benchmark latens.",
    "overview": "Oversikt",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "Utnyttelse",
+    "utilizationDescription": "Leverandørkvotebrukstrender og hastighetsgrensesporing",
+    "comboHealth": "Kombohelse",
+    "comboHealthDescription": "Kombonivåkvote, bruksfordeling og ytelsesmålinger"
  },
  "apiManager": {
    "title": "API-nøkler",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Laster inn...",
    "account": "Konto",
    "modelQuotas": "Modellkvoter",
-    "lastUsed": "Sist brukt",
+    "lastUsed": "Last Refreshed",
    "actions": "Handlinger",
    "refreshQuota": "Oppdater kvote",
    "today": "I dag",
@@ -36,7 +36,7 @@
    "time": "Oras",
    "details": "Mga Detalye",
    "created": "Nilikha",
-    "lastUsed": "Huling Ginamit",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Mag-load ng Higit Pa",
    "noResults": "Walang nakitang resulta",
    "reloadPage": "I-reload ang Pahina",
@@ -270,7 +270,11 @@
    "overviewDescription": "Subaybayan ang iyong mga pattern ng paggamit ng API, pagkonsumo ng token, gastos, at mga trend ng aktibidad sa lahat ng provider at modelo.",
    "evalsDescription": "Magpatakbo ng mga evaluation suite para subukan at patunayan ang iyong mga LLM endpoint. Ihambing ang kalidad ng modelo, tuklasin ang mga regression, at benchmark na latency.",
    "overview": "Pangkalahatang-ideya",
-    "evals": "Evals"
+    "evals": "Evals",
+    "utilization": "Paggamit",
+    "utilizationDescription": "Mga trend sa paggamit ng quota ng provider at pagsubaybay sa rate limit",
+    "comboHealth": "Kalusugan ng Combo",
+    "comboHealthDescription": "Quota sa antas ng combo, pamamahagi ng paggamit, at mga metrics ng pagganap"
  },
  "apiManager": {
    "title": "Mga API Key",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Naglo-load...",
    "account": "Account",
    "modelQuotas": "Mga Modelong Quota",
-    "lastUsed": "Huling Ginamit",
+    "lastUsed": "Last Refreshed",
    "actions": "Mga aksyon",
    "refreshQuota": "I-refresh ang quota",
    "today": "Ngayong araw",
@@ -36,7 +36,7 @@
    "time": "Czas",
    "details": "Szczegóły",
    "created": "Utworzono",
-    "lastUsed": "Ostatnio używany",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Załaduj więcej",
    "noResults": "Nie znaleziono żadnych wyników",
    "reloadPage": "Załaduj ponownie stronę",
@@ -270,7 +270,11 @@
    "overviewDescription": "Monitoruj wzorce wykorzystania interfejsu API, zużycie tokenów, koszty i trendy aktywności u wszystkich dostawców i modeli.",
    "evalsDescription": "Uruchom pakiety ewaluacyjne, aby przetestować i zweryfikować punkty końcowe LLM. Porównaj jakość modelu, wykryj regresje i porównaj opóźnienia.",
    "overview": "Przegląd",
-    "evals": "Ewaluacje"
+    "evals": "Ewaluacje",
+    "utilization": "Wykorzystanie",
+    "utilizationDescription": "Trendy wykorzystania kwoty dostawcy i śledzenie limitów szybkości",
+    "comboHealth": "Zdrowie Kombinacji",
+    "comboHealthDescription": "Kwota na poziomie kombinacji, dystrybucja użycia i metryki wydajności"
  },
  "apiManager": {
    "title": "Klucze API",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Ładowanie...",
    "account": "Konto",
    "modelQuotas": "Kwoty modelowe",
-    "lastUsed": "Ostatnio używany",
+    "lastUsed": "Last Refreshed",
    "actions": "Działania",
    "refreshQuota": "Odśwież limit",
    "today": "Dzisiaj",
@@ -36,7 +36,7 @@
    "time": "Tempo",
    "details": "Detalhes",
    "created": "Criado",
-    "lastUsed": "Último Uso",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Carregar Mais",
    "noResults": "Nenhum resultado encontrado",
    "reloadPage": "Recarregar Página",
@@ -270,7 +270,11 @@
    "overviewDescription": "Monitore padrões de uso da API, consumo de tokens, custos e tendências de atividade em todos os provedores e modelos.",
    "evalsDescription": "Execute suítes de avaliação para testar e validar seus endpoints LLM. Compare qualidade de modelos, detecte regressões e faça benchmarks de latência.",
    "overview": "Visão Geral",
-    "evals": "Avaliações"
+    "evals": "Avaliações",
+    "utilization": "Utilização",
+    "utilizationDescription": "Tendências de uso de cota do provedor e rastreamento de limites de taxa",
+    "comboHealth": "Saúde do Combo",
+    "comboHealthDescription": "Cota em nível de combo, distribuição de uso e métricas de desempenho"
  },
  "apiManager": {
    "title": "Chaves de API",
@@ -2288,7 +2292,7 @@
    "loadingQuotas": "Carregando...",
    "account": "Conta",
    "modelQuotas": "Cotas de Modelo",
-    "lastUsed": "Último uso",
+    "lastUsed": "Last Refreshed",
    "actions": "Ações",
    "refreshQuota": "Atualizar cota",
    "today": "Hoje",
@@ -36,7 +36,7 @@
    "time": "Hora",
    "details": "Detalhes",
    "created": "Criado",
-    "lastUsed": "Último uso",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Carregar mais",
    "noResults": "Nenhum resultado encontrado",
    "reloadPage": "Recarregar página",
@@ -270,7 +270,11 @@
    "overviewDescription": "Monitore seus padrões de uso de API, consumo de tokens, custos e tendências de atividades em todos os provedores e modelos.",
    "evalsDescription": "Execute conjuntos de avaliação para testar e validar seus endpoints LLM. Compare a qualidade do modelo, detecte regressões e compare a latência.",
    "overview": "Visão geral",
-    "evals": "Avaliações"
+    "evals": "Avaliações",
+    "utilization": "Utilização",
+    "utilizationDescription": "Tendências de uso de quota do provedor e rastreamento de limites de taxa",
+    "comboHealth": "Saúde do Combo",
+    "comboHealthDescription": "Quota ao nível do combo, distribuição de uso e métricas de desempenho"
  },
  "apiManager": {
    "title": "Chaves de API",
@@ -2282,7 +2286,7 @@
    "loadingQuotas": "Carregando...",
    "account": "Conta",
    "modelQuotas": "Cotas de modelo",
-    "lastUsed": "Último uso",
+    "lastUsed": "Last Refreshed",
    "actions": "Ações",
    "refreshQuota": "Atualizar cota",
    "today": "Hoje",
@@ -36,7 +36,7 @@
    "time": "timpul",
    "details": "Detalii",
    "created": "Creat",
-    "lastUsed": "Ultima utilizare",
+    "lastUsed": "Last Refreshed",
    "loadMore": "Încărcați mai multe",
    "noResults": "Nu s-au găsit rezultate",
    "reloadPage": "Reîncărcați pagina",
@@ -270,7 +270,11 @@
    "overviewDescription": "Monitorizați-vă modelele de utilizare a API-ului, consumul de simboluri, costurile și tendințele activității la toți furnizorii și modelele.",
    "evalsDescription": "Rulați suite de evaluare pentru a testa și valida punctele finale LLM. Comparați calitatea modelului, detectați regresiile și evaluați latența.",
    "overview": "Prezentare generală",
-    "evals": "Evaluări"
+    "evals": "Evaluări",
+    "utilization": "Utilizare",
+    "utilizationDescription": "Tendințe de utilizare a cotelor furnizorului și urmărirea limitelor de rată",
+    "comboHealth": "Sănătatea Combo-ului",
+    "comboHealthDescription": "Cotă la nivel de combo, distribuția utilizării și metricile de performanță"
  },
  "apiManager": {
    "title": "Chei API",
@@ -2270,7 +2274,7 @@
    "loadingQuotas": "Se încarcă...",
    "account": "Cont",
    "modelQuotas": "Cote model",
-    "lastUsed": "Ultima utilizare",
+    "lastUsed": "Last Refreshed",
    "actions": "Acțiuni",
    "refreshQuota": "Actualizează cota",
    "today": "Astăzi",
Author	SHA1	Message	Date
Diego Rodrigues de Sa e Souza	70a4d38d04	Release v3.4.0 (Integration) (#861) * test(settings): add unit tests for debugMode and hiddenSidebarItems Tests cover: - PATCH debugMode=true/false - PATCH hiddenSidebarItems with array values - Combined updates with both fields * test(e2e): add Playwright tests for settings toggles Tests cover: - Debug mode toggle on/off - Sidebar visibility toggle - Settings persistence after page reload * fix(tests): address code review issues - Unit tests: fix async/await for getSettings, use direct db functions - E2E tests: remove conditional logic, use Playwright auto-waiting assertions * feat(logging): unify request log retention and artifacts * docs: add dashboard settings toggles to CONTRIBUTING Add section documenting: - Debug Mode toggle (Settings → Advanced) - Sidebar Visibility toggle (Settings → General) * fix(cache): only inject prompt_cache_key for supported providers Only inject prompt_cache_key for providers that support prompt caching (Claude, Anthropic, ZAI, Qwen, DeepSeek). This fixes issue #848 where NVIDIA API rejected the parameter.
* fix(model-sync): log only channel-level model changes * feat(providers): add 4 free models to opencode-zen * feat(providers): add explicit contextLength for opencode-zen free models * feat(providers): add contextLength for all opencode-zen models * feat: Improve the Chinese translation * fix: preserve client cache_control for all Claude-protocol providers Previously, the cache control preservation logic only recognized a hardcoded list of providers (claude, anthropic, zai, qwen, deepseek). This caused OmniRoute to inject its own cache_control markers for Claude-protocol providers not in that list (bailian-coding-plan, glm, minimax, minimax-cn, etc.), overwriting the client's cache markers. The fix checks both: 1. Known caching providers list (existing behavior) 2. Whether targetFormat === 'claude' (all Claude-protocol providers) This ensures all Claude-compatible providers properly preserve client cache_control headers when appropriate (Claude Code client, deterministic routing, etc.). Also removes unused CacheStatsCard from settings/components (duplicate of the one in cache/ page). Fixes cache token calculation for GLM, Minimax, and other Claude-compatible providers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: pure passthrough for Claude→Claude when cache_control preserved The Claude passthrough path round-trips through OpenAI format (claude→openai→claude) for structural normalization. This strips cache_control markers from every content block since OpenAI format has no equivalent, causing ~42k cache creation tokens per request with zero cache reads. When preserveCacheControl is true (Claude Code client, "always" setting, or deterministic combo), skip the round-trip entirely and forward the body as-is. Claude Code sends well-formed Messages API payloads — the normalization was only needed for non-Code clients. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: restore CacheStatsCard — was not a duplicate The first commit incorrectly deleted CacheStatsCard from settings/components/ as a "duplicate". It's the only copy — both settings/page.tsx and cache/page.tsx import from this location. Restored the i18n-ized version from main. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(429): parse long quota reset times from error body - Parse XhYmZs format from antigravity error messages (e.g., 27h41m36s) - Dynamic retry-after threshold (60s default) instead of hardcoded 10s - Add parseRetryFromErrorText() in accountFallback.ts for body parsing - Fix 403 'verify your account' to trigger permanent deactivation - Add keyword matching for 'quota will reset', 'exhausted capacity' - Add unit tests for retry parsing and keyword matching Fixes #858 (Antigravity 429 handling) Fixes #832 (Qwen quota 429 - same underlying bug) * chore: bump version to v3.4.0-dev * fix(migrations): rename 013 to 014 to avoid collision with v3.3.11 * chore(docs): update CHANGELOG for v3.4.0 integrations * fix: Claude token refresh, Antigravity quota, and 429 rate-limit handling - Fix Claude OAuth token refresh to use form-urlencoded format (standard OAuth2) - Add anthropic-beta header required by Claude OAuth API - Switch Antigravity quota to use retrieveUserQuota API (same as Gemini CLI) - Parse quota reset time for all providers (not just Antigravity) - Add quota reset keywords to error classifier - Cap maximum retry time at 24 hours to prevent infinite wait Closes #836, #857, #858, #832 * fix(dashboard): resolve /dashboard/limits hanging UI with 70+ accounts via chunk parallelization (#784) --------- Co-authored-by: oyi77 <oyi77@users.noreply.github.com> Co-authored-by: R.D. 
<rogerproself@gmail.com> Co-authored-by: kang-heewon <heewon.dev@gmail.com> Co-authored-by: gmw <rorschach1167@qq.com> Co-authored-by: tombii <github@tombii.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>	2026-03-31 10:22:52 -03:00
Diego Rodrigues de Sa e Souza	dbe17b4b16	Merge pull request #860 from diegosouzapw/release/v3.3.11 chore(release): v3.3.11 — analytics, backup control, CLI fixes, workflow unification	2026-03-31 08:20:04 -03:00
diegosouzapw	ee4df2806f	chore(release): v3.3.11 — analytics, backup control, CLI fixes, workflow unification	2026-03-31 08:17:07 -03:00
diegosouzapw	afc0bc9323	fix: resolve CLI detection regression and model catalog tests	2026-03-31 07:57:43 -03:00
diegosouzapw	e071393eb5	chore(release): v3.3.10 — version bump and docs	2026-03-31 00:16:51 -03:00
Diego Rodrigues de Sa e Souza	6f9fec658f	chore(release): v3.3.10 — merge Analytics and SQLite fixes (#849) Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>	2026-03-31 00:15:20 -03:00
Owen	9227964cb6	feat(analytics): add subscription utilization analytics (#847) - Add quota_snapshots table for time-series quota tracking - Add QuotaSnapshot DB module with CRUD + cleanup (6h gate) - Add snapshot save hook in quotaCache.setQuotaCache() - Add Provider Utilization API: GET /api/usage/utilization - Add Combo Health API: GET /api/usage/combo-health - Add ProviderUtilizationTab with recharts LineChart - Add ComboHealthTab with quota/skew/performance metrics - Add TimeRangeSelector component (1h/24h/7d/30d) - Integrate tabs into /dashboard/analytics - Add unit tests for quotaSnapshots module - Add E2E tests for analytics tabs - Add i18n keys for 33 languages	2026-03-31 00:12:27 -03:00
Randi	cf6056cede	Add env flag to disable automatic SQLite backups (#846) * feat(db): allow disabling sqlite auto backups * chore(db): rename sqlite auto backup env flag	2026-03-31 00:12:23 -03:00
Diego Rodrigues de Sa e Souza	4397612349	Merge pull request #842 from rdself/coder/fix-codex-fast-tier-light-mode Restore Codex fast tier toggle visibility in light mode	2026-03-30 23:24:44 -03:00
diegosouzapw	cf3719a663	Merge pull request #843 from rdself/coder/provider-limits-last-refreshed with i18n synchronization	2026-03-30 23:24:41 -03:00
diegosouzapw	77bf35d728	chore(i18n): sync lastUsed key across all 30 languages	2026-03-30 23:11:55 -03:00
R.D.	e7addec0a1	fix(usage): track provider limit refreshes per account	2026-03-30 21:17:24 -04:00
R.D.	243d61d95f	fix(ui): restore codex service tier toggle contrast	2026-03-30 21:14:52 -04:00
Diego Rodrigues de Sa e Souza	028874fd05	Merge pull request #840 from diegosouzapw/release/v3.3.9 chore(release): v3.3.9 — summary	2026-03-30 21:38:32 -03:00
diegosouzapw	6d366fe80f	chore(release): v3.3.9 — custom provider key rotation fix	2026-03-30 21:33:21 -03:00
Diego Rodrigues de Sa e Souza	0924f767e9	Merge pull request #839 from diegosouzapw/fix/issue-815-custom-provider-key-rotation fix: rotate extra api keys for custom providers (#815)	2026-03-30 21:30:56 -03:00
diegosouzapw	173b5a1cd1	fix: rotate extra api keys for custom providers (#815)	2026-03-30 21:13:50 -03:00
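
The fix(429) entry in commit 70a4d38d04 above mentions parsing reset times in XhYmZs form (e.g., 27h41m36s) from error bodies, and a later entry mentions capping the maximum retry time at 24 hours. A minimal sketch of how such parsing might look is given here. The function name `parseResetDurationMs`, the regular expressions, and the choice to apply the cap inside the parser are assumptions for illustration; this is not the project's `parseRetryFromErrorText()`.

```typescript
// Hypothetical helper (illustration only, not the code in accountFallback.ts).
// The 24-hour cap is taken from the commit message; applying it here is an assumption.
const MAX_RETRY_MS = 24 * 60 * 60 * 1000;

// Milliseconds per unit letter used in the XhYmZs format.
const UNIT_MS: Record<string, number> = {
  h: 60 * 60 * 1000,
  m: 60 * 1000,
  s: 1000,
};

function parseResetDurationMs(text: string): number | null {
  // Find the first token made only of <digits><h|m|s> groups, e.g. "27h41m36s" or "36s".
  // The trailing \b rejects tokens such as "5ms".
  const token = /\b(?:\d+[hms])+\b/.exec(text);
  if (!token) return null;

  // Split the token into its groups and sum them in milliseconds.
  const parts = token[0].match(/\d+[hms]/g) ?? [];
  let total = 0;
  for (const part of parts) {
    const unit = part.slice(-1);
    const amount = parseInt(part, 10);
    total += amount * (UNIT_MS[unit] ?? 0);
  }

  // Never wait longer than the cap.
  return Math.min(total, MAX_RETRY_MS);
}
```

With this sketch, an error text containing `41m36s` yields 2496000 ms, while `27h41m36s` exceeds the cap and yields 86400000 ms; text with no duration token yields `null`.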