feat(release): v2.7.2 — fix light mode contrast in logs UI

- fix(logs): text colors in filter buttons + combo badge now have dark: variants
- Bumped version to 2.7.2
- Updated CHANGELOG and openapi.yaml
Merge pull request #433 from diegosouzapw/fix/issue-378-logs-light-mode-contrast
2026-03-18 00:42:22 -03:00 · 2026-03-18 00:41:28 -03:00 · 2026-03-17 16:46:27 -03:00 · 2026-03-17 16:27:31 -03:00 · 2026-03-17 16:18:36 -03:00 · 2026-03-17 16:18:12 -03:00
98 changed files with 5407 additions and 195 deletions
@@ -32,6 +32,27 @@ Version format: `2.x.y` — examples:
 npm version patch --no-git-tag-version
 ```

+> **⚠️ ATOMIC COMMIT RULE — Version bump MUST happen before committing feature files.**
+>
+> **CORRECT order:**
+>
+> 1. `npm version patch --no-git-tag-version` ← bump first
+> 2. implement features / fix bugs
+> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
+>
+> **OR, if the features are already implemented but not yet committed:**
+>
+> 1. implement features (do NOT commit yet)
+> 2. `npm version patch --no-git-tag-version` ← bump before committing
+> 3. `git add -A && git commit -m "chore(release): v2.x.y — all changes in ONE commit"`
+>
+> **NEVER do this (creates version mismatch in git history):**
+>
+> - ~~commit features → then bump version → commit package.json separately~~
+>
+> This ensures that `git show v2.x.y` always contains both code changes and the version bump together.
+> The GitHub release tag will point to a commit that includes ALL changes for that version.
+
 ### 2. Regenerate lock file (REQUIRED after version bump)

 **Mandatory** — skipping causes `@swc/helpers` lock mismatch and CI failures:
@@ -3,6 +3,11 @@ data/
 **/data/
 **/db.json

+# VS Code extension test runtime (large binary, not needed in npm package)
+app/vscode-extension/
+**/data/
+**/db.json
+
 # Source code (pre-built app/ is published instead)
 src/
 open-sse/
@@ -4,6 +4,143 @@

 ---

+## [2.7.2] — 2026-03-18
+
+> Sprint: Light mode UI contrast fixes.
+
+### 🐛 Bug Fixes
+
+- **fix(logs)**: Fix light mode contrast in request logs filter buttons and combo badge (#378)
+  - Error/Success/Combo filter buttons now readable in light mode
+  - Combo row badge uses stronger violet in light mode
+
+---
+
+## [2.7.1] — 2026-03-17
+
+> Sprint: Unified web search routing (POST /v1/search) with 5 providers + Next.js 16.1.7 security fixes (6 CVEs).
+
+### ✨ New Features
+
+- **feat(search)**: Unified web search routing — `POST /v1/search` with 5 providers (Serper, Brave, Perplexity, Exa, Tavily)
+  - Auto-failover across providers, 6,500+ free searches/month
+  - In-memory cache with request coalescing (configurable TTL)
+  - Dashboard: Search Analytics tab in `/dashboard/analytics` with provider breakdown, cache hit rate, cost tracking
+  - New API: `GET /api/v1/search/analytics` for search request statistics
+  - DB migration: `request_type` column on `call_logs` for non-chat request tracking
+  - Zod validation (`v1SearchSchema`), auth-gated, cost recorded via `recordCost()`
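
The "in-memory cache with request coalescing" item above can be illustrated with a minimal sketch. This is not the code in `open-sse/services/searchCache.ts`; the class name `CoalescingCache`, its constructor parameters, and the injectable clock are all assumptions made for illustration. The idea is that concurrent callers asking for the same key share one in-flight promise, and the resolved value is then served from memory until the TTL expires.

```typescript
// Hypothetical sketch of a TTL cache with request coalescing.
// None of these names come from the OmniRoute source.
type CacheEntry<V> = { value: V; expiresAt: number };

class CoalescingCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private inFlight = new Map<string, Promise<V>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string, load: () => Promise<V>): Promise<V> {
    // 1. Serve a fresh cached value if one exists.
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return Promise.resolve(hit.value);

    // 2. Join a request that is already in progress for this key.
    const pending = this.inFlight.get(key);
    if (pending) return pending;

    // 3. Otherwise start the load and remember it until it settles.
    const promise = load().then(
      (value) => {
        this.inFlight.delete(key);
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
        return value;
      },
      (err) => {
        this.inFlight.delete(key); // failures are not cached
        throw err;
      },
    );
    this.inFlight.set(key, promise);
    return promise;
  }
}
```

In this sketch a failed load is not cached, so the next caller retries the provider. Whether the real cache behaves that way is not stated in the changelog.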
+
+### 🔒 Security
+
+- **deps**: Next.js 16.1.6 → 16.1.7 — fixes 6 CVEs:
+  - **Critical**: CVE-2026-29057 (HTTP request smuggling via http-proxy)
+  - **High**: CVE-2026-27977, CVE-2026-27978 (WebSocket + Server Actions)
+  - **Medium**: CVE-2026-27979, CVE-2026-27980, CVE-2026-jcc7
+
+### 📁 New Files
+
+| File                                                             | Purpose                                    |
+| ---------------------------------------------------------------- | ------------------------------------------ |
+| `open-sse/handlers/search.ts`                                    | Search handler with 5-provider routing     |
+| `open-sse/config/searchRegistry.ts`                              | Provider registry (auth, cost, quota, TTL) |
+| `open-sse/services/searchCache.ts`                               | In-memory cache with request coalescing    |
+| `src/app/api/v1/search/route.ts`                                 | Next.js route (POST + GET)                 |
+| `src/app/api/v1/search/analytics/route.ts`                       | Search stats API                           |
+| `src/app/(dashboard)/dashboard/analytics/SearchAnalyticsTab.tsx` | Analytics dashboard tab                    |
+| `src/lib/db/migrations/007_search_request_type.sql`              | DB migration                               |
+| `tests/unit/search-registry.test.mjs`                            | 277 lines of unit tests                    |
+
+---
+
+## [2.7.0] — 2026-03-17
+
+> Sprint: ClawRouter-inspired features — toolCalling flag, multilingual intent detection, benchmark-driven fallback, request deduplication, pluggable RouterStrategy, Grok-4 Fast + GLM-5 + MiniMax M2.5 + Kimi K2.5 pricing.
+
+### ✨ New Models & Pricing
+
+- **feat(pricing)**: xAI Grok-4 Fast — `$0.20/$0.50 per 1M tokens`, 1143ms p50 latency, tool calling supported
+- **feat(pricing)**: xAI Grok-4 (standard) — `$0.20/$1.50 per 1M tokens`, reasoning flagship
+- **feat(pricing)**: GLM-5 via Z.AI — `$0.5/1M`, 128K output context
+- **feat(pricing)**: MiniMax M2.5 — `$0.30/1M input`, reasoning + agentic tasks
+- **feat(pricing)**: DeepSeek V3.2 — updated pricing `$0.27/$1.10 per 1M`
+- **feat(pricing)**: Kimi K2.5 via Moonshot API — direct Moonshot API access
+- **feat(providers)**: Z.AI provider added (`zai` alias) — GLM-5 family with 128K output
+
+### 🧠 Routing Intelligence
+
+- **feat(registry)**: `toolCalling` flag per model in provider registry — combos can now prefer/require tool-calling capable models
+- **feat(scoring)**: Multilingual intent detection for AutoCombo scoring — PT/ZH/ES/AR script/language patterns influence model selection per request context
+- **feat(fallback)**: Benchmark-driven fallback chains — real latency data (p50 from `comboMetrics`) used to re-order fallback priority dynamically
+- **feat(dedup)**: Request deduplication via content-hash — 5-second idempotency window prevents duplicate provider calls from retrying clients
+- **feat(router)**: Pluggable `RouterStrategy` interface in `autoCombo/routerStrategy.ts` — custom routing logic can be injected without modifying core
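
The request deduplication entry above (content hash plus a 5-second idempotency window) can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `RequestDeduplicator`, `contentHash`, and `isDuplicate` are hypothetical names, the hash is a simple FNV-1a style stand-in where a real implementation would more likely use a cryptographic hash such as SHA-256, and the sketch only detects duplicates rather than replaying the original response.

```typescript
// Hypothetical sketch of content-hash deduplication with a time window.
// FNV-1a style 32-bit hash over UTF-16 code units, used here only to keep
// the example dependency-free.
function contentHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

class RequestDeduplicator {
  // hash of the request body -> time (ms) the body was first seen
  private seen = new Map<string, number>();
  private windowMs: number;

  constructor(windowMs: number = 5000) {
    this.windowMs = windowMs;
  }

  // Returns true when an identical body was already seen inside the window.
  isDuplicate(body: unknown, nowMs: number): boolean {
    // Drop entries whose window has elapsed.
    this.seen.forEach((seenAt, key) => {
      if (nowMs - seenAt >= this.windowMs) this.seen.delete(key);
    });

    // JSON.stringify is key-order sensitive; a real version might canonicalize.
    const key = contentHash(JSON.stringify(body));
    if (this.seen.has(key)) return true;
    this.seen.set(key, nowMs);
    return false;
  }
}
```

Passing the clock value in as `nowMs` keeps the sketch deterministic; a caller would normally supply `Date.now()`.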
+
+### 🔧 MCP Server Improvements
+
+- **feat(mcp)**: 2 new advanced tool schemas: `omniroute_get_provider_metrics` (p50/p95/p99 per provider) and `omniroute_explain_route` (routing decision explanation)
+- **feat(mcp)**: MCP tool auth scopes updated — `metrics:read` scope added for provider metrics tools
+- **feat(mcp)**: `omniroute_best_combo_for_task` now accepts `languageHint` parameter for multilingual routing
+
+### 📊 Observability
+
+- **feat(metrics)**: `comboMetrics.ts` extended with real-time latency percentile tracking per provider/account
+- **feat(health)**: Health API (`/api/monitoring/health`) now returns per-provider `p50Latency` and `errorRate` fields
+- **feat(usage)**: Usage history migration for per-model latency tracking
+
+### 🗄️ DB Migrations
+
+- **feat(migrations)**: New column `latency_p50` in `combo_metrics` table — zero-breaking, safe for existing users
+
+### 🐛 Bug Fixes / Closures
+
+- **close(#411)**: better-sqlite3 hashed module resolution on Windows — fixed in v2.6.10 (f02c5b5)
+- **close(#409)**: GitHub Copilot chat completions fail with Claude models when files attached — fixed in v2.6.9 (838f1d6)
+- **close(#405)**: Duplicate of #411 — resolved
+
+## [2.6.10] — 2026-03-17
+
+> Windows fix: better-sqlite3 prebuilt download without node-gyp/Python/MSVC (#426).
+
+### 🐛 Bug Fixes
+
+- **fix(install/#426)**: On Windows, `npm install -g omniroute` used to fail with `better_sqlite3.node is not a valid Win32 application` because the bundled native binary was compiled for Linux. Adds **Strategy 1.5** to `scripts/postinstall.mjs`: uses `@mapbox/node-pre-gyp install --fallback-to-build=false` (bundled within `better-sqlite3`) to download the correct prebuilt binary for the current OS/arch without requiring any build tools (no node-gyp, no Python, no MSVC). Falls back to `npm rebuild` only if the download fails. Adds platform-specific error messages with clear manual fix instructions.
+
+---
+
+## [2.6.9] — 2026-03-17
+
+> CI fixes (t11 any-budget), bug fix #409 (file attachments via Copilot+Claude), release workflow correction.
+
+### 🐛 Bug Fixes
+
+- **fix(ci)**: Remove word "any" from comments in `openai-responses.ts` and `chatCore.ts` that were failing the t11 `\bany\b` budget check (false positive from regex counting comments)
+- **fix(chatCore)**: Normalize unsupported content part types before forwarding to providers (#409 — Cursor sends `{type:"file"}` when `.md` files are attached; Copilot and other OpenAI-compat providers reject with "type has to be either 'image_url' or 'text'"; fix converts `file`/`document` blocks to `text` and drops unknown types)
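
The normalization described in the `fix(chatCore)` entry above can be pictured with a short sketch. It assumes a hypothetical `normalizeContentParts` helper and assumes the attachment text sits in a `text` or `content` field; the actual field names and the real code in `chatCore.ts` may differ. Supported parts pass through, `file`/`document` parts become `text`, and every other type is dropped.

```typescript
// Hypothetical sketch: normalize message content parts before forwarding
// them to an OpenAI-compatible provider that accepts only text and image_url.
type ContentPart = { type: string; [key: string]: unknown };

function normalizeContentParts(parts: ContentPart[]): ContentPart[] {
  const out: ContentPart[] = [];
  for (const part of parts) {
    if (part.type === "text" || part.type === "image_url") {
      out.push(part); // already accepted by the provider
    } else if (part.type === "file" || part.type === "document") {
      // Assumption: the attachment body is available as a string field.
      const text = part.text;
      const content = part.content;
      const raw =
        typeof text === "string" ? text : typeof content === "string" ? content : "";
      if (raw) out.push({ type: "text", text: raw });
    }
    // Any other part type is dropped.
  }
  return out;
}
```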
+
+### 🔧 Workflow
+
+- **chore(generate-release)**: Add ATOMIC COMMIT RULE — version bump (`npm version patch`) MUST happen before committing feature files to ensure tag always points to a commit containing all version changes together
+
+---
+
+## [2.6.8] — 2026-03-17
+
+> Sprint: Combo as Agent (system prompt + tool filter), Context Caching Protection, Auto-Update, Detailed Logs, MITM Kiro IDE.
+
+### 🗄️ DB Migrations (zero-breaking — safe for existing users)
+
+- **005_combo_agent_fields.sql**: `ALTER TABLE combos ADD COLUMN system_message TEXT DEFAULT NULL`, `tool_filter_regex TEXT DEFAULT NULL`, `context_cache_protection INTEGER DEFAULT 0`
+- **006_detailed_request_logs.sql**: New `request_detail_logs` table with 500-entry ring-buffer trigger, opt-in via settings toggle
+
+### ✨ Features
+
+- **feat(combo)**: System Message Override per Combo (#399 — `system_message` field replaces or injects system prompt before forwarding to provider)
+- **feat(combo)**: Tool Filter Regex per Combo (#399 — `tool_filter_regex` keeps only tools matching pattern; supports OpenAI + Anthropic formats)
+- **feat(combo)**: Context Caching Protection (#401 — `context_cache_protection` tags responses with `<omniModel>provider/model</omniModel>` and pins model for session continuity)
+- **feat(settings)**: Auto-Update via Settings (#320 — `GET /api/system/version` + `POST /api/system/update` — checks npm registry and updates in background with pm2 restart)
+- **feat(logs)**: Detailed Request Logs (#378 — captures full pipeline bodies at 4 stages: client request, translated request, provider response, client response — opt-in toggle, 64KB trim, 500-entry ring-buffer)
+- **feat(mitm)**: MITM Kiro IDE profile (#336 — `src/mitm/targets/kiro.ts` targets api.anthropic.com, reuses existing MITM infrastructure)
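
As a rough illustration of the Tool Filter Regex feature listed above, the sketch below keeps only the tools whose name matches a pattern and handles both tool shapes. The helper names `toolName` and `filterTools` are hypothetical, and the two type definitions are trimmed-down assumptions about the OpenAI and Anthropic tool formats, not the project's own types.

```typescript
// Hypothetical sketch: filter a tool list by name using a regex.
type OpenAITool = { type: "function"; function: { name: string } };
type AnthropicTool = { name: string; input_schema?: unknown };
type Tool = OpenAITool | AnthropicTool;

// OpenAI nests the name under `function`; Anthropic keeps it at the top level.
function toolName(tool: Tool): string {
  return "function" in tool ? tool.function.name : tool.name;
}

// A null or empty pattern means "no filter", mirroring a NULL column default.
function filterTools(tools: Tool[], pattern: string | null): Tool[] {
  if (!pattern) return tools;
  const re = new RegExp(pattern);
  return tools.filter((tool) => re.test(toolName(tool)));
}
```

For example, a pattern such as `^(read|search)_` would keep `read_file` and `search_web` while removing `delete_repo`.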
+
+---
+
 ## [2.6.7] — 2026-03-17

 > Sprint: SSE improvements, local provider_nodes extensions, proxy registry, Claude passthrough fixes.
@@ -4,7 +4,7 @@

 _Your universal API proxy — one endpoint, 44+ providers, zero downtime. Now with **MCP & A2A** agent orchestration._

-**Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • MCP Server • A2A Protocol • 100% TypeScript**
+**Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • Web Search • MCP Server • A2A Protocol • 100% TypeScript**

 ---

@@ -898,27 +898,44 @@ When minimized, OmniRoute lives in your system tray with quick actions:

 ## 💰 Pricing at a Glance

-| Tier                | Provider          | Cost                   | Quota Reset      | Best For                |
-| ------------------- | ----------------- | ---------------------- | ---------------- | ----------------------- |
-| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo                 | 5h + weekly      | Already subscribed      |
-|                     | Codex (Plus/Pro)  | $20-200/mo             | 5h + weekly      | OpenAI users            |
-|                     | Gemini CLI        | **FREE**               | 180K/mo + 1K/day | Everyone!               |
-|                     | GitHub Copilot    | $10-19/mo              | Monthly          | GitHub users            |
-| **🔑 API KEY**      | NVIDIA NIM        | **FREE** (dev forever) | ~40 RPM          | 70+ open models         |
-|                     | Cerebras          | **FREE** (1M tok/day)  | 60K TPM / 30 RPM | World's fastest         |
-|                     | Groq              | **FREE** (30 RPM)      | 14.4K RPD        | Ultra-fast Llama/Gemma  |
-|                     | DeepSeek          | Pay-per-use            | None             | Best price/quality      |
-|                     | xAI (Grok)        | Pay-per-use            | None             | Grok models             |
-|                     | Mistral           | Free trial + paid      | Rate limited     | European AI             |
-|                     | OpenRouter        | Pay-per-use            | None             | 100+ models aggr.       |
-| **💰 CHEAP**        | GLM-4.7           | $0.6/1M                | Daily 10AM       | Budget backup           |
-|                     | MiniMax M2.1      | $0.2/1M                | 5-hour rolling   | Cheapest option         |
-|                     | Kimi K2           | $9/mo flat             | 10M tokens/mo    | Predictable cost        |
-| **🆓 FREE**         | iFlow             | **$0**                 | Unlimited        | 5 models unlimited      |
-|                     | Qwen              | **$0**                 | Unlimited        | 4 models unlimited      |
-|                     | Kiro              | **$0**                 | Unlimited        | Claude (AWS Builder ID) |
+| Tier                | Provider                    | Cost                      | Quota Reset      | Best For                          |
+| ------------------- | --------------------------- | ------------------------- | ---------------- | --------------------------------- |
+| **💳 SUBSCRIPTION** | Claude Code (Pro)           | $20/mo                    | 5h + weekly      | Already subscribed                |
+|                     | Codex (Plus/Pro)            | $20-200/mo                | 5h + weekly      | OpenAI users                      |
+|                     | Gemini CLI                  | **FREE**                  | 180K/mo + 1K/day | Everyone!                         |
+|                     | GitHub Copilot              | $10-19/mo                 | Monthly          | GitHub users                      |
+| **🔑 API KEY**      | NVIDIA NIM                  | **FREE** (dev forever)    | ~40 RPM          | 70+ open models                   |
+|                     | Cerebras                    | **FREE** (1M tok/day)     | 60K TPM / 30 RPM | World's fastest                   |
+|                     | Groq                        | **FREE** (30 RPM)         | 14.4K RPD        | Ultra-fast Llama/Gemma            |
+|                     | DeepSeek V3.2               | $0.27/$1.10 per 1M        | None             | Best price/quality reasoning      |
+|                     | xAI Grok-4 Fast             | **$0.20/$0.50 per 1M** 🆕 | None             | Fastest + tool calling, ultralow  |
+|                     | xAI Grok-4 (standard)       | $0.20/$1.50 per 1M 🆕     | None             | Reasoning flagship from xAI       |
+|                     | Mistral                     | Free trial + paid         | Rate limited     | European AI                       |
+|                     | OpenRouter                  | Pay-per-use               | None             | 100+ models aggr.                 |
+| **💰 CHEAP**        | GLM-5 (via Z.AI) 🆕         | $0.5/1M                   | Daily 10AM       | 128K output, newest flagship      |
+|                     | GLM-4.7                     | $0.6/1M                   | Daily 10AM       | Budget backup                     |
+|                     | MiniMax M2.5 🆕             | $0.3/1M input             | 5-hour rolling   | Reasoning + agentic tasks         |
+|                     | MiniMax M2.1                | $0.2/1M                   | 5-hour rolling   | Cheapest option                   |
+|                     | Kimi K2.5 (Moonshot API) 🆕 | Pay-per-use               | None             | Direct Moonshot API access        |
+|                     | Kimi K2                     | $9/mo flat                | 10M tokens/mo    | Predictable cost                  |
+| **🆓 FREE**         | iFlow                       | **$0**                    | Unlimited        | 5 models unlimited                |
+|                     | Qwen                        | **$0**                    | Unlimited        | 4 models unlimited                |
+|                     | Kiro                        | **$0**                    | Unlimited        | Claude Sonnet/Haiku (AWS Builder) |

-**💡 $0 Combo Stack:** Gemini CLI (180K/mo) → iFlow (unlimited: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1) → Kiro (Claude for free) → Qwen (4 models, unlimited) — **Zero cost, never stops coding.** When Gemini quota runs out, OmniRoute auto-falls back to iFlow or Kiro with zero config.
+> 🆕 **New models added (Mar 2026):** Grok-4 Fast family at $0.20/$0.50/M (benchmarked at 1143ms — 30% faster than Gemini 2.5 Flash), GLM-5 via Z.AI with 128K output, MiniMax M2.5 reasoning, DeepSeek V3.2 updated pricing, Kimi K2.5 via Moonshot direct API.
+
+**💡 $0 Combo Stack — The Complete Free Setup:**
+
+```
+Gemini CLI (180K/mo free)
+  → iFlow (unlimited: kimi-k2-thinking, qwen3-coder-plus, deepseek-r1)
+  → Kiro (Claude Sonnet 4.5 + Haiku — unlimited, via AWS Builder ID)
+  → Qwen (4 models — unlimited)
+  → Groq (14.4K req/day — ultra-fast)
+  → NVIDIA NIM (70+ models — 40 RPM forever)
+```
+
+**Zero cost. Never stops coding.** Configure this as one OmniRoute combo and all fallbacks happen automatically — no manual switching ever.

 ---

@@ -1027,7 +1044,20 @@ Then in `/dashboard/media` → **Transcription** tab: upload any audio or video

 OmniRoute v2.0 is built as an operational platform, not just a relay proxy.

-### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP
+### 🆕 New — ClawRouter-Inspired Improvements (Mar 2026)
+
+| Feature                              | What It Does                                                                                |
+| ------------------------------------ | ------------------------------------------------------------------------------------------- |
+| ⚡ **Grok-4 Fast Family**            | xAI models at $0.20/$0.50/M — benchmarked 1143ms (30% faster than Gemini 2.5 Flash)         |
+| 🧠 **GLM-5 via Z.AI**                | 128K output context, $0.5/1M — newest flagship from the GLM family                          |
+| 🔮 **MiniMax M2.5**                  | Reasoning + agentic tasks at $0.30/1M — significant upgrade from M2.1                       |
+| 🎯 **toolCalling Flag per Model**    | Per-model `toolCalling: true/false` in registry — AutoCombo skips non-tool-capable models   |
+| 🌍 **Multilingual Intent Detection** | PT/ZH/ES/AR keywords in AutoCombo scoring — better model selection for non-English content  |
+| 📊 **Benchmark-Driven Fallbacks**    | Real p95 latency from live requests feeds combo scoring — AutoCombo learns from actual data |
+| 🔁 **Request Deduplication**         | Content-hash based dedup window — multi-agent safe, prevents duplicate charges              |
+| 🔌 **Pluggable RouterStrategy**      | Extensible `RouterStrategy` interface — add custom routing logic as plugins                 |
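
To make the `toolCalling` flag and the pluggable `RouterStrategy` rows above more concrete, here is a minimal sketch. The interface shape, the field names, and the `route` helper are assumptions for illustration only; the real interface lives in `autoCombo/routerStrategy.ts` and may look quite different. The sketch filters out models that cannot call tools when the request needs them, then delegates ordering to whichever strategy is plugged in.

```typescript
// Hypothetical sketch of a pluggable routing strategy.
interface Candidate {
  id: string;
  costPer1MInput: number; // USD per 1M input tokens
  p50LatencyMs: number; // measured median latency
  toolCalling: boolean; // capability flag from the registry
}

interface RouteRequest {
  needsTools: boolean;
}

interface RouterStrategy {
  name: string;
  rank(candidates: Candidate[], request: RouteRequest): Candidate[];
}

// Two interchangeable strategies; each returns a new, sorted array.
const cheapestFirst: RouterStrategy = {
  name: "cost",
  rank: (candidates) =>
    candidates.slice().sort((a, b) => a.costPer1MInput - b.costPer1MInput),
};

const fastestFirst: RouterStrategy = {
  name: "latency",
  rank: (candidates) =>
    candidates.slice().sort((a, b) => a.p50LatencyMs - b.p50LatencyMs),
};

// The core only filters by capability and then defers to the strategy.
function route(
  candidates: Candidate[],
  request: RouteRequest,
  strategy: RouterStrategy,
): Candidate[] {
  const eligible = candidates.filter((c) => !request.needsTools || c.toolCalling);
  return strategy.rank(eligible, request);
}
```

Because the strategy is passed in as a value, custom ordering logic can be added without touching `route`, which is the property the table row describes.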
+
+### 🚀 Previous v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                                                                                                            |
 | ------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
@@ -1075,16 +1105,17 @@ OmniRoute v2.0 is built as an operational platform, not just a relay proxy.

 ### 🎵 Multi-Modal APIs

-| Feature                    | What It Does                                                  |
-| -------------------------- | ------------------------------------------------------------- |
-| 🖼️ **Image Generation**    | `/v1/images/generations` with cloud and local backends        |
-| 📐 **Embeddings**          | `/v1/embeddings` for search and RAG pipelines                 |
-| 🎤 **Audio Transcription** | `/v1/audio/transcriptions` (Whisper and additional providers) |
-| 🔊 **Text-to-Speech**      | `/v1/audio/speech` (multiple engines/providers)               |
-| 🎬 **Video Generation**    | `/v1/videos/generations` (ComfyUI + SD WebUI workflows)       |
-| 🎵 **Music Generation**    | `/v1/music/generations` (ComfyUI workflows)                   |
-| 🛡️ **Moderations**         | `/v1/moderations` safety checks                               |
-| 🔀 **Reranking**           | `/v1/rerank` for relevance scoring                            |
+| Feature                    | What It Does                                                                                                 |
+| -------------------------- | ------------------------------------------------------------------------------------------------------------ |
+| 🖼️ **Image Generation**    | `/v1/images/generations` with cloud and local backends                                                       |
+| 📐 **Embeddings**          | `/v1/embeddings` for search and RAG pipelines                                                                |
+| 🎤 **Audio Transcription** | `/v1/audio/transcriptions` (Whisper and additional providers)                                                |
+| 🔊 **Text-to-Speech**      | `/v1/audio/speech` (multiple engines/providers)                                                              |
+| 🎬 **Video Generation**    | `/v1/videos/generations` (ComfyUI + SD WebUI workflows)                                                      |
+| 🎵 **Music Generation**    | `/v1/music/generations` (ComfyUI workflows)                                                                  |
+| 🛡️ **Moderations**         | `/v1/moderations` safety checks                                                                              |
+| 🔀 **Reranking**           | `/v1/rerank` for relevance scoring                                                                           |
+| 🔍 **Web Search** 🆕       | `/v1/search` — 5 providers (Serper, Brave, Perplexity, Exa, Tavily), 6,500+ free/month, auto-failover, cache |

 ### 🛡️ Resilience, Security & Governance

@@ -8,6 +8,16 @@ _وكيل API العالمي الخاص بك - نقطة نهاية واحدة،

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy**: rule, cost, and latency strategies
+- **Multilingual intent detection**: routing scoring in more than 30 languages
+- **Request deduplication**: avoid duplicate API calls via content hashing
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, and Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 <div align="center">

 [![إصدار npm](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
@@ -8,6 +8,16 @@ _Вашият универсален API прокси — една крайна

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 <div align="center">

 [![npm версия](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
@@ -8,6 +8,16 @@ _Din universelle API-proxy — ét slutpunkt, 36+ udbydere, ingen nedetid. Nu me

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 <div align="center">

 [![npm version](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
@@ -8,6 +8,16 @@ _Ihr universeller API-Proxy – ein Endpunkt, mehr als 36 Anbieter, keine Ausfal

 ---

+### 🆕 New in v2.7.0
+
+- **Extensible RouterStrategy**: rule, cost, and latency strategies
+- **Multilingual intent detection**: routing scoring in 30+ languages
+- **Request deduplication**: avoid duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 <div align="center">

 [![npm-Version](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
@@ -11,6 +11,16 @@ _Tu proxy de API universal — un endpoint, 36+ proveedores, cero tiempo de inac

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy**: rule, cost, and latency strategies
+- **Multilingual intent detection**: routing scoring in 30+ languages
+- **Request deduplication**: avoids duplicate calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Universaali API-välityspalvelin – yksi päätepiste, yli 36 palveluntarjoaja

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Votre proxy API universel — un endpoint, 36+ fournisseurs, zéro temps d'arr

 ---

+### 🆕 What's New in v2.7.0
+
+- **Extensible RouterStrategy**: rule, cost, and latency strategies
+- **Multilingual intent detection**: routing scoring in 30+ languages
+- **Request deduplication**: avoids duplicate calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _שרת ה-API האוניברסלי שלך - נקודת קצה אחת, 36+ ספ

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Az univerzális API-proxy – egy végpont, 36+ szolgáltató, nulla állásid

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Proksi API universal Anda — satu titik akhir, 36+ penyedia, tanpa waktu henti

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -13,6 +13,16 @@ _आपका सार्वभौमिक एपीआई प्रॉक्

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Il tuo proxy API universale — un endpoint, 36+ provider, zero downtime._

 ---

+### 🆕 What's New in v2.7.0
+
+- **Extensible RouterStrategy**: strategies for rules, cost, and latency
+- **Multilingual intent detection**: routing scoring in 30+ languages
+- **Request deduplication**: avoids duplicate calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _ユニバーサル API プロキシ — 1 つのエンドポイント、36 以

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy**: supports rule, cost, and latency strategies
+- **Multilingual intent detection**: routing scoring in more than 30 languages
+- **Request deduplication**: prevents duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Pricing update:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _범용 API 프록시 — 하나의 엔드포인트, 36개 이상의 공급자,

 ---

+### 🆕 v2.7.0 새로운 기능
+
+- **플러그형 RouterStrategy** — 규칙, 비용, 지연 전략 지원
+- **다국어 의도 감지** — 30개 이상 언어로 라우팅 스코어링
+- **요청 중복 제거** — 콘텐츠 해시로 중복 API 호출 방지
+- **새 공급자:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **가격 업데이트:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Proksi API universal anda — satu titik akhir, 36+ pembekal, masa henti sifar.

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Uw universele API-proxy: één eindpunt, meer dan 36 providers, geen downtime._

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Din universelle API-proxy – ett endepunkt, 36+ leverandører, null nedetid._

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Iyong unibersal na API proxy — isang endpoint, 36+ provider, zero downtime._

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Twój uniwersalny serwer proxy API — jeden punkt końcowy, ponad 36 dostawcó

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Seu proxy de API universal — um endpoint, 36+ provedores, zero tempo de inati

 ---

+### 🆕 Novidades na v2.7.0
+
+- **RouterStrategy plugável** — estratégias de regras, custo e latência
+- **Detecção de intenção multilíngue** — scoring de roteamento em 30+ idiomas
+- **Deduplicação de requisições** — evita chamadas duplicadas por hash de conteúdo
+- **Novos provedores:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Preços atualizados:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Seu proxy de API universal — um endpoint, mais de 36 provedores, tempo de ina

 ---

+### 🆕 Novidades na v2.7.0
+
+- **RouterStrategy extensível** — estratégias de regras, custo e latência
+- **Deteção de intenção multilíngue** — scoring de encaminhamento em 30+ idiomas
+- **Deduplicação de pedidos** — evita chamadas duplicadas por hash de conteúdo
+- **Novos fornecedores:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Preços atualizados:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Proxy-ul dvs. universal API - un punct final, peste 36 de furnizori, zero timpi

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Ваш универсальный API-прокси — одна точка до

 ---

+### 🆕 Новое в v2.7.0
+
+- **Подключаемая RouterStrategy** — стратегии по правилам, стоимости и задержке
+- **Многоязычное распознавание намерений** — маршрутизация на 30+ языках
+- **Дедупликация запросов** — устранение дублей по хэшу содержимого
+- **Новые провайдеры:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Обновлённые цены:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Váš univerzálny proxy server API – jeden koncový bod, 36+ poskytovateľov

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Din universella API-proxy — en slutpunkt, 36+ leverantörer, noll driftstopp.

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _พร็อกซี API สากลของคุณ — จุดสิ้

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Ваш універсальний API-проксі — одна кінцева

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _Proxy API phổ quát của bạn — một điểm cuối, hơn 36 nhà cung c

 ---

+### 🆕 What's New in v2.7.0
+
+- **Pluggable RouterStrategy** — rules, cost, and latency routing strategies
+- **Multilingual intent detection** — routing scoring in 30+ languages
+- **Request deduplication** — prevent duplicate API calls via content hash
+- **New providers:** Grok-4 Fast (xAI), GLM-5 / Z.AI, MiniMax M2.5, Kimi K2.5
+- **Updated pricing:** Grok-4 Fast $0.20/$0.50/M, GLM-5 $0.50/M, MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -11,6 +11,16 @@ _您的通用 API 代理 — 一个端点，36+ 提供商，零停机时间。_

 ---

+### 🆕 v2.7.0 新功能
+
+- **可插拔 RouterStrategy** — 支持规则、成本和延迟策略
+- **多语言意图检测** — 支持 30+ 语言的路由评分
+- **请求去重** — 基于内容哈希避免重复 API 调用
+- **新增提供商：** Grok-4 Fast (xAI)、GLM-5 / Z.AI、MiniMax M2.5、Kimi K2.5
+- **价格更新：** Grok-4 Fast $0.20/$0.50/M，GLM-5 $0.50/M，MiniMax M2.5 $0.30/M
+
+---
+
 ### 🚀 New in v2.0.9+ — Playground, CLI Fingerprints & ACP

 | Feature                                    | What It Does                                                                                                                                  |
@@ -1,7 +1,7 @@
 openapi: 3.1.0
 info:
  title: OmniRoute API
-  version: 2.6.7
+  version: 2.7.2
  description: |
    OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
    endpoint that routes requests to multiple AI providers with load balancing,
@@ -11,6 +11,7 @@
 export interface RegistryModel {
  id: string;
  name: string;
+  toolCalling?: boolean;
  targetFormat?: string;
  unsupportedParams?: readonly string[];
 }
@@ -114,6 +115,7 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    },
    models: [
      { id: "claude-opus-4-6", name: "Claude Opus 4.6" },
+      { id: "claude-sonnet-4-6", name: "Claude 4.6 Sonnet" },
      { id: "claude-opus-4-5-20251101", name: "Claude 4.5 Opus" },
      { id: "claude-sonnet-4-5-20250929", name: "Claude 4.5 Sonnet" },
      { id: "claude-haiku-4-5-20251001", name: "Claude 4.5 Haiku" },
@@ -139,6 +141,9 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      clientSecretDefault: "",
    },
    models: [
+      { id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
+      { id: "gemini-3-1-pro", name: "Gemini 3.1 Pro (Alt ID)" },
+      { id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" },
      { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
      { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
      { id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
@@ -168,6 +173,9 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      clientSecretDefault: "",
    },
    models: [
+      { id: "gemini-3.1-pro", name: "Gemini 3.1 Pro" },
+      { id: "gemini-3-1-pro", name: "Gemini 3.1 Pro (Alt ID)" },
+      { id: "gemini-3.1-pro-preview", name: "Gemini 3.1 Pro Preview" },
      { id: "gemini-2.5-pro", name: "Gemini 2.5 Pro" },
      { id: "gemini-2.5-flash", name: "Gemini 2.5 Flash" },
      { id: "gemini-2.5-flash-lite", name: "Gemini 2.5 Flash Lite" },
@@ -460,8 +468,13 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      "Anthropic-Version": "2023-06-01",
    },
    models: [
+      { id: "claude-haiku-4.5", name: "Claude Haiku 4.5" },
      { id: "claude-sonnet-4-20250514", name: "Claude Sonnet 4" },
+      { id: "claude-sonnet-4-6-20251031", name: "Claude Sonnet 4.6 (Dated)" },
+      { id: "claude-sonnet-4.6", name: "Claude Sonnet 4.6" },
      { id: "claude-opus-4-20250514", name: "Claude Opus 4" },
+      { id: "claude-opus-4-6-20251031", name: "Claude Opus 4.6 (Dated)" },
+      { id: "claude-opus-4.6", name: "Claude Opus 4.6" },
      { id: "claude-3-5-sonnet-20241022", name: "Claude 3.5 Sonnet" },
    ],
  },
@@ -495,6 +508,8 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
    },
    models: [
+      { id: "glm-5", name: "GLM 5" },
+      { id: "glm-5-turbo", name: "GLM 5 Turbo" },
      { id: "glm-4.7-flash", name: "GLM 4.7 Flash" },
      { id: "glm-4.7", name: "GLM 4.7" },
      { id: "glm-4.6v", name: "GLM 4.6V (Vision)" },
@@ -506,6 +521,25 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    ],
  },

+  zai: {
+    id: "zai",
+    alias: "zai",
+    format: "claude",
+    executor: "default",
+    baseUrl: "https://api.z.ai/api/anthropic/v1/messages",
+    urlSuffix: "?beta=true",
+    authType: "apikey",
+    authHeader: "x-api-key",
+    headers: {
+      "Anthropic-Version": "2023-06-01",
+      "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
+    },
+    models: [
+      { id: "glm-5", name: "GLM 5" },
+      { id: "glm-5-turbo", name: "GLM 5 Turbo" },
+    ],
+  },
+
  kimi: {
    id: "kimi",
    alias: "kimi",
@@ -637,7 +671,11 @@ export const REGISTRY: Record<string, RegistryEntry> = {
      "Anthropic-Version": "2023-06-01",
      "Anthropic-Beta": "claude-code-20250219,interleaved-thinking-2025-05-14",
    },
-    models: [{ id: "MiniMax-M2.1", name: "MiniMax M2.1" }],
+    models: [
+      { id: "minimax-m2.5", name: "MiniMax M2.5" },
+      { id: "MiniMax-M2.5", name: "MiniMax M2.5 (Legacy Alias)" },
+      { id: "MiniMax-M2.1", name: "MiniMax M2.1" },
+    ],
  },

  "minimax-cn": {
@@ -655,6 +693,8 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    },
    models: [
      // Keep parity with minimax to ensure model discovery works for minimax-cn connections.
+      { id: "minimax-m2.5", name: "MiniMax M2.5" },
+      { id: "MiniMax-M2.5", name: "MiniMax M2.5 (Legacy Alias)" },
      { id: "MiniMax-M2.1", name: "MiniMax M2.1" },
    ],
  },
@@ -717,10 +757,14 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    authType: "apikey",
    authHeader: "bearer",
    models: [
-      { id: "grok-4", name: "Grok 4" },
+      { id: "grok-4-fast-non-reasoning", name: "Grok 4 Fast" },
      { id: "grok-4-fast-reasoning", name: "Grok 4 Fast Reasoning" },
-      { id: "grok-code-fast-1", name: "Grok Code Fast" },
+      { id: "grok-4-1-fast-non-reasoning", name: "Grok 4.1 Fast" },
+      { id: "grok-4-1-fast-reasoning", name: "Grok 4.1 Fast Reasoning" },
+      { id: "grok-4-0709", name: "Grok 4 (0709)" },
+      { id: "grok-4", name: "Grok 4" },
      { id: "grok-3", name: "Grok 3" },
+      { id: "grok-3-mini", name: "Grok 3 Mini" },
    ],
  },

@@ -849,7 +893,10 @@ export const REGISTRY: Record<string, RegistryEntry> = {
    authType: "apikey",
    authHeader: "bearer",
    models: [
+      { id: "gpt-oss-120b", name: "GPT OSS 120B", toolCalling: false },
+      { id: "openai/gpt-oss-120b", name: "GPT OSS 120B (OpenAI Prefix)", toolCalling: false },
      { id: "meta/llama-3.3-70b-instruct", name: "Llama 3.3 70B" },
+      { id: "nvidia/llama-3.3-70b-instruct", name: "Llama 3.3 70B (NVIDIA Prefix)" },
      { id: "meta/llama-4-maverick-17b-128e-instruct", name: "Llama 4 Maverick" },
      { id: "moonshotai/kimi-k2.5", name: "Kimi K2.5" },
      { id: "z-ai/glm4.7", name: "GLM 4.7" },
@@ -0,0 +1,155 @@
+/**
+ * Search Provider Registry
+ *
+ * Defines providers that support the /v1/search endpoint.
+ * Unlike LLM/embedding providers, search providers don't have "models" —
+ * a provider IS the model (Serper = Google SERP, Brave = Brave index).
+ *
+ * API keys are stored in the same provider credentials system,
+ * keyed by provider ID (e.g. "serper-search", "brave-search").
+ * perplexity-search reuses credentials from the "perplexity" chat provider.
+ */
+
+export interface SearchProviderConfig {
+  id: string;
+  name: string;
+  baseUrl: string;
+  method: "GET" | "POST";
+  authType: "apikey";
+  authHeader: string;
+  costPerQuery: number;
+  freeMonthlyQuota: number;
+  searchTypes: string[];
+  defaultMaxResults: number;
+  maxMaxResults: number;
+  timeoutMs: number;
+  cacheTTLMs: number;
+}
+
+export const SEARCH_PROVIDERS: Record<string, SearchProviderConfig> = {
+  "serper-search": {
+    id: "serper-search",
+    name: "Serper Search",
+    baseUrl: "https://google.serper.dev",
+    method: "POST",
+    authType: "apikey",
+    authHeader: "x-api-key",
+    costPerQuery: 0.001,
+    freeMonthlyQuota: 2500,
+    searchTypes: ["web", "news"],
+    defaultMaxResults: 5,
+    maxMaxResults: 100,
+    timeoutMs: 10_000,
+    cacheTTLMs: 5 * 60 * 1000,
+  },
+
+  "brave-search": {
+    id: "brave-search",
+    name: "Brave Search",
+    baseUrl: "https://api.search.brave.com/res/v1",
+    method: "GET",
+    authType: "apikey",
+    authHeader: "x-subscription-token",
+    costPerQuery: 0.005,
+    freeMonthlyQuota: 1000,
+    searchTypes: ["web", "news"],
+    defaultMaxResults: 5,
+    maxMaxResults: 20,
+    timeoutMs: 10_000,
+    cacheTTLMs: 5 * 60 * 1000,
+  },
+
+  "perplexity-search": {
+    id: "perplexity-search",
+    name: "Perplexity Search",
+    baseUrl: "https://api.perplexity.ai/search",
+    method: "POST",
+    authType: "apikey",
+    authHeader: "bearer",
+    costPerQuery: 0.005,
+    freeMonthlyQuota: 0,
+    searchTypes: ["web"],
+    defaultMaxResults: 5,
+    maxMaxResults: 20,
+    timeoutMs: 10_000,
+    cacheTTLMs: 5 * 60 * 1000,
+  },
+
+  "exa-search": {
+    id: "exa-search",
+    name: "Exa Search",
+    baseUrl: "https://api.exa.ai/search",
+    method: "POST",
+    authType: "apikey",
+    authHeader: "x-api-key",
+    costPerQuery: 0.007,
+    freeMonthlyQuota: 1000,
+    searchTypes: ["web", "news"],
+    defaultMaxResults: 5,
+    maxMaxResults: 100,
+    timeoutMs: 10_000,
+    cacheTTLMs: 5 * 60 * 1000,
+  },
+
+  "tavily-search": {
+    id: "tavily-search",
+    name: "Tavily Search",
+    baseUrl: "https://api.tavily.com/search",
+    method: "POST",
+    authType: "apikey",
+    authHeader: "bearer",
+    costPerQuery: 0.008,
+    freeMonthlyQuota: 1000,
+    searchTypes: ["web", "news"],
+    defaultMaxResults: 5,
+    maxMaxResults: 20,
+    timeoutMs: 10_000,
+    cacheTTLMs: 5 * 60 * 1000,
+  },
+};
+
+/**
+ * Credential fallback mapping — search providers that can reuse credentials
+ * from a related provider (e.g., perplexity-search uses the same API key as perplexity chat).
+ */
+export const SEARCH_CREDENTIAL_FALLBACKS: Record<string, string> = {
+  "perplexity-search": "perplexity",
+};
+
+/**
+ * Get search provider config by ID
+ */
+export function getSearchProvider(providerId: string): SearchProviderConfig | null {
+  return SEARCH_PROVIDERS[providerId] || null;
+}
+
+/**
+ * Get all search providers as a flat list
+ */
+export function getAllSearchProviders(): Array<{
+  id: string;
+  name: string;
+  searchTypes: string[];
+}> {
+  return Object.values(SEARCH_PROVIDERS).map((p) => ({
+    id: p.id,
+    name: p.name,
+    searchTypes: p.searchTypes,
+  }));
+}
+
+/**
+ * Select the cheapest available provider.
+ * If an explicit provider is given, validate and return it.
+ * Otherwise, return the cheapest by costPerQuery.
+ */
+export function selectProvider(explicitProvider?: string): SearchProviderConfig | null {
+  if (explicitProvider) {
+    return SEARCH_PROVIDERS[explicitProvider] || null;
+  }
+
+  const providers = Object.values(SEARCH_PROVIDERS);
+  if (providers.length === 0) return null;
+
+  return providers.reduce((cheapest, p) => (p.costPerQuery < cheapest.costPerQuery ? p : cheapest));
+}
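A minimal sketch of how the selection logic above behaves, assuming a trimmed-down registry. `MiniSearchProvider`, `MINI_PROVIDERS`, and `pickProvider` are hypothetical names used only for illustration; the `costPerQuery` values are copied from the registry in this diff. Because the comparison is a strict `<`, ties (for example Brave and Perplexity at 0.005) resolve to whichever entry was inserted first.

```typescript
// Hypothetical, trimmed-down stand-in for SearchProviderConfig:
// only the fields that matter for selection.
interface MiniSearchProvider {
  id: string;
  costPerQuery: number;
}

const MINI_PROVIDERS: Record<string, MiniSearchProvider> = {
  "serper-search": { id: "serper-search", costPerQuery: 0.001 },
  "brave-search": { id: "brave-search", costPerQuery: 0.005 },
  "perplexity-search": { id: "perplexity-search", costPerQuery: 0.005 },
  "exa-search": { id: "exa-search", costPerQuery: 0.007 },
  "tavily-search": { id: "tavily-search", costPerQuery: 0.008 },
};

// Mirrors selectProvider: an explicit ID is looked up as-is,
// otherwise the entry with the lowest costPerQuery is returned.
function pickProvider(explicitId?: string): MiniSearchProvider | null {
  if (explicitId) {
    return MINI_PROVIDERS[explicitId] ?? null;
  }
  const all = Object.keys(MINI_PROVIDERS).map((key) => MINI_PROVIDERS[key]);
  if (all.length === 0) return null;
  return all.reduce((cheapest, p) => (p.costPerQuery < cheapest.costPerQuery ? p : cheapest));
}

const cheapestPick = pickProvider();
const explicitPick = pickProvider("brave-search");
const missingPick = pickProvider("no-such-provider");
console.log(cheapestPick?.id, explicitPick?.id, missingPick);
```

Note that this selection looks only at list price. It does not check whether credentials exist for the chosen provider or whether its free quota is used up, so a caller would presumably need to handle those cases separately.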
@@ -42,6 +42,12 @@ import {
 import { getIdempotencyKey, checkIdempotency, saveIdempotency } from "@/lib/idempotencyLayer";
 import { createProgressTransform, wantsProgress } from "../utils/progressTracker.ts";
 import { isModelUnavailableError, getNextFamilyFallback } from "../services/modelFamilyFallback.ts";
+import { computeRequestHash, deduplicate, shouldDeduplicate } from "../services/requestDedup.ts";
+import {
+  shouldUseFallback,
+  isFallbackDecision,
+  EMERGENCY_FALLBACK_CONFIG,
+} from "../services/emergencyFallback.ts";

 export function shouldUseNativeCodexPassthrough({
  provider,
@@ -89,6 +95,22 @@ export async function handleChatCore({
 }) {
  const { provider, model, extendedContext } = modelInfo;
  const startTime = Date.now();
+  const persistFailureUsage = (statusCode: number, errorCode?: string | null) => {
+    saveRequestUsage({
+      provider: provider || "unknown",
+      model: model || "unknown",
+      tokens: { input: 0, output: 0, cacheRead: 0, cacheCreation: 0, reasoning: 0 },
+      status: String(statusCode),
+      success: false,
+      latencyMs: Date.now() - startTime,
+      timeToFirstTokenMs: 0,
+      errorCode: errorCode || String(statusCode),
+      timestamp: new Date().toISOString(),
+      connectionId: connectionId || undefined,
+      apiKeyId: apiKeyInfo?.id || undefined,
+      apiKeyName: apiKeyInfo?.name || undefined,
+    }).catch(() => {});
+  };

  // ── Phase 9.2: Idempotency check ──
  const idempotencyKey = getIdempotencyKey(clientRawRequest?.headers);
@@ -193,7 +215,7 @@ export async function handleChatCore({
    } else if (isClaudePassthrough) {
      // Claude-to-Claude passthrough: forward body completely untouched.
      // No translation, no field stripping, no thinking normalization.
-      // We are just a gateway -- do not interfere with the request in any way.
+      // We are just a gateway -- do not interfere with the request in the slightest.
      translatedBody = { ...body };
      log?.debug?.("FORMAT", "claude->claude passthrough -- forwarding untouched");
    } else {
@@ -246,8 +268,44 @@ export async function handleChatCore({
      if (Array.isArray(translatedBody.messages)) {
        for (const msg of translatedBody.messages) {
          if (Array.isArray(msg.content)) {
-            msg.content = msg.content.filter((block: Record<string, unknown>) =>
-              block.type !== "text" || (typeof block.text === "string" && block.text.length > 0)
+            msg.content = msg.content.filter(
+              (block: Record<string, unknown>) =>
+                block.type !== "text" || (typeof block.text === "string" && block.text.length > 0)
+            );
+          }
+        }
+      }
+
+      // ── #409: Normalize unsupported content part types ──
+      // Cursor and other clients send {type:"file"} when attaching .md or other files.
+      // Providers (Copilot, OpenAI) only accept "text" and "image_url" in content arrays.
+      // Convert: file → text (extract content); drop unrecognized types and log them at debug level.
+      if (Array.isArray(translatedBody.messages)) {
+        for (const msg of translatedBody.messages) {
+          if (msg.role === "user" && Array.isArray(msg.content)) {
+            msg.content = (msg.content as Record<string, unknown>[]).flatMap(
+              (block: Record<string, unknown>) => {
+                if (block.type === "text" || block.type === "image_url" || block.type === "image") {
+                  return [block];
+                }
+                // file / document → extract text content
+                if (block.type === "file" || block.type === "document") {
+                  const fileContent =
+                    (block.file as Record<string, unknown>)?.content ??
+                    (block.file as Record<string, unknown>)?.text ??
+                    block.content ??
+                    block.text;
+                  const fileName =
+                    (block.file as Record<string, unknown>)?.name ?? block.name ?? "attachment";
+                  if (typeof fileContent === "string" && fileContent.length > 0) {
+                    return [{ type: "text", text: `[${fileName}]\n${fileContent}` }];
+                  }
+                  return [];
+                }
+                // Unknown types: drop them and log at debug level
+                log?.debug?.("CONTENT", `Dropped unsupported content part type="${block.type}"`);
+                return [];
+              }
            );
          }
        }
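The normalization in this hunk can be sketched as a standalone function, shown here with a sample input. This is an illustration under assumptions, not the project's code: `ContentPart` and `normalizeContentParts` are hypothetical names, and the sketch skips the `msg.role === "user"` check and the debug log that the real handler applies.

```typescript
type ContentPart = Record<string, unknown>;

// Keeps text/image parts, converts file/document parts to text,
// and drops everything else.
function normalizeContentParts(parts: ContentPart[]): ContentPart[] {
  const out: ContentPart[] = [];
  for (const part of parts) {
    if (part.type === "text" || part.type === "image_url" || part.type === "image") {
      out.push(part);
      continue;
    }
    if (part.type === "file" || part.type === "document") {
      const file = (part.file ?? {}) as ContentPart;
      // Same lookup order as the hunk above: nested file fields first, then top-level fields.
      const content = file.content ?? file.text ?? part.content ?? part.text;
      const name = file.name ?? part.name ?? "attachment";
      if (typeof content === "string" && content.length > 0) {
        out.push({ type: "text", text: `[${String(name)}]\n${content}` });
      }
      continue; // empty or non-string file content is dropped
    }
    // Any other type (e.g. "audio") is dropped.
  }
  return out;
}

const normalizedParts = normalizeContentParts([
  { type: "text", text: "Summarize this file" },
  { type: "file", file: { name: "notes.md", content: "# Notes" } },
  { type: "audio", data: "..." },
]);
console.log(normalizedParts);
```

With this input, the text part passes through unchanged, the file part becomes a text part prefixed with `[notes.md]`, and the audio part is removed.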
@@ -328,6 +386,57 @@ export async function handleChatCore({
  // Get executor for this provider
  const executor = getExecutor(provider);

+  // Create stream controller for disconnect detection
+  const streamController = createStreamController({ onDisconnect, log, provider, model });
+
+  const dedupRequestBody = { ...translatedBody, model: `${provider}/${model}` };
+  const dedupEnabled = shouldDeduplicate(dedupRequestBody);
+  const dedupHash = dedupEnabled ? computeRequestHash(dedupRequestBody) : null;
+
+  const executeProviderRequest = async (modelToCall = model, allowDedup = false) => {
+    const execute = async () => {
+      const bodyToSend =
+        translatedBody.model === modelToCall
+          ? translatedBody
+          : { ...translatedBody, model: modelToCall };
+
+      const rawResult = await withRateLimit(provider, connectionId, modelToCall, () =>
+        executor.execute({
+          model: modelToCall,
+          body: bodyToSend,
+          stream,
+          credentials,
+          signal: streamController.signal,
+          log,
+          extendedContext,
+        })
+      );
+
+      if (stream) return rawResult;
+
+      // Non-stream responses need cloning for shared dedup consumers.
+      const status = rawResult.response.status;
+      const statusText = rawResult.response.statusText;
+      const headers = Array.from(rawResult.response.headers.entries());
+      const payload = await rawResult.response.text();
+
+      return {
+        ...rawResult,
+        response: new Response(payload, { status, statusText, headers }),
+      };
+    };
+
+    if (allowDedup && dedupEnabled && dedupHash) {
+      const dedupResult = await deduplicate(dedupHash, execute);
+      if (dedupResult.wasDeduplicated) {
+        log?.debug?.("DEDUP", `Joined in-flight request hash=${dedupHash}`);
+      }
+      return dedupResult.result;
+    }
+
+    return execute();
+  };
+
  // Track pending request
  trackPendingRequest(model, provider, connectionId, true);
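The `deduplicate(hash, execute)` call in this hunk comes from `requestDedup.ts`, which is not part of the visible diff. The sketch below shows one common way such a helper can work, assuming an in-memory map of in-flight promises keyed by the request hash. `dedupeInFlight`, `inFlight`, and `fakeUpstream` are hypothetical names; only the `{ result, wasDeduplicated }` return shape is taken from the calling code above.

```typescript
// Assumption: in-flight requests are tracked in process memory only.
const inFlight = new Map<string, Promise<unknown>>();

async function dedupeInFlight<T>(
  key: string,
  run: () => Promise<T>
): Promise<{ result: T; wasDeduplicated: boolean }> {
  const existing = inFlight.get(key) as Promise<T> | undefined;
  if (existing) {
    // An identical request is already running: wait for it instead of calling upstream again.
    return { result: await existing, wasDeduplicated: true };
  }
  const pending = run();
  inFlight.set(key, pending);
  try {
    return { result: await pending, wasDeduplicated: false };
  } finally {
    // Remove the entry on success and on failure so later requests run fresh.
    inFlight.delete(key);
  }
}

// Demo: two concurrent calls with the same key trigger a single upstream call.
let upstreamCalls = 0;
const fakeUpstream = async (): Promise<string> => {
  upstreamCalls += 1;
  await new Promise((resolve) => setTimeout(resolve, 10));
  return "response";
};

const dedupDemo = Promise.all([
  dedupeInFlight("hash-abc", fakeUpstream),
  dedupeInFlight("hash-abc", fakeUpstream),
]).then(([first, second]) => {
  console.log(upstreamCalls, first.wasDeduplicated, second.wasDeduplicated);
  return { calls: upstreamCalls, first, second };
});
```

Because every joined caller receives the same result object, a shared `Response` body could only be read once. That is likely why the hunk above buffers non-stream responses with `response.text()` before returning them.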

@@ -345,9 +454,6 @@ export async function handleChatCore({
    0;
  log?.debug?.("REQUEST", `${provider.toUpperCase()} | ${model} | ${msgCount} msgs`);

-  // Create stream controller for disconnect detection
-  const streamController = createStreamController({ onDisconnect, log, provider, model });
-
  // Execute request using executor (handles URL building, headers, fallback, transform)
  let providerResponse;
  let providerUrl;
@@ -355,17 +461,7 @@ export async function handleChatCore({
  let finalBody;

  try {
-    const result = await withRateLimit(provider, connectionId, model, () =>
-      executor.execute({
-        model,
-        body: translatedBody,
-        stream,
-        credentials,
-        signal: streamController.signal,
-        log,
-        extendedContext,
-      })
-    );
+    const result = await executeProviderRequest(model, true);

    providerResponse = result.response;
    providerUrl = result.url;
@@ -412,6 +508,7 @@ export async function handleChatCore({
      streamController.handleError(error);
      return createErrorResult(499, "Request aborted");
    }
+    persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, error?.name || "upstream_error");
    const errMsg = formatProviderError(error, provider, model, HTTP_STATUS.BAD_GATEWAY);
    console.log(`${COLORS.red}[ERROR] ${errMsg}${COLORS.reset}`);
    return createErrorResult(HTTP_STATUS.BAD_GATEWAY, errMsg);
@@ -521,17 +618,7 @@ export async function handleChatCore({
        log?.info?.("MODEL_FALLBACK", `${model} unavailable (${statusCode}) → trying ${nextModel}`);
        // Re-execute with the fallback model
        try {
-          const fallbackResult = await withRateLimit(provider, connectionId, nextModel, () =>
-            executor.execute({
-              model: nextModel,
-              body: translatedBody,
-              stream,
-              credentials,
-              signal: streamController.signal,
-              log,
-              extendedContext,
-            })
-          );
+          const fallbackResult = await executeProviderRequest(nextModel, false);
          if (fallbackResult.response.ok) {
            providerResponse = fallbackResult.response;
            providerUrl = fallbackResult.url;
@@ -543,18 +630,79 @@ export async function handleChatCore({
            // We fall through by NOT returning here
          } else {
            // Fallback also failed — return original error
+            persistFailureUsage(statusCode, "model_unavailable");
            return createErrorResult(statusCode, errMsg, retryAfterMs);
          }
        } catch {
+          persistFailureUsage(statusCode, "model_unavailable");
          return createErrorResult(statusCode, errMsg, retryAfterMs);
        }
      } else {
+        persistFailureUsage(statusCode, "model_unavailable");
        return createErrorResult(statusCode, errMsg, retryAfterMs);
      }
    } else {
+      persistFailureUsage(statusCode, `upstream_${statusCode}`);
      return createErrorResult(statusCode, errMsg, retryAfterMs);
    }
    // ── End T5 ───────────────────────────────────────────────────────────────
+
+    // ── Emergency Fallback (ClawRouter Feature #09/017) ────────────────────
+    // When a non-streaming request fails with a budget-related error (402 or
+    // budget keywords), redirect to nvidia/gpt-oss-120b ($0.00/M) before
+    // returning the error to the combo router. This gives one last free-tier
+    // attempt so the user's session stays alive.
+    const requestHasTools = Array.isArray(translatedBody.tools) && translatedBody.tools.length > 0;
+    if (!stream) {
+      const fbDecision = shouldUseFallback(
+        statusCode,
+        message,
+        requestHasTools,
+        EMERGENCY_FALLBACK_CONFIG
+      );
+      if (isFallbackDecision(fbDecision)) {
+        log?.info?.("EMERGENCY_FALLBACK", fbDecision.reason);
+        try {
+          // Build a minimal fallback request using the original body but with
+          // the NVIDIA free-tier model and max_tokens capped to avoid overuse.
+          const fbExecutor = getExecutor(fbDecision.provider);
+          const fbResult = await fbExecutor.execute({
+            model: fbDecision.model,
+            body: {
+              ...translatedBody,
+              model: fbDecision.model,
+              max_tokens: Math.min(
+                typeof translatedBody.max_tokens === "number"
+                  ? translatedBody.max_tokens
+                  : fbDecision.maxOutputTokens,
+                fbDecision.maxOutputTokens
+              ),
+            },
+            stream: false,
+            credentials,
+            signal: streamController.signal,
+            log,
+            extendedContext,
+          });
+          if (fbResult.response.ok) {
+            providerResponse = fbResult.response;
+            log?.info?.(
+              "EMERGENCY_FALLBACK",
+              `Serving ${fbDecision.provider}/${fbDecision.model} as budget fallback for ${provider}/${model}`
+            );
+            // Fall through to non-streaming handler — providerResponse is now OK
+          } else {
+            log?.warn?.(
+              "EMERGENCY_FALLBACK",
+              `Emergency fallback also failed (${fbResult.response.status})`
+            );
+          }
+        } catch (fbErr) {
+          log?.warn?.("EMERGENCY_FALLBACK", `Emergency fallback error: ${fbErr?.message}`);
+        }
+      }
+    }
+    // ── End Emergency Fallback ────────────────────────────────────────────
  }

  // Non-streaming response
@@ -580,6 +728,7 @@ export async function handleChatCore({
          connectionId,
          status: `FAILED ${HTTP_STATUS.BAD_GATEWAY}`,
        }).catch(() => {});
+        persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, "invalid_sse_payload");
        return createErrorResult(
          HTTP_STATUS.BAD_GATEWAY,
          "Invalid SSE response for non-streaming request"
@@ -597,6 +746,7 @@ export async function handleChatCore({
          connectionId,
          status: `FAILED ${HTTP_STATUS.BAD_GATEWAY}`,
        }).catch(() => {});
+        persistFailureUsage(HTTP_STATUS.BAD_GATEWAY, "invalid_json_payload");
        return createErrorResult(HTTP_STATUS.BAD_GATEWAY, "Invalid JSON response from provider");
      }
    }
@@ -639,6 +789,11 @@ export async function handleChatCore({
        provider: provider || "unknown",
        model: model || "unknown",
        tokens: usage,
+        status: "200",
+        success: true,
+        latencyMs: Date.now() - startTime,
+        timeToFirstTokenMs: Date.now() - startTime,
+        errorCode: null,
        timestamp: new Date().toISOString(),
        connectionId: connectionId || undefined,
        apiKeyId: apiKeyInfo?.id || undefined,
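The emergency fallback above caps `max_tokens` before calling the free-tier model. A minimal sketch of that clamping rule follows; `clampMaxTokens` is a hypothetical helper name used only for illustration, since the diff inlines the expression rather than extracting a function.

```typescript
// Hypothetical helper: the diff computes this inline inside the fallback request body.
function clampMaxTokens(requested: unknown, fallbackCap: number): number {
  // A missing or non-numeric max_tokens falls back to the cap itself.
  const wanted = typeof requested === "number" ? requested : fallbackCap;
  // The result never exceeds the cap configured for the fallback model.
  return Math.min(wanted, fallbackCap);
}

console.log(clampMaxTokens(256, 4096)); // 256
console.log(clampMaxTokens(100_000, 4096)); // 4096
console.log(clampMaxTokens(undefined, 4096)); // 4096
```

The cap value 4096 is an arbitrary example, not a value taken from `EMERGENCY_FALLBACK_CONFIG`.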
@@ -0,0 +1,664 @@
+/**
+ * Search Handler
+ *
+ * Handles POST /v1/search requests.
+ * Routes to 5 search providers with automatic failover:
+ *   serper-search, brave-search, perplexity-search, exa-search, tavily-search
+ *
+ * Request format:
+ * {
+ *   "query": "search query",
+ *   "provider": "serper-search" | "brave-search" | ... // optional, auto-selects cheapest
+ *   "max_results": 5,
+ *   "search_type": "web" | "news"
+ * }
+ */
+
+import { getSearchProvider, type SearchProviderConfig } from "../config/searchRegistry.ts";
+import { saveCallLog } from "@/lib/usageDb";
+
+// ── Types ────────────────────────────────────────────────────────────────
+
+export interface SearchResult {
+  title: string;
+  url: string;
+  display_url?: string;
+  snippet: string;
+  position: number;
+  score: number | null;
+  published_at: string | null;
+  favicon_url: string | null;
+  content: { format: string; text: string; length: number } | null;
+  metadata: {
+    author: string | null;
+    language: string | null;
+    source_type: string | null;
+    image_url: string | null;
+  } | null;
+  citation: {
+    provider: string;
+    retrieved_at: string;
+    rank: number;
+  };
+  provider_raw: Record<string, unknown> | null;
+}
+
+export interface SearchResponse {
+  provider: string;
+  query: string;
+  results: SearchResult[];
+  answer: { source: string; text: string | null; model: string | null } | null;
+  usage: { queries_used: number; search_cost_usd: number; llm_tokens?: number };
+  metrics: {
+    response_time_ms: number;
+    upstream_latency_ms: number;
+    gateway_latency_ms?: number;
+    total_results_available: number | null;
+  };
+  errors: Array<{ provider: string; code: string; message: string }>;
+}
+
+interface SearchHandlerResult {
+  success: boolean;
+  status?: number;
+  error?: string;
+  data?: SearchResponse;
+}
+
+interface SearchHandlerOptions {
+  query: string;
+  provider: string;
+  maxResults: number;
+  searchType: string;
+  country?: string;
+  language?: string;
+  timeRange?: string;
+  offset?: number;
+  domainFilter?: string[];
+  contentOptions?: { snippet?: boolean; full_page?: boolean; format?: string; max_characters?: number };
+  strictFilters?: boolean;
+  providerOptions?: Record<string, unknown>;
+  credentials: Record<string, any>;
+  alternateProvider?: string;
+  alternateCredentials?: Record<string, any> | null;
+  log?: any;
+}
+
+// ── Constants ────────────────────────────────────────────────────────────
+
+const GLOBAL_TIMEOUT_MS = 15_000;
+
+// Non-retriable HTTP status codes — fail immediately, don't try alternate
+const NON_RETRIABLE = new Set([400, 401, 403, 404]);
+
+// ── Input Sanitization ──────────────────────────────────────────────────
+
+// Control characters that should never appear in search queries
+const CONTROL_CHAR_RE = /[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/;
+
+function sanitizeQuery(query: string): { clean: string; error?: string } {
+  if (CONTROL_CHAR_RE.test(query)) {
+    return { clean: "", error: "Query contains invalid control characters" };
+  }
+  const clean = query.normalize("NFKC").trim().replace(/\s+/g, " ");
+  if (clean.length === 0) {
+    return { clean: "", error: "Query is empty after normalization" };
+  }
+  return { clean };
+}
+
+// ── Response Normalizers ────────────────────────────────────────────────
+
+function makeResult(
+  providerId: string,
+  item: {
+    title?: string;
+    url?: string;
+    snippet?: string;
+    score?: number;
+    published_at?: string;
+    favicon_url?: string;
+    author?: string;
+    source_type?: string;
+    image_url?: string;
+    full_text?: string;
+    text_format?: string;
+  },
+  idx: number,
+  now: string
+): SearchResult {
+  const url = item.url || "";
+  return {
+    title: item.title || "",
+    url,
+    display_url: url ? url.replace(/^https?:\/\/(www\.)?/, "").split("?")[0] : undefined,
+    snippet: item.snippet || "",
+    position: idx + 1,
+    score: typeof item.score === "number" ? Math.min(1, Math.max(0, item.score)) : null,
+    published_at: item.published_at || null,
+    favicon_url: item.favicon_url || null,
+    content: item.full_text
+      ? { format: item.text_format || "text", text: item.full_text, length: item.full_text.length }
+      : null,
+    metadata: {
+      author: item.author || null,
+      language: null,
+      source_type: item.source_type || null,
+      image_url: item.image_url || null,
+    },
+    citation: { provider: providerId, retrieved_at: now, rank: idx + 1 },
+    provider_raw: null,
+  };
+}
+
+function normalizeSerperResponse(
+  data: any,
+  _query: string,
+  searchType: string
+): { results: SearchResult[]; totalResults: number | null } {
+  const now = new Date().toISOString();
+  const items = searchType === "news" ? data.news : data.organic;
+  if (!Array.isArray(items)) return { results: [], totalResults: null };
+
+  const results = items.map((item: any, idx: number) =>
+    makeResult(
+      "serper-search",
+      {
+        title: item.title,
+        url: item.link,
+        snippet: item.snippet || item.description,
+        published_at: item.date,
+      },
+      idx,
+      now
+    )
+  );
+
+  return {
+    results,
+    totalResults:
+      typeof data.searchParameters?.totalResults === "number"
+        ? data.searchParameters.totalResults
+        : null,
+  };
+}
+
+function normalizeBraveResponse(
+  data: any,
+  _query: string,
+  searchType: string
+): { results: SearchResult[]; totalResults: number | null } {
+  const now = new Date().toISOString();
+  const container = searchType === "news" ? data.news : data.web;
+  const items = container?.results;
+  if (!Array.isArray(items)) return { results: [], totalResults: null };
+
+  const results = items.map((item: any, idx: number) =>
+    makeResult(
+      "brave-search",
+      {
+        title: item.title,
+        url: item.url,
+        snippet: item.description,
+        published_at: item.page_age || item.age,
+        favicon_url: item.meta_url?.favicon || item.favicon,
+      },
+      idx,
+      now
+    )
+  );
+
+  return { results, totalResults: container?.totalCount ?? null };
+}
+
+// ── Helpers ─────────────────────────────────────────────────────────────
+
+function parseDomainFilter(domainFilter?: string[]): {
+  includes: string[];
+  excludes: string[];
+} {
+  if (!domainFilter?.length) return { includes: [], excludes: [] };
+  const includes = domainFilter.filter((d) => !d.startsWith("-"));
+  const excludes = domainFilter.filter((d) => d.startsWith("-")).map((d) => d.slice(1));
+  return { includes, excludes };
+}
+
+// ── Provider Request Builders ───────────────────────────────────────────
+
+interface SearchRequestParams {
+  query: string;
+  searchType: string;
+  maxResults: number;
+  token: string;
+  country?: string;
+  language?: string;
+  domainFilter?: string[];
+}
+
+function buildSerperRequest(
+  config: SearchProviderConfig,
+  params: SearchRequestParams
+): { url: string; init: RequestInit } {
+  const endpoint = params.searchType === "news" ? "/news" : "/search";
+  const body: Record<string, unknown> = { q: params.query, num: params.maxResults };
+  if (params.country) body.gl = params.country.toLowerCase();
+  if (params.language) body.hl = params.language;
+  return {
+    url: `${config.baseUrl}${endpoint}`,
+    init: {
+      method: "POST",
+      headers: { "Content-Type": "application/json", "X-API-Key": params.token },
+      body: JSON.stringify(body),
+    },
+  };
+}
+
+function buildBraveRequest(
+  config: SearchProviderConfig,
+  params: SearchRequestParams
+): { url: string; init: RequestInit } {
+  const endpoint = params.searchType === "news" ? "/news/search" : "/web/search";
+  const qp = new URLSearchParams({ q: params.query, count: String(params.maxResults) });
+  if (params.country) qp.set("country", params.country);
+  if (params.language) qp.set("search_lang", params.language);
+  return {
+    url: `${config.baseUrl}${endpoint}?${qp}`,
+    init: {
+      method: "GET",
+      headers: { Accept: "application/json", "X-Subscription-Token": params.token },
+    },
+  };
+}
+
+function buildPerplexityRequest(
+  config: SearchProviderConfig,
+  params: SearchRequestParams
+): { url: string; init: RequestInit } {
+  const body: Record<string, unknown> = { query: params.query, max_results: params.maxResults };
+  if (params.country) body.country = params.country;
+  if (params.language) body.search_language_filter = [params.language];
+  if (params.domainFilter?.length) body.search_domain_filter = params.domainFilter;
+  return {
+    url: config.baseUrl,
+    init: {
+      method: "POST",
+      headers: { "Content-Type": "application/json", Authorization: `Bearer ${params.token}` },
+      body: JSON.stringify(body),
+    },
+  };
+}
+
+function buildExaRequest(
+  config: SearchProviderConfig,
+  params: SearchRequestParams
+): { url: string; init: RequestInit } {
+  const { includes, excludes } = parseDomainFilter(params.domainFilter);
+  const body: Record<string, unknown> = {
+    query: params.query,
+    numResults: params.maxResults,
+    type: "auto",
+    text: true,
+    highlights: true,
+  };
+  if (includes.length) body.includeDomains = includes;
+  if (excludes.length) body.excludeDomains = excludes;
+  if (params.searchType === "news") body.category = "news";
+  return {
+    url: config.baseUrl,
+    init: {
+      method: "POST",
+      headers: { "Content-Type": "application/json", "x-api-key": params.token },
+      body: JSON.stringify(body),
+    },
+  };
+}
+
+function buildTavilyRequest(
+  config: SearchProviderConfig,
+  params: SearchRequestParams
+): { url: string; init: RequestInit } {
+  const { includes, excludes } = parseDomainFilter(params.domainFilter);
+  const body: Record<string, unknown> = {
+    query: params.query,
+    max_results: params.maxResults,
+    topic: params.searchType === "news" ? "news" : "general",
+  };
+  if (includes.length) body.include_domains = includes;
+  if (excludes.length) body.exclude_domains = excludes;
+  if (params.country) body.country = params.country;
+  return {
+    url: config.baseUrl,
+    init: {
+      method: "POST",
+      headers: { "Content-Type": "application/json", Authorization: `Bearer ${params.token}` },
+      body: JSON.stringify(body),
+    },
+  };
+}
+
+function buildRequest(
+  config: SearchProviderConfig,
+  params: SearchRequestParams
+): { url: string; init: RequestInit } {
+  if (config.id === "serper-search") return buildSerperRequest(config, params);
+  if (config.id === "brave-search") return buildBraveRequest(config, params);
+  if (config.id === "perplexity-search") return buildPerplexityRequest(config, params);
+  if (config.id === "exa-search") return buildExaRequest(config, params);
+  if (config.id === "tavily-search") return buildTavilyRequest(config, params);
+  // Fallback for future providers: bearer auth with a JSON body, so config.method is expected to be POST
+  return {
+    url: config.baseUrl,
+    init: {
+      method: config.method,
+      headers: { "Content-Type": "application/json", Authorization: `Bearer ${params.token}` },
+      body: JSON.stringify({
+        query: params.query,
+        max_results: params.maxResults,
+        search_type: params.searchType,
+      }),
+    },
+  };
+}
+
+function normalizePerplexityResponse(
+  data: any,
+  _query: string,
+  _searchType: string
+): { results: SearchResult[]; totalResults: number | null } {
+  const now = new Date().toISOString();
+  const items = data.results;
+  if (!Array.isArray(items)) return { results: [], totalResults: null };
+
+  const results = items.map((item: any, idx: number) =>
+    makeResult(
+      "perplexity-search",
+      {
+        title: item.title,
+        url: item.url,
+        snippet: item.snippet,
+        published_at: item.date || item.last_updated,
+      },
+      idx,
+      now
+    )
+  );
+  return { results, totalResults: results.length };
+}
+
+function normalizeExaResponse(
+  data: any,
+  _query: string,
+  _searchType: string
+): { results: SearchResult[]; totalResults: number | null } {
+  const now = new Date().toISOString();
+  const items = data.results;
+  if (!Array.isArray(items)) return { results: [], totalResults: null };
+
+  const results = items.map((item: any, idx: number) =>
+    makeResult(
+      "exa-search",
+      {
+        title: item.title,
+        url: item.url,
+        snippet: item.highlights?.[0] || item.text?.slice(0, 300) || "",
+        score: item.score,
+        published_at: item.publishedDate,
+        favicon_url: item.favicon,
+        author: item.author,
+        image_url: item.image,
+        full_text: item.text,
+        text_format: "text",
+      },
+      idx,
+      now
+    )
+  );
+  return { results, totalResults: results.length };
+}
+
+function normalizeTavilyResponse(
+  data: any,
+  _query: string,
+  _searchType: string
+): { results: SearchResult[]; totalResults: number | null } {
+  const now = new Date().toISOString();
+  const items = data.results;
+  if (!Array.isArray(items)) return { results: [], totalResults: null };
+
+  const results = items.map((item: any, idx: number) =>
+    makeResult(
+      "tavily-search",
+      {
+        title: item.title,
+        url: item.url,
+        snippet: item.content || "",
+        score: item.score,
+        published_at: item.published_date,
+        full_text: item.raw_content,
+        text_format: "text",
+      },
+      idx,
+      now
+    )
+  );
+  return { results, totalResults: results.length };
+}
+
+function normalizeResponse(
+  providerId: string,
+  data: any,
+  query: string,
+  searchType: string
+): { results: SearchResult[]; totalResults: number | null } {
+  if (providerId === "serper-search") return normalizeSerperResponse(data, query, searchType);
+  if (providerId === "brave-search") return normalizeBraveResponse(data, query, searchType);
+  if (providerId === "perplexity-search")
+    return normalizePerplexityResponse(data, query, searchType);
+  if (providerId === "exa-search") return normalizeExaResponse(data, query, searchType);
+  if (providerId === "tavily-search") return normalizeTavilyResponse(data, query, searchType);
+  return { results: [], totalResults: null };
+}
+
+// ── Main Handler ────────────────────────────────────────────────────────
+
+export async function handleSearch(options: SearchHandlerOptions): Promise<SearchHandlerResult> {
+  const {
+    query,
+    provider: providerId,
+    maxResults,
+    searchType,
+    country,
+    language,
+    domainFilter,
+    credentials,
+    alternateProvider,
+    alternateCredentials,
+    log,
+  } = options;
+  const startTime = Date.now();
+
+  // 1. Sanitize input
+  const { clean: cleanQuery, error: sanitizeError } = sanitizeQuery(query);
+  if (sanitizeError) {
+    return { success: false, status: 400, error: sanitizeError };
+  }
+
+  // 2. Use resolved provider from route (no re-resolution)
+  const primaryConfig = getSearchProvider(providerId);
+  if (!primaryConfig) {
+    return {
+      success: false,
+      status: 400,
+      error: `Unknown search provider: ${providerId}`,
+    };
+  }
+
+  // 3. Get alternate config for failover (pre-resolved by route)
+  const alternateConfig = alternateProvider ? getSearchProvider(alternateProvider) : null;
+
+  const requestParams = {
+    query: cleanQuery,
+    searchType,
+    maxResults,
+    country,
+    language,
+    domainFilter,
+  };
+
+  // 4. Try primary provider
+  const result = await tryProvider(primaryConfig, requestParams, credentials, startTime, log);
+
+  if (result.success) return result;
+
+  // 5. Failover to alternate (only for retriable errors and auto-select mode)
+  if (
+    alternateConfig &&
+    alternateCredentials &&
+    !NON_RETRIABLE.has(result.status || 0) &&
+    Date.now() - startTime < GLOBAL_TIMEOUT_MS
+  ) {
+    if (log) {
+      log.warn(
+        "SEARCH",
+        `${primaryConfig.id} failed (${result.status}), trying ${alternateConfig.id}`
+      );
+    }
+
+    const fallbackResult = await tryProvider(
+      alternateConfig,
+      requestParams,
+      alternateCredentials,
+      startTime,
+      log
+    );
+
+    if (fallbackResult.success) return fallbackResult;
+  }
+
+  return result;
+}
+
+async function tryProvider(
+  config: SearchProviderConfig,
+  params: Omit<SearchRequestParams, "token">,
+  credentials: Record<string, any>,
+  globalStartTime: number,
+  log?: any
+): Promise<SearchHandlerResult> {
+  const startTime = Date.now();
+  const token = credentials.apiKey || credentials.accessToken;
+
+  if (!token) {
+    return {
+      success: false,
+      status: 401,
+      error: `No credentials for search provider: ${config.id}`,
+    };
+  }
+
+  const { query, searchType, maxResults } = params;
+  const { url, init } = buildRequest(config, { ...params, token });
+
+  // Timeout: min of provider timeout and remaining global budget (floored at 1000 ms)
+  const remainingGlobal = GLOBAL_TIMEOUT_MS - (Date.now() - globalStartTime);
+  const timeout = Math.min(config.timeoutMs, Math.max(remainingGlobal, 1000));
+  const controller = new AbortController();
+  const timer = setTimeout(() => controller.abort(), timeout);
+
+  if (log) {
+    log.info("SEARCH", `${config.id} | query: "${query.slice(0, 80)}" | type: ${searchType}`);
+  }
+
+  try {
+    const response = await fetch(url, { ...init, signal: controller.signal });
+    clearTimeout(timer);
+
+    if (!response.ok) {
+      const errorText = await response.text();
+      if (log) {
+        log.error("SEARCH", `${config.id} error ${response.status}: ${errorText.slice(0, 200)}`);
+      }
+
+      saveCallLog({
+        method: config.method,
+        path: "/v1/search",
+        status: response.status,
+        model: config.id,
+        provider: config.id,
+        duration: Date.now() - startTime,
+        requestType: "search",
+        error: errorText.slice(0, 500),
+        requestBody: {
+          query: query.slice(0, 200),
+          search_type: searchType,
+          max_results: maxResults,
+        },
+      }).catch(() => { /* non-critical — logging must not block search response */ });
+
+      return {
+        success: false,
+        status: response.status,
+        error: `Search provider ${config.id} returned ${response.status}`,
+      };
+    }
+
+    const data = await response.json();
+    const { results, totalResults } = normalizeResponse(config.id, data, query, searchType);
+    const duration = Date.now() - startTime;
+
+    saveCallLog({
+      method: config.method,
+      path: "/v1/search",
+      status: 200,
+      model: config.id,
+      provider: config.id,
+      duration,
+      requestType: "search",
+      tokens: { prompt_tokens: 0, completion_tokens: 0 },
+      requestBody: { query: query.slice(0, 200), search_type: searchType, max_results: maxResults },
+      responseBody: { results_count: results.length, cached: false },
+    }).catch(() => { /* non-critical — logging must not block search response */ });
+
+    return {
+      success: true,
+      data: {
+        provider: config.id,
+        query,
+        results,
+        answer: null,
+        usage: { queries_used: 1, search_cost_usd: config.costPerQuery },
+        metrics: {
+          response_time_ms: duration,
+          upstream_latency_ms: duration,
+          total_results_available: totalResults,
+        },
+        errors: [],
+      },
+    };
+  } catch (err: any) {
+    clearTimeout(timer);
+
+    const isTimeout = err.name === "AbortError";
+    if (log) {
+      log.error("SEARCH", `${config.id} ${isTimeout ? "timeout" : "fetch error"}: ${err.message}`);
+    }
+
+    saveCallLog({
+      method: config.method,
+      path: "/v1/search",
+      status: isTimeout ? 504 : 502,
+      model: config.id,
+      provider: config.id,
+      duration: Date.now() - startTime,
+      requestType: "search",
+      error: err.message,
+      requestBody: { query: query.slice(0, 200), search_type: searchType, max_results: maxResults },
+    }).catch(() => { /* non-critical — logging must not block search response */ });
+
+    return {
+      success: false,
+      status: isTimeout ? 504 : 502,
+      error: `Search provider ${isTimeout ? "timeout" : "error"}: ${err.message}`,
+    };
+  }
+}
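The per-attempt timeout in `tryProvider` can be sketched in isolation as below. `effectiveTimeout` is a hypothetical name introduced here; the handler computes the same expression inline. The constant mirrors `GLOBAL_TIMEOUT_MS` from the file above.

```typescript
const GLOBAL_TIMEOUT_MS = 15_000; // same value as the handler's constant

// Hypothetical helper: mirrors the inline computation in tryProvider.
function effectiveTimeout(providerTimeoutMs: number, elapsedMs: number): number {
  const remainingGlobal = GLOBAL_TIMEOUT_MS - elapsedMs;
  // Floor the remaining budget at 1000 ms, then cap by the provider's own timeout.
  return Math.min(providerTimeoutMs, Math.max(remainingGlobal, 1000));
}

console.log(effectiveTimeout(5_000, 0)); // 5000
console.log(effectiveTimeout(10_000, 9_000)); // 6000
console.log(effectiveTimeout(10_000, 14_800)); // 1000
```

Because of the 1000 ms floor, an alternate provider started just before the global deadline can still run for up to one second, so the total request time may slightly exceed `GLOBAL_TIMEOUT_MS`.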
@@ -0,0 +1,48 @@
+import { describe, it, expect } from "vitest";
+import {
+  MCP_TOOLS,
+  MCP_TOOL_MAP,
+  setRoutingStrategyInput,
+  setRoutingStrategyTool,
+} from "../schemas/tools.ts";
+
+describe("omniroute_set_routing_strategy MCP tool schema", () => {
+  it("should be registered in MCP_TOOLS", () => {
+    const tool = MCP_TOOLS.find((t) => t.name === "omniroute_set_routing_strategy");
+    expect(tool).toBeDefined();
+    expect(tool?.phase).toBe(2);
+  });
+
+  it("should be available in MCP_TOOL_MAP", () => {
+    expect(MCP_TOOL_MAP["omniroute_set_routing_strategy"]).toBeDefined();
+  });
+
+  it("should require write:combos scope", () => {
+    expect(setRoutingStrategyTool.scopes).toContain("write:combos");
+  });
+
+  it("should validate a standard strategy payload", () => {
+    const result = setRoutingStrategyInput.safeParse({
+      comboId: "my-combo",
+      strategy: "cost-optimized",
+    });
+    expect(result.success).toBe(true);
+  });
+
+  it("should validate auto strategy with autoRoutingStrategy", () => {
+    const result = setRoutingStrategyInput.safeParse({
+      comboId: "my-combo",
+      strategy: "auto",
+      autoRoutingStrategy: "latency",
+    });
+    expect(result.success).toBe(true);
+  });
+
+  it("should reject unknown strategy", () => {
+    const result = setRoutingStrategyInput.safeParse({
+      comboId: "my-combo",
+      strategy: "unknown-strategy",
+    });
+    expect(result.success).toBe(false);
+  });
+});
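The behaviour these tests check can be approximated without zod. The sketch below is a plain TypeScript stand-in, not the project's schema: `isStrategyInput` and the two constant arrays are hypothetical names, and the allowed values are copied from the `setRoutingStrategyInput` enums in this commit.

```typescript
// Allowed values copied from the setRoutingStrategyInput enums in this commit.
const STRATEGIES = [
  "priority", "weighted", "round-robin", "strict-random",
  "random", "least-used", "cost-optimized", "auto",
] as const;
const AUTO_STRATEGIES = ["rules", "cost", "eco", "latency", "fast"] as const;

interface StrategyInput {
  comboId: string;
  strategy: (typeof STRATEGIES)[number];
  autoRoutingStrategy?: (typeof AUTO_STRATEGIES)[number];
}

// Hypothetical type guard standing in for the zod safeParse call.
function isStrategyInput(value: unknown): value is StrategyInput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.comboId !== "string") return false;
  if (!(STRATEGIES as readonly unknown[]).includes(v.strategy)) return false;
  // autoRoutingStrategy is optional, but must be a known value when present.
  return (
    v.autoRoutingStrategy === undefined ||
    (AUTO_STRATEGIES as readonly unknown[]).includes(v.autoRoutingStrategy)
  );
}

console.log(isStrategyInput({ comboId: "my-combo", strategy: "cost-optimized" })); // true
console.log(isStrategyInput({ comboId: "my-combo", strategy: "unknown-strategy" })); // false
```

Like the schema, this guard accepts `autoRoutingStrategy` alongside a non-auto strategy; the handler only applies it when `strategy` is `"auto"`.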
@@ -107,6 +107,7 @@ export const listCombosOutput = z.object({
        "priority",
        "weighted",
        "round-robin",
+        "strict-random",
        "random",
        "least-used",
        "cost-optimized",
@@ -470,7 +471,53 @@ export const setBudgetGuardTool: McpToolDefinition<
  sourceEndpoints: ["/api/usage/budget"],
 };

-// --- Tool 11: omniroute_set_resilience_profile ---
+// --- Tool 11: omniroute_set_routing_strategy ---
+export const setRoutingStrategyInput = z.object({
+  comboId: z.string().describe("Combo ID or name to update"),
+  strategy: z
+    .enum([
+      "priority",
+      "weighted",
+      "round-robin",
+      "strict-random",
+      "random",
+      "least-used",
+      "cost-optimized",
+      "auto",
+    ])
+    .describe("Routing strategy to apply"),
+  autoRoutingStrategy: z
+    .enum(["rules", "cost", "eco", "latency", "fast"])
+    .optional()
+    .describe("Optional strategy used by auto mode (only used when strategy='auto')"),
+});
+
+export const setRoutingStrategyOutput = z.object({
+  success: z.boolean(),
+  combo: z.object({
+    id: z.string(),
+    name: z.string(),
+    strategy: z.string(),
+    autoRoutingStrategy: z.string().nullable(),
+  }),
+});
+
+export const setRoutingStrategyTool: McpToolDefinition<
+  typeof setRoutingStrategyInput,
+  typeof setRoutingStrategyOutput
+> = {
+  name: "omniroute_set_routing_strategy",
+  description:
+    "Updates a combo routing strategy (priority/weighted/auto/etc.) at runtime. Supports selecting the sub-strategy used by auto mode (rules/cost/eco/latency/fast).",
+  inputSchema: setRoutingStrategyInput,
+  outputSchema: setRoutingStrategyOutput,
+  scopes: ["write:combos"],
+  auditLevel: "full",
+  phase: 2,
+  sourceEndpoints: ["/api/combos", "/api/combos/{id}"],
+};
+
+// --- Tool 12: omniroute_set_resilience_profile ---
 export const setResilienceProfileInput = z.object({
  profile: z
    .enum(["aggressive", "balanced", "conservative"])
@@ -502,7 +549,7 @@ export const setResilienceProfileTool: McpToolDefinition<
  sourceEndpoints: ["/api/resilience"],
 };

-// --- Tool 12: omniroute_test_combo ---
+// --- Tool 13: omniroute_test_combo ---
 export const testComboInput = z.object({
  comboId: z.string().describe("ID of the combo to test"),
  testPrompt: z.string().max(500).describe("Short test prompt (max 500 chars)"),
@@ -540,7 +587,7 @@ export const testComboTool: McpToolDefinition<typeof testComboInput, typeof test
  sourceEndpoints: ["/api/combos/test", "/v1/chat/completions"],
 };

-// --- Tool 13: omniroute_get_provider_metrics ---
+// --- Tool 14: omniroute_get_provider_metrics ---
 export const getProviderMetricsInput = z.object({
  provider: z.string().describe("Provider name (e.g., 'claude', 'gemini-cli', 'codex')"),
 });
@@ -583,7 +630,7 @@ export const getProviderMetricsTool: McpToolDefinition<
  sourceEndpoints: ["/api/provider-metrics", "/api/resilience"],
 };

-// --- Tool 14: omniroute_best_combo_for_task ---
+// --- Tool 15: omniroute_best_combo_for_task ---
 export const bestComboForTaskInput = z.object({
  taskType: z
    .enum(["coding", "review", "planning", "analysis", "debugging", "documentation"])
@@ -628,7 +675,7 @@ export const bestComboForTaskTool: McpToolDefinition<
  sourceEndpoints: ["/api/combos", "/api/combos/metrics", "/api/monitoring/health"],
 };

-// --- Tool 15: omniroute_explain_route ---
+// --- Tool 16: omniroute_explain_route ---
 export const explainRouteInput = z.object({
  requestId: z.string().describe("Request ID from the X-Request-Id header"),
 });
@@ -674,7 +721,7 @@ export const explainRouteTool: McpToolDefinition<
  sourceEndpoints: [],
 };

-// --- Tool 16: omniroute_get_session_snapshot ---
+// --- Tool 17: omniroute_get_session_snapshot ---
 export const getSessionSnapshotInput = z.object({}).describe("No parameters required");

 export const getSessionSnapshotOutput = z.object({
@@ -723,7 +770,7 @@ export const getSessionSnapshotTool: McpToolDefinition<
  sourceEndpoints: ["/api/usage/analytics", "/api/telemetry/summary"],
 };

-// --- Tool 17: omniroute_sync_pricing ---
+// --- Tool 18: omniroute_sync_pricing ---
 export const syncPricingInput = z.object({
  sources: z
    .array(z.string())
@@ -775,6 +822,7 @@ export const MCP_TOOLS = [
  // Phase 2: Advanced
  simulateRouteTool,
  setBudgetGuardTool,
+  setRoutingStrategyTool,
  setResilienceProfileTool,
  testComboTool,
  getProviderMetricsTool,
@@ -25,6 +25,7 @@ import {
  listModelsCatalogInput,
  simulateRouteInput,
  setBudgetGuardInput,
+  setRoutingStrategyInput,
  setResilienceProfileInput,
  testComboInput,
  getProviderMetricsInput,
@@ -45,6 +46,7 @@ import {
 import {
  handleSimulateRoute,
  handleSetBudgetGuard,
+  handleSetRoutingStrategy,
  handleSetResilienceProfile,
  handleTestCombo,
  handleGetProviderMetrics,
@@ -593,6 +595,18 @@ export function createMcpServer(): McpServer {
    )
  );

+  server.registerTool(
+    "omniroute_set_routing_strategy",
+    {
+      description:
+        "Updates combo routing strategy at runtime (priority/weighted/round-robin/auto/etc.)",
+      inputSchema: setRoutingStrategyInput,
+    },
+    withScopeEnforcement("omniroute_set_routing_strategy", (args) =>
+      handleSetRoutingStrategy(setRoutingStrategyInput.parse(args))
+    )
+  );
+
  server.registerTool(
    "omniroute_set_resilience_profile",
    {
@@ -1,16 +1,18 @@
 /**
- * OmniRoute MCP Advanced Tools — 8 intelligence tools that differentiate
+ * OmniRoute MCP Advanced Tools — 10 intelligence tools that differentiate
 * OmniRoute from all other AI gateways.
 *
 * Tools:
 *   1. omniroute_simulate_route     — Dry-run routing simulation
 *   2. omniroute_set_budget_guard   — Session budget with degrade/block/alert
- *   3. omniroute_set_resilience_profile — Circuit breaker/retry profiles
- *   4. omniroute_test_combo         — Live test each provider in a combo
- *   5. omniroute_get_provider_metrics — Detailed per-provider metrics
- *   6. omniroute_best_combo_for_task — AI-powered combo recommendation
- *   7. omniroute_explain_route      — Post-hoc routing decision explainer
- *   8. omniroute_get_session_snapshot — Full session state snapshot
+ *   3. omniroute_set_routing_strategy — Runtime strategy switch for combos
+ *   4. omniroute_set_resilience_profile — Circuit breaker/retry profiles
+ *   5. omniroute_test_combo         — Live test each provider in a combo
+ *   6. omniroute_get_provider_metrics — Detailed per-provider metrics
+ *   7. omniroute_best_combo_for_task — AI-powered combo recommendation
+ *   8. omniroute_explain_route      — Post-hoc routing decision explainer
+ *   9. omniroute_get_session_snapshot — Full session state snapshot
+ *  10. omniroute_sync_pricing      — Sync provider pricing from external source
 */

 import { logToolCall } from "../audit.ts";
@@ -335,6 +337,108 @@ export async function handleSetBudgetGuard(args: {
  }
 }

+export async function handleSetRoutingStrategy(args: {
+  comboId: string;
+  strategy:
+    | "priority"
+    | "weighted"
+    | "round-robin"
+    | "strict-random"
+    | "random"
+    | "least-used"
+    | "cost-optimized"
+    | "auto";
+  autoRoutingStrategy?: "rules" | "cost" | "eco" | "latency" | "fast";
+}) {
+  const start = Date.now();
+  try {
+    const combos = normalizeCombosResponse(await apiFetch("/api/combos"));
+    const combo = combos.find(
+      (comboEntry) =>
+        toString(comboEntry.id) === args.comboId || toString(comboEntry.name) === args.comboId
+    );
+
+    if (!combo) {
+      const msg = `Combo '${args.comboId}' not found`;
+      await logToolCall(
+        "omniroute_set_routing_strategy",
+        args,
+        null,
+        Date.now() - start,
+        false,
+        msg
+      );
+      return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
+    }
+
+    const comboId = toString(combo.id);
+    if (!comboId) {
+      const msg = "Matched combo has no id";
+      await logToolCall(
+        "omniroute_set_routing_strategy",
+        args,
+        null,
+        Date.now() - start,
+        false,
+        msg
+      );
+      return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
+    }
+
+    const comboData = toRecord(combo.data);
+    const currentConfig = toRecord(
+      Object.keys(toRecord(combo.config)).length > 0 ? combo.config : comboData.config
+    );
+
+    let nextConfig: JsonRecord | undefined = undefined;
+    if (args.strategy === "auto" && args.autoRoutingStrategy) {
+      const currentAutoConfig = toRecord(currentConfig.auto);
+      nextConfig = {
+        ...currentConfig,
+        auto: {
+          ...currentAutoConfig,
+          routingStrategy: args.autoRoutingStrategy,
+        },
+      };
+    }
+
+    const payload: JsonRecord = { strategy: args.strategy };
+    if (nextConfig && Object.keys(nextConfig).length > 0) {
+      payload.config = nextConfig;
+    }
+
+    const updatedCombo = toRecord(
+      await apiFetch(`/api/combos/${encodeURIComponent(comboId)}`, {
+        method: "PUT",
+        body: JSON.stringify(payload),
+      })
+    );
+
+    const updatedConfig = toRecord(updatedCombo.config);
+    const resolvedAutoStrategy =
+      toString(toRecord(updatedConfig.auto).routingStrategy) ||
+      (args.strategy === "auto" ? (args.autoRoutingStrategy ?? "rules") : "");
+
+    const result = {
+      success: true,
+      combo: {
+        id: toString(updatedCombo.id, comboId),
+        name: toString(updatedCombo.name, toString(combo.name, comboId)),
+        strategy: toString(updatedCombo.strategy, args.strategy),
+        autoRoutingStrategy:
+          toString(updatedCombo.strategy, args.strategy) === "auto" ? resolvedAutoStrategy : null,
+      },
+    };
+
+    await logToolCall("omniroute_set_routing_strategy", args, result, Date.now() - start, true);
+    return { content: [{ type: "text" as const, text: JSON.stringify(result, null, 2) }] };
+  } catch (err) {
+    const msg = err instanceof Error ? err.message : String(err);
+    await logToolCall("omniroute_set_routing_strategy", args, null, Date.now() - start, false, msg);
+    return { content: [{ type: "text" as const, text: `Error: ${msg}` }], isError: true };
+  }
+}
+
 export async function handleSetResilienceProfile(args: {
  profile: "aggressive" | "balanced" | "conservative";
 }) {
@@ -20,6 +20,7 @@ import {
 import { getTaskFitness } from "./taskFitness";
 import { getModePack } from "./modePacks";
 import { getSelfHealingManager } from "./selfHealing";
+import { classifyPromptIntent } from "../intentClassifier";

 export interface AutoComboConfig {
  id: string;
@@ -30,6 +31,8 @@ export interface AutoComboConfig {
  modePack?: string;
  budgetCap?: number; // max cost per request in USD
  explorationRate: number; // 0.05 = 5% exploratory
+  /** If set, RouterStrategy name to use for selection ('rules' | 'cost' | 'latency', plus the aliases 'eco' and 'fast') */
+  routerStrategy?: string;
 }

 export interface SelectionResult {
@@ -43,14 +46,44 @@ export interface SelectionResult {

 /**
 * Select the best provider from an auto-combo pool.
+ *
+ * @param config - AutoCombo configuration
+ * @param candidates - Provider candidates to score
+ * @param taskType - Task type hint. When "default" or omitted, the engine will attempt
+ *   to infer the intent from `promptMessages` using multilingual classification.
+ * @param promptMessages - Optional raw messages for intent classification
 */
 export function selectProvider(
  config: AutoComboConfig,
  candidates: ProviderCandidate[],
-  taskType: string = "default"
+  taskType: string = "default",
+  promptMessages?: Array<{ role: string; content: unknown }>
 ): SelectionResult {
  const healer = getSelfHealingManager();

+  // ── Intent classification (ClawRouter Feature #10/11) ────────────────────
+  // When taskType is generic ('default'), attempt to classify the prompt intent
+  // using the multilingual intentClassifier for better task fitness scoring.
+  let effectiveTaskType = taskType;
+  if ((taskType === "default" || taskType === "") && promptMessages?.length) {
+    // Extract text from last user message for classification
+    const lastUserMsg = [...promptMessages].reverse().find((m) => m.role === "user");
+    if (lastUserMsg) {
+      const text =
+        typeof lastUserMsg.content === "string"
+          ? lastUserMsg.content
+          : Array.isArray(lastUserMsg.content)
+            ? (lastUserMsg.content as Array<{ type: string; text?: string }>)
+                .filter((b) => b.type === "text")
+                .map((b) => b.text || "")
+                .join(" ")
+            : "";
+      if (text.length > 10) {
+        const intent = classifyPromptIntent(text);
+        effectiveTaskType = intent; // 'code' | 'reasoning' | 'simple' | 'medium'
+      }
+    }
+  }
  // Resolve weights from mode pack or config
  let weights = config.weights;
  if (config.modePack) {
@@ -80,8 +113,8 @@ export function selectProvider(
    excluded.length = 0;
  }

-  // Score all providers
-  const scored = scorePool(pool, taskType, weights, getTaskFitness);
+  // Score all providers (using classified intent if available)
+  const scored = scorePool(pool, effectiveTaskType, weights, getTaskFitness);

  // Apply self-healing re-evaluation with actual scores
  const finalCandidates = scored.filter((s) => {
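The intent-classification hunk above pulls the text of the last user message before classifying it. A minimal sketch of that extraction step follows; `lastUserText`, `ChatMessage` and `ContentBlock` are hypothetical names used only for illustration, and the classifier itself is left out.

```typescript
// Hypothetical types for illustration; the diff uses inline shapes instead.
type ContentBlock = { type: string; text?: string };
type ChatMessage = { role: string; content: unknown };

// Returns the text of the most recent user message, or "" if there is none.
// String content is returned as-is; array content keeps only "text" blocks.
function lastUserText(messages: ChatMessage[]): string {
  const lastUser = [...messages].reverse().find((m) => m.role === "user");
  if (!lastUser) return "";
  if (typeof lastUser.content === "string") return lastUser.content;
  if (Array.isArray(lastUser.content)) {
    return (lastUser.content as ContentBlock[])
      .filter((b) => b.type === "text")
      .map((b) => b.text || "")
      .join(" ");
  }
  return "";
}

const sampleMessages: ChatMessage[] = [
  { role: "user", content: "first question" },
  { role: "assistant", content: "an answer" },
  {
    role: "user",
    content: [
      { type: "text", text: "refactor" },
      { type: "image" },
      { type: "text", text: "this function" },
    ],
  },
];

const extractedText = lastUserText(sampleMessages);
console.log(extractedText);
```

Only the final user turn is considered, so earlier turns do not influence the inferred task type.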
@@ -0,0 +1,159 @@
+/**
+ * RouterStrategy — Pluggable Routing Strategy System
+ *
+ * Inspired by ClawRouter commit 14c83c258 "refactor: extract routing into pluggable RouterStrategy system".
+ * Provides a RouterStrategy interface and three built-in implementations:
+ *   - RulesStrategy (default): wraps the existing 6-factor scoring engine
+ *   - CostStrategy: always picks the cheapest available model
+ *   - LatencyStrategy: picks the lowest p95 latency, penalized by error rate
+ */
+
+import type { ProviderCandidate, ScoredProvider } from "./scoring.ts";
+import { scorePool } from "./scoring.ts";
+import { getTaskFitness } from "./taskFitness.ts";
+
+export interface RoutingContext {
+  taskType: string;
+  requestHasTools?: boolean;
+  requestHasVision?: boolean;
+  estimatedInputTokens?: number;
+}
+
+export interface RoutingDecision {
+  provider: string;
+  model: string;
+  strategy: string;
+  reason: string;
+  candidatesConsidered: number;
+  finalScore: number;
+}
+
+export interface RouterStrategy {
+  readonly name: string;
+  readonly description: string;
+  select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision;
+}
+
+// ── RulesStrategy: wraps 6-factor scoring engine ────────────────────────────
+
+class RulesStrategyImpl implements RouterStrategy {
+  readonly name = "rules";
+  readonly description =
+    "6-factor weighted scoring: quota, health, cost, latency, taskFit, stability";
+
+  select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
+    const eligible = pool.filter((c) => c.circuitBreakerState !== "OPEN");
+    const ranked: ScoredProvider[] = scorePool(
+      eligible.length > 0 ? eligible : pool,
+      context.taskType,
+      undefined,
+      getTaskFitness
+    );
+    const best = ranked[0];
+    if (!best) throw new Error("[RulesStrategy] No candidates to score");
+    return {
+      provider: best.provider,
+      model: best.model,
+      strategy: this.name,
+      reason: `RulesStrategy: score=${best.score.toFixed(3)} (quota=${best.factors.quota.toFixed(2)}, health=${best.factors.health.toFixed(2)}, cost=${best.factors.costInv.toFixed(2)}, taskFit=${best.factors.taskFit.toFixed(2)})`,
+      candidatesConsidered: ranked.length,
+      finalScore: best.score,
+    };
+  }
+}
+
+// ── CostStrategy: always picks cheapest healthy provider ─────────────────────
+
+class CostStrategyImpl implements RouterStrategy {
+  readonly name = "cost";
+  readonly description = "Always selects cheapest available provider (by costPer1MTokens)";
+
+  select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
+    const healthy = pool.filter((c) => c.circuitBreakerState !== "OPEN");
+    const candidates = healthy.length > 0 ? healthy : pool;
+    const sorted = [...candidates].sort((a, b) => a.costPer1MTokens - b.costPer1MTokens);
+    const best = sorted[0];
+    if (!best) throw new Error("[CostStrategy] No candidates available");
+    return {
+      provider: best.provider,
+      model: best.model,
+      strategy: this.name,
+      reason: `CostStrategy: cheapest at $${best.costPer1MTokens.toFixed(3)}/1M tokens`,
+      candidatesConsidered: candidates.length,
+      finalScore: best.costPer1MTokens === 0 ? 1.0 : 1 / best.costPer1MTokens,
+    };
+  }
+}
+
+// ── LatencyStrategy: prioritize low latency + reliability ───────────────────
+
+class LatencyStrategyImpl implements RouterStrategy {
+  readonly name = "latency";
+  readonly description = "Prioritizes lowest p95 latency with reliability weighting";
+
+  select(pool: ProviderCandidate[], context: RoutingContext): RoutingDecision {
+    const healthy = pool.filter((c) => c.circuitBreakerState !== "OPEN");
+    const candidates = healthy.length > 0 ? healthy : pool;
+    const sorted = [...candidates].sort((a, b) => {
+      const aPenalty = a.errorRate * 1000;
+      const bPenalty = b.errorRate * 1000;
+      return a.p95LatencyMs + aPenalty - (b.p95LatencyMs + bPenalty);
+    });
+    const best = sorted[0];
+    if (!best) throw new Error("[LatencyStrategy] No candidates available");
+
+    const latencyScore = best.p95LatencyMs > 0 ? Math.max(0.001, 10_000 / best.p95LatencyMs) : 1;
+    const reliability = Math.max(0, 1 - best.errorRate);
+    const finalScore = latencyScore * 0.7 + reliability * 0.3;
+
+    return {
+      provider: best.provider,
+      model: best.model,
+      strategy: this.name,
+      reason: `LatencyStrategy: p95=${best.p95LatencyMs}ms, errorRate=${(best.errorRate * 100).toFixed(2)}%`,
+      candidatesConsidered: candidates.length,
+      finalScore,
+    };
+  }
+}
+
+// ── Registry ──────────────────────────────────────────────────────────────────
+
+const strategyRegistry = new Map<string, RouterStrategy>();
+
+const rulesStrategy = new RulesStrategyImpl();
+const costStrategy = new CostStrategyImpl();
+const latencyStrategy = new LatencyStrategyImpl();
+
+strategyRegistry.set("rules", rulesStrategy);
+strategyRegistry.set("cost", costStrategy);
+strategyRegistry.set("eco", costStrategy); // alias
+strategyRegistry.set("latency", latencyStrategy);
+strategyRegistry.set("fast", latencyStrategy); // alias
+
+export function getStrategy(name: string): RouterStrategy {
+  const strategy = strategyRegistry.get(name);
+  if (!strategy) {
+    console.warn(`[RouterStrategy] Strategy '${name}' not found, falling back to 'rules'`);
+    return rulesStrategy;
+  }
+  return strategy;
+}
+
+export function registerStrategy(name: string, strategy: RouterStrategy): void {
+  if (strategyRegistry.has(name)) {
+    console.warn(`[RouterStrategy] Overwriting strategy '${name}'`);
+  }
+  strategyRegistry.set(name, strategy);
+}
+
+export function listStrategies(): Array<{ name: string; description: string }> {
+  return [...strategyRegistry.entries()].map(([name, s]) => ({ name, description: s.description }));
+}
+
+export function selectWithStrategy(
+  pool: ProviderCandidate[],
+  context: RoutingContext,
+  strategyName = "rules"
+): RoutingDecision {
+  return getStrategy(strategyName).select(pool, context);
+}
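The registry above maps strategy names (and aliases) to implementations and falls back to a default when a name is unknown. The following is a self-contained sketch of that pattern, not the module itself: the `Candidate`, `Decision` and `Strategy` types are trimmed-down stand-ins, `mostReliable` is a hypothetical custom strategy, the prices are made up, and the fallback here is the cost strategy rather than `rules`.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

// Trimmed-down stand-ins for ProviderCandidate / RoutingDecision / RouterStrategy.
interface Candidate {
  provider: string;
  model: string;
  circuitBreakerState: BreakerState;
  errorRate: number;
  costPer1MTokens: number;
}
interface Decision { provider: string; model: string; strategy: string; reason: string }
interface Strategy { readonly name: string; select(pool: Candidate[]): Decision }

// Same filter as the built-in strategies: skip OPEN breakers unless nothing is left.
function healthyOrAll(pool: Candidate[]): Candidate[] {
  const healthy = pool.filter((c) => c.circuitBreakerState !== "OPEN");
  return healthy.length > 0 ? healthy : pool;
}

const cheapest: Strategy = {
  name: "cost",
  select(pool) {
    const best = [...healthyOrAll(pool)].sort((a, b) => a.costPer1MTokens - b.costPer1MTokens)[0];
    if (!best) throw new Error("no candidates available");
    return { provider: best.provider, model: best.model, strategy: "cost", reason: "cheapest" };
  },
};

// Hypothetical custom strategy: lowest error rate wins.
const mostReliable: Strategy = {
  name: "reliable",
  select(pool) {
    const best = [...healthyOrAll(pool)].sort((a, b) => a.errorRate - b.errorRate)[0];
    if (!best) throw new Error("no candidates available");
    return { provider: best.provider, model: best.model, strategy: "reliable", reason: "lowest error rate" };
  },
};

const registry = new Map<string, Strategy>();
registry.set("cost", cheapest);
registry.set("eco", cheapest); // alias, as in the diff
registry.set("reliable", mostReliable);

// Unknown names fall back to a default instead of throwing.
function resolveStrategy(name: string): Strategy {
  return registry.get(name) ?? cheapest;
}

const demoPool: Candidate[] = [
  { provider: "openai", model: "gpt-4o-mini", circuitBreakerState: "CLOSED", errorRate: 0.02, costPer1MTokens: 0.15 },
  { provider: "deepseek", model: "deepseek-chat", circuitBreakerState: "OPEN", errorRate: 0.01, costPer1MTokens: 0.1 },
  { provider: "xai", model: "grok-4-fast", circuitBreakerState: "CLOSED", errorRate: 0.005, costPer1MTokens: 0.2 },
];

const ecoPick = resolveStrategy("eco").select(demoPool);
const reliablePick = resolveStrategy("reliable").select(demoPool);
console.log(ecoPick.model, reliablePick.model);
```

In the real module the equivalent of registering `mostReliable` would be a call to `registerStrategy`, and the candidate with the OPEN breaker is skipped even though it is the cheapest.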
@@ -74,7 +74,8 @@ export function calculateScore(factors: ScoringFactors, weights: ScoringWeights)
    weights.costInv * factors.costInv +
    weights.latencyInv * factors.latencyInv +
    weights.taskFit * factors.taskFit +
-    weights.stability * factors.stability
+    weights.stability * factors.stability +
+    weights.tierPriority * factors.tierPriority
  );
 }

@@ -24,10 +24,23 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
    "deepseek-coder": 0.9,
    "deepseek-v3": 0.85,
    "deepseek-r1": 0.88,
+    "deepseek-chat": 0.84, // DeepSeek V3.2 Chat — strong code performance
+    "deepseek-v3.2": 0.86, // Explicit V3.2 alias
    qwen: 0.78,
    llama: 0.72,
    mistral: 0.75,
    mixtral: 0.77,
+    // Grok-4 fast — good code, ultra-low latency (1143ms P50)
+    "grok-4-fast": 0.8,
+    "grok-4": 0.82,
+    "grok-3": 0.8,
+    // Kimi K2.5 — agentic with tool calling, good at code tasks
+    "kimi-k2": 0.82,
+    // GLM-5 — Z.AI model with 128k output
+    "glm-5": 0.78,
+    // MiniMax M2.5 — reasoning support helps complex code
+    "minimax-m2.5": 0.75,
+    "minimax-m2": 0.72,
  },
  review: {
    "claude-sonnet": 0.92,
@@ -58,10 +71,15 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
    "claude-sonnet": 0.92,
    "gemini-2.5-pro": 0.95,
    "gemini-pro": 0.88,
+    "gemini-3.1-pro": 0.95, // Gemini 3.1 Pro — 1M context, ideal for long analysis
    "gpt-4o": 0.85,
    o1: 0.9,
    o3: 0.93,
    "deepseek-r1": 0.88,
+    "deepseek-chat": 0.8,
+    "kimi-k2": 0.82, // Kimi K2.5 agentic — good for analysis
+    "glm-5": 0.78, // GLM-5 with 128k output for long analysis
+    "minimax-m2.5": 0.76,
  },
  debugging: {
    "claude-sonnet": 0.93,
@@ -87,8 +105,17 @@ const FITNESS_TABLE: Record<string, Record<string, number>> = {
    "claude-opus": 0.85,
    "gpt-4o": 0.85,
    "gemini-pro": 0.8,
+    "gemini-3.1-pro": 0.85,
    "deepseek-v3": 0.75,
+    "deepseek-chat": 0.74,
    "gemini-flash": 0.72,
+    // New models from ClawRouter analysis (2026-03-17):
+    "grok-4-fast": 0.72, // ultra-fast, suitable for all tasks
+    "grok-4": 0.74,
+    "grok-3": 0.73,
+    "kimi-k2": 0.76, // agentic multi-step tasks
+    "glm-5": 0.7,
+    "minimax-m2.5": 0.7,
  },
 };

@@ -5,18 +5,37 @@

 import { checkFallbackError, formatRetryAfter, getProviderProfile } from "./accountFallback.ts";
 import { unavailableResponse } from "../utils/error.ts";
-import { recordComboRequest, getComboMetrics } from "./comboMetrics.ts";
+import { recordComboIntent, recordComboRequest, getComboMetrics } from "./comboMetrics.ts";
 import { resolveComboConfig, getDefaultComboConfig } from "./comboConfig.ts";
 import * as semaphore from "./rateLimitSemaphore.ts";
 import { getCircuitBreaker } from "../../src/shared/utils/circuitBreaker";
 import { fisherYatesShuffle, getNextFromDeck } from "../../src/shared/utils/shuffleDeck";
 import { parseModel } from "./model.ts";
+import { applyComboAgentMiddleware, injectModelTag } from "./comboAgentMiddleware.ts";
+import { classifyWithConfig, DEFAULT_INTENT_CONFIG } from "./intentClassifier.ts";
+import { selectProvider as selectAutoProvider } from "./autoCombo/engine.ts";
+import { selectWithStrategy } from "./autoCombo/routerStrategy.ts";
+import { DEFAULT_WEIGHTS, scorePool } from "./autoCombo/scoring.ts";
+import { supportsToolCalling } from "./modelCapabilities.ts";

 // Status codes that should mark semaphore + record circuit breaker failures
 const TRANSIENT_FOR_BREAKER = [429, 502, 503, 504];

 const MAX_COMBO_DEPTH = 3;

+// Bootstrap defaults from ClawRouter benchmark (used when no local latency history exists yet)
+const DEFAULT_MODEL_P95_MS = {
+  "grok-4-fast-non-reasoning": 1143,
+  "grok-4-1-fast-non-reasoning": 1244,
+  "gemini-2.5-flash": 1238,
+  "kimi-k2.5": 1646,
+  "gpt-4o-mini": 2764,
+  "claude-sonnet-4.6": 4000,
+  "claude-opus-4.6": 6000,
+  "deepseek-chat": 2000,
+};
+const MIN_HISTORY_SAMPLES = 10;
+
 // In-memory atomic counter per combo for round-robin distribution
 // Resets on server restart (by design — no stale state)
 const rrCounters = new Map();
@@ -201,6 +220,193 @@ function sortModelsByUsage(models, comboName) {
  return withUsage.map((e) => e.modelStr);
 }

+function toTextContent(content) {
+  if (typeof content === "string") return content;
+  if (!Array.isArray(content)) return "";
+  return content
+    .map((part) => {
+      if (!part || typeof part !== "object") return "";
+      if (typeof part.text === "string") return part.text;
+      return "";
+    })
+    .join("\n");
+}
+
+function extractPromptForIntent(body) {
+  if (!body || typeof body !== "object") return "";
+
+  const fromMessages = Array.isArray(body.messages)
+    ? [...body.messages].reverse().find((m) => m && typeof m === "object" && m.role === "user")
+    : null;
+  if (fromMessages) return toTextContent(fromMessages.content);
+
+  if (typeof body.input === "string") return body.input;
+  if (Array.isArray(body.input)) {
+    const text = body.input
+      .map((item) => {
+        if (!item || typeof item !== "object") return "";
+        if (typeof item.content === "string") return item.content;
+        if (typeof item.text === "string") return item.text;
+        return "";
+      })
+      .filter(Boolean)
+      .join("\n");
+    if (text) return text;
+  }
+
+  if (typeof body.prompt === "string") return body.prompt;
+  return "";
+}
+
+function mapIntentToTaskType(intent) {
+  switch (intent) {
+    case "code":
+      return "coding";
+    case "reasoning":
+      return "analysis";
+    case "simple":
+      return "default";
+    case "medium":
+    default:
+      return "default";
+  }
+}
+
+function toStringArray(input) {
+  if (Array.isArray(input)) {
+    return input.map((v) => (typeof v === "string" ? v.trim() : "")).filter(Boolean);
+  }
+  if (typeof input === "string") {
+    return input
+      .split(",")
+      .map((v) => v.trim())
+      .filter(Boolean);
+  }
+  return [];
+}
+
+function getIntentConfig(settings, combo) {
+  const comboIntentConfig =
+    combo?.autoConfig?.intentConfig ||
+    combo?.config?.auto?.intentConfig ||
+    combo?.config?.intentConfig ||
+    {};
+
+  return {
+    ...DEFAULT_INTENT_CONFIG,
+    ...comboIntentConfig,
+    ...(typeof settings?.intentDetectionEnabled === "boolean"
+      ? { enabled: settings.intentDetectionEnabled }
+      : {}),
+    ...(Number.isFinite(Number(settings?.intentSimpleMaxWords))
+      ? { simpleMaxWords: Number(settings.intentSimpleMaxWords) }
+      : {}),
+    ...(toStringArray(settings?.intentExtraCodeKeywords).length > 0
+      ? { extraCodeKeywords: toStringArray(settings.intentExtraCodeKeywords) }
+      : {}),
+    ...(toStringArray(settings?.intentExtraReasoningKeywords).length > 0
+      ? { extraReasoningKeywords: toStringArray(settings.intentExtraReasoningKeywords) }
+      : {}),
+    ...(toStringArray(settings?.intentExtraSimpleKeywords).length > 0
+      ? { extraSimpleKeywords: toStringArray(settings.intentExtraSimpleKeywords) }
+      : {}),
+  };
+}
+
+function getBootstrapLatencyMs(modelId) {
+  const normalized = String(modelId || "").toLowerCase();
+  return DEFAULT_MODEL_P95_MS[normalized] ?? 1500;
+}
+
+async function buildAutoCandidates(modelStrings, comboName) {
+  const metrics = getComboMetrics(comboName);
+  const { getPricingForModel } = await import("../../src/lib/localDb");
+  let historicalLatencyStats = {};
+  try {
+    const { getModelLatencyStats } = await import("../../src/lib/usageDb");
+    historicalLatencyStats = await getModelLatencyStats({
+      windowHours: 24,
+      minSamples: 3,
+      maxRows: 10000,
+    });
+  } catch {
+    // keep empty stats — auto-combo will use runtime + bootstrap signals
+  }
+
+  const candidates = await Promise.all(
+    modelStrings.map(async (modelStr) => {
+      const parsed = parseModel(modelStr);
+      const provider = parsed.provider || parsed.providerAlias || "unknown";
+      const model = parsed.model || modelStr;
+      const historicalKey = `${provider}/${model}`;
+      const historicalModelMetric = historicalLatencyStats[historicalKey] || null;
+      const historicalTotal = Number(historicalModelMetric?.totalRequests);
+      const hasHistoricalSignal =
+        Number.isFinite(historicalTotal) && historicalTotal >= MIN_HISTORY_SAMPLES;
+
+      let costPer1MTokens = 1;
+      try {
+        const pricing = await getPricingForModel(provider, model);
+        const inputPrice = Number(pricing?.input);
+        if (Number.isFinite(inputPrice) && inputPrice >= 0) {
+          costPer1MTokens = inputPrice;
+        }
+      } catch {
+        // keep default cost
+      }
+
+      const modelMetric = metrics?.byModel?.[modelStr] || null;
+      const avgLatency = Number(modelMetric?.avgLatencyMs);
+      const successRate = Number(modelMetric?.successRate);
+      const historicalP95Latency = Number(historicalModelMetric?.p95LatencyMs);
+      const historicalStdDev = Number(historicalModelMetric?.latencyStdDev);
+      const historicalSuccessRate = Number(historicalModelMetric?.successRate); // 0..1
+
+      const p95LatencyMs = hasHistoricalSignal
+        ? Number.isFinite(historicalP95Latency) && historicalP95Latency > 0
+          ? historicalP95Latency
+          : getBootstrapLatencyMs(model)
+        : Number.isFinite(avgLatency) && avgLatency > 0
+          ? avgLatency
+          : getBootstrapLatencyMs(model);
+
+      const errorRate = hasHistoricalSignal
+        ? Number.isFinite(historicalSuccessRate) &&
+          historicalSuccessRate >= 0 &&
+          historicalSuccessRate <= 1
+          ? 1 - historicalSuccessRate
+          : 0.05
+        : Number.isFinite(successRate) && successRate >= 0 && successRate <= 100
+          ? 1 - successRate / 100
+          : 0.05;
+      const latencyStdDev =
+        hasHistoricalSignal && Number.isFinite(historicalStdDev) && historicalStdDev > 0
+          ? Math.max(10, historicalStdDev)
+          : Math.max(10, p95LatencyMs * 0.1);
+
+      const breakerStateRaw = getCircuitBreaker(`combo:${modelStr}`)?.getStatus?.()?.state;
+      const circuitBreakerState =
+        breakerStateRaw === "OPEN" || breakerStateRaw === "HALF_OPEN" ? breakerStateRaw : "CLOSED";
+
+      return {
+        provider,
+        model,
+        quotaRemaining: 100,
+        quotaTotal: 100,
+        circuitBreakerState,
+        costPer1MTokens,
+        p95LatencyMs,
+        latencyStdDev,
+        errorRate,
+        accountTier: "standard",
+        quotaResetIntervalSecs: 86400,
+      };
+    })
+  );
+
+  return candidates;
+}
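The nested ternary that resolves `p95LatencyMs` in `buildAutoCandidates` can be hard to follow. The sketch below restates the same precedence as a flat function, under the assumption that the signals arrive as plain numbers; `resolveP95`, `LatencySignals` and the two-entry bootstrap table are illustrative names and data, not part of the change.

```typescript
// Illustrative subset of the bootstrap table from the diff.
const BOOTSTRAP_P95_MS: Record<string, number> = {
  "gpt-4o-mini": 2764,
  "deepseek-chat": 2000,
};
const FALLBACK_P95_MS = 1500; // default used when a model is not in the table
const MIN_SAMPLES = 10; // mirrors MIN_HISTORY_SAMPLES

interface LatencySignals {
  historicalSamples?: number; // request count behind the historical stats
  historicalP95Ms?: number; // p95 from the usage database
  runtimeAvgMs?: number; // in-memory average from combo metrics
}

function isPositive(n: number | undefined): n is number {
  return typeof n === "number" && Number.isFinite(n) && n > 0;
}

function bootstrapLatency(model: string): number {
  return BOOTSTRAP_P95_MS[model.toLowerCase()] ?? FALLBACK_P95_MS;
}

// Precedence: trusted history, else runtime average, else bootstrap default.
// With enough history but an unusable p95, the runtime average is skipped.
function resolveP95(model: string, s: LatencySignals): number {
  const hasHistory = (s.historicalSamples ?? 0) >= MIN_SAMPLES;
  if (hasHistory) {
    return isPositive(s.historicalP95Ms) ? s.historicalP95Ms : bootstrapLatency(model);
  }
  return isPositive(s.runtimeAvgMs) ? s.runtimeAvgMs : bootstrapLatency(model);
}

console.log(resolveP95("gpt-4o-mini", { historicalSamples: 25, historicalP95Ms: 900 }));
console.log(resolveP95("gpt-4o-mini", { historicalSamples: 3, historicalP95Ms: 900, runtimeAvgMs: 1200 }));
console.log(resolveP95("gpt-4o-mini", {}));
```

Note that a model with fewer than ten historical samples never uses its historical p95, even when one is present.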
+
 /**
 * Handle combo chat with fallback
 * Supports all 6 strategies: priority, weighted, round-robin, random, least-used, cost-optimized
@@ -225,12 +431,49 @@ export async function handleComboChat({
  const strategy = combo.strategy || "priority";
  const models = combo.models || [];

+  // ── Combo Agent Middleware (#399 + #401) ────────────────────────────────
+  // Apply system_message override, tool_filter_regex, and extract pinned model
+  // from context caching tag. These are all opt-in per combo config.
+  const { body: agentBody, pinnedModel } = applyComboAgentMiddleware(
+    body,
+    combo,
+    "" // provider/model not yet known — resolved per-model in loop
+  );
+  body = agentBody;
+  if (pinnedModel) {
+    log.info("COMBO", `[#401] Context caching: pinned model=${pinnedModel}`);
+  }
+  // Wrap handleSingleModel to inject context caching tag on response (#401)
+  const handleSingleModelWrapped = combo.context_cache_protection
+    ? async (b, modelStr) => {
+        const res = await handleSingleModel(b, modelStr);
+        // Inject tag only on success and only for non-streaming non-binary responses
+        if (res.ok && !b.stream) {
+          try {
+            const json = await res.clone().json();
+            const msgs = Array.isArray(json?.messages) ? json.messages : [];
+            if (msgs.length > 0) {
+              const tagged = injectModelTag(msgs, modelStr);
+              return new Response(JSON.stringify({ ...json, messages: tagged }), {
+                status: res.status,
+                headers: res.headers,
+              });
+            }
+          } catch {
+            /* non-JSON or stream — skip tagging */
+          }
+        }
+        return res;
+      }
+    : handleSingleModel;
+  // ─────────────────────────────────────────────────────────────────────────
+
  // Route to round-robin handler if strategy matches
  if (strategy === "round-robin") {
    return handleRoundRobinCombo({
      body,
      combo,
-      handleSingleModel,
+      handleSingleModel: handleSingleModelWrapped,
      isModelAvailable,
      log,
      settings,
@@ -278,7 +521,131 @@ export async function handleComboChat({
  }

  // Apply strategy-specific ordering
-  if (strategy === "strict-random") {
+  if (strategy === "auto") {
+    const requestHasTools = Array.isArray(body?.tools) && body.tools.length > 0;
+    let eligibleModels = [...orderedModels];
+
+    if (requestHasTools) {
+      const filtered = eligibleModels.filter((m) => supportsToolCalling(m));
+      if (filtered.length > 0) {
+        eligibleModels = filtered;
+      } else {
+        log.warn(
+          "COMBO",
+          "Auto strategy: all candidates filtered by tool-calling policy, falling back to full pool"
+        );
+      }
+    }
+
+    const prompt = extractPromptForIntent(body);
+    const systemPrompt =
+      typeof combo?.system_message === "string" ? combo.system_message : undefined;
+    const intentConfig = getIntentConfig(settings, combo);
+    const intent = classifyWithConfig(prompt, intentConfig, systemPrompt);
+    recordComboIntent(combo.name, intent);
+    const taskType = mapIntentToTaskType(intent);
+
+    const autoConfigSource = combo?.autoConfig || combo?.config?.auto || combo?.config || {};
+    const routingStrategy =
+      typeof autoConfigSource.routingStrategy === "string"
+        ? autoConfigSource.routingStrategy
+        : typeof autoConfigSource.strategyName === "string"
+          ? autoConfigSource.strategyName
+          : "rules";
+
+    const candidatePool = Array.isArray(autoConfigSource.candidatePool)
+      ? autoConfigSource.candidatePool
+      : [
+          ...new Set(
+            eligibleModels.map((m) => {
+              const parsed = parseModel(m);
+              return parsed.provider || parsed.providerAlias || "unknown";
+            })
+          ),
+        ];
+
+    const weights =
+      autoConfigSource.weights && typeof autoConfigSource.weights === "object"
+        ? autoConfigSource.weights
+        : DEFAULT_WEIGHTS;
+    const explorationRate = Number.isFinite(Number(autoConfigSource.explorationRate))
+      ? Number(autoConfigSource.explorationRate)
+      : 0.05;
+    const budgetCap = Number.isFinite(Number(autoConfigSource.budgetCap))
+      ? Number(autoConfigSource.budgetCap)
+      : undefined;
+    const modePack =
+      typeof autoConfigSource.modePack === "string" ? autoConfigSource.modePack : undefined;
+
+    const candidates = await buildAutoCandidates(eligibleModels, combo.name);
+    if (candidates.length > 0) {
+      let selectedProvider = null;
+      let selectedModel = null;
+      let selectionReason = "";
+
+      if (routingStrategy !== "rules") {
+        try {
+          const decision = selectWithStrategy(
+            candidates,
+            { taskType, requestHasTools },
+            routingStrategy
+          );
+          selectedProvider = decision.provider;
+          selectedModel = decision.model;
+          selectionReason = decision.reason;
+        } catch (err) {
+          log.warn(
+            "COMBO",
+            `Auto strategy '${routingStrategy}' failed (${err?.message || "unknown"}), falling back to rules`
+          );
+        }
+      }
+
+      if (!selectedProvider || !selectedModel) {
+        const selection = selectAutoProvider(
+          {
+            id: combo.id || combo.name,
+            name: combo.name,
+            type: "auto",
+            candidatePool,
+            weights,
+            modePack,
+            budgetCap,
+            explorationRate,
+          },
+          candidates,
+          taskType
+        );
+        selectedProvider = selection.provider;
+        selectedModel = selection.model;
+        selectionReason = `score=${selection.score.toFixed(3)}${selection.isExploration ? " (exploration)" : ""}`;
+      }
+
+      const modelLookup = new Map();
+      for (const modelStr of eligibleModels) {
+        const parsed = parseModel(modelStr);
+        const provider = parsed.provider || parsed.providerAlias || "unknown";
+        const modelId = parsed.model || modelStr;
+        modelLookup.set(`${provider}/${modelId}`, modelStr);
+      }
+
+      const ranked = scorePool(candidates, taskType, weights)
+        .map((r) => modelLookup.get(`${r.provider}/${r.model}`) || `${r.provider}/${r.model}`)
+        .filter(Boolean);
+
+      const selectedModelStr =
+        modelLookup.get(`${selectedProvider}/${selectedModel}`) ||
+        `${selectedProvider}/${selectedModel}`;
+      orderedModels = [...new Set([selectedModelStr, ...ranked, ...eligibleModels])];
+
+      log.info(
+        "COMBO",
+        `Auto selection: ${selectedModelStr} | intent=${intent} task=${taskType} | strategy=${routingStrategy} | ${selectionReason}`
+      );
+    } else {
+      log.warn("COMBO", "Auto strategy has no candidates, keeping default ordering");
+    }
+  } else if (strategy === "strict-random") {
    const selectedId = await getNextFromDeck(`combo:${combo.name}`, orderedModels);
    // Put selected model first so the fallback loop tries it first
    const rest = orderedModels.filter((m) => m !== selectedId);
@@ -348,7 +715,7 @@ export async function handleComboChat({
        `Trying model ${i + 1}/${orderedModels.length}: ${modelStr}${retry > 0 ? ` (retry ${retry})` : ""}`
      );

-      const result = await handleSingleModel(body, modelStr);
+      const result = await handleSingleModelWrapped(body, modelStr);

      // Success — return response
      if (result.ok) {
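In the `auto` branch, the fallback order is built by placing the selected model first, then the score-ranked models, then any remaining eligible models, with duplicates removed. A small sketch of that ordering step, assuming plain model strings; `orderForFallback` is a hypothetical helper name.

```typescript
// A Set keeps first-insertion order, so the selected model stays in front
// and later duplicates are dropped.
function orderForFallback(selected: string, ranked: string[], eligible: string[]): string[] {
  return Array.from(new Set([selected, ...ranked, ...eligible]));
}

const fallbackOrder = orderForFallback(
  "xai/grok-4-fast",
  ["openai/gpt-4o-mini", "xai/grok-4-fast", "deepseek/deepseek-chat"],
  ["deepseek/deepseek-chat", "zai/glm-5", "openai/gpt-4o-mini"]
);
console.log(fallbackOrder);
```

The fallback loop then tries the models in this order, so a failed first choice degrades to the next-best scored model rather than to the combo's static order.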
@@ -0,0 +1,169 @@
+/**
+ * comboAgentMiddleware.ts — Combo Agent Features
+ *
+ * Implements the "combo as agent" features from issues #399 and #401:
+ *
+ * 1. **System Message Override** (#399): If the combo defines a `system_message`,
+ *    it is injected as the first system message, replacing any existing system message.
+ *
+ * 2. **Tool Filter Regex** (#399): If the combo defines a `tool_filter_regex`,
+ *    only tools whose name matches the pattern are forwarded to the provider.
+ *
+ * 3. **Context Caching Protection** (#401): If the combo enables
+ *    `context_cache_protection`, the proxy:
+ *    a. On response: injects `<omniModel>provider/model</omniModel>` tag into
+ *       the last assistant message, if its content is a string.
+ *    b. On request: scans the message history for the tag, and if found,
+ *       overrides the requested model with the pinned one.
+ *
+ * All features are opt-in per combo and backward compatible with existing setups.
+ */
+
+interface ComboConfig {
+  system_message?: string | null;
+  tool_filter_regex?: string | null;
+  context_cache_protection?: number | boolean;
+  [key: string]: unknown;
+}
+
+interface Message {
+  role?: string;
+  content?: unknown;
+  [key: string]: unknown;
+}
+
+// ── Context Caching Tag ─────────────────────────────────────────────────────
+
+const CACHE_TAG_PATTERN = /<omniModel>([^<]+)<\/omniModel>/;
+
+/**
+ * Inject the model tag into the last assistant message. If there is no assistant
+ * message, or its content is not a string, no tag is added: array content is left
+ * untouched to avoid breaking Claude/Gemini multi-part message formats.
+ */
+export function injectModelTag(messages: Message[], providerModel: string): Message[] {
+  // Remove any existing tag first to avoid duplication on context compaction
+  const cleaned = messages.map((msg) => {
+    if (msg.role === "assistant" && typeof msg.content === "string") {
+      return { ...msg, content: msg.content.replace(CACHE_TAG_PATTERN, "").trimEnd() };
+    }
+    return msg;
+  });
+
+  // Find the last assistant message; bail out below if its content is not a string
+  const lastAssistantIdx = cleaned.map((m) => m.role).lastIndexOf("assistant");
+  if (lastAssistantIdx === -1) return cleaned;
+
+  const msg = cleaned[lastAssistantIdx];
+  if (typeof msg.content !== "string") return cleaned;
+
+  const tagged = [...cleaned];
+  tagged[lastAssistantIdx] = {
+    ...msg,
+    content: `${msg.content}\n<omniModel>${providerModel}</omniModel>`,
+  };
+  return tagged;
+}
+
+/**
+ * Scan message history for the model tag injected by a previous response.
+ * Returns the pinned "provider/model" string, or null if not found.
+ */
+export function extractPinnedModel(messages: Message[]): string | null {
+  // Scan from newest to oldest for efficiency
+  for (let i = messages.length - 1; i >= 0; i--) {
+    const msg = messages[i];
+    if (msg.role === "assistant" && typeof msg.content === "string") {
+      const match = CACHE_TAG_PATTERN.exec(msg.content);
+      if (match) return match[1];
+    }
+  }
+  return null;
+}
+
+// ── System Message Override ──────────────────────────────────────────────────
+
+/**
+ * Replace or inject a system message at the beginning of the messages array.
+ * Existing system messages are removed if a combo override is set.
+ */
+export function applySystemMessageOverride(messages: Message[], systemMessage: string): Message[] {
+  // Remove all existing system messages
+  const filtered = messages.filter((m) => m.role !== "system");
+  // Inject combo system message at start
+  return [{ role: "system", content: systemMessage }, ...filtered];
+}
+
+// ── Tool Filter Regex ────────────────────────────────────────────────────────
+
+/**
+ * Filter the tools array, keeping only tools whose name matches the regex.
+ * Returns the original array unchanged if pattern is null/empty.
+ */
+export function applyToolFilter(
+  tools: unknown[] | undefined,
+  pattern: string | null | undefined
+): unknown[] | undefined {
+  if (!tools || !pattern) return tools;
+
+  let regex: RegExp;
+  try {
+    regex = new RegExp(pattern);
+  } catch {
+    // Invalid regex — return tools unchanged rather than crashing
+    console.warn(`[ComboAgent] Invalid tool_filter_regex: "${pattern}"`);
+    return tools;
+  }
+
+  return tools.filter((tool) => {
+    const t = tool as Record<string, unknown>;
+    // Support both OpenAI format ({ function: { name } }) and Anthropic ({ name })
+    const name = (t.function as Record<string, unknown> | undefined)?.name ?? t.name ?? "";
+    return regex.test(String(name));
+  });
+}
+
+// ── Main Middleware ──────────────────────────────────────────────────────────
+
+/**
+ * Apply all combo agent features to the request body.
+ * Safe to call with null/undefined comboConfig — returns body unchanged.
+ */
+export function applyComboAgentMiddleware(
+  body: Record<string, unknown>,
+  comboConfig: ComboConfig | null | undefined,
+  providerModel: string // "provider/model" string for context caching
+): { body: Record<string, unknown>; pinnedModel: string | null } {
+  if (!comboConfig) return { body, pinnedModel: null };
+
+  let messages: Message[] = Array.isArray(body.messages) ? [...body.messages] : [];
+  let pinnedModel: string | null = null;
+
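+  // 1. Context caching: look for a model pinned by an earlier response.
+  //    Nothing is overridden here. The pin is only reported through the
+  //    return value, and the caller decides whether to switch models.
+  //    Note that the providerModel argument is not read in this function.
+  if (comboConfig.context_cache_protection) {
+    pinnedModel = extractPinnedModel(messages);
+  }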
+
+  // 2. System message override
+  if (comboConfig.system_message && comboConfig.system_message.trim()) {
+    messages = applySystemMessageOverride(messages, comboConfig.system_message);
+  }
+
+  // 3. Tool filter
+  const filteredTools = applyToolFilter(
+    body.tools as unknown[] | undefined,
+    comboConfig.tool_filter_regex
+  );
+
+  return {
+    body: {
+      ...body,
+      messages,
+      ...(filteredTools !== body.tools && { tools: filteredTools }),
+    },
+    pinnedModel,
+  };
+}
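The tag round trip described in the header comment can be sketched in isolation. This is a minimal re-statement, not the module itself: `ChatMessage`, `tagLastAssistant`, and `findPinnedModel` are hypothetical names chosen for the example, and the pinned value `nvidia/openai/gpt-oss-120b` is only sample data.

```typescript
// Sketch only: trimmed stand-ins for injectModelTag / extractPinnedModel.
type ChatMessage = { role: string; content: unknown };

const TAG = /<omniModel>([^<]+)<\/omniModel>/;

// Append the tag to the last assistant message when its content is a string.
function tagLastAssistant(messages: ChatMessage[], providerModel: string): ChatMessage[] {
  const idx = messages.map((m) => m.role).lastIndexOf("assistant");
  if (idx === -1) return messages;
  const target = messages[idx];
  if (!target || typeof target.content !== "string") return messages;
  const copy = [...messages];
  copy[idx] = { ...target, content: `${target.content}\n<omniModel>${providerModel}</omniModel>` };
  return copy;
}

// Scan newest-to-oldest and return the first pinned model found, or null.
function findPinnedModel(messages: ChatMessage[]): string | null {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m && m.role === "assistant" && typeof m.content === "string") {
      const match = TAG.exec(m.content);
      if (match) return match[1] ?? null;
    }
  }
  return null;
}

const chatHistory: ChatMessage[] = [
  { role: "user", content: "Hi" },
  { role: "assistant", content: "Hello!" },
];

// Response side: tag the assistant turn before it goes back to the client.
const taggedHistory = tagLastAssistant(chatHistory, "nvidia/openai/gpt-oss-120b");

// Next request: the client echoes the history, and the pin is recovered from it.
const pinnedFromHistory = findPinnedModel([
  ...taggedHistory,
  { role: "user", content: "Next question" },
]);
```

The pin survives only as long as the client sends the tagged assistant message back unmodified; a client that strips or rewrites assistant content would lose it.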
@@ -21,6 +21,7 @@ interface ComboMetricsEntry {
  totalLatencyMs: number;
  strategy: string;
  lastUsedAt: string | null;
+  intentCounts: Record<string, number>;
  byModel: Record<string, ModelMetrics>;
 }

@@ -69,6 +70,7 @@ export function recordComboRequest(
      totalLatencyMs: 0,
      strategy,
      lastUsedAt: null,
+      intentCounts: {},
      byModel: {},
    });
  }
@@ -131,6 +133,7 @@ export function getComboMetrics(comboName: string): ComboMetricsView | null {
      combo.totalRequests > 0 ? Math.round((combo.totalSuccesses / combo.totalRequests) * 100) : 0,
    fallbackRate:
      combo.totalRequests > 0 ? Math.round((combo.totalFallbacks / combo.totalRequests) * 100) : 0,
+    intentCounts: { ...combo.intentCounts },
    byModel: Object.fromEntries(
      Object.entries(combo.byModel).map(([model, m]) => [
        model,
@@ -156,6 +159,30 @@ export function getAllComboMetrics(): Record<string, ComboMetricsView | null> {
  return result;
 }

+/**
+ * Record detected prompt intent for a combo (used by multilingual routing analytics).
+ */
+export function recordComboIntent(comboName: string, intent: string): void {
+  if (!metrics.has(comboName)) {
+    metrics.set(comboName, {
+      totalRequests: 0,
+      totalSuccesses: 0,
+      totalFailures: 0,
+      totalFallbacks: 0,
+      totalLatencyMs: 0,
+      strategy: "priority",
+      lastUsedAt: null,
+      intentCounts: {},
+      byModel: {},
+    });
+  }
+
+  const combo = metrics.get(comboName);
+  if (!combo) return;
+  const key = String(intent || "unknown");
+  combo.intentCounts[key] = (combo.intentCounts[key] || 0) + 1;
+}
+
 /**
 * Reset metrics for a specific combo
 */
@@ -0,0 +1,103 @@
+/**
+ * Emergency Fallback — Budget Exhaustion Redirect
+ *
+ * When a request fails due to budget exhaustion (HTTP 402 or budget keywords
+ * in the error body), optionally redirect to a free-tier model
+ * (default provider/model: nvidia + openai/gpt-oss-120b at $0.00/M tokens).
+ *
+ * Inspired by ClawRouter: "gpt-oss-120b costs nothing and serves as
+ * automatic fallback when wallet is empty."
+ */
+
+export interface EmergencyFallbackConfig {
+  enabled: boolean;
+  provider: string;
+  model: string;
+  triggerOn402: boolean;
+  triggerOnBudgetKeywords: boolean;
+  budgetKeywords: string[];
+  /** Skip fallback for tool requests (gpt-oss-120b may not support structured tool calling) */
+  skipForToolRequests: boolean;
+  maxOutputTokens: number;
+}
+
+export const EMERGENCY_FALLBACK_CONFIG: EmergencyFallbackConfig = {
+  enabled: true,
+  provider: "nvidia",
+  model: "openai/gpt-oss-120b",
+  triggerOn402: true,
+  triggerOnBudgetKeywords: true,
+  budgetKeywords: [
+    "insufficient funds",
+    "insufficient_funds",
+    "budget exceeded",
+    "budget_exceeded",
+    "quota exceeded",
+    "quota_exceeded",
+    "billing",
+    "payment required",
+    "out of credits",
+    "no credits",
+    "credit limit",
+    "spending limit",
+    "saldo insuficiente",
+    "limite de gastos",
+    "cota excedida",
+  ],
+  skipForToolRequests: true,
+  maxOutputTokens: 4096,
+};
+
+export interface FallbackDecision {
+  shouldFallback: true;
+  reason: string;
+  provider: string;
+  model: string;
+  maxOutputTokens: number;
+}
+
+export interface NoFallbackDecision {
+  shouldFallback: false;
+  reason: string;
+}
+
+export type FallbackResult = FallbackDecision | NoFallbackDecision;
+
+export function shouldUseFallback(
+  status: number,
+  errorBody: string,
+  requestHasTools: boolean,
+  config: EmergencyFallbackConfig = EMERGENCY_FALLBACK_CONFIG
+): FallbackResult {
+  if (!config.enabled) return { shouldFallback: false, reason: "emergency fallback disabled" };
+  if (config.skipForToolRequests && requestHasTools) {
+    return { shouldFallback: false, reason: "skipped: request has tools" };
+  }
+  if (config.triggerOn402 && status === 402) {
+    return {
+      shouldFallback: true,
+      reason: `HTTP 402 → emergency fallback to ${config.provider}/${config.model}`,
+      provider: config.provider,
+      model: config.model,
+      maxOutputTokens: config.maxOutputTokens,
+    };
+  }
+  if (config.triggerOnBudgetKeywords && errorBody) {
+    const lowerBody = errorBody.toLowerCase();
+    const matched = config.budgetKeywords.find((kw) => lowerBody.includes(kw.toLowerCase()));
+    if (matched) {
+      return {
+        shouldFallback: true,
+        reason: `Budget error detected ('${matched}') → emergency fallback to ${config.provider}/${config.model}`,
+        provider: config.provider,
+        model: config.model,
+        maxOutputTokens: config.maxOutputTokens,
+      };
+    }
+  }
+  return { shouldFallback: false, reason: "no budget error detected" };
+}
+
+export function isFallbackDecision(result: FallbackResult): result is FallbackDecision {
+  return result.shouldFallback === true;
+}
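The decision order in `shouldUseFallback` can be illustrated with a reduced sketch. `Decision`, `decide`, and the three-keyword list are assumptions made for the example; the real function also honours the `enabled`, `triggerOn402`, and `triggerOnBudgetKeywords` flags.

```typescript
// Sketch only: a reduced version of the decision order, with a subset of the keywords.
type Decision = { shouldFallback: boolean; reason: string };

const SAMPLE_BUDGET_KEYWORDS = ["insufficient funds", "quota exceeded", "billing"];

function decide(status: number, errorBody: string, requestHasTools: boolean): Decision {
  // Tool requests are skipped before any status or keyword check.
  if (requestHasTools) return { shouldFallback: false, reason: "skipped: request has tools" };
  if (status === 402) return { shouldFallback: true, reason: "HTTP 402" };
  // Keywords are matched as case-insensitive substrings of the error body.
  const lower = errorBody.toLowerCase();
  const matched = SAMPLE_BUDGET_KEYWORDS.find((kw) => lower.includes(kw));
  if (matched) return { shouldFallback: true, reason: `keyword: ${matched}` };
  return { shouldFallback: false, reason: "no budget error detected" };
}

const on402 = decide(402, "", false);
const onKeyword = decide(429, "Quota exceeded for this key", false);
const withTools = decide(402, "", true);
// Substring matching means any body that mentions "billing" triggers the fallback.
const billingMention = decide(500, "see https://example.com/billing for details", false);
```

The last case shows why broad keywords such as `billing` deserve care: the match is on the whole error body, so an unrelated error that merely links to a billing page is treated as budget exhaustion.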
@@ -0,0 +1,375 @@
+/**
+ * Multilingual Intent Detection for AutoCombo
+ *
+ * Classifies prompts as: code | reasoning | simple | medium
+ * using keywords in 9 languages (EN, PT-BR, ES, ZH, JA, RU, DE, KO, AR).
+ *
+ * Inspired by ClawRouter (BlockRunAI) multilingual routing system.
+ * Execution: purely synchronous, <1ms, no I/O.
+ */
+
+export type IntentType = "code" | "reasoning" | "simple" | "medium";
+
+export const CODE_KEYWORDS: readonly string[] = [
+  // English
+  "function",
+  "class",
+  "import",
+  "def",
+  "SELECT",
+  "async",
+  "await",
+  "const",
+  "let",
+  "var",
+  "return",
+  "```",
+  "algorithm",
+  "compile",
+  "debug",
+  "refactor",
+  "typescript",
+  "python",
+  "javascript",
+  "code",
+  "implement",
+  "write a",
+  "create a component",
+  "endpoint",
+  "repository",
+  "deploy",
+  "install",
+  "script",
+  "api",
+  "database",
+  "query",
+  "schema",
+  "interface",
+  "generic",
+  "enum",
+  "module",
+  "package",
+  "dependency",
+  // Português (PT-BR)
+  "função",
+  "classe",
+  "importar",
+  "definir",
+  "consulta",
+  "assíncrono",
+  "aguardar",
+  "constante",
+  "variável",
+  "retornar",
+  "algoritmo",
+  "compilar",
+  "depurar",
+  "refatorar",
+  "código",
+  "implementar",
+  "criar um",
+  "componente",
+  "como fazer",
+  "repositório",
+  "configurar",
+  "instalar",
+  "banco de dados",
+  "escrever uma função",
+  "criar uma classe",
+  // Español
+  "función",
+  "clase",
+  "importar",
+  "definir",
+  "consulta",
+  "asíncrono",
+  "esperar",
+  "constante",
+  "variable",
+  "retornar",
+  "algoritmo",
+  "compilar",
+  "depurar",
+  "refactorizar",
+  "código",
+  "implementar",
+  // 中文
+  "函数",
+  "类",
+  "导入",
+  "定义",
+  "查询",
+  "异步",
+  "等待",
+  "常量",
+  "变量",
+  "返回",
+  "算法",
+  "编译",
+  "调试",
+  "代码",
+  // 日本語
+  "関数",
+  "クラス",
+  "インポート",
+  "非同期",
+  "定数",
+  "変数",
+  "コード",
+  "アルゴリズム",
+  // Русский
+  "функция",
+  "класс",
+  "импорт",
+  "запрос",
+  "асинхронный",
+  "константа",
+  "переменная",
+  "алгоритм",
+  "код",
+  // Deutsch
+  "funktion",
+  "klasse",
+  "importieren",
+  "abfrage",
+  "asynchron",
+  "konstante",
+  "variable",
+  "algorithmus",
+  "code",
+  // 한국어
+  "함수",
+  "클래스",
+  "가져오기",
+  "정의",
+  "쿼리",
+  "비동기",
+  "대기",
+  "상수",
+  "변수",
+  "반환",
+  "코드",
+  // العربية
+  "دالة",
+  "فئة",
+  "استيراد",
+  "استعلام",
+  "غير متزامن",
+  "ثابت",
+  "متغير",
+  "كود",
+  "خوارزمية",
+];
+
+export const REASONING_KEYWORDS: readonly string[] = [
+  // English
+  "prove",
+  "theorem",
+  "derive",
+  "step by step",
+  "chain of thought",
+  "formally",
+  "mathematical",
+  "proof",
+  "logically",
+  "analyze",
+  "reasoning",
+  "deduce",
+  "infer",
+  "hypothesis",
+  "convergence",
+  // Português (PT-BR)
+  "provar",
+  "teorema",
+  "derivar",
+  "passo a passo",
+  "cadeia de pensamento",
+  "formalmente",
+  "matemático",
+  "prova",
+  "logicamente",
+  "analisar",
+  "raciocínio",
+  "deduzir",
+  "inferir",
+  "hipótese",
+  "demonstrar",
+  "cálculo",
+  "equação diferencial",
+  "integral",
+  "otimização",
+  // Español
+  "demostrar",
+  "teorema",
+  "derivar",
+  "paso a paso",
+  "formalmente",
+  "matemático",
+  "lógicamente",
+  // 中文
+  "证明",
+  "定理",
+  "推导",
+  "逐步",
+  "思维链",
+  "数学",
+  "逻辑",
+  "分析",
+  // 日本語
+  "証明",
+  "定理",
+  "導出",
+  "論理的",
+  "分析",
+  // Русский
+  "доказать",
+  "теорема",
+  "шаг за шагом",
+  "математически",
+  "логически",
+  // Deutsch
+  "beweisen",
+  "theorem",
+  "schritt für schritt",
+  "mathematisch",
+  "logisch",
+  // 한국어
+  "증명",
+  "정리",
+  "단계별",
+  "수학적",
+  "논리적",
+  // العربية
+  "إثبات",
+  "نظرية",
+  "خطوة بخطوة",
+  "رياضي",
+  "منطقياً",
+];
+
+export const SIMPLE_KEYWORDS: readonly string[] = [
+  // English
+  "what is",
+  "define",
+  "translate",
+  "hello",
+  "yes or no",
+  "summarize",
+  "list",
+  "tell me",
+  "who is",
+  // Português (PT-BR)
+  "o que é",
+  "definir",
+  "traduzir",
+  "olá",
+  "oi",
+  "sim ou não",
+  "resumir",
+  "listar",
+  "me diga",
+  "quem é",
+  "quando foi",
+  "onde fica",
+  "explique brevemente",
+  "de forma simples",
+  // Español
+  "qué es",
+  "definir",
+  "traducir",
+  "hola",
+  "resumir",
+  "listar",
+  // 中文
+  "什么是",
+  "定义",
+  "翻译",
+  "你好",
+  "总结",
+  "列出",
+  // Русский
+  "что такое",
+  "определить",
+  "перевести",
+  "привет",
+  "резюмировать",
+  // Deutsch
+  "was ist",
+  "definieren",
+  "übersetzen",
+  "hallo",
+  "zusammenfassen",
+  // 한국어
+  "이란",
+  "정의",
+  "번역",
+  "안녕",
+  "요약",
+  // العربية
+  "ما هو",
+  "تعريف",
+  "ترجمة",
+  "مرحبا",
+  "ملخص",
+];
+
+/**
+ * Classify a prompt's intent using multilingual keyword matching.
+ * Priority: code > reasoning > simple (prompts under 60 words only) > medium (default)
+ */
+export function classifyPromptIntent(prompt: string, systemPrompt?: string): IntentType {
+  const fullText = `${systemPrompt ?? ""} ${prompt}`.toLowerCase();
+  const wordCount = prompt.trim().split(/\s+/).length;
+
+  for (const kw of CODE_KEYWORDS) {
+    if (fullText.includes(kw.toLowerCase())) return "code";
+  }
+  for (const kw of REASONING_KEYWORDS) {
+    if (fullText.includes(kw.toLowerCase())) return "reasoning";
+  }
+  if (wordCount < 60) {
+    for (const kw of SIMPLE_KEYWORDS) {
+      if (fullText.includes(kw.toLowerCase())) return "simple";
+    }
+  }
+  return "medium";
+}
+
+export interface IntentClassifierConfig {
+  enabled: boolean;
+  extraCodeKeywords?: string[];
+  extraReasoningKeywords?: string[];
+  extraSimpleKeywords?: string[];
+  simpleMaxWords?: number;
+}
+
+export const DEFAULT_INTENT_CONFIG: IntentClassifierConfig = {
+  enabled: true,
+  simpleMaxWords: 60,
+};
+
+export function classifyWithConfig(
+  prompt: string,
+  config: IntentClassifierConfig,
+  systemPrompt?: string
+): IntentType {
+  if (!config.enabled) return "medium";
+  const fullText = `${systemPrompt ?? ""} ${prompt}`.toLowerCase();
+  const wordCount = prompt.trim().split(/\s+/).length;
+  const maxSimpleWords = config.simpleMaxWords ?? 60;
+  const codeKws = [...CODE_KEYWORDS, ...(config.extraCodeKeywords ?? [])];
+  const reasoningKws = [...REASONING_KEYWORDS, ...(config.extraReasoningKeywords ?? [])];
+  const simpleKws = [...SIMPLE_KEYWORDS, ...(config.extraSimpleKeywords ?? [])];
+  for (const kw of codeKws) {
+    if (fullText.includes(kw.toLowerCase())) return "code";
+  }
+  for (const kw of reasoningKws) {
+    if (fullText.includes(kw.toLowerCase())) return "reasoning";
+  }
+  if (wordCount < maxSimpleWords) {
+    for (const kw of simpleKws) {
+      if (fullText.includes(kw.toLowerCase())) return "simple";
+    }
+  }
+  return "medium";
+}
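The priority order and the substring matching used by the classifier can be seen with a much smaller keyword set. The lists and the `classify` helper below are assumptions made for the example and stand in for the full multilingual lists.

```typescript
// Sketch only: tiny keyword sets standing in for the full multilingual lists.
type Intent = "code" | "reasoning" | "simple" | "medium";

const SAMPLE_CODE = ["function", "let", "código"];
const SAMPLE_REASONING = ["prove", "passo a passo"];
const SAMPLE_SIMPLE = ["what is", "o que é"];

function classify(prompt: string, simpleMaxWords = 60): Intent {
  const text = prompt.toLowerCase();
  const wordCount = prompt.trim().split(/\s+/).length;
  // First matching category wins: code, then reasoning, then simple.
  if (SAMPLE_CODE.some((kw) => text.includes(kw))) return "code";
  if (SAMPLE_REASONING.some((kw) => text.includes(kw))) return "reasoning";
  if (wordCount < simpleMaxWords && SAMPLE_SIMPLE.some((kw) => text.includes(kw))) return "simple";
  return "medium";
}

const codeBeatsSimple = classify("What is a function?"); // "function" is checked before "what is"
const simplePortuguese = classify("O que é o sol?");
const letterPrompt = classify("Please draft a letter to HR"); // "letter" contains "let"
const fallbackIntent = classify("Plan my weekend");
```

Because keywords are matched with `includes`, short entries such as `let`, `var`, `def`, or `oi` also match inside longer words. A word-boundary match would be stricter, but that is a different technique from the one used in this file.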
@@ -23,6 +23,18 @@ const PROVIDER_MODEL_ALIASES = {
    "gemini-3-flash": "gemini-3-flash-preview",
    "raptor-mini": "oswe-vscode-prime",
  },
+  gemini: {
+    "gemini-3.1-pro-preview": "gemini-3.1-pro",
+    "gemini-3-1-pro": "gemini-3.1-pro",
+  },
+  "gemini-cli": {
+    "gemini-3.1-pro-preview": "gemini-3.1-pro",
+    "gemini-3-1-pro": "gemini-3.1-pro",
+  },
+  nvidia: {
+    "gpt-oss-120b": "openai/gpt-oss-120b",
+    "nvidia/gpt-oss-120b": "openai/gpt-oss-120b",
+  },
  antigravity: {},
 };

@@ -0,0 +1,50 @@
+import { PROVIDER_ID_TO_ALIAS, PROVIDER_MODELS } from "../config/providerModels.ts";
+import { parseModel } from "./model.ts";
+
+// Conservative denylist fallback used when registry metadata is absent. Keep it small and
+// explicit: patterns match as substrings, so a broad entry can wrongly block other models.
+const TOOL_CALLING_UNSUPPORTED_PATTERNS = [
+  "gpt-oss-120b",
+  "deepseek-reasoner",
+  "glm-4.7",
+  "glm4.7",
+];
+
+function getRegistryToolCallingFlag(providerIdOrAlias: string, modelId: string): boolean | null {
+  const providerAlias = PROVIDER_ID_TO_ALIAS[providerIdOrAlias] || providerIdOrAlias;
+  const models = PROVIDER_MODELS[providerAlias];
+  if (!Array.isArray(models)) return null;
+  const found = models.find((m) => m?.id === modelId);
+  if (!found) return null;
+  return typeof found.toolCalling === "boolean" ? found.toolCalling : null;
+}
+
+/**
+ * Returns whether a model should be considered safe for structured function/tool calling.
+ *
+ * Decision order:
+ * 1) Provider registry metadata (toolCalling flag) when available.
+ * 2) Conservative denylist fallback for known problematic model families.
+ * 3) Default true.
+ */
+export function supportsToolCalling(modelStr: string): boolean {
+  const parsed = parseModel(modelStr);
+  const provider = parsed.provider || parsed.providerAlias || "";
+  const model = parsed.model || modelStr;
+
+  if (provider) {
+    const fromRegistry = getRegistryToolCallingFlag(provider, model);
+    if (fromRegistry !== null) return fromRegistry;
+  }
+
+  const normalized = String(modelStr || "").toLowerCase();
+  if (!normalized) return false;
+
+  // Substring match. This subsumes the exact-match and "/<pattern>" suffix cases,
+  // so any model ID containing a denylisted pattern is treated as unsupported.
+  const blocked = TOOL_CALLING_UNSUPPORTED_PATTERNS.some((pattern) =>
+    normalized.includes(pattern)
+  );
+
+  return !blocked;
+}
@@ -0,0 +1,120 @@
+/**
+ * Request Deduplication Service
+ *
+ * Deduplicates **concurrent** identical requests to the same upstream.
+ * Inspired by ClawRouter's dedup.ts (BlockRunAI / github.com/BlockRunAI/ClawRouter).
+ *
+ * IMPORTANT: In-memory only — does NOT persist across restarts and does NOT
+ * work across multiple process instances (no cross-instance dedup).
+ */
+
+import { createHash } from "node:crypto";
+
+export interface DedupConfig {
+  enabled: boolean;
+  maxTemperatureForDedup: number;
+  timeoutMs: number;
+}
+
+export const DEFAULT_DEDUP_CONFIG: DedupConfig = {
+  enabled: true,
+  maxTemperatureForDedup: 0.1,
+  timeoutMs: 60_000,
+};
+
+export interface DedupResult<T> {
+  result: T;
+  wasDeduplicated: boolean;
+  hash: string;
+}
+
+const inflight = new Map<string, Promise<unknown>>();
+
+/**
+ * Compute a deterministic hash for a request body.
+ * Includes: model, messages, temperature, tools, tool_choice, max_tokens, response_format,
+ * top_p, frequency_penalty, presence_penalty. Excludes: stream, user, metadata (no effect on LLM output).
+ */
+export function computeRequestHash(requestBody: unknown): string {
+  const body = requestBody as Record<string, unknown>;
+  const canonical = {
+    model: body.model ?? null,
+    messages: body.messages ?? null,
+    temperature: typeof body.temperature === "number" ? body.temperature : 1.0,
+    tools: body.tools ?? null,
+    tool_choice: body.tool_choice ?? null,
+    max_tokens: body.max_tokens ?? null,
+    response_format: body.response_format ?? null,
+    top_p: body.top_p ?? null,
+    frequency_penalty: body.frequency_penalty ?? null,
+    presence_penalty: body.presence_penalty ?? null,
+  };
+  return createHash("sha256").update(JSON.stringify(canonical)).digest("hex").slice(0, 16);
+}
+
+/** Determine whether a request should be deduplicated */
+export function shouldDeduplicate(
+  requestBody: unknown,
+  config: DedupConfig = DEFAULT_DEDUP_CONFIG
+): boolean {
+  if (!config.enabled) return false;
+  const body = requestBody as Record<string, unknown>;
+  if (body.stream === true) return false;
+  const temperature = typeof body.temperature === "number" ? body.temperature : 1.0;
+  if (temperature > config.maxTemperatureForDedup) return false;
+  return true;
+}
+
+/**
+ * Execute a request with deduplication.
+ * Concurrent identical requests share one upstream call.
+ */
+export async function deduplicate<T>(
+  hash: string,
+  fn: () => Promise<T>,
+  config: DedupConfig = DEFAULT_DEDUP_CONFIG
+): Promise<DedupResult<T>> {
+  if (!config.enabled) {
+    return { result: await fn(), wasDeduplicated: false, hash };
+  }
+
+  const existing = inflight.get(hash);
+  if (existing) {
+    const result = (await existing) as T;
+    return { result, wasDeduplicated: true, hash };
+  }
+
+  let resolve!: (value: T) => void;
+  let reject!: (reason: unknown) => void;
+  const sharedPromise = new Promise<T>((res, rej) => {
+    resolve = res;
+    reject = rej;
+  });
+  inflight.set(hash, sharedPromise as Promise<unknown>);
+
+  const timer = setTimeout(() => {
+    if (inflight.get(hash) === sharedPromise) inflight.delete(hash);
+  }, config.timeoutMs);
+
+  try {
+    const result = await fn();
+    resolve(result);
+    return { result, wasDeduplicated: false, hash };
+  } catch (err) {
+    reject(err);
+    throw err;
+  } finally {
+    clearTimeout(timer);
+    if (inflight.get(hash) === sharedPromise) inflight.delete(hash);
+  }
+}
+
+export function getInflightCount(): number {
+  return inflight.size;
+}
+export function getInflightHashes(): string[] {
+  return [...inflight.keys()];
+}
+export function clearInflight(): void {
+  inflight.clear();
+}
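One consequence of the eligibility rule in `shouldDeduplicate` is easy to miss: a request with no explicit `temperature` is treated as `1.0`, so with the default threshold of `0.1` it is never deduplicated. A minimal sketch, where `DedupCandidate` and `eligibleForDedup` are hypothetical names for the example:

```typescript
// Sketch only: the eligibility rule restated with the default threshold shown above.
interface DedupCandidate {
  stream?: boolean;
  temperature?: number;
}

function eligibleForDedup(body: DedupCandidate, maxTemperature = 0.1): boolean {
  // Streaming requests are never deduplicated.
  if (body.stream === true) return false;
  // A missing temperature is treated as 1.0, which is above the default threshold.
  const temperature = typeof body.temperature === "number" ? body.temperature : 1.0;
  return temperature <= maxTemperature;
}

const deterministic = eligibleForDedup({ temperature: 0 });
const noTemperature = eligibleForDedup({});
const streaming = eligibleForDedup({ temperature: 0, stream: true });
const relaxedThreshold = eligibleForDedup({ temperature: 0.7 }, 1.0);
```

Callers that want deduplication therefore have to send a low temperature explicitly, or the deployment has to raise `maxTemperatureForDedup`.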
@@ -0,0 +1,142 @@
+/**
+ * Search Cache — in-memory TTL cache with request coalescing
+ *
+ * Bounded at MAX_CACHE_ENTRIES to prevent OOM.
+ * Request coalescing deduplicates concurrent identical queries
+ * to prevent cache stampede (critical for agentic tools).
+ */
+
+import { createHash } from "crypto";
+
+const MAX_CACHE_ENTRIES = 5000;
+const DEFAULT_TTL_MS = parseInt(process.env.SEARCH_CACHE_TTL_MS || String(5 * 60 * 1000), 10);
+
+interface CacheEntry<T> {
+  data: T;
+  expiresAt: number;
+}
+
+const cache = new Map<string, CacheEntry<unknown>>();
+const inflight = new Map<string, Promise<unknown>>();
+
+let hits = 0;
+let misses = 0;
+
+/**
+ * Normalize a query for cache key computation.
+ * NFKC normalization, lowercase, trim, collapse whitespace.
+ */
+function normalizeQuery(query: string): string {
+  return query.normalize("NFKC").toLowerCase().trim().replace(/\s+/g, " ");
+}
+
+/**
+ * Compute a deterministic cache key from search parameters.
+ */
+export function computeCacheKey(
+  query: string,
+  provider: string,
+  searchType: string,
+  maxResults: number,
+  country?: string,
+  language?: string,
+  filters?: unknown
+): string {
+  const normalized = normalizeQuery(query);
+  const payload = JSON.stringify({
+    q: normalized,
+    p: provider,
+    t: searchType,
+    n: maxResults,
+    c: country || null,
+    l: language || null,
+    f: filters || null,
+  });
+  return createHash("sha256").update(payload).digest("hex");
+}
+
+/**
+ * Evict expired entries and enforce size bound.
+ * Called on every cache write. The expiry sweep visits every entry, so each call is O(n).
+ */
+function evictIfNeeded(): void {
+  const now = Date.now();
+
+  // Remove expired entries first
+  for (const [key, entry] of cache) {
+    if (entry.expiresAt <= now) {
+      cache.delete(key);
+    }
+  }
+
+  // FIFO eviction if still over limit
+  while (cache.size >= MAX_CACHE_ENTRIES) {
+    const firstKey = cache.keys().next().value;
+    if (firstKey !== undefined) {
+      cache.delete(firstKey);
+    } else {
+      break;
+    }
+  }
+}
+
+/**
+ * Get or coalesce: return cached data, join an inflight request,
+ * or execute the fetch function and cache the result.
+ *
+ * @param key - Cache key from computeCacheKey()
+ * @param ttlMs - TTL in milliseconds (0 skips storing the result; unexpired entries are still served)
+ * @param fetchFn - Function to execute on cache miss
+ * @returns The cached or freshly fetched data
+ */
+export async function getOrCoalesce<T>(
+  key: string,
+  ttlMs: number,
+  fetchFn: () => Promise<T>
+): Promise<{ data: T; cached: boolean }> {
+  // 1. Check cache
+  const cached = cache.get(key) as CacheEntry<T> | undefined;
+  if (cached && cached.expiresAt > Date.now()) {
+    hits++;
+    return { data: cached.data, cached: true };
+  }
+
+  // 2. Join inflight request if one exists (request coalescing)
+  const existing = inflight.get(key) as Promise<T> | undefined;
+  if (existing) {
+    hits++;
+    const data = await existing;
+    return { data, cached: true };
+  }
+
+  // 3. Cache miss — execute fetch
+  misses++;
+  const promise = fetchFn();
+  inflight.set(key, promise);
+
+  try {
+    const data = await promise;
+
+    // Store in cache
+    if (ttlMs > 0) {
+      evictIfNeeded();
+      cache.set(key, { data, expiresAt: Date.now() + ttlMs });
+    }
+
+    return { data, cached: false };
+  } finally {
+    inflight.delete(key);
+  }
+}
+
+/**
+ * Get cache statistics for monitoring.
+ */
+export function getCacheStats(): { size: number; hits: number; misses: number } {
+  return { size: cache.size, hits, misses };
+}
+
+/**
+ * Default TTL for search cache entries.
+ */
+export const SEARCH_CACHE_DEFAULT_TTL_MS = DEFAULT_TTL_MS;
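The effect of the query normalization on cache keys can be checked without the hashing step. The sketch below copies the normalization chain from `normalizeQuery`; the sample queries are made up for the example.

```typescript
// Sketch only: the same normalization chain, applied to a few sample queries.
function normalizeQuery(query: string): string {
  return query.normalize("NFKC").toLowerCase().trim().replace(/\s+/g, " ");
}

const spaced = normalizeQuery("  Node.js   Streams ");
const plain = normalizeQuery("node.js streams");
// U+FF2E is a full-width "N" and U+3000 an ideographic space; NFKC folds both to ASCII.
const fullWidth = normalizeQuery("\uFF2Eode.js\u3000streams");
```

All three inputs normalize to the same string, so they would share one cache entry as long as the other key fields (provider, search type, result count, country, language, filters) are equal.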
@@ -208,7 +208,7 @@ export function openaiResponsesToOpenAIRequest(
    });
  }

-  // Filter orphaned tool results (no matching tool_call in any assistant message)
+  // Filter orphaned tool results (no matching tool_call in assistant messages)
  const allToolCallIds = new Set<string>();
  for (const m of messages) {
    const rec = toRecord(m);
@@ -1,12 +1,12 @@
 {
  "name": "omniroute",
-  "version": "2.6.7",
+  "version": "2.7.0",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "omniroute",
-      "version": "2.6.7",
+      "version": "2.7.0",
      "hasInstallScript": true,
      "license": "MIT",
      "workspaces": [
@@ -1725,9 +1725,9 @@
      }
    },
    "node_modules/@next/env": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/env/-/env-16.1.6.tgz",
-      "integrity": "sha512-N1ySLuZjnAtN3kFnwhAwPvZah8RJxKasD7x1f8shFqhncnWZn4JMfg37diLNuoHsLAlrDfM3g4mawVdtAG8XLQ==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/env/-/env-16.1.7.tgz",
+      "integrity": "sha512-rJJbIdJB/RQr2F1nylZr/PJzamvNNhfr3brdKP6s/GW850jbtR70QlSfFselvIBbcPUOlQwBakexjFzqLzF6pg==",
      "license": "MIT"
    },
    "node_modules/@next/eslint-plugin-next": {
@@ -1741,9 +1741,9 @@
      }
    },
    "node_modules/@next/swc-darwin-arm64": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/swc-darwin-arm64/-/swc-darwin-arm64-16.1.6.tgz",
-      "integrity": "sha512-wTzYulosJr/6nFnqGW7FrG3jfUUlEf8UjGA0/pyypJl42ExdVgC6xJgcXQ+V8QFn6niSG2Pb8+MIG1mZr2vczw==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/swc-darwin-arm64/-/swc-darwin-arm64-16.1.7.tgz",
+      "integrity": "sha512-b2wWIE8sABdyafc4IM8r5Y/dS6kD80JRtOGrUiKTsACFQfWWgUQ2NwoUX1yjFMXVsAwcQeNpnucF2ZrujsBBPg==",
      "cpu": [
        "arm64"
      ],
@@ -1757,9 +1757,9 @@
      }
    },
    "node_modules/@next/swc-darwin-x64": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/swc-darwin-x64/-/swc-darwin-x64-16.1.6.tgz",
-      "integrity": "sha512-BLFPYPDO+MNJsiDWbeVzqvYd4NyuRrEYVB5k2N3JfWncuHAy2IVwMAOlVQDFjj+krkWzhY2apvmekMkfQR0CUQ==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/swc-darwin-x64/-/swc-darwin-x64-16.1.7.tgz",
+      "integrity": "sha512-zcnVaaZulS1WL0Ss38R5Q6D2gz7MtBu8GZLPfK+73D/hp4GFMrC2sudLky1QibfV7h6RJBJs/gOFvYP0X7UVlQ==",
      "cpu": [
        "x64"
      ],
@@ -1773,9 +1773,9 @@
      }
    },
    "node_modules/@next/swc-linux-arm64-gnu": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-gnu/-/swc-linux-arm64-gnu-16.1.6.tgz",
-      "integrity": "sha512-OJYkCd5pj/QloBvoEcJ2XiMnlJkRv9idWA/j0ugSuA34gMT6f5b7vOiCQHVRpvStoZUknhl6/UxOXL4OwtdaBw==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-gnu/-/swc-linux-arm64-gnu-16.1.7.tgz",
+      "integrity": "sha512-2ant89Lux/Q3VyC8vNVg7uBaFVP9SwoK2jJOOR0L8TQnX8CAYnh4uctAScy2Hwj2dgjVHqHLORQZJ2wH6VxhSQ==",
      "cpu": [
        "arm64"
      ],
@@ -1789,9 +1789,9 @@
      }
    },
    "node_modules/@next/swc-linux-arm64-musl": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-musl/-/swc-linux-arm64-musl-16.1.6.tgz",
-      "integrity": "sha512-S4J2v+8tT3NIO9u2q+S0G5KdvNDjXfAv06OhfOzNDaBn5rw84DGXWndOEB7d5/x852A20sW1M56vhC/tRVbccQ==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/swc-linux-arm64-musl/-/swc-linux-arm64-musl-16.1.7.tgz",
+      "integrity": "sha512-uufcze7LYv0FQg9GnNeZ3/whYfo+1Q3HnQpm16o6Uyi0OVzLlk2ZWoY7j07KADZFY8qwDbsmFnMQP3p3+Ftprw==",
      "cpu": [
        "arm64"
      ],
@@ -1805,9 +1805,9 @@
      }
    },
    "node_modules/@next/swc-linux-x64-gnu": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/swc-linux-x64-gnu/-/swc-linux-x64-gnu-16.1.6.tgz",
-      "integrity": "sha512-2eEBDkFlMMNQnkTyPBhQOAyn2qMxyG2eE7GPH2WIDGEpEILcBPI/jdSv4t6xupSP+ot/jkfrCShLAa7+ZUPcJQ==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/swc-linux-x64-gnu/-/swc-linux-x64-gnu-16.1.7.tgz",
+      "integrity": "sha512-KWVf2gxYvHtvuT+c4MBOGxuse5TD7DsMFYSxVxRBnOzok/xryNeQSjXgxSv9QpIVlaGzEn/pIuI6Koosx8CGWA==",
      "cpu": [
        "x64"
      ],
@@ -1821,9 +1821,9 @@
      }
    },
    "node_modules/@next/swc-linux-x64-musl": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/swc-linux-x64-musl/-/swc-linux-x64-musl-16.1.6.tgz",
-      "integrity": "sha512-oicJwRlyOoZXVlxmIMaTq7f8pN9QNbdes0q2FXfRsPhfCi8n8JmOZJm5oo1pwDaFbnnD421rVU409M3evFbIqg==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/swc-linux-x64-musl/-/swc-linux-x64-musl-16.1.7.tgz",
+      "integrity": "sha512-HguhaGwsGr1YAGs68uRKc4aGWxLET+NevJskOcCAwXbwj0fYX0RgZW2gsOCzr9S11CSQPIkxmoSbuVaBp4Z3dA==",
      "cpu": [
        "x64"
      ],
@@ -1837,9 +1837,9 @@
      }
    },
    "node_modules/@next/swc-win32-arm64-msvc": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/swc-win32-arm64-msvc/-/swc-win32-arm64-msvc-16.1.6.tgz",
-      "integrity": "sha512-gQmm8izDTPgs+DCWH22kcDmuUp7NyiJgEl18bcr8irXA5N2m2O+JQIr6f3ct42GOs9c0h8QF3L5SzIxcYAAXXw==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/swc-win32-arm64-msvc/-/swc-win32-arm64-msvc-16.1.7.tgz",
+      "integrity": "sha512-S0n3KrDJokKTeFyM/vGGGR8+pCmXYrjNTk2ZozOL1C/JFdfUIL9O1ATaJOl5r2POe56iRChbsszrjMAdWSv7kQ==",
      "cpu": [
        "arm64"
      ],
@@ -1853,9 +1853,9 @@
      }
    },
    "node_modules/@next/swc-win32-x64-msvc": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/@next/swc-win32-x64-msvc/-/swc-win32-x64-msvc-16.1.6.tgz",
-      "integrity": "sha512-NRfO39AIrzBnixKbjuo2YiYhB6o9d8v/ymU9m/Xk8cyVk+k7XylniXkHwjs4s70wedVffc6bQNbufk5v0xEm0A==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/@next/swc-win32-x64-msvc/-/swc-win32-x64-msvc-16.1.7.tgz",
+      "integrity": "sha512-mwgtg8CNZGYm06LeEd+bNnOUfwOyNem/rOiP14Lsz+AnUY92Zq/LXwtebtUiaeVkhbroRCQ0c8GlR4UT1U+0yg==",
      "cpu": [
        "x64"
      ],
@@ -6817,7 +6817,6 @@
      "version": "2.3.2",
      "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.2.tgz",
      "integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
-      "dev": true,
      "hasInstallScript": true,
      "license": "MIT",
      "optional": true,
@@ -8812,14 +8811,14 @@
      }
    },
    "node_modules/next": {
-      "version": "16.1.6",
-      "resolved": "https://registry.npmjs.org/next/-/next-16.1.6.tgz",
-      "integrity": "sha512-hkyRkcu5x/41KoqnROkfTm2pZVbKxvbZRuNvKXLRXxs3VfyO0WhY50TQS40EuKO9SW3rBj/sF3WbVwDACeMZyw==",
+      "version": "16.1.7",
+      "resolved": "https://registry.npmjs.org/next/-/next-16.1.7.tgz",
+      "integrity": "sha512-WM0L7WrSvKwoLegLYr6V+mz+RIofqQgVAfHhMp9a88ms0cFX8iX9ew+snpWlSBwpkURJOUdvCEt3uLl3NNzvWg==",
      "license": "MIT",
      "dependencies": {
-        "@next/env": "16.1.6",
+        "@next/env": "16.1.7",
        "@swc/helpers": "0.5.15",
-        "baseline-browser-mapping": "^2.8.3",
+        "baseline-browser-mapping": "^2.9.19",
        "caniuse-lite": "^1.0.30001579",
        "postcss": "8.4.31",
        "styled-jsx": "5.1.6"
@@ -8831,14 +8830,14 @@
        "node": ">=20.9.0"
      },
      "optionalDependencies": {
-        "@next/swc-darwin-arm64": "16.1.6",
-        "@next/swc-darwin-x64": "16.1.6",
-        "@next/swc-linux-arm64-gnu": "16.1.6",
-        "@next/swc-linux-arm64-musl": "16.1.6",
-        "@next/swc-linux-x64-gnu": "16.1.6",
-        "@next/swc-linux-x64-musl": "16.1.6",
-        "@next/swc-win32-arm64-msvc": "16.1.6",
-        "@next/swc-win32-x64-msvc": "16.1.6",
+        "@next/swc-darwin-arm64": "16.1.7",
+        "@next/swc-darwin-x64": "16.1.7",
+        "@next/swc-linux-arm64-gnu": "16.1.7",
+        "@next/swc-linux-arm64-musl": "16.1.7",
+        "@next/swc-linux-x64-gnu": "16.1.7",
+        "@next/swc-linux-x64-musl": "16.1.7",
+        "@next/swc-win32-arm64-msvc": "16.1.7",
+        "@next/swc-win32-x64-msvc": "16.1.7",
        "sharp": "^0.34.4"
      },
      "peerDependencies": {
@@ -1,6 +1,6 @@
 {
  "name": "omniroute",
-  "version": "2.6.7",
+  "version": "2.7.2",
  "description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
  "type": "module",
  "bin": {
@@ -0,0 +1 @@
+<svg width="56" height="64" viewBox="0 0 56 64" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M53.292 15.321l1.5-3.676s-1.909-2.043-4.227-4.358c-2.317-2.315-7.225-.953-7.225-.953L37.751 0H18.12l-5.589 6.334s-4.908-1.362-7.225.953C2.988 9.602 1.08 11.645 1.08 11.645l1.5 3.676-1.91 5.447s5.614 21.236 6.272 23.83c1.295 5.106 2.181 7.08 5.862 9.668 3.68 2.587 10.36 7.08 11.45 7.762 1.091.68 2.455 1.84 3.682 1.84 1.227 0 2.59-1.16 3.68-1.84 1.091-.681 7.77-5.175 11.452-7.762 3.68-2.587 4.567-4.562 5.862-9.668.657-2.594 6.27-23.83 6.27-23.83l-1.908-5.447z" fill="url(#paint0_linear)"/><path fill-rule="evenodd" clip-rule="evenodd" d="M34.888 11.508c.818 0 6.885-1.157 6.885-1.157s7.189 8.68 7.189 10.536c0 1.534-.619 2.134-1.347 2.842-.152.148-.31.3-.467.468l-5.39 5.717a9.42 9.42 0 01-.176.18c-.538.54-1.33 1.336-.772 2.658l.115.269c.613 1.432 1.37 3.2.407 4.99-1.025 1.906-2.78 3.178-3.905 2.967-1.124-.21-3.766-1.589-4.737-2.218-.971-.63-4.05-3.166-4.05-4.137 0-.809 2.214-2.155 3.29-2.81.214-.13.383-.232.48-.298.111-.075.297-.19.526-.332.981-.61 2.754-1.71 2.799-2.197.055-.602.034-.778-.758-2.264-.168-.316-.365-.654-.568-1.004-.754-1.295-1.598-2.745-1.41-3.784.21-1.173 2.05-1.845 3.608-2.415.194-.07.385-.14.567-.209l1.623-.609c1.556-.582 3.284-1.229 3.57-1.36.394-.181.292-.355-.903-.468a54.655 54.655 0 01-.58-.06c-1.48-.157-4.209-.446-5.535-.077-.261.073-.553.152-.86.235-1.49.403-3.317.897-3.493 1.182-.03.05-.06.093-.089.133-.168.238-.277.394-.091 1.406.055.302.169.895.31 1.629.41 2.148 1.053 5.498 1.134 6.25.011.106.024.207.036.305.103.84.171 1.399-.805 1.622l-.255.058c-1.102.252-2.717.623-3.3.623-.584 0-2.2-.37-3.302-.623l-.254-.058c-.976-.223-.907-.782-.804-1.622.012-.098.024-.2.035-.305.081-.753.725-4.112 1.137-6.259.14-.73.253-1.32.308-1.62.185-1.012.076-1.168-.092-1.406a3.743 3.743 0 
01-.09-.133c-.174-.285-2-.779-3.491-1.182-.307-.083-.6-.162-.86-.235-1.327-.37-4.055-.08-5.535.077-.226.024-.422.045-.58.06-1.196.113-1.297.287-.903.468.285.131 2.013.778 3.568 1.36.597.223 1.17.437 1.624.609.183.069.373.138.568.21 1.558.57 3.398 1.241 3.608 2.414.187 1.039-.657 2.489-1.41 3.784-.204.35-.4.688-.569 1.004-.791 1.486-.812 1.662-.757 2.264.044.488 1.816 1.587 2.798 2.197.229.142.415.257.526.332.098.066.266.168.48.298 1.076.654 3.29 2 3.29 2.81 0 .97-3.078 3.507-4.05 4.137-.97.63-3.612 2.008-4.737 2.218-1.124.21-2.88-1.061-3.904-2.966-.963-1.791-.207-3.559.406-4.99l.115-.27c.559-1.322-.233-2.118-.772-2.658a9.377 9.377 0 01-.175-.18l-5.39-5.717c-.158-.167-.316-.32-.468-.468-.728-.707-1.346-1.308-1.346-2.842 0-1.855 7.189-10.536 7.189-10.536s6.066 1.157 6.884 1.157c.653 0 1.913-.433 3.227-.885.333-.114.669-.23 1-.34 1.635-.545 2.726-.549 2.726-.549s1.09.004 2.726.549c.33.11.667.226 1 .34 1.313.452 2.574.885 3.226.885zm-1.041 30.706c1.282.66 2.192 1.128 2.536 1.343.445.278.174.803-.232 1.09-.405.285-5.853 4.499-6.381 4.965l-.215.191c-.509.459-1.159 1.044-1.62 1.044-.46 0-1.11-.586-1.62-1.044l-.213-.191c-.53-.466-5.977-4.68-6.382-4.966-.405-.286-.677-.81-.232-1.09.344-.214 1.255-.683 2.539-1.344l1.22-.629c1.92-.992 4.315-1.837 4.689-1.837.373 0 2.767.844 4.689 1.837.436.226.845.437 1.222.63z" fill="#fff"/><path fill-rule="evenodd" clip-rule="evenodd" d="M43.34 6.334L37.751 0H18.12l-5.589 6.334s-4.908-1.362-7.225.953c0 0 6.544-.59 8.793 3.064 0 0 6.066 1.157 6.884 1.157.818 0 2.59-.68 4.226-1.225 1.636-.545 2.727-.549 2.727-.549s1.09.004 2.726.549 3.408 1.225 4.226 1.225c.818 0 6.885-1.157 6.885-1.157 2.249-3.654 8.792-3.064 8.792-3.064-2.317-2.315-7.225-.953-7.225-.953z" fill="url(#paint1_linear)"/><defs><linearGradient id="paint0_linear" x1=".671" y1="64.319" x2="55.2" y2="64.319" gradientUnits="userSpaceOnUse"><stop stop-color="#F50"/><stop offset=".41" stop-color="#F50"/><stop offset=".582" stop-color="#FF2000"/><stop offset="1" 
stop-color="#FF2000"/></linearGradient><linearGradient id="paint1_linear" x1="6.278" y1="11.466" x2="50.565" y2="11.466" gradientUnits="userSpaceOnUse"><stop stop-color="#FF452A"/><stop offset="1" stop-color="#FF2000"/></linearGradient></defs></svg>
@@ -0,0 +1,4 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="48" height="48" viewBox="0 0 48 48">
+  <rect width="48" height="48" rx="8" fill="#1E40AF"/>
+  <text x="24" y="32" text-anchor="middle" font-family="system-ui,-apple-system,sans-serif" font-size="22" font-weight="700" fill="white">exa</text>
+</svg>
@@ -0,0 +1,4 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="48" height="48" viewBox="0 0 48 48">
+  <rect width="48" height="48" rx="8" fill="#1E40AF"/>
+  <text x="24" y="32" text-anchor="middle" font-family="system-ui,-apple-system,sans-serif" font-size="22" font-weight="700" fill="white">exa</text>
+</svg>
@@ -14,6 +14,7 @@
 *
 * Fixes: https://github.com/diegosouzapw/OmniRoute/issues/129
 * Fixes: https://github.com/diegosouzapw/OmniRoute/issues/321
+ * Fixes: https://github.com/diegosouzapw/OmniRoute/issues/426
 */

 import { existsSync, copyFileSync, mkdirSync } from "node:fs";
@@ -80,8 +81,54 @@ if (existsSync(rootBinary)) {
  }
 }

+// Strategy 1.5: Use node-pre-gyp to download the correct prebuilt binary
+// This works on Windows without requiring node-gyp, Python, or MSVC.
+// better-sqlite3 ships prebuilts for win32-x64, win32-arm64, darwin-x64/arm64.
+console.log("  📥  Attempting to download prebuilt binary via node-pre-gyp...");
+try {
+  const { execSync } = await import("node:child_process");
+  // better-sqlite3 bundles @mapbox/node-pre-gyp — use it directly
+  const preGypBin = join(
+    ROOT,
+    "app",
+    "node_modules",
+    ".bin",
+    process.platform === "win32" ? "node-pre-gyp.cmd" : "node-pre-gyp"
+  );
+  const preGypFallback = join(
+    ROOT,
+    "app",
+    "node_modules",
+    "@mapbox",
+    "node-pre-gyp",
+    "bin",
+    "node-pre-gyp"
+  );
+  const preGypCmd = existsSync(preGypBin) ? preGypBin : preGypFallback;
+
+  if (existsSync(preGypCmd)) {
+    execSync(`"${process.execPath}" "${preGypCmd}" install --fallback-to-build=false`, {
+      cwd: join(ROOT, "app", "node_modules", "better-sqlite3"),
+      stdio: "inherit",
+      timeout: 60_000,
+    });
+    mkdirSync(dirname(appBinary), { recursive: true });
+    try {
+      process.dlopen({ exports: {} }, appBinary);
+      console.log("  ✅ Prebuilt binary downloaded and loaded successfully!\n");
+      process.exit(0);
+    } catch (loadErr) {
+      console.warn(`  ⚠️  Downloaded binary failed to load: ${loadErr.message}`);
+    }
+  } else {
+    console.warn("  ⚠️  node-pre-gyp not found, skipping prebuilt download.");
+  }
+} catch (err) {
+  console.warn(`  ⚠️  node-pre-gyp download failed: ${err.message.split("\n")[0]}`);
+}
+
 // Strategy 2: Fall back to npm rebuild (may work if build tools are available)
-console.log("  ⚠️  Root binary not available or incompatible, attempting npm rebuild...");
+console.log("  ⚠️  Attempting npm rebuild (requires build tools)...");

 try {
  const { execSync } = await import("node:child_process");
@@ -103,14 +150,23 @@ try {
  }
 }

-// If nothing worked, warn but don't fail the install — let the package stay
-// installed so users can fix manually or use the pre-flight check in the CLI
-console.warn("  ⚠️  Could not fix better-sqlite3 native module automatically.");
+// If nothing worked, warn but don't fail the install
+console.warn("\n  ⚠️  Could not fix better-sqlite3 native module automatically.");
 console.warn("     The server may not start correctly.");
-console.warn("     Try manually:");
-console.warn(`     cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
-if (process.platform === "darwin") {
+console.warn("     Manual fix options:");
+if (process.platform === "win32") {
+  console.warn("     Option A (easiest — no build tools needed):");
+  console.warn(`       cd "${join(ROOT, "app", "node_modules", "better-sqlite3")}"`);
+  console.warn("       npx @mapbox/node-pre-gyp install --fallback-to-build=false");
+  console.warn("     Option B (requires Build Tools for Visual Studio):");
+  console.warn(`       cd "${join(ROOT, "app")}" && npm rebuild better-sqlite3`);
+  console.warn("       Install from: https://visualstudio.microsoft.com/visual-cpp-build-tools/");
+  console.warn("       Also ensure Python is installed: https://python.org");
+} else if (process.platform === "darwin") {
+  console.warn(`     cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
  console.warn("     If build tools are missing: xcode-select --install");
+} else {
+  console.warn(`     cd ${join(ROOT, "app")} && npm rebuild better-sqlite3`);
 }
 console.warn("");

@@ -278,6 +278,19 @@ if (existsSync(swcHelpersSrc) && !existsSync(swcHelpersDst)) {
  console.log("  ✅ @swc/helpers included in standalone build.");
 }

+// ── Step 10.6: Remove large binaries from standalone build ──
+// These directories contain platform-native binaries (.node, .asar) that
+// trigger Z_DATA_ERROR during npm pack. They are not needed in the npm package.
+const binaryDirsToRemove = ["vscode-extension", "electron"];
+for (const dir of binaryDirsToRemove) {
+  const targetDir = join(APP_DIR, dir);
+  if (existsSync(targetDir)) {
+    console.log(`  🧹 Removing app/${dir}/ (not needed in npm package)...`);
+    rmSync(targetDir, { recursive: true, force: true });
+    console.log(`  ✅ app/${dir}/ removed.`);
+  }
+}
+
 // ── Done ───────────────────────────────────────────────────
 const appPkg = join(APP_DIR, "package.json");
 if (existsSync(appPkg)) {
@@ -33,11 +33,29 @@ export default function APIPageClient({ machineId }) {
  const [viewTab, setViewTab] = useState("api");
  const [mcpStatus, setMcpStatus] = useState<any>(null);
  const [a2aStatus, setA2aStatus] = useState<any>(null);
+  const [searchProviders, setSearchProviders] = useState<any[]>([]);

  const { copied, copy } = useCopyToClipboard();

+  const fetchSearchProviders = async () => {
+    try {
+      const res = await fetch("/v1/search");
+      if (res.ok) {
+        const data = await res.json();
+        setSearchProviders(data.data || []);
+      }
+    } catch {
+      // Search endpoint may not be available
+    }
+  };
+
  useEffect(() => {
-    Promise.allSettled([loadCloudSettings(), fetchModels(), fetchProtocolStatus()]).finally(() => {
+    Promise.allSettled([
+      loadCloudSettings(),
+      fetchModels(),
+      fetchProtocolStatus(),
+      fetchSearchProviders(),
+    ]).finally(() => {
      setLoading(false);
    });
  }, []);
@@ -575,6 +593,47 @@ export default function APIPageClient({ machineId }) {
            </div>
          </div>

+          {/* Search & Discovery */}
+          {searchProviders.length > 0 && (
+            <div className="mb-6">
+              <div className="flex items-center gap-2 mb-3">
+                <span className="material-symbols-outlined text-sm text-cyan-400">
+                  travel_explore
+                </span>
+                <h3 className="text-xs font-semibold text-text-muted uppercase tracking-wider">
+                  {t("categorySearch") || "Search & Discovery"}
+                </h3>
+                <div className="flex-1 h-px bg-border/50" />
+              </div>
+              <div className="flex flex-col gap-3">
+                <EndpointSection
+                  icon="search"
+                  iconColor="text-cyan-500"
+                  iconBg="bg-cyan-500/10"
+                  title={t("webSearch") || "Web Search"}
+                  path="/v1/search"
+                  description={
+                    t("webSearchDesc") ||
+                    "Unified web search across multiple providers with automatic failover and caching"
+                  }
+                  models={searchProviders.map((p) => ({
+                    id: p.id,
+                    name: p.name,
+                    owned_by: p.id,
+                    type: "search",
+                  }))}
+                  expanded={expandedEndpoint === "search"}
+                  onToggle={() =>
+                    setExpandedEndpoint(expandedEndpoint === "search" ? null : "search")
+                  }
+                  copy={copy}
+                  copied={copied}
+                  baseUrl={currentEndpoint}
+                />
+              </div>
+            </div>
+          )}
+
          {/* Utility & Management */}
          <div>
            <div className="flex items-center gap-2 mb-3">
@@ -101,6 +101,7 @@ export default function ProviderDetailPage() {
  const isOpenAICompatible = isOpenAICompatibleProvider(providerId);
  const isAnthropicCompatible = isAnthropicCompatibleProvider(providerId);
  const isCompatible = isOpenAICompatible || isAnthropicCompatible;
+  const isSearchProvider = providerId.endsWith("-search");

  const providerStorageAlias = isCompatible ? providerId : providerAlias;
  const providerDisplayAlias = isCompatible ? providerNode?.prefix || providerId : providerAlias;
@@ -1060,21 +1061,43 @@ export default function ProviderDetailPage() {
        )}
      </Card>

-      {/* Models */}
-      <Card>
-        <h2 className="text-lg font-semibold mb-4">{t("availableModels")}</h2>
-        {renderModelsSection()}
+      {/* Models — hidden for search providers (they don't have models) */}
+      {!isSearchProvider && (
+        <Card>
+          <h2 className="text-lg font-semibold mb-4">{t("availableModels")}</h2>
+          {renderModelsSection()}

-        {/* Custom Models — available for ALL providers */}
-        {!isCompatible && (
-          <CustomModelsSection
-            providerId={providerId}
-            providerAlias={providerDisplayAlias}
-            copied={copied}
-            onCopy={copy}
-          />
-        )}
-      </Card>
+          {/* Custom Models — available for non-compatible, non-search providers */}
+          {!isCompatible && (
+            <CustomModelsSection
+              providerId={providerId}
+              providerAlias={providerDisplayAlias}
+              copied={copied}
+              onCopy={copy}
+            />
+          )}
+        </Card>
+      )}
+
+      {/* Search provider info */}
+      {isSearchProvider && (
+        <Card>
+          <h2 className="text-lg font-semibold mb-4">{t("searchProvider") || "Search Provider"}</h2>
+          <p className="text-sm text-text-muted">
+            {t("searchProviderDesc") ||
+              "This provider is used for web search via POST /v1/search. No model configuration needed — search providers are ready to use once an API key is connected."}
+          </p>
+          {providerId === "perplexity-search" && (
+            <div className="mt-3 flex items-center gap-2 px-3 py-2 rounded-lg bg-blue-500/10 border border-blue-500/20">
+              <span className="material-symbols-outlined text-sm text-blue-400">link</span>
+              <p className="text-xs text-blue-300">
+                Uses the same API key as <strong>Perplexity</strong> (chat provider). If you already
+                have Perplexity configured, no additional setup is needed.
+              </p>
+            </div>
+          )}
+        </Card>
+      )}

      {/* Modals */}
      {providerId === "kiro" ? (
@@ -0,0 +1,50 @@
+/**
+ * GET  /api/logs/detail         — List detailed request logs
+ * GET  /api/logs/detail/:id     — Get specific detailed log
+ * POST /api/logs/detail/toggle  — Enable/disable detailed logging
+ */
+import { NextRequest, NextResponse } from "next/server";
+import { isAuthenticated } from "@/shared/utils/apiAuth";
+import {
+  getRequestDetailLogs,
+  getRequestDetailLogCount,
+  isDetailedLoggingEnabled,
+} from "@/lib/db/detailedLogs";
+import { updateSettings } from "@/lib/db/settings";
+
+export const dynamic = "force-dynamic";
+
+export async function GET(req: NextRequest) {
+  if (!isAuthenticated(req)) {
+    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
+  }
+
+  const url = new URL(req.url);
+  const limit = Math.min(Math.max(Number(url.searchParams.get("limit") ?? 50) || 50, 1), 200);
+  const offset = Math.max(Number(url.searchParams.get("offset") ?? 0) || 0, 0);
+
+  const logs = getRequestDetailLogs(limit, offset);
+  const total = getRequestDetailLogCount();
+  const enabled = await isDetailedLoggingEnabled();
+
+  return NextResponse.json({ enabled, total, logs });
+}
+
+export async function POST(req: NextRequest) {
+  if (!isAuthenticated(req)) {
+    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
+  }
+
+  const body = await req.json();
+  const enabled = body.enabled === true || body.enabled === "1";
+
+  await updateSettings({ detailed_logs_enabled: enabled });
+
+  return NextResponse.json({
+    success: true,
+    enabled,
+    message: enabled
+      ? "Detailed logging enabled. Pipeline bodies will be captured for new requests."
+      : "Detailed logging disabled.",
+  });
+}
@@ -13,6 +13,7 @@ export async function GET() {
    const { getAllCircuitBreakerStatuses } = await import("@/shared/utils/circuitBreaker");
    const { getAllRateLimitStatus } = await import("@omniroute/open-sse/services/rateLimitManager");
    const { getAllModelLockouts } = await import("@omniroute/open-sse/services/accountFallback");
+    const { getInflightCount } = await import("@omniroute/open-sse/services/requestDedup.ts");

    const settings = await getSettings();
    const circuitBreakers = getAllCircuitBreakerStatuses();
@@ -50,6 +51,9 @@ export async function GET() {
      localProviders: getAllHealthStatuses(),
      rateLimitStatus,
      lockouts,
+      dedup: {
+        inflightRequests: getInflightCount(),
+      },
      setupComplete: settings?.setupComplete || false,
    });
  } catch (error) {
@@ -0,0 +1,115 @@
+/**
+ * GET  /api/system/version  — Returns current version and latest available on npm
+ * POST /api/system/update   — Triggers npm install -g omniroute@latest + pm2 restart
+ *
+ * Security: Requires admin authentication (same as other management routes).
+ * Safety: Update only runs if a newer version is available on npm.
+ */
+import { NextRequest, NextResponse } from "next/server";
+import { execFile } from "child_process";
+import { promisify } from "util";
+import { isAuthenticated } from "@/shared/utils/apiAuth";
+
+const execFileAsync = promisify(execFile);
+
+export const dynamic = "force-dynamic";
+
+/** Fetch latest version from npm registry (no install, just metadata) */
+async function getLatestNpmVersion(): Promise<string | null> {
+  try {
+    const { stdout } = await execFileAsync("npm", ["info", "omniroute", "version", "--json"], {
+      timeout: 10000,
+    });
+    const parsed = JSON.parse(stdout.trim());
+    return typeof parsed === "string" ? parsed : null;
+  } catch {
+    return null;
+  }
+}
+
+/** Current installed version from package.json */
+function getCurrentVersion(): string {
+  try {
+     
+    return require("../../../../../package.json").version as string;
+  } catch {
+    return "unknown";
+  }
+}
+
+/** Compare semver strings — returns true if a > b */
+function isNewer(a: string | null, b: string): boolean {
+  if (!a) return false;
+  const parse = (v: string) => v.split(".").map((n) => Number.parseInt(n, 10));
+  const [aMaj, aMin, aPat] = parse(a);
+  const [bMaj, bMin, bPat] = parse(b);
+  if (aMaj !== bMaj) return aMaj > bMaj;
+  if (aMin !== bMin) return aMin > bMin;
+  return aPat > bPat;
+}
+
+export async function GET(req: NextRequest) {
+  if (!isAuthenticated(req)) {
+    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
+  }
+
+  const current = getCurrentVersion();
+  const latest = await getLatestNpmVersion();
+  const updateAvailable = isNewer(latest, current);
+
+  return NextResponse.json({
+    current,
+    latest: latest ?? "unavailable",
+    updateAvailable,
+    channel: "npm",
+  });
+}
+
+export async function POST(req: NextRequest) {
+  if (!isAuthenticated(req)) {
+    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
+  }
+
+  const current = getCurrentVersion();
+  const latest = await getLatestNpmVersion();
+
+  if (!latest) {
+    return NextResponse.json(
+      { success: false, error: "Could not reach npm registry" },
+      { status: 503 }
+    );
+  }
+
+  if (!isNewer(latest, current)) {
+    return NextResponse.json({
+      success: false,
+      error: `Already on latest version (${current})`,
+      current,
+      latest,
+    });
+  }
+
+  // Run update in background — client gets immediate acknowledgment
+  const install = async () => {
+    try {
+      await execFileAsync("npm", ["install", "-g", `omniroute@${latest}`, "--ignore-scripts"], {
+        timeout: 300000, // 5 minutes
+      });
+      // Restart PM2 — non-fatal if pm2 not available (Docker/manual setups)
+      await execFileAsync("pm2", ["restart", "omniroute"]).catch(() => null);
+      console.log(`[AutoUpdate] Successfully updated to v${latest}`);
+    } catch (err) {
+      console.error(`[AutoUpdate] Update failed:`, err);
+    }
+  };
+
+  // Fire-and-forget
+  install();
+
+  return NextResponse.json({
+    success: true,
+    message: `Update to v${latest} started. Restarting in ~30 seconds.`,
+    from: current,
+    to: latest,
+  });
+}
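The `isNewer` helper in the route above compares only the three numeric components of each version. As a rough sketch of a more tolerant comparison, the block below also accepts a leading `v` and a prerelease or build suffix. The names `parseVersion` and `isNewerVersion` are hypothetical and not part of this diff, and prerelease ordering is deliberately simplified (any prerelease sorts below the release with the same core) rather than following the full semver precedence rules.

```typescript
// Hypothetical helpers for illustration only; not part of the diff.
type ParsedVersion = { core: [number, number, number]; prerelease: boolean };

/** Parse "2.7.2", "v2.7.2", "2.7.2-beta.1" or "2.7.2+build"; null if malformed. */
function parseVersion(v: string): ParsedVersion | null {
  const match = /^v?(\d+)\.(\d+)\.(\d+)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/.exec(v.trim());
  if (!match) return null;
  return {
    core: [Number(match[1]), Number(match[2]), Number(match[3])],
    prerelease: match[4] !== undefined,
  };
}

/** True only when both strings parse and `a` is strictly newer than `b`. */
function isNewerVersion(a: string | null, b: string): boolean {
  if (!a) return false;
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  // Unparseable input (for example "unknown") never reports an update.
  if (!pa || !pb) return false;

  const [aMaj, aMin, aPat] = pa.core;
  const [bMaj, bMin, bPat] = pb.core;
  if (aMaj !== bMaj) return aMaj > bMaj;
  if (aMin !== bMin) return aMin > bMin;
  if (aPat !== bPat) return aPat > bPat;

  // Same core: a release is newer than a prerelease of that core.
  return !pa.prerelease && pb.prerelease;
}
```

Returning `false` for unparseable input keeps the update endpoint conservative: when the installed version cannot be read, no install is triggered.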
@@ -0,0 +1,268 @@
+import { CORS_ORIGIN } from "@/shared/utils/cors";
+import { handleSearch } from "@omniroute/open-sse/handlers/search.ts";
+import { getProviderCredentials, extractApiKey, isValidApiKey } from "@/sse/services/auth";
+import {
+  getAllSearchProviders,
+  getSearchProvider,
+  selectProvider,
+  SEARCH_PROVIDERS,
+  SEARCH_CREDENTIAL_FALLBACKS,
+} from "@omniroute/open-sse/config/searchRegistry.ts";
+import { errorResponse } from "@omniroute/open-sse/utils/error.ts";
+import { HTTP_STATUS } from "@omniroute/open-sse/config/constants.ts";
+import * as log from "@/sse/utils/logger";
+import { toJsonErrorPayload } from "@/shared/utils/upstreamError";
+import { enforceApiKeyPolicy } from "@/shared/utils/apiKeyPolicy";
+import { v1SearchSchema } from "@/shared/validation/schemas";
+import { isValidationFailure, validateBody } from "@/shared/validation/helpers";
+import { recordCost } from "@/domain/costRules";
+import {
+  computeCacheKey,
+  getOrCoalesce,
+  SEARCH_CACHE_DEFAULT_TTL_MS,
+} from "@omniroute/open-sse/services/searchCache.ts";
+
+const CORS_HEADERS = {
+  "Access-Control-Allow-Origin": CORS_ORIGIN,
+  "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
+  "Access-Control-Allow-Headers": "*",
+};
+
+/**
+ * Handle CORS preflight
+ */
+export async function OPTIONS() {
+  return new Response(null, { headers: CORS_HEADERS });
+}
+
+/**
+ * GET /v1/search — list available search providers
+ */
+export async function GET() {
+  const providers = getAllSearchProviders();
+  const timestamp = Math.floor(Date.now() / 1000);
+
+  const data = providers.map((p) => ({
+    id: p.id,
+    object: "search_provider",
+    created: timestamp,
+    name: p.name,
+    search_types: p.searchTypes,
+  }));
+
+  return new Response(JSON.stringify({ object: "list", data }), {
+    headers: { "Content-Type": "application/json", ...CORS_HEADERS },
+  });
+}
+
+// Helper: resolve credentials with fallback (e.g., perplexity-search → perplexity)
+async function resolveSearchCredentials(providerId: string) {
+  const creds = await getProviderCredentials(providerId).catch(() => null);
+  if (creds) return creds;
+  const fallbackId = SEARCH_CREDENTIAL_FALLBACKS[providerId];
+  if (fallbackId) return getProviderCredentials(fallbackId).catch(() => null);
+  return null;
+}
+
+// Helper: build domain filter array from filters object
+function buildDomainFilter(filters?: {
+  include_domains?: string[];
+  exclude_domains?: string[];
+}): string[] | undefined {
+  if (!filters) return undefined;
+  const parts: string[] = [];
+  if (filters.include_domains?.length) parts.push(...filters.include_domains);
+  if (filters.exclude_domains?.length) parts.push(...filters.exclude_domains.map((d) => `-${d}`));
+  return parts.length > 0 ? parts : undefined;
+}
+
+/**
+ * POST /v1/search — execute a web search
+ */
+export async function POST(request: Request) {
+  let rawBody: unknown;
+  try {
+    rawBody = await request.json();
+  } catch {
+    log.warn("SEARCH", "Invalid JSON body");
+    return errorResponse(HTTP_STATUS.BAD_REQUEST, "Invalid JSON body");
+  }
+
+  const validation = validateBody(v1SearchSchema, rawBody);
+  if (isValidationFailure(validation)) {
+    return errorResponse(HTTP_STATUS.BAD_REQUEST, validation.error.message);
+  }
+  const body = validation.data;
+
+  // Optional API key validation
+  if (process.env.REQUIRE_API_KEY === "true") {
+    const apiKey = extractApiKey(request);
+    if (!apiKey) {
+      return errorResponse(HTTP_STATUS.UNAUTHORIZED, "Missing API key");
+    }
+    const valid = await isValidApiKey(apiKey);
+    if (!valid) {
+      return errorResponse(HTTP_STATUS.UNAUTHORIZED, "Invalid API key");
+    }
+  }
+
+  // Enforce API key policies — use "search" as model identifier for consistent policy config
+  const policy = await enforceApiKeyPolicy(request, "search");
+  if (policy.rejection) return policy.rejection;
+
+  // Resolve provider and credentials
+  let providerConfig = selectProvider(body.provider);
+  if (!providerConfig) {
+    return errorResponse(
+      HTTP_STATUS.BAD_REQUEST,
+      body.provider ? `Unknown search provider: ${body.provider}` : "No search providers available"
+    );
+  }
+
+  let credentials: Record<string, any> | null = null;
+  let alternateProviderId: string | undefined;
+  let alternateCredentials: Record<string, any> | null = null;
+
+  if (body.provider) {
+    // Explicit provider — single credential lookup (with fallback)
+    credentials = await resolveSearchCredentials(providerConfig.id);
+    if (!credentials) {
+      return errorResponse(
+        HTTP_STATUS.BAD_REQUEST,
+        `No credentials configured for search provider: ${providerConfig.id}. Add an API key for "${providerConfig.id}" in the dashboard.`
+      );
+    }
+  } else {
+    // Auto-select — try the resolved provider first, then iterate others by cost
+    credentials = await resolveSearchCredentials(providerConfig.id);
+
+    if (!credentials) {
+      // Sort by cost to find cheapest with credentials
+      const sortedIds = Object.values(SEARCH_PROVIDERS)
+        .sort((a, b) => a.costPerQuery - b.costPerQuery)
+        .map((p) => p.id);
+
+      for (const pid of sortedIds) {
+        if (pid === providerConfig.id) continue;
+        const altConfig = getSearchProvider(pid);
+        const altCreds = await resolveSearchCredentials(pid);
+        if (altConfig && altCreds) {
+          providerConfig = altConfig;
+          credentials = altCreds;
+          break;
+        }
+      }
+    }
+
+    if (!credentials) {
+      return errorResponse(
+        HTTP_STATUS.BAD_REQUEST,
+        `No credentials configured for any search provider. Add an API key for a search provider (${Object.keys(SEARCH_PROVIDERS).join(", ")}) in the dashboard.`
+      );
+    }
+
+    // Find alternate for failover — must bind credentials to the matched provider
+    const otherIds = Object.values(SEARCH_PROVIDERS)
+      .sort((a, b) => a.costPerQuery - b.costPerQuery)
+      .map((p) => p.id)
+      .filter((id) => id !== providerConfig.id);
+
+    for (const pid of otherIds) {
+      const creds = await resolveSearchCredentials(pid);
+      if (creds) {
+        alternateProviderId = pid;
+        alternateCredentials = creds;
+        break;
+      }
+    }
+  }
+
+  // Clamp max_results to provider limit
+  const clampedMaxResults = Math.min(body.max_results, providerConfig.maxMaxResults);
+
+  // Cache key — includes all fields that affect results
+  const cacheKey = computeCacheKey(
+    body.query,
+    providerConfig.id,
+    body.search_type,
+    clampedMaxResults,
+    body.country,
+    body.language,
+    { filters: body.filters, offset: body.offset, time_range: body.time_range }
+  );
+
+  const ttl = providerConfig.cacheTTLMs || SEARCH_CACHE_DEFAULT_TTL_MS;
+
+  try {
+    const { data: searchResult, cached } = await getOrCoalesce(cacheKey, ttl, async () => {
+      const result = await handleSearch({
+        query: body.query,
+        provider: providerConfig.id,
+        maxResults: clampedMaxResults,
+        searchType: body.search_type,
+        country: body.country,
+        language: body.language,
+        timeRange: body.time_range,
+        offset: body.offset,
+        domainFilter: buildDomainFilter(body.filters),
+        contentOptions: body.content,
+        strictFilters: body.strict_filters,
+        providerOptions: body.provider_options,
+        credentials,
+        alternateProvider: alternateProviderId,
+        alternateCredentials,
+        log,
+      });
+
+      if (!result.success) {
+        throw new SearchError(result.error || "Search failed", result.status || 502);
+      }
+
+      return result.data!;
+    });
+
+    // Record cost for budget tracking (skip cache hits — no provider cost)
+    if (!cached && policy.apiKeyInfo?.id && searchResult.usage?.search_cost_usd > 0) {
+      try {
+        recordCost(policy.apiKeyInfo.id, searchResult.usage.search_cost_usd);
+      } catch (e: any) {
+        log.warn("SEARCH", `Cost recording failed: ${e?.message}`);
+      }
+    }
+
+    const response = {
+      id: `search-${crypto.randomUUID()}`,
+      ...searchResult,
+      cached,
+      usage: cached ? { queries_used: 0, search_cost_usd: 0 } : searchResult.usage,
+    };
+
+    return new Response(JSON.stringify(response), {
+      status: 200,
+      headers: { "Content-Type": "application/json", ...CORS_HEADERS },
+    });
+  } catch (err: any) {
+    if (err instanceof SearchError) {
+      const errorPayload = toJsonErrorPayload(err.message, "Search provider error");
+      return new Response(JSON.stringify(errorPayload), {
+        status: err.statusCode,
+        headers: { "Content-Type": "application/json", ...CORS_HEADERS },
+      });
+    }
+
+    log.error("SEARCH", `Unexpected error: ${err.message}`);
+    const errorPayload = toJsonErrorPayload(err.message, "Internal search error");
+    return new Response(JSON.stringify(errorPayload), {
+      status: 500,
+      headers: { "Content-Type": "application/json", ...CORS_HEADERS },
+    });
+  }
+}
+
+class SearchError extends Error {
+  statusCode: number;
+  constructor(message: string, statusCode: number) {
+    super(message);
+    this.statusCode = statusCode;
+  }
+}
@@ -818,7 +818,12 @@
    "settingsApi": "Settings API",
    "categoryCore": "Core APIs",
    "categoryMedia": "Media & Multi-Modal",
+    "categorySearch": "Search & Discovery",
    "categoryUtility": "Utility & Management",
+    "webSearch": "Web Search",
+    "webSearchDesc": "Unified web search across multiple providers with automatic failover and caching",
+    "searchProvider": "Search Provider",
+    "searchProviderDesc": "This provider is used for web search via POST /v1/search. No model configuration needed — search providers are ready to use once an API key is connected.",
    "enableCloudTitle": "Enable Cloud Proxy",
    "whatYouGet": "What you will get",
    "cloudBenefitAccess": "Access your API from anywhere in the world",
@@ -143,6 +143,10 @@ const SCHEMA_SQL = `
    tokens_cache_creation INTEGER DEFAULT 0,
    tokens_reasoning INTEGER DEFAULT 0,
    status TEXT,
+    success INTEGER DEFAULT 1,
+    latency_ms INTEGER DEFAULT 0,
+    ttft_ms INTEGER DEFAULT 0,
+    error_code TEXT,
    timestamp TEXT NOT NULL
  );
  CREATE INDEX IF NOT EXISTS idx_uh_timestamp ON usage_history(timestamp);
@@ -327,6 +331,35 @@ function ensureProviderConnectionsColumns(db: SqliteDatabase) {
  }
 }

+function ensureUsageHistoryColumns(db: SqliteDatabase) {
+  try {
+    const columns = db.prepare("PRAGMA table_info(usage_history)").all() as Array<{
+      name?: string;
+    }>;
+    const columnNames = new Set(columns.map((column) => String(column.name ?? "")));
+
+    if (!columnNames.has("success")) {
+      db.exec("ALTER TABLE usage_history ADD COLUMN success INTEGER DEFAULT 1");
+      console.log("[DB] Added usage_history.success column");
+    }
+    if (!columnNames.has("latency_ms")) {
+      db.exec("ALTER TABLE usage_history ADD COLUMN latency_ms INTEGER DEFAULT 0");
+      console.log("[DB] Added usage_history.latency_ms column");
+    }
+    if (!columnNames.has("ttft_ms")) {
+      db.exec("ALTER TABLE usage_history ADD COLUMN ttft_ms INTEGER DEFAULT 0");
+      console.log("[DB] Added usage_history.ttft_ms column");
+    }
+    if (!columnNames.has("error_code")) {
+      db.exec("ALTER TABLE usage_history ADD COLUMN error_code TEXT");
+      console.log("[DB] Added usage_history.error_code column");
+    }
+  } catch (error: unknown) {
+    const message = error instanceof Error ? error.message : String(error);
+    console.warn("[DB] Failed to verify usage_history schema:", message);
+  }
+}
+
 export function getDbInstance(): SqliteDatabase {
  if (_db) return _db;

@@ -337,6 +370,7 @@ export function getDbInstance(): SqliteDatabase {
    const memoryDb = new Database(":memory:");
    memoryDb.pragma("journal_mode = WAL");
    memoryDb.exec(SCHEMA_SQL);
+    ensureUsageHistoryColumns(memoryDb);
    _db = memoryDb;
    return memoryDb;
  }
@@ -420,6 +454,7 @@ export function getDbInstance(): SqliteDatabase {
  db.pragma("synchronous = NORMAL");
  db.exec(SCHEMA_SQL);
  ensureProviderConnectionsColumns(db);
+  ensureUsageHistoryColumns(db);

  // ── Versioned Migrations ──
  // Auto-seed 001 as applied (the inline SCHEMA_SQL already created these tables)
@@ -0,0 +1,101 @@
+/**
+ * Detailed Request Logs DB Layer (#378)
+ *
+ * Saves full request/response bodies at each pipeline stage.
+ * Ring-buffer of 500 entries enforced by SQL trigger in migration 006.
+ * Only active when settings.detailed_logs_enabled = "1".
+ */
+import { v4 as uuidv4 } from "uuid";
+import { getDbInstance } from "./core";
+import { getSettings } from "./settings";
+
+export interface RequestDetailLog {
+  id?: string;
+  call_log_id?: string | null;
+  timestamp?: string;
+  client_request?: string | null;
+  translated_request?: string | null;
+  provider_response?: string | null;
+  client_response?: string | null;
+  provider?: string | null;
+  model?: string | null;
+  source_format?: string | null;
+  target_format?: string | null;
+  duration_ms?: number;
+}
+
+/** Returns true if detailed logging is enabled in settings */
+export async function isDetailedLoggingEnabled(): Promise<boolean> {
+  try {
+    const settings = await getSettings();
+    const val = settings.detailed_logs_enabled;
+    return val === true || val === "1" || val === "true";
+  } catch {
+    return false;
+  }
+}
+
+/** Save a detailed log entry — caller must verify isDetailedLoggingEnabled() first */
+export function saveRequestDetailLog(entry: RequestDetailLog): void {
+  const db = getDbInstance();
+  const id = entry.id ?? uuidv4();
+  const timestamp = entry.timestamp ?? new Date().toISOString();
+
+  // Trim large bodies to avoid excessive disk usage (max 65,536 characters each, about 64 KB for ASCII)
+  const trim = (s: string | null | undefined, max = 65536): string | null => {
+    if (!s) return null;
+    return s.length > max ? s.slice(0, max) + "…[truncated]" : s;
+  };
+
+  db.prepare(
+    `
+    INSERT INTO request_detail_logs
+      (id, call_log_id, timestamp, client_request, translated_request,
+       provider_response, client_response, provider, model, source_format, target_format, duration_ms)
+    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+  `
+  ).run(
+    id,
+    entry.call_log_id ?? null,
+    timestamp,
+    trim(entry.client_request),
+    trim(entry.translated_request),
+    trim(entry.provider_response),
+    trim(entry.client_response),
+    entry.provider ?? null,
+    entry.model ?? null,
+    entry.source_format ?? null,
+    entry.target_format ?? null,
+    entry.duration_ms ?? 0
+  );
+}
+
+/** Fetch detailed logs (latest first) */
+export function getRequestDetailLogs(limit = 50, offset = 0): RequestDetailLog[] {
+  const db = getDbInstance();
+  return db
+    .prepare(
+      `
+      SELECT * FROM request_detail_logs
+      ORDER BY timestamp DESC
+      LIMIT ? OFFSET ?
+    `
+    )
+    .all(limit, offset) as RequestDetailLog[];
+}
+
+/** Get a single detailed log by ID */
+export function getRequestDetailLogById(id: string): RequestDetailLog | null {
+  const db = getDbInstance();
+  return (db.prepare("SELECT * FROM request_detail_logs WHERE id = ?").get(id) ??
+    null) as RequestDetailLog | null;
+}
+
+/** Get total count of detailed logs */
+export function getRequestDetailLogCount(): number {
+  const db = getDbInstance();
+  const row = db.prepare("SELECT COUNT(*) as cnt FROM request_detail_logs").get() as {
+    cnt: number;
+  };
+  return row?.cnt ?? 0;
+}
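The truncation step in `saveRequestDetailLog` can be tried in isolation. The following is a minimal sketch that reproduces the `trim` helper from the diff. The constant names `MAX_BODY_CHARS` and `TRUNCATION_MARKER` are introduced here for illustration and do not appear in the source. Note that `s.length` counts UTF-16 code units, so the cap is a character limit and the stored byte size can be larger for non-ASCII bodies.

```typescript
// Hypothetical constant names; the diff inlines these values.
const MAX_BODY_CHARS = 65536;
const TRUNCATION_MARKER = "…[truncated]";

// Same logic as the `trim` closure in saveRequestDetailLog:
// empty, null, or undefined input becomes null; long input is cut and marked.
function trim(s: string | null | undefined, max: number = MAX_BODY_CHARS): string | null {
  if (!s) return null;
  return s.length > max ? s.slice(0, max) + TRUNCATION_MARKER : s;
}

const shortBody = trim('{"model":"x"}'); // returned unchanged
const emptyBody = trim("");              // empty string is stored as NULL
const longBody = trim("a".repeat(70000)); // cut to 65536 chars plus the marker

console.log(shortBody, emptyBody, longBody?.length);
```

Because the marker is appended after the cut, a truncated column holds slightly more than `max` characters, and a truncated JSON body will no longer parse.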
@@ -98,6 +98,10 @@ CREATE TABLE IF NOT EXISTS usage_history (
  tokens_cache_creation INTEGER DEFAULT 0,
  tokens_reasoning INTEGER DEFAULT 0,
  status TEXT,
+  success INTEGER DEFAULT 1,
+  latency_ms INTEGER DEFAULT 0,
+  ttft_ms INTEGER DEFAULT 0,
+  error_code TEXT,
  timestamp TEXT NOT NULL
 );
 CREATE INDEX IF NOT EXISTS idx_uh_timestamp ON usage_history(timestamp);
@@ -0,0 +1,19 @@
+-- 005_combo_agent_fields.sql
+-- Safe migration for existing users: adds optional agent fields to combos.
+-- Uses ADD COLUMN with DEFAULT NULL (SQLite compatible) — existing rows are untouched.
+-- New fields are read as NULL by old code versions (backward compatible).
+
+-- System prompt override: when set, injected as the first system message before
+-- forwarding to the provider. Overrides any system message from the client.
+ALTER TABLE combos ADD COLUMN system_message TEXT DEFAULT NULL;
+
+-- Regex-based tool filter: when set, only tool calls whose "name" matches this
+-- regex pattern are forwarded to the provider. Others are stripped silently.
+-- Example: "^(gh_|create_file|web_fetch)" — allows only GitHub and web tools.
+ALTER TABLE combos ADD COLUMN tool_filter_regex TEXT DEFAULT NULL;
+
+-- Context caching protection: when 1, the proxy tags assistant responses with
+-- <omniModel>provider/model</omniModel> and pins the model for the session.
+ALTER TABLE combos ADD COLUMN context_cache_protection INTEGER DEFAULT 0;
+
+CREATE INDEX IF NOT EXISTS idx_combos_cache_protection ON combos(context_cache_protection);
@@ -0,0 +1,42 @@
+-- 006_detailed_request_logs.sql
+-- Stores full request/response bodies at each pipeline stage for debugging.
+-- Only populated when detailed_logs_enabled = 1 in settings (off by default).
+-- Ring-buffer enforced via trigger: keeps only the last 500 entries.
+-- Existing users are not impacted (table is new, feature is opt-in).
+
+CREATE TABLE IF NOT EXISTS request_detail_logs (
+  id TEXT PRIMARY KEY,
+  call_log_id TEXT,                  -- Logical reference to call_logs.id (no FK constraint; nullable)
+  timestamp TEXT NOT NULL,
+  -- The 4 pipeline stages (all nullable — only populated when available)
+  client_request TEXT,               -- Raw body received from the client (JSON)
+  translated_request TEXT,           -- Body after format translation (JSON)
+  provider_response TEXT,            -- Raw body from the provider (JSON)
+  client_response TEXT,              -- Final body sent to the client (JSON)
+  -- Metadata
+  provider TEXT,
+  model TEXT,
+  source_format TEXT,
+  target_format TEXT,
+  duration_ms INTEGER DEFAULT 0
+);
+
+CREATE INDEX IF NOT EXISTS idx_rdl_timestamp ON request_detail_logs(timestamp);
+CREATE INDEX IF NOT EXISTS idx_rdl_call_log_id ON request_detail_logs(call_log_id);
+
+-- Ring-buffer trigger: auto-delete oldest records beyond 500
+CREATE TRIGGER IF NOT EXISTS trg_rdl_ring_buffer
+AFTER INSERT ON request_detail_logs
+BEGIN
+  DELETE FROM request_detail_logs
+  WHERE id IN (
+    SELECT id FROM request_detail_logs
+    ORDER BY timestamp ASC
+    LIMIT MAX(0, (SELECT COUNT(*) FROM request_detail_logs) - 500)
+  );
+END;
+
+-- Settings key for enabling/disabling detailed logs (default: disabled)
+-- Inserted only if not already present (safe for existing installs)
+INSERT OR IGNORE INTO key_value (namespace, key, value)
+VALUES ('settings', 'detailed_logs_enabled', '0');
@@ -0,0 +1,4 @@
+-- Add request_type column to call_logs for non-chat request tracking (search, embed, rerank).
+-- Backward-compatible: DEFAULT NULL means existing rows are unaffected.
+ALTER TABLE call_logs ADD COLUMN request_type TEXT DEFAULT NULL;
+CREATE INDEX IF NOT EXISTS idx_call_logs_request_type ON call_logs(request_type);
@@ -440,6 +440,69 @@ async function validateAnthropicCompatibleProvider({ apiKey, providerSpecificDat
  }
 }

+// ── Search provider validators (factored) ──
+
+async function validateSearchProvider(
+  url: string,
+  init: RequestInit
+): Promise<{ valid: boolean; error: string | null }> {
+  try {
+    const response = await fetch(url, init);
+    if (response.ok) return { valid: true, error: null };
+    if (response.status === 401 || response.status === 403) {
+      return { valid: false, error: "Invalid API key" };
+    }
+    return { valid: false, error: `Validation failed: ${response.status}` };
+  } catch (error: any) {
+    return { valid: false, error: error.message || "Validation failed" };
+  }
+}
+
+const SEARCH_VALIDATOR_CONFIGS: Record<
+  string,
+  (apiKey: string) => { url: string; init: RequestInit }
+> = {
+  "serper-search": (apiKey) => ({
+    url: "https://google.serper.dev/search",
+    init: {
+      method: "POST",
+      headers: { "Content-Type": "application/json", "X-API-Key": apiKey },
+      body: JSON.stringify({ q: "test", num: 1 }),
+    },
+  }),
+  "brave-search": (apiKey) => ({
+    url: "https://api.search.brave.com/res/v1/web/search?q=test&count=1",
+    init: {
+      method: "GET",
+      headers: { Accept: "application/json", "X-Subscription-Token": apiKey },
+    },
+  }),
+  "perplexity-search": (apiKey) => ({
+    url: "https://api.perplexity.ai/search",
+    init: {
+      method: "POST",
+      headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
+      body: JSON.stringify({ query: "test", max_results: 1 }),
+    },
+  }),
+  "exa-search": (apiKey) => ({
+    url: "https://api.exa.ai/search",
+    init: {
+      method: "POST",
+      headers: { "Content-Type": "application/json", "x-api-key": apiKey },
+      body: JSON.stringify({ query: "test", numResults: 1 }),
+    },
+  }),
+  "tavily-search": (apiKey) => ({
+    url: "https://api.tavily.com/search",
+    init: {
+      method: "POST",
+      headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
+      body: JSON.stringify({ query: "test", max_results: 1 }),
+    },
+  }),
+};
+
 export async function validateProviderApiKey({ provider, apiKey, providerSpecificData = {} }: any) {
  if (!provider || !apiKey) {
    return { valid: false, error: "Provider and API key required", unsupported: false };
@@ -468,6 +531,16 @@ export async function validateProviderApiKey({ provider, apiKey, providerSpecifi
    nanobanana: validateNanoBananaProvider,
    elevenlabs: validateElevenLabsProvider,
    inworld: validateInworldProvider,
+    // Search providers — use factored validator
+    ...Object.fromEntries(
+      Object.entries(SEARCH_VALIDATOR_CONFIGS).map(([id, configFn]) => [
+        id,
+        ({ apiKey }: any) => {
+          const { url, init } = configFn(apiKey);
+          return validateSearchProvider(url, init);
+        },
+      ])
+    ),
  };

  if (SPECIALTY_VALIDATORS[provider]) {
@@ -186,6 +186,7 @@ export async function saveCallLog(entry: any) {
      duration: entry.duration || 0,
      tokensIn: entry.tokens?.prompt_tokens || 0,
      tokensOut: entry.tokens?.completion_tokens || 0,
+      requestType: entry.requestType || null,
      sourceFormat: entry.sourceFormat || null,
      targetFormat: entry.targetFormat || null,
      apiKeyId,
@@ -201,10 +202,10 @@ export async function saveCallLog(entry: any) {
    db.prepare(
      `
      INSERT INTO call_logs (id, timestamp, method, path, status, model, provider,
-        account, connection_id, duration, tokens_in, tokens_out, source_format, target_format,
+        account, connection_id, duration, tokens_in, tokens_out, request_type, source_format, target_format,
        api_key_id, api_key_name, combo_name, request_body, response_body, error)
      VALUES (@id, @timestamp, @method, @path, @status, @model, @provider,
-        @account, @connectionId, @duration, @tokensIn, @tokensOut, @sourceFormat, @targetFormat,
+        @account, @connectionId, @duration, @tokensIn, @tokensOut, @requestType, @sourceFormat, @targetFormat,
        @apiKeyId, @apiKeyName, @comboName, @requestBody, @responseBody, @error)
    `
    ).run(logEntry);
@@ -24,8 +24,7 @@ export const CALL_LOGS_DIR = isCloud ? null : path.join(DATA_DIR, "call_logs");
 // Legacy paths
 const LEGACY_DB_FILE =
  isCloud || !LEGACY_DATA_DIR ? null : path.join(LEGACY_DATA_DIR, "usage.json");
-const LEGACY_LOG_FILE =
-  isCloud || !LEGACY_DATA_DIR ? null : path.join(LEGACY_DATA_DIR, "log.txt");
+const LEGACY_LOG_FILE = isCloud || !LEGACY_DATA_DIR ? null : path.join(LEGACY_DATA_DIR, "log.txt");
 const LEGACY_CALL_LOGS_DB_FILE =
  isCloud || !LEGACY_DATA_DIR ? null : path.join(LEGACY_DATA_DIR, "call_logs.json");
 const LEGACY_CALL_LOGS_DIR =
@@ -82,10 +81,10 @@ export function migrateUsageJsonToSqlite() {
        const insert = db.prepare(`
          INSERT INTO usage_history (provider, model, connection_id, api_key_id, api_key_name,
            tokens_input, tokens_output, tokens_cache_read, tokens_cache_creation, tokens_reasoning,
-            status, timestamp)
+            status, success, latency_ms, ttft_ms, error_code, timestamp)
          VALUES (@provider, @model, @connectionId, @apiKeyId, @apiKeyName,
            @tokensInput, @tokensOutput, @tokensCacheRead, @tokensCacheCreation, @tokensReasoning,
-            @status, @timestamp)
+            @status, @success, @latencyMs, @ttftMs, @errorCode, @timestamp)
        `);

        const tx = db.transaction(() => {
@@ -103,6 +102,14 @@ export function migrateUsageJsonToSqlite() {
                entry.tokens?.cacheCreation ?? entry.tokens?.cache_creation_input_tokens ?? 0,
              tokensReasoning: entry.tokens?.reasoning ?? entry.tokens?.reasoning_tokens ?? 0,
              status: entry.status || null,
+              success: entry.success === false ? 0 : 1,
+              latencyMs: Number.isFinite(Number(entry.latencyMs)) ? Number(entry.latencyMs) : 0,
+              ttftMs: Number.isFinite(Number(entry.timeToFirstTokenMs))
+                ? Number(entry.timeToFirstTokenMs)
+                : Number.isFinite(Number(entry.latencyMs))
+                  ? Number(entry.latencyMs)
+                  : 0,
+              errorCode: entry.errorCode || null,
              timestamp: entry.timestamp || new Date().toISOString(),
            });
          }
@@ -29,6 +29,20 @@ function toNumber(value: unknown): number {
  return 0;
 }

+function percentile(sortedValues: number[], p: number): number {
+  if (sortedValues.length === 0) return 0;
+  if (sortedValues.length === 1) return sortedValues[0];
+  const bounded = Math.max(0, Math.min(1, p));
+  const idx = Math.round((sortedValues.length - 1) * bounded);
+  return sortedValues[idx] ?? sortedValues[sortedValues.length - 1];
+}
+
+function stdDev(values: number[], avg: number): number {
+  if (values.length <= 1) return 0;
+  const variance = values.reduce((acc, v) => acc + (v - avg) ** 2, 0) / values.length;
+  return Math.sqrt(Math.max(0, variance));
+}
+
 // ──────────────── Pending Requests (in-memory) ────────────────

 const pendingRequests: {
@@ -107,6 +121,10 @@ export async function getUsageDb() {
        reasoning: toNumber(r.tokens_reasoning),
      },
      status: toStringOrNull(r.status),
+      success: toNumber(r.success) === 1,
+      latencyMs: toNumber(r.latency_ms),
+      timeToFirstTokenMs: toNumber(r.ttft_ms),
+      errorCode: toStringOrNull(r.error_code),
      timestamp: toStringOrNull(r.timestamp),
    };
  });
@@ -130,8 +148,8 @@ export async function saveRequestUsage(entry: any) {
      `
      INSERT INTO usage_history (provider, model, connection_id, api_key_id, api_key_name,
        tokens_input, tokens_output, tokens_cache_read, tokens_cache_creation, tokens_reasoning,
-        status, timestamp)
-      VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+        status, success, latency_ms, ttft_ms, error_code, timestamp)
+      VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
    `
    ).run(
      entry.provider || null,
@@ -145,6 +163,14 @@ export async function saveRequestUsage(entry: any) {
      entry.tokens?.cacheCreation ?? entry.tokens?.cache_creation_input_tokens ?? 0,
      entry.tokens?.reasoning ?? entry.tokens?.reasoning_tokens ?? 0,
      entry.status || null,
+      entry.success === false ? 0 : 1,
+      Number.isFinite(Number(entry.latencyMs)) ? Number(entry.latencyMs) : 0,
+      Number.isFinite(Number(entry.timeToFirstTokenMs))
+        ? Number(entry.timeToFirstTokenMs)
+        : Number.isFinite(Number(entry.latencyMs))
+          ? Number(entry.latencyMs)
+          : 0,
+      entry.errorCode || null,
      timestamp
    );
  } catch (error) {
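The nested ternaries that populate `latency_ms` and `ttft_ms` in this hunk encode one rule: a non-finite latency becomes 0, and a missing time-to-first-token falls back to the total latency. A minimal sketch of the same rule follows; `finiteOrNull`, `normalizeTiming`, and the `UsageTiming` type are hypothetical names used only for illustration, not helpers from the repository.

```typescript
// Hypothetical shape: only the two timing fields read by saveRequestUsage.
interface UsageTiming {
  latencyMs?: unknown;
  timeToFirstTokenMs?: unknown;
}

// Coerce with Number() exactly as the diff does; null means "not usable".
function finiteOrNull(value: unknown): number | null {
  const n = Number(value);
  return Number.isFinite(n) ? n : null;
}

// latency: finite value or 0; ttft: finite value, else latency, else 0.
function normalizeTiming(entry: UsageTiming): { latencyMs: number; ttftMs: number } {
  const latency = finiteOrNull(entry.latencyMs);
  const ttft = finiteOrNull(entry.timeToFirstTokenMs);
  return { latencyMs: latency ?? 0, ttftMs: ttft ?? latency ?? 0 };
}

const streamed = normalizeTiming({ latencyMs: 850, timeToFirstTokenMs: 120 });
const nonStreamed = normalizeTiming({ latencyMs: 850 });
const missing = normalizeTiming({});
const garbage = normalizeTiming({ latencyMs: "abc" });

console.log(streamed, nonStreamed, missing, garbage);
```

One consequence of the fallback is that non-streaming requests report a TTFT equal to their full latency, which is worth keeping in mind when comparing TTFT across streaming and non-streaming traffic.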
@@ -202,11 +228,150 @@ export async function getUsageHistory(filter: any = {}) {
        reasoning: toNumber(r.tokens_reasoning),
      },
      status: toStringOrNull(r.status),
+      success: toNumber(r.success) === 1,
+      latencyMs: toNumber(r.latency_ms),
+      timeToFirstTokenMs: toNumber(r.ttft_ms),
+      errorCode: toStringOrNull(r.error_code),
      timestamp: toStringOrNull(r.timestamp),
    };
  });
 }

+export interface ModelLatencyStatsEntry {
+  provider: string;
+  model: string;
+  key: string;
+  totalRequests: number;
+  successfulRequests: number;
+  successRate: number; // 0..1
+  avgLatencyMs: number;
+  p50LatencyMs: number;
+  p95LatencyMs: number;
+  p99LatencyMs: number;
+  latencyStdDev: number;
+  windowHours: number;
+}
+
+/**
+ * Aggregate rolling latency stats per provider/model from usage_history.
+ * Used by auto-combo routing to incorporate real-world latency and reliability.
+ */
+export async function getModelLatencyStats(
+  options: { windowHours?: number; minSamples?: number; maxRows?: number } = {}
+): Promise<Record<string, ModelLatencyStatsEntry>> {
+  const windowHours =
+    Number.isFinite(Number(options.windowHours)) && Number(options.windowHours) > 0
+      ? Number(options.windowHours)
+      : 24;
+  const minSamples =
+    Number.isFinite(Number(options.minSamples)) && Number(options.minSamples) > 0
+      ? Number(options.minSamples)
+      : 1;
+  const maxRows =
+    Number.isFinite(Number(options.maxRows)) && Number(options.maxRows) > 0
+      ? Number(options.maxRows)
+      : 10000;
+
+  const db = getDbInstance();
+  const sinceIso = new Date(Date.now() - windowHours * 60 * 60 * 1000).toISOString();
+
+  type LatencyRow = {
+    provider: string | null;
+    model: string | null;
+    success: number | null;
+    latency_ms: number | null;
+  };
+
+  const rows = db
+    .prepare(
+      `
+      SELECT provider, model, success, latency_ms
+      FROM usage_history
+      WHERE timestamp >= @sinceIso
+        AND provider IS NOT NULL
+        AND model IS NOT NULL
+      ORDER BY timestamp DESC
+      LIMIT @maxRows
+    `
+    )
+    .all({ sinceIso, maxRows }) as LatencyRow[];
+
+  const grouped = new Map<
+    string,
+    {
+      provider: string;
+      model: string;
+      totalRequests: number;
+      successfulRequests: number;
+      successfulLatencies: number[];
+      allLatencies: number[];
+    }
+  >();
+
+  for (const row of rows) {
+    const provider = toStringOrNull(row.provider);
+    const model = toStringOrNull(row.model);
+    if (!provider || !model) continue;
+
+    const key = `${provider}/${model}`;
+    if (!grouped.has(key)) {
+      grouped.set(key, {
+        provider,
+        model,
+        totalRequests: 0,
+        successfulRequests: 0,
+        successfulLatencies: [],
+        allLatencies: [],
+      });
+    }
+
+    const bucket = grouped.get(key);
+    if (!bucket) continue;
+
+    bucket.totalRequests += 1;
+    const isSuccess = toNumber(row.success) !== 0;
+    if (isSuccess) bucket.successfulRequests += 1;
+
+    const latency = toNumber(row.latency_ms);
+    if (latency > 0) {
+      bucket.allLatencies.push(latency);
+      if (isSuccess) bucket.successfulLatencies.push(latency);
+    }
+  }
+
+  const stats: Record<string, ModelLatencyStatsEntry> = {};
+  for (const [key, bucket] of grouped.entries()) {
+    const baseLatencies =
+      bucket.successfulLatencies.length >= minSamples
+        ? bucket.successfulLatencies
+        : bucket.allLatencies;
+
+    if (baseLatencies.length < minSamples) continue;
+
+    const sorted = [...baseLatencies].sort((a, b) => a - b);
+    const avg = sorted.reduce((acc, n) => acc + n, 0) / sorted.length;
+    const successRate =
+      bucket.totalRequests > 0 ? bucket.successfulRequests / bucket.totalRequests : 0;
+
+    stats[key] = {
+      provider: bucket.provider,
+      model: bucket.model,
+      key,
+      totalRequests: bucket.totalRequests,
+      successfulRequests: bucket.successfulRequests,
+      successRate,
+      avgLatencyMs: Math.round(avg),
+      p50LatencyMs: Math.round(percentile(sorted, 0.5)),
+      p95LatencyMs: Math.round(percentile(sorted, 0.95)),
+      p99LatencyMs: Math.round(percentile(sorted, 0.99)),
+      latencyStdDev: Math.round(stdDev(sorted, avg)),
+      windowHours,
+    };
+  }
+
+  return stats;
+}
+
 // ──────────────── Request Log (log.txt) ────────────────

 import fs from "fs";
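To see what `getModelLatencyStats` reports for a small sample, the two helpers from this diff can be run on their own. The sketch below copies `percentile` and `stdDev` as written above; the latency values are made-up sample data. With few samples, the rounded-index percentile means p95 and p99 both land on the slowest request, and `stdDev` is the population form (divides by `n`, not `n - 1`).

```typescript
// Copied from the diff so the example is self-contained.
function percentile(sortedValues: number[], p: number): number {
  if (sortedValues.length === 0) return 0;
  if (sortedValues.length === 1) return sortedValues[0];
  const bounded = Math.max(0, Math.min(1, p));
  const idx = Math.round((sortedValues.length - 1) * bounded);
  return sortedValues[idx] ?? sortedValues[sortedValues.length - 1];
}

function stdDev(values: number[], avg: number): number {
  if (values.length <= 1) return 0;
  const variance = values.reduce((acc, v) => acc + (v - avg) ** 2, 0) / values.length;
  return Math.sqrt(Math.max(0, variance));
}

// Illustrative latencies in ms: four fast requests and one slow outlier.
const latencies = [120, 80, 100, 2000, 90];
const sorted = [...latencies].sort((a, b) => a - b);
const avg = sorted.reduce((acc, n) => acc + n, 0) / sorted.length;

const summary = {
  avgLatencyMs: Math.round(avg),
  p50LatencyMs: Math.round(percentile(sorted, 0.5)),
  p95LatencyMs: Math.round(percentile(sorted, 0.95)),
  p99LatencyMs: Math.round(percentile(sorted, 0.99)),
  latencyStdDev: Math.round(stdDev(sorted, avg)),
};

console.log(summary);
```

For this sample the average (478 ms) sits far above the median (100 ms) because of the single outlier, which is one reason routing on p50 or p95 can behave differently from routing on the mean.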
@@ -23,6 +23,7 @@ export {
  getUsageDb,
  saveRequestUsage,
  getUsageHistory,
+  getModelLatencyStats,
  appendRequestLog,
  getRecentLogs,
 } from "./usage/usageHistory";
@@ -31,9 +32,4 @@ export { calculateCost } from "./usage/costCalculator";

 export { getUsageStats } from "./usage/usageStats";

-export {
-  saveCallLog,
-  rotateCallLogs,
-  getCallLogs,
-  getCallLogById,
-} from "./usage/callLogs";
+export { saveCallLog, rotateCallLogs, getCallLogs, getCallLogById } from "./usage/callLogs";
@@ -0,0 +1,54 @@
+/**
+ * Kiro IDE MITM Configuration (#336)
+ *
+ * Kiro IDE removed the Base URL / API Key configuration UI.
+ * To route Kiro's traffic through OmniRoute, we intercept it using MITM,
+ * similar to the existing Antigravity/Claude Code implementation.
+ *
+ * Kiro IDE uses the Anthropic API at https://api.anthropic.com:
+ * - Main endpoint: POST /v1/messages
+ * - Auth header: x-api-key: <key>
+ * - User-Agent contains: "kiro" or "Kiro"
+ *
+ * To use: Install OmniRoute's MITM certificate, then run:
+ *   omniroute mitm start --targets kiro
+ *
+ * The MITM server intercepts requests to api.anthropic.com and forwards
+ * them to the OmniRoute proxy (localhost:20128) instead.
+ */
+
+export interface MitmTarget {
+  id: string;
+  name: string;
+  description: string;
+  targetHost: string;
+  targetPort: number;
+  localPort: number;
+  userAgentPattern: string | null;
+  apiEndpoints: string[];
+  authHeader: string;
+  instructions: string[];
+  referenceIde?: string;
+}
+
+/** Kiro IDE MITM profile */
+export const KIRO_MITM_PROFILE: MitmTarget = {
+  id: "kiro",
+  name: "Kiro IDE",
+  description:
+    "Intercepts Kiro IDE requests to api.anthropic.com and routes them through OmniRoute.",
+  targetHost: "api.anthropic.com",
+  targetPort: 443,
+  localPort: 20130,
+  userAgentPattern: null, // Kiro does not expose a stable User-Agent
+  apiEndpoints: ["/v1/messages"],
+  authHeader: "x-api-key",
+  instructions: [
+    "1. Install OmniRoute's root certificate: run `omniroute cert install` or go to Settings → MITM Certificates",
+    "2. Start the MITM proxy: `omniroute mitm start --target kiro`",
+    "3. Set your system HTTP proxy to 127.0.0.1:20130 (or use transparent MITM via DNS override)",
+    "4. Open Kiro IDE — API calls will be automatically routed through OmniRoute.",
+    "5. Verify: check the Proxy Logs in OmniRoute dashboard and look for provider=anthropic source=mitm",
+  ],
+  referenceIde: "antigravity", // Same MITM infrastructure as Antigravity
+};
@@ -258,7 +258,7 @@ export default function RequestLoggerV2() {
          onClick={() => setRecording(!recording)}
          className={`flex items-center gap-2 px-3 py-1.5 rounded-full text-sm font-medium border transition-colors ${
            recording
-              ? "bg-red-500/10 border-red-500/30 text-red-400"
+              ? "bg-red-500/10 border-red-500/30 text-red-700 dark:text-red-400"
              : "bg-bg-subtle border-border text-text-muted"
          }`}
        >
@@ -413,11 +413,11 @@ export default function RequestLoggerV2() {
            className={`flex items-center gap-1.5 px-3 py-1 rounded-full text-xs font-medium border transition-all ${
              activeFilter === f.key
                ? f.key === "error"
-                  ? "bg-red-500/20 text-red-400 border-red-500/40"
+                  ? "bg-red-500/20 text-red-700 dark:text-red-400 border-red-500/40"
                  : f.key === "ok"
-                    ? "bg-emerald-500/20 text-emerald-400 border-emerald-500/40"
+                    ? "bg-emerald-500/20 text-emerald-700 dark:text-emerald-400 border-emerald-500/40"
                    : f.key === "combo"
-                      ? "bg-violet-500/20 text-violet-300 border-violet-500/40"
+                      ? "bg-violet-500/20 text-violet-700 dark:text-violet-300 border-violet-500/40"
                      : "bg-primary text-white border-primary"
                : "bg-bg-subtle border-border text-text-muted hover:border-text-muted"
            }`}
@@ -635,7 +635,7 @@ export default function RequestLoggerV2() {
                      {visibleColumns.combo && (
                        <td className="px-3 py-2">
                          {log.comboName ? (
-                            <span className="inline-block px-2 py-0.5 rounded-full text-[9px] font-bold bg-violet-500/20 text-violet-700 dark:text-violet-300 border border-violet-500/30">
+                            <span className="inline-block px-2 py-0.5 rounded-full text-[9px] font-bold bg-violet-500/20 text-violet-800 dark:text-violet-300 border border-violet-500/40">
                              {log.comboName}
                            </span>
                          ) : (
@@ -7,6 +7,20 @@ export const DEFAULT_PRICING = {

  // Claude Code (cc)
  cc: {
+    "claude-opus-4-6": {
+      input: 5.0,
+      output: 25.0,
+      cached: 2.5,
+      reasoning: 25.0,
+      cache_creation: 5.0,
+    },
+    "claude-sonnet-4-6": {
+      input: 3.0,
+      output: 15.0,
+      cached: 1.5,
+      reasoning: 15.0,
+      cache_creation: 3.0,
+    },
    "claude-opus-4-5-20251101": {
      input: 15.0,
      output: 75.0,
@@ -115,6 +129,13 @@ export const DEFAULT_PRICING = {
      reasoning: 18.0,
      cache_creation: 2.0,
    },
+    "gemini-3.1-pro-preview": {
+      input: 2.0,
+      output: 12.0,
+      cached: 0.25,
+      reasoning: 18.0,
+      cache_creation: 2.0,
+    },
    "gemini-2.5-pro": {
      input: 2.0,
      output: 12.0,
@@ -129,12 +150,13 @@ export const DEFAULT_PRICING = {
      reasoning: 3.75,
      cache_creation: 0.3,
    },
+    // Gemini 2.5 Flash Lite: price corrected via ClawRouter to $0.10/$0.40 (was $0.15/$1.25)
    "gemini-2.5-flash-lite": {
-      input: 0.15,
-      output: 1.25,
-      cached: 0.015,
-      reasoning: 1.875,
-      cache_creation: 0.15,
+      input: 0.1,
+      output: 0.4,
+      cached: 0.025,
+      reasoning: 0.6,
+      cache_creation: 0.1,
    },
  },

@@ -202,18 +224,25 @@ export const DEFAULT_PRICING = {
      cache_creation: 0.75,
    },
    "deepseek-v3.2-chat": {
-      input: 0.5,
-      output: 2.0,
-      cached: 0.25,
-      reasoning: 3.0,
-      cache_creation: 0.5,
+      input: 0.28,
+      output: 0.42,
+      cached: 0.014,
+      reasoning: 0.63,
+      cache_creation: 0.28,
+    },
+    "deepseek-v3.2": {
+      input: 0.28,
+      output: 0.42,
+      cached: 0.014,
+      reasoning: 0.63,
+      cache_creation: 0.28,
    },
    "deepseek-v3.2-reasoner": {
-      input: 0.75,
-      output: 3.0,
-      cached: 0.375,
-      reasoning: 4.5,
-      cache_creation: 0.75,
+      input: 0.55,
+      output: 2.19,
+      cached: 0.14,
+      reasoning: 2.19,
+      cache_creation: 0.55,
    },
    // Short-form aliases used by decolua/9router catalog (Mar 2026)
    "deepseek-3.1": {
@@ -451,10 +480,71 @@ export const DEFAULT_PRICING = {
      reasoning: 15.0,
      cache_creation: 3.0,
    },
+    // Claude 4.5 Haiku: Anthropic's latest economy-tier model (2025-10)
+    "claude-haiku-4-5-20251001": {
+      input: 1.0,
+      output: 5.0,
+      cached: 0.5,
+      reasoning: 7.5,
+      cache_creation: 1.0,
+    },
+    "claude-haiku-4.5": {
+      input: 1.0,
+      output: 5.0,
+      cached: 0.5,
+      reasoning: 7.5,
+      cache_creation: 1.0,
+    },
+    // Claude Sonnet 4.6 — maxOutput 64k tokens, $3/$15/M
+    "claude-sonnet-4-6-20251031": {
+      input: 3.0,
+      output: 15.0,
+      cached: 1.5,
+      reasoning: 22.5,
+      cache_creation: 3.0,
+    },
+    "claude-sonnet-4.6": {
+      input: 3.0,
+      output: 15.0,
+      cached: 1.5,
+      reasoning: 22.5,
+      cache_creation: 3.0,
+    },
+    // Claude Opus 4.6: cheaper than Opus 4 ($5/$25 vs $15/$75)
+    "claude-opus-4-6-20251031": {
+      input: 5.0,
+      output: 25.0,
+      cached: 2.5,
+      reasoning: 37.5,
+      cache_creation: 5.0,
+    },
+    "claude-opus-4.6": {
+      input: 5.0,
+      output: 25.0,
+      cached: 2.5,
+      reasoning: 37.5,
+      cache_creation: 5.0,
+    },
  },

  // Gemini
  gemini: {
+    // Gemini 3.1 Pro: new Google flagship (2026-03-17)
+    // Context: 1,050,000 tokens | Max Output: 65,536
+    "gemini-3.1-pro": {
+      input: 2.0,
+      output: 12.0,
+      cached: 0.25,
+      reasoning: 18.0,
+      cache_creation: 2.0,
+    },
+    "gemini-3-1-pro": {
+      input: 2.0,
+      output: 12.0,
+      cached: 0.25,
+      reasoning: 18.0,
+      cache_creation: 2.0,
+    },
    "gemini-3-pro-preview": {
      input: 2.0,
      output: 12.0,
@@ -462,6 +552,13 @@ export const DEFAULT_PRICING = {
      reasoning: 18.0,
      cache_creation: 2.0,
    },
+    "gemini-3.1-pro-preview": {
+      input: 2.0,
+      output: 12.0,
+      cached: 0.25,
+      reasoning: 18.0,
+      cache_creation: 2.0,
+    },
    "gemini-2.5-pro": {
      input: 2.0,
      output: 12.0,
@@ -476,12 +573,53 @@ export const DEFAULT_PRICING = {
      reasoning: 3.75,
      cache_creation: 0.3,
    },
+    // Gemini 2.5 Flash Lite: price corrected to $0.10/$0.40 (ClawRouter)
    "gemini-2.5-flash-lite": {
-      input: 0.15,
-      output: 1.25,
-      cached: 0.015,
-      reasoning: 1.875,
-      cache_creation: 0.15,
+      input: 0.1,
+      output: 0.4,
+      cached: 0.025,
+      reasoning: 0.6,
+      cache_creation: 0.1,
+    },
+  },
+
+  // DeepSeek: native API (V3.2 Chat), kept separate from the free providers
+  // Price: $0.28/$0.42/M tokens (verified via ClawRouter 2026-03-17)
+  deepseek: {
+    "deepseek-chat": {
+      input: 0.28,
+      output: 0.42,
+      cached: 0.014,
+      reasoning: 0.42,
+      cache_creation: 0.28,
+    },
+    "deepseek-v3": {
+      input: 0.28,
+      output: 0.42,
+      cached: 0.014,
+      reasoning: 0.42,
+      cache_creation: 0.28,
+    },
+    "deepseek-v3.2": {
+      input: 0.28,
+      output: 0.42,
+      cached: 0.014,
+      reasoning: 0.42,
+      cache_creation: 0.28,
+    },
+    "deepseek-reasoner": {
+      input: 0.55,
+      output: 2.19,
+      cached: 0.14,
+      reasoning: 2.19,
+      cache_creation: 0.55,
+    },
+    "deepseek-r1": {
+      input: 0.55,
+      output: 2.19,
+      cached: 0.14,
+      reasoning: 2.19,
+      cache_creation: 0.55,
    },
  },

@@ -498,6 +636,20 @@ export const DEFAULT_PRICING = {

  // GLM
  glm: {
+    "glm-5": {
+      input: 1.0,
+      output: 3.2,
+      cached: 0.5,
+      reasoning: 4.8,
+      cache_creation: 1.0,
+    },
+    "glm-5-turbo": {
+      input: 1.2,
+      output: 4.0,
+      cached: 0.6,
+      reasoning: 6.0,
+      cache_creation: 1.2,
+    },
    "glm-4.7": {
      input: 0.75,
      output: 3.0,
@@ -521,7 +673,7 @@ export const DEFAULT_PRICING = {
    },
  },

-  // Kimi
+  // Kimi (Moonshot)
  kimi: {
    "kimi-latest": {
      input: 1.0,
@@ -530,10 +682,33 @@ export const DEFAULT_PRICING = {
      reasoning: 6.0,
      cache_creation: 1.0,
    },
+    // Kimi K2.5: direct access via the Moonshot API
+    // Context: 262,144 tokens | Capabilities: reasoning, vision, agentic, tools
+    "kimi-k2.5": {
+      input: 0.6,
+      output: 3.0,
+      cached: 0.3,
+      reasoning: 4.5,
+      cache_creation: 0.6,
+    },
+    "moonshot-kimi-k2.5": {
+      input: 0.6,
+      output: 3.0,
+      cached: 0.3,
+      reasoning: 4.5,
+      cache_creation: 0.6,
+    },
  },

  // MiniMax
  minimax: {
+    "minimax-m2.1": {
+      input: 0.5,
+      output: 2.0,
+      cached: 0.25,
+      reasoning: 3.0,
+      cache_creation: 0.5,
+    },
    "MiniMax-M2.1": {
      input: 0.5,
      output: 2.0,
@@ -541,6 +716,22 @@ export const DEFAULT_PRICING = {
      reasoning: 3.0,
      cache_creation: 0.5,
    },
+    // MiniMax M2.5: cheaper than M2.1, reasoning + tools
+    // Context: 204,800 tokens | Max Output: 16,384 tokens
+    "minimax-m2.5": {
+      input: 0.3,
+      output: 1.2,
+      cached: 0.15,
+      reasoning: 1.8,
+      cache_creation: 0.3,
+    },
+    "MiniMax-M2.5": {
+      input: 0.3,
+      output: 1.2,
+      cached: 0.15,
+      reasoning: 1.8,
+      cache_creation: 0.3,
+    },
  },

  // ─── Free-tier API Key Providers (nominal $0 pricing) ───
@@ -627,6 +818,7 @@ export const DEFAULT_PRICING = {

  // Nvidia
  nvidia: {
+    "nvidia/gpt-oss-120b": { input: 0, output: 0, cached: 0, reasoning: 0, cache_creation: 0 },
    "openai/gpt-oss-120b": { input: 0, output: 0, cached: 0, reasoning: 0, cache_creation: 0 },
    "gpt-oss-120b": { input: 0, output: 0, cached: 0, reasoning: 0, cache_creation: 0 },
    "moonshotai/kimi-k2.5": { input: 0, output: 0, cached: 0, reasoning: 0, cache_creation: 0 },
@@ -757,7 +949,85 @@ export const DEFAULT_PRICING = {
    },
  },

-  // Kiro (AWS)
+  // ─────────────────────────────────────────────────────────────────────
+  // xAI (Grok) — Grok-3 + Grok-4 Family
+  // Source: ClawRouter benchmarks 2026-03-17
+  // Grok-4-fast-non-reasoning: 1143ms P50 (fastest in the benchmark)
+  // ─────────────────────────────────────────────────────────────────────
+  xai: {
+    "grok-3": {
+      input: 3.0,
+      output: 15.0,
+      cached: 1.5,
+      reasoning: 22.5,
+      cache_creation: 3.0,
+    },
+    "grok-3-mini": {
+      input: 0.3,
+      output: 0.5,
+      cached: 0.15,
+      reasoning: 0.75,
+      cache_creation: 0.3,
+    },
+    // Grok-4 Fast Family: ultra-cheap ($0.20/$0.50/M)
+    "grok-4-fast-non-reasoning": {
+      input: 0.2,
+      output: 0.5,
+      cached: 0.1,
+      reasoning: 0.0,
+      cache_creation: 0.2,
+    },
+    "grok-4-fast-reasoning": {
+      input: 0.2,
+      output: 0.5,
+      cached: 0.1,
+      reasoning: 0.75,
+      cache_creation: 0.2,
+    },
+    "grok-4-1-fast-non-reasoning": {
+      input: 0.2,
+      output: 0.5,
+      cached: 0.1,
+      reasoning: 0.0,
+      cache_creation: 0.2,
+    },
+    "grok-4-1-fast-reasoning": {
+      input: 0.2,
+      output: 0.5,
+      cached: 0.1,
+      reasoning: 0.75,
+      cache_creation: 0.2,
+    },
+    "grok-4-0709": {
+      input: 0.2,
+      output: 1.5,
+      cached: 0.1,
+      reasoning: 2.25,
+      cache_creation: 0.2,
+    },
+  },
+
+  // ─────────────────────────────────────────────────────────────────────
+  // Z.AI / ZhipuAI — GLM-5 Family
+  // Added via ClawRouter 2026-03-17 | maxOutput: 128k tokens!
+  // ─────────────────────────────────────────────────────────────────────
+  zai: {
+    "glm-5": {
+      input: 1.0,
+      output: 3.2,
+      cached: 0.5,
+      reasoning: 4.8,
+      cache_creation: 1.0,
+    },
+    "glm-5-turbo": {
+      input: 1.2,
+      output: 4.0,
+      cached: 0.6,
+      reasoning: 6.0,
+      cache_creation: 1.2,
+    },
+  },
+
  kiro: {
    "claude-sonnet-4.5": {
      input: 3.0,
@@ -390,6 +390,66 @@ export const APIKEY_PROVIDERS = {
    website: "https://cloud.google.com/vertex-ai",
    authHint: "Provide Service Account JSON or OAuth access_token",
  },
+  zai: {
+    id: "zai",
+    alias: "zai",
+    name: "Z.AI (GLM-5)",
+    icon: "psychology",
+    color: "#2563EB",
+    textIcon: "ZA",
+    website: "https://open.bigmodel.cn",
+    authHint: "API key from https://open.bigmodel.cn/usercenter/apikeys",
+  },
+  "perplexity-search": {
+    id: "perplexity-search",
+    alias: "pplx-search",
+    name: "Perplexity Search",
+    icon: "search",
+    color: "#20808D",
+    textIcon: "PS",
+    website: "https://docs.perplexity.ai/guides/search-quickstart",
+    authHint: "Same API key as Perplexity (pplx-...)",
+  },
+  "serper-search": {
+    id: "serper-search",
+    alias: "serper-search",
+    name: "Serper Search",
+    icon: "search",
+    color: "#4285F4",
+    textIcon: "SP",
+    website: "https://serper.dev",
+    authHint: "API key from serper.dev dashboard",
+  },
+  "brave-search": {
+    id: "brave-search",
+    alias: "brave-search",
+    name: "Brave Search",
+    icon: "travel_explore",
+    color: "#FB542B",
+    textIcon: "BR",
+    website: "https://brave.com/search/api",
+    authHint: "Subscription token from Brave Search API dashboard",
+  },
+  "exa-search": {
+    id: "exa-search",
+    alias: "exa-search",
+    name: "Exa Search",
+    icon: "neurology",
+    color: "#1E40AF",
+    textIcon: "EX",
+    website: "https://exa.ai",
+    authHint: "API key from dashboard.exa.ai",
+  },
+  "tavily-search": {
+    id: "tavily-search",
+    alias: "tavily-search",
+    name: "Tavily Search",
+    icon: "manage_search",
+    color: "#5B4FDB",
+    textIcon: "TV",
+    website: "https://tavily.com",
+    authHint: "API key from app.tavily.com (format: tvly-...)",
+  },
 };

 export const OPENAI_COMPATIBLE_PREFIX = "openai-compatible-";
@@ -52,6 +52,7 @@ const comboStrategySchema = z.enum([
  "least-used",
  "cost-optimized",
  "strict-random",
+  "auto",
 ]);

 const comboRuntimeConfigSchema = z
@@ -139,6 +140,12 @@ export const updateSettingsSchema = z.object({
    .optional(),
  wildcardAliases: z.array(z.object({ pattern: z.string(), target: z.string() })).optional(),
  stickyRoundRobinLimit: z.number().int().min(0).max(1000).optional(),
+  // Auto intent classifier settings (multilingual routing)
+  intentDetectionEnabled: z.boolean().optional(),
+  intentSimpleMaxWords: z.number().int().min(1).max(500).optional(),
+  intentExtraCodeKeywords: z.array(z.string().max(100)).optional(),
+  intentExtraReasoningKeywords: z.array(z.string().max(100)).optional(),
+  intentExtraSimpleKeywords: z.array(z.string().max(100)).optional(),
  // Protocol toggles (default: disabled)
  mcpEnabled: z.boolean().optional(),
  a2aEnabled: z.boolean().optional(),
@@ -1074,3 +1081,137 @@ export const guideSettingsSaveSchema = z.object({
  apiKey: z.string().optional(),
  model: z.string().trim().min(1, "Model is required"),
 });
+
+// ── Search Schemas ─────────────────────────────────────────────────────
+// Unified search request/response schemas. Final contract — all fields optional
+// with defaults. New features add implementations, not new fields.
+// Multi-query deferred to POST /v1/search/batch (separate PRD).
+
+export const v1SearchSchema = z
+  .object({
+    // Core
+    query: z
+      .string()
+      .trim()
+      .min(1, "Query is required")
+      .max(500, "Query must be 500 characters or fewer"),
+    provider: z
+      .enum(["serper-search", "brave-search", "perplexity-search", "exa-search", "tavily-search"])
+      .optional(),
+    max_results: z.coerce.number().int().min(1).max(100).default(5),
+    search_type: z.enum(["web", "news"]).default("web"),
+    offset: z.coerce.number().int().min(0).default(0),
+
+    // Locale
+    country: z.string().max(2).toUpperCase().optional(),
+    language: z.string().min(2).max(5).optional(),
+    time_range: z.enum(["any", "day", "week", "month", "year"]).optional(),
+
+    // Content control
+    content: z
+      .object({
+        snippet: z.boolean().default(true),
+        full_page: z.boolean().default(false),
+        format: z.enum(["text", "markdown"]).default("text"),
+        max_characters: z.coerce.number().int().min(100).max(100000).optional(),
+      })
+      .optional(),
+
+    // Filters
+    filters: z
+      .object({
+        include_domains: z.array(z.string().max(253)).max(20).optional(),
+        exclude_domains: z.array(z.string().max(253)).max(20).optional(),
+        safe_search: z.enum(["off", "moderate", "strict"]).optional(),
+      })
+      .optional(),
+
+    // Answer synthesis (Phase 2 — returns null until implemented)
+    synthesis: z
+      .object({
+        strategy: z.enum(["none", "auto", "provider", "internal"]).default("none"),
+        model: z.string().optional(),
+        max_tokens: z.coerce.number().int().min(1).max(4000).optional(),
+      })
+      .optional(),
+
+    // Provider-specific passthrough
+    provider_options: z.record(z.string(), z.unknown()).optional(),
+
+    // Strict mode — reject if provider doesn't support a requested filter
+    strict_filters: z.boolean().default(false),
+  })
+  .catchall(z.unknown());
+
+export const searchResultSchema = z.object({
+  title: z.string(),
+  url: z.string(),
+  display_url: z.string().optional(),
+  snippet: z.string(),
+  position: z.number().int().positive(),
+  score: z.number().min(0).max(1).nullable().optional(),
+  published_at: z.string().nullable().optional(),
+  favicon_url: z.string().nullable().optional(),
+  content: z
+    .object({
+      format: z.enum(["text", "markdown"]).optional(),
+      text: z.string().optional(),
+      length: z.number().int().optional(),
+    })
+    .nullable()
+    .optional(),
+  metadata: z
+    .object({
+      author: z.string().nullable().optional(),
+      language: z.string().nullable().optional(),
+      source_type: z
+        .enum(["article", "blog", "forum", "video", "academic", "news", "other"])
+        .nullable()
+        .optional(),
+      image_url: z.string().nullable().optional(),
+    })
+    .nullable()
+    .optional(),
+  citation: z.object({
+    provider: z.string(),
+    retrieved_at: z.string(),
+    rank: z.number().int().positive(),
+  }),
+  provider_raw: z.record(z.string(), z.unknown()).nullable().optional(),
+});
+
+export const v1SearchResponseSchema = z.object({
+  id: z.string(),
+  provider: z.string(),
+  query: z.string(),
+  results: z.array(searchResultSchema),
+  cached: z.boolean(),
+  answer: z
+    .object({
+      source: z.enum(["none", "provider", "internal"]).optional(),
+      text: z.string().nullable().optional(),
+      model: z.string().nullable().optional(),
+    })
+    .nullable()
+    .optional(),
+  usage: z.object({
+    queries_used: z.number().int().min(0),
+    search_cost_usd: z.number().min(0),
+    llm_tokens: z.number().int().min(0).optional(),
+  }),
+  metrics: z.object({
+    response_time_ms: z.number().int().min(0),
+    upstream_latency_ms: z.number().int().min(0).optional(),
+    gateway_latency_ms: z.number().int().min(0).optional(),
+    total_results_available: z.number().int().nullable(),
+  }),
+  errors: z
+    .array(
+      z.object({
+        provider: z.string(),
+        code: z.string(),
+        message: z.string(),
+      })
+    )
+    .optional(),
+});
@@ -30,6 +30,12 @@ export const updateSettingsSchema = z.object({
    .optional(),
  wildcardAliases: z.array(z.object({ pattern: z.string(), target: z.string() })).optional(),
  stickyRoundRobinLimit: z.number().int().min(0).max(1000).optional(),
+  // Auto intent classifier settings (multilingual routing)
+  intentDetectionEnabled: z.boolean().optional(),
+  intentSimpleMaxWords: z.number().int().min(1).max(500).optional(),
+  intentExtraCodeKeywords: z.array(z.string().max(100)).optional(),
+  intentExtraReasoningKeywords: z.array(z.string().max(100)).optional(),
+  intentExtraSimpleKeywords: z.array(z.string().max(100)).optional(),
  // Protocol toggles (default: disabled)
  mcpEnabled: z.boolean().optional(),
  mcpTransport: z.enum(["stdio", "sse", "streamable-http"]).optional(),
@@ -46,6 +46,10 @@ import {
  applyTaskAwareRouting,
  getTaskRoutingConfig,
 } from "@omniroute/open-sse/services/taskAwareRouter.ts";
+import {
+  isFallbackDecision,
+  shouldUseFallback,
+} from "@omniroute/open-sse/services/emergencyFallback.ts";

 /**
 * Handle chat completion request
@@ -270,7 +274,8 @@ async function handleSingleModelChat(
  request: any = null,
  comboName: string | null = null,
  apiKeyInfo: any = null,
-  telemetry: any = null
+  telemetry: any = null,
+  runtimeOptions: { emergencyFallbackTried?: boolean } = {}
 ) {
  // 1. Resolve model → provider/model
  const resolved = await resolveModelOrError(modelStr, body);
@@ -372,6 +377,53 @@ async function handleSingleModelChat(
      return result.response;
    }

+    // Emergency fallback for budget exhaustion (402 / billing / quota keywords):
+    // reroute to a free model (default provider/model: nvidia + openai/gpt-oss-120b) exactly once.
+    if (!runtimeOptions.emergencyFallbackTried) {
+      const fallbackDecision = shouldUseFallback(
+        Number(result.status || 0),
+        String(result.error || ""),
+        Array.isArray(body?.tools) && body.tools.length > 0
+      );
+
+      if (isFallbackDecision(fallbackDecision)) {
+        const fallbackModelStr = `${fallbackDecision.provider}/${fallbackDecision.model}`;
+        const currentModelStr = `${provider}/${model}`;
+
+        if (fallbackModelStr !== currentModelStr) {
+          const fallbackBody = { ...body, model: fallbackModelStr };
+
+          // Cap output on emergency fallback to avoid unexpected long responses.
+          const maxTokens = Math.min(
+            Number(
+              fallbackBody.max_tokens ??
+                fallbackBody.max_completion_tokens ??
+                fallbackDecision.maxOutputTokens
+            ) || fallbackDecision.maxOutputTokens,
+            fallbackDecision.maxOutputTokens
+          );
+          fallbackBody.max_tokens = maxTokens;
+          fallbackBody.max_completion_tokens = maxTokens;
+
+          log.warn(
+            "EMERGENCY_FALLBACK",
+            `${currentModelStr} -> ${fallbackModelStr} | reason=${fallbackDecision.reason}`
+          );
+
+          return handleSingleModelChat(
+            fallbackBody,
+            fallbackModelStr,
+            clientRawRequest,
+            request,
+            comboName,
+            apiKeyInfo,
+            telemetry,
+            { ...runtimeOptions, emergencyFallbackTried: true }
+          );
+        }
+      }
+    }
+
    // 6. Mark account as quota-exhausted on 429 response
    if (result.status === 429) {
      markAccountExhaustedFrom429(credentials.connectionId, provider);
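
The output cap applied in the emergency-fallback hunk above can be read in isolation. The following is a minimal sketch, not the project's code: `capFallbackTokens` and the `4096` limit used in the usage note are hypothetical names and values chosen for illustration, while the precedence (`max_tokens`, then `max_completion_tokens`, then the fallback limit) mirrors the expression in the diff.

```typescript
// Hypothetical helper mirroring the Math.min(...) expression in the hunk above.
// Only the two token fields that the diff reads are modelled here.
interface TokenLimitedBody {
  max_tokens?: number;
  max_completion_tokens?: number;
}

function capFallbackTokens(body: TokenLimitedBody, fallbackLimit: number): number {
  // Prefer the caller's explicit limit; fall back to the provider limit when
  // the field is missing, zero, or not a number (Number(...) || limit).
  const requested =
    Number(body.max_tokens ?? body.max_completion_tokens ?? fallbackLimit) || fallbackLimit;

  // Never exceed the fallback model's own output ceiling.
  return Math.min(requested, fallbackLimit);
}
```

Under these assumptions, `capFallbackTokens({ max_tokens: 8000 }, 4096)` is clamped to the limit, while a smaller request such as `{ max_completion_tokens: 512 }` passes through unchanged.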
@@ -0,0 +1,277 @@
+import test from "node:test";
+import assert from "node:assert/strict";
+
+// ═══════════════════════════════════════════════════════════════
+//  Search Registry + Cache Unit Tests
+//  Tests for searchRegistry, searchCache, and response normalization
+// ═══════════════════════════════════════════════════════════════
+
+const { SEARCH_PROVIDERS, getSearchProvider, getAllSearchProviders, selectProvider } =
+  await import("../../open-sse/config/searchRegistry.ts");
+
+const { computeCacheKey, getOrCoalesce, getCacheStats, SEARCH_CACHE_DEFAULT_TTL_MS } =
+  await import("../../open-sse/services/searchCache.ts");
+
+// ─── Registry Tests ──────────────────────────────────────────
+
+test("SEARCH_PROVIDERS has all 5 providers", () => {
+  assert.ok(SEARCH_PROVIDERS["serper-search"], "serper should exist");
+  assert.ok(SEARCH_PROVIDERS["brave-search"], "brave should exist");
+  assert.ok(SEARCH_PROVIDERS["perplexity-search"], "perplexity-search should exist");
+  assert.ok(SEARCH_PROVIDERS["exa-search"], "exa should exist");
+  assert.ok(SEARCH_PROVIDERS["tavily-search"], "tavily should exist");
+  assert.equal(Object.keys(SEARCH_PROVIDERS).length, 5);
+});
+
+test("serper-search config is correct", () => {
+  const s = SEARCH_PROVIDERS["serper-search"];
+  assert.equal(s.id, "serper-search");
+  assert.equal(s.method, "POST");
+  assert.equal(s.authHeader, "x-api-key");
+  assert.equal(s.costPerQuery, 0.001);
+  assert.equal(s.freeMonthlyQuota, 2500);
+  assert.deepEqual(s.searchTypes, ["web", "news"]);
+});
+
+test("brave-search config is correct", () => {
+  const b = SEARCH_PROVIDERS["brave-search"];
+  assert.equal(b.id, "brave-search");
+  assert.equal(b.method, "GET");
+  assert.equal(b.authHeader, "x-subscription-token");
+  assert.equal(b.costPerQuery, 0.005);
+  assert.equal(b.freeMonthlyQuota, 1000);
+});
+
+test("perplexity-search config is correct", () => {
+  const p = SEARCH_PROVIDERS["perplexity-search"];
+  assert.equal(p.id, "perplexity-search");
+  assert.equal(p.method, "POST");
+  assert.equal(p.authHeader, "bearer");
+  assert.equal(p.baseUrl, "https://api.perplexity.ai/search");
+  assert.equal(p.costPerQuery, 0.005);
+  assert.equal(p.freeMonthlyQuota, 0);
+  assert.deepEqual(p.searchTypes, ["web"]);
+});
+
+test("getSearchProvider returns config for valid ID", () => {
+  const config = getSearchProvider("serper-search");
+  assert.ok(config);
+  assert.equal(config.id, "serper-search");
+});
+
+test("getSearchProvider returns null for unknown ID", () => {
+  assert.equal(getSearchProvider("unknown"), null);
+});
+
+test("tavily config is correct", () => {
+  const t = SEARCH_PROVIDERS["tavily-search"];
+  assert.equal(t.id, "tavily-search");
+  assert.equal(t.method, "POST");
+  assert.equal(t.authHeader, "bearer");
+  assert.equal(t.baseUrl, "https://api.tavily.com/search");
+  assert.equal(t.costPerQuery, 0.008);
+  assert.equal(t.freeMonthlyQuota, 1000);
+  assert.deepEqual(t.searchTypes, ["web", "news"]);
+});
+
+test("getAllSearchProviders returns flat list", () => {
+  const all = getAllSearchProviders();
+  assert.equal(all.length, 5);
+  assert.ok(all.some((p) => p.id === "serper-search"));
+  assert.ok(all.some((p) => p.id === "brave-search"));
+  assert.ok(all.some((p) => p.id === "perplexity-search"));
+  assert.ok(all.some((p) => p.id === "exa-search"));
+  assert.ok(all.some((p) => p.id === "tavily-search"));
+  // Each entry should have id, name, searchTypes
+  for (const p of all) {
+    assert.ok(p.id);
+    assert.ok(p.name);
+    assert.ok(Array.isArray(p.searchTypes));
+  }
+});
+
+test("selectProvider with explicit provider returns that provider", () => {
+  const config = selectProvider("brave-search");
+  assert.ok(config);
+  assert.equal(config.id, "brave-search");
+});
+
+test("selectProvider with unknown provider returns null", () => {
+  assert.equal(selectProvider("unknown"), null);
+});
+
+test("selectProvider without argument returns cheapest (serper)", () => {
+  const config = selectProvider();
+  assert.ok(config);
+  assert.equal(config.id, "serper-search"); // $0.001 < $0.005
+});
+
+// ─── Cache Key Tests ─────────────────────────────────────────
+
+test("computeCacheKey is deterministic", () => {
+  const k1 = computeCacheKey("hello world", "auto", "web", 5);
+  const k2 = computeCacheKey("hello world", "auto", "web", 5);
+  assert.equal(k1, k2);
+});
+
+test("computeCacheKey normalizes query (case, whitespace)", () => {
+  const k1 = computeCacheKey("Hello  World", "auto", "web", 5);
+  const k2 = computeCacheKey("hello world", "auto", "web", 5);
+  assert.equal(k1, k2);
+});
+
+test("computeCacheKey differs by provider", () => {
+  const k1 = computeCacheKey("test", "serper", "web", 5);
+  const k2 = computeCacheKey("test", "brave", "web", 5);
+  assert.notEqual(k1, k2);
+});
+
+test("computeCacheKey differs by search_type", () => {
+  const k1 = computeCacheKey("test", "auto", "web", 5);
+  const k2 = computeCacheKey("test", "auto", "news", 5);
+  assert.notEqual(k1, k2);
+});
+
+test("computeCacheKey differs by max_results", () => {
+  const k1 = computeCacheKey("test", "auto", "web", 5);
+  const k2 = computeCacheKey("test", "auto", "web", 10);
+  assert.notEqual(k1, k2);
+});
+
+// ─── Cache + Coalescing Tests ────────────────────────────────
+
+test("getOrCoalesce caches and returns on second call", async () => {
+  let callCount = 0;
+  const key = "test-cache-hit-" + Date.now();
+
+  const r1 = await getOrCoalesce(key, 60_000, async () => {
+    callCount++;
+    return { value: 42 };
+  });
+  assert.equal(r1.cached, false);
+  assert.deepEqual(r1.data, { value: 42 });
+
+  const r2 = await getOrCoalesce(key, 60_000, async () => {
+    callCount++;
+    return { value: 99 };
+  });
+  assert.equal(r2.cached, true);
+  assert.deepEqual(r2.data, { value: 42 }); // original value, not 99
+  assert.equal(callCount, 1); // fetchFn called only once
+});
+
+test("getOrCoalesce coalesces concurrent requests", async () => {
+  let callCount = 0;
+  const key = "test-coalesce-" + Date.now();
+
+  const fetchFn = async () => {
+    callCount++;
+    await new Promise((r) => setTimeout(r, 50)); // simulate async
+    return { value: "coalesced" };
+  };
+
+  // Launch 3 concurrent requests with the same key
+  const [r1, r2, r3] = await Promise.all([
+    getOrCoalesce(key, 60_000, fetchFn),
+    getOrCoalesce(key, 60_000, fetchFn),
+    getOrCoalesce(key, 60_000, fetchFn),
+  ]);
+
+  assert.equal(callCount, 1); // Only one fetch executed
+  assert.deepEqual(r1.data, { value: "coalesced" });
+  assert.deepEqual(r2.data, { value: "coalesced" });
+  assert.deepEqual(r3.data, { value: "coalesced" });
+});
+
+test("getOrCoalesce respects TTL=0 (no caching)", async () => {
+  let callCount = 0;
+  const key = "test-no-cache-" + Date.now();
+
+  await getOrCoalesce(key, 0, async () => {
+    callCount++;
+    return { value: 1 };
+  });
+  await getOrCoalesce(key, 0, async () => {
+    callCount++;
+    return { value: 2 };
+  });
+
+  assert.equal(callCount, 2); // Both calls executed
+});
+
+test("getCacheStats returns valid stats", () => {
+  const stats = getCacheStats();
+  assert.equal(typeof stats.size, "number");
+  assert.equal(typeof stats.hits, "number");
+  assert.equal(typeof stats.misses, "number");
+});
+
+test("SEARCH_CACHE_DEFAULT_TTL_MS is positive", () => {
+  assert.ok(SEARCH_CACHE_DEFAULT_TTL_MS > 0);
+});
+
+// ─── Validation Schema Tests ────────────────────────────────
+
+test("v1SearchSchema validates correct input", async () => {
+  const { v1SearchSchema } = await import("../../src/shared/validation/schemas.ts");
+
+  const result = v1SearchSchema.safeParse({
+    query: "test query",
+    provider: "serper-search",
+    max_results: 10,
+    search_type: "web",
+  });
+  assert.ok(result.success);
+  assert.equal(result.data.query, "test query");
+  assert.equal(result.data.provider, "serper-search");
+  assert.equal(result.data.max_results, 10);
+});
+
+test("v1SearchSchema rejects empty query", async () => {
+  const { v1SearchSchema } = await import("../../src/shared/validation/schemas.ts");
+
+  const result = v1SearchSchema.safeParse({ query: "" });
+  assert.ok(!result.success);
+});
+
+test("v1SearchSchema rejects query over 500 chars", async () => {
+  const { v1SearchSchema } = await import("../../src/shared/validation/schemas.ts");
+
+  const result = v1SearchSchema.safeParse({ query: "a".repeat(501) });
+  assert.ok(!result.success);
+});
+
+test("v1SearchSchema rejects invalid provider", async () => {
+  const { v1SearchSchema } = await import("../../src/shared/validation/schemas.ts");
+
+  const result = v1SearchSchema.safeParse({ query: "test", provider: "google" });
+  assert.ok(!result.success);
+});
+
+test("v1SearchSchema accepts tavily provider", async () => {
+  const { v1SearchSchema } = await import("../../src/shared/validation/schemas.ts");
+
+  const result = v1SearchSchema.safeParse({ query: "test", provider: "tavily-search" });
+  assert.ok(result.success);
+  assert.equal(result.data.provider, "tavily-search");
+});
+
+test("v1SearchSchema applies defaults", async () => {
+  const { v1SearchSchema } = await import("../../src/shared/validation/schemas.ts");
+
+  const result = v1SearchSchema.safeParse({ query: "test" });
+  assert.ok(result.success);
+  assert.equal(result.data.max_results, 5);
+  assert.equal(result.data.search_type, "web");
+  assert.equal(result.data.provider, undefined);
+});
+
+test("v1SearchSchema allows unknown fields (forward compat)", async () => {
+  const { v1SearchSchema } = await import("../../src/shared/validation/schemas.ts");
+
+  const result = v1SearchSchema.safeParse({
+    query: "test",
+    future_field: true,
+  });
+  assert.ok(result.success);
+});
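
The cache tests above pin down three behaviours: a second call within the TTL is served from cache, concurrent calls for the same key share one fetch, and `TTL=0` disables caching. The sketch below shows one way those properties could be satisfied. It is an illustration only, not the contents of `searchCache.ts`: `getOrCoalesceSketch`, `cacheStore`, and `inFlight` are hypothetical names, the store is unbounded for brevity, and reporting `cached: false` to coalesced followers is an assumption the tests do not check.

```typescript
// Hypothetical stand-ins for the internals of a TTL cache with request coalescing.
type CacheEntry = { value: unknown; expiresAt: number };
const cacheStore = new Map<string, CacheEntry>();
const inFlight = new Map<string, Promise<unknown>>();

async function getOrCoalesceSketch<T>(
  key: string,
  ttlMs: number,
  fetchFn: () => Promise<T>
): Promise<{ data: T; cached: boolean }> {
  // 1. Fresh cache entry: return it without calling fetchFn.
  const hit = cacheStore.get(key);
  if (hit && hit.expiresAt > Date.now()) {
    return { data: hit.value as T, cached: true };
  }

  // 2. Another caller is already fetching this key: wait for its promise.
  const pending = inFlight.get(key);
  if (pending) {
    return { data: (await pending) as T, cached: false };
  }

  // 3. First caller: start the fetch and register it so followers can join.
  const request = fetchFn();
  inFlight.set(key, request);
  try {
    const data = await request;
    // TTL of 0 means "do not store", so the next call fetches again.
    if (ttlMs > 0) {
      cacheStore.set(key, { value: data, expiresAt: Date.now() + ttlMs });
    }
    return { data, cached: false };
  } finally {
    inFlight.delete(key);
  }
}
```

Registering the promise before the first `await` is what makes coalescing work: any caller that arrives while the fetch is pending finds the entry in `inFlight` and never invokes its own `fetchFn`.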
Author	SHA1	Message	Date
diegosouzapw	d3dfd9ce57	feat(release): v2.7.2 — fix light mode contrast in logs UI - fix(logs): text colors in filter buttons + combo badge now have dark: variants - Bumped version to 2.7.2 - Updated CHANGELOG and openapi.yaml	2026-03-18 00:42:22 -03:00
Diego Rodrigues de Sa e Souza	aa06d5d356	Merge pull request #433 from diegosouzapw/fix/issue-378-logs-light-mode-contrast Merged fix for light mode contrast in filter buttons and combo badge. Thanks @rdself for the great bug report!	2026-03-18 00:41:28 -03:00
diegosouzapw	448c8a29e1	fix(logs): fix light mode contrast in filter buttons and combo badge (#378) - text-red-400 → text-red-700 dark:text-red-400 (error filter, recording button) - text-emerald-400 → text-emerald-700 dark:text-emerald-400 (ok filter) - text-violet-300 → text-violet-700 dark:text-violet-300 (combo filter) - combo row badge: violet-700 → violet-800 dark:violet-300, stronger border Fixes #378	2026-03-17 16:46:27 -03:00
diegosouzapw	928b7120f4	feat(release): v2.7.1 — unified web search routing + Next.js 16.1.7 security - POST /v1/search: 5 providers (Serper, Brave, Perplexity, Exa, Tavily), 6,500+ free/mo - Search analytics dashboard tab + GET /api/v1/search/analytics - db: request_type column on call_logs (migration 007) - Next.js 16.1.7: 6 CVEs fixed (critical: CVE-2026-29057 HTTP request smuggling) - docs/openapi.yaml: bumped to 2.7.1	2026-03-17 16:27:31 -03:00
diegosouzapw	a3deacd718	feat: Implement historical model latency and success rate tracking for auto-combo routing and update Claude and Deepseek pricing and model registrations.	2026-03-17 16:18:36 -03:00
diegosouzapw	78959fffbd	Merge branch 'main' of https://github.com/diegosouzapw/OmniRoute	2026-03-17 16:18:12 -03:00
Diego Rodrigues de Sa e Souza	1788616e52	Merge pull request #431 from diegosouzapw/dependabot/npm_and_yarn/next-16.1.7 Security update merged: Next.js 16.1.7 fixes 6 CVEs including critical CVE-2026-29057 (HTTP request smuggling). No breaking changes.	2026-03-17 16:18:01 -03:00
Diego Rodrigues de Sa e Souza	c61e6d0777	Merge pull request #432 from Regis-RCR/feat/search-provider-routing Merged with dashboard improvements: SearchAnalyticsTab + /api/v1/search/analytics endpoint — PR review complete by Antigravity.	2026-03-17 16:17:39 -03:00
diegosouzapw	a3bc7620b1	feat(integration): integrate ClawRouter services into active pipeline - intentClassifier → engine.ts selectProvider() When taskType is 'default', classifies prompt via multilingual keyword detection (9 langs) and uses detected intent (code/reasoning/simple/medium) for 6-factor task fitness scoring. - emergencyFallback → chatCore.ts error path (after T5 intra-family fallback) On HTTP 402 or budget-exhaustion keywords, attempts one redirect to nvidia/gpt-oss-120b ($0.00/M) before returning error to combo router. Skipped for streaming requests and tool-calling requests. - AutoComboConfig.routerStrategy field added Allows per-combo strategy override ('rules' \| 'cost' \| 'latency') Note: requestDedup was already integrated in chatCore.ts (line 387-430) Branch: feat/clawrouter-improvements	2026-03-17 15:22:12 -03:00
diegosouzapw	8064c588dc	docs(i18n): sync v2.7.0 release notes to 29 language READMEs New in v2.7.0: pluggable RouterStrategy, multilingual intent detection, request deduplication, new providers (Grok-4 Fast, GLM-5/Z.AI, MiniMax M2.5, Kimi K2.5). Native translations for de/es/fr/it/ru/zh-CN/ja/ko/ar/pt-BR/pt.	2026-03-17 15:11:09 -03:00
Regis	564e983c68	feat(search): add unified web search routing with 5 providers Add POST /v1/search — a unified search endpoint routing queries across 5 providers (Serper, Brave, Perplexity Search, Exa, Tavily) with automatic failover, in-memory caching, and request coalescing. No open-source AI gateway offers unified search routing. This chains free tiers for 5,500+ searches/month with zero downtime. Providers: Serper ($0.001/q, 2500/mo free), Brave ($0.005/q, 1000/mo), Perplexity Search ($0.005/q), Exa ($0.007/q, 1000/mo), Tavily ($0.008/q, 1000/mo). Auto-select picks cheapest with credentials. Architecture follows existing patterns: - searchRegistry.ts (same as embeddingRegistry.ts) - search.ts handler (same as embeddings.ts) - route.ts (same as /v1/embeddings/route.ts) - searchCache.ts (bounded TTL cache + request coalescing) Schema finalized — all future fields defined as optional with safe defaults. No breaking changes when implementing content extraction, answer synthesis, or ranking. Key features: - Per-provider request builders and response normalizers - Enriched response: display_url, score, favicon_url, content block, metadata, answer block, errors array, upstream_latency_ms metrics - Cost-sorted auto-select with failover on 429/5xx/timeout - Credential fallback (perplexity-search reuses perplexity chat key) - Cache key includes all result-affecting parameters - max_results clamped to provider limits, sanitized error responses - Factored validators (validateSearchProvider factory) - CORS headers on all responses - Dashboard: Search & Discovery section, search provider template - DB migration 007: request_type column in call_logs - 28 unit tests (registry, cache, coalescing, validation)	2026-03-17 18:28:35 +01:00
diegosouzapw	e1da181740	fix(publish): also remove app/electron/ (contains app.asar binary) to prevent Z_DATA_ERROR	2026-03-17 14:25:48 -03:00
diegosouzapw	c63209200e	fix(publish): remove app/vscode-extension/ after build to prevent Z_DATA_ERROR in npm pack	2026-03-17 14:13:15 -03:00
diegosouzapw	737808cf53	fix(npm): exclude app/vscode-extension/ from package to prevent Z_DATA_ERROR during publish	2026-03-17 13:50:06 -03:00
diegosouzapw	a197bb7736	fix(routerStrategy): use .ts extension in imports for Next.js App Router bundle compatibility	2026-03-17 13:15:47 -03:00
dependabot[bot]	f9dd967bc5	deps: bump next from 16.1.6 to 16.1.7 Bumps [next](https://github.com/vercel/next.js) from 16.1.6 to 16.1.7. - [Release notes](https://github.com/vercel/next.js/releases) - [Changelog](https://github.com/vercel/next.js/blob/canary/release.js) - [Commits](https://github.com/vercel/next.js/compare/v16.1.6...v16.1.7) --- updated-dependencies: - dependency-name: next dependency-version: 16.1.7 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>	2026-03-17 16:14:44 +00:00
diegosouzapw	44e4d55a66	feat(release): merge feat/clawrouter-improvements — v2.7.0	2026-03-17 13:12:41 -03:00
diegosouzapw	095c84ac16	fix(providerRegistry): remove duplicate claude-haiku-4-5-20251001 from anthropic provider to prevent ambiguous model resolution	2026-03-17 13:10:23 -03:00
diegosouzapw	e063eae727	feat(clawrouter): implement 14 ClawRouter-inspired features PRICING UPDATES (01-09): - xAI Grok-4 family: grok-4-fast-non-reasoning ($0.20/$0.50/M, 1143ms), grok-4-fast-reasoning, grok-4-1-fast-*, grok-4-0709, grok-3, grok-3-mini - Z.AI GLM-5 family: glm-5 + glm-5-turbo (128k maxOutput, $1.00/$3.20/M) - Gemini Flash Lite: price corrected $0.15→$0.10 / $1.25→$0.40 (per ClawRouter) - Gemini 3.1 Pro: new flagship (1.05M context, aliased as gemini-3.1-pro) - Anthropic Claude 4.5/4.6: haiku-4.5 ($1/$5), sonnet-4.6 ($3/$15), opus-4.6 ($5/$25) - DeepSeek native section: deepseek-chat/v3/v3.2 ($0.28/$0.42), deepseek-reasoner ($0.55/$2.19) - Kimi K2.5 Moonshot: kimi-k2.5 ($0.60/$3.00, 262k ctx), moonshot-kimi-k2.5 alias - MiniMax M2.5: minimax-m2.5 ($0.30/$1.20, 204k ctx, reasoning+tools) - NVIDIA free tier: gpt-oss-120b at $0.00/M via emergencyFallback.ts INFRASTRUCTURE FEATURES (10-14): - feat(router): add intentClassifier.ts for multilingual intent detection (9 langs) Detects code/reasoning/simple in EN, PT-BR, ES, ZH, JA, RU, DE, KO, AR - feat(dedup): add requestDedup.ts for concurrent request deduplication SHA-256 hash, skip streaming, skip high-temperature, 60s failsafe TTL - feat(autoCombo): add routerStrategy.ts pluggable strategy system RouterStrategy interface, RulesStrategy (6-factor) + CostStrategy, registry - feat(fallback): add emergencyFallback.ts budget-exhaustion detector Triggers on HTTP 402 or budget keywords, redirects to nvidia/gpt-oss-120b - feat(taskFitness): add fitness scores for Grok-4, Kimi K2.5, GLM-5, MiniMax M2.5, DeepSeek V3.2, Gemini 3.1 Pro across all task categories PROVIDERS: - providers.ts: add Z.AI (zai) provider entry for GLM-5 API key connections All features on branch: feat/clawrouter-improvements Source: github.com/BlockRunAI/ClawRouter analysis (2026-03-17)	2026-03-17 10:43:12 -03:00
diegosouzapw	f02c5b5c69	fix(install/v2.6.10): Windows better-sqlite3 prebuilt download (#426) npm version patch run BEFORE staging files — this is an ATOMIC commit. Adds Strategy 1.5 to scripts/postinstall.mjs: - Uses @mapbox/node-pre-gyp install --fallback-to-build=false (bundled within better-sqlite3) to download the correct prebuilt binary for the current OS/arch (win32-x64/arm64, darwin-x64/arm64) WITHOUT requiring node-gyp, Python, or MSVC build tools. - Tries node-pre-gyp.cmd (Windows) or node-pre-gyp (Unix) from .bin/ with fallback to direct path in @mapbox/node-pre-gyp/bin/ - Falls back to npm rebuild only if prebuilt download fails. - Windows-specific error: shows Option A (npx node-pre-gyp) and Option B (rebuild) with Visual Studio Build Tools links. Fixes: #426 (better_sqlite3.node is not a valid Win32 application)	2026-03-17 10:09:45 -03:00
diegosouzapw	838f1d645c	fix(v2.6.9): CI budget checks, #409 file attachments, atomic release workflow Includes version bump — v2.6.9 — committed ATOMICALLY with all changes: fixes: - fix(ci/t11): Remove 'any' from comments in openai-responses.ts + chatCore.ts (\bany\b regex counted comment text as explicit any violations) - fix(chatCore/#409): Normalize unsupported content part types before forwarding Cursor sends {type:'file'} for .md attachments; Copilot/OpenAI providers reject with 'type has to be either image_url or text'. Now: file/document→text block, unknown types dropped with debug log. Fixes claude-* models via github-copilot. workflow: - chore(generate-release): ATOMIC COMMIT RULE — npm version patch MUST run before feature commits so the release tag always points to a commit with full changes	2026-03-17 09:09:01 -03:00
diegosouzapw	ce2c30c437	chore(release): v2.6.8 — combo agents, auto-update, detailed logs, MITM Kiro	2026-03-17 08:58:03 -03:00
diegosouzapw	d56fae0a7b	feat: combo agents, auto-update UI, detailed logs, MITM Kiro (#399 #401 #320 #378 #336) DB Migrations (zero-breaking, ADD COLUMN DEFAULT NULL + new table): - 005_combo_agent_fields.sql: system_message, tool_filter_regex, context_cache_protection on combos - 006_detailed_request_logs.sql: ring-buffer table (500 entries) for full pipeline body capture Features: - #399 System Message Override + Tool Filter Regex per Combo - applyComboAgentMiddleware() injected into handleComboChat/handleRoundRobinCombo - Supports both OpenAI and Anthropic tool name formats - #401 Context Caching Protection (Stateless) - injectModelTag() appends <omniModel>provider/model</omniModel> to responses - extractPinnedModel() reads tag from history and pins model for session - #320 Auto-Update via Settings - GET /api/system/version — current vs latest npm - POST /api/system/update — fire-and-forget npm install + pm2 restart - #378 Detailed Request Logs - saveRequestDetailLog() captures bodies at 4 pipeline stages (opt-in toggle) - GET/POST /api/logs/detail — list logs + enable/disable toggle - #336 MITM Kiro IDE - src/mitm/targets/kiro.ts: MitmTarget profile for api.anthropic.com interception	2026-03-17 08:53:41 -03:00
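
Note on #401 (commit d56fae0a7b): the stateless model pinning described there can be sketched roughly as follows. Only the function names `injectModelTag` and `extractPinnedModel` and the `<omniModel>` tag come from the commit message; the signatures, the `ChatMessage` shape, and the idempotent re-tagging are assumptions made for illustration, not the project's actual code.

```typescript
// Hypothetical message shape; the real pipeline types are not shown in the log.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Matches the tag format quoted in the commit message.
const TAG_RE = /<omniModel>([^<]+)<\/omniModel>/;

// Appends the tag to a response; strips any existing tag first so
// repeated calls do not stack tags (an assumption, not confirmed behavior).
function injectModelTag(text: string, model: string): string {
  const stripped = text.replace(new RegExp(TAG_RE.source, "g"), "").trimEnd();
  return `${stripped}\n<omniModel>${model}</omniModel>`;
}

// Scans history from newest to oldest and returns the model named in the
// most recent tagged assistant message, or null if none is tagged.
function extractPinnedModel(history: ChatMessage[]): string | null {
  for (let i = history.length - 1; i >= 0; i--) {
    const msg = history[i];
    if (msg.role !== "assistant") continue;
    const match = TAG_RE.exec(msg.content);
    if (match) return match[1];
  }
  return null;
}
```

Because the tag travels inside the conversation history, the server needs no session store: a follow-up request that replays the history carries the pinned model with it.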
				`@@ -0,0 +1 @@`
				<svg width="56" height="64" viewBox="0 0 56 64" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M53.292 15.321l1.5-3.676s-1.909-2.043-4.227-4.358c-2.317-2.315-7.225-.953-7.225-.953L37.751 0H18.12l-5.589 6.334s-4.908-1.362-7.225.953C2.988 9.602 1.08 11.645 1.08 11.645l1.5 3.676-1.91 5.447s5.614 21.236 6.272 23.83c1.295 5.106 2.181 7.08 5.862 9.668 3.68 2.587 10.36 7.08 11.45 7.762 1.091.68 2.455 1.84 3.682 1.84 1.227 0 2.59-1.16 3.68-1.84 1.091-.681 7.77-5.175 11.452-7.762 3.68-2.587 4.567-4.562 5.862-9.668.657-2.594 6.27-23.83 6.27-23.83l-1.908-5.447z" fill="url(#paint0_linear)"/><path fill-rule="evenodd" clip-rule="evenodd" d="M34.888 11.508c.818 0 6.885-1.157 6.885-1.157s7.189 8.68 7.189 10.536c0 1.534-.619 2.134-1.347 2.842-.152.148-.31.3-.467.468l-5.39 5.717a9.42 9.42 0 01-.176.18c-.538.54-1.33 1.336-.772 2.658l.115.269c.613 1.432 1.37 3.2.407 4.99-1.025 1.906-2.78 3.178-3.905 2.967-1.124-.21-3.766-1.589-4.737-2.218-.971-.63-4.05-3.166-4.05-4.137 0-.809 2.214-2.155 3.29-2.81.214-.13.383-.232.48-.298.111-.075.297-.19.526-.332.981-.61 2.754-1.71 2.799-2.197.055-.602.034-.778-.758-2.264-.168-.316-.365-.654-.568-1.004-.754-1.295-1.598-2.745-1.41-3.784.21-1.173 2.05-1.845 3.608-2.415.194-.07.385-.14.567-.209l1.623-.609c1.556-.582 3.284-1.229 3.57-1.36.394-.181.292-.355-.903-.468a54.655 54.655 0 01-.58-.06c-1.48-.157-4.209-.446-5.535-.077-.261.073-.553.152-.86.235-1.49.403-3.317.897-3.493 1.182-.03.05-.06.093-.089.133-.168.238-.277.394-.091 1.406.055.302.169.895.31 1.629.41 2.148 1.053 5.498 1.134 6.25.011.106.024.207.036.305.103.84.171 1.399-.805 1.622l-.255.058c-1.102.252-2.717.623-3.3.623-.584 0-2.2-.37-3.302-.623l-.254-.058c-.976-.223-.907-.782-.804-1.622.012-.098.024-.2.035-.305.081-.753.725-4.112 1.137-6.259.14-.73.253-1.32.308-1.62.185-1.012.076-1.168-.092-1.406a3.743 3.743 0 
01-.09-.133c-.174-.285-2-.779-3.491-1.182-.307-.083-.6-.162-.86-.235-1.327-.37-4.055-.08-5.535.077-.226.024-.422.045-.58.06-1.196.113-1.297.287-.903.468.285.131 2.013.778 3.568 1.36.597.223 1.17.437 1.624.609.183.069.373.138.568.21 1.558.57 3.398 1.241 3.608 2.414.187 1.039-.657 2.489-1.41 3.784-.204.35-.4.688-.569 1.004-.791 1.486-.812 1.662-.757 2.264.044.488 1.816 1.587 2.798 2.197.229.142.415.257.526.332.098.066.266.168.48.298 1.076.654 3.29 2 3.29 2.81 0 .97-3.078 3.507-4.05 4.137-.97.63-3.612 2.008-4.737 2.218-1.124.21-2.88-1.061-3.904-2.966-.963-1.791-.207-3.559.406-4.99l.115-.27c.559-1.322-.233-2.118-.772-2.658a9.377 9.377 0 01-.175-.18l-5.39-5.717c-.158-.167-.316-.32-.468-.468-.728-.707-1.346-1.308-1.346-2.842 0-1.855 7.189-10.536 7.189-10.536s6.066 1.157 6.884 1.157c.653 0 1.913-.433 3.227-.885.333-.114.669-.23 1-.34 1.635-.545 2.726-.549 2.726-.549s1.09.004 2.726.549c.33.11.667.226 1 .34 1.313.452 2.574.885 3.226.885zm-1.041 30.706c1.282.66 2.192 1.128 2.536 1.343.445.278.174.803-.232 1.09-.405.285-5.853 4.499-6.381 4.965l-.215.191c-.509.459-1.159 1.044-1.62 1.044-.46 0-1.11-.586-1.62-1.044l-.213-.191c-.53-.466-5.977-4.68-6.382-4.966-.405-.286-.677-.81-.232-1.09.344-.214 1.255-.683 2.539-1.344l1.22-.629c1.92-.992 4.315-1.837 4.689-1.837.373 0 2.767.844 4.689 1.837.436.226.845.437 1.222.63z" fill="#fff"/><path fill-rule="evenodd" clip-rule="evenodd" d="M43.34 6.334L37.751 0H18.12l-5.589 6.334s-4.908-1.362-7.225.953c0 0 6.544-.59 8.793 3.064 0 0 6.066 1.157 6.884 1.157.818 0 2.59-.68 4.226-1.225 1.636-.545 2.727-.549 2.727-.549s1.09.004 2.726.549 3.408 1.225 4.226 1.225c.818 0 6.885-1.157 6.885-1.157 2.249-3.654 8.792-3.064 8.792-3.064-2.317-2.315-7.225-.953-7.225-.953z" fill="url(#paint1_linear)"/><defs><linearGradient id="paint0_linear" x1=".671" y1="64.319" x2="55.2" y2="64.319" gradientUnits="userSpaceOnUse"><stop stop-color="#F50"/><stop offset=".41" stop-color="#F50"/><stop offset=".582" stop-color="#FF2000"/><stop offset="1" 
stop-color="#FF2000"/></linearGradient><linearGradient id="paint1_linear" x1="6.278" y1="11.466" x2="50.565" y2="11.466" gradientUnits="userSpaceOnUse"><stop stop-color="#FF452A"/><stop offset="1" stop-color="#FF2000"/></linearGradient></defs></svg>