Compare commits

...

12 Commits

Author SHA1 Message Date
diegosouzapw 659e2b414d feat(release): v2.8.2 — model alias routing fix, log export, 2 merged PRs
Build Electron Desktop App / Validate version (push) Failing after 25s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-19 11:13:49 -03:00
diegosouzapw 7bcb58e3db feat(logs): add export button with time range dropdown (1h, 6h, 12h, 24h)
- New API: /api/logs/export?hours=24&type=call-logs
- UI: Export button with dropdown on /dashboard/logs page
- Supports export of request-logs, proxy-logs, and call-logs
- Downloads as JSON file with Content-Disposition header
2026-03-19 11:11:07 -03:00
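A minimal sketch of how a client might build the export request described in this commit. The query parameters (`hours`, `type`) and the filename pattern follow the commit and its diff; the helper name `buildExportRequest` and its return shape are hypothetical and exist only for illustration.

```typescript
// Log types accepted by the export endpoint, per the commit message.
type LogType = "request-logs" | "proxy-logs" | "call-logs";

// Hypothetical helper: builds the export URL and a download filename.
// `now` is passed in so the result is deterministic and easy to test.
function buildExportRequest(type: LogType, hours: number, now: Date) {
  const params = new URLSearchParams({ hours: String(hours), type });
  return {
    url: `/api/logs/export?${params.toString()}`,
    filename: `omniroute-${type}-${hours}h-${now.toISOString().slice(0, 10)}.json`,
  };
}

const exportRequest = buildExportRequest(
  "call-logs",
  24,
  new Date("2026-03-19T11:11:07Z")
);
console.log(exportRequest.url, exportRequest.filename);
```

Passing the date in, rather than calling `new Date()` inside the helper, is a design choice of this sketch, not something the commit does.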
diegosouzapw 2d7d7776a6 fix(routing): model aliases now affect routing, not just format detection (#472)
Previously resolveModelAlias() output was used only for getModelTargetFormat()
but the original model was sent in translatedBody.model and to the executor.
Now effectiveModel is propagated to all downstream operations.
2026-03-19 11:07:29 -03:00
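The fix can be illustrated with a small sketch, assuming a simple in-memory alias table. The names `aliasTable`, `resolveAlias`, and `routeWithAlias` are hypothetical stand-ins; the project's real `resolveModelAlias()` reads user settings and is not reproduced here.

```typescript
// Hypothetical alias table; in the real project aliases come from Settings.
const aliasTable: Record<string, string> = { "my-fast": "gpt-5.4" };

// Stand-in for resolveModelAlias(): returns the target, or the input if no alias matches.
function resolveAlias(model: string): string {
  return aliasTable[model] ?? model;
}

interface RoutedRequest {
  requestedModel: string; // what the client asked for
  effectiveModel: string; // what is used for format lookup AND sent to the provider
  body: Record<string, unknown>;
}

// Resolve once, then use the resolved ID everywhere downstream.
function routeWithAlias(model: string, body: Record<string, unknown>): RoutedRequest {
  const effectiveModel = resolveAlias(model);
  return {
    requestedModel: model,
    effectiveModel,
    body: { ...body, model: effectiveModel },
  };
}

const routed = routeWithAlias("my-fast", { messages: [] });
console.log(routed.requestedModel, routed.body.model);
```

Before the fix, the equivalent of `body.model` would still have carried the requested ID even though the format lookup used the resolved one.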
Prakersh Maheshwari c5f429521c fix(pricing): add missing Codex 5.3/5.4 and Anthropic model ID entries (#479)
* fix(pricing): add missing Codex 5.3/5.4 and Anthropic model ID entries

Missing pricing entries cause $0.00 cost for:
- GPT 5.3 Codex family (gpt-5.3-codex, -high, -xhigh, -low, -none)
- GPT 5.4 (with hyphen: gpt-5.4)
- GPT 5.1 Codex Mini High
- Common Anthropic model IDs without dates (claude-opus-4-6,
  claude-sonnet-4-6, claude-opus-4, claude-sonnet-4)
- Dated variants used by Claude Code (claude-opus-4-5-20251101,
  claude-sonnet-4-5-20250929)

* refactor: extract shared pricing constants to reduce duplication

Address review feedback: extract duplicated pricing objects into
named constants (GPT_5_3_CODEX_PRICING, CLAUDE_OPUS_4_PRICING, etc.)
and add clarifying comment about intentional hyphen/dot variant entries.
2026-03-19 11:04:30 -03:00
diegosouzapw 426d8636bc fix(stream): extract usage from remaining buffer in flush handler (#480) 2026-03-19 11:02:13 -03:00
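The flush handler in this change merges usage non-destructively, because some providers split usage across several events. A rough sketch of that idea follows; `mergeUsage` and `USAGE_KEYS` are hypothetical names, and unlike the diff this version also drops non-positive values when no earlier usage exists.

```typescript
const USAGE_KEYS = [
  "prompt_tokens",
  "completion_tokens",
  "total_tokens",
  "cache_read_input_tokens",
  "cache_creation_input_tokens",
  "cached_tokens",
  "reasoning_tokens",
] as const;

type UsageKey = (typeof USAGE_KEYS)[number];
type Usage = Partial<Record<UsageKey, number>>;

// Copy only positive counters from `incoming`, keeping anything already collected.
function mergeUsage(current: Usage | null, incoming: Usage): Usage {
  const merged: Usage = { ...(current ?? {}) };
  for (const key of USAGE_KEYS) {
    const value = incoming[key];
    if (typeof value === "number" && value > 0) {
      merged[key] = value;
    }
  }
  return merged;
}

// An earlier event carried the prompt tokens; the final buffered event carries the rest.
const fromEarlierEvent: Usage = { prompt_tokens: 100 };
const fromFlushBuffer: Usage = { prompt_tokens: 0, completion_tokens: 25 };
console.log(mergeUsage(fromEarlierEvent, fromFlushBuffer));
```

A direct assignment (`usage = incoming`) would have replaced the earlier `prompt_tokens` with 0 in this case.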
diegosouzapw a265c7096e feat(release): v2.8.1 — streaming log fix, Kiro compat, cache tokens, Chinese i18n, configurable tool call ID
Build Electron Desktop App / Validate version (push) Failing after 31s
Build Electron Desktop App / Build Electron (macos-arm64) (push) Has been skipped
Build Electron Desktop App / Build Electron (linux) (push) Has been skipped
Build Electron Desktop App / Build Electron (macos-intel) (push) Has been skipped
Build Electron Desktop App / Build Electron (windows) (push) Has been skipped
Build Electron Desktop App / Create Release (push) Has been skipped
2026-03-19 08:45:54 -03:00
diegosouzapw 1c9953b1ba chore: remove ZWS_README_V1.md (internal contributor doc) 2026-03-19 08:43:17 -03:00
diegosouzapw 601cc21a44 feat: call log response content, per-model tool call ID, key PATCH & validation (#470) 2026-03-19 08:41:01 -03:00
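For the per-model tool call ID option in this commit, the sketch below shows one way to assign unique 9-character alphanumeric IDs within a single assistant message. It mirrors the approach visible in the diff, but `normalizeToolCallIds` and the `ToolCall` shape are hypothetical simplifications, and `Math.random()` is not a cryptographic source.

```typescript
const ALPHANUM = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

// Generate a 9-character [a-zA-Z0-9] ID.
function generateToolCallId9(): string {
  let id = "";
  for (let i = 0; i < 9; i++) {
    id += ALPHANUM[Math.floor(Math.random() * ALPHANUM.length)];
  }
  return id;
}

interface ToolCall {
  id?: string;
}

// Replace every ID with a fresh 9-char one, retrying on the (unlikely) collision.
// Returns the new IDs in order so following tool messages can be relinked by index.
function normalizeToolCallIds(toolCalls: ToolCall[]): string[] {
  const used = new Set<string>();
  return toolCalls.map((tc) => {
    let id: string;
    do {
      id = generateToolCallId9();
    } while (used.has(id));
    used.add(id);
    tc.id = id;
    return id;
  });
}

const calls: ToolCall[] = [{ id: "call_abc123_long_form" }, {}, {}];
console.log(normalizeToolCallIds(calls));
```

Returning the IDs in order matters because the matching `tool` messages are relinked positionally after the IDs change.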
Ethan Hunt 102c42dfe4 feat: Improve the Chinese translation (#475)
Co-authored-by: gmw <rorschach1167@qq.com>
2026-03-19 08:37:51 -03:00
Prakersh Maheshwari 4953727aa7 fix(callLogs): support Claude format usage and include cache tokens (#476)
saveCallLog only read prompt_tokens/completion_tokens (OpenAI format).
When sourceFormat=claude, the openai-to-claude translator writes
input_tokens/output_tokens instead, causing all cross-format requests
(Codex-via-Claude, Kiro-via-Claude, etc.) to show 0|0 tokens in
call_logs.

Also includes cache_read and cache_creation tokens in tokens_in total
so heavily-cached requests don't show misleadingly low input counts.

Changes:
- Read prompt_tokens || input_tokens (supports both formats)
- Read completion_tokens || output_tokens (supports both formats)
- Sum cache_read_input_tokens + cache_creation_input_tokens into total
2026-03-19 08:37:49 -03:00
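The three changes listed above amount to the following arithmetic, sketched here under the assumption that usage arrives as a flat object. `summarizeUsage` and `RawUsage` are hypothetical names and this is not the actual `saveCallLog` code.

```typescript
interface RawUsage {
  prompt_tokens?: number; // OpenAI format
  completion_tokens?: number; // OpenAI format
  input_tokens?: number; // Claude format
  output_tokens?: number; // Claude format
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// Accept either format and fold cache tokens into the input total.
function summarizeUsage(usage: RawUsage): { tokensIn: number; tokensOut: number } {
  const baseIn = usage.prompt_tokens || usage.input_tokens || 0;
  const cacheRead = usage.cache_read_input_tokens || 0;
  const cacheCreation = usage.cache_creation_input_tokens || 0;
  return {
    tokensIn: baseIn + cacheRead + cacheCreation,
    tokensOut: usage.completion_tokens || usage.output_tokens || 0,
  };
}

// A heavily cached Claude-format request: only 12 uncached input tokens.
const summary = summarizeUsage({
  input_tokens: 12,
  output_tokens: 340,
  cache_read_input_tokens: 150000,
  cache_creation_input_tokens: 2000,
});
console.log(summary); // { tokensIn: 152012, tokensOut: 340 }
```

Without the cache terms, the same request would be logged with 12 input tokens.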
Prakersh Maheshwari e6af874b47 fix(usage): include cache tokens in usage history input total (#477)
logUsage stored only non-cached input tokens in usage_history.tokens_input.
For heavily-cached Claude requests (common with Claude Code), this shows
near-zero input when the real total is 150K+, causing the analytics
dashboard to severely underreport input token usage.

Now sums: input = prompt_tokens + cache_read + cache_creation
2026-03-19 08:37:46 -03:00
Prakersh Maheshwari 801b4eef4c fix(kiro): strip injected model field from request body (#478)
chatCore.ts injects translatedBody.model for all providers after
translation. Kiro API (AWS CodeWhisperer) has strict schema validation
and rejects unknown top-level fields — only conversationState, profileArn,
and inferenceConfig are valid. This causes 100% of Kiro requests to fail
with "Improperly formed request".

Strip the injected model field in KiroExecutor.transformRequest().
2026-03-19 08:37:44 -03:00
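The strip itself is a rest-destructuring one-liner. The sketch below isolates it as a standalone function; `stripModelField` is a hypothetical name, and the request fields in the example are placeholders rather than a real Kiro payload.

```typescript
// Return a shallow copy of the body without the top-level "model" field.
function stripModelField(body: Record<string, unknown>): Record<string, unknown> {
  const { model: _model, ...rest } = body;
  void _model; // intentionally discarded
  return rest;
}

const translatedBody = {
  model: "injected-by-chat-core",
  conversationState: { currentMessage: {} },
  profileArn: "placeholder-arn",
};
console.log(Object.keys(stripModelField(translatedBody)));
```

The input object is left untouched, so callers that still need `model` for logging can keep reading it from the original body.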
22 changed files with 1512 additions and 874 deletions
+40
@@ -4,6 +4,46 @@
---
## [2.8.2] — 2026-03-19
> Sprint: 2 merged PRs, model aliases routing fix, log export, and issue triage.
### Features
- **Log Export**: New Export button on `/dashboard/logs` with time range dropdown (1h, 6h, 12h, 24h). Downloads JSON of request/proxy/call logs via `/api/logs/export` API (#user-request)
### Bug Fixes
- **Model Aliases Routing** (#472): Settings → Model Aliases now correctly affect provider routing, not just format detection. Previously `resolveModelAlias()` output was only used for `getModelTargetFormat()` but the original model ID was sent to the provider
- **Stream Flush Usage** (#480): Usage data from the last SSE event in the buffer is now correctly extracted during stream flush (merged from @prakersh)
### Merged PRs
- #480 — Extract usage from remaining buffer in flush handler (@prakersh)
- #479 — Add missing Codex 5.3/5.4 and Anthropic model ID pricing entries (@prakersh)
---
## [2.8.1] — 2026-03-19
> Sprint: Five community PRs — streaming call log fixes, Kiro compatibility, cache token analytics, Chinese translation, and configurable tool call IDs.
### ✨ Features
- **feat(logs)**: Call log response content now correctly accumulated from raw provider chunks (OpenAI/Claude/Gemini) before translation, fixing empty response payloads in streaming mode (#470, @zhangqiang8vip)
- **feat(providers)**: Per-model configurable 9-char tool call ID normalization (Mistral-style) — only models with the option enabled get truncated IDs (#470)
- **feat(api)**: Key PATCH API expanded to support `allowedConnections`, `name`, `autoResolve`, `isActive`, and `accessSchedule` fields (#470)
- **feat(dashboard)**: Response-first layout in request log detail UI (#470)
- **feat(i18n)**: Improved Chinese (zh-CN) translation — complete retranslation (#475, @only4copilot)
### 🐛 Bug Fixes
- **fix(kiro)**: Strip injected `model` field from request body — Kiro API rejects unknown top-level fields (#478, @prakersh)
- **fix(usage)**: Include cache read + cache creation tokens in usage history input totals for accurate analytics (#477, @prakersh)
- **fix(callLogs)**: Support Claude format usage fields (`input_tokens`/`output_tokens`) alongside OpenAI format, include all cache token variants (#476, @prakersh)
---
## [2.8.0] — 2026-03-19
> Sprint: Bailian Coding Plan provider with editable base URLs, plus community contributions for Alibaba Cloud and Kimi Coding.
+1 -1
@@ -1,7 +1,7 @@
openapi: 3.1.0
info:
title: OmniRoute API
version: 2.8.0
version: 2.8.2
description: |
OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
endpoint that routes requests to multiple AI providers with load balancing,
+5 -2
@@ -77,10 +77,13 @@ export class KiroExecutor extends BaseExecutor {
}
transformRequest(model: string, body: unknown, stream: boolean, credentials: unknown): unknown {
void model;
void stream;
void credentials;
return body;
// Kiro uses conversationState.currentMessage.userInputMessage.modelId,
// not a top-level "model" field. chatCore injects translatedBody.model
// which Kiro API rejects as unknown top-level field.
const { model: _model, ...rest } = body as Record<string, unknown>;
return rest;
}
/**
+25 -11
@@ -23,6 +23,7 @@ import {
appendRequestLog,
saveCallLog,
} from "@/lib/usageDb";
import { getModelNormalizeToolCallId } from "@/lib/db/models";
import { getExecutor } from "../executors/index.ts";
import { translateNonStreamingResponse } from "./responseTranslator.ts";
import { extractUsageFromResponse } from "./usageExtractor.ts";
@@ -156,10 +157,16 @@ export async function handleChatCore({
// Detect source format and get target format
// Model-specific targetFormat takes priority over provider default
// Apply custom model aliases (Settings → Model Aliases → Pattern→Target) before routing (#315)
// Apply custom model aliases (Settings → Model Aliases → Pattern→Target) before routing (#315, #472)
// Custom aliases take priority over built-in and must be resolved here so the
// downstream getModelTargetFormat() lookup uses the correct, aliased model ID.
// downstream getModelTargetFormat() lookup AND the actual provider request use
// the correct, aliased model ID. Without this, aliases only affect format detection.
const resolvedModel = resolveModelAlias(model);
// Use resolvedModel for all downstream operations (routing, provider requests, logging)
const effectiveModel = resolvedModel !== model ? resolvedModel : model;
if (resolvedModel !== model) {
log?.info?.("ALIAS", `Model alias applied: ${model} → ${resolvedModel}`);
}
const alias = PROVIDER_ID_TO_ALIAS[provider] || provider;
const modelTargetFormat = getModelTargetFormat(alias, resolvedModel);
@@ -310,6 +317,7 @@ export async function handleChatCore({
}
}
const normalizeToolCallId = getModelNormalizeToolCallId(provider || "", model || "");
translatedBody = translateRequest(
sourceFormat,
targetFormat,
@@ -318,7 +326,8 @@ export async function handleChatCore({
stream,
credentials,
provider,
reqLogger
reqLogger,
{ normalizeToolCallId }
);
}
} catch (error) {
@@ -364,8 +373,8 @@ export async function handleChatCore({
delete translatedBody._toolNameMap;
delete translatedBody._disableToolPrefix;
// Update model in body
translatedBody.model = model;
// Update model in body — use resolved alias so the provider gets the correct model ID (#472)
translatedBody.model = effectiveModel;
// Strip unsupported parameters for reasoning models (o1, o3, etc.)
const unsupported = getUnsupportedParams(provider, model);
@@ -394,7 +403,7 @@ export async function handleChatCore({
const dedupEnabled = shouldDeduplicate(dedupRequestBody);
const dedupHash = dedupEnabled ? computeRequestHash(dedupRequestBody) : null;
const executeProviderRequest = async (modelToCall = model, allowDedup = false) => {
const executeProviderRequest = async (modelToCall = effectiveModel, allowDedup = false) => {
const execute = async () => {
const bodyToSend =
translatedBody.model === modelToCall
@@ -442,8 +451,8 @@ export async function handleChatCore({
trackPendingRequest(model, provider, connectionId, true);
// T5: track which models we've tried for intra-family fallback
const triedModels = new Set<string>([model]);
let currentModel = model;
const triedModels = new Set<string>([effectiveModel]);
let currentModel = effectiveModel;
// Log start
appendRequestLog({ model, provider, connectionId, status: "PENDING" }).catch(() => {});
@@ -462,7 +471,7 @@ export async function handleChatCore({
let finalBody;
try {
const result = await executeProviderRequest(model, true);
const result = await executeProviderRequest(effectiveModel, true);
providerResponse = result.response;
providerUrl = result.url;
@@ -871,8 +880,12 @@ export async function handleChatCore({
// Create transform stream with logger for streaming response
let transformStream;
// Callback to save call log when stream completes (streaming calls were never logged before!)
const onStreamComplete = ({ status: streamStatus, usage: streamUsage }) => {
// Callback to save call log when stream completes (include responseBody when provided by stream)
const onStreamComplete = ({
status: streamStatus,
usage: streamUsage,
responseBody: streamResponseBody,
}) => {
saveCallLog({
method: "POST",
path: clientRawRequest?.endpoint || "/v1/chat/completions",
@@ -883,6 +896,7 @@ export async function handleChatCore({
duration: Date.now() - startTime,
tokens: streamUsage || {},
requestBody: body,
responseBody: streamResponseBody ?? undefined,
sourceFormat,
targetFormat,
comboName,
+58 -15
@@ -1,26 +1,69 @@
// Tool call helper functions for translator
// Generate unique tool call ID
const ALPHANUM9 = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
// Generate unique tool call ID (default long form)
export function generateToolCallId() {
return `call_${Date.now().toString(36)}_${Math.random().toString(36).slice(2, 9)}`;
}
// Ensure all tool_calls have id field and arguments is string (some providers require it)
export function ensureToolCallIds(body) {
// Generate 9-char [a-zA-Z0-9] id for providers that require it (e.g. Mistral)
function generateToolCallId9(): string {
let s = "";
for (let i = 0; i < 9; i++) {
s += ALPHANUM9[Math.floor(Math.random() * ALPHANUM9.length)];
}
return s;
}
/** @param options.use9CharId - When true, normalize ids to 9-char [a-zA-Z0-9] (e.g. Mistral); when false, only fix type/arguments, leave ids as-is */
export function ensureToolCallIds(body, options?: { use9CharId?: boolean }) {
if (!body.messages || !Array.isArray(body.messages)) return body;
for (const msg of body.messages) {
if (msg.role === "assistant" && msg.tool_calls && Array.isArray(msg.tool_calls)) {
for (const tc of msg.tool_calls) {
if (!tc.id) {
tc.id = generateToolCallId();
}
if (!tc.type) {
tc.type = "function";
}
// Ensure arguments is JSON string, not object
if (tc.function?.arguments && typeof tc.function.arguments !== "string") {
tc.function.arguments = JSON.stringify(tc.function.arguments);
const use9CharId = options?.use9CharId === true;
for (let i = 0; i < body.messages.length; i++) {
const msg = body.messages[i];
if (msg.role !== "assistant" || !msg.tool_calls || !Array.isArray(msg.tool_calls)) continue;
const used9 = new Set<string>();
const newIdsInOrder: string[] = [];
for (const tc of msg.tool_calls) {
if (!tc.type) {
tc.type = "function";
}
if (tc.function?.arguments && typeof tc.function.arguments !== "string") {
tc.function.arguments = JSON.stringify(tc.function.arguments);
}
if (use9CharId) {
let newId: string;
do {
newId = generateToolCallId9();
} while (used9.has(newId));
used9.add(newId);
newIdsInOrder.push(newId);
tc.id = newId;
} else {
// Leave id as-is, only ensure it exists for later tool message matching
const id =
tc.id != null && String(tc.id).trim() !== "" ? String(tc.id) : generateToolCallId();
tc.id = id;
newIdsInOrder.push(id);
}
}
// Tool responses (role "tool") follow in same order as tool_calls; set tool_call_id by index.
// Stop when we hit another assistant so we only link tool messages that immediately follow this one.
if (newIdsInOrder.length > 0) {
let idx = 0;
for (let j = i + 1; j < body.messages.length; j++) {
const later = body.messages[j];
if (later.role === "assistant") break;
if (later.role !== "tool") continue;
if (idx < newIdsInOrder.length) {
later.tool_call_id = newIdsInOrder[idx];
idx++;
}
}
}
+10 -3
@@ -66,6 +66,7 @@ function normalizeOpenAIResponsesRequest(body) {
return normalized;
}
/** @param options.normalizeToolCallId - When true, use 9-char tool call ids (e.g. Mistral); when false, leave ids as-is */
// Translate request: source -> openai -> target
export function translateRequest(
sourceFormat,
@@ -75,9 +76,11 @@ export function translateRequest(
stream = true,
credentials = null,
provider = null,
reqLogger = null
reqLogger = null,
options?: { normalizeToolCallId?: boolean }
) {
let result = body;
const use9CharId = options?.normalizeToolCallId === true;
// Phase 2: Apply thinking budget control before normalization
result = applyThinkingBudget(result);
@@ -85,8 +88,8 @@ export function translateRequest(
// Normalize thinking config: remove if lastMessage is not user
normalizeThinkingConfig(result);
// Always ensure tool_calls have id (some providers require it)
ensureToolCallIds(result);
// Ensure tool_calls have id; optionally normalize to 9-char for providers like Mistral
ensureToolCallIds(result, { use9CharId });
// Fix missing tool responses (insert empty tool_result if needed)
fixMissingToolResponses(result);
@@ -140,6 +143,10 @@ export function translateRequest(
result = normalizeOpenAIResponsesRequest(result);
}
// Ensure unique tool_call ids on final payload (translators may have introduced duplicates)
ensureToolCallIds(result, { use9CharId });
fixMissingToolResponses(result);
return result;
}
+156 -19
@@ -30,6 +30,8 @@ type StreamLogger = {
type StreamCompletePayload = {
status: number;
usage: unknown;
/** Minimal response body for call log (streaming: usage + note; non-streaming not used) */
responseBody?: unknown;
};
type StreamOptions = {
@@ -51,6 +53,8 @@ type TranslateState = ReturnType<typeof initState> & {
toolNameMap?: unknown;
usage?: unknown;
finishReason?: unknown;
/** Accumulated message content for call log response body */
accumulatedContent?: string;
};
function getOpenAIIntermediateChunks(value: unknown): unknown[] {
@@ -106,14 +110,21 @@ export function createSSEStream(options: StreamOptions = {}) {
let buffer = "";
let usage = null;
// State for translate mode
// State for translate mode (accumulatedContent for call log response body)
const state: TranslateState | null =
mode === STREAM_MODE.TRANSLATE
? { ...(initState(sourceFormat) as TranslateState), provider, toolNameMap }
? {
...(initState(sourceFormat) as TranslateState),
provider,
toolNameMap,
accumulatedContent: "",
}
: null;
// Track content length for usage estimation (both modes)
let totalContentLength = 0;
// Passthrough: accumulate content for call log response body
let passthroughAccumulatedContent = "";
// Guard against duplicate [DONE] events — ensures exactly one per stream
let doneSent = false;
@@ -201,9 +212,10 @@ export function createSSEStream(options: StreamOptions = {}) {
if (extracted) {
usage = extracted;
}
// Track content length from Responses format
// Track content length and accumulate for call log
if (parsed.delta && typeof parsed.delta === "string") {
totalContentLength += parsed.delta.length;
passthroughAccumulatedContent += parsed.delta;
}
} else if (isClaudeSSE) {
// Claude SSE: extract usage, track content, forward as-is
@@ -213,14 +225,23 @@ export function createSSEStream(options: StreamOptions = {}) {
// message_start carries input_tokens, message_delta carries output_tokens
if (!usage) usage = {};
if (extracted.prompt_tokens > 0) usage.prompt_tokens = extracted.prompt_tokens;
if (extracted.completion_tokens > 0) usage.completion_tokens = extracted.completion_tokens;
if (extracted.completion_tokens > 0)
usage.completion_tokens = extracted.completion_tokens;
if (extracted.total_tokens > 0) usage.total_tokens = extracted.total_tokens;
if (extracted.cache_read_input_tokens) usage.cache_read_input_tokens = extracted.cache_read_input_tokens;
if (extracted.cache_creation_input_tokens) usage.cache_creation_input_tokens = extracted.cache_creation_input_tokens;
if (extracted.cache_read_input_tokens)
usage.cache_read_input_tokens = extracted.cache_read_input_tokens;
if (extracted.cache_creation_input_tokens)
usage.cache_creation_input_tokens = extracted.cache_creation_input_tokens;
}
// Track content length and accumulate from Claude format
if (parsed.delta?.text) {
totalContentLength += parsed.delta.text.length;
passthroughAccumulatedContent += parsed.delta.text;
}
if (parsed.delta?.thinking) {
totalContentLength += parsed.delta.thinking.length;
passthroughAccumulatedContent += parsed.delta.thinking;
}
// Track content length from Claude format
if (parsed.delta?.text) totalContentLength += parsed.delta.text.length;
if (parsed.delta?.thinking) totalContentLength += parsed.delta.thinking.length;
} else {
// Chat Completions: full sanitization pipeline
parsed = sanitizeStreamingChunk(parsed);
@@ -246,6 +267,10 @@ export function createSSEStream(options: StreamOptions = {}) {
if (content && typeof content === "string") {
totalContentLength += content.length;
}
if (typeof delta?.content === "string")
passthroughAccumulatedContent += delta.content;
if (typeof delta?.reasoning_content === "string")
passthroughAccumulatedContent += delta.reasoning_content;
const extracted = extractUsage(parsed);
if (extracted) {
@@ -301,23 +326,45 @@ export function createSSEStream(options: StreamOptions = {}) {
continue;
}
// Track content length for estimation (from various formats)
// Include both regular content and reasoning/thinking content
// Track content length and accumulate for call log (from raw provider chunk, so content is never missed)
// Do this before translation so we capture content regardless of translator output shape
// Claude format
if (parsed.delta?.text) {
totalContentLength += parsed.delta.text.length;
const t = parsed.delta.text;
totalContentLength += t.length;
if (state?.accumulatedContent !== undefined && typeof t === "string")
state.accumulatedContent += t;
}
if (parsed.delta?.thinking) {
totalContentLength += parsed.delta.thinking.length;
const t = parsed.delta.thinking;
totalContentLength += t.length;
if (state?.accumulatedContent !== undefined && typeof t === "string")
state.accumulatedContent += t;
}
// OpenAI format
if (parsed.choices?.[0]?.delta?.content) {
totalContentLength += parsed.choices[0].delta.content.length;
const c = parsed.choices[0].delta.content;
if (typeof c === "string") {
totalContentLength += c.length;
if (state?.accumulatedContent !== undefined) state.accumulatedContent += c;
} else if (Array.isArray(c)) {
for (const part of c) {
if (part?.text && typeof part.text === "string") {
totalContentLength += part.text.length;
if (state?.accumulatedContent !== undefined)
state.accumulatedContent += part.text;
}
}
}
}
if (parsed.choices?.[0]?.delta?.reasoning_content) {
totalContentLength += parsed.choices[0].delta.reasoning_content.length;
const r = parsed.choices[0].delta.reasoning_content;
if (typeof r === "string") {
totalContentLength += r.length;
if (state?.accumulatedContent !== undefined) state.accumulatedContent += r;
}
}
// Gemini format - may have multiple parts
@@ -325,10 +372,30 @@ export function createSSEStream(options: StreamOptions = {}) {
for (const part of parsed.candidates[0].content.parts) {
if (part.text && typeof part.text === "string") {
totalContentLength += part.text.length;
if (state?.accumulatedContent !== undefined) state.accumulatedContent += part.text;
}
}
}
// Generic fallback: delta string, top-level content/text (e.g. some SSE payloads)
if (state?.accumulatedContent !== undefined) {
if (typeof (parsed as JsonRecord).delta === "string") {
const d = (parsed as JsonRecord).delta as string;
state.accumulatedContent += d;
totalContentLength += d.length;
}
if (typeof (parsed as JsonRecord).content === "string") {
const c = (parsed as JsonRecord).content as string;
state.accumulatedContent += c;
totalContentLength += c.length;
}
if (typeof (parsed as JsonRecord).text === "string") {
const t = (parsed as JsonRecord).text as string;
state.accumulatedContent += t;
totalContentLength += t.length;
}
}
// Extract usage
const extracted = extractUsage(parsed);
if (extracted) state.usage = extracted; // Keep original usage for logging
@@ -344,6 +411,9 @@ export function createSSEStream(options: StreamOptions = {}) {
if (translated?.length > 0) {
for (const item of translated) {
// Content for call log is accumulated only from parsed (above) to avoid double-counting;
// do not add again from item here.
// Filter empty chunks
if (!hasValuableContent(item, sourceFormat)) {
continue; // Skip this empty chunk
@@ -415,10 +485,30 @@ export function createSSEStream(options: StreamOptions = {}) {
status: "200 OK",
}).catch(() => {});
}
// Notify caller for call log persistence
// Notify caller for call log persistence (include full response body with accumulated content)
if (onComplete) {
try {
onComplete({ status: 200, usage });
const u = usage as Record<string, unknown> | null;
const prompt = Number(u?.prompt_tokens ?? u?.input_tokens ?? 0);
const completion = Number(u?.completion_tokens ?? u?.output_tokens ?? 0);
const content = passthroughAccumulatedContent.trim() || "";
const responseBody = {
choices: [
{
message: {
role: "assistant",
content,
},
},
],
usage: {
prompt_tokens: prompt,
completion_tokens: completion,
total_tokens: prompt + completion,
},
_streamed: true,
};
onComplete({ status: 200, usage, responseBody });
} catch {}
}
return;
@@ -428,6 +518,33 @@ export function createSSEStream(options: StreamOptions = {}) {
if (buffer.trim()) {
const parsed = parseSSELine(buffer.trim());
if (parsed && !parsed.done) {
// Extract usage from remaining buffer — if the usage-bearing event
// (e.g. response.completed) is the last SSE line, it ends up here
// in the flush handler where extractUsage was not called.
// Non-destructive merge: some providers send usage across multiple
// events (e.g. prompt_tokens in message_start, completion_tokens
// in message_delta). Direct assignment would lose earlier data.
const extracted = extractUsage(parsed);
if (extracted) {
if (!state.usage) {
state.usage = extracted;
} else {
if (extracted.prompt_tokens > 0)
state.usage.prompt_tokens = extracted.prompt_tokens;
if (extracted.completion_tokens > 0)
state.usage.completion_tokens = extracted.completion_tokens;
if (extracted.total_tokens > 0) state.usage.total_tokens = extracted.total_tokens;
if (extracted.cache_read_input_tokens > 0)
state.usage.cache_read_input_tokens = extracted.cache_read_input_tokens;
if (extracted.cache_creation_input_tokens > 0)
state.usage.cache_creation_input_tokens = extracted.cache_creation_input_tokens;
if (extracted.cached_tokens > 0)
state.usage.cached_tokens = extracted.cached_tokens;
if (extracted.reasoning_tokens > 0)
state.usage.reasoning_tokens = extracted.reasoning_tokens;
}
}
const translated = translateResponse(targetFormat, sourceFormat, parsed, state);
// Log OpenAI intermediate chunks
@@ -497,10 +614,30 @@ export function createSSEStream(options: StreamOptions = {}) {
status: "200 OK",
}).catch(() => {});
}
// Notify caller for call log persistence
// Notify caller for call log persistence (include full response body with accumulated content)
if (onComplete) {
try {
onComplete({ status: 200, usage: state?.usage });
const u = state?.usage as Record<string, unknown> | null | undefined;
const prompt = Number(u?.prompt_tokens ?? u?.input_tokens ?? 0);
const completion = Number(u?.completion_tokens ?? u?.output_tokens ?? 0);
const content = (state?.accumulatedContent ?? "").trim() || "";
const responseBody = {
choices: [
{
message: {
role: "assistant",
content,
},
},
],
usage: {
prompt_tokens: prompt,
completion_tokens: completion,
total_tokens: prompt + completion,
},
_streamed: true,
};
onComplete({ status: 200, usage: state?.usage, responseBody });
} catch {}
}
} catch (error) {
+3 -1
@@ -400,8 +400,10 @@ export function logUsage(provider, usage, model = null, connectionId = null, api
console.log(msg);
// Save to usage DB
// input = total input tokens (non-cached + cache_read + cache_creation)
// This ensures analytics show correct totals for heavily-cached requests
const tokens = {
input: inTokens,
input: inTokens + (cacheRead || 0) + (cacheCreation || 0),
output: outTokens,
cacheRead: cacheRead || 0,
cacheCreation: cacheCreation || 0,
+2 -2
@@ -1,12 +1,12 @@
{
"name": "omniroute",
"version": "2.7.9",
"version": "2.8.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "omniroute",
"version": "2.7.9",
"version": "2.8.0",
"hasInstallScript": true,
"license": "MIT",
"workspaces": [
+1 -1
@@ -1,6 +1,6 @@
{
"name": "omniroute",
"version": "2.8.0",
"version": "2.8.2",
"description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
"type": "module",
"bin": {
+119 -11
@@ -1,27 +1,135 @@
"use client";
import { useState } from "react";
import { useState, useRef, useEffect } from "react";
import { RequestLoggerV2, ProxyLogger, SegmentedControl } from "@/shared/components";
import ConsoleLogViewer from "@/shared/components/ConsoleLogViewer";
import AuditLogTab from "./AuditLogTab";
import { useTranslations } from "next-intl";
const TIME_RANGES = [
{ label: "1h", hours: 1 },
{ label: "6h", hours: 6 },
{ label: "12h", hours: 12 },
{ label: "24h", hours: 24 },
];
const TAB_TO_LOG_TYPE: Record<string, string> = {
"request-logs": "request-logs",
"proxy-logs": "proxy-logs",
"audit-logs": "call-logs",
console: "call-logs",
};
export default function LogsPage() {
const [activeTab, setActiveTab] = useState("request-logs");
const [showExport, setShowExport] = useState(false);
const [exporting, setExporting] = useState(false);
const dropdownRef = useRef<HTMLDivElement>(null);
const t = useTranslations("logs");
useEffect(() => {
function handleClickOutside(e: MouseEvent) {
if (dropdownRef.current && !dropdownRef.current.contains(e.target as Node)) {
setShowExport(false);
}
}
document.addEventListener("mousedown", handleClickOutside);
return () => document.removeEventListener("mousedown", handleClickOutside);
}, []);
async function handleExport(hours: number) {
setExporting(true);
setShowExport(false);
try {
const logType = TAB_TO_LOG_TYPE[activeTab] || "call-logs";
const res = await fetch(`/api/logs/export?hours=${hours}&type=${logType}`);
if (!res.ok) throw new Error("Export failed");
const blob = await res.blob();
const url = URL.createObjectURL(blob);
const a = document.createElement("a");
a.href = url;
a.download = `omniroute-${logType}-${hours}h-${new Date().toISOString().slice(0, 10)}.json`;
document.body.appendChild(a);
a.click();
document.body.removeChild(a);
URL.revokeObjectURL(url);
} catch (err) {
console.error("Export failed:", err);
} finally {
setExporting(false);
}
}
return (
<div className="flex flex-col gap-6">
<SegmentedControl
options={[
{ value: "request-logs", label: t("requestLogs") },
{ value: "proxy-logs", label: t("proxyLogs") },
{ value: "audit-logs", label: t("auditLog") },
{ value: "console", label: t("console") },
]}
value={activeTab}
onChange={setActiveTab}
/>
<div className="flex items-center justify-between gap-4 flex-wrap">
<SegmentedControl
options={[
{ value: "request-logs", label: t("requestLogs") },
{ value: "proxy-logs", label: t("proxyLogs") },
{ value: "audit-logs", label: t("auditLog") },
{ value: "console", label: t("console") },
]}
value={activeTab}
onChange={setActiveTab}
/>
<div className="relative" ref={dropdownRef}>
<button
id="export-logs-btn"
onClick={() => setShowExport(!showExport)}
disabled={exporting}
className="flex items-center gap-2 px-4 py-2 text-sm font-medium rounded-lg
bg-[var(--card-bg,#1e1e2e)] border border-[var(--border,#333)]
text-[var(--text-secondary,#aaa)] hover:text-[var(--text-primary,#fff)]
hover:border-[var(--accent,#7c3aed)] transition-all duration-200
disabled:opacity-50 disabled:cursor-not-allowed"
>
<svg
width="16"
height="16"
viewBox="0 0 16 16"
fill="none"
stroke="currentColor"
strokeWidth="1.5"
>
<path
d="M8 2v8m0 0l-3-3m3 3l3-3M3 12h10"
strokeLinecap="round"
strokeLinejoin="round"
/>
</svg>
{exporting ? "Exporting..." : "Export"}
</button>
{showExport && (
<div
className="absolute right-0 top-full mt-1 z-50 min-w-[140px] rounded-lg
bg-[var(--card-bg,#1e1e2e)] border border-[var(--border,#333)]
shadow-xl overflow-hidden animate-in fade-in"
>
<div className="px-3 py-2 text-xs text-[var(--text-muted,#666)] border-b border-[var(--border,#333)] font-medium">
Time Range
</div>
{TIME_RANGES.map((range) => (
<button
key={range.hours}
id={`export-${range.hours}h-btn`}
onClick={() => handleExport(range.hours)}
className="w-full px-3 py-2 text-sm text-left hover:bg-[var(--hover-bg,#2a2a3e)]
text-[var(--text-secondary,#aaa)] hover:text-[var(--text-primary,#fff)]
transition-colors flex items-center justify-between"
>
<span>Last {range.label}</span>
<span className="text-xs text-[var(--text-muted,#666)]">
{range.hours === 24 ? "default" : ""}
</span>
</button>
))}
</div>
)}
</div>
</div>
{/* Content */}
{activeTab === "request-logs" && <RequestLoggerV2 />}
@@ -1477,6 +1477,7 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
const [editingModelId, setEditingModelId] = useState<string | null>(null);
const [editingApiFormat, setEditingApiFormat] = useState("chat-completions");
const [editingEndpoints, setEditingEndpoints] = useState<string[]>(["chat"]);
const [editingNormalizeToolCallId, setEditingNormalizeToolCallId] = useState(false);
const [savingModelId, setSavingModelId] = useState<string | null>(null);
const fetchCustomModels = useCallback(async () => {
@@ -1548,12 +1549,14 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
? model.supportedEndpoints
: ["chat"]
);
setEditingNormalizeToolCallId(Boolean(model.normalizeToolCallId));
};
const cancelEdit = () => {
setEditingModelId(null);
setEditingApiFormat("chat-completions");
setEditingEndpoints(["chat"]);
setEditingNormalizeToolCallId(false);
setSavingModelId(null);
};
@@ -1577,6 +1580,7 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
source: model?.source || "manual",
apiFormat: editingApiFormat,
supportedEndpoints: editingEndpoints,
normalizeToolCallId: editingNormalizeToolCallId,
}),
});
@@ -1738,6 +1742,14 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
🔊 Audio
</span>
)}
{model.normalizeToolCallId && (
<span
className="text-[10px] px-1.5 py-0.5 rounded-full bg-slate-500/15 text-slate-400 font-medium"
title="9-char tool call ID (Mistral)"
>
ID×9
</span>
)}
</div>
{editingModelId === model.id && (
@@ -1790,6 +1802,16 @@ function CustomModelsSection({ providerId, providerAlias, copied, onCopy }) {
))}
</div>
</div>
<label className="flex items-center gap-2 text-xs text-text-main cursor-pointer">
<input
type="checkbox"
checked={editingNormalizeToolCallId}
onChange={(e) => setEditingNormalizeToolCallId(e.target.checked)}
className="rounded border-border"
/>
Normalize Tool Call ID (9 chars, Mistral)
</label>
</div>
<div className="mt-3 flex items-center gap-2">
<Button
+26 -4
@@ -55,9 +55,26 @@ export async function PATCH(request, { params }) {
if (isValidationFailure(validation)) {
return NextResponse.json({ error: validation.error }, { status: 400 });
}
const { allowedModels, noLog } = validation.data;
const {
name,
allowedModels,
allowedConnections,
noLog,
autoResolve,
isActive,
accessSchedule,
} = validation.data;
const updated = await updateApiKeyPermissions(id, { allowedModels, noLog });
const payload: Parameters<typeof updateApiKeyPermissions>[1] = {};
if (name !== undefined) payload.name = name;
if (allowedModels !== undefined) payload.allowedModels = allowedModels;
if (allowedConnections !== undefined) payload.allowedConnections = allowedConnections;
if (noLog !== undefined) payload.noLog = noLog;
if (autoResolve !== undefined) payload.autoResolve = autoResolve;
if (isActive !== undefined) payload.isActive = isActive;
if (accessSchedule !== undefined) payload.accessSchedule = accessSchedule;
const updated = await updateApiKeyPermissions(id, payload);
if (!updated) {
return NextResponse.json({ error: "Key not found" }, { status: 404 });
}
@@ -67,8 +84,13 @@ export async function PATCH(request, { params }) {
return NextResponse.json({
message: "API key settings updated successfully",
allowedModels,
noLog,
...(name !== undefined && { name }),
...(allowedModels !== undefined && { allowedModels }),
...(allowedConnections !== undefined && { allowedConnections }),
...(noLog !== undefined && { noLog }),
...(autoResolve !== undefined && { autoResolve }),
...(isActive !== undefined && { isActive }),
...(accessSchedule !== undefined && { accessSchedule }),
});
} catch (error) {
console.log("Error updating key permissions:", error);
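The PATCH hunk above switches from always writing `allowedModels` and `noLog` to copying only the fields the caller actually sent. A minimal sketch of that partial-update pattern follows, assuming a reduced, hypothetical `ApiKeyUpdate` shape (the real type is derived from `updateApiKeyPermissions` and is not shown in this diff). The point it illustrates is that an explicit `false` is preserved while an omitted field never reaches the payload.

```typescript
// Hypothetical, reduced field set; the real handler also covers
// allowedConnections, autoResolve and accessSchedule.
type ApiKeyUpdate = {
  name?: string;
  allowedModels?: string[];
  noLog?: boolean;
  isActive?: boolean;
};

// Copy only fields that were provided, so omitted fields are left untouched
// in storage instead of being overwritten with undefined.
function buildUpdatePayload(input: ApiKeyUpdate): ApiKeyUpdate {
  const payload: ApiKeyUpdate = {};
  if (input.name !== undefined) payload.name = input.name;
  if (input.allowedModels !== undefined) payload.allowedModels = input.allowedModels;
  if (input.noLog !== undefined) payload.noLog = input.noLog;
  if (input.isActive !== undefined) payload.isActive = input.isActive;
  return payload;
}

// Example: the caller deactivates a key and renames it, nothing else.
const examplePayload = buildUpdatePayload({ name: "ci-key", isActive: false });
```

Here `examplePayload` carries `name` and `isActive` only; `noLog` and `allowedModels` are absent rather than set to `undefined`.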
+58
@@ -0,0 +1,58 @@
import { getDbInstance } from "@/lib/db/core";
/**
* GET /api/logs/export — export logs as JSON
* Query params: ?hours=24 (UI offers 1, 6, 12, 24; other values are clamped to 1-168; default 24)
* &type=call-logs|request-logs|proxy-logs (default call-logs)
*/
export async function GET(request: Request) {
try {
const { searchParams } = new URL(request.url);
const hours = Math.min(Math.max(parseInt(searchParams.get("hours") || "24") || 24, 1), 168);
const logType = searchParams.get("type") || "call-logs";
const since = new Date(Date.now() - hours * 3600 * 1000).toISOString();
const db = getDbInstance();
let rows: unknown[] = [];
let tableName = "";
if (logType === "call-logs") {
tableName = "call_logs";
const stmt = db.prepare(
"SELECT * FROM call_logs WHERE timestamp >= @since ORDER BY timestamp DESC"
);
rows = stmt.all({ since });
} else if (logType === "request-logs") {
tableName = "request_logs";
const stmt = db.prepare(
"SELECT * FROM request_logs WHERE timestamp >= @since ORDER BY timestamp DESC"
);
rows = stmt.all({ since });
} else if (logType === "proxy-logs") {
tableName = "proxy_logs";
const stmt = db.prepare(
"SELECT * FROM proxy_logs WHERE timestamp >= @since ORDER BY timestamp DESC"
);
rows = stmt.all({ since });
}
const filename = `omniroute-${tableName}-${hours}h-${new Date().toISOString().slice(0, 10)}.json`;
return new Response(
JSON.stringify({ logs: rows, count: rows.length, hours, type: logType }, null, 2),
{
status: 200,
headers: {
"Content-Type": "application/json",
"Content-Disposition": `attachment; filename="${filename}"`,
},
}
);
} catch (error) {
return Response.json(
{ error: { message: (error as Error).message, type: "server_error" } },
{ status: 500 }
);
}
}
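The query-parameter handling in the export route can be read in isolation. The sketch below mirrors the clamping expression from the handler and adds an allow-list check for `type`, which the handler itself does not perform (an unknown type there falls through with an empty `tableName`). The helper names `parseExportHours` and `parseLogType` are hypothetical and exist only for illustration.

```typescript
// Log types accepted by the export route, per its doc comment.
const EXPORT_LOG_TYPES = ["call-logs", "request-logs", "proxy-logs"];

// Mirrors the handler: missing, non-numeric or zero input falls back to 24,
// then the value is clamped to the range 1..168.
function parseExportHours(raw: string | null): number {
  const parsed = Number.parseInt(raw || "24", 10) || 24;
  return Math.min(Math.max(parsed, 1), 168);
}

// Assumption: rejecting unknown types is an addition, not part of the diff.
// Returns null for anything outside the allow-list.
function parseLogType(raw: string | null): string | null {
  const value = raw || "call-logs";
  return EXPORT_LOG_TYPES.indexOf(value) >= 0 ? value : null;
}
```

A caller could return a 400 response when `parseLogType` yields `null`, rather than exporting an empty file with a blank table name in its filename.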
+3 -1
@@ -113,12 +113,14 @@ export async function PUT(request) {
return Response.json({ error: validation.error }, { status: 400 });
}
const { provider, modelId, modelName, apiFormat, supportedEndpoints } = validation.data;
const { provider, modelId, modelName, apiFormat, supportedEndpoints, normalizeToolCallId } =
validation.data;
const model = await updateCustomModel(provider, modelId, {
modelName,
apiFormat,
supportedEndpoints,
normalizeToolCallId,
});
if (!model) {
File diff suppressed because it is too large
+25
@@ -200,6 +200,9 @@ export async function updateCustomModel(providerId, modelId, updates = {}) {
...(updates.supportedEndpoints !== undefined
? { supportedEndpoints: updates.supportedEndpoints }
: {}),
...(updates.normalizeToolCallId !== undefined
? { normalizeToolCallId: Boolean(updates.normalizeToolCallId) }
: {}),
};
models[index] = next;
@@ -212,3 +215,25 @@ export async function updateCustomModel(providerId, modelId, updates = {}) {
backupDbFile("pre-write");
return next;
}
/**
* Whether the given provider/model has "normalize tool call id" (9-char Mistral-style) enabled.
* Only custom models can have this set; returns false for built-in models.
*/
export function getModelNormalizeToolCallId(providerId: string, modelId: string): boolean {
const db = getDbInstance();
const row = db
.prepare("SELECT value FROM key_value WHERE namespace = 'customModels' AND key = ?")
.get(providerId);
const value = getKeyValue(row).value;
if (!value) return false;
let models: { id: string; normalizeToolCallId?: boolean }[];
try {
models = JSON.parse(value);
} catch {
return false;
}
if (!Array.isArray(models)) return false;
const m = models.find((x: { id: string }) => x.id === modelId);
return Boolean(m?.normalizeToolCallId);
}
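`getModelNormalizeToolCallId` only reports whether the flag is set; the normalization itself is not visible in this diff. As a rough sketch of what a "9-char Mistral-style" ID could look like, the function below maps any tool call ID to nine alphanumeric characters deterministically, so the same input ID always yields the same output and a tool result can still be matched to its call. The name `normalizeToolCallId`, the hash and the alphabet are all assumptions, not the project's implementation.

```typescript
const ID_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

// Hypothetical normalizer: returns IDs that are already 9 alphanumeric
// characters unchanged, and derives a stable 9-character ID otherwise.
function normalizeToolCallId(id: string): string {
  if (/^[a-zA-Z0-9]{9}$/.test(id)) return id;

  // 32-bit FNV-1a hash of the original ID.
  let hash = 0x811c9dc5;
  for (let i = 0; i < id.length; i++) {
    hash ^= id.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }

  // Expand the hash into 9 characters with a small linear congruential step.
  let state = hash;
  let out = "";
  for (let i = 0; i < 9; i++) {
    out += ID_ALPHABET[state % ID_ALPHABET.length];
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
  }
  return out;
}
```

A 32-bit hash can collide, so a production version would likely want a wider hash or a per-request lookup table; this sketch only shows the shape of the transformation.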
+6 -2
@@ -184,8 +184,12 @@ export async function saveCallLog(entry: any) {
account,
connectionId: entry.connectionId || null,
duration: entry.duration || 0,
tokensIn: entry.tokens?.prompt_tokens || 0,
tokensOut: entry.tokens?.completion_tokens || 0,
tokensIn: toNumber(
(entry.tokens?.prompt_tokens ?? entry.tokens?.input_tokens ?? 0) +
(entry.tokens?.cache_read_input_tokens ?? entry.tokens?.cached_tokens ?? 0) +
(entry.tokens?.cache_creation_input_tokens ?? 0)
),
tokensOut: toNumber(entry.tokens?.completion_tokens ?? entry.tokens?.output_tokens ?? 0),
requestType: entry.requestType || null,
sourceFormat: entry.sourceFormat || null,
targetFormat: entry.targetFormat || null,
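The new `tokensIn` and `tokensOut` expressions accept both OpenAI-style (`prompt_tokens`, `completion_tokens`) and Anthropic-style (`input_tokens`, `output_tokens`) usage fields, and add cache read and cache creation tokens to the input count. The sketch below reproduces that fallback chain as standalone functions. The `TokenUsage` type and the body of `toNumber` are assumptions, since the diff calls `toNumber` without showing it.

```typescript
// Hypothetical usage shape covering the field names read in saveCallLog.
type TokenUsage = {
  prompt_tokens?: number;
  input_tokens?: number;
  completion_tokens?: number;
  output_tokens?: number;
  cached_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
};

// Stand-in for the project's toNumber helper: non-finite values become 0.
function toNumber(value: unknown): number {
  const n = Number(value);
  return Number.isFinite(n) ? n : 0;
}

// Same fallback order as the diff: base input + cache reads + cache creation.
function countInputTokens(tokens?: TokenUsage): number {
  return toNumber(
    (tokens?.prompt_tokens ?? tokens?.input_tokens ?? 0) +
      (tokens?.cache_read_input_tokens ?? tokens?.cached_tokens ?? 0) +
      (tokens?.cache_creation_input_tokens ?? 0)
  );
}

function countOutputTokens(tokens?: TokenUsage): number {
  return toNumber(tokens?.completion_tokens ?? tokens?.output_tokens ?? 0);
}
```

Because `??` only falls through on `null` or `undefined`, a reported `prompt_tokens` of `0` is kept as-is rather than replaced by `input_tokens`, which differs from the previous `||` behavior.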
+10 -10
@@ -223,21 +223,21 @@ export default function RequestLoggerDetail({ log, detail, loading, onClose, onC
</div>
) : (
<>
{/* Request Payload */}
{requestJson && (
{/* Response Payload (返回) — show first */}
{responseJson && (
<PayloadSection
title="Request Payload"
json={requestJson}
onCopy={() => onCopy(requestJson)}
title="Response Payload (返回)"
json={responseJson}
onCopy={() => onCopy(responseJson)}
/>
)}
{/* Response Payload */}
{responseJson && (
{/* Request Payload (请求) */}
{requestJson && (
<PayloadSection
title="Response Payload"
json={responseJson}
onCopy={() => onCopy(responseJson)}
title="Request Payload (请求)"
json={requestJson}
onCopy={() => onCopy(requestJson)}
/>
)}
+71 -1
@@ -2,6 +2,47 @@
// All rates are in dollars per million tokens ($/1M tokens)
// Based on user-provided pricing for Antigravity models and industry standards for others
// Shared pricing constants to reduce duplication
const GPT_5_3_CODEX_PRICING = {
input: 5.0,
output: 20.0,
cached: 2.5,
reasoning: 30.0,
cache_creation: 5.0,
};
const CLAUDE_OPUS_4_PRICING = {
input: 15.0,
output: 75.0,
cached: 7.5,
reasoning: 112.5,
cache_creation: 15.0,
};
const CLAUDE_SONNET_4_PRICING = {
input: 3.0,
output: 15.0,
cached: 1.5,
reasoning: 15.0,
cache_creation: 3.0,
};
const CLAUDE_OPUS_46_PRICING = {
input: 5.0,
output: 25.0,
cached: 2.5,
reasoning: 37.5,
cache_creation: 5.0,
};
const CLAUDE_SONNET_46_PRICING = {
input: 3.0,
output: 15.0,
cached: 1.5,
reasoning: 22.5,
cache_creation: 3.0,
};
export const DEFAULT_PRICING = {
// OAuth Providers (using aliases)
@@ -46,7 +87,14 @@ export const DEFAULT_PRICING = {
// OpenAI Codex (cx)
cx: {
// Issue #334: add gpt5.4
// GPT 5.4
"gpt-5.4": {
input: 5.0,
output: 20.0,
cached: 2.5,
reasoning: 30.0,
cache_creation: 5.0,
},
"gpt5.4": {
input: 5.0,
output: 20.0,
@@ -54,6 +102,19 @@ export const DEFAULT_PRICING = {
reasoning: 30.0,
cache_creation: 5.0,
},
// GPT 5.3 Codex family (all same pricing tier)
"gpt-5.3-codex": GPT_5_3_CODEX_PRICING,
"gpt-5.3-codex-xhigh": GPT_5_3_CODEX_PRICING,
"gpt-5.3-codex-high": GPT_5_3_CODEX_PRICING,
"gpt-5.3-codex-low": GPT_5_3_CODEX_PRICING,
"gpt-5.3-codex-none": GPT_5_3_CODEX_PRICING,
"gpt-5.1-codex-mini-high": {
input: 1.5,
output: 6.0,
cached: 0.75,
reasoning: 9.0,
cache_creation: 1.5,
},
"gpt-5.2-codex": {
input: 5.0,
output: 20.0,
@@ -525,6 +586,15 @@ export const DEFAULT_PRICING = {
reasoning: 37.5,
cache_creation: 5.0,
},
// Common model IDs (without dates) used across providers
// Intentional duplicates of dot-notation variants (e.g. claude-opus-4.6)
// to cover hyphen-notation IDs (claude-opus-4-6) used by some clients
"claude-opus-4-6": CLAUDE_OPUS_46_PRICING,
"claude-sonnet-4-6": CLAUDE_SONNET_46_PRICING,
"claude-opus-4-5-20251101": CLAUDE_OPUS_4_PRICING,
"claude-sonnet-4-5-20250929": CLAUDE_SONNET_4_PRICING,
"claude-sonnet-4": CLAUDE_SONNET_4_PRICING,
"claude-opus-4": CLAUDE_OPUS_4_PRICING,
},
// Gemini
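Since all rates in the pricing table are in dollars per million tokens, a missing entry translates directly into a $0.00 cost, which is what the added model IDs are meant to prevent. The sketch below shows that arithmetic for the input and output rates only. `estimateCost` and the lookup table are hypothetical, and how the project applies the `cached`, `reasoning` and `cache_creation` rates is not shown in this diff.

```typescript
type ModelPricing = { input: number; output: number };

// Rates copied from GPT_5_3_CODEX_PRICING above ($ per 1M tokens).
const EXAMPLE_PRICING: Record<string, ModelPricing> = {
  "gpt-5.3-codex": { input: 5.0, output: 20.0 },
};

// Hypothetical cost estimate: tokens / 1,000,000 * rate, summed over
// input and output. Unknown model IDs cost 0, mirroring the reported bug.
function estimateCost(modelId: string, tokensIn: number, tokensOut: number): number {
  const pricing = EXAMPLE_PRICING[modelId];
  if (!pricing) return 0;
  return (tokensIn / 1_000_000) * pricing.input + (tokensOut / 1_000_000) * pricing.output;
}

// 1M input tokens at $5 plus 0.5M output tokens at $20.
const exampleCost = estimateCost("gpt-5.3-codex", 1_000_000, 500_000);
```

With these numbers `exampleCost` is 5 + 10 = 15 dollars, while any ID absent from the table, such as a hyphenated variant that was never registered, yields 0.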
+1
@@ -347,6 +347,7 @@ export const providerModelMutationSchema = z.object({
source: z.string().trim().max(80).optional(),
apiFormat: z.enum(["chat-completions", "responses"]).default("chat-completions"),
supportedEndpoints: z.array(z.enum(["chat", "embeddings", "images", "audio"])).default(["chat"]),
normalizeToolCallId: z.boolean().optional(),
});
const pricingFieldsSchema = z
+8 -3
@@ -135,9 +135,7 @@ export async function handleChat(request: any, clientRawRequest: any = null) {
log.debug("AUTH", "No API key provided (local mode)");
}
// Optional strict API key mode for /v1 endpoints.
// Keep disabled by default to preserve local-mode compatibility.
// Exception: X-Internal-Test header bypasses auth for admin-side combo health checks (#350)
// Optional strict API key mode for /v1 endpoints (require key on every request).
const isInternalTest = request.headers?.get?.("x-internal-test") === "combo-health-check";
if (process.env.REQUIRE_API_KEY === "true" && !isInternalTest) {
if (!apiKey) {
@@ -149,6 +147,13 @@ export async function handleChat(request: any, clientRawRequest: any = null) {
log.warn("AUTH", "Invalid API key while REQUIRE_API_KEY=true");
return errorResponse(HTTP_STATUS.UNAUTHORIZED, "Invalid API key");
}
} else if (apiKey && !isInternalTest) {
// Client sent a Bearer key — it must exist in DB (otherwise reject to avoid "key ignored" confusion).
const valid = await isValidApiKey(apiKey);
if (!valid) {
log.warn("AUTH", "API key not found or invalid (must be created in API Manager)");
return errorResponse(HTTP_STATUS.UNAUTHORIZED, "Invalid API key");
}
}
if (!modelStr) {