Compare commits

8 Commits

Author SHA1 Message Date
Diego Rodrigues de Sa e Souza f279368531 Merge pull request #911 from diegosouzapw/release/v3.4.4
chore(release): v3.4.4 — Responses API token fix, SQLite WAL checkpoint, issue triage
2026-04-02 07:44:29 -03:00
diegosouzapw 4cc44b37bb chore(release): v3.4.4 — Responses API token fix, SQLite WAL checkpoint, issue triage 2026-04-02 01:05:09 -03:00
Diego Rodrigues de Sa e Souza e121fec599 Merge pull request #905 from rdself/coder/sqlite-wal-checkpoint-shutdown
Flush SQLite WAL on graceful shutdown
2026-04-02 01:00:50 -03:00
Diego Rodrigues de Sa e Souza 6c669abb23 Merge pull request #909 from christopher-s/fix/responses-api-flush-total-tokens
fix(translator): emit response.completed with total_tokens for Responses API clients
2026-04-02 01:00:34 -03:00
Chris Staley 9588c1ea3e refactor(translator): extract usage token constants for readability
Extract input_tokens/output_tokens into local constants to avoid
repeating nullish coalescing chains in the total_tokens calculation.
2026-04-01 20:03:31 -06:00
Chris Staley 304664b318 test: update usage field assertions to Responses API format
The normalization to input_tokens/output_tokens/total_tokens changed
the usage field names. Update test to assert on the correct fields.
2026-04-01 19:46:18 -06:00
Chris Staley 8372a3c7ca fix(translator): emit response.completed with total_tokens for Responses API clients
Hub-and-spoke flush path was broken for non-OpenAI providers when the
client speaks OpenAI Responses API (e.g. Codex CLI). When
claudeToOpenAIResponse(null) returned null, openaiToOpenAIResponsesResponse
was never called, so response.completed (carrying total_tokens) was never
emitted — causing "missing field total_tokens" errors in Codex.

Two fixes:
- Pass null through to Step 2 translator even when Step 1 produced no
  output during flush, so terminal events get emitted
- Capture usage from any chunk carrying it (not just usage-only chunks)
  and normalize Chat Completions format to Responses API format
2026-04-01 19:26:34 -06:00
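The first fix described in this commit message can be sketched as follows. This is a minimal illustration, not the project's translator: `toHub`, `fromHub`, `translate`, and the `Chunk`/`SseEvent`/`State` types are hypothetical stand-ins for the two translation steps, and the event payloads are simplified.

```typescript
// Hypothetical stand-ins for the two translation steps; not the project's real functions.
type Chunk = { text: string } | null;
type SseEvent = { event: string; data?: unknown };

interface State {
  usage?: { input_tokens: number; output_tokens: number; total_tokens: number };
}

// Step 1 (provider format -> hub format): has nothing to emit on flush, so it returns null.
function toHub(chunk: Chunk): Chunk {
  return chunk === null ? null : { text: chunk.text };
}

// Step 2 (hub format -> client format): emits the terminal event when it receives null.
function fromHub(chunk: Chunk, state: State): SseEvent[] {
  if (chunk === null) {
    return [{ event: "response.completed", data: { usage: state.usage } }];
  }
  return [{ event: "response.output_text.delta", data: chunk.text }];
}

function translate(chunk: Chunk, state: State): SseEvent[] {
  const hub = toHub(chunk);
  if (hub !== null) return fromHub(hub, state);
  // Step 1 produced nothing. On flush (chunk === null) Step 2 must still run,
  // otherwise the terminal event carrying usage is never emitted.
  return chunk === null ? fromHub(null, state) : [];
}
```

With this shape, `translate(null, state)` yields a single `response.completed` event even though Step 1 returned null, which is the behavior the fix restores.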
R.D. 69bbc0a2a1 flush sqlite wal on graceful shutdown 2026-04-01 20:14:31 -04:00
18 changed files with 181 additions and 25 deletions
+17
@@ -4,6 +4,23 @@
---
## [3.4.4] - 2026-04-02
### 🐛 Bug Fixes
- **Responses API Token Reporting:** Emit `response.completed` with normalized `input_tokens`/`output_tokens`/`total_tokens` usage fields for Responses API clients such as Codex CLI, fixing "missing field total_tokens" errors (#909 — thanks @christopher-s).
- **SQLite WAL Checkpoint on Shutdown:** Flush WAL changes into the primary database file during graceful shutdown/restart, preventing data loss on Docker container stops (#905 — thanks @rdself).
- **Graceful Shutdown Signal:** Changed `/api/restart` and `/api/shutdown` routes from `process.exit(0)` to `process.kill(process.pid, "SIGTERM")`, ensuring the shutdown handler runs before exit.
- **Docker Stop Grace Period:** Added `stop_grace_period: 40s` to Docker Compose files and `--stop-timeout 40` to Docker run examples.
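As a worked example of the token normalization in the first fix above, the sketch below maps a Chat Completions style usage object to the Responses API field names. The function name `normalizeUsage` and the two interfaces are assumptions made for illustration; the fallback order mirrors the translator change in this release.

```typescript
// Usage shapes as they appear in this release's diff; the names below are hypothetical.
interface IncomingUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
  input_tokens?: number;
  output_tokens?: number;
  total_tokens?: number;
}

interface ResponsesUsage {
  input_tokens: number;
  output_tokens: number;
  total_tokens: number;
}

function normalizeUsage(u: IncomingUsage): ResponsesUsage {
  // Prefer Responses API names, fall back to Chat Completions names, then to 0.
  const input_tokens = u.input_tokens ?? u.prompt_tokens ?? 0;
  const output_tokens = u.output_tokens ?? u.completion_tokens ?? 0;
  return {
    input_tokens,
    output_tokens,
    // Derive the total when the provider did not report one.
    total_tokens: u.total_tokens ?? (input_tokens + output_tokens),
  };
}
```

For instance, `normalizeUsage({ prompt_tokens: 10, completion_tokens: 5 })` returns `{ input_tokens: 10, output_tokens: 5, total_tokens: 15 }`, matching the values asserted in the updated test.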
### 🛠️ Maintenance
- Closed 5 resolved/not-a-bug issues (#872, #814, #816, #890, #877).
- Triaged 6 issues with needs-info requests (#892, #887, #886, #865, #895, #870).
- Responded to CLI detection tracking issue (#863) with contributor guidance.
---
## [3.4.3] - 2026-04-02
### ✨ New Features
+4
@@ -979,6 +979,7 @@ OmniRoute is available as a public Docker image on [Docker Hub](https://hub.dock
docker run -d \
--name omniroute \
--restart unless-stopped \
--stop-timeout 40 \
-p 20128:20128 \
-v omniroute-data:/app/data \
diegosouzapw/omniroute:latest
@@ -993,6 +994,7 @@ cp .env.example .env
docker run -d \
--name omniroute \
--restart unless-stopped \
--stop-timeout 40 \
--env-file .env \
-p 20128:20128 \
-v omniroute-data:/app/data \
@@ -1016,6 +1018,8 @@ Notes:
- Quick Tunnel URLs are temporary and change after every restart.
- Managed install currently supports Linux, macOS, and Windows on `x64` / `arm64`.
- Docker images bundle system CA roots and pass them to managed `cloudflared`, which avoids TLS trust failures when the tunnel bootstraps inside the container.
- SQLite runs in WAL mode. `docker stop` should be allowed to finish so OmniRoute can checkpoint the latest changes back into `storage.sqlite`.
- The bundled Compose files already set a 40s stop grace period. If you run the image directly, keep `--stop-timeout 40` (or similar) so manual stops do not cut off shutdown cleanup.
- Set `CLOUDFLARED_BIN=/absolute/path/to/cloudflared` if you want OmniRoute to use an existing binary instead of downloading one.
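To illustrate why the stop timeout in the notes above matters: `docker stop` sends SIGTERM and only sends SIGKILL once the grace period expires, so cleanup has to finish inside that window. The sketch below shows the general pattern under stated assumptions; `installGracefulShutdown` and `SignalSource` are hypothetical names, not OmniRoute's actual shutdown module, and `closeDb` stands in for whatever checkpoints the WAL and closes SQLite.

```typescript
// Minimal interface so the sketch does not depend on Node typings;
// Node's `process` object is expected to satisfy it.
interface SignalSource {
  once(signal: string, handler: () => void): unknown;
}

// Hypothetical helper: run database cleanup on SIGTERM, then exit.
function installGracefulShutdown(
  source: SignalSource,
  closeDb: () => boolean, // assumed to checkpoint the WAL and close the database
  exit: (code: number) => void,
): void {
  source.once("SIGTERM", () => {
    let code = 0;
    try {
      closeDb();
    } catch {
      // Cleanup failed; still exit, but report a non-zero status.
      code = 1;
    } finally {
      exit(code);
    }
  });
}
```

In a Node process this would be wired up roughly as `installGracefulShutdown(process, closeDbInstance, (c) => process.exit(c))`. If the container is killed before the handler returns, the checkpoint never runs, which is the situation the 40s grace period is meant to avoid.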
**Using Docker Compose with Caddy (HTTPS Auto-TLS):**
+1
@@ -19,6 +19,7 @@ services:
target: runner-cli
image: omniroute:prod
restart: unless-stopped
stop_grace_period: 40s
env_file: .env
environment:
- NODE_ENV=production
+1
@@ -17,6 +17,7 @@
x-common: &common
restart: unless-stopped
stop_grace_period: 40s
env_file: .env
environment:
- DATA_DIR=/app/data # Must match the volume mount below
+4
@@ -983,6 +983,7 @@ OmniRoute is available as a public Docker image on [Docker Hub](https://hub.dock
docker run -d \
--name omniroute \
--restart unless-stopped \
--stop-timeout 40 \
-p 20128:20128 \
-v omniroute-data:/app/data \
diegosouzapw/omniroute:latest
@@ -997,6 +998,7 @@ cp .env.example .env
docker run -d \
--name omniroute \
--restart unless-stopped \
--stop-timeout 40 \
--env-file .env \
-p 20128:20128 \
-v omniroute-data:/app/data \
@@ -1020,6 +1022,8 @@ Notes:
- Quick Tunnel URLs are temporary and change after every restart.
- Managed install currently supports Linux, macOS, and Windows on `x64` / `arm64`.
- Docker images bundle system CA roots and pass them to managed `cloudflared`, which avoids TLS trust failures when the tunnel bootstraps inside the container.
- SQLite uses WAL mode. Let `docker stop` finish cleanly so OmniRoute can checkpoint the latest changes back into `storage.sqlite`.
- The bundled Compose files already use a 40s stop grace period. If you run the image directly, keep `--stop-timeout 40` (or similar) so manual stops do not interrupt shutdown cleanup.
- Set `CLOUDFLARED_BIN=/absolute/path/to/cloudflared` if you want OmniRoute to use an existing binary instead of downloading one.
**Using Docker Compose with Caddy (HTTPS Auto-TLS):**
+1 -1
@@ -1,7 +1,7 @@
openapi: 3.1.0
info:
title: OmniRoute API
-version: 3.4.3
+version: 3.4.4
description: |
OmniRoute is a local-first AI API proxy router. It provides an OpenAI-compatible
endpoint that routes requests to multiple AI providers with load balancing,
+1 -1
@@ -1,6 +1,6 @@
{
"name": "omniroute-desktop",
-"version": "3.4.3",
+"version": "3.4.4",
"description": "OmniRoute Desktop Application",
"main": "main.js",
"author": {
+1 -1
@@ -1,6 +1,6 @@
{
"name": "@omniroute/open-sse",
-"version": "3.4.3",
+"version": "3.4.4",
"description": "Express SSE sidecar for OmniRoute — handles streaming, protocol translation, and provider orchestration",
"type": "module",
"main": "index.js",
+9
@@ -267,6 +267,15 @@ export function translateResponse(targetFormat, sourceFormat, chunk, state) {
finalResults.push(...(Array.isArray(converted) ? converted : [converted]));
}
}
// Flush: pass null to source-format translator even when Step 1 produced no output.
// This is critical for formats like openai-responses that emit terminal events
// (e.g., response.completed with total_tokens) in their flush handler.
if (chunk === null && results.length === 0) {
const converted = fromOpenAI(null, state);
if (converted) {
finalResults.push(...(Array.isArray(converted) ? converted : [converted]));
}
}
results = finalResults;
}
}
@@ -14,11 +14,24 @@ export function openaiToOpenAIResponsesResponse(chunk, state) {
return flushEvents(state);
}
-if (!chunk.choices?.length) {
-// Capture usage from usage-only chunks (stream_options.include_usage)
-if (chunk.usage) {
-state.usage = chunk.usage;
+// Capture usage from any chunk that carries it (usage-only chunks OR final chunks with finish_reason)
+// Normalize Chat Completions format (prompt_tokens/completion_tokens) to Responses API format
+// (input_tokens/output_tokens) so response.completed always has the fields Codex expects.
+if (chunk.usage) {
+const u = chunk.usage;
+const input_tokens = u.input_tokens ?? u.prompt_tokens ?? 0;
+const output_tokens = u.output_tokens ?? u.completion_tokens ?? 0;
+state.usage = {
+input_tokens,
+output_tokens,
+total_tokens: u.total_tokens ?? input_tokens + output_tokens,
+};
+if (u.prompt_tokens_details?.cached_tokens) {
+state.usage.input_tokens_details = { cached_tokens: u.prompt_tokens_details.cached_tokens };
}
+}
+if (!chunk.choices?.length) {
return [];
}
+3 -3
@@ -1,12 +1,12 @@
{
"name": "omniroute",
-"version": "3.4.3",
+"version": "3.4.4",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "omniroute",
-"version": "3.4.3",
+"version": "3.4.4",
"hasInstallScript": true,
"license": "MIT",
"workspaces": [
@@ -21068,7 +21068,7 @@
},
"open-sse": {
"name": "@omniroute/open-sse",
-"version": "3.4.3"
+"version": "3.4.4"
}
}
}
+1 -1
@@ -1,6 +1,6 @@
{
"name": "omniroute",
-"version": "3.4.3",
+"version": "3.4.4",
"description": "Smart AI Router with auto fallback — route to FREE & cheap models, zero downtime. Works with Cursor, Cline, Claude Desktop, Codex, and any OpenAI-compatible tool.",
"type": "module",
"bin": {
+2 -2
@@ -1,9 +1,9 @@
import { NextResponse } from "next/server";
export async function POST() {
-// Graceful restart: exit with code 0 so the process manager (pm2/systemd) restarts
+// Graceful restart: SIGTERM flows through the shutdown handler before the process manager restarts
setTimeout(() => {
-process.exit(0);
+process.kill(process.pid, "SIGTERM");
}, 500);
return NextResponse.json({ status: "restarting" });
+1 -1
View File
@@ -4,7 +4,7 @@ export async function POST() {
const response = NextResponse.json({ success: true, message: "Shutting down..." });
setTimeout(() => {
-process.exit(0);
+process.kill(process.pid, "SIGTERM");
}, 500);
return response;
+36 -5
@@ -12,6 +12,7 @@ import { runMigrations } from "./migrationRunner";
type SqliteDatabase = import("better-sqlite3").Database;
type JsonRecord = Record<string, unknown>;
type CheckpointMode = "PASSIVE" | "FULL" | "RESTART" | "TRUNCATE";
// ──────────────── Environment Detection ────────────────
@@ -323,6 +324,12 @@ function setDb(db: SqliteDatabase | null): void {
}
}
function checkpointDb(db: SqliteDatabase, mode: CheckpointMode = "TRUNCATE"): boolean {
if (isCloud || isBuildPhase || !SQLITE_FILE) return false;
db.pragma(`wal_checkpoint(${mode})`);
return true;
}
function ensureProviderConnectionsColumns(db: SqliteDatabase) {
try {
const columns = db.prepare("PRAGMA table_info(provider_connections)").all() as Array<{
@@ -523,15 +530,39 @@ export function getDbInstance(): SqliteDatabase {
return db;
}
export function closeDbInstance(options?: { checkpointMode?: CheckpointMode | null }): boolean {
const db = getDb();
if (!db) return false;
const checkpointMode = options?.checkpointMode ?? "TRUNCATE";
try {
if (checkpointMode) {
try {
if (checkpointDb(db, checkpointMode)) {
console.log(`[DB] SQLite WAL checkpoint completed (${checkpointMode}).`);
}
} catch (error: unknown) {
const message = error instanceof Error ? error.message : String(error);
console.warn(`[DB] WAL checkpoint failed during close (${checkpointMode}):`, message);
}
}
} finally {
try {
if (db.open) db.close();
} finally {
setDb(null);
}
}
return true;
}
/**
* Reset the singleton (used by restore).
*/
export function resetDbInstance() {
-const db = getDb();
-if (db) {
-db.close();
-setDb(null);
-}
+closeDbInstance();
}
// ──────────────── JSON → SQLite Migration ────────────────
+3 -5
@@ -96,11 +96,9 @@ async function waitForDrain(): Promise<void> {
*/
async function cleanup(): Promise<void> {
try {
-const { getDbInstance } = await import("@/lib/db/core");
-const db = getDbInstance();
-if (db && typeof db.close === "function") {
-db.close();
-console.log("[Shutdown] SQLite database closed.");
+const { closeDbInstance } = await import("@/lib/db/core");
+if (closeDbInstance()) {
+console.log("[Shutdown] SQLite database checkpointed and closed.");
}
} catch (err) {
console.error("[Shutdown] Error during cleanup:", (err as Error).message);
+76
@@ -19,6 +19,8 @@ const proxyFetch = await import("../../open-sse/utils/proxyFetch.ts");
const proxyDispatcher = await import("../../open-sse/utils/proxyDispatcher.ts");
const proxySettingsRoute = await import("../../src/app/api/settings/proxy/route.ts");
const proxyTestRoute = await import("../../src/app/api/settings/proxy/test/route.ts");
const shutdownRoute = await import("../../src/app/api/shutdown/route.ts");
const restartRoute = await import("../../src/app/api/restart/route.ts");
async function withEnv(name, value, fn) {
const previous = process.env[name];
@@ -141,6 +143,80 @@ test(
}
);
test("closeDbInstance checkpoints WAL changes into the primary SQLite file", async () => {
await resetStorage();
const db = core.getDbInstance();
const now = new Date().toISOString();
db.prepare(
"INSERT INTO provider_connections (id, provider, auth_type, name, is_active, created_at, updated_at) VALUES (?, ?, ?, ?, ?, ?, ?)"
).run("checkpoint-test-conn", "openai", "apikey", "checkpoint-test", 1, now, now);
core.closeDbInstance();
const snapshotPath = path.join(TEST_DATA_DIR, "storage-snapshot.sqlite");
fs.copyFileSync(core.SQLITE_FILE, snapshotPath);
const Database = (await import("better-sqlite3")).default;
const snapshotDb = new Database(snapshotPath, { readonly: true });
try {
const row = snapshotDb
.prepare("SELECT name FROM provider_connections WHERE id = ?")
.get("checkpoint-test-conn");
assert.equal(row?.name, "checkpoint-test");
} finally {
snapshotDb.close();
}
});
test("shutdown route uses SIGTERM for graceful shutdown", async () => {
const originalKill = process.kill;
const originalSetTimeout = globalThis.setTimeout;
const calls = [];
process.kill = (pid, signal) => {
calls.push({ pid, signal });
return true;
};
globalThis.setTimeout = (callback) => {
callback();
return 0;
};
try {
const response = await shutdownRoute.POST();
assert.equal(response.status, 200);
assert.deepEqual(calls, [{ pid: process.pid, signal: "SIGTERM" }]);
} finally {
process.kill = originalKill;
globalThis.setTimeout = originalSetTimeout;
}
});
test("restart route uses SIGTERM for graceful restart", async () => {
const originalKill = process.kill;
const originalSetTimeout = globalThis.setTimeout;
const calls = [];
process.kill = (pid, signal) => {
calls.push({ pid, signal });
return true;
};
globalThis.setTimeout = (callback) => {
callback();
return 0;
};
try {
const response = await restartRoute.POST();
assert.equal(response.status, 200);
assert.deepEqual(calls, [{ pid: process.pid, signal: "SIGTERM" }]);
} finally {
process.kill = originalKill;
globalThis.setTimeout = originalSetTimeout;
}
});
test("unlinkFileWithRetry retries EBUSY/EPERM and eventually succeeds", async () => {
const target = path.join(TEST_DATA_DIR, "retry-target.tmp");
fs.writeFileSync(target, "retry-me");
@@ -369,7 +369,9 @@ test("Chat→Responses streaming: usage-only chunk is captured (not dropped)", (
const completedEvent = finishEvents.find((e) => e.event === "response.completed");
assert.ok(completedEvent, "should have completed event");
assert.ok(completedEvent.data.response.usage, "completed event should include usage");
-assert.equal(completedEvent.data.response.usage.prompt_tokens, 10);
+assert.equal(completedEvent.data.response.usage.input_tokens, 10);
+assert.equal(completedEvent.data.response.usage.output_tokens, 5);
+assert.equal(completedEvent.data.response.usage.total_tokens, 15);
});
test("Chat→Responses streaming: completed event includes accumulated output", () => {