Response Headers Reference

Complete reference for all HTTP headers returned by WorldFlow AI endpoints.

Cost Optimizer Headers (x-worldflow-*)

Returned by proxy endpoints when the cost optimizer is active.

| Header | Type | Example | Description |
| --- | --- | --- | --- |
| x-worldflow-provider | string | openai | Provider that handled the request |
| x-worldflow-model | string | gpt-4o-mini | Model that generated the response |
| x-worldflow-cost | float | 0.000450 | Actual cost of this request in USD |
| x-worldflow-cost-saved | float | 0.008550 | Estimated savings vs. the most expensive alternative, in USD |
| x-worldflow-routing-reason | string | auto_cost_optimized | Why this model was selected |
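
A minimal sketch of reading these headers on the client side, assuming the response headers are available as a plain string-to-string mapping. `CostInfo` and `parse_cost_headers` are hypothetical names used for illustration only. The savings fraction assumes that the cost of the most expensive alternative equals the actual cost plus the reported savings, which this reference does not state explicitly.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class CostInfo:
    provider: str
    model: str
    cost_usd: float
    cost_saved_usd: float
    routing_reason: str

    @property
    def savings_fraction(self) -> float:
        """Share of the assumed baseline cost that was avoided."""
        # Assumption: baseline = actual cost + estimated savings.
        baseline = self.cost_usd + self.cost_saved_usd
        return self.cost_saved_usd / baseline if baseline > 0 else 0.0


def parse_cost_headers(headers: Mapping[str, str]) -> Optional[CostInfo]:
    """Return CostInfo, or None when the optimizer headers are absent."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    if "x-worldflow-cost" not in h:
        return None
    return CostInfo(
        provider=h.get("x-worldflow-provider", ""),
        model=h.get("x-worldflow-model", ""),
        cost_usd=float(h["x-worldflow-cost"]),
        cost_saved_usd=float(h.get("x-worldflow-cost-saved", "0")),
        routing_reason=h.get("x-worldflow-routing-reason", ""),
    )
```

With the example values from the table (cost 0.000450, saved 0.008550), the savings fraction under this assumption works out to roughly 95% of the baseline.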

x-worldflow-routing-reason Values

| Value | Description |
| --- | --- |
| auto_cost_optimized | Optimizer selected cheapest model meeting quality threshold |
| cheapest_available | Cheapest routing mode selected the absolute cheapest model |
| fastest_available | Fastest routing mode selected the lowest-latency model |
| fixed_model | Request pinned to a specific model via fixed:<model> |
| chain_routing | Request delegated to a multi-model chain |
| fallback | No model met the quality threshold; fallback model used |
| fallback_not_in_matrix | Fallback model not found in cost matrix |
| cheapest_capable | Routing engine selected cheapest capable provider |
| quality_preferred | Quality-first strategy preferred a higher-quality model |
| budget_constrained | Budget pressure forced a cheaper model selection |
| cache_affinity | A model with warm cache data was preferred |
| policy_override | A static routing rule or force-provider hint matched |
| failover | Primary providers unavailable; failover provider used |
| cascade_escalation | Query escalated to a higher-tier model after initial attempt |
| latency_sla | Selected to meet a latency SLO constraint |

Request Header: X-WorldFlow-Routing

Controls per-request routing behavior.

| Value | Behavior |
| --- | --- |
| auto | Use the workspace's configured routing strategy |
| cheapest | Pick the cheapest available model (no quality threshold) |
| fastest | Pick the lowest-latency model |
| fixed:<model_id> | Pin to a specific model (e.g., fixed:gpt-4o) |
| fixed:chain:<chain_id> | Execute a multi-model chain |
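
The values above can be assembled with a small helper. This is a sketch only: `routing_header` and its `mode`/`target` parameters are hypothetical names, and the validation simply mirrors the five documented forms.

```python
from typing import Dict


def routing_header(mode: str, target: str = "") -> Dict[str, str]:
    """Build the X-WorldFlow-Routing request header for one request."""
    if mode in ("auto", "cheapest", "fastest"):
        if target:
            raise ValueError(f"mode {mode!r} does not take a target")
        value = mode
    elif mode == "fixed":
        if not target:
            raise ValueError("fixed routing needs a model ID")
        value = f"fixed:{target}"
    elif mode == "chain":
        if not target:
            raise ValueError("chain routing needs a chain ID")
        value = f"fixed:chain:{target}"
    else:
        raise ValueError(f"unknown routing mode: {mode!r}")
    return {"X-WorldFlow-Routing": value}
```

For example, `routing_header("fixed", "gpt-4o")` yields the header value `fixed:gpt-4o`, and the resulting dictionary can be merged into the request headers of whichever HTTP client is in use.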

Cache Headers (x-synapse-*)

Returned by all proxy endpoints (OpenAI-compatible, Anthropic-compatible, Gemini-compatible, Cohere-compatible).

Response Headers

| Header | Type | Example | Description |
| --- | --- | --- | --- |
| x-synapse-cache-status | string | HIT | Cache result: HIT, HIT-L1, HIT-L2, MISS, SKIP, or BYPASS |
| x-synapse-cache-tier | string | L2 | Cache tier that served the response: L0, L1, L2, or NONE |
| x-synapse-latency-ms | integer | 23 | Total request processing latency in milliseconds |
| x-synapse-similarity | float | 0.9400 | Cosine similarity score on cache hits (0.0-1.0) |
| x-synapse-cost-saved | float | 0.042000 | Estimated cost saved by cache hit in USD |
| x-synapse-response-bucket-index | integer | 2 | Cache bucket index used (when bucketing is active) |

x-synapse-cache-status Values

| Value | Description |
| --- | --- |
| HIT | Response served from semantic cache. No LLM call was made. |
| HIT-L1 | Response served from L1 (Redis) cache. |
| HIT-L2 | Response served from L2 (Milvus) vector cache. |
| MISS | No similar query found in cache. Request forwarded to provider. Response cached for future use. |
| SKIP | Cache was bypassed due to X-Synapse-Skip-Cache: true or BYPASS configuration. |
| BYPASS | Request forwarded directly without cache interaction (passthrough mode). |
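
Because a hit can be reported either as plain `HIT` or with a tier suffix, clients may want to normalize the status before branching on it. The following is a sketch under that reading of the table; `parse_cache_status` is a hypothetical helper name.

```python
from typing import Optional, Tuple

KNOWN_STATUSES = {"HIT", "MISS", "SKIP", "BYPASS"}


def parse_cache_status(status: str) -> Tuple[bool, Optional[str]]:
    """Split a status such as "HIT-L2" into (is_hit, tier_suffix)."""
    base, _, suffix = status.strip().upper().partition("-")
    if base not in KNOWN_STATUSES:
        raise ValueError(f"unknown cache status: {status!r}")
    is_hit = base == "HIT"
    # Only hits carry a tier suffix; plain "HIT" has none.
    tier = suffix if (is_hit and suffix) else None
    return is_hit, tier
```

When the suffix is absent, the `x-synapse-cache-tier` header can still be consulted for the tier.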

x-synapse-cache-tier Values

| Value | Description |
| --- | --- |
| L0 | GPU-resident HBM cache (CAGRA index). Lowest latency (~1ms). |
| L1 | Redis with RediSearch HNSW index. Low latency (~5ms). |
| L2 | Milvus vector database. Higher capacity, slightly higher latency (~15ms). |
| NONE | No cache tier was involved (cache miss, skip, or bypass). |

Request Headers

| Header | Type | Values | Description |
| --- | --- | --- | --- |
| x-synapse-skip-cache | boolean | true, 1 | Bypass cache for this request |
| x-synapse-preview-cache | boolean | true, 1 | Return cache prediction without making an LLM call |
| x-synapse-passthrough | boolean | true, 1 | Forward directly to provider without caching or stream processing. Use for tool-use heavy clients. |
| x-synapse-workspace-context | string | commit:abc1234,dirty:3 | Workspace state for code-aware cache invalidation |
| x-synapse-code-context | JSON | See below | Code session context for three-stage cache matching |
| x-synapse-bucket-size | integer | 10 | Override bucket size (1-255) |
| x-synapse-bucket-index | integer | 3 | Select a specific bucket entry (0-based) |
| x-synapse-bucket-mode | string | deterministic | Override bucket selection: random, recency, deterministic |
| x-synapse-bucket-disabled | boolean | true, 1 | Disable bucketing for this request |
| x-api-key | string | sk-... | Client's own API key for transparent cache mode (used on cache misses) |
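
As an illustration of the bucket overrides, the sketch below builds the `x-synapse-bucket-*` headers and checks the documented ranges before sending. `bucket_headers` is a hypothetical helper, and dropping the other overrides when bucketing is disabled is an assumption about sensible client behavior, not documented server behavior.

```python
from typing import Dict, Optional

BUCKET_MODES = {"random", "recency", "deterministic"}


def bucket_headers(
    size: Optional[int] = None,
    index: Optional[int] = None,
    mode: Optional[str] = None,
    disabled: bool = False,
) -> Dict[str, str]:
    """Build x-synapse-bucket-* request headers with range checks."""
    if disabled:
        # Assumption: other overrides are pointless once bucketing is off.
        return {"x-synapse-bucket-disabled": "true"}
    headers: Dict[str, str] = {}
    if size is not None:
        if not 1 <= size <= 255:
            raise ValueError("bucket size must be between 1 and 255")
        headers["x-synapse-bucket-size"] = str(size)
    if index is not None:
        if index < 0:
            raise ValueError("bucket index is 0-based and cannot be negative")
        headers["x-synapse-bucket-index"] = str(index)
    if mode is not None:
        if mode not in BUCKET_MODES:
            raise ValueError(f"unknown bucket mode: {mode!r}")
        headers["x-synapse-bucket-mode"] = mode
    return headers
```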

x-synapse-workspace-context Format

Comma-separated key-value pairs:

commit:<git-hash>,dirty:<count>,files_hash:<hash>,root:<path>

| Key | Description |
| --- | --- |
| commit | Current git commit hash |
| dirty | Number of uncommitted file changes |
| files_hash | Hash of mentioned file contents |
| root | Workspace root path |
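
The format can be produced and read back with a few lines of string handling. This is a sketch: both function names are hypothetical, and treating `files_hash` and `root` as optional is inferred from the shorter example value `commit:abc1234,dirty:3` rather than stated outright.

```python
from typing import Dict, Optional


def format_workspace_context(
    commit: str,
    dirty: int,
    files_hash: Optional[str] = None,
    root: Optional[str] = None,
) -> str:
    """Serialize workspace state as comma-separated key:value pairs."""
    parts = [f"commit:{commit}", f"dirty:{dirty}"]
    if files_hash is not None:
        parts.append(f"files_hash:{files_hash}")
    if root is not None:
        parts.append(f"root:{root}")
    return ",".join(parts)


def parse_workspace_context(value: str) -> Dict[str, str]:
    """Parse the header value back into a dictionary of strings."""
    result: Dict[str, str] = {}
    for pair in value.split(","):
        # Split on the first colon only, so values may contain colons.
        key, sep, val = pair.partition(":")
        if sep:
            result[key.strip()] = val.strip()
    return result
```

Note that a root path containing a comma would not survive this simple split; the reference does not describe an escaping rule for that case.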

x-synapse-code-context Schema

JSON-encoded object:

{
  "session_id": "session-abc123",
  "file_paths_mentioned": ["src/lib.rs", "src/main.rs"],
  "recent_tool_calls": [["Read", "src/lib.rs"]],
  "system_prompt_hash": "hash123",
  "working_directory": "/path/to/project",
  "git_commit": "abc123",
  "mentioned_files_mtime": {"src/lib.rs": 1704067200}
}
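
Since a header value has to fit on a single line, the object is best serialized compactly before it is attached to the request. A minimal sketch, assuming a standard JSON encoder is acceptable to the server; `code_context_header` is a hypothetical helper name.

```python
import json
from typing import Any, Dict


def code_context_header(context: Dict[str, Any]) -> Dict[str, str]:
    """Encode the code context as a single-line JSON header value."""
    # Compact separators avoid extra whitespace; json.dumps escapes
    # non-ASCII characters by default, keeping the value header-safe.
    value = json.dumps(context, separators=(",", ":"))
    return {"x-synapse-code-context": value}


example = code_context_header({
    "session_id": "session-abc123",
    "file_paths_mentioned": ["src/lib.rs", "src/main.rs"],
    "recent_tool_calls": [["Read", "src/lib.rs"]],
    "git_commit": "abc123",
})
```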

Rate Limit Headers (X-RateLimit-*)

Returned by all API endpoints when rate limiting is enabled.

Response Headers (Success)

| Header | Type | Example | Description |
| --- | --- | --- | --- |
| X-RateLimit-Limit | integer | 6000 | Maximum requests allowed per minute |
| X-RateLimit-Remaining | integer | 5999 | Approximate requests remaining in current window |
| X-RateLimit-Reset | integer | 60 | Seconds until the rate limit window resets |
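
One way a client might use these values is to spread its remaining requests evenly over the rest of the window. The sketch below is an illustrative pacing heuristic, not a documented requirement; `pacing_delay` is a hypothetical name, and because the remaining count is approximate the result should be treated as a hint.

```python
from typing import Mapping


def pacing_delay(headers: Mapping[str, str]) -> float:
    """Suggest seconds to wait between requests for the current window."""
    h = {k.lower(): v for k, v in headers.items()}
    remaining = int(h["x-ratelimit-remaining"])
    reset_seconds = int(h["x-ratelimit-reset"])
    if remaining <= 0:
        # Nothing left in this window: wait for the reset.
        return float(reset_seconds)
    return reset_seconds / remaining
```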

Response Headers (Rate Limited -- HTTP 429)

| Header | Type | Example | Description |
| --- | --- | --- | --- |
| Retry-After | integer | 5 | Seconds to wait before retrying |
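
A client can honor `Retry-After` with a small retry loop. The sketch below is transport-agnostic: `send` is a hypothetical callable that performs the request and returns a `(status_code, headers)` pair, and the 1-second fallback when the header is missing is an assumption.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), waiting Retry-After seconds after each HTTP 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        # Return on success, on other errors, or once attempts run out.
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        h = {k.lower(): v for k, v in headers.items()}
        # Assumption: fall back to 1 second if the header is missing.
        sleep(int(h.get("retry-after", "1")))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.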

Rate Limit Tiers

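| Tier | RPS | RPM | TPM | RPD | Burst |
| --- | --- | --- | --- | --- | --- |
| Starter | 10 | 600 | 100K | 10K | 2x RPS |
| Pro | 100 | 6,000 | 1M | 100K | 2x RPS |
| Enterprise | 500 | 30,000 | 10M | 1M | 2x RPS |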

Burst tolerance allows short spikes at 2x the per-second limit for up to 5 seconds.
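
The arithmetic behind the table can be made explicit. In this sketch the per-second limits are copied from the tier table, the function names are hypothetical, and the burst figure is simply 2x RPS multiplied by the 5-second window; how the server counts burst traffic against the per-minute limit is not described here.

```python
# Sustained requests-per-second limits copied from the tier table.
TIER_RPS = {"Starter": 10, "Pro": 100, "Enterprise": 500}
BURST_MULTIPLIER = 2      # "2x RPS"
BURST_WINDOW_SECONDS = 5  # "for up to 5 seconds"


def requests_per_minute(tier: str) -> int:
    """Per-minute limit implied by the per-second limit."""
    return TIER_RPS[tier] * 60


def max_burst_requests(tier: str) -> int:
    """Upper bound on requests sent during one full burst window."""
    return TIER_RPS[tier] * BURST_MULTIPLIER * BURST_WINDOW_SECONDS
```

For the Starter tier this gives 600 requests per minute, matching the RPM column, and at most 100 requests during a single 5-second burst.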

Standard Headers

Returned by all endpoints.

| Header | Type | Example | Description |
| --- | --- | --- | --- |
| Content-Type | string | application/json | Response content type |
| X-Request-ID | UUID | 550e8400-e29b-41d4-a716-446655440000 | Unique request identifier for tracing and support |

Example Response

A cache miss with cost-optimized routing:

HTTP/1.1 200 OK
Content-Type: application/json
X-Request-ID: 550e8400-e29b-41d4-a716-446655440000
x-synapse-cache-status: MISS
x-synapse-cache-tier: NONE
x-synapse-latency-ms: 1250
x-worldflow-provider: openai
x-worldflow-model: gpt-4o-mini
x-worldflow-cost: 0.000450
x-worldflow-cost-saved: 0.008550
x-worldflow-routing-reason: auto_cost_optimized
X-RateLimit-Limit: 6000
X-RateLimit-Remaining: 5999
X-RateLimit-Reset: 60

A cache hit (no LLM call, no cost):

HTTP/1.1 200 OK
Content-Type: application/json
X-Request-ID: 660f9500-f30c-52e5-b827-557766551111
x-synapse-cache-status: HIT-L2
x-synapse-cache-tier: L2
x-synapse-latency-ms: 15
x-synapse-similarity: 0.9650
x-synapse-cost-saved: 0.009000
X-RateLimit-Limit: 6000
X-RateLimit-Remaining: 5998
X-RateLimit-Reset: 59