🕒 Auto-refreshes every 60 seconds

All systems operational.

Live health of every DCS endpoint, every region, every minute. Public, read-only, no auth required. Powered by Better Stack monitors and the R-Series receipt chain.

Live endpoints

Every critical surface, in one list

Pulled live from Better Stack monitors. The 90-day bars show daily uptime — one bar per day, hover for the exact number.

ENDPOINT · 90-DAY UPTIME · RESPONSE TIME
Main backend · api.dcsai.ai · Railway us-east-1 · 99.992% uptime · 124 ms
EU passive failover · api-eu.dcsai.ai · Railway europe-west4 · 99.978% uptime · 98 ms
Compute lane · api.compute.dcsai.ai · Railway asia-southeast1 · 99.984% uptime · 142 ms
Storage backend · api.storage.dcsai.ai · Lighthouse-backed · 99.989% uptime · 156 ms
Primary IPFS gateway · gateway.dcsai.ai · Cloudflare Worker · 99.996% uptime · 42 ms
Gateway standby · gateway-standby.dcsai.ai · separate CF account · 99.999% uptime · 45 ms
Inference Tier 0 (own-GPU) · RunPod H200 · weight 0.5-0.7 · 99.612% uptime · 312 ms
Public verifier · verify.dcsai.ai · Cloudflare Pages · 99.994% uptime · 38 ms

Region health summary

Five regions watched in parallel

Synthetic builds run once a minute from each region; the status tag summarizes the result of the last 60 checks.

🇺🇸 North America · Operational · build 1m 47s
🇪🇺 Europe · Operational · build 1m 51s
Asia-Pacific · Operational · build 2m 04s
🇦🇪 Middle East · Degraded · build 2m 41s
🇧🇷 Latin America · Operational · build 2m 12s

99.987% · 90-day uptime
60 s · Monitor frequency
5 regions · Synthetic build probes
0 incidents · Last 7 days
2 min · Gateway failover RTO

Incident history · last 30 days

Named, dated, resolved

Every incident gets a postmortem note. No "minor issues" hand-waving.

21 MAY · 17:42 UTC

Inference Tier 0 intermittent 503s

RunPod H200 returned 503 on ~3% of requests for 14 minutes. Traffic auto-shed to Tier 1 (Cerebras) — zero impact on builds. Underlying capacity restored 17:56 UTC.

Resolved
18 MAY · 03:10 UTC

Scheduled maintenance · Storage backend

Lighthouse SDK upgrade to 0.4.5. 6-minute window. Read traffic served from cache; new mirrors queued and drained within the window.

Maintenance
14 MAY · 22:08 UTC

Middle East region latency > 3s

Upstream BGP path flap added 1.4-2.7s to ME-routed builds for 23 minutes. Cleared by route propagation; no requests dropped, only delayed.

Degraded
09 MAY · 14:32 UTC

Primary gateway free-tier quota approach

Cloudflare free-tier daily request budget reached 87%. Gateway standby pre-warmed; cutover not triggered. Daily budget upgrade applied 15:10 UTC.

Resolved
On-chain heartbeat · Reputation SBT

Last 5 worker reputation mints

Each mint is an on-chain attestation that a worker hit a reliability milestone. Live from the contract on Base mainnet.

TRUSTED · 0xa4e7…9f31 · 1,240 jobs · 99.2% reliable · 2 min ago
VERIFIED · 0xb3c2…4d18 · 512 jobs · 97.8% reliable · 18 min ago
VERIFIED · 0xd1f8…7e02 · 408 jobs · 98.4% reliable · 41 min ago
TRUSTED · 0x9a55…c074 · 3,012 jobs · 99.5% reliable · 1 h ago
VERIFIED · 0x68bd…13af · 196 jobs · 96.1% reliable · 2 h ago

Contract 0xbDd1f5fC349D9a8EfCEb07Edbd491233b2540f5F on Base mainnet.

Get pinged on incidents.

Email + RSS + a webhook for Slack or PagerDuty. No marketing, only status.

Status FAQ

How this page actually works

Where does the data come from?
Better Stack synthetic monitors (60-second cadence, 5 regions) for the endpoint list and region grid. The Reputation SBT mints stream live from the contract on Base mainnet. Everything is read-only and public — no API keys baked into the page.
How quickly do you flip the gateway if Cloudflare hits the free-tier limit?
About 2 minutes. DNS for gateway.dcsai.ai is pre-staged to swing to gateway-standby.dcsai.ai (separate Cloudflare account, separate Worker). DNS propagation is typically under 5 minutes globally; requests answered from client caches keep working during the swing.
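
The swing described above happens at the DNS level. A client that cannot wait for propagation could, as a separate client-side technique that this page does not claim to use, try the standby hostname directly when the primary fails. The sketch below is a minimal illustration under assumptions: the function names are hypothetical, and the `/ipfs/<cid>` path style is the common IPFS gateway convention rather than something this page confirms.

```python
import urllib.request
from typing import Callable, Sequence

# Hostnames taken from the endpoint list; the ordering (primary first) is an assumption.
GATEWAYS = ("gateway.dcsai.ai", "gateway-standby.dcsai.ai")


def http_get(url: str, timeout: float = 5.0) -> bytes:
    """Plain HTTPS GET using the standard library."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def fetch_with_fallback(
    path: str,
    hosts: Sequence[str] = GATEWAYS,
    get: Callable[[str], bytes] = http_get,
) -> bytes:
    """Try each gateway in order and return the first successful body.

    `path` is a hypothetical gateway path such as "/ipfs/<cid>".
    """
    last_error = None
    for host in hosts:
        try:
            return get(f"https://{host}{path}")
        except OSError as exc:
            # urllib's URLError/HTTPError and socket timeouts all subclass OSError.
            last_error = exc
    raise RuntimeError(f"all gateways failed for {path}") from last_error
```

Passing `get` as a parameter keeps the fallback logic testable without network access; in real use the default `http_get` would be left in place.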
What's the EU failover RTO?
The EU passive failover already mirrors api.dcsai.ai on every deploy. RTO is dominated by the Cloudflare DNS swing: api-eu.dcsai.ai typically starts serving in under 90 seconds.
What counts as "degraded"?
P95 response time more than 2× the 30-day baseline, or error rate above 1% sustained for 5 minutes. Either condition triggers a yellow status; a breach sustained for 15+ minutes opens an incident.
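
The rule above can be sketched as a small classifier over per-minute samples. This is an illustration only, not the monitor's actual logic. It assumes the 5-minute window applies to both conditions, that minutes breaching on latency and minutes breaching on error rate count toward the same streak, and that one sample arrives per minute; `MinuteSample` and `classify` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class MinuteSample:
    p95_ms: float      # P95 response time observed during this minute
    error_rate: float  # fraction of failed requests during this minute (0.0 to 1.0)


def classify(samples: Sequence[MinuteSample], baseline_p95_ms: float) -> str:
    """Return 'operational', 'degraded', or 'incident' as of the newest sample.

    `samples` holds one entry per minute, oldest first.
    `baseline_p95_ms` stands in for the 30-day P95 baseline.
    """
    def breaches(s: MinuteSample) -> bool:
        return s.p95_ms > 2 * baseline_p95_ms or s.error_rate > 0.01

    # Count consecutive breaching minutes, walking back from the newest sample.
    streak = 0
    for sample in reversed(samples):
        if not breaches(sample):
            break
        streak += 1

    if streak >= 15:
        return "incident"
    if streak >= 5:
        return "degraded"
    return "operational"
```

Under these assumptions, four slow minutes in a row still read as operational, the fifth flips the status to degraded, and the fifteenth opens an incident. A single healthy minute resets the streak.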
Is the 90-day uptime number real?
Yes — computed from the Better Stack monitor log, weighted by minute. We don't subtract scheduled maintenance from the number. If something was down, it counts.
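
For a sense of scale, a minute-weighted figure can be sketched as below. This is a rough illustration, assuming the log reduces to one pass/fail check per minute and that maintenance minutes count as failures, as the answer above states; the function names are hypothetical. With 129,600 minutes in 90 days, 99.987% corresponds to about 17 minutes of downtime.

```python
from typing import Iterable

MINUTES_PER_DAY = 24 * 60


def uptime_percent(minute_checks: Iterable[bool]) -> float:
    """Share of passing checks, with every minute weighted equally.

    Failed minutes during scheduled maintenance are left in, so they lower the result.
    """
    checks = list(minute_checks)
    if not checks:
        raise ValueError("no checks in window")
    return 100.0 * sum(checks) / len(checks)


def downtime_minutes(uptime_pct: float, days: int = 90) -> float:
    """Convert an uptime percentage back into minutes of downtime over `days`."""
    return (100.0 - uptime_pct) / 100.0 * days * MINUTES_PER_DAY


# Example: a 90-day window in which 17 one-minute checks failed.
window = [True] * (90 * MINUTES_PER_DAY - 17) + [False] * 17
print(round(uptime_percent(window), 3))       # 99.987
print(round(downtime_minutes(99.987), 1))     # 16.8
```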
Can I get a webhook for incidents?
Yes. Slack and PagerDuty webhooks are live, plus a generic JSON webhook. Email subscribers get a per-incident note within 60 seconds of an incident opening.