Methodology
How CapPulse measures latency, error rates, and reliability for AI providers. This page is updated as our measurement pipeline evolves.
What we measure
- Latency (p50, p95): median and tail response times for successful requests.
- Error rate: share of requests that return a 5xx status or fail with a network error.
- Rate-limit rate: share of requests that return a 429 status or another provider throttling signal.
- Availability: percentage of probe requests that receive a successful response.
- Throughput: estimated requests per second observed by probes.
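As a rough illustration, the metrics above could be derived from a batch of probe samples as follows. This is a minimal sketch, not the CapPulse pipeline: the `ProbeSample` type, the nearest-rank percentile method, and treating only 2xx as success are assumptions made for the example. Throughput is omitted, and only HTTP 429 is counted as throttling because other provider-specific signals are not described here.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProbeSample:
    # Hypothetical record for one probe request.
    status: Optional[int]  # HTTP status code, or None for a network error
    latency_ms: float      # measured response time in milliseconds


def percentile(values: List[float], pct: float) -> Optional[float]:
    """Nearest-rank percentile (an assumed method); pct is in (0, 100]."""
    if not values:
        return None
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def summarize(samples: List[ProbeSample]) -> dict:
    total = len(samples)
    if total == 0:
        raise ValueError("no samples to summarize")
    # Assumption: a 2xx status counts as a successful response.
    ok = [s for s in samples if s.status is not None and 200 <= s.status < 300]
    errors = [s for s in samples if s.status is None or 500 <= s.status < 600]
    throttled = [s for s in samples if s.status == 429]
    # Latency percentiles use successful requests only, as stated above.
    latencies = [s.latency_ms for s in ok]
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "error_rate": len(errors) / total,
        "rate_limit_rate": len(throttled) / total,
        "availability": len(ok) / total,
    }
```

Calling `summarize` on one window of samples returns a dictionary with one value per metric. Rates are fractions between 0 and 1.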
How we measure
- Distributed probes in North America, Europe, and APAC.
- Sampling every 60 seconds across chat, embeddings, image, and audio endpoints.
- Requests use consistent prompts and payloads to reduce variance.
- Measurements are normalized to remove client-side overhead.
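To give a sense of sample volume, the sketch below estimates how many samples a window would contain. It assumes that every region probes every endpoint type once per 60-second interval for a given provider; the actual fan-out per provider is not specified on this page, and the function name and parameters are hypothetical.

```python
def expected_samples(window_seconds: int,
                     regions: int = 3,
                     endpoint_types: int = 4,
                     interval_seconds: int = 60) -> int:
    """Estimate the sample count for one provider in a time window.

    Assumption: one probe per region per endpoint type per interval.
    """
    probes_per_stream = window_seconds // interval_seconds
    return probes_per_stream * regions * endpoint_types


# 24 hours, 3 regions, 4 endpoint types (chat, embeddings, image, audio)
print(expected_samples(24 * 60 * 60))  # 17280
```

Under the same assumption, a single endpoint type in a single region yields 1,440 samples per day.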
Reliability score
Reliability is a weighted blend of availability, error rate, latency stability, and rate limits. We publish ranges rather than exact weights to reduce gaming and to keep the score stable as measurement improves.
Typical weighting ranges: availability (40-50%), error rate (25-35%), latency stability (15-25%), rate limits (5-10%).
Latency stability is normalized using p95 thresholds to keep scores consistent across providers.
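The sketch below shows one way such a blend could be computed. The exact weights are not published, so the values here are illustrative picks from inside the ranges above that sum to 1.0. The p95 thresholds (`good_ms`, `bad_ms`) and the linear interpolation between them are assumptions for the example, not CapPulse's actual normalization.

```python
# Illustrative weights chosen from within the published ranges (assumption).
WEIGHTS = {
    "availability": 0.45,
    "error_rate": 0.30,
    "latency_stability": 0.18,
    "rate_limit": 0.07,
}


def latency_stability(p95_ms: float,
                      good_ms: float = 1000.0,
                      bad_ms: float = 5000.0) -> float:
    """Map p95 latency to a 0..1 score using hypothetical thresholds."""
    if p95_ms <= good_ms:
        return 1.0
    if p95_ms >= bad_ms:
        return 0.0
    # Linear falloff between the two thresholds.
    return 1.0 - (p95_ms - good_ms) / (bad_ms - good_ms)


def reliability_score(availability: float,
                      error_rate: float,
                      rate_limit_rate: float,
                      p95_ms: float) -> float:
    """Return a 0..100 score; rate inputs are fractions between 0 and 1."""
    components = {
        "availability": availability,
        "error_rate": 1.0 - error_rate,          # fewer errors is better
        "latency_stability": latency_stability(p95_ms),
        "rate_limit": 1.0 - rate_limit_rate,     # less throttling is better
    }
    score = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return round(100 * score, 1)
```

Each component is first expressed on a 0 to 1 scale where higher is better, so the weighted sum stays between 0 and 100 regardless of which weights are chosen within the ranges.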
Data freshness and retention
- Rollups update every 5 minutes. Freshness badges show on-time, delayed, or stale data.
- Public dashboards show the last 24 hours of data.
- Paid plans unlock 7d, 30d, and 90d retention windows.
- Incident timelines include evidence snapshots for verification.
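The freshness badges mentioned above could be assigned from the age of the most recent rollup, roughly as sketched here. The 5-minute interval comes from this page; the cutoffs for "delayed" and "stale" are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ROLLUP_INTERVAL = timedelta(minutes=5)  # stated rollup cadence
ON_TIME_MAX = 2 * ROLLUP_INTERVAL       # assumption: up to 10 minutes old
DELAYED_MAX = 6 * ROLLUP_INTERVAL       # assumption: up to 30 minutes old


def freshness_badge(last_rollup: datetime,
                    now: Optional[datetime] = None) -> str:
    """Classify data as on-time, delayed, or stale by rollup age."""
    now = now or datetime.now(timezone.utc)
    age = now - last_rollup
    if age <= ON_TIME_MAX:
        return "on-time"
    if age <= DELAYED_MAX:
        return "delayed"
    return "stale"
```

Passing `now` explicitly keeps the function deterministic for testing; in normal use it defaults to the current UTC time, so `last_rollup` should be timezone-aware.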
What we do not claim
- CapPulse is not an official provider status page.
- We do not guarantee uptime or availability for any provider.
- Metrics are not financial advice and should not be used for trading decisions.
Data quality and confidence
Confidence indicators are based on sample count and regional coverage. Low confidence means fewer samples or limited regions.
High confidence typically requires 5,000+ samples and three or more regions in the selected window.
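A minimal sketch of that rule follows. The thresholds come from the text above; the function name and the two-level result are assumptions, since any intermediate tiers are not described here.

```python
from typing import Iterable

HIGH_MIN_SAMPLES = 5000  # "5,000+ samples"
HIGH_MIN_REGIONS = 3     # "three or more regions"


def confidence_level(sample_count: int, regions: Iterable[str]) -> str:
    """Return "high" when both thresholds are met in the window, else "low"."""
    distinct_regions = len(set(regions))
    if sample_count >= HIGH_MIN_SAMPLES and distinct_regions >= HIGH_MIN_REGIONS:
        return "high"
    return "low"
```

For example, a window with 6,000 samples drawn from only two regions would be reported as low confidence, because both conditions must hold.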
Methodology change log
2025-12-27 - Added freshness badges and 5-minute rollup windows.
2025-12-01 - Added rate-limit classification and endpoint normalization.
2025-10-15 - Expanded probe coverage to APAC and the EU.
2025-08-02 - Introduced reliability scoring framework.
Want deeper data?
Access raw metrics, longer retention, and webhook events through the CapPulse API.