AI Providers

Live reliability, latency, and rate-limit tracking across major AI providers. Public data covers global probes with a 24-hour window.

Last updated: 2025-12-30 07:20:21 UTC | Freshness: Stale | Data window: last 24h

Operational

Degraded

Active Incidents

Median P95 Latency

1010 ms
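
The headline figure can be checked against the eight per-provider P95 values listed on this page. The following is a minimal sketch, assuming the headline is computed across those eight values; the page does not state its method, and the names `P95_MS` and `summarize` are illustrative. With an even number of providers, the conventional median averages the two middle values, while the upper median picks the larger of the two, and only the latter matches the displayed 1010 ms.

```python
from statistics import median, median_high

# P95 latency in ms per provider, copied from the provider cards on this page.
P95_MS = {
    "AWS Bedrock": 1490,
    "Anthropic": 970,
    "Azure OpenAI": 910,
    "Cohere": 1030,
    "Google": 1010,
    "Groq": 740,
    "Mistral": 960,
    "OpenAI": 1320,
}


def summarize(values):
    """Return (conventional median, upper median) of the given latencies."""
    vals = list(values)
    # median() averages the two middle values when the count is even;
    # median_high() returns the larger of the two middle values instead.
    return median(vals), median_high(vals)


if __name__ == "__main__":
    mid, upper = summarize(P95_MS.values())
    print(f"median: {mid} ms, upper median: {upper} ms")
```

Whether the dashboard actually uses an upper median is an assumption; the sketch only shows which convention reproduces the number shown.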

AWS Bedrock

Degraded

Managed model hosting with multiple providers in AWS regions.

Regions: us-east, us-west, eu-west | Endpoints: chat, embeddings

P95 latency: 1490 ms | Error rate: 1.60% | Rate limit: 1.50% | Availability: 99.30%

Confidence: High

Reliability score: 85/100

Sample count: 21,600

Incidents (30d): 0

Anthropic

Operational

Claude family models with strong reasoning performance and long context.

Regions: us-east, eu-central, ap-sg | Endpoints: chat, embeddings

P95 latency: 970 ms | Error rate: 0.50% | Rate limit: 0.70% | Availability: 99.90%

Confidence: High

Reliability score: 91/100

Sample count: 44,600

Incidents (30d): 0

Azure OpenAI

Operational

Azure-hosted OpenAI models with enterprise tenancy controls.

Regions: us-east, eu-north, ap-sg | Endpoints: chat, embeddings, images

P95 latency: 910 ms | Error rate: 0.60% | Rate limit: 0.80% | Availability: 99.80%

Confidence: High

Reliability score: 92/100

Sample count: 40,200

Incidents (30d): 0

Cohere

Operational

Enterprise-focused models optimized for retrieval and RAG workloads.

Regions: us-east, eu-west | Endpoints: chat, embeddings

P95 latency: 1030 ms | Error rate: 0.70% | Rate limit: 0.90% | Availability: 99.70%

Confidence: Medium

Reliability score: 90/100

Sample count: 24,000

Incidents (30d): 0

Google

Operational

Gemini APIs across text, image, and multimodal workflows.

Regions: us-east, eu-west, ap-tokyo | Endpoints: chat, embeddings, images

P95 latency: 1010 ms | Error rate: 0.70% | Rate limit: 0.60% | Availability: 99.80%

Confidence: High

Reliability score: 91/100

Sample count: 39,200

Incidents (30d): 0

Groq

Partial Outage

High-throughput inference provider for open-source models.

Regions: us-west | Endpoints: chat, embeddings

P95 latency: 740 ms | Error rate: 2.40% | Rate limit: 6.20% | Availability: 98.90%

Confidence: Low

Reliability score: 92/100

Sample count: 16,800

Incidents (30d): 0

Mistral

Operational

European-first LLM provider with strong general-purpose models.

Regions: eu-west, us-east | Endpoints: chat, embeddings

P95 latency: 960 ms | Error rate: 0.80% | Rate limit: 1.00% | Availability: 99.60%

Confidence: Medium

Reliability score: 91/100

Sample count: 26,500

Incidents (30d): 0

OpenAI

Degraded

Frontier model provider with GPT-4o, GPT-4.1, and multimodal APIs.

Regions: us-east, us-west, eu-west, ap-sg | Endpoints: chat, embeddings, images, audio

P95 latency: 1320 ms | Error rate: 1.60% | Rate limit: 2.10% | Availability: 99.40%

Confidence: High

Reliability score: 87/100

Sample count: 48,200

Incidents (30d): 0
