First deployment
26
docs/monitoring-dashboard-config.md
Normal file
@@ -0,0 +1,26 @@
# Monitoring Dashboard Configuration
## Objective
Define baseline dashboards and alert thresholds for reliability and freshness checks.
## Dashboard Panels
1. API p95 latency for `/api/news` and `/api/news/latest`
2. API error rate (`5xx`) by route
3. Scheduler success/failure count per hour
4. Feed freshness lag (minutes since latest published item)
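
As an illustration of what panels 1 and 4 plot, the sketch below computes a p95 latency and a freshness lag from raw samples. The function names and the nearest-rank percentile method are assumptions made for this example; the dashboard backend may compute percentiles differently (for instance by interpolation or from histogram buckets).

```python
from datetime import datetime, timezone


def p95(latencies_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies (ms)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.95 * n), which avoids float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def freshness_lag_minutes(latest_published_at, now):
    """Minutes elapsed between the newest published item and `now`."""
    return (now - latest_published_at).total_seconds() / 60.0


if __name__ == "__main__":
    samples = [120, 95, 430, 210, 88, 760, 140, 101, 99, 300]
    print("p95 latency (ms):", p95(samples))
    latest = datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc)
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print("freshness lag (min):", freshness_lag_minutes(latest, now))
```
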
## Alert Thresholds
- API latency alert: p95 > 750 ms for 10 minutes
- API error-rate alert: `5xx` responses > 3% of requests for 5 minutes
- Scheduler alert: 2 consecutive failed fetch cycles
- Freshness alert: latest item older than 120 minutes
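
The thresholds above can be expressed as a small evaluation routine. This is a minimal sketch, not the configuration of any particular monitoring tool: the `Thresholds` class, the alert names, and the reading of "for N minutes" as "each of the last N one-minute samples exceeds the limit" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Values taken from the alert list; field names are hypothetical.
    p95_latency_ms: float = 750.0
    latency_window_min: int = 10
    error_rate_pct: float = 3.0
    error_window_min: int = 5
    scheduler_consecutive_failures: int = 2
    freshness_max_lag_min: float = 120.0


def sustained_breach(per_minute_values, limit, window_min):
    """True if each of the most recent `window_min` samples exceeds `limit`."""
    if len(per_minute_values) < window_min:
        return False
    return all(v > limit for v in per_minute_values[-window_min:])


def consecutive_failures(outcomes, needed):
    """True if the last `needed` fetch cycles all failed (False means failed)."""
    if len(outcomes) < needed:
        return False
    return not any(outcomes[-needed:])


def firing_alerts(p95_per_min, error_pct_per_min, fetch_outcomes, lag_min,
                  t=Thresholds()):
    """Return the names of the alerts that would fire for the given inputs."""
    alerts = []
    if sustained_breach(p95_per_min, t.p95_latency_ms, t.latency_window_min):
        alerts.append("api_latency")
    if sustained_breach(error_pct_per_min, t.error_rate_pct, t.error_window_min):
        alerts.append("api_error_rate")
    if consecutive_failures(fetch_outcomes, t.scheduler_consecutive_failures):
        alerts.append("scheduler")
    if lag_min > t.freshness_max_lag_min:
        alerts.append("freshness")
    return alerts
```

Keeping the limits in one frozen dataclass makes it easy to review them alongside this document and to override them in staging when exercising the triggers.
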
## Test Trigger Plan
- Latency trigger: run a stress test against `/api/news` with 50 concurrent requests in staging.
- Error-rate trigger: simulate an upstream timeout and confirm the `5xx` alert path.
- Scheduler trigger: disable the upstream API key in staging and verify the consecutive-failure alert.
- Freshness trigger: pause the scheduler for more than 120 minutes in staging and confirm the lag alert.
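
The latency trigger could be driven by a short script along these lines. It is a sketch under stated assumptions: the staging URL is a placeholder, `fetch_once` and `run_burst` are hypothetical helper names, and only the Python standard library is used. A single burst lasts seconds, so it would likely need to be repeated in a loop for longer than the 10-minute alert window before the latency alert can fire.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Placeholder host: substitute the real staging base URL.
STAGING_URL = "https://staging.example.invalid/api/news"


def fetch_once(url, timeout=10.0):
    """Issue one GET and return the elapsed time in milliseconds.

    urlopen raises on HTTP error statuses and timeouts; this sketch lets
    those exceptions propagate instead of counting them.
    """
    start = time.perf_counter()
    with urlopen(url, timeout=timeout) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000.0


def run_burst(url, concurrency=50, fetch=fetch_once):
    """Send `concurrency` requests in parallel and return their latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch, [url] * concurrency))


if __name__ == "__main__":
    latencies = run_burst(STAGING_URL)
    print(f"{len(latencies)} requests, slowest {max(latencies):.0f} ms")
```

The `fetch` parameter is injectable so the burst logic can be checked without touching the network, for example `run_burst("x", fetch=lambda url: 1.0)`.
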