Observability
Comprehensive monitoring, metrics, logging, and management for production deployments.
Overview
Svrnty.CQRS ships production-ready observability features: health monitoring, metrics collection, structured logging, and operational management.
Key Features:
- ✅ Health Checks - Monitor stream and consumer health
- ✅ Metrics - OpenTelemetry-compatible telemetry
- ✅ Structured Logging - High-performance logging with correlation
- ✅ Management API - REST endpoints for operations
Quick Start
using OpenTelemetry.Metrics; // AddMeter and AddPrometheusExporter extensions
using Svrnty.CQRS.Events;
using Svrnty.CQRS.Events.Logging;

var builder = WebApplication.CreateBuilder(args);

// Health checks: lag thresholds for the Degraded and Unhealthy states
builder.Services.AddStreamHealthChecks(options =>
{
    options.DegradedConsumerLagThreshold = 1000;
    options.UnhealthyConsumerLagThreshold = 10000;
});

// Metrics
builder.Services.AddEventStreamMetrics();
builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics => metrics
        .AddMeter("Svrnty.CQRS.Events")
        .AddPrometheusExporter());

// Logging (already configured via appsettings.json)

var app = builder.Build();

// Management API
app.MapEventStreamManagementApi();

// Health checks
app.MapHealthChecks("/health");

// Prometheus metrics
app.MapPrometheusScrapingEndpoint("/metrics");

app.Run();
Features
Health Checks
Monitor stream and consumer health:
- Stream Health - Detect unhealthy streams
- Consumer Health - Detect lag and stalled consumers
- ASP.NET Core Integration - Built-in health check support
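To make the lag thresholds concrete, here is a minimal sketch of how a lag-based check could map a consumer's lag onto ASP.NET Core health statuses. It is not the library's implementation: `ConsumerLagHealthCheck` and the `readLag` delegate are hypothetical names, the default thresholds mirror the Quick Start values, and treating the thresholds as inclusive is an assumption.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical illustration only; not part of Svrnty.CQRS.
public sealed class ConsumerLagHealthCheck : IHealthCheck
{
    private readonly Func<long> _readLag;       // hypothetical source of the current lag
    private readonly long _degradedThreshold;
    private readonly long _unhealthyThreshold;

    public ConsumerLagHealthCheck(
        Func<long> readLag,
        long degradedThreshold = 1000,
        long unhealthyThreshold = 10000)
    {
        _readLag = readLag ?? throw new ArgumentNullException(nameof(readLag));
        _degradedThreshold = degradedThreshold;
        _unhealthyThreshold = unhealthyThreshold;
    }

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        long lag = _readLag();

        // Check the stricter threshold first so a large lag is never reported as merely degraded.
        HealthCheckResult result =
            lag >= _unhealthyThreshold ? HealthCheckResult.Unhealthy($"Consumer lag is {lag} events")
            : lag >= _degradedThreshold ? HealthCheckResult.Degraded($"Consumer lag is {lag} events")
            : HealthCheckResult.Healthy($"Consumer lag is {lag} events");

        return Task.FromResult(result);
    }
}
```

With the default thresholds, a lag of 1,500 would report Degraded and a lag of 10,000 or more would report Unhealthy.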
Metrics
OpenTelemetry-compatible metrics:
- Event Counters - Published, consumed, errors
- Processing Metrics - Latency, throughput
- Consumer Metrics - Lag, active consumers
- Prometheus Integration - Export to Prometheus/Grafana
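Because the metrics are OpenTelemetry-compatible, application code can publish its own counters and histograms through the same pipeline using the standard `System.Diagnostics.Metrics` API. The sketch below assumes a hypothetical `OrderMetrics` class; the meter name `MyApp.Orders` and both instrument names are illustrative and are not the names Svrnty.CQRS registers.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical application-level metrics; all names here are illustrative.
public static class OrderMetrics
{
    public const string MeterName = "MyApp.Orders";

    private static readonly Meter AppMeter = new(MeterName, "1.0.0");

    // Monotonic counter: suited to "how many events so far" questions.
    private static readonly Counter<long> Published =
        AppMeter.CreateCounter<long>("myapp.orders.events_published", unit: "{event}");

    // Histogram: suited to latency distributions and percentile queries.
    private static readonly Histogram<double> Latency =
        AppMeter.CreateHistogram<double>("myapp.orders.processing_latency", unit: "ms");

    public static void RecordPublished(string stream) =>
        Published.Add(1, new KeyValuePair<string, object?>("stream", stream));

    public static void RecordLatency(string stream, double milliseconds) =>
        Latency.Record(milliseconds, new KeyValuePair<string, object?>("stream", stream));
}
```

A meter like this is only exported once it is registered, for example by chaining `.AddMeter(OrderMetrics.MeterName)` next to the `.AddMeter("Svrnty.CQRS.Events")` call in the Quick Start.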
Logging
Structured logging with correlation:
- Correlation IDs - Distributed tracing
- Event IDs - Categorized log events
- High Performance - LoggerMessage source generators
- Serilog Integration - Structured logging support
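As an illustration of the LoggerMessage source-generator pattern combined with a correlation scope, consider the sketch below. It uses only the standard `Microsoft.Extensions.Logging` API; the `EventStreamLog` class, the event IDs, and the message templates are hypothetical and do not reflect the log events Svrnty.CQRS actually defines.

```csharp
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

// Hypothetical log definitions; IDs, names, and templates are illustrative.
public static partial class EventStreamLog
{
    // The source generator emits the method body at compile time,
    // which avoids boxing and template parsing on each call.
    [LoggerMessage(EventId = 1001, Level = LogLevel.Information,
        Message = "Published {EventType} to stream {StreamName}")]
    public static partial void EventPublished(this ILogger logger, string eventType, string streamName);

    [LoggerMessage(EventId = 1002, Level = LogLevel.Warning,
        Message = "Consumer {ConsumerId} is lagging by {Lag} events")]
    public static partial void ConsumerLagging(this ILogger logger, string consumerId, long lag);
}

public static class CorrelationExample
{
    public static void Publish(ILogger logger, string correlationId)
    {
        // Scope state is attached to every entry written inside the block
        // by logging providers that support scopes.
        var scopeState = new Dictionary<string, object> { ["CorrelationId"] = correlationId };
        using (logger.BeginScope(scopeState))
        {
            logger.EventPublished("OrderPlaced", "orders");
        }
    }
}
```

Numeric event IDs make it possible to filter or alert on a category of log entries without matching on message text.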
Management API
REST endpoints for operations:
- Stream Operations - List, query streams
- Subscription Operations - Query subscriptions
- Consumer Operations - Monitor consumers
- Offset Management - Reset consumer positions
Monitoring Dashboard
Grafana Dashboard
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards
data:
  svrnty-cqrs.json: |
    {
      "panels": [
        {
          "title": "Events Per Second",
          "targets": [
            {
              "expr": "rate(svrnty_cqrs_events_published[1m])"
            }
          ]
        },
        {
          "title": "Consumer Lag",
          "targets": [
            {
              "expr": "svrnty_cqrs_events_consumer_lag"
            }
          ]
        },
        {
          "title": "Processing Latency (P95)",
          "targets": [
            {
              "expr": "histogram_quantile(0.95, sum by (le) (rate(svrnty_cqrs_events_processing_latency_bucket[5m])))"
            }
          ]
        }
      ]
    }
Production Checklist
✅ DO
- Configure health checks
- Export metrics to monitoring system
- Set up structured logging
- Monitor consumer lag
- Set up alerts for unhealthy conditions
- Use correlation IDs
- Track error rates
- Monitor processing latency
❌ DON'T
- Don't deploy without health checks
- Don't ignore consumer lag warnings
- Don't skip structured logging
- Don't forget to export metrics
- Don't ignore stale consumer alerts