# Observability

Comprehensive monitoring, metrics, logging, and management for production deployments.

## Overview

Svrnty.CQRS provides production-ready observability features for monitoring health, collecting metrics, structured logging, and operational management.

**Key Features:**

- ✅ **Health Checks** - Monitor stream and consumer health
- ✅ **Metrics** - OpenTelemetry-compatible telemetry
- ✅ **Structured Logging** - High-performance logging with correlation
- ✅ **Management API** - REST endpoints for operations

## Quick Start

```csharp
using OpenTelemetry.Metrics; // AddMeter and AddPrometheusExporter extension methods
using Svrnty.CQRS.Events;
using Svrnty.CQRS.Events.Logging;

var builder = WebApplication.CreateBuilder(args);

// Health checks: consumer lag thresholds, in events
builder.Services.AddStreamHealthChecks(options =>
{
    options.DegradedConsumerLagThreshold = 1000;
    options.UnhealthyConsumerLagThreshold = 10000;
});

// Metrics: register the event stream meter and export it to Prometheus
builder.Services.AddEventStreamMetrics();
builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics => metrics
        .AddMeter("Svrnty.CQRS.Events")
        .AddPrometheusExporter());

// Logging (already configured via appsettings.json)

var app = builder.Build();

// Management API
app.MapEventStreamManagementApi();

// Health checks
app.MapHealthChecks("/health");

// Prometheus metrics
app.MapPrometheusScrapingEndpoint("/metrics");

app.Run();
```

## Features

### [Health Checks](health-checks/)

Monitor stream and consumer health:

- **Stream Health** - Detect unhealthy streams
- **Consumer Health** - Detect lag and stalled consumers
- **ASP.NET Core Integration** - Built-in health check support

### [Metrics](metrics/)

OpenTelemetry-compatible metrics:

- **Event Counters** - Published, consumed, errors
- **Processing Metrics** - Latency, throughput
- **Consumer Metrics** - Lag, active consumers
- **Prometheus Integration** - Export to Prometheus/Grafana

### [Logging](logging/)

Structured logging with correlation:

- **Correlation IDs** - Distributed tracing
- **Event IDs** - Categorized log events
- **High Performance** - LoggerMessage source generators
- **Serilog Integration** - Structured logging support

### [Management API](management-api/)

REST endpoints for operations:

- **Stream Operations** - List, query streams
- **Subscription Operations** - Query subscriptions
- **Consumer Operations** - Monitor consumers
- **Offset Management** - Reset consumer positions

## Monitoring Dashboard

### Grafana Dashboard

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards
data:
  svrnty-cqrs.json: |
    {
      "panels": [
        {
          "title": "Events Per Second",
          "targets": [
            {
              "expr": "rate(svrnty_cqrs_events_published[1m])"
            }
          ]
        },
        {
          "title": "Consumer Lag",
          "targets": [
            {
              "expr": "svrnty_cqrs_events_consumer_lag"
            }
          ]
        },
        {
          "title": "Processing Latency (P95)",
          "targets": [
            {
              "expr": "histogram_quantile(0.95, sum by (le) (rate(svrnty_cqrs_events_processing_latency_bucket[5m])))"
            }
          ]
        }
      ]
    }
```

## Production Checklist

### ✅ DO

- Configure health checks
- Export metrics to a monitoring system
- Set up structured logging
- Monitor consumer lag
- Set up alerts for unhealthy conditions
- Use correlation IDs
- Track error rates
- Monitor processing latency

### ❌ DON'T

- Don't deploy without health checks
- Don't ignore consumer lag warnings
- Don't skip structured logging
- Don't forget to export metrics
- Don't ignore stale consumer alerts

## See Also

- [Event Streaming Overview](../event-streaming/README.md)
- [Best Practices](../best-practices/README.md)
- [Troubleshooting](../troubleshooting/README.md)
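
The checklist item "Set up alerts for unhealthy conditions" could be covered by a Prometheus alerting rule file along the following lines. This is a minimal sketch, not part of Svrnty.CQRS: it assumes the `svrnty_cqrs_events_consumer_lag` metric name used in the Grafana dashboard and reuses the 1000 and 10000 lag thresholds from the Quick Start health check options. The group name, alert names, `for` durations, and `severity` labels are illustrative choices.

```yaml
# Hypothetical Prometheus rule file; names and durations are assumptions.
groups:
  - name: svrnty-cqrs-consumer-lag
    rules:
      # Mirrors DegradedConsumerLagThreshold = 1000 from the Quick Start
      - alert: SvrntyConsumerLagDegraded
        expr: svrnty_cqrs_events_consumer_lag > 1000
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Consumer lag is above the degraded threshold"

      # Mirrors UnhealthyConsumerLagThreshold = 10000 from the Quick Start
      - alert: SvrntyConsumerLagUnhealthy
        expr: svrnty_cqrs_events_consumer_lag > 10000
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Consumer lag is above the unhealthy threshold"
```

Keeping the alert thresholds equal to the health check thresholds means an alert and a `/health` status change should describe the same condition, though the metric name actually exposed by the exporter should be confirmed on the `/metrics` endpoint before relying on these expressions.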