# Svrnty.MCP.Gateway - Module Design

**Document Type:** Architecture Design Document
**Status:** Planned
**Version:** 1.0.0
**Last Updated:** 2025-10-19

---

## Overview

Svrnty.MCP.Gateway is a .NET 8 library that provides proxy and routing infrastructure for MCP traffic. It sits between MCP clients and servers and enables centralized management, authentication, monitoring, and load balancing. This document defines the architecture, components, and design decisions.

### Purpose

- **What**: Gateway/proxy library for routing MCP traffic between clients and servers
- **Why**: Enable centralized management, security, and monitoring of MCP infrastructure
- **How**: Clean Architecture with routing strategies, health monitoring, and transport abstraction

---

## Architecture

### Clean Architecture Layers

```
┌─────────────────────────────────────────────────┐
│  Svrnty.MCP.Gateway.Cli (Executable)            │
│  ┌───────────────────────────────────────────┐  │
│  │  Svrnty.MCP.Gateway.AspNetCore (HTTP)     │  │
│  │  ┌─────────────────────────────────────┐  │  │
│  │  │  Svrnty.MCP.Gateway.Infrastructure  │  │  │
│  │  │  ┌───────────────────────────────┐  │  │  │
│  │  │  │  Svrnty.MCP.Gateway.Core      │  │  │  │
│  │  │  │  - IGatewayRouter             │  │  │  │
│  │  │  │  - IRoutingStrategy           │  │  │  │
│  │  │  │  - IAuthProvider              │  │  │  │
│  │  │  │  - ICircuitBreaker            │  │  │  │
│  │  │  │  - Models (no dependencies)   │  │  │  │
│  │  │  └───────────────────────────────┘  │  │  │
│  │  └─────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
```

### Layer Responsibilities

| Layer | Purpose | Dependencies |
|-------|---------|--------------|
| **Core** | Abstractions and models | None |
| **Infrastructure** | Routing, auth, circuit breakers | Core, System.Text.Json |
| **AspNetCore** | HTTP endpoints and DI | Core, Infrastructure, ASP.NET Core |
| **Cli** | Management CLI | All layers |

---

## Core Components

### IGatewayRouter Interface

Primary interface for gateway routing operations:

```csharp
// Note: McpResponse and ServerInfo are assumed type names; this document
// does not define them. ServerInfo is expected to expose Id and IsHealthy.
public interface IGatewayRouter
{
    // Request Routing
    Task<McpResponse> RouteRequestAsync(
        McpRequest request,
        RoutingContext context,
        CancellationToken ct = default
    );

    // Server Management
    Task<IEnumerable<ServerInfo>> GetRegisteredServersAsync();
    Task RegisterServerAsync(ServerConfig config);
    Task UnregisterServerAsync(string serverId);

    // Health Monitoring
    Task<ServerHealthStatus> GetServerHealthAsync(string serverId);
    Task<IEnumerable<ServerHealthStatus>> GetAllServerHealthAsync();
}
```

### IRoutingStrategy Interface

Defines server selection logic:

```csharp
public interface IRoutingStrategy
{
    // Returns the Id of the server that should handle the request
    string SelectServer(
        RoutingContext context,
        IEnumerable<ServerInfo> availableServers
    );
}

public class RoutingContext
{
    public string? ToolName { get; set; }
    public string? ClientId { get; set; }
    public Dictionary<string, string>? Headers { get; set; }
    public Dictionary<string, object>? Metadata { get; set; } // value type assumed
}
```

### IAuthProvider Interface

Authentication and authorization:

```csharp
public interface IAuthProvider
{
    Task<AuthResult> AuthenticateAsync(
        string? apiKey,
        Dictionary<string, string>? headers,
        CancellationToken ct = default
    );

    // Returns true when the client may call the tool on the server
    Task<bool> AuthorizeAsync(
        string clientId,
        string serverId,
        string toolName,
        CancellationToken ct = default
    );
}

public class AuthResult
{
    public bool IsAuthenticated { get; set; }
    public string? ClientId { get; set; }
    public IEnumerable<string> Roles { get; set; } = [];
    public string? ErrorMessage { get; set; }
}
```

### ICircuitBreaker Interface

Prevents cascading failures:

```csharp
public interface ICircuitBreaker
{
    bool IsOpen(string serverId);
    void RecordSuccess(string serverId);
    void RecordFailure(string serverId);
    void Reset(string serverId);
}
```

---

## Routing Strategies

### Built-In Strategies

#### Round-Robin Strategy

```csharp
public class RoundRobinStrategy : IRoutingStrategy
{
    // Starts at -1 so the first increment selects index 0
    private int _currentIndex = -1;

    public string SelectServer(
        RoutingContext context,
        IEnumerable<ServerInfo> servers)
    {
        var healthyServers = servers.Where(s => s.IsHealthy).ToList();
        if (healthyServers.Count == 0)
        {
            throw new NoHealthyServersException();
        }

        // Cast to uint so the index stays non-negative after the counter overflows
        var next = (uint)Interlocked.Increment(ref _currentIndex);
        var index = (int)(next % (uint)healthyServers.Count);
        return healthyServers[index].Id;
    }
}
```

#### Tool-Based Strategy

```csharp
public class ToolBasedStrategy : IRoutingStrategy
{
    // Maps a tool-name prefix to the Id of the server that handles it
    private readonly Dictionary<string, string> _toolPrefixMappings;

    public ToolBasedStrategy(Dictionary<string, string> toolPrefixMappings)
    {
        _toolPrefixMappings = toolPrefixMappings;
    }

    public string SelectServer(
        RoutingContext context,
        IEnumerable<ServerInfo> servers)
    {
        if (context.ToolName == null)
        {
            throw new InvalidOperationException("ToolName required for tool-based routing");
        }

        foreach (var (prefix, serverId) in _toolPrefixMappings)
        {
            if (context.ToolName.StartsWith(prefix, StringComparison.Ordinal))
            {
                return serverId;
            }
        }

        // Default to first healthy server
        var fallback = servers.FirstOrDefault(s => s.IsHealthy)
            ?? throw new NoHealthyServersException();
        return fallback.Id;
    }
}
```

#### Client-Based Strategy

```csharp
public class ClientBasedStrategy : IRoutingStrategy
{
    // Maps a client Id to the Id of its dedicated server
    private readonly Dictionary<string, string> _clientMappings;

    public ClientBasedStrategy(Dictionary<string, string> clientMappings)
    {
        _clientMappings = clientMappings;
    }

    public string SelectServer(
        RoutingContext context,
        IEnumerable<ServerInfo> servers)
    {
        if (context.ClientId != null &&
            _clientMappings.TryGetValue(context.ClientId, out var serverId))
        {
            return serverId;
        }

        // Default routing: first healthy server
        var fallback = servers.FirstOrDefault(s => s.IsHealthy)
            ?? throw new NoHealthyServersException();
        return fallback.Id;
    }
}
```

---

## Configuration

### Gateway Configuration Model

```csharp
public class GatewayConfig
{
    public string Name { get; set; } = "MCP Gateway";
    public string Version { get; set; } = "1.0.0";
    public string? Description { get; set; }
    public string ListenAddress { get; set; } = "http://localhost:8080";
}

public class ServerConfig
{
    public string Id { get; set; } = string.Empty;
    public string Name { get; set; } = string.Empty;
    public TransportConfig Transport { get; set; } = new();
    public bool Enabled { get; set; } = true;
    public Dictionary<string, object>? Metadata { get; set; } // value type assumed
}

public class TransportConfig
{
    public string Type { get; set; } = "Stdio"; // "Stdio" or "Http"
    public string? Command { get; set; }        // used by the Stdio transport
    public string[]? Args { get; set; }
    public string? BaseUrl { get; set; }        // used by the Http transport
    public Dictionary<string, string>? Headers { get; set; }
}
```

### Routing Configuration

```csharp
public class RoutingConfig
{
    public string Strategy { get; set; } = "RoundRobin"; // "RoundRobin", "ToolBased", "ClientBased"
    public TimeSpan HealthCheckInterval { get; set; } = TimeSpan.FromSeconds(30);
    public Dictionary<string, string>? StrategyConfig { get; set; } // value type assumed
}
```

### Security Configuration

```csharp
public class SecurityConfig
{
    public bool EnableAuthentication { get; set; } = false;
    public string? ApiKeyHeader { get; set; } = "X-MCP-API-Key";
    public RateLimitConfig? RateLimit { get; set; }
}

public class RateLimitConfig
{
    public int RequestsPerMinute { get; set; } = 100;
    public int BurstSize { get; set; } = 20;
}
```

---

## Health Monitoring

### Server Health Check

```csharp
public class ServerHealthStatus
{
    public string ServerId { get; set; } = string.Empty;
    public string ServerName { get; set; } = string.Empty;
    public bool IsHealthy { get; set; }
    public DateTime LastCheck { get; set; }
    public TimeSpan? ResponseTime { get; set; }
    public string? ErrorMessage { get; set; }
}
```

### Health Check Implementation

```csharp
using Microsoft.Extensions.Diagnostics.HealthChecks;

public class McpServerHealthCheck : IHealthCheck
{
    private readonly IGatewayRouter _router;

    public McpServerHealthCheck(IGatewayRouter router) => _router = router;

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken ct = default)
    {
        // Materialize once so the sequence is not enumerated twice
        var statuses = (await _router.GetAllServerHealthAsync()).ToList();
        var healthyCount = statuses.Count(s => s.IsHealthy);
        var totalCount = statuses.Count;

        // An empty server list is reported as unhealthy, not as "all healthy"
        if (totalCount > 0 && healthyCount == totalCount)
        {
            return HealthCheckResult.Healthy(
                $"All {totalCount} servers healthy");
        }
        else if (healthyCount > 0)
        {
            return HealthCheckResult.Degraded(
                $"{healthyCount}/{totalCount} servers healthy");
        }
        else
        {
            return HealthCheckResult.Unhealthy(
                "No healthy servers available");
        }
    }
}
```

---

## Error Handling

### Exception Hierarchy

```csharp
public class GatewayException : Exception
{
    public GatewayException() { }
    public GatewayException(string message) : base(message) { }
}

public class NoHealthyServersException : GatewayException { }

public class ServerNotFoundException : GatewayException
{
    public string ServerId { get; }

    public ServerNotFoundException(string serverId)
        : base($"Server '{serverId}' not found") => ServerId = serverId;
}

public class RoutingException : GatewayException
{
    public RoutingContext Context { get; }

    public RoutingException(string message, RoutingContext context)
        : base(message) => Context = context;
}

public class AuthenticationException : GatewayException { }

public class RateLimitExceededException : GatewayException
{
    public string ClientId { get; }
    public int RequestsPerMinute { get; }

    public RateLimitExceededException(string clientId, int requestsPerMinute)
        : base($"Client '{clientId}' exceeded {requestsPerMinute} requests per minute")
    {
        ClientId = clientId;
        RequestsPerMinute = requestsPerMinute;
    }
}
```

### Circuit Breaker Implementation

```csharp
public class CircuitBreaker : ICircuitBreaker
{
    // Per-server state; named CircuitEntry to avoid clashing with the CircuitState enum
    private sealed class CircuitEntry
    {
        public CircuitState State = CircuitState.Closed;
        public int FailureCount;
        public DateTime LastFailure;
    }

    private readonly ConcurrentDictionary<string, CircuitEntry> _states = new();
    private readonly int _failureThreshold = 5;
    private readonly TimeSpan _timeout = TimeSpan.FromSeconds(30);

    public bool IsOpen(string serverId)
    {
        if (!_states.TryGetValue(serverId, out var entry))
        {
            return false;
        }

        lock (entry)
        {
            if (entry.State == CircuitState.Open &&
                DateTime.UtcNow - entry.LastFailure > _timeout)
            {
                // Transition to half-open so a trial request can pass
                entry.State = CircuitState.HalfOpen;
            }

            return entry.State == CircuitState.Open;
        }
    }

    public void RecordSuccess(string serverId)
    {
        var entry = _states.GetOrAdd(serverId, _ => new CircuitEntry());
        lock (entry)
        {
            entry.FailureCount = 0;
            entry.State = CircuitState.Closed;
        }
    }

    public void RecordFailure(string serverId)
    {
        var entry = _states.GetOrAdd(serverId, _ => new CircuitEntry());
        lock (entry)
        {
            entry.FailureCount++;
            entry.LastFailure = DateTime.UtcNow;

            // The count is not reset on half-open, so a failed trial reopens the circuit
            if (entry.FailureCount >= _failureThreshold)
            {
                entry.State = CircuitState.Open;
            }
        }
    }

    public void Reset(string serverId)
    {
        _states.TryRemove(serverId, out _);
    }
}

enum CircuitState
{
    Closed,
    Open,
    HalfOpen
}
```

---

## Testing Strategy

### Unit Tests

- Test Core abstractions with mocks
- Test routing strategies with mock servers
- Test circuit breaker logic
- Test authentication/authorization

### Integration Tests

- Test actual routing to real MCP servers
- Test health checks
- Test error scenarios (server failures, timeouts)
- Test authentication flows

### Test Coverage Goals

- Core: >90%
- Infrastructure: >80%
- AspNetCore: >70%

---

## Performance Considerations

### Connection Pooling

- Maintain persistent connections to backend servers
- Configurable pool size per server
- Idle connection eviction

### Request Caching

- Cache tool discovery results
- Cache health check results (with TTL)
- Invalidate cache on server changes

### Monitoring

- Track request latency per server
- Track request success/failure rates
- Track circuit breaker state changes
- OpenTelemetry metrics integration

---

## Security

### Input Validation

- Validate all incoming requests
- Sanitize routing context data
- Validate server configuration

### Authentication

- API key authentication
- JWT token support
- Client identity verification

### Authorization

- Role-based access control
- Server-level permissions
- Tool-level permissions

### Rate Limiting

- Per-client rate limiting
- Per-server rate limiting
- Global rate limiting
- Burst protection

---

## Future Enhancements

- [ ] WebSocket transport support
- [ ] Request/response compression
- [ ] Dynamic server registration/discovery
- [ ] A/B testing support
- [ ] Blue/green deployment routing
- [ ] Multi-region routing
- [ ] Request replay for debugging
- [ ] Distributed tracing integration

---

**Document Version:** 1.0.0
**Status:** Planned
**Next Review:** After Phase 1 implementation
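
---

## Appendix: Rate Limiting Sketch

The per-client rate limiting and burst protection listed under Security, together with the `RequestsPerMinute` and `BurstSize` settings in `RateLimitConfig`, could be realized with a token bucket. The following is a minimal sketch under that assumption, not the planned implementation: `TokenBucketRateLimiter`, `TryAcquire`, and the injectable clock are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper, not part of the planned API: one token bucket per client.
public sealed class TokenBucketRateLimiter
{
    private sealed class Bucket
    {
        public double Tokens;
        public DateTime LastRefill;
    }

    private readonly ConcurrentDictionary<string, Bucket> _buckets = new();
    private readonly double _refillPerSecond;
    private readonly int _burstSize;
    private readonly Func<DateTime> _utcNow;

    // utcNow is an assumed test seam; it defaults to the system clock.
    public TokenBucketRateLimiter(
        int requestsPerMinute, int burstSize, Func<DateTime>? utcNow = null)
    {
        if (requestsPerMinute <= 0)
            throw new ArgumentOutOfRangeException(nameof(requestsPerMinute));
        if (burstSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(burstSize));

        _refillPerSecond = requestsPerMinute / 60.0;
        _burstSize = burstSize;
        _utcNow = utcNow ?? (() => DateTime.UtcNow);
    }

    // Returns true if the request may proceed, false if the client is over its limit.
    public bool TryAcquire(string clientId)
    {
        var now = _utcNow();
        var bucket = _buckets.GetOrAdd(
            clientId, _ => new Bucket { Tokens = _burstSize, LastRefill = now });

        lock (bucket)
        {
            // Refill in proportion to elapsed time, capped at the burst size.
            var elapsed = (now - bucket.LastRefill).TotalSeconds;
            if (elapsed > 0)
            {
                bucket.Tokens = Math.Min(
                    _burstSize, bucket.Tokens + elapsed * _refillPerSecond);
                bucket.LastRefill = now;
            }

            if (bucket.Tokens < 1.0)
            {
                return false;
            }

            bucket.Tokens -= 1.0;
            return true;
        }
    }
}
```

In this sketch a caller in the gateway pipeline would call `TryAcquire(clientId)` before routing and throw `RateLimitExceededException` when it returns `false`. Per-server or global limits could reuse the same class by keying the bucket on a server Id or on a constant.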