Research conducted on modern AI coding assistants (Cursor, GitHub Copilot, Cline,
Aider, Windsurf, Replit Agent) to understand architecture patterns, context management,
code editing workflows, and tool use protocols.
Key Decision: Pivoted from building a full CLI (40-50h) to a validation-driven MCP-first
approach (10-15h). Build 5 core CODEX MCP tools that work with ANY coding assistant,
validate adoption over 2-4 weeks, then decide on the full CLI if demand is proven.
Files:
- research/ai-systems/modern-coding-assistants-architecture.md (comprehensive research)
- research/ai-systems/codex-coding-assistant-implementation-plan.md (original CLI plan, preserved)
- research/ai-systems/codex-mcp-tools-implementation-plan.md (approved MCP-first plan)
- ideas/registry.json (updated with approved MCP tools proposal)
Architech Validation: APPROVED with pivot to MCP-first approach
Human Decision: Approved (pragmatic validation-driven development)
Next: Begin Phase 1 implementation (10-15 hours, 5 core MCP tools)
🤖 Generated with CODEX Research System
Co-Authored-By: The Archivist <archivist@codex.svrnty.io>
Co-Authored-By: The Architech <architech@codex.svrnty.io>
Co-Authored-By: Mathias Beaulieu-Duncan <mat@svrnty.io>
OpenHarbor.MCP.Gateway - Module Design
Document Type: Architecture Design Document
Status: Planned
Version: 1.0.0
Last Updated: 2025-10-19
Overview
OpenHarbor.MCP.Gateway is a .NET 8 library that provides proxy and routing infrastructure for MCP traffic, enabling centralized management, authentication, monitoring, and load balancing between MCP clients and servers. This document defines the architecture, components, and design decisions.
Purpose
- What: Gateway/proxy library for routing MCP traffic between clients and servers
- Why: Enable centralized management, security, and monitoring of MCP infrastructure
- How: Clean Architecture with routing strategies, health monitoring, and transport abstraction
Architecture
Clean Architecture Layers
┌─────────────────────────────────────────────────┐
│ OpenHarbor.MCP.Gateway.Cli (Executable) │
│ ┌───────────────────────────────────────────┐ │
│ │ OpenHarbor.MCP.Gateway.AspNetCore (HTTP)│ │
│ │ ┌─────────────────────────────────────┐ │ │
│ │ │ OpenHarbor.MCP.Gateway.Infrastructure│ │ │
│ │ │ ┌───────────────────────────────┐ │ │ │
│ │ │ │ OpenHarbor.MCP.Gateway.Core │ │ │ │
│ │ │ │ - IGatewayRouter │ │ │ │
│ │ │ │ - IRoutingStrategy │ │ │ │
│ │ │ │ - IAuthProvider │ │ │ │
│ │ │ │ - ICircuitBreaker │ │ │ │
│ │ │ │ - Models (no dependencies) │ │ │ │
│ │ │ └───────────────────────────────┘ │ │ │
│ │ └─────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────┘ │
└─────────────────────────────────────────────────┘
Layer Responsibilities
| Layer | Purpose | Dependencies |
|---|---|---|
| Core | Abstractions and models | None |
| Infrastructure | Routing, auth, circuit breakers | Core, System.Text.Json |
| AspNetCore | HTTP endpoints and DI | Core, Infrastructure, ASP.NET Core |
| Cli | Management CLI | All layers |
Core Components
IGatewayRouter Interface
Primary interface for gateway routing operations:
public interface IGatewayRouter
{
    // Request Routing
    Task<McpResponse> RouteRequestAsync(
        McpRequest request,
        RoutingContext context,
        CancellationToken ct = default
    );

    // Server Management
    Task<IEnumerable<ServerInfo>> GetRegisteredServersAsync();
    Task RegisterServerAsync(ServerConfig config);
    Task UnregisterServerAsync(string serverId);

    // Health Monitoring
    Task<ServerHealthStatus> GetServerHealthAsync(string serverId);
    Task<IEnumerable<ServerHealthStatus>> GetAllServerHealthAsync();
}
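To make the routing flow concrete, here is a minimal sketch of how a `RouteRequestAsync` implementation might combine health filtering, circuit-breaker state, and a routing strategy. It is not the planned implementation. `SketchRouter`, the record shapes for `ServerInfo`, `McpRequest`, and `McpResponse`, and the three delegates are all assumptions introduced so the example compiles on its own; this document does not define those types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Usage: server "a" has an open circuit, so the request is forwarded to "b".
var openCircuits = new HashSet<string> { "a" };
var router = new SketchRouter(
    new List<ServerInfo> { new("a", true), new("b", true) },
    selectServer: (context, candidates) => candidates[0].Id,
    isCircuitOpen: openCircuits.Contains,
    forward: (server, request, ct) =>
        Task.FromResult(new McpResponse(server.Id, $"handled {request.Method}")));

var response = await router.RouteRequestAsync(
    new McpRequest("tools/call"), new RoutingContext { ToolName = "search" });
Console.WriteLine($"{response.ServerId}: {response.Body}"); // prints "b: handled tools/call"

// Hypothetical shapes; the design document does not define these types.
public record ServerInfo(string Id, bool IsHealthy);
public record McpRequest(string Method);
public record McpResponse(string ServerId, string Body);

public class RoutingContext
{
    public string? ToolName { get; set; }
    public string? ClientId { get; set; }
}

public class SketchRouter
{
    private readonly IReadOnlyList<ServerInfo> _servers;
    private readonly Func<RoutingContext, IReadOnlyList<ServerInfo>, string> _selectServer;
    private readonly Func<string, bool> _isCircuitOpen;
    private readonly Func<ServerInfo, McpRequest, CancellationToken, Task<McpResponse>> _forward;

    public SketchRouter(
        IReadOnlyList<ServerInfo> servers,
        Func<RoutingContext, IReadOnlyList<ServerInfo>, string> selectServer,
        Func<string, bool> isCircuitOpen,
        Func<ServerInfo, McpRequest, CancellationToken, Task<McpResponse>> forward)
    {
        _servers = servers;
        _selectServer = selectServer;
        _isCircuitOpen = isCircuitOpen;
        _forward = forward;
    }

    public async Task<McpResponse> RouteRequestAsync(
        McpRequest request, RoutingContext context, CancellationToken ct = default)
    {
        // 1. Only healthy servers whose circuit is not open are candidates.
        var candidates = _servers
            .Where(s => s.IsHealthy && !_isCircuitOpen(s.Id))
            .ToList();
        if (candidates.Count == 0)
        {
            throw new InvalidOperationException("No healthy servers available");
        }

        // 2. The strategy picks a server ID from the candidates.
        var serverId = _selectServer(context, candidates);
        var target = candidates.FirstOrDefault(s => s.Id == serverId)
            ?? throw new InvalidOperationException($"Strategy selected unknown server '{serverId}'");

        // 3. Forward the request over the selected server's transport.
        return await _forward(target, request, ct);
    }
}
```

Passing the strategy, breaker check, and transport as delegates keeps the sketch short; a real router would more likely take `IRoutingStrategy` and `ICircuitBreaker` instances through its constructor.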
IRoutingStrategy Interface
Defines server selection logic:
public interface IRoutingStrategy
{
    string SelectServer(
        RoutingContext context,
        IEnumerable<ServerInfo> availableServers
    );
}

public class RoutingContext
{
    public string? ToolName { get; set; }
    public string? ClientId { get; set; }
    public Dictionary<string, string>? Headers { get; set; }
    public Dictionary<string, object>? Metadata { get; set; }
}
IAuthProvider Interface
Authentication and authorization:
public interface IAuthProvider
{
    Task<AuthResult> AuthenticateAsync(
        string? apiKey,
        Dictionary<string, string>? headers,
        CancellationToken ct = default
    );

    Task<bool> AuthorizeAsync(
        string clientId,
        string serverId,
        string toolName,
        CancellationToken ct = default
    );
}

public class AuthResult
{
    public bool IsAuthenticated { get; set; }
    public string? ClientId { get; set; }
    public IEnumerable<string> Roles { get; set; } = [];
    public string? ErrorMessage { get; set; }
}
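As an illustration of the two-step contract (authenticate the caller, then authorize each tool call), the following is a minimal in-memory sketch. `InMemoryApiKeyAuthProvider`, `ClientRecord`, and the prefix-based permission rule are assumptions made for the example, not part of the design. The interface and `AuthResult` are repeated only so the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Usage: one client that may call tools whose names start with "search_".
var provider = new InMemoryApiKeyAuthProvider(new Dictionary<string, ClientRecord>
{
    ["key-123"] = new("web-client", new[] { "reader" }, new[] { "search_" }),
});

var auth = await provider.AuthenticateAsync("key-123", headers: null);
var allowed = await provider.AuthorizeAsync(auth.ClientId!, "server-1", "search_documents");
Console.WriteLine($"{auth.IsAuthenticated} {allowed}"); // prints "True True"

public interface IAuthProvider
{
    Task<AuthResult> AuthenticateAsync(
        string? apiKey, Dictionary<string, string>? headers, CancellationToken ct = default);
    Task<bool> AuthorizeAsync(
        string clientId, string serverId, string toolName, CancellationToken ct = default);
}

public class AuthResult
{
    public bool IsAuthenticated { get; set; }
    public string? ClientId { get; set; }
    public IEnumerable<string> Roles { get; set; } = Array.Empty<string>();
    public string? ErrorMessage { get; set; }
}

// Hypothetical record: what the gateway knows about one client.
public record ClientRecord(string ClientId, string[] Roles, string[] AllowedToolPrefixes);

public class InMemoryApiKeyAuthProvider : IAuthProvider
{
    private readonly IReadOnlyDictionary<string, ClientRecord> _byApiKey;
    private readonly Dictionary<string, ClientRecord> _byClientId;

    public InMemoryApiKeyAuthProvider(IReadOnlyDictionary<string, ClientRecord> clientsByApiKey)
    {
        _byApiKey = clientsByApiKey;
        _byClientId = clientsByApiKey.Values.ToDictionary(c => c.ClientId);
    }

    public Task<AuthResult> AuthenticateAsync(
        string? apiKey, Dictionary<string, string>? headers, CancellationToken ct = default)
    {
        if (apiKey != null && _byApiKey.TryGetValue(apiKey, out var client))
        {
            return Task.FromResult(new AuthResult
            {
                IsAuthenticated = true,
                ClientId = client.ClientId,
                Roles = client.Roles,
            });
        }
        return Task.FromResult(new AuthResult { ErrorMessage = "Missing or unknown API key" });
    }

    // Tool-level check only; server-level permissions are left out of this sketch.
    public Task<bool> AuthorizeAsync(
        string clientId, string serverId, string toolName, CancellationToken ct = default)
    {
        var allowed = _byClientId.TryGetValue(clientId, out var client)
            && client.AllowedToolPrefixes.Any(p => toolName.StartsWith(p, StringComparison.Ordinal));
        return Task.FromResult(allowed);
    }
}
```

Keys are held in plain text here for brevity; a production provider would normally store only hashes of the keys.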
ICircuitBreaker Interface
Prevent cascading failures:
public interface ICircuitBreaker
{
    bool IsOpen(string serverId);
    void RecordSuccess(string serverId);
    void RecordFailure(string serverId);
    void Reset(string serverId);
}
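One way a caller might use this interface is to wrap every backend call in a guard that checks the breaker first and reports the outcome afterwards. The sketch below shows that pattern. `CircuitGuard`, `CircuitOpenException`, and the deliberately simple `CountingBreaker` are hypothetical names used only for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Usage: a demo breaker that opens after two failures.
var breaker = new CountingBreaker(failureThreshold: 2);
for (var attempt = 1; attempt <= 3; attempt++)
{
    try
    {
        await CircuitGuard.ExecuteAsync(breaker, "server-1",
            () => Task.FromException<string>(new TimeoutException("backend timed out")));
    }
    catch (Exception ex)
    {
        // Attempts 1 and 2 report TimeoutException; attempt 3 is rejected
        // with CircuitOpenException before the backend is called.
        Console.WriteLine($"attempt {attempt}: {ex.GetType().Name}");
    }
}

public interface ICircuitBreaker
{
    bool IsOpen(string serverId);
    void RecordSuccess(string serverId);
    void RecordFailure(string serverId);
    void Reset(string serverId);
}

// Hypothetical exception for calls rejected while the circuit is open.
public class CircuitOpenException : Exception
{
    public CircuitOpenException(string serverId)
        : base($"Circuit for server '{serverId}' is open") { }
}

public static class CircuitGuard
{
    public static async Task<T> ExecuteAsync<T>(
        ICircuitBreaker breaker, string serverId, Func<Task<T>> call)
    {
        // Fail fast instead of sending traffic to a server that keeps failing.
        if (breaker.IsOpen(serverId))
        {
            throw new CircuitOpenException(serverId);
        }

        try
        {
            var result = await call();
            breaker.RecordSuccess(serverId);
            return result;
        }
        catch
        {
            // Every exception counts as a failure here; a real guard might
            // exclude caller-initiated cancellations.
            breaker.RecordFailure(serverId);
            throw;
        }
    }
}

// Demo-only breaker with no timeout or half-open state; not thread-safe.
public class CountingBreaker : ICircuitBreaker
{
    private readonly Dictionary<string, int> _failures = new();
    private readonly int _failureThreshold;

    public CountingBreaker(int failureThreshold) => _failureThreshold = failureThreshold;

    public bool IsOpen(string serverId) =>
        _failures.GetValueOrDefault(serverId) >= _failureThreshold;
    public void RecordSuccess(string serverId) => _failures[serverId] = 0;
    public void RecordFailure(string serverId) =>
        _failures[serverId] = _failures.GetValueOrDefault(serverId) + 1;
    public void Reset(string serverId) => _failures.Remove(serverId);
}
```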
Routing Strategies
Built-In Strategies
Round-Robin Strategy
public class RoundRobinStrategy : IRoutingStrategy
{
    // Starts at -1 so the first Interlocked.Increment yields index 0.
    private int _currentIndex = -1;

    public string SelectServer(
        RoutingContext context,
        IEnumerable<ServerInfo> servers)
    {
        var healthyServers = servers.Where(s => s.IsHealthy).ToList();
        if (healthyServers.Count == 0)
        {
            throw new NoHealthyServersException();
        }

        // Cast to uint so the index stays non-negative after the counter
        // wraps past int.MaxValue.
        var next = (uint)Interlocked.Increment(ref _currentIndex);
        var index = (int)(next % (uint)healthyServers.Count);
        return healthyServers[index].Id;
    }
}
Tool-Based Strategy
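public class ToolBasedStrategy : IRoutingStrategy
{
    // Maps a tool-name prefix to the ID of the server that handles it.
    private readonly Dictionary<string, string> _toolPrefixMappings;

    public ToolBasedStrategy(Dictionary<string, string> toolPrefixMappings)
    {
        _toolPrefixMappings = toolPrefixMappings;
    }

    public string SelectServer(
        RoutingContext context,
        IEnumerable<ServerInfo> servers)
    {
        if (context.ToolName == null)
        {
            throw new InvalidOperationException("ToolName required for tool-based routing");
        }

        // Dictionary enumeration order is unspecified, so prefixes should not overlap.
        foreach (var (prefix, serverId) in _toolPrefixMappings)
        {
            if (context.ToolName.StartsWith(prefix, StringComparison.Ordinal))
            {
                return serverId;
            }
        }

        // Default to first healthy server
        var fallbackId = servers.Where(s => s.IsHealthy).Select(s => s.Id).FirstOrDefault();
        return fallbackId ?? throw new NoHealthyServersException();
    }
}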
Client-Based Strategy
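public class ClientBasedStrategy : IRoutingStrategy
{
    // Maps a client ID to the ID of the server dedicated to that client.
    private readonly Dictionary<string, string> _clientMappings;

    public ClientBasedStrategy(Dictionary<string, string> clientMappings)
    {
        _clientMappings = clientMappings;
    }

    public string SelectServer(
        RoutingContext context,
        IEnumerable<ServerInfo> servers)
    {
        if (context.ClientId != null &&
            _clientMappings.TryGetValue(context.ClientId, out var serverId))
        {
            return serverId;
        }

        // Default routing: first healthy server
        var fallbackId = servers.Where(s => s.IsHealthy).Select(s => s.Id).FirstOrDefault();
        return fallbackId ?? throw new NoHealthyServersException();
    }
}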
Configuration
Gateway Configuration Model
public class GatewayConfig
{
    public string Name { get; set; } = "MCP Gateway";
    public string Version { get; set; } = "1.0.0";
    public string? Description { get; set; }
    public string ListenAddress { get; set; } = "http://localhost:8080";
}

public class ServerConfig
{
    public string Id { get; set; } = string.Empty;
    public string Name { get; set; } = string.Empty;
    public TransportConfig Transport { get; set; } = new();
    public bool Enabled { get; set; } = true;
    public Dictionary<string, string>? Metadata { get; set; }
}

public class TransportConfig
{
    public string Type { get; set; } = "Stdio"; // "Stdio" or "Http"
    public string? Command { get; set; }
    public string[]? Args { get; set; }
    public string? BaseUrl { get; set; }
    public Dictionary<string, string>? Headers { get; set; }
}
Routing Configuration
public class RoutingConfig
{
    public string Strategy { get; set; } = "RoundRobin"; // "RoundRobin", "ToolBased", "ClientBased"
    public TimeSpan HealthCheckInterval { get; set; } = TimeSpan.FromSeconds(30);
    public Dictionary<string, string>? StrategyConfig { get; set; }
}
Security Configuration
public class SecurityConfig
{
    public bool EnableAuthentication { get; set; } = false;
    public string? ApiKeyHeader { get; set; } = "X-MCP-API-Key";
    public RateLimitConfig? RateLimit { get; set; }
}

public class RateLimitConfig
{
    public int RequestsPerMinute { get; set; } = 100;
    public int BurstSize { get; set; } = 20;
}
Health Monitoring
Server Health Check
public class ServerHealthStatus
{
    public string ServerId { get; set; } = string.Empty;
    public string ServerName { get; set; } = string.Empty;
    public bool IsHealthy { get; set; }
    public DateTime LastCheck { get; set; }
    public TimeSpan? ResponseTime { get; set; }
    public string? ErrorMessage { get; set; }
}
Health Check Implementation
public class McpServerHealthCheck : IHealthCheck
{
    private readonly IGatewayRouter _router;

    public McpServerHealthCheck(IGatewayRouter router)
    {
        _router = router;
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken ct = default)
    {
        // Materialize once so the sequence is not enumerated twice.
        var statuses = (await _router.GetAllServerHealthAsync()).ToList();
        var healthyCount = statuses.Count(s => s.IsHealthy);
        var totalCount = statuses.Count;

        // With zero registered servers this falls through to Unhealthy
        // instead of reporting "All 0 servers healthy".
        if (totalCount > 0 && healthyCount == totalCount)
        {
            return HealthCheckResult.Healthy(
                $"All {totalCount} servers healthy");
        }
        else if (healthyCount > 0)
        {
            return HealthCheckResult.Degraded(
                $"{healthyCount}/{totalCount} servers healthy");
        }
        else
        {
            return HealthCheckResult.Unhealthy(
                "No healthy servers available");
        }
    }
}
Error Handling
Exception Hierarchy
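public class GatewayException : Exception
{
    public GatewayException() { }
    public GatewayException(string message) : base(message) { }
}
public class NoHealthyServersException : GatewayException { }
public class ServerNotFoundException : GatewayException
{
    public ServerNotFoundException(string serverId)
        : base($"Server '{serverId}' not found") => ServerId = serverId;
    public string ServerId { get; }
}
public class RoutingException : GatewayException
{
    public RoutingException(string message, RoutingContext context)
        : base(message) => Context = context;
    public RoutingContext Context { get; }
}
// Shares its name with System.Security.Authentication.AuthenticationException;
// qualify the type wherever that namespace is imported.
public class AuthenticationException : GatewayException { }
public class RateLimitExceededException : GatewayException
{
    public RateLimitExceededException(string clientId, int requestsPerMinute)
        : base($"Client '{clientId}' exceeded {requestsPerMinute} requests per minute")
    {
        ClientId = clientId;
        RequestsPerMinute = requestsPerMinute;
    }
    public string ClientId { get; }
    public int RequestsPerMinute { get; }
}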
Circuit Breaker Implementation
public class CircuitBreaker : ICircuitBreaker
{
    private readonly ConcurrentDictionary<string, CircuitEntry> _entries = new();
    private readonly int _failureThreshold = 5;
    private readonly TimeSpan _timeout = TimeSpan.FromSeconds(30);

    public bool IsOpen(string serverId)
    {
        if (!_entries.TryGetValue(serverId, out var entry))
        {
            return false;
        }

        // Each entry is mutated only while holding its own lock.
        lock (entry)
        {
            if (entry.State == CircuitState.Open &&
                DateTime.UtcNow - entry.LastFailure > _timeout)
            {
                // Transition to half-open
                entry.State = CircuitState.HalfOpen;
            }
            return entry.State == CircuitState.Open;
        }
    }

    public void RecordSuccess(string serverId)
    {
        var entry = _entries.GetOrAdd(serverId, _ => new CircuitEntry());
        lock (entry)
        {
            entry.FailureCount = 0;
            entry.State = CircuitState.Closed;
        }
    }

    public void RecordFailure(string serverId)
    {
        var entry = _entries.GetOrAdd(serverId, _ => new CircuitEntry());
        lock (entry)
        {
            entry.FailureCount++;
            entry.LastFailure = DateTime.UtcNow;
            if (entry.FailureCount >= _failureThreshold)
            {
                entry.State = CircuitState.Open;
            }
        }
    }

    public void Reset(string serverId)
    {
        _entries.TryRemove(serverId, out _);
    }
}

// Mutable per-server record, kept separate from the CircuitState enum.
class CircuitEntry
{
    public CircuitState State { get; set; } = CircuitState.Closed;
    public int FailureCount { get; set; }
    public DateTime LastFailure { get; set; }
}

enum CircuitState
{
    Closed,
    Open,
    HalfOpen
}
Testing Strategy
Unit Tests
- Test Core abstractions with mocks
- Test routing strategies with mock servers
- Test circuit breaker logic
- Test authentication/authorization
Integration Tests
- Test actual routing to real MCP servers
- Test health checks
- Test error scenarios (server failures, timeouts)
- Test authentication flows
Test Coverage Goals
- Core: >90%
- Infrastructure: >80%
- AspNetCore: >70%
Performance Considerations
Connection Pooling
- Maintain persistent connections to backend servers
- Configurable pool size per server
- Idle connection eviction
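For HTTP backends, these three points map fairly directly onto settings that `SocketsHttpHandler` already exposes. The sketch below assumes one long-lived `HttpClient` per backend server. `BackendClientPool` and the specific limits are illustrative choices, and stdio transports would need a different mechanism.

```csharp
using System;
using System.Collections.Concurrent;
using System.Net.Http;

// Usage: repeated lookups for the same server reuse one client and its pool.
var pool = new BackendClientPool(maxConnectionsPerServer: 8, idleTimeout: TimeSpan.FromMinutes(2));
var first = pool.GetClient("server-1", new Uri("http://localhost:5001"));
var second = pool.GetClient("server-1", new Uri("http://localhost:5001"));
Console.WriteLine(ReferenceEquals(first, second)); // prints "True"

// Hypothetical helper: one HttpClient (and connection pool) per backend server.
public class BackendClientPool
{
    private readonly ConcurrentDictionary<string, HttpClient> _clients = new();
    private readonly int _maxConnectionsPerServer;
    private readonly TimeSpan _idleTimeout;

    public BackendClientPool(int maxConnectionsPerServer, TimeSpan idleTimeout)
    {
        _maxConnectionsPerServer = maxConnectionsPerServer;
        _idleTimeout = idleTimeout;
    }

    // Under contention the factory may run more than once; only one client is kept.
    public HttpClient GetClient(string serverId, Uri baseUrl) =>
        _clients.GetOrAdd(serverId, _ => new HttpClient(CreateHandler()) { BaseAddress = baseUrl });

    public SocketsHttpHandler CreateHandler() => new()
    {
        // Configurable pool size per server.
        MaxConnectionsPerServer = _maxConnectionsPerServer,
        // Idle connection eviction.
        PooledConnectionIdleTimeout = _idleTimeout,
        // Recycle long-lived connections so DNS changes are eventually picked up.
        PooledConnectionLifetime = TimeSpan.FromMinutes(10),
    };
}
```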
Request Caching
- Cache tool discovery results
- Cache health check results (with TTL)
- Invalidate cache on server changes
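A small time-to-live cache is enough to cover all three bullets. The following sketch assumes tool discovery results can be represented as a `string[]` per server. `TtlCache` and its injectable clock are hypothetical names; the clock exists only to make expiry easy to exercise.

```csharp
using System;
using System.Collections.Concurrent;

// Usage with a controllable clock so expiry can be demonstrated.
var now = DateTime.UtcNow;
var cache = new TtlCache<string, string[]>(TimeSpan.FromSeconds(30), () => now);
var loads = 0;
string[] Load() { loads++; return new[] { "search_documents" }; }

cache.GetOrAdd("server-1", Load);   // miss: loads from the backend
cache.GetOrAdd("server-1", Load);   // hit: served from the cache
now = now.AddSeconds(31);
cache.GetOrAdd("server-1", Load);   // expired: loads again
cache.Invalidate("server-1");       // call when a server is registered or removed
Console.WriteLine(loads);           // prints "2"

// Hypothetical cache: entries expire a fixed TTL after they are stored.
public class TtlCache<TKey, TValue> where TKey : notnull
{
    private readonly ConcurrentDictionary<TKey, (TValue Value, DateTime ExpiresAt)> _entries = new();
    private readonly TimeSpan _ttl;
    private readonly Func<DateTime> _utcNow;

    public TtlCache(TimeSpan ttl, Func<DateTime> utcNow)
    {
        _ttl = ttl;
        _utcNow = utcNow;
    }

    public TValue GetOrAdd(TKey key, Func<TValue> load)
    {
        var now = _utcNow();
        if (_entries.TryGetValue(key, out var entry) && entry.ExpiresAt > now)
        {
            return entry.Value;
        }

        // Concurrent callers may both load; the last write wins.
        var value = load();
        _entries[key] = (value, now + _ttl);
        return value;
    }

    public void Invalidate(TKey key) => _entries.TryRemove(key, out _);
}
```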
Monitoring
- Track request latency per server
- Track request success/failure rates
- Track circuit breaker state changes
- OpenTelemetry metrics integration
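These measurements could be recorded through `System.Diagnostics.Metrics`, the built-in .NET API whose meters an OpenTelemetry exporter can subscribe to. The sketch below is one possible shape: the `GatewayMetrics` class, the meter name, the instrument names, and the tag keys are all assumptions, not names fixed by this design.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;

// Usage: time a request and record its outcome.
var metrics = new GatewayMetrics();
var stopwatch = Stopwatch.StartNew();
// ... forward the request to the backend here ...
metrics.RecordRequest("server-1", success: true, stopwatch.Elapsed);
metrics.RecordCircuitStateChange("server-1", "Open");

public sealed class GatewayMetrics : IDisposable
{
    public const string MeterName = "OpenHarbor.MCP.Gateway"; // assumed meter name

    private readonly Meter _meter = new(MeterName);
    private readonly Counter<long> _requests;
    private readonly Histogram<double> _durationMs;
    private readonly Counter<long> _circuitChanges;

    public GatewayMetrics()
    {
        // Instrument names are illustrative.
        _requests = _meter.CreateCounter<long>("gateway.requests");
        _durationMs = _meter.CreateHistogram<double>("gateway.request.duration", unit: "ms");
        _circuitChanges = _meter.CreateCounter<long>("gateway.circuit_breaker.state_changes");
    }

    // Covers both per-server latency and success/failure rates.
    public void RecordRequest(string serverId, bool success, TimeSpan elapsed)
    {
        var server = new KeyValuePair<string, object?>("server.id", serverId);
        var outcome = new KeyValuePair<string, object?>("outcome", success ? "success" : "failure");
        _requests.Add(1, server, outcome);
        _durationMs.Record(elapsed.TotalMilliseconds, server);
    }

    public void RecordCircuitStateChange(string serverId, string newState)
    {
        _circuitChanges.Add(1,
            new KeyValuePair<string, object?>("server.id", serverId),
            new KeyValuePair<string, object?>("state", newState));
    }

    public void Dispose() => _meter.Dispose();
}
```

Tagging by server ID keeps per-server latency and failure rates queryable without creating a separate instrument for each server.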
Security
Input Validation
- Validate all incoming requests
- Sanitize routing context data
- Validate server configuration
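For the last bullet, validation can be a pure function over the `ServerConfig` and `TransportConfig` models from the Configuration section. The sketch below repeats trimmed copies of those models so it stands alone. `ServerConfigValidator` and the specific rules (a command for stdio, an absolute http(s) URL for HTTP) are assumptions about what "valid" might mean.

```csharp
using System;
using System.Collections.Generic;

// Usage: an Http transport with a malformed BaseUrl yields one error.
var errors = ServerConfigValidator.Validate(new ServerConfig
{
    Id = "docs",
    Name = "Docs server",
    Transport = new TransportConfig { Type = "Http", BaseUrl = "not-a-url" },
});
foreach (var error in errors)
{
    Console.WriteLine(error);
}

// Trimmed copies of the configuration models.
public class TransportConfig
{
    public string Type { get; set; } = "Stdio"; // "Stdio" or "Http"
    public string? Command { get; set; }
    public string? BaseUrl { get; set; }
}

public class ServerConfig
{
    public string Id { get; set; } = string.Empty;
    public string Name { get; set; } = string.Empty;
    public TransportConfig Transport { get; set; } = new();
}

// Hypothetical validator: returns every problem instead of throwing on the first.
public static class ServerConfigValidator
{
    public static IReadOnlyList<string> Validate(ServerConfig config)
    {
        var errors = new List<string>();
        if (string.IsNullOrWhiteSpace(config.Id))
        {
            errors.Add("Id is required");
        }

        switch (config.Transport.Type)
        {
            case "Stdio":
                if (string.IsNullOrWhiteSpace(config.Transport.Command))
                {
                    errors.Add("Stdio transport requires Command");
                }
                break;
            case "Http":
                if (!Uri.TryCreate(config.Transport.BaseUrl, UriKind.Absolute, out var uri)
                    || (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
                {
                    errors.Add("Http transport requires an absolute http(s) BaseUrl");
                }
                break;
            default:
                errors.Add($"Unknown transport type '{config.Transport.Type}'");
                break;
        }
        return errors;
    }
}
```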
Authentication
- API key authentication
- JWT token support
- Client identity verification
Authorization
- Role-based access control
- Server-level permissions
- Tool-level permissions
Rate Limiting
- Per-client rate limiting
- Per-server rate limiting
- Global rate limiting
- Burst protection
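Per-client limiting with burst protection is commonly implemented as a token bucket, and the `RequestsPerMinute` and `BurstSize` settings in `RateLimitConfig` fit that model if `BurstSize` is read as the bucket capacity. That reading is an assumption, as are the `ClientRateLimiter` name and the injectable clock in this sketch.

```csharp
using System;
using System.Collections.Concurrent;

// Usage with a controllable clock: 60 requests/minute refills one token per second.
var now = DateTime.UtcNow;
var limiter = new ClientRateLimiter(requestsPerMinute: 60, burstSize: 2, () => now);

Console.WriteLine(limiter.TryAcquire("client-a")); // True
Console.WriteLine(limiter.TryAcquire("client-a")); // True
Console.WriteLine(limiter.TryAcquire("client-a")); // False: burst used up
now = now.AddSeconds(1);
Console.WriteLine(limiter.TryAcquire("client-a")); // True: one token refilled

// Hypothetical token-bucket limiter keyed by client ID.
public class ClientRateLimiter
{
    private sealed class Bucket
    {
        public double Tokens;
        public DateTime LastRefill;
    }

    private readonly ConcurrentDictionary<string, Bucket> _buckets = new();
    private readonly double _tokensPerSecond;
    private readonly int _burstSize;
    private readonly Func<DateTime> _utcNow;

    public ClientRateLimiter(int requestsPerMinute, int burstSize, Func<DateTime> utcNow)
    {
        _tokensPerSecond = requestsPerMinute / 60.0;
        _burstSize = burstSize;
        _utcNow = utcNow;
    }

    public bool TryAcquire(string clientId)
    {
        var now = _utcNow();
        // A new client starts with a full bucket.
        var bucket = _buckets.GetOrAdd(
            clientId, _ => new Bucket { Tokens = _burstSize, LastRefill = now });

        lock (bucket)
        {
            // Refill in proportion to elapsed time, capped at the burst size.
            var elapsed = Math.Max(0, (now - bucket.LastRefill).TotalSeconds);
            bucket.Tokens = Math.Min(_burstSize, bucket.Tokens + elapsed * _tokensPerSecond);
            bucket.LastRefill = now;

            if (bucket.Tokens < 1)
            {
                return false;
            }
            bucket.Tokens -= 1;
            return true;
        }
    }
}
```

The same class could serve per-server or global limits by keying on a server ID or a single constant key.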
Future Enhancements
- WebSocket transport support
- Request/response compression
- Dynamic server registration/discovery
- A/B testing support
- Blue/green deployment routing
- Multi-region routing
- Request replay for debugging
- Distributed tracing integration
Document Version: 1.0.0
Status: Planned
Next Review: After Phase 1 implementation