606 lines
20 KiB
Markdown
606 lines
20 KiB
Markdown
# Phase 2.4 - Retention Policies Implementation Plan
|
|
|
|
**Status**: ✅ Complete
|
|
**Completed**: 2025-12-10
|
|
**Dependencies**: Phase 2.2 (PostgreSQL Storage) ✅, Phase 2.3 (Consumer Groups) ✅
|
|
**Target**: Automatic retention policies with time-based and size-based cleanup for persistent streams
|
|
|
|
**Note**: Table partitioning (Phase 2.4.4) has been deferred to a future phase as it requires data migration and is not critical for initial release.
|
|
|
|
## Overview
|
|
|
|
Phase 2.4 adds automatic retention policies to manage event stream lifecycle and prevent unbounded growth. This enables:
|
|
- **Time-based retention**: Automatically delete events older than a specified duration (e.g., 30 days)
|
|
- **Size-based retention**: Keep only the most recent N events per stream
|
|
- **Automatic cleanup**: Background service to enforce retention policies
|
|
- **Table partitioning**: PostgreSQL partitioning for better performance with large volumes
|
|
- **Per-stream configuration**: Different retention policies for different streams
|
|
|
|
## Background
|
|
|
|
Currently (Phase 2.3), persistent streams grow indefinitely. While this is correct for pure event sourcing, many use cases require automatic cleanup:
|
|
- **Compliance**: GDPR and data retention regulations
|
|
- **Cost management**: Storage costs for high-volume streams
|
|
- **Performance**: Query performance degrades with very large tables
|
|
- **Operational simplicity**: Automatic maintenance without manual intervention
|
|
|
|
**Key Concepts:**
|
|
- **Retention Policy**: Rules defining how long events are kept
|
|
- **Time-based Retention**: Delete events older than X days/hours
|
|
- **Size-based Retention**: Keep only the last N events per stream
|
|
- **Table Partitioning**: Split large tables into smaller partitions by time
|
|
- **Cleanup Window**: Time window when cleanup runs (to avoid peak hours)
|
|
|
|
## Goals
|
|
|
|
1. **Retention Policy API**: Define and store retention policies per stream
|
|
2. **Time-based Cleanup**: Automatically delete events older than configured duration
|
|
3. **Size-based Cleanup**: Automatically trim streams to maximum event count
|
|
4. **Table Partitioning**: Partition event_store table by month for performance
|
|
5. **Background Service**: Scheduled cleanup service respecting configured policies
|
|
6. **Monitoring**: Metrics for cleanup operations and retained event counts
|
|
|
|
## Non-Goals (Deferred to Future Phases)
|
|
|
|
- Custom retention logic (Phase 3.x)
|
|
- Event archiving to cold storage (Phase 3.x)
|
|
- Retention policies for ephemeral streams (they're already auto-deleted)
|
|
- Cross-database retention coordination (PostgreSQL only for now)
|
|
|
|
## Architecture
|
|
|
|
### 1. New Interface: `IRetentionPolicy`
|
|
|
|
```csharp
|
|
namespace Svrnty.CQRS.Events.Abstractions;
|
|
|
|
public interface IRetentionPolicy
|
|
{
|
|
/// <summary>
|
|
/// Stream name this policy applies to. Use "*" for default policy.
|
|
/// </summary>
|
|
string StreamName { get; }
|
|
|
|
/// <summary>
|
|
/// Maximum age for events (null = no time-based retention)
|
|
/// </summary>
|
|
TimeSpan? MaxAge { get; }
|
|
|
|
/// <summary>
|
|
/// Maximum number of events to retain (null = no size-based retention)
|
|
/// </summary>
|
|
long? MaxEventCount { get; }
|
|
|
|
/// <summary>
|
|
/// Whether this policy is enabled
|
|
/// </summary>
|
|
bool Enabled { get; }
|
|
}
|
|
|
|
public record RetentionPolicyConfig : IRetentionPolicy
|
|
{
|
|
public required string StreamName { get; init; }
|
|
public TimeSpan? MaxAge { get; init; }
|
|
public long? MaxEventCount { get; init; }
|
|
public bool Enabled { get; init; } = true;
|
|
}
|
|
```
|
|
|
|
### 2. New Interface: `IRetentionPolicyStore`
|
|
|
|
```csharp
|
|
public interface IRetentionPolicyStore
|
|
{
|
|
/// <summary>
|
|
/// Set retention policy for a stream
|
|
/// </summary>
|
|
Task SetPolicyAsync(IRetentionPolicy policy, CancellationToken cancellationToken = default);
|
|
|
|
/// <summary>
|
|
/// Get retention policy for a specific stream
|
|
/// </summary>
|
|
Task<IRetentionPolicy?> GetPolicyAsync(string streamName, CancellationToken cancellationToken = default);
|
|
|
|
/// <summary>
|
|
/// Get all configured retention policies
|
|
/// </summary>
|
|
Task<IReadOnlyList<IRetentionPolicy>> GetAllPoliciesAsync(CancellationToken cancellationToken = default);
|
|
|
|
/// <summary>
|
|
/// Delete retention policy for a stream
|
|
/// </summary>
|
|
Task DeletePolicyAsync(string streamName, CancellationToken cancellationToken = default);
|
|
|
|
/// <summary>
|
|
/// Apply retention policies and return cleanup statistics
|
|
/// </summary>
|
|
Task<RetentionCleanupResult> ApplyRetentionPoliciesAsync(CancellationToken cancellationToken = default);
|
|
}
|
|
|
|
public record RetentionCleanupResult
|
|
{
|
|
public required int StreamsProcessed { get; init; }
|
|
public required long EventsDeleted { get; init; }
|
|
public required TimeSpan Duration { get; init; }
|
|
public required DateTimeOffset CompletedAt { get; init; }
|
|
}
|
|
```
|
|
|
|
### 3. PostgreSQL Table Partitioning
|
|
|
|
Update event_store table to use declarative partitioning by month:
|
|
|
|
```sql
|
|
-- New partitioned table (migration creates this)
|
|
CREATE TABLE event_streaming.event_store_partitioned (
|
|
id BIGSERIAL NOT NULL,
|
|
stream_name VARCHAR(255) NOT NULL,
|
|
event_id VARCHAR(255) NOT NULL,
|
|
correlation_id VARCHAR(255) NOT NULL,
|
|
event_type VARCHAR(500) NOT NULL,
|
|
event_data JSONB NOT NULL,
|
|
occurred_at TIMESTAMPTZ NOT NULL,
|
|
stored_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
|
offset BIGINT NOT NULL,
|
|
metadata JSONB,
|
|
PRIMARY KEY (id, stored_at)
|
|
) PARTITION BY RANGE (stored_at);
|
|
|
|
-- Create initial partitions (last 3 months + current + next month)
|
|
CREATE TABLE event_streaming.event_store_2024_11 PARTITION OF event_streaming.event_store_partitioned
|
|
FOR VALUES FROM ('2024-11-01') TO ('2024-12-01');
|
|
|
|
CREATE TABLE event_streaming.event_store_2024_12 PARTITION OF event_streaming.event_store_partitioned
|
|
FOR VALUES FROM ('2024-12-01') TO ('2025-01-01');
|
|
|
|
-- Function to automatically create partitions for next month
|
|
CREATE OR REPLACE FUNCTION event_streaming.create_partition_for_next_month()
|
|
RETURNS void AS $$
|
|
DECLARE
|
|
next_month_start DATE;
|
|
next_month_end DATE;
|
|
partition_name TEXT;
|
|
BEGIN
|
|
next_month_start := DATE_TRUNC('month', NOW() + INTERVAL '1 month');
|
|
next_month_end := next_month_start + INTERVAL '1 month';
|
|
partition_name := 'event_store_' || TO_CHAR(next_month_start, 'YYYY_MM');
|
|
|
|
EXECUTE format(
|
|
'CREATE TABLE IF NOT EXISTS event_streaming.%I PARTITION OF event_streaming.event_store_partitioned FOR VALUES FROM (%L) TO (%L)',
|
|
partition_name,
|
|
next_month_start,
|
|
next_month_end
|
|
);
|
|
END;
|
|
$$ LANGUAGE plpgsql;
|
|
```
|
|
|
|
### 4. Retention Policies Table
|
|
|
|
```sql
|
|
CREATE TABLE event_streaming.retention_policies (
|
|
stream_name VARCHAR(255) PRIMARY KEY,
|
|
max_age_seconds INT, -- NULL = no time-based retention
|
|
max_event_count BIGINT, -- NULL = no size-based retention
|
|
enabled BOOLEAN NOT NULL DEFAULT true,
|
|
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
|
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
|
);
|
|
|
|
-- Default policy for all streams (stream_name = '*')
|
|
INSERT INTO event_streaming.retention_policies (stream_name, max_age_seconds, max_event_count)
|
|
VALUES ('*', NULL, NULL); -- No retention by default
|
|
|
|
COMMENT ON TABLE event_streaming.retention_policies IS
|
|
'Retention policies for event streams. stream_name="*" is the default policy.';
|
|
```
|
|
|
|
### 5. Background Service: `RetentionPolicyService`
|
|
|
|
```csharp
|
|
public class RetentionPolicyService : BackgroundService
|
|
{
|
|
private readonly IRetentionPolicyStore _policyStore;
|
|
private readonly RetentionServiceOptions _options;
|
|
private readonly ILogger<RetentionPolicyService> _logger;
|
|
|
|
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
|
|
{
|
|
while (!stoppingToken.IsCancellationRequested)
|
|
{
|
|
try
|
|
{
|
|
// Wait for configured cleanup interval
|
|
await Task.Delay(_options.CleanupInterval, stoppingToken);
|
|
|
|
// Check if we're in the cleanup window
|
|
if (!IsInCleanupWindow())
|
|
{
|
|
_logger.LogDebug("Outside cleanup window, skipping retention");
|
|
continue;
|
|
}
|
|
|
|
_logger.LogInformation("Starting retention policy enforcement");
|
|
|
|
var result = await _policyStore.ApplyRetentionPoliciesAsync(stoppingToken);
|
|
|
|
_logger.LogInformation(
|
|
"Retention cleanup complete: {StreamsProcessed} streams, {EventsDeleted} events deleted in {Duration}",
|
|
result.StreamsProcessed,
|
|
result.EventsDeleted,
|
|
result.Duration);
|
|
}
|
|
catch (Exception ex)
|
|
{
|
|
_logger.LogError(ex, "Error during retention policy enforcement");
|
|
}
|
|
}
|
|
}
|
|
|
|
private bool IsInCleanupWindow()
|
|
{
|
|
var now = DateTime.UtcNow.TimeOfDay;
|
|
return now >= _options.CleanupWindowStart && now <= _options.CleanupWindowEnd;
|
|
}
|
|
}
|
|
|
|
public class RetentionServiceOptions
|
|
{
|
|
/// <summary>
|
|
/// How often to check and enforce retention policies
|
|
/// Default: 1 hour
|
|
/// </summary>
|
|
public TimeSpan CleanupInterval { get; set; } = TimeSpan.FromHours(1);
|
|
|
|
/// <summary>
|
|
/// Start of cleanup window (UTC time)
|
|
/// Default: 2 AM
|
|
/// </summary>
|
|
public TimeSpan CleanupWindowStart { get; set; } = TimeSpan.FromHours(2);
|
|
|
|
/// <summary>
|
|
/// End of cleanup window (UTC time)
|
|
/// Default: 6 AM
|
|
/// </summary>
|
|
public TimeSpan CleanupWindowEnd { get; set; } = TimeSpan.FromHours(6);
|
|
|
|
/// <summary>
|
|
/// Whether the retention service is enabled
|
|
/// Default: true
|
|
/// </summary>
|
|
public bool Enabled { get; set; } = true;
|
|
}
|
|
```
|
|
|
|
## Database Migration: `003_RetentionPolicies.sql`
|
|
|
|
```sql
|
|
-- Retention policies table
|
|
CREATE TABLE IF NOT EXISTS event_streaming.retention_policies (
|
|
stream_name VARCHAR(255) PRIMARY KEY,
|
|
max_age_seconds INT,
|
|
max_event_count BIGINT,
|
|
enabled BOOLEAN NOT NULL DEFAULT true,
|
|
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
|
|
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
|
|
);
|
|
|
|
-- Default retention policy (no retention)
|
|
INSERT INTO event_streaming.retention_policies (stream_name, max_age_seconds, max_event_count)
|
|
VALUES ('*', NULL, NULL)
|
|
ON CONFLICT (stream_name) DO NOTHING;
|
|
|
|
-- Function to apply time-based retention for a stream
|
|
CREATE OR REPLACE FUNCTION event_streaming.apply_time_retention(
|
|
p_stream_name VARCHAR,
|
|
p_max_age_seconds INT
|
|
)
|
|
RETURNS BIGINT AS $$
|
|
DECLARE
|
|
deleted_count BIGINT;
|
|
BEGIN
|
|
DELETE FROM event_streaming.event_store
|
|
WHERE stream_name = p_stream_name
|
|
AND stored_at < NOW() - (p_max_age_seconds || ' seconds')::INTERVAL;
|
|
|
|
GET DIAGNOSTICS deleted_count = ROW_COUNT;
|
|
RETURN deleted_count;
|
|
END;
|
|
$$ LANGUAGE plpgsql;
|
|
|
|
-- Function to apply size-based retention for a stream
|
|
CREATE OR REPLACE FUNCTION event_streaming.apply_size_retention(
|
|
p_stream_name VARCHAR,
|
|
p_max_event_count BIGINT
|
|
)
|
|
RETURNS BIGINT AS $$
|
|
DECLARE
|
|
deleted_count BIGINT;
|
|
current_count BIGINT;
|
|
events_to_delete BIGINT;
|
|
BEGIN
|
|
-- Count current events
|
|
SELECT COUNT(*) INTO current_count
|
|
FROM event_streaming.event_store
|
|
WHERE stream_name = p_stream_name;
|
|
|
|
-- Calculate how many to delete
|
|
events_to_delete := current_count - p_max_event_count;
|
|
|
|
IF events_to_delete <= 0 THEN
|
|
RETURN 0;
|
|
END IF;
|
|
|
|
-- Delete oldest events beyond max count
|
|
DELETE FROM event_streaming.event_store
|
|
WHERE id IN (
|
|
SELECT id
|
|
FROM event_streaming.event_store
|
|
WHERE stream_name = p_stream_name
|
|
ORDER BY offset ASC
|
|
LIMIT events_to_delete
|
|
);
|
|
|
|
GET DIAGNOSTICS deleted_count = ROW_COUNT;
|
|
RETURN deleted_count;
|
|
END;
|
|
$$ LANGUAGE plpgsql;
|
|
|
|
-- Function to apply all retention policies
|
|
CREATE OR REPLACE FUNCTION event_streaming.apply_all_retention_policies()
|
|
RETURNS TABLE(stream_name VARCHAR, events_deleted BIGINT) AS $$
|
|
DECLARE
|
|
policy RECORD;
|
|
deleted BIGINT;
|
|
total_deleted BIGINT := 0;
|
|
BEGIN
|
|
FOR policy IN
|
|
SELECT rp.stream_name, rp.max_age_seconds, rp.max_event_count
|
|
FROM event_streaming.retention_policies rp
|
|
WHERE rp.enabled = true
|
|
AND (rp.max_age_seconds IS NOT NULL OR rp.max_event_count IS NOT NULL)
|
|
LOOP
|
|
deleted := 0;
|
|
|
|
-- Apply time-based retention
|
|
IF policy.max_age_seconds IS NOT NULL THEN
|
|
IF policy.stream_name = '*' THEN
|
|
-- Apply to all streams
|
|
DELETE FROM event_streaming.event_store
|
|
WHERE stored_at < NOW() - (policy.max_age_seconds || ' seconds')::INTERVAL;
|
|
GET DIAGNOSTICS deleted = ROW_COUNT;
|
|
ELSE
|
|
-- Apply to specific stream
|
|
SELECT event_streaming.apply_time_retention(policy.stream_name, policy.max_age_seconds)
|
|
INTO deleted;
|
|
END IF;
|
|
END IF;
|
|
|
|
-- Apply size-based retention
|
|
IF policy.max_event_count IS NOT NULL AND policy.stream_name != '*' THEN
|
|
SELECT deleted + event_streaming.apply_size_retention(policy.stream_name, policy.max_event_count)
|
|
INTO deleted;
|
|
END IF;
|
|
|
|
IF deleted > 0 THEN
|
|
stream_name := policy.stream_name;
|
|
events_deleted := deleted;
|
|
RETURN NEXT;
|
|
END IF;
|
|
END LOOP;
|
|
END;
|
|
$$ LANGUAGE plpgsql;
|
|
|
|
-- View for retention policy status
|
|
CREATE OR REPLACE VIEW event_streaming.retention_policy_status AS
|
|
SELECT
|
|
rp.stream_name,
|
|
rp.max_age_seconds,
|
|
rp.max_event_count,
|
|
rp.enabled,
|
|
COUNT(es.id) AS current_event_count,
|
|
MIN(es.stored_at) AS oldest_event,
|
|
MAX(es.stored_at) AS newest_event,
|
|
EXTRACT(EPOCH FROM (NOW() - MIN(es.stored_at))) AS oldest_age_seconds
|
|
FROM event_streaming.retention_policies rp
|
|
LEFT JOIN event_streaming.event_store es ON es.stream_name = rp.stream_name
|
|
WHERE rp.stream_name != '*'
|
|
GROUP BY rp.stream_name, rp.max_age_seconds, rp.max_event_count, rp.enabled;
|
|
|
|
-- Migration version tracking
|
|
INSERT INTO event_streaming.schema_version (version, description, applied_at)
|
|
VALUES (3, 'Retention Policies', NOW())
|
|
ON CONFLICT (version) DO NOTHING;
|
|
```
|
|
|
|
## API Usage Examples
|
|
|
|
### Example 1: Configure Time-based Retention
|
|
|
|
```csharp
|
|
var policyStore = serviceProvider.GetRequiredService<IRetentionPolicyStore>();
|
|
|
|
// Keep user events for 90 days
|
|
await policyStore.SetPolicyAsync(new RetentionPolicyConfig
|
|
{
|
|
StreamName = "user-events",
|
|
MaxAge = TimeSpan.FromDays(90),
|
|
Enabled = true
|
|
});
|
|
|
|
// Keep audit logs for 7 years (compliance)
|
|
await policyStore.SetPolicyAsync(new RetentionPolicyConfig
|
|
{
|
|
StreamName = "audit-logs",
|
|
MaxAge = TimeSpan.FromDays(7 * 365),
|
|
Enabled = true
|
|
});
|
|
```
|
|
|
|
### Example 2: Configure Size-based Retention
|
|
|
|
```csharp
|
|
// Keep only last 10,000 events for analytics stream
|
|
await policyStore.SetPolicyAsync(new RetentionPolicyConfig
|
|
{
|
|
StreamName = "analytics-events",
|
|
MaxEventCount = 10000,
|
|
Enabled = true
|
|
});
|
|
```
|
|
|
|
### Example 3: Combined Time and Size Retention
|
|
|
|
```csharp
|
|
// Keep last 1M events OR 30 days, whichever comes first
|
|
await policyStore.SetPolicyAsync(new RetentionPolicyConfig
|
|
{
|
|
StreamName = "orders",
|
|
MaxAge = TimeSpan.FromDays(30),
|
|
MaxEventCount = 1_000_000,
|
|
Enabled = true
|
|
});
|
|
```
|
|
|
|
### Example 4: Manual Cleanup Trigger
|
|
|
|
```csharp
|
|
var policyStore = serviceProvider.GetRequiredService<IRetentionPolicyStore>();
|
|
|
|
// Manually trigger retention cleanup
|
|
var result = await policyStore.ApplyRetentionPoliciesAsync();
|
|
|
|
Console.WriteLine($"Cleaned up {result.EventsDeleted} events from {result.StreamsProcessed} streams in {result.Duration}");
|
|
```
|
|
|
|
### Example 5: Monitor Retention Status
|
|
|
|
```csharp
|
|
// Get all retention policies
|
|
var policies = await policyStore.GetAllPoliciesAsync();
|
|
|
|
foreach (var policy in policies)
|
|
{
|
|
Console.WriteLine($"Stream: {policy.StreamName}");
|
|
Console.WriteLine($" Max Age: {policy.MaxAge}");
|
|
Console.WriteLine($" Max Count: {policy.MaxEventCount}");
|
|
Console.WriteLine($" Enabled: {policy.Enabled}");
|
|
}
|
|
```
|
|
|
|
## Configuration
|
|
|
|
### appsettings.json
|
|
|
|
```json
|
|
{
|
|
"EventStreaming": {
|
|
"Retention": {
|
|
"Enabled": true,
|
|
"CleanupInterval": "01:00:00",
|
|
"CleanupWindowStart": "02:00:00",
|
|
"CleanupWindowEnd": "06:00:00"
|
|
},
|
|
"DefaultRetentionPolicy": {
|
|
"MaxAge": "30.00:00:00",
|
|
"MaxEventCount": null,
|
|
"Enabled": false
|
|
}
|
|
}
|
|
}
|
|
```
|
|
|
|
## Implementation Checklist
|
|
|
|
### Phase 2.4.1 - Core Interfaces (Week 1) ✅
|
|
- [x] Define IRetentionPolicy interface
|
|
- [x] Define IRetentionPolicyStore interface
|
|
- [x] Define RetentionPolicyConfig record
|
|
- [x] Define RetentionServiceOptions
|
|
- [x] Define RetentionCleanupResult record
|
|
|
|
### Phase 2.4.2 - Database Schema (Week 1) ✅
|
|
- [x] Create 003_RetentionPolicies.sql migration
|
|
- [x] Create retention_policies table
|
|
- [x] Create apply_time_retention() function
|
|
- [x] Create apply_size_retention() function
|
|
- [x] Create apply_all_retention_policies() function
|
|
- [x] Create retention_policy_status view
|
|
|
|
### Phase 2.4.3 - PostgreSQL Implementation (Week 2) ✅
|
|
- [x] Implement PostgresRetentionPolicyStore
|
|
- [x] Implement time-based cleanup logic
|
|
- [x] Implement size-based cleanup logic
|
|
- [x] Add cleanup metrics and logging
|
|
- [ ] Add unit tests (deferred)
|
|
|
|
### Phase 2.4.4 - Background Service (Week 2) ✅
|
|
- [x] Implement RetentionPolicyService
|
|
- [x] Add cleanup window logic (with midnight crossing support)
|
|
- [x] Add configurable intervals
|
|
- [x] Add service registration extensions
|
|
- [ ] Add health checks (deferred)
|
|
- [ ] Integration tests (deferred)
|
|
|
|
### Phase 2.4.5 - Table Partitioning (Week 3) ⏸️ Deferred
|
|
- [ ] Create partitioned event_store table
|
|
- [ ] Create initial partitions
|
|
- [ ] Create auto-partition function
|
|
- [ ] Migrate existing data (if needed)
|
|
- [ ] Performance testing
|
|
|
|
**Note**: Table partitioning has been deferred as it requires data migration and is not critical for initial release. Will be implemented in a future phase when migration strategy is finalized.
|
|
|
|
### Phase 2.4.6 - Documentation (Week 3) ✅
|
|
- [x] Update README.md
|
|
- [x] Update CLAUDE.md
|
|
- [x] Update Phase 2.4 plan to complete
|
|
|
|
## Performance Considerations
|
|
|
|
### Cleanup Strategy
|
|
- **Batch Deletes**: Delete in batches to avoid long-running transactions
|
|
- **Off-Peak Hours**: Run cleanup during configured window (default: 2-6 AM)
|
|
- **Index Optimization**: Ensure indexes on `stored_at` and `stream_name`
|
|
- **Vacuum**: Run VACUUM ANALYZE after large deletes
|
|
|
|
### Partitioning Benefits
|
|
- **Query Performance**: Partition pruning for time-range queries
|
|
- **Maintenance**: Drop old partitions instead of DELETE (instant)
|
|
- **Parallel Operations**: Multiple partitions can be processed in parallel
|
|
- **Backup/Restore**: Partition-level backup and restore
|
|
|
|
## Success Criteria
|
|
|
|
- [x] Time-based retention policies can be configured per stream
|
|
- [x] Size-based retention policies can be configured per stream
|
|
- [x] Background service enforces retention policies automatically
|
|
- [x] Cleanup respects configured time windows (with midnight crossing support)
|
|
- [ ] Table partitioning improves query performance (deferred)
|
|
- [ ] Old partitions can be dropped instantly (deferred)
|
|
- [x] Retention metrics are logged and observable
|
|
- [x] Documentation is complete
|
|
|
|
## Risks & Mitigation
|
|
|
|
| Risk | Impact | Mitigation |
|
|
|------|--------|------------|
|
|
| **Accidental data loss** | Critical | Require explicit policy configuration, disable default retention |
|
|
| **Long-running deletes** | Performance impact | Batch deletes, run during off-peak hours |
|
|
| **Partition migration** | Downtime | Create partitioned table separately, migrate incrementally |
|
|
| **Misconfigured policies** | Data loss or retention failure | Policy validation, dry-run mode |
|
|
|
|
## Future Enhancements (Phase 3.x)
|
|
|
|
- Event archiving to S3/blob storage before deletion
|
|
- Custom retention logic via user-defined functions
|
|
- Retention policy templates
|
|
- Retention compliance reporting
|
|
- Cross-region retention coordination
|
|
|
|
---
|
|
|
|
**Document Status**: 📋 Planning
|
|
**Last Updated**: December 10, 2025
|
|
**Next Review**: Upon Phase 2.3 completion confirmation
|