# Backend Caching Implementation

## How It Works

Caching is implemented **entirely on the backend**, in the Express server, so all clients benefit from a shared cache and shared rate limiting.

### Architecture

```
┌──────────┐    ┌──────────┐       ┌────────────────┐       ┌────────────┐
│ Client A │    │ Client B │       │ Express Server │       │ GitHub API │
└────┬─────┘    └────┬─────┘       └───────┬────────┘       └─────┬──────┘
     │               │                     │                      │
     │ GET /api/workflow-runs              │                      │
     ├─────────────────────────────────────→                      │
     │               │                Check Cache                 │
     │               │        ┌─────────────────┐                 │
     │               │        │   CACHE MISS    │                 │
     │               │        └─────────────────┘                 │
     │               │                     │  Make API Request    │
     │               │                     ├──────────────────────→
     │               │                     ←──────────────────────┤
     │               │              Store in Cache                │
     ←─────────────────────────────────────┤                      │
     │               │                     │                      │
     │               │  GET /api/workflow-runs                    │
     │               ├─────────────────────→                      │
     │               │                Check Cache                 │
     │               │        ┌─────────────────┐                 │
     │               │        │    CACHE HIT    │                 │
     │               │        └─────────────────┘                 │
     │               ←─────────────────────┤   (No API call)      │
```

### Key Components

1. **Single GitHubService Instance**: One shared instance serves all clients
2. **In-Memory Cache**: Map-based caching with TTL expiration
3. **Request Queue**: All API requests are queued and rate-limited
4. **Rate Limit Tracking**: Shared rate limit state across all requests

### Cache Features

- **TTL-based Expiration**: 5 minutes for workflow runs and other data
- **Automatic Cleanup**: Expired entries are removed automatically
- **Cache Preservation**: The cache survives configuration changes
- **Request Deduplication**: Identical requests share the same cached result

### Rate Limiting

- **Parallel Processing**: Multiple API requests are processed concurrently
- **Concurrency and Throughput Caps**: At most 10 concurrent requests, capped at 10 requests/second
- **Proactive Waiting**: The server waits automatically when approaching rate limits
- **Shared Counters**: All clients share the same rate limit tracking

### Benefits

✅ **Reduced API Calls**: Cache hits eliminate redundant GitHub API requests
✅ **Shared Rate Limits**: Multiple clients don't compound rate limit usage
✅ **Better Performance**: Cached responses are served instantly
✅ **Automatic Management**: No client-side cache logic is needed
✅ **Scalable**: Adding more clients doesn't increase API usage proportionally

### Monitoring

The backend provides detailed logging:

- `💾 Cache HIT: owner/repo - runs` - Request served from cache
- `🌐 Cache MISS: owner/repo - runs - Making API request` - New API request
- `📊 API Rate Limit: remaining/limit remaining` - Current rate limit status
- `💾 Cached response for 300s` - Data cached with TTL

### API Endpoints for Cache Management

- `GET /api/rate-limit` - View current GitHub API rate limit status
- `GET /api/cache/stats` - View cache size and entries
- `DELETE /api/cache` - Manually clear the cache

This ensures efficient API usage while providing transparency and control over the caching behavior. Illustrative sketches of each mechanism follow.
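
### Implementation Sketches

The sketches below are illustrative, not the actual source: the names `TtlCache`, `cachedFetch`, `RequestQueue`, and `rateLimitState` are assumptions made for these examples. First, a minimal version of the Map-based cache with TTL expiration and automatic cleanup:

```typescript
// Minimal Map-based TTL cache (names are illustrative, not the real source).
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();

  // Default TTL of 5 minutes, matching the documented 300s cache lifetime.
  constructor(private ttlMs: number = 5 * 60 * 1000) {}

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(key); // automatic cleanup of expired entries
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }

  clear(): void {
    this.entries.clear();
  }

  get size(): number {
    return this.entries.size;
  }
}
```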
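
The lookup path produces the HIT/MISS log lines shown under Monitoring. One plausible way to implement request deduplication (an interpretation, not confirmed by the source) is to let identical concurrent requests share the same in-flight promise, sketched here reusing `TtlCache` from above:

```typescript
// Illustrative lookup path with HIT/MISS logging and deduplication of
// identical in-flight requests. Reuses TtlCache from the previous sketch.
const cache = new TtlCache<unknown>();
const inFlight = new Map<string, Promise<unknown>>();

async function cachedFetch<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
  const hit = cache.get(key);
  if (hit !== undefined) {
    console.log(`💾 Cache HIT: ${key}`);
    return hit as T;
  }

  // Deduplication: concurrent identical requests await the same promise.
  const pending = inFlight.get(key);
  if (pending) return pending as Promise<T>;

  console.log(`🌐 Cache MISS: ${key} - Making API request`);
  const promise = fetcher()
    .then((value) => {
      cache.set(key, value);
      console.log("💾 Cached response for 300s");
      return value;
    })
    .finally(() => inFlight.delete(key));

  inFlight.set(key, promise);
  return promise;
}
```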
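
The request queue enforces the two caps from the Rate Limiting section: at most 10 concurrent requests and 10 requests/second. A minimal sketch, assuming a simple polling wait (the real implementation may schedule differently):

```typescript
// Sketch of the rate-limited request queue: caps concurrency at 10 and
// throughput at 10 requests/second, waiting proactively when either cap
// is reached. Illustrative only.
class RequestQueue {
  private active = 0;
  private windowStart = Date.now();
  private windowCount = 0;

  constructor(
    private maxConcurrent = 10,
    private maxPerSecond = 10,
  ) {}

  private refreshWindow(): void {
    if (Date.now() - this.windowStart >= 1000) {
      this.windowStart = Date.now();
      this.windowCount = 0;
    }
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    this.refreshWindow();
    while (this.active >= this.maxConcurrent || this.windowCount >= this.maxPerSecond) {
      await new Promise((resolve) => setTimeout(resolve, 50)); // proactive waiting
      this.refreshWindow();
    }
    this.active++;
    this.windowCount++;
    try {
      return await task();
    } finally {
      this.active--;
    }
  }
}
```

Because every GitHub call would go through a single queue instance (e.g. `queue.run(() => fetch(url))`), the concurrency and throughput counters are shared across all clients.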
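
Shared rate limit tracking presumably reads GitHub's documented `x-ratelimit-remaining` and `x-ratelimit-limit` response headers and feeds the `📊` log line; the mechanism below is an assumption, using the standard fetch `Headers` type:

```typescript
// Shared rate-limit state, updated from GitHub's x-ratelimit-* response
// headers after each API call. The object shape is an assumption.
const rateLimitState = { remaining: 0, limit: 0 };

function updateRateLimit(headers: Headers): void {
  const remaining = headers.get("x-ratelimit-remaining");
  const limit = headers.get("x-ratelimit-limit");
  if (remaining !== null && limit !== null) {
    rateLimitState.remaining = Number(remaining);
    rateLimitState.limit = Number(limit);
    console.log(`📊 API Rate Limit: ${rateLimitState.remaining}/${rateLimitState.limit} remaining`);
  }
}
```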
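
Finally, the three management endpoints map directly onto the cache and rate-limit state. The route paths come from the list above; the handler bodies are hypothetical:

```typescript
import express from "express";

const app = express();

// Hypothetical handlers for the management endpoints listed above,
// wired to the cache and rateLimitState from the earlier sketches.
app.get("/api/rate-limit", (_req, res) => {
  res.json(rateLimitState); // current GitHub API rate limit status
});

app.get("/api/cache/stats", (_req, res) => {
  res.json({ size: cache.size }); // cache size (entry listing omitted here)
});

app.delete("/api/cache", (_req, res) => {
  cache.clear(); // manual cache invalidation
  res.status(204).end();
});

app.listen(3000);
```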