# Backend Caching Implementation
## How It Works
The caching is implemented **entirely on the backend** in the Express server, ensuring that all clients benefit from a shared cache and shared rate-limit tracking.
### Architecture
```
┌──────────┐    ┌──────────┐    ┌────────────────┐    ┌────────────┐
│ Client A │    │ Client B │    │ Express Server │    │ GitHub API │
└────┬─────┘    └────┬─────┘    └───────┬────────┘    └─────┬──────┘
     │               │                  │                   │
     │ GET /api/workflow-runs           │                   │
     ├─────────────────────────────────→│                   │
     │               │                  │ Check cache:      │
     │               │                  │ CACHE MISS        │
     │               │                  │ Make API request  │
     │               │                  ├──────────────────→│
     │               │                  │←──────────────────┤
     │               │                  │ Store in cache    │
     │←─────────────────────────────────┤                   │
     │               │                  │                   │
     │               │ GET /api/workflow-runs               │
     │               ├─────────────────→│                   │
     │               │                  │ Check cache:      │
     │               │                  │ CACHE HIT         │
     │               │←─────────────────┤ (no API call)     │
```
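The cache-first flow in the diagram can be sketched as follows (a minimal sketch with hypothetical names; the actual server wires this logic into an Express route handler):

```javascript
// Minimal sketch of the cache-first flow above (hypothetical names).
async function handleWorkflowRuns(cache, fetchFromGitHub, key) {
  const entry = cache.get(key);
  if (entry && entry.expiresAt > Date.now()) {
    return entry.value; // CACHE HIT: served instantly, no API call
  }
  // CACHE MISS: call the GitHub API, then store with a 5-minute TTL
  const value = await fetchFromGitHub(key);
  cache.set(key, { value, expiresAt: Date.now() + 5 * 60 * 1000 });
  return value;
}
```

Because the cache lives in the server process, a second client requesting the same key within the TTL window takes the hit path above without touching GitHub.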
### Key Components
1. **Single GitHubService Instance**: One shared instance across all clients
2. **In-Memory Cache**: Map-based caching with TTL expiration
3. **Request Queue**: All API requests are queued and rate-limited
4. **Rate Limit Tracking**: Shared rate limit state across all requests
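Components 1 and 2 can be sketched like this (hypothetical names; a Map-based TTL cache owned by a single module-level instance that every route handler shares):

```javascript
// Sketch of a Map-based TTL cache (hypothetical names and TTL default).
class TtlCache {
  constructor(ttlMs = 5 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) { // expired: drop on access
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }

  // Periodic sweep for expired entries that are never read again.
  cleanup() {
    const now = Date.now();
    for (const [key, entry] of this.entries) {
      if (now > entry.expiresAt) this.entries.delete(key);
    }
  }

  get size() { return this.entries.size; }
}

// Module-level singleton: all clients hit the same cache state.
const sharedCache = new TtlCache();
```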
### Cache Features
- **TTL-based Expiration**: 5 minutes for workflow runs and other data
- **Automatic Cleanup**: Expired entries are automatically removed
- **Cache Preservation**: Cache survives configuration changes
- **Request Deduplication**: Multiple identical requests share the same cached result
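Request deduplication can be sketched with an in-flight promise map (hypothetical names; the idea is that concurrent identical requests join one upstream call rather than each triggering their own):

```javascript
// Sketch of request deduplication via shared in-flight promises.
const inFlight = new Map(); // key -> Promise

async function fetchDeduped(key, fetcher) {
  if (inFlight.has(key)) {
    return inFlight.get(key); // join the request already in progress
  }
  const promise = fetcher(key).finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}
```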
### Rate Limiting
- **Parallel Processing**: Multiple API requests processed concurrently
- **Intelligent Rate Limiting**: Maximum of 10 concurrent requests with a 10 requests/second limit
- **Proactive Waiting**: Automatically waits when approaching rate limits
- **Shared Counters**: All clients share the same rate limit tracking
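One way to enforce the concurrency cap (a sketch under the assumptions above, not the actual implementation) is a small promise-based limiter that queues excess tasks:

```javascript
// Sketch of a concurrency limiter: at most `maxConcurrent` tasks
// run at once; the rest wait in a FIFO queue.
function createLimiter(maxConcurrent = 10) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task()
      .then(resolve, reject)
      .finally(() => { active--; next(); });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}
```

Because one limiter instance guards all outgoing GitHub requests, every client shares the same counters.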
### Benefits
- **Reduced API Calls**: Cache hits eliminate redundant GitHub API requests
- **Shared Rate Limits**: Multiple clients don't compound rate limit usage
- **Better Performance**: Cached responses are served instantly
- **Automatic Management**: No client-side cache logic needed
- **Scalable**: Adding more clients doesn't increase API usage proportionally
### Monitoring
The backend provides detailed logging:
- `💾 Cache HIT: owner/repo - runs` - Request served from cache
- `🌐 Cache MISS: owner/repo - runs - Making API request` - New API request
- `📊 API Rate Limit: remaining/limit remaining` - Current rate limit status
- `💾 Cached response for 300s` - Data cached with TTL
### API Endpoints for Cache Management
- `GET /api/rate-limit` - View current GitHub API rate limit status
- `GET /api/cache/stats` - View cache size and entries
- `DELETE /api/cache` - Manually clear the cache
This ensures efficient API usage while providing transparency and control over the caching behavior.