# Backend Caching Implementation
## How It Works
The caching is implemented **entirely on the backend** in the Express server, ensuring that all clients benefit from a shared cache and shared rate-limit tracking.
### Architecture
```
┌──────────┐    ┌──────────┐    ┌────────────────┐    ┌────────────┐
│ Client A │    │ Client B │    │ Express Server │    │ GitHub API │
└────┬─────┘    └────┬─────┘    └───────┬────────┘    └─────┬──────┘
     │               │                  │                   │
     │ GET /api/workflow-runs           │                   │
     ├─────────────────────────────────→│                   │
     │               │                  │ Check cache:      │
     │               │                  │ CACHE MISS        │
     │               │                  │ Make API request  │
     │               │                  ├──────────────────→│
     │               │                  │←──────────────────┤
     │               │                  │ Store in cache    │
     │←─────────────────────────────────┤                   │
     │               │                  │                   │
     │               │ GET /api/workflow-runs               │
     │               ├─────────────────→│                   │
     │               │                  │ Check cache:      │
     │               │                  │ CACHE HIT         │
     │               │←─────────────────┤ (no API call)     │
```
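The cache-first flow in the diagram can be sketched as follows (a minimal sketch with hypothetical names; the actual server wires this logic into an Express route handler):

```javascript
// Minimal sketch of the cache-first flow above (hypothetical names).
async function handleWorkflowRuns(cache, fetchFromGitHub, key) {
  const entry = cache.get(key);
  if (entry && entry.expiresAt > Date.now()) {
    return entry.value; // CACHE HIT: served instantly, no API call
  }
  // CACHE MISS: call the GitHub API, then store with a 5-minute TTL
  const value = await fetchFromGitHub(key);
  cache.set(key, { value, expiresAt: Date.now() + 5 * 60 * 1000 });
  return value;
}
```

Because the cache lives in the server process, a second client requesting the same key within the TTL window takes the hit path above without touching GitHub.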
### Key Components
1. **Single GitHubService Instance**: One shared instance across all clients
2. **In-Memory Cache**: Map-based caching with TTL expiration
3. **Request Queue**: All API requests are queued and rate-limited
4. **Rate Limit Tracking**: Shared rate limit state across all requests
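Components 1 and 2 can be sketched like this (hypothetical names; a Map-based TTL cache owned by a single module-level instance that every route handler shares):

```javascript
// Sketch of a Map-based TTL cache (hypothetical names and TTL default).
class TtlCache {
  constructor(ttlMs = 5 * 60 * 1000) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) { // expired: drop on access
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }

  // Periodic sweep for expired entries that are never read again.
  cleanup() {
    const now = Date.now();
    for (const [key, entry] of this.entries) {
      if (now > entry.expiresAt) this.entries.delete(key);
    }
  }

  get size() { return this.entries.size; }
}

// Module-level singleton: all clients hit the same cache state.
const sharedCache = new TtlCache();
```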
### Cache Features
- **TTL-based Expiration**: 5 minutes for workflow runs and other data
- **Automatic Cleanup**: Expired entries are automatically removed
- **Cache Preservation**: Cache survives configuration changes
- **Request Deduplication**: Multiple identical requests share the same cached result
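Request deduplication can be sketched with an in-flight promise map (hypothetical names; the idea is that concurrent identical requests join one upstream call rather than each triggering their own):

```javascript
// Sketch of request deduplication via shared in-flight promises.
const inFlight = new Map(); // key -> Promise

async function fetchDeduped(key, fetcher) {
  if (inFlight.has(key)) {
    return inFlight.get(key); // join the request already in progress
  }
  const promise = fetcher(key).finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}
```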
### Rate Limiting
- **Parallel Processing**: Multiple API requests processed concurrently
- **Intelligent Rate Limiting**: Maximum of 10 concurrent requests with a 10 requests/second limit
- **Proactive Waiting**: Automatically waits when approaching rate limits
- **Shared Counters**: All clients share the same rate limit tracking
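One way to enforce the concurrency cap (a sketch under the assumptions above, not the actual implementation) is a small promise-based limiter that queues excess tasks:

```javascript
// Sketch of a concurrency limiter: at most `maxConcurrent` tasks
// run at once; the rest wait in a FIFO queue.
function createLimiter(maxConcurrent = 10) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task()
      .then(resolve, reject)
      .finally(() => { active--; next(); });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}
```

Because one limiter instance guards all outgoing GitHub requests, every client shares the same counters.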
### Benefits
- **Reduced API Calls**: Cache hits eliminate redundant GitHub API requests
- **Shared Rate Limits**: Multiple clients don't compound rate limit usage
- **Better Performance**: Cached responses are served instantly
- **Automatic Management**: No client-side cache logic needed
- **Scalable**: Adding more clients doesn't increase API usage proportionally
### Monitoring
The backend provides detailed logging:
- `💾 Cache HIT: owner/repo - runs` - Request served from cache
- `🌐 Cache MISS: owner/repo - runs - Making API request` - New API request
- `📊 API Rate Limit: remaining/limit remaining` - Current rate limit status
- `💾 Cached response for 300s` - Data cached with TTL
### API Endpoints for Cache Management
- `GET /api/rate-limit` - View current GitHub API rate limit status
- `GET /api/cache/stats` - View cache size and entries
- `DELETE /api/cache` - Manually clear the cache
This ensures efficient API usage while providing transparency and control over the caching behavior.