
AI Summary
→ WHAT IT COVERS
Gerhard Lazu debugs Changelog's CDN infrastructure after 43 out-of-memory crashes since October. He implements file-based caching for MP3s, fixes Fly.io proxy misconfigurations, and discovers massive bandwidth abuse from 10,000+ IPs repeatedly downloading episode 456.

→ KEY INSIGHTS
- **Memory fragmentation solution:** Varnish crashed 43 times in three months because MP3 files (30-100MB each) fragmented its memory. Moving large files from malloc memory storage to a file-based cache with pre-allocated disk space eliminated the crashes while maintaining a 93% cache hit ratio across 15 global regions.
- **Concurrency misconfiguration impact:** Setting Fly.io proxy concurrency to connections instead of requests let 2,700 long-running connections block new traffic in the Newark region. HTTP/2 clients saw response bodies time out even though headers returned successfully. The fix was switching the concurrency mode and explicitly setting 60-second idle timeouts.
- **Thread pool architecture benefits:** Varnish runs as a daemon with multiple threads, so out-of-memory kills only restart individual threads within two seconds rather than the entire VM. This let-it-crash philosophy from the Erlang ecosystem keeps the system stable despite component failures, with zero thread failures recorded after five days of uptime.
- **Bandwidth abuse detection:** Episode 456 generated 30 terabytes from San Jose alone in 60 days, with 10,000+ distinct IPs downloading it repeatedly. Honeycomb observability reveals patterns such as 170,000 favicon requests in two hours and weekly Python/Go clients scraping all MP3s, which calls for rate limiting with vmod-throttle.
- **Regional traffic optimization:** San Jose and Tokyo handle the highest CDN load, at a peak of 2.29 gigabits per second. Automated hourly checks using hurl test all 15 regions, downloading full MP3s to validate that response times stay under 100 seconds.
Fly.io allows per-region instance sizing but requires manual scaling after initial deployment.

→ NOTABLE MOMENT
One episode from August 2021 about OAuth complexity has been downloaded over one million times, generating 400 gigabytes every four hours from thousands of Asian IP addresses. The team suspects speed testing or archiving bots rather than genuine listeners, forcing implementation of throttling mechanisms to control bandwidth costs.

💼 SPONSORS
- Namespace: https://namespace.so
- Fly.io: https://fly.io
- Squarespace: https://squarespace.com/changelog
- Depot: https://depot.dev

🏷️ CDN Architecture, Varnish Caching, Infrastructure Debugging, Bandwidth Abuse, Observability, Let It Crash