Many, many years ago, when Modern Vespa was still on its very first server (they just grow up so fast!) I was struggling to keep the server from overloading. MV was oddly successful right from the start, and we had more traffic than we could handle on such a low-end server. At the time, our hosting plan allowed us unlimited bandwidth, but a fairly stingy amount of compute time, with surcharges being assessed if we went over the monthly allotment. And we were going over the limit every single month. The bills were starting to add up.
One of the things I did at the time to alleviate the situation was to turn off gzip compression. (Sidebar: gzip compression is the standard mechanism for reducing the amount of data sent across the wire between the host and the browser.)
At the time, the compression was being done by the forum software itself, and it was definitely adding to the compute burden. How much it was contributing is an open question -- it’s trivial to compress text today, on modern hardware. It was a bit murkier in 2005/2006, when this was happening. Since I had unlimited bandwidth, though, and was being charged extra for compute, getting rid of gzip compression seemed like the right move. Do less work for every page served, even if it uses more bandwidth and even if the perceived response at the client will be slower.
Fast forward about 15 years, and the circumstances we now find ourselves in are the exact opposite. We are paying a fixed cost for our monthly compute resource, but we are paying by the gigabyte for bandwidth. The bandwidth charges aren’t exorbitant, but they’re not nothing, and they’re not fixed. So there’s an incentive to use less.
For the last 10 years or so, most of our non-bandwidth costs were quite a bit higher, and so the bandwidth charges were mostly just in the noise. However, over the last couple of months, I’ve been steadily chipping away at MV’s monthly hosting bill, finding better ways to provide the same service for less cost, and I’ve actually got the bill down substantially -- less than half of what it used to be. That makes the bandwidth charges stand out all the more, though. I’ve picked all the other low-hanging fruit, and bandwidth now represents the largest single item on the hosting bill.
And it turns out that in 15 years, I’ve never revisited the subject of the gzip compression that I so unceremoniously ripped out of the software. Lots of static resources (CSS, JavaScript, etc.) are already sitting on the CDN server in compressed form, but the HTML served by the core Modern Vespa server -- this topic, for instance -- has been delivered uncompressed this whole time.
I didn’t really think the HTML text was all that significant. As I started wondering what to do about the bandwidth costs, though, I looked at a few long threads to get some actual numbers. It turns out that a typical full page of postings is about 250k bytes worth of text -- that’s just HTML, not including avatars or attachments or anything. So it’s actually pretty significant. Well, okay, multiplied by thousands of page requests a day, it’s significant.
Fortunately, it turns out that this fruit was hanging lower than I thought, as it’s an easy problem to solve. Modern versions of Apache generally come pre-installed with mod_deflate, which is a module that will automatically compress page content on-the-fly if the client browser agrees to take it -- and all modern browsers do. Heck, even the ancient browsers accept gzip compression. It’s been a standard seemingly forever.
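For the curious, here is a rough Python sketch of the handshake described above: the server compresses only if the browser's Accept-Encoding header says it will take gzip. This is not how mod_deflate is actually implemented, just an illustration of the idea. The function names (accepts_gzip, maybe_compress) are made up for this example, and the header parsing is deliberately simplified (it ignores the "*" wildcard, for instance).

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value lists gzip with a nonzero q-value.

    Simplified on purpose: the "*" wildcard is not handled.
    """
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "gzip":
            continue
        q = 1.0  # no q parameter means full preference
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed q-value as a refusal
        return q > 0
    return False


def maybe_compress(body: bytes, accept_encoding: str):
    """Return (payload, extra_headers), gzipping only when the client allows it."""
    if accepts_gzip(accept_encoding):
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {}


if __name__ == "__main__":
    page = b"<div class='post'>Hello from the forum</div>\n" * 500
    payload, headers = maybe_compress(page, "gzip, deflate, br")
    print(len(page), "bytes before,", len(payload), "bytes after", headers)
```

A client that sends no gzip entry (or gzip;q=0) simply gets the original bytes back with no extra headers, which is the "if the client browser agrees to take it" part.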
So I flicked the switch. And right off the bat, those large 250k pages started being more like 25k, or about 10% of the uncompressed size. Which means you all get your pages that much faster (especially when you’re on a cellular connection) and MV’s hosting bill will be a little bit less next month.
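If you want to sanity-check numbers like those on your own pages, a few lines of Python will do it. This is only a sketch: the function name compression_ratio is made up for the example, and the sample page below is synthetic and far more repetitive than a real thread, so it will compress much better than real forum HTML does. Feed it the bytes of an actual saved page to get a meaningful figure.

```python
import gzip


def compression_ratio(html: bytes) -> float:
    """Return compressed size divided by original size (smaller is better)."""
    if not html:
        raise ValueError("need a non-empty page to measure")
    return len(gzip.compress(html)) / len(html)


if __name__ == "__main__":
    # Synthetic stand-in for a page of postings; NOT representative of real HTML.
    post = "<div class='post'><span class='author'>rider{n}</span><p>Post number {n}</p></div>\n"
    sample = "".join(post.format(n=n) for n in range(2000)).encode("utf-8")

    ratio = compression_ratio(sample)
    print(f"original: {len(sample)} bytes, ratio: {ratio:.1%}")
```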
Win win.
I just wish I’d done it sooner.
Last edited by jess; edited 1 time