Finding the Nginx gzip_comp_level Sweet Spot

When a web server creates a web page it generally pulls loads of crap from the database/disk and stitches together some ginormous hodgepodge of raw markup hundreds or thousands of kilobytes in size. As HTML is literally just a heap of embarrassingly compressible plaintext, turning on nginx’s gzip compression filter is a no-brainer.

gzip on;

Beyond that there are a handful of other knobs you can turn. The most common is probably gzip_comp_level, which sets the compression level to be used.

Going beyond `gzip on`

The acceptable range of values for gzip_comp_level (1 through 9) corresponds to that of the standalone gzip utility. An excerpt from its man page explains:

These options change the compression level used, with the -1 option being the fastest, with less compression, and the -9 option being the slowest, with optimal compression.

In other words: you trade CPU time for resulting file size. The idea is to send less data to the client, but not at the expense of excessive CPU usage on the server. The client (who could very well be on a mobile device with a meek CPU) has to decompress the response too, although decompression cost depends far less on the level than compression cost does. It stands to reason that there exists a sort of “sweet spot” depending on your specific content, server, users, etc.

For what it’s worth, nginx defaults to level 1 and the standalone gzip defaults to level 6.
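
To get a feel for this trade-off without touching a server, the standalone gzip can be run at each level against a local file. The sketch below is only an illustration: the sample markup is synthetic (the `/community/N` links and the `sample-gzipN.html.gz` file names are made up here), so the numbers will not match any real page.

```shell
# Sketch: compare gzip levels 1-9 on a synthetic sample file.
workdir=$(mktemp -d)
sample="$workdir/sample.html"

# Build a few hundred kilobytes of repetitive, made-up markup.
i=0
while [ "$i" -lt 5000 ]; do
    printf '<tr class="row"><td>%d</td><td><a href="/community/%d">Community %d</a></td></tr>\n' "$i" "$i" "$i"
    i=$((i + 1))
done > "$sample"

# $(( ... )) strips the padding some wc implementations add.
orig=$(( $(wc -c < "$sample") ))
echo "uncompressed: $orig bytes"

for level in 1 2 3 4 5 6 7 8 9; do
    out="$workdir/sample-gzip$level.html.gz"
    # -c writes to stdout, so the sample file itself is left untouched.
    gzip -"$level" -c "$sample" > "$out"
    size=$(( $(wc -c < "$out") ))
    echo "level $level: $size bytes"
done
```

Prefixing the gzip call with `time` shows the CPU side of the trade for each level.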

Finding the sweet spot

Using curl to capture the gzipped HTML stream, you can compare the savings of each compression level on your site’s own content. Edit the nginx configuration to use level 1, 2, 3, etc., reloading nginx after each change, then test each setting individually and save the results to disk.

For example, a large, dynamic page on a site of mine:

$ curl -H "Accept-Encoding: gzip" https://example.org/community-list > /tmp/community-list-gzip1.html.gz
$ curl -H "Accept-Encoding: gzip" https://example.org/community-list > /tmp/community-list-gzip2.html.gz
$ ...
$ curl https://example.org/community-list > /tmp/community-list.html

Note the use of the Accept-Encoding HTTP header, which informs the server that your client supports gzip. The lack of this header makes the server send plain HTML.
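
The edit, reload, and fetch cycle gets tedious by hand, so it could be scripted along the following lines. This is a rough sketch, not a tested deployment tool: the function names are made up here, the config path in the usage comment is an assumption, the config is assumed to already contain a `gzip_comp_level N;` line, and the one-second pause after the reload is a guess. It also runs `gzip -t` on each download to confirm the server really sent gzip.

```shell
# Sketch: automate the per-level test. Run on a test server, not production.

# Rewrite an existing "gzip_comp_level N;" directive in config file $1 to level $2.
set_gzip_level() {
    sed -E "s/gzip_comp_level[[:space:]]+[0-9]+;/gzip_comp_level $2;/" "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&   # overwrite in place to keep ownership and mode
        rm "$1.tmp"
}

# Fetch URL $1 into file $2, then check that the saved body is valid gzip.
fetch_gzipped() {
    curl -s -H "Accept-Encoding: gzip" "$1" -o "$2" && gzip -t "$2"
}

# sweep CONF URL PREFIX: try every level and save PREFIX-gzipN.html.gz for each.
sweep() {
    for n in 1 2 3 4 5 6 7 8 9; do
        set_gzip_level "$1" "$n" || return 1
        # Validate the config before asking nginx to reload it.
        nginx -t && nginx -s reload || return 1
        sleep 1   # assumed to be long enough for the new workers to take over
        fetch_gzipped "$2" "$3-gzip$n.html.gz" ||
            echo "level $n: response was not valid gzip" >&2
    done
}

# Usage (typically as root; the config path is an assumption):
# sweep /etc/nginx/nginx.conf https://example.org/community-list /tmp/community-list
```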

Results

A few things become clear when looking at the resulting file sizes:

$ du -sh /tmp/community-list*
40K     /tmp/community-list-gzip1.html.gz
40K     /tmp/community-list-gzip2.html.gz
40K     /tmp/community-list-gzip3.html.gz
36K     /tmp/community-list-gzip4.html.gz
36K     /tmp/community-list-gzip5.html.gz
32K     /tmp/community-list-gzip6.html.gz
32K     /tmp/community-list-gzip7.html.gz
32K     /tmp/community-list-gzip8.html.gz
32K     /tmp/community-list-gzip9.html.gz
380K    /tmp/community-list.html

First, it is immediately clear that, even at compression level 1, this page shrinks by roughly 340 kilobytes simply by turning gzip on! Another observation is that the file sizes cluster at compression levels 1, 4, and 6. Bear in mind that du reports disk usage rounded up to whole filesystem blocks (typically 4K), so files that look identical here may still differ by a few kilobytes. At this granularity, though, there is little use in levels 2 or 3, because you get about the same compression for less CPU by using level 1.
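
For byte-exact numbers, `wc -c` can stand in for du. The helper below is a small sketch (the `report_sizes` name is made up here) that prints each file's size in bytes and the integer percentage saved relative to the uncompressed file; it assumes the original file is not empty.

```shell
# Sketch: exact byte counts and savings relative to the uncompressed file.
# Usage: report_sizes ORIGINAL COMPRESSED...
report_sizes() {
    orig_file=$1
    shift
    # $(( ... )) strips the padding some wc implementations add.
    orig=$(( $(wc -c < "$orig_file") ))
    printf '%8d bytes  %s\n' "$orig" "$orig_file"
    for f in "$@"; do
        bytes=$(( $(wc -c < "$f") ))
        # Integer percentage saved, rounded down.
        saved=$(( (orig - bytes) * 100 / orig ))
        printf '%8d bytes  %s (%d%% smaller)\n' "$bytes" "$f" "$saved"
    done
}

# Example, assuming the files from the curl commands above exist:
# report_sizes /tmp/community-list.html /tmp/community-list-gzip*.html.gz
```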

The takeaway

The “sweet spot” for nginx’s gzip compression filter seems to be either level 4 or 6. Knowing that gzip itself uses level 6 by default, it’s safe to say that the sweet spot is probably there. According to my data, anything above 6 is just a waste of CPU time anyway.

If in doubt, just turn gzip on and go optimize other layers of your stack. For most people that will be enough, but remember: in addition to consuming this mess of HTML, the client’s browser still has to go fetch all of the images, stylesheets, JavaScript, etc. once it starts parsing the source, not to mention building the DOM and rendering the page!