A Gzip Primer for Front-End Development
Find out how gzip works and how to improve page performance with it. It's simpler than you think!
It’s easy to treat gzip as a black box: for most people, it’s a checkbox that you turn on at the server level and forget about. But with a basic understanding of gzip, you can speed up your workflow and dispel some performance myths.
How does it work?
Gzip is a lossless compression format for files, built on the DEFLATE algorithm, which combines a variation of another compression algorithm called LZ77 with Huffman coding. “Lossless” means that, unlike the lossy compression used on videos and JPGs, gzip doesn’t remove any data, and the browser can reverse it at the other end to produce a perfect copy. Gzip works by finding and reducing repetition.
A neat example of gzip in action is Julia Evans’ 54-second video “gzip + poetry = awesome”, which shows a simplified gzip identifying repetitive strings and replacing them with pointers.
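To make the pointer idea concrete, here is a minimal toy sketch of LZ77-style matching in Python. It is not gzip's real implementation: actual gzip caps match lengths, searches far more efficiently, and adds Huffman coding on top. The names lz77_encode and lz77_decode and the (distance, length) tuple format are made up for this illustration.

```python
def lz77_encode(text, window=32 * 1024, min_len=3):
    """Greedy toy encoder: emit literal characters or (distance, length) pointers."""
    tokens = []
    i = 0
    while i < len(text):
        best_len, best_dist = 0, 0
        # Look back through the window for the longest earlier match.
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(text) and text[j + length] == text[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_len:
            tokens.append((best_dist, best_len))  # pointer back to earlier text
            i += best_len
        else:
            tokens.append(text[i])  # too short to be worth a pointer
            i += 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the original text by following each pointer."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            distance, length = token
            for _ in range(length):
                out.append(out[-distance])  # copy one character at a time
        else:
            out.append(token)
    return "".join(out)


sample = "example example example"
tokens = lz77_encode(sample)
print(tokens)
assert lz77_decode(tokens) == sample  # lossless: the round trip is exact
```

For this sample, the encoder emits the first eight characters as literals and covers everything after them with a single pointer.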
To get a better feel for how LZ77 works on web content, play around with this JSFiddle a bit.
The savings on the example text are only around 10%, but on larger files (and files with more code) the savings grow. The JSON content of Wikipedia’s gzip article, for example, saves about 35%.
A quick thing to try is typing “example example example” into the box. With some copying and pasting, you can reach a point where the savings top out at about 96%. This demonstrates how effective gzip is at reducing repetition: the more often you repeat an identical string beyond a certain length, the cheaper each repetition becomes.
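The same experiment can be run against real gzip rather than a simplified demo. The following is a small sketch using Python's standard gzip module; the helper name gzip_savings is an assumption for this example, and the exact percentages will differ from the JSFiddle because real gzip adds a fixed header and uses Huffman coding.

```python
import gzip


def gzip_savings(data: bytes) -> float:
    """Return the fraction of bytes saved by gzipping `data` (negative if it grew)."""
    return 1 - len(gzip.compress(data)) / len(data)


for repeats in (3, 10, 100, 1000):
    text = ("example " * repeats).encode("utf-8")
    print(f"{repeats:>5} repeats: {len(text):>5} bytes, {gzip_savings(text):.0%} saved")
```

With only a few repeats the fixed gzip header can outweigh the gains, so the first line may show a negative saving; the figure climbs quickly as the repetition grows.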
What can be gzipped?
Nearly anything you send to a browser — HTML, CSS, JS, SVGs — can and should be gzipped. There are a few interesting exceptions, though:
- Don’t gzip already-compressed files, which include .jpg, .woff font files, and .png. In some cases, it can actually make them slightly larger.
- Don’t gzip downloads. Files like .pdf and .docx are usually compressed internally already, so gzipping them saves little and can even make them slightly larger.
- Don’t gzip especially small files if you can avoid it. If a file is under 1.4 KB, it’s most likely going to be transmitted in a single packet anyway, so there’s no need to gzip it. It won’t hurt, but it adds a (very slight) bit of extra CPU time for your server and the browser.
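The exceptions above are easy to check locally. This is a rough sketch, assuming pseudo-random bytes are a fair stand-in for an already-compressed file such as a .jpg (both look like noise to gzip); the sample strings are invented for the example.

```python
import gzip
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
# Pseudo-random bytes stand in for an already-compressed file such as a .jpg.
compressed_like = bytes(rng.getrandbits(8) for _ in range(20_000))
tiny = b"<p>Hello</p>"
markup = b'<li class="item">An item</li>\n' * 200

samples = [("already compressed", compressed_like), ("tiny", tiny), ("markup", markup)]
for label, data in samples:
    gzipped = len(gzip.compress(data))
    print(f"{label:>18}: {len(data):>6} -> {gzipped:>6} bytes")
```

The first two samples should come out slightly larger than they went in, because gzip finds nothing to shrink and still adds its header and checksum, while the repetitive markup shrinks dramatically.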
What does this mean for how I develop?
Essentially, that repetition is your friend when it occurs inside a single file. Some examples:
- You might think that including the same inline SVG icon ten times on a page is wasteful, and move it into a spritesheet or external file. You’d be wrong; once gzipped, the remaining 9 uses are simply pointers to the original, and very space-efficient.
- When Sass was introduced, using @extend was seen as a smart way to keep filesize from growing by attaching selectors (which are small) to a single set of rules (which are big). Gzip flips this on its head: using a large set of rules many times in a single stylesheet is fine, as each successive ruleset just points to the first one. @extend tends to make files slightly larger than just using repetitive @mixins.
- This rule extends to media queries. When media queries were introduced, the best practice was to put them all at the end of your CSS file, partially due to the savings of only writing them once. Sass encourages you to nest them where they make sense instead, and if you’re gzipping your output, the resulting difference is negligible.
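The inline SVG case from the list above can be measured directly. Here is a minimal sketch, assuming a made-up star icon and a hypothetical page helper that builds a small HTML list; it compares how much a page grows, raw and gzipped, when the icon count goes from one to ten.

```python
import gzip

# A made-up icon; any inline SVG of a similar size should behave the same way.
ICON = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="24" height="24">'
    '<path d="M12 2l3.09 6.26L22 9.27l-5 4.87 1.18 6.88L12 17.77l-6.18 3.25L7 14.14 '
    '2 9.27l6.91-1.01z"/>'
    "</svg>"
)


def page(icon_count: int) -> bytes:
    """Build a small HTML list with `icon_count` items, each with the icon inline."""
    items = "".join(
        f'<li id="item-{n}">{ICON} Item {n}</li>\n' for n in range(icon_count)
    )
    return f"<ul>\n{items}</ul>\n".encode("utf-8")


extra_raw = len(page(10)) - len(page(1))
extra_gzipped = len(gzip.compress(page(10))) - len(gzip.compress(page(1)))
print(f"nine extra icons, raw:     {extra_raw} bytes")
print(f"nine extra icons, gzipped: {extra_gzipped} bytes")
```

The gzipped growth should be a small fraction of the raw growth, since each repeated icon is stored as a pointer to the first.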
There is a bit of extra complexity around the 32kb "sliding window" of compression, which is the distance that gzip can travel back while compressing to create pointers. This could come up if, for example, you use the same SVG icon in the header and footer of a very long page — in this instance, gzip wouldn't point one to the other due to the distance between them.
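The sliding window effect can be seen with a small experiment. This sketch uses pseudo-random bytes as a stand-in for both the "icon" and the unrelated content between the two copies, so the only repetition gzip could possibly find is the second copy of the icon; the sizes (4 KB icon, 40 KB filler) are arbitrary choices for illustration.

```python
import gzip
import random

rng = random.Random(0)  # fixed seed so the run is repeatable


def noise(size: int) -> bytes:
    """Pseudo-random bytes: a stand-in for content with nothing to deduplicate."""
    return bytes(rng.getrandbits(8) for _ in range(size))


icon = noise(4_000)     # hypothetical "icon": 4 KB with no internal repetition
filler = noise(40_000)  # 40 KB of unrelated content, longer than the 32 KB window

near = gzip.compress(icon + icon + filler)  # second copy right behind the first
far = gzip.compress(icon + filler + icon)   # second copy more than 32 KB away

print(f"copies adjacent:    {len(near)} bytes")
print(f"copies 40 KB apart: {len(far)} bytes")
```

Both inputs contain exactly the same bytes in a different order, yet the adjacent layout should compress smaller by roughly the size of one icon, because only there is the first copy still inside the window when the second one arrives.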
You should still always minify code even if you’re gzipping it. Minifying CSS removes spaces and indentation, and minifying JS goes even further by removing unnecessary code and renaming variables into a short, repetitive scheme. Gzipped, minified JS tends to be much smaller than gzipped JS alone.
Note that I said single file at the start of this section: including the same Sass mixin in two separate CSS files will, of course, produce no savings. The same goes for getting back similar content on multiple JSON requests, or putting the same SVG in your JS and your markup. This can be an argument for inlining more assets into HTML than you currently do, as additional repetitive strings cost fewer and fewer bytes.
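The single-file caveat can be sketched the same way. The CSS below is invented for the example: shared plays the role of one mixin's output that lands in two stylesheets, and home_css and blog_css are hypothetical files that each add one rule of their own.

```python
import gzip

# Hypothetical output of one Sass mixin that ends up in two stylesheets.
shared = (
    ".card{display:flex;flex-direction:column;padding:16px 24px;"
    "border:1px solid #d0d7de;border-radius:6px;"
    "box-shadow:0 1px 3px rgba(27,31,36,.12);background:#fff;color:#24292f}"
).encode("utf-8")
home_css = shared + b".hero{min-height:60vh;text-align:center}"
blog_css = shared + b".post{max-width:72ch;line-height:1.6}"

# Each file is compressed on its own, so neither can point into the other.
separate = len(gzip.compress(home_css)) + len(gzip.compress(blog_css))
combined = len(gzip.compress(home_css + blog_css))

print(f"two gzipped files: {separate} bytes")
print(f"one gzipped file:  {combined} bytes")
```

The combined file should come out smaller, since the second copy of the shared rules collapses into pointers, while the two separate files each pay for the rules in full.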
How do I learn more?
Enabling gzip is different on every platform, so check out this handy guide from Patrick Sexton for enabling it on Nginx, Apache, and other platforms.
If you want to mess around with gzip locally, take a look at the command line usage of the gzip-size npm library. This will allow you to easily check gzip sizes when in local development, without running output through a server, just by running