Paul Calvano

Choosing Between gzip, Brotli and zStandard Compression

2024-03-19T04:00:00+00:00

HTTP compression is a mechanism that allows a web server to deliver text based content using less bytes, and it’s been supported on the web for a very long time. In fact the first web browser to support gzip compression was NCSA Mosaic v2.1 way back in 1993! The web has obviously come a long way since then, but today pretty much every web server and browser still supports gzip compression.

In recent years, new and innovative compression methods have gained browser support. One in particular that has achieved widespread adoption is Brotli. First supported in Chrome 50 (2016), it was supported by all modern browsers a year later. Brotli is able to compress files much smaller than gzip, albeit with a higher computational overhead. Based on HTTP Archive data from January 2024, Brotli is actually used more than gzip for JavaScript and CSS!

Facebook’s zStandard compression is another promising new method which aims to serve smaller payloads compared to gzip while also being faster. zStandard was added to the IANA’s list of HTTP content encodings with an identifier of “zstd”, and support for it was added to Chrome in version 123, which was released this month. Facebook recently shared some benchmarks that show it performing signifcantly faster than gzip.

Beyond this, there’s also shared dictionaries for Brotli and zStandard, which have the potential to significantly reduce byte sizes.

While it may take some time for browsers, web servers and CDNs to catch up, it’s worth pondering which compression method is right for your content. A few years ago I wrote a blog post about Brotli compression as well as a tool to help you determine how Brotli could compress your content relative to gzip. I’ve updated this tool to include zStandard compression as well as show the relative latency incurred at each compression level. You can find the new/updated tool at https://tools.paulcalvano.com/compression-tester/

How HTTP Compression Works

When a client makes an HTTP request, it includes an Accept-Encoding header to advertise the compression encodings it can support. The web server then selects one of the advertised encodings that it also supports and serves a compressed response with a Content-Encoding header to indicate which compression was used.

In the example below, the client advertised support for gzip, Brotli, and Deflate compression. The server returned a gzip compressed response containing a text/html document.

    GET / HTTP/2
    Host: httparchive.org
    Accept-Encoding: gzip, deflate, br

    HTTP/2 200
    content-type: text/html; charset=utf-8
    content-encoding: gzip

If a client sends multiple encodings in its Accept-Encoding header, then the server will have to choose one. For example, if I send an HTTP request to Facebook’s homepage and advertise support for gzip, Brotli and zStandard - their server chooses to deliver the response via zStandard.

    GET / HTTP/2
    Host: www.facebook.com
    accept-encoding: gzip, deflate, br, zstd
    
    HTTP/2 200 
    content-encoding: zstd
    content-type: text/html; charset="utf-8"

Gzip Compression

Gzip is fundamentally supported by all web servers, browsers and intermediaries (CDNs, proxies, etc), mostly by default. Despite how easy it is to serve content gzip compressed, there are a few things to keep in mind:

Most web servers and CDNs default to gzip compression level 6, which is a reasonable default. Some servers (ie, NGINX) default to gzip level 1, which usually results in faster compression times but results in a larger file. Make sure to check your compression levels!
Many CDNs can gzip compress resources for you, which is helpful if you missed something on your origin server. Some CDNs enable this by default.
Some CDNs may decompress and recompress content for you if they need to inspect or manipulate the contents. This may have an impact on your time to first byte (TTFB) especially if the content that needs to be recompressed is large.

Brotli Compression

Brotli compression was created by Google and is supported across all major web browsers. At its highest compression level files can often be reduced 15-25% more than gzip. Higher compression levels come at a significant latency cost though.

Many popular web servers support Brotli via modules. For example, Apache has a mod_brotli module which defaults to level 5. NGINX has a module called ngx_brotli which defaults to level 6. Brotli is supported by a variety of CDNs, albeit in slightly different ways. It’s best to understand how your specific CDN handles this compression algorithm. For example:

Most CDNs have the ability to serve Brotli compressed payloads from an origin server that supports Brotli.
- This is done by varying the content by the Content-Encoding response header.
- Some CDNs do this by default, others require additional steps.
Some CDNs can Brotli compress responses even if an origin server does not support Brotli.
- Usually they fetch the result via gzip from your origin.
- Some CDNs can perform on the fly Brotli compression for static and dynamic, usually at specific compression levels they define.
- Some CDNs can pre-compress static content at higher compression levels

While the highest compression levels might be ideal for static content, you’ll want to be careful with dynamic content to avoid impacting your TTFB. Additionally if your CDN offers dynamic Brotli compression, then you may want to determine if the byte savings are worth the latency overhead from decompressing and re-compressing the response at the edge. In some cases it may be best to Brotli compress dynamic content from the origin, or stick with Gzip if your origin doesn’t support Brotli.

zStandard

zStandard is a newer compression method developed by Facebook. It was designed to compress at ratios similar to gzip compression, but with faster compression and decompression speeds. Historically its usage has been mostly filesystem related, however Chrome is the first browser to support it as of March 2024. While most CDNs do not have support for zStandard yet, many could feasibly vary the cache key for origin compressed responses similar to the way they support Brotli.

When examining the top 10k sites in the HTTP Archive, zStandard compression usage appears to be mostly confined to sites owned by Meta such as www.facebook.com, www.instagram.com, www.messenger.com, www.whatsapp.com, etc) and Netflix. The Meta sites are delivering zStandard encoded content to Chrome browsers. Netflix’s appears to default to gzip compression, possibly implementing zStandard as a test.

Compression Tester

A few years ago I wrote a compression tool that was designed to help you determine whether gzip is sufficient for your content, or if Brotli could provide a reduction in payload sizes. I’ve released a new version of this tool that includes zStandard compression as well as compression times for each individual test.

You can find the new/updated tool at https://tools.paulcalvano.com/compression-tester/

When we’re looking at these results, a few things you’ll want to keep in mind are:

If your content is dynamic / non-cacheable, then it will be very sensitive to any additional latency you add. Higher Brotli and zStandard compression levels can bec computationally expensive, so it’s best to use a more moderate compression level.
If your content is static / cacheable, then you may want to use higher compression levels.
Compression settings are typically set on a server and not per request. If you do not have flexibility with setting this at a more granular level, you should go with the lowest common denominator required for all of your content.

Let’s dive deeper into a few examples:

Example: Facebook

When browsing Facebook, the largest JavaScript resource I saw was 2.37 MB uncompressed. It was delivered to my browser as a 514 KB zStandard compressed response. When testing this file via the Compression Tester tool, it was served a 645 KB gzip and 526 KB Brotli response.

Comparing the compression test results to the delivered responses, we can see that the server likely compressed in the middle range. For example it delivered a 526 KB Brotli payload (presumably level 5), but it could have delivered Brotli level 11 . This would come at a higher computational cost though - which may have been a factor in their selection. zStandard also appears to be delivering a smaller file, but based on this test the computational overhead is more than double what gzip level 6 costs.

For this response, Brotli level 9 would provide the best compression ratio with a CPU overhead similar to zStandard 12. If it’s possible to precompress payloads, then the highest compression levels would reduce the payloads further. However Brotli appears to outperform zStandard in both compression ratio and compression time up until level 9 for this response.

The base HTML page for Facebook, it is 62 KB uncompressed. They support gzip, Brotli and zStandard - and it seems to be compressed at the lowest compression levels. While compression level 1 is often undesired, in this case there doesn’t appear to be much of an advantage of applying higher levels of compression due to limited byte savings. Additionally for Facebook’s HTML all 3 compression algorithms produce a similar size payload - but Brotli and zStandard both compress their HTML faster than gzip.

Sandals.com Homepage

Out of the top 10 thousand websites, Sandals has the largest HTML payload - clocking in at almost 7.9MB (delivered as a 804 KB gzip compressed payload). Additionally they have a 9.4MB script bundle (gzip compressed to 2.5 MB). In the waterfall graph below you can see the impact that these large payloads are having on the experience. Reducing them solves only part of the problem, as that’s still a huge amount of content for the browser to parse, evaluate, and execute. But let’s see how these compression algorithms do.

Sandals appears to be gzip compressing at the highest possible compression level. If we assume that this page is dynamically generated and try to stay within the relative compression times:

Brotli level 9 or zStandard level 15 would result in approximately 75% smaller payload with faster compression times compared to gzip 9.
Brotli level 5 appears to be a good tradeoff between compression ratio and time (209 KB, and 73% faster to compress)
In this particular example, zStandard is producing a larger payload compared to Brotli.

Sandal’s is also using Cloudflare’s CDN which supports Brotli compression, so enabling this could be a quick performance win for them.

Compression Levels vs Compression Times

When considering a compression level, it’s important to balance the time it takes to compress a payload with the estimated byte savings. For example, utilizing the maximum compression level will often produce the smallest payload, but will do so at a higher computational cost. Likewise, the lowest compression levels are often really fast to compress, but might not be as effective.

I tested the base HTML of the top 10 thousand websites’ HTML pages and their largest first party request. The results below show that the majority of gzip compression seems to fall within the estimated range of 4-6 (likely 6 since that is a common default). However ~30% of sites are utilizing gzip level 1, which often provides inadequate compression. The majority of Brotli compression appears to be level 4. However there’s ~25% of sites delivering Brotli level 1. And finally the majority of Brotli level 11 usage seems to be for static content.

The table below details a few sites that are serving their HTML using either gzip or Brotli level 1. Using such a low compression level will likely result in larger payload. In many of these examples, the Brotli compressed payload is actually larger than gzip. Fortunately some of these sites are leveraging services that will automatically deliver gzip because of the byte discrepancy - but they could still benefit from increasing the compression level.

	Content Length Delivered (KB)
url	Uncompressed	gzip	Brotli
https://www.epicurious.com	3877	393	684
https://www.anker.com	3273	628	232
https://www.bonappetit.com	1889	230	348
https://www.allure.com	1817	239	372
https://www.gq.com	1665	226	340
https://www.newyorker.com	1638	252	342
https://www.vanityfair.com	1593	231	339
https://seekingalpha.com	1453	279
https://www.vogue.com	1446	204	312
https://www.cntraveler.com	1422	205	311

The table below shows the results of applying gzip level 6, brotli level 5 and zStandard level 12 against the base HTML on these pages A few observations:

Gzip level 6 reduces most of the gzipped payloads by 25-30% compared to the size delivered via gzip level 1.
Brotli level 5 was able to reduce their sizes by almost 75% compared to gzip level 1. In many cases the compression time overhead is comparable to gzip level 6 - but this varies.
zStandard level 12 was able to provide similar compression levels to brotli level 5 while maintaining compression times similar to gzip level 6.

Based on these examples, real time zStandard compression seems to provide a slight advantage over Brotli - achieving the same sizes with faster compression times.

	>After applying higher compression levels			Compression Time (seconds)
url	gzip (L6)	Brotli (L5)	zStd (L12)	gzip (lL6)	Brotli L5)	zStd (L12)
https://www.epicurious.com	271	145	145	0.083	0.093	0.068
https://www.anker.com	412	219	219	0.099	0.081	0.088
https://www.bonappetit.com	169	113	114	0.044	0.061	0.054
https://www.allure.com	179	135	135	0.051	0.067	0.043
https://www.gq.com	168	118	119	0.049	0.053	0.050
https://www.newyorker.com	191	127	128	0.053	0.071	0.043
https://www.vanityfair.com	173	128	128	0.058	0.098	0.042
https://seekingalpha.com	227	171	174	0.049	0.073	0.079
https://www.vogue.com	152	115	115	0.053	0.075	0.047
https://www.cntraveler.com	155	117	117	0.052	0.052	0.048

Cacheable objects can often be compressed at higher levels - especially if they are able to be precompressed. When evaluating the largest first party objects hosted on the top 10 thousand sites, I found that Brotli level 5 and zStandard level 12 resulted in similar file sizes and compression times - much like the results above. However when evaluating Brotli compression level 11 vs zStandard 19, the smallest files are almost always generated by Brotli level 11. zStandard’s compression times are 4x faster than Brotli 11 though. So if you have the ability to precompress your objects, Brotli 11 may still be preferred. If not, then zStandard may be the better option.

	% File size reduction compared to gzip level 6
	br5	zstd12	br11	zstd19
Average	8.85%	9.07%	19.18%	14.11%
p50	6.99%	7.33%	17.53%	12.40%
p75	10.25%	10.94%	22.21%	17.10%
p90	15.29%	15.74%	27.21%	22.40%

Conclusion

HTTP compression is an incredibly important feature, and has long been overlooked due to the universal support of gzip compression across all web servers and browsers. It’s great to see innovation in this space, and the addition of another compression encoding along with the possibility of shared dictionary compression in the future.

The research I’ve shared in this article also shows that for many sites Brotli will provide better compression for static content. zStandard could potentially provide some benefits for dynamic content due to its faster compression speeds. Additionally:

A surprising amount of sites are using low level gzip compression, and should consider increasing their compression levels.
For dynamic content
- Brotli level 5 usually result in smaller payloads, at similar or slightly slower compression times.
- zStandard level 12 often produces similar payloads to Brotli level 5, with compression times faster than gzip and Brotli.
For static content
- Brotli level 11 produces the smallest payloads
- zStandard is able to apply their highest compression levels much faster than Brotli, but the payloads are still smaller with Brotli.

Of course your mileage will vary and there’s really no universal answer. It’s worth running some tests on your site to see whether your content would benefit, and which compression levels to consider. And then experimenting with RUM data to evaluate whether the approach you decide on is successful. I hope that tool I created helps you get started on this analysis for your site!

Identifying Font Subsetting Opportunities with Web Font Analyzer

2024-02-16T14:00:00+00:00

10 years ago, custom web fonts were a niche feature used by ~10% of websites. Today they are used by over 83% of websites! Fonts are generally loaded as a high priority resource, and some sites use techniques such as preload and early hints to get them to load as quickly as possible. Custom web fonts are important to many sites, since rendering with a specific typography is often preferred from a design perspective. However, this can easily become a performance issue when a large amount of fonts are loaded.

In this article, we’ll explore some potential issues around font loading and the performance benefits of a lesser used feature - font subsetting. We’ll look at HTTP Archive data to understand how prevalent the issue is, and then examine a few case studies. And finally, I created a new tool - Web Font Analyzer - that may help you explore whether font subsetting is something to consider for your site.

Background

There’s been a lot written about web fonts over the years. In fact the HTTP Archive’s Web Almanac has an entire chapter dedicated to this topic. A few years ago Zach Leatherman wrote a fantastic checklist on font loading strategies, which included using preload to load fonts earlier. Way back in 2016 the CSS font-display attribute was introduced, and today it is supported on all modern browsers and used by almost a third of websites! Variable fonts are heavily used by Google Fonts which are widely used across the web. Barry Pollard wrote a great article on self hosting Google Fonts. And just last month Stoyan Stefanov wrote an article about google font sizes.

The font-display feature was a major step forward in font loading performance, as it gave developers control over whether to prioritize rendering or typography. Using font-display:swap would avoid a rendering delay by painting text using a system font and then swapping the actual font after it’s loaded.

Font optimization strategies are great, but when combined the results can be confusing. For example, preloading fonts is a great way to get them to load earlier, but using font-display:swap at the same time may result in a less effective use of bandwidth early in the page load. It’s a good idea to understand exactly how your fonts are loading, how they are being used and what they contain.

Font sizes across popular sites

Using the HTTP Archive we can explore font usage across millions of websites. For the purpose of this analysis, we’ll focus on the top million sites. As of January 2024, 81.5% of the top million sites are utilizing at least one custom web font. Usage varies widely, with the average site loading 238 KB of fonts.

Depending on the font loading strategy used, fonts may be delivered at different parts of the page loading lifecycle. For the purpose of this analysis, I’m going to break this up into 4 parts:

Before FCP - Means that the fonts finished downloading before the First Contentful Paint was measured. This could indicate that a font was render blocking, fetched with a high priority or preloaded.
FCP to LCP - Means that the font was loaded in between the First and Largest Contentful Paint. These fonts were loaded while other resources critical to the user experience were loading
LCP to onLoad - The fonts were loaded after the Largest Contentful Paint but before the onLoad event.
After onLoad - This could indicate that the font was either delayed or not used by the DOM until much later on.

Out of the top million sites that are loading fonts, 63.6% are loading at least 1 custom web font prior to FCP. I would expect this to be high, especially considering how often preloading fonts is recommended.

However, 25% of these sites are loading more than 75 KB of fonts before the FCP, and over 4500 sites are loading more than 500 KB of fonts during this time! Regardless of the benefits of the font loading strategies applied - I’d say that there is likely some waste occurring.

Glyphs vs Characters on Pages

A font is essentially a typographical representation of a character. One can display the same text with different fonts and they will appear differently on a web page - however the unicode value for the character will always be the same. For example, the space character is 0032, the values ABC are 0065, 0066 and 0067 respectively across numerous fonts.

Some font packages are designed to display icons, and others are designed for text. However the one thing that is not always apparent to developers is how many glyphs are included in each font package. For example, a popular Google font called “Material Icons” is used on 114K websites. It contains 2229 glyphs and adds 128 KB to websites using it. It’s very unlikely that sites are making use of all these glyphs.

The most popular Google font is Open Sans, and it’s used by over 1.5 million websites. You can use a tool like FontDrop to explore the contents of your fonts, and you might be surprised! This font contains 280 glyphs and adds 43 KB to pages using it. Loading a few fonts like this can really add up.

So how does that compare with the rendered HTML of a page? Using the HTTP Archive, I was able to write a query that extracts and summarizes the visible glyphs in rendered HTML pages for the top 10K sites. We can then compare this to the minimum and maximum number of glyphs in a page’s custom web fonts. The results might surprise you!.

The median popular site contains 3 fonts, totalling 95 KB. The rendered HTML has 101 glyphs, while the smallest font has 248 glyphs.
At every percentile, the number of glyphs in the smallest font has 2-3x the number of glyphs compared to the rendered HTML.

	Font Glyphs vs Rendered HTML
	Font Count	Font Weight	Visibly Glyphs	Min Font Glyphs	Max Font Glyphs
p50	3	95 KB	101	248	524
p75	5	193 KB	124	444	861
p95	9	542 KB	221	901	2229
p99	15	1001 KB	631	1530	3248

How to Test Your Page

There’s a few tools in this area that provide some insight into font usage:

FontDrop provides a useful way to explore the contents of your fonts. Simply drop the font file onto their UI and it will show you the metadata and all the glyphs contained in it.
YellowLabs has a font audit that will tell you if you have unused glyphs and summarize it by character sets. This can also be useful when deciding to subset fonts.

As part of this research, I created a tool named “Web Font Analyzer” that will leverage results from a WebPageTest measurement to show you a summary of the glyphs used on a page, when they are loaded relative to some performance metrics and how many glyphs are supported by each font. I’ve also attempted to provide some guidance on how you can use this information in the tool.

You can find the tool at https://tools.paulcalvano.com/wpt-font-analysis/.

Using the HTTP Archive we can identify sites that exhibit some of these issues, and there are quite a few. Let’s look at a few examples using this tool alongside WebPageTest.

Example - Mayoclinic

After running a WebPageTest measurement for Mayoclinic’s homepage, I copy and pasted the URL of the test results into the font analyzer tool. The tool will fetch the measurement data and create a summary of the fonts that were loaded.

The page summary section tells you how many fonts you are loading, and how large they are. It also provides a summary of the number of visible characters in the rendered HTML. In this example, there were 366 KB of fonts loaded on a page that contained only 85 visible glyphs. You can also click on “show glyphs” to see a summary of the actual glyphs in the rendered HTML.

Next we can see where these fonts are loading. The majority of these fonts are loading prior to the First Contentful Paint.And all of them are loading prior to the Largest Contentful Paint. It’s very likely that these font assets are competing with other resources for bandwidth.

If we examine the summary of fonts used on the page, we can see that many of them contain more than 500 glyphs!

In the WebPageTest measurement we can see 8 font files loading immediately after the HTML.

In the HTTP response header for the base page, there is a Link header with preloads for 5 font files. Of these preloads, 4 are for fonts hosted on www.mayoclinic.com, and the other is design.mayloclinic.com. While the preloaded fonts are downloading, the HTML initiates 4 preloads for the font files on www.mayoclinic.com (however those are already in flight). Then the fonts.css file loads, which references fonts from design.mayoclinic.com. During this page load, approximately 700ms was spent loading fonts, half of which were unused. Each of these font files were ~42 KB and contained > 500 glyphs. Meanwhile the page contained less than 100 glyphs in the rendered HTML.

Recommendations:

Update CSS to use the fonts from www.mayoclinc.com.
Subset the font files to a latin character set to reduce their weight by up to 90%.
Continue preloading the (much smaller) font files.

Example - Kia

You can see another example on Kia.com US homepage. The page weight is 21.5 MB, and 3 MB of that are font files. There are 85 visible glyphs on the page, and expanding the glyphs we can see that there are two korean glyphs (codepoints 54620 and 44544) that are used to render the “Korean” language switch link in the footer.

Prior to LCP, there was almost 7 MB of content loaded, including all of the fonts. Font weight accounts for 14% of overall page weight, but a whopping 42% of bytes loaded prior to LCP! The fonts are almost certainly competing for bandwidth.

When looking at the individual font assets loaded, we can see that these are not using font-display:swap, and that 3 of them contain 15,190 glyphs! It also appears that all of the KiaSignature fonts are delivered as WOFF files (converting them to WOFF2 would reduce some of the bytes). Their icons font is also delivered as a TTF file, and not gzip compressed.

The fonts are referenced in clientlib-base.css and they are not being preloaded. At the start of the waterfall we can see the first-party CSS and JS loading, but then clientlib-base.js gets interrupted by the higher priority KiaSignatureRegular.woff font file. This font is 965 KB, which delays the JavaScript and ultimately the first contentful paint. Additionally, the font is only cached for 1 day - so repeat visitors will need to download the fonts again.

Further down the waterfall we can see numerous PNG images. However they are being delayed by 2.1 MB of fonts. At the same time, a 2 MB hero image loaded via Adobe Experience Cloud (Scene7) is fetched and used as the poster image for a hero video.

While fonts are not the only factor affecting the performance of this page, they are clearly holding back the FCP and LCP by competing with other resources for bandwidth. Applying some easy to implement performance optimizations on the images such as lazy loading, using optimal image formats, and better cache directives will help - but ultimately the font loading will still cause delays - so subsetting them would be ideal.

Recommendations:

Subset the KiaSignature fonts for regional websites so that they are using only the necessary glyphs. For example on the US site, using latin + extended latin + the specific KR characters needed.
Convert WOFF files to WOFF2
Enable gzip compression for kia-icons.ttf. This would reduce the file to 104 KB (45% smaller). Subset the font to reduce the size even further.
Fonts are only cached on the browser for 1 day. TTL should be increased.
Not font related, but consider image and video compression, alternate image formats,and lazy loading images.

How to Subset Fonts

In some of the examples above, font subsetting would be an excellent optimization. However based on the data from the HTTP Archive, it doesn’t seem that this is being used very often. Zach Leatherman wrote a great tool for subsetting fonts, called Glyphhanger. You can also use the fonttools command line library to subset your fonts.

To use fonttools, you need to install the library.

sudo apt-get install fonttools

Once installed, you’ll have the pyfmtfont application which can be used to subset fonts. You can see some examples of its usage in this blog post.

When I was creating the WPT Font Analyzer tool, I started using a fontawesome font to render a checkbox, an exclamation point and an information circle icon. This resulted in a 124 KB font, which was many times the size of the rest of the tool! I was able to reduce this to a small 1 KB font by running the following command to create a subsetted font with the glyphs I needed.

pyftsubset fa-solid-900.woff2 \
        --unicodes=U+f05a+f058+f071 \
        --flavor=woff2 \
        --output-file=fa-solid-900-subset.woff2

Let’s try this against the 3 Kia fonts we saw from the previous example. In this example, I’m subsetting

Basic latin characters (02-7E)
Copyright symbol (A9)
Registered Trademark symbol (AE)
Trademark Symbol (2122)
Double Quotes (201C-201D)
Korean Hangul syllables (D55D, ADC0)

pyftsubset KiaSignatureRegular.woff \
          --flavor=woff2 \
          --unicodes="U+0020-007E,U+00A9,U+00AE,U+2122,U+201C-021D,U+D55D,U+ADC0" \
          --output-file=KiaSignatureRegular_subset.woff2

The resulting file is 10 KB for each of the KiaSignature fonts. So in this example 3 MB of font weight could be reduced to 30 KB! You can download this subsetted font here and examine it in FontDrop.

Conclusion

Fonts can be challenging to support from a web performance perspective, especially as their placement on modern web pages occurs at the intersection of design and web operations. Over the years there have been some innovative approaches to font loading, designed to limit the performance overhead of them. There’s also been a lot of great research in the web performance industry on this topic and best practices published. It’s always worth evaluating the end user experience to ensure that the tools and optimizations put in place are having the desired impact on user experience. I’m hopeful that the Web Font Analyzer tool I created adds another lens for you to evaluate font loading through.

Many thanks to Barry Pollard and Tim Vereecke for reviewing this.

Queries

Interested in seeing some of the HTTP Archive Queries behind this analysis? Here’s a few of the queries I used. Please be aware that some of these queries will exceed the free tier quota - so be careful when running them! (You can read more on how to minimize query costs at har.fyi.)

Percent of sites using fonts by rank

This query will calculate the percentage of sites that are using at least 1 custom web font, and group the results by rank.


SELECT 
  rank, 
  IF(SAFE_CAST(JSON_EXTRACT(summary, "$.reqFont") AS INT64) >0,"Fonts", "No Fonts") as f,
  COUNT(*),
  COUNT(0) / SUM(COUNT(0)) OVER (PARTITION BY rank) AS pct

FROM `httparchive.all.pages`
WHERE 
  date = "2024-01-01"
  AND is_root_page = true
  AND client = "mobile"
GROUP BY rank,f
ORDER BY rank ASC

Average font weight across top million sites

This query will calculate the average, median and p75 font weight of sites with a rank <= 1 million, which are using at least 1 custom web font.


SELECT 
  COUNT(*) AS freq,
  ROUND(AVG(SAFE_CAST(JSON_EXTRACT(summary, "$.bytesFont") AS INT64)),2) AS avgFontSize,
  ROUND(APPROX_QUANTILES(SAFE_CAST(JSON_EXTRACT(summary, "$.bytesFont") AS INT64), 100)[SAFE_ORDINAL(50)],2) p50FontSIze,
  ROUND(APPROX_QUANTILES(SAFE_CAST(JSON_EXTRACT(summary, "$.bytesFont") AS INT64), 100)[SAFE_ORDINAL(75)],2) p75FontSIze
FROM `httparchive.all.pages`
WHERE 
  date = "2024-01-01"
  AND is_root_page = true
  AND client = "mobile"
  AND rank <= 1000000
  AND SAFE_CAST(JSON_EXTRACT(summary, "$.reqFont") AS INT64) > 0

When do fonts start loading?

This query will run against the top 100K sites to identify the FCP, LCP and onLoad time for each measurement. It JOINs the requests table to search for the earliest start time for a font download. Then it groups the results by time interval - such as "Before FCP", "Between FCP and LCP", etc. This query processes ~1TB of data.


CREATE TEMP FUNCTION GetLcpTime(json_data STRING)
RETURNS INT64
LANGUAGE js AS """
  var data = JSON.parse(json_data);

  if (data && Array.isArray(data)) {
    for (var i = 0; i < data.length; i++) {
      if (data[i] && data[i].name === 'LargestContentfulPaint') {
        return data[i].time;
      }
    }
  }
  
  return null;
""";

SELECT 
CASE
    WHEN FCP IS null THEN 'error - no FCP'
    WHEN LCP IS null THEN 'error - no LCP'
    WHEN onLoad IS null THEN 'error - no onLoad'
    WHEN firstFontStartTime IS null THEN 'error - no firstFontStartTime'
    WHEN firstFontStartTime < FCP THEN "Before FCP"
    WHEN firstFontStartTime BETWEEN FCP AND LCP THEN "Between FCP and LCP"
    WHEN firstFontStartTime BETWEEN LCP AND onLoad THEN "Between LCP and onLoad"
    WHEN firstFontStartTime >  onLoad THEN "After onLoad"
    ELSE 'Unhandled Error'
  END AS test,
  COUNT(*)
FROM (
  SELECT 
    p.page,
    SAFE_CAST(JSON_EXTRACT(p.payload, "$._firstContentfulPaint") AS INT64) AS FCP,
    getLcpTime(JSON_EXTRACT(p.payload, "$._chromeUserTiming")) AS LCP,
    SAFE_CAST(JSON_EXTRACT(p.summary, "$.onLoad") AS INT64) AS onLoad,
    MIN(SAFE_CAST(JSON_EXTRACT(r.payload, "$._all_start") AS INT64)) AS firstFontStartTime,
  FROM `httparchive.all.pages` AS p
  INNER JOIN `httparchive.all.requests` AS r
  ON CAST(JSON_EXTRACT(p.summary, "$.pageid") AS INT64) = CAST(JSON_EXTRACT(r.summary, "$.pageid") AS INT64)
  WHERE 
    p.date = "2024-01-01" AND r.date = "2023-11-01"
    AND p.is_root_page = true AND r.is_root_page = true
    AND p.client = "mobile" AND r.client = "mobile"
    AND rank <= 100000
    AND r.type = "font"
    AND SAFE_CAST(JSON_EXTRACT(p.summary, "$.reqFont") AS INT64) > 0
  GROUP BY 1,2,3,4
)
GROUP BY 1
ORDER BY 2 DESC

How many font bytes are downloaded before FCP?

This query will run against the top 1 million sites to aggregate the number of bytes loaded before FCP. This query processes ~1.3TB of data.



SELECT 
  COUNT(*),
  ROUND(APPROX_QUANTILES(fontBytesBeforeFCP, 100)[SAFE_ORDINAL(50)],2) p50FontSize,
  ROUND(APPROX_QUANTILES(fontBytesBeforeFCP, 100)[SAFE_ORDINAL(75)],2) p75FontSize,
  ROUND(APPROX_QUANTILES(fontBytesBeforeFCP, 100)[SAFE_ORDINAL(95)],2) p95FontSize,
  ROUND(APPROX_QUANTILES(fontBytesBeforeFCP, 100)[SAFE_ORDINAL(99)],2) p99FontSize,
FROM (
  SELECT 
    p.page,
    SUM(SAFE_CAST(JSON_EXTRACT(r.summary, "$.respSize") AS INT64)) AS fontBytesBeforeFCP,
  FROM `httparchive.all.pages` AS p
  INNER JOIN `httparchive.all.requests` AS r
  ON CAST(JSON_EXTRACT(p.summary, "$.pageid") AS INT64) = CAST(JSON_EXTRACT(r.summary, "$.pageid") AS INT64)
  WHERE 
    p.date = "2023-11-01" AND r.date = "2023-11-01"
    AND p.is_root_page = true AND r.is_root_page = true
    AND p.client = "mobile" AND r.client = "mobile"
    AND rank <= 1000000
    AND r.type = "font"
    AND SAFE_CAST(JSON_EXTRACT(p.summary, "$.reqFont") AS INT64) > 0
    AND SAFE_CAST(JSON_EXTRACT(r.payload, "$._all_start") AS INT64) < SAFE_CAST(JSON_EXTRACT(p.payload, "$._firstContentfulPaint") AS INT64)
  GROUP BY 1
)

Font glyphs vs rendered HTML

This query provides a list of the top 10K web sites, individual font URLs, the number of glyphs in the font, and the number of visible glyphs in the rendered HTML. It also provides a link to the WebPageTest results from the HTTP Archive run for further analysis. This query processes almost 4 TB of data. I stored the results of the query in a scratchspace table in the HTTP Archive `httparchive.scratchspace.2024_01_01_font_glyphs_top100k`


CREATE TEMPORARY FUNCTION  CountVisibleGlyphs(html STRING)
RETURNS INT64
LANGUAGE js
AS """
  var extractedText = '';
  if (html) {
    // Remove HTML tags and keep only text content
    extractedText = html.replace(/<[^>]+>/g, '');

    // Remove extra spaces and newlines
    extractedText = extractedText.replace(/\\s+/g, ' ');

    // Remove leading and trailing spaces
    extractedText = extractedText.trim();

    // Count unique characters
    var uniqueChars = new Set(extractedText.split('')).size;
    return uniqueChars;
  } else {
    return 0; // Handle cases with empty HTML content
  }
""";

SELECT 
  pages.page, 
  rank,
  fontRequests.url,
  CAST(JSON_EXTRACT_SCALAR(fontRequests.summary, "$.respBodySize") AS INT64) AS font_size,
  CAST(JSON_EXTRACT_SCALAR(fontRequests.payload, "$._font_details.counts.num_glyphs") AS INT64) AS glyphs,
  CountVisibleGlyphs(htmlRequests.response_body) AS visibleGlyphs,
  CONCAT("https://webpagetest.httparchive.org/result/", wptid, "/") AS webpagetest, 

FROM `httparchive.all.pages` AS pages
INNER JOIN `httparchive.all.requests` AS fontRequests
  ON CAST(JSON_EXTRACT(pages.summary, "$.pageid") AS INT64) = CAST(JSON_EXTRACT(fontRequests.summary, "$.pageid") AS INT64)
INNER JOIN `httparchive.all.requests` AS htmlRequests
  ON CAST(JSON_EXTRACT(pages.summary, "$.pageid") AS INT64) = CAST(JSON_EXTRACT(htmlRequests.summary, "$.pageid") AS INT64)

WHERE 
  pages.date = "2024-01-01" 
  AND fontRequests.date = "2024-01-01"
  AND htmlRequests.date = "2024-01-01"
    
  -- mobile
  AND pages.client="mobile" 
  AND fontRequests.client="mobile"
  AND htmlRequests.client="mobile"
  
  -- root pages
  AND pages.is_root_page = true
  AND fontRequests.is_root_page = true
  AND htmlRequests.is_root_page = true

  -- font requests and HTML request
  AND fontRequests.type = "font"
  AND htmlRequests.is_main_document = true

  AND rank <= 10000

Font glyphs vs rendered HTML - analysis

This query uses the results from the previous table to compared the minimum and maximum number of glyphs in a font to the glyphs in the rendered HTML. Since this query uses the saved results from the previous query, it process a very small amount of data (~1 MB)


WITH fontStats AS (
  SELECT page, 
  COUNT(*) AS fonts,
  MIN(CAST(font_size AS INT64)) AS minSize, 
  MAX(CAST(font_size AS INT64)) AS maxSize, 
  SUM(CAST(font_size AS INT64)) AS totalSize, 
  min(CAST(glyphs AS INT64)) AS minGlyphs, 
  MAX(CAST(glyphs AS INT64)) AS maxGlyphs, 
  AVG(visibleGlyphs) AS visibleGlyphs 
FROM `httparchive.scratchspace.2024_01_01_font_glyphs_top100k` 
GROUP BY 1
)

SELECT 
  ROUND(APPROX_QUANTILES(fonts, 100)[SAFE_ORDINAL(50)],2) p50FontCount,
  ROUND(APPROX_QUANTILES(fonts, 100)[SAFE_ORDINAL(75)],2) p75FontCount,
  ROUND(APPROX_QUANTILES(fonts, 100)[SAFE_ORDINAL(95)],2) p95FontCount,
  ROUND(APPROX_QUANTILES(fonts, 100)[SAFE_ORDINAL(99)],2) p99FontCount,
  ROUND(APPROX_QUANTILES(totalSize, 100)[SAFE_ORDINAL(50)],2) p50FontWeight,
  ROUND(APPROX_QUANTILES(totalSize, 100)[SAFE_ORDINAL(75)],2) p75FontWeight,
  ROUND(APPROX_QUANTILES(totalSize, 100)[SAFE_ORDINAL(95)],2) p95FontWeight,
  ROUND(APPROX_QUANTILES(totalSize, 100)[SAFE_ORDINAL(99)],2) p99FontWeight,  
  ROUND(APPROX_QUANTILES(visibleGlyphs, 100)[SAFE_ORDINAL(50)],2) p50VisibleGlyphs,
  ROUND(APPROX_QUANTILES(visibleGlyphs, 100)[SAFE_ORDINAL(75)],2) p75VisibleGlyphs,
  ROUND(APPROX_QUANTILES(visibleGlyphs, 100)[SAFE_ORDINAL(95)],2) p95VisibleGlyphs,
  ROUND(APPROX_QUANTILES(visibleGlyphs, 100)[SAFE_ORDINAL(99)],2) p99VisibleGlyphs,
  ROUND(APPROX_QUANTILES(minGlyphs, 100)[SAFE_ORDINAL(50)],2) p50MinGlyphs,
  ROUND(APPROX_QUANTILES(minGlyphs, 100)[SAFE_ORDINAL(75)],2) p75MinGlyphs,
  ROUND(APPROX_QUANTILES(minGlyphs, 100)[SAFE_ORDINAL(95)],2) p95MinGlyphs,
  ROUND(APPROX_QUANTILES(minGlyphs, 100)[SAFE_ORDINAL(99)],2) p99MinGlyphs,  
  ROUND(APPROX_QUANTILES(maxGlyphs, 100)[SAFE_ORDINAL(50)],2) p50MaxGlyphs,
  ROUND(APPROX_QUANTILES(maxGlyphs, 100)[SAFE_ORDINAL(75)],2) p75MaxGlyphs,
  ROUND(APPROX_QUANTILES(maxGlyphs, 100)[SAFE_ORDINAL(95)],2) p95MaxGlyphs,
  ROUND(APPROX_QUANTILES(maxGlyphs, 100)[SAFE_ORDINAL(99)],2) p99MaxGlyphs,   
FROM fontStats

Internet Explorer’s Decline in Usage in 2021

2022-01-31T14:00:00+00:00

In May 2021 Microsoft announced that it would be officially retiring Internet Explorer in favor of the Chromium based Microsoft Edge. Usage for the legacy browser had been very low over the past few years, although many websites have still maintained polyfills for the older browser. In fact a number of my clients have recently told me that supporting IE 11 is required by their business, and is still a consideration when adding new features to their sites. I’m sure that the official retirement of Internet Explorer will help numerous organizations embrace modern web features and move on from some expensive polyfills. The official retirement date is June 15, 2022.

A few years ago Philip Walton wrote an excellent blog post about loading polyfills only when needed. His strategy focused on optimizing the experience for users on modern browsers, and loading polyfills based on browser support for required features (or lack thereof). One thing I really like about this approach is that he recommends creating separate bundles for modern browsers. This avoids unnecessary CPU load on modern browsers, since they won’t have to parse, compile and evaluate the polyfill JavaScript.

More recently, Alan Davalos published a blog post where he made the argument that the baseline for web development in 2022 should change as a result of this. He provided numerous examples of websites that have officially dropped support for the legacy browser in 2021.

This made me curious about the current traffic levels for Internet Explorer, and how that has changed over the past year. Looking at Akamai’s mPulse data, which measures real user performance data for users across all browsers, I can see a steady decline since March 2021. Most of the Internet Explorer traffic was from the IE 11 browser, with earlier browsers being a small fraction (< 0.01% of pages).

Looking at this data daily, you can see that the decline in Internet Explorer usage was linear throughout the year. It also seems that the decline accelerated after the announcement in May 2021. Usage dropped even further in November 2021.

The zig-zag pattern in the daily traffic indicates that the majority of users utilizing IE 11 are doing so during the Monday-Friday work week, with traffic dropping by two thirds on the weekends. If business users are the primary source of Internet Explorer usage, then they could be at risk of 0-day exploits once official support ends.

I thought it would be interesting to see which countries had the highest percentage of Internet Explorer traffic during January 2022. Sure enough this varied from country to country. For example, in the United States 0.97% of pages were loaded from an Internet Explorer browser. In Great Britain, it only accounted for 0.29% of pages.

Some countries had a noticeably higher percentage of Internet Explorer traffic - for example South Korea (2.36%), China (1.76%), Hong Kong (1.78%). And then there are some countries with a shockingly high percentage of Internet Explorer traffic: Haiti (26.15%), Belize (9.12%), Jamaica (7.13%), Cambodia (6.8%) and more. The graph below shows a summary Internet Explorer usage per country. You can also view an interactive version of it here.

Internet Explorer is going away, and that is ultimately a good thing for the web. As support officially ends, website owners should consider removing expensive polyfills for the browser and add support for technologies that are supported on modern browsers. However this won’t happen by itself, and sites that have been bundling polyfills into a large monolithic bundle will need to put in the effort to analyze which ones are still needed. Fortunately there are some great resources on bundle analysis, such as Sia Karamalegos’ guide to Lighthouse Treemap and Nolan Lawson’s guide to bundle analysis tools.

Organizations should also look to remove the legacy browsers on their employees’ machines to avoid security risks down the line. Microsoft provides guides and resources to remove legacy Internet Explorer browsers from organizations, which are available here.

Page Visibility: If a tree falls in the forest…

2021-12-31T16:30:00+00:00

If a tree falls in the forest and no one is around, does it make a sound? Likewise, if a web page loads in a background tab then does its load time really matter? As a user, the time it takes for background tabs to load may seem irrelevant since you are unlikely to notice delays. However if you are managing a website and measuring user experience, then it’s important to understand how visibility state can influence the data you are analyzing.

In this blog post we’ll explore the Page Visibility API as well as some data from Akamai’s mPulse to understand the visibility states of real users loading billions of page views, and what it means for web performance.

Page Visibility

The Page Visibility API defines a programmatic way of determining the visibility state of a top-level browsing content, as well as a method of measuring visibility state changes over time. Web developers can use this information to determine whether a page is visible to an end user. It also gives them the ability to scale back the work being performed on a page load. The Page Visibility API is also supported in all modern web browsers.

Using this API is very simple. The attribute document.visibiltyState will return visible or hidden depending on whether the page is visible or not. If you want to see the value changing as you switch tabs, you can also monitor the visibility changes. For example,

console.log(document.visibilityState + ': ' + Date())
document.onvisibilitychange = () => console.log(document.visibilityState + ': ' + Date())

The W3C specification, also provides an example of using this API to decide whether to autoplay a video on page load based on visibility state. It adds an event listener to listen for changes in the visibility state so that the video playback can start automatically when the page is visible.

RUM tools can collect this data as well. For example, mPulse collects the visibility state of a page once the page load has completed and also measures for changes in visibility throughout the page load.

The data in this blog post is based on billions of page views across all sites using mPulse during the month of November 2021.

Visibility States

How often do you right click on a link and load a page in a background tab? Or click on a link on your mobile device and switch applications before it finishes loading? Or lock the screen on your mobile device while waiting for a page to load?

The graph below breaks down visibility states by device type using RUM data from mPulse. The visibility state is measured as soon as the onLoad event is fired. 11.18% of all Desktop page views were loaded in a hidden visibility state. Similarly, 9.59% of Mobile page views were loaded in a hidden visibility state.

Note: Less than 1% of pages were loaded in a prerender state. This feature has been deprecated in the Page Visibility API, supported inconsistently across browsers, and therefore not as useful for this analysis.

Why is Visibility State Important for Web Performance?

Most modern browsers prioritize work being done in the foreground, and as such one would expect that a page loaded in a background tab may be slower. Individually you can examine this behavior by loading a page and looking at it’s navigation timing data. Load that same page in a non-visible tab a few times and look at the difference in timing.

When analyzing the median page load time (onLoad metric) in mPulse, I can see a significant difference in performance based on visibility state. For example, the median time to load a page on Desktop was 2.8 seconds. The median load time for pages in a visible state was 2.7 seconds and the median load time in a hidden state took 4 seconds. The median load times for pages loaded in a hidden visibility state was 32% slower on Desktop and 37% slower on Mobile!

Going back to the tree falling in a forest analogy, does this really matter? I’d say yes and no, for the following reasons:

From a user experience, if the page is not visible then the user is not impacted by the delay.
As performance engineers we look to RUM data to tell us what our users are experiencing. If a large enough percentage of users are loading tabs in the background, then it may impact the metrics we are analyzing. This is exacerbated even further if you are looking at upper percentile stats.

The graph below illustrates the median load times as well as the p75, p95 and p99 based on visibility state. The p95 for all Desktop pages loaded in a visible state was 14.37 seconds. Comparatively, the page load times for hidden states was 37.65 seconds, which is more than twice as slow!.

Taking this one step further, the graph below shows the distribution of load times in a histogram, for both hidden and visible states. As the response times increase, hidden visibility states account for a larger percentage of experiences. At the 95th percentile, 25% of all pages were loaded in a hidden visibility state. (Note: the x axis in this graph ends at 18 seconds, which is around the 95th percentile).

Now let’s explore the upper percentiles. The graph below shows the same data for the slowest 5% of experiences. The percentage of hidden visibility states increases with respect to the load time, eventually approaching 36%. If you are analyzing upper percentiles to tackle some of your long tail performance issues - then not filtering out hidden visibility states will leave a significant amount of noise in your data.

Page visibility by desktop browser

During the month of November 2021, Chrome, Edge, Safari, Firefox and Internet Explorer accounted for 96.3% of all desktop page views. Chrome was the dominant browser, with 63.9% of all page views. However all of these browsers support the page visibility API.

The table below details the distribution of visibility states by Desktop browser. The percentage of hidden visibility states varies widely by browser. This may be influenced by a variety of factors, such as browser UI features (such as tabbed browsing) or by the end user switching between applications on their machine.

Chrome had the highest percentage of hidden visibility state page loads (12.9%) compared to other desktop browsers. Interestingly, Edge (which is now Chromium based) had 7.86% of page loads. Given that legacy Internet Explorer (version 11 and earlier) had only 4.4% hidden visibility states, it’s possible that some users of Edge and Internet Explorer are less likely to utilize the tabbed browsing features.

	% of page loads by visibility state
Browser	hidden	visible
Chrome	12.91%	87.09%
Edge	7.86%	92.14%
Safari	6.86%	92.89%
Firefox	9.25%	90.75%
IE	4.44%	95.56%

It’s worth noting that if you are measuring your site with a synthetic measurement, you are almost always loading a page in a visible state.

Page visibility by mobile browser

Mobile web traffic is split between browser apps, WebViews, and in-app browsers. Mobile Safari is the dominant mobile browser, with 39.2% of page loads. WebViews also represented a significant traffic share, accounting for 17% of all mobile page views (or 20% if we include Chrome Mobile and Firefox Mobile’s iOS apps).

The table below breaks down the visibility states measured by mobile web browsers. Overall, 5.98% of pages on Mobile Safari were loaded in a hidden visibility state, which is comparable to Desktop. However Chrome Mobile had less hidden visibility states (8.88% mobile vs 12.91% desktop). Interestingly, some Chromium based mobile browsers (such as Samsung Internet, MiuiBrowser and Edge Mobile) had a much higher percentage of hidden visibility states - likely due to differences in their UI and user base.

	% of page loads by visibility state
Browser	hidden	visible
Mobile Safari	5.98%	93.99%
Chrome Mobile	8.88%	91.12%
Samsung Internet	12.34%	87.66%
Crosswalk	2.93%	97.07%
Firefox Mobile	2.48%	97.52%
MiuiBrowser	16.43%	83.57%
Opera Mobile	12.45%	87.55%
Yandex Browser	6.16%	93.83%
Edge Mobile	10.10%	89.89%
UC Browser	9.74%	90.25%

When we look at distribution of visibility states across WebView’s and in-app browsers, we can see that WebViews have the highest percentage of hidden visibility states. This could be due to mobile users switching apps or locking their screens before a page is finished loading.

Another interesting observation is that the social media apps Instagram and Snapchat have a higher % of hidden visibility states compared to Facebook and Pinterest. Could this be due to differences in demographics of these platforms?

	% of page loads by visibility state
Browser	hidden	visible
Mobile Safari UI/WKWebView	26.61%	73.20%
Chrome Mobile WebView	12.73%	87.27%
Facebook	2.78%	97.22%
Chrome Mobile iOS	7.77%	92.20%
Instagram	7.16%	92.83%
LINE	1.87%	98.12%
Firefox iOS	6.24%	93.75%
Snapchat	9.46%	90.48%
Android Webkit	4.41%	95.59%
Pinterest	3.69%	96.31%

One thing in common across all browsers - desktop and mobile - is that there are a significant amount of hidden visibility states that are worth accounting for in our performance measurements.

Beyond page load time

When we look at other performance metrics, we can see that times measured for metrics such as Total Blocking Time, Largest Contentful Paint and First Contentful Paint were also impacted by visibility state. The graph below illustrates this for Chrome Desktop pages. The metric that was most impacted was Total Blocking Time, which was almost twice as slow when hidden. This makes sense since the browser is likely not prioritizing the execution of JavaScript on the page, which also explains the 46% increase in page load times

Largest Contentful Paint is one of the Core Web Vitals, which Google is using as a signal for search ranking. The mPulse data shows us that the p75 LCP for a hidden visibility state is 23% slower than when it is visible. At the p95, the LCP is almost twice as slow when hidden.

Google’s Chrome User Experience Report (CrUX) already filters out hidden visibility states, which means that your search ranking will not be impacted by slow non-visible page loads. However the tools you are using to monitor these thresholds may have a blind spot here. Fortunately it’s easy enough to collect visibility state data using the Page Visibility API. For example, in mPulse, the visibility state is a dimension that you can filter in any dashboard. You can also create custom dashboards to track the distribution of experiences based on visibility states if that interests you.

Conclusion

The Page Visibility API is an incredibly useful way of determining the visibility state of a page load. It can be used to provide developers with the ability to fine tune experiences based on visibility state, which can conserve CPU and battery usage. It’s also measurable with RUM, and based on the data from mPulse we can see that page load times are slower across all browsers when the visibility state is hidden.

While this may not matter as much for end user experience, it’s happening at a high enough frequency that it can influence your performance metrics. If you are optimizing for the long tail of web performance, you may want to filter out hidden visibility states.

Originally published at https://calendar.perfplanet.com/2021/page-visibility-if-a-tree-falls-in-the-forest/

What can the HTTP Archive tell us about Largest Contentful Paint?

2021-06-07T16:30:00+00:00

Largest Contentful Paint (LCP) is an important metric that measures when the largest element in the browser’s viewport becomes visible. This could be an image, a background image, a poster image for a video, or even a block of text. The metric is measured with the Largest Contentful Paint API, which is supported in Chromium browsers. Optimizing for this metric is critical to end user experience, since it affects their ability to visualize your content.

Google has promoted this metric as one of the three “Core Web Vitals” that affect user experience on the web. It is also slated to become a search ranking signal over the next few weeks, which has created a lot of awareness about it. The suggested target for a good Largest Contentful Paint is less than 2.5 seconds for at least 75% of page loads.

Source: https://web.dev/lcp/

Some of the recent posts on WPOStats feature interesting case studies about this metric. For example,

Google’s research found that when Core Web Vitals are met, users are 24% less likely to abandon a page before it finishes loading.
Vodafone improved LCP by 31% and saw an 8% increase in sales.
NDTV improved their LCP by 55% and saw a 50% reduction in bounce rate.
Tokopedia improved their LCP by 55% and saw a 23% increase in session duration.

Identifying the Largest Contentful Paint Element

The name of this metric implies that size is used as a proxy for importance. Because of this, you may be wondering specifically which image or text triggered it as well as the percentage of the viewport it consumed. There are a few ways to examine this:

One way to visualize the Largest Contentful Paint is to look at a WebPageTest filmstrip. You’ll be able to see when visual changes occurred (yellow outline) as well as when the Largest Contentful Paint event occurred (red outline).

In Chrome DevTools, you can also click on the LCP indicator in the “Performance” tab to examine the Largest Contentful Paint element in your browser. Using this method you can see and inspect the exact element (image, text, etc) that triggered it.

Lighthouse also has an audit that identifies the Largest Contentful Paint element. If you examine the screenshot below you’ll notice that there is a yellow box around the largest element, as well as an HTML snippet.

How Large is the Largest Contentful Paint?

The HTTP Archive runs Lighthouse audits for approximately 7.2 million websites every month. In the May 2021 dataset, Lighthouse was able to identify an LCP element in 97.35% of the tests. Since we have the ability to query all of these Lighthouse test results, we can analyze the result of the LCP audits and get more insight into what drives this metric across the web.

Using the same boundaries that Lighthouse uses to draw the rectangle around the LCP element, it’s possible to calculate the area of it. In the above example, the product of the LCP image’s height (191) and width (340) was 64,940 pixels. Since the Lighthouse test was run with an emulated Moto G4 user agent with a screen size of 640x360, we can also calculate that this particular LCP image took up 28% of the viewport.

The graph below shows the cumulative distribution of the LCP element as a percentage of screen size. The median LCP element takes up 31% of the screen size! At the 75th percentile the LCP element is nearly twice as large, taking up 59% of the screen size. Additionally 10.6% of sites actually had an LCP element that exceeded the viewport (which is why the y axis doesn’t reach 100%).

The graph below illustrates the same data in a histogram. From this we can see that 4.03% of sites (285,751) had a LCP element that took up 0 pixels. Upon further inspection, the 0 pixel elements appear to have been used in carousels, so by the time the audit completed the LCP element slid out of the viewport.

Node Paths of LCP Elements

Another interesting aspect of the Largest Contentful Paint audit is the nodePath of the element, which shows you where in the DOM this element was. In the example we looked at earlier, the nodePath was: ```

1,HTML,1,BODY,8,DIV,2,SECTION,1,DIV,0,DIV,0,DIV,0,UL,0,LI,0,ARTICLE,1,DIV,0,DIV,0,A,0,IMG

If we look at the last element in the node path, we can get some insight into the type of element that triggered the Largest Contentful Paint. The most common node that triggered the Largest Contentful Paint was <IMG>, which accounted for 42% of all sites. Next was <DIV> at 27% (which could include text or images). The <H1> through <H5> header elements accounted for 7.18% of all Largest Contentful Paints.

LCP Node (last element in path)	Number of Sites	Percent of Sites
IMG	3,067,354	42.12%
DIV	1,981,416	27.21%
P	766,977	10.53%
H1	291,091	4.00%
	192,498	2.64%
SECTION	182,267	2.50%
H2	144,534	1.98%
A	107,501	1.48%
SPAN	85,245	1.17%
HEADER	67,762	0.93%
LI	64,212	0.88%
H3	60,679	0.83%
RS-SBG	51,623	0.71%
TD	48,470	0.67%
H4	19,039	0.26%
VIDEO	15,649	0.21%
ARTICLE	12,860	0.18%
FIGURE	9,208	0.13%
BODY	8,859	0.12%
image	8,077	0.11%
CENTER	7,960	0.11%

The <VIDEO> element only accounted for 0.21% of sites. According to the Web Almanac, the <video> element was used on 0.49% of mobile websites - so from this we can estimate that half of sites loading videos are triggering LCP with video poster images.

Image Weight for the LCP

One of the Lighthouse audits looks for opportunities to preload the Largest Contentful Paint element, and estimates the potential savings in performance. This audit also identifies the URL for the LCP element - which can give us some insights into what type of images are being loaded as a LCP element. In the HTTP Archive data, only 67% of the Lighthouse tests were able to identify a URL for an LCP element. Based on this, we can infer that text nodes are used for the LCP on approximately 33% of sites.

The graph below shows the distribution of sizes for the image element that was associated with the Largest Contentful Paint. The median LCP element size was 80KB. At the 90th percentile, the LCP element size was 512KB. If you have a large LCP image then you should consider optimizing it before you attempt to follow the Lighthouse preload recommendation.

Additionally, 70% of the LCP element images were JPEG and 25% were PNG. Only 3% of sites served a webp as their LCP element.

format	sites	% of Sites
jpg	3,161,991	69.37%
png	1,122,585	24.63%
webp	141,441	3.10%
gif	84,829	1.86%
svg	34,123	0.75%
Other	13,272	0.29%

When we look at the LCP element as a percentage of page weight, we can see that the median LCP element is 4.17% of the total page weight. At the higher percentiles, the LCP elements are larger and also a larger percentage of page weight.

percentile	ImageRequests	ImageKB	TotalKB	LCP as a % of Page Weight
p25	15	422	1,138	3.01%
p50	26	1,142	2,185	4.17%
p75	45	2,692	4,108	5.58%
p95	103	8,008	10,036	8.42%

Since images account for 52% of the median page weight (for the sites that have a LCP image element), we can infer that at the median 8% of page weight is used to render content to 31% of the screen.

How does this change based on Site Popularity?

The HTTP Archive now contains rank groupings, obtained from the Chrome User Experience Report. This can enable us to segment this analysis based on the popularity of sites. The rank grouping indicator buckets sites into the top 1K, 10K, 100K, 1 million and 10 million.

When we look at the Largest Contentful Paint image size based on popularity, it’s interesting to note that the most popular sites tend to be serving smaller images for the LCP element. While there may be numerous reasons for this, I suspect that the more popular sites are investing in image optimization solutions.

Page weight follows the same pattern, with the least popular websites having some of the largest page weights. If we look at the LCP element based on the percentage of page weight, you can see that within the top 100K sites the ratios are very close. In the less popular sites, the LCP element tends to be a much greater percentage of page weight.

rank	p25	p50	p75	p95
Top 1k	1.61%	2.12%	2.85%	5.67%
Top 10k	1.76%	2.27%	3.00%	4.96%
Top 100k	2.07%	2.87%	3.77%	5.78%
Top 1 million	2.53%	3.49%	4.60%	6.95%
Top 10 million	3.11%	4.30%	5.75%	8.65%

We can also make some interesting observations about how popular sites are optimizing their LCP assets. Looking at the various image formats, JPG images are the most common LCP element. Some other formats such as PNG, WebP, GIF and SVG are used more frequently in the more popular sites.

Conclusion

Largest Contentful Paint is an important metric that helps illustrate when a page’s most significant content is rendered to the screen. In reviewing the HTTP Archive data, we can see that this area represents between 30% and 60% of a mobile viewport for a majority of sites.

There are a shocking number of sites that have a LCP element that consumes a large percentage of the viewport and are delivered as large unoptimized images. Site owners should evaluate both what is triggering the Largest Contentful Paint as well as how it is loaded. Optimizing for the Largest Contentful Paint will ensure that the browser has the opportunity to load and render this content as quickly as possible.

If you are interested in seeing some of the SQL queries and raw data used in this analysis, I’ve created a post with all the details in the HTTP Archive discussion forums. You can also see all the data used for these graphs in this Google Sheet.

Growth of the Web in 2020

2020-09-29T16:30:00+00:00

For the past 10 years, the HTTP Archive has tracked the evolution of the web by archiving the technical details of desktop and mobile homepages. During its early years, the Alexa top million dataset (which was publicly available until 2017) was used to source the list of URLs included in the archive and the number of sites tracked increased from 16K to almost 500K as testing capacity increased. To keep the archive current and include new sites, towards the end of 2018 we started using the Chrome User Experience Report as a source of the URLs to track.

Throughout 2019 the size of the HTTP Archive dataset was mostly constant. However, the sample size has grown quite a bit in 2020 as you can see in the graph below! Additionally, if we combine both desktop and mobile URLs, there was a recent peak of 7.5 million sites!

How are sites Included in the Chrome User Experience Report

The Chrome User Experience Report (CrUX) is sourced from performance data collected from real Chrome users that have opted into syncing their browser history and sharing anonymized usage statistics reporting.It’s essentially real user measurement (RUM) data for Chrome users.

You can read more about CrUX on Google’s Developer website, as well as this informative blog post from Rick Viscomi. I’ve also written about it previously here.

While Google doesn’t publish a definitive list on what it takes to be included in the Chrome User Experience Report dataset, they have indicated that:

Origins are automatically curated based on real-user Chrome usage
Websites must meet a traffic threshold to be included.
Websites must be publicly accessible

Essentially, a website’s inclusion in the Chrome User Experience Report indicates that they’ve reached a certain threshold of activity. According to the CrUX changelog, there have been no changes in the methodology. So it can be inferred that analyzing the number of websites included in this dataset should provide some interesting insights into the month on month growth of the number of websites being visited by real users.

Note: The Chrome User Experience Report does not contain traffic details, and as such this analysis should not be interpreted as growth of traffic on the internet. This analysis is specifically about the growth in the number of websites that people are visiting.

Global Growth of Origins Accessed in 2020

The graph below illustrates the total number of websites that the Chrome User Experience reported across all form factors during the previous 12 months. There are a few interesting observations we can make:

In both 2019 and 2020 there were increases in the number of websites at the start of the year.
There was a linear increase in the number of websites through the first half of 2020.
The drop in March and April 2020 is interesting, since that coincides with the start of the global COVID-19 pandemic.

There is a similar pattern with the total number of registered domains. This indicates that much of the growth is new domains and not necessarily subdomains of existing domains.

When we look at the month over month rate of change, you can see that the max change in 2019 was +/- 5%. The number of sites tends to fluctuate month to month. For example, between August and December 2019 there was an 8.6% decrease in sites. However at the start of 2020 there was a 7.5% increase.

When comparing the number origins between December 2019 and August 2020, the total number of origins increased by 28.9% this year alone! That’s huge!

Mobile vs Desktop Growth

Looking at this by device type, we can see that there are consistently more mobile websites compared to desktop websites. And over the past year the fluctuations between them have been fairly consistent. The one exception is between May 2020 and June 2020, where desktop increased by 0.7% and mobile increased by 6.2%.

Overall, there are 22.9% more mobile websites in the CrUX dataset compared to desktop. We know from sources like statcounter that mobile usage has grown significantly over the years, and consistently surpasses desktop. But why are mobile users navigating to so many more websites compared to desktop users?

Is there something about the mobile experience (such as social media links, email marketing, etc) that increases the change a user may navigate to an unfamiliar website?

Or could it be growth in regions where mobile is more dominant?

How Has this Varied by Region?

At the start of 2020, most regions of the world saw an increase in the domain of sites. The exception to this was western Asia. The regions that had the most substantial increase at the start of the year were Northern Europe, North America and South America.

Between May and June there was another large uptick in the number of sites. This appeared to be mostly South-East Asian and Western European countries

The tables below detail the number of sites included in the CrUX dataset during December 2019 as well as January, May and June 2020. This first table contains the top 10 regions, most of which saw an increase of 15% to 25% during the previous six months!

	Number of Sites Included in CrUX dataset
Sub Region	Dec 2019	Jan 2020	May 2020	June 2020	August 2020
Northern America	1,257,159	1,406,284	1,676,120	1,681,454	1,730,260
Western Europe	713,202	768,644	874,164	908,560	943,891
Eastern Europe	640,145	694,632	821,024	913,037	926,202
Eastern Asia	720,926	740,767	871,854	882,322	901,008
South America	486,894	540,685	668,604	726,410	784,854
Southern Europe	506,054	541,526	661,416	710,543	724,323
Northern Europe	453,591	516,527	601,459	638,744	661,790
South-Eastern Asia	473,962	485,143	524,249	584,214	629,815
Southern Asia	403,325	419,118	441,328	462,282	500,600
Western Asia	274,339	273,610	327,425	351,186	362,340

	Percent Change of Sites Included in CrUX dataset
Sub Region	Dec 2019 - Jan 2020	May - Jun 2020	Jun - Aug 2020	Dec - Aug 2020
Northern America	10.60%	0.32%	2.82%	27.34%
Western Europe	7.21%	3.79%	3.74%	24.44%
Eastern Europe	7.84%	10.08%	1.42%	30.88%
Eastern Asia	2.68%	1.19%	2.07%	19.99%
South America	9.95%	7.96%	7.45%	37.96%
Southern Europe	6.55%	6.91%	1.90%	30.13%
Northern Europe	12.18%	5.84%	3.48%	31.46%
South-Eastern Asia	2.30%	10.26%	7.24%	24.75%
Southern Asia	3.77%	4.53%	7.65%	19.43%
Western Asia	-0.27%	6.77%	3.08%	24.29%

Looking at the next 10 in the list, we can see significant growth in Central America, Australia as well as West, Southern and South Africa. Overall the regions with the most growth during the 7 month period was Australia and New Zealand, South America, and Central America.

	Number of Sites Included in CrUX dataset
Sub Region	Dec 2019	Jan 2020	May 2020	June 2020	August 2020
Central America	155,057	179,295	236,255	242,132	257,043
Australia and New Zealand	124,763	141,523	194,212	196,841	214,757
Northern Africa	68,754	69,497	83,312	88,672	88,606
Southern Africa	50,618	59,139	64,978	66,392	70,218
Central Asia	45,932	49,192	57,098	57,508	62,112
Western Africa	44,692	49,868	47,834	51,257	50,853
Caribbean	33,840	37,445	44,090	45,910	45,395
Eastern Africa	31,010	34,822	36,073	37,388	38,609
Middle Africa	8,873	9,149	9,121	10,057	10,032
Melanesia	2,733	2,991	2,580	2,779	2,818

	Percent Change of Sites Included in CrUX dataset
Sub Region	Dec 2019 - Jan 2020	May - Jun 2020	Jun - Aug 2020	Dec - Aug 2020
Central America	13.52%	2.43%	5.80%	39.68%
Australia and New Zealand	11.84%	1.34%	8.34%	41.91%
Northern Africa	1.07%	6.04%	-0.07%	22.40%
Southern Africa	14.41%	2.13%	5.45%	27.91%
Central Asia	6.63%	0.71%	7.41%	26.05%
Western Africa	10.38%	6.68%	-0.79%	12.12%
Caribbean	9.63%	3.96%	-1.13%	25.45%
Eastern Africa	10.95%	3.52%	3.16%	19.68%
Middle Africa	3.02%	9.31%	-0.25%	11.55%
Melanesia	8.63%	7.16%	1.38%	3.02%

Many of the regions that had an increase in sites visited (based on CrUX data), also have a high percentage of mobile visitors compared to the global population (based on statcounter). So while it’s difficult to say for certain, it’s entirely possible that location is a large factor in the gap between Desktop and Mobile.

Analyzing by Top Level Domain

The .com top level domain accounts for 43.7% of all websites tracked in the Chrome User Experience report. The next largest top level domain is .org, which consists of 3.7% of all sites. Overall there were 4111 TLDs in the dataset, and the top 20 of them represented 75% of all websites.

Most of these top level domains experienced a > 20% growth in active websites since December 2019, with the exception of .info and .net. The domains with the largest percentage growth were co.uk, com.au and de.

If we look at the month to month growth trends for these TLDs, we can make a few interesting observations:

There was a significant drop across all TLDs in March 2020.
The largest percentage drop was for .it domains in March 2020, although that rebounded with increases in April, May and June.
In February 2020, there was a 23.9% increase in .edu domains receiving traffic.
In May 2020, more than a dozen popular TLDs saw a double-digit increase in the number of sites.
In August 2020 there was a 10.4% increase in edu domains

Conclusion

The web is constantly growing and evolving, and clearly it’s rate of growth can vary quite a bit. During this analysis we explored a public dataset that Google provides to show how the web has grown during 2020, and which regions are growing the most. While this doesn’t speak to the traffic levels experienced in these locations, the number of websites can be used as a proxy for understanding usage of the web. As this analysis shows, 2020 has been a year of substantial global growth for the web.

Browser Back/Forward Caches and their Benefit to Web Performance

2020-08-03T18:30:00+00:00

Overview

There are a few different types of top-level navigation events that browsers handle. The most common one is a simple “navigation”. You enter a URL in the address bar and hit enter, or you click on a hyperlink to visit another webpage. Another is a “reload”, which causes the browser to revalidate it’s cached entries. There’s also a “hard reload”, where the browser ignores cache and reloads the page. And another type of navigation, which we will discuss in this post, is “Back/Forward” navigation. These occur when you click the back or forward buttons in your browser to return to a previously visited page.

If you are using a real user measurement (RUM) service like mPulse then you are likely collecting the stats for every navigation type that results in a page load. The navigation timing API used by RUM services also includes the navigation type which makes it possible to determine which one was used. You can see the value in your browser console by checking the value of performance.navigation.type. The values defined by the specification are below:

Navigation Event	Type Attribute	Description
TYPE_NAVIGATE	0	Navigation started by clicking a link, entering a URL in the address bar, submitting a form, or triggered by a script (except the ones used by reload or back_forward)
TYPE_RELOAD	1	Navigation through the reload operation or the `location.reload()` method.
TYPE_BACK_FORWARD	2	Navigation through a history traversal operation.
TYPE_RESERVED	255	Any navigation types not defined above.

mPulse collects the navigation types for every real user measurement across thousands of websites. Analyzing this data can provide an interesting perspective on the performance of back/forward navigations as well as how their usage varies by device and browser.

Back/Forward Caches in Popular Browsers

The general idea behind a back/forward cache is to preserve the state of the page so that the DOM does not need to be rebuilt when a user returns to a previously visited page. With a back/forward cache the render tree and layout remains intact. The page is not technically loaded again because it was already loaded.

Chrome, Firefox and Safari all have implementations of a back/forward cache, although Chrome’s is currently hidden behind an experimental flag (chrome://flags/#back-forward-cache).

The Chrome team has announced its intention to enable the back/forward cache feature for Chrome Mobile users starting with Chrome 86 (October 2020). In this post we will examine how frequently back/forward navigations are used across different devices and browsers. We’ll also look at the performance opportunities that the back/forward cache implementation may be able to solve.

Note: Because of how back/forward caches work, we do not have real user measurements for back/forward cache hits. In this analysis we’ll explore the opportunity presented by back/forward caches, and not the performance gained by any specific browser’s implementation.

Summary of Navigation Types

The graph below illustrates the breakdown of different navigation types by device type. While the content of this graph is very high level and seems to be simple, it’s important to note that this is an aggregation of tens of billions of real user measurements from thousands of websites, measured during the month of June 2020.

There are a few interesting observations we can make from this graph:

Across all device types, ~85% of navigations are type=navigation.
There is a greater percentage of back/forward navigations on mobile and tablet devices compared to desktop
Desktop browsers had 60% more reloads compared to mobile/tablet.

Desktop Browsers

The most popular desktop browsers are Chrome, Safari, Edge and Firefox. In fact, combined they account for 91.4% of all Desktop pages!

Note: When looking at this data it’s also important to note that in Firefox and Safari we only have the back/forward navigation events for navigations that did not result in a back/forward cache hit.

When we look at this by desktop browser, we can see that

Chrome and Opera have a large % of reloads compared to other browsers
Almost 17% of Opera navigations are back/forward!
All browsers except for Safari Desktop and legacy IE have at least 10% of back/forward navigations.
There were 50% less back/forward navigations measured for Safari compared to Chrome. This seems to indicate that approximately half the time users are getting a cache hit from this feature.
Firefox had a similar rate of back/forward navigations to Chrome. This seems to indicate that either Back/Forward navigation are much more prevalent in Firefox, or that their implementation is not resulting in many cache hits.

The graph below illustrates the median load times by navigation types for these browsers. For each browser, the reload events are significantly slower. On Chrome, reloads took 30% longer than navigations. Firefox saw the biggest degradations with reload events, taking almost twice as long as navigations.

Comparing the navigation and back/forward load times, we can see that legacy IE and Opera back/forward load times were only slightly faster than a navigation. With Chrome, the back/forward navigations are 16% faster than navigations. Safari back/forward navigations are 27% faster. That’s without any back/forward cache, but merely relying on the efficiency of the browser cache.

Mobile Browsers

The most popular mobile browsers are Mobile Safari and Chrome Mobile, which together accounts for 67.6% of mobile page views. The graph below also includes other popular mobile browsers such as Samsung Internet, UC Browser, Crosswalk, and Firefox Mobile. Additionally included are webviews and in-app browsers are included - such as Facebook, Instagram, LINE, Flipboard, Snapchat, Pinterest, etc.

Some observations:

Back/Forward usage is much more prominent in mobile browsers compared to desktop browsers.
Mobile browsers have a much higher back/forward usage compared to WebViews and in-app browsers.
Considering the difference between Chrome Mobile iOS vs Mobile Safari (both Wekbit based), as well as Chrome Mobile vs Samsung Internet vs Crosswalk (all Chromium based), the amount of back/forward navigation events may be influenced by the browser UI.

Is this a Blind Spot for Synthetic?

When measuring the performance of a synthetic transaction, you are usually starting a new session - which means that the page loads will be navigations. Since most synthetic services identify themselves in their User-Agent, we’re able to compare navigation types across popular synthetic solutions using mPulse data. In the graph below you can see that the vast majority of requests from synthetic services are navigation events. There are a small percentage of reloads, likely due to multi-step scripts that repeat a page view.

Regardless of what we see today, on the measurement platform you use it may be possible to script these types of navigations. For example, in WebPageTest I can trigger a back/forward navigation by using a script such the one below:

// Turn off logging of results
logData	0   
// Navigate to the first page.  Then navigate to another page.
navigate	https://www.google.com/
navigate	https://www.google.com/search?q=back+forward+cache
// Wait for 2 seconds
sleep	2 
// Turn on logging of results
logData	1 
// Use JavaScript to trigger the back button
execAndWait	window.history.back();

What does this mean for RUM?

Many RUM implementations, such as mPulse, will aggregate and report on all user navigations. So the results that you are looking at will include a mix of navigation, reload, back/forward and reserved navigation. Based on the data here, ~ 10% of Desktop and ~20% of Mobile page loads utilize Back/Forward navigation.

If a navigation is served from the Back/Forward cache, then it will not trigger a beacon since the browser is unfreezing the browser state. In this case there will be no load event to trigger a beacon.

We may look at ways to support timing back/forward navigation from cache in the future - but for now, I would expect the following:

Once Back/Forward cache rolls out for a browser, you might see a drop in page loads from that browser (since those navigation types will no longer be reported)
Additionally, load times may appear to be slightly higher because some of the faster experiences (a back/forward navigation utilizing the browser cache) will no longer be reported. Essentially, these measurements will be affected by survivorship bias.

I would expect this to apply to other analytics tools as well.

Conclusion

While we often think of navigating the web as the simple operation of loading a web page, there are different types of navigation events which can be measured. Reloads are slower, and many of us are accustomed to that in normal use. However we expect the Back/Forward operations to be fast - and they are faster. The implementations of Back/Forward caches in popular browsers are helping to improve this experience even further - which has the benefit of significantly speeding up the web for up to 20% of navigations!

Many thanks to Nic Jansma and Utkarsh Goel for reviewing this.

An Analysis of Cookie Sizes on the Web

2020-07-13T05:00:00+00:00

Cookies are used on a lot of websites - 83.9% of the 5.7 million home pages tracked in the HTTP Archive to be specific. They are essentially a name/value pair set by a server and stored in a client’s browser. Sites can store these cookies by using the Set-Cookie HTTP response header, or via JavaScript (document.cookie). On subsequent requests, these cookies are sent to the server in a Cookie HTTP request header.

In this article we’ll be looking at the size of cookies across the web, and discuss some of the web performance implications of them.

First vs Third Party cookies?

When a cookie is set by the domain you see in your address bar, it is considered a first party cookie. These can be used for session management, authentication, personalization, etc. When a cookie is set by a different domain, then it’s considered a third party cookie.

Based on an analysis of over 109 million cookies, third parties account for 79% of all cookies.

Note: I wrote another blog post exploring the use of the SameSite attribute in cookie files, and how third party cookies are affected. You can read it here.

Cookie Sizes

An example of a cookie set by a server would be:

Set-Cookie: loggedIn=true; Domain=.example.com; Path=/; Max-Age=14400; HttpOnly

Directives such as Domain, Path, max-age, and HttpOnly affect how the cookie is stored, and which hostname a browser should share them with. In this example, loggedIn=true is the name/value portion of the cookie, and that is what we’ll be exploring in this post.

The median length of all cookies in the HTTP Archive is 36 bytes as of June 2020. That statistic is consistent across both first and third party cookies. The minimum is just a single byte, usually set by empty Set-Cookie headers (which is likely an error).

Cookie Size	First Party	Third Party
Min	1	1
Median	36	37
95th Percentile	181	135
99th Percentile	287	248
Max	29,735	8,500

The largest cookie size was 29,735 bytes, which is quite large! This is so large in fact that it is rejected by all modern browsers. I was curious to see what the limits are, and decided to dig into the source. Both Chrome and Firefox will reject cookies greater than 4KB. This is likely due to the implementation limits defined in RFC 6265.

So who is setting these large cookies?

The largest first party cookie was set by https://www.ridewill.it/, and it is named menu. It’s value was a long urlencoded string that contained multiple <div> layers and links. All in all, there were 240 links in a single cookie!
Many large first party cookies included large session cookies.
The largest third party cookie was set by web.taggbox.com, and consisted of a large JSON array named “liveWall”.
Most of the largest third party cookies were set from web.taggbox.com as well as a small number of advertising third parties.

If we look at the entire distribution of cookie sizes, it gets even more interesting. 88% of the cookies being set are less than 100 bytes. The 99th percentile is 372 bytes. So really large individual cookies are not common.

Cookies Sent to First Party Domain

What about cookies that clients send back to servers? A client can send multiple cookies in a single Cookie request header. Since the HTTP Archive only collects information on homepages, there is a limit to the insight we can collect here. If we look at just the request headers for favicon.ico, we can get an idea of how large the Cookie request header might be for a subsequent request. However this does not include any additional cookies set later in a session (ie, such as after logging in)

The median size of cookies sent on the favicon request was 161 bytes and the 95th percentile was 681. The largest was 7,795 bytes, and you can see the distribution below.

It’s important to note that the cookies set by the time favicon was requested may under represent the size of the cookies users would send later in a browsing session. For example, when logging into an application, a few additional cookies might be set. Some third parties that use a first party subdomain (ie, if www.example.com loaded a resource from metrics.example.com) also set a first party cookie.

Performance Implications of Large Cookies

When a browser sends an HTTP request, the HTTP request headers are usually 400 - 500 bytes. In the example below the request headers total 407 bytes.

GET / HTTP/1.1
Host: www.example.com
Connection: keep-alive
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9

Adding a cookie to that will increase the size of the request header. If we add more than 1KB of cookies to that request, then we exceed 1500 bytes, which is the standard maximum transmission unit (MTU) used by TCP. This means that the HTTP request would span multiple TCP packets, which may result in multiple round trips and increase the risk of retransmission. This can potentially increase the TTFB of the response since it would take longer to make the request.

With HTTP/2, we have HPACK compression - which was designed to help reduce the size of HTTP request headers by utilizing a dynamic index table. Receiving endpoints advertise the maximum size of this table in bytes (default is 4096), the sender can insert headers up to this limit in one request or response and subsequently reference it in another. In theory HPACK seems like it could help reduce the overhead of these large cookies. However, it’s not as easy as it sounds. Consider the following:

If a request is made on a new HTTP/2 connection (ie, for a return visitor, or a navigation that happened after the previous connection times out), then the entire cookie string would be sent. This would impact TTFB for that first request (usually the HTML).
Some servers intentionally exclude cookies from HPACK compression, since cookies are a very valuable target for an attacker.

Because of the practical constraints on compressing cookies, and the impact of cookie size on the first flight of requests and responses, it is beneficial to use smaller cookies. Based on the observations here, 900 bytes seems like a good budget for a total cookie size, which leaves room for other headers such as user-agent (which can benefit from HPACK compression).

Conclusion

Cookies are everywhere, and they are set by both first and third party requests. Most browsers will limit the size of cookies stored to 4KB - which is still quite large.

At a minimum, setting a cookie larger than 4KB will simply not work. However, 4KB is still too large and can negatively impact your TTFB by increasing the number of round trips on the HTTP response.

Even moreso, it’s important to keep track of how many cookies are being sent, as bloated cookies can also impact your TTFB by increasing the time it takes to make an HTTP request.

Many thanks to Lucas Pardue and Matt Ringel for reviewing this.

Investigating Duplicate HTML Requests on a Page Load

2020-07-10T05:00:00+00:00

Why it occurs, and what is the impact on web performance?

Overview

Ever see a web page load an embedded object that is really the same HTML as the main document? I’ve seen this show up on a handful of websites recently - including a few large ecommerce sites, a news site, a financial services site and a utility company’s website. Every time I’ve run into this, I’ve said I need to write a blog post about it. So, here’s that post!

Two of the most recent ones I’ve seen made the same mistake. Can you spot the error?

The body was styled with the CSS background property, which is shorthand for background-color, background-url, and others (this is explained better here). In this example, the developer probably wanted to use background: none to give a blank background color, but used an empty URL instead. The url() function attempts to load a background image, and it takes a string as an argument. Since an empty string is a valid string this doesn’t trigger an error in the console. It simply results in a relative request to the base URL.

I created a simple test page to demonstrate this. In the WebPageTest waterfall below you can clearly see the extra page request.

In Chrome DevTools, you would see the duplicate request show up with an initiator of the main document. In the console there is no indication or an error. This may not be a technical error, but it certainly is unintentional and comes with a performance penalty.

In this article we are going to explore why this is a performance issue, how duplicate requests for HTML can be inadvertently triggered, and how you can avoid them.

What is the Performance Impact?

For many sites, HTML pages are dynamically generated - which means that there is a backend cost associated with delivering the content. We often measure this by using the Time to First Byte metric, which tells us the time from when a request was made until the browser starts to receive a response. Often these pages are not cacheable either - so each request gets handled dynamically.

The overall capacity of the application backend is usually driven by the its ability to process the workload volume it receives. And if you are accidentally sending twice the number of requests for a dynamically generated HTML page, your infrastructure could be working twice as hard for nothing.

For example:

If the backend is struggling to keep up with the load, then all pages will be affected. This has the potential to degrade time-to-first-byte for all pages.
There are costs involved with scaling out capacity at the origin. Unnecessary HTML requests can skew capacity planning in this case.

Increased page weight is an additional reason. Many of the examples below were used to generate placeholders or empty content. In these cases, the additional HTML could be unnecessarily increasing page weight.

What Causes This?

Here are a few reasons I’ve seen for duplicate HTML requests

<body style="background: url('')">

This is the example I started off this post with. When you style an element with a background URL using url(), any string is considered a value argument. Since url(‘ ‘) is passed a valid string, it gets translated to a relative URL. This triggers a duplicate request in Chrome as well as Firefox. Safari ignores the empty url string

This also applies to other elements that can be styled with a background URL. For example, <div style="background: url()">

And it also happens with url(), url('') and url(' ')!

Element src’s with ? or # For some elements, a src attribute of either “?” or “#” would get interpreted that as a blank relative URL. For example <img src=”#” /> looks like an empty placeholder image, but it will cause the base HTML to get fetched again.

The same applies for <script>, <iframe>, and <video> - although browsers vary a bit. For example:

src=” “ returns a duplicate request for Firefox, but not the other browsers.
<script src=”#” /> also affects only Firefox
<script src=”?” /> affects only Chrome browsers
Chrome does not send duplicate requests for relative video src attributes, but Firefox and Safari do.

The table below lists the full summary of the tests I ran across Chrome, Safari and Firefox to see how they handle these errors. A warning icon indicates that they result in multiple HTML requests -

	Chrome	Safari.	Firefox
`<img src="#" />`
`<img src=" " />`
`<img src="?" />`
`<script src="#" />`
`<script src=" " />`
`<script src="?" />`
`<iframe src="#" />`
`<iframe src=" " />`
`<iframe src="?" />`
`<video src="#" />`
`<video src=" " />`
`<video src="?" />`
`<body style="background: url()">`
`<body style="background: url('')">`
`<body style="background: url(' ')">`
`<div style="background: url()">`
`<div style="background: url('')">`
`<div style="background: url(' ')">`

* iFrames with src=”?” and “#” resulted in 3-4 HTML requests for the same page.

Service worker Cache

Creating a separate cache layer via a service worker is a great way to preload your browser cache for resources that it will need later. However, what happens when you include an HTML page as one of those resources to fetch?

Here’s an example of a site doing exactly that. Even worse, the HTML was not cacheable…

If you are going to use a service worker to cache a resource, make sure that resource is cacheable!

display:none

While this example is not specifically about HTML, it occurs enough that it’s worth mentioning. As @dougsillars says “display:none does not mean download:none”.

Your periodic reminder that using "display:none" in your CSS does not also mean "download:none" pic.twitter.com/eZZNnrQ3OS
— Doug Sillars 🇬🇧 (@dougsillars) May 17, 2020

In this example, the developer referenced video content that used a media query to hide the video for certain screen sizes. The end result was downloading both the desktop and mobile videos, while hiding one of them. That’s a lot of wasted bytes!

Why do I bring this up in an article about duplicate HTML requests? I’ve seen some cases of this technique used alongside <img src="#" /> - resulting in duplicate HTML!

How to Detect?

There aren’t any tools or audits I know of that look for this, as it does tend to be a bit of a niche error.

Probably the easiest way to spot this is by looking at the CPU processing overlay in WebPageTest. If you see JS execution for a request before the request is loaded, then it was likely a duplicate.

In DevTools, you can also filter by your domain name using domain:www.example.com. Filtering and sorting by name, makes these easier to spot. If you see the same URL with a type of both document and text/html, then you are loading a duplicate HTML page.

Clicking on the initiator can help find what caused the duplicate HTML to load.

Conclusion

If you are analyzing a page and see what looks like a duplicate request for the base HTML page, don’t ignore it! This could be a serious performance and capacity issue. Instead use DevTools to attempt to locate what initiated that request. You may find that you have accidentally included a placeholder or set an incorrect CSS property or element src attribute.

SameSite Cookies - Are You Ready?

2020-07-07T05:00:00+00:00

Last year Google announced updates to Chrome that provide a way for developers to control how cross site cookies should work on their sites. This is a good change - as it ultimately improves end user security and privacy by limiting which third parties can read cookies that were set while visiting a different site. It also defeats cross site request forgery attacks. The implementation is fairly simple, and only requires developers to add the SameSite attribute to their cookies.

The SameSite attribute is supported by all modern browsers, and most have historically defaulted to a permissive use of cookies if the attribute isn’t present.

Google changed the default behavior of SameSite attribute to secure cookies by default when Chrome 80 was released in February 2020. However it was rolled back in April 2020 to ensure stability during the initial stage of the COVID-19 response. Now they are planning to resume SameSite cookie enforcement with Chrome 84, which will be released on July 14th.

Despite almost a year of notice and warnings in the browser console, this seemed to catch many by surprise in February. How ready are third parties for this now?

What is a Cross Site Third Party Cookie?

If a request on www.example.com sets the following cookie from its domain, then the browser will store this cookie and send it back on subsequent requests to the same domain. This is an example of a first party cookie. Essentially a cookie whose domain matches the domain that appears in the address bar…

Set-Cookie: session=abc; path=/; Secure; HttpOnly;

Now let’s assume that www.example.com includes a third party analytics provider, metrics.analyticsexample.com. When the third party request is made, that third party can also set a cookie in the end users browser. And that third party will be able to read the cookie. This is an example of a third party cookie.

If that same user then navigated to www.example2.com, which uses the same third party analytics provider, then their third party cookies would be readable by them across both sites. The third party is then able to track the user across multiple websites.

SameSite Cookies

The SameSite cookie attribute was introduced in a 2016 IETF draft, but had not been widely adopted initially. This attribute provided developers with the ability to control when a browser would send a cookie to a third party. Using them is simply a matter of adding the SameSite attribute to a cookie declaration, with one of the three supported values: “None”, “Lax”, and “Strict”.

This provides the following controls:

SameSite=None
- The browser will send cookies with both cross-site requests and same-site requests.
SameSite=Lax
- Same-site cookies are withheld on cross-site sub-requests, such as calls to load images or frames, but will be sent when a user navigates to the URL from an external site; for example, by following a link.
SameSite=Strict
- The browser will only send cookies for same-site requests (requests originating from the site that set the cookie). If the request originated from a different URL than the URL of the current location, none of the cookies tagged with the Strict attribute will be included.

An example of how this is configured is:

Set-Cookie: key=value; SameSite=Strict

If the SameSite attribute is not included, then most browsers have historically defaulted to the most permissive behavior: SameSite=None.

Google Chrome’s Update

Google has been planning to update the behavior of SameSite within the Chrome browser to default to the more secure SameSite=Lax. Additionally, if a SameSite=None attribute is present, then they would require that the cookie have the “Secure” attribute. There was some concern that this change would cause breakage for some third parties, so a warning message was included in Chrome since version 77 (September 2019).

A cookie associated with a cross-site resource at was set without the `SameSite` attribute. A future release of Chrome will only deliver cookies with cross-site requests if they are set with `SameSite=None` and `Secure`. You can review cookies in developer tools under Application>Storage>Cookies and see more details at https://www.chromestatus.com/feature/5088147346030592 and[ https://www.chromestatus.com/feature/5633521622188032](https://www.chromestatus.com/feature/5633521622188032).

SameSite Usage Across the Web

The HTTP Archive stores a tremendous amount of detail for every HTTP request and response for approximately 5.8 million homepages. In the June 2020 data, there were approximately 108 million third party cookies set across 3.79 million homepages. Of these cookies 35,721,768 (32.9%) included the SameSite attribute. Comparatively in August 2019, 21.4% of cookies had the SameSite attribute.

Note: Due to a collection issue described here, ~18.6% of third party cookies were unreadable in the June 2020 HTTP Archive data. The remainder of this analysis is on the cookies we could read.

The “Secure” flag of a cookie ensures that the browser only sends the cookie over HTTPS. Chrome made this a requirement to use SameSite=None. Out of the 35 million cookies, nearly 75% of them use the Secure flag.

When we look at this graphically, there are a few interesting observations we can make:

SameSite=Lax, which will be the new default, is in use by only 10.82% of secure cookies, but 97% of insecure cookies.
SameSite=None is present on 89.10% of Secure cookies.
2.65% (238,810) insecure cookies are set with SameSite=None, but not including the Secure flag. These will default to SameSite=Lax.
Only 0.06% (16,000) of secure cookies are using SameSite=Strict!
There are less than 1000 SameSite attributes set to an erroneous value (ie, not Lax, Strict or None).

Who is using SameSite=None incorrectly?

There were 238,810 third party cookies set with SameSite=None, but missing the Secure flag. These will default to SameSite=Lax in Chrome 84 unless the Secure flag is added. Overall, there were 1749 third party domains that made this error. The top 5 account for 48% of the erroneous SameSite cookies. This includes Spotxchange, ETargetNet, SmartAdServer, BazaarVoice and EntityTag.

Even more concerning than the number of cookies, is the number of websites that are affected. For example spotxchange.com is setting SameSite=None with insecure cookies on 26,174 websites. EntityTag.co.uk is doing the same for 14,358 websites.

Who is using SameSite Strict?

I thought it was odd that SameSite=Strict was used so infrequently. The table below shows some of the third parties that are using it. The top 10 account for 67% of all SameSite=Strict usage.

What Erroneous SameSite Attribute Values are in Use?

I was surprised to see that there was such a low percentage of erroneous usage of the SameSite attribute. The top 10 errors listed below account for 80% of the erroneous uses. Most were SameSite=Secure, SameSite with no value, and SameSite: Lax.

Many of these errors were from a small number of third parties. For example:

SameSite=secure was set by wildbeardbg.com:
- dg_perm_sessid=95951591788161; secure; SameSite=Secure
SameSite: Lax was set by nytimes.com.
- nyt-purr=cfshcfhssc; Expires=Fri, 18 Jun 2021 01:02:51 GMT; Path=/; Domain=.nytimes.com;SameSite: Lax;Secure
SameSite=; was set by cloudengage.com
- CEUID=E80y%2BgcVMl0o6Skdol5X9FRCrdrbqBNssQeOYrNUnQxC42U4gpGNV6VX; expires=Wed, 01-Jul-2020 16:16:13 GMT; Max-Age=2592000; path=/; samesite=; domain=.cloudengage.com; secure; HttpOnly

SameSite Usage Across Popular Third Parties

So far we’ve looked at how SameSite is being used across third parties. But which third parties are not setting SameSite attributes? By not setting it, they will default to SameSite=Lax. Whether or not this is intentional or will cause breakage will depend on each third party’s usage of the cookies. But not setting an explicit SameSite attribute could be indicative of whether a third party has prepared for this.

The graph below shows SameSite usage across some of the most popular third parties. Very few sites third parties are setting SameSite=Lax, which is about to become the new default. Some of these third parties are used by hundreds of thousands of sites.

Conclusion

SameSite cookies are a huge win for privacy and security, but there is a risk that Chrome’s new default settings will cause problems. For many cases, this will likely render some cross site tracking techniques ineffective with little change to end user experience. However, with any changes like this there are risks of breakage and serious site issues. With SameSite adoption at less than 33% of third party cookies, this raises the question of how prepared third parties are for this.

It’s important to note that the absence of a SameSite attribute does not necessarily mean that there will be breakage. However depending on how the cookie is used, it has the potential of becoming problematic. It may be worth checking the JS console in Chrome DevTools to see if you see the SameSite warning for any of your third parties. You can also set a flag in Chrome to test how this will affect your site ahead of the Chrome 84 release. The Chrome team has published a useful guide for debugging SameSite cookies.