Impact of Page Weight on Load Time

Over the years it has been fun to track website page weight by comparing it to milestones such as the size of a floppy disk (1.44MB), the original install size of DOOM (2.39MB), and the 3MB mark it crossed last summer.

When we talk about page weight, we are often talking about high resolution images, large hero videos, excessive 3rd party content, JavaScript bloat – and the list goes on.

I recently did some research showing that sites with more 3rd party content are more likely to be slower. A few days later, USA Today gave us an extreme example by publishing a GDPR-friendly version of its site for EU visitors. The EU version has no 3rd party content, substantially less page weight, and is blazing fast compared to the US version.

Shortly after the 3MB average page weight milestone was reached last summer, I did some analysis to try to understand the sudden jump in page weight. It turns out that the largest 5% of pages were skewing the average, which is a perfect example of how averages can mislead us.

These days we focus more on percentiles, histograms and statistical distributions to represent page weight. For example, in the image below you can see how this is being represented in the recently redesigned HTTP Archive reports.

Is page weight still something we should care about in 2018?

Thanks to the HTTP Archive, we have the ability to analyze how the web is built. And by combining it with the Chrome User Experience Report (CrUX), we can also see how this translates into the actual end user experiences across these sites. This is an extremely powerful combination.

Note: If you are not familiar with CrUX, it is Real User Measurement data collected by Google from Chrome users who have opted in to syncing their browsing history and have usage statistic reporting enabled. I wrote about CrUX and created some overview videos if you are interested in learning more about the data and how to work with it.

By leveraging CrUX and the HTTP Archive together, we can analyze performance across many websites and look for trends. For example, below you can see how often the Alexa top 10 sites are able to load pages in less than 2 seconds, 2-4 seconds, 4-6 seconds and greater than 6 seconds. It’s easy to glance at this chart and see which sites have a large percentage of slower pages. I wrote another post about how you can use CrUX data like this to compare yourself to competitors.
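
If you want to try this kind of bucketing yourself, a query along these lines works against the public CrUX dataset in BigQuery. This is a minimal sketch rather than the exact query behind the chart above: the table month and the example origins are placeholders, and bins that straddle a threshold are simply counted whole.

```python
# Sketch: estimate, per origin, what share of real-user page loads finished
# in < 2s, 2-4s, 4-6s and > 6s, using the public CrUX dataset in BigQuery.
# Assumes the google-cloud-bigquery client is installed and authenticated;
# the table month (201806) and the example origins are placeholders.
from google.cloud import bigquery

QUERY = """
SELECT
  origin,
  ROUND(SUM(IF(bin.start < 2000, bin.density, 0)), 4) AS fast_2s,
  ROUND(SUM(IF(bin.start >= 2000 AND bin.start < 4000, bin.density, 0)), 4) AS s2_4,
  ROUND(SUM(IF(bin.start >= 4000 AND bin.start < 6000, bin.density, 0)), 4) AS s4_6,
  ROUND(SUM(IF(bin.start >= 6000, bin.density, 0)), 4) AS slow_6s
FROM `chrome-ux-report.all.201806`,
  UNNEST(onload.histogram.bin) AS bin
WHERE origin IN ('https://www.google.com', 'https://www.wikipedia.org')
GROUP BY origin
ORDER BY fast_2s DESC
"""

client = bigquery.Client()                 # uses your default GCP credentials
for row in client.query(QUERY).result():   # runs the query and iterates rows
    print(row.origin, row.fast_2s, row.s2_4, row.s4_6, row.slow_6s)
```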

But what happens if we look at the real user performance for 1,000 popular sites this way? The results are oddly symmetrical, with almost as many fast sites as slow ones. In the graph below I sorted the onLoad metrics from fast (left) to slow (right). There are 1,000 tiny bars, each representing a summary of a single website’s real user experiences in Chrome browsers. The most consistently fast site in this list is the webcomic XKCD, with an impressive 93.5% of users loading pages in less than 2 seconds. Some other sites in the “extremely fast” category are Google, Bing, Craigslist, Gov.uk, etc. Many of the slow sites (far right of this graph) have large page weights, videos, advertisements and numerous 3rd parties. Where do you think your site stacks up?

Since we’re interested in investigating the relationship of page weight to performance, let’s look at the top 1000 pages that are less than 1MB and the top 1000 pages that are greater than 3MB. The pattern in load times is quite revealing. A few notable observations:

  • The distribution of fast vs slow sites seems to be cut down the middle; sites appear to be either mostly fast or mostly slow.
  • There are far more fast <1MB pages compared to slow ones.
  • There are far more slow >3MB pages compared to fast ones.
  • The fact that there are still some fast >3MB pages and slow <1MB pages shows that page weight isn’t everything, and that it is possible to optimize rich experiences for performance.

Note: I’ve also applied the same logic to the top 10,000 sites, and the pattern was identical.
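
For anyone who wants to reproduce this comparison, here is a rough sketch of the join between the two datasets. The table names and dates are illustrative, and the usual caveat applies: matching a CrUX origin to an HTTP Archive URL by appending a trailing slash is an approximation.

```python
# Sketch: join HTTP Archive page weight with CrUX onLoad data and compare
# sites under 1MB with sites over 3MB. Table names/dates are assumptions,
# and the origin-to-URL join is the usual approximation (origin + '/').
from google.cloud import bigquery

QUERY = """
WITH crux AS (
  SELECT origin,
         SUM(IF(bin.start < 2000, bin.density, 0)) AS fast_share
  FROM `chrome-ux-report.all.201806`,
       UNNEST(onload.histogram.bin) AS bin
  GROUP BY origin
)
SELECT
  CASE
    WHEN pages.bytesTotal < 1 * 1024 * 1024 THEN '< 1MB'
    WHEN pages.bytesTotal > 3 * 1024 * 1024 THEN '> 3MB'
  END AS weight_bucket,
  COUNT(0) AS sites,
  ROUND(AVG(crux.fast_share), 3) AS avg_fast_share
FROM `httparchive.summary_pages.2018_06_01_desktop` AS pages
JOIN crux ON CONCAT(crux.origin, '/') = pages.url
WHERE pages.bytesTotal < 1 * 1024 * 1024
   OR pages.bytesTotal > 3 * 1024 * 1024
GROUP BY weight_bucket
"""

df = bigquery.Client().query(QUERY).to_dataframe()
print(df)
```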

What About the Other Metrics that CrUX Collects?

Since CrUX contains additional metrics, I also looked at the relationship of page weight to DOM Content Loaded (DCL) and First Contentful Paint (FCP). The set of graphs below compares the fastest range (<2s for onLoad, <1s for FCP and DCL) for the top 1000 sites. Across these three metrics, we see the strongest correlation between load time and page weight with the onLoad metric.

(Note: the higher %s mean that more pages experienced faster load times. So higher=better in these graphs.)

What Aspects of Page Weight Impact Performance the Most?

We’ve seen a strong correlation between performance and page weight, and we’ve learned that the correlation is stronger for onLoad than for First Contentful Paint. But which components of page weight impact load times the most?

If we examine the top 1000 sites again and pull some of the page weight statistics from the HTTP Archive, we can once again compare HTTP Archive data with CrUX. The graphs below summarize the percentage of pages with onLoad times of less than 2 seconds. The Y axis is the median number of bytes, and the X axis represents the percentage of sites with fast page loads.

In the top left graph, page weight shows a strong correlation, and the sites with fewer fast page loads tended to have larger page weights. The remaining three graphs show how JavaScript, CSS and images contribute to page weight and performance. Based on these graphs, images and JavaScript are the most significant contributors to the page weights that affect load time. And on some slow sites, the compressed JavaScript bytes actually exceed the image bytes!
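
Here is a hedged sketch of how you could produce this kind of breakdown yourself: pull each site's CrUX fast share alongside the HTTP Archive byte counts, then band the sites and take medians. The column and table names are assumptions based on the public schemas and may need adjusting.

```python
# Sketch: for each site, pair its CrUX "fast onLoad" share with the byte
# breakdown recorded by the HTTP Archive, then look at median bytes per
# fast-share band. Column and table names are assumptions based on the
# public schemas and may need adjusting.
import pandas as pd
from google.cloud import bigquery

QUERY = """
WITH crux AS (
  SELECT origin,
         SUM(IF(bin.start < 2000, bin.density, 0)) AS fast_share
  FROM `chrome-ux-report.all.201806`,
       UNNEST(onload.histogram.bin) AS bin
  GROUP BY origin
)
SELECT crux.fast_share,
       pages.bytesTotal, pages.bytesJS, pages.bytesImg, pages.bytesCSS
FROM `httparchive.summary_pages.2018_06_01_desktop` AS pages
JOIN crux ON CONCAT(crux.origin, '/') = pages.url
"""

df = bigquery.Client().query(QUERY).to_dataframe()
# Band sites by how often they load in < 2s, then take median bytes per band.
df["band"] = pd.cut(df["fast_share"], bins=[0, 0.25, 0.5, 0.75, 1.0])
print(df.groupby("band")[["bytesTotal", "bytesJS", "bytesImg", "bytesCSS"]].median())
```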

Conclusion

Page weight is an important metric to track, but we should always consider using appropriate statistical methods when tracking it on the web as a whole. Certainly track it for your sites – because as you’ve seen here, the size of content does matter. It’s not the only thing that matters – but the correlation is strong enough that a spike in page weight should merit some investigation.

If your site has a page weight problem, there are a few things you can do:

  • Akamai’s Image Manager can help to optimize images with advanced techniques such as perceptual quality compression. This is also a great way to ensure that you don’t get any surprises when a marketing promo drops a 2MB hero image on your homepage.
  • Limit the use of large video files, or defer their loading to avoid critical resources competing for bandwidth. Check out Doug Sillars’ blog post on videos embedded into web pages.
  • Lazy load images that are not yet in the viewport. Jeremy Wagner wrote a nice guide on this recently.
  • Ensure that you are compressing text-based content. Gzip compression at a minimum should be enabled. Brotli compression is widely supported and can help reduce content size further. (Akamai Ion customers can automatically serve Brotli compressed resources via Resource Optimizer.) A quick way to spot-check this is sketched after this list.
  • Use Lighthouse and Chrome DevTools to audit your pages. Find unused CSS and JS with the Coverage feature and attempt to optimize.
  • Audit your 3rd parties. Many sites do not realize how much content their 3rd parties add and how inconsistent their performance can become as a result. Harry Roberts wrote a helpful guide on this. Also, Akamai’s Script Manager service can help to manage third parties based on performance.
  • Track your site’s page weight over time and alert on increases. If you use Akamai’s mPulse RUM service, you can do this with resource timing data (if Timing-Allow-Origin is permitted).
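
As promised above, here is a quick way to spot-check the compression item on that list. This is just a sketch using a placeholder URL; it only reports what the server advertises in the Content-Encoding header.

```python
# Sketch: spot-check whether a server returns gzip- or Brotli-compressed
# text responses. The URL is a placeholder; the check below only inspects
# what the server advertises in its Content-Encoding response header.
import requests

def check_compression(url: str) -> str:
    resp = requests.get(url, headers={"Accept-Encoding": "gzip, br"})
    encoding = resp.headers.get("Content-Encoding", "none")
    return f"{url}: Content-Encoding = {encoding}"

print(check_compression("https://www.example.com/"))
```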

Thanks to Yoav Weiss and Ilya Grigorik for reviewing this and providing feedback.

HTTP Heuristic Caching (Missing Cache-Control and Expires Headers) Explained

Have you ever wondered why WebPageTest can sometimes show that a repeat view loaded with fewer bytes downloaded, while also triggering warnings related to browser caching? It can seem like the test is reporting an issue that does not exist, but in fact it’s often a sign of a more serious issue that should be investigated. Often the issue is not the lack of caching, but rather the lack of control over how your content is cached.
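
To make the heuristic concrete, here is a small sketch of the fallback most caches apply when those headers are missing: RFC 7234 suggests using a fraction (typically 10%) of the time since Last-Modified as the freshness lifetime. The header values below are placeholders.

```python
# Sketch: the heuristic freshness lifetime caches commonly fall back to when
# a response has no Cache-Control or Expires header. Per RFC 7234 section
# 4.2.2, a typical heuristic is 10% of the interval between the Date and
# Last-Modified response headers. Header values here are placeholders.
from email.utils import parsedate_to_datetime

def heuristic_freshness_seconds(date_header: str, last_modified_header: str) -> float:
    """Return the implied freshness lifetime, in seconds, using the 10% rule."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    return max((date - last_modified).total_seconds(), 0) * 0.10

# An object last modified 10 days ago is treated as fresh for ~1 day,
# even though the server never said anything about caching.
print(heuristic_freshness_seconds(
    "Mon, 02 Jul 2018 12:00:00 GMT",
    "Fri, 22 Jun 2018 12:00:00 GMT",
))
```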

If you have not run into this issue before, then examine the screenshot below to see an example:

Continue reading

Adoption of HTTP Security Headers on the Web

Over the past few weeks the topic of security related HTTP headers has come up in numerous discussions, both with customers I work with and with colleagues who are trying to help improve the security posture of their own customers. I’ve often felt that these headers were underutilized, and a quick test on Scott Helme’s excellent securityheaders.io site usually proves this to be true. I decided to take a deeper look at how these headers are being used on a large scale.

Looking at this data through the lens of the HTTP Archive, I thought it would be interesting to see if we could give the web a scorecard for security headers. I’ll dive deeper into how each of these headers is implemented below, but let’s start off by looking at the percentage of sites that are using these security headers. As I suspected, adoption is quite low. Furthermore, adoption is marginally higher for some of the most popular sites – but not by much.
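
If you want to run a quick check of your own before diving into the data, here is a minimal sketch in the spirit of securityheaders.io. The header list is illustrative rather than exhaustive, and the URL is a placeholder.

```python
# Sketch: a quick check of which common security headers a site returns.
# The header list and URL are illustrative, not exhaustive.
import requests

SECURITY_HEADERS = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "X-XSS-Protection",
    "Referrer-Policy",
]

def audit(url: str) -> None:
    headers = requests.get(url).headers   # case-insensitive mapping
    for name in SECURITY_HEADERS:
        status = "present" if name in headers else "missing"
        print(f"{name}: {status}")

audit("https://www.example.com/")
```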

Continue reading

Cache Control Immutable – A Year Later

In January 2017, Facebook wrote about a new Cache-Control directive – immutable – which was designed to tell supporting browsers not to attempt to revalidate an object on a normal reload during its freshness lifetime. Firefox 49 implemented it, while Chrome took a different approach by changing the behavior of the reload button. Additionally, it seems that WebKit has since implemented the immutable directive as well.
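
For reference, a typical immutable response header looks like Cache-Control: max-age=31536000, immutable. Here is a small sketch for spotting it on your own resources; the URL is a placeholder.

```python
# Sketch: detect whether a resource is served with the immutable directive.
# A typical header looks like: Cache-Control: max-age=31536000, immutable
# The URL is a placeholder.
import requests

def uses_immutable(url: str) -> bool:
    cache_control = requests.get(url).headers.get("Cache-Control", "")
    return "immutable" in cache_control.lower()

print(uses_immutable("https://www.example.com/static/app.js"))
```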

So it’s been a year – let’s see where Cache-Control immutable is being used in the wild!

Continue reading

Measuring the Performance of Firefox Quantum with RUM

On Nov 14th, Mozilla released Firefox Quantum. On launch day, I personally felt that the new version was rendering pages faster and I heard anecdotal reports indicating the same. There have also been a few benchmarks which seem to show that this latest Firefox version is getting content to screens faster than its predecessor. But I wanted to try a different approach to measurement.

Given the vast amount of performance information that we collect at Akamai, I thought it would be interesting to benchmark the performance of Firefox Quantum with a large set of real end-user performance data. The results were dramatic: the new browser improved DOM Content Loaded time by an extremely impressive 24%. Let’s take a look at how those results were achieved.
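
The sketch below shows the general shape of the comparison rather than Akamai's actual pipeline: group RUM beacons by browser version and compare the medians. The DataFrame columns and the numbers in it are entirely hypothetical.

```python
# Sketch of the comparison approach, not the actual analysis pipeline:
# given a RUM export with one row per page view, compare a timing metric
# across browser versions. Columns and values are hypothetical.
import pandas as pd

beacons = pd.DataFrame({
    "browser_version": ["Firefox 56", "Firefox 56", "Firefox 57", "Firefox 57"],
    "dom_content_loaded_ms": [2100, 1900, 1500, 1600],
})

medians = beacons.groupby("browser_version")["dom_content_loaded_ms"].median()
change = (medians["Firefox 57"] - medians["Firefox 56"]) / medians["Firefox 56"]
print(medians)
print(f"DCL change: {change:.0%}")   # negative = faster
```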



Continue reading

Which 3rd Party Content Loads Before Render Start?

Since the HTTP Archive captures timing information for each request, I thought it would be interesting to correlate request timings (i.e., when an object was loaded) with page timings. The idea is that we can categorize resources as having loaded before or after an event.
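
The categorization itself is simple once you have per-request timings. The sketch below uses hypothetical column names rather than the real HTTP Archive schema, but it shows the idea: flag each request as starting before or after render start, then count by content type.

```python
# Sketch of the categorization, with hypothetical column names rather than
# the real HTTP Archive schema: flag each request as loading before or after
# the page's render-start time, then count by content type.
import pandas as pd

requests_df = pd.DataFrame({
    "content_type": ["script", "image", "script", "font"],
    "req_start_ms": [350, 900, 2400, 600],   # request start, relative to navigation
})
render_start_ms = 1800                        # page's render-start time

requests_df["phase"] = requests_df["req_start_ms"].apply(
    lambda t: "before render start" if t < render_start_ms else "after render start"
)
print(requests_df.groupby(["phase", "content_type"]).size())
```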

Content Type Loaded Before/After Render Start

It’s generally well known that third party content impacts performance. We see this with both resource loading and JavaScript execution blocking the browser from loading other content. While we don’t have the data to evaluate script execution timings per resource captured here, we can definitely look at when resources were loaded with respect to certain timings and get an idea of what is being loaded before a page starts rendering.

Continue reading

Exploring Relationships Between Performance Metrics in HTTP Archive Data

I thought it would be interesting to explore how some of the page metrics we use to analyze web performance compare with each other. In the HTTP Archive “pages” table, metrics such as TTFB, renderStart, VisuallyComplete, onLoad and fullyLoaded are tracked. And recently some of the newer metrics, such as Time to Interactive, First Meaningful Paint and First Contentful Paint, have appeared in the HAR file tables.

But first, a warning about using response time data from the HTTP Archive. While the accuracy has improved since the change to Chrome-based browsers on Linux agents, we’re still looking at a single measurement per site, all run from a single location on a single browser or mobile device (Moto G4). For this reason, I’m not looking at any specific website’s performance, but rather analyzing the full dataset for patterns and insights.
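
With that caveat in mind, a correlation matrix is an easy place to start. Here is a sketch; the table date and column names are assumptions and may need adjusting to the current schema.

```python
# Sketch: pull the timing metrics from an HTTP Archive "pages" table and
# compute a rank correlation matrix between them. The table date and exact
# column names are assumptions; adjust to the schema you are querying.
from google.cloud import bigquery

QUERY = """
SELECT TTFB, renderStart, onLoad, fullyLoaded
FROM `httparchive.summary_pages.2018_06_01_desktop`
WHERE TTFB > 0 AND renderStart > 0 AND onLoad > 0 AND fullyLoaded > 0
"""

df = bigquery.Client().query(QUERY).to_dataframe()
# Spearman is less sensitive to the long tail of very slow outlier pages.
print(df.corr(method="spearman"))
```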

Continue reading

Tracking Page Weight Over Time

As of July 2017, the “average” page weight is 3MB. @Tammy wrote an excellent blog post about HTTP Archive page stats and trends. Last year @igrigorik published an analysis on page weight using CDF plots. And of course, we can view the trends over time on the HTTP Archive trends page. Since this is all based on HTTP Archive data, I thought I’d start a thread here to continue the discussion on how to gauge the increase in page weight over time.
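
As a starting point for that discussion, here is a hedged sketch of how to pull a median page weight per crawl from the summary tables; the crawl year and table naming pattern are assumptions.

```python
# Sketch: track median page weight across several HTTP Archive crawls using
# wildcard tables. The crawl year and table naming are assumptions;
# APPROX_QUANTILES gives the median bytesTotal per crawl.
from google.cloud import bigquery

QUERY = """
SELECT _TABLE_SUFFIX AS crawl,
       APPROX_QUANTILES(bytesTotal, 100)[OFFSET(50)] / 1024 AS median_kb
FROM `httparchive.summary_pages.2017_*`
WHERE _TABLE_SUFFIX LIKE '%_desktop'
GROUP BY crawl
ORDER BY crawl
"""

for row in bigquery.Client().query(QUERY).result():
    print(row.crawl, round(row.median_kb))
```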

Continue reading