Tutorial: Using BigQuery to Analyze Chrome User Experience Report Data

Last week I wrote a blog post showing some examples of how you can use the Chrome User Experience report to compare your site’s RUM data to competitors. In this post I’d like to share some brief videos to help you quickly get started exploring the data via Google BigQuery.

Accessing the Chrome User Experience Report (CrUX) Data

The Chrome User Experience Report data is available on Google BigQuery, which is part of the Google Cloud Platform. To get started, log into Google Cloud, create a project for your CrUX work, and then navigate to the BigQuery console. Then add the chrome-ux-report dataset and explore the way the tables are structured. Here’s a short video that walks you through this process.

Links:

Brief CrUX Overview

Now that we’ve accessed the CrUX data, let’s explore the table structure and where you can find some resources for additional help:

Links:

Comparing Form Factor and Connection Type Distributions For Different Sites

Next let’s explore what we can do with the form factor and effective connection type dimensions. In this next video we’ll explore the % of these dimensions for two sites.

Links:

Competitive Histograms

In this next video we’ll take a look at how to analyze the performance of a single metric across multiple sites in BigQuery. We’ll export the results and graph them –

Links:

Graphing all the Metrics

Finally, let’s UNION together a bunch of queries and examine the performance of a single site by creating histograms for first paint, first contentful paint, DOM Content Loaded and onLoad.

Links:

Conclusion

I hope this helps you get started digging into the CrUX data. As I mentioned in my earlier blog post on CrUX, Akamai is working on building this functionality into mPulse so that it will be easy for our customers to quickly analyze this data in the future. But for now these videos should give you a starting point to explore the data via BigQuery.

HTTP Heuristic Caching (Missing Cache-Control and Expires Headers) Explained

Have you ever wondered why WebPageTest can sometimes show that a repeat view loaded with less bytes downloaded, while also triggering warnings related to browser caching? It can seem like the test is reporting an issue that does not exist, but in fact it’s often a sign of a more serious issue that should be investigated. Often the issue is not the lack of caching, but rather lack of control over how your content is cached.

If you have not run into this issue before, then examine the screenshot below to see an example:

Continue reading

Adoption of HTTP Security Headers on the Web

Over the past few weeks the topic of security related HTTP headers has come up in numerous discussions – both with customers I work with as well as other colleagues that are trying to help improve the security posture of their customers. I’ve often felt that these headers were underutilized, and a quick test on Scott Helme’s excellent securityheaders.io site usually proves this to be true. I decided to take a deeper look at how these headers are being used on a large scale.

Looking at this data through the lens of the HTTP Archive, I thought it would be interesting to see if we could give the web a scorecard for security headers. I’ll dive deeper into how each of these headers are implemented below, but let’s start off by looking at the percentage of sites that are using these security headers. As I suspected, adoption is quite low. Furthermore, it seems that adoption is marginally higher for some of the most popular sites – but not by much.

Continue reading

Cache Control Immutable – A Year Later

In January 2017, Facebook wrote about a new Cache-Control directive – immutable – which was designed to tell supported browsers not to attempt to revalidate an object on a normal reload during it’s freshness lifetime. Firefox 49 implemented it, while Chrome went ahead with a different approach by changing the behavior of the reload button. Additionally it seems that WebKit has also implemented the immutable directive since then.

So it’s been a year – let’s see where Cache-Control immutable is being used in the wild!

Continue reading

Measuring the Performance of Firefox Quantum with RUM

On Nov 14th, Mozilla released Firefox Quantum. On launch day, I personally felt that the new version was rendering pages faster and I heard anecdotal reports indicating the same. There have also been a few benchmarks which seem to show that this latest Firefox version is getting content to screens faster than its predecessor. But I wanted to try a different approach to measurement.

Given the vast amount of performance information that we collect at Akamai, I thought it would be interesting to benchmark the performance of Firefox Quantum with a large set of real end-user performance data. The results were dramatic: the new browser improved DOM Content Loaded time by an extremely impressive 24%. Let’s take a look at how those results were achieved.



Continue reading

Which 3rd Party Content Loads Before Render Start?

Since the HTTP Archive is capturing the timing information on each request, I thought it would be interesting to correlate request timings (ie, when an object was loaded) with page timings. The idea is that we can categorize resources that were loaded before or after and event.

Content Type Loaded Before/After Render Start It’s generally well known that third party content impacts performance. We see this with both resource loading, and JavaScript execution blocking the browser from loading other content. While we don’t have the data to evaluate script execution timings per resource captured here, we can definitely look at when resources were loaded with respect to certain timings and get an idea of what is being loaded before a page starts rendering. Continue reading

Exploring Relationships Between Performance Metrics in HTTP Archive Data

I thought it would be interesting to explore how some of the page metrics we use to analyze web performance compare with each other. In the HTTP Archive “pages” table, metrics such as TTFB, renderStart, VisuallyComplete, onLoad and fullyLoaded are tracked. And recently some of the newer metrics such as Time to Interactive, First Meaningful Paint, First Contentful paint, etc exist in the HAR file tables.

But first, a warning about using response time data from the HTTP Archive. While the accuracy has improved since the change to Chrome based browsers on linux agents – we’re still looking at a single measurement from many sites, all run from a single location and a single browser or mobile device (Moto G4). For this reason, I’m not looking at any specific website’s performance, but rather analyzing the full data-set for patterns and insights.

Continue reading

Tracking Page Weight Over Time

As of July 2017, the “average” page weight is 3MB. @Tammy wrote an excellent blog post about HTTP Archive page stats and trends. Last year @igrigorik published an analysis on page weight using CDF plots. And of course, we can view the trends over time on the HTTP Archive trends page. Since this is all based on HTTP Archive data, I thought I’d start a thread here to continue the discussion on how to gauge the increase in page weight over time.

Continue reading