Measuring Third-Party Cost

In another informative talk, Andy Davies looks at how you can measure the cost of a particular third-party script (just after the 4-minute mark).

  1. Test a page in WebPageTest
  2. Repeat with selected third-parties blocked
  3. Compare the two (filmstrip) results
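
If you’d rather script this than click through the WebPageTest UI, the same before/after comparison can be kicked off via WebPageTest’s HTTP API. Here’s a rough TypeScript sketch (assuming Node 18+ and an API key in a WPT_API_KEY environment variable); I haven’t verified every response field, so treat it as a starting point:

```ts
// Sketch: kick off two WebPageTest runs via the HTTP API, one as-is and
// one with selected third-party hosts blocked, then compare the two
// result URLs (filmstrips, metrics) by hand.
// Assumes Node 18+ (global fetch) and an API key in WPT_API_KEY.

const WPT_ENDPOINT = 'https://www.webpagetest.org/runtest.php';

async function runTest(url: string, block: string[] = []): Promise<string> {
  const params = new URLSearchParams({
    url,
    k: process.env.WPT_API_KEY ?? '',
    f: 'json',  // return JSON instead of the HTML results page
    video: '1', // capture filmstrip frames
  });
  if (block.length > 0) {
    // space-delimited list of URL substrings to block
    params.set('block', block.join(' '));
  }

  const res = await fetch(`${WPT_ENDPOINT}?${params.toString()}`);
  const json = await res.json();
  return json.data.userUrl; // link to the results page for this run
}

// Baseline run vs. a run with a couple of ad/analytics hosts blocked
runTest('https://www.example.com')
  .then((u) => console.log('baseline:', u));
runTest('https://www.example.com', ['doubleclick.net', 'scorecardresearch.com'])
  .then((u) => console.log('blocked:', u));
```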

This process can help illustrate performance and UX tradeoffs being made, but Andy also suggests feeding those results into a RUM (real user monitoring) dashboard to understand the business impact.

“What if I was a second faster? How many more conversions would I get? You can begin to translate that user-experience impact into pounds and pence.”

This step had never occurred to me, and after a teeny bit of research on RUM tools, I realized that quite a few past clients likely had this data available; I just didn’t know what to do with it or how to ask for it. A post on the Soasta blog (via Charles Vazac) further illustrates utilizing RUM for these purposes.

“What if my digital property had better performance? How would that affect the bottom line of my company?”

Next, I’m going to try and get some experience with RUM tools and work with a client to get these sorts of data points. If you’ve got any screenshots or stories using RUM tools to do this sort of thing that you can share, please give me a holler!
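
In the meantime, it helps to demystify what these tools actually collect: at their core, RUM scripts grab timing data from real visitors’ browsers and beacon it somewhere for aggregation. Here’s a stripped-down TypeScript sketch of the idea; the /rum endpoint is hypothetical, and real products (mPulse, SpeedCurve, etc.) do far more:

```ts
// Minimal RUM-style beacon: capture page load timings from a real visitor
// and send them to an analytics endpoint so they can later be correlated
// with conversions. The /rum endpoint is hypothetical.

window.addEventListener('load', () => {
  // Wait a tick so loadEventEnd is populated
  setTimeout(() => {
    const [nav] = performance.getEntriesByType(
      'navigation'
    ) as PerformanceNavigationTiming[];
    if (!nav) return;

    const beacon = {
      page: location.pathname,
      ttfb: Math.round(nav.responseStart),                        // time to first byte
      domContentLoaded: Math.round(nav.domContentLoadedEventEnd), // DOM ready
      load: Math.round(nav.loadEventEnd),                         // full page load
    };

    // sendBeacon survives page unloads better than fetch/XHR
    navigator.sendBeacon('/rum', JSON.stringify(beacon));
  }, 0);
});
```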

Please Turn Off Your Ad Blocker

It’s common for sites to ask users to turn off ad blockers. I often happily oblige to support sites I value. I’d likely oblige more often if the differences in experience weren’t so extreme. Here’s one example (via Sam Kap):

food network website
How To Turn Off Your Adblocker for Food Network

Not an unreasonable request: “Turn off your ad blocker so we can continue to make content. We won’t hit you with pop-up ads.” But note my “before” experience:

food network website
31 requests, 6.73 MB in 1.83 seconds

After turning my ad blocker off, the experience changed dramatically:

food network website
348 requests, 14.87 MB in 34.74 seconds

Not only was the page heavier and slower to load, but there was also a lot of scroll-jank and processor lag. Because the differences are so huge, the request was starting to seem less reasonable. And to top things off, I got a pop-up anyways:

food network website

Sites with ads are one thing. Sites with so many ads, trackers, analytics, and A/B testing resources loading via a long list of third-parties are something else. I don’t think Food Network is unique here, but it is a good example of what happens when third-party inclusions get out of hand.

When implementing third-party scripts/services, I think organizations need to:

  • Monitor page speed and processor lag
  • Evaluate UX implications
  • Avoid redundant scripts/services
  • Establish criteria for measuring the costs/benefits of each script/service

Maybe most organizations do this already, and perhaps it still makes business sense for them to include everything. I hope not.
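
On the monitoring front, one way to put a number on “processor lag” in the field is the Long Tasks API, which reports main-thread tasks longer than 50ms. A small sketch of the idea (browser support varies, and the /long-tasks reporting endpoint is made up):

```ts
// Count main-thread tasks over 50ms (the Long Tasks API threshold).
// Lots of long tasks usually shows up as scroll jank and sluggish input,
// often courtesy of third-party scripts. The /long-tasks endpoint is hypothetical.

let longTaskCount = 0;
let longTaskTime = 0;

const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    longTaskCount += 1;
    longTaskTime += entry.duration;
  }
});
observer.observe({ type: 'longtask', buffered: true });

// Report totals when the page is hidden (tab switch, navigation, close)
document.addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden' && longTaskCount > 0) {
    navigator.sendBeacon(
      '/long-tasks',
      JSON.stringify({ page: location.pathname, longTaskCount, longTaskTime })
    );
  }
});
```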

Identifying Third-Party Scripts

Now that I’d spent time itemizing third-party scripts and finding the most prevalent ones, I wanted to identify what these scripts/services do, who owns them, etc.

I’d already found myself wondering, “what is the Rubicon Project?” or “why is 2mdn.net everywhere?” while digging into client sites. Initially unable to find a third-party index to answer these questions, I set out to create one of my own:

rough third-party services indexing site
Rough/WIP third-party index site

As excited as I was to have a quick Jekyll site with live search stubbed out, I quickly realized that properly researching & indexing all of these would be no small feat. Early in my process, I luckily stumbled across the tracker index on Better, Ind.ie’s privacy tool:

better.fyi tracker index
better.fyi/trackers

Bookmarked! Problem solved! I should have known that Aral, Laura, and the Ind.ie team might have something like this. It’s a great resource, and likely the result of some really hard work. Thanks to them for making it available!

Tag Manager Chat with Vector Media Group

Longtime friend and collaborator Matt Weinberg from Vector Media Group recently asked if I’d like to chat with him and his coworker Lee Goldberg about Vector’s approach to tag managers (and web marketing in general)—I quickly took them up on their generous offer.

As someone who primarily sees tag managers (and all their potential third-party inclusions) as a performance hit, I wanted to gain a better perspective on the value of these marketing tools. That way, I can be a little more objective, seeing things from all sides when inter-departmental discussions come up during projects.

Lee walked me through Vector’s tag manager setup, citing specific examples of increased sales, conversion, time on site, etc. for clients along the way. The results were compelling and would be hard to argue against as far as a company balance sheet is concerned.

But here’s the thing—I think Vector uniquely embodies a holistic approach to marketing on the web. Design, development, and marketing work together, as Matt states:

“We work towards a global maximum versus a local maximum. Small, hyper-focused changes may seem good at the time, but we have to be mindful of the overall quality of the user-experience and integrity of the brand.”

Lee seeks to get involved early:

“An analytics strategy should be part of the initial development and design process so that together we can define business goals, figure out how to measure them, and develop a method for using that data to further refine the user experience.”

It’s not the tag manager, but how it’s used that actually impacts the quality of a site, and I haven’t found that teams using tag managers always make decisions with the same level of care as Vector.

tag manager
Me, marveling at how easy it is to throw tags on a website

Yes, tag managers can offload the burden of site updates from developers, but what burden are we on-loading to users? Every inclusion warrants a comprehensive discussion around its value. As Reagan says, Talk to Each Other!

Third-Party Script Prevalence on Alexa Top 50

Since beefing up my technique for itemizing third-party requests, I’ve been looking into which third-party domains and services are most prevalent. Starting with a spreadsheet of the top 50 US sites (according to Alexa in late December 2017), I downloaded and analyzed HAR files for each. Caveats:

  • Results (# of requests) may change as sites are updated.
  • I chose to start with US sites because of my own familiarity.
  • 46 sites were actually tested (3 NSFW sites were excluded, and t.co doesn’t apply)

After a few grunt work sessions, I have a spreadsheet that cross-references those top 46 sites with the domains from which third-party requests originate. I’ll spare you the entire spreadsheet, but some initial points of interest are below:

  • The top 46 US sites made requests from 213 different domains.
  • CDN-related request domains are included.
  • 75 of those different domains are unique to 1 site.
  • Third-party serving domains are prevalent. Here are the domains that appear on at least 10 of the 46 sites tested:
Third-party domain # of top 46 sites % of top 46 sites
doubleclick.net 38 82.6%
facebook.com 32 69.6%
google-analytics.com 27 58.7%
googlesyndication.com 25 54.3%
googleadservices.com 24 52.2%
cloudfront.net 20 43.5%
googleapis.com 20 43.5%
scorecardresearch.com 18 39.1%
2mdn.net 17 37.0%
adnxs.com 17 37.0%
fastly.net 17 37.0%
akamaihd.net 16 34.8%
amazonaws.com 16 34.8%
demdex.net 15 32.6%
googletagservices.com 15 32.6%
adsrvr.org 14 30.4%
fonts.googleapis.com 14 30.4%
bing.com 13 28.3%
connect.facebook.net 13 28.3%
fbcdn.net 12 26.1%
quantserve.com 12 26.1%
truste.com 12 26.1%
adsafeprotected.com 11 23.9%
amazon-adsystem.com 11 23.9%
analytics.twitter.com 11 23.9%
bluekai.com 11 23.9%
jquery.com 11 23.9%
rlcdn.com 11 23.9%
rubiconproject.com 11 23.9%
serving-sys.com 11 23.9%
turn.com 11 23.9%
moatads.com 10 21.7%

Here is the number of unique third-party domains from which requests originate (excluding multiple requests from the same domain) for each of the 46 sites tested:

Site # of third-party domains
nytimes.com 64
washingtonpost.com 63
metropcs.mobi 59
cnn.com 57
ebay.com 49
msn.com 45
microsoft.com 43
wikia.com 42
salesforce.com 40
bestbuy.com 38
imdb.com 37
twitch.tv 37
espn.com 36
wordpress.com 32
target.com 29
diply.com 28
reddit.com 28
amazon.com 27
walmart.com 23
stackoverflow.com 21
etsy.com 19
pinterest.com 19
paypal.com 18
imgur.com 13
twitter.com 13
chase.com 12
youtube.com 12
linkedin.com 11
yahoo.com 11
yelp.com 11
tumblr.com 10
bankofamerica.com 9
(outlook) live.com 9
netflix.com 8
office.com 7
facebook.com 6
google.com 6
blogspot.com 5
bing.com 4
github.com 4
instagram.com 4
microsoftonline.com 4
wellsfargo.com 3
apple.com 2
craigslist.org 2
wikipedia.org 1
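
For what it’s worth, the cross-referencing doesn’t have to be spreadsheet grunt work. Given a folder of HAR files (one per site), a short script could rebuild both tables above. Here’s a sketch; the ./hars naming scheme and the “any host whose base domain differs from the site’s own is third-party” heuristic are my own assumptions (and crude: fbcdn.net on facebook.com would still count, for example):

```ts
// Sketch: tally third-party domains across a folder of HAR files,
// one per site, named like "nytimes.com.har" (naming is an assumption).
// "Third-party" here means any request host whose base domain differs
// from the site's own: a crude but workable heuristic.
import { readdirSync, readFileSync } from 'node:fs';

// Rough registrable domain: last two labels of the hostname
const baseDomain = (host: string) => host.split('.').slice(-2).join('.');

const perSite = new Map<string, Set<string>>();    // site -> third-party domains
const prevalence = new Map<string, Set<string>>(); // domain -> sites it appears on

for (const file of readdirSync('./hars').filter((f) => f.endsWith('.har'))) {
  const site = file.replace(/\.har$/, '');
  const har = JSON.parse(readFileSync(`./hars/${file}`, 'utf8'));
  const domains = new Set<string>();

  for (const entry of har.log.entries) {
    const host = new URL(entry.request.url).hostname;
    if (baseDomain(host) !== baseDomain(site)) domains.add(baseDomain(host));
  }

  perSite.set(site, domains);
  for (const d of domains) {
    if (!prevalence.has(d)) prevalence.set(d, new Set());
    prevalence.get(d)!.add(site);
  }
}

// Domains appearing on at least 10 of the tested sites
[...prevalence.entries()]
  .filter(([, sites]) => sites.size >= 10)
  .sort((a, b) => b[1].size - a[1].size)
  .forEach(([domain, sites]) => {
    const pct = ((sites.size / perSite.size) * 100).toFixed(1);
    console.log(`${domain}\t${sites.size}\t${pct}%`);
  });

// Unique third-party domains per site
[...perSite.entries()]
  .sort((a, b) => b[1].size - a[1].size)
  .forEach(([site, domains]) => console.log(`${site}\t${domains.size}`));
```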

I don’t have much commentary at this point—some numbers are higher or lower than expected, and it’s understandable why some sites have more than others. Heck, with video or tweet embeds, my site can pull in 10+ third-party domains surprisingly easily. I need more context, so perhaps it’d be logical to run more tests around a single category.

Next, I’ll be researching the more prevalent third-party services. Many of them are new to me. Further down the road, I’ll look at performance impact, but that’s a whole ’nother ball of wax.

Itemizing Third-Party Scripts

If I want to see all the third-party requests included with a webpage, I have a few options (that I know of):

  • Look at the code in the <head> and before the </body> tags
  • Use a browser extension like Ghostery to see what’s being served
  • Utilize services like Calibre or SpeedCurve for in-depth analysis
  • Inspect the page—use developer tools (network & sources panel)

Sources tab
Chrome inspector sources tab at slack.com (arbitrarily chosen)

But if I want to itemize that information and make it portable (e.g., paste it into a spreadsheet for filtering or further analysis), the process is not as straightforward.

Save as HAR

From inside some browsers’ web inspectors (Chrome, Firefox, Edge), you can save a HAR (HTTP Archive format) file (further reading). Among other things, this captures all the requests a site is making. Note that HAR files can contain sensitive data, so save and share with care.

Saving a HAR file
Saving a HAR file from the Firefox network panel

So how do you open a HAR file? har.tech has a list of helpful tools. I gravitated towards Charles.

“Charles is an HTTP proxy / HTTP monitor / Reverse Proxy that enables a developer to view all of the HTTP and SSL / HTTPS traffic between their machine and the Internet.”

Viewing a HAR file
Viewing HAR file via Charles Proxy

You can learn a lot poking around the app. Unfortunately, I’ve yet to find a way to export or copy this info into text or a table. (See the Update below.)

This approach (and level of analysis) gives me a good perception upgrade for what’s happening in a browser when third-party scripts, iframes, etc. load with a webpage. Next, I’ll be taking chunks of this information for multiple sites and running some comparisons to see what scripts/services are most common.

I’m figuring all this out as I go, so if you have advice or better ideas, let me know!

Update

My friend Matt Weinberg from Vector Media Group informed me that you can export a full session from the Charles app as a CSV file (File->Export Session->CSV), which is ideal. He also pointed out that “a HAR file is just a JSON file with inlined content. So any tool that can parse JSON can do whatever you need to with a HAR.” Bingo. Thanks, Matt!
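
To make that concrete, here’s a small sketch that flattens a saved HAR into spreadsheet-ready CSV: one row per request, with host, type, size, and timing. The example.har filename and the column choices are mine; adjust to taste:

```ts
// Sketch: flatten a saved HAR into CSV (one row per request) so the
// data can be pasted/imported into a spreadsheet for filtering.
// Assumes a file named "example.har" next to this script; note the CSV
// handling is naive (a URL containing a comma would need quoting).
import { readFileSync, writeFileSync } from 'node:fs';

const har = JSON.parse(readFileSync('example.har', 'utf8'));

const rows = har.log.entries.map((entry: any) => {
  const url = new URL(entry.request.url);
  return [
    url.hostname,
    entry.response.content?.mimeType ?? '',
    entry.response.bodySize, // -1 when the size wasn't recorded
    Math.round(entry.time),  // total request time in ms
    entry.request.url,
  ].join(',');
});

writeFileSync(
  'requests.csv',
  ['host,mimeType,bodySize,timeMs,url', ...rows].join('\n')
);

console.log(`Wrote ${rows.length} requests to requests.csv`);
```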

Optimizely Blog—Page Load Time & Engagement

Oliver Palmer wrote a post on the Optimizely blog about how page load time impacts engagement. Recognizing the increasing prevalence of third-party JavaScript…

between the homepages of 20 major US and European publishers, some 500 different external snippets of Javascript were loaded. Nine months later, that figure has risen to almost 700.

…they set up an A/B test with The Telegraph.

Do these hundreds of lines of externally loaded Javascript code impact how fast a page loads? By how much? And what impact does this have on user engagement and retention?

We would artificially slow the site down in order to measure the impact on overall user engagement and retention to try and model out the relationship between site speed and overall revenue.

Delays ranged from 4 to 20 seconds, with page view impact ranging from -11% to -44%. That’s a significant drop-off, but not particularly surprising given similar documented cases.

Using a metric developed by the Telegraph’s internal strategy team representing the monetary value of a pageview[…], we were able to model out the overall revenue impact of each variant. By doing so, we could paint an accurate picture of the cost to user engagement of any new on-site changes which incur a detrimental impact on page performance.

This last step seems crucial. Establishing a monetary value for a page view can help facilitate objective discussions around tradeoffs between revenue gains and impact to user experience. I’d love to learn more about how/if this new data led The Telegraph to make any changes.
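
For illustration only, the back-of-the-envelope version of that model is pageviews lost to a slowdown multiplied by the value of a pageview. The traffic volume and per-pageview value below are invented; only the 4 to 20 second delays and the 11% to 44% drop-offs come from the post:

```ts
// Hypothetical revenue model: pageviews lost to a slowdown * value per pageview.
// monthlyPageviews and valuePerPageview are invented for illustration;
// the delay/drop-off pairs are the figures quoted from the experiment.
const monthlyPageviews = 100_000_000; // hypothetical
const valuePerPageview = 0.01;        // hypothetical, in £

const scenarios = [
  { delaySeconds: 4, pageviewDrop: 0.11 },
  { delaySeconds: 20, pageviewDrop: 0.44 },
];

for (const { delaySeconds, pageviewDrop } of scenarios) {
  const lostRevenue = monthlyPageviews * pageviewDrop * valuePerPageview;
  console.log(
    `${delaySeconds}s delay: ~${(pageviewDrop * 100).toFixed(0)}% fewer pageviews, ` +
      `roughly £${lostRevenue.toLocaleString()} lost per month`
  );
}
```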