Request Map Generator

I was reading Harry Roberts’ post, Identifying, Auditing, and Discussing Third Parties, and realized I hadn’t paid proper attention to the Request Map Generator built by Simon Hearne. It’s fantastic.

Request Map for trentwalton.com

First off, having a visual “map” of requests is compelling even before you begin any analysis. While my site is relatively basic, something like Amazon can be complex, producing a sizable map.

Request Map for amazon.com

You can dig into the map for further analysis (Harry’s post covers this well), and you can also export as a CSV, which I love because it’s another easy way to itemize third-parties for comparison, sorting, etc.
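As a rough illustration of that itemizing step, here is a minimal Python sketch that groups CSV rows by domain and sorts them by total bytes. The column names `domain` and `bytes` are assumptions for the example; the real export may label its columns differently, so adjust them to match your file.

```python
import csv
import io
from collections import defaultdict


def summarize_requests(csv_text, domain_col="domain", bytes_col="bytes"):
    """Group rows by domain and return (domain, request_count, total_bytes)
    tuples, sorted by total bytes in descending order.

    The column names are hypothetical defaults, not the tool's documented schema.
    """
    totals = defaultdict(lambda: [0, 0])  # domain -> [request_count, total_bytes]
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row[domain_col]]
        entry[0] += 1
        entry[1] += int(row[bytes_col] or 0)
    return sorted(
        ((domain, count, size) for domain, (count, size) in totals.items()),
        key=lambda item: item[2],
        reverse=True,
    )


# Made-up sample rows, only to show the shape of the input.
SAMPLE = """domain,bytes
example.com,1200
cdn.example.net,5000
example.com,300
"""

for domain, count, size in summarize_requests(SAMPLE):
    print(f"{domain}: {count} requests, {size} bytes")
```

Sorting by bytes is just one choice; swapping the sort key to the request count gives a "chattiest domains" view instead.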

Request Map for amazon.com exported as CSV

I don’t think we’ll be doing any performance or third-party consulting at Paravel without utilizing the Request Map Generator, especially now that it’s also included in the webpagetest.org UI.

Understanding GDPR and Privacy

On May 25, 2018, the General Data Protection Regulation (GDPR) becomes enforceable across Europe. According to Wikipedia:

The General Data Protection Regulation is a regulation in EU law on data protection and privacy for all individuals within the European Union. It also addresses the export of personal data outside the EU. The GDPR aims primarily to give control to citizens and residents over their personal data and to simplify the regulatory environment for international business by unifying the regulation within the EU.

Because these changes cover data collected outside the EU, their impact is global. For example, if you store user data in the U.S. for someone in the EU, you’re subject to these laws. I think this is a good thing for users, but what does that mean for us as web builders?

I recently came across this article written by Heather Burns for Smashing Magazine. It contains an excellent breakdown of GDPR requirements, what data is protected, and tips for how to go about adapting. Some of my favorite bits:

Europe’s data protection regime stands in stark contrast to that of the U.S., which has no single overarching, cross-sector, or cross-situational data protection law. […] This cultural difference often sees American developers struggling with the concept of privacy as a fundamental human right enshrined in law, a situation which has no U.S. equivalent.

GDPR requires the adoption of the Privacy by Design framework, a seven-point development methodology which requires optimal data protection to be provided as standard, by default, across all uses and applications.

You can read more about PbD here.

A Privacy Impact Assessment (PIA), which is required under GDPR for data-intensive projects, is a living document which must be made accessible to all involved with a project. It is the process by which you discuss, audit, inventory, and mitigate the privacy risks inherent in the data you collect and process.

These items seem less like extra work and more like work that should be done from the beginning as a default. Just as we formalize accessibility, performance, and browser/device support standards, we should be doing the same for privacy and data protection.

But what about third-parties? If I have Google Analytics on my site (I don’t), who is responsible for that data?

As I understand it, according to GDPR, the site/app owner is the ‘data controller,’ and the third-party service (like Google Analytics) is the ‘data processor.’ It is up to Google to be sure the data they process is GDPR compliant, but it would also be up to me as the data controller to be sure that my third-party vendors and services are in compliance. These roles further reinforce the need for organizations to regularly audit and itemize the third-party scripts and services they include with their web pages.

Smashing Conf 2018

I just got back from attending & speaking at Smashing Conf 2018. I had a great time—the speakers were excellent, and the Smashing team is always so helpful and supportive. It was my first time speaking about third-parties. I think it went well, and I’m thankful for the questions and feedback. Next, I’ll be researching privacy and adding some thoughts to the deck for the next talks for An Event Apart in Boston and Orlando!

Trent Walton speaking at Smashing Conf 2018 in San Francisco
Photo via Marc Thiele

Measuring Third-Party Cost

In another informative talk, Andy Davies looks at how you can measure the cost of a particular third-party script (just after the 4-minute mark).

  1. Test a page in WebPageTest
  2. Repeat with selected third-parties blocked
  3. Compare the two (filmstrip) results
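The comparison in step 3 is mostly visual, but the same idea can be sketched numerically. This minimal Python example subtracts the metrics of the blocked run from the baseline run. The metric names and every number here are made-up placeholders, not real test results or WebPageTest field names.

```python
def third_party_cost(baseline, blocked):
    """Return baseline minus blocked for every metric present in both runs.

    A positive value is the amount attributable to the blocked third parties.
    """
    return {key: baseline[key] - blocked[key] for key in baseline if key in blocked}


# Placeholder values for illustration only.
with_third_parties = {"requests": 120, "bytes": 3_400_000, "load_time_ms": 5200}
without_third_parties = {"requests": 45, "bytes": 1_900_000, "load_time_ms": 2800}

cost = third_party_cost(with_third_parties, without_third_parties)
for metric, delta in cost.items():
    print(f"{metric}: {delta}")
```

A single pair of runs can be noisy, so in practice it probably makes sense to repeat each test a few times and compare medians rather than one-off numbers.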

This process can help illustrate performance and UX tradeoffs being made, but Andy also suggests feeding those results into a RUM (real user monitoring) dashboard to understand the business impact.

“What if I was a second faster? How many more conversions would I get? You can begin to translate that user-experience impact into pounds and pence.”

This step had never occurred to me, and after a teeny bit of research on RUM tools, I realized that quite a few past clients likely had this data available; I just didn’t know what to do with it or how to ask for it. A post on the Soasta blog (via Charles Vazac) further illustrates utilizing RUM for these purposes.

“What if my digital property had better performance? How would that affect the bottom line of my company?”

Next, I’m going to try and get some experience with RUM tools and work with a client to get these sorts of data points. If you’ve got any screenshots or stories using RUM tools to do this sort of thing that you can share, please give me a holler!

Please Turn Off Your Ad Blocker

It’s common for sites to ask users to turn off ad blockers. I often happily oblige to support sites I value. I’d likely oblige more often if the differences in experience weren’t so extreme. Here’s one example (via Sam Kap):

How To Turn Off Your Adblocker for Food Network

Not an unreasonable request: “Turn off your ad blocker so we can continue to make content. We won’t hit you with pop-up ads.” But note my “before” experience:

31 requests, 6.73 MB in 1.83 seconds

After turning my ad blocker off, the experience changed dramatically:

348 requests, 14.87 MB in 34.74 seconds
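To put the two measurements above side by side, here is a small Python sketch that divides each "after" value by its "before" value. The numbers are the ones from the two screenshots; the dictionary keys are just labels chosen for the example.

```python
before = {"requests": 31, "megabytes": 6.73, "seconds": 1.83}
after = {"requests": 348, "megabytes": 14.87, "seconds": 34.74}


def growth_factors(before, after):
    """Return how many times larger each 'after' value is, rounded to one decimal."""
    return {key: round(after[key] / before[key], 1) for key in before}


factors = growth_factors(before, after)
for metric, factor in factors.items():
    print(f"{metric}: {factor}x")
```

That works out to roughly 11 times the requests, a little over twice the bytes, and about 19 times the load time.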

Not only was the page heavier and slower to load, but there was also a lot of scroll-jank and processor lag. Because the differences are so huge, the request was starting to seem less reasonable. And to top things off, I got a pop-up anyways:

food network website

Sites with ads are one thing. Sites with such a high number of ads, trackers, analytics, and A/B testing resources loaded via a long list of third-parties are something else. I don’t think Food Network is unique here, but it is a good example of what happens when third-party inclusions get out of hand.

When implementing third-party scripts/services, I think organizations need to:

  • Monitor page speed and processor lag
  • Evaluate UX implications
  • Avoid redundant scripts/services
  • Establish criteria for measuring the costs/benefits of each script/service

Maybe most organizations do this already, and perhaps it still makes business sense for them to include everything. I hope not.

Identifying Third-Party Scripts

Now that I’d spent time itemizing third-party scripts and finding the most prevalent ones, I wanted to identify what these scripts/services do, who owns them, etc.

I’d already found myself wondering, “what is the Rubicon Project?” or “why is 2mdn.net everywhere?” while digging into client sites. Initially unable to find a third-party index to answer these questions, I set out to create one of my own:

Rough/WIP third-party index site

As excited as I was to have a quick Jekyll site with live search stubbed out, I quickly realized that properly researching & indexing all these would be no small feat. Early in my process, I luckily stumbled across the tracker index on the site for Ind.ie’s Better privacy tool:

better.fyi tracker index
better.fyi/trackers

Bookmarked! Problem solved! I should have known that Aral, Laura, and the Ind.ie team might have something like this. It’s a great resource, and likely the result of some really hard work. Thanks to them for making it available!

Tag Manager Chat with Vector Media Group

Longtime friend and collaborator Matt Weinberg from Vector Media Group recently asked if I’d like to chat with him and his coworker Lee Goldberg about Vector’s approach to tag managers (and web marketing in general). I quickly took them up on their generous offer.

As someone who primarily sees tag managers (and all their potential third-party inclusions) as a performance hit, I wanted to gain a better perspective on the value of these marketing tools. That way, I can be a little more objective, seeing things from all sides when inter-departmental discussions come up during projects.

Lee walked me through Vector’s tag manager setup, citing specific examples of increased sales, conversion, time on site, etc. for clients along the way. The results were compelling and would be hard to argue against as far as a company balance sheet is concerned.

But here’s the thing—I think Vector uniquely embodies a holistic approach to marketing on the web. Design, development, and marketing work together, as Matt states:

“We work towards a global maximum versus a local maximum. Small, hyper-focused changes may seem good at the time, but we have to be mindful of the overall quality of the user-experience and integrity of the brand.”

Lee seeks to get involved early:

“An analytics strategy should be part of the initial development and design process so that together we can define business goals, figure out how to measure them, and develop a method for using that data to further refine the user experience.”

It’s not the tag manager, but how it’s used that actually impacts the quality of a site, and I haven’t found that teams using tag managers always make decisions with the same level of care as Vector.

tag manager
Me, marveling at how easy it is to throw tags on a website

Yes, tag managers can offload the burden of site updates from developers, but what burden are we on-loading to users? Every inclusion warrants a comprehensive discussion around its value. As Reagan says, Talk to Each Other!

Third-Party Script Prevalence on Alexa Top 50

Since beefing up my technique for itemizing third-party requests, I’ve been looking into which third-party domains and services are most prevalent. Starting with a spreadsheet of the top 50 US sites (according to Alexa in late December 2017), I downloaded and analyzed HAR files for each. Caveats:

  • Results (# of requests) may change as sites are updated.
  • I chose to start with US sites because of my own familiarity.
  • Only 46 sites were actually tested (3 NSFW sites were skipped, and t.co doesn’t apply).
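For anyone who wants to repeat the HAR step, here is a minimal Python sketch of pulling third-party domains out of one HAR file. It relies on the standard HAR layout (`log.entries[].request.url`). The `base_domain` helper is a naive assumption of mine that keeps only the last two labels of a hostname, so it mishandles suffixes like `co.uk` and collapses subdomains that the tables in this post list separately (such as `fonts.googleapis.com`). It is not necessarily how the spreadsheet was built.

```python
import json
from urllib.parse import urlsplit


def base_domain(hostname):
    """Naive helper: keep the last two labels (wrong for suffixes like co.uk)."""
    return ".".join(hostname.lower().split(".")[-2:])


def third_party_domains(har, site_domain):
    """Return the set of base domains requested that differ from the site's own."""
    own = base_domain(site_domain)
    domains = set()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        if host is None:  # e.g. data: URLs have no hostname
            continue
        domain = base_domain(host)
        if domain != own:
            domains.add(domain)
    return domains


# Tiny hand-written HAR fragment for illustration; a real file would be
# loaded with json.load(open(path)) instead.
SAMPLE_HAR = json.loads("""
{"log": {"entries": [
  {"request": {"url": "https://www.example.com/index.html"}},
  {"request": {"url": "https://static.example.com/app.js"}},
  {"request": {"url": "https://www.google-analytics.com/analytics.js"}},
  {"request": {"url": "https://stats.g.doubleclick.net/collect"}}
]}}
""")

print(sorted(third_party_domains(SAMPLE_HAR, "example.com")))
```

A more careful version would use a public suffix list to find the registrable domain, and would need a way to treat a site's own CDN hostnames as first-party.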

After a few grunt work sessions, I have a spreadsheet that cross-references those top 46 sites with the domains from which third-party requests originate. I’ll spare you the entire spreadsheet, but some initial points of interest are below:

  • The top 46 US sites made requests from 213 different domains.
  • CDN-related request domains are included.
  • 75 of those different domains are unique to 1 site.
  • Third-party serving domains are prevalent. Here are the domains that appear on 10 or more of the 46 sites tested:
Third-party domain       # of top 46 sites   % of top 46 sites
doubleclick.net          38                  82.6%
facebook.com             32                  69.6%
google-analytics.com     27                  58.7%
googlesyndication.com    25                  54.3%
googleadservices.com     24                  52.2%
cloudfront.net           20                  43.5%
googleapis.com           20                  43.5%
scorecardresearch.com    18                  39.1%
2mdn.net                 17                  37.0%
adnxs.com                17                  37.0%
fastly.net               17                  37.0%
akamaihd.net             16                  34.8%
amazonaws.com            16                  34.8%
demdex.net               15                  32.6%
googletagservices.com    15                  32.6%
adsrvr.org               14                  30.4%
fonts.googleapis.com     14                  30.4%
bing.com                 13                  28.3%
connect.facebook.net     13                  28.3%
fbcdn.net                12                  26.1%
quantserve.com           12                  26.1%
truste.com               12                  26.1%
adsafeprotected.com      11                  23.9%
amazon-adsystem.com      11                  23.9%
analytics.twitter.com    11                  23.9%
bluekai.com              11                  23.9%
jquery.com               11                  23.9%
rlcdn.com                11                  23.9%
rubiconproject.com       11                  23.9%
serving-sys.com          11                  23.9%
turn.com                 11                  23.9%
moatads.com              10                  21.7%
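The counts and percentages in this table boil down to "how many sites does each domain appear on, divided by the number of sites tested." Here is a minimal Python sketch of that tally. The site names and domain sets in the sample are hypothetical, and the function is my own illustration rather than the spreadsheet formula actually used.

```python
from collections import Counter


def prevalence(site_domains):
    """Given {site: set of third-party domains}, return a list of
    (domain, site_count, percent_of_sites) sorted by count, then name."""
    total = len(site_domains)
    counts = Counter(
        domain for domains in site_domains.values() for domain in set(domains)
    )
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(domain, n, round(100 * n / total, 1)) for domain, n in ranked]


# Hypothetical input, only to show the shape of the data.
SAMPLE_SITES = {
    "site-a.example": {"doubleclick.net", "facebook.com"},
    "site-b.example": {"doubleclick.net"},
    "site-c.example": {"doubleclick.net", "facebook.com", "jquery.com"},
    "site-d.example": set(),
}

for domain, count, percent in prevalence(SAMPLE_SITES):
    print(f"{domain}: {count} sites ({percent}%)")
```

The same arithmetic reproduces the first row of the table: 38 of 46 sites is `round(100 * 38 / 46, 1)`, or 82.6%.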

Here is the number of unique third-party domains from which requests originate (counting each domain once, regardless of how many requests it serves) for each of the top 46 sites:

Site                   # of third-party domains
nytimes.com            64
washingtonpost.com     63
metropcs.mobi          59
cnn.com                57
ebay.com               49
msn.com                45
microsoft.com          43
wikia.com              42
salesforce.com         40
bestbuy.com            38
imdb.com               37
twitch.tv              37
espn.com               36
wordpress.com          32
target.com             29
diply.com              28
reddit.com             28
amazon.com             27
walmart.com            23
stackoverflow.com      21
etsy.com               19
pinterest.com          19
paypal.com             18
imgur.com              13
twitter.com            13
chase.com              12
youtube.com            12
linkedin.com           11
yahoo.com              11
yelp.com               11
tumblr.com             10
bankofamerica.com      9
(outlook) live.com     9
netflix.com            8
office.com             7
facebook.com           6
google.com             6
blogspot.com           5
bing.com               4
github.com             4
instagram.com          4
microsoftonline.com    4
wellsfargo.com         3
apple.com              2
craigslist.org         2
wikipedia.org          1

I don’t have much commentary at this point. Some numbers are higher or lower than expected, and it’s understandable why some sites have more than others. Heck, with video or tweet embeds my site can pull in 10+ third-party domains surprisingly easily. I need more context, so perhaps it’d be logical to run more tests around a single category.

Next, I’ll be researching the more prevalent third-party services. Many of them are new to me. Further down the road, I’ll look at performance impact, but that’s a whole ’nother ball of wax.