Third-Party Script Prevalence on Alexa Top 50

Since beefing up my technique for itemizing third-party requests, I’ve been looking into which third-party domains and services are most prevalent. Starting with a spreadsheet of the top 50 US sites (according to Alexa in late December 2017), I downloaded and analyzed HAR files for each. Caveats:

  • Results (# of requests) may change as sites are updated.
  • I chose to start with US sites because of my own familiarity.
  • Only 46 sites were actually tested: 3 NSFW sites were skipped, and t.co doesn’t apply.
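
For anyone who wants to try a similar analysis, here is a minimal Python sketch of pulling third-party domains out of a HAR file. It is not necessarily how the spreadsheet for this post was built. A HAR file is JSON, and each request URL lives at log.entries[].request.url. The function names (`load_har`, `base_domain`, `third_party_domains`) and the sample data are made up for illustration, and grouping hosts by their last two labels is a simplifying assumption that mishandles suffixes like .co.uk.

```python
import json
from urllib.parse import urlsplit


def load_har(path):
    """Read a HAR file (plain JSON) from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def base_domain(host):
    """Naive grouping (assumption): keep the last two labels of a hostname.

    Fine for .com/.net style hosts; a public-suffix list would be needed
    to handle suffixes such as .co.uk correctly.
    """
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def third_party_domains(har, site):
    """Return the distinct non-first-party domains requested in one HAR."""
    first_party = base_domain(site)
    domains = set()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        if not host:  # e.g. data: URLs have no hostname
            continue
        domain = base_domain(host)
        if domain != first_party:
            domains.add(domain)  # a set ignores repeat requests to one domain
    return domains


# Hypothetical, hand-written HAR fragment (not real measurement data).
sample_har = {"log": {"entries": [
    {"request": {"url": "https://www.example.com/index.html"}},
    {"request": {"url": "https://stats.g.doubleclick.net/r/collect"}},
    {"request": {"url": "https://securepubads.g.doubleclick.net/tag.js"}},
    {"request": {"url": "https://connect.facebook.net/en_US/sdk.js"}},
    {"request": {"url": "data:image/gif;base64,R0lGOD"}},
]}}

print(sorted(third_party_domains(sample_har, "example.com")))
# → ['doubleclick.net', 'facebook.net']
```

With a real capture, `third_party_domains(load_har("some-site.har"), "some-site.com")` would give the set for one site, and `len()` of that set is its third-party domain count.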

After a few grunt-work sessions, I have a spreadsheet that cross-references those 46 sites with the third-party domains they request. I’ll spare you the entire spreadsheet, but here are some initial points of interest:

  • The top 46 US sites made requests to 213 different domains.
  • CDN-related request domains are included.
  • 75 of those domains are unique to a single site.
  • Third-party serving domains are prevalent. Here are the domains that appear on 10 or more of the 46 sites tested:
Third-party domain | # of top 46 sites | % of top 46 sites
doubleclick.net | 38 | 82.6%
facebook.com | 32 | 69.6%
google-analytics.com | 27 | 58.7%
googlesyndication.com | 25 | 54.3%
googleadservices.com | 24 | 52.2%
cloudfront.net | 20 | 43.5%
googleapis.com | 20 | 43.5%
scorecardresearch.com | 18 | 39.1%
2mdn.net | 17 | 37.0%
adnxs.com | 17 | 37.0%
fastly.net | 17 | 37.0%
akamaihd.net | 16 | 34.8%
amazonaws.com | 16 | 34.8%
demdex.net | 15 | 32.6%
googletagservices.com | 15 | 32.6%
adsrvr.org | 14 | 30.4%
fonts.googleapis.com | 14 | 30.4%
bing.com | 13 | 28.3%
connect.facebook.net | 13 | 28.3%
fbcdn.net | 12 | 26.1%
quantserve.com | 12 | 26.1%
truste.com | 12 | 26.1%
adsafeprotected.com | 11 | 23.9%
amazon-adsystem.com | 11 | 23.9%
analytics.twitter.com | 11 | 23.9%
bluekai.com | 11 | 23.9%
jquery.com | 11 | 23.9%
rlcdn.com | 11 | 23.9%
rubiconproject.com | 11 | 23.9%
serving-sys.com | 11 | 23.9%
turn.com | 11 | 23.9%
moatads.com | 10 | 21.7%
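
The cross-referencing itself boils down to counting, for each domain, how many sites request it, and dividing by the number of sites tested (for example, 38 of 46 is 82.6%). Here is a rough Python sketch of that tally, assuming you already have a mapping of site to its set of third-party domains. The function names and the sample mapping are invented for illustration and are not the data behind the table.

```python
from collections import Counter


def site_counts(site_domains):
    """Count the number of sites on which each domain appears."""
    return Counter(d for domains in site_domains.values() for d in set(domains))


def prevalence(site_domains):
    """Return (domain, sites, percent) rows, most prevalent first."""
    total = len(site_domains)
    counts = site_counts(site_domains)
    # Sort by count descending, then by name so ties come out in a stable order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(domain, n, round(100 * n / total, 1)) for domain, n in ordered]


def unique_to_one_site(site_domains):
    """Domains that show up on exactly one site."""
    return sorted(d for d, n in site_counts(site_domains).items() if n == 1)


# Hypothetical mapping (made-up sites, not real measurements).
sample = {
    "site-a.example": {"doubleclick.net", "facebook.com", "fastly.net"},
    "site-b.example": {"doubleclick.net", "facebook.com"},
    "site-c.example": {"doubleclick.net"},
    "site-d.example": set(),  # a site with no third-party requests still counts
}

for row in prevalence(sample):
    print(row)
# → ('doubleclick.net', 3, 75.0)
# → ('facebook.com', 2, 50.0)
# → ('fastly.net', 1, 25.0)

print(unique_to_one_site(sample))
# → ['fastly.net']
```

Sites with an empty set are kept in the mapping on purpose, so the percentage is taken over every site tested rather than only the sites that load third-party content.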

Here is the number of unique third-party domains requested by each of the 46 sites (multiple requests to the same domain count once):

Site | # of third-party domains
nytimes.com | 64
washingtonpost.com | 63
metropcs.mobi | 59
cnn.com | 57
ebay.com | 49
msn.com | 45
microsoft.com | 43
wikia.com | 42
salesforce.com | 40
bestbuy.com | 38
imdb.com | 37
twitch.tv | 37
espn.com | 36
wordpress.com | 32
target.com | 29
diply.com | 28
reddit.com | 28
amazon.com | 27
walmart.com | 23
stackoverflow.com | 21
etsy.com | 19
pinterest.com | 19
paypal.com | 18
imgur.com | 13
twitter.com | 13
chase.com | 12
youtube.com | 12
linkedin.com | 11
yahoo.com | 11
yelp.com | 11
tumblr.com | 10
bankofamerica.com | 9
(outlook) live.com | 9
netflix.com | 8
office.com | 7
facebook.com | 6
google.com | 6
blogspot.com | 5
bing.com | 4
github.com | 4
instagram.com | 4
microsoftonline.com | 4
wellsfargo.com | 3
apple.com | 2
craigslist.org | 2
wikipedia.org | 1

I don’t have much commentary at this point. Some numbers are higher or lower than I expected, and it’s understandable why some sites have more than others. Heck, with video or tweet embeds, my own site can pull in 10+ third-party domains surprisingly easily. I need more context, so it might be logical to run more tests within a single category.

Next, I’ll be researching the more prevalent third-party services, many of which are new to me. Further down the road, I’ll look at performance impact, but that’s a whole ’nother ball of wax.