MacKuba

Kuba Suder's blog on Mac & iOS development

I'm building an ad blocker

Categories: Cocoa, JavaScript, Mac

Since my update to the iOS version of Banner Hunter was rejected by app review, the app’s been in a kind of Schrödinger state, both dead and alive. It’s still selling those few copies a week, and I’m updating the blocklist, but I’m afraid to make any updates to the Mac app now… So since then I’ve been looking for ideas for other apps I could build instead.

One thing I started working on is a Chrome version of Banner Hunter. I wasn’t really planning to do it before, but since Apple has pushed me now… I might as well give it a try. I have no idea if it’s possible to make any money on the Chrome Web Store, since the vast majority of extensions are free, but we’ll see. The main part of the app is done, but I still need to work on the non-technical parts like graphics and copy, and it will probably have to wait until late summer at least.

I’ve got another idea though, one that kind of came up by itself, which is… to build an ad blocker for Safari.

(TLDR: here’s the landing page.)

Researching the options

I had been a long-time user of Ghostery, since it was the closest to what I want from an ad blocker and it worked well. But when I upgraded to Catalina and Safari 13, I finally had to let go of the old Ghostery extension, and the new “Ghostery Lite” is just not as good as the old one. So I started looking for alternatives.

Turns out, there aren’t really that many good, reputable, popular ad blockers on the Mac App Store – I think it’s easily a single-digit number. I’ve tried a few, but I wasn’t completely happy with any of them. That doesn’t mean they’re not objectively good – they should work fine for a lot of people, possibly most people – but it’s just not what I personally want.

Here’s what I want from an ideal ad blocker:

  • it should mercilessly block any unnecessary code loaded from external sources that slows down page load and sends tracking cookies who knows where – this conveniently blocks almost all ads, but it should also block any non-ad things that spy on me (like Google Analytics or Facebook tracking pixels)
  • it should NOT try to globally block any resource and any div on any page that includes the word “ad”, “banner”, “popup” etc. – I believe this makes it way too easy to hit false positives and randomly break some innocent sites in the process (and it seems to be much easier for anti-adblockers to detect, since they usually create a “bait” div that screams “this box is an ad, block me!”); unfortunately, most public ad lists seem to work this way
  • it should not try to do so much that it becomes bloated in some way, or needs to be split across several separate content blockers, etc.
  • it should allow some basic configuration like site whitelisting
  • it should have a native look & feel

Most of the apps I’ve tried were either doing way too much for me (e.g. 1Blocker – 8 content blockers and requires a subscription, Wipr – 3 blockers) or too little (Ka-Block – just ~500 rules, Wipr – no whitelisting by design), and often they also didn’t look right in terms of UI (Ghostery Lite, or AdGuard – UI made with Electron 😑).

The ones that came the closest were Better and Ghostery Lite, and conveniently, both are open source. So I ended up playing with the Ghostery Lite source and modifying it a little, replacing the popup UI and tweaking the blocklist.

But at some point I thought – since I’m already spending time on this… maybe I can just do it better from scratch?

Collecting the blocklist

I started by building a tool for scanning sites for trackers. I used a modified WKWebView to load a large number of sites (a couple thousand) one by one and record the URLs of all external resources to a JSON file.

What I noticed in the results is that the resources are very unevenly distributed across the checked sites: there are a few extremely popular ones, like Google Analytics, which can be found on as many as 2/3 of all sites (!):

#   Resource                                                     No. of domains  %
1   https://www.google-analytics.com/analytics.js                3189            65.2%
2   https://fonts.googleapis.com/css                             1983            40.5%
3   https://www.googletagmanager.com/gtm.js                      1879            38.4%
4   https://www.facebook.com/tr/                                 1366            27.9%
5   https://connect.facebook.net/en_US/fbevents.js               1320            27.0%
6   https://adservice.google.com/adsid/integrator.js             1163            23.8%
7   https://adservice.google.pl/adsid/integrator.js              1153            23.6%
8   https://www.googletagmanager.com/gtag/js                     708             14.5%
9   https://www.googletagservices.com/tag/js/gpt.js              689             14.1%
10  https://www.googleadservices.com/pagead/conversion_async.js  671             13.7%
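
A ranking like this comes down to a simple count: for each resource URL (ignoring the query string), how many distinct scanned sites load it. Here is a minimal sketch of that aggregation in Python. It assumes the scan output is a mapping from a site's domain to the list of external URLs it loaded; that layout, the function names and the sample data are all made up for illustration and are not the actual format the scanning tool writes.

```python
from collections import Counter
from urllib.parse import urlsplit


def strip_query(url):
    """Drop the query string and fragment so variants of one resource count as one."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


def count_resources(scan):
    """Count on how many scanned sites each resource appears.

    `scan` maps a site domain to the list of external URLs it loaded
    (a hypothetical layout, not the real tool's JSON format).
    """
    counts = Counter()
    for urls in scan.values():
        # A set, so a resource loaded several times by one site counts once.
        counts.update({strip_query(u) for u in urls})
    return counts


def top_resources(scan, limit=10):
    """Return (url, site_count, percent_of_sites) rows, most common first."""
    total = len(scan)
    return [
        (url, n, round(100 * n / total, 1))
        for url, n in count_resources(scan).most_common(limit)
    ]


# Made-up sample data, for illustration only.
sample = {
    "example.com": [
        "https://www.google-analytics.com/analytics.js",
        "https://fonts.googleapis.com/css?family=Lato",
    ],
    "example.org": [
        "https://www.google-analytics.com/analytics.js",
        "https://www.google-analytics.com/analytics.js?x=1",
    ],
    "example.net": ["https://fonts.googleapis.com/css?family=Roboto"],
}

for url, n, percent in top_resources(sample):
    print(f"{url}  {n}  {percent}%")
```

Counting per site rather than per request is what keeps one script-heavy page from skewing the ranking.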

So following the 80/20 rule, I should be able to solve much of the problem with a fairly small number of blocking rules – small enough that it should be doable for one determined person. And indeed, it appears that just by blocking Google ads and a couple of others I get to a point where I don’t really see any ads almost anywhere. Sure, there’s still a very long tail of smaller providers on less popular sites that still get through, but does it really matter that much? I don’t need to catch every single ad.
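
To give an idea of how small such a blocklist can be, here is a rough sketch that turns a handful of domains into rules in Safari's content blocker JSON format (a list of `trigger` / `action` pairs). The domains are taken from the table above, but the selection, the `block_rule` helper and the exact `url-filter` pattern are assumptions made for this example, not the actual rules used in the app.

```python
import json


def block_rule(domain):
    """Build one content blocker rule that blocks third-party loads from `domain`."""
    escaped = domain.replace(".", r"\.")
    return {
        "trigger": {
            # Matches the domain itself and any of its subdomains, over http or https.
            "url-filter": r"^https?://([^/]+\.)?" + escaped + "[:/]",
            # Only block when the resource is loaded by some other site.
            "load-type": ["third-party"],
        },
        "action": {"type": "block"},
    }


# A few of the domains from the table above; the real blocklist is not shown here.
DOMAINS = [
    "google-analytics.com",
    "googletagmanager.com",
    "googleadservices.com",
    "connect.facebook.net",
]

rules = [block_rule(d) for d in DOMAINS]
print(json.dumps(rules, indent=2))
```

Anchoring the pattern to the host part of the URL, instead of matching a keyword anywhere, is what avoids the false positives described earlier.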

Of course, that still leaves non-ad trackers which are also problematic, even if they don’t display anything. There seems to be a whole industry of services solely devoted to spying on your website’s visitors and invading their privacy in order to make them more likely to buy your stuff, and it’s honestly terrifying:

LuckyOrange: A visitor just left your website without converting. See everything they did before they left.
So I’m not allowed to visit a website without “converting” now? (into what?)

This one literally advertises in the video on their website, quote: “It’s like sitting behind each customer’s screen and watching every click and keystroke”.

~

Market to people, not cookies. Today's consumer expects more than yesterday's tech. Use BounceX to accurately recognize and market to the actual person behind every visit in real-time.
Today’s consumer expects to *not* be recognized on every website

~

I find this disgusting and against what the Internet is supposed to be, so I intend to fight this if I can, and I will be gradually building up the blocklist to include services like this.

On the other hand, in most cases I don’t intend to block first-party ads on purpose, unless they’re exceptionally annoying. I really don’t mind if some news site shows some small static banner here and there that doesn’t draw my attention and doesn’t send my ID or fingerprint somewhere else. I understand that sites like this need to make money somehow, and that ads are the primary way of doing so, and I’m ok with this as long as they’re: 1) not making my browsing experience worse, and 2) done locally and not a part of some evil global surveillance network. I kind of wish someone would come up with some new-generation privacy- and user-focused ad network that handles everything server-side and doesn’t track users.

MyApple.pl website. Article links in the top and bottom parts, a banner ad in the middle that seems to blend in.
This is ok.

The current state

Initially I was working on this mostly for myself and as an experiment, so I didn’t have any proper UI beyond a toolbar popup that shows a list of blocked URLs. But I asked on Twitter and people seemed interested, so last week I put in some work: I added a main window with an explanation and a report button, and integrated Sparkle for updates.

Like I said, right now it seems to block ads on most sites just by blocking several major ad networks. It also blocks the most popular tracking scripts, like Google Analytics, Facebook and a bunch of others. At the moment I have around 500 entries on the blocklist, and I’m not nearly done yet. But I’m already using it as my only ad blocker in Safari on both my main machines, so it may already work well enough for you.

Apart from the blocklist (which is really the most important part), the rest is an MVP. Reporting works by opening a mailto: link. Updating is only done via Sparkle. For whitelisting, for now you’ll need to exclude content blockers on a site in Safari preferences. But I’m planning to work on all of those over the summer, and get to a point where I can release a 1.0, hopefully later this year. If I manage to do it, it will be on the Mac App Store like Banner Hunter, it will be a one-time purchase, not a subscription, and it will be cheap – probably not more than $2.
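
The mailto: approach to reporting is simple enough to sketch: the report is just a URL with the subject and body percent-encoded. The snippet below is an illustration only; the address is a placeholder, and the subject and body layout are invented for the example rather than taken from the app.

```python
from urllib.parse import quote


def report_link(address, site, blocked_urls):
    """Build a mailto: URL with the site and its blocked resources pre-filled."""
    subject = f"Site report: {site}"
    # mailto: bodies use CRLF line breaks, which quote() encodes as %0D%0A.
    body = "\r\n".join([f"Site: {site}", "Blocked resources:", *blocked_urls])
    return f"mailto:{address}?subject={quote(subject)}&body={quote(body)}"


# Placeholder address and sample data, for illustration only.
link = report_link(
    "reports@example.com",
    "www.macrumors.com",
    ["https://www.google-analytics.com/analytics.js"],
)
print(link)
```

The upside of this design is that nothing is sent without the user seeing it first in their own mail client.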

Ideally I’d also love to include an option to pick which specific services are blocked (or pick from a few presets), and possibly enable specific services on a specific site, like the old Ghostery used to – e.g. to enable Facebook comments on a specific site I often use, but not everywhere. It should be possible with the new API, I think. I plan to make some subset of services that I find harmless enabled by default (e.g. NewRelic), but give an option to move the slider all the way to “kill ‘em all” and disable those and some more.
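
One way per-site exceptions like this could be expressed, as a rough sketch: content blocker rules accept an `unless-domain` list in the trigger, which skips the rule on the listed sites. Whether the final app ends up doing it this way is an open question; the `service_rule` helper, the filter pattern and the example site below are all hypothetical.

```python
import json


def service_rule(url_filter, allowed_sites=()):
    """Block a service everywhere except on the sites in `allowed_sites`."""
    trigger = {"url-filter": url_filter, "load-type": ["third-party"]}
    if allowed_sites:
        # Entries must be lowercase; a leading "*" also covers subdomains.
        trigger["unless-domain"] = ["*" + site.lower() for site in allowed_sites]
    return {"trigger": trigger, "action": {"type": "block"}}


# Hypothetical example: block Facebook's script everywhere except on one site.
facebook = service_rule(
    r"^https?://connect\.facebook\.net/",
    allowed_sites=["Example.com"],
)
print(json.dumps(facebook, indent=2))
```

Since the exceptions live inside the rules themselves, changing them means regenerating the rule list and reloading the content blocker.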

I made a temporary landing page that explains what the project is and lets you download an alpha version. If you want to give it a try, head to: mackuba.eu/adblockerbeta/.

Popup displayed from the Safari toolbar. Shows: www.macrumors.com; Blocked resources: 9; Report site button; list of the resource URLs below.

