web:config

tips and tricks for the interwebs

URL filters - The Discipline of Annoyance & Demise


Note: title kindly adapted from Emperor's 2001 album

Let's do a little experiment.
1. Go to the website http://www.asdf.com/
2. Select "Block content" from the right-click menu.
3. Add http://www.asdf.com/* to the blocked URLs.
4. Now open this URL. Here's what you see:

5. Go to a nonexistent website like http://foo.bar/. Here's what you see:

Can you tell the difference? No? Look a little closer, lean toward the screen, look from different angles. Still can't see the difference? Well, the second page has some red in it.

What does this mean?
Whoops! In the second case you stumbled upon an invalid domain and Opera could not connect. But what about the first case? Opera deliberately blocked the URL.

Now imagine that you just downloaded one of those ad lists online. You start to wonder: hmm, Opera can't render this webpage. It's a Google conspiracy for sure.

Most filter files encountered online are too generic and hit too many false positives.
Consider the case described in this thread.
The user complained because Opera blocked an image which had "/ad4" in its base64 content. Well, too bad: download a too-generic filter file and you will hit false positives. He was lucky that the problem was this simple to solve. If you check the OP's filter file, wherever there should be either a dot and a wildcard, or a wildcard and a dot, there is no dot, because the person who wrote that filter file didn't consider that "ad4" could ever be part of a valid URL, which that rule would happily block.
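
That false positive is easy to reproduce. Here is a minimal sketch in Python, assuming Opera's * behaves like a shell-style wildcard that matches any run of characters (approximated here with fnmatch, which is not Opera's actual matcher). The rules and URLs are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: one overly generic, one pinned to a full host.
GENERIC_RULE = "*/ad4*"
SPECIFIC_RULE = "http://ads.example.com/*"

# Hypothetical URLs: an inline image whose base64 payload happens to
# contain "/ad4", and an actual ad request.
inline_image = "data:image/png;base64,iVBORw0KGgo/ad4AAAANSUhEUg=="
ad_request = "http://ads.example.com/banner.gif"

def is_blocked(url, rule):
    """Return True if the wildcard rule matches the whole URL (assumed semantics)."""
    return fnmatchcase(url, rule)

print(is_blocked(inline_image, GENERIC_RULE))   # the generic rule hits the innocent image
print(is_blocked(inline_image, SPECIFIC_RULE))  # the full-domain rule leaves it alone
print(is_blocked(ad_request, SPECIFIC_RULE))    # and still catches the ad
```

The generic rule has no idea what a host or a path is. It only sees a string, so any URL containing "/ad4" anywhere is fair game.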

Now, why did I initially approach this subject with the asdf.com example?
Simple. Opera currently does not provide any way to tell the user that some content is blocked. Iframes go blank, scripts that fail to load are reported in the error console, and that's it. Opera definitely needs a friendly error page telling the user that a URL was blocked. With that, the user would inspect the info panel, or whatever, and realize that there is blocked content.

As an extra case, I once saw a user complaining he couldn't open any website whose address started with http://count*
Where do I recognize that from? Most statistics websites' domains start with "count". So the poor user downloaded a bad filter file and actively contributed to Opera not being counted in online statistics. The author of that filter file must be proud.
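
To see how much collateral damage that single rule causes, here is a small sketch under the same assumption as before (shell-style wildcard matching via Python's fnmatch standing in for Opera's matcher). The URLs are hypothetical.

```python
from fnmatch import fnmatchcase

RULE = "http://count*"  # the rule from the complaint above

# Hypothetical URLs, for illustration only.
urls = [
    "http://counter.example.com/hit.js",  # a statistics script
    "http://countries.example.org/",      # an ordinary website
    "http://www.example.com/",            # unrelated
]

# Every URL whose host merely begins with "count" is caught.
blocked = [u for u in urls if fnmatchcase(u, RULE)]
print(blocked)
```

The rule catches the statistics script it was presumably written for, and every other site whose name happens to start with the same five letters.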

Conclusion: don't accept generic filter files out of ignorance. Inspect them first. Tamil has an almost good list, but I consider it too generic. My personal filter file has very few generic rules. The rest is all full domains, and I rarely see an ad.
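
Inspecting a list by hand gets tedious, so a rough first pass can be scripted. The sketch below is only a heuristic of my own, not anything Opera provides: it treats a rule as "full domain" when it names a scheme and a complete host with no wildcard before the first slash, and flags everything else as generic. The function name and the sample rules are hypothetical.

```python
import re

# Heuristic: scheme, then a host containing no "*" and no "/", then a slash.
FULL_DOMAIN = re.compile(r"^https?://[^/*]+/")

def split_rules(rules):
    """Split rules into (full-domain, generic) lists, preserving order."""
    specific, generic = [], []
    for rule in rules:
        if FULL_DOMAIN.match(rule):
            specific.append(rule)
        else:
            generic.append(rule)
    return specific, generic

rules = [
    "http://ads.example.com/*",  # full domain: only this host is affected
    "*/ad4*",                    # generic: matches "/ad4" anywhere
    "http://count*",             # generic: wildcard inside the host
]

specific, generic = split_rules(rules)
print("review these by hand:", generic)
```

Anything that lands in the generic list deserves a second look before you trust it.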

Happy browsing (without annoying ads).


Comments

Jasonpandasoangry Friday, April 20, 2007 8:26:17 PM

They could just add a notification for "blocked content" like they have popups.

EricJH Friday, June 8, 2007 8:59:01 PM

Where can I download your url filter? I can't seem to find it.

João EirasxErath Friday, June 8, 2007 11:04:56 PM

"I can't seem to find it."

It's not uploaded, and it's almost custom made.

EricJH Saturday, June 9, 2007 3:13:12 PM

Why not share it with the world when it rocks?

Scott Viviansvivian Wednesday, October 17, 2007 11:26:23 AM

I posted some thoughts in a similar vein at http://my.opera.com/community/forums/topic.dml?id=202032

It's quite annoying when a URL just won't open.

chuso Wednesday, January 2, 2008 5:57:01 PM

I totally agree with you, this is why I tried to make my urlfilter.ini as non-generic as possible.
I will share it as soon as I come back home (xmas holidays).

Arthur WilkinsonGT500 Tuesday, January 15, 2008 5:44:52 AM

Nice article xErath. I also stumbled upon a nice adblock list for Opera created by (believe it or not) a Firefox user. He keeps multiple versions of the list, including one that does not block tracking sites such as Hitslink, Statcounter, etc. For the most part I don't run into issues while using it, but there have been a few instances where a page simply refused to load, and eventually I figured out that it was the content blocker that was doing it (which is simple to disable if you manage to figure out that it's the source of the trouble).

Anyway, for anyone who wants to know more about this list, I wrote a bit about it here.

serious Sunday, March 2, 2008 9:05:57 PM

jo, nice article. I once blocked "*layer.php*" and then wondered why the video player on a site wouldn't work. The solution: the page for loading the player was called "player.php"

_Grey_ Monday, March 3, 2008 2:01:15 PM

Wouldn't this problem be solved if the "blocked urls" were only evaluated from the start of the line?

^http://www.asdf.com/*

E.g. http://www.asdf.com/ wouldn't be blocked, since Opera doesn't evaluate * to "no character", also anything starting with data: wouldn't be blocked because it doesn't start with http:// ...

edit: Nevermind. Proper subdomain-blocking won't be possible using this approach; e.g.

^http://*.asdf.com/* would also match http://example.com/?p=foo.asdf.com/
