Web bugs
Friday, 21. September 2007, 13:37:04
A web bug is something on a web page (or in HTML email or such) that performs some task when the user's browser loads it. Web bugs are usually invisible to the user, for example a transparent 1x1 image or an iframe.
They can be used for various things, ranging from malicious to useful. For example, you can use them to track visitors on certain pages. Another example is spam email, which often contains them: if your email client loads the web bug, the spammer receives confirmation that your email address is active and that you read the message, which means more spam for you.
I recently found a quite interesting application of the web bug: Pseudocron.
cron is a Unix/Linux utility for scheduling scripts and other commands. For example, you could set up a cronjob that runs a database cleanup script once every 12 hours. What pseudocron does is let you run tasks in a cron-like manner but without cron. This is useful for people on cheap web hosting who don't have shell access and therefore cannot use cron.
Pseudocron uses a web bug placed on a web page. When the bug is loaded, it runs the pseudocron script, which checks whether any tasks should have been run between the previous request and now.
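To make the "check what should have run since last time" idea concrete, here is a minimal sketch in Python. It is not Pseudocron's actual code: the task table, the interval-based schedule (instead of real crontab syntax), and the names `TASKS`, `due_tasks` and `run_due` are all assumptions made up for illustration.

```python
import time

# Hypothetical task table: task name -> seconds between runs.
TASKS = {
    "db_cleanup": 12 * 60 * 60,   # every 12 hours
    "rebuild_index": 60 * 60,     # every hour
}

def due_tasks(tasks, last_run, now):
    """Return the names of tasks whose interval has elapsed since their last run."""
    due = []
    for name, interval in tasks.items():
        # A task with no recorded run is treated as due right away.
        if now - last_run.get(name, float("-inf")) >= interval:
            due.append(name)
    return due

def run_due(tasks, last_run, runner, now=None):
    """Run every due task through `runner` and record the time it ran."""
    now = time.time() if now is None else now
    ran = due_tasks(tasks, last_run, now)
    for name in ran:
        runner(name)
        last_run[name] = now
    return ran
```

In a real setup `last_run` would have to be stored somewhere that survives between requests, such as a file or a database table, and `run_due` would be called from the script the web bug points at.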
I've also read that DokuWiki uses a web bug to run its page indexing script.
Problems with web bugs
As you can probably guess, a web bug will not run unless a user's browser requests the file. This is not a problem on high-traffic sites, but a high-traffic site has its own problem: if the bug runs on every user's request, you may end up with a lot of PHP threads doing the same things at once. This is why it's important to implement a locking mechanism that stops other threads from doing pointless work.
Implementation
There are a few things to note when implementing a web bug.
One is the locking I mentioned above, which can easily be done by, for example, creating a file or a directory. The script should always check whether the lock file/dir exists and, if it does, exit; otherwise it creates the lock, does its work, and removes the lock when done. Easy.
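A rough sketch of the directory-based lock in Python follows. The lock path and the function names (`LOCK_DIR`, `try_lock`, `release_lock`, `run_exclusively`) are made up for this example. It relies on directory creation being atomic, so "check whether the lock exists" and "create the lock" happen in a single step and two requests cannot both take the lock.

```python
import os
import tempfile

# Hypothetical lock location; any path the script can write to would do.
LOCK_DIR = os.path.join(tempfile.gettempdir(), "webbug.lock")

def try_lock(lock_dir=LOCK_DIR):
    """Try to take the lock; return True on success, False if it is already held."""
    try:
        os.mkdir(lock_dir)  # fails if the directory already exists
        return True
    except FileExistsError:
        return False

def release_lock(lock_dir=LOCK_DIR):
    """Give the lock back by removing the directory."""
    os.rmdir(lock_dir)

def run_exclusively(task, lock_dir=LOCK_DIR):
    """Run `task` only if nobody else holds the lock; return whether it ran."""
    if not try_lock(lock_dir):
        return False  # another request is already doing the work
    try:
        task()
    finally:
        release_lock(lock_dir)  # released even if the task raises
    return True
```

One caveat with this approach: if the process is killed before it reaches the `finally` block, the lock directory stays behind and has to be cleaned up some other way, for example by treating very old locks as stale.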
Second is that if your task is long, the browser may show the progress bar to the user, making them think that the page is still loading. You need to make sure you send the correct headers and data to the user's browser when they request the script so that the browser will think there is no more data and won't show the progress bar.
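As an illustration of "the correct headers and data", the sketch below builds a complete response for a transparent 1x1 GIF in Python. The key part is the Content-Length header, which tells the browser exactly where the response ends. The names `PIXEL`, `bug_response` and `app` are assumptions for this example, and the WSGI wrapper is just one possible way to serve it.

```python
import base64

# A commonly used transparent 1x1 GIF, stored base64-encoded.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

def bug_response():
    """Return (headers, body) for a complete image response."""
    headers = [
        ("Content-Type", "image/gif"),
        # Exact body size, so the browser knows when it has everything.
        ("Content-Length", str(len(PIXEL))),
        # Ask the browser not to cache, so each page view requests the bug again.
        ("Cache-Control", "no-store"),
    ]
    return headers, PIXEL

def app(environ, start_response):
    """Minimal WSGI application that serves the pixel."""
    headers, body = bug_response()
    start_response("200 OK", headers)
    return [body]
```

This only covers the response itself. How the slow task keeps running once the response is out depends on the server setup, so it is left out of the sketch.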
Also, the above will probably make your script stop once the browser decides there is no more data and closes the connection. To prevent this, you can call ignore_user_abort(true) in PHP to keep the script running even after the browser closes the connection. I don't know about other languages, but I'm sure they have their equivalents.







