Fun webpage download tool for Opera users
Okay, it works in Firefox, Safari and Konqueror too
Most of you probably know that images can be embedded in your pages and stylesheets using the data: protocol. An embedded image would look something like this: <img src=" ... " />
Well, did you know that pretty much any other type of file can be included this way too?
I've built a little tool which will access a webpage, grab all associated content, including images, background images, stylesheets and javascript, and package them using data: URIs into one single file for you to download. No external files required! Now you don't need to Save as.. "HTML file with images" into a folder to hold all the images and stylesheets that get downloaded with it. All of those get embedded in a single HTML page.
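For anyone curious what that packaging step looks like, here is a minimal Python sketch of the idea. It is not the actual script behind the tool: `to_data_uri`, `inline_images` and the `load` callback are made-up names for illustration, and the regex only handles double-quoted `src` attributes on `<img>` tags.

```python
import base64
import re

def to_data_uri(content: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data: URI (RFC 2397)."""
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

# Captures: (everything up to the opening quote)(the URL)(closing quote)
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)

def inline_images(html: str, load) -> str:
    """Replace every <img src="..."> with an embedded data: URI.

    `load` is a hypothetical callback: given a URL it returns
    (bytes, mime_type). A real tool would fetch the URL over HTTP here.
    """
    def replace(match):
        url = match.group(2)
        if url.startswith("data:"):
            return match.group(0)  # already embedded, leave untouched
        content, mime_type = load(url)
        return match.group(1) + to_data_uri(content, mime_type) + match.group(3)

    return IMG_SRC.sub(replace, html)

# Usage with an in-memory "server" instead of real HTTP requests.
fake_files = {"logo.gif": (b"GIF89a", "image/gif")}
page = '<p><img alt="logo" src="logo.gif" /></p>'
print(inline_images(page, fake_files.__getitem__))
```

Stylesheets and scripts could be handled the same way with their own patterns and MIME types; the sketch covers images only to keep it short.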
This file is portable, just like a PDF, so I call them PHFs, for Portable Hypertext Format.
It's just too bad that these generated pages will not work in MSIE.
I am beta testing the script here: http://www.greywyvern.com/code/php/phf-demo so take a look! Go easy on my server though, the process is pretty CPU-intensive! Report bugs to bugs [at] greywyvern.com
PlanetWerks 2
- A planetarium on your desktop! :: The Puffin Archive: Opera's unofficial mascot?
:: Opera rendering bug list
Sniffles is my hero...


Hopefully some sort of userjs version of this could come up. That would be nice, if possible that is.
Oh I almost forgot.. BE Gʚʚk BE Gȫd-Lʚšš http://godlessgeeks.com/

This could be a solution to the so many times requested "save as like IE". Needless to say that this is a much better feature.
P.S. Don't post this in the general forum until it's available for download or your server will hate you.


topic.phf.zip
I had to zip it; apparently the Opera forums do some whacked-out stuff to the embedded javascript when uploading an HTML file.
5. November 2005, 10:57:12 (edited)
I've tested with this page: www.javaworld.com: unfortunately, if I open the page in Opera, it's a mess; in Firefox, instead, it seems ok.
Now I just put the URL in and I get a download dialog for the page. I've wanted this feature for such a long time. Finally!!

Suggestion: how about hosting this page in multiple places (free webhosting) to reduce your server load?
Is there any chance that you will make this available as a download, rather than just as an online service?
Just drag the page to the panels, and now it's as good as downloaded.

Originally posted by andystriker:
Very cool
I've tested with this page: www.javaworld.com: unfortunately, if I open the page in Opera, it's a mess; in Firefox, instead, it seems ok.
Yesh, looks like some mishandling of iframes when there are many of them. Since Firefox handles them fine, it may be an Opera bug; I'll look into it. Did you try it in Opera 8.5 or 9.0p1?
EDIT: I just checked, and my downloaded page from javaworld works just fine in 9.0p1
5. November 2005, 20:39:03 (edited)
Originally posted by GreyWyvern:
Did you try it in Opera 8.5 or 9.0p1?
Opera 8.5. Can you tell me where to find more about this protocol?

http://www.faqs.org/rfcs/rfc2397.html
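In short, that RFC defines the form `data:[<mediatype>][;base64],<data>`. Here is a small Python sketch of taking one apart; the function name `parse_data_uri` is only for illustration, and the default media type is the one the RFC specifies when none is given.

```python
import base64
from urllib.parse import unquote_to_bytes

def parse_data_uri(uri: str):
    """Split a data: URI into (media_type, raw_bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("missing comma between header and data")
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]
    # RFC 2397: an omitted media type means text/plain;charset=US-ASCII
    media_type = header or "text/plain;charset=US-ASCII"
    if is_base64:
        data = base64.b64decode(payload)
    else:
        data = unquote_to_bytes(payload)  # plain payloads are percent-encoded
    return media_type, data

print(parse_data_uri("data:text/plain;base64,SGVsbG8="))
print(parse_data_uri("data:,Hello%20World"))
```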
6. November 2005, 00:55:56 (edited)
Brilliant idea!
Seems to work fine.

BTW: why is this in Lounge? Maybe good to post in tech forums too. Geek(K)s may love this.
Originally posted by GeekK:
BTW: why is this in Lounge? Maybe good to post in tech forums too. Geek(K)s may love this.
There are still some optimizations I need to make. I'm only worried that my hosting server will not enjoy it very much if 100 people try to use the tool at once.
Perhaps it can handle it fine, perhaps not, but I think it's better that just a few people test it while I am still working on it.
I've been hard at work today, fixing up some misbehaviours and adding a couple more replacing rules. Now even things like <body background="URI"> and <table background="URI"> are replaced, although there is a bug in Opera which imposes a limit on data: URI length in these cases (I reported it). What else can you do?
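A replacing rule of that kind might look roughly like the Python sketch below. This is an assumption about the general shape, not the tool's real rule: the tag list (`body`, `table`, `td`), the name `inline_backgrounds` and the `load` callback are all hypothetical.

```python
import base64
import re

# Legacy `background` attribute on a few tags (double-quoted values only).
BACKGROUND_ATTR = re.compile(
    r'(<(?:body|table|td)\b[^>]*?\bbackground=")([^"]+)(")',
    re.IGNORECASE,
)

def inline_backgrounds(html, load):
    """Rewrite background="URI" attributes to base64 data: URIs.

    `load` is a hypothetical callback returning (bytes, mime_type)
    for a given URI; a real tool would download the image here.
    """
    def replace(match):
        content, mime_type = load(match.group(2))
        encoded = base64.b64encode(content).decode("ascii")
        return f"{match.group(1)}data:{mime_type};base64,{encoded}{match.group(3)}"

    return BACKGROUND_ATTR.sub(replace, html)

# Usage with a stand-in loader instead of a real download.
print(inline_backgrounds('<body background="bg.gif">',
                         lambda uri: (b"GIF89a", "image/gif")))
```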

Anyway, here is a copy of the Opera.com front page in PHF as a good example of what this script is capable of: opera.phf.zip Even the javascript menus work!

I would like to open a second round of beta testing, as I've upped the version number to 0.2. It didn't cause as much fuss on my server as I'd thought, so I'd like to post in a larger forum, if possible. I was thinking about the Opera General Discussion forum, but I didn't really know if this would be counted as off-topic. And I don't really want to post it into the Lounge.

Anyone have any other ideas? Thanks again, you guys!
Thanks Grey!
Who would have thought... People everywhere have been looking for an HTML version of PDF.
You should patent this just in case anybody else hijacks your idea and makes millions from it...
21. November 2005, 01:57:50 (edited)
GreyWyvern, I found these problems: frames are encoded, but their content should be encoded first.
Malformed webpages get all messed up, but that's not your fault

And the website form you provided doesn't recognize application/xml(+xhtml)?
http://my.opera.com/xErath/blog/
Originally posted by xErath:
I found these problems: frames are encoded, but their content should be encoded first.
I have tried to make frames get encoded properly... Can you give me a specific URI where it fails? That would help very much
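For reference, "content first" ordering could be sketched like this in Python. It is only an illustration of the depth-first idea under simplifying assumptions: `package_page` and `fetch_html` are hypothetical names, relative URLs are not resolved, and only double-quoted `src` attributes are matched.

```python
import base64
import re

FRAME_SRC = re.compile(r'(<i?frame\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)

def package_page(url, fetch_html, depth=0, max_depth=5):
    """Return the HTML at `url` with every (i)frame embedded as a data: URI.

    Frames are packaged depth-first: a frame's own document is fully
    processed (including frames nested inside it) before it is
    base64-encoded into its parent.
    `fetch_html` is a hypothetical callback mapping a URL to HTML text.
    """
    html = fetch_html(url)
    if depth >= max_depth:
        return html  # guard against frames that include themselves

    def replace(match):
        inner = package_page(match.group(2), fetch_html, depth + 1, max_depth)
        encoded = base64.b64encode(inner.encode("utf-8")).decode("ascii")
        return f"{match.group(1)}data:text/html;base64,{encoded}{match.group(3)}"

    return FRAME_SRC.sub(replace, html)

# Usage with an in-memory site: top -> a -> b
pages = {
    "top": '<frame src="a">',
    "a": '<iframe src="b"></iframe>',
    "b": "<p>hi</p>",
}
print(package_page("top", pages.__getitem__))
```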

Originally posted by xErath:
Malformed webpages get all messed up, but that's not your fault
Right now it assumes, for the most part, that the page is correctly coded. Bad HTML coders can do just about anything to mangle HTML and trying to deal with them would require adding an HTML parser, 100x more work than just replacing links to external content.
Originally posted by xErath:
And the website form you provided doesn't recognize application/xml(+xhtml)?
With or without the brackets? It should work with application/xml and text/xml, although I think it wants to use application/xhtml+xml which is the correct mime-type for xhtml documents.
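As a rough illustration of that kind of check (not the form's actual code), a Python sketch could compare the Content-Type header against a set of accepted markup types. The set below, including `text/html`, and the name `is_markup` are assumptions.

```python
# Accepted document types; this exact list is an assumption.
MARKUP_TYPES = {
    "text/html",
    "application/xhtml+xml",
    "application/xml",
    "text/xml",
}

def is_markup(content_type_header: str) -> bool:
    """True if a Content-Type header value names an (X)HTML or XML document."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type in MARKUP_TYPES

print(is_markup("application/xhtml+xml; charset=utf-8"))
print(is_markup("image/png"))
```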
Thanks for the reports!

Originally posted by chesss:
Now I think it would be a good idea to post this in the customization forum, since a request for saving a page as a single file has been made many times.
I've already posted this in two forums here; do you think it would be a good idea? I'm not really keen on spamming this tool around.
Originally posted by chesss:
Thanks Grey!
No probalo!

24. November 2005, 03:05:15 (edited)
Originally posted by GreyWyvern:
Originally posted by xErath:
I found these problems: frames are encoded, but their content should be encoded first.
I have tried to make frames get encoded properly... Can you give me a specific URI where it fails? That would help very much
It works well with
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/createthread.asp
but the javascript code becomes visible.
Originally posted by GreyWyvern:
Originally posted by xErath:
And the website form you provided doesn't recognize application/xml(+xhtml)?
With or without the brackets? It should work with application/xml and text/xml, although I think it wants to use application/xhtml+xml which is the correct mime-type for xhtml documents.
Hum.. funny, I had problems a few days ago with this page I did:
http://gnomo.fe.up.pt/~ei02043/list.php
And Grey, please register a patent, or someone will rip this excellent idea and code out of you.
Originally posted by xErath:
It works well with
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dllproc/base/createthread.asp
but the javascript code becomes visible.
I have fixed the javascript display issue, although the MSDN site does TONS of browser sniffing. Because of this, I don't think my script can get a pristine copy. I find that it works best if you identify as IE when grabbing the frameset, although the left-hand menu only appears if you identify as Mozilla.
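"Identifying as" a browser just means sending that browser's User-Agent header with the request. A minimal Python sketch of the idea follows; the User-Agent strings are illustrative examples, and `build_request` and `fetch` are hypothetical helper names, not part of the tool.

```python
import urllib.request

# Example User-Agent strings; the exact values are illustrative only.
USER_AGENTS = {
    "msie": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "mozilla": "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) "
               "Gecko/20051111 Firefox/1.5",
}

def build_request(url, identify_as):
    """Build a request that presents itself as the chosen browser."""
    headers = {"User-Agent": USER_AGENTS[identify_as]}
    return urllib.request.Request(url, headers=headers)

def fetch(url, identify_as="mozilla", timeout=30):
    """Download `url` while sending the chosen User-Agent header."""
    request = build_request(url, identify_as)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With something like this, the frameset could be fetched with `identify_as="msie"` and the menu frame with `identify_as="mozilla"`, which is the mix described above.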
Fun fun. I have updated the version number to 0.3.1 and updated the available source as well.
Originally posted by xErath:
And Grey, please register a patent, or someone will rip this excellent idea and code out of you.
I took a look at patenting it... but the truth is, I am against software patents. Although I'd really like to work with someone and get an executable version going for Windows at least.
It shouldn't be that hard to make a program that takes all the files downloaded via Opera, FF or MSIE using each of their Save As.. complete webpage tools and compacts them into one data URI encoded HTML file. However, I don't have the tools or the compiler know-how to do this :|
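As a rough idea of what such a program could do, here is a Python sketch under simplifying assumptions: it only looks at double-quoted `src` and `href` attributes, does not decode percent-encoded paths, and does not touch `url(...)` references inside stylesheets. The name `pack_saved_page` is made up for illustration.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches src="..." and href="..." attribute values (double-quoted only).
LOCAL_REF = re.compile(r'\b(src|href)="([^"]+)"', re.IGNORECASE)

def pack_saved_page(html_path):
    """Inline files referenced by a saved page into a single HTML string.

    Only references that resolve to an existing local file (for example
    inside the folder a browser creates next to the saved page) are
    embedded; ordinary links and anything else are left alone.
    """
    html_path = Path(html_path)
    html = html_path.read_text(encoding="utf-8", errors="replace")

    def replace(match):
        attr, ref = match.group(1), match.group(2)
        if "://" in ref or ref.startswith(("data:", "#")):
            return match.group(0)  # remote link, embedded data, or anchor
        target = html_path.parent / ref
        if not target.is_file():
            return match.group(0)
        mime_type = mimetypes.guess_type(target.name)[0] or "application/octet-stream"
        encoded = base64.b64encode(target.read_bytes()).decode("ascii")
        return f'{attr}="data:{mime_type};base64,{encoded}"'

    return LOCAL_REF.sub(replace, html)
```

Calling `pack_saved_page("page.html")` on a page saved next to its folder of images returns the combined HTML, which can then be written out as one file.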
Regardless, this would still require two steps, whereas a native browser version of this would only require one. That's what I hope this inspires; not a patent for me

Originally posted by soniclnd:
is this still in development? are there any prospects of an executable?
I think it's not related to this project, but it does nearly the same:
http://unipage.org/