HTML, CSS, JS and other unsorted stuff

cleanPages Extension - an arc90 Readability conversion


cleanPages improves the readability of webpages by removing unnecessary clutter. It enhances the layout and combines multi-paged articles into one. It works on locally saved pages and in offline mode, too. cleanPages is a multi-lingual derivative work based on the code of the Arc90 labs experiment "Readability™".


cleanPages is NOT an adblocker or scriptblocker; it cleans pages for reading or printing after they have been loaded.

Version: 1.0.2
Download from the addons page: cleanPages
Supported Languages: English, French, German, Italian, Polish, Portuguese, Russian, Turkish and Swedish.

Latest test version:
1.5.19-alpha
Warning: May be unstable!
Date: 2011-12-12 13:00 GMT+1
Download from my private server: cleanPages.oex
A warning will be displayed, you'll have to trust me wink

Changed in the Alpha:
  • Changed the way the CSS is applied, should be more robust now.
  • Extension resets itself to default values after a fresh install.
  • added setting for uncolorized black background around the cleaned text.
  • added some options to the preferences page: show images, show vimeo and youtube videos, merge paginated pages
  • Started writing a help section (still quite empty) and filled the first things in. Hints for further themes are welcome.
  • Workaround for Opera CORE-23171 bug
  • Options styled and some minor changes
  • Mouse gesture support - see HowTo
  • Faster reload
  • Added Ctrl+Shift+R as shortcut to start the extension and to reload the original page
  • New alpha icon p
  • Extended font support. Detects installed fonts from a list of 450 of the most common installed fonts on your computer.
  • Added autoscrolling feature, watch the upper right corner. No setting for the speed yet, it is late here wink
  • New settings page (still a little bit experimental)
  • Bugfix: Additional footnote anchors in text if the button was clicked more than once. (see details)
  • No internationalization yet
  • Improved the next page detection but there are still some quirks left (at least I hope it didn't break it too much)
  • Bugfix: elements styled by <u><b> were removed including their contents.
  • Improved duplicate pages detection on multi page articles - should work now correctly with my.opera blog articles with more than one comment page too.
  • Changed the width setting to fixed values plus a percentage maximum width, to avoid horizontal scroll bars when the width is set too wide and the window width is changed afterwards.
  • experimental fix for H2s abused as intro

Known issues of the Alpha
  • line height changes are only applied after window size change or setting of font size or spacing in 11.50+. Not my fault, reported as Opera Bug DSK-344053
  • Settings page is still unstyled. edit: but getting better smile
  • No help descriptions
  • Several more issues wink Please post major errors you encounter in the basic functionality here in the blog comments. Thank You!

Usage
If the extension's button is active, you can click on it to change the layout of the active tab's content - or you can select some text (300+ characters) and click the button to make that text readable. If you selected too little text, cleanPages switches back to the default mode and tries to find the relevant content on its own.
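The selection rule above can be sketched as a small decision function. This is only an illustration: the name `chooseMode`, the mode labels and the trimming of whitespace are assumptions, only the 300 character threshold comes from the text.

```javascript
// Threshold taken from the "300+ characters" rule described above.
const MIN_SELECTION_LENGTH = 300;

// Hypothetical helper: decides whether the selected text is long enough
// to be cleaned on its own, or whether the automatic content search
// (default mode) should be used instead.
function chooseMode(selectedText) {
  const text = (selectedText || "").trim(); // trimming is an assumption
  return text.length >= MIN_SELECTION_LENGTH ? "selection" : "auto";
}
```

In a browser the argument could come from `window.getSelection().toString()`.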

cleanPages shows 3 buttons on the cleaned webpage:
  • Reload: It has basically the same behavior as the normal reload button in the browser. It is a true reload except when used on frame sites, then the history is used to go back to the same subframes as before. (Read about History Navigation Mode quirks below)
  • Print: Opens the Print dialog to print the cleaned page. Text will be black, backgrounds will be white, the buttons will not be printed.
  • Email: Opens the default email client on your system with the page's URL as body text. Feel free to edit subject and body text to something more meaningful than the included default text. wink


Preferences | Options
cleanPages comes with settings for Style, Size and Margin. Style changes the font and the background color, Size changes the font size, and Margin sets the space between the displayed text and the container. The container is centered in your viewport and can adapt its width to avoid horizontal scrollbars if the viewport is smaller than the container's maximum width of 1000px. The Margin setting puts a margin between the (invisible) border of the container and the text, meaning: The width of the text part shrinks if the margin is set to bigger values.
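As a rough illustration of how container width and margin interact, here is a minimal sketch. It assumes the margin is given in pixels and applied on both the left and the right side; the function name `textWidth` is hypothetical, only the 1000px maximum comes from the text.

```javascript
// Maximum container width as stated above.
const MAX_CONTAINER_WIDTH = 1000;

// Hypothetical helper: width that is left for the text, assuming a pixel
// margin on both the left and the right side of the container.
function textWidth(viewportWidth, margin) {
  // The container never grows beyond its maximum, but shrinks with the viewport.
  const container = Math.min(viewportWidth, MAX_CONTAINER_WIDTH);
  // A bigger margin leaves less room for the text; never go below zero.
  return Math.max(0, container - 2 * margin);
}
```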

To set up the extension's preferences:
  • Right-click the button of the extension
  • Choose "Preferences"
  • On the preferences page, change the settings in each column at least once and tick or untick the "... footnotes" checkbox. This makes the settings permanent as long as the extension is installed (only necessary after a new install, later you can change each setting individually).


You can see a preview with sample text in the "Example" box below the settings. The settings can be changed again any time later by re-opening the "Preferences".

Supported Languages
cleanPages comes in:
English, French, German, Italian, Polish, Portuguese, Russian, Turkish and Swedish.
The language is set according to your browser language settings and defaults to English for languages not yet supported. The functionality of cleanPages is independent of the page language: one of my test users reported that it works just fine on Japanese pages.

Please send me a personal message or leave a comment here, if you can and want to translate it into your language.

Changes to the Original Readability™
  • Added multilingual preferences and user dialogs.
  • Fixed some frame issues. Overwriting or replacing the body of the top document in a frameset is not allowed in Opera because of security restrictions.
  • Removed included Typekit fonts. I have no license to use them and I don't intend to buy one.
  • Removed original JS smooth scrolling. Opera's built-in scrolling is good enough. Use [space] to scroll down a page and [shift]+[space] to scroll up a page.
  • Reactivated the Terminal style.
  • Removed the Athelas style.
  • Improved the font stacks for cross system use.
  • Removed bad browser sniffing because Opera can mask as IE. That wouldn't have worked out. wink
  • Removed or replaced Firefox-only code. (read: Firefox only Bug workarounds for not following the W3C specifications lol)

Various other fixes, see source code of the included script. All changes are marked with /*q ... */


Known Issues
The description is not multilingual. Not my fault, kick Opera for that, especially the person who wrote the parser that checks the config.xml during the publishing process for validity. It doesn't even respect their own specifications. sad

cleanPages, like the original "Arc90 Readability™" bookmarklet, does not work well with:
  • Start pages of a website. Navigate to an article page before you use the extension. I will not change that, my version of cleanPages should stay a small extension with a low system impact. If you think otherwise: feel free to edit it, it is licensed under Apache 2.0 wink
  • Pages with not enough text to analyze. Not possible. No way.
  • Pages with crappy markup. It will do the best it can.
  • Some kinds of frameset pages. However frame pages without forced frame reload should work fine.
  • Pages that are reloaded with User Prefs|History Navigation Mode set to "Auto" (1, default) or "Fast" (3). It works better when set to "Compatible" (2).

To switch between all 3 settings you can use this button:
History Navigation Mode

Further known issues: I hope not wink

cleanPages comes AS IS, meaning:
I won't fix mistakes that other people made on their websites. If it works, it works. If not and if it is my fault, leave a comment below.
If you find any real bugs, please post them in the comments, too.
If it destroys your hard-disc and melts your processor: Buy a new computer p

Legal Stuff
"Readability™" is a Trademark of Arc90, http://arc90.com
Permission to use the code was granted by license and email.


    Comments

    QuHno Wednesday, January 19, 2011 9:34:53 PM

    The extension is multilingual, but I couldn't upload a multilingual config.xml because there is no possible setting in the submission process for that, so the short description in the manage and setup dialog is in English only (not to be confused with the options/preferences page and the rest of it, they are multilingual).

    I'll upload the extension with the multilingual config.xml to my own webspace, as soon as the extension gets its approval.

    Sorry for the inconvenience.

    BjoernDBjörni Thursday, January 20, 2011 10:40:20 AM

    Do I still have to add this to my thread ... I mean, that the extension is finished?

    (edit: Added by QuHno: He means his blog)

    QuHno Thursday, January 20, 2011 11:42:12 AM

    Only once it is finished wink

    Finished smile
    You don't have to, but nobody will stop you if you do smile

    QuHno Friday, January 21, 2011 11:19:17 PM

    skrzimproved wrote at the addons page:

    Useful, I will probably use this instead of the original bookmarklet. I would like more options - background color, font color, etc.

    I too, but I can't promise anything. I am still testing and looking what I can do, I am no JavaScript Guru, just a qualified copy&paste guy. wink

    owgrunt wrote at the addons page:

    Marvellous! I used to make a custom button manually. Happy to see the extension eventually! But it would be even better if you let us align-justify text.

    At the moment I don't change the original text alignment at all because it can be very annoying if you look at a page with code examples and they are all justified ...

    I'll look into it and if I find a solution that works (meaning: doesn't destroy PRE, CODE etc. and doesn't destroy the image algorithm), maybe I'll add a checkbox for justify, but, like above, I can't promise anything, so don't hold your breath. wink

    metude Sunday, January 23, 2011 3:23:26 PM

    Nice. I'm using this very often!

    QuHno Monday, January 24, 2011 2:50:39 PM

    thibi wrote at the addons page:

    The only thing that could make this better would be to have the ability to create a customized version and to be able to use local fonts!

    Yes, that would be really nice but I don't know if it is possible to read out the internal font list at all. There are some hacks that claim to do that, but they are all quite "expensive" when it comes to computing time and they all work to 80% at most, apart from adding a SWF file, which isn't allowed in an extension AFAIK...

    dmitso wrote at the addons page:

    Good job, although it doesn't work on http://www.smithsonianmag.com/


    Which page?
    http://www.smithsonianmag.com/history-archaeology/Bodybuilders-Through-the-Ages.html
    works for me, it even pagerizes the article but I have seen the problem at this page:
    http://www.smithsonianmag.com/science-nature/Tracking-the-Elusive-Lynx.html

    The markup of that page is ... weird and heavy AJAX loaded, not easy to parse, the original script fails there too.

    I'll look into it and if I can find a way to solve that problem without a generic hack for Smithsonian alone, I'll do it.
    Those **** webmasters at Smithsonian packed the article content into a container with the class "subNavSponsoredWithPic" which hits 2 stop words at the same time: Nav and Sponsor. Absolutely stupid and un-semantic markup.
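To illustrate why such a class name is a problem, here is a minimal sketch of a stop word check. The list and the function name `countStopWordHits` are hypothetical, only "nav" and "sponsor" come from the comment above, and the patterns the script really uses are more extensive.

```javascript
// Hypothetical stop word list; only these two words are named in the text.
const STOP_WORDS = ["nav", "sponsor"];

// Counts how many stop words occur in a class name (case-insensitive).
// A content search that penalizes such hits would rank the container
// as "not content", even if it holds the whole article.
function countStopWordHits(className) {
  const lower = className.toLowerCase();
  return STOP_WORDS.filter((word) => lower.includes(word)).length;
}
```

With this list, `countStopWordHits("subNavSponsoredWithPic")` finds both words, while a class name like `"articleBody"` finds none.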

    Hint: In the meantime you can use the added functionality that I built in, select the text with the mouse first and press the button then.

    I will add a generic hack for that page in the next version, which will come with 3 more translations, too smile

    edit:
    Test version 1.0.2-alpha
    Changes:
    Smithsonianmag.com fix and Italian and Turkish translations added. Waiting for the 3rd translation to become ready and I'll submit it if there are no new errors smile

    QuHno Tuesday, January 25, 2011 5:00:20 PM

    1.0.2-alpha (Version number not changed)
    Changes:
    Polish translation added.

    QuHno Wednesday, January 26, 2011 9:02:10 AM

    s-a-s-h wrote at the addons page:

    Maybe settings could be added to page (and appear when pressing button along with buttons print, email and reload original page). Also bug in extension gallery. Everybody rated this extension with 5 start but average is 4.5

    What kind of settings?
    BTW: There is no bug at the extensions page, there was at least one 4 Star rating because the Smithsonian page doesn't work in the 1.0.1 version (see comment above). Only ratings with comments are shown in the overview, but there are fewer comments than ratings wink

    DavidGPeters wrote at the addons page:

    Amazing work QuHno! Thank you very much. . Now if THIS could be integrated: http://kuerzer.de/TopWordsHighlighter ... this extension could become the next giant leap for mankind wink oops was gonna say for human reading. . Regards and keep up the great work, . David.P


    Thanks for the flowers, but Arc90 did most of the work - it would have been impossible for me without their content search algorithm - and I had a lot of help from several community members of the my.opera community smile
    BTW: The Highlighter works just fine without any changes when started as UserJS (see screenshot) - but because my extension is multilingual, I'd need lists of common stop words for the different languages apart from English and German, and I'd have to make the UTF-8 chartable bigger because cleanPages works with other charsets than iso-latin too (even CJK). edit: The chartable for exotic characters seems to be quite complete up to the FFxx characters smile.

    It should be possible to integrate it but I have to check the license first and ask the author of the original script under which conditions it is allowed (my Japanese is still bad, despite several years of learning sad but I think he can speak English too smile ).
    I could throw out the Greasemonkey auto updater and the getElementsByClass routine because Opera Extensions can do both by default bigsmile

    DavidGPeters Wednesday, January 26, 2011 9:46:20 AM

    Hi QuHno and thank you for your quick reply!

    Originally posted by QuHno:

    BTW: The Highlighter works just fine without any changes when started as UserJS (see screenshot)



    How did you do that...? I followed this tutorial:
    http://www.mydigitallife.info/2009/12/11/how-to-install-greasemonkey-user-script-javascript-in-opera/

    ...but I still can't get the Top Words Highlighter Script
    http://top-words-highlighter.googlecode.com/files/top_words_highlighter.user.js

    ...to work in Opera (Opera Portable, that is).

    If I press the keys "Ctrl+Y" in Opera, nothing happens (this shortcut should start the auto keyword highlighting -- at least it does in Firefox).

    Btw., I don't think that there would be a license problem with the Top Words Highlighter script. It actually was a (paid) development for me by Pierre Carbonnelle, based on the original Greasemonkey script by hzhbest (who knows about it and showed his approval).

    Thanks & Regards
    David.P

    QuHno Wednesday, January 26, 2011 10:04:43 AM

    I just copied this script from the yellowpages link to my userJS folder and that's all, nothing special. But remember: A page has to be loaded or reloaded after the script was installed wink

    A portable install should work just fine - at least mine did - as long as the path is set and can be found.

    If you speak German you can look up how to set up such scripts here, or else here.

    Some Greasemonkey scripts need additional userJS files added to the userJS folder to emulate special GM_script functions, but this one behaves quite well. It could be possible to store and lock the edited keywords in Opera too, but storage in Opera and everything that belongs to it is something I still have to learn. I am no JS Guru, just an advanced copy&paster and sometimes debugger wink

    edit (2011-01-28): I have reworked the highlighter script so that all functions are available in Opera now. It was a matter of binding the proprietary Greasemonkey functions and methods to the W3C-conforming localStorage calls that Opera supports just fine (like Chrome and other standards-compliant browsers too).
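Such a binding could look roughly like the following sketch. It is not the reworked script itself: the factory name `createGmShim`, the in-memory stand-in `createMemoryStorage` and the use of JSON for serialization are assumptions. `GM_setValue`, `GM_getValue` and `GM_deleteValue` are the Greasemonkey names, and `getItem`, `setItem` and `removeItem` are the storage methods that `localStorage` provides.

```javascript
// Hypothetical factory: maps Greasemonkey-style value functions onto any
// object that offers the localStorage methods getItem/setItem/removeItem.
function createGmShim(storage) {
  return {
    GM_setValue(key, value) {
      // localStorage only stores strings, so serialize (JSON is an assumption).
      storage.setItem(key, JSON.stringify(value));
    },
    GM_getValue(key, defaultValue) {
      const raw = storage.getItem(key);
      return raw === null ? defaultValue : JSON.parse(raw);
    },
    GM_deleteValue(key) {
      storage.removeItem(key);
    }
  };
}

// In-memory stand-in for window.localStorage, so the sketch also runs
// outside a browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
    removeItem: (key) => { data.delete(key); }
  };
}
```

In a browser, `createGmShim(window.localStorage)` would give a script the three functions without any Greasemonkey emulation file.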

    DavidGPeters Wednesday, January 26, 2011 11:03:57 AM

    Many thanks, QuHno wink

    It works perfectly now! For some reason I had the "aagmfunctions.js" script in my Opera script folder, which prevented the Auto Highlighter from working.

    After deleting aagmfunctions.js the Highlighter now works just as well as in Firefox!

    Best regards, David.P

    Anonymous Thursday, January 27, 2011 6:13:43 PM

    gxip writes: Doesn't seem to work with pages that use usercss files. Does the extension see the original page or the page that is reformatted with usercss?

    QuHno Thursday, January 27, 2011 7:34:48 PM

    It does, but you will not see the effect because the usercss overrules the styles applied by the extension.

    Explanation:
    Extensions can only see the source code or the DOM by design of the Extensions API, so they can only remove or overrule the styles that are directly written or linked in the source code.

    Extensions work with injected scripts and stylesheets and just have access to the DOM of the page but to nothing else.

    A usercss can overrule everything that a webpage does and everything that an extension does too, because injected JS and CSS are a part of the page.

    The priority is as follows (highest to lowest) [1]:
    1. usercss !important
    2. page !important (This is what an extension can reach at maximum)
    3. page normal
    4. usercss normal
    5. browser's default css !important
    6. browser's default css regular
    and that is completely independent of selector specificity so even if the usercss says this:
    span { color: lime !important; }
    and the page says this:
    html body div p span { color: red !important; }
    the usercss wins.

    It's designed to allow a page to style with the simplest of selectors, without the browser's default CSS overriding it. The same goes for the usercss and the pagecss.

    In short:
    The extension itself works when used in combination with a usercss, but the usercss can always overrule the extension by adding !important to its selector styles.
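The priority list from this comment can be written down as a small lookup. This is a sketch of the order exactly as listed here, not of a browser's real cascade implementation: the names `PRIORITY`, `priorityLevel` and `winningDeclaration` are hypothetical, and within the same level it simply lets the later declaration win instead of comparing specificity.

```javascript
// Priority order as listed in the comment (index 0 = highest).
const PRIORITY = [
  "user-important",
  "page-important", // the most an extension can reach
  "page-normal",
  "user-normal",
  "browser-important",
  "browser-normal"
];

// Maps a declaration ({ origin, important, selector, value }) to its level.
function priorityLevel(decl) {
  const key = decl.origin + (decl.important ? "-important" : "-normal");
  const level = PRIORITY.indexOf(key);
  if (level === -1) throw new Error("Unknown origin: " + decl.origin);
  return level;
}

// Picks the winning declaration. The selector is never looked at, which
// mirrors the point that the level beats selector specificity.
function winningDeclaration(declarations) {
  return declarations.reduce((best, decl) =>
    priorityLevel(decl) <= priorityLevel(best) ? decl : best);
}
```

Fed with the two rules from the example, the `span` rule from the usercss wins over the longer `html body div p span` selector from the page.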

    [1] see also: W3C CSS - Cascading Order

    abbottm Friday, January 28, 2011 10:29:07 PM

    Many thanks for the extension.

    Small bug: I have the "Convert hyperlinks to footnotes" option turned on. Press the cleanPages button while on an article with hyperlinks, then press the button again. All the footnote labels are duplicated.

    Example: http://www.badscience.net/2011/01/tell-me-now-how-do-i-feel/

    First press: "it was paid for by Sky Travel [1]"
    Second press: "it was paid for by Sky Travel [1][1]"
    Third press: ... you get the idea

    QuHno Friday, January 28, 2011 10:52:23 PM

    bigeyes

    Right, I didn't think about pressing the button more than once on the same page with convert to footnotes active. I hope I can catch that: it processes the link again, and because the footnote anchor from the last click is still there, it is doubled.

    Now I have to think about a way to prevent links from being processed again without losing the ability to reorder the footnotes if for example a chunk of text is selected before the second click. Not quite trivial ...
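One possible way to make the conversion repeatable is to strip markers from an earlier run before numbering again. The sketch below works on plain objects instead of DOM nodes and is not the fix that went into the extension; the function name `convertLinksToFootnotes` and the data shape are assumptions. Note that it would also strip a link text that legitimately ends in something like "[3]".

```javascript
// Hypothetical model: each link is { text, href }. Returns new link
// objects with a footnote marker appended, plus the list of footnote URLs.
function convertLinksToFootnotes(links) {
  const footnotes = [];
  const converted = links.map((link) => {
    // Remove markers left behind by an earlier run, e.g. "Sky Travel [1]".
    const cleanText = link.text.replace(/(\s*\[\d+\])+$/, "");
    footnotes.push(link.href);
    // Number again from scratch, so a smaller selection gets fresh numbers.
    return { href: link.href, text: cleanText + " [" + footnotes.length + "]" };
  });
  return { links: converted, footnotes };
}
```

Because the markers are removed before renumbering, running the function on its own output gives the same text again, and running it on a subset of the links restarts the numbering at 1.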

    Thank You for reporting it! smile

    edit:
    And gone - at least I hope that I didn't break something else with the fix.
    The download URL for version 1.0.3-alpha is at the top of the blog post.
    If you encounter any problems with the fix, please leave a message here smile

    Anonymous Saturday, January 29, 2011 7:06:28 PM

    jozal writes: Hey, great extension, very useful. It would be much better though, if it worked fine on a linux machine. In my experience, it chrashes opera 96% of the time. Keep up the good work:)

    QuHno Saturday, January 29, 2011 8:31:20 PM

    Error 583
    Can't reproduce crashing on Linux, it worked fine on Linux Mint when I tested it ...

    Any hints why it is crashing?
    Any errors in the JS console?
    Special pages?

    Anonymous Saturday, January 29, 2011 10:01:26 PM

    jozal writes: Well i can tell it crashed for me on http://lifehacker.com/5736011/learn-how-to-code-part-i-variables-and-basic-data-types and any other lifehacker/gawker sites and on http://www.brightsideofthesun.com/2011/1/28/1962468/suns-out-physicalize-the-physical-boston-celtics-and-win-88-71, and some others that i tried, but while the extension seemed to work 5/10 on other sites, it always crashed on these.

    QuHno Sunday, January 30, 2011 10:24:05 PM

    That is a mystery to me. I have asked Linux users with various systems: gentoo, Debian, Ubuntu, Mint and PCLinux, some with "unstable" or "edge" versions and none of them could reproduce the crashers on the pages you mentioned.

    I tested them with Mint too. Several Macs tested it too, but no crashers.

    Let's try to find the problem:
    • What Linux variant do you use?
    • 32 or 64 bit?
    • Did you test it with a clean profile and no other userscripts or extensions running?
    • Were you logged in to some pages that embed their buttons on those pages (Facebook, Twitter, ...)?
    • Did you try cleaning up the cache?
    • Is the JIT Compiler enabled or disabled (opera:config#JIT)?
    • Which kind of history navigation do you use(opera:config#History Navigation Mode)?


    Trex 279trex279 Monday, January 31, 2011 3:21:22 AM

    Seems to have problems with images inserted inline in a paragraph (Is this considered bad markup?). The image that is in the middle of a line is shown at the beginning of the next line. For example:
    <p>
      Blah Blah Blah ... <img src='...'> Blah blah blah
      blah blah blah blah ...
    </p>
    is not shown properly.

    Saskatchewan Monday, January 31, 2011 3:13:16 PM

    Suggestion: It would be nice if we could set the width of text instead of setting margins. Right now the width is changing after showing/hiding panels.

    QuHno Monday, January 31, 2011 10:24:57 PM

    @ Trex 279:
    I know, but that is a difficult problem to solve. It is in the original Readability too and comes from ripping the whole page content into parts and replacing DIVs and TABLEs with P. So all the images the algorithm detects as belonging to the content have to be "rescued" from deletion with the non content stuff and are inserted afterwards. I am still investigating, how I can circumvent that.

    BTW: Do you have a URL for me where the effect is particularly disturbing? It is easier with a live example ...

    @ Saskatchewan:
    Setting fixed widths instead of percentage widths could cause the content to be displayed with scrollbars on smaller screens - but I am thinking about some additional preferences that allow more settings. It may take a while because I am a slow programmer, but maybe I'll add a fixed width setting in one of the next test versions.

    Anonymous Tuesday, February 1, 2011 2:39:02 PM

    jozal writes: @QuHno: -Arch Linux -64 bit -i have urlfilter from here: http://www.fanboy.co.nz/adblock/opera/ , but i also have it on my windows configs. -i wasn't logged on on any of those sites -yes, it didn't do anything -enabled -it says "1"

    NicoHellbillyDeluxe Wednesday, February 2, 2011 8:38:56 AM

    Thanks for the extension! I found a small spelling mistake in the German translation: "Original Seite neu laden". It should read "Original-Seite neu laden".

    QuHno Wednesday, February 2, 2011 11:17:21 AM

    @ Nico: ... or "Originalseite". The Duden Korrektor doesn't complain about any of the 3 spellings (only about the missing end-of-sentence punctuation), but stylistically the hyphenated version is nicer, so I changed it in the Alpha. smile

    @ Trex 279:
    I have disabled the small images left floating in the Alpha test version linked in the blog post above. Please give it a try.

    QuHno Thursday, February 3, 2011 9:19:39 AM

    linwangjan wrote at the addons page:

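    这个是readablility,很早就能用css加js来实现了,不过做成扩展了倒是方便新手了 [English: This is Readability; it has long been possible to do this with CSS plus JS, but turning it into an extension does make it convenient for newcomers.]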

    The usual translators I tried mangled the meaning of the text - can anyone translate it for me, please?

    tt-21 wrote at the addons page:

    Сan be done so that the pictures were not removed from the article and had the opportunity to change the background color after cleaning?

    They are not removed if they are in the same (X)HTML container as the article - but unluckily that is not always the case. The Extension is not intelligent in the way a human is, it just follows the structure of the HTML code, so there have to be some hints in the page source that an image is relevant to the content of an article and not only decorative, e.g. in the navigation or in advertisements.

    To the color thing: I plan to change some things in the options page, like the option to define your own colors, font-family, sizes etc., but it may take a while.

    Bruno NascimentioBohemiaDrinker Thursday, February 3, 2011 8:38:22 PM

    This is a GREAT extension.

    I just have a very small feature request: could you add a keyboard shortcut, or the ability to custiomize one?

    I don`t let the adress bar visible, so I`d like a way to trigger the extension without actually having to click on the button.

    QuHno Friday, February 4, 2011 5:12:13 AM

    Good idea. I am modifying the internal messaging algorithm at the moment and as I am on it, I can add that too. I just have to find a shortcut that is still free, Opera itself uses most of the good ones ... left

    QuHno Friday, February 4, 2011 1:01:22 PM

    arc90 polished the interface and made a partially pay service out of the original Readability, but the Bookmarklet is still free and works fine with Opera 11 (but is very slooow IMHO).

    I don't know if I can copy the new UI (it is looking really functional), but it is worth a try. Setting the preferences live is quite nice.

    BTW: That the pay service pays gratifications to the authors of the pages that were made readable is a good idea in principle - but be aware that you send the URLs of all pages you made readable to them, and they gain many referrer backlinks in the server logs.

    QuHno Saturday, February 5, 2011 1:34:17 AM

    Someone listed the extension at Softpedia.com yesterday faint

    I didn't ask for being listed there and I didn't want to be listed there and I didn't want my real name listed there and they didn't ask me if it is OK to use my real name there. I get enough SPAM mails and I don't need that extra amount. furious

    Please don't download the extension from there, use the official Opera addons page. All extensions at the official addons page are reviewed by the Opera staff and are guaranteed not to harm your Opera installation, so there is no need at all to download them from foreign sources.

    Thank you.

    metude Saturday, February 5, 2011 6:31:47 PM

    Could you add share buttons to cleanpage?
    Just mail doesn't enough.
    Mintshare (*) could be nice.

    QuHno Sunday, February 6, 2011 1:21:12 PM

    That will be possible in the next version. smile

    I am building another interface for the settings, you can see a raw, unfinished and unstyled preview here. Just hover over the gray bar at the top left and change the view live.
    With the planned interface I can add the share buttons easily smile


    BTW: I suck at UI design.
    I would really appreciate if someone could make a good looking image mock-up for me so that I can rebuild the interface by that guideline.
    Please contact me by PM so that we can talk about details smile

    Anonymous Sunday, February 6, 2011 6:54:28 PM

    Arguggi writes: Cleanpages doesn't seem to work on Slate.com and on this page in particular: http://www.slate.com/id/2283372/pagenum/all/#p2 Dunno if OS is helpfull in any way, but I'm on Vista 32 SP2

    QuHno Sunday, February 6, 2011 7:41:41 PM

    Confirmed - but I have an additional problem with that page:
    Dragonfly doesn't work with it either (immediate freeze), so it is quite hard to analyze what is wrong with that page ...

    edit:
    It works if you delete #p2 from the end of the URL.
    I think I have a problem with addresses that contain hashes ...
    Added to my to-do list, I'll take a look into it.
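If the hash really is the cause, one simple workaround would be to cut the fragment off the address before the extension uses it internally. This is only a sketch under that assumption; the helper name `stripHash` is hypothetical and nothing here is taken from the extension's code.

```javascript
// Hypothetical helper: returns the URL without its fragment ("#...") part.
function stripHash(url) {
  const index = url.indexOf("#");
  return index === -1 ? url : url.slice(0, index);
}
```

For the Slate address above this would drop the trailing "#p2" and leave the rest of the URL untouched.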

    TommyTommyAngelo Monday, February 7, 2011 9:34:59 AM

    I found some bugs during usage:
    - font size, sometimes is not the same as in preferences. I use big font, it shows normal sometimes, after changing the size to normal and then to big again it works.
    - www.spiegel.de doesnt work sometimes

    QuHno Monday, February 7, 2011 10:56:34 AM

    Font size not remembered:
    That shouldn't happen as long as the extension's storage is not cleared by something - but it could be related to the hash problem I mentioned in the previous comment too (I use the hash in the messaging, but I change that in the new version).

    www.spiegel.de:
    I h8 "sometimes" bugs lol
    That is one of my daily visited pages and until now I didn't run into that problem - but it is a huge site with many pages and I don't read every article there. It would be fine if you could provide a direct link to one of the problem pages.
    BTW: Does it help if you select content and then click on the button if that happens?

    roboperasync Monday, February 7, 2011 8:55:18 PM

    Great job! Very nice tool, well adopted.

    I am the second one who whishes for implementing the possibility for a shortcut. I think its a big strenght of opera that you can easily use the keyboard for nearly everything. Maybe some days and my favourite extension will work by shortcut to wink

    Thanx for implementing and keep on cooding for new features.

    Saskatchewan Tuesday, February 8, 2011 9:11:10 AM

    Originally posted by BohemiaDrinker:

    could you add a keyboard shortcut (...) ?

    Originally posted by roboperasync:

    I am the second one who whishes for implementing the possibility for a shortcut.

    Just wanted to say: You're not alone wink.

    QuHno Wednesday, February 9, 2011 12:50:40 PM

    Yes, I will add KB shortcuts in the next version bigsmile

    ... but it will take a while, it is a major rewrite and I still have some problems to solve, some of them are mentioned here in the comments but there are some more pages that don't run well. I am trying to find a generic solution for that because I can not (and I don't want to) add hacks for each and every site that doesn't play nicely with the extension.

    Anonymous Wednesday, February 9, 2011 4:26:54 PM

    Anonymous Coward writes: A keyboard shortcut would be really useful. Additionally, the preferences page could feature a button like "Use these settings". It was quite odd for me in the beginning to have set my custom values for font and margin, but not finding a button to set them. Eventually I discovered that it just works by setting the options, but an additional button could not hurt. (And I think that sometimes not all settings are applied, margin and font was sometimes smaller than I selected, but I am unable to reproduce this.)

    QuHno Wednesday, February 9, 2011 5:20:44 PM

    I will replace the settings page with live in-page settings in the next version, it will work like this mockup I made (just hover over the gray bar at top left on that page to see the settings). I think it is easier to handle and more individual settings can be made.

    I am just looking for a good way to build a font selector that shows some of the fonts that are actually installed on the user's system and that can be used for the extension. Unfortunately extensions don't have the same access to the system fonts as they have on the system color picker, so that is a little bit tricky ...



    QuHno Thursday, February 10, 2011 3:56:00 PM

    bhelyer wrote at the addons page:

    I don't know if it's a linux issue or what, but this extension mostly succeeds in only crashing Opera. The idea is cool, but until it actually works I can't recommend cleanpages.

    Sorry to read that. The extension was tested by people I know with about 9 different Linux distributions without any issues. The code is valid without errors and doesn't use any fancy stuff.

    Yes, I know that you are not alone with that problem, but I can't fix what I can't replicate.

    BTW: Can it be a Flash-related problem? I just got the information that Flash on Linux hates being created and destroyed in short order...

    What happens if Flash is deactivated? Does it still crash then?

    Anonymous Saturday, February 19, 2011 11:46:22 PM

    krnark writes: Look at this page: http://www.vokrugsveta.ru/vs/article/6660/. With cleanPages some pictures are not visible, but with www.readability.com it's all right.

    QuHno Sunday, February 20, 2011 2:42:55 PM

    Do you mean the top picture of the article? I can see that ...

    I can't do the same things as www.readability.com does, because there the cleaning of the page is done server-side, meaning:
    You send the URL of the page to their server, their server grabs the page, rebuilds it and sends the cleaned page back to you. You can test it by saving the page locally, switching Opera to offline mode, opening the saved page in Opera and trying to use the readability.com bookmarklet. It won't work, whereas the extension does.

    Extensions have some limitations when it comes to manipulating the source code of a page, because they have to rely on the parsed DOM they get from the browser, and that can be really ugly in Opera sometimes (see the blog post about error correction). That makes it impossible to do some of the really fancy stuff that is easy to do server-side (like running HTML-tidy over the code of a webpage before and after rebuilding it).

    To rebuild the page I have to destroy the page source before correcting the errors. Unfortunately the error correction of Opera kicks in at these moments and prevents consistent results, so I can't do everything I want to do ...

    At the moment I am trying to correct that in my internal version, but to do this I have to write a rudimentary parser that works independently of Opera's parser. That is no easy task and I don't know if I will succeed, so don't hold your breath.

    Anonymous Monday, February 21, 2011 1:08:59 AM

    krnark writes: I also tried the Firefox plugin https://addons.mozilla.org/en-US/firefox/tag/readability on that page, http://www.vokrugsveta.ru/vs/article/6660/, with the same result: the pictures in the middle of the article are not visible. The Google Chrome plug-in https://chrome.google.com/extensions/detail/jggheggpdocamneaacmfoipeehedigia has the same problem. But in Safari (whose Safari Reader feature also uses the Readability sources) it's all right. So I think it's not an Opera DOM problem. It's a problem with the open version of the Readability code, with the "table" tag in this specific case. It looks like Readability.com and Apple use a modified version of the code.

    QuHno Monday, February 21, 2011 1:56:21 PM

    Yes, I tested it with the other extensions and the bookmarklet too ...

    The problem is not only the table tag (OK, that too; any help to improve it is welcome), but that some elements cannot be selected with DOM methods once the cleaning process has started, because the error correction of the DOM parser corrects partially rewritten code before the script can do so (as you can see here: "Result" is the same page reloaded as text and "pre-cleaned" with some regex). It is a kind of race condition: the faster one wins, and in 99% of all cases that is the parser.

    This problem doesn't occur in the closed server-side version or in Safari, because there the scripts have direct access to the unchanged page code.
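
    The "reloaded as text and pre-cleaned with some regex" idea can be sketched roughly as follows. This is an assumption about how such a step might look, not the code behind the linked demo: the function names `fetchRawSource` and `preCleanHtml` are hypothetical, fetching the source with XMLHttpRequest is only one possible way to get the unparsed text, and regexes can only handle simple, well-behaved cases of HTML.

```javascript
// Hypothetical sketch; not taken from the cleanPages source.

// Fetch the page source as plain text so the browser's HTML parser
// (and its error correction) never touches it. Browser only.
function fetchRawSource(url, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return;
    // Locally saved pages (file:) typically report status 0.
    if (xhr.status === 200 || xhr.status === 0) {
      callback(null, xhr.responseText);
    } else {
      callback(new Error('HTTP status ' + xhr.status));
    }
  };
  xhr.send(null);
}

// Remove comments, scripts and styles from the raw text before it is
// handed to any DOM-based cleaning step.
function preCleanHtml(rawHtml) {
  return rawHtml
    .replace(/<!--[\s\S]*?-->/g, '')
    .replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '')
    .replace(/<style\b[^>]*>[\s\S]*?<\/style\s*>/gi, '');
}
```

    Because `preCleanHtml` works on a string, it runs before the parser sees the markup, which sidesteps the race described above for the parts it removes.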

    Bruce Couperzeno53 Saturday, February 26, 2011 2:48:21 AM

    I should also like to request a keyboard shortcut. You noted that there are not many left. Perhaps a more complex/difficult combination? Users could then redefine and replace another as they wish. For me, doing everything by voice, the combination wouldn't matter.

    So, thank you! I've spent much time away from Opera but the inclusion of extensions, the speed and the usual reasons draw me back. A few seconds of looking and I found your extension. Exactly what I needed! I wish Arc90 well with their endeavor but for a few reasons I no longer use it. Your alternative is excellent.

    DitherDitherSky Tuesday, March 15, 2011 8:37:42 PM

    Please add an option to disable the readability tools, AKA the reload-save-email buttons. They are not necessary in most cases and interfere with saving the page as a clean one. Also, returning to the original view should be assigned to the same cleanPages toolbar button.

    QuHno Wednesday, March 16, 2011 12:16:31 PM

    Hm, difficult ...

    I am working on a major UI overhaul and I hope I can include that. I was thinking about a popup, but messaging between the injected script, popup and background is quite ugly when you want to change settings from places other than the options page. I could use some help with that.

    ... but I have to solve some of the more "core" related bugs, like the URL with # problem, before I can turn the new ideas into a test version.
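
    One way to keep that messaging manageable, offered only as a sketch under assumptions, is to let the background script own the settings and have every other part (injected script, popup, options page) talk to it with two message types. The message shapes (`getSettings`, `setSetting`, `settings`) and the name `createSettingsHub` are hypothetical. The wiring to the browser is deliberately left out: `reply` and `broadcast` are plain functions here, assumed to be backed by the extension messaging API in a real extension.

```javascript
// Hypothetical sketch; not taken from the cleanPages source.

function copySettings(source) {
  var target = {};
  Object.keys(source).forEach(function (key) {
    target[key] = source[key];
  });
  return target;
}

// defaults: object with the initial settings.
// broadcast(message): sends a message to every connected part.
// Returns a handler to be called for each incoming message, together
// with a reply function that answers only the sender.
function createSettingsHub(defaults, broadcast) {
  var settings = copySettings(defaults);

  return function handleMessage(message, reply) {
    if (message.type === 'getSettings') {
      reply({ type: 'settings', settings: copySettings(settings) });
    } else if (message.type === 'setSetting' &&
               Object.prototype.hasOwnProperty.call(settings, message.key)) {
      // Only known keys are accepted; everyone is told about the change.
      settings[message.key] = message.value;
      broadcast({ type: 'settings', settings: copySettings(settings) });
    }
  };
}
```

    With a single owner of the settings, a popup or in-page control only ever sends `setSetting`, and every open page updates itself when the broadcast `settings` message arrives.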

    QuHno Thursday, March 17, 2011 9:46:06 AM

    New experimental version of cleanPages, should work in 11.10 Beta too, but is still unfinished.

    Please uninstall the old version before installing the experimental version.
