[rss feed reader][filter/label problem] Strange behavior in filtering
I posted this problem before in the Mac subforum, but maybe it is more appropriate here. I ran more experiments and have updates. I use the Opera RSS feed reader A LOT; some time ago I discovered filters and their rules, and I now have a fairly complex system of filters, sub-filters and manual labeling. I have 300+ feeds and 163k+ posts saved (and no intention at all of losing even a small part of them!).
Setup:
- global filter: applied to all posts (not only incoming ones), with many OR conditions (to collect all the posts related to the topic "smart material"); it contains around 14 rules.
- sub-filters: various sub-filters that partially replicate the rules of the main filter (e.g. general filter: "all messages that match the word NITINOL OR FLEXINOL OR THERMOCHROMIC"; sub-filter: "all messages that match the word THERMOCHROMIC") and that look for matches only inside the main filter (this is due to the search structure I wanted and to the impossibility of creating coherent AND conditions, a bug I discovered and that has been confirmed).
Problem:
- when creating a filter with multiple rules (3 or more; with 2 it works, or at least seems to) with the option to also apply it to existing posts, the result is unpredictable and most of the time incorrect (I have proof that the matching is wrong: it has also labeled almost empty posts that cannot contain any of the listed words).
Example of the problem:
I searched my posts for "EAP" (electroactive polymers) by typing the string "EAP" in the general search. I got my results and decided it was worth creating a new label/filter in my SMART MATERIAL folder, so I added "EAP" to the properties of the current "smart material" folder. But, magic: instead of the expected 10-11 posts, I got 863 new messages in the folder (not counting the posts I had just read).
My thought was that the only logical explanation is that the expression EAP, which the general search treated as a whole word, was treated by the filter as a pattern like "*eap*", so it matches every occurrence of "eap" inside any word in any post.
Result: months of collecting messed up.
I recreated the filter from scratch (I just had to copy all the filter rules into the properties by hand) and put the sub-folders inside it again. Result? NONE. The search no longer follows any kind of logic: all the messages/posts in the new folder are totally unrelated to the patterns in the filter.
There are only long words in the filter (like thermochromic or similar patterns), no regular expressions or anything else the filter could misinterpret.
Is there a known bug? Did I do something wrong? Is such random behavior possible? Am I the only person seriously using the RSS reader with this much "complexity"?
Conclusion:
I ran more experiments, restarted, and created new labels with the same filter settings as existing ones. Any rule in AND gives wrong results, and any rule after the second in OR gives wrong results. Existing rules work "fine", but some of them, if recreated, give more correct results than the existing ones (could it be that the rule was not looking into links before? Not confirmed).
Andrea
Quick note: in this screenshot you can see that the local search also gives strange results... or rather, it gives no results when there are matches!!!
Originally posted by motenay:
Quick note: in this screenshot you can see that the local search also gives strange results... or rather, it gives no results when there are matches!!!
Please explain what is strange in the pic.
Originally posted by motenay:
Problem...
None of what you wrote is going to help. We need example feeds to subscribe to, example labels, example rules, example search terms etc., with step-by-step instructions on how to reproduce the problem.
Note though. If Opera is searching incorrectly, you can close down Opera and delete the lexicon folder in the mail folder. That will cause Opera to rebuild the search index from scratch, which might help.
Also, make sure you uncheck "learn from labeled messages" for a label. Leaving it checked will cause Opera to learn and add more rules (stored in an ini file in the autofilter folder in the mail folder), which can cause messages to show up in the label that you don't want there.
Also, set the Mail Database Consistency Check Time to 0, save the change and restart Opera. Choose OK to perform the maintenance. The number will reset to a high number after the restart, so don't worry about that.
If you're using IMAP, also note that Opera will only be able to search in the headers for the message unless you've opened the message to fetch the body. If you set "make all messages available offline" for the IMAP account, then Opera will be able to search through all parts of your messages.
There are also the language-specific forums if needed.
Originally posted by motenay:
My thought was that the only logical explanation is that the expression EAP, which the general search treated as a whole word, was treated by the filter as a pattern like "*eap*", so it matches every occurrence of "eap" inside any word in any post.
For a label rule with "contains", it won't be a word search. It'll be a substring match. If you want to search for the word "foo", you need to use "regexp" instead of "contains" and use \bfoo\b
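To illustrate the difference between a "contains" rule and a \bfoo\b regexp rule, here is a minimal Python sketch. It is not Opera's code: the sample post titles are made up, and case-insensitive matching is an assumption.

```python
import re

def contains_match(text, needle):
    # Substring match, like a "contains" rule (case-insensitivity is assumed).
    return needle.lower() in text.lower()

def word_match(text, word):
    # Whole-word match, like a "regexp" rule written as \bword\b.
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

# Hypothetical post titles, for illustration only.
posts = [
    "New EAP actuators announced",
    "Cheap sensors on sale",
    "Leap year bug in firmware",
]

for post in posts:
    print(post, contains_match(post, "eap"), word_match(post, "eap"))
```

The substring version matches all three titles (through "Cheap" and "Leap"), while the word-boundary version matches only the first one. That would be consistent with a short term like "eap" pulling hundreds of unrelated posts into a label.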
For the search field above the message list, you need to type a space-separated list of words. But, if you type "foo", any email that has a word that starts with "foo" will also match. But, "foo" won't match in the middle of a word, just the start. You can also group a list of words in quotes. For example, in the subject, "So close no matter how far", you could put "no matter how" in quotes in the search field to match that string. You can mix words without quotes and with words in quotes. For example, you could search for so "no matter how" close.
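The search-field behavior described above can be approximated with the following Python sketch. It only models the description in this post and is not Opera's implementation. The function names are made up, and two things are assumptions: that every term has to match, and that a quoted phrase also has to start at the beginning of a word.

```python
import re

def parse_query(query):
    # Quoted phrases become one term; everything else is split on whitespace.
    pairs = re.findall(r'"([^"]*)"|(\S+)', query)
    return [phrase or word for phrase, word in pairs if phrase or word]

def search_matches(query, text):
    # Each term must match at the start of a word (prefix match),
    # never in the middle of one. Requiring ALL terms is an assumption.
    for term in parse_query(query):
        pattern = r"\b" + re.escape(term)
        if not re.search(pattern, text, re.IGNORECASE):
            return False
    return True

subject = "So close no matter how far"
print(search_matches('so "no matter how" close', subject))
print(search_matches("clo", subject))   # prefix of "close"
print(search_matches("lose", subject))  # only occurs mid-word
```

With this model, "clo" finds the subject because "close" starts with it, while "lose" does not, since it only appears in the middle of a word.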
I do not use Opera for any kind of mail (only webmail... but that is another story); I mainly use the feed reader. Only RSS feeds.
I know how a regexp works, and the "eap" was my mistake, but the problem is what happened afterwards. I try not to use regexps because I can easily make errors, and usually the word or string I look for is not a common word...
I never used "learn from labeled messages".
I'm doing my PhD and I follow a lot of related stuff in the feeds.
That said...
The picture: it shows a post found in one of the labels with a filter (the filter for "processing"). If I search for the word "processing" inside the post (with cmd+f), most of the words I search for are not matched correctly; in the picture I highlighted the word "Processing" by hand to show that the search is not landing on the right match. From time to time this problem appears again. In the current state (after 2 restarts for other reasons) the cmd+f search works again.
As I said, I have a huge number of subscribed RSS feeds. I'll try to provide the whole list... I don't know which post is from which feed, but you can probably import the whole set.
List of the feeds I subscribed to: opml file
Example of a label/search not working correctly: picture.
Step by step:
- I subscribed to the feeds (some of them are ages old)
- create a new label
- add one rule: "entire message" contains "nitinol" (without the quotes in the field, as you can see in the picture)
- uncheck the option "apply rules only to new messages"
- leave it some time to update
- once the update is done, add the other rules in OR for the words "smart material" and "flexinol"
- let it update and come back to check the result. There should be some messages not matching any of the rules.
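The last step can be checked outside Opera with a small Python sketch like the one below. It assumes the text of the labeled posts has been copied out by hand into a list (the sample strings are made up) and that the OR rules are plain case-insensitive substring matches, as "contains" is described earlier in the thread.

```python
def matches_any(text, rules):
    # OR semantics: one matching rule is enough.
    lowered = text.lower()
    return any(rule.lower() in lowered for rule in rules)

def unexpected_messages(messages, rules):
    # Messages that ended up in the label although no rule matches them.
    return [msg for msg in messages if not matches_any(msg, rules)]

rules = ["nitinol", "smart material", "flexinol"]

# Hypothetical contents of the label, for illustration only.
labeled = [
    "Nitinol wire demo",
    "Flexinol muscle for robots",
    "ArduCopter release notes",
    "New smart material from the lab",
]

for msg in unexpected_messages(labeled, rules):
    print("does not match any rule:", msg)
```

Any message this reports is a candidate for a bug report, ideally together with the feed it came from.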
I'm trying to find another "broken" filter with more recent feeds so you can reproduce the problem. From the other day until today I tried many new filters, and after the second rule it always became a mess, but I never checked how old the wrong posts were.
I'll check what you suggested and try resetting the check time.
Does resetting the Mail Database Consistency Check Time affect the db?
and what about the lexicon?
I'm really afraid of losing all the posts

Thanks for your rapid answer!
Andrea
23. September 2011, 11:03:50 (edited)
Originally posted by motenay:
let it update and come back to check what is the result.
I get 26 legit matching messages so far. Will see if I get some non-matching ones and let you know.
Originally posted by motenay:
Example label/search not working correctly : picture.
Is there a reason you have "match messages in" set to "All Messages" instead of "Feeds"? It shouldn't matter since you don't have any mail accounts set up in Opera. But perhaps when there's no mail account, setting it to "All Messages" causes a bug. "Match messages in" is a rather new feature, so it could have a bug.
Originally posted by motenay:
Does resetting the Mail Database Consistency Check Time affect the db?
Yes.
Originally posted by motenay:
and what about the lexicon?
Not really.
Originally posted by motenay:
I'm really afraid of losing all the posts
You should copy your mail folder to a safe place very often so you have a backup.
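A backup like this can be scripted. The following Python sketch copies a folder into a timestamped directory; the function name and both paths are placeholders, since the location of the mail folder depends on your platform and profile. Running it with Opera closed is an assumption on my part, so the files are not changing during the copy.

```python
import shutil
import time
from pathlib import Path

def backup_mail_folder(mail_dir, backup_root):
    # Copy mail_dir into backup_root/<name>-<timestamp> and return the new path.
    mail_dir = Path(mail_dir)
    if not mail_dir.is_dir():
        raise FileNotFoundError(f"mail folder not found: {mail_dir}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{mail_dir.name}-{stamp}"
    shutil.copytree(mail_dir, dest)  # also creates backup_root if it is missing
    return dest

# Example call with placeholder paths (replace with your real locations):
# backup_mail_folder("/path/to/opera/profile/mail", "/path/to/backups")
```

Each run creates a new dated copy, so an older backup is never overwritten by a newer, possibly already damaged one.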
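The reason for "match messages in" is that the "AND but NOT" condition has a bug (the bug named DSK-341502; I reported it to you and you filed it. I don't know whether it has been fixed by now, but since that discovery I have kept that kind of setting, and I'll definitely try the "AND" again when I have time).
OK, I have another example, a 2-month-old post (here is the picture with the filter; I hope I have not made any mistake).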
This time I made it short.
- create new label
- add all the filters match: dielectric elastomer, smart material, flexinol, nitinol
- uncheck the option "apply rules only to new messages"
- wait...
The wrong messages less than 2 months old are ABOUT and ARDUCOPTER and, just after them, the one with PROVA as its topic (but that could be a trick; that feed produces a lot...).
I hope you can reproduce the problem this time. I cannot make a secure backup from here, so I won't try risky operations; I'll make a backup this evening and try that afterwards.
Thanks again!!
Andrea
Originally posted by motenay:
The reason for "match messages in" is because of the "AND but NOT" condition have a bug (the bug named DSK-341502, I reported it to you and you filed the bug, I don't know if it has been fixed now but from that discovery I kept that kind of setting
O.K. I remember the bug. It's not fixed yet.
Originally posted by motenay:
OK, I have another example 2 months old post (here is the picture with the filter, I hope to not have made any mistake).
This time I made it short.
create new label
add all the filters match: dielectric elastomer, smart material, flexinol, nitinol
uncheck the option "apply rules only to new messages"
wait...
The wrong messages less than 2 months old are ABOUT and ARDUCOPTER and, just after them, the one with PROVA as its topic (but that could be a trick; that feed produces a lot...).
Hope this time you can reproduce the problem,
No. Still can't reproduce the problem. Remember though, with me subscribing to the feed list, I only have the feed messages that are still on the feed page. The ones that have come and gone, I don't have.
I would use the "standalone installation" option in the installer to install to a folder on your desktop (or however that works on Mac). Then import the feed list and see if you can reproduce the problem with that fresh profile.
Originally posted by burnout426:
Also, set the Mail Database Consistency Check Time to 0
What does this setting do exactly?
Originally posted by Corey Serbia:
What does this setting do exactly?
<http://my.opera.com/operawiki/forums/topic.dml?id=1132182>
I set the check time to 0, but nothing actually happened (the recovery log has not been updated since 2010...). Maybe it is because Opera's version has changed in the meantime (no idea); it did not ask me anything.
Anyway, the filtering inconsistency has not appeared again, not that kind at least (it is also possible that it depended on uptime... lately Opera tends to crash often for unrelated reasons, so it does not stay up and running for weeks as before).
Some searches are still inconsistent, but I did not find any way to reproduce the problem, and it is probably left over from before...
For now it works...
I tried with a clean version of Opera (even the alpha) and could not reproduce it (probably because it lacks my massive db).
Next month I'll have more free time; if you can tell me how to "import" the old db into a fresh Opera installation, I can try.
Thanks anyway,
Andrea