MAMA: What is the Web made of?
By Brian Wilson (blooberry). Wednesday, October 15, 2008 2:45:31 PM
I'm proud to announce a project that has been in the works in Opera's QA group for quite some time. It is a tool called MAMA (Metadata Analysis and Mining Application). MAMA was created to help us improve Opera by finding real-world sites that we could test with. MAMA lets us find pages that match any combination of CSS, script and markup factors we choose. In a sense, it is a search engine for Web page structures instead of content.
MAMA has been tremendously helpful in testing Opera and measuring the popularity of technologies both new and old. One of MAMA's strengths is correlating different factors. Want to find pages that use the "tty" CSS media type together with the "@import" syntax? Fine. Documents with over 1,000 inline images, or XML documents that use both Flash and external CSS? MAMA can find them.
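To make the idea of combining factors concrete, here is a minimal Python sketch of that kind of query. It is only an illustration: the `PageRecord` fields, the `find_pages` helper and the sample URLs are all hypothetical, and none of them reflect MAMA's actual schema or interface, which this post does not describe.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Set


@dataclass
class PageRecord:
    """Hypothetical per-page metadata record; NOT MAMA's real schema."""
    url: str
    css_media_types: Set[str] = field(default_factory=set)
    css_at_rules: Set[str] = field(default_factory=set)
    inline_image_count: int = 0
    is_xml: bool = False
    uses_flash: bool = False
    has_external_css: bool = False


Predicate = Callable[[PageRecord], bool]


def find_pages(records: Iterable[PageRecord], *predicates: Predicate) -> List[str]:
    """Return the URLs of records that satisfy every predicate (logical AND)."""
    return [r.url for r in records if all(p(r) for p in predicates)]


# Made-up sample data, used only to exercise the queries below.
SAMPLE = [
    PageRecord("http://a.example/", css_media_types={"screen", "tty"},
               css_at_rules={"@import"}),
    PageRecord("http://b.example/", css_media_types={"tty"}),
    PageRecord("http://c.example/", inline_image_count=1500),
    PageRecord("http://d.example/", is_xml=True, uses_flash=True,
               has_external_css=True),
]

# "tty" media type AND "@import" syntax on the same page.
tty_with_import = find_pages(
    SAMPLE,
    lambda r: "tty" in r.css_media_types,
    lambda r: "@import" in r.css_at_rules,
)

# Documents with over 1,000 inline images.
many_images = find_pages(SAMPLE, lambda r: r.inline_image_count > 1000)

# XML documents that use both Flash and external CSS.
xml_flash_css = find_pages(
    SAMPLE,
    lambda r: r.is_xml,
    lambda r: r.uses_flash,
    lambda r: r.has_external_css,
)
```

Each query is a list of independent conditions that must all hold for a page, which is what makes correlating unrelated factors straightforward in this sketch.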
We know that MAMA is useful to us, and we think it will be very useful to others too:
- Browser manufacturers and others can use MAMA data on the popularity of widely used technologies to prioritize bugs and justify adding support for new technology to in-progress releases.
- Standards bodies can use the data to measure the success and adoption rates of various technologies.
- Web developers can use the same data to justify support of various technologies in their work.
- It can provide real-world, practical samples of the Web developer's "art", for inspiration and instruction.
MAMA has a side effect that is also one of its most beneficial features: by documenting actual practice, it gives Web authors a voice with browser makers and standards bodies.
MAMA's results for your consideration
In MAMA's introduction, and all the files that link from it, you'll find data that has been pulled from MAMA so far. We lead off MAMA's debut with a full study of markup validation. Along with this, a much shorter, condensed validation article covers some of the broad points of MAMA's validation findings.
Quick links to MAMA's first results:
- Introduction (this is the main starting point)
- Shiny and fancy press release
- Markup validation (full version)
- Markup validation (brief version)
- Key findings
- "The average web page"
...And this is just the initial phase! In the coming weeks, a number of other articles about MAMA's findings and statistics on other Web page topics will be released. Our eventual goal is to make an interface to the MAMA data directly available, with access phased in gradually once it is ready. For now, these statistics are a way to gauge and generate interest. These articles should also help spur a dialog that guides MAMA's future collection goals - I find that the more feature requests MAMA gets, the more it improves.
Please send your comments and feedback our way, either directly on dev.opera.com's forums or here on the Opera QA blog.