What is good for business and what is good for the Web are not the same. Opera's business case for Presto has never been particularly strong. Most of the development effort has had the least perceptible effect for the users (and for the profit), while what has had the most perceptible effect has taken the least effort.
Those quirks that have annoyed me as an Opera user, and those that have annoyed you if you use Opera too, are likely not ones an Opera developer has spent much time solving. I am no greater fan of Webkit than of Presto, and there is no reason to believe that Opera users will be more satisfied with Webkit than with Presto, but the onus is no longer on Opera Software.
If Presto had 5 bugs and Webkit had 10, but web developers worked around the Webkit ones, it would be Opera that appeared to be the buggy one, and Opera that had to expend developer time handling them. Now bugs will not put Opera at a comparable disadvantage, because the others in Opera's world will share them too.
In terms of engine lineage Konqueror konquered through the Webkit branch, but who knows Konqueror today? A decade ago it was losing the OSS mindshare battle with Mozilla, but technically it was more of a threat to Opera on mobile devices than Mozilla was. The assessment back then was that Mozilla would pose no real threat on mobile, and a decade later that has proven right.
My view was that while Opera could take the device hills, it couldn't win the war unless it also held the rich valleys below, meaning at the time, and to this day, Windows for PCs. That Opera has consistently failed to do through the years, while Mozilla took much of that valley and some smaller Linux side valleys, but not the higher ground.
Rivalry notwithstanding, in standards politics the closed-source, for-profit Opera hill browser and the open-source, not-for-profit Mozilla valley browser may have been the closest of allies. That is not so surprising: the other competitors are company men, serving the goals of Apple, Google, or Microsoft. It may not be a good omen for the Open Web that next year will be the year of the wooden horse.
The business interests of Opera haven't changed fundamentally; an engine change affects the Web more than it changes the way a company operates. But Mozilla has gained a lessened ally, not because Opera will be a -webkit- mouthpiece, but because a -webkit-Opera is likely to retreat from the standards battlefield, leaving behind a few guerilla warriors.
Fighting the Open Battle
I spent some of my opening years at Opera arguing that it is Open Standards rather than Open Source that is crucial to keeping the Internet free. Open Source is well and good, and we are better off that 2 of the 3 remaining major engines are open source, but it isn't source code that makes for a healthy ecosystem. GNU Emacs is an Open Source editor-cum-operating-system, but had Emacs alone been the editing environment, with different forks and customisations adapted to different needs, next to nobody's needs would actually be fulfilled, and managing that haystack of forks would not be feasible.
Open Standards on the other hand, of which the Internet is the prime example, can handle any participant, open or closed source, and their products. I can make something on the Web, like this blog entry or a data API, without knowing who will be using it or what software they will use for it. You, and the billions of other people and machines that read this, can use it without caring who I am or what software I used. You can process it for your own purposes, again without knowing anything about who your producers and consumers are, as long as they follow open standards.
We may only realise how unusual what we have is when we lose it, when the contract of open standards is broken and we no longer speak the same language. Apps for different platforms are an example of losing that contract. If you develop an app for Apple, you have to redo it for Android, and then again for Windows. It was really bad in the days of Netscape vs Internet Explorer, but it didn't quite come to this.
Closed-source Opera has been ported to more devices, taken more hills, than I believe any open-source browser has. Over the years Opera basically turned into a device-porting implementation. In a Closed Standards world that would not be enough. Were you to make the next great device, you would first have to decide which feudal master to declare your allegiance to: Apple, Google, or Microsoft. Independence would not be an available option.
However, as Engadget succinctly headlines, "Bluetooth SIG unveils Smart Marks, explains v4.0 compatibility with unnecessary complexity". You don't just have Smart, you have Smart Ready for devices that are ready to smart. "Ready" implies "buy the more expensive one now, get happy later": if you buy a Ready device, it will be able to work with The Shiny New Thing on the near horizon.
In High Definition TV branding, for instance, you had Full HD and HD Ready, with one set of specifications for the target capability (Full HD) and a lesser one that you could get now, compatible with the target spec (HD Ready). That was in itself more marketing than reality; the name notwithstanding, HD Ready would never be Full HD, but at least it would be compatible.
In the Bluetooth 4 scheme of things the host/master is Smart Ready, while the connected devices will be Smart. Any smart customer seeing the Smart Ready label will ask: OK, when will Smart Ready be ready? And why do the Smart devices need Smart Ready devices, why can't they just be smart? Do you have anything Full Smart? Mangled language, mangled expectations, confusion.
This is not written only from a WAI perspective, but also from the perspective of making video/audio WAI usable.
Before jumping into syntax at the end, I would like to consider the requirements for captions and transcriptions. Here are four of mine, in part based on an earlier post of mine.
1. IT HAS TO BE COLLABORATIVE
Look at the process by which transcripts are made. Take the TED.com videos as an example, which I think is more useful than e.g. YouTube.
0. Binary video
Step 1: Captioning
1a. autocaption (speech recognition)
1c. contributors turn it into regular language, adding value
1e. editors take the contributions and fix them, adding quality
Step 2: Caption translation
2c. contributors translate into the target language and script, adding value
2e. editors take the contributions and fix them, adding quality
Step 3: Encoding context
3a. automatic metadata
3c. contributors correct and tag the video, adding value
3e. editors take the contributions and fix them, adding quality
Step 2 is dependent on step 1, while step 3 is semi-independent and can be shunted off to some metadata discussion. Some of it is relevant, though. TED talks are talks made by a single speaker, but in interviews or discussions there may be multiple voices.
What I am aiming at, however, is that in practice these steps are performed by different people, often at different times and places. The infrastructure has to allow for that. The one making the video may not be the one making the captions or transcriptions, who may not be the one translating them into readable Mandarin Chinese.
2. IT HAS TO HAVE SENSIBLE FALLBACKS
In the absence of collaboration we have to rely on automatic captioning and translations, provided by the web site and/or the user agent. Having both will likely end up as double Dutch with today's technology, but it is still better than nothing. The transcription from that Urdu video may not be easily understandable, but at least we might get an inkling of what it is about and whether we should spend effort getting a better translation. A caption made by a human should be vastly better, though in real cases it is not unlikely that a caption is neither made by a human nor better than what a UA can provide.
This extends to transcriptions versus captions, captions versus subtitles, and the interaction with the description, chapter, and metadata types. Fairly obviously a subtitle can, with some degradation, serve as a caption, or a caption as a subtitle. The rule for what would serve as a transcript in the absence of a transcript track is less obvious, but necessary if transcript is to work.
It seems to me that just as a caption is a subtitle in the absence of audio, a transcription is a caption in the absence of video. Furthermore it need not be timed, and would not normally be rendered as timed. It would however be a mistake to remove the timedness of transcriptions, even when they are not presented as timed text.
Most "real world" transcripts are edited, removing umms and errs, filler words, non-verbal cues, and pauses, having a more written form than oral form. But this is not an inherent characteristic of transcripts, a linguistic transcript could go to considerable lengths, pauses and laughter could be significant, even the length of the pause, as with the Nixon tapes.
Thus subtitles, captions, and transcripts could be used as fallbacks for each other, but it should be specified how.
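To make that fallback order concrete, here is a minimal sketch using the HTML5 TextTrack API. Note that a "transcript" track kind is not defined by HTML5; it stands in here for whatever kind a future spec might define, so the first entry in the list is an assumption.

function pickTranscriptSource(video) {
  // Preferred first: a dedicated transcript track (hypothetical kind),
  // then captions, then subtitles, degrading as described above.
  var order = ["transcript", "captions", "subtitles"];
  for (var i = 0; i < order.length; i++) {
    for (var j = 0; j < video.textTracks.length; j++) {
      if (video.textTracks[j].kind === order[i]) {
        return video.textTracks[j];
      }
    }
  }
  return null; // nothing usable; fall back to automatic captioning
}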
3. IT HAS TO SUPPORT SOME TRANSCRIPTION VIEWPORT
A transcription is meant to be read, not watched. It may or may not be read with the transcribed video active.
A good model for transcripts is TED.com (done with scripting rather than HTML5, but the functionality should be replicated). Take as an example a TED talk like "Building blocks that blink, beep and teach". You can pick subtitles in one of 23 different languages/transcriptions (at present). If you click the "Interactive transcript" button you get a transcript in any of these languages. Incidentally, you don't have to pick the same language for subtitle and transcript.
Crucially, the transcript is interactive and timed, even though the time cues are not visible. Change the language to Simplified Chinese, activate the text in "在大约100年后的1947年" and you get to the point 45 seconds into the video where the speaker talks about LEGO. The video and the transcript are independent but linked. This is a very valuable property.
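A minimal sketch of that property, assuming an HTML5 video and transcript snippets annotated with their cues' start times (the data-begin attribute and the file name are my own invention, not anything TED or the spec defines):

<video id="talk" src="talk.webm" controls></video>
<div id="transcript">
  <span data-begin="45">在大约100年后的1947年 …</span>
</div>
<script>
  // Activating a snippet seeks the video to that cue's start time,
  // keeping transcript and video independent but linked.
  var video = document.getElementById("talk");
  document.getElementById("transcript").onclick = function (e) {
    var t = e.target.getAttribute("data-begin");
    if (t !== null) video.currentTime = parseFloat(t);
  };
</script>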
4. IT HAS TO HAVE MULTI-LANGUAGE SUPPORT
As with TED, future videos should support, simultaneously, any language the authors and contributors are able to encode.
Here in China it isn't unheard of to have three subtitles shown simultaneously (Simplified Chinese characters, Pinyin, and English), though a double subtitle track (e.g. Simplified Chinese and English) is far more common.
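A sketch of how that could look with HTML5 text tracks (the file names are placeholders). Most user agents show at most one track by default, so script is needed to force several on, where the user agent supports rendering more than one:

<video src="lesson.webm" controls>
  <track kind="subtitles" srclang="zh-Hans" src="lesson.zh.vtt" label="简体中文">
  <track kind="subtitles" srclang="en" src="lesson.en.vtt" label="English">
</video>
<script>
  // Setting every track's mode to "showing" requests simultaneous display.
  var tracks = document.querySelector("video").textTracks;
  for (var i = 0; i < tracks.length; i++) tracks[i].mode = "showing";
</script>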
While there is a use case for multiple subtitles, it would be less common for transcriptions. A selection box UI like what TED uses to pick one transcription out of multiple seems more natural.
SYNTAX? WHAT SYNTAX?
I don't care that much about syntaxes, but would say that the first syntax is both more natural, putting transcripts in line with captions and subtitles, and more flexible, given that popular videos should have a multitude of languages (and thus a multitude of transcripts).
The second syntax seems to have been born out of a concern that while the video may have a transcription, it is trapped inside the video container. It should be possible to display text tracks in another context, either embedded in the same document like TED does with "Interactive transcript" or by linking to it.
Furthermore it should be possible for the UA to display a transcript even if the author has neglected to make a TED-like "transcript" container, or if the transcript has been made by a different contributor than the original author.
This seems as yet unanswered in the spec. We have the transcript (or a transcript fallback), but how can it be activated? Will it be interactive (like the TED case) by default?
From a face-to-face meeting on Web Animations, as discussed here. There is now a CSS-SVG Effects Task Force activity on this.
SVG Animation (based on SMIL) and CSS Animation offer some similar features for animating Web content. Harmonising these two technologies has been considered on a number of occasions, but a path forward has yet to be established. In response to a feature-by-feature comparison of the two technologies the question was raised, “What are the animation features required, prioritised by use case?” (meeting minutes which led to Action-48), although a closely related question was, “What would it take to make CSS Animations achieve feature parity with SVG Animations?”
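As a small, self-contained illustration of the overlap: the same fade can be declared either as an SVG (SMIL) animation on the element itself or as a CSS Animation in a stylesheet.

<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
  <style>
    /* CSS Animation: the effect declared in a stylesheet */
    @keyframes fade { from { opacity: 0; } to { opacity: 1; } }
    .css-fade { animation: fade 2s infinite; }
  </style>
  <!-- SVG Animation (SMIL): the same effect declared on the element -->
  <circle cx="30" cy="50" r="20">
    <animate attributeName="opacity" from="0" to="1" dur="2s"
             repeatCount="indefinite"/>
  </circle>
  <circle class="css-fade" cx="70" cy="50" r="20"/>
</svg>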
It has taken time, but by now CSS and SVG are on the same team, which is something of a precursor to making the specifications well integrated as well.
Trust builds up slowly; once lost it is rarely regained. In those early days after the April 1 non-joke, trust came readily to Gmail. For me it started with my favourite four-letter word: "Undo". Commonplace client-side, web services rarely offered this basic functionality, and that as much as latency made for an inferior user experience. Gmail was an exception. You delete something you'd rather not have, you pick it up from the trash. You did something you shouldn't have, you click on the "undo" link, and all was well again.
For a while.
I mostly prefer to use Opera, which has been troublesome with Google, though the two have mostly made up lately. Worse, I have been to many places with limited connectivity, using the phone network or fritzy WiFi that drops. This means the "Basic HTML" rather than the "Standard" service. Here in China I have little choice even when the connectivity is good; Opera+Google+China+Standard fails in most cases. Basic HTML is actually good enough. You lose some of the flash of Standard, but it is the lack of connectivity, not the lack of flashiness, that slows you down.

What is worse, and I cannot say if it is specific to Basic HTML as I have too little experience with Standard, is that Gmail has become unreliable. Not unreliable as in the well-publicised spat with the Chinese government, which is annoying but not really damaging (it is better now; at times VPN was required, now it is only recommended). The problem was losing-your-email reliability issues. You write a message, the connection is lost, you press Send, and get an error message. No problem, right? Unlike IE6 (boo! hiss!) you can go back one step in history and retrieve that message, right? Wrong. Whatever script Gmail is running interferes with Opera's now-less-than-stellar history functionality, so chances are you get nothing.
This and other issues have caused a number of lost messages from me, and more often from my co-workers (whether using Opera or other browsers). My advice has been the old adage to save often, as well as the somewhat-stone-agey-but-safe "type it in a word processing program and copy it into the browser", since all modern word processors/text editors autosave without any freakish AJAXy unreliability, or maybe the best: "Use Thunderbird". I can't really say I like Thunderbird, but at least our data have been safe so far.
Two days ago I started a company-important document that I worked on, on and off. Of course I should have heeded my own advice and used a word processor, but using a browser is convenient. I did save (though not as often as I should have; Gmail's autosave is not much to rely on), so I ran a risk of data loss, but nothing catastrophic. So I thought, until the "discarded" message suddenly showed up. When Google say "discarded" they mean it. That is not a nice little "We moved your message into the Trash folder for you, do you mind?", or "We deleted your life's work. Undo?", or even "We messed up, all your changes since your last save are gone. Sorry.". It means "The message you have been working on, from the beginning through what we told you we had saved to the last changes, is now gone and deleted. There is no way of getting it back. We are not telling you why we did it, and there is nothing you can do about it. Have a nice day."
Trust builds up slowly; once lost it is rarely regained.
Coincidentally the W3C declared that CSS 2.1 had finally reached Proposed Recommendation status, and surely, surely!, the end, in the form of CSS 2.1 as Recommendation, should be nigh. I recapped the story in an entry almost four years earlier, Cruelly Slow Slog, a slog that cruelly continued for four more years. 11 years of labour is not bad for what was intended as a quick fix.
Heaping on the irony, a new working draft of the CSS3 Speech Module was published. While CSS 2.1 was bound to reach PR some day, the CSS3 Speech Module had been missing in action for seven years; had this module been a person, he would have been declared dead a long time ago and his belongings spread among his heirs.
Maybe it is time for me to revive the Audio module? While the Speech module is an aural equivalent to the Text module, covering how to style spoken text (generated by text-to-speech), there would be a use, arguably a greater one, for handling audio files. Audio is to speech as image is to rendered text. Properties of an audio module would be the likes of volume, balance, delay, speed: how to present the media in a given context.
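A purely hypothetical sketch of what that could look like; none of these properties exist in any CSS specification, they are only named by analogy with the Speech module's voice-volume and voice-balance:

audio.ambient {
  audio-volume: soft;    /* hypothetical; cf. voice-volume in CSS3 Speech */
  audio-balance: left;   /* hypothetical; position in the stereo field */
  audio-delay: 2s;       /* hypothetical; pause before playback starts */
  audio-speed: 1.25;     /* hypothetical; playback rate */
}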
In the intervening years HTML5 has happened, with dedicated audio and video elements. As is otherwise the case, more well-defined markup makes the case for styles easier, giving everyone involved the ability to adapt content to user circumstances, for audio as well as for images.
Of course, on present form, Jesus Christ would have become a frequent flyer by the time such a module reached Proposed Recommendation.
Like most Wikipedia maps of this type it is made in SVG (using Adobe Illustrator). The code it generates is fairly decent, better than some code I have seen earlier, but still hard to read and use for my purpose, so I cleaned it up with Komodo Edit. The regular expressions work well in that editor, which makes clean-up much quicker and easier.
Maps are, or at least should be, ideal for SVG and SVG animations. Famously they are not the territory, but they are a layer of information that can be used instead of or as an overlay to other maps of the territory, highlighting the information you are looking for and hiding the information that distracts.
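A minimal sketch of that layering idea (the group names are placeholders of my own): each class of information lives in its own group, shown on top of a base map or hidden when it distracts.

<svg xmlns="http://www.w3.org/2000/svg" width="400" height="300">
  <style>
    .hidden { display: none; }
  </style>
  <g id="base-map"><!-- coastline, terrain --></g>
  <g id="railways"><!-- the information you are looking for --></g>
  <g id="road-labels" class="hidden"><!-- the distracting layer, toggled off --></g>
</svg>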
We are now at a stage, technically and sometimes socially, where this process could be organised differently. It could be more like the process for producing the HTML5 standard, for instance, with early prototyping, public involvement with the stakeholders, scenarios and visualisations, and a transparent process.
Any area potentially under development could be put on Google Maps, or local equivalents, as well as on a timeline, for any interested citizen to see or get involved with.
I am interested to know: What "prior art" do you know of in architecture along these lines, how was it done, what were the consequences and lessons?
(Speaking of long-standing bugs, if mostly an annoying one: Opera 11 made a big issue of showing simplified URLs, with only the most significant bits and security information in the address bar. That is fine, but the fact is that what is actually in the address bar need not be the page you are at, but some old crud that happens to be in the address bar, like an earlier page, or maybe a typed search, or something else. It is a boring and possibly extensive bug to fix, but in principle it is a security issue. If anything it seems to be worse in Opera 11.)
But I didn't write to gripe, but to applaud that Opera finally has tab groups, or tab stacks. As soon as you have more than a dozen tabs active, organising them is going to be an issue, and at around double that number you lose the overview completely. It is nice that you can move the tabs around, but a larger group of tabs is unwieldy. Enter the tab stack.
Neat as this may be, it risks a common syndrome of Opera development. New features are developed all the time, hopefully to completion but sometimes only to beta level, and then they are forgotten until subsumed by later features. A decade ago the duel between window-based browsing and tab-based browsing was resolved into a combined style of browsing, which can be seen in the context menu of a link (where you can open it in a new window or tab). A couple of years later Opera got sessions, a feature left unchanged since it was partially implemented, making it much less attractive than it should have been. Sessions matter for a more important reason, however: this is what lets Opera restart with all your tabs intact, except of course what you have typed into them, viz bug 155102. Yet later we got some minor tab features, like private and pinned tabs.
What if Opera would combine the three separate features of mixed tabs and windows, sessions, and tab stacks into one unified feature? What is an Opera window but a tab stack with duplicate chrome? What is a window could be turned into a tab stack and vice versa. Likewise a named tab stack and a session could be synonymous. You give a tab stack (or a window) a name, and when it has a name it has an identity that can live on through sessions. The tab stack UI solves the maintenance problem of sessions: there are always some tabs that you don't want in a session any longer, and there will be new ones to add. That is exactly what you naturally do with tab stacks. If you don't want a particular tab to pop up the next time you activate a session, you just move it out of the tab stack (or into it, as the case may be). When tab stacks are unified with sessions, the time would come to simplify the window UI, hopefully in a way that would bring the desktop UI more in sync with the device UIs.
Next: having united these three features, there is a fourth living a separate existence from the others; maybe it will then be time for a Grand Unified Featureset of tabs and windows.
It is situated at Storo, which would rank third after Oslo Centre and Majorstuen as a traffic nexus. It is right on top of the Storo ring line metro station, meaning all other metro stations are within easy reach; most tram lines and some buses stop there as well, including the airport bus for conveniently leaving Oslo at will. Whether for long-term or short-term commuting, or just wanting easy access to all of Oslo, the location is ideal.
The flat itself is a one-room 23 m² with a nice westward view.
The feature in the Norwegian newspaper Dagbladet showed how he had resourcefully turned the hotel room into a makeshift home office, with the Lenovo ThinkPad neatly stacked next to a blue folder with "Confidential" handwritten in neat, press-friendly letters. He was being a model Norwegian citizen by "working most of the day", but spending some time shopping, getting himself an iPad. Doing his work from New York was no problem, but he had to get up early due to the time zone difference. He wasn't too concerned about missing Friday's Council of State, but hoped to be home for his daughter's dance performance on Sunday. That covered the checklist: working effectively but not too long, knowing trendy technology, while being relaxed and a good family man.
The American CNN had a feature as well; here the picture provided (by the Prime Minister's Office, no doubt) was of the Prime Minister at the airport actually handling the aforementioned iPad, or at least appearing to do so. The tagline here was "Running a country? There's an app for that."
From a machine point of view the ThinkPad did all the work while the iPad got the attention. Though truly seen from a machine perspective, it would be the mobile phone, and all the hidden technology making the human interface devices actually function, allowing the head of government of a country to stay away from home with nobody noticing. Nobody but Dagbladet, CNN, this blog, and a few hundred other media outlets, that is.
Absurdist literature, it appears, stimulates our brains.
That's the conclusion of a study recently published in the journal Psychological Science. Psychologists Travis Proulx of the University of California, Santa Barbara, and Steven Heine of the University of British Columbia report that our ability to find patterns is stimulated when we are faced with the task of making sense of an absurd tale. What's more, this heightened capability carries over to unrelated tasks.
In the first of two experiments, 40 participants (all Canadian college undergraduates) read one of two versions of a Franz Kafka story, The Country Doctor. In the first version, which was only slightly modified from the original, "the narrative gradually breaks down and ends abruptly after a series of non sequiturs," the researchers write. "We also included a series of bizarre illustrations that were unrelated to the story."
The second version contained extensive revisions to the original. The non sequiturs were removed, and a "conventional narrative" was added, along with relevant illustrations.
In other news, Reader's Digest files for bankruptcy. Hope for the human mind?
The most obscure one may be InkML. The name might imply a language for tagging with paint, but it really describes the set of movements registered by a touch-sensitive tablet or screen, so that the scribbles you make can be processed and enhanced by someone cleverer than the tablet driver. Unfortunately this specification is made by a tablet-maker subgroup that, like Schrödinger's cat, is alive or dead depending on your perception, and the spec is progressing at a less than vital speed.
The Augustine commission to Obama: No Moon voyage before 2020
Can high-speed trains become the backbone of our megaregion?
Cat free 9/9/9? Great, but what about all the other days?