Web Applications Blog

Search engine friendly multi-language support on widgets.opera.com

What, why and how

We have just released a new version of widgets.opera.com where we have reworked our multi-language solution to support search engines. Instead of storing the user selected language in a cookie, which hides all languages except the default from search engines and indexers, we now have unique URLs for every language which the site is translated into.

In this blog post we present both the important client-side effects, and our technical Python and Pylons-based solutions on the servers. So if you are a user of widgets.opera.com, a developer who wants a similar site structure, or just passing by - read on!


What has changed?

Widgets.opera.com is currently available in two languages: English and Japanese. Before, the language you selected for the site was stored in a cookie. With this release, the cookie is removed and the URLs will instead reflect your choice of language. This means that all pages in Japanese have slightly different URLs than the corresponding pages in English. For example, the URL for the most popular widgets list is now:

As you can see, URLs without a language specifier (the same URLs as before) will give you the pages in English. The Japanese versions of the pages have the "ja" prefix just after the domain name. The same holds for all other pages on widgets.opera.com, and the links between them will adjust to your choice of language.

Why change?

The problem with the cookie-based solution is that the URLs are the same for all pages, regardless of language. The server translates the pages based on the cookie, and if there is no cookie, the default language is used. Search engines cannot make use of the cookie: a link in a search result cannot carry cookie contents, so only the default language gets indexed. By specifying the language in the URLs instead, every translated page becomes uniquely addressable and can thus be indexed by the search engines.

From the end users' perspective, they will be able to search in their own language and get the relevant hits, instead of having to search in English only. They will also, like the search engines, be able to link to widgets.opera.com targeting specific languages.

Currently, the benefits apply mostly to our Japanese speaking users, but if we were to add support for more languages at a later stage, the same benefits would apply to them.

How we implemented it

In this section we present the technical details of our language solution. This is hopefully of interest and help to others with similar wishes and issues regarding internationalization. A previous blog post presented some of the technologies we use for the widgets.opera.com site. In short, widgets.opera.com is a Python WSGI application using Pylons with Genshi, FormEncode, and AuthKit.

Moving the language specifier to the URLs has involved updating various sources of URLs in the system to include the request language when necessary, but we will cover the whole internationalization solution for completeness. The specific points of interest are:

  • Transparently extracting the language from the URLs
  • Various language-enabled helper methods
  • Internationalizing Genshi templates
  • Internationalizing FormEncode validation feedback
  • Language-enabled redirects to error pages
  • Language-enabled redirects by AuthKit to the login page

We will now go through each of them in turn.

Transparently extracting the language from the URLs

Because there are two types of URLs for almost every Pylons controller in widgets.opera.com (with and without a language specifier), we wanted to handle the language extraction at an early stage, so that only one type is matched against the routes mapping. We found that by extending the Routes Mapper and overriding the match and routematch methods, we could manipulate the URL right before the matching takes place. There we extract and remove the language code, and then add it to the match result, so that it becomes available from environ['pylons.routes_dict'] in the base controller. From there we use it to set the language in Pylons. The custom mapper looks something like this:

from routes import Mapper

class LanguageDetectingMapper(Mapper):
    def match(self, url):

        url, lang = self.detect_language(url)
        result = self._match(url)

        if result[0]:
            result[0]['_lang'] = lang
        if self.debug:
            return result[0], result[1], result[2]
        if result[0]:
            return result[0]
        return None

    def routematch(self, url):
        # Similar to match() above; elided here.
        raise NotImplementedError

    def detect_language(self, url):
        # Detect the language code, if any, and return a
        # (url_without_language, language_code) tuple.
        raise NotImplementedError

The call to detect_language and the lines that store the language in result[0]['_lang'] are the basic differences from the actual Mapper source code. Now that the language is set, we can start using it.
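The body of detect_language is not shown in the mapper above. As a rough illustration only, here is a minimal standalone sketch of how such a helper could work, assuming the language code is always the first path segment and that English has no prefix. The names SUPPORTED_LANGUAGES and DEFAULT_LANGUAGE are hypothetical and not taken from the site's code:

```python
# Hypothetical configuration: languages that appear as a URL prefix.
# English is assumed to be the default and to have no prefix.
SUPPORTED_LANGUAGES = frozenset(["ja"])
DEFAULT_LANGUAGE = "en"


def detect_language(url):
    """Split a leading language segment off a URL path.

    Returns a (url_without_language, language_code) tuple, in the
    same order the mapper's match() method unpacks it.
    """
    # "/ja/widgets/" splits into ["", "ja", "widgets/"].
    parts = url.split("/", 2)
    if len(parts) >= 2 and parts[0] == "" and parts[1] in SUPPORTED_LANGUAGES:
        rest = parts[2] if len(parts) > 2 else ""
        return "/" + rest, parts[1]
    # No recognised prefix: leave the URL alone, use the default.
    return url, DEFAULT_LANGUAGE
```

Matching on a whole path segment rather than on a string prefix means that a path such as /japan/ is not mistaken for a Japanese URL.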

Various language-enabled helper methods

Because the language itself isn't present in the routes, the url_for method doesn't know about it, and thus had to be redefined to include the language prefix when necessary. As a consequence, the redirect_to, current_url, current_page and link_to_unless_current methods, among others, had to be redefined too so that they use the custom url_for method. For some of these methods we kept a *_no_lang version to be used for language-independent URLs and checks.
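To make the idea concrete, here is a small sketch of how a language-aware url_for could be layered on top of a language-unaware one. It is not the site's actual code: add_language_prefix and make_url_for are hypothetical names, and the underlying URL generator and the way the current language is looked up are passed in as plain callables so that the example stays self-contained:

```python
DEFAULT_LANGUAGE = "en"  # assumption: the default language has no prefix


def add_language_prefix(path, lang, default=DEFAULT_LANGUAGE):
    """Prefix a site-relative path with a language code when needed."""
    # Leave default-language URLs and absolute URLs untouched.
    if lang == default or not path.startswith("/"):
        return path
    return "/" + lang + path


def make_url_for(url_for_no_lang, get_language):
    """Wrap a language-unaware URL generator with a language-aware one."""
    def url_for(*args, **kwargs):
        path = url_for_no_lang(*args, **kwargs)
        return add_language_prefix(path, get_language())
    return url_for


# Usage with stand-ins for the real URL generator and language lookup.
def fake_url_for(controller, action):
    return "/%s/%s" % (controller, action)

url_for = make_url_for(fake_url_for, lambda: "ja")
japanese_url = url_for(controller="widgets", action="list")
```

Keeping the unwrapped generator around corresponds to the *_no_lang variants mentioned above.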

Internationalizing Genshi templates

We don't do anything extraordinary when it comes to internationalizing the Genshi templates. We use a callback method to the template loader which adds a Genshi Translator filter to the templates, as suggested in the Genshi documentation. The Translator filter is set to use Pylons' ugettext method, so that all text messages are fetched through Pylons' i18n framework:

from genshi.filters.i18n import Translator
from pylons.i18n.translation import ugettext

def template_loaded(template):
    template.filters.insert(0, Translator(ugettext))

See the Genshi documentation for clarifications.

Internationalizing FormEncode validation feedback

FormEncode is treated similarly to the Genshi templates. All messages are fetched through Pylons by supplying a state object to the validation decorators, which maps the _ (underscore) method to Pylons' ugettext:

from pylons.decorators import validate
from pylons.i18n.translation import ugettext

class PylonsFormEncodeState(object):
    # FormEncode looks up "_" on the state object to translate messages.
    _ = staticmethod(ugettext)

@validate(..., state=PylonsFormEncodeState())
def foo():
    # ...

This is based on Pylons issue #296, with the main difference being that we don't fall back to FormEncode's own translations. Using the conventional methods for extracting the default messages from the code and templates, the translation keys are the fallback messages when no translations are found. This presumes that the fallback language is English, which is the case for widgets.opera.com.

Language-enabled redirects to error pages

The ErrorDocuments middleware, which translates status codes into error pages, has to be changed to redirect to different error page URLs depending on the request language. The ErrorDocuments middleware gets the URLs from the Pylons error_mapper method, so by placing a language-aware prefixing method between the error_mapper method and the ErrorDocuments middleware, the problem is solved.

The relevant documentation can be found here.
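As an illustration of the prefixing step, the sketch below wraps an error_mapper-style callable so that the URL it returns carries the request language. It is only a sketch under assumptions: make_language_aware and language_from_environ are hypothetical names, the wrapped callable is assumed to take (code, message, environ, global_conf, **kw) and to return a URL or None, and the language is assumed to be readable from the '_lang' key that the custom mapper stores in the routes dict:

```python
def make_language_aware(error_mapper, get_language, default_language="en"):
    """Wrap an error_mapper-style callable so its URLs carry the language."""
    def language_aware_error_mapper(code, message, environ,
                                    global_conf=None, **kw):
        url = error_mapper(code, message, environ, global_conf, **kw)
        lang = get_language(environ)
        # None means "no error page for this status"; pass it through.
        if url is None or lang == default_language:
            return url
        return "/" + lang + url
    return language_aware_error_mapper


def language_from_environ(environ):
    # Assumes the custom mapper stored the language under '_lang'.
    return environ.get("pylons.routes_dict", {}).get("_lang", "en")


# Stand-in for the real mapper: only 404 gets an error page here.
def demo_error_mapper(code, message, environ, global_conf=None, **kw):
    if code == 404:
        return "/error/document/?code=404"
    return None


mapper = make_language_aware(demo_error_mapper, language_from_environ)
```

Because the wrapper has the same call signature as the callable it wraps, it can be handed to the middleware in place of the original.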

Language-enabled redirects by AuthKit to the login page

We have been using the redirect method in AuthKit to handle the situations where the user needs to log in before proceeding. The only issue is that the redirect handler takes a static URL to the login page, while we need different URLs depending on the selected language. So we replaced it with a new redirect handler which can take a callable as well. We then set the handler to use our language-aware url_for as the callable, with the route to the login form action as argument. The code for the handler is as follows, and is completely generic:

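# Import paths assumed to match those used by AuthKit's own redirect
# handler; check them against your AuthKit version.
from authkit.authenticate import AuthKitConfigError
from authkit.authenticate.multi import MultiHandler, status_checker

class CallableRedirectHandler(object):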
    def __init__(self, app, target, kwargs):
        self.app = app
        self.target = target
        self.kwargs = kwargs
    
    def __call__(self, environ, start_response):
        # The target is either a static URL or a callable producing one.
        if callable(self.target):
            location = self.target(**self.kwargs)
        else:
            location = self.target

        body = 'Redirecting to %s' % location
        start_response('302 Found', [
            ('Location', location),
            ('Content-Type', 'text/plain'),
            # Must match the body actually returned below.
            ('Content-Length', str(len(body))),
        ])

        return [body]
    
def make_callableredirect_handler(
    app,
    auth_conf,
    app_conf=None,
    global_conf=None,
    prefix='authkit.callableredirect.'
):
    if 'login' not in auth_conf:
        raise AuthKitConfigError('No %slogin key specified' % prefix)
    
    if 'loginkwargs' in auth_conf:
        kwargs = auth_conf['loginkwargs']
    else:
        kwargs = {}
    
    app = MultiHandler(app)
    app.add_method('callableredirect', CallableRedirectHandler, target=auth_conf['login'], kwargs=kwargs)
    app.add_checker('callableredirect', status_checker)
    return app

The login and loginkwargs keywords specify the static URL or callable, and the arguments to the callable.

See the AuthKit Cookbook on how to create custom handlers.

That's it

That concludes the internationalization solution on widgets.opera.com. We hope this is of help to you or someone else in the future. If you have any thoughts or questions, feel free to write a comment.

Comments

Two languages? That's it? :/

By Zajec, # 18. March 2008, 09:47:29

You have to start somewhere :smile:

By haavard, # 18. March 2008, 11:25:27

there are Swedish friends blogs ,i want to translate to English so i can reply to them

By brains2020, # 6. June 2008, 05:17:20
