What language is this page?

One internal goal of the gia project is to showcase a html5 compliant Roundup tracker template.

This raises the need to determine the language used to render the respective template, since we need to fill in the html tag’s lang attribute:

<tal:block metal:define-macro="">
  <!DOCTYPE html>
    <html lang="??">
...

Alert sign with indications in several languages While not documented, it turns out that in the end it is neither hard nor overly involved to find out this information inside a template. You have to do your homework though.

In this article, I will first share the recipe, then expand on http language negotiation and Roundups approach to it. Finally I will elaborate briefly on further aspects of internationalization of Roundup trackers.

The Recipe

Preamble

Amongst others, we use the following configuration settings in our trackers:

[tracker]
...
language = de_AT

[web]
...
use_browser_language = yes
...

This means, that the trackers web interface tries to switch to the language preference indicated by the web browser of the visiting user. If none of the preferred languages matches, the web interface will be presented in (Austrian) German.

If your tracker is monolingual or you do not allow the language to be switched by the browser preference there is no need to use the following recipe. If you do switch languages via the web user interface though, just use the request/language path in the templates.

TAL Python Extension

Create the file page_language.py with the following contents in the extensions sub directory of your tracker:

# return the selected page language
def page_language(i18n):
    if 'language' in i18n.info():
        return i18n.info()['language']
    else:
        return 'en'

def init(instance):
    instance.registerUtil('page_language', page_language)

Prepare .po Files Correctly

By convention all .po files contain a translation for the empty string on top, which is in the form of an RFC 822 header. Be sure, in all .po files this ‘info’ block looks like the following.

"Project-Id-Version: 1\n"
"POT-Creation-Date: Sat Dec 10 21:53:51 2016\n"
"PO-Revision-Date: 2016-12-20 10:18-0600\n"
"Last-Translator: Georg Lehner <email@suppressed>\n"
"Language-Team: German\n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: talgettext.py 1.5.0\n"

The critical part for our objective is the line:

"Language: de\n"

Note: Content-Type and Content-Transfer-Encoding are the other two required lines for a working translation.

Gratuitous English .po File

You must have a en.po file in the locale sub directory of your tracker, although it needs not contain any translation. Create it by copying the messages.pot file to en.po and do with it as described in Prepare .po Files Correctly.

Use the Page Language In Your Template

At any place in your TAL template file where you need to know the selected page language use the python extension as in the following example:

<tal:block metal:define-macro="">
  <!DOCTYPE html>
    <html lang="python:utils.page_language(i18n)">
...

This is just the way I use it for gias html5 compliant template mentioned at the beginning of this article

What the Heck?

Man with machete fighting other man with revolver

HTTP Language Negotiation

By the very nature of the Internet web sites have an international audience. The means provided by the web standards to adapt a web site to the language capabilities of a visiting user is the HTTP Header Accept-Language which is defined in RFC 7231.

The user of the browser (you!) sets up one or more languages in order of preference. The browser then sends the Accept-Language header with the list of languages with every request to the web servers.

Roundup Internationalization

With the configuration line:

use_browser_language = yes

Roundup dissects the list of preferred languages and converts the [language tags][rfc_5646] into a list of gettext locale codes ordered by the users preference, highest priority first. Then this list is expanded by adding main language tags to sub-tagged languages, if the main language does not exists. Finally the list is iterated through, trying to find a translation file on each turn. The first file found is used to translate the strings in the page. If no matching file is found, the configured tracker locale is used.

The translation file is read in and represented by the template translation object i18n, with the well documented methods and template invocations. This process makes use of the Python gettext module.

Notes:

  • While the algorithm used provides for a correct ordering, it probably can generate incorrect or unusable locale codes, since an eventual variant sub tag given by the browser will not be represented in the way locale codes are defined. I guess, however, that in practice this has no relevance, unless you go into the lengths of creating different translations for specific dialects or variants of the same main language on your Roundup tracker.

  • Probably Roundup should record the selected locale in a template variable, e.g. page_language, by itself, obsoleting this blog completely.

The Trick to Get The Page Language

The Python gettext module apparently reads the translation of the null string into the _info variable and provides its content via the info() method of the NullTranslations class in form of a dictionary.

This dictionary is even available as template path i18n/info. A convenience mapping to the values as in i18n/info/language is however not implemented.

The reason why we cannot use the template expression:

python:i18n.info()['language']

directly is, because if no language is negotiated or the Language: line is not present in the .po file, we get a KeyError exception thrown at. Our page_language TAL Python Extension takes care of this error.

English Is Not English

Sketch of the actor Rowan Atkinson. The next peculiarity found is, that even if my browsers language preference is set to ‘English, then German’, I get the German translation. If you re-read the notes about Roundup Internationalization you will see, that if there is no en.po file (or more precisely: no en.mo file) the configured tracker language will be used. This happens even if ‘en’ is on the language list and we have no need to translate any string.

Furthermore, if the English language is explicitly requested via a @language=en element in the http query string (more on this later), the i18n translation object has no file to read in the _info variable and in consequence the dictionary returned by i18n.info() is empty.

Both issues are fixed by providing the Gratuitous English .po File.

Further Notes on Internationalization

A Roundup tracker typically falls into the class of ‘Multilingual, same content’ websites. Just the user interface is translated into the respective users language.

W3Cs masterly article about language negotiation highlight several aspects from the usability view. Most are dealt with already in Roundup.

The first is Language negotiation via user configured preferences: This is what we have discussed until now. What follows are some thoughts on the other aspects of internationalization of web sites.

User-Agent Header Heuristics

Sean Connery as James Bond The above mentioned article suggests one could guess the browser user interface language from the User-Agent http header, in case no Accept-Language header is sent.

A quick Internet search seems to indicate, that a lot of browsers do not include the language in the User-Agent header and that programmers seldom (or never?) use this heuristic at all. Roundup does currently not look at this header.

Stickiness In Navigation

The language can be set directly via the query string element @language. It sets the cookie roundup_language to the selected locale code. When the cookie is found, it is used to set the templates language variable and to select the respective translation.

Two issues arise:

  • If @language is set to a language for which you do not provide a translation file, English is chosen. Arguably one would expect the site to be rendered in the tracker language in this situation.

  • Setting a cookie is considered a privacy leak. The W3C site, e.g. asks you gently via a pop up if you want to set a cookie for persisting your language choice. If you don’t, every next page falls back to language negotiation.

Bad or No User Languages

In several scenarios a user might get a page with a language she does not know, even if in some cases a known language were available:

  • By failure to configure preferred languages at all.
  • When configuring only languages for which there are no translations available.
  • When using a browser on a foreign device (e.g. in an Internet Cafe).

In these cases she will get either the default tracker language or English (as shown in Stickiness In Navigation) or an available language whereby any or all of them might be unknown to her.

colletion of national flags

Language Controls

The common solution is to include ‘language controls’ on some or all pages, where the user can switch to any of the available languages. Most likely you know these web sites with lots of little country flags representing the language to switch to. Ever wondered how people with visual impairment handle this?

The default Roundup tracker templates do not provide such controls and there is one piece of information missing to satisfactorily implement them. At the moment there is no provision to get a list of all available translations inside a tracker template.

You can and should, of course, define a template variable with a hand-crafted list of available languages and use it in a drop-down or a language navigation bar to provide this feature.

User Account Language Preference

Besides Stickiness In Navigation, logged in users might want to override their browser language choice too, if any, or get a different than the tracker language as default.

This can be done by adding a preferred language property to the user class. It is to be researched, how to make use of this information to set this user language upon log in.

Other Bits and Pieces

File Upload Buttons

The button used to select files for upload is provided by the browser and can neither be styled, nor can the text on the button set. This comes natural and by html spec, as the browser also indicates if a file has already be selected or not.

In my case, I get a complete German user interface on roundup, with the only exception of the “Browse…” button, because I usually install my operating system (and so the browser) with English user interface language.

A very complete, though involved solution to style and potentially translate these

 <input ... type="file"> 
elements is provided by Osvaldas Valutis in a very well crafted tutorial: Styling & Customizing File Inputs the Smart Way.

Roundup Internal Strings

I have found at least one instance of a string in the Roundup sources, which is not subjected to gettext processing. It is inside a javascript function, so I guess it is challenging to do it there.

There are other user interface elements where no translation appears, however the python code reveals that they are in fact gettexted. You can work around this, by providing an extra file in the html sub directory, which is used to hold translations of strings not extracted (or extractable) by roundup-gettext.

I have named this file po.html (must be *.html!) and e.g. the entry:

<x i18n:translate="">Submit Changes</x>

has given me a translated submit button on the issue item template, which is produced by the opaque construct:

<span tal:replace="structure context/submit">_submit_button_</span>

Summary

We have shown, how to add multilingual capabilities for html5 templates to Roundup without patching the software.

There remain several shortcomings, which cannot be resolved without changing the source, however.

It would be helpful for the internationalization of the Roundup tracker, to provide the following information to the html templates:

  • page language .. the selected language for rendering a page.
  • translations .. a list of available languages.

The behavior with respect to the fallback language should be improved.

Some translation strings of code generated elements are not automatically included in the messages.pot file generated by roundup-gettext, other such elements are missing gettext processing still.

User Account Language Preference is still to be researched.

Nice to haves: