What language is this page?
One internal goal of the gia project is to showcase an HTML5-compliant Roundup tracker template.
This raises the need to determine the language used to render the respective template, since we need to fill in the html tag’s lang attribute:
<tal:block metal:define-macro="">
<!DOCTYPE html>
<html lang="??">
...
While not documented, it turns out that in the end it is neither hard nor overly involved to find out this information inside a template. You have to do your homework though.
In this article, I will first share the recipe, then expand on HTTP language negotiation and Roundup’s approach to it. Finally I will elaborate briefly on further aspects of internationalization of Roundup trackers.
The Recipe
Preamble
Amongst others, we use the following configuration settings in our trackers:
[tracker]
...
language = de_AT
[web]
...
use_browser_language = yes
...
This means that the tracker’s web interface tries to switch to the language preference indicated by the web browser of the visiting user. If none of the preferred languages matches, the web interface will be presented in (Austrian) German.
If your tracker is monolingual or you do not allow the language to be switched by the browser preference, there is no need to use the following recipe. If you do switch languages via the web user interface though, just use the request/language path in the templates.
TAL Python Extension
Create the file page_language.py with the following contents in the extensions sub directory of your tracker:
# return the selected page language
def page_language(i18n):
    if 'language' in i18n.info():
        return i18n.info()['language']
    else:
        return 'en'

def init(instance):
    instance.registerUtil('page_language', page_language)
Prepare .po Files Correctly
By convention all .po files contain a translation for the empty string on top, which is in the form of an RFC 822 header. Make sure that in all .po files this ‘info’ block looks like the following.
"Project-Id-Version: 1\n"
"POT-Creation-Date: Sat Dec 10 21:53:51 2016\n"
"PO-Revision-Date: 2016-12-20 10:18-0600\n"
"Last-Translator: Georg Lehner <email@suppressed>\n"
"Language-Team: German\n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: talgettext.py 1.5.0\n"
The critical part for our objective is the line:
"Language: de\n"
Note: Content-Type and Content-Transfer-Encoding are the other two required lines for a working translation.
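Checking many .po files by eye is tedious, so the check can be scripted. The following is a minimal sketch and not part of Roundup: the helper names header_fields and missing_headers are hypothetical, and the code assumes that the header is the first msgstr in the file, written as one quoted line per field as in the block above.

```python
# Fields this article relies on; the names come from the header block above.
REQUIRED = ("Language", "Content-Type", "Content-Transfer-Encoding")


def header_fields(po_text):
    """Return the fields of the header entry of a .po file as a dict."""
    fields = {}
    in_header = False
    for raw in po_text.splitlines():
        line = raw.strip()
        if not in_header:
            # Assumption: the first msgstr belongs to the empty msgid.
            if line.startswith("msgstr"):
                in_header = True
            continue
        if not line.startswith('"'):
            break  # first non-quoted line ends the header entry
        content = line.strip('"')
        if content.endswith("\\n"):
            content = content[:-2]  # drop the literal backslash-n
        if ":" in content:
            key, value = content.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def missing_headers(po_text):
    """List the required header fields that the .po text does not define."""
    fields = header_fields(po_text)
    return [name for name in REQUIRED if name not in fields]


SAMPLE = r'''msgid ""
msgstr ""
"Project-Id-Version: 1\n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Submit Changes"
msgstr ""
'''

print(missing_headers(SAMPLE))  # ['Content-Transfer-Encoding']
```

To check a tracker, read each file in the locale sub directory and pass its text to missing_headers.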
Gratuitous English .po File
You must have an en.po file in the locale sub directory of your tracker, although it need not contain any translation. Create it by copying the messages.pot file to en.po and treat it as described in Prepare .po Files Correctly.
Use the Page Language In Your Template
At any place in your TAL template file where you need to know the selected page language, use the Python extension as in the following example:
<tal:block metal:define-macro="">
<!DOCTYPE html>
<html tal:attributes="lang python:utils.page_language(i18n)">
...
This is just the way I use it for gia’s HTML5-compliant template mentioned at the beginning of this article.
What the Heck?
HTTP Language Negotiation
By the very nature of the Internet, web sites have an international audience. The means provided by the web standards to adapt a web site to the language capabilities of a visiting user is the HTTP header Accept-Language, which is defined in RFC 7231.
The user of the browser (you!) sets up one or more languages in order of preference. The browser then sends the Accept-Language header with the list of languages with every request to the web servers.
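To make the header format concrete, here is a minimal sketch of how a server could turn an Accept-Language value into a list of language tags ordered by preference. It is not Roundup’s parser; the function name parse_accept_language is hypothetical, and the sketch handles only the common tag;q=value form.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag:
            continue
        quality = 1.0  # a tag without a q parameter has full weight
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:  # q=0 marks a language as not acceptable
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


print(parse_accept_language("en-US,en;q=0.8,de-AT;q=0.9,de;q=0.7"))
# ['en-US', 'de-AT', 'en', 'de']
```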
Roundup Internationalization
With the configuration line:
use_browser_language = yes
Roundup dissects the list of preferred languages and converts the language tags (RFC 5646) into a list of gettext locale codes ordered by the user’s preference, highest priority first. This list is then expanded by adding the main language tag for each sub-tagged language, if the main language is not already in the list. Finally the list is iterated through, trying to find a translation file on each turn. The first file found is used to translate the strings in the page. If no matching file is found, the configured tracker locale is used.
The translation file is read in and represented by the template translation object i18n, with the well documented methods and template invocations. This process makes use of the Python gettext module.
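The selection process described above can be sketched as follows. This is an illustration of the description and not Roundup’s code: the function names are hypothetical, the set of available translations is passed in instead of being looked up on disk, and the exact position where the main language is inserted is an assumption.

```python
def to_locale_codes(tags):
    """Convert tags like 'de-AT' to locale codes like 'de_AT' and add
    missing main languages. Longer tags (script or variant subtags) are
    not handled properly by this sketch."""
    codes = []
    for tag in tags:
        main, _, region = tag.partition("-")
        if region:
            codes.append(main.lower() + "_" + region.upper())
        else:
            codes.append(main.lower())
    expanded = []
    for code in codes:
        if code not in expanded:
            expanded.append(code)
        main = code.split("_")[0]
        # Assumption: a missing main language goes right after its variant.
        if main not in codes and main not in expanded:
            expanded.append(main)
    return expanded


def choose_locale(codes, available, tracker_language):
    """Return the first code with a translation, else the tracker language."""
    for code in codes:
        if code in available:
            return code
    return tracker_language


codes = to_locale_codes(["en-US", "de-AT"])
print(codes)  # ['en_US', 'en', 'de_AT', 'de']
print(choose_locale(codes, {"de", "en"}, "de_AT"))  # en
print(choose_locale(codes, {"de"}, "de_AT"))  # de
print(choose_locale(to_locale_codes(["fr"]), {"de"}, "de_AT"))  # de_AT
```

The third call shows what happens when English is preferred but no English translation is available: the next matching language wins.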
Notes:
While the algorithm used provides for a correct ordering, it can probably generate incorrect or unusable locale codes, since a variant subtag given by the browser will not be represented in the way locale codes are defined. I guess, however, that in practice this has no relevance, unless you go to the lengths of creating different translations for specific dialects or variants of the same main language on your Roundup tracker.
Probably Roundup should record the selected locale in a template variable, e.g. page_language, by itself, making this blog post completely obsolete.
The Trick to Get The Page Language
The Python gettext module apparently reads the translation of the null string into the _info variable and provides its content via the info() method of the NullTranslations class in the form of a dictionary.
This dictionary is even available as template path i18n/info. A convenience mapping to the values as in i18n/info/language is however not implemented.
The reason why we cannot use the template expression:
python:i18n.info()['language']
directly is that if no language is negotiated or the Language: line is not present in the .po file, a KeyError exception is raised.
Our page_language TAL Python Extension takes care of this error.
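This behaviour can be reproduced outside Roundup with the standard library alone. The sketch below builds a tiny compiled catalog in memory that holds only the header entry and feeds it to gettext.GNUTranslations. The build_mo helper is hypothetical and written just for this demonstration; a real tracker uses the compiled .mo files from its locale sub directory. Note that gettext stores the header field names in lower case, which is why the key is 'language'.

```python
import gettext
import io
import struct


def build_mo(header):
    """Build a minimal little-endian .mo file with only the header entry."""
    msgid = b""
    msgstr = header.encode("utf-8")
    # Layout: 28 byte file header, two 8 byte tables, then the strings.
    msgid_offset = 28 + 8 + 8
    msgstr_offset = msgid_offset + len(msgid) + 1
    data = struct.pack("<7I", 0x950412DE, 0, 1, 28, 36, 0, 0)
    data += struct.pack("<2I", len(msgid), msgid_offset)
    data += struct.pack("<2I", len(msgstr), msgstr_offset)
    data += msgid + b"\0" + msgstr + b"\0"
    return data


# The extension function from the recipe, repeated to keep this runnable.
def page_language(i18n):
    if 'language' in i18n.info():
        return i18n.info()['language']
    else:
        return 'en'


header = (
    "Language: de\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
)
german = gettext.GNUTranslations(io.BytesIO(build_mo(header)))
empty = gettext.NullTranslations()  # stands in for "no file was read"

print(german.info()["language"])  # de
print(empty.info())  # {}
print(page_language(german), page_language(empty))  # de en
```

Accessing empty.info()['language'] directly raises the KeyError mentioned above.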
English Is Not English
The next peculiarity is that even if my browser’s language preference is set to ‘English, then German’, I get the German translation. If you re-read the notes about Roundup Internationalization you will see that if there is no en.po file (or more precisely: no en.mo file), the configured tracker language will be used. This happens even if ‘en’ is on the language list and we have no need to translate any string.
Furthermore, if the English language is explicitly requested via a @language=en element in the HTTP query string (more on this later), the i18n translation object has no file from which to read the _info variable, and in consequence the dictionary returned by i18n.info() is empty.
Both issues are fixed by providing the Gratuitous English .po File.
Further Notes on Internationalization
A Roundup tracker typically falls into the class of ‘Multilingual, same content’ websites. Just the user interface is translated into the respective user’s language.
W3C’s masterly article about language negotiation highlights several aspects from the usability view. Most are dealt with already in Roundup.
The first is Language negotiation via user configured preferences: This is what we have discussed until now. What follows are some thoughts on the other aspects of internationalization of web sites.
User-Agent Header Heuristics
The above mentioned article suggests one could guess the browser user interface language from the User-Agent HTTP header, in case no Accept-Language header is sent.
A quick Internet search seems to indicate that a lot of browsers do not include the language in the User-Agent header and that programmers seldom (or never?) use this heuristic at all. Roundup currently does not look at this header.
Stickiness In Navigation
The language can be set directly via the query string element @language. It sets the cookie roundup_language to the selected locale code. When the cookie is found, it is used to set the template’s language variable and to select the respective translation.
Two issues arise:
- If @language is set to a language for which you do not provide a translation file, English is chosen. Arguably one would expect the site to be rendered in the tracker language in this situation.
- Setting a cookie is considered a privacy leak. The W3C site, e.g., asks you gently via a pop-up if you want to set a cookie for persisting your language choice. If you don’t, every next page falls back to language negotiation.
Bad or No User Languages
In several scenarios a user might get a page with a language she does not know, even if in some cases a known language were available:
- By failure to configure preferred languages at all.
- When configuring only languages for which there are no translations available.
- When using a browser on a foreign device (e.g. in an Internet Cafe).
In these cases she will get either the default tracker language or English (as shown in Stickiness In Navigation) or an available language whereby any or all of them might be unknown to her.
Language Controls
The common solution is to include ‘language controls’ on some or all pages, where the user can switch to any of the available languages. Most likely you know these web sites with lots of little country flags representing the language to switch to. Ever wondered how people with visual impairment handle this?
The default Roundup tracker templates do not provide such controls and there is one piece of information missing to satisfactorily implement them. At the moment there is no provision to get a list of all available translations inside a tracker template.
You can and should, of course, define a template variable with a hand-crafted list of available languages and use it in a drop-down or a language navigation bar to provide this feature.
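One way to provide such a hand-crafted list is another small file in the extensions sub directory, following the same pattern as page_language.py. This is only a sketch: the function names available_languages and language_link and the two list entries are hypothetical, and the list has to be kept in sync by hand with the translation files you actually ship.

```python
# Hand-maintained list of (locale code, display name) pairs.
# Assumption: these two entries are examples, adapt them to your tracker.
AVAILABLE_LANGUAGES = [
    ("de", "Deutsch"),
    ("en", "English"),
]


def available_languages():
    """Return the languages to offer in the language controls."""
    return list(AVAILABLE_LANGUAGES)


def language_link(url, code):
    """Append the @language query string element to url."""
    separator = "&" if "?" in url else "?"
    return url + separator + "@language=" + code


def init(instance):
    instance.registerUtil('available_languages', available_languages)
    instance.registerUtil('language_link', language_link)
```

In a template, python:utils.available_languages() would then supply the entries for a drop-down or a language navigation bar, with text labels instead of flags.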
User Account Language Preference
Besides Stickiness In Navigation, logged in users might also want to override their browser language choice, if any, or get a default other than the tracker language.
This can be done by adding a preferred language property to the user class. How to make use of this information to set the user language upon log in is still to be researched.
Other Bits and Pieces
File Upload Buttons
The button used to select files for upload is provided by the browser and can neither be styled, nor can the text on the button be set. This comes naturally and by the HTML spec, as the browser also indicates whether a file has already been selected or not.
In my case, I get a complete German user interface in Roundup, with the only exception of the “Browse…” button, because I usually install my operating system (and so the browser) with English user interface language.
A very complete, though involved, solution exists to style and potentially translate these <input ... type="file"> elements.
Roundup Internal Strings
I have found at least one instance of a string in the Roundup sources which is not subjected to gettext processing. It is inside a JavaScript function, so I guess it is challenging to do it there.
There are other user interface elements where no translation appears, although the Python code reveals that they are in fact passed through gettext.
You can work around this by providing an extra file in the html sub directory, which is used to hold translations of strings not extracted (or extractable) by roundup-gettext.
I have named this file po.html (must be *.html!) and e.g. the entry:
<x i18n:translate="">Submit Changes</x>
has given me a translated submit button on the issue item template, which is produced by the opaque construct:
<span tal:replace="structure context/submit">_submit_button_</span>
Summary
We have shown how to add multilingual capabilities for HTML5 templates to Roundup without patching the software.
There remain several shortcomings, however, which cannot be resolved without changing the source.
It would be helpful for the internationalization of the Roundup tracker to provide the following information to the html templates:
- page language .. the selected language for rendering a page.
- translations .. a list of available languages.
The behavior with respect to the fallback language should be improved.
Some translation strings of code generated elements are not automatically included in the messages.pot file generated by roundup-gettext; other such elements are still missing gettext processing.
User Account Language Preference is still to be researched.
Nice to haves:
- A template with demo Language Controls.
- Some JavaScript to warn the ‘anonymous user’ before setting the roundup_language cookie.
- User-Agent HTTP header processing.