Welcome to the MagmaSoft Tech Blog.

We publish short technical stories as we explore, expand and fix our technical environment.

Georg Lehner
The Only Way For SSH Forced Commands

Secure Shell is a wonderful tool for automated or interactive passwordless remote access, but it is not easy to get security right with this setup.

Out of the box you have two options: either you allow complete shell access to the remote system, or you restrict access to just one (1) specific command line.

Several proposals can be found on the Internet which show how to solve various different use cases of remote access where something in between these two extremes is required.

One popular use case is automated backups, mirror scripts - often involving rsync - or access to revision control systems and the like. Joey Hess' article gives an overview and points out potentially insecure solutions.

Another use case is to allow (real) remote users to run one out of several possible commands; see this bin blog post for a simple example. An interesting way to generalize this by use of a dedicated tools directory is described in this StackExchange article, along with a security related discussion.

Let's cover both classes of use cases with just one script which can serve any user account on the server.

In comes 'only'.

The following sections will talk you through the conception and rise of 'only'. If you don't want to listen to the whole biography of 'only', just skip down to The Grown Up 'only'.

The one and 'only'

'only' is a shell script which allows only a specific command to be run. Let's look at the embryonic 'only' script:

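#!/bin/sh
# the single allowed command is given as argument in authorized_keys
allowed=$1
# replace our own arguments with the command line sent by the client
set -- $SSH_ORIGINAL_COMMAND
if [ "$1" = "$allowed" ]; then
    exec "$@"
fi
echo you may only $allowed, denied: $@ >&2
exit 1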

We copy 'only' into the PATH of a remote host and put it in front of an ssh key in the authorized_keys file like this:

command="only ls" ssh-rsa AAAA...

Let's see what this does when we access the remote host with, e.g.

ssh user@remote.host ls /bin
  1. 'only' is run with the following command line: ls
  2. We save $1, the only allowed command, in the shell variable allowed.
  3. We replace the command line with the one originally sent by the client, which sshd provides in the environment variable SSH_ORIGINAL_COMMAND; in this case it is ls /bin.
  4. Now $1 holds the first token of the original command line: ls.
  5. $allowed = $1 = ls, they are the same, so we
  6. execute the entire command line "$@", ls /bin in our case, which prints all files inside /bin and exits.

You might want to review the sh(1) man page if these dollars, brackets and "$@" warts look strange to you. The "ssh" and "remote" talk is explained in the sshd(8) manual page.

If we use any other command, like rm -rf / we get:

  1. only is run with: ls
  2. $allowed = ls, $1 = rm, they are different so we
  3. print a diagnostic message on stderr(3) and exit with a failure code of 1 so that the sending end gets notified.

Bingo: only the one explicitly allowed command can be run, but we can now give it any arguments we want. This helps us with the second use case, but is rather insecure. Consider running rm -rf / on a command="only rm" ... setup in the root account...

But let's look into this later (security comes, ahem ... late, right?)

Not just 'only' one

With some care, 'only' grows a little bit and is now able to handle more than one command:

#!/bin/sh
cmds="$@"
set -- $SSH_ORIGINAL_COMMAND
for allowed in $cmds; do
    if [ "$allowed" = "$1" ]; then
        exec "$@"
    fi
done    
echo you may only $cmds, denied: $@ >&2
exit 1

I take the liberty here of mimicking the example of the bin blog article. The line in the authorized_keys file might now look like this:

command="only ps vmstat cups"

Please don't try this at home yet; rather follow me as I mentally run

ssh user@remote.host cups stop
  1. 'only' receives the following command line ps vmstat cups which we save in cmds. After the set --, the command line becomes cups stop.
  2. The for loop picks ps as the first value for allowed from cmds.
  3. $allowed = ps, $1 is cups, we fall through the if and enter the next iteration of for.
  4. $allowed becomes vmstat in the second, and then cups in the last iteration of for.
  5. cups is equal to $1, so we execute cups stop.

If we had sent the rm -rf / command, we would have fallen through the loop and printed denied to the user.
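
For those who think better in Python than in sh, the decision taken by this version of 'only' can be modelled as below. This is just a sketch for illustration: the function name decide is made up, and str.split() only mimics the word splitting of the unquoted $SSH_ORIGINAL_COMMAND, while the real shell would additionally expand glob characters.

```python
def decide(allowed_cmds, original_command):
    """Return the argument list to execute, or None if the command is denied."""
    # mirrors: set -- $SSH_ORIGINAL_COMMAND (word splitting only, no globbing)
    argv = original_command.split()
    # mirrors the for loop over $cmds comparing each entry with $1
    if argv and argv[0] in allowed_cmds.split():
        return argv
    return None

print(decide("ps vmstat cups", "cups stop"))  # -> ['cups', 'stop']
print(decide("ps vmstat cups", "rm -rf /"))   # -> None
```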

Picky 'only'

Youngsters are complicated. So is 'only'. He grows up a little bit and starts to refuse some of the command line arguments thrown at him: he just got picky.

If you are not into adolescent psychology I offer you once again to skip to The Grown Up 'only' section. Go download the adult only script and abuse his worker's rights by making him work for you all day long without paying him a dime. But maybe you are curious about his teenager years. If so, then stay with me. I'll put the young only on the couch and analyze him as quickly as possible.

First a small preamble. Unless we want to write a whole string matching utility library in shell, using variable expansions with their ugly ${17##*/} and whatever other evil constructs, we can just get a little help from a good old friend. I refer to the humble and ubiquitous sed(1) command.

Note that this means introducing (shock) regular expressions (huhh!). Yes, I also would prefer some simple glob(3) like pattern matching as done by sudo(1) in these cases, but this would involve some exotic character like Tcl(n) or friends, and those guys and girls are not always handily available. So let's make these regex(7)es as painless as possible.

We just want to know whether some line of text matches a specific pattern or not. sed(1) is quite reserved if we tell him to be so via the -n flag: he will just swallow all input he can get, just like PAC-MAN, and only talk back if we tell him so with a print command. To match, for example, the line ps we'd use the following rule: \:^ps$:p. The \: .. : construct delimits the matching pattern, ^ is the start and $ the end of the line, and in between is the string we are looking for: ps. If sed(1) -n receives any line of text he will remain quiet, unless the line is exactly ps, in which case he mutters back ps, via the trailing p command. So a match results in an echo of the input, a mismatch in silence.

We'll put the rules to filter the allowed command line in a file in the remote user's home directory and call it ~/.onlyrules. So you can set up 'only' for any number of users and adapt allowed commands and rules individually, no need to change 'only' at all. Nice, isn't it?

To continue with our example we stuff the following lines into ~/.onlyrules:

\:^ps$:p
\:^vmstat$:p
\:^cups \(stop\|start\)$:p

The last line is somewhat sophisticated: it allows either 'stop' or | 'start' as arguments. The alternatives are bound together by the () parentheses; our ornamentalophile sed(1) wants them all to be prefixed with a \ backslash. (The \| alternation is a GNU sed extension to basic regular expressions.)
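
If sed(1) still feels alien, the effect of these three rules can be mimicked in Python. Take it as a rough sketch only: Python's re dialect writes the parentheses and the | without backslashes, and the names RULES and sed_n are my invention, not part of 'only'.

```python
import re

# Python-dialect renderings of the three lines in ~/.onlyrules
RULES = [r"^ps$", r"^vmstat$", r"^cups (stop|start)$"]

def sed_n(line):
    """Mimic `sed -n` with one `p` rule per pattern: echo the line for each matching rule."""
    return [line for rule in RULES if re.search(rule, line)]

print(sed_n("cups stop"))    # -> ['cups stop']
print(sed_n("cups status"))  # -> []
print(sed_n("ps -ef"))       # -> []
```

An empty result corresponds to the silence of sed(1), which 'only' will treat as "denied".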

And here is our 'only' youngster.

#!/bin/sh
cmds="$@"
set -- $SSH_ORIGINAL_COMMAND
for allowed in $cmds; do
    if [ "$allowed" = "$1" ]; then
        if [ -z "$(echo $@ | sed -nf ~/.onlyrules)" ]; then
            break
        fi
        exec "$@"
    fi
done
echo you may only $cmds, denied: $@ >&2
exit 1

You already see the pimples, the attitude?!

When 'only' sees an allowed command, let's say cups, it shouts the whole command line over to sed(1). sed(1) takes line by line of ~/.onlyrules, compares it with the input and shouts it back on stdout(3) if it matches. Consider the last run of the for loop (with cups as $allowed command). Suppose we sent cups stop to the remote host.

  1. cups is allowed, so we hand the command line over to sed(1).
  2. cups stop matches the last line in ~/.onlyrules, so the output of the $() command substitution is cups stop, which is not a zero length string (-z). Thus we skip the break and
  3. we exec the command line. Done!

Now we test the other way round. Let's run cups status:

  1. cups is allowed.
  2. cups status does not match any line in ~/.onlyrules, sed(1) does not shout anything back to us and the command substitution is the zero length string "".
  3. break breaks out of the for loop and
  4. we deny the command.

So 'only' can now do everything that was promised in the introduction (everything?):

  • We can lock down a remote account to one command, but with (controlled) arbitrary arguments.
  • We can enable a set of allowed commands for a remote account.
  • We can adapt the behavior for any number of remote accounts, without changing the 'only' script.

'only', however, is still a very fragile teenager. Don't entrust your servers to him yet; better wait for him to mature.

Why not this cute simple 'only'?!

Verbosity

Although it is perfectly understandable that any unacquainted user wants to know why, oh dear, the command was not accepted by the remote host, we wouldn't want to give too many hints to the unacquainted attacker who is just trying to get into this remote server by means of a stolen ssh key and is eager for any useful information. Especially in non-interactive scenarios we'd just leave him wondering why.

The Grown Up 'only' allows you to tune him from complete silence up to idiotic verbosity towards the invoking user.

Accountability

Until now nobody notices when, what and for whom 'only' is working. You'll just get a short log message from ssh itself, informing that somebody connected via ssh(1) to a specific account on your remote host.

The Grown Up 'only' uses logger(1) to tell us what command line has been run by which user at the auth.info level, and what command line has been denied for which user at the auth.warn level, so we can sort things out while struggling to forge these rules and after that, in production use, find abusers.

Absolute command paths

If a user, human or automated, e.g. sends /usr/bin/vmstat instead of vmstat to the remote host, we still would like the command to be executed even if we only allowed vmstat. Our stubborn teenager 'only' would reject this command because of his simplistic equality match.

The Grown Up 'only' has already gained some tolerance towards his peers. He patiently looks up the allowed command in the user's PATH and compares the result with the given command in case the latter comes in with an absolute path. Thus only commands inside the user's PATH are allowed. If you want to lock down commands to a specific directory, put it as the only directory into the PATH environment variable for this user (sshd(8) can help you with this). Then set LOGGER, WHICH and SED at the top of the 'only' script to the full paths of the respective programs on your remote host.

You can also allow commands outside of PATH, by specifying them with an absolute path in the authorized_keys file. In this case, The Grown Up 'only' requires an exact match with the sent command, but does not require it to be in the PATH.

Smarter command line matching

Since commands can come in either with absolute or relative paths, the rules for the command line filter would have to take this into account and would become complicated, difficult to read and therefore error prone.

To make it easier to write command line filters, the lines sent to sed(1) are instead composed of the $allowed command in question followed by the command line arguments (stripping off the notation of the command as actually sent). Returning to the previous example, the line sent to sed(1) in the second iteration of the for loop would be vmstat and not /usr/bin/vmstat, and thus will match with The Grown Up 'only' but not with Picky 'only'.
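
The lookup and the rewriting of the line handed to sed(1) could be sketched like this in Python. This is my reading of the description above, not a transcription of the script: rule_input is a hypothetical name, and the which parameter stands in for the WHICH program so that the example does not depend on what is installed on your machine.

```python
import shutil

def rule_input(allowed, argv, which=shutil.which):
    """Return the line to filter, or None if argv[0] is not the allowed command."""
    sent = argv[0]
    if sent != allowed:
        # a differing name is accepted only as the absolute path the PATH lookup yields
        if not sent.startswith("/") or which(allowed) != sent:
            return None
    # the rules always see the allowed name, never the notation actually sent
    return " ".join([allowed] + argv[1:])

fake_which = lambda name: "/usr/bin/" + name  # stand-in for a real PATH lookup
print(rule_input("vmstat", ["/usr/bin/vmstat", "1"], fake_which))  # -> vmstat 1
print(rule_input("vmstat", ["/tmp/vmstat", "1"], fake_which))      # -> None
```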

Quoting hell

When starting to grow 'only' I used the youngster to restrict access to a darcs repository, only to find out, that darcs sends the repository directory path single quoted '' when creating the repository but without quotes when getting it. With exec "$@" the repository directory repo gets created as 'repo' on the remote host, and thus becomes inaccessible to the other darcs commands, which naturally expect it to be repo.

The Grown Up 'only' therefore does eval "$@", which parses away the quotes.

Note, however, that I fear other command line constructs might now fail horribly, or that evil forces might find out how to take advantage of quoting hell to inject disaster into your server.
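
The difference between the two behaviours can be watched from a safe distance in Python, where shlex.split() parses quotes roughly the way a POSIX shell does, without executing anything. The command line below is made up for the example and is not what darcs actually sends.

```python
import shlex

sent = "some_command --dir 'repo'"   # hypothetical command line with a quoted path

# plain word splitting, like exec "$@" after set --: the quotes survive
print(sent.split())        # -> ['some_command', '--dir', "'repo'"]
# shell-like parsing, like eval: the quotes are consumed
print(shlex.split(sent))   # -> ['some_command', '--dir', 'repo']
```

Unlike shlex.split(), eval also honours ;, backticks and friends, which is exactly why the rules in ~/.onlyrules should not let such characters through.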

The Grown Up 'only'

Installation and basic configuration

Please download the only script and the example rules file. Both start with an explanation on how to use and configure them, please read these comments in place of a manual. An example ~/.onlyrc is available too.

You can also get 'only' from my public darcs repository. You don't need darcs(1) for this, just wget(1), curl(1) or your browser.

1. Put the 'only' script into a location accessible to all users on the remote host, e.g. into /usr/local/bin.

2. Create an ssh key pair. For a start, the following command line will create the files only_key and only_key.pub without a pass phrase for you.

    ssh-keygen -P "" -f only_key

3. Copy only_key.pub to authorized_keys, and prefix:

    command="only ls grep who",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding

    to the first and only line, leaving a space before the ssh-rsa AAAA... part. Of course, instead of ls grep who, you'll put in the command(s) you want to allow on the remote host.

4. Install this authorized_keys file in the .ssh sub directory of the home directory of the user account on the remote host which should run the respective commands. You might want to deny the user account write permissions on the file.

5. Copy .onlyrules into the home directory of the same user account on the remote host and adapt it to your needs. See Writing 'only' rules for some tips.

6. You are done with this user account. Repeat, starting from "2. Create an ssh key pair." as often as needed for this remote host.

7. You are done with this remote host. Repeat from "1. Put the 'only' script" for any further remote host you want to access.

Writing 'only' rules

Always consider locking down the command line to match precisely the wanted alternatives only. While options will likely come in a fixed set of variations, arguments like file paths or user names might vary considerably and unforeseeably. For paths you might consider requiring a given prefix and disallowing dot-dot .. so attackers or scripts gone mad can't break out of their allowed realm.

You surely have rules for how a user name may be constructed on your remote host: minimum/maximum length and a set of allowed characters come to mind. Create the respective sed(1) rules for these, and check that they don't allow white space, comment characters or escape characters in between.

Always filter on the whole command line, that is, make the filter have a ^ at the start and a $ at the end; otherwise anybody can prepend or append arbitrary strings and thus circumvent your allowed command list.
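
To see why the anchors matter, here is a small sketch with Python's re module standing in for sed(1). The patterns and the helper path_ok are invented for this example, and the /var/log/ prefix is just an assumption.

```python
import re

line = "new_kid --stats; rm -rf /"
print(bool(re.search(r"new_kid --stats", line)))    # -> True, the tail slips through
print(bool(re.search(r"^new_kid --stats$", line)))  # -> False

def path_ok(path, prefix="/var/log/"):
    """Accept only paths below prefix, made of harmless characters and without '..'."""
    if not re.fullmatch(re.escape(prefix) + r"[A-Za-z0-9_./-]+", path):
        return False
    return ".." not in path.split("/")

print(path_ok("/var/log/new_kid.log"))       # -> True
print(path_ok("/var/log/../../etc/passwd"))  # -> False
print(path_ok("/etc/passwd"))                # -> False
```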

All that said, you might not know all possible variants of the invocation of a command in advance and/or are too lazy to figure it out beforehand. Shame on you, but anyway... lock down the command (let's name it new_kid for just another example), then start with a completely open filter like this:

\:^new_kid:{p;q}

You did note the missing $ at the end of the command string, didn't you?

Now run all variants of new_kid's invocations you can imagine, or let it run for a day or so and then gather the results from syslog(3).

Let's say that new_kid gives us allowed command lines like the following:

new_kid --server -P ./
new_kid --cleanup -P ./
new_kid --stats -P ./ /var/log/new_kid.log
new_kid --discard /var/log/new_kid.log.10
new_kid --rotate /var/log/new_kid.log.9
...
new_kid --rotate /var/log/new_kid.log

Then we could consider a filter like:

\:^new_kid --\(server\|cleanup\|stats\) -P \./\( /var/log/new_kid\.log\)\?$:{p;q}
\:^new_kid --\(rotate\|discard\) /var/log/new_kid\.log\(\.[1-9]0\?\)\?$:{p;q}

or just stop that regex(7) head pain and pack all found lines literally between \:^ and $:{p;q} and you are done.
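
Before trusting such a filter, it does not hurt to run the collected command lines through it. The following sketch does so with Python's re module; the two patterns are my own Python-dialect rendering of what the rules are meant to accept (anchored at both ends), and the sample lines are a selection of the ones listed above.

```python
import re

# assumed Python renderings of the two filter rules
FILTERS = [
    r"^new_kid --(server|cleanup|stats) -P \./( /var/log/new_kid\.log)?$",
    r"^new_kid --(rotate|discard) /var/log/new_kid\.log(\.[1-9]0?)?$",
]

def allowed(line):
    """True if at least one filter accepts the whole command line."""
    return any(re.search(f, line) for f in FILTERS)

samples = [
    "new_kid --server -P ./",
    "new_kid --stats -P ./ /var/log/new_kid.log",
    "new_kid --discard /var/log/new_kid.log.10",
    "new_kid --rotate /var/log/new_kid.log.9",
]
for s in samples:
    print(allowed(s), s)   # each sample is expected to be accepted

print(allowed("new_kid --rotate /etc/passwd"))  # -> False
```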

If you did not get all possible invocations in the first run, you will get the others as denied in your logs. Watch out for lost ones after a month or so and then after a year. (Just kidding, you know your monthly and yearly scripts well, don't you?).

Finally note that shell magic is helping you when writing filters, since white space between arguments is reduced to exactly one space.

Substitution rules

Attentive readers have already noticed that our Picky 'only' is not an equivalent replacement for the example in the bin blog post. If, for example, we send cups start to the remote server, the command /etc/init.d/cupsys start should be executed instead.

Well, while I find this startling from a security point of view, I needed to support this capability to keep my word on the claim in the introduction.

Create a file ~/.onlyrc in the home directory of the user running 'only' and write enable_command_line_substitution on a line by itself. From this moment on, 'only' will replace the original command line it was sent with the string printed out by sed(1). In the rules file, replace \:^cups \(stop\|start\)$:p with the following monster:

\:^cups \(stop\|start\)$:{
    s:^cups \(.*\):/etc/init.d/cupsys \1:p
    q
}   

sed(1)'s substitute command will replace cups with /etc/init.d/cupsys and the \1 placeholder with whatever command line option ('stop' or 'start') it encounters within the \(.*\) parentheses.
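
The same guard-then-substitute idea, spelled out in Python for comparison. Again only a sketch: rewrite is a hypothetical helper, and Python's re syntax drops the backslashes in front of the parentheses.

```python
import re

def rewrite(line):
    """Return the substituted command line, or None if the guard does not match."""
    # the address part of the rule: only 'cups stop' or 'cups start' get through
    if not re.search(r"^cups (stop|start)$", line):
        return None
    # the s command: keep the argument, swap the command
    return re.sub(r"^cups (.*)", r"/etc/init.d/cupsys \1", line)

print(rewrite("cups stop"))    # -> /etc/init.d/cupsys stop
print(rewrite("cups status"))  # -> None
```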

This is a contrived example. The bin blog example replaces ps with ps -ef. We don't need a substitution here; instead we use:

\:^ps$:{
    c\
ps -ef
    q
}

Well, this looks ugly. But hey! The c\ command puts out the subsequent line, which is ps -ef, and omits the input completely. We must put the q command on its own line so that it does not get appended too. c\ allows us to write long, complicated command lines easily.

Look! You can go wild on sed(1) and invent your super-hyper-uber substitution sed(1) programs if you want to; people even do math with it! But once again: don't do command line substitutions, for the sake of your own mental health and your server's integrity.

Verbose feedback

By default 'only' just exits with a failure code if a command is not allowed to run.

The ~/.onlyrc file can be used to make 'only' chatty about denied commands. You can:

  • Show an enigmatic denied to the user with show_terse_denied.
  • Show the allowed command to the user with show_allowed.
  • Show the exact denied command line with show_denied.
  • Print out a complete manual by appending text in the ~/.onlyrc file after help_text.

The provided example ~/.onlyrc file illustrates and documents all of these options. If you want to mimic the bin blog example your ~/.onlyrc would look like:

show_allowed
help_text
Sorry. Only these commands are available to you.

Security considerations

As I told you before, security comes late, right? Now, a serious review on security related topics is way out of the scope of this article. I just want to throw in two thoughts, or three, or four...

Any command which just reads from the remote system (files, process listings, kernel or interface stats, etc.) can be abused to gain insight into the system (for later hacking it) or to obtain information which might be private (user data, like credit card numbers or passwords, or habits like login statistics, emails, ...). One of the objectives of running commands via ssh with public/private key authentication is to restrict them to user accounts which don't have excessive rights for obtaining information. 'only' can help you lock down this further, but do your homework on securing the remote host first.

Commands which can write to the remote system or modify any other of its resources (processes, kernel variables, interface settings, etc.) are even more sensitive. Let's start with the possibility of overwriting the settings for 'only', which can be used to gain unrestricted access to the system. But the same principle as before applies: if the user account is already 'secure', an attacker cannot go much further.

Additionally, consider resource depletion. Although you can craft a denial of service with read-only access, with "write" access additional risks come into play. Don't allow the user to allocate a lot of processes or disk space, as this can be abused for writing oodles of senseless data to fill up your disk just to annoy you, or better, for storing images and videos with disputable content for gratuitous distribution from your server. This is where quotas and limits come in; start with quota(1) and prlimit(1) if you are on Linux and want to go into further detail.

Note that a lock down script like 'only' is just one concept for ssh security. You might get yourself a restricted shell and cast that upon the remote user, as indicated in this article for rsync. A funny article by Doug Stilwell does not inspire confidence in the security of the restricted rbash, though. Considerations similar to those in that article might apply to other restricted shells, and of course they do to 'only'.

Postamble

Why did I conceive 'only'? I wanted to set up an unprivileged account for managing a private darcs(1) repository for distributing configuration data of my servers. It did not seem right to me to allow complete shell access for this, so I soon stumbled upon the issue of the inflexible forced command in ssh. When I came up with the embryonic 'only' approach I started to look around on the Internet and saw that it had not been proposed yet in this form. With a sudo(1) background of pattern matching on allowed command lines I started to go for the picky 'only'. Writing this article whetted my appetite, and while I don't think I will ever (ab)use 'only' for interactive session lock down, I implemented all the related surplus.

Playing around with the different aged 'only's soon brought me to test driven design. So there are (primitive) test suites available. This is another interesting area, but also quite another story.

Credits for 'only' go to all inspiring inputs, some of them referred to by the external links.

Please give me feedback if you encounter any bug or issue with 'only'. I am especially interested in comments with respect to 'only's (lack of) security.

Georg Lehner
Language Selection in Roundup Templates

What language is this page?

One internal goal of the gia project is to showcase an HTML5 compliant Roundup tracker template.

This raises the need to determine the language used to render the respective template, since we need to fill in the html tag's lang attribute:

<tal:block metal:define-macro="">
  <!DOCTYPE html>
    <html lang="??">
...

While not documented, it turns out that in the end it is neither hard nor overly involved to find out this information inside a template. You have to do your homework though.

In this article, I will first share the recipe, then expand on HTTP language negotiation and Roundup's approach to it. Finally I will elaborate briefly on further aspects of internationalization of Roundup trackers.

The Recipe

Preamble

Amongst others, we use the following configuration settings in our trackers:

[tracker]
...
language = de_AT

[web]
...
use_browser_language = yes
...

This means that the tracker's web interface tries to switch to the language preference indicated by the web browser of the visiting user. If none of the preferred languages matches, the web interface will be presented in (Austrian) German.

If your tracker is monolingual or you do not allow the language to be switched by the browser preference there is no need to use the following recipe. If you do switch languages via the web user interface though, just use the request/language path in the templates.

TAL Python Extension

Create the file page_language.py with the following contents in the extensions sub directory of your tracker:

# return the selected page language
def page_language(i18n):
    if 'language' in i18n.info():
        return i18n.info()['language']
    else:
        return 'en'

def init(instance):
    instance.registerUtil('page_language', page_language)

Prepare .po Files Correctly

By convention all .po files contain a translation for the empty string on top, which is in the form of an RFC 822 header. Make sure that in all .po files this 'info' block looks like the following.

"Project-Id-Version: 1\n"
"POT-Creation-Date: Sat Dec 10 21:53:51 2016\n"
"PO-Revision-Date: 2016-12-20 10:18-0600\n"
"Last-Translator: Georg Lehner <email@suppressed>\n"
"Language-Team: German\n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: talgettext.py 1.5.0\n"

The critical part for our objective is the line:

"Language: de\n"

Note: Content-Type and Content-Transfer-Encoding are the other two required lines for a working translation.

Gratuitous English .po File

You must have an en.po file in the locale sub directory of your tracker, although it need not contain any translation. Create it by copying the messages.pot file to en.po and treat it as described in Prepare .po Files Correctly.

Use the Page Language In Your Template

At any place in your TAL template file where you need to know the selected page language use the python extension as in the following example:

<tal:block metal:define-macro="">
  <!DOCTYPE html>
    <html tal:attributes="lang python:utils.page_language(i18n)">
...

This is just the way I use it for gia's HTML5 compliant template mentioned at the beginning of this article.

What the Heck?

HTTP Language Negotiation

By the very nature of the Internet web sites have an international audience. The means provided by the web standards to adapt a web site to the language capabilities of a visiting user is the HTTP Header Accept-Language which is defined in RFC 7231.

The user of the browser (you!) sets up one or more languages in order of preference. The browser then sends the Accept-Language header with the list of languages with every request to the web servers.

Roundup Internationalization

With the configuration line:

use_browser_language = yes

Roundup dissects the list of preferred languages and converts the language tags into a list of gettext locale codes ordered by the user's preference, highest priority first. Then this list is expanded by adding main language tags to sub-tagged languages, if the main language does not exist. Finally the list is iterated through, trying to find a translation file on each turn. The first file found is used to translate the strings in the page. If no matching file is found, the configured tracker locale is used.
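
To make these steps more tangible, here is how I would sketch them in plain Python. This is not Roundup's code: the function names are invented, the q-value parsing is simplified, and the exact position at which the main language gets inserted is my assumption.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, highest q-value first."""
    entries = []
    for pos, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        params = params.strip()
        q = 1.0                      # a missing q-value means full preference
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        entries.append((-q, pos, tag.strip()))
    # sort by descending q, keeping the header order for equal q-values
    return [tag for _, _, tag in sorted(entries)]

def to_locale_codes(tags):
    """Turn 'de-AT' into 'de_AT' and add the main language where it is missing."""
    codes = []
    for tag in tags:
        lang, _, region = tag.partition("-")
        code = f"{lang.lower()}_{region.upper()}" if region else lang.lower()
        if code not in codes:
            codes.append(code)
    expanded = []
    for code in codes:
        expanded.append(code)
        main = code.split("_")[0]
        if main not in codes and main not in expanded:
            expanded.append(main)
    return expanded

def pick_language(codes, available, tracker_language):
    """Return the first code with a translation file, else the tracker language."""
    for code in codes:
        if code in available:
            return code
    return tracker_language

codes = to_locale_codes(parse_accept_language("en-US,en;q=0.8,de-AT;q=0.6"))
print(codes)                                  # -> ['en_US', 'en', 'de_AT', 'de']
print(pick_language(codes, {"de"}, "de_AT"))  # -> de
```

The second print shows the effect discussed later in this article: without an English translation file, a reader preferring English still ends up with German.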

The translation file is read in and represented by the template translation object i18n, with the well documented methods and template invocations. This process makes use of the Python gettext module.

Notes:

  • While the algorithm used provides for a correct ordering, it probably can generate incorrect or unusable locale codes, since an eventual variant sub tag given by the browser will not be represented in the way locale codes are defined. I guess, however, that in practice this has no relevance, unless you go to the lengths of creating different translations for specific dialects or variants of the same main language on your Roundup tracker.

  • Probably Roundup should record the selected locale in a template variable, e.g. page_language, by itself, obsoleting this blog completely.

The Trick to Get The Page Language

The Python gettext module apparently reads the translation of the null string into the _info variable and provides its content via the info() method of the NullTranslations class in form of a dictionary.

This dictionary is even available as template path i18n/info. A convenience mapping to the values as in i18n/info/language is however not implemented.

The reason why we cannot use the template expression:

python:i18n.info()['language']

directly is that, if no language is negotiated or the Language: line is not present in the .po file, a KeyError exception is raised. Our page_language TAL Python Extension takes care of this error.
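
The failure mode and the guard are easy to reproduce with the gettext module alone. In this sketch gettext.NullTranslations() plays the role of a translation object without any catalog, and the FakeCatalog class is a made-up stand-in for one that carries a Language: header; dict.get() is just a shorter spelling of the if/else in the extension.

```python
import gettext

def page_language(i18n, default="en"):
    """Return the catalog's language, or the default if the header is missing."""
    return i18n.info().get("language", default)

class FakeCatalog:
    """Hypothetical stand-in for a translation object read from a de catalog."""
    def info(self):
        return {"language": "de"}

empty = gettext.NullTranslations()
print(empty.info())                  # -> {}
print(page_language(empty))          # -> en
print(page_language(FakeCatalog()))  # -> de
```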

English Is Not English

The next peculiarity found is that, even if my browser's language preference is set to 'English, then German', I get the German translation. If you re-read the notes about Roundup Internationalization you will see that, if there is no en.po file (or more precisely: no en.mo file), the configured tracker language will be used. This happens even if 'en' is on the language list and we have no need to translate any string.

Furthermore, if the English language is explicitly requested via a @language=en element in the http query string (more on this later), the i18n translation object has no file to read in the _info variable and in consequence the dictionary returned by i18n.info() is empty.

Both issues are fixed by providing the Gratuitous English .po File.

Further Notes on Internationalization

A Roundup tracker typically falls into the class of 'Multilingual, same content' websites. Just the user interface is translated into the respective user's language.

W3C's masterly article about language negotiation highlights several aspects from the usability point of view. Most are already dealt with in Roundup.

Language negotiation via user configured preferences: This is what we have discussed in previous sections of this article.

User-Agent Header Heuristics

The above mentioned article suggests one could guess the browser user interface language from the User-Agent HTTP header, in case no Accept-Language header is sent.

A quick Internet search seems to indicate that a lot of browsers do not include a language in the User-Agent header and that programmers seldom (or never?) use this heuristic at all; nor does Roundup currently look at this header.

Stickiness In Navigation

The language can be set directly via the query string element @language. It sets the cookie roundup_language to the selected locale code. When the cookie is found, it is used to set the template's language variable and to select the respective translation.

Two issues arise:

  • If @language is set to a language for which you do not provide a translation file, English is chosen. Arguably one would expect the site to be rendered in the tracker language in this situation.

  • Setting a cookie is considered a privacy leak. The W3C site, e.g. asks you gently via a pop up if you want to set a cookie for persisting your language choice. If you don't, every next page falls back to language negotiation.

Bad or No User Languages

In several scenarios a user might get a page with a language she does not know, even if in some cases a known language were available:

  • By failure to configure preferred languages at all.
  • When configuring only languages for which there are no translations available.
  • When using a browser on a foreign device (e.g. in an Internet Cafe).

In these cases she will get either the default tracker language or English (as shown in Stickiness In Navigation) or an available language whereby any or all of them might be unknown to her.

Language Controls

The common solution is to include 'language controls' on some or all pages, where the user can switch to any of the available languages. Most likely you know these web sites with lots of little country flags representing the language to switch to. Ever wondered how people with visual impairment handle this?

The default Roundup tracker templates do not provide such controls and there is one piece of information missing to satisfactorily implement them. At the moment there is no provision to get a list of all available translations inside a tracker template.

You can and should, of course, define a template variable with a hand-crafted list of available languages and use it in a drop-down or a language navigation bar to provide this feature.
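As an illustration of such a hand-crafted list, the following sketch builds the data for a text-based language switcher. The LANGUAGES table, the function name language_links and the base URL are assumptions for this example; only the @language query element comes from Roundup. Text labels in the language's own name are used instead of flags, so that screen readers can handle them.

```python
from urllib.parse import urlencode

# Hand-maintained list: locale code -> language name in that language.
# The entries are example values, adapt them to your translations.
LANGUAGES = {"en": "English", "de": "Deutsch", "es": "Español"}


def language_links(base_url, current):
    """Return (label, url, is_current) tuples for a language switcher."""
    links = []
    for code, label in LANGUAGES.items():
        # urlencode percent-encodes the '@', the server decodes it again.
        url = base_url + "?" + urlencode({"@language": code})
        links.append((label, url, code == current))
    return links


for label, url, is_current in language_links("issue1", "de"):
    marker = "*" if is_current else " "
    print(marker, label, url)
```

The resulting list can be handed to a template as a variable and rendered as a drop-down or a navigation bar.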

User Account Language Preference

Besides Stickiness In Navigation, logged-in users might also want to override their browser language choice, if any, or get a default language different from the tracker language.

This can be done by adding a preferred language property to the user class. How to use this information to set the user's language upon login is still to be researched.

Other Bits and Pieces

File Upload Buttons

The button used to select files for upload is provided by the browser; it cannot be styled, nor can the text on the button be set. This follows naturally from the HTML spec, as the browser also indicates whether a file has already been selected.

In my case, I get a complete German user interface on roundup, with the only exception of the "Browse..." button, because I usually install my operating system (and so the browser) with English user interface language.

A very complete, though involved, solution to style and potentially translate these <input ... type="file"> elements is provided by Osvaldas Valutis in a very well crafted tutorial: Styling & Customizing File Inputs the Smart Way.

Roundup Internal Strings

I have found at least one instance of a string in the Roundup sources which is not subjected to gettext processing. It is inside a JavaScript function, so I guess it is challenging to do it there.

There are other user interface elements where no translation appears, although the Python code reveals that they are in fact passed through gettext. You can work around this by providing an extra file in the html sub directory, which holds translations of strings not extracted (or extractable) by roundup-gettext.

I have named this file po.html (must be *.html!) and e.g. the entry:

<x i18n:translate="">Submit Changes</x>

has given me a translated submit button on the issue item template, which is produced by the opaque construct:

<span tal:replace="structure context/submit">_submit_button_</span>

Summary

We have shown how to add multilingual capabilities for html5 templates to Roundup without patching the software.

There remain several shortcomings, which cannot be resolved without changing the source, however.

It would be helpful for the internationalization of the Roundup tracker, to provide the following information to the html templates:

  • page language .. the selected language for rendering a page.
  • translations .. a list of available languages.

The behavior with respect to the fallback language should be improved.

Some translation strings of code-generated elements are not automatically included in the messages.pot file generated by roundup-gettext; other such elements are still missing gettext processing.

User Account Language Preference is still to be researched.

Nice to haves:

Georg Lehner
Microformats for IkiWiki

Background

Microformats is one of several approaches to add computer-readable meaning to the visible text of a web page. The goal of this is to create a Semantic Web.

The IndieWeb proposes the use of microformats to build a decentralized social network: the IndieWeb.

The blog you are reading right now is based on IkiWiki; adding microformats to IkiWiki is a first step towards adding it to the IndieWeb.

Microformats

Microformats 2 is the most recent development of the microformats markup. It is advised to use it, but to still mark up web pages with classic microformats as well. For easier authoring I have put together a cheat sheet for translating h-entry to hAtom attributes.

For marking up a blog, a perceived minimum of three microformats classes is needed:

h-entry:
For giving details about a post (or article, respectively).
h-card:
For marking up the author of the article inside the h-entry.
h-feed:
For marking up aggregations of posts.

The following is a stripped down HTML 5 source of the present article. Different class attributes hold the microformat markup. Look out for the following data (classic microformats in parenthesis):

  • h-entry (hentry)
  • p-name (entry-title)
  • e-content (entry-content)
  • p-category (rel="tag")
  • p-author (author), with h-card (vcard)
  • dt-published (published)
  • dt-updated (updated)

<article class="page h-entry hentry">
    ...
    <span class="title p-name entry-title">
        Simple Responsive Design for IkiWiki
    </span>
    ...
    <section id="content" role="main" class="e-content entry-content">
        ... article text comes here ...
    </section>
    <nav>
        Tags:
        <a href="../../../tags/webdesign/" rel="tag"  class="p-category">webdesign</a>
    </nav>
    <span class="vcard">
        <a class="p-author author h-card" href="http://jorge.at.anteris.net">Georg Lehner</a>,
    </span>
    <span class="dt-published published">Posted <time datetime="2016-06-18T14:08:19Z" pubdate="pubdate" class="relativedate" title="Sat, 18 Jun 2016 16:08:19 +0200">at teatime on Saturday, June 18th, 2016</time></span>
    <span class="dt-updated updated">Last edited <time datetime="2016-07-23T13:48:28Z" class="relativedate" title="Sat, 23 Jul 2016 15:48:28 +0200">Saturday afternoon, July 23rd, 2016</time></span>
</article>
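To check such markup mechanically, the properties can be pulled out with a few lines of Python. The following is only a minimal sketch using the standard library. It is not a conforming microformats parser (a library such as mf2py would be the tool for that), it assumes well-formed markup in which every element is closed, and the names PropertyCollector and entry_properties are made up for this example.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect the text inside elements carrying selected mf2 classes."""

    WANTED = ("p-name", "p-author")

    def __init__(self):
        super().__init__()
        self.found = {}    # property name -> collected text
        self._stack = []   # one list of wanted properties per open element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        props = [p for p in self.WANTED if p in classes]
        for prop in props:
            self.found.setdefault(prop, "")
        self._stack.append(props)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Text belongs to every wanted property of every open element.
        for props in self._stack:
            for prop in props:
                self.found[prop] += data


def entry_properties(html):
    """Return a dict of stripped property texts found in html."""
    parser = PropertyCollector()
    parser.feed(html)
    parser.close()
    return {prop: text.strip() for prop, text in parser.found.items()}


sample = """<article class="page h-entry hentry">
<span class="title p-name entry-title"> Simple Responsive Design for IkiWiki </span>
<span class="vcard"><a class="p-author author h-card"
 href="http://jorge.at.anteris.net">Georg Lehner</a></span>
</article>"""
print(entry_properties(sample))
```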

Now let's look at a feed. In this case it is just a time-ordered list of two posts. Follow the structure as you did above:

  • h-feed (hfeed)
  • p-name (entry-title): title of the feed.
  • list of posts, each:
    • h-entry (hentry)
    • u-url (bookmark)
    • p-name (entry-title): title of the post
    • dt-published (published)
    • p-author (author), with h-card (vcard)

<div class="h-feed hfeed">
    <span class="p-name entry-title"><span class="value-title" title="MagmaSoft Tech Blog: all posts list"> </span></span>
        <div class="archivepage h-entry hentry">
            <a href="./Microformats_for_IkiWiki/" class="u-url bookmark p-name entry-title">Microformats for IkiWiki</a><br />
            <span class="archivepagedate dt-published published">
                Posted <time datetime="2016-07-27T15:19:32Z" pubdate="pubdate" class="relativedate" title="Wed, 27 Jul 2016 17:19:32 +0200">late Wednesday afternoon, July 27th, 2016</time>
            </span>
        </div>
        <div class="archivepage h-entry hentry">
            <a href="./Simple_Responsive_Design_for_IkiWiki/" class="u-url bookmark p-name entry-title">Simple Responsive Design for IkiWiki</a><br />
            <span class="archivepagedate dt-published published">
                Posted <time datetime="2016-06-18T14:07:50Z" pubdate="pubdate" class="relativedate" title="Sat, 18 Jun 2016 16:07:50 +0200">at teatime on Saturday, June 18th, 2016</time>
            </span>
            <span class="vcard">
                by <a class="p-author author h-card url fn" href="http://jorge.at.anteris.net">Georg Lehner</a>
            </span>
        </div>
</div>

If you use Firefox and install the Operator Add-on you can see the respective 'Contacts' and 'Tagspaces' entries.

This should do for learning by example; if you need more, go to the http://microformats.org website.

Note: IMHO the markup looks overly complicated, due to doubling microformats v1 and v2 markup. Microformats v2 simplify things a lot, but Operator has no support for it (yet) and who else out there will have?!

IkiWiki

Single posts

Pages are rendered by IkiWiki via HTML::Template using a fixed template: page.tmpl. So are blog posts, as these are simply standard wiki pages. The templates can contain variables, such as the author's name or the creation date of the page, which are inserted accordingly into the HTML code.

Instead of rewriting the default page.tmpl, I copied it to mf2-article.tmpl and use the IkiWiki pagetemplate directive at the top of every blog post in the following way:

[[!pagetemplate template="mf2-article.tmpl"]]

Note: One would like to avoid this extra typing. There are approaches which automate template selection, e.g. as discussed at the IkiWiki website here, however they are not yet available in the default IkiWiki code base.

Aggregates

We already explored the different options of aggregating several pages with IkiWiki on this website.

Two templates come into play for formatting the various posts involved:

archivepage.tmpl:
For simple lists of posts, like the example for a feed used above.
inlinepage.tmpl:
When concatenating several posts, e.g. the five most recent ones, with all their contents.

Note: there is also a microblog.tmpl template, which I have not used until now. Of course it will need a microformats upgrade too.

Accordingly I provide two modified template files which include the necessary microformats markup: mf2-archivepage.tmpl and mf2-inlinepage.tmpl.

These are passed to the respective inline directive via its template argument. Two live examples from the posts and the blog pages:

[[!inline  pages="page(blog/posts/*) and !*/Discussion"
show="5" feeds="no"
template="mf2-inlinepage" pageinfo]]

[[!inline  pages="page(./posts/*) and !*/Discussion"
archive=yes quick=yes trail=yes show=0
template="mf2-archivepage" pageinfo]]

But this does not give us a feed markup!

Ideally the inline directive should create the microformats markup for h-feed by itself. This would need a major change in IkiWiki's code base and of course has to be discussed with the IkiWiki authors and community. In the meantime I wrote a wrapper template, h-feed, which can be used to enclose an inline directive and wraps the rendered post list into an h-feed tagged <div>.

Note: it is not trivial to mark up the h-feed automatically. Feeds have required and optional elements which might be made visible or not on the page. The question is how the inline directive would know which information to show, whether to put it above or below the list of posts, and which text to wrap it into - think multiple languages. A possible solution would be a feedtemplate parameter with which you can select a template wrapping the in-lined pages. Then you can adapt the template to your taste. A default template provided by IkiWiki could be similar to the h-feed template.

Finally for the lazy readers: here comes the complete live example for h-feed. We'll show the archivepage example (simple list of posts):

[[!template  id=h-feed.mdwn
name="MagmaSoft Tech Blog: all posts list"
feed="""
[[!inline pages="page(./posts/*) and !*/Discussion"
archive=yes quick=yes trail=yes show=0
template="mf2-archivepage" pageinfo]]
"""]]

Itches

Pageinfo: more data for templates

IkiWiki runs several passes to compile the given wiki source tree into html pages. During these passes a lot of meta data is gathered for each wiki page. However, as explained in a forum post, IkiWiki does not supply much of this information in the template or inline directive.

To harvest the meta data already available I prototyped a new "IkiWiki" feature, which I call pageinfo. It adds a new valueless parameter to the inline and the template directive, as can be seen in the above examples. If it is present, information in the global hash %pagestate is made available to the template as <TMPL_VAR variable>.

Plugins can be written, which add information to %pagestate in early passes of the IkiWiki compiler.

The precise location for a given variable of a certain page in %pagestate is: $pagestate{$page}{pageinfo}{variable}

Note: This approach should be expanded somehow to take the information from the meta plugin into account.

Sample plugin: get_authors

The canonical way to declare authorship of an IkiWiki page is by using the meta directive with the author parameter. This is (sic) not available to templates or the inline directive. Additionally, you must set it manually for each page.

Since I am using a revision control system (rcs), the authorship information is already present in the wiki. So why repeat myself?

In the pageinfo prototype of IkiWiki two new rcs hooks are added, which gather the creator and the last committer of a file. The sample get_authors plugin uses this information, together with a map in the setup file, to convert the rcs author information to the author's name and URL and provides them as author_name, author_url, modifier_name and modifier_url. These variables are present in the templates described above.
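The mapping step can be illustrated with a short sketch. It is written in Python for brevity, whereas the real plugin is an IkiWiki (Perl) plugin, and it is not taken from the plugin's source. The rcs id "jorge", the layout of AUTHOR_MAP and the function name author_variables are assumptions; only the four variable names come from the description above.

```python
# Hypothetical map, as it might be declared in the setup file:
# rcs committer id -> author's name and URL.
AUTHOR_MAP = {
    "jorge": {"name": "Georg Lehner", "url": "http://jorge.at.anteris.net"},
}


def author_variables(creator, last_committer, author_map=AUTHOR_MAP):
    """Return template variables for a page's creator and last committer."""

    def lookup(rcs_id):
        # Unknown committers keep their rcs id as name and get no URL.
        entry = author_map.get(rcs_id, {})
        return entry.get("name", rcs_id), entry.get("url", "")

    author_name, author_url = lookup(creator)
    modifier_name, modifier_url = lookup(last_committer)
    return {
        "author_name": author_name,
        "author_url": author_url,
        "modifier_name": modifier_name,
        "modifier_url": modifier_url,
    }


print(author_variables("jorge", "anna"))
```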

Note: I am not happy with this first attempt. It is rather heuristic and already needs two hooks to be implemented for each type of rcs. In a real world wiki a page can have a lot of contributors, not just the first and the last one. Should we care?

Show me the source

At http://at.magma-soft.at/gitweb you can find the pageinfo branch of the ikiwiki.git repository. In the same location you will find the ikiwiki-plugins repository with the get_author.pm plugin.

Georg Lehner
Simple Responsive Design for IkiWiki

When adding responsive design to the MagmaSoft websites I was tempted to use either Pure.css or to reuse Milligram, as done on my homepage, but decided instead to google for typographically correct breakpoints. There is no need to explain here what these are, since Vasilis van Gemert has done so masterfully in his article about Logical Breakpoints Responsive Design in Smashing Magazine.

Vasilis provides a 'measure help' which allows you to determine how many ems you need for a certain font to get a certain average number of words per line in a certain language. Since we are trilingual (English, German, Spanish), I tried several options and settled on 40 em, which gives us around 90 characters per line. The left margin is set to 6% of the space and the text line to 94%, so that the reader does not have to scroll horizontally in a maximized browser window on a wide screen to find the text.

When the screen size gets smaller, the text font is scaled down a little bit and the margins are set to zero. So there is more text shown on smartphone screens.

IkiWiki just required some small changes to local.css and page.tmpl.

local.css:

@media (max-width: 44em) {
   body {
      font-size: .9em;
      padding: 0 1.5em;
      margin: 0;
    }
}

@media (min-width: 44em) {
   body {
      max-width: 47em;
   }

   article {
      width: 94%;
      margin-left: 6%;
   }

   h1, h1 + p {
      margin: 1em 0 1em -6%;
   }

}

page.tmpl:

<TMPL_IF HTML5>
<meta name="viewport" content="width=device-width, initial-scale=1">
</TMPL_IF>
<link rel="stylesheet" href="<TMPL_VAR BASEURL>css/normalize.css" type="text/css" />

Since the first test involved Milligram, I included normalize.css too. I left it in after moving to Vasilis' design, hoping "it helps" somehow.
