
HTML Page Localization

mykmelez edited this page Dec 30, 2011 · 9 revisions

Introduction

Add-on SDK-based addons often include HTML pages displayed in UI elements such as panels, widgets, and tabs. Such content typically includes locale-specific text, and addons are frequently used in multiple locales, so SDK-based addons should be able to localize their pages.

The Mozilla platform supports the localization of XUL documents and XHTML pages via DTDs and entity references. It also supports the localization of text in scripts via .properties files and keys. But it doesn't support the localization of HTML pages.

Bug 691782 implements support for the localization of addon main programs via an l10n CommonJS module and .properties files. This proposal builds on and complements that one by adding two APIs for the localization of HTML pages.

Proposal

Static Localization

For localization of static text in pages, we integrate a localization processor into the addon:// protocol handler, which is being implemented as part of the Add-on Pages API in bug 644595. The localization processor parses pages while they are being loaded and replaces property references (i.e. references to locale-specific text strings in .properties files) with their localized text referents. Property references are property keys delimited by dollar-sign-prefixed curly braces (${key}), and their referents are stored in the same .properties files that support localization of addon main programs.

[Alternative delimiter styles under consideration: mustaches ({{key}}), character entities (&key;). One consideration is avoiding conflicts with client-side template processors like jQuery Templates and mustache.js. We might make it possible to disable localization if we find that users encounter conflicts we cannot easily resolve, although we must work hard to minimize the risk of conflicts, since we want addons to be localizable, and addon developers want that too.]

[Alternatively, we could make the processor identify all text nodes (and localizable attributes, like the alt attribute of the <img> element) and use their values as keys automatically, so users wouldn't have to specify delimited keys. It is unclear how feasible or preferable this alternative is.]

For example, the following page includes two references:

<html>
  <head>
    <title>${title}</title>
  </head>
  <body>
    ${greeting}
  </body>
</html>

Given this English .properties file in a browser set to the en-US locale:

title = Simple Page
greeting = Hello, world!

The page would be processed to:

<html>
  <head>
    <title>Simple Page</title>
  </head>
  <body>
    Hello, world!
  </body>
</html>
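
The substitution step can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not the protocol handler's implementation: the function name localizePage and the strings object are hypothetical, the strings are assumed to have been loaded from the .properties file already, and leaving unknown references untouched is an assumption rather than specified behaviour. Escaping of the inserted text is left out here.

```javascript
// Hypothetical helper: replace ${key} references in page source with
// values from a table of localized strings (assumed already loaded).
function localizePage(source, strings) {
  return source.replace(/\$\{([^}]+)\}/g, (reference, key) => {
    const name = key.trim();
    // Assumption: unknown references are left as-is so missing keys are visible.
    return Object.prototype.hasOwnProperty.call(strings, name)
      ? strings[name]
      : reference;
  });
}

const strings = { title: "Simple Page", greeting: "Hello, world!" };
const page = "<title>${title}</title>\n<body>${greeting}</body>";

console.log(localizePage(page, strings));
// <title>Simple Page</title>
// <body>Hello, world!</body>
```

A real processor would work on the parsed page rather than on raw source text, so that it knows the context of each reference.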

The localization processor's parser is context-aware and escapes the text it inserts into the page, using something like the approach described in "Using type inference to make web templates robust against XSS".
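
As a rough illustration of the escaping step, the sketch below covers only the simplest contexts: HTML text and quoted attribute values. The name escapeHtml is hypothetical, and a genuinely context-aware processor would choose a different escaper for URLs, scripts, and style content.

```javascript
// Characters that are unsafe in HTML text and quoted attribute values.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical escaper for HTML text and quoted attribute contexts only.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<b>"Hi" & bye</b>'));
// &lt;b&gt;&quot;Hi&quot; &amp; bye&lt;/b&gt;
```

With escaping applied, a localized string that happens to contain markup is displayed as text instead of being interpreted as part of the page.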

Dynamic Localization of Text

For dynamic localization of text in pages, i.e. plaintext strings that are inserted into the DOM of a page by a page script after the page is loaded, the SDK injects an API into the pages that is like the one provided by the l10n CommonJS module for addon main programs.

[Alternative: give pages access to CommonJS modules and make the API available via the l10n CommonJS module.]

For example, given the .properties file mentioned previously, the statement:

alert(_("greeting"));

Would display an alert dialog with the text:

Hello, world!
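
A minimal sketch of what the injected function might look like, assuming the localized strings reach the page as a plain object. The factory name createLocalizer is hypothetical, and falling back to the key itself when no translation exists is an assumption.

```javascript
// Hypothetical factory that builds the page-level _() lookup function.
function createLocalizer(strings) {
  return function _(key) {
    // Assumption: fall back to the key when no localized string is available.
    return Object.prototype.hasOwnProperty.call(strings, key)
      ? strings[key]
      : key;
  };
}

const _ = createLocalizer({ title: "Simple Page", greeting: "Hello, world!" });

console.log(_("greeting")); // Hello, world!
console.log(_("missing"));  // missing
```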

References Within References

Property values are processed recursively, like entity values in DTD files, so they can embed references to other properties via the same syntax as the pages themselves.

For example, given the .properties file:

appName = Bamboozle
thankYou = Thank you for using ${appName}!

The statement:

alert(_("thankYou"));

Would display an alert dialog with the text:

Thank you for using Bamboozle!
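
Recursive processing can be sketched as follows. The function name resolve is hypothetical, and both the guard against circular references and the handling of unknown keys are assumptions, since the proposal does not say what should happen in those cases.

```javascript
// Hypothetical resolver: expand ${key} references inside property values,
// following nested references recursively.
function resolve(key, strings, seen = new Set()) {
  if (!Object.prototype.hasOwnProperty.call(strings, key)) {
    // Assumption: unknown keys are left as unresolved references.
    return "${" + key + "}";
  }
  if (seen.has(key)) {
    // Assumption: a property that refers back to itself is an error.
    throw new Error("Circular property reference: " + key);
  }
  seen.add(key);
  const value = strings[key].replace(/\$\{([^}]+)\}/g, (_match, inner) =>
    resolve(inner.trim(), strings, seen)
  );
  seen.delete(key); // Allow the same key to appear again in sibling references.
  return value;
}

const strings = {
  appName: "Bamboozle",
  thankYou: "Thank you for using ${appName}!",
};

console.log(resolve("thankYou", strings));
// Thank you for using Bamboozle!
```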

[Alternative: rely on the conventional mechanism for embedding references in property values, which uses opaque identifiers like %S whose semantics can be specified via comments that are sometimes structured to facilitate machine readability. But parsing those comments is complex and brittle, and relying on opaque identifiers would complicate the JSON format by which localizations are shipped in addon XPIs.]

Non-goals

  1. Dynamic Localization of HTML: processing HTML content with locale-specific plaintext strings that is dynamically inserted into a page, à la jQuery Templates (unlike dynamic localization of the plaintext strings themselves, which is covered in the Dynamic Localization of Text section above). This doesn't seem essential for the initial implementation, although it may prove useful to implement in a later phase of development.
  2. Generic Template Processing: this proposal aims, instead, for interoperability with third-party template processors.

References