Add initial support for localization and internationalization (#126)
* Move `LanguageRange` and `LanguageTag` abstractions from Hyperspace to be used in the localization support
* Add string escaping support to ease the creation of strings including control characters and Unicode code points
  * `String>>unescaped` transforms the escaping sequences into the target Unicode characters
  * `String>>escaped` is a kind of inverse but some sequences have several possible escaping options
* Add CurrentLocale dynamic variable to control the localization options
* Add string localization support for translating strings in several languages
  * `String>>localized` and `String>>localizedWithAll:` provide translation services that depend on the current locale of the process; placeholders between {} are interpolated with the provided arguments, and escape sequences are automatically unescaped.
* Add extensions to TestCase to ease the testing of locale-aware code
gcotelli authored Jun 25, 2024
1 parent 4fea7c4 commit 57e1b32
Showing 45 changed files with 1,895 additions and 10 deletions.
2 changes: 2 additions & 0 deletions docs/README.md
@@ -20,6 +20,8 @@ understanding over specific topics:
- **Exception Handling**: Extensions to the [exception handling mechanics](reference/ExceptionHandling.md).
- **Meta-programming**: Some abstractions like [namespaces](reference/Namespaces.md),
[interfaces](reference/Interfaces.md) and extensions to the [Object model](reference/MOP.md).
- **Internationalization**: Abstractions and extensions for [localizing](reference/Internationalization.md)
an application.
- **SUnit**: [Extensions to the SUnit framework](reference/SUnit.md).

---
129 changes: 129 additions & 0 deletions docs/reference/Internationalization.md
@@ -0,0 +1,129 @@
# Internationalization and Localization

Internationalization, often shortened to "i18n", is the practice of designing a
system so that it can easily be adapted for different target audiences, which
may vary in region, language, or culture.
The complementary process of adapting a system, typically its user interface,
to a specific target audience or culture is called localization, often
shortened to "l10n".

## Language Tags and Ranges

Language Tags are represented by instances of `LanguageTag`. A language tag is
used to label the language used by some information content.

These tags can also be used to specify the user's preferences when selecting
information content, or to label additional attributes of content and
associated resources.

Language tags can be created by providing a list of subtags or by parsing
their string representation:

```smalltalk
LanguageTag composedOf: #('en' 'Latn' 'US').
LanguageTag fromString: 'en-us'.
'en-us' asLanguageTag.
```

Instances answer their language code (`languageCode`) and provide methods to
access the script and region when they are defined:

```smalltalk
tag withScriptDo: [ :script | ].
tag withRegionDo: [ :region | ].
```

This implementation gives no special treatment to the other optional subtags
that can be defined, and it supports neither extended language subtags nor
regions expressed as UN M.49 codes.

Language ranges are represented by instances of `LanguageRange`. A language
range has the same syntax as a language-tag, or is the single character `"*"`.

A language range matches a language tag if it exactly equals the tag, or if it
exactly equals a prefix of the tag such that the first character following the
prefix is `"-"`.

The special range `"*"` matches any tag. A protocol that uses
language ranges may specify additional rules about the semantics of
`"*"`; for instance, `HTTP/1.1` specifies that the range `"*"` matches only
languages not matched by any other range within an `"Accept-Language:"` header.

Language ranges can be created by sending the message `any`, providing a list
of subtags, or parsing its string representation:

```smalltalk
LanguageRange any.
LanguageRange composedOf: #('en').
LanguageRange fromString: '*'.
LanguageRange fromString: 'es-AR'.
```

`LanguageRange` instances are capable of matching corresponding language tags.
For example:

```smalltalk
(LanguageRange fromString: 'es') matches: 'es-AR' asLanguageTag "==> true"
```
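
The prefix rule and the wildcard can be exercised with a few more expressions.
This is a minimal sketch that uses only the messages shown above
(`fromString:`, `any`, `asLanguageTag`, `matches:`); the variable names are
illustrative, and the results in the comments are what the matching rule as
stated predicts, assuming the implementation follows it.

```smalltalk
| spanish argentinian |
spanish := 'es' asLanguageTag.
argentinian := 'es-AR' asLanguageTag.

"A range matches a tag it equals exactly"
(LanguageRange fromString: 'es-AR') matches: argentinian. "rule predicts true"

"The wildcard range matches any tag"
LanguageRange any matches: spanish. "rule predicts true"

"A range longer than the tag is not a prefix of it"
(LanguageRange fromString: 'es-AR') matches: spanish. "rule predicts false"
```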

## Escaping Strings

When dealing with localized strings, it is common to need Unicode characters
that are not easily typed on a keyboard. To ease this, use the
`String>>#unescaped` method, which replaces certain escape sequences starting
with `\` (the reverse solidus character) with the characters they represent.

The available escaping sequences are:

- `\\` escapes the reverse solidus character
- `\a` escapes the `BEL` Unicode Character (code point 7)
- `\b` escapes the `BS` Unicode Character (code point 8)
- `\e` escapes the `ESC` Unicode Character (code point 27)
- `\f` escapes the `FF` Unicode Character (code point 12)
- `\l` escapes the `LF` Unicode Character (code point 10)
- `\n` escapes to the OS line delimiter
- `\r` escapes the `CR` Unicode Character (code point 13)
- `\t` escapes the `TAB` Unicode Character (code point 9)
- `\v` escapes the `VT` Unicode Character (code point 11)
- `\u{XX}` escapes the Unicode Character with code point XX, where XX is written
in hexadecimal notation and can be any valid code point in the Unicode range.

End users can add new escaping sequences by subclassing `StringEscapingRule`
and implementing the required behavior.
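
As a brief illustration of the sequences listed above, the following sketch
sends `unescaped` to a few literals. Smalltalk string literals do not interpret
`\`, so each receiver holds the backslash sequence verbatim until `unescaped`
is sent. The sample strings are made up for illustration, and the comments
describe the behavior documented in the list rather than captured output.

```smalltalk
"\t is replaced by the TAB character (code point 9)"
'Name:\tBuoy' unescaped.

"\u{E9} is replaced by the character with hexadecimal code point E9 (é)"
'Caf\u{E9}' unescaped.

"\\ is replaced by a single reverse solidus"
'C:\\temp' unescaped.
```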

## Current Locale and Language Translator

`CurrentLocale` is a dynamic variable used for accessing the current locale in
the localization support. Note that this is process-specific, so the same
image can have different locales for different running processes.

`NaturalLanguageTranslator` is a placeholder holding the current translator
available in the image.

## Localizing Strings

To localize a string into the language defined by the current locale, send the
message `localized` to the string instance, or `localizedWithAll:` if the
translation contains placeholders.

For example,

```smalltalk
'Hello world!' localized
```

will search for a translation of `Hello world!` in the installed language
translator according to the language in the current locale.

```smalltalk
'Hello {1}' localizedWithAll: { self name }
```

will search for the translation and then use the `String>>#format:` method to replace
the placeholders.

The localization methods will first unescape the receiver, then search for a
translation and finally apply the format method.
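
The placeholder syntax handled by `format:` can be seen in the following
expressions, adapted from `StringExtensionsTest>>#testFormat` in this same
commit; the results in the comments are the ones asserted by that test.

```smalltalk
"Positional placeholders index into the collection"
'This is {1} !' format: #( 'a test' ). "'This is a test !'"

"Named placeholders look up keys in a dictionary"
'This is a {type}' format: { 'type' -> 'test' } asDictionary. "'This is a test'"

"A reverse solidus escapes braces and itself"
'\{ \} \\ foo {1} bar {2}' format: { 12 . 'string' }. "'{ } \ foo 12 bar string'"
```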
3 changes: 3 additions & 0 deletions docs/reference/SUnit.md
@@ -8,6 +8,9 @@
- `deny:includes:` denies that a collection includes an element
- `should:raise:withMessageText:` asserts that a block raises a specific
exception including a specific message text
- `use:asLocaleDuring:` allows changing the current locale during a block execution
- `use:asNaturalLanguageTranslatorDuring:` allows using and configuring a language
translator during a block execution
- `withTheOnlyOneIn:do:` provides a facility to assert that a collection has
only one element and evaluates a block with it

4 changes: 3 additions & 1 deletion rowan/components/Dependent-SUnit-Extensions.ston
@@ -2,7 +2,9 @@ RwSimpleProjectLoadComponentV2 {
#name : 'Dependent-SUnit-Extensions',
#condition : 'sunit',
#projectNames : [ ],
- #componentNames : [ ],
+ #componentNames : [
+ 'Deployment'
+ ],
#packageNames : [
'Buoy-SUnit-Model',
'Buoy-SUnit-GS64-Extensions'
4 changes: 4 additions & 0 deletions rowan/components/Deployment.ston
@@ -18,6 +18,8 @@ RwSimpleProjectLoadComponentV2 {
'Buoy-Dynamic-Binding',
'Buoy-Exception-Handling-Extensions',
'Buoy-Exception-Handling-GS64-Extensions',
'Buoy-Localization',
'Buoy-Localization-GS64-Extensions',
'Buoy-Math',
'Buoy-Math-Extensions',
'Buoy-Math-GS64-Base-Extensions',
@@ -43,6 +45,8 @@ RwSimpleProjectLoadComponentV2 {
'Buoy-Dynamic-Binding' : { 'symbolDictName' : 'Buoy' },
'Buoy-Exception-Handling-Extensions' : { 'symbolDictName' : 'Globals' },
'Buoy-Exception-Handling-GS64-Extensions' : { 'symbolDictName' : 'Globals' },
'Buoy-Localization' : { 'symbolDictName' : 'Buoy' },
'Buoy-Localization-GS64-Extensions' : { 'symbolDictName' : 'Globals' },
'Buoy-Math' : { 'symbolDictName' : 'Buoy' },
'Buoy-Math-Extensions' : { 'symbolDictName' : 'Globals' },
'Buoy-Math-GS64-Base-Extensions' : { 'symbolDictName' : 'Globals' },
2 changes: 2 additions & 0 deletions rowan/components/Tests.ston
@@ -14,6 +14,7 @@ RwSimpleProjectLoadComponentV2 {
'Buoy-Conditions-Tests',
'Buoy-Dynamic-Binding-Tests',
'Buoy-Exception-Handling-Tests',
'Buoy-Localization-Tests',
'Buoy-Math-Tests',
'Buoy-Metaprogramming-Tests',
'Buoy-SUnit-Tests'
@@ -29,6 +30,7 @@ RwSimpleProjectLoadComponentV2 {
'Buoy-Conditions-Tests' : { 'symbolDictName' : 'Buoy' },
'Buoy-Dynamic-Binding-Tests' : { 'symbolDictName' : 'Buoy' },
'Buoy-Exception-Handling-Tests' : { 'symbolDictName' : 'Buoy' },
'Buoy-Localization-Tests' : { 'symbolDictName' : 'Buoy' },
'Buoy-Math-Tests' : { 'symbolDictName' : 'Buoy' },
'Buoy-Metaprogramming-Tests' : { 'symbolDictName' : 'Buoy' },
'Buoy-SUnit-Tests' : { 'symbolDictName' : 'Buoy' }
31 changes: 24 additions & 7 deletions source/BaselineOfBuoy/BaselineOfBuoy.class.st
@@ -22,6 +22,7 @@ BaselineOfBuoy >> baseline: spec [
baselineComparison: spec;
baselineDynamicBinding: spec;
baselineExceptionHandling: spec;
baselineLocalization: spec;
baselineMath: spec;
baselineMetaprogramming: spec;
baselineSUnit: spec;
@@ -175,6 +176,22 @@ BaselineOfBuoy >> baselineGS64Development: spec [
group: 'GS64-Development' with: 'Buoy-Chronology-GS64-Extensions'
]

{ #category : 'baselines' }
BaselineOfBuoy >> baselineLocalization: spec [

spec
package: 'Buoy-Localization'
with: [
spec requires:
#( 'Buoy-Assertions' 'Buoy-Dynamic-Binding' 'Buoy-Metaprogramming-Extensions' ) ];
group: 'Deployment' with: 'Buoy-Localization';
package: 'Buoy-Localization-Pharo-Extensions' with: [ spec requires: 'Buoy-Localization' ];
group: 'Deployment' with: 'Buoy-Localization-Pharo-Extensions';
package: 'Buoy-Localization-Tests'
with: [ spec requires: #( 'Buoy-Localization-Pharo-Extensions' 'Dependent-SUnit-Extensions' ) ];
group: 'Tests' with: 'Buoy-Localization-Tests'
]

{ #category : 'baselines' }
BaselineOfBuoy >> baselineMath: spec [

@@ -219,13 +236,13 @@ BaselineOfBuoy >> baselineMetaprogramming: spec [
{ #category : 'baselines' }
BaselineOfBuoy >> baselineSUnit: spec [

- spec
- package: 'Buoy-SUnit-Model';
- group: 'Dependent-SUnit-Extensions' with: 'Buoy-SUnit-Model';
- package: 'Buoy-SUnit-Pharo-Extensions';
- group: 'Dependent-SUnit-Extensions' with: 'Buoy-SUnit-Pharo-Extensions';
- package: 'Buoy-SUnit-Tests' with: [ spec requires: 'Dependent-SUnit-Extensions' ];
- group: 'Tests' with: 'Buoy-SUnit-Tests'
+ spec
+ package: 'Buoy-SUnit-Model' with: [ spec requires: 'Buoy-Localization' ];
+ group: 'Dependent-SUnit-Extensions' with: 'Buoy-SUnit-Model';
+ package: 'Buoy-SUnit-Pharo-Extensions';
+ group: 'Dependent-SUnit-Extensions' with: 'Buoy-SUnit-Pharo-Extensions';
+ package: 'Buoy-SUnit-Tests' with: [ spec requires: 'Dependent-SUnit-Extensions' ];
+ group: 'Tests' with: 'Buoy-SUnit-Tests'
]

{ #category : 'baselines' }
@@ -30,8 +30,8 @@ DurationChronologyExtensionsTest >> testWait [

| ms |

- ms := Time millisecondsToRun: [ 2.1 seconds wait ].
- self assert: ms >= 2100
+ ms := Time millisecondsToRun: [ 2 seconds wait ].
+ self assert: ms >= 2000
]

{ #category : 'tests' }
@@ -120,6 +120,31 @@ CharacterCollection >> findTokens: delimiters [
^ self substrings: separators
]

{ #category : '*Buoy-Collections-GS64-Extensions' }
CharacterCollection >> format: collection [
"Format the receiver by interpolating elements from collection"

^ self species new: self size streamContents: [ :result |
| stream |
stream := self readStream.
[ stream atEnd ] whileFalse: [
| currentChar |
( currentChar := stream next ) == ${
ifTrue: [
| expression index |
expression := stream upTo: $}.
index := Integer readFrom: expression ifFail: [ expression ].
result nextPutAll: ( collection at: index ) asString
]
ifFalse: [
currentChar == $\
ifTrue: [ stream atEnd ifFalse: [ result nextPut: stream next ] ]
ifFalse: [ result nextPut: currentChar ]
]
]
]
]

{ #category : '*Buoy-Collections-GS64-Extensions' }
CharacterCollection >> includesSubstring: substring [

28 changes: 28 additions & 0 deletions source/Buoy-Collections-Tests/StringExtensionsTest.class.st
@@ -87,6 +87,34 @@ StringExtensionsTest >> testFindTokens [
hasTheSameElementsInTheSameOrderThat: #( 'es' ' ' 'his' )
]

{ #category : 'tests' }
StringExtensionsTest >> testFormat [

self
assert: ( 'This is {1} !' format: #( 'a test' ) ) equals: 'This is a test !';
assert: ( 'This is a {type}' format: { ( 'type' -> 'test' ) } asDictionary )
equals: 'This is a test';
assert: ( 'In {1} you can escape \{ by prefixing it with \\' format: { 'strings' } )
equals: 'In strings you can escape { by prefixing it with \';
assert: ( 'In \{1\} you can escape \{ by prefixing it with \\' format: { 'strings' } )
equals: 'In {1} you can escape { by prefixing it with \';
assert: ( '\{ \} \\ foo {1} bar {2}' format: { 12 . 'string' } )
equals: '{ } \ foo 12 bar string';
assert: ( '\{1}' format: { } ) equals: '{1}';
assert: ( '\{1}{1}' format: { $a } ) equals: '{1}a';
assert: ( '{1}{1}' format: { $a } ) equals: 'aa'
]

{ #category : 'tests' }
StringExtensionsTest >> testFormatWhenInvalid [

self
should: [ 'Invalid {1}' format: #( ) ] raise: SubscriptOutOfBounds;
should: [ 'Invalid {date}' format: #( 1 2 ) ] raise: Error;
should: [ '{ _1_ }' format: #( 1 ) ] raise: Error;
should: [ '{ date }' format: { 'date' -> 1 } asDictionary ] raise: NotFound
]

{ #category : 'tests' }
StringExtensionsTest >> testIncludesSubstring [

10 changes: 10 additions & 0 deletions source/Buoy-Deprecated-V8/LanguageRange.extension.st
@@ -0,0 +1,10 @@
Extension { #name : 'LanguageRange' }

{ #category : '*Buoy-Deprecated-V8' }
LanguageRange class >> from: aSubtagCollection [

self
deprecated: 'Use composedOf: instead'
transformWith: '`@receiver from: `@subtags' -> '`@receiver composedOf: `@subtags'.
^ self composedOf: aSubtagCollection
]
11 changes: 11 additions & 0 deletions source/Buoy-Deprecated-V8/LanguageTag.extension.st
@@ -0,0 +1,11 @@
Extension { #name : 'LanguageTag' }

{ #category : '*Buoy-Deprecated-V8' }
LanguageTag class >> from: aSubtagCollection [

self
deprecated: 'Use composedOf: instead'
transformWith: '`@receiver from: `@subtags' -> '`@receiver composedOf: `@subtags'.

^ self composedOf: aSubtagCollection
]
1 change: 1 addition & 0 deletions source/Buoy-Deprecated-V8/package.st
@@ -0,0 +1 @@
Package { #name : 'Buoy-Deprecated-V8' }
@@ -0,0 +1,37 @@
Extension { #name : 'CharacterCollection' }

{ #category : '*Buoy-Localization-GS64-Extensions' }
CharacterCollection >> asLanguageRange [

^ LanguageRange fromString: self
]

{ #category : '*Buoy-Localization-GS64-Extensions' }
CharacterCollection >> asLanguageTag [

^ LanguageTag fromString: self
]

{ #category : '*Buoy-Localization-GS64-Extensions' }
CharacterCollection >> escaped [

^ EscapingAlgorithm new escape: self
]

{ #category : '*Buoy-Localization-GS64-Extensions' }
CharacterCollection >> localized [

^ self localizedWithAll: #( )
]

{ #category : '*Buoy-Localization-GS64-Extensions' }
CharacterCollection >> localizedWithAll: collection [

^ NaturalLanguageTranslator current localize: self withAll: collection to: CurrentLocale value
]

{ #category : '*Buoy-Localization-GS64-Extensions' }
CharacterCollection >> unescaped [

^ EscapingAlgorithm new unescape: self
]