This is a Gatsby plugin that enables you to internationalize the URLs of your website.
- Supports BCP 47 tags for identifying languages
- Defines a non-canonical duplicate page for the main index pages
  - This is done at every level of your tree
  - A canonical link is injected in the header so that search engine crawlers register the right address
- Includes the locale tag in the path of untranslated pages, using the default language
- Configurable default language
- Translation of the slugs
It injects the locale code of each page into the base path of the URL, so your tree will look like this:
www.example.com**/en**/hello
www.example.com**/en-US**/hello
www.example.com**/fr**/hello
www.example.com/ delivers the same content as the default language.
- You have to provide the language in the path of each page, for example /this/is/my-page.html.fr for the French version.
- The paths are translated into the language of the page using the translation files you edit; see the locale.[lang].json section.
{
  resolve: `gatsby-plugin-intl-url`,
  options: {
    defaultLanguage: 'fr',
    translationFile: 'src/locale.json'
  }
}
- defaultLanguage (optional): the default language chosen by the plugin if the language code is missing from your path. Defaults to en.
- translationFile (optional): the path to your path translation JSON file. Defaults to src/locale.json.
- The plugin takes as input all the pages of the project.
  Note: the pages might have been transformed beforehand by a markup transformer such as gatsby-transformer-remark.
- The language code of each page is extracted from the path: /path/to/my/file.html.en provides the en code.
- If the pattern does not match, the page is treated as written in the default language.
- If the page file name is index, the path stops at the folder level.
- As output, it transforms the path of the pages and injects context variables, as the following table describes:
Path | New path | context.locale | context.canonical | context.slug | context.pathRegex |
---|---|---|---|---|---|
/a/b.fr | /fr/a/b | fr | null | /a/b | /a/b/ |
/a.html | /en/a.html | en (default) | null | /a.html | /a.html/ |
/a.html.es | /es/a.html | es | null | /a.html | /a.html/ |
/index.no | /no | no | null | / | // |
/index.en | / and /en | en (default) | en and null | / | // |
- Each item of the path is translated using the translation file.
  Example: /my/path.html is split into ['my','path.html'], and every string of this array is translated into French using the src/locale.json file.
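The path handling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the plugin's actual source: the `localizePath` helper and the `LOCALE_SUFFIX` pattern are hypothetical names, and the pattern only recognises a simplified form of BCP 47 tags (a two- or three-letter language code, optionally followed by hyphenated subtags). Slug translation and the extra root page for the default-language index are left out.

```javascript
// Hypothetical helper, written for illustration only.
// Simplified BCP 47 matcher: ".fr", ".en-US", ... at the very end of the path.
const LOCALE_SUFFIX = /\.([a-z]{2,3}(?:-[A-Za-z0-9]{2,8})*)$/;

function localizePath(path, defaultLanguage = 'en') {
  const match = path.match(LOCALE_SUFFIX);

  // No language suffix: fall back to the default language.
  const locale = match ? match[1] : defaultLanguage;
  const bare = match ? path.slice(0, match.index) : path;

  // Split into path items and stop at the folder level for index files.
  const segments = bare.split('/').filter(Boolean);
  if (segments[segments.length - 1] === 'index') {
    segments.pop();
  }

  const slug = '/' + segments.join('/');
  const newPath = '/' + [locale, ...segments].join('/');
  return { locale, slug, path: newPath };
}

console.log(localizePath('/a/b.fr'));    // { locale: 'fr', slug: '/a/b', path: '/fr/a/b' }
console.log(localizePath('/a.html'));    // { locale: 'en', slug: '/a.html', path: '/en/a.html' }
console.log(localizePath('/index.no'));  // { locale: 'no', slug: '/', path: '/no' }
```

Note that such a simple pattern would also treat short file extensions like .md or .js as language codes, so a real implementation would likely need a stricter check.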
After being handled by the plugin, each page's context is injected with the following properties:
- locale: the language tag identifying the language of the page.
- canonical: the canonical link to the page if the current path is not itself the canonical one, null otherwise. This is useful to tell search engines which link should be registered for the index pages.
- slug: the relative path of your page without any indication of the language. It should be written in the default language so that you can translate it (feature not implemented yet).
- pathRegex: a regular expression containing the slug, so that you can filter easily in GraphQL.
The translation file must be structured as follows:
{
  "fr": {
    "slugs": {
      "boats": "bateaux",
      "mono-hull": "mono-coque",
      "multi-hull": "multi-coque",
      "sailboat": "voilier",
      "motorboat": "bateau-a-moteur"
    }
  }
}
This file enables the translation of the following paths:
Path | Translated |
---|---|
/boats/motorboat | /bateaux/bateau-a-moteur |
/boats/sailboat/mono-hull | /bateaux/voilier/mono-coque |
/boats/sailboat/multi-hull | /bateaux/voilier/multi-coque |
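The lookup behind this table can be sketched as a per-segment dictionary substitution. The `translatePath` helper below is a hypothetical name used for illustration, not the plugin's API, and it assumes that segments without an entry in the translation file are left unchanged.

```javascript
// Same structure as the translation file shown above.
const translations = {
  fr: {
    slugs: {
      boats: 'bateaux',
      'mono-hull': 'mono-coque',
      'multi-hull': 'multi-coque',
      sailboat: 'voilier',
      motorboat: 'bateau-a-moteur',
    },
  },
};

// Hypothetical helper: translate every item of the path for a given locale.
function translatePath(path, locale, dictionary) {
  const slugs = (dictionary[locale] && dictionary[locale].slugs) || {};
  return path
    .split('/')
    .map((segment) =>
      // Assumption: keep the segment as-is when no translation exists.
      Object.prototype.hasOwnProperty.call(slugs, segment) ? slugs[segment] : segment
    )
    .join('/');
}

console.log(translatePath('/boats/sailboat/mono-hull', 'fr', translations));
// /bateaux/voilier/mono-coque
```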
A website www.example.com will have an index page translated into several languages, each available at its own URL, for instance www.example.com/pt, www.example.com/fi, www.example.com/no... If the default language is Norwegian, the main home page should be in Norwegian, which means the URL www.example.com/ should deliver the same content as www.example.com/no.
To handle this case, the plugin generates two pages with two different URLs: one is the preferred one, i.e. the canonical, and the other is the non-canonical.
As of today, this choice isn't configurable and it behaves like so:
- www.example.com/no is canonical
- www.example.com/ is not canonical
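As a rough sketch of this behaviour for the root index page, the hypothetical `indexPagesFor` helper below returns the page descriptors that would be generated for one language. It is an illustration rather than the plugin's implementation: the canonical value is written here as the path of the preferred page, and the exact format the plugin stores is an assumption.

```javascript
// Hypothetical helper; only the root-level index page is covered.
function indexPagesFor(locale, defaultLanguage) {
  // The locale-prefixed page is always the canonical one.
  const localized = { path: `/${locale}`, context: { locale, canonical: null } };
  if (locale !== defaultLanguage) {
    return [localized];
  }
  // Default language: also serve the bare root, pointing at the canonical page.
  const root = { path: '/', context: { locale, canonical: localized.path } };
  return [root, localized];
}

console.log(indexPagesFor('no', 'no').map((page) => page.path)); // [ '/', '/no' ]
console.log(indexPagesFor('fi', 'no').map((page) => page.path)); // [ '/fi' ]
```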
Please help me improve this project, feel free to ask, suggest or send your pull requests. You can directly contact me on Twitter.