MusicBrainz Server/Internationalization: Difference between revisions

(→‎Server: Replace database context example with a real one)
 
(70 intermediate revisions by 5 users not shown)
This page is intended for translators of [[MusicBrainz Server]] and [[MusicBrainz Database]]’s semi-static content, and people interested in the state of [[MusicBrainz]]’s internationalization.


== Getting started ==


For a general introduction to the translation platform, projects, and languages, please refer to the [[Internationalization#Translation|more general page about translation]].

== Questions or problems ==

To discuss:
* MusicBrainz Server/Database translation: Use the [https://community.metabrainz.org/tags/c/musicbrainz/6/translation “translation” tag in the “MusicBrainz” forum category].
* MusicBrainz Server/Database internationalization: Use the [https://community.metabrainz.org/tags/c/musicbrainz/6/internationalization “internationalization” tag in the “MusicBrainz” forum category].
* General translation/internationalization: Use the [https://community.metabrainz.org/c/internationalization/21 “Internationalization” forum category] with the appropriate tags.

For real-time conversation with community members, you’re welcome to ask in the [[Communication/IRC|#musicbrainz IRC channel]].

To report problems with:
* Countries: Search for [https://tickets.metabrainz.org/issues/?jql=project%20%3D%20AREQ%20ORDER%20BY%20status%20ASC%2C%20resolution%20DESC AREQ tickets] and create a new ticket if not found.
* Instruments or instrument descriptions: Search for [https://tickets.metabrainz.org/issues/?jql=project%20%3D%20INST%20ORDER%20BY%20status%20ASC%2C%20resolution%20DESC INST tickets] and create a new ticket if not found.
* Attributes, languages, relationship types, or scripts: Search for [https://tickets.metabrainz.org/issues/?jql=project%20%3D%20STYLE%20ORDER%20BY%20status%20ASC%2C%20resolution%20DESC STYLE tickets] and create a new ticket if not found.
* Anything else: Search for [https://tickets.metabrainz.org/issues/?jql=project%20%3D%20MBS%20AND%20labels%20%3D%20internationalization%20ORDER%20BY%20status%20ASC%2C%20resolution%20DESC MBS tickets with “internationalization” label] and create a new ticket if not found.

For real-time conversation with developers, you’re welcome to ask in the [[Communication/IRC|#metabrainz IRC channel]].

''Note: There used to be a [[Communication/Mailing_Lists#Internationalization_Mailing_List|musicbrainz-i18n mailing list]], but it has been discontinued and replaced by the forums mentioned above (using categories and tags).''

== Translation components ==

The following components are available for translation:

=== Attributes ===
It contains the names and descriptions of [[MusicBrainz Entity|MusicBrainz entity]] attributes, such as artist types.

The [https://docs.weblate.org/en/latest/user/translating.html#translation-context context] keys are set to the name of the database table where the message is stored, for example <code>release_group_primary_type</code> (see [[MusicBrainz_Database/Schema#Release_group|diagram]]).

It is also used by [[MusicBrainz Picard]].

=== Countries ===
It contains the names of [[Release/Country|release countries]].

It is also used by [[MusicBrainz Picard]].

Note that country names should be the same as area aliases; see [[jira:MBS-13140]] for follow-up.

Only the [[Release/Country]] documentation is not localized for now; see [[jira:MBS-13109]] for follow-up.

=== Instrument Descriptions ===
It contains only the descriptions of [[Instrument|instruments]].

=== Instruments ===
It contains only the names of [[Instrument|instruments]].

Note that:

* Instrument names should be the same as instrument aliases; see [[jira:MBS-13141]] for follow-up.
* Context keys are set to their disambiguation comment for now; see [[jira:MBS-13374]] for follow-up.

=== Languages ===
It contains the names of languages that can be set for a [[Release#Language|release]]’s tracklist and a [[Work|work]]’s lyrics.

=== Relationship Types ===

It contains the names, descriptions, and (forward/long/reverse) link phrases of [https://musicbrainz.org/relationships relationship types] as well as the names and descriptions of [https://musicbrainz.org/relationship-attributes relationship attributes]. See also [[Relationships]].

=== Scripts ===
It contains the names of scripts that can be set for a [[Release#Script|release]]’s tracklist.

Note: Because of [[wikipedia:Transliteration|transliteration]], a language is not necessarily paired with its usual script/[[wikipedia:Writing system|writing system]].

=== Server ===
It contains the messages shown to users and admins by the MusicBrainz website.

The context keys that allow the same source message to be translated differently in different contexts are of the following forms:

* A grammatical context such as:
** The role a word would have in a full sentence, for example “Edit” can be either a ''noun'' or a ''verb''.
** The grammatical number of a word, for example “Series” can be either ''singular'' or ''plural''.
** The object a word is referring to, for example “Removed” can refer either to ''ratings'' or ''folksonomy tags''.
* A layout context, for example “Preview” can be either an ''interactive'' element (button, link, menu item…) or a ''header''.
* A database context, for example “Internal/Bot” is a ''user type''.
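
As a rough illustration of how a context key lets one source message carry several translations, the sketch below keys a tiny catalog by (context, message) pairs, which is conceptually how gettext's <code>msgctxt</code> works. The context names and the French translations are invented for this example and are not taken from the actual MusicBrainz catalogs.

```python
# Hypothetical catalog: context keys and translations are illustrative only.
CATALOG = {
    ("noun", "Edit"): "Modification",
    ("verb", "Edit"): "Modifier",
    ("singular", "Series"): "Série",
    ("plural", "Series"): "Séries",
}


def translate(context, message):
    """Return the translation for (context, message), or the source message."""
    return CATALOG.get((context, message), message)


print(translate("noun", "Edit"))
print(translate("verb", "Edit"))
# Not in the catalog, so the English source message is returned unchanged.
print(translate("header", "Preview"))
```

Without the context key, both uses of “Edit” would collapse into a single entry and could only receive one translation.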

If you feel that more context is needed, do not hesitate to add [https://docs.weblate.org/en/latest/user/translating.html#comments source string comments].

=== Statistics ===
It contains the events in the [https://musicbrainz.org/statistics/timeline/main MusicBrainz timeline] and the messages for the [https://musicbrainz.org/statistics Database Statistics] section of the website UI.

A few context keys are set in the same way as for the Server component above.


== Viewing the translations ==


Some of the more complete translations (generally those over 50% translated) are available on the beta server at https://beta.musicbrainz.org/. The translations do not update automatically (see [[Development/Beta_Cycle|development beta cycle]]), but the beta server uses the same database as the main server. If you want to use the beta server all of the time for your editing, click the "Use beta site" link in the footer of https://musicbrainz.org/.

== Variables ==

Translatable messages not only contain plain text or HTML markup, they can also contain replaceable variables. For example:

* In <code>{entity1} has a BookBrainz page at {entity0}</code>, which is a URL-Work relationship link phrase, there are two entity variables whose names should not be translated: <code>{entity1}</code> will be replaced by a work title and <code>{entity0}</code> by a URL.
* In link phrases, variables are often used for (optional) attributes, in order to avoid inflating the number of messages. Below are examples with the “additional” attribute:
** <code>{additional}</code> will be replaced by ''additional'' if the “additional” attribute is set, otherwise it will be removed from the text.
** <code>{additional:additionally}</code> will be replaced by ''additionally'' if the “additional” attribute is set, otherwise it will be removed from the text.
** <code>{additional:an|a}</code> will be replaced by ''an'' if the “additional” attribute is set, otherwise it will be replaced by ''a''.
** <code>{additional:%|regular}</code> will be replaced by ''additional'' if the “additional” attribute is set, otherwise it will be replaced by ''regular''.
** Hence, <code>{additional}</code> can be translated as <code>{additional:aldona}</code> in Esperanto.
* Note that <code>{instrument}</code> and <code>{vocals}</code> variables are replaced by the specific instrument/vocals name:
** <code>{instrument:%|instruments}</code> will be replaced by ''piano'' (or its translation) if the related instrument is “piano”, otherwise it will be replaced by ''instruments''.
* In <code>Please {search|try again}</code>:
** <code>{search|try again}</code> will be replaced with a hyperlink on ''try again'' that leads to a search page. The variable identifier ''search'' should not be translated; only the link text ''try again'' should be translated.
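
The attribute rules above can be sketched in a few lines of Python. This is only an illustration of the substitution behaviour described on this page, not the server's actual implementation; the function name <code>expand</code>, the regular expression, and the phrase <code>{additional} producer</code> are assumptions made for the example. Link variables such as <code>{search|try again}</code> are intentionally left untouched.

```python
import re

# Matches {name}, {name:if_set} and {name:if_set|if_unset}.
# Link variables such as {search|try again} have no colon and do not match.
VARIABLE = re.compile(r"\{(\w+)(?::([^{}|]*)(?:\|([^{}]*))?)?\}")


def expand(phrase, attributes):
    """Expand attribute variables in a link phrase.

    `attributes` maps the names of the attributes that are set to their
    display text, for example {"additional": "additional"}.
    """

    def replace(match):
        name, if_set, if_unset = match.groups()
        value = attributes.get(name)
        if value is None:
            # Attribute not set: use the alternative text, or drop the variable.
            return if_unset or ""
        if if_set is None:
            # Plain {name}: use the attribute's own display text.
            return value
        # "%" stands for the attribute's display text.
        return if_set.replace("%", value)

    text = VARIABLE.sub(replace, phrase)
    # Collapse the whitespace left behind by removed variables.
    return " ".join(text.split())


print(expand("{additional} producer", {"additional": "additional"}))
print(expand("{additional} producer", {}))
print(expand("{instrument:%|instruments}", {"instrument": "piano"}))
```

A translator only changes the text parts: with <code>{additional:aldona}</code>, the sketch yields ''aldona'' when the attribute is set and nothing otherwise.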

== Development ==

The [https://github.com/metabrainz/musicbrainz-server MusicBrainz Server code] uses gettext to internationalize the messages and texts used in the Perl code and templates.

A .pot file containing all the strings used in the server is provided. The strings are in English.
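
To give an idea of what such a template contains, here is a minimal sketch that lists the (context, message) pairs found in gettext template text. It is a simplified reader written for illustration, not a complete PO parser: it ignores plural forms and does not unescape backslash sequences. The sample entries, including the <code>verb</code> context, are invented.

```python
def read_entries(pot_text):
    """Return a list of (msgctxt, msgid) pairs from gettext template text.

    Simplified: handles keyword lines and quoted continuation lines only.
    """
    entries = []
    fields = {}
    current = None
    for line in pot_text.splitlines() + [""]:
        line = line.strip()
        if not line:
            # A blank line ends the current entry; the header has an empty msgid.
            if fields.get("msgid"):
                entries.append((fields.get("msgctxt"), fields["msgid"]))
            fields, current = {}, None
        elif line.startswith("#"):
            continue  # translator and extracted comments
        elif line.startswith('"') and current:
            # Continuation line: append to the field being read.
            fields[current] += line[1:-1]
        else:
            keyword, _, rest = line.partition(" ")
            current = keyword
            fields[current] = rest.strip()[1:-1]
    return entries


# Invented sample, shaped like a gettext template.
SAMPLE = '''
msgid ""
msgstr ""

msgctxt "verb"
msgid "Edit"
msgstr ""

msgid "Series"
msgstr ""
'''

print(read_entries(SAMPLE))
```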


== Beyond translation ==


=== Current features ===


* Localized [[Aliases|aliases]] whose list of locales is imported from [https://cldr.unicode.org/ Unicode CLDR].
** Support to search for [[MusicBrainz Entity|entities]] by name, alias, or both using fuzzy search (the default).
** Specific sort name for each alias indicating how the entity should be sorted under that name in the given locale.
** Aliases are returned (if specifically requested) by the [[MusicBrainz API]].
** However, aliases are not otherwise used for display on the website; see [[jira:MBS-11965]] for follow-up.
* [[Release#Language|Language]] and [[Release#Script|script]] (a.k.a. writing system) of each release’s tracklist.
** Relationship type to [[rt:fc399d47-23a7-4c28-bfcf-0607a562b644|link releases having translated/transliterated title and tracklist]].
*** The ability for editors to add [[Release#Status|pseudo-releases]] (to be backed with [[jira:MBS-4501|alternative tracklists]]) for translating/transliterating any release.
** Support to search for releases by their tracklist’s language and script.
* Language of [[Work|work]]’s lyrics.
** Relationship type attribute to [[ra:ed11fcb1-5a18-4e1d-b12c-633ed19c8ee1|link works having translated lyrics]].
** Support to search for works by lyrics’ language.
* Ability to enter localized artist names as appropriate on releases and recordings using [[Artist Credits|artist credits]].
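
As a sketch of how a client might use localized aliases, the function below picks the primary alias for a given locale from an entity as returned in JSON by the [[MusicBrainz API]] when aliases are requested (for example with <code>inc=aliases&fmt=json</code>), and falls back to the entity's own name. The field names <code>aliases</code>, <code>locale</code>, <code>primary</code> and <code>sort-name</code> are assumed to match that JSON output, and the sample entity is invented.

```python
def pick_alias(entity, locale):
    """Return (name, sort_name) for the primary alias in `locale`.

    Falls back to the entity's own name and sort name when no primary
    alias exists for that locale.
    """
    for alias in entity.get("aliases", []):
        if alias.get("locale") == locale and alias.get("primary"):
            return alias["name"], alias["sort-name"]
    return entity["name"], entity.get("sort-name", entity["name"])


# Invented sample entity, shaped like a JSON response that includes aliases.
SAMPLE = {
    "name": "Пример",
    "sort-name": "Primer",
    "aliases": [
        {"name": "Primer", "sort-name": "Primer", "locale": "en", "primary": True},
        {"name": "Пример", "sort-name": "Пример", "locale": "ru", "primary": True},
    ],
}

print(pick_alias(SAMPLE, "en"))
print(pick_alias(SAMPLE, "fr"))
```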


=== Current issues ===


Most current issues are tracked through [https://tickets.metabrainz.org/issues/?jql=project%20%3D%20MBS%20AND%20labels%20%3D%20internationalization%20ORDER%20BY%20status%20ASC%2C%20resolution%20DESC MusicBrainz Server internationalization tickets].
Some more long-term goals are not tracked yet.


Possibly the biggest unsolved issue is that there is no way to translate MusicBrainz documentation ([[jira:MBS-1406]]). This might involve [[Internationalization#Wiki|finding a way to translate wiki content]], moving documentation to a different place, or a combination of both.


There are most likely some internationalization issues with fuzzy search in some languages (with agglutinative words or ideographic characters).
Addressing them mostly requires making proper use of [https://solr.apache.org/guide/solr/latest/indexing-guide/language-analysis.html language analysis] from Apache Solr.


=== Ideas up in the air ===


===== Artists sort names =====


Artists currently get a (main) sort name which must be in Latin script, and translations or transliterations are used for artists with non-Latin names. This could eventually be replaced by the already existing alias sort name feature, which already allows any appropriate script for the alias locale; it might require the introduction of either a generic "Latin script" alias locale or a way to indicate Latin transliterations for non-Latin alias locales.


===== Automatic Transliteration =====


Automatic transliteration could be done for many languages if no transliterated/translated alias is available. For best results it is necessary to know the language: for example, Cyrillic script is used by several languages, and transliteration from Ukrainian differs subtly from transliteration from Azerbaijani; in the case of Chinese, differences between dialects are even more dramatic. For Japanese, where identical kanji can have multiple different readings, the correct transliteration may not be easy to determine at all. In addition, individual artists may prefer a nonstandard transliteration of their names, or may have an "English" name that isn't really a transliteration.
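
To make the language dependence concrete, here is a minimal sketch with two deliberately tiny and incomplete tables: the same Cyrillic letters are romanized differently depending on whether the text is Ukrainian or Azerbaijani. The function name, the locale codes used as keys, and the choice of letters are assumptions for illustration; a real transliterator would need complete, standard-based tables.

```python
# Incomplete tables for illustration: only two letters whose Latin
# equivalents differ between the two languages are listed.
TABLES = {
    "uk": {"г": "h", "х": "kh"},  # Ukrainian
    "az": {"г": "q", "х": "x"},   # Azerbaijani written in Cyrillic
}


def transliterate(text, language):
    """Transliterate `text` using the table for `language`.

    Characters missing from the table are passed through unchanged.
    """
    table = TABLES[language]
    return "".join(table.get(char, char) for char in text)


print(transliterate("гх", "uk"))
print(transliterate("гх", "az"))
```

Without knowing the language, there is no way to choose between the two tables, which is why script detection alone is not enough.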


[[Category:Internationalization]]

Latest revision as of 19:29, 12 December 2023
