User:Reosarevok/i18n

From MusicBrainz Wiki

Beyond translation

Current features

  • Localized aliases on all entities except URL; the list of locales is imported from Unicode CLDR.
    • Support for searching entities by name, alias, or both, using fuzzy search (the default).
    • Specific sort name for each alias indicating how the entity should be sorted under that name in the given locale.
    • Aliases are returned (if specifically requested) by the MusicBrainz API.
    • Primary aliases in the user's preferred language are shown when searching to help find the right entities.
    • However, aliases are not otherwise used for display on the website; see jira:MBS-11965 for follow-up.
  • Language and script (a.k.a. writing system) of each release’s tracklist.
  • Language of each work’s lyrics.
  • Ability to enter localized artist names as appropriate on releases and recordings using artist credits.
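
As a rough illustration of how a client might use localized aliases, the sketch below picks a display name for a user's preferred locale. It assumes each alias is a dictionary with `name`, `sort-name`, `locale` and `primary` keys, roughly the shape one would expect when aliases are requested from the API; the function `pick_alias` and the sample data are hypothetical and only meant to show the fallback logic.

```python
def pick_alias(aliases, preferred_locale, default_name):
    """Pick a display name for preferred_locale (assumed alias shape, see above).

    Only primary aliases are considered. An exact locale match wins; otherwise
    the first primary alias sharing the same language is used; otherwise the
    entity's main name is returned unchanged.
    """
    language = preferred_locale.split("_")[0]
    fallback = None
    for alias in aliases:
        locale = alias.get("locale")
        if not locale or not alias.get("primary"):
            continue  # skip aliases without a locale and non-primary aliases
        if locale == preferred_locale:
            return alias["name"]
        if fallback is None and locale.split("_")[0] == language:
            fallback = alias["name"]
    return fallback or default_name


# Illustrative sample data, not copied from the database.
MAIN_NAME = "Пётр Ильич Чайковский"
SAMPLE_ALIASES = [
    {"name": "Pyotr Ilyich Tchaikovsky", "sort-name": "Tchaikovsky, Pyotr Ilyich",
     "locale": "en", "primary": True},
    {"name": "Pjotr Iljitsch Tschaikowski", "sort-name": "Tschaikowski, Pjotr Iljitsch",
     "locale": "de", "primary": True},
    {"name": "Tchaikovsky", "sort-name": "Tchaikovsky",
     "locale": "en", "primary": False},
]

print(pick_alias(SAMPLE_ALIASES, "en_GB", MAIN_NAME))
print(pick_alias(SAMPLE_ALIASES, "fr", MAIN_NAME))
```

Falling back to the main name when no alias matches mirrors the current behaviour described above, where aliases supplement the main name rather than replace it.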

Future

Entity names

Either the existing (main) entity names / titles should get a locale in the same vein as aliases, or the name / title itself should become an alias with a specific flag.

Artist sort names

Artists currently get a (main) sort name which must be in Latin script, so translations or transliterations are used for artists with non-Latin names. This could eventually be replaced by the existing alias sort name feature, which already allows any script appropriate to the alias locale; doing so might require introducing either a generic "Latin script" alias locale or a way to mark Latin transliterations for non-Latin alias locales.


Automatic transliteration

Automatic transliteration could be done for many languages when no transliterated or translated alias is available. For best results it is necessary to know the language, not just the script: Cyrillic script, for example, is used by several languages, and the transliteration of Ukrainian will differ subtly from that of Azerbaijani. In the case of Chinese, the differences between dialects are even more dramatic. For Japanese, where identical kanji can have multiple different readings, the correct transliteration may not be easy to determine at all. In addition, individual artists may prefer a nonstandard transliteration of their names, or may have an "English" name that isn't really a transliteration.
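
To make the dependence on language concrete, here is a minimal sketch in which the same Cyrillic string romanizes differently depending on the language it is tagged with. The tables are deliberately tiny and illustrative (they cover only a handful of letters and do not implement any complete romanization standard), and `transliterate` is a hypothetical helper, not an existing MusicBrainz function.

```python
# Letters that romanize the same way in both illustrative languages.
COMMON = {"а": "a", "о": "o", "р": "r", "л": "l", "н": "n",
          "к": "k", "в": "v", "т": "t", "д": "d", "м": "m"}

# Letters whose romanization depends on the language (illustrative subset).
BY_LANGUAGE = {
    "ru": {"г": "g", "и": "i"},
    "uk": {"г": "h", "и": "y"},
}


def transliterate(text, language):
    """Romanize text using the table for the given language code.

    Raises KeyError for an unknown language: without knowing the language,
    this sketch refuses to guess. Characters missing from the tables are
    passed through unchanged.
    """
    table = {**COMMON, **BY_LANGUAGE[language]}
    out = []
    for ch in text:
        lower = ch.lower()
        latin = table.get(lower, ch)
        if ch != lower:  # the input character was uppercase
            latin = latin.capitalize()
        out.append(latin)
    return "".join(out)


print(transliterate("Гора", "ru"))
print(transliterate("Гора", "uk"))
```

A real implementation would also need an override mechanism, since an artist's preferred Latin name takes precedence over anything generated automatically.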

Language-specific issues

There are some issues that are particular to specific languages. Since the browser is doing the rendering, problems like combining characters aren't an issue for MusicBrainz, but some issues remain.

Greek final sigma

One case where Unicode forces applications to deal with such character-level details directly, rather than leaving them to the browser, is the alternate form of lowercase sigma (ς) used at the end of a word. RFE 1021537 points out two places (JavaScript "Guess Case" and auto-approval for case/accent-changing edits) where this needs to be handled.
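
As a sketch of the second case, the check below decides whether an edit changes only case or accents, treating final sigma (ς) and medial sigma (σ) as the same letter. It relies on Unicode case folding, which maps both forms to σ, and on canonical decomposition to strip accents. The function names are hypothetical, and this is one possible approach rather than the logic the server actually uses.

```python
import unicodedata


def fold(text):
    """Reduce text to a form that ignores case, accents and sigma variants."""
    # Decompose so accents become separate combining characters, then drop them.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() maps both 'ς' (final) and 'Σ' to 'σ', unlike a plain lower().
    return stripped.casefold()


def is_case_or_accent_only_change(old, new):
    """True if old and new differ, but only in case, accents or sigma form."""
    return old != new and fold(old) == fold(new)


print(is_case_or_accent_only_change("ΟΔΟΣ", "Οδός"))
print(is_case_or_accent_only_change("ΟΔΟΣ", "ΟΔΟΙ"))
```

Comparing with a plain lowercase conversion is not enough here: lowercasing "ΟΔΟΣ" yields a word ending in ς, which does not compare equal to the same word spelled with σ.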

Right-to-Left Support (Arabic & Hebrew)

Characters in these alphabets are written right-to-left, and Unicode's complex bidirectional text support tries to make them display correctly even when embedded in a page that is primarily left-to-right. However, some things get botched. Meta-information in parentheses, like (disc 1), is particularly mangled, and in subscription notification emails you also get things like "2) <hebrew> open, 4 applied)". This happens because parentheses and numbers don't override the current default direction, and parentheses are mirrored based on the current direction. Judicious use of RTL and LTR overrides at the beginning and end of artist names in Arabic and Hebrew would help (although I don't believe they should be embedded in the artist names themselves). RFE ME med
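
One way to apply direction controls at display time, without storing them in the artist name, is sketched below. Note that it uses Unicode directional isolates (U+2068 FIRST STRONG ISOLATE and U+2069 POP DIRECTIONAL ISOLATE) instead of the overrides mentioned above; this is a substitution on my part, on the assumption that the rendering client supports isolates. The function names and the line format are hypothetical.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction is taken from the wrapped text
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: ends the isolated run


def isolate(text):
    """Wrap text so its direction does not leak into the surrounding line."""
    return f"{FSI}{text}{PDI}"


def subscription_line(artist_name, open_edits, applied_edits):
    """Build a notification line; the stored artist name is left untouched."""
    return f"{isolate(artist_name)} ({open_edits} open, {applied_edits} applied)"


stored_name = "שלום"  # illustrative Hebrew text standing in for an artist name
line = subscription_line(stored_name, 2, 4)
print(repr(line))
```

Because the wrapping happens only when the line is formatted, the name in the database stays free of control characters, which matches the preference stated above.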

Furthermore, localizing the web server itself into Hebrew and Arabic will require right justification (or really, mirrored display) of all page layouts. The i18n support for this is surely not yet present. RFE ME low