History:Release Language

From MusicBrainz Wiki
Revision as of 10:55, 20 November 2006 by Dmppanda (talk | contribs) (Reworded (Imported from MoinMoin))

Release Language and Script

Releases have two linguistic attributes: language and script. The language attribute (e.g. French) records the language of the release title and track titles (not the lyrics, and not any extra information on the disc sleeve). The script attribute (e.g. Latin) records the general type of characters in which the release and track titles are written. In most cases the script will be guessed correctly, so you don't need to worry too much about it. The language guess is less accurate, and the correct language may be hard to determine even for an experienced person.

Languages

MusicBrainz uses the ISO 639 language codes to record languages; there are over 400 different languages, and a list with several thousand is being developed. However, only a few of these languages appear on many MusicBrainz releases, so by default a reduced list of 20 languages is presented (you can request a complete list if the correct language for a release is not listed). The reduced list includes the official languages of the United Nations (Arabic, Chinese, English, French, Russian, and Spanish), the special case 'Multiple Languages', and the 13 other languages that are most frequently used in MusicBrainz data. The languages are chosen in this way for fairness: the community in effect decides which languages are most important each time someone sets the language of a release.
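As a rough illustration of how such a reduced list could be assembled, the sketch below takes the six UN languages and the special case as fixed entries and adds the most frequently used other languages. The function name `reduced_language_list`, its parameters, and the use of plain language names as input are assumptions made for this example; they do not describe the actual MusicBrainz implementation.

```python
from collections import Counter

# Fixed entries, as listed in the text above.
UN_LANGUAGES = ["Arabic", "Chinese", "English", "French", "Russian", "Spanish"]
SPECIAL_CASE = "Multiple Languages"


def reduced_language_list(release_languages, extra=13):
    """Hypothetical helper: build a reduced language list.

    release_languages: one language name per release in the data.
    extra: how many additional, most frequently used languages to include.
    """
    fixed = set(UN_LANGUAGES) | {SPECIAL_CASE}
    # Count only languages that are not already fixed entries.
    counts = Counter(lang for lang in release_languages if lang not in fixed)
    most_used = [lang for lang, _ in counts.most_common(extra)]
    # Alphabetical order, with the special case kept at the end.
    return sorted(UN_LANGUAGES + most_used) + [SPECIAL_CASE]


if __name__ == "__main__":
    sample = ["German"] * 5 + ["Japanese"] * 3 + ["Finnish"] + ["English"] * 10
    print(reduced_language_list(sample, extra=2))
```

With `extra=13` and enough distinct languages in the data, the result has 6 + 13 + 1 = 20 entries, matching the size of the reduced list described above.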

If several languages are used in the titles, choose the most common language. For releases where there's an equal mix of two or more languages and hence no obvious answer, 'Multiple Languages' may be the best choice. But remember that it is quite common for languages to borrow words and phrases, so "Je ne sais quoi" in an English title does not make a release 'Multiple Languages', nor do a few English words in a foreign-language title. (Some languages borrow quite extensively. For Japanese in particular, unless most of the titles are in other languages, Japanese is probably the best choice.)
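The rule above can be sketched as a small function. It assumes that a person has already judged the language of each title (so borrowed words and phrases have been accounted for) and only handles the counting step. The name `choose_release_language` and the treatment of an exact tie for first place as an "equal mix" are assumptions made for illustration.

```python
from collections import Counter


def choose_release_language(title_languages):
    """Hypothetical helper: pick the release language.

    title_languages: one language name per title, as judged by a person.
    Returns the most common language, or 'Multiple Languages' when the
    top two languages are used equally often.
    """
    counts = Counter(title_languages)
    if not counts:
        return None
    ranked = counts.most_common()
    # An exact tie for first place is treated as "no obvious answer".
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "Multiple Languages"
    return ranked[0][0]


if __name__ == "__main__":
    print(choose_release_language(["English", "English", "French"]))
    print(choose_release_language(["English", "French"]))
```

In practice the decision is a judgment call, and a near-tie may still deserve 'Multiple Languages'; the sketch only captures the simplest reading of the guideline.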

In some cases, the release and track titles written on a release may include translations or transliterations.

Scripts

MusicBrainz uses the ISO 15924 script codes to record scripts. While there are over a hundred different scripts, very few of these are used for releases, so by default, a reduced list of about a dozen scripts is presented (you can request a more extensive list if the correct script for a release is not listed). If you are entering more than a dozen releases in a script that is not listed in the reduced list, please send a request to the UsersMailingList to have your script added to the reduced list.

If several scripts are used on a release, choose the most common script (there is no 'Multiple Scripts' choice); however, as the Latin script is common in many languages that primarily use another script, Latin should only be chosen if there are no more than one or two titles (or a few characters) in other scripts. For example, a Japanese release with a mix of English and Japanese titles should ... (unfinished sentence here? --Keschte)
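The general rule (most common script, with Latin chosen only when very few titles use another script) can be sketched as follows. This is a rough illustration under stated assumptions: scripts are detected from Unicode character names via Python's standard `unicodedata` module, only a handful of coarse script labels are recognized, "one or two titles" is read as a limit of 2, and the names `scripts_in_title` and `choose_release_script` are hypothetical. It does not attempt the Japanese case, where Han and kana characters would need to be combined into Kanji & Kana.

```python
import unicodedata
from collections import Counter

# Assumption: "one or two titles" in the guideline is read as a limit of 2.
MAX_OTHER_SCRIPT_TITLES = 2

# Coarse, illustrative mapping from Unicode character-name prefixes to labels.
NAME_PREFIXES = [
    ("LATIN", "Latin"),
    ("CYRILLIC", "Cyrillic"),
    ("GREEK", "Greek"),
    ("ARABIC", "Arabic"),
    ("HEBREW", "Hebrew"),
    ("THAI", "Thai"),
    ("HANGUL", "Hangul"),
]


def scripts_in_title(title):
    """Return the set of coarse script labels found among a title's letters."""
    found = set()
    for ch in title:
        if not ch.isalpha():
            continue  # skip digits, spaces and punctuation
        name = unicodedata.name(ch, "")
        for prefix, label in NAME_PREFIXES:
            if name.startswith(prefix):
                found.add(label)
                break
    return found


def choose_release_script(titles):
    """Pick the script used in the most titles, restricting when Latin wins."""
    per_title = [scripts_in_title(t) for t in titles]
    counts = Counter()
    for scripts in per_title:
        counts.update(scripts)  # each script counts once per title
    if not counts:
        return None
    best = counts.most_common(1)[0][0]
    # Number of titles that contain at least one non-Latin script.
    other_titles = sum(1 for scripts in per_title if scripts - {"Latin"})
    if best == "Latin" and other_titles > MAX_OTHER_SCRIPT_TITLES:
        del counts["Latin"]
        best = counts.most_common(1)[0][0]
    return best


if __name__ == "__main__":
    print(choose_release_script(["Intro", "Москва", "Outro"]))
    print(choose_release_script(
        ["Intro", "Москва", "Ночь", "Зима", "Outro", "Live", "Remix"]))
```

The "(or a few characters)" part of the guideline is not modelled here; a title with a single stray character in another script still counts as a title in that script.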

The script data comes from ISO 15924, with one exception: the code Hrkt, "(alias for Hiragana + Katakana)", has been renamed to "Kanji & Kana", because Japanese is often a mix of kanji, hiragana and katakana and there is no single script code to cover this.

Guide to Common Scripts

  • Latin (also known as Roman or, incorrectly, "English")
      Latin is the most common script, and usually the correct choice. It is used for all Western European languages, and many others. It is the most common script used for transliterations.
  • Arabic العربية
      The Arabic script is used for languages in the Middle East and Central Asia such as Arabic, Farsi and Urdu.
  • Cyrillic Кириллик
      Cyrillic is used for languages in Eastern Europe such as Russian, Ukrainian, Belarusian and Bulgarian.
  • Greek Ελληνικά
      The Greek script is used for Greek, but several characters have also been adopted for mathematical uses.
  • Han 汉字
      This script should only be used for Chinese where the variant is unknown; in all other cases, Han (simplified), Han (traditional), Kanji & Kana, or Hangul should be used instead.
  • Han (simplified) 简体字
      The simplified variant of Han characters is used to write Chinese in mainland China and Singapore.
  • Han (traditional) 正體字
      The traditional variant of Han characters is used to write Chinese in Hong Kong and Taiwan.
  • Hangul 한글
      Hangul is used exclusively for Korean. Hangul should also be used for any Korean which also includes Hanja (Hanzi).
  • Hebrew עברית
      The Hebrew script is used for Hebrew, but a few characters have also been adopted for mathematical uses.
  • Kanji & Kana 漢字 & ひらがな & カタカナ
      This covers any combination of kanji, hiragana and katakana for Japanese.
  • Katakana カタカナ
      Katakana should only be used for transliterations into Japanese. Japanese language titles with words written in katakana should use Kanji & Kana.
  • Thai ไทย
      The Thai script is used for Thai, as well as some minor languages in Southeast Asia.