History:Release Transliteration And Translation

From MusicBrainz Wiki
Revision as of 12:20, 21 December 2005 by DonRedman (talk | contribs) (reworking the whole page (Imported from MoinMoin))

Summary: This is a proposal to add a relationship for translating or transliterating track titles from one human language to another. The relationship works at the Album level.

Proposal

Add an AdvancedRelationshipType that relates an album in one language to an album in another language, to represent translations and transliterations (TranslationAndTransliterationRelationshipType).
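As a rough illustration only, the proposed relationship could be modelled as a record linking two album entries. The names used here (Kind, AlbumRelationship, the album IDs) are hypothetical and do not reflect the actual MB schema.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The two cases the proposed relationship type is meant to cover."""
    TRANSLATION = "translation"
    TRANSLITERATION = "transliteration"


@dataclass(frozen=True)
class AlbumRelationship:
    """One user assertion: target album is a t9n/t13n of source album."""
    source_album_id: str  # album in the original language or script (hypothetical ID)
    target_album_id: str  # translated or transliterated album (hypothetical ID)
    kind: Kind


# Example: a Latin-script album entry recorded as a transliteration
# of a Japanese-script entry.
rel = AlbumRelationship("album-ja", "album-latin", Kind.TRANSLITERATION)
```

Storing the assertion as a separate record leaves both album entries untouched, which fits the intent that this is an interim measure.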

Definitions

Are these actually correct?

Translation (t9n)
means that one human-oriented natural language (e.g. English-US, Japanese, etc.) is translated to another. Words that have meaning in language A are translated into words that have meaning in language B.
Transliteration (t13n)
means that words which have meaning in the original language are transliterated into the scripting conventions of another language, so that the sound is roughly the same. This can include a change of script (e.g. Katakana to Latin), but does not have to (a French name can be transliterated to German).

This is an interim measure for the current MB database and is not intended to be a permanent solution. It does not solve the Artist name transliteration problem.

Advantage(s):

  • The user's assertion that one album is related to another can be stored.
  • When the MB database schema is updated in a way that truly supports i18n, the information in the relationships between Albums can be retrieved and automatically imported.
  • (Allows the user to choose which album's info to use when extracting from the database. Actually this functionality already exists when the Artist is the same.)

Disadvantage(s):

  • Information grouped on Albums, such as DiscIDs, is not automatically related. Solving this would need either:
    • Duplication of data, i.e. storing the DiscID in multiple albums (how bad is this? NadelnderBambus will support this).
    • Presentation software on the MB server side that collects and shows shared DiscIDs etc. by parsing the relationships.
  • If e.g. an English and a Japanese release of the same album get transliterated to Cyrillic, you will have to break the DontMakeRelationshipClusters rule.
  • Validity?
  • What to do with unique Album info such as release dates or even Artist?
  • How to publish this capability to a MB database contributor?
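The second option for DiscIDs above (collecting shared DiscIDs by parsing the relationships) could be sketched roughly as follows. This is only an illustration under stated assumptions: relationships are treated as plain pairs of album IDs, and the function names, album IDs and DiscID values are all hypothetical, not part of the MB server.

```python
from collections import defaultdict, deque


def related_albums(start, relationships):
    """Return every album reachable from `start` by following relationships.

    `relationships` is an iterable of (album_a, album_b) pairs; the links
    are followed in both directions.
    """
    neighbours = defaultdict(set)
    for a, b in relationships:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = {start}
    queue = deque([start])
    while queue:
        album = queue.popleft()
        for other in neighbours[album] - seen:
            seen.add(other)
            queue.append(other)
    return seen


def shared_disc_ids(start, relationships, disc_ids_by_album):
    """Collect the DiscIDs of `start` and of all albums related to it."""
    ids = set()
    for album in related_albums(start, relationships):
        ids |= set(disc_ids_by_album.get(album, ()))
    return ids


# Hypothetical data: three entries for one album, DiscID stored only once.
relationships = [("aerial-symbol", "aerial-latin"), ("aerial-symbol", "aerial-ja")]
disc_ids_by_album = {"aerial-symbol": ["DISCID-1"]}
found = shared_disc_ids("aerial-ja", relationships, disc_ids_by_album)
```

Because the traversal follows links transitively, it also copes with chains of related albums, at the cost of walking the relationships every time an album is displayed.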

Amendments

Amendment 1: Add an "official" or "unofficial" status to this transliteration relationship. "This is an 'official' transliteration."

Advantage(s):

  • Transliteration validity (but how is validity concretely defined?)

Disadvantage(s):

  • Is an unmarked Album transliteration more or less official than an "unofficial" Album transliteration?

Amendment 2: Add a language identifier to this transliteration relationship. "This is a 'Japanese' transliteration."

Advantage(s):

  • The human language can be identified for filtering purposes, so people who prefer a certain language can receive the appropriate transliteration if one is available. (future server functionality?)

Disadvantage(s):

  • None?
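The filtering that Amendment 2 is meant to enable might look something like the sketch below. The record layout, the short language codes and the function name are assumptions made for illustration; the proposal does not specify how languages would be identified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transliteration:
    """A transliteration relationship carrying a language identifier."""
    original_album_id: str  # hypothetical album ID
    album_id: str           # hypothetical ID of the transliterated album
    language: str           # e.g. "ja", "en", "ru" (code scheme is an assumption)


def pick_album(original_album_id, preferred_language, transliterations):
    """Return the album matching the preferred language, if available.

    Falls back to the original album when no transliteration in the
    preferred language has been recorded.
    """
    for t in transliterations:
        if (t.original_album_id == original_album_id
                and t.language == preferred_language):
            return t.album_id
    return original_album_id


known = [
    Transliteration("album-ja", "album-latin", "en"),
    Transliteration("album-ja", "album-cyrillic", "ru"),
]
choice = pick_album("album-ja", "ru", known)
fallback = pick_album("album-ja", "de", known)
```

Falling back to the original album keeps the lookup usable even when only a few transliterations have been entered.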

Background: Discussion on the Mailing List

This was discussed on the UsersMailingList as Duplicate albums for transliteration. It is currently (2005-12) being discussed on the StyleMailingList as Cyrrilic.

Some Relevant Points

  • Recently there was some discussion about how to deal with Kate Bush's Aerial, which contains a track named <pi>. To appease clients that can't deal with Unicode, it looks as though it was decided to create two identical versions of this album, one with the track named with the symbol <pi> (shouldn't that be capital <Pi>?) and one with it named "Pi". Apart from this, the two albums are identical. In fact, there is also a third album representing the Japanese release, with the same DiscID but with Japanese naming. This approach seems to me to be an unmaintainable solution. You end up with redundant (and unnormalized) data that need to be maintained in parallel...
  • People will not like this, since arbitrary scripts and languages can be stored in the db. Fact is, however, that the db cannot deal with the relationships between such transliterations in a proper way.
  • Could you elaborate on this? I was thinking that there could be an album-unit relationship entity called "xxx is a transliteration of yyy" that could be added in the meantime. It would help to identify data in future improvements. Does it make sense?