User:CallerNo6/Sort Name

From MusicBrainz Wiki
Revision as of 01:57, 29 February 2016 by Legoktm (talk | contribs) (Legoktm moved page User:Caller number six/Sort Name to User:CallerNo6/Sort Name: Automatically moved page while renaming the user "Caller number six" to "CallerNo6")

Problems I see in Sort Name Style

Purpose

I think it's important that there be a stated purpose to this guideline rather than just a list of "rules"[2] that are to be followed. I mention it because the following points are based on expected results.

I tend to treat the purpose as this:

  1. mimic bricks-n-mortar record store ordering for the most part
  2. try to put names where people would expect to find them

It has been suggested that we could go further toward record-store sorting by e.g. appending first names when none are present. This is interesting, but I think it should be a separate field. Otherwise, one would need to know biographical details about an artist in order to know where to look in a collated list.

(To some extent, that will always be true, of course.)

Jr/Sr

The current guidelines handle "jr.", "sr." etc. badly. By appending them to the Family Name, we get unexpected results.

Example (using the current guidelines):

  • "Walter Davis, jr." would be given sort name "Davis jr., Walter"
  • "Betty Davis" would be given the sort name "Davis, Betty"
  • So Walter Davis would sort before Betty Davis, or anybody else with that family name, because "space" sorts before "comma".

The correct sort name would be "Davis, Walter, jr"
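To make the difference concrete, here is a minimal sketch in Python. It assumes a plain code-point comparison (what `sorted` does on strings by default); a locale-aware collation may behave differently. The variable names are mine, not part of any guideline.

```python
# Sort names as produced by the current guideline (suffix attached to the family name).
current = ["Davis, Betty", "Davis jr., Walter"]

# Sort names with the suffix moved to the end, as suggested above.
proposed = ["Davis, Betty", "Davis, Walter, jr"]

# Plain code-point sort: " " (U+0020) compares lower than "," (U+002C),
# so "Davis jr." lands ahead of every "Davis, ..." entry.
current_order = sorted(current)
proposed_order = sorted(proposed)

print(current_order)   # Walter first
print(proposed_order)  # Betty first, Walter after her
```

With the suffix at the end, the family name is always followed directly by the comma, so the given names decide the order.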

Titles and Ranks

The current guidelines handle titles/ranks badly. By prepending the title to the person's given name, we get unexpected results.

Example (using the current guidelines):

  • "Sir Arthur Sullivan" would be given the sort name "Sullivan, Sir Arthur"
  • "Maxine Sullivan" would be given the sort name "Sullivan, Maxine"
  • So Maxine would sort before Arthur, because "M" sorts before the "S" of "Sir".
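The effect can be sketched in Python as follows. The second sort uses a hypothetical key function that ignores a leading title when comparing given names; both `TITLES` and `key_ignoring_title` are my own illustration, not something the guidelines define, and the title list is deliberately tiny.

```python
# Hypothetical list of titles to ignore when comparing; not taken from any guideline.
TITLES = ("Sir ", "Dame ")

def key_ignoring_title(sort_name):
    """Split 'Family, Given' and drop a leading title from the given-name part."""
    family, _, given = sort_name.partition(", ")
    for title in TITLES:
        if given.startswith(title):
            given = given[len(title):]
            break
    return (family, given)

names = ["Sullivan, Sir Arthur", "Sullivan, Maxine"]

as_written = sorted(names)                             # compares "Sir Arthur" with "Maxine"
title_ignored = sorted(names, key=key_ignoring_title)  # compares "Arthur" with "Maxine"

print(as_written)     # Maxine first
print(title_ignored)  # Arthur first
```

This only shows the order a reader would probably expect. Whether the title should move to the end of the sort name or be handled some other way is a separate question.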

Delimiter

Using "comma" as a delimiter can give unexpected results. The reason for this is that "space" sorts before "comma" (while both sort before the alphanumeric characters). This means that names with two (or more) words will be sorted in unexpected ways.

Example:

  • "Jerry Garcia" has sort name "Garcia, Jerry"
  • "Gabriel García Márquez" has sort name "García Márquez, Gabriel"
  • "García Márquez" will sort /before/ "Garcia" in any collation that treats "í" like "i". Why? Because "space" sorts before "comma".
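A small Python sketch of the delimiter problem. To isolate the space-versus-comma effect, it compares names with accents stripped, which is my rough stand-in for a collation that treats "í" like "i"; `strip_accents` is a helper I made up for this illustration. A raw code-point sort is shown alongside, because there the accent decides the order before the delimiter is ever reached.

```python
import unicodedata

def strip_accents(text):
    """Decompose characters and drop combining marks, so 'í' compares like 'i'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

names = ["Garcia, Jerry", "García Márquez, Gabriel"]

# Space is U+0020 (32) and comma is U+002C (44), so space compares lower.
space_code, comma_code = ord(" "), ord(",")

# Accents ignored: "Garcia Marquez, ..." vs "Garcia, ..." differ at " " vs ",".
by_base_letters = sorted(names, key=strip_accents)

# Raw code points: "i" (U+0069) vs "í" (U+00ED) differ first, so the delimiter never matters.
by_code_point = sorted(names)

print(space_code, comma_code)
print(by_base_letters)  # García Márquez first
print(by_code_point)    # Garcia, Jerry first
```

So the surprise described above shows up as soon as accented and unaccented letters are collated together, which is what most library or record-store style listings would do.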

SPAs

Currently the Special Purpose Artists retain the brackets in their sort names.

It was pointed out to me that this groups them logically outside the normal flow of proper names (since "[" sorts before a-z). I think that's a cool thing, but is it correct?

Fictional character names, fictitious names

Currently fictional characters are given sort names as real people would be, when possible, so e.g. "Simpson, Bart".

This is not explicit in the guidelines. An editor has requested that this be made explicit.

Similarly, given "Bob Foo's Superlative Orchestra", MB would give the sort name "Foo, Bob, Superlative Orchestra". This mimics record-store sorting, but is not in line with (at least some) libraries.

Possessive apostrophe

As long as "Bob Foo's Superlative Orchestra" sorts on Bob Foo's last name, the apostrophe-s should be dropped. "Foo's," will not sort with "Foo,". Worse, it might come before or after depending on whether the apostrophe is ASCII or not, and depending on the sorting utility used.
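Here is a rough Python illustration of the before-or-after problem, again assuming a plain code-point sort. The three sort names are made up for the demonstration: one without the possessive, one with an ASCII apostrophe (U+0027), and one with a typographic apostrophe (U+2019).

```python
plain = "Foo, Bob"
ascii_apostrophe = "Foo's, Bob, Superlative Orchestra"        # U+0027
curly_apostrophe = "Foo\u2019s, Bob, Superlative Orchestra"   # U+2019

# Code points involved: "'" is 39, "," is 44, and U+2019 is 8217.
# The ASCII apostrophe therefore sorts before the comma, the typographic one after it.
ordered = sorted([plain, curly_apostrophe, ascii_apostrophe])

for name in ordered:
    print(name)
```

In this sketch the plain "Foo, Bob" ends up sandwiched between the two apostrophe variants. Dropping the apostrophe-s from the sort name avoids the issue regardless of which character was typed.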

Unicode Guidelines

Nikki has urged me to streamline the Unicode/character/symbol parts of the guideline. I'm open to suggestions.

i18n

Some non-English guidelines would be nice.

Display

Sort Names as we currently use them need to be relatively display-friendly. They appear in artist searches and are used as official transliterated Artist names. This isn't a problem per se, but needs to be taken into account when rethinking sort name style. If sort names didn't need to be display-friendly, that would be nice.

Obsolescence

How many people even care about sort names? It seems more common to use search functionality.