Talk:Sort Name Style

From MusicBrainz Wiki
Revision as of 14:57, 12 September 2007 by Foolip (talk | contribs) (suggestion for new style guide (Imported from MoinMoin))


SortNameStyle Discussion

BibTeX has a pretty complete sorting algorithm, and that one defines a "von-part" of the name. Thus, IIRC, Manfred Albrecht Freiherr von Richthofen (the 'Red Baron') is sorted: Richthofen, Freiherr von, Manfred Albrecht. I assume that is what you mean by 'adjective'. The correct description is either "von-part" or aristocratic title or something like that. I think most sorting practices agree that this is a special part that has to be treated by itself. --DonRedman

  • This cannot be entirely true, at least for German usage. Ludwig van Beethoven, for example, has the sortname "Beethoven, Ludwig van". This is AFAIK the general usage for German aristocratic titles, so your example should read "Richthofen, Manfred Albrecht Freiherr von". --derGraph

There actually are standardised names for most artists, which have been determined by the wise seers at the Library of Congress (and are in the process of being harmonised with the British Library and other national libraries). These are called "name authority records" and can be conveniently searched online at:

According to them, the official names for the above are: "Muslimgauze (Musician)" and " 'N Sync (Musical group)". (Authorised names are in the "100's" fields, and known alternate names or aliases are found in the "600's" fields.) The others don't seem to have authority records..., yet.

Another scheme that librarians use, primarily for electronic records, is called the Dublin Core. (It's a way of adding information to HTML documents to identify the creator, etc.) I found a site about the Dublin Core which contains a very good description of how people who catalogue things for a living approach unknown names and non-standard characters. There's a PDF document or you can skip directly to an HTML version of page 9 ("Creator") in the Google cache.

What about fictitious artist names like Pete Namlook? It looks like a real person's name, but it isn't. It is, however, an alias for a person. --Zout 
  • I would treat it as a real name. -- WolfSong 13:48, 01 February 2006 (UTC)

I personally treat artists like ♪◆m599XGSMF6 under #2 (giving m599XGSMF6 as a sortname) whereas artists like (´・д・)ノ I have no idea about (Japan probably has some name for them...). --Nikki 

  • Is it counter-intuitive to sort bands that contain single artists under the band rather than the artist? It'd be a pretty weird record shop where Bob Dylan and The Band is in the "D" section, Jimi Hendrix is in the "H" section but the Jimi Hendrix Experience is in the "J" section.
    • Which is probably why I've never liked this sorting scheme. I used to work in record store retail, and groups with a member's name were always sorted by the sort name of the member. So Dave Matthews Band would be under Matthews; Alan Parsons Project would be under Parsons. I do realize that the actual sort name will look ugly, but I don't think people generally "look" at sort names with any frequency. You're more likely to look at the "sorted" name (meaning how it looks in a list). The hurdle will be how to place the elements: is it "Hendrix Experience, The Jimi" or "Hendrix Experience, Jimi, The"? The driver for that decision should be how the system (Picard and TaggerScript in general) will interpret it, not whether it is visually appealing to humans. -- WolfSong 17:14, 14 February 2006 (UTC)
      • Totally agree; it seems odd that 'Jimi Hendrix' and 'Jimi Hendrix Experience' are not sorted together. It seems to defy the purpose of sortnames mo

hb- I’m not happy at all about any of the rules after #4. They are poorly written, poorly structured, and in my opinion, poorly thought out.

There is an on-going debate about the last (#9) item of rule #6, regarding handling of proper names within band names. The fact that rule #6 has 9 items really indicates that there are problems here.

It’s also debatable as to whether there -is- a debate on this issue. While several have spoken out against the current rule, no one has come to its defense or tried to explain the reasoning.

It really feels as though there are only a very few people who prefer not to sort the artist and to leave the ArtistSortName tag identical to the ArtistName tag as much as possible. This defeats the idea of the overall dictum, which is that “Sortnames are –heavily- edited in order to sort all artists well” (emphasis added). Rule 6 item 9 is absolutely counter to this direction, and its supporters need to speak up or be over-ruled in their silence.

ArtistSortNames do NOT have to be non-ugly. They have to be functional. The simple ArtistName tag is the display tag. Why bother to have two tags if we keep them the same despite the desire to actually “sort” our artists?

Rules 5 and 6 need to be re-written for clarity and process, and for reversal of item 9.

I propose that the existing first four rules remain as they are written.

After those, use the following rules:

5. ArtistNames that include natural separators like “and”, “with”, “&”, “vs.”, and “,” (comma) have already created de facto multiple, ordered, sequential “parts”. Whether these ordered “parts” contain Collaborating Artists or not is immaterial. Maintain artist intent by keeping this order and performing the remaining sort naming accordingly –within- each “part”. If there are no separators, then treat the whole ArtistName as a single “part”. Each “part” follows the remaining rules independently of the others. Keep their order, and keep the separator between them without an additional comma.

6. Within each “part”, if present, pull out the full proper name of the non-fictional individual artist or member for whom the sort will be performed, e.g. “Jimi Hendrix” and “Dave Matthews” and “Alex Harvey”. If there is no proper non-fictional artist name or band member, then proceed to rule #7. Format the proper name portion of the “part” accordingly:

a. The proper name must be the individual artist or a band member. An ArtistName “George Washington” does not get sorted under “W” unless the individual is, or the band includes, a George Washington.

b. Regular names like “First-Name Last-Name” are sorted like “Last-Name, First-Name” with the addition of a comma. Example: "Eric Clapton" sorts as "Clapton, Eric".

c. For artist names with a nickname between the first name and last name, the nickname is treated as if it's part of the first name of the artist. Example: "Jean 'Toots' Thielemans" sorts as "Thielemans, Jean 'Toots'".

d. Leading Titles like “Dr.” and “DJ” and “MC” are moved after the person’s name with a preceding comma. Example: “Dr. Dre” sorts as “Dre, Dr.” and "DJ Tiësto" sorts as "Tiësto, DJ".

e. Trailing suffixes like “Sr.” or “Jr.” or “III” always remain at the end of the individual’s name. Example: "Harry Connick, Jr." sorts as "Connick, Harry, Jr.". Though there usually is one, add no preceding comma where there is no comma already. Example: “Dave Thomas III” sorts as “Thomas, Dave III”.

f. For artists whose last names start with an abbreviation, the last names are unabbreviated in the sort name. Example: "Rebecca St. James" sorts as "Saint James, Rebecca".

g. Removal of the proper name from the ArtistName “part” might leave a remaindered portion. In this case, a comma is added at the end of the proper name portion, and the remainder is kept as a new, single, whole entity and treated identically to a “part” under Rule 7. The result is concatenated at the end after the comma. Example: “The Jimi Hendrix Experience” has a remainder of “The Experience” and sorts as “Hendrix, Jimi, Experience, The” and “The Sensational Alex Harvey Band” leaves “The Sensational Band” and sorts as “Harvey, Alex, Sensational Band, The”.

7. ArtistName “parts” without proper names (including band names and remaindered portions from Rule 6) are handled accordingly:

a. Leading Articles, like “The” and “A” and “Los”, regardless of language, are moved to the end with a preceding comma. Example: “The Beatles” and “A Perfect Circle” and “Los Lobos” sort as “Beatles, The” and “Perfect Circle, A” and “Lobos, Los”.

b. Leading abbreviations are unabbreviated. Example: "St. Lunatics" sorts as "Saint Lunatics".


The Jimi Hendrix Experience – Hendrix, Jimi, Experience, The

Dave Matthews Band – Matthews, Dave, Band

The Sensational Alex Harvey Band – Harvey, Alex, Sensational Band, The

Stevie Ray Vaughan and Double Trouble – Vaughan, Stevie Ray and Double Trouble

Roger Clyne and The Peacemakers – Clyne, Roger and Peacemakers, The

Hootie and the Blowfish – Hootie and Blowfish, The

Note that there is no need for a comma after “Stevie Ray” and “Roger” as the full proper names were parts without any remainder per rule 6g.

If the name had been “The Peacemakers and Roger Clyne” then keeping proper sequential order would yield “Peacemakers, The and Clyne, Roger”. Note that there is no need for a comma after the “The” following “Peacemakers”.

Note: There are two general rules for when we add a comma. First, when we separate the proper name from its remainder. Second, and more commonly, when we change the sequence of words, for example by moving the leading “The” or swapping first name and last name. We don’t add a comma next to the natural separators like “and” and “with” because we haven’t relocated any words and they work just fine on their own.

Script writers/programmers will notice that these rules lend themselves quite nicely to programmatically deriving ArtistSortName from ArtistName without having to use many exception cases, given a searchable table of non-fictional proper names.
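To make that claim concrete, the proposed rules 5 to 7 can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not anything taken from Picard or the MusicBrainz server: the tables `PROPER_NAMES`, `ARTICLES` and `ABBREVIATIONS` and all function names are hypothetical, the tiny name table stands in for the searchable table of proper names mentioned above, and titles, nicknames, suffixes and abbreviated last names (rules 6c to 6f) are left out.

```python
import re

# Hypothetical stand-in for the searchable table of non-fictional proper names.
PROPER_NAMES = ["Jimi Hendrix", "Dave Matthews", "Alex Harvey", "Roger Clyne"]
ARTICLES = {"the", "a", "los"}        # rule 7a, deliberately incomplete
ABBREVIATIONS = {"St.": "Saint"}      # rule 7b, deliberately incomplete
# Rule 5 separators; the capture group makes re.split keep them in the result.
SEPARATORS = re.compile(r"( and | with | & | vs\. |, )")


def sort_plain(part):
    """Rule 7: move a leading article to the end, expand a leading abbreviation."""
    words = part.split()
    if not words:
        return ""
    article = None
    if len(words) > 1 and words[0].lower() in ARTICLES:
        # Capitalised to match the "Hootie and Blowfish, The" example.
        article = words.pop(0).capitalize()
    words[0] = ABBREVIATIONS.get(words[0], words[0])
    name = " ".join(words)
    return f"{name}, {article}" if article else name


def sort_part(part):
    """Rule 6: pull out a known proper name, then treat the remainder per rule 7."""
    for name in PROPER_NAMES:
        if name in part:
            *given, family = name.split()
            proper = f"{family}, {' '.join(given)}"            # rule 6b
            remainder = " ".join(part.replace(name, "", 1).split())
            if remainder:                                      # rule 6g
                return f"{proper}, {sort_plain(remainder)}"
            return proper
    return sort_plain(part)


def sort_name(artist_name):
    """Rule 5: split on separators, sort each part, rejoin in the original order."""
    pieces = SEPARATORS.split(artist_name)
    # Even indexes hold parts, odd indexes hold the captured separators.
    return "".join(p if i % 2 else sort_part(p) for i, p in enumerate(pieces))


print(sort_name("The Jimi Hendrix Experience"))      # Hendrix, Jimi, Experience, The
print(sort_name("Roger Clyne and The Peacemakers"))  # Clyne, Roger and Peacemakers, The
print(sort_name("Hootie and the Blowfish"))          # Hootie and Blowfish, The
```

Treating the comma as a separator means a name like "Harry Connick, Jr." would need the suffix handling of rule 6e before splitting, which this sketch does not attempt.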

  • To the anonymous contributor: Well, some of us aren't happy with any of the rules from number 1 on. Personally, I think we could save a lot of time and effort if we rewrote them thus:
    1. For a single artist, ArtistSortName must be the same as the artist's name as listed in the U.S. Library of Congress catalog.
    2. For multiple artists working together (e.g. "Tony Sheridan and the Beatles"), the ArtistSortName should be broken down into separate artists' names, each name replaced with the name as listed in the U.S. Library of Congress catalog, then the names re-assembled (e.g. "Sheridan, Tony and Beatles").
    3. There is no rule 3.
    But I doubt that's going to happen. --LarryGilbert

hb- Sorry for the AP. It was unintentional.

Thanks for the support inasmuch as it's one more voice seeking a meaningful ArtistSortName.

I certainly have no problem with Sort names appearing as you propose. I especially like the concept of having a strongly enforced database of standard values. Hey! That's what MusicBrainz is supposed to be.

Seriously, I'd considered the LoC approach before. (I've been down this road quite a bit.) I'd so love to see a new tag for LoC LC number. However, for the life of me I can't find a way to access the LoC DB programmatically. If that's not just me, then I think that's a deal killer.

Maybe if someone can find a link explaining how-to, I'd get behind that idea. Until then, as I see it, the only thing we have to go from is the value in ArtistName, and I stand by my proposal.

Still waiting to hear from someone who supports the current rules. --HnryBrdsly

I have a question about artists like "The Ghost Who Walks": should we edit their name in the style of "Ghost Who Walks, The" or just leave it as "The Ghost Who Walks"? I think it should be the latter, because it seems more like a sentence/statement than a name. -- Mackattack

  • I think that the sortname should be "Ghost Who Walks, The", on the off chance that somewhere there is a release that left out the "The", so that they get sorted right next to each other. -- MartinRudat 13:34, 17 June 2006 (UTC)

In Flanders (Dutch-speaking part of Belgium), family names that start with "Van", "De" or similar are generally sorted under "Van ...", "De ...", etc. E.g. "Boudewijn de Groot" would be sorted under "de Groot, Boudewijn". --JanC

In the Netherlands they are definitely sorted without the article or preposition, though. And that answers the question as well: they are definitely not adjectives. 'de' and 'het' and their variations are articles; 'van' (and some other less common ones like 'in' or 'op') are prepositions. -- thisfred

The value of preserving case is not immediately clear. It is not unusual for music libraries to "smash" case for sort order by storing only one case. My first artist entries to MusicBrainz were fouled because the JavaScript in my browser did not work for the guess feature and it was not clear to me that case mattered. This means others and I had to go back and edit all the entries I made. Additional verifications and some substitute for the JavaScript magic might be nice, but it seems that, at the very least, if preserving upper and lower case is going to be a big deal, that should be mentioned up front in the SortNameStyle guideline. -- m0llusk
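One way to "smash" case only for ordering, while still storing the name exactly as entered, is to sort on a case-folded key. A minimal Python sketch of that idea (the function name `sorted_ignoring_case` is made up for illustration and is not how MusicBrainz is known to work):

```python
def sorted_ignoring_case(sort_names):
    """Order sort names without regard to case; the stored strings are unchanged."""
    return sorted(sort_names, key=str.casefold)


names = ["beatles, The", "Beatles, The", "ABBA", "a-ha"]
print(sorted(names))                # plain code-point order: uppercase initials come first
print(sorted_ignoring_case(names))  # case variants of the same name end up adjacent
```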

Some clarification is needed for what to do when there are three or more artist names to a sortname, and whether or not it's acceptable to use semicolons, as traditionally dictated in English grammar. There was some disagreement recently with The Hacker, Millimetric & David Caretta: whether the sortname should be "Hacker, The, Millimetric & Caretta, David" or "Hacker, The; Millimetric & Caretta, David". My taste runs to the latter, but the style guidelines here indicate the former is correct. Would it be all right to add something like "Do not use semicolons even if there are three or more names" to clarify the intent? --LarryGilbert

  • I've added the comma to rule 4 above, to make it clear the same separator has to be used. I think that will do --Zout

Why is (for example) 10,000 Maniacs sorted as "10,000 Maniacs" and not "Ten Thousand Maniacs" as libraries do? --LarryGilbert

Since abbreviations such as St. are expanded in sort names, shouldn't Jr. and Sr. be too? --Creap

  • Note that according to the current rules, it's only the first word after sorting that gets expanded. There are only a few that I can think of that would remain there after the title move, mostly just Saint and its equivalents in various other languages. --SailorLeo

The sort names for Chinese releases are kind of a mess and very unhelpful. The de facto rule seems to be using the artist's transliterated family name and their English name, e.g. "Chou, Jay", "Leung, Tony", "Chang, Jeff". This is helpful to non-Chinese users, but kind of breaks what sort names are. I sometimes change these to the transliterated names, like "Chang, Shin Che".

Furthermore, the sort names are very unhelpful for Chinese releases and make it harder to find what you are looking for even when the sort names are proper transliterations. This is because different transliterations are used in different areas, and some artists seem to make up their own transliterations that "sound good". For example, the family name 周 is written as "Zhou", "Chou" and "Chow". It would probably be easier to find what you're looking for if simple Unicode sorting was used for Chinese names.

Ideas? Should sort names like "Chou, Jay" be tolerated?
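The scattering effect can be illustrated with a small Python sketch. The artists below are invented placeholders (only the family name 周 and its three romanisations come from the comment above), and plain code-point order is used as the "simple Unicode sorting"; it is not a linguistically correct Chinese collation.

```python
# Hypothetical artists: (name in Han script, romanised sort name).
# The family name 周 is romanised three different ways, as described above.
artists = [
    ("周甲", "Zhou, Jia"),
    ("梁乙", "Leung, Yi"),
    ("周丙", "Chou, Bing"),
    ("張丁", "Chang, Ding"),
    ("周戊", "Chow, Wu"),
]

by_sort_name = sorted(artists, key=lambda a: a[1])   # current practice
by_code_point = sorted(artists, key=lambda a: a[0])  # simple Unicode sorting


def family_names(rows):
    """First character of each native name, which is the family name here."""
    return [native[0] for native, _ in rows]


print(family_names(by_sort_name))   # the 周 entries are split up by romanisation
print(family_names(by_code_point))  # the 周 entries sit together
```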


  • I believe that there are moves afoot to have proper support for translation/transliteration, rather than overloading sort name for that. Until that time, people are going to keep putting Latin text in there rather than simply copying the artist's name, which I think will be the correct thing for most Asian names. Perhaps in addition to, say, pinyin, romaji, etc., there should also be an 'official transliteration', based on however the artist/publisher/label seems to write it most often. Until we have that, though, I'd say that it's quite likely going to be a pain to try and keep a lid on things like that. (Personally, I'd browse for stuff using 'official transliteration', 'cause I can't read any of those squiggles. =) -- MartinRudat 07:34, 19 August 2007 (UTC)
    • Do you have a reference/link to that discussion where I could give my input? -- foolip
      • No, sorry. This was just my impression from the IRC channel. After a bit of poking around, apparently translation and transliteration are supported between albums, but there isn't a relationship between two artists saying that x is a transliteration of y. I don't really know if this is the state at which translation is going to be supported, or if there is going to be more. The last time I heard anything was sometime last year. -- MartinRudat 10:49, 20 August 2007 (UTC)

Suggested amendment to the

  1. All ArtistSortNames should be in Latin script. Other scripts such as Greek, Hebrew and Han (Chinese/Japanese) should use a sort name as per below.
    • An official transliteration/translation as it appears on album covers or other official material.
    • A widely known transliteration/translation as known in the press, by fans, etc.
    • A transliteration using the standard transliteration system used in the region where the artist is active.
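If the three options above are read as an order of preference (an assumption; the suggestion does not say so explicitly), choosing a sort name becomes a simple fallback chain. A minimal Python sketch, with a hypothetical function name and parameters:

```python
from typing import Optional


def pick_sort_name(official: Optional[str],
                   widely_known: Optional[str],
                   regional_standard: Optional[str]) -> Optional[str]:
    """Return the first available Latin-script form, in the order listed above."""
    for candidate in (official, widely_known, regional_standard):
        if candidate and candidate.strip():
            return candidate.strip()
    return None  # no Latin-script form known yet


# No official form recorded, so the widely known form wins over the pinyin one.
print(pick_sort_name(None, "Chou, Jay", "Zhou, Jielun"))  # Chou, Jay
```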

I don't understand why French particules aren't treated the same as Dutch tussenvoegselen. It seems to me all the arguments applied to tussenvoegselen also apply to particules. Truthfully, I feel "Groot, de, Boudewijn" just looks silly and doesn't improve sortability. Suppose there was another artist named "Martijn de Groot" - these two artists would sort in the same order whether the "de" was in the middle or at the end. And what exactly are the rules for Germanic names? Beethoven has sort name "Beethoven, Ludwig van". --dkg
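The point about the two "de Groot" artists can be checked with a short Python sketch. The helper names are invented for illustration, and both helpers assume a three-word "given particle family" name:

```python
artists = ["Martijn de Groot", "Boudewijn de Groot"]


def particle_in_middle(name):
    """'Boudewijn de Groot' -> 'Groot, de, Boudewijn'."""
    given, particle, family = name.split()
    return f"{family}, {particle}, {given}"


def particle_at_end(name):
    """'Boudewijn de Groot' -> 'Groot, Boudewijn de'."""
    given, particle, family = name.split()
    return f"{family}, {given} {particle}"


# Both styles put Boudewijn before Martijn.
print(sorted(artists, key=particle_in_middle))
print(sorted(artists, key=particle_at_end))
```

This only compares two artists who share the same particle and family name; it says nothing about how either style orders them relative to other artists.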