Difference between revisions of "Style/Artist/Sort Name"

From MusicBrainz Wiki

Revision as of 00:36, 26 July 2007

Style for sorting artist names

This page describes how to format the ArtistSortName for an ArtistName.

Sortnames are heavily edited in order to sort all artists well, international ones and those with extremely weird names. The following rules apply:

  1. All ArtistSortNames should be in Latin script. For Japanese, Chinese, Greek, etc. ArtistNames this means they have to be transliterated.
  2. For artist names that are stylised, change stylised characters to their equivalent letters, remove any style-only characters, and then apply the rules below. This applies to names that include unusual punctuation or spacing which is not intended to be pronounced using the normal rules.
  3. ArtistSortNames contain all the accented characters that are present in the ArtistName, as long as they're in Latin script.
  4. Numerals do not change.
  5. If an ArtistName consists of two or more collaborating artists, each individual name is sorted separately according to the rules below. The 'separator' (e.g. "&", "and", "with", "vs." or "," (comma)) stays the same.
  6. All parts of a sort name are separated by ", " (comma and space). How to distinguish parts is explained below.
    • For artist names that are regular names, the sort name will be "Last Name, First Name". Example: "Eric Clapton" has sort name "Clapton, Eric".
    • For artist names that are fictitious names, the sort name is the same as the artist name. Examples: "Franz Ferdinand" and "Cypress Hill".
    • For artist names that start with "The", that word is treated as if it were the first name of a regular name. Example: "The Beatles" have sort name "Beatles, The".
    • Non-English articles like La, El, Los and Le are treated like the English article The. Example: "Los Lobos" have sort name "Lobos, Los".
    • For artist names that start with a title like "Dr.", "DJ" or "MC", that title is treated as if it were the first name of a regular name. Example: "DJ Tiësto" has sort name "Tiësto, DJ".
    • For artist names that end with a title like "Jr." or "Sr.", that title is always put at the end of the sort name, preceded by ", ". Example: "Harry Connick, Jr." has sort name "Connick, Harry, Jr.".
    • For artist names with a nickname between the first name and last name, the nickname is treated as if it's part of the first name of the artist. Example: "Jean 'Toots' Thielemans" has sort name "Thielemans, Jean 'Toots'".
    • For artists whose last names start with an abbreviation, the last names are unabbreviated in the sort name. Example: "Rebecca St. James" has sort name "Saint James, Rebecca". This also holds for groups. "St. Lunatics" has sort name "Saint Lunatics".
    • Artist names that contain a person's name (usually bands) do not sort as persons, but as fictitious names. Examples: "The Sensational Alex Harvey Band" has sort name "Sensational Alex Harvey Band, The". "The Jimi Hendrix Experience" has sort name "Jimi Hendrix Experience, The".
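
The sub-rules of rule 6 can be sketched in code. The following is a minimal illustration, not an official MusicBrainz implementation: the function names, the word tables and the `is_person` flag are all assumptions made for this example, and deciding whether a name belongs to a real person still needs human judgement or a lookup table.

```python
# Word tables taken from the examples in the rules above (not exhaustive).
ARTICLES = {"the", "la", "el", "los", "le"}
TITLES = {"dr.", "dj", "mc"}
SUFFIXES = {"jr.", "sr."}
ABBREVIATIONS = {"St.": "Saint"}


def expand_leading(text):
    """Unabbreviate the first word, e.g. "St. James" -> "Saint James"."""
    first, _, rest = text.partition(" ")
    first = ABBREVIATIONS.get(first, first)
    return f"{first} {rest}" if rest else first


def sort_single(name, is_person):
    """Sort name for one artist; is_person is supplied by the caller."""
    words = name.split()
    # A leading article or title is treated like a first name and moved back.
    if len(words) > 1 and words[0].lower() in ARTICLES | TITLES:
        return expand_leading(" ".join(words[1:])) + ", " + words[0]
    if not is_person:
        # Fictitious and band names stay as they are, apart from abbreviations.
        return expand_leading(name)
    suffix = ""
    if len(words) > 1 and words[-1].lower() in SUFFIXES:
        suffix = ", " + words.pop()
        words[-1] = words[-1].rstrip(",")
    # The last name is the final word, plus a preceding abbreviation if any.
    split = len(words) - 1
    if split > 0 and words[split - 1] in ABBREVIATIONS:
        split -= 1
    last = expand_leading(" ".join(words[split:]))
    first = " ".join(words[:split])
    return f"{last}, {first}{suffix}" if first else last + suffix


print(sort_single("Harry Connick, Jr.", is_person=True))
print(sort_single("The Jimi Hendrix Experience", is_person=False))
```

Note that "A" is deliberately absent from the article table, because the examples on this page keep "A Perfect Circle" unchanged.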

Examples

  1. Transliteration: Пётр Ильич Чайковский is transliterated to Pyotr Ilyich Tchaikovsky (and has the sortname "Tchaikovsky, Pyotr Ilyich").
  2. De-stylizing: My$t:c DJz have sort name "Mystic DJz". *NSync have sort name "NSync". t r a n c e [ c o n t r o l] has sort name "trance control". Exceptions to this rule are artists whose names do not mean anything and cannot be transliterated in any way. Examples: (´・д・)ノ has sort name "(´・д・)ノ". ♪◆m599XGSMF6 has sort name "♪◆m599XGSMF6".
  3. Accented characters: "René Löwe" has sort name "Löwe, René".
  4. Numerals: "The Four Freshmen" have sort name "Four Freshmen, The". "10,000 Maniacs" have sort name "10,000 Maniacs". "Maroon 5" have sort name "Maroon 5".
  5. Collaborating Artists: "Bob Dylan and The Band" have sort name "Dylan, Bob and Band, The". "B.B. King & Eric Clapton" have sort name "King, B.B. & Clapton, Eric". "Bill Haley & His Comets" have sort name "Haley, Bill & His Comets". This rule does not apply to artist names that seem to consist of more than one artist, but do not. Example: the sort name for "Hootie & the Blowfish" is the same as the artist name, because the Blowfish are not a separate band.
  6. "A Perfect Circle" has sort name "A Perfect Circle".
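
The collaboration examples above (rule 5) come down to splitting on the separator, sorting each part on its own, and rejoining with the same separator. A minimal sketch follows, assuming a hand-entered table of per-artist sort names and a hand-entered list of names that only look like collaborations; both tables and the function name are hypothetical. The comma separator is left out here because commas also occur inside names such as "10,000 Maniacs".

```python
import re

# Separators named in rule 5, except the comma (see the note above).
SEPARATORS = re.compile(r"\s+(&|and|with|vs\.)\s+")

# Hypothetical per-artist sort names, entered by hand for this example.
PART_SORT_NAMES = {
    "Bob Dylan": "Dylan, Bob",
    "The Band": "Band, The",
    "B.B. King": "King, B.B.",
    "Eric Clapton": "Clapton, Eric",
    "Bill Haley": "Haley, Bill",
}

# Names that look like collaborations but are a single artist.
NOT_COLLABORATIONS = {"Hootie & the Blowfish"}


def sort_collaboration(artist_name):
    if artist_name in NOT_COLLABORATIONS:
        return artist_name
    # re.split keeps the captured separators at the odd positions.
    pieces = SEPARATORS.split(artist_name)
    sorted_pieces = [
        piece if index % 2 else PART_SORT_NAMES.get(piece, piece)
        for index, piece in enumerate(pieces)
    ]
    return " ".join(sorted_pieces)


print(sort_collaboration("Bob Dylan and The Band"))
```

Parts that are missing from the table, such as "His Comets", are passed through unchanged.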

Language specific rules

These language specific rules are not official, but are generally applied for artist names in the languages listed below. Sort names for non-English artist names are not discussed yet. Please help out.

Dutch

ArtistNames with a tussenvoegsel (there is no English word for this; it's the bit between the first and last name): artist "Boudewijn de Groot" sorts as "Groot, de, Boudewijn". This seems to me the clearest and most logically correct way to sort these artists. Since "de" is not part of the first name ("De Groot" is the last name), and since we want to sort these persons under "Groot", the best option is "Groot, de, Boudewijn". This is better than "Groot, Boudewijn de", where "de" seems to be part of the first name, which is incorrect.

French

French usage is to put the particle (particule) after the first name (as in Portuguese, it seems). Hence, "Alfred de Musset" gets the sortname "Musset, Alfred de". MB live example at François de Roubaix.
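
The difference between the Dutch and French conventions described above can be sketched as follows. This is only an illustration: the particle list is a small sample taken from words mentioned on this page, and the function name and the `style` parameter are assumptions of the example.

```python
# Sample particles mentioned on this page; a real table would be longer.
PARTICLES = {"de", "het", "van", "in", "op"}


def sort_with_particle(name, style):
    """Sort a "First particle Last" name in the Dutch or French style."""
    words = name.split()
    start = next((i for i, w in enumerate(words) if w in PARTICLES), None)
    if start is None or start == 0 or start == len(words) - 1:
        # No particle in the middle: leave the name to the general rules.
        return name
    end = start
    while end < len(words) - 1 and words[end] in PARTICLES:
        end += 1
    first = " ".join(words[:start])
    particle = " ".join(words[start:end])
    last = " ".join(words[end:])
    if style == "dutch":
        # The particle gets its own comma-separated part.
        return f"{last}, {particle}, {first}"
    if style == "french":
        # The particle trails the first name.
        return f"{last}, {first} {particle}"
    raise ValueError(f"unknown style: {style}")


print(sort_with_particle("Boudewijn de Groot", "dutch"))
print(sort_with_particle("Alfred de Musset", "french"))
```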

More complicated examples:

Hungarian

Hungarian names follow the "western" custom of having a given name and a family name. However, Hungary is the only European country to place the family name before the given names, i.e. it uses the eastern name order. So effectively the sort name equals the name there.

Icelandic

Icelandic artists are sorted on their first name.

Italian

All of the official rules above apply. Leading articles that are moved to the end are "il", "gli", "lo", "la", "i" and "le". The Italian equivalents of the Dutch "van" are "de" and "del"; they should be treated as in the Dutch rules.

Japanese

Names are usually family name first, given name second. As a result, the sortname (once transliterated) is the same as the artist name. However, Japanese artists commonly known in countries that use Latin script often reverse their name for releases in those nations, so some caution is required when adding such artists.

Portuguese

Person:

  • Last name, First name [2nd, 3rd, ...]. Example: "Moreira, Gilberto Passos Gil"
  • Specific rules:
    • Compound last name. Example: "Espírito Santo, Pedro"
    • Kinship-indicating names (Filho, Neto, Júnior) go with the last name. Example: "Connick Júnior, Harry"
    • de, da, e before last name. Example: "Hollanda, Francisco Buarque de"
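
As a rough sketch, the Portuguese rules above could be applied like this. The input names are reconstructed from the sort names given in the examples, which is an assumption, and the compound-name table and function name are hypothetical; compound last names cannot be detected without such a table.

```python
KINSHIP_NAMES = {"Filho", "Neto", "Júnior"}

# Hypothetical table of compound last names; it has to be maintained by hand.
COMPOUND_LAST_NAMES = {"Espírito Santo"}


def sort_portuguese(name):
    words = name.split()
    kinship = []
    # A trailing kinship name stays attached to the last name.
    if len(words) > 2 and words[-1] in KINSHIP_NAMES:
        kinship = [words.pop()]
    size = 2 if " ".join(words[-2:]) in COMPOUND_LAST_NAMES else 1
    last = words[-size:] + kinship
    # Particles such as "de", "da" and "e" simply stay with the first names.
    first = words[:-size]
    if not first:
        return " ".join(last)
    return " ".join(last) + ", " + " ".join(first)


print(sort_portuguese("Francisco Buarque de Hollanda"))
```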

Romanian

Persons are sorted normally (LastName, FirstName). There are no "de" or "von" particles in normal Romanian names. Bands with Romanian names are always sorted by their name, because the definite article in Romanian is a suffix.

Discussion

BibTeX has a pretty complete sorting algorithm, and that one defines a "von-part" of the name. Thus IIRC Manfred Albrecht Freiherr von Richthofen (the 'Red Baron') is sorted: Richthofen, Freiherr von, Manfred Albrecht. I assume that is what you mean by 'adjective'. The correct description is either "von-part" or aristocratic title or something like that. I think most sorting practices agree that this is a special part that has to be treated by itself. --DonRedman

  • This cannot be entirely true, at least for German usage. Ludwig van Beethoven for example has the sortname "Beethoven, Ludwig van". This is AFAIK the general usage for German aristocratic titles; so your example should read "Richthofen, Manfred Albrecht Freiherr von". --derGraph


There actually are standardised names for most artists which have been determined by the wise seers at the Library of Congress (and are in the process of being harmonised with the British Library and other national libraries). These are called "name authority records" and can be conveniently searched online at: http://authorities.loc.gov/

According to them, the official names for the above are: "Muslimgauze (Musician)" and " 'N Sync (Musical group)". (Authorised names are in the "100's" fields, and known alternate names or aliases are found in the "600's" fields.) The others don't seem to have authority records yet.

Another scheme that librarians use, primarily for electronic records, is called the Dublin Core. (It's a way of adding information to HTML documents to identify the creator, etc.) I found a site about the Dublin Core which contains a very good description of how people who catalogue things for a living approach unknown names and non-standard characters. There's a PDF document or you can skip directly to an HTML version of page 9 ("Creator") in the Google cache.


What about fictitious artist names like Pete Namlook? It looks like a real person's name, but it isn't. It is, however, an alias for a person. --Zout 
  • I would treat it as a real name. -- WolfSong 13:48, 01 February 2006 (UTC)

I personally treat artists like ♪◆m599XGSMF6 under #2 (giving m599XGSMF6 as a sortname) whereas artists like (´・д・)ノ I have no idea about (Japan probably has some name for them...). --Nikki 


  • Is it counter-intuitive to sort bands that contain single artists under the band rather than the artist? It'd be a pretty weird record shop where Bob Dylan and The Band is in the "D" section, Jimi Hendrix is in the "H" section but the Jimi Hendrix Experience is in the "J" section.
    • Which is probably why I've never liked this sorting scheme. I used to work in record store retail, and groups with a member's name were always sorted by the sort name of the member. So Dave Matthews Band would be under Matthews; Alan Parsons Project would be under Parsons. I do realize that the actual sort name will look ugly, but I don't think people generally "look" at sort names with any frequency. You're more likely to look at the "sorted" name (meaning how it looks in a list). The hurdle will be how to place the elements. Is it "Hendrix Experience, The Jimi" or "Hendrix Experience, Jimi, The"? The driver for that decision should be how the system (Picard and TaggerScript in general) will interpret it, not whether it is visually appealing to humans. -- WolfSong 17:14, 14 February 2006 (UTC)
      • totally agree, it seems odd that 'Jimi Hendrix' and 'Jimi Hendrix Experience' are not sorted together, it seems to defy the purpose of sortnames mo

Either option is counterintuitive, and sortname is used for things other than sorting. It's also the method Picard and other clients use to transliterate names in non-Latin scripts for people who cannot read them, or cannot use them in filenames. -Sailorleo

  • Really? That sounds like it's a potential source of problems... if we had an artist named, say "Dr. Johņ 'Saint' de St. Doę, Jr.", the sort-name would be something along the lines of "de Saint Doe, Dr. John 'Saint', Jr." which strikes me as not being a terribly friendly name...
    • actually for picard, it no longer uses sortname, but aliases, so we get Kago Ai, not Ai, Kago. picard will also be using TaggerScript in just a little bit. mo

hb- I’m not happy at all about any of the rules after #4. They are poorly written, poorly structured, and in my opinion, poorly thought out.

There is an on-going debate about the last (#9) item of rule #6, regarding handling of proper names within band names. The fact that rule #6 has 9 items really indicates that there are problems here.

It’s also debatable as to whether there -is- a debate on this issue. While several have spoken out against the current rule, no one has come to its defense or tried to explain the reasoning.

It really feels as though there are only a very few people who prefer to not sort the artist and leave the SortArtistName tag identical to the ArtistName tag as much as possible. This defeats the idea of the overall dictum: which is that “Sortnames are –heavily- edited in order to sort all artists well” (emphasis added). Rule 6 item 9 is absolutely counter to this direction and its supporters need to speak up or be over-ruled in their silence.

ArtistSortNames do NOT have to be non-ugly. They have to be functional. The simple ArtistName tag is the display tag. Why bother to have two tags if we keep them the same despite the desire to actually “sort” our artists?

Rules 5 and 6 need to be re-written for clarity and process, and for reversal of item 9.

I propose that the existing first four rules remain as they are written.

After those, use the following rules:

5. ArtistNames that include natural separators like “and”, “with”, “&”, “vs.”, and “,” (comma) have already created de facto multiple, ordered, sequential “parts”. Whether these ordered “parts” contain Collaborating Artists or not is immaterial. Maintain artist intent by keeping this order and performing the remaining sort naming accordingly –within- each “part”. If there are no separators, then treat the whole ArtistName as a single “part”. Each “part” follows the remaining rules independently of the others. Keep their order, and keep the separator between them without an additional comma.

6. Within each “part”, if present, pull out the full proper name of the non-fictional individual artist or member for whom the sort will be performed, e.g. “Jimi Hendrix” and “Dave Matthews” and “Alex Harvey”. If there is no proper non-fictional artist name or band member, then proceed to rule #7. Format the proper name portion of the “part” accordingly:

a. The proper name must be the individual artist or a band member. An ArtistName “George Washington” does not get sorted under “W” unless the individual is, or the band includes, a George Washington.

b. Regular names like “First-Name Last-Name” are sorted like “Last-Name, First-Name” with the addition of a comma. Example: "Eric Clapton" sorts as "Clapton, Eric".

c. For artist names with a nickname between the first name and last name, the nickname is treated as if it's part of the first name of the artist. Example: "Jean 'Toots' Thielemans" sorts as "Thielemans, Jean 'Toots'".

d. Leading Titles like “Dr.” and “DJ” and “MC” are moved after the person’s name with a preceding comma. Example: “Dr. Dre” sorts as “Dre, Dr.” and "DJ Tiësto" sorts as "Tiësto, DJ".

e. Trailing suffixes like “Sr.” or “Jr.” or “III” always remain at the end of the individual’s name. Example: "Harry Connick, Jr." sorts as "Connick, Harry, Jr.". Though there usually is one, add no preceding comma where there is no comma already. Example: “Dave Thomas III” sorts as “Thomas, Dave III”.

f. For artists whose last names start with an abbreviation, the last names are unabbreviated in the sort name. Example: "Rebecca St. James" sorts as "Saint James, Rebecca".

g. Removal of the proper name from the ArtistName “part” might leave a remaindered portion. In this case, a comma is added at the end of the proper name portion, and the remainder is kept as a new, single, whole entity and treated identically to a “part” under Rule 7. The result is concatenated at the end after the comma. Example: “The Jimi Hendrix Experience” has a remainder of “The Experience” and sorts as “Hendrix, Jimi, Experience, The” and “The Sensational Alex Harvey Band” leaves “The Sensational Band” and sorts as “Harvey, Alex, Sensational Band, The”.

7. ArtistName “parts” without proper names (including band names and remaindered portions from Rule 6) are handled accordingly:

a. Leading Articles, like “The” and “A” and “Los”, regardless of language, are moved to the end with a preceding comma. Example: “The Beatles” and “A Perfect Circle” and “Los Lobos” sort as “Beatles, The” and “Perfect Circle, A” and “Lobos, Los”.

b. Leading abbreviations are unabbreviated. Example: "St. Lunatics" sorts as "Saint Lunatics".

Examples:

The Jimi Hendrix Experience – Hendrix, Jimi, Experience, The

Dave Matthews Band – Matthews, Dave, Band

The Sensational Alex Harvey Band – Harvey, Alex, Sensational Band, The

Stevie Ray Vaughan and Double Trouble – Vaughan, Stevie Ray and Double Trouble

Roger Clyne and The Peacemakers – Clyne, Roger and Peacemakers, The

Hootie and the Blowfish – Hootie and Blowfish, The

Note that there is no need for a comma after “Stevie Ray” and “Roger” as the full proper names were parts without any remainder per rule 6g.

If the name had been “The Peacemakers and Roger Clyne” then keeping proper sequential order would yield “Peacemakers, The and Clyne, Roger”. Note that there is no need for a comma after the “The” following “Peacemakers”.

Note: There are two general rules for when we add a comma. The first is where we separate the proper name from its remainder. The second, and more common, is where we change the sequence of words, for example moving the leading “The” or swapping first name and last name. We don’t add a comma at natural separators like “and” and “with” because we haven’t relocated any words and they work just fine on their own.

Script writers/programmers will notice that these rules lend themselves quite nicely to programmatically deriving ArtistSortName from ArtistName without having to use many exception cases, given a searchable table of non-fictional proper names.

  • To the anonymous contributor: Well, some of us aren't happy with any of the rules from number 2 on. Personally, I think we could save a lot of time and effort if we rewrote them thus:
    1. For a single artist, ArtistSortName must be the same as the artist's name as listed in the U.S. Library of Congress catalog.
    2. For multiple artists working together (e.g. "Tony Sheridan and the Beatles"), the ArtistSortName should be broken down into separate artists' names, each name replaced with the name as listed in the U.S. Library of Congress catalog, then the names re-assembled (e.g. "Sheridan, Tony and Beatles").
    3. There is no rule 3.
    But I doubt that's going to happen. --LarryGilbert
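
For what it's worth, rules 5 to 7 as proposed by hb above can be sketched in a few lines. This is only an illustration under stated assumptions: the proper-name table is a hypothetical stand-in for the "searchable table" hb mentions, the function names are invented, only rules 6b, 6g, 7a and 7b are covered, and the comma separator is not handled.

```python
import re

# Hypothetical stand-in for the searchable table of non-fictional names.
PROPER_NAMES = ["Jimi Hendrix", "Dave Matthews", "Alex Harvey", "Roger Clyne"]
ARTICLES = {"the", "a", "los"}
ABBREVIATIONS = {"St.": "Saint"}
SEPARATORS = re.compile(r"\s+(and|with|&|vs\.)\s+")


def sort_band(text):
    """Rule 7: move a leading article back, expand a leading abbreviation."""
    words = text.split()
    article = None
    if len(words) > 1 and words[0].lower() in ARTICLES:
        # Capitalised to match the "Hootie and Blowfish, The" example.
        article = words.pop(0).capitalize()
    if words and words[0] in ABBREVIATIONS:
        words[0] = ABBREVIATIONS[words[0]]
    result = " ".join(words)
    return f"{result}, {article}" if article else result


def sort_person(name):
    """Rule 6b, simplified: the final word is taken as the last name."""
    first, _, last = name.rpartition(" ")
    return f"{last}, {first}" if first else last


def sort_part(part):
    """Rule 6: pull out a known proper name; the remainder goes to rule 7."""
    for proper in PROPER_NAMES:
        if proper in part:
            remainder = " ".join(part.replace(proper, "", 1).split())
            result = sort_person(proper)
            if remainder:
                result += ", " + sort_band(remainder)  # rule 6g
            return result
    return sort_band(part)


def sort_name(artist_name):
    """Rule 5: sort each part separately and keep the separators."""
    pieces = SEPARATORS.split(artist_name)
    return " ".join(p if i % 2 else sort_part(p) for i, p in enumerate(pieces))


print(sort_name("The Jimi Hendrix Experience"))
print(sort_name("Roger Clyne and The Peacemakers"))
```

The substring lookup is the weak point: it only works as well as the table behind it, which is the same dependency hb's proposal has.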


I have a question about artists like "The Ghost Who Walks": should we edit their name in the style of "Ghost Who Walks, The" or just leave it as "The Ghost Who Walks"? I think it should be the latter because it seems more like a sentence/statement than a name. -- Mackattack

  • I think that the sortname should be "Ghost Who Walks, The", on the off chance that somewhere there is a release that left out the "The", so that they get sorted right next to each other. -- MartinRudat 13:34, 17 June 2006 (UTC)


In Flanders (Dutch-speaking part of Belgium), family names that start with "Van", "De" or similar are generally sorted under "Van ...", "De ...", etc. E.g. "Boudewijn de Groot" would be sorted under "de Groot, Boudewijn". --JanC



In the Netherlands they are definitely sorted without the article or preposition though. And that answers the question as well: they are definitely not adjectives. 'de' and 'het' and their variations are articles; 'van' (and some other less common ones like 'in' or 'op') are prepositions. -- thisfred



The value of preserving case is not immediately clear. It is not unusual for music libraries to "smash" case for sort order by storing only one case. My first artist entries to MusicBrainz were fouled because the JavaScript in my browser did not work for the guess feature and it was not clear to me that case mattered. This means I and others had to go back and edit all the entries I made. Additional verifications and some substitute for the JavaScript magic might be nice, but it seems at the very least that if preserving upper and lower case is going to be a big deal, that should be mentioned up front in the sortnamestyle guideline. -- m0llusk



Some clarification is needed for what to do when there are three or more artist names to a sortname, and whether or not it's acceptable to use semicolons, as traditionally dictated in English grammar. There was some disagreement recently with The Hacker, Millimetric & David Caretta: whether the sortname should be "Hacker, The, Millimetric & Caretta, David" or "Hacker, The; Millimetric & Caretta, David". My taste runs to the latter, but the style guidelines here indicate the former is correct. Would it be all right to add something like "Do not use semicolons even if there are three or more names" to clarify the intent? --LarryGilbert

  • I've added the comma to rule 4 above, to make it clear the same separator has to be used. I think that will do --Zout


Why is (for example) 10,000 Maniacs sorted as "10,000 Maniacs" and not "Ten Thousand Maniacs" as libraries do? --LarryGilbert



Since abbreviations such as St. are expanded in sort names, shouldn't Jr. and Sr. be too? --Creap

  • Note that according to the current rules, it's only the first word after sorting that gets expanded. There are only a few that I can think of that would remain there after the title move, mostly just Saint and its equivalents in various other languages. --Sailorleo