History:Capitalization Proposal: Difference between revisions

From MusicBrainz Wiki
Revision as of 13:33, 15 February 2008

Status: This Page is Glorious History!

The content of this page either is bit-rotted, or has lost its reason to exist due to some new features having been implemented in MusicBrainz, or maybe just described something that never made it in (or made it in a different way), or possibly is meant to store information and memories about our Glorious Past. We still keep this page to honor the brave editors who, during the prehistoric times (prehistoric for you, newcomer!), struggled hard to build a better present and dreamed of an even better future. We also keep it for archival purposes because it may still contain crazy thoughts and ideas that could be reused someday. If you're not into looking at either the past or the future, you should just disregard this page's content entirely and look for an up-to-date documentation page elsewhere.

Tom Hull's Capitalization Proposal

  • Status: This has been accepted and made the current CapitalizationStandard -- see Tom's comments below for more details.


Use the following standard guidelines for capitalizing titles (of albums or songs) in the English language. Different rules apply for other languages.

All words in a title should have their first letter capitalized and the following letters in lower case, except as noted below:

  • 1) Always capitalize the first and last word of a title. This rule should be followed even if the words would normally be lowercase according to the other rules. If a title is broken up by major punctuation (colon, question mark, exclamation mark, em-dash, parentheses, or quotes), capitalize each distinct piece of the title as if it were a distinct title. That is, always capitalize the first and last words of each section.
  • 2) Capitalize all words between the first and last word of a title except:
    • a) Articles: a, an, the
    • b) Coordinate conjunctions: and, but, or, nor
    • c) Short (three letters or fewer) prepositions: as, at, by, for, in, of, on, to -- except when used as adverbs or as an inseparable part of a verb
    • d) When used to form an infinitive: to
  • 3) In compounds formed by hyphens, capitalize each part exactly as if it were a separate word.
  • 4) Capitalize contractions and slang consistent with the rules above to the extent that such clearly apply. For example, do not capitalize o' for "of", 'n' or n' for "and".

These rules should cover the overwhelming majority of cases, but there will always be exceptions that need to be decided by votes on a case by case basis.



There is a JavaScript GuessCase function that is implemented on the server that does the above plus more.
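As an illustration only, the rules above can be sketched in a few lines of Python. This is not the GuessCase function; the names `capitalize_title`, `LOWERCASE_WORDS`, `BREAKS` and `TRIM` are hypothetical and chosen for this example. The sketch assumes that only double quotes count as "quotes" (single quotes are left alone so that contractions survive), and it cannot tell when a short preposition is being used as an adverb or as part of a verb, so those exceptions to rule 2c still need a human. It also lower-cases everything after the first letter, as the rules state, which will flatten acronyms.

```python
import re

# Words kept in lower case unless they open or close a section.
LOWERCASE_WORDS = frozenset({
    "a", "an", "the",                                  # 2a: articles
    "and", "but", "or", "nor",                         # 2b: coordinate conjunctions
    "as", "at", "by", "for", "in", "of", "on", "to",   # 2c/2d: short prepositions, infinitive "to"
    "o'", "'n'", "n'",                                 # 4: contractions of "of" and "and"
})
BREAKS = ':?!()"\u2014'   # major punctuation from rule 1 (\u2014 is the em-dash)
TRIM = BREAKS + ",.;"     # punctuation trimmed from a token before lookup


def capitalize_title(title: str) -> str:
    # Split on whitespace and hyphens, keeping the separators at odd indices,
    # so each part of a hyphenated compound counts as its own word (rule 3).
    tokens = re.split(r"(\s+|-)", title)
    parts = []  # (token index, leading punctuation, word, trailing punctuation)
    for i in range(0, len(tokens), 2):
        word = tokens[i].strip(TRIM)
        if word:
            start = tokens[i].index(word)
            parts.append((i, tokens[i][:start], word, tokens[i][start + len(word):]))

    def starts_section(n: int) -> bool:
        # True if word n opens a section; n == len(parts) marks the end of the title.
        if n == 0 or n == len(parts):
            return True
        prev_i, _, _, prev_trail = parts[n - 1]
        i, lead, _, _ = parts[n]
        gap = prev_trail + "".join(tokens[prev_i + 1:i]) + lead
        return any(ch in BREAKS for ch in gap)

    for n, (i, lead, word, trail) in enumerate(parts):
        lower = word.lower()
        edge = starts_section(n) or starts_section(n + 1)  # rule 1: first or last word
        if lower in LOWERCASE_WORDS and not edge:
            fixed = lower
        else:
            # Upper-case the first letter, skipping a leading apostrophe.
            k = next((j for j, ch in enumerate(lower) if ch.isalpha()), 0)
            fixed = lower[:k] + lower[k].upper() + lower[k + 1:]
        tokens[i] = lead + fixed + trail
    return "".join(tokens)


print(capitalize_title("the lord of the rings: the return of the king"))
# The Lord of the Rings: The Return of the King
```

Treating the colon as a section break is what makes the second "the" come out capitalized, while "of" and "the" inside each section stay in lower case.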

Discussion

Comments from Tom Hull:

For HTML, I suggest that the example words appear in bold.

Rationale: I've tried to come up with a simplified rule set that does not generally require in-depth understanding of English grammar, but produces reasonably correct results in almost all cases. The 3/4 letter preposition size limit is used by (I think) most U.S. publishers.

The trickiest part is (2c). Cutting the preposition size down limits the number of exceptional cases. The 3/4 letter split is a rough guideline. I have omitted prepositions like "up" and "out" because they are so infrequently used as prepositions that it's much simpler (and not terribly wrong) to always capitalize them; on the other hand, such 4-letter words as "from", "into", "onto", and "with" are common and almost always used as prepositions, so there is a rather good case for including them. I personally prefer to lowercase these four, but feel it would be easier (and not terribly wrong) to always uppercase them.

I've also omitted "so" from the list: while it is sometimes used as a conjunction, it is overwhelmingly used as an adverb, so the same rationale applies as with "up" and "out".

I am hard pressed to explain "as" and "by" except by grammar rules: both are used as conjunctions (lc), prepositions (lc), and adverbs (ulc); although lc uses predominate, they are not overwhelming.

Not capitalizing "to" in infinitives, which is common but not universal practice, puts it overwhelmingly in the lc camp.

The bottom line here is that we have a list of 15 words (I may have missed a couple more, what are they?) that are not capitalized in most or all cases. We could probably illustrate that list in a second file, as well as build up a deeper list of exceptions and special cases. If we have more examples, we may be able to better formulate the rules.

For non-English titles, one can either force the titles into English-language rules (with or without identifying and special-casing the foreign language articles, conjunctions, and prepositions), or one can let each language set its own rules. In my own work I do the former, but I can't defend that as a general principle, so my recommendation is that we tag titles by language and apply the appropriate language-specific rules. Still, this leaves us with further problems: how to determine the language of ambiguous titles or titles with foreign words, and how to handle bilingual titles.

And, of course, this doesn't address the real problem with album titles, which is often just what the hell is the title? Lots of albums say one thing on the spine, another on the cover; have subtitles or series titles that can be combined with dashes or colons, possibly in more than one way. --TomHull



I would just like to express my opinion - opinion only - that using the capitalization on the album itself is never wrong. This is assuming, of course, that the album title isn't in all-caps (and not an acronym or similar). Unfortunately, it's impossible for people who don't have the album in front of them to judge whether the capitalization is "right" or "wrong", then. Alternative argument: If we had a perfect algorithm for determining the caps of the titles, no user effort would be required for it, as they could be set automatically by the database/display code. Hence the present user-level effort put into getting the capitalization right is somewhat superfluous.

As noted, there are a handful of album- and song-titles where the capitalization is nonstandard, however, and in those cases the only way to make sure is to check the album cover / website for the correct capitalization. One suggestion I would make to avoid effort being wasted on this would be to have such an automatic capitalization algorithm, with a checkbox for "Non-standard capitalization" in the input form. Some usability tests might be worth conducting, too; it might make sense to confront submitters with an extra page asking "Is this the capitalization you intended?", and in most cases the non-standard capitalization can be guessed or inferred. --Donwulff



Eric: I like the extra page idea. I think forcing all titles to English language specifications is a bad idea, specifically for the reason that it limits poetic license. But, some titles are actually just descriptions, such as for symphonies, and need to be standardized. I suggest that when you edit data in the database, you can add a comment that moderators can read when they vote. That way, if there is some change that doesn't meet the standard, the change can be explained such as "Capitalization on album cover", etc.

Here's a real easy capitalization rule: Capitalize Every Word. It's not always the prettiest solution, but it removes ambiguity. --Pitboss and Seighin



I don't like capitalization. Just capitalize words that should be capitalized, like the first words, names of people, cities, countries etc. --MJAX

I am not against capitalization but if I can only apply the rules with a thorough understanding of English grammar, then IMHO they are too complicated. --DonRedman



Seeing English capitalization rules used for titles in a language that has different ones looks rather strange when you know the language ... And applying them in one's own mods would feel even stranger - so I would appreciate language-specific capitalization (still) being accepted. I don't find it hard to understand, though, why I've had questions about capitalization in some of my (non-English) mods lately - most of which have been in Norwegian. Tenebrous suggested in one of them to put the main rules here (to make voting on the mods easier) - so here they come ... I think the main rules are the same for quite a few languages - in short, using the same rules in titles as in normal text, i.e. to capitalize:

  1. first letters, and
  2. names (with a few exceptions for prepositions etc. in names like Stratford-upon-Avon, Frankfurt am Main, Frankfurt an der Oder, Ludwig van Beethoven)

In German, nouns are in general capitalized as well.

Composite names (e.g. The English Chamber Orchestra) probably have somewhat more varied rules than the names of persons and places. In Norwegian, the main rule is to capitalize the first letter only (a rule often violated by Norwegians, too). In general, everybody should be particularly critical of titles in the language(s) they know best ... --mede

I think that the capitalization rules for English titles are very clear and concise, but I do think we should try to gather information on how to capitalize titles in other languages. Perhaps it will be too much information to add to the Capitalization Guide, but we could at least create a wiki page containing such guidelines and link to that page from the Capitalization Guide. Let me also add that for Swedish titles, the rules are only to capitalize the first word and all proper nouns (i.e. names of persons, places, objects, etc.). -- BigNick
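For comparison, the sentence-style rule described above for Norwegian and Swedish titles (capitalize only the first word and proper nouns) could be sketched as follows. This is a rough illustration, not an existing MusicBrainz function: `sentence_case` is a hypothetical name, the caller is assumed to supply the proper nouns in lower case (a program cannot reliably recognize them on its own), and the example titles are made up. It does not cover German, where nouns in general are capitalized, and a proper noun with punctuation attached (such as a trailing comma) will not match.

```python
def sentence_case(title: str, proper_nouns: frozenset = frozenset()) -> str:
    """Lower-case every word except the first one and any listed proper noun."""
    result = []
    for n, word in enumerate(title.split()):
        lower = word.lower()
        if n == 0 or lower in proper_nouns:
            # Capitalize the first letter only.
            result.append(lower[:1].upper() + lower[1:])
        else:
            result.append(lower)
    # Note: joining on a single space collapses any repeated whitespace.
    return " ".join(result)


print(sentence_case("Sommaren I Stockholm", frozenset({"stockholm"})))
# Sommaren i Stockholm
```

The proper-noun list is the hard part in practice, which is one reason language-specific rules still depend on voters who know the language.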