Difference between revisions of "Guess Case"

From MusicBrainz Wiki
m (New modes under discussion)
(Options: specify keep all-uppercase word is on by default (+fix typos))
 
(7 intermediate revisions by one other user not shown)
Line 1: Line 1:
This is the wiki page for the GuessCase JavaScript function.  
+
Guess Case changes the capitalization of titles to bring them closer to our language guidelines. In most situations the result is not perfect and will still need some manual corrections, but it can save a lot of time.
  
[[User:Keschte|Keschte]] has enhanced the guess case script and addressed most of the [[Guess Case Old Suggestions|GuessCaseOldSuggestions]].
+
==Modes==
  
==Implemented modes==
+
These are the currently available Guess Case modes:
  
* [[Guess Case Mode/Default Mode|GuessCaseMode/DefaultMode]] - handles English titles.
+
:;English
** Implements [[Capitalization Standard English|CapitalizationStandardEnglish]]
+
::This method tries to make titles follow the [[Style/Language/English|English language guidelines]].
** All the official formatting [[Style Guideline|StyleGuideline]]s
 
  
* [[Guess Case Mode/Sentence Mode|GuessCaseMode/SentenceMode]] - handles the same cases as the [[Guess Case Mode/Default Mode|GuessCaseMode/DefaultMode]], but 
+
:;Sentence
** ''Titles'' only the first word of a sentence
+
::This method capitalizes only the first word of a sentence; all the following words of the sentence are kept lowercase. This is close to the guidelines for many non-English languages, but you should still look out for proper nouns and any other differences indicated in [[Style#Language_specific_guidelines|the appropriate language guideline]].
<ul><li style="list-style-type:none">If there are multiple sentences in a title, each one is handled as a separate sentence. (meaning: the next word after one of the sentence stop characters "?", "!", ".", ";", "/" is ''titled'' again) 
 
</ul>
 
** Does not ''title'' words after a hyphen. 
 
<ul><li style="list-style-type:none">Example: Peut-être, the second part is not ''titled'' 
 
</ul>
 
  
* [[Guess Case Mode/Classical Mode|GuessCaseMode/ClassicalMode]] - handles specific cases of the [[Classical Style Guideline|ClassicalStyleGuideline]]  
+
:;French
 +
::This method is very similar to the Sentence method, but it inserts a space before semicolons, colons, exclamation marks and question marks (; : ! ?) and also pads text inside guillemets with spaces (« text »). Keep in mind that this method is not smart enough to figure out when the first noun in a sentence should be capitalized according to the [[Style/Language/French|French language guidelines]].
  
* [[Guess Case Mode/French Mode|GuessCaseMode/FrenchMode]]  
+
:;Turkish
 +
::This method tries to make titles follow the [[Style/Language/Turkish|Turkish language guidelines]]. It is fairly similar to the English method, but it lowercases a different set of (Turkish) words, and it knows how to capitalize the letters ı and i (to I and İ respectively).
  
==New modes under discussion==
+
==Options==
  
<ul><li style="list-style-type:none">(see [http://bugs.musicbrainz.org/report/2 the open tickets] of the guess case function)
+
:;Keep all-uppercase words uppercased (on by default)
</ul>
+
:: With this option on, a title like “Absolute ABBA” is left as-is. With this option off, a title like “A VERY LOUD TITLE” will be converted to “A Very Loud Title”, “A very loud title” or similar, depending on the mode chosen. Select it if some words in the title are intentionally all-uppercase; deselect it if you have an all-uppercase tracklist that you want to convert to normal case.
  
==Borked Data==
+
:;Uppercase Roman numerals
 
+
::This option turns any Roman numerals in the titles into their more standard uppercase form. Unless you are specifically trying to uppercase Roman numerals in the titles, it may be sensible to keep this off: a fair number of common music-related words are also Roman numerals, such as "mix" (1009) or "mic" (1099), or the key E in several languages ("mi", 1001).
usually lowercased words after a single quote:  
 
* [http://musicbrainz.org/search/oldsearch.html?search=Puttin%27+on+the+Ritz+&table=track&limit=0 Puttin' on the Ritz]
 
* [http://musicbrainz.org/search/oldsearch.html?search=Singin%27+in+the+Rain+&table=track&limit=0 Singin' in the Rain]
 
* [http://musicbrainz.org/search/oldsearch.html?search=Stompin%27+at+the+Savoy+&table=track&limit=0 Stompin' at the Savoy] etc.  
 
 
 
[[Category:To Be Reviewed]] [[Category:Development]]
 

Latest revision as of 06:15, 28 December 2019

Guess Case changes the capitalization of titles to bring them closer to our language guidelines. In most situations the result is not perfect and will still need some manual corrections, but it can save a lot of time.

Modes

These are the currently available Guess Case modes:

English
This method tries to make titles follow the English language guidelines.
Sentence
This method capitalizes only the first word of a sentence; all the following words of the sentence are kept lowercase. This is close to the guidelines for many non-English languages, but you should still look out for proper nouns and any other differences indicated in the appropriate language guideline.
French
This method is very similar to the Sentence method, but it inserts a space before semicolons, colons, exclamation marks and question marks (; : ! ?) and also pads text inside guillemets with spaces (« text »). Keep in mind that this method is not smart enough to figure out when the first noun in a sentence should be capitalized according to the French language guidelines.
Turkish
This method tries to make titles follow the Turkish language guidelines. It is fairly similar to the English method, but it lowercases a different set of (Turkish) words, and it knows how to capitalize the letters ı and i (to I and İ respectively).
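
As a rough illustration of the Sentence, French and Turkish behaviours described above, here is a minimal Python sketch. It is not the actual Guess Case code: the function names (sentence_case, french_spacing, turkish_capitalize) are hypothetical, the French spacing uses a plain space, and the sketch simply lowercases everything after the first letter, so proper nouns would still need fixing by hand.

```python
import re


def sentence_case(title: str) -> str:
    """Capitalize the first letter of the title and lowercase the rest.

    Assumption: the whole title is treated as one sentence.
    """
    lowered = title.lower()
    return lowered[:1].upper() + lowered[1:]


def french_spacing(title: str) -> str:
    """Apply the extra spacing described for the French mode."""
    # Insert a space before ; : ! ? when the preceding character is not
    # already whitespace.
    spaced = re.sub(r"(?<=\S)([;:!?])", r" \1", title)
    # Pad the text inside guillemets with exactly one space on each side.
    spaced = re.sub(r"«\s*", "« ", spaced)
    spaced = re.sub(r"\s*»", " »", spaced)
    return spaced


def turkish_upper(text: str) -> str:
    """Uppercase text with the Turkish rules: i -> İ and ı -> I."""
    # Map the dotted i first; str.upper() alone would turn it into I.
    return text.replace("i", "İ").replace("ı", "I").upper()


def turkish_capitalize(word: str) -> str:
    """Capitalize only the first letter of a word, Turkish style."""
    return turkish_upper(word[:1]) + word[1:]


print(sentence_case("LA VIE EN ROSE"))
print(french_spacing("Où es-tu?"))
print(turkish_capitalize("istanbul"))
```

Python's built-in str.capitalize() would turn "istanbul" into "Istanbul", which is why the Turkish helper maps the two i letters explicitly before uppercasing.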

Options

Keep all-uppercase words uppercased (on by default)
With this option on, a title like “Absolute ABBA” is left as-is. With this option off, a title like “A VERY LOUD TITLE” will be converted to “A Very Loud Title”, “A very loud title” or similar, depending on the mode chosen. Select it if some words in the title are intentionally all-uppercase; deselect it if you have an all-uppercase tracklist that you want to convert to normal case.
Uppercase Roman numerals
This option turns any Roman numerals in the titles into their more standard uppercase form. Unless you are specifically trying to uppercase Roman numerals in the titles, it may be sensible to keep this off: a fair number of common music-related words are also Roman numerals, such as "mix" (1009) or "mic" (1099), or the key E in several languages ("mi", 1001).
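
The two options can be sketched in Python as follows. This is only an illustration under stated assumptions, not the real Guess Case logic: guess_title is a hypothetical stand-in that capitalizes every word, and uppercase_roman deliberately uses a loose test (any word made up solely of the letters i, v, x, l, c, d, m), which may be more permissive than the actual script. The roman_value helper reproduces the numbers quoted above.

```python
# Values of the individual Roman numeral letters.
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def guess_title(title: str, keep_upper: bool = True) -> str:
    """Capitalize each word, optionally leaving all-uppercase words alone."""
    words = []
    for word in title.split():
        if keep_upper and word.isupper():
            words.append(word)  # e.g. "ABBA" is left as-is
        else:
            words.append(word.capitalize())
    return " ".join(words)


def roman_value(word: str) -> int:
    """Read a word as a Roman numeral, using the subtractive rule."""
    digits = [ROMAN_VALUES[ch] for ch in word.lower()]
    total = 0
    for idx, value in enumerate(digits):
        following = digits[idx + 1] if idx + 1 < len(digits) else 0
        # A smaller value directly before a larger one is subtracted.
        total += -value if value < following else value
    return total


def uppercase_roman(title: str) -> str:
    """Uppercase every word that consists only of Roman numeral letters."""
    words = []
    for word in title.split():
        if all(ch in ROMAN_VALUES for ch in word.lower()):
            words.append(word.upper())
        else:
            words.append(word)
    return " ".join(words)


print(guess_title("Absolute ABBA"))
print(guess_title("A VERY LOUD TITLE", keep_upper=False))
print(uppercase_roman("Club mix"), roman_value("mix"))
```

With keep_upper left on, "A VERY LOUD TITLE" passes through unchanged, which is why the option should be turned off for all-uppercase tracklists. The last line shows the false positive the text warns about: "mix" reads as a valid numeral and gets uppercased.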