From MusicBrainz Wiki
Revision as of 23:13, 18 February 2009 by Foolip (talk | contribs) ((Imported from MoinMoin))

Status: this is a controversial proposal born out of, do not use it until the discussion has settled!

This page outlines the capitalization rules for the Vietnamese language. It forms part of the MusicBrainz CapitalizationStandard.

Bằng tiếng Việt

Xin bạn viết bài này bằng tiếng Việt. (Please write this article in Vietnamese.)

In English

These rules apply to album and track titles:

  • Do not capitalize any word except for:
    • the first letter of the text
    • the first letter of a sentence after . ! ? / :
    • the first letter of proper and geographical names (persons, cities, nations, etc.)
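
As a reviewing aid, the rules above can be sketched in Python. This is only an illustration, not an existing MusicBrainz tool: the function name `flag_capitalized_words` is hypothetical, and since code cannot tell a proper name from an ordinary word, it merely lists the capitalized words that a human still has to judge.

```python
import re
import unicodedata

# Marks after which, per the rules above, the next word may be capitalized.
SENTENCE_BREAKS = ".!?/:"


def flag_capitalized_words(title):
    """Return capitalized words that do not start the text or a sentence.

    Every returned word needs a human decision: keep it if it is a proper
    or geographical name, lowercase it otherwise.
    """
    # NFC keeps each accented Vietnamese letter as one code point, so that
    # \w+ does not split words at combining diacritics.
    title = unicodedata.normalize("NFC", title)
    flagged = []
    at_sentence_start = True
    # Tokens are either runs of word characters or single punctuation marks.
    for token in re.findall(r"\w+|[^\w\s]", title):
        if re.match(r"\w", token):
            if token[0].isupper() and not at_sentence_start:
                flagged.append(token)
            at_sentence_start = False
        elif token in SENTENCE_BREAKS:
            at_sentence_start = True
    return flagged
```

For "Nhớ về Hà Nội" this flags "Hà" and "Nội", which a reviewer would keep as a geographical name; for "Liên khúc: Đêm cuối / Giận hờn" it flags nothing, because "Đêm" and "Giận" follow ":" and "/".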

Punctuation (no space)

Like English, Vietnamese does not use any space before punctuation, including but not limited to ? ! ; : . , ...

Punctuation (with space)

In Vietnamese there is a no-break space (" " U+00A0), preferably a narrow no-break space (" " U+202F), before the following punctuation:

  • The exclamation mark ("!" U+0021)
  • The colon (":" U+003A)
  • The semicolon (";" U+003B)
  • The question mark ("?" U+003F)

In MusicBrainz, song and album titles are unlikely to run into word-wrap problems, so both the narrow no-break space and the no-break space seem like overkill. Hence the standard space character is used instead.

There is, however, no space before:

  • The comma ("," U+002C)
  • The dot ("." U+002E)
  • The ellipsis ("…" U+2026 or 3×dot "..." 3×U+002E)
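
To compare the two competing styles on the same title, a small Python sketch can convert between them. The helper names `to_no_space` and `to_with_space` are made up for this illustration, and the plain U+0020 default for `space` is an assumption following the remark about the standard space character.

```python
import re


def to_no_space(title):
    """Remove any whitespace (including U+00A0 and U+202F) before ! : ; ?"""
    return re.sub(r"\s+([!:;?])", r"\1", title)


def to_with_space(title, space=" "):
    """Put exactly one `space` before ! : ; ? when the mark follows a word.

    The lookbehind skips marks at the start of the text and marks that
    directly follow another such mark, so "?!" stays together.
    """
    return re.sub(
        r"(?<=[^\s!:;?])\s*([!:;?])",
        lambda match: space + match.group(1),
        title,
    )
```

Passing `space="\u202f"` gives the narrow no-break space variant. The sketch treats every ! : ; ? alike, so a title containing a time such as 12:30 or a URL would need manual checking.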

Examples (no space)

  • Em còn nhớ hay em đã quên?
  • Nhớ về Hà Nội
  • Hà Nội café ơi!
  • Liên khúc: Đêm cuối / Giận hờn

Examples (with space)

  • Em còn nhớ hay em đã quên ?
  • Nhớ về Hà Nội
  • Hà Nội café ơi !
  • Liên khúc : Đêm cuối / Giận hờn

Reference (with space)

Vietnamese language teaching book for 7 year old pupils

Bộ giáo dục và đào tạo — Lớp một (1), Tiếng Việt, Tập một : Chữ cái và vần

Ministry of Education and Training, First class (1), Vietnamese language, first volume : Letters and syllables

Images: A (img0780qp9.jpg), B (img0782yw7.jpg), C (img0783gx1.jpg), D (img0784nc9.jpg), E (img0785ol4.jpg)

  • A: This is the cover. Bộ giáo dục và đào tạo stands for Ministry of education and training
  • B: Nhớ lời Bác dạy : Chăm học, chăm làm ; Bố mẹ đều khen ; Thầy cô vui vẻ. (Remember Uncle Hồ Chí Minh's words: be diligent in study and work; parents will praise you; teachers will be pleased.) (^_^)
  • C: Bay đi đâu cả tối ngày sếu ơi ?
  • D: Em đi máy bay !
  • E: Here you can see the other books for the same class (math, nature, etc.). I have numbers 1 and 2; this is number 1. You can also see some information, such as a print run of 150000 copies in 1999.
  • Yes, there is a half-space here. This shows that this style is not completely foreign in Vietnamese. --foolip

La Fontaine, THƠ NGỤ NGÔN (Nguyễn Văn Vĩnh dịch)

= Les fables de La Fontaine (translated by the Nguyễn Văn Vĩnh)

Nguyễn Văn Vĩnh is the man who promoted quốc ngữ in the early 1900s, which eventually led to its adoption as the official writing system (some brief English history here). He wrote several books to get people used to it, and he translated some French fables (Les fables de La Fontaine). If you look at the result, you can see spaces before those punctuation marks. To be honest, I don't know whether those scans show the original edition or whether they are faithful to it; the text could have been retyped.

Translated manga

Đôrêmon (ドラえもん)

Image: img0786wd7.jpg

Here you can see the use of narrow no-break space.

  • Sorry, but to me this looks like no space at all, I take this as support for the "no space" style. "Looks" like Unicode Character 'ZERO WIDTH NO-BREAK SPACE' (U+FEFF) ;-) --foolip
    • You must be joking, the space is 3 times bigger than the no space of the dot. Look at the lower right corner. TỨC QUÁ ! shows the same spacing between words as between words and the exclamation mark. -- jesus2099 07:06, 10 February 2009 (UTC)
      • I think this (unlike the other example below) is just a case of which font has been used. Look at NHỈ?, there's the same spacing between "H Ỉ" as in "Ỉ ?", does that mean that it should be written as "NH Ỉ ?" in digital form? Let's not discuss this example any more, others will have to make their own judgement when the time comes for voting/vetoing on this issue. -- foolip 07:13, 11 February 2009 (UTC)

Translated literature book

Trang Tử (莊子) : Nam Hoa kinh (南華經)

Images: A (img0793qw5.jpg), B (img0794fe9.jpg), C (img0798ih2.jpg)

  • A: Cover. Vietnamese adaptation / translation by Nguyễn Hiến Lê. Nhà xuất bản Văn hóa stands for ~ published by (Vh) Văn hóa (=culture).
  • B: Here you can see the use of no-break space before exclamation mark, semicolon and question mark
  • C: Here you can see the use of no-break space before exclamation mark, colon, semicolon and question mark

Alexandre De Rhodes's dictionary

Yes, that is the Alexandre De Rhodes.

But frankly this is very obsolete: even dots and commas are spaced.

Some other spaced stuff

  • Visa application form (the M3 type) (image: Visa_Form.jpg). I don't have mine anymore, but they were like this one too, even though I didn't get them from the same place.

Reference (no space)

Writing Rules

Quoting Một số quy tắc soạn thảo văn bản cơ bản:

  • Các dấu ngắt câu như chấm (.), phẩy (,), hai chấm (:), chấm phảy (;), chấm than (!), hỏi chấm (?) phải được gõ sát vào từ đứng trước nó, tiếp theo là một dấu trắng nếu sau đó vẫn còn nội dung.

In brief, this says that the punctuation marks . , : ; ! ? should be typed directly after the preceding word (i.e. no space), followed by a space if more content follows.

Searching for the above text reveals that it is replicated on many educational websites and the like.

Vietnamese Online

In the top million site rankings (fetched 2009-02-09) there are 208 websites under the .vn domain, which should be a good sample of how Vietnamese is typically written in digital form. Using a Python script I downloaded the front page of each of those 208 sites, extracted the text (not including HTML markup, scripts or style sheets) and counted the occurrences where an exclamation or question mark was or was not preceded by whitespace. I have gone over the results to check that the matches are in actual text, so the results are not skewed by misencoding errors, etc. Sites which were either unavailable or did not contain any such punctuation are not included.

Site    "!?"        " !?"
        4           0
        2           0
        6           1
        12          0
        5           0
        4           0
        9           3
        3           2
        2           1
        6           0
        8           0
        36          1
        7           0
        5           0
        0           1
        4           0
        1           0
        15          4
        5           0
        2           0
        7           0
        1           0
        16          1
        1           31
        3           0
        2           1
        1           0
        10          0
        1           0
        3           0
        12          0
        1           0
        5           0
        2           0
        3           0
        3           0
        15          0
        6           0
        8           0
        5           3
        2           1
        1           0
        7           0
        8           0
        4           0
        2           0
        6           2
        8           1
        4           0
        4           2
        13          3
        8           0
        13          0
        1           0
        2           1
        1           1
        2           0
        1           0
        10          0
        16          1
        1           0
        2           0
        1           0
        1           0
        1           0
        2           1
        2           0
        1           1
        5           0
        16          0
        2           4
        6           0
        7           0
        13          0
        1           0
        11          4
        8           0
        0           2
        2           1
        3           0
        3           0
        1           2
        21          1
        3           0
        9           0
        6           0
        3           0
        6           0
        3           0
        3           0
        4           0
        3           0
        2           4
        2           0
        1           0
        0           1
        1           0
        2           0
        2           0
        1           1
        12          0
        5           0
        1           0
        2           0
        6           0
        7           0
        5           0
        1           0
        6           1
        1           2
        7           0
        3           5
        1           1
        2           0
        0           1
        2           3
        2           0
Total   578 (86%)   96 (14%)
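
The original script is not reproduced on this page. As a rough idea of how the counting described above could work, here is a minimal Python sketch using only the standard library. The names `TextExtractor` and `count_punctuation_styles` are hypothetical, and counting a run such as "?!" as a single occurrence is an assumption, not necessarily what the original script did.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def count_punctuation_styles(html):
    """Return (no_space, with_space) counts for runs of "!" and "?"."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    no_space = with_space = 0
    for match in re.finditer(r"[!?]+", text):
        if match.start() == 0:
            continue  # nothing precedes the mark, so neither style applies
        # str.isspace() is also true for U+00A0 and U+202F.
        if text[match.start() - 1].isspace():
            with_space += 1
        else:
            no_space += 1
    return no_space, with_space
```

Downloading the front pages is left out of the sketch. Summing the per-site counts gives the totals in the table, where 578 / (578 + 96) is roughly 86%.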

Apart from the simple comparison by percentage, it is worth noting that there are only 4 sites which consistently use the "with space" style, but 3 of those have only a single instance of "!" or "?" and the 4th has just 2. Conversely, there are 81 sites which consistently use the "no space" style, with up to as many as 16 occurrences of that punctuation style and an average of 4.63. An in-depth analysis of these sites would probably reveal that almost all use both punctuation styles on some page, but it's difficult not to see "with space" as the odd duckling here, a typo in many cases. seems to be semi-consistent in using the "with space" style, while is just a case study in horrible punctuation (not because it doesn't support my case, go see for yourself what I'm talking about).

Verdict: When Vietnamese is written in digital form, the convention is to not include a space before punctuation. Note that this is different from French, where the "with space" style can be readily found, so it's not just a simple case of "digital laziness".


Vietnamese in Print

Images: A1 (dsc00666qx8.jpg), A2 (dsc00665kc5.jpg), B1 (dsc00657nk7.jpg), B2 (dsc00665kc5.jpg), C1 (dsc00658am7.jpg), C2 (dsc00660db8.jpg)


Hi Foolip, I took the liberty of changing your draft to conform with my official children school book that comes from the ministry of education and training of Việt Nam.

Actually I browsed 4 random books from my library and found that there were inconsistencies between publishers. Even if that may not be statistically sound, out of 4 random books I got 3 supporting the space-before-punctuation style versus 1 supporting no-space-before-punctuation. The 3 books include a very official one (actually two, because it is vol. 1 + vol. 2, but hey): the Vietnamese teaching book for the first class of primary school (7 years old, lớp một của trường tiểu học / cấp một).

In Reference, you can find those quick pictures I took yesterday evening.


  • I think you were a bit too liberal in your changes, putting both versions in for comparison. --foolip

This discussion isn't going to be settled between a French guy and a Swedish guy, we need input from someone who reads/writes Vietnamese as their native language. My girlfriend strongly supports the "no space" style and insists that putting a space would have been marked as incorrect in computer class or when writing essays. These are the rules they teach and how schoolbooks are written (obviously with at least one exception above). However, I don't expect you to take my word for all of this, so I'll make samples from such books available. I'd also really like for someone else to comment on this issue, someone not involved in it so to say... -- foolip 07:22, 11 February 2009 (UTC)

  • Yes, I'm waiting for your book. Obviously we can't value anything as much as an official book (which of course couldn't be something like a translated English typography book either, like the typography sentence that is found everywhere online, which is probably also a translated guideline). I feel tired for you with all your searching on the web. Vietnamese latinization came from a French guy in the first place, so I'm not surprised that they have those spaces, by the way. Now the Vietnamese population is very young, quite online and influenced by monoculturalism too. You can also find MANY French texts without those spaces. And another thing: I don't know whether this space thing is only French or whether it is like that in the whole Latin (southern) part of Europe. Could be. -- jesus2099 08:56, 11 February 2009 (UTC)
    • I just ran the punctuation checker script on the 910 .fr sites in the top million, and the results are 1231 "!?" and 2987 " !?", i.e. 71% proper French punctuation and 29% "English" punctuation (although I haven't filtered out any English text in the results, I think the number should be lower for pure French texts online). So it seems that the customary way to write French digitally is with a space before "!?", just as one would expect. Unless you think Vietnamese people make mistakes 86% of the time you have to accept that the customary way of writing Vietnamese in digital form is without the space. That is not the end of the argument of course, I shall present to you printed Vietnamese text supporting this as well. -- foolip 12:51, 11 February 2009 (UTC)
      • At least we agree on the sentence case. That makes around 99.9 % of the titles already. -- jesus2099 13:48, 11 February 2009 (UTC)
      • I agree there is a huge amount of inconsistency. I even find inconsistency on the same page of printed material sometimes. Obviously, they absolutely don't give a damn about what we are both debating here. The more official stuff you can get, the better IMO, because the "my wife's word versus your girlfriend's word" match is not a serious argument (this is why I didn't want to tell you what she says). Maybe I should eventually get those missing Vietnamese school books from lớp 2 to lớp 12 (the whole lot of them). I was already planning to acquire those anyway. -- jesus2099 14:16, 11 February 2009 (UTC)
      • → There are latest editions of the schoolbooks (Bộ giáo dục và đào tạo : Lớp 1, 2008-2009 14 book batch). I'll have a look at them as soon as I can. -- jesus2099 11:01, 17 February 2009 (UTC)