Updated 20 December, 2020
This page gathers basic information about the Myanmar script and its use for the Burmese language. It doesn't cover use of the Burmese orthography for writing Pali. It aims (generally) to provide an overview of the orthography and typographic features, and (specifically) to advise how to write Burmese using Unicode.
Phonetic transcriptions on this page should be treated as an approximate guide only. Many are more phonemic than phonetic, and there may be variations depending on the source of the transcription.
အပိုဒ် ၁ လူတိုင်းသည် တူညီ လွတ်လပ်သော ဂုဏ်သိက္ခာဖြင့် လည်းကောင်း၊ တူညီလွတ်လပ်သော အခွင့်အရေးများဖြင့် လည်းကောင်း၊ မွေးဖွားလာသူများ ဖြစ်သည်။ ထိုသူတို့၌ ပိုင်းခြား ဝေဖန်တတ်သော ဉာဏ်နှင့် ကျင့်ဝတ် သိတတ်သော စိတ်တို့ရှိကြ၍ ထိုသူတို့သည် အချင်းချင်း မေတ္တာထား၍ ဆက်ဆံကျင့်သုံးသင့်၏။
အပိုဒ် ၂ လူတိုင်းသည် လူ့အခွင့် အရေး ကြေညာစာတမ်းတွင် ဖော်ပြထားသည့် အခွင့်အရေး အားလုံး၊ လွတ်လပ်ခွင့် အားလုံးတို့ကို ပိုင်ဆိုင် ခံစားခွင့်ရှိသည်။ လူမျိုးနွယ်အားဖြင့် ဖြစ်စေ၊ အသားအရောင်အားဖြင့် ဖြစ်စေ၊ ကျား၊ မ၊ သဘာဝအားဖြင့် ဖြစ်စေ၊ ဘာသာစကားအားဖြင့် ဖြစ်စေ၊ ကိုးကွယ်သည့် ဘာသာအားဖြင့် ဖြစ်စေ၊ နိုင်ငံရေးယူဆချက်၊ သို့တည်းမဟုတ် အခြားယူဆချက်အားဖြင့် ဖြစ်စေ၊ နိုင်ငံနှင့် ဆိုင်သော၊ သို့တည်းမဟုတ် လူမှုအဆင့်အတန်းနှင့် ဆိုင်သော ဇစ်မြစ် အားဖြင့်ဖြစ်စေ၊ ပစ္စည်း ဥစ္စာ ဂုဏ်အားဖြင့် ဖြစ်စေ၊ မျိုးရိုးဇာတိအားဖြင့် ဖြစ်စေ၊ အခြား အဆင့်အတန်း အားဖြင့် ဖြစ်စေ ခွဲခြားခြင်းမရှိစေရ။ ထို့ပြင် လူတစ်ဦး တစ်ယောက် နေထိုင်ရာ နိုင်ငံ၏ သို့တည်းမဟုတ် နယ်မြေဒေသ၏ နိုင်ငံရေးဆိုင်ရာ ဖြစ်စေ စီရင် ပိုင်ခွင့်ဆိုင်ရာ ဖြစ်စေ တိုင်းပြည် အချင်းချင်း ဆိုင်ရာဖြစ်စေ၊ အဆင့်အတန်း တစ်ခုခုကို အခြေပြု၍ သော်လည်းကောင်း၊ ဒေသနယ်မြေတစ်ခုသည် အချုပ်အခြာ အာဏာပိုင် လွတ်လပ်သည့် နယ်မြေ၊ သို့တည်းမဟုတ် ကုလသမဂ္ဂ ထိန်းသိမ်း စောင့်ရှောက် ထားရသည့် နယ်မြေ၊ သို့တည်းမဟုတ် ကိုယ်ပိုင် အုပ်ချုပ်ခွင့် အာဏာတို့ တစိတ်တဒေသလောက်သာ ရရှိသည့် နယ်မြေ စသဖြင့် ယင်းသို့ သော နယ်မြေများ ဖြစ်သည်၊ ဖြစ်သည် ဟူသော အကြောင်းကို အထောက်အထား ပြု၍ သော်လည်းကောင်း ခွဲခြားခြင်း လုံးဝ မရှိစေရ။
The Myanmar script is used to write Burmese and, with various extensions and adaptations, for other languages in the region, such as Mon, Karen, Kayah, Shan, and Palaung. It is also used to write Pali and Sanskrit.
မြန်မာအက္ခရာ mjàɴmà ʔɛʔkʰa̰jà Burmese alphabet
A descendant of the Brahmi script, via Pallava and Old Mon, early evidence of the Myanmar script dates back to around the 10th century. What were originally square shapes evolved around the 17th century to become the rounded forms we see today, supposedly to improve writing techniques on palm leaves.
Sources: Scriptsource, Wikipedia.
The script is an abugida, ie. consonants carry an inherent vowel sound that is overridden, where needed, using vowel signs. See the table to the right for a brief overview of features for the modern Burmese orthography.
Myanmar text runs left to right in horizontal lines.
Spaces separate phrases, rather than words.
The 20 consonant letters used for pure Burmese words are supplemented by 9 more which are used in Sanskrit words.
Consonant stacking is used in multi-syllabic words (mostly derived from Pali) to indicate doubled or homorganic consonant clusters. Subjoined forms are produced using a dedicated, invisible virama character. Conjuncts do not span word boundaries.
Syllable-initial clusters use 4 dedicated combining marks for the medial. Aspirated onset consonants are common: some have dedicated letters, others are indicated with a subjoined h.
Syllable-final consonant sounds use ordinary characters with a visible mark called asat to indicate that the inherent vowel is killed. There is also one dedicated final consonant (the anusvara).
The Burmese orthography has an inherent vowel, and represents vowels using 8 vowel-signs (including 1 prescript), and 3 consonants/diacritics. All vowel-signs are combining marks, and are stored after the base character. In a couple of instances an asat is used to indicate tone information, rather than attaching a tone mark.
The pronunciation of some vowel graphemes may vary in open and closed syllables.
There is an incomplete set of 7 independent vowels, mostly used for Pali or Sanskrit words, and standalone vowel sounds are normally written using vowel-signs applied to အ [U+1021 MYANMAR LETTER A].
This page lists just 6 composite vowels (made from 5 vowel signs, and 2 consonants/diacritics). Composite vowels can involve up to 3 glyphs, and glyphs can surround the base consonant(s) on up to 2 sides, eg. ကော် keaˣ.
This section lists non-ASCII characters used for Burmese, and other characters in the Myanmar script block not used by Burmese.
The Burmese language is tonal and syllable-based.
Words are composed of syllables. These start with a consonant or initial vowel. An initial consonant may be followed by a medial consonant, which adds the sound j or w. After the vowel, a syllable may end with a nasalisation of the vowel or an unreleased glottal stop, though these final sounds can be represented by various different consonant symbols.
At the end of a syllable a final consonant usually has an 'asat' sign above it, to show that there is no inherent vowel.
In multisyllabic words derived from an Indian language such as Pali, where two consonants occur internally with no intervening vowel, the consonants tend to be stacked vertically, and the asat sign is not used.
The following table shows the order in which characters should be typed and stored in memory for a given syllable, per the description in the Unicode Standard. (It is Burmese-specific and doesn't reflect the order or characters needed for languages such as Karen, Mon, Shan, etc.) u,648-9
|kinzi||င U+1004 + ် U+103A + ္ U+1039|
|consonants/vowels||[ က U+1000 .. အ U+1021 | ဣ U+1023 .. ဧ U+1027 | ဩ U+1029 | ဪ U+102A | ဿ U+103F | ၎ U+104E ]|
|subscript consonant||္ U+1039 + [ က U+1000 .. ဈ U+1008 | ည U+100A .. မ U+1019 | ရ U+101B | လ U+101C | သ U+101E | ဠ U+1020 | အ U+1021 ]|
|asat sign||် U+103A|
|medial ya*||ျ U+103B (+ ် U+103A)|
|medial ra||ြ U+103C|
|medial wa||ွ U+103D|
|medial ha||ှ U+103E|
|vowel sign e||ေ U+1031|
|vowel sign i, ii, ai||[ ိ U+102D | ီ U+102E | ဲ U+1032]|
|vowel sign u, uu||[ ု U+102F | ူ U+1030]|
|vowel sign tall aa, aa*||[ ါ U+102B | ာ U+102C] (+ ် U+103A)|
|dot below||့ U+1037|
Characters with an asterisk are potentially followed by an asat sign.
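The ordering in the table can be expressed as a regular expression, which gives a quick way to check whether a typed cluster follows it. The following is a minimal sketch, not a validator for real text: it models only the rows listed in the table above (so it knows nothing about final consonants, anusvara or visarga), and the names `TABLE_ORDER` and `follows_table_order` are hypothetical.

```python
import re

# One building block per row of the table; rows not in the table are not modelled.
KINZI = "\u1004\u103A\u1039"
BASE = "[\u1000-\u1021\u1023-\u1027\u1029\u102A\u103F\u104E]"
SUBSCRIPT = "\u1039[\u1000-\u1008\u100A-\u1019\u101B\u101C\u101E\u1020\u1021]"

TABLE_ORDER = re.compile(
    f"(?:{KINZI})?"           # kinzi
    f"{BASE}"                 # consonant or independent vowel
    f"(?:{SUBSCRIPT})?"       # subscript consonant
    "\u103A?"                 # asat sign
    "(?:\u103B\u103A?)?"      # medial ya, optionally followed by asat
    "\u103C?"                 # medial ra
    "\u103D?"                 # medial wa
    "\u103E?"                 # medial ha
    "\u1031?"                 # vowel sign e
    "[\u102D\u102E\u1032]?"   # vowel sign i, ii, ai
    "[\u102F\u1030]?"         # vowel sign u, uu
    "(?:[\u102B\u102C]\u103A?)?"  # tall aa or aa, optionally followed by asat
    "\u1037?"                 # dot below
)


def follows_table_order(cluster: str) -> bool:
    """Return True if the whole cluster matches the order given in the table."""
    return TABLE_ORDER.fullmatch(cluster) is not None


# MA + medial RA + medial WA + medial HA is in table order; swapping HA and RA is not.
print(follows_table_order("\u1019\u103C\u103D\u103E"))
print(follows_table_order("\u1019\u103E\u103C"))
```

A regular expression keeps the correspondence with the table visible row by row, which is why it is used here rather than a hand-written state machine.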
Unfortunately, normalization may result in a different order. In particular, ် [U+103A MYANMAR SIGN ASAT] occurs after ့ [U+1037 MYANMAR SIGN DOT BELOW] in normalized text. Applications such as fonts should still handle this alternative order, since the sequences are canonically equivalent.
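The reordering can be observed with Python's standard `unicodedata` module. This is a small sketch: the base letter NA is an arbitrary choice for illustration, and `canonically_equal` is a hypothetical helper name.

```python
import unicodedata

ASAT = "\u103A"
DOT_BELOW = "\u1037"


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Typed order: consonant, asat, dot below.
typed = "\u1014" + ASAT + DOT_BELOW
normalized = unicodedata.normalize("NFC", typed)

# Normalization sorts the two marks by canonical combining class,
# which places the dot below before the asat.
print([f"U+{ord(ch):04X}" for ch in normalized])
print(unicodedata.combining(DOT_BELOW), unicodedata.combining(ASAT))
print(canonically_equal(typed, normalized))
```

Comparing text only after normalizing both sides avoids treating the two orders as different strings in searches.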
The following schematic shows sequences that typically make up a syllable in Burmese. Start with the C (consonant) on the left, or IV (initial vowel) and travel from left to right only. You can stop at any point. The plus sign in the box represents the virama – this should be followed immediately by another syllable, as should the kinzi.
Dashes are used to indicate whether the character represents a vowel sound in a closed or an open syllable.
Phones in a lighter colour are non-native or allophones. Source Wikipedia.
|Close||i ĩ||u ũ|
|Close||u̯a u̯ɛ u̯e|
|Close-mid||ei ẽĩ||əɨ||ou õũ|
|Open||ai ãĩ au ãũ|
Some of the above sounds can only occur in open syllables, others only in closed syllables. As a rough rule, the plain vowels occur in open syllables, and the diphthongs in closed.
The sound ə only occurs in minor syllables, and is the only sound occurring in those syllables.
A process called vocalic weakening affects the first syllables of certain words (mostly nouns and adverbs), eg. ထမင် is pronounced tʰəmɪ̀ɴ, not tʰa̰mɪ̀ɴ; ဘုရား is pronounced pʰəjá, not pʰṵjá.
The inherent vowel is usually transcribed and pronounced in Burmese as a in open syllables, but very often reduced phonetically to ə. So က [U+1000 MYANMAR LETTER KA] is pronounced ka.
In closed syllables, the inherent vowel is pronounced as one of ɪ, e, a, or ɛ, depending on the final consonant that follows, eg. နှစ်
Non-inherent vowel sounds that follow a consonant are represented using vowel-signs, eg. ကိ [U+1000 MYANMAR LETTER KA + U+102D MYANMAR VOWEL SIGN I] is pronounced ki. They may be used on their own, or in combination with others (see composite_vowels).
Burmese vowel-signs are all combining characters. All vowel-signs are stored after the base consonant, and the font puts them in the correct place for display. Some input systems may allow the user to type the prescript vowel before the base consonant, but it is still stored after.
Three vowel-signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
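The spacing vowel-signs can be identified from the Unicode general category, where `Mc` means spacing combining mark and `Mn` means non-spacing mark. A short sketch, assuming the eight vowel-signs are the contiguous range U+102B to U+1032 (the name `spacing_vowel_signs` is hypothetical):

```python
import unicodedata

# The eight Burmese vowel-signs, U+102B (TALL AA) through U+1032 (AI).
VOWEL_SIGNS = [chr(cp) for cp in range(0x102B, 0x1033)]


def spacing_vowel_signs() -> list[str]:
    """Return the vowel-signs whose general category is Mc (spacing mark)."""
    return [ch for ch in VOWEL_SIGNS if unicodedata.category(ch) == "Mc"]


for ch in spacing_vowel_signs():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```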
ါ [U+102B MYANMAR VOWEL SIGN TALL AA] is just an alternative shape for ာ [U+102C MYANMAR VOWEL SIGN AA]. See shapechanges.
The pronunciation of the vowel-sign often depends on whether it appears in an open or closed syllable, eg. compare the following, which are open and closed, respectively: ဆိုး ဆိုင်. Open syllables typically contain plain vowels, while closed ones contain diphthongs.
The 'primary' vowels have 'short' and 'long' written forms that hark back to the earlier Indic script origins, but the distinction is used nowadays for indicating different tones only. For example, compare the tones in the open syllables at the beginning of these 2 words မိနစ် မီ
Burmese has only one vowel-sign that appears to the left of the base consonant letter or cluster, eg. မြေပုံ
This is a combining mark that is always stored after the base consonant. The font places the glyph before the base consonant.
A consonant cluster is treated as a unit when it comes to vowel-signs, for example in အငွေ ʔŋw̆e (a.ngwe) the E is displayed to the left of the NGA although the character appears after the WA in memory.
Some input methods may allow the user to type this vowel before the consonant, whereas others will expect it to be typed after, per the stored order.
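The stored order can be inspected by listing the code points of a word. The sketch below uses the a.ngwe example from above; `stored_order` is a hypothetical helper.

```python
import unicodedata


def stored_order(text: str) -> list[str]:
    """List each character of the text in memory order, with its Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# A + NGA + MEDIAL WA + VOWEL SIGN E: the E is rendered to the left of the
# NGA cluster, but it is the last character in memory.
word = "\u1021\u1004\u103D\u1031"
for line in stored_order(word):
    print(line)
```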
In closed syllables when followed by the inherent vowel, ွ [U+103D MYANMAR CONSONANT SIGN MEDIAL WA] may be pronounced as the vowel ʊ, rather than the glide w, eg. နွမ်း
အ [U+1021 MYANMAR LETTER A] on its own represents the standalone version of the inherent vowel, ʔa. It is used as a base for other standalone vowels.
Tall AA. There are two forms of the -a vowel sign in Burmese. The combination ဝာ [U+101D MYANMAR LETTER WA + U+102C MYANMAR VOWEL SIGN AA] (wa) can be hard to distinguish from တ [U+1010 MYANMAR LETTER TA], so a taller glyph is used for the vowel to avoid confusion, ie. ဝါ [U+101D MYANMAR LETTER WA + U+102B MYANMAR VOWEL SIGN TALL AA]. This glyph, whether alone or as part of a complex vowel, is used after the following consonants:
For example, ပေါင် Where there is no ambiguity, however, the normal shape is used, eg. ပြောင်းဖူး
Whereas in Unicode 5.0 the choice of appropriate form was left to the font or implementation during rendering, such contextual decisions are not appropriate for Sgaw Karen and other minority scripts, which only use the tall form, so ါ [U+102B MYANMAR VOWEL SIGN TALL AA] was added to Unicode 5.1 as a separate character, and needs to be typed explicitly.u,648
Displaced U vowel-signs. The vowel signs ု [U+102F MYANMAR VOWEL SIGN U] and ူ [U+1030 MYANMAR VOWEL SIGN UU], which normally appear below a consonant, are displayed to the right if something else intrudes on that space. Examples include
There are no special characters for these forms. The shape is produced automatically by the font.
Vowels represented by combinations of the above characters:
In the sequence ော် [U+1031 MYANMAR VOWEL SIGN E + U+102C MYANMAR VOWEL SIGN AA + U+103A MYANMAR SIGN ASAT], the asat is added to indicate the low tone. Otherwise, it is the same vowel.
The following list shows where vowel-signs are positioned (by default) around a base consonant to produce vowels, and how many instances of that pattern there are. Numbers after the + sign represent combinations of vowel-signs.
The special long forms of ု [U+102F MYANMAR VOWEL SIGN U] and ူ [U+1030 MYANMAR VOWEL SIGN UU], used when there is not enough room for them below a cluster are produced by the font.
Characters that don't appear in the combinations:
Myanmar represents standalone vowel sounds using အ [U+1021 MYANMAR LETTER A] as a base for vowel-signs, eg. အိတ် This is classed as a consonant rather than a vowel by the Burmese, and carries the inherent vowel when used alone, eg. အတန်း
Myanmar also has a set of independent vowel letters used to represent standalone vowels, but only in certain words – typically Indian loan words, eg. ဧရာဝတီ ဩဂုတ် ဤ
Each letter represents a specific vowel+tone combination (see the accent marks in the list above). Not all vowel+tones combinations are represented.
Sequences of characters can be combined to look like a few of the independent vowels, eg. သြော် [U+101E MYANMAR LETTER SA + U+103C MYANMAR CONSONANT SIGN MEDIAL RA + U+1031 MYANMAR VOWEL SIGN E + U+102C MYANMAR VOWEL SIGN AA + U+103A MYANMAR SIGN ASAT] can look very similar to or the same as ဪ [U+102A MYANMAR LETTER AU]. The Unicode Standard strongly recommends using only the single code point for each independent vowel.
Myanmar is a script where different sequences of Unicode characters may produce the same visual result. Here we look at those related to vowels.
It is strongly recommended to use the single code points on the left, rather than the sequences on the right, because normalisation does not make the two forms identical. Content that mixes them will therefore be treated as different text, which affects searching and other operations.
|Code point||Deprecated combination|
|ဪ [U+102A MYANMAR LETTER AU]||သြော် [U+101E MYANMAR LETTER SA + U+103C MYANMAR CONSONANT SIGN MEDIAL RA + U+1031 MYANMAR VOWEL SIGN E + U+102C MYANMAR VOWEL SIGN AA + U+103A MYANMAR SIGN ASAT]|
|ဩ [U+1029 MYANMAR LETTER O]||သြ [U+101E MYANMAR LETTER SA + U+103C MYANMAR CONSONANT SIGN MEDIAL RA]|
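A checker can flag the sequences in the right-hand column so that an author can review them. The following sketch only reports matches and does not rewrite the text, on the assumption that SA followed by MEDIAL RA may occasionally be intended as written. `LOOKALIKES` and `find_lookalikes` are hypothetical names.

```python
import re
import unicodedata

# Sequence from the right-hand column -> recommended single code point.
LOOKALIKES = {
    "\u101E\u103C\u1031\u102C\u103A": "\u102A",  # looks like LETTER AU
    "\u101E\u103C": "\u1029",                    # looks like LETTER O
}

# SA + MEDIAL RA, optionally extended by E + AA + ASAT (the longer form wins).
_PATTERN = re.compile("\u101E\u103C(?:\u1031\u102C\u103A)?")


def find_lookalikes(text: str) -> list[tuple[int, str]]:
    """Return (index, recommended code point) for each lookalike sequence."""
    return [(m.start(), LOOKALIKES[m.group()]) for m in _PATTERN.finditer(text)]


sample = "\u101E\u103C\u1031\u102C\u103A\u1000\u101E\u103C"
hits = find_lookalikes(sample)
print([(i, f"U+{ord(ch):04X}") for i, ch in hits])

# Normalization does not turn the sequence into the single code point.
print(unicodedata.normalize("NFC", "\u101E\u103C") == "\u1029")
```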
In the following case, the precomposed character decomposes in NFD, and re-forms again in NFC. It is generally recommended to use the precomposed character, but both forms are canonically equivalent.
|ဦ [U+1026 MYANMAR LETTER UU]||ဦ [U+1025 MYANMAR LETTER U + U+102E MYANMAR VOWEL SIGN II]|
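The round trip between the two forms can be checked with the standard `unicodedata` module. A minimal sketch:

```python
import unicodedata

UU = "\u1026"  # MYANMAR LETTER UU (precomposed)

# NFD splits the letter into LETTER U + VOWEL SIGN II; NFC puts it back together.
decomposed = unicodedata.normalize("NFD", UU)
recomposed = unicodedata.normalize("NFC", decomposed)

print([f"U+{ord(ch):04X}" for ch in decomposed])
print([f"U+{ord(ch):04X}" for ch in recomposed])
```

Because the two forms are canonically equivalent, normalizing both sides of a comparison to the same form makes them match.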
The following tables show how the above vowel sounds commonly map to characters or sequences of characters. Graphemes for a particular sound are split according to whether they occur in open (o), closed (c), or standalone (s) syllables.
Inherent vowel, followed by –ည် i, –စ် ɪʔ, or one of –င်, –ဉ် ɪɴ , eg. ညည်း, နှစ်, ဝင်.
ွ [U+103D MYANMAR CONSONANT SIGN MEDIAL WA], eg. နွမ်း.
Inherent vowel, followed by –ည်.
ဧ [U+1027 MYANMAR LETTER E], eg. ဧက. Used in a few words only (typically Indian loan words).
Inherent vowel, followed by –ည် or -က် ɛʔ, eg. စက္ကူ
ဩ [U+1029 MYANMAR LETTER O], eg. ဩဂုတ်. Used in a few words only (typically Indian loan words).
Unstressed vowels, eg. ဖိနပ်.
Inherent vowel (creaky tone), eg. သဿ.
ာ [U+102C MYANMAR VOWEL SIGN AA] (low tone), eg. ဆရာ.
ါ [U+102B MYANMAR VOWEL SIGN TALL AA] (low tone), eg. တံဂါ.
Inherent vowel, followed by one of –တ်, –ပ်aʔ, or one of –န်, –မ်, –ံaɴ, eg. ဖတ်, ပန်း.
ဪ [U+102A MYANMAR LETTER AU] (low tone), in some words, particularly Indian loan words or words in the literary style.
Phones in a lighter colour are non-native or allophones. Source Wikipedia.
|fricative||θ ð||s z
|nasal||m m̥||n n̥||ɲ ɲ̊||ŋ ŋ̊|
|approximant||w ʍ||l l̥||j|
b ဗ ဘ
|t တ ဋ
tʰ ထ ဌ
d ဒ ဍ ဎ ဓ
ɡ ဂ ဃ
z ဇ ဈ
|ʃ ߛ||h ဟ|
n န ဏ
ɲ ဉ ည
l လ ဠ
|j ယ ရ|
|Trill||r ߙ ߚ|
Native Burmese words use a subset of the consonants that make up the traditional articulatory arrangement of Indic scripts.
Representing foreign sounds. Some Burmese conventions exist for representing foreign sounds. f is ဖ (usually pʰ), v is ဗ (usually b) or ဗွ (usually bw), eg. တီဗွီ
A foreign syllable-final sound can be rendered by placing a second killed consonant after the syllable, sometimes in parentheses, eg. ဘတ်(စ်)
Pali loans. The following additional consonants are mainly used for Pali loan words.
Additional symbols are available for use in loan words, especially Indian loan words. These include the retroflex and voiced aspirated consonants.
Other characters. Other characters in the Myanmar Unicode block are used for orthographic variations employed by minority languages that use the Myanmar script. They are not dealt with here.
Unvoiced syllable-initial consonants are typically pronounced with voicing when they appear in non-initial syllables of a word or in particle suffixes, unless they follow a syllable with stopped tone or follow the အ [U+1021 MYANMAR LETTER A] prefix. Aspirated consonants lose their aspiration at the same time. For example, သတင်းစာ newspaper is pronounced θədɪ́ɴzà not θətɪ́ɴsà. However, because of the rule about the stopped tone (ie. a syllable ending in a plosive consonant), တစ်ဆယ် ten is pronounced təʔsʰɛ̀ not təʔzɛ̀.
Note that care needs to be taken with compound words, since they contain more than one word-initial syllable, eg. နားထောင် listen is pronounced nátʰàʊɴ not nádàʊɴ .m,175-6
There is also an irregular pattern of voicing initial consonants, particularly with place names. Mesher provides examples of words beginning with စ [U+1005 MYANMAR LETTER CA] ပ [U+1015 MYANMAR LETTER PA] တ [U+1010 MYANMAR LETTER TA] and ထ [U+1011 MYANMAR LETTER THA], eg. စေတီ pagoda is pronounced zèdì not sèdì; ပုဂံ Pagan/Bagan is pronounced bəgàɴ not pəgàɴ; ထားဝယ် Tavoy/Dawei is pronounced dəwɛ̀ not tʰəwɛ̀.m,251
In many multi-syllabic words (mostly derived from Pali), consonants that have no intervening inherent vowel are arranged such that the consonant cluster is stacked. Stacked consonants of this kind are always doubled consonants or homorganic.wbs,#Stacked_consonants
The second consonant appears below the first, eg. မန္တလေး ဗုဒ္ဓ In some cases the lower character is abbreviated or reoriented, eg. က္ဌ to represent က်ဌ.
This effect is achieved by using the character U+1039 MYANMAR SIGN VIRAMA between the consonants forming the cluster.
The virama is never visible.
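As an illustration, a stacked cluster can be assembled by placing the virama between the upper and lower consonants. The helper name `stack` is hypothetical; the code points spell the Mandalay example above.

```python
VIRAMA = "\u1039"  # MYANMAR SIGN VIRAMA (invisible, triggers stacking)


def stack(upper: str, lower: str) -> str:
    """Return the sequence that renders `lower` subjoined beneath `upper`."""
    return upper + VIRAMA + lower


# MA, then NA stacked over TA, then LA + VOWEL SIGN E + VISARGA.
mandalay = "\u1019" + stack("\u1014", "\u1010") + "\u101C\u1031\u1038"
print(mandalay)
print([f"U+{ord(ch):04X}" for ch in mandalay])
```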
Consonants may also be stacked in abbreviations of native Burmese words, in which case they may not be homorganic and vowels may be pronounced between the consonants. For example, လက်ဖက် is sometimes abbreviatedwbs,#Stacked_consonants to လ္ဘက် l͓ḃkˣ
Where the same consonant appears at the end of a syllable and the beginning of a new syllable in the same word they are commonly represented in the usual cluster form, eg. ပိန္နဲသီး
In a few Burmese words, however, a doubled consonant is represented by a single consonant plus asat, eg. ယောက်ျား ကျွန်ုပ် Note how this produces a situation where an asat is used between a consonant and a medial or vowel sign.h
A repeated သ [U+101E MYANMAR LETTER SA] can be represented using ဿ [U+103F MYANMAR LETTER GREAT SA]. In modern Burmese, ဿ appears within words, whereas သ်သ is used across word boundaries.l,3
When the first consonant in a consonant cluster is a non-word-final င [U+1004 MYANMAR LETTER NGA] it rises over the following letter and keeps its virama, rather than pushing the following consonant below it, eg. အင်္ဂလန် This is called 'kinzi' (ကင်းစီး kɪ́ɴzí).
To achieve this, use the sequence င်္ [U+1004 MYANMAR LETTER NGA + U+103A MYANMAR SIGN ASAT + U+1039 MYANMAR SIGN VIRAMA], then continue with the next letter.
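A minimal sketch of the kinzi sequence, using the example word above. The helper name `kinzi` is hypothetical.

```python
NGA = "\u1004"
ASAT = "\u103A"
VIRAMA = "\u1039"


def kinzi(following: str) -> str:
    """Return NGA + ASAT + VIRAMA followed by the letter the kinzi sits above."""
    return NGA + ASAT + VIRAMA + following


# A, kinzi over GA, LA, NA + ASAT.
england = "\u1021" + kinzi("\u1002") + "\u101C\u1014\u103A"
print(england)
print([f"U+{ord(ch):04X}" for ch in england])
```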
Unicode has the following, dedicated combining characters for the second letter in a syllable-onset cluster. The virama should not be used.
ျ [U+103B MYANMAR CONSONANT SIGN MEDIAL YA] and ြ [U+103C MYANMAR CONSONANT SIGN MEDIAL RA] are both pronounced j by default, eg. ပျော် ပြည် However, when preceded by a velar stop these characters indicate palatalisation, producing tɕ, tɕʰ, dʑ, ɲ, eg. ကျောင်း ကြက် ဂျပန်
ှ [U+103E MYANMAR CONSONANT SIGN MEDIAL HA] is used to create aspirated versions of consonants, eg. မှာ See aspiration for more details. It also creates the sound ʃ in the combinations ရှ ṙh̆ လျှ ly̆h̆, eg. ရှိတယ်
The -h medial is typically transcribed before the letter it modifies, unlike the order of characters as typed or stored in memory. For example, မြွှ [U+1019 MYANMAR LETTER MA + U+103C MYANMAR CONSONANT SIGN MEDIAL RA + U+103D MYANMAR CONSONANT SIGN MEDIAL WA + U+103E MYANMAR CONSONANT SIGN MEDIAL HA] (pronounced m̥w) is transcribed hmrw. For more information about character order, see cporder.
ွ [U+103D MYANMAR CONSONANT SIGN MEDIAL WA] represents the w glide, eg. နွား, but it may also represent the vowel ʊ (see vowelsigns).
It is possible to find 2 or 3 medials in an onset cluster, eg. လျှ ly̆h̆ lʰjá or ʃá and မြွေ
The following panel lists a number of syllable-onset clusters which are not pronounced as you might expect.
Burmese aspirates many consonants. In some cases these are separate characters, in other cases the aspiration is indicated using ှ [U+103E MYANMAR CONSONANT SIGN MEDIAL HA]. Aspirated sounds include the followingm,12, where the last six use MEDIAL HA.
For example words, see map_consonants.
Pali and Sanskrit texts written in the Myanmar script, as well as in older orthographies of Burmese, sometimes render the consonants YA, RA, WA and HA in subjoined form. In those cases, U+1039 MYANMAR SIGN VIRAMA and the regular form of the consonant are used.u,647
The old spelling of many words uses a fifth medial consonant, la swe, eg. ခ္လိုဝ်း kʰ͓liuwˣ² wash, which is produced using just a subjoined l, ie. ္လ [U+1039 MYANMAR SIGN VIRAMA + U+101C MYANMAR LETTER LA].
Syllable-final consonants carry a visible mark called a.sat (အသတ် ʔa̰θaʔ) to indicate that the inherent vowel is killed, eg. see the small 'c'-like mark over the last character in ဝင်. The character ် [U+103A MYANMAR SIGN ASAT] was introduced in Unicode version 5.1 for this purpose. It is effectively a visible virama.
In native Burmese, 9 characters (5 nasals, င ဉ ည န မ NGA, NYA, NNYA, NA and MA, and 4 stops, က စ တ ပ KA, CA, TA, PA) appear in syllable final position.
In final position, nasals are pronounced as a nasalization of the previous vowel, eg. ရင် and all stops are pronounced ʔ, eg. မတ်
Some syllables ending in nasal consonants use ံ [U+1036 MYANMAR SIGN ANUSVARA] rather than the ordinary consonant sign, eg. compare သိမ်း သုံး
(Note that the ASAT is also used over ာ [U+102C MYANMAR VOWEL SIGN AA] and ယ [U+101A MYANMAR LETTER YA] to produce vowel+tone combinations.)
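The rules above can be turned into a rough classifier for the end of a single syllable. This is a sketch under several assumptions: the input is already one syllable, only the native finals listed above are recognised, and NNYA is left out of the nasal set because it does not produce nasalisation in final position. The name `final_type` is hypothetical.

```python
ASAT = "\u103A"
ANUSVARA = "\u1036"
DOT_BELOW = "\u1037"
VISARGA = "\u1038"

STOPS = set("\u1000\u1005\u1010\u1015")   # KA, CA, TA, PA
NASALS = set("\u1004\u1009\u1014\u1019")  # NGA, NYA, NA, MA (NNYA omitted)


def final_type(syllable: str) -> str:
    """Classify a syllable as ending in a 'stop', a 'nasal', or as 'open'."""
    # Tone marks may sit before or after the asat, so drop them first.
    core = syllable.replace(DOT_BELOW, "").replace(VISARGA, "")
    if core.endswith(ANUSVARA):
        return "nasal"
    if len(core) >= 2 and core.endswith(ASAT):
        final = core[-2]
        if final in STOPS:
            return "stop"
        if final in NASALS:
            return "nasal"
    # Includes asat used over AA or YA for vowel+tone combinations.
    return "open"


print(final_type("\u101D\u1004\u103A"))        # WA NGA ASAT
print(final_type("\u1019\u1010\u103A"))        # MA TA ASAT
print(final_type("\u1000\u1031\u102C\u103A"))  # KA E AA ASAT
```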
The following maps the above sounds to graphemes.
ပ [U+1015 MYANMAR LETTER PA], eg. ပိုက်ဆံ.
ဒ [U+1012 MYANMAR LETTER DA], eg. ဒေါ်လေး.
ဓ [U+1013 MYANMAR LETTER DHA], eg. ဓာတ်.
တ [U+1010 MYANMAR LETTER TA], eg. သတင်းစာ, where affected by sandhi.
ထ [U+1011 MYANMAR LETTER THA], eg. အကျဉ်းထောင်, where affected by sandhi, but sometimes also word-initial.
ဍ [U+100D MYANMAR LETTER DDA]. Mostly archaic, used for Pali.
ဎ [U+100E MYANMAR LETTER DDHA]. Mostly archaic, used for Pali.
ထ [U+1011 MYANMAR LETTER THA], eg. ထီး.
က [U+1000 MYANMAR LETTER KA], eg. ကား.
ခ [U+1001 MYANMAR LETTER KHA], eg. ခေါက်ဆွဲ.
ပ [U+1015 MYANMAR LETTER PA]. For foreign sounds.
သ [U+101E MYANMAR LETTER SA], eg. ပန်းသီး, where affected by sandhi.
စ [U+1005 MYANMAR LETTER CA], eg. စာအုပ်.
ဇ [U+1007 MYANMAR LETTER JA], eg. ဇွန်း.
ဈ [U+1008 MYANMAR LETTER JHA], eg. စျေး, rare.
စ [U+1005 MYANMAR LETTER CA], eg. စေတီ, when affected by sandhi, but also irregularly in initial position.
ဆ [U+1006 MYANMAR LETTER CHA], eg. ထမင်းဆိုင်, when affected by sandhi.
ဆ [U+1006 MYANMAR LETTER CHA], eg. ဆိုင်.
ဟ [U+101F MYANMAR LETTER HA], eg. ဟုတ်ကဲ့.
ှ [U+103E MYANMAR CONSONANT SIGN MEDIAL HA], eg. နှာ. Medial.
မ [U+1019 MYANMAR LETTER MA], eg. မာ.
ည [U+100A MYANMAR LETTER NNYA], eg. ညာ. (Silent in final position.)
င [U+1004 MYANMAR LETTER NGA], eg. ငါး.
ရ [U+101B MYANMAR LETTER RA], in loan words, eg. ရေဒီယို.
In Unicode 5.0, ် [U+103A MYANMAR SIGN ASAT] did not exist, and U+1039 MYANMAR SIGN VIRAMA had to be used for both visible and non-visible viramas. This approach was problematic in that, since there are no spaces between words, it is not easy to automatically ascertain whether a virama should appear above a consonant or cause the stacking effect. For example, should my sequence of characters appear like this, အမ်မီတာ, or like this အမ္မီတာ? To get around this in Unicode 5.0 you needed to use a U+200C ZERO WIDTH NON-JOINER (ZWNJ) after the virama if you wanted it to remain visible (ie. the first example above would have been transcribed as ʔmˣmïta and the second as ʔm͓mïta). The non-joiner prevents stacking. In practice, this meant that there were very many ZWNJ characters in Burmese text, since there are many syllable-final consonants needing ASAT, and typing in the Myanmar script was therefore much more time-consuming than it needed to be.
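Text encoded under the Unicode 5.0 model can be brought closer to the current model by replacing each virama plus ZWNJ pair with an asat. The sketch below covers only that one substitution. It does not convert medials or the tall AA, so it is not a complete migration, and `upgrade_visible_virama` is a hypothetical name.

```python
VIRAMA = "\u1039"
ZWNJ = "\u200C"
ASAT = "\u103A"


def upgrade_visible_virama(text: str) -> str:
    """Replace the 5.0-style visible virama (VIRAMA + ZWNJ) with ASAT."""
    return text.replace(VIRAMA + ZWNJ, ASAT)


# First example above in the 5.0 style: A, MA, VIRAMA, ZWNJ, MA, II, TA, AA.
old_style = "\u1021\u1019\u1039\u200C\u1019\u102E\u1010\u102C"
new_style = upgrade_visible_virama(old_style)
print([f"U+{ord(ch):04X}" for ch in new_style])

# A virama without ZWNJ still means stacking and is left untouched.
stacked = "\u1021\u1019\u1039\u1019\u102E\u1010\u102C"
print(upgrade_visible_virama(stacked) == stacked)
```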
Unicode 5.1 also introduced dedicated medial consonants. This makes it easier to type Myanmar text, but also allows for easy distinction of subjoined variants of these consonants rather than the usual medial forms.
One or two other characters were introduced, such as the TALL AA.
There are four tones in Burmese, creaky, low, high and stopped. The tone of a syllable can be indicated by the vowel used, or by combining a vowel and one of the following combining marks.
The stopped tone only, but always, occurs where a syllable ends in a stop consonant. Syllables that end with a vowel sound and syllables that end with the nasal sound ɴ can have one or more of the other three tones.
The phonemic transcriptions here use the following conventions for marking tones, using a as the base for the examples.
For more details about tones in Burmese, see burmese_phonetics.
This section provides more detailed information about the pronunciation of rhymes in Burmese.
A vowel plus tone combination is called a rhyme.
The following table shows the normal combinations of vowel, final consonant and tone mark characters that are seen in Burmese, and their pronunciations. Read down the left column to find the symbol used for the vowel sound, and across the top row to find syllable final consonants. The table doesn't take vowel reduction into account.
|-||a̰, ə||ɛʔ||ɪʔ||aʔ||aʔ||ɪ̀ɴ||ì, è, ɛ̀||ɪ̀ɴ||àɴ||àɴ||àɴ|
|+ ့||ɪ̰ɴ||ḭ, ḛ, ɛ̰||ɪ̰ɴ||a̰ɴ||a̰ɴ||a̰ɴ|
|+ း||ɪ́ɴ||í, é, ɛ́||ɪ́ɴ||áɴ||áɴ||áɴ|
|+ း (ဧး)||é|
There are 7 main vowel sounds in open syllables. The following lists those sounds and their different representations for the three tones in Burmese, creaky, low and high, that apply to open syllables. (Combining symbols are shown with အ, and alternate independent forms are shown in parentheses.)
|a||Primary central||အာ||အား||inherent||လာ là come|
|i||Primary front||အီ||အီး (ဤ)||အိ (ဣ)||မီး mí fire|
|u||Primary back||အူ (ဦ)||အူး||အု (ဥ)||တူ tù chopsticks|
|e||High front mid||အေ||အေး (ဧ)||အေ့||နှေး n̥é slow|
|o||High back mid||အို||အိုး||အို့||ဆိုး sʰó bad|
|ɛ||Low front mid||အယ်||အဲ||အဲ့||ဘယ် bɛ̀ which|
|ɔ||Low back mid||အော် (ဪ)||အော (ဩ)||အော့||ပျော် pjɔ̀ happy|
The following table summarises the above in a way that allows you to see how the various tones are applied to open syllables using the native Myanmar characters. Where long vs. short forms exist, for the purposes of clarity in the table, the long form is taken here to be the standard form and the short form a variant.
|vowel||low||high||creaky|
|a||no mark||visarga||inherent vowel|
|i||no mark||visarga||short form|
|u||no mark||visarga||short form|
|e||no mark||visarga||dot below|
|o||no mark||visarga||dot below|
|ɛ||killed-y form||no mark||dot below|
|ɔ||asat||no mark||dot below|
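The open-syllable spellings can be captured in a small lookup table. This is a sketch built from the forms shown with အ in the list of vowel sounds above. It ignores the tall AA variant and the independent vowel letters, and the names `RHYMES` and `open_syllable` are hypothetical.

```python
# vowel -> tone -> characters that follow the onset (taken from the forms on အ)
RHYMES = {
    "a": {"low": "\u102C", "high": "\u102C\u1038", "creaky": ""},
    "i": {"low": "\u102E", "high": "\u102E\u1038", "creaky": "\u102D"},
    "u": {"low": "\u1030", "high": "\u1030\u1038", "creaky": "\u102F"},
    "e": {"low": "\u1031", "high": "\u1031\u1038", "creaky": "\u1031\u1037"},
    "o": {"low": "\u102D\u102F", "high": "\u102D\u102F\u1038",
          "creaky": "\u102D\u102F\u1037"},
    "ɛ": {"low": "\u101A\u103A", "high": "\u1032", "creaky": "\u1032\u1037"},
    "ɔ": {"low": "\u1031\u102C\u103A", "high": "\u1031\u102C",
          "creaky": "\u1031\u102C\u1037"},
}


def open_syllable(onset: str, vowel: str, tone: str) -> str:
    """Spell an open syllable: onset (consonant plus any medials) + rhyme."""
    return onset + RHYMES[vowel][tone]


print(open_syllable("\u101C", "a", "low"))        # LA + AA
print(open_syllable("\u1019", "i", "high"))       # MA + II + VISARGA
print(open_syllable("\u1015\u103B", "ɔ", "low"))  # PA + MEDIAL YA + E + AA + ASAT
```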
Vowels in 'closed' syllables end in a glottal stop or nasalisation. Historically, however, they ended in one of four nasals or four stops, and this is still reflected in the orthography. The vowel quality has also evolved in these syllables, typically producing diphthongs.
To indicate that the consonant is syllable-final, an asat is placed over it.
The sound values of vowel signs used in open and closed syllables differ systematically as follows.
i becomes eɪ, eg. အိန် ʔèɪɴ; အိတ် ʔeɪʔ.
u becomes oʊ, eg. အုန် ʔòʊɴ; အုတ် ʔoʊʔ.
ɔ becomes aʊ, eg. အောင် ʔàʊɴ; အောက် ʔaʊʔ.
o becomes aɪ, eg. အိုင် ʔàɪɴ; အိုက် ʔaɪʔ.
The inherent a is a lot more complicated, becoming one of ɪ, e, a, or ɛ.
The most common sounds are shown in the large table above, and in the smaller tables below. There are other combinations of vowel and final consonant found in Burmese words of Indian origin, which often stick to the original Indian spelling. However, they tend to follow Burmese pronunciation, eg. ဓာတ် daʔ, ဗိုလ် bò, ဥယ္ယာဉ် ʔṵjaɴ.
The following table lists the main sounds in Burmese where the syllable ends in a nasal.
|ã||အန်||အမ်||ပန်း páɴ flower|
|ĩ||အင်||ဝင် wɪ̀ɴ enter|
|ũ||အွန်||ဇွန်း zʊ́ɴ spoon|
|eĩ||အိန်||အိမ်||အိမ် ʔèɪɴ house|
|oʊ̃||အုန်||အုမ်||ရန်ကုန် jàɴkòʊɴ Rangoon|
|aʊ̃||အောင်||ကောင်း káʊɴ good|
|aɪ̃||အိုင်||ဆိုင် sʰàɪɴ store|
Note how အည် doesn't end in a nasalisation. There is another consonant, ဉ [U+1009 MYANMAR LETTER NYA], which has come to be used to produce nasalisation.
These syllables are by default low in tone, but creaky and high tones can be indicated using ့ [U+1037 MYANMAR SIGN DOT BELOW] and း [U+1038 MYANMAR SIGN VISARGA] in a very regular way. Note that the tone mark appears at the end of the syllable, not immediately after the vowel, eg. အုန့် and ကောင်း.
The following table lists the main sounds in Burmese where the syllable ends in a stop.
|aʔ||အတ်||အပ်||ဖတ် pʰaʔ read|
|iʔ||အစ်||နှစ် n̥ɪʔ year|
|ɛʔ||အက်||ကြက် tɕɛʔ chicken|
|uʔ||အွတ်||လွတ်လပ် lʊʔlaʔ independent|
|eiʔ||အိတ်||အိပ်||အရိပ် ʔa̰jeɪʔ shadow|
|oʊʔ||အုတ်||အုပ်||စာအုပ် sàʔoʊʔ book|
|aʊʔ||အောက်||နောက် naʊʔ next|
|aɪʔ||အိုက်||လိုက် laɪʔ follow|
These syllables are all unmarked 4th (stopped) tone.
The Unicode Myanmar block has two characters with the general category symbol. Neither are used in Burmese.
The Unicode Myanmar block includes two sets of digits. The first is used for Myanmar, but also tends to be used for other languages, including those with their own scripts, such as Tai Nüa.
There is also a set of Shan digits.
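Both digit sets are decimal digits in Unicode, so Python's built-in `int` can parse them directly. Going the other way takes a translation table. A short sketch, with hypothetical helper names:

```python
# Myanmar digits occupy U+1040..U+1049; Shan digits occupy U+1090..U+1099.
MYANMAR_DIGITS = "".join(chr(0x1040 + d) for d in range(10))
TO_MYANMAR = str.maketrans("0123456789", MYANMAR_DIGITS)


def to_myanmar_digits(number: int) -> str:
    """Format an integer using Myanmar digits."""
    return str(number).translate(TO_MYANMAR)


def from_digits(text: str) -> int:
    """Parse a string of Unicode decimal digits (Myanmar, Shan, ASCII, ...)."""
    return int(text)


print(to_myanmar_digits(2020))
print(from_digits("\u1041\u1042\u1043"))  # Myanmar digits one, two, three
print(from_digits("\u1091\u1092"))        # Shan digits one, two
```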
Myanmar text is written horizontally, left to right.
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
Burmese doesn't have any features relevant to cursive text, baselines, or character transforms.
You can experiment with examples using the Burmese character app.
Glyphs for subscripted consonants tend to be smaller than their full forms, eg. သဒ္ဒါ and may be rotated, eg. က္ဌ .
The following are additional examples of content-sensitive glyph shaping that should be taken care of by the font.
The shape of ြ [U+103C MYANMAR CONSONANT SIGN MEDIAL RA] changes according to what it surrounds, eg. compare the two different widths in the words and note the shortening at the top right of the second word ကြက်သွန်ဖြူ ဝန်ကြီး The joining behaviour of ျ [U+103B MYANMAR CONSONANT SIGN MEDIAL YA] also differs, eg. compare ချက် ကျွန်မတို့
The asat varies its position and shape according to context, eg. လမ်း ဒေါ်လေး
The shape of NA changes when something appears below it, eg.နို့နဲ့ Similarly, the bottom of NYA ဉ also changes in the following context, ပဉ္စမ
The placement of ့ [U+1037 MYANMAR SIGN DOT BELOW], used as a tone mark, varies slightly according to context, eg. ပြီးခဲ့တဲ့ တချို့ as does that of ှ [U+103E MYANMAR CONSONANT SIGN MEDIAL HA], eg. it is smaller than usual in ကောက်ညှင်း and the shape and position are very different in ရွှေပဲသီး
Other examples noted earlier include the change of shape and position of ု [U+102F MYANMAR VOWEL SIGN U] and ူ [U+1030 MYANMAR VOWEL SIGN UU] when other items appear below the base consonant, and the production of the kinzi.
Observation: Italicisation is used in Wikipedia for quotations and for citing the titles of articles or publications.
Myanmar script doesn't separate words in a phrase.
There is, however, a concept of words. Native Burmese words are typically monosyllabic, but there are also multisyllabic words, and these should not be broken during line wrapping.
: [U+003A COLON]
Spaces are used to separate phrases, rather than words. Phrase length is variable. Examples can be seen in the extract from the Declaration of Human Rights at the top of the page.
Punctuation is commonly limited to ၊ [U+104A MYANMAR SIGN LITTLE SECTION] and ။ [U+104B MYANMAR SIGN SECTION], with significance close to comma and full stop, respectively.
|initial||” [U+201D RIGHT DOUBLE QUOTATION MARK]|
Quoted speech may also be slanted (see fontstyle).
၌ [U+104C MYANMAR SYMBOL LOCATIVE] is an abbreviation meaning 'locative marker', ie. 'at, in, on', used in Burmese literary form.
၍ [U+104D MYANMAR SYMBOL COMPLETED] means 'subordinate marker', used to connect two trains of thought, ie. 'so / because'. Used in Burmese literary form.
၎ [U+104E MYANMAR SYMBOL AFOREMENTIONED] is used in the sequence ၎င်း as a demonstrative noun (this or that) when it precedes a noun. It is also used as a connecting phrase (meaning as well as) between two nouns within a clause.
၏ [U+104F MYANMAR SYMBOL GENITIVE] is used in Burmese literary form as a genitive that is written at the end of a sentence ending with a verb. It also marks possession of a preceding noun. It is used as a full stop if the sentence ends immediately with a verb.
Follow the links on the names of the above characters for more information.
If it is necessary to break text within a phrase, breaks can occur at syllable boundaries, but not within a word. The difficulty is that there is no visual information about which sequences of syllables constitute a word.
One way of detecting line-break opportunities is to use a dictionary to search for polysyllabic words, and then break at syllable boundaries outside those words. This approach may, however, run into problems when uncommon or new words are used, especially borrowed foreign terms.
A common approach is to break lines at phrase boundaries and then use justification to adjust inter-phrase spacing. [h,12]
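The dictionary approach can be sketched roughly as follows. This is only an illustration under stated assumptions: the text is assumed to have been split into syllables already (syllable segmentation is a separate problem not covered here), and the function name `break_opportunities`, the `max_word_len` parameter, and the two-entry dictionary are all hypothetical. Greedy longest-match lookup is used here as one possible strategy, not necessarily the one any particular implementation uses.

```python
def break_opportunities(syllables, dictionary, max_word_len=4):
    """Return the indexes i at which a line break is allowed before syllables[i].

    A run of syllables that forms a dictionary word is kept together;
    every other syllable boundary is treated as a break opportunity.
    """
    breaks = []
    i = 0
    n = len(syllables)
    while i < n:
        length = 1  # default: a monosyllabic word
        # Try the longest candidate first (greedy longest match).
        for k in range(min(max_word_len, n - i), 1, -1):
            if "".join(syllables[i:i + k]) in dictionary:
                length = k
                break
        i += length
        if i < n:
            breaks.append(i)
    return breaks


# Two polysyllabic words used as examples earlier on this page,
# given here as hand-segmented syllable lists (illustrative data only).
WORD_A = ["ကြက်", "သွန်", "ဖြူ"]
WORD_B = ["ဝန်", "ကြီး"]
dictionary = {"".join(WORD_A), "".join(WORD_B)}

# The only permitted break falls between the two words.
print(break_opportunities(WORD_A + WORD_B, dictionary))
```

With an empty dictionary the same function allows a break at every syllable boundary, which shows why words missing from the dictionary end up being split.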
An alternative is to indicate break points by inserting U+200B ZERO WIDTH SPACE (ZWSP) between words when the content is developed.
Otherwise, you could tie the syllables in a polysyllabic word together using U+2060 WORD JOINER while authoring the content. This requires less intervention than adding ZWSP, since the number of polysyllabic words is smaller than the total number of words. A current problem with that approach is that applications must be able to ignore the word joiner for searching, sorting, and the like. For this reason Hosken recommends against using it, and recommends instead the use of a dictionary, with ZWSP as a backup for words that the dictionary doesn't handle well. However, it's not clear which words a dictionary will fail to recognise when the text is used across different platforms and applications, so this is not an ideal solution either, not to mention that it is difficult for an author to know in advance which words will cause problems and which won't. [h,12]
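The two authoring-time options can be illustrated with a small sketch. The helper names below are hypothetical, and the input is assumed to be already divided into words or syllables by the author. The last helper shows the kind of normalisation an application would need before searching or sorting text that contains these invisible characters.

```python
ZWSP = "\u200B"  # U+200B ZERO WIDTH SPACE: marks a permitted break point
WJ = "\u2060"    # U+2060 WORD JOINER: prevents a break at that position


def mark_word_boundaries(words):
    """Join words with ZWSP so that a line may break between them."""
    return ZWSP.join(words)


def tie_syllables(syllables):
    """Join the syllables of one polysyllabic word with WJ."""
    return WJ.join(syllables)


def strip_break_controls(text):
    """Remove ZWSP and WJ, eg. before searching or sorting."""
    return text.replace(ZWSP, "").replace(WJ, "")


# Illustrative data: one two-syllable word, segmented by hand.
syllables = ["ဝန်", "ကြီး"]
plain = "".join(syllables)
tied = tie_syllables(syllables)

# The tied form contains one extra, invisible code point, so a naive
# string comparison against the plain form fails until it is stripped.
print(tied == plain, strip_break_controls(tied) == plain)
```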
Characters used for Burmese have the following assignments related to line-break properties.
|AL||4||၌ ၍ ၎ ၏|
|NU||10||၁ ၉ ၄ ၈ ၀ ၂ ၃ ၅ ၆ ၇|
|QU||4||‘ ’ “ ”|
|SA||67||က ခ ဂ ဃ င စ ဆ ဇ ဈ ဉ ည ဋ ဌ ဍ ဎ ဏ တ ထ ဒ ဓ န ပ ဖ ဗ ဘ မ ယ ရ လ ဝ သ ဟ ဠ အ ဣ ဤ ဥ ဦ ဧ ဩ ဪ ဿ ါ ာ ိ ီ ု ူ ေ ဲ ံ ့ း ္ ် ျ ြ ွ ှ ၒ ၓ ၔ ၕ ၖ ၗ ၘ ၙ|
AL (ordinary alphabetic and symbol characters) requires other characters to provide break opportunities; otherwise, unless tailored rules are applied, no line breaks are allowed between pairs of them.
BA (break after) indicates that it is normal to break after that character.
NU (number) behaves like ordinary characters (AL) in the context of most characters, but activates the prefix and postfix behavior of prefix and postfix characters.
QU (quotation) characters can be opening or closing, or even both, depending on usage. The default is to treat them as both opening and closing.
SA (Southeast Asian) characters require morphological analysis to determine break opportunities, in a way similar to a hyphenation algorithm. No break opportunities will be found otherwise. Complex context analysis, often involving dictionary lookup of some form, is required to determine non-emergency line breaks. If such analysis is not available, it is recommended to treat them as AL.
Justification may begin by adjusting inter-phrase spacing. [h,12]
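As a rough illustration of these classes, the sketch below looks up a line-break class by code point and applies the AL fallback for SA characters. The function names and the `has_dictionary` flag are hypothetical. The ranges are a simplification that covers the characters listed above; the BA assignment for ၊ and ။ is taken from the Unicode line-break data rather than from the rows shown here. A real implementation would read the Unicode LineBreak.txt data file instead of hard-coding ranges.

```python
def line_break_class(ch):
    """Return a line-break class for characters used in Burmese, else None."""
    cp = ord(ch)
    if 0x1040 <= cp <= 0x1049:   # Myanmar digits
        return "NU"
    if cp in (0x104A, 0x104B):   # little section, section
        return "BA"
    if 0x104C <= cp <= 0x104F:   # locative, completed, aforementioned, genitive
        return "AL"
    if ch in "\u2018\u2019\u201C\u201D":  # curly quotation marks
        return "QU"
    if 0x1000 <= cp <= 0x103F or 0x1050 <= cp <= 0x1059:
        return "SA"              # letters, vowel signs, medials, etc.
    return None


def effective_class(ch, has_dictionary):
    """Treat SA as AL when no dictionary-based analysis is available."""
    cls = line_break_class(ch)
    if cls == "SA" and not has_dictionary:
        return "AL"
    return cls


print(line_break_class("\u1000"), effective_class("\u1000", has_dictionary=False))
```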
Ready-made Counter Styles lists a numeric counter style for use with the Burmese language. You can experiment with this style using the Counter styles converter.
The myanmar numeric style is decimal-based and uses the digits ၀ ၁ ၂ ၃ ၄ ၅ ၆ ၇ ၈ ၉.
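Because the style is a plain decimal system, a counter value can be produced by mapping each ASCII digit to the corresponding Myanmar digit (U+1040 to U+1049). The following is a minimal sketch; the function name `to_myanmar_numeric` is hypothetical and is not part of any counter-style tool.

```python
# Myanmar digits zero to nine occupy U+1040..U+1049 in order.
MYANMAR_DIGITS = "".join(chr(0x1040 + d) for d in range(10))
_TO_MYANMAR = str.maketrans("0123456789", MYANMAR_DIGITS)


def to_myanmar_numeric(n: int) -> str:
    """Render an integer using Myanmar digits (decimal, positional)."""
    return str(n).translate(_TO_MYANMAR)


for value in (1, 10, 2020):
    print(value, to_myanmar_numeric(value))
```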
This section is for any features that are specific to this script and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.
Version 12.0 of the Unicode Standard has the following blocks dedicated to the Myanmar script:
The modern Burmese orthography described here uses characters from the following Unicode blocks.
|Myanmar||75||က ခ ဂ ဃ င စ ဆ ဇ ဈ ဉ ည ဋ ဌ ဍ ဎ ဏ တ ထ ဒ ဓ န ပ ဖ ဗ ဘ မ ယ ရ လ ဝ သ ဟ ဠ အ ဣ ဤ ဥ ဦ ဧ ဩ ဪ ါ ာ ိ ီ ု ူ ေ ဲ ံ ့ း ္ ် ျ ြ ွ ှ ဿ ၀ ၁ ၂ ၃ ၄ ၅ ၆ ၇ ၈ ၉ ၊ ။ ၌ ၍ ၎ ၏|
See also the Character usage lookup page, and the Script Comparison Table.
According to ScriptSource, the Myanmar script is used for the following languages: