Lao

orthography notes

Updated 7 December, 2024

This page brings together basic information about the Lao script and its use for the Lao language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Lao using Unicode.

Referencing this document

Richard Ishida, Lao Orthography Notes, 07-Dec-2024, https://r12a.github.io/scripts/laoo/lo

Sample


ມາດຕາ 1: ມະນຸດເກີດມາມີສິດເສລີພາບ ແລະ ສະເໝີໜ້າກັນໃນທາງກຽດຕິສັກ ແລະ ທາງສິດດ້ວຍມະນຸດມີສະຕິສຳປັດຊັນຍະ(ຮູ້ດີຮູ້ຊົ່ວ)ແລະມີມະໂນທຳຈື່ງຕ້ອງປະພຶດຕົນຕໍ່ກັນໃນທາງພີ່ນ້ອງ.

ມາດຕາ 2: ຂໍ້ 1.ຄົນຜູ້ໃດກໍ່ອ້າງຕົນໄດ້ວ່າ:ມີສິດ ແລະ ເສລີພາບທຸກຢ່າງທີ່ໄດ້ປ່າວຮ້ອງຢູ່ໃນປະກາດສະບັບນີ້ໂດຍບໍ່ເລືອກໜ້າ ບໍ່ຈຳກັດເຊື້ອຊາດ,ຜິວເນື້ອ,ເພດ,ສາສະໜາ ຄວາມຄິດເຫັນໃນດ້ານການເມືອງ ຫຼື ອື່ນໆ ກຳເນີດແຫ່ງຊາດຫຼື ສັງຄົມຖານະການມີຊັບສົມບັດມາກ ຫຼື ນ້ອຍ,ມີຕະກຸນ ຫຼື ຖານະອື່ນໆ. ຂໍ້ 2.ອີກປະການໜື່ງ ຈະບໍ່ຈຳກັດຢ່າງໃດໃນການແຕກຕ່າງກັນອັນເນື່ອງມາຈາກລະບຽບການເມືອງການປົກຄອງ ຫຼື ລະຫວ່າງຊາດຂອງປະເທດ ຫຼື ດິນແດນ ຊື່ງບຸກຄົນຜູ້ໃດຜູ້ໜື່ງສັງກັດຢູ່;ດິນແດນນັ້ນຈຳເປັນເອກະລາດຢູ່ໃນຄວາມອາລັກຂາຂອງມະຫາອຳນາດ ຫຼື ບໍ່ມີອິດສະຫຼະ ຫຼື ຖືກລົດອະທິປະໄຕລົງໂດຍຈຳກັດກໍ່ຕາມ.

Source: Unicode UDHR, articles 1 & 2

Usage & history

Origins of the Lao script, 16thC – today.

Phoenician

└ Aramaic

└ Brahmi

└ Tamil-Brahmi

└ Pallava

└ Khmer

└ Sukhothai

└ Fakkham

└ Tai Noi

└ Lao

+ Tai Yo

The Lao language has around 4,000,000 speakers. The Lao script is used for writing the Lao language, and is also the official script of a number of minority languages in Laos. There is a considerable Lao-speaking population in Thailand who write their language with the Thai script.

ອັກສອນລາວ ʔáksɔ̌ːn láːw

The Lao alphabet was adapted from the Khmer script, and is a sister system to the Thai script, with which it shares many similarities and roots. However, Lao has fewer characters and is formed in a more curvilinear fashion than Thai. Further distancing from the Thai script occurred via a number of reforms. In 1975, the latest spelling reform simplified and standardised the script.

More information: Wikipedia

Orthographic development & variants

The script was originally an abugida, but since the script reforms leading up to 1960 it has been alphabetic. When the communist Pathet Lao overthrew the Lao government in 1975, they implemented a final spelling reform which simplified and standardized the script.

Basic features

Lao is an alphabet. This means that both consonants and vowels are indicated. See the table to the right for a brief overview of features for the modern Lao orthography.

Lao text runs left to right in horizontal lines. Spaces separate phrases, rather than words. There is no case distinction.


Lao has 26 consonants. Each onset consonant is associated with a high, mid, or low class related to tone.

No conjuncts are used for consonant clusters, except for one subjoined consonant, used in combination only with HA.

Syllable-initial clusters and syllable-final consonant sounds are all written with ordinary consonant letters. However, because all vowels are written, it is not difficult to algorithmically detect syllable boundaries.


Unlike its close relative, Thai, the Lao orthography is an alphabet and has no inherent vowel. It represents vowels using 9 combining marks, and 12 letters, much of the time grouped together in various combinations. This page lists 28 composite vowels, which can involve up to 4 glyphs (plus a tone mark) at a time, and can surround the base consonant(s) on up to 3 sides simultaneously.

Vowels are often written differently when they appear in a closed vs. open syllable.

Lao uses visual placement: only the 8 vowel components that appear above or below the consonant are combining marks; the others are ordinary spacing characters that are typed in the order seen.

There are 5 pre-base letters. There are no single-character circumgraphs in Lao text, but a single vowel or diphthong is frequently made up of multiple components, some of which will appear on different sides of the base consonant(s).

There are no independent vowels, and standalone vowel sounds are written using vowel signs applied to ອ (0EAD).

Lao has 6 tones. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark.

Character index

Letters


Basic consonants

ຜ␣ຖ␣ຂ␣ປ␣ບ␣ຕ␣ດ␣ຈ␣ກ␣ອ␣ພ␣ທ␣ຄ␣ຝ␣ສ␣ຫ␣ຟ␣ຊ␣ຮ␣ໝ␣ໜ␣ມ␣ນ␣ຍ␣ງ␣ຢ␣ລ␣ວ␣ໝ␣ໜ

Extended consonants

Vowels

ເ␣ແ␣ໂ␣ໄ␣ໃ␣ະ␣າ␣ຽ␣ຳ

Other

ຯ␣ໆ

Combining marks


Vowels

ິ␣ີ␣ຶ␣ື␣ຸ␣ູ␣ົ␣ໍ␣ັ

Tones

່␣້␣໊␣໋

Other

Not used for Lao

Numbers

໐␣໑␣໒␣໓␣໔␣໕␣໖␣໗␣໘␣໙

Punctuation

‘␣’␣“␣”␣«␣»␣…

ASCII

,␣-␣.␣:␣;␣?␣!␣(␣)

Symbols


Other

​␣⁠

To be investigated

%␣*␣[␣]␣_␣§␣ʼ␣͏␣‌␣‍␣–␣—␣†␣‡␣‰␣′␣″␣‹␣›

Phonology


Phones in a lighter colour are non-native or allophones.

Vowel sounds

Plain vowels

i i ɯ ɯː ɯ ɯː u u e e ɤ ɤː ɤ ɤː o o ɛ ɛː ɛ ɛː ɔ ɔː ɔ ɔː a a

Complex vowels

iə̯ iːə̯ iw iːw ɯə̯ ɯːə̯ uə̯ uːə̯
ɤːj
ɛw ɛːw
aj aːj aw

Consonant sounds

Initials

labial alveolar post-alveolar palatal velar glottal
stops p b t d     k ʔ
aspirated      
affricates     t͡ɕ      
fricatives f s     x h
nasals m n   ɲ ŋ
approximants ʋ/w l   j  
trills/flaps   r  

Finals

labial alveolar post-alveolar palatal velar glottal
stop p t     k ʔ
nasal m n     ŋ
approximant w     j  

Tone

Lao has 6 phonological tones in unchecked syllables, and 4 in checked syllables.

Name Vowel Final Unchecked? Checked?
Rising ˨˦ or ˨˩˦ ě  
High ˦ é
High falling ˥˧ ê  
Mid ˧ ē
Low ˩ è
Low falling ˧˩ e᷆

The tone depends on the class of the initial consonant in a syllable, the structure of the syllable, and whether or not a tone mark is applied to override the default.

Tone values vary depending on location in Laos. There is some disagreement about whether there are 5 or 6 tones in Vientiane, and the tables below show that different sources disagree on the tones produced. According to some, most dialects of Lao and Isan have six tones, those of Luang Prabang have five.wl

More details about tone values

The following tables present different descriptions of tone values in Lao for the Vientiane dialect. The first and third tables basically agree on the tone value, although the names of tones vary. The middle table shows some different tone values altogether. See a list of studies for Vientiane tones.

This diagram shows 5 tones with names corresponding to a mixture of the first two tables below.

Diagrams of tone vectors.

Tone marks are normally used only on open syllables, and modify the default tone value. Two of the four tone marks are only used with Class 1 consonants. Tone marks tend to be placed directly over the consonant (or superscript vowel), unlike Thai which tends to place them slightly to the right.

Open or live syllables are those that end with a long vowel or sonorant (eg. ງນມຍວ). Closed or dead syllables end with a stop consonant (eg. ກດບ) or short vowel.

  | Open | Closed, short vowel | Closed, long vowel | Tone mai eːk | Tone mai toː | Tone mai tiː | Tone mai cat-ta-waː
Class 1 | low | ˊ high | ˆ low falling | ˉ mid | ˋ high falling | ˋ high falling | ˇ low rising
Class 2 | ˇ low rising | ˊ high | ˆ low falling | ˉ mid | ˆ low falling | - | -
Class 3 | ˊ high | ˉ mid | ˋ high falling | ˉ mid | ˋ high falling | - | -

Refs: Daniels

  | Live | Dead, short vowel | Dead, long vowel | Tone mai eːk | Tone mai toː | Tone mai tiː | Tone mai cat-ta-waː
Class 1 | ˋ low | ˇ rising | ˇ rising | mid | ˆ falling | ˊ high | ˇ rising
Class 2 | ˇ rising | ˇ rising | ˋ low | mid | ˋ low | - | -
Class 3 | ˊ high | mid | ˆ falling | mid | ˆ falling | - | -

Refs: Simmala

  | Live | Dead, short vowel | Dead, long vowel | Tone mai eːk | Tone mai toː | Tone mai tiː | Tone mai cat-ta-waː
Class 1 | low rising | high rising | low falling | high-mid | high falling | |
Class 2 | low rising | high rising | low falling | high-mid | low falling | |
Class 3 | high rising | high-mid | high falling | high-mid | high falling | |

Refs: SEAlang

The Simmala chart appears a little suspect, since they say in the text that the rising tone doesn't occur in dead syllables, and yet the book has examples of dead syllables with long vowels with a low tone.

Structure

The syllable is a basic element of the Lao language, and many words are monosyllabic. All syllables begin with a written consonant. Syllables that begin with a vowel sound are written with a silent base consonant.

The phonological structure of a syllable is (C)w?V(C).

Unlike Thai, its close neighbour linguistically, Lao doesn't naturally support onset clusters of consonants other than with ʷ, and then not before rounded vowels.wl,#Syllables Onset consonants followed by labialisation include: tʷʰ tɕʷ kʷ kʷʰ ʔʷ sʷ ŋʷ lʷ.

Only a small set of consonants occur at the end of a syllable. ʔ occurs after short vowels.

Lao also has 6 phonological tones in unchecked syllables (ie. ending in a vowel, or m, n, ŋ, w or j), and 4 in checked syllables (ie. ending in p, t, k, or ʔ).

Vowels

Vowel summary table

The following table summarises the main vowel to character assignments.

The diphthongs section below contains one character that incorporates a final nasal. The list doesn't include those combinations that involve simply appending a glide after the vowel (see compositeV). Standalone vowels use the character shown as a base for the normal vowel symbols. ◌ indicates the location of consonants, but doesn't necessarily indicate the presence of a combining mark.

Plain:
◌ິ␣◌ີ␣ ␣◌ຶ␣◌ື␣ ␣◌ຸ␣◌ູ
ເ◌ະ␣ເ◌ັ◌␣ເ◌␣ ␣ເ◌ິ␣ເ◌ີ␣ ␣◌ົ␣ໂ◌ະ␣ໂ◌
ແ◌ະ␣ແ◌ັ◌␣ແ◌␣ ␣ເ◌າະ␣◌ັອ␣◌ໍ␣◌ອ
◌ັ␣◌ະ␣◌າ
Diphthongs:
◌ັຽ␣ເ◌ັຍ␣◌ຽ␣ເ◌ຍ␣ ␣ເ◌ຶອ␣ເ◌ືອ␣ ␣◌ັວ◌␣◌ົວະ␣◌ົວ␣◌ວ◌␣◌ວາ◌
ໄ◌␣ໃ◌␣ເ◌ົາ␣◌ຳ
Standalone:
ອ

For more details see vowel_mappings.

Vowel signs A & MAI KAN

ກະ ka U+0E81 LAO LETTER KO + U+0EB0 LAO VOWEL SIGN A

The short vowel a has to be written explicitly, using 0EB0 in open syllables, and 0EB1 in closed syllables. The following word shows examples of both.

ລະດັບ
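As a rough illustration of the rule above, the following Python sketch scans a string for the two short-a signs and reports which one was used. The function name `short_a_signs` is invented for this example, and the sample word is the one shown above, written with explicit escapes so the stored order is unambiguous.

```python
import unicodedata

VOWEL_SIGN_A = "\u0EB0"  # short a in open syllables
MAI_KAN = "\u0EB1"       # short a in closed syllables

def short_a_signs(text):
    """Return (index, Unicode name, syllable type) for each short-a sign."""
    usage = {VOWEL_SIGN_A: "open", MAI_KAN: "closed"}
    return [(i, unicodedata.name(ch), usage[ch])
            for i, ch in enumerate(text) if ch in usage]

word = "\u0EA5\u0EB0\u0E94\u0EB1\u0E9A"  # ລະດັບ
for hit in short_a_signs(word):
    print(hit)
```

The lookup only labels the sign itself; it does not attempt to find syllable boundaries.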

When used in conjunction with other vowels, 0EB0 and 0EB1 are also used to indicate short vowels for open and closed syllables, respectively. In phonetic transcriptions, shortened open vowels often end with a glottal stop. For example, compare:

ໂຕະ

ໂຕ

See also vlength.

Vowels following consonants

Unlike its close relative, Thai, the Lao orthography is an alphabet and has no inherent vowel. It represents vowels using combining marks, dedicated vowel letters, and a couple of consonants, much of the time grouped together in various combinations. This page lists 28 composite vowels, which can involve up to 4 glyphs, and can surround the base consonant(s) on up to 3 sides simultaneously.

Vowels are often written differently when they appear in a closed vs. open syllable.

Lao uses visual placement: only the 8 vowel components that appear above or below the consonant are combining marks; the others are ordinary spacing characters that are typed in the order seen.

There are 5 pre-base letters. There are no single-character circumgraphs in Lao text, but a single vowel or diphthong is frequently made up of multiple components, some of which will appear on different sides of the base consonant(s).

None of the marks are spacing combining marks; the spacing glyphs used to write vowels are all letters.

Plain vowels

ເກິ U+0EC0 LAO VOWEL SIGN E + U+0E81 LAO LETTER KO + U+0EB4 LAO VOWEL SIGN I

The following panel lists Lao monophthongs. The dotted circle shows the location of adjacent consonants.

◌ິ␣◌ີ␣◌ຶ␣◌ື␣◌ຸ␣◌ູ␣ເ◌ະ␣ເ◌ັ◌␣ເ◌␣ເ◌ິ␣ເ◌ີ␣◌ົ␣ໂ◌ະ␣ໂ◌␣ແ◌ະ␣ແ◌ັ◌␣ແ◌␣ເ◌າະ␣◌ັອ␣◌ໍ␣◌ອ◌␣◌ັ◌␣◌ະ␣◌າ

A few of the vowels are written differently when the syllable is open or closed. Doubled dotted circles in the list above indicate special forms for closed syllables, eg. for ເ◌ັ◌ -e- and ແ◌ັ◌ -ɛ-.

0EAD is used as a vowel carrier for standalone vowels (see standalone), but is also used as a vowel. On its own, in the middle of a closed syllable, 0EAD is pronounced as the vowel -ɔː-, eg.

ຈອກ

Complex vowels

ກຽວ kiːə̯w U+0E81 LAO LETTER KO + U+0EBD LAO SEMIVOWEL SIGN NYO + U+0EA7 LAO LETTER WO

Lao diphthongs are complicated and numerous. The way they are written also sometimes varies according to whether they appear in an open or a closed syllable.

Many complex Lao vowels involve adding a consonant to represent the final -j or -w glide after a vowel. The following panel shows combinations that don't follow that pattern. Except for uə̯, they are spelled the same whether the syllable is open or closed. However, there are 2 ways of spelling most diphthongs in open syllables.

◌ຽ◌␣ເ◌ັຽ␣ເ◌ຍ␣◌ັຽ◌␣ເ◌ັຍ␣ເ◌ັຍະ␣ເ◌ືອ␣ເ◌ຶອ␣◌ົວ␣◌ວ◌␣◌ວາ◌␣◌ັວ␣◌ົວະ␣ໃ◌␣ໄ◌␣ເ◌ົາ

Note that the above set includes 3 diphthongs that are each written using a single code point.

On its own, in the middle of a closed syllable, the consonant 0EA7 is pronounced -uːə̯-, eg.

ບວມ

The next panel shows diphthongs and a triphthong that are produced by simply adding one of the consonants 0EA7, 0E8D, or 0EBD to the end of a previously mentioned vowel to represent the glide -w or -j.

◌ີວ␣◌ິວ␣◌ຽວ␣ເ◌ັຽວ␣◌ວາຍ␣ເ◌ີຍ␣ເ◌ີຽ␣ໂ◌ຍ␣ແ◌ວ␣◌ອຍ␣◌າຍ␣◌ັຍ␣◌າຽ␣◌າວ

0EBD was originally an alternate form of non-initial 0E8D, but is now used for diphthongs, either alone as iːə̯ or as the semi-vowel -j, eg.

ປຽກ

ຊາຽ

The final panel in this section shows a complete Lao rhyme that has a special spelling.

◌ຳ

0EB3 is classed as a vowel, but also contains the final consonant m, represented by a built-in nikhahit (cf. 0ECD). It is a spacing combining character, but the Unicode Standard classifies it as a letter.

Pre-base vowel signs

ໂກ koː U+0EC2 LAO VOWEL SIGN O + U+0E81 LAO LETTER KO

Five vowel signs appear to the left of the onset consonant. See an example in fig_prebase.

ເ◌␣ແ◌␣ໂ◌␣ ␣ໄ◌␣ໃ◌

Like Thai, Lao uses a visual encoding model, so these characters are not combining characters, and are typed and stored before the base. For example:

ແມວ
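The stored order can be checked with a few lines of Python. This is only a sketch using the standard `unicodedata` module; the helper name `describe` is made up for the example. It lists the code points of the word above and shows that the pre-base vowel comes first in memory and has a letter category rather than a combining-mark category.

```python
import unicodedata

def describe(text):
    """List each code point with its general category and Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
            for ch in text]

word = "\u0EC1\u0EA1\u0EA7"  # ແມວ, stored in visual order
for row in describe(word):
    print(row)
```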

Note that ແ (0EC1) should not be typed as two successive ເ (0EC0) characters.

These vowel signs are placed before the start of the syllable onset. This means that in a word with more than one consonant at the start (such as for shifting the tone) the pre-base vowel is placed to the left of the syllable-initial consonant, rather than to the left of the consonant after which it is pronounced. Tone marks and post-base vowel signs are however attached to the latter. For the following examples, click on the Lao text to see the order of characters.

ໃຫຍ່ ເຫຼືອງ

fig_prebase shows another example to graphically illustrate the relationships between the characters.

ແມວ
A vowel sign that appears 2 characters out of sequence from where it is pronounced, because the syllable onset is 2 characters long.

ແກວ່ງ

Vowel length

In common with other languages, i, ɯ and u vowels have dedicated characters for long and short sounds. But many composite vowels use 0EB0 (in open syllables) and 0EB1 (in closed syllables) as shorteners. The following provides one example of the general pattern.

ເ◌␣ເ◌ະ␣ເ◌ັ◌

This can be seen clearly by comparing the long and short vowels in vowel_mappings.

Composite vowels

ເກຶອ kuːə U+0EC0 VOWEL SIGN E + U+0E81 LETTER KO + U+0EB6 VOWEL SIGN Y + U+0EAD LETTER O

In the lists below, ◌ represents a consonant. Vowels used in closed syllables are indicated by a trailing ◌, or a trailing hyphen in the IPA transcription.

Lao has many vowel sounds that are represented by more than one code point. Composite vowels can involve up to 4 glyphs, and glyphs can surround the base consonant(s) on up to 3 sides.

ເຈັ້ຍ
A composite vowel comprised of a pre-base vowel letter, a vowel combining mark, and a post-base semivowel that is acting as part of the rhyme.

ເຈັ້ຍ

Some composite vowels represent plain vowel sounds:

ເ◌ະ␣ເ◌ັ␣ເ◌ິ␣ເ◌ີ␣ໂ◌ະ␣ແ◌ະ␣ແ◌ັ␣ເ◌າະ␣ັອ

The other composites represent diphthongs, which generally end in one of ə̯, i, or w.

In some cases, the spelling is straightforward because a semivowel simply follows a plain vowel.

ິວ␣ີວ␣ຽວ␣ເີ◌ຽ␣ເີ◌ຍ␣ແ◌ວ␣ອຍ␣ັຍ␣າຍ␣າຽ␣າວ

For others, the spelling doesn't closely follow the sound represented.

ັຽ␣ເ◌ັຍ␣ເ◌ຍ␣ເ◌ຶອ␣ເ◌ືອ␣ັວ␣ົວະ␣ົວ␣ວາ␣ເ◌ົາ

Observation: Simmala et al. list the composite vowel ເ-ັຍະ for -ia in (only) one location, but I have yet to identify words containing this sequence, and suspect that it may be a typographic error.

Characters that don't appear in the combinations:

◌ຸ␣◌ູ␣ໃ◌␣ໄ◌␣◌ໍ␣◌ຳ
Show which combinations contain a given character:
ເ◌ິ␣ ␣ິວ
ເ◌ີ␣ ␣ເ◌ີຽ␣ເ◌ີຍ␣ ␣ີວ
ເ◌ຶອ
ເ◌ືອ
ເ◌ະ␣ເ◌ັ␣ເ◌ິ␣ເ◌ີ␣ເ◌າະ␣ ␣ເ◌ີຽ␣ເ◌ີຍ␣ເ◌ົາ␣ ␣ເ◌ັຍ␣ເ◌ຍ␣ເ◌ັຍະ␣ເ◌ຶອ␣ເ◌ືອ
ແ◌ະ␣ແ◌ັ␣ ␣ແ◌ວ
ໂ◌ະ
ເ◌ະ␣ໂ◌ະ␣ແ◌ະ␣ເ◌າະ␣ ␣ເ◌ັຍະ␣ົວະ
ເ◌ັ␣ແ◌ັ␣ັອ␣ ␣ັຍ␣ ␣ັຽ␣ເ◌ັຍ␣ເ◌ັຍະ␣ັວ
ວາ␣ ␣ເ◌າະ␣ ␣າຍ␣າຽ␣າວ␣ເ◌ົາ
ົວະ␣ົວ␣␣ເ◌ົາ
ເ◌ຶອ␣ເ◌ືອ␣ັອ␣ ␣ອຍ
ຽວ␣ເ◌ີຽ␣າຽ␣ ␣ັຽ
ເ◌ີຍ␣ອຍ␣ັຍ␣າຍ␣ ␣ເ◌ັຍ␣ເ◌ຍ␣ເ◌ັຍະ
ັວ␣ົວະ␣ົວ␣ວາ␣␣ິວ␣ີວ␣ຽວ␣ແ◌ວ␣າວ
Show details about glyph positioning

The following list shows where vowel signs are positioned around a base consonant to produce vowels, and how many instances of that pattern there are. Numbers after + sign indicate multiple code points.

  • 5 pre-base, eg. ໂກ ōk̯ (ko)
  • 3 post-base, eg. ກາ k̯ā
  • 7 superscript, eg. ກິ k̯i
  • 2 subscript, eg. ກຸ k̯u
  • 1+6 sup+post-base, eg. ກັຍ k̯äɲ̱ kaj
  • +3 post+post-base, eg. ກາຍ k̯āɲ̱ kaːj
  • +5 pre+post-base, eg. ເກະ ēk̯a ke
  • +4 pre+superscript, eg. ເກິ ēk̯i
  • +1 super+post+post, eg. ກົວະ k̯ow̱a kuə
  • +1 pre+post+post, eg. ເກາະ ēk̯āa
  • +4 pre+sup+post-base, eg. ເກົາ ēk̯oā kaw
  • +2 pre+sup+post+post-base, eg. ເກັຍະ ēk̯äɲ̱a kia

At maximum, vowel components can occur concurrently on 3 sides of the base.

Distribution of vowel elements is as follows:

-ັ -ິ -ີ -ຶ -ື -ໍ -ົ -ຳ
ເ ແ ໂ ໃ ໄ ະ າ ອ ວ ຍ ຽ ຍ ຽ ະ ວ
  -ຸ -ູ    

Standalone vowels

Lao uses a silent ອ (0EAD) as a base (although it is often transcribed as ʔ), to which vowel signs are applied, eg.

ໂອ

ອຸ່ນ

ຊາວເອັດ

Lao has no independent vowel letters, but when 0EAD is used as a vowel in a closed syllable it indicates the vowel ɔː.

ຈອກ

Tones

The Unicode Lao block provides the following characters for indicating tone.

◌່␣◌້␣◌໊␣◌໋

The tone is expressed using the class of the initial consonant in a syllable, the structure of the syllable, and whether or not a tone mark is applied to override the default.

Tone marks should be typed and stored in memory immediately after the base consonant of the syllable, or after a superscript vowel sign if there is one. However, the tone mark should be typed before ຳ (0EB3), even though it will be displayed above the nikhahit.

ນ້ຳ
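A checker for this ordering might look something like the following Python sketch. The function name `tone_follows_am` is hypothetical, and the check covers only the single case described above (a tone mark stored after VOWEL SIGN AM).

```python
TONE_MARKS = {"\u0EC8", "\u0EC9", "\u0ECA", "\u0ECB"}
VOWEL_SIGN_AM = "\u0EB3"

def tone_follows_am(text):
    """Return the indexes of tone marks stored directly after VOWEL SIGN AM."""
    return [i for i in range(1, len(text))
            if text[i] in TONE_MARKS and text[i - 1] == VOWEL_SIGN_AM]

good = "\u0E99\u0EC9\u0EB3"  # consonant + tone mark + AM, as recommended
bad = "\u0E99\u0EB3\u0EC9"   # tone mark typed after AM

print(tone_follows_am(good), tone_follows_am(bad))
```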

The following chart shows how to tell which tones are associated with a syllable. 'Checked' means ending in the sound -p, -t, or -k or a short vowel.

Mark Checked? Short/long Consonant Tone
    high/mid low
low falling
    high/mid falling
low high
    mid high
    mid rising
none no   high rising
mid/low mid
yes short high/mid low
low high
long high/mid low
low falling

Vowel sounds to characters

This section maps Lao vowel sounds to common graphemes in the Lao orthography.

The ◌ indicates the location of a consonant relative to the vowel sign; if there are 2 of these, the vowel is used only in closed syllables.

Plain vowels

mixed

i

mixed

ɯː

mixed

ɯ

mixed

mixed

u

mixed

mixed ເ◌

e

final ເ◌ະ

 

medial ເ◌ັ◌

ɤː

mixed ເ◌ີ

ɤ

mixed ເ◌ິ

mixed ໂ◌

o

final ໂ◌ະ

medial ◌ົ◌

ɛː

mixed

ɛ

final ແ◌ະ

medial ແ◌ັ◌

ɔː

final ◌ໍ

medial ◌ອ◌

ɔ

final ເ◌າະ

medial ◌ັອ◌

mixed ◌າ

a

final ◌ະ

medial ◌ັ◌

Diphthongs, triphthongs, and rhymes

iːə̯

rhyme ເ◌ຍ

rhyme ເ◌ັຽ

medial ◌ຽ◌

iə̯

rhyme ເ◌ັຍ

medial ◌ັຽ◌

iːə̯w

rhyme ຽວ

iə̯w

rhyme ເ◌ັຽວ (check this)

iːw

rhyme ◌ີວ

iw

rhyme ◌ິວ

ia

rhyme ເ◌ັຍະ (check this)

ɯːə̯

rhyme ເ◌ືອ

ɯə̯

rhyme ເ◌ຶອ

uːə̯

rhyme ◌ົວ

medial ◌ວ◌

medial ◌ວາ◌

uə̯

rhyme ◌ົວະ

medial ◌ັວ◌

uːə̯j

rhyme ◌ວາຍ

ɤːj

rhyme ເ◌ີຍ

rhyme ເ◌ີຽ (check this)

oːj

rhyme ໂ◌ຍ

ɛːw

rhyme ແ◌ວ

ɔːj

rhyme ◌ອຍ

aːj

rhyme ◌າຍ

rhyme ◌າຽ

aj

rhyme ໃ◌

rhyme ໄ◌

rhyme ◌ັຍ

aːw

rhyme ◌າວ

aw

rhyme ເ◌ົາ

am

rhyme ◌ຳ

Consonants

Consonant summary table

The following table summarises the main consonant to character assignments.

The initial consonants are split across high, mid, and low columns.

  high class mid class low class
Onsets
ຜ␣ຖ␣ຂ
ປ␣ບ␣ຕ␣ດ␣ຈ␣ກ␣ອ
ພ␣ທ␣ຄ
ຝ␣ສ␣ຫ
 
ວ␣ຟ␣ຊ␣ຮ
ໝ␣ຫມ␣ໜ␣ຫນ␣ຫງ␣ຫຍ
 
ມ␣ນ␣ຍ␣ງ
ຫວ␣ຫຼ␣ຫລ
ວ␣ຣ␣ລ
Finals
ບ␣ດ␣ກ
ມ␣ນ␣ຣ␣ງ
ວ␣ຢ␣ຍ

For additional details see consonant_mappings.

Consonant letters

Each of the basic consonants is associated with one of 3 classes (high, mid, and low), that play a part in indicating the tone of the syllable (see tones). In only a few cases, though, does this lead to more than one letter for a given consonant.

The pronunciation of a letter often differs when the consonant is the onset or coda of a syllable.

The panel below lists the basic consonant letters for Lao, with typical pronunciations for syllable onset and final positions.

In the following lists, the class of each consonant is shown just below the IPA data. If a dash appears after the IPA transcription, it indicates the pronunciation in syllable-initial position; a dash before it indicates the pronunciation for syllable codas.

high

ຜ␣ຖ␣ຂ␣ຝ␣ສ␣ຫ

mid

ປ␣ບ␣ຕ␣ດ␣ຈ␣ກ␣ອ␣ຢ

low

ພ␣ທ␣ຄ␣ຟ␣ຊ␣ຮ␣ມ␣ນ␣ຍ␣ງ␣ວ␣ລ
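For software that needs to reason about tone, the three lists above can be turned into a simple lookup. The Python below is a minimal sketch, not a tone calculator: the names `CLASSES`, `CLASS_OF`, and `consonant_class` are invented for this example, and it handles single basic letters only.

```python
# The three consonant classes, copied from the lists above.
CLASSES = {
    "high": "ຜຖຂຝສຫ",
    "mid": "ປບຕດຈກອຢ",
    "low": "ພທຄຟຊຮມນຍງວລ",
}

# Invert the table so that each letter maps to its class name.
CLASS_OF = {letter: name
            for name, letters in CLASSES.items()
            for letter in letters}

def consonant_class(letter):
    """Return 'high', 'mid' or 'low', or None for anything else."""
    return CLASS_OF.get(letter)

print(consonant_class("ກ"), consonant_class("ຫ"), consonant_class("ມ"))
```

Working out the actual tone would additionally need the syllable type and any tone mark, as described in the tones section.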

High class nasals & liquids with HO

Gaps in the high class range can be filled by using a silent ຫ (0EAB) before the onset characters listed just below to make their default tonal class high. Note that this doesn't set a high tone; it just changes the class of the consonant, to which tone rules are then applied.

ຫມ␣ຫນ␣ຫຍ␣ຫງ␣ຫວ␣ຫລ␣ຫຽ
ຫວ່າງ
A HA prefix converts the class of the onset consonant to high.

ຫວ່າງ

In modern texts, 3 of these combinations are typically represented by one of the following alternate forms.

ໝ␣ໜ␣ຫຼ

Two combinations can be represented as ligatures, for which there are separate characters in Unicode: 0EDD and 0EDCd,462, eg.

ໝາ

ໜອນ

A third can be represented by 0EAB 0EBCu,378, eg.

ຫຼາຍ

Wiktionary lists most of the 2-letter spellings as dated where one of the 3 alternative forms above could be used instead.
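Because both spellings may be found in text, an application that searches or compares Lao strings may want to treat them alike. The following Python sketch folds the three 2-letter spellings into the alternate forms described in this section. The names `MODERN_FORMS` and `fold_ho_digraphs` are hypothetical, and the plain string replacement assumes that the first letter followed directly by one of these consonants is always the class-changing prefix, which has not been verified against real data.

```python
# Dated 2-letter spelling -> alternate form described in this section.
MODERN_FORMS = {
    "\u0EAB\u0EA1": "\u0EDD",        # ຫມ -> ໝ
    "\u0EAB\u0E99": "\u0EDC",        # ຫນ -> ໜ
    "\u0EAB\u0EA5": "\u0EAB\u0EBC",  # ຫລ -> ຫຼ
}

def fold_ho_digraphs(text):
    """Replace each dated digraph with its alternate form (naive replace)."""
    for dated, modern in MODERN_FORMS.items():
        text = text.replace(dated, modern)
    return text

print(fold_ho_digraphs("\u0EAB\u0EA1\u0EB2"))  # ຫມາ becomes ໝາ
```

Folding of this kind is best applied to a copy used for matching, leaving the author's original spelling untouched.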

Letter O

0EAD represents a glottal stop or is silent when used as a base for vowels at the beginning of a syllable (see standalone).

ໂອ

When it appears after a base consonant in a closed syllable it becomes the vowel ɔː (see otherV).

ຈອກ

It is also used in combination with other characters to produce additional vowel sounds (see compositeV).

The r sound

One more letter, ຣ (0EA3), was officially removed from the alphabet by the Ministry of Education, but it is still used occasionally to transliterate Indic or other foreign words into Lao, eg. ຝຣັ່ງ flaŋ foreigner. It is generally used to represent the letter 'r'; the sound r no longer exists in Lao.

Onsets

Modern Lao really only has one audible, syllable-initial cluster, and that occurs when ວ (0EA7) labialises one of just over half a dozen initial consonants (see structure). In those cases, the initial consonant is simply followed by ວ. Note that the tone mark goes over the second character in the digraph in the following example.

ກວ້າງ

Lao does, however, have clusters of written consonants at the syllable onset where ຫ (0EAB) is used to change the class of a following consonant. This includes the use of Lao's one subjoined consonant mark in the combination ຫຼ. This is described in highclass.

ຫຍ້າ

Finals

With one exception, Lao doesn't have any special code points dedicated solely to syllable final consonants, although consonants do appear in those positions, eg. ນົກ

This is true whether or not the syllable coda is followed by another syllable.

However, only the following consonants appear in syllable-final position. Note how the sound may change, compared to when the same letter is used in syllable-initial position: where the onset sound differs from the coda the onset is shown below.

ບ␣ດ␣ກ␣ມ␣ນ␣ຣ␣ງ␣ວ␣ຍ

Because Lao requires vowels to be written, there is not the ambiguity about syllable boundaries that one finds in Thai (caused by ambiguity about whether a consonant is syllable-final or a syllable in its own right).

The exception is that the rhyme -am may sometimes be represented by 0EB3.

ນຳ

Consonant clusters

See onsets for the few consonant clusters that occur at the beginning of a syllable.

Otherwise, consonant letter clusters only occur where a syllable ends with a consonant and another syllable begins.

Because the orthography is alphabetic, rather than an abugida, vowel absence after syllable-final consonants does not normally need to be marked in any way. The absence of a vowel sound is simply indicated by the absence of a vowel sign. The following example has 2 instances of a syllable-final consonant followed by an onset, one word-internal, and the other between words.

ອັກສອນລາວ

In a consonant cluster any tone marks or superscript vowels appear over the second consonant.

0EBA is used as a virama when writing Pali. It is not used in modern Lao.

0ECC was previously used to indicate silenced consonants, but is now described as obsolete.wl,#Punctuation

Consonant length

The Lao orthography has no special features for dealing with geminated or long consonant sounds.

Consonant sounds to characters

This section maps Lao consonant sounds to common graphemes in the Lao orthography.

Light coloured characters occur infrequently.

p

mid class

 

coda

high class

 

low class

b

mid class

t

mid class

 

coda

high class

 

low class

t͡ɕ

mid class

d

mid class

k

mid class syllable initial & final

 

coda

high class syllable-initial only

 

low class syllable-initial only

ʔ

mid class vowel carrier

f

high class

 

low class

s

high class

 

low class

x

high class syllable-initial only

 

low class syllable-initial only

h

high class

 

low class

m

low class

 

atomic high class digraph

 

high class digraph ຫມ

 

coda

n

low class

 

atomic high class digraph

 

high class digraph ຫນ

 

coda

 

coda Used for non-native sounds in loan words.

ɲ

low class

 

high class digraph ຫຍ

ŋ

low class syllable initial & final

 

high class digraph ຫງ

 

coda

ʋ

low class

 

high class digraph ຫວ

w

low class

 

high class digraph ຫວ

 

coda

l

low class

 

high class digraph ຫຼ

 

high class digraph ຫລ

 

high class subjoined

 

low class Used for non-native sounds in loan words.

la

et cetera logograph ໆລໆ

j

mid class

coda

Other features

Pali characters

Unicode 12 added 14 consonant letters and 1 combining mark for writing Pali.

ຆ␣ຉ␣ຌ␣ຎ␣ຏ␣ຐ␣ຑ␣ຒ␣ຓ␣ຘ␣ຠ␣ຨ␣ຩ␣ຬ␣຺

Encoding choices

This section offers advice about characters or character sequences to avoid, and what to use instead. It takes into account the relevance of Unicode Normalisation Form D (NFD) and Unicode Normalisation Form C (NFC).

Although usage is recommended here, content authors may well be unaware of such recommendations. Therefore, applications should look out for the non-recommended approach and treat it the same as the recommended approach wherever possible.

VOWEL SIGN EI

In complex scripts, visually similar or identical glyph patterns can often be made from a sequence of code points rather than the single code point that Unicode provides. These are not made the same by normalisation, and they are not semantically equivalent. These inappropriate sequences should be avoided because they will cause the meaning of the text to change; searches, matching and other aspects of the text will fail to be understood by the application or the font.

Only one such case is listed in the table below. The single code point on the left should be used, and not the sequence on the right. In some cases, fonts will indicate that there is a problem by forcing the appearance of a dotted circle or otherwise failing to render the text correctly, but this may not always be the case.

Use Do not use
0EC1 0EC0 0EC0

VOWEL SIGN AM

The combination of nikhahit and sara aa is normally written with the precomposed character in the Lao block. It is possible to use 2 code points to create something that may visually look identical (and is in fact used during justification), but the single character and the sequence are not converted to each other during normalisation; therefore, the text will be read as different by normalisation-based matching algorithms.

Recommended Not recommended
0EB3 0ECD 0EB2
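Since NFC does not unify these spellings, an application that wants to match text regardless of which one the author typed has to fold them itself. The Python below is one possible sketch of that, covering the two cases in the tables above; the names `RECOMMENDED` and `fold_lookalikes` are invented for the example.

```python
import unicodedata

# Non-recommended sequence -> recommended single code point.
RECOMMENDED = {
    "\u0EC0\u0EC0": "\u0EC1",  # 0EC0 0EC0 -> 0EC1
    "\u0ECD\u0EB2": "\u0EB3",  # 0ECD 0EB2 -> 0EB3
}

def fold_lookalikes(text):
    """Apply NFC, then replace the look-alike sequences for matching."""
    text = unicodedata.normalize("NFC", text)
    for sequence, single in RECOMMENDED.items():
        text = text.replace(sequence, single)
    return text

# NFC on its own leaves the 2-code-point spelling as it is.
print(unicodedata.normalize("NFC", "\u0ECD\u0EB2") == "\u0EB3")  # False
print(fold_lookalikes("\u0ECD\u0EB2") == "\u0EB3")               # True
```

Because the 2-code-point sequence may be produced deliberately during justification, it is probably safer to fold a copy used for searching or comparison than to rewrite stored text.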

Code point sequences

Tone marks should be typed and stored after any combining vowel mark. Fonts will typically indicate visually that the order is incorrect because the tone mark will appear below the vowel mark if they are the wrong way around.
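An application could repair the wrong order before matching or rendering. The following Python sketch swaps a tone mark that was stored before a combining vowel mark; `MISORDERED` and `fix_tone_order` are names made up for this example, and the character class lists the 9 combining vowel marks shown in the character index.

```python
import re

# A tone mark (0EC8..0ECB) stored before a combining vowel mark.
MISORDERED = re.compile("([\u0EC8-\u0ECB])([\u0EB1\u0EB4-\u0EB9\u0EBB\u0ECD])")

def fix_tone_order(text):
    """Move each misplaced tone mark to after the combining vowel mark."""
    return MISORDERED.sub(r"\2\1", text)

wrong = "\u0E81\u0EC8\u0EB4"  # consonant + tone mark + vowel mark
print(fix_tone_order(wrong) == "\u0E81\u0EB4\u0EC8")  # True
```

The spacing vowel sign 0EB3 is deliberately left out of the pattern, because the tone mark is expected to precede it.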

Lao is visually encoded so pre-base glyphs are associated with ordinary spacing characters, and these need to be typed and stored in visual order relative to the base consonant(s) in a syllable. If the syllable begins with a consonant cluster such as pr, the pre-base code points must be typed before the p, even though they are pronounced after the r.

Numbers, dates, currency, etc.

Digits

Lao uses Western digits.

There is, however, a set of Lao digits.

໐␣໑␣໒␣໓␣໔␣໕␣໖␣໗␣໘␣໙
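Converting between the two digit sets is a one-to-one character mapping. A minimal Python sketch follows; the names `to_lao_digits` and `to_western_digits` are hypothetical.

```python
LAO_DIGITS = "\u0ED0\u0ED1\u0ED2\u0ED3\u0ED4\u0ED5\u0ED6\u0ED7\u0ED8\u0ED9"
TO_LAO = str.maketrans("0123456789", LAO_DIGITS)
TO_WESTERN = str.maketrans(LAO_DIGITS, "0123456789")

def to_lao_digits(text):
    """Replace Western digits with Lao digits, leaving other text alone."""
    return text.translate(TO_LAO)

def to_western_digits(text):
    """Replace Lao digits with Western digits, leaving other text alone."""
    return text.translate(TO_WESTERN)

print(to_lao_digits("2024"))
```

Python's built-in `int()` also accepts Lao digits directly, so parsing a number does not require converting it first.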

Observation: Pending further clarification about how widespread the use of Lao digits is, note that Lao Wikipedia uses Lao digits for table of contents list numbering and for footnote references. See the relevant sections below.

The CLDR standard-decimal pattern is #,##0.###. The standard-percent pattern is #,##0%.cldr

Observation: Lao Wikipedia uses a French pattern, #.##0,###, eg. ມີກຳລັງຕິດຕັ້ງ 7.207,24 ເມກາວັດ There are 7,207.24 megawatts installed
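The difference between the two patterns can be sketched in Python as follows. This is only an approximation of the #,##0.### pattern using ordinary string formatting, not a CLDR implementation, and `format_decimal` is a name invented for the example.

```python
def format_decimal(value, french_style=False):
    """Group thousands and keep at most 3 fraction digits (#,##0.###)."""
    text = f"{value:,.3f}"                 # always contains a decimal point
    text = text.rstrip("0").rstrip(".")    # drop optional fraction digits
    if french_style:
        # Swap the separators to get the #.##0,### style.
        text = text.translate(str.maketrans(",.", ".,"))
    return text

print(format_decimal(7207.24))                     # 7,207.24
print(format_decimal(7207.24, french_style=True))  # 7.207,24
```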

Currency

The CLDR standard format for currency is ¤#,##0.00;¤-#,##0.00, and the symbol for the Lao currency, Kip, is ₭.cldr

Text direction

Lao text runs left to right in horizontal lines.


Glyph shaping & positioning

Experiment with examples using the Lao character app.

Context-based shaping & positioning

Pre-base vowels are visually ordered, and therefore do not need to be repositioned by the font.

Vowel signs, tones, and one consonant, on the other hand, are combining characters that need to be correctly positioned relative to the base character, and multiple marks can be combined with a single base character.

When using the vowel sign AM with a tone mark the small circle needs to push the tone mark upwards, even though the tone mark occurs before the vowel sign in memory (see fig_tone_am).

ກ່ຳ
The small circle of the vowel sign AM appears below the tone mark, even though the tone mark precedes the vowel sign in memory.

Letterform slopes, weights, & italics

Observation: Italicised text is used for figure captions, and also for quotations.

Typographic units

Word boundaries

Words are not separated by spaces; nonetheless, double-clicking or other selection methods are expected to identify word boundaries. There are 2 alternative approaches for managing this.

  1. An application uses a dictionary or smart algorithm to parse the text and determine word boundaries.
  2. The author uses 200B (ZWSP) between words when creating the content.
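The second approach can be sketched in a few lines of Python. The helper names `mark_words` and `split_words` are hypothetical; the sketch assumes the word list has already been produced by the author or by a dictionary-based segmenter.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE

def mark_words(words):
    """Join words with ZWSP so that software can find the boundaries."""
    return ZWSP.join(words)

def split_words(text):
    """Recover the words from text that contains ZWSP separators."""
    return [word for word in text.split(ZWSP) if word]

marked = mark_words(["ອັກສອນ", "ລາວ"])
print(split_words(marked))
```

The separators are invisible when displayed, so the marked text looks the same as unmarked text.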

Unlike Thai or Khmer, it is fairly straightforward to parse individual syllables in Lao, because its alphabetic nature makes it possible to identify syllable-final consonants. Note that syllable-based segmentation must identify and keep together any syllable-initial clusters involving h or l. For example, the initial 2 letters in ຫມາ should wrap as a unit, just like the ligated form ໝາ.

What about kw etc?

While nearly all syllables can be argued to be words in their own right, there is still a preference for keeping multi-syllabic words together during word-based segmentation, eg. ປະເທດ. For this, an application needs to use a dictionary to parse Lao text.

However, widely used software automatically inserts 200B in Lao text at word or syllable boundaries, and many web pages use such inserted ZWSP characters to get browsers to wrap correctly.g3,#issuecomment-385847864

If a dictionary fails to keep two or more syllables together as needed, it should be possible to use the Unicode character 2060 between the two syllables. This is an invisible character, equivalent to a zero-width no-break space, and used to prevent line-breaks.
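As a small illustration, the Python below glues the syllables of a word together with 2060, and strips both kinds of invisible character again for comparison. The function names `glue` and `strip_invisible` are invented for this sketch, and whether a given application honours 2060 when breaking lines has not been tested here.

```python
WORD_JOINER = "\u2060"  # prevents a line break at this position
ZWSP = "\u200B"         # allows a line break at this position

def glue(syllables):
    """Join the syllables of one word so that it is not broken apart."""
    return WORD_JOINER.join(syllables)

def strip_invisible(text):
    """Remove ZWSP and WORD JOINER, eg. before searching or comparing."""
    return text.replace(WORD_JOINER, "").replace(ZWSP, "")

word = glue(["\u0E9B\u0EB0", "\u0EC0\u0E97\u0E94"])  # the two syllables of ປະເທດ
print(len(word), strip_invisible(word))
```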

If dictionaries are used for segmentation, they should be selected based on the language, not the script. (See the list of languages using the Lao script.)

Graphemes

Grapheme clusters


Unicode grapheme clusters divide text into segments that contain a single base consonant plus any following combining characters: the latter include the 9 combining vowel signs, and all tone marks. Not included are free-standing vowel signs and consonants that make up other parts of a composite vowel, both pre-base and post-base. Also, syllable-initial consonant clusters with -ວ U+0EA7 LETTER WO and ຫ- U+0EAB LETTER HO SUNG are treated as 2 text units, but not ຫຼ.

This implies that a pre-base vowel sign such as ເ- U+0EC0 VOWEL SIGN E would be treated as a separate item from what follows, and in fact this can be seen in fig_drop_caps_2, where that character is the only thing highlighted in an initial letter selection. (On the other hand, initial letters followed by combining characters select the whole sequence, as seen in fig_drop_caps.)

This means that Lao typography is different from some other SE Asian scripts where pre-base vowel signs are selected with the base because they are combining characters, or syllable-initial consonant clusters form a unit because the 'medial' consonants are represented by combining characters.
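
The behaviour described in this section can be approximated with a short Python sketch. It is not an implementation of the Unicode segmentation rules: it assumes that attaching every nonspacing mark (general category Mn) to the preceding character is a good enough stand-in, and it ignores the other rules in the Unicode specification. `simple_clusters` is a hypothetical name.

```python
import unicodedata

def simple_clusters(text):
    """Group each character with any nonspacing marks (Mn) that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) == "Mn":
            clusters[-1] += ch  # combining vowel sign, tone mark, etc.
        else:
            clusters.append(ch)  # consonant or free-standing vowel sign
    return clusters

# Pre-base vowel sign E, then a consonant with two combining marks, then a consonant.
print(simple_clusters("\u0EC0\u0EA1\u0EB7\u0EC8\u0EAD"))
```

In this approximation the pre-base vowel sign stands alone, while the consonant keeps its combining vowel sign and tone mark.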

Punctuation & inline features

Phrase & section boundaries

,␣;␣:␣.␣?␣!

Lao uses ASCII punctuation, but also uses space as punctuation.

phrase      U+0020 (space)   ,   ;   :
sentence    U+0020 (space)   .   ?   !

Spaces are used, but represent phrase or sentence boundaries.

Numbers are also normally surrounded by spaces.

In principle, periods are not used, though this appears to be changing. [wl, #Punctuation]

Observation: Lao Wikipedia uses periods at the end of sentences, and commas (see an example). [l] An online news site also consistently uses periods to end sentences.

Western punctuation is also used. Contemporary writing may include punctuation marks borrowed from French, such as the exclamation mark (!) and question mark (?). However, questions can be determined by question words within a sentence. [wl, #Punctuation]

Hyphens are also commonly found in modern writing. [wl, #Punctuation]

Bracketed text

(␣)

Lao commonly uses ASCII parentheses to insert parenthetical information into text.

            start   end
standard    (       )

( and ) are used for parentheses in contemporary writing. [wl, #Punctuation]

Quotations & citations

“␣”␣«␣»␣‘␣’

Lao texts use quotation marks around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.

            start   end
initial     «       »
nested

The default quote marks for Lao are “ at the start, and ” at the end. [cldr]

When an additional quote is embedded within the first, the quote marks are ‘ and ’. [cldr]

Contemporary writing may also include « and » for quotation marks, borrowed from French. [wl, #Punctuation]

Abbreviation, ellipsis & repetition

ຯ␣…␣ໆ

Ellipsis & abbreviation

ຯ U+0EAF ELLIPSIS is used to indicate ellipsis or abbreviation, as well as missing words. [wl, #Punctuation]

The ellipsis, …, is also commonly found in modern writing. [wl, #Punctuation]

Observation: Lao Wikipedia uses periods after date-related abbreviations, [l] eg. in ຄ.ສ. 1935 ສະບັບຄົ້ນ ḵʰ.s. 1935 sab̯äb̯ḵʰo²ṉ (CE 1935 Edition). Periods are also used in the abbreviated name of the country, eg. ສ.ປ.ປ.ລາວ s.p̯.p̯.ḻāw̱ (Lao PDR).

The sequence ໆລໆ kʰɯaŋ-mǎːj-lɛ-ɯːn-ɯːn (ເຄຶ່ອງໝາຍ ແລະອຶ່ນໆ) is used with a meaning similar to etc. For example, ການສື່ສານ,ສື່ມວນຊົນ,ສື່ໂຄສະນາ...ໆລໆ (Communication, media, advertising ... etc).

Some sources use ຯລຯ and others ໆລໆ – check this out.

Repetition

ໆ U+0EC6 KO LA is used to indicate repetition of a preceding sound.

Other inline features

Other punctuation

CLDR includes the following punctuation.

‐␣‑␣–␣—␣†␣‡␣′␣″

Line & paragraph layout

Line-breaking & hyphenation

Although Lao doesn't use spaces or dividers between words, the expectation is that line-breaks occur at word boundaries.

See Word boundaries for a discussion of issues related to word-based segmentation.

Line-edge rules

As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.


The following list gives examples of typical behaviours for some of the characters used in modern Lao. Context may affect the behaviour of some of these and other characters.


  • “ ‘ ( «   should not be the last character on a line.
  • ” ’ ) » . , ; ! ? %   should not begin a new line.
  •   should be kept with any number, even if separated by a space or parenthesis.

Line breaking should not move a danda or double danda to the beginning of a new line even if they are preceded by a space character.
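
The first two rules in the list above can be sketched in Python as a filter over candidate break points. This is a deliberate simplification and not the Unicode line-breaking algorithm; the set names and `break_allowed` are hypothetical, and the character sets simply copy the two lists above.

```python
# Characters that should not end a line (opening punctuation).
NO_LINE_END = set("“‘(«")
# Characters that should not begin a line (closing punctuation and similar).
NO_LINE_START = set("”’)».,;!?%")

def break_allowed(before, after):
    """Return True if a line may break between the characters `before` and `after`."""
    return before not in NO_LINE_END and after not in NO_LINE_START
```

An application would apply such a check only at positions that word-based segmentation has already identified as possible boundaries.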

Text alignment & justification

Since spaces aren't used to separate words, Lao has to use alternative strategies for justification of text.

Baselines, line height, etc.

Lao uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Lao places vowel and tone marks above base characters, one above the other, and can also add combining characters below the line. The complexity of these marks means that the vertical resolution needed for clearly readable Lao text is higher than for English, or most Latin text. In addition, Lao also tends to add more interline spacing than Latin text does.

To give an approximate idea, fig_baselines compares Latin and Lao glyphs from Noto fonts. The basic height of Lao letters is typically slightly above the Latin x-height. However, extenders and combining marks reach well beyond the Latin ascenders and descenders, creating a need for larger line spacing.

Hhqxใฏูกิ้ปีฬุฬึ์๕๙ Hhqxใฏูกิ้ปีฬุฬึ์๕๙
Font metrics for Latin text compared with Lao glyphs in the Noto Serif Lao (top) and Noto Sans Lao (bottom) fonts.

fig_baselines_other shows similar comparisons for the Lao MN and DokChampa fonts.

Hhqxใฏูกิ้ปีฬุฬึ์๕๙ Hhqxใฏูกิ้ปีฬุฬึ์๕๙
Latin font metrics compared with Lao glyphs in the Lao MN (top) and DokChampa (bottom) fonts.

Counters, lists, etc.

You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.

The modern Lao orthography uses a numeric style.

Numeric

The lao numeric style is decimal-based and uses these digits. [rmcs]

໐␣໑␣໒␣໓␣໔␣໕␣໖␣໗␣໘␣໙

Examples:

໑␣໒␣໓␣໔␣໑໑␣໒໒␣໓໓␣໔໔␣໑໑໑␣໒໒໒␣໓໓໓␣໔໔໔
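
Because the style is decimal-based, a counter value can be produced by mapping each ASCII digit to the corresponding Lao digit. The Python sketch below illustrates this; `to_lao_numeric` is a hypothetical name, and the sketch assumes non-negative integers only.

```python
LAO_ZERO = 0x0ED0  # U+0ED0 LAO DIGIT ZERO; the ten digits are contiguous

def to_lao_numeric(n):
    """Render a non-negative integer with Lao digits."""
    if n < 0:
        raise ValueError("expects a non-negative integer")
    return "".join(chr(LAO_ZERO + int(d)) for d in str(n))

print(to_lao_numeric(11), to_lao_numeric(444))
```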

Prefixes and suffixes

Observation: Lao Wikipedia uses periods for suffixes. [l]

Lao digits being used for section numbering in Wikipedia.

Styling initials

It is possible to find the first letter in a paragraph styled so that it is larger and sits alongside several lines of the continuing paragraph text.

Observation: All combining characters, including spacing ones, are included in the selections shown in fig_drop_caps.

Any punctuation such as opening quotes and opening parentheses should also be included in the initial styling. ?


Two example paragraphs showing dropped highlighted initials with combining characters.

Observation: In the figures shown, the alphabetic baseline of the highlighted letter(s) matches the bottom of the row that determines the size of the highlighted letter(s). Selections without diacritics above are somewhat shorter than the height of the lines alongside, whereas selections with multiple diacritics rise slightly higher than the first line of text.

Observation: In fig_drop_caps_2, the selection picks out only ຫ from the digraph ຫລ, and only ເ from the syllable ເມຶ່ອ.


More example paragraphs, showing dropped highlighted initials that are part of a larger construct.

Page & book layout

Notes, footnotes, etc

Observation: Lao Wikipedia uses Lao digits for footnote references.

Footnote references in Wikipedia using Lao digits.

References