Lao (draft)

Updated 10 January, 2023

This page brings together basic information about the Lao script and its use for the Lao language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Lao using Unicode.

Sample


ມາດຕາ 1: ມະນຸດເກີດມາມີສິດເສລີພາບ ແລະ ສະເໝີໜ້າກັນໃນທາງກຽດຕິສັກ ແລະ ທາງສິດດ້ວຍມະນຸດມີສະຕິສຳປັດຊັນຍະ(ຮູ້ດີຮູ້ຊົ່ວ)ແລະມີມະໂນທຳຈື່ງຕ້ອງປະພຶດຕົນຕໍ່ກັນໃນທາງພີ່ນ້ອງ.

ມາດຕາ 2: ຂໍ້ 1.ຄົນຜູ້ໃດກໍ່ອ້າງຕົນໄດ້ວ່າ:ມີສິດ ແລະ ເສລີພາບທຸກຢ່າງທີ່ໄດ້ປ່າວຮ້ອງຢູ່ໃນປະກາດສະບັບນີ້ໂດຍບໍ່ເລືອກໜ້າ ບໍ່ຈຳກັດເຊື້ອຊາດ,ຜິວເນື້ອ,ເພດ,ສາສະໜາ ຄວາມຄິດເຫັນໃນດ້ານການເມືອງ ຫຼື ອື່ນໆ ກຳເນີດແຫ່ງຊາດຫຼື ສັງຄົມຖານະການມີຊັບສົມບັດມາກ ຫຼື ນ້ອຍ,ມີຕະກຸນ ຫຼື ຖານະອື່ນໆ. ຂໍ້ 2.ອີກປະການໜື່ງ ຈະບໍ່ຈຳກັດຢ່າງໃດໃນການແຕກຕ່າງກັນອັນເນື່ອງມາຈາກລະບຽບການເມືອງການປົກຄອງ ຫຼື ລະຫວ່າງຊາດຂອງປະເທດ ຫຼື ດິນແດນ ຊື່ງບຸກຄົນຜູ້ໃດຜູ້ໜື່ງສັງກັດຢູ່;ດິນແດນນັ້ນຈຳເປັນເອກະລາດຢູ່ໃນຄວາມອາລັກຂາຂອງມະຫາອຳນາດ ຫຼື ບໍ່ມີອິດສະຫຼະ ຫຼື ຖືກລົດອະທິປະໄຕລົງໂດຍຈຳກັດກໍ່ຕາມ.

Usage & history

The Lao script is used for writing the Lao language, and is also the official script of a number of minority languages in Laos. There is a considerable Lao-speaking population in Thailand who write their language with the Thai script.

ອັກສອນລາວ ʔáksɔ̌ːn láːw

The Lao alphabet was adapted from the Khmer script, and is a sister system to the Thai script, with which it shares many similarities and roots. However, Lao has fewer characters and is formed in a more curvilinear fashion than Thai. Further distancing from the Thai script occurred via a number of reforms. In 1975, the latest spelling reform simplified and standardised the script.

Sources: Scriptsource, Wikipedia.

Orthographic development & variants

The script was originally an abugida, but since the script reforms leading up to 1960 it has been alphabetic. When the communist Pathet Lao overthrew the Lao government in 1975, they implemented a final spelling reform which simplified and standardized the script.

Observation. I'm not aware of any variant spellings or confusable characters.

Basic features

Lao is an alphabet. This means that both consonants and vowels are indicated. See the table to the right for a brief overview of features for the modern Lao orthography.

Lao text runs left to right in horizontal lines.

Spaces separate phrases, rather than words.

There are consonants. ❯ consonants

Each onset consonant is associated with a high, mid, or low class related to tone. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark.

No conjuncts are used for consonant clusters, except for one subjoined consonant, used in combination only with HA. ❯ clusters

Syllable-initial clusters and syllable-final consonant sounds are all written with ordinary consonant letters. However, because all vowels are written, it is not difficult to algorithmically detect syllable boundaries. ❯ finals

Unlike its close relative, Thai, the Lao orthography has no inherent vowel, but still represents vowels using 18 vowel signs (including 5 pre-base), and 2 consonants. Only vowel signs that appear above or below the consonant are combining marks; the others are ordinary characters that are typed in the order seen. Vowels are often written differently when they appear in a closed vs. open syllable. ❯ vowels

There are no independent vowels, and standalone vowel sounds are written using vowel signs applied to [U+0EAD LAO LETTER O]. ❯ standalone

This page lists 27 composite vowels (made from 12 vowel signs, and 2 consonants). Composite vowels can involve up to 4 glyphs, and glyphs can surround the base consonant(s) on up to 3 sides. ❯ composite_vowels

Character index

Letters


Basic consonants

ຜ␣ຖ␣ຂ␣ປ␣ບ␣ຕ␣ດ␣ຈ␣ກ␣ອ␣ພ␣ທ␣ຄ␣ຝ␣ສ␣ຫ␣ຟ␣ຊ␣ຮ␣ໝ␣ໜ␣ມ␣ນ␣ຍ␣ງ␣ຢ␣ລ␣ວ␣ໝ␣ໜ

Extended consonants

Vowels

ເ␣ແ␣ໂ␣ໄ␣ໃ␣ະ␣າ␣ຽ␣ຳ

Other

ຯ␣ໆ

Combining marks


Vowels

ິ␣ີ␣ຶ␣ື␣ຸ␣ູ␣ົ␣ໍ␣ັ

Tones

່␣້␣໊␣໋

Other

Not used for Lao

Numbers

໐␣໑␣໒␣໓␣໔␣໕␣໖␣໗␣໘␣໙

Punctuation

‘␣’␣“␣”␣«␣»␣…

ASCII

,␣-␣.␣:␣;␣?␣!␣(␣)

CLDR additions

‑␣–␣—␣†␣‡␣′␣″

Symbols


Other

​␣⁠

Phonology


Phones in a lighter colour are non-native or allophones.

Vowel sounds

Plain vowels

i i ɯ ɯː u u e e ɤ ɤː o o ɛ ɛː ɔ ɔː a a

Diphthongs

iə̯ iːə̯ iw iːw ɯə̯ ɯːə̯ uə̯ uːə̯ ɤːj ɛw ɛːw aj aːj aw

Consonant sounds

Initials

labial dental alveolar post-alveolar palatal velar glottal
stops p b t d       k ʔ
aspirated        
affricates       t͡ɕ      
fricatives f   s     x h
nasals m   n   ɲ ŋ
approximants ʋ/w   l   j  
trills/flaps     r  

Finals

labial dental alveolar post-alveolar palatal velar glottal
stop p t       k ʔ
nasal m   n     ŋ
approximant w       j  

Tone values

Lao has 6 phonological tones in unchecked syllables, and 4 in checked syllables.

Name Vowel Final Unchecked? Checked?
Rising ˨˦ or ˨˩˦ ě  
High ˦ é
High falling ˥˧ ê  
Mid ˧ ē
Low ˩ è
Low falling ˧˩ e᷆

The tone depends on the class of the initial consonant in a syllable, the structure of the syllable, and whether or not a tone mark is applied to override the default.

Tone values vary depending on location in Laos. There is some disagreement about whether there are 5 or 6 tones in Vientiane, and the tables below show that different sources disagree on the tones produced. According to some, most dialects of Lao and Isan have six tones, those of Luang Prabang have five.wl

More details about tone values

The following tables present different descriptions of tone values in Lao for the Vientiane dialect. The first and third tables basically agree on the tone value, although the names of tones vary. The middle table shows some different tone values altogether. See a list of studies for Vientiane tones.

This diagram shows 5 tones with names corresponding to a mixture of the first two tables below.

Diagrams of tone vectors.

Tone marks are normally used only on open syllables, and modify the default tone value. Two of the four tone marks are only used with Class 1 consonants. Tone marks tend to be placed directly over the consonant (or superscript vowel), unlike Thai which tends to place them slightly to the right.

Open or live syllables are those that end with a long vowel or sonorant (eg. ງນມຍວ). Closed or dead syllables end with a stop consonant (eg. ກດບ) or short vowel.

  | Open | Closed, short vowel | Closed, long vowel | Tone mai eːk | Tone mai toː | Tone mai tiː | Tone mai cat-ta-waː
Class 1 | low | ˊ high | ˆ low falling | ˉ mid | ˋ high falling | ˋ high falling | ˇ low rising
Class 2 | ˇ low rising | ˊ high | ˆ low falling | ˉ mid | ˆ low falling | - | -
Class 3 | ˊ high | ˉ mid | ˋ high falling | ˉ mid | ˋ high falling | - | -

Refs: Daniels
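The table above (after Daniels) can be treated as a simple lookup from consonant class and syllable context to a tone name. The following is a minimal sketch of that idea. The dictionary keys (`open`, `closed_short`, `mai_ek`, etc.) and the function name `lookup_tone` are labels invented here for illustration, and the values simply transcribe this one description of Vientiane tones, which other sources dispute.

```python
# Tone names transcribed from the Daniels table above.
# Keys are hypothetical labels for the table's columns.
# None marks combinations that the table leaves empty.
DANIELS_TONES = {
    1: {"open": "low", "closed_short": "high", "closed_long": "low falling",
        "mai_ek": "mid", "mai_tho": "high falling",
        "mai_ti": "high falling", "mai_catawa": "low rising"},
    2: {"open": "low rising", "closed_short": "high", "closed_long": "low falling",
        "mai_ek": "mid", "mai_tho": "low falling",
        "mai_ti": None, "mai_catawa": None},
    3: {"open": "high", "closed_short": "mid", "closed_long": "high falling",
        "mai_ek": "mid", "mai_tho": "high falling",
        "mai_ti": None, "mai_catawa": None},
}


def lookup_tone(consonant_class: int, context: str) -> str:
    """Return the tone name for a consonant class and a syllable context."""
    tone = DANIELS_TONES[consonant_class][context]
    if tone is None:
        raise ValueError(
            f"the table has no entry for class {consonant_class} with {context}"
        )
    return tone


print(lookup_tone(1, "open"))
print(lookup_tone(3, "mai_tho"))
```

A tone mark, when present, overrides the default given by the syllable structure, so a caller would pass the tone mark column in preference to the open or closed column.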

  | Live | Dead, short vowel | Dead, long vowel | Tone mai eːk | Tone mai toː | Tone mai tiː | Tone mai cat-ta-waː
Class 1 | ˋ low | ˇ rising | ˇ rising | mid | ˆ falling | ˊ high | ˇ rising
Class 2 | ˇ rising | ˇ rising | ˋ low | mid | ˋ low | - | -
Class 3 | ˊ high | mid | ˆ falling | mid | ˆ falling | - | -

Refs: Simmala

  | Live | Dead, short vowel | Dead, long vowel | Tone mai eːk | Tone mai toː | Tone mai tiː | Tone mai cat-ta-waː
Class 1 | low rising | high rising | low falling | high-mid | high falling | |
Class 2 | low rising | high rising | low falling | high-mid | low falling | |
Class 3 | high rising | high-mid | high falling | high-mid | high falling | |

Refs: SEAlang

The Simmala chart appears a little suspect, since they say in the text that the rising tone doesn't occur in dead syllables, and yet the book has examples of dead syllables with long vowels with a low tone.

Structure

The syllable is a basic element of the Lao language, and many words are monosyllabic. All syllables begin with a written consonant. Syllables that begin with a vowel sound are written with a silent base consonant.

The phonological structure of a syllable is (C)w?V(C).

Unlike Thai, its close neighbour linguistically, Lao doesn't naturally support onset clusters of consonants other than with ʷ, and then not before rounded vowels.wl,#Syllables Onset consonants followed by labialisation include: tʷʰ tɕʷ kʷ kʷʰ ʔʷ sʷ ŋʷ lʷ.

Only a small set of consonants occur at the end of a syllable. ʔ occurs after short vowels.

Lao also has 6 phonological tones in unchecked syllables (ie. ending in a vowel, or m, n, ŋ, w or j), and 4 in checked syllables (ie. ending in p, t, k, or ʔ).

Vowels

Dashes are used to indicate whether the character represents a vowel sound in a closed or an open syllable.

Vowel signs

ກະ ka U+0E81 LAO LETTER KO + U+0EB0 LAO VOWEL SIGN A

Although modern Lao orthography is alphabetic, and therefore has no inherent vowel, vowel signs are still used to write the vowel sounds. What's different is that ka, which would in an abugida be written just using a consonant, is also spelled out using a vowel sign.

Vowels in Lao are written with a mixture of combining characters and ordinary spacing characters. Only the superscript and subscript vowel signs are combining characters. It is also common to use some consonants to represent vowel sounds (see consonant_vowels).

Lao uses visual placement: only the 8 vowel components that appear above or below the consonant are combining marks; the others are ordinary spacing characters that are typed in the order seen.
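The split between combining and spacing vowel signs can be checked against the Unicode general category of each character. The sketch below uses Python's standard `unicodedata` module. The list `VOWEL_SIGNS` is an assumption made for illustration: it collects the code points of the vowel characters shown in the character index on this page.

```python
import unicodedata

# Code points of the vowel signs shown in the character index (illustrative list).
VOWEL_SIGNS = [
    0x0EB0, 0x0EB1, 0x0EB2, 0x0EB3, 0x0EB4, 0x0EB5, 0x0EB6, 0x0EB7,
    0x0EB8, 0x0EB9, 0x0EBB, 0x0EBD, 0x0EC0, 0x0EC1, 0x0EC2, 0x0EC3,
    0x0EC4, 0x0ECD,
]


def is_combining(cp: int) -> bool:
    """True if the code point has a Mark general category (Mn, Mc or Me)."""
    return unicodedata.category(chr(cp)).startswith("M")


combining = [cp for cp in VOWEL_SIGNS if is_combining(cp)]
spacing = [cp for cp in VOWEL_SIGNS if not is_combining(cp)]

for cp in VOWEL_SIGNS:
    kind = "combining" if is_combining(cp) else "spacing"
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {kind}")
```

Characters reported as spacing are the ones that are typed and stored in visual order. The combining ones follow the base consonant in memory.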

The glyphs used to represent vowels, whether alone or in composites, are arranged around the syllable onset as a whole, rather than just around the immediately preceding consonant. The onset may include 2 consonants, although this is rare in Lao. For an example of the effect this produces, see prebase and composite_vowels.

Vowel signs A & MAI KAN

As mentioned, unlike those of its close neighbour Thai, Lao consonants do not carry an inherent vowel. The short vowel a has to be written explicitly, using [U+0EB0 LAO VOWEL SIGN A] in open syllables, and [U+0EB1 LAO VOWEL SIGN MAI KAN] in closed syllables.

When used in conjunction with other vowels, [U+0EB0 LAO VOWEL SIGN A] and [U+0EB1 LAO VOWEL SIGN MAI KAN] are used to indicate short vowels for open and closed syllables, respectively, eg. ລະດັບ

This can be seen by comparing the long and short vowels in vowel_mappings. In common with other languages, long i, ɯ and u vowels have dedicated characters for long sounds, and there are a few unusual sequences, but many composite vowels use the aforementioned signs as shorteners. The following provides one example of the general pattern.

ເ-␣ເ-ະ␣ເ-ັ-

Combining marks used for vowels

ກິ ki U+0E81 LAO LETTER KO + U+0EB4 LAO VOWEL SIGN I

Lao uses the following combining marks for vowels.

ິ␣ີ␣ຶ␣ື␣ຸ␣ູ␣ົ␣ໍ␣ັ

Letter characters used for vowels

ກາ kaː U+0E81 LAO LETTER KO + U+0EB2 LAO VOWEL SIGN AA

The following additional, vowel-specific characters are ordinary spacing characters, with the general category of 'letter'. Many of these are typed and stored before the base consonant (see prebase).

ເ␣ແ␣ໂ␣ະ␣າ␣ ␣ຳ␣ໄ␣ໃ

[U+0EB3 LAO VOWEL SIGN AM] is classed as a vowel, but also contains the final consonant m, represented by a built-in nikhahit (cf. -ໍ [U+0ECD LAO NIGGAHITA​]). It is a spacing combining character, but the Unicode Standard classifies it as a letter.
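The relationship between the vowel sign AM and the nikhahit is visible in the Unicode Character Database, where U+0EB3 has a compatibility decomposition. The following small sketch inspects it with Python's `unicodedata` module. It is meant only to illustrate the properties involved, not to recommend compatibility normalisation of Lao text.

```python
import unicodedata

AM = "\u0eb3"  # LAO VOWEL SIGN AM

# The general category is a letter, not a mark.
print(unicodedata.category(AM))

# Canonical normalisation leaves the character alone...
assert unicodedata.normalize("NFD", AM) == AM

# ...but the compatibility decomposition splits it into two characters.
decomposed = unicodedata.normalize("NFKD", AM)
for ch in decomposed:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Because NFKC and NFKD change the stored sequence, applications that compare text after compatibility normalisation will treat the single character and the two-character sequence as the same.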

Pre-base vowel signs

ໂກ koː U+0EC2 LAO VOWEL SIGN O + U+0E81 LAO LETTER KO

Five vowel signs appear to the left of the onset consonant. See an example in fig_prebase.

ເ␣ແ␣ໂ␣ ␣ໄ␣ໃ

Like Thai, Lao uses a visual encoding model, so these characters are not combining characters, and are typed and stored before the base. For example:

ແມວ

Note that [U+0EC1 LAO VOWEL SIGN EI] should not be typed as two successive [U+0EC0 LAO VOWEL SIGN E] characters.
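Input that was typed with two successive E signs can be repaired with a simple replacement. This is a minimal sketch. The function name `fix_double_e` is hypothetical, and it assumes that two adjacent U+0EC0 characters are never intended, which seems reasonable since a pre-base vowel sign is expected to be followed by a consonant.

```python
E = "\u0ec0"   # LAO VOWEL SIGN E
EI = "\u0ec1"  # LAO VOWEL SIGN EI


def fix_double_e(text: str) -> str:
    """Replace each pair of VOWEL SIGN E with a single VOWEL SIGN EI."""
    return text.replace(E + E, EI)


# E + E + MO + WO is repaired to EI + MO + WO.
wrong = E + E + "\u0ea1\u0ea7"
print([f"U+{ord(c):04X}" for c in fix_double_e(wrong)])
```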

These vowel signs are placed before the start of the syllable onset. This means that in a word with more than one consonant at the start (such as for shifting the tone) the pre-base vowel is placed to the left of the syllable-initial consonant, rather than to the left of the consonant after which it is pronounced. Tone marks and post-base vowel signs are however attached to the latter. For the following examples, click on the Lao text to see the order of characters.

ໃຫຍ່ ເຫຼືອງ

fig_prebase shows another example to graphically illustrate the relationships between the characters.

ແມວ
A vowel sign that appears 2 characters out of sequence from where it is pronounced, because the syllable onset is 2 characters long.

ແກວ່ງ

Consonants used for vowel sounds

ກວ kuːə U+0E81 LAO LETTER KO + U+0EA7 LAO LETTER WO

The following characters are also used to create vowel sounds, either alone or as part of a composite vowel.

ອ␣ວ␣ຍ␣ຽ

On its own, in the middle of a closed syllable, [U+0EAD LAO LETTER O] is pronounced as the vowel -ɔː- and [U+0EA7 LAO LETTER WO] is pronounced -uːə̯-, eg. ຈອກ tɕɔːk˧˩ (glass), ບວມ buːəm (to swell).

[U+0EAD LAO LETTER O] is also used as a vowel carrier for standalone vowels (see standalone).

[U+0EA7 LAO LETTER WO], [U+0E8D LAO LETTER NYO], and [U+0EBD LAO SEMIVOWEL SIGN NYO] also represent the glide w or j at the end of a diphthong (see composite_vowels).

[U+0EBD LAO SEMIVOWEL SIGN NYO] was originally an alternate form of non-initial [U+0E8D LAO LETTER NYO], but is now used for diphthongs, either alone as iːə̯ or as the semi-vowel j, eg. ປຽກ piːə̯k˧˩ (wet), ຊາຽ saːj˦ (sand).

All of these characters also appear in the composite vowels described below.

Composite vowels

ເກຶອ kuːə U+0EC0 VOWEL SIGN E + U+0E81 LETTER KO + U+0EB6 VOWEL SIGN Y + U+0EAD LETTER O

In the lists below, hyphens represent consonants. Vowels used in closed syllables are indicated by a trailing hyphen in the IPA transcription.

Vowels in Lao are commonly written using a combination of code points, which may produce glyphs on more than one side of the base. See fig_composite.

ເຈັ້ຍ
A composite vowel made up of a pre-base vowel letter, a vowel combining mark, and a post-base semivowel that is acting as part of the rhyme.

ເຈັ້ຍ

Some composite vowels represent plain vowel sounds:

ເ-ະ␣ເ-ັ␣ເ-ິ␣ເ-ີ␣ໂ-ະ␣ແ-ະ␣ແ-ັ␣ເ-າະ␣ັອ

The other composites represent diphthongs, which generally end in one of ə̯, i, or w.

In some cases, the spelling is straightforward because a semivowel simply follows a plain vowel.

ິວ␣ີວ␣ຽວ␣ເ-ີຽ␣ເ-ີຍ␣ແ-ວ␣ອຍ␣ັຍ␣າຍ␣າຽ␣າວ

For others, the spelling doesn't closely follow the sound represented.

ັຽ␣ເ-ັຍ␣ເ-ຍ␣ເ-ຶອ␣ເ-ືອ␣ັວ␣ົວະ␣ົວ␣ເ-ົາ

Observation: Simmala et al. list the composite vowel ເ-ັຍະ for -ia in (only) one location, but I have yet to identify words containing this sequence, and suspect that it may be a typographic error.

Characters that don't appear in the combinations:

ຸ␣ູ␣ໃ␣ໄ␣ໍ␣ຳ
Show which combinations contain a given character:
ເ-ິ␣ ␣ິວ
ເ-ີ␣ ␣ເ-ີຽ␣ເ-ີຍ␣ ␣ີວ
ເ-ຶອ
ເ-ືອ
ເ-ະ␣ເ-ັ␣ເ-ິ␣ເ-ີ␣ເ-າະ␣ ␣ເ-ີຽ␣ເ-ີຍ␣ເ-ົາ␣ ␣ເ-ັຍ␣ເ-ຍ␣ເ-ັຍະ␣ເ-ຶອ␣ເ-ືອ
ແ-ະ␣ແ-ັ␣ ␣ແ-ວ
ໂ-ະ
ເ-ະ␣ໂ-ະ␣ແ-ະ␣ເ-າະ␣ ␣ເ-ັຍະ␣ົວະ
ເ-ັ␣ແ-ັ␣ັອ␣ ␣ັຍ␣ ␣ັຽ␣ເ-ັຍ␣ເ-ັຍະ␣ັວ
ເ-າະ␣ ␣າຍ␣າຽ␣າວ␣ເ-ົາ
ົວະ␣ົວ␣␣ເ-ົາ
ເ-ຶອ␣ເ-ືອ␣ັອ␣ ␣ອຍ
ຽວ␣ເ-ີຽ␣າຽ␣ ␣ັຽ
ເ-ີຍ␣ອຍ␣ັຍ␣າຍ␣ ␣ເ-ັຍ␣ເ-ຍ␣ເ-ັຍະ
ັວ␣ົວະ␣ົວ␣␣ິວ␣ີວ␣ຽວ␣ແ-ວ␣າວ
Show details about glyph positioning

The following list shows where vowel signs are positioned around a base consonant to produce vowels, and how many instances of that pattern there are. Numbers after + sign indicate multiple code points.

  • 5 pre-base, eg. ໂກ ōk̯ (ko)
  • 3 post-base, eg. ກາ k̯ā
  • 7 superscript, eg. ກິ k̯i
  • 2 subscript, eg. ກຸ k̯u
  • 1+6 sup+post-base, eg. ກັຍ k̯äɲ̱ kaj
  • +3 post+post-base, eg. ກາຍ k̯āɲ̱ kaːj
  • +5 pre+post-base, eg. ເກະ ēk̯a ke
  • +4 pre+superscript, eg. ເກິ ēk̯i
  • +1 super+post+post, eg. ກົວະ k̯ow̱a kuə
  • +1 pre+post+post, eg. ເກາະ ēk̯āa
  • +4 pre+sup+post-base, eg. ເກົາ ēk̯oā kaw
  • +2 pre+sup+post+post-base, eg. ເກັຍະ ēk̯äɲ̱a kia

At maximum, vowel components can occur concurrently on 3 sides of the base.

Distribution of vowel elements is as follows:

-ັ -ິ -ີ -ຶ -ື -ໍ -ົ -ຳ
ເ ແ ໂ ໃ ໄ ະ າ ອ ວ ຍ ຽ ຍ ຽ ະ ວ
  -ຸ -ູ    

Standalone vowels

Standalone vowels are not preceded by a consonant, and may appear at the beginning or in the middle of a word. This typically includes a way to represent the sound of the inherent vowel in isolation.

Lao uses a silent [U+0EAD LAO LETTER O] as a base (although it is often transcribed as ʔ), to which vowel signs are applied, eg. ໂອ ອຸ່ນ

Lao has no independent vowel letters.

Vowel absence

Because the orthography is alphabetic, rather than an abugida, vowel absence after syllable-final consonants does not normally need to be marked in any way. The absence of a vowel sound is simply indicated by the absence of a vowel sign.

Nor is vowel absence normally marked in syllable-initial clusters. The only conjunct forms in Lao are the subjoined l and the ligatures described in highclass.

[U+0ECC LAO CANCELLATION MARK] was previously used to indicate silenced consonants, but is now described as obsolete.wl,#Punctuation

Vowel sounds to characters

This section maps Lao vowel sounds to common graphemes in the Lao orthography. The dash indicates the location of the consonant relative to the vowel sign; if there are 2 dashes, the vowel is used only in closed syllables.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

Plain vowels

Diphthongs and triphthongs

Tones

Tone marks

The Unicode Lao block provides the following characters for indicating tone.

່␣້␣໊␣໋

Tone marks should be typed and stored in memory immediately after the base consonant of the syllable, or after a superscript vowel sign if there is one. However, the tone mark should be typed before [U+0EB3 LAO VOWEL SIGN AM], and should be displayed above the nikhahit, see fig_tone_am.
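Text in which a tone mark was typed after the vowel sign AM can be brought into the recommended order with a small substitution. The following is a sketch under that assumption only. The function name `fix_tone_am_order` is hypothetical, and the code does not attempt any other reordering.

```python
import re

# MAI EK, MAI THO, MAI TI, MAI CATAWA
TONE_MARKS = "\u0ec8\u0ec9\u0eca\u0ecb"
AM = "\u0eb3"  # LAO VOWEL SIGN AM

# Matches AM followed by a tone mark, ie. the order to be corrected.
_AM_THEN_TONE = re.compile(f"{AM}([{TONE_MARKS}])")


def fix_tone_am_order(text: str) -> str:
    """Move a tone mark that follows AM so that it precedes it."""
    return _AM_THEN_TONE.sub(lambda m: m.group(1) + AM, text)


# KO + AM + MAI EK becomes KO + MAI EK + AM.
fixed = fix_tone_am_order("\u0e81\u0eb3\u0ec8")
print([f"U+{ord(c):04X}" for c in fixed])
```

Text that is already in the recommended order passes through unchanged.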

Consonants

Each onset consonant is associated with a high, mid, or low class related to tone. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark.

No conjuncts are used for consonant clusters, except for one subjoined consonant, used in combination only with HA.

Syllable-initial clusters and syllable-final consonant sounds are all written with ordinary consonant letters. However, because all vowels are written, it is not difficult to algorithmically detect syllable boundaries.

For a mapping of sounds to graphemes see consonant_mappings.

Consonant letters

Plosives & affricate

high class
ຜ␣ຖ␣ຂ
mid class
ປ␣ບ␣ຕ␣ດ␣ຈ␣ກ␣ອ
low class
ພ␣ທ␣ຄ

Fricatives

high class
ຝ␣ສ␣ຫ
low class
ຟ␣ຊ␣ຮ

Nasals

high class
ຫມ␣ໝ␣ຫນ␣ໜ␣ຫຍ␣ຫງ
low class
ມ␣ນ␣ຍ␣ງ

Other sonorants

high class
ຫວ␣ຫລ␣ຫຼ␣ຫຽ
mid class
low class
ລ␣ວ

High class nasals & liquids with HO

A silent [U+0EAB LAO LETTER HO SUNG] can be added before the following onset characters to make their default tonal class high:

ຫມ␣ຫນ␣ຫຍ␣ຫງ␣ຫວ␣ຫລ␣ຫຽ

There are alternate forms for 3 of these. Two can be represented as ligatures, for which there are separate characters in Unicode: [U+0EDC LAO HO NO] and [U+0EDD LAO HO MO], d,462 eg. ໝາ A third can be represented by ຫຼ [U+0EAB LAO LETTER HO SUNG + U+0EBC LAO SEMIVOWEL SIGN LO],u,378 eg. ຫຼາຍ

ໝ␣ໜ␣ຼ
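Because the same onset can be stored in two ways, applications that search or compare Lao text may want to fold the alternate forms to a single spelling first. The sketch below expands the alternates to sequences of HO plus an ordinary letter. The mapping table and the function names are assumptions made for illustration, and whether such folding is appropriate depends on the application.

```python
import unicodedata

# Alternate form -> sequence of HO SUNG plus an ordinary consonant letter.
HO_ALTERNATES = {
    "\u0edc": "\u0eab\u0e99",        # LAO HO NO -> HO SUNG + NO
    "\u0edd": "\u0eab\u0ea1",        # LAO HO MO -> HO SUNG + MO
    "\u0eab\u0ebc": "\u0eab\u0ea5",  # HO SUNG + SEMIVOWEL SIGN LO -> HO SUNG + LO LOOT
}


def expand_ho_forms(text: str) -> str:
    """Replace the ligatures and the subjoined form with two-letter sequences."""
    for alternate, sequence in HO_ALTERNATES.items():
        text = text.replace(alternate, sequence)
    return text


def same_word(a: str, b: str) -> bool:
    """Compare two strings after folding the alternate HO forms."""
    return expand_ho_forms(a) == expand_ho_forms(b)


# HO MO + AA compared with HO SUNG + MO + AA.
print(same_word("\u0edd\u0eb2", "\u0eab\u0ea1\u0eb2"))

# The two ligatures also have compatibility decompositions in Unicode.
print(unicodedata.normalize("NFKD", "\u0edc") == "\u0eab\u0e99")
```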

Letter O

[U+0EAD LAO LETTER O] is silent when used as a base for vowels at the beginning of a syllable, eg. ໂອ. When it appears after a base consonant it becomes the vowel ɔː, eg. ຈອກ. It is also used in combination with other characters to produce additional vowel sounds (see composite_vowels).

The r sound

One more letter was officially removed from the alphabet by the Ministry of Education, but it is still used occasionally to transliterate Indic or other foreign words into Lao, eg. ຝຣັ່ງ flaŋ (foreigner). It is generally used to represent the letter 'r'; the sound r no longer exists in Lao.

Consonant clusters

Phonetically, unlike Thai, Lao doesn't have syllable-initial consonant clusters other than those followed by w plus a non-rounded vowel (see structure). Those are written using an ordinary [U+0EA7 LAO LETTER WO], eg. ຄວາຍ

Otherwise, consonant letter clusters occur in the following circumstances:

No special characters or viramas are involved, except those described in highclass.

In a consonant cluster any tone marks or superscript vowels appear over the second consonant.

Final consonants

Lao doesn't have any code points dedicated to syllable final consonants, although consonants do appear in those positions, eg. ນົກ

Only the following consonants appear in syllable-final position. Note how the sound may change: where there is more than one phonetic realisation, the one with a preceding hyphen is the final.

ກ␣ງ␣ຍ␣ດ␣ນ␣ບ␣ມ␣ຢ␣ຣ␣ວ

Because Lao requires vowels to be written, there is not the ambiguity about syllable boundaries that one finds in Thai (caused by ambiguity about whether a consonant is syllable-final or a syllable in its own right).

A final m sound may be represented by [U+0EB3 LAO VOWEL SIGN AM].

Consonant sounds to characters

This section maps Lao consonant sounds to common graphemes in the Lao orthography, grouped by high class ( h ), mid class ( m ), low class ( l ) and syllable-final ( f ).

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

Stops

Affricate

t͡ɕ
m

Fricatives

Nasals

Other

Final stops

Final nasals

Other letters

Unicode 12 added 14 consonant letters and 1 combining mark for writing Pali.

ຆ␣ຉ␣ຌ␣ຎ␣ຏ␣ຐ␣ຑ␣ຒ␣ຓ␣ຘ␣ຠ␣ຨ␣ຩ␣ຬ␣຺

Numbers, dates, currency, etc.

Digits

Lao uses Western digits.

There is, however, a set of Lao digits.

໐␣໑␣໒␣໓␣໔␣໕␣໖␣໗␣໘␣໙
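Since both digit sets may be encountered, it can be useful to convert between them. The sketch below does this with translation tables. The function names are invented for illustration. Note that Python's built-in `int()` already accepts Lao digits, because it parses any Unicode decimal digit.

```python
# The Lao digits occupy U+0ED0..U+0ED9, in order from zero to nine.
LAO_DIGITS = "".join(chr(0x0ED0 + i) for i in range(10))

TO_LAO = str.maketrans("0123456789", LAO_DIGITS)
TO_ASCII = str.maketrans(LAO_DIGITS, "0123456789")


def to_lao_digits(text: str) -> str:
    """Replace Western digits with Lao digits, leaving other characters alone."""
    return text.translate(TO_LAO)


def to_ascii_digits(text: str) -> str:
    """Replace Lao digits with Western digits, leaving other characters alone."""
    return text.translate(TO_ASCII)


print(to_lao_digits("1975"))
print(to_ascii_digits("\u0ed1\u0ed9\u0ed7\u0ed5"))
print(int("\u0ed1\u0ed2\u0ed3") + 1)
```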

Observation: Pending further clarification about how widespread the use of Lao digits is, note that Lao Wikipedia uses Lao digits for table of contents list numbering and for footnote references. See the relevant sections below.

The CLDR standard-decimal pattern is #,##0.###. The standard-percent pattern is #,##0%.cldr

Observation: Lao Wikipedia uses a French pattern, #.##0,###, eg. ມີກຳລັງຕິດຕັ້ງ 7.207,24 ເມກາວັດ (There are 7,207.24 megawatts installed).
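The difference between the two patterns is only in which characters act as grouping and decimal separators. The following sketch formats a number both ways using plain Python string formatting rather than a CLDR library. The function names are hypothetical, and rounding behaviour is whatever Python's `format` provides, which may differ in edge cases from a full CLDR implementation.

```python
def format_cldr_decimal(value: float) -> str:
    """Approximate the pattern #,##0.### (up to 3 fraction digits, none required)."""
    text = f"{value:,.3f}"
    # Drop trailing zeros, then a trailing decimal point if nothing follows it.
    return text.rstrip("0").rstrip(".")


def format_french_style(value: float) -> str:
    """Approximate the pattern #.##0,### by swapping the two separators."""
    swap = str.maketrans(",.", ".,")
    return format_cldr_decimal(value).translate(swap)


print(format_cldr_decimal(7207.24))
print(format_french_style(7207.24))
```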

Currency

The CLDR standard format for currency is ¤#,##0.00;¤-#,##0.00, and the symbol for the Lao currency, Kip, is [U+20AD KIP SIGN].cldr
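The currency pattern has a positive and a negative subpattern separated by a semicolon, and in the negative one the minus sign follows the currency symbol. A minimal sketch of that behaviour, using plain string formatting and a hypothetical function name `format_kip`, might look like this.

```python
KIP_SIGN = "\u20ad"  # KIP SIGN


def format_kip(amount: float) -> str:
    """Approximate the pattern with the symbol first, then any minus sign."""
    sign = "-" if amount < 0 else ""
    return f"{KIP_SIGN}{sign}{abs(amount):,.2f}"


print(format_kip(1234.5))
print(format_kip(-1234.5))
```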

Text direction

Lao text runs left to right in horizontal lines.

Show default bidi_class properties for characters by the modern Lao orthography.

Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Lao character app.

None of the characters require special shaping based on the visual context. Nor is printed text cursive.

Lao has no special requirements for baseline alignment between mixed scripts or in general.

The orthography has no case distinction, and no special transforms are needed to convert between characters.

Context-based shaping & positioning

Prescript vowels are visually ordered, and therefore do not need to be positioned by the font.

Vowel signs, tones, and one consonant, on the other hand, are combining characters that need to be correctly positioned relative to the base character, and multiple marks can be combined with a single base character.

When using the vowel sign AM with a tone mark the small circle needs to push the tone mark upwards, even though the tone mark occurs before the vowel sign in memory (see fig_tone_am).

ກ່ຳ
The small circle of the vowel sign AM appears below the tone mark, even though the tone mark precedes the vowel sign in memory.

Font styling & weight

Observation: Italicised text is used for figure captions, and also for quotations.

Graphemes

Grapheme clusters

tbd

Unicode grapheme clusters divide text into segments that contain a single base consonant plus any following combining characters: the latter include the 9 combining vowel signs, and all tone marks. Not included are free-standing vowel signs and consonants that make up other parts of a composite vowel, both pre-base and post-base. Also, syllable-initial consonant clusters with -ວ [U+0EA7 LAO LETTER WO] and ຫ- [U+0EAB LAO LETTER HO SUNG] are treated as 2 text units, but not ຫຼ [U+0EAB LAO LETTER HO SUNG + U+0EBC LAO SEMIVOWEL SIGN LO].

This implies that a pre-base vowel sign such as ເ- [U+0EC0 LAO VOWEL SIGN E] would be treated as a separate item from what follows, and in fact this can be seen in fig_drop_caps_2, where that character is the only thing highlighted in an initial letter selection. (On the other hand, initial letters followed by combining characters select the whole sequence, as seen in fig_drop_caps.)

This means that Lao typography is different from some other SE Asian scripts where pre-base vowel signs are selected with the base because they are combining characters, or syllable-initial consonant clusters form a unit because the 'medial' consonants are represented by combining characters.
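The behaviour described in this section can be approximated with a few lines of code. The sketch below is a simplification, not an implementation of the Unicode text segmentation rules: it attaches nonspacing marks (general category Mn) to the preceding character, and also attaches the vowel sign AM, on the assumption that this spacing character is kept with its base. The function name `simple_clusters` is hypothetical.

```python
import unicodedata

AM = "\u0eb3"  # LAO VOWEL SIGN AM, assumed here to stay with its base


def simple_clusters(text: str) -> list[str]:
    """Split text into a base character plus any following marks (simplified)."""
    clusters: list[str] = []
    for ch in text:
        extends = unicodedata.category(ch) == "Mn" or ch == AM
        if clusters and extends:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# Pre-base E + MO + vowel sign + tone mark + O gives three units.
print(len(simple_clusters("\u0ec0\u0ea1\u0eb7\u0ec8\u0ead")))

# HO SUNG + SEMIVOWEL SIGN LO stays together, but HO SUNG + MO does not.
print(len(simple_clusters("\u0eab\u0ebc")), len(simple_clusters("\u0eab\u0ea1")))
```

A production implementation would normally rely on a library that implements the full segmentation rules rather than on general categories alone.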

Punctuation & inline features

Word boundaries

Words are not separated by spaces; nonetheless, double-clicking or other selection methods are expected to identify word boundaries. There are 2 alternative approaches for managing this.

  1. An application uses a dictionary or smart algorithm to parse the text and determine word boundaries.
  2. The author uses U+200B ZERO WIDTH SPACE between words when creating the content.

Unlike Thai or Khmer, it is fairly straightforward to parse individual syllables in Lao, because its alphabetic nature makes it possible to identify syllable-final consonants. Note that syllable-based segmentation must identify and keep together any syllable-initial clusters involving h or l, for example, the initial 2 letters in ຫມາ should wrap as a unit just like the ligated form, ໝາ .

What about kw etc?

While nearly all syllables can be argued to be words in their own right, there is still a preference for keeping multi-syllabic words together during word-based segmentation. eg. ປະເທດ For this, an application needs to use a dictionary to parse Lao text.

However, widely used software automatically inserts U+200B ZERO WIDTH SPACE (ZWSP) in Lao text at word or syllable boundaries, and many web pages use such inserted ZWSP characters to get browsers to wrap correctly.g3,#issuecomment-385847864

If a dictionary fails to keep two or more syllables together as needed, it should be possible to use the Unicode character U+2060 WORD JOINER between the two syllables. This is an invisible character, equivalent to a zero-width no-break space, and used to prevent line-breaks.
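The author-driven approach, and the use of the word joiner, can be illustrated with ordinary string operations. This is only a sketch: the function names are invented here, and it assumes the word boundaries are already known, since finding them is the part that needs a dictionary.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible break opportunity
WJ = "\u2060"    # WORD JOINER: invisible, prevents a break


def mark_boundaries(words: list[str]) -> str:
    """Join already-segmented words with ZWSP so that lines can wrap between them."""
    return ZWSP.join(words)


def break_opportunities(text: str) -> list[str]:
    """Recover the units between which a line break is allowed."""
    return text.split(ZWSP)


def keep_together(first: str, second: str) -> str:
    """Glue two syllables with WORD JOINER so that they are not separated."""
    return first + WJ + second


def strip_invisibles(text: str) -> str:
    """Remove both invisible characters, eg. before searching or comparing."""
    return text.replace(ZWSP, "").replace(WJ, "")


country = "\u0e9b\u0eb0\u0ec0\u0e97\u0e94"  # the example word given above
lao = "\u0ea5\u0eb2\u0ea7"
marked = mark_boundaries([country, lao])
print(len(break_opportunities(marked)))
print(strip_invisibles(marked) == country + lao)
```

Stripping the invisible characters before comparison matters because inserted ZWSP characters otherwise prevent a plain substring search from matching.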

If dictionaries are used for segmentation, they should be selected based on the language, not the script. (See the list of languages using the Lao script.)

Phrase & section boundaries

,␣;␣:␣.␣?␣!

Lao uses ASCII punctuation, but also uses space as punctuation.

phrase | U+0020 SPACE | , [U+002C COMMA] | ; [U+003B SEMICOLON] | : [U+003A COLON]
sentence | U+0020 SPACE | . [U+002E FULL STOP] | ? [U+003F QUESTION MARK] | ! [U+0021 EXCLAMATION MARK]

Spaces are used, but represent phrase or sentence boundaries.

Numbers are also normally surrounded by spaces.

In principle, periods are not used, though this appears to be changing.wl,#Punctuation

Observation: Lao Wikipedia uses periods at the end of sentences, and commas (see an example).l An online news site also consistently uses periods to end sentences.

Western punctuation is also used. Contemporary writing may include punctuation marks borrowed from French, such as the exclamation mark (!), and question mark (?). However, questions can be determined by question words within a sentence.wl,#Punctuation

Hyphens are also commonly found in modern writing.wl,#Punctuation

Bracketed text

(␣)

Lao commonly uses ASCII parentheses to insert parenthetical information into text.

  | start | end
standard | ( [U+0028 LEFT PARENTHESIS] | ) [U+0029 RIGHT PARENTHESIS]

( [U+0028 LEFT PARENTHESIS] and ) [U+0029 RIGHT PARENTHESIS] are used for parentheses in contemporary writing.wl,#Punctuation

Quotations & citations

“␣”␣«␣»␣‘␣’

Lao texts use quotation marks around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.

  | start | end
initial | “ [U+201C LEFT DOUBLE QUOTATION MARK], « [U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK] | ” [U+201D RIGHT DOUBLE QUOTATION MARK], » [U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK]
nested | ‘ [U+2018 LEFT SINGLE QUOTATION MARK] | ’ [U+2019 RIGHT SINGLE QUOTATION MARK]

The default quote marks for Lao are [U+201C LEFT DOUBLE QUOTATION MARK] at the start, and [U+201D RIGHT DOUBLE QUOTATION MARK] at the end.cldr

When an additional quote is embedded within the first, the quote marks are [U+2018 LEFT SINGLE QUOTATION MARK] and [U+2019 RIGHT SINGLE QUOTATION MARK].cldr

Contemporary writing may also include « [U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK] and » [U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK] for quotation marks, borrowed from French.wl,#Punctuation

Emphasis

tbd

Abbreviation, ellipsis & repetition

ຯ␣…␣ໆ

Ellipsis & abbreviation

[U+0EAF LAO ELLIPSIS] is used to indicate ellipsis or abbreviation, as well as missing words.wl,#Punctuation

The ellipsis, [U+2026 HORIZONTAL ELLIPSIS], is also commonly found in modern writing.wl,#Punctuation

Observation: Lao Wikipedia uses periods after date-related abbreviations,l eg. in ຄ.ສ. 1935 ສະບັບຄົ້ນ ḵʰ.s. 1935 sab̯äb̯ḵʰo²ṉ (CE 1935 Edition). They are also used in the abbreviated name of the country, eg. ສ.ປ.ປ.ລາວ s.p̯.p̯.ḻāw̱ (Lao PDR).

[U+0EC6 LAO KO LA] is used in ໆລໆ kʰɯaŋ-mǎːj-lɛ-ɯːn-ɯːn (ເຄຶ່ອງໝາຍ ແລະອຶ່ນໆ), with a meaning similar to etc. For example, ການສື່ສານ,ສື່ມວນຊົນ,ສື່ໂຄສະນາ...ໆລໆ Communication, media, advertising ... etc

Some sources use ຯລຯ and others ໆລໆ – check this out.

Repetition

[U+0EC6 LAO KO LA] is used to indicate repetition of a preceding sound.

Inline notes & annotations

tbd

Other punctuation

CLDR includes the following punctuation.

‐␣‑␣–␣—␣†␣‡␣′␣″

Other inline text decoration

tbd

Line & paragraph layout

Line-breaking & hyphenation

Although Lao doesn't use spaces or dividers between words, the expectation is that line-breaks occur at word boundaries. See word for a discussion of issues related to word-based segmentation.

Line-edge rules

As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.

Show (default) line-breaking properties for characters in the modern Lao orthography.

The following list gives examples of typical behaviours for some of the characters used in modern Lao. Context may affect the behaviour of some of these and other characters.


  • “ ‘ ( «   should not be the last character on a line.
  • ” ’ ) » . , ; ! ? %   should not begin a new line.
  •   should be kept with any number, even if separated by a space or parenthesis.

Line breaking should not move a danda or double danda to the beginning of a new line even if they are preceded by a space character.

Text alignment & justification

Since spaces aren't used to separate words, Lao has to use alternative strategies for justification of text.

Text spacing

tbd

This section looks at ways in which spacing is applied between characters over and above that which is introduced during justification.

Baselines, line height, etc.

Lao uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Lao places vowel and tone marks above base characters, one above the other, and can also add combining characters below the line. The complexity of these marks means that the vertical resolution needed for clearly readable Lao text is higher than for English, or most Latin text. In addition, Lao also tends to add more interline spacing than Latin text does.

To give an approximate idea, fig_baselines compares Latin and Lao glyphs from Noto fonts. The basic height of Lao letters is typically slightly above the Latin x-height, however extenders and combining marks reach well beyond the Latin ascenders and descenders, creating a need for larger line spacing.

Hhqxใฏูกิ้ปีฬุฬึ์๕๙ Hhqxใฏูกิ้ปีฬุฬึ์๕๙
Font metrics for Latin text compared with Lao glyphs in the Noto Serif Lao (top) and Noto Sans Lao (bottom) fonts.

fig_baselines_other shows similar comparisons for the Lao MN and DokChampa fonts.

Hhqxใฏูกิ้ปีฬุฬึ์๕๙ Hhqxใฏูกิ้ปีฬุฬึ์๕๙
Latin font metrics compared with Lao glyphs in the Lao MN (top) and DokChampa (bottom) fonts.

Counters, lists, etc.

You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.

The modern Lao orthography uses a numeric style.

Numeric

The lao numeric style is decimal-based and uses these digits.rmcs

໐␣໑␣໒␣໓␣໔␣໕␣໖␣໗␣໘␣໙

Examples:

໑␣໒␣໓␣໔␣໑໑␣໒໒␣໓໓␣໔໔␣໑໑໑␣໒໒໒␣໓໓໓␣໔໔໔
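A decimal counter of this kind can be generated by rendering the number in base 10 and substituting Lao digits. The sketch below shows the idea. The function name `lao_counter` and its `suffix` parameter are hypothetical, and the default period suffix is only an assumption about typical list markers.

```python
# Translation table from Western digits to the Lao digits U+0ED0..U+0ED9.
_TO_LAO = str.maketrans(
    "0123456789", "".join(chr(0x0ED0 + i) for i in range(10))
)


def lao_counter(n: int, suffix: str = ".") -> str:
    """Render a counter value with Lao digits, followed by a suffix."""
    return str(n).translate(_TO_LAO) + suffix


# Produces markers for a short numbered list.
for i in (1, 2, 3, 11, 222):
    print(lao_counter(i))
```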

Prefixes and suffixes

Observation: Lao Wikipedia uses periods for suffixes.l

Lao digits being used for section numbering in Wikipedia.

Styling initials

It is possible to find the first letter in a paragraph styled so that it is larger and sits alongside several lines of the continuing paragraph text.

Observation: All combining characters, including spacing ones, are included in the selections shown in fig_drop_caps.

Any punctuation such as opening quotes and opening parentheses should also be included in the initial styling. ?

Examples of dropped highlights

Two example paragraphs showing dropped highlighted initials with combining characters.

Observation: In the figures shown, the alphabetic baseline of the highlighted letter(s) matches the bottom of the row that determines the size of the highlighted letter(s). Selections without diacritics above are somewhat shorter than the height of the lines alongside, whereas selections with multiple diacritics rise slightly higher than the first line of text.

Observation: In fig_drop_caps_2, the selection picks out only from the digraph ຫລ; and from the syllable ເມຶ່ອ.

Examples of dropped highlights

More example paragraphs, showing dropped highlighted initials that are part of a larger construct.

Page & book layout

This section is for any features that are specific to Lao and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

Notes, footnotes, etc

Observation: Lao Wikipedia uses Lao digits for footnote references.

Footnote references in Wikipedia using Lao digits.

References