Northern Thai (draft)
Lanna (Tai Tham)

Updated 9 January, 2023

This page brings together basic information about the Tai Tham (Lanna) script and its use for the Northern Thai language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Northern Thai using Unicode.

Good Tai Tham fonts are hard to find, especially for Northern Thai. The default font used for Northern Thai in this page is Hariphunchai. Discussions are under way at the Unicode Consortium that may change the ordering of character sequences in the future, so the order of characters in this page reflects what the font expects in order to display correctly.

Sample


ᨣᩢ᩠ᨶᩉᩖᩮᩨᨠᩥ᩠ᨶ ᨣᩢᩐᩢᩣᨡᩣ᩠ᨿᨸᩮ᩠ᨶᨦᩫ᩠ᨶ ᨠᩮ᩠ᨷᩉᩬᨾᩋᩬᨾᩅᩱᩢᨯ᩠᩶ᨦᨶᩦ᩶ ᨴᩩᨠᪧᨸᩦᨾᩣᨷᩢᨡᩣ᩠ᨯ ᨧᩥ᩠᩵ᨦᨠ᩠ᨴᩣᩴᩉᩨ᩶ᨡᩮᩢᩣᨻᩳ᩵ᨾᩯ᩵ᩃᩪᨠ ᨷᩢᨯᩱ᩶ᨠᩢ᩠᩶ᨶᩈᩢ᩠ᨦᩈᩢ᩠ᨠᨩᩮᩨᩬ

Usage & history

Northern Thai is spoken by the people of Lanna, Thailand, with a smaller community of Lanna speakers in northwestern Laos. Few of the six million speakers of Northern Thai are literate in the Tai Tham script, although interest in the script is rising among the young. Since the beginning of the 20th century, the Thai script has been used for the Northern Thai language, although this is problematic because Thai has only 5 tones to Northern Thai's 6.

Use of the Lanna traditional script is now largely limited to Buddhist temples, where many old sermon manuscripts are still in active use. There is no active production of literature in the traditional alphabet. The modern pronunciation differs from that prescribed in spelling rules.

ᨲ᩠ᩅᩫᨾᩮᩥᩬᨦ ᨣᩤᩴᨾᩮᩨᩬᨦ

In the Thai script this is คำเมือง.

The Lanna script is derived from Mon, and before that Pallava.

Sources: Wikipedia, Unicode.

Basic features

The Tai Tham script is an abugida, ie. consonants carry an inherent vowel sound that is overridden using vowel signs. See the table to the right for a brief overview of features for the modern Northern Thai orthography.

Northern Thai text runs left to right in horizontal lines.

Words are not separated by spaces. However, syllables may be separated by ZWSP, as long as the break does not fall inside a stack.

Each onset consonant is associated with a high, mid, or low class related to tone. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark. ❯ consonants

Tai Tham has stacked consonants, but these do not necessarily indicate consonant clusters. The script is unusual in that any consonant in a stack can retain its inherent vowel, or be associated with a vowel sign. The sakot, which produces stacks, is never visible. ❯ clusters

Stacks can span word boundaries.

Syllable-initial clusters use 2 dedicated code points for the medial l, and a subjoined letter for medial w. ❯ onsets

Syllable-final consonant sounds can be written using 6 special diacritics, but otherwise use ordinary letters, which may or may not be subjoined depending on the context. ❯ finals

The Northern Thai orthography has an inherent vowel a, and represents other vowels using 19 dependent vowel marks (including 5 pre-base vowel signs), and 3 consonants (2 of which as subjoined forms). Unlike Thai and Lao, all vowel signs are combining marks, and are stored after the base character. Vowels are often written differently when they appear in a closed vs. open syllable. ❯ vowels

There is an incomplete set of independent vowels, and standalone vowel sounds are typically written using vowel signs applied to [U+1A4B TAI THAM LETTER A]. ❯ standalone

This page lists 29 composite vowels (made from 9 vowel signs, and 3 consonants/diacritics). Composite vowels can involve up to 5 glyphs, which can surround the base consonant(s) on up to 4 sides. ❯ composite_vowels

Northern Thai and Khün not only use a slightly different set of characters, but a number of characters have consistently divergent shapes.

Character index

Letters

Basic consonants

ᨸ␣ᨹ␣ᨲ␣ᨭ␣ᨳ␣ᨮ␣ᨠ␣ᨷ␣ᨯ␣ᩋ␣ᨻ␣ᨽ␣ᨴ␣ᨵ␣ᨰ␣ᨣ␣ᨧ␣ᨩ␣ᨺ␣ᩈ␣ᩆ␣ᩇ␣ᨨ␣ᨡ␣ᨢ␣ᩉ␣ᨼ␣ᨪ␣ᨫ␣ᨥ␣ᨤ␣ᩌ␣ᨾ␣ᨶ␣ᨱ␣ᨬ␣ᨿ␣ᨦ␣ᩀ␣ᩅ␣ᩁ␣ᩃ␣ᩊ

Extended consonants

ᩓ␣ᩔ

Vowels

ᩍ␣ᩎ␣ᩏ␣ᩐ␣ᩑ

Other

Combining marks

Vowels

ᩡ␣ᩢ␣ᩣ␣ᩤ␣ᩥ␣ᩦ␣ᩧ␣ᩨ␣ᩩ␣ᩪ␣ᩫ␣ᩬ␣ᩮ␣ᩯ␣ᩰ␣ᩱ␣ᩲ␣ᩳ

Tones

᩵␣᩶

Medials

ᩕ␣ᩖ

Finals

ᩴ␣ᩙ␣ᩘ␣ᩢ
ᩝ␣ᩞ

Invisible stacker

Other

᩺␣᩻␣᩿␣ᩛ

Numbers

᪀␣᪁␣᪂␣᪃␣᪄␣᪅␣᪆␣᪇␣᪈␣᪉
᪐␣᪑␣᪒␣᪓␣᪔␣᪕␣᪖␣᪗␣᪘␣᪙

Punctuation

᪨␣᪩␣᪪␣᪫␣“␣”
᪦␣᪬␣᪣␣᪤␣᪥␣᪭

ASCII

(␣)␣?␣!

Logographs

᪠␣᪡␣᪢

Phonology

These are sounds for the Northern Thai language.

Phones in a lighter colour are non-native or allophones.

Vowel sounds

Plain vowels

i ɯ ɯː ɯ ɯː u e ɤ ɤː ɤ ɤː o ɛ ɛː ɔ ɔː a

Diphthongs

ia ɯ ɯː ɯa ɯaː ua uaː uai aw aj

Consonant sounds

labial dental alveolar post-alveolar palatal velar glottal
stops p b t d       k ʔ
aspirated          
affricates       t͡ɕ      
fricatives f   s     x h
nasals m   n   ɲ ŋ
approximants w   l   j  

The glottal stop is pronounced after short open vowels. An initial glottal stop is also pronounced before independent vowels (see standalone).

Syllable-final

labial dental alveolar post-alveolar palatal velar glottal
stop       ʔ
nasal m   n     ŋ
approximant w       j  

All final stops are unreleased. (Source: Wikipedia, https://en.wikipedia.org/wiki/Northern_Thai_language#Final_consonants)

Tone

The Chiang Mai dialect of Northern Thai has 6 tones. They are illustrated in fig_tones, which is taken from Wikipedia.

The six tones of Northern Thai.

Wikipedia provides the following information for the 6 phonemic tones for unchecked syllables in the Chiang Mai dialect of Northern Thai. (It also has sound recordings.)

Tone Representations Example
low-rising ˨˦ 24 ǎ ᩉᩖᩮᩢᩣ
mid-low ˨ 22 à ᩉᩖᩮᩢᩣ᩵
high-falling glottalised ˥˧ 53 a᷇ ᩉᩖᩮᩢᩣ᩶
mid-high ˧ 33 ā ᩃᩮᩢᩣ
falling ˥˩ 51 â ᩃᩮᩢᩣ᩵
high rising-falling glottalised ˦˥˦ 545 á ᩃᩮᩢᩣ᩶

This is the list for checked syllables, also from Wikipedia.

Tone Representations Example
low-rising ˨˦ 24 ǎ ᩉᩖᩢᨠ
high-falling ˥ 55 a᷇ ᩃᩢ᩠ᨠ
low ˨ 22 à ᩉᩖᩣ᩠ᨠ
falling ˥˩ 51 â ᩃᩣ᩠ᨠ

The mapping of tones to characters is described in tones.

Structure

See Tai Tham/Khün.

Vowels

Dashes are used to indicate the location of a consonant or consonant cluster. Prescript vowel signs have been stored before the hyphen because of the limitations of the font, but in reality all vowel signs should occur after the consonant they modify.

Inherent vowel

ka U+1A20 LETTER HIGH KA

a following a consonant is not written, but is seen as an inherent part of the consonant letter, so ka is written by simply using the consonant letter.

Other vowels

Non-inherent vowel sounds that follow a consonant are mostly represented using combining vowel marks (vowel signs). However, Northern Thai also uses some consonants. Many vowel sounds are represented by a combination of code points (see composite_vowels), and the consonant base can be followed by 4 vowel signs, which are displayed around the base.

In principle, all vowel signs are typed and stored after the base consonant, whether or not they precede it when displayed. The font takes care of the glyph positioning. However, the Unicode Consortium is currently examining the encoding model for Tai Tham. There is a possibility that pre-base vowel signs may be stored before the consonant in future.
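To make the storage order concrete, here is a minimal Python sketch that lists the code points of a string in the order they are stored, using only the standard unicodedata module. The helper name describe is a hypothetical choice for this illustration, not something defined by the page or by Unicode.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs in storage order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

# HIGH KA followed by VOWEL SIGN E: the vowel glyph is displayed to the
# left of the consonant, but the vowel sign is stored after it.
for code_point, name in describe("\u1A20\u1A6E"):
    print(code_point, name)
```

If a font requires the pre-base vowel sign to be typed first, the same listing would show the vowel sign at the start of the sequence instead.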

Eight vowel signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
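As a rough illustration, the Unicode general category distinguishes spacing combining marks (Mc) from non-spacing ones (Mn). The Python sketch below checks that property for two vowel signs. It assumes that the general category is a reasonable proxy for what is meant by "spacing" here, and it does not attempt to reproduce the count of eight; the function name is_spacing_mark is hypothetical.

```python
import unicodedata

def is_spacing_mark(ch):
    """True if the character's general category is Mc (spacing combining mark)."""
    return unicodedata.category(ch) == "Mc"

# VOWEL SIGN AA is displayed to the right of the base and takes up space;
# VOWEL SIGN I sits above the base and does not.
for ch in ("\u1A63", "\u1A65"):
    print(unicodedata.name(ch), unicodedata.category(ch), is_spacing_mark(ch))
```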

Combining marks used for vowels

ᨠᩥ ki U+1A20 LETTER HIGH KA + U+1A65 VOWEL SIGN I

Northern Thai uses the following dedicated combining marks for vowels. They may be used on their own, or in combination with others (see composite_vowels).

ᩥ␣ᩦ␣ᩧ␣ᩨ␣ᩩ␣ᩪ␣ᩮ␣ᩫ␣ᩰ␣ᩯ␣ᩬ␣ᩴ␣ᩳ␣ᩡ␣ᩣ␣ᩤ␣ᩢ␣␣ᩱ␣ᩲ

 ᩤ [U+1A64 TAI THAM VOWEL SIGN TALL AA​] and  ᩣ [U+1A63 TAI THAM VOWEL SIGN AA​] are both used to represent the same phoneme. The choice of which to use is a question of spelling: the taller version is typically used after the following consonants.

ᩅ ᨴ ᨵ ᨣ

Some textbooks also recommend its use after these characters.

ᨧ ᨻ ᩁ ᨽ

[U+1A62 TAI THAM VOWEL SIGN MAI SAT] is commonly used as a vowel, but it sometimes also indicates a final -k sound.

ᨯᩢᩬ

[U+1A74 TAI THAM SIGN MAI KANG] is used with many words to represent a syllable-final nasal such as -m (see finals), but it also functions as a vowel when it appears alone or as a component of a composite vowel.

ᨣᩴ

The sound it represents may be ambiguous. For instance, compare the example just above with the one below, where it fulfills the role of syllable-final consonant.

ᨶᩣᩴ

Pre-base vowel signs

ᨠᩮ keː U+1A20 LETTER HIGH KA + U+1A6E VOWEL SIGN E

ᩮ␣ᩰ␣ᩯ␣ ␣ᩱ␣ᩲ

Five vowel signs appear to the left of the base consonant letter or cluster.

ᨯᩱ᩶
A prebase vowel sign appears to the left of the consonant after which it is pronounced.

These combining marks are stored after the base consonant: the rendering process places the glyph before that of the base consonant. However, the Unicode Consortium is currently examining the coding model for Tai Tham. There is a possibility that pre-base vowel signs may be stored before the consonant in future. Also, some fonts already require this kind of handling, especially for dealing with complex combinations of characters.
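A simple check can catch text that was typed in visual order. The following Python sketch flags a pre-base vowel sign that has no preceding letter or mark to attach to. The set of five code points is an assumption based on the pre-base vowel signs E, AE, OO, AI and THAM AI, and the function name is hypothetical. It only detects the obvious case (a vowel sign at the start of the text or after a space or punctuation), not a vowel sign that has silently attached to the wrong consonant.

```python
import unicodedata

# Assumed set of pre-base vowel signs: E, AE, OO, AI, THAM AI.
PRE_BASE_VOWELS = frozenset("\u1A6E\u1A6F\u1A70\u1A71\u1A72")

def misplaced_pre_base_vowels(text):
    """Return indexes of pre-base vowel signs with nothing to attach to."""
    problems = []
    for i, ch in enumerate(text):
        if ch not in PRE_BASE_VOWELS:
            continue
        prev = text[i - 1] if i > 0 else ""
        # A combining mark needs a preceding letter (L*) or mark (M*).
        if not prev or unicodedata.category(prev)[0] not in ("L", "M"):
            problems.append(i)
    return problems

print(misplaced_pre_base_vowels("\u1A20\u1A6E"))  # logical order: KA + E
print(misplaced_pre_base_vowels("\u1A6E\u1A20"))  # visual order: E + KA
```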

Consonants pronounced as vowels

ᨠ᩠ᩅ kua U+1A20 LETTER HIGH KA + U+1A60 SIGN SAKOT + U+1A45 LETTER WA

The following are also involved in the production of vowel sounds.

᩠ᨿ␣᩠ᩅ␣ᩋ

The sequence ᩠ᩅ [U+1A60 TAI THAM SIGN SAKOT​ + U+1A45 TAI THAM LETTER WA] often represents a medial w – especially common after x or k but also occurring after a (dwindling) number of other consonants. However, when no other vowel signs follow (ie. when the inherent vowel is involved), it represents the diphthong ua rather than -wa.

Similarly, the sequence ᩠ᨿ [U+1A60 TAI THAM SIGN SAKOT​ + U+1A3F TAI THAM LETTER LOW YA] is pronounced as the diphthong ia when it appears alone after a consonant.

Both of these characters also appear as a component in some of the composite vowels described below.

[U+1A4B TAI THAM LETTER A] on its own represents the standalone version of the inherent vowel ʔa, and is used as a base for vowel signs when writing other standalone vowels (see standalone). However, it also makes an appearance as a vowel component in 2 composite vowels.

Composite vowels

ᨠᩮᩥᩬᩡ kɤʔ U+1A20 LETTER HIGH KA + U+1A6E VOWEL SIGN E​ + U+1A65 VOWEL SIGN I​ + U+1A6C VOWEL SIGN OA BELOW​ + U+1A61 VOWEL SIGN A​

This section lists vowel sounds represented by combinations of the above characters (this list is possibly incomplete).

Some represent plain vowel sounds:

ᩮᩢ␣ᩮᩡ␣ᩰᩡ␣ᩰᩫ␣ᩯᩢ␣ᩯᩡ␣ᩮᩥᩢ␣ᩮᩥᩬᩡ␣ᩮᩥ␣ᩮᩥᩬ␣ᩢᩬ␣ᩰᩬᩡ␣ᩬᩴ

The other composites represent diphthongs, which generally end in one of -a, -j or -w.

᩠ᨿᩮ␣᩠ᨿᩮᩡ␣ᩢ᩠ᨿ␣ᩮᩥᩢᩬ␣ᩮᩨᩢᩬ␣ᩮᩨᩬ␣ᩮᩨᩬᩋ␣ᩮᩨᩬᩋᩡ␣᩠ᩅᩫ␣᩠ᩅᩫᩡ␣᩠ᩅᩢ␣᩠ᩅ᩠ᨿ␣ᩱ᩠ᨿ␣ᩣ᩠ᨿ␣ᩮᩢᩣ␣ᩮᩢᩤ

The following list shows where vowel signs are positioned around a base consonant to produce vowels, and how many instances of each pattern there are. The figure after the + sign counts combinations of Unicode characters. The list includes subjoined WA and YA, and the postfixed letter A.

  • 5 pre-base, eg. ᨠᩮ keː
  • 3+1 post-base, eg. ᨠᩣ kaː
  • 7 superscript, eg. ᨠᩥ ki
  • 3+1 subscript, eg. ᨠᩩ ku
  • +4 sub+superscript, eg. ᨠᩬᩴ kɔ̱ŋ̊ kɔː
  • +1 sub+super+post-base, eg. ᨠ᩠ᩅᩫᩡ k˖w̱ɔ̈a kuaʔ
  • +1 sub+post-base, eg. ᨠ᩠ᩅ᩠ᨿ k˖w̱˖y̱ kuaj
  • +1 super+post-base, eg. ᨠᩢ᩠ᨿ ká˖ȳ kia-
  • +4 pre+post-base, eg. ᨠᩮᩡ keȧʷ keʔ
  • +2 pre+sub+superscript, eg. ᨠᩮᩬᩥ keɔ̱i kɤː
  • +1 pre+super+post-base, eg. ᨠᩮᩢᩣ keáā kaw
  • +1 pre+sub+post-base, eg. ᨠᩰᩬᩡ koɔ̱a kɔʔ
  • +2 pre+sub+super+post-base, eg. ᨠᩮᩬᩥᩋ keɔ̱iʔ̯ kɤː
  • +1 pre+post+post-base, eg. ᨠ᩠ᨿᩮᩡ k˖y̱eȧʷ kiaʔ
  • +1 pre+sub+super+post+post-base, eg. ᨠᩮᩬᩨᩋᩡ keɔ̄ɯ̄ʔ̯ȧʷ kɯa
  • +4 pre+superscript, eg. ᨠᩮᩢ keá ke-
  • +1 pre+super+superscript, eg. ᨠᩮᩥᩢ keiá kɤ-
  • +2 pre+sub+super+superscript, eg. ᨠᩮᩬᩨᩢ keɔ̄ɯ̄á kia-

Distribution of vowel elements is as follows:

 ᩢ  ᩫ  ᩳ  ᩴᩘ  
ᩮ ᩯ ᩱ ᩰ ᩲ ᩡ ᩅ ᩣ ᩤ ᩋ
   ᩩ  ᩪ  ᩬ ᩠ᨿ  
Locations where vowel elements can appear, including in complex vowels.

Vowel components can occur concurrently on 4 sides of the base, eg. ᩮᩬᩥᩡ.

Characters that don't appear in the combinations:

ᩦ␣ᩧ␣ᩩ␣ᩪ␣ᩲ␣ᩳ

Standalone vowels

In Northern Thai standalone vowel sounds can be written in 2 different ways.

Vowel signs

For vowels not preceded by a consonant, Northern Thai generally uses [U+1A4B TAI THAM LETTER A] with one or more vowel signs, eg. ᩋᩧ᩠ᨷ

Independent vowels

Some standalone vowels can be represented using a set of independent vowel letters. The set includes a consonant character which used alone represents the inherent vowel sound, but the list only covers a small number of possible vowel sounds.

ᩍ␣ᩎ␣ᩏ␣ᩐ␣ᩑ␣ᩋ

The 5 independent vowel letters are used in syllable-initial position for certain words, but for other words the base+vowel sign approach may be used.

ᩑᨠ

ᩋᩮ᩠ᨶ

Vowel sounds mapped to characters

This section maps Northern Thai vowel sounds to common graphemes in the Lanna orthography, where open indicates an open syllable, closed a closed syllable, and standalone a standalone vowel.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

For some diphthongs ending in -j or -w, Owen indicates that phonetic sequences exist, but offers no examples. Based on other examples, it is assumed here that -j is formed using sakot+ya, and -w using sakot+wa, except where the preceding vowel sign extends below the baseline (such as for uj).

Plain vowels

i
open/closed
 
standalone
open/closed
 
standalone
ɯ
open/closed
ɯː
open/closed
u
open/closed
 
standalone
open/closed
 
standalone
a
open

Inherent vowel

 
standalone

Diphthongs and other combinations

Tones

With the high/low categorisation of consonants, Northern Thai writing generally needs only the two combining tone marks below to indicate one of the possible phonetic tones.

᩵␣᩶

If there is a vowel over or below a consonant or consonant stack, the tone mark follows the vowel in storage, and is displayed above or alongside the vowel.

Otherwise, the tone is input after the consonant, ie. before a vowel sign that is displayed to the right or below, and appears over the consonant.

The default fonts used here expect the tone mark to be typed after a pre-base vowel sign, if there is one; after a vowel sign above the base, if there is one; and before a vowel sign to the right. Its position relative to a vowel sign below the base does not seem to matter. Noto agrees, except for pre-base vowel signs.

The following chart shows how to tell which tones are associated with a syllable.

Consonant Checked? Tone mark Tone
high checked short 2
long 3
open - 1
3
5
mid checked short 2
long 3
open - 2
3
5
2
1
6
low checked short 6
long 4
open - 2
4
6

 

Consonants

Each onset consonant is associated with a high, mid, or low class related to tone. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark.

Tai Tham has stacked consonants, but these do not necessarily indicate consonant clusters. The script is unusual in that any consonant in a stack can retain its inherent vowel, or be associated with a vowel sign. The sakot, which produces stacks, is never visible.

Stacks can span word boundaries.

Syllable-initial clusters use 2 dedicated code points for the medial l, and a subjoined letter for medial w.

Syllable-final consonant sounds can be written using 6 special diacritics, but otherwise use ordinary letters, which may or may not be subjoined depending on the context.

For a mapping of sounds to graphemes see consonant_mappings.

Basic consonants

The lists below show consonants in the Northern Thai repertoire. The letters h, m, and l indicate the class of the consonant. This list includes some sequences to indicate high class forms when there is no single letter for that. Where 2 pronunciations are given, the first is for syllable-initial, and the second for syllable-final use.

ᨸ␣ᨻ␣ᨷ␣ᨲ␣ᨭ␣ᨴ␣ᨯ␣ᨠ␣ᨣ␣ᩋ
ᨹ␣ᨽ␣ᨳ␣ᨮ␣ᨵ␣ᨰ
ᨧ␣ᨩ
ᨺ␣ᨼ␣ᩈ␣ᩆ␣ᩇ␣ᨨ␣ᨪ␣ᨫ␣ᨡ␣ᨢ␣ᨥ␣ᨤ␣ᩉ␣ᩌ
ᩉ᩠ᨾ␣ᨾ␣ᩉ᩠ᨶ␣ᨶ␣ᨱ␣ᨬ␣ᨿ␣ᨦ
ᩅ␣ᩉᩖ␣ᩁ␣ᩃ␣ᩊ␣ᩀ

ʨʰ is not a native Northern Thai sound, but is associated with reading the alphabet out loud and with learned pronunciation of Pali loanwords. (Owen, 142)

A few consonants have different phonetic realisations in Tai Khün, and [U+1A22 TAI THAM LETTER HIGH KXA] is not used by Tai Khün.

High class nasals & liquids with HA

High and low consonants usually come in pairs, but where they don't the high variant is normally given by subjoining the low consonant below [U+1A49 TAI THAM LETTER HIGH HA].

ᩉ᩠ᨶᩧ᩵ᨦ

These combinations are included in the charts above.

The letter A

[U+1A4B TAI THAM LETTER A] represents a glottal stop.

It can be used with vowels at the beginning of a syllable, or on its own to indicate a standalone sound corresponding to the inherent vowel (see standalone).

ᩋᩧ᩠ᨷ

ᩋᩉ᩠ᨿᩢᨦ

It has very different shapes in Northern Thai text and Khün text.

Special consonants

ᩛ␣ᩓ␣ᩔ

The first of these is a special-use consonant diacritic. The second two are ligatures.

[U+1A5B TAI THAM CONSONANT SIGN HIGH RATHA OR LOW PA] represents two different functions with the same appearance. It represents [U+1A2E TAI THAM LETTER HIGH RATHA] in ᩈᨱᩛᩣ᩠ᨶ sṇ̱ᵽā˖ṉ shape, and it represents [U+1A3B TAI THAM LETTER LOW PA] in ᩋᨾᩛ ʔ̯m̱ᵽ mango. Compare with the somewhat rare subjoined form, eg. ᨷᩢᨱ᩠ᨻᨷᩩᩁᩩᩇ b̯áṇ̱˖p̄b̯uruṣ disciple.

[U+1A53 TAI THAM LETTER LAE] represents the combination ᩃᩯ [U+1A43 TAI THAM LETTER LA + U+1A6F TAI THAM VOWEL SIGN AE], eg. ᩈᩮᩓ᩠ᩅ᩶

[U+1A54 TAI THAM LETTER GREAT SA] represents geminated [U+1A48 TAI THAM LETTER HIGH SA].

Subjoined consonants

Tai Tham is unusual in that subjoined consonants do not only appear where there are consonant clusters. There is a natural tendency to attempt to stack consonants, usually 2 high, whenever possible.

U+1A60 TAI THAM SIGN SAKOT​ is the (always) invisible character used to produce the subjoined form of a consonant, eg. compare the following:

ᨠᨠ kk [U+1A20 TAI THAM LETTER HIGH KA + U+1A20 TAI THAM LETTER HIGH KA]

ᨠ᩠ᨠ kk [U+1A20 TAI THAM LETTER HIGH KA + U+1A60 TAI THAM SIGN SAKOT​ + U+1A20 TAI THAM LETTER HIGH KA]

Sakot doesn't always kill the inherent vowel between two consonants, nor does it create conjuncts, in the sense of merged shapes, but subjoined forms of consonants typically have a different and smaller shape compared to the standard form.

Sakot can follow a vowel sign. For example, in the following word the sakot is used to position the final consonant in the syllable below the vowel sign. This is quite common.

ᩈᩣ᩠ᨾ
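The behaviour of the sakot can be sketched in a few lines of Python. The helper names subjoin and stacked_pairs are hypothetical, and the second escape sequence is assumed to correspond to the word just shown (HIGH SA + VOWEL SIGN AA + SAKOT + MA).

```python
SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT, the invisible stacker

def subjoin(upper, lower):
    """Return the stored sequence that places `lower` beneath `upper`."""
    return upper + SAKOT + lower

def stacked_pairs(text):
    """Return (previous character, subjoined character) for each SAKOT.

    The previous character is whatever is stored immediately before the
    sakot, which may be a vowel sign rather than a consonant.
    """
    return [
        (text[i - 1], text[i + 1])
        for i, ch in enumerate(text)
        if ch == SAKOT and 0 < i < len(text) - 1
    ]

# KA + KA is two letters side by side; KA + SAKOT + KA is a stack.
print(len("\u1A20\u1A20"), len(subjoin("\u1A20", "\u1A20")))

# In HIGH SA + AA + SAKOT + MA, the sakot follows a vowel sign.
print(stacked_pairs("\u1A48\u1A63\u1A60\u1A3E"))
```

Because the sakot is never visible, counting user-perceived units by code point overstates the length of a stacked sequence by one for every stack.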

A subjoined consonant can also follow a digit.

᪓᩠ᨴ

Subjoined consonants are not only syllable-final consonants. The first consonant in a following syllable may also be subjoined, eg. in the following word, where final r is pronounced as n. (Unicode, 654)

ᨳ᩠ᨶᩫ᩻ᩁ

This list shows consonants in their normal and subjoined forms. Not all consonants traditionally have subjoined forms, but modern innovations in borrowed terminology suggest that fonts should provide them for all consonants except the old vocalic letters. (Unicode, 654) You may find that the font applied here doesn't handle all combinations well.

high class letters
ᨸ᩠ᨸ␣ᨹ᩠ᨹ␣ᨲ᩠ᨲ␣ᨭ᩠ᨭ␣ᨳ᩠ᨳ␣ᨮ᩠ᨮ␣ᨠ᩠ᨠ␣ᨧ᩠ᨧ␣ᨺ᩠ᨺ␣ᩈ᩠ᩈ␣ᩆ᩠ᩆ␣ᩇ᩠ᩇ␣ᨨ᩠ᨨ␣ᨡ᩠ᨡ␣ᨢ᩠ᨢ␣ᩉ᩠ᩉ␣ᩀ᩠ᩀ
mid class letters
ᨷ᩠ᨷ␣ᨯ᩠ᨯ␣ᩋ᩠ᩋ
low class letters
ᨻ᩠ᨻ␣ᨽ᩠ᨽ␣ᨴ᩠ᨴ␣ᨵ᩠ᨵ␣ᨰ᩠ᨰ␣ᨣ᩠ᨣ␣ᨩ᩠ᨩ␣ᨼ᩠ᨼ␣ᨪ᩠ᨪ␣ᨫ᩠ᨫ␣ᨥ᩠ᨥ␣ᨤ᩠ᨤ␣ᩌ᩠ᩌ␣ᨾ᩠ᨾ␣ᨶ᩠ᨶ␣ᨱ᩠ᨱ␣ᨬ᩠ᨬ␣ᨿ᩠ᨿ␣ᨦ᩠ᨦ␣ᩅ᩠ᩅ␣ᩁ᩠ᩁ␣ᩃ᩠ᩃ␣ᩊ᩠ᩊ

[U+1A7B TAI THAM SIGN MAI SAM] is used in Northern Thai to identify double-acting consonants, or to indicate that a subjoined consonant begins a new syllable, eg. compare the following (where final r is pronounced as n).

ᨳᩫ᩠ᨶᩁ tʰo˖ṉṟ tʰonra

ᨳ᩠ᨶᩫ᩻ᩁ tʰo˖ṉʻṟ tʰanon

(It is also used to repeat a word.)

Onset consonants

The following are used to represent the second consonant in syllable-initial clusters.

ᩕ␣ᩖ␣᩠ᩅ

[U+1A55 TAI THAM CONSONANT SIGN MEDIAL RA​] after a stop generally produces aspiration, or converts the sound to x, but it may also be pronounced -l.

ᨠᩕᩣ᩠ᨷ

ᨣᩕᩢ᩠ᨷ

[U+1A56 TAI THAM CONSONANT SIGN MEDIAL LA​] is commonly not pronounced, however it is also found in the combination ᩉᩖ [U+1A49 TAI THAM LETTER HIGH HA + U+1A56 TAI THAM CONSONANT SIGN MEDIAL LA​] which creates a high class letter with the sound l.

ᨠᩖᩣ᩶

ᩉᩖᩢᨠ

A medial -w also occurs, but there is no dedicated character for it. Instead it is produced using an ordinary WA which is subjoined using the sakot, ie. ᩠ᩅ [U+1A60 TAI THAM SIGN SAKOT + U+1A45 TAI THAM LETTER WA]. Such clusters are generally limited to kw and xw; other combinations are occasionally found, but they appear to be falling out of use. (Wikipedia)

ᨣ᩠ᩅᩣ᩠ᨿ

Other syllable-initial clusters include the combination of [U+1A49 TAI THAM LETTER HIGH HA] plus a subjoined low class consonant to make the consonant high class (see highclass). These combinations are not pronounced as multiple consonants.

Final consonants

Northern Thai text commonly renders syllable-final consonants using regular consonant code points (see the example just below), but sometimes special combining characters are used.

ᩑᨠ

Stacking

When regular consonants are used they are commonly subjoined, eg.

ᨠᩣ᩠ᩁ

There are, however, exceptions. For example, when preceded by a subscript vowel a final consonant may be rendered on the baseline, eg.

ᩃᩪᨠ

Northern Thai tends to add sub-base vowels below a consonant stack, whereas Khün typically shifts the vowel to the right of the stack (see fig_kiss).

ᨧ᩠ᨷᩪ ᨧ᩠ᨷᩪ
Positioning of U vowel sign after a stack. In Northern Thai (right) it commonly occurs below the stack, whereas in Khün (left) it tends to be moved to the side.

Due to font design or USE (the Universal Shaping Engine) the characters may have to be typed in an order that departs from the spoken order so that they look as expected. For example, the word in fig_kiss is stored as CCV, whereas it is pronounced CVC.
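One way to inspect the stored order is to reduce a string to a skeleton of consonants, stackers and vowel signs. The following is a minimal Python sketch; the function name skeleton and the C, V, + notation are illustrative assumptions, and the escape sequence is assumed to match the word shown above (HIGH CA + SAKOT + BA + VOWEL SIGN UU).

```python
import unicodedata

SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT

def skeleton(text):
    """Map each stored character to C (letter), V (vowel sign), + (sakot) or ?."""
    out = []
    for ch in text:
        if ch == SAKOT:
            out.append("+")
        elif "VOWEL SIGN" in unicodedata.name(ch, ""):
            out.append("V")
        elif unicodedata.category(ch) == "Lo":
            # Note: independent vowel letters are also category Lo.
            out.append("C")
        else:
            out.append("?")
    return "".join(out)

# Stored as consonant, stacker, consonant, vowel: the vowel comes last
# in memory even though it is spoken between the two consonants.
print(skeleton("\u1A27\u1A60\u1A37\u1A6A"))
```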

Combining marks

The following diacritics are sometimes used for syllable-final consonants.

ᩴ␣ᩙ␣ᩘ␣ᩢ␣ᩝ␣ᩞ

  [U+1A74 TAI THAM SIGN MAI KANG​] may be used as a vowel, or to represent a syllable-final nasal. The use is sometimes ambiguous (see combiningvowels).

[U+1A58 TAI THAM SIGN MAI KANG LAI] can also be used to represent a syllable-final nasal. Note that this diacritic has a very different shape in the Khün orthography. Compare ᩅᩘ and ᩅᩘ.

[U+1A62 TAI THAM VOWEL SIGN MAI SAT] is commonly used as a vowel, but it also sometimes functions to indicate a final -k sound, eg. ᨯᩢᩬ

[U+1A5D TAI THAM CONSONANT SIGN BA] and [U+1A5E TAI THAM CONSONANT SIGN SA] appear to be alternative shapes for the normal subjoined consonants, used per writer preference.

Silencer

[U+1A7A TAI THAM SIGN RA HAAM] is used in Northern Thai to silence one or more characters in a word. It is not always clear which sound or sounds are cancelled.

ᨵᨾ᩠ᨾ᩺

ᨼᩥᩃ᩠ᨾ᩺

In Lü it is used as a final n; in Khün it is used as a final r.

Consonant to script mapping

This section maps Northern Thai consonant sounds to common graphemes in the Lanna orthography, where h indicates high class, m is mid class, l is low class, and f indicates a final consonant.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

Stops

p
high

  (᩠ᨸ) [U+1A38 TAI THAM LETTER HIGH PA]

 
low

  ᩠ᨻ [U+1A3B TAI THAM LETTER LOW PA]

[U+1A5B TAI THAM CONSONANT SIGN HIGH RATHA OR LOW PA] is commonly used for the subjoined form of [U+1A3B TAI THAM LETTER LOW PA]. (Source: Wiktionary, https://en.wiktionary.org/wiki/%E1%A8%BB#Translingual)

b
mid

  ᩠ᨷ [U+1A37 TAI THAM LETTER BA] when syllable-initial

[U+1A5D TAI THAM CONSONANT SIGN BA] is an optional alternative to the normal subjoined form of [U+1A37 TAI THAM LETTER BA]

high

  ᩠ᨹ [U+1A39 TAI THAM LETTER HIGH PHA] when syllable initial

 
low
d
mid

  ᩠ᨯ [U+1A2F TAI THAM LETTER DA]

k
high

  ᩠ᨠ [U+1A20 TAI THAM LETTER HIGH KA] .

 
low
 
final

  ᩠ᨠ [U+1A20 TAI THAM LETTER HIGH KA]

  ᩠ᨣ [U+1A23 TAI THAM LETTER LOW KA]

[U+1A62 TAI THAM VOWEL SIGN MAI SAT] Also used as a vowel.

 
final

Pronounced but not written after a short, open vowel.

Affricates

t͡ɕ
high

  ᩠ᨧ [U+1A27 TAI THAM LETTER HIGH CA] .

 
low

Fricatives

Nasals

Other

 
low

  ᩠ᩅ [U+1A45 TAI THAM LETTER WA]

 
final

  ᩠ᩅ [U+1A45 TAI THAM LETTER WA]

As part of a diphthong, this is typically rendered using the subjoined form. See consonant_vowels.

 
low

  ᩠ᩃ [U+1A43 TAI THAM LETTER LA]

  ᩠ᩊ [U+1A4A TAI THAM LETTER LLA] 

  ᩠ᩁ [U+1A41 TAI THAM LETTER RA]

In ligature for lɛː, [U+1A53 TAI THAM LETTER LAE]

j
high
 
final

  ᩠ᨬ [U+1A2C TAI THAM LETTER NYA] 

ᨿ   ᩠ᨿ [U+1A3F TAI THAM LETTER LOW YA]

As part of a diphthong, this is typically rendered using the subjoined form. See consonant_vowels.

Encoding choices

A number of questions need to be addressed with regards to ordering of characters in composite vowels and stacks. These have been discussed by Unicode experts, but no conclusions have yet been reached. Here we will list some examples.

A number of choices made for this page are enforced by the Hariphunchai font used here; other fonts allow different choices.

Order of YA

A first example concerns the order of vowel components that include a subjoined YA. The following produce identical output for the composite vowel pronounced ia in the Payap Lanna, Lamphun, and Hariphunchai fonts (but not in the Noto Sans Tai Tham font).

ᨠ᩠ᨿᩮ kia [KA + SAKOT​ + YA + E​]

ᨠᩮ᩠ᨿ kia [KA + E​ + SAKOT​ + YA]

In this case, the first alternative, which subjoins the YA to the KA and follows it with the E vowel sign, seems the more intuitive, since the diphthong produced is ia.

However, change the sound to aj and the above logic would lean towards the second of the two following encodings, which also produce identical results.

ᨠ᩠ᨿᩢ kaj [KA + SAKOT​ + YA + MAI SAT​]

ᨠᩢ᩠ᨿ kaj [KA + MAI SAT​ + SAKOT​ + YA]

This is somewhat unusual for Unicode, since it involves an invisible stacker appearing after a vowel mark (there is no other way of producing the right shape for the semivowel j).
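Because the alternative orderings look identical but are different code point sequences, plain string comparison and search will treat them as different text. The Python sketch below shows that Unicode normalization does not unify the two ia encodings, and then applies a purely illustrative folding step for matching purposes. The function names and the choice of target order are assumptions made for this example, not a recommendation from the Unicode discussions.

```python
import unicodedata

KA, SAKOT, YA, E = "\u1A20", "\u1A60", "\u1A3F", "\u1A6E"

ya_first = KA + SAKOT + YA + E   # KA + SAKOT + YA + E
e_first = KA + E + SAKOT + YA    # KA + E + SAKOT + YA

def same_after_nfc(a, b):
    """Compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def fold_ya_order(text):
    """Hypothetical folding for matching only: E + SAKOT + YA -> SAKOT + YA + E."""
    return text.replace(E + SAKOT + YA, SAKOT + YA + E)

print(same_after_nfc(ya_first, e_first))                 # normalization alone
print(fold_ya_order(ya_first) == fold_ya_order(e_first)) # after folding
```

A folding step like this would only be appropriate for comparison keys; the stored text still has to follow whatever order the font in use expects.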

Subjoined OA

The following 2 sequences produce identical visual results.

ᨠᩴ᩠ᩋ kɔː(?) [HIGH KA + MAI KANG​ + SAKOT​ + A]

ᨠᩬᩴ kɔː(?) [HIGH KA + VOWEL SIGN OA BELOW​ + MAI KANG​]

There appears to be some question about which is the appropriate sequence for the composite vowel -ɔː.

Symbols

Logographs

᪠␣᪡␣᪢

The meaning of each of the logographs is shown above. Unicode classes these symbols as punctuation.

Cryptography

᩿

 ᩿ [U+1A7F TAI THAM COMBINING CRYPTOGRAMMIC DOT] is used singly or multiply beneath letters to give each letter a different value according to some hidden agreement between reader and writer. (Unicode, 665)

Numbers

Digits

᪀␣᪁␣᪂␣᪃␣᪄␣᪅␣᪆␣᪇␣᪈␣᪉
᪐␣᪑␣᪒␣᪓␣᪔␣᪕␣᪖␣᪗␣᪘␣᪙

Two sets of digits are in common use: a secular set (Hora) and an ecclesiastical set (Tham). European digits are also found in books. (Unicode, 665)
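Both digit sets are encoded as decimal digits, so their numeric values can be read from the Unicode character database. The following Python sketch converts a run of digits from either set to an integer; the function name tai_tham_number is hypothetical.

```python
import unicodedata

def tai_tham_number(text):
    """Convert a run of Unicode decimal digits (Hora or Tham) to an int.

    Raises ValueError if any character is not a decimal digit.
    """
    return int("".join(str(unicodedata.digit(ch)) for ch in text))

print(unicodedata.name("\u1A80"))  # first digit of the Hora set
print(unicodedata.name("\u1A90"))  # first digit of the Tham set
print(tai_tham_number("\u1A81\u1A82\u1A83"))
print(tai_tham_number("\u1A91\u1A99"))
```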

Text direction

Northern Thai text runs left to right in horizontal lines.

Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Northern Thai character app.

The orthography has no case distinction, and no special transforms are needed to convert between characters.

Since there are no conjuncts, there is not so much contextual shaping in Tham as in many other Brahmi-descended scripts.

Glyph styles

In addition to the regular differences in shape of glyphs in Northern Thai and Tai Khün, the shapes of certain glyphs in Northern Thai texts may also vary, depending on the region or source.

Regional variation in the shape of subjoined NA is evident in manuscripts from Xishuangbanna (Yunnan Province, China), Keng Tung (Shan State, Myanmar), and Chiangmai (Thailand). (Source: Trager)

By way of a further example, the Payap Lanna and Hariphunchai fonts differ not only in styling: some glyphs are substantially different. The following table shows glyph shapes for various characters in both fonts.

Hariphunchai
Payap Lanna

Context-based shaping & positioning

Northern Thai text relies on rules to correctly position glyphs and shape them according to the surrounding text.

One major area where this applies is in the use of subjoined forms for consonant stacks (see clusters). Many of the subjoined forms of a letter are substantially different and/or smaller than the normal letter glyph, but the character in memory is the same.

᩠ᨾ   ᨾ
Standard and subjoined forms of the letter MA.

The following is a selection of other examples of contextual shaping and positioning.

Placement of tone marks may involve special shaping and positioning. In some fonts a tone mark is displayed alongside a superscript vowel sign, rather than above it.

ᨠᩥ᩵ ᨠᩥ᩶
Positioning of tone marks next to a superscript vowel in the Hariphunchai font.

A number of code point sequences may be ligated by a font.

ᨶ ᩣ ᨶᩣ
ᨬ ᨬ ᨬ᩠ᨬ
Examples of ligated forms for code point sequences.

Font styling & weight

tbd

Graphemes

Grapheme clusters

tbd

Punctuation & inline features

Word boundaries

Spaces separate phrases. There is no separation of individual words.

A new word may start with a subjoined consonant. Stacking is performed across word boundaries. This means that operations such as line-breaking, word highlighting, etc. have to use an orthographic syllable unit which differs from the underlying phonetic syllables.

Phrase & section boundaries

᪨␣᪩␣᪪␣᪫␣?␣!␣᪣␣᪤␣᪥␣᪭␣᪦␣᪬

Northern Thai uses a variety of native punctuation, and only a couple of ASCII code points.

phrase

[U+0020 SPACE]

[U+1AA8 TAI THAM SIGN KAAN]

[U+1AA9 TAI THAM SIGN KAANKUU]

[U+1AAA TAI THAM SIGN SATKAAN]

[U+1AAB TAI THAM SIGN SATKAANKUU]

sentence

? [U+003F QUESTION MARK]

! [U+0021 EXCLAMATION MARK]

opening section

[U+1AA3 TAI THAM SIGN KEOW]

[U+1AA4 TAI THAM SIGN HOY]

[U+1AA5 TAI THAM SIGN DOKMAI]

[U+1AAD TAI THAM SIGN CAANG]

᪩᪥᪩ [U+1AA9 TAI THAM SIGN KAANKUU + U+1AA5 TAI THAM SIGN DOKMAI + U+1AA9 TAI THAM SIGN KAANKUU]

 ᪭ᩣ [U+1AAD TAI THAM SIGN CAANG + U+1A63 TAI THAM VOWEL SIGN AA]

closing section

[U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA]

[U+1AAC TAI THAM SIGN HANG]

᪦᪦᪩ [U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA + U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA + U+1AA9 TAI THAM SIGN KAANKUU]

 ᪩᪦᪩ [U+1AA9 TAI THAM SIGN KAANKUU + U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA + U+1AA9 TAI THAM SIGN KAANKUU]

᪩᪦᪩᪬ [U+1AA9 TAI THAM SIGN KAANKUU + U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA + U+1AA9 TAI THAM SIGN KAANKUU + U+1AAC TAI THAM SIGN HANG]

 ᪦᪦᪬ [U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA + U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA + U+1AAC TAI THAM SIGN HANG]

The following punctuation marks have "progressive values of finality".

  1. [U+1AA8 TAI THAM SIGN KAAN]
  2. [U+1AA9 TAI THAM SIGN KAANKUU]
  3. [U+1AAA TAI THAM SIGN SATKAAN]
  4. [U+1AAB TAI THAM SIGN SATKAANKUU] 
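
The ordering above can be captured as a simple lookup. This is an illustrative Python sketch only: the names `FINALITY` and `strongest_break` are hypothetical, and the numeric ranks simply mirror the list positions rather than any value defined by Unicode.

```python
# Rank of each mark, taken from its position in the list above.
FINALITY = {
    "\u1AA8": 1,  # TAI THAM SIGN KAAN
    "\u1AA9": 2,  # TAI THAM SIGN KAANKUU
    "\u1AAA": 3,  # TAI THAM SIGN SATKAAN
    "\u1AAB": 4,  # TAI THAM SIGN SATKAANKUU
}


def strongest_break(text: str) -> int:
    """Return the highest finality rank found in text, or 0 if none occurs."""
    return max((FINALITY.get(ch, 0) for ch in text), default=0)
```

A text-processing tool could use such a rank to decide, for example, where to split long passages into smaller units.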

European punctuation marks, such as the question mark and exclamation mark, are also used.

[U+1AA3 TAI THAM SIGN KEOW], [U+1AA4 TAI THAM SIGN HOY], [U+1AA5 TAI THAM SIGN DOKMAI], and [U+1AAD TAI THAM SIGN CAANG] are all used as section starters, sometimes in conjunction with other punctuation, eg.

᪩᪥᪩

᪭ᩣ

To close a section, use [U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA] and/or [U+1AAC TAI THAM SIGN HANG], eg.

᪦᪦᪩

᪩᪦᪩

᪩᪦᪩᪬

᪦᪦᪬
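
As a rough illustration, a run of punctuation can be classified as a section opener or closer by looking for the characters named above. This Python sketch assumes that a run never mixes opener and closer characters, which holds for the examples shown here but is not stated as a rule; the names `SECTION_OPENERS`, `SECTION_CLOSERS` and `section_role` are hypothetical.

```python
SECTION_OPENERS = {"\u1AA3", "\u1AA4", "\u1AA5", "\u1AAD"}  # KEOW, HOY, DOKMAI, CAANG
SECTION_CLOSERS = {"\u1AA6", "\u1AAC"}  # REVERSED ROTATED RANA, HANG


def section_role(marks: str) -> str:
    """Classify a run of punctuation as 'opening', 'closing' or 'other'."""
    chars = set(marks)
    if chars & SECTION_CLOSERS:
        return "closing"
    if chars & SECTION_OPENERS:
        return "opening"
    return "other"
```

Accompanying characters such as KAANKUU or VOWEL SIGN AA do not affect the result, since only the opener and closer characters are tested.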

Bracketed text

(␣)

Northern Thai commonly uses ASCII parentheses to insert parenthetical information into text.

          start                        end
standard  ( [U+0028 LEFT PARENTHESIS]  ) [U+0029 RIGHT PARENTHESIS]

Quotations & citations

“␣”

Northern Thai texts use quotation marks around quotations. Because of keyboard design, quotations may also be surrounded by ASCII double or single quote marks.

         start                                  end
initial  “ [U+201C LEFT DOUBLE QUOTATION MARK]  ” [U+201D RIGHT DOUBLE QUOTATION MARK]

Emphasis

tbd

Abbreviation, ellipsis & repetition

Repetition

[U+1AA7 TAI THAM SIGN MAI YAMOK] indicates reduplication of the preceding word, eg.

ᨴᩩᨠᪧ

ᩃᩡᩋᩬ᩵ᩁᪧ

Adverbs are often derived by reduplicating an adjective.

[U+1A7B TAI THAM SIGN MAI SAM] is also used in Northern Thai to indicate repetition of a word, eg.

ᨲᩣ᩠᩵᩻ᨦ
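
Because both repetition marks are ordinary code points in the text stream, they are easy to locate programmatically. The sketch below is a hedged Python example with a hypothetical helper name, `repetition_marks`. It only reports where the marks occur; it does not try to expand the repeated word, since that would require knowing where the word begins.

```python
MAI_YAMOK = "\u1AA7"  # TAI THAM SIGN MAI YAMOK, follows the repeated word
MAI_SAM = "\u1A7B"    # TAI THAM SIGN MAI SAM, a combining mark inside the word


def repetition_marks(text: str) -> list[tuple[int, str]]:
    """Return (index, mark name) for each repetition mark found in text."""
    names = {MAI_YAMOK: "MAI YAMOK", MAI_SAM: "MAI SAM"}
    return [(i, names[ch]) for i, ch in enumerate(text) if ch in names]
```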

Inline notes & annotations

tbd

Other punctuation

tbd

Other inline text decoration

tbd

Line & paragraph layout

Line breaking & hyphenation

There are no spaces between words in Northern Thai to serve as line-break opportunities. Lines must be broken at orthographic syllable boundaries. Since the onset consonant of a word or syllable may be subjoined below a previous consonant, and stacks must not be broken, orthographic syllable units typically don't match phonetic syllables or words.

In-word line-breaking

Lines are often broken inside a word: because stacks can span word boundaries and cannot be split, a break cannot always fall between words. No hyphen or other character is used to indicate that a word was broken.
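
The constraint that stacks must not be broken can be expressed as a filter on candidate break positions. The following Python sketch is an assumption-laden illustration, not a line-breaking algorithm: it only rules out positions that would split a combining mark from its base or separate a subjoined consonant from U+1A60 TAI THAM SIGN SAKOT. The positions it returns are merely not forbidden by stacking; finding real orthographic syllable boundaries still needs the kind of analysis described in this section. The name `stack_safe_breaks` is hypothetical.

```python
import unicodedata

SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT, marks the next consonant as subjoined


def stack_safe_breaks(text: str) -> list[int]:
    """Return indexes i where breaking before text[i] does not split a stack."""
    breaks = []
    for i in range(1, len(text)):
        if unicodedata.category(text[i]).startswith("M"):
            continue  # never separate a combining mark from its base
        if text[i - 1] == SAKOT:
            continue  # the consonant after SAKOT belongs to the stack
        breaks.append(i)
    return breaks
```

For the sequence NA + VOWEL SIGN AA + NYA + SAKOT + NYA, the only position returned is the one before the first NYA.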

Line-edge rules

As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.

The following list gives examples of typical behaviours for some of the characters used in modern Northern Thai. Context may affect the behaviour of some of these and other characters. Most of the Northern Thai characters, including native punctuation, will prevent a line break before or after, and require morphological analysis to determine break opportunities, in a way similar to a hyphenation algorithm. No break opportunities will be found otherwise. Complex context analysis, often involving dictionary lookup of some form, is required to determine non-emergency line breaks.

  • “ ‘ (   should not be the last character on a line.
  • ” ’ ) . , ; ! ? । ॥ %   should not begin a new line.

Line breaking should not move a danda or double danda to the beginning of a new line even if they are preceded by a space character.
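
A layout checker could test already-broken lines against the two rules listed above. This Python sketch is illustrative only: the character sets are copied from the two bullet points as they appear on this page, and the names `NO_LINE_END`, `NO_LINE_START` and `line_edge_violations` are hypothetical. A real implementation would normally rely on the Unicode line-break properties rather than hard-coded sets.

```python
# Characters copied from the two bullet points above.
NO_LINE_END = set("\u201C\u2018(")
NO_LINE_START = set("\u201D\u2019).,;!?\u0964\u0965%")


def line_edge_violations(lines: list[str]) -> list[tuple[int, str]]:
    """Return (line index, 'start' or 'end') for each line-edge rule broken."""
    problems = []
    for n, line in enumerate(lines):
        stripped = line.strip()  # ignore spaces at the line edges
        if not stripped:
            continue
        if stripped[0] in NO_LINE_START:
            problems.append((n, "start"))
        if stripped[-1] in NO_LINE_END:
            problems.append((n, "end"))
    return problems
```

Stripping whitespace first means that a mark preceded only by a space is still reported as starting the line.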

Text alignment & justification

tbd

Text spacing

tbd

Baselines, line height, etc.

Northern Thai uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Northern Thai places vowel and tone marks above base characters, one above the other, and can also add combining characters below the line. It also stacks characters, though stacks are usually limited in height. The complexity of the text means that the vertical resolution needed for clearly readable Northern Thai text is higher than for English, or most Latin text.

To give an approximate idea, the figure below compares Latin and Northern Thai glyphs from the Payap Lanna font. The basic height of Northern Thai letters is typically around the Latin x-height; however, extenders and combining marks reach well beyond the Latin ascenders and descenders, creating a need for larger line spacing.

Hhqxᨾᩮᩥᩬᨦᨯᩱ᩶ᨠᩮᩨᩢᩬᨸᩃᩮᩥᩢ᩠ᨠᨣᩤᩴᨼ᩺ᩋᩧ᩠ᨷᨧ᩠ᨷᩪ᪆᪢ᨬ᩠ᨬ
Font metrics for Latin text compared with Northern Thai glyphs in the Payap Lanna font.

Counters, lists, etc.

tbd

Page & book layout

This section is for any features that are specific to Northern Thai and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

References

Acknowledgements

Many thanks are due to Richard Wordingham and Patrick Chew for reviewing the initial draft of this material and sending suggestions.