Updated 10 July, 2020
This page gathers together basic information about the Tai Tham script and its use for the Khün language. It aims (generally) to provide an overview of the orthography and typographic features, and (specifically) to advise how to write Khün using Unicode.
See also the companion document, Tai Tham character notes, for detailed information about specific Unicode characters.
Phonetic transcriptions on this page should be treated as an approximate guide only. Many are more phonemic than phonetic, and there may be variations depending on the source of the transcription.
ᨡᩳ᩶ 1 ᨣᩢ᩠ᨶᩉᩮᩖᩨᨠᩥ᩠ᨶ ᨣᩢᩐᩢᩣᨡᩣ᩠ᨿᨸᩮ᩠ᨶᨦᩫ᩠ᨶ ᨠᩮ᩠ᨷᩉᩬᨾᩋᩬᨾᩅᩱᩢᨯ᩠᩶ᨦᨶᩦ᩶ ᨴᩩᨠᪧᨸᩦᨾᩣᨷᩢᨡᩣ᩠ᨯ ᨧᩥ᩠᩵ᨦᨠ᩠ᨴᩣᩴᩉᩨ᩶ᨡᩮᩢᩣᨻᩳ᩵ᨾᩯ᩵ᩃᩪᨠ ᨷᩢᨯᩱᩢᨠᩢᩢ᩠ᨶᩈᩢ᩠ᨦᩈᩢ᩠ᨠᨩᩮᩨᩬ.
ᨡᩳ᩶ 2 ᨴᩩᨠᨤ᩠ᨶᩫᨾᩦᩈᩥᨴ᩠ᨵᩥᩓᩢᨻ᩠ᨦᩈᩁᨽᩣ᩠ᨷ ᨲᩣ᩠ᨾᨴᩦ᩵ᨯᩱᩢᨠ᩵ᩣ᩠ᩅᩅᩱᩢᨶᩱᨡᩳ᩶ᨠᨲᩥᨠᩣᩋ᩠ᨶᩢᨶᩦ᩶ ᨯᩰ᩠ᨿᨷᩢᨯᩱᩢᨲᩯ᩠ᨠᨲ᩵ᩣ᩠ᨦᨠ᩠ᨶᩢ ᨷᩢᩅᩤ᩵ᩋ᩠ᨶᩢᨯᩱ ᩉᩮ᩠ᨾᩨᩁᩅᩤ᩵ ᨩᩮ᩠ᩋᩨ᩶ᨩᩣ᩠ᨲ ᨹ᩠ᩅᩥ ᨻ᩠ᨿᨯ ᨽᩣᩈᩣ ᩈᩣᩈᨶᩣ ᨣ᩠ᩅᩣ᩠ᨾᨣ᩠ᨯᩧᩉ᩠ᨶᩢᨴᩤ᩠ᨦᨠᩣ᩠ᩁᨾᩮ᩠ᨦᩨ ᩉᩕᩨᨴᩤ᩠ᨦᩋ᩠ᨶᩨ᩵ᪧ ᨩᩣ᩠ᨲᨩᩮ᩠ᩋᩨ᩶ ᩈ᩠ᨦᩢᨣ᩠ᨾᩫ ᩈᩫ᩠ᨾᨷ᩠ᨲᩢ ᨩᩣ᩠ᨲᨠᩮ᩠ᨯᩨ ᩉᩕᩨᩈᨳᩣᨶᩋ᩠ᨶᩨ᩵ᪧ ᨳᩯ᩠ᨦᩢᨷᩕᨠᩣ᩠ᩁᩉ᩠ᨶᩧ᩵ᨦ ᨷᩢᨾᩦᨣ᩠ᩅᩣ᩠ᨾᨲᩯ᩠ᨠᨲ᩵ᩣ᩠ᨦᨠ᩠ᨶᩢ ᨷᩢᩅᩤ᩵ᨴᩤ᩠ᨦᨠᩣ᩠ᩁᨾᩮ᩠ᨦᩨ ᨠᩣ᩠ᩁᩈᩣ᩠ᩁᨠᩣ᩠ᩁᩃᩩᨾ ᨠᩣ᩠ᨦᨲ᩵ᩣ᩠ᨦᨷᩤ᩠ᨶᩢᨲ᩵ᩣ᩠ᨦᨾᩮ᩠ᨦᩨ ᩉᩕᩨᨠᩣ᩠ᩁᨶᩱᨯ᩠ᨶᩥᨯᩯ᩠ᨶᨴᩦ᩵ᨤ᩠ᨶᩫᩋᩣᩈᩱ᩠ᨿᩀᩪ᩵ ᨷᩢᩅᩤ᩵ᨯ᩠ᨶᩥᨯᩯ᩠ᨶᨶᩦ᩶ᨧᩢᨸᩮ᩠ᨶᩑᨠᩁᩣ᩠ᨩ ᩀᩪ᩵ᨶᩱᨣ᩠ᩅᩣ᩠ᨾᨸ᩠ᨠᩫᨣᩕ᩠ᩋᨦᨡ᩠ᩋᨦᨲ᩠ᨶᩫ ᩉᩕᩨᩀᩪ᩵ᨲᩱᩢᩋᩴᩣᨶᩣ᩠ᨧᨲ᩠ᨶᩫᨯᩱᪧᨴ᩠ᨦᩢᩈᩢ᩠ᨿᨦ
The script called Tai Tham is used for three living languages, Lue, Khuen, and Northern Thai, which are spoken in China, Myanmar, Northern Thailand, and surrounding areas. In addition, the script is used for Lao Tham (or Old Lao) and other dialect variants found in Buddhist palm leaves and notebooks. Although the script has no single, commonly recognized name across the region today, it is known by various language-specific and region-specific names, such as Old Xishuangbanna Dai or Old Tai Lue in China, Khün in Myanmar, and Tua Mueang, Lanna, or Yuan in Thailand.
Few of the six million speakers of Northern Thai are literate in the Tai Tham script, although there is some rising interest in the script among the young. There are about 690,000 speakers of Tai Lue. Of those, many people born before 1950 are literate in the Tai Tham script, and newspapers and other literature are regularly produced in the Xishuangbanna region of Yunnan using the script. Younger speakers are taught the New Tai Lue script, instead. ... The Tai Tham script continues to be taught in the Tai Lue monasteries. There are 107,000 speakers of Khün, for which Tai Tham is the only script.
The Tai Tham script, Lanna script (Thai: อักษรธรรมล้านนา) or Tua Mueang (Lanna: ᨲ᩠ᩅᩫᨾᩮᩥᩬᨦ, Northern Thai pronunciation: [tǔa.mɯ̄aŋ]; Tai Lü: ᨲ᩠ᩅᩫᨵᨾ᩠ᨾ᩼, Tham, "scripture"), is used for three living languages: Northern Thai (that is, Kham Mueang), Tai Lü and Khün. In addition, the Lanna script is used for Lao Tham (or old Lao) and other dialect variants in Buddhist palm leaves and notebooks. The script is also known as Tham or Yuan script.
The Tai Tham script is an abugida, ie. consonants carry an inherent vowel sound that is overridden using vowel signs. In Khün, consonants carry an inherent vowel a. See the table to the right for a brief overview of features of the modern Tai Khün orthography. (See the key. Character counts exclude ASCII characters.)
The following list describes some distinctive characteristics of the Tai Tham script.
Tai Tham script is written horizontally and left to right.
Words in the Tai languages are mostly monosyllabic; however, multi-syllable borrowings and compound words occur.
Owen describes the stressed phonological syllable in Khün as C(C)V(C)T. The second consonant in an initial cluster is highly restricted. The onset consonant can be a glottal stop. Unstressed syllables are CVT, where the vowel is normally a.
Dialects differ, and experts also differ, but the basic sounds of Khün appear, roughly speaking, to include the following.
p pʰ b k kʰ t tʰ d c cʰ ʔ
i iː ɯ ɯː u uː
|p t k ʔ|
f s h
|e eː ɤ ɤː o oː|
m n ŋ
|ɛ ɛː a aː ɔ ɔː||m n ŋ|
j r l w
|(r) l||w j|
Final ʔ occurs only after short vowels.
These tables use a hyphen to indicate the location of a consonant or consonant cluster. Prescript vowel-signs have been stored before the hyphen because of the limitations of the font, but in reality all vowel-signs should occur after the consonant they modify.
|Independent||Open syllables||Closed syllables|
|High||short||ʔi ᩍ||ʔu ᩏ||iʔ -ᩥ||ɯʔ -ᩧ||uʔ -ᩩ||i -ᩥ-||ɯ -ᩧ-||u -ᩩ-|
|long||ʔiː ᩎ||ʔuː ᩐ||iː -ᩦ||ɯː -ᩨ||uː -ᩪ||iː -ᩦ-||ɯː -ᩨ-||uː -ᩪ-|
|Upper-mid||short||eʔ ᩮ-ᩡ||ɤʔ ᩮ-ᩬᩨᩡ||oʔ ᩰ-ᩡ -᩠ᩅᩫᩡ|
|long||ʔeː ᩑ||ʔoː ᩒ||eː ᩮ-||ɤː ᩮ-ᩬᩨ||oː ᩰ- -᩠ᩅᩫ||eː ᩮ-- -᩠ᨿ-||ɤː ᩮ-ᩨ-||oː ᩰ-- -᩠ᩅ-|
|Lower-mid||short||ɛʔ ᩯ-ᩡ||ɔʔ ᩰ-ᩬᩡ||ɔ -ᩫ-|
|long||ɛː ᩯ-||ɔː -ᩳ||ɛː ᩯ--||ɔː -ᩬ-|
|Low||short||ʔa ᩋ||aʔ - -ᩡ||a -ᩢ-|
|long||ʔaː ᩋᩣ||aː -ᩣ -ᩤ||aː -ᩣ- -ᩤ-|
Green indicates that the same character(s) are used for open syllables
In the absence of any other consonant, short vowels are always followed by a glottal stop. Long vowels never occur before a glottal stop. [o143]
Most rhymes ending in an approximant (j or w) are regular combinations of vowel nucleus and approximant coda. These include -iw -eːw -ɛːw -aːw -ɯj -ɯːj -ɤj -ɤːj -aːj -uj. [o152]
The following special forms exist. [o153]
|Front • Central • Back|
|Upper-mid||eːw -ᩴ᩠ᨿ||oːj -᩠ᩅ᩠ᨿ|
|Low||aw ᩮ-ᩢᩣ aj ᩱ- ᩱ-᩠ᨿ|
The Unicode proposal contains the following diphthong: -ia ᩮ-᩠ᨿ. [e4-5]
Owen says that most archaic diphthongs have become plain vowels in Khün speech; however, remnants of the former sounds can be seen in the duplicate written forms for eː ɤː oː in the earlier table, which are represented by sequences of vowel-signs.
The inherent vowel is usually transcribed and pronounced as a. So ᨠ is pronounced ka.
Other than the inherent vowel, vowel sounds that follow a consonant sound are represented using vowel-signs, eg. ᨠᩥ ki. This includes diphthongs, 5 prescript signs, and 22 circumgraphs.
Characters that produce vowel signs are all combining characters. In principle a single character is used per base consonant, but vowel signs can be combined to create additional sounds.
All vowel-signs are typed and stored after the base consonant, whether or not they precede it when displayed. The font takes care of the glyph positioning.
About half of the vowel-signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
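As a rough illustration of the last two points, the following Python sketch lists the characters of a syllable in stored order, together with their Unicode general category (Mc for spacing marks, Mn for non-spacing marks). The helper name `describe` and the sample syllable are chosen here purely for illustration.

```python
import unicodedata

def describe(text):
    """Return (code point, character name, general category) in stored order."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))
        for ch in text
    ]

# HIGH KA followed by VOWEL SIGN E. The vowel-sign is displayed to the left
# of the consonant, but it is stored after it.
syllable = "\u1A20\u1A6E"

for code_point, name, category in describe(syllable):
    print(code_point, name, category)
```

The order of the printed rows is the order in memory, regardless of where the font places each glyph.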
In the following list, transcriptions indicate which vowel-signs are used by Tai Khün.
ᩤ [U+1A64 TAI THAM VOWEL SIGN TALL AA] and ᩣ [U+1A63 TAI THAM VOWEL SIGN AA] both represent the same phoneme. The choice between them is a matter of spelling. The taller version is typically used after ᨷ ᩅ ᨴ ᨵ ᨣ (for Khün, Owen says only after these [o152]), where it avoids confusion with otherwise similar shapes, eg. ᩅᩣ looks like ᨲ. Some textbooks also recommend its use after ᨧ ᨻ ᩁ ᨽ. [e]
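The spelling rule for the two AA signs can be sketched as a small lookup. This is only a minimal sketch: the function name `aa_sign_for` is hypothetical, and the set contains just the five consonants listed above as typical for Khün, not the additional letters that some textbooks recommend.

```python
AA = "\u1A63"       # TAI THAM VOWEL SIGN AA
TALL_AA = "\u1A64"  # TAI THAM VOWEL SIGN TALL AA

# Consonants after which the tall form is typically used (the five listed above).
TALL_AA_AFTER = set("ᨷᩅᨴᨵᨣ")

def aa_sign_for(consonant):
    """Pick the AA vowel-sign to store after the given base consonant."""
    return TALL_AA if consonant in TALL_AA_AFTER else AA

# WA takes the tall form, which keeps the result distinct from HIGH TA.
print(f"U+{ord(aa_sign_for('ᩅ')):04X}")
```

An input method or spell checker following a different textbook would simply use a larger set.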
Northern Thai also uses ᩲ [U+1A72 TAI THAM VOWEL SIGN THAM AI]; however, ᩭ [U+1A6D TAI THAM VOWEL SIGN OY] is not used in Northern Thai. [u655]
The sequence ᩠ + ᨿ [U+1A60 TAI THAM SIGN SAKOT + U+1A3F TAI THAM LETTER LOW YA] is pronounced as the diphthong ia in Northern Thai and as eː in Khün when it appears alone.
The sequence ᩠ + ᩅ [U+1A60 TAI THAM SIGN SAKOT + U+1A45 TAI THAM LETTER WA] is pronounced as the diphthong ua when it appears alone.
Both of these characters also appear as a part of the combinations described below.
The various letters used for vowels can be mixed together to produce additional sounds, as shown in the examples below for sequences used in Khün. This list doesn't include rhymes that end in WA and YA in the normal way that codas are formed.
The following list shows where vowel-signs are positioned around a base consonant to produce vowels, and how many instances of that pattern there are. The figure after the + sign represents combinations of Unicode characters. The list includes subjoined WA and YA and the postfixed ᩋ.
Vowel components can occur concurrently on 4 sides of the base, eg. ᩅᩮᩬᩨᩡ.
Distribution of vowel elements is as follows:
|ᩢ ᩫ ᩳ ᩴᩘ|
|ᩮ ᩯ ᩱ ᩰ ᩲ||ᩡ ᩅ ᩣ ᩤ ᩋ||ᩡ|
|ᩩ ᩪ ᩬ||ᩭ ᩠ᨿ|
In Tai Tham some vowels that are not preceded by a consonant can be represented using a set of independent vowel letters. The set includes a character to represent the inherent vowel sound, but doesn't cover all such possible vowels.
The 6 vowel letters are used in syllable-initial position where there is no consonant onset, eg. ᩑᨠ.
The use of independent vowels is lexically constrained: for other vowels not preceded by a consonant Tai Tham uses ᩋ [U+1A4B TAI THAM LETTER A] with a vowel-sign, eg. ᩋᩧ᩠ᨷ.
ᩒ [U+1A52 TAI THAM LETTER OO] is not used in Northern Thai. [u654]
The lists below show all consonants in the Tai Khün repertoire.
The consonants are associated with high, mid, or low class values related to tone. (Low class consonants are indicated using an underline in the transliteration, and mid class by an inverted breve.)
|Stop||p ᨸ ᨻ||t ᨭ ᨴ||tʰ ᨳ ᨮ ᨵ ᨰ||k ᨠ ᨣ||kʰ ᨡ ᨥ ᨤ|
|Fricative||f ᨺ ᨼ||s ᩈ ᩆ ᩇ ᨪ||h ᩉ ᩌ|
|Nasal||m ᨾ||n ᨶ ᨱ||ŋ ᨦ|
|Approximant||w ᩅ||l ᩃ ᩊ||j ᩀ ᨿ|
cʰ does not appear to be a native Khün sound, but is rather associated with reading the alphabet aloud and with learned pronunciation of Pali loanwords. [o142]
There seems to be general agreement that initial consonant clusters are limited to kw and kʰw, although some have found other clusters in words borrowed from Burmese, Sanskrit or Pali. [o142]
The initial glottal stop is also pronounced before independent vowels, though they are not listed here.
|Stop||p̚ ᨸ ᨻ||t̚ ᨭ ᨴ||k̚ ᨠ ᨣ||ʔ|
|Nasal||m ᨾ||n ᨶ ᨱ||ŋ ᨦ|
|Approximant||w ᩅ||j ᩀ ᨿ|
The glottal stop is unwritten and non-phonemic, but is pronounced after a short, open vowel.
High and low consonants usually come in pairs, but where they don't, the high variant is normally produced by subjoining the low consonant below ᩉ [U+1A49 TAI THAM LETTER HIGH HA], eg. ᩉ᩠ᨶᩧ᩵ᨦ.
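In terms of stored characters, this high-class onset is HIGH HA, then SAKOT, then the low consonant. A minimal sketch follows. The function name `high_class_onset` is hypothetical, and no check is made that the argument really is an unpaired low-class consonant.

```python
HIGH_HA = "\u1A49"  # TAI THAM LETTER HIGH HA
SAKOT = "\u1A60"    # TAI THAM SIGN SAKOT

def high_class_onset(low_consonant):
    """Return HIGH HA + SAKOT + consonant, ie. the consonant subjoined below HIGH HA."""
    return HIGH_HA + SAKOT + low_consonant

NA = "\u1A36"  # TAI THAM LETTER NA
onset = high_class_onset(NA)
print([f"U+{ord(ch):04X}" for ch in onset])
```

Any vowel-signs and tone marks for the syllable would then be stored after this three-character onset.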
A few consonants have different phonetic realisations in Northern Thai, and ᨢ [U+1A22 TAI THAM LETTER HIGH KXA] is used in Northern Thai but not by Tai Khün.
ᩋ [U+1A4B TAI THAM LETTER A] represents a glottal stop. It can be used with vowels at the beginning of a syllable, eg. ᩋᩧ᩠ᨷ. Note that this can have very different shapes, as shown by the Northern Thai font (ᩋ) and the Khün font (ᩋ).
These combining characters are used to represent the second consonant in syllable-initial clusters.
In addition, a subjoined w̱ is often found in a syllable-initial cluster, eg. ᨣ᩠ᩅᩣ᩠ᨿ.
Other syllable-initial clusters include the combination of ᩉ [U+1A49 TAI THAM LETTER HIGH HA] plus a subjoined low class consonant to give a high class version, as mentioned just above.
Medial characters are useful, as they can signal the difference between a consonant cluster and an initial-final sequence, ie. using a subjoined l. Some fonts, however, don't make that distinction clear. [h]
Tai Tham text commonly renders syllable-final consonants using regular consonant code points, eg. ᩑᨠ, but sometimes the special combining characters shown below are used.
When regular consonants are used they are commonly subjoined, eg. ᨠᩣ᩠ᩁ, but not always. For example, when preceded by a subscript vowel a final consonant may be rendered on the baseline, eg. ᩃᩪᨠ. On the other hand, sometimes in Khün the consonant is subjoined and the subscript vowel is moved to the side of the stack, eg. ᨧ᩠ᨷᩪ. (In Lanna, all three may be stacked, ie. ᨧ᩠ᨷᩪ.) In either case, due to font design or USE (the Universal Shaping Engine) the characters may have to be typed in an order that departs from the spoken order so that they look as expected.
The following diacritics are sometimes used for syllable-final consonants. For more details about usage, click on the characters and follow the links to the character notes.
Owen says that the superscript consonants in Khün are limited to final r ( ᩺ [U+1A7A TAI THAM SIGN RA HAAM]) and ŋ ( ᩙ [U+1A59 TAI THAM CONSONANT SIGN FINAL NGA]) in syllables where a subscript vowel prevents the use of a subscript final consonant. Superscript forms are mainly found in handwritten text, whereas regular forms of these consonants in postscript position are the norm for printed texts. [o145]
᩺ [U+1A7A TAI THAM SIGN RA HAAM] has a different shape in Northern Thai, where it is not used for syllable-final consonants, but rather as a silence marker. Note that a syllable-final r is pronounced n.
ᩴ [U+1A74 TAI THAM SIGN MAI KANG] may be regarded as a vowel, but it doesn't introduce any vowel sound other than the inherent vowel when used above a consonant on its own.
ᩘ [U+1A58 TAI THAM SIGN MAI KANG LAI] has very different shapes in Northern Thai (ᩅᩘ) and Khün (ᩅᩘ).
ᩜ [U+1A5C TAI THAM CONSONANT SIGN MA], ᩝ [U+1A5D TAI THAM CONSONANT SIGN BA] and ᩞ [U+1A5E TAI THAM CONSONANT SIGN SA] appear to be alternative shapes for the normal subjoined consonants, used per writer preference (follow the links for more information). The latter two are rarely used in Khün.
The first of these is a special-use consonant diacritic. The second two are ligatures.
ᩛ [U+1A5B TAI THAM CONSONANT SIGN HIGH RATHA OR LOW PA] serves two different functions with the same appearance. It represents ᨮ [U+1A2E TAI THAM LETTER HIGH RATHA] in ᩈᨱᩛᩣ᩠ᨶ [e], and it represents ᨻ [U+1A3B TAI THAM LETTER LOW PA] in ᩋᨾᩛ. (Compare with the somewhat rare subjoined form, eg. ᨷᩢᨱ᩠ᨻᨷᩩᩁᩩᩇ.) [e]
Khün uses ᨭᩛ, which is ᨭ + ᩛ [U+1A2D TAI THAM LETTER RATA + U+1A5B TAI THAM CONSONANT SIGN HIGH RATHA OR LOW PA] instead of ᨮ [U+1A2E TAI THAM LETTER HIGH RATHA]. [e]
ᩓ [U+1A53 TAI THAM LETTER LAE] represents the combination ᩃ + ᩯ [U+1A43 TAI THAM LETTER LA + U+1A6F TAI THAM VOWEL SIGN AE], eg. ᩈᩮᩓ᩠ᩅ᩶.
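For loose matching (eg. in search), an application might want to treat LAE and the LA + AE sequence as equivalent. The sketch below shows one way to do that. It is an application-level assumption rather than a Unicode normalization, and the function name `fold_lae` is hypothetical.

```python
LAE = "\u1A53"          # TAI THAM LETTER LAE
LA_AE = "\u1A43\u1A6F"  # LETTER LA + VOWEL SIGN AE, in stored order

def fold_lae(text):
    """Replace LAE with the LA + AE sequence so both spellings compare equal."""
    return text.replace(LAE, LA_AE)

# Both spellings fold to the same string.
print(fold_lae("\u1A48\u1A53") == fold_lae("\u1A48\u1A43\u1A6F"))
```

The folded text is intended only for comparison; the original spelling should be kept for display.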
ᩔ [U+1A54 TAI THAM LETTER GREAT SA] represents geminated ᩈ [U+1A48 TAI THAM LETTER HIGH SA].
ᩚ [U+1A5A TAI THAM CONSONANT SIGN LOW PA] moves the stack upwards. The normal rendering of kp̄˖p̄ʰ would be ᨠᨻ᩠ᨽ, but in some Tai Lü words this sign is used instead, eg. ᨠᨽᩚ. [e] Note, however, that this changes the typing order of the consonants, since a combining character has to be typed after the base. The transliteration now becomes kp̄ʰp̆. [e]
Tai Tham is unusual in that subjoined consonants do not only appear where there are consonant clusters. There is a natural tendency to attempt to stack consonants, usually 2 high, whenever possible.
᩠ [U+1A60 TAI THAM SIGN SAKOT] is an invisible character used to produce the subjoined form of a consonant, eg. ᨠᨠ vs. ᨠ᩠ᨠ.
Unlike the virama in most Brahmi-derived scripts, sakot doesn't necessarily kill the vowel between two consonants, nor does it create conjuncts in the sense of merged shapes. For example, in ᨨ᩠ᩃᩣ᩠ᨯ the inherent vowel after c is not suppressed.
Also unusually, sakot can follow a vowel-sign. For example, in ᩈᩣ᩠ᨾ the sakot is used to position the final consonant in the syllable below the vowel-sign. This is quite common. A subjoined consonant can also follow a digit, eg. ᪓᩠ᨴ.
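The three contexts just described (sakot after a consonant, after a vowel-sign, and after a digit) can be checked mechanically. The following sketch reports what kind of character is stored immediately before each SAKOT, using the first letter of the Unicode general category. The function name `sakot_contexts` and the labels are chosen here for illustration.

```python
import unicodedata

SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT

# First letter of the general category, mapped to a readable label.
KINDS = {"L": "letter", "M": "combining mark", "N": "digit"}

def sakot_contexts(text):
    """For each SAKOT, report the kind of character stored immediately before it."""
    contexts = []
    for index, ch in enumerate(text):
        if ch == SAKOT and index > 0:
            major = unicodedata.category(text[index - 1])[0]
            contexts.append(KINDS.get(major, "other"))
    return contexts

print(sakot_contexts("\u1A20\u1A60\u1A20"))        # HIGH KA + SAKOT + HIGH KA
print(sakot_contexts("\u1A48\u1A63\u1A60\u1A3E"))  # HIGH SA + AA + SAKOT + MA
print(sakot_contexts("\u1A93\u1A60\u1A34"))        # THAM DIGIT THREE + SAKOT + LOW TA
```

Code that assumes a virama-like model, where the sign always directly follows a consonant, would mishandle the second and third cases.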
Tai Tham will usually attempt to subjoin non-initial consonants, although generally only two characters deep. Sequences of 2 subjoined characters exist, but in Tai Khün the second subjoined character joins to the right of the stack, rather than sitting below it, eg. ᨠ᩠ᩅ᩠ᨿᩁ. In Northern Thai, however, they may all be stacked, eg. ᨠ᩠ᩅ᩠ᨿᩁ
A consequence of this shallow subjoining is that a subscript vowel will typically cause a final consonant to not be subjoined, eg. ᩃᩪᨠ. However, this is not always the case. In Khün ᨧ᩠ᨷᩪ, the final b is subjoined under the onset consonant, and the normally subscript vowel is moved to the side. (In Northern Thai, however, this may be displayed as a single stack, ie. ᨧ᩠ᨷᩪ.)
Subjoined consonants are not only syllable-final consonants. The first consonant in a following syllable may also be subjoined, eg. ᨳ᩠ᨶ᩻ᩫᩁ (final r is pronounced as n). [e] [u654]
Not all consonants traditionally have subjoined forms, but modern innovations in borrowed terminology suggest that fonts should provide them for all consonants except the old vocalic letters. [u654]
With the high/low categorisation of consonants, Tai Tham writing needs only the two combining tone marks below to indicate one of 6 possible phonetic tones.
The Tai Tham Unicode block contains 3 more tone marks for Khün, although they are rarely used.
Owen describes various studies of tones in Khün which reach slightly different conclusions. [o] In addition, some studies conclude that there are 6 tones in total, and others 5. The table below shows Owen's 6-tone system.
|mid glottalised||˧˧ʔ||33ʔ||kaː˧˧ʔ dance|
|high falling||˥˩||51||kaː˥˩ trade|
If there is a vowel over or below a consonant or consonant stack, the tone mark follows the vowel in storage, and is displayed above or alongside the vowel.
Otherwise, the tone is input after the consonant, ie. before a vowel sign that is displayed to the right or below, and appears over the consonant. [e]
The default fonts used here expect the tone mark to be typed after a prescript (left-side) vowel-sign if there is one; after a vowel-sign above, if there is one; and before a vowel-sign to the right. The order does not seem to matter with respect to a vowel-sign below. See this test. Noto agrees, except for prescript vowel-signs.
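The two placements that the descriptions above agree on can be sketched as follows: a tone mark is stored after a vowel-sign displayed above the base, and before a vowel-sign displayed to the right. This is a minimal sketch under stated assumptions. The function name `with_tone` is hypothetical, the two sets are illustrative subsets, and prescript and subscript vowel-signs are deliberately left out because their handling differs between sources and fonts.

```python
TONE_1 = "\u1A75"  # TAI THAM SIGN TONE-1

# Vowel-signs displayed above the base (illustrative subset):
# MAI SAT, I, II, UE, UUE, O.
ABOVE = {"\u1A62", "\u1A65", "\u1A66", "\u1A67", "\u1A68", "\u1A6B"}

# Vowel-signs displayed to the right of the base (illustrative subset):
# A, AA, TALL AA.
RIGHT = {"\u1A61", "\u1A63", "\u1A64"}

def with_tone(consonant, vowel_sign, tone):
    """Return the stored order of consonant, vowel-sign and tone mark."""
    if vowel_sign in ABOVE:
        return consonant + vowel_sign + tone  # tone follows the vowel-sign above
    if vowel_sign in RIGHT:
        return consonant + tone + vowel_sign  # tone precedes the vowel-sign to the right
    raise ValueError("placement for this vowel-sign is not covered by this sketch")

# HIGH TA + TONE-1 + AA
print([f"U+{ord(ch):04X}" for ch in with_tone("\u1A32", "\u1A63", TONE_1)])
# NA + UE + TONE-1
print([f"U+{ord(ch):04X}" for ch in with_tone("\u1A36", "\u1A67", TONE_1)])
```

Raising an error for the uncovered cases, rather than guessing, keeps the font-dependent behaviour visible to the caller.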
The following chart shows how to tell which tones are associated with a syllable.
Tai Tham has many combining marks. Here we list those that are not already listed in one of the following sections: vowelsigns, medials, finals, specials, and tones.
᩠ [U+1A60 TAI THAM SIGN SAKOT] is used to create a subjoined consonant. See subjoined.
᩼ [U+1A7C TAI THAM SIGN KHUEN-LUE KARAN] is written over a consonant (normally in final position) when that consonant is not to be pronounced. It is frequently used in loans from languages with consonant clusters in the coda, such as Pali, eg. ᩈᩫ᩠ᨾᨷᩪᩁ᩠᩼ᨱ or English ᨼᩥ᩠ᩃ᩼ᨾ. [o149] Northern Thai would use ᩺ [U+1A7A TAI THAM SIGN RA HAAM], ie. ᨼᩥ᩠ᩃ᩺ᨾ f̱i˖ḻ˟m̱ fim². [p]
᩻ [U+1A7B TAI THAM SIGN MAI SAM] is used in Northern Thai either to indicate repetition of a word, eg. ᨲ᩵ᩣ᩠ᨦ᩻ t¹ā˖ŋ̱ʻ taːŋ taːŋ different in my view, or to identify double-acting consonants, or to indicate that a subjoined consonant begins a new syllable, eg. compare ᨳᩫ᩠ᨶᩁ tʰɔ̈˖ṉr tʰonra with ᨳᩫ᩠ᨶ᩻ᩁ tʰɔ̈˖ṉʻr tʰanon path (final r is pronounced as n).e
᩿ [U+1A7F TAI THAM COMBINING CRYPTOGRAMMIC DOT] is used in Northern Thai, singly or multiply beneath letters, to give each letter a different value according to some hidden agreement between reader and writer. [u655]
The shapes of some of the above are significantly different in Northern Thai.
These characters are all described in boundaries.
The meaning of each of the logographs is shown above.
Two sets of digits are in common use: a secular set (Hora) and an ecclesiastical set (Tham). European digits are also found in books. [u655]
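Both digit sets are encoded as contiguous runs of ten decimal digits (Hora from U+1A80, Tham from U+1A90), so converting a number is a matter of offsetting from the zero of the chosen set. The following is a minimal sketch; the function name `to_tai_tham_digits` is hypothetical.

```python
HORA_ZERO = 0x1A80  # TAI THAM HORA DIGIT ZERO
THAM_ZERO = 0x1A90  # TAI THAM THAM DIGIT ZERO

def to_tai_tham_digits(number, zero=HORA_ZERO):
    """Render a non-negative integer using Hora (default) or Tham digits."""
    if number < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    return "".join(chr(zero + int(digit)) for digit in str(number))

hora = to_tai_tham_digits(2020)
tham = to_tai_tham_digits(2020, THAM_ZERO)

# Python's int() accepts any Unicode decimal digits, so both forms round-trip.
assert int(hora) == int(tham) == 2020
```

Which set to use is an editorial decision that depends on the kind of text, so the sketch leaves it to the caller.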
You can experiment with examples using the Tai Khün character app.
Although the same code points are used, there are some significant and consistent differences in the glyph shapes used for characters in the Tai Khün (top) and Northern Thai (bottom) repertoires.
The panel below shows the differences for the consonants of the fonts used for this page.
There is not so much contextual shaping in Tai Tham as in many other Brahmi-descended scripts. One particularly noticeable example of contextual shaping is the realisation of ᨬ + ᩠ + ᨬ [U+1A2C TAI THAM LETTER NYA + U+1A60 TAI THAM SIGN SAKOT + U+1A2C TAI THAM LETTER NYA], which moves the initial character upwards rather than subjoining the second, ie. ᨬ᩠ᨬ.
Another common ligature is ᨶᩣ, which is composed of ᨶ + ᩣ [U+1A36 TAI THAM LETTER NA + U+1A63 TAI THAM VOWEL SIGN AA]. It forms even when NA has non-spacing subscripts, and even MEDIAL RA, eg. ᩋᩫᨶ᩠ᨲᩕᩣ᩠ᨿ. Pali must regularly handle the nominative singular ending for present participles, ᨶ᩠ᨲᩮᩣ ṉ˖teā. [r]
Placement of tone marks often involves special shaping and positioning. See the positions in the examples below.
In the A Tai Tham KH New font, a 2nd-tone mark following ᩢ [U+1A62 TAI THAM VOWEL SIGN MAI SAT] loses its uptick to create two parallel lines, eg. ᨡᩮᩢ᩶ᩣ.
Spaces separate phrases. There is no separation of individual words.
A new word may start with a subjoined consonant. Stacking is performed across word boundaries. This means that operations such as line-breaking, word highlighting, etc. have to use an orthographic syllable unit which differs from the underlying phonetic syllables.
The following punctuation marks have "progressive values of finality".
European punctuation marks such as question marks, exclamation marks, parentheses, and quotation marks are also used.
᪣ [U+1AA3 TAI THAM SIGN KEOW], ᪤ [U+1AA4 TAI THAM SIGN HOY], ᪥ [U+1AA5 TAI THAM SIGN DOKMAI], and ᪭ [U+1AAD TAI THAM SIGN CAANG] are all used as section starters, sometimes in conjunction with other punctuation, eg. ᪩᪥᪩ and ᪭ᩣ. [e]
To close a section, use ᪦ [U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA] and/or ᪬ [U+1AAC TAI THAM SIGN HANG], eg. ᪦᪦᪩, ᪩᪦᪩, ᪩᪦᪩᪬, and ᪦᪦᪬.
Opportunities for line breaking are lexical, but a line break may not be inserted between a base letter and a combining diacritic. [u656]
There is no insertion of visible hyphens at line boundaries. [u656]
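The restriction on breaking before a combining diacritic can be expressed as a simple filter over candidate offsets. The sketch below also refuses to break directly after SAKOT, so that a subjoined consonant stays with its stack; that second check is an assumption of this sketch rather than something stated above. The function name `mark_safe_offsets` is hypothetical. The offsets it returns are only those not ruled out by these two checks, and lexical analysis is still needed to decide which of them are real break opportunities.

```python
import unicodedata

SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT

def mark_safe_offsets(text):
    """Return offsets where a break would not separate a mark from its base."""
    offsets = []
    for i in range(1, len(text)):
        if unicodedata.category(text[i]).startswith("M"):
            continue  # text[i] is a combining mark attached to what precedes it
        if text[i - 1] == SAKOT:
            continue  # text[i] is subjoined to the preceding stack (sketch assumption)
        offsets.append(i)
    return offsets

# HIGH KA + AA + SAKOT + RA: no offset inside the stack survives the filter.
print(mark_safe_offsets("\u1A20\u1A63\u1A60\u1A41"))
# LA + UU + HIGH KA: the offset before the final consonant survives the filter,
# even though a dictionary-based breaker would still reject it.
print(mark_safe_offsets("\u1A43\u1A6A\u1A20"))
```

The second example shows why such a filter is necessary but not sufficient: it cannot tell a syllable-final consonant from the start of a new word.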
Characters used for the Tai Khün language have the following assignments related to line-break properties.
|NU||20||᪀ ᪁ ᪂ ᪃ ᪄ ᪅ ᪆ ᪇ ᪈ ᪉ ᪐ ᪑ ᪒ ᪓ ᪔ ᪕ ᪖ ᪗ ᪘ ᪙|
|SA||96||ᩮ ᩯ ᩱ ᩰ ᩣ ᩤ ᩡ ᩢ ᩥ ᩦ ᩧ ᩨ ᩩ ᩪ ᩭ ᩫ ᩬ ᩳ ᩠ ᨿ ᩠ ᩅ ᩋ ᩍ ᩎ ᩏ ᩐ ᩑ ᩒ ᨠ ᨡ ᨧ ᨨ ᨮ ᨲ ᨳ ᨸ ᨹ ᩀ ᩈ ᩆ ᩇ ᨺ ᩉ ᨯ ᨷ ᩋ ᨣ ᨥ ᨤ ᨦ ᨩ ᨫ ᨬ ᨭ ᨰ ᨱ ᨴ ᨵ ᨶ ᨻ ᨽ ᨾ ᩃ ᩊ ᩁ ᨿ ᩅ ᨪ ᨼ ᩌ ᩕ ᩖ ᩺ ᩙ ᩴ ᩘ ᩜ ᩝ ᩞ ᩛ ᩓ ᩔ ᩠ ᩵ ᩶ ᩼ ᪨ ᪩ ᪪ ᪫ ᪧ|
CM (combining mark) takes on the behaviour of its base character.
NU (number) behaves like ordinary characters (AL) in the context of most characters, but activates the prefix and postfix behavior of prefix and postfix characters.
SA (Southeast Asian) requires morphological analysis to determine break opportunities, in a way similar to a hyphenation algorithm. No break opportunities will be found otherwise. Complex context analysis, often involving dictionary lookup of some form, is required to determine non-emergency line breaks. If such analysis is not available, it is recommended to treat these characters as AL.
ZW (ZERO WIDTH SPACE, ZWSP) enables invisible break opportunities wherever SPACE cannot be used. It has no width, and is treated as if it wasn't there during justification.
The Tai Tham script characters in Unicode 12.0 are in the following block:
The modern Tai Khün orthography described here uses characters from the following Unicode blocks.
|Tai Tham||97||ᨠ ᨡ ᨣ ᨤ ᨥ ᨦ ᨧ ᨨ ᨩ ᨪ ᨫ ᨬ ᨭ ᨮ ᨯ ᨰ ᨱ ᨲ ᨳ ᨴ ᨵ ᨶ ᨷ ᨸ ᨹ ᨺ ᨻ ᨼ ᨽ ᨾ ᨿ ᩀ ᩁ ᩃ ᩅ ᩆ ᩇ ᩈ ᩉ ᩊ ᩋ ᩌ ᩍ ᩎ ᩏ ᩐ ᩑ ᩒ ᩓ ᩔ ᩕ ᩖ ᩘ ᩙ ᩛ ᩜ ᩝ ᩞ ᩠ ᩡ ᩢ ᩣ ᩤ ᩥ ᩦ ᩧ ᩨ ᩩ ᩪ ᩫ ᩬ ᩭ ᩮ ᩯ ᩰ ᩱ ᩳ ᩴ ᩵ ᩶ ᩺ ᩼ ᪀ ᪁ ᪂ ᪃ ᪄ ᪅ ᪆ ᪇ ᪈ ᪉ ᪧ ᪨ ᪩ ᪪ ᪫|
The infrequently used characters come from these blocks.
|Tai Tham||22||᩷ ᩸ ᩹ ᪐ ᪑ ᪒ ᪓ ᪔ ᪕ ᪖ ᪗ ᪘ ᪙ ᪠ ᪡ ᪢ ᪣ ᪤ ᪥ ᪦ ᪬ ᪭|
See also the Character usage lookup page, and the Script Comparison Table.
According to ScriptSource, the Tai Tham script is used for the following languages: