Updated 1 January, 2023
This page brings together basic information about the Tai Tham (Lanna) script and its use for the Northern Thai language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Northern Thai using Unicode.
Good Tai Tham fonts are hard to find, especially for Northern Thai. The default font used for Northern Thai in this page is Hariphunchai. Discussions are under way at the Unicode Consortium that may change the ordering of character sequences in the future, so the order of characters in this page reflects what the font expects in order to display correctly.
ᨣᩢ᩠ᨶᩉᩖᩮᩨᨠᩥ᩠ᨶ ᨣᩢᩐᩢᩣᨡᩣ᩠ᨿᨸᩮ᩠ᨶᨦᩫ᩠ᨶ ᨠᩮ᩠ᨷᩉᩬᨾᩋᩬᨾᩅᩱᩢᨯ᩠᩶ᨦᨶᩦ᩶ ᨴᩩᨠᪧᨸᩦᨾᩣᨷᩢᨡᩣ᩠ᨯ ᨧᩥ᩠᩵ᨦᨠ᩠ᨴᩣᩴᩉᩨ᩶ᨡᩮᩢᩣᨻᩳ᩵ᨾᩯ᩵ᩃᩪᨠ ᨷᩢᨯᩱ᩶ᨠᩢ᩠᩶ᨶᩈᩢ᩠ᨦᩈᩢ᩠ᨠᨩᩮᩨᩬ
Northern Thai is spoken by the people of Lanna, Thailand, with a smaller community of Lanna speakers in northwestern Laos. Few of the six million speakers of Northern Thai are literate in the Tai Tham script, although there is some rising interest in the script among the young. Since the beginning of the 20th century, the Thai script has been used for the Northern Thai language, although the fact that Thai only has 5 tones to Northern Thai's 6 makes this problematic.
Use of the Lanna traditional script is now largely limited to Buddhist temples, where many old sermon manuscripts are still in active use. There is no active production of literature in the traditional alphabet. The modern pronunciation differs from that prescribed in spelling rules.
ᨲ᩠ᩅᩫᨾᩮᩥᩬᨦ ᨣᩤᩴᨾᩮᩨᩬᨦ
In the Thai script this is คำเมือง.
The Lanna script is derived from Mon, and before that Pallava.
The Tai Tham script is an abugida, ie. consonants carry an inherent vowel sound that is overridden using vowel signs. The summary below gives a brief overview of features for the modern Northern Thai orthography.
Northern Thai text runs left to right in horizontal lines.
Words are not separated by spaces. However, syllables may be separated by ZWSP, as long as the break does not fall inside a stack.
Each onset consonant is associated with a high, mid, or low class related to tone. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark.
Tai Tham has stacked consonants, but these do not necessarily indicate consonant clusters. The script is unusual in that any consonant in a stack can retain its inherent vowel, or be associated with a vowel sign. The sakot, which produces stacks, is never visible.
Stacks can span word boundaries.
Syllable-initial clusters use 2 dedicated code points for the medial l, and a subjoined letter for medial w.
Syllable-final consonant sounds can be written using 6 special diacritics, but otherwise use ordinary letters, which may or may not be subjoined depending on the context.
The Northern Thai orthography has an inherent vowel a, and represents vowels using 18 vowel signs (including 5 pre-base vowels), and 3 consonants. However, unlike Thai and Lao, all vowel signs are combining marks, and are stored after the base character. Vowels are often written differently when they appear in a closed vs. open syllable.
There is an incomplete set of independent vowels, and standalone vowel sounds are typically written using vowel signs applied to ᩋ [U+1A4B TAI THAM LETTER A].
This page lists 29 composite vowels (made from 9 vowel signs, and 3 consonants/diacritics). Composite vowels can involve up to 5 glyphs, which can surround the base consonant(s) on up to 4 sides, eg.
ᨠᩮᩨᩬᩋᩡ
Northern Thai and Khün not only use a slightly different set of characters, but a number of characters have consistently divergent shapes.
See Tai Tham/Khün.
These are sounds for the Northern Thai language.
Phones in a lighter colour are non-native or allophones.
| labial | dental | alveolar | post-alveolar | palatal | velar | glottal |
---|---|---|---|---|---|---|---|
stops | p b | t d | k | ʔ | |||
aspirated | pʰ | tʰ | |||||
affricates | t͡ɕ | ||||||
fricatives | f | s | x | h | |||
nasals | m | n | ɲ | ŋ | |||
approximants | w | l | j | ||||
The glottal stop is pronounced after short open vowels. An initial glottal stop is also pronounced before independent vowels (see standalone).
| labial | dental | alveolar | post-alveolar | palatal | velar | glottal |
---|---|---|---|---|---|---|---|
stop | p̚ | t̚ | k̚ | ʔ | |||
nasal | m | n | ŋ | ||||
approximant | w | j |
All final stops are unreleased. [Wikipedia, https://en.wikipedia.org/wiki/Northern_Thai_language#Final_consonants]
The Chiang Mai dialect of Northern Thai has 6 tones. They are illustrated in fig_tones, which is taken from Wikipedia.
Wikipedia provides the following information for the 6 phonemic tones for unchecked syllables in the Chiang Mai dialect of Northern Thai. (It also has sound recordings.) [wnl, #Consonants]
Tone | Tone letters | Tone numbers | Diacritic | Example |
---|---|---|---|---|
low-rising | ˨˦ | 24 | ǎ | ᩉᩖᩮᩢᩣ |
mid-low | ˨ | 22 | à | ᩉᩖᩮᩢᩣ᩵ |
high-falling glottalised | ˥˧ | 53 | a᷇ | ᩉᩖᩮᩢᩣ᩶ |
mid-high | ˧ | 33 | ā | ᩃᩮᩢᩣ |
falling | ˥˩ | 51 | â | ᩃᩮᩢᩣ᩵ |
high rising-falling glottalised | ˦˥˦ | 545 | á | ᩃᩮᩢᩣ᩶ |
This is the list for checked syllables. [wnl, #Consonants]
Tone | Tone letters | Tone numbers | Diacritic | Example |
---|---|---|---|---|
low-rising | ˨˦ | 24 | ǎ | ᩉᩖᩢᨠ |
high-falling | ˥ | 55 | a᷇ | ᩃᩢ᩠ᨠ |
low | ˨ | 22 | à | ᩉᩖᩣ᩠ᨠ |
falling | ˥˩ | 51 | â | ᩃᩣ᩠ᨠ |
The mapping of tones to characters is described in tones.
Dashes are used to indicate the location of a consonant or consonant cluster. Prescript vowel signs have been stored before the hyphen because of the limitations of the font, but in reality all vowel signs should occur after the consonant they modify.
The Northern Thai orthography has an inherent vowel a, and represents vowels using 19 dependent vowel marks (including 5 pre-base vowel signs), and 3 consonants (2 of which as subjoined forms). Unlike Thai and Lao, all vowel signs are combining marks, and are stored after the base character. Vowels are often written differently when they appear in a closed vs. open syllable.
For a mapping of sounds to graphemes see vowel_mappings.
a following a consonant is not written, but is seen as an inherent part of the consonant letter, so ka is written by simply using the consonant letter.
ᨠ ka [U+1A20 LETTER HIGH KA]
Non-inherent vowel sounds that follow a consonant can be represented using vowel signs, eg.
ᨠᩥ ki [U+1A20 LETTER HIGH KA + U+1A65 VOWEL SIGN I]
The majority of code points used to represent vowel sounds are combining marks. However, Northern Thai also uses some consonants. Many vowel sounds are represented by a combination of code points (see composite_vowels), such as the following example, where the consonant base is followed by 4 vowel signs, which are displayed around the base.
ᨠᩮᩥᩬᩡ kɤʔ [U+1A20 LETTER HIGH KA + U+1A6E VOWEL SIGN E + U+1A65 VOWEL SIGN I + U+1A6C VOWEL SIGN OA BELOW + U+1A61 VOWEL SIGN A]
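The storage order of this example can be inspected with a short Python sketch. It uses only the standard `unicodedata` module; the helper name `describe` is hypothetical and introduced here purely for illustration.

```python
import unicodedata

# The example syllable from the text, in stored (logical) order:
# HIGH KA, then the four vowel signs E, I, OA BELOW, A.
syllable = "\u1A20\u1A6E\u1A65\u1A6C\u1A61"


def describe(text):
    """Return (code point, Unicode name) pairs in storage order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(syllable):
    print(code_point, name)
```

The listing starts with the base consonant even though the glyph for VOWEL SIGN E is displayed to its left, which is the point made above about storage versus display order.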
In principle, all vowel signs are typed and stored after the base consonant, whether or not they precede it when displayed. The font takes care of the glyph positioning. However, the Unicode Consortium is currently examining the encoding model for Tai Tham. There is a possibility that pre-base vowel signs may be stored before the consonant in future.
Eight vowel signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
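One way to explore which vowel signs are spacing marks is to query the Unicode General_Category, where `Mc` means a spacing combining mark and `Mn` a non-spacing one. The following is a rough sketch: the function name and the code point range U+1A61..U+1A74 are assumptions made for this example, and because the range covers the whole Tai Tham block rather than only the signs used for Northern Thai, the count it reports may differ from the figure given above.

```python
import unicodedata


def spacing_vowel_signs(first=0x1A61, last=0x1A74):
    """Return characters in the range whose General_Category is Mc."""
    return [
        chr(cp)
        for cp in range(first, last + 1)
        if unicodedata.category(chr(cp)) == "Mc"
    ]


for ch in spacing_vowel_signs():
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```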
Northern Thai uses the following dedicated combining marks for vowels. They may be used on their own, or in combination with others (see composite_vowels).
ᩤ [U+1A64 TAI THAM VOWEL SIGN TALL AA] and ᩣ [U+1A63 TAI THAM VOWEL SIGN AA] are both used to represent the same phoneme. The choice of which to use is a question of spelling: the taller version is typically used after the following consonants.
ᩅ ᨴ ᨵ ᨣ
Some textbooks also recommend its use after these characters, too. [e]
ᨧ ᨻ ᩁ ᨽ
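The spelling choice between the two AA signs could be sketched as a small lookup. This is an illustration only: the name `choose_aa` is hypothetical, and the set contains just the first four consonants listed above (WA, LOW TA, LOW THA, LOW KA), not the additional set that some textbooks recommend. Real spelling may have exceptions that this sketch ignores.

```python
VOWEL_SIGN_AA = "\u1A63"       # TAI THAM VOWEL SIGN AA
VOWEL_SIGN_TALL_AA = "\u1A64"  # TAI THAM VOWEL SIGN TALL AA

# Consonants after which the tall form is typically used (assumed set,
# taken from the first list in the text): WA, LOW TA, LOW THA, LOW KA.
TALL_AA_BASES = {"\u1A45", "\u1A34", "\u1A35", "\u1A23"}


def choose_aa(base):
    """Pick the AA vowel sign to write after the given base consonant."""
    return VOWEL_SIGN_TALL_AA if base in TALL_AA_BASES else VOWEL_SIGN_AA


print(repr(choose_aa("\u1A23")))  # LOW KA
print(repr(choose_aa("\u1A20")))  # HIGH KA
```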
ᩢ [U+1A62 TAI THAM VOWEL SIGN MAI SAT] is commonly used as a vowel, but it sometimes also indicates a final -k sound.
ᨯᩢᩬ
ᩴ [U+1A74 TAI THAM SIGN MAI KANG] is used with many words to represent a syllable-final -m or -ŋ (see finals), but it also functions as a vowel when it appears alone or as a component of a composite vowel.
ᨣᩴ
The sound it represents may be ambiguous. For instance, compare the example just above with the one below, where it fulfills the role of syllable-final consonant.
ᨶᩣᩴ
Five vowel signs appear to the left of the base consonant letter or cluster.
These combining marks are stored after the base consonant: the rendering process places the glyph before that of the base consonant. However, the Unicode Consortium is currently examining the coding model for Tai Tham. There is a possibility that pre-base vowel signs may be stored before the consonant in future. Also, some fonts already require this kind of handling, especially for dealing with complex combinations of characters.
The following are also involved in the production of vowel sounds.
The sequence ᩠ᩅ [U+1A60 TAI THAM SIGN SAKOT + U+1A45 TAI THAM LETTER WA] often represents a medial w – especially common after x or k but also occurring after a (dwindling) number of other consonants. However, when no other vowel signs follow (ie. when the inherent vowel is involved), it represents the diphthong ua rather than -wa.
Similarly, the sequence ᩠ᨿ [U+1A60 TAI THAM SIGN SAKOT + U+1A3F TAI THAM LETTER LOW YA] is pronounced as the diphthong ia when it appears alone after a consonant.
Both of these characters also appear as a component in some of the composite vowels described below.
ᩋ [U+1A4B TAI THAM LETTER A] on its own represents the standalone version of the inherent vowel ʔa, and is used as a base for vowel signs when writing other standalone vowels (see standalone). However, it also makes an appearance as a vowel component in 2 composite vowels.
This section lists vowel sounds represented by combinations of the above characters (this list is possibly incomplete).
Some represent plain vowel sounds:
The other composites represent diphthongs, which generally end in one of -a, -j or -w.
The following list shows where vowel signs are positioned around a base consonant to produce vowels, and how many instances of that pattern there are. The figure after the + sign represents combinations of Unicode characters. The list includes subjoined WA and YA, and the postfixed ᩋ.
Distribution of vowel elements is as follows:
ᩢ ᩫ ᩳ ᩴᩘ | |||
ᩮ ᩯ ᩱ ᩰ ᩲ | ᩡ ᩅ ᩣ ᩤ ᩋ | ᩡ | |
ᩩ ᩪ ᩬ | ᩠ᨿ |
Vowel components can occur concurrently on 4 sides of the base, eg. ᩮᩬᩥᩡ.
Characters that don't appear in the combinations:
For vowels not preceded by a consonant Northern Thai generally uses ᩋ [U+1A4B TAI THAM LETTER A] with one or more vowel signs, eg. ᩋᩧ᩠ᨷ
Some standalone vowels can be represented using a set of independent vowel letters. The set includes a consonant character which used alone represents the inherent vowel sound, but the list only covers a small number of possible vowel sounds.
The 5 independent vowel letters are used in syllable-initial position for certain words, but for other words the base+vowel sign approach may be used.
ᩑᨠ
ᩋᩮ᩠ᨶ
This section maps Northern Thai vowel sounds to common graphemes in the Lanna orthography, where open indicates an open syllable, closed a closed syllable, and standalone a standalone vowel.
Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.
For some diphthongs ending in -j or -w, Owen indicates that phonetic sequences exist, but offers no examples. Based on other examples, it is assumed here that -j is formed using sakot+ya, and -w using sakot+wa, except where the preceding vowel sign extends below the baseline (such as for uj).
Inherent vowel
᩠ᨿᩢ◌ [U+1A60 TAI THAM SIGN SAKOT + U+1A3F TAI THAM LETTER LOW YA + U+1A62 TAI THAM VOWEL SIGN MAI SAT] (rare)
ᩮᩥᩢᩬ◌ [U+1A6E TAI THAM VOWEL SIGN E + U+1A65 TAI THAM VOWEL SIGN I + U+1A62 TAI THAM VOWEL SIGN MAI SAT + U+1A6C TAI THAM VOWEL SIGN OA BELOW]
ᩮᩨᩢᩬ◌ [U+1A6E TAI THAM VOWEL SIGN E + U+1A68 TAI THAM VOWEL SIGN UUE + U+1A62 TAI THAM VOWEL SIGN MAI SAT + U+1A6C TAI THAM VOWEL SIGN OA BELOW]
With the high/low categorisation of consonants, Northern Thai writing generally needs only the two combining tone marks below to indicate one of the possible phonetic tones.
If there is a vowel over or below a consonant or consonant stack, the tone mark follows the vowel in storage, and is displayed above or alongside the vowel.
Otherwise, the tone is input after the consonant, ie. before a vowel sign that is displayed to the right or below, and appears over the consonant. [e]
The default fonts used here expect the tone mark to be typed after a vowel sign displayed to the left, if there is one; after a vowel sign above, if there is one; and before a vowel sign to the right. Its order relative to a vowel sign below doesn't seem to matter. Noto agrees, except for vowel signs displayed to the left.
The following chart shows how to tell which tones are associated with a syllable.
Consonant | Syllable | Vowel length / tone mark | Tone |
---|---|---|---|
high | checked | short | 2 |
high | checked | long | 3 |
high | open | - | 1 |
high | open | ᩵ | 3 |
high | open | ᩶ | 5 |
mid | checked | short | 2 |
mid | checked | long | 3 |
mid | open | - | 2 |
mid | open | ᩵ | 3 |
mid | open | ᩶ | 5 |
mid | open | ᩷ | 2 |
mid | open | ᩸ | 1 |
mid | open | ᩹ | 6 |
low | checked | short | 6 |
low | checked | long | 4 |
low | open | - | 2 |
low | open | ᩵ | 4 |
low | open | ᩶ | 6 |
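The chart can be expressed as a lookup table. The following Python sketch simply transcribes the rows of the chart; the names `TONES` and `tone_for` are hypothetical, and the code points U+1A75 to U+1A79 are assumed to be the five tone marks shown in the chart, in the same order. An empty string stands for an open syllable with no tone mark.

```python
# Tone marks, assumed to correspond to the five marks in the chart, in order.
MARK_1, MARK_2, MARK_3, MARK_4, MARK_5 = (
    "\u1A75", "\u1A76", "\u1A77", "\u1A78", "\u1A79"
)

# (consonant class, syllable type, vowel length or tone mark) -> tone number
TONES = {
    ("high", "checked", "short"): 2,
    ("high", "checked", "long"): 3,
    ("high", "open", ""): 1,
    ("high", "open", MARK_1): 3,
    ("high", "open", MARK_2): 5,
    ("mid", "checked", "short"): 2,
    ("mid", "checked", "long"): 3,
    ("mid", "open", ""): 2,
    ("mid", "open", MARK_1): 3,
    ("mid", "open", MARK_2): 5,
    ("mid", "open", MARK_3): 2,
    ("mid", "open", MARK_4): 1,
    ("mid", "open", MARK_5): 6,
    ("low", "checked", "short"): 6,
    ("low", "checked", "long"): 4,
    ("low", "open", ""): 2,
    ("low", "open", MARK_1): 4,
    ("low", "open", MARK_2): 6,
}


def tone_for(consonant_class, syllable_type, length_or_mark=""):
    """Look up the tone number for one row of the chart."""
    return TONES[(consonant_class, syllable_type, length_or_mark)]


print(tone_for("high", "open"))
print(tone_for("low", "checked", "long"))
```

Combinations that the chart does not list raise a `KeyError`, which keeps the sketch from guessing at values the source does not give.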
For a mapping of sounds to graphemes see consonant_mappings.
The lists below show consonants in the Northern Thai repertoire. The letters h, m, and l indicate the class of the consonant. This list includes some sequences to indicate high class forms when there is no single letter for that. Where 2 pronunciations are given, the first is for syllable-initial, and the second for syllable-final use.
ʨʰ is not a native Northern Thai sound, but rather associated with reading the alphabet out loud and in learned pronunciation of Pali loanwords. [o, 142]
A few consonants have different phonetic realisations in Tai Khün, and ᨢ [U+1A22 TAI THAM LETTER HIGH KXA] is not used by Tai Khün.
High and low consonants usually come in pairs, but where they don't the high variant is normally given by subjoining the low consonant below ᩉ [U+1A49 TAI THAM LETTER HIGH HA].
ᩉ᩠ᨶᩧ᩵ᨦ
These combinations are included in the charts above.
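The construction of a high class form from a low class consonant can be sketched as a code point sequence: HIGH HA, then SAKOT, then the low consonant. The function name `make_high_class` is hypothetical, and the sketch does not check whether the argument is actually a low class consonant lacking a high class partner.

```python
HIGH_HA = "\u1A49"  # TAI THAM LETTER HIGH HA
SAKOT = "\u1A60"    # TAI THAM SIGN SAKOT


def make_high_class(low_consonant):
    """Subjoin a low class consonant below HIGH HA."""
    return HIGH_HA + SAKOT + low_consonant


# NA (U+1A36) subjoined below HIGH HA, as at the start of the example word above.
sequence = make_high_class("\u1A36")
print([f"U+{ord(ch):04X}" for ch in sequence])
```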
ᩋ [U+1A4B TAI THAM LETTER A] represents a glottal stop.
It can be used with vowels at the beginning of a syllable, or on its own to indicate a standalone sound corresponding to the inherent vowel (see standalone).
ᩋᩧ᩠ᨷ
ᩋᩉ᩠ᨿᩢᨦ
It has very different shapes in Northern Thai text ᩋ and Khün text ᩋ.
The first of these is a special-use consonant diacritic. The second two are ligatures.
ᩛ [U+1A5B TAI THAM CONSONANT SIGN HIGH RATHA OR LOW PA] represents two different functions with the same appearance. It represents ᨮ [U+1A2E TAI THAM LETTER HIGH RATHA] in ᩈᨱᩛᩣ᩠ᨶ sṇ̱ᵽā˖ṉ 'shape' [e], and it represents ᨻ [U+1A3B TAI THAM LETTER LOW PA] in ᩋᨾᩛ ʔ̯m̱ᵽ 'mango'. Compare with the somewhat rare subjoined form [e], eg. ᨷᩢᨱ᩠ᨻᨷᩩᩁᩩᩇ b̯áṇ̱˖p̄b̯uruṣ 'disciple'.
ᩓ [U+1A53 TAI THAM LETTER LAE] represents the combination ᩃᩯ [U+1A43 TAI THAM LETTER LA + U+1A6F TAI THAM VOWEL SIGN AE], eg. ᩈᩮᩓ᩠ᩅ᩶
ᩔ [U+1A54 TAI THAM LETTER GREAT SA] represents geminated ᩈ [U+1A48 TAI THAM LETTER HIGH SA].
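For tasks such as searching, it may be useful to treat LAE and the sequence it stands for as equivalent. The following Python sketch shows one possible approach. The function name `expand_lae` is hypothetical, and the mapping is an application-level assumption based on the description above, not a Unicode-defined decomposition. GREAT SA is left alone, because the text does not specify the exact sequence it replaces.

```python
LAE = "\u1A53"             # TAI THAM LETTER LAE
LA_PLUS_AE = "\u1A43\u1A6F"  # TAI THAM LETTER LA + TAI THAM VOWEL SIGN AE


def expand_lae(text):
    """Replace each LAE with the LA + AE sequence it represents."""
    return text.replace(LAE, LA_PLUS_AE)


sample = "\u1A48" + LAE  # HIGH SA followed by LAE, for illustration only
print([f"U+{ord(ch):04X}" for ch in expand_lae(sample)])
```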
Tai Tham is unusual in that subjoined consonants do not only appear where there are consonant clusters. There is a natural tendency to attempt to stack consonants, usually 2 high, whenever possible.
᩠ U+1A60 TAI THAM SIGN SAKOT is the (always) invisible character used to produce the subjoined form of a consonant, eg. compare the following:
ᨠᨠ kk [U+1A20 TAI THAM LETTER HIGH KA + U+1A20 TAI THAM LETTER HIGH KA]
ᨠ᩠ᨠ kk [U+1A20 TAI THAM LETTER HIGH KA + U+1A60 TAI THAM SIGN SAKOT + U+1A20 TAI THAM LETTER HIGH KA]
Sakot doesn't always kill the inherent vowel between two consonants, nor does it create conjuncts, in the sense of merged shapes, but subjoined forms of consonants typically have a different and smaller shape compared to the standard form.
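Because the sakot is invisible, it can help to find subjoined letters programmatically. This short Python sketch lists every character that immediately follows a sakot; the function name `subjoined_letters` is hypothetical.

```python
SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT


def subjoined_letters(text):
    """Return the characters that directly follow a sakot, in order."""
    return [text[i + 1] for i, ch in enumerate(text[:-1]) if ch == SAKOT]


side_by_side = "\u1A20\u1A20"     # HIGH KA + HIGH KA
stacked = "\u1A20\u1A60\u1A20"    # HIGH KA + SAKOT + HIGH KA

print(len(subjoined_letters(side_by_side)))
print(len(subjoined_letters(stacked)))
```

The two strings correspond to the two sequences compared above: only the second contains a subjoined letter.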
Sakot can follow a vowel sign. For example, in the following word the sakot is used to position the final consonant in the syllable below the vowel sign. This is quite common.
ᩈᩣ᩠ᨾ
A subjoined consonant can also follow a digit.
᪓᩠ᨴ
Subjoined consonants are not only syllable-final consonants. The first consonant in a following syllable may also be subjoined, as in the following example (where final r is pronounced as n). [e] [u, 654]
ᨳ᩠ᨶᩫ᩻ᩁ
This list shows consonants in their normal and subjoined forms. Not all consonants traditionally have subjoined forms, but modern innovations in borrowed terminology suggest that fonts should provide them for all consonants except the old vocalic letters. [u, 654] You may find that the font applied here doesn't handle all combinations well.
᩻ [U+1A7B TAI THAM SIGN MAI SAM] is used in Northern Thai to identify double-acting consonants, or to indicate that a subjoined consonant begins a new syllable, eg. compare the following (where final r is pronounced as n). [e]
ᨳᩫ᩠ᨶᩁ tʰo˖ṉṟ tʰonra
ᨳ᩠ᨶᩫ᩻ᩁ tʰo˖ṉʻṟ tʰanon
(It is also used to repeat a word.)
The following are used to represent the second consonant in syllable-initial clusters.
ᩕ [U+1A55 TAI THAM CONSONANT SIGN MEDIAL RA] after a stop generally produces aspiration, or converts the sound to x, but it may also be pronounced -l.
ᨠᩕᩣ᩠ᨷ
ᨣᩕᩢ᩠ᨷ
ᩖ [U+1A56 TAI THAM CONSONANT SIGN MEDIAL LA] is commonly not pronounced, however it is also found in the combination ᩉᩖ [U+1A49 TAI THAM LETTER HIGH HA + U+1A56 TAI THAM CONSONANT SIGN MEDIAL LA] which creates a high class letter with the sound l.
ᨠᩖᩣ᩶
ᩉᩖᩢᨠ
A medial -w also occurs, but there is no dedicated character for it. Instead it is produced using an ordinary WA which is subjoined using the sakot, ie. ᩠ᩅ [U+1A60 TAI THAM SIGN SAKOT + U+1A45 TAI THAM LETTER WA]. Such clusters are generally limited to kw and xw. Some other combinations are occasionally found, though they appear to be heading towards obsolescence. [wnl, #Consonants]
ᨣ᩠ᩅᩣ᩠ᨿ
Other syllable-initial clusters include the combination of ᩉ [U+1A49 TAI THAM LETTER HIGH HA] plus a subjoined low class consonant to make the consonant high class (see highclass). These combinations are not pronounced as multiple consonants.
Northern Thai text commonly renders syllable-final consonants using regular consonant code points (see the example just below), but sometimes special combining characters are used.
ᩑᨠ
When regular consonants are used they are commonly subjoined, eg.
ᨠᩣ᩠ᩁ
There are, however, exceptions. For example, when preceded by a subscript vowel a final consonant may be rendered on the baseline, eg.
ᩃᩪᨠ
Northern Thai tends to add sub-base vowels below a consonant stack, whereas Khün typically shifts the vowel to the right of the stack (see fig_kiss).
Due to font design or USE (the Universal Shaping Engine) the characters may have to be typed in an order that departs from the spoken order so that they look as expected. For example, the word in fig_kiss is stored as CCV, whereas it is pronounced CVC.
The following diacritics are sometimes used for syllable-final consonants.
ᩴ [U+1A74 TAI THAM SIGN MAI KANG] may be used as a vowel, or to represent a syllable-final nasal. The use is sometimes ambiguous (see combiningvowels).
ᩘ [U+1A58 TAI THAM SIGN MAI KANG LAI] can also be used to represent a syllable-final nasal. Note that this diacritic has a very different shape in the Khün orthography. Compare ᩅᩘ and ᩅᩘ.
ᩢ [U+1A62 TAI THAM VOWEL SIGN MAI SAT] is commonly used as a vowel, but it also sometimes functions to indicate a final -k sound, eg. ᨯᩢᩬ
ᩝ [U+1A5D TAI THAM CONSONANT SIGN BA] and ᩞ [U+1A5E TAI THAM CONSONANT SIGN SA] appear to be alternative shapes for the normal subjoined consonants, used per writer preference (follow the links for more information).
᩺ [U+1A7A TAI THAM SIGN RA HAAM] is used in Northern Thai to silence one or more characters in a word. It is not always clear which sound or sounds are cancelled.
ᨵᨾ᩠ᨾ᩺
ᨼᩥᩃ᩠ᨾ᩺
In Lü it is used as a final n; in Khün it is used as a final r.
This section maps Northern Thai consonant sounds to common graphemes in the Lanna orthography, where h indicates high class, m is mid class, l is low class, and f indicates a final consonant.
Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.
ᨻ ᩠ᨻ [U+1A3B TAI THAM LETTER LOW PA]
ᩛ [U+1A5B TAI THAM CONSONANT SIGN HIGH RATHA OR LOW PA], commonly used for the subjoined form of ᨻ [U+1A3B TAI THAM LETTER LOW PA]. [Wiktionary, https://en.wiktionary.org/wiki/%E1%A8%BB#Translingual]
ᨸ (᩠ᨸ) [U+1A38 TAI THAM LETTER HIGH PA]
ᨻ ᩠ᨻ [U+1A3B TAI THAM LETTER LOW PA]
ᨷ ᩠ᨷ [U+1A60 TAI THAM SIGN SAKOT + U+1A37 TAI THAM LETTER BA]
ᨷ ᩠ᨷ [U+1A37 TAI THAM LETTER BA] when syllable-initial
ᩝ [U+1A5D TAI THAM CONSONANT SIGN BA] is an optional alternative to the normal subjoined form of ᨷ [U+1A37 TAI THAM LETTER BA]
ᨧ ᩠ᨧ [U+1A27 TAI THAM LETTER HIGH CA]
ᨩ ᩠ᨩ [U+1A29 TAI THAM LETTER LOW CA]
ᨭ ᩠ᨭ [U+1A2D TAI THAM LETTER RATA], eg.
ᨯ ᩠ᨯ [U+1A2F TAI THAM LETTER DA]
ᨰ ᩠ᨰ [U+1A30 TAI THAM LETTER LOW RATHA]
ᨲ ᩠ᨲ [U+1A32 TAI THAM LETTER HIGH TA]
ᨳ ᩠ᨳ [U+1A33 TAI THAM LETTER HIGH THA]
ᨴ ᩠ᨴ [U+1A34 TAI THAM LETTER LOW TA], eg.
ᨵ ᩠ᨵ [U+1A35 TAI THAM LETTER LOW THA]
ᩆ ᩠ᩆ [U+1A46 TAI THAM LETTER HIGH SHA]
ᩇ ᩠ᩇ [U+1A47 TAI THAM LETTER HIGH SSA], eg.
ᨠ ᩠ᨠ [U+1A20 TAI THAM LETTER HIGH KA]
ᨣ ᩠ᨣ [U+1A23 TAI THAM LETTER LOW KA]
ᩢ [U+1A62 TAI THAM VOWEL SIGN MAI SAT] Also used as a vowel.
Pronounced but not written after a short, open vowel.
ᩈ ᩠ᩈ [U+1A48 TAI THAM LETTER HIGH SA]
ᩆ ᩠ᩆ [U+1A46 TAI THAM LETTER HIGH SHA]
ᩇ ᩠ᩇ [U+1A47 TAI THAM LETTER HIGH SSA]
ᨨ ᩠ᨨ [U+1A28 TAI THAM LETTER HIGH CHA]
ᩞ [U+1A5E TAI THAM CONSONANT SIGN SA] as an optional alternative to the normal subjoined form of ᩠ᩈ [U+1A48 TAI THAM LETTER HIGH SA].
ᩔ [U+1A54 TAI THAM LETTER GREAT SA] when geminated, as a ligature
ᨱ ᩠ᨱ [U+1A31 TAI THAM LETTER RANA]
ᨶ ᩠ᨶ [U+1A36 TAI THAM LETTER NA]
ᩁ ᩠ᩁ [U+1A41 TAI THAM LETTER RA]
ᩃ ᩠ᩃ [U+1A43 TAI THAM LETTER LA]
ᩊ ᩠ᩊ [U+1A4A TAI THAM LETTER LLA]
ᨦ ᩠ᨦ [U+1A26 TAI THAM LETTER NGA]
ᩴ [U+1A74 TAI THAM SIGN MAI KANG]
ᩘ [U+1A58 TAI THAM SIGN MAI KANG LAI] esp. before s, h and l.
ᩅ ᩠ᩅ [U+1A45 TAI THAM LETTER WA]
As part of a diphthong, this is typically rendered using the subjoined form. See consonant_vowels.
ᩃ ᩠ᩃ [U+1A43 TAI THAM LETTER LA]
ᩊ ᩠ᩊ [U+1A4A TAI THAM LETTER LLA]
ᩁ ᩠ᩁ [U+1A41 TAI THAM LETTER RA]
In ligature for lɛː, ᩓ [U+1A53 TAI THAM LETTER LAE]
ᨬ ᩠ᨬ [U+1A2C TAI THAM LETTER NYA]
ᨿ ᩠ᨿ [U+1A3F TAI THAM LETTER LOW YA]
As part of a diphthong, this is typically rendered using the subjoined form. See consonant_vowels.
A number of questions need to be addressed with regards to ordering of characters in composite vowels and stacks. These have been discussed by Unicode experts, but no conclusions have yet been reached. Here we will list some examples.
A number of choices made for this page are enforced by the Hariphunchai font which is used, however other fonts allow different choices.
A first example concerns the order of vowel components that include a subjoined YA. The following produce identical output for the composite vowel pronounced ia in the Payap Lanna, Lamphun, and Hariphunchai fonts (but not in the Noto Sans Tai Tham font).
ᨠ᩠ᨿᩮ kia [KA + SAKOT + YA + E]
ᨠᩮ᩠ᨿ kia [KA + E + SAKOT + YA]
In this case, the first alternative, which subjoins the YA to the KA and follows it with the E vowel sign, seems the more intuitive, since the diphthong produced is ia.
However, change the sound to aj and the above logic would lean towards the second of the two following encodings, which also produce identical results.
ᨠ᩠ᨿᩢ kaj [KA + SAKOT + YA + MAI SAT]
ᨠᩢ᩠ᨿ kaj [KA + MAI SAT + SAKOT + YA]
This is somewhat unusual for Unicode, since it involves an invisible stacker appearing after a vowel mark (there is no other way of producing the right shape for the semivowel j).
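Although such pairs of sequences may look the same in some fonts, they are different strings for software. The Python sketch below compares the two encodings of kia after Unicode normalization, assuming that YA here is LOW YA (U+1A3F), as in the earlier description of subjoined YA. Since normalization does not reorder a vowel sign relative to a following sakot plus letter, the two forms remain distinct, so searching and comparison would need an agreed order or extra application logic.

```python
import unicodedata

KA, SAKOT, LOW_YA, SIGN_E = "\u1A20", "\u1A60", "\u1A3F", "\u1A6E"

first = KA + SAKOT + LOW_YA + SIGN_E   # KA + SAKOT + YA + E
second = KA + SIGN_E + SAKOT + LOW_YA  # KA + E + SAKOT + YA


def same_after_normalization(a, b, form="NFC"):
    """Report whether two strings are identical once normalized."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(same_after_normalization(first, second))
print(sorted(first) == sorted(second))  # same code points, different order
```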
The following 2 sequences produce identical visual results.
ᨠᩴ᩠ᩋ kɔː(?) [HIGH KA + MAI KANG + SAKOT + A]
ᨠᩬᩴ kɔː(?) [HIGH KA + VOWEL SIGN OA BELOW + MAI KANG]
There appears to be some question about which is the appropriate sequence for the composite vowel -ɔː.
The meaning of each of the logographs is shown above. Unicode classes these symbols as punctuation.
᩿ [U+1A7F TAI THAM COMBINING CRYPTOGRAMMIC DOT] is used singly or multiply beneath letters to give each letter a different value according to some hidden agreement between reader and writer. [u, 665]
Two sets of digits are in common use: a secular set (Hora) and an ecclesiastical set (Tham). European digits are also found in books. [u, 665]
Northern Thai text runs left to right in horizontal lines.
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
You can experiment with examples using the Northern Thai character app.
The orthography has no case distinction, and no special transforms are needed to convert between characters.
Since there are no conjuncts, there is not so much contextual shaping in Tham as in many other Brahmi-descended scripts.
In addition to the regular differences in shape of glyphs in Northern Thai and Tai Khün, the shapes of certain glyphs in Northern Thai texts may also vary, depending on the region or source.
By way of a further example, the Payap Lanna and Hariphunchai fonts differ not only in styling: some glyphs are substantially different. The following table shows glyph shapes for various characters in both fonts.
Hariphunchai | ᩳ | ᩘ | ᩝ | ᩞ | ᨫ | ᪬ | ᪥ | ᪢ |
---|---|---|---|---|---|---|---|---|
Payap Lanna | ᩳ | ᩘ | ᩝ | ᩞ | ᨫ | ᪬ | ᪥ | ᪢ |
Northern Thai text relies on rules to correctly position glyphs and shape them according to the surrounding text.
One major area where this applies is in the use of subjoined forms for consonant stacks (see clusters). Many of the subjoined forms of a letter are substantially different and/or smaller than the normal letter glyph, but the character in memory is the same.
The following is a selection of other examples of contextual shaping and positioning.
Placement of tone marks may involve special shaping and positioning. In some fonts a tone mark is displayed alongside a superscript vowel sign, rather than above it.
A number of code point sequences may be ligated by a font.
Spaces separate phrases. There is no separation of individual words.
A new word may start with a subjoined consonant. Stacking is performed across word boundaries. This means that operations such as line-breaking, word highlighting, etc. have to use an orthographic syllable unit which differs from the underlying phonetic syllables.
Northern Thai uses a variety of native punctuation, and only a couple of ASCII code points.
The following punctuation marks have "progressive values of finality".
European punctuation such as question marks and exclamation marks are also used.
᪣ [U+1AA3 TAI THAM SIGN KEOW], ᪤ [U+1AA4 TAI THAM SIGN HOY], ᪥ [U+1AA5 TAI THAM SIGN DOKMAI], and ᪭ [U+1AAD TAI THAM SIGN CAANG] are all used as section starters, sometimes in conjunction with other punctuation [e], eg.
᪩᪥᪩
᪭ᩣ
To close a section, use ᪦ [U+1AA6 TAI THAM SIGN REVERSED ROTATED RANA] and/or ᪬ [U+1AAC TAI THAM SIGN HANG], eg.
᪦᪦᪩
᪩᪦᪩
᪩᪦᪩᪬
᪦᪦᪬
Northern Thai commonly uses ASCII parentheses to insert parenthetical information into text.
| start | end |
---|---|---|
standard | ( | ) |
Northern Thai texts use quotation marks around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.
start | end | |
---|---|---|
initial | ” [U+201D RIGHT DOUBLE QUOTATION MARK] |
ᪧ [U+1AA7 TAI THAM SIGN MAI YAMOK] indicates reduplication of the preceding word, eg.
ᨴᩩᨠᪧ
ᩃᩡᩋᩬ᩵ᩁᪧ
Adverbs are often derived by reduplicating an adjective.o,149
᩻ [U+1A7B TAI THAM SIGN MAI SAM] is also used in Northern Thai to indicate repetition of a word [e], eg.
ᨲᩣ᩠᩵᩻ᨦ
There are no spaces between words in Northern Thai to serve as line-break opportunities. Lines must be broken at orthographic syllable boundaries. Since the onset consonant of a word or syllable may be subjoined below a previous consonant, and stacks must not be broken, orthographic syllable units typically don't match phonetic syllables or words.
In-word line-breaking is a fact of life, because stacks cannot be broken, but no hyphen or other character is used to indicate that a word was broken.
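The constraint that stacks must not be broken can be sketched as a filter over candidate break positions. The Python example below is a minimal illustration, not a line-breaking algorithm: it only rules out positions that would split a stack or detach a combining mark from its base, and it does not find syllable boundaries. The function name `can_break_before` is hypothetical.

```python
import unicodedata

SAKOT = "\u1A60"  # TAI THAM SIGN SAKOT


def can_break_before(text, index):
    """Return False where a break would split a stack or detach a mark."""
    if index <= 0 or index >= len(text):
        return False
    if text[index - 1] == SAKOT:
        # The character at index is subjoined to what precedes it.
        return False
    if unicodedata.category(text[index]).startswith("M"):
        # Combining marks (including sakot itself) stay with their base.
        return False
    return True


word = "\u1A48\u1A63\u1A60\u1A3E"  # HIGH SA + AA + SAKOT + MA
print([i for i in range(len(word) + 1) if can_break_before(word, i)])
```

A position that passes this filter is only a candidate. Deciding whether it is a real orthographic syllable boundary still needs the kind of analysis described above.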
As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.
Show (default) line-breaking properties for characters in the Northern Thai orthography.
The following list gives examples of typical behaviours for some of the characters used in modern Northern Thai. Context may affect the behaviour of some of these and other characters. Most of the Northern Thai characters, including native punctuation, will prevent a line break before or after, and require morphological analysis to determine break opportunities, in a way similar to a hyphenation algorithm. No break opportunities will be found otherwise. Complex context analysis, often involving dictionary lookup of some form, is required to determine non-emergency line breaks. If such analysis is not available,.
Line breaking should not move a danda or double danda to the beginning of a new line even if they are preceded by a space character.
Northern Thai uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.
Northern Thai places vowel and tone marks above base characters, one above the other, and can also add combining characters below the line. It also stacks characters, though stacks are usually limited in height. The complexity of the text means that the vertical resolution needed for clearly readable Northern Thai text is higher than for English, or most Latin text. In addition,
To give an approximate idea, fig_baselines compares Latin and Northern Thai glyphs from the Payap Lanna font. The basic height of Northern Thai letters is typically around the Latin x-height, however extenders and combining marks reach well beyond the Latin ascenders and descenders, creating a need for larger line spacing.
This section is for any features that are specific to Northern Thai and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.
Many thanks are due to Richard Wordingham and Patrick Chew for reviewing the initial draft of this material and sending suggestions.