Updated 16 April, 2022
This page brings together basic information about the Thai script and its use for the Thai language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Thai using Unicode.
ข้อ 1 มนุษย์ทั้งหลายเกิดมามีอิสระและเสมอภาคกันในเกียรติศักดิ์และสิทธิ ต่างมีเหตุผลและมโนธรรม และควรปฏิบัติต่อกันด้วยเจตนารมณ์แห่งภราดรภาพ
ข้อ 2 ทุกคนย่อมมีสิทธิและอิสรภาพบรรดาที่กำหนดไว้ในปฏิญญานี้ โดยปราศจากความแตกต่างไม่ว่าชนิดใด ๆ ดังเช่น เชื้อชาติ ผิว เพศ ภาษา ศาสนา ความคิดเห็นทางการเมืองหรือทางอื่น เผ่าพันธุ์แห่งชาติ หรือสังคม ทรัพย์สิน กำเนิด หรือสถานะอื่น ๆ อนึ่งจะไม่มีความแตกต่างใด ๆ ตามมูลฐานแห่งสถานะทางการเมือง ทางการศาล หรือทางการระหว่างประเทศของประเทศหรือดินแดนที่บุคคลสังกัด ไม่ว่าดินแดนนี้จะเป็นเอกราช อยู่ในความพิทักษ์มิได้ปกครองตนเอง หรืออยู่ภายใต้การจำกัดอธิปไตยใด ๆ ทั้งสิ้น
The Thai script is used primarily for writing the Thai language, as well as Northern Thai, Northeastern Thai, Southern Thai, and Thai Song, which are separate languages. It is also used to write a number of minority languages in Thailand, Laos and China, as well as Pali, which is widely used in Buddhist temples and monasteries.
The alphabet was derived from the Old Khmer script, which descended from Pallava. Thai tradition attributes the creation of the script to King Ramkhamhaeng the Great (พ่อขุนรามคำแหงมหาราช pʰo kʰun raːm kʰam ŋɛː ma haː raː tɕʰa) in 1283, though this has been challenged.
Both the Thai language and script are closely related to Lao and its script.
Sources: Scriptsource, Wikipedia.
Thai is an abugida. Consonant letters have an inherent vowel sound. Vowel-signs are attached to the consonant to produce a different vowel. See the table to the right for a brief overview of features for the modern Thai orthography.
Thai text runs left to right in horizontal lines.
Spaces separate phrases, rather than words.
Each onset consonant is associated with a high, mid, or low class related to tone. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark.
No conjuncts are used for consonant clusters.
Syllable-initial clusters and syllable-final consonant sounds are all written with ordinary consonant letters. It can therefore be difficult to algorithmically detect syllable boundaries.
The inherent vowel is pronounced o inside a closed syllable, and a in an open syllable. 15 vowel-signs (including 5 pre-base vowels), and 5 consonants/diacritics represent non-inherent vowels. Only the 7 vowel-signs that appear above or below the consonant are combining marks; the others are ordinary spacing characters that are typed in the order seen. Vowels are often written differently when they appear in a closed vs. open syllable.
There are no independent vowels, and standalone vowel sounds are written using vowel-signs applied to อ [U+0E2D THAI CHARACTER O ANG].
This page lists 37 composite vowels (made from 12 vowel-signs, and 4 consonants/diacritics). Composite vowels can involve up to 4 glyphs (plus a tone mark), and glyphs can surround the base consonant(s) on up to 3 sides, eg. เกี๊ยะ
Thai has vocalics.
Thai has native digits, and they are commonly used.
Thai syllables allow the following patterns, where V can be a short or a long vowel.
V VC CV CVC CCV CCVC
The long vs. short vowel distinction is phonemically important. Long vowels are approximately twice the length of short ones. All open syllables have long vowels.
Consonant clusters only occur in syllable-initial position, with the following permissible combinations:
Syllable-final consonants can be one of the following. Stops are unreleased.
-p̚ -t̚ -k̚ -m -n -ŋ -j -w
|Close||i iː||ɯ ɯː||u uː|
|Close-mid||e eː||ɤ ɤː||o oː|
|Open-mid||ɛ ɛː||ɔ ɔː|
|Open||a aː||ɑ ɑː|
|Close||iə iːə iw
|uə uːə uj uːj
|Open||aj aːj aw aːw|
The majority of diphthongs and all 3 triphthongs in Thai end in j or w. The exceptions are a handful of diphthongs that end in ə.
|stops||p b||t d||k||ʔ|
Thai is a contour tone language, with 5 tones: high, mid, low, falling, and rising.
The following table provides typical phonological transcriptions and descriptions for the five tones.
|high||á||˦˥||unchecked or checked syllables|
|mid||a/ā||˧||unchecked syllables only|
|low||à||˨˩||unchecked or checked syllables|
|rising||ǎ||˧˨˧||unchecked syllables only|
|falling||â||˥˩||unchecked or checked syllables|
Thai vowels all come in short and long forms, which are phonemically distinctive. A set of diphthongs end in a̯, and most vowels can be followed by either w or j.
Short vowels in open syllables usually end with a glottal stop.
See also vocalics.
This section maps Thai vowel sounds to common graphemes in the Thai orthography. The dotted circle indicates the location of the consonant relative to the vowel-sign; if there are 2 circles, the vowel is used only in closed syllables. Click on a grapheme to find other mentions on this page (links appear at the bottom of the page). Click on the character name to see examples and for detailed descriptions of the character(s) shown.
Inherent vowel (usually mid-word)
Wiktionary provides a very useful table of Thai rhymes.
The inherent vowel is pronounced o inside a closed syllable, and a in an open syllable. So ka can be written by simply using the consonant letter ก [U+0E01 THAI CHARACTER KO KAI], and kon by just the 2 consonants, กน [U+0E01 THAI CHARACTER KO KAI + U+0E19 THAI CHARACTER NO NU]. Example of a single word using both inherent vowels: ถนน
A third inherent vowel, ɔː, occurs before a syllable-final RA (which is pronounced n), eg. ศร นคร
Non-inherent vowel sounds that follow a consonant can be represented using vowel-signs, eg. ki is written กิ [U+0E01 THAI CHARACTER KO KAI + U+0E34 THAI CHARACTER SARA I].
An orthography that uses vowel-signs is different from one that uses simple diacritics or letters for vowels, in that the vowel-signs are generally attached to the syllable, rather than just applied to the letter of the immediately preceding consonant (see prescript_vowels). In Thai, vowel characters may be used on their own, or in combination with other characters (see composite_vowels).
Vowels in Thai are written with a mixture of combining characters and ordinary spacing characters. Only the superscript and subscript vowel-signs are combining characters. It is also common to use some consonants to represent vowel sounds (see consonant_vowels).
As shown above, a given sound may be written differently depending on whether it appears in an open or a closed syllable. Closed syllables have a written consonant after the vowel.
Thai uses the following combining characters for vowels.
็ [U+0E47 THAI CHARACTER MAITAIKHU] is used to shorten vowel sounds, but also occasionally operates as a vowel-sign in its own right.
It converts vowels produced by the following three vowel signs to short vowels when they occur in medial position:
It is also used for ew เ◌็ว (eːw > ew), eg. เร็ว
One word consists of this diacritic over a consonant with no vowel-sign: ก็
The following additional, vowel-specific characters are ordinary spacing characters, with the general category of 'letter'.
ำ [U+0E33 THAI CHARACTER SARA AM] is classed as a vowel, but also contains the final consonant m, represented by a built-in nikhahit.
Used in Pali and Sanskrit, ํ [U+0E4D THAI CHARACTER NIKHAHIT] is not commonly used alone in Thai, except that when letter-spacing Thai text it is necessary to add the space between the circle and the remainder of ำ [U+0E33 THAI CHARACTER SARA AM]. See inter_character_spacing.
The separation is not produced by NFD normalisation.
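As a rough illustration of the point about normalisation, the following Python sketch uses the standard unicodedata module to compare the canonical (NFD) and compatibility (NFKD) forms of SARA AM. The helper name code_points is purely illustrative. It suggests that the split into NIKHAHIT + SARA AA comes only from the compatibility mapping.

```python
import unicodedata

SARA_AM = "\u0E33"  # ำ THAI CHARACTER SARA AM


def code_points(text: str, form: str) -> list[str]:
    """Normalise text to the given form and list the resulting code points."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize(form, text)]


# Canonical decomposition leaves SARA AM as a single character.
print(code_points(SARA_AM, "NFD"))   # ['U+0E33']
# Compatibility decomposition yields NIKHAHIT + SARA AA.
print(code_points(SARA_AM, "NFKD"))  # ['U+0E4D', 'U+0E32']
```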
The following characters are also used to create vowel sounds, either alone or as part of a composite vowel.
The consonant อ [U+0E2D THAI CHARACTER O ANG] can also be pronounced as the vowel ɔː when it appears alone after a base consonant.
Many of the composite vowels involve ย [U+0E22 THAI CHARACTER YO YAK] and/or ว [U+0E27 THAI CHARACTER WO WAEN] to create diphthongs.
The consonant ร [U+0E23 THAI CHARACTER RO RUA] is pronounced as a vowel a when doubled medially, eg. ธรรม When doubled at the end of a syllable it is pronounced an, eg. กรรไกร Note, however, that this may also constitute the end and beginning of two syllables, eg. ภรรยา
Five vowel-signs appear to the left of the onset consonant, eg. ไข่
Thai uses a visual encoding model and these are not combining characters. They are typed and stored before the base.
These vowel-signs are placed before the start of the syllable. This means that a word with a consonant cluster at the start separates the prescript vowel from any postscript vowels by more than one consonant character, eg. เปล่า โปรแกรม
แ [U+0E41 THAI CHARACTER SARA AE] should not be typed as two successive เ [U+0E40 THAI CHARACTER SARA E] characters.
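Because two SARA E characters in a row can look much like a single SARA AE, a text checker might want to flag the doubled sequence. The sketch below is one minimal way to do that in Python; the function name is hypothetical, and it assumes that two adjacent SARA E characters are always a typing error.

```python
SARA_E = "\u0E40"   # เ THAI CHARACTER SARA E
SARA_AE = "\u0E41"  # แ THAI CHARACTER SARA AE


def find_doubled_sara_e(text: str) -> list[int]:
    """Return the index of each SARA E that is immediately followed by another SARA E."""
    return [
        i
        for i in range(len(text) - 1)
        if text[i] == SARA_E and text[i + 1] == SARA_E
    ]
```

A caller could report these positions to the author, or replace each flagged pair with a single SARA AE once the assumption above has been confirmed for the text in question.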
Some composite vowels represent plain vowel sounds:
The other composites represent diphthongs, which generally end in one of ə̯, i, or w.
For some, the spelling isn't completely obvious.
In many other cases, a glide consonant is simply added after one of the vowels seen earlier.
Finally, two vocalic letters can be lengthened using an additional, special character.
The following list shows where vowel-signs are positioned around a base consonant to produce vowels, and how many instances of that pattern there are. The figure after the + sign represents combinations of Unicode characters.
At maximum, vowel components can occur concurrently on 3 sides of the base.
Distribution of vowel elements is as follows:
|ั ิ ี ึ ื ็||ำ|
|เ แ ใ ไ โ||อ ะ า ย ว ๅ||ะ ย|
Characters that don't appear in the combinations:
Thai uses อ [U+0E2D THAI CHARACTER O ANG] as a base for vowel signs, eg. อิ่มสะอาด
อ [U+0E2D THAI CHARACTER O ANG] on its own represents the same sound as the inherent vowel, eg. อเมริกา
There are no independent vowel letters in Thai.
Vowel absence after syllable-final consonants is not normally marked in any way. Nor is it marked in syllable-initial clusters.
์ [U+0E4C THAI CHARACTER THANTHAKHAT] can be used above a consonant or syllable when it is not pronounced (usually at the end of a syllable), eg. รถเมล์ศักดิ์สิทธิ์ It is often used for foreign loan words, eg. คอมพิวเตอร์ โปสการ์ด สแตมป์
Each onset consonant is associated with a 'high', 'mid', or 'low' class related to, but not indicative of, tone. (For example, when they appear without tone marks the 'high' class consonants produce a rising tone, and 'mid' or 'low' class consonants both produce a mid tone.)
Tone is also affected by the use of the following combining marks on unchecked syllables; however, in 2 cases the result of their use is also context-dependent, due to historical linguistic changes. (For example, ่ [U+0E48 THAI CHARACTER MAI EK] can produce either a low tone or a falling tone, depending on the class of the onset.)
In the end, tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), vowel length (for checked syllables), plus any tone mark.
The following table shows the various ways of writing tones in checked syllables. Only 3 tones are available, and no diacritics are used. Vowel length changes the tone after a low register consonant.
The next table shows the various ways of writing tones in unchecked syllables. All 5 tones are possible.
|high tone||MID||๊ [U+0E4A THAI CHARACTER MAI TRI]|
|LOW||้ [U+0E49 THAI CHARACTER MAI THO]|
|low tone||HIGH||่ [U+0E48 THAI CHARACTER MAI EK]|
|MID||่ [U+0E48 THAI CHARACTER MAI EK]|
|MID||๋ [U+0E4B THAI CHARACTER MAI CHATTAWA]|
|falling tone||HIGH||้ [U+0E49 THAI CHARACTER MAI THO]|
|MID||้ [U+0E49 THAI CHARACTER MAI THO]|
|LOW||่ [U+0E48 THAI CHARACTER MAI EK]|
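The tone rules described above can be sketched as a small lookup in Python. This is a minimal sketch, not a complete model: the function and parameter names are hypothetical, the caller is assumed to already know the consonant class, syllable type and vowel length, and the checked-syllable branch follows the usual textbook description (low tone after high and mid class onsets; high or falling after low class onsets, depending on vowel length). Checked syllables carrying a tone mark are deliberately left out of scope.

```python
from typing import Optional

MAI_EK = "\u0E48"
MAI_THO = "\u0E49"
MAI_TRI = "\u0E4A"
MAI_CHATTAWA = "\u0E4B"

# (consonant class, tone mark or None) -> tone, for unchecked syllables.
UNCHECKED_TONES = {
    ("high", None): "rising",
    ("mid", None): "mid",
    ("low", None): "mid",
    ("high", MAI_EK): "low",
    ("mid", MAI_EK): "low",
    ("low", MAI_EK): "falling",
    ("high", MAI_THO): "falling",
    ("mid", MAI_THO): "falling",
    ("low", MAI_THO): "high",
    ("mid", MAI_TRI): "high",
    ("mid", MAI_CHATTAWA): "rising",
}


def tone(consonant_class: str, checked: bool,
         long_vowel: bool = False, mark: Optional[str] = None) -> str:
    """Return the tone name for a syllable described by its features."""
    if checked:
        if mark is not None:
            raise ValueError("marked checked syllables are outside this sketch")
        if consonant_class in ("high", "mid"):
            return "low"
        if consonant_class == "low":
            # Vowel length changes the tone after a low class consonant.
            return "falling" if long_vowel else "high"
        raise ValueError(f"unknown consonant class: {consonant_class!r}")
    try:
        return UNCHECKED_TONES[(consonant_class, mark)]
    except KeyError:
        raise ValueError("combination not covered by this sketch") from None
```

For instance, tone("low", checked=False, mark=MAI_EK) gives "falling", while the same mark on a high or mid class onset gives "low", which is the context dependence mentioned earlier.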
The expected typing and storage position for tone marks is immediately after the base consonant of the syllable, or after a superscript or subscript vowel-sign if there is one.
The tone mark should be typed before ำ [U+0E33 THAI CHARACTER SARA AM], but should be displayed above the nikhahit, eg. ก่ำ
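The expected storage order can be checked mechanically. The Python sketch below flags two simple violations: a tone mark stored before one of the seven superscript or subscript vowel-signs, and a tone mark stored after SARA AM. The function name and the choice of which sequences to flag are assumptions made for illustration; a production checker would need to cover more cases.

```python
import re

TONE_MARKS = "\u0E48-\u0E4B"                  # MAI EK .. MAI CHATTAWA
ABOVE_BELOW_VOWELS = "\u0E31\u0E34-\u0E39"    # MAI HAN-AKAT, SARA I .. SARA UU
SARA_AM = "\u0E33"

# Either "tone mark, then above/below vowel-sign" or "SARA AM, then tone mark".
_MISORDERED = re.compile(
    f"[{TONE_MARKS}][{ABOVE_BELOW_VOWELS}]|{SARA_AM}[{TONE_MARKS}]"
)


def find_misordered(text: str) -> list[int]:
    """Return the start index of each sequence stored in an unexpected order."""
    return [m.start() for m in _MISORDERED.finditer(text)]
```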
These letters are actually considered to be consonants in Thai.
The long forms of both are created using ๅ [U+0E45 THAI CHARACTER LAKKHANGYAO], ie. ฤๅ ruː ฦๅ luː
Otherwise, that character doesn't appear alone.
This section maps Thai consonant sounds to common graphemes in the Thai orthography, grouped by high class ( h ), mid class ( m ), low class ( l ) and syllable-final ( f ) types. Click on a grapheme to find other mentions on this page (links appear at the bottom of the page). Click on the character name to see examples and for detailed descriptions of the character(s) shown.
ป [U+0E1B THAI CHARACTER PO PLA], eg. ปลา.
บ [U+0E1A THAI CHARACTER BO BAIMAI], eg. ใบไม้.
ผ [U+0E1C THAI CHARACTER PHO PHUNG], eg. ผึ้ง.
ก [U+0E01 THAI CHARACTER KO KAI], eg. ไก่.
ฝ [U+0E1D THAI CHARACTER FO FA], eg. ฝา.
ฟ [U+0E1F THAI CHARACTER FO FAN], eg. ฟัน.
ซ [U+0E0B THAI CHARACTER SO SO], eg. โซ่.
ห [U+0E2B THAI CHARACTER HO HIP], eg. หีบ.
ฮ [U+0E2E THAI CHARACTER HO NOKHUK], eg. นกฮูก.
ม [U+0E21 THAI CHARACTER MO MA], eg. ม้า.
ง [U+0E07 THAI CHARACTER NGO NGU], eg. งู.
ว [U+0E27 THAI CHARACTER WO WAEN], eg. ว่าง.
ร [U+0E23 THAI CHARACTER RO RUA], eg. เรือ.
ถ [U+0E16 THAI CHARACTER THO THUNG], eg. รถ.
ฐ [U+0E10 THAI CHARACTER THO THAN], eg. ประเสริฐ.
ส [U+0E2A THAI CHARACTER SO SUA], eg. โอกาส.
ศ [U+0E28 THAI CHARACTER SO SALA], eg. อากาศ.
ษ [U+0E29 THAI CHARACTER SO RUSI], eg. พิษ.
ต [U+0E15 THAI CHARACTER TO TAO], eg. ชีวิต.
ด [U+0E14 THAI CHARACTER DO DEK], eg. ตลาด.
ฏ [U+0E0F THAI CHARACTER TO PATAK], eg. ปรากฏ.
ฎ [U+0E0E THAI CHARACTER DO CHADA], eg. กฎ.
จ [U+0E08 THAI CHARACTER CHO CHAN], eg. ดุจ.
ท [U+0E17 THAI CHARACTER THO THAHAN], eg. บาท.
ธ [U+0E18 THAI CHARACTER THO THONG], eg. โกรธ.
ฑ [U+0E11 THAI CHARACTER THO NANGMONTHO], eg. ครุฑ.
ช [U+0E0A THAI CHARACTER CHO CHANG], eg. ประโยชน์.
ม [U+0E21 THAI CHARACTER MO MA], eg. ยิ้ม.
ง [U+0E07 THAI CHARACTER NGO NGU], eg. ลิง.
A silent ห [U+0E2B THAI CHARACTER HO HIP] is added before the characters in the list below to make their default tonal class high.
See onset_clusters for further details about how these are presented.
อ [U+0E2D THAI CHARACTER O ANG] is silent when used as a base for vowels at the beginning of a syllable. When it appears alone after a base consonant it becomes the vowel ɔː. It is also used in combination with other characters to produce additional vowel sounds (see independent).
Consonant clusters occur syllable-initially, or where one syllable ends with a consonant and the next begins with one.
Thai doesn't have conjuncts, stacking, or special code points for final consonants, etc.
Consonant clusters at the start of a syllable are usually one of the following:
There are no dedicated code points for glides when they are used after an initial consonant, so it is feasible that ปลา could be pronounced pà laː in a different context.
Tone marks and/or super-/subscript vowel-signs are attached to the second consonant, eg. เปลี่ยน
Prescript vowel-signs are placed before the first consonant in the cluster, ie. at the start of the syllable, eg. (where this occurs twice) โปรแกรม
The vocalics can also be used after an initial consonant, and again can create ambiguity for pronunciation, eg. compare พฤหัส พฤษภา
A consonant that appears at both the end of one syllable and the beginning of the next may be expressed with a single character, even if the sounds in each phonetic location differ, eg. ส in พิสดาร or ล in จุลทัศน์
Only the following set of consonants behave in this way.
Only the phonemes p, t, k, m, n, ŋ occur at the end of a syllable; however, many more consonant letters can appear in final position.
The following consonant letters are pronounced differently in syllable-initial and syllable-final positions.
For example, ล in ลิง and ตำบล
Consonants at the end of a syllable use ordinary code points, eg. ตื่น
This can create some ambiguity, since there is no distinction between the sequence in the previous example and one where น is a new syllable with an inherent vowel.
The one exception is the character that is normally regarded as a vowel, ำ [U+0E33 THAI CHARACTER SARA AM], which includes the final m sound, eg. ห้องน้ำ A final m is not always represented using sara am, however, eg. ห้าม
Thai has a set of decimal digits that are used regularly.
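Since the Thai digits occupy the contiguous range U+0E50 to U+0E59, converting between them and ASCII digits is a simple character mapping. The Python sketch below shows one way to do it; the function names are illustrative.

```python
ASCII_DIGITS = "0123456789"
# U+0E50 THAI DIGIT ZERO .. U+0E59 THAI DIGIT NINE
THAI_DIGITS = "".join(chr(0x0E50 + i) for i in range(10))

_TO_THAI = str.maketrans(ASCII_DIGITS, THAI_DIGITS)
_TO_ASCII = str.maketrans(THAI_DIGITS, ASCII_DIGITS)


def to_thai_digits(text: str) -> str:
    """Replace ASCII digits with the corresponding Thai digits."""
    return text.translate(_TO_THAI)


def to_ascii_digits(text: str) -> str:
    """Replace Thai digits with the corresponding ASCII digits."""
    return text.translate(_TO_ASCII)


print(to_thai_digits("2543"))  # ๒๕๔๓
```

Python's built-in int() also accepts Thai digits directly, so int(to_thai_digits("2543")) returns 2543 without any explicit conversion.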
The CLDR standard-decimal pattern is #,##0.###. The standard-percent pattern is
The currency symbol for baht is encoded in the Unicode Thai block.
The CLDR standard format for currency is
Thailand commonly uses the Buddhist Era calendar. The Gregorian year 2000 was 2543 in the Buddhist calendar.
In fig_thai_date the abbreviation พ.ศ. p̱ʰ.ś. stands for Buddhist era.
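The relationship between the two year numberings is a fixed offset: 2543 − 2000 = 543. The Python sketch below applies that offset. The function names are hypothetical, and it assumes modern dates, where the Buddhist Era year changes on 1 January along with the Gregorian year.

```python
BE_OFFSET = 543  # 2543 BE corresponds to 2000 CE


def gregorian_to_buddhist_year(year: int) -> int:
    """Convert a Gregorian year to a Thai Buddhist Era year (modern dates)."""
    return year + BE_OFFSET


def buddhist_to_gregorian_year(year: int) -> int:
    """Convert a Thai Buddhist Era year to a Gregorian year (modern dates)."""
    return year - BE_OFFSET


print(gregorian_to_buddhist_year(2000))  # 2543
```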
Thai text runs left to right in horizontal lines.
bidi_class properties for characters used by the modern Thai language.
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
You can experiment with examples using the Thai character app.
None of the Thai characters require special shaping based on the visual context. Nor is printed text cursive.
The orthography has no case distinction, and no special transforms are needed to convert between characters.
Modern type styles often omit the loops found in more traditional typefaces. See an article that explores this in depth.
Loopless is considered to be more contemporary and modern, and is mainly used for advertising and titling. The distinction doesn’t necessarily map to that of serif vs sans – Noto, for example, provides both serif and sans Thai font faces, but they both have loops. On the other hand, Neue Frutiger Thai offers traditional (looped) and modern (loopless) alternatives as part of the same font family (each with regular, italic and bold substyles).
Most of the combining characters in Thai are used for vowel-signs and tone marks.
Combining characters need to be placed in different positions, according to the context. The example below shows the same tone character displayed at different heights, according to what falls beneath it.
Thai regularly combines multiple combining characters above a base consonant. There are two examples in the text below, both of which show a base character with a vowel sign and then a tone mark on top.
Thai places vowel and tone marks above base characters, one above the other, and can also add combining characters below the line. The complexity of these marks means that the vertical resolution needed for clearly readable Thai text is higher than for English, or most Latin text.
The baseline of Thai text is the same as that of embedded Latin text, so there are no particular issues related to alignment of baselines between fonts.
Line height, then, tends to be greater than for Latin text, and Thai also tends to add more interline spacing than Latin text does.
Ben Mitchell describes how italicisation is used for meta-text and to convey the ‘about’ voice, rather than for emphasis or names of things (for which bold is used).
Italicisation tends to be applied to whole paragraphs or groups of paragraphs, for such things as picture captions, bylines, and other labels, commentaries, summaries such as standfirsts in magazines or news stories, and signposting. It is also regularly used for direct speech between quote marks.
Observation: Thai newspapers appear to use italic text for captions and by-lines. There is no evidence of the use of inline italicisation, but there is inline bolding.
Non-combining Thai vowel characters are treated as independent grapheme clusters. Only combining characters are grouped together with their base into a cluster.
The grapheme cluster boundaries indicate the units of text used by cursor movements and forward deletion. It also allows justification algorithms to insert equal amounts of space between non-combining letters, including between non-combining vowel-signs and their consonants.
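The grouping described above can be approximated in a few lines of Python. This is a simplified sketch rather than an implementation of the Unicode text segmentation rules: it only attaches nonspacing marks (general category Mn) to the preceding character, and the function name is illustrative.

```python
import unicodedata


def simple_clusters(text: str) -> list[str]:
    """Group each nonspacing mark with the character that precedes it."""
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.category(ch) == "Mn":
            clusters[-1] += ch   # combining mark: stays with its base
        else:
            clusters.append(ch)  # base consonant or spacing vowel-sign
    return clusters


# เกี๊ยะ: the prescript and postscript vowel-signs stand alone, while the
# superscript vowel-sign and tone mark stay with the consonant.
print(simple_clusters("\u0E40\u0E01\u0E35\u0E4A\u0E22\u0E30"))
```

For real cursor movement or deletion behaviour, a library that implements the full segmentation rules would be a better choice than this sketch.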
Thai segmentation may have to deal with ambiguous situations. Take for example the word ถนน
Because syllable-final sounds are ordinary letters, with no special indication, this could be parsed as ta.non, ton.na, or even ta.na.na, and indeed some words are written the same but pronounced differently, eg. นม
Similarly, because medial consonants are written with normal characters, there is a possible ambiguity about whether a sequence contains an inherent vowel, eg. กรี
Thai doesn't separate words in a phrase.
There is, however, a concept of words in the text. For example, lines are supposed to be broken at word boundaries.
The main difficulty arises when dealing with compound words. It can often be difficult to decide whether a given string of syllables represents multiple words or a single compound word.
The variation may be related to the operation being performed on the text (eg. line breaking in narrow newsprint columns, vs. double-click selection, vs. cursor movement, etc.), or it may just be down to personal preference.
The difference may also be contextually dependent. Wirote Aroonmanakun describes how คนขับรถ should be viewed as a single word in the context คนขับรถนั่งคอยอยู่ในรถ, whereas in the phrase คนขับรถผ่านแยกนี้ไม่มากนัก it would be viewed as 3 words, referring to anyone who is driving.
Proper names, which are composed from multiple words, are also problematic, especially because there are no capital letters to distinguish them from other pieces of text.
In order to manually fine-tune word-boundary detection, the invisible character [U+200B ZERO WIDTH SPACE] (ZWSP) can be used to create breaks.
To prevent a break between syllables, [U+2060 WORD JOINER] (WJ) can be used.
It is also important to bear in mind that Thai may be used to write various languages, in particular minority languages for which different dictionaries are needed. Since such dictionaries may not be available in a given browser or other application, there is a tendency to use ZWSP in order to compensate.
Large-scale manual entry of ZWSP and WJ has potential downsides because the user cannot see them; this leads to problems with ZWSP being inserted in the wrong position, or multiple times. However, these don't set a state, so it doesn't create major issues. It would be useful, however, if an editor showed the location of these characters.
Care should also be taken when trying to match text, eg. for searching in a page. WJ should be ignored. ZWSP may or may not be ignored, depending on whether word boundaries are significant for the search.
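The matching advice above could be implemented along the following lines in Python. This is a minimal sketch: the function names and the keep_word_boundaries flag are hypothetical, and a real search feature would probably need further normalisation.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE
WJ = "\u2060"    # WORD JOINER


def normalize_for_search(text: str, keep_word_boundaries: bool = False) -> str:
    """Strip WJ always, and strip ZWSP unless word boundaries matter."""
    text = text.replace(WJ, "")
    if not keep_word_boundaries:
        text = text.replace(ZWSP, "")
    return text


def contains(haystack: str, needle: str, keep_word_boundaries: bool = False) -> bool:
    """Report whether needle occurs in haystack after normalisation."""
    return normalize_for_search(needle, keep_word_boundaries) in normalize_for_search(
        haystack, keep_word_boundaries
    )
```

With the default setting, a search for คนขับรถ would match text in which an author has inserted ZWSP or WJ between the syllables; with keep_word_boundaries=True, the ZWSP is kept and the same search would not match.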
, [U+002C COMMA]
: [U+003A COLON]
. [U+002E FULL STOP]
|section||๚ [U+0E5A THAI CHARACTER ANGKHANKHU]|
Thai uses space as a phrase marker, rather than to delimit words, often in places where English text would use commas or periods.
Latin-based punctuation such as comma, period, and colon are also used in text, particularly in conjunction with Latin letters or in formatting numbers, addresses, and so forth.
๚ [U+0E5A THAI CHARACTER ANGKHANKHU] is used to mark the end of a long segment of text. It can be combined as ๚ะ to mark a larger segment of text; typically this usage can be seen at the end of a verse in poetry.
๛ [U+0E5B THAI CHARACTER KHOMUT] marks the end of a chapter or document, where it always follows the ๚ะ combination.
|initial||” [U+201D RIGHT DOUBLE QUOTATION MARK]|
|nested||’ [U+2019 RIGHT SINGLE QUOTATION MARK]|
ๆ [U+0E46 THAI CHARACTER MAIYAMOK] is used to mark repetition of preceding letters. It is typically preceded and followed by a space, eg. ทุกวัน ๆ However, some publishers prefer to publish without a leading space, ie. ทุกวันๆ
This character shouldn't be wrapped to the beginning of a new line on its own, and during justification it should be kept close to the preceding text that it duplicates.
ฯ [U+0E2F THAI CHARACTER PAIYANNOI] is used to indicate elision or abbreviation of letters; it is viewed as a kind of letter, however, and is used with considerable frequency because of its appearance in such words as the Thai name for Bangkok, กรุงเทพฯ k̯ṟuŋ̱eṯʰp̱ʰ⋯ krūŋ tʰêːp, which is short for กรุงเทพมหานคร k̯ṟuŋ̱eṯʰp̱ʰm̱hāṉḵʰṟ krūŋ tʰêːp mahǎː nákʰɔ̄ːn. It is followed by a space.
Paiyannoi is also used in the combination ฯลฯ to create a construct called paiyanyai, which means “et cetera, and so forth.”
Some abbreviations are written using a full stop, eg. สนง.ตปท. sṉŋ̱.t̯p̯ṯʰ. Office of the Royal Thai Police which is short for สำนักงานตำรวจแห่งชาติ saᵐṉäk̯ŋ̱āṉt̯aᵐṟw̱c̯ɛh¹ŋ̱c̱ʰāt̯i
CLDR indicates that … [U+2026 HORIZONTAL ELLIPSIS] is also used for ellipsis.
CLDR indicates that the following are also used:
Thai doesn't indicate word boundaries, but when Thai text is wrapped at the end of a line you should not split a word.
As you change the width of the browser window the highlighted text above should break at the following points if your browser supports Thai wrapping:
Because Thai doesn't separate words, applications typically look up word boundaries in a dictionary. However, such lookup doesn't always produce the needed result, especially when dealing with compound words and proper names (see words). To counteract these deficiencies, authors may use [U+200B ZERO WIDTH SPACE] and [U+2060 WORD JOINER] (see zwsp).
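To make the dictionary approach and its limits more concrete, here is a deliberately naive Python sketch of greedy longest-match segmentation that treats any author-supplied ZWSP as a forced boundary. The function name and the tiny word list are invented for illustration; real implementations use far larger dictionaries and more sophisticated disambiguation, and this sketch works on code points, so an unknown character is emitted on its own even if it is a combining mark.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE


def segment(text: str, dictionary: set[str]) -> list[str]:
    """Split text at ZWSP, then greedily take the longest dictionary word."""
    max_len = max((len(word) for word in dictionary), default=1)
    words: list[str] = []
    for chunk in text.split(ZWSP):
        i = 0
        while i < len(chunk):
            # Try the longest candidate first, shrinking until a word matches.
            for j in range(min(len(chunk), i + max_len), i, -1):
                if chunk[i:j] in dictionary:
                    break
            else:
                j = i + 1  # no match: emit a single character
            words.append(chunk[i:j])
            i = j
    return words


# Hypothetical mini-dictionary: three simple words plus their compound.
WORDS = {"คน", "ขับ", "รถ", "คนขับรถ"}
print(segment("คนขับรถ", WORDS))
print(segment("คน" + ZWSP + "ขับ" + ZWSP + "รถ", WORDS))
```

The first call returns the compound as one word because the longest match wins; the second returns three words because the ZWSP characters force the breaks. That is the kind of manual override the paragraph above describes.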
Show (default) line-breaking properties for characters in the Thai language.
Justification in Thai primarily adjusts the blank spaces between phrases, rather than expanding the text between words or syllables. The fact that lines break at word boundaries helps reduce the size of the gaps produced.
Thai may also make certain adjustments to inter-character spacing. The character-based spacing is most common in narrow columns, such as newsprint, where there is no space except at the end of a line.
Any [U+200B ZERO WIDTH SPACE] (ZWSP) used to separate words is ignored during justification. Justification proceeds as if it wasn't there.
The justification in fig_justification_intercharacter_spacing shows equal spacing across a phrase where there are no space characters to stretch. Note how the equal spacing separates prebase and postbase vowel signs from their consonants by the same amount as consonants are separated from each other; they are not kept together with the base consonant they modify.
This kind of spacing requires a special behaviour for ำ [U+0E33 THAI CHARACTER SARA AM]. The small circle is kept with the preceding consonant, and space is added before the spacing part of the vowel, as shown in fig_am_spacing.
(To facilitate this, applications tend to convert ำ [U+0E33 THAI CHARACTER SARA AM] to the sequence ํา [U+0E4D THAI CHARACTER NIKHAHIT + U+0E32 THAI CHARACTER SARA AA] before stretching. Some care has to also be taken to correctly order the superscript glyphs, since in memory the tone mark precedes the nikhahit. The nikhahit character is not otherwise used for modern Thai.)
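The conversion and reordering described above could look something like the following Python sketch. It is an illustration only, with a hypothetical function name: it replaces each SARA AM with NIKHAHIT + SARA AA and, where a tone mark immediately precedes the SARA AM in memory, moves that tone mark after the nikhahit so that it can be positioned above it.

```python
import re

NIKHAHIT = "\u0E4D"
SARA_AA = "\u0E32"

# An optional tone mark (MAI EK .. MAI CHATTAWA) followed by SARA AM.
_TONE_THEN_SARA_AM = re.compile("([\u0E48-\u0E4B]?)\u0E33")


def split_sara_am(text: str) -> str:
    """Rewrite [tone] + SARA AM as NIKHAHIT + [tone] + SARA AA."""
    return _TONE_THEN_SARA_AM.sub(NIKHAHIT + r"\1" + SARA_AA, text)
```

After this step, inter-character space can be added before the SARA AA while the nikhahit and any tone mark stay with the preceding consonant. The transformation is intended for display processing only, not for changing the stored text.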
Thai does indent the initial line of a paragraph.
You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.
๏ [U+0E4F THAI CHARACTER FONGMAN] is the Thai bullet, which is used to mark items in lists or appears at the beginning of a verse, sentence, paragraph, or other textual segment.
The modern Thai orthography uses numeric and alphabetic styles.
The thai numeric style is decimal-based and uses the digits shown below.
The thai-alphabetic style uses the letters shown below.
It is possible to find the first letter in a paragraph styled so that it is larger and sits alongside several lines of the continuing paragraph text.
Observation: All combining characters are included in the selections shown in fig_drop_caps.
Any punctuation such as opening quotes and opening parentheses should also be included in the initial styling. ?
Observation: In the figures shown, the alphabetic baseline of the highlighted letter falls slightly below the bottom of the row that determines the size of the highlighted letter. It's not clear whether that's a general trend, or just related to this specific publication.
Observation: In fig_drop_caps_2, the selection picks out only แ from the syllable แฉ.
This section is for any features that are specific to Thai and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.