Thai

Updated 4 November, 2021

This page brings together basic information about the Thai script and its use for the Thai language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Thai using Unicode.

Phonetic transcriptions should be treated as an approximate guide, only. Many are more phonemic than phonetic, and there may be variations depending on the source of the transcription.


Sample (Thai)


ข้อ 1 มนุษย์ทั้งหลายเกิดมามีอิสระและเสมอภาคกันในเกียรติศักด[เกียรติศักดิ์]และสิทธิ ต่างมีเหตุผลและมโนธรรม และควรปฏิบัติต่อกันด้วยเจตนารมณ์แห่งภราดรภาพ

ข้อ 2 ทุกคนย่อมมีสิทธิและอิสรภาพบรรดาที่กำหนดไว้ในปฏิญญานี้ โดยปราศจากความแตกต่างไม่ว่าชนิดใด ๆ ดังเช่น เชื้อชาติ ผิว เพศ ภาษา ศาสนา ความคิดเห็นทางการเมืองหรือทางอื่น เผ่าพันธุ์แห่งชาติ หรือสังคม ทรัพย์สิน กำเนิด หรือสถานะอื่น ๆ อนึ่งจะไม่มีความแตกต่างใด ๆ ตามมูลฐานแห่งสถานะทางการเมือง ทางการศาล หรือทางการระหว่างประเทศของประเทศหรือดินแดนที่บุคคลสังกัด ไม่ว่าดินแดนนี้จะเป็นเอกราช อยู่ในความพิทักษ์มิได้ปกครองตนเอง หรืออยู่ภายใต้การจำกัดอธิปไตยใด ๆ ทั้งสิ้น

Usage & history

The Thai script is used primarily for writing the Thai language, as well as Northern Thai, Northeastern Thai, Southern Thai, and Thai Song, which are separate languages. It is also used to write a number of minority languages in Thailand, Laos and China, as well as Pali, which is widely used in Buddhist temples and monasteries.

อักษรไทย ʔ̯àksɔ̌ːn tʰāj Thai script

The alphabet was derived from the Old Khmer script, which descended from Pallava. Thai tradition attributes the creation of the script to King Ramkhamhaeng the Great (พ่อขุนรามคำแหงมหาราช pʰo kʰun raːm kʰam ŋɛː ma haː raː tɕʰa) in 1283, though this has been challenged.

Both the Thai language and script are closely related to Lao and its script.

Sources: Scriptsource, Wikipedia.

Basic features

Thai is an abugida. Consonant letters have an inherent vowel sound. Vowel-signs are attached to the consonant to produce a different vowel. See the table to the right for a brief overview of features for the modern Thai orthography.

Thai text runs left to right in horizontal lines.

Spaces separate phrases, rather than words.

Each onset consonant is associated with a high, mid, or low class related to tone. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark.

No conjuncts are used for consonant clusters.

Syllable-initial clusters and syllable-final consonant sounds are all written with ordinary consonant letters. It can therefore be difficult to algorithmically detect syllable boundaries.

The inherent vowel is pronounced o inside a closed syllable, and a in an open syllable. 15 vowel-signs (including 5 pre-base vowels), and 5 consonants/diacritics represent non-inherent vowels. Only the 7 vowel-signs that appear above or below the consonant are combining marks; the others are ordinary spacing characters that are typed in the order seen. Vowels are often written differently when they appear in a closed vs. open syllable.

There are no independent vowels, and standalone vowel sounds are written using vowel-signs applied to [U+0E2D THAI CHARACTER O ANG].

This page lists 37 composite vowels (made from 12 vowel-signs, and 4 consonants/diacritics). Composite vowels can involve up to 4 glyphs, and glyphs can surround the base consonant(s) on up to 3 sides, eg. เกียะ ek̯īy̱a kiːa

Thai has vocalics.

Thai has native digits, and they are commonly used.

Character index

Letters


Consonants

ผ␣ถ␣ฐ␣ข␣ฃ␣ป␣บ␣ต␣ด␣ฏ␣ฎ␣ก␣อ␣พ␣ภ␣ท␣ธ␣ฑ␣ฒ␣ค␣ฆ␣ฅ␣ฉ␣จ␣ช␣ฌ␣ฝ␣ส␣ศ␣ษ␣ห␣ฟ␣ซ␣ฮ␣ม␣น␣ณ␣ง␣ว␣ร␣ล␣ฬ␣ย␣ญ

Vowels

เ␣แ␣ใ␣ไ␣โ␣ะ␣า␣ำ

Vocalics

ฤ␣ฦ␣ๅ

Other

ฯ␣ๆ

Combining marks


Vowels

ิ␣ี␣ึ␣ื␣ุ␣ู␣ั␣็

Bindu

Tones

่␣้␣๊␣๋

Consonant killer

Not used for Thai

Numbers

๐␣๑␣๒␣๓␣๔␣๕␣๖␣๗␣๘␣๙

Punctuation

“␣”␣‘␣’␣๚␣๛␣๏␣‐␣–␣—␣…

ASCII

,␣:␣.␣?␣!␣(␣)

Possible

′␣″

Symbols

฿

Other

​␣⁠

Phonology


Source Wikipedia.

Vowel sounds

Plain vowels

i iː ɯ ɯː u uː e eː ɤ ɤː o oː ɛ ɛː ɔ ɔː a aː ɑ ɑː

Diphthongs

iə iːə iw iəw ɯə ɯːə ɯəj uə uːə uj uːj uəj ew eːw ɤːj oːj ɛːw ɔːj aj aːj aw aːw

The majority of diphthongs and all 3 triphthongs in Thai end in j or w. The exceptions are a handful of diphthongs that end in ə.

Consonant sounds

             labial  dental  alveolar  post-alveolar  palatal  velar  glottal
stop         p b     t d     -         -              -        k      ʔ
affricate    -       -       -         t͡ɕ t͡ɕʰ         -        -      -
fricative    f       -       s         -              -        -      h
nasal        m       -       n         -              -        ŋ      -
approximant  w       -       l         -              j        -      -
trill/flap   -       -       r         -              -        -      -

Finals

             labial  dental  alveolar  post-alveolar  palatal  velar  glottal
stop         p       t       -         -              -        k      ʔ
affricate    -       -       -         -              -        -      -
fricative    -       -       -         -              -        -      -
nasal        m       -       n         -              -        ŋ      -
approximant  w       -       -         -              j        -      -
trill/flap   -       -       -         -              -        -      -

Vowels

For a mapping of sounds to graphemes, see the section Vowel sounds mapped to characters.

Dashes are used to indicate whether the character represents a vowel sound in a closed or an open syllable.

See also vocalics.

Inherent vowel

The inherent vowel is pronounced o inside a closed syllable, and a in an open syllable. So ka is written by simply using the consonant letter [U+0E01 THAI CHARACTER KO KAI], and kon by just the 2 consonants, กน [U+0E01 THAI CHARACTER KO KAI + U+0E19 THAI CHARACTER NO NU].

Vowel signs

Non-inherent vowel sounds that follow a consonant can be represented using vowel-signs, eg. ki is written กิ [U+0E01 THAI CHARACTER KO KAI + U+0E34 THAI CHARACTER SARA I].

An orthography that uses vowel-signs is different from one that uses simple diacritics or letters for vowels, in that the vowel-signs are generally attached to the syllable, rather than just applied to the letter of the immediately preceding consonant (see prescript_vowels). In Thai, vowel characters may be used on their own, or in combination with other characters (see composite_vowels).

Vowels in Thai are written with a mixture of combining characters and ordinary spacing characters. Only the superscript and subscript vowel-signs are combining characters. It is also common to use some consonants to represent vowel sounds (see Consonants used for vowels).

Combining marks used for vowels

Thai uses the following combining characters for vowels.

ิ␣ี␣ึ␣ื␣ุ␣ู␣็␣ั

Maitaikhu

 ็ [U+0E47 THAI CHARACTER MAITAIKHU] is used to shorten vowel sounds, but also occasionally operates as a vowel-sign in its own right. 

It converts vowels produced by the following three vowel signs to short vowels when they occur in medial position:

  • เ–็– eː becomes e, eg. เด็ก,
  • –็อ– ɔː becomes ɔ,
  • แ–็– ɛː becomes ɛ (not very common).

It is also used for ew เ–็ว (eːw > ew), eg. เร็ว

One word consists of this diacritic over a consonant with no vowel-sign: ก็

Letter characters used for vowels

The following additional, vowel-specific characters are ordinary spacing characters, with the general category of 'letter'.

เ␣โ␣แ␣ะ␣า␣ ␣ไ␣ใ␣ำ
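The split between combining vowel-signs and spacing vowel letters described above can be checked against the Unicode general category of each character. The following is a minimal Python sketch using only the standard library; the constant and function names are illustrative, not part of any Thai-specific API.

```python
import unicodedata

# Vowel-signs listed above as combining marks (above or below the base).
COMBINING_VOWEL_SIGNS = "\u0E34\u0E35\u0E36\u0E37\u0E38\u0E39\u0E47\u0E31"
# Vowel characters listed above as ordinary spacing letters.
SPACING_VOWEL_LETTERS = "\u0E40\u0E42\u0E41\u0E30\u0E32\u0E44\u0E43\u0E33"


def general_categories(chars):
    """Map each code point (as 'U+XXXX') to its Unicode general category."""
    return {f"U+{ord(c):04X}": unicodedata.category(c) for c in chars}


if __name__ == "__main__":
    print(general_categories(COMBINING_VOWEL_SIGNS))  # nonspacing marks (Mn)
    print(general_categories(SPACING_VOWEL_LETTERS))  # letters (Lo)
```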

Sara AM & nikhahit

[U+0E33 THAI CHARACTER SARA AM] is classed as a vowel, but also contains the final consonant m, represented by a built-in nikhahit.

Used in Pali and Sanskrit,   [U+0E4D THAI CHARACTER NIKHAHIT] is not commonly used alone in Thai, except that when letter-spacing Thai text it is necessary to add the space between the circle and the remainder of  ำ [U+0E33 THAI CHARACTER SARA AM]. See inter_character_spacing.

The separation is not produced by NFD normalisation.
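As a quick illustration, Python's standard unicodedata module shows that canonical decomposition (NFD) leaves SARA AM untouched, whereas the compatibility form (NFKD) does split it into nikhahit plus SARA AA. This is only a sketch of how the behaviour can be inspected; whether compatibility normalisation is appropriate for a given text pipeline is a separate decision.

```python
import unicodedata

SARA_AM = "\u0E33"   # THAI CHARACTER SARA AM
NIKHAHIT = "\u0E4D"  # THAI CHARACTER NIKHAHIT
SARA_AA = "\u0E32"   # THAI CHARACTER SARA AA

# Canonical decomposition: SARA AM stays a single code point.
nfd = unicodedata.normalize("NFD", SARA_AM)

# Compatibility decomposition: the built-in nikhahit is separated out.
nfkd = unicodedata.normalize("NFKD", SARA_AM)

if __name__ == "__main__":
    print([f"U+{ord(c):04X}" for c in nfd])
    print([f"U+{ord(c):04X}" for c in nfkd])
```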

Consonants used for vowels

The following characters are also used to create vowel sounds, either alone or as part of a composite vowel.

อ␣ย␣ว␣ร

The consonant [U+0E2D THAI CHARACTER O ANG] can also be pronounced as the vowel ɔː when it appears alone after a base consonant.

Many of the composite vowels involve [U+0E22 THAI CHARACTER YO YAK] and/or [U+0E27 THAI CHARACTER WO WAEN] to create diphthongs.

The consonant [U+0E23 THAI CHARACTER RO RUA] is pronounced as the vowel a when doubled medially, eg. ธรรม. When doubled at the end of a syllable it is pronounced an, eg. กรรไกร. Note, however, that the two letters may also belong to the end of one syllable and the beginning of the next, eg. ภรรยา.

Pre-base vowel signs

เ␣แ␣ใ␣ไ␣โ

Five vowel-signs appear to the left of the onset consonant, eg. ไข่

Thai uses a visual encoding model and these are not combining characters. They are typed and stored before the base.

These vowel-signs are placed before the start of the syllable. This means that a word with a consonant cluster at the start separates the prescript vowel from any postscript vowels by more than one consonant character, eg. เปล่า โปรแกรม
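The visual storage order can be seen by listing the code points of one of the words above. This short Python sketch (the helper name describe is illustrative) prints each stored character of เปล่า in order; the pre-base vowel comes first, two consonants and a tone mark follow, and the post-base vowel comes last.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs in stored order."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in text]


# The word เปล่า, spelled out with escapes so the stored order is explicit.
WORD = "\u0E40\u0E1B\u0E25\u0E48\u0E32"

if __name__ == "__main__":
    for code, name in describe(WORD):
        print(code, name)
```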

[U+0E41 THAI CHARACTER SARA AE] should not be typed as two successive [U+0E40 THAI CHARACTER SARA E] characters.
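A data-cleaning step could repair text where this mistake has been made. The sketch below assumes that two successive SARA E characters are never intended in the text being processed, which should be confirmed for your own data; the function name is hypothetical.

```python
SARA_E = "\u0E40"   # THAI CHARACTER SARA E
SARA_AE = "\u0E41"  # THAI CHARACTER SARA AE


def fix_doubled_sara_e(text):
    """Replace each pair of successive SARA E with a single SARA AE."""
    return text.replace(SARA_E * 2, SARA_AE)


if __name__ == "__main__":
    mistyped = SARA_E + SARA_E + "\u0E1E\u0E07"  # looks like แพง
    print(fix_doubled_sara_e(mistyped) == SARA_AE + "\u0E1E\u0E07")
```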

Composite vowels

Some composite vowels represent plain vowel sounds:

ือ␣เ-ะ␣เ-็␣เ-ิ␣เ-อ␣เ-อะ␣โ-ะ␣็อ␣แ-ะ␣แ-็␣เ-าะ

The other composites represent diphthongs, which generally end in one of ə̯, i, or w.

For some, the spelling isn't completely obvious.

เ-ีย␣เ-ียะ␣เ-ือ␣เ-ือะ␣ัว␣ัวะ

In many other cases, a glide consonant is simply added after one of the vowels seen earlier.

ิว␣ูย␣ุย␣เ-ว␣เ-็ว␣เ-ย␣แ-ว␣โ-ย␣อย␣็อย␣าย␣ัย␣ไ-ย␣าว␣เ-า␣ ␣เ-ียว␣เ-ือย␣วย

Finally, two vocalic letters can be lengthened using an additional, special character.

ฤๅ␣ฦๅ
Combinations that contain a given character:
  • ะ: เ-ะ␣โ-ะ␣แ-ะ␣เ-าะ␣เ-อะ␣ ␣เ-ียะ␣เ-ือะ␣-ัวะ
  • ◌ั: -ัย␣ ␣-ัวะ␣-ัว
  • า: เ-าะ␣ ␣-าย␣ ␣-าว␣เ-า
  • ◌ิ: เ-ิ␣ ␣-ิว
  • ◌ี: เ-ีย␣เ-ียะ␣ ␣เ-ียว
  • ◌ื: -ือ␣ ␣เ-ือ␣เ-ือะ␣ ␣เ-ือย
  • ◌ุ: -ุย
  • ◌ู: -ูย
  • เ: เ-ะ␣เ-็␣เ-อ␣เ-ิ␣เ-าะ␣เ-อะ␣ ␣เ-ือ␣เ-ีย␣เ-ียะ␣เ-ือะ␣ ␣เ-ย␣เ-ือย␣ ␣เ-ว␣เ-า␣เ-็ว␣เ-ียว
  • แ: แ-ะ␣แ-็␣ ␣แ-ว
  • โ: โ-ะ␣ ␣โ-ย
  • ไ: ไ-ย
  • ◌็: -็อ␣เ-็␣แ-็␣ ␣-็อย␣ ␣เ-็ว
  • อ: -ือ␣-็อ␣เ-อ␣เ-อะ␣ ␣เ-ือ␣เ-ือะ␣ ␣-อย␣-็อย␣เ-ือย
  • ย: เ-ีย␣เ-ียะ␣ ␣-ัย␣-าย␣-อย␣-ุย␣-ูย␣-วย␣เ-ย␣โ-ย␣ไ-ย␣-็อย␣เ-ือย␣ ␣เ-ียว
  • ว: -ัว␣-ัวะ␣ ␣-วย␣ ␣-ิว␣-าว␣เ-ว␣แ-ว␣เ-็ว␣เ-ียว
  • ร: -รร-␣-รร

The following list shows where vowel-signs are positioned around a base consonant to produce vowels, and how many instances of each pattern there are. The figure after the + sign counts combinations of Unicode characters.

  • 5 prescript, eg. โก ok̯ (ko)
  • 2 postscript, eg. กา k̯ā
  • 5 superscript, eg. กิ k̯i
  • 2 subscript, eg. กุ k̯u
  • 1+5 sup+postscript, eg. กือ k̯ɯ̄ʔ̯ kɯːo
  • +4 post+postscript, eg. กาว k̯āw̱ kaːw
  • +2 sub+postscript, eg. กุย k̯uy̱ kuj
  • +10 pre+postscript, eg. เกะ ek̯a
  • +3 pre+superscript, eg. เกิ ek̯i kɤː
  • +2 super+post+post, eg. กัวะ k̯äw̱a kua
  • +2 pre+post+post, eg. เกาะ ek̯āa kɔ̀
  • +3 pre+sup+postscript, eg. เกือ ek̯ɯ̄ʔ̯ kɯːa
  • +4 pre+sup+post+postscript, eg. เกียะ ek̯īy̱a kiːa

At maximum, vowel components can occur concurrently on 3 sides of the base.

Distribution of vowel elements is as follows:

  ั   ิ   ี   ึ   ื   ็  ำ
เ แ ใ ไ โ อ ะ า ย ว ๅ ะ ย
    ุ   ู    

Characters that don't appear in the combinations:

ร␣ำ␣ึ␣ใ

Standalone vowels

Thai uses [U+0E2D THAI CHARACTER O ANG] as a base for vowel signs, eg. อิ่ม and สะอาด.

[U+0E2D THAI CHARACTER O ANG] on its own represents the same sound as the inherent vowel, eg. อเมริกา

There are no independent vowel letters in Thai.

Consonants with no following vowel

Vowel absence after syllable-final consonants is not normally marked in any way. Nor is it marked in syllable-initial clusters.

 ์ [U+0E4C THAI CHARACTER THANTHAKHAT] can be used above a consonant or syllable when it is not pronounced (usually at the end of a syllable), eg. รถเมล์ and ศักดิ์สิทธิ์. It is often used for foreign loan words, eg. คอมพิวเตอร์, โปสการ์ด, สแตมป์.

Tones

Each onset consonant is associated with a high, mid, or low class related to tone. Tone is indicated by a combination of the consonant class, the syllable type (checked/unchecked), plus any tone mark from the following set.

่␣้␣๊␣๋

The following chart shows how to tell which tones are associated with a syllable.

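Consonant  Checked?        Tone mark  Tone
high       checked, short  -          ˩˩ low
high       checked, long   -          ˩˩ low
high       open            -          ˩˥ rising
high       open            ◌่          ˩˩ low
high       open            ◌้          ˥˩ falling
mid        checked, short  -          ˩˩ low
mid        checked, long   -          ˩˩ low
mid        open            -          ˧˧ mid
mid        open            ◌่          ˩˩ low
mid        open            ◌้          ˥˩ falling
mid        open            ◌๊          ˦˥ high
mid        open            ◌๋          ˩˥ rising
low        checked, short  -          ˦˥ high
low        checked, long   -          ˥˩ falling
low        open            -          ˧˧ mid
low        open            ◌่          ˥˩ falling
low        open            ◌้          ˦˥ high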

'Checked' means ending in the sound -p, -t, or -k or a short vowel.
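The chart can be expressed as a lookup table. The Python sketch below is illustrative only: it assumes that the marked rows for open syllables follow the order of the tone marks listed above (mai ek, mai tho, mai tri, mai chattawa), and that the caller already knows the consonant class and syllable type. The key strings and the function name tone are invented for the example.

```python
MAI_EK, MAI_THO, MAI_TRI, MAI_CHATTAWA = "\u0E48", "\u0E49", "\u0E4A", "\u0E4B"

# (consonant class, syllable type, tone mark or None) -> tone name
TONE_CHART = {
    ("high", "checked-short", None): "low",
    ("high", "checked-long", None): "low",
    ("high", "open", None): "rising",
    ("high", "open", MAI_EK): "low",
    ("high", "open", MAI_THO): "falling",
    ("mid", "checked-short", None): "low",
    ("mid", "checked-long", None): "low",
    ("mid", "open", None): "mid",
    ("mid", "open", MAI_EK): "low",
    ("mid", "open", MAI_THO): "falling",
    ("mid", "open", MAI_TRI): "high",
    ("mid", "open", MAI_CHATTAWA): "rising",
    ("low", "checked-short", None): "high",
    ("low", "checked-long", None): "falling",
    ("low", "open", None): "mid",
    ("low", "open", MAI_EK): "falling",
    ("low", "open", MAI_THO): "high",
}


def tone(consonant_class, syllable_type, mark=None):
    """Return the tone name, or None if the chart has no such row."""
    return TONE_CHART.get((consonant_class, syllable_type, mark))


if __name__ == "__main__":
    print(tone("high", "open"))          # unmarked open syllable, high class
    print(tone("low", "open", MAI_THO))  # low class with mai tho
```

Determining the class and the syllable type from raw text is the harder part, and is not attempted here.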

The expected typing and storage position for tone marks is immediately after the base consonant of the syllable, or after a superscript or subscript vowel-sign if there is one.

The tone mark should be typed before [U+0E33 THAI CHARACTER SARA AM], but should be displayed above the nikhahit, eg. ก่ำ
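A simple check for the ordering rules above might look like the following Python sketch. It flags any tone mark that is not immediately preceded by a consonant letter or by an above/below vowel-sign. The character ranges and the function name are assumptions made for the example, and the check is deliberately narrow; it does not validate syllable structure.

```python
import re

# A tone mark (U+0E48..U+0E4B) is expected right after a consonant
# (U+0E01..U+0E2E) or an above/below vowel-sign (U+0E31, U+0E34..U+0E39).
_MISPLACED_TONE = re.compile(
    r"(?<![\u0E01-\u0E2E\u0E31\u0E34-\u0E39])[\u0E48-\u0E4B]"
)


def misplaced_tone_marks(text):
    """Return the indexes of tone marks in unexpected positions."""
    return [m.start() for m in _MISPLACED_TONE.finditer(text)]


if __name__ == "__main__":
    good = "\u0E01\u0E48\u0E33"  # consonant, tone mark, SARA AM
    bad = "\u0E01\u0E33\u0E48"   # tone mark typed after SARA AM
    print(misplaced_tone_marks(good), misplaced_tone_marks(bad))
```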

Vowel sounds mapped to characters

This section maps Thai vowel sounds to common graphemes. The dotted circle indicates the location of the consonant relative to the vowel-sign; if there are 2 circles, the vowel is used only in closed syllables.

Plain vowels

i

◌ิ [U+0E34 THAI CHARACTER SARA I], eg. ลิง.

◌ี [U+0E35 THAI CHARACTER SARA II], eg. นี่.

ɯ

◌ึ [U+0E36 THAI CHARACTER SARA UE], eg. ดื่ม.

ɯː

◌ื [U+0E37 THAI CHARACTER SARA UEE], eg. มืด.

◌ือ [U+0E37 THAI CHARACTER SARA UEE + U+0E2D THAI CHARACTER O ANG], eg. มือ.

u

◌ุ [U+0E38 THAI CHARACTER SARA U], eg. คุณ.

◌ู [U+0E39 THAI CHARACTER SARA UU], eg. ถูก.

เ◌ [U+0E40 THAI CHARACTER SARA E], eg. เยน.

o

Inherent vowel (in closed syllables), eg. ผม.

โ◌ะ [U+0E42 THAI CHARACTER SARA O + U+0E30 THAI CHARACTER SARA A]

โ◌๊ะ [U+0E42 THAI CHARACTER SARA O + U+0E30 THAI CHARACTER SARA A], eg. โต๊ะ.

◌็อ◌ [U+0E47 THAI CHARACTER MAITAIKHU + U+0E2D THAI CHARACTER O ANG]

โ◌ [U+0E42 THAI CHARACTER SARA O], eg. โน่น.

Inherent vowel in syllables that end with [U+0E23 THAI CHARACTER RO RUA], eg. พร.

ɛː

แ◌ [U+0E41 THAI CHARACTER SARA AE], eg. แพง.

ɔː

◌อ [U+0E2D THAI CHARACTER O ANG], eg. ชอบ.

◌็ [U+0E47 THAI CHARACTER MAITAIKHU] (only in the word ก็)

a

Inherent vowel (in open syllables).

[U+0E30 THAI CHARACTER SARA A], eg. อะไร.

◌ั◌ [U+0E31 THAI CHARACTER MAI HAN-AKAT], eg. นั่น.

◌รร◌ [U+0E23 THAI CHARACTER RO RUA + U+0E23 THAI CHARACTER RO RUA], eg. ธรรม.

◌า [U+0E32 THAI CHARACTER SARA AA], eg. อ่าง.

am

◌ำ [U+0E33 THAI CHARACTER SARA AM], eg. ทำ.

an

◌รร [U+0E23 THAI CHARACTER RO RUA + U+0E23 THAI CHARACTER RO RUA], eg. กรรไกร.

Diphthongs and triphthongs

iw

◌ิว [U+0E34 THAI CHARACTER SARA I + U+0E27 THAI CHARACTER WO WAEN], eg. คอมพิวเตอร์.

uːə

Vocalics

ฤ␣ฦ␣ๅ

These letters are actually considered to be consonants in Thai.

The long forms of both are created using [U+0E45 THAI CHARACTER LAKKHANGYAO], ie. ฤๅ ruː ฦๅ luː

Otherwise, that character doesn't appear alone.

Consonants

For a mapping of sounds to graphemes, see the section Consonant sounds mapped to characters.

Basic consonants

Stops

high class
ผ␣ถ␣ฐ␣ข␣ฃ
mid class
ป␣บ␣ต␣ด␣ฏ␣ฎ␣ก␣อ
low class
พ␣ภ␣ท␣ธ␣ฑ␣ฒ␣ค␣ฆ␣ฅ

Affricates

high class
ฉ
mid class
จ
low class
ช␣ฌ

Fricatives

high class
ฝ␣ส␣ศ␣ษ␣ห
low class
ฟ␣ซ␣ฮ

Nasals

high class
หม␣หน␣หง
low class
ม␣น␣ณ␣ง

Other sonorants

high class
หว␣หร␣หล␣หย␣หญ
low class
ว␣ร␣ล␣ฬ␣ย␣ญ

High class nasals & liquids with HO

A silent [U+0E2B THAI CHARACTER HO HIP] is added before the characters in the list below to make their default tonal class high.

ม␣น␣ง␣ว␣ร␣ล␣ย␣ญ

Examples: หมา, หยุด

See onset_clusters for further details about how these are presented.

The letter O ANG

[U+0E2D THAI CHARACTER O ANG] is silent when used as a base for vowels at the beginning of a syllable. When it appears alone after a base consonant it becomes the vowel ɔː. It is also used in combination with other characters to produce additional vowel sounds (see independent).

Consonant clusters

Consonant clusters occur syllable-initially, or where one syllable ends with a consonant and the next begins with one.

Thai doesn't have conjuncts, stacking, or special code points for final consonants, etc.

Syllable-onset clusters

Consonant clusters at the start of a syllable are usually one of the following:

  1. A glide after an initial consonant, such as [U+0E25 THAI CHARACTER LO LING] and [U+0E23 THAI CHARACTER RO RUA] eg. ปลา
  2. The silent [U+0E2B THAI CHARACTER HO HIP] used to affect tonal values, eg. หมา
  3. The word-initial combination ทร [U+0E17 THAI CHARACTER THO THAHAN + U+0E23 THAI CHARACTER RO RUA], which is pronounced s, eg. ทราย

There are no dedicated code points for glides when they are used after an initial consonant, so it is feasible that ปลา could be pronounced pà laː in a different context.

Tone marks and/or super-/subscript vowel-signs are attached to the second consonant, eg. เปลี่ยน

Prescript vowel-signs are placed before the first consonant in the cluster, ie. at the start of the syllable, eg. (where this occurs twice) โปรแกรม

The vocalics can also be used after an initial consonant, and again can create ambiguity for pronunciation, eg. compare พฤหัส and พฤษภา.

Final+initial consonant folding

A consonant that appears at both the end of one syllable and the beginning of the next may be expressed with a single character, even if the sounds in each phonetic location differ, eg. in พิสดาร or in จุลทัศน์.

Only the following set of consonants behave in this way.

จ␣ช␣ศ␣ษ␣ส␣ล

Syllable-final consonants

Only the phonemes p, t, k, m, n, ŋ occur at the end of a syllable; however, many more consonant letters can appear in final position.

The following consonant letters are pronounced differently in syllable-initial and syllable-final positions.

จ␣ช␣ศ␣ษ␣ส␣ร␣ล␣ญ

For example, ล is pronounced l at the start of ลิง, but n at the end of ตำบล.

Consonants at the end of a syllable use ordinary code points, eg. ตื่น

This can create some ambiguity, since there is no distinction between the sequence in the previous example and one where the last consonant letter starts a new syllable with an inherent vowel.

The one exception is the character that is normally regarded as a vowel, [U+0E33 THAI CHARACTER SARA AM], which includes the final m sound, eg. ห้องน้ำ. (A final m is not always represented using sara am, eg. ห้าม.)

Consonant sounds mapped to characters

This section maps Thai consonant sounds to common graphemes, grouped by high class ( h ), mid class ( m ), low class ( l ) and syllable-final ( f ).

Initials

p
m

[U+0E1B THAI CHARACTER PO PLA], eg. ปลา.

b
m

[U+0E1A THAI CHARACTER BO BAIMAI], eg. ใบไม้.

h

[U+0E1C THAI CHARACTER PHO PHUNG], eg. ผึ้ง.

 
l

[U+0E1E THAI CHARACTER PHO PHAN], eg. พาน.

[U+0E20 THAI CHARACTER PHO SAMPHAO], eg. สำเภา.

t
m

[U+0E15 THAI CHARACTER TO TAO], eg. เต่า.

[U+0E0F THAI CHARACTER TO PATAK], eg. ปะฏัก (rare).

d
m

[U+0E14 THAI CHARACTER DO DEK], eg. เด็ก.

[U+0E0E THAI CHARACTER DO CHADA], eg. ชะฎา (rare).

h

[U+0E16 THAI CHARACTER THO THUNG], eg. ถุง.

[U+0E10 THAI CHARACTER THO THAN], eg. ฐาน.

 
l

[U+0E17 THAI CHARACTER THO THAHAN], eg. ทหาร.

[U+0E18 THAI CHARACTER THO THONG], eg. ธง.

[U+0E11 THAI CHARACTER THO NANGMONTHO], eg. มณโฑ.

[U+0E12 THAI CHARACTER THO PHUTHAO], eg. ผู้เฒ่า.

k
m

[U+0E01 THAI CHARACTER KO KAI], eg. ไก่.

h

[U+0E02 THAI CHARACTER KHO KHAI], eg. ไข่.

[U+0E03 THAI CHARACTER KHO KHUAT], eg. ฃวด (obsolete).

 
l

[U+0E04 THAI CHARACTER KHO KHWAI], eg. ควาย.

[U+0E06 THAI CHARACTER KHO RAKHANG], eg. ระฆัง.

[U+0E05 THAI CHARACTER KHO KHON], eg. ฅน (obsolete).

t͡ɕ
m

[U+0E08 THAI CHARACTER CHO CHAN], eg. จาน.

t͡ɕʰ
h

[U+0E09 THAI CHARACTER CHO CHING], eg. ฉิ่ง.

 
l

[U+0E0A THAI CHARACTER CHO CHANG], eg. ช้าง.

[U+0E0C THAI CHARACTER CHO CHOE], eg. เฌอ.

f
h

[U+0E1D THAI CHARACTER FO FA], eg. ฝา.

 
l

[U+0E1F THAI CHARACTER FO FAN], eg. ฟัน.

s
h

[U+0E2A THAI CHARACTER SO SUA], eg. เสือ.

[U+0E28 THAI CHARACTER SO SALA], eg. ศาลา.

[U+0E29 THAI CHARACTER SO RUSI], eg. ฤๅษี.

ทร [U+0E17 THAI CHARACTER THO THAHAN + U+0E23 THAI CHARACTER RO RUA] in initial position, eg. ทราย

 
l

[U+0E0B THAI CHARACTER SO SO], eg. โซ่.

h
h

[U+0E2B THAI CHARACTER HO HIP], eg. หีบ

 
l

[U+0E2E THAI CHARACTER HO NOKHUK], eg. นกฮูก.

 
l

[U+0E27 THAI CHARACTER WO WAEN], eg. ว่าง.

 
l

[U+0E23 THAI CHARACTER RO RUA], eg. เรือ.

 
l

[U+0E25 THAI CHARACTER LO LING], eg. ลิง.

[U+0E2C THAI CHARACTER LO CHULA], eg. จุฬา.

 
l

[U+0E22 THAI CHARACTER YO YAK], eg. ยักษ์.

[U+0E0D THAI CHARACTER YO YING], eg. ประเทศญี่ปุ่น.

Vocalics

ri
 

[U+0E24 THAI CHARACTER RU], eg. อังกฤษ.

 

[U+0E24 THAI CHARACTER RU], eg. ฤดู.

rɯː
 
rɤː
 

[U+0E24 THAI CHARACTER RU], eg. ฤกษ์.

Finals

p
f

[U+0E1D THAI CHARACTER FO FA]

[U+0E1B THAI CHARACTER PO PLA], eg. ทวีป.

[U+0E1A THAI CHARACTER BO BAIMAI], eg. ดิบ.

[U+0E1E THAI CHARACTER PHO PHAN], eg. กรุงเทพฯ.

[U+0E20 THAI CHARACTER PHO SAMPHAO], eg. ลาภ.

t
f

[U+0E16 THAI CHARACTER THO THUNG], eg. รถ.

[U+0E10 THAI CHARACTER THO THAN], eg. ประเสริฐ.

[U+0E2A THAI CHARACTER SO SUA], eg. โอกาส.

[U+0E28 THAI CHARACTER SO SALA], eg. อากาศ.

[U+0E29 THAI CHARACTER SO RUSI], eg. พิษ.

  [U+0E15 THAI CHARACTER TO TAO], eg. ชีวิต.

[U+0E14 THAI CHARACTER DO DEK], eg. ตลาด.

  [U+0E0F THAI CHARACTER TO PATAK], eg. ปรากฏ.

[U+0E0E THAI CHARACTER DO CHADA], eg. กฎ.

[U+0E08 THAI CHARACTER CHO CHAN], eg. ดุจ.

[U+0E17 THAI CHARACTER THO THAHAN], eg. บาท.

[U+0E18 THAI CHARACTER THO THONG], eg. โกรธ.

[U+0E11 THAI CHARACTER THO NANGMONTHO], eg. ครุฑ.

[U+0E12 THAI CHARACTER THO PHUTHAO]

[U+0E0A THAI CHARACTER CHO CHANG], eg. ประโยชน์.

[U+0E0B THAI CHARACTER SO SO] 

k
f
m
f

[U+0E21 THAI CHARACTER MO MA], eg. ยิ้ม.

n
f

[U+0E23 THAI CHARACTER RO RUA], eg. นคร.

[U+0E25 THAI CHARACTER LO LING], eg. ตำบล.

[U+0E2C THAI CHARACTER LO CHULA], eg. ลคุฬ.

[U+0E0D THAI CHARACTER YO YING], eg. บังเอิญ.

[U+0E13 THAI CHARACTER NO NEN], eg. คุณ.

[U+0E19 THAI CHARACTER NO NU], eg. อ้วน.

ŋ
f

[U+0E07 THAI CHARACTER NGO NGU], eg. ลิง.

Other characters

๎␣ฺ

  [U+0E4E THAI CHARACTER YAMAKKAN] is an ancient punctuation mark used to mark clusters, such as in พ๎ราห๎มณ p̱ʰ๎ṟāh๎m̱ṇ̱ pʰraːmǒn

[U+0E3A THAI CHARACTER PHINTHU] is used as a virama when writing Pali.

Numbers, dates, currency, etc.

Thai has a set of decimal digits that are used regularly.

๐␣๑␣๒␣๓␣๔␣๕␣๖␣๗␣๘␣๙
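Because the Thai digits are ordinary Unicode decimal digits, converting between them and ASCII digits is straightforward. The Python sketch below uses translation tables; the function names are illustrative. Python's built-in int() also parses Thai digits directly.

```python
# Thai digits occupy U+0E50..U+0E59 in the same order as 0..9.
THAI_DIGITS = "".join(chr(0x0E50 + i) for i in range(10))

_TO_THAI = str.maketrans("0123456789", THAI_DIGITS)
_TO_ASCII = str.maketrans(THAI_DIGITS, "0123456789")


def to_thai_digits(text):
    """Replace ASCII digits in text with Thai digits."""
    return text.translate(_TO_THAI)


def to_ascii_digits(text):
    """Replace Thai digits in text with ASCII digits."""
    return text.translate(_TO_ASCII)


if __name__ == "__main__":
    thai = to_thai_digits("2543")
    print(thai, int(thai), to_ascii_digits(thai))
```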

The CLDR standard-decimal pattern is #,##0.###. The standard-percent pattern is #,##0%.

Currency

The currency symbol for baht is encoded in the Unicode Thai block.

฿

The CLDR standard format for currency is ¤#,##0.00.

Dates

Thailand commonly uses the Buddhist Era calendar. The Gregorian year 2000 was 2543 in the Buddhist calendar.

วันพุธที่ 15 มีนาคม พ.ศ. 2538 ขึ้น 15 ค่ำ เดือน 4 ปีจอ

Buddhist era date at the top of a Thai newspaper: 15 March 1995.

In the date above, the abbreviation พ.ศ. p̱ʰ.ś. stands for Buddhist era.
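The year correspondence given above (2000 is 2543, and the newspaper date 2538 is 1995) implies a fixed offset of 543 years. The Python sketch below applies that offset to years only; it is a simplification that assumes modern dates and does not attempt to handle month or day conversion or historical calendar changes. The function names are illustrative.

```python
BUDDHIST_ERA_OFFSET = 543  # derived from 2543 - 2000 in the text above


def to_buddhist_era(gregorian_year):
    """Convert a Gregorian year to a Buddhist Era year."""
    return gregorian_year + BUDDHIST_ERA_OFFSET


def to_gregorian(buddhist_era_year):
    """Convert a Buddhist Era year to a Gregorian year."""
    return buddhist_era_year - BUDDHIST_ERA_OFFSET


if __name__ == "__main__":
    print(to_buddhist_era(2000), to_gregorian(2538))
```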

Text direction

Thai text runs left to right in horizontal lines.


Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Thai character app.

None of the Thai characters require special shaping based on the visual context. Nor is printed text cursive.

The orthography has no case distinction, and no special transforms are needed to convert between characters.

Writing styles

Modern type styles often omit the loops found in more traditional typefaces. See an article that explores this in depth.

Loopless is considered to be more contemporary and modern, and is mainly used for advertising and titling. The distinction doesn’t necessarily map to that of serif vs sans – Noto, for example, provides both serif and sans Thai font faces, but they both have loops. On the other hand, Neue Frutiger Thai offers traditional (looped) and modern (loopless) alternatives as part of the same font family (each with regular, italic and bold substyles).

ทุกคนมีสิทธิที่จะออกจากประเทศใด ๆ ไป รวมทั้งประเทศของตนเองด้วย และที่จะกลับยังประเทศตน
The Silom font uses the traditional looped glyphs.
ทุกคนมีสิทธิที่จะออกจากประเทศใด ๆ ไป รวมทั้งประเทศของตนเองด้วย และที่จะกลับยังประเทศตน
The Sukhumvit Set font uses modern unlooped glyphs.

Context-based positioning

Most of the combining characters in Thai are used for vowel-signs and tone marks.

Combining characters need to be placed in different positions, according to the context. The example below shows the same tone character displayed at different heights, according to what falls beneath it.

ให้มีขึ้น

The same tone mark displayed at different heights.

Thai regularly combines multiple combining characters above a base consonant. There are two examples in the text below, both of which show a base character with a vowel sign and then a tone mark on top.

ครั้งที่

Multiple diacritics (vowel sign + tone mark) attached to the same base character.

Baselines, line height, etc.

Thai places vowel and tone marks above base characters, one above the other, and can also add combining characters below the line. The complexity of these marks means that the vertical resolution needed for clearly readable Thai text is higher than for English, or most Latin text.

The baseline of Thai text is the same as that of embedded Latin text, so there are no particular issues related to alignment of baselines between fonts.

พรุ่งนี้
Metrics for the Noto Serif Thai and Noto Serif fonts, viewed in Hibizcus.

Line height, then, tends to be greater than for Latin text, and Thai also tends to add more interline spacing than Latin text does.

Font styles

Ben Mitchell describes how italicisation is used for meta-text and to convey the ‘about’ voice, rather than for emphasis or names of things (for which bold is used).

Italicisation tends to be applied to whole paragraphs or groups of paragraphs, for such things as picture captions, bylines, and other labels, commentaries, summaries such as standfirsts in magazines or news stories, and signposting. It is also regularly used for direct speech between quote marks.

Observation: Thai newspapers appear to use italic text for captions and by-lines. There is no evidence of the use of inline italicisation, but there is inline bolding.

Punctuation & inline features

Grapheme boundaries

Non-combining Thai vowel characters are treated as independent grapheme clusters. Only combining characters are grouped together with their base into a cluster.

และที่จะกลับ
Grapheme cluster boundaries split spacing vowel-signs from their base consonants, but not combining characters.

The grapheme cluster boundaries indicate the units of text used by cursor movements and forward deletion. They also allow justification algorithms to insert equal amounts of space between non-combining letters, including between non-combining vowel-signs and their consonants.
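The grouping described in this section can be approximated in a few lines of Python. The sketch below attaches each nonspacing mark (general category Mn) to the character before it and leaves every other character as its own unit. It is a simplified stand-in for illustration, not an implementation of the full Unicode text segmentation rules, and the function name is hypothetical.

```python
import unicodedata


def simple_clusters(text):
    """Group each base character with any nonspacing marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) == "Mn":
            clusters[-1] += ch   # combining mark joins the previous unit
        else:
            clusters.append(ch)  # spacing characters start a new unit
    return clusters


# The example phrase from this section, written with escapes.
PHRASE = "\u0E41\u0E25\u0E30\u0E17\u0E35\u0E48\u0E08\u0E30\u0E01\u0E25\u0E31\u0E1A"

if __name__ == "__main__":
    print(simple_clusters(PHRASE))
```

In the example phrase, the pre-base vowel and the post-base vowels stand alone, while the vowel-sign and tone mark above a consonant stay attached to it.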

Word boundaries

Thai doesn't separate words in a phrase.

There is, however, a concept of words in the text. For example, lines are supposed to be broken at word boundaries.

รวมทั้งวิทยาการด้านคอมพิว
Word boundaries occur where the vertical lines appear, though they are not marked by the script.

The main difficulty arises when dealing with compound words. It can often be difficult to decide whether a given string of syllables represents multiple words or a single compound word.

ตัวอย่างการเขียนภาษาไทย

ตัวอย่างการเขียนภาษาไทย

Alternative line break opportunities for Thai text using compound nouns.

The variation may be related to the operation being performed on the text (eg. line breaking in narrow newsprint columns, vs. double-click selection, vs. cursor movement, etc.), or it may just be down to personal preference.

The difference may also be contextually dependent. Wirote Aroonmanakun describes how คนขับรถ should be viewed as a single word in the context คนขับรถนั่งคอยอยู่ในรถ, whereas in the phrase คนขับรถผ่านแยกนี้ไม่มากนัก it would be viewed as 3 words, referring to anyone who is driving.

Proper names, which are composed from multiple words, are also problematic, especially because there are no capital letters to distinguish them from other pieces of text.

ZWSP & WJ

In order to manually fine-tune word-boundary detection, the invisible character ZWSP [U+200B ZERO WIDTH SPACE] can be used to create breaks.

To prevent a break between syllables, WJ [U+2060 WORD JOINER] can be used.

It is also important to bear in mind that Thai may be used to write various languages, in particular minority languages for which different dictionaries are needed. Since such dictionaries may not be available in a given browser or other application, there is a tendency to use ZWSP in order to compensate.

Large-scale manual entry of ZWSP and WJ has potential downsides because the user cannot see them; this leads to problems with ZWSP being inserted in the wrong position, or multiple times. However, these don't set a state, so it doesn't create major issues. It would be useful, however, if an editor showed the location of these characters.

Care should also be taken when trying to match text, eg. for searching in a page. WJ should be ignored. ZWSP may or may not be ignored, depending on whether word boundaries are significant for the search.
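The matching advice above could be applied by normalising both the search term and the text before comparison. The following Python sketch is one possible approach; the function name and the keep_word_boundaries parameter are invented for the example.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE
WJ = "\u2060"    # WORD JOINER


def normalize_for_search(text, keep_word_boundaries=False):
    """Strip WJ always, and ZWSP unless word boundaries matter."""
    text = text.replace(WJ, "")
    if not keep_word_boundaries:
        text = text.replace(ZWSP, "")
    return text


if __name__ == "__main__":
    stored = "คน" + ZWSP + "ขับ" + WJ + "รถ"
    print(normalize_for_search(stored) == "คนขับรถ")
```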

Phrase & section boundaries

,␣:␣.␣?␣!␣๚␣๛
phrase: U+0020 SPACE; , [U+002C COMMA]; : [U+003A COLON]
sentence: U+0020 SPACE; . [U+002E FULL STOP]; ? [U+003F QUESTION MARK]; ! [U+0021 EXCLAMATION MARK]
section: ๚ [U+0E5A THAI CHARACTER ANGKHANKHU]
chapter/document: ๛ [U+0E5B THAI CHARACTER KHOMUT]

Thai uses space as a phrase marker, rather than to delimit words, often in places where English text would use commas or periods.

Latin-based punctuation such as comma, period, and colon are also used in text, particularly in conjunction with Latin letters or in formatting numbers, addresses, and so forth.

[U+0E5A THAI CHARACTER ANGKHANKHU] is used to mark the end of a long segment of text. It can be combined as ๚ะ to mark a larger segment of text; typically this usage can be seen at the end of a verse in poetry.

[U+0E5B THAI CHARACTER KHOMUT] marks the end of a chapter or document, where it always follows the ๚ะ combination.

Dashes include

‐␣–␣—

Parentheses & brackets

(␣)
           start                        end
standard   ( [U+0028 LEFT PARENTHESIS]  ) [U+0029 RIGHT PARENTHESIS]

Quotations

“␣”␣‘␣’
          start                                  end
initial   “ [U+201C LEFT DOUBLE QUOTATION MARK]  ” [U+201D RIGHT DOUBLE QUOTATION MARK]
nested    ‘ [U+2018 LEFT SINGLE QUOTATION MARK]  ’ [U+2019 RIGHT SINGLE QUOTATION MARK]

Emphasis

tbd

Abbreviation, ellipsis & repetition

Repetition

[U+0E46 THAI CHARACTER MAIYAMOK] is used to mark repetition of preceding letters. It is typically preceded and followed by a space, eg. ทุกวัน ๆ. However, some publishers prefer to publish without a leading space, ie. ทุกวันๆ.

This character shouldn't be wrapped to the beginning of a new line on its own, and during justification it should be kept close to the preceding text that it duplicates.
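Since both spacing conventions occur, a publishing workflow may want to convert text to one house style. The Python sketch below shows one way to add or remove the space before MAIYAMOK; the function names are hypothetical, and only ordinary U+0020 spaces are considered.

```python
import re

MAIYAMOK = "\u0E46"  # THAI CHARACTER MAIYAMOK


def with_leading_space(text):
    """Ensure exactly one space precedes every MAIYAMOK."""
    return re.sub(" *" + MAIYAMOK, " " + MAIYAMOK, text)


def without_leading_space(text):
    """Remove any spaces that precede a MAIYAMOK."""
    return re.sub(" +" + MAIYAMOK, MAIYAMOK, text)


if __name__ == "__main__":
    print(with_leading_space("ทุกวัน" + MAIYAMOK))
    print(without_leading_space("ทุกวัน " + MAIYAMOK))
```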

Abbreviation

[U+0E2F THAI CHARACTER PAIYANNOI] is used to indicate elision or abbreviation of letters; it is viewed as a kind of letter, however, and is used with considerable frequency because of its appearance in such words as the Thai name for Bangkok, กรุงเทพฯ k̯ṟuŋ̱eṯʰp̱ʰ⋯ krūŋ tʰêːp, which is short for กรุงเทพมหานคร k̯ṟuŋ̱eṯʰp̱ʰm̱hāṉḵʰṟ krūŋ tʰêːp mahǎː nákʰɔ̄ːn. It is followed by a space.

Ellipsis

ฯ␣…

Paiyannoi is also used in the combination ฯลฯ to create a construct called paiyanyai, which means “et cetera, and so forth.”

Some abbreviations are written using a full stop, eg. สนง.ตปท. sṉŋ̱.t̯p̯ṯʰ. Office of the Royal Thai Police which is short for สำนักงานตำรวจแห่งชาติ saᵐṉäk̯ŋ̱āṉt̯aᵐṟw̱c̯ɛh¹ŋ̱c̱ʰāt̯i

CLDR indicates that [U+2026 HORIZONTAL ELLIPSIS] is also used for ellipsis.

Inline notes & annotations

tbd

Other inline ranges

tbd

Other punctuation

CLDR indicates that the following are also used:

′␣″

Line & paragraph layout

Line breaking & hyphenation

Thai doesn't indicate word boundaries, but when Thai text is wrapped at the end of a line you should not split a word.

As you change the width of the browser window the highlighted text above should break at the following points if your browser supports Thai wrapping:

โลกจะใช้เพียง
Break points detected in a sequence of Thai characters by an automatic word segmenter.

Because Thai doesn't separate words, applications typically look up word boundaries in a dictionary. However, such lookup doesn't always produce the needed result, especially when dealing with compound words and proper names (see words). To counteract these deficiencies, authors may use ZWSP [U+200B ZERO WIDTH SPACE] and WJ [U+2060 WORD JOINER] (see zwsp).
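The dictionary lookup and the two invisible characters can be illustrated with a short Python sketch. It is not how any particular browser segments text: the greedy longest-match strategy, the function names `segment`, `mark_breaks` and `forbid_break`, and the four-word `LEXICON` are all assumptions made for illustration, with the lexicon chosen only to cover the sample phrase used on this page.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: a line break is allowed here
WJ = "\u2060"    # WORD JOINER: a line break is not allowed here

# Hypothetical toy lexicon; a real segmenter uses a large dictionary.
LEXICON = {"โลก", "จะ", "ใช้", "เพียง"}


def segment(text, lexicon=LEXICON):
    """Greedy longest-match segmentation against a non-empty lexicon.

    Characters that match no entry fall out as one-character chunks.
    """
    longest = max(map(len, lexicon))
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(longest, len(text) - i), 0, -1):
            if text[i:i + size] in lexicon or size == 1:
                words.append(text[i:i + size])
                i += size
                break
    return words


def mark_breaks(words):
    """Join words with ZWSP so the break opportunities are explicit."""
    return ZWSP.join(words)


def forbid_break(left, right):
    """Glue two strings with WJ so that no break is taken between them."""
    return left + WJ + right
```

An author who knows that the dictionary will split a proper name wrongly could apply `forbid_break` to its parts, and use `mark_breaks` on a hand-checked list of words where the dictionary finds no boundary at all.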

Text alignment & justification

Justification in Thai primarily adjusts the blank spaces between phrases, rather than expanding the text between words or syllables. The fact that lines break at word boundaries helps reduce the size of the gaps produced.

Thai may also make certain adjustments to inter-character spacing. The character-based spacing is most common in narrow columns, such as newsprint, where there is no space except at the end of a line.

Any ZWSP [U+200B ZERO WIDTH SPACE] used to separate words is ignored during justification. Justification proceeds as if it wasn't there.u,625
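As a rough illustration of these two points (extra width goes to the visible spaces between phrases, and ZWSP takes no part), the following Python sketch shares a number of abstract width units across the gaps in a line. The function name `share_extra_space` and the idea of counting width in whole units are assumptions for the example; a real layout engine works with measured glyph advances.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE


def share_extra_space(line, extra):
    """Distribute `extra` width units across the U+0020 gaps in `line`.

    ZWSP is dropped first, so it never receives any of the extra width.
    Returns the visible text and the extra width assigned to each gap,
    in order. A line without spaces gets an empty list.
    """
    visible = line.replace(ZWSP, "")
    gaps = visible.count(" ")
    if gaps == 0:
        return visible, []
    base, remainder = divmod(extra, gaps)
    # Earlier gaps absorb the remainder, one unit each.
    return visible, [base + (1 if i < remainder else 0) for i in range(gaps)]
```

When the returned list is empty there is nothing to stretch between phrases, which is the case where inter-character spacing comes into play.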

Inter-character spacing

The justification in the figure below shows equal spacing across a phrase where there are no space characters to stretch. Note how the equal spacing separates prebase and postbase vowel signs from their consonants by the same amount as consonants are separated from each other; they are not kept together with the base consonant they modify.

แนะโบรกฯรวมตัวตั้งสนง.ตปท.

A line with no spaces applies inter-character spacing to justify the text.
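One way to find the places where such space can be inserted is to keep only non-spacing marks (Unicode general category Mn, which covers the Thai vowel signs and tone marks written above or below the consonant) attached to the preceding character. The Python sketch below does this; the function name `spacing_units` is an assumption, and the approach is an illustration rather than a description of any particular layout engine.

```python
import unicodedata


def spacing_units(text):
    """Split text into units between which extra space may be added.

    Non-spacing marks (category Mn) stay with the preceding character.
    Prebase and postbase vowel signs are spacing letters, so each one
    becomes a unit of its own.
    """
    units = []
    for ch in text:
        if units and unicodedata.category(ch) == "Mn":
            units[-1] += ch
        else:
            units.append(ch)
    return units
```

With this rule, แนะ yields three units (prebase vowel, consonant, postbase vowel), while in ตั้ง the vowel sign and tone mark above the first consonant stay with it, giving two units.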

This kind of spacing requires a special behaviour for [U+0E33 THAI CHARACTER SARA AM]. The small circle is kept with the preceding consonant, and space is added before the spacing part of the vowel, as shown in the figure below.

น้ำ นํ้า
Sara AM before (left) and after (right) inter-character spacing has been applied.

(To facilitate this, applications tend to convert [U+0E33 THAI CHARACTER SARA AM] to the sequence ํา [U+0E4D THAI CHARACTER NIKHAHIT + U+0E32 THAI CHARACTER SARA AA] before stretching. Care must also be taken to order the superscript glyphs correctly, since in memory the tone mark precedes the nikhahit. The nikhahit character is not otherwise used for modern Thai.)
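The conversion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `decompose_sara_am` is hypothetical, only the four tone marks U+0E48 to U+0E4B are treated as marks that may sit before SARA AM, and the result is meant for internal layout work, not for storing text.

```python
import re

SARA_AM = "\u0e33"
NIKHAHIT = "\u0e4d"
SARA_AA = "\u0e32"

# An optional tone mark (MAI EK .. MAI CHATTAWA) stored before SARA AM.
_TONE_THEN_AM = re.compile("([\u0e48-\u0e4b]?)" + SARA_AM)


def decompose_sara_am(text):
    """Rewrite [tone mark] + SARA AM as NIKHAHIT + [tone mark] + SARA AA.

    Moving the nikhahit in front of the tone mark keeps the circle with
    the consonant and leaves SARA AA free to be spaced separately.
    """
    return _TONE_THEN_AM.sub(
        lambda m: NIKHAHIT + m.group(1) + SARA_AA, text
    )
```

Applied to น้ำ (consonant, tone mark, SARA AM), the function produces the four-character sequence consonant, nikhahit, tone mark, SARA AA, which matches the right-hand form in the figure.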

Paragraph indentation

Thai indents the first line of a paragraph.

Examples of indented paragraph start.

Indentations at paragraph start in a Thai newspaper.

Letter spacing

See justify.

Counters, lists, etc.

You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.

[U+0E4F THAI CHARACTER FONGMAN] is the Thai bullet, which is used to mark items in lists or appears at the beginning of a verse, sentence, paragraph, or other textual segment. u,625

The modern Thai orthography uses numeric and alphabetic styles.

Numeric

The thai numeric style is decimal-based and uses the digits shown below.

๐␣๑␣๒␣๓␣๔␣๕␣๖␣๗␣๘␣๙

Examples:

๑␣๒␣๓␣๔␣๑๑␣๒๒␣๓๓␣๔๔␣๑๑๑␣๒๒๒␣๓๓๓␣๔๔๔
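Since the style is plain decimal with substituted digits, a counter value can be produced by mapping each ASCII digit to its Thai counterpart. The Python sketch below shows one way to do that; the function name `to_thai_digits` is an assumption for the example.

```python
# Map ASCII digits 0-9 to THAI DIGIT ZERO..NINE (U+0E50..U+0E59).
_THAI_DIGIT_TABLE = {ord("0") + i: 0x0E50 + i for i in range(10)}


def to_thai_digits(n):
    """Format an integer in decimal using Thai digits."""
    return str(n).translate(_THAI_DIGIT_TABLE)
```

For instance, `to_thai_digits(11)` gives ๑๑ and `to_thai_digits(444)` gives ๔๔๔, in line with the examples above.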

Alphabetic

The thai-alphabetic style uses the letters shown below.

ก␣ข␣ค␣ง␣จ␣ฉ␣ช␣ซ␣ฌ␣ญ␣ฎ␣ฏ␣ฐ␣ฑ␣ฒ␣ณ␣ด␣ต␣ถ␣ท␣ธ␣น␣บ␣ป␣ผ␣ฝ␣พ␣ฟ␣ภ␣ม␣ย␣ร␣ล␣ว␣ศ␣ษ␣ส␣ห␣ฬ␣อ␣ฮ

Examples:

ก␣ข␣ค␣ง␣ฎ␣น␣ล␣กค␣ขภ␣จด␣ซจ␣ญว
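The examples above are consistent with the usual behaviour of an alphabetic counter: after the 41 letters are used up, the sequence continues with two-letter combinations, as in spreadsheet column names. The Python sketch below reproduces that scheme; the function name `thai_alphabetic` is an assumption, and the letter list is copied from the set shown above.

```python
THAI_ALPHABETIC = "กขคงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรลวศษสหฬอฮ"


def thai_alphabetic(n):
    """Return the alphabetic counter representation of n (n >= 1).

    Uses bijective numeration: there is no zero digit, so the value is
    shifted down by one before each division.
    """
    if n < 1:
        raise ValueError("alphabetic counters start at 1")
    base = len(THAI_ALPHABETIC)
    letters = []
    while n > 0:
        n, remainder = divmod(n - 1, base)
        letters.append(THAI_ALPHABETIC[remainder])
    return "".join(reversed(letters))
```

Under this scheme 44 becomes กค, 111 becomes ขภ and 444 becomes ญว, matching the examples.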

Styling initials

It is possible to find the first letter in a paragraph styled so that it is larger and sits alongside several lines of the continuing paragraph text.

Observation: All combining characters are included in the selections shown in the figures below.

Any punctuation, such as opening quotes and opening parentheses, should presumably also be included in the initial styling, though this is unconfirmed.

Examples of dropped highlights

Two example paragraphs showing dropped highlighted initials with combining characters.

Observation: In the figures shown, the alphabetic baseline of the highlighted letter falls slightly below the bottom of the row that determines the size of the highlighted letter. It's not clear whether that's a general trend, or just related to this specific publication.

Observation: In the figure below, the selection picks out only แ from the syllable แฉ.

Examples of dropped highlights

Another example paragraph, showing a prescript vowel-sign alone as a highlighted initial.

Page & book layout

This section is for any features that are specific to Thai and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

Languages using the Thai script

According to ScriptSource, the Thai script is used for the following languages:

Online resources

  1. Universal Declaration of Human Rights - Thai
  2. Thairath News
  3. Wikipedia

References