Tai Nüa

Tai Le orthography notes

Updated 29 November, 2024

This page brings together basic information about the Tai Le script and its use for the Tai Nüa language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Tai Nüa using Unicode.

Referencing this document

Richard Ishida, Tai Nüa (Tai Le) Orthography Notes, 29-Nov-2024, https://r12a.github.io/scripts/tale/tdd

Sample

The two paragraphs show the same text, except that the first uses spacing characters for tone marks, and the second shows the older combining-character orthography.


ᥘᥬᥰ​ᥔᥩᥛᥳ​ᥝᥢᥰ​ᥘᥭᥳ,​ᥐᥩᥙ​ᥘᥭᥲ​ᥑᥨᥛᥱ​ᥑᥦᥢᥴ​ᥖᥫᥒᥰ​ᥐᥣ,​ᥝᥣᥐ​ᥖᥣᥙᥱ​ᥐᥨᥢᥰ​ᥑᥥᥢᥴ​ᥛᥣᥰ​ᥔᥥᥴ,​ᥛᥣᥢᥲ​ᥘᥣᥰ​ᥟᥢᥐᥬᥲ​ᥓᥛ​ᥑᥩᥙᥱ​ᥞᥤᥛᥰ​ᥙᥥᥲ​ᥢᥣᥢᥳ​ᥘᥢᥳ,​ᥛᥤᥰ​ᥐᥣᥭᥰ​ᥚᥣᥒᥳ​ᥓᥤᥢ​ᥚᥧᥒᥴ​ᥘᥫᥒ​ᥑᥝᥲ​ᥛᥣᥢᥳ​ᥛᥣᥰ,​ᥟᥣ​ᥛᥥᥝᥰ​ᥖᥭᥰ​ᥖᥒᥰ​ᥘᥣᥭᥴ​ᥟᥩᥢ​ᥐᥢ​ᥐᥣᥱ​ᥓᥩᥭ​ᥗᥦᥛᥴ.

ᥘᥬ̈​ᥔᥩᥛ̇​ᥝᥢ̈​ᥘᥭ̇,​ᥐᥩᥙ​ᥘᥭ̀​ᥑᥨᥛ̌​ᥑᥦᥢ́​ᥖᥫᥒ̈​ᥐᥣ,​ᥝᥣᥐ​ᥖᥣᥙ̌​ᥐᥨᥢ̈​ᥑᥥᥢ́​ᥛᥣ̈​ᥔᥥ́,​ᥛᥣᥢ̀​ᥘᥣ̈​ᥟᥢᥐᥬ̀​ᥓᥛ​ᥑᥩᥙ̌​ᥞᥤᥛ̈​ᥙᥥ̀​ᥢᥣᥢ̇​ᥘᥢ̇,​ᥛᥤ̈​ᥐᥣᥭ̈​ᥚᥣᥒ̇​ᥓᥤᥢ​ᥚᥧᥒ́​ᥘᥫᥒ​ᥑᥝ̀​ᥛᥣᥢ̇​ᥛᥣ̈,​ᥟᥣ​ᥛᥥᥝ̈​ᥖᥭ̈​ᥖᥒ̈​ᥘᥣᥭ́​ᥟᥩᥢ​ᥐᥢ​ᥐᥣ̌​ᥓᥩᥭ​ᥗᥦᥛ́.

Source: Tai Dehong story site

Usage & history

Origins of the Tai Le script, 1954 – today.

Phoenician

└ Aramaic

└ Brahmi

└ Pallava

└ Mon-Burmese

└ Lik-Tai

└ Tai Le

+ Ahom

+ Khamti

The Tai Le script, or Dehong Dai script, is used to write the Tai Nüa language of south-central Yunnan, China. (The language is also known as Tai Nüa, Dehong Dai, Tai Mau, Tai Kong, and Chinese Shan.)

The script is currently widely used in China for government documents, public notice boards and signage, in advertising, education and publishing. There are 6 publishing houses in China which publish over 45,000 book copies per year in the script. It is estimated that speakers of Tai Le in Dehong are about 95% literate in the Tai Le script.

ᥖᥭᥰᥘᥫᥴ

Several orthographic conventions have been used over the 700-800 years of the script's use. Between 1952 and 1988, the script went through four reforms. The reform of 1954 rationalised the old system, to reduce the redundancy of symbols to represent sounds, to represent tones more accurately, and to standardise the handwritten cursive forms. That of 1963/4 standardised combining marks used to represent tones. The reform of 1988 replaced the tone diacritics with today's spacing characters.

More information: Unicode Standard, Wikipedia, Scriptsource

Basic features

The Tai Le script is an abugida, ie. consonants carry an inherent vowel sound that is overridden using vowel signs. In Tai Le, consonants carry an inherent vowel a. See the table to the right for a brief overview of features for the modern Tai Nüa orthography.

Tai Le text runs left to right in horizontal lines. Words are not separated by spaces.

The key distinguishing feature of Tai Le is its regularity and simplicity compared to other Tai scripts. The sequence of characters is C(V)(C)(T). Tones always go after any other characters in the syllable.


Tai Le's 19 consonants are straightforward. There is no duplication for tone support, no stacking or other conjunct behaviour, etc.

Syllable-final consonant sounds use ordinary code points without an inherent vowel. Parsing syllables is usually straightforward because each syllable-final consonant is followed by a tone mark.


The Tai Le orthography is an abugida with one inherent vowel, pronounced a.

Other vowels are written using 10 dedicated vowel letters. All vowel signs are ordinary spacing characters (no combining marks), and are stored and displayed after the base character. A rhyme can also end with a -j or -w glide. There is a dedicated letter for the former, and the letter w doubles for the latter.

There are no pre-base vowels, circumgraphs, or multipart vowels.

Tones are written using 5 spacing characters after the final character in a syllable; until 1988 they were written using 5 combining marks.

Tai Nüa numbers use ASCII digits, but may also use Myanmar digits with some shape changes.

Character index

Letters


Consonants

ᥙ␣ᥚ␣ᥖ␣ᥗ␣ᥐ␣ᥠ␣ᥟ␣ᥓ␣ᥡ␣ᥜ␣ᥔ␣ᥑ␣ᥞ␣ᥛ␣ᥢ␣ᥒ␣ᥝ␣ᥘ␣ᥕ

Vowels

ᥤ␣ᥪ␣ᥧ␣ᥥ␣ᥨ␣ᥫ␣ᥦ␣ᥩ␣ᥣ␣ᥬ␣ᥭ

Tones

ᥰ␣ᥱ␣ᥲ␣ᥳ␣ᥴ

Combining marks


Tones

̈␣̌␣̀␣̇␣́

Numbers

၀␣၁␣၂␣၃␣၄␣၅␣၆␣၇␣၈␣၉

Punctuation

〈␣〉␣《␣》␣(␣)␣!␣?␣:␣;␣。␣、␣,␣.

Phonology

These sounds are for the Tai Nüa language.


Phones in a lighter colour are non-native or allophones. Source Wikipedia.

Vowel sounds

Plain vowels

i ɯ ɯ u e o ə ə ɛ ɔ a

Rhymes

ia ip it ik im in iw  
  ɯp ɯt ɯk ɯm ɯn ɯŋ ɯw ɯj
ua up ut uk um un   uj
  ep et ek em en ew ej
  op ot ok om on ow oj
  əp ət ək əm ən əŋ əw əj
  ɛp ɛt ɛk ɛm ɛn ɛŋ ɛw ɛj
  ɔp ɔt ɔk ɔm ɔn ɔŋ   ɔj
ap at ak am an aw aj
  aːp aːt aːk aːm aːn aːŋ aːw aːj

Source: Wikipedia

Consonant sounds

labial dental alveolar post-alveolar palatal velar glottal
stops p t       k ʔ
aspirated        
affricates   t͡s          
aspirated   t͡sʰ          
fricatives f   s     x h
nasals m   n     ŋ
approximants w   l   j  

kʰ and t͡sʰ appear in loan words.

Syllable-final

labial dental alveolar post-alveolar palatal velar glottal
stop p t       k  
nasal m   n     ŋ

Tone

Tai Nüa has the following 6 tones in unchecked syllables.

Description Representation
mid-rise ˨˦ 35 ¹
high-level ˥ 55 ²
low-level ˩ 11 ³
mid-fall ˦˨ 42
high-fall ˥˧ 54
mid-level ˧ 33

Checked syllables are limited to the following 3 tones.

Description Representation
mid-rise ˨˦ 35 ¹
low-level ˩ 11 ³
high-fall ˥˧ 54

Structure

The script is syllable-based.

A syllable's phonetic and orthographic structure is very simple: C(V)(C)(T).

There are no medial consonant letters. Single characters are available for the onset sequences.

Syllable-final consonants are the same characters used for onset.

There are 6 tones, 5 of which are marked, either by spacing characters or, in older orthographies, by combining marks. The tone mark always comes at the end of the syllable.
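
The C(V)(C)(T) pattern can be expressed as a regular expression. The following Python sketch is illustrative only, not a definitive segmenter. The character classes are assumptions based on the layout of the Unicode Tai Le block (consonants U+1950..U+1962, vowels U+1963..U+196C, the glide U+196D, tone letters U+1970..U+1974), the set of finals follows the syllable-final sounds listed earlier plus the two glides, and the names `SYLLABLE` and `split_syllables` are invented for this example. A candidate final that is directly followed by a vowel letter or the glide is treated as the onset of the next syllable. Sequences that combine an unmarked tone with the inherent vowel remain ambiguous and may be split wrongly.

```python
import re

CONSONANT = "\u1950-\u1962"   # ᥐ..ᥢ (assumed onset range)
VOWEL = "\u1963-\u196C"       # ᥣ..ᥬ (assumed vowel range)
# Finals p t k m n ŋ, plus ᥝ (-w) and the glide ᥭ (-j).
FINAL = "\u1959\u1956\u1950\u195B\u1962\u1952\u195D\u196D"
# Modern tone letters and the pre-1988 combining tone marks.
TONE = "\u1970-\u1974\u0300\u0301\u0307\u0308\u030C"

SYLLABLE = re.compile(
    f"[{CONSONANT}]"                      # C: onset
    f"[{VOWEL}]?"                         # (V): omitted for the inherent vowel
    f"(?:[{FINAL}](?![\u1963-\u196D]))?"  # (C): final, unless a vowel/glide follows
    f"[{TONE}]?"                          # (T): tone letter or combining mark
)

def split_syllables(text):
    """Return candidate syllables; other characters are skipped."""
    return SYLLABLE.findall(text)
```

For example, `split_syllables("ᥟᥢᥐᥬᥲ")` separates the two syllables ᥟᥢ and ᥐᥬᥲ that appear together in the sample text.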

Vowels

Vowel summary table

The following table summarises the main vowel to character assignments.

ⓘ represents the inherent vowel.

Simple
ᥤ␣ ␣ᥪ␣ ␣ᥧ
ᥥ␣ ␣ᥨ
ᥦ␣ ␣ᥩ
ⓘ␣ᥣ
Diphthongs
ᥦ␣ ␣ᥩ
Glides
ᥭ␣ᥝ

For additional details see vowel_mappings.

Inherent vowel

ᥐ ka [U+1950 TAI LE LETTER KA]

a following a consonant is not written, but is seen as an inherent part of the consonant letter, so ka is written by simply using the consonant letter.

Vowels after consonants

Post-consonant vowels are written using 10 dedicated vowel letters. All vowel signs are ordinary spacing characters (no combining marks), and are stored and displayed after the base character. A rhyme can also end with a -j or -w glide. There is a dedicated letter for the former, and the letter w doubles for the latter.

There are no pre-base vowels, circumgraphs, or multipart vowels.

Plain vowel letters

ᥐᥤ ki [U+1950 TAI LE LETTER KA + U+1964 TAI LE LETTER I]

Non-inherent vowel sounds that follow a consonant are represented using ordinary spacing letters, rather than combining marks, and they all appear after the base. The following are used for plain vowel sounds.

ᥤ␣ᥪ␣ᥧ␣ᥥ␣ᥨ␣ᥫ␣ᥦ␣ᥩ␣ᥣ
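
Because every vowel is an ordinary spacing letter stored after its consonant, a syllable can be inspected one code point at a time. The short Python sketch below prints a sequence in the same "U+XXXX NAME" notation used in these notes. It relies only on the standard `unicodedata` module, and the function name `describe` is invented for the example.

```python
import unicodedata

def describe(text):
    """List each code point in storage order as 'U+XXXX NAME'."""
    return " + ".join(
        f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text
    )

print(describe("\u1950\u1964"))  # the syllable ki: consonant first, then vowel
```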

Glides & diphthongs

ᥝ␣ᥭ

Seventeen rhymes end with a glide. Tai Le uses the dedicated character ᥭ for -j and uses the consonant ᥝ for -w.

ᥑᥣᥭᥰ

ᥑᥣᥝᥱ

ᥑᥭᥲ

ᥟᥝ

ᥦ␣ᥩ␣ᥬ

The 3 letters just above represent diphthongs. The first 2 are also used for plain vowel sounds, but ᥬ is used only for the diphthong.

Standalone vowels

Syllable-initial standalone vowels are written after the vowel carrier ᥟ [U+195F TAI LE LETTER QA]. On its own, it represents the sound of the inherent vowel.

ᥟᥛᥱ

ᥟᥣᥒᥱ

Tones

The current orthography for Tai Le uses spacing characters to represent tone marks.

ᥴ␣ᥰ␣ᥱ␣ᥲ␣ᥳ

Tone marks were introduced in 1963, and until 1988 were written using the following combining characters.

́␣̈␣̌␣̀␣̇

Whether spacing character or combining characters are used, Tai Le tone marks always appear at the very end of a syllable (ie. after any final consonant).

The table of tones shown earlier is extended here to show how tones are written in the current (spacing character) and old (combining mark) orthographies.

Description Representation Old Current
mid-rise ˨˦ 35 ¹   ́ ᥴ
high-level ˥ 55 ²   ̈ ᥰ
low-level ˩ 11 ³   ̌ ᥱ
mid-fall ˦˨ 42   ̀ ᥲ
high-fall ˥˧ 54   ̇ ᥳ
mid-level ˧ 33

The mid-level tone is unmarked.
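
Since both kinds of tone mark sit in the same position at the end of the syllable, converting between the two orthographies can be sketched as a one-to-one character substitution. The Python below is a minimal illustration under that assumption. The pairings are read off the two sample paragraphs at the top of this page (for example ᥘᥬ̈ and ᥘᥬᥰ), and the names `OLD_TO_NEW`, `to_modern` and `to_old` are invented here. Note that the combining marks are generic Unicode diacritics, so running this over mixed-script text would also alter decomposed Latin accents.

```python
# Old combining mark -> modern tone letter, as paired in the sample text.
OLD_TO_NEW = {
    "\u0308": "\u1970",  # combining diaeresis   -> ᥰ
    "\u030C": "\u1971",  # combining caron       -> ᥱ
    "\u0300": "\u1972",  # combining grave       -> ᥲ
    "\u0307": "\u1973",  # combining dot above   -> ᥳ
    "\u0301": "\u1974",  # combining acute       -> ᥴ
}

_TO_MODERN = str.maketrans(OLD_TO_NEW)
_TO_OLD = str.maketrans({new: old for old, new in OLD_TO_NEW.items()})

def to_modern(text):
    """Replace pre-1988 combining tone marks with spacing tone letters."""
    return text.translate(_TO_MODERN)

def to_old(text):
    """Replace spacing tone letters with the older combining marks."""
    return text.translate(_TO_OLD)
```

The unmarked mid-level tone needs no handling, because it is written the same way in both orthographies.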

When a diacritic is used with a tall vowel letter it is displayed to the side (see context). For example:

ᥛᥣ̈

Vowel sounds mapped to characters

This section maps Tai Nüa vowel sounds to common graphemes in the Tai Le orthography.

Plain vowels

i
 

ᥐᥤᥢ

ɯ
 

ᥓᥪ

u
 

ᥒᥧᥰ

e
 

ᥔᥥᥒᥴ

o
 

ᥞᥨᥐᥱ

ə
 

ᥛᥫᥒᥰ

ɛ
 

ᥛᥦᥝᥴ

ɔ
 

ᥔᥩᥒᥴ

 

Inherent vowel

ᥐᥝ

ᥑᥣᥛᥰ

Diphthongs and glides

ia
 

ua
 

 

ᥚᥬᥴ

-j
 

ᥕᥭᥰ

-w
 

ᥑᥣᥝᥱ

Consonants

Consonant summary table

The following table summarises the main consonant to character assignments.

Onsets
ᥙ␣ᥖ␣ᥐ␣ᥟ␣ ␣ᥚ␣ᥗ␣ᥠ
ᥓ␣ ␣ᥡ
ᥜ␣ᥔ␣ᥑ␣ᥞ
ᥛ␣ᥢ␣ᥒ
ᥝ␣ᥘ␣ᥕ
Finals

For additional details see the section Consonant to script mapping.

Consonant letters

Whereas the table just above takes you from sounds to letters, the following simply lists the basic consonant letters (however, since the orthography is highly phonetic there is little difference in ordering).

ᥙ␣ᥚ␣ᥖ␣ᥗ␣ᥐ␣ᥠ␣ᥟ␣ ␣ᥓ␣ᥡ␣ ␣ᥜ␣ᥔ␣ᥑ␣ᥞ␣ ␣ᥛ␣ᥢ␣ᥒ␣ ␣ᥝ␣ᥘ␣ᥕ

Onsets

Tai Nüa has no syllable-initial clusters. Check this.

Finals

Consonants do appear in syllable-final position, but Tai Le has no dedicated characters for this, other than the glide ᥭ, used for -j. For the rest, standard consonant characters are used. It is usually easy to tell that a character is used in final consonant position because of the position of tone marks. However, it seems possible that an open syllable with no tone mark followed by an open syllable using the inherent vowel would create some ambiguity.

When used in final position, ᥝ [U+195D TAI LE LETTER VA] is pronounced w.

Consonant to script mapping

This section maps Tai Nüa consonant sounds to common graphemes in the Tai Le orthography. Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

p
 

ᥙᥤ

 

ᥚᥬᥴ

t
 

ᥖᥦᥛᥲ

 

ᥗᥬᥴ

t͡s
 

ᥓᥪ

t͡sʰ
 

k
 

ᥐᥤᥢ

 

ʔ
 

ᥟᥛᥱ

f
 

ᥜᥒᥴ

v
 

ᥝᥢᥰ

s
 

ᥔᥥᥒᥴ

x
 

ᥑᥣᥛᥰ

h
 

ᥞᥨᥐᥱ

Nasals

m
 

ᥛᥫᥒᥰ

n
 

ᥢᥣᥒᥰ

ŋ
 

ᥒᥧᥰ

Other

 

ᥑᥣᥝᥱ

l
 

ᥘᥤᥐ

j
 

ᥕᥭᥰ

-j
 

ᥕᥭᥰ

Numbers

In China, European digits are used, in the main, although Myanmar digits (U+1040..U+1049) are also used with slight glyph variants.

These are the Myanmar digits. Unfortunately, the default font for this page doesn't show the typical differences in glyph shape, in particular, for the digits 2, 6, 8, and 9.

၀␣၁␣၂␣၃␣၄␣၅␣၆␣၇␣၈␣၉

The differences can be seen in fig_digits.

Table showing digit glyphs.

Comparison of glyphs for Myanmar digits used in Myanmar and Tai Le.
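
Because text may contain either set of digits, software that compares or parses numbers may want to fold the Myanmar digits to ASCII first. The Python sketch below shows one way to do that for the range U+1040..U+1049 mentioned above. The names `MYANMAR_TO_ASCII` and `to_ascii_digits` are invented for the example, and the approach says nothing about glyph shape, which remains a font matter.

```python
# Map U+1040..U+1049 (MYANMAR DIGIT ZERO..NINE) to ASCII 0..9.
MYANMAR_TO_ASCII = str.maketrans(
    {chr(0x1040 + value): str(value) for value in range(10)}
)

def to_ascii_digits(text):
    """Fold Myanmar digits to ASCII digits; other characters are unchanged."""
    return text.translate(MYANMAR_TO_ASCII)
```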

Text direction

Tai Le text runs left to right in horizontal lines.

Show default bidi_class properties for characters in the Tai Nüa orthography described here.
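
The bidi classes can also be checked programmatically. This small Python sketch uses the standard `unicodedata.bidirectional` function to report the class of each character in a string. The helper name `bidi_classes` is invented for the example.

```python
import unicodedata

def bidi_classes(text):
    """Map each code point in text to its Unicode bidi class."""
    return {
        f"U+{ord(ch):04X}": unicodedata.bidirectional(ch) for ch in text
    }

# A consonant, a modern tone letter, and an old combining tone mark.
print(bidi_classes("\u1950\u1970\u0301"))
```

Tai Le letters and tone letters report the strong left-to-right class `L`, while the older combining tone marks report `NSM` and so take on the direction of their base character.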

Glyph shaping & positioning

You can experiment with examples using the Tai Le character app.

Context-based shaping & positioning

Vowels all follow the initial consonant and are spacing characters with no special joining behaviour. Therefore, the Tai Le script has no need for special shaping, other than that when a tone diacritic is used with a tall vowel letter it is displayed to the side.

ᥙᥥ̀ ᥑᥥᥢ́
Examples of contextual placement of tone marks.

Typographic units

Word boundaries

Words are not separated by spaces.

Graphemes

Grapheme clusters

Graphemes in Tai Nüa consist of single letters or letters with one combining mark. This means that text can be adequately segmented into typographic units using grapheme clusters.
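
As a rough illustration, the Python sketch below groups each base letter with any combining marks that follow it. This is only an approximation of Unicode grapheme cluster segmentation, using the canonical combining class from the standard `unicodedata` module rather than the full segmentation rules. It should be adequate for the letter-plus-one-mark sequences described here, and the function name `simple_clusters` is invented for the example.

```python
import unicodedata

def simple_clusters(text):
    """Group each character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch   # attach the mark to the preceding base
        else:
            clusters.append(ch)  # start a new unit
    return clusters
```

In the old orthography a tone mark joins the preceding letter, so a three-code-point syllable such as ᥛᥣ̈ yields two units. In the current orthography every character, including the tone letter, is its own unit.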

Punctuation & inline features

Phrase & section boundaries

,␣、␣;␣:␣.␣。␣?␣!

Tai Le uses western and fullwidth Chinese punctuation, which may include the following (needs to be checked).

phrase U+FF0C U+3001 U+FF1B U+FF1A
sentence U+FF0E U+3002 U+FF1F U+FF01

Bracketed text

(␣)

Tai Nüa commonly uses ASCII parentheses to insert parenthetical information into text.

  start end
standard U+FF08 U+FF09

Quotations & citations

《␣》␣〈␣〉

Tai Nüa uses fullwidth angle brackets around quotations.

  start end
initial U+300A U+300B
nested U+3008 U+3009

Line & paragraph layout

Line breaking & hyphenation

tbd

Line-edge rules

As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.

Show (default) line-breaking properties for characters in the modern Tai Nüa orthography.

The following list gives examples of typical behaviours for some of the characters used in modern Tai Nüa. Context may affect the behaviour of some of these and other characters.


  • 〈 《 (   should not be the last character on a line.
  • 〉 》 ) 。 、 , . : ; ! ?   should not begin a new line.
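
The two rules above can be sketched as a simple check on the characters either side of a proposed break. The Python below is a hypothetical helper, not an implementation of the Unicode line-breaking algorithm. It only rejects breaks that violate the two listed rules and does not decide where breaks are otherwise allowed in Tai Le text. The character sets combine the ASCII forms with the fullwidth code points listed in the punctuation sections, and the names `NO_LINE_END`, `NO_LINE_START` and `break_allowed` are invented for the example.

```python
# Opening punctuation: should not be the last character on a line.
NO_LINE_END = set("\u3008\u300A(\uFF08")

# Closing and sentence/phrase punctuation: should not begin a new line.
NO_LINE_START = set(
    "\u3009\u300B)\u3002\u3001,.:;!?"
    "\uFF09\uFF0C\uFF0E\uFF1A\uFF1B\uFF01\uFF1F"
)

def break_allowed(before, after):
    """Return False if breaking between the two characters breaks a rule."""
    return before not in NO_LINE_END and after not in NO_LINE_START
```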

Baselines, line height, etc.

Tai Nüa uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Tai Nüa places tone marks above base characters, but they are placed to the side of tall character glyphs.

To give an approximate idea, fig_baselines compares Latin and Tai Nüa glyphs from a Noto font. The basic height of Tai Nüa letters is typically just over the Latin x-height, however tall glyphs reach a little beyond the Latin ascenders, creating a need for slightly larger line spacing.

Hhqxᥐᥩᥙᥘᥭ̀ᥑᥨᥛ̌ᥑᥦᥢ́ᥖᥫᥒ̈ᥐᥣ
Font metrics for Latin text compared with Tai Nüa glyphs in the Noto Sans Tai Le font.

Page & book layout

Online resources

References