Tai Dam orthography notes

Sample

ꪭꪴꪒ 1 ꪹꪕꪸꪉ ꪀꪱ ꪋꪴ ꫛ ꪎꪲꪉ ꪮꪮꪀ ꪣꪱ ꪻꪠ ꪁꪷ ꪻꪬ ꪼꪒ ꪕꪳ ꪕꪱꪉ ꪀꪾꪚ ꪹꪋꪷꪉ ꪝꪸꪉ ꪕꪮꪥ ꪩꪾ ꫛ ꪶꪔꪙ ꪠꪴ - ꪋꪴ ꪬꪺ ꫛ ꪻꪠ ꪁꪷ ꪻꪬ ꪣꪲ ꪁꪫꪸꪙ ꪎꪱꪉ ꪶꪎꪣ ꪩꪺꪉ ꪹꪥꪸꪒ ꫛ ꪀꪾꪚ ꪹꪥꪸꪒ ꪻꪊ ꪚꪴꪙ ꪀꪾꪚ ꪼꪒ ꪹꪚꪷꪉ ꪒꪲ ꪀꪾꪚ ꪫꪸꪀ ꪭꪰꪀ ꪵꪝꪉ ꪹꪏꪉ ꪹꪭꪙ ꪒꪸꪫ.

ꪭꪴꪒ 2 ꪋꪴ ꫛ ꪻꪠ ꪁꪷ ꪝꪮꪣ ꪼꪒ ꪹꪬꪉ ꪝꪳꪉ ꪁꪫꪸꪙ ꪹꪜꪸꪙ ꪹꪊꪱ ꪀꪾꪚ ꪕꪳ ꪕꪱꪉ ꪹꪏꪉ ꪹꪫꪱ ꪀꪺꪉ ꪻꪚ ꪜꪱꪫ ꪁꪫꪱꪣ ꪙꪲ, ꪹꪚꪱ ꪜꪽ ꪵꪊꪀ ꪹꪋ ꪡꪽ - ꪹꪙ ꪘꪰꪉ - ꪻꪊ ꪈꪾ - ꪁꪫꪱꪣ ꪜꪱꪀ - ꪭꪲꪒ ꪅꪮꪉ - ꪩꪺꪉ ꪜꪴꪙ ꪵꪔꪉ ꪀꪨꪰꪒ ꪄꪮꪉ ꪹꪊꪱ ꪭꪳ ꪫꪱ ꪩꪺꪉ ꪀꪨꪰꪒ ꪮꪳꪙ, ꪶꪀꪀ ꪹꪅꪱ ꪹꪬꪱ ꪭꪱꪀ ꪚꪱꪙ ꪹꪣꪉ ꪬꪱꪀ ꪣꪲ ꪭꪳ ꪫꪱ ꪚꪱꪙ ꪹꪣꪉ ꪙꪾ ꪣꪲꪉ ꪎꪲꪉ ꪄꪮꪉ. ꪄꪮꪉ ꪹꪊꪱ ꪬꪱꪀ ꪼꪒ ꪬꪱꪀ ꪣꪲ ꪭꪳ ꪫꪱ ꪹꪜꪸꪙ ꪵꪔ ꪚꪱꪙ ꪹꪣꪉ ꪕꪰꪒ ꪮꪮꪀ.

Source: Unicode UDHR, articles 1 & 2

Usage & history

The Tai Viet script is used for writing the Tai Dam (Black Tai or Tai Noir), Tai Dón (White Tai or Tai Blanc), Tai Daeng, Thai Song (Lao Song or Lao Song Dam) and Tày Tac languages spoken in Vietnam, Laos, China and Thailand. There is also a diaspora in the United States, Australia and France.

The total population using the three languages, across all countries, is estimated to be 1.3 million (Tai Dam 764,000, Tai Dón 490,000, Thai Song 32,000). The script is still used by the Tai people in Vietnam, and there is a desire to introduce it into formal education there.

Little is known about the origin of the Tai Viet script. It appears to have been derived from the Thai script around the 16th century.

Significant variation occurs in the orthographic conventions of the Tai languages, as well as in their phonologies. A unified, standardized version of the script, with an agreed-upon core set of characters, was developed at a UNESCO-sponsored workshop in 2006, and subsequently accepted for encoding in The Unicode Standard.

Unicode 17 has 1 dedicated Tai Viet block, comprising 72 characters.
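
The character count can be checked against a local Unicode database. The following is a minimal sketch, assuming the block occupies the range U+AA80 to U+AADF (the range is not stated on this page) and that a code point is assigned if it has a character name. The helper name `assigned_in_block` is hypothetical.

```python
import unicodedata


def assigned_in_block(first: int, last: int) -> list[str]:
    """Return the characters in [first, last] that have a Unicode name."""
    chars = []
    for cp in range(first, last + 1):
        ch = chr(cp)
        # unicodedata.name() returns the default ("") for unassigned code points
        if unicodedata.name(ch, ""):
            chars.append(ch)
    return chars


# Assumed block range for Tai Viet
tai_viet = assigned_in_block(0xAA80, 0xAADF)
print(len(tai_viet))
for ch in tai_viet[:3]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The printed total depends on the Unicode version bundled with the Python interpreter, so it is worth comparing with the figure quoted above.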

More information: Scriptsource • The Unicode Standard

Basic features

The Tai Viet script is an alphabet, i.e. all vowels are written explicitly alongside consonants. Unlike an abugida, a consonant has no inherent vowel; unlike an abjad, vowels are not systematically dropped; and unlike a syllabary, consonant and vowel are not combined in the same character.

The Tai Viet script is heavily syllable-based, with exceptions being a very small number of unstressed initial syllables, and loan words.

Vowels. Vowel sounds following a consonant are written using a mixture of 13 ordinary spacing characters (of which 5 are also consonants) and 7 combining marks.

There are 5 pre-base vowel glyphs (all letters), but no circumgraphs.

This page lists 6 composite vowel signs, made from 6 vowel signs and 3 consonants. Composite vowel signs can involve up to 3 glyphs, though usually only 2, and glyphs can surround the base consonant(s) on 2 sides.

Tai Viet uses visual placement: only the vowel components that appear above or below the consonant are combining marks; the others are ordinary spacing characters that are typed in the order seen.

Standalone vowels are written using a vowel sign attached to ꪮ or ꪯ. There are no independent vowels.

Tone can be indicated either by diacritics or by ordinary spacing characters. Both are recent innovations. Combining tone marks always follow the root consonant and any combining vowels, i.e. they come before any post-base vowel. Spacing tone marks always come at the very end of the syllable.
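
The two placement rules can be illustrated by converting one notation into the other. This is a minimal sketch, assuming the input is a single syllable and that U+AABF pairs with U+AAC0 and U+AAC1 with U+AAC2 (the pairing shown in the tone table further down). The function name `to_spacing_tone` is hypothetical.

```python
# Combining tone mark -> spacing tone letter (assumed pairing)
COMBINING_TO_SPACING = {
    "\uAABF": "\uAAC0",  # TONE MAI EK  -> TONE MAI NUENG
    "\uAAC1": "\uAAC2",  # TONE MAI THO -> TONE MAI SONG
}


def to_spacing_tone(syllable: str) -> str:
    """Replace a combining tone mark with a spacing tone letter at the syllable end."""
    for mark, letter in COMBINING_TO_SPACING.items():
        if mark in syllable:
            # Drop the mark from its position after the onset,
            # then append the spacing letter at the very end.
            return syllable.replace(mark, "", 1) + letter
    return syllable  # no combining tone mark present


# LOW NYO + MAI THO + VOWEL AA  ->  LOW NYO + VOWEL AA + MAI SONG
print(to_spacing_tone("\uAA90\uAAC1\uAAB1") == "\uAA90\uAAB1\uAAC2")
```

The reverse conversion is harder, because the combining mark has to be placed after the correct consonant of the onset, so it is not attempted here.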

Consonants. Tai Dam has 42 basic consonant letters, neatly divided into 2 classes. Each consonant belongs to either the high or the low class, and the class helps to indicate tone.

Vowel absence. There are no conjuncts or subjoined consonants.

The only syllable-initial cluster involves labialisation, using ꪫ w.

Coda sounds use a subset of 8 ordinary consonant letters, but since there is no inherent vowel, it is still simple to detect syllable boundaries. Syllable-final consonant sounds are also built into 6 vowel-consonant graphemes.

Numbers. There are no native Tai Viet digits. ASCII digits are used.

Layout. Tai Dam text runs left to right in horizontal lines. Words are separated by spaces, although this is a recent innovation. Letters have no case distinction.

The visual forms of letters don't usually interact.

Punctuation is mostly ASCII, with some native.

Notable features

tone notation is based on a complete set of duplicate consonants for high and low tone
tone can be indicated using either diacritics or letters
most native words are monosyllabic
vowels are indicated using both combining marks and letters
Tai Viet uses visual placement: vowel glyphs to the left of the base are standalone letters
there are no independent vowels

Character index

Letters

Consonants

ꪝ,ꪛ,ꪕ,ꪓ,ꪗ,ꪁ,ꪆ,ꪮ,ꪜ,ꪚ,ꪔ,ꪒ,ꪖ,ꪀ,ꪇ,ꪯ,ꪋ,ꪊ,ꪡ,ꪏ,ꪑ,ꪅ,ꪭ,ꪠ,ꪪ,ꪎ,ꪐ,ꪄ,ꪬ,ꪣ,ꪙ,ꪉ,ꪢ,ꪘ,ꪈ,ꪫ,ꪧ,ꪩ,ꪥ,ꪦ,ꪨ,ꪤ

Vowel letters

ꪹ,ꪶ,ꪵ,ꪻ,ꪼ,ꪱ,ꪺ,ꪽ

Tones

ꫀ,ꫂ

Logograms

ꫛ,ꫜ

Repetition marker

ꫝ

Not used

ꪟ,ꪃ,ꪞ,ꪂ,ꪍ,ꪌ

Combining marks

Vowel marks

ꪴ,ꪰ,ꪲ,ꪳ,ꪷ,ꪸ,ꪾ

Tones

꪿,꫁

Punctuation

꫞,꫟

ASCII

( ) , .

Other

To be investigated

!,;,?

Phonology

These are sounds for the Tai Dam language.

Phones in a lighter colour are non-native or allophones.

Vowel sounds

Plain vowels

Diphthongs

Consonant sounds

	labial	alveolar	post- alveolar	palatal	velar	glottal
stop	p b	t d			k ɡ	ʔ
aspirated		tʰ
affricate			t͡ɕ
fricative	f v	s			x	h
nasal	m	n		ɲ	ŋ
approximant	w	l		j
trill/flap		r

r and ɡ are used in Vietnamese names.

Syllable-final

	labial	alveolar	palatal	velar	glottal
stop	p	t		k	ʔ
nasal	m	n		ŋ
approximant	w		j

Tone

The Tai Dam language has 6 tones: 3 level pitch tones, and 3 contour tones.

The 2nd and 5th tones are the only tones used in checked syllables, but they can also appear in unchecked syllables.

IPA	Name	Digit	Unchecked	Checked
˨	low-mid level	1	✓
˦˥	high rising	2	✓	✓
˨˩	low falling	3	✓
˥	high level	4	✓
˦	mid-high level	5	✓	✓
˧˩	mid falling	6	✓

See how tones are written.

Structure

The Tai languages are almost exclusively monosyllabic. A very small number of words have an unstressed initial syllable, and loan words may be polysyllabic. [b]

Syllables follow the pattern C (w) V (C)

Syllable onsets are normally simple, but may have a medial -w-.

Syllable codas include -p -t -k -ʔ, which create 'checked' syllables, and otherwise -m, -n, -ŋ, as well as the glides -j, -w.

Vowels

Plain	Complex	Standalone carriers
ꪲ,ꪳ,ꪴ	ꪸ,ꪹ◌,◌ꪺ	ꪮ,ꪯ
ꪹ◌ꪸ,ꪶ◌
ꪹ◌ꪷ	ꪻ◌
ꪵ◌,ꪷ,ꪮ,ꪯ	ꪵ◌ꪫ,ꪵ◌ꪫꪥ
ꪰ,ꪱ	ꪼ◌,ꪹ◌ꪱ,ꪾ,ꪽ,ꪚꪾ

Post-consonant vowels

None of the combining marks are spacing marks (meaning that none of them consume horizontal space when added to a base consonant).

Basic vowels

The basic Tai Viet vowel sounds are written as follows. The list includes both letters and combining marks, and sometimes combinations of both. See Vowel components.

Vowels are typically long in open syllables, and short in syllables with codas.

The dotted circle indicates the position of the consonant(s) relative to the vowel sign, rather than indicating a combining mark.

ꪲ,ꪳ,ꪴ,ꪹ◌ꪸ,ꪶ◌,ꪹ◌ꪷ,ꪵ◌,ꪷ,ꪮ,ꪯ,ꪰ,ꪱ,

Vowel letters are visually encoded, i.e. if a vowel letter is displayed to the left of the base consonant, it is also typed and stored before the consonant. Any combining marks are typed and stored after the base consonant.

eg.

ꪹꪔꪸꪣ

ꪹ,ꪔ,ꪸ,ꪣ

ꪵꪒꪙꪒꪲꪙ

ꪵ,ꪒ,ꪙ,ꪒ,ꪲ,ꪙ

ꪀꪷꪵꪀ

ꪀ,ꪷ,ꪵ,ꪀ
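
The stored order can be inspected by listing each code point with its general category: spacing letters report `Lo`, combining marks report `Mn`. This is a minimal sketch, assuming the first example above corresponds to the sequence U+AAB9 U+AA94 U+AAB8 U+AAA3. The helper name `describe` is hypothetical.

```python
import unicodedata


def describe(word: str) -> list[tuple[str, str, str]]:
    """Return (code point, general category, name) for each character, in stored order."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in word
    ]


# Assumed sequence: pre-base vowel letter, consonant, combining vowel, coda consonant
word = "\uAAB9\uAA94\uAAB8\uAAA3"
for code_point, category, name in describe(word):
    print(code_point, category, name)
```

The pre-base vowel comes first in the list even though it is pronounced after the consonant, and only the mark drawn above the consonant has the category `Mn`.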

ꪮ and ꪯ in principle represent the glottal stop, but they can also represent vowels when used after a consonant. The following word in fact shows the same character being used as both consonant and vowel. [b]

eg.

ꪮꪮꪀ

ꪮ,ꪮ,ꪀ

ꪵ should not be typed as two successive ꪹ characters.
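
A simple text check can flag this mistake. This is a minimal sketch, assuming that any two successive U+AAB9 characters are a mistyped U+AAB5 (that assumption is not confirmed here and should be tested against real text). The function names are hypothetical.

```python
import re

# Two successive U+AAB9 TAI VIET VOWEL UEA characters
DOUBLE_UEA = re.compile("\uAAB9\uAAB9")


def find_double_uea(text: str) -> list[int]:
    """Return the offsets at which a doubled U+AAB9 starts."""
    return [m.start() for m in DOUBLE_UEA.finditer(text)]


def fix_double_uea(text: str) -> str:
    """Replace each doubled U+AAB9 with a single U+AAB5 TAI VIET VOWEL E."""
    return DOUBLE_UEA.sub("\uAAB5", text)


sample = "\uAAB9\uAAB9\uAA92"  # doubled vowel letter followed by LOW DO
print(find_double_uea(sample))
print(fix_double_uea(sample) == "\uAAB5\uAA92")
```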

Diphthongs & rhymes

Tai Viet writes a number of diphthongs and rhymes as follows.

ꪸ,ꪹ◌,◌ꪺ,ꪻ◌,ꪵ◌ꪫ,ꪵ◌ꪫꪥ,ꪼ◌,ꪹ◌ꪱ,ꪾ,ꪽ,ꪚꪾ

eg.

ꪻꪚꪼꪣ꫁

ꪜꪺ

ꪵꪁꪫꪥ

ꪙꪾ꫁ꪹꪚꪸ꫁

-ap. The last item in the list above is rather unusual. Some dialects use the combination ꪚꪾ for the rhyme -ap, where the vowel is placed over the final, low-series b, rather than over the initial consonant.

eg.

ꪀꪚꪾ

ꪀ,ꪚ,ꪾ

See Context-based shaping & positioning, however, for a font variant setting that allows you to store the code points in the normal order, but still displays the AM over the BO.

Composite vowel signs

Vowels represented by combinations of the above characters include the following, which mostly add glyphs to different sides of the base:

ꪹ◌ꪱ,ꪹ◌ꪸ,ꪹ◌ꪷ,ꪵ◌ꪫ,ꪵ◌ꪫꪥ,◌ꪚꪾ

Pre-base and post-base vowel glyphs are split around the syllable onset, which may be more than a single character. See Pre-base vowel signs for an example.

ꪫ in the combination ꪵ◌ꪫ can be ambiguous unless there is a tone mark. The sequence ꪵ–ꪫꪥ U+AAB5 VOWEL E + U+AAAB LETTER HIGH VO + U+AAA5 LETTER HIGH YO is sometimes used to remove that ambiguity. For details, see Onsets.

Characters that don't appear in the combinations:

ꪲ,ꪳ,ꪴ,ꪶ,ꪮ,ꪯ,ꪰ, ,ꪻ,ꪼ,ꪽ

Combinations that contain a given character:

ꪹ	ꪹ-ꪸ,ꪹ-ꪷ, ,ꪹ-ꪱ
ꪵ	ꪵ-ꪫ
ꪱ	ꪹ-ꪱ
ꪸ	ꪹ-ꪸ
ꪷ	ꪹ-ꪷ
ꪫ	ꪵ-ꪫ
ꪚ	-ꪚꪾ
ꪾ	-ꪚꪾ

Details about glyph positioning

The following list shows where vowel signs are positioned around a base consonant to produce vowels, and how many instances of each pattern there are. The figure after the + sign represents combinations of Unicode characters.

5 pre-base, eg. ꪶꪁ ok
3 post-base, eg. ꪁꪱ kā
6 superscript, eg. ꪁꪲ ki
1 subscript, eg. ꪁꪴ ku
2 pre+post-base, eg. ꪹꪁꪱ ɨᵊkā (kaʷ)
2 pre+superscript, eg. ꪹꪁꪸ (ke)
1 post+superscript, eg. ꪁꪜꪾ kp̄aᵐ (kap)

Vowel components

The following breaks down the list of characters used for Tai Dam vowels by type.

Tai Dam uses the following combining marks for vowels.

ꪲ,ꪳ,ꪴ,ꪷ,ꪰ,ꪸ,ꪾ

The following are dedicated vowel letters. Five of these are typed and stored before the onset consonant (see Pre-base vowel signs), and only 3 appear after.

The dotted circle indicates the location of the base consonant; these are not combining marks. They are typed and stored in visual order (see Pre-base vowel signs).

ꪶ◌,ꪵ◌,◌ꪱ,ꪹ◌,◌ꪺ,ꪻ◌,ꪼ◌,◌ꪽ

The following characters that are normally regarded as consonants are also used to create vowel sounds, either alone or as part of a composite vowel sign.

ꪮ,ꪯ,ꪫ,ꪥ,ꪚ

Pre-base vowel signs

Five CV combinations are written using vowel signs that appear to the left of the onset consonant.

ꪹ,ꪶ,ꪵ,ꪻ,ꪼ

Like Lao, Tai Viet uses a visual encoding model, so these characters are not combining characters, but are typed and stored before the base.

eg.

ꪵꪣꪫ

ꪵ,ꪣ,ꪫ

In fact, these vowel signs are placed before the start of the syllable onset. This means that in a syllable that begins with more than one consonant letter (i.e. with a labialised consonant) the pre-base vowel is placed to the left of the syllable-initial consonant, rather than to the left of the consonant after which it is actually pronounced.

The following example illustrates the relationships between the characters.

A vowel sign that appears 2 characters out of sequence from where it is pronounced, because the syllable onset is 2 characters long.

ꪵꪁꪫꪥ
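
Because pre-base vowels are stored before the onset, a plain code point sort groups words by vowel letter rather than by initial consonant. The following is a minimal sketch of one way to build a sort key, assuming that swapping each pre-base vowel with the consonant letter that follows it is acceptable for that purpose. The name `consonant_first` is hypothetical, the result is a key and not display text, and the sketch does not try to decide whether a following ꪫ belongs to the onset.

```python
# The five pre-base vowel letters: E, O, UEA, AUE, AY
PREBASE = {"\uAAB5", "\uAAB6", "\uAAB9", "\uAABB", "\uAABC"}


def is_consonant(ch: str) -> bool:
    """True for the consonant letters U+AA80..U+AAAF."""
    return "\uAA80" <= ch <= "\uAAAF"


def consonant_first(text: str) -> str:
    """Swap each pre-base vowel with the consonant that immediately follows it."""
    chars = list(text)
    i = 0
    while i < len(chars) - 1:
        if chars[i] in PREBASE and is_consonant(chars[i + 1]):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return "".join(chars)


words = ["\uAAB5\uAAA3\uAAAB", "\uAA80\uAAB1"]
print(sorted(words, key=consonant_first) == ["\uAA80\uAAB1", "\uAAB5\uAAA3\uAAAB"])
```

A production collation would normally rely on a collation library rather than on a hand-written key like this one.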

Standalone vowels

Vowels in Tai Dam are always preceded by a consonant, but that consonant may be a glottal stop, written using ꪮ or ꪯ followed by the relevant vowel sign(s). There are no independent vowels.

eg.

ꪮꪲꪒ

ꪮ꪿ꪱꪉ

ꪵꪮꪚ

Tones

Consonant	Checked?	Tone mark	Tone	Digit	Example
low	unchecked	—	˨	1	ꪀꪱ
low	unchecked	ꫀ or ꪿	˦˥	2	ꪎꪲꫀ / ꪎꪲ꪿
low	unchecked	ꫂ or ꫁	˨˩	3	ꪐꪱꫂ / ꪐ꫁ꪱ
low	checked	—	˦˥	2	ꪎꪲꪚ
high	unchecked	—	˥	4	ꪉꪴ
high	unchecked	ꫀ or ꪿	˦	5	ꪝꪮꪣꫀ / ꪝ꪿ꪮꪣ
high	unchecked	ꫂ or ꫁	˧˩	6	ꪡꪱꫂ / ꪡ꫁ꪱ
high	checked	—	˦	5	ꪩꪴꪀ
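
The table can be read as a small decision procedure. The following is a minimal sketch, assuming the input is one syllable, that low-class letters sit on even code points and high-class letters on odd ones within U+AA80..U+AAAF, and that a syllable is checked when its last consonant letter is LOW BO, LOW DO or LOW KO with nothing but combining marks after it. A final glottal stop is not detected. The function name `tone_digit` is hypothetical.

```python
import unicodedata

CHECKED_FINALS = {"\uAA9A", "\uAA92", "\uAA80"}  # LOW BO, LOW DO, LOW KO
FIRST_TONE_MARKS = {"\uAABF", "\uAAC0"}   # MAI EK (combining), MAI NUENG (spacing)
SECOND_TONE_MARKS = {"\uAAC1", "\uAAC2"}  # MAI THO (combining), MAI SONG (spacing)


def is_consonant(ch: str) -> bool:
    return "\uAA80" <= ch <= "\uAAAF"


def tone_digit(syllable: str) -> int:
    """Return the tone digit (1-6) implied by consonant class, tone mark and coda."""
    positions = [i for i, ch in enumerate(syllable) if is_consonant(ch)]
    if not positions:
        raise ValueError("no Tai Viet consonant letter found")
    # The first consonant letter is the onset, even if a pre-base vowel precedes it.
    low = ord(syllable[positions[0]]) % 2 == 0
    if any(ch in FIRST_TONE_MARKS for ch in syllable):
        return 2 if low else 5
    if any(ch in SECOND_TONE_MARKS for ch in syllable):
        return 3 if low else 6
    last = positions[-1]
    tail_is_marks = all(unicodedata.category(c) == "Mn" for c in syllable[last + 1:])
    checked = len(positions) > 1 and syllable[last] in CHECKED_FINALS and tail_is_marks
    if checked:
        return 2 if low else 5
    return 1 if low else 4


print(tone_digit("\uAA80\uAAB1"))        # LOW KO + VOWEL AA
print(tone_digit("\uAAA9\uAAB4\uAA80"))  # HIGH LO + VOWEL U + LOW KO
```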

Vowel sounds to characters

This section maps Tai Dam vowel sounds to common graphemes in the Tai Viet orthography.

Plain vowels

dependent vowel ꪲ

dependent vowel ꪳ

dependent vowel ꪴ

circumgraph vowel ꪹ◌ꪸ

prescript vowel ꪶ◌

circumgraph vowel ꪹ◌ꪷ

prescript vowel ꪵ◌

prescript vowel ꪷ only in open syllables.

prescript vowel ꪮ in closed syllables.

prescript vowel ꪯ (seems to be rare)

dependent vowel ꪰ

aː

dependent vowel ꪱ

Diphthongs and rhymes

iə

diphthong ꪸ

ɨə

prescript diphthong ꪹ◌

uə

diphthong ◌ꪺ

ʷɛ

circumgraph diphthong ꪵ◌ꪫ

circumgraph diphthong ꪵ◌ꪫꪥ in some dialects, to avoid ambiguity.

əw

prescript diphthong ꪻ◌

prescript vowel ꪼ◌

aːw

circumgraph diphthong ꪹ◌ꪱ

rhyme ꪾ

rhyme ꪽ

rhyme ꪚꪾ

Vowel absence

Vowel absence principally occurs either when a consonant is a syllable coda, or when a consonant is part of a consonant cluster.

Consonant clusters involve no special characters or viramas. There are no conjunct forms or subjoined consonants.

Syllable codas generally use a selection of ordinary letters, but some characters represent rhymes that include the syllable coda. See Codas.

Syllable-initial consonant clusters are limited to labialisation, which is written using separate letters (see Onsets).

Consonants

	High class	Low class
Onsets	ꪝ,ꪛ,ꪕ,ꪗ,ꪋ,ꪓ,ꪁ,ꪇ,ꪯ	ꪜ,ꪚ,ꪔ,ꪖ,ꪊ,ꪒ,ꪀ,ꪆ,ꪮ
	ꪡ,ꪫ,ꪏ,ꪅ,ꪭ	ꪠ,ꪪ,ꪎ,ꪄ,ꪬ
	ꪣ,ꪙ,ꪑ,ꪉ	ꪢ,ꪘ,ꪐ,ꪈ
	ꪧ,ꪩ,ꪥ	ꪦ,ꪨ,ꪤ
Medial	ꪫ
Codas	ꪚ,ꪒ,ꪀ
	ꪣ,ꪙ,ꪉ
	ꪫ,ꪥ
Rhymes	ꪾ,ꪽ,ꪚꪾ
Logographs	ꫛ,ꫜ

Basic consonants

Basic consonant sounds in Tai Dam are written using the following letters.

Click on each letter for more details and for examples of usage, especially where more than one sound is indicated.

low

ꪜ,ꪚ,ꪔ,ꪒ,ꪀ,ꪆ,ꪮ,ꪖ,ꪊ,ꪠ,ꪪ,ꪎ,ꪄ,ꪬ,ꪢ,ꪘ,ꪐ,ꪈ,ꪦ,ꪨ,ꪤ

high

ꪝ,ꪛ,ꪕ,ꪓ,ꪁ,ꪇ,ꪯ,ꪗ,ꪋ,ꪡ,ꪫ,ꪏ,ꪅ,ꪭ,ꪣ,ꪙ,ꪑ,ꪉ,ꪧ,ꪩ,ꪥ

Other dialects

Three pairs of consonants are used for the Tai Dón language, but not for Tai Dam. [btd] They are:

ꪟ,ꪞ,ꪍ,ꪌ,ꪃ,ꪂ
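
The class of each consonant letter can be read from its Unicode character name, which contains either LOW or HIGH. This is a minimal sketch, assuming the consonant letters occupy U+AA80..U+AAAF and that the three pairs listed above are the KHO, CHO and PHO letters (an assumption based on their names, not stated on this page). The names `consonant_class` and `TAI_DON_ONLY` are hypothetical.

```python
import unicodedata


def consonant_class(ch: str) -> str:
    """Return 'low' or 'high' for a Tai Viet consonant letter, based on its name."""
    name = unicodedata.name(ch)
    if not name.startswith("TAI VIET LETTER "):
        raise ValueError(f"not a Tai Viet consonant letter: U+{ord(ch):04X}")
    # Names look like 'TAI VIET LETTER LOW KO', so the class is the fourth word.
    return name.split()[3].lower()


letters = [chr(cp) for cp in range(0xAA80, 0xAAB0)]

# Assumed code points of the pairs not used for Tai Dam: KHO, CHO, PHO
TAI_DON_ONLY = {chr(cp) for cp in (0xAA82, 0xAA83, 0xAA8C, 0xAA8D, 0xAA9E, 0xAA9F)}

by_class: dict[str, list[str]] = {"low": [], "high": []}
for ch in letters:
    if ch not in TAI_DON_ONLY:
        by_class[consonant_class(ch)].append(ch)

print(len(by_class["low"]), len(by_class["high"]))
```

Under these assumptions the two groups together should account for the 42 letters used for Tai Dam.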

Onsets

Tai Dam labialises an initial velar consonant in some syllables, but doesn't otherwise have consonant clusters in syllable onsets. The labialisation is indicated using the ordinary letter ꪫ; there are no dedicated medial consonants.

eg.

ꪁꪫꪱꪣ

ꪄꪷꪄꪫ꪿ꪱ

The pronunciation of a consonant followed by VO can be ambiguous if there is no diacritic present and the vowel sign appears before the base. For example, the following does not involve labialisation:

eg.

ꪵꪄꪫ

ꪵꪄ,ꪫ

The following pairs illustrate how a vowel sign or a tone diacritic can resolve the ambiguity.

cf.

ꪀꪲꪫ ḵiw

ꪀꪫꪲ ḵwi

cf.

ꪵꪀ꫁ꪫ ɛḵ²w kɛw

ꪵꪀꪫ꫁ ɛḵw² kʷɛ

In order to address the ambiguity when no diacritic is present, the character ꪥ may be appended to the end of the sequence, indicating labialisation.

eg.

ꪵꪁꪫꪥ

Since j never occurs after ɛ, this can be done without creating a new ambiguity. This spelling is only used in some dialects of the traditional script; however, it has been adopted as a standard in a project sponsored by the Son La Department of Science and Technology. [b]

The sound kʰʷ exists in Tai Dón, but not in Tai Dam. The sound kʷ exists in both languages. [btd]

Codas

Syllable-final plosives are written using the following low class consonants. These create 'checked' syllables.

ꪚ,ꪒ,ꪀ

For unchecked syllables ending with nasals or glides, the following high class consonants are used.

ꪣ,ꪙ,ꪉ,ꪥ,ꪫ

In addition, Tai Dam has ways of writing several rhymes, where a vowel sign represents a vowel sound followed by a final consonant. See Diphthongs & rhymes. These include:

ꪾ,ꪽ,-ꪚꪾ,ꪹ-ꪱ,ꪼ,ꪻ

Consonant sounds to characters

This section maps Tai Dam consonant sounds to common graphemes in the Tai Viet orthography.

The labels on the left show whether this consonant is high class, low class, or a coda.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc. Light coloured characters occur infrequently.

high ꪝ

low ꪜ

coda ꪚ

coda ꪚꪾ

See also the rhymes for -ap.

high ꪛ

low ꪚ

high ꪕ

low ꪔ

coda ꪒ

tʰ

high ꪗ

low ꪖ

t͡ɕ

high ꪋ

low ꪊ

high ꪓ

low ꪒ

high ꪁ

low ꪀ

kon⁴

logograph ꫛ

high ꪇ

low ꪆ

high ꪯ

low ꪮ

coda ꪀ

high ꪡ

low ꪠ

high ꪫ

low ꪪ

high ꪏ

low ꪎ

high ꪅ

low ꪄ

high ꪭ

low ꪬ

high ꪣ

low ꪢ

coda ꪣ

coda ꪾ

high ꪙ

low ꪘ

coda ꪙ

coda ꪽ

nɨŋ⁵

logograph ꫜ

high ꪑ

low ꪐ

high ꪉ

low ꪈ

coda ꪉ

labialisation ◌ꪫ indicates labialisation of a velar onset.

labialisation ◌ꪫꪥ occasionally, to indicate labialisation when there is no diacritic to clarify ambiguous spelling.

coda ꪫ See also the diphthongs ending in w.

high ꪧ

low ꪦ

high ꪩ

low ꪨ

high ꪥ

low ꪤ

coda ꪥ See also the diphthongs ending in j.

Symbols

The Tai Viet Unicode block contains no characters with the general category Symbol; however, it contains 2 letters that act as logograms.

ꫛ,ꫜ

ꫛ means person, and is used to distinguish between homophonous words [b §9], such as:

cf.

ꫛ

ꪶꪁꪙ

ꫜ is a ligature for the word one. [b §9]

cf.

ꫜ

ꪙꪳꪉꫀ

Numbers

There are no native Tai Viet digits. ASCII digits are used.

ꪹꪊꪉ ꪊꪉꪲ ꪠꪱꪒ ꪵꪖꪉ ꪁꪫꪱꪣ ꪼꪕ ꪖꪳ 6 ꪣꪳ 25 ꪁ ꪾ ꪹꪚꪙ 9 ꪜꪲ 2020 Bai thứ 6 mự 25/9/2020 — Observation: Examples of dates in Tai Viet. (source)

Text direction

Tai Viet text runs left to right in horizontal lines.

Glyph shaping & positioning

Experiment with examples using the Tai Viet character app.

Font styles

Glyph variants. The Tai Heritage Pro font has font features that allow the following alternative glyph shapes for certain characters.

feature	code point	alternative shapes
`lcoa`	ꪊ
`htoa`	ꪕ
`hpho`	ꪟ
`auea`	ꪻ
`hoia`	꫞

Context-based shaping & positioning

Contextual positioning. Combining marks need to be positioned relative to the shape of the base that they are combined with. The figure below shows an example: the combining marks are higher on the right than on the left, because of the size of the glyphs below them.

Location of combining marks. The Tai Heritage Pro font offers a variant feature that allows placement of combining vowel signs and tones over the onset consonant, or over the final consonant in a closed syllable (see the figure below). The underlying sequence of code points is identical.

ꪕꪳ꪿ꪉ — Font feature `vowp` as default (left), and set to 2 (right).

Whereas the code point sequence remains the same for the example just shown, the same font feature can also be used to support a different code point sequence for ꪾ U+AABE. By default, the code point order would be:

ꪊꪚꪾ

With the vowp feature set to 1, combining marks appear over the onset, except for this specific combination. This means that you can use the code point sequence:

ꪊꪾꪚ

Typographic units

Word boundaries

Unlike many other Tai scripts, Tai Viet uses spaces between words. [b] However, this is a fairly recent innovation. Polysyllabic words may be written without space between the syllables. [u]

Brase provides some algorithmic detail for handling older texts without spacing. [btd]

Graphemes

Tai Viet has syllables that include free-standing vowel signs before and/or after the base.

eg.

ꪹꪉꪱ

Tai Viet users do not expect these to be connected to the onset consonant. When a cursor moves across text, they expect it to stop before and after each of these characters, and not skip the complete syllable. All spacing characters behave this way.
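
The expected cursor behaviour amounts to treating each spacing character, together with any combining marks that follow it, as one unit. This is a minimal sketch, assuming that general category `Mn` is enough to identify the combining marks. It is a simplification and not a full grapheme cluster segmentation, and the name `cursor_units` is hypothetical.

```python
import unicodedata


def cursor_units(text: str) -> list[str]:
    """Split text into units: one spacing character plus its trailing combining marks."""
    units: list[str] = []
    for ch in text:
        if units and unicodedata.category(ch) == "Mn":
            units[-1] += ch  # attach the combining mark to the preceding base
        else:
            units.append(ch)  # every spacing character starts a new unit
    return units


# Pre-base vowel + HIGH NGO + post-base vowel: three separate cursor stops
print(len(cursor_units("\uAAB9\uAA89\uAAB1")))
# Pre-base vowel + LOW TO + combining vowel + HIGH MO: the mark stays with LOW TO
print(len(cursor_units("\uAAB9\uAA94\uAAB8\uAAA3")))
```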

Punctuation & inline features

Phrase & section boundaries

phrase	,
sentence	.
poems	꫞ ꫟

Observation: The UDHR text contains regular ASCII punctuation, including commas, periods, and colons, as well as dashes to separate text. Some examples can be seen in the sample text at the start of this page.

1. ꪋꪴ ꫛ ꪝꪮꪣ ꪼꪒ ꪣꪲ ꪁꪫꪸꪙ ꪵꪮꪚ ꪭꪸꪙ - ꪼꪒ ꪹꪤꪸꪒ ꪕꪮꪥ ꪹꪊꪸ ꪶꪒ ꪤꪱꪫ ꪤꪴꪀ ꪹꪚꪱ ꪎꪸ ꪁꪱ ꪙꪮꪥ ꪹꪭꪸꪉ ꪁꪷ ꪼꪒ ꪵꪮꪚ ꪬꪮꪉ ꪋꪽ ꪔꪾꪣ ꪀꪾꪚ ꪤꪱꪫ ꪤꪴꪀ ꪘꪰꪉ ꪶꪠꪉ ꪶꪩ - ꪤꪱꪫ ꪤꪴꪀ ꪋꪽ ꪔꪾꪣ ꪭꪳ ꪵꪣꪙ ꪄꪮꪉ ꪄꪰꪒ ꪹꪭꪸꪉ, ꪤꪱꪫ ꪥꪴꪀ ꪀꪲ ꪗꪺꪒ ꪀꪾꪚ ꪝꪳꪉ ꪹꪉꪸ ꪭꪳ ꪹꪜꪸꪙ ꪼꪄ ꪀꪫꪱꪉ ꪀꪾꪚ ꪤꪱꪫ ꪥꪴꪀ ꪋꪽ ꪎꪴꪉ ( ꪀꪱꪫ ꪭꪮꪀ ) ꪹꪜꪸꪙ ꪕꪮꪥ ꪈꪫꪸꪙ ꪔꪰꪀ ꪹꪋꪷꪉ ꪝꪸꪉ ꪻꪬ ꪹꪚꪱ ꪁꪫꪱꪙ ꫛ ꪻꪒ ꪵꪮꪚ ꪼꪒ.

Observation: Example ASCII punctuation in UDHR. (source)

Dashes. Dashes are used to separate phrases.

ꪉꪮꪙ ꪶꪕ ꪖꪳ 2 ꪣꪳ 05 ꪁꪾ ꪹꪚꪙ 8 ꪜꪲ 2019 – ꪁꪱꪫꪣ ꪶꪕ ꪵꪔ ꪶꪡꪉ ꪚꪱꪙ ꪙꪱ — Observation: Example of en-dash in Tai Viet. (source)

Poems & songs. The only native punctuation in the Unicode Tai Viet block is for poems and songs: ꫞ marks the beginning and ꫟ marks the end of the text.

Bracketed text

Tai Dam commonly uses ASCII parentheses to insert parenthetical information into text.

	start	end
standard	(	)

ꪉꪮꪙ ꪶꪕ ꪋꪴ ꪹꪋ ꫄ꪤꪙ ꪹꪣꪉ ꪀꪰꪚ ꪀꪺꪀ ꪶꪭꪥ (Ngon tô chu chựa dần mương cắp Quốc hội) — Observation: Examples of parentheses in Tai Viet. (source)

Abbreviation, ellipsis & repetition

Repetition. ꫝ indicates repetition of the previous word.

Line & paragraph layout

Line breaking & hyphenation

tbd

Observation: The primary break point for text seen online is the inter-word space.

Baselines, line height, etc.

tbd

Tai Viet uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

The figure below shows glyphs from the Tai Heritage Pro font. Tai Viet extenders and combining marks extend well beyond the Latin ascenders and descenders, creating a need for much larger line heights.

Xqhxꪚꪸ꫁ꪉꪳ꪿ꪞꪜꪴꪗꪴꪼꪨ — Font metrics for Latin text compared with Tai Viet glyphs in the Tai Heritage Pro font.
