Kirat Rai orthography notes

Sample

𖵢𖵣𖵈𖵜𖵀𖵛𖵤 𖵃𖵑𖵥𖵖𖵣𖵃𖵩 𖵃𖵢𖵣𖵈𖵞𖵣𖵅𖵤𖵛 𖵛𖵤𖵖𖵫𖵝𖵩𖵆𖵬𖵜𖵣𖵈-𖵔𖵣𖵞𖵤𖵠𖵣𖵃𖵩 “𖵃𖵩” 𖵝𖵥𖵗𖵞𖵣𖵔𖵣 𖵙𖵧𖵈𖵟𖵣-𖵙𖵥𖵠𖵥𖵢𖵩𖵖 𖵢𖵩𖵈𖵅𖵣𖵛𖵔𖵣 𖵜𖵈𖵅𖵣𖵖 𖵖𖵈 𖵃𖵧𖵖𖵅𖵣𖵛𖵉𖵤 𖵗𖵩𖵜𖵣𖵄𖵞𖵣 𖵛𖵥𖵛𖵣𖵅𖵣𖵝𖵫𖵛𖵣𖵃𖵩 𖵠𖵣𖵖𖵀 𖵔𖵥𖵈𖵔𖵣 𖵉𖵥𖵈𖵢𖵣𖵈-𖵛𖵣 𖵛𖵥𖵛𖵣 𖵞𖵤𖵠𖵣𖵃𖵩 𖵜𖵣𖵃𖵣𖵃𖵮

Source: Sikkim Herald article

Usage & history

Kirat Rai is a South Asian abugida used in the Indian state of Sikkim to write the Bantawa language. It is sometimes called “Khambu Rai Lipi” in West Bengal. The script is used for poetry, newspapers, educational materials, and government records, among other things. It was developed by the late Kripasalyan Rai in 1981-1982 from the Devanagari script. Bantawa has been taught in schools up to the primary level since it was recognized as one of the official languages of Sikkim in 1997.

Kirat Rai doesn't have the conjunct forms, reordering, or combining marks of most Brahmi-derived scripts. Words are separated using spaces.

More information: Unicode Proposal • Unicode

Basic features

The Kirat Rai script is an abugida, i.e. each consonant contains an inherent vowel sound.

Vowels The inherent vowel is pronounced a, or in some regions ʌ.

Post-consonant vowels are written using free-standing letters (vowel signs). There are no combining marks.

Vowels can be nasalised using the punctuation-like letter 𖵁 after the vowel.

There are no pre-base glyphs, circumgraphs, or composite vowel signs.

Standalone vowel sounds are written using 𖵃 as a carrier, followed by a vowel letter. There are no independent vowel letters.

Consonants Bantawa has 30 basic consonant letters, but 2 more consonants are used to represent sounds in borrowed words.

Vowel absence Vowel absence is indicated using a visible virama letter after a consonant with no following vowel, by unmarked vowel absence (word-finally), or by dedicated letters for syllable codas. There are no conjunct forms, nor dedicated medial consonant characters.

There are two vowel killers, and both are always visible. 𖵬 (U+16D6C) is used to kill the inherent vowel in the first syllable of a word. 𖵫 (U+16D6B) is used in all other locations.

Geminated consonants are also written by repeating the consonant letter with a virama letter between them.

Syllable codas can be written with ordinary consonant letters, or using one of 2 dedicated letters resembling Sanskrit's bindu and visarga marks.

Numbers Kirat Rai has a set of native digits.

Layout Bantawa text runs left to right in horizontal lines. Words are separated by spaces. There is no case distinction.

Punctuation is mostly ASCII, but dandas are used for sentence and verse final punctuation.

Character index

Letters

Basic consonants

𖵃,𖵄,𖵅,𖵆,𖵇,𖵈,𖵉,𖵊,𖵋,𖵌,𖵍,𖵎,𖵏,𖵐,𖵑,𖵒,𖵓,𖵔,𖵕,𖵖,𖵗,𖵘,𖵙,𖵚,𖵛,𖵜,𖵝,𖵞,𖵟,𖵠,𖵡,𖵢

Vowels

𖵣,𖵤,𖵥,𖵦,𖵧,𖵨,𖵩,𖵪

Other

𖵀,𖵁,𖵂,𖵫,𖵬

Numbers

𖵰,𖵱,𖵲,𖵳,𖵴,𖵵,𖵶,𖵷,𖵸,𖵹

Punctuation

𖵭,𖵮,𖵯,‘,’,“,”

ASCII

!,(,),␣,.,:,;,?

Phonology

The following represents the repertoire of the Bantawa language.

Phones in a lighter colour are non-native or allophones. Source: Wikipedia.

Vowel sounds

Plain vowels

Complex vowels

aɪ aʊ

Consonant sounds

	labial	alveolar	post- alveolar	retroflex	palatal	velar	glottal
stop	p b	t d		ʈ ɖ	c ɟ	k ɡ	ʔ
	pʰ bʰ	tʰ dʰ		ʈʰ ɖʰ		kʰ ɡʰ
affricate		t͡s d͡z
		t͡sʰ d͡zʰ
fricative		s	ʃ				h
nasal	m	n			ɲ	ŋ
approximant	w	l			j
trill/flap		r

Tone

Bantawa is not a tonal language.

Structure

tbd

Vowels

	post-consonant	standalone
Simple	𖵤,𖵦,𖵥	𖵃𖵤,𖵃𖵦,𖵃𖵥
	𖵧,𖵩	𖵃𖵧,𖵃𖵩
	ⓘ,𖵣	𖵃,𖵃𖵣
Diphthongs	𖵨,𖵪	𖵃𖵨,𖵃𖵪

ⓘ represents the inherent vowel. The letter 𖵁 follows a vowel to indicate nasalisation (not shown here).

Inherent vowel

𖵄 ka

The inherent vowel for Kirat Rai is pronounced a or in some regions ʌ. So ka is written by simply using the consonant letter.

eg.

𖵄𖵤𖵝𖵒 𖵝𖵨

𖵄,𖵤,𖵝,𖵒

Since Kirat Rai consonants normally include an inherent vowel, the orthography has ways to indicate a consonant that is not followed by a vowel sound. See the Vowel absence section.

Post-consonant vowels

𖵄𖵤 ki

The 'vowel signs' used to write vowels in Kirat Rai are all free-standing letters. Even though this is an abugida, there are no combining marks. All vowel letters follow the consonant letter; there are no pre-base glyphs or circumgraphs.

Although several of these characters are clearly built from the same components, it is recommended to use the atomic code points.

Vowels can be nasalised using a punctuation-like letter after the vowel.

Plain vowels

The plain vowels of Kirat Rai are written using the following letters.

𖵤,𖵦,𖵥,𖵧,𖵩,𖵣

eg.

𖵠𖵥𖵅𖵫𖵅𖵤𖵛

𖵗𖵦

Diphthongs

Kirat Rai has the following dedicated letters for post-consonant diphthongs.

𖵨,𖵪

eg.

𖵊𖵪𖵁𖵟𖵣

Nasalisation

Vowel nasalisation is indicated using 𖵁 after the vowel.

eg.

𖵜𖵣𖵠𖵣𖵁𖵟𖵣

𖵊𖵪𖵁𖵟𖵣

Standalone vowels

𖵃𖵤 i

Kirat Rai has no independent vowel letters, but instead uses 𖵃 as a carrier for standalone vowels. This can represent a zero or glottal stop onset. The vowel to be pronounced is indicated by following it with the appropriate vowel letter.

eg.

𖵃𖵤𖵛𖵣

𖵃𖵤,𖵛,𖵣

𖵊𖵧𖵁𖵃𖵤𖵟𖵣

𖵊𖵧𖵁,𖵃𖵤,𖵟𖵣

Used alone, this letter represents the standalone vowel a.

eg.

𖵃𖵈𖵫𖵄𖵣

The following list shows how to write each of the standalone vowels.

𖵃𖵤,𖵃𖵦,𖵃𖵥,𖵃𖵧,𖵃𖵩,𖵃,𖵃𖵣,𖵃𖵨,𖵃𖵪
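
The carrier rule described above can be sketched in a few lines of Python. This is only an illustration: the names `CARRIER`, `VOWEL_LETTERS` and `standalone` are hypothetical, and the code point values assume that the carrier letter is U+16D43 and that the eight vowel letters occupy U+16D63 to U+16D6A.

```python
# ASSUMPTION: code point values are not given in the text above.
# Carrier letter assumed at U+16D43, vowel letters at U+16D63..U+16D6A.
CARRIER = "\U00016D43"
VOWEL_LETTERS = {chr(cp) for cp in range(0x16D63, 0x16D6B)}


def standalone(vowel: str = "") -> str:
    """Spell a standalone vowel as carrier + vowel letter.

    Called with no argument, the bare carrier is returned, which
    stands for the standalone inherent vowel.
    """
    if vowel and vowel not in VOWEL_LETTERS:
        raise ValueError("expected a single Kirat Rai vowel letter")
    return CARRIER + vowel


# ascii() keeps the demo output printable on any terminal.
print(ascii(standalone("\U00016D64")))  # carrier followed by the second vowel letter
print(ascii(standalone()))              # bare carrier
```

Returning the bare carrier for an empty argument mirrors the statement that, used alone, the letter represents the standalone vowel a.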

Vowel sounds to characters

This section maps Bantawa vowel sounds to common graphemes in the Kirat Rai orthography.

Plain vowels

vowel 𖵤

vowel 𖵦

vowel 𖵥

vowel 𖵧

vowel 𖵩

inherent vowel

vowel carrier 𖵃 Pronounced /a/ when used as a standalone vowel in a syllable onset.

aː

vowel 𖵣

Complex vowels

aɪ

vowel 𖵨

aʊ

vowel 𖵪

Nasalisation

◌̃

nasalisation 𖵁

Vowel absence

Vowel absence principally occurs either when a consonant is a syllable coda, or when a consonant is part of a consonant cluster.

Given that consonants normally include an inherent vowel, the orthography needs a way to indicate when a consonant is not followed by a vowel.

Vowel absence is indicated in the following ways.

A visible virama after a consonant with no following vowel.
Unmarked vowel absence (word-finally).
Dedicated coda letters for syllable codas (see Codas).

Visible virama

Word-internally, suppression of the inherent vowel is marked, but Kirat Rai is unusual in that it has 2 vowel killers:

𖵫,𖵬

𖵬 is only used to mute the inherent vowel of the first letter of the word, whereas 𖵫 is the most frequently used and appears in all other positions. Also unusual, these vowel killers are letters, rather than combining marks.

eg.

𖵚𖵬𖵜𖵣𖵄𖵫𖵠𖵤𖵖

𖵛𖵬𖵢𖵥𖵛𖵣

𖵅𖵣𖵎𖵫𖵛𖵣

𖵉𖵣𖵃𖵫𖵟𖵣
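
As a rough sketch of the positional rule stated in this section (one killer after the first letter of a word, the other everywhere else), the Python below picks the vowel killer from the position of the letter being muted. The function names `vowel_killer` and `mute` are hypothetical; the code point values U+16D6B and U+16D6C are the ones given under Basic features.

```python
VIRAMA = "\U00016D6B"  # vowel killer used in all other positions
SAAT = "\U00016D6C"    # vowel killer used after the first letter of a word


def vowel_killer(letter_index: int) -> str:
    """Pick the vowel killer for the letter at letter_index (0-based)."""
    if letter_index < 0:
        raise ValueError("letter_index must not be negative")
    return SAAT if letter_index == 0 else VIRAMA


def mute(word: str, letter_index: int) -> str:
    """Insert the appropriate vowel killer after word[letter_index]."""
    if not 0 <= letter_index < len(word):
        raise IndexError("letter_index is outside the word")
    cut = letter_index + 1
    return word[:cut] + vowel_killer(letter_index) + word[cut:]


# A five-letter word given as code points, with no vowel killers yet.
bare = "\U00016D45\U00016D63\U00016D4E\U00016D5B\U00016D63"
print(ascii(mute(bare, 2)))  # killer inserted after the third letter
print(ascii(mute(bare, 0)))  # killer inserted after the first letter
```

Indices refer to the word as passed in, so when several letters need muting it is simplest to work from the end of the word towards the start.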

Unmarked vowel absence

The inherent vowel is not usually pronounced at the end of a word, though the absence is not marked in writing.

eg.

𖵛𖵣𖵀𖵄𖵩𖵞𖵧𖵖

𖵛𖵣𖵀,𖵄𖵩,𖵞𖵧𖵖

Consonants

Onsets & codas	𖵗,𖵙,𖵒,𖵔,𖵎,𖵐,𖵄,𖵆,𖵃,𖵂
	𖵘,𖵚,𖵓,𖵕,𖵏,𖵑,𖵅,𖵇
	𖵉,𖵋,𖵊,𖵌
	𖵠,𖵡,𖵢
	𖵛,𖵖,𖵀,𖵍,𖵈,𖵀
	𖵟,𖵝,𖵞,𖵜

Ordinary consonant letters can also appear as syllable codas; the dedicated coda letters that are not used in onsets are 𖵀 and 𖵂.

Basic consonants

The following are the basic consonant letters used for writing Bantawa.

𖵗,𖵘,𖵙,𖵚,𖵒,𖵓,𖵔,𖵕,𖵎,𖵏,𖵐,𖵄,𖵅,𖵆,𖵇,𖵃,𖵉,𖵊,𖵋,𖵌,𖵠,𖵡,𖵢,𖵛,𖵖,𖵈,𖵟,𖵝,𖵞,𖵜

𖵃 represents ʔ as a syllable coda, and is also used as part of a standalone vowel sequence. Alone as a syllable onset it represents a. See Standalone vowels.

Repertoire extension

In addition, Bantawa texts may contain the following letters, but they are used only for borrowed words.

𖵍,𖵑

Onsets

Bantawa has some consonant clusters at the beginning of a syllable; these are written with 𖵬 (SAAT) between the consonant letters. There are no dedicated code points for medial consonants, but it's worth noting that the SAAT character is only used for killing the vowel in the syllable onset.

eg.

𖵢𖵬𖵜𖵥𖵖𖵤

𖵛𖵬𖵢𖵥𖵛𖵣

Codas

Syllable codas are generally written using ordinary consonant letters (followed by a virama if word-internal).

eg.

𖵚𖵬𖵜𖵣𖵄𖵫𖵠𖵤𖵖

𖵅𖵣𖵎𖵫𖵛𖵣

Kirat Rai does have a couple of code points dedicated to syllable codas that look familiar to users of Indic scripts, but all of the glyphs representing final consonants have the general category of letter, rather than combining mark.

Nasal codas. 𖵀 is used for nasal codas. The phonetic value, n or ŋ, depends on the articulation of the following consonant.

eg.

𖵛𖵣𖵀𖵄𖵩𖵞𖵧𖵖

𖵛𖵣𖵀𖵠𖵣𖵟𖵣

Glottal stop codas. A syllable-final glottal stop can be written in one of two ways: (a) using 𖵂, or (b) using 𖵃. The same word can be written in either way.

eg.

𖵃𖵣𖵂𖵟𖵣

𖵃𖵣𖵃𖵫𖵟𖵣

𖵉𖵣𖵃𖵫𖵟𖵣

𖵉𖵣𖵂𖵟𖵣

Note that word-medially a virama is needed after 𖵃, but not after the visarga-like 𖵂.
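
Because the same word can be spelled with either glottal stop coda, an application that compares or searches text may want to fold one spelling into the other. The following Python is a minimal sketch of that idea, not an established algorithm: the names `fold_glottal_coda` and `same_word` are hypothetical, and the code points for the carrier letter (U+16D43) and the visarga-like coda letter (U+16D42) are assumptions. It only handles the word-medial spelling with a virama.

```python
# ASSUMPTION: carrier letter at U+16D43, visarga-like coda letter at U+16D42.
CARRIER = "\U00016D43"
GLOTTAL_CODA = "\U00016D42"
VIRAMA = "\U00016D6B"


def fold_glottal_coda(text: str) -> str:
    """Rewrite carrier + virama as the dedicated glottal stop coda letter."""
    return text.replace(CARRIER + VIRAMA, GLOTTAL_CODA)


def same_word(a: str, b: str) -> bool:
    """Compare two spellings, ignoring the choice of glottal stop coda."""
    return fold_glottal_coda(a) == fold_glottal_coda(b)


with_carrier = "\U00016D49\U00016D63\U00016D43\U00016D6B\U00016D5F\U00016D63"
with_coda_letter = "\U00016D49\U00016D63\U00016D42\U00016D5F\U00016D63"
print(same_word(with_carrier, with_coda_letter))
```

Folding towards the dedicated coda letter is an arbitrary choice made here for brevity; the fold is intended for matching only, and the stored text should keep the author's spelling.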

Consonant length

Geminated consonants occur in Kirat Rai, but they are simply written using 2 instances of the letter with an intervening 𖵫.

eg.

𖵠𖵥𖵅𖵫𖵅𖵤𖵛
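
Gemination as described above amounts to a simple string operation. The sketch below assumes a word-internal position, where the vowel killer is U+16D6B; the function name `geminate` is hypothetical.

```python
VIRAMA = "\U00016D6B"  # word-internal vowel killer


def geminate(consonant: str) -> str:
    """Spell a geminated consonant: letter + virama + the same letter."""
    if len(consonant) != 1:
        raise ValueError("expected a single consonant letter")
    return consonant + VIRAMA + consonant


print(ascii(geminate("\U00016D45")))
```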

Consonant sounds to characters

This section maps Bantawa consonant sounds to common graphemes in the Kirat Rai orthography.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc. Light coloured characters occur infrequently.

consonant 𖵗

pʰ

consonant 𖵘

consonant 𖵙

bʰ

consonant 𖵚

consonant 𖵒

tʰ

consonant 𖵓

t͡s

consonant 𖵉

t͡sʰ

consonant 𖵊

consonant 𖵔

dʰ

consonant 𖵕

d͡z

consonant 𖵋

d͡zʰ

consonant 𖵌

consonant 𖵎

ʈʰ

consonant 𖵏

consonant 𖵐

ɖʰ

consonant 𖵑

consonant 𖵄

kʰ

consonant 𖵅

consonant 𖵆

ɡʰ

consonant 𖵇

glottal stop 𖵂

vowel carrier 𖵃 As a vowel carrier. (Pronounced /a/ when used as a standalone vowel in a syllable onset.)

consonant 𖵠

consonant 𖵡 Used in borrowed words only.

consonant 𖵢

consonant 𖵛

consonant 𖵖

nasal coda 𖵀

consonant 𖵍 Used in borrowed words only.

consonant 𖵈

nasal coda 𖵀

consonant 𖵟

consonant 𖵝

consonant 𖵞

consonant 𖵜

Encoding choices

This section offers advice about characters or character sequences to avoid, and what to use instead. It takes into account the relevance of Unicode Normalisation Form D (NFD) and Unicode Normalisation Form C (NFC).

Although usage is recommended here, content authors may well be unaware of such recommendations. Therefore, applications should look out for the non-recommended approach and treat it the same as the recommended approach wherever possible.

Canonically equivalent encodings

Several vowel letters can be visually analysed as an atomic character (the norm), or as a sequence of component parts. The Unicode Standard recommends that authors use the atomic code points, rather than a sequence of component parts.

Atomic (recommended)	Decomposed ( NOT recommended )
𖵩	𖵩
𖵨	𖵨
𖵪	𖵪
𖵪	𖵪
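
An application that wants to treat both encodings alike could fold decomposed sequences to the atomic letters before comparing text. The Python below is a hand-rolled sketch under loudly stated assumptions: the decompositions in `COMPOSITIONS` are my reading of the Unicode character data and should be checked against it before use, and the names `COMPOSITIONS` and `to_atomic` are hypothetical. Where the runtime's Unicode data already covers Kirat Rai (Unicode 16.0 or later), `unicodedata.normalize("NFC", text)` would be the more natural tool.

```python
# ASSUMPTION: code points and decompositions below are not stated in the
# text above; verify them against the Unicode character data before use.
AA = "\U00016D63"
E = "\U00016D67"
AI = "\U00016D68"
O = "\U00016D69"
AU = "\U00016D6A"

# Applied in this order so that AA + E + E first becomes O + E, then AU.
COMPOSITIONS = [
    (AA + E, O),
    (O + E, AU),
    (E + E, AI),
]


def to_atomic(text: str) -> str:
    """Replace decomposed vowel sequences with the atomic vowel letters."""
    for sequence, atomic in COMPOSITIONS:
        text = text.replace(sequence, atomic)
    return text


print(to_atomic(AA + E + E) == AU)
```

The order of the table matters: replacing E + E first would turn AA + E + E into AA followed by the AI letter, which is not the intended result.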

Inappropriate sequence

One additional vowel letter was sometimes composed from 2 code points in pre-Unicode encodings. In Unicode there is no equivalence between the atomic character and the sequence shown below, so the atomic character must be used. This also preserves better semantics.

Use this	Do NOT use
𖵦	𖵥𖵭
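
Since there is no canonical equivalence here, normalisation will not repair the two-code-point spelling, and an application has to look for it explicitly. The sketch below only reports where the sequence occurs. It does not replace it automatically, because the second character of the sequence is also the sign used for abbreviations. The function name `find_legacy_sequences` is hypothetical, and the code points are assumptions based on the order of the characters in the character index.

```python
# ASSUMPTION: code points inferred from the character index order.
U_LETTER = "\U00016D65"   # first character of the sequence to avoid
SIGN = "\U00016D6D"       # second character; also marks abbreviations
UE_LETTER = "\U00016D66"  # the atomic letter to use instead


def find_legacy_sequences(text: str) -> list[int]:
    """Return the start index of every U_LETTER + SIGN sequence in text."""
    sequence = U_LETTER + SIGN
    hits = []
    start = text.find(sequence)
    while start != -1:
        hits.append(start)
        start = text.find(sequence, start + 1)
    return hits


sample = "\U00016D57" + U_LETTER + SIGN
print(find_legacy_sequences(sample))
```

Each reported position can then be reviewed, or replaced with `UE_LETTER` where the context shows that no abbreviation is intended.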

Numbers

Digits

Kirat Rai has a set of native, decimal digits.

𖵰,𖵱,𖵲,𖵳,𖵴,𖵵,𖵶,𖵷,𖵸,𖵹
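
Because the digits are decimal, converting between ASCII and Kirat Rai digits is a one-to-one character mapping. The sketch below assumes that the ten digits listed above run from zero to nine in order, starting at U+16D70; the function names are hypothetical. It uses an explicit table rather than relying on the runtime's own Unicode data, which may predate the script.

```python
# ASSUMPTION: digits zero to nine occupy U+16D70..U+16D79 in order.
ZERO = 0x16D70
TO_KIRAT_RAI = {ord("0") + i: ZERO + i for i in range(10)}
TO_ASCII = {native: ascii_cp for ascii_cp, native in TO_KIRAT_RAI.items()}


def to_kirat_rai_digits(text: str) -> str:
    """Replace ASCII digits with Kirat Rai digits; other characters are kept."""
    return text.translate(TO_KIRAT_RAI)


def to_ascii_digits(text: str) -> str:
    """Replace Kirat Rai digits with ASCII digits; other characters are kept."""
    return text.translate(TO_ASCII)


native = to_kirat_rai_digits("1982")
print(ascii(native))
print(int(to_ascii_digits(native)))
```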

Text direction

Kirat Rai text runs left to right in horizontal lines.

Glyph shaping & positioning

Experiment with examples using the Kirat Rai character app.

Context-based shaping & positioning

Kirat Rai letters don't interact, so no special shaping is needed.

There are no combining marks, so no context-sensitive shaping is needed, either.

Typographic units

Word boundaries

Words are separated by spaces.

Some words are hyphenated.

eg.

𖵠𖵣𖵜𖵣-𖵙𖵥𖵈

Graphemes

Graphemes in Kirat Rai consist of single letters, since there are no combining marks. This means that text can be segmented into typographic units using grapheme clusters.

Phrase, sentence, and section delimiters are described in Phrase & section boundaries.
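
With no combining marks, each code point in Kirat Rai text forms its own grapheme cluster, so segmentation reduces to splitting the string into code points. The sketch below relies on that simplification and is not a general grapheme cluster implementation: it would give wrong results for text in scripts that do use combining marks. The function name `grapheme_clusters` is hypothetical.

```python
def grapheme_clusters(text: str) -> list[str]:
    """Split Kirat Rai text into one unit per code point.

    Python strings iterate by code point, so supplementary-plane
    characters are never split in half.
    """
    return list(text)


word = "\U00016D60\U00016D65\U00016D45\U00016D6B\U00016D45\U00016D64\U00016D5B"
print(len(grapheme_clusters(word)))  # 7
```

Units obtained this way can serve as the basis for operations such as letter-spacing or cursor movement.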

Punctuation & inline features

Phrase & section boundaries

Kirat Rai uses ASCII punctuation marks for the most part.

phrase	, ; :
sentence	𖵮 . ? !
section	𖵯

It also has its own dandas, used to end a sentence and a section, respectively.

Bracketed text

See type samples.

Kirat Rai commonly uses ASCII parentheses to insert parenthetical information into text.

	start	end
standard	(	)

Quotations & citations

See type samples.

Kirat Rai texts use the following punctuation around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.

	start	end
initial	“	”
nested	‘	’

The row labelled 'initial' indicates the usual default quote marks. When an additional quote is embedded within the first, the quote marks are those in the 'nested' row.

Abbreviation, ellipsis & repetition

Kirat Rai uses 𖵭 to indicate abbreviation.

eg.

𖵐𖵭𖵋𖵭𖵗𖵤𖵭

Line & paragraph layout

Line breaking & hyphenation

Lines are generally broken between words.

Text alignment & justification

Full justification is achieved by adjusting the spaces between words.
