Tolong Siki, Kurukh

(draft) orthography notes

Updated 18 May, 2025

This page brings together basic information about the Tolong Siki script and its use for the Kurukh language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Kurukh using Unicode.

NOTE: The Tolong Siki font used for this page is defective when it comes to positioning combining marks alongside a base character glyph. This results in square boxes or, in some browsers, odd placements where a mark is used in an example. Hopefully, in time, a better version of the font will become available.

Referencing this document

Richard Ishida, Tolong Siki (Kurukh) Orthography Notes, 18-May-2025, https://r12a.github.io/scripts/tols/kru

Sample

𑷕𑶳𑷐𑶺𑶵 𑶵𑷑𑶵𑷐𑶰𑶿 𑷕𑶴𑷊 𑷌𑶴𑷕𑶰 𑶸𑶵𑷐𑶱 𑶿𑶲 𑶺𑶴𑷑𑷑𑶰𑶿𑶻𑶵 𑶴𑷗𑶵𑷂𑶰 𑶴𑷐𑶵 𑶴𑷓𑷀𑶱𑶺 𑶺𑶴𑶿𑶿𑶵 𑷌𑶴𑷕𑶰 𑷕𑶴𑶻 𑷖𑶴𑷋𑶴𑷐𑷊𑶰 𑷐𑶴𑶰. 𑶵𑷐𑶰𑶿 𑷑𑶲𑷐 𑶴𑷐𑶵 𑷇𑶰𑷏𑶵 𑷌𑶴𑷕𑶰 𑷂𑶴𑶽 𑶸𑶴𑶲𑷔𑶵 𑷖𑶴𑷋𑶴𑷊𑶰𑷙 𑷐𑶴𑶰𑷙 𑶴𑷐𑶵 𑶻𑶲𑶺𑷕𑶱𑷙𑷓 𑶺𑶴𑷈𑶰 𑶿𑶲𑷙 𑶺𑶱𑷑-𑶶̰𑶱𑶺 𑷌𑶴𑷕𑶰 𑶸𑶱𑶽𑷕𑶵𑷐 𑶿𑶴𑶿𑶿𑶵 𑷅𑶴𑷕𑶰𑷙.

Source: Universal Declaration of Human Rights, article 1

Usage & history

Tolong Siki is a South Asian monocameral alphabet used in India specifically for the North Dravidian Kurukh language. It was invented by Narayan Oraon in 1988 and formally published in 1999. Books and magazines have been published in Tolong Siki, and it was officially recognized by the state of Jharkhand in 2007. The Kurukh Literary Society of India has been instrumental in spreading the Tolong Siki script for Kurukh literature.

𑶻𑶳𑷑𑶳𑷎 𑷔𑶰𑷊𑶰

Tolong Siki is read horizontally, left to right. It is a relatively simple alphabet. Vowels are written using letters, with additional signs to indicate vowel length and nasalisation. There are no ligatures or conjuncts. There are combining marks, but no ascenders or descenders, so the need for context-sensitive positioning is low. Words are separated using spaces. It has its own set of number digits.

More information: Unicode Proposal; Kobayashi et al.

Basic features

The Tolong Siki script is an alphabet, ie. consonants and vowels are written separately. See the table to the right for a brief overview of features for the Kurukh language.

Text runs left-to-right in horizontal lines. There is no case distinction. Words are separated by spaces.

Tolong Siki represents native consonant sounds using a basic set of around 36 letters.

Medial r can be written as a tilde below the preceding base, and Tolong Siki also uses a dot like the Devanagari anusvara to represent homorganic nasals at the end of a syllable.

There are no special mechanisms involved in representing consonant clusters. Letters are simply juxtaposed. Similarly, geminated consonant sounds are indicated by simply doubling the consonant letter.

In Tolong Siki text basic vowels are written in a straightforward way, using vowel letters for each of the native vowel sounds. All vowels can be lengthened using a colon-like sign, and nasalised using a tilde above, or both.

A breve above a vowel can be used to indicate non-native sounds.

Word-initial standalone vowels begin phonetically with a glottal stop, but are written using the vowel letter alone. Word-medial standalone vowels are typically preceded by an apostrophe, which represents the glottal stop and indicates that the sequence is not a diphthong.

Tolong Siki has a set of native digits, and uses ASCII code points for most punctuation marks.

Notable features

Character index

Letters

Basic consonants

𑶶␣𑶷␣𑶸␣𑶹␣𑶺␣𑶻␣𑶼␣𑶽␣𑶾␣𑶿␣𑷀␣𑷁␣𑷂␣𑷃␣𑷄␣𑷅␣𑷆␣𑷇␣𑷈␣𑷉␣𑷊␣𑷋␣𑷌␣𑷍␣𑷎␣𑷏␣𑷐␣𑷑␣𑷒␣𑷓␣𑷔␣𑷕␣𑷖␣𑷗␣𑷘␣𑷚

Vowels

𑶰␣𑶱␣𑶲␣𑶳␣𑶴␣𑶵␣𑷙

Other

𑷛

Combining marks

Vowels

̃␣̆

Medials

̰

Finals

̇

Other

̤␣̈

Not used for modern Kurukh

̄␣̣␣̱

Numbers

𑷠␣𑷡␣𑷢␣𑷣␣𑷤␣𑷥␣𑷦␣𑷧␣𑷨␣𑷩

Punctuation

।␣ʼ

ASCII

!␣(␣)␣,␣-␣.␣:␣;␣?

Phonology

The following represents the repertoire of the Kurukh language.

Phones in a lighter colour are non-native or allophones. Source Wikipedia.

Vowel sounds

Plain vowels

i u ũ e o õ ɔ ɔː ɔ̃ a a

Consonant sounds

labial dental alveolar post-alveolar retroflex palatal velar glottal
stop p b   t d   ʈ ɖ c ɟ k ɡ ʔ
      ʈʰ ɖʰ ɟʰ ɡʰ  
fricative     s ʃ     x h
nasal m   n   ɳ ɲ ŋ
approximant w   l     j  
trill/flap     r   ɽ ɽʰ

Tone

Kurukh is not a tonal language.

Structure

tbd

Vowels

Vowel summary table

This table summarises only basic vowel to character assignments.

Diacritics are added to the vowels to indicate length and nasalisation. These are shown for a single vowel letter, as the pattern is the same for other vowels.

Simple:
  post-consonant: 𑶰␣𑶲␣𑶱␣𑶳␣𑶴␣𑶵
  long vowel: 𑶰𑷙␣𑶲𑷙␣𑶱𑷙␣𑶳𑷙␣𑶴𑷙␣𑶵𑷙
  nasalised vowel: 𑶰̃␣𑶲̃␣𑶱̃␣𑶳̃␣𑶴̃␣𑶵̃
  long, nasalised vowel: 𑶰̃𑷙␣𑶲̃𑷙␣𑶱̃𑷙␣𑶳̃𑷙␣𑶴̃𑷙␣𑶵̃𑷙

For additional details see Vowel sounds to characters.

Post-consonant vowels

Basic vowels are written in a straightforward way, using dedicated letters for each of the native vowel sounds. All vowels can be lengthened using a colon-like sign, and nasalised using a tilde above, or both.

A breve above a vowel can be used to indicate non-native sounds.

Basic vowel letters

These are the basic vowels.

𑶰␣𑶲␣𑶱␣𑶳␣𑶴␣𑶵

For example:

𑶼𑶴𑷐𑶰𑷏𑶵

𑶼␣𑶴␣𑷐␣𑶰␣𑷏␣𑶵

Kobayashi et al. say that Kurukh has 5 cardinal vowels, and both 𑶴 and 𑶵 are generally transcribed using a. However, they describe the latter as only marginally longer than the former, but consistently pronounced further back. LETTER A is described as phonetically between ɐ and ə [kl,23]. This distinction is reinforced by the fact that both can take lengthening marks (see Vowel length). The two vowels are therefore represented here using different symbols, a and ɑ, to give more clarity in the phonological transcriptions.

Extended vowels

𑶵̆

The diacritic ̆ can be used to indicate non-native vowel sounds. An example is the English sound ɔ, which is written in a way that calls to mind the candra symbols in Devanagari.

𑶵̆𑶶𑶰𑷔

Diphthongs

Kurukh has diphthongs, including ɐɪ and ɐʊ. They are written using a sequence of vowel letters, with no intervening glottal stop character.

𑷏𑶴𑶰𑶽

Vowel length

Long vowels in Tolong Siki are indicated by the use of 𑷙.

𑶱𑷙𑷊𑶵

𑶺𑶴𑷈𑷕𑶰𑷙

The following panel shows examples of lengthened vowels.

𑶰𑷙␣𑶲𑷙␣𑶱𑷙␣𑶳𑷙␣𑶴𑷙␣𑶵𑷙

Observation: Although 𑶵 is already supposed to be long, instances of it followed by a vowel lengthener can easily be found in texts, for example 𑶻𑶵𑷙𑷊𑶵. It's not clear what this means, although it lends support to the assumption that 𑶴 and 𑶵 are differentiated more by quality than length (see Basic vowel letters).

Nasalisation

Vowel nasalisation is marked using ̃.

𑷊𑶲̃𑷗𑶲𑷖

𑷀𑶲𑷉𑶵̃

If another sign follows the base vowel letter, such as the vowel lengthening sign, the nasalisation mark is still positioned over the base letter. For example,

𑶺𑶲̃𑷙𑷈𑶲𑷐ʼ𑶵

The following panel shows examples of long, nasalised vowels.

𑶰̃𑷙␣𑶲̃𑷙␣𑶱̃𑷙␣𑶳̃𑷙␣𑶴̃𑷙␣𑶵̃𑷙

This diacritic was introduced in 2015. Prior to that, authors would use ̇ for vowel nasalisation as well as to represent nasal codas (for which it is still used).

Standalone vowels

Word-initial standalone vowels are phonetically preceded by a glottal stop, but are written using the normal vowel letters, with no additional mechanisms.

𑶵𑷗𑶰𑷙

𑶵̆𑶶𑶰𑷔

𑶰𑷐𑶰𑶸

Word-medial standalone vowels are typically preceded by ʼ, which indicates the syllable boundary within the word and, in careful speech, the glottal stop, eg.

𑷕𑶴ʼ𑶵

𑷐𑶱𑶻ʼ𑶵

𑶱𑷙𑶼𑷐ʼ𑶵

This sign helps to distinguish between diphthongs and multiple syllables.

Vowel sounds to characters

This section maps Kurukh vowel sounds to common graphemes in the Tolong Siki orthography.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc. Light coloured characters occur infrequently.

Plain vowels

i

vowel 𑶰

u

vowel 𑶲

e

vowel 𑶱

o

vowel 𑶳

ɔ

extended vowel 𑶵̆ Non-native sound in loan words.

a

vowel 𑶴

ɑ

vowel 𑶵

Modifiers

◌̃

nasalisation ̃

ː

vowel length 𑷙

Consonants

Consonant summary table

This table summarises only basic consonant to character assignments.

Onsets
𑶶␣𑶸␣𑶻␣𑶽␣𑷀␣𑷂␣𑷅␣𑷇␣𑷊␣𑷌␣𑷚
𑶷␣𑶹␣𑶼␣𑶾␣𑷁␣𑷃␣𑷆␣𑷈␣𑷋␣𑷍
𑷔␣𑷖␣𑷕
𑶺␣𑶿␣𑷄␣𑷉␣𑷓␣𑷎
𑷒␣𑷐␣𑷗␣𑷘␣𑷑␣𑷏
Medials:
◌̰
Finals:
◌̇

For additional details see Consonant sounds to characters.

Basic consonants

Basic consonant sounds in Kurukh are written using the following letters.

𑶶␣𑶷␣𑶸␣𑶹␣𑶻␣𑶼␣𑶽␣𑶾␣𑷀␣𑷁␣𑷂␣𑷃␣𑷅␣𑷆␣𑷇␣𑷈␣𑷊␣𑷋␣𑷌␣𑷍␣𑷚␣𑷔␣𑷖␣𑷕␣𑶺␣𑶿␣𑷄␣𑷉␣𑷓␣𑷎␣𑷒␣𑷐␣𑷗␣𑷘␣𑷑␣𑷏

The sign for the glottal stop 𑷚 is apparently used after vowel sounds, although its use does not appear to be particularly common. Narayan Oraon gives the following example: 𑷔𑶰𑷚𑶿𑶵. The sign ʼ also indicates a glottal stop when it occurs before word-medial standalone vowels (see Standalone vowels).

Observation: It's not clear what the second palatal nasal is for.

Repertoire extension

Tolong Siki uses combining marks to indicate non-native sounds. Since 2015, this involves ̤ or ̈.

The table below gives mappings to sounds found in Hindi and Sanskrit. The right-hand column shows the Devanagari equivalent.

Code points        Sound   Devanagari
U+11DCA U+0324     q       क़
U+11DCC U+0324     ɣ       ग़
U+11DC5 U+0324             क्ष
U+11DC7 U+0324     z       ज़
U+11DB7 U+0324     f       फ़
U+11DD2 U+0324     v       व़
U+11DD4 U+0324     ʃ       श
U+11DD4 U+0308     ʂ       ष
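
As a rough illustration, the sequences listed above can be assembled from their hexadecimal code points: a base letter followed by U+0324 or U+0308. The Python sketch below assumes only those values. The dictionary name, its IPA keys and the function name are hypothetical labels chosen for this example.

```python
# Base letter + combining mark pairs, taken from the code points listed above.
# EXTENDED and its IPA keys are illustrative names, not part of any standard.
EXTENDED = {
    "q": (0x11DCA, 0x0324),
    "ɣ": (0x11DCC, 0x0324),
    "z": (0x11DC7, 0x0324),
    "f": (0x11DB7, 0x0324),
    "v": (0x11DD2, 0x0324),
    "ʃ": (0x11DD4, 0x0324),
    "ʂ": (0x11DD4, 0x0308),
}


def extended_letter(sound: str) -> str:
    """Return the base letter followed by its combining mark."""
    base, mark = EXTENDED[sound]
    return chr(base) + chr(mark)


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    for sound in EXTENDED:
        seq = extended_letter(sound)
        print(sound, seq, code_points(seq))
```

The combining mark always comes second, which matches the rule that combining marks follow the base character.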

Prior to 2015 several different combining marks were used, and these may still appear in texts. They include ̣, ̱, ̄, ̄̃, and ̄̇.

Onsets

Medial r can be written using ̰. The Unicode Proposal lists the following combinations.

𑷊̰␣𑶻̰␣𑶽̰␣𑷔̰␣𑷊̰𑶰

This diacritic was introduced in 2015.

Finals

Syllable codas are typically written just using ordinary consonant letters, eg.

𑶹𑶵𑶹𑶿𑶵

However, ̇ can be used to indicate a syllable-final nasal. The quality of the nasal depends on the sound that follows. For example,

𑷀𑶳̇𑷊𑶵

Consonant clusters

Kurukh syllables can include reasonably long consonant clusters, especially word-medially but also in word-final position. The clusters can involve a number of sounds, which don't necessarily follow the normal principle of sonority sequencing. However, apart from the medial r described in Onsets, and the nasal coda described in Finals, there are no special mechanisms involved in representing consonant clusters. Letters are simply juxtaposed. See the following examples:

𑶿𑶰𑷎𑷌𑷔𑷖𑶱𑷗𑶳𑷙

𑶹𑶱𑶿𑶽𑷌𑷑𑶱𑷙

Consonant length

Consonant gemination is common in Kurukh, and often occurs at the end of a word. Gemination is indicated by doubling the relevant consonant letter. Examples:

𑶰𑶻𑶻𑶵

𑷅𑶰𑷅𑷅

𑶲𑷎𑷊𑷊

Consonant sounds to characters

This section maps Kurukh consonant sounds to common graphemes in the Tolong Siki orthography.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc. Light coloured characters occur infrequently.

p

consonant 𑶶

consonant 𑶷

b

consonant 𑶸

consonant 𑶹

t

consonant 𑶻

consonant 𑶼

d

consonant 𑶽

consonant 𑶾

ʈ

consonant 𑷀

ʈʰ

consonant 𑷁

ɖ

consonant 𑷂

ɖʰ

consonant 𑷃

c

consonant 𑷅

consonant 𑷆

ɟ

consonant 𑷇

ɟʰ

consonant 𑷈

k

consonant 𑷊

consonant 𑷋

ɡ

consonant 𑷌

ɡʰ

consonant 𑷍

ʔ

glottal stop ʼ Used before vowels.

glottal stop 𑷚 Used after vowels.

s

consonant 𑷔

x

consonant 𑷖

h

consonant 𑷕

m

consonant 𑶺

n

consonant 𑶿

homorganic final nasal ̇ Coda.

ɳ

consonant 𑷄

homorganic final nasal ̇ Coda.

ɲ

consonant 𑷉

consonant 𑷓

homorganic final nasal ̇ Coda.

ŋ

consonant 𑷎

homorganic final nasal ̇ Coda

w

consonant 𑷒

r

consonant 𑷐

medial r ̰ Medial consonant.

ɽ

consonant 𑷗

ɽʰ

consonant 𑷘

l

consonant 𑷑

j

consonant 𑷏

Non-native sounds

extended repertoire consonant 𑷅̤ Equivalent to क्ष.

q

extended repertoire consonant 𑷊̤ Equivalent to क़.

f

extended repertoire consonant 𑶷̤ Equivalent to फ़.

v

extended repertoire consonant 𑷒̤ Equivalent to व़.

z

extended repertoire consonant 𑷇̤ Equivalent to ज़.

ʃ

extended repertoire consonant 𑷔̤ Equivalent to श.

ʂ

extended repertoire consonant 𑷔̈ Equivalent to ष.

ɣ

extended repertoire consonant 𑷌̤ Equivalent to ग़.

Other features

Auspicious sign

𑷛 is an auspicious sign. It has the general category of letter, and is pronounced ũɡɡu.

Encoding choices

Although usage is recommended here, content authors may well be unaware of such recommendations. Therefore, applications should look out for the non-recommended approach and treat it the same as the recommended approach wherever possible.

Codepoint sequences

Combining marks always follow the base character.

When a vowel is both nasalised and long, the length marker needs to be added after the nasalisation diacritic, so that the latter sits above the vowel letter. For example:

𑶺𑶲̃𑷙𑷈𑶲𑷐ʼ𑶵
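
An application that wants to tolerate the reversed order could swap the two characters before further processing. The following Python sketch shows one way to do that. It assumes that the vowel length sign is U+11DD9 and that nasalisation is U+0303. Whether the reversed order actually occurs in real text is not stated here, and the function name is hypothetical.

```python
NASALISATION = "\u0303"      # combining tilde above
LENGTH_SIGN = "\U00011DD9"   # assumed code point of the vowel length sign


def fix_length_nasal_order(text: str) -> str:
    """Move a tilde that follows the length sign so that it precedes it.

    The tilde then attaches to the vowel letter rather than the length sign.
    """
    return text.replace(LENGTH_SIGN + NASALISATION, NASALISATION + LENGTH_SIGN)


if __name__ == "__main__":
    vowel_u = "\U00011DB2"  # assumed code point of the vowel letter u
    reversed_order = vowel_u + LENGTH_SIGN + NASALISATION
    fixed = fix_length_nasal_order(reversed_order)
    print([f"U+{ord(ch):04X}" for ch in fixed])
```

Text that already uses the recommended order passes through unchanged, so the function can be applied repeatedly without harm.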

Numbers

Digits

Tolong Siki has a set of native digits.

𑷠␣𑷡␣𑷢␣𑷣␣𑷤␣𑷥␣𑷦␣𑷧␣𑷨␣𑷩
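
Converting between ASCII digits and the native digits is a matter of offsetting code points. The Python sketch below assumes that the ten digits shown above occupy U+11DE0 to U+11DE9 in ascending order. The function names are hypothetical. It avoids relying on `int()` recognising the native digits, since that depends on the Unicode version bundled with the interpreter.

```python
ZERO = 0x11DE0  # assumed code point of the Tolong Siki digit zero


def to_tolong_siki_digits(number: int) -> str:
    """Render a non-negative integer using Tolong Siki digits."""
    if number < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    return "".join(chr(ZERO + int(d)) for d in str(number))


def from_tolong_siki_digits(text: str) -> int:
    """Parse a string made up only of Tolong Siki digits."""
    values = [ord(ch) - ZERO for ch in text]
    if not values or any(v < 0 or v > 9 for v in values):
        raise ValueError("expected one or more Tolong Siki digits")
    return int("".join(str(v) for v in values))


if __name__ == "__main__":
    native = to_tolong_siki_digits(2025)
    print(native, from_tolong_siki_digits(native))
```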

Text direction

Tolong Siki text runs left to right in horizontal lines.

Show default bidi_class properties for characters in the Tolong Siki orthography described here.

Glyph shaping & positioning

Experiment with examples using the Tolong Siki character app.

Context-based shaping & positioning

Tolong Siki letters don't interact, so no special shaping is needed.

Where a base character carries multiple combining marks, these need to be arranged so as not to overlap.

Typographic units

Word boundaries

Words are separated by spaces.

Some words are hyphenated, eg.

𑷑𑶴𑶹-𑷑𑶴𑶹

Graphemes

Graphemes in Tolong Siki consist of single letters or letters with one or two combining marks. This means that text can be segmented into typographic units using grapheme clusters.
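
The segmentation described above can be approximated in a few lines. The Python sketch below is a simplification and not a full implementation of Unicode grapheme cluster rules: it attaches every combining mark (general category M) to the preceding character and treats everything else as the start of a new unit. The function name is hypothetical, and the example code points are assumptions.

```python
import unicodedata


def typographic_units(text: str) -> list[str]:
    """Split text into base characters with their trailing combining marks."""
    units: list[str] = []
    for ch in text:
        if units and unicodedata.category(ch).startswith("M"):
            units[-1] += ch   # combining mark: attach to the previous base
        else:
            units.append(ch)  # anything else starts a new unit
    return units


if __name__ == "__main__":
    # Assumed code points: letter m, vowel u with tilde, vowel length sign.
    sample = "\U00011DBA\U00011DB2\u0303\U00011DD9"
    for unit in typographic_units(sample):
        print([f"U+{ord(ch):04X}" for ch in unit])
```

The test is based on the general category of the mark only, so the sketch also behaves sensibly on interpreters whose Unicode data predates the Tolong Siki block.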

Phrase, sentence, and section delimiters are described in Phrase & section boundaries.

Punctuation & inline features

Phrase & section boundaries

,␣:␣;␣.␣?␣!␣।

Tolong Siki mostly uses ASCII punctuation, although । may apparently be used as well [up,4].

phrase

,

;

:

sentence

.

?

!

Bracketed text

See type samples.

(␣)

Tolong Siki commonly uses ASCII parentheses to insert parenthetical information into text.

  start end
standard

(

)

Line & paragraph layout

Line breaking & hyphenation

Lines are generally broken between words.

Page & book layout

Online resources


References