Kashmiri

Devanagari orthography notes

Updated 26 January, 2024

This page brings together basic information about the Devanagari script and its use for the Kashmiri language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Kashmiri using Unicode.

Referencing this document

Richard Ishida, Kashmiri (Devanagari) Orthography Notes, 26-Jan-2024, https://r12a.github.io/scripts/deva/ks

Sample

Source: [rt,84]

सिरीनगर छु अख सॏंदर शहर। यि छु जॆहलम दऺरियावॖक्यन दॖन बठ्यन प्यठ बऺसिथ। शहरा मंज़ छि ज़ॖ बाल, शेंकराचार तॖ हारि परबथ। निशात बाग, सालॖमऺर बाग, चॆशमॖ सऻही, पऺरी महल तॖ हऻरवन सरबंद छि सऻरिय सिरीनगर शहरस मंज़। अमर नाथ तॖ तुलमुल छि हॆंद्यन हॖंद्य जॖ पवित्र तीर्थस्थान। हज़रतबल तॖ खानकाह छि मॖसलमानन हॖमंजॖ मुक्कदस ज़ियारचॖ। कल्हन पँडिथ ओस कऺशीरि हुंद अख बॊड बारॖ तऻरीखदान तॖ लल द्यद तॖ नुंदॖ र्‌यॊश ॴस्य् जॖ थदि पायिक्य् सूफी शऻयिर। कऻशिर्‌यन हॖंज़ ज़बान छि कऻशुर।

Usage & history

Kashmiri is written in the Devanagari script by Hindus. Muslims use the Arabic script. Due to population migrations, the use of the Devanagari script to write Kashmiri has significantly dwindled, although there are efforts to revive its use, and a number of recent reforms have attempted to standardise the orthography.

कऻशुर

The orthographic reforms of 1995, 2002, and 2009 centred on the representation of vowel sounds. As a result, texts on the internet use a variety of approaches. The largest number of pages found were written just after the introduction of the 2002 reform, and so use slightly different vowel graphemes. This page presents the orthography based on the 2009 revision. For more information, see the section Previous orthographies.

For information about the script in general, see the Devanagari overview.

Basic features

Devanagari is an abugida. Consonant letters have an inherent vowel sound. Combining vowel signs are attached to the consonant to indicate that a different vowel follows the consonant. See the table in the right-hand column for a brief overview of features for the modern Kashmiri orthography using the Devanagari script.

Kashmiri uses fewer consonants than Hindi, but has more vowels. The orthography includes some Kashmiri-specific characters.

Devanagari text runs left-to-right in horizontal lines.

Orthographic syllables (as opposed to phonetic syllables) play a significant role in Devanagari text. An orthographic syllable starts at the beginning of any cluster of consonants and incorporates the whole cluster plus any following vowels and diacritics.
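The definition above can be illustrated with a small segmentation sketch. This is only an approximation under stated assumptions: the character ranges, the `SYLLABLE` pattern, and the choice to attach a trailing visible virama to the preceding cluster are all choices made here for illustration, not rules taken from a specification, and joiner characters are ignored.

```python
import re

# Assumed, simplified character classes for the Devanagari block.
CONS = "\u0915-\u0939\u0958-\u095F"          # consonant letters
INDEP = "\u0904-\u0914\u0972-\u0977"         # independent vowel letters
NUKTA = "\u093C"
VIRAMA = "\u094D"
VOWEL_SIGNS = "\u093A\u093B\u093E-\u094C\u094E\u094F\u0955-\u0957"
MODIFIERS = "\u0900-\u0903"                  # candrabindu, anusvara, visarga

# A cluster of consonants joined by viramas (or one independent vowel),
# followed by any vowel signs and diacritics, and an optional final virama.
SYLLABLE = re.compile(
    rf"(?:[{CONS}]{NUKTA}?(?:{VIRAMA}[{CONS}]{NUKTA}?)*|[{INDEP}])"
    rf"[{VOWEL_SIGNS}]*[{MODIFIERS}]*{VIRAMA}?"
)

def orthographic_syllables(word: str) -> list[str]:
    """Split a word into approximate orthographic syllables."""
    return SYLLABLE.findall(word)

# BA + E, then the cluster TA + VIRAMA + RA with its vowel sign I.
parts = orthographic_syllables("\u092C\u0947\u0924\u094D\u0930\u093F")
print([[f"U+{ord(ch):04X}" for ch in part] for part in parts])
```

Note that the second syllable keeps the whole cluster together with its vowel sign, which is why a pre-base vowel sign is drawn to the left of the entire conjunct.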


The 25 basic consonant letters are supplemented by repertoire extensions for 3 more sounds by applying the nukta diacritic to characters.

Phonetically, Kashmiri has only three types of plosive, illustrated here with the bilabial stop: unvoiced p, voiced b, and aspirated pʰ. The murmured bʱ is not used, although the letters for murmured sounds may crop up in Sanskrit or Hindi loan words. Kashmiri also has a set of retroflex consonants, and commonly palatalises consonants.

Consonant clusters are normally indicated using the virama between consonants, though often there is no marker for unpronounced inherent vowels. It is also common to see a visible virama, especially for palatalisation. Conjunct forms are otherwise expressed using the common Devanagari half-forms, stacked consonants, and ligated glyphs.

As part of a cluster, RA has special forms, but a palatalised RA at the beginning of a word needs special treatment to avoid a repa formation.

Syllable-final consonant nasal sounds are most commonly represented by a dedicated combining mark (anusvara). Kashmiri normally uses only one letter for m and one for n, although other nasals may occur in words borrowed from Sanskrit.


The Kashmiri orthography is an abugida with one inherent vowel. It represents other vowels using 16 vowel signs. All vowel signs are combining marks.

There are no multipart vowels and no circumgraphs. There is 1 pre-base vowel sign.

All standalone vowel sounds are written using one of 17 independent vowel letters. One vocalic letter is also used.

Two vowel signs and letters represent diphthongs.

Vowels may be nasalised, using the candrabindu diacritic.

Native digits are used. Punctuation is mostly ASCII, but dandas may be used for phrase boundaries.

Character index

Letters


Basic consonants

प␣फ␣ब␣त␣थ␣द␣ट␣ठ␣ड␣क␣ख␣ग␣च␣छ␣ज␣व␣स␣श␣ह␣म␣न␣र␣ल␣य␣ज़

Sanskrit/Hindi consonants

ण␣ञ␣ङ␣भ␣ध␣ढ␣झ␣घ␣ष

Vowels

इ␣ई␣ॶ␣ॷ␣उ␣ऊ␣ऎ␣ए␣ॳ␣ॴ␣ऒ␣ओ␣ॵ␣अ␣आ␣ऐ␣औ

Vocalic

Not used for modern Kashmiri

ॲ␣ऑ␣ऽ

Combining marks


Vowels

ि␣ी␣ॖ␣ॗ␣ु␣ू␣ॆ␣े␣ऺ␣ऻ␣ॊ␣ो␣ॏ␣ा␣ै␣ौ

Vocalic

Finals

Other

़␣्␣ँ

Not used for modern Kashmiri

ॅ␣ॉ

Punctuation

।␣॥

ASCII

!␣(␣)␣,␣:␣;␣?

Other

‌␣‍

To be investigated

%␣.␣[␣]␣§␣ʼ␣͏␣ॄ␣ॐ␣ॠ␣०␣१␣२␣३␣४␣५␣६␣७␣८␣९␣॰␣‑␣–␣—␣‘␣’␣“␣”␣†␣‡␣…␣‰␣′␣″␣⁠␣₹␣⹁

Structure

See the Devanagari overview.

Phonology

These are sounds for the Kashmiri language.


Vowel sounds

Plain vowels

i i ɨ ɨː ɨ ɨː u u e e o o ə əː ə əː ɔ ɔ a a

Diphthongs

əĭ əŭ əĭ əŭ

Consonant sounds

labial dental alveolar post-alveolar retroflex palatal velar glottal
stops p b t d     ʈ ɖ   k ɡ  
aspirated     ʈʰ    
affricates   t͡s   t͡ʃ d͡ʒ        
aspirated   t͡sʰ   t͡ʃʰ        
fricatives     s z ʃ       h
nasals m   n        
approximants w   l     j  
trills/flaps     r    

Kashmiri has no voiced aspirated sounds.

Vowels

Vowel summary table

The following table summarises the main vowel to character assignments.

ⓘ represents the inherent vowel. Dependent vowels are on the left, standalone vowels on the right. Diacritics are added to the vowels to indicate nasalisation (not shown here).

Plain:
ि␣ी␣ॖ␣ॗ␣ु␣ू
इ␣ई␣ॶ␣ॷ␣उ␣ऊ
ॆ␣े␣ॊ␣ो
ऎ␣ए␣ऒ␣ओ
ऺ␣ऻ
ॳ␣ॴ
ⓘ␣ा
अ␣आ
Diphthongs:
ै␣ौ
ऐ␣औ
Vocalics:

For additional details see the section Vowel sounds to characters.

Inherent vowel

ka U+0915 DEVANAGARI LETTER KA

a following a consonant is not written, but is seen as an inherent part of the consonant letter, so ka is written by simply using the consonant letter.

Combining marks used for vowels

की kiː U+0915 DEVANAGARI LETTER KA + U+0940 DEVANAGARI VOWEL SIGN II

Kashmiri uses the following dedicated combining marks for vowels.

ि␣ी␣ॖ␣ॗ␣ु␣ू␣ॆ␣े␣ऺ␣ऻ␣ॊ␣ो␣ॏ␣ा␣ ␣ै␣ौ

Some of these vowel signs are the result of recent standardisation of the orthography (see the section Previous orthographies).

Eight vowel signs are spacing combining characters, meaning that they consume horizontal space when added to a base consonant.
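One way to check the spacing/non-spacing split is to look at each vowel sign's Unicode general category: `Mc` (spacing combining mark) versus `Mn` (non-spacing mark). The sketch below assumes that the 16 vowel signs listed above are the code points given in `VOWEL_SIGNS`, and that the Python build ships Unicode data recent enough to include the Kashmiri additions.

```python
import unicodedata

# The 16 vowel signs listed in this section, written as escapes
# (assumed mapping from the glyph list to code points).
VOWEL_SIGNS = (
    "\u093F\u0940\u0956\u0957\u0941\u0942\u0946\u0947"
    "\u093A\u093B\u094A\u094B\u094F\u093E\u0948\u094C"
)

# Mc = spacing combining mark, Mn = non-spacing mark.
spacing = [ch for ch in VOWEL_SIGNS if unicodedata.category(ch) == "Mc"]
nonspacing = [ch for ch in VOWEL_SIGNS if unicodedata.category(ch) == "Mn"]

for ch in spacing:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
print(len(spacing), "spacing,", len(nonspacing), "non-spacing")
```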

All vowel signs are typed and stored after the base consonant, and the rendering process puts them in the correct place for display.

An orthography that uses vowel signs differs from one that uses simple diacritics or letters for vowels, in that the vowel signs are generally attached to an orthographic syllable, rather than just applied to the immediately preceding consonant letter. In other words, pre-base vowel sign components are rendered before a whole consonant cluster if that cluster is rendered as a conjunct (see the section Pre-base vowel sign for an example).

Vowel length

Vowel length is indicated by the vowel sign used (see the section Combining marks used for vowels).

Nasalisation

Nasalisation of the vowel in a syllable can be indicated using [U+0901 DEVANAGARI SIGN CANDRABINDU], eg. मुँह वाँदुर

Standalone vowels

Kashmiri represents standalone vowels using a set of independent vowel letters. The set contains a character to represent the inherent vowel sound.

इ␣ई␣ॶ␣ॷ␣उ␣ऊ␣ऎ␣ए␣ॳ␣ॴ␣ऒ␣ओ␣ॵ␣अ␣आ␣ ␣ऐ␣औ

As was the case for the vowel signs, some of these letters are the result of recent standardisation of the orthography (see the section Previous orthographies).

Pre-base vowel sign

कि ki U+0915 DEVANAGARI LETTER KA + U+093F DEVANAGARI VOWEL SIGN I

ि

One vowel sign appears to the left of the base consonant letter or cluster, eg. कि

This is a combining mark that is always typed and stored after the base consonant(s), ie. the codepoints follow the order in which the items are pronounced. The rendering process places the glyph before the base consonant without changing the code points.
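The stored order can be inspected directly. The following minimal sketch lists the code points of the syllable ki in memory order, using only the standard `unicodedata` module; the variable names are illustrative.

```python
import unicodedata

# KA followed by VOWEL SIGN I: the order in which the text is typed and stored.
syllable = "\u0915\u093F"

names = [unicodedata.name(ch) for ch in syllable]
for ch, name in zip(syllable, names):
    print(f"U+{ord(ch):04X} {name}")
```

The consonant comes first in memory even though the vowel sign is drawn to its left, so searching and sorting operate on the pronounced order.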

It is actually placed before the start of an orthographic syllable. In the figure below, the sequence of glyphs for the orthographic syllable is rendered VCC, whereas the pronunciation is CCV. In conjuncts with 3 consonants, it will still be rendered before the consonants.

बेत्रि
A prebase vowel, pronounced after a consonant cluster, but rendered to the left of the conjunct.

However, if the cluster is split by a visible virama, this creates two syllables and the pre-base vowel sign appears after the last consonant with the virama. The sequence of displayed glyphs is now CVC. If the conjunct contains 3 consonants, the displayed order will be CCVC.

Tones

Kashmiri is not a tonal language.

Vowel sign placement

The following list shows where vowel signs are positioned around a base consonant to produce vowels, and how many instances of that pattern there are.

Vowel absence

To kill the inherent vowel after a consonant Kashmiri uses [U+094D DEVANAGARI SIGN VIRAMA].

In conjuncts, the virama is usually not seen, but it is often seen in Kashmiri words that end with palatalisation (see palatalisation).

Kashmiri commonly suppresses the inherent vowel without a conjunct or visible virama appearing in the orthography, eg. अतलास रफतार

Vowel sounds to characters

This section maps Kashmiri vowel sounds to common graphemes in the Devanagari orthography, grouped by dependent ( d ), or standalone ( s ) forms.

Plain vowels

a
-

Inherent vowel.

Diphthongs

Previous orthographies

Prior to 1995 there was no standard way to write Kashmiri, and people spelled words in different ways. [rt,7] There was an orthographic standardisation reform in 1995, followed by another in 2002, and a further revision in 2009.

Prior to the orthographic reform in 2002, the phonemes ɨ and ɨː were respectively written ॅु [U+0945 DEVANAGARI VOWEL SIGN CANDRA E + U+0941 DEVANAGARI VOWEL SIGN U] and ॅू [U+0945 DEVANAGARI VOWEL SIGN CANDRA E + U+0942 DEVANAGARI VOWEL SIGN UU]. [ep] The 2002 reform replaced those with [U+0956 DEVANAGARI VOWEL SIGN UE] and [U+0957 DEVANAGARI VOWEL SIGN UUE], and a pair of equivalent independent vowels. [mkr]

It also brought in a number of other characters, shown in the table below.

phoneme 1995 2002 2009 Current usage
ɨ ॅु   [U+0956 DEVANAGARI VOWEL SIGN UE]
[U+0976 DEVANAGARI LETTER UE]
ɨː ॅू   [U+0957 DEVANAGARI VOWEL SIGN UUE]
[U+0977 DEVANAGARI LETTER UUE]
ə [U+093A DEVANAGARI VOWEL SIGN OE]
[U+0973 DEVANAGARI LETTER OE]
əː   [U+093B DEVANAGARI VOWEL SIGN OOE]
[U+0974 DEVANAGARI LETTER OOE]
e े'   [U+0946 DEVANAGARI VOWEL SIGN SHORT E]
[U+090E DEVANAGARI LETTER SHORT E]
o ो'   [U+094B DEVANAGARI VOWEL SIGN O]
[U+0913 DEVANAGARI LETTER O]
ɔ   [U+094F DEVANAGARI VOWEL SIGN AW]
[U+0975 DEVANAGARI LETTER AW]
Glyphs changed during the 2002 and 2009 reforms, showing vowel signs.

Another revision occurred in 2009, resulting in the set of characters used in this page. [l] Principal changes included the substitution of [U+0973 DEVANAGARI LETTER OE] and [U+0974 DEVANAGARI LETTER OOE] for [U+0972 DEVANAGARI LETTER CANDRA A] and [U+0911 DEVANAGARI LETTER CANDRA O], respectively.

The reform also introduced a new character, [U+0975 DEVANAGARI LETTER AW], and its equivalent vowel sign, [U+094F DEVANAGARI VOWEL SIGN AW], to replace the use of -्व [U+094D DEVANAGARI SIGN VIRAMA + U+0935 DEVANAGARI LETTER VA] for the vowel ɔ. For example, the following shows the spelling changes for the word sɔkʰmoth.

Old: *स्वख
New: सॏख

The new characters were added in Unicode v6. In the gap, there was some experimentation with Gurmukhi characters for the phonemes ɨ and ɨː.

Vocalics

ऋ␣ृ

Observation: Raina & Trakru describe the use of a single vocalic. It appears to be used for Sanskrit-derived words, and two of the four example words given also include the letter [U+0937 DEVANAGARI LETTER SSA], which is not usually used for Kashmiri.

One of the examples also uses a vowel sign to modify the inherent sound of the standalone vocalic, which is somewhat unusual. The example is ऋॆष्य्.

Consonants

Consonant summary table

The following table summarises the main consonant to character assignments.

Plosives
प␣ब␣त␣द␣ट␣ड␣क␣ग␣ ␣फ␣थ␣ठ␣ख
Affricates
च़␣च␣ज␣ ␣छ़␣छ
Fricatives
व␣स␣ज़␣श␣ह
Nasals
म␣न
Other
र␣ल␣य
Finals

For additional details see the section Consonant sounds to characters.

Basic consonants

Basic set of consonants used for Kashmiri.

प␣ब␣फ␣त␣द␣थ␣ट␣ड␣ठ␣क␣ग␣ख
च़␣छ़␣च␣ज␣छ
व␣स␣ज़␣श␣ह
म␣न
र␣ल␣य

Nuktas

Three items in the lists above are combinations of [U+093C DEVANAGARI SIGN NUKTA] and another character.

च़␣छ़␣ज़

Only one of those combinations exists in precomposed form. The other two have to be typed and stored as two characters.

NFC does not recombine the decomposed version of this character into a precomposed character. Instead, normalisation produces decomposed forms when using both NFC and NFD. So both approaches are canonically equivalent, but the decomposed form is recommended by the Unicode Standard.
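The behaviour described above can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the variable names are illustrative, and it assumes that the precomposed letter referred to is U+095B DEVANAGARI LETTER ZA, with the other two nukta combinations written as CA + NUKTA and CHA + NUKTA.

```python
import unicodedata

precomposed = "\u095B"        # DEVANAGARI LETTER ZA
decomposed = "\u091C\u093C"   # JA + NUKTA

# Both normalisation forms yield the decomposed sequence.
for form in ("NFC", "NFD"):
    result = unicodedata.normalize(form, precomposed)
    print(form, [f"U+{ord(ch):04X}" for ch in result])

# The other two nukta combinations have no precomposed form at all,
# so NFC leaves them as two code points.
others = ["\u091A\u093C", "\u091B\u093C"]   # CA + NUKTA, CHA + NUKTA
nfc_lengths = [len(unicodedata.normalize("NFC", s)) for s in others]
print(nfc_lengths)
```

In practice this means that text normalised to either form ends up with the decomposed sequence, so storing the decomposed form from the outset avoids surprises when comparing strings.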

Palatalisation

Palatalisation is a frequent feature of Kashmiri words. It is represented using [U+092F DEVANAGARI LETTER YA] as the final element of a cluster.

Inside a word the YA forms a conjunct or a cluster with the preceding consonant, eg. त्यम्बॖर

At the end of a word, the YA is followed by a visible virama, eg. थऺन्य्

Palatalisation preceding the inherent vowel is typically transcribed using ê, eg. têmbar. At the end of a word, it is often transcribed using a superscript i, eg. tånⁱ.

Some care needs to be taken when the palatalisation follows r at the beginning of a word, so as to prevent the sequence forming a repha, ie. र्य [U+0930 DEVANAGARI LETTER RA + U+094D DEVANAGARI SIGN VIRAMA + U+092F DEVANAGARI LETTER YA]. The required rendering can be achieved using र्‌य [U+0930 DEVANAGARI LETTER RA + U+094D DEVANAGARI SIGN VIRAMA + U+200C ZERO WIDTH NON-JOINER + U+092F DEVANAGARI LETTER YA], eg. र्‌यथ Word-internal use of the repha with palatalisation can, however, be seen, eg. पऻर्यज़ान
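The word-initial case can be handled mechanically. The sketch below uses a hypothetical helper, `protect_initial_rya`, which inserts U+200C ZERO WIDTH NON-JOINER after the virama when a word begins with RA + VIRAMA + YA, and leaves word-internal sequences alone. It assumes the input is a single word and that the initial sequence always represents palatalised r.

```python
RA, VIRAMA, YA, ZWNJ = "\u0930", "\u094D", "\u092F", "\u200C"

def protect_initial_rya(word: str) -> str:
    """Insert ZWNJ after a word-initial RA + VIRAMA when YA follows.

    Hypothetical helper for illustration; word-internal sequences,
    where the repha may be wanted, are left untouched.
    """
    if word.startswith(RA + VIRAMA + YA):
        return RA + VIRAMA + ZWNJ + word[2:]
    return word

fixed = protect_initial_rya("\u0930\u094D\u092F\u0925")
print([f"U+{ord(ch):04X}" for ch in fixed])
```

Because the helper only looks for the bare sequence at the start of the word, running it twice on the same text has no further effect.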

Since they are palatal sounds, the YA is not needed after the following consonants.

च␣ज␣छ␣श

Sanskrit letters

Words directly borrowed from Sanskrit and Hindi may use additional characters that are not normally used in Kashmiri. [mkr]

Nasals

ण␣ञ␣ङ

Kashmiri normally uses only 2 of the 5 standard nasal letters in Sanskrit. The missing letters shown just above are normally rendered in Kashmiri using [U+0902 DEVANAGARI SIGN ANUSVARA] [mkr], eg. compare *ब्रह्मण्ड b͓rh͓mɳ͓ɖ with ब्रह्मांड.

They may, however, be found occasionally in conjuncts [rt,9], eg. ang in the Kashmiri orthography is written अंग but may be written अङ्ग.

On the other hand, they normally never appear outside of a conjunct, ie. ganapatʰ is more properly written in Kashmiri as गनपथ gnptʰ rather than the Sanskrit गणपथ gɳptʰ. That said, some writers will nonetheless use the Sanskrit forms. [rt,9]

Voiced aspirated plosives

भ␣ध␣ढ␣झ␣घ

The voiced aspirated plosive letters of Devanagari shown just above may be used to write Sanskrit words, or those words may be written without, eg. dharma may be written धर्म using Sanskrit letters, or दर्म in the Kashmiri style. [rt,9]

Others

ष␣क्ष␣ज्ञ

The letter and the two special conjuncts listed just above are also not used in Kashmiri, although they may pop up sometimes in words borrowed directly from Sanskrit.

Onsets

Clusters of consonant letters at the beginning of an orthographic syllable occur in Kashmiri, and they are handled as described in the section clusters.

Special behaviours include handling of RA at the beginning of an orthographic syllable (see rconjuncts).

Finals

[U+0902 DEVANAGARI SIGN ANUSVARA] represents a nasal that is homorganic with a following consonant. It is positioned over the previous consonant or vowel sign [mkr], eg. पॖंच़ॗहज़ॊंग

See also the candrabindu diacritic, which nasalises a vowel.

The visarga is not used in Kashmiri. [rt,8]

Consonant clusters

See the Devanagari overview.

Consonant length

Gemination and consonant lengthening are handled using the normal approach to consonant clusters (see clusters).

Consonant sounds to characters

This section maps Kashmiri consonant sounds to common graphemes using the Devanagari orthography.


Stops

Affricates

Fricatives

Other sonorants

Palatalisation

Encoding choices

This section looks at alternative strategies for typing and storing letters used by Kashmiri, taking into consideration the effects of normalising the text using Unicode Normalisation Form D (NFD), and Normalisation Form C (NFC).

Vowel signs

The single code points on the left should be used, and not the sequences on the right, because normalisation does not make them equivalent. The content will therefore be regarded as different, which will affect searching and other operations on the text.

Use | Do not use
[U+094B DEVANAGARI VOWEL SIGN O] | [U+093E DEVANAGARI VOWEL SIGN AA + U+0947 DEVANAGARI VOWEL SIGN E]
[U+094C DEVANAGARI VOWEL SIGN AU] | [U+093E DEVANAGARI VOWEL SIGN AA + U+0948 DEVANAGARI VOWEL SIGN AI]
[U+094A DEVANAGARI VOWEL SIGN SHORT O] | [U+093E DEVANAGARI VOWEL SIGN AA + U+0946 DEVANAGARI VOWEL SIGN SHORT E]
[U+093B DEVANAGARI VOWEL SIGN OOE] | [U+093E DEVANAGARI VOWEL SIGN AA + U+093A DEVANAGARI VOWEL SIGN OE]
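That normalisation leaves these pairs distinct can be confirmed with a short check. This sketch uses the standard `unicodedata` module; `PAIRS` and `same_after` are names introduced here for illustration, and the pairs are copied from the table above.

```python
import unicodedata

# (recommended single code point, look-alike sequence to avoid)
PAIRS = [
    ("\u094B", "\u093E\u0947"),  # VOWEL SIGN O     vs AA + E
    ("\u094C", "\u093E\u0948"),  # VOWEL SIGN AU    vs AA + AI
    ("\u094A", "\u093E\u0946"),  # VOWEL SIGN SHORT O vs AA + SHORT E
    ("\u093B", "\u093E\u093A"),  # VOWEL SIGN OOE   vs AA + OE
]

def same_after(form: str, a: str, b: str) -> bool:
    """True if a and b become identical under the given normalisation form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

results = {
    form: [same_after(form, single, seq) for single, seq in PAIRS]
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}
print(results)
```

Since no normalisation form unifies the two spellings, a search for one spelling will not find the other unless the application maps between them itself.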

The next table shows vowel signs that were rendered obsolete by recent standardisation work. Use the characters on the left, rather than those on the right. (See the section Previous orthographies.)

Use | Do not use
[U+0956 DEVANAGARI VOWEL SIGN UE] | ॅु [U+0945 DEVANAGARI VOWEL SIGN CANDRA E + U+0941 DEVANAGARI VOWEL SIGN U]
[U+0957 DEVANAGARI VOWEL SIGN UUE] | ॅू [U+0945 DEVANAGARI VOWEL SIGN CANDRA E + U+0942 DEVANAGARI VOWEL SIGN UU]
[U+093A DEVANAGARI VOWEL SIGN OE] | [U+0945 DEVANAGARI VOWEL SIGN CANDRA E], [U+093D DEVANAGARI SIGN AVAGRAHA]
[U+093B DEVANAGARI VOWEL SIGN OOE] | [U+0949 DEVANAGARI VOWEL SIGN CANDRA O]
[U+0946 DEVANAGARI VOWEL SIGN SHORT E] | े' [U+0947 DEVANAGARI VOWEL SIGN E + U+0027 APOSTROPHE]
[U+094B DEVANAGARI VOWEL SIGN O] | ो' [U+094B DEVANAGARI VOWEL SIGN O + U+0027 APOSTROPHE]
[U+094F DEVANAGARI VOWEL SIGN AW] | [U+0935 DEVANAGARI LETTER VA]

Independent vowels

Again, the single code points on the left should be used, and not the sequences on the right, because normalisation does not make them equivalent.

Use | Do not use
[U+0906 DEVANAGARI LETTER AA] | [U+0905 DEVANAGARI LETTER A + U+093E DEVANAGARI VOWEL SIGN AA]
[U+0973 DEVANAGARI LETTER OE] | [U+0905 DEVANAGARI LETTER A + U+093A DEVANAGARI VOWEL SIGN OE]
[U+0974 DEVANAGARI LETTER OOE] | [U+0905 DEVANAGARI LETTER A + U+093B DEVANAGARI VOWEL SIGN OOE]
[U+0913 DEVANAGARI LETTER O] | [U+0905 DEVANAGARI LETTER A + U+094B DEVANAGARI VOWEL SIGN O]
[U+0914 DEVANAGARI LETTER AU] | [U+0905 DEVANAGARI LETTER A + U+094C DEVANAGARI VOWEL SIGN AU]
[U+0912 DEVANAGARI LETTER SHORT O] | [U+0905 DEVANAGARI LETTER A + U+094A DEVANAGARI VOWEL SIGN SHORT O]
[U+0976 DEVANAGARI LETTER UE] | [U+0905 DEVANAGARI LETTER A + U+0956 DEVANAGARI VOWEL SIGN UE]
[U+0977 DEVANAGARI LETTER UUE] | [U+0905 DEVANAGARI LETTER A + U+0957 DEVANAGARI VOWEL SIGN UUE]
[U+0910 DEVANAGARI LETTER AI] | [U+090F DEVANAGARI LETTER E + U+0947 DEVANAGARI VOWEL SIGN E]
[U+090E DEVANAGARI LETTER SHORT E] | [U+090F DEVANAGARI LETTER E + U+0946 DEVANAGARI VOWEL SIGN SHORT E]
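Because normalisation will not repair these sequences, an application that wants consistent text has to map them itself. The following is a minimal sketch of one possible approach, assuming a simple lookup table built from the rows above; `REPAIRS` and `repair_independent_vowels` are hypothetical names, not part of any library.

```python
# Discouraged sequence -> recommended single code point (from the table above).
REPAIRS = {
    "\u0905\u093E": "\u0906",  # A + AA sign      -> LETTER AA
    "\u0905\u093A": "\u0973",  # A + OE sign      -> LETTER OE
    "\u0905\u093B": "\u0974",  # A + OOE sign     -> LETTER OOE
    "\u0905\u094B": "\u0913",  # A + O sign       -> LETTER O
    "\u0905\u094C": "\u0914",  # A + AU sign      -> LETTER AU
    "\u0905\u094A": "\u0912",  # A + SHORT O sign -> LETTER SHORT O
    "\u0905\u0956": "\u0976",  # A + UE sign      -> LETTER UE
    "\u0905\u0957": "\u0977",  # A + UUE sign     -> LETTER UUE
    "\u090F\u0947": "\u0910",  # E + E sign       -> LETTER AI
    "\u090F\u0946": "\u090E",  # E + SHORT E sign -> LETTER SHORT E
}

def repair_independent_vowels(text: str) -> str:
    """Replace each discouraged sequence with its single code point."""
    for sequence, letter in REPAIRS.items():
        text = text.replace(sequence, letter)
    return text

repaired = repair_independent_vowels("\u0905\u093E\u0936")
print([f"U+{ord(ch):04X}" for ch in repaired])
```

A plain string replacement is enough here because no key in the table is a prefix of another. Sequences that stack further vowel signs on top of these are outside the scope of this sketch.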

The next table shows independent vowel letters that replaced forms rendered obsolete by recent standardisation work. Use the characters on the left, rather than those on the right. (See the section Previous orthographies.)

Use | Do not use
[U+0976 DEVANAGARI LETTER UE] | ॅु [U+0945 DEVANAGARI VOWEL SIGN CANDRA E + U+0941 DEVANAGARI VOWEL SIGN U]
[U+0977 DEVANAGARI LETTER UUE] | ॅू [U+0945 DEVANAGARI VOWEL SIGN CANDRA E + U+0942 DEVANAGARI VOWEL SIGN UU]
[U+0973 DEVANAGARI LETTER OE] | [U+0972 DEVANAGARI LETTER CANDRA A], [U+093D DEVANAGARI SIGN AVAGRAHA]
[U+0974 DEVANAGARI LETTER OOE] | [U+0911 DEVANAGARI LETTER CANDRA O]
[U+090E DEVANAGARI LETTER SHORT E] | े' [U+0947 DEVANAGARI VOWEL SIGN E + U+0027 APOSTROPHE]
[U+0913 DEVANAGARI LETTER O] | ो' [U+094B DEVANAGARI VOWEL SIGN O + U+0027 APOSTROPHE]
[U+0975 DEVANAGARI LETTER AW] | [U+0935 DEVANAGARI LETTER VA]

Consonants

The table just below shows precomposed and decomposed representations of a Kashmiri letter, which Unicode treats as canonically equivalent, meaning that you can use either. The Unicode Standard, however, recommends the decomposed version, because normalisation does not reconstitute the precomposed form from the decomposed one.

Recommended | Not recommended
ज़ [U+091C DEVANAGARI LETTER JA + U+093C DEVANAGARI SIGN NUKTA] | [U+095B DEVANAGARI LETTER ZA]

Numbers, dates, currency, etc

Observation: Clarification needed on whether or not Kashmiri uses indic digits, and the rupee sign. Sources used so far keep to ASCII digits, but the Devanagari block has a set of digits that are used in Hindi.

Text direction

Kashmiri in the Devanagari script runs left to right in horizontal lines.
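The default directional properties behind this can be looked up with `unicodedata.bidirectional`. The sketch below inspects a small, arbitrarily chosen sample (a consonant, a spacing vowel sign, the virama, the candrabindu, and the danda). It is only an illustration, not a full survey of the orthography.

```python
import unicodedata

# KA, VOWEL SIGN I, VIRAMA, CANDRABINDU, DANDA
sample = "\u0915\u093F\u094D\u0901\u0964"

# Map each character name to its default Bidi_Class value.
classes = {unicodedata.name(ch): unicodedata.bidirectional(ch) for ch in sample}
for name, bidi_class in classes.items():
    print(f"{bidi_class:4} {name}")
```

Letters, spacing marks, and the danda report `L` (left-to-right), while non-spacing marks report `NSM` and so take their direction from the base character they attach to.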


Glyph shaping & positioning

You can experiment with examples using the Kashmiri character app.

Glyph joining

Within a Kashmiri word, spacing glyphs are typically joined together at the top bar (shirorekha).

काहवऺट

The top bar extends across or through most spacing letters, including both consonants and vowels, but some letters create a gap in the line (while still joining at either side). Two such letters can be seen in the following example.

अथॖ

Characters that create these gaps include digits and the following:

ॶ␣ॷ␣ॳ␣ऒ␣ॴ␣ओ␣अ␣आ␣ॵ␣औ␣थ␣श

Alignment of the top bar may be appropriate when mixing text of different sizes (see initials). Also, when Devanagari text is mixed with another script that also has a top bar, such as Gurmukhi, the top bars of both scripts may need to be aligned.

Context-based shaping & positioning

Context-based shaping

The shape of a character when displayed can vary, often dramatically, according to the context.

One very common example in most indic scripts is the handling of 'conjunct consonants', ie. groups of consonants with no intervening vowel sounds. Since consonants in indic scripts have an inherent vowel sound, when two consonants are combined this way you have to indicate that the vowel of the initial consonant is suppressed. This is normally done by altering the shape of the first consonant, or merging the shape of the two consonants.

To tell the font to do this, in Unicode you add  ् [U+094D DEVANAGARI SIGN VIRAMA​] between the two consonants. This produces the change in the shapes of the glyphs that indicates to the reader that this is a conjunct. The actual outcome is font dependent. For the word below which contains a conjunct of two [U+0932 DEVANAGARI LETTER LA] characters (making a long L sound) you may see a 'half-form' used for the first LA (shown on the left) or you may see (as shown on the right) a ligated form.

दिल्ली दिल्‍ली
Alternative representations of a geminated l consonant.

There are other types of context-based shaping, which are font specific. One is shown below. The width of the glyph for  ि [U+093F DEVANAGARI VOWEL SIGN I​] differs according to the base character to which it is attached.

हालाँकि प्रचलित
Context-sensitive shaping of the glyph for i.

Multiple combining characters

Diacritics regularly combine with a vowel sign attached to the same consonant or consonant cluster. The example below shows two combining characters that are positioned above the base character in a very common form of the verb 'to be'. One is [U+0948 DEVANAGARI VOWEL SIGN AI​], and the other the nasalisation mark [U+0902 DEVANAGARI SIGN ANUSVARA​].

हैं
Multiple combining characters over one base character.

Context-based positioning

Combining characters need to be placed in different positions, according to the context.

The example on the left below displays the dot (anusvara) immediately over the long vertical stroke. The example to the right has moved the dot slightly to the right in order to accommodate the vowel sign.

अंधे में
Context-sensitive placement of the anusvara diacritic.

In the following the image to the left shows the normal position of  ू [U+0942 DEVANAGARI VOWEL SIGN UU​], beneath the first letter. The example on the right shows that character displayed higher up and to the right when combined with the base character [U+0930 DEVANAGARI LETTER RA].

पूजा परू
Context-dependent placement of the glyph representing ra.

Graphemes

Grapheme clusters

tbd

Punctuation & inline features

Word boundaries

Word boundaries are indicated by spaces.

Kashmiri sometimes uses a hyphen to separate parts of a compound noun, eg. ॶंह-रारय

Phrase & section boundaries

,␣;␣:␣।␣?␣!␣॥

Devanagari uses standard Latin punctuation, but also has its own version of a full stop, [U+0964 DEVANAGARI DANDA].

phrase
  , [U+002C COMMA]
  ; [U+003B SEMICOLON]
  : [U+003A COLON]
sentence
  । [U+0964 DEVANAGARI DANDA]
  ? [U+003F QUESTION MARK]
  ! [U+0021 EXCLAMATION MARK]
paragraph
  ॥ [U+0965 DEVANAGARI DOUBLE DANDA]

Bracketed text

(␣)

Kashmiri commonly uses ASCII parentheses to insert parenthetical information into text.

 | start | end
standard | ( [U+0028 LEFT PARENTHESIS] | ) [U+0029 RIGHT PARENTHESIS]

Line & paragraph layout

Line breaking & hyphenation

Devanagari is normally wrapped at word boundaries.

Line-edge rules

As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.


The following list gives examples of typical behaviours for characters used in modern Hindi. Context may affect the behaviour of some of these and other characters.


  • “ ‘ (   should not be the last character on a line
  • ” ’ ) ? ! ।॥ %   should not begin a new line

Line breaking should also not move a danda or double danda to the beginning of a new line, even if they are preceded by a space character. These punctuation characters should behave in the same way as a full stop does in English text.

In-word line-breaks

Devanagari text can be hyphenated during line wrap, though it is not very common (unlike several south Indian scripts).

Hyphenation adds a hyphen at the end of the line when a word is broken.

Page & book layout

Online resources

  1. Koshur, An introduction to spoken Kashmiri
  2. Let us learn Kashmiri
  3. A Dictionary of Kashmiri Proverbs

References