/*
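The table defined below is keyed by single characters, so the notes for a piece of text can be gathered by walking its code points. The following is a minimal sketch of such a lookup, not the code that actually consumes this file: `sampleDetails` and `notesFor` are hypothetical names, and the two sample entries stand in for the full table.

```javascript
// Stand-in for a table keyed by single characters (hypothetical sample data).
const sampleDetails = {
  '\u{0628}': 'b consonant.',
  '\u{062A}': 't consonant.',
};

// Collect the notes for each distinct character of `text` that has an entry.
function notesFor(text, table) {
  const found = [];
  const seen = new Set();
  // for...of iterates by code point, so supplementary characters are not split.
  for (const ch of text) {
    if (seen.has(ch)) continue;
    seen.add(ch);
    if (Object.prototype.hasOwnProperty.call(table, ch)) {
      const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
      found.push({ char: ch, codePoint: hex, note: table[ch].trim() });
    }
  }
  return found;
}

console.log(notesFor('\u0628\u062A\u0628\u0644', sampleDetails));
```

Because the table is a plain object literal, a key that is repeated keeps only its last value, which matters wherever a character is listed twice.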
*/ var charDetails = { '\u{0600}': `
`, '\u{0601}': `
`, '\u{0602}': `
`, '\u{0603}': `
`, '\u{0604}': `
`, '\u{0605}': `
`, '\u{0606}': `؆
`, '\u{0607}': `؇
`, '\u{0608}': `؈
`, '\u{0609}': `؉
`, '\u{060A}': `؊
`, '\u{060B}': `؋
`, '\u{060C}': `،
`, '\u{060D}': `؍
`, '\u{060E}': `؎
`, '\u{060F}': `؏
`, '\u{0610}': `ؐ
`, '\u{0611}': `ؑ
`, '\u{0612}': `ؒ
`, '\u{0613}': `ؓ
`, '\u{0614}': `ؔ
A sign placed over the name or nom-de-plume of a poet, or in some writings used to mark all proper names.
The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg.
عطاشادؔ ataː ʃaː Ata Shad (author's name)
`, '\u{0615}': `ؕ
`, '\u{0616}': `ؖ
`, '\u{0617}': `ؗ
`, '\u{0618}': `ؘ
`, '\u{0619}': `ؙ
`, '\u{061A}': `ؚ
`, '\u{061B}': `؛
`, '\u{061C}': `
`, '\u{061E}': `؞
`, '\u{061F}': `؟
`, '\u{0620}': `ؠ
ʲ palatalisation marker. This is a frequent feature of Kashmiri words. After a syllable onset the glyph has a small circle below (gol yeh); after a coda the swash form is used (taler yeh). The joining forms are ؠ ؠ ؠ ؠ
Examples: ہوٗنؠ گِلۂرؠ لؠو زؠو رؠتہٕ کول
When palatalisation is applied to the coda of a syllable within a lexical item, the swash form is always used. To produce this, it can be followed by a space or a zero-width non-joiner. It is common for single lexical items to be split in Kashmiri.
ۂسؠ تِنؠ
کھٔرؠ پھٕ
`, '\u{0621}': `ء
`, '\u{0622}': `آ
aː is written using this character in isolate and word-initial positions:
initial | 0622آزاد |
---|---|
medial | 0627 |
final | 0627 |
isolated | 0622 |
This character and the decomposed sequence 0627 0653 are canonically equivalent.
`, '\u{0623}': `أ
ə is written using this character in isolate and word-initial positions:
isolate | 0623 |
---|---|
initial | 0623أنْز |
medial | 0654ژٔر |
final | 0654 |
The atomic and decomposed versions of the initial and isolate forms are canonically equivalent.
`, '\u{0624}': `ؤ
wə atomic CV letter. It is written with the following forms, which include this character in all positions in precomposed text.
isolate | 0624 0648 0654 |
ؤہراتھ ؤٹِل |
---|---|---|
initial | 0624 0648 0654 |
|
medial | 0624 0648 0654 |
|
final | 0624 0648 0654 |
This character decomposes and recomposes during normalisation.
`, '\u{0625}': `إ
ɨ initial vowel. It is written using this character in isolate & word-initial positions:
isolate | 0625 |
---|---|
initial | 0625 |
medial | 0655 |
final | 0655 |
The atomic and decomposed versions of the initial and isolate forms are canonically equivalent.
`, '\u{0626}': `ئ
Use 06CC 0654 instead.
`, '\u{0627}': `ا
aː is written using this character in isolate and word-initial positions:
initial | 0622 |
---|---|
medial | 0627آتھوار |
final | 0627 |
isolated | 0622 |
Vowel support
∅ Used as an unpronounced support or prefix for all other vowels in isolate or initial positions.
Other vowels
ɔː is written using this character in all positions:
initial | 0627 06C4 0627 |
---|---|
medial | 06C4 0627سۄاد |
final | 06C4 0627 |
isolated | 0627 06C4 0627 |
This character combines with others in decomposed text to create canonically equivalent alternatives.
Combinations
ۄا
ɔː is ۄا
اِ
i is اِ
ایٖ
iː is ایٖ
اٟ
ɨː is اٟ
اُ
u is اُ
اوٗ
uː is اوٗ
اێ
e is اێ
ای
eː is ای
اۆ
o is اۆ
او
oː is او
اۄ
ɔ is اۄ
اَ
a is اَ
`, '\u{0628}': `
ب
b consonant. بُڈؠ بَب
There is no aspirated version of this phoneme in Kashmiri.
`, '\u{0629}': `ة
`, '\u{062A}': `ت
t consonant.
سَتہٕ تُت
Combinations
تھ
tʰ is تھ
تھُج
بُتھ
`, '\u{062B}': `ث
s in words of Arabic and/or Persian origin.
`, '\u{062C}': `ج
d͡ʒ consonant.
جوٚنوٗب
تھُج
There is no aspirated version of this phoneme in Kashmiri.
`, '\u{062D}': `ح
h in words of Arabic and/or Persian origin.
حُکُمران
`, '\u{062E}': `خ
kʰ~x in words of Arabic and/or Persian origin.
خٔرِنؠ
بَطُخ
`, '\u{062F}': `د
d consonant. دۄد
There is no aspirated version of this phoneme in Kashmiri.
`, '\u{0630}': `ذ
z in words of Arabic and/or Persian origin.
ذاتھ
`, '\u{0631}': `ر
r consonant. ریش زامتُر زُر
`, '\u{0632}': `ز
z consonant. زۆن زٲمؠ زٕ
`, '\u{0633}': `س
s consonant. ساس اِنسان
`, '\u{0634}': `ش
ʃ consonant. شیر کٔشیٖر ہَش
`, '\u{0635}': `ص
s in words of Arabic and/or Persian origin. صِفَر
`, '\u{0636}': `ض
z in words of Arabic and/or Persian origin.
ضٔمیٖر
`, '\u{0637}': `ط
t in words of Arabic and/or Persian origin, eg. طوطہٕ
`, '\u{0638}': `ظ
z in words of Arabic and/or Persian origin.
ظٲلِم
`, '\u{0639}': `ع
∅ in loan words before a standalone vowel in word-initial position, rather than ا.
عَقٕل
عٔشِق
aː in some words.
جُمعہ
`, '\u{063A}': `غ
ɡ in words of Arabic and/or Persian origin, eg. مَغرِب
`, '\u{063B}': `ػ
`, '\u{063C}': `ؼ
`, '\u{063D}': `ؽ
`, '\u{063E}': `ؾ
`, '\u{063F}': `ؿ
`, '\u{0640}': `ـ
`, '\u{0641}': `ف
f in words of Arabic and/or Persian origin. صِفَر سَرُف
`, '\u{0642}': `ق
k in words of Arabic and/or Persian origin. خلق مَشرِق
`, '\u{0643}': `ك
Use ک [U+06A9 ARABIC LETTER KEHEH] instead.
`, '\u{0644}': `ل
l consonant. لوو کۄلَے چھَلُن وۄزُل
`, '\u{0645}': `م
m consonant. مول بیمہٕ زام
This has a special hooked form before alef or lam, eg. ماس
`, '\u{0646}': `ن
n consonant.
نَنُن
گَرُن
̃ vowel nasalisation is written using نْ.
پَنْداہ
پانْژھ
وانْدُر
See also 06BA, used at the end of a word.
`, '\u{0647}': `ه
`, '\u{0648}': `و
w consonant. رِوُن گَروول لوو
ʋ sometimes, especially word-initial, eg. وَدُن واتُن آپراوُن
Vowel usage
uː is written using this character in all positions:
initial | 0627 0648 0657اوٗترٕ |
---|---|
medial | 0648 0657نوٗل |
final | 0648 0657قوبوٗ |
isolated | 0627 0648 0657 |
oː is written using this character in all positions:
initial | 0627 0648اوش |
---|---|
medial | 0648پوش |
final | 0648 |
isolated | 0627 0648 |
o is written using this character in all positions:
initial | 0627 0648 065Aاوٚبُر |
---|---|
medial | 0648 065Aزۆن |
final | 0648 065A |
isolated | 0627 0648 065A |
Kashmiri content may contain the precomposed character 06C6 (intended for Kazakh v), which looks identical, but is not linked by normalisation. The decomposed sequence should therefore be used instead.
See also 06C4.
`, '\u{0649}': `ى
`, '\u{064A}': `ي
Use 06CC instead.
`, '\u{064B}': `ً
`, '\u{064C}': `ٌ
`, '\u{064D}': `ٍ
`, '\u{064E}': `َ
a is written using this character in all positions:
initial | 0627 064Eاَرَب |
---|---|
medial | 064Eہَرُد |
final | 064E |
isolated | 0627 064E |
`, '\u{064F}': `ُ
u is written using this character in all positions:
initial | 0627 064Fاُجرَتھ |
---|---|
medial | 064Fسَرُف |
final | 064F |
isolated | 0627 064F |
`, '\u{0650}': `ِ
i is written with the following forms, which include this character in all positions:
initial | 0627 0650اِنسان |
---|---|
medial | 0650صِفَر |
final | 0650زٲمِیہِ |
isolated | 0627 0650 |
Observation: When used word-finally, this often appears in the combination ہِ (the 'h' is not pronounced).
`, '\u{0651}': `ّ
Doubles the value of the consonant it is attached to.
`, '\u{0652}': `ْ
Indicates a medial consonant or vowel nasalisation. It is not used to indicate consonant clusters or consonants that are not followed by a vowel.
Medial consonants
To indicate a medial -r or -j, Kashmiri places this diacritic above the medial letter itself. This helps distinguish syllable boundaries. Putting the jazm above the medial, rather than above the consonant before it, is a significant difference from usage in other Arabic-script orthographies, where it would normally appear over the letter that is not followed by a vowel. This also means that the base character may carry both a vowel diacritic and the jazm.
برَْگ
کرُْہُن
The medial -j is less common than the -r, and can sometimes be written using a palatalisation marker instead.
کیْوٚم
Nasalisation
The combination 0646 0652 indicates nasalisation of the preceding vowel. In this case the jazm is never accompanied by a vowel diacritic.
اَنْگریٖزی
مٲنْش
A word-final nasalisation is very rare, but when it occurs it is written using 06BA, as in Urdu.
Observation: To make the jazm display with the inverted-v shape in the Noto Nastaliq Urdu font it is necessary to set the language of the text to Kashmiri (ks).
`, '\u{0653}': `ٓ
aː is written using this character in isolate and word-initial positions in decomposed text:
initial | 0627 0653آزاد |
---|---|
medial | 0627 |
final | 0627 |
isolated | 0627 0653 |
The above sequence and the precomposed character 0622 are canonically equivalent.
`, '\u{0654}': `ٔ
ə is written using this character in all positions in decomposed text:
initial | 0627 0654أنْز |
---|---|
medial | 0654ژٔر |
final | 0654 |
isolated | 0627 0654 |
There is a precomposed character for the initial and isolate combinations with alef, and a couple of other letters with the hamza already included are also available; these are used in NFC-normalised text.
All of the above are canonically equivalent with decomposed sequences.
Other precomposed characters with hamza above exist in the Unicode repertoire, but the precomposed and decomposed versions are not canonically equivalent, and their use is not recommended.
`, '\u{0655}': `ٕ
ɨ is written using this character in word-medial & word-final positions, or in all positions in decomposed text (however, atomic characters are more common):
initial | 0627 0655 |
---|---|
medial | 0655گَگٕر |
final | 0655چھِرٕ |
isolated | 0627 0655 |
The atomic and decomposed versions of the initial and isolate forms are canonically equivalent.
`, '\u{0656}': `ٖ
iː is written using this character in word-initial and word-medial positions:
initial | 0627 06CC 0656ایٖمان |
---|---|
medial | 06CC 0656شيٖتھ |
final | 06CCزٲمی |
isolated | 0627 06CC |
This is the only use. It doesn't occur alone or in any other context.
`, '\u{0657}': `ٗ
uː is written using this character in all positions:
initial | 0627 0648 0657اوٗترٕ |
---|---|
medial | 0648 0657نوٗل |
final | 0648 0657قوبوٗ |
isolated | 0627 0648 0657 |
`, '\u{0658}': `٘
`, '\u{0659}': `ٙ
`, '\u{065A}': `ٚ
e is written using this character in all positions:
initial | 0627 06CC 065A |
---|---|
medial | یٚبیٚنہِ |
final | ےٚ |
isolated | اےٚ |
o is written using this character in all positions:
initial | 0627 0648 065Aاوٚبُر |
---|---|
medial | 0648 065Aزۆن |
final | 0648 065A |
isolated | 0627 0648 065A |
The precomposed ێ (intended for use with Sorani e) and ۆ (intended for use with Kazakh v) are sometimes found in Kashmiri content, but normalisation doesn't convert between the decomposed sequence and the precomposed character in either direction. Content authors should use the decomposed sequences shown above.
`, '\u{065B}': `ٛ
Although this looks like the Kashmiri jazm, as described in the name of the character, it was introduced to Unicode to serve as a vowel sign for African languages.
The appropriate semantic character for representing the jazm is 0652.
`, '\u{065C}': `ٜ
`, '\u{065D}': `ٝ
`, '\u{065E}': `ٞ
`, '\u{065F}': `ٟ
ɨː is written using this character in all positions:
initial | 0627 065F |
---|---|
medial | 065Fتٟر |
final | 065F |
isolated | 0627 065F |
There is a precomposed character for the initial and isolate combinations with alef, ٳ [U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW], that neither composes nor decomposes in normalisation, but it is strongly deprecated by the Unicode Standard.
`, '\u{0660}': `٠
`, '\u{0661}': `١
`, '\u{0662}': `٢
`, '\u{0663}': `٣
`, '\u{0664}': `٤
`, '\u{0665}': `٥
`, '\u{0666}': `٦
`, '\u{0667}': `٧
`, '\u{0668}': `٨
`, '\u{0669}': `٩
`, '\u{066A}': `٪
`, '\u{066B}': `٫
`, '\u{066C}': `٬
`, '\u{066D}': `٭
`, '\u{066E}': `ٮ
Sometimes used with 06EA to hack the initial and medial forms of the palatalisation letter in Kashmiri. 0620 should be used instead. Fonts will automatically apply a circle diacritic for initial and medial positions (only).
`, '\u{066F}': `ٯ
`, '\u{0670}': `ٰ
`, '\u{0671}': `ٱ
`, '\u{0672}': `ٲ
əː is written using this character in all positions:
initial | 0672ٲس |
---|---|
medial | 0672کٲشُر |
final | 0672 |
isolated | 0672 |
There is no decomposed version of this letter.
`, '\u{0673}': `ٳ
Visually identical to 0627 065F, which represents the Kashmiri sound ɨː in isolate and initial positions. Because normalisation doesn't convert this character to the decomposed sequence, or vice versa, use of this character is strongly discouraged (deprecated) in the Unicode Standard. The 2-character sequence should be used instead.
`, '\u{0674}': `ٴ
`, '\u{0675}': `ٵ
`, '\u{0676}': `ٶ
`, '\u{0677}': `ٷ
`, '\u{0678}': `ٸ
`, '\u{0679}': `ٹ
ʈ eg.
کٔٹ
ؤٹِل
Combinations
ٹھٹھٹھ ٹھ
ʈʰ is 0679 06BE, eg. کَٹھ
`, '\u{067A}': `ٺ
`, '\u{067B}': `ٻ
`, '\u{067C}': `ټ
`, '\u{067D}': `ٽ
`, '\u{067E}': `پ
p eg. پَرُن
Combinations
پھپھپھ پھ
پ
pʰ is 067E 06BE, eg. پھَش
`, '\u{067F}': `ٿ
`, '\u{0680}': `ڀ
`, '\u{0681}': `Do not use. Use 062D 0654 instead.
`, '\u{0682}': `ڂ
`, '\u{0683}': `ڃ
`, '\u{0684}': `ڄ
`, '\u{0685}': `څ
`, '\u{0686}': `چ
t͡ʃ eg. چھِرٕ بَطٕچ
Combinations
چھچھچھ چھ
t͡ʃʰ is produced by the combination 0686 06BE, eg. چھَلُن مٔچھ
`, '\u{0687}': `ڇ
`, '\u{0688}': `ڈ
ɖ eg. ژھانْڈُن بُڈؠ بَب
`, '\u{0689}': `ډ
`, '\u{068A}': `ڊ
`, '\u{068B}': `ڋ
`, '\u{068C}': `ڌ
`, '\u{068D}': `ڍ
`, '\u{068E}': `ڎ
`, '\u{068F}': `ڏ
`, '\u{0690}': `ڐ
`, '\u{0691}': `ڑ
ɽ in words of foreign origin, eg. لٔڑکی
`, '\u{0692}': `ڒ
`, '\u{0693}': `ړ
`, '\u{0694}': `ڔ
`, '\u{0695}': `ڕ
`, '\u{0696}': `ږ
`, '\u{0697}': `ڗ
`, '\u{0698}': `ژ
t͡s eg. ژٔر تٔژ
Combinations
ژھژھژھ ژھ
t͡sʰ is 0698 06BE. ژھاوٕج ووٚژھ
`, '\u{0699}': `ڙ
`, '\u{069A}': `ښ
`, '\u{069B}': `ڛ
`, '\u{069C}': `ڜ
`, '\u{069D}': `ڝ
`, '\u{069E}': `ڞ
`, '\u{069F}': `ڟ
`, '\u{06A0}': `ڠ
`, '\u{06A1}': `ڡ
`, '\u{06A2}': `ڢ
`, '\u{06A3}': `ڣ
`, '\u{06A4}': `ڤ
`, '\u{06A5}': `ڥ
`, '\u{06A6}': `ڦ
`, '\u{06A7}': `ڧ
`, '\u{06A8}': `ڨ
`, '\u{06A9}': `ک
k eg. کۄکٕر کُکِل
Note the special shape when it precedes l.
Combinations
کھکھکھ کھ
kʰ is 06A9 06BE. کھۄر اَکھ
`, '\u{06AA}': `ڪ
`, '\u{06AB}': `ګ
`, '\u{06AC}': `ڬ
`, '\u{06AD}': `ڭ
`, '\u{06AE}': `ڮ
`, '\u{06AF}': `گ
ɡ eg. گُر زَنْگ گُل
Note the special shape when it precedes l.
There is no aspirated version of this phoneme in Kashmiri.
`, '\u{06B0}': `ڰ
`, '\u{06B1}': `ڱ
`, '\u{06B2}': `ڲ
`, '\u{06B3}': `ڳ
`, '\u{06B4}': `ڴ
`, '\u{06B5}': `ڵ
`, '\u{06B6}': `ڶ
`, '\u{06B7}': `ڷ
`, '\u{06B8}': `ڸ
`, '\u{06B9}': `ڹ
`, '\u{06BA}': `ں
◌̃ nasalisation marker used only at the end of a word. However, it appears to be rare.
اٟں
If the nasalisation appears in the middle of a word, it is represented by نْ.
`, '\u{06BB}': `ڻ
`, '\u{06BC}': `ڼ
`, '\u{06BD}': `ڽ
`, '\u{06BE}': `ھ
Used to create the aspirated letters of the Kashmiri alphabet. Each letter is composed of two characters. The letters are: پھ pʰ, تھ tʰ, ٹھ ʈʰ, ژھ t͡sʰ, چھ t͡ʃʰ, and کھ kʰ, eg. پھَش ژھاوٕج
Unlike Urdu, Kashmiri doesn't aspirate any voiced sounds. However, a small number of words beginning with بھ bh retain the spelling [mkr,6], eg. بھَرَت
`, '\u{06BF}': `ڿ
`, '\u{06C0}': `ۀ
`, '\u{06C1}': `ہ
h consonant. ۂہَر کۆہ جُمعہ
Many words end with a silent h (which carries any vowel diacritics that would otherwise have been associated with the preceding consonant), eg. رامہٕ ہوٗن طوطہٕ کَہہ
`, '\u{06C2}': `ۂ
hə atomic CV letter. It is written with the following forms, which include this character in all positions in precomposed text:
isolated |
06C2 06C1 0654 |
ۂہَر بٕۂر |
---|---|---|
initial |
06C2 06C1 0654 | |
medial |
06C2 06C1 0654 | |
final |
06C2 06C1 0654 |
This character decomposes and recomposes during normalisation.
`, '\u{06C3}': `ۃ
`, '\u{06C4}': `ۄ
ɔ is written using this character in all positions:
initial | 0627 06C4 |
---|---|
medial | 06C4دۄد |
final | 06C4سۄ |
isolated | 0627 06C4 |
ɔː is written using this character in all positions.
initial | 0627 06C4 0627 |
---|---|
medial | 06C4 0627سۄاد |
final | 06C4 0627 |
isolated | 0627 06C4 0627 |
`, '\u{06C5}': `ۅ
`, '\u{06C6}': `Do not use. Use 0648 065A instead.
That said, in Wikipedia content, this character is much more frequently used than the decomposed alternative. Applications should therefore produce the decomposed version, but should recognise this character if it is used.
`, '\u{06C7}': `ۇ
`, '\u{06C8}': `ۈ
`, '\u{06C9}': `ۉ
`, '\u{06CA}': `ۊ
`, '\u{06CB}': `ۋ
`, '\u{06CC}': `ی
j eg.
دۄیُن
زٲمِیہِ
This letter can often be found as a medial -j (although sometimes it is spelled using a palatalisation letter). In this case, it is combined with jazm, ie. 06CC 0652, and may carry a vowel diacritic as well.
کیْوٚم
دَکھیُْن
Word-finally, this sound is written using 06D2.
Vowel usage
eː is written using this character in word-initial & word-medial positions:
initial | ی |
---|---|
medial | یشیر |
final | 06D2 |
isolated | 06D2 |
e is also written using this character in word-initial & word-medial positions:
initial | 0627 06CC 065A |
---|---|
medial | یٚبیٚنہِ |
final | ےٚ |
isolated | اےٚ |
Kashmiri content occasionally contains the precomposed character 06CE, which looks identical, but is intended for the Sorani e and is not linked by normalisation. The decomposed sequence should be used instead.
iː is written using this character in all positions:
initial | ایٖایٖمان |
---|---|
medial | یٖشيٖتھ |
final | یزٲمی |
isolated | ای |
Kashmiri doesn't use 064A, but people sometimes use it in Kashmiri text, failing to distinguish it from this character.
`, '\u{06CD}': `ۍ
`, '\u{06CE}': `Do not use. Use 06CC 065A instead.
`, '\u{06CF}': `ۏ
`, '\u{06D0}': `ې
`, '\u{06D1}': `ۑ
`, '\u{06D2}': `ے
-j word-finally, eg. بوے کۄلَے
(Elsewhere in a word, this sound is written using 06CC. Do not confuse this usage with that of the palatalisation letter, 0620.)
Vowel usage
eː is written using this character in isolate & word-final positions:
initial | ی |
---|---|
medial | ی |
final | 06D2دُپاسے |
isolated | 06D2 |
e is also written using this character in isolate & word-final positions:
initial | 0627 06CC 065A |
---|---|
medial | یٚ |
final | ےٚشےٚ |
isolated | اےٚ |
`, '\u{06D3}': `ۓ
`, '\u{06D4}': `۔
`, '\u{06D5}': `ە
`, '\u{06D6}': `ۖ
`, '\u{06D7}': `ۗ
`, '\u{06D8}': `ۘ
`, '\u{06D9}': `ۙ
`, '\u{06DA}': `ۚ
`, '\u{06DB}': `ۛ
`, '\u{06DC}': `ۜ
`, '\u{06DD}': `
`, '\u{06DE}': `۞
`, '\u{06DF}': `۟
`, '\u{06E0}': `۠
`, '\u{06E1}': `ۡ
`, '\u{06E2}': `ۢ
`, '\u{06E3}': `ۣ
`, '\u{06E4}': `ۤ
`, '\u{06E5}': `ۥ
`, '\u{06E6}': `ۦ
`, '\u{06E7}': `ۧ
`, '\u{06E8}': `ۨ
`, '\u{06E9}': `۩
`, '\u{06EA}': `۪
Sometimes used with 066E to hack the initial and medial forms of the palatalisation letter in Kashmiri. 0620 should be used instead. Fonts will automatically apply a circle diacritic for initial and medial positions (only).
`, '\u{06EB}': `۫
`, '\u{06EC}': `۬
`, '\u{06ED}': `ۭ
`, '\u{06EE}': `ۮ
`, '\u{06EF}': `ۯ
`, '\u{06F0}': `۰
0 Digit.
`, '\u{06F1}': `۱
1 Digit.
Shape. ۱.
`, '\u{06F2}': `۲
2 Digit.
Shape. ۲.
`, '\u{06F3}': `۳
3 Digit.
Shape. ۳.
`, '\u{06F4}': `۴
4 Digit.
Shape. ۴.
This is the same shape as 4 in Urdu, but different from that in Persian and Sindhi, and from that in Arabic. See the shapes.
`, '\u{06F5}': `۵
5 Digit.
Shape. ۵.
This is the same shape as 5 in Persian, Urdu, and Sindhi, but different from that in Arabic. See the shapes.
`, '\u{06F6}': `۶
6 Digit.
Shape. ۶.
This is the same shape as 6 in Arabic, Urdu, and Sindhi, but different from that in Persian. See the shapes.
`, '\u{06F7}': `۷
7 Digit.
Shape. ۷.
This is the same shape as 7 in Urdu and Sindhi, but different from that in Arabic and Persian. See the shapes.
`, '\u{06F8}': `۸
8 Digit.
Shape. ۸.
`, '\u{06F9}': `۹
9 Digit.
Shape. ۹.
`, '\u{06FA}': `ۺ
`, '\u{06FB}': `ۻ
`, '\u{06FC}': `ۼ
`, '\u{06FD}': `۽
`, '\u{06FE}': `۾
`, '\u{06FF}': `ۿ
`, '\u{0750}': `ݐ
`, '\u{0751}': `ݑ
`, '\u{0752}': `ݒ
`, '\u{0753}': `ݓ
`, '\u{0754}': `ݔ
`, '\u{0755}': `ݕ
`, '\u{0756}': `ݖ
`, '\u{0757}': `ݗ
`, '\u{0758}': `ݘ
`, '\u{0759}': `ݙ
`, '\u{075A}': `ݚ
`, '\u{075B}': `ݛ
`, '\u{075C}': `ݜ
`, '\u{075D}': `ݝ
`, '\u{075E}': `ݞ
`, '\u{075F}': `ݟ
`, '\u{0760}': `ݠ
`, '\u{0761}': `ݡ
`, '\u{0762}': `ݢ
`, '\u{0763}': `ݣ
`, '\u{0764}': `ݤ
`, '\u{0765}': `ݥ
`, '\u{0766}': `ݦ
`, '\u{0767}': `ݧ
`, '\u{0768}': `ݨ
`, '\u{0769}': `ݩ
`, '\u{076A}': `ݪ
`, '\u{076B}': `ݫ
`, '\u{076C}': `Do not use. Use 0631 0654 instead.
`, '\u{076D}': `ݭ
`, '\u{076E}': `ݮ
`, '\u{076F}': `ݯ
`, '\u{0770}': `ݰ
`, '\u{0771}': `ݱ
`, '\u{0772}': `ݲ
`, '\u{0773}': `ݳ
`, '\u{0774}': `ݴ
`, '\u{0775}': `ݵ
`, '\u{0776}': `ݶ
`, '\u{0777}': `ݷ
`, '\u{0778}': `ݸ
`, '\u{0779}': `ݹ
`, '\u{077A}': `ݺ
`, '\u{077B}': `ݻ
`, '\u{077C}': `ݼ
`, '\u{077D}': `ݽ
`, '\u{077E}': `ݾ
`, '\u{077F}': `ݿ
`, '\u{08A0}': `ࢠ
`, '\u{08A1}': `Do not use. Use 0628 0654 instead.
`, '\u{08A2}': `ࢢ
`, '\u{08A3}': `ࢣ
`, '\u{08A4}': `ࢤ
`, '\u{08A5}': `ࢥ
`, '\u{08A6}': `ࢦ
`, '\u{08A7}': `ࢧ
`, '\u{08A8}': `ࢨ
`, '\u{08A9}': `ࢩ
`, '\u{08AA}': `ࢪ
`, '\u{08AB}': `ࢫ
`, '\u{08AC}': `ࢬ
`, '\u{08AD}': `ࢭ
`, '\u{08AE}': `ࢮ
`, '\u{08AF}': `ࢯ
`, '\u{08B0}': `ࢰ
`, '\u{08B1}': `ࢱ
`, '\u{08B2}': `ࢲ
`, '\u{08B3}': `ࢳ
`, '\u{08B4}': `ࢴ
`, '\u{08B6}': `ࢶ
`, '\u{08B7}': `ࢷ
`, '\u{08B8}': `ࢸ
`, '\u{08B9}': `ࢹ
`, '\u{08BA}': `ࢺ
`, '\u{08BB}': `ࢻ
`, '\u{08BC}': `ࢼ
`, '\u{08BD}': `ࢽ
`, '\u{08BE}': `ࢾ
`, '\u{08BF}': `ࢿ
`, '\u{08C0}': `ࣀ
`, '\u{08C1}': `ࣁ
`, '\u{08C2}': `ࣂ
`, '\u{08C3}': `ࣃ
`, '\u{08C4}': `ࣄ
`, '\u{08C5}': `ࣅ
`, '\u{08C6}': `ࣆ
`, '\u{08C7}': `ࣇ
`, '\u{08D3}': `࣓
`, '\u{08D4}': `ࣔ
`, '\u{08D5}': `ࣕ
`, '\u{08D6}': `ࣖ
`, '\u{08D7}': `ࣗ
`, '\u{08D8}': `ࣘ
`, '\u{08D9}': `ࣙ
`, '\u{08DA}': `ࣚ
`, '\u{08DB}': `ࣛ
`, '\u{08DC}': `ࣜ
`, '\u{08DD}': `ࣝ
`, '\u{08DE}': `ࣞ
`, '\u{08DF}': `ࣟ
`, '\u{08E0}': `࣠
`, '\u{08E1}': `࣡
`, '\u{08E2}': `
`, '\u{08E3}': `ࣣ
`, '\u{08E4}': `ࣤ
`, '\u{08E5}': `ࣥ
`, '\u{08E6}': `ࣦ
`, '\u{08E7}': `ࣧ
`, '\u{08E8}': `ࣨ
`, '\u{08E9}': `ࣩ
`, '\u{08EA}': `࣪
`, '\u{08EB}': `࣫
`, '\u{08EC}': `࣬
`, '\u{08ED}': `࣭
`, '\u{08EE}': `࣮
`, '\u{08EF}': `࣯
`, '\u{08F0}': `ࣰ
`, '\u{08F1}': `ࣱ
`, '\u{08F2}': `ࣲ
`, '\u{08F3}': `ࣳ
`, '\u{08F4}': `ࣴ
`, '\u{08F5}': `ࣵ
`, '\u{08F6}': `ࣶ
`, '\u{08F7}': `ࣷ
`, '\u{08F8}': `ࣸ
`, '\u{08F9}': `ࣹ
`, '\u{08FA}': `ࣺ
`, '\u{08FB}': `ࣻ
`, '\u{08FC}': `ࣼ
`, '\u{08FD}': `ࣽ
`, '\u{08FE}': `ࣾ
`, '\u{08FF}': `ࣿ
`, // COMMON PUNCTUATION // § '\u{00A7}': ` `, // « '\u{00AB}': ` `, // » '\u{00BB}': ` `, // danda '\u{0964}': `।
`, // double danda '\u{0965}': `॥
`, // – '\u{2010}': ` `, // – '\u{2013}': ` `, // — '\u{2014}': ` `, // '.. '\u{2018}': ` `, // ..' '\u{2019}': ` `, // ".. '\u{201C}': `“
Kashmiri uses this as a closing quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text. This makes the order in which quotation marks are used in Kashmiri different from that in LTR text.
`, // .." '\u{201D}': `”
Kashmiri uses this as an opening quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text. This makes the order in which quotation marks are used in Kashmiri different from that in LTR text.
`, // ! '\u{0021}': ` `, // … '\u{2026}': ` `, // ( '\u{0028}': `(
Kashmiri uses this ASCII character as an 'opening parenthesis' before the parenthetic text, just as in LTR text. The glyph will face left, thanks to bidi mirroring.
`, // ) '\u{0029}': `)
Kashmiri uses this ASCII character as a 'closing parenthesis' after the parenthetic text, just as in LTR text. The glyph will face right, thanks to bidi mirroring.
`, // , '\u{002C}': ` `, // . '\u{002E}': ` `, // : '\u{003A}': ` `, // ; '\u{003B}': ` `, // ? '\u{003F}': ` `, // cgj '\u{034F}': `Semantically separates characters. Can be used to prevent pairs of characters being treated as digraphs, or to block canonical reordering of combining marks during normalization. The word 'joiner' in the name is a misnomer.
`, // alm '\u{061C}': `Helps produce the correct ordering for sequences with no strong directional characters by overriding the Unicode Bidirectional Algorithm default rules. Used particularly for text in the Arabic language, and languages using Syriac and Thaana scripts. Not usually needed for Hebrew, N'Ko, or Persian.
`, // FORMATTING CHARACTERS '\u{2020}': `Called dagger, but also known as obelisk, obelus, or long cross.b321
A reference mark, used primarily with footnotes. When used for this purpose with other signs, the traditional order is * † ‡ § ‖ ¶.b68
Also a death sign in European typography, used to mark the year of death or the names of dead persons.b321
In lexicography it marks obsolete forms, and in editing of classical texts flags passages judged to be corrupt.b321
`, '\u{2021}': `Called double dagger, but also known as diesis or double obelisk.b321
A reference mark used with footnotes. When used for this purpose with other signs, the traditional order is * † ‡ § ‖ ¶.b68
`, // … '\u{2026}': ` `, '\u{2032}': `Abbreviation for feet (1′ = 12″).b330
Also used for minutes of arc (eg. 60′=1°).b330
`, '\u{2033}': `Abbreviation for inches (1′ = 12″).b321
Also used for seconds of arc (eg. 3600″=1°).b321
`, // General punctuation '\u{2018}': `‘
Closing quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.
`, '\u{2019}': `’
Opening quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.
`, '\u{201C}': `“
Closing quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.
`, '\u{201D}': `”
Opening quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.
`, '\u{2014}': `—
Em dash.
`, '\u{0021}': `!
Exclamation mark.
`, '\u{0025}': `%
Percentage mark.
`, '\u{0028}': `(
Opening parenthesis.
The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.
`, '\u{0029}': `)
Closing parenthesis.
The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.
`, '\u{002E}': `.
Full stop.
`, '\u{0030}': `0
Digit.
`, '\u{0031}': `1
Digit.
`, '\u{0032}': `2
Digit.
`, '\u{0033}': `3
Digit.
`, '\u{0034}': `4
Digit.
`, '\u{0035}': `5
Digit.
`, '\u{0036}': `6
Digit.
`, '\u{0037}': `7
Digit.
`, '\u{0038}': `8
Digit.
`, '\u{0039}': `9
Digit.
`, '\u{003A}': `:
Colon.
`, '\u{00AB}': `«
Opening quotation mark.
The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.
`, '\u{00BB}': `»
Closing quotation mark.
The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.
`, '\u{034F}': `͏
Combining grapheme joiner.
Used to produce special ordering of diacritics. The name is a misnomer, as it is generally used to break the normal sequence of diacritics.
`, '\u{2013}': `–
En dash.
`, '\u{2026}': `…
Ellipsis.
`, '\u{2030}': `‰
Per mille sign.
`, '\u{2039}': `‹
Quotation mark.
`, '\u{203A}': `›
Quotation mark.
`, // zwnj '\u{200C}': `
Zero-width non-joiner (ZWNJ).
An invisible character, that prevents two adjacent letters forming a visual connection with each other when rendered. Especially useful for educational illustrations, but also has real-world applications.
It is used to interrupt the shaping of joining glyphs in cursive scripts, and also used to manage the visual interactions of glyphs in other scripts, eg. to prevent the formation of conjuncts, position diacritics, etc.
`, // zwj '\u{200D}': `
Zero-width joiner (ZWJ).
An invisible character, that permits a letter to form a cursive connection without a visible neighbour. Especially useful for educational illustrations, but also has some real-world applications.
Also used with complex scripts to manage the visual representation of glyphs that normally interact, eg. to form conjuncts, position diacritics, etc.
`, // LRM '\u{200E}': `An invisible character with strong LTR directional properties that can be used to produce the correct ordering of text, especially where there is a risk of spillover effects while the Unicode Bidirectional Algorithm is at work.
Generally referred to as LRM.
`, // RLM '\u{200F}': `An invisible character with strong RTL directional properties that can be used to produce the correct ordering of text, especially where there is a risk of spillover effects while the Unicode Bidirectional Algorithm is at work.
Generally referred to as RLM.
`, // LRE '\u{202A}': `Sets the start point for a range of inline text when applying a base direction of left-to-right. The range is terminated by 202C (PDF).
Use 2066 (LRI) rather than this character.
`, // RLE '\u{202B}': `Sets the start point for a range of inline text when applying a base direction of right-to-left. The range is terminated by 202C (PDF).
Use 2067 (RLI) rather than this character.
`, // PDF '\u{202C}': `Sets the end point for a range of inline text when applying a base direction. The range is started with either 202A (LRE) or 202B (RLE).
Use 2069 (PDI) and its associated range starters rather than this character.
`, // LRI '\u{2066}': `Sets the start point for a range of inline text when applying a base direction of left-to-right, and isolates the text within that range from text outside it. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).
This character should be used rather than 202A (LRE).
`, // RLI '\u{2067}': `Sets the start point for a range of inline text when applying a base direction of right-to-left, and isolates the text within that range from text outside it. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).
This character should be used rather than 202B (RLE).
`, // FSI '\u{2068}': `Sets the start point for a range of inline text when applying a base direction, and isolates the text within that range from text outside it. The base direction set is determined by that of the first strong directional character in the range. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).
`, // PDI '\u{2069}': `Sets the end point for a range of inline text when applying a base direction. The range is started with either 2066 (LRI), 2067 (RLI) or 2068 (FSI).
This character should be used rather than 202C (PDF).
`, // word-break '\u{2060}': `An invisible character, equivalent to a zero-width no-break space, and used to prevent line-breaks, eg. it can be used around the + sign in base+delta to prevent a line break occurring in that sequence of characters. It has no effect on word segmentation.
It can also be used to bracket other characters to turn them into non-breaking characters, such as U+2009 THIN SPACE or ― [U+2015 HORIZONTAL BAR].
Not to be confused with U+200D ZERO WIDTH JOINER or U+034F COMBINING GRAPHEME JOINER, since it has no effect on shaping.
This functionality is also provided by U+FEFF ZERO WIDTH NO-BREAK SPACE, but since that character also represents the byte-order mark, the use of this word joiner character (added in Unicode 3.2) is strongly preferred over the latter.
`, }
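/*
Several notes in the table above distinguish characters that normalisation links (eg. 0622 and 0627 0653) from look-alikes that it does not (eg. 06C6 and 0648 065A), and recommend replacing the latter. The sketch below illustrates that distinction using the standard String.prototype.normalize method. The names `RECOMMENDED` and `toRecommendedForm` are hypothetical, and the map covers only some of the "Use ... instead" notes; it is an illustration under those assumptions, not part of this file's tooling.

```javascript
// Discouraged characters and the replacements recommended in the notes above.
const RECOMMENDED = new Map([
  ['\u0626', '\u06CC\u0654'], // yeh with hamza above
  ['\u0643', '\u06A9'],       // Arabic kaf, replaced by keheh
  ['\u064A', '\u06CC'],       // Arabic yeh, replaced by Farsi yeh
  ['\u0673', '\u0627\u065F'], // alef with wavy hamza below
  ['\u06C6', '\u0648\u065A'], // oe
  ['\u06CE', '\u06CC\u065A'], // yeh with small v
]);

// Hypothetical helper: apply NFC, swap discouraged characters, apply NFC again.
function toRecommendedForm(text) {
  let out = '';
  for (const ch of text.normalize('NFC')) {
    out += RECOMMENDED.has(ch) ? RECOMMENDED.get(ch) : ch;
  }
  return out.normalize('NFC');
}

// Canonically equivalent pairs are unified by normalisation alone.
console.log('\u0627\u0653'.normalize('NFC') === '\u0622');      // true
console.log('\u06C2'.normalize('NFD') === '\u06C1\u0654');      // true
// Look-alikes with no canonical link are left untouched by normalisation.
console.log('\u0648\u065A'.normalize('NFC') === '\u06C6');      // false
// So they need an explicit replacement step.
console.log(toRecommendedForm('\u06C6') === '\u0648\u065A');    // true
```

Normalising before the replacement means that a decomposed 064A 0654 is first composed to 0626 and then mapped to 06CC 0654, so both spellings of the discouraged letter end up in the same recommended form.
*/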