/* */
var charDetails = {

// MAIN BLOCK
'\u{0621}': `

ɪ, ʊ, or ə. Used in word-final position to indicate a standalone short vowel sound.

جوء

ماء

ڌيء

When needed, diacritics can be used to disambiguate the sound.

ɪ is ءِ.

جوء

ʊ is ءُ.

ماء

ə is ءَ.

ڌيء

`, '\u{0622}': `

ɑ~ɑː standalone vowel in word-initial position.

آسمان

آچر

آنو

`, '\u{0626}': `

Used as a word-medial carrier for vowels that don't follow a consonant.

ʔ in some phonetic transcriptions, but not transcribed in others.

پئوڻ

ڳئون

In standard text, when it occurs alone this letter can represent either ɪ, ʊ, or ə. Other sounds can be represented by a combination of letters. Where needed, the presence or absence of diacritics can indicate or disambiguate the vowel sound, but diacritics are rarely seen in typical text.

i is ئِي.

سئي

ɪ is ئِ

پائڻ

ʊ is ئُ

پائلو

u is ئُو

ڳئون

e is ئي

ائين

o is ئو

سائو

ə is ئَ

پئڻ

`, '\u{0627}': `

ɑ or a vowel.

آسمان

دوا

خواب

Standalone vowels

Also used as a word-initial carrier for vowels.

In standard text, when it occurs alone at the start of a word this letter can represent either ɪ, ʊ, or ə. Other sounds can be represented by a combination of letters. Where needed, the presence or absence of diacritics can indicate or disambiguate the vowel sound, but diacritics are rarely seen in typical text.

i is اِي

ايران

ɪ is اِ

ارادو

ʊ is اُ

اتر

u is اُو

اوناڙڻ

o is او

اوڀر

ə is اَ

افسوس

`, '\u{0628}': `

b consonant.

برابر

جبل

حجاب

`, '\u{062A}': `

t consonant.

تنگ

متان

ورسپت

`, '\u{062B}': `

Allograph used for loan words.

s consonant.

ثواب

`, '\u{062C}': `

d͡ʑ consonant.

جبل

حجاب

سج

Combinations

جھ

d͡ʑʰ is جھ

`, '\u{062D}': `

h consonant.

حجاب

صحيح

`, '\u{062E}': `

x consonant.

خواب

`, '\u{062F}': `

d consonant.

دڪان

بدران

`, '\u{0630}': `

Allograph used for loan words.

z consonant.

ڪاغذ

`, '\u{0631}': `

r consonant.

راجا

شڪر

`, '\u{0632}': `

z consonant.

زبان

تيز

`, '\u{0633}': `

s consonant.

سنڌي

گسڻ

اپواس

`, '\u{0634}': `

ʂ consonant.

شڪر

ڪشادو

`, '\u{0635}': `

Allograph used for loan words.

s consonant.

صحيح

`, '\u{0636}': `

Allograph used for loan words.

z consonant.

رمضان

`, '\u{0637}': `

Allograph used for loan words.

t consonant.

غلطي

`, '\u{0638}': `

Allograph used for loan words.

z consonant.

مظفر ڳڙھ

`, '\u{0639}': `

æ vowel. This is an unusual letter which has its own vowel sound that can be followed by another vowel. See these examples.

علائقو

قلعو

علاقو

`, '\u{063A}': `

ɣ consonant.

غلطي

مغرب

`, '\u{0641}': `

f consonant.

افسوس

برف

`, '\u{0642}': `

q consonant.

قهوو

موسيقي

`, '\u{0644}': `

l consonant.

لوڻ

غلطي

جبل

This letter ligates with a following aleph.

علائقو

ڪوڪاڪولا

Combinations

لھ

lʰ is لھ

للھو

`, '\u{0645}': `

m consonant.

مڻيار

آسمان

ڪريم

The short, sideways tail on the final or isolated form of the letter is expected for Sindhi text.

Combinations

مھ

mʰ is مھ

`, '\u{0646}': `

n consonant.

نئون

سنڌي

زمين

◌̃ nasalisation marker.

ڏينهن

بدران

Combinations

نھ

nʰ is نھ

It can be difficult to know whether this letter represents an onset followed by an unmarked vowel, a nasal coda, or nasalisation of the vowel.

`, '\u{0647}': `

h consonant.

لاهور

قهوو

پره

This is one of 3 characters representing heh in Sindhi, and the code points used in the wild are often inconsistent, especially where the glyph shapes look the same. Unicode experts recommend that this letter only be used for onsets. See also ھ and ہ.

`, '\u{0648}': `

ʋ consonant.

واءُ

دوا

اپواس

u vowel or o vowel.

جونء

آنو

Combinations

The alternative vowel sounds represented by this letter can be distinguished using diacritics, as follows. (When the diacritic follows this letter, it is being used as a consonant.)

u is ُو

جونء

o is و (no diacritic).

آنو

Observation: In final position in a word, this letter seems to overwhelmingly be pronounced o.

`, '\u{064A}': `

j consonant.

ڏياري

i vowel or e vowel.

زمين

تيز

Combinations

The alternative vowel sounds represented by this letter can be distinguished using diacritics, as follows. (When the diacritic follows this letter, it is being used as a consonant.)

i is ِي

زمين

e is ي (no diacritic).

تيز

`, '\u{064E}': `

ə vowel in vocalised text when not followed by alef.

برف

Combinations

ɑ is َا, though it is often written with just the alef.

ڏياري

Standalones

Word-initial standalone vowels use the carrier letter ا.

افسوس

Word-medially, the carrier is ئ.

پئڻ

For word-final ə after a vowel, the carrier is ء.

جونء

`, '\u{064F}': `

ʊ vowel in vocalised text when not followed by waw.

زبان

Combinations

u is ُو.

لوڻ

(Without this diacritic, in vowelled text, waw represents o.)

Standalones

Word-initial standalone vowels use the carrier letter ا.

اتر

Word-medially, the carrier is ئ.

ڳئون

For word-final ʊ after a vowel, the carrier is ء.

ڀاء

`, '\u{0650}': `

ɪ vowel in vocalised text when not followed by yeh.

تٿ

Combinations

i is ِي

پيء

(Without this diacritic, in vowelled text, yeh represents e.)

Standalones

Word-initial standalone vowels use the carrier letter ا.

ايران

Word-medially, the carrier is ئ.

پائڻ

سئي

For word-final ɪ after a vowel, the carrier is ء.

جوء

`, '\u{0652}': `

Vowel absence indicator.

Not normally used for Sindhi, even in vowelled text.

رمضان

`, '\u{0653}': `

Rare. Orphan. Only found in decomposed text with ALEF.

Combining madd.

`, '\u{0654}': `

Rare. Orphan. Only found in decomposed text with YEH.

Combining hamza.

`, '\u{067A}': `

ʈʰ consonant.

ٺڪر

سٺو

پٺ

`, '\u{067B}': `

ɓ consonant.

ٻج

چٻائڻ

`, '\u{067D}': `

ʈ consonant.

ٽنگ

ڇٽڻ

پيٽ

`, '\u{067E}': `

p consonant.

پئڻ

ڪپور

`, '\u{067F}': `

tʰ consonant.

ٿورو

ڪٿي

تٿ

`, '\u{0680}': `

bʰ consonant.

ڀارت

گنڀير

سڀ

`, '\u{0683}': `

ɲ consonant.

ڀڃڻ

مڃ

`, '\u{0684}': `

ʄ consonant.

ڄمڻ

سڄو

ويڄ

`, '\u{0686}': `

t͡ɕ consonant.

چيرڻ

آچر

`, '\u{0687}': `

t͡ɕʰ consonant.

ڇٽڻ

مڇي

`, '\u{068A}': `

ɖ consonant.

ڊڄڻ

آنڊو

جنڊ

`, '\u{068C}': `

dʰ consonant.

ڌرتي

سنڌي

سنڌ

`, '\u{068D}': `

ɖʰ consonant.

ڍڳو

ڍنڍ

ڳنڍ

`, '\u{068F}': `

ɗ consonant.

ڏينهن

ڪڏڻ

گڏ

`, '\u{0699}': `

ɽ consonant.

پاپڙ

ڪيوڙو

Combinations

ڙھ

ɽʰ is ڙھ

پڙھڻ

`, '\u{06A6}': `

pʰ consonant.

ڦار

ڦوڪڻ

`, '\u{06A9}': `

kʰ consonant.

کاند

بيکہ

اک

`, '\u{06AA}': `

k consonant.

ڪاغذ

ڪڪر

ماڻڪ

`, '\u{06AF}': `

ɡ consonant.

گسڻ

گانگٽ

ٽنگ

Combinations

گھ

ɡʰ is گھ

اگھڻ

`, '\u{06B1}': `

ŋ consonant.

اڱارو

ڪڱگو

سڱ

`, '\u{06B3}': `

ɠ consonant.

ڳئون

ماڳ

`, '\u{06BB}': `

ɳ consonant.

لوڻ

مڻيار

ڏئڻ

Combinations

ڻھ

ɳʰ is ڻھ

`, '\u{06BE}': `

ʰ aspiration marker.

Combinations

جھ

d͡ʑʰ is جھ

جھنگ

گھ

ɡʰ is گھ

گھر

مھ

mʰ is مھ

نھ

nʰ is نھ

ڳنھڻ

ڻھ

ɳʰ is ڻھ

ڙھ

ɽʰ is ڙھ

پڙھڻ

لھ

lʰ is لھ

ٿلھو

`, '\u{06C1}': `

Only used in word-final position.

∅(ʰ) silent he. It is normally not pronounced, but may sometimes be detected as a waning h sound.

باهہ

بيکہ

مهانگہ

`, '\u{06F0}': `

0 digit.

`, '\u{06F1}': `

1 digit.

`, '\u{06F2}': `

2 digit.

`, '\u{06F3}': `

3 digit.

`, '\u{06F4}': `

4 digit.

`, '\u{06F5}': `

5 digit.

`, '\u{06F6}': `

6 digit.

`, '\u{06F7}': `

7 digit.

`, '\u{06F8}': `

8 digit.

`, '\u{06F9}': `

9 digit.

`, '\u{06FD}': `

Ampersand.

`, '\u{06FE}': `

mẽ locative case marker.

`, '\u{204F}': `

⁏

Semicolon.

`, '\u{2E41}': `

⹁

Comma.

`, '\u{0021}': `

Exclamation mark.

`, '\u{002D}': `

Hyphen.


`, '\u{002E}': `

Full stop.

`, '\u{003A}': `

Colon.

`, '\u{061F}': `

Question mark.

`,

// COMMON PUNCTUATION
// §
'\u{00A7}': `

`,
// «
'\u{00AB}': `

`,
// »
'\u{00BB}': `

`,
// danda
'\u{0964}': `

।

`,
// double danda
'\u{0965}': `

॥

`,
// ‐
'\u{2010}': `

‐

`,
// –
'\u{2013}': `

–

`,
// —
'\u{2014}': `

—

`,
// '..
'\u{2018}': `

‘

`,
// ..'
'\u{2019}': `

’

`,
// "..
'\u{201C}': `

“

`,
// .."
'\u{201D}': `

”

`,
// !
'\u{0021}': `

`,
// …
'\u{2026}': `

…

`,
// (
'\u{0028}': `

(

`,
// )
'\u{0029}': `

)

`,
// ,
'\u{002C}': `

`,
// .
'\u{002E}': `

`,
// :
'\u{003A}': `

`,
// ;
'\u{003B}': `

;

`,
// ?
'\u{003F}': `

`,
// cgj
'\u{034F}': `

Semantically separates characters. Can be used to prevent pairs of characters being treated as digraphs, or to block canonical reordering of combining marks during normalization. The word 'joiner' in the name is a misnomer.

`,
// alm
'\u{061C}': `

Helps produce the correct ordering for sequences with no strong directional characters by overriding the Unicode Bidirectional Algorithm default rules. Used particularly for text in the Arabic language, and languages using Syriac and Thaana scripts. Not usually needed for Hebrew, N'Ko, or Persian.

`, '\u{060C}': `

Rare.

Comma.

`, '\u{061B}': `

Rare.

Semicolon.

`, '\u{06D4}': `

Arabic full stop.

`,

// FORMATTING CHARACTERS
// zwsp
'\u{200B}': `

An invisible character, used to signal line-break and word-break opportunities. It was originally provided for use with writing systems such as Thai, Myanmar, Khmer, Japanese, etc. that don't use spaces between words.

Justification may visibly adjust the space between the characters on either side of this character, doing so as if the ZWSP wasn't there, eg. the Thai text อักษรไทย may look like อั ก ษ ร ไ ท ย when justified, or when letter-spacing is applied, even though the two words are separated by a ZWSP (click on the word to see the composition).

`,
// zwnj
'\u{200C}': `

‌

Prevents glyph joining behaviour.

`,
// zwj
'\u{200D}': `

‍

Creates glyph joining behaviour in the absence of normal joining contexts.

`,
// rlm
'\u{200F}': `

‏

An invisible character with a strong RTL directional property. Can be used to correct local issues with the Unicode Bidirectional Algorithm.

`,
// lrm
'\u{200E}': `

‎

An invisible character with a strong LTR directional property. Can be used to correct local issues with the Unicode Bidirectional Algorithm.

`,
// ‘
'\u{2018}': `

‘

`,
// ’
'\u{2019}': `

’

`,
// “
'\u{201C}': `

“

`,
// ”
'\u{201D}': `

”

`, '\u{2020}': `

†

Called dagger, but also known as obelisk, obelus, or long cross.

A reference mark, used primarily with footnotes. When used for this purpose with other signs, the traditional order is * † ‡ § ‖ ¶.

Also a death sign in European typography, used to mark the year of death or the names of dead persons.

In lexicography it marks obsolete forms, and in editing of classical texts flags passages judged to be corrupt.

`, '\u{2021}': `

‡

Called double dagger, but also known as diesis, or double obelisk.

A reference mark used with footnotes. When used for this purpose with other signs, the traditional order is * † ‡ § ‖ ¶.

`,
// …
'\u{2026}': `

…

`,
// rle
'\u{202B}': `

‫

Sets the base direction for the following text to RTL, with no isolation. The Unicode Standard recommends use of RLI, instead.

`,
// lre
'\u{202A}': `

‪

Sets the base direction for the following text to LTR, with no isolation. The Unicode Standard recommends use of LRI, instead.

`,
// pdf
'\u{202C}': `

‬

Ends the range of text that was started with RLE or LRE.

`, '\u{2032}': `

′

Abbreviation for feet (1′ = 12″).

Also used for minutes of arc (eg. 60′=1°).

`, '\u{2033}': `

″

Abbreviation for inches (1′ = 12″).

Also used for seconds of arc (eg. 3600″=1°).

`,
// word-break
'\u{2060}': `

An invisible character, equivalent to a zero-width no-break space, and used to prevent line-breaks, eg. it can be used around the + sign in base⁠+delta⁠ to prevent a line break occurring in that sequence of characters. It has no effect on word segmentation.

It can also be used to bracket other characters to turn them into non-breaking characters, such as U+2009 THIN SPACE or ― [U+2015 HORIZONTAL BAR].

Not to be confused with U+200D ZERO WIDTH JOINER or U+034F COMBINING GRAPHEME JOINER, since it has no effect on shaping.

This functionality is also provided by U+FEFF ZERO WIDTH NO-BREAK SPACE, but since that character also represents the byte-order mark, the use of this word joiner character (added in Unicode 3.2) is strongly preferred over the latter.

`,
// zwnj
'\u{200C}': `

‌

Zero-width non-joiner (ZWNJ).

An invisible character that prevents two adjacent letters forming a visual connection with each other when rendered. Especially useful for educational illustrations, but also has real-world applications.

It is used to interrupt the shaping of joining glyphs in cursive scripts, and also used to manage the visual interactions of glyphs in other scripts, eg. to prevent the formation of conjuncts, position diacritics, etc.

More details:

Managing glyph shaping

`,
// zwj
'\u{200D}': `

‍

Zero-width joiner (ZWJ).

An invisible character that permits a letter to form a cursive connection without a visible neighbour. Especially useful for educational illustrations, but also has some real-world applications.

Also used with complex scripts to manage the visual representation of glyphs that normally interact, eg. to form conjuncts, position diacritics, etc.

More details:

Managing glyph shaping

`,
// LRM
'\u{200E}': `

An invisible character with strong LTR directional properties that can be used to produce the correct ordering of text, especially where there is a risk of spillover effects while the Unicode Bidirectional Algorithm is at work.

Generally referred to as LRM.

`,
// RLM
'\u{200F}': `

An invisible character with strong RTL directional properties that can be used to produce the correct ordering of text, especially where there is a risk of spillover effects while the Unicode Bidirectional Algorithm is at work.

Generally referred to as RLM.

`,
// LRE
'\u{202A}': `

Sets the start point for a range of inline text when applying a base direction of left-to-right. The range is terminated by 202C (PDF).

Use 2066 (LRI) rather than this character.

`,
// RLE
'\u{202B}': `

Sets the start point for a range of inline text when applying a base direction of right-to-left. The range is terminated by 202C (PDF).

Use 2067 (RLI) rather than this character.

`,
// PDF
'\u{202C}': `

Sets the end point for a range of inline text when applying a base direction. The range is started with either 202A (LRE) or 202B (RLE).

Use 2069 (PDI) and its associated range starters rather than this character.

`,
// LRI
'\u{2066}': `

Sets the start point for a range of inline text when applying a base direction of left-to-right, and isolates the text within that range from text outside it. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

This character should be used rather than 202A (LRE).

`,
// RLI
'\u{2067}': `

Sets the start point for a range of inline text when applying a base direction of right-to-left, and isolates the text within that range from text outside it. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

This character should be used rather than 202B (RLE).

`,
// FSI
'\u{2068}': `

Sets the start point for a range of inline text when applying a base direction, and isolates the text within that range from text outside it. The base direction set is determined by that of the first strong directional character in the range. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

`,
// PDI
'\u{2069}': `

Sets the end point for a range of inline text when applying a base direction. The range is started with either 2066 (LRI), 2067 (RLI) or 2068 (FSI).

This character should be used rather than 202C (PDF).

`,
// CGJ
'\u{034F}': `

Combining grapheme joiner.

Used to produce special ordering of diacritics. The name is a misnomer, as it is generally used to break the normal sequence of diacritics.


`,
}
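/*
A minimal sketch, not part of the original data, of how the isolate and
word-joiner notes above might be applied in code. It wraps a string in
LRI/RLI/FSI ... PDI as described for U+2066 to U+2069, and brackets a character
with U+2060 WORD JOINER to make it non-breaking. The names isolate and noBreak
are hypothetical helpers introduced only for illustration.

```javascript
// Directional isolate controls, as listed in the notes above.
const LRI = '\u2066'; // left-to-right isolate
const RLI = '\u2067'; // right-to-left isolate
const FSI = '\u2068'; // first-strong isolate
const PDI = '\u2069'; // pop directional isolate

// Hypothetical helper: wrap text in an isolate.
// dir is 'ltr' or 'rtl'; any other value falls back to FSI, which lets the
// first strong directional character in the text decide the base direction.
function isolate(text, dir) {
  const start = dir === 'ltr' ? LRI : dir === 'rtl' ? RLI : FSI;
  return start + text + PDI;
}

// U+2060 WORD JOINER prevents a line break on either side of it.
const WJ = '\u2060';

// Hypothetical helper: bracket a character so that no line break can occur
// immediately before or after it.
function noBreak(ch) {
  return WJ + ch + WJ;
}

// Usage: isolate a Sindhi word inside LTR text, and keep "+" unbreakable.
const label = 'Name: ' + isolate('سنڌي', 'rtl');
const expr = 'base' + noBreak('+') + 'delta';
```

The isolate form is used here rather than LRE/RLE + PDF because the notes
above recommend the isolating characters over the embedding ones.
*/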