Arabic character notes

Updated 13 July, 2018 • tags arabic, urdu, scriptnotes.

This page lists characters in the following Unicode blocks and provides information about them.

This is not authoritative, peer-reviewed information – these are just notes I have gathered or copied from various places.

For a summary of the script and its use in various writing systems, see the pages Arabic script summary, Urdu writing system, and Uighur writing system. For similar information related to other scripts, see the Script comparison table.

We have usage data for the following 14 languages that use the Arabic script: Arabic, Central Kurdish, Dari, Persian, Kashmiri, Malay, Mazanderani, Luri, Pashto, Western Panjabi, Sindhi, Saraiki, Urdu, Uyghur.

The transliteration for Arabic language text is based on ISO 233, though extended to remove ambiguities, especially around letters based on aleph. Most of the Latin transcriptions, however, are based on the Library of Congress system for Arabic, and the UN scheme for Persian.

Find

If you click on any red example text, you will see at the bottom right of the page a list of the characters that make up the example.

Read more about how to use this page.

To find a character by codepoint, type #char0000 at the end of the URL in the address bar, where 0000 is a four-figure, hex codepoint number, all in uppercase. Or type the character or the hex number in the Find control above.
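As a small illustration of the anchor format just described, here is a sketch that builds the URL fragment for a character. The function name `char_anchor` is mine, not something the page provides, and it assumes the character is in the Basic Multilingual Plane so that four hex digits are enough.

```python
def char_anchor(ch: str) -> str:
    """Return the '#charXXXX' fragment for a single character (hypothetical helper)."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # Four-figure hex codepoint, uppercase, zero-padded, as the text describes.
    return f"#char{ord(ch):04X}"


print(char_anchor("\u0621"))  # ARABIC LETTER HAMZA -> #char0621
```

Append the result to the page URL in the address bar.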

To view this page as intended, you need Arabic naskh and nastaliq fonts. This page comes with Scheherazade and Noto Nastaliq webfonts. Click the blue vertical bar at the bottom right of the page to apply other fonts, if you have them on your system. For transcriptions I recommend the excellent and free Doulos SIL font. The large character in the box will not be rendered unless the webfont downloaded with the page or a system font has a glyph for it. (If there is no glyph and you want to see what it looks like, click on See in UniView.)

Information about languages that use these characters is taken from the list maintained for the Character Use app. The list is not exhaustive.

References are indicated by superscript characters. Wherever possible, those contain direct links to the source material. When such a pointer is alongside an arrow → it means that it's worth following the link for the additional information it provides. Digits refer to the main sources, which are listed at the bottom of a set of notes.

When you are using UniView and you turn on Show notes, UniView will pull in information about characters from this page.

Based on ISO 8859-6

Base characters

ء

U+0621 ARABIC LETTER HAMZA

→ (modifier letter right half ring - 02BE)

Glottal stop    ʾ

Doesn't join (naskh): ء

Doesn't join (nastaliq): ء

Arabic    ʾ    هَمْزة hamzah

ʔ

For historical reasons this is treated as an orthographic sign rather than as a letter of the alphabet. It sometimes stands alone, but usually appears with a 'carrier' letter - alef, waw, or yeh (أ إ ؤ ئ) for which separate precomposed characters are available in Unicode.

This codepoint is used for representing the standalone hamza only. On its own it has no joining behaviour.

Combined with base characters: When the hamza is above or below another character you should typically use ٔ [U+0654 ARABIC HAMZA ABOVE] with the appropriate base character, although there are a number of exceptions.

Some exceptions arise because the NFC normalization form converts the base character and combining hamza to a precomposed character. These instances include

Other exceptions arise where the hamza is an integral part of the character itself (ie. an ijam). Examples of these characters include

Cutting and joining hamza in orthography: Classical Arabic distinguishes between 'cutting' and 'joining' hamza. 'Cutting' means always pronounced, 'joining' means frequently elided. The joining hamza is of little practical importance in modern Arabic pronounced without the old case endings.

In modern printed Arabic, the hamza is rarely shown when it occurs at the beginning of a word.

The following are simplified rules for use of (cutting) hamza:

The sign indicating a joining hamza is called a wasla (see ٱ [U+0671 ARABIC LETTER ALEF WASLA]).

Persian ʾ همزه hamza ʔ

Eg. ضیاء Ziyā’.

Urdu vowel separator / calendar indicator, hamzā hamzaː

This is the character code for the standalone hamza.

The hamza is also used in conjunction with other characters in Urdu, for which there are precomposed characters that can be used. See ؤ [U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE], ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE], ۓ [U+06D3 ARABIC LETTER YEH BARREE WITH HAMZA ABOVE], and ۂ [U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE].

٫ [U+066B ARABIC DECIMAL SEPARATOR] looks like a hamza, but isn't.

A standalone hamza is sometimes used at the end of words derived from Arabic, though it is usually omitted in modern Urdu publications, eg. ضیاء ziaː light, ذکاء zakaː intelligence.

Vowel junctions: The hamzā is used to indicate the boundaries between vowel sounds when there is no intervening consonant. Depending on the vowels concerned, it is used in a number of different ways, usually combined with other characters.

In some cases this standalone form is used, eg. انشاءاللہ ɪnʃallaː God willing.

See other ways in which vowel junctions are formed when the hamza is combined with other characters.

Calendar indicator: Gregorian dates are indicated by placing ؁ [U+0601 ARABIC SIGN SANAH] below the year digits with the word عیسوی iːsviː Christian era. This is usually abbreviated as a hamza, eg. ۲۰۰۴؁ء.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

آ

U+0622 ARABIC LETTER ALEF WITH MADDA ABOVE

≡ 0627 0653

Vowel    ā

Decomposes to ا + ٓ [U+0627 ARABIC LETTER ALEF + U+0653 ARABIC MADDAH ABOVE]
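The decomposition shown here, and those listed for the hamza carriers elsewhere on this page, can be checked with Python's standard `unicodedata` module. This is only a sketch: the helper name `canonical_parts` and the list of characters are my own choices for illustration.

```python
import unicodedata

# Precomposed characters from this page that have canonical decompositions.
PRECOMPOSED = ["\u0622", "\u0623", "\u0624", "\u0625", "\u0626"]


def canonical_parts(ch: str) -> list[str]:
    """Return the NFD form of ch as a list of 'U+XXXX' strings (hypothetical helper)."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", ch)]


for ch in PRECOMPOSED:
    decomposed = unicodedata.normalize("NFD", ch)
    # NFC should recompose the base + combining mark to the single character.
    round_trips = unicodedata.normalize("NFC", decomposed) == ch
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"{' '.join(canonical_parts(ch))} (NFC round-trips: {round_trips})")
```

Because NFC recomposes these sequences, text that has been normalized to NFC will contain the precomposed character even if the base and combining mark were typed separately.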

Cursive shapes (naskh): ـآ آ

Cursive shapes (nastaliq): ـآ آ

Arabic    ā    أَلِفْ مَدَّة ʼalīf maddah

Used when either of the two following combinations of glottal stop and a vowel appear in a word:

  • ʔaʔ (hamza, short a, hamza) eg. آثار ʔaːθaːr
  • ʔaː (hamza, long a) eg. قرآن qurʔaːn

Normal pronunciation in both cases is ʔaː.

The madda sign is still very often shown in print.

Persian ā ɑː

Used at the beginning of words to represent ɑː, eg. آب āb water.

Urdu consonant, alif madd əlɪf mədd

ɑː (used word initially), eg. آب ɑːb water. Unlike the short vowel diacritics, the diacritic madd is never omitted.

As an exception, it is used in non-initial position in the word for Koran, القرآن.

madd means increasing.

See also ا [U+0627 ARABIC LETTER ALEF].

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

أ

U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE

≡ 0627 0654

Consonant+vowel   

Decomposes to ا + ٔ [U+0627 ARABIC LETTER ALEF + U+0654 ARABIC HAMZA ABOVE]

Cursive shapes (naskh): ـأ أ

Arabic       أَلِفْ هَمْزة ʼalīf hamzah

ʔa, ʔu, ʔ

This character represents the hamza (ء).

At the beginning of a word hamza is always written on an alef carrier, regardless of which vowel it takes. In this case, where the hamza appears above the alef, the vowel could be a or u. Examples: أحمد 'aḥmad, أريد 'urīd.

This character is also used to represent hamza in the middle or at the end of a word. Which of the possible alternative sequences (أ, ؤ or ئ) is used mid-word depends on the vowels preceding and following the hamza. The rules are complicated (and a common source of spelling errors among Arabs).

At the end of a word this character is only used after a short vowel. Examples: سأل sa'al, قرأ qara'.

See ء [U+0621 ARABIC LETTER HAMZA] for more information about hamza. See also إ [U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW], ؤ [U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE], and ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE].

Persian ʔ

Eg. مأمونیه Ma’muniye.

General sources: Unicode, Wikipedia, Daniels

ؤ

U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE

≡ 0648 0654

Consonant+vowel    ʷ

Decomposes to و + ٔ [U+0648 ARABIC LETTER WAW + U+0654 ARABIC HAMZA ABOVE]

Cursive shapes (naskh): ـؤ ؤ

Cursive shapes (nastaliq): ـؤ ؤ

Arabic    ʷ    وَاو هَمْزة wāw hamzah

ʔu, ʔ

This character represents the hamza (ء) in the middle of a word.

In the middle of a word the hamza is almost always written above a carrier letter. Which one depends on the vowels preceding and following the hamza, and the rules are complicated (and a common source of spelling errors among Arabs), eg. مؤمن mu'min (but cf. سأل sa'al, نائم nā'im).

See ء [U+0621 ARABIC LETTER HAMZA] for more information about hamza. See also أ [U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE], إ [U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW], and ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE].

Persian ʔ

Eg. مؤمنآباد Mo’menābād.

Urdu vowel separator+vowel

uː or o immediately after a preceding vowel (see below).

Vowel junctions: The hamzā is used to indicate the boundaries between vowel sounds when there is no intervening consonant. Depending on the vowels concerned, it is used in a number of different ways. It can also have two different shapes, one like the initial form of 'ain and the other more like an italic 's'.

When the second vowel is an uː or o represented by و, the hamzā typically sits directly on top of the و, eg. آؤ ɑːo come; جاؤں ʤɑːũː I may go. Often the hamzā is omitted in this situation.

Many words have the vowel combinations iːo, where hamzā is not typically used, eg. لڑکیوں کا laɽkiːõ kɑː of the girls.

See other ways in which vowel junctions are formed when dealing with other combinations of vowels.

General sources: Unicode, Wikipedia, Daniels, Matthews

إ

U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW

≡ 0627 0655

Consonant+vowel   

Decomposes to ا + ٕ [U+0627 ARABIC LETTER ALEF + U+0655 ARABIC HAMZA BELOW]

Cursive shapes (naskh): ـإ إ

Arabic      

ʔi

This character represents the hamza (ء).

At the beginning of a word hamza is always written on an alef carrier, regardless of which vowel it takes. When it takes an i vowel it is written below the alef. Example: إكرام 'ikrām.

The mid-word and word-final equivalent of this character is ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE].

See ء [U+0621 ARABIC LETTER HAMZA] for more information about hamza. See also أ [U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE], ؤ [U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE], and ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE].

General sources: Unicode, Wikipedia, Daniels

ئ

U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE

≡ 064A 0654

Consonant+vowel    ʸ

Decomposes to ي + ٔ [U+064A ARABIC LETTER YEH + U+0654 ARABIC HAMZA ABOVE]

Cursive shapes (naskh): ئئئ ئ

Cursive shapes (nastaliq): ئئئ ئ

Arabic    ʸ    يَاء هَمْزة yāʼ hamzah

ʔɪ, ʔ

This character represents the hamza (ء) in the middle of a word. When yeh is used as a mid-word carrier it loses its dots.

In the middle of a word the hamza is almost always written above a carrier letter. Which one depends on the vowels preceding and following the hamza, and the rules are complicated (and a common source of spelling errors among Arabs), eg. نائم nā'im (but cf. سأل sa'al, مؤمن mu'min).

See ء [U+0621 ARABIC LETTER HAMZA] for more information about hamza. See also أ [U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE], إ [U+0625 ARABIC LETTER ALEF WITH HAMZA BELOW], and ؤ [U+0624 ARABIC LETTER WAW WITH HAMZA ABOVE].

Persian ʔ

Eg. قائمشهر Qā’emšahr.

Urdu vowel separator / vowel

ɪ or a when following a vowel, eg. کوئلہ koɪlɑː coal; لائن lɑːɪn queue; ہیئت hɛat astronomy. The hamza indicates that this vowel is pronounced separately from the preceding one.

iː, ɛ when used as izafat (see below).

Otherwise functions as a soundless vowel junction indicator ('hamza on its chair').

Vowel junctions: The hamza is used to indicate the boundaries between vowel sounds when there is no intervening consonant. Depending on the vowels concerned, it is used in a number of different ways. It can also have two different shapes, one like the initial form of 'ain and the other more like an italic 's'.

When the second vowel is an iː or e represented by ی or ے, the hamzā 'sits on a chair' before the letter representing the second vowel, eg. کئی kaiː several; تیئیس teiːs twenty-three; کوئی koiː someone; گئے gae they went; گائے gɑːe they sang.

Many words, however, have vowel combinations iːe, where hamzā is not typically used, eg. چلیے ʧaliːe come on.

See other ways in which vowel junctions are formed when dealing with other combinations of vowels.

Izāfat: Izāfat ɪzɑːfat is the name given to the short vowel ɛ used to describe a relationship between two words. It may be translated as 'of', eg. as in 'the Lion of Punjab'.

This sound is mostly represented using zer, but in certain cases can be represented with a combining hamza.

When the preceding word ends in ye ی [U+06CC ARABIC LETTER FARSI YEH], izafat is represented by a combining hamza, eg. ولیٔ کامل valiː ɛ kɑːmɪl perfect saint. Question Should you use ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE], rather than ی [U+06CC ARABIC LETTER FARSI YEH] + combining hamza?
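I can't settle the question here, but one relevant fact can be checked: the two spellings are not canonically equivalent. The sketch below, using Python's standard `unicodedata` module (the constant names are mine), compares the NFC forms of the sequences involved.

```python
import unicodedata

YEH_HAMZA = "\u0626"             # ARABIC LETTER YEH WITH HAMZA ABOVE
ARABIC_YEH_SEQ = "\u064A\u0654"  # ARABIC LETTER YEH + ARABIC HAMZA ABOVE
FARSI_YEH_SEQ = "\u06CC\u0654"   # ARABIC LETTER FARSI YEH + ARABIC HAMZA ABOVE


def nfc(text: str) -> str:
    """Normalize text to NFC."""
    return unicodedata.normalize("NFC", text)


# Arabic yeh + hamza composes to the precomposed character.
print(nfc(ARABIC_YEH_SEQ) == YEH_HAMZA)     # True
# Farsi yeh + hamza does not; it stays as two code points.
print(nfc(FARSI_YEH_SEQ) == YEH_HAMZA)      # False
print(nfc(FARSI_YEH_SEQ) == FARSI_YEH_SEQ)  # True
```

So normalization will not unify the two spellings, and a search for one will not match the other unless the application does extra folding of its own.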

When the preceding word ends in a long vowel, izafat is represented using hamza 'on a chair', ie. ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE], plus ے [U+06D2 ARABIC LETTER YEH BARREE], eg. صدائے بلند sadɑː ɛ buland a high voice; روئے زمین ruː ɛ zamiːn the surface of the ground. Sometimes, however, the hamza is not shown.[2 p99] [11]

There are other ways in which izafat can be formed.

General sources: Unicode, Wikipedia, Daniels, Matthews, Khan

ا

U+0627 ARABIC LETTER ALEF

Consonant+vowel    ˡ

Cursive shapes (naskh): ـا ا

Cursive shapes (nastaliq): ـا ا

Arabic    ˡ    أَلِفْ ’alif

aː, a, ʔa, -

Formally speaking, this letter has no sound of its own. It is really a vowel lengthener and hamza carrier. Its main uses in Arabic orthography are:

It also has one or two minor functions such as in conjunction with tanwin (nunation) (see U+064B ARABIC FATHATAN ً ).

Certain parts of the Arabic verb end in a long u-vowel that is conventionally written with a following alef that has no effect on pronunciation, eg. كتبوا kætæbuː. The alef is omitted if a suffix is added, eg. كتبوها kætæbuː-haa.

Persian ˡ الف ʾalef ʔ, ɔ, æ, -

In word-initial position, one of:

ɑː in the middle or at the end of a word, eg. مار mār snake, بابا bābā father.

Urdu vowel, alif alɪf

a/ɪ/u on its own in word initial position.

iː/e/ɛ word initial, combined with a following ye, ای

uː/o/ɔ word initial, combined with a following vāū, او

ɑː with madd آ, but see آ [U+0622 ARABIC LETTER ALEF WITH MADDA ABOVE] for this.

ʊ/∅ sometimes as part of the Arabic definite article (see below).

ɑː elsewhere, unless part of the Arabic definite article (see below).

The alternative sounds possible in the initial combinations can be disambiguated, when necessary, by the use of combining marks. The combining marks are rarely used in normal text (with the exception of madd shown above). See a table of combining marks for vowels.

Arabic definite article: The pronunciation of ال (alif followed by lām) varies when it represents the Arabic definite article. This affects many words in Urdu that have come from Arabic, in particular names and adverbial expressions.

Often the alif is not pronounced after a short preceding word that ends in a vowel. If the preceding vowel was long, it is shortened in this process. Examples: بالکل bɪlkul (absolutely); فی الحال filhɑːl (at present).

Often the vowel is pronounced ʊ, eg. دارالحکومت dɑːrʊlhʊkuːmat (capital).

(The lam may also not be pronounced. See ل [U+0644 ARABIC LETTER LAM].)

Refs: [1] matthews    [2] delacy

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ب

U+0628 ARABIC LETTER BEH

Consonant    b

Cursive shapes (naskh): ببب ب

Cursive shapes (nastaliq): بـبـب ب

Arabic    b    بَاء bā’    b

Persian b بِ be b

Urdu    be    b

bʰe together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated b, a distinct letter of the alphabet called bhe.

Shape:بھبھبھ بھ
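In terms of code points, each Urdu aspirate on this page is just the base letter followed by U+06BE. A minimal sketch follows; the table only covers the four aspirates described in these notes, and the names `ASPIRATE_BASES` and `aspirate` are my own.

```python
HEH_DOACHASHMEE = "\u06BE"  # ARABIC LETTER HEH DOACHASHMEE

# Base letters for the aspirates described in these notes, keyed by letter name.
ASPIRATE_BASES = {
    "bhe": "\u0628",  # ARABIC LETTER BEH
    "the": "\u062A",  # ARABIC LETTER TEH
    "jhe": "\u062C",  # ARABIC LETTER JEEM
    "dhe": "\u062F",  # ARABIC LETTER DAL
}


def aspirate(base: str) -> str:
    """Return the two-code-point sequence for the aspirated form of base."""
    return base + HEH_DOACHASHMEE


for name, base in ASPIRATE_BASES.items():
    seq = aspirate(base)
    print(name, seq, " ".join(f"U+{ord(c):04X}" for c in seq))
```

Although these count as single letters of the Urdu alphabet, each one is stored as two characters, which matters for things like string length and sorting.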

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ة

U+0629 ARABIC LETTER TEH MARBUTA

Consonant+vowel   

Cursive shapes (naskh): ـة ة

Cursive shapes (nastaliq): ـة ة

Arabic       تاء مربوطة tāʼ marbūṭah

Usually no sound, sometimes t.

Used for historical reasons to write the feminine ending, æ – the dots are borrowed from teh (ت). Pronounced as t in specific grammatical contexts. Example: مدرسة mædræsæ.

This letter is only used in final position. If any suffix is added the ending is spelled with ت [U+062A ARABIC LETTER TEH], eg. مدرستنا mædræsæt-naː.

In modern Arabic it is not uncommon to find the two dots omitted, particularly on masculine proper names that have the feminine ending, eg. طلبة tulbæ.

Persian h, -, ɛ, æ

Arabic fem. t

General sources: Unicode, Wikipedia, Daniels, Farzad

ت

U+062A ARABIC LETTER TEH

Consonant    t

Cursive shapes (naskh): تتت ت

Cursive shapes (nastaliq): تتت ت

Arabic    t    تَاء tā’    t

Persian t تِ te t

Urdu    te    t

tʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated t in Urdu, a distinct letter of the alphabet called the.

Shape:  تھتھتھ تھ.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ث

U+062B ARABIC LETTER THEH

Consonant   

Cursive shapes (naskh): ثثث ث

Cursive shapes (nastaliq): ثثث ث

Arabic       ثَاء thā’    θ

Persian ثِ se

Urdu consonant, se se

s Only occurs in words of Arabic and Persian origin, and is much less common than س [U+0633 ARABIC LETTER SEEN], which is also pronounced s.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ج

U+062C ARABIC LETTER JEEM

Consonant    ǧ

Cursive shapes (naskh): ججج ج

Cursive shapes (nastaliq): ججج ج

Arabic    ǧ    جِيمْ jīm    ʒ

Persian ǧ جیم jim ʤ

Urdu    jīm ʤiːm    ʤ

ʤʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated ʤ in Urdu, a distinct letter of the alphabet called jhe.

Shape:  جھجھجھ جھ.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ح

U+062D ARABIC LETTER HAH

Consonant   

Cursive shapes (naskh): ححح ح

Cursive shapes (nastaliq): ححح ح

Arabic       حَاء ḥā’    ħ

Persian حِ he h, -

Urdu    baṛī he baɽiː he    h

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

خ

U+062E ARABIC LETTER KHAH

Consonant   

Cursive shapes (naskh): خخخ خ

Cursive shapes (nastaliq): خخخ خ

Arabic       خَاء khā’    x

Persian خِ xe x

Urdu    xe xe    x

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

د

U+062F ARABIC LETTER DAL

Consonant    d

Cursive shapes (naskh): ـد د

Cursive shapes (nastaliq): ـد د

Arabic    d    دَالْ dāl    d

Persian d دال dāl d

Urdu    dāl dɑːl    d

dʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated d in Urdu, a distinct letter of the alphabet called dhe.

Shape:ـدھ دھ.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ذ

U+0630 ARABIC LETTER THAL

Consonant   

Cursive shapes (naskh): ـذ ذ

Cursive shapes (nastaliq): ـذ ذ

Arabic       ذَال dhāl    ð

Persian ذال zāl z

Urdu    zɑːl    z

In Urdu, this letter only occurs in words of Arabic and Persian origin, and is much less common than ز [U+0632 ARABIC LETTER ZAIN], which is also pronounced z. It is not counted as a regular letter of the Urdu alphabet.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ر

U+0631 ARABIC LETTER REH

Consonant    r

Cursive shapes (naskh): ـر ر

Cursive shapes (nastaliq): ـر ر

Arabic    r    رَاء rā’    r

Persian r رِ re r

Urdu    re re    r (pronounced with a trill).

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ز

U+0632 ARABIC LETTER ZAIN

Consonant    z

Cursive shapes (naskh): ـز ز

Cursive shapes (nastaliq): ـز ز

Arabic    z    زَاي zayn    z

Persian z زِ ze z

Urdu    ze ze

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

س

U+0633 ARABIC LETTER SEEN

Consonant    s

Cursive shapes (naskh): سسس س

Cursive shapes (nastaliq): سسس س

Arabic    s    ِسِينْ sīn    s

Persian s سین sin s

Urdu    sīn siːn    s

In Urdu nastaliq text this can have two somewhat different shapes. The main part of the shape may be a wavy line, a little like a 'w', or can sometimes be a single swash – especially when two seen characters are written together. Use the same character for both visual forms. When one or other of the possible shapes is desired, this should be produced by the font.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ش

U+0634 ARABIC LETTER SHEEN

Consonant    š

Cursive shapes (naskh): ششش ش

Cursive shapes (nastaliq): ششش ش

Arabic    š    شِينْ shīn    ʃ

Persian š شین šin ʃ

Urdu    šīn ʃiːn    ʃ

Shape: ششش ش In Urdu nastaliq text this can have two somewhat different shapes. The main part of the shape may be a wavy line, a little like a 'w', or can sometimes be a single swash – especially when two sheen characters are written together. Use the same character for both visual forms. When one or other of the possible shapes is desired, this should be produced by the font.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ص

U+0635 ARABIC LETTER SAD

Consonant   

Cursive shapes (naskh): صصص ص

Cursive shapes (nastaliq): صصص ص

Arabic       صَادْ ṣād   

Persian صاد sād s

Urdu    svād svɑːd

s Only used in words of Arabic origin.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ض

U+0636 ARABIC LETTER DAD

Consonant   

Cursive shapes (naskh): ضضض ض

Cursive shapes (nastaliq): ضضض ض

Arabic       ضَاد ḍād   

Persian ضاد zād z

Urdu    zvād zvɑːd

z Only used in words of Arabic origin.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ط

U+0637 ARABIC LETTER TAH

Consonant   

Cursive shapes (naskh): ططط ط

Cursive shapes (nastaliq): ططط ط

Arabic       طَاء ṭā’   

Persian طی ṭā t

Urdu    toe toe

t Only used in words of Arabic origin.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ظ

U+0638 ARABIC LETTER ZAH

Consonant   

Cursive shapes (naskh): ظظظ ظ

Cursive shapes (nastaliq): ظظظ ظ

Arabic       ظَاء ẓā’   

Persian ظی z

Urdu    zoe zoe

z Only used in words of Arabic origin.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ع

U+0639 ARABIC LETTER AIN

→ (latin small letter ezh reversed - 01B9)
→ (modifier letter left half ring - 02BF)

Consonant    ʿ

Cursive shapes (naskh): ععع ع

Cursive shapes (nastaliq): ععع ع

Arabic    ʿ    عَين ‘ayn    ʕ

Persian ʿ عین ʿeyn ʔ, -

Preceding V → Vː

Urdu    'ain ain.

Not pronounced when preserved in Arabic words.

If it occurs at the beginning of a word, it can fulfill a similar role to alif, allowing words to begin with a vowel, but also allowing for alternative spellings for different words with the same pronunciation, eg. عرب arab (Arab) vs. ارب arab (necessity).

Note that a word-initial ɑː sound when the spelling begins with alif is written as alif with madd, eg. آج ɑːʤ (today). The same word-initial sound with 'ain is represented by 'ain followed by alif, eg. عادت ɑːdat (habit).

In non-word-initial positions an ain can cause a change in sound to preceding short vowels. This results in long vowels, but not always the long form typically associated with a given short form.

  • a short a becomes ɑː, eg. بعد bɑːd (after).
  • a short ɪ becomes e, eg. سعر ser (verse).
  • a short ʊ becomes o, eg. شعلہ ʃolɑː (flame).

ʔ occasionally between two vowels, although this is often lost in Urdu, eg. معاف mʊʔɑːf or mɑːf (forgiven); سعآدت səʔɑːdət or sɑːdət (fortunate).

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

غ

U+063A ARABIC LETTER GHAIN

Consonant    ġ

Cursive shapes (naskh): غغغ غ

Cursive shapes (nastaliq): غغغ غ

Arabic    ġ    غَين ghayn    ɣ

Persian ġ غین ġeyn ɣ

Between vowels q, ɢ, x

Urdu consonant, ghain ɣain

ɣ Used in words that came into Urdu from Arabic and Persian.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ف

U+0641 ARABIC LETTER FEH

Consonant    f

Cursive shapes (naskh): ففف ف

Cursive shapes (nastaliq): ففف ف

Arabic    f    فَاء fā’    f

In Arabic material printed in North Africa this letter sometimes has one dot below. There is a separate code point for that (ڢ [U+06A2 ARABIC LETTER FEH WITH DOT MOVED BELOW]), but it would make more sense to use a font to make this difference than to use a different character.

Persian f فِ fe f

Urdu    fe fe    f

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ق

U+0642 ARABIC LETTER QAF

Consonant    q

Cursive shapes (naskh): ققق ق

Cursive shapes (nastaliq): ققق ق

Arabic    q    قَاف qāf    q

In Arabic material printed in North Africa this letter sometimes has only one dot above. There is a separate code point for that (ڧ [U+06A7 ARABIC LETTER QAF WITH DOT ABOVE]), but it would make more sense to use a font to make this difference than to use a different character.

Persian q قاف qāf q, ɢ

Urdu    qāf qɑːf    q

Used in words that came into Urdu from Arabic and Persian.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ك

U+0643 ARABIC LETTER KAF

Consonant    k

Cursive shapes (naskh): ككك ك

Cursive shapes (nastaliq): ككك ك

Arabic    k    كَاف kāf    k

Persian    k

General sources: Unicode, Wikipedia, Daniels, Farzad

Not used in Urdu. See ک [U+06A9 ARABIC LETTER KEHEH].

ل

U+0644 ARABIC LETTER LAM

Consonant    l

Cursive shapes (naskh): للل ل

Cursive shapes (nastaliq): للل ل

Arabic    l    لاَمْ lām    l

Persian l لام lām l

Urdu    lām lɑːm

l

when part of the Arabic definite article (see below).

Combined with a following alif, lām is usually written as لا, eg. گلاس gilɑːs (glass). Sometimes, however, especially in words of Arabic origin such as the equivalent of the English prefix 'un-', the more Arabic form لا is used, eg. لاعلاج lɑːʕilɑːʒ (incurable).

Note that I don't know a way to make this example work with a single font. To produce it I had to mix two different fonts. There may be a special font setting that allows you to control this.

Arabic definite article: The pronunciation of ال (alif followed by lām) varies when it represents the Arabic definite article. This affects many words in Urdu that have come from Arabic, in particular names and adverbial expressions.

The lām is not pronounced if it precedes one of the following characters: ت [U+062A] te, ث [U+062B] se, د [U+062F] dāl, ذ [U+0630] zāl, ر [U+0631] re, ز [U+0632] ze, س [U+0633] sīn, ش [U+0634] šīn, ص [U+0635] svād, ض [U+0636] zvād, ط [U+0637] toe, ظ [U+0638] zoe, ل [U+0644] lām, ن [U+0646] nūn. Instead, the following sound is doubled. A tašdīd (ـّ [U+0651 ARABIC SHADDA]) may sometimes be used to indicate this. Example: السلام علیکم asːalɑːm alaikum (greetings).
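The rule above can be sketched as a simple lookup. This is only an illustration under a big assumption: that the word passed in really does begin with the definite article. The code cannot tell the article apart from a word that merely happens to start with alif and lām, and it ignores articles embedded inside a longer word. The names `ASSIMILATING` and `lam_is_silent` are mine.

```python
# Letters before which the lām of the article is not pronounced (from the list above).
ASSIMILATING = set(
    "\u062A\u062B\u062F\u0630\u0631\u0632\u0633"
    "\u0634\u0635\u0636\u0637\u0638\u0644\u0646"
)
ALEF = "\u0627"  # ARABIC LETTER ALEF
LAM = "\u0644"   # ARABIC LETTER LAM


def lam_is_silent(word: str) -> bool:
    """True if word starts with alif + lām followed by one of the listed letters.

    Assumes the initial alif + lām is the definite article.
    """
    if len(word) < 3 or word[0] != ALEF or word[1] != LAM:
        return False
    return word[2] in ASSIMILATING


print(lam_is_silent("\u0627\u0644\u0633\u0644\u0627\u0645"))  # as-salām: True
print(lam_is_silent("\u0627\u0644\u062D\u0627\u0644"))        # al-hāl: False
```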

There may also be effects on the sound of the alif. See ا [U+0627 ARABIC LETTER ALEF].

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

م

U+0645 ARABIC LETTER MEEM

Consonant    m

Cursive shapes (naskh): ممم م

Cursive shapes (nastaliq): ممم م

Arabic    m    مِيمْ mīm    m

Persian m میم mim m

Urdu    mīm miːm    m

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ن

U+0646 ARABIC LETTER NOON

Consonant    n

Cursive shapes (naskh): ننن ن

Cursive shapes (nastaliq): ننن ن

Arabic    n    نُون nūn    n

Persian n نون nun n

Urdu    nūn nuːn

n, eg. انسان insãːn human being.

◌̃ Within a word it may signal that the preceding vowel is nasalised, rather than representing an n sound, eg. ٹانگ tãːg leg. Sometimes, ـ٘ [U+0658 ARABIC MARK NOON GHUNNA] is used above this letter in such cases, to clarify that its function is nasalisation. Nasalisation at the end of a word is signalled using ں [U+06BA ARABIC LETTER NOON GHUNNA], eg. ماں mãː, mother, کروں karũː, I may do. 1

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad, Pournader

ه

U+0647 ARABIC LETTER HEH

Consonant    h

Cursive shapes (naskh): ههه ه

Cursive shapes (nastaliq): ههه ه

Arabic    h    هَاء hā’    h

The marker for hijri dates is an initial form of heh, even though it doesn't join to the left, ie. ه‍. For this, use a U+200D ZERO WIDTH JOINER immediately after the heh, eg. الاثنين 10 رجب 1415 ه‍.. In some cases ـ [U+0640 ARABIC TATWEEL] is used to ensure that the shape looks right, because some applications or fonts don't produce the right effect when using the ZWJ, eg. الاثنين 10 رجب 1415 هـ..
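The two ways of forcing the initial shape can be written out as code point sequences. A minimal sketch; `hijri_marker` is a hypothetical helper name, and whether the result renders correctly still depends on the font and application, as noted above.

```python
HEH = "\u0647"      # ARABIC LETTER HEH
ZWJ = "\u200D"      # ZERO WIDTH JOINER
TATWEEL = "\u0640"  # ARABIC TATWEEL


def hijri_marker(use_tatweel: bool = False) -> str:
    """Return heh followed by ZWJ (preferred) or tatweel (fallback)."""
    return HEH + (TATWEEL if use_tatweel else ZWJ)


for marker in (hijri_marker(), hijri_marker(use_tatweel=True)):
    print(" ".join(f"U+{ord(c):04X}" for c in marker))
```

The ZWJ version adds no visible ink of its own, whereas the tatweel version adds a short visible extension to the baseline.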

Persian h هِ he do-češm h, -, ɛ, æ, e

ɛ in word-final position after a consonant, eg. نَامِه nāme letter. In one common word (the informal version of no) in the educated Tehran dialect, it represents æ. In rural and regional dialects æ is more common.

h as a consonant, otherwise, eg. خواهش xāheš desire. Word-finally after a vowel, and in monosyllabic words, it is pronounced as a consonant, eg. مَاه māh month, moon, نُه noh nine.

When written without an intervening space, a suffix such as گاه ɡāh (place, time) or ها -hā (plural) doesn't join with this character when it is used as ezafe, eg. خانه‌ها xāne-hā houses. To achieve this, you need to use U+200C ZERO WIDTH NON-JOINER.
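Here is a small sketch of how the non-joining suffix is built in terms of code points. The helper name `add_suffix_unjoined` is my own, and the example uses the word for houses given above.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER


def add_suffix_unjoined(stem: str, suffix: str) -> str:
    """Attach suffix to stem with a ZWNJ between them, so the two don't join."""
    return stem + ZWNJ + suffix


khane = "\u062E\u0627\u0646\u0647"  # xāne (house)
ha = "\u0647\u0627"                 # -hā (plural)

word = add_suffix_unjoined(khane, ha)
print(word)
print(" ".join(f"U+{ord(c):04X}" for c in word))
```

The result is a single word with no space in it, so it should not be split when text is broken into words on whitespace.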

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

و

U+0648 ARABIC LETTER WAW

Consonant+vowel    w

Cursive shapes (naskh): ـو و

Cursive shapes (nastaliq): ـو و

Arabic    w    وَاو wāw    w, u

w consonant or lengthener of u-vowel.

In certain foreign words, pronounced more like oː, eg. بنطلون bænt̴æloːn.

The male proper name عمرو ʕæmr is written with an unpronounced final waw to distinguish it from the name عمر ʕumar that would otherwise be written identically.

Persian v واو vāv v, u, o, ow, -

At the start of a word, one of:

  • v, as a consonant, eg. وقت vaqt time.
  • u, when preceded by ا [U+0627 ARABIC LETTER ALEF], eg. او u he, she. This is not a common combination in Persian. In vowelled text no diacritic is written with the alef in this case.
  • w as part of a diphthong, when following an ا [U+0627 ARABIC LETTER ALEF], that carries a diacritic in vowelled text, eg. اُوج owj zenith.

In the middle of a word, one of:

  • u, as a 'long' vowel, eg. نور nuːr nur light. When used in vowelled text no diacritic is associated with the previous consonant.
  • w after another short vowel, used for a few common words, eg. پُلُو polo pilau rice, including when preceded by a nominal ـَ [U+064E ARABIC FATHA​].
  • v, as a consonant (typically the case before alef), eg. اِیوان eyvān veranda.
  • after خ and before a long vowel, eg. خوابیدن xābidan to sleep, خواهر xāhar sister. This is a spelling holdover from classical Persian.

At the end of a word, one of:

  • u after a consonant, eg. زانو zānu knee. No diacritic needed in vowelled text.
  • w after a vowel, eg. دُو do two, or پُلُو polo pilau rice.
  • v after another long vowel, eg. گاو ɡāv cow.

Urdu    vāū vɑːuː

β as consonant, eg. والد vaːlɪd (father), نومبر navambar (November).

uː or o or ɔ as a vowel, whether word initial after alif, او, or elsewhere on its own, eg. اوپر uːpər (above); لوگ log (people); شوق ʃɔq (keenness). The alternative vowel sounds can be disambiguated, when necessary, by the use of combining marks. The combining marks are rarely used in normal text. See a table of combining marks for vowels.

in a number of words of Persian origin beginning with خوا, eg. خواب xɑːb (dream).

ʊ in two very common words: خود xʊd (self), and خوش xʊʃ (happy).

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ى

U+0649 ARABIC LETTER ALEF MAKSURA

• represents YEH-shaped dual-joining letter with no dots in any positional form
• not intended for use in combination with 0654
→ (arabic letter yeh with hamza above - 0626)

Vowel   

Cursive shapes (naskh): ىىى ى

Cursive shapes (nastaliq): ىىى ى

Arabic       ألف مقصورة alif maqṣūrah    ɑː

ɑː, eg. مستشفى mustaʃfɑː hospital.

The long a-vowel at the end of many words is written with yeh instead of an alef. In this case the yeh is typically printed without dots, to avoid confusion, although this is not universal. This spelling only occurs with certain words, and only when the final sound is aː, eg. معنى mæʕnaː. If any suffix is added, the spelling reverts to the normal alef, eg. معناهم mæʕnaː-hum.

Vowelled text may omit the short æ diacritic before the teh marbuta, because the sound is always the same.

Older fonts may not show dual joining.

General sources: Unicode, Wikipedia, Daniels

Not used in Persian or Urdu. See ی [U+06CC ARABIC LETTER FARSI YEH].

ي

U+064A ARABIC LETTER YEH

• loses its dots when used in combination with 0654
• retains its dots when used in combination with other combining marks
→ (arabic letter yeh with two dots below and hamza above - 08A8)

Consonant+vowel    y

Cursive shapes (naskh): ييي ي

Cursive shapes (nastaliq): ييي ي

Arabic    y    يَاء yā’

j, iː

In certain foreign words, pronounced more like eː, eg. سكرتير sɪkrɪteːr.

Use with hamza: When used with ٔ [U+0654 ARABIC HAMZA ABOVE] the two dots are suppressed in all positions. Text in NFC actually uses ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE] rather than the decomposed sequence, so that is recommended.
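
The NFC behaviour described here can be checked with Python's standard `unicodedata` module. This is only an illustrative sketch; the variable names are my own.

```python
import unicodedata

# Decomposed sequence: YEH (U+064A) followed by combining HAMZA ABOVE (U+0654).
decomposed = "\u064A\u0654"
# Precomposed character: YEH WITH HAMZA ABOVE (U+0626).
precomposed = "\u0626"

# NFC composes the sequence into the single precomposed code point.
nfc = unicodedata.normalize("NFC", decomposed)
# NFD splits the precomposed character back into base letter + combining mark.
nfd = unicodedata.normalize("NFD", precomposed)

print([f"U+{ord(c):04X}" for c in nfc])   # ['U+0626']
print([f"U+{ord(c):04X}" for c in nfd])   # ['U+064A', 'U+0654']
```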

Unlike this character, ࢨ [U+08A8 ARABIC LETTER YEH WITH TWO DOTS BELOW AND HAMZA ABOVE] retains the two dots in all forms; however, it also represents a semantically different character.

General sources: Unicode, Wikipedia, Daniels

Not used in Persian or Urdu. See ی [U+06CC ARABIC LETTER FARSI YEH].

Points

ً

U+064B ARABIC FATHATAN

Vowel    aⁿ

Arabic    aⁿ    fatḥatān    æn

In classical Arabic, indefinite nouns and adjectives were marked by a final n-sound, called تنوين tænwiːn or, in English, 'nunation'. This is normally indicated by doubling the vowel diacritic. On a word ending with an a-vowel (though not with a feminine ending or some other suffixes) an extra alef was also added at the end of the word. In modern Arabic printing the fathatan is usually dropped, but the alef is retained. The pronunciation of the ending æn is also retained in many words. Examples: كِتَابًا kɪtæːbæn, فَرَسًا færæsæn.

Persian aⁿ تنوین نصب tanvin e nasban

Urdu    an

This is a doubled zabar (ـَ [U+064E ARABIC FATHA]). These marks appear at the ends of certain Arabic adverbs. The doubled zabar is the most common of the three marks of this type. Although the mark appears over an alif the vowel sound is short. Examples: یقیناً yaqiːnan (certainly); مثلاً masalan (for example).

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ٌ

U+064C ARABIC DAMMATAN

• a common alternative form is written as two intertwined dammas, one of which is turned 180 degrees

Vowel    uⁿ

Arabic    uⁿ    ḍammatān    un

In classical Arabic, indefinite nouns and adjectives were marked by a final n-sound, called تنوين tænwiːn or, in English, 'nunation'. This is normally indicated by doubling the vowel diacritic. Example: جَبَلٌ or جَبَلُ ُ ʒæbælun.

Not usually shown in modern text (exceptions in the Koran, difficult older texts, and children's schoolbooks).

Persian uⁿ تنوین رفع tanvin e rafeun

Urdu    un

Doubled peš (ـُ [U+064F ARABIC DAMMA] ).

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ٍ

U+064D ARABIC KASRATAN

Vowel    iⁿ

Arabic    iⁿ    kasratān    ɪn

In classical Arabic, indefinite nouns and adjectives were marked by a final n-sound, called تنوين tænwiːn or, in English, 'nunation'. This is normally indicated by doubling the vowel diacritic. Example: قَلَمٍ qælæmɪn.

Not usually shown in modern text (exceptions in the Koran, difficult older texts, and children's schoolbooks).

Persian iⁿ تنوین جرّ tanvin e jarrin

Urdu    in

Doubled zer (ـِ [U+0650 ARABIC KASRA] ).

َ

U+064E ARABIC FATHA

Vowel    a

Arabic    a    فَتْحَة fatḥah   

æ, or a after ص ض ط ظ غ ق and sometimes after خ ر ل, eg. كَتَبَ kætæbæ. Actual pronunciation varies with context.

Not usually shown in text (exceptions tend to be the Koran, difficult older texts, and children's schoolbooks)
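
Because the points are usually absent from ordinary text, software that searches or compares words often removes them from vowelled text first. The following is a minimal sketch of that idea, not a complete solution: the function name `strip_points` is hypothetical, and the range covers only the eight marks U+064B to U+0652 listed in this Points section (it deliberately ignores superscript alef, maddah, hamza and Koranic marks).

```python
# The eight points covered here: U+064B FATHATAN through U+0652 SUKUN.
POINTS = {chr(cp) for cp in range(0x064B, 0x0653)}

def strip_points(text):
    """Return text with tanwin, short-vowel marks, shadda and sukun removed."""
    return "".join(ch for ch in text if ch not in POINTS)

vowelled = "\u0643\u064E\u062A\u064E\u0628\u064E"   # كَتَبَ with three fathas
plain = strip_points(vowelled)                       # كتب
print(plain, len(vowelled), len(plain))
```

Note that removing the shadda discards gemination information, so whether to include it in the set is a design choice that depends on the application.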

Persian a زِبَر zebar

Rarely used; only where pronunciation needs to be spelled out.

a when used word medially without one of ا, و, or ی, eg. تَبَر tabar axe.

Urdu    zabar zəbər

Rarely used; only where pronunciation needs to be spelled out. Indicates a vowel following its base character. zabar means above.

ə above a consonant, eg. بَب bəb. At the beginning of a word it appears above alif or 'ain, eg. اَب əb (now), and عَرَب ərəb (Arab).

When the base consonant is followed by certain other letters, zabar represents different sounds, as shown below:

  • ɑː when followed by alif, silent choṭī he, or 'ain, eg. بَاغ bɑːɣ (garden), مکَہ makːɑː (Mecca), and بَعد bɑːd (after).

  • ɛ when followed by je (both forms), eg. جَیسا ʤɛsɑː (as), اَیسا ɛsɑː (such), and ہَے hɛ (is).

  • ɛ when followed by choṭī he or baṛī he, eg. اَحمد ɛhmad (Ahmed), and رَہنا rɛhnɑː (to remain).

  • ɔ when followed by vɑːuː, eg. شَوق ʃɔq (keenness), and اَور ɔr (and).

See a table of combining marks for vowels.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ُ

U+064F ARABIC DAMMA

Vowel    u

Arabic    u    ضَمَّة ḍammah   

u, eg. كُتُب (kutub). Actual pronunciation varies with context.

Not usually shown in text (exceptions tend to be the Koran, difficult older texts, and children's schoolbooks).

Persian u پیش piš

Rarely used; only where pronunciation needs to be spelled out.

o when used word medially without one of ا, و, or ی, eg. شُتُر šotor camel.

Urdu    peš peʃ.

Rarely used; only where pronunciation needs to be spelled out. Indicates a vowel following its base character. peš means forward.

ʊ above a consonant, eg. بُب bʊb. At the beginning of a word it appears above alif or 'ain, eg. اُب ʊb.

When the base consonant is followed by certain other letters, peš represents different sounds, as shown below:

  • uː when followed by vɑːuː, eg. پُورا puːrɑː (full), and اُوپر uːpar (above).

  • o when followed by 'ain, eg. شُعلہ ʃolɑː (flame), and توقُّع tavaqːo (hope).

  • ɔ when followed by ʧʰoʈiː he or baṛī he, eg. شُہرت ʃɔhrat (fame), and توجُّہ tavajːɔh (attention).

ʊ, rather than a long vowel, in two very common words with a following vɑːuː: خُود xʊd (self), and خُوش xʊʃ (happy).

The word وہ vo (that, he, she, it) is irregular.

See a table of combining marks for vowels.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ِ

U+0650 ARABIC KASRA

Vowel    i

Arabic    i    كَسْرَة kasrah    i

ɪ, eg. بِهِ bɪhɪ. Actual pronunciation varies with context.

Not usually shown in text (exceptions tend to be the Koran, difficult older texts, and children's schoolbooks).

Persian i زیر zir

Rarely used; only where pronunciation needs to be spelled out.

e when used word medially without one of ا, و, or ی, eg. کِرم kerm worm.

Urdu    zer zer

Rarely used; only where pronunciation needs to be spelled out. Indicates a vowel following its base character. zer means below.

ɪ below a base consonant, eg. بِب bɪb. At the beginning of a word it appears below alif or 'ain, eg. اِتْنَا ɪtnɑː (so much) and عِلْم ɪlm (knowledge).

When the base consonant is followed by certain other letters, zer represents different sounds, as shown below:

  • iː when followed by je, eg. سِینہ siːnɑː (breast), and اِیمان iːmɑːn (faith).

  • e when followed by ain, eg. شِعر ʃer (verse), and واقِع vɑːqe (situated).

  • ɛ when followed by ʧʰoʈiː he or baɽiː he, eg. مِہربانی mɛhrbɑːniː (kindness), and واضِح vɑːzɛh (clear).

See a table of combining marks for vowels.

ɪzāfat ɪzɑːfat is the name given to the short vowel ɛ when used to describe a relationship between two words. It may be translated 'of', as in 'the Lion of Punjab'.

This sound is mostly represented using zer. Sometimes, however, the combining mark is not shown, even though it is pronounced. Examples: شیرِ پنجاب ʃer ɛ panʤɑːb (Lion of Punjab); طالبِ علم tɑːlɪb ɛ ɪlm (seeker of knowledge, ie. a student).

There are other ways in which ɪzāfat can be formed.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ّ

U+0651 ARABIC SHADDA

Gemination mark   

Arabic       شَدَّة shaddah   

Diacritic that doubles the length of the supporting consonant, eg. رتّب rætːæbæ. Visible in arabic printing, but not always marked consistently.

A common, though not universal, practice is to display any combining kasra below the shadda, rather than below the base consonant, eg. قَبِّل qæbːɪl. Some fonts, such as Amiri, don't do this.
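
A related point concerns how the two marks are stored rather than how they are drawn. In the Unicode character database shadda has canonical combining class 33 and kasra has class 32, so normalization places the kasra before the shadda whatever order they were typed in. The sketch below, using Python's standard `unicodedata` module, illustrates this; the variable names are my own, and it says nothing about how a given font will position the marks.

```python
import unicodedata

SHADDA = "\u0651"
KASRA = "\u0650"

# Canonical combining classes as reported by the Unicode character database.
shadda_class = unicodedata.combining(SHADDA)
kasra_class = unicodedata.combining(KASRA)

# قَبِّل typed as: qaf, fatha, beh, shadda, kasra, lam.
typed = "\u0642\u064E\u0628" + SHADDA + KASRA + "\u0644"
# Canonical reordering sorts adjacent marks by combining class.
normalized = unicodedata.normalize("NFC", typed)

print(shadda_class, kasra_class)
print([f"U+{ord(c):04X}" for c in normalized])
```

Comparing strings after normalization therefore avoids false mismatches between texts that differ only in the order in which these two marks were entered.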

The sign derives from a miniature nucleus of seen, without dots.

Persian تشدید tašdid

Doubles the sound of the base consonant.

Urdu    tašdīd taʃdiːd.

Doubles the sound of the base consonant, eg. ستّر sattar seventy. More often than not, this is not written.

tašdīd means strengthening.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ْ

U+0652 ARABIC SUKUN

• marks absence of a vowel after the base consonant
• used in some Korans to mark a long vowel as ignored
• can have a variety of shapes, including a circular one and a shape that looks like '06E1'
→ (arabic small high dotless head of khah - 06E1)

Vowel absence marker    ̊

Arabic    ̊    سُكُون sukūn

Indicates that no vowel follows the consonant to which this is attached, eg. مَكْتَب maktab.

Not usually shown in text (exceptions tend to be the Koran, difficult older texts, and children's schoolbooks).

Persian ̊ سکون sokun

Rarely used; indicates absence of a vowel between consonants.

Urdu    sukūn sukuːn or jazm ʤazm.

Rarely used; indicates absence of a vowel between consonants, eg. سَخْت saxt (hard).

It has various possible forms, including a small round circle, something that looks like peʃ, and something like a circumflex. (There is another Unicode character that provides an alternative visual form, ـۡ [U+06E1 ARABIC SMALL HIGH DOTLESS HEAD OF KHAH], but it is better to use this character and select the variant required using a font.)

This diacritic is not normally written above the final character in a word, because as a rule a short vowel is not pronounced in this position.

Sukūn is an Arabic word meaning rest or pause.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

Combining characters

Combining maddah and hamza

ٓ

U+0653 ARABIC MADDAH ABOVE

For maddah combined with alef, use آ [U+0622 ARABIC LETTER ALEF WITH MADDA ABOVE]. [1]

Maddah may be used in Koranic text to lengthen waw or yeh. [1]

Some other non-Arabic orthographies also use maddah as an ijam, and this codepoint should be used in those situations. [1]

Refs: [1] [unicode9] p370   

ٔ

U+0654 ARABIC HAMZA ABOVE

• restricted to hamza and ezafe semantics
• is not used as a diacritic to form new letters

Glottal stop    ͗

Arabic    ͗    ʔ

The hamza sometimes stands alone (see ء [U+0621 ARABIC LETTER HAMZA]), but usually appears with a 'carrier' letter - alef, waw, or yeh (أ إ ؤ ئ) for which separate precomposed characters are available in Unicode.

Combined with base characters: When the hamza is above or below another character you should typically use this character with the appropriate base character; however, there are a number of exceptions where you would not normally use this character.

Some exceptions arise because the NFC normalization form converts the base character and combining hamza to a precomposed character. These instances include

Other exceptions arise where the hamza is an integral part of the character itself (ie. an ijam). Examples of these characters include

Cutting and joining hamza in orthography: Classical Arabic distinguishes between 'cutting' and 'joining' hamza. 'Cutting' means always pronounced, 'joining' means frequently elided. The joining hamza is of little practical importance in modern Arabic pronounced without the old case endings.

In modern printed Arabic, the hamza is rarely shown when it occurs at the beginning of a word.

The following are simplified rules for use of (cutting) hamza:

The sign indicating a joining hamza is called a wasla (see ٱ [U+0671 ARABIC LETTER ALEF WASLA]).

Persian ͗ ʔ

ye when used to express the ezafe over ه [U+0647 ARABIC LETTER HEH] when that character is used to represent a final, short e, eg. خانهٔ بزرگ xāne-ye bozorɡ big house.

Urdu used for izafat, ɪzɑːfat

ɛ when used as izafat.

Izāfat: Izāfat ɪzɑːfat is the name given to the short vowel ɛ used to describe a relationship between two words. It may be translated 'of', as in 'the Lion of Punjab'. This sound is mostly represented using zer, but in certain cases can be represented with a combining hamza.

When the preceding word ends in ye ی [U+06CC ARABIC LETTER FARSI YEH], izafat is represented by a combining hamza, eg. ولیٔ کامل valiː ɛ kɑːmɪl perfect saint. Question: Should you use ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE] (or its decomposed form, which includes this character), rather than ی [U+06CC ARABIC LETTER FARSI YEH] + combining hamza?
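
One data point that bears on this question, without settling it, is how the two spellings behave under normalization. The sketch below uses Python's standard `unicodedata` module (constant names are my own) to show that Farsi yeh plus combining hamza stays as two code points under NFC, whereas Arabic yeh plus combining hamza composes to U+0626. The two spellings are therefore not canonically equivalent, and text that mixes them will not compare as equal.

```python
import unicodedata

FARSI_YEH = "\u06CC"
ARABIC_YEH = "\u064A"
HAMZA_ABOVE = "\u0654"
YEH_WITH_HAMZA = "\u0626"

# U+06CC + U+0654 has no precomposed form, so NFC leaves it as a sequence.
farsi_seq = unicodedata.normalize("NFC", FARSI_YEH + HAMZA_ABOVE)
# U+064A + U+0654 is the canonical decomposition of U+0626, so NFC composes it.
arabic_seq = unicodedata.normalize("NFC", ARABIC_YEH + HAMZA_ABOVE)

print(len(farsi_seq), farsi_seq == YEH_WITH_HAMZA)    # 2 False
print(len(arabic_seq), arabic_seq == YEH_WITH_HAMZA)  # 1 True
```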

When the preceding word ends in a silent choṭī he ہ [U+06C1 ARABIC LETTER HEH GOAL], izafat is also represented by a combining hamza, eg. قطرۂ آب qatra ɛ ɑːb drop of water. There is a precomposed character for this combination, ۂ [U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE]. Note that if the choṭī he is pronounced, then zer is used, eg. آہِ گرم āh-e garm hot sigh.

When the preceding word ends in a long vowel, izafat is represented using hamza 'on a chair', ie. ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE], plus ے [U+06D2 ARABIC LETTER YEH BARREE], eg. صدائے بلند sadɑː ɛ buland a high voice; روئے زمین ruː ɛ zamiːn the surface of the ground. Sometimes, however, the hamza is not shown.[2 p99] [11]

There are other ways in which izafat can be formed.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ٕ

U+0655 ARABIC HAMZA BELOW

Glottal stop    ̹

Arabic    ͗      ʔ

Persian ̹ ʔ

Other combining marks

ٖ

U+0656 ARABIC SUBSCRIPT ALEF

Urdu mark

Used to indicate that the vowel is iː or i rather than e, eg. نُحْیٖ.

This diacritic is not usually needed, and serves only to emphasise that the vowel is long.

ٗ

U+0657 ARABIC INVERTED DAMMA

= ulta pesh
• Kashmiri, Urdu

Urdu mark

Used to indicate that the vowel is uː or ʊ rather than ɔ, eg. حبل حلالہٗ.

This diacritic is not usually needed, and serves only to emphasise that the vowel is long.

٘

U+0658 ARABIC MARK NOON GHUNNA

• Baluchi
• indicates nasalization in Urdu

Urdu mark

Nasalisation of Urdu vowels is normally indicated by ں [U+06BA ARABIC LETTER NOON GHUNNA] at the end of a word (eg. کروں karũː, I may do), and ن [U+0646 ARABIC LETTER NOON] in the middle of a word (eg. ٹانگ tãːg leg).

This diacritic is used when people want to make it clear that a noon character represents nasalisation rather than the sound n, eg. ٹان٘گ tãːg leg.

It is not used in a standard way, just when the user prefers, and is fairly uncommon.
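
Since the mark is optional, the same word may be stored with or without it, which matters for searching and comparison. The following is a minimal sketch, assuming you simply want both spellings to match; the helper name `strip_noon_ghunna_mark` is hypothetical.

```python
NOON_GHUNNA_MARK = "\u0658"   # ARABIC MARK NOON GHUNNA (combining)

def strip_noon_ghunna_mark(text):
    """Remove the optional nasalisation mark so both spellings compare equal."""
    return text.replace(NOON_GHUNNA_MARK, "")

plain = "\u0679\u0627\u0646\u06AF"          # ٹانگ  (tteh, alef, noon, gaf)
marked = "\u0679\u0627\u0646\u0658\u06AF"   # ٹان٘گ (same, with U+0658 on the noon)

print(plain == marked)                            # False
print(strip_noon_ghunna_mark(marked) == plain)    # True
```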

Refs: [1] [unicode8] p383    [2] [L2/12-381]

ٙ

U+0659 ARABIC ZWARAKAY

• Pashto
ٚ

U+065A ARABIC VOWEL SIGN SMALL V ABOVE

• African languages
ٜ

U+065C ARABIC VOWEL SIGN DOT BELOW

• African languages
ٝ

U+065D ARABIC REVERSED DAMMA

• African languages
ٟ

U+065F ARABIC WAVY HAMZA BELOW

• Kashmiri
ٰ

U+0670 ARABIC LETTER SUPERSCRIPT ALEF

• actually a vowel sign, despite the name

Vowel    ̍

Arabic    ̍    أَلِيف خَنْجَرِيَّة alīf khanjariyyah   

Used in certain Arabic words such as هٰذَا this or ذٰلِكَ that, and not forgetting اللّٰه Allah.

Urdu    ɑː

Used in a few Arabic words over the final form of ی [U+06CC ARABIC LETTER FARSI YEH] to produce the sound ɑː: eg. اعلیٰ alɑː (paramount, highest); دعویٰ davɑː (law suit, claim).

General sources: Unicode, Wikipedia, Daniels, Matthews

Extended Arabic letters

Various

ٱ

U+0671 ARABIC LETTER ALEF WASLA

• Koranic Arabic

Consonant+vowel   

Cursive shapes (naskh): ـٱ ٱ

Cursive shapes (nastaliq): ـٱ ٱ

Arabic       اَلِفُ ٱلْوَصْلِ alifu l-waṣli    a

Classical Arabic distinguishes between 'cutting' and 'joining' hamza. 'Cutting' means always pronounced, 'joining' means frequently elided. The sign indicating a joining hamza is called a wasla.

The joining hamza is of little practical importance in modern Arabic pronounced without the old case endings. Example: مَا ٱسْمُكَ mā smuka What's your name?.

General sources: Unicode, Wikipedia, Daniels

ٴ

U+0674 ARABIC LETTER HIGH HAMZA

• Kazakh
• forms digraphs
ٵ

U+0675 ARABIC LETTER HIGH HAMZA ALEF

• Kazakh
≈ 0627 0674
ٶ

U+0676 ARABIC LETTER HIGH HAMZA WAW

• Kazakh
≈ 0648 0674
ٷ

U+0677 ARABIC LETTER U WITH HAMZA ABOVE

• Kazakh
≈ 06C7 0674
ٸ

U+0678 ARABIC LETTER HIGH HAMZA YEH

• Kazakh
≈ 064A 0674
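
The ≈ notation on the four Kazakh high hamza letters above indicates a compatibility decomposition rather than a canonical one. As a rough illustration using Python's standard `unicodedata` module (variable names are my own), NFC leaves these characters alone, while NFKC replaces them with the base letter followed by U+0674.

```python
import unicodedata

# Print the decomposition recorded in the Unicode character database.
for cp in (0x0675, 0x0676, 0x0677, 0x0678):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(ch), "->", unicodedata.decomposition(ch))

high_hamza_alef = "\u0675"
# Canonical normalization does not touch a compatibility decomposition.
nfc = unicodedata.normalize("NFC", high_hamza_alef)
# Compatibility normalization expands it to ALEF + HIGH HAMZA.
nfkc = unicodedata.normalize("NFKC", high_hamza_alef)

print([f"U+{ord(c):04X}" for c in nfc])
print([f"U+{ord(c):04X}" for c in nfkc])
```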
ٹ

U+0679 ARABIC LETTER TTEH

• Urdu

Urdu consonant, ṭe ʈe

ʈ

ʈʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated retroflex t in Urdu, a distinct letter of the alphabet called ṭhe, ie.   ٹھٹھٹھ ٹھ

General sources: Unicode, Wikipedia, Daniels, Matthews

ٺ

U+067A ARABIC LETTER TTEHEH

• Sindhi
ٻ

U+067B ARABIC LETTER BEEH

• Sindhi
پ

U+067E ARABIC LETTER PEH

• Persian, Urdu, ...

Persian p پِ pe p

Urdu consonant, pe

p

pʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated p in Urdu, a distinct letter of the alphabet called phe, ie.  پھپھپھ پھ

General sources: Unicode, Wikipedia, Daniels, Farzad

ٿ

U+067F ARABIC LETTER TEHEH

• Sindhi
ڀ

U+0680 ARABIC LETTER BEHEH

• Sindhi
ځ

U+0681 ARABIC LETTER HAH WITH HAMZA ABOVE

• Pashto
• represents the phoneme /dz/

Pashto consonant

dz

This character does not decompose. It is treated as a separate letter, and is not equivalent to ح [U+062D ARABIC LETTER HAH] + ٔ [U+0654 ARABIC HAMZA ABOVE].
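
This can be confirmed with Python's standard `unicodedata` module. The sketch below (variable names are my own) shows that the character has no decomposition mapping, and that normalizing the sequence hah + combining hamza does not produce it.

```python
import unicodedata

HAH_WITH_HAMZA = "\u0681"
sequence = "\u062D\u0654"   # HAH + combining HAMZA ABOVE

# An empty string means the character database records no decomposition.
mapping = unicodedata.decomposition(HAH_WITH_HAMZA)
# NFC does not turn the sequence into U+0681 ...
composed = unicodedata.normalize("NFC", sequence)
# ... and NFD does not split U+0681 into a sequence.
decomposed = unicodedata.normalize("NFD", HAH_WITH_HAMZA)

print(repr(mapping), composed == HAH_WITH_HAMZA, decomposed == HAH_WITH_HAMZA)
```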

Shape:ځځځ ځ

ڂ

U+0682 ARABIC LETTER HAH WITH TWO DOTS VERTICAL ABOVE

• not used in modern Pashto
ڃ

U+0683 ARABIC LETTER NYEH

• Sindhi
ڄ

U+0684 ARABIC LETTER DYEH

• Sindhi
څ

U+0685 ARABIC LETTER HAH WITH THREE DOTS ABOVE

• Pashto, Khwarazmian
• represents the phoneme /ts/ in Pashto

Pashto consonant

ts

Shape:څڅڅ څ

چ

U+0686 ARABIC LETTER TCHEH

• Persian, Urdu, ...

Consonant č

Cursive shapes (naskh): چـچـچ چ

Persian č چِ če ʧ

Urdu consonant, ce ʧe

ʧ

ʧʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated ʧ in Urdu, a distinct letter of the alphabet called che, ie.  چھچھچھ چھ.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ڇ

U+0687 ARABIC LETTER TCHEHEH

• Sindhi
ڈ

U+0688 ARABIC LETTER DDAL

• Urdu

Urdu consonant, ḍāl ɖɑːl

ɖ

ɖʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated retroflex d in Urdu, a distinct letter of the alphabet called ḍhe.

Shape:ـڈ ڈ   and   ـڈھ ڈھ.

General sources: Unicode, Wikipedia, Daniels, Matthews

ڊ

U+068A ARABIC LETTER DAL WITH DOT BELOW

• Sindhi, early Persian
ڌ

U+068C ARABIC LETTER DAHAL

• Sindhi
ڍ

U+068D ARABIC LETTER DDAHAL

• Sindhi
ڎ

U+068E ARABIC LETTER DUL

• older shape for DUL, now obsolete in Sindhi
• Burushaski
ڏ

U+068F ARABIC LETTER DAL WITH THREE DOTS ABOVE DOWNWARDS

• Sindhi
• current shape used for DUL
ڐ

U+0690 ARABIC LETTER DAL WITH FOUR DOTS ABOVE

• old Urdu, not in current use
ڑ

U+0691 ARABIC LETTER RREH

• Urdu

Urdu consonant, ṛe ɽe

ɽ

ɽʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated retroflex r in Urdu, a distinct letter of the alphabet called ṛhe, ie. ـڑھ ڑھ.

General sources: Unicode, Wikipedia, Daniels, Matthews

ڔ

U+0694 ARABIC LETTER REH WITH DOT BELOW

• Kurdish, early Persian
ژ

U+0698 ARABIC LETTER JEH

• Persian, Urdu, ...

Consonant ž

Cursive shapes (naskh): ژـژـژ ژ

Urdu consonant, že ʒe ʒ

Persian ž ژِ že ʒ

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ڞ

U+069E ARABIC LETTER SAD WITH THREE DOTS ABOVE

• Berber, Burushaski
ڢ

U+06A2 ARABIC LETTER FEH WITH DOT MOVED BELOW

• Maghrib Arabic

Arabic consonant

f

An alternative form of ف [U+0641 ARABIC LETTER FEH] used in Arabic in North Africa. It would make more sense to use a font to make this difference than to use a different character.

Shape:ڢڢڢ ڢ

ڤ

U+06A4 ARABIC LETTER VEH

• Middle Eastern Arabic for foreign words
• Kurdish, Khwarazmian, early Persian
ڥ

U+06A5 ARABIC LETTER FEH WITH THREE DOTS BELOW

• North African Arabic for foreign words
ڦ

U+06A6 ARABIC LETTER PEHEH

• Sindhi
ڧ

U+06A7 ARABIC LETTER QAF WITH DOT ABOVE

• Maghrib Arabic, Uighur

Arabic consonant

q

An alternative form of ق [U+0642 ARABIC LETTER QAF] used in Arabic in North Africa. It would make more sense to use a font to make this difference than to use a different character.

Shape:ڧڧڧ ڧ

ک

U+06A9 ARABIC LETTER KEHEH

• Persian, Urdu, ...

Consonant k

Cursive shapes (naskh): کـکـک ک

Persian k کاف kāf k

Urdu consonant, kāf kɑːf

k

kʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated k in Urdu, a distinct letter of the alphabet called khe, ie. کھکھکھ کھ.

When followed by alif or lām, this has a special rounded shape, eg. کا kɑː (of); کل kal (yesterday).

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ڪ

U+06AA ARABIC LETTER SWASH KAF

• represents a letter distinct from Arabic KAF (0643) in Sindhi
ګ

U+06AB ARABIC LETTER KAF WITH RING

• Pashto
• may appear like an Arabic KAF (0643) with a ring below the base
ڭ

U+06AD ARABIC LETTER NG

• Uighur, Kazakh, old Malay, early Persian, ...
ڮ

U+06AE ARABIC LETTER KAF WITH THREE DOTS BELOW

• Berber, early Persian
گ

U+06AF ARABIC LETTER GAF

• Persian, Urdu, ...

Consonant g

Cursive shapes (naskh): گـگـگ گ

Persian g گاف ɡāf g

Urdu consonant, gāf gɑːf

g

gʰ together with ھ [U+06BE ARABIC LETTER HEH DOACHASHMEE], to represent the aspirated g in Urdu, a distinct letter of the alphabet called ghe, ie. گھگھگھ گھ.

When followed by alif or lām, this has a special rounded shape, eg. گام gɑːm (step); گل gul (rose).

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ڱ

U+06B1 ARABIC LETTER NGOEH

• Sindhi
ڲ

U+06B2 ARABIC LETTER GAF WITH TWO DOTS BELOW

• not used in Sindhi
ڳ

U+06B3 ARABIC LETTER GUEH

• Sindhi
ڴ

U+06B4 ARABIC LETTER GAF WITH THREE DOTS ABOVE

• not used in Sindhi
ں

U+06BA ARABIC LETTER NOON GHUNNA

• Urdu, archaic Arabic
• dotless in all four contextual forms

Urdu nasalisation indicator, nun ghunna nuːn ɣunna.

◌̃ Indicates that the preceding vowel is nasalised. It is normally only used at the end of a word, eg. ماں mãː, mother, کروں karũː, I may do. Nasalization within a word usually uses ن [U+0646 ARABIC LETTER NOON], eg. ٹانگ tãːg leg.

Occasionally noon ghunna is found in the middle of a word where a final form is used, eg. لکھیں‌گے vs. لکھینگے for likhēge he/they will write.

It may also be used to represent the noon skeleton in representations of early Arabic and Quranic texts.

This is not counted as a regular letter of the Urdu alphabet.

Shape:ںںں ں

According to the Unicode Standard, all four forms of this character should be dotless; however, when it appears initially or medially some fonts use a dotted form (eg. Scheherazade) or a dotted form with a noon ghunna diacritic above (eg. Noto Nastaliq Urdu).

Ref: [L2/12-381]

General sources: Unicode8 p383, Wikipedia, Daniels, Matthews

ڻ

U+06BB ARABIC LETTER RNOON

• Sindhi
ھ

U+06BE ARABIC LETTER HEH DOACHASHMEE

• forms aspirate digraphs in Urdu and other languages of South Asia
• represents the glottal fricative /h/ in Uighur

Urdu aspiration marker / calendar indicator, do cašmī he.

Aspiration: Used to create the aspirated letters of the Urdu alphabet. Each letter is composed of two characters. The letters are: بھ bʰe, پھ pʰe, تھ tʰe, ٹھ ʈʰe, جھ ʤʰe, چھ ʧʰe, دھ dʰe, ڈھ ɖʰe, ڑھ ɽʰe, کھ kʰe, and گھ gʰe.
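
Because each aspirated letter is stored as two characters, code that counts or sorts Urdu letters may need to treat the pair as a unit. The following is a minimal sketch of one way to do that; the function name `split_letters` is hypothetical, the set of base letters is taken from the list above, and combining marks are simply returned as separate items.

```python
DO_CHASHMI_HE = "\u06BE"
# Base letters of the eleven aspirated letters listed above:
# beh, peh, teh, tteh, jeem, tcheh, dal, ddal, rreh, keheh, gaf.
ASPIRABLE = set("\u0628\u067E\u062A\u0679\u062C\u0686\u062F\u0688\u0691\u06A9\u06AF")

def split_letters(word):
    """Split a word into letters, keeping base + do cašmī he together."""
    letters = []
    for ch in word:
        if ch == DO_CHASHMI_HE and letters and letters[-1] in ASPIRABLE:
            letters[-1] += ch   # attach the aspiration marker to its base
        else:
            letters.append(ch)
    return letters

word = "\u0644\u06A9\u06BE\u06CC\u06BA"   # لکھیں (lam, keheh, do cašmī he, yeh, noon ghunna)
print(split_letters(word))                 # four items: ل, کھ, ی, ں
```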

Until recently choṭī he (ہ [U+06C1 ARABIC LETTER HEH GOAL]) and do cašmī he could be used interchangeably to express aspiration, eg. ہاں or ھاں for hãː yes. Modern practice is to use this character exclusively for aspiration, though people do still occasionally confuse the two.

Calendar indicator: Dates using the Muslim calendar are followed by the word ہجری hɪʤriː, which is abbreviated with the symbol ھ.

General sources: Unicode8 p383, Wikipedia, Daniels, Matthews

ۀ

U+06C0 ARABIC LETTER HEH WITH YEH ABOVE

= arabic letter hamzah on ha (1.0)
• for ezafe, use 0654 over the language-appropriate base letter
• actually a ligature, not an independent letter
≡ 06D5 0654
ہ

U+06C1 ARABIC LETTER HEH GOAL

• Urdu

Urdu consonant, choṭī he ʧʰoʈiː he

h

ɑː as 'silent he' (see below).

ɛ occasionally as a variant of 'silent he' (see below).

h when doubled at the end of a word (see below).

Silent he: In Urdu words this letter is pronounced ɑː at the end of a word. Many Arabic and Persian words end in a he that is pronounced ɑː (just like alif), eg. مکّہ məkkɑː (Mecca).

A word like rɑːʤɑː (king), can be spelled with either an alif or a he, ie. راجا or راجہ. This is because the original Indian word was borrowed into Persian, then back into Urdu. Both spellings are now acceptable.

In a few words, the pronunciation of silent he is irregular, eg. کہ (that) and نہ (no).

Doubled he: In order to distinguish some words where the final h is pronounced rather than representing ɑː (or ɛ in irregular pronunciations), the choṭī he is sometimes doubled, eg. کہہ kɛh (say) vs. کہ .

Aspiration: Until recently choṭī he ہ and do cašmī he ھ could be used interchangeably, eg. ہاں or ھاں for hãː (yes). Modern practice is to use the latter exclusively for aspiration, though people do still occasionally confuse the two.

Vowel changes: choṭī he can change the preceding vowel as follows:

  • a to ɛ, eg. رَہنا rɛhnɑː (to remain).

  • ɪ to ɛ, eg. مہربانی mɛhrbɑːniː (kindness).

  • ʊ to o, eg. شہرت ʃohrət (fame).

Shape:ہہہ ہ

The initial form is written with a hook beneath, eg. ہندو hinduː (Hindu). The medial can be written with or without, eg. کہاں kahãː (where).

A special initial form is used before alif or lam, eg. ہاں hãː (yes), and اہل ahl (people).

Refs: [1] matthews, pp. xxiv-xxvi,xxviii-xxix;    [2] delacy,pp.104-105

General sources: Unicode8 p383, Wikipedia, Daniels, Matthews

ۂ

U+06C2 ARABIC LETTER HEH GOAL WITH HAMZA ABOVE

• Urdu
• actually a ligature, not an independent letter
≡ 06C1 0654

This is equivalent to ہ + ◌ٔ [U+06C1 ARABIC LETTER HEH GOAL + U+0654 ARABIC HAMZA ABOVE​]. NFC produces this character.
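
The same pattern holds for the other characters in this block that are marked ≡ with a base letter plus 0654. As a quick check using Python's standard `unicodedata` module (the dictionary name `LIGATURES` is my own), NFC composes each sequence into the precomposed character and NFD reverses it.

```python
import unicodedata

# Precomposed character -> canonically equivalent sequence, as listed on this page.
LIGATURES = {
    "\u06C0": "\u06D5\u0654",   # HEH WITH YEH ABOVE       ≡ AE + HAMZA ABOVE
    "\u06C2": "\u06C1\u0654",   # HEH GOAL WITH HAMZA ABOVE ≡ HEH GOAL + HAMZA ABOVE
    "\u06D3": "\u06D2\u0654",   # YEH BARREE WITH HAMZA ABOVE ≡ YEH BARREE + HAMZA ABOVE
}

# For each pair, record whether NFC composes and NFD decomposes as expected.
results = {
    f"U+{ord(pre):04X}": (
        unicodedata.normalize("NFC", seq) == pre,
        unicodedata.normalize("NFD", pre) == seq,
    )
    for pre, seq in LIGATURES.items()
}
print(results)
```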

Urdu consonant with izafat, ɪzɑːfat

ɛ when used as izafat.

Izāfat: Izāfat ɪzɑːfat is the name given to the short vowel ɛ used to describe a relationship between two words. It may be translated 'of', as in 'the Lion of Punjab'. This sound is mostly represented using zer, but in certain cases can be represented with a combining hamza.

When the preceding word ends in a silent choṭī he ہ [U+06C1 ARABIC LETTER HEH GOAL], izafat is represented by a combining hamza, eg. قطرۂ آب qatra ɛ ɑːb drop of water. Note that if the choṭī he is pronounced, then zer is used, eg. آہِ گرم āh-e garm hot sigh.

There are other ways in which izafat can be formed.

ۆ

U+06C6 ARABIC LETTER OE

• Uighur, Kurdish, Kazakh, Azerbaijani
ۇ

U+06C7 ARABIC LETTER U

• Kirghiz, Azerbaijani
ۈ

U+06C8 ARABIC LETTER YU

• Uighur
ۉ

U+06C9 ARABIC LETTER KIRGHIZ YU

• Kazakh, Kirghiz
ۋ

U+06CB ARABIC LETTER VE

• Uighur, Kazakh
ی

U+06CC ARABIC LETTER FARSI YEH

• Arabic, Persian, Urdu, Kashmiri, ...
• initial and medial forms of this letter have dots
→ (arabic letter alef maksura - 0649)
→ (arabic letter yeh - 064A)

Consonant+vowel y

Cursive shapes (naskh): یـیـی ی

Cursive shapes (nastaliq): یـیـی ی

This character has two dots below it in initial and medial position, but no dots in final or independent form.
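
Text typed on an Arabic keyboard layout sometimes contains ي [U+064A] or ى [U+0649] where Persian or Urdu would use this character. The following is a rough sketch of folding them to U+06CC. It assumes the input is already known to be Persian or Urdu (applying it to Arabic text would corrupt it), and both the helper name `fold_yeh` and the decision to fold alef maksura as well are my own assumptions, not a recommendation from the sources cited here.

```python
# Map ARABIC LETTER YEH and ARABIC LETTER ALEF MAKSURA to ARABIC LETTER FARSI YEH.
TO_FARSI_YEH = str.maketrans({
    "\u064A": "\u06CC",
    "\u0649": "\u06CC",   # assumption: treat alef maksura as a yeh variant too
})

def fold_yeh(text):
    """Replace Arabic yeh variants with Farsi yeh (Persian/Urdu text only)."""
    return text.translate(TO_FARSI_YEH)

typed = "\u0627\u064A\u0646"      # این typed with Arabic yeh
folded = fold_yeh(typed)          # این with Farsi yeh
print([f"U+{ord(c):04X}" for c in folded])
```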

Persian y یِ ye j, i, ɒː, aj

At the start of a word, one of:

  • i when preceded by ا [U+0627 ARABIC LETTER ALEF], eg. این īn this. In vowelled text no diacritic is written with the alef in this case.
  • j, as a consonant, eg. یازده yāzdah eleven, or as part of a diphthong when following an initial alef that carries a diacritic in vowelled text, eg. اِیوان eyvān veranda.

In the middle of a word, one of:

  • i, a 'long' vowel, eg. تیر tir arrow. When used in vowelled text no diacritic is associated with the previous consonant.
  • j after another vowel, eg. خَیلی xeyli very, رویگر ruyɡar coppersmith, پایدار pāydār permanent, or as a consonant (typically the case before alef), eg. سیاه siyāh black.

At the end of a word, one of:

  • i, after a consonant, eg. تکسی taksī taxi. No diacritic needed in vowelled text.
  • j after a vowel, eg. چای čāy black tea, نِی nej reed.
  • ɒː when combined with ـٰ [U+0670 ARABIC LETTER SUPERSCRIPT ALEF​]
  • ye if this is the ezafe (adjectival joiner) used after a vowel, eg. اسب زیبای تندروی من asbe zibāye tundruye min my beautiful, fast horse.

Urdu consonant / vowel, ye je

j as a consonant (word initial or medial), یار jɑːr (friend) and سایہ sɑːjɑː (shadow).

iː or e or ɛ as an initial or medial vowel (initially it is used after alif, ای), eg. ایک ek (one), سینہ siːnɑː (breast), and کیسا kɛsɑː (how).

The alternative vowel sounds can be disambiguated, when necessary, by the use of combining marks. The combining marks are rarely used in normal text. See a table of combining marks for vowels.

iː in word final position, eg. لڑکی ləɽkiː (girl).

The Urdu letter je has another, distinct visual form, used only finally or in isolation, to represent the sound e or ɛ. For that you need to use ے [U+06D2 ARABIC LETTER YEH BARREE], eg. لڑکے ləɽke boys.

General sources: Unicode, Wikipedia, Daniels, Matthews, Farzad

ۍ

U+06CD ARABIC LETTER YEH WITH TAIL

• Pashto, Sindhi
ې

U+06D0 ARABIC LETTER E

• Pashto, Uighur
• used as the letter bbeh in Sindhi
ے

U+06D2 ARABIC LETTER YEH BARREE

• Urdu

Urdu vowel, baṛī ye baɽiː je

e or ɛ, eg. لڑکے laɽke (boys).

The alternative sounds possible in the initial combinations can be disambiguated, when necessary, by the use of combining marks, eg. ہَے vs. بجے baʤe. The combining marks are rarely used in normal text. See a table of combining marks for vowels.

Izāfat: Izāfat ɪzɑːfat is the name given to the short vowel ɛ used to describe a relationship between two words. It may be translated 'of', as in 'the Lion of Punjab'. This sound is mostly represented using zer, but in certain cases can be represented with a combining hamza.

When the preceding word ends in a long vowel, izafat is represented using hamza 'on a chair', ie. ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE], plus ے [U+06D2 ARABIC LETTER YEH BARREE], eg. صدائے بلند sadɑː ɛ buland a high voice; روئے زمین ruː ɛ zamiːn the surface of the ground. Sometimes, however, the hamza is not shown.[2 p99] [11]

There are other ways in which izafat can be formed.

Shape:‍ے ے. This form is used only in word final or isolated position. This shape is regarded as a stylistic variant in Arabic and Persian text, but it has a functional purpose in Urdu, to help with Urdu's larger number of vowels.

The Urdu letter je has another visual form, used for sounds other than final or isolated e or ɛ. For that you need to use ی [U+06CC ARABIC LETTER FARSI YEH], eg. لڑکی ləɽkiː (girl).

This shape is also used with a hamza to represent the izāfat ɪzɑːfat. For this you should use ۓ [U+06D3 ARABIC LETTER YEH BARREE WITH HAMZA ABOVE].

General sources: Unicode8 p382, Wikipedia, Daniels, Matthews

ۓ

U+06D3 ARABIC LETTER YEH BARREE WITH HAMZA ABOVE

• Urdu
• actually a ligature, not an independent letter
≡ 06D2 0654

Urdu Izāfat ɪzɑːfat marker?

See also ے [U+06D2 ARABIC LETTER YEH BARREE].

General sources: Unicode, Wikipedia, Daniels, Matthews

ە

U+06D5 ARABIC LETTER AE

• Uighur, Kazakh, Kirghiz
ݢ

U+0762 ARABIC LETTER KEHEH WITH DOT ABOVE

• old Malay, preferred to 06AC
→ (arabic letter kaf with dot above - 06AC)
ݣ

U+0763 ARABIC LETTER KEHEH WITH THREE DOTS ABOVE

• Moroccan Arabic, Amazigh, Burushaski
→ (arabic letter ng - 06AD)
ݨ

U+0768 ARABIC LETTER NOON WITH SMALL TAH

• Saraiki, Pathwari
ݬ

U+076C ARABIC LETTER REH WITH HAMZA ABOVE

• Ormuri
• represents a voiced alveolo-palatal laminal fricative
→ (latin small letter z with curl - 0291)

Ormuri consonant

ʑ

Shape:ـݬ ݬ

Addition for Kashmiri

Additions for early Persian and Azerbaijani

Extended Arabic letters for Parkari

ۯ

U+06EF ARABIC LETTER REH WITH INVERTED V

• also used in early Persian

Extended Arabic letters for African languages

U+08A1 ARABIC LETTER BEH WITH HAMZA ABOVE

• Adamawa Fulfulde (Cameroon)
• used for the implosive bilabial stop
→ (latin small letter b with hook - 0253)

Adamawa Fulfulde

ɓ (bilabial implosive)

This character does not decompose. It is treated as a separate letter, and is not equivalent to ب [U+0628 ARABIC LETTER BEH] with ٔ [U+0654 ARABIC HAMZA ABOVE].
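As a rough illustration of the difference, the following Python sketch uses the standard unicodedata module to compare canonical decompositions. The helper name canonical_parts is made up for this note, and the results depend on the version of the Unicode database bundled with your Python.

```python
import unicodedata

def canonical_parts(ch):
    """Return the NFD form of one character as a list of U+XXXX strings."""
    return ["U+%04X" % ord(c) for c in unicodedata.normalize("NFD", ch)]

# YEH WITH HAMZA ABOVE has a canonical decomposition (yeh + hamza above).
print(canonical_parts("\u0626"))

# BEH WITH HAMZA ABOVE has none, so it stays a single code point.
print(canonical_parts("\u08A1"))

# Consequently beh + hamza above does not compose to U+08A1 under NFC.
print(unicodedata.normalize("NFC", "\u0628\u0654") == "\u08A1")
```

In practice this means that text typed as beh followed by hamza above will not match text typed with U+08A1, even after normalization.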

Shape:ࢡࢡࢡ ࢡ

U+08A8 ARABIC LETTER YEH WITH TWO DOTS BELOW AND HAMZA ABOVE

• Adamawa Fulfulde
• used for the implosive palatal approximant, realized as pharyngealization of the approximant
→ (latin small letter y with hook - 01B4)

Adamawa Fulfulde

(palatal implosive)

Unlike ئ [U+0626 ARABIC LETTER YEH WITH HAMZA ABOVE], which loses the two dots when combined with hamza, this character retains the two dots in all forms.

Note that this character does not decompose. It is treated as a separate letter.

Shape:ࢨࢨࢨ ࢨ

U+08A9 ARABIC LETTER YEH WITH TWO DOTS BELOW AND DOT ABOVE

• Adamawa Fulfulde
• used for the voiced palatal nasal
→ (latin small letter n with left hook - 0272)

Extended vowel signs for African languages

U+08F6 ARABIC KASRA WITH DOT BELOW

• also used in Philippine languages

Dependent consonants for Rohingya

U+08AC ARABIC LETTER ROHINGYA YEH

= bottya-yeh

Extended vowel signs for Rohingya

Tone marks for Rohingya

Signs for Sindhi

Additions for Khowar

Addition for Torwali

Additions for Burushaski

Arabic letters for European and Central Asian languages

U+08AD ARABIC LETTER LOW ALEF

• Bashkir, Tatar

U+08B0 ARABIC LETTER GAF WITH INVERTED STROKE

• Crimean Tatar, Chechen, Lak

Arabic letter for Berber

Archaic letters

ٮ

U+066E ARABIC LETTER DOTLESS BEH

ٯ

U+066F ARABIC LETTER DOTLESS QAF

Punctuation & symbols

General punctuation

؉

U+0609 ARABIC-INDIC PER MILLE SIGN

→ (per mille sign - 2030)
؊

U+060A ARABIC-INDIC PER TEN THOUSAND SIGN

→ (per ten thousand sign - 2031)
،

U+060C ARABIC COMMA

• also used with Thaana and Syriac in modern text
→ (comma - 002C)
→ (turned comma - 2E32)
→ (reversed comma - 2E41)

Comma    ,

؛

U+061B ARABIC SEMICOLON

• also used with Thaana and Syriac in modern text
→ (semicolon - 003B)
→ (reversed semicolon - 204F)
→ (turned semicolon - 2E35)

Semi-colon    ;

؟

U+061F ARABIC QUESTION MARK

• also used with Thaana and Syriac in modern text
→ (question mark - 003F)
→ (reversed question mark - 2E2E)

Question mark    ?

٪

U+066A ARABIC PERCENT SIGN

→ (percent sign - 0025)

Percent mark    %

Arabic

Used with Arabic-Indic numerals, eg. ١٢٪.

It is also possible to find % [U+0025 PERCENT SIGN].

٫

U+066B ARABIC DECIMAL SEPARATOR

Arabic punctuation

Used in Arabic, but not in common use (possibly because it is not available on the keyboard), eg. ١٫٢٣. Khaled Hosny reports that people usually use , [U+002C COMMA] or even a small ر [U+0631 ARABIC LETTER REH] (mainly in newspapers, probably because it looks closer to this character than a comma) 1.

Refs: [1] hosny

Urdu punctuation, ašāriya əʃɑːrɪjɑ.

Example ۲۵۲۴٫۲۳ do hazɑːr pɑ̃ːʧ sau caubiːs aʃɑːrɪjɑː do tiːn (2524.23). In Urdu this looks like a hamza.

٬

U+066C ARABIC THOUSANDS SEPARATOR

→ (apostrophe - 0027)
→ (right single quotation mark - 2019)

Arabic punctuation

Used in old documents, eg. ١٬٢٣٤, but not common in recent use 1. In Morocco, it seems that a comma or a space is more common, though there may also be no separator 2.

Refs: [1] hosny    [2] Tounsi
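Software that reads numbers written with these two separators has to map them explicitly, since they are not ASCII punctuation. The following is a minimal Python sketch of one way to do that. The function name parse_number is hypothetical, and the sketch assumes the input contains only decimal digits and these two separators.

```python
import unicodedata

DECIMAL_SEP = "\u066B"    # ARABIC DECIMAL SEPARATOR
THOUSANDS_SEP = "\u066C"  # ARABIC THOUSANDS SEPARATOR

def parse_number(text):
    """Convert a string of Arabic-script digits and separators to a float."""
    out = []
    for ch in text:
        if ch == DECIMAL_SEP:
            out.append(".")
        elif ch == THOUSANDS_SEP:
            continue  # grouping only, carries no value
        else:
            # unicodedata.digit raises ValueError for anything that is not a digit.
            out.append(str(unicodedata.digit(ch)))
    return float("".join(out))

# The Urdu example above, written with escapes to avoid bidi confusion in source code.
print(parse_number("\u06F2\u06F5\u06F2\u06F4\u066B\u06F2\u06F3"))
```

Text that uses a comma, a space, or a small reh as the separator, as described above, would need additional handling.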

٭

U+066D ARABIC FIVE POINTED STAR

• appearance rather variable
→ (asterisk - 002A)
۔

U+06D4 ARABIC FULL STOP

• Urdu

Full stop    .

Tatweel

ـ

U+0640 ARABIC TATWEEL

= kashida
• inserted to stretch characters or to carry tashkil with no base letter
• also used with Mandaic, Manichaean, Psalter Pahlavi, and Syriac

Baseline extender    _

Arabic

Used to stretch words for simple justification, or to make a word or phrase a particular width, or as a form of emphasis.

Better quality justification systems stretch glyphs, rather than adding baseline extensions. This dynamic stretching of glyphs is often called kashida. In some typesetting systems, such as InDesign, the tatweel character serves more to indicate opportunities for stretching, and the glyph for the character itself is not shown.

Which words are stretched and how much depends on a set of rules that tends to vary for different font styles. (Elongation is not normally used at all for the ruq'a style.) The following is an example of text justified using tatweel from a newspaper column.

Justified Arabic text.

One of the major problems when using tatweel to stretch text is that when the width of the space in which the text is displayed changes, or when new text is added near the start of a paragraph, lines wrap differently and all the places where tatweel would be needed have to be recalculated. Thus tatweels only work for static text with fixed dimensions.
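Because tatweel is an ordinary character in the text stream, a stretched word is a different string from its unstretched form. Before text is reflowed, or compared with other text, it can be useful to remove the existing tatweels first. The following Python sketch shows the idea; strip_tatweel is a name invented for this note, and it makes no attempt to decide where tatweels should be re-inserted.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL

def strip_tatweel(text):
    """Remove every tatweel so the text can be reflowed or compared."""
    return text.replace(TATWEEL, "")

stretched = "\u0627\u0644\u0640\u0640\u0640\u0644\u0647"
plain = "\u0627\u0644\u0644\u0647"
print(stretched == plain)
print(strip_tatweel(stretched) == plain)
```

Note that this would also remove a tatweel that is deliberately used to carry tashkil with no base letter, so it is only a starting point.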

It is very common to see baseline stretching in modern Arabic text where a word or phrase is stretched to fill a particular space, eg. the Arabic tag line (الابداع المتجدد Creativity renewed) below the word Lexus in the following image is stretched to be the same width.

Arabic text stretched to fit the width of the word Lexus.

Arabic Layout Mark

U+061C ARABIC LETTER MARK

• commonly abbreviated ALM
→ (right-to-left mark - 200F)

Arabic directional formatting character

In an RTL context, the bidi algorithm expects a sequence of numbers separated by hyphens (for example) to run left to right when preceded by Hebrew or N’Ko text. However, it expects the sequence to run right to left when preceded by Arabic, Thaana, or Syriac characters.

ك 12-34-5678

އ 12-34-5678

ܐ 12-34-5678

ߕ 12-34-5678

א 12-34-5678

However, when a sequence like this appears alone on a line, it always runs left to right, because the bidi algorithm can't detect a previous Arabic, or other, character.

12-34-5678

This order is not appropriate for documents in the Arabic language (and I guess Dhivehi or Syriac).

The ALM is a way of producing the right sequencing by inserting what is effectively an invisible Arabic character before the number.

؜12-34-5678

RLM and RLI..PDI, etc, are unable to produce the RTL sequencing, because the difference lies in what script precedes the number.
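The difference between the scripts comes down to the bidi class of the preceding strong character: AL for Arabic, Thaana and Syriac letters, but R for Hebrew and N’Ko. The following Python sketch, using the standard unicodedata module, prints those classes and shows that ALM carries the AL class whereas RLM carries R. The helper with_alm is a made-up name, and the sketch only inspects character properties; it does not run the bidi algorithm itself.

```python
import unicodedata

ALM = "\u061C"  # ARABIC LETTER MARK
RLM = "\u200F"  # RIGHT-TO-LEFT MARK

samples = [
    ("Arabic kaf", "\u0643"),
    ("Thaana alifu", "\u0787"),
    ("Syriac alaph", "\u0710"),
    ("N'Ko ta", "\u07D5"),
    ("Hebrew alef", "\u05D0"),
    ("ALM", ALM),
    ("RLM", RLM),
]

for name, ch in samples:
    # 'AL' is an Arabic-type strong character, 'R' is other right-to-left.
    print(name, unicodedata.bidirectional(ch))

def with_alm(number):
    """Prefix a number sequence with ALM so it is treated as following Arabic text."""
    return ALM + number

print([("U+%04X" % ord(c)) for c in with_alm("12-34-5678")[:3]])
```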

This is all helpful for Arabic language text, but Persian doesn’t order these sequences RTL, so it’s necessary to use one (any) of the directional formatting characters before such sequences in Persian to prevent this special ordering.

Similar special ordering is applied to numbers in equations, such as 1 + 2 = 3 for Arabic language text.

Subtending & supertending marks

؀

U+0600 ARABIC NUMBER SIGN

Urdu symbol

Used to indicate a number, eg. ۱۲۳؀.

The stroke may be elongated and pass under the number, but this is not a combining character, and should appear before the number in memory. The length of the symbol may vary according to the number of digits. It is terminated by a non-digit character.

Refs: [1] [Unicode9] p374.

؁

U+0601 ARABIC SIGN SANAH

Urdu symbol, sanh sənh.

Dates are indicated by placing this long sweep below the year digits. For the Gregorian calendar this is followed with the word عیسوی iːsviː Christian era. This is usually abbreviated as a hamza ء.

Dates using the Muslim calendar are followed by the word ہجری hɪʤriː, which is abbreviated with the symbol ھ.

The sanh sign is typed before the digits (in a rtl context): eg. ۲۰۰۴؁ء ‎(2004). It is not a combining character, even though it displays beneath the digits. The length of the symbol may vary according to the number of digits. It is terminated by a non-digit character.

The sanh is derived from the Arabic word for year سنة.

Refs: [1] [Unicode9] p374.
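The storage order described above (sign first, then the digits, then a terminating non-digit) can be sketched in Python as follows. The pattern and the function name subtended_runs are illustrative assumptions rather than anything defined by Unicode, and the sketch only covers U+0600 to U+0604 followed by decimal digits.

```python
import re
import unicodedata

SANAH = "\u0601"  # ARABIC SIGN SANAH
HAMZA = "\u0621"  # abbreviation for the Christian era, as described above

# In memory: sign, then the year digits (here 2004), then the era abbreviation.
year = SANAH + "\u06F2\u06F0\u06F0\u06F4" + HAMZA

# The sign is a format character, not a combining mark.
print(unicodedata.category(SANAH), unicodedata.combining(SANAH))

# A subtending sign followed by the run of digits it applies to.
# \d matches any Unicode decimal digit in a str pattern.
SUBTENDING = re.compile("([\u0600-\u0604])(\\d+)")

def subtended_runs(text):
    """Return (sign, digits) pairs; each run ends at the first non-digit."""
    return [(m.group(1), m.group(2)) for m in SUBTENDING.finditer(text)]

print(subtended_runs(year))
```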

؂

U+0602 ARABIC FOOTNOTE MARKER

Urdu symbol

Used to indicate that a number is a footnote, eg. ؂۵.

The number sits above the symbol, although this is not a combining character. The marker should come before the number in logical order.

Do not confuse this with ؎ [U+060E ARABIC POETIC VERSE SIGN].

Refs: [1] [Unicode9] p374.

؃

U+0603 ARABIC SIGN SAFHA

Urdu symbol, safah səfəh

Used to indicate a page number, where English would use an abbreviation such as "pp. 35-45", eg. ؃۴۵. The stroke may be elongated and pass under the number.

The symbol should come before the number in logical order.

The symbol is derived from the stroke used for ص [U+0635 ARABIC LETTER SAD].

Refs: [1] [Unicode9] p374.

؄

U+0604 ARABIC SIGN SAMVAT

• used for writing Samvat era dates in Urdu

Urdu symbol

Used in Urdu to indicate a year in the Śaka calendar. (Cf. the sign sanh which is used for dates in the Gregorian or Islamic calendar.)

The symbol should come before the number in logical order.

The symbol is a stylized abbreviation of the word samvat, the name of this calendar.

Refs: [1] [Unicode9] p374.

؅

U+0605 ARABIC NUMBER MARK ABOVE

• may be used with Coptic Epact numbers

Arabic symbol

Used in Arabic text with Coptic numbers, such as in early astronomical tables. Unlike the other Arabic number signs, it extends across the top of the sequence of digits, and is used with Coptic digits, rather than with Arabic digits.

The symbol should come before the number in logical order.

Refs: [1] [Unicode9] p374.

Radix Symbols

؆

U+0606 ARABIC-INDIC CUBE ROOT

→ (cube root - 221B)
؇

U+0607 ARABIC-INDIC FOURTH ROOT

→ (fourth root - 221C)

Letterlike symbol

Poetic marks

؎

U+060E ARABIC POETIC VERSE SIGN

Urdu Often used to mark the beginning of poetic verse. For an example see Figure 8 in Jonathan Kew's examples.

Do not confuse this with ؂ [U+0602 ARABIC FOOTNOTE MARKER].

؏

U+060F ARABIC SIGN MISRA

Urdu symbol misra misrə

Urdu poetry typically builds poems from couplets. This symbol marks a single line (misra) of a couplet (shayr) from an Urdu poem when it is quoted in text.

The sign is placed at the beginning of the quoted line, and is followed by the line of verse. See an example.

Honorifics

ؐ

U+0610 ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM

• represents sallallahu alayhe wasallam 'may God's peace and blessings be upon him'

Urdu Represents sallallahu alayhe wasallam sallallao alae va sallam (may God's peace and blessings be upon him) صلّى الله عليه وسلّم. Used over the name of Mohammed.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. محمّدؐ muhamːed sallallao alae va sallam.

ؑ

U+0611 ARABIC SIGN ALAYHE ASSALLAM

• represents alayhe assalam 'upon him be peace'

Urdu Represents alayhe asallam alejsallam (upon him be peace) عليه السّلام. Used over the name of prophets other than Mohammed.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. عیسؑیٰ isaː salejsallam Christ, upon him be peace!.

ؒ

U+0612 ARABIC SIGN RAHMATULLAH ALAYHE

• represents rahmatullah alayhe 'may God have mercy upon him'

Urdu Represents rahmatulla alayhe raːmatʊlla alee (may God have mercy upon him) رحمت الله عليه. Used over the names of saints, religious authorities, and other deceased pious persons.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. قاضی نور محمّدؒ kaziː nur mamed rahmatulla alayhe Qazi Nur Muhammad, may God have mercy upon him!.

ؓ

U+0613 ARABIC SIGN RADI ALLAHOU ANHU

• represents radi allahu 'anhu 'may God be pleased with him'

Urdu Represents radi allahu 'anhu raziallaːo ano (may God be pleased with him) رضي الله عنه. Used over the names of the Companions of the Prophet.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. ابوبکرؓ abu bakr, raziallaːo ano Abu Bakr, may God be pleased with him!.

ؔ

U+0614 ARABIC SIGN TAKHALLUS

• sign placed over the name or nom-de-plume of a poet, or in some writings used to mark all proper names

Urdu Sign placed over the name or nom-de-plume of a poet, or in some writings used to mark all proper names.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. عطاشادؔ ataː ʃaːd Ata Shad (author's name).

Koranic annotation signs

ؘ

U+0618 ARABIC SMALL FATHA

• should not be confused with 064E FATHA
ؙ

U+0619 ARABIC SMALL DAMMA

• should not be confused with 064F DAMMA
ؚ

U+061A ARABIC SMALL KASRA

• should not be confused with 0650 KASRA
ۜ

U+06DC ARABIC SMALL HIGH SEEN

Arabic

Used in more than one way with ـْ [U+0652 ARABIC SUKUN​] in Koranic text.

Used closer to ص [U+0635 ARABIC LETTER SAD] than a sukun, it indicates that the base letter should be pronounced s as in seem, eg. بَصْۜطَةً. (Note that this is incorrectly displayed in some fonts due to problems with the Unicode combining-character ordering rules. There is a proposal to address this in UTR #53.)

Used above a sukun, it is a pause-related hint, eg. مَالِيَةْۜ.

Refs: [1] Unicode9 p374 [2] UTR #53 draft
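The ordering problem mentioned above can be seen from the canonical combining classes of the two marks. The following Python sketch, using the standard unicodedata module, shows that normalization puts the sukun before the small high seen whichever order was typed, so the two usages cannot be distinguished by the order of code points once the text is normalized. The variable names are invented for this note.

```python
import unicodedata

SAD = "\u0635"
SUKUN = "\u0652"
SMALL_HIGH_SEEN = "\u06DC"

# Canonical combining classes of the two marks.
print(unicodedata.combining(SUKUN), unicodedata.combining(SMALL_HIGH_SEEN))

seen_first = SAD + SMALL_HIGH_SEEN + SUKUN
sukun_first = SAD + SUKUN + SMALL_HIGH_SEEN

def nfc(text):
    """Normalize to NFC, which includes canonical reordering of marks."""
    return unicodedata.normalize("NFC", text)

# Both typed orders end up as the same sequence after normalization.
print(nfc(seen_first) == nfc(sukun_first))
```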

۝

U+06DD ARABIC END OF AYAH

Arabic

Used to indicate numbered verses in the Koran. The symbol encloses the numbers, eg. ٣۝ or ٤٣۝.

See also U+08E2 ARABIC DISPUTED END OF AYAH, which is used occasionally in Koranic text to mark a verse for which there is scholarly disagreement about the location of the end of the verse.

Refs: [1] [Unicode9] p374.

۟

U+06DF ARABIC SMALL HIGH ROUNDED ZERO

• smaller than the typical circular shape used for 0652
ۡ

U+06E1 ARABIC SMALL HIGH DOTLESS HEAD OF KHAH

= Arabic jazm
• presentation form of 0652, using font technology to select the variant is preferred
• used in some Korans to mark absence of a vowel
→ (arabic sukun - 0652)
ۤ

U+06E4 ARABIC SMALL HIGH MADDA

• typically used with 06E5, 06E6, 06E7, and 08F3
ۥ

U+06E5 ARABIC SMALL WAW

→ (arabic small high waw - 08F3)
۩

U+06E9 ARABIC PLACE OF SAJDAH

• there is a range of acceptable glyphs for this character

U+08F0 ARABIC OPEN FATHATAN

= successive fathatan

U+08F1 ARABIC OPEN DAMMATAN

= successive dammatan

U+08F2 ARABIC OPEN KASRATAN

= successive kasratan

Currency sign

؋

U+060B AFGHANI SIGN

Deprecated letter

ٳ

U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW

• Kashmiri
• this character is deprecated and its use is strongly discouraged
• use the sequence 0627 065F instead
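Since the deprecated character has no decomposition, normalization will not convert it, and any replacement has to be done explicitly. A minimal Python sketch, with a made-up function name:

```python
DEPRECATED = "\u0673"         # ARABIC LETTER ALEF WITH WAVY HAMZA BELOW
RECOMMENDED = "\u0627\u065F"  # alef followed by U+065F, the sequence recommended above

def replace_deprecated_alef(text):
    """Replace U+0673 with the recommended two-character sequence."""
    return text.replace(DEPRECATED, RECOMMENDED)

print(["U+%04X" % ord(c) for c in replace_deprecated_alef("\u0673")])
```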

Digits

Arabic-Indic

٠

U+0660 ARABIC-INDIC DIGIT ZERO

Digit    0

١

U+0661 ARABIC-INDIC DIGIT ONE

Digit    1

Arabic    وَاحِد wāḥid    wɑːħid

٢

U+0662 ARABIC-INDIC DIGIT TWO

Digit    2

Arabic    اِثْنَين ithnayn    ʔiθnain

٣

U+0663 ARABIC-INDIC DIGIT THREE

Digit    3

Arabic    ثَلَاثَة thalāthah    θɑlɑːθɑ

٤

U+0664 ARABIC-INDIC DIGIT FOUR

Digit    4

Arabic    أَربَعَة arbaʻah    ʔɑrbɑʕɑ

٥

U+0665 ARABIC-INDIC DIGIT FIVE

Digit    5

Arabic    خَمْسَة khamsah    xɑmsɑ

٦

U+0666 ARABIC-INDIC DIGIT SIX

Digit    6

Arabic    سِتَّة sittah    sittɑ

٧

U+0667 ARABIC-INDIC DIGIT SEVEN

Digit    7

Arabic    سَبْعَة sabʻah    sɑbʕɑ

٨

U+0668 ARABIC-INDIC DIGIT EIGHT

Digit    8

Arabic    ثَمَانيَة thamānyah    θɑmɑːnjɑ

٩

U+0669 ARABIC-INDIC DIGIT NINE

Digit    9

Arabic    تِسْعَة tisʻah    tisʕɑ

Eastern Arabic-Indic digits

۰

U+06F0 EXTENDED ARABIC-INDIC DIGIT ZERO

Digit    0

Persian صِفر sefr

Shape۰, ie. same as ٠ [U+0660 ARABIC-INDIC DIGIT ZERO].

Urdu    sifr    sɪfr.

Shape۰, ie. same as ٠ [U+0660 ARABIC-INDIC DIGIT ZERO].

۱

U+06F1 EXTENDED ARABIC-INDIC DIGIT ONE

Digit    1

Persian یِک yek

Shape۱, ie. same as ١ [U+0661 ARABIC-INDIC DIGIT ONE].

Urdu    ek ek

Shape۱, ie. same as ١ [U+0661 ARABIC-INDIC DIGIT ONE].

۲

U+06F2 EXTENDED ARABIC-INDIC DIGIT TWO

Digit    2

Persian دُو do

Shape۲, ie. same as ٢ [U+0662 ARABIC-INDIC DIGIT TWO].

Urdu    do    do

Shape۲, ie. same as ٢ [U+0662 ARABIC-INDIC DIGIT TWO].

۳

U+06F3 EXTENDED ARABIC-INDIC DIGIT THREE

Digit    3

Persian سِه se

Shape۳, ie. same as ٣ [U+0663 ARABIC-INDIC DIGIT THREE].

Urdu    tīn    tiːn

Shape۳, ie. same as ٣ [U+0663 ARABIC-INDIC DIGIT THREE].

۴

U+06F4 EXTENDED ARABIC-INDIC DIGIT FOUR

• Persian has a different glyph than Sindhi and Urdu

Digit    4

Persian چَهَار čahār

Shape۴, cf. ٤ [U+0664 ARABIC-INDIC DIGIT FOUR].

Urdu   cār    ʧɑːr

Shape۴, cf. ٤ [U+0664 ARABIC-INDIC DIGIT FOUR].

۵

U+06F5 EXTENDED ARABIC-INDIC DIGIT FIVE

• Persian, Sindhi, and Urdu share glyph different from Arabic

Digit    5

Persian پَنج panj

Shape۵, cf. ٥ [U+0665 ARABIC-INDIC DIGIT FIVE].

Urdu    pāṅc    pɑ̃ːʧ

Shape۵, cf. ٥ [U+0665 ARABIC-INDIC DIGIT FIVE].

۶

U+06F6 EXTENDED ARABIC-INDIC DIGIT SIX

• Persian, Sindhi, and Urdu have glyphs different from Arabic

Digit    6

Persian شِش šeš

Shape۶, cf. ٦ [U+0666 ARABIC-INDIC DIGIT SIX].

Urdu    che    ʧʰe

Shape۶, cf. ٦ [U+0666 ARABIC-INDIC DIGIT SIX].

۷

U+06F7 EXTENDED ARABIC-INDIC DIGIT SEVEN

• Urdu and Sindhi have glyphs different from Arabic

Digit    7

Persian هَفت haft

Shape۷, ie. same as ٧ [U+0667 ARABIC-INDIC DIGIT SEVEN].

Urdu    sāt    sɑːt

Shape۷, cf. ٧ [U+0667 ARABIC-INDIC DIGIT SEVEN].

۸

U+06F8 EXTENDED ARABIC-INDIC DIGIT EIGHT

Digit    8

Persian هَشت hašt

Shape۸, ie. same as ٨ [U+0668 ARABIC-INDIC DIGIT EIGHT].

Urdu    āṭh    ɑːʈʰ

Shape۸, ie. same as ٨ [U+0668 ARABIC-INDIC DIGIT EIGHT].

۹

U+06F9 EXTENDED ARABIC-INDIC DIGIT NINE

Digit    9

Persian نُه noh

Shape۹, ie. same as ٩ [U+0669 ARABIC-INDIC DIGIT NINE].

Urdu   nau    nəʊ.

Shape۹, ie. same as ٩ [U+0669 ARABIC-INDIC DIGIT NINE].
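Although several of the digits in the two sets look the same, they are different code points, so U+0664 and U+06F4 (for example) do not compare equal as strings. Both sets do carry the same numeric values in the Unicode database. The following Python sketch uses the standard unicodedata module to map either set to ASCII digits; the function name to_ascii_digits is an assumption made for this note.

```python
import unicodedata

def to_ascii_digits(text):
    """Replace any Unicode decimal digit with its ASCII equivalent."""
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)  # None for non-digits
        out.append(ch if value is None else str(value))
    return "".join(out)

arabic_indic = "\u0661\u0662\u0663"  # digits one, two, three (U+066x)
extended = "\u06F1\u06F2\u06F3"      # digits one, two, three (U+06Fx)

print(arabic_indic == extended)
print(to_ascii_digits(arabic_indic) == to_ascii_digits(extended))
```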

Presentation forms

Most presentation form characters are for compatibility and should not be used. There are a few characters, however, that may be useful in particular circumstances, and those are the ones discussed here.

Ornate parentheses

U+FD3E ORNATE LEFT PARENTHESIS

Arabic punctuation.

This is considered to be traditional Arabic punctuation, rather than a compatibility character. 1

Unlike other parentheses, for legacy reasons this and its pair are not automatically mirrored when used in text, so you need to choose the right code point based on the expected glyph shape. 1

Refs: [1] [unicode9] p389-390   

﴿

U+FD3F ORNATE RIGHT PARENTHESIS

Arabic punctuation.

This is considered to be traditional Arabic punctuation, rather than a compatibility character. 1

Unlike other parentheses, for legacy reasons this and its pair are not automatically mirrored when used in text, so you need to choose the right code point based on the expected glyph shape. 1

Refs: [1] [unicode9] p389-390   
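The lack of mirroring can be checked against the character properties. This Python sketch uses unicodedata.mirrored from the standard library, which returns 1 for characters that have the Bidi_Mirrored property and 0 otherwise; the result reflects the Unicode version bundled with your Python.

```python
import unicodedata

for ch in ("(", ")", "\uFD3E", "\uFD3F"):
    # 1 means the glyph is mirrored automatically in right-to-left text.
    print("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.mirrored(ch))
```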

Word ligatures

The word ligature range extends from U+FDF0 to U+FDFD, but the Unicode Standard only calls out a few of those codepoints for mention (those listed here), relegating the others to rarely used compatibility characters.

The shape of these characters varies significantly across different fonts. You can experiment with various font renderings using the font switch dialogue box (click on the blue horizontal bar at the bottom right of the page).

U+FDF2 ARABIC LIGATURE ALLAH ISOLATED FORM

≈ [isolated] 0627 0644 0644 0647

Arabic word ligature, Allah.

According to the text of the Unicode Standard, you should normally create this word ligature with the following sequence of ordinary characters: اللّٰه (click on the red text to see the list). However, the compatibility decomposition for this character in the Unicode database is just الله, ie. no combining characters, even though the text of the standard says that the ligature should not be formed by fonts when ـّ [U+0651 ARABIC SHADDA]  and ـٰ [U+0670 ARABIC LETTER SUPERSCRIPT ALEF]  are not present (because the four base characters exist in Persian and other languages in contexts where they have different meanings and pronunciations). 1

Shape The shape varies slightly from font to font, and is not always correct – for example, a number of fonts omit the initial alef. The Noto Naskh Arabic font used in the box above left doesn't produce the diacritics. Here is the rendering of this code point in the Unicode charts.

Refs: [1] [unicode9] p390   

Other word ligature, Allah.

See the information about usage in Arabic above.

In other languages a different form of heh may be used, eg. ہ [U+06C1 ARABIC LETTER HEH GOAL], ie. اللّٰہ. 1

Refs: [1] [unicode9] p390   

U+FDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM

≈ [isolated] 0635 0644 0649 0020 0627 0644 0644 0647 0020 0639 0644 064A 0647 0020 0648 0633 0644 0645

Arabic word ligature, honorific.

Honorific used after the name of God or Mohammed, meaning 'may God's peace and blessings be upon him'. 1

Its use is comparable to the combining honorific ـؐ [U+0610 ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM]. 1

The compatibility decomposition for this character in the Unicode database is صلى الله عليه وسلم – click on the red text to see the list of characters.

This character is sometimes used by Muslims writing in Latin or Cyrillic scripts. 1

Shape The shape varies slightly from font to font. Here is the rendering of this code point in the Unicode charts.

Refs: [1] [unicode9] p390   

U+FDFB ARABIC LIGATURE JALLAJALALOUHOU

Description in the Unicode standard:

≈ [isolated] 062C 0644 0020 062C 0644 0627 0644 0647

Arabic word ligature, honorific.

Honorific used after the name of God or Mohammed. 1

The compatibility decomposition for this character in the Unicode database is جل جلاله – click on the red text to see the list of characters.

This character is sometimes used by Muslims writing in Latin or Cyrillic scripts. 1

Shape The shape varies slightly from font to font. Here is the rendering of this code point in the Unicode charts.

Refs: [1] [unicode9] p390   

U+FDFD ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM

Arabic word ligature, basmala.

A common opening phrase, meaning "In the name of God, the Most Gracious, the Most Merciful". It is used more commonly than the other word ligatures shown above, and tends to appear above text. It is also used in other scripts, such as Bengali and Thaana. 1

This is the phrase recited before each sura (chapter) of the Qur'an – except for the ninth. It is used by Muslims in various contexts (for instance, during daily prayer). It also appears in over half of the constitutions of countries where Islam is the official religion or where more than half of the population follows Islam, usually as the first phrase in the preamble. These include the constitutions of Afghanistan, Bahrain, Bangladesh, Brunei, Egypt, Iran, Iraq, Kuwait, Libya, Maldives, Pakistan, Tunisia, and the United Arab Emirates. 2

There is no decomposition for this character.

Shape The shape varies significantly from font to font and usage to usage. Here is the rendering of this code point in the Unicode charts. See other renderings at Wikipedia.

Refs: [1] [unicode9] p390    [2] Wikipedia
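The compatibility decompositions mentioned for the word ligatures above can be inspected with Python's standard unicodedata module. This is only a sketch for exploring the data; the helper names compat and codepoints are invented here.

```python
import unicodedata

def compat(ch):
    """Return the compatibility decomposition (NFKD) of a character."""
    return unicodedata.normalize("NFKD", ch)

def codepoints(text):
    """Format a string as space-separated hex code points."""
    return " ".join("%04X" % ord(c) for c in text)

for ch in ("\uFDF2", "\uFDFA", "\uFDFB", "\uFDFD"):
    print("U+%04X" % ord(ch), "->", codepoints(compat(ch)))

# The decomposition of U+FDF2 contains no shadda or superscript alef.
print("\u0651" in compat("\uFDF2"), "\u0670" in compat("\uFDF2"))
```

One consequence is that applying NFKC or NFKD to text containing U+FDF2 yields the four base letters only, not the sequence with diacritics that the text of the standard recommends.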

Currency

U+FDFC RIAL SIGN

≈ [isolated] 0631 06CC 0627 0644

Persian currency symbol, rial.

Created by a typewriter standardisation committee in 1973, this is intended to be a condensed version of the word for the Iranian currency. The compatibility decomposition of this character is ریال (click on the red text to see the list).

The Unicode Standard says that this was only used for a short while in typewriters and keyboard layouts, and so is provided mainly for compatibility reasons. Persian users are said to prefer typing the word rather than using this symbol. 1

Refs: [1] [unicode9] p391   

Acknowledgements

The following people have provided helpful advice.

References

Last commit 2018-07-13 10:31 GMT.  •  Licence CC-By © r12a.