/* */ var charDetails = { '\u{0600}': `

`, '\u{0601}': `

`, '\u{0602}': `

`, '\u{0603}': `

`, '\u{0604}': `

`, '\u{0605}': `

Used in Arabic text with Coptic numbers, such as in early astronomical tables. Unlike the other Arabic number signs, it extends across the top of the sequence of digits, and is used with Coptic digits rather than Arabic digits.

The symbol should come before the number in logical order.

`, '\u{0606}': `

`, '\u{0607}': `

`, '\u{0608}': `

`, '\u{0609}': `

Per mille sign.

`, '\u{060A}': `

`, '\u{060B}': `

`, '\u{060C}': `

Comma.

`, '\u{060D}': `

Date separator.

`, '\u{060E}': `

`, '\u{060F}': `

`, '\u{0610}': `

Represents sallallahu alayhe wasallam sallallao alae va sallam (may God's peace and blessings be upon him) صلّى الله عليه وسلّم. Used over the name of Mohammed.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

The mark is really associated with a word, rather than a character, but the placement is left to the user. The mark is often added somewhere in the middle of a name, but commonly appears towards the end. This depends to some extent on the letter shapes present and the calligraphic style in use, eg. محمّدؐ muhamːed sallallao alae va sallam.

`, '\u{0611}': `

Represents alayhe asallam alejsallam (upon him be peace) عليه السّلام. Used over the name of prophets other than Mohammed.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

`, '\u{0612}': `

Represents rahmatulla alayhe raːmatʊlla alee (may God have mercy upon him) رحمت الله عليه. Used over the names of saints, religious authorities, and other deceased pious persons.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

`, '\u{0613}': `

Represents radi allahu 'anhu raziallaːo ano (may God be pleased with him) رضي الله عنه. Used over the names of the Companions of the Prophet.

One of several marks that represent phrases expressing the status of a person, most having specifically religious meaning.

`, '\u{0614}': `

Sign placed over the name or nom-de-plume of a poet, or in some writings used to mark all proper names.

`, '\u{0615}': `

`, '\u{0616}': `

`, '\u{0617}': `

`, '\u{0618}': `

`, '\u{0619}': `

`, '\u{061A}': `

`, '\u{061B}': `

This is the standard equivalent of the semi-colon in Arabic text.

`, '\u{061C}': `

In an RTL context, the bidi algorithm expects a sequence of numbers separated by hyphens (for example) to run left to right when preceded by Hebrew or N’Ko text. However, it expects the sequence to run right to left when preceded by Arabic, Thaana, or Syriac characters.

ك 12-34-5678

އ 12-34-5678

ܐ 12-34-5678

ߕ 12-34-5678

א 12-34-5678

However, when a sequence like this appears alone on a line, it always runs left to right, because the bidi algorithm can't detect a previous Arabic, or other, character.

12-34-5678

This order is not appropriate for documents in the Arabic language (and presumably Dhivehi or Syriac).

The ALM is a way of producing the right sequencing by inserting what is effectively an invisible Arabic character before the number.

؜12-34-5678

RLM and RLI..PDI, etc, are unable to produce the RTL sequencing, because the difference lies in what script precedes the number.

This is all helpful for Arabic language text, but Persian doesn’t order these sequences RTL, so it’s necessary to use one (any) of the directional formatting characters before such sequences in Persian to prevent this special ordering.

Similar special ordering is applied to numbers in equations, such as 1 + 2 = 3 for Arabic language text.

`, '\u{061E}': `

`, '\u{061F}': `

This is the standard equivalent of the question mark in Arabic text.

`, '\u{0620}': `

`, '\u{0621}': `

ʔ glottal stop. For historical reasons this is treated as an orthographic sign rather than as a letter of the alphabet. It sometimes stands alone, but usually appears with a 'carrier' letter - alef, waw, or yeh (أ إ ؤ ئ) for which separate precomposed characters are available in Unicode.

This codepoint is used for representing the standalone hamza only. On its own it has no joining behaviour.

Hamza with carrier letters

When the hamza is above or below another character it is possible to use 0654 or 0655 with the appropriate base character, however Unicode provides precomposed characters which are usually preferred when writing Arabic.

The following are converted to precomposed characters by the NFC normalization form. (Only the first 4 are used for the Arabic language.)

0623
0625
0624
0626
ۂ U+06C2 LETTER HEH GOAL WITH HAMZA ABOVE
ۓ U+06D3 LETTER YEH BARREE WITH HAMZA ABOVE
ۀ U+06C0 LETTER HEH WITH YEH ABOVE (note that this decomposes to ە U+06D5 LETTER AE and ٔ U+0654 HAMZA ABOVE, not ه U+0647 LETTER HEH and ٔ U+0654 HAMZA ABOVE)

Important exceptions arise where the hamza is an integral part of the character itself (ie. an ijam). Examples of these characters include the following, none of which are used for the Arabic language.

ځ U+0681 LETTER HAH WITH HAMZA ABOVE
ݬ U+076C LETTER REH WITH HAMZA ABOVE
ࢡ U+08A1 LETTER BEH WITH HAMZA ABOVE
ࢨ U+08A8 LETTER YEH WITH TWO DOTS BELOW AND HAMZA ABOVE

Cutting and joining hamza in orthography

Classical Arabic distinguishes between 'cutting' and 'joining' hamza. 'Cutting' means always pronounced, 'joining' means frequently elided. The joining hamza is of little practical importance in modern Arabic pronounced without the old case endings.

The sign indicating a joining hamza is called a wasla (see 0671).

In modern printed Arabic, the hamza is rarely shown when it occurs at the beginning of a word.

Rules for choosing carriers

The following are simplified rules for use of (cutting) hamza:

At the beginning of a word hamza is always written with an alef carrier, regardless of which vowel it takes, eg.

أول

أجور

إدارة

When it takes an i-vowel the hamza is written below the alef. See 0623 and 0625.
In the middle of a word it is almost always written above a carrier letter, as one of أ ؤ ئ

Examples:

سأل

مؤمن

زائر

Which one depends on the vowels preceding and following the hamza, and the rules are complicated (and a common source of spelling errors among Arabs). See 0623, 0624, and 0626. When yeh is used as a mid-word carrier it loses its dots.
The rules are less clear for sequences that would produce two medial WAW letters in a row, eg. CuʔuːC

By one application of the rules, 0624 is replaced with a free-standing hamza (this letter), eg.

رؤوس

would be written رءوس rʔws

The rule appears to be less relevant in modern Arabic.
At the end of a word the hamza is written above a carrier after a short vowel, eg.

قرأ

تنبؤ

However, after a long vowel, a diphthong, or a vowelless letter this free-standing letter is used, eg.

بناء

مليء

جزء

شيء

The following tables are adapted from Wikipedia and show one set of rules for choosing a character to represent hamza in a given context. C or tatweel represents any consonant, ʔ represents this character, and v represents the vowel shown on the top row. The phones just before the table show what this character may represent in unvowelled text for that word position. Cells with * can also be spelled using a combining hamza over a base.

Word-medial. ʔɪ, ʔu, or ʔa, and ʔiː when followed by 064A, or ʔuː when followed by 0648, or ʔaː when followed by 0627.

	i	iː	u	uː	a	aː	examples
CiʔvC
CiːʔvC
CuʔvC
CuːʔvC	ـُوءِـ	ـُوءِيـ	ـُوءُـ	ـُوءُوـ	ـُوءَـ	ـُوءَاـ
CaʔvC
CaːʔvC				ـَاءُوـ	ـَاءَـ	ـَاءَاـ	قراءة
Diphthongs CajʔvC
CawʔvC	ـَوْءِـ*	ـَوْءِيـ*	ـَوْءُـ*	ـَوْءُوـ	ـَوْءَـ*	ـَوْءَاـ*
Clusters CʔvC				ـْءُوـ
CvʔC				ـُوءْـ		ـَاءْـ

Word-final.

ɪʔ, uʔ, or aʔ, and iːʔ when preceded by 064A, or uːʔ when preceded by 0648, or aːʔ when preceded by 0627.

	i	iː	u	uː	a	aː	examples
Cvʔ	ـِء*	ـِيء	ـُء*	ـُوء	ـَء*	ـَاء	مليء شيء بناء

It can also occur word-finally in a consonant cluster, eg.

جزء
`, '\u{0622}': `

ʔaː Represents the hamza (ء) in word-initial and word-medial positions.

Word-initial. ʔaː

	i	iː	u	o	a	aː	examples
ʔvC						آـــ	آخر

Word-medial. ʔaː

	aː	examples
CiʔvC
CiːʔvC
CuʔvC
CuːʔvC
CaʔvC	ـــَآـــ	ثآليل
CaːʔvC
Diphthongs CajʔvC
CawʔvC	ـــَوْآـــ
Clusters CʔvC	ـــْآـــ	قرآني
CvʔC

Word-final. Not used.

`, '\u{0623}': `

ʔ Represents the hamza (ء) in word-initial, -medial, and -final positions.

At the beginning of a word hamza is always written with an alef carrier, regardless of which vowel it takes. Which of the possible alternative sequences (أ, ؤ or ئ) is used to represent hamza mid-word depends on the vowels preceding and following the hamza. The rules are complicated (and a common source of spelling errors among Arabs).

Word-initial. ʔa or ʔʊ, or ʔo when followed by 0648. May also represent ʔoː in words like أوتيل.

	i	iː	u	o	a	aː	examples
ʔvC			أُـــ	أُوـــ	أَـــ		أول أجور أوتوبيس

Word-medial. ʔa or aʔ

	a	examples
CiʔvC
CiːʔvC
CuʔvC
CuːʔvC
CaʔvC	ـــَأَـــ	سأل
CaːʔvC
Diphthongs CajʔvC
CawʔvC	ـــَوْأَـــ
Clusters CʔvC	ـــْأَـــ
CvʔC	ـــَأْـــ	مأمور

Notes:
CawʔaC can also be written ـــَوْءَـــ.

Word-final. aʔ

	i	iː	u	uː	a	aː	examples
Cvʔ					ـــَأ		قرأ

Notes: May also be written ـــَء.

See 0621 for more information about hamza. See also 0625, 0624, and 0626.

`, '\u{0624}': `

Represents the hamza (ء) in the middle or at the end of a word. It is not used word-initially.

In the middle of a word the hamza is almost always written above a carrier letter (its 'seat'). Which one depends on the vowels preceding and following the hamza, and the rules are complicated (and a common source of spelling errors among Arabs).

Word-medial.

ʔu or ʔa, and ʔuː when followed by 0648, or ʔaː when followed by 0627.

	u	uː	a	aː	examples
CiʔvC
CiːʔvC
CuʔvC	ـــُؤُـــ	ـــُؤُوـــ	ـــُؤَـــ	ـــُؤَاـــ	يؤثر سؤال رؤوس
CuːʔvC
CaʔvC	ـــَؤُـــ	ـــَؤُوـــ			صؤل
CaːʔvC	ـــَاؤُـــ
Diphthongs CajʔvC
CawʔvC	ـــَوْؤُـــ
Clusters CʔvC	ـــْؤُـــ	ـــْؤُوـــ			مسؤول
CvʔC	ـــُؤْـــ				مؤمن

Notes:
CawʔuC can also be written using ـــَوْءُـــ

The rules are less clear for sequences that produce two WAW letters in a row, eg. –uʔuː–

Some texts follow the rule that this letter is replaced with 0621, eg. رؤوس would be written رءوس rʔws

The rule appears to be less relevant in modern Arabic.

Word-final. uʔ

	i	iː	u	uː	a	aː	examples
Cvʔ			ـــُؤ				تنبؤ

Notes: May also be written ـــُء

See 0621 for more information about hamza. See also 0623, 0625, and 0626.

`, '\u{0625}': `

ʔ Represents the hamza (ء), but is only used at the beginning of a word.

At the beginning of a word hamza is always written with an alef carrier, regardless of which vowel it takes.

The following table is adapted from Wikipedia and shows one set of rules for choosing a character to represent hamza in a given context. C or tatweel represents any consonant, ʔ represents this character, and v represents the vowel shown on the top row. The phones just before the table show what this character may represent in unvowelled text for that word position.

Word-initial

ʔɪ, or ʔiː when followed by 064A.

	i	iː	u	uː	a	aː	examples
ʔvC	إِـــ	إِيـــ					إدارة, إيقاف

Word-medial, -final

The mid-word and word-final equivalent of this character is 0626.

See 0621 for more information about hamza. See also 0623, 0624, and 0626.

`, '\u{0626}': `

Represents the hamza (ء) in the middle of or at the end of a word. It is not used word-initially.

When yeh is used as a mid-word carrier it loses its dots.

In the middle of a word the hamza is almost always written above a carrier letter as one of أ ؤ ئ

Which one depends on the vowels preceding and following the hamza, and the rules are complicated (and a common source of spelling errors among Arabs).

Word-medial

ʔɪ, ʔu, or ʔa, and ʔiː when followed by 064A, or ʔuː when followed by 0648, or ʔaː when followed by 0627.

	i	iː	u	uː	a	aː	examples
CiʔvC	ـــِئِـــ	ـــِئِيـــ	ـــِئُـــ	ـــِئُوـــ	ـــِئَـــ	ـــِئَاـــ
CiːʔvC	ـــِيئِـــ	ـــِيئِيـــ	ـــِيئُـــ	ـــِيئُوـــ	ـــِيئَـــ	ـــِيئَاـــ
CuʔvC	ـــُئِـــ	ـــُئِيـــ
CuːʔvC
CaʔvC	ـــَئِـــ	ـــَئِيـــ					نائم رئيس
CaːʔvC	ـــَائِـــ	ـــَائِيـــ					زائر نهائي
Diphthongs CajʔvC	ـــَيْئِـــ	ـــَيْئِيـــ	ـــَيْئُـــ	ـــَيْئُوـــ	ـــَيْئَـــ	ـــَيْئَاـــ
CawʔvC	ـــَوْئِـــ	ـــَوْئِيـــ
Clusters CʔvC	ـــْئِـــ	ـــْئِيـــ
CvʔC	ـــِئْـــ	ـــِيئْـــ

Notes: The items in the last row can alternatively be written using ء [U+0621 ARABIC LETTER HAMZA].

Word-final.

ɪʔ

At the end of a word the hamza is written above a carrier after a short i vowel only. However, it may also be written in this position using 0621.

	i	iː	u	uː	a	aː	examples
Cvʔ	ـــِئ

Notes: May also be written as ـــِء.

See 0621 for more information about hamza. See also 0623, 0625, and 0624.

`, '\u{0627}': `

Formally speaking, this letter has no sound of its own. It is really a vowel lengthener and carrier. Its main uses in Arabic orthography are:

as a carrier letter for a word-initial vowel, eg. الآن انتباه
as a lengthening sign for the a-vowel, eg. احتاج
as a carrier letter for the hamza (see 0623, and 0625).

أحمر

إيقاف

That said, its presence usually indicates the location of a vowel.

It also has one or two minor functions such as in conjunction with tanwiin (nunation) (see 064B).

Certain parts of the Arabic verb end in a long u-vowel that is conventionally written with a following alef that has no effect on pronunciation, eg.

كتبوا

The alef is omitted if a suffix is added, eg.

كتبوها

See also 0623, 0625, and 0671.

Arabic definite article

When the definite article ال (alif followed by lām) precedes the following consonants the l sound is dropped and the following consonant lengthened.

ت ث د ذ ر ز س ش ص ض ط ظ ل ن

These are called 'sun letters' in Arabic. The other letters are 'moon letters'.j§32

The alif is also not pronounced if the preceding word ends with a vowel or h. It is, however, written.j§32

Shape

The combination 0644 0627 is always rendered as a ligature. In Arabic language text, it typically looks like لا if it doesn't join, or ‍لا otherwise. All combinations of these 2 letters with other diacritics, such as vowels or the precomposed characters mentioned just above, ligate.

`, '\u{0628}': `

b consonant.

بيت كَبِير مَكْتَب ضَبَاب

May be used for foreign words instead of p, eg. باريس

`, '\u{0629}': `

Feminine indicator.

∅ Only used in word-final position, and usually with no sound, however it may have a weak h soundj§60, eg.

مدرسة

ja when preceded by iːj§27, eg.

يَابَانِية

Used for historical reasons to write the feminine ending a – the dots are borrowed from teh (ت).

Vowelled text may omit the short a diacritic before the teh marbuta, because the sound is always the same.

If any suffix is added the ending is spelled with 062A, eg.

مدرستنا

In modern Arabic it is not uncommon to find the two dots omitted, particularly on masculine proper names that have the feminine ending, eg. طلبة ᵵlbẗ tulbæ

ref? Should this be written using heh, rather than teh marbuta? Scheherazade doesn't have a switch.

`, '\u{062A}': `

t consonant.

تلك مَتَى بيت حُوت `, '\u{062B}': `

θ consonant.

ثلاثة كَثِير حَدِيث ثلاثة `, '\u{062C}': `

d͡ʒ in Algerian, Hejazi, Najdi, Iraqi, and Gulf regions, eg.

جبل نَجْم ثَلْج زَوْج

ʒ in Moroccan, Tunisian, Egyptian, Levantine, and Israeli regions.wp§#Local_variations_of_Modern_Standard_Arabic

`, '\u{062D}': `

ħ consonant.

حتى نحو رِيح جَنَاح `, '\u{062E}': `

x consonant.

خلال أشخص وَسِخ خوخ `, '\u{062F}': `

d consonant (right-joining only).

ديوان وَاحِدَة نقود `, '\u{0630}': `

ð consonant (right-joining only).

ذَاتِيَّة قَذِر نفاذ `, '\u{0631}': `

r consonant (right-joining only).

رؤوس مَرْأَة دينار `, '\u{0632}': `

z consonant (right-joining only).

زيارة جزء أرز `, '\u{0633}': `

s consonant.

سنة يَسَار يَابِس فيروس `, '\u{0634}': `

ʃ consonant.

شمس مَشَى حَنَش جأش `, '\u{0635}': `

sˤ pharyngealised consonant.

صحيفة قَصِير مقص قرص `, '\u{0636}': `

dˤ pharyngealised consonant.

ضمان خَضْرَاء أَبْيَض أَرْض `, '\u{0637}': `

tˤ pharyngealised consonant.

طريق مَطَر اختط اشترط `, '\u{0638}': `

ðˤ pharyngealised consonant.

ظلام عَظْم قيظ قلاووظ

zˤ sometimes.

`, '\u{0639}': `

ʕ consonant.

على لَعِبَ وَاسِع زَرْع `, '\u{063A}': `

ɣ consonant.

غابة صَغِير أمازيغ وزغ `, '\u{063B}': `

`, '\u{063C}': `

`, '\u{063D}': `

`, '\u{063E}': `

`, '\u{063F}': `

`, '\u{0640}': `

Used to stretch words for simple justification, or to make a word or phrase a particular width, or as a form of emphasis.

Better quality justification systems stretch glyphs, rather than adding baseline extensions. This dynamic stretching of glyphs is often called kashida. In some typesetting systems, such as InDesign, the tatweel character serves more to indicate opportunities for stretching, and the glyph for the character itself is not shown.

Which words are stretched and how much depends on a set of rules that tends to vary for different font styles. (Elongation is not normally used at all for the ruq'a style.) The following is an example of text justified using tatweel from a newspaper column.

Justified Arabic text.

One of the major problems when using tatweel to stretch text is that when the width of the space in which the text is displayed changes, or when new text is added near the start of a paragraph, lines wrap differently and all the places where tatweel would be needed have to be recalibrated. Thus tatweels only work for static text with fixed dimensions.

It is very common to see baseline stretching in modern Arabic text where a word or phrase is stretched to fill a particular space, eg. the Arabic tag line (الابداع المتجدد Creativity renewed) below the word Lexus in the following image is stretched to be the same width.

Arabic text stretched to fit the width of the word Lexus.

Descriptions elsewhere: Syriac.

`, '\u{0641}': `

f consonant.

فقيد طِفْل كيف إيقاف

In North Africa this letter sometimes has one dot below. There is a separate code point for that: 06A2.

`, '\u{0642}': `

q consonant.

قدم نقود طريق فندق

In North Africa an alternative version of this letter with only one dot above is sometimes used. There is a separate code point for that: 06A7.

`, '\u{0643}': `

k consonant.

كيف سكرتير سَمَك هناك `, '\u{0644}': `

l consonant.

لحم مِلْح رَجُل شِمَال

ɫ occurs as a phoneme in a handful of loanwords, though not in all pronunciations. It also occurs in the name اللّٰه ɑll˖̍h aɫˈɫaː.

`, '\u{0645}': `

m consonant.

منذ يَمِين نَجْم يَوْم `, '\u{0646}': `

n consonant.

نحو هُنَا عَيْن ديوان `, '\u{0647}': `

h consonant.

هناك نهائي شبيه انتباه

Calendar marker. The marker for hijri dates is an initial form of heh, even though it doesn't join to the left, ie. ه‍. For this, use a 200D immediately after the heh, eg.

الاثنين 10 رجب 1415 ه‍.

In some cases 0640 is used to ensure that the shape looks right, because some applications or fonts don't produce the right effect when using the ZWJ, eg.

الاثنين 10 رجب 1415 هـ.

`, '\u{0648}': `

w (right-joining only) as a consonant, eg. وَاسِع نحو مرو

-w as part of the aw diphthong, eg. يَوْم

uː as mater lectionis, indicating location of a long u-vowel, eg. نقود

o or oː in certain foreign words, eg. بنطلون

A final, unpronounced waw is used to distinguish the two following names, which are otherwise written identically, but pronounced differently:

عمرو عمر

Combinations

ʔ is ؤ

ُو

uː is ُو in vowelled text.

`, '\u{0649}': `

aː vowel. حَتَّى أَفْعًى رأى

The long a-vowel at the end of many words is written with yeh instead of an alef. In this case the yeh is typically printed without dots, to avoid confusion, although this is not universal. This spelling only occurs with certain words, and only when the final sound is aː, eg. معنى

If any suffix is added, the spelling reverts to the normal alef, eg. معناهم mʕnɑhm maʕnaː-hʊm

Older texts may show this letter in word-medial positions. Some fonts may not show dual joining in that situation.

`, '\u{064A}': `

j as a consonant followed by a vowel, eg. يَمّ سيارة نسبي ناي

-j as part of the diphthong aj, eg. بيت

iː as a mater lectionis, eg. في

eː approximately, in certain foreign words, eg.

سكرتير أوتيل

Combinations

ʔ is ئ

ِي

iː is ِي in vowelled text.

Use with hamza

When used with 0654 the two dots are suppressed in all positions. Text in NFC actually uses 0626 rather than the decomposed sequence, so that is recommended. However, even if the decomposed sequence 064A 0654 is used, the font should hide the dots.

Unlike this character, ࢨ U+08A8 LETTER YEH WITH TWO DOTS BELOW AND HAMZA ABOVE retains the two dots in all forms, however it also represents a semantically different character, that is not used for the Arabic language.

`, '\u{064B}': `

Infrequent.

In classical Arabic, indefinite nouns and adjectives were marked by a final n-sound, called تنوين tænwiːn or, in English, 'nunation'. This is normally indicated by doubling the vowel diacritic. On a word ending with an a-vowel (though not with a feminine ending or some other suffixes) an extra alef was also added at the end of the word. In modern Arabic printing the fathatan is usually dropped, but the alef is retained. The pronunciation of the ending æn is also retained in many words. Examples:

كِتَابًا kitaɑbaⁿɑ kɪtæːbæn

فَرَسًا farasaⁿɑ færæsæn

This is often used in the combination 064B 0627, where the ALEF is silent and the ending is pronounced -an, eg. فورًا

The same applies before ALEF MAKSURA, eg.

أفعًى

If it appears as 064E 0629 064B the pronunciation is -atan, eg.

عادةً

After a final YEH, the pronunciation has an extra j soundjm§51, ie. -iːjan, eg.

رسميًا

`, '\u{064C}': `

Infrequent.

In classical Arabic, indefinite nouns and adjectives were marked by a final n-sound, called تنوين tænwiːn or, in English, 'nunation'. This is normally indicated by doubling the vowel diacritic, eg.

جَبَلٌ ʤabaluⁿ ʒæbælun

Not usually shown in modern text (exceptions in the Qur'an, difficult older texts, and children's schoolbooks).

`, '\u{064D}': `

Infrequent.

قَلَمٍ qalamiⁿ qælæmɪn

Not usually shown in modern text (exceptions in the Qur'an, difficult older texts, and children's schoolbooks).

`, '\u{064E}': `

Infrequent. Only used in vowelled text.

æ or a after ص ض ط ظ غ ق and sometimes after خ ر ل, eg. قَدَم

Actual pronunciation varies with context.

Not usually shown in text (exceptions tend to be the Qur'an, difficult older texts, and children's schoolbooks).

Combinations

aː when followed by 0627, eg. سَاق

aj diphthong when followed by unvowelled 064A, eg. لَيْلَة

aw diphthong when followed by unvowelled 0648, eg. يَوْم

`, '\u{064F}': `

Infrequent. Only used in vowelled text.

ʊ vowel.

أُذُن

Actual pronunciation varies with context.

Not usually shown in text (exceptions tend to be the Qur'an, difficult older texts, and children's schoolbooks).

Combinations

uː when followed by 0648, eg. دُودَة

o sometimes for foreign words, eg. أُكتُوبَر

oː for other foreign words, when followed by 0648, eg. أُوتِيل

`, '\u{0650}': `

Infrequent. Only used in vowelled text.

ɪ vowel.

اِسْم

Actual pronunciation varies with context.

Not usually shown in text (exceptions tend to be the Qur'an, difficult older texts, and children's schoolbooks).

Combinations

iː when followed by 064A, eg. رِيح

eː sometimes for foreign words in the combination 0650 064A, eg. سِكْرِتِير

`, '\u{0651}': `

Infrequent. Only used in vowelled text.

Diacritic that doubles the length of the supporting consonant, eg.

ثَمَّ

Visible in Arabic printing, but not always marked consistently.

A common, though not universal, practice is to display any combining kasra below the shadda, rather than below the base consonant, eg.

مُمَثِّلْ

Some fonts, such as Amiri, don't do this.

The sign derives from a miniature nucleus of 0634, without dots.

`, '\u{0652}': `

Infrequent. Only used in vowelled text.

Indicates that no vowel follows the consonant to which this is attached, eg.

مَكْتَب

Not usually shown in text (exceptions tend to be the Qur'an, difficult older texts, and children's schoolbooks).

`, '\u{0653}': `

Rare. Found in decomposed text only; used only with ا.

Appears above 0627 in decomposed text to represent the sound ʔaː, however Unicode provides a precomposed character, 0622, which is usually preferred, eg.

الْآنَ

There is a great deal more to be said about the use of hamza in the Arabic orthography. Follow the link above and especially 0621.

`, '\u{0654}': `

Rare. Found in decomposed text only.

ʔ The hamza sometimes stands alone (see 0621), but usually appears with a 'carrier' letter - alef, waw, or yeh ( أ إ ؤ ئ ).

This combining character may appear above alef, waw, or yeh in decomposed text, however Unicode provides the following precomposed characters which are usually preferred.

0623
0624
0626

There is a great deal more to be said about the use of hamza in the Arabic orthography. Follow the links above and, especially, 0621.

`, '\u{0655}': `

Rare. Found in decomposed text only.

ʔ The hamza sometimes stands alone (see 0621), but usually appears with a 'carrier' letter - alef, waw, or yeh ( أ إ ؤ ئ ).

This combining character may appear below alef in decomposed text, however Unicode provides a precomposed character, 0625, which is usually preferred.

There is a great deal more to be said about the use of hamza in the Arabic orthography. Follow the links above and, especially, 0621.

`, '\u{0656}': `

`, '\u{0657}': `

`, '\u{0658}': `

`, '\u{0659}': `

`, '\u{065A}': `

`, '\u{065B}': `

`, '\u{065C}': `

`, '\u{065D}': `

`, '\u{065E}': `

`, '\u{065F}': `

`, '\u{0660}': `

0 digit.

`, '\u{0661}': `

1 digit.

`, '\u{0662}': `

2 digit.

`, '\u{0663}': `

3 digit.

`, '\u{0664}': `

4 digit.

`, '\u{0665}': `

5 digit.

`, '\u{0666}': `

6 digit.

`, '\u{0667}': `

7 digit.

`, '\u{0668}': `

8 digit.

`, '\u{0669}': `

9 digit.

`, '\u{066A}': `

Used with Arabic-Indic numerals, eg. ١٢٪.^h

It is also possible to find 0025.

`, '\u{066B}': `

Used in Arabic, but not in common use (possibly because it is not available on the keyboard), eg. ١٫٢٣. Khaled Hosny reports that people usually use , U+002C COMMA or even a small 0631 (mainly in newspapers, probably because it looks closer to this character than a comma)g49§#issuecomment-222665388.

`, '\u{066C}': `

Used in old documents, eg. ١٬٢٣٤, but not common in recent useg49§#issuecomment-222665388. In Morocco, it seems that a comma or a space is more common, though there may also be no separator.g49§#issuecomment-224670870

`, '\u{066D}': `

Five-pointed star.

`, '\u{066E}': `

Sometimes used with ۪ [U+06EA ARABIC EMPTY CENTRE LOW STOP] to hack the initial and medial forms of the palatalisation letter in Kashmiri. ؠ [U+0620 ARABIC LETTER KASHMIRI YEH] should be used instead. Fonts will automatically apply a circle diacritic for initial and medial positions (only).

`, '\u{066F}': `

`, '\u{0670}': `

Not often written, even in vowelled text. Used in only a few Arabic words, but they tend to be common ones, such as the following: هٰذَا ذٰلِكَ اللّٰه

`, '\u{0671}': `

Classical Arabic distinguishes between 'cutting' and 'joining' hamza. 'Cutting' means always pronounced, 'joining' means frequently elided. The sign indicating a joining hamza is called a wasla.

The joining hamza is of little practical importance in modern Arabic pronounced without the old case endings. Example: مَا ٱسْمُكَ

`, '\u{0672}': `

`, '\u{0673}': `

`, '\u{0674}': `

`, '\u{0675}': `

`, '\u{0676}': `

`, '\u{0677}': `

`, '\u{0678}': `

`, '\u{0679}': `

`, '\u{067A}': `

`, '\u{067B}': `

`, '\u{067C}': `

`, '\u{067D}': `

`, '\u{067E}': `

p Used only occasionally, for foreign wordsjm§54, eg. پاريس

`, '\u{067F}': `

`, '\u{0680}': `

`, '\u{0681}': `

`, '\u{0682}': `

`, '\u{0683}': `

`, '\u{0684}': `

`, '\u{0685}': `

`, '\u{0686}': `

t͡ʃ Used only occasionally, for foreign wordsjm§54, eg. چاكارتا

`, '\u{0687}': `

`, '\u{0688}': `

`, '\u{0689}': `

`, '\u{068A}': `

`, '\u{068B}': `

`, '\u{068C}': `

`, '\u{068D}': `

`, '\u{068E}': `

`, '\u{068F}': `

`, '\u{0690}': `

`, '\u{0691}': `

`, '\u{0692}': `

`, '\u{0693}': `

`, '\u{0694}': `

`, '\u{0695}': `

`, '\u{0696}': `

`, '\u{0697}': `

`, '\u{0698}': `

`, '\u{0699}': `

`, '\u{069A}': `

`, '\u{069B}': `

`, '\u{069C}': `

`, '\u{069D}': `

`, '\u{069E}': `

`, '\u{069F}': `

`, '\u{06A0}': `

`, '\u{06A1}': `

`, '\u{06A2}': `

A Maghrebi alternative form of 0641 used in North Africa (but not Libya & Algeria).

`, '\u{06A3}': `

`, '\u{06A4}': `

v Used only occasionally, for foreign wordsjm§54, eg. ڤيينا

`, '\u{06A5}': `

`, '\u{06A6}': `

`, '\u{06A7}': `

A Maghrebi alternative form of 0642 used in North Africa.

It may be dotless in isolated and final positions and dotted in the initial and medial forms.wa§#Regional_variations

`, '\u{06A8}': `

`, '\u{06A9}': `

`, '\u{06AA}': `

`, '\u{06AB}': `

`, '\u{06AC}': `

`, '\u{06AD}': `

`, '\u{06AE}': `

`, '\u{06AF}': `

`, '\u{06B0}': `

`, '\u{06B1}': `

`, '\u{06B2}': `

`, '\u{06B3}': `

`, '\u{06B4}': `

`, '\u{06B5}': `

`, '\u{06B6}': `

`, '\u{06B7}': `

`, '\u{06B8}': `

`, '\u{06B9}': `

`, '\u{06BA}': `

`, '\u{06BB}': `

`, '\u{06BC}': `

`, '\u{06BD}': `

`, '\u{06BE}': `

`, '\u{06BF}': `

`, '\u{06C0}': `

`, '\u{06C1}': `

`, '\u{06C2}': `

`, '\u{06C3}': `

`, '\u{06C4}': `

`, '\u{06C5}': `

`, '\u{06C6}': `

`, '\u{06C7}': `

`, '\u{06C8}': `

`, '\u{06C9}': `

`, '\u{06CA}': `

`, '\u{06CB}': `

`, '\u{06CC}': `

`, '\u{06CD}': `

`, '\u{06CE}': `

`, '\u{06CF}': `

`, '\u{06D0}': `

`, '\u{06D1}': `

`, '\u{06D2}': `

`, '\u{06D3}': `

`, '\u{06D4}': `

`, '\u{06D5}': `

`, '\u{06D6}': `

`, '\u{06D7}': `

`, '\u{06D8}': `

`, '\u{06D9}': `

`, '\u{06DA}': `

`, '\u{06DB}': `

`, '\u{06DC}': `

Rare. Koranic annotation.

Used in more than one way with 0652 in Qur'anic text.

Used closer to 0635 than a sukun, it indicates that the base letter should be pronounced s as in seem, eg.

بَصْۜطَةً

(Note that this is incorrectly displayed in some fonts due to problems with the Unicode combining-character ordering rules. There is a proposal to address this in UTR #53.)

Used above a sukun, it is a pause-related hint, eg. مَالِيَةْۜ.

`, '\u{06DD}': `

Rare. Koranic annotation.

Used to indicate numbered verses in the Qur'an. The symbol encloses the numbers, eg.

٣۝

٤٣۝

See also U+08E2 ARABIC DISPUTED END OF AYAH, which is used occasionally in Qur'anic text to mark a verse for which there is scholarly disagreement about the location of the end of the verse.

`, '\u{06DE}': `

`, '\u{06DF}': `

`, '\u{06E0}': `

`, '\u{06E1}': `

`, '\u{06E2}': `

`, '\u{06E3}': `

`, '\u{06E4}': `

`, '\u{06E5}': `

`, '\u{06E6}': `

`, '\u{06E7}': `

`, '\u{06E8}': `

`, '\u{06E9}': `

`, '\u{06EA}': `

`, '\u{06EB}': `

`, '\u{06EC}': `

`, '\u{06ED}': `

`, '\u{06EE}': `

`, '\u{06EF}': `

`, '\u{06F0}': `

`, '\u{06F1}': `

`, '\u{06F2}': `

`, '\u{06F3}': `

`, '\u{06F4}': `

`, '\u{06F5}': `

`, '\u{06F6}': `

`, '\u{06F7}': `

`, '\u{06F8}': `

`, '\u{06F9}': `

`, '\u{06FA}': `

`, '\u{06FB}': `

`, '\u{06FC}': `

`, '\u{06FD}': `

`, '\u{06FE}': `

`, '\u{06FF}': `

`,
// Arabic Supplement
'\u{0750}': `

`, '\u{0751}': `

`, '\u{0752}': `

`, '\u{0753}': `

`, '\u{0754}': `

`, '\u{0755}': `

`, '\u{0756}': `

`, '\u{0757}': `

`, '\u{0758}': `

`, '\u{0759}': `

`, '\u{075A}': `

`, '\u{075B}': `

`, '\u{075C}': `

`, '\u{075D}': `

`, '\u{075E}': `

`, '\u{075F}': `

`, '\u{0760}': `

`, '\u{0761}': `

`, '\u{0762}': `

`, '\u{0763}': `

`, '\u{0764}': `

`, '\u{0765}': `

`, '\u{0766}': `

`, '\u{0767}': `

`, '\u{0768}': `

`, '\u{0769}': `

`, '\u{076A}': `

`, '\u{076B}': `

`, '\u{076C}': `

`, '\u{076D}': `

`, '\u{076E}': `

`, '\u{076F}': `

`, '\u{0770}': `

`, '\u{0771}': `

`, '\u{0772}': `

`, '\u{0773}': `

`, '\u{0774}': `

`, '\u{0775}': `

`, '\u{0776}': `

`, '\u{0777}': `

`, '\u{0778}': `

`, '\u{0779}': `

`, '\u{077A}': `

`, '\u{077B}': `

`, '\u{077C}': `

`, '\u{077D}': `

`, '\u{077E}': `

`, '\u{077F}': `

`, // Arabic Extended-B '\u{0870}': `

ࡰ

`, '\u{0871}': `

ࡱ

`, '\u{0872}': `

ࡲ

`, '\u{0873}': `

ࡳ

`, '\u{0874}': `

ࡴ

`, '\u{0875}': `

ࡵ

`, '\u{0876}': `

ࡶ

`, '\u{0877}': `

ࡷ

`, '\u{0878}': `

ࡸ

`, '\u{0879}': `

ࡹ

`, '\u{087A}': `

ࡺ

`, '\u{087B}': `

ࡻ

`, '\u{087C}': `

ࡼ

`, '\u{087D}': `

ࡽ

`, '\u{087E}': `

ࡾ

`, '\u{087F}': `

ࡿ

`, '\u{0880}': `

ࢀ

`, '\u{0881}': `

ࢁ

`, '\u{0882}': `

ࢂ

`, '\u{0883}': `

ࢃ

`, '\u{0884}': `

ࢄ

`, '\u{0885}': `

ࢅ

`, '\u{0886}': `

ࢆ

`, '\u{0887}': `

ࢇ

`, '\u{0888}': `

࢈

`, '\u{0889}': `

ࢉ

`, '\u{088A}': `

ࢊ

`, '\u{088B}': `

ࢋ

`, '\u{088C}': `

ࢌ

`, '\u{088D}': `

ࢍ

`, '\u{088E}': `

ࢎ

`, '\u{0890}': `

࢐

Egyptian currency sign which extends across the top of a sequence of digits. The shape is usually based on a dotless 'head of jeem' above the amount. It is occasionally based on a dotted jeem instead. It is used in advertising and price tags, as well as in hand-written texts. [u§381]

`, '\u{0891}': `

࢑

Egyptian currency sign which extends across the top of a sequence of digits. The shape is a mirrored version of ࢐ [U+0890 ARABIC POUND MARK ABOVE]. It is used in advertising and price tags, as well as in hand-written texts. [u§381]

`, '\u{0898}': `

࢘

`, '\u{0899}': `

࢙

`, '\u{089A}': `

࢚

`, '\u{089B}': `

࢛

`, '\u{089C}': `

࢜

`, '\u{089D}': `

࢝

`, '\u{089E}': `

࢞

`, '\u{089F}': `

࢟

`, // Arabic Extended-A '\u{08A0}': `

ࢠ

`, '\u{08A1}': `

ࢡ

`, '\u{08A2}': `

ࢢ

`, '\u{08A3}': `

ࢣ

`, '\u{08A4}': `

ࢤ

`, '\u{08A5}': `

ࢥ

`, '\u{08A6}': `

ࢦ

`, '\u{08A7}': `

ࢧ

`, '\u{08A8}': `

ࢨ

`, '\u{08A9}': `

ࢩ

`, '\u{08AA}': `

ࢪ

`, '\u{08AB}': `

ࢫ

`, '\u{08AC}': `

ࢬ

`, '\u{08AD}': `

ࢭ

`, '\u{08AE}': `

ࢮ

`, '\u{08AF}': `

ࢯ

`, '\u{08B0}': `

ࢰ

`, '\u{08B1}': `

ࢱ

`, '\u{08B2}': `

ࢲ

zˤ consonant. Sometimes used for writing Berber sounds. [lpz]

`, '\u{08B3}': `

ࢳ

`, '\u{08B4}': `

ࢴ

`, '\u{08B5}': `

ࢵ

`, '\u{08B6}': `

ࢶ

`, '\u{08B7}': `

ࢷ

`, '\u{08B8}': `

ࢸ

`, '\u{08B9}': `

ࢹ

`, '\u{08BA}': `

ࢺ

`, '\u{08BB}': `

ࢻ

`, '\u{08BC}': `

ࢼ

`, '\u{08BD}': `

ࢽ

`, '\u{08BE}': `

ࢾ

`, '\u{08BF}': `

ࢿ

`, '\u{08C0}': `

ࣀ

`, '\u{08C1}': `

ࣁ

`, '\u{08C2}': `

ࣂ

`, '\u{08C3}': `

ࣃ

`, '\u{08C4}': `

ࣄ

`, '\u{08C5}': `

ࣅ

`, '\u{08C6}': `

ࣆ

`, '\u{08C7}': `

ࣇ

`, '\u{08C8}': `

ࣈ

`, '\u{08C9}': `

ࣉ

`, '\u{08CA}': `

࣊

`, '\u{08CB}': `

࣋

`, '\u{08CC}': `

࣌

`, '\u{08CD}': `

࣍

`, '\u{08CE}': `

࣎

`, '\u{08CF}': `

࣏

`, '\u{08D0}': `

࣐

`, '\u{08D1}': `

࣑

`, '\u{08D2}': `

࣒

`, '\u{08D3}': `

࣓

`, '\u{08D4}': `

ࣔ

`, '\u{08D5}': `

ࣕ

`, '\u{08D6}': `

ࣖ

`, '\u{08D7}': `

ࣗ

`, '\u{08D8}': `

ࣘ

`, '\u{08D9}': `

ࣙ

`, '\u{08DA}': `

ࣚ

`, '\u{08DB}': `

ࣛ

`, '\u{08DC}': `

ࣜ

`, '\u{08DD}': `

ࣝ

`, '\u{08DE}': `

ࣞ

`, '\u{08DF}': `

ࣟ

`, '\u{08E0}': `

࣠

`, '\u{08E1}': `

࣡

`, '\u{08E2}': `

࣢

`, '\u{08E3}': `

ࣣ

`, '\u{08E4}': `

ࣤ

`, '\u{08E5}': `

ࣥ

`, '\u{08E6}': `

ࣦ

`, '\u{08E7}': `

ࣧ

`, '\u{08E8}': `

ࣨ

`, '\u{08E9}': `

ࣩ

`, '\u{08EA}': `

࣪

`, '\u{08EB}': `

࣫

`, '\u{08EC}': `

࣬

`, '\u{08ED}': `

࣭

`, '\u{08EE}': `

࣮

`, '\u{08EF}': `

࣯

`, '\u{08F0}': `

ࣰ

`, '\u{08F1}': `

ࣱ

`, '\u{08F2}': `

ࣲ

`, '\u{08F3}': `

ࣳ

`, '\u{08F4}': `

ࣴ

`, '\u{08F5}': `

ࣵ

`, '\u{08F6}': `

ࣶ

`, '\u{08F7}': `

ࣷ

`, '\u{08F8}': `

ࣸ

`, '\u{08F9}': `

ࣹ

`, '\u{08FA}': `

ࣺ

`, '\u{08FB}': `

ࣻ

`, '\u{08FC}': `

ࣼ

`, '\u{08FD}': `

ࣽ

`, '\u{08FE}': `

ࣾ

`, '\u{08FF}': `

ࣿ

`, // PRESENTATION FORMS '\u{FBB2}': `

﮲

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBB3}': `

﮳

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBB4}': `

﮴

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBB5}': `

﮵

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBB6}': `

﮶

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBB7}': `

﮷

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBB8}': `

﮸

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBB9}': `

﮹

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBBA}': `

﮺

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBBB}': `

﮻

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBBC}': `

﮼

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBBD}': `

﮽

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBBE}': `

﮾

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBBF}': `

﮿

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBC0}': `

﯀

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBC1}': `

﯁

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FBC2}': `

﯂

A symbol used in educational materials to illustrate an ijam diacritic. It is never used as a combining mark, nor in composition with Arabic letter forms, but is simply a symbol.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD3E}': `

﴾

This is considered to be traditional Arabic punctuation, rather than a compatibility character. ¹

Unlike other parentheses, for legacy reasons this and its pair are not automatically mirrored when used in text, so you need to choose the right code point based on the expected glyph shape. [u§pp398-400]

`, '\u{FD3F}': `

﴿

This is considered to be traditional Arabic punctuation, rather than a compatibility character. ¹

`, '\u{FD40}': `

﵀

A word ligature used as an honorific representing رضي الله عنه ('May God be pleased with him!'). It is used with the names of Companions of the Prophet.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD41}': `

﵁

A word ligature used as an honorific with names of companions of the prophet.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD42}': `

﵂

A word ligature used as an honorific with names of companions of the prophet.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD43}': `

﵃

A word ligature used as an honorific with names of companions of the prophet.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD44}': `

﵄

A word ligature used as an honorific with names of companions of the prophet.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD45}': `

﵅

A word ligature used as an honorific with names of companions of the prophet.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD46}': `

﵆

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD47}': `

﵇

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD48}': `

﵈

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD49}': `

﵉

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD4A}': `

﵊

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD4B}': `

﵋

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD4C}': `

﵌

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD4D}': `

﵍

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD4E}': `

﵎

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FD4F}': `

﵏

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FDCF}': `

﷏

A word ligature used as an honorific and meaning 'His blessing upon us'. It is used in Christian texts.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FDF2}': `

ﷲ

According to the text of the Unicode Standard, you should normally create this word ligature with the following sequence of ordinary characters: اللّٰه (click on the red text to see the list). [u§pp398-400] However, the compatibility decomposition for this character in the Unicode database does not contain the combining marks, although the font may automatically insert those. The text of the standard also says that fonts should not form the ligature when ـّ [U+0651 ARABIC SHADDA] and ـٰ [U+0670 ARABIC LETTER SUPERSCRIPT ALEF] are not present, because the four base characters exist in Persian and other languages in contexts where they have different meanings and pronunciations. Here is how it looks in your browser: الله

Shape The shape varies slightly from font to font, and is not always correct – for example, a number of fonts omit the initial alef. Here is the rendering of this code point in the Unicode charts.

`, '\u{FDF4}': `

ﷴ

Ligated word used as an honorific for the name of Mohammed.

Although this has a decomposition mapping, it is sometimes used as a character. The compatibility decomposition for this character in the Unicode database is محمد – click on the red text to see the list of characters.

Shape The shape varies from font to font. Here is the rendering of this code point in the Unicode charts.

`, '\u{FDFA}': `

ﷺ

Honorific used after the name of God or Mohammed, meaning 'may God's peace and blessings be upon him'. ¹

Its use is comparable to the combining honorific ـؐ [U+0610 ARABIC SIGN SALLALLAHOU ALAYHE WASSALLAM]. ¹

The compatibility decomposition for this character in the Unicode database is صلى الله عليه وسلم – click on the red text to see the list of characters.

This character is sometimes used by Muslims writing in Latin or Cyrillic scripts. ¹

Shape The shape varies slightly from font to font. Here is the rendering of this code point in the Unicode charts.

`, '\u{FDFB}': `

ﷻ

Honorific used after the name of God or Mohammed. ¹

The compatibility decomposition for this character in the Unicode database is جل جلاله – click on the red text to see the list of characters.

This character is sometimes used by Muslims writing in Latin or Cyrillic scripts. ¹

Shape The shape varies slightly from font to font. Here is the rendering of this code point in the Unicode charts.

`, '\u{FDFC}': `

﷼

Created by a typewriter standardisation committee in 1973, this is intended to be a condensed version of the word for the Iranian currency.

The Unicode Standard says that this was only used for a short while in typewriters and keyboard layouts, and so is provided mainly for compatibility reasons. Persian users are said to prefer typing the word rather than using this symbol¹. Note also that the compatibility decomposition of this character is ریال, which includes ی [U+06CC ARABIC LETTER FARSI YEH]. It is therefore not appropriate to use this symbol for countries such as Oman, which also have a currency called rial.

`, '\u{FDFD}': `

﷽

A common opening phrase, meaning "In the name of God, the Most Gracious, the Most Merciful". It is used more commonly than the other word ligatures shown above, and tends to appear above text. It is also used in other scripts, such as Bengali and Thaana. ¹

This is the phrase recited before each sura (chapter) of the Qur'an – except for the ninth. It is used by Muslims in various contexts (for instance, during daily prayer). It also appears, usually as the first phrase in the preamble, in over half of the constitutions of countries where Islam is the official religion or where more than half of the population follows Islam, including those of Afghanistan, Bahrain, Bangladesh, Brunei, Egypt, Iran, Iraq, Kuwait, Libya, Maldives, Pakistan, Tunisia, and the United Arab Emirates. ²

There is no decomposition for this character.

Shape The shape varies significantly from font to font and usage to usage. Here is the rendering of this code point in the Unicode charts. See other renderings at Wikipedia.

`, '\u{FDFE}': `

﷾

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, '\u{FDFF}': `

﷿

A word ligature used as an honorific.

This character can be used in text. It does not have compatibility decompositions and (unlike most other code points in the Presentation Forms block) is not a compatibility character. [u§pp398-400]

`, // General punctuation '\u{2018}': `

‘

Closing quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.

`, '\u{2019}': `

’

Opening quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.

`, '\u{201C}': `

“

Closing quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.

`, '\u{201D}': `

”

Opening quotation mark. Unlike parentheses, the glyph of this character is not mirrored in right-to-left text.

`, '\u{2014}': `

—

Em dash.

`, '\u{0021}': `

Exclamation mark.

`, '\u{0025}': `

Percentage mark.

`, '\u{0028}': `

(

Opening parenthesis.

The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.

`, '\u{0029}': `

)

Closing parenthesis.

`, '\u{002E}': `

Full stop.

`, '\u{0030}': `

Digit.

`, '\u{0031}': `

Digit.

`, '\u{0032}': `

Digit.

`, '\u{0033}': `

Digit.

`, '\u{0034}': `

Digit.

`, '\u{0035}': `

Digit.

`, '\u{0036}': `

Digit.

`, '\u{0037}': `

Digit.

`, '\u{0038}': `

Digit.

`, '\u{0039}': `

Digit.

`, '\u{003A}': `

Colon.

`, '\u{00AB}': `

Opening quotation mark.

`, '\u{00BB}': `

Closing quotation mark.

`, // CGJ '\u{034F}': `

Combining grapheme joiner.

Used to produce special ordering of diacritics. The name is a misnomer, as it is generally used to break the normal sequence of diacritics.

`, '\u{2013}': `

–

En dash.

`, '\u{2026}': `

…

Ellipsis.

`, '\u{2030}': `

‰

Per mille sign.

`, '\u{2039}': `

‹

Quotation mark.

`, '\u{203A}': `

›

Quotation mark.

`, // zwnj '\u{200C}': `

‌

Zero-width non-joiner (ZWNJ).

An invisible character that prevents two adjacent letters from forming a visual connection with each other when rendered. Especially useful for educational illustrations, but also has real-world applications.

It is used to interrupt the shaping of joining glyphs in cursive scripts, and also used to manage the visual interactions of glyphs in other scripts, eg. to prevent the formation of conjuncts, position diacritics, etc.

More details:

Managing glyph shaping

`, // zwj '\u{200D}': `

‍

Zero-width joiner (ZWJ).

An invisible character that permits a letter to form a cursive connection without a visible neighbour. Especially useful for educational illustrations, but also has some real-world applications.

Also used with complex scripts to manage the visual representation of glyphs that normally interact, eg. to form conjuncts, position diacritics, etc.

More details:

Managing glyph shaping

`, // LRM '\u{200E}': `

An invisible character with strong LTR directional properties that can be used to produce the correct ordering of text, especially where there is a risk of spillover effects while the Unicode Bidirectional Algorithm is at work.

Generally referred to as LRM.

`, // RLM '\u{200F}': `

An invisible character with strong RTL directional properties that can be used to produce the correct ordering of text, especially where there is a risk of spillover effects while the Unicode Bidirectional Algorithm is at work.

Generally referred to as RLM.

`, // LRE '\u{202A}': `

Sets the start point for a range of inline text when applying a base direction of left-to-right. The range is terminated by 202C (PDF).

Use 2066 (LRI) rather than this character.

`, // RLE '\u{202B}': `

Sets the start point for a range of inline text when applying a base direction of right-to-left. The range is terminated by 202C (PDF).

Use 2067 (RLI) rather than this character.

`, // PDF '\u{202C}': `

Sets the end point for a range of inline text when applying a base direction. The range is started with either 202A (LRE) or 202B (RLE).

Use 2069 (PDI) and its associated range starters rather than this character.

`, // LRI '\u{2066}': `

Sets the start point for a range of inline text when applying a base direction of left-to-right, and isolates the text within that range from text outside it. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

This character should be used rather than 202A (LRE).

`, // RLI '\u{2067}': `

Sets the start point for a range of inline text when applying a base direction of right-to-left, and isolates the text within that range from text outside it. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

This character should be used rather than 202B (RLE).

`, // FSI '\u{2068}': `

Sets the start point for a range of inline text when applying a base direction, and isolates the text within that range from text outside it. The base direction set is determined by that of the first strong directional character in the range. The isolation prevents unintended spill-over effects when the text is reordered by the Unicode Bidirectional Algorithm. The range is terminated by 2069 (PDI).

`, // PDI '\u{2069}': `

Sets the end point for a range of inline text when applying a base direction. The range is started with either 2066 (LRI), 2067 (RLI) or 2068 (FSI).

This character should be used rather than 202C (PDF).

`, }