Hausa (draft)
Arabic

Updated 13 November, 2022

This page brings together basic information about the Arabic script and its use for the Hausa language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Hausa using Unicode.

See the Arabic script summary for most of the information about how the Arabic script works, and the orthography used for the Arabic language. This page aims to provide Hausa-specific information.

Sample


رَایُوَا بَبَّنْ رَبُو نَا | غُنْ مَسَاٻِى دُونْ شِدَیْنَا | تَرْسَشِنْ أیْكِى نَ ٻَرْنَا | فَیْ دَ ٻُویٜ سِڟَیْدَ سُنَّا | شِبِ أللَّهْ بَادَكَنْغَرَا بَا

Usage & history

Hausa can be written in the Latin script, but also (less commonly) using the Arabic ajami script. Use of ajami tends to be restricted to Muslim contexts.

There is a good deal of variation in the orthography for Hausa ajami, and no official standardisation. It should be borne in mind that while this page adopts a particular set of characters based on the Warsh variants as most representative of the orthography, and describes alternative characters under the label of 'infrequent', this is not necessarily representative of the orthography used in certain regions or contexts, especially outside the area around northern Nigeria.

For information about the script in general, see the Arabic overview [u].

Orthographic development & variants

Hausa has been written in ajami since at least the early 17th century [whl].

There is no standard system of using ajami for Hausa, and different writers may use letters with different values [whl].

There are or have been a number of variant practices for writing Hausa ajami. There are also some confusable characters. They include the following:

Basic features

The Arabic script is an abjad. This means that in normal use the script represents only consonant and long vowel sounds. However, since Hausa ajami normally shows all the vowel diacritics, it actually functions as an alphabet. See the table to the right for a brief overview of features for Hausa using the Arabic script.

The following list describes some distinctive characteristics of the Hausa ajami orthography.

Character index

Letters


Basic consonants

ب␣ٻ␣ت␣د␣ط␣ث␣ج␣ک␣ࢼ␣غ␣ع␣ࢻ␣س␣ڟ␣ز␣ش␣ح␣م␣ࢽ␣و␣ر␣ل␣ی␣ۑ␣ݣ␣ࣃ␣ࣄ

Extended consonants

ك␣ݑ␣ق␣ف␣پ␣ص␣ذ␣ظ␣ه␣ن␣ض␣ؿ

Vowels

أ␣إ␣ا␣و␣ى␣ی

Combining marks


Vowels

َ␣ُ␣ِ␣ْ␣ٰ␣ٕ␣ٜ␣ٔ

Other

ّ

Punctuation

،␣؟␣«␣»␣‹␣›

ASCII

.␣!␣(␣)

Structure

Hausa has 3 syllable types: CV, CVV, and CVC, where VV can be a long vowel or a diphthong [bc]. The distinction between long and short vowels is phonemic; however, when a syllable with a long vowel acquires a final consonant, the vowel is shortened.

Consonant clusters may occur where syllables are side by side, but not within a syllable. Gemination is, however, a distinctive feature [bc].

Semivowels ʷ and ʲ may occur after an initial consonant.

Phonology

These are sounds for the Hausa language.


Phones in a lighter colour are non-native or allophones.

Vowel sounds

Plain vowels

i u e o a

Diphthongs

iu ui ai au

Consonant sounds

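Places of articulation (table columns): labial, dental, alveolar, post-alveolar, retroflex, palatal, velar, glottal

Sounds by manner of articulation, listed in column order:

stop: b ɓ t d ɗ k ɡ ɡʷ kʷʼ kʲʼ ʔ
affricate: t͡sʼ t͡ʃ d͡ʒ
fricative: f s z ʃ ʒ h
nasal: m n
approximant: w l j
trill/flap: ɾ ɽ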

Vowels

Vowel sounds to characters

Tables in this section show how Hausa vowel sounds commonly map to characters or sequences of characters in the Arabic orthography. i indicates word-initial, m medial, and f final forms.

Plain vowels

Per the rules for syllable structure in Hausa, vowels are always preceded by a consonant, and where no consonant is written before a vowel in the Boko orthography that consonant is an unwritten glottal stop.

Observation: It appears to be very unusual for sounds other than a or i to appear at the start of a word.

Observation: It is very difficult to find information in the sources consulted, but my conclusion is that what would be an initial form of a vowel letter in Standard Arabic is normally written in Hausa by combining the usual vowel diacritic with a carrier, such as أ [U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE] or ع [U+0639 ARABIC LETTER AIN]. Where I don't have other information, these 'initial' forms are shown using AIN in the table.

Diphthongs

Vowel characters

The set of characters needed to represent the Hausa vowels is the following.

أ␣إ␣ا␣و␣ى␣َ␣ُ␣ِ␣ْ␣ٕ␣ٔ␣ٜ␣ٰ␣ی

Unlike Standard Arabic, all short vowel diacritics are usually written in Hausa ajami.

Diphthongs ending with i follow the initial vowel diacritic with ی [U+06CC ARABIC LETTER FARSI YEH] rather than ى [U+0649 ARABIC LETTER ALEF MAKSURA]. Two dots below are visible in medial position, eg. شِدَیْنَا ʃiday͓naā, but not at the end of a word, eg. فَیْ fay͓.

Vowel absence

ْ

Hausa uses ْ [U+0652 ARABIC SUKUN] to kill the vowel after a consonant, eg. تَرْسَشِنْ tar͓saʃin͓

Vowel absence is usually marked (unlike Standard Arabic), including over the YEH or WAW that signal the final part of a diphthong.
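The sukun convention lends itself to a simple mechanical check. The following is a minimal Python sketch, not a definitive tool: it walks a word and reports which base letters carry U+0652. The function name `letters_with_sukun` is hypothetical, and the `EXAMPLE` string is an assumed code point transcription of the example word above (TEH, REH, SEEN, SHEEN, NOON with their marks), written with escapes so the sequence is unambiguous.

```python
import unicodedata

SUKUN = "\u0652"  # ARABIC SUKUN

def letters_with_sukun(word):
    """Return the base letters in `word` that are followed by a sukun."""
    found = []
    base = None
    for ch in word:
        if unicodedata.combining(ch) == 0:
            base = ch  # a non-combining character starts a new cluster
        elif ch == SUKUN and base is not None:
            found.append(base)
    return found

# Assumed transcription of the example word:
# TEH+FATHA, REH+SUKUN, SEEN+FATHA, SHEEN+KASRA, NOON+SUKUN
EXAMPLE = "\u062A\u064E\u0631\u0652\u0633\u064E\u0634\u0650\u0646\u0652"

if __name__ == "__main__":
    for letter in letters_with_sukun(EXAMPLE):
        print(f"U+{ord(letter):04X}", unicodedata.name(letter))
```

Under that assumption, the letters reported are REH and NOON, which matches the two vowelless consonants in the transcription tar͓saʃin͓.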

Consonants

Consonant sounds to characters

Tables in this section show how Hausa consonant sounds commonly map to characters or sequences of characters in the Arabic orthography.

There is no official standard for how to write African languages in ajami, and there has been a good deal of variation over the history of the writing [dbs]. In addition, dialects of Hausa have different phonemic repertoires, which are reflected in their writing. So there is some variation as to which characters are mapped to which sounds, and the sets described here are a synthesis of sources describing modern usage.

The typical orthography is based on Warsh (Warš) forms, which incorporate Maghribi characteristics, and are often written with Kano style glyphs (as here). Some sources describe an alternative Hafs (Ḥafṣ) orthography, used with hand-written adaptations for the newspaper Al-Fijir.

Additional alternative shapes also occur, typically used for borrowed words, or because sounds are not differentiated in some regions. These are preceded by an asterisk in the table. (Warren-Rothlin [aww] lists a handful of other, less commonly attested shapes, but they are not listed here.)

In some cases the triple dot (known as wagaf) may be written by some below the base and by others above the base, but Unicode is standardising on glyphs that show it above.

ɓ

Most sources associate this sound with ٻ [U+067B ARABIC LETTER BEEH] for the Warsh orthography, but Evans & Warren-Rothlin [lpp] list that character as Hafs, and show ݑ [U+0751 ARABIC LETTER BEH WITH DOT BELOW AND THREE DOTS ABOVE] as the Warsh variant. Bondarev [dbs] says that it is written as پ [U+067E ARABIC LETTER PEH] in modern text. One of the 'alternate' shapes used for this sound is ب [U+0628 ARABIC LETTER BEH].

ɗ

Typically written with ط [U+0637 ARABIC LETTER TAH], this is sometimes written using د [U+062F ARABIC LETTER DAL].

k

ك [U+0643 ARABIC LETTER KAF] and ک [U+06A9 ARABIC LETTER KEHEH] look the same in the Kano webfont used for this page, but represent different underlying characters. In a non-Kano font, the difference is in the shape of the final position glyph, ـك vs. ـک, respectively.
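Because the two letters can look identical on screen, one way to tell them apart is to inspect the code points directly. The sketch below is illustrative only; the helper name `k_letters` is hypothetical, and it relies on nothing more than Python's standard `unicodedata` module.

```python
import unicodedata

KAF = "\u0643"    # ARABIC LETTER KAF
KEHEH = "\u06A9"  # ARABIC LETTER KEHEH

def k_letters(text):
    """List (index, code point, character name) for each KAF or KEHEH in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in (KAF, KEHEH)
    ]

if __name__ == "__main__":
    # KAF + FATHA + KEHEH: visually similar letters, different code points
    for hit in k_letters("\u0643\u064E\u06A9"):
        print(hit)
```

A report like this makes it easy to see whether a text mixes the two characters, which matters for searching and sorting even when the font hides the difference.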

Evans & Warren-Rothlin [lpp] associate the sound ƙ with ࢼ [U+08BC ARABIC LETTER AFRICAN QAF] for the Warsh variant, as do others, but Warren-Rothlin [aww] lists what appears to be ڧ [U+06A7 ARABIC LETTER QAF WITH DOT ABOVE] for this sound (although it could be an incorrect attribution, given that the former has a dot over initial/medial forms).


f ɸ

The Warsh orthography uses ࢻ [U+08BB ARABIC LETTER AFRICAN FEH] for this sound, and the Hafs uses ف [U+0641 ARABIC LETTER FEH]. Sometimes, پ [U+067E ARABIC LETTER PEH] is used as one of the 'alternative' shapes. Warren-Rothlin [aww] also lists what appears to be ڢ [U+06A2 ARABIC LETTER FEH WITH DOT MOVED BELOW] for this sound, although it could again be an incorrect attribution, given that ࢻ [U+08BB ARABIC LETTER AFRICAN FEH] has a dot below initial/medial forms.

s

Normally, this would be written using س [U+0633 ARABIC LETTER SEEN], but ص [U+0635 ARABIC LETTER SAD] is also used, mainly in Arabic loan words [aww].

z

Normally written using ز [U+0632 ARABIC LETTER ZAIN]; however, there are 2 'alternate' letters, ذ [U+0630 ARABIC LETTER THAL] and ظ [U+0638 ARABIC LETTER ZAH].

ʒ

ج [U+062C ARABIC LETTER JEEM] (same as d͡ʒ)

h

The usual form is ح [U+062D ARABIC LETTER HAH]. For Quranic names, ه [U+0647 ARABIC LETTER HEH] is generally used, but both can sometimes also be used interchangeably, eg. حَوْسَا or هَوْسَا [aww].

n

The Warsh form is ࢽ [U+08BD ARABIC LETTER AFRICAN NOON] and Hafs is ن [U+0646 ARABIC LETTER NOON]. Warren-Rothlin [aww], however, indicates what appears to be ن [U+0646 ARABIC LETTER NOON] rather than the ࢽ [U+08BD ARABIC LETTER AFRICAN NOON] given by Evans & Warren-Rothlin [lpp].

l

ل [U+0644 ARABIC LETTER LAM] in the normal orthography; however, an 'alternate' form sometimes used is ض [U+0636 ARABIC LETTER DAD].

Warren-Rothlin [aww] indicates that this uses ۑ [U+06D1 ARABIC LETTER YEH WITH THREE DOTS BELOW] for the Warsh orthography, rather than the ؿ [U+063F ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE] indicated by Evans & Warren-Rothlin [lpp]. The IPA notation for this sound is somewhat ambiguous, including ƒ, ʔʲ, and . I settled for the last of these, though not for any convincing reason.

Sources: Wikipedia, and Google Translate.

Basic set (Warsh orthography)

These characters are a basic set used for the Warsh orthography.

Stops & affricates

ب␣ٻ␣ت␣د␣ط␣ث␣ج␣ک␣ࢼ␣غ␣ع

Fricatives

ࢻ␣س␣ڟ␣ز␣ش␣ح

Nasals

م␣ࢽ

Other sonorants

و␣ر␣ل␣ی␣ۑ

Labialised & palatalised consonants

ݣ␣ࣃ␣ࣄ

Three consonant sounds in syllable initial position can be followed by ʷ or ʲ. They depend on an initial base consonant with a 3-dot diacritic, which may or may not be followed by و [U+0648 ARABIC LETTER WAW] or ی [U+06CC ARABIC LETTER FARSI YEH].

One base character was encoded in Unicode 4.1: ݣ [U+0763 ARABIC LETTER KEHEH WITH THREE DOTS ABOVE], used for combinations with the sound k. The other two were encoded in Unicode 13: ࣃ [U+08C3 ARABIC LETTER GHAIN WITH THREE DOTS ABOVE] for ɡʷ/ɡʲ and ࣄ [U+08C4 ARABIC LETTER AFRICAN QAF WITH THREE DOTS ABOVE] for ƙʷ/ƙʲ. (Take care not to confuse these with ڠ [U+06A0 ARABIC LETTER AIN WITH THREE DOTS ABOVE] and ڨ [U+06A8 ARABIC LETTER QAF WITH THREE DOTS ABOVE], neither of which is used for Hausa.)
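A short script can flag the look-alike characters mentioned above. This is a sketch under assumptions: the pairing of each look-alike with a suggested Hausa character is inferred here from the order in which the text lists them, and the names `LOOKALIKES` and `find_lookalikes` are hypothetical. Any suggested replacement should be reviewed by someone who reads the text.

```python
# Look-alikes named in the text, mapped to the character assumed to be intended.
LOOKALIKES = {
    "\u06A0": "\u08C3",  # AIN WITH THREE DOTS ABOVE -> GHAIN WITH THREE DOTS ABOVE
    "\u06A8": "\u08C4",  # QAF WITH THREE DOTS ABOVE -> AFRICAN QAF WITH THREE DOTS ABOVE
}

def find_lookalikes(text):
    """Return (index, found code point, suggested code point) for each look-alike."""
    return [
        (i, f"U+{ord(ch):04X}", f"U+{ord(LOOKALIKES[ch]):04X}")
        for i, ch in enumerate(text)
        if ch in LOOKALIKES
    ]

if __name__ == "__main__":
    # QAF WITH THREE DOTS ABOVE followed by FATHA
    print(find_lookalikes("\u06A8\u064E"))
```

The check compares code points only, so it works even with fonts or library versions that have no data for the Unicode 13 characters.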

There is little information available about how these characters are used, and some ambiguity in what there is.

Warren-Rothlin [aww] says the following about these characters.

The labialized and palatalized velars /ɡʷ/ and /ɡʲ/, /kʷ/ and /kʲ/, and /ƙʷ/ and /ƙʲ/ are usually not written, e.g. کْي ⟨k⁰y⟩ and کْو⟨k⁰w⟩, as one might expect, but کِي ⟨kiy⟩ or کُو ⟨kuw⟩, and even with the following vowel sound intervening (e.g. کَو⟨kaw⟩ for /kwa/). As noted above for other distinctive Hausa sounds, three dots usually smaller than standard nuqaṭ may be added above for labialization and below for palatalization (e.g. ⟨k₃aw⁰taʾ⟩ kyauta).

Rather than provide characters with triple dots above and others with triple dots below, Unicode is standardising on above.

Looking at the samples in the Unicode proposal [lpp], there seem to be two different forms for each. It isn't clearly indicated (especially since the boko transcription doesn't indicate vowel length), but I find myself wondering whether they reflect the difference between long and short vowels. Here are some examples. Compare the top and bottom items for each bullet.

Universität Wien's document also shows it being used alone, eg. ݣَاشٜىٰ

Other consonants

The following are additional characters that may be used to write Hausa ajami, including some used for the Hafs orthography, and others used in borrowed words, or text written by speakers who don't make the phonemic distinctions in the table above.

ك␣ݑ␣ق␣ف␣پ␣ص␣ذ␣ظ␣ه␣ن␣ض␣ؿ

In addition, the following letters may be used for glottalised sounds as well as normal sounds.

ب␣د␣ك

Dot variants

A typical feature of the Warsh orthography is that a character has dots in initial or medial positions, but none in final or isolate. Another is that the dots appear on the other side of the base in some characters from the side they would appear in the Hafs orthography. These differences are represented in Unicode by the use of different characters. They include the following.

The other two characters have a triple-dot addition which is associated with glottalised consonants in the Warš orthography. (They don't appear to have glyphs in the webfont used.)

Consonant clusters & gemination

Geminated consonants are indicated using ّ   [U+0651 ARABIC SHADDA], eg. بَبَّنْ
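Since Hausa ajami writes a vowel diacritic as well as the shadda on a geminated consonant, the two marks may be typed in either order while looking the same on screen. The sketch below, which uses only Python's standard `unicodedata` module, shows how Unicode normalisation puts the marks in one canonical order so that string comparison behaves predictably. The function name `normalise_marks` and the two sample strings are illustrative assumptions, not taken from any source text.

```python
import unicodedata

def normalise_marks(text):
    """Return text in NFC, which places combining marks in canonical order."""
    return unicodedata.normalize("NFC", text)

# BEH + FATHA + SHADDA and BEH + SHADDA + FATHA render alike
# but are different code point sequences.
FATHA_FIRST = "\u0628\u064E\u0651"
SHADDA_FIRST = "\u0628\u0651\u064E"

if __name__ == "__main__":
    print(FATHA_FIRST == SHADDA_FIRST)                                    # raw comparison
    print(normalise_marks(FATHA_FIRST) == normalise_marks(SHADDA_FIRST))  # after NFC
```

Normalising text before searching or comparing avoids false mismatches caused purely by typing order.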

Formatting characters

The Arabic script uses a large number of Unicode characters that affect the way other characters are rendered, many of which have no visible form of their own. Modern Arabic-script text makes particular use of these invisible formatting characters in plain text, where many of them are used to manage text direction. For more details, see the Arabic overview.

Numbers, dates, currency, etc

Need to confirm whether Hausa uses the following digit forms.

۰␣۴␣۵␣۶␣۴␣۵␣۶␣۷␣۸␣۹

Not clear whether Hausa uses ٫ [U+066B ARABIC DECIMAL SEPARATOR] and ٬ [U+066C ARABIC THOUSANDS SEPARATOR].

Text direction

Text is normally written horizontally, right to left, however numbers and non-Arabic script text run left to right.

See the Arabic overview for more details, especially related to sequences of items and numbers.
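The directional behaviour described here follows from each character's Unicode Bidi_Class property. As a rough illustration, the following Python sketch lists that property for every character in a string using the standard `unicodedata` module; the helper name `bidi_classes` is hypothetical.

```python
import unicodedata

def bidi_classes(text):
    """Return (code point, Bidi_Class) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]

if __name__ == "__main__":
    # BEH, FATHA, space, ASCII digit, Latin letter
    for code_point, bidi_class in bidi_classes("\u0628\u064E 1a"):
        print(code_point, bidi_class)
```

Arabic letters report AL (right-to-left Arabic), combining marks NSM, ASCII digits EN, and Latin letters L, which is why numbers and Latin text run left to right inside a right-to-left line.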


Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Hausa ajami character app.

The orthography has no case distinction, and no special transforms are needed to convert between characters.

See the Arabic overview for more details.

Writing styles

The Kano writing style is a common way of writing Hausa in the ajami script, especially in northern Nigeria. Like other West African writing, it is based on Warsh (Warš) forms, which incorporate Maghribi characteristics. Text written in the Kano style will include glyphs for a number of African characters that may not be available in the average naskh font.

رَایُوَا بَبَّنْ رَبُو نَا | غُنْ مَسَاٻِى دُونْ شِدَیْنَا | تَرْسَشِنْ أیْكِى نَ ٻَرْنَا | فَیْ دَ ٻُویٜ سِڟَیْدَ سُنَّا | شِبِ أللَّهْ بَادَكَنْغَرَا بَا

Hausa text written in the Kano writing style.

رَایُوَا بَبَّنْ رَبُو نَا | غُنْ مَسَاٻِى دُونْ شِدَیْنَا | تَرْسَشِنْ أیْكِى نَ ٻَرْنَا | فَیْ دَ ٻُویٜ سِڟَیْدَ سُنَّا | شِبِ أللَّهْ بَادَكَنْغَرَا بَا

The same text, written in a standard naskh writing style.

Another orthography, based on Hafs, looks much closer to naskh and is used with hand-written adaptations for the newspaper Al-Fijir. Text in that orthography uses different code points from those used for the Kano style.
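To make the code point difference concrete, here is a deliberately partial Python sketch that maps Hafs letters to their Warsh counterparts. It covers only the two pairs this page states explicitly (FEH and NOON); the names `HAFS_TO_WARSH` and `hafs_to_warsh` are hypothetical, and a real conversion would need more pairs and human review, since the sources disagree on several letters.

```python
# Only the Hafs/Warsh pairs stated explicitly on this page; not a complete table.
HAFS_TO_WARSH = {
    0x0641: 0x08BB,  # FEH  -> AFRICAN FEH
    0x0646: 0x08BD,  # NOON -> AFRICAN NOON
}

def hafs_to_warsh(text):
    """Replace the listed Hafs letters with their Warsh code points."""
    return text.translate(HAFS_TO_WARSH)

if __name__ == "__main__":
    # FEH + FATHA + NOON + SUKUN
    converted = hafs_to_warsh("\u0641\u064E\u0646\u0652")
    print([f"U+{ord(ch):04X}" for ch in converted])
```

Diacritics and all other letters pass through unchanged, so the mapping can be extended one pair at a time as sources are confirmed.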

Font styling & weight

tbd

Observation: Panels of text in a Tamil newspaper that uses oblique fonts, but all the body text of the panel uses that font. Other fonts used for the body text in other articles tended to also have a slight lean, though not as much. The verticals in headings tend to be upright.

Graphemes

Grapheme clusters

tbd

Punctuation & inline features

Word boundaries

Words are separated by spaces.

Phrase & section boundaries

،␣.␣؟␣!

Hausa uses a mixture of ASCII and Arabic punctuation.

phrase: ، [U+060C ARABIC COMMA]

sentence: . [U+002E FULL STOP], ؟ [U+061F ARABIC QUESTION MARK], ! [U+0021 EXCLAMATION MARK]

Bracketed text

(␣)

Hausa commonly uses ASCII parentheses to insert parenthetical information into text.

standard: start ( [U+0028 LEFT PARENTHESIS], end ) [U+0029 RIGHT PARENTHESIS]

Quotations & citations

«␣»␣‹␣›

Hausa texts typically use guillemets around quotations, but some texts may use quotation marks instead. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks. Note, however, that the order of use is different from that in LTR text, because they are not automatically mirrored.

initial: start « [U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK], end » [U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK]

nested: start ‹ [U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK], end › [U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK]
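Whether a paired punctuation mark is mirrored automatically in right-to-left text depends on its Unicode Bidi_Mirrored property, which can be inspected directly. The following is a small Python sketch using the standard `unicodedata` module; the helper name `is_mirrored` is hypothetical.

```python
import unicodedata

def is_mirrored(ch):
    """True if the character has the Unicode Bidi_Mirrored property."""
    return bool(unicodedata.mirrored(ch))

if __name__ == "__main__":
    for ch in ["(", "\u00AB", "\u201C", '"']:
        print(f"U+{ord(ch):04X}", unicodedata.name(ch), is_mirrored(ch))
```

Parentheses and guillemets carry the property, while curly and ASCII quotation marks do not, so the latter keep the same glyph regardless of direction and their order of use has to be chosen by the writer.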

Emphasis

tbd

Abbreviation, ellipsis & repetition

tbd

Inline notes & annotations

tbd

Other punctuation

tbd

Other inline text decoration

tbd

Line & paragraph layout

Line breaking & hyphenation

tbd

See the Arabic overview.


Text alignment & justification

tbd

Text spacing

tbd

This section looks at ways in which spacing is applied between characters over and above that which is introduced during justification.

Baselines, line height, etc.

tbd

Hausa ajami uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Counters, lists, etc.

tbd

Styling initials

tbd

Page & book layout

This section is for any features that are specific to this script and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

References