Updated 22 September, 2021
This page brings together basic information about the Arabic script and its use for the Hausa language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Hausa using Unicode.
See the Arabic script summary for most of the information about how the Arabic script works, and the orthography used for the Arabic language. This page aims to provide Hausa-specific information.
Phonetic transcriptions on this page should be treated as an approximate guide only. Many are more phonemic than phonetic, and there may be variations depending on the source of the transcription.
رَایُوَا بَبَّنْ رَبُو نَا | غُنْ مَسَاٻِى دُونْ شِدَیْنَا | تَرْسَشِنْ أیْكِى نَ ٻَرْنَا | فَیْ دَ ٻُویٜ سِڟَیْدَ سُنَّا | شِبِ أللَّهْ بَادَكَنْغَرَا بَا
Hausa can be written in the Latin script, but also (less commonly) using the Arabic ajami script. Use of ajami tends to be restricted to Muslim contexts.
There is a good deal of variation in the orthography for Hausa ajami, and no official standardisation. It should be borne in mind that while this page adopts a particular set of characters based on the Warsh variants as most representative of the orthography, and describes alternative characters under the label of 'infrequent', this is not necessarily representative of the orthography used in certain regions or contexts, especially outside the area around northern Nigeria.
For information about the script in general, see the Arabic overview. [u]
Hausa has also been written in ajami since at least the early 17th century. [whl]
There is no standard system of using ajami for Hausa, and different writers may use letters with different values. [whl]
There are or have been a number of variant practices for writing Hausa ajami. There are also some confusable characters. They include the following:
The Arabic script is an abjad. This means that in normal use the script represents only consonant and long vowel sounds. However, since Hausa ajami normally shows all the vowel diacritics, it actually functions as an alphabet. See the table to the right for a brief overview of features for Hausa using the Arabic script.
The following list describes some distinctive characteristics of the Hausa ajami orthography.
Unlike Standard Arabic, ajami tends to add all vowel diacritics to text. Unlike Semitic languages, where words are built on letter patterns, it can be very difficult to read Hausa text without the full vowel information.
Hausa also has more vowel sounds than Arabic, so some additional conventions are necessary to cover those. Mostly these adaptations follow the North African, Maghribi approach.
Hausa uses two principal types of writing: Ḥafṣ orthography uses characters that look and behave more like Standard Arabic, whereas the Warš orthography changes the shape of some letters, and drops the dots associated with others in certain positions.
The Warš orthography is typically written using a particularly African font style called Kano.
These are sounds for the Hausa language.
Phones in a lighter colour are non-native or allophones.
kʲ kʲʼ gʲ
|s z||ʃ ʒ||h|
Hausa has 3 syllable types: CV, CVV, and CVC, where VV can be a long vowel or a diphthong. [bc] The distinction between long and short vowels is phonemic; however, when a syllable with a long vowel acquires a final consonant, the vowel is shortened.
Consonant clusters may occur where syllables are side by side, but not within a syllable. Gemination is, however, a distinctive feature. [bc]
Semivowels ʷ and ʲ may occur after an initial consonant.
For a mapping of sounds to graphemes see vowel_mappings.
The set of characters needed to represent the Hausa vowels is the following.
Unlike Standard Arabic, all short vowel diacritics are usually written in Hausa ajami.
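As a rough way to see how fully vowelled a piece of text is, the sketch below counts the share of Arabic-script letters that are directly followed by a combining mark. The function name `mark_coverage` and the block ranges are assumptions made for this illustration, not part of any standard tool. Letters that write a long vowel often carry no mark of their own, so values a little under 1.0 are to be expected even in fully vowelled text.

```python
import unicodedata

# Assumed block ranges: Arabic, Arabic Supplement, Arabic Extended-B and -A.
ARABIC_RANGES = ((0x0600, 0x06FF), (0x0750, 0x077F), (0x0870, 0x08FF))


def is_arabic_letter(ch: str) -> bool:
    """True for a letter (general category Lo) inside the assumed Arabic ranges."""
    cp = ord(ch)
    in_range = any(lo <= cp <= hi for lo, hi in ARABIC_RANGES)
    return in_range and unicodedata.category(ch) == "Lo"


def mark_coverage(text: str) -> float:
    """Share of Arabic letters immediately followed by a nonspacing mark (Mn)."""
    letters = 0
    marked = 0
    for i, ch in enumerate(text):
        if not is_arabic_letter(ch):
            continue
        letters += 1
        if i + 1 < len(text) and unicodedata.category(text[i + 1]) == "Mn":
            marked += 1
    return marked / letters if letters else 0.0


# teh+fatha, reh+sukun, seen+fatha, sheen+kasra, noon+sukun
vowelled = "\u062A\u064E\u0631\u0652\u0633\u064E\u0634\u0650\u0646\u0652"
# The same five letters with no marks at all.
bare = "\u062A\u0631\u0633\u0634\u0646"
```

Characters newer than the Unicode data shipped with the Python version in use are reported as unassigned and are skipped by this check.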
Diphthongs ending with i follow the initial vowel diacritic with ی [U+06CC ARABIC LETTER FARSI YEH] rather than ى [U+0649 ARABIC LETTER ALEF MAKSURA]. Two dots below are visible in medial position, eg. شِدَیْنَا ʃiday͓naā, but not at the end of a word, eg. فَیْ fay͓.
Hausa uses ْ [U+0652 ARABIC SUKUN] to mark the absence of a vowel after a consonant, eg. تَرْسَشِنْ tar͓saʃin͓.
Vowel absence is usually marked (unlike Standard Arabic), including over the YEH or WAW that signal the final part of a diphthong.
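The diphthong pattern described above (a vowel diacritic, then FARSI YEH or WAW, then SUKUN) can be searched for with a simple regular expression. This is a minimal sketch: the name `find_diphthong_tails` is hypothetical, and limiting the vowel marks to fatha, damma and kasra is a simplifying assumption that ignores the additional Hausa vowel signs.

```python
import re

FARSI_YEH = "\u06CC"
WAW = "\u0648"
SUKUN = "\u0652"
VOWEL_MARKS = "\u064E\u064F\u0650"  # fatha, damma, kasra (assumed subset)

# vowel mark + (FARSI YEH or WAW) + SUKUN
DIPHTHONG_TAIL = re.compile(f"[{VOWEL_MARKS}][{FARSI_YEH}{WAW}]{SUKUN}")


def find_diphthong_tails(text: str) -> list:
    """Return the index of the vowel mark that starts each diphthong pattern."""
    return [m.start() for m in DIPHTHONG_TAIL.finditer(text)]


# sheen kasra dal fatha FARSI-YEH sukun noon fatha alef
with_farsi_yeh = "\u0634\u0650\u062F\u064E\u06CC\u0652\u0646\u064E\u0627"
# Same word typed with ALEF MAKSURA (U+0649) instead of FARSI YEH.
with_alef_maksura = with_farsi_yeh.replace(FARSI_YEH, "\u0649")
```

Text typed with U+0649 in place of U+06CC does not match, which makes the same pattern usable as a quick check for that substitution.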
The following tables show how the above vowel sounds map to common characters or sequences of characters in vowelled text. Entries are split to show initial (i), medial (m), and final (f) forms.
Per the rules for syllable structure in Hausa, vowels are always preceded by a consonant, and where no consonant is written before a vowel in the Boko orthography that consonant is an unwritten glottal stop.
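The unwritten glottal stop can be made explicit when preparing Boko text for comparison with ajami. The sketch below only handles the word-initial case, and both the function name `with_initial_glottal` and the five-letter vowel set are assumptions made for illustration.

```python
BOKO_VOWELS = set("aeiou")  # assumed set of plain Boko vowel letters


def with_initial_glottal(word: str) -> str:
    """Prefix a glottal stop when a Boko word begins with a vowel letter."""
    if word and word[0].lower() in BOKO_VOWELS:
        return "\u0294" + word  # U+0294 LATIN LETTER GLOTTAL STOP
    return word
```

Words that already begin with a consonant letter are returned unchanged.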
Observation: It appears to be very unusual for sounds other than a or i to appear at the start of a word.
Observation: It is very difficult to find information in the sources consulted, but my conclusion is that what would be an initial form of a vowel letter in Standard Arabic is normally written in Hausa by combining the usual vowel diacritic with a carrier, such as أ [U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE] or ع [U+0639 ARABIC LETTER AIN]. Where I don't have other information, these 'initial' forms are shown using AIN in the table.
◌ِ [U+0650 ARABIC KASRA]
◌ِ [U+0650 ARABIC KASRA]
◌ُ [U+064F ARABIC DAMMA]
◌ُ [U+064F ARABIC DAMMA]
◌ُ [U+064F ARABIC DAMMA] (same as u)
◌ُ [U+064F ARABIC DAMMA] (same as u)
For a mapping of sounds to graphemes see consonant_mappings.
These characters are a basic set used for the Warsh orthography.
Three consonant sounds in syllable initial position can be followed by ʷ or ʲ. They depend on an initial base consonant with a 3-dot diacritic, which may or may not be followed by و [U+0648 ARABIC LETTER WAW] or ی [U+06CC ARABIC LETTER FARSI YEH].
One base character was encoded in Unicode 4.1: ݣ [U+0763 ARABIC LETTER KEHEH WITH THREE DOTS ABOVE], used for combinations with the sound k. The other two were encoded in Unicode 13.0. They are ࣃ [U+08C3 ARABIC LETTER GHAIN WITH THREE DOTS ABOVE] for ɡʷ/ɡʲ and ࣄ [U+08C4 ARABIC LETTER AFRICAN QAF WITH THREE DOTS ABOVE] for ƙʷ/ƙʲ. (Take care not to confuse these with ڠ [U+06A0 ARABIC LETTER AIN WITH THREE DOTS ABOVE] and ڨ [U+06A8 ARABIC LETTER QAF WITH THREE DOTS ABOVE], neither of which is used for Hausa.)
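The look-alike warning above lends itself to a simple check on Hausa text. The following sketch reports each occurrence of U+06A0 or U+06A8 together with the character it is easily confused with. The function name `find_lookalikes` is hypothetical, and treating the paired code point as the intended one is an assumption; a reviewer should confirm each case.

```python
# Look-alike -> character described above for Hausa.
LOOKALIKES = {
    "\u06A0": "\u08C3",  # AIN WITH THREE DOTS ABOVE -> GHAIN WITH THREE DOTS ABOVE
    "\u06A8": "\u08C4",  # QAF WITH THREE DOTS ABOVE -> AFRICAN QAF WITH THREE DOTS ABOVE
}


def find_lookalikes(text: str) -> list:
    """Return (index, found code point, suggested code point) for each look-alike."""
    return [
        (i, f"U+{ord(ch):04X}", f"U+{ord(LOOKALIKES[ch]):04X}")
        for i, ch in enumerate(text)
        if ch in LOOKALIKES
    ]
```

The table is written with escapes rather than literal characters so that it does not depend on the font or the Unicode version of the editing environment.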
There is little information available about how these characters are used, and some ambiguity in what there is.
Warren-Rothlin [aww] says the following about these characters.
The labialized and palatalized velars /ɡʷ/ and /ɡʲ/, /kʷ/ and /kʲ/, and /ƙʷ/ and /ƙʲ/ are usually not written, e.g. کْي ⟨k⁰y⟩ and کْو⟨k⁰w⟩, as one might expect, but کِي ⟨kiy⟩ or کُو ⟨kuw⟩, and even with the following vowel sound intervening (e.g. کَو⟨kaw⟩ for /kwa/). As noted above for other distinctive Hausa sounds, three dots usually smaller than standard nuqaṭ may be added above for labialization and below for palatalization (e.g. ⟨k₃aw⁰taʾ⟩ kyauta).
Rather than provide characters with triple dots above and others with triple dots below, Unicode is standardising on above.
Looking at the samples in the Unicode proposal [lpp], there seem to be two different forms for each. It isn't clearly indicated (especially since the Boko transcription doesn't indicate vowel length), but I suspect they may reflect the difference between long and short vowels. Here are some examples. Compare the top and bottom items for each bullet.
Universität Wien's document also shows it being used alone, eg. ݣَاشٜىٰ
The following are additional characters that may be used to write Hausa ajami, including some used for the Hafs orthography, and others used in borrowed words, or in text written by speakers who don't make the phonemic distinctions in the table above.
In addition, the following letters may be used for glottalised sounds as well as normal sounds.
A typical feature of the Warsh orthography is that a character has dots in initial or medial positions, but none in final or isolate. Another is that the dots appear on the other side of the base in some characters from the side they would appear in the Hafs orthography. These differences are represented in Unicode by the use of different characters. They include the following.
The other two characters have a triple-dot addition which is associated with glottalised consonants in the Warš orthography. (They don't appear to have glyphs in the webfont used.)
Geminated consonants are indicated using ّ [U+0651 ARABIC SHADDA], eg. بَبَّنْ
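To locate geminated consonants in a string, one approach is to gather the combining marks that follow each base character and look for SHADDA among them. The sketch below does this with `unicodedata.combining`, so it works whether the shadda is stored before or after the vowel mark. The function name `geminated_letters` is an assumption.

```python
import unicodedata

SHADDA = "\u0651"


def geminated_letters(text: str) -> list:
    """Return (index, base character) for every base that carries a shadda."""
    result = []
    i = 0
    while i < len(text):
        if unicodedata.combining(text[i]) == 0:
            # Collect the run of combining marks attached to this base.
            j = i + 1
            while j < len(text) and unicodedata.combining(text[j]) != 0:
                j += 1
            if SHADDA in text[i + 1:j]:
                result.append((i, text[i]))
            i = j
        else:
            i += 1  # stray mark with no preceding base
    return result


# beh fatha beh fatha shadda noon sukun
sample = "\u0628\u064E\u0628\u064E\u0651\u0646\u0652"
```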
The following maps the above sounds to graphemes.
There is no official standard for how to write African languages in ajami, and there has been a good deal of variation over the history of the writing. [dbs] In addition, dialects of Hausa have different phonemic repertoires, which are reflected in their writing. So there is some variation as to which characters are mapped to which sounds, and the sets described here are a synthesis of sources describing modern usage.
The typical orthography is based on Warsh (Warš) forms, which incorporate Maghribi characteristics, and are often written with Kano style glyphs (as here). Some sources describe an alternative Hafs (Ḥafṣ) orthography, used with hand-written adaptations for the newspaper Al-Fijir.
Additional alternative shapes also occur, typically used for borrowed words, or because sounds are not differentiated in some regions. These are preceded by an asterisk in the table. (Warren-Rothlin [aww] lists a handful of other, less commonly attested shapes, but they are not listed here.)
In some cases the triple dot (known as wagaf) may be written by some below the base and by others above the base, but Unicode is standardising on glyphs that show it above.
Most sources associate this sound with ٻ [U+067B ARABIC LETTER BEEH] for the Warsh orthography, but Evans & Warren-Rothlin [lpp] list that character as Hafs, and show ݑ [U+0751 ARABIC LETTER BEH WITH DOT BELOW AND THREE DOTS ABOVE] as the Warsh variant. Bondarev [dbs] says that it is written as پ [U+067E ARABIC LETTER PEH] in modern text. One of the 'alternative' shapes used for this sound is ب [U+0628 ARABIC LETTER BEH].
Evans & Warren-Rothlin [lpp] associate this sound with ࢼ [U+08BC ARABIC LETTER AFRICAN QAF] for the Warsh variant, as do others, but Warren-Rothlin [aww] lists what appears to be ڧ [U+06A7 ARABIC LETTER QAF WITH DOT ABOVE] for this sound (although it could be an incorrect attribution, given that the former has a dot over initial/medial forms).
The Warsh orthography uses ࢻ [U+08BB ARABIC LETTER AFRICAN FEH] for this sound, and the Hafs uses ف [U+0641 ARABIC LETTER FEH]. Sometimes, پ [U+067E ARABIC LETTER PEH] is used as one of the 'alternative' shapes. Warren-Rothlin [aww] also lists what appears to be ڢ [U+06A2 ARABIC LETTER FEH WITH DOT MOVED BELOW] for this sound, although it could again be an incorrect attribution, given that ࢻ [U+08BB ARABIC LETTER AFRICAN FEH] has a dot below initial/medial forms.
ج [U+062C ARABIC LETTER JEEM] (same as d͡ʒ)
Warren-Rothlin [aww] indicates that this uses ۑ [U+06D1 ARABIC LETTER YEH WITH THREE DOTS BELOW] for the Warsh orthography, rather than the ؿ [U+063F ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE] indicated by Evans & Warren-Rothlin [lpp]. The IPA notation for this sound is somewhat ambiguous, including ƒ, ʔʲ, and j̰. I settled for the last of these, though not for any convincing reason.
Sources: Wikipedia, and Google Translate.
The Arabic script uses a large number of Unicode characters that affect the way that other characters are rendered. Many of those have no visible form of their own.
Modern Arabic-script text makes use of a relatively large set of invisible formatting characters, especially in plain text, many of which are used to manage text direction. For more details, see the Arabic overview.
Need to confirm whether Hausa uses the following digit forms.
Not clear whether Hausa uses ٫ [U+066B ARABIC DECIMAL SEPARATOR] and ٬ [U+066C ARABIC THOUSANDS SEPARATOR].
Text is normally written horizontally, right to left; however, numbers and non-Arabic script text run left to right.
See the Arabic overview for more details, especially related to sequences of items and numbers.
bidi_class properties for characters in the Hausa orthography described here.
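The bidi classes of individual characters can be inspected with Python's standard `unicodedata` module. This is a minimal sketch for exploration only (the helper name `bidi_classes` is an assumption), and it does not implement the Unicode Bidirectional Algorithm itself.

```python
import unicodedata


def bidi_classes(text: str) -> list:
    """Return (code point, bidi class) for each character in the text."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


# beh, fatha, European digit one, Arabic-Indic digit one
sample = "\u0628\u064E1\u0661"
```

For this sample the letter is reported as `AL` (Arabic letter), the vowel mark as `NSM` (nonspacing mark), the European digit as `EN`, and the Arabic-Indic digit as `AN`.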
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
You can experiment with examples using the Hausa ajami character app.
The orthography has no case distinction, and no special transforms are needed to convert between characters.
See the Arabic overview for more details.
The Kano writing style is a common way of writing Hausa in the ajami script, especially in Northern Nigeria. Like other West African writing, it is based on Warsh (Warš) forms, which incorporate Maghribi characteristics. Text written in the Kano style will include glyphs for a number of African characters that may not be available in the average naskh font.
Another orthography, which looks much closer to naskh, is used with hand-written adaptations for the newspaper Al-Fijir, and is based on the Hafs orthography. When writing in that orthography you need to use different code points from those used for the Kano style.
Words are separated by spaces.
. [U+002E FULL STOP]
|initial||» [U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK]|
|nested||› [U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK]|
See the Arabic overview.
Show (default) line-breaking properties for characters in the Hausa orthography described here.
This section is for any features that are specific to this script and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.
According to ScriptSource, the Arabic script is used for the following languages: