Updated 5 January, 2023
This page brings together basic information about the Ethiopic script and its use for the Amharic language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Amharic using Unicode.
አንቀጽ፡፪፤ እያንዳንዱ፡ሰው፡የዘር፡የቀለም፡የጾታ፡የቋንቋ፡የሃይማኖት፡የፖለቲካ፡ወይም፡የሌላ፡ዓይነት፡አስተሳሰብ፡የብሔራዊ፡ወይም፡የኀብረተሰብ፡ታሪክ፡የሀብት፡የትውልድ፡ወይም፡የሌላ፡ደረጃ፡ልዩነት፡ሳይኖሩ፡በዚሁ፡ውሳኔ፡የተዘረዘሩት፡መብቶችንና፡ነጻነቶች፡ሁሉ፡እንዲከበሩለት፡ይገባል። ከዚህም፡በተቀረ፡አንድ፡ሰው፡ከሚኖርበት፡አገር፡ወይም፡ግዛት፡የፖለቲካ፡የአገዛዝ፡ወይም፡የኢንተርናሽናል፡አቋም፡የተነሳ፡አገሩ፡ነጻም፡ሆነ፡በሞግዚትነት፡አስተዳደር፡ወይም፡እራሱን፡ችሎ፡የማይተዳደር፡አገር፡ተወላጅ፡ቢሆንም፡በማንኛውም፡ዓይነት፡ገደብ፡ያለው፡አገዛዝ፡ሥር፡ቢሆንም፡ልዩነት፡አይፈጸምበትም።
The Ethiopic, or Geʽez, script is widely used for writing the Ethiopian and Eritrean Semitic languages such as Tigré, Amharic and Tigrinya. It is also used for Gurage, Me'en, and most other languages of Ethiopia. In Eritrea it is traditionally used for Blin, a Cushitic language. Some other languages in the Horn of Africa, such as Oromo, used to be written using Geʽez, but have migrated to Latin-based orthographies.
With 31 million mother-tongue speakers and more than 25 million second-language speakers, Amharic is the most widely spoken language in Ethiopia, and the second most spoken mother tongue (after Oromo). It serves as the official working language of the Ethiopian federal government, and of several of Ethiopia's federal regions.wam
ግዕዝ gəʿəzə gəʿəz Geʽez ፊደል
The Ethiopic (Geʽez) script was developed as the writing system of the Geʽez language, a Semitic language spoken in Ethiopia and Eritrea until the 10th to the 12th centuries. The Geʽez language is now only in liturgical use.
The basic consonant shapes come from the original Geʽez script, which was an abjad. The script became an abugida when small changes were added to those shapes to indicate the following vowel sound. Each complete syllable is now represented by a single syllabic character in the Unicode repertoire. The original Ethiopic script contained 182 characters, although the basic (unmarked) consonants number only 26. Script extensions for other languages have added many more symbols, and often represent phonological processes such as palatalization, pharyngealization and labialization.
According to ScriptSource,
the script is believed by many to have derived from the epigraphic South Arabian script, of Proto-Sinaitic heritage, although there is some dispute surrounding this assertion; some also believe it to have descended from Egyptian hieroglyphics. According to the tradition of the Ethiopian Orthodox Tewahedo Church, the script was divinely revealed to Enos, grandson of the first man, Adam.
Sources: Scriptsource, Wikipedia.
The Ethiopic script is a featural syllabary, ie. each symbol typically represents both a consonant and a vowel, but vowel components are indicated by largely standardised adaptations to the base consonant shape. See the table to the right for a brief overview of features for the modern Amharic orthography.
The Ethiopic script runs left to right in horizontal lines.
Modern Amharic generally uses spaces to separate words, but sometimes still uses the Ethiopic wordspace character, instead.
The Ethiopic script blocks in Unicode list over 453 characters. Amharic uses 282 syllable characters.
Gemination and consonant clusters are not indicated by the script (although some diacritics have been proposed for that, which are encoded in Unicode). Silent vowels are typically indicated using the 6th order -ə syllable, which creates some ambiguity.
The script is unicameral, and has only three combining characters (the diacritics just mentioned), which are rarely used. Characters don't interact, and the baseline is standard.
Ethiopic does have a range of native punctuation. In particular, although words in modern text are increasingly separated by spaces, they may be separated by a wordspace character instead.
Ethiopic also has its own numeric digits, which are used in an additive way, rather than in the way numbers are formed in Western text.
This section looks at the vowel and consonant sounds of Amharic.
Source Comrie. Phones in a lighter colour are non-native or allophones.
Much writing on Amharic has ɑ̈ and ə respectively for ə and ɨ. The sound ɨ only rarely occurs at the end of a word, and ə rarely at the beginning of a word. These letters are also often elided by adjacent vowel sounds.c
ɛ appears as a variant of e after h.c
|stop||p b||t d||k ɡ||ʔ|
|fricative||f v||s z||ʃ ʒ||h|
Note that t and d are produced with a dental articulation, ie t̪ and d̪.wa
p was introduced through loan words and is unaspirated (often confused with b).wa
The basic letter shapes come from the original Geʽez script, which was an abjad. The script became an abugida when small changes were added to those shapes to indicate the following vowel sound. Each CV syllable is now represented by a single character in the Unicode repertoire.
Vowels Each consonant can be followed by one of 7 vowel sounds. The original consonant shape is known as the 'first order', and the other shapes constitute incremental orders. The illustration below is based on the m consonant.
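As a rough illustration of the seven orders, the sketch below lists the forms for one consonant by stepping through consecutive code points. It assumes that the seven orders of a basic consonant occupy seven consecutive code points starting at the first-order character. That holds for the m series starting at U+1218, but it is an assumption rather than a rule for every row of the Ethiopic blocks. The function name `orders` is hypothetical.

```python
import unicodedata


def orders(first_order: str) -> list[str]:
    """Return the seven vowel orders for a first-order syllable.

    Assumption: the seven orders sit at consecutive code points,
    as they do for the m series (U+1218..U+121E).
    """
    base = ord(first_order)
    return [chr(base + offset) for offset in range(7)]


# Print order number, character, code point and Unicode name for the m series.
for number, syllable in enumerate(orders("\u1218"), start=1):
    print(number, syllable, f"U+{ord(syllable):04X}", unicodedata.name(syllable))
```

The Unicode names printed here use their own vowel labels, which do not map one-to-one onto the IPA transcriptions used in this page.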
The IPA symbols shown above are broad transcriptions and the actual phonetic pronunciation can vary somewhat, depending on the consonant, on stress, or on other factors. The IPA transcriptions for the vowels may also vary from publication to publication. In particular, the following are common:
|Symbol used here||Allophones||IPA used elsewhere||Typical non-IPA transcription|
The first-order sounds of syllables beginning with h or standalone vowel syllables are usually pronounced a, rather than ɜ.
Basic consonants The basic set of Ethiopic syllables comprises the following consonants. The pronunciation listed is for Amharic (which has lost the phonetic distinction between some characters).
See basicV for the last two items in the list.
Language-specific consonants Additional sets of consonants match the sounds in the various different languages that use the Ethiopic script. Amharic, Tigrinya, Tigre, and Blin each use a selection from the following set.
Amharic uses all of the above apart from the letters for xʼ and ŋ.
Glides Most consonants can be accompanied by the bilabial ʷa, but Amharic also has a set of common labiovelar consonants, which are followed by 5 of the vowel sounds. The list below shows the -ɜ form; the other vowels following labiovelar consonants include i, a, e, and ə.
Three consonants also have a ʲɛ ending:
Other characters in the Unicode block The remaining characters, largely including those in the extension blocks, are for writing the sounds of other languages, such as Me'en, Gurage, Gamo-Gofa-Dawro, Basketo, Gumuz, etc. The set of extended characters also includes combinations of the previous characters with an oa vowel sound.
The አ and ዐ series have lost their consonantal values and are vowel carriers in modern Amharic. Though sometimes the glottal stop ʔ is pronounced in word initial and medial positions, it is often dropped,wa eg. አየሩ
ኧ [U+12A7 ETHIOPIC SYLLABLE GLOTTAL WA] is irregular and is pronounced (ʔ)ä.wa
In the Geʽez language, ዐ [U+12D0 ETHIOPIC SYLLABLE PHARYNGEAL A] represents the sound ʕ, and አ [U+12A0 ETHIOPIC SYLLABLE GLOTTAL A] represents a glottal stop ʔ.
The following is a list of syllabic characters used for Amharic.
In a number of cases, alternative syllabic symbols are available for a given pronunciation. This is because they used to have different pronunciations in the Geʽez language but those differences have fallen away in Amharic. Amharic writing still preserves the old spelling, reflecting the origin of the word.wa
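Because several series share a pronunciation, applications that search or compare Amharic text sometimes fold the alternatives together before matching. The following is only a sketch of that idea. The list of series pairs is an assumption made for illustration (it is not taken from this page), the function name `fold_homophones` is hypothetical, and the folding should be applied only to a copy used for matching, since the original spelling carries etymological information.

```python
# Each pair is (first code point of source series, first code point of target
# series). Only the seven basic orders of each series are folded.
# ASSUMPTION: these series are treated as homophones in modern Amharic.
_FOLDS = [
    (0x1210, 0x1200),  # HHA series -> HA series
    (0x1280, 0x1200),  # XA series -> HA series
    (0x1220, 0x1230),  # SZA series -> SA series
    (0x12D0, 0x12A0),  # PHARYNGEAL A series -> GLOTTAL A series
    (0x1340, 0x1338),  # TZA series -> TSA series
]


def fold_homophones(text: str) -> str:
    """Map alternative syllable series onto one representative series."""
    folded = []
    for ch in text:
        cp = ord(ch)
        for source, target in _FOLDS:
            if source <= cp < source + 7:
                cp = target + (cp - source)  # keep the same vowel order
                break
        folded.append(chr(cp))
    return "".join(folded)


# Two spellings that differ only in the s series compare equal after folding.
print(fold_homophones("\u1225\u122d") == fold_homophones("\u1235\u122d"))
```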
The IPA shown is the standard form used for lexemes, but stress and context may replace the sounds shown with allophones.
The በ [U+1260 ETHIOPIC SYLLABLE BA] series is often β between vowels, rather than b.wa
The ቨ [U+1268 ETHIOPIC SYLLABLE VA] series may be pronounced β, rather than v.wa
The ረ [U+1228 ETHIOPIC SYLLABLE RA] series may be pronounced ɾ or r.wa
Many words end with a consonant followed by no vowel. These are written using the ə orthographic syllable, eg. ስም
However the syllable is ambiguous – in some cases the vowel could be pronounced, and there is no way to tell the difference, eg. the last 3 characters in the following word all use the ə syllable but the vowel is dropped for 2 of them.
The ə orthographic syllable is also used for clusters of consonants with no intervening vowels, eg. ኢትዮጵያ
Geminate consonants do occur in Amharic and other languages that use the Ethiopic script, and they can be important to distinguish one word from another. However, they are not marked in the script, eg.
The Ethiopic blocks have only 3 combining characters, but their use is rare.
The first is for vowel length, the second a gemination indicator (see silent), and the third a combination of both.
According to Wikipediawa, Ethiopian novelist Haddis Alemayehu, who was an advocate of Amharic orthography reform, indicated gemination in his novel Fǝqǝr Ǝskä Mäqabǝr by placing a dot above the characters whose consonants were geminated, but this practice is rare. Unicode provides ፟ [U+135F ETHIOPIC COMBINING GEMINATION MARK] for this, but sometimes ̎ [U+030E COMBINING DOUBLE VERTICAL LINE ABOVE] is used.
European digits are commonly used.
Ethiopic also has a native numbering system that is additive in nature.
You can generate Ethiopic numbers using the Counter styles converter app. Type in a number at the top and select ethiopic-numeric from the select box.
Note that it is common for there to be an unbroken line across the whole number at the top and bottom, although sometimes the lines remain broken.
Ordinal numbers are indicated in Amharic by following the cardinal number with ኛ [U+129B ETHIOPIC SYLLABLE NYAA], often superscripted. It can be applied to numbers using both Western and Ethiopic digits.e,#ethiopic_ordinal_notation
The Ethiopic script runs left to right in horizontal lines.
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
You can experiment with examples using the Ethiopic character app.
Since there are normally no combining characters and no joining behaviour, text in the Ethiopic script usually has no contextual variation or placement of glyphs. Nor is printed text cursive.
The orthography has no case distinction, and no special transforms are needed to convert between characters.
In Amharic, a typographic unit is normally equivalent to a single character, and therefore also equivalent to a Unicode grapheme cluster, eg.
On the very rare occasions when a combining mark is used, the unit is still a standard grapheme cluster.
The ordering of codepoints in an Amharic grapheme is generally not relevant, because graphemes are usually single syllable code points. When combining characters are used, there is usually just one, because ፝ [U+135D ETHIOPIC COMBINING GEMINATION AND VOWEL LENGTH MARK] combines the gemination and length mark diacritics in a single code point.
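The relationship between code points and typographic units described above can be sketched in a few lines. This is a simplified approximation that attaches any character with a non-zero canonical combining class to the preceding character; it is not a full implementation of Unicode grapheme cluster segmentation, and the function name `typographic_units` is hypothetical.

```python
import unicodedata


def typographic_units(text: str) -> list[str]:
    """Split text into units of one base character plus any combining marks."""
    units: list[str] = []
    for ch in text:
        if units and unicodedata.combining(ch):
            # A combining mark joins the unit of the preceding character.
            units[-1] += ch
        else:
            units.append(ch)
    return units


# Two syllables, the first carrying U+135F ETHIOPIC COMBINING GEMINATION MARK.
sample = "\u1208\u135f\u1218"
print(len(sample), len(typographic_units(sample)))
```

For ordinary Amharic text without combining marks, the number of units equals the number of code points.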
Words are often separated by spaces in modern text. They may instead be separated by ፡ [U+1361 ETHIOPIC WORDSPACE]; this is becoming less common, but is still frequently seen in handwritten texte,#ethiopic_punctuation.
Observation: A sample page from Wikipedia mixes both approaches on the same page. Some paragraphs use the wordspace and others just separate words with spaces. Where the wordspace is used, it is surrounded by ordinary spaces.
If inline text is styled, eg. underlining, colouring, etc., the wordspace receives the same styling as the word it follows.
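Since a text may separate words with spaces, with the wordspace character, or with a wordspace surrounded by spaces, code that splits Amharic text into words needs to accept all three. The sketch below treats any run of white space and U+1361 as one separator. The function name `split_words` is hypothetical, and other punctuation is deliberately left attached to the neighbouring word.

```python
import re

# One or more white-space characters and/or U+1361 ETHIOPIC WORDSPACE.
_SEPARATORS = re.compile(r"[\s\u1361]+")


def split_words(text: str) -> list[str]:
    """Split on spaces, wordspace characters, or any mixture of the two."""
    return [word for word in _SEPARATORS.split(text) if word]


# The same three words, separated in three different ways.
print(split_words("\u1230\u12cd\u1361\u12e8\u12d8\u122d \u12e8\u1240\u1208\u121d"))
```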
Hyphenated words can also be found, eg. ድረ-ገጾች web sites
Some ASCII punctuation may be used, but Ethiopic has several native punctuation characters.
? [U+003F QUESTION MARK]
! [U+0021 EXCLAMATION MARK]
|paragraph||፨ [U+1368 ETHIOPIC PARAGRAPH SEPARATOR]|
|section||፠ [U+1360 ETHIOPIC SECTION MARK]|
Phrases. ፣ [U+1363 ETHIOPIC COMMA] and ፥ [U+1365 ETHIOPIC COLON] are both roughly equivalent to a comma. They are considered glyph variants of the same punctuation symbol, and a given document will usually use only one of them consistently. The latter is more common in religious texts, and is used for biblical references where English would use a colon.e,#ethiopic_punctuation For more detail see Yacobe,#ethiopic_comma_usage e,#wordspace_in_comma_context.
፤ [U+1364 ETHIOPIC SEMICOLON] Used to separate equivalent main phrases in one idea. Even though it is not placed at the end of a paragraph, it can be used to separate sentences with similar ideas in a paragraph.e,#ethiopic_punctuation Usage is consistent within a given text, but may overlap with one of the previous comma punctuation marks.
፦ [U+1366 ETHIOPIC PREFACE COLON] Follows clarification of a subject. It will preface validation statements and examples that support the clarification.
Sentences. ። [U+1362 ETHIOPIC FULL STOP] is commonly used, immediately preceded by a wordspace character if the text contains them. It is also possible to find the ASCII full stop used.
The ASCII question mark is common at the end of questions, but Ethiopic also has its own ፧ [U+1367 ETHIOPIC QUESTION MARK]. This has fallen into disuse in modern textse,#ethiopic_punctuation.
Amharic also uses the ASCII exclamation mark at the end of a sentence.
¡ [U+00A1 INVERTED EXCLAMATION MARK], known as “Timirte Slaq” (ትእምርተ፡ሥላቅ), appears at the end of a sentence and denotes sarcasm.e,2.3.1 It is not common, but can often be found in political comics.
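The sentence-final marks listed above can be used for a naive sentence splitter. The sketch below is an illustration only: the function name `split_sentences` is hypothetical, and the ASCII full stop is deliberately left out of the terminator set because it can also appear inside abbreviations, so text that ends sentences with it would need extra handling.

```python
import re

# U+1362 full stop, U+1367 Ethiopic question mark, ASCII ? and !,
# and U+00A1 (used as the sarcasm mark at the end of a sentence).
TERMINATORS = "\u1362\u1367?!\u00a1"

_CLASS = re.escape(TERMINATORS)
# A run of non-terminators followed by any terminators that close it.
_SENTENCE = re.compile(rf"[^{_CLASS}]+[{_CLASS}]*")


def split_sentences(text: str) -> list[str]:
    """Return sentences with their closing punctuation still attached."""
    pieces = (match.group().strip() for match in _SENTENCE.finditer(text))
    return [piece for piece in pieces if piece]


for sentence in split_sentences("\u1230\u120b\u121d\u1362 \u1230\u120b\u121d?"):
    print(sentence)
```

A wordspace placed immediately before the full stop stays inside the sentence it closes.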
Paragraphs. ፨ [U+1368 ETHIOPIC PARAGRAPH SEPARATOR] may be used to conclude the final paragraph of a section in lieu of ።. Like ፠ below, three or more may also be used together on a line of their own. This is not much used in modern text.e,#ethiopic_punctuation
Sections. ፠ [U+1360 ETHIOPIC SECTION MARK] is used to divide sections or subsections; generally three or more are used together on a line of their own. This, also, is not much seen in modern text.e,#ethiopic_punctuation
Amharic commonly uses ASCII parentheses to insert parenthetical information into text.
( [U+0028 LEFT PARENTHESIS]
) [U+0029 RIGHT PARENTHESIS]
Amharic texts typically use guillemets around quotations, but modern texts may use quotation marks insteade,#quotation. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.
|initial||« [U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK]||» [U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK]|
|secondary||‹ [U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK]||› [U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK]|
“ [U+201C LEFT DOUBLE QUOTATION MARK]
” [U+201D RIGHT DOUBLE QUOTATION MARK]
The same punctuation is used to highlight cited words (see fig_guillemets).
Observation: The Ethiopian Reporter site has dialogues that begin with the person's name followed by ፡– [U+1361 ETHIOPIC WORDSPACE + U+2013 EN DASH]. There are no wordspace characters in the normal text. Should this be ፦ [U+1366 ETHIOPIC PREFACE COLON]?
According to Yacob,
Emphasis in modern Ethiopic writing will employ every emphasis device available from the available publishing technology (e.g. underline, slant, embolden, letter size, letter outline, background shapes, etc.). The practice however is idiosyncratic and inconsistently applied leading to debate and disagreement within the publishing community.e,#h_emphasis
He provides the following examples:
In ecclesiastical texts emphasis is commonly indicated by colouring the text rede,#h_emphasis.
In text that uses the wordspace to separate words the styling is also associated with the wordspacee,#emphasis_with_wordspace.
Amharic abbreviates by placing / [U+002F SOLIDUS] between letters taken from the original word or words.e,#ethiopic_abbreviation_formation
Sometimes an ASCII period is used, rather than the solidus.e,#ethiopic_abbreviation_formation
Amharic uses 3 consecutive dots to signal ellipsis.
Yacob notes that in Ethiopic literature ellipsis may have anywhere between 3–6 dotse,#ellipsis
Digits are identified by a line that runs across the top and the bottom of a number. The line is built into the font glyphs, rather than text decoration, but in a capable rendering system extends unbroken across the whole number. See digits.
Modern Ethiopic text is generally wrapped word by word. If wordspace separators are used, they are wrapped with the word, and should not appear alone at the beginning of a line.g116
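The word-by-word behaviour can be sketched as a greedy wrapper that keeps a wordspace attached to the word before it. This is a minimal illustration under stated assumptions: width is measured in code points rather than rendered glyph widths, each wordspace is assumed to follow its word directly with no intervening space, and the function name `wrap_words` is hypothetical.

```python
import re

WORDSPACE = "\u1361"

# A word is a run of non-separator characters plus an optional trailing
# wordspace, so the wordspace always travels with the preceding word.
_TOKEN = re.compile(r"[^\s\u1361]+\u1361?")


def wrap_words(text: str, width: int) -> list[str]:
    """Greedily wrap text so that no line begins with a wordspace."""
    lines: list[str] = []
    current = ""
    for token in _TOKEN.findall(text):
        # Words already joined by a wordspace need no extra space.
        joiner = "" if not current or current.endswith(WORDSPACE) else " "
        if current and len(current) + len(joiner) + len(token) > width:
            lines.append(current)
            current = token
        else:
            current += joiner + token
    if current:
        lines.append(current)
    return lines


for line in wrap_words("\u1230\u12cd\u1361\u12e8\u12d8\u122d\u1361\u12e8\u1240\u1208\u121d", 8):
    print(line)
```

A production line breaker would instead rely on the line-breaking properties of the characters and on measured text widths.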
Older Ethiopic text is generally wrapped wherever it hits the right margin, whether wordspace or space are used to separate words, and no hyphenation occurs.g116
Observation: It's possible that a rule is sometimes applied to letter-based wrapping that requires a minimum of 2 letters at the end of a line for printed text (as opposed to handwritten manuscripts). This was observed by Daniel Yacob in the book, "ዜናዊ ፓርልማ" from 1953 (1946EC).g116,#issuecomment-582412224
Whatever style of wrapping is used, however, the following punctuation wrapping rules apply (which means that a wordspace separator should not appear at the start of a line, even when letter-by-letter wrapping occurs).
A new line should not start with a space, math operator or any of the following:e,#ethiopic_punctuation
Full justification is a common typesetting practice. Ethiopic is usually justified by adjusting inter-word spacing. Where words are separated with ፡ [U+1361 ETHIOPIC WORDSPACE] this is still the case, however no extra spaces should be added – the width of the wordspace character changes.
When the wordspace character width changes, the wordspace glyph may be centred, or may appear alongside the previous word, depending on preference.
This section looks at ways in which spacing is applied between characters over and above that which is introduced during justification.
The Ethiopic script uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts. There is some variability from letter to letter in the height of the letter forms and in the positioning relative to the baseline, but the differences are small.
By way of example, fig_baselines compares Latin and Ethiopic glyphs from Noto fonts. The maximum height of Ethiopic letters with a top bar is about the Latin ascender height, with 'serifs' that rise very slightly higher. Note, however, that the height of the Ethiopic glyphs varies from letter to letter, and other Ethiopic letters are set to the Latin cap height. Noto letters sit on the alphabetic baseline. No glyphs reach below the Latin descender extension.
If diacritics are applied above the Ethiopic letters, they will increase the overall line height.
fig_baselines_kefa compares Latin and Ethiopic glyphs from the Kefa and Nyala fonts. The Kefa Ethiopic letters are less regular in height and are on the whole just slightly taller compared to the Noto fonts. All fonts are similar, however.
According to Yacobe,#relative_character_heights, fixed height styles are more generally used for advertisement and not publishing.
You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.
The Amharic language uses numeric and alphabetic styles.
Ethiopic uses a decimal numeric style based on ASCII digits.
It also uses a much more complicated numeric system, described in the CSS Counter Styles specification as the ethiopic-numeric style. The system uses the following 18 digits, and combines them in a somewhat complicated manner.
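To give a feel for how the digits combine, here is a sketch of the conversion as it is described for the ethiopic-numeric style in CSS Counter Styles Level 3, to the best of my reading of that algorithm. It relies on the units U+1369 to U+1371 and the tens U+1372 to U+137A, plus U+137B (hundred) and U+137C (ten thousand) as group separators. The function name `to_ethiopic_numeric` is hypothetical, and the output should be checked against the Counter styles converter before being relied on.

```python
ONES = [chr(0x1369 + i) for i in range(9)]  # digits for 1..9
TENS = [chr(0x1372 + i) for i in range(9)]  # digits for 10..90
HUNDRED = "\u137b"
TEN_THOUSAND = "\u137c"


def to_ethiopic_numeric(number: int) -> str:
    """Convert a positive integer to Ethiopic numerals."""
    if number < 1:
        raise ValueError("the system has no zero or negative numbers")
    if number == 1:
        return ONES[0]

    # Split into groups of two decimal digits, least significant first.
    groups = []
    while number > 0:
        groups.append(number % 100)
        number //= 100

    last = len(groups) - 1
    parts = []
    for index, value in enumerate(groups):
        # Digits are dropped for zero, for a leading 1, and for a 1 in an
        # odd-indexed group; the separator alone then carries the value.
        drop = value == 0 or (value == 1 and (index == last or index % 2 == 1))
        text = ""
        if not drop:
            tens, ones = divmod(value, 10)
            if tens:
                text += TENS[tens - 1]
            if ones:
                text += ONES[ones - 1]
        if index % 2 == 1:
            if value != 0:
                text += HUNDRED
        elif index != 0:
            text += TEN_THOUSAND
        parts.append(text)

    # Groups were built least significant first, so reverse for output.
    return "".join(reversed(parts))


for n in (7, 10, 100, 2023):
    print(n, to_ethiopic_numeric(n))
```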
The amharic alphabetic style for the Amharic language uses these letters.
Another alphabetic style, which we will call amharic-abegede, uses the same letters, but in a different order.
The most common suffix for lists in Amharic is / [U+002F SOLIDUS + U+0020 SPACE] after the counter.
This section is for any features that are specific to Ethiopic and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.