Updated 14 November, 2022
This page brings together basic information about the Mongolian (Hudum) script and its use for the Mongolian language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Mongolian using Unicode.
ᠬᠦᠮᠦᠨ ᠪᠦᠷ ᠲᠥᠷᠥᠵᠦ ᠮᠡᠨᠳᠡᠯᠡᠬᠦ ᠡᠷᠬᠡ ᠴᠢᠯᠥᠭᠡ ᠲᠡᠢ᠂ ᠠᠳᠠᠯᠢᠬᠠᠨ ᠨᠡᠷᠡ ᠲᠥᠷᠥ ᠲᠡᠢ᠂ ᠢᠵᠢᠯ ᠡᠷᠬᠡ ᠲᠡᠢ ᠪᠠᠢᠠᠭ᠃ ᠣᠶᠤᠨ ᠤᠬᠠᠭᠠᠨ᠂ ᠨᠠᠨᠳᠢᠨ ᠴᠢᠨᠠᠷ ᠵᠠᠶᠠᠭᠠᠰᠠᠨ ᠬᠦᠮᠦᠨ ᠬᠡᠭᠴᠢ ᠥᠭᠡᠷᠡ ᠬᠣᠭᠣᠷᠣᠨᠳᠣᠨ ᠠᠬᠠᠨ ᠳᠡᠭᠦᠦ ᠢᠨ ᠦᠵᠢᠯ ᠰᠠᠨᠠᠭᠠ ᠥᠠᠷ ᠬᠠᠷᠢᠴᠠᠬᠥ ᠤᠴᠢᠷ ᠲᠠᠢ᠃
The Mongolian script is used for writing the Mongolian language. In the Mongolian People's Republic (Outer Mongolia), the traditional script was replaced by a Cyrillic orthography in the early 1940s, but it was revived in the 1990s, so that both scripts are now used in tandem. The script is also used within the Inner Mongolia Autonomous Region of the People’s Republic of China and elsewhere in China.
The traditional writing for Mongolian is known as Hudum Mongol bichig, and was adapted from the Old Uighur alphabet during the reign of Genghis Khan in the 13th century.
There are four other scripts which are derived from and closely related to Mongolian. These are the Galik, Todo (or "clear script"), Manchu and Sibe scripts.
Sources: Scriptsource, Wikipedia.
The Mongolian script is an alphabet, ie. a writing system in which both consonants and vowels are indicated. See the table to the right for a brief overview of features for the modern Halh Mongolian orthography.
Modern Mongolian can be written using a subset of the letters available in the Mongolian Unicode block. The remainder are used for writing Todo, Sibe, and Manchu, or for writing foreign words, especially in Tibetan and Sanskrit.
Mongolian text runs top to bottom in vertical lines and (unusually) the lines flow left to right.
The script is cursive, ie. letters in a word are joined. All letters join both on the left and right.
Words are separated by spaces, but also contain narrow spaces that precede suffixes and may produce shaping differences to the surrounding letters. These are part of the word, and the parts on either side should not be separated.
It has 16 basic consonant letters and 11 more for representing foreign sounds.
There are 8 vowel letters, including one for foreign sounds.
The script is monocameral.
There is a set of Mongolian digits.
The basic unit of text is a word; however, words can contain prefixes and suffixes. Some of the suffixes are separated from the root of the word by a small gap, but they are still considered to be part of the word.
Vowel harmony is an important aspect of the Mongolian language – words contain only masculine+neuter vowels, or only feminine+neuter vowels. The masculine vowels are:
The feminine vowels are:
The following vowel is neutral, and can appear in words with either masculine or feminine vowels.
Grammatical suffixes also have masculine and feminine versions.
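As a rough illustration of the harmony rule, the following Python sketch classifies a word by the vowel letters it contains. The vowel sets are an assumption based on the usual description of Mongolian vowel harmony (masculine A, O, U; feminine E, OE, UE; neutral I), and the function name `harmony_class` is hypothetical.

```python
# Assumed vowel classes (code points from the Mongolian Unicode block).
# The neutral vowel I (U+1822) is in neither set, so it never affects the result.
MASCULINE = {"\u1820", "\u1823", "\u1824"}  # A, O, U
FEMININE = {"\u1821", "\u1825", "\u1826"}   # E, OE, UE


def harmony_class(word: str) -> str:
    """Return 'masculine', 'feminine', 'neutral' or 'mixed' for a word."""
    has_masc = any(ch in MASCULINE for ch in word)
    has_fem = any(ch in FEMININE for ch in word)
    if has_masc and has_fem:
        return "mixed"  # does not follow the harmony rule
    if has_masc:
        return "masculine"
    if has_fem:
        return "feminine"
    return "neutral"  # only neutral vowels, or no vowels at all


# MA, O, ANG, GA, O, LA: a word with masculine vowels only.
print(harmony_class("\u182E\u1823\u1829\u182D\u1823\u182F"))
```

A real implementation would also need to take suffixes into account, since they have their own masculine and feminine versions.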
Unicode encodes separate characters for different sounds for the Mongolian language, regardless of whether the glyph shapes used are identical. For example, the glyph shapes for the 2 characters ᠣ [U+1823 MONGOLIAN LETTER O] and ᠤ [U+1824 MONGOLIAN LETTER U] are identical, as are those for ᠥ [U+1825 MONGOLIAN LETTER OE] and ᠦ [U+1826 MONGOLIAN LETTER UE]. The two pairs only differ in shape in isolated and initial forms.
Identical glyphs for different sounds occur across other pairings also. For example, the medial and final shapes for a and n are identical.
The Unicode Standard provides the following examples of word pairs that cannot be distinguished visually.u,530
The result of this encoding method is that it is impossible to accurately copy Mongolian text from a visual source unless you speak the language well enough to recognise the phonetics of the words involved. It also leads to mistakes when Mongolian speakers type text.
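Because the glyphs alone cannot be trusted, it can help to inspect the code points behind a word. This minimal Python sketch uses the standard `unicodedata` module to list each character's code point and name; the helper name `describe` is hypothetical.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """List each character as 'U+XXXX NAME' so look-alikes can be told apart."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


# O and U can share a glyph shape, but they remain distinct code points.
for line in describe("\u1823\u1824"):
    print(line)
```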
Written Mongolian words use traditional spellings that may not correspond closely to modern pronunciations. For example, if you were to spell out the letters in the following word as written you would get uʤəgulxu, whereas the modern pronunciation is uʤuuləx. ᠤᠵᠡᠭᠦᠯᠬᠦ
These are the sounds of Khalkha Mongolian.
Phones in a lighter colour are non-native or allophones. Source Wikipedia.
Eight vowels are used for the Mongolian language.
ᠧ [U+1827 MONGOLIAN LETTER EE] is used for foreign words.
As previously mentioned, vowel harmony is an important part of the orthography for the Mongolian language.
In addition to the set of Mongolian vowels, the Mongolian block also includes additional vowel characters for use with Todo, Sibe, Manchu and Ali Gali vowels.
Todo
Sibe
Manchu
Ali gali
Many Mongolian suffixes are separated from the root or other suffixes by a small gap, eg. ᠭᠠᠵᠠᠷ ᠠ. The Unicode Standard provides [U+202F NARROW NO-BREAK SPACE] for this gap, which is thinner than a normal space and doesn't provide an opportunity for line-breaking.
Characters following NNBSP may take on special shapes.
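The suffix gap can be used to take a word apart programmatically. The Python sketch below splits a single word into its stem and suffixes at each U+202F; the function name `split_suffixes` is hypothetical, and the sample word is built from escapes (GA, A, JA, A, RA, then NNBSP and A) purely for illustration.

```python
NNBSP = "\u202F"  # NARROW NO-BREAK SPACE


def split_suffixes(word: str) -> tuple[str, list[str]]:
    """Split one word into its stem and the suffixes that follow each NNBSP."""
    stem, *suffixes = word.split(NNBSP)
    return stem, suffixes


word = "\u182D\u1820\u1835\u1820\u1837" + NNBSP + "\u1820"
stem, suffixes = split_suffixes(word)
print(len(stem), len(suffixes))
```

The pieces are returned only for analysis. For display and line-breaking they still belong to one word.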
[U+180E MONGOLIAN VOWEL SEPARATOR] is used where a final ᠠ [U+1820 MONGOLIAN LETTER A] or ᠡ [U+1821 MONGOLIAN LETTER E] vowel is separated from the rest of a word.
Unlike characters following [U+202F NARROW NO-BREAK SPACE], the A or E following a word is not a suffix, but an integral part of the word. Whether a final A or E is joined or separated is a purely lexical decision, and not an instance of varying orthography.
MVS always requires the forward tail form of the following A or E letter. The preceding letter form varies according to the letter, and in some cases whether this is traditional or modern orthography. See fig_mvs.
Not used for Todo, Manchu or Sibe.
The Mongolian block contains consonant symbols for use with Mongolian, Todo, Sibe, Manchu and Ali Gali. Some of the Mongolian characters are shared with other uses.
The Mongolian language has a basic set of 16 consonants.
In the current Mongolian encoding model, the code points ᠬ [U+182C MONGOLIAN LETTER QA] and ᠭ [U+182D MONGOLIAN LETTER GA] each have both masculine and feminine forms. The different forms have different shapes and different pronunciations.
The masculine form is used before a masculine vowel, and vice versa.
masculine | feminine | masculine | feminine | ||
---|---|---|---|---|---|
initial | |||||
medial |
The font is expected to automatically select the appropriate glyph form for these velar consonants. This becomes more complicated, however, where these consonants occur without a following vowel (ie. before another consonant, or in final position).
In Sibe and Manchu, the form is selected based on the previous vowel. In Mongolian and Todo, however, the shape depends on the gender of the word, as described in harmony, and this may not be detectable from the previous vowel. The Unicode Standard gives examples of 2 words where it is necessary to look at the beginning of the word to determine the shape at the end of the word.u,534
This puts a significant strain on the capabilities of the font itself and of the font developers, and some fonts do not achieve this correctly. In addition, exceptional circumstances have to be taken into account. In consequence, fonts may need 100 or more rules to handle this.
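To make the lookup problem concrete, here is a deliberately simplified Python sketch of how the form of QA or GA might be chosen. It is not how any particular font works: the vowel sets are assumed, the function name `velar_form` is hypothetical, and a following neutral vowel is left unresolved because this sketch does not model the exceptional rules.

```python
MASCULINE = {"\u1820", "\u1823", "\u1824"}  # A, O, U (assumed set)
FEMININE = {"\u1821", "\u1825", "\u1826"}   # E, OE, UE (assumed set)
NEUTRAL_I = "\u1822"
VELARS = {"\u182C", "\u182D"}               # QA, GA


def velar_form(word: str, index: int) -> str:
    """Guess which form the velar consonant at word[index] needs."""
    if word[index] not in VELARS:
        raise ValueError("not a velar consonant")
    following = word[index + 1] if index + 1 < len(word) else ""
    if following in MASCULINE:
        return "masculine"
    if following in FEMININE:
        return "feminine"
    if following == NEUTRAL_I:
        return "unresolved"  # not modelled in this sketch
    # No vowel follows: fall back to the gender of the whole word, which
    # may be set by a vowel much earlier in the word.
    if any(ch in MASCULINE for ch in word):
        return "masculine"
    if any(ch in FEMININE for ch in word):
        return "feminine"
    return "unresolved"


print(velar_form("\u1834\u1820\u182D", 2))        # CHA, A, GA: final GA
print(velar_form("\u1821\u1837\u182C\u1821", 2))  # E, RA, QA, E: QA before E
```

Even this toy version needs to scan the whole word for the final-position case, which hints at why real fonts need so many rules.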
The full set of consonants used for Mongolian includes 11 letters that are normally used for writing foreign sounds.
Todo
Sibe
Manchu & Buryat
Ali gali
Because the script is alphabetic, there are no special mechanisms for representing clusters of consonants without intervening vowels, or doubled consonants.
The Mongolian block contains 3 visible combining characters.
It also contains 3 invisible control characters, also classed by Unicode as combining characters, which can be used to indicate specific alternative forms for letters. See Context-based shaping.
Mongolian often uses European digits; however, there is also a set of Mongolian digits.
Traditionally, Mongolian digits run horizontally within the vertical lines, but it is common in modern text for them to run down the line instead.u
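Converting between European and Mongolian digits is a simple offset operation, since the Mongolian digits occupy U+1810 to U+1819 in order. The following Python sketch shows one way to do it; the function name `to_mongolian_digits` is hypothetical.

```python
MONGOLIAN_ZERO = 0x1810  # MONGOLIAN DIGIT ZERO; the digits run to U+1819


def to_mongolian_digits(number: int) -> str:
    """Render a non-negative integer with Mongolian decimal digits."""
    if number < 0:
        raise ValueError("only non-negative integers are handled")
    return "".join(chr(MONGOLIAN_ZERO + int(d)) for d in str(number))


text = to_mongolian_digits(2022)
print(text)
# Python's int() accepts Unicode decimal digits, so the value round-trips.
print(int(text))
```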
Mongolian script is written vertically, top to bottom, in columns that flow left to right. This is an unusual configuration. (Chinese, Japanese and Korean vertical text columns are read right to left.) It derives from the fact that this script descended from a script (Old Uyghur) that was written right to left.
Fullwidth Latin alphabetic and digit characters are seen in traditional Mongolian text, as are fullwidth Chinese characters and punctuation (see mixed_text). When used, the latter are displayed upright. The fullwidth series of Unicode characters may be used as an easy way to achieve this.g5
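The fullwidth forms of printable ASCII characters sit at a fixed offset in Unicode (U+FF01 to U+FF5E), so the substitution can be sketched in a few lines of Python. The function name `to_fullwidth` is hypothetical, and spaces are deliberately left untouched here.

```python
FULLWIDTH_OFFSET = 0xFEE0  # distance from U+0021 to U+FF01


def to_fullwidth(text: str) -> str:
    """Map printable ASCII (U+0021..U+007E) to its fullwidth counterpart."""
    return "".join(
        chr(ord(ch) + FULLWIDTH_OFFSET) if "!" <= ch <= "~" else ch
        for ch in text
    )


print(to_fullwidth("ABC 123"))
```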
Cyrillic characters may also be seen, used in a way that resembles fullwidth characters, but this is actually a property of the font used to display the characters, since there are no fullwidth Cyrillic code points in Unicode. Emoji are also expected to be displayed upright.g5
Non-fullwidth letters and numbers tend to be written sideways.g5 See mixed_text_sideways.
Upright digits may be used for list counters. And Mongolian also has the feature referred to in Japanese as tate chu yoko, whereby small sequences of non-fullwidth numbers or punctuation may run horizontally within the vertical flow.g5 See fig_digits_tate_chu_yoko.
Certain punctuation marks are upright, and others are rotated.g5 (See inline.)
Many of the conventions seen in actual digital text may be determined more by the available technology than by what the content author wants to achieve.g5
When Mongolian excerpts are shown in text that is set horizontally (such as on this page), the Mongolian is sometimes represented as a sequence of single vertical words, eg. ᠮᠣᠩᠤᠯ
ᠪᠢᠴᠢᠭ, but in other cases it is rotated left and joins horizontally, eg. ᠮᠣᠩᠤᠯ ᠪᠢᠴᠢᠭ.
Mongolian text written horizontally is read left-to-right. This means that if it contains embedded text from another language, such as English, there is no bidirectional behaviour (as there would be in Arabic-script text).
Note also that it is not possible to produce a page of vertical text by printing it horizontally and then rotating the page. This is because the order of lines in the rotated page will be right-to-left, whereas it should be left-to-right.
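The effect of rotation on line order can be checked with a small model. In this Python sketch a page is a list of equal-length horizontal lines, and it is rotated 90 degrees clockwise so that the text runs top to bottom; the function name `rotate_clockwise` and the placeholder letters are purely illustrative.

```python
def rotate_clockwise(lines: list[str]) -> list[str]:
    """Rotate a block of equal-length horizontal lines by 90 degrees clockwise."""
    return ["".join(column) for column in zip(*reversed(lines))]


page = ["abc",  # first line
        "def"]  # second line
rotated = rotate_clockwise(page)
# Read each column of the rotated page from top to bottom, left to right.
columns = ["".join(row[i] for row in rotated) for i in range(len(page))]
print(columns)
```

The leftmost column of the rotated page holds the second line, so a reader moving left to right would meet the lines in reverse order.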
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
You can experiment with examples using the Mongolian character app.
The orthography has no case distinction, and no special transforms are needed to convert between characters.
Most of the complexity of the Mongolian traditional script has to do with two things: (1) characters are allocated on the basis of phonemic differences, but many characters share identical shapes, and (2) there are many variant forms for a given character, some of which cannot be produced automatically.
Similarly to the Arabic script, Mongolian letters within a word tend to be joined cursively along the centre baseline, and the shapes of joined characters can vary significantly in various positions. Unlike Arabic, and many other cursive scripts, there are no characters that only join on one side.
Letters following a Mongolian suffix space may need to be displayed using a joining form, but that is not always the case. It depends on the suffix.
The base shape of a letter can change significantly, depending on the position in a word. On the other hand, a number of letters adopt identical shapes in the same, or sometimes different, joining contexts.
Certain letters ligate with adjacent letters.
In addition to the cursive shaping mentioned just above, individual letters may have context-dependent variant forms that can be quite different from the standard forms. Where the alternative form can be determined algorithmically, the font should produce the change.
An unusual feature of the Mongolian traditional script is that the shape of a letter may depend on the vowel harmony of the word, and so may be determined at some distance from the character in question.
For unpredictable variants, the Mongolian block has three 'free variation selectors' which can be used to indicate which variant form should be used. The variant selector is used immediately after the character to be changed.
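Since the selectors are invisible, it can be useful to list where they occur in a string. The Python sketch below assumes the three selectors are U+180B to U+180D and reports each selector together with the character it follows; the function name `variant_requests` is hypothetical.

```python
import unicodedata

# Free variation selectors one to three (assumed code points).
FVS = {"\u180B": 1, "\u180C": 2, "\u180D": 3}


def variant_requests(text: str) -> list[tuple[str, int]]:
    """Return (base character name, selector number) for each selector found."""
    found = []
    for previous, ch in zip(text, text[1:]):
        if ch in FVS:
            found.append((unicodedata.name(previous, "<unnamed>"), FVS[ch]))
    return found


# A, then GA followed by the first selector. Which glyph that selects
# depends on the font and is not modelled here.
print(variant_requests("\u1820\u182D\u180B"))
```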
Unfortunately, variation selector usage is still not completely standardised across Mongolian fonts. For a set of tables summarising current standardisation proposals and major font support see Mongolian variant forms.
Words are separated by spaces. The gaps before certain suffixes are not considered to be word delimiters. Those gaps are usually created using [U+180E MONGOLIAN VOWEL SEPARATOR] or [U+202F NARROW NO-BREAK SPACE].
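This distinction matters when splitting text into words in code. In Python, for example, calling `str.split()` with no argument treats U+202F as whitespace and so cuts suffixes away from their words. A minimal sketch that splits on U+0020 only (the function name `words` and the sample string are illustrative):

```python
def words(text: str) -> list[str]:
    """Split on U+0020 only, keeping suffix gaps inside their words."""
    return [w for w in text.split(" ") if w]


# Word 1: GA A JA A RA + NNBSP + A.  Word 2: CHA A GA.
sample = "\u182D\u1820\u1835\u1820\u1837\u202F\u1820 \u1834\u1820\u182D"
print(len(words(sample)))   # counts the suffix as part of the first word
print(len(sample.split()))  # also splits at U+202F, so one piece more
```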
Mongolian uses a mixture of local punctuation and punctuation from Chinese.
phrase | |
---|---|
sentence | ᠃ [U+1803 MONGOLIAN FULL STOP] |
Question marks and exclamation marks are fullwidth, upright characters (see an example). Mongolian punctuation is horizontally centred in each vertical line.n,#punctuation_rules
Mongolian commonly uses parentheses or brackets to insert parenthetical information into text. Parentheses and brackets may be fullwidth or may not be. They are rotated.g5
 | start | end |
---|---|---|
standard | ( [U+0028 LEFT PARENTHESIS] | ) [U+0029 RIGHT PARENTHESIS] |
alternate | 〔 [U+3014 LEFT TORTOISE SHELL BRACKET] | 〕 [U+3015 RIGHT TORTOISE SHELL BRACKET] |
A variety of brackets are used around quotations. Quotation marks are often fullwidth but may not be. Quotation marks are rotated, as in Chinese.g5
 | start | end |
---|---|---|
initial | 《 [U+300A LEFT DOUBLE ANGLE BRACKET] | 》 [U+300B RIGHT DOUBLE ANGLE BRACKET] |
 | « [U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK] | » [U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK] |
nested | 〈 [U+3008 LEFT ANGLE BRACKET] | 〉 [U+3009 RIGHT ANGLE BRACKET] |
᠁ [U+1801 MONGOLIAN ELLIPSIS] is used for ellipsis.
Like Chinese and Japanese, Mongolian text uses ruby annotations to express the pronunciation of words for beginners or in ambiguous situations. This is useful in Mongolian because many characters look identical in cursive text.
Annotations are typically written in the Latin script, and run down the right side of the line.g1
᠊ [U+180A MONGOLIAN NIRUGU] is used to extend the baseline.
Underlines run down the right side of vertical lines of Mongolian text. Lines down the left side are equivalent to overline in English text.n,#h_text_decoration
The side of the vertical line for underlines doesn't change for embedded Latin text. Since Latin text runs down the page, this makes the underline run across the top of the Latin letters. See the red line in text_decoration_mixed.n,#h_text_decoration
There are a number of different styles of underlining in use, as shown in underline_styles.g10
If an underline is styled so that it leaves a gap below spaces that separate words, the underline should not also leave gaps below the narrow spaces used to separate some suffixes from the word root. The desired outcome is that shown here; however, implementations may vary.g9
Line-breaking normally occurs at word boundaries (indicated by spaces).
Lines should not break, however, on gaps between words and their suffixes when separated by [U+202F NARROW NO-BREAK SPACE], or where [U+180E MONGOLIAN VOWEL SEPARATOR] is used.n,#mongolian_space
As for most writing systems, there are restrictions around which characters are allowed to start or end a line (eg. question marks and colons should not be used at the beginning of a line, and opening brackets should not be used at the end of a line).n,#punctuation_rules
When Traditional Mongolian (and Todo) is hyphenated, the visual marker used is ᠆ [U+1806 MONGOLIAN TODO SOFT HYPHEN], which is placed at the beginning of the second line.u,545
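The placement of the marker can be sketched in Python as follows. The break position is supplied by the caller, because choosing a valid hyphenation point depends on language rules that are not covered here; the function name `hyphenate` is hypothetical.

```python
SOFT_HYPHEN = "\u1806"  # MONGOLIAN TODO SOFT HYPHEN


def hyphenate(word: str, position: int) -> tuple[str, str]:
    """Break a word at `position`; the marker leads the second line."""
    if not 0 < position < len(word):
        raise ValueError("break position must fall inside the word")
    return word[:position], SOFT_HYPHEN + word[position:]


first, second = hyphenate("\u182E\u1823\u1829\u182D\u1823\u182F", 3)
print(second.startswith(SOFT_HYPHEN))
```

Note that the first line ends without any marker, unlike hyphenation in Latin-script text.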
To justify the text on a line, the spaces between words are adjusted.
When Chinese characters are embedded in Mongolian text and justification applied, space is not added between the Chinese characters (as it would be in a Chinese document).n,#mixed_arrangement_cjk
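As a back-of-envelope model of this behaviour, the Python sketch below distributes the spare length of a line equally between adjacent words. It assumes each entry in `word_widths` already measures a whole unit, that is, a word including any suffixes joined by narrow spaces, or an embedded run of Chinese characters; the function name `gap_size` and the numbers are hypothetical.

```python
def gap_size(word_widths: list[float], line_length: float) -> float:
    """Return the space to place between adjacent words on a justified line."""
    gaps = len(word_widths) - 1
    if gaps < 1:
        return 0.0  # a single unit cannot be stretched by word spacing
    return (line_length - sum(word_widths)) / gaps


# Three units of length 30, 50 and 20 on a line of length 120.
print(gap_size([30, 50, 20], 120))
```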
This section looks at ways in which spacing is applied between characters over and above that which is introduced during justification.
The default baseline for Mongolian-script text runs down the centre of the vertical line spacing, as shown in centre_baseline.n,#h_text_decoration
When mixed with other languages, the text in those languages should also be centre-aligned along the Mongolian baseline.n,#mixed_arrangement_alphanum
You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.
The Mongolian orthography uses ASCII and native numeric styles. It also uses fixed styles based on circled numbers.
The mongolian numeric style is decimal-based and uses these digits.rmcs
Examples:
The circled-decimal fixed style uses these numbers. It is only able to count to 50.
The dotted-decimal fixed style uses these numbers. It is only able to count to 20.
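The limits of these fixed styles follow from the number of symbols available. The Python sketch below assumes the circled style draws on the Unicode circled numbers (U+2460 to U+2473, U+3251 to U+325F and U+32B1 to U+32BF) and the dotted style on the digit-full-stop characters (U+2488 to U+249B). The function name `fixed_counter` is hypothetical, and the fallback to plain decimal is modelled on the default fallback of CSS counter styles.

```python
# Assumed symbol sets for the two fixed styles.
CIRCLED = (
    [chr(cp) for cp in range(0x2460, 0x2474)]    # 1-20
    + [chr(cp) for cp in range(0x3251, 0x3260)]  # 21-35
    + [chr(cp) for cp in range(0x32B1, 0x32C0)]  # 36-50
)
DOTTED = [chr(cp) for cp in range(0x2488, 0x249C)]  # 1-20


def fixed_counter(n: int, symbols: list[str]) -> str:
    """Return the n-th symbol, or plain decimal once the set runs out."""
    if 1 <= n <= len(symbols):
        return symbols[n - 1]
    return str(n)


print(fixed_counter(50, CIRCLED), fixed_counter(51, CIRCLED))
print(fixed_counter(20, DOTTED), fixed_counter(21, DOTTED))
```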
In the Baiti and Noto fonts, the first 20 counters for the circled style lie on their side instead of upright. The Mongolian White font fixes this, but doesn't appear to handle dotted digits above 9.
As list counters, these digits are generally used upright, as shown in fig_circled_counters.g5
However, counters may also run down the page (see fig_rotated_counters).g5
This section is for any features that are specific to Mongolian and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.
Generally, book binding is on the left, and pages are turned towards the left, unlike books set vertically in Chinese or Japanese.n,#h_binding
Columns run horizontally, rather than vertically as in Western typography.n,#h_columns
It is common for the default orientation of pages to be landscape, rather than portrait.n,#h_paper_direction
Form controls on Web pages should be rotated 90 degrees clockwise, compared to the form controls for Western languages.n,#h_input
Page numbers should be displayed on the upper or lower side of the page.n,#h_page_numbering