Updated 18 December, 2020
This page gathers basic information about the two scripts used to write Georgian: Mkhedruli (+Mtavruli), and Khutsuri (Asomtavruli+Nuskhuri), and their use for the Georgian language. It aims (generally) to provide an overview of the orthography and typographic features, and (specifically) to advise how to write Georgian using Unicode.
Phonetic transcriptions on this page should be treated as an approximate guide, only. Many are more phonemic than phonetic, and there may be variations depending on the source of the transcription.
მუხლი 1. ყველა ადამიანი იბადება თავისუფალი და თანასწორი თავისი ღირსებითა და უფლებებით. მათ მინიჭებული აქვთ გონება და სინდისი და ერთმანეთის მიმართ უნდა იქცეოდნენ ძმობის სულისკვეთებით.
მუხლი 2. ამ დეკლარაცით გამოცხადებული ყველა უფლება და ყველა თავისუფლება მინიჭებული უნდა ჰქონდეს ყოველ ადამიანს განურჩევლად რაიმე განსხვავების, სახელდობრ, რასის, კანის ფერის, სქესის, ენის, რელიგიის, პოლიტიკური თუ სხვა რწმენის, ეროვნული თუ სოციალური წარმომავლობის, ქონებრივი, წოდებრივი თუ სხვა მდგომარეობისა. გარდა ამისა, დაუშვებელია რაიმე განსხვავება იმ ქვეყნის თუ ტერიტორიის პოლიტიკური, სამართლებრივი ან საერთაშორისო სტატუსის საფუძველზე, რომელსაც ადამიანი ეკუთვნის, მიუხედავად იმისა, თუ როგორია ეს ტერიტორია - დამოუკიდებელი, სამეურვეო, არათვითმმართველი თუ სხვაგვარად შეზღუდული თავის სევერენიტეტში.
The Georgian language is spoken by approximately 3,900,000 people in Georgia, as well as by 355,000 people in Azerbaijan, Turkey and Iran. Georgian is written in four distinct orthographies: Asomtavruli, Nuskhuri, Khutsuri (a mixture of the previous two), and Mkhedruli. The Mkhedruli alphabet is also used for writing the Mingrelian and Svan languages spoken in Georgia, as well as Laz, spoken in Turkey. Asomtavruli and Nuskhuri are now used only by the Georgian Orthodox Church, in ceremonial religious texts and iconography.
დამწერლობა damʦ̇ɛrlɔba damts'erloba script (mkhedruli)
The earliest uncontested use of the script dates from a 5th century inscription. Scriptsource describes the subsequent development as follows:
Since that time, Georgian has been written in three distinct scripts. The original script was an inscriptional form called Asomtavruli, from which a manuscript form, Nuskhuri, was derived. For a time, these were combined in a bicameral system called Khutsuri in which Asomtavruli letters were used as the upper case and Nuskhuri as the lower case. Since the 11th century, a third script has been attested, called Mkhedruli. There is some debate as to the origins of this third script; some scholars say that it evolved from the Khutsuri system, others that it pre-dates it. What is generally agreed upon is that Mkhedruli was used as a secular script alongside the ecclesiastical Khutsuri until the 18th century, since which time it has been used for nearly all Georgian writing. The three scripts share the same letter names, despite having different letter shapes.
Sources: Scriptsource, Wikipedia.
The scripts are alphabets. Both consonants and vowels are indicated by letters. See the table to the right for a brief overview of features for the orthography of the modern Georgian language.
Characters in the Unicode Georgian blocks represent 4 different letter styles for, with few exceptions, the same phonetic range. Modern Georgian uses only the mkhedruli style of lettering, though occasionally its mtavruli variants are used for emphasis or titles. The asomtavruli and nuskhuri styles are not well understood by ordinary Georgians. They are used together in ecclesiastical texts as the bicameral 'khutsuri' writing system.
The script is very close to the phonetics of the language, and all 4 styles generally provide a letter for each sound in a very regular way.
Georgian texts run left to right in horizontal lines.
Words are separated by spaces.
Case is a little special. When asomtavruli and nuskhuri are mixed as khutsuri, then words may be title-cased, and there was an attempt to introduce something similar for mkhedruli in the mid-20th century, but modern Georgian is normally written using lowercase only. If the mtavruli capitals are used, they are applied to a whole word at the minimum, so their use is more akin to ALL-CAPS than to the Capitalisation used in the Latin script.
Mkhedruli has 28 consonant letters. There are 10 vowel letters.
Numbers use ASCII digits.
The visual forms of letters don't usually interact.
This section lists non-ASCII characters used for writing modern Georgian, and other characters in the Georgian script block not used to write the modern language. Core Khutsuri consonants and vowels are listed in the same section. Letters used by other languages or archaic mkhedruli are listed further down.
A feature of the Georgian language is the propensity to cluster consonants, and it does so in 2 ways.
All the character lists in this section show mkhedruli to the left and mtavruli to the right.
Mkhedruli (მხედრული mχɛdruli mxɛdruli) is the standard set of characters for writing modern Georgian. It is normally used as a monocased script, even though there are Unicode mappings to uppercase variants (see mtavruli).
For more information about the characters, click on them and follow the links to the character notes page.
Mtavruli (მხედრული მთავრული mχɛdruli mtavruli mxɛdruli mtʰɑvruli) is also used for writing modern Georgian. In Unicode these characters are classed as uppercase versions of the mkhedruli letters; however, in modern text they are normally used like all-caps rather than at the beginning of a sentence or proper noun. They are typically used to emphasise words or for headings.
The mtavruli letters have similar forms to the mkhedruli except that, in principle, all letters written in the mtavruli style appear with an equal height standing on the baseline, similar to small caps in the Latin script.
Dedicated characters were only introduced in Unicode v11. Prior to that, authors had to use special fonts with the mkhedruli code points in order to write mtavruli letters.
At the time of writing, there are still not many Unicode fonts that provide glyphs for the mtavruli characters, and browsers on OS X and iOS map (most) mtavruli letters to mkhedruli glyphs if a font doesn't contain the necessary glyphs.
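The relationship between the mkhedruli and mtavruli code points can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the mapping is written as a fixed code point offset between the Georgian block (U+10D0 onwards) and the Georgian Extended block (U+1C90 onwards) rather than relying on a runtime's case tables, which only know about mtavruli if they are based on Unicode 11 or later.

```python
# Offset between a mkhedruli letter (Georgian block) and its mtavruli
# counterpart (Georgian Extended block): U+1C90 - U+10D0.
MTAVRULI_OFFSET = 0x1C90 - 0x10D0


def _has_mtavruli_form(ch: str) -> bool:
    """True for mkhedruli letters that have a mtavruli counterpart."""
    cp = ord(ch)
    return 0x10D0 <= cp <= 0x10FA or 0x10FD <= cp <= 0x10FF


def _is_mtavruli(ch: str) -> bool:
    """True for letters in the mtavruli ranges of Georgian Extended."""
    cp = ord(ch)
    return 0x1C90 <= cp <= 0x1CBA or 0x1CBD <= cp <= 0x1CBF


def to_mtavruli(text: str) -> str:
    """Map every mkhedruli letter to mtavruli; leave other characters alone."""
    return "".join(
        chr(ord(ch) + MTAVRULI_OFFSET) if _has_mtavruli_form(ch) else ch
        for ch in text
    )


def to_mkhedruli(text: str) -> str:
    """Map every mtavruli letter back to mkhedruli."""
    return "".join(
        chr(ord(ch) - MTAVRULI_OFFSET) if _is_mtavruli(ch) else ch
        for ch in text
    )
```

Because mtavruli is applied to whole words, `to_mtavruli` deliberately converts every letter it is given; there is no first-letter-only variant.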
Click on the sounds to reveal locations in this document where they are mentioned.
Phones in a lighter colour are non-native or allophones. Source Wikipedia.
The following characters are used to write the modern Georgian language. Each pair shows mkhedruli followed by mtavruli letters.
The following characters are obsolete in the Georgian language, but are still used in other languages. They were removed, because they were redundant, by the Society for the Spreading of Literacy among Georgians, founded by Prince Ilia Chavchavadze in 1879.
IPA values are for the languages that use them. For previous Georgian pronunciation, click on the character and follow links to the character notes page.
The above letters are all used for the Svan language, and the 2nd in the list is used also for Mingrelian and Laz.
The characters below were specifically created for use with other languages (Svan and Mingrelian for the first two, and Laz for the last).
One Georgian-only character is no longer used (since the 1879 reform).
The characters below were used for other languages in the past, including Bats, Ossetian and Abkhaz.
All the character lists in this section show asomtavruli to the left and nuskhuri to the right.
Asomtavruli was used for writing historic Georgian inscriptions, and is really only used in liturgical texts now. These characters in Unicode are classed as uppercase versions of the nuskhuri, and in religious texts they are mixed in a similar way to capitals and lowercase characters in the Latin script. This mixture is called khutsuri.
Nuskhuri developed as a non-inscriptional alphabet, alongside asomtavruli, and is also only used in liturgical texts now. These characters in Unicode are classed as lowercase versions of the asomtavruli.
In religious texts asomtavruli and nuskhuri are mixed in a similar way to capitals and lowercase characters in the Latin script. This mixture is called khutsuri.
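The khutsuri mixture of the two styles can be illustrated with a small sketch. The helper names are hypothetical, and the conversion is written as a code point offset between the asomtavruli letters in the Georgian block (U+10A0 onwards) and the nuskhuri letters in the Georgian Supplement block (U+2D00 onwards). It title-cases a single word: the first letter becomes asomtavruli and the rest nuskhuri.

```python
# Offset between an asomtavruli letter and its nuskhuri counterpart:
# U+2D00 - U+10A0.
NUSKHURI_OFFSET = 0x2D00 - 0x10A0


def to_asomtavruli(ch: str) -> str:
    """Return the asomtavruli (uppercase) form of a nuskhuri letter."""
    cp = ord(ch)
    if 0x2D00 <= cp <= 0x2D25 or cp in (0x2D27, 0x2D2D):
        return chr(cp - NUSKHURI_OFFSET)
    return ch


def to_nuskhuri(ch: str) -> str:
    """Return the nuskhuri (lowercase) form of an asomtavruli letter."""
    cp = ord(ch)
    if 0x10A0 <= cp <= 0x10C5 or cp in (0x10C7, 0x10CD):
        return chr(cp + NUSKHURI_OFFSET)
    return ch


def khutsuri_title(word: str) -> str:
    """Title-case one word: asomtavruli initial, nuskhuri for the rest."""
    if not word:
        return word
    return to_asomtavruli(word[0]) + "".join(to_nuskhuri(c) for c in word[1:])
```

Characters outside the two styles, including mkhedruli letters, pass through unchanged.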
The following characters are used to write the modern Georgian language. Each pair shows asomtavruli followed by nuskhuri.
The first 3 characters are obsolete in the Georgian language, but are still used in the Svan, Mingrelian, and Laz languages. The last character in the list was created specifically for use with Svan.
The following characters are archaic. The first pair was used for Georgian, and the second for Ossetian.
There are no special arrangements for consonant clusters in Georgian.
Georgian normally has no combining marks, and there are none in the Unicode Georgian block.
It is, however, possible to find a combining accent character used with Laz for certain vowels.
The Georgian Unicode block contains no symbols, but Georgian uses a currency symbol and a number symbol from elsewhere in Unicode.
Georgian uses the standard western digits.
The Georgian currency symbol, ₾ [U+20BE LARI SIGN] is found in the Currency Symbols block.
Georgian text runs left to right in horizontal lines.
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
You can experiment with examples using the All Georgian character app, the Modern Georgian character app, or the Khutsuri character app.
Georgian doesn't do any shaping or positioning of characters based on the context, but individual letter forms can vary from font to font.
Words are separated by spaces.
level | punctuation |
---|---|
phrase | , [U+002C COMMA] ; [U+003B SEMICOLON] : [U+003A COLON] |
sentence | . [U+002E FULL STOP] |
paragraph | ჻ [U+10FB GEORGIAN PARAGRAPH SEPARATOR] |
Georgian uses ASCII punctuation.
჻ [U+10FB GEORGIAN PARAGRAPH SEPARATOR] was formerly used to indicate the end of a paragraph, but is not common in modern Georgian. When used, it appeared at the end of the last line in the paragraph.
 | start | end |
---|---|---|
initial | „ [U+201E DOUBLE LOW-9 QUOTATION MARK] | “ [U+201C LEFT DOUBLE QUOTATION MARK] |
nested | « [U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK] | » [U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK] |
According to CLDR, the default quote marks for Georgian are „ [U+201E DOUBLE LOW-9 QUOTATION MARK] at the start, and “ [U+201C LEFT DOUBLE QUOTATION MARK] at the end.
When an additional quote is embedded within the first, the quote marks are « [U+00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK] and » [U+00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK].
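The quotation mark pairs described above can be applied with a very small helper. This is a sketch only: the function name and the `nested` flag are hypothetical, and the marks are the CLDR defaults mentioned in the text.

```python
# Quotation mark pairs for Georgian as described in the text:
# outer quotes use U+201E and U+201C, nested quotes use U+00AB and U+00BB.
OUTER_QUOTES = ("\u201E", "\u201C")
NESTED_QUOTES = ("\u00AB", "\u00BB")


def quote(text: str, nested: bool = False) -> str:
    """Wrap text in Georgian quotation marks for the given nesting level."""
    open_mark, close_mark = NESTED_QUOTES if nested else OUTER_QUOTES
    return f"{open_mark}{text}{close_mark}"


# An embedded quotation is built from the inside out.
inner = quote("B", nested=True)
outer = quote(f"A {inner} C")
```

Here `outer` holds the outer pair wrapped around a string that already contains the nested pair.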
The following example shows quotation marks used to offset terms.
თავრული სტილი არასოდეს გამოიყენება როგორც ე.წ. „დიდი ასოები“.
Modern Georgian tends to use mtavruli characters for a word or phrase to show emphasis or highlight it. The mtavruli characters are used like ALL-CAPS and applied to whole words or phrases, and never just the first letter in a word.
CLDR lists the following additional punctuation marks.
The primary line-break opportunities for Georgian text are the spaces between words.
Georgian uses hyphenation to fit text to lines better.
Characters used for the Georgian language have the following assignments related to line-break properties.
AI | 6 | § † ‡ |
---|---|---|
AL | 134 | ი Ი უ Უ ე Ე ო Ო ა Ა პ Პ ფ Ფ ბ Ბ ტ Ტ თ Თ დ Დ კ Კ ქ Ქ გ Გ ყ Ყ წ Წ ც Ც ძ Ძ ჭ Ჭ ჩ Ჩ ჯ Ჯ ვ Ვ ს Ს ზ Ზ შ Შ ჟ Ჟ ღ Ღ ხ Ხ ჰ Ჰ მ Მ ნ Ნ რ Რ ლ Ლ ჻ |
B2 | 2 | — |
BA | 4 | ‐ – |
IN | 2 | … |
OP | 4 | ‚ „ |
PO | 6 | ′ ″ ₾ |
PR | 2 | № |
QU | 8 | « » ‘ “ |
AL (ordinary alphabetic and symbol characters) requires other characters to provide break opportunities; otherwise, unless tailored rules are applied, no line breaks are allowed between pairs of them.
B2 (break opportunity before and after) covers the EM DASH used to set off parenthetical text. It may allow line breaks before or after, but may also be affected by local orthographic rules.
BA (break after) indicates that it is normal to break after that character.
IN (inseparable characters) is intended to be used consecutively. There is never a line break between two characters of this class.
OP (open punctuation) should be kept with the character that follows. This is desirable, even if there are intervening space characters, as it prevents the appearance of a bare opening punctuation mark at the end of a line.
PO (postfix numeric) usually follows a numerical expression and may not be separated from preceding numeric characters or preceding closing characters, even if one or more space characters intervene. For example, there is no break opportunity in “(12.00) %”.
PR (numeric prefix) may not be separated from following numeric characters or following opening characters, even if a space character intervenes. For example, there is no break opportunity in “฿ (100.00)”.
QU (quotation) characters can be opening or closing, or even both, depending on usage. The default is to treat them as both opening and closing.
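As a rough illustration of the basic rule (letters of class AL offer no break between them, so lines break at the spaces between words), the sketch below wraps Georgian text with Python's standard `textwrap` module. It is a simplification, not an implementation of the Unicode line breaking algorithm: it ignores hyphenation and the finer behaviour of the other classes listed above, and the function name is hypothetical.

```python
import textwrap
from typing import List


def wrap_georgian(text: str, width: int) -> List[str]:
    """Greedy wrap that only ever breaks at whitespace between words."""
    return textwrap.wrap(
        text,
        width=width,
        break_long_words=False,  # never split a run of letters
        break_on_hyphens=False,  # ignore hyphens as break points in this sketch
    )


sample = "ყველა ადამიანი იბადება თავისუფალი და თანასწორი"
lines = wrap_georgian(sample, 20)
```

A word longer than `width` is kept whole on its own line rather than being split.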
Ready-made Counter Styles lists one additive counter style for use with the Georgian language. You can experiment with this style using the Counter styles converter.
 | 1 | 2 | 3 | 4 |
---|---|---|---|---|
georgian (additive) | ა | ბ | გ | დ |

 | 11 | 22 | 33 | 44 |
---|---|---|---|---|
georgian (additive) | ია | კბ | ლგ | მდ |

 | 111 | 222 | 333 | 444 |
---|---|---|---|---|
georgian (additive) | რია | სკბ | ტლგ | ჳმდ |
The georgian additive style uses the letters shown below. It is specified for a range between 1 and 19,999. It uses mkhedruli characters, several of which are archaic in written text.
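The additive system can be sketched as follows. The algorithm is the generic one for additive counter styles: walk the weights from largest to smallest and emit each symbol as many times as its weight fits. The weights for 1 to 4, 10 to 40 and 100 to 400 agree with the examples in the tables above; the remaining entries are an assumption based on the traditional Georgian numeral values and should be checked against the counter style definition before being relied on. The function name is hypothetical.

```python
# (weight, symbol) pairs in descending order of weight.
# Entries not attested in the examples above are assumptions.
ADDITIVE_SYMBOLS = [
    (10000, "\u10F5"),
    (9000, "\u10F0"), (8000, "\u10EF"), (7000, "\u10F4"),
    (6000, "\u10EE"), (5000, "\u10ED"), (4000, "\u10EC"),
    (3000, "\u10EB"), (2000, "\u10EA"), (1000, "\u10E9"),
    (900, "\u10E8"), (800, "\u10E7"), (700, "\u10E6"),
    (600, "\u10E5"), (500, "\u10E4"), (400, "\u10F3"),
    (300, "\u10E2"), (200, "\u10E1"), (100, "\u10E0"),
    (90, "\u10DF"), (80, "\u10DE"), (70, "\u10DD"),
    (60, "\u10F2"), (50, "\u10DC"), (40, "\u10DB"),
    (30, "\u10DA"), (20, "\u10D9"), (10, "\u10D8"),
    (9, "\u10D7"), (8, "\u10F1"), (7, "\u10D6"),
    (6, "\u10D5"), (5, "\u10D4"), (4, "\u10D3"),
    (3, "\u10D2"), (2, "\u10D1"), (1, "\u10D0"),
]


def georgian_counter(n: int) -> str:
    """Render n in the georgian additive style (range 1 to 19,999)."""
    if not 1 <= n <= 19999:
        raise ValueError("georgian counter style is defined for 1..19999")
    parts = []
    for weight, symbol in ADDITIVE_SYMBOLS:
        # Use the symbol as many times as its weight fits into the remainder.
        count, n = divmod(n, weight)
        parts.append(symbol * count)
    return "".join(parts)
```

For instance, 444 is decomposed as 400 + 40 + 4, which yields the three letters shown in the last table above.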
This section is for any features that are specific to Georgian and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.
Version 12.0 of the Unicode Standard has the following blocks dedicated to the Georgian script (numbers in lists are non-ASCII only):
The modern Georgian orthography described here uses characters from the following Unicode blocks.
block | count | characters |
---|---|---|
Currency Symbols | 1 | ₾ |
General Punctuation | 12 | ‐ – — ‘ ‚ “ „ † ‡ … ′ ″ |
Georgian | 34 | ა ბ გ დ ე ვ ზ თ ი კ ლ მ ნ ო პ ჟ რ ს ტ უ ფ ქ ღ ყ შ ჩ ც ძ წ ჭ ხ ჯ ჰ ჻ |
Georgian Extended | 33 | Ა Ბ Გ Დ Ე Ვ Ზ Თ Ი Კ Ლ Მ Ნ Ო Პ Ჟ Რ Ს Ტ Უ Ფ Ქ Ღ Ყ Შ Ჩ Ც Ძ Წ Ჭ Ხ Ჯ Ჰ |
Latin-1 Supplement | 3 | § « » |
Letterlike Symbols | 1 | № |
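The block assignments in the table above can be checked programmatically from code point ranges. The sketch below hard-codes the ranges of the blocks named in the table, plus Georgian Supplement (which holds the nuskhuri letters); the function name is hypothetical, and any character outside these blocks yields `None`.

```python
from typing import Optional

# (block name, first code point, last code point)
BLOCKS = [
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Georgian", 0x10A0, 0x10FF),
    ("Georgian Extended", 0x1C90, 0x1CBF),
    ("General Punctuation", 0x2000, 0x206F),
    ("Currency Symbols", 0x20A0, 0x20CF),
    ("Letterlike Symbols", 0x2100, 0x214F),
    ("Georgian Supplement", 0x2D00, 0x2D2F),
]


def block_of(ch: str) -> Optional[str]:
    """Return the name of the listed block containing ch, or None."""
    cp = ord(ch)
    for name, first, last in BLOCKS:
        if first <= cp <= last:
            return name
    return None
```

Grouping the characters of a text by `block_of` gives the kind of per-block tally shown in the table.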
See also the Character usage lookup page, and the Script Comparison Table.
According to ScriptSource, the Georgian script is used for the following languages: