Cyrillic script summary

Updated 04-Apr-2019 • tags cyrillic, scriptnotes

This page provides basic information about the Cyrillic script and its use for the Russian language. It is not authoritative, peer-reviewed information – these are just notes I have gathered or copied from various places as I learned. For character-specific details see the Cyrillic character notes.

For similar information related to other scripts, see the Script comparison table.

Languages using Cyrillic include: Abkhaz, Adyghe, Southern Altai, Belorussian, Bosnian, Bulgarian, Chechen, Church Slavonic, Even, Evenki, Kazakh, Kyrgyz, Kabardian, Khakas, Komi-Permyak, Macedonian, Mongolian, Nanai, Nganasan, Ossetian, Orok, Russian, Yakut, Serbian, Shor, Turkmen, Tajik, Tatar, Tuvan, Ukrainian, Uzbek, Tundra Yukaghir (32).


Sample (Russian)

Статья 1 Все люди рождаются свободными и равными в своем достоинстве и правах. Они наделены разумом и совестью и должны поступать в отношении друг друга в духе братства.

Статья 2 Каждый человек должен обладать всеми правами и всеми свободами, провозглашенными настоящей Декларацией, без какого бы то ни было различия, как-то в отношении расы, цвета кожи, пола, языка, религии, политических или иных убеждений, национального или социального происхождения, имущественного, сословного или иного положения. Кроме того, не должно проводиться никакого различия на основе политического, правового или международного статуса страны или территории, к которой человек принадлежит, независимо от того, является ли эта территория независимой, подопечной, несамоуправляющейся или как-либо иначе ограниченной в своем суверенитете.

Usage & history

From Scriptsource:

The creation of the Cyrillic script is traditionally attributed to Saint Cyril, a missionary working in Bulgaria during the 9th century. He and his brother are also credited with the invention of the Glagolitic script, a derivation of the Greek cursive alphabet which was modified to fit the sound systems of Slavic languages. Some historians credit Clement of Ohrid, a student of Saint Cyril's, with creating the Cyrillic script as a more readable writing system based on Glagolitic. The Cyrillic script was initially used for writing Old Church Slavonic (also called Old Bulgarian), but it has undergone a number of changes since that time, so much so that the old and modern variants are considered by many to be two different but related scripts. Many of the modern letterforms differ from those used in early Cyrillic writing, some letters have been dropped, and new letters have been added. An orthographic reform was implemented by the Russian tsar Peter the Great in 1708 which removed a number of obsolete letters so that Russian writing is now almost perfectly phonetic.

The script has traditionally been used for writing the Slavic languages, of which Russian is the most widely spoken. During the nineteenth and twentieth centuries, particularly under Soviet rule, it was extended to write over 50 languages throughout Eastern Europe and Asia.

From Wikipedia:

The Cyrillic script /sɪˈrɪlɪk/ is a writing system used for various alphabets across Eurasia (particularly in Eastern Europe, the Caucasus, Central Asia, and North Asia). It is based on the Early Cyrillic alphabet developed during the 9th century AD at the Preslav Literary School in the First Bulgarian Empire. It is the basis of alphabets used in various languages, especially those of Orthodox Slavic origin, and non-Slavic languages influenced by Russian. As of 2011, around 252 million people in Eurasia use it as the official alphabet for their national languages, with Russia accounting for about half of them.

With the accession of Bulgaria to the European Union on 1 January 2007, Cyrillic became the third official script of the European Union, following the Latin script and Greek script. Cyrillic is derived from the Greek uncial script, augmented by letters from the older Glagolitic alphabet, including some ligatures. These additional letters were used for Old Church Slavonic sounds not found in Greek. The script is named in honor of the two Byzantine brothers, Saints Cyril and Methodius, who created the Glagolitic alphabet earlier on. Modern scholars believe that Cyrillic was developed and formalized by early disciples of Cyril and Methodius.

Key features

Cyrillic is an alphabet. Letters typically represent a consonant or vowel sound. See the table to the right for a brief overview of features, taken from the Script Comparison Table.

Text is written horizontally, left to right. The visual forms of letters don't usually interact. The script is bicameral, and uppercase and lowercase shapes are typically the same. There can be a significant difference, however, between regular and cursive/italic shapes for the same character.

Of the 441 characters in the Unicode Cyrillic blocks, 177 are historic (about 40%) and 2 are for Lithuanian dialectology. The remaining 262 are just letters – no punctuation, digits, or combining characters. These are all bicameral, which brings the number of distinct modern letters to 131. Although modern Cyrillic text tends to use precomposed forms, rather than combining diacritics separately with base letters, many extended characters are formed by slightly tweaking a set of basic shapes.
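As a rough cross-check of counts like these, the sketch below tallies the assigned code points in the Cyrillic blocks by Unicode general category, using Python's standard unicodedata module. The block ranges are my own assumption about which blocks are meant (the five blocks with Cyrillic in their name as of Unicode 10.0), and the function name is made up for illustration. The totals depend on the Unicode version bundled with the Python build, so they may not match the figures above exactly. The tally also cannot tell historic letters from modern ones.

```python
import unicodedata
from collections import Counter

# Assumed block ranges (Unicode 10.0); later versions add more Cyrillic blocks.
CYRILLIC_BLOCKS = {
    "Cyrillic": (0x0400, 0x04FF),
    "Cyrillic Supplement": (0x0500, 0x052F),
    "Cyrillic Extended-C": (0x1C80, 0x1C8F),
    "Cyrillic Extended-A": (0x2DE0, 0x2DFF),
    "Cyrillic Extended-B": (0xA640, 0xA69F),
}


def tally_categories(blocks=CYRILLIC_BLOCKS):
    """Count assigned code points per general category (Lu, Ll, Mn, ...)."""
    counts = Counter()
    for first, last in blocks.values():
        for cp in range(first, last + 1):
            category = unicodedata.category(chr(cp))
            if category != "Cn":  # Cn means unassigned in this Unicode version
                counts[category] += 1
    return counts


if __name__ == "__main__":
    counts = tally_categories()
    print("Unicode version:", unicodedata.unidata_version)
    print("assigned code points:", sum(counts.values()))
    for category, n in sorted(counts.items()):
        print(f"  {category}: {n}")
```

Categories Lu and Ll are the uppercase and lowercase letters, and Mn and Me are the combining marks.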


Character lists

The Cyrillic script characters in Unicode 10.0 are spread across 4 blocks (not counting the phonetic blocks nor any of the combining character blocks).


For character-specific details see the Cyrillic character notes.

Vowels

Standard Russian uses 11 vowel letters (22 characters).

Аа␣Яя␣Ээ␣Ее␣Ыы␣Ии␣Йй␣Оо␣Ёё␣Уу␣Юю

The arrangement above shows the correspondence between vowels involving iotation/palatalisation and those not, although the rules for their application are not always clear-cut [w], and some are more common than others, eg. э is not particularly common.

The phonetic sounds shown are not always applied, particularly for unstressed vowels, where for example о may be pronounced more like a.

й [U+0439 CYRILLIC SMALL LETTER SHORT I] and ё [U+0451 CYRILLIC SMALL LETTER IO] and their capitals decompose to base plus diacritic. 

Obsolete Russian vowels

Peter the Great's reform led to the abandonment of these vowel characters [w].

Ѫѫ␣Ѧѧ␣Ѭѭ␣Ѩѩ

The Russian orthographic reform of 1918 dropped the following additional vowel letters from the Russian repertoire [w].

Іі␣Ѣѣ

Consonants

Russian uses 20 consonants (40 characters, if you include uppercase and lowercase), plus a hard and soft sign (see below).

Бб␣Вв␣Гг␣Дд␣Жж␣Зз␣Кк␣Лл␣Мм␣Нн␣Пп␣Рр␣Сс␣Тт␣Фф␣Хх␣Цц␣Чч␣Шш␣Щщ

Most of the consonants can be pronounced with or without palatalisation, ie. 'soft' or 'hard', respectively. In principle, this is determined by which vowel follows the consonant.

Palatalised pronunciations are followed by these vowels: я ё е ю и. The other vowels, а о э у ы, follow hard sounds.

Hard and soft signs

Ъъ␣Ьь

The hard sign slightly separates a non-palatalised consonant sound from a following iotated vowel. In modern Russian it is mostly used to separate a prefix from a root [w].

The soft sign can be used in two ways.

In most positions it indicates that the preceding consonant is palatalised and any following vowel is iotated, eg. брать (bratʹ, bratʲ) 'to take' vs. брат (brat, brat) 'brother'.

After root-final consonants ч щ (always soft) or ж ш ц (always hard), the soft sign doesn't alter pronunciation but has a grammatical meaning, eg. тушь (tuʂʹ, tuʂ) 'India ink' vs. туш (tuʂ, tuʂ) 'flourish after a toast' [w].

Prior to the 1918 reforms, every word ending in a consonant had to be followed by a hard or soft sign. That is no longer the case, and the hard sign is now the least common letter in the Russian alphabet [w].
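The spelling rules for hard and soft consonants described in the last two sections can be sketched in a few lines of Python. This is only an illustration of the orthographic rule of thumb, not a pronunciation model. It ignores stress, assimilation and loanword exceptions, and the function and constant names are invented for the example.

```python
import unicodedata

CONSONANTS = set("бвгджзклмнпрстфхцчшщ")  # the 20 consonant letters listed above
SOFT_VOWELS = set("яёеюи")                # vowels that follow palatalised consonants
ALWAYS_SOFT = set("чщ")                   # soft regardless of what follows
ALWAYS_HARD = set("жшц")                  # hard regardless of what follows
SOFT_SIGN = "ь"


def classify_consonants(word):
    """Return (consonant, 'hard' | 'soft') pairs according to the spelling rules."""
    # NFC keeps ё and й as single precomposed letters.
    word = unicodedata.normalize("NFC", word.lower())
    result = []
    for i, ch in enumerate(word):
        if ch not in CONSONANTS:
            continue
        following = word[i + 1] if i + 1 < len(word) else ""
        if ch in ALWAYS_SOFT:
            quality = "soft"
        elif ch in ALWAYS_HARD:
            quality = "hard"
        elif following in SOFT_VOWELS or following == SOFT_SIGN:
            quality = "soft"
        else:
            quality = "hard"
        result.append((ch, quality))
    return result


if __name__ == "__main__":
    for sample in ("брать", "брат", "тушь", "люди"):
        print(sample, classify_consonants(sample))
```

With these rules the final т of брать comes out soft and that of брат hard, while the ш of тушь stays hard despite the soft sign.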

Obsolete Russian consonants

Around 1750, after Peter the Great's orthographic reform, the following consonants fell into disuse in Russian [w].

Ѕѕ␣Ѯѯ␣Ѱѱ

After the subsequent orthographic reform of 1918, the following additional consonants were removed from the Russian repertoire [w], although you can still find them used in Church Slavonic and some other languages.

Ѳѳ␣Ѵѵ

Diacritics

̈␣̆

Unicode decomposition can produce ◌̈ [U+0308 COMBINING DIAERESIS] from ё [U+0451 CYRILLIC SMALL LETTER IO], and ◌̆ [U+0306 COMBINING BREVE] from й [U+0439 CYRILLIC SMALL LETTER SHORT I], but usually precomposed characters are used, and these are each letters of the alphabet.
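A small Python sketch of that decomposition, using the standard unicodedata module. The helper name code_points is made up for the example. NFD splits each letter into base plus combining mark, and NFC puts it back together.

```python
import unicodedata


def code_points(text):
    """List each code point in text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# Precomposed letters, written as escapes so the source is unambiguous.
IO = "\u0451"       # CYRILLIC SMALL LETTER IO
SHORT_I = "\u0439"  # CYRILLIC SMALL LETTER SHORT I

if __name__ == "__main__":
    for letter in (IO, SHORT_I, IO.upper(), SHORT_I.upper()):
        decomposed = unicodedata.normalize("NFD", letter)      # base + combining mark
        recomposed = unicodedata.normalize("NFC", decomposed)  # back to one code point
        print(code_points(letter), "->", code_points(decomposed))
        assert recomposed == letter
```

Text that may arrive in either form is usually normalised to NFC before comparing or counting letters, so that ё and й are each treated as one character.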

́

One diacritic that is sometimes used as a combining character is ◌́ [U+0301 COMBINING ACUTE ACCENT], used to indicate stressed vowels in educational materials, dictionaries, and such, eg. за́мок (zámok, zamak) 'castle' vs. замо́к (zamók, zamok) 'lock'. Rarely, it may be used to specify the stress in uncommon foreign words and in poems with unusual stress used to fit the meter [w].
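Because the stress mark is an optional annotation, software that searches or compares Russian text may want to ignore it. The sketch below is one possible approach, with made-up function names. It removes only U+0301 and leaves the text otherwise untouched, so the breve of й and the diaeresis of ё are not affected, even if the text happens to be in decomposed form.

```python
ACUTE = "\u0301"  # COMBINING ACUTE ACCENT, used here as a stress mark


def strip_stress(text):
    """Remove stress marks while leaving every other character as it was."""
    return text.replace(ACUTE, "")


def stressed_index(word):
    """Index of the letter that carries the first stress mark, or None."""
    position = word.find(ACUTE)
    return position - 1 if position > 0 else None


if __name__ == "__main__":
    castle = "за\u0301мок"  # stress on the first syllable
    lock = "замо\u0301к"    # stress on the second syllable
    print(strip_stress(castle) == strip_stress(lock))
    print(stressed_index(castle), stressed_index(lock))
```

Stripping every combining mark after decomposition would be simpler to write, but it would also turn й into и and ё into е, which is why only the acute accent is targeted here.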

All 55 of the combining characters in the Unicode Cyrillic blocks fall under the historical category.

Glyph shaping & positioning

Cyrillic doesn't normally have any of the changeability of complex scripts. Characters are typically separate and self-contained.

However, there can be a significant difference in shape between regular and italic/cursive font shapes for the same character.

[Figure: в ш й м, each shown in regular and italic form.] Conservative transformations between regular and italic.

[Figure: г д т, each shown in regular and italic form.] More radical transformations between regular and italic.

Note in particular the italic form of т in the figure just above. The italic form of м is shown in the previous figure.

The shapes of the italic forms can also vary by language [w].

The shape of the breve sign in Cyrillic is different from that used for Latin text [s]. A font such as Brill can detect the appropriate shape from the adjacent characters.

◌̆ [U+0306 COMBINING BREVE] changes shape between Cyrillic and Latin characters in the Brill font.

Structural boundaries & markers

Word boundaries

Words are separated by spaces.

Phrase boundaries

Cyrillic uses standard Latin punctuation.

Line & paragraph layout

Text direction

Cyrillic text is written left to right in horizontal lines.

Line breaking

Spaces between words provide the primary line break opportunities [u].

Justification

Justification is done, principally, by adjusting the space between words.

The following sample can be used to see how a browser justifies Cyrillic text.

Все люди равны перед законом и имеют право, без всякого различия, на равную защиту закона. Все люди имеют право на равную защиту от какой бы то ни было дискриминации, нарушающей настоящую Декларацию, и от какого бы то ни было подстрекательства к такой дискриминации.
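As a rough illustration of the two behaviours described above (breaking lines at spaces, then justifying by widening the gaps between words), here is a minimal Python sketch for monospaced output. It is not how browsers actually lay out text. The function names are invented, hyphenation is ignored, and width is measured in code points, so it assumes precomposed text without combining marks.

```python
def break_lines(text, width):
    """Greedy line breaking: spaces are the only break opportunities."""
    lines, current = [], []
    for word in text.split():
        # Length of the line if this word were added, with single spaces between words.
        candidate = sum(len(w) for w in current) + len(current) + len(word)
        if current and candidate > width:
            lines.append(current)
            current = [word]
        else:
            current.append(word)
    if current:
        lines.append(current)
    return lines


def justify_line(words, width):
    """Pad the gaps between words so that the line is exactly width long."""
    if len(words) == 1:
        return words[0]
    gaps = len(words) - 1
    spaces = width - sum(len(w) for w in words)
    base, extra = divmod(spaces, gaps)
    # The leftmost gaps absorb the remainder, one extra space each.
    padded = [w + " " * (base + (1 if i < extra else 0))
              for i, w in enumerate(words[:-1])]
    return "".join(padded) + words[-1]


def justify(text, width):
    """Justify every line except the last, which stays ragged."""
    lines = break_lines(text, width)
    return [justify_line(ws, width) for ws in lines[:-1]] + [" ".join(lines[-1])]


if __name__ == "__main__":
    sample = "Все люди равны перед законом и имеют право, без всякого различия"
    for line in justify(sample, 24):
        print("|" + line.ljust(24) + "|")
```

A real layout engine spreads the extra space using glyph widths rather than whole space characters, but the principle of adjusting inter-word space is the same.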

TBD

Further information needed for this section includes:

Glyph shaping & positioning
    Cursive text
    Context-based shaping
    Multiple combining characters
    Context-based positioning
    Transforming characters

Structural boundaries & markers
    Grapheme, word & phrase boundaries
    Hyphens & dashes
    Bracketing information
    Quotations
    Abbreviations, ellipsis, & repetition
    Emphasis & highlights
    Inline notes & annotations

Inline layout
    Inline text spacing
    Bidirectional text

Line & paragraph layout
    Line breaking
    Hyphenation
    Text alignment & justification
    Counters, lists, etc.
    Styling initials
    Baselines & inline alignment

Page & book layout
    General page layout & progression
    Directional layout features
    Grids & tables
    Notes, footnotes, etc.
    Forms & user interaction
    Page numbering, running headers, etc.

References

  1. [u] The Unicode Standard v10.0, Cyrillic, pp. 316-320.
  2. [d] Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, ISBN 0-19-507993-0, pp. 408-412.
  3. [w] Wikipedia, Cyrillic script.
  4. [s] Scriptsource.
Last changed 2019-04-04 7:22 GMT.  •  Licence CC-By © r12a.