Cyrillic script summary

Updated 09-Feb-2018 • tags cyrillic, scriptnotes

This page provides basic information about the Cyrillic script and its use for the Russian language. It is not authoritative, peer-reviewed information – these are just notes I have gathered or copied from various places as I learned. For character-specific details see the Cyrillic character notes.

For similar information related to other scripts, see the Script comparison table.

Languages using Cyrillic include: Abkhaz, Adyghe, Southern Altai, Belorussian, Bosnian, Bulgarian, Chechen, Church Slavonic, Even, Evenki, Kazakh, Kyrgyz, Kabardian, Khakas, Komi-Permyak, Macedonian, Mongolian, Nanai, Nganasan, Ossetian, Orok, Russian, Yakut, Serbian, Shor, Turkmen, Tajik, Tatar, Tuvan, Ukrainian, Uzbek, Tundra Yukaghir (32).


Sample (Russian)

Статья 1 Все люди рождаются свободными и равными в своем достоинстве и правах. Они наделены разумом и совестью и должны поступать в отношении друг друга в духе братства.

Статья 2 Каждый человек должен обладать всеми правами и всеми свободами, провозглашенными настоящей Декларацией, без какого бы то ни было различия, как-то в отношении расы, цвета кожи, пола, языка, религии, политических или иных убеждений, национального или социального происхождения, имущественного, сословного или иного положения. Кроме того, не должно проводиться никакого различия на основе политического, правового или международного статуса страны или территории, к которой человек принадлежит, независимо от того, является ли эта территория независимой, подопечной, несамоуправляющейся или как-либо иначе ограниченной в своем суверенитете.

Usage & history

From Scriptsource:

The creation of the Cyrillic script is traditionally attributed to Saint Cyril, a missionary working in Bulgaria during the 9th century. He and his brother are also credited with the invention of the Glagolitic script, a derivation of the Greek cursive alphabet which was modified to fit the sound systems of Slavic languages. Some historians credit Clement of Ohrid, a student of Saint Cyril's, with creating the Cyrillic script as a more readable writing system based on Glagolitic. The Cyrillic script was initially used for writing Old Church Slavonic (also called Old Bulgarian), but it has undergone a number of changes since that time, so much so that the old and modern variants are considered by many to be two different but related scripts. Many of the modern letterforms differ from those used in early Cyrillic writing, some letters have been dropped, and new letters have been added. An orthographic reform was implemented by the Russian tsar Peter the Great in 1708 which removed a number of obsolete letters so that Russian writing is now almost perfectly phonetic.

The script has traditionally been used for writing the Slavic languages, of which Russian is the most widely spoken. During the nineteenth and twentieth centuries, particularly under Soviet rule, it was extended to write over 50 languages throughout Eastern Europe and Asia.

From Wikipedia:

The Cyrillic script /sɪˈrɪlɪk/ is a writing system used for various alphabets across Eurasia (particularly in Eastern Europe, the Caucasus, Central Asia, and North Asia). It is based on the Early Cyrillic alphabet developed during the 9th century AD at the Preslav Literary School in the First Bulgarian Empire. It is the basis of alphabets used in various languages, especially those of Orthodox Slavic origin, and non-Slavic languages influenced by Russian. As of 2011, around 252 million people in Eurasia use it as the official alphabet for their national languages, with Russia accounting for about half of them.

With the accession of Bulgaria to the European Union on 1 January 2007, Cyrillic became the third official script of the European Union, following the Latin script and Greek script. Cyrillic is derived from the Greek uncial script, augmented by letters from the older Glagolitic alphabet, including some ligatures. These additional letters were used for Old Church Slavonic sounds not found in Greek. The script is named in honor of the two Byzantine brothers, Saints Cyril and Methodius, who created the Glagolitic alphabet earlier on. Modern scholars believe that Cyrillic was developed and formalized by early disciples of Cyril and Methodius.

Key features

Cyrillic is an alphabet. Letters typically represent a consonant or vowel sound. See the table to the right for a brief overview of features, taken from the Script Comparison Table.

Text is written horizontally, left to right. The visual forms of letters don't usually interact. The script is bicameral, and uppercase and lowercase shapes are typically the same. There can be a significant difference, however, between regular and cursive/italic shapes for the same character.

About 40% of the characters in the Unicode Cyrillic blocks are for historical letterforms. The remaining 262 are just letters – no punctuation, digits, or combining characters. These are all bicameral, which brings the number of distinct modern letters to 131. Although modern Cyrillic text tends to use precomposed forms, rather than combining diacritics separately with base letters, many extended characters are formed by slightly tweaking a set of basic shapes.

Character lists

The Cyrillic script characters in Unicode 10.0 are spread across 4 blocks (not counting the phonetic blocks nor any of the combining character blocks):

The following links give information about characters used for languages associated with this script. The numbers in parentheses are for non-ASCII characters.

For character-specific details see the Cyrillic character notes.

Consonants

Russian uses 20 consonants (40 characters, if you include uppercase and lowercase), plus a hard and soft sign (see below).

Letter  Transliteration  IPA
Б б     b                b, bʲ
В в     v                v, vʲ
Г г     g                g, gʲ
Д д     d                d, dʲ
Ж ж     zh               ʐ
З з     z                z, zʲ
К к     k                k, kʲ
Л л     l                ɫ, lʲ
М м     m                m, mʲ
Н н     n                n, nʲ
П п     p                p, pʲ
Р р     r                r, rʲ
С с     s                s, sʲ
Т т     t                t, tʲ
Ф ф     f                f, fʲ
Х х     kh               x, xʲ
Ц ц     ʦ                ʦ
Ч ч     ch               ʨ
Ш ш     sh               ʂ
Щ щ     shch             ɕɕ

Most of the consonants can be pronounced without or with palatalisation, i.e. 'hard' or 'soft', respectively. In principle, this is determined by the vowel that follows the consonant.

Palatalised pronunciations are followed by these vowels: я ё е ю и. The other vowels, а о э у ы, follow hard sounds.
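The vowel rule just described can be sketched as a small lookup. This is only an illustration of the principle stated above: the names SOFT_VOWELS, HARD_VOWELS and consonant_quality are made up for this example, and the sketch ignores the soft sign and the consonants that are always hard or always soft.

```python
# Vowel letters that follow soft (palatalised) consonants, and those that
# follow hard ones, exactly as listed in the paragraph above.
SOFT_VOWELS = set("яёеюи")
HARD_VOWELS = set("аоэуы")


def consonant_quality(next_letter):
    """Classify a consonant as 'soft' or 'hard' from the letter after it.

    Returns 'unknown' when the following letter is not one of the ten
    vowels above (for example another consonant or a sign).
    """
    letter = next_letter.lower()
    if letter in SOFT_VOWELS:
        return "soft"
    if letter in HARD_VOWELS:
        return "hard"
    return "unknown"


print(consonant_quality("е"))
print(consonant_quality("а"))
```

Passing the letter that follows a consonant returns the quality predicted by the rule, so the н in нет would be reported as soft and the н in на as hard.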

Hard and soft signs

Letter  Transliteration  IPA  Name
Ъ ъ     ʺ                -    hard
Ь ь     ʹ                ʲ    soft

The hard sign slightly separates a non-palatalised consonant sound from a following iotated vowel. In modern Russian it is mostly used to separate a prefix from a root [W].

The soft sign can be used in two ways.

In most positions it indicates that the preceding consonant is palatalized and any following vowel is iotated, e.g. брать (bratʲ, 'to take') vs. брат (brat, 'brother').

After root-final consonants ч щ (always soft) or ж ш ц (always hard), the soft sign doesn't alter pronunciation but has a grammatical meaning, e.g. тушь (tuʂ, 'India ink') vs. туш (tuʂ, 'flourish after a toast') [W].
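The two uses of the soft sign can be told apart, roughly, by looking at the consonant in front of it. The sketch below is a simplification made for illustration: soft_sign_roles and the two constant names are hypothetical, and the code does not check whether the consonant is actually root-final, which the rule above requires.

```python
from typing import List, Tuple

# Consonants whose hardness never changes, as listed in the text above.
ALWAYS_SOFT = set("чщ")
ALWAYS_HARD = set("жшц")


def soft_sign_roles(word: str) -> List[Tuple[int, str]]:
    """Return (index, role) for every soft sign in the word.

    role is 'grammatical' after ч щ ж ш ц, where the sign does not change
    pronunciation, and 'palatalising' after any other letter.
    """
    lowered = word.lower()
    roles = []
    for i, ch in enumerate(lowered):
        if ch != "ь" or i == 0:
            continue
        previous = lowered[i - 1]
        if previous in ALWAYS_SOFT or previous in ALWAYS_HARD:
            roles.append((i, "grammatical"))
        else:
            roles.append((i, "palatalising"))
    return roles


print(soft_sign_roles("брать"))
print(soft_sign_roles("тушь"))
```

For the examples in the text, the sign in брать is reported as palatalising and the one in тушь as grammatical.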

Prior to the 1918 reforms, every word ending in a consonant had to be followed by a hard or soft sign. That is no longer the case, and the hard sign is now the least common letter in the Russian alphabet [W].

Vowels

Standard Russian uses 11 vowel letters (22 characters).

Letter  Transliteration  IPA
А а     a                a
Я я     i͡u               ja
Э э     ė                e
Е е     e                je, ʲe, e
Ы ы     y                ɨ
И и     i                i, ʲi, ɨ
Й й     ĭ                j
О о     o                o
Ё ё     ē                jo, ʲo
У у     u                u
Ю ю     i                ju

The arrangement above shows the correspondence between vowels that involve iotation/palatalisation and those that do not, although the rules for their application are not always clear-cut [W], and some are more common than others, e.g. э is not particularly common.

The phonetic sounds shown are not always applied, particularly for unstressed vowels, where for example о may be pronounced more like a.

Diacritics

Unicode decomposition can produce ◌̈ [U+0308 COMBINING DIAERESIS] from ё [U+0451 CYRILLIC SMALL LETTER IO], and ◌̆ [U+0306 COMBINING BREVE] from й [U+0439 CYRILLIC SMALL LETTER SHORT I], but precomposed characters are normally used, and each of these is a letter of the alphabet in its own right.
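The decomposition mentioned here can be observed with Python's standard unicodedata module. This is a minimal sketch; the helper names decompose and compose are chosen just for this example.

```python
import unicodedata


def decompose(text):
    """Canonical decomposition (NFD): base letter followed by combining marks."""
    return unicodedata.normalize("NFD", text)


def compose(text):
    """Canonical composition (NFC): back to precomposed letters where they exist."""
    return unicodedata.normalize("NFC", text)


# U+0451 is ё and U+0439 is й, written as escapes to avoid ambiguity.
for letter in ("\u0451", "\u0439"):
    parts = decompose(letter)
    print([f"U+{ord(c):04X} {unicodedata.name(c)}" for c in parts])
# ['U+0435 CYRILLIC SMALL LETTER IE', 'U+0308 COMBINING DIAERESIS']
# ['U+0438 CYRILLIC SMALL LETTER I', 'U+0306 COMBINING BREVE']
```

Comparing text after normalising both sides to the same form avoids mismatches between precomposed and decomposed spellings of the same letter.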

One diacritic that is sometimes used as a combining character is ◌́ [U+0301 COMBINING ACUTE ACCENT], used to indicate stressed vowels for educational materials, dictionaries, and such, e.g. за́мок ('castle') vs. замо́к ('lock'). Rarely, it may be used to specify the stress in uncommon foreign words and in poems with unusual stress used to fit the meter [W].
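As an illustration of how the combining accent is applied, the following sketch inserts U+0301 after a chosen vowel and removes it again. The function names and the VOWELS set are assumptions made for this example; й is left out of the set because it does not carry stress, and ё is written as an escape so that the precomposed letter is used.

```python
COMBINING_ACUTE = "\u0301"

# Vowel letters that can carry stress; \u0451 is precomposed ё.
VOWELS = set("аяэеыиоую\u0451")


def mark_stress(word, stressed_vowel):
    """Insert U+0301 after the n-th vowel of the word (counting from 1)."""
    if stressed_vowel < 1:
        raise ValueError("stressed_vowel counts from 1")
    result = []
    seen = 0
    for ch in word:
        result.append(ch)
        if ch.lower() in VOWELS:
            seen += 1
            if seen == stressed_vowel:
                result.append(COMBINING_ACUTE)
    if seen < stressed_vowel:
        raise ValueError("word has fewer vowels than requested")
    return "".join(result)


def strip_stress(text):
    """Remove stress marks, e.g. before searching or sorting."""
    return text.replace(COMBINING_ACUTE, "")


print(mark_stress("замок", 1))
print(mark_stress("замок", 2))
```

Stripping the accent before comparison lets accented dictionary forms match ordinary running text.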

There are, however, other languages written in the Cyrillic script that use one or more of the 55 diacritics in the Cyrillic repertoire.

Historical characters

Of the 441 characters in the Unicode Cyrillic blocks, 177 are historic (about 40%) and 2 are for Lithuanian dialectology. All 55 of the combining characters in the Cyrillic blocks fall under the historical category.
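These counts tie in with the figures given under Key features (262 modern letters, or 131 distinct letters once case pairs are merged). A quick cross-check, using only the numbers quoted on this page:

```python
# Counts quoted in the text for the Unicode 10.0 Cyrillic blocks.
TOTAL_CHARACTERS = 441
HISTORIC = 177
LITHUANIAN_DIALECTOLOGY = 2

# What is left over is the set of modern letters.
modern_letters = TOTAL_CHARACTERS - HISTORIC - LITHUANIAN_DIALECTOLOGY

# Every modern letter has an uppercase and a lowercase form, so halve the count.
distinct_letters = modern_letters // 2

print(modern_letters, distinct_letters)  # 262 131
```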

Obsolete Russian consonants

Around 1750, after Peter the Great's orthographic reform, the following consonants fell into disuse in Russian [W].

Letter  Transliteration
Ѕ ѕ     ż
Ѯ ѯ     k͡s
Ѱ ѱ     p͡s

After the subsequent orthographic reform of 1918, the following additional consonants were removed from the Russian repertoire [W], although you can still find them used in Church Slavonic and some other languages.

Ѳ ѳ
Ѵ ѵ

Obsolete Russian vowels

Peter the Great's reform led to the abandonment of these vowel characters [W].

Letter  Transliteration
Ѫ ѫ     ǫ
Ѧ ѧ     ę
Ѭ ѭ     i͡ǫ
Ѩ ѩ     i͡ę

The Russian orthographic reform of 1918 dropped the following additional vowel letters from the Russian repertoire [W].

Letter  Transliteration
І і     ī
Ѣ ѣ     i͡e
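One practical use of these lists is spotting text in pre-1918 Russian orthography. The sketch below looks only for the four letters named above as dropped in 1918 (Ѳ, Ѵ, І, Ѣ); the function name is hypothetical, and a hit is only a hint, since і, for example, is a normal letter in modern Ukrainian.

```python
# Lowercase forms of the letters dropped in 1918: ѳ, ѵ, і, ѣ.
# Escapes are used so that Cyrillic і is not confused with Latin i.
PRE_1918_LETTERS = set("\u0473\u0475\u0456\u0463")


def find_pre_reform_letters(text):
    """Return the sorted list of pre-1918 letters that occur in the text."""
    return sorted({ch for ch in text.lower() if ch in PRE_1918_LETTERS})


# "м\u0456ръ" is the pre-reform spelling of мир, written with Cyrillic і.
print(find_pre_reform_letters("м\u0456ръ"))
print(find_pre_reform_letters("мир"))
```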

Context-based glyph changes

Cyrillic doesn't normally have any of the changeability of complex scripts. Characters are typically separate and self-contained.

However, there can be a significant difference in shape between regular and italic/cursive font shapes for the same character.

Figure: regular and italic forms of в, ш, й, м. Conservative transformations between regular and italic.
Figure: regular and italic forms of г, д, т. More radical transformations between regular and italic.

Note in particular the italic form of т in the figure just above. The italic form of м is shown in the previous figure.

The shapes of the italic forms can also vary by language [W].

Text layout

Text direction

Cyrillic text is written left to right in horizontal lines.

Text delimiters

Words are separated by spaces.

Cyrillic uses standard Latin punctuation.

Line breaking

Spaces between words provide the primary line break opportunities [U].
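A minimal sketch of breaking at spaces is a greedy wrapper that starts a new line whenever the next word would not fit. It is an illustration only: break_lines is a made-up name, width is measured in code points rather than rendered width, and real line breaking involves additional rules that are not modelled here.

```python
def break_lines(text, width):
    """Greedily wrap text, breaking only at the spaces between words."""
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        # A word longer than the width is placed on a line of its own.
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


for line in break_lines("Все люди равны перед законом", 12):
    print(line)
```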

Justification

Justification is done, principally, by adjusting the space between words.
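The idea of justifying by stretching inter-word spaces can be sketched for monospaced output as follows. This is an assumption-laden illustration, not a description of how any browser does it: justify_line is a hypothetical helper, it works in whole space characters, and it gives any leftover spaces to the leftmost gaps.

```python
def justify_line(line, width):
    """Pad the gaps between words so that the line is exactly width long."""
    words = line.split()
    if len(words) < 2:
        return line  # nothing to stretch
    gaps = len(words) - 1
    total_spaces = width - sum(len(word) for word in words)
    if total_spaces < gaps:
        return " ".join(words)  # line is already too long for this width
    base, extra = divmod(total_spaces, gaps)
    pieces = []
    for i, word in enumerate(words[:-1]):
        # The first `extra` gaps receive one additional space.
        pieces.append(word + " " * (base + (1 if i < extra else 0)))
    pieces.append(words[-1])
    return "".join(pieces)


print(repr(justify_line("Все люди равны", 16)))
```

The last line of a paragraph is normally left unjustified, so a caller would apply this to every line except the final one.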

The following sample text can be used to see how a browser justifies Cyrillic text.

Все люди равны перед законом и имеют право, без всякого различия, на равную защиту закона. Все люди имеют право на равную защиту от какой бы то ни было дискриминации, нарушающей настоящую Декларацию, и от какого бы то ни было подстрекательства к такой дискриминации.

TBD

Other features to be investigated in this section include: text delimiters, emphasis & highlighting, text decoration, abbreviations & ellipsis, hyphens & dashes, character transforms, quotations, line breaking, hyphenation, justification & alignment, first-letter styling, notes & footnotes, page layout.

References

  1. [U] The Unicode Standard v10.0, Cyrillic, pp. 316-320.
  2. [D] Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, ISBN 0-19-507993-0, pp. 408-412.
  3. [W] Wikipedia, Cyrillic script.
  4. [S] Sri Lanka Standard, Sinhala Character Code for Information Exchange.
Last changed 2018-02-09 21:10 GMT.  •  Licence CC-By © r12a.