Updated 25 November, 2021
This page gathers basic information about the Cherokee script and its use for the Cherokee language. It aims (generally) to provide an overview of the orthography and typographic features, and (specifically) to advise how to write Cherokee using Unicode.
Phonetic transcriptions on this page should be treated as an approximate guide only. Many are more phonemic than phonetic, and there may be variations depending on the source of the transcription.
Ꭰꮿꮩꮈ 1 Ꮒꭶꮣ ꭰꮒᏼꮻ ꭴꮎꮥꮕꭲ ꭴꮎꮪꮣꮄꮣ ꭰꮄ ꭱꮷꮃꭽꮙ ꮎꭲ ꭰꮲꮙꮩꮧ ꭰꮄ ꭴꮒꮂ ꭲᏻꮎꮫꮧꭲ. Ꮎꮝꭹꮎꮓ ꭴꮅꮝꭺꮈꮤꮕꭹ ꭴꮰꮿꮝꮧ ꮕᏸꮅꮫꭹ ꭰꮄ ꭰꮣꮕꮦꮯꮣꮝꮧ ꭰꮄ ꭱꮅꮝꮧ ꮟᏼꮻꭽ ꮒꮪꮎꮣꮫꮎꮥꭼꭹ ꮎ ꮧꮎꮣꮕꮯ ꭰꮣꮕꮩ ꭼꮧ.
Ꭰꮿꮩꮈ 2 Ꮒꭶꮫ ꭰꮒᏼꮻ ꭴꮎꮣꮒꮬ ꮎꭲ ꮒꭶꮣ ꭴꮒꮂ ꭲᏻꮎꮫꮑꮧꭲ ꭰꮄ ꮩꭿ ꭰꮥꮧꭲ ꮎꭲ ꮥꭶꭷꮕꭹ ꭿꭰ ꮧꭶꮓꮳꮃꮕꭲ, ꭴꮎꮴꮅꮫ ꮔꮎꮰꮿꮝꮫꮎ ꮎꭲ ꮒꭶꭵꮙ ꮷꮣꮄꮕꮣ, ꮥꭷꮑꭲꮝꮤꮕꭿ ꮷꮎꮣꮄꮕꮣ ꭰꮒᏼꮻ, ꮧꭸꭶꭶꮕꮧꭲ, ꭰꭸꮿ ꭰꮄ ꭰꮝꭶꮿ, ꭶꮼꮒꭿꮝꮧ, ꮷꮎꮑꮅꮧ, ꮧꮎꮩꭹꮿꮝꭹ ꭰꮄ ꮠꭲ ꮎꮒꮅꮝꭼꭹ, ꭰᏸꮅ ꭴꮎꮩꮲꭿ ꭰꮄ ᏼꮻ ꮒꮩꮣᏻꮎꮣꮄꮕꭹ, ꮔꮕꮏꮕ, ꭴꮥꮕ ꭰꮄ ꮠꭲ ꮔꮝꮧꮣꮕꭲ. Ꭴꮧꮧꮲꭲꭸꮝꮩꮧ, ꮭ ꮔꮎꮰꮿꮝꮫꮎ ꭴꮩꭿᏻꮢꮎ ꮎꮝꭹꮓ ꮧꮎꮩꭹꮿꮝꭹ ꮒꮣᏻꮅꮝꮩꮤꮕ ꮎꮝꭹ ꭴꮩꮲꮕꭲ, ꭲᏻꮎꮫꮑꮅꮣꮝꮧ ꭴꮒꮂꭹ ꭰꮄ ꭰᏸꮅ ꮪꮎꮩꮲꮢ ꮔꮝꮧꮣꮕ ꮎꮝꭹ ꮒꭼꮎꮫꭲ ꭰꮄ ꮝꭶꮪꭹ ꮎꮝꭹꮓ ꭰꮒᏼꮻ ꭰꮎꮑꮈꭹ, ꭲᏻꮓꮝꮚ ꮎꮝꭹꮎꭲ ꭴꮎꮣꮴꮅꮣ, ꭶꭸꭶꮕꮨ ꭸꮢꭲ, ꭼꮒꭼꭼ-ꭴꮹꮢ-ꭴꭶꮞꮝꮧꮥꭹ ꭽꮻꮒꮧꮲ ꮒꭶꭵ ꮠꭲ ꮕꮒᏺꭲꮝꮣꮑꮂꮎ ꮎꭲ ꭴꮒꮂ ꭴꮎꮣꮴꮅꭶꮿ.
ᎠᏯᏙᎸ 1 ᏂᎦᏓ ᎠᏂᏴᏫ ᎤᎾᏕᏅᎢ ᎤᎾᏚᏓᎴᏓ ᎠᎴ ᎡᏧᎳᎭᏉ ᎾᎢ ᎠᏢᏉᏙᏗ ᎠᎴ ᎤᏂᎲ ᎢᏳᎾᏛᏗᎢ. ᎾᏍᎩᎾᏃ ᎤᎵᏍᎪᎸᏔᏅᎩ ᎤᏠᏯᏍᏗ ᏅᏰᎵᏛᎩ ᎠᎴ ᎠᏓᏅᏖᏟᏓᏍᏗ ᎠᎴ ᎡᎵᏍᏗ ᏏᏴᏫᎭ ᏂᏚᎾᏓᏛᎾᏕᎬᎩ Ꮎ ᏗᎾᏓᏅᏟ ᎠᏓᏅᏙ ᎬᏗ.
ᎠᏯᏙᎸ 2 ᏂᎦᏛ ᎠᏂᏴᏫ ᎤᎾᏓᏂᏜ ᎾᎢ ᏂᎦᏓ ᎤᏂᎲ ᎢᏳᎾᏛᏁᏗᎢ ᎠᎴ ᏙᎯ ᎠᏕᏗᎢ ᎾᎢ ᏕᎦᎧᏅᎩ ᎯᎠ ᏗᎦᏃᏣᎳᏅᎢ, ᎤᎾᏤᎵᏛ ᏄᎾᏠᏯᏍᏛᎾ ᎾᎢ ᏂᎦᎥᏉ ᏧᏓᎴᏅᏓ, ᏕᎧᏁᎢᏍᏔᏅᎯ ᏧᎾᏓᎴᏅᏓ ᎠᏂᏴᏫ, ᏗᎨᎦᎦᏅᏗᎢ, ᎠᎨᏯ ᎠᎴ ᎠᏍᎦᏯ, ᎦᏬᏂᎯᏍᏗ, ᏧᎾᏁᎵᏗ, ᏗᎾᏙᎩᏯᏍᎩ ᎠᎴ ᏐᎢ ᎾᏂᎵᏍᎬᎩ, ᎠᏰᎵ ᎤᎾᏙᏢᎯ ᎠᎴ ᏴᏫ ᏂᏙᏓᏳᎾᏓᎴᏅᎩ, ᏄᏅᎿᏅ, ᎤᏕᏅ ᎠᎴ ᏐᎢ ᏄᏍᏗᏓᏅᎢ. ᎤᏗᏗᏢᎢᎨᏍᏙᏗ, Ꮭ ᏄᎾᏠᏯᏍᏛᎾ ᎤᏙᎯᏳᏒᎾ ᎾᏍᎩᏃ ᏗᎾᏙᎩᏯᏍᎩ ᏂᏓᏳᎵᏍᏙᏔᏅ ᎾᏍᎩ ᎤᏙᏢᏅᎢ, ᎢᏳᎾᏛᏁᎵᏓᏍᏗ ᎤᏂᎲᎩ ᎠᎴ ᎠᏰᎵ ᏚᎾᏙᏢᏒ ᏄᏍᏗᏓᏅ ᎾᏍᎩ ᏂᎬᎾᏛᎢ ᎠᎴ ᏍᎦᏚᎩ ᎾᏍᎩᏃ ᎠᏂᏴᏫ ᎠᎾᏁᎸᎩ, ᎢᏳᏃᏍᏊ ᎾᏍᎩᎾᎢ ᎤᎾᏓᏤᎵᏓ, ᎦᎨᎦᏅᏘ ᎨᏒᎢ, ᎬᏂᎬᎬ-ᎤᏩᏒ-ᎤᎦᏎᏍᏗᏕᎩ ᎭᏫᏂᏗᏢ ᏂᎦᎥ ᏐᎢ ᏅᏂᏲᎢᏍᏓᏁᎲᎾ ᎾᎢ ᎤᏂᎲ ᎤᎾᏓᏤᎵᎦᏯ.
It is estimated that only around 2,000 Cherokee people speak the language. However, those who do speak the language use the script widely for writing letters, recipes, folktales, diaries, and for personal record-keeping. It is also used in some legal, governmental and religious documents and, in some areas, public signage. Efforts are being made to revive both the language and the script; to that end it is used in a limited capacity in education. Knowledge of the script is considered a prerequisite for full Cherokee citizenship.
ᏣᎳᎩ TˢᵃLᵃGⁱ (tsalagi) Cherokee
The script was developed by a Cherokee named Sequoyah and presented to the Cherokee Nation in 1821. It was popular and most Cherokee were literate in the script by 1828, when Sequoyah and Samuel Worcester reformed the orthography during the process of preparing it for printing.
From the 1870s to the early 1900s, the US government actively suppressed the Cherokee language and culture, sending children away from their parents and creating a generation that was unfamiliar with the language and script. The ultimate result of this policy is that the Cherokee language is now considered endangered to moribund. There are, however, efforts to increase usage, and users are able to use the language and script for social media on mobile devices.
Sources: Scriptsource, Wikipedia.
Cherokee is a syllabary. Letters typically represent a combination of consonants and vowels. See the table to the right for a brief overview of features of modern Cherokee.
Cherokee text runs left to right in horizontal lines.
Words are separated by spaces.
The syllabary has 85 characters, of which 6 represent syllables that start with either no consonant or with ʔ (Ꭰ Ꭱ Ꭲ Ꭳ Ꭴ Ꭵ), and one character represents the non-syllabic consonant sound s (Ꮝ). The rest nominally represent a combination of consonant plus vowel, though the actual practice is a little more nuanced, and there is a degree of vagueness in the script when it comes to phonetically transcribing spoken sounds.
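As a rough check on these figures, the following sketch uses Python's standard `unicodedata` module. It assumes that the 85 characters correspond to the code points U+13A0 to U+13F4, and that the last word of each Unicode character name can serve as a convenient (not authoritative) romanisation. The names `SYLLABARY` and `syllable_names` are hypothetical.

```python
import unicodedata

# Assumption: the 85 syllabary characters are the code points U+13A0..U+13F4.
SYLLABARY = [chr(cp) for cp in range(0x13A0, 0x13F5)]

# The six vowel characters come first; the lone consonant s is U+13CD.
VOWELS = SYLLABARY[:6]
LETTER_S = "\u13CD"


def syllable_names(text):
    """Return the last word of each character's Unicode name, lowercased.

    Only meaningful for Cherokee letters; characters without a Unicode
    name raise ValueError.
    """
    return [unicodedata.name(ch).split()[-1].lower() for ch in text]


print(len(SYLLABARY))                                   # number of characters
print(syllable_names(VOWELS))                           # vowel names
print(unicodedata.name(LETTER_S))                       # the non-syllabic s
print("-".join(syllable_names("\u13E3\u13B3\u13A9")))   # the word for 'Cherokee'
```

Because the Unicode names are fixed labels, this approach cannot show the dropped vowels, aspiration or tones that the script itself leaves unwritten, and some names differ from the romanisation used on this page (for example QUO rather than qo).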
The script doesn't fully represent the sounds of the spoken language. Vowel length is not distinguished; with some exceptions, syllable-final consonants and syllable-initial aspiration are not reflected in the orthography; and the reader has to work out when to drop the vowel of a CV letter to make consonant clusters. Some readers are beginning to use diacritics to indicate pronunciation more accurately.
The spoken language is tonal, but tones are not written.
The script is becoming bicameral, after a long period when syllabic characters resembled uppercase letters.
There is no standard spelling. The way a word is written may vary, according to the pronunciation of the writer, or choices they make for dealing with consonant clusters.
The visual forms of letters don't interact. There are no combining characters or diacritics, and ASCII digits are used.
These are sounds for the Cherokee language.
Phones in a lighter colour are non-native or allophones. Source: Wikipedia.
stop: t d; k ɡ
One syllable is archaic and not used.
Lowercase characters were introduced in Unicode 8.0, to cover growing use of bicameral content in modern typesetting, as well as some older texts such as the Cherokee New Testament. The lowercase text above is likely to be displayed as tofu (boxes), since it is currently difficult to find a font that includes lowercase forms.
It is unusual for the majority of content to be in uppercase, and for lowercase to come in later, and implementers may need to take care in introducing the new characters. For example, Cherokee case-folds to uppercase, rather than to lowercase. For more details see the Unicode Standard.
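The folding behaviour can be observed with Python's built-in string methods. This is a minimal sketch, assuming a Python version whose Unicode database is at version 8.0 or later (3.5 onwards); the helper name `caseless_equal` is hypothetical.

```python
UPPER_A = "\u13A0"  # CHEROKEE LETTER A
LOWER_A = "\uAB70"  # CHEROKEE SMALL LETTER A (added in Unicode 8.0)


def caseless_equal(a, b):
    """Compare two strings using Unicode default case folding."""
    return a.casefold() == b.casefold()


# Case folding maps the small letter to the original (uppercase) letter.
folded = LOWER_A.casefold()
print(f"U+{ord(folded):04X}")

# Both forms therefore compare equal under caseless matching.
print(caseless_equal(UPPER_A, LOWER_A))

# For Latin, lower() and casefold() usually agree; for Cherokee they do not.
print("A".lower() == "A".casefold())
print(UPPER_A.lower() == UPPER_A.casefold())
```

Code that approximates caseless matching by calling `lower()` on both sides still works for Cherokee, but its normalised form is not the one that case folding produces, which matters wherever folded strings are stored or compared across systems.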
The shapes of the upper- vs. lower-cased letters don't change radically (as they do in Latin or Cyrillic). The lowercase letters are often simply smaller, although they may have ascenders and descenders in some fonts.
The six vowel characters, when they appear at the start of a word, represent plain vowel sounds, eg. ᎠᎹ
Elsewhere they represent a syllable starting with ʔ, eg. ᎯᎠ
The vowel in a CV syllable doesn't distinguish between short and long vowel sounds, nor does it indicate tonal values, eg. the following sequence of Cherokee characters represents two different words, each having different lengths and tones (low vs. high, respectively): ᎠᎹ
With one exception, consonant clusters are managed by using a normal syllabic character but ignoring the ('dummy') vowel, eg. ᎦᎵᏉᎩ ᎬᏙᎠ. The character chosen is largely up to the writer, but in some words the choice reflects etymological connections.
The exception is Ꮝ [U+13CD CHEROKEE LETTER S], which is not followed by a vowel, eg. ᏍᎪᎯ
Only 6 syllable pairs distinguish between aspirated and non-aspirated sounds at the start of a syllable.
Only one nasal syllable makes this distinction, ie. compare ᎬᎾ ᎬᎿ
However, ᎬᏂᎭ gv-ni-ha could be either kə̃ːniha (I'm striking it) or kə̃ːhniha (she's striking it).
There are five pairs of characters that make this distinction for stops or affricates: Ꭶ+Ꭷ, Ꮣ+Ꮤ, Ꮥ+Ꮦ, Ꮧ+Ꮨ, Ꮬ+Ꮭ. For example, it is possible to distinguish between the first two syllables of ᎧᎦᎵ but not between the two meanings of ᎪᎳ go-la, ie. koːla (winter) and kʰoːla (bone).
Some manuscripts precede syllables beginning with an s sound with Ꮝ [U+13CD CHEROKEE LETTER S], and Sequoyah spelled his name like that, ie. ᏍᏏᏉᏯ s-si-qo-ya
A character may represent a syllable that ends not only with a vowel but also with ʔ or h, eg. ᏑᏗ and ᏔᎵ are each written with just two characters.
There is one distinctive pair related to syllables ending with h, ie. compare: Ꮎ na Ꮐ nah
Syllables that end with an s sound can be written using Ꮝ [U+13CD CHEROKEE LETTER S], eg. ᎯᏴᏫᏯᏍ
Everson reports that some combining diacritical marks are now used in Cherokee text by ordinary readers and especially children.
These diacritics are in the Unicode Combining Diacritical Marks block. The Cherokee block has no combining characters.
̣ [U+0323 COMBINING DOT BELOW] indicates shifts in consonant readings – such as voiced to voiceless, voiceless to voiced; for example, where Ꭺ is ko, Ꭺ̣ would be kʰo.
̱ [U+0331 COMBINING MACRON BELOW] indicates the dropping of a vowel; for example, Oklahoma could be written ᎣᎦ̱ᎳᎰᎹ o-ga̱-la-ho-ma
When a consonant is both shifted and has its vowel dropped, ̤ [U+0324 COMBINING DIAERESIS BELOW] is used.
Nasalisation is only very rarely marked: in such cases, it can be indicated using ̰ [U+0330 COMBINING TILDE BELOW].
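The sketch below illustrates how these marks sit in the character stream: each one follows the syllabic character it modifies. It uses Python's standard `unicodedata` module; the constant names and the `strip_marks` helper are hypothetical, and removing marks by combining class is only one possible way to recover the plain spelling.

```python
import unicodedata

DOT_BELOW = "\u0323"        # shifted consonant reading
MACRON_BELOW = "\u0331"     # dropped vowel
DIAERESIS_BELOW = "\u0324"  # shifted reading and dropped vowel
TILDE_BELOW = "\u0330"      # nasalisation


def strip_marks(text):
    """Drop characters with a non-zero canonical combining class."""
    return "".join(ch for ch in text if unicodedata.combining(ch) == 0)


# o-ga-la-ho-ma as five syllabic characters, without diacritics.
plain = "\u13A3\u13A6\u13B3\u13B0\u13B9"

# Mark the vowel of the second character (ga) as dropped.
marked = plain[:2] + MACRON_BELOW + plain[2:]

print(marked)
print(len(plain), len(marked))       # the mark is a separate code point
print(strip_marks(marked) == plain)  # plain spelling is recoverable
```

Since the marks are separate code points, searching or sorting text that may contain them is likely to need a step like `strip_marks` so that marked and unmarked spellings of the same word match.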
Spoken Cherokee has tones, but they are not shown in the text.
Linguists who want to show tones do so using standard allocations of combining characters. The following list shows diacritics used to express tones. (Mid is the default, and doesn't need marking.)
Sequoyah, the inventor of the script, created a set of Cherokee numbers, but they were not adopted and are not encoded in Unicode. The shapes of the numbers can be seen on the Omniglot page.
Cherokee text runs left-to-right in horizontal lines.
bidi_class properties for characters in the Cherokee orthography described here.
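The Bidi_Class values can also be inspected programmatically. This is a minimal sketch using Python's standard `unicodedata` module; the function name `bidi_classes` and the sample string are hypothetical.

```python
import unicodedata


def bidi_classes(text):
    """Map each distinct character in text to its Bidi_Class value."""
    return {ch: unicodedata.bidirectional(ch) for ch in text}


# Three uppercase letters, a space, one lowercase letter, a comma,
# an ASCII digit and a full stop.
sample = "\u13E3\u13B3\u13A9 \uAB70, 1."

for ch, cls in bidi_classes(sample).items():
    print(f"U+{ord(ch):04X} {cls}")
```

Both the uppercase and the lowercase Cherokee letters report the strong left-to-right class `L`, which is consistent with the text needing no special directional handling.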
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
You can experiment with examples using the Cherokee character app.
There is no interaction between the glyphs in Cherokee.
Cherokee has no special requirements for baseline alignment between mixed scripts or in general.
Cherokee users would like their fonts to have italic and bold styles, although this is not currently common. These alternate styles would be used in the same way as for the Latin script.
In 2015 a set of lowercase letters was added to version 8.0 of the Unicode repertoire, to complement the original set. This is discussed in more detail in cs.
Applications should provide for transformations between upper and lower case forms. However, the situation is slightly unusual in that pre-existing text is now regarded as uppercase, so transforms need, in some cases, to treat lowercasing as the default operation. The following is from the Unicode Standard:
This exceptional introduction of a lowercase set to change a unicameral encoding into a bicameral encoding has important implications that implementers of the Cherokee script need to keep in mind. First, in order to preserve case folding stability, Cherokee case folds to the previously encoded uppercase letters, rather than to the newly encoded lowercase letters. This exceptional case folding behavior impacts identifiers, and so can trip up implementations if they are not prepared for it. Second, representation of cased Cherokee text requires using the new lowercase letters for most of the body text, instead of just changing a few initial letters to uppercase. That means that representation of traditional text such as the Cherokee New Testament requires substantial re-encoding of the text. Third, the fact that uppercase Cherokee still represents the default and is most widely supported in fonts means that input systems which are extended to support the new lowercase letters face unusual design choices.
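To make the second point concrete, here is a small sketch of what re-encoding legacy text involves. It assumes, purely for illustration, that only the first character of a string keeps its uppercase form; real text would need a decision about which words are capitalised. The function name `recase_legacy` is hypothetical, and a Python version with Unicode 8.0 data or later is assumed.

```python
def recase_legacy(text):
    """Keep the first character as is and lowercase the rest.

    Illustrative rule only: it does not know about sentence
    boundaries or proper nouns.
    """
    return text[:1] + text[1:].lower()


legacy = "\u13E3\u13B3\u13A9"  # a word in the original unicameral encoding
cased = recase_legacy(legacy)

print([f"U+{ord(ch):04X}" for ch in legacy])
print([f"U+{ord(ch):04X}" for ch in cased])

# The legacy spelling is recoverable by uppercasing, and it is also
# the form produced by case folding.
print(cased.upper() == legacy)
print(cased.casefold() == legacy)
```

Every character after the first changes code point, which is why converting a traditional text to cased form amounts to substantial re-encoding rather than a light touch-up.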
Words are separated by spaces.
, [U+002C COMMA]
; [U+003B SEMICOLON]
: [U+003A COLON]
. [U+002E FULL STOP]
? [U+003F QUESTION MARK]
! [U+0021 EXCLAMATION MARK]
Cherokee uses standard Latin punctuation.
In some cases, it has been known for full stops to be raised above the baseline.
( [U+0028 LEFT PARENTHESIS]
) [U+0029 RIGHT PARENTHESIS]
“ [U+201C LEFT DOUBLE QUOTATION MARK]
” [U+201D RIGHT DOUBLE QUOTATION MARK]
‘ [U+2018 LEFT SINGLE QUOTATION MARK]
’ [U+2019 RIGHT SINGLE QUOTATION MARK]
By default, lines are broken at inter-word spaces. As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line.
Show (default) line-breaking properties for characters in the modern Cherokee orthography.
Justification is done, principally, by adjusting the space between words.
This section is for any features that are specific to Cherokee and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.
According to ScriptSource, the Cherokee script is only used for the Cherokee [chr] language.