Cherokee

orthography notes

Updated 25 April, 2024

This page brings together basic information about the Cherokee script and its use for the Cherokee language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Cherokee using Unicode.

It is remarkably difficult to find actual phonetic transcriptions of Cherokee words, and most so-called 'phonetic' transcriptions use one of the Latin orthographies, which also lack the more fine-grained distinctions needed to understand how to pronounce the text properly. Take, for example, the word for apple (ᏒᎦᏔ), which is written svgata in the Latin orthography, but is actually pronounced sə̃̌ːkʰtʰ. Therefore, the pronunciation information here and in the character notes is necessarily somewhat vague.

Referencing this document

Richard Ishida, Cherokee Orthography Notes, 25-Apr-2024, https://r12a.github.io/scripts/cher/chr

 

Phonological transcriptions should be treated as a guide, only. They are taken from the sources consulted, and may be narrow or broad, phonemic or phonetic, depending on what is available. They mostly represent pronunciation of words in isolation. For more detailed information about allophones, alternations, sandhi, dialectal differences, and so on, follow the links to cited references.

This is an interactive document. Click/tap on the following to reveal detailed information and examples for each character: (a) coloured characters in examples and lists; (b) link text on character names. If your browser supports it, your cursor will change as you hover over these items.

More about using this page

Character names. The names of characters in codepoint markup drop the initial CHEROKEE label (purely to reduce the length of the examples). In other places the full name can be found.

Navigation. The Toggle images icon opens the table of contents in a popup window. Dismiss it by clicking on the X alongside it, or by hitting the ESC key.

Detailed character notes. Clicking on coloured characters in lists or on character names opens panels that give detailed information about each character. This information is taken from the companion document, Cherokee Character Notes. (Those panels can be dismissed by pressing on the ESC key.)

Transcriptions & transliterations. Phonological transcriptions are surrounded by ⌈corner brackets⌋, to indicate that they vary between narrow, [phonetic] and broad, /phonemic/ transcriptions.
Latin transcriptions between <angle brackets> represent the letters as commonly written in the Latin script.
A transliteration has also been developed especially for this orthography, and is generally based on the sound of a letter where possible, but where a letter has multiple pronunciations, the transliteration represents only one.
Transliterations provide perfect round-trip conversion between the native script and Latin, whereas Latin transcriptions rarely do.
When you click on an example to see its composition, the top of the panel that opens contains a transliteration, followed by the native text, then (if available) an IPA transcription.

Samples

Cased

Ꭰꮿꮩꮈ 1 Ꮒꭶꮣ ꭰꮒᏼꮻ ꭴꮎꮥꮕꭲ ꭴꮎꮪꮣꮄꮣ ꭰꮄ ꭱꮷꮃꭽꮙ ꮎꭲ ꭰꮲꮙꮩꮧ ꭰꮄ ꭴꮒꮂ ꭲᏻꮎꮫꮧꭲ. Ꮎꮝꭹꮎꮓ ꭴꮅꮝꭺꮈꮤꮕꭹ ꭴꮰꮿꮝꮧ ꮕᏸꮅꮫꭹ ꭰꮄ ꭰꮣꮕꮦꮯꮣꮝꮧ ꭰꮄ ꭱꮅꮝꮧ ꮟᏼꮻꭽ ꮒꮪꮎꮣꮫꮎꮥꭼꭹ ꮎ ꮧꮎꮣꮕꮯ ꭰꮣꮕꮩ ꭼꮧ.

Ꭰꮿꮩꮈ 2 Ꮒꭶꮫ ꭰꮒᏼꮻ ꭴꮎꮣꮒꮬ ꮎꭲ ꮒꭶꮣ ꭴꮒꮂ ꭲᏻꮎꮫꮑꮧꭲ ꭰꮄ ꮩꭿ ꭰꮥꮧꭲ ꮎꭲ ꮥꭶꭷꮕꭹ ꭿꭰ ꮧꭶꮓꮳꮃꮕꭲ, ꭴꮎꮴꮅꮫ ꮔꮎꮰꮿꮝꮫꮎ ꮎꭲ ꮒꭶꭵꮙ ꮷꮣꮄꮕꮣ, ꮥꭷꮑꭲꮝꮤꮕꭿ ꮷꮎꮣꮄꮕꮣ ꭰꮒᏼꮻ, ꮧꭸꭶꭶꮕꮧꭲ, ꭰꭸꮿ ꭰꮄ ꭰꮝꭶꮿ, ꭶꮼꮒꭿꮝꮧ, ꮷꮎꮑꮅꮧ, ꮧꮎꮩꭹꮿꮝꭹ ꭰꮄ ꮠꭲ ꮎꮒꮅꮝꭼꭹ, ꭰᏸꮅ ꭴꮎꮩꮲꭿ ꭰꮄ ᏼꮻ ꮒꮩꮣᏻꮎꮣꮄꮕꭹ, ꮔꮕꮏꮕ, ꭴꮥꮕ ꭰꮄ ꮠꭲ ꮔꮝꮧꮣꮕꭲ. Ꭴꮧꮧꮲꭲꭸꮝꮩꮧ, ꮭ ꮔꮎꮰꮿꮝꮫꮎ ꭴꮩꭿᏻꮢꮎ ꮎꮝꭹꮓ ꮧꮎꮩꭹꮿꮝꭹ ꮒꮣᏻꮅꮝꮩꮤꮕ ꮎꮝꭹ ꭴꮩꮲꮕꭲ, ꭲᏻꮎꮫꮑꮅꮣꮝꮧ ꭴꮒꮂꭹ ꭰꮄ ꭰᏸꮅ ꮪꮎꮩꮲꮢ ꮔꮝꮧꮣꮕ ꮎꮝꭹ ꮒꭼꮎꮫꭲ ꭰꮄ ꮝꭶꮪꭹ ꮎꮝꭹꮓ ꭰꮒᏼꮻ ꭰꮎꮑꮈꭹ, ꭲᏻꮓꮝꮚ ꮎꮝꭹꮎꭲ ꭴꮎꮣꮴꮅꮣ, ꭶꭸꭶꮕꮨ ꭸꮢꭲ, ꭼꮒꭼꭼ-ꭴꮹꮢ-ꭴꭶꮞꮝꮧꮥꭹ ꭽꮻꮒꮧꮲ ꮒꭶꭵ ꮠꭲ ꮕꮒᏺꭲꮝꮣꮑꮂꮎ ꮎꭲ ꭴꮒꮂ ꭴꮎꮣꮴꮅꭶꮿ.

Uncased

ᎠᏯᏙᎸ 1 ᏂᎦᏓ ᎠᏂᏴᏫ ᎤᎾᏕᏅᎢ ᎤᎾᏚᏓᎴᏓ ᎠᎴ ᎡᏧᎳᎭᏉ ᎾᎢ ᎠᏢᏉᏙᏗ ᎠᎴ ᎤᏂᎲ ᎢᏳᎾᏛᏗᎢ. ᎾᏍᎩᎾᏃ ᎤᎵᏍᎪᎸᏔᏅᎩ ᎤᏠᏯᏍᏗ ᏅᏰᎵᏛᎩ ᎠᎴ ᎠᏓᏅᏖᏟᏓᏍᏗ ᎠᎴ ᎡᎵᏍᏗ ᏏᏴᏫᎭ ᏂᏚᎾᏓᏛᎾᏕᎬᎩ Ꮎ ᏗᎾᏓᏅᏟ ᎠᏓᏅᏙ ᎬᏗ.

ᎠᏯᏙᎸ 2 ᏂᎦᏛ ᎠᏂᏴᏫ ᎤᎾᏓᏂᏜ ᎾᎢ ᏂᎦᏓ ᎤᏂᎲ ᎢᏳᎾᏛᏁᏗᎢ ᎠᎴ ᏙᎯ ᎠᏕᏗᎢ ᎾᎢ ᏕᎦᎧᏅᎩ ᎯᎠ ᏗᎦᏃᏣᎳᏅᎢ, ᎤᎾᏤᎵᏛ ᏄᎾᏠᏯᏍᏛᎾ ᎾᎢ ᏂᎦᎥᏉ ᏧᏓᎴᏅᏓ, ᏕᎧᏁᎢᏍᏔᏅᎯ ᏧᎾᏓᎴᏅᏓ ᎠᏂᏴᏫ, ᏗᎨᎦᎦᏅᏗᎢ, ᎠᎨᏯ ᎠᎴ ᎠᏍᎦᏯ, ᎦᏬᏂᎯᏍᏗ, ᏧᎾᏁᎵᏗ, ᏗᎾᏙᎩᏯᏍᎩ ᎠᎴ ᏐᎢ ᎾᏂᎵᏍᎬᎩ, ᎠᏰᎵ ᎤᎾᏙᏢᎯ ᎠᎴ ᏴᏫ ᏂᏙᏓᏳᎾᏓᎴᏅᎩ, ᏄᏅᎿᏅ, ᎤᏕᏅ ᎠᎴ ᏐᎢ ᏄᏍᏗᏓᏅᎢ. ᎤᏗᏗᏢᎢᎨᏍᏙᏗ, Ꮭ ᏄᎾᏠᏯᏍᏛᎾ ᎤᏙᎯᏳᏒᎾ ᎾᏍᎩᏃ ᏗᎾᏙᎩᏯᏍᎩ ᏂᏓᏳᎵᏍᏙᏔᏅ ᎾᏍᎩ ᎤᏙᏢᏅᎢ, ᎢᏳᎾᏛᏁᎵᏓᏍᏗ ᎤᏂᎲᎩ ᎠᎴ ᎠᏰᎵ ᏚᎾᏙᏢᏒ ᏄᏍᏗᏓᏅ ᎾᏍᎩ ᏂᎬᎾᏛᎢ ᎠᎴ ᏍᎦᏚᎩ ᎾᏍᎩᏃ ᎠᏂᏴᏫ ᎠᎾᏁᎸᎩ, ᎢᏳᏃᏍᏊ ᎾᏍᎩᎾᎢ ᎤᎾᏓᏤᎵᏓ, ᎦᎨᎦᏅᏘ ᎨᏒᎢ, ᎬᏂᎬᎬ-ᎤᏩᏒ-ᎤᎦᏎᏍᏗᏕᎩ ᎭᏫᏂᏗᏢ ᏂᎦᎥ ᏐᎢ ᏅᏂᏲᎢᏍᏓᏁᎲᎾ ᎾᎢ ᎤᏂᎲ ᎤᎾᏓᏤᎵᎦᏯ.

Source: Unicode UDHR, articles 1 & 2. Cased version generated by hand from that.

Usage & history

It is estimated that only around 2,000 Cherokee people speak the language. However, those who do speak the language use the script widely for writing letters, recipes, folktales, diaries, and for personal record-keeping. It is also used in some legal, governmental and religious documents and, in some areas, public signage. Efforts are being made to revive both the language and the script; to that end it is used in a limited capacity in education. Knowledge of the script is considered a prerequisite for full Cherokee citizenship.6

ᏣᎳᎩ tsalagi Cherokee

The script was developed by a Cherokee named Sequoyah and presented to the Cherokee Nation in 1821. It was popular and most Cherokee were literate in the script by 1828, when Sequoyah and Samuel Worcester reformed the orthography during the process of preparing it for printing.

From the 1870s to the early 1900s, the US government actively suppressed the Cherokee language and culture, sending children away from their parents and creating a generation that was unfamiliar with the language and script. The ultimate result of this policy is that the Cherokee language is now considered endangered to moribund. There are, however, efforts to increase usage, and users are able to use the language and script for social media on mobile devices.

Sources: Scriptsource, Wikipedia.

Script code: cher
Language code: chr
Script type: syllabary
Origin: nam
Native speakers: 1,520

Total characters: 184
Letters: 170
Combining marks: 10
Punctuation: 4
Possible other: 14
Unicode blocks: 2

Character counts above are for this orthography but exclude ASCII.

Text direction: ltr
Post-consonant vowels: letters
Standalone vowels:
Case distinction: yes
Cursive script: no
Combining marks: no
Clusters marked: no
Other ligatures: no
Word separator: space
Wraps at: word
Hyphenation: ?
G Clusters OK?: yes
Justification: spaces
Baseline: romn

Basic features

Cherokee is a syllabary. Letters typically represent a combination of consonants and vowels. See the table to the right for a brief overview of features of modern Cherokee.

Cherokee text runs left to right in horizontal lines. Words are separated by spaces.

The script is becoming bicameral, after a long period when syllabic characters resembled uppercase letters.

❯ Syllables

The Cherokee syllabary has 85 characters, of which 6 represent syllables that start with either no consonant or with ʔ (Ꭰ Ꭱ Ꭲ Ꭳ Ꭴ Ꭵ), and one character represents the non-syllabic consonant sound s (Ꮝ). The rest nominally represent a combination of consonant plus vowel, though the actual practice is a little more nuanced, and there is a degree of vagueness in the script when it comes to phonetically transcribing spoken sounds.1 It is a simple syllabary where letter shapes don't follow any systematic pattern.
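As a rough cross-check of these counts, the following sketch groups the letters by the syllable given in their Unicode character names. It assumes that the letters in current use occupy U+13A0 to U+13F4 (the code points listed in the character index, leaving out the archaic MV at U+13F5) and that Python's `unicodedata` module carries those names; the function name `classify_letters` is purely illustrative.

```python
import unicodedata

def classify_letters():
    """Group the original Cherokee letters by the syllable in their Unicode name."""
    groups = {"vowel": [], "bare consonant": [], "consonant+vowel": []}
    # Assumption: U+13A0..U+13F4 are the letters in current use (U+13F5 MV is archaic).
    for cp in range(0x13A0, 0x13F5):
        ch = chr(cp)
        syllable = unicodedata.name(ch).replace("CHEROKEE LETTER ", "")
        if syllable in {"A", "E", "I", "O", "U", "V"}:
            groups["vowel"].append(ch)
        elif syllable == "S":
            groups["bare consonant"].append(ch)
        else:
            groups["consonant+vowel"].append(ch)
    return groups

groups = classify_letters()
for kind, letters in groups.items():
    print(kind, len(letters))
print("total", sum(len(letters) for letters in groups.values()))
```

The grouping relies only on character names, so it says nothing about how a given letter is actually pronounced in context.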

The script doesn't fully represent the sounds of the spoken language. Vowel length is not distinguished. With some exceptions, syllable-final consonants and syllable-initial aspiration are not reflected in the orthography, and the user has to figure out when to drop the vowel of a CV letter to make consonant clusters. Some readers are beginning to use diacritics to indicate pronunciation more accurately.

The spoken language is tonal, but tones are not written. A set of diacritics exists, however, to enable linguists to indicate tones.

There is no standard spelling. The way a word is written may vary, according to the pronunciation of the writer, or choices they make for dealing with consonant clusters.

The visual forms of letters don't interact. There are no combining characters or diacritics, and ASCII digits are used.

Character index

The index points to locations where a character is mentioned in this page, and indicates whether it is used by the Cherokee orthography described here.

Letters

Basic syllables

Uppercase letters (79)
13D3
CHEROKEE LETTER DAsyllable ta da
13D4
CHEROKEE LETTER TAsyllable tʰa ta
13D5
CHEROKEE LETTER DEsyllable te de
13D6
CHEROKEE LETTER TEsyllable tʰe te
13D7
CHEROKEE LETTER DIsyllable ti di
13D8
CHEROKEE LETTER TIsyllable tʰi ti
13D9
CHEROKEE LETTER DOsyllable to tʰo do
13DA
CHEROKEE LETTER DUsyllable tu tʰu du
13DB
CHEROKEE LETTER DVsyllable tə̃ tʰə̃ dv
13DC
CHEROKEE LETTER DLAsyllable tˡa dla
13DD
CHEROKEE LETTER TLAsyllable tˡʰa tla
13DE
CHEROKEE LETTER TLEsyllable tˡe tʰˡe tle
13DF
CHEROKEE LETTER TLIsyllable tˡi tʰˡi tli
13E0
CHEROKEE LETTER TLOsyllable tˡo tʰˡo tlo
13E1
CHEROKEE LETTER TLUsyllable tˡu tʰˡu tlu
13E2
CHEROKEE LETTER TLVsyllable tˡə̃ tʰˡə̃ tlv
13A6
CHEROKEE LETTER GAsyllable ka ga
13A7
CHEROKEE LETTER KAsyllable kʰa ka
13A8
CHEROKEE LETTER GEsyllable ke kʰe ge
13A9
CHEROKEE LETTER GIsyllable ki kʰi gi
13AA
CHEROKEE LETTER GOsyllable ko kʰo go
13AB
CHEROKEE LETTER GUsyllable ku kʰu gu
13AC
CHEROKEE LETTER GVsyllable kə̃ kʰə̃ gv
13C6
CHEROKEE LETTER QUAsyllable kʷa kʰw̥a qua
13C7
CHEROKEE LETTER QUEsyllable kʷe kʰw̥e que
13C8
CHEROKEE LETTER QUIsyllable kʷi kʰw̥i qui
13C9
CHEROKEE LETTER QUOsyllable kʷo kʰw̥o quo
13CA
CHEROKEE LETTER QUUsyllable kʷu kʰw̥u quu
13CB
CHEROKEE LETTER QUVsyllable kʷə̃ kʰw̥ə̃ quv
13E3
CHEROKEE LETTER TSAsyllable t͡sa t͡ʒa t͡ʰʃa tsa
13E4
CHEROKEE LETTER TSEsyllable t͡se t͡ʒe t͡ʰʃe tse
13E5
CHEROKEE LETTER TSIsyllable t͡si t͡ʒi t͡ʰʃi tsi
13E6
CHEROKEE LETTER TSOsyllable t͡so t͡ʒo t͡ʰʃo tso
13E7
CHEROKEE LETTER TSUsyllable t͡su t͡ʒu t͡ʰʃu tsu
13E8
CHEROKEE LETTER TSVsyllable t͡sə̃ t͡ʒə̃ t͡ʰʃə̃ tsv
13CC
CHEROKEE LETTER SAsyllable sa sa
13CD
CHEROKEE LETTER Ssyllable s s
13CE
CHEROKEE LETTER SEsyllable se se
13CF
CHEROKEE LETTER SIsyllable si si
13D0
CHEROKEE LETTER SOsyllable so so
13D1
CHEROKEE LETTER SUsyllable su su
13D2
CHEROKEE LETTER SVsyllable sə̃ sv
13AD
CHEROKEE LETTER HAsyllable ha ha
13AE
CHEROKEE LETTER HEsyllable he he
13AF
CHEROKEE LETTER HIsyllable hi hi
13B0
CHEROKEE LETTER HOsyllable ho ho
13B1
CHEROKEE LETTER HUsyllable hu hu
13B2
CHEROKEE LETTER HVsyllable hə̃ hv
13B9
CHEROKEE LETTER MAsyllable ma ma
13BA
CHEROKEE LETTER MEsyllable me me
13BB
CHEROKEE LETTER MIsyllable mi mi
13BC
(rare)    CHEROKEE LETTER MOsyllable mo mo
13BD
(rare)    CHEROKEE LETTER MUsyllable mu mu
13BE
CHEROKEE LETTER NAsyllable na na
13BF
CHEROKEE LETTER HNAsyllable hn̥a hna
13C0
(infrequent)    CHEROKEE LETTER NAHsyllable nah nah
13C1
CHEROKEE LETTER NEsyllable ne hn̥e ne
13C2
CHEROKEE LETTER NIsyllable ni hn̥i ni
13C3
CHEROKEE LETTER NOsyllable no hn̥o no
13C4
CHEROKEE LETTER NUsyllable nu hn̥u nu
13C5
CHEROKEE LETTER NVsyllable nə̃ hn̥ə̃ nv
13E9
CHEROKEE LETTER WAsyllable wa hwa wa
13EA
CHEROKEE LETTER WEsyllable we hwe we
13EB
CHEROKEE LETTER WIsyllable wi hwi wi
13EC
CHEROKEE LETTER WOsyllable wo hwo wo
13ED
CHEROKEE LETTER WUsyllable wu hwu wu
13EE
CHEROKEE LETTER WVsyllable wə̃ hwə̃ wv
13B3
CHEROKEE LETTER LAsyllable la ɬa la
13B4
CHEROKEE LETTER LEsyllable le ɬe le
13B5
CHEROKEE LETTER LIsyllable li ɬi li
13B6
CHEROKEE LETTER LOsyllable lo ɬo lo
13B7
CHEROKEE LETTER LUsyllable lu ɬu lu
13B8
CHEROKEE LETTER LVsyllable lə̃ ɬə̃ lv
13EF
CHEROKEE LETTER YAsyllable ja hja ya
13F0
CHEROKEE LETTER YEsyllable je hje ye
13F1
CHEROKEE LETTER YIsyllable ji hji yi
13F2
CHEROKEE LETTER YOsyllable jo hjo yo
13F3
CHEROKEE LETTER YUsyllable ju hju yu
13F4
CHEROKEE LETTER YVsyllable jə̃ hjə̃ yv
Lowercase letters (79)
ABA3
CHEROKEE SMALL LETTER DAsyllable ta Da
ABA4
CHEROKEE SMALL LETTER TAsyllable tʰa Ta
ABA5
CHEROKEE SMALL LETTER DEsyllable te De
ABA6
CHEROKEE SMALL LETTER TEsyllable tʰe Te
ABA7
CHEROKEE SMALL LETTER DIsyllable ti Di
ABA8
CHEROKEE SMALL LETTER TIsyllable tʰi Ti
ABA9
CHEROKEE SMALL LETTER DOsyllable to tʰo Do
ABAA
CHEROKEE SMALL LETTER DUsyllable tu tʰu Du
ABAB
CHEROKEE SMALL LETTER DVsyllable tə̃ tʰə̃ Dv
ABAC
CHEROKEE SMALL LETTER DLAsyllable tˡa Dla
ABAD
CHEROKEE SMALL LETTER TLAsyllable tˡʰa Tla
ABAE
CHEROKEE SMALL LETTER TLEsyllable tˡe tʰˡe Tle
ABAF
CHEROKEE SMALL LETTER TLIsyllable tˡi tʰˡi Tli
ABB0
CHEROKEE SMALL LETTER TLOsyllable tˡo tʰˡo Tlo
ABB1
CHEROKEE SMALL LETTER TLUsyllable tˡu tʰˡu Tlu
ABB2
CHEROKEE SMALL LETTER TLVsyllable tˡə̃ tʰˡə̃ Tlv
AB76
CHEROKEE SMALL LETTER GAsyllable ka Ga
AB77
CHEROKEE SMALL LETTER KAsyllable kʰa Ka
AB78
CHEROKEE SMALL LETTER GEsyllable ke kʰe Ge
AB79
CHEROKEE SMALL LETTER GIsyllable ki kʰi Gi
AB7A
CHEROKEE SMALL LETTER GOsyllable ko kʰo Go
AB7B
CHEROKEE SMALL LETTER GUsyllable ku kʰu Gu
AB7C
CHEROKEE SMALL LETTER GVsyllable kə̃ kʰə̃ Gv
AB96
CHEROKEE SMALL LETTER QUAsyllable kʷa kʰw̥a Qua
AB97
CHEROKEE SMALL LETTER QUEsyllable kʷe kʰw̥e Que
AB98
CHEROKEE SMALL LETTER QUIsyllable kʷi kʰw̥i Qui
AB99
CHEROKEE SMALL LETTER QUOsyllable kʷo kʰw̥o Quo
AB9A
CHEROKEE SMALL LETTER QUUsyllable kʷu kʰw̥u Quu
AB9B
CHEROKEE SMALL LETTER QUVsyllable kʷə̃ kʰw̥ə̃ Quv
ABB3
CHEROKEE SMALL LETTER TSAsyllable t͡sa t͡ʒa t͡ʰʃa Tsa
ABB4
CHEROKEE SMALL LETTER TSEsyllable t͡se t͡ʒe t͡ʰʃe Tse
ABB5
CHEROKEE SMALL LETTER TSIsyllable t͡si t͡ʒi t͡ʰʃi Tsi
ABB6
CHEROKEE SMALL LETTER TSOsyllable t͡so t͡ʒo t͡ʰʃo Tso
ABB7
CHEROKEE SMALL LETTER TSUsyllable t͡su t͡ʒu t͡ʰʃu Tsu
ABB8
CHEROKEE SMALL LETTER TSVsyllable t͡sə̃ t͡ʒə̃ t͡ʰʃə̃ Tsv
AB9C
CHEROKEE SMALL LETTER SAsyllable sa Sa
AB9D
CHEROKEE SMALL LETTER Ssyllable s S
AB9E
CHEROKEE SMALL LETTER SEsyllable se Se
AB9F
CHEROKEE SMALL LETTER SIsyllable si Si
ABA0
CHEROKEE SMALL LETTER SOsyllable so So
ABA1
CHEROKEE SMALL LETTER SUsyllable su Su
ABA2
CHEROKEE SMALL LETTER SVsyllable sə̃ Sv
AB7D
CHEROKEE SMALL LETTER HAsyllable ha Ha
AB7E
CHEROKEE SMALL LETTER HEsyllable he He
ꭿAB7F
CHEROKEE SMALL LETTER HIsyllable hi Hi
AB80
CHEROKEE SMALL LETTER HOsyllable ho Ho
AB81
CHEROKEE SMALL LETTER HUsyllable hu Hu
AB82
CHEROKEE SMALL LETTER HVsyllable hə̃ Hv
AB89
CHEROKEE SMALL LETTER MAsyllable ma Ma
AB8A
CHEROKEE SMALL LETTER MEsyllable me Me
AB8B
CHEROKEE SMALL LETTER MIsyllable mi Mi
AB8C
(rare)    CHEROKEE SMALL LETTER MOsyllable mo Mo
AB8D
(rare)    CHEROKEE SMALL LETTER MUsyllable mu Mu
AB8E
CHEROKEE SMALL LETTER NAsyllable na Na
AB8F
CHEROKEE SMALL LETTER HNAsyllable hn̥a Hna
AB90
(infrequent)    CHEROKEE SMALL LETTER NAHsyllable nah Nah
AB91
CHEROKEE SMALL LETTER NEsyllable ne hn̥e Ne
AB92
CHEROKEE SMALL LETTER NIsyllable ni hn̥i Ni
AB93
CHEROKEE SMALL LETTER NOsyllable no hn̥o No
AB94
CHEROKEE SMALL LETTER NUsyllable nu hn̥u Nu
AB95
CHEROKEE SMALL LETTER NVsyllable nə̃ hn̥ə̃ Nv
ABB9
CHEROKEE SMALL LETTER WAsyllable wa hwa Wa
ABBA
CHEROKEE SMALL LETTER WEsyllable we hwe We
ABBB
CHEROKEE SMALL LETTER WIsyllable wi hwi Wi
ABBC
CHEROKEE SMALL LETTER WOsyllable wo hwo Wo
ABBD
CHEROKEE SMALL LETTER WUsyllable wu hwu Wu
ABBE
CHEROKEE SMALL LETTER WVsyllable wə̃ hwə̃ Wv
AB83
CHEROKEE SMALL LETTER LAsyllable la ɬa La
AB84
CHEROKEE SMALL LETTER LEsyllable le ɬe Le
AB85
CHEROKEE SMALL LETTER LIsyllable li ɬi Li
AB86
CHEROKEE SMALL LETTER LOsyllable lo ɬo Lo
AB87
CHEROKEE SMALL LETTER LUsyllable lu ɬu Lu
AB88
CHEROKEE SMALL LETTER LVsyllable lə̃ ɬə̃ Lv
ꮿABBF
CHEROKEE SMALL LETTER YAsyllable ja hja Ya
13F8
CHEROKEE SMALL LETTER YEsyllable je hje Ye
13F9
CHEROKEE SMALL LETTER YIsyllable ji hji Yi
13FA
CHEROKEE SMALL LETTER YOsyllable jo hjo Yo
13FB
CHEROKEE SMALL LETTER YUsyllable ju hju Yu
13FC
CHEROKEE SMALL LETTER YVsyllable jə̃ hjə̃ Yv

Vowels

Uppercase letters (6)
13A0
CHEROKEE LETTER Asyllable a ʔa a
13A1
CHEROKEE LETTER Esyllable e ʔe e
13A2
CHEROKEE LETTER Isyllable i ʔi i
13A3
CHEROKEE LETTER Osyllable o ʔo o
13A4
CHEROKEE LETTER Usyllable u ʔu u
13A5
CHEROKEE LETTER Vsyllable ə̃ ʔə̃ v
Lowercase letters (6)
AB70
CHEROKEE SMALL LETTER Asyllable a ʔa A
AB71
CHEROKEE SMALL LETTER Esyllable e ʔe E
AB72
CHEROKEE SMALL LETTER Isyllable i ʔi I
AB73
CHEROKEE SMALL LETTER Osyllable o ʔo O
AB74
CHEROKEE SMALL LETTER Usyllable u ʔu U
AB75
CHEROKEE SMALL LETTER Vsyllable ə̃ ʔə̃ V

Not used for Cherokee

13F5
(archaic)    CHEROKEE LETTER MVsyllable archaic mə̃ mv
13FD
(archaic)    CHEROKEE SMALL LETTER MVsyllable archaic mə̃ Mv

Combining marks

̣0323
(infrequent)    COMBINING DOT BELOWconsonant shift
̱0331
(infrequent)    COMBINING MACRON BELOWvowel-killer
̤0324
(infrequent)    COMBINING DIAERESIS BELOWconsonant shift + vowel killer
̰0330
(infrequent)    COMBINING TILDE BELOWnasalisation ̃ ̃
̀0300
(rare)    COMBINING GRAVE ACCENTlow tone mark i ˨ ̀
́0301
(rare)    COMBINING ACUTE ACCENThigh tone mark i ˦ ́
̂0302
(rare)    COMBINING CIRCUMFLEX ACCENTfalling tone mark i ˦˨ ̂
̄0304
(rare)    COMBINING MACRONmid tone mark i ˧ ̄
̋030B
(rare)    COMBINING DOUBLE ACUTE ACCENTsuper high tone mark i ˥ ̋
̌030C
(rare)    COMBINING CARONrising tone mark i ˨˦ ̌

Punctuation

2018
LEFT SINGLE QUOTATION MARKquotation mark
2019
RIGHT SINGLE QUOTATION MARKquotation mark
201C
LEFT DOUBLE QUOTATION MARKquotation mark
201D
RIGHT DOUBLE QUOTATION MARKquotation mark

ASCII

(0028
LEFT PARENTHESISparenthesis (
)0029
RIGHT PARENTHESISparenthesis )
,002C
COMMAcomma ,
.002E
FULL STOPfull stop .
:003A
COLONcolon :
;003B
SEMICOLONsemicolon ;
?003F
QUESTION MARKquestion mark ?
!0021
EXCLAMATION MARKexclamation mark !

Other

To be investigated

%0025
(tbc)    PERCENT SIGNpercentage mark
[005B
(tbc)    LEFT SQUARE BRACKETbracket [
]005D
(tbc)    RIGHT SQUARE BRACKETbracket ]
§00A7
(tbc)    SECTION SIGNsection sign §
ʼ02BC
(tbc)    MODIFIER LETTER APOSTROPHEapostrophe ʼ
2011
(tbc)    NON-BREAKING HYPHENnon-breaking hyphen
2013
(tbc)    EN DASHen dash
2014
(tbc)    EM DASHem dash
2020
(tbc)    DAGGERdagger
2021
(tbc)    DOUBLE DAGGERdouble dagger
2026
(tbc)    HORIZONTAL ELLIPSISellipsis
2030
(tbc)    PER MILLE SIGNper mille mark
2032
(tbc)    PRIMEprime
2033
(tbc)    DOUBLE PRIMEdouble prime

Phonology

These are sounds for the Cherokee language.

Phones in a lighter colour are non-native or allophones. Source: Wikipedia.

Vowel sounds

i u e o ə̃ ə̃ ə̃ː ə̃ː a

Consonant sounds

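Manner | Sounds (place of articulation)
stop | t (alveolar), k (velar), ʔ (glottal)
labialised stop | kw̥, kʰw̥ (velar)
affricate | t͡s, tʰ͡s, t͡ʃ, tʰ͡ʃ, t͡l, t͡ɬ
fricative | s, ɬ (alveolar), h (glottal)
nasal | m (labial), n (alveolar)
approximant | w (labial), l (alveolar), j (palatal)
trill/flap | none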

Although some transcriptions suggest it, Cherokee doesn't contrast voiced vs. unvoiced stops and affricates; all are unvoiced. Instead, it contrasts aspirated and unaspirated forms334.

t͡s is pronounced t͡ʃ by some speakers. This also applies to the aspirated forms.

The glottal stop appears between vowels, and also sometimes between a vowel and a consonant, though less frequently. However, it is not written when using the syllabary, and so minimal pairs may be spelled the same, eg.
ᎪᎢ kòʔi 'oil, grease'
ᎠᏓ àta / áʔta 'wood' / 'young animal'

Tone

Cherokee has 6 tones, 2 level and 4 contoured. They are shown in the table with Latin transcriptions used by Scancarelli (1986), Montgomery-Anderson (2008,2015), and Feeling (2003)/Uchihara (2016), respectively10.

IPA | short | long | name
˨ | à, a, a | à:, aa, aa | low
˧ | á, á, á | á:, áa, áá | high
˨˧ | | ǎ:, aá, aá | rising
˧˨ | | â:, áà, áà | falling
˨˩ | | ȁ:, aà, àà, àa | lowfall
˧˦ | | a̋:, áá, aa̋ | superhigh

The level tones can appear with short or long vowels, but the others only occur with long vowels.

Tones are not marked in the Cherokee orthography, although this doesn't often create ambiguity10. Nor are they used in the Latin orthographies, except in dictionaries. See also Tone marks.

For the superhigh tone, Wikipedia says: The superhigh tone, also called "highfall" by Montgomery-Anderson, has a distinctive morphosyntactical function, primarily appearing on adjectives, nouns derived from verbs, and on subordinate verbs. It is mobile and falls on the rightmost long vowel. If the final short vowel is dropped and the superhigh tone becomes in word-final position, it is shortened and pronounced like a slightly higher final tone (notated as a̋ in most orthographies). There can only be one superhigh tone per word, constraint not shared by the other tones. For these reasons, this contour exhibits some accentual properties and has been referred to as an "accent" (or stress) in the literature.

Phonological processes

Cherokee, like many other North American languages, builds words into short phrases by adding prefixes and suffixes to the word root. A number of phonological changes are applied during this process. Typically, but not always, the result of these changes is captured by the orthography.

Vowel deletion

In certain circumstances, vowels in the underlying morphemic model are dropped when particles are added to a word root. The rule is as follows365:

t, k, j, w, y, n, kw, l + short vowel + h + plosive or vowel → removal of the short vowel and aspiration of the initial letter in the sequence

For example:

Cherokee | Transcription | Components | IPA
ᏫᎦᏘ | wi-ga-di | wi-hi-kah.thi | w̥ì.kʰtʰi
ᏫᏥᎦᏔ | wi-tsi-ga-ta | wi-tsi-kah-tha | wìt͡sìkâːtʰa
ᏂᏪᏓᏍ | ni-we-da-s | ni-wi-hi-eetas | nìw̥èːtàs
Three phrases: 'You're heading there', 'I'm heading there', and 'Are you already going?'. The first and third undergo vowel deletion; the second does not.

This phonological change is generally reflected in the orthography. However, the Ꭶ in the first example is still a non-aspirated letter.

Metathesis

tbd

Syllables

Cherokee syllabic letters don't provide all the information needed to detect the underlying sounds if you are not familiar with the language. Features that are not generally expressed by the orthography include syllable-initial aspiration, syllable-final consonants, vowel length and unpronounced vowels, and tone. These are described in more detail below. Montgomery-Anderson lists the following possible sounds that are represented by U+13D9 LETTER DO:
tò, tó, tòː, tóː, tǒː, tôː, tȍː, tőː, tòh, tóh, tòʔ, tóʔ,
tʰò, tʰó, tʰòː, tʰóː, tʰǒː, tʰôː, tʰȍː, tʰőː, tʰòh, tʰóh, tʰòʔ, tʰóʔ
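The list above follows a regular pattern: two possible onsets, each combined with a short vowel in one of two tones, a long vowel in one of six tones, or a short vowel followed by h or ʔ. The sketch below rebuilds the list from that pattern. It is only an illustration of the combinatorics; the names `SHORT_O`, `LONG_O` and `candidate_readings` are invented here and do not come from Montgomery-Anderson.

```python
# Tone-marked forms of the vowel o, copied from the list above.
SHORT_O = ["ò", "ó"]                      # low and high tone on a short vowel
LONG_O = ["ò", "ó", "ǒ", "ô", "ȍ", "ő"]   # six tones, each followed by a length mark

def candidate_readings(onsets=("t", "tʰ")):
    """Enumerate the readings that a single written DO letter may stand for."""
    readings = []
    for onset in onsets:
        readings += [onset + v for v in SHORT_O]        # short vowel
        readings += [onset + v + "ː" for v in LONG_O]   # long vowel
        readings += [onset + v + "h" for v in SHORT_O]  # unwritten final h
        readings += [onset + v + "ʔ" for v in SHORT_O]  # unwritten final glottal stop
    return readings

readings = candidate_readings()
print(len(readings))
print(", ".join(readings))
```

Each onset contributes 2 + 6 + 2 + 2 = 12 readings, which accounts for the two rows of twelve forms shown above.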

The realisation of Cherokee sounds is also often affected by phonological rules that are determined by context – creating another step away from the sounds implied by the syllabic letters.

In addition, vowels at the end of a word or in syllable onsets may be dropped, eg. ᏥᎩᎵ t͡ski.li ghost

As mentioned in the introduction, it is difficult to find precise information about how Cherokee syllables and words are pronounced, so while we try to provide what phonetic information we can here, most of the transcriptions are in one of the rather imprecise Latin orthographies. There are also different versions of the Latin orthographies, some preferring to make a distinction between d and t, whereas others map to t and th.
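One easily reproduced Latin spelling is the syllable embedded in each letter's Unicode character name, which follows the d/t convention. The sketch below spells out a word that way. It is an assumption-laden illustration rather than a pronunciation guide: `latin_transcription` is a made-up helper name, and the output ignores aspiration, vowel length, tone and dropped vowels, just as the syllabary itself does.

```python
import unicodedata

def latin_transcription(text, sep="-"):
    """Spell each Cherokee letter with the syllable from its Unicode name."""
    out = []
    prev_cherokee = False
    for ch in text:
        name = unicodedata.name(ch, "")
        is_cherokee = name.startswith("CHEROKEE")
        if is_cherokee:
            if prev_cherokee:
                out.append(sep)  # separate adjacent syllables only
            # "CHEROKEE LETTER TSA" or "CHEROKEE SMALL LETTER TSA" -> "tsa"
            out.append(name.rsplit(" ", 1)[-1].lower())
        else:
            out.append(ch)       # spaces, digits and punctuation pass through
        prev_cherokee = is_cherokee
    return "".join(out)

print(latin_transcription("\u13E3\u13B3\u13A9"))        # the word for 'Cherokee'
print(latin_transcription("\u13CD\u13CF\u13C9\u13EF"))  # Sequoyah's spelling of his name
```

For the word for 'apple' mentioned in the introduction, this produces the syllables sv, ga and ta, which is a long way from the actual pronunciation.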

Vowel syllables


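IPA | Transliteration | Uppercase | Lowercase
i ʔi | I | U+13A2 | U+AB72
u ʔu | U | U+13A4 | U+AB74
e ʔe | E | U+13A1 | U+AB71
o ʔo | O | U+13A3 | U+AB73
ə̃ ʔə̃ | V | U+13A5 | U+AB75
a ʔa | A | U+13A0 | U+AB70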

The six vowel characters, when they appear at the start of a word, represent plain vowel sounds. They may be short or long, and will be modified by tone, but none of those things is expressed by the orthography, eg. ᎠᎹ ama/aːma water/salt

Elsewhere they represent a syllable starting with ʔ,1 eg. ᎯᎠ hiʔa this

Vowel length and tone

The vowel in a CV syllable doesn't distinguish between short and long vowel sounds, nor does it indicate tonal values, eg. the following sequence of Cherokee characters represents two different words, each having different lengths and tones (low vs. high, respectively)1: ᎠᎹ ama/aːma water/salt

Consonant-vowel syllables

Stops


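IPA | Transliteration | Uppercase | Lowercase
ti | Dⁱ | U+13D7 | U+ABA7
tʰi | Tⁱ | U+13D8 | U+ABA8
tu tʰu | Dᵘ | U+13DA | U+ABAA
te | Dᵉ | U+13D5 | U+ABA5
tʰe | Tᵉ | U+13D6 | U+ABA6
to tʰo | Dᵒ | U+13D9 | U+ABA9
tə̃ tʰə̃ | Dᵛ | U+13DB | U+ABAB
ta | Dᵃ | U+13D3 | U+ABA3
tʰa | Tᵃ | U+13D4 | U+ABA4

IPA | Transliteration | Uppercase | Lowercase
ki kʰi | Gⁱ | U+13A9 | U+AB79
ku kʰu | Gᵘ | U+13AB | U+AB7B
ke kʰe | Gᵉ | U+13A8 | U+AB78
ko kʰo | Gᵒ | U+13AA | U+AB7A
kə̃ kʰə̃ | Gᵛ | U+13AC | U+AB7C
ka | Gᵃ | U+13A6 | U+AB76
kʰa | Kᵃ | U+13A7 | U+AB77

IPA | Transliteration | Uppercase | Lowercase
kʷi kʰw̥i | Qⁱ | U+13C8 | U+AB98
kʷu kʰw̥u | Qᵘ | U+13CA | U+AB9A
kʷe kʰw̥e | Qᵉ | U+13C7 | U+AB97
kʷo kʰw̥o | Qᵒ | U+13C9 | U+AB99
kʷə̃ kʰw̥ə̃ | Qᵛ | U+13CB | U+AB9B
kʷa kʰw̥a | Qᵃ | U+13C6 | U+AB96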

Affricate


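IPA | Transliteration | Uppercase | Lowercase
t͡si t͡ʒi t͡ʰʃi | Tˢⁱ | U+13E5 | U+ABB5
t͡su t͡ʒu t͡ʰʃu | Tˢᵘ | U+13E7 | U+ABB7
t͡se t͡ʒe t͡ʰʃe | Tˢᵉ | U+13E4 | U+ABB4
t͡so t͡ʒo t͡ʰʃo | Tˢᵒ | U+13E6 | U+ABB6
t͡sə̃ t͡ʒə̃ t͡ʰʃə̃ | Tˢᵛ | U+13E8 | U+ABB8
t͡sa t͡ʒa t͡ʰʃa | Tˢᵃ | U+13E3 | U+ABB3

IPA | Transliteration | Uppercase | Lowercase
tˡi tʰˡi | Tˡⁱ | U+13DF | U+ABAF
tˡu tʰˡu | Tˡᵘ | U+13E1 | U+ABB1
tˡe tʰˡe | Tˡᵉ | U+13DE | U+ABAE
tˡo tʰˡo | Tˡᵒ | U+13E0 | U+ABB0
tˡə̃ tʰˡə̃ | Tˡᵛ | U+13E2 | U+ABB2
tˡa | Dˡᵃ | U+13DC | U+ABAC
tˡʰa | Tˡᵃ | U+13DD | U+ABAD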

Fricatives


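IPA | Transliteration | Uppercase | Lowercase
si | Sⁱ | U+13CF | U+AB9F
su | Sᵘ | U+13D1 | U+ABA1
se | Sᵉ | U+13CE | U+AB9E
so | Sᵒ | U+13D0 | U+ABA0
sə̃ | Sᵛ | U+13D2 | U+ABA2
sa | Sᵃ | U+13CC | U+AB9C

IPA | Transliteration | Uppercase | Lowercase
hi | Hⁱ | U+13AF | U+AB7F
hu | Hᵘ | U+13B1 | U+AB81
he | Hᵉ | U+13AE | U+AB7E
ho | Hᵒ | U+13B0 | U+AB80
hə̃ | Hᵛ | U+13B2 | U+AB82
ha | Hᵃ | U+13AD | U+AB7D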

A 3rd fricative, ɬ, appears below as an aspirated form of l.

See also Consonant 's'.

Nasals


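IPA | Transliteration | Uppercase | Lowercase | Note
mi | Mⁱ | U+13BB | U+AB8B |
mu | Mᵘ | U+13BD | U+AB8D | rare
me | Mᵉ | U+13BA | U+AB8A |
mo | Mᵒ | U+13BC | U+AB8C | rare
ma | Mᵃ | U+13B9 | U+AB89 |

IPA | Transliteration | Uppercase | Lowercase | Note
ni hn̥i | Nⁱ | U+13C2 | U+AB92 |
nu hn̥u | Nᵘ | U+13C4 | U+AB94 |
ne hn̥e | Nᵉ | U+13C1 | U+AB91 |
no hn̥o | Nᵒ | U+13C3 | U+AB93 |
nə̃ hn̥ə̃ | Nᵛ | U+13C5 | U+AB95 |
na | Nᵃ | U+13BE | U+AB8E |
hn̥a | Hⁿᵃ | U+13BF | U+AB8F |
nah | Nᵃh | U+13C0 | U+AB90 | infrequent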

Other sonorants


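IPA | Transliteration | Uppercase | Lowercase
wi hwi | Wⁱ | U+13EB | U+ABBB
wu hwu | Wᵘ | U+13ED | U+ABBD
we hwe | Wᵉ | U+13EA | U+ABBA
wo hwo | Wᵒ | U+13EC | U+ABBC
wə̃ hwə̃ | Wᵛ | U+13EE | U+ABBE
wa hwa | Wᵃ | U+13E9 | U+ABB9

IPA | Transliteration | Uppercase | Lowercase
li ɬi | Lⁱ | U+13B5 | U+AB85
lu ɬu | Lᵘ | U+13B7 | U+AB87
le ɬe | Lᵉ | U+13B4 | U+AB84
lo ɬo | Lᵒ | U+13B6 | U+AB86
lə̃ ɬə̃ | Lᵛ | U+13B8 | U+AB88
la ɬa | Lᵃ | U+13B3 | U+AB83

IPA | Transliteration | Uppercase | Lowercase
ji hji | Yⁱ | U+13F1 | U+13F9
ju hju | Yᵘ | U+13F3 | U+13FB
je hje | Yᵉ | U+13F0 | U+13F8
jo hjo | Yᵒ | U+13F2 | U+13FA
jə̃ hjə̃ | Yᵛ | U+13F4 | U+13FC
ja hja | Yᵃ | U+13EF | U+ABBF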

Archaic

One syllable is archaic and not used.


IPA | Transliteration | Uppercase | Lowercase | Note
mə̃ | Mᵛ | U+13F5 | U+13FD | archaic

Consonant 's'


IPA | Transliteration | Uppercase | Lowercase
s | S | U+13CD | U+AB9D

Because it is not followed by a vowel, this character can be used to form consonant clusters at the start of a syllable1, eg. ᏍᎪᎯ skoːhi ten

It is also used for syllables that end with an s sound, eg. ᎯᏴᏫᏯᏍ hijə̃ːwiːjaːs Are you a Native American?

Some manuscripts precede syllables beginning with an s sound with this character. Sequoyah spelled his name like that, ie. ᏍᏏᏉᏯ s-si-quo-ya

Aspiration

Syllable initial aspiration

Most syllables can start with aspirated forms, but only 6 pairs of letters distinguish between aspirated and non-aspirated sounds in the onset.

Five pairs of characters make this distinction for stops or affricates: Ꭶ+Ꭷ, Ꮣ+Ꮤ, Ꮥ+Ꮦ, Ꮧ+Ꮨ, Ꮬ+Ꮭ. For example, it is possible to distinguish between the first two syllables of ᎧᎦᎵ kʰaːkaʔli February but not1590 between the two meanings of ᎪᎳ kőːla/ kʰǒːla winter/bone

Only one nasal syllable makes this distinction, ie. compare ᎬᎾ kə̃́.na/kə̃̀ː.na turkey/I'm alive ᎬᎿ kə̃ː.hn̥a he is alive

However, the following could have two different meanings ᎬᏂᎭ kə̃ːniha/kə̃ːhn̥iha I am/(s)he is striking it
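The six pairs can be captured in a small lookup table, which makes it easy to see which letters of a word carry aspiration information in their shape. The sketch below assumes the pairs are GA/KA, DA/TA, DE/TE, DI/TI, DLA/TLA and NA/HNA, as read from the examples and tables in this section; `ASPIRATED_PARTNER` and `marks_aspiration` are illustrative names only.

```python
import unicodedata

# Unaspirated letter -> aspirated counterpart (the six pairs described above).
ASPIRATED_PARTNER = {
    "\u13A6": "\u13A7",  # GA  -> KA
    "\u13D3": "\u13D4",  # DA  -> TA
    "\u13D5": "\u13D6",  # DE  -> TE
    "\u13D7": "\u13D8",  # DI  -> TI
    "\u13DC": "\u13DD",  # DLA -> TLA
    "\u13BE": "\u13BF",  # NA  -> HNA
}
MARKED = set(ASPIRATED_PARTNER) | set(ASPIRATED_PARTNER.values())

def marks_aspiration(letter):
    """True if the letter belongs to a pair that distinguishes aspiration."""
    return letter in MARKED

# KA + GA + LI ('February'), then GO + LA ('winter' or 'bone')
for word in ("\u13A7\u13A6\u13B5", "\u13AA\u13B3"):
    for ch in word:
        print(unicodedata.name(ch), marks_aspiration(ch))
```

In the first word the opening two letters belong to a contrasting pair, so the aspiration difference is visible in writing; in the second word neither letter does, so both readings share one spelling.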

The intrusive h

Aspiration can also arise when there is a non-written h sound in a syllable. Most syllables can have a coda with this sound, which then interacts with the sounds around it as morpheme prefixes and suffixes are attached to the base word. In some cases, it may produce transformations in other syllables.

The following 2 words illustrate non-written h sounds, the first in the syllable coda, and the second in the onset. ᎤᏂᎷᏨ ȕː.nì.lúh.tʰ͡ʃə̃ they arrived ᎠᎨᏯ à.kěː.hj̥a woman

Vowel absence

Vowels at the end of a word

In spoken Cherokee, vowels at the end of a word are often dropped, although the orthography indicates what the vowel would have been396. Figure 2 shows an example based on Montgomery-Anderson of the pronunciation of the sentence: The hungry man ate all the good food.

Cherokee | Transcription | IPA | Meaning
ᎠᏍᎦᏯ | a-s-ga-ya | àskàj | man
ᎤᏲᏏᏍᎬ | u-yo-si-s-kv | ȕːyòːsíːsk | hungry
ᏂᎦᏓ | ni-ka-da | nìka̋ːt | all
ᎤᏒᏅ | u-sv-nv | ȕːsə̃̀hn̥ə̃ | eat
ᎣᏍᏓ | o-s-da | őːst | good
ᎠᎵᏍᏓᏴᏙᏗ | a-li-s-da-yv-do-di | álstȁːhj̥tòhti | food
Phonetic transcription of a sentence where word-final vowels have been dropped from most words in pronunciation.

Additional vowel loss occurs as a result of phonological changes. See Phonological processes.

Final consonants

Each character may not only end with a vowel, but may also end with ʔ or h, eg. the following are written with just two characters ᏑᏗ suhti fishhook ᏔᎵ tʰaʔ.li two

There is one distinctive pair related to syllables ending with h, ie. compare: Ꮎ na and Ꮐ nah

Syllables that end with an s sound can be written using U+13CD LETTER S, eg. ᎯᏴᏫᏯᏍ hijə̃ːwiːjaːs Are you a Native American?

Consonant clusters

With one exception, consonant clusters are managed by using a normal syllabic character but ignoring the ('dummy') vowel, eg. ᎦᎵᏉᎩ kaɬkʷoːki seven, and ᎬᏙᎠ ktʰoːʔa it's hanging. The character chosen is largely up to the writer, but some words bring in etymological connections.

The exception is U+13CD LETTER S, which is not followed by a vowel,1 eg. ᏍᎪᎯ skoːhi ten

ssV sequences

Some manuscripts precede syllables beginning with an s sound with U+13CD LETTER S, and Sequoyah spelled his name like that, ie. ᏍᏏᏉᏯ s-si-quo-ya

Tone marks

Spoken Cherokee has tones, but they are not shown in the text.7

Linguists who want to show tones do so using standard allocations of combining characters. The following list shows diacritics used to express tones. (Mid is the default, and doesn't need marking.)25


Example | Tone | Code points
Ꭰ̄ | mid | U+13A0 U+0304
Ꭰ̂ | falling | U+13A0 U+0302
Ꭰ̌ | rising | U+13A0 U+030C
Ꭰ̀ | low | U+13A0 U+0300
Ꭰ́ | high | U+13A0 U+0301
Ꭰ̋ | superhigh | U+13A0 U+030B
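In encoded text, a tone-marked syllable is simply the Cherokee letter followed by the combining mark. The following sketch builds such sequences from the list above; the dictionary `TONE_MARKS` and the helper `add_tone` are illustrative names, not part of any established library.

```python
import unicodedata

# Tone name -> combining mark, as listed above.
TONE_MARKS = {
    "mid": "\u0304",        # COMBINING MACRON
    "falling": "\u0302",    # COMBINING CIRCUMFLEX ACCENT
    "rising": "\u030C",     # COMBINING CARON
    "low": "\u0300",        # COMBINING GRAVE ACCENT
    "high": "\u0301",       # COMBINING ACUTE ACCENT
    "superhigh": "\u030B",  # COMBINING DOUBLE ACUTE ACCENT
}

def add_tone(syllable, tone):
    """Append the combining mark for the given tone to a Cherokee letter."""
    return syllable + TONE_MARKS[tone]

marked = add_tone("\u13A0", "falling")  # CHEROKEE LETTER A with a falling tone
print(marked, [f"U+{ord(c):04X}" for c in marked])
# The sequence stays as two code points after normalisation.
print(unicodedata.normalize("NFC", marked) == marked)
```

Because the marked syllable remains a two-character sequence, code that counts or indexes characters needs to treat the letter and its mark as a unit.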

Other features

Pronunciation-related diacritics

Everson reports that some combining diacritical marks are now used in Cherokee text by ordinary readers and especially children.25

These diacritics are in the Unicode Combining Diacritical Marks block. The Cherokee block has no combining characters.


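Four combining marks, all infrequent:
U+0323 COMBINING DOT BELOW
U+0331 COMBINING MACRON BELOW
U+0324 COMBINING DIAERESIS BELOW
U+0330 COMBINING TILDE BELOW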

◌̣U+0323 COMBINING DOT BELOW indicates shifts in consonant readings, such as voiced to voiceless or voiceless to voiced; for example, where Ꭺ is ko, Ꭺ̣ would be kʰo.

◌̱U+0331 COMBINING MACRON BELOW indicates the dropping of a vowel; for example, Oklahoma could be written

ᎣᎦ̱ᎳᎰᎹ òːklàːhőːma Oklahoma

When a consonant is both shifted and has its vowel dropped, ◌̤U+0324 COMBINING DIAERESIS BELOW is used.

Nasalisation is only very rarely marked: in such cases, it can be indicated using ◌̰U+0330 COMBINING TILDE BELOW.
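Since these marks are optional reading aids, software may need to relate annotated text to the plain spelling, for example when searching. The sketch below separates the four marks from the letters and reports what each one signals. The labels in `PRONUNCIATION_MARKS` paraphrase the descriptions above, and `describe_marks` is a hypothetical helper, not an established API.

```python
# Combining mark -> what it signals, paraphrasing the descriptions above.
PRONUNCIATION_MARKS = {
    "\u0323": "consonant shift",                 # COMBINING DOT BELOW
    "\u0331": "vowel dropped",                   # COMBINING MACRON BELOW
    "\u0324": "consonant shift, vowel dropped",  # COMBINING DIAERESIS BELOW
    "\u0330": "nasalisation",                    # COMBINING TILDE BELOW
}

def describe_marks(text):
    """Return the text without pronunciation marks, plus (letter, meaning) notes."""
    plain, notes = [], []
    for ch in text:
        if ch in PRONUNCIATION_MARKS:
            if plain:  # attach the note to the letter the mark follows
                notes.append((plain[-1], PRONUNCIATION_MARKS[ch]))
        else:
            plain.append(ch)
    return "".join(plain), notes

# 'Oklahoma' written with a macron below the second letter (GA)
annotated = "\u13A3\u13A6\u0331\u13B3\u13B0\u13B9"
plain, notes = describe_marks(annotated)
print(plain)
print(notes)
```

Only the four marks named in this section are removed; tone marks above the letter are left in place.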

Numbers

This section describes typographic features related to digits, dates, currencies, etc.

Sequoyah, the inventor of the script, created a set of Cherokee numbers, but they were not adopted and are not encoded in Unicode.7 The shapes of the numbers can be seen on the Omniglot page.

Text direction

Cherokee text runs left-to-right in horizontal lines.
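The left-to-right behaviour can be confirmed from the Unicode character properties. The short sketch below looks up the Bidi_Class of each character in a string using Python's standard `unicodedata` module; `bidi_classes` is just an illustrative wrapper name.

```python
import unicodedata

def bidi_classes(text):
    """Return the set of Unicode Bidi_Class values used by the characters in text."""
    return {unicodedata.bidirectional(ch) for ch in text}

# The three letters of the word for 'Cherokee' (TSA, LA, GI)
print(bidi_classes("\u13E3\u13B3\u13A9"))
```

A result containing only the class L indicates strongly left-to-right characters, so no special bidirectional handling should be needed for plain Cherokee text.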


Glyph shaping & positioning

This section describes typographic features related to font/writing styles, cursive text, context-based shaping, context-based positioning, letterform slopes, weights & italics, and case & other character transforms.

Experiment with examples using the Cherokee character app.

Context-based shaping & positioning

Are special glyph forms needed, depending on the context in which a character is used? Do glyphs interact in some circumstances? Are there requirements to position diacritics or other items specially, depending on context? Does the script have multiple diacritics competing for the same location relative to the base?

There is no interaction between the glyphs in Cherokee.

Normally, there are no combining marks in Cherokee text. Such marks are only found in special cases, such as specialised educational or linguistic contexts.

Letterform slopes, weights, & italics

Are italicisation, bolding, oblique, etc relevant? Do italic fonts lean in the right direction? Is synthesised italicisation problematic? Are there other problems relating to bolding or italicisation - perhaps relating to generalised assumptions of applicability?

Cherokee users would like their fonts to have italic and bold styles, although this is not currently common. These alternate styles would be used in the same way as for the Latin script.25

Case & other character transforms

Is the orthography bicameral? Are there other character pairings, especially when transforms are needed to convert between the two?

In 2015 a set of lowercase letters was added to the Unicode repertoire in version 8.0, to complement the original set. This is discussed in more detail in Case.

Applications should provide for transformations between uppercase and lowercase forms. However, the situation is slightly unusual: pre-existing text is now treated as uppercase, so in some cases transforms need to treat lowercasing as the default operation. The following is from the Unicode Standard:

This exceptional introduction of a lowercase set to change a unicameral encoding into a bicameral encoding has important implications that implementers of the Cherokee script need to keep in mind. First, in order to preserve case folding stability, Cherokee case folds to the previously encoded uppercase letters, rather than to the newly encoded lowercase letters. This exceptional case folding behavior impacts identifiers, and so can trip up implementations if they are not prepared for it. Second, representation of cased Cherokee text requires using the new lowercase letters for most of the body text, instead of just changing a few initial letters to uppercase. That means that representation of traditional text such as the Cherokee New Testament requires substantial re-encoding of the text. Third, the fact that uppercase Cherokee still represents the default and is most widely supported in fonts means that input systems which are extended to support the new lowercase letters face unusual design choices.
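The case-folding behaviour described in the quotation can be observed with Python's built-in str.casefold(), which follows the Unicode case-folding data. This is only a sketch: it assumes a Python build whose Unicode database is at version 8.0 or later, and same_identifier is a hypothetical helper, not an API from any identifier specification.

```python
# The word for 'lightning' in traditional all-uppercase text ...
upper_only = "\u13A0\u13BE\u13A6\u13B5\u13CD\u13AC"
# ... and in mixed case (uppercase initial, lowercase elsewhere).
mixed_case = "\u13A0\uAB8E\uAB76\uAB85\uAB9D\uAB7C"


def same_identifier(a: str, b: str) -> bool:
    """Compare two strings caselessly using Unicode case folding."""
    return a.casefold() == b.casefold()


# Unlike most bicameral scripts, folding yields the UPPERCASE letters.
assert mixed_case.casefold() == upper_only
assert upper_only.casefold() == upper_only

# Lowercasing and case folding therefore give different results.
assert upper_only.lower() != upper_only.casefold()

print(same_identifier(upper_only, mixed_case))
```

Code that normalises identifiers with lower() instead of casefold() would produce a different result for Cherokee, which is the kind of surprise the quotation warns about.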

Case

Lowercase characters were introduced in Unicode 8.0, to cover growing use of bicameral content in modern typesetting, as well as some older texts such as the Cherokee New Testament. The lowercase text above is likely to be displayed as tofu (boxes), since it is currently difficult to find a font that includes lowercase forms.

It is unusual for the majority of content to be in uppercase, and for lowercase to come in later, and implementers may need to take care in introducing the new characters. For example, Cherokee case-folds to uppercase, rather than lower. For more details see the Unicode Standard.7

The shapes of the upper- vs. lower-cased letters don't change radically (as they do in Latin or Cyrillic). The lowercase letters are often simply smaller, but in some fonts they may have ascenders and descenders.25

ᎠᎾᎦᎵᏍᎬ

Ꭰꮎꭶꮅꮝꭼ

Traditional uppercase text (top) and the newer mixed case text (bottom).

ᎠᎾᎦᎵᏍᎬ anagalisgv lightning
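To illustrate how the mixed-case form relates to the traditional one, the following Python sketch converts a single all-uppercase word by keeping the first letter and lowercasing the rest. Which words should keep an initial capital is an editorial decision that this sketch does not attempt to make, and to_mixed_case is a hypothetical helper name. It assumes a Python build with Unicode 8.0 data or later.

```python
def to_mixed_case(word: str) -> str:
    """Keep the first letter uppercase and lowercase the remaining letters."""
    return word[:1].upper() + word[1:].lower()


traditional = "\u13A0\u13BE\u13A6\u13B5\u13CD\u13AC"  # all-uppercase 'lightning'
mixed = to_mixed_case(traditional)

# Each lowercase letter maps back to its uppercase counterpart.
assert mixed.upper() == traditional
print([f"U+{ord(ch):04X}" for ch in mixed])
```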

Typographic units

Word boundaries

Are words separated by spaces, or other characters? Are there special requirements when double-clicking on the text? Are words hyphenated?

The concept of 'word' is difficult to define in any language (see What is a word?). Here, a word is a vaguely-defined, but recognisable semantic unit that is typically smaller than a phrase and may comprise one or more syllables.

Words are separated by spaces.

See type samples.

Some words are hyphenated.

Graphemes

A grapheme is a user-perceived unit of text. Text operations that use graphemes as a unit of text include line-breaking, forwards deletion, cursor movement & selection, character counts, text spacing, text insertion, justification, case conversions, and sorting. The Unicode Standard uses generalised rules to define 'grapheme clusters', which approximate the likely grapheme boundaries in a writing system; however, they don't work well with many complex scripts.

Since there are no combining marks or decompositions in normal Cherokee text, grapheme clusters correspond to individual characters. Where combining marks are attached to letters, the combination of base and combining mark still fits within the definition of a grapheme cluster.

Grapheme clusters

Base (Mark?)

Each letter is a grapheme cluster, even if (rare) combining marks are attached.


ᎠᎺᏉᎢ a.méː.kʷő.ʔi ocean
ᎠᎨᏯ à.kěː.hj̥a woman
ᎬᏙᎠ ktʰoːʔa it's hanging
ᎣᎦ̱ᎳᎰᎹ òːklàːhőːma Oklahoma
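The Base (Mark?) pattern above can be approximated in a few lines of Python. This is a simplified sketch, not an implementation of the full Unicode grapheme cluster rules (UAX #29): it simply attaches any character whose General_Category begins with M to the preceding character. The function name grapheme_clusters is hypothetical.

```python
import unicodedata


def grapheme_clusters(text: str) -> list:
    """Split text into units of one base character plus any following marks."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch   # combining mark: attach to the previous base
        else:
            clusters.append(ch)  # anything else starts a new cluster
    return clusters


# Oklahoma, with a macron below the second letter: 6 code points, 5 clusters.
word = "\u13A3\u13A6\u0331\u13B3\u13B0\u13B9"
units = grapheme_clusters(word)
print(len(word), len(units))
```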

Punctuation & inline features

This section describes typographic features related to word boundaries, phrase & section boundaries, bracketed text, quotations & citations, emphasis, abbreviation, ellipsis & repetition, inline notes & annotations, other punctuation, and other inline text decoration.

Phrase & section boundaries

What characters are used to indicate the boundaries of phrases, sentences, and sections?

See type samples.



Cherokee uses standard Latin punctuation.7

phrase

,U+002C COMMA

;U+003B SEMICOLON

:U+003A COLON

sentence

.U+002E FULL STOP

?U+003F QUESTION MARK

!U+0021 EXCLAMATION MARK

Full stops are sometimes raised above the baseline.1

Parentheses & brackets

See type samples.



Cherokee commonly uses ASCII parentheses to insert parenthetical information into text.

standard
  start: (U+0028 LEFT PARENTHESIS
  end: )U+0029 RIGHT PARENTHESIS

Quotations

What characters are used to indicate quotations? Do quotations within quotations use different characters? What characters are used to indicate dialogue? Are the same mechanisms used to cite words, or for scare quotes, etc? What about citing book or article names?

See type samples.



Cherokee texts typically use English-style curly quotation marks. Because of keyboard design, quotations may also be surrounded by ASCII double or single quote marks.

initial
  start: “U+201C LEFT DOUBLE QUOTATION MARK
  end: ”U+201D RIGHT DOUBLE QUOTATION MARK

alternative
  start: ‘U+2018 LEFT SINGLE QUOTATION MARK
  end: ’U+2019 RIGHT SINGLE QUOTATION MARK

Line & paragraph layout

This section describes typographic features related to line breaking & hyphenation, text alignment & justification, text spacing, baselines, line height, counters, lists, and styling initials.

Line breaking & hyphenation

Are there special rules about the way text wraps when it hits the end of a line? Does line-breaking wrap whole 'words' at a time, or characters, or something else (such as syllables in Tibetan and Javanese)? What characters should not appear at the end or start of a line, and what should be done to prevent that? Is hyphenation used, or something else? What rules are used? What difficulties exist?

By default, lines are broken at inter-word spaces. As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line.

Line-edge rules

The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.


The following list gives examples of typical behaviours for some of the characters used in Cherokee. Context may affect the behaviour of some of these and other characters.


  • “ ‘ (   should not be the last character on a line.
  • ” ’ ) . , ; ! ? %   should not begin a new line.
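The two rules in the list, together with the default of breaking at inter-word spaces, can be expressed as a small Python check. This is a deliberately simplified sketch and not the Unicode Line Breaking Algorithm (UAX #14); the names NO_LINE_END, NO_LINE_START and break_allowed are hypothetical.

```python
# Characters from the list above.
NO_LINE_END = set("\u201C\u2018(")          # must not end a line
NO_LINE_START = set("\u201D\u2019).,;!?%")  # must not begin a line


def break_allowed(text: str, i: int) -> bool:
    """Say whether a line may be broken between text[i-1] and text[i]."""
    if i <= 0 or i >= len(text):
        return False
    before, after = text[i - 1], text[i]
    if before in NO_LINE_END or after in NO_LINE_START:
        return False
    # By default, lines break only after an inter-word space.
    return before == " "


# Three letters, the middle one in parentheses: "A (E) I" in Cherokee letters.
sample = "\u13A0 (\u13A1) \u13A2"
print([i for i in range(len(sample) + 1) if break_allowed(sample, i)])
```

In the sample, a break is possible before the opening parenthesis and after the space that follows the closing one, but never between a parenthesis and the letter it encloses.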

Text alignment & justification

Does text in a paragraph need to have flush lines down both sides? Does the script allow punctuation to hang outside the text box at the start or end of a line? Where adjustments are needed to make a line flush, how is that done? Does the script shrink/stretch space between words and/or letters? Are word baselines stretched, as in Arabic? What about paragraph indents?

Justification is done, principally, by adjusting the space between words.

Baselines, line height, etc.

Does the script have special requirements for baseline alignment between mixed scripts and in general? Is line height special for this script? Are there other aspects that affect line spacing, or positioning of items vertically within a line?

Cherokee uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Cherokee character glyphs are generally the same height, and only rarely descend (a short way) below the baseline. There are no combining marks in normal text.

To give an approximate idea, Figure 4 compares Latin and Cherokee glyphs from a Noto font. The height of uppercase Cherokee letters is that of the Latin cap-height, and lowercase is set to the Latin x-height.

HhqxᎢᏑᏪᏰᎤꭲꮡꮺꭴᏸ
Font metrics for Latin text compared with Cherokee glyphs in the Noto Sans Cherokee font.

Figure 5 shows similar comparisons for the Galvji and Gadugi fonts.

HhqxᎢᏑᏪᏰᎤꭲꮡꮺꭴᏸ HhqxᎢᏑᏪᏰᎤꭲꮡꮺꭴᏸ
Latin font metrics compared with Cherokee glyphs in the Galvji (top) and Gadugi (bottom) fonts.

Page & book layout

This section describes typographic features related to general page layout & progression; grids & tables; notes, footnotes, etc.; forms & user interaction; and page numbering, running headers, etc.

References & sources

1. Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, ISBN 0-19-507993-0

2. Michael Everson & Durbin Feeling, Revised proposal for the addition of Cherokee characters to the UCS

3. Brad Montgomery-Anderson (2008), A reference grammar of Oklahoma Cherokee, PhD dissertation

4. Omniglot, Cherokee

5. Margaret Peake Raymond (2008), The Cherokee Nation and its Language, tsalagi ayeli ale uniwonishisdi (retr. Dec 2021)

6. ScriptSource, Cherokee

7. Unicode Consortium, The Unicode Standard, Version 13.0, Chapter 20.1: Americas, Cherokee, 788-789, ISBN 978-1-936213-16-0

8. Hiroto Uchihara (2007), Cherokee Phonology and Verb Morphology, University of Tokyo, MA Thesis (retr. Dec 2021)

9. Unicode Consortium, Unicode Line Breaking Algorithm (UAX#14)

10. Wikipedia, Cherokee syllabary

Licence CC-By © r12a.