Sundanese

orthography notes

Updated 12 April, 2024

This page brings together basic information about the Sundanese script and its use for the Sundanese language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Sundanese using Unicode.

Referencing this document

Richard Ishida, Sundanese Orthography Notes, 12-Apr-2024, https://r12a.github.io/scripts/sund/su

Sample


ᮞᮊᮥᮙ᮪ᮔ ᮏᮜ᮪ᮙ ᮌᮥᮘᮢᮌ᮪ ᮊ ᮃᮜᮙ᮪ ᮓᮥᮑ ᮒᮨᮂᮞᮤᮖᮒ᮪ᮔ ᮙᮨᮛ᮪ᮓᮤᮊ ᮏᮩᮀ ᮘᮧᮌ ᮙᮛ᮪ᮒᮘᮒ᮪ ᮊᮒᮥᮒ᮪ ᮠᮊ᮪-ᮠᮊ᮪ ᮃᮔᮥ ᮞᮛᮥᮃ. ᮙᮛᮔᮨᮂᮔ ᮓᮤᮘᮨᮛᮨ ᮃᮊᮜ᮪ ᮏᮩᮀ ᮠᮒᮨ ᮔᮥᮛᮔᮤ, ᮎᮙ᮪ᮕᮥᮁ-ᮌᮅᮜ᮪ ᮏᮩᮀ ᮞᮞᮙᮔ ᮃᮚ ᮓᮤᮔ ᮞᮥᮙᮔᮨᮒ᮪ ᮓᮥᮓᮥᮜᮥᮛᮔ᮪.

Source: Unicode UDHR, article 1

Usage & history

Origins of the Sundanese script, 14thC – today.

Phoenician

└ Aramaic

└ Brahmi

└ Tamil-Brahmi

└ Pallava

└ Old Kawi

└ Old Sundanese

└ Sundanese

+ Balinese

+ Batak

+ Baybayin

+ Javanese

+ Lontara

+ Lampung

+ Makasar

+ Rencong

+ Rejang

Since 1996 the Sundanese script has been the official orthography for the 27 million Sundanese speakers on the island of Java, although the Latin script is also used. It is currently taught in schools and used for public signage.

ᮃᮊ᮪ᮞᮛ ᮞᮥᮔ᮪ᮓ

The modern orthography is derived from the Old Sundanese orthography (Aksara Sunda Kuno) which was used by the Sundanese between the 14th and 18th centuries. It, in turn, derived from the Pallava script.

Source: Scriptsource, Wikipedia.

Basic features

Sundanese is an abugida, ie. consonants carry an inherent vowel sound that is overridden, where needed, using vowel signs. The summary below gives a brief overview of features of the modern Sundanese orthography.

Sundanese text runs left to right in horizontal lines. Words are separated by spaces.


The 18 native consonant letters are supplemented by 7 more used for non-native sounds, such as those from Arabic.

Syllable-initial clusters use 3 dedicated combining marks for the second consonant.

Syllable-final consonant sounds are also represented by 3 dedicated combining marks. When a vowel sign and final consonant are both attached to the same base, they are arranged side by side.

Other consonant clusters are indicated by a visible mark called pamaaeh. There are no stacked consonants or other conjuncts in modern Sundanese, although they were used in the Old Sundanese orthography.


This orthography is an abugida with one inherent vowel, pronounced a.

Other post-consonant vowels are represented by 6 combining marks (vowel signs). Vowel letters are not used to write post-consonant vowels.

There is 1 pre-base glyph. There are no circumgraphs or multipart vowels.

Standalone vowels are written using 7 independent vowel letters.

Sundanese has a set of native digits, but uses ASCII punctuation.

Character index

Letters


Basic consonants

ᮕ␣ᮘ␣ᮒ␣ᮓ␣ᮊ␣ᮌ␣ᮎ␣ᮏ␣ᮞ␣ᮠ␣ᮙ␣ᮔ␣ᮑ␣ᮍ␣ᮝ␣ᮛ␣ᮜ␣ᮚ

Extended consonants

ᮟ␣ᮋ␣ᮖ␣ᮗ␣ᮐ␣ᮯ␣ᮮ

Vowels

ᮄ␣ᮅ␣ᮆ␣ᮇ␣ᮈ␣ᮉ␣ᮃ

Not used for modern Sundanese

ᮺ␣ᮻ␣ᮼ␣ᮽ␣ᮾ␣ᮿ

Combining marks


Vowels

ᮦ␣ᮤ␣ᮥ␣ᮧ␣ᮨ␣ᮩ

Medials

ᮢ␣ᮣ␣ᮡ

Finals

ᮀ␣ᮁ␣ᮂ

Other

Not used for modern Sundanese

᮫␣ᮬ␣ᮭ

Numbers

᮰␣᮱␣᮲␣᮳␣᮴␣᮵␣᮶␣᮷␣᮸␣᮹

Punctuation

‘␣’␣“␣”

ASCII

(␣)␣,␣.␣:␣;␣?␣!␣|

Not used for modern Sundanese

᳀␣᳁␣᳂␣᳃␣᳄␣᳅␣᳆␣᳇

Other


To be investigated

%␣-␣[␣]␣§␣«␣»␣ʼ␣͏␣​␣‌␣‍␣‑␣–␣—␣†␣‡␣…␣‰␣′␣″␣‹␣›␣⁠

Phonology

These are sounds for the Sundanese language.


Phones in a lighter colour are non-native or allophones. Source: Wikipedia.

Vowel sounds

i u ɤ ɤ ə ə ɛ ɔ a

Consonant sounds

             labial  alveolar  post-alveolar  palatal  velar  uvular  glottal
stop         p b     t d                               k ɡ    q
affricate                      t͡ʃ d͡ʒ                   k͡s
fricative    f v     s z                               x              h
nasal        m       n                        ɲ        ŋ
approximant  w       l                        j
trill/flap           r

Structure

An orthographic syllable in modern Sundanese can be described as one of

C {y,r,l} {vs} {ng,r,h}
Cp
V {ng,r,h}

where C is a consonant and V is an independent vowel, y,r,l represents a medial combining character, vs a vowel sign, ng,r,h a syllable-final combining character, and p a vowel-killer.
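
The syllable patterns above can be sketched as a regular expression. This is a minimal illustration rather than a normative definition: it assumes that the braces in the patterns mark optional items, and the code point ranges chosen for each class, as well as the names `SYLLABLE` and `split_syllables`, are assumptions made for this example.

```python
import re

# Character classes for the modern orthography (ranges are assumptions
# based on the Sundanese block, U+1B80..U+1BBF).
CONSONANT = "[\u1B8A-\u1BA0\u1BAE\u1BAF]"   # C: basic and extended consonants
INDEP_VOWEL = "[\u1B83-\u1B89]"             # V: independent vowel letters
MEDIAL = "[\u1BA1-\u1BA3]"                  # y, r, l medial signs
VOWEL_SIGN = "[\u1BA4-\u1BA9]"              # vs: dependent vowel signs
FINAL = "[\u1B80-\u1B82]"                   # ng, r, h final signs
PAMAAEH = "\u1BAA"                          # p: visible vowel killer

# "Cp" is listed first so that a consonant followed by pamaaeh is kept
# together rather than being matched as a bare consonant.
SYLLABLE = re.compile(
    f"{CONSONANT}{PAMAAEH}"
    f"|{CONSONANT}{MEDIAL}?{VOWEL_SIGN}?{FINAL}?"
    f"|{INDEP_VOWEL}{FINAL}?"
)


def split_syllables(word):
    """Return the orthographic syllables found in a Sundanese word."""
    return SYLLABLE.findall(word)


# aksara: A, KA + pamaaeh, SA, RA
print(split_syllables("\u1B83\u1B8A\u1BAA\u1B9E\u1B9B"))
```

Characters outside these classes, such as spaces and punctuation, are simply skipped by `findall`, so the function is best applied to one space-separated word at a time.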

Vowels

Vowel summary table

The following table summarises the main vowel to character assignments.

ⓘ represents the inherent vowel. The right-hand column shows standalone vowels.

Simple:
◌ᮤ␣ ␣◌ᮥ
ᮄ␣ ␣ᮅ
◌ᮦ␣ ␣◌ᮩ␣ ␣◌ᮧ
ᮆ␣ ␣ᮉ␣ ␣ᮇ
◌ᮨ

For additional details see vowel_mappings.

Below is the full set of characters needed to represent the vowels of the Sundanese language.

ᮃ␣ᮄ␣ᮅ␣ᮆ␣ᮇ␣ᮈ␣ᮉ
ᮤ␣ᮥ␣ᮦ␣ᮧ␣ᮨ␣ᮩ␣᮪

Inherent vowel

ka [U+1B8A SUNDANESE LETTER KA]

a following a consonant is not written, but is seen as an inherent part of the consonant letter, so ka is written by simply using the consonant letter.

Vowel basics

Combining marks used for vowels

ᮊᮤ kiː [U+1B8A SUNDANESE LETTER KA + U+1BA4 SUNDANESE VOWEL SIGN PANGHULU]

Sundanese uses the following dedicated combining marks for vowels.

ᮤ␣ᮥ␣ᮧ␣ᮦ␣ᮨ␣ᮩ

Two vowel signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
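
One way to check which vowel signs are spacing marks is to look at the Unicode general category of each one: spacing combining marks have the category `Mc`, and nonspacing marks have `Mn`. The following is a small sketch using Python's standard `unicodedata` module; the function name `spacing_vowel_signs` is made up for this example, and the result depends on the Unicode data shipped with the Python version in use.

```python
import unicodedata

# The six dependent vowel signs, U+1BA4..U+1BA9.
VOWEL_SIGNS = [chr(cp) for cp in range(0x1BA4, 0x1BAA)]


def spacing_vowel_signs():
    """Return the names of vowel signs whose general category is Mc."""
    return [
        unicodedata.name(ch)
        for ch in VOWEL_SIGNS
        if unicodedata.category(ch) == "Mc"
    ]


for name in spacing_vowel_signs():
    print(name)
```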

Pre-base vowel sign

One vowel sign appears to the left of the base consonant letter or cluster.

This is a combining mark that is always stored after the base consonant. The font places the glyph before the base consonant.

ᮛᮦᮌᮀ
A prebase vowel sign appears to the left of the consonant after which it is pronounced.
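
The difference between stored order and visual order can be seen by listing the code points of the example word. The sketch below uses Python's standard `unicodedata` module; the helper name `describe` is an assumption for this example. The vowel sign is listed second, after the consonant it follows in speech, even though its glyph is drawn to the left.

```python
import unicodedata


def describe(text):
    """Return (code point, character name) pairs in stored (logical) order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# The example word above, written with escapes so that the stored
# order is explicit: RA, vowel sign, GA, final sign.
word = "\u1B9B\u1BA6\u1B8C\u1B80"
for code_point, name in describe(word):
    print(code_point, name)
```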

Standalone vowels

Sundanese represents standalone vowels using a set of independent vowel letters, eg. ᮅᮃᮕ᮪. The set includes a character to represent the inherent vowel sound.

ᮄ␣ᮅ␣ᮆ␣ᮇ␣ᮈ␣ᮉ␣ᮃ

Independent vowels can carry syllable-final consonants, eg.

ᮃᮀᮊᮥᮒᮔ᮪

ᮎᮤᮊᮤᮄᮂ

ᮘᮤᮃᮀ

Vowel absence

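In modern Sundanese writing, suppressed inherent vowels are indicated by one of three devices: a medial consonant sign, a pamaaeh (vowel killer), or a syllable-final consonant sign.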

For example, ᮃᮌᮢᮤᮊᮥᮜ᮪ᮒᮥᮁ contains all three (see fig_vowel_absence).

ᮃᮌᮢᮤᮊᮥᮜ᮪ᮒᮥᮁ
Left to right, the highlights show a medial diacritic, a pamaaeh, and a final consonant diacritic.

At the end of a word, [U+1BAA SUNDANESE SIGN PAMAAEH] is used, eg. see ᮄᮊᮣᮤᮙ᮪ in fig_final_pamaaeh.

ᮄᮊᮣᮤᮙ᮪
Pamaaeh in word-final position.
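
The interaction between the inherent vowel and the marks that suppress it can be illustrated with a rough romaniser. This is only a sketch under stated assumptions: the Latin spellings are derived from the Unicode character names or chosen for illustration, the function name `romanize` is hypothetical, and Old Sundanese characters are not handled.

```python
import unicodedata

# Consonant letters: derive a Latin spelling from the Unicode name,
# e.g. "SUNDANESE LETTER NGA" -> "ng". This spelling is an assumption.
CONSONANTS = {
    chr(cp): unicodedata.name(chr(cp)).rsplit(" ", 1)[1][:-1].lower()
    for cp in [*range(0x1B8A, 0x1BA1), 0x1BAE, 0x1BAF]
}
INDEPENDENT_VOWELS = dict(zip("\u1B83\u1B84\u1B85\u1B86\u1B87\u1B88\u1B89",
                              ["a", "i", "u", "é", "o", "e", "eu"]))
MEDIALS = {"\u1BA1": "y", "\u1BA2": "r", "\u1BA3": "l"}
VOWEL_SIGNS = dict(zip("\u1BA4\u1BA5\u1BA6\u1BA7\u1BA8\u1BA9",
                       ["i", "u", "é", "o", "e", "eu"]))
FINALS = {"\u1B80": "ng", "\u1B81": "r", "\u1B82": "h"}
PAMAAEH = "\u1BAA"


def romanize(text):
    """Romanise modern Sundanese text, supplying inherent vowels."""
    out = []
    pending_a = False  # True while a consonant still awaits its vowel
    for ch in text:
        if ch in CONSONANTS:
            if pending_a:
                out.append("a")          # previous consonant kept its inherent a
            out.append(CONSONANTS[ch])
            pending_a = True
        elif ch in MEDIALS:
            out.append(MEDIALS[ch])      # vowel now follows the medial
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])  # vowel sign overrides inherent a
            pending_a = False
        elif ch == PAMAAEH:
            pending_a = False            # vowel killer: no vowel at all
        else:
            if pending_a:
                out.append("a")
            pending_a = False
            out.append(FINALS.get(ch) or INDEPENDENT_VOWELS.get(ch, ch))
    if pending_a:
        out.append("a")
    return "".join(out)


print(romanize("\u1B84\u1B8A\u1BA3\u1BA4\u1B99\u1BAA"))  # the word in the figure above
```

In this sketch a medial sign leaves the pending inherent vowel in place, a vowel sign replaces it, and the pamaaeh cancels it, which mirrors the three devices described in this section.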

Vowel sounds mapped to characters

This section maps Sundanese vowel sounds to common graphemes in the Sundanese orthography.

The left column shows dependent vowels, and the right column independent vowel letters.


Plain vowels

i
 

ᮄᮊᮣᮤᮙ᮪

ᮄᮛᮠ

u
 

ᮆᮜ᮪ᮙᮥ

ᮅᮔᮤᮍ

e
 

ᮛᮦᮌᮀ

ᮆᮜ᮪ᮙᮥ

ɤ
 

ᮊᮩᮞᮢᮊ᮪

ᮞᮩᮉᮁ

o
 

ᮌᮨᮜᮧᮙ᮪ᮘᮀ ᮙᮤᮊᮢᮧ

ᮇᮛᮚ᮪

ə
 

ᮌᮨᮓᮀ

ᮈᮙᮞ᮪

a
 

Inherent vowel

ᮊᮜᮕ

ᮃᮊ᮪ᮞᮛ

Consonants

Consonant summary table

The following table summarises the main consonant to character assignments.

Onsets
ᮕ␣ᮘ␣ᮒ␣ᮓ␣ᮊ␣ᮌ␣ᮋ␣ ␣ᮟ
ᮎ␣ᮏ
ᮖ␣ᮗ␣ᮞ␣ᮯ␣ᮐ␣ᮮ␣ᮠ
ᮙ␣ᮔ␣ᮑ␣ᮍ
ᮝ␣ᮛ␣ᮜ␣ᮚ
Medials
◌ᮢ␣◌ᮣ␣◌ᮡ
Finals
◌ᮂ␣ ␣◌ᮀ␣ ␣◌ᮁ

For additional details see the section Consonant to script mapping.

Basic consonants

The Sundanese block has 18 consonant letters for indigenous sounds in modern Sundanese writing.

ᮕ␣ᮘ␣ᮒ␣ᮓ␣ᮊ␣ᮌ␣ᮎ␣ᮏ␣ᮞ␣ᮠ␣ᮙ␣ᮔ␣ᮑ␣ᮍ␣ᮝ␣ᮛ␣ᮜ␣ᮚ

Repertoire extension

An extended set of consonants is used to represent non-native sounds, eg. Arabic.

ᮟ␣ᮋ␣ᮖ␣ᮗ␣ᮐ␣ᮯ␣ᮮ

Onsets

ᮢ␣ᮣ␣ᮡ

The three trailing consonants that can appear in syllable-initial pairs are written using dedicated combining marks, eg. ᮄᮊᮣᮤᮙ᮪ ᮃᮌᮢᮤᮊᮥᮜ᮪ᮒᮥᮁ

Finals

ᮀ␣ᮁ␣ᮂ

The three syllable-final consonant sounds are also represented using dedicated combining marks, eg. ᮙᮀᮌᮥ ᮕᮞᮤᮁ ᮃᮘᮂ-ᮃᮘᮂ

Consonant clusters

Syllable-initial clusters

Syllable-initial consonant clusters allow 3 sounds after the initial consonant, j, r, or l. These are all represented using dedicated combining marks (see onsets).

Other consonant clusters

In modern Sundanese the absence of a vowel sound between two consonants is shown using a visible vowel killer, [U+1BAA SUNDANESE SIGN PAMAAEH]. This produces no special conjunct forms.

ᮃᮊ᮪ᮞᮛ
The word aksara, showing pamaaeh vowel killer.

Old Sundanese

Historical Sundanese does have conjunct forms. They can be produced using the invisible [U+1BAB SUNDANESE SIGN VIRAMA]. The following shows known conjuncts:

Historically, Sundanese also had special forms for subjoined -m and -w. These can be represented using [U+1BAC SUNDANESE CONSONANT SIGN PASANGAN MA] and [U+1BAD SUNDANESE CONSONANT SIGN PASANGAN WA], respectively.

ᮬ␣ᮭ

Consonant to script mapping

This section maps Sundanese consonant sounds to common graphemes in the Sundanese orthography. Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.


p
 

ᮕᮔᮧᮔ᮪ᮕᮧᮆ

b
 

ᮘᮥᮃᮂ

t
 

ᮒᮅᮔ᮪

t͡ʃ
 

ᮎᮥᮎᮥᮒ᮪

d
 

ᮓᮜᮤᮀᮓᮤᮀ

d͡ʒ
 

ᮏᮏᮔ᮪ᮒᮥᮀ

k
 

ᮊᮥᮜ᮪ᮊᮞ᮪

ks
 

for foreign words.

ɡ
 

ᮌᮥᮌᮥᮊ᮪

q
 

for foreign words.

f
 

ᮍᮖᮥᮀᮞᮤᮊᮩᮔ᮪

v
 

for writing foreign words.

s
 

ᮞᮔᮦᮞ᮪

in Arabic loan words.

ᮮᮥᮯᮥ

z
 

for writing foreign words.

x
 

in Arabic loan words.

ᮮᮥᮯᮥ

h
 

ᮠᮜᮩᮀᮠᮩᮙ᮪

when syllable-final.

ᮃᮘᮂ-ᮃᮘᮂ

m
 

ᮙᮛᮩᮙᮔ᮪

n
 

ᮔᮇᮔ᮪

ɲ
 

ᮑᮔ

ŋ
 

ᮍᮙᮔᮂ

ᮙᮀᮌᮥ when syllable-final.

w
 

ᮝᮠᮍᮔ᮪

r
 

ᮛᮔ᮪ᮏᮀ

when medial.

ᮃᮌᮢᮤᮊᮥᮜ᮪ᮒᮥᮁ

when syllable-final.

ᮕᮞᮤᮁ

l
 

ᮜᮍᮤᮒ᮪

when medial.

ᮄᮊᮣᮤᮙ᮪

j
 

ᮜᮍᮣᮚᮍᮔ᮪

when medial.

Other features

Other letters

ᮺ [U+1BBA SUNDANESE AVAGRAHA] is an archaic letter used for writing Sanskrit.

For reproduction of Old Sundanese writing there are 5 additional characters:

ᮻ␣ᮼ␣ᮽ␣ᮾ␣ᮿ

Numbers

Digits

Sundanese uses native digits, which are decimal-based and used in the same way as European numerals.

᮰␣᮱␣᮲␣᮳␣᮴␣᮵␣᮶␣᮷␣᮸␣᮹

To help distinguish the digits from other characters, | [U+007C VERTICAL LINE] is placed on either side of a number.

|᮲᮰᮱᮷|
Vertical bars are used to distinguish numbers.
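
As an illustration of the digit convention, the following sketch converts between Python integers and Sundanese digit strings wrapped in vertical bars. The function names `format_number` and `parse_number` are made up for this example, and only non-negative integers are considered.

```python
# Map ASCII digits 0-9 to the Sundanese digits U+1BB0..U+1BB9.
ASCII_TO_SUNDANESE = {ord(str(d)): chr(0x1BB0 + d) for d in range(10)}


def format_number(value: int) -> str:
    """Write a non-negative integer with Sundanese digits between bars."""
    return "|" + str(value).translate(ASCII_TO_SUNDANESE) + "|"


def parse_number(text: str) -> int:
    """Read a number written with Sundanese digits, with or without bars."""
    # int() accepts any Unicode decimal digit, so no reverse table is needed.
    return int(text.strip("|"))


print(format_number(2017))                          # the number shown in the figure above
print(parse_number("|\u1BB2\u1BB0\u1BB1\u1BB7|"))   # 2017
```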

Text direction

Sundanese runs left to right in horizontal lines.
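
The default bidirectional class of any character can be looked up with Python's standard `unicodedata` module. The sketch below, with the hypothetical helper `bidi_classes`, shows that a Sundanese letter has the strong left-to-right class `L`, while a nonspacing vowel sign has the class `NSM` and so takes its direction from its base.

```python
import unicodedata


def bidi_classes(text):
    """Map each character in text to its default Unicode bidi class."""
    return {ch: unicodedata.bidirectional(ch) for ch in text}


# KA followed by the vowel sign PANGHULU.
for ch, bidi in bidi_classes("\u1B8A\u1BA4").items():
    print(f"U+{ord(ch):04X}", bidi)
```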


Glyph shaping & positioning

You can experiment with examples using the Sundanese character app.

Context-based shaping & positioning

Sundanese text is not cursive.

Glyph shaping is required for subjoined consonants in Old Sundanese, but doesn't appear to be needed for modern Sundanese orthography.

However, when two diacritics appear in the same position relative to the base character they are positioned side by side, as shown in fig_multiple_diacritics.

ᮊᮤᮀ   ᮊᮣᮥ   ᮊᮧᮂ
Multiple combining marks alongside the same base character sit side by side.

Observation: Everson says that the same applies for ᮊᮢᮥ, but the fonts I've tried all render that combination vertically.

For Old Sundanese orthography, positioning rules are also needed to produce conjunct forms.

Typographic units

Word boundaries

Words are separated by spaces.

Graphemes

tbd

Punctuation & inline features

Phrase & section boundaries

,␣;␣:␣.␣?␣!

Modern Sundanese typically uses ASCII punctuation for sentence and phrase punctuation.

phrase: , ; :

sentence: . ? !

Old Sundanese

The punctuation described here is used for Old Sundanese texts, and is not used for modern Sundanese.

phrase

In Old Sundanese, if is used as a full stop, is used as a comma.

Otherwise may be used as a comma in older texts.

sentence

may be used in Old Sundanese texts.

Religious texts in Old Sundanese contain ᳆᳀᳆ and ᳆᳁ markers, which include additional code points , and .

Historical texts in Old Sundanese contain ᳅᳂᳅ markers, with the additional code point .

Other similar code points include and .

Bracketed text

(␣)

Sundanese commonly uses ASCII parentheses to insert parenthetical information into text.

standard: start ( end )

Quotations & citations

‘␣’␣“␣”

Sundanese texts use quotation marks around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.

  start end
initial

nested

Line & paragraph layout

Line breaking

tbd

No information is available about whether lines break after syllables or only after space-separated words.

In-word line-breaking

According to Everson, hyphenation can occur after any full orthographic syllable, but there are no details about how that works.

Line-edge rules

As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.


The following list gives examples of typical behaviours for some of the characters used in modern Sundanese. Context may affect the behaviour of some of these and other characters.


  • “ ‘ (   should not be the last character on a line.
  • ” ’ ) . , ; ! ? %   should not begin a new line.
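
The two rules in the list above can be expressed as a simple check on the characters either side of a candidate break. This is a deliberately small sketch, not an implementation of the Unicode line breaking algorithm: the sets contain only the characters listed above, and the function name `break_allowed` is an assumption for this example.

```python
# Characters that should not be the last character on a line.
NO_LINE_END = set("“‘(")
# Characters that should not begin a new line.
NO_LINE_START = set("”’).,;!?%")


def break_allowed(before: str, after: str) -> bool:
    """Return True if a line may break between the two characters."""
    return before not in NO_LINE_END and after not in NO_LINE_START


print(break_allowed("\u1B8A", "."))   # False: a full stop must not start a line
print(break_allowed("(", "\u1B8A"))   # False: an opening parenthesis must not end a line
print(break_allowed(".", "\u1B8A"))   # True
```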

Baselines, line height, etc.

Sundanese uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Most Sundanese letters are of a uniform height, but Sundanese places vowel marks and final characters above and below base characters. If the latter occur together, they are typically placed side by side, rather than extending away from the baseline.

To give an approximate idea, fig_baselines compares Latin and Sundanese glyphs from the Noto font. The basic height of Sundanese letters is typically around the Latin cap-height, and combining marks below the base fit more or less within the Latin descender height. Combining marks above the base, however, reach higher than the Latin ascenders, creating a need for larger line spacing.

Hhqxᮕᮞᮤᮁᮃᮌᮢᮤᮊᮥᮜ᮪ᮒᮥᮁᮊᮣᮥ
Font metrics for Latin text compared with Sundanese glyphs in the Noto Sans Sundanese font.

Page & book layout

References