Balinese

orthography notes

Updated 13 December, 2024

This page brings together basic information about the Balinese script and its use for the Balinese language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Balinese using Unicode.

Referencing this document

Richard Ishida, Balinese Orthography Notes, 13-Dec-2024, https://r12a.github.io/scripts/bali/ban

 

Phonological transcriptions should be treated as a guide, only. They are taken from the sources consulted, and may be narrow or broad, phonemic or phonetic, depending on what is available. They mostly represent pronunciation of words in isolation. For more detailed information about allophones, alternations, sandhi, dialectal differences, and so on, follow the links to cited references.

This is an interactive document. Click/tap on the following to reveal detailed information and examples for each character: (a) coloured characters in examples and lists; (b) link text on character names. If your browser supports it, your cursor will change as you hover over these items.

More about using this page

Character names. The names of characters in codepoint markup drop the initial BALINESE label (purely to reduce the length of the examples). In other places the full name can be found.

Navigation. The Toggle images icon opens the table of contents in a popup window. Dismiss it by clicking on the X alongside it, or by hitting the ESC key.

Detailed character notes. Clicking on coloured characters in lists or on character names opens panels that give detailed information about each character. This information is taken from the companion document, Balinese Character Notes. (Those panels can be dismissed by pressing on the ESC key.)

Transcriptions & transliterations. Phonological transcriptions are surrounded by ⌈corner brackets⌋, to indicate that they vary between narrow, [phonetic] and broad, /phonemic/ transcriptions.
Latin transcriptions between <angle brackets>, represent the letters as commonly written in the Latin script.
A transliteration has also been developed especially for this orthography, and is generally based on the sound of a letter where possible, but where a letter has multiple pronunciations, the transliteration represents only one.
Transliterations provide perfect round-trip conversion between the native script and Latin, whereas Latin transcriptions rarely do.
When you click on an example to see its composition, the top of the panel that opens contains a transliteration, followed by the native text, then (if available) an IPA transcription.

Sample

Select part of this sample text to show a list of characters, with links to more details.

ᬫᬓᬲᬫᬶ​ᬫᬦᬸᬲᬦᬾ​ᬓᬳᭂᬫ᭄ᬩᬲᬶᬦ᭄ᬫᬳᬃᬤᬶᬓ​ᬮᬦ᭄ᬧᬢᭂᬄ​ᬲᬚ᭄ᬭᭀᬦᬶᬂ​ᬓᬳᬦᬦ᭄ᬮᬦ᭄ᬓ᭄ᬯᬲ᭟ ᬳᬶᬧᬸᬦ᭄ᬓᬵᬦᬸᬕ᭄ᬭᬳᬶᬦ᭄ᬯᬶᬯᬾᬓ​ᬮᬦ᭄ᬩᬸᬤ᭄ᬥᬶ᭞ ᬧᬦ᭄ᬢᬭᬦᬶᬂ​ᬫᬦᬸᬲ​ᬫᬗ᭄ᬤᬦᬾ​ᬧᬭᬲ᭄ᬧᬭᭀᬲ᭄ᬫᬲᭂᬫᭂᬢᭀᬦᬦ᭄᭟

Source: UDHR, article 1 in Omniglot

Usage & history

Origins of the Balinese script, 11thC – today.

Phoenician

└ Aramaic

└ Brahmi

└ Pallava

└ Old Kawi

└ Balinese

+ Batak

+ Baybayin

+ Javanese

+ Lontara

+ Makasar

+ Old Sundanese

+ Recong

+ Rejang

The Balinese script is used for writing the Balinese language spoken on the Indonesian islands of Java and Bali. It may also be used for Old Javanese and liturgical Sanskrit. With some additions, it is also used to write Sasak on the neighbouring island of Lombok.

Everyday use of the script has largely been eclipsed by the Latin alphabet, but Balinese has a significant presence in traditional ceremonies and texts of the Hindu religion. It is also used for signage on roads, at the entrances to villages, and on government buildings. Traditional literature is published on a small scale, but little modern literature. Sekaha Pesantian community groups gather to read the Balinese script in a social context, commonly in song form.

ᬅᬓ᭄ᬱᬭᬩᬮᬶ ạk͓ṡ̂rbli aksara bali Balinese script

Balinese script is derived from Old Kawi, and ultimately from Brahmi. Historically, Balinese was written on palm leaves or inscribed in stone. Its similarity to the Javanese script in form and behaviour leads some to propose that they are typological variants of each other.

More information: Scriptsource and Wikipedia.

Script code: bali
Language code: ban-bali
Script type: abugida
Origin: oce
Native speakers: 3,300,000

Total characters: 93
Letters: 47
Combining marks: 22
Punctuation: 12
Numbers: 10
Other: 2
Possible other: 31
Unicode blocks: 1

Character counts above are for this orthography but exclude ASCII.

Text direction: ltr
Post-consonant vowels: 1 inherent vowel, marks, vocalics, pre-base marks, circumgraphs
Standalone vowels: letters, carrier HA ᬳ
Case distinction: no
Cursive script: no
Combining marks: >1 per base
Clusters marked: yes
Dedicated finals: marks
Consonant clusters: ligated glyphs, stacks, conjoined glyphs, visual killer (killer type: v)
Other ligatures: no
Word separator: no separation
Wraps at: syllable
Hyphenation: yes ᭠
Conjuncts: span word boundaries
G Clusters OK?: no
Justification: ?
Baseline: romn

Basic features

The script is an abugida. See the table to the right for a brief overview of features for the Balinese language.

Balinese text runs left to right in horizontal lines.

Words are not separated by spaces; however, syllables may be separated by ZWSP, as long as the break doesn't fall inside a stack.

Stacked consonants and conjoined pairs span word boundaries. This means that text must be wrapped at orthographic syllable boundaries, and not at word boundaries. Hyphenation occurs, using U+1B60 PAMENENG at the line end to indicate the break.
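
As a rough illustration of the ZWSP constraint, the following Python sketch flags any ZWSP that sits between ADEG ADEG and a following consonant, ie. inside the code point sequence that would otherwise form a stack or conjoined pair. The function names and the consonant ranges (U+1B13 to U+1B33, plus the Sasak letters U+1B45 to U+1B4B) are assumptions made for this example, and a real line breaker would need more rules than this.

```python
ZWSP = "\u200B"       # ZERO WIDTH SPACE
ADEG_ADEG = "\u1B44"  # BALINESE ADEG ADEG


def is_consonant(ch: str) -> bool:
    """Assumed consonant ranges: KA..HA plus the Sasak additions."""
    return "\u1B13" <= ch <= "\u1B33" or "\u1B45" <= ch <= "\u1B4B"


def zwsp_inside_stack(text: str) -> list[int]:
    """Return the indexes of ZWSPs placed between ADEG ADEG and a consonant."""
    hits = []
    for i, ch in enumerate(text):
        if ch != ZWSP:
            continue
        before = text[i - 1] if i > 0 else ""
        after = text[i + 1] if i + 1 < len(text) else ""
        if before == ADEG_ADEG and after and is_consonant(after):
            hits.append(i)
    return hits


# KA + ADEG ADEG + ZWSP + SA SAPA: the ZWSP splits the cluster.
print(zwsp_inside_stack("\u1B13\u1B44\u200B\u1B31"))
# KA + ZWSP + SA SAPA: the ZWSP falls between two plain syllables.
print(zwsp_inside_stack("\u1B13\u200B\u1B31"))
```

A check like this could be run over text before ZWSPs are relied on as line-break opportunities.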

❯ Consonant summary table

18 consonant letters are used for pure Balinese words, supplemented by 15 more used for Sanskrit and Kawi loanwords. Some of these letters are used as honorifics, a little like capital letters in English proper nouns.

The second (or occasionally third) consonant in a syllable-initial cluster is written using ◌᭄ U+1B44 ADEG ADEG followed by one of 4 ordinary consonants or using a special vocalic combining mark.

Syllable-final consonant sounds are most commonly written using an ordinary consonant followed by ◌᭄ U+1B44 ADEG ADEG. If another consonant follows, the consonant shapes are combined into a conjunct form, even if the consonants represent the end of one word and the beginning of another! Alternatively, three syllable-final consonants may be represented by one of 3 final-consonant diacritics, two of which only occur word-finally.

Consonant clusters are represented by conjunct forms that are either stacked consonants or conjoined pairs. The shape of many subjoined consonant glyphs differs from the normal shape. The shaping is produced by adding ◌᭄ U+1B44 ADEG ADEG between consonant code points.

Usually the adeg adeg is invisible, but it is rendered visibly when no other consonant follows, or occasionally in special circumstances, when it can be forced to appear using an invisible formatting character.
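
The encoding pattern described above can be sketched in Python as follows. The helper names are hypothetical, and the use of ZWNJ (U+200C) as the invisible formatting character that keeps the adeg adeg visible is an assumption of this sketch, since this section does not name the character.

```python
ADEG_ADEG = "\u1B44"  # BALINESE ADEG ADEG
ZWNJ = "\u200C"       # ZERO WIDTH NON-JOINER (assumed formatting character)


def cluster(*consonants: str) -> str:
    """Join consonant letters with ADEG ADEG so a renderer can form a conjunct."""
    return ADEG_ADEG.join(consonants)


def cluster_with_visible_killer(first: str, second: str) -> str:
    """Keep the adeg adeg visible by following it with ZWNJ (assumption)."""
    return first + ADEG_ADEG + ZWNJ + second


KA, SA_SAPA = "\u1B13", "\u1B31"
stacked = cluster(KA, SA_SAPA)
unstacked = cluster_with_visible_killer(KA, SA_SAPA)
print([f"U+{ord(c):04X}" for c in stacked])
print([f"U+{ord(c):04X}" for c in unstacked])
```

Whether the conjunct or the visible adeg adeg actually appears depends on the font and rendering engine, not on these strings alone.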

❯ Vowel summary table

The Balinese orthography is an abugida with one inherent vowel, generally pronounced a, but ə when word-final or in some affixes.

Other post-consonant vowels are written using 11 combining marks (vowel signs). There are 2 pre-base glyphs and 6 circumgraphs.

In principle, Balinese has no multipart vowels; however, 5 of the 6 circumgraphs can also be decomposed into 2 parts. Those can involve up to 2 glyphs, and glyphs can surround the base consonant(s) on up to 3 sides.

Independent vowels are used at the beginning of a word for standalone vowel sounds. Inside a word these are written using vowel signs applied to U+1B33 LETTER HA.

The inherent vowel is suppressed using U+1B44 ADEG ADEG, which is invisible in consonant clusters, but is visible elsewhere, and is used word-finally.

Balinese has vocalics, and their use is required for certain consonant-vowel combinations.

Balinese has a set of native digits, and uses native punctuation marks.

Character index

The index points to locations where a character is mentioned in this page, and indicates whether it is used by the Balinese orthography described here.

Manage characters.

Click on the image to the left to view all the 'main' and 'infrequent' characters in the index in various groupings or open related apps.

Letters

Basic consonants (18)

1B27 BALINESE LETTER PA: basic consonant p p
1B29 BALINESE LETTER BA: basic consonant b b
1B22 BALINESE LETTER TA: basic consonant t t
1B24 BALINESE LETTER DA: basic consonant d d
1B18 BALINESE LETTER CA: basic consonant t͡ʃ c
1B1A BALINESE LETTER JA: basic consonant d͡ʒ j
1B13 BALINESE LETTER KA: basic consonant k k
1B15 BALINESE LETTER GA: basic consonant ɡ g
1B32 BALINESE LETTER SA: basic consonant s s
1B33 BALINESE LETTER HA: basic consonant h ∅ h ∅
1B2B BALINESE LETTER MA: basic consonant m m
1B26 BALINESE LETTER NA: basic consonant n n
1B17 BALINESE LETTER NGA: basic consonant ŋ ng
1B1C BALINESE LETTER NYA: basic consonant ɲ nya
1B2F BALINESE LETTER WA: basic consonant w w
1B2D BALINESE LETTER RA: basic consonant r r
1B2E BALINESE LETTER LA: consonant l l
1B2C BALINESE LETTER YA: basic consonant j y

Honorifics (9)

1B28 BALINESE LETTER PA KAPAL: kawi consonant in Kawi loan words. p p ph
1B1D BALINESE LETTER TA LATIK: kawi consonant in Kawi loan words. t t ṭ
1B23 BALINESE LETTER TA TAWA: kawi consonant in Kawi loan words. t t th
1B25 BALINESE LETTER DA MADU: kawi consonant in Kawi loan words. d d ḍ dh
1B16 BALINESE LETTER GA GORA: kawi consonant in Kawi loan words. ɡ gh
1B30 BALINESE LETTER SA SAGA: kawi consonant in Kawi loan words. s s sy
1B31 BALINESE LETTER SA SAPA: kawi consonant in Kawi loan words. s s ṣ
1B21 BALINESE LETTER NA RAMBAT: kawi consonant in Kawi loan words. n n ṇ
1B2A BALINESE LETTER BA KEMBANG: kawi consonant in Kawi loan words. b b bh

Extended consonants (6)

1B1E BALINESE LETTER TA MURDA MAHAPRANA: honorific consonant t t
1B1F BALINESE LETTER DA MURDA ALPAPRANA: honorific consonant d
1B20 BALINESE LETTER DA MURDA MAHAPRANA: honorific consonant d
1B14 BALINESE LETTER KA MAHAPRANA (rare): honorific consonant. Very rare. k
1B1B BALINESE LETTER JA JERA (rare): honorific consonant. Used in one word only. d͡ʒ jh
1B19 BALINESE LETTER CA LACA (rare): honorific consonant. Only found in subjoined form. t͡ʃ

Vowels (10)

1B07 BALINESE LETTER IKARA: independent i i
1B08 BALINESE LETTER IKARA TEDUNG: independent
1B09 BALINESE LETTER UKARA: independent u u
1B0A BALINESE LETTER UKARA TEDUNG: independent
1B0F BALINESE LETTER EKARA: independent e ɛ é
1B11 BALINESE LETTER OKARA: independent o ɔ o
1B05 BALINESE LETTER AKARA: independent a a
1B06 BALINESE LETTER AKARA TEDUNG: independent ɑː a
1B10 BALINESE LETTER AIKARA: independent vowel aːi ai
1B12 BALINESE LETTER OKARA TEDUNG: independent vowel aːu o

Vocalics (4)

1B0B BALINESE LETTER RA REPA: vocalic
1B0C BALINESE LETTER RA REPA TEDUNG: vocalic rəː
1B0D BALINESE LETTER LA LENGA: vocalic le
1B0E BALINESE LETTER LA LENGA TEDUNG: vocalic ləː

Not used for contemporary Balinese (7)

1B45 BALINESE LETTER KAF SASAK (unused): consonant sasak
1B46 BALINESE LETTER KHOT SASAK (unused): consonant sasak
1B47 BALINESE LETTER TZIR SASAK (unused): consonant sasak
1B48 BALINESE LETTER EF SASAK (unused): consonant sasak
1B49 BALINESE LETTER VE SASAK (unused): consonant sasak
1B4A BALINESE LETTER ZAL SASAK (unused): consonant sasak
1B4B BALINESE LETTER ASYURA SASAK (unused): consonant sasak

Combining marks

Vowel signs (11)

1B3E BALINESE VOWEL SIGN TALING: vowel sign e ɛ é e
ᬿ 1B3F BALINESE VOWEL SIGN TALING REPA: vowel sign aːi ai
1B40 BALINESE VOWEL SIGN TALING TEDUNG: vowel sign o ɔ o
1B43 BALINESE VOWEL SIGN PEPET TEDUNG: vowel sign əː
1B41 BALINESE VOWEL SIGN TALING REPA TEDUNG: vowel sign aːu
1B36 BALINESE VOWEL SIGN ULU: vowel sign i i
1B37 BALINESE VOWEL SIGN ULU SARI: vowel sign i
1B38 BALINESE VOWEL SIGN SUKU: vowel sign u u
1B39 BALINESE VOWEL SIGN SUKU ILUT: vowel sign u
1B42 BALINESE VOWEL SIGN PEPET: vowel sign ə ě e
1B35 BALINESE VOWEL SIGN TEDUNG: vowel sign ɑː a

Vocalics (4)

1B3A BALINESE VOWEL SIGN RA REPA: vowel sign/semi-vowel/medial consonant
1B3B BALINESE VOWEL SIGN RA REPA TEDUNG: vowel sign/semi-vowel/medial consonant rəː
1B3C BALINESE VOWEL SIGN LA LENGA: vocalic
1B3D BALINESE VOWEL SIGN LA LENGA TEDUNG: vocalic ləː

Bindu (3)

1B00 BALINESE SIGN ULU RICEM (infrequent): final consonant -m
1B01 BALINESE SIGN ULU CANDRA (infrequent): final consonant
1B02 BALINESE SIGN CECEK: final consonant ng

Finals

1B03 BALINESE SIGN SURANG: final consonant -r r

Nukta

1B34 BALINESE SIGN REREKAN (loan): nukta

Virama

1B44 BALINESE ADEG ADEG: vowel-killer

Visarga

1B04 BALINESE SIGN BISAH: final consonant -h h

Numbers

Digits (10)

1B50 BALINESE DIGIT ZERO: digit 0
1B51 BALINESE DIGIT ONE: digit 1 1
1B52 BALINESE DIGIT TWO: digit 2 2
1B53 BALINESE DIGIT THREE: digit 3 3
1B54 BALINESE DIGIT FOUR: digit 4 4
1B55 BALINESE DIGIT FIVE: digit 5 5
1B56 BALINESE DIGIT SIX: digit 6 6
1B57 BALINESE DIGIT SEVEN: digit 7 7
1B58 BALINESE DIGIT EIGHT: digit 8 8
1B59 BALINESE DIGIT NINE: digit 9 9

Punctuation

Punctuation marks (12)

1B5A BALINESE PANTI: text start symbol
1B5B BALINESE PAMADA: text start symbol
1B5C BALINESE WINDU: punctuation
1B5D BALINESE CARIK PAMUNGKAH: colon
1B5E BALINESE CARIK SIKI: ~comma
1B5F BALINESE CARIK PAREREN: ~full stop
1B60 BALINESE PAMENENG: line-breaking hyphen
1B7D BALINESE PANTI LANTANG: text end symbol
1B7E BALINESE PAMADA LANTANG: text end symbol
᭿ 1B7F BALINESE PANTI BAWAK (infrequent): finer section division. Only found in some manuscripts.
1B4E BALINESE INVERTED CARIK SIKI (infrequent): fine comma. Only found in some manuscripts.
1B4F BALINESE INVERTED CARIK PAREREN (infrequent): fine detail full stop. Only found in some manuscripts.

Other

Other characters (2)

200C ZERO WIDTH NON-JOINER: zwnj
200B ZERO WIDTH SPACE: zero-width space

To be investigated

All 31 characters in this group are marked (tbc).

! 0021 EXCLAMATION MARK: exclamation mark
% 0025 PERCENT SIGN: percentage mark
( 0028 LEFT PARENTHESIS: parenthesis
) 0029 RIGHT PARENTHESIS: parenthesis
- 002D HYPHEN-MINUS: hyphen
; 003B SEMICOLON: semicolon
? 003F QUESTION MARK: question mark
[ 005B LEFT SQUARE BRACKET: bracket
] 005D RIGHT SQUARE BRACKET: bracket
§ 00A7 SECTION SIGN: section sign
« 00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK: quotation mark
» 00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK: quotation mark
ʼ 02BC MODIFIER LETTER APOSTROPHE: apostrophe
034F COMBINING GRAPHEME JOINER: combining grapheme joiner
200D ZERO WIDTH JOINER: zwj
2011 NON-BREAKING HYPHEN: non-breaking hyphen
2012 FIGURE DASH: figure dash
2013 EN DASH: en dash
2014 EM DASH: em dash
2018 LEFT SINGLE QUOTATION MARK: quotation mark
2019 RIGHT SINGLE QUOTATION MARK: quotation mark
201C LEFT DOUBLE QUOTATION MARK: quotation mark
201D RIGHT DOUBLE QUOTATION MARK: quotation mark
2020 DAGGER: dagger
2021 DOUBLE DAGGER: double dagger
2026 HORIZONTAL ELLIPSIS: ellipsis
2030 PER MILLE SIGN: per mille mark
2032 PRIME: prime
2033 DOUBLE PRIME: double prime
2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK: quotation mark
203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK: quotation mark

Phonology

The following represents the repertoire of the Balinese language.

Click on the sounds to see where else in the document they are referred to.

Phones in a lighter colour are non-native or allophones.

Vowel sounds

Plain vowels

i u e o ə əː ɛ ɔ a ɑː

Diphthongs

aːi aːu

The sources are not very clear about Balinese vowel length. Wiktionary IPA transcriptions make no distinction in pronunciation between the long and short vowel graphemes, and this is backed up in some sources. One study describes Balinese speakers reducing long vowels to short ones when speaking English. Clynes§ argues that some apparently long vowels are parts of separate syllables and split into different sounds under morphological changes.

On the other hand, sources including Ida Bagus Adi Sudewa [8] and Wikipedia [11] indicate that there is a difference in vowel length.

Consonant sounds

Places of articulation: labial, dental, alveolar, post-alveolar, palatal, velar, pharyngeal, glottal

stop: p b, t d, k ɡ
affricate: t͡ʃ d͡ʒ
fricative: f v, s z, x ɣ, ħ ʕ, h
nasal: m, n, ɲ, ŋ
approximant: w, l, j
trill/flap: r

Vowels

Vowel summary table

The following table summarises the main vowel to character assignments.

ⓘ represents the inherent vowel. Multipart forms are not shown here because all vowels and diphthongs are normally represented using one of the atomic characters listed here. Standalone vowels are shown in the right-hand column.

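Post-consonant vowels

Plain:
i | i | i | 1B36
i | ī | 1B37
u | u | u | 1B38
u | ū | 1B39
e ɛ | é e | e | 1B3E
o ɔ | o | o | 1B40
ə | ě e | ə | 1B42
əː | ə̄ | 1B43
a | (inherent vowel) | 24D8
ɑː | a | ɑ̄ | 1B35

Diphthongs:
aːi | ᬿ | ai | 1B3F
aːu | 1B41

Vocalics:
1B3A
rəː | r̥̄ | 1B3B
1B3C
ləː | l̥̄ | 1B3D

Standalone vowels

Plain:
i | i | 1B07
i | ᬳᬶ | i | hi | 1B33 + 1B36
ị̄ | 1B08
ᬳᬷ | 1B33 + 1B37
u | u | 1B09
u | ᬳᬸ | u | hu | 1B33 + 1B38
ụ̄ | 1B0A
ᬳᬹ | 1B33 + 1B39
e | é | 1B0F
e | ᬳᬾ | e | he | 1B33 + 1B3E
o | o | 1B11
o | ᬳᭀ | ho | 1B33 + 1B40
ə | ᬳᭂ | 1B33 + 1B42
əː | ᬳᭃ | hə̄ | 1B33 + 1B43
a | a | 1B05
ɑː | a | ɑ̣̄ | 1B06
ɑː | ᬳᬵ | hɑ̄ | 1B33 + 1B35

Diphthongs:
aːi | ai | ạʲ | 1B10
aːi | ᬳᬿ | haʲ | 1B33 + 1B3F
aːu | o | ạʷ | 1B12
aːu | ᬳᭁ | haʷ | 1B33 + 1B41

Vocalics:
r̥̣ | 1B0B
rəː | r̥̣̄ | 1B0C
le | l̥̣ | 1B0D
ləː | l̥̣̄ | 1B0E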

For more details see Vowel sounds to characters.

Inherent vowel

ka U+1B13 BALINESE LETTER KA

An inherent vowel is a vowel sound that is automatically pronounced after a consonant letter, unless specifically suppressed.

a following a consonant is not written, but is seen as an inherent part of the consonant letter, so ka is written by simply using the consonant letter.

However, the inherent vowel is pronounced ə at the end of a word and also in prefixes ma-, pa- and da-.
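
The inherent vowel rule can be illustrated with a small Python sketch. The letter tables below cover only a handful of characters, the function name is hypothetical, and the rule for the prefixes ma-, pa- and da- is deliberately left out, so this is a simplified illustration rather than a full transcriber.

```python
ADEG_ADEG = "\u1B44"  # suppresses the inherent vowel

# Small, illustrative subsets of the consonant letters and vowel signs.
CONSONANTS = {
    "\u1B13": "k", "\u1B29": "b", "\u1B2E": "l",
    "\u1B2B": "m", "\u1B26": "n", "\u1B32": "s",
}
VOWEL_SIGNS = {"\u1B36": "i", "\u1B38": "u"}


def transcribe(word: str) -> str:
    """Transcribe one word, supplying 'a' (or word-final 'ə') where no vowel is written."""
    out = []
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else None
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            if nxt is None:
                out.append("ə")  # inherent vowel at the end of a word
            elif nxt not in VOWEL_SIGNS and nxt != ADEG_ADEG:
                out.append("a")  # inherent vowel elsewhere
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch != ADEG_ADEG:
            raise ValueError(f"unsupported character U+{ord(ch):04X}")
    return "".join(out)


print(transcribe("\u1B29\u1B2E\u1B36"))        # BA, LA, ULU
print(transcribe("\u1B2B\u1B26\u1B38\u1B32"))  # MA, NA, SUKU, SA
```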

Vowels after consonants

Post-consonant vowels are written using 11 combining marks (vowel signs). There are 2 pre-base glyphs and 6 circumgraphs.

In principle, Balinese has no multipart vowels; however, 5 of the 6 circumgraphs can also be decomposed into 2 parts. Those can involve up to 2 glyphs, and glyphs can surround the base consonant(s) on up to 3 sides.

Vowel signs

ᬓᬶ ki U+1B13 BALINESE LETTER KA + U+1B36 BALINESE VOWEL SIGN ULU

A vowel sign is attached to a consonant base to express a following vowel sound. Sometimes vowel signs have multiple parts, which are displayed on different sides of the base consonant or cluster. They are known as 'matras' in Sanskrit.

Balinese uses the following dedicated combining marks for vowels. They are all vowel signs.


i | i | i | 1B36
i | ī | 1B37
u | u | u | 1B38
u | ū | 1B39
e ɛ | é e | e | 1B3E
o ɔ | o | o | 1B40
ə | ě e | ə | 1B42
əː | ə̄ | 1B43
ɑː | a | ɑ̄ | 1B35

ᬿ | aːi | ai | 1B3F
aːu | 1B41

To represent the sounds rə or lə, Balinese uses vocalic letters. A sequence such as *ᬭᭂ U+1B2D LETTER RA + U+1B42 VOWEL SIGN PEPET is not used. See Vocalics.

Six of the vowel signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.

All vowel signs are typed and stored after the base consonant, and the glyph rendering system takes care of the positioning at display time. The glyphs used to represent vowels, whether alone or in multipart vowels, are arranged around an orthographic syllable, which may be 2 consonants, rather than just around the immediately preceding consonant. See Pre-base vowel signs and Circumgraphs.

Composite vowel signs

Composite vowel signs are only produced when text is decomposed; 5 of the circumgraphs split off the U+1B35 VOWEL SIGN TEDUNG glyph, to create the following pairs:


ᭀ | o ɔ | o | 1B3E + 1B35 (avoid)
ᭃ | əː | ə̄ | 1B42 + 1B35 (avoid)
ᭁ | aːu | 1B3F + 1B35 (avoid)
ᬻ | rəː | r̥̄ | 1B3A + 1B35
ᬽ | ləː | l̥̄ | 1B3C + 1B35

Pre-base vowel signs

ᬓᬾ ke U+1B13 BALINESE LETTER KA + U+1B3E BALINESE VOWEL SIGN TALING

Two vowel signs appear to the left of the base consonant letter or cluster.


e ɛ | é e | e | 1B3E
ᬿ | aːi | ai | 1B3F

These are combining marks that are always typed and stored after the syllable-initial consonant. The rendering process places the glyph before the consonant for display or printing.
Click on the following word to see the sequence of characters in storage.

ᬘᬾᬮᬾᬂ ˈcɛlɛŋ pig

These vowel glyphs are actually displayed before the start of the orthographic syllable. This means that, when the syllable begins with a consonant cluster, the pre-base vowel glyph is separated from any post-base vowel glyphs by more than one consonant character (see Figure 1).

ᬩᭂᬦ᭄ᬤᬾᬰ
A pre-base vowel sign. Although stored after d, it appears before the nd cluster.

ᬩᭂᬦ᭄ᬤᬾᬰ bəndesə village chief
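
The difference between storage order and display order can be inspected with Python's standard unicodedata module. The helper name show_composition is hypothetical, and the second string is this sketch's encoding of just the nd cluster plus its vowel sign, not of the whole example word.

```python
import unicodedata


def show_composition(text: str) -> list[str]:
    """List the stored code points of a string, in logical (typing) order."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# ke: the vowel sign is stored after KA although its glyph is drawn to the left.
for line in show_composition("\u1B13\u1B3E"):
    print(line)

# nde: NA + ADEG ADEG + DA + TALING. The vowel sign is stored last,
# after the second consonant of the cluster.
for line in show_composition("\u1B26\u1B44\u1B24\u1B3E"):
    print(line)
```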

Circumgraphs

When a single vowel sign code point produces glyphs on more than one side of the consonant base, it is referred to here as a circumgraph.

ᬓᭀ ko U+1B13 BALINESE LETTER KA + U+1B40 BALINESE VOWEL SIGN TALING TEDUNG

Six vowel or vocalic sounds are each represented by a vowel sign that is a single code point in memory, but when displayed it has visually separate parts that appear on different sides of the preceding consonant or cluster.


o ɔ | o | o | 1B40
əː | ə̄ | 1B43
aːu | 1B41
rəː | r̥̄ | 1B3B
1B3C
ləː | l̥̄ | 1B3D

This section includes some vowel signs described in the section Vocalics.

Like pre-base glyphs, these are combining marks that are always stored after the base consonant. The rendering process places the glyphs around the base consonant, as needed.
Click on 'Show composition' to see the sequence of characters in storage for the following word.

ᬤᭀᬦ᭄
A circumgraph. The right-hand side ligates with the base character in this font.

ᬤᭀᬦ᭄ don leaf

Glyphs can appear on up to 3 sides of the base. Some of the glyphs merge with the base character's glyph (see Context-based shaping & positioning).

These circumgraphs have canonically equivalent decomposed forms (see Encoding vowel signs).
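
The canonical equivalence can be checked with Python's standard unicodedata module, as in the sketch below. The list name and helper name are choices made for this example; the decompositions themselves come from the Unicode data shipped with Python.

```python
import unicodedata

TEDUNG = "\u1B35"  # BALINESE VOWEL SIGN TEDUNG

# Single code point vowel signs that have a two-part canonical decomposition.
PRECOMPOSED = ["\u1B40", "\u1B43", "\u1B41", "\u1B3B", "\u1B3D"]


def decompose(sign: str) -> str:
    """Return the canonical decomposition (NFD) of a vowel sign."""
    return unicodedata.normalize("NFD", sign)


for sign in PRECOMPOSED:
    parts = decompose(sign)
    print(f"U+{ord(sign):04X} ->", " + ".join(f"U+{ord(p):04X}" for p in parts))

# Normalising back with NFC shows what a decomposed pair turns into.
recomposed = unicodedata.normalize("NFC", "\u1B3E" + TEDUNG)
print([f"U+{ord(c):04X}" for c in recomposed])
```

Because the two forms are canonically equivalent, string comparison and search are more reliable when text is normalised to a single form first.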

Vowel sign placement

Show details about vowel glyph positioning.

The following list shows where vowel signs, including vocalics, are positioned around a base consonant to produce vowels, and how many instances of that pattern there are.

  • 2 pre-base, eg. ᬓᬾ ᬓᬿ
  • 1 post-base, eg. ᬓᬵ kɑ̄
  • 3 above-base, eg. ᬓᬶ ᬓᬷ ᬓᭂ
  • 3 below-base, eg. ᬓᬸ ᬓᬹ ᬓᬺ
  • 2 pre+post-base, eg. ᬓᭀ ᬓᭁ
  • 1 below+post-base, eg. ᬓᬻ
  • 1 below+above-base, eg. ᬓᬼ
  • 1 below+above+post-base, eg. ᬓᬽ
  • 1 above+post-base, eg. ᬓᭃ kə̄

At maximum, vowel components can occur concurrently on 3 sides of the base.
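
The placement list can also be held as a small data structure, which makes the counts easy to verify. In the Python sketch below, the assignment of code points to positions is inferred from the examples in the list and from the character names, so it should be treated as an assumption rather than as authoritative data.

```python
# Sides of the base on which each vowel sign (including vocalics) places glyphs.
PLACEMENT = {
    0x1B3E: {"pre"},
    0x1B3F: {"pre"},
    0x1B35: {"post"},
    0x1B36: {"above"},
    0x1B37: {"above"},
    0x1B42: {"above"},
    0x1B38: {"below"},
    0x1B39: {"below"},
    0x1B3A: {"below"},
    0x1B40: {"pre", "post"},
    0x1B41: {"pre", "post"},
    0x1B3B: {"below", "post"},
    0x1B3C: {"below", "above"},
    0x1B3D: {"below", "above", "post"},
    0x1B43: {"above", "post"},
}


def circumgraphs() -> list[int]:
    """Vowel signs whose glyphs appear on more than one side of the base."""
    return sorted(cp for cp, sides in PLACEMENT.items() if len(sides) > 1)


def max_sides() -> int:
    """Largest number of sides occupied by a single vowel sign."""
    return max(len(sides) for sides in PLACEMENT.values())


print(len(PLACEMENT), "vowel signs and vocalics")
print([f"U+{cp:04X}" for cp in circumgraphs()])
print("maximum sides:", max_sides())
```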

Vowel length

The sources are not very clear about whether Balinese vowels vary in length during pronunciation (see Vowel sounds). The Balinese vowel sign repertoire does, however, contain glyphs that distinguish between short and long vowels (see Vowel summary table).

Nasalisation

If Balinese nasalises any vowel sounds, it is not explicitly marked in the orthography.

Standalone vowels

Standalone vowels are vowel sounds that are not preceded by a consonant sound, or are preceded by only a glottal stop. They may appear at the beginning of a word or in the middle of a word after a preceding vowel.

Balinese has 2 ways to represent standalone vowels: using independent vowels, or using vowel signs.

Independent vowels


i | i | 1B07
ị̄ | 1B08
u | u | 1B09
ụ̄ | 1B0A
e ɛ | é | 1B0F
o ɔ | o | 1B11
a | a | 1B05
ɑː | a | ɑ̣̄ | 1B06

aːi | ai | ạʲ | 1B10
aːu | o | ạʷ | 1B12

At the beginning of a word, most standalone vowels are represented using one of the 10 independent vowel characters. The set includes a character to represent the inherent vowel sound.

ᬉᬱᬥ usadə traditional medicine

ᬆᬤᬶ adi first

The vowel signs for ə (U+1B42 VOWEL SIGN PEPET) and əː (U+1B43 VOWEL SIGN PEPET TEDUNG) don't have an independent form, and have to be used after U+1B33 LETTER HA at the beginning of a word, ie. ᬳᭂ U+1B33 LETTER HA + U+1B42 VOWEL SIGN PEPET and ᬳᭃ U+1B33 LETTER HA + U+1B43 VOWEL SIGN PEPET TEDUNG, respectively, eg.

ᬳᭂᬫ᭄ᬧᬢ᭄ əm.pat four
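
The choice between an independent vowel letter and the HA carrier can be sketched in Python as follows. The pairing of vowel signs with independent letters is taken from the vowel summary table by matching sounds, and the function names are hypothetical. The sketch also ignores the fact that word-initial vowels are sometimes written with the HA carrier as well.

```python
HA = "\u1B33"  # BALINESE LETTER HA, used as a vowel carrier

# Vowel sign -> independent vowel letter with the same sound (assumed pairing).
INDEPENDENT_FOR_SIGN = {
    "\u1B36": "\u1B07",  # ULU -> IKARA
    "\u1B37": "\u1B08",  # ULU SARI -> IKARA TEDUNG
    "\u1B38": "\u1B09",  # SUKU -> UKARA
    "\u1B39": "\u1B0A",  # SUKU ILUT -> UKARA TEDUNG
    "\u1B3E": "\u1B0F",  # TALING -> EKARA
    "\u1B40": "\u1B11",  # TALING TEDUNG -> OKARA
    "\u1B35": "\u1B06",  # TEDUNG -> AKARA TEDUNG
    "\u1B3F": "\u1B10",  # TALING REPA -> AIKARA
    "\u1B41": "\u1B12",  # TALING REPA TEDUNG -> OKARA TEDUNG
}


def word_initial_vowel(sign: str) -> str:
    """Use the independent letter if one exists, otherwise HA + vowel sign."""
    return INDEPENDENT_FOR_SIGN.get(sign, HA + sign)


def word_internal_vowel(sign: str) -> str:
    """Inside a word, attach the vowel sign to the HA carrier."""
    return HA + sign


PEPET = "\u1B42"
print([f"U+{ord(c):04X}" for c in word_initial_vowel(PEPET)])     # no independent form
print([f"U+{ord(c):04X}" for c in word_initial_vowel("\u1B36")])  # independent IKARA
```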

In Sasak, independent vowel U+1B05 LETTER AKARA can be followed by an explicit U+1B44 ADEG ADEG in word- or syllable-final position, where it indicates the glottal stop. Other consonants can also be subjoined to it. eg. ᬳᬫᬅ᭄ hmạ͓ amaʔ

Vowel signs


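ᬳᬶ | i | i | hi | 1B33 + 1B36
ᬳᬷ | 1B33 + 1B37
ᬳᬸ | u | u | hu | 1B33 + 1B38
ᬳᬹ | 1B33 + 1B39
ᬳᬾ | e ɛ | e | he | 1B33 + 1B3E
ᬳᭀ | o ɔ | ho | 1B33 + 1B40
ᬳᭂ | ə | 1B33 + 1B42
ᬳᭃ | əː | hə̄ | 1B33 + 1B43
ᬳᬵ | ɑː | hɑ̄ | 1B33 + 1B35

ᬳᬿ | aːi | haʲ | 1B33 + 1B3F
ᬳᭁ | aːu | haʷ | 1B33 + 1B41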

Typically, a standalone vowel is represented by a vowel sign attached to U+1B33 LETTER HA, which acts as a carrier, eg. ᬤᬳᬾᬭᬄ daerah development

Without a vowel sign the letter U+1B33 LETTER HA may represent a, eg. ᬳᬮᬲ᭄ alas forest

However, it may be unclear from the written text whether U+1B33 LETTER HA represents the sound h or is used as a carrier for a vowel, eg. compare ᬳᬶᬕ higa rib and ᬳᬶᬕᭂᬮ᭄ igel dance.

Vowel sounds to characters

This section maps Balinese vowel sounds to common graphemes in the Balinese orthography. Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

Vowel signs are post-consonant, dependent vowels. Independent vowels are usually only used in word-initial position. Word-internal standalone vowels (and word-initial in the case of ə and əː) use the vowel sign over a silent U+1B33 LETTER HA. Vowel signs that decompose are shown only in precomposed form.

Light coloured characters occur infrequently.

Plain vowels

ə

inherent vowel at the end of a word and also in prefixes ma-, pa- and da-.

vowel sign U+1B42 VOWEL SIGN PEPET

medial standalone ᬳᭂU+1B33 LETTER HA + U+1B42 VOWEL SIGN PEPET

a

inherent vowel eg. ᬅᬯᬢᬵᬭ awatarə avatar

independent U+1B05 LETTER AKARA

ɑː

Diphthongs and other combinations

Vocalics

Vocalics are letters derived from Sanskrit that generally behave like vowels, but represent r/l followed by a vowel. They are often available both as vowel signs and independent vowel letters.


r̥̣ | 1B0B
rəː | r̥̣̄ | 1B0C
le | l̥̣ | 1B0D
ləː | l̥̣̄ | 1B0E
1B3A
rəː | r̥̄ | 1B3B
1B3C
ləː | l̥̄ | 1B3D

At the beginning of a syllable following a vowel the standalone form of the vocalic is used, eg.

ᬓᭂᬋᬂ kěrěng eat a lot

ᬢᬍᬃ taler therefore

As a second component in a consonant cluster, the vocalic has a postfixed form and a subjoined form. The examples that follow are for the sound rə.

When the sound occurs directly after a syllable-final consonant, ie. as the onset of a new syllable, the sequence of Unicode characters is consonant + U+1B44 ADEG ADEG + U+1B0B LETTER RA REPA. This produces the conjoined (postfix) form ᭄ᬋ, eg.

ᬧᬓ᭄ᬋᬋᬄ Pak Rěrěh Mr Rereh

When the sound occurs after a syllable-initial consonant, ie. when it occurs as a medial consonant within the same syllable, the sequence of characters is simply consonant + U+1B3A VOWEL SIGN RA REPA, using the vowel sign. This produces the subjoined form, eg.

ᬓᬺᬰ᭄ᬡ Krĕsna Krishna
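
The two encodings of ra repa after a consonant can be sketched as a pair of small Python helpers. The function names are hypothetical; the code point sequences follow the two patterns described above.

```python
ADEG_ADEG = "\u1B44"       # BALINESE ADEG ADEG
LETTER_RA_REPA = "\u1B0B"  # independent (letter) form
SIGN_RA_REPA = "\u1B3A"    # vowel sign form


def ra_repa_after_final(consonant: str) -> str:
    """Syllable-final consonant followed by ra repa as a new onset (conjoined form)."""
    return consonant + ADEG_ADEG + LETTER_RA_REPA


def ra_repa_medial(consonant: str) -> str:
    """Ra repa as a medial within the same syllable (subjoined form)."""
    return consonant + SIGN_RA_REPA


KA = "\u1B13"
print([f"U+{ord(c):04X}" for c in ra_repa_after_final(KA)])
print([f"U+{ord(c):04X}" for c in ra_repa_medial(KA)])
```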

Consonants

Consonant summary table

The following table summarises the main consonant to character assignments.

Consonants used for native Balinese words are shown in the left-hand column. On the right are consonants used for words from Kawi, Sanskrit, and other languages.

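Onsets

Native Balinese sounds:
p | p | p | 1B27
b | b | b | 1B29
t | t | t | 1B22
d | d | d | 1B24
k | k | k | 1B13
ɡ | g | g | 1B15

t͡ʃ | c | ʧ | 1B18
d͡ʒ | j | ʤ | 1B1A

s | s | s | 1B32
h ∅ | h ∅ | h | 1B33

m | m | m | 1B2B
n | n | n | 1B26
ŋ | ng | ŋ | 1B17
ɲ | nya | ɲ | 1B1C

w | w | w | 1B2F
r | r | r | 1B2D
l | l | l | 1B2E
j | y | y | 1B2C

Used for Kawi, Sanskrit, etc. loan words:
p | p ph | 1B28
b | b bh | 1B2A
t | t | T | 1B1E
t | t ṭ | 1B1D
t | t th | 1B23
d | D | 1B1F
d | 1B20
d | d ḍ dh | 1B25
k | K | 1B14 (rare)
ɡ | gh | 1B16

ɖ | ᬤ᬴ | 1B24 + 1B34 (rare)
ʔ | ᬗ᬴ | ŋˑ | 1B17 + 1B34 (loan)

t͡ʃ | ᭄ᬙ | ͞T͡Ʃ | 1B44 + 1B19 (rare)
d͡ʒ | jh | D͡Ʒ | 1B1B (rare)

s | s sy | 1B30
s | s ṣ | 1B31

f | ᬧ᬴ | f | 1B27 + 1B34 (rare)
v | ᬯ᬴ | 1B2F + 1B34 (rare)
z | ᬚ᬴ | ʤˑ | 1B1A + 1B34 (rare)
x | ᬓ᬴ | 1B13 + 1B34 (rare)
ɣ | ᬕ᬴ | 1B15 + 1B34 (rare)
ħ | ᬳ᬴ | 1B33 + 1B34 (rare)

n | n ṇ | 1B21

Medials

-w- | ᭄ᬯ | 1B44 + 1B2F
-r- | ᭄ᬭ | 1B44 + 1B2D
-rə | 1B3A
-l- | ᭄ᬮ | 1B44 + 1B2E
-j- | ᭄ᬬ | 1B44 + 1B2C

Finals

ng | ŋ̽ | 1B02
-r | r | 1B03
-h | h | 1B04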

For more details see Consonant sounds to characters.

Basic consonants

Balinese uses 18 basic consonants known as aksara wreṣāstra (ᬅᬓ᭄ᬱᬭᬯᬺᬱᬵᬲ᭄ᬢ᭄ᬭ).


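p | p | p | 1B27
b | b | b | 1B29
t | t | t | 1B22
d | d | d | 1B24
k | k | k | 1B13
ɡ | g | g | 1B15
t͡ʃ | c | ʧ | 1B18
d͡ʒ | j | ʤ | 1B1A
s | s | s | 1B32
h ∅ | h ∅ | h | 1B33
m | m | m | 1B2B
n | n | n | 1B26
ŋ | ng | ŋ | 1B17
ɲ | nya | ɲ | 1B1C
w | w | w | 1B2F
r | r | r | 1B2D
l | l | l | 1B2E
j | y | y | 1B2C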

The characters listed here (and in the following sections) also have subjoined/conjoined shapes, which may differ significantly from those shown here. See Consonant clusters for a list of glyph shapes.

U+1B33 LETTER HA at the beginning of a word or after a preceding vowel is mostly used as a support for a vowel sign (see Standalone vowels), and is not pronounced or transcribed. Word-finally with a suffix vowel, however, it is transcribed [4].

Additional/honorific consonants

These are called aksara sualalita (ᬅᬓ᭄ᬱᬭᬰ᭄ᬯᬮᬮᬶᬢ).

Many of the additional consonants are commonly used in words originating from Arabic and Dutch, and are most common in north Bali and Lombok. When used in pure Balinese words, they are similar to capital letters and are used to create an honorific effect. There are similar characters in Javanese.

They don't add any consonant sounds to the Balinese repertoire. In words originating from Sanskrit, Old Javanese, or Old Balinese, they represent aspirated or other consonants [4].

Additional consonants used for Sanskrit words.


t | 1B1E
d | 1B1F
d | 1B20
k | 1B14
᭄ᬙ | t͡ʃ | T͡Ʃ | 1B44 + 1B19 (rare)
d͡ʒ | 1B1B

Additional consonants used for words from Kawi.


p | p ph | 1B28
b | b bh | 1B2A
t | t ṭ | 1B1D
t | t th | 1B23
d | d ḍ dh | 1B25
ɡ | gh | 1B16
s | s sy | 1B30
s | s ṣ | 1B31
n | n ṇ | 1B21

The following are particularly noteworthy points about certain characters listed above. More details for each character can be revealed by clicking on the lists above. See also the sound to character mapping table.

Two consonants, U+1B14 LETTER KA MAHAPRANA and U+1B19 LETTER CA LACA, are considered very rare, and one other, U+1B1B LETTER JA JERA, seems to be known from only one word:

ᬦᬶᬃᬛᬭ nirjhara pond

(It is possible that an original ai may have been lost in Balinese, to be replaced by the glyph for jʰa.)

A number of the Sanskrit or Kawi consonants are rather poorly attested. The letter U+1B19 LETTER CA LACA is only found in non-initial position following U+1B18 LETTER CA, ie. ᬘ᭄ᬙ c͓C. Most of the letters that originally represented retroflex sounds are often omitted in books about the script.

Rerekan

The combining mark U+1B34 SIGN REREKAN is used, as is a similar sign in Javanese, to extend the character repertoire for foreign sounds. However, according to Perdana113 the use of this sign is specific to Lombok texts, and even there its use is sporadic and inconsistent. While the sign can theoretically be used in Balinese settings, common Balinese users would not be familiar with the sign and normally render foreign consonants using the nearest sounding native sound without any additional markings.

See Perdana p13 for many more details.

The first 7 of the 8 combinations listed below are attested in Library of Congress transliterations and in earlier Sasak orthography. The 8th, ᬤ᬴U+1B24 LETTER DA + U+1B34 SIGN REREKAN could be used for one-to-one transliteration for Javanese ɖ.


8
ᬧ᬴rareff1B27
1B34
ᬯ᬴rarev 1B2F
1B34
ᬚ᬴rarez ʤˑ1B1A
1B34
ᬓ᬴rarex 1B13
1B34
ᬕ᬴rareɣ 1B15
1B34
ᬳ᬴rareħ 1B33
1B34
ᬗ᬴loanʔ ŋˑ1B17
1B34
ᬤ᬴rareɖ 1B24
1B34

In rendering, the dots of these letters appear above the top character, which can cause some ambiguity in reading. The following are all visually indistinguishable: ᬓ᬴᭄ᬚ kˑ͓ʤ xja ᬓ᭄ᬚ᬴ k͓ʤˑ kza ᬓ᬴᭄ᬚ᬴ kˑ͓ʤˑ xza
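Although the three sequences above look the same when rendered, they remain distinct in memory. The following is a minimal Python sketch of that point, assuming only the code points named in this section; the variable names and the `code_points` helper are illustrative, not part of any library.

```python
import unicodedata

KA, JA = "\u1b13", "\u1b1a"
REREKAN, ADEG_ADEG = "\u1b34", "\u1b44"

# The three visually indistinguishable sequences described in the text.
sequences = {
    "xja": KA + REREKAN + ADEG_ADEG + JA,
    "kza": KA + ADEG_ADEG + JA + REREKAN,
    "xza": KA + REREKAN + ADEG_ADEG + JA + REREKAN,
}

def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

for name, seq in sequences.items():
    print(name, code_points(seq))

# Normalisation does not merge them: a search for one will not find the others.
distinct = {unicodedata.normalize("NFC", seq) for seq in sequences.values()}
```

An application that needs to treat these as equivalent would have to add its own matching rule; nothing in Unicode normalisation does so.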

In practice these combinations are probably rather rare.

Sasak

In recent times, Sasak users abandoned the use of the Javanese-influenced rerekan in favour of a series of modified letters (see above), making use, in addition, of some of the unused Kawi letters for the Arabic sounds. In place of ᬓ᬴ x and ᬕ᬴ ɣ, for instance, the new fusion of KA and HA, U+1B46 LETTER KHOT SASAK, and the Kawi letter U+1B16 LETTER GA GORA are used.

See Perdana p15 for many more details.


7
unused1B45
unused1B46
unused1B47
unused1B48
unused1B49
unused1B4A
unused1B4B

Onsets


5
᭄ᬯw  1B44
1B2F
᭄ᬭr  1B44
1B2D
1B3A
᭄ᬮl  1B44
1B2E
᭄ᬬj  1B44
1B2C

The consonants ya, ra, la and wa regularly appear immediately after the initial consonant in a syllable. Unlike Javanese, Balinese has no special characters for these medial sounds (other than the vocalics mentioned earlier); they are just written using the normal approach for dealing with consonant clusters. These shapes are called pangangge aksara (ᬧᬗ᭢​ᬗ᭄ᬕᬅᬓ᭄ᬱᬭ).

ᬓ᭄ᬭᬫ krama member

Multiple medials can occur: r or l can be followed by w or y, eg.

ᬩ᭄ᬭ᭄ᬬᬕ᭄ bryag laughter

In addition, the vocalics can produce consonant sounds (tied to a specific vowel) in medial position, eg.

ᬓᬺᬰ᭄ᬡ Krĕsna Krishna

See Consonant clusters for more details on shaping of glyphs.

Finals

Normally, syllable and word-final consonant sounds with no following consonant are represented using an ordinary consonant character followed by U+1B44 ADEG ADEG. For example,

ᬓᬵᬤᭂᬧ᭄ kādĕp sold

ᬓᬧᬮ᭄ kapal ship

If the consonant is followed by another consonant, either in the middle or at the end of a word, the adeg adeg code point remains, but becomes invisible as the consonant shapes combine vertically or horizontally (see Consonant clusters).

Combining marks

However, there is also a set of combining marks for syllable-final consonants that don't need to be followed by the adeg adeg.


3
-hh1B04
ngŋ̽1B02
-rr1B03

U+1B02 SIGN CECEK and U+1B04 SIGN BISAH only appear at the end of a word, eg.

ᬓᭂᬋᬂ kěrěng eat a lot

ᬫᬗᬄ mangah logic

unless the word involves repetition, eg.

ᬘᬾᬂᬘᬾᬂ cengceng musical instrument

U+1B03 SIGN SURANG can appear at the end of any syllable.

ᬓᬃᬡ karna ear

A syllable-final diacritic may appear above a stack. It is typed and stored after the other components in the stack, eg. ᬩᬗ᭄ᬓᬸᬂ bangkung pig

When the syllable has a spacing vowel sign, any above-base final-consonant mark appears over the base character, rather than over the vowel sign. This is positioned by the font; the final consonant mark is still typed and stored after the other syllable components, eg. ᬕᭂᬤᭀᬂ ɡədoŋ building

See also Modre symbols.

Consonant clusters

A consonant cluster is a sequence of consonant sounds with no intervening vowels.

A conjunct is a consonant cluster where the lack of intervening vowels is indicated by one or more of stacking, changing and merging the shapes of the constituent letter forms (usually in abugidas). Not all consonant clusters are displayed as conjuncts.

The absence of a vowel sound between two or more consonants is visually indicated in one of the following ways.

  1. Stacked consonants, where the non-initial (subjoined) consonant appears below the initial, often with a different shape from normal.
  2. Conjoined consonants, where consonants sit side-by-side but the non-initial consonant has a slightly different form than usual.
  3. A visible adeg adeg following the initial consonant.

See also Finals for a dedicated final consonant mark followed by a regular consonant.

Word boundaries. Conjuncts span word boundaries. Because there are no spaces between words, a cluster is created when a consonant with no following vowel at the end of a word is followed by a consonant at the beginning of the next word.

ᬓᬳᬦᬦ᭄ᬮᬦ᭄ᬓ᭄ᬯᬲ
In the sequence of words kahanan lan kwasa the initial consonant of each word is subjoined below the final consonant of the preceding word.

Stacks and conjoined sequences are not normally split at line ends (see Word boundaries and Line breaking & hyphenation for the ramifications of this).

Conjunct formation

See a table of 2-consonant clusters.
The table allows you to test results for various fonts.

Stacked and conjoined consonant clusters are referred to as conjuncts.


1B44

In Unicode, the stacking and conjoining behaviour is achieved by adding U+1B44 ADEG ADEG between the consonants. The font hides the glyph automatically when a conjunct is formed.

In some cases, however, the adeg adeg remains visible (see Visible adeg adeg).

Stacking

To represent consonant sounds without intervening vowels, the non-initial consonant letter is typically drawn below the initial consonant letter, and with a slightly different shape. These subjoined forms are called gantungan (ᬕᬦ᭄ᬢᬸᬗᬦ᭄).

Many of the subjoined forms are just slightly smaller versions of the original, but several have very different shapes altogether, most of which ligate with the cluster initial consonant by joining strokes.

There can be up to 3 consonants combined in this way, but the third consonant must be one of ya, ra, la or wa.
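As a rough illustration of the encoding behind a stack, the Python sketch below joins consonants with U+1B44 ADEG ADEG and applies the limit just described. The function name `make_stack` and its error messages are hypothetical; the sketch checks only the count and the third-consonant rule, not whether a given cluster actually occurs in Balinese.

```python
ADEG_ADEG = "\u1b44"

# ya, ra, la and wa: the only letters allowed as the third consonant.
THIRD_POSITION = {"\u1b2c", "\u1b2d", "\u1b2e", "\u1b2f"}

def make_stack(*consonants):
    """Encode one to three consonants as a stack, with adeg adeg between them."""
    if not 1 <= len(consonants) <= 3:
        raise ValueError("a stack holds one to three consonants")
    if len(consonants) == 3 and consonants[2] not in THIRD_POSITION:
        raise ValueError("the third consonant must be ya, ra, la or wa")
    return ADEG_ADEG.join(consonants)

# b + r + y, as at the start of the word bryag.
bry = make_stack("\u1b29", "\u1b2d", "\u1b2c")
print(bry)
```

The font, not the stored text, decides how the stack is drawn, so the same encoded sequence may look different from font to font.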

The lists below show consonants in their normal and subjoined forms.

Native letters

16
ᬩ᭄ᬩb1B29
1B44
1B29
ᬢ᭄ᬢt1B22
1B44
1B22
ᬤ᭄ᬤd1B24
1B44
1B24
ᬘ᭄ᬘt͡ʃ1B18
1B44
1B18
ᬚ᭄ᬚd͡ʒ1B1A
1B44
1B1A
ᬓ᭄ᬓk1B13
1B44
1B13
ᬕ᭄ᬕg1B15
1B44
1B15
ᬳ᭄ᬳh1B33
1B44
1B33
ᬫ᭄ᬫm1B2B
1B44
1B2B
ᬦ᭄ᬦn1B26
1B44
1B26
ᬜ᭄ᬜɲ1B1C
1B44
1B1C
ᬗ᭄ᬗŋ1B17
1B44
1B17
ᬯ᭄ᬯw1B2F
1B44
1B2F
ᬭ᭄ᬭr1B2D
1B44
1B2D
ᬮ᭄ᬮl1B2E
1B44
1B2E
ᬬ᭄ᬬy1B2C
1B44
1B2C
Sanskrit letters

6
ᬞ᭄ᬞt1B1E
1B44
1B1E
ᬟ᭄ᬟd1B1F
1B44
1B1F
ᬠ᭄ᬠd1B20
1B44
1B20
ᬔ᭄ᬔk1B14
1B44
1B14
ᬙ᭄ᬙ t͡ʃ1B19
1B44
1B19
ᬛ᭄ᬛd͡ʒ1B1B
1B44
1B1B
Kawi letters

7
ᬝ᭄ᬝt1B1D
1B44
1B1D
ᬣ᭄ᬣt1B23
1B44
1B23
ᬥ᭄ᬥd1B25
1B44
1B25
ᬖ᭄ᬖg1B16
1B44
1B16
ᬰ᭄ᬰs1B30
1B44
1B30
ᬡ᭄ᬡn1B21
1B44
1B21
ᬪ᭄ᬪb1B2A
1B44
1B2A

Conjoined consonants

In conjoined clusters, the consonant glyphs remain side by side, but the non-initial consonant is reduced on the left side. These conjoined forms are called gempelan (ᬕᬾᬫ᭄ᬧᬾᬮᬦ᭄).

ᬅᬓ᭄ᬱᬭ
The left side of U+1B31 LETTER SA SAPA is reduced when conjoined.

ᬅᬓ᭄ᬱᬭ ak.sa.rə letter, alphabet

This list shows consonants in their normal and conjoined forms.

native letters

3
ᬧ᭄ᬧp1B27
1B44
1B27
ᬲ᭄ᬲs1B32
1B44
1B32
ᬋ᭄ᬋ1B0B
1B44
1B0B
Kawi letters

both
ᬨ᭄ᬨp1B28
1B44
1B28
ᬱ᭄ᬱs1B31
1B44
1B31

The conjoined U+1B32 LETTER SA is unusual in that it also adds a stroke below the initial consonant (see Figure 5). This helps distinguish it from the conjoined p.

ᬧᬓ᭄ᬲ
U+1B32 LETTER SA when conjoined not only loses some of its left side but also adds a glyph below the initial consonant.

ᬧᬓ᭄ᬲ paksa force

Visible adeg adeg

Because there is no word separator, consonants at the end of one word and beginning of the following word are normally stacked, too.

In some cases this leads to ambiguity about whether this is one or two words. If you really want to make clear which is which, you can use an explicit adeg-adeg, eg. compare ᬧᬓ᭄ᬭᬫᬦ᭄ pakraman membership ᬧᬓ᭄‌ᬭᬫᬦ᭄ Pak Raman Mr Raman

The Unicode Standard recommends the use of ‌U+200C ZERO WIDTH NON-JOINER (ZWNJ) after the adeg-adeg in order to prevent conjunct formation. However, not many people understand the function of ZWNJ or can access it easily from the keypad. It also doesn't introduce line-break opportunities. A better solution may be to use ​U+200B ZERO WIDTH SPACE (ZWSP). This character is needed anyway on most systems in order to allow line-breaking, and it appears to work equally well for this.
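The two options can be compared in a short Python sketch. The helper `join_words` is hypothetical and deliberately simple: it only looks for a word-final adeg adeg, and it leaves the choice between ZWNJ and ZWSP to the caller.

```python
ADEG_ADEG = "\u1b44"
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def join_words(first, second, separator=ZWNJ):
    """Join two words, keeping a word-final adeg adeg visible.

    If the first word does not end in adeg adeg, no conjunct can form
    across the boundary, so the words are simply concatenated.
    """
    if first.endswith(ADEG_ADEG):
        return first + separator + second
    return first + second

pak = "\u1b27\u1b13\u1b44"          # pa + ka + adeg adeg
raman = "\u1b2d\u1b2b\u1b26\u1b44"  # ra + ma + na + adeg adeg

stacked = pak + raman                   # reads as pakraman
with_zwnj = join_words(pak, raman)      # Pak Raman, no line-break opportunity
with_zwsp = join_words(pak, raman, ZWSP)  # Pak Raman, line may break here
```

Which separator is appropriate depends on whether a line break should be allowed at that point, as discussed above.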

A somewhat ambiguous situation arises where conventions prevent certain combinations stacking. For example, the name of the village tamblung should not stack the mbl, but should look as follows.

ᬢᬫ᭄‌ᬩ᭄ᬮᬂ

The Unicode Standard advises using a zero-width non-joiner after ma to achieve this.

Observation: Note that this may also be achieved by intelligence in the font, as was actually the case when I generated this example (click on it to see). It's not clear to me what is the preferred approach: put ZWNJ in only when the font doesn't do what you want, or use it always. The latter may lead to more consistent content where different fonts are applied to the text (eg. after cut and paste). In theory, this shouldn't affect searching and sorting, although some applications may not ignore the ZWNJ as they should.

Dedicated final marks

Balinese represents some final consonants using dedicated marks (see Finals). Such final marks are followed by ordinary consonant shapes in consonant clusters. There is no visual indication of missing vowel sounds other than the use of the mark itself.

ᬓᬃᬡ
A cluster involving a dedicated final mark doesn't form a conjunct.

ᬓᬃᬡ karna ear

Consonant sounds to characters

This section maps Balinese consonant sounds to common graphemes in the Balinese orthography.

The table distinguishes between native Balinese letters and letters borrowed from Sanskrit or Kawi, or extended with rerekan. The right-hand edge shows how conjuncts look by doubling up the letter with an adeg adeg between.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc. Light coloured characters occur infrequently.

p

ᬧ᭄ᬧ basic U+1B27 LETTER PA

ᬨ᭄ᬨ kawi U+1B28 LETTER PA KAPAL in Kawi loan words.

b

ᬩ᭄ᬩ basic U+1B29 LETTER BA

ᬪ᭄ᬪ kawi U+1B2A LETTER BA KEMBANG in Kawi loan words.

t

ᬢ᭄ᬢ basic U+1B22 LETTER TA

ᬞ᭄ᬞ honorific U+1B1E LETTER TA MURDA MAHAPRANA

ᬝ᭄ᬝ kawi U+1B1D LETTER TA LATIK in Kawi loan words.

ᬣ᭄ᬣ kawi U+1B23 LETTER TA TAWA in Kawi loan words.

t͡ʃ

ᬘ᭄ᬘ basic U+1B18 LETTER CA

ᬙ᭄ᬙ honorific ᭄ᬙU+1B44 ADEG ADEG + U+1B19 LETTER CA LACA Very rare. Only found in subjoined form.

d

ᬤ᭄ᬤ basic U+1B24 LETTER DA

ᬥ᭄ᬥ kawi U+1B25 LETTER DA MADU in Kawi loan words.

ᬟ᭄ᬟ honorific U+1B1F LETTER DA MURDA ALPAPRANA

ᬠ᭄ᬠ honorific U+1B20 LETTER DA MURDA MAHAPRANA

d͡ʒ

ᬚ᭄ᬚ basic U+1B1A LETTER JA

ᬛ᭄ᬛ honorific U+1B1B LETTER JA JERA Used in one word only.

ɖ

extension ᬤ᬴U+1B24 LETTER DA + U+1B34 SIGN REREKAN Used in Lombok texts, but even then only sporadically.

k

ᬓ᭄ᬓ basic U+1B13 LETTER KA

ᬔ᭄ᬔ honorific U+1B14 LETTER KA MAHAPRANA Very rare.

ɡ

ᬕ᭄ᬕ basic U+1B15 LETTER GA

ᬖ᭄ᬖ kawi U+1B16 LETTER GA GORA in Kawi loan words.

f

extension ᬧ᬴U+1B27 LETTER PA + U+1B34 SIGN REREKAN Used in Lombok texts, but even then only sporadically.

v

extension ᬯ᬴U+1B2F LETTER WA + U+1B34 SIGN REREKAN Used in Lombok texts, but even then only sporadically.

s

ᬲ᭄ᬲ basic U+1B32 LETTER SA

 

ᬰ᭄ᬰ kawi U+1B30 LETTER SA SAGA in Kawi loan words.

 

ᬱ᭄ᬱ kawi U+1B31 LETTER SA SAPA in Kawi loan words.

z

extension ᬚ᬴U+1B1A LETTER JA + U+1B34 SIGN REREKAN Used in Lombok texts, but even then only sporadically.

x

extension ᬓ᬴U+1B13 LETTER KA + U+1B34 SIGN REREKAN Used in Lombok texts, but even then only sporadically.

ɣ

extension ᬕ᬴U+1B15 LETTER GA + U+1B34 SIGN REREKAN Used in Lombok texts, but even then only sporadically.

ħ

extension ᬳ᬴U+1B33 LETTER HA + U+1B34 SIGN REREKAN Used in Lombok texts, but even then only sporadically.

h

ᬳ᭄ᬳ basic U+1B33 LETTER HA

codaU+1B04 SIGN BISAH

m

ᬫ᭄ᬫ basic U+1B2B LETTER MA

codaU+1B00 SIGN ULU RICEM Holy letter, only used in Sanskrit texts.

n

ᬦ᭄ᬦ basic U+1B26 LETTER NA

ᬡ᭄ᬡ kawi U+1B21 LETTER NA RAMBAT in Kawi loan words.

ɲ

ᬜ᭄ᬜ basic U+1B1C LETTER NYA

ŋ

ᬗ᭄ᬗ basic U+1B17 LETTER NGA

codaU+1B02 SIGN CECEK

codaU+1B01 SIGN ULU CANDRA Holy letter, only used in Sanskrit texts.

w

ᬯ᭄ᬯ basic U+1B2F LETTER WA

r

ᬭ᭄ᬭ basic U+1B2D LETTER RA

codaU+1B03 SIGN SURANG

ᬋ᭄ᬋ vocalic U+1B0B LETTER RA REPA

medial U+1B3A VOWEL SIGN RA REPA

l

ᬮ᭄ᬮ basic U+1B2E LETTER LA

j

ᬬ᭄ᬬ basic U+1B2C LETTER YA

Symbols

Modre symbols

Two combining marks have a specialist usage related to (usually religious) Sanskrit words.


both
infreq.-m 1B00
infreq. ŋ̇̽1B01

U+1B00 SIGN ULU RICEM when combined with certain syllables becomes part of the Aksara Modre, or holy letters, which are used to write words in Sanskrit, usually part of prayers. This character only appears in Sanskrit texts, eg. ᬰᬶᬤ᭄ᬥᬀ siddham

U+1B01 SIGN ULU CANDRA appears only in holy letters, eg. ᬫᬁ mŋ̽ (Mang). When combined with independent vowel ạʷ it becomes a special symbol called omkara and is pronounced m. In this form it is used to represent god, eg. ᬒᬁᬱᬦ᭄ᬢᬶ᭞ᬱᬦ᭄ᬢᬶ᭞ᬱᬦ᭄ᬢᬶ᭞ᬒᬁ omsanti,santi,santi,om May peace be everywhere

᭜ᬁ   ᭟ᬁ   ᭛ᬁ.
Modre symbols that include ulu candra.

Musical marks and symbols

The other symbols in the Balinese block are all musical symbols, and are not described here.


19
1B61
1B62
1B63
1B64
1B65
1B66
1B67
1B68
1B69
1B6A
1B74
1B75
1B76
1B77
1B78
1B79
1B7A
1B7B
1B7C

There is also a set of musical diacritical marks, which are not described here.


9
1B6B
1B6C
1B6D
1B6E
1B6F
1B70
1B71
1B72
1B73

For an in-depth look at musical symbols in Balinese see Perdana.

Encoding choices

Balinese is a script where different sequences of Unicode characters may produce the same visual result. Here we look at those related to vowels.

Encoding vowel signs

Five of the circumgraphs can be written as a single character, or as two characters, the second being U+1B35 VOWEL SIGN TEDUNG in all cases.

Atomic Decomposed
U+1B40 VOWEL SIGN TALING TEDUNG ᭀU+1B3E VOWEL SIGN TALING + U+1B35 VOWEL SIGN TEDUNG
U+1B43 VOWEL SIGN PEPET TEDUNG ᭃU+1B42 VOWEL SIGN PEPET + U+1B35 VOWEL SIGN TEDUNG
U+1B41 VOWEL SIGN TALING REPA TEDUNG ᭁU+1B3F VOWEL SIGN TALING REPA + U+1B35 VOWEL SIGN TEDUNG
U+1B3B VOWEL SIGN RA REPA TEDUNG ᬻU+1B3A VOWEL SIGN RA REPA + U+1B35 VOWEL SIGN TEDUNG
U+1B3D VOWEL SIGN LA LENGA TEDUNG ᬽU+1B3C VOWEL SIGN LA LENGA + U+1B35 VOWEL SIGN TEDUNG

The single code point per vowel sign is preferred. However, the parts are separated in Unicode Normalisation Form D (NFD) and recomposed in Unicode Normalisation Form C (NFC), so both approaches are canonically equivalent.

Whichever approach is used, the vowel signs must be typed and stored after the consonant characters they surround, and in left to right order.
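The canonical equivalence of the atomic and decomposed spellings can be checked with Python's standard `unicodedata` module. This is a small sketch using one row of the table above (taling tedung on the consonant ka); the variable names are illustrative.

```python
import unicodedata

KA = "\u1b13"
TALING, TEDUNG = "\u1b3e", "\u1b35"
TALING_TEDUNG = "\u1b40"

atomic = KA + TALING_TEDUNG        # single code point for the vowel sign
decomposed = KA + TALING + TEDUNG  # two code points, typed left to right

# The raw strings differ, so a naive comparison fails...
print(atomic == decomposed)

# ...but normalising both to the same form makes them compare equal.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", atomic)
print(nfc == atomic, nfd == decomposed)
```

Applications that compare or search Balinese text would typically normalise both sides to the same form first.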

Encoding independent vowels

Four of the independent vowels can be written as a single character, or as two. The alternatives are regarded as canonically equivalent in Unicode. Again, this always involves U+1B35 VOWEL SIGN TEDUNG.

Atomic Decomposed
U+1B08 LETTER IKARA TEDUNG ᬈU+1B07 LETTER IKARA + U+1B35 VOWEL SIGN TEDUNG
U+1B0A LETTER UKARA TEDUNG ᬊU+1B09 LETTER UKARA + U+1B35 VOWEL SIGN TEDUNG
U+1B12 LETTER OKARA TEDUNG ᬒU+1B11 LETTER OKARA + U+1B35 VOWEL SIGN TEDUNG
U+1B06 LETTER AKARA TEDUNG ᬆU+1B05 LETTER AKARA + U+1B35 VOWEL SIGN TEDUNG

The precomposed characters decompose in NFD, and reform again in NFC. It is generally recommended to use the precomposed character.
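The decompositions in the table can also be read directly from the Unicode Character Database. The sketch below uses `unicodedata.decomposition` and `unicodedata.name` from the Python standard library; the helper `decomposition_row` is a hypothetical name introduced here for illustration.

```python
import unicodedata

# The four precomposed independent vowels listed in the table above.
PRECOMPOSED = ["\u1b06", "\u1b08", "\u1b0a", "\u1b12"]

def decomposition_row(char):
    """Return (name, [names of the characters it decomposes to])."""
    parts = unicodedata.decomposition(char).split()
    names = [unicodedata.name(chr(int(part, 16))) for part in parts]
    return unicodedata.name(char), names

for char in PRECOMPOSED:
    name, parts = decomposition_row(char)
    print(name, "=", " + ".join(parts))
```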

Combining mark order

The following indicates the expected ordering of Unicode characters within a Balinese combining character sequence. The labels are those used for the Unicode Indic Syllabic Categories. Follow the links to see what characters are represented by a given label.

Balinese has 2 types of combining character sequence (CCS).

The first type is a base plus Virama. This is the non-final part of a consonant cluster or a consonant with a killed vowel, and consists of just the base and the virama.

The general CCS type uses the following preferred ordering after a base.

  1. Nukta
  2. Vowel_Dependent (15)
  3. Bindu (3) | Visarga | Consonant_Final

Ordering characters as shown above avoids potential ambiguities and maximises the likelihood of success when rendering the text.
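The preferred ordering can be expressed as a regular expression. The Python sketch below is an approximation under stated assumptions: the code point ranges are a reading of the character lists in this document rather than values taken from the Unicode data files, any number of dependent vowels is allowed so that decomposed vowel signs pass, and the name `is_well_ordered` is hypothetical.

```python
import re

# Assumed ranges, based on the characters listed in this document.
BASE = "\u1b05-\u1b33\u1b45-\u1b4b"   # independent vowels and consonants
NUKTA = "\u1b34"                      # rerekan
VOWEL_DEPENDENT = "\u1b35-\u1b43"     # the 15 vowel signs
FINAL = "\u1b00-\u1b04"               # Bindu (3), Consonant_Final, Visarga
VIRAMA = "\u1b44"                     # adeg adeg

# Type 1: base + virama. Type 2: base + nukta? + vowels* + one final mark?
CCS = re.compile(
    f"[{BASE}](?:{VIRAMA}|[{NUKTA}]?[{VOWEL_DEPENDENT}]*[{FINAL}]?)"
)

def is_well_ordered(sequence):
    """True if the whole sequence is one CCS in the preferred order."""
    return CCS.fullmatch(sequence) is not None

print(is_well_ordered("\u1b13\u1b37\u1b03"))  # ka + ulu sari + surang
print(is_well_ordered("\u1b13\u1b03\u1b37"))  # surang typed before the vowel
```

A checker like this could be used to flag text where a final-consonant mark has been typed before the vowel sign.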

Numbers

This section describes typographic features related to digits, dates, currencies, etc.

There is a set of Balinese digits, and they are used in the same way as ASCII digits in Latin text.


10
1B51
1B52
1B53
1B54
1B55
1B56
1B57
1B58
1B59
1B50

However, because many of the digit symbols are indistinguishable from other Balinese letters, numbers are typically surrounded by U+1B5E CARIK SIKI, so that they are clearly distinguished, eg. ᬩᬮᬶ᭞᭓᭞ᬚᬸᬮᬶ᭞᭑᭙᭘᭒᭟ Bali, 3 July 1982
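Because the Balinese digits occupy the contiguous range U+1B50 to U+1B59, conversion is a simple offset. The following Python sketch converts a non-negative integer and wraps it in carik siki as described above; the function names are illustrative.

```python
CARIK_SIKI = "\u1b5e"
BALINESE_ZERO = 0x1B50  # BALINESE DIGIT ZERO; digits 1-9 follow in order

def to_balinese_digits(number):
    """Convert a non-negative integer to a string of Balinese digits."""
    return "".join(chr(BALINESE_ZERO + int(digit)) for digit in str(number))

def wrap_number(number):
    """Surround the Balinese digits with carik siki to set them off from letters."""
    return CARIK_SIKI + to_balinese_digits(number) + CARIK_SIKI

year = to_balinese_digits(1982)
print(wrap_number(3), year)

# Python's int() accepts any Unicode decimal digits, so the round trip works.
print(int(year))
```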

Text direction

Balinese text is written horizontally, left to right.

Show default bidi_class properties for characters in the Balinese orthography described here.

Glyph shaping & positioning

This section describes typographic features related to font/writing styles, cursive text, context-based shaping, context-based positioning, letterform slopes, weights & italics, and case & other character transforms.

You can experiment with examples using the Balinese character app.

Context-based shaping & positioning

Are special glyph forms needed, depending on the context in which a character is used? Do glyphs interact in some circumstances? Are there requirements to position diacritics or other items specially, depending on context? Does the script have multiple diacritics competing for the same location relative to the base?

Balinese text relies on OpenType rules to correctly position glyphs and shape them according to the surrounding text.

One major area where this applies is in the use of conjunct forms for consonant clusters. See the relevant sections for lists of stacked and conjoined shapes.

ᬒᬁᬲ᭄ᬯᬲ᭄ᬢ᭄ᬬᬲ᭄ᬢᬸ
Stacked conjunct forms in the word om swastiastu.

ᬒᬁᬲ᭄ᬯᬲ᭄ᬢ᭄ᬬᬲ᭄ᬢᬸ om swastiastu God bless you

The following is a selection of other examples of contextual shaping and positioning.

After a stacked consonant, the vowel signs that would normally appear below a base are moved to the side, and the shape is modified.

  Composition Example
ᬓ᭄ᬭᬸ + U+1B13 LETTER KA + U+1B44 ADEG ADEG + U+1B2D LETTER RA + U+1B38 VOWEL SIGN SUKU ᬓ᭄ᬭᬸᬦ kruna word
ᬓ᭄ᬬᬹ + U+1B13 LETTER KA + U+1B44 ADEG ADEG + U+1B2C LETTER YA + U+1B39 VOWEL SIGN SUKU ILUT

U+1B35 VOWEL SIGN TEDUNG and the right side of U+1B41 VOWEL SIGN TALING REPA TEDUNG combine with several of the consonants. The table below shows 2 examples.

  Composition Example
ᬳᬵ  + U+1B33 LETTER HA + U+1B35 VOWEL SIGN TEDUNG  
ᬭᬵ + U+1B2D LETTER RA + U+1B35 VOWEL SIGN TEDUNG ᬢᬭᬵ tarə star

When a vowel sign and a syllable-final consonant mark appear over the same base, they are typically drawn side by side. Combinations such as rerekan and above-base vowels are typically stacked.

  Composition Example
ᬓᬷᬃ + U+1B37 VOWEL SIGN ULU SARI + U+1B03 SIGN SURANG ᬢᬷᬃᬢ tirtə holy water
ᬰᬶᬁ + U+1B36 VOWEL SIGN ULU + U+1B01 SIGN ULU CANDRA  

 

Typographic units

Word boundaries

Are words separated by spaces, or other characters? Are there special requirements when double-clicking on the text? Are words hyphenated?

The concept of 'word' is difficult to define in any language (see What is a word?). Here, a word is a vaguely-defined, but recognisable semantic unit that is typically smaller than a phrase and may comprise one or more syllables.

Words are not separated by spaces, and in fact some word boundaries occur between stacked consonants. This means that segmentation for line-breaking, etc. uses orthographic syllables as a unit (see Graphemes).

ᬓᬳᬦᬦ᭄ᬮᬦ᭄ᬓ᭄ᬯᬲ
In this sequence of three words, kahanan + lan + kwasa, the initial letters of both the 2nd and 3rd words are subjoined below the last letter of the previous word.

ᬓᬳᬦᬦ᭄ pŋn kahanan
ᬮᬦ᭄ pŋn lan
ᬓ᭄ᬯᬲ dik kwasa

Graphemes

A grapheme is a user-perceived unit of text. Text operations that use graphemes as a unit of text include line-breaking, forwards deletion, cursor movement & selection, character counts, text spacing, text insertion, justification, case conversions, and sorting. The Unicode Standard uses generalised rules to define 'grapheme clusters', which approximate the likely grapheme boundaries in a writing system; however, they don't work well with many complex scripts.

The term orthographic syllable is not clearly defined in the Unicode Standard. In the orthography notes on this site we define it to mean a typographic unit that includes more than one grapheme cluster. This is commonly the case for Brahmi-derived scripts, such as for Devanagari conjuncts, or Balinese stacks. Orthographic syllables do not correspond to phonetic syllables.

Grapheme clusters alone are not sufficient to represent typographic units in Balinese. Stacks and conjoined sequences are very common and must not be split apart by edit operations that visually change the text (such as letter-spacing, first-letter highlighting, and line breaking). For those operations one needs to segment the text using orthographic syllables, which string grapheme clusters together with U+1B44 ADEG ADEG, which has an Indic Syllabic Category of Virama.

The adeg-adeg is rendered visibly if it is not part of a consonant cluster, for example at the end of a word followed by a space.

Balinese doesn't use word boundaries for text segmentation, relying instead on grapheme boundaries because consonant clusters that span word boundaries are combined into stacks or conjoined forms.

Grapheme clusters

Base Combining_mark* Joiner?

Combining marks may include zero or more of the following types of character:

  1. Nukta (see Rerekan)
  2. Dependent vowels (see Vowel signs and Vocalics)
  3. Final consonants (see Finals)
  4. Virama (adeg adeg) (see Consonant clusters and novowel)

Any of the above may occur after a consonant base. Independent vowel bases usually only have final consonant marks.

The following examples show a variety of grapheme clusters:

Click on the text version of these words to see more detail about the composition.

ᬢᬷᬃᬢᬢᬷᬃᬢ holy water
ᬅᬃᬣ wealth
ᬓᬺᬰ᭄ᬡᬓᬺᬰ᭄ᬡ Krĕsna Krishna
ᬤᬍᬫ᭄ daləm deep
ᬤᬦ᭄ᬢ dantə tooth

Note how grapheme clusters break up the conjuncts. This is not usually desirable (see Larger typographic units just below).
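The pattern Base Combining_mark* Joiner? can be sketched as a regular expression to show how a conjunct ends up split. This Python sketch follows the pattern as given in this section, not the full Unicode segmentation algorithm; the code point ranges are assumptions based on the character lists in this document, and `grapheme_clusters` is a hypothetical name.

```python
import re

# Assumed ranges, based on the characters listed in this document.
BASE = "\u1b05-\u1b33\u1b45-\u1b4b"  # independent vowels and consonants
MARK = "\u1b00-\u1b04\u1b34-\u1b44"  # finals, rerekan, vowel signs, adeg adeg
JOINER = "\u200c\u200d"              # ZWNJ, ZWJ

GRAPHEME_CLUSTER = re.compile(f"[{BASE}][{MARK}]*[{JOINER}]?")

def grapheme_clusters(text):
    """Split Balinese text into clusters (other characters are skipped)."""
    return GRAPHEME_CLUSTER.findall(text)

danta = "\u1b24\u1b26\u1b44\u1b22"  # da + na + adeg adeg + ta
print(grapheme_clusters(danta))
```

For danta the result has three clusters, with the boundary after the adeg adeg falling inside what is rendered as a single stack.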

Larger typographic units

(Consonant Rerekan? Adeg_adeg)* Grapheme_cluster

Balinese commonly stacks or conjoins glyphs to form conjuncts. The conjuncts represent consonant clusters, which can arise (a) where one phonetic syllable ends in a consonant letter and the following syllable begins with a consonant, or (b) when most medial consonants are written, since Balinese uses conjunct forms for sequences such as Cr-, Cy-, Cw-, Cry-, etc. The consonants that make up the conjunct are all encoded with adeg adeg between them (see Consonant clusters).

Balinese is unusual in that these conjuncts occur across word boundaries, so the word-final consonant of the first word may be stacked above the word-initial consonant of the second. See Figure 9 for an example.

Grapheme clusters terminate after a sequence of marks containing an adeg adeg, but editorial operations that change the visual appearance of the text, such as letter-spacing, first-letter highlighting, line-breaking, and justification, should never split conjunct forms apart. For this reason, an alternative way of segmenting graphemes is needed. This may not apply, however, for some other operations such as cursor movement or backwards delete.

Where conjuncts appear, a typographic unit contains multiple grapheme clusters. The non-final grapheme clusters all end with U+1B44 ADEG ADEG, and the final grapheme cluster begins with a consonant.

The following are examples.

Click on the text version of these words to see more detail about the composition.

ᬤᬦ᭄ᬢ dantə tooth
ᬢᬶᬫ᭄ᬧᬮ᭄ timpal friend
ᬩ᭄ᬭ᭄ᬬᬕ᭄ bryag laughter
ᬰᬵᬲ᭄ᬢ᭄ᬭ sastrə writing

Note that one of the characteristic features of the Indic category of Virama is that the adeg adeg is visible when not followed by a consonant, but invisible when a consonant does follow (creating a stack). This means that the adeg adeg sometimes participates in a simple grapheme cluster, but when followed by a consonant it becomes the 'glue' that creates an orthographic syllable.
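The pattern (Consonant Rerekan? Adeg_adeg)* Grapheme_cluster can be sketched in the same way. In the Python example below the code point ranges are again assumptions drawn from the character lists in this document, and `typographic_units` is a hypothetical name. The regular expression engine's backtracking takes care of the dual role of adeg adeg: it acts as glue only when another base follows, and otherwise stays inside the final grapheme cluster.

```python
import re

# Assumed ranges, based on the characters listed in this document.
CONSONANT = "\u1b13-\u1b33\u1b45-\u1b4b"
BASE = "\u1b05-\u1b33\u1b45-\u1b4b"  # independent vowels and consonants
MARK = "\u1b00-\u1b04\u1b34-\u1b44"  # finals, rerekan, vowel signs, adeg adeg
REREKAN, ADEG_ADEG = "\u1b34", "\u1b44"

UNIT = re.compile(
    f"(?:[{CONSONANT}]{REREKAN}?{ADEG_ADEG})*"  # stacked prefix, if any
    f"[{BASE}][{MARK}]*[\u200c\u200d]?"         # the final grapheme cluster
)

def typographic_units(text):
    """Split Balinese text into units that keep conjuncts intact."""
    return UNIT.findall(text)

bryag = "\u1b29\u1b44\u1b2d\u1b44\u1b2c\u1b15\u1b44"  # b-r-ya + g with adeg adeg
print(typographic_units(bryag))
```

For bryag this yields two units: the three-consonant stack, and the final ga with its visible adeg adeg.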

On the infrequent occasions when an adeg adeg needs to be visible even though it is followed by another base, an invisible character must be added to prevent it joining with the following base. A zero-width space can achieve that.

ᬧᬓ᭄​ᬭᬫᬦ᭄ pak.ra.man Mr Raman

Browser behaviour

Test in your browser. Some of the words below test units that equate to grapheme clusters only, and others test units that include conjuncts. First, the text is displayed in a contenteditable paragraph, then in a textarea. Results are reported for Gecko (Firefox), Blink (Chrome), and WebKit (Safari) on a Mac.

ᬢᬷᬃᬢ ᬓᬺᬰ᭄ᬡ ᬧᬾᬜ᭄ᬚᭀᬃ ᬢᬶᬫ᭄ᬧᬮ᭄ᬩ᭄ᬭ᭄ᬬᬕ᭄

Cursor movement. Move the cursor through the text.
Gecko steps through the whole text using grapheme clusters. It takes 2 or more steps (depending on the number of GCs) to get through the stacks, one grapheme cluster at a time. Blink and WebKit step through all words using the orthographic syllables described here (ie. they step over a stack and all associated combining characters in one jump).

Selection. Place the cursor next to a character and hold down shift while pressing an arrow key.
The behaviour is the same as for cursor movement.

Deletion. Forward deletion works in the same way as cursor movement. The backspace key deletes code point by code point, except for WebKit, which deletes one grapheme cluster at a time.

Line-break. See this test. The CSS sets the value of the line-break property to anywhere. Change the size of the box to slowly move the line break point.
Gecko appears to segment on orthographic syllable, per the description here, except for one case where the complex stack is split. WebKit and Blink appear to sometimes wrap inside stacks and other times not. It's not obvious why, but both segment in the same way.

Punctuation & inline features

This section describes typographic features related to word boundaries, phrase & section boundaries, bracketed text, quotations & citations, emphasis, abbreviation, ellipsis & repetition, inline notes & annotations, other punctuation, and other inline text decoration.

Phrase & section boundaries

What characters are used to indicate the boundaries of phrases, sentences, and sections?

See type samples.


6
1B5E
1B5D
1B5F
1B5A
1B5B
1B5C

Balinese has its own punctuation marks.

phrase

U+1B5E CARIK SIKI

᭎U+1B4E INVERTED CARIK SIKI

᭏U+1B4F INVERTED CARIK PAREREN

U+1B5D CARIK PAMUNGKAH

sentence

U+1B5F CARIK PAREREN

section start

U+1B5A PANTI

U+1B5B PAMADA

᭿U+1B7F PANTI BAWAK

section end

᭞᭜᭞U+1B5E CARIK SIKI + U+1B5C WINDU + U+1B5E CARIK SIKI

᭟᭜᭟U+1B5F CARIK PAREREN + U+1B5C WINDU + U+1B5F CARIK PAREREN

᭚᭜᭚U+1B5A PANTI + U+1B5C WINDU + U+1B5A PANTI

᭛᭜᭛U+1B5B PAMADA + U+1B5C WINDU + U+1B5B PAMADA

end of text

U+1B7D PANTI LANTANG

U+1B7E PAMADA LANTANG

᭽᭜᭽U+1B7D PANTI LANTANG + U+1B5C WINDU + U+1B7D PANTI LANTANG

᭚᭜᭽U+1B5A PANTI + U+1B5C WINDU + U+1B7D PANTI LANTANG

U+1B5D CARIK PAMUNGKAH is used as a colon, and U+1B5E CARIK SIKI and U+1B5F CARIK PAREREN are used as comma and full stop respectively. ᭎U+1B4E INVERTED CARIK SIKI and ᭏U+1B4F INVERTED CARIK PAREREN were introduced in Unicode v16 to express finer distinctions than the former, used in some manuscripts. 9

Both U+1B5A PANTI and U+1B5B PAMADA are used to begin a section in text. ᭿U+1B7F PANTI BAWAK was introduced in Unicode v16 to represent finer subdivisions in some manuscripts.

At the end of a section, U+1B5C WINDU is usually used between two other punctuation marks that vary according to the section opener. Typical sequences include carik siki ᭞᭜᭞, carik pareren ᭟᭜᭟ (sometimes called pasalinan), panti ᭚᭜᭚, and carik agung ᭛᭜᭛.9

End of text markers include U+1B7D PANTI LANTANG and U+1B7E PAMADA LANTANG, or a combination of those or their shorter counterparts with U+1B5C WINDU, such as ᭽᭜᭽U+1B7D PANTI LANTANG + U+1B5C WINDU + U+1B7D PANTI LANTANG or ᭚᭜᭽U+1B5A PANTI + U+1B5C WINDU + U+1B7D PANTI LANTANG.9

Line & paragraph layout

This section describes typographic features related to line breaking & hyphenation, text alignment & justification, text spacing, baselines, line height, counters, lists, and styling initials.

Line breaking & hyphenation

Are there special rules about the way text wraps when it hits the end of a line? Does line-breaking wrap whole 'words' at a time, or characters, or something else (such as syllables in Tibetan and Javanese)? What characters should not appear at the end or start of a line, and what should be done to prevent that? Is hyphenation used, or something else? What rules are used? What difficulties exist?

Because there are no spaces between words, and because the end of one word and the beginning of another often form conjuncts (see Figure 9), Balinese doesn't wrap at word boundaries. See Graphemes for a description of the typographic units that are used for line break opportunities.

Unfortunately, modern browsers are often unable to detect appropriate break points for Balinese, so in the sample text at the beginning of this page ​U+200B ZERO WIDTH SPACE is used at places where the line could be broken. Otherwise, the line would continue, unbroken off the right side of the page.

Pameneng


1B60

In lontar texts where a word must be broken at the end of a line (always after a full syllable), the sign U+1B60 PAMENENG is inserted. This sign is not used as a word-joining hyphen; it is used only in linebreaking.

Observation: The images appear to show a gap before the pameneng.

A compacted image of a lontar showing a pameneng at the end of a line, with the beginning of the following line below. (Click to see more.)

In online use, an application would need to insert the pameneng, rather than the content author. As line-length is changed by stretching a window, or as content is added earlier in the same paragraph, the location of the word relative to the line edge will change. The insertion of pameneng is only appropriate at those instants when the appropriate sequence of characters appears at the line end.

For an application to use this correctly, it would need to know where the word boundaries are in the text, and then put this character at the end of the line only when a multisyllabic word is broken. This would require a dictionary to be applied to the text, since it would not be appropriate to insert the pameneng at the boundary of 2 words.

Observation: Aditya Bayu Perdana has found instances in lontar where U+1B04 SIGN BISAH is moved to the beginning of a line, alone, while a pameneng appears at the end of the previous line. If this is not just a scribal inconsistency (eg. it's not clear why you wouldn't put the bisah at the end of the line if there's space for a pameneng), it may indicate that this letter should not be a combining mark in Unicode; however, the usage needs to be verified first. See pictures.

Line-edge rules

As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.

Show (default) line-breaking properties for characters in the Balinese orthography.

The following list gives examples of typical behaviours for characters used in contemporary Balinese. Context may affect the behaviour of some of these and other characters.

Click on the Balinese characters to show what they are.

  • ᭚ ᭛ ᭝ ᭞ ᭟ ᭠   should not begin a new line
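
A minimal Python sketch of how an application might apply the rule in the list above. The set contains only the six characters listed; the function names are hypothetical, and a real implementation would normally rely on the Unicode line-break properties instead of a hand-written set.

```python
# Characters from the list above that should not begin a new line:
# U+1B5A, U+1B5B, U+1B5D, U+1B5E, U+1B5F, U+1B60.
NO_LINE_START = frozenset("\u1b5a\u1b5b\u1b5d\u1b5e\u1b5f\u1b60")


def can_break_before(text, index):
    """Return True if a line break may be placed just before text[index]."""
    if index <= 0 or index >= len(text):
        return False  # no break at the very start or end of the text
    return text[index] not in NO_LINE_START


def adjust_break(text, index):
    """Move a proposed break forward past any marks that must not start a line."""
    while index < len(text) and text[index] in NO_LINE_START:
        index += 1
    return index


if __name__ == "__main__":
    text = "\u1b05\u1b13\u1b5e\u1b2f"  # two letters, CARIK SIKI, one letter
    print(can_break_before(text, 2))  # is a break allowed before the carik?
    print(adjust_break(text, 2))      # where the break moves to instead
```

Moving the break forward keeps the punctuation mark on the same line as the text it follows, at the cost of a slightly longer line.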

Text alignment & justification

Does text in a paragraph need to have flush lines down both sides? Does the script allow punctuation to hang outside the text box at the start or end of a line? Where adjustments are needed to make a line flush, how is that done? Does the script shrink/stretch space between words and/or letters? Are word baselines stretched, as in Arabic? What about paragraph indents?

According to Sudewa, full justification is not a feature of Balinese text in traditional palm-leaf manuscripts, and only left alignment, or occasionally centred or right alignment, is relevant.

Baselines, line height, etc.

Does the script have special requirements for baseline alignment between mixed scripts and in general? Is line height special for this script? Are there other aspects that affect line spacing, or positioning of items vertically within a line?

Balinese uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Figure 11 shows glyphs from the Noto Serif fonts. The basic height of Balinese letters is the same as the Latin x-height; however, extenders and combining marks extend well beyond the Latin ascenders and descenders, creating a need for larger line heights.

qhx᭛᭄ᬐᬓᬿᬲᬺᬧᬷᬲᬸᬃᬭᬼ
Font metrics for Latin text in the Noto Serif font compared with Balinese glyphs in the Noto Serif Balinese font.

Page & book layout

This section describes typographic features related to general page layout & progression; grids & tables, notes, footnotes, etc, forms & user interaction, and page numbering, running headers, etc.

General page layout & progression

How are the main text area and ancillary areas positioned and defined? Are there any special requirements here, such as dimensions in characters for the Japanese kihon hanmen? The book cover for scripts that are read right-to-left is on the right of the spine, rather than the left. When content can flow vertically and to the left or right, how to specify the location of objects, text, etc. relative to the flow? Do tables and grid layouts work as expected? How do columns work in vertical text? Can you mix blocks of vertical and horizontal text? Does text scroll in a different direction?

Traditionally, Balinese was written on thin, landscape palm-leaf manuscripts, called lontar.

Example of a palm-leaf manuscript from Wikipedia.

The text was packed in without paragraph breaks.

Terminology

ᬅᬓ᭄ᬱᬭ aksara letter

ᬯ᭄ᬬᬜ᭄ᬚᬦ wianjana consonant

ᬅᬓ᭄ᬱᬭᬯ᭄ᬬᬜ᭄ᬚᬦ aksara wianjana consonant

ᬯᬺᬱᬵᬲ᭄ᬢ᭄ᬭ wreṣāstra 18 consonants used to write basic Balinese words

ᬰ᭄ᬯᬮᬮᬶᬢ sualalita consonants used for writing Sanskrit and Kawi loanwords

ᬅᬮ᭄ᬧᬧ᭄ᬭᬵᬡ alpaprāṇa unaspirated

ᬫᬵᬳᬵᬧ᭄ᬭᬵᬡ mahāprāṇa aspirated

References & sources

1. Aditya Bayu Perdana (2023), Musical Symbols and Sasak Characters in the Balinese Script

2. Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, ISBN 0-19-507993-0

3. Michael Everson, I Made Suatjana, Proposal for encoding the Balinese script in the UCS

4. Library of Congress, Balinese transcription

5. Norbert Lindenberg, Bringing Balinese to iOS

6. Omniglot, Balinese

7. ScriptSource, Balinese

8. Ida Bagus Adi Sudewa, The Balinese Alphabet

9. Unicode Consortium, The Unicode Standard, Version 16.0, Chapter 17.3: Indonesia and Oceania, Balinese, ISBN 978-1-936213-34-4

10. Unicode Consortium, Unicode Line Breaking Algorithm (UAX#14)

11. Wikipedia, Balinese language

12. Wikipedia, Balinese script

See recent changes.  •  Make a comment.  •  Licence CC-By © r12a.