Balinese

Updated 26 November, 2020

This page gathers basic information about the Balinese script and its use for the Balinese language. It aims (generally) to provide an overview of the orthography and typographic features, and (specifically) to advise how to write Balinese using Unicode.

See also the companion document, Balinese character notes, for detailed information about specific Unicode characters.

Phonetic transcriptions on this page should be treated as an approximate guide, only. Many are more phonemic than phonetic, and there may be variations depending on the source of the transcription.


Sample (Balinese)


ᬫᬓᬲᬫᬶ​ᬫᬦᬸᬲᬦᬾ​ᬓᬳᭂᬫ᭄ᬩᬲᬶᬦ᭄ᬫᬳᬃᬤᬶᬓ​ᬮᬦ᭄ᬧᬢᭂᬄ​ᬲᬚ᭄ᬭᭀᬦᬶᬂ​ᬓᬳᬦᬦ᭄ᬮᬦ᭄ᬓ᭄ᬯᬲ᭟ ᬳᬶᬧᬸᬦ᭄ᬓᬵᬦᬸᬕ᭄ᬭᬳᬶᬦ᭄ᬯᬶᬯᬾᬓ​ᬮᬦ᭄ᬩᬸᬤ᭄ᬥᬶ᭞ ᬧᬦ᭄ᬢᬭᬦᬶᬂ​ᬫᬦᬸᬲ​ᬫᬗ᭄ᬤᬦᬾ​ᬧᬭᬲ᭄ᬧᬭᭀᬲ᭄ᬫᬲᭂᬫᭂᬢᭀᬦᬦ᭄᭟

Usage & history

The Balinese script is used for writing the Balinese language spoken on the Indonesian islands of Java and Bali. It may also be used for Old Javanese, and liturgical Sanskrit. With some additions, it is also used to write Sasak on the neighbouring island of Lombok.

Everyday use of the script has largely been eclipsed by the Latin alphabet, but Balinese has a significant presence in traditional ceremonies and texts of the Hindu religion. It is also used for signage on roads, at the entrances to villages, and on government buildings. Traditional literature is published on a small scale, but little modern literature. Sekaha Pesantian community groups gather to read the Balinese script in a social context, commonly in song form.


Balinese script is derived from Old Kawi, and ultimately from Brahmi. Historically, Balinese was written on palm leaves or inscribed in stone. Its similarity to the Javanese script in form and behaviour leads some to propose that they are typological variants of each other.

Sources: ScriptSource and Wikipedia.

Basic features

The script is an abugida. Consonants carry an inherent vowel a, although that is pronounced ə at the end of a word. See the table to the right for a brief overview of features for the Balinese language.

Balinese text runs left to right in horizontal lines.

Words are not separated by spaces; however, syllables may be separated by ZWSP, as long as the break doesn't fall inside a stack.

The 18 consonant letters used for pure Balinese words are supplemented by 15 more derived from Sanskrit and Kawi words, some of which are used as honorifics, a little like capital letters. Repertoire extensions for 8 non-native sounds are achieved by applying the rerekan diacritic to characters.

Consonant clusters are represented by stacked consonants (many subjoined consonants have alternative shapes) or conjoined pairs. Occasionally, a visible adeg adeg is used.

Stacked consonants and conjoined pairs span word boundaries.

Syllable-initial clusters use 3 subjoined versions of ordinary consonants or vocalics for the second consonant.

Word-final consonant sounds may be represented by 3 final-consonant diacritics. Otherwise, if nothing follows, they are ordinary consonants followed by a visible [U+1B44 BALINESE ADEG ADEG].

The Balinese orthography has an inherent vowel, and represents vowels using 11 vowel-signs (including 2 prescripts and 3 circumgraphs, all of which can decompose into composite vowels). All vowel-signs are combining marks, and are stored after the base character.

Independent vowels are used at the beginning of a word for standalone vowel sounds. Inside a word these are written using vowel-signs applied to [U+1B33 BALINESE LETTER HA].

Balinese has no composite vowels, in principle, however the 3 circumgraphs can also be decomposed into 2 parts. Those can involve up to 2 glyphs, and glyphs can surround the base consonant(s) on 2 sides only, eg. ᬓᭀ ko.

Balinese has vocalics.


Character index

This section lists non-ASCII characters used for Balinese, and other characters in the Balinese script block not used by modern Balinese.

Letters

Basic consonants

ᬧ␣ᬩ␣ᬢ␣ᬤ␣ᬘ␣ᬚ␣ᬓ␣ᬕ␣ᬲ␣ᬳ␣ᬫ␣ᬦ␣ᬗ␣ᬜ␣ᬯ␣ᬭ␣ᬮ␣ᬬ

Extended consonants

ᬞ␣ᬟ␣ᬠ␣ᬔ␣ᬛ␣ᬨ␣ᬝ␣ᬣ␣ᬥ␣ᬖ␣ᬰ␣ᬱ␣ᬡ␣ᬪ␣ᬙ

Vowels

ᬇ␣ᬈ␣ᬉ␣ᬊ␣ᬏ␣ᬑ␣ᬅ␣ᬆ␣ᬐ␣ᬒ

Vocalics

ᬋ␣ᬌ␣ᬍ␣ᬎ

Not used for Balinese

ᭅ␣ᭆ␣ᭇ␣ᭈ␣ᭉ␣ᭊ␣ᭋ

Combining marks

Vowel-signs

ᬾ␣ᬿ␣ᭀ␣ᭃ␣ᭁ␣ᬶ␣ᬷ␣ᬸ␣ᬹ␣ᭂ␣ᬵ

Vocalic

ᬺ␣ᬻ␣ᬼ␣ᬽ

Finals

ᬂ␣ᬃ␣ᬄ

Other

᭄␣᬴␣ᬀ␣ᬁ

Musical symbols

᭬␣᭫␣᭭␣᭮␣᭯␣᭰␣᭱␣᭲␣᭳

Numbers

᭐␣᭑␣᭒␣᭓␣᭔␣᭕␣᭖␣᭗␣᭘␣᭙

Punctuation

᭚␣᭛␣᭜␣᭝␣᭞␣᭟␣᭠

Symbols

Musical symbols

᭡␣᭢␣᭣␣᭤␣᭥␣᭦␣᭧␣᭨␣᭩␣᭪␣᭴␣᭵␣᭶␣᭷␣᭸␣᭹␣᭺␣᭻␣᭼

Vowels

Vowel sounds


Phones in a lighter colour are non-native or allophones.

Plain vowels

i iː u uː e o ə əː ə əː ɛ ɔ a ɑː ɑː

Diphthongs

aːi aːu

Inherent vowel

a is the sound of the inherent vowel, so [U+1B13 BALINESE LETTER KA] is pronounced ka.

However, the inherent vowel is pronounced ə at the end of a word and also in prefixes ma-, pa- and da-.

Vowel-signs

Non-inherent vowel sounds that follow a consonant are represented using vowel-signs, eg. ᬓᬶ [U+1B13 BALINESE LETTER KA + U+1B36 BALINESE VOWEL SIGN ULU] is pronounced ki.

Balinese vowel-signs are all combining characters. In principle a single Unicode character is used per base consonant, even if the vowel-signs appear on both sides of the base consonant, however 3 vowel signs decompose to more than one character (see circumgraphs). All vowel-signs are typed and stored after the base consonant, and the font puts them in the correct place for display.

The majority of the vowel-signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.

See also vocalics.

Prescript vowel-signs

Two vowel-signs appear to the left of the base consonant letter or cluster, eg. ᬘᬾᬂᬘᬾᬂ.

ᬾ␣ᬿ

These are combining marks that are always stored after the base consonant. The font places the glyph before the base consonant.
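The storage order described above can be checked with a minimal Python sketch. It uses only the standard `unicodedata` module; the helper name `base_comes_first` is hypothetical and introduced purely for illustration.

```python
import unicodedata

# KA + VOWEL SIGN TALING: the taling glyph is drawn to the left of ka,
# but the code point is stored after the consonant.
syllable = "\u1B13\u1B3E"

for ch in syllable:
    # Prints the code point and its Unicode character name, in stored order.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")


def base_comes_first(text: str) -> bool:
    """Return True if the first code point is not a combining mark (Mn or Mc)."""
    return unicodedata.category(text[0]) not in ("Mn", "Mc")
```

A string that starts with the vowel-sign, such as `"\u1B3E\u1B13"`, fails this check, which is one simple way to spot text typed in visual rather than logical order.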

Circumgraphs

Three vowels are produced by a single combining character with visually separate parts that appear on different (mostly opposite) sides of the consonant onset, eg. ᬓᭀ ko.

ᭀ␣ᭃ␣ᭁ

Encoding. All 3 of these circumgraphs can be written as a single character, or as two, eg.

  1. [U+1B40 BALINESE VOWEL SIGN TALING TEDUNG] could be written as:
    ᭀ [U+1B3E BALINESE VOWEL SIGN TALING + U+1B35 BALINESE VOWEL SIGN TEDUNG]
  2. [U+1B43 BALINESE VOWEL SIGN PEPET TEDUNG] could be written as:
    ᭃ [U+1B42 BALINESE VOWEL SIGN PEPET + U+1B35 BALINESE VOWEL SIGN TEDUNG]
  3. [U+1B41 BALINESE VOWEL SIGN TALING REPA TEDUNG] could be written as:
    ᭁ [U+1B3F BALINESE VOWEL SIGN TALING REPA + U+1B35 BALINESE VOWEL SIGN TEDUNG]

The single code point per vowel-sign is preferred, however the parts are separated in Unicode Normalisation Form D (NFD).

Whichever approach is used, the vowel-signs must be typed and stored after the consonant characters they surround, and in left to right order.
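As a minimal sketch of the equivalence just described, the following Python uses the standard `unicodedata` module to convert between the single-character and two-character spellings. The table name `CIRCUMGRAPHS` and the two helper names are hypothetical; the decompositions themselves come from the Unicode character data.

```python
import unicodedata

# Precomposed circumgraph -> its two-part spelling (canonical decomposition).
CIRCUMGRAPHS = {
    "\u1B40": "\u1B3E\u1B35",  # TALING TEDUNG = TALING + TEDUNG
    "\u1B41": "\u1B3F\u1B35",  # TALING REPA TEDUNG = TALING REPA + TEDUNG
    "\u1B43": "\u1B42\u1B35",  # PEPET TEDUNG = PEPET + TEDUNG
}


def to_precomposed(text: str) -> str:
    """Prefer the single code point per vowel-sign (NFC)."""
    return unicodedata.normalize("NFC", text)


def to_decomposed(text: str) -> str:
    """Split the circumgraphs into their parts (NFD)."""
    return unicodedata.normalize("NFD", text)
```

Normalising both sides of a comparison with `to_precomposed` means that ko typed as KA + TALING TEDUNG and ko typed as KA + TALING + TEDUNG compare equal.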

Other vowel-signs

The following combining marks are also used to indicate vowel sounds.

ᬶ␣ᬷ␣ᬸ␣ᬹ␣ᭂ␣ᬵ

To represent the sound rə, Balinese uses the vocalic letter. The sequence [U+1B2D BALINESE LETTER RA] + [U+1B42 BALINESE VOWEL SIGN PEPET] is not used.

Vowel suppression

Balinese uses [U+1B44 BALINESE ADEG ADEG] (the Balinese equivalent of the Sanskrit virama) to kill the inherent vowel after a consonant.

The adeg adeg is visible at the end of a word that ends in a consonant, but is usually hidden (with occasional exceptions) when the consonant is part of a consonant cluster (see clusters).

Standalone vowels

In the middle of a word, Balinese represents standalone vowels using a vowel-sign after [U+1B33 BALINESE LETTER HA], eg. ᬤᬳᬾᬭᬄ.

At the beginning of a word, most standalone vowels are represented using one of the 10 independent vowel characters. The set includes a character to represent the inherent vowel sound.

ᬇ␣ᬈ␣ᬉ␣ᬊ␣ᬏ␣ᬑ␣ᬅ␣ᬆ␣ᬐ␣ᬒ

The vowels ◌ᭂ [U+1B42 BALINESE VOWEL SIGN PEPET​] and ◌ᭃ [U+1B43 BALINESE VOWEL SIGN PEPET TEDUNG​] don't have an independent form, and have to be used alongside [U+1B33 BALINESE LETTER HA] at the beginning of a word.
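The choice between an independent vowel and a vowel-sign on ha can be sketched in Python as follows. This is an illustration under stated assumptions, not a complete implementation: the function name `standalone_vowel` is hypothetical, and the table pairs only a few vowel-signs with independent letters, with the pairing inferred from the Unicode character names rather than taken from this page.

```python
HA = "\u1B33"  # BALINESE LETTER HA, used as a silent carrier

# Assumed pairing of vowel-signs with independent vowel letters (partial).
# The empty string stands for the inherent vowel.
INDEPENDENT = {
    "": "\u1B05",        # inherent vowel -> AKARA
    "\u1B35": "\u1B06",  # TEDUNG -> AKARA TEDUNG
    "\u1B36": "\u1B07",  # ULU -> IKARA
    "\u1B38": "\u1B09",  # SUKU -> UKARA
    "\u1B3E": "\u1B0F",  # TALING -> EKARA
    "\u1B40": "\u1B11",  # TALING TEDUNG -> OKARA
}


def standalone_vowel(sign: str, word_initial: bool) -> str:
    """Encode a standalone vowel given the vowel-sign that carries its sound."""
    if word_initial and sign in INDEPENDENT:
        return INDEPENDENT[sign]
    # Word-internal vowels, and vowels with no independent form
    # (such as PEPET), are written as a vowel-sign on HA.
    return HA + sign
```

For example, `standalone_vowel("\u1B42", True)` returns HA + PEPET, because pepet is not in the table and has no independent form.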

In Sasak, independent vowel [U+1B05 BALINESE LETTER AKARA] can be followed by an explicit ◌᭄ [U+1B44 BALINESE ADEG ADEG​] in word- or syllable-final position, where it indicates the glottal stop, eg. ᬳᬫᬅ᭄ hmạ͓ amaʔ; other consonants can also be subjoined to it.

Encoding. Three of these independent vowels can be written as a single character, or as two. They decompose in Unicode Normalisation Form D. It is generally recommended to use the precomposed character.

  1. [U+1B0A BALINESE LETTER UKARA TEDUNG] could be written as:
    ᬊ [U+1B09 BALINESE LETTER UKARA + U+1B35 BALINESE VOWEL SIGN TEDUNG]
  2. [U+1B12 BALINESE LETTER OKARA TEDUNG] could be written as:
    ᬒ [U+1B11 BALINESE LETTER OKARA + U+1B35 BALINESE VOWEL SIGN TEDUNG]
  3. [U+1B06 BALINESE LETTER AKARA TEDUNG] could be written as:
    ᬆ [U+1B05 BALINESE LETTER AKARA + U+1B35 BALINESE VOWEL SIGN TEDUNG]

Vowel sounds mapped to characters

The following tables show how the above vowel sounds commonly map to characters or sequences of characters.

Word-internal standalone vowels (and word-initial in the case of ə and əː) use the vowel-sign over a silent [U+1B33 BALINESE LETTER HA]. Vowel-signs that decompose are shown only in precomposed form.

Plain vowels

ə
 

Inherent vowel at the end of a word and also in prefixes ma-, pa- and da-.

[U+1B42 BALINESE VOWEL SIGN PEPET]

[U+1B3A BALINESE VOWEL SIGN RA REPA] in vocalic rə.

[U+1B3C BALINESE VOWEL SIGN LA LENGA] in vocalic lə.

[U+1B0B BALINESE LETTER RA REPA] in word-initial vocalic rə.

[U+1B0D BALINESE LETTER LA LENGA] in word-initial vocalic lə.

əː
 

[U+1B43 BALINESE VOWEL SIGN PEPET TEDUNG]

[U+1B3B BALINESE VOWEL SIGN RA REPA TEDUNG] in vocalic rəː

[U+1B3D BALINESE VOWEL SIGN LA LENGA TEDUNG] in vocalic ləː

[U+1B0C BALINESE LETTER RA REPA TEDUNG] in word-initial vocalic rəː.

[U+1B0E BALINESE LETTER LA LENGA TEDUNG] in word-initial vocalic ləː.

a
 

Inherent vowel

[U+1B05 BALINESE LETTER AKARA] when word-initial.

Diphthongs and other combinations

Vowel-sign placement


The following list shows where vowel-signs are positioned around a base consonant to produce vowels, and how many instances of that pattern there are.

  • 2 prescript, eg. ᬓᬿ kaʲ
  • 1 postscript, eg. ᬓᬵ kɑ̄
  • 3 superscript, eg. ᬓᬶ ki
  • 2 subscript, eg. ᬓᬸ ku
  • 2 pre+postscript, eg. ᬓᭀ ko
  • 1 super+postscript, eg. ᬓᭃ kə̄

At maximum, vowel components can occur concurrently on 2 sides of the base.

Vocalics

ᬋ␣ᬌ␣ᬍ␣ᬎ␣ᬺ␣ᬻ␣ᬼ␣ᬽ

At the beginning of a syllable the vocalic is treated as a consonant, eg. ᬓᭂᬋᬂ, ᬢᬍᬃ.

As a second component in a consonant cluster, the vocalic ra repa has a postfixed form and a subjoined form.

The postfixed form ◌᭄ᬭᭂ is seen where the independent (consonant) form of ra repa follows a syllable which ends in a consonant. The sequence of Unicode characters to be used for this is C + ◌᭄ + ᬋ [consonant + U+1B44 BALINESE ADEG ADEG + U+1B0B BALINESE LETTER RA REPA], eg. ᬧᬓ᭄ᬋᬋᬄ.

The subjoined form ◌ᬺ is used to represent the dependent ra repa after a syllable-initial consonant. The sequence of characters to be used here is C + ◌ᬺ [consonant + U+1B3A BALINESE VOWEL SIGN RA REPA], eg. ᬓᬺᬰ᭄ᬡ.

Consonants

Consonant sounds

The following represents the repertoire of the Balinese language.


Phones in a lighter colour are non-native or allophones.

labial dental alveolar post-alveolar palatal velar pharyngeal glottal
stop p b t d       k ɡ    
affricate       t͡ʃ d͡ʒ        
fricative f v   s z     x ɣ ħ ʕ h
nasal m   n   ɲ ŋ  
approximant w   l   j    
trill/flap     r  

Basic consonants

Only 18 of the consonants in the Balinese Unicode block are used for pure Balinese language text. The remainder are used for words derived from Sanskrit or Kawi.

The characters listed here and in the following sections also have conjoined and/or subjoined forms, which may differ significantly from those shown here. See clusters for a list of glyph shapes.

ᬧ␣ᬩ␣ᬢ␣ᬤ␣ᬘ␣ᬚ␣ᬓ␣ᬕ
ᬲ␣ᬳ
ᬫ␣ᬦ␣ᬗ␣ᬜ
ᬯ␣ᬭ␣ᬮ␣ᬬ

[U+1B33 BALINESE LETTER HA] at the beginning of a word or after a preceding vowel is mostly used as a support for a vowel-sign (see standalone vowels), and is not pronounced or transcribed. Word-finally with a suffix vowel, however, it is transcribed. [l]

Additional/honorific consonants

These are called ᬅᬓ᭄ᬱᬭᬰ᭄ᬯᬮᬮᬶᬢ ạk͓ṡ̂rŝ͓wllit aksara sualalita.

Many of the additional consonants are commonly used in words originating from Arabic and Dutch, and are most common in north Bali and Lombok. When used in pure Balinese words, they are similar to capital letters and are used to create an honorific effect. There are similar characters in Javanese.

They don't add any consonant sounds to the Balinese repertoire. In words originating from Sanskrit, Old Javanese, or Old Balinese, they represent aspirated or other consonants. [l]

Additional consonants used for Sanskrit words.

ᬞ␣ᬟ␣ᬠ␣ᬔ␣᭄ᬙ␣ᬛ

Additional consonants used for words from Kawi.

ᬨ␣ᬝ␣ᬣ␣ᬥ␣ᬖ␣ᬰ␣ᬱ␣ᬡ␣ᬪ

Two consonants, [U+1B14 BALINESE LETTER KA MAHAPRANA] and [U+1B19 BALINESE LETTER CA LACA], are considered very rare, and one other, [U+1B1B BALINESE LETTER JA JERA], seems to be known from only one word (ᬦᬶᬃᬛᬭ). (It is possible that an original ai may have been lost in Balinese, to be replaced by the glyph for jʰa.)

A number of the Sanskrit or Kawi consonants are rather poorly attested. The letter [U+1B19 BALINESE LETTER CA LACA] is only found in non-initial position following [U+1B18 BALINESE LETTER CA], ie. ᬘ᭄ᬙ c͓C, and most of the retroflex series is often omitted in books about the script.

Repertoire extension

The combining mark [U+1B34 BALINESE SIGN REREKAN​] is used, as is a similar sign in Javanese, to extend the character repertoire for foreign sounds.

ᬧ᬴␣ᬯ᬴␣ᬚ᬴␣ᬓ᬴␣ᬕ᬴␣ᬳ᬴␣ᬗ᬴␣ᬤ᬴

The first 7 of the 8 listed above are attested in Library of Congress transliterations and in earlier Sasak orthography. The 8th, ᬤ᬴ could be used for one-to-one transliteration for Javanese ɖ.

In rendering, the dots of these letters appear above the top character, which can cause some ambiguity in reading. The following are all visually indistinguishable: ᬓ᬴᭄ᬚ kˑ͓ʤ (xja), or ᬓ᭄ᬚ᬴ k͓ʤˑ (kza), or indeed ᬓ᬴᭄ᬚ᬴ kˑ͓ʤˑ (xza). In practice these combinations are probably rather rare.
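Although the three spellings above look the same, they are different code point sequences. The sketch below builds them from escapes and lists their code points; the variable names and the helper `code_points` are hypothetical.

```python
import unicodedata

KA, JA = "\u1B13", "\u1B1A"
REREKAN, ADEG_ADEG = "\u1B34", "\u1B44"

# Rerekan on the first consonant, on the second, or on both.
xja = KA + REREKAN + ADEG_ADEG + JA
kza = KA + ADEG_ADEG + JA + REREKAN
xza = KA + REREKAN + ADEG_ADEG + JA + REREKAN


def code_points(text: str) -> str:
    """List the code points of a string in U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for label, seq in (("xja", xja), ("kza", kza), ("xza", xza)):
    print(label, code_points(seq))
```

Because the sequences differ in storage, a search for one will not find the others, even after Unicode normalisation.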

In recent times, Sasak users abandoned the use of the Javanese-influenced rerekan in favour of a series of modified letters (see Sasak), making use, in addition, of some of the unused Kawi letters for the Arabic sounds. In place of ᬓ᬴ x and ᬕ᬴ ɣ, for instance, the new fusion of KA and HA, [U+1B46 BALINESE LETTER KHOT SASAK] and the Kawi letter [U+1B16 BALINESE LETTER GA GORA] are used.

Sasak

There are also a few characters in the Unicode block that are used for the Sasak language. (Does the fact that these relate to aspirated or retroflex forms originally affect the pronunciation?)

ᭅ␣ᭆ␣ᭇ␣ᭈ␣ᭉ␣ᭊ␣ᭋ

Medial consonants

The consonants ya, ra, la and wa regularly appear immediately after the initial consonant in a syllable. Balinese has no special characters for these medial sounds; they are just written using the normal approach for dealing with consonant clusters, eg. ᬓ᭄ᬭᬫ.

᭄ᬭ␣᭄ᬮ␣᭄ᬬ

Multiple medials can occur: r or l can be followed by w or y, eg. ᬩ᭄ᬭ᭄ᬬᬕ᭄.

In addition, the vocalics can produce consonant sounds in medial position, eg. ᬓᬺᬰ᭄ᬡ.

See clusters for more details on shaping of glyphs.

Word-final consonants

Word-final consonant sounds with no following consonant are by default represented by ordinary consonant characters, followed by a visible [U+1B44 BALINESE ADEG ADEG​] character, eg. ᬓᬵᬤᭂᬧ᭄, ᬓᬧᬮ᭄.

However, there is also a set of combining characters that don't need to be followed by the adeg adeg.

ᬂ␣ᬃ␣ᬄ

[U+1B02 BALINESE SIGN CECEK] and [U+1B04 BALINESE SIGN BISAH] only appear at the end of a word, eg. ᬓᭂᬋᬂ, ᬫᬗᬄ, unless the word involves repetition, eg. ᬘᬾᬂᬘᬾᬂ.

[U+1B03 BALINESE SIGN SURANG] can appear at the end of any syllable, eg. ᬓᬃᬡ.

A syllable-final diacritic may appear above a stack, eg. ᬩᬗ᭄ᬓᬸᬂ.

See also Modre symbols.

Consonant clusters

The absence of a vowel sound between two or more consonants is visually indicated in one of the following ways.

  1. Stacked consonants, where the non-initial (subjoined) consonant appears below the initial, often with a different shape from normal.
  2. Conjoined consonants, where consonants sit side-by-side but the non-initial consonant has a slightly different form than usual.
  3. A visible [U+1B44 BALINESE ADEG ADEG​] following the initial consonant.
  4. A dedicated final consonant mark followed by a regular consonant.

In Unicode, the stacking and conjoining behaviour is achieved by adding [U+1B44 BALINESE ADEG ADEG​] between the consonants. The font hides the glyph automatically when a stacked conjunct is formed.
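A minimal Python sketch of this encoding follows. The function name `cluster` is hypothetical; the only fact it relies on is that an adeg adeg is stored between each pair of consonants in the cluster.

```python
ADEG_ADEG = "\u1B44"  # BALINESE ADEG ADEG


def cluster(*consonants: str) -> str:
    """Join consonants into one cluster by storing ADEG ADEG between them."""
    return ADEG_ADEG.join(consonants)


# krama: KA + ADEG ADEG + RA forms the cluster, then MA follows.
krama = cluster("\u1B13", "\u1B2D") + "\u1B2B"
print(krama)
```

Whether the result is drawn as a stack or a conjoined pair is decided by the font, not by the stored text.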

Word boundaries. Conjuncts span word boundaries. Because there are no spaces between words, a cluster is created when a consonant with no following vowel at the end of a word is followed by a consonant at the beginning of the next word.

ᬓᬳᬦᬦ᭄ᬮᬦ᭄ᬓ᭄ᬯᬲ
In the sequence of words kahanan lan kwasa the initial consonant of each word is subjoined below the final consonant of the preceding word.

Stacks and conjoined sequences are not normally split at line ends (see Word boundaries and Line breaking & hyphenation for the ramifications of this).

Stacking

To represent consonants without intervening vowels, the non-initial consonant is typically drawn below the initial consonant, and with a slightly different shape.

Many of the subjoined forms are just slightly smaller versions of the original, but several have very different shapes altogether, most of which ligate with the cluster initial consonant by joining strokes.

There can be up to 3 consonants combined in this way, and the third consonant must be one of ya, ra, la or wa.

This list shows consonants in their normal and subjoined forms:

native letters
ᬩ᭄ᬩ␣ᬢ᭄ᬢ␣ᬤ᭄ᬤ␣ᬘ᭄ᬘ␣ᬚ᭄ᬚ␣ᬓ᭄ᬓ␣ᬕ᭄ᬕ␣ᬳ᭄ᬳ␣ᬫ᭄ᬫ␣ᬦ᭄ᬦ␣ᬜ᭄ᬜ␣ᬗ᭄ᬗ␣ᬯ᭄ᬯ␣ᬭ᭄ᬭ␣ᬮ᭄ᬮ␣ᬬ᭄ᬬ
Sanskrit letters
ᬞ᭄ᬞ␣ᬟ᭄ᬟ␣ᬠ᭄ᬠ␣ᬔ᭄ᬔ␣ᬙ᭄ᬙ␣ᬛ᭄ᬛ
Kawi letters
ᬝ᭄ᬝ␣ᬣ᭄ᬣ␣ᬥ᭄ᬥ␣ᬖ᭄ᬖ␣ᬰ᭄ᬰ␣ᬡ᭄ᬡ␣ᬪ᭄ᬪ

Conjoined consonants

In conjoined clusters, the consonant glyphs remain side by side, but the non-initial consonant is reduced on the left side. The figure below shows an example in the word ᬅᬓ᭄ᬱᬭ.

ᬅᬓ᭄ᬱᬭ
The left side of [U+1B31 BALINESE LETTER SA SAPA] is reduced when conjoined.

This list shows consonants in their normal and conjoined forms:

native letters
ᬧ᭄ᬧ␣ᬲ᭄ᬲ␣ᬋ᭄ᬋ
Kawi letters
ᬨ᭄ᬨ␣ᬱ᭄ᬱ

The conjoined [U+1B32 BALINESE LETTER SA] is unusual in that it also adds a stroke below the initial consonant. This helps distinguish it from the conjoined p. See the figure below for an example in the word ᬧᬓ᭄ᬲ.

ᬧᬓ᭄ᬲ
[U+1B32 BALINESE LETTER SA] when conjoined not only loses some of its left side but also adds a glyph below the initial consonant.

Visible adeg adeg

Because there is no word separator, consonants at the end of one word and beginning of the following word are normally stacked, too.

In some cases this leads to ambiguity about whether this is one or two words. If you really want to make clear which is which, you can use an explicit adeg-adeg, eg. ᬧᬓ᭄ᬭᬫᬦ᭄ vs. ᬧᬓ᭄‌ᬭᬫᬦ᭄.

The Balinese section of the Unicode Standard recommends the use of U+200C ZERO WIDTH NON-JOINER (ZWNJ) after the adeg-adeg in order to prevent conjunct formation. However, not many people understand the function of ZWNJ or can access it easily from the keypad. It also doesn't introduce line-break opportunities. A better solution may be to use U+200B ZERO WIDTH SPACE (ZWSP). This character is needed anyway on most systems in order to allow line-breaking, and it appears to work equally well for this.

A somewhat ambiguous situation arises where conventions apparently prevent certain combinations from stacking. For example, the name of the village Tamblang should not stack the mbl, but should look like ᬢᬫ᭄‌ᬩ᭄ᬮᬂ. The Unicode Standard advises using a zero-width non-joiner after ma to achieve this.

Note that this may also be achieved by intelligence in the font, as was actually the case when I generated this example. It's not clear to me what is the preferred approach: put ZWNJ in only when the font doesn't do what you want, or use it always. The latter may lead to more consistent content where different fonts are applied to the text (eg. after cut and paste). In theory, this shouldn't affect searching and sorting, although some applications may not ignore the ZWNJ as they should.
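The two options discussed in this section can be sketched in Python as below. The function names `visible_adeg_adeg` and `strip_joiners` are hypothetical. Whether ZWSP actually prevents conjunct formation depends on the font and renderer, so treat the `blocker` parameter as something to test rather than a guarantee.

```python
ADEG_ADEG = "\u1B44"
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER
ZWSP = "\u200B"  # ZERO WIDTH SPACE


def visible_adeg_adeg(first: str, second: str, blocker: str = ZWNJ) -> str:
    """Join two parts so that the adeg adeg ending `first` stays visible."""
    if not first.endswith(ADEG_ADEG):
        first += ADEG_ADEG
    return first + blocker + second


def strip_joiners(text: str) -> str:
    """Remove ZWNJ and ZWSP, eg. before comparing strings in a search."""
    return text.replace(ZWNJ, "").replace(ZWSP, "")


pak = "\u1B27\u1B13\u1B44"          # PA, KA, ADEG ADEG
raman = "\u1B2D\u1B2B\u1B26\u1B44"  # RA, MA, NA, ADEG ADEG
two_words = visible_adeg_adeg(pak, raman)
```

Stripping the invisible characters before comparison is one way for an application to treat `two_words` and the stacked spelling `pak + raman` as the same search key.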

Dedicated final marks

Balinese represents some final consonants using dedicated marks. Such final marks are followed by ordinary consonant shapes in consonant clusters. There is no visual indication of missing vowel sounds other than the use of the mark itself.

ᬓᬃᬡ
A cluster involving a dedicated final mark doesn't form a conjunct.
(Word shown is ᬓᬃᬡ .)

Consonant sounds to characters

The following maps the above sounds to graphemes. The items are split according to whether they are native Balinese letters (b), Sanskrit (s) or Kawi (k) derived forms, or extended with rerekan (e).

Initials

Symbols

The symbols in the Balinese block are all musical symbols, and are not described here.

᭡␣᭢␣᭣␣᭤␣᭥␣᭦␣᭧␣᭨␣᭩␣᭪␣᭴␣᭵␣᭶␣᭷␣᭸␣᭹␣᭺␣᭻␣᭼

Modre symbols

Two combining marks have a specialist usage related to (usually religious) Sanskrit words.

ᬀ␣ᬁ

[U+1B00 BALINESE SIGN ULU RICEM] when combined with certain syllables becomes part of the Aksara Modre, or holy letters, which are used to write words in Sanskrit, usually part of prayers. This character only appears in Sanskrit texts, eg. ᬰᬶᬤ᭄ᬥᬀ siddham.

[U+1B01 BALINESE SIGN ULU CANDRA] appears only in holy letters, eg. ᬫᬁ mŋ̽ (Mang). When combined with independent vowel ạʷ it becomes a special symbol called omkara and is pronounced m. In this form it is used to represent god, eg. ᬒᬁᬱᬦ᭄ᬢᬶ᭞ᬱᬦ᭄ᬢᬶ᭞ᬱᬦ᭄ᬢᬶ᭞ᬒᬁ.

᭜ᬁ   ᭟ᬁ   ᭛ᬁ.
Modre symbols that include ulu candra.

Musical marks

There is also a set of musical diacritical marks, which are not described here.

᭫␣᭬␣᭭␣᭮␣᭯␣᭰␣᭱␣᭲␣᭳

Numbers

There is a set of Balinese digits, and they are used in the same way as Latin digits.

᭑␣᭒␣᭓␣᭔␣᭕␣᭖␣᭗␣᭘␣᭙␣᭐

However, many of the digit symbols are indistinguishable from other Balinese letters. Numbers are typically surrounded by [U+1B5E BALINESE CARIK SIKI], so that they are easily recognisable, eg. ᬩᬮᬶ᭞᭓᭞ᬚᬸᬮᬶ᭞᭑᭙᭘᭒᭟.
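As a small illustration, the following Python sketch converts an integer to Balinese digits and wraps it in carik siki, as in the example above. The function names are hypothetical; the digits occupy U+1B50 to U+1B59 in the Balinese block.

```python
BALINESE_ZERO = 0x1B50  # BALINESE DIGIT ZERO
CARIK_SIKI = "\u1B5E"   # BALINESE CARIK SIKI


def to_balinese_digits(number: int) -> str:
    """Render a non-negative integer with Balinese digits."""
    if number < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    return "".join(chr(BALINESE_ZERO + int(digit)) for digit in str(number))


def delimited_number(number: int) -> str:
    """Surround the digits with carik siki so they are not read as letters."""
    return CARIK_SIKI + to_balinese_digits(number) + CARIK_SIKI
```

Python's built-in `int()` accepts these characters directly, so `int(to_balinese_digits(1982))` converts back to the original number.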

Text direction

Balinese text is written horizontally, left to right.

Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Balinese character app.

Balinese text is not cursive (ie. joined up like Arabic), however there is a significant amount of interaction between glyphs, and some joining, around consonant clusters.

The orthography has no case distinction, and no special transforms are needed to convert between characters.

Context-based shaping

Many of the subjoined and post-fixed consonant forms have different shapes from the standard glyph for that character, for example na becomes ᭄ᬦ.

In addition, many conjunct clusters combine characters with special shapes, or subtly change parts of glyphs to join smoothly. Often the changes are significant, especially the medial consonants, ya, ra, wa and la. For example, see the sequence <ba, adeg-adeg, ra, adeg-adeg, ya> in ᬩ᭄ᬭ᭄ᬬᬕ᭄.

Combining vowel signs can also have different shapes depending on the context. For example, the vowel sign tedung typically ligates with the preceding consonant, eg. ha is ᬳ but <ha, tedung> is ᬳᬵ, and subjoined ya is ᭄ᬬ but <consonant, adeg-adeg, ya, tedung> can be rendered as ᭄ᬬᬵ.

Context-based positioning

When a vowel and a final consonant sign both appear above a consonant, they are placed side by side, eg. ᬱᬫ᭄ᬧᬶᬂ.

Font styles

tbd

Structural boundaries & markers

Grapheme boundaries

Observation: The basic unit for Balinese text appears to be a stack of consonants plus all combining characters, where a stack could be a single character, or could have up to 3 consonants joined by adeg-adeg. The combining characters include all vowel-signs, and final consonant marks.

In the Chrome browser, this is the case for cursor movement. The cursor jumps over each of the stacks in ᬓᬓᬵᬓ᭄ᬓᬓ᭄ᬓ᭄ᬬᬓ᭄ᬓ᭄ᬬᬵᬃ one by one. In Firefox, however, the cursor appears to follow Unicode grapheme clusters, which makes it jump inside stacks with adeg-adeg because a grapheme cluster doesn't include the non-combining characters following the base.
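The stack-based unit described in the observation can be approximated with a regular expression. This is a rough sketch, not a standard algorithm: the character ranges are assumptions read off the Balinese block layout, the names `STACK`, `TOKEN` and `orthographic_syllables` are hypothetical, and edge cases such as musical marks are ignored.

```python
import re

# A base letter (independent vowel, consonant or Sasak letter), optionally
# followed by rerekan.
BASE = "[\u1B05-\u1B33\u1B45-\u1B4B]\u1B34?"
# Vowel-signs, final-consonant marks and modre marks.
MARKS = "[\u1B00-\u1B04\u1B34-\u1B43]*"
# One stack: base, any number of <adeg adeg, base> pairs, combining marks,
# then an optional visible adeg adeg (possibly followed by ZWNJ).
STACK = f"{BASE}(?:\u1B44{BASE})*{MARKS}(?:\u1B44\u200C?)?"
# Anything that is not part of a stack becomes a single-character token.
TOKEN = re.compile(f"{STACK}|.", re.DOTALL)


def orthographic_syllables(text: str) -> list:
    """Split text into stacks plus their combining characters."""
    return TOKEN.findall(text)


KA, YA = "\u1B13", "\u1B2C"
ADEG, TEDUNG, SURANG = "\u1B44", "\u1B35", "\u1B03"
sample = (KA + KA + TEDUNG + KA + ADEG + KA + KA + ADEG + KA + ADEG + YA
          + KA + ADEG + KA + ADEG + YA + TEDUNG + SURANG)
print(len(orthographic_syllables(sample)))
```

The sample is built from escapes to mirror the five stacks of the test string above; a cursor that moves stack by stack would stop once per list item.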

Word boundaries

Words are not separated by spaces, and in fact some word boundaries occur between stacked consonants. This means that segmentation for line-breaking, etc. uses orthographic syllables as a unit, where orthographic means a character or stack of characters with all associated combining marks.

ᬓᬳᬦᬦ᭄ᬮᬦ᭄ᬓ᭄ᬯᬲ
In the sequence of words ᬓᬳᬦᬦ kahanan, ᬮᬦ lan, ᬓ᭄ᬯᬲ kwasa, the initial letter of both the 2nd and 3rd words is subjoined below the last letter of the previous word.

Phrase & section boundaries

phrase

[U+1B5E BALINESE CARIK SIKI]

[U+1B5D BALINESE CARIK PAMUNGKAH]

sentence

[U+1B5F BALINESE CARIK PAREREN]

section

[U+1B5A BALINESE PANTI]

[U+1B5B BALINESE PAMADA]

᭟᭜᭟ [U+1B5F BALINESE CARIK PAREREN + U+1B5C BALINESE WINDU + U+1B5F BALINESE CARIK PAREREN]

 ᭛᭜᭛ [U+1B5B BALINESE PAMADA + U+1B5C BALINESE WINDU + U+1B5B BALINESE PAMADA]

[U+1B5D BALINESE CARIK PAMUNGKAH] is used as a colon, and [U+1B5E BALINESE CARIK SIKI] and [U+1B5F BALINESE CARIK PAREREN] are used as comma and full stop respectively.

Both [U+1B5A BALINESE PANTI] and [U+1B5B BALINESE PAMADA] are used to begin a section in text. At the end of a section, ᭟᭜᭟ pasalinan and ᭛᭜᭛ carik agung may be used (depending on what sign began the section).

Parentheses & brackets

tbd

Quotations

tbd

Emphasis

tbd

Abbreviation, ellipsis & repetition

tbd

Inline notes & annotations

tbd

Other inline ranges

tbd

Other punctuation

tbd

Line & paragraph layout

Line breaking & hyphenation

Common practice is to break the sentence at any point when it reaches the end of a line, except that no line break is allowed inside a syllable or just before a colon, comma or full stop.
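The rule above can be sketched in Python as follows, assuming the text has already been split into orthographic syllables and punctuation marks by some other step. The function names are hypothetical, and inserting ZWSP is only one possible way to expose the break opportunities to a layout engine.

```python
ZWSP = "\u200B"  # ZERO WIDTH SPACE

# Colon, comma and full stop: carik pamungkah, carik siki, carik pareren.
NO_BREAK_BEFORE = {"\u1B5D", "\u1B5E", "\u1B5F"}


def break_opportunities(units: list) -> list:
    """Return each index i where a line may break just before units[i]."""
    return [i for i in range(1, len(units)) if units[i] not in NO_BREAK_BEFORE]


def with_break_opportunities(units: list) -> str:
    """Join the units, inserting ZWSP wherever a line break is allowed."""
    pieces = []
    for index, unit in enumerate(units):
        if index > 0 and unit not in NO_BREAK_BEFORE:
            pieces.append(ZWSP)
        pieces.append(unit)
    return "".join(pieces)


# ba, li, carik siki, digit three, carik siki
units = ["\u1B29", "\u1B2E\u1B36", "\u1B5E", "\u1B53", "\u1B5E"]
print(break_opportunities(units))
```

Because each unit is kept whole, no break can fall inside a syllable, and the punctuation check keeps the marks attached to the preceding text.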

Pameneng

In lontar texts where a word must be broken at the end of a line (always after a full syllable), the sign [U+1B60 BALINESE PAMENENG] is inserted. This sign is not used as a word-joining hyphen; it is used only in linebreaking.

A compacted image of a lontar showing a pameneng at the end of a line, with the beginning of the following line below.

In online use, an application would need to insert the pameneng, rather than the content author. As line-length is changed by stretching a window, or as content is added earlier in the same paragraph, the location of the word relative to the line edge will change. The insertion of pameneng is only appropriate at those instants when the appropriate sequence of characters appears at the line end.

For an application to use this correctly, it would need to know where the word boundaries are in the text, and then put this character at the end of the line only when a multisyllabic word is broken. This would require a dictionary to be applied to the text, since it would not be appropriate to insert the pameneng at the boundary of 2 words.
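The decision described here might look like the following Python sketch. It assumes a hypothetical dictionary-based step has already produced `word_starts`, the set of syllable indexes at which a word begins; that step is not shown, and the function names are made up for illustration.

```python
PAMENENG = "\u1B60"  # BALINESE PAMENENG


def line_end_mark(break_index: int, word_starts: set) -> str:
    """Return a pameneng only when the break falls inside a word."""
    return "" if break_index in word_starts else PAMENENG


def break_line(syllables: list, break_index: int, word_starts: set) -> tuple:
    """Split the syllables into two lines, marking a broken word."""
    head = "".join(syllables[:break_index]) + line_end_mark(break_index, word_starts)
    tail = "".join(syllables[break_index:])
    return head, tail


# ma-ka-sa-mi ma-nu-sa-ne, with words starting at syllables 0 and 4
syllables = ["\u1B2B", "\u1B13", "\u1B32", "\u1B2B\u1B36",
             "\u1B2B", "\u1B26\u1B38", "\u1B32", "\u1B26\u1B3E"]
word_starts = {0, 4}
```

Breaking at index 4 leaves the first line unmarked, while breaking at index 6 appends a pameneng because the second word is split. The mark is computed at layout time and is never stored in the content.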

Observation: Aditya Bayu Perdana has found instances in lontar where [U+A983 JAVANESE SIGN WIGNYAN] is moved to the beginning of a line, alone, while a pameneng appears at the end of the previous line. If this is not just a scribal inconsistency (eg. it's not clear why you wouldn't put the wignyan at the end of the line if there's space for a pameneng), it may indicate that this letter should not be a combining mark in Unicode; however, the usage needs to be verified first. See pictures.

Observation: The images appear to show a gap before the pameneng.

 

Character properties

Characters used for the Balinese language have the following assignments related to line-break properties.

AL: ᬅ ᬆ ᬇ ᬈ ᬉ ᬊ ᬋ ᬌ ᬍ ᬎ ᬏ ᬐ ᬑ ᬒ ᬓ ᬔ ᬕ ᬖ ᬗ ᬘ ᬙ ᬚ ᬛ ᬜ ᬝ ᬞ ᬟ ᬠ ᬡ ᬢ ᬣ ᬤ ᬥ ᬦ ᬧ ᬨ ᬩ ᬪ ᬫ ᬬ ᬭ ᬮ ᬯ ᬰ ᬱ ᬳ ᭜
BA: ᭚ ᭛ ᭝ ᭞ ᭟ ᭠
CM: ᬀ ᬁ ᬂ ᬃ ᬄ ᬴ ᬵ ᬶ ᬷ ᬸ ᬹ ᬺ ᬻ ᬼ ᬽ ᬾ ᬿ ᭀ ᭁ ᭂ ᭃ ᭄
NU: ᭐ ᭑ ᭒ ᭓ ᭔ ᭕ ᭖ ᭗ ᭘ ᭙
Legend [u]

AL (ordinary alphabetic and symbol characters) requires other characters to provide break opportunities; otherwise, unless tailored rules are applied, no line breaks are allowed between pairs of them.

BA (break after) indicates that it is normal to break after that character. 

CM (combining mark) takes on the behaviour of its base character.

NU (number) behaves like ordinary characters (AL) in the context of most characters but activates the prefix and postfix behavior of prefix and postfix characters.

Text alignment & justification

According to Sudewa, full justification is not a feature of Balinese text in traditional palm-leaf manuscripts, and only left, or occasionally centred or right alignment is relevant.

Letter spacing

tbd

Counters, lists, etc.

tbd

Styling initials

tbd

Page & book layout

This section is for any features that are specific to Balinese and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

General page layout & progression

Traditionally, Balinese was written on thin, landscape palm-leaf manuscripts, called lontar.

Example of a palm-leaf manuscript from Wikipedia.

The text was packed in without paragraph breaks.

Character lists

Version 12.0 of the Unicode Standard has the following blocks dedicated to the Balinese script (numbers in lists are non-ASCII only):

The Balinese orthography described here uses characters from the following Unicode blocks.

Balinese (83): ᬂ ᬃ ᬄ ᬅ ᬆ ᬇ ᬈ ᬉ ᬊ ᬋ ᬌ ᬍ ᬎ ᬏ ᬐ ᬑ ᬒ ᬓ ᬔ ᬕ ᬖ ᬗ ᬘ ᬙ ᬚ ᬛ ᬜ ᬝ ᬞ ᬟ ᬠ ᬡ ᬢ ᬣ ᬤ ᬥ ᬦ ᬧ ᬨ ᬩ ᬪ ᬫ ᬬ ᬭ ᬮ ᬯ ᬰ ᬱ ᬳ ᬴ ᬵ ᬶ ᬷ ᬸ ᬹ ᬺ ᬻ ᬼ ᬽ ᬾ ᬿ ᭀ ᭁ ᭂ ᭃ ᭄ ᭐ ᭑ ᭒ ᭓ ᭔ ᭕ ᭖ ᭗ ᭘ ᭙ ᭚ ᭛ ᭜ ᭝ ᭞ ᭟ ᭠

The infrequently used characters come from these blocks.

Balinese (2): ᬀ ᬁ

See also the Character usage lookup page, and the Script Comparison Table.

Languages using the Balinese script

According to ScriptSource, the Balinese script is used for the following languages:

References

  1. [ e ] Everson et al., Proposal for encoding the Balinese script in the UCS
  2. [ l ] Library of Congress, Balinese transcription
  3. [ n ] Norbert Lindenberg, Bringing Balinese to iOS
  4. [ o ] Omniglot, Balinese
  5. [ s ] Ida Bagus Adi Sudewa, The Balinese Alphabet
  6. [ ss ] ScriptSource, Balinese
  7. [ u ] The Unicode Standard v6.0
  8. [ w ] Wikipedia, Balinese script