Updated Mon 13 Oct 2014 • tags balinese, scriptnotes
In these notes I synthesize information from various sources, encountered as I explore the Balinese script as used for the Balinese language. The page contains brief notes on general script features and discussions about which Unicode characters are most appropriate when there is a choice. See also the companion document, Balinese Character Notes, which describes the characters used in the Balinese script one by one.
If you click on the Balinese example text, the page will show the constituent characters at the bottom right of the page. (Hint: since the examples are all displayed using graphics (to ensure that you see what is expected), you can copy the characters in the example by clicking on the example and then copying the red text that appears at the bottom right of the page, just above the list of characters.)
For more detailed information, especially about the history and phonology of Balinese, follow the links in the text and at the bottom of the page. You can also click on the symbols in the next section to jump to a description of each character.
The script type is abugida. Consonants carry an inherent vowel a, although that is pronounced ə at the end of a word.
Text runs left-to-right, and words are not separated by spaces.
Consonants have an inherent -a vowel sound. Consonants combine with following consonants in the usual Brahmic fashion: the inherent vowel is silenced by U+1B44 BALINESE ADEG ADEG ᭄ (the Balinese equivalent of the Sanskrit virama), and the following consonant is subjoined or postfixed, often with a change in shape.
Only 18 of the consonants are used for pure Balinese language text. The remainder are used for words derived from Sanskrit or Kawi. (It's not clear to me whether the fact that these originally relate to aspirated or retroflex forms affects the pronunciation.) There are also a few characters in the Unicode block that are used for the Sasak language.
A number of the Sanskrit or Kawi consonants are rather poorly attested. The letter ca laca is only found in non-initial position as ᭄ᬙ, and most of the retroflex series is often omitted in books about the script. The letter JA JERA ᬛ (jha) seems to be known from only one word, ᬦᬶᬃᬛᬭ nirjhara (pond). (It is possible that an original ai may have been lost in Balinese, to be replaced by the glyph for jha.)
The symbols for vocalic r and vocalic l have been reclassified as consonants (see below for details).
To represent consonants without intervening vowels, the non-initial consonant is typically drawn below the initial consonant, and with a slightly different shape. There can be up to 3 consonants combined in this way, and the third consonant must be one of ya, ra, la or wa. In some cases the following consonant appears to the right of the initial consonant.
Otherwise, the sign adeg-adeg is used to show that no vowel is present, eg. kapal (ship).
In Unicode, the adeg-adeg character is used between consonants to cause the conjunct combining behaviour.
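As a minimal sketch of the encoding described above, the following Python snippet joins consonants with adeg-adeg and checks the stated limits (at most three consonants, with the third restricted to ya, ra, la or wa). The function name `build_cluster` is hypothetical, and the check only reflects the rule as summarised in these notes, not a complete model of Balinese orthography.

```python
ADEG_ADEG = "\u1B44"  # BALINESE ADEG ADEG
BA = "\u1B29"         # BALINESE LETTER BA
YA, RA, LA, WA = "\u1B2C", "\u1B2D", "\u1B2E", "\u1B2F"
MEDIALS = {YA, RA, LA, WA}


def build_cluster(*consonants):
    """Return the code point sequence for a stack of one to three consonants."""
    if not 1 <= len(consonants) <= 3:
        raise ValueError("a cluster holds one to three consonants")
    if len(consonants) == 3 and consonants[2] not in MEDIALS:
        raise ValueError("the third consonant must be ya, ra, la or wa")
    # adeg-adeg between consonants triggers the conjunct behaviour
    return ADEG_ADEG.join(consonants)


cluster = build_cluster(BA, RA, YA)
print(" ".join(f"U+{ord(c):04X}" for c in cluster))
```

Whether the result is actually drawn as a stack depends on the font and rendering engine; the code only produces the underlying character sequence.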
Because there is no word separator, consonants at the end of one word and beginning of the following word are normally stacked, too. In some cases this leads to ambiguity about whether this is one or two words. If you really want to make clear which is which, you can use an explicit adeg-adeg, eg. pakraman (membership) vs. Pak Raman (Mr. Raman).
You can do this in Unicode by including a zero-width non-joiner after the adeg-adeg.
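The difference between the two spellings can be sketched in Python as follows. The consonant skeleton used for pakraman (pa, ka, ra, ma, na with a final adeg-adeg) is my own reading of the romanisation, so treat the exact letters as an assumption; the point is only where the zero-width non-joiner goes.

```python
import unicodedata

ADEG_ADEG = "\u1B44"  # BALINESE ADEG ADEG
ZWNJ = "\u200C"       # ZERO WIDTH NON-JOINER
PA, KA, RA, MA, NA = "\u1B27", "\u1B13", "\u1B2D", "\u1B2B", "\u1B26"

# One word: ka + adeg-adeg + ra is free to form a conjunct.
pakraman = PA + KA + ADEG_ADEG + RA + MA + NA + ADEG_ADEG

# Two words: ZWNJ after the adeg-adeg asks for a visible adeg-adeg instead.
pak_raman = PA + KA + ADEG_ADEG + ZWNJ + RA + MA + NA + ADEG_ADEG

for label, text in (("pakraman", pakraman), ("Pak Raman", pak_raman)):
    print(label)
    for ch in text:
        print(f"  U+{ord(ch):04X} {unicodedata.name(ch)}")
```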
A somewhat ambiguous situation arises where convention apparently prevents certain combinations from stacking. For example, the name of the village tamblung should not stack the mbl. The required unstacked form is what you would get by using a zero-width non-joiner after ma, but it can also be achieved by intelligence in the font, as was actually the case when I generated the example for this page. It's not clear to me which is the preferred approach: put zwnj in only when the font doesn't do what you want, or use it always. The latter may lead to more consistent content where different fonts are applied to the text (eg. after cut and paste). In theory, this shouldn't affect searching and sorting, although some applications may not ignore the zwnj as they should.
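An application that wants to match text regardless of whether zwnj was typed could strip it before comparing. This is only a sketch of one possible approach: the helper name `search_key` is hypothetical, and the mbl fragment is encoded here as <ma, adeg-adeg, ba, adeg-adeg, la>, which is my assumption about the spelling.

```python
ADEG_ADEG = "\u1B44"  # BALINESE ADEG ADEG
ZWNJ = "\u200C"       # ZERO WIDTH NON-JOINER
MA, BA, LA = "\u1B2B", "\u1B29", "\u1B2E"


def search_key(text):
    """Drop zero-width non-joiners so that both spellings compare equal."""
    return text.replace(ZWNJ, "")


font_driven = MA + ADEG_ADEG + BA + ADEG_ADEG + LA       # font decides the shape
explicit = MA + ADEG_ADEG + ZWNJ + BA + ADEG_ADEG + LA   # zwnj forces the shape

print(font_driven == explicit)                           # False
print(search_key(font_driven) == search_key(explicit))   # True
```

Note that stripping zwnj also removes deliberate distinctions such as the one between a single word and two words, so it suits loose matching rather than exact comparison.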
Balinese doesn't use ra + pepet to represent the sound rə. Instead it uses ra repa. U+1B0B BALINESE LETTER RA REPA at the beginning of a syllable, such as in kěrěng (eat a lot), is treated as a consonant.
Ra repa has a postfixed form and a subjoined form. The postfixed form is seen where the consonant form of ra repa follows a word which ends in a consonant, such as Pak Rěrěh (Mr Rereh). The sequence of characters to be used for this is <consonant, adeg-adeg, ra repa> (ie. not using U+1B3A BALINESE VOWEL SIGN RA REPA).
The subjoined form is used to represent the original vocalic r. In such cases, it follows a syllable-initial consonant, as in Krěsna (Krishna). This is where U+1B3A BALINESE VOWEL SIGN RA REPA is used. The sequence of characters to be used here is <consonant, vowel sign ra repa>.
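The two encodings of ra repa after a consonant can be summarised in a short Python sketch. The function name and its boolean parameter are hypothetical; the code only restates the two sequences given above and does not decide for you which case applies.

```python
import unicodedata

ADEG_ADEG = "\u1B44"           # BALINESE ADEG ADEG
LETTER_RA_REPA = "\u1B0B"      # BALINESE LETTER RA REPA
VOWEL_SIGN_RA_REPA = "\u1B3A"  # BALINESE VOWEL SIGN RA REPA
KA = "\u1B13"                  # BALINESE LETTER KA


def with_ra_repa(consonant, consonant_ends_previous_word):
    """Return the sequence for a consonant followed by ra repa."""
    if consonant_ends_previous_word:
        # Postfixed form: <consonant, adeg-adeg, ra repa>
        return consonant + ADEG_ADEG + LETTER_RA_REPA
    # Subjoined form (original vocalic r): <consonant, vowel sign ra repa>
    return consonant + VOWEL_SIGN_RA_REPA


for flag in (True, False):
    seq = with_ra_repa(KA, flag)
    print([unicodedata.name(c) for c in seq])
```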
The combining mark rerekan is used, as is a similar sign in Javanese, to extend the character repertoire for foreign sounds. Attested in Library of Congress transliterations and in earlier Sasak orthography are: x, ɣ, ʕ, z, f, v, and ħ. could be used for one-to-one transliteration for Javanese d̠.
In rendering, the dots of these letters appear above the top character, which can cause some ambiguity in reading; could be xja <ka, rerekan, adeg-adeg, ja>, or kza <ka, adeg-adeg, ja, rerekan>, or indeed xza <ka, rerekan, adeg-adeg, ja, rerekan>. In practice these combinations are probably rather rare.
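Although the rendered forms may look alike, the three readings are distinct as stored text. The following Python sketch builds the three sequences listed above and shows that they remain different after Unicode normalisation (NFC is used here only as an illustration).

```python
import unicodedata

KA = "\u1B13"         # BALINESE LETTER KA
JA = "\u1B1A"         # BALINESE LETTER JA
REREKAN = "\u1B34"    # BALINESE SIGN REREKAN
ADEG_ADEG = "\u1B44"  # BALINESE ADEG ADEG

readings = {
    "xja": KA + REREKAN + ADEG_ADEG + JA,
    "kza": KA + ADEG_ADEG + JA + REREKAN,
    "xza": KA + REREKAN + ADEG_ADEG + JA + REREKAN,
}

for label, seq in readings.items():
    print(label, " ".join(f"U+{ord(c):04X}" for c in seq))

# The sequences stay distinct after normalisation.
normalised = {unicodedata.normalize("NFC", seq) for seq in readings.values()}
print(len(normalised))
```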
In recent times, Sasak users abandoned the use of the Javanese-influenced rerekan in favour of a series of modified letters (see above), making use, in addition, of some of the unused Kawi letters for the Arabic sounds. In place of x and ɣ, for instance, the new fusion (of and ) khot sasak and the Kawi letter ga gora are used.
Consonants carry an inherent vowel a, pronounced ə at the end of a word and also in prefixes ma-, pa- and da-. There are vowel signs for all vowel sounds in Balinese except the inherent vowel.
There are also independent vowel forms for most vowels for use at the beginning of a word. In the middle of a word, the vowel sign is used over ha. The vowels pepet and pepet tedong don't have an independent form, and have to be used over ha at the beginning of a word.
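The choice between an independent vowel and a vowel sign over ha can be sketched in Python as below. The table covers only five vowels, the mapping from romanised vowels to characters is my assumption, and the function name `standalone_vowel` is hypothetical.

```python
HA = "\u1B33"  # BALINESE LETTER HA

# romanised vowel -> (independent letter or None, vowel sign)
VOWELS = {
    "i": ("\u1B07", "\u1B36"),  # IKARA, ULU
    "u": ("\u1B09", "\u1B38"),  # UKARA, SUKU
    "e": ("\u1B0F", "\u1B3E"),  # EKARA, TALING
    "o": ("\u1B11", "\u1B40"),  # OKARA, TALING TEDUNG
    "ə": (None, "\u1B42"),      # no independent form; PEPET
}


def standalone_vowel(vowel, word_initial):
    """Return the characters for a vowel that has no consonant of its own."""
    independent, sign = VOWELS[vowel]
    if word_initial and independent is not None:
        return independent
    # Medial vowels, and pepet everywhere, sit over ha.
    return HA + sign


print([f"U+{ord(c):04X}" for c in standalone_vowel("i", word_initial=True)])
print([f"U+{ord(c):04X}" for c in standalone_vowel("i", word_initial=False)])
print([f"U+{ord(c):04X}" for c in standalone_vowel("ə", word_initial=True)])
```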
In Sasak, independent vowel akara can be treated as a consonant insomuch as it can be followed by an explicit adeg adeg in word- or syllable-final position, where it indicates the glottal stop, eg. amaq; other consonants can also be subjoined to it.
Many of the subjoined and post-fixed consonant forms have different shapes from the standard glyph for that character, for example na becomes .
In addition, many conjunct clusters combine characters with special shapes, or subtly change parts of glyphs to join smoothly. Often the changes are significant, especially for the medial consonants ya, ra, wa and la. For example, see the sequence <ba, adeg-adeg, ra, adeg-adeg, ya> in briag (laughter).
Combining vowel signs can also have different shapes depending on the context. For example, the vowel sign tedong typically ligates with the preceding consonant, eg. ha is but <ha, tedong> is and subjoined ya is but <consonant, adeg-adeg, ya, tedong> is .
When two diacritics appear above a consonant, their shape and position need to be adapted.
You can experiment with other examples using the Balinese picker.
There is a set of Balinese digits, and they are used in the same way as Latin digits.
However, many of the digit symbols are indistinguishable from other Balinese letters. Numbers are typically surrounded by carik siki, so that they are easily recognisable, eg. (Bali, 3 July 1982).
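As a small illustration, the Python sketch below converts Latin digits to Balinese digits (U+1B50 to U+1B59) and wraps a number in carik siki. The function names are hypothetical, and placing carik siki directly against the digits with no spacing is an assumption.

```python
import unicodedata

BALINESE_ZERO = 0x1B50  # BALINESE DIGIT ZERO
CARIK_SIKI = "\u1B5E"   # BALINESE CARIK SIKI

# Translation table from ASCII digits to Balinese digits.
TO_BALINESE = {ord(str(d)): chr(BALINESE_ZERO + d) for d in range(10)}


def to_balinese_digits(text):
    """Replace each ASCII digit in text with the matching Balinese digit."""
    return text.translate(TO_BALINESE)


def mark_number(number):
    """Surround a number with carik siki so it is not mistaken for letters."""
    return CARIK_SIKI + to_balinese_digits(str(number)) + CARIK_SIKI


year = to_balinese_digits("1982")
print(year, [unicodedata.digit(c) for c in year])
print(mark_number(3))
```

Because these characters are decimal digits in Unicode, Python's `int()` also accepts them directly, so `int(year)` gives back 1982.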
Both panti and pamada are used to begin a section in text.
Carik pamungkah is used as a colon, and carik siki and carik pareren are used as comma and full stop respectively.
At the end of a section, pasalinan and carik agung may be used (depending on what sign began the section). These are encoded using the punctuation ring windu together with carik pareren and pamada.
In some texts, "holy letters" or modre symbols are made by using ulu candra with these: , , .
Common practice is to break the sentence at any point when it reaches the end of a line, except that no line break should be allowed within a syllable and no line break is allowed just before a colon, comma or full stop.
In lontar texts where a word must be broken at the end of a line (always after a full syllable), the sign pameneng is inserted. This sign is not used as a word-joining hyphen; it is used only in linebreaking.
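The line-breaking rules above can be sketched as a simple greedy wrapper in Python. This is an illustration under strong assumptions: the input is already split into syllables and punctuation marks (the splitting itself is not shown), line width is counted in units rather than measured, and `wrap_units` is a hypothetical name. It does not insert pameneng, since that requires knowing where words begin and end.

```python
CARIK_PAMUNGKAH = "\u1B5D"  # used as a colon
CARIK_SIKI = "\u1B5E"       # used as a comma
CARIK_PAREREN = "\u1B5F"    # used as a full stop
NO_BREAK_BEFORE = {CARIK_PAMUNGKAH, CARIK_SIKI, CARIK_PAREREN}


def wrap_units(units, max_units):
    """Break only between units, and never just before punctuation."""
    lines, current = [], []
    for unit in units:
        line_is_full = len(current) >= max_units
        if line_is_full and unit not in NO_BREAK_BEFORE:
            lines.append("".join(current))
            current = []
        # Punctuation is kept on the current line even if that line is full.
        current.append(unit)
    if current:
        lines.append("".join(current))
    return lines


KA, PA, LA, BA, RA = "\u1B13", "\u1B27", "\u1B2E", "\u1B29", "\u1B2D"
wrapped = wrap_units([KA, PA, LA, CARIK_SIKI, BA, RA], max_units=3)
for line in wrapped:
    print(line)
```

In this sketch a punctuation mark may overhang a full line by one unit, which keeps it attached to the syllable it follows.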
This is a list of main characters or character combinations needed for Balinese. Clicking on these characters will open a page in another window. If the character is underlined, the new page will display additional information about that character.
To test the various contextual forms of these characters, use the Balinese character picker.