Sinhala (draft)
Updated 16 April, 2022

This page brings together basic information about the Sinhala script and its use for the Sinhalese language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Sinhalese using Unicode.

Sample

1 වන වගන්තිය සියලු මනුෂ්‍යයෝ නිදහස්ව උපත ලබා ඇත. ගරුත්වයෙන් හා අයිතිවාසිකම්වලින් සමාන වෙති. යුක්ති අයුක්ති පිළිබඳ හැඟීමෙන් හා හෘදය සාක්ෂියෙන් යුත් ඔවුන්, ඔවුනොවුන්ට සැළකිය යුත්තේ සහෝදරත්වය පිළිබඳ හැඟීමෙනි.

2 වන වගන්තිය ජාති, වංශ, වර්ණ, ස්ත්‍රී පුරුෂ භාවය, භාෂාව, ආගම්, දේශපාලන ආදී කවර බේදයක් හෝ සමාජ, ජාතික, දේපළ, උපත ආදී කවර තත්ත්වයක විශේෂයක් හෝ නොමැතිව මේ ප්‍රකාශනයේ සඳහන් සියලු හිමිකම්වලට හා ස්වාධීනත්වයන්ට සෑම පුද්ගලයකුම උරුම වන්නේය. තවද යම් පුද්ගලයකු අයත්වන රටේ දේශපාලන, නීතිමය හෝ ජාත්‍යන්තර තත්ත්වයන් පිළිබඳ කිසිදු විශේෂයක් ද ඒ රටේ ස්වාධීන, භාරකාර, අස්වාධීන ආදී කවර තත්ත්වයක් පිළිබඳ විශේෂයක් ද නොමැතිව මේ හිමිකම් ඔහු සතු වන්නේය.

Usage & history

The Sinhala script is used for writing the Sinhala language, spoken by approximately 16 million people in Sri Lanka.

සිංහල අක්ෂර මාලාව Siṁhala Akṣara Mālāva 'Sinhalese alphabet'

The alphabet is a descendant of the ancient Indian Brahmi script and is closely related to the South Indian Grantha script and Kadamba alphabet.

Sources: Scriptsource and Wikipedia.

Basic features

The Sinhala script is an abugida, ie. consonants carry an inherent vowel sound that is overridden using vowel signs. In Sinhala, consonants carry an inherent vowel a. See the table to the right for a brief overview of features for the modern Sinhala orthography. (See the key. Character counts exclude ASCII characters.)

Modern Sinhala can be written using a subset of the letters available in the Sinhala Unicode block. The remainder are used for representing the sounds of Sanskrit, Pali and other languages, and include all the aspirated consonants (which are pronounced in the same way as the unaspirated ones).

Sinhala is a diglossic language, that is, the spoken and written forms of the language show considerable variation. ...

Sinhalese is often considered two alphabets, or an alphabet within an alphabet, due to the presence of two sets of letters. The core set, known as the śuddha siṃhala (pure Sinhalese, ශුද්ධ සිංහල) or eḷu hōḍiya (Eḷu alphabet එළු හෝඩිය), can represent all native phonemes. In order to render Sanskrit and Pali words, an extended set, the miśra siṃhala (mixed Sinhalese, මිශ්‍ර සිංහල), is available.

There are two forms of the Sinhala script. The standard, 'pure', form which is taught in schools is called eḷu hōḍiya or śuddha hōḍiya. This system contains twenty consonant and twenty vowel letters and can be used to represent the sounds of the spoken language almost perfectly. However, to adhere to current spelling conventions - some of which represent archaic pronunciations - and to accurately transcribe Sanskrit, Pali, Hindi and English loanwords, a wider set of letters is needed. This set is called 'mixed alphabet' miśra hōḍiya and contains an additional eighteen consonant letters, many of which are aspirated equivalents of existing letters.

Unusually for Indic scripts, there is a set of prenasalised consonants, and there is also an æ vowel.

The virama is usually displayed in consonant clusters. However, it is also possible to render clusters using conjunct forms (ligatures or reduced glyphs). Zero width joiner is used after the virama to signal the intention for that. Putting the ZWJ before the virama produces another form of conjunct, where adjacent consonants touch each other, but this is not used for modern Sinhalese.

Character index

Letters

Basic consonants

ක␣ග␣ඟ␣ච␣ජ␣ට␣ඩ␣ණ␣ඬ␣ත␣ද␣න␣ඳ␣ප␣බ␣ම␣ඹ␣ය␣ර␣ල␣ව␣ස␣හ␣ළ

Extended consonants

ඛ␣ඝ␣ඞ␣ඡ␣ඣ␣ඤ␣ඥ␣ඨ␣ඪ␣ථ␣ධ␣ඵ␣භ␣ශ␣ෂ␣ෆ

Vowels

ඉ␣ඊ␣උ␣ඌ␣එ␣ඒ␣ඔ␣ඕ␣ඇ␣ඈ␣අ␣ආ␣ඓ␣ඖ

Vocalics

Not used for modern Sinhala

ඤ␣ඦ␣ඎ␣ඏ␣ඐ

Combining marks

Vowels

ි␣ී␣ු␣ූ␣ෙ␣ේ␣ො␣ෝ␣ැ␣ෑ␣ා␣ෛ␣ෞ

Vocalics

Bindu

Virama

Visarga

Not used for modern Sinhala

ෲ␣ෟ␣ෳ

Numbers

෦␣෧␣෨␣෩␣෪␣෫␣෬␣෭␣෮␣෯␣𑇡␣𑇢␣𑇣␣𑇤␣𑇥␣𑇦␣𑇧␣𑇨␣𑇩␣𑇪␣𑇫␣𑇬␣𑇭␣𑇮␣𑇯␣𑇰␣𑇱␣𑇲␣𑇳␣𑇴

Punctuation

‘␣’␣“␣”

ASCII

(␣)␣,␣.␣:␣;␣?␣!

Not used for modern Sinhala

Other

‌␣‍

Phonology

These are sounds of the Sinhala language.

Phones in a lighter colour are non-native or allophones. Source Wikipedia.

Vowel sounds

i u e o ə əː ə əː æ æː ɐ a

əː is restricted to English loans.

a and ə are allophones in Sinhala and contrast with each other as inherent vowels in stressed and unstressed syllables, respectively.

Consonant sounds

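              | labial | dental | alveolar | post-alveolar | retroflex | palatal | velar | glottal
stops         | p b    | t d    |          |               | ʈ ɖ       |         | k ɡ   |
pre-nasalised | ᵐb     | ⁿd     |          |               | ᶯɖ        |         | ᵑɡ    |
affricates    |        |        |          | t͡ʃ d͡ʒ t͡ɕ d͡ʑ   |           |         |       |
fricatives    | f ɸ    |        | s        | ʃ ɕ           |           |         |       | h
nasals        | m      |        | n        |               |           |         | ŋ     |
approximants  | ʋ      |        | l        |               |           | j       |       |
trills/flaps  |        |        | r        |               |           |         |       |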

Vowels

Inherent vowel

The inherent vowel is typically transcribed as a, and pronounced a in stressed syllables, and otherwise ə. So ka is written by simply using the consonant letter ක [U+0D9A SINHALA LETTER ALPAPRAANA KAYANNA].

Vowel-signs

Non-inherent vowel sounds that follow a consonant are represented using vowel-signs, eg. kī is written කී [U+0D9A SINHALA LETTER ALPAPRAANA KAYANNA + U+0DD3 SINHALA VOWEL SIGN DIGA IS-PILLA].
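The following is a minimal Python sketch of the storage model just described, using only the standard unicodedata module. The helper name `syllable` is hypothetical and exists only for illustration; the code points are the ones cited in the text above.

```python
import unicodedata

KA = "\u0D9A"             # SINHALA LETTER ALPAPRAANA KAYANNA
DIGA_IS_PILLA = "\u0DD3"  # SINHALA VOWEL SIGN DIGA IS-PILLA


def syllable(consonant: str, vowel_sign: str = "") -> str:
    """Return a consonant followed by an optional vowel-sign, in storage order."""
    return consonant + vowel_sign


ka = syllable(KA)                   # inherent vowel: the consonant letter alone
kii = syllable(KA, DIGA_IS_PILLA)   # vowel-sign stored after the consonant

for text in (ka, kii):
    # List the code points and character names behind each syllable.
    print(text, [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text])
```

The syllable with the inherent vowel is one code point long; adding a vowel-sign appends exactly one more code point after the consonant.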

An orthography that uses vowel-signs is different from one that uses simple diacritics or letters for vowels, in that the vowel-signs are generally attached to a whole orthographic syllable, rather than just applied to the letter of the immediately preceding consonant. This means that pre-base vowel-signs and the left glyph of circumgraphs appear before a whole consonant cluster if it is rendered as a conjunct (see prebase_vowels).

Sinhala vowel-signs are all combining characters. All vowel-signs are stored after the base consonant, and the font puts them in the correct place for display. This also applies for the 4 circumgraphs, where a single code point produces glyphs on more than one side of the consonant base, and the 2 prescript vowel-signs. In principle a single character is used per base consonant, but several vowel signs decompose to more than one character.

Nine vowel-signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
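One way to check which vowel-signs are spacing marks is to look at the Unicode general category, where Mc means spacing combining mark and Mn means non-spacing mark. The sketch below assumes the 13 dedicated vowel-signs used for modern Sinhala in this section (vocalics excluded); the names `VOWEL_SIGNS` and `is_spacing_mark` are illustrative only.

```python
import unicodedata

# The 13 vowel-signs covered in this section, as code points.
VOWEL_SIGNS = [
    0x0DD2, 0x0DD3, 0x0DD4, 0x0DD6,  # i, ii, u, uu
    0x0DD9, 0x0DDA, 0x0DDC, 0x0DDD,  # e, ee, o, oo
    0x0DD0, 0x0DD1, 0x0DCF,          # ae, aae, aa
    0x0DDB, 0x0DDE,                  # ai, au (extended)
]


def is_spacing_mark(cp: int) -> bool:
    """True if the character's general category is Mc (spacing combining mark)."""
    return unicodedata.category(chr(cp)) == "Mc"


spacing = [cp for cp in VOWEL_SIGNS if is_spacing_mark(cp)]
non_spacing = [cp for cp in VOWEL_SIGNS if not is_spacing_mark(cp)]

print("spacing:    ", [f"U+{cp:04X}" for cp in spacing])
print("non-spacing:", [f"U+{cp:04X}" for cp in non_spacing])
```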

One particular affix, යි yi, is pronounced j and treated as a final consonant.

Combining marks used for vowels

Sinhala uses the following dedicated combining marks for vowels.

ි␣ී␣ු␣ූ␣ෙ␣ේ␣ො␣ෝ␣ැ␣ෑ␣ා␣ ␣ෛ␣ෞ

The vowel letters of Sinhala are divided into a core set and an extended set. The core (ʃuddʰa) alphabet covers the sounds of modern spoken Sinhala. The extended (miʃra) letters and vocalics are used for writing Sanskrit, Pali, and Tamil words. The extended vowel-signs are the 2 right-hand ones (diphthongs) in the list above.

Pre-base vowel-signs

ෙ␣ ␣ෛ

Two vowel-signs appear to the left of the base consonant letter or cluster, eg. කෙ.

The first of these is a core letter, the second an extended letter.

These are combining marks that are always stored after the base consonant. The font places the glyph before the base consonant.

Circumgraphs

ේ␣ො␣ෝ␣ ␣ෞ

Four vowels are produced by a single combining character with visually separate parts that appear on different (mostly opposite) sides of the consonant onset eg. කො ko.

These are all core letters, except for the last.

Encoding. All of these circumgraphs can be written as a single code point, or as multiple code points.

  1. [U+0DDA SINHALA VOWEL SIGN DIGA KOMBUVA]
    ේ [U+0DD9 SINHALA VOWEL SIGN KOMBUVA + U+0DCA SINHALA SIGN AL-LAKUNA]
  2. [U+0DDC SINHALA VOWEL SIGN KOMBUVA HAA AELA-PILLA]
    ො [U+0DD9 SINHALA VOWEL SIGN KOMBUVA + U+0DCF SINHALA VOWEL SIGN AELA-PILLA]
  3. [U+0DDD SINHALA VOWEL SIGN KOMBUVA HAA DIGA AELA-PILLA]
    ෝ [U+0DD9 SINHALA VOWEL SIGN KOMBUVA + U+0DCF SINHALA VOWEL SIGN AELA-PILLA + U+0DCA SINHALA SIGN AL-LAKUNA]
  4. [U+0DDE SINHALA VOWEL SIGN KOMBUVA HAA GAYANUKITTA]
    ෞ [U+0DD9 SINHALA VOWEL SIGN KOMBUVA + U+0DDF SINHALA VOWEL SIGN GAYANUKITTA]

The single code point per vowel-sign is the form preferred by the Sinhala encoding standards and the form in common use for Sinhala. The parts are separated, however, in Unicode when normalised using Normalisation Form D (NFD). If Normalisation Form C (NFC) is applied, they recompose.

Whichever approach is used, the vowel-signs must be typed and stored after the consonant characters they surround. In the case of decomposed vowel-signs, the order is also important and must be as shown above.
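The normalisation behaviour described above can be observed with Python's standard unicodedata module. This is a small sketch; the dictionary name `CIRCUMGRAPHS` and the helper `code_points` are illustrative, and the mapping simply restates the four decompositions listed above.

```python
import unicodedata

# Single code point -> decomposed sequence, in the order listed above.
CIRCUMGRAPHS = {
    "\u0DDA": "\u0DD9\u0DCA",        # KOMBUVA + AL-LAKUNA
    "\u0DDC": "\u0DD9\u0DCF",        # KOMBUVA + AELA-PILLA
    "\u0DDD": "\u0DD9\u0DCF\u0DCA",  # KOMBUVA + AELA-PILLA + AL-LAKUNA
    "\u0DDE": "\u0DD9\u0DDF",        # KOMBUVA + GAYANUKITTA
}


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


for single, parts in CIRCUMGRAPHS.items():
    decomposed = unicodedata.normalize("NFD", single)  # splits into parts
    recomposed = unicodedata.normalize("NFC", parts)   # joins parts again
    print(code_points(single), "-> NFD ->", code_points(decomposed))
    print(code_points(parts), "-> NFC ->", code_points(recomposed))
```

Comparing text after normalising both sides to the same form avoids treating the single and multiple code point spellings as different strings.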

Vowel-sign placement

The following list shows where vowel-signs are positioned around a base consonant to produce vowels, and how many instances of that pattern there are.

Standalone vowels

Sinhala represents standalone vowels using a set of independent vowel letters. The set includes a character to represent the inherent vowel sound.

The core (ʃuddʰa) alphabet includes the following.

ඉ␣ඊ␣උ␣ඌ␣එ␣ඒ␣ඔ␣ඕ␣ඇ␣ඈ␣අ␣ආ

The extended (miʃra) letters are as follows, but see also vocalics:

ඓ␣ඖ

The pronunciations of [U+0D85 SINHALA LETTER AYANNA] and [U+0D86 SINHALA LETTER AAYANNA] vary, but in a fairly predictable way. The former is a in the first syllable, except for a few words, and before double consonants or clusters, and ə word finally and before single consonants. The latter represents aː everywhere except word-finally, where it may be a, depending on the word structure. Similar length rules apply to e and o in final position.

Vowel absence

[U+0DCA SINHALA SIGN AL-LAKUNA] is attached to a consonant to indicate that the inherent vowel is not pronounced. It has 2 different shapes, depending on which base consonant it is attached to.

ක්   ඛ්
The two different shapes of AL-LAKUNA. Combined with shuddha k on the left, and mishra k on the right.

Consonants without a following vowel typically occur at the end of a word, or as part of a consonant cluster or geminate (see clusters), eg. අලුත් ඇතැම් කන්ද

Vocalics

These are all classed as extended (miʃra) letters. Most are no longer in contemporary use.

ඍ␣ෘ
ඎ␣ඏ␣ඐ␣ෲ␣ෟ␣ෳ

Consonants

As for vowels, Sinhala uses a core set of consonants for writing the modern, spoken language of Sinhalese, but has an extended set used for writing Sanskrit, Pali, and Tamil words.

Basic (ʃuddʰa) consonants

The core set, or ʃuddʰa hōɖiya, is based on the classical grammar of the middle ages (called එළු හෝඩිය ẹɭu hōɖiya) and contains the following consonants:

Stops

ප␣බ␣ඹ␣ත␣ද␣ඳ␣ට␣ඩ␣ඬ␣ක␣ග␣ඟ

Affricates

ච␣ජ

Fricatives

ස␣හ

Nasals

ම␣න␣ණ

Other sonorants

ව␣ර␣ල␣ළ␣ය

miʃra & other consonants

The full set of consonants, known as miʃra hōɖiya (mixed alphabet), includes the additional consonants in this section.

Stops

ඵ␣භ␣ථ␣ධ␣ඨ␣ඪ␣ඛ␣ඝ

Affricates

ඡ␣ඣ

Fricatives

ෆ␣ශ␣ෂ

Nasals

ඤ␣ඥ␣ඞ

Note that the aspirated miʃra consonants are mapped to the same sounds as the unaspirated ʃuddʰa ones, and the retroflex and are each pronounced without retroflexion.

Sinhalese has a new character for f, [U+0DC6 SINHALA LETTER FAYANNA]. Sometimes, instead, a character is used that combines the Latin letter 'f' with the Sinhalese p, [U+0DB4 SINHALA LETTER ALPAPRAANA PAYANNA].

Prenasalised consonants

A peculiarity of Sinhalese among Indic scripts is the inclusion of prenasalised consonants, representing a nasal sound followed by a stop. The orthography distinguishes these graphemes from the more straightforward nasal consonant followed by a stop. For example, compare අණ්ඩ අඬ

The prenasalised letters are:

ඹ␣ඳ␣ඬ␣ඟ␣ඥ

The prenasalised shapes are formed from a combination of the shapes of the participating characters.

There is one additional, archaic letter.

Final consonants

Two combining characters are used to represent syllable-final consonant sounds.

ං␣ඃ

ං [U+0D82 SINHALA SIGN ANUSVARAYA] usually represents the sound ŋ, eg. සිංහල

ඃ [U+0D83 SINHALA SIGN VISARGAYA] is also in the repertoire, but it is not clear how it is used in Sinhala.

Either of these 'semi-consonants' must be used after a vowel or after a consonant+vowel (including the inherent vowel), and must be the last combining character in the syllable.

Consonant clusters

Consonant cluster handling is a little unusual in Sinhala, compared to other Indic scripts.

There are 3 ways of managing consonant clusters. Modern Sinhala uses only the first two alternatives.

  1. Visible virama : Show [U+0DCA SINHALA SIGN AL-LAKUNA], called al lakuna, over the first character in the cluster. Unlike in Devanagari, this is the default for Sinhala.
  2. Conjunct forms : Use a reduced or ligated form, especially for r or y. Since the approach changes the shape of the constituent components, the cluster is referred to as a conjunct.
  3. Touching consonants : Make the consonants touch (not used in modern Sinhala).

See also finals.

See a table of 2-consonant clusters.
The table allows you to test results for various fonts.

Using a visible virama

Consonant Virama Consonant

The virama indicates that a consonant has no vowel (see novowel). The shape of the virama can take two forms, depending on the base character it is appended to: with k you get ක්; with kh you get ඛ්, eg. ලක්ෂය අම්මා

If a pre-base vowel-sign is added after the last consonant in a cluster, it will appear immediately to the left of that consonant, rather than before the first consonant in the cluster, eg. see how the vowel in fig_kko cuts between the two consonants in the cluster.

The pre-base part of the vowel (highlighted) appears immediately before the consonant after which it is pronounced, rather than at the beginning of the consonant cluster.

Conjuncts

Consonant Virama ZWJ Consonant

The combination   ්‍+ZWJ [U+0DCA SINHALA SIGN AL-LAKUNA + U+200D ZERO WIDTH JOINER] causes the font to hide the virama glyph and form a conjunct, eg. ඉංග්‍රීසි

This approach is principally used when combining r or y with another consonant (both before and after, in the case of r), and produces a reduced or ligated form.

ර්‍ක ක්‍ර ක්‍ය
Common conjuncts in Sinhala.

There are also forms using both, eg. ක්‍ය්‍ර kyra 'Kyra', and කාර්‍ය්‍යාලය kār͓₊y͓₊yāly (kār͓₊y͓₊yālaya) 'the office'.

Although the use of the conjunct with r is required in normal Sinhalese text, it is possible to not use it: both of the following are valid ways to write karma: කර්ම kr͓m and කර්‍ම kr͓₊m

Wikipedia lists several more conjuncts, some of which are reproduced below. The availability of these conjuncts is font dependent, eg. ඳ්‍ව ⁿd͓₊v doesn't ligate using the default font of this page, but may with another.

ක්‍ව␣ක්‍ෂ␣ත්‍ථ␣ත්‍ව␣න්‍ද␣න්‍ධ␣න්‍ද්‍ර

Touching consonants

Consonant ZWJ Virama Consonant

The third approach is used in ancient scriptures but is not used in modern Sinhala. It hides the virama and moves the consonants alongside each other, so that they are touching, eg. මම becomes ම‍්ම mm

For this use ZWJ first, ie. ‍් [U+200D ZERO WIDTH JOINER + U+0DCA SINHALA SIGN AL-LAKUNA].
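The three code point orders described in this section can be summarised in a short Python sketch. The function names are hypothetical; only the positions of AL-LAKUNA and ZWJ come from the text above. Whether a conjunct or touching form is actually displayed still depends on the font.

```python
AL_LAKUNA = "\u0DCA"  # SINHALA SIGN AL-LAKUNA (virama)
ZWJ = "\u200D"        # ZERO WIDTH JOINER


def visible_virama(first: str, second: str) -> str:
    """Consonant Virama Consonant: the default, with the virama displayed."""
    return first + AL_LAKUNA + second


def conjunct(first: str, second: str) -> str:
    """Consonant Virama ZWJ Consonant: requests a reduced or ligated form."""
    return first + AL_LAKUNA + ZWJ + second


def touching(first: str, second: str) -> str:
    """Consonant ZWJ Virama Consonant: touching letters (not modern usage)."""
    return first + ZWJ + AL_LAKUNA + second


KA, RA, MA = "\u0D9A", "\u0DBB", "\u0DB8"

print(visible_virama(KA, RA))  # k + r with a visible al-lakuna
print(conjunct(KA, RA))        # k + r as a conjunct, if the font supports it
print(touching(MA, MA))        # m + m as touching consonants
```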

Numbers, dates, currency, etc.

Sinhala uses European digits.

There is, however, a set of native digits that was used into the 20th century, though mostly associated with horoscopes. The shapes of some of these are identical to characters used for other purposes.

෦␣෧␣෨␣෩␣෪␣෫␣෬␣෭␣෮␣෯

There is also another, older set that was used in an archaic number system, called Sinhala Illakkam, prior to 1815. These are all in the Sinhala Archaic Numbers block.

𑇡␣𑇢␣𑇣␣𑇤␣𑇥␣𑇦␣𑇧␣𑇨␣𑇩␣𑇪␣𑇫␣𑇬␣𑇭␣𑇮␣𑇯␣𑇰␣𑇱␣𑇲␣𑇳␣𑇴
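Both sets carry numeric values in the Unicode Character Database, which software can use when it meets them in older text. The sketch below assumes Python's standard library only; the function names are illustrative. The native digits (U+0DE6 to U+0DEF, named SINHALA LITH DIGIT in Unicode) are decimal digits, so `int()` accepts them directly, while the archaic numbers are not positional and have to be read one character at a time.

```python
import unicodedata


def lith_to_int(text: str) -> int:
    """Convert a string of native Sinhala digits (U+0DE6..U+0DEF) to an int.

    int() accepts any Unicode decimal digit (general category Nd).
    """
    return int(text)


def archaic_value(ch: str) -> float:
    """Return the numeric value of one Sinhala Archaic Numbers character."""
    return unicodedata.numeric(ch)


print(lith_to_int("\u0DE7\u0DE8\u0DE9"))  # native digits one, two, three
print(archaic_value("\U000111E1"))        # first character of the archaic set
print(archaic_value("\U000111F4"))        # last character of the archaic set
```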

Text direction

Sinhala text runs left to right in horizontal lines.

Show default bidi_class properties for characters in the Sinhala orthography described here.
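As a rough illustration, the default bidi classes can be read with Python's standard unicodedata module. The sample selection and the name `SAMPLES` are assumptions made for this sketch, not a complete inventory of the orthography.

```python
import unicodedata

# A few representative characters from the orthography described here.
SAMPLES = {
    "\u0D9A": "consonant letter KA",
    "\u0D85": "independent vowel A",
    "\u0DD9": "spacing vowel-sign KOMBUVA",
    "\u0DD2": "non-spacing vowel-sign IS-PILLA",
    "\u0DCA": "AL-LAKUNA (virama)",
    "\u200D": "ZERO WIDTH JOINER",
}

for ch, label in SAMPLES.items():
    # bidirectional() returns the Bidi_Class short name, eg. L, NSM or BN.
    print(f"U+{ord(ch):04X} {label}: {unicodedata.bidirectional(ch)}")
```

Letters and spacing marks are strongly left-to-right (L), non-spacing marks take their direction from the base (NSM), and ZWJ is a boundary-neutral format character (BN).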

Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Sinhala character app.

Sinhala text is not cursive (ie. joined up like Arabic); however, there is a significant amount of interaction between glyphs, and some joining, around consonant clusters.

The orthography has no case distinction, and no special transforms are needed to convert between characters.

Context-based shaping & positioning

Contextual shaping

Similarly to the Tamil script, the u and ū vowels assume various different shapes and connection points, depending on what consonant they follow.

(-a)  -u  -ū
k     කු   කූ
p     පු   පූ
r     රු   රූ
ḷ     ළු   ළූ
Shape variants for the u and ū vowel signs.

Other idiosyncratic combinations are also possible, such as the rendering of රැ and රෑ.

ra  රැ  රෑ
Shape variants for the æ and ǣ vowel signs.

Combining characters may need to be adapted to fit the consonants they are attached to.

ක් ඛ්    පි රි ඬි

Two different versions of hal kirīma (left); differently shaped i in pi, ri and ⁿɖi (right).

As described above, consonant clusters may cause conjuncts to form, as a way of indicating that there are no intervening vowels. Conjunct ligations are generally expected for r and y, and other conjuncts depend on font availability. Generally, a conjunct is formed by reducing the non-final consonant shapes.

ක්‍ව    ක්ව

Conjoined kv (left), and kv with hal kirīma (right).

Context-based positioning

Vowel signs may appear above, below, to the right, to the left, or on both sides of the base consonant.

ක කි කු කැ කෙ කො

Position of vowel signs for the sequence ka ki ku kæ ke ko.

Vowel signs are positioned around an orthographic syllable, rather than around a specific consonant. So a part of a vowel sign that appears to the left of its base will appear to the left of a conjunct.

ක්‍වො

In the syllable kvo the vowel sign appears on either side of the conjunct, not the letter v.

When a u vowel (or the long vowel) appears below a conjunct, it is placed below the final consonant, eg. ක්‍යු k͓₊yu

Explicit shaping controls

U+200D ZERO WIDTH JOINER (ZWJ) is used to produce conjuncts (see conjuncts).

Font styles

tbd

Text segmentation

Sinhala uses word boundaries for line wrapping and basic justification, but may use grapheme boundaries (sometimes called orthographic syllables) for other operations that work at the sub-word level.

Phrase, sentence, and section delimiters are described in phrase.

Grapheme boundaries

CCS here represents a base character with zero or more combining marks. Sinhala has a virama, but it doesn't combine CCSs unless it is accompanied by a ZWJ. A base is either a consonant or an independent vowel.

Basic typographic units

Base Combining_mark*

Most of the time, in modern Sinhala, a typographic unit is equivalent to a single CCS. The virama, which is a visible vowel-killer, is just another combining mark and doesn't extend the typographic unit.

Sequences that include a syllable nucleus may include combining marks with the following Indic syllabic categories:

  1. Vowel_Dependent (see combiningvowels)
  2. Bindu (see finals)
  3. Visarga (see finals)
අදිනවා
පුංචි

Syllable codas and vowel-less consonants in clusters are written using a sequence of consonant letter followed by a visible virama, called al-lakuna (see novowel).

අලුත්
කන්ද

Conjunct typographic units

(Base Virama ZWJ)+ Base Combining_mark*

Where conjuncts appear, a typographic unit contains multiple CCS units. The union is signalled by ්‍+ZWJ [U+0DCA SINHALA SIGN AL-LAKUNA + U+200D ZERO WIDTH JOINER] before a base.

ඉංග්‍රීසි
චර්‍මය

Correct behaviour is font-dependent. If a font doesn't have a conjunct form for a particular combination of characters, it will fall back to what looks like 2 basic typographic units; however, the sequence still behaves as a single typographic unit. For example, observe the placement of the pre-base vowel in fig_kro. In the syllable kro on the left, the vowel-sign surrounds the whole conjunct. In the middle we drop the ZWJ to give -k.ro, and now the pre-base glyph precedes the RA. The same should happen for the right-hand example, if the code points indicate a conjunct but the font doesn't have the necessary glyphs.

ක්‍රො  ක්රො  *කේරා
Placement of pre-base vowel glyphs.
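The two patterns given above (Base Combining_mark* and (Base Virama ZWJ)+ Base Combining_mark*) can be approximated with a regular expression. This is a sketch under stated assumptions, not a definitive segmenter: the character ranges are taken broadly from the Sinhala block, the names `UNIT` and `typographic_units` are hypothetical, and the touching-consonant form is not handled.

```python
import re

CONSONANTS = "\u0D9A-\u0DC6"     # consonant letters
VOWEL_LETTERS = "\u0D85-\u0D96"  # independent vowels
MARKS = "\u0D81-\u0D83\u0DCA\u0DCF-\u0DDF\u0DF2\u0DF3"  # signs, virama, vowel-signs
AL_LAKUNA = "\u0DCA"
ZWJ = "\u200D"

UNIT = re.compile(
    f"(?:[{CONSONANTS}]{AL_LAKUNA}{ZWJ})*"      # zero or more conjunct links
    f"[{CONSONANTS}{VOWEL_LETTERS}][{MARKS}]*"  # a base plus its combining marks
    "|.",                                       # anything else stands alone
    re.DOTALL,
)


def typographic_units(text: str) -> list[str]:
    """Split text into basic or conjunct typographic units."""
    return UNIT.findall(text)


# Conjunct: g + al-lakuna + ZWJ + r stays in one unit with its vowel-sign.
english = "\u0D89\u0D82\u0D9C\u0DCA\u200D\u0DBB\u0DD3\u0DC3\u0DD2"
print(typographic_units(english))

# Without ZWJ the visible virama ends the unit, giving k-with-virama then ro.
print(typographic_units("\u0D9A\u0DCA\u0DBB\u0DDC"))
```

Because the virama plus ZWJ prefix is tried before the plain base, a consonant followed by a bare virama falls through to the basic pattern and forms its own unit.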

Grapheme clusters

Unicode grapheme clusters are equivalent to the basic typographic units described above. However, they are not useful for dealing with conjunct typographic units, since the virama always causes them to stop.

Observation: Browsers based on the Gecko engine (Firefox, etc.) use grapheme clusters for cursor movement and forward delete, whereas Blink (Chrome) and WebKit (Safari) browsers use the typographic units described here.

Word boundaries

Words are separated by spaces.

Punctuation & inline features

Phrase & section boundaries

,␣:␣;␣.␣?␣!␣෴
phrase      , [U+002C COMMA]    ; [U+003B SEMICOLON]    : [U+003A COLON]
sentence    . [U+002E FULL STOP]    ? [U+003F QUESTION MARK]    ! [U+0021 EXCLAMATION MARK]

Sinhala uses western punctuation.

The punctuation character [U+0DF4 SINHALA PUNCTUATION KUNDDALIYA] once functioned to indicate the end of a paragraph, but is not used for modern Sinhala content.

Parentheses & brackets

(␣)
            start                          end
standard    ( [U+0028 LEFT PARENTHESIS]    ) [U+0029 RIGHT PARENTHESIS]

Quotations

‘␣’␣“␣”
            start                                    end
initial     “ [U+201C LEFT DOUBLE QUOTATION MARK]    ” [U+201D RIGHT DOUBLE QUOTATION MARK]
nested      ‘ [U+2018 LEFT SINGLE QUOTATION MARK]    ’ [U+2019 RIGHT SINGLE QUOTATION MARK]

Single quotation marks are used for quotations within quotations.

Emphasis

tbd

Other inline ranges

tbd

Text spacing

tbd

Abbreviation, ellipsis & repetition

tbd

Inline notes & annotations

tbd

Other punctuation

tbd

Line & paragraph layout

Line breaking & hyphenation

tbd

Show (default) line-breaking properties for characters in the modern Sinhala orthography.

Text alignment & justification

tbd

Counters, lists, etc.

tbd

Styling initials

tbd

Page & book layout

This section is for any features that are specific to Sinhala and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

Input

The Sinhala keyboard has dead keys which change the assignments of keys around them when pressed. For example, pressing the key for e will change several keys to letters that start with the same symbol.

Sinhala keyboard in default state.

Sinhala keyboard after the key for e is pressed.

Note also, in the bottom left corner, that the keyboard has a key for the combination of ් + ZWJ + ර [U+0DCA SINHALA SIGN AL-LAKUNA + U+200D ZERO WIDTH JOINER + U+0DBB SINHALA LETTER RAYANNA], ie. the conjoined -r. The shifted layout has a similar key for -y.

There is a rephaya key (for the sequence ර + ් + ZWJ [U+0DBB SINHALA LETTER RAYANNA + U+0DCA SINHALA SIGN AL-LAKUNA + U+200D ZERO WIDTH JOINER]), but it is typed after the consonant that normally follows it in memory. The input method then has to rearrange the codepoints in canonical order.

Effectively, you type characters or parts of multipart characters in visual order, and the system then has to rearrange things to produce the expected codepoint order.
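The reordering step for the rephaya key might look something like the following Python sketch. It is an illustration only, not the behaviour of any particular keyboard: the token `REPHAYA_KEY` and the function `to_storage_order` are hypothetical, and the rule used (insert the sequence before the most recent consonant) is an assumption based on the description above.

```python
RA = "\u0DBB"         # SINHALA LETTER RAYANNA
AL_LAKUNA = "\u0DCA"  # SINHALA SIGN AL-LAKUNA
ZWJ = "\u200D"        # ZERO WIDTH JOINER
REPHAYA_KEY = "<rephaya>"  # hypothetical token standing for the rephaya key


def is_consonant(ch: str) -> bool:
    """True for a single Sinhala consonant letter (U+0D9A..U+0DC6)."""
    return "\u0D9A" <= ch <= "\u0DC6"


def to_storage_order(keystrokes: list[str]) -> str:
    """Turn keystrokes typed in visual order into the stored code point order."""
    out: list[str] = []
    for key in keystrokes:
        if key != REPHAYA_KEY:
            out.append(key)
            continue
        # Find the most recently typed consonant; the rephaya goes before it.
        i = len(out) - 1
        while i >= 0 and not is_consonant(out[i]):
            i -= 1
        if i < 0:
            i = len(out)  # no consonant typed yet: just append the sequence
        out[i:i] = [RA, AL_LAKUNA, ZWJ]
    return "".join(out)


# Typing k, m, then the rephaya key stores k + r + al-lakuna + ZWJ + m (karma).
KA, MA = "\u0D9A", "\u0DB8"
print(to_storage_order([KA, MA, REPHAYA_KEY]))
```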

References