Updated 8 April, 2024
This page brings together basic information about the Newa (or Pracalit) script and its use for the Newar language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Newar using Unicode.
Richard Ishida, Newar (Newa) Orthography Notes, 08-Apr-2024, https://r12a.github.io/scripts/newa/new
๐ฉ๐ธ๐ฎ๐ธ๐๐ซ๐ ๐ฐ๐ ๐๐ธ ๐ฌ๐ต๐๐ฃ๐๐๐ถ๐ ๐ซ๐ฌ๐ถ๐ฐ๐ฌ๐๐๐ฃ ๐ค๐น๐๐ธ ๐ณ๐ฐ๐ถ๐ข๐ต๐ฃ๐ ๐ฃ๐พ๐ซ๐ต๐ ๐ซ๐ต๐ ๐ณ๐๐๐ท๐ซ ๐ฎ๐๐๐๐ต๐ฃ๐๐๐๐ฌ๐ ๐๐๐๐ฃ๐๐๐๐ฌ ๐ฃ๐ถ๐ฌ๐๐ฉ๐ต๐ ๐ซ๐ต๐๐๐ธ ๐๐น๐๐ธ๐ฎ๐ถ๐ ๐ฌ๐พ๐ฅ๐ต๐ ๐ซ๐ต ๐๐ก๐ถ๐ฐ๐ต๐ณ๐ท ๐ฃ๐พ๐ฐ๐ต๐ ๐๐ฃ๐ซ๐ธ๐ณ๐ ๐ฃ๐พ๐ฐ๐ต๐ ๐ณ๐ต๐ซ๐๐๐ ๐ฌ๐ต๐๐๐ฌ๐ซ๐ต ๐ฉ๐ต๐ ๐ซ๐ต๐ ๐๐ธ ๐๐๐ฌ๐ธ ๐
Source: Lipi Pau Monthly newspaper, (February 2009)
Origins of the Newa script, 10th century to today.
Phoenician
→ Aramaic
→ Brahmi
→ Gupta
→ Siddham
→ Nepali
→ Newa (Prachalit)
+ Ranjana
+ Bhujimol
Newa (also known as Prachalit or Nepaalalipi) is a Brahmi-derived script used principally to write the Tibeto-Burman language Newar (also known as Nepal Bhasa). The language is spoken by around 800,000 people, predominantly in the Kathmandu valley (the 5th most spoken language in Nepal), plus 14,000 in Sikkim, where it is recognised as a state language. The Newar language is mostly written in Devanagari, but there is a movement to promote more use of the Newa script.
It has also been used to write Sanskrit, Bengali, Maithili, and Hindi.
𑐣𑐾𑐰𑐵𑑅 𑐨𑐵𑐫𑑂 newaː bʱaj Newa (Newar)
𑐣𑐾𑐥𑐵𑐮 𑐨𑐵𑐲𑐵 nepal bʱasa Newar (Nepalese)
The script emerged in the 10th century and was actively used until Gorkha rule ended the reign of Newar dynasties in 1769, after which the use began to decline. The use of the Newa script and Newar language was banned by the Rana government in 1905, with harsh treatment of proponents. When Rana rule ended in 1951, the ban was lifted, but the effects are still felt.
A revival initiative gained momentum in the 1980s, and a standard was created by the Nepal Lipi Guthi with the help of leading scholars in 1989.
Newa is one of at least 6 scripts used for writing Nepali languages, which include Ranjana, Bhujimol, Kutila, Golmol, and Litumol.
Sources: L2/12-003R and ScriptSource.
The Newa script is an abugida. Consonants carry an inherent vowel which can be modified by appending vowel signs to the consonant. See the table to the right for a brief overview of features for the modern Newar orthography.
An unusual feature of Newa orthography is that vowel signs with a wavy horizontal line replace the flat headstroke of the base consonant. Newa also has consonant-vowel ligatures.
Newa runs left to right in horizontal lines. Words are separated by spaces.
โฏ consonantSummary
The 29 consonant letters used for Newar include precomposed characters for 4 out of 6 murmured consonants.
Consonant clusters are normally rendered using fused forms. A visible virama may be used. Initial RA is rendered as a reph over the top right of the following consonant.
โฏ basicV
The Newar orthography is an abugida. Most sources list a single inherent vowel ษ; however, the inherent vowel after w appears to be ษ. Other vowels following a consonant are mostly written using dedicated combining marks.
The vowels ษː and æː in Kathmandu Newar are written using the letter 𑐫𑑂 (U+1142B NEWA LETTER YA followed by U+11442 NEWA SIGN VIRAMA), which always retains its visible virama, even when word-medial.
Newa has 1 pre-base glyph, and has 4 circumgraphs that form only in certain character combinations, and do not decompose. There are 3 multipart vowel signs if the vowel lengthener is counted as part of the vowel.
Newar vowel signs are involved in a number of unusual, context-dependent shaping behaviours, including alternate shapes for vowel signs, integration of vowel signs into the consonant's headstroke, and the formation of circumgraphs for certain consonant-vowel combinations.
Standalone vowel sounds are written using 10 independent vowel letters, as well as 𑐫𑑂.
The vowel signs and independent vowels are combined with diacritics to indicate various combinations of vowel length and nasalisation. The i and u sounds have different symbols for short and long values, but U+11445 NEWA SIGN VISARGA is used to indicate length for the other vowels. U+11443 NEWA SIGN CANDRABINDU is used to nasalise a short vowel, and U+11444 NEWA SIGN ANUSVARA for a long vowel. Note, however, that where a sound has separate symbols for short and long, the latter nasalisation diacritic is used with the symbol for the short vowel.
Vowel absence is indicated in a regular fashion by the use of U+11442 NEWA SIGN VIRAMA.
There is a set of 4 vocalics, each with vowel sign and independent forms, but only 1 is used, and not in modern Newa.
Newa has native digit shapes.
Danda (from the Devanagari block) is used at the end of a sentence, and is usually preceded by a space. Otherwise, most of the punctuation is ASCII.
Distinctive characteristics: headline replacement, contextual circumgraphs, fused conjuncts dominate.
These are sounds for the Kathmandu dialect of the Newar language.
Phones in a lighter colour are non-native or allophones. Source Wikipedia.
All of the vowels and diphthongs can be nasalised (see length_nasalisation).
o, oː and u can also be pronounced ษ, ษː, and ส. (Wikipedia, Vowels)
The sound ษ, or something close to it, is used in the Dolakha Newar dialect, spoken outside Kathmandu. (Wikipedia, Vowels)
The retroflex sounds only occur in the small Dolakha Newar dialect, located to the east of Kathmandu. (Wikipedia, Consonants)
Tap consonants ɾ and ɾʱ can occur as word-medial alternates of t, d, dʱ, or (in Dolakha) ษ. (Wikipedia, Consonants)
ŋ occurs only in word-final position in the Kathmandu dialect. (Wikipedia, Consonants)
The following table summarises the main vowel to character assignments.
โ represents the inherent vowel. Diacritics are added to the vowels to indicate nasalisation (only shown here for the dependent vowels).
Plain: | ||
---|---|---|
Diphthongs: |
For additional details see vowel_mappings.
This is the full set of characters needed to represent the Newar language vowels.
𑐎 kษ U+1140E NEWA LETTER KA
ษ following a consonant is not written, but is seen as an inherent part of the consonant letter, so kษ is written by simply using the consonant letter.
๐๐ฎ
Wiktionary transcriptions indicate that the inherent vowel after 𑐰 is pronounced ษ.
๐ฐ๐๐ธ
๐ง๐๐ฐ๐ด
The inherent vowel can be lengthened and nasalised. This is indicated in the same way as for other vowels, except that there is no vowel sign involved.
๐ณ๐๐๐๐ฐ๐
๐๐๐๐
Other than the inherent vowel, vowels following a consonant are mostly written using dedicated combining marks called vowel signs. The vowel signs are combined with other diacritics to indicate the various combinations of vowel length and nasalisation. See basicV for a summary.
The vowels ษː and æː in Kathmandu Newar are written using the letter 𑐫𑑂, which always retains its visible virama, even when word-medial.
Newar vowel signs are involved in a number of unusual, context-dependent shaping behaviours, including alternate shapes for vowel signs, integration of vowel signs into the consonant's headstroke, and the formation of circumgraphs for certain consonant-vowel combinations.
𑐎𑐷 kiː U+1140E NEWA LETTER KA + U+11437 NEWA VOWEL SIGN II
Newar uses the following dedicated combining marks for vowels. They may be used on their own, or in combination with other characters (see compositeV).
An orthography that uses vowel signs is different from one that uses simple diacritics or letters for vowels, in that the vowel signs are generally attached to an orthographic syllable, rather than just applied to the letter of the immediately preceding consonant. In other words, pre-base vowel sign components are rendered before a whole consonant cluster if that cluster is rendered as a conjunct (see prebase for an example).
Five vowel signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
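One way to check which vowel signs are spacing marks is to look at their Unicode General Category, where `Mc` means a spacing mark and `Mn` a non-spacing mark. The following is a minimal sketch using Python's standard `unicodedata` module. It assumes the dependent vowel signs occupy U+11435 to U+11441 in the Newa block (a range that also contains the vocalic signs); the helper name is hypothetical.

```python
import unicodedata

# Dependent vowel signs in the Newa block (assumed range U+11435..U+11441).
VOWEL_SIGNS = [chr(cp) for cp in range(0x11435, 0x11442)]

def spacing_vowel_signs():
    """Return the names of vowel signs whose General Category is Mc."""
    return [unicodedata.name(ch) for ch in VOWEL_SIGNS
            if unicodedata.category(ch) == "Mc"]

for name in spacing_vowel_signs():
    print(name)
```

The result depends on the Unicode data shipped with the Python version in use, so it is worth running this against the interpreter that will process the text.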
๐๐ธ ๐๐ธ
The shape of U+11438 NEWA VOWEL SIGN U and (to a lesser extent) U+11439 NEWA VOWEL SIGN UU varies according to the consonant used. For example, compare ๐๐ธ and ๐๐ธ. For more, see u_shape.
๐ ๐๐พ
Another noteworthy aspect of shaping is that certain vowel signs replace the headstroke of the consonant they follow. For example, compare ๐ and ๐๐พ. For more, see headstroke_assimilation.
Two basic vowel sounds found in Kathmandu Newar are represented using 𑐫𑑂 (usually a consonant letter).
ษː is written 𑐫𑑂.
๐ฆ๐ซ๐
๐ช๐ซ๐โ๐๐ต
æː is written 𑐵𑐫𑑂.
๐๐ฅ๐ต๐ซ๐
๐จ๐ต๐ซ๐โ๐ฎ๐๐
Note that both are long vowels, and that when used word-medially it is necessary to use U+200C ZERO WIDTH NON-JOINER after the virama, so that it remains visible.
A number of words use this as a standalone vowel, when word-medial or word-final. For example:
๐ก๐พ๐ซ๐
๐๐๐ฐ๐๐ซ๐
The multipart vowels in Newa are described in nasalisation, just below.
It is common to see Newar vowels described in a chart which shows long and nasalised forms.
Vowel length is indicated by using a dedicated character in the case of U+11437 NEWA VOWEL SIGN II and U+11439 NEWA VOWEL SIGN UU, but otherwise by adding U+11445 NEWA SIGN VISARGA.
Nasalisation is indicated using U+11443 NEWA SIGN CANDRABINDU for a short vowel, and U+11444 NEWA SIGN ANUSVARA for a long vowel.
Long, nasalised ĩː and ũː vowels use the short form of the vowel sign. [m,5-6]
The following matrix shows these various forms for the vowel signs. The same rules apply to the standalone vowel letters.
Short | Long | Short nasal | Long nasal | |
---|---|---|---|---|
i | ๐ถ | ๐ท | ๐ถ๐ | ๐ถ๐ |
u | ๐ธ | ๐น | ๐ธ๐ | ๐ธ๐ |
e | ๐พ | ๐พ๐ | ๐พ๐ | ๐พ๐ |
o | ๐ | ๐๐ | ๐๐ | ๐๐ |
æ | ๐ต | ๐ต๐ | ๐ต๐ | ๐ต๐ |
a | inherent | ๐ | ๐ | ๐ |
ษi | - | ๐ฟ | - | ๐ฟ๐ |
ษu | - | ๐ | - | ๐๐ |
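The rules behind the matrix can be sketched as a small function that returns the combining sequence to append to a consonant. This is only an illustration of the rules stated above, restricted to the monophthong rows. The keys (`"i"`, `"aa"`, `"inherent"` and so on) and the function name are hypothetical, with `"aa"` standing for U+11435 NEWA VOWEL SIGN AA.

```python
# Short vowel signs, keyed by illustrative labels (not an official naming).
SHORT_SIGN = {
    "i": "\U00011436", "u": "\U00011438", "e": "\U0001143E",
    "o": "\U00011440", "aa": "\U00011435", "inherent": "",
}
LONG_SIGN = {"i": "\U00011437", "u": "\U00011439"}  # dedicated long signs
VISARGA = "\U00011445"      # lengthens the other vowels
CANDRABINDU = "\U00011443"  # nasalises a short vowel
ANUSVARA = "\U00011444"     # nasalises a long vowel

def vowel_marks(vowel, long=False, nasal=False):
    """Return the marks to append to a consonant for the given vowel."""
    short = SHORT_SIGN[vowel]
    if nasal:
        # Long nasal forms attach the anusvara to the short sign.
        return short + (ANUSVARA if long else CANDRABINDU)
    if long:
        # i and u have their own long signs; the rest add the visarga.
        return LONG_SIGN.get(vowel, short + VISARGA)
    return short

ka = "\U0001140E"  # NEWA LETTER KA
print([f"U+{ord(c):04X}" for c in ka + vowel_marks("i", long=True, nasal=True)])
```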
𑐎𑐶 ki U+1140E NEWA LETTER KA + U+11436 NEWA VOWEL SIGN I
The short i sound is written using U+11436 NEWA VOWEL SIGN I, which appears to the left of the base consonant letter or cluster.
This combining mark is always typed and stored after the base consonant. The font places the glyph before the base consonant.
When an orthographic syllable begins with a consonant cluster that is rendered as a conjunct, the vowel sign is rendered before the start of the orthographic syllable. For example, here are 3 consonant clusters, each followed by i when spoken, where the vowel sign appears to the left of the whole cluster: ๐๐๐๐ถ ๐ณ๐๐๐ถ ๐ง๐๐ฌ๐ถ jkhi sti bri
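The difference between stored order and displayed order can be checked by listing the code points of a string. The sketch below uses Python's standard `unicodedata` module; the `describe` helper is hypothetical, and the sti sequence is an assumed spelling (SA, virama, TA, vowel sign I) based on the description above.

```python
import unicodedata

def describe(text):
    """List each character of text, in stored order, as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

ki = "\U0001140E\U00011436"                       # KA, then VOWEL SIGN I
sti = "\U00011433\U00011442\U0001141F\U00011436"  # SA, VIRAMA, TA, then VOWEL SIGN I

for line in describe(ki) + describe(sti):
    print(line)
```

In both strings the vowel sign is the last code point in memory, even though a font draws it first.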
𑐐𑑀 ɡo U+11410 NEWA LETTER GA + U+11440 NEWA VOWEL SIGN O
Another idiosyncrasy of Newa orthography is that 5 vowel signs change shape when attached to the base consonants that don't have a headstroke. Four of those vowel signs are so-called 'wavy-headed', and when combined with the 7 headless consonants they are rendered as circumgraphs. (Pandey, p. 6)
The following table shows the various forms, combined with both ๐ (has headstroke) and ๐ (headless). The last 4 vowel signs combined with the headless GA produce the circumgraphs.
With headstroke | Without headstroke | |
---|---|---|
U+11435 | ๐๐ต | ๐๐ต |
U+1143E | ๐๐พ | ๐๐พ |
U+11440 | ๐๐ | ๐๐ |
U+1143F | ๐๐ฟ | ๐๐ฟ |
U+11441 | ๐๐ | ๐๐ |
No special encoding is needed to create these circumgraph forms. The shape change should be effected automatically by the font. Also, and usefully, unlike some other Indic scripts, it is not possible to incorrectly compose these circumgraph forms by combining other Newa characters, since the shapes don't exist in the character set.
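The shaping rules described in this section can be modelled as a small lookup, which may be useful when writing font tests. This is a sketch of the rules as stated in these notes and not of any font's actual implementation. The function name and the returned labels are hypothetical, and the caller has to say whether the base consonant has a headstroke.

```python
WAVY_SIGNS = {0x1143E, 0x11440, 0x1143F, 0x11441}  # vowel signs E, O, AI, AU
SHAPE_CHANGING_SIGNS = WAVY_SIGNS | {0x11435}      # plus vowel sign AA

def expected_shaping(sign, base_has_headstroke):
    """Describe the glyph behaviour a font is expected to produce."""
    cp = ord(sign)
    if base_has_headstroke:
        # Wavy-headed signs replace the flat headstroke of the consonant.
        return "replaces headstroke" if cp in WAVY_SIGNS else "default"
    if cp in WAVY_SIGNS:
        return "circumgraph"
    if cp in SHAPE_CHANGING_SIGNS:
        return "alternate shape"
    return "default"

print(expected_shaping("\U00011440", base_has_headstroke=False))
```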
Newa represents standalone vowels using a set of independent vowel letters. The set includes a character to represent initial ษ, the inherent vowel sound. There are separate letters for short and long versions of i and u, but diacritics are used to lengthen the remaining letters (see nasalisation).
๐๐ฅ๐ต
๐๐ฎ๐๐
๐ณ๐ฎ๐ต๐
As mentioned earlier, a number of words also use 𑐫𑑂 as a standalone vowel, when word-medial or word-final. For example:
๐ฉ๐พ๐ซ๐
๐ฅ๐ธ๐ฎ๐ถ๐๐๐ฐ๐๐ซ๐
In Sanskrit texts, elision of an initial a due to sandhi is indicated using U+11447 NEWA SIGN AVAGRAHA.
Newa uses U+11442 NEWA SIGN VIRAMA (the Newa equivalent of the Sanskrit virama) to indicate that there is no inherent vowel after a consonant. For example, compare the following.
๐ซ๐พ๐๐
๐จ๐ต๐ฌ๐
All syllable codas are written with a following virama.
Other consonant clusters also involve typing and storing this character after the consonant(s) with no following vowel, but if the cluster forms a conjunct then the virama is not rendered visibly (see clusters).
๐๐ณ๐๐๐ถ
(The virama is also frequently found as part of the vowel 𑐫𑑂, where it has a different function.)
๐๐๐ซ๐
This section maps Newar vowel sounds to common graphemes in the Newa orthography, where vs indicates a vowel sign, and s a standalone vowel.
Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.
๐ถ
๐๐ถ๐ณ๐ถ
๐
๐๐๐ธ๐
๐ถ๐
๐๐ถ๐๐ข๐ธ๐
๐๐
๐๐ธ๐๐๐
๐ท
๐ฎ๐๐ต๐ฃ๐ท
๐
๐๐ฎ๐๐
๐ถ๐
๐ฃ๐๐ถ๐
๐๐
๐๐
๐ธ
๐๐ธ๐ซ๐ธ
๐
๐๐ณ๐ต๐๐ซ๐
๐ธ๐
๐ง๐ธ๐๐๐ธ๐
๐๐
๐น
๐๐ฉ๐น
๐
๐ธ๐
๐๐ธ๐
๐๐
๐๐๐
๐พ
๐๐ธ๐ซ๐พ
๐
๐๐ฎ๐ต
๐พ๐
๐๐
๐พ๐
๐๐พ๐
๐๐
๐พ๐
๐๐พ๐๐๐น
๐๐
๐
๐จ๐น๐๐๐ฎ
๐
๐๐ด๐ต๐ซ๐
๐๐
๐๐
๐๐
๐๐
๐๐
๐๐
๐ when used with the inherent vowel.
๐๐๐
๐๐
๐ when used with the inherent vowel.
๐๐๐๐๐ฐ๐ต
๐๐
๐ when used with the inherent vowel.
๐๐ฎ๐๐
๐๐
๐๐๐๐
๐ซ๐
๐๐ซ๐โ๐ฉ๐ถ
๐๐ซ๐
๐๐ซ๐โ๐ฎ๐ต๐
๐๐ซ๐
๐๐๐ซ๐
๐ต๐ซ๐
๐๐ฅ๐ต๐ซ๐
๐๐ซ๐
๐ต๐๐ซ๐
๐๐ณ๐ต๐๐ซ๐
๐ต
๐๐ต๐ซ
๐
๐๐๐
๐ต๐
๐ฎ๐ธ๐ณ๐ต๐
๐๐
๐ต๐
๐ฉ๐ต๐
๐๐
๐ต๐
๐ ๐ต๐
๐๐
๐ฟ
๐ฉ๐๐ฟ๐๐๐ซ
๐
๐ฟ๐
๐
๐จ๐
๐
๐๐
๐บ
๐ณ๐๐ณ๐๐๐บ๐
๐
๐ป
๐
๐ผ
๐
๐ฝ
๐
Newa has a set of vocalic letters and vowel signs, but they are used for other languages, such as Sanskrit, and not for Newar.
The following table summarises the main consonant to character assignments.
Plosives | |
---|---|
Affricates | |
Fricatives | |
Nasals | |
Other |
For additional details see vowel_mappings.
This section lists letters representing sounds of the Kathmandu dialect of the Newar language (shown in the table just above). See the next section for letters used in other dialects, or other languages (such as Sanskrit).
Whereas the table just above takes you from sounds to letters, the following simply lists the basic consonant letters (however, since the orthography is highly phonetic there is little difference in ordering).
The following letters are used in other dialects, or other languages (such as Sanskrit).
A feature of Newar is the number of consonants, besides the plosives, that are pronounced with accompanying breathiness. The following list shows these sounds and the way they are written.
Unicode provides single characters for most of these.
Observation: Sources indicate that wʰ and jʰ are also part of the Newar phonetic repertoire, and are represented by these conjunct forms, but Unicode doesn't provide precomposed characters for them. They therefore have to be composed as consonant clusters.
Observation: One source stated that when these sounds are used for transcriptions of Sanskrit, they should all be written as consonant clusters, rather than using the precomposed characters.
Quite a lot of Newa consonants participate in context-sensitive shaping. See headstrokes, headstroke_assimilation and bha_ha.
Newa has U+11446 NEWA SIGN NUKTA, which can be used to represent foreign sounds, but it doesn't appear to be used for Newar currently.
Clusters of consonant letters at the beginning of an orthographic syllable occur in Newa, and they are handled as described in the section clusters.
Special behaviours include handling of RA at the beginning of an orthographic syllable (see preceding_ra).
Word-final consonant sounds with no following consonant are represented by ordinary consonant characters, followed by a visible U+11442 NEWA SIGN VIRAMA character.
The commonly found combination 𑐫𑑂 represents a vowel sound when it occurs at the end of a syllable.
๐ฆ๐ซ๐
๐๐๐ฐ๐๐ซ๐
๐๐ฅ๐ต๐ซ๐
Syllable-final consonants that are not word-final normally form conjuncts. See clusters.
Observation: Pandey says that U+11445 NEWA SIGN VISARGA can represent syllable-final aspiration, but it's not clear whether that occurs in Newar as well as in Sanskrit.
The absence of a vowel sound between two or more consonants is visually indicated in one of the following ways.
See a table of 2-consonant clusters.
The table allows you to test results for various fonts.
In Unicode, the conjunct formation is achieved by adding U+11442 NEWA SIGN VIRAMA between the consonants. The font hides the virama glyph automatically when a conjunct is formed.
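As a minimal sketch of that encoding, the following hypothetical helper joins consonant letters with the virama. Whether the result is displayed as a fused conjunct still depends on the font.

```python
VIRAMA = "\U00011442"  # NEWA SIGN VIRAMA

def encode_cluster(*consonants):
    """Join consonant letters with a virama between each pair."""
    return VIRAMA.join(consonants)

SA, TA = "\U00011433", "\U0001141F"
NA, DA, RA = "\U00011423", "\U00011421", "\U0001142C"

st = encode_cluster(SA, TA)        # two-consonant cluster
ndr = encode_cluster(NA, DA, RA)   # three-consonant cluster
print([f"U+{ord(c):04X}" for c in ndr])
```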
See also finals.
Conjuncts are normally formed by fusing glyphs for the component characters, so that they fit within the normal character height. One or both of the original letters may be unrecognisable, but generally the parts, though simplified, are recognisable.
It is most common for glyphs to merge vertically, although there are also many that merge diagonally. A few merge horizontally. See a list of combinations.
For a detailed analysis of conjunct composition see Pandey, pages 7–10.
A trailing RA has a fairly regular appearance as a subjoined glyph below the preceding consonant, though on the left side.
However, as in many other Indic scripts, 𑐬 at the beginning of a cluster is represented idiosyncratically, and appears as a small, superscript glyph over the top right of the following syllable.
In some circumstances a cluster doesn't give rise to a conjunct. In that case, the virama is displayed below the initial consonant. fig_conjunct_virama shows an example spotted in a newspaper.
If the font automatically substitutes a conjunct but you don't want it to, you can use U+200C ZERO WIDTH NON-JOINER immediately after the virama to prevent the fusion of the characters. (If there is no consonant following, as in the case at the end of the line, this formatting character isn't needed.)
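The two encodings can be compared side by side. The helper below is a hypothetical sketch: with `fuse=True` it leaves conjunct formation to the font, and with `fuse=False` it inserts the non-joiner after the virama so that the virama should stay visible.

```python
VIRAMA = "\U00011442"  # NEWA SIGN VIRAMA
ZWNJ = "\u200C"        # ZERO WIDTH NON-JOINER

def encode_pair(first, second, fuse=True):
    """Encode a two-consonant cluster, optionally blocking the conjunct."""
    joiner = VIRAMA if fuse else VIRAMA + ZWNJ
    return first + joiner + second

SA, TA = "\U00011433", "\U0001141F"
fused = encode_pair(SA, TA)                 # font may draw a conjunct
unfused = encode_pair(SA, TA, fuse=False)   # virama should remain visible
print(len(fused), len(unfused))             # 3 4
```

Text with and without the non-joiner will not compare as equal, so searching or sorting code may need to strip U+200C first.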
Newa has a few clusters involving 3 consonants. fig_conjunct_ndr gives an example.
The following is a list of the more common triple conjuncts, according to a Noto Fonts issue on GitHub. [g1203]
Observation: The list just above raises 2 questions: (a) why sequences such as nh don't use the precomposed code point, (b) which of these are used for Newar, as opposed to Sanskrit or another language?
Gemination and consonant lengthening are handled using the normal approach to consonant clusters (see clusters).
This section maps Newar consonant sounds to common graphemes in the Newa orthography.
Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.
๐ฅ
๐ฅ๐ฎ๐พ๐ณ๐๐ฐ๐ต๐
๐ฆ
๐ฆ๐ซ๐
๐ง
๐ง๐ฉ๐น
๐จ
๐จ๐ต๐ฌ๐
๐
๐๐ต๐ฎ๐๐ฎ๐ต
๐
๐ ๐ต๐ซ๐
๐
๐๐ต๐๐
๐
๐๐ถ๐๐ธ
๐๐๐ฒ
๐๐๐ฒ๐ถ๐๐ถ๐
๐ก
๐ก๐ฃ๐ต๐ณ๐ธ
๐ข
๐ข๐ฌ๐๐ฉ
๐
๐๐ท๐ณ๐๐ฐ๐ต๐
๐
๐๐ต๐ณ๐ธ
๐
๐
๐
๐๐๐ฐ๐ต๐๐ธ
๐
๐
๐๐ฎ
๐
๐๐ต๐๐ต
๐
๐๐ณ๐ต
๐
๐๐ ๐๐ต
๐๐๐
๐ณ
๐ณ๐ฃ๐๐๐๐ฌ๐ต๐ณ๐ถ
๐ฑ Infrequent. Generally used for loan words.
๐ฑ๐ฃ๐ถ๐ง๐ต๐
๐ฒ Infrequent. Generally used for loan words.
๐ด
๐ด๐ฎ๐ถ๐ฉ
๐ฉ
๐ฉ๐๐ต
๐ช
๐ช๐๐ ๐ณ
๐ฃ
๐ฃ๐๐ฌ
๐ค
๐ค๐พ๐ฅ๐ธ
๐
๐๐ฌ๐๐ ๐ฅ๐น๐ฌ๐๐
๐
๐ฅ๐๐๐๐ต๐ง๐ท
๐
๐
๐
๐
๐ฐ
๐ฐ๐ต๐๐๐ธ
๐ด๐๐ฐ
๐ด๐๐ฐ๐๐๐ซ๐ต
๐ฌ
๐๐ฌ๐ฐ๐ต๐ฌ
๐ญ
๐ฎ
๐ฎ๐ต๐๐ต๐
๐ฏ
๐ฏ๐ต๐
๐ซ
๐ซ๐ต๐ณ๐ธ
๐ด๐๐ซ
๐ด๐๐ซ๐ต๐๐๐ธ
Om. The symbol for the word Om is produced using U+11449 NEWA OM.
Visually, several of the standalone vowels and some vowel signs look as if they could be composed of smaller parts. This section gives guidance on which approach is best.
Newa is relatively resistant to incorrect coding techniques, but it is possible that someone may occasionally try to use 2 characters rather than the single character which is canonical. Doing so produces text that will not match correctly encoded text for search, spell-checking, and so on, and so should be avoided. The list below shows some examples.
Use | Do not use |
---|---|
๐ | 11400 11435 |
๐ | 11404 11440 |
11440 | 1143E 11435 |
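A simple check for the discouraged sequences listed in the table could look like the sketch below. It only reports where such sequences occur and does not attempt to replace them; the function name is hypothetical and the list covers just the three examples given here.

```python
# Two-character spellings that the table above says to avoid.
DISCOURAGED = (
    "\U00011400\U00011435",  # NEWA LETTER A + NEWA VOWEL SIGN AA
    "\U00011404\U00011440",  # NEWA LETTER U + NEWA VOWEL SIGN O
    "\U0001143E\U00011435",  # NEWA VOWEL SIGN E + NEWA VOWEL SIGN AA
)

def find_discouraged(text):
    """Return (index, code points) for each discouraged sequence found."""
    hits = []
    for seq in DISCOURAGED:
        label = " ".join(f"U+{ord(ch):04X}" for ch in seq)
        start = text.find(seq)
        while start != -1:
            hits.append((start, label))
            start = text.find(seq, start + 1)
    return sorted(hits)

sample = "\U0001140E\U0001143E\U00011435"  # KA + E + AA, instead of KA + O
print(find_discouraged(sample))
```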
The following code points in the Unicode block need further investigation. Their usage and/or their relevance to writing modern Newar is not clear from the research done so far.
U+11446 NEWA SIGN NUKTA Combined with a letter to represent sounds not native to the script, such as in loan words.
๐ Used to elide an initial A in Sanskrit as a result of sandhi. (Pandey, p. 11)
๐ Represents nasalisation in some manuscripts. In other sources, a form of punctuation. (Pandey, p. 11)
๐ Indicates end of a text block larger than a sentence. (Pandey, p. 11)
๐ Used for marking breaks and filling gaps in a line at a margin. (Pandey, p. 11)
๐ Marks abbreviations. (Pandey, p. 11)
๐ Represents the Sanskrit invocation सिद्धिरस्तु siddhirastu 'may there be success'. It is written at the beginning of a text, often in the combination ๐๐. It corresponds to the sign ঀ [U+0980 BENGALI ANJI] in related scripts such as Bengali. (Pandey, p. 11)
๐ Used for filling gaps in a line and as a mark for end of text. (Pandey, p. 11)
๐
๐
U+11460 NEWA SIGN JIHVAMULIYA
U+11461 NEWA SIGN UPADHMANIYA
U+1145E NEWA SANDHI MARK
For other glyphs found in Newa manuscripts, see Pandey, p. 11.
Newa has a set of native digits.
Pandey describes variant shapes for 3, 4, and 5, which are to be managed by the font. (Pandey, p. 10)
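Converting between ASCII and Newa digits is a matter of offsetting code points. The sketch below assumes the Newa digits zero to nine occupy U+11450 to U+11459 and handles non-negative integers only. The function names are hypothetical. Python's built-in `int()` accepts any Unicode decimal digits, so parsing needs no extra table.

```python
NEWA_ZERO = 0x11450  # NEWA DIGIT ZERO (assumed start of the digit run)

def to_newa_digits(number):
    """Render a non-negative integer using Newa digits."""
    return "".join(chr(NEWA_ZERO + int(d)) for d in str(number))

def from_newa_digits(text):
    """Parse a string of Newa digits into an integer."""
    return int(text)

year = to_newa_digits(2024)
print([f"U+{ord(c):04X}" for c in year])
print(from_newa_digits(year))  # 2024
```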
Newa text runs left to right in horizontal lines.
Show default bidi_class properties for characters in the Newar orthography described here.
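The default bidi classes can also be inspected with Python's standard `unicodedata` module. This is a small sketch with a hypothetical helper name; letters are expected to report `L` (left-to-right) and non-spacing marks `NSM`.

```python
import unicodedata

def bidi_classes(text):
    """Return (code point, Bidi_Class) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]

sample = "\U0001140E\U00011442"  # NEWA LETTER KA + NEWA SIGN VIRAMA
for code_point, bidi_class in bidi_classes(sample):
    print(code_point, bidi_class)
```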
You can experiment with examples using the Newa character app.
Headstrokes & headlines. Pandey writes: "The headstrokes of Newar letters do not connect to preceding or following letters. Connection of headstrokes of characters that form a syllable may occur, such as in the combination of a consonant letter and a dependent vowel sign. The majority of Newar manuscripts attest this behavior. However, there is no particular rule that describes the joining properties of headstrokes. Variations in the writing of headstrokes are to be attributed to scribal preferences. In modern digitized typefaces the headstrokes of glyphs connect, but this feature may be an influence of modern Devanagari typography." (Pandey, p. 13)
The following 7 consonant letters have no headstroke. This leads to some special shaping for 5 vowel signs, including 4 that are changed into circumgraphs. See circumgraphs for details.
Another idiosyncrasy of Newa is that consonant letters with headstrokes have that headstroke replaced by a wavy line by 4 of the same vowel signs. See headstroke_vowels.
A rather unusual feature of Newa orthography is that vowel signs with a wavy horizontal line replace the flat headstroke of the base consonant.
This includes vowels written with the following vowel signs: U+1143E NEWA VOWEL SIGN E, U+11440 NEWA VOWEL SIGN O, U+1143F NEWA VOWEL SIGN AI, and U+11441 NEWA VOWEL SIGN AU. (Pandey, p. 6)
The sound u is produced by the vowel sign U+11438 NEWA VOWEL SIGN U, but that sign can have a different shape when attached to different consonant letters. The vowel sign used to represent the long uː sound also has contextual variations, though not as many as the short vowel. All of these orthographic variants are produced automatically by the font; there is no need to use different characters.
The short sound is rendered as a curved shape with the following 4 consonant letters: (Pandey, p. 7)
The alternative shape is shown in fig_u_shape.
Both short and long sounds are also written as ligatures with the consonant letters ๐ and ๐ฌ, as shown in fig_u_ligatures.
The consonants 𑐨 and 𑐴 also take on special shapes when followed by a u-vowel (see bha_ha).
𑐨 and 𑐴 have special shapes when combined with U+11438 NEWA VOWEL SIGN U or U+11439 NEWA VOWEL SIGN UU, or any of the vocalic vowel signs. (Pandey, p. 7)
Additional contextual shaping for consonants carrying a u-related vowel sign can be seen in u_shape.
U+200C ZERO WIDTH NON-JOINER (ZWNJ) can be used to force the production of a visible virama, rather than a conjunct form.
Word units are separated by spaces.
Usually a typographic character unit correlates with the Unicode concept of grapheme clusters, but not in the case of conjuncts (in common with several other Indic scripts).
Conjuncts and any dependent combining characters should never be split.
This creates a problem when dealing with Unicode grapheme clusters, because they stop after reaching a virama. So conjuncts usually contain multiple grapheme clusters. This produces incorrect segmentation as seen on the right in fig_grapheme_conjunct. Applications need to tailor the grapheme cluster rules to avoid splitting conjuncts.
Unfortunately, this is harder than it seems, because whether a conjunct is formed or not usually depends on the capabilities of the font; it cannot be determined solely by looking at the code points in memory. If a font doesn't contain the glyphs to create a conjunct it will render the consonant cluster with a visible virama. In that case, the grapheme cluster approach is appropriate.
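The segmentation problem and one possible tailoring can be sketched as follows. The first function is a deliberately simplified stand-in for default grapheme clusters (a base character plus following marks and joiner controls) and is not a full implementation of the Unicode segmentation rules. The second merges a cluster that ends in a virama with a following consonant. The function names are hypothetical, and the consonant range U+1140E to U+11434 is an assumption. As noted above, code like this cannot know whether the font actually draws a conjunct.

```python
import unicodedata

VIRAMA = "\U00011442"           # NEWA SIGN VIRAMA
JOINERS = ("\u200C", "\u200D")  # ZWNJ, ZWJ

def is_newa_consonant(ch):
    # Consonant letters are assumed to run from KA (U+1140E) to HA (U+11434).
    return 0x1140E <= ord(ch) <= 0x11434

def simple_clusters(text):
    """Approximate default clusters: base + combining marks + joiners."""
    clusters = []
    for ch in text:
        extends = unicodedata.category(ch) in ("Mn", "Mc", "Me") or ch in JOINERS
        if clusters and extends:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

def conjunct_clusters(text):
    """Tailored clusters: do not break between a virama and a consonant."""
    merged = []
    for cluster in simple_clusters(text):
        if merged and merged[-1].endswith(VIRAMA) and is_newa_consonant(cluster[0]):
            merged[-1] += cluster
        else:
            merged.append(cluster)
    return merged

kssi = "\U0001140E\U00011442\U00011432\U00011436"  # KA, VIRAMA, SSA, VOWEL SIGN I
print(len(simple_clusters(kssi)))    # 2
print(len(conjunct_clusters(kssi)))  # 1
```

Because a non-joiner placed after the virama becomes the last character of its cluster, sequences that deliberately keep the virama visible are left unmerged by this sketch.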
Newa uses a mixture of ASCII and native punctuation marks.
phrase | , ๐ ; : |
---|---|
sentence | ๐ ? ! ๐ |
section | ๐ |
Observation: The Lipi Pau newspaper in 2009 used spaces before and after the Newa danda.
Newar commonly uses ASCII parentheses to insert parenthetical information into text.
 | start | end |
---|---|---|
standard | ( | ) |
Newar texts use quotation marks around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.
 | start | end |
---|---|---|
initial | “ | ” |
nested | ‘ | ’ |
Single quotation marks are used for quotations within quotations.
Lines are mostly broken at inter-word spaces.
As in most writing systems, certain characters are expected not to start or end a line. For example, periods and commas shouldn't start a line, and opening parentheses shouldn't end a line.
Show (default) line-breaking properties for characters in the Newar language.
tbd
Newar uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.