Updated 30 December, 2024
This page brings together basic information about the Odia (Oriya) script and its use for the Odia language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Odia using Unicode.
Richard Ishida, Odia (Oriya) Orthography Notes, 30-Dec-2024, https://r12a.github.io/scripts/orya/or
Phonological transcriptions should be treated as a guide, only. They are taken from the sources consulted, and may be narrow or broad, phonemic or phonetic, depending on what is available. They mostly represent pronunciation of words in isolation. For more detailed information about allophones, alternations, sandhi, dialectal differences, and so on, follow the links to cited references.
ଭାରତୀୟ ମହାକାଶ ଗବେଷଣା ସଂସ୍ଥା ବା ଇସ୍ରୋ ହେଉଛି ଭାରତ ସରକାରଙ୍କ ପ୍ରମୁଖ ମହାକାଶ ପ୍ରାଧିକରଣ । ଏହା ପୃଥିବୀର ଛଅଟି ବଡ ସରକାରୀ ମହାକାଶ ପ୍ରାଧିକରଣ ମଧ୍ୟରୁ ଅନ୍ୟତମ ଯଥା ।
Source: Unicode UDHR, article 1
Origins of the Oriya script, 1051 – today.
Phoenician
└ Aramaic
└ Brahmi
└ Gupta
└ Siddham
└ Gaudi
└ Oriya
+ Bengali
+ Tirhuta
+ Nagari
+ Nepalese
The Oriya script is the official orthography used to write the Odia language of the Odisha (Orissa) state in India, as well as minority languages such as Khondi and Santali, and a number of Dravidian and Munda minority languages spoken in that region [10, p. 487]. It is also used in Orissa for transcribing Sanskrit texts.
ଓଡ଼ିଆ ଅକ୍ଷର oɽia ɔkʰjɔrɔ Odia script
The Oriya script is a descendant of the Brahmi script, via Siddham. The earliest recorded instances of the script go back to the 11th century. The language was initially written in the Kalinga script, from which the Oriya script developed.
The rounded shapes of the letters, especially the top bar, are ascribed to the practice of writing on palm leaves, where rounded lines are less likely to split the leaf than straight ones.
A cursive version of the script, called Karani (କରଣୀ ଅକ୍ଷର), was used by scribes in the royal courts.
The language and script were previously referred to in English as Oriya, but in 2011 India changed the spelling to Odia in the constitution [9].
Sources Scriptsource and Wikipedia.
Script code | orya |
---|---|
Language code | ory |
Script type | abugida |
Origin | sasia |
Native speakers | 33,000,000 |
Total characters | 93 |
Letters | 53 |
Combining marks | 20 |
Symbols | 1 |
Punctuation | 7 |
Numbers | 10 |
Other | 2 |
Possible other | 9 |
Unicode blocks | 1 |
Character counts above are for this orthography but exclude ASCII. | |
Text direction | ltr |
Post-consonant vowels | 1 inherent vowel, marks, vocalics, pre-base marks, circumgraphs |
Standalone vowels | letters |
Case distinction | no |
Cursive script | no |
Combining marks | >1 per base |
Clusters marked | yes |
Dedicated finals | marks |
Consonant Clusters | ligated glyphs, stacks, conjoined glyphs, visual killer (killer type: v) |
Other ligatures | yes |
Word separator | space |
Wraps at | word |
Hyphenation | ? |
G Clusters OK? | no |
Justification | spaces |
Baseline | romn |
The Odia script is an abugida. Consonants carry an inherent vowel which can be modified by appending vowel signs to the consonant. See the table above for a brief overview of features for the modern Odia orthography.
Odia runs left to right in horizontal lines. Words are separated by spaces.
The 36 consonant letters used for Odia include repertoire extensions for 2 sounds by applying the nukta diacritic to characters. There are 2 additional, newer characters used for w and v.
Consonant clusters are most commonly rendered using subjoined forms, usually for the second character, but sometimes for the initial. Certain clusters use fused forms, and a couple are conjoined. A visible virama is used for borrowed words. Initial RA is rendered as a reph over the top right of the following consonant.
Syllable-final consonant sounds may be represented by 2 dedicated combining marks (anusvara & visarga). Velar consonant cluster initials may be written either using a regular character or using anusvara.
This orthography is an abugida with one inherent vowel, pronounced ɔ. Other post-consonant vowels are written using 9 combining marks (vowel signs). There is 1 pre-base form, and 3 circumgraphs.
In principle, there are no multipart vowels; however, the 3 circumgraphs can each be decomposed into 2 parts.
Vowels have short lengths only, although there are vestigial orthographic letters for long sounds that now represent alternatives for the short sounds.
Vowels may be nasalised, using the candrabindu diacritic.
Standalone vowels are written using 10 independent vowel letters. Additional symbols are used to express length and nasalisation.
There is a set of 4 vocalics, each with vowel sign and independent forms, but only one vocalic is used in modern Odia.
Odia has native digit shapes, but may also use ASCII digits.
Danda (from the Devanagari block) is used at the end of a sentence, and usually preceded by a space. Otherwise, most of the punctuation is ASCII.
The following represents the repertoire of the Odia language.
Phones in a lighter colour are non-native or allophones. Source Wikipedia.
Descriptions of the Oriya alphabet vary. CLDR lists the following 'index' characters. Note the multipart letter at the end of the list.
The following table summarises the main vowel to character assignments.
ⓘ represents the inherent vowel. Diacritics are added to the vowels to indicate nasalisation (not shown here).
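Sound | Vowel sign | Standalone letter |
---|---|---|
i | ି (i, U+0B3F), ୀ (ī, U+0B40) | ଇ (i, U+0B07), ଈ (ī, U+0B08) |
u | ୁ (u, U+0B41), ୂ (ū, U+0B42) | ଉ (u, U+0B09), ଊ (ū, U+0B0A) |
e | େ (e, U+0B47) | ଏ (e, U+0B0F) |
o | ୋ (o, U+0B4B) | ଓ (o, U+0B13) |
ɔ | ⓘ | ଅ (a, U+0B05) |
a | ା (ā, U+0B3E) | ଆ (ā, U+0B06) |
ɔi (diphthong) | ୈ (ai, U+0B48) | ଐ (ai, U+0B10) |
ɔu (diphthong) | ୌ (au, U+0B4C) | ଔ (au, U+0B14) |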
For additional details see Vowel sounds to characters.
କ kɔ U+0B15 ORIYA LETTER KA
ɔ following a consonant is not written, but is seen as an inherent part of the consonant letter, so kɔ is written by simply using the consonant letter. This vowel sound is transcribed as a [2].
Odia uses ୍ [U+0B4D SIGN VIRAMA], called halant (the Odia equivalent of the Sanskrit virama), to indicate that the inherent vowel is not pronounced after a consonant, eg. the following explicitly represents just the sound k: କ୍ k͓
Word-final consonants without a following inherent vowel use the halant. If there is no halant, the vowel is pronounced, eg. compare ଫୁଲ phulô flower and ଇ-ମେଲ୍ i-mel e-mail.
Consonant clusters also use this character, but if the cluster forms a conjunct then the virama is not rendered visibly (see Consonant clusters).
Post-consonant vowels are written using 9 combining marks (vowel signs). There is 1 pre-base form, and 3 circumgraphs.
In principle, there are no multipart vowels; however, the 3 circumgraphs can each be decomposed into 2 parts.
Vowels have short lengths only, although there are vestigial orthographic letters for long sounds that now represent alternatives for the short sounds.
Vowels may be nasalised, using the candrabindu diacritic.
କି ki U+0B15 ORIYA LETTER KA + U+0B3F ORIYA VOWEL SIGN I
Odia uses the following dedicated combining marks for vowels.
The 'primary' vowels have 'short' and 'long' written forms that hark back to the earlier Indic script origins, but modern Odia phonetics don't distinguish between long and short vowel sounds.
Six vowel signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
All vowel signs are stored after the base consonant, and the rendering process puts them in the correct place for display. This also applies for the pre-base vowel sign, and the 3 circumgraphs (where a single code point produces glyphs on more than one side of the consonant base).
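The storage order described above can be checked with a short Python sketch. It only uses the standard `unicodedata` module; the helper name `describe` is made up for illustration, and the example assumes the syllable ke (KA followed by VOWEL SIGN E) as the test string.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs in stored (logical) order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# KA followed by VOWEL SIGN E. A font draws the vowel sign to the LEFT of KA,
# but in memory the vowel sign still comes after the consonant.
ke = "\u0B15\u0B47"

for code_point, name in describe(ke):
    print(code_point, name)
```

The consonant is listed first and the vowel sign second, even though the rendered glyph order is the reverse.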
An orthography that uses vowel signs is different from one that uses simple diacritics or letters for vowels in that the vowel signs are generally attached to the orthographic syllable, rather than just applied to the letter of the immediately preceding consonant. This means that pre-base vowel signs and the left glyph of circumgraphs appear before a whole consonant cluster if the cluster is rendered as a conjunct (see the section on pre-base vowel signs).
See also the 2 lengthening marks, which may occur in decomposed text.
The following 'lengthening marks' may be used to create vowel sounds as part of a decomposed circumgraph, although the Unicode Standard recommends the use of the precomposed forms.
See Encoding vowel-signs for more details.
In Sanskrit, ଽU+0B3D SIGN AVAGRAHA can be used to show vowel elongation [13].
The only multipart vowels occur when the circumgraphs are encoded as pairs of characters (see Other symbols used for vowels and Encoding vowel-signs).
କେ ke U+0B15 ORIYA LETTER KA + U+0B47 ORIYA VOWEL SIGN E
The sound e is written using େU+0B47 VOWEL SIGN E, which appears to the left of the base consonant letter or cluster.
This is a combining mark that is always stored after the base consonant. The rendering process places the glyph before the base consonant.
When an orthographic syllable begins with a consonant cluster that is rendered as a conjunct, the vowel sign is rendered before the start of the cluster, eg. Figure 2 shows 3 sets of consonant clusters, each followed by e when spoken, but the vowel sign appears to the left of each cluster.
କୋ ko U+0B15 ORIYA LETTER KA + U+0B4B ORIYA VOWEL SIGN O
Three vowels are produced by a single combining character with visually separate parts, that appear on different sides of the consonant onset.
ସ୍ତ୍ରୈଣ stɾɔi̯ɳɔ feminine
All 3 of these circumgraphs can be written as a single character, or as two. See Encoding vowel-signs.
Oriya doesn't mark vowel length.
Vowels may be nasalised using ଁU+0B01 SIGN CANDRABINDU or ଂU+0B02 SIGN ANUSVARA.
ମୁଁହ mũhɔ mouth
Where 2 vowels appear together, the nasalisation sign is rendered above the second, eg. ଜ୍ୱାଇଁ d͡ʒwaĩ son-in-law
Odia represents standalone vowels using a set of independent vowel letters. The set includes a character to represent the inherent vowel sound, ɔ.
This section maps Odia vowel sounds to common graphemes in the Oriya orthography.
Dependent (post-consonant) and standalone vowel graphemes are labelled.
dependent ିU+0B3F VOWEL SIGN I
standalone ଇU+0B07 LETTER I
dependent ୀU+0B40 VOWEL SIGN II
standalone ଈU+0B08 LETTER II
dependent ୁU+0B41 VOWEL SIGN U
standalone ଉU+0B09 LETTER U
dependent ୂU+0B42 VOWEL SIGN UU
standalone ଊU+0B0A LETTER UU
dependent େU+0B47 VOWEL SIGN E
standalone ଏU+0B0F LETTER E
dependent ୋU+0B4B VOWEL SIGN O
standalone ଓU+0B13 LETTER O
inherent vowel eg. ଅକ୍ଷର ɔkʰjɔrɔ character
standalone ଅU+0B05 LETTER A
dependent ାU+0B3E VOWEL SIGN AA
standalone ଆU+0B06 LETTER AA
dependent ୈU+0B48 VOWEL SIGN AI
standalone ଐU+0B10 LETTER AI
dependent ୌU+0B4C VOWEL SIGN AU
standalone ଔU+0B14 LETTER AU
Only one vocalic is regularly used, in vowel sign form, in modern Odia.
କୃମି krumi worm
Other vocalics exist in the script, in independent and vowel sign forms, but are used for Sanskrit transcriptions.
The following table summarises the main consonant to character assignments.
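Position | Sound, letter (transliteration, code point) |
---|---|
Onsets | p ପ (p, U+0B2A), b ବ (b, U+0B2C), t ତ (t, U+0B24), d ଦ (d, U+0B26), ʈ ଟ (ṭ, U+0B1F), ɖ ଡ (ḍ, U+0B21), k କ (k, U+0B15), ɡ ଗ (g, U+0B17) |
 | pʰ ଫ (ph, U+0B2B), bʰ ଭ (bh, U+0B2D), tʰ ଥ (th, U+0B25), dʰ ଧ (dh, U+0B27), ʈʰ ଠ (ṭh, U+0B20), ɖʰ ଢ (ḍh, U+0B22), kʰ ଖ (kh, U+0B16), ɡʰ ଘ (gh, U+0B18) |
 | t͡ʃ ଚ (c, U+0B1A), d͡ʒ ଜ (j, U+0B1C), d͡ʒ ଯ (ẏ, U+0B2F), t͡ʃʰ ଛ (ch, U+0B1B), d͡ʒʰ ଝ (jh, U+0B1D) |
 | s ସ (s, U+0B38), s ଷ (ṣ, U+0B37), s ଶ (ś, U+0B36), ɦ ହ (h, U+0B39) |
 | m ମ (m, U+0B2E), n ନ (n, U+0B28), ɲ ଞ (ñ, U+0B1E), ɳ ଣ (ṇ, U+0B23), ŋ ଙ (ṅ, U+0B19) |
 | w ୱ (w, U+0B71), ʋ ଵ (v, U+0B35), r ର (r, U+0B30), ɽ ଡ଼ (ṛ, U+0B21 U+0B3C), ɽʰ ଢ଼ (ṛh, U+0B22 U+0B3C), l ଲ (l, U+0B32), ɭ ଳ (ḷ, U+0B33), j ୟ (y, U+0B5F) |
Finals | ̃ or ŋ ଂ (ṃ, U+0B02), h ଃ (ḥ, U+0B03) |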
For additional details see Consonant sounds to characters.
Whereas the table just above takes you from sounds to letters, the following simply lists the basic consonant letters (however, since the orthography is highly phonetic there is little difference in ordering).
The velar and palatal nasals only occur in homorganic clusters [2, p. 406].
କ୍ଷU+0B15 LETTER KA + U+0B4D SIGN VIRAMA + U+0B37 LETTER SSA is regarded as a letter of the alphabet.
The letters ୱU+0B71 LETTER WA and ଵU+0B35 LETTER VA were added to Unicode in version 4.
The subjoined forms of ୱU+0B71 LETTER WA and ବU+0B2C LETTER BA may look the same. For a discussion of the possible historical relationship between these characters see Everson/Stone [3].
Observation: The Library of Congress transcription page says that when ବ [U+0B2C ORIYA LETTER BA] occurs as the second consonant of a consonant cluster (except when geminated), it is transliterated v [5]. It appears, however, that it also keeps the b sound after the letters m and r.
ଵU+0B35 LETTER VA is described by Wiktionary as "used sporadically for the phonetic Va/Wa as an alternative for the officially recognised letter ୱ, but has not gained widespread acceptance".
The sounds ɽ and ɽʰ are written by combining ଼U+0B3C SIGN NUKTA with an existing consonant.
The nukta should always be typed and stored immediately after the consonant it modifies, and before any combining vowels or diacritics.
Unicode also has precomposed forms of these letters, but they decompose under Unicode Normalisation Form C (NFC). Therefore, the Unicode Standard recommends the use of the decomposed sequence.
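A minimal Python sketch of this normalisation behaviour follows. It assumes the precomposed code points are U+0B5C and U+0B5D (the text above does not name them), and `code_points` is a hypothetical helper used only for display.

```python
import unicodedata

# Assumed precomposed code points, paired with the decomposed sequences.
PRECOMPOSED = {
    "\u0B5C": "\u0B21\u0B3C",  # precomposed form vs. DDA + NUKTA
    "\u0B5D": "\u0B22\u0B3C",  # precomposed form vs. DDHA + NUKTA
}

def code_points(text):
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

for single, sequence in PRECOMPOSED.items():
    nfc = unicodedata.normalize("NFC", single)
    # NFC does not keep the single code point; it yields the 2-character sequence.
    print(code_points(single), "->", code_points(nfc), "| matches sequence:", nfc == sequence)
```

Because NFC output never contains the precomposed letters, text that starts out with the consonant + nukta sequence survives normalisation unchanged.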
The nukta may also be used to produce other non-native sounds. Wiktionary describes the following:
Clusters of consonant letters at the beginning of an orthographic syllable occur in Odia, and they are handled as described in the section Consonant clusters.
Special behaviours include handling of RA at the beginning of an orthographic syllable (see the description of RA in Consonant clusters).
A syllable-final nasal sound can be written using ଂU+0B02 SIGN ANUSVARA, eg. ଜଂତୁ jôṃtu animal ଜଂଗଲ jôṃgôlô forest ଏବଂ ebôṃ and
It is optional whether the nasal sound is written using anusvara or by using a conjunct. Figure 6 shows two ways of writing ଅଂକ ɔŋkɔ ink [10, p. 488].
A word-final h consonant can also be written using ଃU+0B03 SIGN VISARGA [7, p. 54]. (In the middle of a word, it creates a geminated consonant.)
Observation: According to Wikipedia, that sound is an h, but according to Nakanishi it is a glottal stop.
The absence of a vowel sound between two or more consonants is visually indicated in one of the following ways.
See also Finals and Consonant length.
See a table of 2-consonant clusters.
The table allows you to test results for various fonts.
In Unicode, conjunct formation is achieved by adding ୍U+0B4D SIGN VIRAMA between the consonants. The font hides the virama glyph automatically when a conjunct is formed.
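As an illustration, the following Python sketch builds the stored sequence for a cluster by joining consonants with the virama. The function name `cluster` is hypothetical; whether a conjunct glyph actually appears still depends on the font.

```python
VIRAMA = "\u0B4D"  # ORIYA SIGN VIRAMA

def cluster(*consonants):
    """Join consonant letters with VIRAMA, as stored for a consonant cluster."""
    return VIRAMA.join(consonants)

# SA + VIRAMA + TA: a 2-consonant cluster.
sta = cluster("\u0B38", "\u0B24")

# SA + VIRAMA + TA + VIRAMA + RA: the 3-consonant onset of the word for 'feminine'.
stra = cluster("\u0B38", "\u0B24", "\u0B30")

print([f"U+{ord(ch):04X}" for ch in sta])
print([f"U+{ord(ch):04X}" for ch in stra])
```

The virama is present in the stored text in both cases; it is the rendering, not the encoding, that hides it.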
The overwhelming majority of conjuncts in Odia are achieved by subjoining a reduced form of the non-initial consonant below the initial.
In most cases the non-initial consonant is just reduced in size, but in some cases the shape is changed, either by removing the circular top line, or in a more fundamental way.
However, when TA is the initial consonant, it is sometimes the initial that is reduced and subjoined. In other combinations, however, it retains its full form.
A trailing RA has a fairly regular appearance as a subjoined glyph below the preceding consonant, although that line may join with the preceding letter shape, and therefore cause a slight change to it.
However, like many other Indian scripts, ରU+0B30 LETTER RA at the beginning of a cluster is represented idiosyncratically, and appears as a small, superscript glyph over the top right of the following consonant.
Observation: Unlike Devanagari, it appears that the RA doesn't move over a following vowel sign, such as ା [U+0B3E ORIYA VOWEL SIGN AA].
Certain combinations of consonants form conjuncts by producing a merged glyph in which one or both of the original letters may be unrecognisable.
The following is a list of combinations that produce such an effect.
Three letters in particular tend not to stack, but sit alongside the initial consonant in the cluster.
As can be seen above, the conjoined forms for ʤ and j are identical.
The letter NYA also sits alongside the cluster initial, but the halanta may be shown below the initial letter.
Observation: Noto, Nirmala, and Kangila fonts all show the halanta below the initial consonant in the first example at Figure 13, but Oriya MN and Oriya Sangam MN fonts don't show it.
The halanta is also left showing for borrowed words [2, p. 404]. The halanta can be made visible by following it with U+200C ZERO WIDTH NON-JOINER.
Oriya has a number of clusters involving 3 consonants. For example, the following words contain triple-consonant clusters: ପୂର୍ଣ୍ଣ pūrṇṇô full, ତୀକ୍ଷ୍ଣ tīkṣṇô sharp (as a knife), ଚନ୍ଦ୍ର côndrô moon.
Geminated consonants in the middle of a word can be written using ଃU+0B03 SIGN VISARGA [7, p. 54], eg. ଦୁଃଖ dukkʰɔ sorrow
This section maps Odia consonant sounds to common graphemes in the Oriya orthography.
To the right, typical subjoined forms are shown after a dotted circle. Oriya also has some unusual conjuncts which are also shown. (Combinations with a trailing r are not shown.)
Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc. Light coloured characters occur infrequently.
୍ପ ତ୍ପ ମ୍ପ consonant ପU+0B2A LETTER PA
୍ଫ ମ୍ଫ consonant ଫU+0B2B LETTER PHA
୍ବ ବ୍ବ consonant ବU+0B2C LETTER BA
୍ଭ ଦ୍ଭ ମ୍ଭ consonant ଭU+0B2D LETTER BHA
୍ତ ତ୍ତ consonant ତU+0B24 LETTER TA
୍ଥ ତ୍ଥ consonant ଥU+0B25 LETTER THA
୍ଚ ଚ୍ଚ ଞ୍ଚ consonant ଚU+0B1A LETTER CA
୍ଛ ଚ୍ଛ ଞ୍ଛ ଶ୍ଛ consonant ଛU+0B1B LETTER CHA
୍ଦ ଦ୍ଦ ନ୍ଦ ବ୍ଦ consonant ଦU+0B26 LETTER DA
୍ଧ ଦ୍ଧ ନ୍ଧ consonant ଧU+0B27 LETTER DHA
୍ଜ ଞ୍ଜ consonant ଜU+0B1C LETTER JA
୍ଯ ଧ୍ଯ consonant ଯU+0B2F LETTER YA
୍ଝ ଞ୍ଝ consonant ଝU+0B1D LETTER JHA
୍ଟ consonant ଟU+0B1F LETTER TTA
୍ଠ consonant ଠU+0B20 LETTER TTHA
୍ଡ ଣ୍ଡ consonant ଡU+0B21 LETTER DDA
୍ଢ consonant ଢU+0B22 LETTER DDHA
୍କ ଙ୍କ ତ୍କ consonant କU+0B15 LETTER KA
୍ଖ ଙ୍ଖ consonant ଖU+0B16 LETTER KHA
alphabetic letter କ୍ଷU+0B15 LETTER KA + U+0B4D SIGN VIRAMA + U+0B37 LETTER SSA
୍ଗ ଙ୍ଗ consonant ଗU+0B17 LETTER GA
୍ଘ ଙ୍ଘ consonant ଘU+0B18 LETTER GHA
consonant କ଼U+0B15 LETTER KA + U+0B3C SIGN NUKTA Used for loan words.
୍ସ ତ୍ସ consonant ସU+0B38 LETTER SA
୍ଷ ଖ୍ଷ consonant ଷU+0B37 LETTER SSA
୍ଶ consonant ଶU+0B36 LETTER SHA
non-native consonant ଝ଼U+0B1D LETTER JHA + U+0B3C SIGN NUKTA Used to produce non-native sounds.
non-native consonant ଖ଼U+0B16 LETTER KHA + U+0B3C SIGN NUKTA Used to produce non-native sounds.
final aspiration/consonant doubler ଃU+0B03 SIGN VISARGA Coda.
୍ହ consonant ହU+0B39 LETTER HA
୍ମ ତ୍ମ ମ୍ମ consonant ମU+0B2E LETTER MA
୍ନ ତ୍ନ consonant ନU+0B28 LETTER NA
୍ଣ ଣ୍ଣ ଷ୍ଣ consonant ଣU+0B23 LETTER NNA
ଜ୍ଞ consonant ଞU+0B1E LETTER NYA
୍ଙ consonant ଙU+0B19 LETTER NGA
nasalisation ଂU+0B02 SIGN ANUSVARA Coda.
୍ୱ ବ୍ୱ consonant ୱU+0B71 LETTER WA
୍ଵ ବ୍ଵ consonant ଵU+0B35 LETTER VA
୍ର consonant ରU+0B30 LETTER RA
dependent vocalic ୃU+0B43 VOWEL SIGN VOCALIC R
dependent vocalic ୄU+0B44 VOWEL SIGN VOCALIC RR Usually only used for Sanskrit transcriptions.
vocalic ଋU+0B0B LETTER VOCALIC R Usually only used for Sanskrit transcriptions.
vocalic ୠU+0B60 LETTER VOCALIC RR Usually only used for Sanskrit transcriptions.
consonant ଡ଼U+0B21 LETTER DDA + U+0B3C SIGN NUKTA
consonant ଢ଼U+0B22 LETTER DDHA + U+0B3C SIGN NUKTA
non-native consonant ଷ଼U+0B37 LETTER SSA + U+0B3C SIGN NUKTA
୍ଲ consonant ଲU+0B32 LETTER LA
୍ଳ consonant ଳU+0B33 LETTER LLA
dependent vocalic ୢU+0B62 VOWEL SIGN VOCALIC L Usually only used for Sanskrit transcriptions.
dependent vocalic ୣU+0B63 VOWEL SIGN VOCALIC LL Usually only used for Sanskrit transcriptions.
vocalic ଌU+0B0C LETTER VOCALIC L Usually only used for Sanskrit transcriptions.
vocalic ୡU+0B61 LETTER VOCALIC LL Usually only used for Sanskrit transcriptions.
୍ୟ ଧ୍ୟ consonant ୟU+0B5F LETTER YYA
Deceased honorific. ୰U+0B70 ISSHAR is used before the name of a deceased person.
Om. The symbol for the word Om is produced using ଓଁU+0B13 LETTER O + U+0B01 SIGN CANDRABINDU. It also occurs as a ligated form. If the font doesn't produce the ligated form automatically, it may do so if U+200D ZERO WIDTH JOINER is inserted between the two characters.
Visually, several of the standalone vowels and some vowel signs look as if they could be composed of smaller parts. This section compares approaches and considers the relevance of Unicode Normalisation Form D (NFD) and Unicode Normalisation Form C (NFC) to give guidance on which approach is best.
The three circumgraphs can be written as a single character, or as two characters. In 2 of those cases, the second character is a lengthening mark.
The single code point per vowel sign is preferred, however the parts are separated in Unicode Normalisation Form D (NFD), and recomposed in Unicode Normalisation Form C (NFC), so both approaches are canonically equivalent.
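The canonical equivalence can be observed with Python's standard `unicodedata` module. This is only a sketch: the base consonant KA is an arbitrary choice, and `code_points` is a hypothetical display helper.

```python
import unicodedata

CIRCUMGRAPHS = ["\u0B48", "\u0B4B", "\u0B4C"]  # VOWEL SIGN AI, O, AU

def code_points(text):
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

for sign in CIRCUMGRAPHS:
    syllable = "\u0B15" + sign                           # KA + single-code-point vowel sign
    decomposed = unicodedata.normalize("NFD", syllable)  # vowel sign split into 2 parts
    recomposed = unicodedata.normalize("NFC", decomposed)
    print(code_points(syllable), "| NFD:", code_points(decomposed),
          "| NFC:", code_points(recomposed))
```

In each case NFD produces three code points (consonant plus two vowel parts) and NFC restores the original two, so comparing text after normalising both sides to the same form treats the two spellings as identical.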
Whichever approach is used, the vowel signs must be typed and stored after the consonant characters they surround, and in left to right order.
The approach listed in the table below is not equivalent when the text is normalised, and therefore only the precomposed approach in the left column should be used [10, p. 487].
In addition to the problem previously mentioned, combinations on rows 2 and 3 don't have the joining bar and so won't display correctly.
This section describes typographic features related to digits, dates, currencies, etc.
Odia has its own set of native digits.
The CLDR standard-decimal pattern is #,##,##0.### and the standard-percent pattern is #,##,##0%.
ASCII digits may also be used [6].
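The grouping implied by the #,##,##0 pattern (a final group of three digits, then groups of two) can be sketched in Python as below. This is an illustration for non-negative and negative integers only, not an implementation of CLDR number formatting; the names `group_indian`, `to_odia_digits` and `ODIA_DIGITS` are invented for the example, and the native digits are assumed to be U+0B66 to U+0B6F.

```python
# Translation table from ASCII digits to the native digits (assumed U+0B66..U+0B6F).
ODIA_DIGITS = str.maketrans("0123456789", "".join(chr(0x0B66 + i) for i in range(10)))

def group_indian(n: int) -> str:
    """Group an integer as #,##,##0: last three digits, then pairs."""
    digits = str(abs(n))
    head, tail = digits[:-3], digits[-3:]
    groups = []
    while head:
        groups.insert(0, head[-2:])  # peel off pairs from the right
        head = head[:-2]
    groups.append(tail)
    return ("-" if n < 0 else "") + ",".join(groups)

def to_odia_digits(text: str) -> str:
    """Replace ASCII digits with native digits, leaving separators alone."""
    return text.translate(ODIA_DIGITS)

grouped = group_indian(1234567)
print(grouped)
print(to_odia_digits(grouped))
```

The fractional part of the pattern (.###) and percent formatting are deliberately left out of this sketch.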
Odia also has a number of pre-decimal characters representing fractions.
These are used additively, with larger values appearing before smaller, eg. ୳୵ [U+0B73 FRACTION ONE HALF + U+0B75 FRACTION ONE SIXTEENTH] represents the value 9/16 (1/2 + 1/16) [10, p. 490].
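The additive reading can be tried out in Python by summing the numeric values that the Unicode Character Database assigns to each fraction sign. This is a sketch only; `fraction_value` is a hypothetical helper, and it does not check that larger values precede smaller ones.

```python
import unicodedata
from fractions import Fraction

def fraction_value(signs: str) -> Fraction:
    """Sum the numeric values of a run of fraction signs."""
    # unicodedata.numeric returns a float; halves, quarters and sixteenths
    # are exactly representable, so Fraction() recovers them exactly.
    return sum((Fraction(unicodedata.numeric(ch)) for ch in signs), Fraction(0))

half_and_sixteenth = fraction_value("\u0B73\u0B75")     # ONE HALF + ONE SIXTEENTH
quarter_and_sixteenth = fraction_value("\u0B72\u0B75")  # U+0B72 assumed to be the one-quarter sign

print(half_and_sixteenth, quarter_and_sixteenth)
```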
Odia text runs left to right in horizontal lines.
This section describes typographic features related to font/writing styles, cursive text, context-based shaping, context-based positioning, letterform slopes, weights & italics, and case & other character transforms.
You can experiment with examples using the Odia character app.
Are special glyph forms needed, depending on the context in which a character is used? Do glyphs interact in some circumstances? Are there requirements to position diacritics or other items specially, depending on context? Does the script have multiple diacritics competing for the same location relative to the base?
Odia text relies on OpenType rules to correctly position glyphs and shape them according to the surrounding text.
One major area where this applies is in the use of conjunct forms for consonant clusters.
ସୂର୍ଯ୍ୟ suɾd͡ʒjɔ sun
See a table of 2-consonant clusters for Oriya.
The following is a selection of other examples of contextual shaping and positioning.
Positioning u in clusters. When a below-base vowel sign occurs with a cluster with a conjoined form it is attached to the larger glyph, rather than to the consonant it actually follows in memory and speech, eg.
ମୃତ୍ୟୁ mrutyu death
Position & shape of i. After certain consonant glyphs, in some fonts, the vowel sign for i appears in a different position and with a different shape. The first example in the table below shows the typical shape.
Form | Composition | Example |
---|---|---|
ସି | ସ + ିU+0B38 LETTER SA + U+0B3F VOWEL SIGN I | ପ୍ରସିଦ୍ଧି pɾɔsiddʱi fame |
ସ୍ଥି | ସ + ୍ + ଥ + ିU+0B38 LETTER SA + U+0B4D SIGN VIRAMA + U+0B25 LETTER THA + U+0B3F VOWEL SIGN I | ଅସ୍ଥି ôsthi bone |
ଥି | ଥ + ିU+0B25 LETTER THA + U+0B3F VOWEL SIGN I | ପୃଥିବୀ pɾutʰibi Earth |
Other glyph variants. Nakanishi lists a number of alternative shapes for glyphs.
U+200C ZERO WIDTH NON-JOINER (ZWNJ) can be used to force the production of a visible virama, rather than a conjunct form.
U+200D ZERO WIDTH JOINER (ZWJ) is used to produce a ligated version of OM (see Symbols).
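The two joiner controls can be placed in text as in the following Python sketch. The function and constant names are hypothetical, and the visual result depends entirely on the font and rendering engine.

```python
VIRAMA = "\u0B4D"  # ORIYA SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200D"     # ZERO WIDTH JOINER

def visible_virama_cluster(first: str, second: str) -> str:
    """Cluster whose virama is followed by ZWNJ, asking for a visible virama."""
    return first + VIRAMA + ZWNJ + second

# KA + VIRAMA + ZWNJ + TA: requests a visible virama instead of a conjunct.
kta_visible = visible_virama_cluster("\u0B15", "\u0B24")

# LETTER O + ZWJ + CANDRABINDU: requests the ligated form of Om.
om_ligated = "\u0B13" + ZWJ + "\u0B01"

print([f"U+{ord(ch):04X}" for ch in kta_visible])
print([f"U+{ord(ch):04X}" for ch in om_ligated])
```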
Are words separated by spaces, or other characters? Are there special requirements when double-clicking on the text? Are words hyphenated?
Words are separated by spaces.
Hyphens may be used to separate parts of a compound word [6, p. 40], eg. ଡ୍ରପ୍-ଡାଉନ୍ ɖrɔp-ɖaun drop-down
Usually a typographic character unit correlates with the Unicode concept of grapheme clusters, but not in the case of conjuncts (in common with several other Indic scripts).
Conjuncts and any dependent combining characters should never be split.
This creates a problem when dealing with Unicode grapheme clusters, because they stop after reaching a virama. So conjuncts usually contain multiple grapheme clusters. This produces incorrect segmentation as seen on the right in Figure 19. Applications need to tailor the grapheme cluster rules to avoid splitting conjuncts.
ସାଙ୍ଗେ saṅge with
Unfortunately, this is harder than it seems, because whether a conjunct is formed or not usually depends on the capabilities of the font – it cannot be determined solely by looking at the code points in memory. If a font doesn't contain the glyphs to create a conjunct it will render the consonant cluster with a visible virama. In that case, the grapheme cluster approach is appropriate.
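One possible tailoring is sketched below in Python. It is a deliberate simplification, not the Unicode segmentation algorithm: `simple_clusters` approximates grapheme clusters as a base character plus following combining marks or joiner controls, and `conjunct_units` then merges a cluster that ends in a virama with the consonant cluster that follows. Both function names are invented here, and, as noted above, the code cannot know whether the font will actually form the conjunct.

```python
import unicodedata

VIRAMA = "\u0B4D"
ZWNJ, ZWJ = "\u200C", "\u200D"

def simple_clusters(text):
    """Approximate grapheme clusters: a base plus trailing marks/joiners."""
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch) in ("Mn", "Mc") or ch in (ZWNJ, ZWJ)
        if clusters and is_mark:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

def conjunct_units(text):
    """Merge clusters across a virama so that conjuncts are not split."""
    units = []
    for cluster in simple_clusters(text):
        starts_with_letter = unicodedata.category(cluster[0]) == "Lo"
        if units and units[-1].endswith(VIRAMA) and starts_with_letter:
            units[-1] += cluster  # keep the conjunct together
        else:
            units.append(cluster)
    return units

word = "\u0B38\u0B3E\u0B19\u0B4D\u0B17\u0B47"  # SA, AA, NGA, VIRAMA, GA, E
print(len(simple_clusters(word)), len(conjunct_units(word)))
```

A virama followed by ZWNJ is left unmerged by this sketch, which matches the case where a visible virama is requested and splitting at that point is acceptable.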
This section describes typographic features related to word boundaries, phrase & section boundaries, bracketed text, quotations & citations, emphasis, abbreviation, ellipsis & repetition, inline notes & annotations, other punctuation, and other inline text decoration.
What characters are used to indicate the boundaries of phrases, sentences, and sections?
Odia uses a combination of ASCII and native punctuation.
phrase | |
---|---|
sentence | |
section |
।U+0964 DEVANAGARI DANDA and ॥U+0965 DEVANAGARI DOUBLE DANDA are from the Unicode Devanagari block. Odia uses a space before these punctuation marks, which avoids confusion with ାU+0B3E VOWEL SIGN AA, eg.
… ଲୋପ ପାଇଗଲା ।
Odia commonly uses ASCII parentheses to insert parenthetical information into text.
start | end | |
---|---|---|
standard |
(U+0028 LEFT PARENTHESIS and )U+0029 RIGHT PARENTHESIS are used for parentheses.6
What characters are used to indicate quotations? Do quotations within quotations use different characters? What characters are used to indicate dialogue? Are the same mechanisms used to cite words, or for scare quotes, etc? What about citing book or article names?
Odia texts typically use quotation marks around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.
start | end | |
---|---|---|
initial | ”U+201D RIGHT DOUBLE QUOTATION MARK | |
nested | ’U+2019 RIGHT SINGLE QUOTATION MARK |
Single quotation marks are used for quotations within quotations.
What characters are used to indicate abbreviation, ellipsis & repetition?
Odia abbreviations use a period after the first syllable, but sometimes include more than one syllable [6, p. 45], eg. ବଶେଷ୍ୟ → ବ. ଉଦାହରଣ → ଉ.ଦା.
Odia uses …U+2026 HORIZONTAL ELLIPSIS for ellipsis [6, p. 40], eg. ଆଇକନ୍ ପରିବର୍ତ୍ତନ କରନ୍ତୁ…
In Sanskrit, ଽU+0B3D SIGN AVAGRAHA is used to indicate elision [13], eg. ଦ୍ୱିତୀୟୋଽଧ୍ୟାୟଃ dwitīyo'dhyāyaḥ chapter 2
Any other form of highlighting or marking of text, such as underlining, numeric overbars, etc. What characters or methods (eg. text decoration) are used to convey information about a range of text? If lines are drawn alongside, over or through the text, do they need to be a special distance from the text itself? Is it important to skip characters when underlining, etc? How do things change for vertically set text?
Punctuation not already mentioned, such as dashes, connectors, separators, scare quotes, etc.
CLDR lists the following non-ASCII punctuation marks for Odia.
This section describes typographic features related to line breaking & hyphenation, text alignment & justification, text spacing, baselines, line height, counters, lists, and styling initials.
Are there special rules about the way text wraps when it hits the end of a line? Does line-breaking wrap whole 'words' at a time, or characters, or something else (such as syllables in Tibetan and Javanese)? What characters should not appear at the end or start of a line, and what should be done to prevent that? Is hyphenation used, or something else? What rules are used? What difficulties exist?
Lines are mostly broken at inter-word spaces.
Like most writing systems, certain characters are expected not to start or end a line. For example, periods and commas shouldn't start a line, and opening parentheses shouldn't end a line.
Does the script have special requirements for baseline alignment between mixed scripts and in general? Is line height special for this script? Are there other aspects that affect line spacing, or positioning of items vertically within a line?
Odia uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.
Are there list or other counter styles in use? If so, what is the format used? Do counters need to be upright in vertical text? Are there other aspects related to counters and lists that need to be addressed?
You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.
The oriya numeric style is decimal-based and uses these digits [8].
Examples:
This section describes typographic features related to general page layout & progression; grids & tables, notes, footnotes, etc, forms & user interaction, and page numbering, running headers, etc.
[1] Unicode Consortium, CLDR, Locale Data Summary for Odia
[2] Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, 404-407, ISBN 0-19-507993-0
[3] Michael Everson and Anthony Stone, On Oriya VA and WA, and a proposal to encode one Oriya letter in the UCS
[4] Tapas S Ray, Peri Bhaskararao (ed), International Symposium on Indic Scripts, Past & Future, Research Institute for the Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies, 143-153
[5] Library of Congress (1996), Oriya (retr. Oct 2021)
[6] Microsoft, Odia style guide
[7] Akira Nakanishi, Writing Systems of the World, Tuttle, 54-55, ISBN 0-8048-1654-9
[8] Richard Ishida, Ready-made Counter Styles
[9] Swaran Lata, L2/12-380: Letter requesting change of name in Unicode
[10] Unicode Consortium, The Unicode Standard, Version 13.0, Chapter 12.5: South and Central Asia I, Oriya (Odia), 487-490, ISBN 978-1-936213-16-0
[11] Unicode Consortium, Unicode Line Breaking Algorithm (UAX#14)
[12] Wikipedia, Odia language
[13] Wikipedia, Odia script