
Odia

Oriya orthography notes

Updated 30 December, 2024

This page brings together basic information about the Odia (Oriya) script and its use for the Odia language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Odia using Unicode.

Referencing this document

Richard Ishida, Odia (Oriya) Orthography Notes, 30-Dec-2024, https://r12a.github.io/scripts/orya/or

 


Phonological transcriptions should be treated as a guide, only. They are taken from the sources consulted, and may be narrow or broad, phonemic or phonetic, depending on what is available. They mostly represent pronunciation of words in isolation. For more detailed information about allophones, alternations, sandhi, dialectal differences, and so on, follow the links to cited references.

This is an interactive document. Click/tap on the following to reveal detailed information and examples for each character: (a) coloured characters in examples and lists; (b) link text on character names. If your browser supports it, your cursor will change as you hover over these items.

More about using this page

Character names. The names of characters in codepoint markup drop the initial ORIYA label (purely to reduce the length of the examples). In other places the full name can be found.

Navigation. The Toggle images icon opens the table of contents in a popup window. Dismiss it by clicking on the X alongside it, or by hitting the ESC key.

Detailed character notes. Clicking on coloured characters in lists or on character names opens panels that give detailed information about each character. This information is taken from the companion document, Oriya Character Notes. (Those panels can be dismissed by pressing on the ESC key.)

Transcriptions & transliterations. Phonological transcriptions are surrounded by ⌈corner brackets⌋, to indicate that they vary between narrow, [phonetic] and broad, /phonemic/ transcriptions.
Latin transcriptions between <angle brackets> represent the letters as commonly written in the Latin script.
A transliteration has also been developed especially for this orthography, and is generally based on the sound of a letter where possible, but where a letter has multiple pronunciations, the transliteration represents only one.
Transliterations provide perfect round-trip conversion between the native script and Latin, whereas Latin transcriptions rarely do.
When you click on an example to see its composition, the top of the panel that opens contains a transliteration, followed by the native text, then (if available) an IPA transcription.


Sample


ଭାରତୀୟ ମହାକାଶ ଗବେଷଣା ସଂସ୍ଥା ବା ଇସ୍ରୋ ହେଉଛି ଭାରତ ସରକାରଙ୍କ ପ୍ରମୁଖ ମହାକାଶ ପ୍ରାଧିକରଣ । ଏହା ପୃଥିବୀର ଛଅଟି ବଡ ସରକାରୀ ମହାକାଶ ପ୍ରାଧିକରଣ ମଧ୍ୟରୁ ଅନ୍ୟତମ ଯଥା ।

Source: Unicode UDHR, article 1

Usage & history

Origins of the Oriya script, 1051 – today.

Phoenician

└ Aramaic

└ Brahmi

└ Gupta

└ Siddham

└ Gaudi

└ Oriya

+ Bengali

+ Tirhuta

+ Nagari

+ Nepalese

The Oriya script is the official orthography used to write the Odia language of the Odisha (Orissa) state in India, as well as minority languages such as Khondi and Santali, and a number of Dravidian and Munda minority languages spoken in that region.10487 It is also used in Orissa for transcribing Sanskrit texts.

ଓଡ଼ିଆ ଅକ୍ଷର oɽia ɔkʰjɔrɔ Odia script

The Oriya script is a descendant of the Brahmi script, via Siddham. Earliest recorded instances of the script go back to the 11thC. The language was initially written in the Kalinga script, from which the Oriya script developed.

The rounded shapes of the letters, especially the top bar, are ascribed to the practice of writing on palm leaves, where rounded lines are less likely to split the leaf than straight ones.

A cursive version of the script, called Karani (କରଣୀ ଅକ୍ଷର), was used by scribes in the royal courts.

The language and script were previously referred to in English as Oriya, but in 2011 India changed the spelling to Odia in the constitution.9

Sources: ScriptSource and Wikipedia.

Script code: orya
Language code: ory
Script type: abugida
Origins: asia
Native speakers: 33,000,000

Total characters: 93
Letters: 53
Combining marks: 20
Symbols: 1
Punctuation: 7
Numbers: 10
Other: 2
Possible other: 9
Unicode blocks: 1

Character counts above are for this orthography but exclude ASCII.

Text direction: ltr
Post-consonant vowels: 1 inherent vowel, marks, vocalics, pre-base marks, circumgraphs
Standalone vowels: letters
Case distinction: no
Cursive script: no
Combining marks: >1 per base
Clusters marked: yes
Dedicated finals: marks
Consonant clusters: ligated glyphs, stacks, conjoined glyphs, visual killer (killer type: v)
Other ligatures: yes
Word separator: space
Wraps at: word
Hyphenation: ?
G Clusters OK?: no
Justification: spaces
Baseline: romn

Basic features

The Odia script is an abugida. Consonants carry an inherent vowel which can be modified by appending vowel signs to the consonant. See the table to the right for a brief overview of features for the modern Odia orthography.

Odia runs left to right in horizontal lines. Words are separated by spaces.

❯ Consonant summary table

The 36 consonant letters used for Odia include repertoire extensions for 2 sounds by applying the nukta diacritic to characters. There are 2 additional, newer characters used for w and v.

Consonant clusters are most commonly rendered using subjoined forms, usually for the second character, but sometimes for the initial. Certain clusters use fused forms, and a couple are conjoined. A visible virama is used for borrowed words. Initial RA is rendered as a reph over the top right of the following consonant.

Syllable-final consonant sounds may be represented by 2 dedicated combining marks (anusvara & visarga). Velar consonant cluster initials may be written either using a regular character or using anusvara.

❯ Vowel summary table

This orthography is an abugida with one inherent vowel, pronounced ɔ. Other post-consonant vowels are written using 9 combining marks (vowel signs). There is 1 pre-base form, and 3 circumgraphs.

In principle, there are no multipart vowels; however, the 3 circumgraphs are decomposed into 2 parts each.

Vowels have short lengths only, although there are vestigial orthographic letters for long sounds that now represent alternatives for the short sounds.

Vowels may be nasalised, using the candrabindu diacritic.

Standalone vowels are written using 10 independent vowel letters. Additional symbols are used to express length and nasalisation.

There is a set of 4 vocalics, each with vowel sign and independent forms, but only one vocalic is used in modern Odia.

Odia has native digit shapes, but may also use ASCII digits.
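As a minimal sketch of handling the two digit sets together, the native digits U+0B66 to U+0B6F can be mapped to ASCII digits with a translation table. The helper name `to_ascii_digits` is hypothetical and not taken from any library.

```python
# Map ORIYA DIGIT ZERO..NINE (U+0B66..U+0B6F) to ASCII '0'..'9'.
ODIA_TO_ASCII = {0x0B66 + i: str(i) for i in range(10)}


def to_ascii_digits(text: str) -> str:
    """Replace Odia digits with ASCII digits; other characters are unchanged."""
    return text.translate(ODIA_TO_ASCII)


# Odia digits two, zero, two, four
print(to_ascii_digits("\u0b68\u0b66\u0b68\u0b6a"))
```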

Danda (from the Devanagari block) is used at the end of a sentence, and usually preceded by a space. Otherwise, most of the punctuation is ASCII.
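A rough sketch of sentence splitting on the danda follows, assuming that every U+0964 ends a sentence and that the optional space before it should be dropped. The function name `split_sentences` is an assumption made for illustration; real text may need more careful handling, for example of the double danda.

```python
import re

DANDA = "\u0964"  # DEVANAGARI DANDA, also used for Odia


def split_sentences(text: str) -> list[str]:
    """Split on danda, dropping surrounding whitespace and empty pieces."""
    parts = re.split(r"\s*" + DANDA + r"\s*", text)
    return [p for p in parts if p]
```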

Character index

The index points to locations where a character is mentioned in this page, and indicates whether it is used by the Oriya orthography described here.


Letters


Basic consonants

U+0B2A ORIYA LETTER PA: consonant, p p
U+0B2C ORIYA LETTER BA: consonant, b b
U+0B2B ORIYA LETTER PHA: consonant, ph
U+0B2D ORIYA LETTER BHA: consonant, bh
U+0B24 ORIYA LETTER TA: consonant, t t
U+0B26 ORIYA LETTER DA: consonant, d d
U+0B25 ORIYA LETTER THA: consonant, th
U+0B27 ORIYA LETTER DHA: consonant, dh
U+0B1F ORIYA LETTER TTA: consonant, ʈ
U+0B21 ORIYA LETTER DDA: consonant, ɖ
U+0B20 ORIYA LETTER TTHA: consonant, ʈʰ ṭh
U+0B22 ORIYA LETTER DDHA: consonant, ɖʰ ḍh
U+0B15 ORIYA LETTER KA: consonant, k k
U+0B17 ORIYA LETTER GA: consonant, ɡ g
U+0B16 ORIYA LETTER KHA: consonant, kh
U+0B18 ORIYA LETTER GHA: consonant, ɡʰ gh
U+0B1A ORIYA LETTER CA: consonant, t͡ʃ c
U+0B1C ORIYA LETTER JA: consonant, d͡ʒ j
U+0B1B ORIYA LETTER CHA: consonant, t͡ʃʰ ch
U+0B1D ORIYA LETTER JHA: consonant, d͡ʒʰ jh
U+0B2F ORIYA LETTER YA: consonant, d͡ʒ
U+0B38 ORIYA LETTER SA: consonant, s s
U+0B37 ORIYA LETTER SSA: consonant, s
U+0B36 ORIYA LETTER SHA: consonant, s ś
U+0B39 ORIYA LETTER HA: consonant, ɦ h
U+0B2E ORIYA LETTER MA: consonant, m m
U+0B28 ORIYA LETTER NA: consonant, n n
U+0B1E ORIYA LETTER NYA: consonant, ɲ ñ
U+0B23 ORIYA LETTER NNA: consonant, ɳ
U+0B19 ORIYA LETTER NGA: consonant, ŋ
U+0B30 ORIYA LETTER RA: consonant, r r
U+0B32 ORIYA LETTER LA: consonant, l l
U+0B33 ORIYA LETTER LLA: consonant, ɭ
U+0B5F ORIYA LETTER YYA: consonant, j y
U+0B71 ORIYA LETTER WA: consonant, w w
U+0B35 ORIYA LETTER VA: consonant, ʋ v

Extended consonants

ଡ଼ U+0B21 U+0B3C ORIYA LETTER DDA, SIGN NUKTA: consonant, ɽ
ଢ଼ U+0B22 U+0B3C ORIYA LETTER DDHA, SIGN NUKTA: consonant, ɽʰ ṛh
କ୍ଷ U+0B15 U+0B4D U+0B37 ORIYA LETTER KA, SIGN VIRAMA, LETTER SSA: alphabetic letter, kʰj kṣ
U+0B5C ORIYA LETTER RRA (avoid): consonant, ɽ. Decomposed is recommended.
U+0B5D ORIYA LETTER RHA (avoid): consonant, ɽʰ ṛh. Decomposed is recommended.

Vowels

U+0B07 ORIYA LETTER I: independent vowel, i i
U+0B08 ORIYA LETTER II: independent vowel, i ī
U+0B09 ORIYA LETTER U: independent vowel, u u
U+0B0A ORIYA LETTER UU: independent vowel, u ū
U+0B0F ORIYA LETTER E: independent vowel, e e
U+0B13 ORIYA LETTER O: independent vowel, o o
U+0B05 ORIYA LETTER A: independent vowel, ɔ a
U+0B06 ORIYA LETTER AA: independent vowel, a ā
U+0B10 ORIYA LETTER AI: independent vowel, ɔi ai
U+0B14 ORIYA LETTER AU: independent vowel, ɔu au

Vocalics

U+0B0B ORIYA LETTER VOCALIC R (rare): vocalic, ru. Usually only used for Sanskrit transcriptions.
U+0B60 ORIYA LETTER VOCALIC RR (rare): vocalic, ru r̥̄. Usually only used for Sanskrit transcriptions.
U+0B0C ORIYA LETTER VOCALIC L (rare): vocalic, lu. Usually only used for Sanskrit transcriptions.
U+0B61 ORIYA LETTER VOCALIC LL (rare): vocalic, lu l̥̄. Usually only used for Sanskrit transcriptions.

Avagraha

U+0B3D ORIYA SIGN AVAGRAHA (transcription): elision; vowel prolongation. Usually only used for Sanskrit transcriptions.

Combining marks


Vowels

ି U+0B3F ORIYA VOWEL SIGN I: vowel sign, i i
U+0B40 ORIYA VOWEL SIGN II: vowel sign, i ī
U+0B41 ORIYA VOWEL SIGN U: vowel sign, u u
U+0B42 ORIYA VOWEL SIGN UU: vowel sign, u ū
U+0B47 ORIYA VOWEL SIGN E: vowel sign, e e
U+0B4B ORIYA VOWEL SIGN O: vowel sign, o o
U+0B3E ORIYA VOWEL SIGN AA: vowel sign, a ā
U+0B48 ORIYA VOWEL SIGN AI: vowel sign, ɔi ai
U+0B4C ORIYA VOWEL SIGN AU: vowel sign, ɔu au
U+0B56 ORIYA AI LENGTH MARK (avoid): lengthening mark. Only found in decomposed text.
U+0B57 ORIYA AU LENGTH MARK (avoid): lengthening mark. Only found in decomposed text.

Vocalics

U+0B43 ORIYA VOWEL SIGN VOCALIC R: dependent vocalic, ru
U+0B44 ORIYA VOWEL SIGN VOCALIC RR (rare): dependent vocalic, ru r̥̄. Usually only used for Sanskrit transcriptions.
U+0B62 ORIYA VOWEL SIGN VOCALIC L (rare): dependent vocalic, lu. Usually only used for Sanskrit transcriptions.
U+0B63 ORIYA VOWEL SIGN VOCALIC LL (rare): dependent vocalic, lu l̥̄. Usually only used for Sanskrit transcriptions.

Bindu

U+0B02 ORIYA SIGN ANUSVARA: nasalisation, ̃ ŋ
U+0B01 ORIYA SIGN CANDRABINDU: nasalisation, ̃ ̃

Nukta

U+0B3C ORIYA SIGN NUKTA: nukta

Visarga

U+0B03 ORIYA SIGN VISARGA: final aspiration/consonant doubler, h

Virama

U+0B4D ORIYA SIGN VIRAMA: vowel-killer

Numbers

U+0B66 ORIYA DIGIT ZERO: digit, 0
U+0B67 ORIYA DIGIT ONE: digit, 1
U+0B68 ORIYA DIGIT TWO: digit, 2
U+0B69 ORIYA DIGIT THREE: digit, 3
U+0B6A ORIYA DIGIT FOUR: digit, 4
U+0B6B ORIYA DIGIT FIVE: digit, 5
U+0B6C ORIYA DIGIT SIX: digit, 6
U+0B6D ORIYA DIGIT SEVEN: digit, 7
U+0B6E ORIYA DIGIT EIGHT: digit, 8
U+0B6F ORIYA DIGIT NINE: digit, 9
U+0B72 ORIYA FRACTION ONE QUARTER (archaic): quarter
U+0B73 ORIYA FRACTION ONE HALF (archaic): half
U+0B74 ORIYA FRACTION THREE QUARTERS (archaic): three-quarters
U+0B75 ORIYA FRACTION ONE SIXTEENTH (archaic): one sixteenth
U+0B76 ORIYA FRACTION ONE EIGHTH (archaic): one eighth
U+0B77 ORIYA FRACTION THREE SIXTEENTHS (archaic): three sixteenths

Punctuation

U+0964 DEVANAGARI DANDA: section divider, .
U+0965 DEVANAGARI DOUBLE DANDA: section divider
U+2026 HORIZONTAL ELLIPSIS: ellipsis
U+2018 LEFT SINGLE QUOTATION MARK: quotation mark
U+2019 RIGHT SINGLE QUOTATION MARK: quotation mark
U+201C LEFT DOUBLE QUOTATION MARK: quotation mark
U+201D RIGHT DOUBLE QUOTATION MARK: quotation mark

ASCII

, U+002C COMMA: comma
; U+003B SEMICOLON: semicolon
: U+003A COLON: colon
. U+002E FULL STOP: full stop
? U+003F QUESTION MARK: question mark
! U+0021 EXCLAMATION MARK: exclamation mark
( U+0028 LEFT PARENTHESIS: parenthesis
) U+0029 RIGHT PARENTHESIS: parenthesis

Symbols

U+0B70 ORIYA ISSHAR: deceased sign; name of deity
ଓଁ U+0B13 U+0B01 LETTER O, SIGN CANDRABINDU: om sign, om

Other

U+200C ZERO WIDTH NON-JOINER (ZWNJ): zwnj
U+200D ZERO WIDTH JOINER (ZWJ): zwj

To be investigated

% U+0025 PERCENT SIGN (tbc): percentage mark
- U+002D HYPHEN-MINUS (tbc): hyphen
[ U+005B LEFT SQUARE BRACKET (tbc): bracket
] U+005D RIGHT SQUARE BRACKET (tbc): bracket
ʼ U+02BC MODIFIER LETTER APOSTROPHE (tbc): apostrophe
U+034F COMBINING GRAPHEME JOINER (tbc): combining grapheme joiner
U+2011 NON-BREAKING HYPHEN (tbc): non-breaking hyphen
U+2013 EN DASH (tbc): en dash
U+2014 EM DASH (tbc): em dash

Phonology

The following represents the repertoire of the Odia language.

Click on the sounds to reveal locations in this document where they are mentioned.

Phones in a lighter colour are non-native or allophones. Source Wikipedia.

Vowel sounds

Plain vowels

i u e o ɔ a

Complex vowels

ɔi ɔu

Consonant sounds

labial dental alveolar post-alveolar retroflex palatal velar glottal
stops p b t d     ʈ ɖ   k ɡ  
aspirated     ʈʰ ɖʰ   ɡʰ  
affricates       t͡ʃ d͡ʒ        
aspirated       t͡ʃʰ d͡ʒʰ        
fricatives v   s         ɦ
nasals m   n   ɳ ɲ ŋ
approximants w   l   ɭ j  
trills/flaps     ɾ   ɽ
aspirated         ɽʰ

Alphabet

Click on the characters to find where they are mentioned in this page.

Descriptions of the Oriya alphabet vary. CLDR§ lists the following 'index' characters. Note the multipart letter at the end of the list.


45
 ɔaɔ̣0B05
 aāạ̄0B06
 ii0B07
 iīị̄0B08
 uu0B09
 uūụ̄0B0A
rarerur̥̣0B0B
 ee0B0F
 ɔiaiɔ̣ʲ0B10
 oo0B13
 ɔuauɔ̣ᵘ0B14
 
 kkk0B15
 kh0B16
 ɡgg0B17
 ɡʰgh0B18
 ŋŋ0B19
 t͡ʃcc0B1A
 t͡ʃʰch0B1B
 d͡ʒjj0B1C
 d͡ʒʰjh0B1D
 ɲñɲ0B1E
 ʈʈ0B1F
 ʈʰṭhʈʰ0B20
 ɖɖ0B21
 ɖʰḍhɖʰ0B22
 ɳɳ0B23
 ttt0B24
 th0B25
 ddd0B26
 dh0B27
 nnn0B28
 ppp0B2A
 ph0B2B
 bbb0B2C
 bh0B2D
 mmm0B2E
 d͡ʒʤ0B2F
 rrr0B30
 lll0B32
 ɭɭ0B33
 sśś0B36
 s0B37
 sss0B38
 ɦhɦ0B39
କ୍ଷ kʰjkṣk͓ṣ0B15
0B4D
0B37

Vowels

This orthography is an abugida with one inherent vowel, pronounced ɔ. Other post-consonant vowels are written using 9 combining marks (vowel signs). There is 1 pre-base form, and 3 circumgraphs.

In principle, there are no multipart vowels; however, the 3 circumgraphs are decomposed into 2 parts each.

Vowels have short lengths only, although there are vestigial orthographic letters for long sounds that now represent alternatives for the short sounds.

Vowels may be nasalised, using the candrabindu diacritic.

Standalone vowels are written using 10 independent vowel letters. Additional symbols are used to express length and nasalisation.

There is a set of 4 vocalics, each with vowel sign and independent forms, but only one vocalic is used in modern Odia.

Vowel summary table

The following table summarises the main vowel to character assignments.

ⓘ represents the inherent vowel. Diacritics are added to the vowels to indicate nasalisation (not shown here).

Plain:

4
iିii0B3F
iīī0B40
uuu0B41
uūū0B42

4
ii0B07
iīị̄0B08
uu0B09
uūụ̄0B0A

both
eee0B47
ooo0B4B

both
ee0B0F
oo0B13

ɔ  

ɔaɔ̣0B05

aāā0B3E

aāạ̄0B06
Diphthongs:

both
ɔiaiɔʲ0B48
ɔuauɔᵘ0B4C

both
ɔiaiɔ̣ʲ0B10
ɔuauɔ̣ᵘ0B14

For additional details see Vowel sounds to characters.

Inherent vowel

U+0B15 ORIYA LETTER KA

In Indic scripts independent vowels are independent letter symbols that stand on their own and are typically used to represent standalone vowel sounds.

ɔ following a consonant is not written, but is seen as an inherent part of the consonant letter, so is written by simply using the consonant letter. This vowel sound is transcribed as a.2

Odia uses U+0B4D SIGN VIRAMA, called halant (the Odia equivalent of the Sanskrit virama), to indicate that the inherent vowel is not pronounced after a consonant, eg. the following explicitly represents just the sound k: କ୍

Word-final consonants without a following inherent vowel use the halant. If there is no halant, the vowel is pronounced, eg. compare ଫୁଲ phulô flower ଇ-ମେଲ୍ i-mel e-mail

Consonant clusters also use this character, but if the cluster forms a conjunct then the virama is not rendered visibly (see Consonant clusters).
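The word-final rule can be illustrated with a small check, given here as a sketch only. The helper `ends_with_halant` is a hypothetical name, and the two test words are the examples from the text, spelled out as escape sequences so that the code point order is explicit.

```python
VIRAMA = "\u0b4d"  # ORIYA SIGN VIRAMA (halant)


def ends_with_halant(word: str) -> bool:
    """True if the last code point is the virama, so no final inherent vowel."""
    return word.endswith(VIRAMA)


phula = "\u0b2b\u0b41\u0b32"              # ଫୁଲ: PHA, VOWEL SIGN U, LA
imel = "\u0b07-\u0b2e\u0b47\u0b32\u0b4d"  # ଇ-ମେଲ୍: I, hyphen, MA, VOWEL SIGN E, LA, VIRAMA

print(ends_with_halant(phula), ends_with_halant(imel))
```

The check only looks at the stored text; whether a virama in the middle of a word is visible depends on conjunct formation in the font.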

Vowels after consonants

Post-consonant vowels are written using 9 combining marks (vowel signs). There is 1 pre-base form, and 3 circumgraphs.

In principle, there are no multipart vowels; however, the 3 circumgraphs are decomposed into 2 parts each.

Vowels have short lengths only, although there are vestigial orthographic letters for long sounds that now represent alternatives for the short sounds.

Vowels may be nasalised, using the candrabindu diacritic.

Vowel signs

କି ki U+0B15 ORIYA LETTER KA + U+0B3F ORIYA VOWEL SIGN I

Odia uses the following dedicated combining marks for vowels.


9
ିiii0B3F
iīī0B40
uuu0B41
uūū0B42
eee0B47
ooo0B4B
aāā0B3E
ɔiaiɔʲ0B48
ɔuauɔᵘ0B4C

The 'primary' vowels have 'short' and 'long' written forms that hark back to the earlier Indic script origins, but modern Odia phonetics don't distinguish between long and short vowel sounds.

Six vowel signs are spacing marks, meaning that they consume horizontal space when added to a base consonant.
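One way to see which vowel signs are spacing marks is to inspect the Unicode General_Category of each code point: Mc is a spacing combining mark, Mn a non-spacing one. The following is a sketch using Python's standard `unicodedata` module; the results depend on the Unicode version bundled with the interpreter, and `is_spacing` is a name chosen here for illustration.

```python
import unicodedata

# The 9 vowel signs listed above, by code point.
VOWEL_SIGNS = [0x0B3F, 0x0B40, 0x0B41, 0x0B42, 0x0B47,
               0x0B4B, 0x0B3E, 0x0B48, 0x0B4C]


def is_spacing(cp: int) -> bool:
    """True if the code point has General_Category Mc (spacing mark)."""
    return unicodedata.category(chr(cp)) == "Mc"


for cp in VOWEL_SIGNS:
    kind = "spacing" if is_spacing(cp) else "non-spacing"
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {kind}")
```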

All vowel signs are stored after the base consonant, and the rendering process puts them in the correct place for display. This also applies for the pre-base vowel sign, and the 3 circumgraphs (where a single code point produces glyphs on more than one side of the consonant base).

An orthography that uses vowel signs is different from one that uses simple diacritics or letters for vowels in that the vowel signs are generally attached to the orthographic syllable, rather than just applied to the letter of the immediately preceding consonant. This means that pre-base vowel signs and the left glyph of circumgraphs appear before a whole consonant cluster if the cluster is rendered as a conjunct (see Pre-base vowel sign).

See also the 2 lengthening marks, which may occur in decomposed text.

Other symbols used for vowels

The following 'lengthening marks' may be used to create vowel sounds as part of a decomposed circumgraph, although the Unicode Standard recommends the use of the precomposed forms.


both
avoid0B56
avoidxᵘ0B57

See Encoding vowel-signs for more details.

In Sanskrit, U+0B3D SIGN AVAGRAHA can be used to show vowel elongation.13

Multipart vowels

A composite vowel sign is a single vowel sound or diphthong that is represented by more than one code point from the set of vowel signs, repurposed consonants, and diacritics available. It is the opposite of a circumgraph.

The only multipart vowels occur when the circumgraphs are encoded as pairs of characters (see Other symbols used for vowels and Encoding vowel-signs).

Pre-base vowel sign

କେ ke U+0B15 ORIYA LETTER KA + U+0B47 ORIYA VOWEL SIGN E


eee0B47

The sound e is written using U+0B47 VOWEL SIGN E, which appears to the left of the base consonant letter or cluster.

This is a combining mark that is always stored after the base consonant. The rendering process places the glyph before the base consonant.

The combination କେ [U+0B15 ORIYA LETTER KA + U+0B47 ORIYA VOWEL SIGN E] produces a pre-base positioned glyph.

When an orthographic syllable begins with a consonant cluster that is rendered as a conjunct, the vowel sign is rendered before the start of the cluster, eg. Figure 2 shows 3 sets of consonant clusters, each followed by e when spoken, but the vowel sign appears to the left of each cluster.

ଜ୍ଲେ ଛ୍ଯେ ଜ୍ଞେ
Three examples of a prebase vowel, pronounced after a consonant cluster, but rendered to the left of the conjunct.
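The difference between stored order and displayed order can be checked by listing the code points of an example. This sketch uses the standard `unicodedata` module; `describe` is a hypothetical helper, and the two strings are taken from the examples above. In both cases the vowel sign is the last code point stored, even though its glyph is drawn on the left.

```python
import unicodedata


def describe(text: str) -> list[str]:
    """List code points in stored (logical) order, with their Unicode names."""
    return [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text]


ke = "\u0b15\u0b47"               # କେ
jle = "\u0b1c\u0b4d\u0b32\u0b47"  # ଜ୍ଲେ

for line in describe(ke) + describe(jle):
    print(line)
```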

Circumgraphs

କୋ ko U+0B15 ORIYA LETTER KA + U+0B4B ORIYA VOWEL SIGN O

When a single vowel sign code point produces glyphs on more than one side of the consonant base, it is referred to here as a circumgraph.


3
ooo0B4B
ɔiaiɔʲ0B48
ɔuauɔᵘ0B4C

Three vowels are produced by a single combining character with visually separate parts, that appear on different sides of the consonant onset.

ସ୍ତ୍ରୈଣ
A circumgraph vowel: a single code point with glyphs on two sides of the consonant cluster after which it is pronounced.

ସ୍ତ୍ରୈଣ stɾɔi̯ɳɔ feminine

All 3 of these circumgraphs can be written as a single character, or as two. See Encoding vowel-signs.
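The single-character and two-character spellings are canonically equivalent, which can be seen with Unicode normalisation. The sketch below uses Python's `unicodedata.normalize`; the helper names `parts` and `compose` are made up for this illustration.

```python
import unicodedata

CIRCUMGRAPHS = ["\u0b4b", "\u0b48", "\u0b4c"]  # VOWEL SIGN O, AI, AU


def parts(sign: str) -> list[str]:
    """Canonical decomposition (NFD) of a vowel sign, as code point labels."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", sign)]


def compose(text: str) -> str:
    """Normalise to NFC, which recombines the two-part spellings."""
    return unicodedata.normalize("NFC", text)


for sign in CIRCUMGRAPHS:
    print(f"U+{ord(sign):04X} ->", parts(sign))
```

Normalising text to NFC before comparing or searching means that both spellings of a circumgraph match each other.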

Vowel length

Oriya doesn't mark vowel length.

Nasalisation


̃̃˜0B01

Vowels may be nasalised using U+0B01 SIGN CANDRABINDU or U+0B02 SIGN ANUSVARA.

ମୁଁହ
U+0B01 SIGN CANDRABINDU used to nasalise the u sound.

ମୁଁହ mũhɔ mouth

Where 2 vowels appear together, the nasalisation sign is rendered above the second, eg. ଜ୍ୱାଇଁ d͡ʒwaĩ son-in-law

Standalone vowels

Standalone vowels are vowel sounds that are not preceded by a consonant sound, or are preceded by only a glottal stop. They may appear at the beginning of a word or in the middle of a word after a preceding vowel.

Odia represents standalone vowels using a set of independent vowel letters. The set includes a character to represent the inherent vowel sound, ɔ.


10
ii0B07
iīị̄0B08
uu0B09
uūụ̄0B0A
ee0B0F
oo0B13
ɔaɔ̣0B05
aāạ̄0B06
ɔiaiɔ̣ʲ0B10
ɔuauɔ̣ᵘ0B14

Vowel sounds to characters

This section maps Odia vowel sounds to common graphemes in the Oriya orthography.

Dependent (post-consonant) and standalone vowel graphemes are labelled.

Plain vowels

i

dependent ିU+0B3F VOWEL SIGN I

standalone U+0B07 LETTER I

dependent U+0B40 VOWEL SIGN II

standalone U+0B08 LETTER II

u

dependent U+0B41 VOWEL SIGN U

standalone U+0B09 LETTER U

dependent U+0B42 VOWEL SIGN UU

standalone U+0B0A LETTER UU

e

dependent U+0B47 VOWEL SIGN E

standalone U+0B0F LETTER E

o

dependent U+0B4B VOWEL SIGN O

standalone U+0B13 LETTER O

ɔ

inherent vowel eg. ଅକ୍ଷର ɔkʰjɔrɔ character

standalone U+0B05 LETTER A

a

dependent U+0B3E VOWEL SIGN AA

standalone U+0B06 LETTER AA

Diphthongs and other combinations

ɔi

dependent U+0B48 VOWEL SIGN AI

standalone U+0B10 LETTER AI

ɔu

dependent U+0B4C VOWEL SIGN AU

standalone U+0B14 LETTER AU

Nasalisation

Vocalics

Vocalics are letters derived from Sanskrit that generally behave like vowels, but represent r/l followed by a vowel. They are often available both as vowel signs and independent vowel letters.


ru0B43

Only one vocalic is regularly used, in vowel sign form, in modern Odia.

କୃମି
A vocalic vowel sign.

କୃମି krumi worm

Other vocalics exist in the script, in independent and vowel sign forms, but are used for Sanskrit transcriptions.


7
rarerur̥̣0B0B
rarerur̥̄r̥̣̄0B60
rarerur̥̄r̥̄0B44
rarelul̥̣0B0C
rarelu0B62
rarelul̥̄l̥̣̄0B61
rarelul̥̄l̥̄0B63

Consonants

The 36 consonant letters used for Odia include repertoire extensions for 2 sounds by applying the nukta diacritic to characters. There are 2 additional, newer characters used for w and v.

Consonant clusters are most commonly rendered using subjoined forms, usually for the second character, but sometimes for the initial. Certain clusters use fused forms, and a couple are conjoined. A visible virama is used for borrowed words. Initial RA is rendered as a reph over the top right of the following consonant.

Syllable-final consonant sounds may be represented by 2 dedicated combining marks (anusvara & visarga). Velar consonant cluster initials may be written either using a regular character or using anusvara.

Consonant summary table

The following table summarises the main consonant to character assignments.

Onsets

8
ppp0B2A
bbb0B2C
ttt0B24
ddd0B26
ʈʈ0B1F
ɖɖ0B21
kkk0B15
ɡgg0B17

8
ph0B2B
bh0B2D
th0B25
dh0B27
ʈʰṭhʈʰ0B20
ɖʰḍhɖʰ0B22
kh0B16
ɡʰgh0B18

5
t͡ʃcc0B1A
d͡ʒjj0B1C
d͡ʒʤ0B2F
t͡ʃʰch0B1B
d͡ʒʰjh0B1D

4
sss0B38
s0B37
sśś0B36
ɦhɦ0B39

5
mmm0B2E
nnn0B28
ɲñɲ0B1E
ɳɳ0B23
ŋŋ0B19

8
www0B71
ʋvv0B35
rrr0B30
ɽଡ଼0B21
0B3C
ɽʰଢ଼ṛhrʰˑ0B22
0B3C
lll0B32
ɭɭ0B33
jyy0B5F
Finals

both
̃ ŋ0B02
h0B03

For additional details see Consonant sounds to characters.

Basic consonants

Whereas the table just above takes you from sounds to letters, the following simply lists the basic consonant letters (however, since the orthography is highly phonetic there is little difference in ordering).


36
ppp0B2A
bbb0B2C
ph0B2B
bh0B2D
ttt0B24
ddd0B26
th0B25
dh0B27
ʈʈ0B1F
ɖɖ0B21
ʈʰṭhʈʰ0B20
ɖʰḍhɖʰ0B22
kkk0B15
ɡgg0B17
kh0B16
ɡʰgh0B18
t͡ʃcc0B1A
t͡ʃʰch0B1B
d͡ʒjj0B1C
d͡ʒʰjh0B1D
d͡ʒʤ0B2F
sss0B38
s0B37
sśś0B36
ɦhɦ0B39
mmm0B2E
nnn0B28
ɲñɲ0B1E
ɳɳ0B23
ŋŋ0B19
www0B71
ʋvv0B35
rrr0B30
lll0B32
ɭɭ0B33
jyy0B5F

The velar and palatal nasals only occur in homorganic clusters.2406

କ୍ଷU+0B15 LETTER KA + U+0B4D SIGN VIRAMA + U+0B37 LETTER SSA is regarded as a letter of the alphabet.

wa and va

The letters U+0B71 LETTER WA and U+0B35 LETTER VA were added to Unicode version 4.

The subjoined forms of U+0B71 LETTER WA and U+0B2C LETTER BA may look the same. For a discussion of the possible historical relationship between these characters see Everson/Stone.3

Observation: The Library of Congress transcription page says that when [U+0B2C ORIYA LETTER BA] occurs as the second consonant of a consonant cluster (except when geminated), it is transliterated v5. It appears, however, that it also keeps the b sound after the letters m and r.

U+0B35 LETTER VA is described by Wiktionary as "used sporadically for the phonetic Va/Wa as an alternative for the officially recognised letter ୱ, but has not gained widespread acceptance".

Repertoire extension using nukta

The sounds ɽ and ɽʰ are written by combining U+0B3C SIGN NUKTA with an existing consonant.


both
ଡ଼ɽ0B21
0B3C
ଢ଼ɽʰṛhrʰˑ0B22
0B3C

The nukta should always be typed and stored immediately after the consonant it modifies, and before any combining vowels or diacritics.
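A simple check for this ordering rule is sketched below. It assumes that the consonant letters are those in the range U+0B15 to U+0B39 plus U+0B5F and U+0B71; both that range and the name `has_misplaced_nukta` are choices made for this illustration.

```python
import re

NUKTA = "\u0b3c"  # ORIYA SIGN NUKTA

# Assumption: consonant letters are U+0B15..U+0B39, YYA (U+0B5F), WA (U+0B71).
MISPLACED_NUKTA = re.compile(r"(?<![\u0B15-\u0B39\u0B5F\u0B71])" + NUKTA)


def has_misplaced_nukta(text: str) -> bool:
    """True if any nukta does not immediately follow a consonant letter."""
    return MISPLACED_NUKTA.search(text) is not None


good = "\u0b21\u0b3c\u0b3f"  # DDA, NUKTA, VOWEL SIGN I
bad = "\u0b21\u0b3f\u0b3c"   # DDA, VOWEL SIGN I, NUKTA
print(has_misplaced_nukta(good), has_misplaced_nukta(bad))
```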

Unicode also has precomposed forms of these letters, but they decompose under Unicode Normalisation Form C (NFC). Therefore, the Unicode Standard recommends the use of the decomposed sequence.


both
avoidɽ0B5C
avoidɽʰṛhrʰˑ0B5D
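The effect of normalisation on the precomposed letters can be checked directly. This is a sketch using the standard `unicodedata` module, with `to_recommended` as a hypothetical helper name: NFC turns U+0B5C and U+0B5D into the consonant followed by nukta.

```python
import unicodedata


def to_recommended(text: str) -> str:
    """NFC replaces precomposed RRA and RHA with consonant + nukta."""
    return unicodedata.normalize("NFC", text)


for precomposed in ("\u0b5c", "\u0b5d"):
    result = to_recommended(precomposed)
    labels = " ".join(f"U+{ord(c):04X}" for c in result)
    print(f"U+{ord(precomposed):04X} -> {labels}")
```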

The nukta may also be used to produce other non-native sounds. Wiktionary describes the following:

Onsets

Clusters of consonant letters at the beginning of an orthographic syllable occur in Odia, and they are handled as described in the section Consonant clusters.

Special behaviours include handling of RA at the beginning of an orthographic syllable (see RA in clusters).

Finals


both
̃ ŋ0B02
h0B03

A syllable-final nasal sound can be written using U+0B02 SIGN ANUSVARA, eg. ଜଂତୁ jôṃtu animal ଜଂଗଲ jôṃgôlô forest ଏବଂ ebôṃ and

It is optional whether the nasal sound is written using anusvara or by using a conjunct. Figure 6 shows two ways of writing ଅଂକ ɔŋkɔ ink.10488

ଅଙ୍କ	ଅଂକ
The sound ɔŋkɔ written using a conjunct (top) and using anusvara followed by KA (bottom).
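Because both spellings occur, a search or comparison routine may want to treat them as equivalent. The following is a sketch only, not a spelling rule: it assumes that anusvara before one of the four velar stops can be folded to NGA + VIRAMA, and the function name `fold_velar_anusvara` is hypothetical.

```python
import re

# Anusvara directly before KA, KHA, GA or GHA (U+0B15..U+0B18).
ANUSVARA_BEFORE_VELAR = re.compile("\u0b02(?=[\u0b15-\u0b18])")


def fold_velar_anusvara(text: str) -> str:
    """Rewrite anusvara + velar stop as NGA + VIRAMA + velar stop."""
    return ANUSVARA_BEFORE_VELAR.sub("\u0b19\u0b4d", text)


with_anusvara = "\u0b05\u0b02\u0b15"        # ଅଂକ
with_conjunct = "\u0b05\u0b19\u0b4d\u0b15"  # ଅଙ୍କ
print(fold_velar_anusvara(with_anusvara) == with_conjunct)
```

A word-final anusvara, as in ଏବଂ, is left unchanged because no velar stop follows it.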

A word-final h consonant can also be written using U+0B03 SIGN VISARGA.754 (In the middle of a word, it creates a geminated consonant.)

Observation: According to Wikipedia, that sound is a h, but according to Nakanishi it is a glottal stop.

Consonant clusters

A consonant cluster is a sequence of consonant sounds with no intervening vowels.

A conjunct is a consonant cluster where the lack of intervening vowels is indicated by one or more of stacking, changing and merging the shapes of the constituent letter forms (usually in abugidas). Not all consonant clusters are displayed as conjuncts.

The absence of a vowel sound between two or more consonants is visually indicated in one of the following ways.

  1. Create a conjunct. There are a number of possibilities here:
    1. Stacking : Reduce a non-initial consonant in size and shape and position it below the first.
    2. Conjoining : The two consonants sit side by side, but the second consonant has a special shape.
    3. Ligation : Create a fusion of the letter shapes, where it may be difficult to identify one or more of the components.
    4. The letter RA has its own idiosyncratic way of combining with other consonants, whether it precedes or follows them.
  2. Show a visible virama below the non-final consonants in the cluster.
  3. Use the anusvara.

See also Finals and Consonant length.

Conjunct formation

See a table of 2-consonant clusters.
The table allows you to test results for various fonts.


͞0B4D

In Unicode, conjunct formation is achieved by adding   U+0B4D SIGN VIRAMA between the consonants. The font hides the virama glyph automatically when a conjunct is formed.
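In terms of stored text, a cluster is just the consonant letters with a virama between each pair. A minimal sketch follows; the function name `cluster` is an assumption for illustration, and whether a conjunct is actually displayed depends on the font.

```python
VIRAMA = "\u0b4d"  # ORIYA SIGN VIRAMA


def cluster(*consonants: str) -> str:
    """Join consonant letters with VIRAMA so that a font can form a conjunct."""
    return VIRAMA.join(consonants)


kssa = cluster("\u0b15", "\u0b37")            # KA + VIRAMA + SSA: କ୍ଷ
ndra = cluster("\u0b28", "\u0b26", "\u0b30")  # NA, DA, RA as in ଚନ୍ଦ୍ର
print(kssa, ndra)
```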

Stacking

The overwhelming majority of conjuncts in Odia are achieved by subjoining a reduced form of the non-initial consonant below the initial.

ହନ→ହ୍ନ
ɦnɔ
ଳପ→ଳ୍ପ
ɭpɔ
କକ→କ୍କ
kkɔ
Examples of stacked conjuncts.

In most cases the non-initial consonant is just reduced in size, but in some cases the shape is changed, either by removing the circular top line, or in a more fundamental way.

କତ→କ୍ତ
ktɔ
କଢ→କ୍ଢ
kɖɔ
Stacked conjuncts where the subjoined shape is significantly different from the normal shape.

However, when TA is the initial consonant, it is sometimes the initial that is reduced and subjoined. In other combinations, however, it retains its full form.

ତକ→ତ୍କ
tkɔ
ତନ→ତ୍ନ
tnɔ
Stacked conjuncts with an initial TA. The TA may be subjoined in some combinations.

RA in clusters

A trailing RA has a fairly regular appearance as a subjoined glyph below the preceding consonant, although that line may join with the preceding letter shape, and therefore cause a slight change to it.

କର→କ୍ର
krɔ
A trailing RA in a cluster is rendered as a subjoined glyph.

However, like many other Indian scripts, U+0B30 LETTER RA at the beginning of a cluster is represented idiosyncratically, and appears as a small, superscript glyph over the top right of the following consonant.

ରକ→ର୍କ
rkɔ
An initial RA in a cluster is rendered as a superscript over the following consonant.

Observation: Unlike Devanagari, it appears that the RA doesn't move over a following vowel sign, such as [U+0B3E ORIYA VOWEL SIGN AA].

Ligated forms

Certain combinations of consonants form conjuncts by producing a merged glyph in which one or both of the original letters may be unrecognisable.

ଜଞ→ଜ୍ଞ
d͡ʒɲɔ
ତତ→ତ୍ତ
ttɔ
କଷ→କ୍ଷ
kʰjɔ
Clusters that fuse into forms different from their original component shapes.

The following is a list of combinations that produce such an effect. Click on the items to see the component letters.


23
କ୍ଷ0B15
0B4D
0B37
ତ୍ତ0B24
0B4D
0B24
ଦ୍ଧ0B26
0B4D
0B27
ଦ୍ଭ0B26
0B4D
0B2D
ଙ୍କ0B19
0B4D
0B15
ଙ୍ଖ0B19
0B4D
0B16
ଙ୍ଗ0B19
0B4D
0B17
ଙ୍ଘ0B19
0B4D
0B18
ଜ୍ଞ0B1C
0B4D
0B1E
ଟ୍ଟ0B1F
0B4D
0B1F
ତ୍ତ0B24
0B4D
0B24
ଦ୍ଭ0B26
0B4D
0B2D
ଦ୍ଦ0B26
0B4D
0B26
ଦ୍ଧ0B26
0B4D
0B27
ଧ୍ଯ0B27
0B4D
0B2F
ଧ୍ୟ0B27
0B4D
0B5F
ନ୍ଧ0B28
0B4D
0B27
ନ୍ଦ0B28
0B4D
0B26
ବ୍ଦ0B2C
0B4D
0B26
ବ୍ବ0B2C
0B4D
0B2C
ମ୍ପ0B2E
0B4D
0B2A
ମ୍ଫ0B2E
0B4D
0B2B
ମ୍ଭ0B2E
0B4D
0B2D

Conjoined consonants or a visible halanta

Three letters in particular tend not to stack, but sit alongside the initial consonant in the cluster.

ତଯ→ତ୍ଯ
td͡ʒɔ
ତୟ→ତ୍ୟ
tjɔ
Conjoined letters for the clusters td͡ʒ and tj, respectively (top to bottom).

As can be seen above, the conjoined forms for ʤ and j are identical.

The letter NYA also sits alongside the cluster initial, but the halanta may be shown below the initial letter.

ତଞ→ତ୍ଞ
tɲɔ
A consonant cluster that shows a visible virama, rather than creating a conjunct.

Observation: Noto, Nirmala, and Kalinga fonts all show the halanta below the initial consonant in the first example at Figure 13, but Oriya MN and Oriya Sangam MN fonts don't show it.

The halanta is also left showing for borrowed words [2, p. 404]. The halanta can be made visible by following it with U+200C ZERO WIDTH NON-JOINER.
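The ZERO WIDTH NON-JOINER technique can be sketched in Python as follows. The function name `force_visible_halanta` is hypothetical. As a simplifying assumption, the sketch treats any following character with General Category Lo as a consonant that would otherwise join.

```python
import unicodedata

VIRAMA = "\u0B4D"  # ORIYA SIGN VIRAMA (halanta)
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


def force_visible_halanta(text: str) -> str:
    """Insert ZWNJ after each virama that is directly followed by a letter."""
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        nxt = text[i + 1 : i + 2]
        # A virama already followed by ZWNJ, or at the end of a word, is left alone.
        if ch == VIRAMA and nxt and unicodedata.category(nxt) == "Lo":
            out.append(ZWNJ)
    return "".join(out)


# TA + VIRAMA + NYA becomes TA + VIRAMA + ZWNJ + NYA
print([f"{ord(c):04X}" for c in force_visible_halanta("\u0B24\u0B4D\u0B1E")])
```

Applying the function twice gives the same result as applying it once, because ZWNJ is not a letter.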

Triple-consonant clusters

Oriya has a number of clusters involving 3 consonants. For example, the following words contain triple-consonant clusters. As always, click on the example to see the composition.

ପୂର୍ଣ୍ଣ pūrṇṇô full
ତୀକ୍ଷ୍ଣ tīkṣṇô sharp (as a knife)
ଚନ୍ଦ୍ର côndrô moon

Consonant length


h0B03

Geminated consonants in the middle of a word can be written using U+0B03 SIGN VISARGA [7, p. 54], eg. ଦୁଃଖ dukkʰɔ sorrow

Consonant sounds to characters

This section maps Odia consonant sounds to common graphemes in the Oriya orthography.

To the right, typical subjoined forms are shown after a dotted circle. Oriya also has some unusual conjuncts which are also shown. (Combinations with a trailing r are not shown.)

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc. Light coloured characters occur infrequently.

p

୍ପ ତ୍ପ ମ୍ପ consonant U+0B2A LETTER PA

୍ଫ ମ୍ଫ consonant U+0B2B LETTER PHA

b

୍ବ ବ୍ବ consonant U+0B2C LETTER BA

୍ଭ ଦ୍ଭ ମ୍ଭ consonant U+0B2D LETTER BHA

t

୍ତ ତ୍ତ consonant U+0B24 LETTER TA

୍ଥ ତ୍ଥ consonant U+0B25 LETTER THA

t͡ʃ

୍ଚ ଚ୍ଚ ଞ୍ଚ consonant U+0B1A LETTER CA

t͡ʃʰ

୍ଛ ଚ୍ଛ ଞ୍ଛ ଶ୍ଛ consonant U+0B1B LETTER CHA

d

୍ଦ ଦ୍ଦ ନ୍ଦ ବ୍ଦ consonant U+0B26 LETTER DA

୍ଧ ଦ୍ଧ ନ୍ଧ consonant U+0B27 LETTER DHA

d͡ʒ

୍ଜ ଞ୍ଜ consonant U+0B1C LETTER JA

୍ଯ ଧ୍ଯ consonant U+0B2F LETTER YA

d͡ʒʰ

୍ଝ ଞ୍ଝ consonant U+0B1D LETTER JHA

ʈ

୍ଟ consonant U+0B1F LETTER TTA

ʈʰ

୍ଠ consonant U+0B20 LETTER TTHA

ɖ

୍ଡ ଣ୍ଡ consonant U+0B21 LETTER DDA

ɖʰ

୍ଢ consonant U+0B22 LETTER DDHA

k

୍କ ଙ୍କ ତ୍କ consonant U+0B15 LETTER KA

୍ଖ ଙ୍ଖ consonant U+0B16 LETTER KHA

ɡ

୍ଗ ଙ୍ଗ consonant U+0B17 LETTER GA

ɡʰ

୍ଘ ଙ୍ଘ consonant U+0B18 LETTER GHA

q

consonant କ଼ U+0B15 LETTER KA + U+0B3C SIGN NUKTA Used for loan words.

s

୍ସ ତ୍ସ consonant U+0B38 LETTER SA

୍ଷ ଖ୍ଷ consonant U+0B37 LETTER SSA

୍ଶ consonant U+0B36 LETTER SHA

ʒ

non-native consonant ଝ଼ U+0B1D LETTER JHA + U+0B3C SIGN NUKTA Used to produce non-native sounds.

x

non-native consonant ଖ଼ U+0B16 LETTER KHA + U+0B3C SIGN NUKTA Used to produce non-native sounds.

h

final aspiration/consonant doubler U+0B03 SIGN VISARGA Coda.

ɦ

୍ହ consonant U+0B39 LETTER HA

m

୍ମ ତ୍ମ ମ୍ମ consonant U+0B2E LETTER MA

n

୍ନ ତ୍ନ consonant U+0B28 LETTER NA

ɳ

୍ଣ ଣ୍ଣ ଷ୍ଣ consonant U+0B23 LETTER NNA

ɲ

ଜ୍ଞ consonant U+0B1E LETTER NYA

ŋ

୍ଙ consonant U+0B19 LETTER NGA

nasalisation U+0B02 SIGN ANUSVARA Coda.

w

୍ୱ ବ୍ୱ consonant U+0B71 LETTER WA

ʋ

୍ଵ ବ୍ଵ consonant U+0B35 LETTER VA

r

୍ର consonant U+0B30 LETTER RA

ru

dependent vocalic U+0B43 VOWEL SIGN VOCALIC R

dependent vocalic U+0B44 VOWEL SIGN VOCALIC RR Usually only used for Sanskrit transcriptions.

vocalic U+0B0B LETTER VOCALIC R Usually only used for Sanskrit transcriptions.

vocalic U+0B60 LETTER VOCALIC RR Usually only used for Sanskrit transcriptions.

ɻ

non-native consonant ଷ଼ U+0B37 LETTER SSA + U+0B3C SIGN NUKTA

l

୍ଲ consonant U+0B32 LETTER LA

ɭ

୍ଳ consonant U+0B33 LETTER LLA

lu

dependent vocalic U+0B62 VOWEL SIGN VOCALIC L Usually only used for Sanskrit transcriptions.

dependent vocalic U+0B63 VOWEL SIGN VOCALIC LL Usually only used for Sanskrit transcriptions.

vocalic U+0B0C LETTER VOCALIC L Usually only used for Sanskrit transcriptions.

vocalic U+0B61 LETTER VOCALIC LL Usually only used for Sanskrit transcriptions.

j

୍ୟ ଧ୍ୟ consonant U+0B5F LETTER YYA

Symbols

Deceased honorific. U+0B70 ISSHAR is used before the name of a deceased person.

Om. The symbol for the word Om is produced using ଓଁ U+0B13 LETTER O + U+0B01 SIGN CANDRABINDU. It also occurs as a ligated form. If the font doesn't produce the ligated form automatically, it may do so if U+200D ZERO WIDTH JOINER is inserted between the two characters.

ଓଁ	ଓ‍ଁ
A non-ligated combination of O+CANDRABINDU (left) and a ligated form using ZERO WIDTH JOINER (right).

Encoding choices

Visually, several of the standalone vowels and some vowel signs look as if they could be composed of smaller parts. This section compares approaches and considers the relevance of Unicode Normalisation Form D (NFD) and Unicode Normalisation Form C (NFC) to give guidance on which approach is best.

Encoding vowel-signs

The three circumgraphs can be written as a single character, or as two characters. In 2 of those cases, the second character is a lengthening mark.

Precomposed | Decomposed
U+0B4B VOWEL SIGN O | ୋ U+0B47 VOWEL SIGN E + U+0B3E VOWEL SIGN AA
U+0B48 VOWEL SIGN AI | ୈ U+0B47 VOWEL SIGN E + U+0B56 AI LENGTH MARK
U+0B4C VOWEL SIGN AU | ୌ U+0B47 VOWEL SIGN E + U+0B57 AU LENGTH MARK

The single code point per vowel sign is preferred. However, the parts are separated in Unicode Normalisation Form D (NFD) and recomposed in Unicode Normalisation Form C (NFC), so both approaches are canonically equivalent.

Whichever approach is used, the vowel signs must be typed and stored after the consonant characters they surround, and in left to right order.
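The canonical equivalence of the two vowel-sign encodings can be checked with Python's standard `unicodedata` module. In this small sketch the names `VOWEL_SIGN_PAIRS` and `canonically_equal` are invented for illustration.

```python
import unicodedata

# Precomposed vowel sign -> decomposed sequence, as in the table above.
VOWEL_SIGN_PAIRS = {
    "\u0B4B": "\u0B47\u0B3E",  # VOWEL SIGN O  = E + AA
    "\u0B48": "\u0B47\u0B56",  # VOWEL SIGN AI = E + AI LENGTH MARK
    "\u0B4C": "\u0B47\u0B57",  # VOWEL SIGN AU = E + AU LENGTH MARK
}


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalising both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


for precomposed, decomposed in VOWEL_SIGN_PAIRS.items():
    # NFD splits the single code point; NFC puts it back together.
    print(
        f"U+{ord(precomposed):04X}",
        unicodedata.normalize("NFD", precomposed) == decomposed,
        unicodedata.normalize("NFC", decomposed) == precomposed,
    )
```

Comparing or searching text after normalising both sides, as `canonically_equal` does, means it doesn't matter which of the two encodings was typed.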

Independent vowels

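The two-character sequences in the right-hand column of the table below are not canonically equivalent to the single characters in the left-hand column, so normalisation does not convert one into the other. Therefore only the precomposed characters in the left column should be used [10, p. 487].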

Use | Do not use
U+0B06 LETTER AA | ଅା U+0B05 LETTER A + U+0B3E VOWEL SIGN AA
U+0B10 LETTER AI | ଏୗ U+0B0F LETTER E + U+0B57 AU LENGTH MARK
U+0B14 LETTER AU | ଓୗ U+0B13 LETTER O + U+0B57 AU LENGTH MARK

In addition to the problem previously mentioned, combinations on rows 2 and 3 don't have the joining bar and so won't display correctly.
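By contrast with the vowel signs, normalisation leaves these independent-vowel sequences untouched, so they have to be found and replaced explicitly. The following is a hedged sketch in Python: the names `DISCOURAGED`, `normalisation_unifies` and `repair_independent_vowels` are hypothetical, and the blanket replacement assumes the text contains no intentional uses of the two-character sequences.

```python
import unicodedata

# Discouraged sequence -> recommended single character (from the table above).
DISCOURAGED = {
    "\u0B05\u0B3E": "\u0B06",  # A + VOWEL SIGN AA   -> LETTER AA
    "\u0B0F\u0B57": "\u0B10",  # E + AU LENGTH MARK  -> LETTER AI
    "\u0B13\u0B57": "\u0B14",  # O + AU LENGTH MARK  -> LETTER AU
}


def normalisation_unifies(sequence: str, single: str) -> bool:
    """True if NFC maps the sequence and the single character to the same string."""
    return unicodedata.normalize("NFC", sequence) == unicodedata.normalize("NFC", single)


def repair_independent_vowels(text: str) -> str:
    """Replace each discouraged sequence with the recommended character."""
    for sequence, single in DISCOURAGED.items():
        text = text.replace(sequence, single)
    return text


for sequence, single in DISCOURAGED.items():
    print(f"U+{ord(single):04X}", normalisation_unifies(sequence, single))
```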

Numbers

This section describes typographic features related to digits, dates, currencies, etc.

Digits

Odia has its own set of native digits.


୧ 1 0B67
୨ 2 0B68
୩ 3 0B69
୪ 4 0B6A
୫ 5 0B6B
୬ 6 0B6C
୭ 7 0B6D
୮ 8 0B6E
୯ 9 0B6F
୦ 0 0B66

The CLDR standard-decimal pattern is #,##,##0.###. The standard-percent pattern is #,##,##0%.

ASCII digits may also be used [6].
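The grouping in the pattern #,##,##0 (a final group of three digits, preceded by groups of two) and the mapping from ASCII to native digits can be sketched in Python. The function names `group_indian` and `to_odia_digits` are invented for illustration; a production system would normally rely on a CLDR-based library instead.

```python
# Map ASCII digits 0-9 to ORIYA DIGIT ZERO..NINE (U+0B66..U+0B6F).
ASCII_TO_ODIA = {ord(str(d)): chr(0x0B66 + d) for d in range(10)}


def group_indian(n: int) -> str:
    """Format an integer using the #,##,##0 grouping."""
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    head, tail = digits[:-3], digits[-3:]
    groups = []
    # Peel groups of two digits off the right-hand end of the remainder.
    while len(head) > 2:
        groups.insert(0, head[-2:])
        head = head[:-2]
    if head:
        groups.insert(0, head)
    return sign + ",".join(groups + [tail])


def to_odia_digits(text: str) -> str:
    """Replace ASCII digits with native Odia digits, leaving other characters alone."""
    return text.translate(ASCII_TO_ODIA)


print(group_indian(1234567))                  # 12,34,567
print(to_odia_digits(group_indian(1234567)))  # ୧୨,୩୪,୫୬୭
```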

Fractions

Odia also has a number of pre-decimal characters representing fractions.


¼ 0B72 (archaic)
½ 0B73 (archaic)
¾ 0B74 (archaic)
1/16 0B75 (archaic)
1/8 0B76 (archaic)
3/16 0B77 (archaic)

These are used additively, with larger values appearing before smaller, eg. ୳୵ U+0B73 FRACTION ONE HALF + U+0B75 FRACTION ONE SIXTEENTH represents the value 9/16 [10, p. 490].
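The additive reading can be illustrated with a short Python sketch. The values come from the fraction list above; the names `FRACTION_VALUES` and `fraction_value` are hypothetical.

```python
from fractions import Fraction

# Value of each Odia fraction sign, as listed above.
FRACTION_VALUES = {
    "\u0B72": Fraction(1, 4),   # FRACTION ONE QUARTER
    "\u0B73": Fraction(1, 2),   # FRACTION ONE HALF
    "\u0B74": Fraction(3, 4),   # FRACTION THREE QUARTERS
    "\u0B75": Fraction(1, 16),  # FRACTION ONE SIXTEENTH
    "\u0B76": Fraction(1, 8),   # FRACTION ONE EIGHTH
    "\u0B77": Fraction(3, 16),  # FRACTION THREE SIXTEENTHS
}


def fraction_value(signs: str) -> Fraction:
    """Add up the values of a run of fraction signs."""
    return sum((FRACTION_VALUES[ch] for ch in signs), Fraction(0))


print(fraction_value("\u0B72\u0B75"))  # one quarter + one sixteenth -> 5/16
print(fraction_value("\u0B74\u0B76"))  # three quarters + one eighth -> 7/8
```

The sketch does not check that larger values precede smaller ones; it only sums whatever signs it is given.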

Text direction

Odia text runs left to right in horizontal lines.

Glyph shaping & positioning

This section describes typographic features related to font/writing styles, cursive text, context-based shaping, context-based positioning, letterform slopes, weights & italics, and case & other character transforms.

You can experiment with examples using the Odia character app.

Context-based shaping & positioning

Are special glyph forms needed, depending on the context in which a character is used? Do glyphs interact in some circumstances? Are there requirements to position diacritics or other items specially, depending on context? Does the script have multiple diacritics competing for the same location relative to the base?

Odia text relies on OpenType rules to correctly position glyphs and shape them according to the surrounding text.

One major area where this applies is in the use of conjunct forms for consonant clusters.

ସୂର୍ଯ୍ୟ
The 3 glyphs highlighted on the right are conjunct forms in the word ସୂର୍ଯ୍ୟ suɾd͡ʒjɔ sun.

See a table of 2-consonant clusters for Oriya.

The following is a selection of other examples of contextual shaping and positioning.

Positioning u in clusters. When a below-base vowel sign occurs with a cluster that has a conjoined form, it is attached to the larger glyph rather than to the consonant it actually follows in memory and speech, eg.

ମୃତ୍ୟୁ
The arrow points from where the sound u is pronounced to the position the vowel sign is displayed in the word ମୃତ୍ୟୁ mrutyu death.

Position & shape of i. After certain consonant glyphs, in some fonts, the vowel sign for i appears in a different position and with a different shape. The first example in the table below shows the typical shape.

Form | Composition | Example
ସି | U+0B38 LETTER SA + U+0B3F VOWEL SIGN I | ପ୍ରସିଦ୍ଧି pɾɔsiddʱi fame
ସ୍ଥି | U+0B38 LETTER SA + U+0B4D SIGN VIRAMA + U+0B25 LETTER THA + U+0B3F VOWEL SIGN I | ଅସ୍ଥି ôsthi bone
ଥି | U+0B25 LETTER THA + U+0B3F VOWEL SIGN I | ପୃଥିବୀ pɾutʰibi Earth

Other glyph variants. Nakanishi lists a number of alternative shapes for glyphs.

Description of glyph variants from Nakanishi, p54.

Explicit shaping controls

‌U+200C ZERO WIDTH NON-JOINER (ZWNJ) can be used to force the production of a visible virama, rather than a conjunct form.

‍U+200D ZERO WIDTH JOINER (ZWJ) is used to produce a ligated version of OM (see Symbols).

Typographic units

Word boundaries

Are words separated by spaces, or other characters? Are there special requirements when double-clicking on the text? Are words hyphenated?

The concept of 'word' is difficult to define in any language (see What is a word?). Here, a word is a vaguely-defined, but recognisable semantic unit that is typically smaller than a phrase and may comprise one or more syllables.

Words are separated by spaces.

Hyphens may be used to separate parts of a compound word [6, p. 40], eg. ଡ୍ରପ୍-ଡାଉନ୍ ɖrɔp-ɖaun drop-down

Graphemes

A grapheme is a user-perceived unit of text. Text operations that use graphemes as a unit of text include line-breaking, forwards deletion, cursor movement & selection, character counts, text spacing, text insertion, justification, case conversions, and sorting. The Unicode Standard uses generalised rules to define 'grapheme clusters', which approximate the likely grapheme boundaries in a writing system; however, they don't work well with many complex scripts.

Grapheme clusters

Usually a typographic character unit correlates with the Unicode concept of grapheme clusters, but not in the case of conjuncts (in common with several other Indic scripts).

Conjuncts

Conjuncts and any dependent combining characters should never be split.

This creates a problem when dealing with Unicode grapheme clusters, because they stop after reaching a virama. So conjuncts usually contain multiple grapheme clusters. This produces incorrect segmentation as seen on the right in Figure 19. Applications need to tailor the grapheme cluster rules to avoid splitting conjuncts.

ସାଙ୍ଗେ   ସାଙ୍‌ଗେ
Segmentation of the word ସାଙ୍ଗେ saṅge 'with', as it should be (left), and as it would be if grapheme clusters are used as the maximal unit (right).

Unfortunately, this is harder than it seems, because whether a conjunct is formed or not usually depends on the capabilities of the font – it cannot be determined solely by looking at the code points in memory. If a font doesn't contain the glyphs to create a conjunct it will render the consonant cluster with a visible virama. In that case, the grapheme cluster approach is appropriate.
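A tailoring of this kind can be sketched in Python, under two stated assumptions: the font is assumed to form the conjunct, and `simple_clusters` is a deliberately simplified stand-in for the Unicode grapheme cluster rules (a base character plus any following combining marks and join controls). All function names are invented for illustration.

```python
import unicodedata

VIRAMA = "\u0B4D"
JOIN_CONTROLS = {"\u200C", "\u200D"}  # ZWNJ, ZWJ


def is_consonant(ch: str) -> bool:
    """Rough test for an Odia consonant letter (an assumption of this sketch)."""
    return "\u0B15" <= ch <= "\u0B39" or ch in "\u0B5C\u0B5D\u0B5F\u0B71"


def simple_clusters(text: str) -> list[str]:
    """Simplified clusters: each base character plus its trailing marks."""
    clusters: list[str] = []
    for ch in text:
        attaches = unicodedata.category(ch).startswith("M") or ch in JOIN_CONTROLS
        if clusters and attaches:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def orthographic_units(text: str) -> list[str]:
    """Merge a cluster ending in virama with a following consonant-initial cluster."""
    units: list[str] = []
    for cluster in simple_clusters(text):
        if units and units[-1].endswith(VIRAMA) and is_consonant(cluster[0]):
            units[-1] += cluster
        else:
            units.append(cluster)
    return units


word = "\u0B38\u0B3E\u0B19\u0B4D\u0B17\u0B47"  # ସାଙ୍ଗେ
print(len(simple_clusters(word)), len(orthographic_units(word)))
```

A virama followed by ZERO WIDTH NON-JOINER is not merged, since in that case the halanta is visible and the simpler segmentation is the appropriate one.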

Punctuation & inline features

This section describes typographic features related to word boundaries, phrase & section boundaries, bracketed text, quotations & citations, emphasis, abbreviation, ellipsis & repetition, inline notes & annotations, other punctuation, and other inline text decoration.

Phrase & section boundaries

What characters are used to indicate the boundaries of phrases, sentences, and sections?


8
,002C
;003B
:003A
.002E
?003F
!0021
0964
0965

Odia uses a combination of ASCII and native punctuation.

phrase

,U+002C COMMA

;U+003B SEMICOLON

:U+003A COLON

sentence

.U+002E FULL STOP

?U+003F QUESTION MARK

!U+0021 EXCLAMATION MARK

U+0964 DEVANAGARI DANDA

section

U+0965 DEVANAGARI DOUBLE DANDA

U+0964 DEVANAGARI DANDA and U+0965 DEVANAGARI DOUBLE DANDA are from the Unicode Devanagari block. Odia uses a space before these punctuation marks, which avoids confusion with U+0B3E VOWEL SIGN AA, eg.

… ଲୋପ ପାଇଗଲା ।
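If a text-cleaning step needed to apply this spacing convention, it could be sketched in Python as below. The function name `space_before_danda` is hypothetical, and the rule is a simplification that ignores cases such as a danda at the very start of a string.

```python
import re

DANDAS = "\u0964\u0965"  # DEVANAGARI DANDA, DEVANAGARI DOUBLE DANDA


def space_before_danda(text: str) -> str:
    """Ensure exactly one space precedes each danda or double danda."""
    return re.sub(rf"[ \t]*([{DANDAS}])", r" \1", text)


# 'ଗଲା' followed directly by a danda gains a separating space.
print(space_before_danda("\u0B17\u0B32\u0B3E\u0964"))
```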

Bracketed text


both
(0028
)0029

Odia commonly uses ASCII parentheses to insert parenthetical information into text.

  start end
standard

(U+0028 LEFT PARENTHESIS

)U+0029 RIGHT PARENTHESIS

( U+0028 LEFT PARENTHESIS and ) U+0029 RIGHT PARENTHESIS are used for parentheses [6].

Quotations & citations

What characters are used to indicate quotations? Do quotations within quotations use different characters? What characters are used to indicate dialogue? Are the same mechanisms used to cite words, or for scare quotes, etc? What about citing book or article names?


4
201C
201D
2018
2019

Odia texts typically use quotation marks around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks.

  start end
initial

U+201C LEFT DOUBLE QUOTATION MARK

U+201D RIGHT DOUBLE QUOTATION MARK
nested

U+2018 LEFT SINGLE QUOTATION MARK

U+2019 RIGHT SINGLE QUOTATION MARK

Single quotation marks are used for quotations within quotations.

Abbreviation, ellipsis & repetition

What characters are used to indicate abbreviation, ellipsis & repetition?

Abbreviations

Odia abbreviations use a period after the first syllable, but sometimes include more than one syllable [6, p. 45], eg. ବଶେଷ୍ୟ → ବ. and ଉଦାହରଣ → ଉ.ଦା.

Ellipsis

Odia uses U+2026 HORIZONTAL ELLIPSIS for ellipsis [6, p. 40], eg. ଆଇକନ୍ ପରିବର୍ତ୍ତନ କରନ୍ତୁ…

In Sanskrit, U+0B3D SIGN AVAGRAHA is used to indicate elision [13], eg. ଦ୍ୱିତୀୟୋଽଧ୍ୟାୟଃ dwitīyo'dhyāyaḥ chapter 2

Other inline features

Any other form of highlighting or marking of text, such as underlining, numeric overbars, etc. What characters or methods (eg. text decoration) are used to convey information about a range of text? If lines are drawn alongside, over or through the text, do they need to be a special distance from the text itself? Is it important to skip characters when underlining, etc? How do things change for vertically set text?

Other punctuation

Punctuation not already mentioned, such as dashes, connectors, separators, scare quotes, etc.

CLDR lists the following non-ASCII punctuation marks for Odia.


U+2010 HYPHEN
U+2011 NON-BREAKING HYPHEN
U+2013 EN DASH
U+2014 EM DASH

Line & paragraph layout

This section describes typographic features related to line breaking & hyphenation, text alignment & justification, text spacing, baselines, line height, counters, lists, and styling initials.

Line breaking & hyphenation

Are there special rules about the way text wraps when it hits the end of a line? Does line-breaking wrap whole 'words' at a time, or characters, or something else (such as syllables in Tibetan and Javanese)? What characters should not appear at the end or start of a line, and what should be done to prevent that? Is hyphenation used, or something else? What rules are used? What difficulties exist?

Lines are mostly broken at inter-word spaces.

As in most writing systems, certain characters are expected not to start or end a line. For example, periods and commas shouldn't start a line, and opening parentheses shouldn't end a line.

Baselines, line height, etc.

Does the script have special requirements for baseline alignment between mixed scripts and in general? Is line height special for this script? Are there other aspects that affect line spacing, or positioning of items vertically within a line?

Odia uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Counters, lists, etc.

Are there list or other counter styles in use? If so, what is the format used? Do counters need to be upright in vertical text? Are there other aspects related to counters and lists that need to be addressed?

You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.

The oriya numeric style is decimal-based and uses these digits [8].


୦ 0 0B66
୧ 1 0B67
୨ 2 0B68
୩ 3 0B69
୪ 4 0B6A
୫ 5 0B6B
୬ 6 0B6C
୭ 7 0B6D
୮ 8 0B6E
୯ 9 0B6F

Examples:


୧ 1 0B67
୨ 2 0B68
୩ 3 0B69
୪ 4 0B6A
୧୧ 11 0B67 0B67
୨୨ 22 0B68 0B68
୩୩ 33 0B69 0B69
୪୪ 44 0B6A 0B6A
୧୧୧ 111 0B67 0B67 0B67
୨୨୨ 222 0B68 0B68 0B68
୩୩୩ 333 0B69 0B69 0B69
୪୪୪ 444 0B6A 0B6A 0B6A

Page & book layout

This section describes typographic features related to general page layout & progression; grids & tables, notes, footnotes, etc, forms & user interaction, and page numbering, running headers, etc.

References & sources

[1] Unicode Consortium, CLDR, Locale Data Summary for Odia

[2] Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, 404-407, ISBN 0-19-507993-0

[3] Michael Everson and Anthony Stone, On Oriya VA and WA, and a proposal to encode one Oriya letter in the UCS

[4] Tapas S Ray, Peri Bhaskararao (ed), International Symposium on Indic Scripts, Past & Future, Research Institute for the Languages and Cultures of Asia and Africa, Tokyo University of Foreign Studies, 143-153

[5] Library of Congress (1996), Oriya (retr. Oct 2021)

[6] Microsoft, Odia style guide

[7] Akira Nakanishi, Writing Systems of the World, Tuttle, 54-55, ISBN 0-8048-1654-9

[8] Richard Ishida, Ready-made Counter Styles

[9] Swaran Lata, L2/12-380: Letter requesting change of name in Unicode

[10] Unicode Consortium, The Unicode Standard, Version 13.0, Chapter 12.5: South and Central Asia I, Oriya (Odia), 487-490, ISBN 978-1-936213-16-0

[11] Unicode Consortium, Unicode Line Breaking Algorithm (UAX#14)

[12] Wikipedia, Odia language

[13] Wikipedia, Odia script

Licence CC-By © r12a.