Dhivehi

Thaana orthography notes

Updated 24 May, 2023

This page brings together basic information about the Thaana script and its use for the Maldivian language, Dhivehi. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Dhivehi using Unicode.

Sample


1 ވަނަ މާއްދާ ހުރިހާ އިންސާނުންވެސް ދުނިޔެއަށް އުފަންވަނީ، މިނިވަންކަމުގައި، ހަމަހަމަ ޙައްޤުތަކަކާއެކު، ހަމަހަމަ ދަރަޖައެއްގައި ކަމޭހިތެވިގެންވާ ބައެއްގެ ގޮތުގައެވެ. ހެޔޮ ވިސްނުމާއި، ހެޔޮބުއްދީގެ ބާރު އެމީހުންނަށް ލިބިގެންވެއެވެ. އަދި އެކަކު އަނެކަކާމެދު އެމީހުން މުޢާމަލާތް ކުރަންވާނީ، އުޚުއްވަތްތެރިކަމުގެ ރޫޙެއްގައެވެ.

2 ވަނަ މާއްދާ ހަމަ ކޮންމެ މީހަކަށްމެ، މިޤަރާރުގައި ބަޔާންކޮށްފައިވާ ހުރިހާ ޙައްޤުތަކަކާއި މިނިވަންކަމުގެ މިންގަނޑުތަކެއް ހޯދުމާއި، ލިބިގަތުމުގެ ޙައްޤު ލިބިގެންވެއެވެ. އެޙައްޤުތަކާއި އެމިންގަނޑުތައް ލިބިދެނީ، ނަސްލާއި، ކުލައާއި، ޖިންސާއި، ބަހާއި، ދީނާއި، ސިޔާސީގޮތުން ނުވަތަ އެހެންވެސް ކަމަކާ ގުޅޭގޮތުން ވިސްނުން ގެންގުޅޭ ގޮތާއި، ވަކި ޤައުމަކަށް ނުވަތަ މުޖުތަމަޢަކަށް ނިސްބަތްވުމާއި، މުދާ ލިބިހުރުމާއި، އުފަންވީ ޢާއިލާއެއްގެ ސަބަބުން ޤަދަރުވެރިވުމާއި، އެހެންވެސް ސަބަބަކާ ހުރެ ޤަދަރުވެރިވުންފަދަ، އެއްވެސް ބާވަތެއްގެ މިންގަނޑަކުން ތަފާތު ކުރުމެއް ނެތިއެވެ. އަދި މީހަކު ނިސްބަތްވާ ޤައުމަކީ، ނުވަތަ ސަރަޙައްދަކީ ސިޔާސީ ގޮތުން، ނުވަތަ އެޤައުމުގެ ބާރު ހިނގާ ސަރަޙައްދުގެ މިންވަރުގެ ގޮތުން، ނުވަތަ އެޤައުމުގެ ބައިނަލްއަޤްވާމީ ހައިސިއްޔަތުގެ ގޮތުން ނަމަވެސް، އެއްވެސް ތަފާތުކުރުމެއް ގެންގުޅެގެން ނުވާނޭ ގޮތުންނެވެ. އެޤައުމަކީ، ނުވަތަ މީހަކު ނިސްބަތްވާ އެސަރަޙައްދަކީ، މިނިވަން ޤައުމަކަށް ވިޔަސް، ނުވަތަ އެކުވެރި ދައުލަތްތަކުގެ ބެލުމުގެ ދަށުން އެހެން ޤައުމަކުންބަލަހައްޓަމުންދާ ޤައުމެއް ކަމުގައިވިޔަސް، ނުވަތަ އަމިއްލަ ވެރިކަމެއް ނެތް ޤައުމެއް ކަމުގައި ވިޔަސް، ނުވަތަ އެހެންވެސް ގޮތަކުން ސިޔާދަތީ ބާރު މަޙްދޫދު ކުރެވިގެންވާ ޤައުމަކަށް ވީނަމަވެހެވެ.

Usage & history

The Thaana script is used for writing the Maldivian language, also known as Dhivehi, spoken by about 370,000 people in the Maldives and in Maldivian communities in India.

ތާނަ tə̄nə təːnə Thaana

It is thought that speakers of Dhivehi have written their language for over two thousand years, and that writing was developed by Maldivian Buddhist monks translating the Buddhist scriptures.

Over time, the script evolved, slanting the letters by 45 degrees and adding spaces between words. The earliest known sample of the Thaana script (rather than the Dhives Akuru alphabet) is inscribed on the main Friday mosque of the island and dates back to AD 1599.

Sources: Scriptsource, Wikipedia.

Basic features

The Thaana script is an alphabet. All vowels are written, but as diacritics above or below the consonants. See the table to the right for a brief overview of features for the modern Dhivehi orthography using the Thaana script.

Thaana text is written horizontally, right to left, but unlike Arabic or N'Ko, the text is not cursive. Multi-digit numbers are displayed left-to-right. ❯ direction

Words are separated by spaces.

The script is monocameral.

Dhivehi has 24 basic consonant letters, but the repertoire also contains 14 extra consonants to represent the sounds of Arabic and English. ❯ consonants

Thaana is an alphabet, but the vowels are written using 10 combining marks. An additional mark is used to indicate consonant clusters, or the absence of a vowel after a consonant base. ❯ vowels

Thaana represents standalone vowels using އ [U+0787 THAANA LETTER ALIFU] as a base, to which vowel diacritics are attached. ❯ standalone

Numbers use ASCII digits.

The visual forms of letters don't usually interact. Punctuation mixes ASCII and Arabic characters.

Character index

Letters

Basic consonants

ޕ␣ބ␣ތ␣ދ␣ޗ␣ޖ␣ޓ␣ޑ␣ކ␣ގ␣ފ␣ވ␣ސ␣ޒ␣ށ␣ހ␣މ␣ނ␣ޏ␣ރ␣ލ␣ޅ␣ޔ␣އ

Extended consonants

ޠ␣ޟ␣ޤ␣ޘ␣ޛ␣ޞ␣ޡ␣ޝ␣ޜ␣ޚ␣ޣ␣ޙ␣ޢ␣ޥ

Other

ޕ

Not used for modern Dhivehi

ޱ

Combining marks

Vowels

ި␣ީ␣ު␣ޫ␣ެ␣ޭ␣ަ␣ާ␣ޮ␣ޯ

Other

ް

Punctuation

،␣؛

ASCII

(␣)␣.␣:␣?␣!

Other

To be investigated

,␣/␣;␣[␣]␣؟␣⹁

Phonology

These are sounds for the Dhivehi language.

Phones in a lighter colour are non-native or allophones. Source: Wikipedia.

Vowel sounds

i u e o ə ə əː

Consonant sounds

Places of articulation, in column order: labial, dental, alveolar, post-alveolar, retroflex, palatal, velar, uvular, pharyngeal, glottal.

stops: p b t d ʈ ɖ k ɡ q
pre-nasalised: ᵐb ⁿd ᶯɖ ᵑɡ
affricate: t͡ʃ d͡ʒ
fricatives: f θ ð s z ʃ ʒ ʂ ɕ x ɣ ħ ʕ h
nasal: m n ɳ ɲ ŋ
approximant: ʋ w l ɭ j
trill/flap: r

Tone

Dhivehi is not a tonal language.

Structure

Native Maldivian words do not allow initial consonant clusters. The syllable structure is (C)V(C), ie. one vowel with the option of a consonant in the onset and/or coda. This affects the adaptation of loanwords, such as is.kuːl from English school. (Source: Wikipedia.)

Vowels

Vowel diacritics

All vowels are always written, but as diacritics above or below the consonant they follow.

ި␣ީ␣ު␣ޫ␣ެ␣ޭ␣ަ␣ާ␣ޮ␣ޯ

Diphthongs & glides

Diphthongs are written as a vowel diacritic over a consonant followed by a standalone vowel.

ވައި

މިއުޒިކް

ސައުވީސް
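
As a rough illustration of this pattern, the following Python sketch lists the code points that make up the first example above (ވައި). It relies only on the standard unicodedata module; the helper name describe is hypothetical and not part of any Thaana tooling.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# ވައި: a consonant carrying a vowel mark, then ALIFU carrying a second vowel mark.
for code_point, name in describe("\u0788\u07A6\u0787\u07A8"):
    print(code_point, name)
```

The second half of the sequence is simply a standalone vowel, so no special diphthong character is involved.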

Standalone vowels

Thaana represents standalone vowels using އ [U+0787 THAANA LETTER ALIFU] as a base, to which vowel diacritics are attached.

އާއިލާ

އިރުގައި
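
A minimal sketch of how a standalone vowel could be composed in code, assuming the ten vowel marks occupy U+07A6 to U+07AF in the Unicode Thaana block. The function name standalone_vowel is hypothetical.

```python
ALIFU = "\u0787"  # THAANA LETTER ALIFU, used as the carrier for standalone vowels

def standalone_vowel(vowel_mark):
    """Attach a single Thaana vowel mark (U+07A6..U+07AF) to ALIFU."""
    if len(vowel_mark) != 1 or not "\u07A6" <= vowel_mark <= "\u07AF":
        raise ValueError("expected exactly one Thaana vowel mark")
    return ALIFU + vowel_mark

# The second example above (އިރުގައި) begins with ALIFU + IBIFILI.
word = "\u0787\u07A8\u0783\u07AA\u078E\u07A6\u0787\u07A8"
print(word.startswith(standalone_vowel("\u07A8")))
```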

Vowel length

tbd

Nasalisation

tbd

Tones

Dhivehi is not a tonal language.

Vowel sounds to characters

This section maps Dhivehi vowel sounds to common graphemes in the Thaana orthography.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

Plain vowels

Consonants

Basic consonants

For writing the languages of the Maldives, the Unicode Thaana block provides 24 basic consonants.

ޕ␣ބ␣ތ␣ދ␣ޓ␣ޑ␣ކ␣ގ
ޗ␣ޖ
ފ␣ވ␣ސ␣ޒ␣ށ␣ހ
މ␣ނ␣ޏ
ރ␣ލ␣ޅ␣ޔ␣އ

އ [U+0787 THAANA LETTER ALIFU] has no sound of its own, and is used in a number of situations, including as a support for standalone vowels (see standalone), to indicate gemination (see gemination), and to produce a glottal stop at the end of a word (see glottal).

Repertoire extension

A further 14 dotted versions of the normal consonants are available for transcribing Arabic sounds and the English sound ʒ (ޜ).

ޠ␣ޟ␣ޤ␣ޘ␣ޛ␣ޞ␣ޡ␣ޝ␣ޜ␣ޚ␣ޣ␣ޙ␣ޢ␣ޥ
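
The groups described on this page line up with contiguous ranges in the Unicode Thaana block, so a character can be classified by code point alone. The sketch below assumes that reading of the code chart; the function name thaana_class and the group labels are hypothetical.

```python
def thaana_class(ch):
    """Classify a single character by its position in the Thaana block."""
    cp = ord(ch)
    if 0x0780 <= cp <= 0x0797:
        return "basic consonant"
    if 0x0798 <= cp <= 0x07A5:
        return "extended consonant"
    if 0x07A6 <= cp <= 0x07AF:
        return "vowel mark"
    if cp == 0x07B0:
        return "sukun"
    if cp == 0x07B1:
        return "obsolete consonant"
    return "other"

# Count the members of each group across the assigned part of the block.
counts = {}
for cp in range(0x0780, 0x07B2):
    label = thaana_class(chr(cp))
    counts[label] = counts.get(label, 0) + 1
print(counts)
```

The counts agree with the figures given in the text: 24 basic consonants, 14 extended consonants and 10 vowel marks.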

Glottal stop

Several combinations of characters at the end of a word indicate a glottal stop.

އް [U+0787 THAANA LETTER ALIFU + U+07B0 THAANA SUKUN] is the most common [d,566] (however, Wiktionary sometimes transcribes this as h).

އެކެއް

A word-final ށް [U+0781 THAANA LETTER SHAVIYANI + U+07B0 THAANA SUKUN] or ތް [U+078C THAANA LETTER THAA + U+07B0 THAANA SUKUN] also indicates a glottal stop, except that the latter also adds j before the stop.

ރަށް

ރަތް

Prenasalisation

Dhivehi, like Sinhala, has prenasalised stops that contrast with sequences of nasal plus stop.

These are not always written, but when they are, the stop is preceded by ނ [U+0782 THAANA LETTER NOONU]. No sukun is used (in fact, this is the only situation where a consonant character isn't followed by a diacritic).

ކަނޑު

ކަޑު

Obsolete consonant

ޱ [U+07B1 THAANA LETTER NAA] was abolished from Maldivian official documents around 1953, but it is still seen in reprints of old books like the Bodu Tartheebu, and is used by the people of Addu Atoll and Fuvahmulah.

ޱ

Vowel absence

ް

The diacritic ◌ް [U+07B0 THAANA SUKUN] indicates that there is no vowel following the consonant it sits on. This is always used, with one exception: when ނ [U+0782 THAANA LETTER NOONU] is written with no diacritic, this indicates prenasalisation of a following stop, eg. ކަނޑު. It is the only case where a letter can appear without a diacritic.
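
The rule above lends itself to a simple check. The following Python sketch flags letters that have no diacritic, while allowing a bare NOONU that is directly followed by another letter. It is an illustration only: the function names are hypothetical, and the check does not verify that the letter after NOONU is actually a stop.

```python
NOONU = "\u0782"  # THAANA LETTER NOONU

def is_letter(ch):
    """True for Thaana consonant letters (U+0780..U+07A5 and U+07B1)."""
    return "\u0780" <= ch <= "\u07A5" or ch == "\u07B1"

def is_mark(ch):
    """True for the ten vowel marks and sukun (U+07A6..U+07B0)."""
    return "\u07A6" <= ch <= "\u07B0"

def unexpected_bare_letters(word):
    """Return indexes of letters lacking a diacritic, ignoring prenasalising NOONU."""
    found = []
    for i, ch in enumerate(word):
        if not is_letter(ch):
            continue
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if is_mark(nxt):
            continue  # the letter carries a vowel mark or sukun
        if ch == NOONU and is_letter(nxt):
            continue  # bare NOONU before another letter: prenasalisation
        found.append(i)
    return found

print(unexpected_bare_letters("\u0786\u07A6\u0782\u0791\u07AA"))  # ކަނޑު
```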

Onset consonants

tbd

Final consonants

tbd

Consonant clusters

The diacritic ◌ް [U+07B0 THAANA SUKUN] attached to a consonant signifies that it is not followed by a vowel.

Gemination

Gemination is common in Dhivehi. It is mostly written using އް [U+0787 THAANA LETTER ALIFU + U+07B0 THAANA SUKUN] before the doubled consonant.

ބައްޓެއް

However, if the consonant in question is a nasal, it is preceded instead by ން [U+0782 THAANA LETTER NOONU + U+07B0 THAANA SUKUN]. [d,566]

އެންމެ

Gemination is also produced after ށް [U+0781 THAANA LETTER SHAVIYANI + U+07B0 THAANA SUKUN] and ތް [U+078C THAANA LETTER THAA + U+07B0 THAANA SUKUN], except that the latter also adds j before the doubled consonant.

އަށްޑިހަ

އަތްޕުޅު
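
Because އް signals a glottal stop at the end of a word but gemination before another consonant, its role can be read from its position. The sketch below covers only that one spelling (not ން, ށް or ތް) and assumes the input is a single word with no surrounding spaces or punctuation. The function name alifu_sukun_roles is hypothetical.

```python
ALIFU_SUKUN = "\u0787\u07B0"  # THAANA LETTER ALIFU + THAANA SUKUN

def alifu_sukun_roles(word):
    """Return (index, role) for every ALIFU + SUKUN sequence in a single word."""
    roles = []
    start = word.find(ALIFU_SUKUN)
    while start != -1:
        end = start + len(ALIFU_SUKUN)
        role = "glottal stop" if end == len(word) else "gemination"
        roles.append((start, role))
        start = word.find(ALIFU_SUKUN, end)
    return roles

# ބައްޓެއް contains both uses: gemination of the following consonant, then a final glottal stop.
print(alifu_sukun_roles("\u0784\u07A6\u0787\u07B0\u0793\u07AC\u0787\u07B0"))
```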

Consonant sounds to characters

This section maps Dhivehi consonant sounds to common graphemes in the Thaana orthography.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, Sanskrit, etc.

Stops

Affricates

Fricatives

Nasals

 

ނ [U+0782 THAANA LETTER NOONU] indicates prenasalisation when it occurs without a diacritic.

Other

Numbers, dates, currency, etc.

Either European digits or Arabic-Indic digits are used.

See type samples.

Text direction

Thaana script is written horizontally and right-to-left in the main, but as with most RTL scripts, numbers and embedded LTR script text are written left-to-right (producing 'bidirectional' text).

7.2.2 ބައްތެލި
The section number in this example is read LTR, ie 7.2.2, but the Thaana word following is RTL.
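
The mixed direction in this example follows from the Unicode bidirectional classes of the characters involved. As a small illustration, the Python sketch below prints the class of each character in the example, using the standard unicodedata module; the helper name bidi_classes is hypothetical.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode Bidi_Class value of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

# "7.2.2 " followed by ބައްތެލި
sample = "7.2.2 \u0784\u07A6\u0787\u07B0\u078C\u07AC\u078D\u07A8"
for ch, cls in zip(sample, bidi_classes(sample)):
    print(f"U+{ord(ch):04X}", cls)
```

The digits report EN (European number), while the Thaana letters report AL and their diacritics NSM, which is why the section number keeps its left-to-right order inside a right-to-left line.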

Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Thaana character app.

The Thaana script is not cursive, and involves no significant context-based shaping or positioning.

The script is unicameral and needs no transforms to convert between code points.

Font styling & weight

tbd

Graphemes

Dhivehi graphemes are straightforward, and can be mapped to Unicode grapheme clusters.

Grapheme clusters

Base Combining_mark?

Nearly all base letters in Dhivehi are accompanied by a combining mark to indicate a vowel (or absence of a vowel). There is only one mark per base. All such sequences conform to Unicode grapheme clusters.

ލަކްޝަދީބު
އެކެއް
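
Since each cluster is a base optionally followed by one combining mark, a very small routine is enough to split Dhivehi text of this shape into clusters. The sketch below is a simplification, not an implementation of the full Unicode grapheme cluster rules: it just attaches any nonspacing mark (general category Mn) to the preceding character. The function name clusters is hypothetical.

```python
import unicodedata

def clusters(text):
    """Group each base character with the nonspacing marks that follow it."""
    out = []
    for ch in text:
        if out and unicodedata.category(ch) == "Mn":
            out[-1] += ch  # attach the mark to the preceding base
        else:
            out.append(ch)  # start a new cluster
    return out

# ލަކްޝަދީބު: ten code points forming five clusters
word = "\u078D\u07A6\u0786\u07B0\u079D\u07A6\u078B\u07A9\u0784\u07AA"
print(len(word), len(clusters(word)))
```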

Punctuation & inline features

Word boundaries

Words are separated by spaces.

Phrase & section boundaries

،␣:␣؛␣.␣?␣!

Dhivehi punctuation uses a mixture of western and Arabic punctuation. The latter includes, in particular, ، [U+060C ARABIC COMMA] and ؛ [U+061B ARABIC SEMICOLON].

phrase

، [U+060C ARABIC COMMA]

؛ [U+061B ARABIC SEMICOLON]

: [U+003A COLON]

sentence

. [U+002E FULL STOP]

? [U+003F QUESTION MARK]

! [U+0021 EXCLAMATION MARK]

In the sample text at the top of the page Arabic commas and ASCII full stops are mixed together.

އުފަންވަނީ،
An excerpt from the sample text above showing the use of an Arabic comma.

See type samples.

Bracketed text

(␣)

Dhivehi commonly uses ASCII parentheses to insert parenthetical information into text.

standard
start: ( [U+0028 LEFT PARENTHESIS]
end: ) [U+0029 RIGHT PARENTHESIS]

See type samples.

Quotations & citations

tbd

Emphasis

tbd

Abbreviation, ellipsis & repetition

tbd

Inline notes & annotations

tbd

Other punctuation

tbd

Other inline text decoration

tbd

See type samples.

Line & paragraph layout

Line breaking & hyphenation

The primary break-point for line-wrapping is the inter-word space.

Line-edge rules

As in almost all writing systems, certain punctuation characters should not appear at the end or the start of a line. The Unicode line-break properties help applications decide whether a character should appear at the start or end of a line.

The following list gives examples of typical behaviours for some of the characters used in modern Dhivehi. Context may affect the behaviour of some of these and other characters.

  • “ ‘ (   should not be the last character on a line.
  • ” ’ ) . ، ؛ ! ? %   should not begin a new line.
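
The two rules in the list above can be expressed as a simple check on a candidate line. This is a sketch under stated assumptions, not an implementation of the Unicode line-breaking algorithm: the character sets are copied from the list, and the function name line_edges_ok is hypothetical.

```python
# Characters taken from the list above.
NO_LINE_END = set("\u201C\u2018(")                      # “ ‘ (
NO_LINE_START = set("\u201D\u2019).\u060C\u061B!?%")    # ” ’ ) . ، ؛ ! ? %

def line_edges_ok(line):
    """True if the line neither starts nor ends with a prohibited character."""
    stripped = line.strip()
    if not stripped:
        return True  # an empty line has no edge characters to check
    return stripped[0] not in NO_LINE_START and stripped[-1] not in NO_LINE_END

print(line_edges_ok("\u0787\u07AA\u060C"))   # ends with an Arabic comma: allowed
print(line_edges_ok("\u060C \u0787\u07AA"))  # starts with an Arabic comma: not allowed
```

A layout engine would normally apply such a check when choosing among inter-word break opportunities, moving the break earlier or later if an edge character is prohibited.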

Text alignment & justification

tbd

Justification of Dhivehi text appears to be common, and a common approach involves stretching inter-word spacing.

See type samples.

Text spacing

tbd

This section looks at ways in which spacing is applied between characters over and above that which is introduced during justification.

Baselines, line height, etc.

Dhivehi uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

Dhivehi places vowel marks above and below base characters, but there is only ever one mark per base.

To give an approximate idea, the figure below compares Latin and Dhivehi glyphs from a Noto font. The basic height of Dhivehi letters is typically below the Latin x-height; however, extenders for some characters push below the baseline, and the combining marks can reach beyond the Latin ascenders and descenders, creating a need for slightly larger line spacing.

Hhqxއިރުޗޯއާއުފަންވަނީޓީ،2
Font metrics for Latin text compared with Dhivehi glyphs in the Noto Serif Thaana font.

The next figure shows similar comparisons for the MV Boli font.

Hhqxއިރުޗޯއާއުފަންވަނީޓީ،2
Latin font metrics compared with Dhivehi glyphs in the MV Boli font.

Counters, lists, etc.

tbd

See type samples.

Styling initials

tbd

Page & book layout

This section is for any features that are specific to Thaana and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

General page layout & progression

tbd

See type samples.

Grids & tables

tbd

Notes, footnotes, etc

tbd

Forms & user interaction

tbd

Page numbering, running headers, etc

tbd

References