Bangla/Bengali

Updated 29 November, 2021

This page brings together basic information about the Bengali script and its use for the Bangla language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Bengali using Unicode.

Phonetic transcriptions on this page should be treated as an approximate guide only. Many are more phonemic than phonetic, and there may be variations depending on the source of the transcription.

Sample (Bangla)

ধারা ১ সমস্ত মানুষ স্বাধীনভাবে সমান মর্যাদা এবং অধিকার নিয়ে জন্মগ্রহণ করে। তাঁদের বিবেক এবং বুদ্ধি আছে; সুতরাং সকলেরই একে অপরের প্রতি ভ্রাতৃত্বসুলভ মনোভাব নিয়ে আচরণ করা উচিত।

ধারা ২ এ ঘোষণায় উল্লেখিত স্বাধীনতা এবং অধিকারসমূহে গোত্র, ধর্ম, বর্ণ, শিক্ষা, ভাষা, রাজনৈতিক বা অন্যবিধ মতামত, জাতীয় বা সামাজিক উত্‍পত্তি, জন্ম, সম্পত্তি বা অন্য কোন মর্যাদা নির্বিশেষে প্রত্যেকের‌ই সমান অধিকার থাকবে। কোন দেশ বা ভূখণ্ডের রাজনৈতিক, সীমানাগত বা আন্তর্জাতিক মর্যাদার ভিত্তিতে তার কোন অধিবাসীর প্রতি কোনরূপ বৈষম্য করা হবেনা; সে দেশ বা ভূখণ্ড স্বাধীন‌ই হোক, হোক অছিভূক্ত, অস্বায়ত্বশাসিত কিংবা সার্বভৌমত্বের অন্য কোন সীমাবদ্ধতায় বিরাজমান।

Usage & history

The Bengali or Bangla script is used by over 180 million people in Bangladesh and India to write the Bengali language, and a number of other Indian languages including Sylheti, Meithei, Bishnupriya Manipuri, and, with one or two modifications, Assamese. It has historically been used to write Sanskrit within Bengal. It ranks 5th in the world for writing system usage.

বাংলা বর্ণমালা bɑŋ̽lɑ br͓n̈mɑlɑ (bangla bôrnômala) Bengali/Bangla alphabet বাংলা লিপি bɑŋ̽lɑ lipi (bangla lipi) Bangla script

The script is descended from Brahmi, but there is some dispute about its derivation, since it shares shapes with both Dravidian and Aryan scripts.

Sources: Scriptsource, Wikipedia

Basic features

The Bengali script is an abugida. Consonants carry an inherent vowel which can be modified by appending vowel-signs to the consonant. See the table to the right for a brief overview of features for Bangla.

The orthographic letters of the Bengali script are derived from Sanskrit, and in some cases don't quite fit the needs of modern Bangla (eg. lack of simple vowels for the sounds ɛ and æ, letters for only 2 of many diphthongs, long and short letters where pronunciation no longer distinguishes those sounds, etc.)

Bengali runs left to right in horizontal lines.

Words are separated by spaces.

The 33 consonant letters used for Bangla are supplemented by repertoire extensions for 3 more sounds by applying the nukta diacritic to characters.

Consonant clusters at any location are normally indicated using the virama between consonants. This results in a large number of conjunct forms expressed using stacked consonants, conjoined consonants, and ligated glyphs. Conjuncts often have different pronunciations than might be expected from the letters involved, and in particular gemination is very common. Occasionally, a visible virama is used. However, clusters are often not marked at all.

As part of a cluster, RA has special forms, for both cluster-initial and post-base positions.

Word-final consonant sounds may be represented by a special letter, ৎ, or by 2 dedicated combining marks (anusvara & visarga), but are generally ordinary consonants that are not marked by a virama.

Vowel harmony plays a significant role in the pronunciation of vowel-related code points.

The Bangla orthography has 2 inherent vowels, and represents other vowels using 9 vowel-signs, including 3 prescripts and 2 circumgraphs. All vowel-signs are combining marks, and are stored after the base character. Other vowels are represented by adaptations of the y consonant. The final sound of numerous diphthongs is represented using independent vowels.

There are 10 independent vowels, one for each vowel sound, including the inherent vowel, and these are used to write all standalone vowel sounds.

In principle there are no composite vowels; however, the 2 circumgraphs can each be decomposed into 2 parts.

Vowels may be nasalised, using the candrabindu diacritic.

Bengali has native digit shapes.

Character index

Letters

Basic consonants

প␣ব␣ফ␣ভ␣ত␣দ␣থ␣ধ␣ট␣ড␣ঠ␣ঢ␣ক␣গ␣খ␣ঘ␣চ␣ছ␣জ␣ঝ␣য␣শ␣ষ␣স␣হ␣ম␣ন␣ণ␣ঞ␣ঙ␣র␣ল

Extended consonants

ড়␣ঢ়␣য়

Independent vowels

ই␣ঈ␣উ␣ঊ␣এ␣ও␣অ␣আ␣ঐ␣ঔ

Vocalic

Other

ৎ␣ঽ␣ʼ

Combining marks

Vowel-signs

ি␣ে␣ৈ␣ো␣ৌ␣ী␣ু␣ূ␣া

Vocalic

Other

ঁ␣ং␣ঃ␣়␣্␣ৗ

Numbers

০␣১␣২␣৩␣৪␣৫␣৬␣৭␣৮␣৯

Not used for Bangla

৴␣৵␣৶␣৷␣৸␣৹

Punctuation

।␣॥␣“␣”␣‘␣’

ASCII

!␣(␣)␣,␣.␣:␣;␣?

Possible

৽␣†␣‡␣′␣″␣–␣—␣…

Symbols

৳␣৺

Not used for Bangla

৲␣৻

Other

‍␣‌

Phonology

Phones in a lighter colour are non-native or allophones. Source Wikipedia.

Vowel sounds

Plain vowels

i ĩ u ũ ʊ e ẽ o õ ɛ ɛ̃ ɔ ɔ̃ æ a ã

Diphthongs

There are a large number of diphthongs in Bangla, and the chart below shows an incomplete set (source: Wikipedia).

ii̯ iu̯ ui̯ ei̯ eu̯ oi̯ ou̯ oe̯ oo̯ ɛe̯ ɔe̯ ɔo̯ æe̯ æo̯ ai̯ au̯ ae̯ ao̯

Vowel harmony

The pronunciation of a vowel can be affected by the vowel in the following syllable. Radice provides the following table, though this is a simplification and there are many exceptions.

Followed by i or u     Followed by ɔ, o, e or a
o → u                  o → ɔ
ɔ → o                  u → o
e → i                  e → æ
æ → e                  i → e

For example, the verb শোনা ʃonɑ to hear with an i ending becomes ʃuni, দেখা dækʰa to see becomes dekʰi, etc. This sometimes accounts for the pronunciation of the inherent vowel, eg. অতিথি otitʰi guest and অনুবাদ onubad translation start with o rather than ɔ.
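Radice's simplified table can be read as a pair of lookups keyed on the vowel of the following syllable. The Python sketch below is only an illustration of that reading: the function name `harmonise` and the dictionary layout are assumptions made here, and none of the many exceptions mentioned above are modelled.

```python
# Vowel-harmony shifts taken from the simplified table above.
# Keys are the underlying vowel, values the vowel actually pronounced.
BEFORE_I_OR_U = {"o": "u", "ɔ": "o", "e": "i", "æ": "e"}
BEFORE_OTHER = {"o": "ɔ", "u": "o", "e": "æ", "i": "e"}

HIGH_TRIGGERS = {"i", "u"}             # first column of the table
OTHER_TRIGGERS = {"ɔ", "o", "e", "a"}  # second column of the table


def harmonise(vowel: str, next_vowel: str) -> str:
    """Return `vowel` as shifted by the vowel of the following syllable.

    Vowels or triggers not listed in the table are returned unchanged.
    """
    if next_vowel in HIGH_TRIGGERS:
        return BEFORE_I_OR_U.get(vowel, vowel)
    if next_vowel in OTHER_TRIGGERS:
        return BEFORE_OTHER.get(vowel, vowel)
    return vowel
```

With these tables, `harmonise("o", "i")` gives `u` (as in ʃuni) and `harmonise("æ", "i")` gives `e` (as in dekʰi).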

Consonant sounds

             labial   dental   alveolar   post-alveolar    retroflex    palatal   velar     glottal
stop         p b pf   t d                                  ʈ ɖ ʈʰ ɖʰ              k ɡ ɡʰ
affricate                                 t͡ʃ d͡ʒ t͡ʃʰ d͡ʒʰ
fricative    f                 s z        ʃ                                                 ɦ h
nasal        m                 n                                                  ŋ
approximant  w                 l                                        j
trill/flap                     r ɾ                         ɽ ɽʰ

pʰ, pf, and f are alternative pronunciations for the same phoneme, depending on where the speaker is from, and all are written using ফ [U+09AB BENGALI LETTER PHA].

True retroflex (murdhonno) consonants are not found in Bengali. They are apical postalveolar in western dialects. In other dialects, they are fronted to apical alveolar (source: Wikipedia).

r occurs word-initially, whereas ɾ occurs medially and finally. Both sounds are written using র [U+09B0 BENGALI LETTER RA] (source: Wikipedia).

s and ʃ are often merged. z is found mainly in foreign words (source: Wikipedia).

In the Bangla spoken in Dhaka, ɾ and ɽ are often indistinct phonemically (source: Wikipedia), eg. the following two words can be homophonous: করা কড়া

j and w are pronunciations of য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA] when it appears between certain vowels.

Structure

The effective unit of the Bengali writing system is the orthographic syllable.

An orthographic syllable can be defined as one of the code point sequences described below. Lowercase letters represent combining characters. Some vowel signs may be displayed at the start of the sequence, although the code points representing them always appear after the base consonant.

Consonant-based syllables

[C[n]h] [C[n]h] C[n] [h | v (n)] [f]

Legend
C: Consonant.
Cn: Consonant followed by nukta.
h: Hasant.
v: Vowel-sign.
n: Nasalisation diacritic (candrabindu).
f: Final consonant (one of khanda ta, anusvara, or visarga).

The core of a consonant-based syllable is a base consonant character, which may or may not additionally represent an inherent vowel if it stands alone.

There is no inherent vowel if it is followed by a vowel-sign, eg. কী কি ki কো ko, or by a hasant, eg. ক্. At the end of a word, there may or may not be an inherent vowel, even if there is no hasant.

Any base consonant may be a combination of consonant code point plus nukta.

The base consonant can be preceded by up to two consonant+hasant pairs (where the consonant may also be a combination of consonant+nukta), but only if those consonants form conjuncts (ie. the hasant is invisible), eg. ক্ক k͓k ম্প m͓p ক্ষ k͓ʃ̇ ন্ত্র n͓t͓r If the preceding consonants carry visible hasant symbols, those are treated as separate orthographic syllables.

Likewise, the variable use of the hasant in Bengali means that a phonetic cluster of consonants can constitute a larger series of orthographic syllables. For example, this word for cymbal has two phonetic syllables, but 3 orthographic since the rt combination is not combined: করতাল

A vowel-sign may optionally be followed by a nasalisation diacritic.

Unless the base consonant is followed by a hasant, the syllable may be terminated by a final consonant represented by khanda ta, anusvara, or visarga.
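The pattern for consonant-based syllables translates fairly directly into a regular expression. The following Python sketch is one possible reading, not a definitive segmenter: the character classes are approximations chosen here (consonants as U+0995..U+09B9 plus the three precomposed nukta letters, vowel-signs as U+09BE..U+09CC plus U+09D7), the name `orthographic_syllables` is hypothetical, and vowel-based syllables, two-part decomposed vowel-signs and the ZWJ cases are deliberately left out.

```python
import re

C = "[\u0995-\u09B9\u09DC\u09DD\u09DF]"  # consonant letters (approximate range)
N = "\u09BC"                             # nukta
H = "\u09CD"                             # hasant (virama)
V = "[\u09BE-\u09CC\u09D7]"              # vowel-signs (approximate range)
CB = "\u0981"                            # candrabindu
F = "[\u09CE\u0982\u0983]"               # khanda ta, anusvara, visarga

# Up to two consonant+hasant pairs, a base consonant, then either a hasant
# or an optional vowel-sign (with optional candrabindu) and optional final.
SYLLABLE = re.compile(
    f"(?:{C}{N}?{H}){{0,2}}{C}{N}?(?:{H}|(?:{V}{CB}?)?{F}?)"
)


def orthographic_syllables(text: str) -> list[str]:
    """Return the consonant-based orthographic syllables found in `text`."""
    return [m.group(0) for m in SYLLABLE.finditer(text)]
```

Because a ZWNJ after the hasant interrupts the consonant+hasant+consonant run, a consonant with a visible hasant comes out as its own syllable, which matches the description above.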

Special cases

In some cases, a RA + hasant followed by YA may introduce a ZWJ character before the hasant, in order to specify special shaping rules for the YA. Compare র্য r͓ʲ র‍্য r‍ᶻʷʲ͓ʲ

The base consonant can be followed by either one or two code points representing vowel-signs, eg. compare কো ko কো keɑ Multiple vowel sequences only occur in decomposed text.

A base consonant may be followed by ZWNJ + vowel code point where the author wants to prevent ligation of the following vowel sign, eg. শ‌ু ʃᶻʷⁿʲu

Conjunct positions

Native Bengali words do not allow initial consonant clusters, and word-final clusters are rare. However, words borrowed from Sanskrit, English, etc. have introduced many such features.

Many Bengali speakers, however, retain the native phonology, even when using Sanskrit or English borrowings, such as গেরাম gerɑm (CV.CVC) for গ্রাম g͓rɑm village (CCVC), or ইস্কুল ịʃ͓̈kul (VC.CVC) for স্কুল ʃ͓̈kul school (source: Wikipedia, Consonant clusters).

Most word-final clusters were introduced from English, eg. লিফ্ট lipʰ͓ʈ lift, elevator or ব্যাংক b͓ʲɑŋ̽k bank. In some dialects, a final nasal+stop is written as a cluster, whereas in standard Bengali it would use nasalisation, eg. চান্দ cɑn͓d vs. চাঁদ cɑm̽d moon (source: Wikipedia, Consonant clusters).

For more information, see Wikipedia, Consonant clusters.

Vowel-based syllables

Vowel-based syllables begin with a standalone vowel, which is represented by a single independent vowel or vocalic.

An independent vowel may be followed by an anusvara, visarga or candrabindu (nasalisation), eg. উঃ, আঁ ụh̽, ɑm̽

Vowels

For a mapping of sounds to graphemes, see Vowel sounds mapped to characters.

Inherent vowel

The inherent vowel is typically transcribed as a, and pronounced as ɔ or o (and sometimes halfway between the two, when influenced by surrounding sounds). Bengalis are not always aware of these sound differences, thinking of this as one sound. So kɔ or ko are written by simply using the consonant letter ক [U+0995 BENGALI LETTER KA].

There is also a vowel-sign pronounced o. This can lead to inconsistent spellings, eg. bhalo, good, well, can be spelled either ভালো or ভাল. Verb forms tend to be particularly inconsistent, sometimes basing the rationale on what looks good in a particular context.

The rules for determining the sound of the inherent vowel are not simple. Partly it is a question of vowel harmony. The following two tendencies can help:

Vowel-signs

Non-inherent vowel sounds that follow a consonant are represented using vowel-signs, eg. ki is written কী [U+0995 BENGALI LETTER KA + U+09C0 BENGALI VOWEL SIGN II].

An orthography that uses vowel-signs is different from one that uses simple diacritics or letters for vowels, in that the vowel-signs are generally attached to the syllable, rather than just applied to the letter of the immediately preceding consonant (see Pre-base vowel-signs for an example).

Bengali vowel signs are nearly all combining characters. One consonant is also used in a special configuration described below. In principle a single character is used per base consonant, but 2 vowel-signs decompose to more than one character (see circumgraphs). All vowel-signs are typed and stored after the base consonant, whether or not they precede it when displayed. The glyph rendering system takes care of the positioning at display time.

Almost all of the vowel-signs are spacing combining characters, meaning that they consume horizontal space when added to a base consonant.

See also vocalics.

Combining marks used for vowels

Bangla uses the following dedicated combining marks for vowels.

ি␣ী␣ু␣ূ␣ে␣ো␣া␣ ␣ৈ␣ৌ

Bengali has lost the distinction between short and long vowels in pronunciation, but retains the difference in spelling.

The variation in pronunciation for the vowel-signs can often be explained by vowel harmony.

[U+09CB BENGALI VOWEL SIGN O] was originally pronounced ʊ, and that pronunciation sometimes persists alongside the o that came from Sanskrit, eg. নোংরা

Consonant used for vowels

্যা␣্য

When it occurs as the last member of a consonant cluster, য [U+09AF BENGALI LETTER YA] has a special shape, shown in the figure below, and is called ʤɔ-pfɔlɑ (য-ফলা). One of its functions is to create the sound æ.

হ্যাঁ
The word হ্যাঁ, which creates the sound æ with the sequence ্যা [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA + U+09BE BENGALI VOWEL SIGN AA].

Unusually for Indian scripts, y̌ɔ-phɔla can also be used after independent vowels to create the standalone sound æ. The sequences অ্যা [U+0985 BENGALI LETTER A + U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA + U+09BE BENGALI VOWEL SIGN AA] and এ্যা [U+098F BENGALI LETTER E + U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA + U+09BE BENGALI VOWEL SIGN AA] are used mostly in transliterations of borrowed words, eg. অ্যাটর্নি এ্যাডভোকেট

See also Composite vowels & diphthongs, which explains how independent vowels are used for the off-glide of diphthongs.

Pre-base vowel-signs

ি␣ে␣ ␣ৈ

Three vowel-signs appear to the left of the base consonant letter or cluster.

কি
Pre-base positioning of the vowel-sign in কি [U+0995 BENGALI LETTER KA + U+09BF BENGALI VOWEL SIGN I].

These combining marks are always stored after the base consonant. The font and rendering places the glyph before the base consonant.

Because vowel-signs are attached to the syllable, rather than a letter, a pre-base vowel glyph that represents a vowel sound after a consonant cluster is displayed before the whole cluster if that cluster is represented by a conjunct form, eg. compare ব্যক্তি বালতি

However, note that if the cluster is split by a visible virama, this creates two syllables and the pre-base vowel-sign appears after the consonant with the virama. In the examples below, the characters and code point orders are the same, apart from the addition of the ZWNJ in the second instance to force the virama to appear. প্লিজ প্‌লিজ
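The stored order is easy to inspect programmatically. As a small illustration (the helper name `code_points` is made up here, and the two strings are spelled out as escapes to make the ZWNJ visible), the Python sketch below lists the code points behind the two spellings and confirms that they differ only by the ZWNJ.

```python
import unicodedata


def code_points(text: str) -> list[str]:
    """Describe each code point as 'U+XXXX NAME', in stored order."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


ZWNJ = "\u200C"
conjunct = "\u09AA\u09CD\u09B2\u09BF\u099C"             # PA, VIRAMA, LA, I, JA
visible = "\u09AA\u09CD" + ZWNJ + "\u09B2\u09BF\u099C"  # same, ZWNJ after VIRAMA

# The vowel-sign I is stored after LA in both spellings, although it is
# displayed to the left of the conjunct or of the LA.
for line in code_points(visible):
    print(line)

# Removing the ZWNJ recovers the conjunct spelling exactly.
print(visible.replace(ZWNJ, "") == conjunct)
```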

Circumgraphs

Two vowels are represented by circumgraphs, producing glyphs on opposite sides of the consonant onset.

ো␣ ␣ৌ
কৌ
The single code point representing the vowel-sign in কৌ [U+0995 BENGALI LETTER KA + U+09CC BENGALI VOWEL SIGN AU] is rendered on two sides of the base character.

Composite vowels & diphthongs

Composite vowels in Bengali may occur when ো [U+09CB BENGALI VOWEL SIGN O] and ৌ [U+09CC BENGALI VOWEL SIGN AU] are decomposed; however, this is not common. See Encoding choices for more details.

Although 2 of the vowel-signs (and independent vowels) represent diphthongs (oi̯ and ou̯) with a single code point, most of the many diphthongs are represented by a sequence of vowels,wa,#Vowels eg. কেউ

Diphthongs typically represent the off-glide using one of the following:

The following are examples of diphthongs. Hyphens indicate a consonant with inherent vowel. The detailed character notes contain examples of Bangla words containing the diphthongs shown.

িই␣ুই␣েই␣ৈ␣-ই␣াই␣ ␣িউ␣েউ␣ৌ␣াউ␣ ␣-য়␣ায়␣ ␣-ও␣াও

Standalone vowels

ই␣ঈ␣উ␣ঊ␣এ␣ও␣অ␣আ␣ ␣ঐ␣ঔ

Bengali represents syllable-initial vowels using a set of independent vowel letters, eg. ওস্তাদ উট ঔষুধ

Vowel elongation

[U+09BD BENGALI SIGN AVAGRAHA] is a Sanskrit-derived symbol that is used in modern Bengali to lengthen vowel soundsws, eg. কিঽঽঽ? kiiii Whaaatt? শুনঽঽঽ ʃunooo Listennn

Nasalisation

[U+0981 BENGALI SIGN CANDRABINDU] nasalises the vowel in a syllable, eg. হ্যাঁ হাঁপান Nasalised vowels include ĩ ũ ẽ õ ɛ̃ ɔ̃ ã.

The candrabindu should be placed over the top of an independent vowel, but over the base consonant when a vowel sign is attached, not over the vowel sign. In the code point sequence, however, it should occur after any combining vowel sign associated with the same syllable. Note how the base consonant is identified correctly in the second word of the figure below, even though the candrabindu is 4 code points away. Some fonts do not position the candrabindu correctly.

হাঁপান হ্যাঁ
The candrabindu is positioned over the base consonant, even though it is the last code point in the syllable. (The arrow gives the approximate location of the code point. Click to see the code point sequence.)

Vowel ligatures

Sometimes vowel signs (particularly u) form ligatures with a preceding base consonant. The figure below shows ligated (top) and non-ligated (bottom) forms for several combinations. In certain contexts it may be less appropriate to ligate (eg. newspapers and modern typefaces). Both forms are equivalent in every way but visually.

রু র‌ু রূ র‌ূ হৃ হ‌ৃ হু হ‌ু ন্তু ন্ত‌ু শু শ‌ু গু গ‌ু
Ligated (top) and non-ligated (bottom) forms for several combinations of consonant+vowel.

The default behaviour of a given font can be modified using the zero-width non-joiner character in Unicode content. For example, a font that produces the ligature গু gu can be made to show the simpler form গ‌ু by the sequence গ + ZWNJ + ু [U+0997 BENGALI LETTER GA + U+200C ZERO WIDTH NON-JOINER + U+09C1 BENGALI VOWEL SIGN U].
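As a rough illustration of that sequence, the Python sketch below inserts a ZWNJ between a consonant and a following vowel-sign U, UU or VOCALIC R. The function name `unligate` and the character ranges are assumptions made for this example; consonants carrying a nukta are not handled, and whether the ZWNJ actually changes the rendering depends on the font.

```python
import re

ZWNJ = "\u200C"

# A consonant letter (approximate range U+0995..U+09B9) directly followed by
# one of the vowel-signs that commonly ligate: U, UU or VOCALIC R.
CONSONANT_THEN_SIGN = re.compile("([\u0995-\u09B9])([\u09C1-\u09C3])")


def unligate(text: str) -> str:
    """Request the non-ligated form by inserting ZWNJ before the vowel-sign."""
    return CONSONANT_THEN_SIGN.sub(
        lambda m: m.group(1) + ZWNJ + m.group(2), text
    )
```

Running the function twice gives the same result as running it once, because a vowel-sign that already follows a ZWNJ no longer matches the pattern.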

See a matrix of consonants followed by vowel-signs for Bengali. A few clusters that are often ligated are pre-highlighted.

Consonants with no following vowel

Bengali uses [U+09CD BENGALI SIGN VIRAMA] (called হসন্ত hʃ̈n͓t (hasant) hɔsonto in Bengali) to indicate that the inherent vowel is not pronounced after a consonant, eg. the following explicitly represents just the sound k: ক্

The hasant is rarely seen. It is not used at the end of a word even though the inherent vowel is pronounced at the end of some words and not others, eg. গরম gɔrôm, hot vs. গড়ান gɔɽɑnô, to roll. There is no real way to tell when it is pronounced and when not in this position, except that it is usually pronounced following a word-final consonant cluster.

Within a word also, some clusters don't use the hasant in Bengali, and the reader simply has to know that the inherent vowel is not pronounced, eg. করতাল

This is particularly common at morpheme boundaries, for example in verb forms.

Consonant clusters that are represented by conjunct forms use the hasant between consonants to invoke the shape changes. If the font has the glyphs needed to produce the conjunct, the hasant is hidden (see Consonant clusters).

Refs: Radice 3, 7-8, 21, 148; Daniels 400

Encoding choices

Bengali is a script where different sequences of Unicode characters may produce the same visual result. Here we look at those related to vowels.

The 2 circumgraphs can be written as a single character, or as two characters.

Precomposed                           Decomposed
ো [U+09CB BENGALI VOWEL SIGN O]       ো [U+09C7 BENGALI VOWEL SIGN E + U+09BE BENGALI VOWEL SIGN AA]
ৌ [U+09CC BENGALI VOWEL SIGN AU]      ৌ [U+09C7 BENGALI VOWEL SIGN E + U+09D7 BENGALI AU LENGTH MARK]

The single code point per vowel-sign is the form preferred by the Unicode Standard and the form in common use for Bengali. The parts are separated, however, in Unicode Normalisation Form D (NFD), and recomposed in Unicode Normalisation Form C (NFC), so both approaches are canonically equivalent.

Whichever approach is used, the vowel-signs must be typed and stored after the consonant characters they surround. In the case of decomposed vowel-signs, the order is also important and must be as shown above.
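The canonical equivalence of the two spellings can be checked with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `same_text` is invented for the example.

```python
import unicodedata

KA = "\u0995"
O_SIGN = "\u09CB"            # precomposed vowel-sign O
O_PARTS = "\u09C7\u09BE"     # vowel-sign E + vowel-sign AA
AU_SIGN = "\u09CC"           # precomposed vowel-sign AU
AU_PARTS = "\u09C7\u09D7"    # vowel-sign E + AU length mark


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalising both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# NFD separates the parts; NFC puts them back together.
print(unicodedata.normalize("NFD", KA + O_SIGN) == KA + O_PARTS)
print(unicodedata.normalize("NFC", KA + AU_PARTS) == KA + AU_SIGN)
print(same_text(KA + O_SIGN, KA + O_PARTS))
```

Comparing or searching text after normalising both sides to the same form avoids treating the two spellings as different words.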

Vowel sounds mapped to characters

This section maps Bengali vowel sounds to common graphemes, grouped by whether they are vowel-signs (vs) or standalone (s).

Hyphens indicate a consonant with inherent vowel.

Plain vowels

i
vs

ি [U+09BF BENGALI VOWEL SIGN I], eg. বিড়ি

[U+09C0 BENGALI VOWEL SIGN II], eg. বীর

[U+09C3 BENGALI VOWEL SIGN VOCALIC R] as part of the vocalic ri, eg. বৃহৎ

 
s

[U+0987 BENGALI LETTER I], eg. ইংরেজ

[U+0988 BENGALI LETTER II], eg. ঈদ

[U+098B BENGALI LETTER VOCALIC R] as part of the vocalic ri, eg. ঋতু

u
vs

[U+09C1 BENGALI VOWEL SIGN U], eg. বুক

[U+09C2 BENGALI VOWEL SIGN UU], eg. মূল

[U+09CB BENGALI VOWEL SIGN O] with vowel harmony before one of i u, eg. কোকিল

 
s

[U+0989 BENGALI LETTER U], eg. উঁচু

[U+098A BENGALI LETTER UU]

[U+0993 BENGALI LETTER O] with vowel harmony before one of i u, eg. ওদিকে

e
vs

[U+09C7 BENGALI VOWEL SIGN E], with vowel harmony before one of i u, eg. বেগুন.

ি [U+09BF BENGALI VOWEL SIGN I] with vowel harmony before one of ɔ o e a, eg. বিড়াল.

্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA] with the inherent vowel before i, eg. ব্যক্তি.

 
s

[U+098F BENGALI LETTER E], with vowel harmony before one of i u, eg. একটু.

[U+0987 BENGALI LETTER I] with vowel harmony before one of ɔ o e a.

vs

য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA] as part of a diphthong, esp after ɔ a o, eg. ভয়.

o
vs

Inherent vowel

[U+09CB BENGALI VOWEL SIGN O], eg. বোন.

[U+09C1 BENGALI VOWEL SIGN U] with vowel harmony before one of ɔ o e a, eg. বুড়ো.

 
s

[U+0993 BENGALI LETTER O]

[U+0985 BENGALI LETTER A] with vowel harmony before one of i u, eg. অভিধান.

[U+0989 BENGALI LETTER U], with vowel harmony before one of ɔ o e a, eg. উচ্চারন.

ɔ
vs

Inherent vowel

[U+09CB BENGALI VOWEL SIGN O] with vowel harmony before one of ɔ o e a, eg. বোকা

 
s

[U+0985 BENGALI LETTER A] , eg. অঙ্ক.

[U+0993 BENGALI LETTER O] with vowel harmony before one of ɔ o e a, eg. ওড়া.

æ
vs

[U+09BE BENGALI VOWEL SIGN AA] after জ্ঞ [U+099C BENGALI LETTER JA + U+09CD BENGALI SIGN VIRAMA + U+099E BENGALI LETTER NYA], eg. জ্ঞান;

্যা [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA + U+09BE BENGALI VOWEL SIGN AA] sometimes, eg. ব্যাঙ্ক.

্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA] with the inherent vowel and not followed by i, eg. ব্যথা.

[U+09C7 BENGALI VOWEL SIGN E], with vowel harmony before one of ɔ o e a, eg. বেলা.

 
s

[U+098F BENGALI LETTER E] with vowel harmony before one of ɔ o e a, eg. একবার.

a
vs

[U+09BE BENGALI VOWEL SIGN AA], eg. কাটা.

 
s

[U+0986 BENGALI LETTER AA], eg. আকাশ.

Diphthongs and other combinations

ii̯
o

িই [U+09BF BENGALI VOWEL SIGN I + U+0987 BENGALI LETTER I], eg. পাখিই.

iu̯
o

িউ [U+09BF BENGALI VOWEL SIGN I + U+0989 BENGALI LETTER U], eg. পারফিউম.

ui̯
o

ুই [U+09C1 BENGALI VOWEL SIGN U + U+0987 BENGALI LETTER I], eg. বাবুই.

ei̯
o

েই [U+09C7 BENGALI VOWEL SIGN E + U+0987 BENGALI LETTER I], eg. পারেই.

eu̯
o

েউ [U+09C7 BENGALI VOWEL SIGN E + U+0989 BENGALI LETTER U], eg. যেকেউ.

oi̯
o

[U+09C8 BENGALI VOWEL SIGN AI], eg. তাথৈ.

-ই [U+0987 BENGALI LETTER I], eg. তেরই.

ou̯
vs

[U+09CC BENGALI VOWEL SIGN AU], eg. ধৌত

 
s

[U+0994 BENGALI LETTER AU], eg. ঔষুধ.

oo̯
o

-ও [U+0993 BENGALI LETTER O], eg. দুঃখও

ɔe̯
 o

-য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA], eg. ভয়

ɔo̯
o

-ও [U+0993 BENGALI LETTER O], eg. তওবার

ai̯
o

াই [U+09BE BENGALI VOWEL SIGN AA + U+0987 BENGALI LETTER I], eg. অনলাইন.

au̯
o

াউ [U+09BE BENGALI VOWEL SIGN AA + U+0989 BENGALI LETTER U], eg. একাউন্ট.

ao̯
o

াও [U+09BE BENGALI VOWEL SIGN AA + U+0993 BENGALI LETTER O], eg. এছাড়াও.

Vocalics

ঋ␣ৃ

Only one vocalic is in common use for modern Bangla. It is used in standalone and vowel-sign forms, eg. বৃহৎ ঋতু

Three more vocalics, [U+098C BENGALI LETTER VOCALIC L], [U+09E0 BENGALI LETTER VOCALIC RR] and [U+09E1 BENGALI LETTER VOCALIC LL], are historic and only used to write Sanskrit in Bengali.

ৠ␣ঌ␣ৡ␣ৄ␣ৢ␣ৣ

Consonants

For a mapping of sounds to graphemes see consonant_mappings.

Basic consonants

Stops

প␣ব␣ফ␣ভ␣ত␣দ␣থ␣ধ␣ট␣ড␣ঠ␣ঢ␣ক␣গ␣খ␣ঘ

Affricates

চ␣ছ␣জ␣ঝ␣য

Fricatives

শ␣ষ␣স␣হ

Nasals

ম␣ন␣ণ␣ঞ␣ঙ

Other sonorants

র␣ল

Khiyɔ

ক্ষ

ক্ষ [U+0995 BENGALI LETTER KA + U+09CD BENGALI SIGN VIRAMA + U+09B7 BENGALI LETTER SSA] is called khiyɔ and is often treated as a letter of the alphabet, in that some dictionaries give it its own section, eg. ক্ষুদ্র

Repertoire extension

[U+09BC BENGALI SIGN NUKTA] is used to create 3 additional letters, eg. the dot changes ড ɖ to ড় ɽ. Here is a list of graphemes that combine nukta with an existing consonant.

To reveal detailed notes about usage see the list of precomposed characters a little lower.

ড়␣ঢ়␣য়

য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA] represents j, w or , depending on what vowels occur alongside it. See the character notes for details.

The nukta should always be typed and stored immediately after the consonant it modifies, and before any combining vowels or diacritics.

The Unicode Standard recommends that content authors use decomposed sequences for these letters. However, the Unicode block also contains the precomposed code points shown below.

ড়␣ঢ়␣য়

Decomposed sequences are not recomposed by Unicode Normalisation Form C (NFC).
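This behaviour follows from the three precomposed letters being excluded from composition in Unicode normalisation, and it can be observed with Python's `unicodedata` module. The sketch below is illustrative only; the names `NUKTA_LETTERS` and `prefer_decomposed` are made up for the example.

```python
import unicodedata

# Precomposed code point -> recommended consonant + nukta sequence.
NUKTA_LETTERS = {
    "\u09DC": "\u09A1\u09BC",  # RRA  = DDA  + NUKTA
    "\u09DD": "\u09A2\u09BC",  # RHA  = DDHA + NUKTA
    "\u09DF": "\u09AF\u09BC",  # YYA  = YA   + NUKTA
}


def prefer_decomposed(text: str) -> str:
    """Normalise to NFC, which leaves these three letters decomposed."""
    return unicodedata.normalize("NFC", text)


for precomposed, sequence in NUKTA_LETTERS.items():
    # NFC does not recompose the sequence, and even splits the precomposed
    # code point, so both inputs end up as consonant + nukta.
    print(prefer_decomposed(precomposed) == sequence,
          prefer_decomposed(sequence) == sequence)
```

Note that NFC still recomposes other sequences, such as the two-part circumgraph vowel-signs.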

Assamese

Two more letters in the Unicode Bengali block are specifically for Assamese.

ৰ␣ৱ

Final consonants

One letter and 2 diacritics represent syllable-final consonant sounds.

ৎ␣ং␣ঃ

In a sequence of characters, these should all occur after any combining vowel sign associated with the same syllable. None carry vowel signs.

Khanda ta

ৎ [U+09CE BENGALI LETTER KHANDA TA], pronounced t, is a variant form of ত [U+09A4 BENGALI LETTER TA] that was added as a separate character in Unicode 4.1. In some words a ত that has no following inherent or other vowel may have this shape. It either comes at the end of words, or before a consonant that doesn't naturally combine with ত, eg. হঠাৎ উৎসব

This character replaces and obsoletes an earlier approach that required the use of the sequence ত্‍ [U+09A4 BENGALI LETTER TA + U+09CD BENGALI SIGN VIRAMA + U+200D ZERO WIDTH JOINER].
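Older content may still contain the obsolete sequence. A minimal Python sketch for migrating it is shown below; the function name `modernise_khanda_ta` is hypothetical, and the code assumes that every TA + VIRAMA + ZWJ sequence in the input was intended as khanda ta.

```python
LEGACY_KHANDA_TA = "\u09A4\u09CD\u200D"  # TA + VIRAMA + ZERO WIDTH JOINER
KHANDA_TA = "\u09CE"                     # BENGALI LETTER KHANDA TA


def modernise_khanda_ta(text: str) -> str:
    """Replace the legacy three-code-point sequence with U+09CE."""
    return text.replace(LEGACY_KHANDA_TA, KHANDA_TA)
```

A plain TA + VIRAMA without the joiner is left alone, since that spelling is used for ordinary conjuncts and for words written with a visible hasant.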

Many words, however, use [U+09A4 BENGALI LETTER TA] in the same situations, and it's not possible to guess which will be used for a given word, eg. হঠাত্

Anusvara

[U+0982 BENGALI SIGN ANUSVARA] is a final nasal ŋ, eg. বাংলা

Sometimes spelling is inconsistent, especially when this or [U+0999 BENGALI LETTER NGA] are used in a conjunct, eg. compare these pairs: সাঙঘাতিক সাংঘাতিক রঙ রং

However, in certain words the spelling is fixed. One such word is বাংলা. But, since the anusvara cannot carry vowel signs, the word for Bengali nation (rather than language) has to be spelled with ঙ [U+0999 BENGALI LETTER NGA], ie. বাঙালী bɑŋɑlī bɑŋgɑlī

See also the candrabindu diacritic, which nasalises a vowel.

Visarga

When used to represent a word-final consonant, [U+0983 BENGALI SIGN VISARGA] produces vigorous final aspiration ɦ, eg. বাঃ

It doesn't appear in many common words.

One of the other uses of the visarga is to lengthen a following consonant, in which case there is no aspiration, eg. নিঃশব্দ

Consonant clusters

Consonant clusters are written using:

  1. No special rendering. This is a common approach in Bengali.
  2. Conjuncts. There are a number of possibilities here.
    1. Fused vertically : Reduce the component shapes and combine them vertically, usually approximately within the normal character height.
    2. Conjoined : The two consonants sit side by side, but the first consonant has an altered shape.
    3. Ligated : A fusion of the component letter shapes where it may be difficult to tell them apart.
    4. Special forms : These apply to cluster-final YA (y̌ɔ-phɔla), the letter RA, and cluster-final MA.
  3. A visible virama below the non-final consonants in the cluster. In Bengali this may be a conscious decision, and not just a gap in font support.
  4. Final consonant letters or marks followed by another consonant. There is no interaction between the finals and the following character (see Final consonants).

See also consonant_length.

No special rendering

Unlike languages written in the Devanagari script, consonant clusters are often not represented as conjuncts in Bengali. It is necessary to just know that the vowel should not be pronounced, eg. রিকশা

Morphological boundaries, such as grammatical suffixes and endings, are typically written without conjuncts, eg. the present tense form khan plus the negative suffix na is written খাননা

The stem kôr from kôra plus present continuous ending chô is written করছ

Conjunct formation

See a table of 2-consonant clusters.
The table allows you to test results for various fonts.

To produce a conjunct, [U+09CD BENGALI SIGN VIRAMA] is added between the consonants in the cluster. There are exceptions, but this type of virama is usually not displayed, eg. the sequence ক + ্ + ষ [U+0995 BENGALI LETTER KA + U+09CD BENGALI SIGN VIRAMA + U+09B7 BENGALI LETTER SSA] produces ক্ষ k͓ʃ̇
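In code, the stored form of a conjunct is simply the consonants joined by the virama. The Python sketch below shows this; the helper name `conjunct` is an assumption for the example, and whether the result is drawn as a conjunct or with a visible virama still depends on the font.

```python
VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA (hasant)


def conjunct(*consonants: str) -> str:
    """Join consonants with a virama to request a conjunct form."""
    return VIRAMA.join(consonants)


KA, SSA = "\u0995", "\u09B7"
NA, TA, RA = "\u09A8", "\u09A4", "\u09B0"

kssa = conjunct(KA, SSA)       # KA + VIRAMA + SSA
ntra = conjunct(NA, TA, RA)    # NA + VIRAMA + TA + VIRAMA + RA
print([f"U+{ord(ch):04X}" for ch in kssa])
```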

The font usually determines how a cluster is rendered, although it is possible to influence this (see formatting).

Different fonts may combine the same letters in different ways. The following figure shows characters that are combined in different ways by different fonts.

ল্গ ল্প হ্ব জ্ঝ ষ্ক হ্ণ ঞ্ঝ
Conjuncts composed in different ways by the Noto Sans Bengali font (top) and Solaimon Lipi font (bottom). (Click for list of code points.)

Quite often, clustered consonants are pronounced differently than you would expect. In particular, conjuncts ending with [U+09AC BENGALI LETTER BA] or [U+09AE BENGALI LETTER MA] tend to not pronounce the latter, but double the length of the consonant before it (see consonant_length).

Nasals in conjuncts tend to conform to phonological patterns. Velar consonants (k, kh, g, etc) combine with ŋɔ, palatal consonants (c, ch, ..) combine with ñɔ, retroflex with ɳɔ, dental with nɔ, and labial with mɔ.

The sections below show examples of the various types of conjunct forms. The lists are not exhaustive. The shapes shown are by default those contained in the Noto Sans Bengali webfont. Other fonts may combine components in different ways. Click on the characters if you want to see the components.

Vertical conjuncts

Conjunct shapes are most commonly formed by arranging the components vertically, reducing and combining the shapes of the individual components as needed.

সথ→স্থ
sthô
লল→ল্ল
llô
Vertically fused conjunct forms.

The conjuncts in the figure above used in words: আস্থা ঝিল্লি

Examples of components arranged vertically.

ব্ল␣ক্ল␣ক্ব␣ঘ্ন␣ফ্ল␣ক্ক␣ক্ব␣গ্গ␣গ্ধ␣গ্ন␣গ্ল␣গ্ব␣হ্ল␣হ্ণ␣থ্ব␣ঞ্ছ␣ঞ্জ␣ঞ্ঝ␣ল্ক␣ল্গ␣ল্প␣ল্ব␣ল্ল␣ণ্ণ␣ণ্ম␣ণ্ব␣ণ্ড␣ন্ত␣ন্থ␣ন্ন␣ন্ব␣ন্ত্ব␣ম্ন␣দ্ভ␣দ্ব␣ত্ত␣ত্ত্ব␣ণ্ড␣ণ্ড্র␣ত্ন␣ত্ব␣ট্ট␣ক্ট␣প্প␣প্ন␣প্ত␣প্ল␣শ্ন␣শ্ল␣শ্ব␣ষ্ক␣স্ক␣স্খ␣স্ত␣স্থ␣স্ন␣স্ল␣স্ব␣স্ত্র␣ম্ন␣ম্ল␣ম্ভ␣ম্ব␣ঙ্ক

Conjoined conjuncts

Many conjuncts are formed by combining components horizontally.

মপ→ম্প
mpô
চ+চ→চ্চ
ccô
Conjoined conjunct forms.

The conjuncts in the figure above used in words: ক্যম্পাস উচ্চারণ

Other examples of components arranged side-by-side, frequently with simplification of the initial consonant.

ব্ব␣ব্দ␣ধ্ব␣ল্ট␣ল্ফ␣ণ্ট␣ণ্ঢ␣ণ্ঠ␣ন্দ␣ন্স␣ল্ড␣দ্দ␣ড্ড␣ঙ্খ␣ঙ্ম␣জ্জ␣জ্জ্ব␣জ্ব␣জ্ঝ␣চ্চ␣চ্ছ␣চ্ঞ␣ণ্ঠ␣প্ট␣প্স␣শ্চ␣শ্ছ␣ষ্প␣ষ্ফ␣স্ট␣স্প␣স্ফ␣ম্প␣ম্ফ␣হ্ব

Ligating conjuncts

A small set of conjuncts combine the consonants into a ligated shape, where individual components can't always be easily discerned.

ষট→ষ্ট
ṣṭô
কষ→ক্ষ
kṣô
Ligated conjunct forms.

The conjuncts in the figure above used in words: খ্রিষ্টান ক্ষণ

Examples of conjuncts arranged in a way that involves ligation, significantly altering one or more of the components.

ষ্ণ␣ব্ধ␣হ্ন␣ঞ্চ␣ত্থ␣ন্ধ␣ন্ধ্র␣ক্ষ্ম␣ক্স␣হ্ম␣ঙ্গ␣দ্ধ␣দ্ধ্ব␣ক্ত␣ঞ্ঝ␣ব্জ␣জ্ঞ␣ষ্ঠ␣ষ্ট␣ট্ট

Special forms

Cluster-initial RA [U+09B0 BENGALI LETTER RA] at the start of a cluster is displayed as a mark above the following consonant(s), eg. rt in গর্ত gɔrtô hole. Unlike Devanagari, it doesn't appear to be displayed above the vowel-sign of the orthographic syllable, eg. কুর্তা

Like other consonant clusters, the sound may also be written without a conjunct at all, eg. কারসাজি

Cluster-final RA A trailing [U+09B0 BENGALI LETTER RA] is displayed as a wavy line below the other consonants, eg. gr in গ্রাম

Examples of clusters with a trailing r.

ব্র␣ঘ্র␣ফ্র␣ধ্র␣খ্র␣গ্র␣থ্র␣হ্র␣ন্ত্র␣ন্দ্র␣ণ্ড্র␣ন্ধ্র␣ম্র␣স্র␣ষ্ক্র␣দ্র␣ক্র␣ত্রু␣ত্র␣ভ্র␣ড্র␣জ্র␣ভ্র␣চ্ছ্র␣ট্র␣প্র␣শ্র␣ষ্ট্র␣স্র␣স্ট্র␣স্প্র␣ম্ভ্র␣ম্র␣ম্প্র
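In the stored text the two special forms of RA differ only in where the virama sits: a cluster-initial RA is followed by the virama, and a cluster-final RA is preceded by it. The sketch below classifies each RA on that basis. The helper name `ra_roles` is hypothetical, and the check is deliberately simplified (it ignores joiners and a word-final RA with a visible virama).

```python
RA = "\u09B0"
VIRAMA = "\u09CD"


def ra_roles(text: str) -> list[tuple[int, str]]:
    """Return (index, role) for every RA that is adjacent to a virama."""
    roles = []
    for i, ch in enumerate(text):
        if ch != RA:
            continue
        if i > 0 and text[i - 1] == VIRAMA:
            roles.append((i, "cluster-final"))    # drawn below the cluster
        elif i + 1 < len(text) and text[i + 1] == VIRAMA:
            roles.append((i, "cluster-initial"))  # drawn as a mark above
    return roles


print(ra_roles("\u0997\u09B0\u09CD\u09A4"))        # গর্ত
print(ra_roles("\u0997\u09CD\u09B0\u09BE\u09AE"))  # গ্রাম
```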

Cluster-final MA A cluster-final [U+09AE BENGALI LETTER MA] is also displayed in a characteristic way. The initial consonant is reduced, and the m is rendered as a long vertical line to the right with an appendage to the left at the bottom, producing a kind of diagonal grouping, eg. উন্মত্ত

Examples of cluster-final MA.

ক্ম␣গ্ম␣ল্ম␣ন্ম␣ম্ম␣স্ম␣দ্ম␣ত্ম␣শ্ম␣ষ্ম␣স্ম

y̌ɔ-phɔla Bengali also has a particular way of representing a cluster-final j semi-vowel. This is typically represented using the full form of the preceding consonant followed by a special form of [U+09AF BENGALI LETTER YA], ‍্য, known as y̌ɔ-phɔla, eg. হ্যাঁ

The effect of y̌ɔ-phɔla at the end of a conjunct is generally (a) to double the length of the preceding consonant, and (b) to change the value of the following vowel if it is the inherent vowel or a. For more details, see the character notes.

Visible virama

A visible virama may appear because the font lacks a particular conjunct ligature, but it may also be used deliberately where the phonology is unusual, eg. ফ্‌ল্যাট লান্‌চ, although these words may also be spelled with conjuncts, eg. ফ্ল্যাট pʰ͓ₓl͓ʲɑʈ pʰlæʈ flat.

It is also quite common to see it used to distinguish words such as the following, which are etymologically related, but phonetically distinct: উদ্‌যাপন ụd͓‌ýɑpn, উদ্যান ụd͓ýɑn

If a visible virama is wanted but not what the font does by default, it is possible to force it by inserting a ZWNJ character after the virama (see formatting).
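As a rough sketch of that approach, the function below inserts U+200C after each virama in a cluster, unless a joiner is already there. The name `force_visible_virama` is hypothetical, and whether the result actually renders with a visible virama still depends on the font and rendering engine.

```python
import re

VIRAMA = "\u09CD"
ZWNJ = "\u200C"

# A virama followed by some character other than ZWNJ or ZWJ.
_BARE_VIRAMA = re.compile("\u09CD(?=[^\u200C\u200D])")


def force_visible_virama(cluster: str) -> str:
    """Insert ZWNJ after every virama that is followed by another letter."""
    return _BARE_VIRAMA.sub(VIRAMA + ZWNJ, cluster)


flat = "\u09AB\u09CD\u09B2"  # PHA + virama + LA
print([f"U+{ord(c):04X}" for c in force_visible_virama(flat)])
```

Applying the function twice gives the same result as applying it once, because a virama that already has a joiner after it is left alone.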

Formatting characters

ZWNJ [U+200C ZERO WIDTH NON-JOINER] (ZWNJ) can be used to force the production of a visible virama, rather than a half-form (see Visible virama). It can also be used to prevent the formation of vowel ligatures (see the section on vowel ligatures).

ZWJ [U+200D ZERO WIDTH JOINER] (ZWJ) is used to produce special joining forms for YA (see the section on consonant clusters).
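One case often cited for ZWJ with YA is RA followed by y̌ɔ-phɔla. The sketch below only compares the two code point sequences. That the ZWJ sequence displays as RA plus y̌ɔ-phɔla, rather than as a mark above YA, is an assumption based on the Unicode Standard's recommendations for Bengali, and it depends on font support.

```python
RA, YA = "\u09B0", "\u09AF"
VIRAMA, ZWJ = "\u09CD", "\u200D"

# Without ZWJ, RA is cluster-initial and is shown as a mark above YA.
ra_above_ya = RA + VIRAMA + YA
# With ZWJ after RA (assumed): full RA followed by the special form of YA.
ra_ya_phala = RA + ZWJ + VIRAMA + YA


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


print(code_points(ra_above_ya))
print(code_points(ra_ya_phala))
```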

Consonant lengthening

There are a number of ways of producing a lengthened consonant sound in Bangla.

A straightforward approach is to duplicate the consonant sound in conjunct form. For example, a long l can be written ল্ল [U+09B2 BENGALI LETTER LA + U+09CD BENGALI SIGN VIRAMA + U+09B2 BENGALI LETTER LA], eg. ঝিল্লি

Another common way of doubling the length of a consonant is to use a conjunct ending with [U+09AC BENGALI LETTER BA] or [U+09AE BENGALI LETTER MA], eg. ভস্ম বিশ্ব

The y̌ɔ-phɔla (্য [U+09CD BENGALI SIGN VIRAMA + U+09AF BENGALI LETTER YA]) can also lengthen the consonant it follows, eg. জন্য

[U+0983 BENGALI SIGN VISARGA] can also lengthen the following consonant, with no aspiration, eg. নিঃশব্দ
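The four devices above are all visible in the character sequence, so they can be spotted mechanically. The sketch below flags spelling patterns only and makes no claim about how a particular word is pronounced. The function name, the labels, and the consonant range check are assumptions made for the example.

```python
VIRAMA, VISARGA = "\u09CD", "\u0983"
BA, MA, YA = "\u09AC", "\u09AE", "\u09AF"


def is_consonant(ch: str) -> bool:
    # Rough check: the main consonant range of the Bengali block.
    return len(ch) == 1 and "\u0995" <= ch <= "\u09B9"


def lengthening_devices(word: str) -> list[str]:
    """List the spelling devices found in a word, in order of appearance."""
    found = []
    for i, ch in enumerate(word):
        prev = word[i - 1] if i > 0 else ""
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch == VIRAMA and is_consonant(prev) and is_consonant(nxt):
            if nxt == prev:
                found.append("doubled consonant")
            elif nxt in (BA, MA):
                found.append("conjunct ending in BA or MA")
            elif nxt == YA:
                found.append("virama + YA")
        elif ch == VISARGA and is_consonant(nxt):
            found.append("visarga before consonant")
    return found


print(lengthening_devices("\u099D\u09BF\u09B2\u09CD\u09B2\u09BF"))  # ঝিল্লি
print(lengthening_devices("\u099C\u09A8\u09CD\u09AF"))              # জন্য
```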

Consonant sounds to characters

This section maps Bengali consonant sounds to common graphemes, grouped according to whether they are regular ( r ), conjuncts ( c ), or final ( f ).

p
r

[U+09AA BENGALI LETTER PA], eg. পথ.

b
r

[U+09AC BENGALI LETTER BA] (when not in a conjunct), eg. বড়.

pʰ/pf
r

[U+09AB BENGALI LETTER PHA], eg. ফটো.

bʰ
r

[U+09AD BENGALI LETTER BHA], eg. ভাল.

t
r

[U+09A4 BENGALI LETTER TA], eg. তারা.

 
f

[U+09CE BENGALI LETTER KHANDA TA], in syllable-final positions, eg. হঠাৎ and উৎসব.

d
r

[U+09A6 BENGALI LETTER DA], eg. দাদী.

tʰ
r

[U+09A5 BENGALI LETTER THA], eg. থামা.

dʰ
r

[U+09A7 BENGALI LETTER DHA], eg. ধন্যবাদ.

ʈ
r

[U+099F BENGALI LETTER TTA], eg. টাকা.

ɖ
r

[U+09A1 BENGALI LETTER DDA], eg. ডাক্তার.

ʈʰ
r

[U+09A0 BENGALI LETTER TTHA], eg. ঠাণ্ডা.

ɖʰ
r

[U+09A2 BENGALI LETTER DDHA], eg. ঢেউ.

k
r

[U+0995 BENGALI LETTER KA], eg. কলম.

kʰ
r

[U+0996 BENGALI LETTER KHA], eg. খবর.

 
c

ক্ষ [U+0995 BENGALI LETTER KA + U+09CD BENGALI SIGN VIRAMA + U+09B7 BENGALI LETTER SSA] when initial, eg. ক্ষুদ্র.

ɡ
r

[U+0997 BENGALI LETTER GA], eg. গতকাল.

 
c

জ্ঞ [U+099C BENGALI LETTER JA + U+09CD BENGALI SIGN VIRAMA + U+099E BENGALI LETTER NYA] when word-initial, eg. জ্ঞান.

ɡː
c

জ্ঞ [U+099C BENGALI LETTER JA + U+09CD BENGALI SIGN VIRAMA + U+099E BENGALI LETTER NYA] when between vowels, eg. বিজ্ঞান.

ɡʰ
r

[U+0998 BENGALI LETTER GHA], eg. ঘর.

t͡ʃ
r

[U+099A BENGALI LETTER CA], eg. চক্র.

t͡ʃʰ
r

[U+099B BENGALI LETTER CHA], eg. ছাতা.

d͡ʒ
r

[U+099C BENGALI LETTER JA], eg. জগৎ.

[U+09AF BENGALI LETTER YA] when word-initial, eg. যখন.

d͡ʒʰ
r

[U+099D BENGALI LETTER JHA], eg. ঝড়.

f
 

See pʰ/pf.

s
c

As the initial sound in the following conjuncts:

স্ত [U+09B8 BENGALI LETTER SA + U+09CD BENGALI SIGN VIRAMA + U+09A4 BENGALI LETTER TA] st, eg. রাস্তা.

স্থ [U+09B8 BENGALI LETTER SA + U+09CD BENGALI SIGN VIRAMA + U+09A5 BENGALI LETTER THA] stʰ, eg. পাকস্থলী.

স্ন [U+09B8 BENGALI LETTER SA + U+09CD BENGALI SIGN VIRAMA + U+09A8 BENGALI LETTER NA] sn, eg. স্নেহ.

স্ত্র [U+09B8 BENGALI LETTER SA + U+09CD BENGALI SIGN VIRAMA + U+09A4 BENGALI LETTER TA + U+09CD BENGALI SIGN VIRAMA + U+09B0 BENGALI LETTER RA] str

ʃ
r

[U+09B6 BENGALI LETTER SHA], eg. শাপ.

[U+09B7 BENGALI LETTER SSA], eg. ষড়যন্ত্র.

[U+09B8 BENGALI LETTER SA], eg. সকাল.

h
r

[U+09B9 BENGALI LETTER HA], eg. হল.

 
f

[U+0983 BENGALI SIGN VISARGA], eg. বাঃ.

w
r

ওয় [U+0993 BENGALI LETTER O + U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA], (light, like French 'oui') between o...a , eg. দাওয়াত.

r ɾ
r

[U+09B0 BENGALI LETTER RA], eg. রওয়া.

[U+09C3 BENGALI VOWEL SIGN VOCALIC R] as part of the vocalic ri, eg. বৃহৎ

[U+098B BENGALI LETTER VOCALIC R] as part of the standalone vocalic ri, eg. ঋতু

ɽ(ʰ)
r

ড় [U+09A1 BENGALI LETTER DDA + U+09BC BENGALI SIGN NUKTA], eg. ওড়া.

ঢ় [U+09A2 BENGALI LETTER DDHA + U+09BC BENGALI SIGN NUKTA], eg. আষাঢ়.

l
r

[U+09B2 BENGALI LETTER LA], eg. লওয়া.

j
r

য় [U+09AF BENGALI LETTER YA + U+09BC BENGALI SIGN NUKTA], between i...e, a...u, or e...e, eg. মেয়ে.

Other letters

Besides the vowels and consonants described above, the Unicode Bengali block contains the following letters. They don't appear to be commonly used in Bangla.

ঀ␣ৼ

Numbers, dates, currency, etc.

Bengali has a set of native digits, which are used regularly in text. They are decimal-based.

০␣১␣২␣৩␣৪␣৫␣৬␣৭␣৮␣৯

See also the section Counters below.
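Because the native digits are decimal digits in Unicode, general-purpose number parsers can often read them directly. As a small illustration, Python's built-in `int()` and `unicodedata.digit()` both accept them. The wrapper name `parse_bengali_int` is hypothetical and only there to make the example easy to call.

```python
import unicodedata


def parse_bengali_int(text: str) -> int:
    """Parse a string of decimal digits; int() accepts Bengali digits too."""
    return int(text)


print(parse_bengali_int("\u09E7\u09E8\u09E9"))                  # ১২৩
print([unicodedata.digit(ch) for ch in "\u09E6\u09EB\u09EF"])   # ০ ৫ ৯
```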

Currency

[U+09F3 BENGALI RUPEE SIGN] is the Bengali rupee sign.

There are also a number of currency symbols, used in older texts, including [U+09F2 BENGALI RUPEE MARK] and the following currency denominator signs.

৲␣৴␣৵␣৶␣৷␣৸␣৹␣৻

These were used in an additive/subtractive system for specifying the number of ānā in the Bengali notation for currency used up to 1957, eg. ৷৷৶৹ 11 ānā (11 ana); ৸৶৹ 15 ānā (15 ana). There are 16 ana in one rupee, and the system works in multiples of 4. For a detailed explanation of usage, see [Pandey].
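The two examples above can be reproduced by simple addition. The ana values in the sketch below are assumptions, inferred from the Unicode character names (NUMERATOR ONE to FOUR, ONE LESS THAN THE DENOMINATOR) and chosen so that they match those examples. The sketch does not attempt the rupee part of the notation or any subtractive conventions, for which [Pandey] remains the reference.

```python
# Assumed value in ana of each numerator sign.
ANA_VALUE = {
    "\u09F4": 1,   # ৴ CURRENCY NUMERATOR ONE
    "\u09F5": 2,   # ৵ CURRENCY NUMERATOR TWO
    "\u09F6": 3,   # ৶ CURRENCY NUMERATOR THREE
    "\u09F7": 4,   # ৷ CURRENCY NUMERATOR FOUR
    "\u09F8": 12,  # ৸ CURRENCY NUMERATOR ONE LESS THAN THE DENOMINATOR
}
DENOMINATOR_SIXTEEN = "\u09F9"  # ৹ marks the amount as sixteenths (ana)


def ana_total(notation: str) -> int:
    """Add up the numerator signs in an ana amount, skipping the ৹ marker."""
    total = 0
    for ch in notation:
        if ch == DENOMINATOR_SIXTEEN:
            continue
        total += ANA_VALUE[ch]  # KeyError for anything outside the table
    return total


print(ana_total("\u09F7\u09F7\u09F6\u09F9"))  # ৷৷৶৹
print(ana_total("\u09F8\u09F6\u09F9"))        # ৸৶৹
```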

Text direction

Text is normally written horizontally, left to right.
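The default Unicode bidi class of each character in a string can be inspected with Python's standard `unicodedata` module. This is a minimal sketch; `bidi_classes` is a hypothetical helper name.

```python
import unicodedata


def bidi_classes(text: str) -> list[tuple[str, str]]:
    """Return (code point, Bidi_Class) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


# KA + virama + RA: the letters are strong left-to-right (L),
# the virama is a non-spacing mark (NSM).
print(bidi_classes("\u0995\u09CD\u09B0"))
```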


Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Bengali character app.

Bengali text is not cursive (ie. joined up like Arabic), however there is a significant amount of interaction between glyphs, and some joining, around consonant clusters.

The orthography has no case distinction, and no special transforms are needed to convert between characters.

Font styles

tbd

Punctuation & inline features

Grapheme boundaries

tbd

Word boundaries

Words are separated by spaces.

Phrase & section boundaries

,␣;␣:␣.␣।␣?␣!
phrase

, [U+002C COMMA]

; [U+003B SEMICOLON]

: [U+003A COLON]

sentence

. [U+002E FULL STOP]

। [U+0964 DEVANAGARI DANDA]

? [U+003F QUESTION MARK]

! [U+0021 EXCLAMATION MARK]

The danda, [U+0964 DEVANAGARI DANDA], is used for sentence-final punctuation.

Observation: I haven't seen much evidence for the use of the double danda, [U+0965 DEVANAGARI DOUBLE DANDA].

Western punctuation, such as commas, semicolons, colons, quotation marks and hyphens, is also used quite commonly.
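A naive sentence splitter for Bangla text therefore needs to treat the danda as a terminator alongside the question and exclamation marks. The sketch below is an assumption-laden simplification: it splits only where a terminator is followed by white space, and it leaves out the full stop because that character also follows list counters.

```python
import re

# White space that follows a danda, question mark, or exclamation mark.
_AFTER_TERMINATOR = re.compile("(?<=[\u0964?!])\\s+")


def split_sentences(text: str) -> list[str]:
    """Split text after each terminator, keeping the terminator in place."""
    return [part for part in _AFTER_TERMINATOR.split(text.strip()) if part]


sample = "\u0995\u0964 \u0996? \u0997"
print(split_sentences(sample))
```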

Parentheses & brackets

(␣)
  start end
standard

( [U+0028 LEFT PARENTHESIS]

) [U+0029 RIGHT PARENTHESIS]

Quotations

“␣”␣‘␣’
  start end
initial

“ [U+201C LEFT DOUBLE QUOTATION MARK]

” [U+201D RIGHT DOUBLE QUOTATION MARK]
nested

‘ [U+2018 LEFT SINGLE QUOTATION MARK]

’ [U+2019 RIGHT SINGLE QUOTATION MARK]

Emphasis

Italicisation, bolding, and underlining are not traditionally features of Bengali text.

Abbreviation, ellipsis & repetition

ঃ␣ʼ

The bisɔrgô [U+0983 BENGALI SIGN VISARGA] is sometimes used to mark initial abbreviations.

A sign called urdha-comma can be used to indicate truncation of words, eg. কʼরে ʼপরে. The Unicode Standard recommends use of ʼ [U+02BC MODIFIER LETTER APOSTROPHE]. [u,460]

Observation: Wikipedia seems to use a normal apostrophe.

The Unicode Bengali block also has the punctuation [U+09FD BENGALI ABBREVIATION SIGN]. It is possible that [U+2026 HORIZONTAL ELLIPSIS] is used.

Observation: Information is needed about how it is used. It doesn't appear to be in common use.

Inline notes & annotations

tbd

Other inline ranges

tbd

Other punctuation

[U+09FA BENGALI ISSHAR] is used alongside the names of deceased persons.

Line & paragraph layout

Line breaking

Bengali is preferably wrapped at word boundaries.

Hyphenation

Bengali text can be hyphenated during line wrap, though this is not very common (unlike in several south Indian scripts), partly because Bengali words are mostly short. [st]

Hyphenation adds a hyphen at the end of the line when a word is broken.

Line-edge rules

According to ilreq, a line should not start with any of the following characters.

,␣.␣:␣;␣।␣॥␣)␣]␣}␣>␣+␣*␣/␣=␣_␣|␣~␣%

Line breaking should also not move a danda or double danda to the beginning of a new line, even if they are preceded by a space character. These punctuation characters should behave in the same way as a full stop does in English text.

Presumably, similar rules apply for the end of a line.
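The line-start rule can be sketched as a greedy wrapper that breaks at spaces but never lets a new line begin with one of the listed characters. This is an illustration only: `wrap` is a hypothetical function, width is counted in code points rather than rendered width, and a line is allowed to run over the limit to keep a danda with the preceding word.

```python
# Characters that must not start a line, as listed above.
NO_LINE_START = set(", . : ; \u0964 \u0965 ) ] } > + * / = _ | ~ %".split())


def wrap(text: str, width: int) -> list[str]:
    """Break text at spaces, keeping prohibited characters off line starts."""
    lines: list[str] = []
    for word in text.split():
        fits = bool(lines) and len(lines[-1]) + 1 + len(word) <= width
        if lines and (fits or word[0] in NO_LINE_START):
            # Stay on the current line, even if that exceeds the width.
            lines[-1] += " " + word
        else:
            lines.append(word)
    return lines


# The danda is preceded by a space but still stays on the first line.
print(wrap("\u0995\u0996 \u0997\u0998 \u0964 \u0995", 5))
```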


Text alignment & justification

tbd

Letter spacing

tbd

Counters, lists, etc.

You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.

The modern Bangla orthography uses ASCII digit numbering, but also has a native numeric style.

Numeric

The bengali numeric style is decimal-based and uses these digits. [rmcs]

০␣১␣২␣৩␣৪␣৫␣৬␣৭␣৮␣৯

Examples:

১␣২␣৩␣৪␣১১␣২২␣৩৩␣৪৪␣১১১␣২২২␣৩৩৩␣৪৪৪
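A decimal counter of this kind can be generated by repeated division by ten, taking each remainder as an index into the digit list. This is a minimal sketch; `bengali_numeric` is a hypothetical name, not part of any counter-style API.

```python
# Digits ০ to ৯ (U+09E6 to U+09EF), indexed by value.
BENGALI_DIGITS = "\u09E6\u09E7\u09E8\u09E9\u09EA\u09EB\u09EC\u09ED\u09EE\u09EF"


def bengali_numeric(n: int) -> str:
    """Render a non-negative integer with Bengali decimal digits."""
    if n < 0:
        raise ValueError("negative counters are not handled in this sketch")
    if n == 0:
        return BENGALI_DIGITS[0]
    out = []
    while n:
        n, digit = divmod(n, 10)
        out.append(BENGALI_DIGITS[digit])
    return "".join(reversed(out))


print(" ".join(bengali_numeric(n) for n in (1, 22, 333, 444)))
```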

Prefixes and suffixes

Generally, Bangla lists use a full stop plus a space as a suffix.

Examples:

১. ২. ৩. ৪. ৫.
Separator for Bangla list counters: full stop + space.
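Putting the native digits together with this suffix gives complete list markers. The sketch below uses a translation table rather than arithmetic; `list_markers` and its `suffix` parameter are hypothetical names chosen for the example.

```python
# Map ASCII digits to ০ to ৯ (U+09E6 to U+09EF).
TO_BENGALI = str.maketrans(
    "0123456789",
    "\u09E6\u09E7\u09E8\u09E9\u09EA\u09EB\u09EC\u09ED\u09EE\u09EF",
)


def list_markers(count: int, suffix: str = ". ") -> list[str]:
    """Return markers for items 1..count: native digits plus the suffix."""
    return [str(i).translate(TO_BENGALI) + suffix for i in range(1, count + 1)]


print("".join(list_markers(5)))
```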

Page & book layout

This section is for any features that are specific to Bengali and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

Languages using the Bengali script

According to ScriptSource, the Bengali script is used for the following languages:

References