Bengali script

Updated 24 July, 2017 • tags bengali, scriptnotes

This page provides basic information about the Bengali script. It is not authoritative, peer-reviewed information – these are just notes I have gathered or copied from various places as I learned. For similar information related to other scripts, see the Script comparison table.


Pronunciation varies in ways that are not always systematic, due to vowel harmony and other factors. Because of this I usually provide a transcription rather than a phonetic equivalent for examples. The transcription is the one used in the book by William Radice. Letter names are taken from Radice for the most common letters.


Sample (Bangla)

ধারা ১ সমস্ত মানুষ স্বাধীনভাবে সমান মর্যাদা এবং অধিকার নিয়ে জন্মগ্রহণ করে। তাঁদের বিবেক এবং বুদ্ধি আছে; সুতরাং সকলেরই একে অপরের প্রতি ভ্রাতৃত্বসুলভ মনোভাব নিয়ে আচরণ করা উচিত।

ধারা ২ এ ঘোষণায় উল্লেখিত স্বাধীনতা এবং অধিকারসমূহে গোত্র, ধর্ম, বর্ণ, শিক্ষা, ভাষা, রাজনৈতিক বা অন্যবিধ মতামত, জাতীয় বা সামাজিক উত্‍পত্তি, জন্ম, সম্পত্তি বা অন্য কোন মর্যাদা নির্বিশেষে প্রত্যেকের‌ই সমান অধিকার থাকবে। কোন দেশ বা ভূখণ্ডের রাজনৈতিক, সীমানাগত বা আন্তর্জাতিক মর্যাদার ভিত্তিতে তার কোন অধিবাসীর প্রতি কোনরূপ বৈষম্য করা হবেনা; সে দেশ বা ভূখণ্ড স্বাধীন‌ই হোক, হোক অছিভূক্ত, অস্বায়ত্বশাসিত কিংবা সার্বভৌমত্বের অন্য কোন সীমাবদ্ধতায় বিরাজমান।

Key features

Bengali is an abugida, and is called বাংলা baɱla in the Bangla language.

An abugida is characterised by consonant characters that include an inherent vowel sound. The inherent vowel can be overridden using vowel signs appended to the character. There are also independent vowel letters to represent vowels that are not preceded by consonants. The syllable is the unit for various aspects of the behaviour of the script.

The alphabet is split into vowels and consonants. With one exception (ɔ-kar), each vowel is represented by both an independent version and a combining vowel sign.

Text runs horizontally, left to right, and lines typically break at the spaces between words. The script has no upper-/lowercase distinction.

The basic unit for text segmentation is the syllable. Unicode grapheme clusters don't cover consonant clusters, so some additional processing is needed to identify text unit boundaries.
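The additional processing mentioned above could look something like the following Python sketch. It is only a rough approximation, not the Unicode segmentation algorithm: it assumes that a text unit is a base character plus any following combining marks and joiners, and that a letter directly after a virama belongs to the same unit. The function name orthographic_syllables is made up for this illustration.

```python
import unicodedata

VIRAMA = "\u09CD"               # BENGALI SIGN VIRAMA
JOINERS = {"\u200C", "\u200D"}  # ZWNJ and ZWJ


def orthographic_syllables(text):
    """Split text into approximate units, keeping conjuncts together.

    Simplified rule (an assumption of this sketch): a character joins
    the current unit if it is a combining mark, a joiner, or a letter
    that directly follows a virama.
    """
    units = []
    for ch in text:
        prev = units[-1] if units else ""
        category = unicodedata.category(ch)
        attaches = (
            category.startswith("M")
            or ch in JOINERS
            or (prev.endswith(VIRAMA) and category == "Lo")
        )
        if prev and attaches:
            units[-1] += ch
        else:
            units.append(ch)
    return units


# গ্রাম (gram) is GA + VIRAMA + RA + AA + MA.
print(orthographic_syllables("\u0997\u09CD\u09B0\u09BE\u09AE"))
```

With this rule গ্রাম comes out as two units, গ্রা and ম, rather than being split after the virama. A virama followed by a zero-width non-joiner ends the unit, since the letter after the joiner no longer directly follows the virama.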

For more information see ScriptSource, Wikipedia, Omniglot, and the Unicode Standard.

Consonants

The Bengali block has 36 consonants:
কখগঘঙচছজঝঞটঠডঢণতথদধনপফবভমযরলশষসহৎড়ঢ়য়.

Usage tip: 3 of these are best represented by a combination of base character and diacritic, since the precomposed code points are decomposed by Normalization Form C.

Don't use: ড় 09DC BENGALI LETTER RRA
Do use: 09A1 BENGALI LETTER DDA + 09BC BENGALI SIGN NUKTA

Don't use: ঢ় 09DD BENGALI LETTER RHA
Do use: 09A2 BENGALI LETTER DDHA + 09BC BENGALI SIGN NUKTA

Don't use: য় 09DF BENGALI LETTER YYA
Do use: 09AF BENGALI LETTER YA + 09BC BENGALI SIGN NUKTA
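The effect of Normalization Form C on these three letters can be checked with Python's standard unicodedata module. This is just a small sketch; the code_points helper is a made-up name used only to print the results readably.

```python
import unicodedata


def code_points(text):
    """Return the code points of text as a list of 4-digit hex strings."""
    return [f"{ord(ch):04X}" for ch in text]


# The three precomposed letters listed above: RRA, RHA and YYA.
for precomposed in ("\u09DC", "\u09DD", "\u09DF"):
    normalised = unicodedata.normalize("NFC", precomposed)
    print(code_points(precomposed), "->", code_points(normalised))

# Prints:
# ['09DC'] -> ['09A1', '09BC']
# ['09DD'] -> ['09A2', '09BC']
# ['09DF'] -> ['09AF', '09BC']
```

Text that has passed through NFC normalisation will therefore contain the two-character sequences anyway, which is why it makes sense to type them that way from the start.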

Conjuncts

The absence of vowels between consonants can be represented in the following ways:

Unlike languages written in the Devanagari script, consonant clusters are often not represented as conjuncts in Bengali. It is necessary to just know that the vowel should not be pronounced, eg. রিকশা rikʃa rickshaw. Grammatical suffixes and endings are typically written without conjuncts, eg. খাননা khanna, which is the present tense form khan plus negative suffix na; করছ kôrchô, which is stem kôr from kôra plus present continuous ending chô.

Conjunct shapes are commonly formed by displaying a slightly smaller version of the non-final consonants in a cluster (equivalent to half-forms in Hindi), eg. see the m in ক্যম্পাস kyæmpas campus, or by combining the consonants into a more complex conjunct shape, eg. khr and ʂʈ in খ্রিষ্টান khriʂʈan christian.

As in other Indic scripts, র [U+09B0 BENGALI LETTER RA] is displayed in a non-standard way in consonant clusters. A syllable-initial র is displayed as a mark to the top right of the cluster, eg. rt in গর্ত gɔrtô hole, and a trailing র is typically displayed as a wavy line below the other consonants, eg. gr in গ্রাম gram village.
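Both shapes of RA are a matter of rendering: in the stored text the letter is simply joined to its neighbour with a virama, and only the order differs. The following Python sketch classifies each RA in a word by its position relative to the virama. The function name ra_positions and the labels "initial" and "trailing" are invented for this illustration, and special cases such as RA followed by the y̌ɔ-phɔla are ignored.

```python
RA = "\u09B0"      # BENGALI LETTER RA
VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA


def ra_positions(word):
    """Label each RA that takes part in a cluster.

    "initial"  = RA + VIRAMA + another character (mark above the cluster)
    "trailing" = VIRAMA + RA (shape below the other consonants)
    """
    labels = []
    for i, ch in enumerate(word):
        if ch != RA:
            continue
        if word[i + 1:i + 2] == VIRAMA and i + 2 < len(word):
            labels.append("initial")
        elif i > 0 and word[i - 1] == VIRAMA:
            labels.append("trailing")
    return labels


print(ra_positions("\u0997\u09B0\u09CD\u09A4"))        # গর্ত, GA RA VIRAMA TA
print(ra_positions("\u0997\u09CD\u09B0\u09BE\u09AE"))  # গ্রাম, GA VIRAMA RA AA MA
```

The first call reports an initial RA and the second a trailing one, matching the two examples above.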

Bengali also has a particular way of representing a cluster-final j semi-vowel. This is typically represented using the full form of the preceding consonant followed by a special form of য [U+09AF BENGALI LETTER YA], known as y̌ɔ-phɔla, eg. হ্যাঁ hyæ̃ yes.

When the virama is visible it is usually because the font doesn't have a particular conjunct ligature, but it may also be visible in places where the phonology is unusual, eg. ফ্‌ল্যাট phlæʈ flat; লান্‌চ lanc lunch (though these may also be spelled with conjuncts, eg. ফ্ল্যাট phlæʈ flat). It is also quite common to see উদ্‌যাপন written with a visible virama to distinguish it from words like উদ্যান. These words are etymologically related, but distinct phonetically.
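In Unicode text, a deliberately visible virama is normally requested by placing U+200C ZERO WIDTH NON-JOINER straight after the virama, while the plain sequence leaves the font free to form a conjunct. A minimal Python sketch of the two encodings follows. The function names are made up for illustration, and what actually appears on screen still depends on the font and rendering engine.

```python
VIRAMA = "\u09CD"  # BENGALI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER


def as_conjunct(first, second):
    """Encode two consonants so that a conjunct may be formed."""
    return first + VIRAMA + second


def with_visible_virama(first, second):
    """Encode two consonants so that the virama is shown explicitly."""
    return first + VIRAMA + ZWNJ + second


DA, YA = "\u09A6", "\u09AF"
for text in (as_conjunct(DA, YA), with_visible_virama(DA, YA)):
    print(text, [f"{ord(ch):04X}" for ch in text])
```

The two strings differ only by the invisible joiner, so searching and comparison code may need to ignore it.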

Nasals in conjuncts tend to conform to phonological patterns. Velar consonants (k, kh, g, etc) combine with ŋɔ, palatal consonants (c, ch, ..) combine with ñɔ, retroflex consonants with ɳɔ, dental consonants with nɔ, and labial consonants with mɔ.
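The pattern follows the traditional ordering of the alphabet, where each row of stops ends in the nasal for that place of articulation. As a small Python sketch, the lookup could be written as below. The names ROWS and homorganic_nasal are invented here, and the function describes the tendency only, not a spelling rule.

```python
# Each row lists the stops for one place of articulation; the last
# letter in the row is the nasal of that row.
ROWS = {
    "velar": "\u0995\u0996\u0997\u0998\u0999",      # ক খ গ ঘ ঙ
    "palatal": "\u099A\u099B\u099C\u099D\u099E",    # চ ছ জ ঝ ঞ
    "retroflex": "\u099F\u09A0\u09A1\u09A2\u09A3",  # ট ঠ ড ঢ ণ
    "dental": "\u09A4\u09A5\u09A6\u09A7\u09A8",     # ত থ দ ধ ন
    "labial": "\u09AA\u09AB\u09AC\u09AD\u09AE",     # প ফ ব ভ ম
}


def homorganic_nasal(consonant):
    """Return the nasal from the same row as consonant, or None."""
    if len(consonant) != 1:
        return None
    for letters in ROWS.values():
        if consonant in letters:
            return letters[-1]
    return None


print(homorganic_nasal("\u0995"))  # KA, a velar
print(homorganic_nasal("\u09AA"))  # PA, a labial
```

Letters outside the five rows, such as র or স, return None because the tendency described above does not cover them.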

Vowels

Vowel harmony

The pronunciation of a vowel can be affected by the vowel in the following syllable. Radice provides the following table, though this is a simplification and there are many exceptions.

Followed by i or u    Followed by ɔ, o, e or a
o → u                 o → ɔ
ɔ → o                 u → o
e → i                 e → æ
æ → e                 i → e

For example, the verb শোনা ʃona to hear with an i ending becomes ʃuni, দেখা dækʰa to see becomes dekʰi, etc. This sometimes accounts for the pronunciation of the inherent vowel, eg. অতিথি ôtithi guest and অনুবাদ ônubad translation start with o rather than ɔ.
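Radice's table can be written down as a pair of lookup tables. The Python sketch below does that for a single vowel and the vowel of the following syllable, using the transcription symbols from this page. The function name harmonise is invented for the example, and since the table is a simplification with many exceptions, so is the code.

```python
# Changes when the following syllable contains i or u.
BEFORE_HIGH = {"o": "u", "ɔ": "o", "e": "i", "æ": "e"}
# Changes when the following syllable contains ɔ, o, e or a.
BEFORE_OTHER = {"o": "ɔ", "u": "o", "e": "æ", "i": "e"}


def harmonise(vowel, next_vowel):
    """Return the vowel the table predicts before next_vowel.

    Vowels or contexts the table does not mention are left unchanged.
    """
    if next_vowel in ("i", "u"):
        return BEFORE_HIGH.get(vowel, vowel)
    if next_vowel in ("ɔ", "o", "e", "a"):
        return BEFORE_OTHER.get(vowel, vowel)
    return vowel


print(harmonise("o", "i"))  # the stem vowel of ʃona before the ending i
print(harmonise("æ", "i"))  # the stem vowel of dækʰa before the ending i
```

The two calls print u and e, in line with ʃuni and dekʰi above.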

Inherent vowel

The inherent vowel, unlike in Hindi, is ɔ or o (and sometimes halfway between these two, when influenced by surrounding sounds). Bengalis are not always aware of these sound differences, thinking of them as one sound.

Note that there is also a vowel letter pronounced o. This can lead to inconsistent spellings, eg. bhalo, good, well, can be spelled either ভালো or ভাল. Verb forms tend to be particularly inconsistent, with the choice sometimes based on what looks good in a particular context.

The rules for determining the sound of the inherent vowel are not simple. Partly it is a question of vowel harmony. The following two tendencies can help:

The inherent vowel is pronounced at the end of some words and not others, eg. গরম gɔrôm, hot vs. গড়ান gɔɽanô, to roll. There is no real way to tell when it is pronounced and when not in this position, except that it is usually pronounced following a word-final conjunct, eg. যুদ্ধ y̌uddhô war. When pronounced in this position, the sound is usually o.

Refs: Radice 3, 7-8, 21, 148; Daniels 400

Vowel ligatures

Some vowel signs can form ligatures with a preceding base consonant in certain contexts, but do not ligate in others (eg. newspapers and modern typefaces). Both forms are equivalent in every way but visually.

The default behaviour of a given font can be modified using the zero-width joiner and zero-width non-joiner characters, eg. গু vs. গ‌ু.
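As a rough Python illustration of the non-ligating sequence, the joiner can be placed between the consonant and the vowel sign. The helper name without_vowel_ligature is made up, and whether a given font or rendering engine honours the joiner in this position is an assumption that needs testing in practice.

```python
ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER


def without_vowel_ligature(consonant, vowel_sign):
    """Insert ZWNJ to ask the font not to ligate the vowel sign."""
    return consonant + ZWNJ + vowel_sign


GA, SIGN_U = "\u0997", "\u09C1"  # BENGALI LETTER GA, BENGALI VOWEL SIGN U
for text in (GA + SIGN_U, without_vowel_ligature(GA, SIGN_U)):
    print(text, [f"{ord(ch):04X}" for ch in text])
```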

Multi-glyph vowel signs

Usage tip: The Bengali block also has code points that enable you to split the vowel signs that surround the base into two parts, but the single code point should be used instead. (If you do use the two code points, they must both be input after the base, and in the correct order.)

Don't use: 09C7 BENGALI VOWEL SIGN E + 09D7 BENGALI AU LENGTH MARK
Use: 09CC BENGALI VOWEL SIGN AU

Don't use: 09C7 BENGALI VOWEL SIGN E + 09BE BENGALI VOWEL SIGN AA
Use: 09CB BENGALI VOWEL SIGN O
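Text that already contains the two-part sequences can be converted with Normalization Form C, which composes them into the single code points. The Python sketch below shows this using the standard unicodedata module, with KA as an arbitrary base letter chosen for the example.

```python
import unicodedata

KA = "\u0995"  # BENGALI LETTER KA, used here only as a sample base

two_part_o = KA + "\u09C7" + "\u09BE"   # KA + SIGN E + SIGN AA
two_part_au = KA + "\u09C7" + "\u09D7"  # KA + SIGN E + AU LENGTH MARK

for text in (two_part_o, two_part_au):
    composed = unicodedata.normalize("NFC", text)
    print([f"{ord(ch):04X}" for ch in text], "->",
          [f"{ord(ch):04X}" for ch in composed])

# Prints:
# ['0995', '09C7', '09BE'] -> ['0995', '09CB']
# ['0995', '09C7', '09D7'] -> ['0995', '09CC']
```

This only works when the two parts are in the order shown. A sequence typed in the wrong order is not repaired by normalisation.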

Punctuation

The danda, । [U+0964 DEVANAGARI DANDA], is used for sentence-final punctuation. I haven't seen much evidence for the use of the double danda, ॥ [U+0965 DEVANAGARI DOUBLE DANDA].

Western punctuation marks, such as commas, semicolons, colons, quotation marks and hyphens, are also used quite commonly.

The bisɔrgô [U+0983 BENGALI SIGN VISARGA] is sometimes used to mark initial abbreviations.

A sign called urdha-comma can be used to indicate truncation of words, eg. কʼরে kô're after, and ʼপরে 'pôre above. The Unicode Standard recommends use of ʼ [U+02BC MODIFIER LETTER APOSTROPHE] (Unicode 460), but in Wikipedia a normal apostrophe seems to be used.

Refs: Bhattacharya

Text layout

Initial letter styling can be applied to Bengali text.

Counters

The Predefined Counter Styles document contains a single counter style for Bengali. It is numeric.

@counter-style bengali {
  system: numeric;
  symbols: '\9E6' '\9E7' '\9E8' '\9E9' '\9EA' '\9EB' '\9EC' '\9ED' '\9EE' '\9EF';
  /* symbols: '০' '১' '২' '৩' '৪' '৫' '৬' '৭' '৮' '৯'; */
}

Examples: 1 ⇨ ১, 2 ⇨ ২, 3 ⇨ ৩, 4 ⇨ ৪, 11 ⇨ ১১, 22 ⇨ ২২, 33 ⇨ ৩৩, 44 ⇨ ৪৪, 111 ⇨ ১১১, 2222 ⇨ ২২২২.
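Outside CSS, the same numeric mapping is easy to reproduce. The Python sketch below converts a non-negative integer digit by digit. The function name to_bengali_numeral is invented for the example, and negative numbers are simply rejected rather than handled the way a CSS counter style would.

```python
# U+09E6 BENGALI DIGIT ZERO to U+09EF BENGALI DIGIT NINE, in order.
BENGALI_DIGITS = "".join(chr(0x09E6 + i) for i in range(10))


def to_bengali_numeral(number):
    """Render a non-negative integer with Bengali digits."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    return "".join(BENGALI_DIGITS[int(digit)] for digit in str(number))


for n in (1, 11, 111, 2222):
    print(n, "⇨", to_bengali_numeral(n))
```

For 2222 this gives ২২২২, matching the counter examples above.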

List of basic symbols

Bangla

This is a list of main characters or character combinations needed for Bangla.

 

Consonants ড় ঢ় য়   ্য
Independent vowels
Vowel signs   া   ি   ী   ু   ূ   ৃ   ে   ৈ   ো   ৌ
Combining marks   ঁ   ং   ঃ   ়   ্
Symbols & punctuation
Numbers
Other symbols in the Bengali block   ৗ   ৄ   ৢ   ৣ

 

To see a list of ligatures and alternative shapes go to the 'shape' view of the Bengali character picker. (Hint: to see the composition of a conjunct, click on it and select 'Codepoints'.)

References

  1. Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, ISBN 0-19-507993-0
  2. Wikipedia, Bengali Script
  3. William Radice, Teach Yourself Bengali, Hodder & Stoughton, ISBN 0-340-86029-4
  4. The Unicode Standard
  5. Private correspondence with Tanmoy Bhattacharya, July 2004.
  6. Ishida, Predefined Counter Styles
First published 7 Apr 2006. This version 2017-07-24 22:09 GMT.  •  Copyright r12a@w3.org. Licence CC-By.