Hebrew script summary

Updated 12-Apr-2019 • tags hebrew, scriptnotes

This page provides basic information about the Hebrew script, and its use for modern Israeli Hebrew. It is not authoritative, peer-reviewed information – these are just notes I have gathered or copied from various places as I learned. For character-specific details follow the links to the Hebrew character notes.

For similar information related to other scripts, see the Script comparison table.

Sample (Hebrew)

סעיף א. כל בני אדם נולדו בני חורין ושווים בערכם ובזכויותיהם. כולם חוננו בתבונה ובמצפון, לפיכך חובה עליהם לנהוג איש ברעהו ברוח של אחוה.

סעיף ב. כל אדם זכאי לזכויות ולחרויות שנקבעו בהכרזה זו ללא הפליה כלשהיא מטעמי גזע, צבע, מין, לשון, דת, דעה פוליטית או דעה בבעיות אחרות, בגלל מוצא לאומי או חברתי, קנין, לידה או מעמד אחר. גדולה מזו, לא יופלה אדם על פי מעמדה המדיני, על פי סמכותה או על פי מעמדה הבינלאומי של המדינה או הארץ שאליה הוא שייך, בין שהארץ היא עצמאית, ובין שהיא נתונה לנאמנות, בין שהיא נטולת שלטון עצמי ובין שריבונותה מוגבלת כל הגבלה אחרת.

Usage & history

From Scriptsource:

The Hebrew script is primarily used for writing the Hebrew, Samaritan and Yiddish languages. It is also used for writing some varieties of Arabic spoken in North Africa, Iraq and Yemen; the languages of the Jewish communities in Italy and Corfu, Morocco (Berber), Spain and the Caucasus mountains; and the modern Jewish Aramaic languages. Prior to 500 BC the Hebrew language was written in the Paleo-Hebrew script, which was abandoned after the Jewish exile in the 5th century BC in favour of the Aramaic script, from which the current Hebrew script descended. It is commonly called the Hebrew alphabet, after its first two letters aleph and bet, although it is actually an abjad. ...

There are four main styles of writing the Hebrew language. Ashuri is a widely-used block style. A particular form of Ashuri, called STA"M (an acronym for the Hebrew words for which this style is used), is used for sacred texts such as the Torah. Rashi is a typeface commonly used for commentaries on sacred texts. A 'cursive' style is used in handwriting. This is characterized by rounded letter shapes; unlike other cursive scripts the letters are generally unconnected.

From Wikipedia:

The Hebrew alphabet (Hebrew: אָלֶף־בֵּית עִבְרִי‬, Alefbet Ivri), known variously by scholars as the Jewish script, square script and block script, is an abjad script used in the writing of the Hebrew language, also adapted as an alphabet script in the writing of other Jewish languages, most notably in Yiddish (lit. "Jewish" for Judeo-German), Djudío (lit. "Jewish" for Judeo-Spanish), and Judeo-Arabic. Historically, there have been two separate abjad scripts to write Hebrew. The original, old Hebrew script, is known as the paleo-Hebrew alphabet, which has been largely preserved, in a variant form, in the Samaritan alphabet. The present "Jewish script" or "square script" to write Hebrew, on the contrary, is a stylized form of the Aramaic alphabet and was known by Jewish sages as the Ashuri alphabet (lit. "Assyrian"), since its origins were alleged to be from Assyria. Various "styles" (in current terms, "fonts") of representation of the Jewish script letters described in this article also exist, as well as a cursive form which has also varied over time and place, and today is referred to as cursive Hebrew.

Distinctive features

Hebrew is an abjad. This means that in normal use the script represents only consonant and long vowel sounds. This approach is helped by the strong emphasis on consonant patterns in Semitic languages. For a brief overview of features, see the Script Comparison Table.

Character lists

The Hebrew script characters in Unicode 10.0 are in a single block, Hebrew (U+0590–U+05FF).

For character-specific details see Hebrew character notes.

Text direction

Hebrew script is written right-to-left in the main, but as with all RTL scripts, numbers and embedded LTR script text are written left-to-right (bidirectional text). In the following example, the Hebrew words are read right-to-left, starting with the one on the right, and the numeric expression ("10-12") is read left-to-right, ie. it starts with 10 and ends with 12. (Note that this is unlike Arabic, where the 10 and 12 would be in opposite positions.)

התאריכים 12־10 במרץ

Bidirectional Hebrew text.
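As a rough illustration of why the digits behave differently from the letters, the following Python sketch lists the Unicode bidirectional class of each character in the example above. It only uses the standard `unicodedata` module; the helper name `bidi_classes` is made up for these notes, and the sketch does not run the bidirectional algorithm itself.

```python
import unicodedata

# The example string in logical (stored) order; U+05BE is the maqaf.
text = "התאריכים 12\u05BE10 במרץ"

def bidi_classes(s):
    """Return (character, bidi class) pairs for each non-space character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in s if not ch.isspace()]

for ch, cls in bidi_classes(text):
    # Hebrew letters and the maqaf report class R; the digits report EN.
    print(f"U+{ord(ch):04X} {cls}")
```

Characters with class R are laid out right-to-left, whereas each run of EN (European number) characters keeps its left-to-right order inside the right-to-left line.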

Vowels

Matres lectionis

In the spelling of Hebrew and some other Semitic languages, matres lectionis refers to the use of certain consonants to indicate a vowel. [w]

In addition to their role as consonants, the following may also indicate the location of a vowel.

א␣ע␣ו␣י

The first two are silent vowel supports, whereas the second two are considered to be part of the vowel.

Niqqud points

A series of points, known as niqqud, can be used to give precision about vowel sounds. They are rarely used outside of educational, children's, and religious texts, or for foreign or ambiguous words. Hebrew readers are usually able to understand the pronunciation from the context and the regular structure of Hebrew words.

ִ␣ֵ␣ֶ␣ַ␣ָ␣ֹ␣ֻ␣ְ␣ֱ␣ֲ␣ֳ

The last three code points listed above show a combination of sh'va and other points to produce the sounds listed. These may have had shortened lengths in older Hebrew, but modern Israeli Hebrew makes no length distinctions. Note that the combinations are produced in each case by a single, non-decomposing, Unicode code point.
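The non-decomposing nature of these three code points can be checked with a small Python sketch like the one below. It assumes only the standard `unicodedata` module; `HATAF_POINTS` and `describe` are names invented for this note.

```python
import unicodedata

# The three combined sheva+vowel points in the Unicode Hebrew block.
HATAF_POINTS = ["\u05B1", "\u05B2", "\u05B3"]

def describe(ch):
    """Return (code point label, Unicode name, canonical decomposition)."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decomposition(ch))

for ch in HATAF_POINTS:
    label, name, decomposition = describe(ch)
    # An empty decomposition string means the character does not decompose.
    print(label, name, repr(decomposition))
```

Because the decomposition is empty, normalising to NFD leaves each of these points as a single code point rather than splitting it into a sheva plus a vowel point.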

אָלֶף־בֵּית עִבְרִי

'Hebrew alphabet' spelled out using diacritic points. The transliteration is ʔɑlɛf̽-vẹyṫ ʔ̇iv͓ʁiy.

Vowel absence

To express an absence of vowel between two consonants, use ְ [U+05B0 HEBREW POINT SHEVA]. However, in various contexts the sh'va is pronounced.

Consonants

Basic consonants

These are the basic consonants used in modern Hebrew.

א␣ב␣ג␣ד␣ה␣ו␣ז␣ח␣ט␣י␣כ␣ל␣מ␣נ␣ס␣ע␣פ␣צ␣ק␣ר␣ש␣ת

Five consonants have special word-final forms, called sofit. They are encoded as separate code points in Unicode, and appear as separate keys on a keyboard, so no special processing is needed to display or store them (unlike Arabic).

ך␣ם␣ן␣ף␣ץ
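Display and storage need no special handling, but an application that wants to treat a final form and its ordinary form as the same letter (for example in loose searching) has to map between them itself. The following Python sketch shows one way this might be done; `FINAL_FORMS`, `BASE_FORMS` and `fold_finals` are hypothetical names, not part of any library.

```python
# Mapping from each ordinary letter to its word-final (sofit) form.
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # kaf   -> final kaf
    "\u05DE": "\u05DD",  # mem   -> final mem
    "\u05E0": "\u05DF",  # nun   -> final nun
    "\u05E4": "\u05E3",  # pe    -> final pe
    "\u05E6": "\u05E5",  # tsadi -> final tsadi
}

# Reverse mapping, from final form back to the ordinary letter.
BASE_FORMS = {final: base for base, final in FINAL_FORMS.items()}

def fold_finals(word):
    """Replace every final form in `word` with its ordinary letter."""
    return "".join(BASE_FORMS.get(ch, ch) for ch in word)

# mem, vav, mem, het, yod, final mem
print(fold_finals("\u05DE\u05D5\u05DE\u05D7\u05D9\u05DD"))
```

The folded string is only intended for comparison; it is not a correct spelling and should not be displayed.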

Some of these consonants can also represent vowel locations. See the section on vowels above.

There are 3 methods for representing additional sounds, which we will consider next.

Repertoire extensions

In vowelled text, the following diacritics can be used to modify the sound of a consonant.

ּ␣ׁ␣ׂ

The only diacritic to appear inside a consonant, ּ [U+05BC HEBREW POINT DAGESH OR MAPIQ] is used in vowelled text to indicate that 5 consonants map to 'hard' sounds. This is similar to the distinction made in Syriac. Below, the hard sounds are shown to the left, and the normal to the right – some distinctions have been lost over time.

בּ␣כּ␣ךּ␣פּ␣תּ␣ ␣ב␣כ␣ך␣פ␣ת

Hebrew has two phonemes, s and ʃ, that are represented by a single consonant character, ש [U+05E9 HEBREW LETTER SHIN]. If it is necessary to indicate which is intended, two diacritics, used only with this character, do the job: ׂ [U+05C2 HEBREW POINT SIN DOT] and ׁ [U+05C1 HEBREW POINT SHIN DOT]. They look identical, but the side on which they are positioned makes the difference.

שׁ␣שׂ
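Since the two dots differ only in code point, software can tell the readings apart by looking for the relevant mark after the base letter. The following Python sketch is a minimal illustration; the function name `shin_reading` and its return labels are assumptions made for this note.

```python
SHIN = "\u05E9"      # HEBREW LETTER SHIN
SHIN_DOT = "\u05C1"  # HEBREW POINT SHIN DOT
SIN_DOT = "\u05C2"   # HEBREW POINT SIN DOT

def shin_reading(cluster):
    """Classify a shin plus its combining marks as 'sh', 's' or 'unmarked'."""
    if not cluster.startswith(SHIN):
        raise ValueError("cluster must start with U+05E9")
    marks = cluster[1:]
    if SHIN_DOT in marks:
        return "sh"
    if SIN_DOT in marks:
        return "s"
    # Unpointed text: the reader has to decide from context.
    return "unmarked"

print(shin_reading(SHIN + SHIN_DOT))
print(shin_reading(SHIN + SIN_DOT))
print(shin_reading(SHIN))
```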

Other consonants are extended to non-native sounds by use of a following ׳ [U+05F3 HEBREW PUNCTUATION GERESH].

ג׳␣ז׳␣צ׳␣ו׳␣וו␣ד׳␣ת׳␣ח׳␣ר׳␣ע׳

The last 5 of these represent sounds found in Arabic.

The graphemes ו׳ and וו are alternative ways of writing the same thing.

Other letters

The Unicode Hebrew block contains the following additional characters with the general property of letter.

ׯ␣ױ␣ײ␣װ

Three of these are digraphs used for Yiddish.

Combining marks

Combining characters used in modern Israeli Hebrew include the vowel points (niqqud) and the consonant-modifying diacritics (dagesh, shin dot and sin dot) described above. These are only rarely used for normal Hebrew text.

In Biblical and older Hebrew texts, additional diacritics can be attached to the base character, including many cantillation marks. The following list shows the other combining characters in the Unicode Hebrew block.

֑␣֒␣֓␣֔␣֕␣֖␣֗␣֘␣֙␣֚␣֛␣֜␣֝␣֞␣֟␣֠␣֡␣֢␣֣␣֤␣֥␣֦␣֧␣֨␣֩␣֪␣֫␣֬␣֭␣֮␣֯␣ֺ␣ֽ␣ֿ␣ׄ␣ׅ␣ׇ

Punctuation

־␣״␣׳

־ [U+05BE HEBREW PUNCTUATION MAQAF] acts as a hyphen in compound words (see hyphens).

״ [U+05F4 HEBREW PUNCTUATION GERSHAYIM] and ׳ [U+05F3 HEBREW PUNCTUATION GERESH] are used to indicate abbreviations (see abbreviations).

Geresh is also used (a) to change the sound of a consonant (see Repertoire extensions), (b) to indicate numbers represented by Hebrew letters, and (c) as a cantillation mark.

Three combining marks are also used to produce different sounds when attached to basic consonants (see Repertoire extensions).

Hebrew also uses western punctuation, and there are a number of other punctuation marks that are only used for liturgical texts.

׀␣׃␣׆

Numbers

Hebrew uses European digits.

For about a thousand years from the 2nd century BC, Hebrew used letters as numbers. Nowadays, they are only used this way for the Hebrew calendar, for school grades, for counter styles, and in religious contexts.

Currency

The denomination is generally expressed by the abbreviation ש״ח, meaning new sheqel, and standing for sheqel ẖadash. [wp]

₪ [U+20AA NEW SHEQEL SIGN] may also be used. It is displayed to the left of the amount, with no separation or with a thin space, eg. ₪12,000. (Wikipedia says that this requires the sheqel sign to be typed after the amount; however, the opposite is the case for all major browsers.) [wp]

Glyph shaping & positioning

Word-final shapes

In Hebrew several characters have a different shape at the end of a word, but each shape variant has its own code point and keyboard key, so there is no need for rendering rules to choose the correct glyph.

This example shows מ [U+05DE HEBREW LETTER MEM] and ם [U+05DD HEBREW LETTER FINAL MEM] (on the left).

מומחים

Two different shapes for mem, depending on position in the word.

Glyph positioning

Multiple diacritics for one base character are common where the various types of diacritic are mixed.

תֹ֙הוּ֙ וָבֹ֔הוּ וְחֹ֖שֶׁךְ עַל־פְּנֵ֣י

Text using a mixture of vowel, consonant, and cantillation diacritic points.

Combinations of ְ [U+05B0 HEBREW POINT SHEVA] with other vowel diacritics are represented by single, non-decomposable code points, eg. ֱ [U+05B1 HEBREW POINT HATAF SEGOL].

In NFC normalised text, a dagesh or shin/sin dot always follows the vowel diacritic. It may be necessary to reorder the diacritics for some applications, eg. for transcriptions that map a consonant+dagesh to a single letter.
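The ordering follows from the canonical combining classes of the marks: the vowel points have lower class values than the dagesh and the shin/sin dots. The Python sketch below shows the effect of NFC on a typed sequence, and one possible way to move the consonant-modifying marks back next to the base letter. The helper `consonant_marks_first` is hypothetical, and its output is no longer in normalised order.

```python
import unicodedata

BET = "\u05D1"
DAGESH = "\u05BC"  # canonical combining class 21
PATAH = "\u05B7"   # canonical combining class 17

# Dagesh and the shin/sin dots modify the consonant rather than the vowel.
CONSONANT_MARKS = {"\u05BC", "\u05C1", "\u05C2"}

def consonant_marks_first(text):
    """Within each run of combining marks, move consonant marks to the front."""
    out, run = [], []

    def flush():
        out.extend(m for m in run if m in CONSONANT_MARKS)
        out.extend(m for m in run if m not in CONSONANT_MARKS)
        run.clear()

    for ch in text:
        if unicodedata.combining(ch):  # non-zero class: a combining mark
            run.append(ch)
        else:
            flush()
            out.append(ch)
    flush()
    return "".join(out)

typed = BET + DAGESH + PATAH
nfc = unicodedata.normalize("NFC", typed)
print([f"U+{ord(c):04X}" for c in nfc])                         # vowel before dagesh
print([f"U+{ord(c):04X}" for c in consonant_marks_first(nfc)])  # dagesh before vowel
```

A transcription tool could apply such a reordering just before mapping consonant+dagesh pairs, and keep the stored text in NFC.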

Structural boundaries & markers

Word boundaries

The concept of 'word' is difficult to define in any language (see What is a word?). Here, a word is a vaguely-defined semantic unit that is typically smaller than a phrase and may comprise one or more syllables.

Words are separated by spaces.

Phrase boundaries

Hebrew uses Latin punctuation for the most part. Full stops, question marks, exclamation marks, and commas are used as in English. There are 6 additional punctuation characters in the Hebrew Unicode block.

Note that the direction of the question mark (?) is the same as in English, and unlike Arabic.

Biblical & liturgical usage. ׀ [U+05C0 HEBREW PUNCTUATION PASEQ] is used as a word separator. [wc]

Prayer books and similar use ׃ [U+05C3 HEBREW PUNCTUATION SOF PASUQ] as a full stop. [wc]

Bracketing

Hebrew uses the same parentheses as English, and uses ( [U+0028 LEFT PARENTHESIS] at the start (right) and ) [U+0029 RIGHT PARENTHESIS] at the end (left). [wc] These are mirrored characters in Unicode, so the glyph for each character is automatically reversed in RTL text.

For example, in (חדשה) the first character is the paren on the right.
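The mirroring property can be inspected from Python's standard `unicodedata` module, as in the sketch below. The helper name `is_mirrored` is made up for this note; `unicodedata.mirrored` itself returns 1 for characters that have the Bidi_Mirrored property and 0 otherwise.

```python
import unicodedata

def is_mirrored(ch):
    """True if the character's glyph is mirrored in right-to-left text."""
    return unicodedata.mirrored(ch) == 1

# Logical order: LEFT PARENTHESIS, four Hebrew letters, RIGHT PARENTHESIS.
text = "(\u05D7\u05D3\u05E9\u05D4)"

for ch in text:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), is_mirrored(ch))
```

The stored text therefore begins with U+0028 even though, in a right-to-left paragraph, that character is drawn on the right with its glyph reversed.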

Hyphens

־ [U+05BE HEBREW PUNCTUATION MAQAF] is the proper punctuation for representing hyphens between compounds, eg. תל־אביב ṫl-ʔvyv Tel Aviv. [wc]

However, it is less common online because it is not always available on keyboards. Therefore, - [U+002D HYPHEN-MINUS] is often substituted, even though the position of that character is too low when displayed. [wc]
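Where text has been typed with the substitute character, a clean-up pass could restore the maqaf. The following Python sketch is one possible approach, not an established tool: it assumes unpointed text and only replaces a hyphen-minus that has a Hebrew letter (U+05D0 to U+05EA) directly on both sides. The function name `hyphen_to_maqaf` is invented for this note.

```python
import re

MAQAF = "\u05BE"

# A hyphen-minus with a Hebrew letter immediately before and after it.
_HYPHEN_BETWEEN_LETTERS = re.compile(r"(?<=[\u05D0-\u05EA])-(?=[\u05D0-\u05EA])")

def hyphen_to_maqaf(text):
    """Replace hyphen-minus between two Hebrew letters with a maqaf."""
    return _HYPHEN_BETWEEN_LETTERS.sub(MAQAF, text)

# 'Tel Aviv' typed with an ASCII hyphen.
typed = "\u05EA\u05DC-\u05D0\u05D1\u05D9\u05D1"
print(hyphen_to_maqaf(typed) == "\u05EA\u05DC\u05BE\u05D0\u05D1\u05D9\u05D1")
```

Hyphens between digits or Latin letters are left alone. Pointed text, where a combining mark sits between the letter and the hyphen, would need a broader pattern.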

The Unicode Standard indicates that lines should not break on either side of the maqaf. [g]

In the Bible, maqaf is primarily associated with cantillation marks and indicates a combination of 2 or more words that are pronounced in one breath. [g]

Acronyms & abbreviations

Acronyms are indicated by placing ״ [U+05F4 HEBREW PUNCTUATION GERSHAYIM] before the last character, eg. ר״ת‬.

׳ [U+05F3 HEBREW PUNCTUATION GERESH] is used to indicate an abbreviation, eg. גְּבֶרֶת g͓̣vɛʁɛṫ Mrs. is abbreviated as גב׳ gv´. [wg]

Due to keyboard inadequacies, these are often replaced by ASCII single and double quote characters, even though in general they are visually too high. 
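A clean-up step could swap the ASCII substitutes back to the Hebrew punctuation. The Python sketch below is a simple heuristic under stated assumptions: a double quote between two Hebrew letters is taken to be a gershayim, and an apostrophe directly after a Hebrew letter is taken to be a geresh. The name `restore_abbreviation_marks` is hypothetical, and a closing single quote after a Hebrew word would be converted wrongly.

```python
import re

GERESH = "\u05F3"
GERSHAYIM = "\u05F4"
HEBREW_LETTER = "[\u05D0-\u05EA]"

# ASCII double quote with a Hebrew letter directly on both sides.
_DOUBLE = re.compile(f'(?<={HEBREW_LETTER})"(?={HEBREW_LETTER})')
# ASCII apostrophe directly after a Hebrew letter.
_SINGLE = re.compile(f"(?<={HEBREW_LETTER})'")

def restore_abbreviation_marks(text):
    """Replace ASCII quote substitutes with gershayim and geresh."""
    text = _DOUBLE.sub(GERSHAYIM, text)
    return _SINGLE.sub(GERESH, text)

# Resh, ASCII double quote, tav.
print(restore_abbreviation_marks('\u05E8"\u05EA') == "\u05E8\u05F4\u05EA")
```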

Quotations

Modern quotation marks use ” [U+201D RIGHT DOUBLE QUOTATION MARK] at the start and “ [U+201C LEFT DOUBLE QUOTATION MARK] at the end, eg. ”ישראל“ Israel. Note that the start and end characters are the opposite way around from the use in English. [wc]

Up to around 1970 Hebrew used „ [U+201E DOUBLE LOW-9 QUOTATION MARK] instead for the initial quotation mark, ie. „ישראל“, but this changed due to inadequate keyboard designs. [wc]

Line & paragraph layout

Text alignment & justification

The following sample can be used to see how a browser justifies Hebrew text.

הכל שווים לפני החוק וזכאים ללא הפליה להגנה שווה של החוק. הכל זכאים להגנה שווה מפני כל הפליה המפירה את מצוות ההכרזה הזאת ומפני כל הסתה להפליה כזו.

References

  1. [d] Peter T. Daniels and William Bright, The World's Writing Systems, Oxford University Press, ISBN 0-19-507993-0, pp. 487-497
  2. [w] Wikipedia, Hebrew alphabet
  3. [wc] Wikipedia, Cantillation
  4. [wp] Wikipedia, Hebrew punctuation
  5. [wg] Wikipedia, Geresh
  6. [u] The Unicode Standard v11.0
Last changed 2019-04-12 8:50 GMT.  •  Licence CC-By © r12a.