Hebrew (draft)
Hebrew

Updated 6 January, 2023

This page brings together basic information about the Hebrew script and its use for the Hebrew language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Hebrew using Unicode.

Sample


סעיף א. כל בני אדם נולדו בני חורין ושווים בערכם ובזכויותיהם. כולם חוננו בתבונה ובמצפון, לפיכך חובה עליהם לנהוג איש ברעהו ברוח של אחוה.

סעיף ב. כל אדם זכאי לזכויות ולחרויות שנקבעו בהכרזה זו ללא הפליה כלשהיא מטעמי גזע, צבע, מין, לשון, דת, דעה פוליטית או דעה בבעיות אחרות, בגלל מוצא לאומי או חברתי, קנין, לידה או מעמד אחר. גדולה מזו, לא יופלה אדם על פי מעמדה המדיני, על פי סמכותה או על פי מעמדה הבינלאומי של המדינה או הארץ שאליה הוא שייך, בין שהארץ היא עצמאית, ובין שהיא נתונה לנאמנות, בין שהיא נטולת שלטון עצמי ובין שריבונותה מוגבלת כל הגבלה אחרת.

Usage & history

The Hebrew script is widely used by the Jewish community and is used to write modern Hebrew in Israel. It is the script used for Jewish sacred texts. It is also used for a number of other languages, including Samaritan, Yiddish, and Judeo-Arabic.

אָלֶף־בֵּית עִבְרִי alefbet ivri Hebrew alphabet

Before the Jewish exile in Babylon, Hebrew was written using a Paleo-Hebrew script that resembles the Samaritan alphabet. The current script, known as 'square', or 'block' script, derives from Aramaic writing. It is generally referred to as the Ashuri (Assyrian) script, although there are a few alternate writing styles. It dates from the 5th century BCE.

Sources Scriptsource and Wikipedia.

Basic features

Hebrew is an abjad. This means that in normal use the script represents only consonants. This approach is helped by the strong emphasis on consonant patterns in Semitic languages. See the table to the right for a brief overview of features for the modern Hebrew orthography.

Hebrew text runs right-to-left in horizontal lines, but numbers and embedded Latin text are read left-to-right.

There is no case distinction.

Words are separated by spaces.

The Modern Israeli Hebrew alphabet has 22 letters, plus 5 word-final letters that have their own code points. Additional sounds can be represented using dagesh, shin/sin dots, or geresh.

The script hides short vowels; however, these and other phonetic information can be written using diacritics (points) where needed to resolve ambiguity or for educational purposes. There are 11 vowel diacritics. Vowel locations can be marked by 4 matres lectionis (consonants indicating vowel locations), which also take diacritics in vowelled text.

In vowelled text, there is a diacritic to indicate the absence of a vowel in consonant clusters.

Modern Hebrew uses European digits and, for the most part, ASCII punctuation marks.

Character index

Letters


Consonants

א␣ע␣ט␣ת␣ד␣ק␣ג␣צ␣פ␣ב␣ו␣ש␣ס␣ז␣ח␣כ␣ר␣ה␣מ␣נ␣ל␣י␣ץ␣ף␣ך␣ם␣ן

Combining marks


Vowels

ִ␣ֻ␣ֵ␣ֶ␣ֱ␣ֹ␣ֳ␣ְ␣ָ␣ַ␣ֲ

Other

ּ␣ׁ␣ׂ

Punctuation

־␣״␣׳␣’␣”

ASCII

(␣)␣,␣-␣.␣:␣;␣?␣!

Symbols


Other

⁧␣‫␣⁦␣‪␣⁨␣⁩␣‬␣‏␣‎

Phonology

These are phonemes of Israeli Hebrew.


Phones in a lighter colour are non-native or allophones. Source Wikipedia.

Notes on phonology

Modern Israeli Hebrew was born from speakers who brought their own accents and pronunciations from different parts of the world. There are still variations in pronunciation, but two main types predominate today: Oriental and Occidental. Oriental Hebrew was chosen as the preferred accent for Israel by the Academy of the Hebrew Language, but has since declined in popularity. Age is often a factor in individual pronunciation.wp

In particular, there are alternative pronunciations for x~ħ, ʁ~r, ʔ~ʕ. In this document we use the left-hand side of each of these pairings.

Younger speakers also tend to make all consonants in a cluster voiced or unvoiced, depending on the last consonant, eg. לִסְגֹּר lisᵊgoˑʁ lis'ɡoʁ to close becomes liz'ɡoʁ, and אַבְטָחָה ʔavᵊtāxāh avta'xa security becomes afta'xa.

For more details, see: Wikipedia.

Vowel sounds

Plain vowels

i u e o ə ə a

Diphthongs

ej aj

See also the phonology_notes.

Consonant sounds

labial dental alveolar post-alveolar palatal velar uvular glottal
stop p b t d       k ɡ   ʔ
affricate   t͡s   t͡ʃ d͡ʒ        
fricative f v   s z ʃ ʒ   x ʁ h
nasal m   n        
approximant w   l   j    
trill/flap     r ɾ    

x is sometimes described as χ, and ʁ as r. For more variants see phonology_notes.

Final -h is rarely pronounced in modern Hebrew.wp,#Loss_of_final_H_consonant

Vowels

Hebrew has diacritics that can be used to express short vowel sounds, but rarely uses them in normal text. Hebrew readers are usually able to understand the pronunciation from the context and the regular structure of Hebrew words.

Certain consonant letters, referred to as matres lectionis, may indicate the location of vowels in pointed and unpointed text.

Vowel sounds to characters

This section maps Hebrew vowel sounds to common graphemes in the Hebrew orthography, in pointed text.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

Plain vowels

i
 

ִ   [U+05B4 HEBREW POINT HIRIQ], eg. ראשון.

ִי   [U+05B4 HEBREW POINT HIRIQ + U+05D9 HEBREW LETTER YOD], eg. נין.

u
 

וּ [U+05D5 HEBREW LETTER VAV + U+05BC HEBREW POINT DAGESH OR MAPIQ], eg. מוּם

ֻ   [U+05BB HEBREW POINT QUBUTS], eg. כְּתֻמִּים.

e
 

ֶ   [U+05B6 HEBREW POINT SEGOL], eg. נֵבֶל.

ֵ   [U+05B5 HEBREW POINT TSERE], eg. נבל.

ְ   [U+05B0 HEBREW POINT SHEVA], but only in certain circumstances, eg. נמלים. For details of usage in modern Israeli, see Wikipedia.

o
 

ֹ   [U+05B9 HEBREW POINT HOLAM], eg. פֹּה.

וֹ [U+05D5 HEBREW LETTER VAV + U+05B9 HEBREW POINT HOLAM], eg. סוף.

ָ   [U+05B8 HEBREW POINT QAMATS], eg. שָׁנָה.

ə
 

Any of the five short vowels may be realized as a schwa when far from lexical stress.wp,#Vowels

a
 

ַ   [U+05B7 HEBREW POINT PATAH], eg. תּן.

ָ   [U+05B8 HEBREW POINT QAMATS], eg. שנה

Diphthongs

Matres lectionis

Hebrew uses the following consonant letters to indicate the location of a vowel.

א␣ע␣ו␣י

The first two are silent vowel supports, whereas the second two are considered to be part of the vowel.

There is a trend in Modern Hebrew towards the use of matres lectionis to indicate vowels that have traditionally gone unwritten, a practice known as full spelling.ws

Niqqud points

A series of points, known as niqqud, can be used to give precision about vowel sounds. They are rarely used outside of educational, children's, and religious texts, or for foreign or ambiguous words.

אָלֶף־בֵּית עִבְרִי
'Hebrew alphabet', alef-bet ivri, spelled out using diacritic points.

These are the niqqud used for modern Hebrew.

ִ␣ֻ␣ֵ␣ֶ␣ֱ␣ֹ␣ֳ␣ְ␣ָ␣ַ␣ֲ

Redundancy arises because the modern orthography retains alternative points that in the past expressed length differences. Modern Israeli Hebrew pronunciation ignores phonetic length.
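Because niqqud are separate combining characters, pointed text can be reduced to unpointed text by dropping them. The following is a minimal sketch of that idea, not a definitive tool. The function name `strip_niqqud` and the exact set of code points removed are assumptions made for illustration; the set also includes the dagesh and the shin/sin dots, and deliberately leaves the maqaf and the cantillation marks alone.

```python
# Points removed in this sketch (an assumption, adjust as needed):
# U+05B0..U+05BC (vowel points and dagesh), U+05C1/U+05C2 (shin and sin dots)
# and U+05C7 (qamats qatan).
NIQQUD = {chr(cp) for cp in range(0x05B0, 0x05BD)} | {"\u05C1", "\u05C2", "\u05C7"}


def strip_niqqud(text: str) -> str:
    """Return text with the points listed in NIQQUD removed."""
    return "".join(ch for ch in text if ch not in NIQQUD)


pointed = "אָלֶף־בֵּית עִבְרִי"
print(strip_niqqud(pointed))
```

Note that the result is simply the consonant skeleton of the pointed spelling. It will not necessarily match the full spelling a writer would choose for unpointed text, since that may add matres lectionis.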

ℹ

Three of the above code points have glyphs that combine ְ [U+05B0 HEBREW POINT SHEVA] (sh'va) and another point (used to indicate shortened lengths in older Hebrew). A single Unicode code point (that doesn't decompose during normalisation) is used for each of these combinations. Authors should not attach multiple vowel code points to a single consonant letter.

Vowel absence

In pointed text, ְ   [U+05B0 HEBREW POINT SHEVA] may be used to express an absence of vowel between two consonants. However, in various other contexts this sh'va is pronounced.

Standalone vowels

Word-initial vowels that are not preceded by a consonant sound are represented by, or written in conjunction with, א [U+05D0 HEBREW LETTER ALEF] or ע [U+05E2 HEBREW LETTER AYIN].

Consonants

Consonant sounds to characters

This section maps Hebrew consonant sounds to common graphemes in the Hebrew orthography.

Sounds listed as 'infrequent' are allophones, or sounds used for foreign words, etc.

Stops

p
 
t
 

ט [U+05D8 HEBREW LETTER TET], eg. קט.

ת [U+05EA HEBREW LETTER TAV], eg. תות

תּ [U+05EA HEBREW LETTER TAV + U+05BC HEBREW POINT DAGESH OR MAPIQ]. Archaic spelling, still found sometimes in pointed text. (see previous example).

d
 

ד [U+05D3 HEBREW LETTER DALET], eg. דוד.

דּ [U+05D3 HEBREW LETTER DALET + U+05BC HEBREW POINT DAGESH OR MAPIQ]. Archaic spelling, still found sometimes in pointed text.

k
 

ק [U+05E7 HEBREW LETTER QOF], eg. קול

כ [U+05DB HEBREW LETTER KAF], eg. הכה.

כּ [U+05DB HEBREW LETTER KAF + U+05BC HEBREW POINT DAGESH OR MAPIQ] in pointed text, eg. הִכָּה

ךּ [U+05DA HEBREW LETTER FINAL KAF + U+05BC HEBREW POINT DAGESH OR MAPIQ], eg. ממּךּ. Rare, final form.

ɡ
 

ג [U+05D2 HEBREW LETTER GIMEL], eg. גג.

גּ [U+05D2 HEBREW LETTER GIMEL + U+05BC HEBREW POINT DAGESH OR MAPIQ]. Archaic spelling, still sometimes used in pointed text (see previous example).

ʔ
 

ע [U+05E2 HEBREW LETTER AYIN], eg. מועיל.

א [U+05D0 HEBREW LETTER ALEF], eg. שאל.

Affricates

t͡s
 

צ [U+05E6 HEBREW LETTER TSADI], eg. ציץ.

ץ [U+05E5 HEBREW LETTER FINAL TSADI]. Word-final form.

t͡ʃ
 

צ׳ [U+05E6 HEBREW LETTER TSADI + U+05F3 HEBREW PUNCTUATION GERESH], eg. ריצ׳רץ׳. Used in loanwords and slang.

ץ׳ [U+05E5 HEBREW LETTER FINAL TSADI + U+05F3 HEBREW PUNCTUATION GERESH]. Final form. 

d͡ʒ
 

ג׳ [U+05D2 HEBREW LETTER GIMEL + U+05F3 HEBREW PUNCTUATION GERESH], eg. ג׳וּק. Used in loanwords and slang.

Fricatives

f
 

פ [U+05E4 HEBREW LETTER PE], eg. פיספס.

ף [U+05E3 HEBREW LETTER FINAL PE] in word-final position, eg. כנף.

v
 

ב [U+05D1 HEBREW LETTER BET], eg. טוב.

ו [U+05D5 HEBREW LETTER VAV], eg. וו

θ
 

Foreign sound.

ת׳ [U+05EA HEBREW LETTER TAV + U+05F3 HEBREW PUNCTUATION GERESH], eg. ת׳רסטון.

ð
 

Foreign sound.

ד׳ [U+05D3 HEBREW LETTER DALET + U+05F3 HEBREW PUNCTUATION GERESH], eg. ד׳ו אל-חיג׳ה.

s
 

ס [U+05E1 HEBREW LETTER SAMEKH], eg. סוף.

ש [U+05E9 HEBREW LETTER SHIN], eg. שם.

שׂ [U+05E9 HEBREW LETTER SHIN + U+05C2 HEBREW POINT SIN DOT], eg. שָׂם. Explicit form, used in pointed text to distinguish from ʃ.

z
 

ז [U+05D6 HEBREW LETTER ZAYIN], eg. זה.

ʃ
 

ש [U+05E9 HEBREW LETTER SHIN], eg. שם.

שׁ [U+05E9 HEBREW LETTER SHIN + U+05C1 HEBREW POINT SHIN DOT], eg. שָׁם. Explicit form, used in pointed text to distinguish from s.

ʒ
 

ז׳ [U+05D6 HEBREW LETTER ZAYIN + U+05F3 HEBREW PUNCTUATION GERESH], eg. ז׳רגון. Used in loan words & slang.

χ
 

כ [U+05DB HEBREW LETTER KAF], eg. סכך.

ך [U+05DA HEBREW LETTER FINAL KAF] word finally.

ח [U+05D7 HEBREW LETTER HET], eg. חם.

ח׳ [U+05D7 HEBREW LETTER HET + U+05F3 HEBREW PUNCTUATION GERESH], eg. שייח׳. Used to indicate that this sound should be used rather than h in non-Hebrew (esp. Arabic) text. 

ʁ
 

ר [U+05E8 HEBREW LETTER RESH], eg. עיר.

ר׳ [U+05E8 HEBREW LETTER RESH + U+05F3 HEBREW PUNCTUATION GERESH], eg. ר׳ג׳ר. Explicitly indicates the sound for Arabic transliteration.

h
 

ה [U+05D4 HEBREW LETTER HE], eg. הד.

Nasals

m
 

מ [U+05DE HEBREW LETTER MEM], eg. מוּם.

ם [U+05DD HEBREW LETTER FINAL MEM]. Word-final form.

n
 

נ [U+05E0 HEBREW LETTER NUN], eg. נין.

ן [U+05DF HEBREW LETTER FINAL NUN]. Word-final form.

ŋɡ
 

Other

w
 

וו [U+05D5 HEBREW LETTER VAV + U+05D5 HEBREW LETTER VAV], eg. אוטווה. Non-standard orthography.

ו׳ [U+05D5 HEBREW LETTER VAV + U+05F3 HEBREW PUNCTUATION GERESH] Non-standard orthography (not common), eg. ו׳יליאם.

l
 

ל [U+05DC HEBREW LETTER LAMED], eg. לי.

j
 

י [U+05D9 HEBREW LETTER YOD], eg. ים.

Basic consonants

These are the basic consonant letters used in modern Hebrew.

א␣ע
ט␣ת␣ד␣ק␣ג
צ
פ␣ב␣ו␣ש␣ס␣ז␣ח␣כ␣ר␣ה
מ␣נ
ל␣י

Word-final shapes

Five letters have special word-final forms, called sofit. They are encoded as separate code points in Unicode, and appear as separate keys on a keyboard, so no special processing is needed to display or store them (unlike Arabic).

ץ␣ף␣ך␣ם␣ן

Foreign words and names may sometimes use the normal forms at the end of a word, rather than the sofit form. In those cases, use the non-final code points.
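Since the final forms are ordinary code points, software only needs to care about them when comparing or searching text. The sketch below builds the mapping between the five normal letters and their sofit forms from the Unicode character names, and shows one possible use: folding final forms to normal forms before comparison. The names `TO_FINAL`, `FROM_FINAL` and `fold_final_forms` are hypothetical, and whether such folding is desirable depends on the application.

```python
import unicodedata

# The five letters that have a word-final (sofit) form.
BASE_NAMES = ["KAF", "MEM", "NUN", "PE", "TSADI"]

# Map each normal letter to its final form using the Unicode character names.
TO_FINAL = {
    unicodedata.lookup(f"HEBREW LETTER {name}"): unicodedata.lookup(f"HEBREW LETTER FINAL {name}")
    for name in BASE_NAMES
}
FROM_FINAL = {final: normal for normal, final in TO_FINAL.items()}


def fold_final_forms(text: str) -> str:
    """Replace every sofit letter with its normal form (eg. for loose matching)."""
    return "".join(FROM_FINAL.get(ch, ch) for ch in text)


for normal, final in TO_FINAL.items():
    print(f"U+{ord(normal):04X} {normal} -> U+{ord(final):04X} {final}")
```

Going in the other direction (choosing the final form automatically) is deliberately not attempted here, because foreign words and names may legitimately end in a non-final form.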

Matres lectionis

Four of the letters can also represent vowel locations. See matres.

Repertoire extensions

Methods used to modify the sound of a consonant.

Dagesh

ּ [U+05BC HEBREW POINT DAGESH OR MAPIQ] is used in pointed text with 3 consonant letters (and one final form) to indicate that they map to 'hard' sounds. This is similar to the distinction made in Syriac. Dagesh is the only diacritic to appear inside a consonant. Below, the hard sounds are shown to the left, and the normal to the right.

פּ␣בּ␣כּ␣ךּ␣ ␣פ␣ב␣כ␣ך

Dagesh can also be found alongside other letters, without any sound change, due to preservation of archaic spelling. The pairs tθ, dð and ɡɣ were lost over time, leaving:

תּ␣דּ␣גּ

Shin & sin dots

The two phonemes s and ʃ are represented by a single consonant letter, ש [U+05E9 HEBREW LETTER SHIN]. If it is necessary to indicate which is intended, two diacritics, used only with this character, do the job: ׂ [U+05C2 HEBREW POINT SIN DOT] and ׁ [U+05C1 HEBREW POINT SHIN DOT]. They look identical apart from the side on which they are positioned.

שׁ␣שׂ

Geresh

Other consonants are extended to non-native sounds by use of a following ׳ [U+05F3 HEBREW PUNCTUATION GERESH].

This first set is used in loanwords and slang that are part of the everyday Hebrew colloquial vocabulary.ws,#Sounds_represented_with_diacritic_geresh

ג׳␣ז׳␣צ׳␣ו׳␣וו

The graphemes ו׳ and וו are alternative ways of writing the same thing.

A second set is only used to transliterate foreign sounds, especially Arabic.ws,#Sounds_represented_with_diacritic_geresh

ד׳␣ת׳␣ח׳␣ר׳␣ע׳

Cantillation marks

In Biblical and older Hebrew texts, many additional diacritics are attached to the base character alongside the niqqud. Nearly all of the following additional marks in the Hebrew Unicode block are cantillation marks, used to indicate how to chant ritual readings from the Hebrew Bible in synagogue services.

֑␣֒␣֓␣֔␣֕␣֖␣֗␣֘␣֙␣֚␣֛␣֜␣֝␣֞␣֟␣֠␣֡␣֢␣֣␣֤␣֥␣֦␣֧␣֨␣֩␣֪␣֫␣֬␣֭␣֮␣֯␣ֺ␣ֽ␣ֿ␣ׄ␣ׅ␣ׇ

Numbers

Hebrew uses European digits.

For about a thousand years from the 2nd century BC, Hebrew used letters as numbers. Nowadays, they are only used this way for the Hebrew calendar, for school grades, for counter styles, and in religious contexts.

Currency

The denomination is generally expressed by the following abbreviation, which stands for שקל חדש: ש״ח.whp

₪ [U+20AA NEW SHEQEL SIGN] may also be used. It is displayed to the left of the amount, with no separation or with a thin space, eg. ₪12,000. (Wikipedia says that this requires the sheqel sign to be typed after the amount; however, the opposite is the case for all major browsers.)whp

Text direction

Hebrew script is written right-to-left in the main, but as with all RTL scripts, numbers and embedded LTR script text are written left-to-right (bidirectional text). In the following example, the Hebrew words are read right-to-left, starting with the one on the right, and the numeric expression ("10-12") is read left-to-right, ie. it starts with 10 and ends with 12. (Note that this is unlike Arabic, where the 10 and 12 would be in opposite positions.)

התאריכים 10-12 במרץ

Bidirectional Hebrew text.

The Unicode Bidirectional Algorithm automatically takes care of the ordering for all the text in the dates example above, as long as the 'base direction' is set to RTL. In HTML this can be set using the dir attribute, or in plain text using formatting controls.
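The algorithm works from the Bidi_Class property of each character. As a rough illustration, the following sketch lists the classes for the characters in the dates example, using Python's standard `unicodedata` module. The helper name `bidi_classes` is hypothetical. Hebrew letters have class R, whereas Arabic letters have class AL; as far as we understand the algorithm, that difference is what leads to the different ordering of number ranges in Arabic text.

```python
import unicodedata


def bidi_classes(text: str) -> list[tuple[str, str]]:
    """Return (character, Bidi_Class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


sample = "התאריכים 10-12 במרץ"
for ch, bidi_class in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {bidi_class}")

# For comparison: a Hebrew letter and an Arabic letter.
print(unicodedata.bidirectional("\u05D0"), unicodedata.bidirectional("\u0627"))
```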

If the base direction is not set appropriately, the directional runs will be ordered incorrectly, as shown in the second line of the example below, and can become unreadable.

ב־HTML5 זה מתבצע על ידי הוספת אלמנט ה־inline bdo.

ב־HTML5 זה מתבצע על ידי הוספת אלמנט ה־inline bdo.

The exact same sequence of characters with the base direction set to RTL (top), and with no base direction set on this LTR page (bottom).


For more information about how directionality and base direction work, see Unicode Bidirectional Algorithm basics. For information about plain text formatting characters see How to use Unicode controls for bidi text. And for working with markup in HTML, see Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts.

On this page, see also expressions and linebreak for additional features related to direction.

Managing text direction

Unicode provides a set of 10 formatting characters that can be used to control the direction of text when displayed. These characters have no visual form in the rendered text, however text editing applications may have a way to show their location.

RLE [U+202B RIGHT-TO-LEFT EMBEDDING] (RLE), LRE [U+202A LEFT-TO-RIGHT EMBEDDING] (LRE), and PDF [U+202C POP DIRECTIONAL FORMATTING] (PDF) are in widespread use to set the base direction of a range of characters. RLE/LRE come at the start, and PDF at the end of a range of characters for which the base direction is to be set.

More recently, the Unicode Standard added a set of characters which do the same thing but also isolate the content from surrounding characters, in order to avoid spillover effects. They are RLI [U+2067 RIGHT-TO-LEFT ISOLATE] (RLI), LRI [U+2066 LEFT-TO-RIGHT ISOLATE] (LRI), and PDI [U+2069 POP DIRECTIONAL ISOLATE] (PDI). The Unicode Standard recommends that these be used instead.

There is also FSI [U+2068 FIRST STRONG ISOLATE] (FSI), used initially to set the base direction according to the first recognised strongly-directional character.

RLM [U+200F RIGHT-TO-LEFT MARK] (RLM) and LRM [U+200E LEFT-TO-RIGHT MARK] (LRM) are invisible characters with strong directional properties that are also sometimes used to produce the correct ordering of text.

For more information about how to use these formatting characters see How to use Unicode controls for bidi text. Note, however, that when writing HTML you should generally use markup rather than these control codes. For information about that, see Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts.
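For plain text (where markup is not available), the isolate controls described above can be wrapped around a string before it is inserted into other text. This is a minimal sketch under that assumption; the function name `isolate` and its `direction` parameter are hypothetical, and only the code points come from the Unicode Standard.

```python
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


def isolate(text: str, direction: str = "auto") -> str:
    """Wrap text in an isolate so it cannot affect the ordering of its surroundings.

    direction is "rtl", "ltr", or "auto" (base direction taken from the
    first strong character, via FSI).
    """
    opener = {"rtl": RLI, "ltr": LRI, "auto": FSI}[direction]
    return f"{opener}{text}{PDI}"


# Insert a Hebrew phrase containing a number range into an English sentence.
message = "Dates: " + isolate("התאריכים 10-12 במרץ", "rtl") + " (confirmed)"
print([f"U+{ord(ch):04X}" for ch in message if ch in (RLI, PDI)])
```

The controls are invisible, so the example prints their code points rather than the string itself.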

Expressions & sequences

A sequence of numbers, for example a range separated by hyphens, generally runs left to right in Hebrew (unlike Arabic).

Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Hebrew character app.

The Hebrew script is not usually cursive (ie. joined up) when printed.

The script makes no case distinctions and needs no transforms to convert between code points.

Font styles

Hebrew has a number of different writing styles.

The standard, 'square script' is derived from Aramaic. There are serif and sans-serif fonts.

 

Serif (top) and sans (bottom) examples of the standard writing style.

The STAM style is used for sacred texts such as the Torah. Certain letters have decorative tags above.s

Text written in the STAM writing style.

The rashi style is used for commentaries on sacred texts. Letters have a more rounded, almost cursive style.s

Text written in the rashi writing style.

Hebrew also has a 'cursive' style, which means 'handwriting' style. Letters are not normally joined. Cursive fonts are only used as display fonts. Many glyphs look very different from the standard letter forms.

Text in Yoav Cursive font.
Text written in the 'cursive' writing style.

Before the Babylonian exile (from which the square script derives), Hebrew was written with different shapes, which are similar to those used for Samaritan.

Context-based shaping & positioning

In Hebrew several characters have a different shape at the end of a word, but each shape variant has its own code point and keyboard key, so there is no need for rendering rules to choose the correct glyph.

This example shows מ [U+05DE HEBREW LETTER MEM] and ם [U+05DD HEBREW LETTER FINAL MEM] (on the left).

Two different shapes for mem, depending on position in the word.

Multiple diacritics for one base character are common where the various types of diacritic are mixed.

Text using a mixture of vowel, consonant, and cantillation diacritic points.

Combinations of ְ [U+05B0 HEBREW POINT SHEVA] with other vowel diacritics are represented by single, non-decomposable code points, eg. ֱ   [U+05B1 HEBREW POINT HATAF SEGOL].

In NFC normalised text, a dagesh or shin/sin dot always follows the vowel diacritic. It may be necessary to reorder the diacritics for some applications, eg. for transcriptions that map a consonant+dagesh to a single letter.
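The ordering follows from the canonical combining classes of the marks, and can be checked with Python's `unicodedata` module. The sketch below first shows NFC moving a dagesh after a vowel point, then reorders the marks so that dagesh and shin/sin dots come directly after the base letter. The function name `consonant_marks_first` is hypothetical, and its output is intentionally no longer in NFC, so it should only be used for internal processing.

```python
import unicodedata

BET, DAGESH, QAMATS = "\u05D1", "\u05BC", "\u05B8"

# Typed as letter + dagesh + vowel; NFC reorders to letter + vowel + dagesh.
typed = BET + DAGESH + QAMATS
nfc = unicodedata.normalize("NFC", typed)
print([f"U+{ord(ch):04X}" for ch in nfc])

# The hataf points are single code points with no decomposition.
print(repr(unicodedata.decomposition("\u05B1")))

# Marks that modify the consonant rather than indicate a vowel.
CONSONANT_MARKS = {"\u05BC", "\u05C1", "\u05C2"}


def consonant_marks_first(text: str) -> str:
    """Within each run of combining marks, move dagesh and shin/sin dots to the front."""
    out: list[str] = []
    marks: list[str] = []

    def flush() -> None:
        # sorted() is stable, so the other marks keep their relative order.
        out.extend(sorted(marks, key=lambda m: m not in CONSONANT_MARKS))
        marks.clear()

    for ch in text:
        if unicodedata.combining(ch):
            marks.append(ch)
        else:
            flush()
            out.append(ch)
    flush()
    return "".join(out)
```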

The diacritic ֹ [U+05B9 HEBREW POINT HOLAM] illustrates how positioning can be context-sensitive. The following line shows 3 examples.

תֹּאַר תֹּאר שֹׂבַע
Three slightly different placements of ֹ [U+05B9 HEBREW POINT HOLAM], depending on the surrounding context.

Font styling & weight

Bold text is used as one way to highlight or emphasise text. The degree of bolding is often quite light. Bold-italic is typically only used for large display text.l

Italics may also be used, however its use is not abundant, and many of the italic faces in fonts are designed for display use, rather than to accompany a regular font.l

There are different preferences for the direction of the slant for italicised Hebrew text. The choice as to which is preferred appears to be down to the individual, and is a question of whether the slant matches the direction of the Hebrew text, or embedded Latin text.l

עברית
Example of forward-leaning italics (bottom).

Graphemes

Grapheme clusters

Hebrew typographic units consist of base characters, optionally followed by one or more combining marks. Unicode grapheme clusters can be applied to Hebrew without problems. There are no special issues related to operations that use grapheme clusters as their basic unit of text.
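As an illustration of the base-plus-marks structure, the sketch below groups each base character with the combining marks that follow it. This is a simplified approximation that happens to be adequate for the Hebrew repertoire described here; it is not an implementation of the full Unicode grapheme cluster rules, and the function name `simple_clusters` is hypothetical.

```python
import unicodedata


def simple_clusters(text: str) -> list[str]:
    """Group each base character with any combining marks that follow it."""
    clusters: list[str] = []
    for ch in text:
        # General categories Mn, Mc and Me are combining marks.
        if clusters and unicodedata.category(ch) in ("Mn", "Mc", "Me"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


word = "\u05E9\u05C1\u05B8\u05DC\u05D5\u05B9\u05DD"  # pointed spelling of 'shalom'
print(len(word), len(simple_clusters(word)))
```

The word contains 7 code points but only 4 user-perceived units, which is the count that operations such as cursor movement or forward deletion would normally be expected to use.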

Punctuation & inline features

Word boundaries

Words are separated by spaces.

Hyphens. ־ [U+05BE HEBREW PUNCTUATION MAQAF] is the proper punctuation for representing hyphens between compounds,wc eg. תל־אביב

However, it is less common online because it is not always easily available on keyboards. Therefore, - [U+002D HYPHEN-MINUS] is often substituted, even though the position of that character is too low when displayed.wc

The Unicode Standard indicates that lines should not break on either side of the maqaf.g

In the Bible, maqaf is primarily associated with cantillation marks and indicates a combination of 2 or more words that are pronounced in one breath.g

Phrase & section boundaries

,␣;␣:␣.␣?␣!

Hebrew uses ASCII punctuation for the most part. Full stops, question marks, exclamation marks, and commas are used as in English. There are 6 additional punctuation characters in the Hebrew Unicode block.

phrase

, [U+002C COMMA]

; [U+003B SEMICOLON]

: [U+003A COLON]

sentence

. [U+002E FULL STOP]

? [U+003F QUESTION MARK]

! [U+0021 EXCLAMATION MARK]

Note that the direction of the question mark (?) is the same as in English, and unlike Arabic. The same is true for the comma ( , ).

Biblical & liturgical usage. ׀ [U+05C0 HEBREW PUNCTUATION PASEQ] is used as a word separator.wc

Prayer books and similar use ׃ [U+05C3 HEBREW PUNCTUATION SOF PASUQ] as a full stop.wc

Bracketed text

(␣)

Hebrew commonly uses ASCII parentheses to insert parenthetical information into text.

standard: start ( [U+0028 LEFT PARENTHESIS], end ) [U+0029 RIGHT PARENTHESIS]

Hebrew uses the same parentheses as English, and uses ( [U+0028 LEFT PARENTHESIS] at the start (right) and ) [U+0029 RIGHT PARENTHESIS] at the end (left).wc These are mirrored characters in Unicode, so the glyph for each character is automatically reversed in RTL text.

For example, consider the following: (חדשה)

The first character in memory is the paren on the right. The consequence of this is that, when writing Hebrew, the parentheses should be used as if they were named U+0028 START PARENTHESIS and U+0029 END PARENTHESIS, respectively.
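This can be checked by inspecting the stored characters rather than the displayed glyphs. The following sketch uses Python's `unicodedata` module to show that the first character in memory is U+0028, and that both parentheses carry the Bidi_Mirrored property. The variable names are for illustration only.

```python
import unicodedata

# The example word wrapped in parentheses, in logical (memory) order.
text = "(" + "\u05D7\u05D3\u05E9\u05D4" + ")"

first, last = text[0], text[-1]
print(f"U+{ord(first):04X}", unicodedata.name(first))
print(f"U+{ord(last):04X}", unicodedata.name(last))

# mirrored() returns 1 for characters whose glyph is flipped in RTL runs.
print(unicodedata.mirrored(first), unicodedata.mirrored(last))
```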

Quotations & citations

”␣”␣’

Hebrew texts use quotation marks around quotations. Of course, due to keyboard design, quotations may also be surrounded by ASCII double and single quote marks. Note, however, that these are not paired.

initial: start ” [U+201D RIGHT DOUBLE QUOTATION MARK], end ” [U+201D RIGHT DOUBLE QUOTATION MARK]
nested: start ’ [U+2019 RIGHT SINGLE QUOTATION MARK], end ’ [U+2019 RIGHT SINGLE QUOTATION MARK]

In principle, for modern quotations, ” [U+201D RIGHT DOUBLE QUOTATION MARK] is used at the start and at the end, eg. ”ישראל” Israel. Nested quotations use different quote marks, which would typically be ’ [U+2019 RIGHT SINGLE QUOTATION MARK].a,#target-4931

However, in practice, Hebrew texts often use " [U+0022 QUOTATION MARK] and ' [U+0027 APOSTROPHE].

Up to around 1970 Hebrew used „ [U+201E DOUBLE LOW-9 QUOTATION MARK] instead for the initial quotation mark, ie. „ישראל”, but this changed, partly due to inadequate keyboard designs.whp,#Quotation_marks

Emphasis

Increased tracking is a common way to express emphasis in Hebrew.

The last part of this text is stretched to show emphasis.

Alternatives include the use of a different typeface, and/or underlining.l

Abbreviation, ellipsis & repetition

״␣׳

Acronyms are indicated by placing ״ [U+05F4 HEBREW PUNCTUATION GERSHAYIM] before the last character, eg. ד״ר

׳ [U+05F3 HEBREW PUNCTUATION GERESH] may also be used to indicate an abbreviation,wg eg. גברת is abbreviated as גב׳

Due to keyboard inadequacies, these are often replaced by ASCII single and double quote characters, even though in general they are visually too high. 

Inline notes & annotations

tbd

Other punctuation

tbd

Other inline text decoration

Text can be highlighted using bold, italic, different fonts, font sizing, colour, or tracking.

Line & paragraph layout

Line breaking & hyphenation

Lines are normally broken at word boundaries.

Like most writing systems, certain characters are expected not to start or end a line. For example, periods and commas shouldn't start a line, and opening parentheses shouldn't end a line.

Breaking between Latin words. When a line break occurs in the middle of an embedded left-to-right sequence, the items in that sequence need to be rearranged visually so that it isn't necessary to read lines from bottom to top.

The example below shows how two Latin words are apparently reordered in the flow of text to accommodate this rule. Of course, the rearrangement is only that of the visual glyphs: nothing affects the order of the characters in memory.

Text with no line break in Latin text.

Text with line break in Latin text.
The lower of these two images shows the result of decreasing the line width, so that text wraps between a sequence of Latin words.


Text alignment & justification

tbd

Text spacing

Increased tracking is a common way to express emphasis in Hebrew.

Examples of letter-spacing (highlighted by the red lines) in Hebrew text.

Baselines, line height, etc.

Hebrew uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.

The Hebrew characters are commonly slightly taller than the Latin x-height. The sample below shows ascenders and descenders for Hebrew letters in the Noto Serif fonts. In this font combination the maximum height of the Hebrew letters reaches slightly higher than the Latin extenders.

qhxאבלך
Font metrics for text in the Noto Serif and Noto Serif Hebrew fonts.

Counters, lists, etc.

You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.

The Hebrew orthography uses an additive style, in addition to a numeric decimal style based on ASCII digits.

Additive

The hebrew additive style uses the letters shown below. It is specified for a range between 1 and 10,999. This system manually specifies the values for 19-15 to force the correct display of 15 and 16, which are commonly rewritten to avoid a close resemblance to the Tetragrammaton. Implementations may, and some do, implement this manually to a higher range.

י׳␣ט׳␣ח׳␣ז׳␣ו׳␣ה׳␣ד׳␣ג׳␣ב׳␣א׳␣ת␣ש␣ר␣ק␣צ␣פ␣ע␣ס␣נ␣מ␣ל␣כ␣יט␣יח␣יז␣טז␣טו␣י␣ט␣ח␣ז␣ו␣ה␣ד␣ג␣ב␣א

Examples:

א␣ב␣ג␣ד␣יא␣כב␣לג␣מד␣קיא␣רכב␣שלג␣תמד
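The additive approach described above can be sketched as follows: the weights are tried from largest to smallest, and the symbol for each weight is emitted as many times as it fits into the remaining value. The weight table is built from the letters listed above, with separate entries for 19 down to 15. The function name `hebrew_counter` is hypothetical, and the fallback to decimal digits outside the range is an assumption made for this sketch.

```python
GERESH = "\u05F3"
UNITS = "\u05D0\u05D1\u05D2\u05D3\u05D4\u05D5\u05D6\u05D7\u05D8"  # alef..tet, 1-9
TENS = "\u05D9\u05DB\u05DC\u05DE\u05E0\u05E1\u05E2\u05E4\u05E6"   # yod..tsadi, 10-90
HUNDREDS = "\u05E7\u05E8\u05E9\u05EA"                             # qof..tav, 100-400


def build_table() -> list[tuple[int, str]]:
    """Return (weight, symbol) pairs in descending order of weight."""
    table = [(10000, TENS[0] + GERESH)]
    table += [(k * 1000, UNITS[k - 1] + GERESH) for k in range(9, 0, -1)]
    table += [(k * 100, HUNDREDS[k - 1]) for k in range(4, 0, -1)]
    table += [(k * 10, TENS[k - 1]) for k in range(9, 1, -1)]
    # Explicit entries for 19-15, so that 15 and 16 use tet+vav and tet+zayin.
    table += [
        (19, "\u05D9\u05D8"), (18, "\u05D9\u05D7"), (17, "\u05D9\u05D6"),
        (16, "\u05D8\u05D6"), (15, "\u05D8\u05D5"), (10, TENS[0]),
    ]
    table += [(k, UNITS[k - 1]) for k in range(9, 0, -1)]
    return table


TABLE = build_table()


def hebrew_counter(n: int) -> str:
    """Format n using the additive table; fall back to digits outside 1-10,999."""
    if not 1 <= n <= 10999:
        return str(n)  # assumed fallback for this sketch
    parts = []
    for weight, symbol in TABLE:
        count, n = divmod(n, weight)
        parts.append(symbol * count)
    return "".join(parts)


print([hebrew_counter(n) for n in (1, 11, 15, 16, 111, 444)])
```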

Prefixes and suffixes

The default list style uses a full stop + space as a suffix.

Examples:

א. ב. ג. ד. ה.
Separator for Hebrew list counters.

Styling initials

It is possible to find the first letter in a paragraph styled so that it is larger and sits alongside several lines of the continuing paragraph text.

An enlarged initial letter in the word לפגי at the beginning of a paragraph.

Observation: The enlarged glyph in the example above rises above the normal top line of most Hebrew characters. It also rises above the top line of the adjacent glyphs when positioned alongside them. The bottom of the glyph is aligned with the bottom of the glyphs on the 3rd line down.

Boxed initials can also be found, such as the one in the example below. Here, the initial letter is centred horizontally and vertically inside the space created by the box. The box extends from the top line of the first line of text to the baseline of the 6th line.

An enlarged initial letter in the word גבול at the beginning of a paragraph, set in a box.

Page & book layout

This section is for any features that are specific to this script and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

General page layout & progression

Hebrew books, magazines, etc., are bound on the right-hand side, and pages progress from right to left.


Binding configuration for Hebrew books, magazines, etc.

Columns are vertical but run right-to-left across the page.

References