This page brings together basic information about the Mandaic script and its use for the Neo-Mandaic language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Mandaic using Unicode.
The relatively small number of letters in Mandaic (especially for vowels) covers a fairly wide set of allophonic sounds. Differences in pronunciation also arise due to the dialect or accent of the speaker. Although these may be spelled out in some of the examples, it is best to assume that many of the letters described here represent more than one sound, and that the pronunciations given for the examples may differ for other speakers.
It was difficult to find word lists that show IPA pronunciations for Neo-Mandaic spellings, although there are lists of words that show IPA for transcriptions that appear to be close to transliterations. A Neo-Mandaic term with ⁍ alongside it indicates that the spelling has been guessed at, rather than copied.
Phonological transcriptions should be treated as a guide only. They are taken from the sources consulted, and may be narrow or broad, phonemic or phonetic, depending on what is available. They mostly represent pronunciation of words in isolation. For more detailed information about allophones, alternations, sandhi, dialectal differences, and so on, follow the links to cited references.
This is an interactive document. Click/tap on the following to reveal detailed information and examples for each character: (a) coloured characters in examples and lists; (b) link text on character names. If your browser supports it, your cursor will change as you hover over these items.
More about using this page
Character names. The names of characters in codepoint markup drop the initial MANDAIC label (purely to reduce the length of the examples). In other places the full name can be found.
Navigation. The icon opens the table of contents in a popup window. Dismiss it by clicking on the X alongside it, or by hitting the ESC key.
Detailed character notes. Clicking on coloured characters in lists or on character names opens panels that give detailed information about each character. This information is taken from the companion document, Mandaic Character Notes. (Those panels can be dismissed by pressing on the ESC key.)
Transcriptions & transliterations. Phonological transcriptions are surrounded by ⌈corner brackets⌋, to indicate that they vary between narrow, [phonetic] and broad, /phonemic/ transcriptions.
Latin transcriptions between <angle brackets> represent the letters as commonly written in the Latin script.
A transliteration has also been developed especially for this orthography, and is generally based on the sound of a letter where possible, but where a letter has multiple pronunciations, the transliteration represents only one.
Transliterations provide perfect round-trip conversion between the native script and Latin, whereas Latin transcriptions rarely do.
When you click on an example to see its composition, the top of the panel that opens contains a transliteration, followed by the native text, then (if available) an IPA transcription.
Font & text size of the examples can be changed independently using the control that pulls out from the bottom right of the page.
Source: Paragraph 1, Unicode UDHR, article 1; paragraph 2, From a Masiqta hymn (Macuch 1967: 54, no.5. lines 1-3) in Daniels.
Usage & history
Origins of the Mandaic script, 2ndC – today.
Phoenician
└ Aramaic
└ Mandaic
+ Hebrew
+ Nabataean
+ Syriac
+ Palmyrene
+ Hatran
+ Elymaic
+ Pahlavi
+ Kharosthi
+ Brahmi
The Mandaic script is used for writing Neo-Mandaic, an Iraqi language spoken by about 5,500 people, and is also the script of Classical Mandaic, the liturgical language of the Mandaean religion. Persecution and war over a long period have reduced the language to a severely endangered level. There may be 200 or fewer first-language speakers of Mandaic.
ࡀࡁࡀࡂࡀ ābāgā Mandaic alphabet
The origins of the script are not clear, but many scholars believe it to be descended from Aramaic via Parthian. Research has indicated that it has remained relatively unchanged since its initial development between the 2nd and 7th centuries CE.
The Mandaic script is an alphabet. This means that it is phonetic in nature, where each letter represents a basic sound. This is unusual among scripts of Semitic origin. See the table to the right for a brief overview of features for the modern Neo-Mandaic orthography.
Mandaic text runs right-to-left in horizontal lines, but numbers and embedded Latin text are read left-to-right. There is no case distinction.
Words are separated by spaces, and contain a mixture of consonants and vowels, with diacritics to indicate vowel quality, gemination, or foreign sounds.
The script is cursive, but basic letter shapes don't change radically. In some letters, the joining edge of the glyph adapts to join with an adjacent character.
The standard Mandaic alphabet consists of 24 letters, since 24 is a significant number to Mandaeans. However, this is only achieved by repeating the first letter of the alphabet, ࡀU+0840 LETTER HALQA, at the end, and by including a ligature, ࡖU+0856 LETTER DUSHENNA.
Mandaic has 17 basic consonant letters. Similarly to Syriac, many of the consonant letters, especially the stops, represent more than one phoneme – typically a stop and a fricative. Particular phonemes and additional sounds used in Arabic and Persian can be indicated explicitly using an affrication mark added to consonants, and one extra character.
3 more special characters represent the sounds of grammatical syllables.
Gemination is not normally marked, but can be indicated using a combining mark.
Mandaic is an alphabet where vowels are written using 4 vowel letters, derived from consonants. The 4 vowel letters represent 6 phonemes, and various allophonic realisations depending on syllabic context or speaker location (see Vowel sounds). A seventh phoneme, ə, is unwritten.
Three of the 4 letters representing vowel sounds may represent one of two phonemes; the specific phoneme can be clarified for educational purposes using ࡚U+085A VOCALIZATION MARK.
Click on the sounds to reveal locations in this document where they are mentioned.
Phones in a lighter colour are non-native or allophones.
Vowel sounds
Plain vowels
There is considerable allophonic variation in Neo-Mandaic vowels. Figure 1 shows common realisations of the basic sounds listed above, based on syllable type. Note that o, e, and a are very rare in open, accented syllables.3
Open syllable: i, u, ɔ, oː, e, a~æ
Open, accented syllable: iː, uː, ɔː, o, e, a~æ
Closed syllable: ɪ, ʊ, ʌ, ɛ, ɑ
Typical allophones based on syllable type for the primary vowels in Neo-Mandaic.359
Click on the characters to find where they are mentioned in this page.
The Mandaic alphabet has 24 letters, since that number is symbolic to Mandaeans. To reach that number, the alphabet includes the ligature ࡖU+0856 LETTER DUSHENNA and the first letter is repeated at the end of the alphabet.
24
ࡀa ɔāā0840
ࡁb v wbb0841
ࡂɡ ʁgg0842
ࡃddd0843
ࡄhhh0844
ࡅu o w vuu0845
ࡆzzz0846
ࡇiːʷ ħuᵘẖẖ0847
ࡈtˤṭᵵ0848
ࡉi e jii0849
ࡊk χkk084A
ࡋlll084B
ࡌmmm084C
ࡍnnn084D
ࡎsss084E
ࡏe i ∅ʿʿ084F
ࡐp fpp0850
ࡑsˤṣᵴ0851
ࡒqqq0852
ࡓrrr0853
ࡔʃ t͡ʃšʃ0854
ࡕt θtt0855
ࡖdiḏḏ0856
ࡀa ɔāā0840
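As a rough illustration of how the count of 24 is reached, the following Python sketch builds the alphabet from the code points in the list above. The names `DISTINCT_LETTERS`, `ALPHABET` and `short_name` are illustrative only; the code point range is taken from the list, and the character names come from Python's standard `unicodedata` module.

```python
import unicodedata

# Code points U+0840..U+0856 cover the 23 distinct letters in the list
# above, including the ligature DUSHENNA at U+0856.
DISTINCT_LETTERS = [chr(cp) for cp in range(0x0840, 0x0857)]

# The traditional alphabet repeats the first letter (HALQA) at the end.
ALPHABET = DISTINCT_LETTERS + [DISTINCT_LETTERS[0]]


def short_name(ch: str) -> str:
    """Return the Unicode name without the leading 'MANDAIC ' label."""
    name = unicodedata.name(ch)
    prefix = "MANDAIC "
    return name[len(prefix):] if name.startswith(prefix) else name


if __name__ == "__main__":
    for position, letter in enumerate(ALPHABET, start=1):
        print(f"{position:2d}  U+{ord(letter):04X}  {short_name(letter)}")
```

Running the script prints one line per alphabet position, with HALQA appearing at positions 1 and 24.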
Vowels
Mandaic is an alphabet where vowels are written using 4 vowel letters, derived from consonants. The 4 vowel letters represent 6 phonemes, and various allophonic realisations depending on syllabic context or speaker location (see Vowel sounds). A seventh phoneme, ə, is unwritten.
Three of the 4 letters representing vowel sounds may represent one of two phonemes; the specific phoneme can be clarified for educational purposes using ࡚U+085A VOCALIZATION MARK.
The following table summarises the main vowel to character assignments.
The table shows only phonemic vowels, unless indicated otherwise. These vowels represent a variety of allophones – see Vowel ranges for more information. Hyphens are used to indicate word-initial or word-final forms. The right-hand column shows where the vocalisation mark can be used in educational texts to disambiguate the vowel sound.
Vowel letters
The Mandaic Unicode block uses just 4 characters for vowels; however, each vowel letter represents 2 phonemic vowel distinctions and a number of allophonic realisations, both in quality and vowel length (see Vowel ranges).
4
ࡉi e ji0849
ࡅu o w vu0845
ࡀa ɔā0840
ࡏe i ∅ʿ084F
The letters used for vowels all have their origin in consonants, but ࡀU+0840 LETTER HALQA and ࡏU+084F LETTER IN are now used only for vowels. They are available to use as vowels because the language dropped the glottal and pharyngeal sounds.
Observation: Looking at text samples (such as the UDHR, for which I have no IPA transcription) it's not clear that the above covers all the uses of ࡏU+084F LETTER IN adequately. More information is needed.
Although the script is basically alphabetic, vowel sounds are not always shown. For example, the i is not shown in ࡌࡍ mn min 'from'.
Two ligatures encoded in the Unicode block have unwritten vowel sounds, ie.
ࡖḏdi
ࡗkḏi
Vowel ranges
Figure 2 shows how 3 of the vowel letters encompass a range of sounds, rather than representing a single, specific sound. The 6 darker phones within the circles are phonemic vowel distinctions, whereas the lighter phones are allophonic realisations. In each circled case, two primary vowel sounds are associated with a given letter. There can also be long and short versions of the primary vowels.
Sound ranges associated with vowel letters.
Observation: Need to confirm that ɛ falls within 2 circles.
Vowel disambiguation
࡚085A
Where needed in educational texts, ࡚U+085A VOCALIZATION MARK8 can be used to distinguish primary vowel sounds for two letters, and the length of the third.
According to Häberl375, vowel length is entirely predictable in Neo-Mandaic and depends entirely upon the placement of the accent and the syllable structure. Vowels in open, accented syllables are long, and pretonic, open syllables have short vowels.
Mandaic has no regular mechanisms in the orthography to indicate vowel length.
Standalone vowels
Standalone vowels are vowel sounds that are not preceded by a consonant sound, or are preceded by only a glottal stop. They may appear at the beginning of a word or in the middle of a word after a preceding vowel.
Standalone vowels only occur in word-initial position in Neo-Mandaic372. Two of the vowel letters are commonly preceded by ࡏU+084F LETTER IN in word-initial position. That letter on its own represents an e sound; however, Daniels1512 says that this usually represents a prothetic vowel before the t-prefix in passive verbs or before a monoconsonantal word.
Observation: Does ࡏU+084F LETTER IN on its own therefore represent the vowel ə rather than one of the sounds covered by the ࡉU+0849 LETTER AKSA range?
This gives the following typical forms:
4
ࡏࡉi- ɛ-084F 0849
ࡏࡅu- o-084F 0845
ࡏe-084F
ࡀa- ɛ-0840
Examples:
ࡏࡉࡍࡂࡋࡉࡆࡉࡀ⁍ iŋ.glɪ.ˈzi English
ࡏࡅࡓࡀࡔࡋࡀࡌ urašlām Jerusalem
ࡏࡕࡌࡀࡋ eθmal yesterday
ࡀࡌࡀࡉ aːmaj today
Vowel sounds to characters
This section maps Neo-Mandaic vowel sounds to common graphemes in the Mandaic orthography.
Code points shown are for typical word-initial, word-medial, and word-final usage.
Mandaic has 17 basic consonant letters. Similarly to Syriac, many of the consonant letters, especially the stops, represent more than one phoneme – typically a stop and a fricative. Particular phonemes and additional sounds used in Arabic and Persian can be indicated explicitly using an affrication mark added to consonants, and one extra character.
3 more special characters represent the sounds of grammatical syllables.
Gemination is not normally marked, but can be indicated using a combining mark.
Consonant summary table
The following table summarises the main consonant to character assignments.
The right-hand column lists sounds that only occur in words from other languages, principally Arabic or Persian.
Native Mandaic sounds include 25 basic consonants. They are written using the following consonant letters.
17
ࡐp fp0850
ࡁb v wb0841
ࡕt θt0855
ࡃdd0843
ࡈtˤᵵ0848
ࡊk χk084A
ࡂɡ ʁg0842
ࡒqq0852
ࡎss084E
ࡆzz0846
ࡑsˤᵴ0851
ࡔʃ t͡ʃʃ0854
ࡄhh0844
ࡌmm084C
ࡍnn084D
ࡓrr0853
ࡋll084B
Like other orthographies in the region such as Syriac and Hebrew, some letters represent both 'hard' and 'soft' consonants, although a few of the soft sounds in Mandaic are only used for loan words (see Repertoire extension). Figure 3 shows the correspondences between hard and soft sounds.
Letter
ࡐ
ࡁ
ࡕ
ࡈ
ࡃ
ࡊ
ࡂ
Hard
p
b
t
tˤ
d
k
ɡ
Soft
f
v
θ
ðˤ
ð
χ
ʁ
Correspondences between hard sounds and soft sounds.
These sounds are not usually distinguished in writing, although they can be, if needed (such as in educational texts), by a diacritic (see Consonant disambiguator).
ࡇU+0847 LETTER IT (ẖ) only appears at the end of personal names or at the end of words to indicate the third person singular suffix.
ࡖU+0856 LETTER DUSHENNA is a letter of the alphabet, but it has a morphemic function, being used to write the relative pronoun and genitive exponent ḏ-, eg. ࡖࡍࡐࡀࡒࡕ ḏnpāqt dinpaqt 'who left you' and ࡖࡎࡉࡍࡀ ḏsinā disina 'of hatred'.
Neo-Mandaic is heavily influenced by Arabic and Persian languages, and they can bring additional sounds into the text via loan words or dialectal variations. Mostly, the non-native sounds are written using ordinary Mandaic letters, but diacritics can be used to point out particular pronunciations (see Consonant disambiguator). The list below shows the main non-native sounds and the letters used to write them.
࡙U+0859 AFFRICATION MARK can be used to disambiguate letter sounds in educational texts, some of which sounds are typically used only in loan words.
9
ࡔ࡙t͡ʃ d͡ʒʃˑ0854 0859
ࡐ࡙fpˑ0850 0859
ࡕ࡙θtˑ0855 0859
ࡃ࡙ðdˑloans0843 0859
ࡈ࡙ðˤᵵˑloans0848 0859
ࡑ࡙ʒᵴˑ0851 0859
ࡂ࡙ʁgˑ0842 0859
ࡊ࡙χkˑ084A 0859
ࡄ࡙ħhˑ0844 0859
Also, although gemination is not usually marked, ࡛U+085B GEMINATION MARK can be used to indicate gemination of a consonant (referred to by native writers as 'hard' pronunciation).
ࡋࡉࡁ࡛ࡀ ˈlɛbbɔ heart
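Because the combining marks are normally limited to educational texts, software that compares marked and unmarked spellings may want to ignore them. The following is a minimal Python sketch of that idea. The function name `strip_marks` and the search-matching use case are assumptions for illustration, not something described by the sources.

```python
# The three Mandaic combining marks described in this document.
AFFRICATION_MARK = "\u0859"
VOCALIZATION_MARK = "\u085A"
GEMINATION_MARK = "\u085B"

PEDAGOGICAL_MARKS = {AFFRICATION_MARK, VOCALIZATION_MARK, GEMINATION_MARK}


def strip_marks(text: str) -> str:
    """Remove the optional disambiguating marks, leaving the plain spelling."""
    return "".join(ch for ch in text if ch not in PEDAGOGICAL_MARKS)


# The 'heart' example above: AL + AKSA + AB + GEMINATION MARK + HALQA.
marked = "\u084B\u0849\u0841\u085B\u0840"
plain = strip_marks(marked)  # same word without the gemination mark
```

Here `plain` contains the four letters only, which is how the word would normally be written.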
Consonant clusters
Häberl3729 provides some detailed information about rules for consonant clusters.
Consonant length
Gemination occurs in Neo-Mandaic words, but is not usually marked.
In educational texts ࡛U+085B GEMINATION MARK can be used to indicate gemination of a consonant (referred to by native writers as 'hard' pronunciation).
This section describes typographic features related to digits, dates, currencies, etc.
The Unicode Mandaic block has no native digits. How numbers are represented in Mandaic text is TBD.
Text direction
Mandaic text runs right to left in horizontal lines.
Normally, the Unicode Bidirectional Algorithm automatically takes care of the ordering of text, as long as the 'base direction' (ie. the surrounding directional context) is set to right-to-left (RTL).
Characters are all stored in the order in which they are spoken (and typed). This so-called 'logical' order is then rendered as bidirectional flows by the application at run time, as the text is displayed or printed. The relative placement of characters within a single directional flow is based on strong directional properties (RTL or LTR) assigned to each Unicode character by the Unicode Standard. There exist, however a set of neutral direction property values, mostly for punctuation, where the placement of characters depends on the base direction.
If the base direction is not set appropriately, the directional runs will be ordered incorrectly, making it very difficult to get the meaning.
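The directional properties mentioned above can be inspected with Python's standard `unicodedata` module. This is only a sketch for exploring the property values: it does not implement the Unicode Bidirectional Algorithm, and the helper name `bidi_classes` is illustrative.

```python
import unicodedata


def bidi_classes(text: str) -> list[str]:
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Two Mandaic letters, a space, two digits, a comma, a space, two Latin letters.
sample = "\u0840\u0841 12, ab"
print(bidi_classes(sample))
# ['R', 'R', 'WS', 'EN', 'EN', 'CS', 'WS', 'L', 'L']
```

The Mandaic letters are strongly right-to-left (R) and the Latin letters strongly left-to-right (L). The spaces and the comma carry no strong direction of their own, so their placement depends on the surrounding context and the base direction.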
In some circumstances the Unicode Bidirectional Algorithm requires additional assistance to correctly render the directionality of bidirectional text. For such cases the Unicode Standard provides invisible formatting characters for use in plain text. See Managing text direction.
In HTML the base direction and higher level controls can be set using the dir attribute or the bdi element. CSS should not be used to control direction. Unicode formatting codes should also not be used where markup is available.
For authoring HTML pages, one of the most important things to remember is to use <html dir="rtl" … > at the top of a right-to-left page, and then use the dir attribute or bdi tag for ranges within the page, but only when you need to change the base direction. Also, use markup to manage direction, and do not use CSS styling.
For other aspects of dealing with right-to-left writing systems see the following sections:
Unicode provides a set of 10 formatting characters that can be used to control the direction of text when displayed. These characters have no visual form in the rendered text, however text editing applications may have a way to show their location.
In Unicode 6.3, the Unicode Standard added a set of characters which do the same thing but also isolate the content from surrounding characters, in order to avoid spillover effects. They are U+2067 RIGHT-TO-LEFT ISOLATE (RLI), U+2066 LEFT-TO-RIGHT ISOLATE (LRI), and U+2069 POP DIRECTIONAL ISOLATE (PDI). The Unicode Standard recommends that these be used instead.
There is also U+2068 FIRST STRONG ISOLATE (FSI), used initially to set the base direction according to the first recognised strongly-directional character.
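For plain text, where markup is not available, the isolate characters can be wrapped around a run of text. The following Python sketch shows one way to do that. The function name `isolate` and its `direction` parameter are hypothetical, and in HTML the dir attribute or bdi element should be used instead.

```python
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

_OPENERS = {"ltr": LRI, "rtl": RLI, "auto": FSI}


def isolate(text: str, direction: str = "auto") -> str:
    """Wrap text in a directional isolate; 'auto' uses FIRST STRONG ISOLATE."""
    return f"{_OPENERS[direction]}{text}{PDI}"


# Embed a Latin product name in right-to-left plain text without spillover.
wrapped = isolate("Unicode 15.0", "ltr")
```

Every opening isolate character needs a matching PDI, which is why the helper always emits the pair together.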
U+061C ARABIC LETTER MARK (ALM) is used to produce correct sequencing of numeric data. Click on the character name, and see also expressions for details.
This section describes typographic features related to font/writing styles, cursive text, context-based shaping, context-based positioning, letterform slopes, weights & italics, and case & other character transforms.
Do letters in this script join with each other by default? Is the basic shape of a letter radically changed? Is it sometimes not cursive? Are there any special features to note? Are Unicode joiner and non-joiner characters needed to override default joining behaviours?
Mandaic is cursive, ie. letters in a word are joined up. Fonts need to produce the appropriate joining form for a code point, according to its visual context.
The cursive treatment doesn't produce significant variations of the essential part of a rendered character (unlike Arabic). In some letters, the joining edge of the glyph adapts to join with an adjacent character. Two examples show how strokes away from the baseline are typically shortened to create joining shapes.
Two examples of small tweaks to glyphs when joining.
Other small adaptations may occur between certain adjacent characters, such as kl, wt and mn.1512
Cursive joining forms
The cursive treatment produces only minor changes to glyph shapes in most cases. Figure 5 and Figure 6 show all the basic shapes in Mandaic and what their joining forms look like.
isolated
right-joined
dual-join
left-joined
Mandaic letters
ࡐ
ـࡐ
ـࡐـ
ࡐـ
ࡐ0850
ࡁ
ـࡁ
ـࡁـ
ࡁـ
ࡁ0841
ࡕ
ـࡕ
ـࡕـ
ࡕـ
ࡕ0855
ࡃ
ـࡃ
ـࡃـ
ࡃـ
ࡃ0843
ࡈ
ـࡈ
ـࡈـ
ࡈـ
ࡈ0848
ࡊ
ـࡊ
ـࡊـ
ࡊـ
ࡊ084A
ࡂ
ـࡂ
ـࡂـ
ࡂـ
ࡂ0842
ࡒ
ـࡒ
ـࡒـ
ࡒـ
ࡒ0852
ࡎ
ـࡎ
ـࡎـ
ࡎـ
ࡎ084E
ࡑ
ـࡑ
ـࡑـ
ࡑـ
ࡑ0851
ࡄ
ـࡄ
ـࡄـ
ࡄـ
ࡄ0844
ࡌ
ـࡌ
ـࡌـ
ࡌـ
ࡌ084C
ࡍ
ـࡍ
ـࡍـ
ࡍـ
ࡍ084D
ࡓ
ـࡓ
ـࡓـ
ࡓـ
ࡓ0853
ࡋ
ـࡋ
ـࡋـ
ࡋـ
ࡋ084B
ࡅ
ـࡅ
ـࡅـ
ࡅـ
ࡅ0845
ࡏ
ـࡏ
ـࡏـ
ࡏـ
ࡏ084F
Joining forms for shapes that join on both sides.
isolated
right-joined
Mandaic letters
ࡆ
ـࡆ
ࡆ0846
ࡔ
ـࡔ
ࡔ0854
ࡉ
ـࡉ
ࡉ0849
ࡀ
ـࡀ
ࡀ0840
ࡖ
ـࡖ
ࡖ0856
ࡇ
ـࡇ
ࡇ0847
ࡘ
ـࡘ
ࡘ0858
ࡗ
ـࡗ
ࡗ0857
Joining forms for shapes that join on the right only.
Unicode 13 changed the joining properties of ࡘU+0858 LETTER AIN and ࡗU+0857 LETTER KAD. Previously they didn't join on either side. Now they join to the right. It is actually possible to find examples of the former that do join, and other examples (sometimes in the same paragraph) that do not join. To prevent joining, U+200C ZERO WIDTH NON-JOINER should be used.10
Observation: Although that isn't obvious from the font used in the table, because the line isn't continuous, you can see the behaviour in a sequence such as ࡍࡘU+084D LETTER AN + U+0858 LETTER AIN, where the left-hand stroke of the initial letter is shortened.
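The joining behaviour shown in the two tables above can be captured as data. The following Python sketch hard-codes the two sets as they appear in this document, since Python's standard library does not expose Unicode joining types. The names `DUAL_JOINING`, `RIGHT_JOINING`, `joining_type` and `prevent_join` are illustrative.

```python
# Letters that join on both sides (first table above).
DUAL_JOINING = {
    0x0841, 0x0842, 0x0843, 0x0844, 0x0845, 0x0848, 0x084A, 0x084B, 0x084C,
    0x084D, 0x084E, 0x084F, 0x0850, 0x0851, 0x0852, 0x0853, 0x0855,
}

# Letters that join on the right only (second table above).
RIGHT_JOINING = {0x0840, 0x0846, 0x0847, 0x0849, 0x0854, 0x0856, 0x0857, 0x0858}

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER


def joining_type(ch: str):
    """Return 'D', 'R', or None for characters outside the two tables."""
    cp = ord(ch)
    if cp in DUAL_JOINING:
        return "D"
    if cp in RIGHT_JOINING:
        return "R"
    return None


def prevent_join(first: str, second: str) -> str:
    """Place ZWNJ between two letters so that they are rendered unjoined."""
    return first + ZWNJ + second


# AN followed by AIN, kept apart as in texts where AIN does not join.
unjoined = prevent_join("\u084D", "\u0858")
```

Together the two sets cover every letter from U+0840 to U+0858, which gives a quick consistency check on the tables.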
Context-based shaping & positioning
Are special glyph forms needed, depending on the context in which a character is used? Do glyphs interact in some circumstances? Are there requirements to position diacritics or other items specially, depending on context? Does the script have multiple diacritics competing for the same location relative to the base?
In addition to the cursive shaping described just above, the position of diacritics may vary according to whether or not the glyph of the base character extends below the baseline. The diacritic also needs to be positioned horizontally underneath the character in the appropriate place. Several such variations are shown here:
Diacritic placement varying horizontally and vertically.
The 3 combining marks found in Neo-Mandaic are normally only used for educational texts.
Typographic units
Word boundaries
Are words separated by spaces, or other characters? Are there special requirements when double-clicking on the text? Are words hyphenated?
The concept of 'word' is difficult to define in any language (see What is a word?). Here, a word is a vaguely-defined, but recognisable semantic unit that is typically smaller than a phrase and may comprise one or more syllables.
Words are separated by spaces.
Graphemes
A grapheme is a user-perceived unit of text. Text operations that use graphemes as a unit of text include line-breaking, forwards deletion, cursor movement & selection, character counts, text spacing, text insertion, justification, case conversions, and sorting. The Unicode Standard uses generalised rules to define 'grapheme clusters', which approximate the likely grapheme boundaries in a writing system, however they don't work well with many complex scripts.
Grapheme clusters
As just mentioned, Neo-Mandaic normally uses no combining marks. When they are used, it is typically in educational texts.
Graphemes in Neo-Mandaic therefore consist of single letters or letters with a combining mark. This means that text can be segmented into typographic units using grapheme clusters.
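A simplified segmentation along these lines can be sketched in Python. This is not a full implementation of the Unicode grapheme cluster rules. It only attaches nonspacing marks (general category Mn) to the preceding character, which is enough for the letter-plus-mark sequences described here. The function name `simple_clusters` is illustrative.

```python
import unicodedata


def simple_clusters(text: str) -> list[str]:
    """Split text into units of one base character plus any following marks."""
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.category(ch) == "Mn":
            clusters[-1] += ch  # attach the combining mark to its base
        else:
            clusters.append(ch)
    return clusters


# AL + AKSA + AB + GEMINATION MARK + HALQA: five code points, four units.
word = "\u084B\u0849\u0841\u085B\u0840"
units = simple_clusters(word)
```

For this word, `len(word)` is 5 but `len(units)` is 4, because the gemination mark forms a single unit with the letter it follows.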
Phrase, sentence, and section delimiters are described in Phrase & section boundaries.
Punctuation & inline features
This section describes typographic features related to word boundaries, phrase & section boundaries, bracketed text, quotations & citations, emphasis, abbreviation, ellipsis & repetition, inline notes & annotations, other punctuation, and other inline text decoration.
Phrase & section boundaries
What characters are used to indicate the boundaries of phrases, sentences, and sections?
both
࡞085E
.002E
Mandaic uses sentence punctuation sparsely2. ࡞U+085E PUNCTUATION is used to start and end text sections. Everson describes a smaller version of this symbol that is used like a comma.2 There is no Unicode character for the smaller version.
The smaller size is also used in colophons (historical lay text added to religious text).1512
Observation: The keyboard at MandeanNetwork.com suggests that writers of Mandaic use Arabic punctuation, such as the following, in addition to western punctuation such as colon, full stop, etc. This is TBC.
Mandaic uses ornate parentheses, such as the following (the shape may vary).
both
﴾FD3E
﴿FD3F
Mirrored characters
The words 'left' and 'right' in the Unicode names for parentheses, brackets, and other paired characters should be ignored. LEFT should be read as if it said START, and RIGHT as END. The direction in which the glyphs point will be automatically determined according to the base direction of the text.
Both of these lines use >U+003E GREATER-THAN SIGN, but the direction it faces depends on the base direction at the point of display.
The number of characters that are mirrored in this way is around 550, most of which are mathematical symbols. Some are single characters, rather than pairs. The following are some of the more common ones.
12
(0028
)0029
<003C
>003E
[005B
]005D
{007B
}007D
«00AB
»00BB
‹2039
›203A
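Whether a character is mirrored in this way is recorded in the Unicode Character Database, and can be checked with Python's standard `unicodedata.mirrored()` function. The short sketch below checks the 12 characters in the list above. The names `COMMON_MIRRORED` and `is_mirrored` are illustrative.

```python
import unicodedata

# The 12 common mirrored characters listed above.
COMMON_MIRRORED = "()<>[]{}\u00AB\u00BB\u2039\u203A"


def is_mirrored(ch: str) -> bool:
    """True if the character has the Unicode Bidi_Mirrored property."""
    return unicodedata.mirrored(ch) == 1


if __name__ == "__main__":
    for ch in COMMON_MIRRORED:
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}  mirrored={is_mirrored(ch)}")
```

The property only says that the glyph should be flipped in a right-to-left run. The stored character stays the same, which is why the same code point can face either way depending on the base direction.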
Quotations & citations
What characters are used to indicate quotations? Do quotations within quotations use different characters? What characters are used to indicate dialogue? Are the same mechanisms used to cite words, or for scare quotes, etc? What about citing book or article names?
Observation: The keyboard at MandeanNetwork.com suggests that writers of Mandaic use the following. This is TBC.
both
«00AB
»00BB
Line & paragraph layout
This section describes typographic features related to line breaking & hyphenation, text alignment & justification, text spacing, baselines, line height, counters, lists, and styling initials.
Line breaking
Are there special rules about the way text wraps when it hits the end of a line? Does line-breaking wrap whole 'words' at a time, or characters, or something else (such as syllables in Tibetan and Javanese)? What characters should not appear at the end or start of a line, and what should be done to prevent that? Is hyphenation used, or something else? What rules are used? What difficulties exist?
When a line break occurs in the middle of an embedded left-to-right sequence, the items in that sequence need to be rearranged visually so that it isn't necessary to read lines upwards.
Figure 9 shows how this happens in Arabic text, which works in the same way. Two Latin words are apparently reordered in the flow of text to accommodate this rule. Of course, the rearrangement is only that of the visual glyphs: nothing affects the order of the characters in memory.
In this Arabic language text, the lower of these two images shows the result of decreasing the line width, so that text wraps between a sequence of Latin words.
Text alignment & justification
Does text in a paragraph need to have flush lines down both sides? Does the script allow punctuation to hang outside the text box at the start or end of a line? Where adjustments are needed to make a line flush, how is that done? Does the script shrink/stretch space between words and/or letters? Are word baselines stretched, as in Arabic? What about paragraph indents?
When text is fully justified the baseline may be stretched, as in Arabic. The Unicode Standard says6 that ـU+0640 ARABIC TATWEEL may be used to achieve that effect, however this is not a good solution in text where the line width varies, eg. in a web browser whose window can be stretched. (The reason being that as the paragraphs reflow words will wrap into different positions on the line.)
The whole document is justified on both sides of the text. In many cases the final word is stretched internally to make the line fit the width of the available space. Only rarely are words earlier in the line stretched.
Lines where justification is achieved by stretching the last word internally.
A difference from Arabic is that many lines are stretched to the end of the available space by a trailing baseline extension. The choice of internal vs trailing extension appears to be related to the character at the end of the word.
Lines where justification is achieved by extending the baseline from the last character in a word to the end of the line.
On a good number of lines, final letters in a word appear to be squeezed onto the line by writing them above the preceding part of the line. A short example can be seen in Figure 12.
Another notable feature is the use of a 'rule' such as
ࡎـــــࡀ U+084E MANDAIC LETTER AS + baseline extension + U+0840 MANDAIC LETTER HALQA, where the baseline extension can cause the combination to span all or a large part of the line. In some cases, the letter ࡔU+0854 LETTER ASH or ࡄU+0844 LETTER AH may appear at the midpoint of the rule. If this combination doesn't fill a whole line, it appears at the end of a line and is long enough to fill the remaining space.
A rule drawn across a whole line.
A rule drawn from the end of the text to the end of the line.
Further research is needed to ascertain whether these justification techniques are generally applicable to Mandaic text, rather than unique to this document.
Daniels says1 that ࡇU+0847 LETTER IT can sometimes be 'manipulated calligraphically in an otherwise pedestrian manuscript in order to fill out a line'.
Baselines, line height, etc.
Does the script have special requirements for baseline alignment between mixed scripts and in general? Is line height special for this script? Are there other aspects that affect line spacing, or positioning of items vertically within a line?
Mandaic uses the so-called 'alphabetic' baseline, which is the same as for Latin and many other scripts.
A few Mandaic characters have glyphs that rise above the main height, and a few more that descend below the baseline. Diacritics are attached below the letters.
To give an approximate idea, Figure 14 compares Latin and Mandaic glyphs from the Noto font. Many Mandaic letters are less high than the Latin x-height, however some extend well below the Latin descenders, especially when they have combining marks attached. A few character glyphs reach the Latin cap-height.
Font metrics for Latin text compared with Mandaic glyphs in the Noto Serif Mandaic font.
Page & book layout
This section describes typographic features related to general page layout & progression; grids & tables, notes, footnotes, etc, forms & user interaction, and page numbering, running headers, etc.
General page layout & progression
How are the main text area and ancillary areas positioned and defined? Are there any special requirements here, such as dimensions in characters for the Japanese kihon hanmen? The book cover for scripts that are read right-to-left is on the right of the spine, rather than the left. When content can flow vertically and to the left or right, how to specify the location of objects, text, etc. relative to the flow? Do tables and grid layouts work as expected? How do columns work in vertical text? Can you mix blocks of vertical and horizontal text? Does text scroll in a different direction?
Mandaic books, leaflets, etc., are bound on the right-hand side, and pages progress from right to left.
Columns are vertical but run right-to-left across the page.