Punctuation character notes

Updated 11 February, 2020

This page lists characters in the following Unicode blocks and provides information about them.

This is not authoritative, peer-reviewed information – these are just notes I have gathered and copied from various places.

Find

If you click on any red example text, you will see at the bottom right of the page a list of the characters that make up the example.

Read more about how to use this page.

To find a character by codepoint, type #char0000 at the end of the URL in the address bar, where 0000 is a four-figure, hex codepoint number, all in uppercase. Or type the character or the hex number in the Find control above.
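As a rough illustration of the fragment format described above, the following sketch builds the #char0000 suffix from a character. The function name char_fragment is my own invention for this example and is not part of the page.

```python
def char_fragment(ch: str) -> str:
    """Return the URL fragment for one character, eg. '#char200B'."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # Uppercase hex, zero-padded to four figures.
    return f"#char{ord(ch):04X}"


print(char_fragment("\u200b"))  # #char200B
print(char_fragment("\u2020"))  # #char2020
```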

To view this page as intended, you need an appropriate font. Click the blue vertical bar at the bottom right of the page to apply other fonts, if you have them on your system. For transcriptions I recommend the excellent and free Doulos SIL font. The large character in the box will not be rendered unless the webfont downloaded with the page or a system font has a glyph for it. (If there is no glyph and you want to see what it looks like, click on See in UniView.)

Information about languages that use these characters is taken from the list maintained for the Character Use app. The list is not exhaustive.

References are indicated by superscript characters. Wherever possible, those contain direct links to the source material. When such a pointer is alongside an arrow → it means that it's worth following the link for the additional information it provides. Digits refer to the main sources, which are listed at the bottom of a set of notes.

When you are using UniView and you turn on Show notes, UniView will pull in information about characters from this page.

Spaces

U+2000 EN QUAD

U+2001 EM QUAD

U+2002 EN SPACE

U+2003 EM SPACE

U+2004 THREE-PER-EM SPACE

U+2005 FOUR-PER-EM SPACE

U+2006 SIX-PER-EM SPACE

U+2007 FIGURE SPACE

U+2008 PUNCTUATION SPACE

U+2009 THIN SPACE

U+200A HAIR SPACE

U+200B ZERO WIDTH SPACE

An invisible character, used to signal line-break and word-break opportunities. It was originally provided for use with writing systems such as Thai, Myanmar, Khmer, Japanese, etc. that don't use spaces between words.

Justification may visibly adjust the space between the characters on either side of this character, doing so as if the ZWSP wasn't there, eg. the Thai text อักษร​ไทย may look like อั ก ษ ร ไ ท ย when justified, or when letter-spacing is applied, even though the two words are separated by a ZWSP (click on the word to see the composition).
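A minimal sketch of how the ZWSP sits in a string, assuming the two Thai words from the example above. It only inspects the text in memory; whether a line actually breaks at the ZWSP depends on the rendering engine.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

words = ["อักษร", "ไทย"]
text = ZWSP.join(words)  # invisible break opportunity between the words

# The ZWSP is a real code point, so the string is one character longer
# than what the reader appears to see.
visible = text.replace(ZWSP, "")
print(len(text) - len(visible))  # 1

# Python does not treat ZWSP as whitespace, so a plain split() finds
# nothing to split on; split on the character explicitly instead.
print(len(text.split()))          # 1
print(text.split(ZWSP) == words)  # True
```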

General sources: Unicode p872

U+205F MEDIUM MATHEMATICAL SPACE

Format characters

U+200C ZERO WIDTH NON-JOINER

Prevents two adjacent letters forming a cursive connection with each other when rendered.

Persian

The ZWNJ is used in Persian for plural suffixes, some proper names, and Ottoman Turkish vowels. Ignoring or removing the ZWNJ will result in text with a different meaning or meaningless text.1 For example, تن‌ها is the plural of body, whereas تنها is the adjective alone.2 The only difference is the presence or absence of ZWNJ after noon. u373 g
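To make the point about removal concrete, here is a small sketch that compares the two Persian words above. The helper strip_format_chars is hypothetical; it stands in for any clean-up step that discards invisible format characters.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

# teh + noon, ZWNJ, heh + alef: the plural of "body"
plural_of_body = "\u062a\u0646" + ZWNJ + "\u0647\u0627"
# teh + noon + heh + alef: the adjective "alone"
alone = "\u062a\u0646\u0647\u0627"


def strip_format_chars(text: str) -> str:
    """Hypothetical clean-up that drops all format (Cf) characters."""
    return "".join(c for c in text if unicodedata.category(c) != "Cf")


print(plural_of_body == alone)                      # False
print(strip_format_chars(plural_of_body) == alone)  # True
```

The second result shows the problem: after the ZWNJ is stripped, the two words can no longer be told apart.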

Khmer

Khmer register shifters (ie. ◌៉ [U+17C9 KHMER SIGN MUUSIKATOAN​] or ◌៊ [U+17CA KHMER SIGN TRIISAP​]) usually appear above a consonant. However, if a superscript vowel is also attached to the consonant, the shifter is normally displayed below the consonant, instead. If you want to force the shifter to remain above the consonant, as is occasionally the case, insert ZWNJ between the consonant and the shifter. For example, ហ ហ៊ ហ៊ី ហ‌៊ី. u373 sk

General sources: Unicode

U+200D ZERO WIDTH JOINER

Permits a letter to form a cursive connection without a visible neighbour.

Arabic

The marker for hijri dates is an initial form of heh, even though it doesn't join to the left, ie. ه‍. For this, use a U+200D ZERO WIDTH JOINER immediately after the heh, eg. الاثنين 10 رجب 1415 ه‍..

In some cases ـ [U+0640 ARABIC TATWEEL] is used to ensure that the shape looks right, because some applications or fonts don't produce the right effect when using the ZWJ, eg. الاثنين 10 رجب 1415 هـ..
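The two approaches produce different code point sequences, which matters for searching and comparison. The sketch below only lists what each sequence contains; the variable names are my own, and how either sequence is drawn depends on the font and application.

```python
import unicodedata

HEH = "\u0647"      # ARABIC LETTER HEH
ZWJ = "\u200d"      # ZERO WIDTH JOINER
TATWEEL = "\u0640"  # ARABIC TATWEEL

hijri_zwj = HEH + ZWJ          # invisible joiner after the heh
hijri_tatweel = HEH + TATWEEL  # visible elongation after the heh

for label, s in (("ZWJ", hijri_zwj), ("tatweel", hijri_tatweel)):
    parts = [
        f"U+{ord(c):04X} {unicodedata.name(c)} ({unicodedata.category(c)})"
        for c in s
    ]
    print(label, parts)
```

The ZWJ is a format character (general category Cf), whereas the tatweel is a modifier letter (Lm), so the tatweel version adds a visible character to the text.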

U+200E LEFT-TO-RIGHT MARK

U+200F RIGHT-TO-LEFT MARK

U+2028 LINE SEPARATOR

U+2029 PARAGRAPH SEPARATOR

U+202A LEFT-TO-RIGHT EMBEDDING

U+202B RIGHT-TO-LEFT EMBEDDING

U+202C POP DIRECTIONAL FORMATTING

U+202D LEFT-TO-RIGHT OVERRIDE

U+202E RIGHT-TO-LEFT OVERRIDE

U+202F NARROW NO-BREAK SPACE

U+2060 WORD JOINER

An invisible character, equivalent to a zero-width no-break space, and used to prevent line-breaks, eg. it can be used around the + sign in base⁠+delta⁠ to prevent a line break occurring in that sequence of characters. It has no effect on word segmentation.

It can also be used to bracket other characters to turn them into non-breaking characters, such as U+2009 THIN SPACE or U+2015 HORIZONTAL BAR.

Not to be confused with U+200D ZERO WIDTH JOINER or U+034F COMBINING GRAPHEME JOINER​, since it has no effect on shaping.

This functionality is also provided by U+FEFF ZERO WIDTH NO-BREAK SPACE, but since that character also represents the byte-order mark, the use of this word joiner character (added in Unicode 3.2) is strongly preferred over the latter.
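A minimal sketch of the bracketing idea, assuming a hypothetical helper called non_breaking. It shows only how the string is built; whether the line break is actually suppressed is up to the application that lays out the text.

```python
WJ = "\u2060"          # WORD JOINER
THIN_SPACE = "\u2009"  # THIN SPACE


def non_breaking(ch: str) -> str:
    """Bracket a character with word joiners (hypothetical helper)."""
    return WJ + ch + WJ


# Keep the + sign attached to both of its operands.
expression = "base" + non_breaking("+") + "delta"
print([f"U+{ord(c):04X}" for c in expression[4:7]])
# ['U+2060', 'U+002B', 'U+2060']

# A thin space that should not offer a break opportunity.
measurement = "10" + non_breaking(THIN_SPACE) + "km"
print(len(measurement))  # 7
```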

General sources: Unicode

U+2066 LEFT-TO-RIGHT ISOLATE

U+2067 RIGHT-TO-LEFT ISOLATE

U+2068 FIRST STRONG ISOLATE

U+2069 POP DIRECTIONAL ISOLATE

Dashes and hyphens

U+2010 HYPHEN

U+2011 NON-BREAKING HYPHEN

U+2012 FIGURE DASH

U+2013 EN DASH

U+2014 EM DASH

U+2015 HORIZONTAL BAR

U+2E3A TWO-EM DASH

U+2E3B THREE-EM DASH

U+2E40 DOUBLE HYPHEN

U+2E43 DASH WITH LEFT UPTURN

Quotations and apostrophe

U+2018 LEFT SINGLE QUOTATION MARK

U+2019 RIGHT SINGLE QUOTATION MARK

U+201A SINGLE LOW-9 QUOTATION MARK

U+201B SINGLE HIGH-REVERSED-9 QUOTATION MARK

U+201C LEFT DOUBLE QUOTATION MARK

U+201D RIGHT DOUBLE QUOTATION MARK

U+201E DOUBLE LOW-9 QUOTATION MARK

U+201F DOUBLE HIGH-REVERSED-9 QUOTATION MARK

U+2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK

U+203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK

Brackets

U+2E1C LEFT LOW PARAPHRASE BRACKET

U+2E1D RIGHT LOW PARAPHRASE BRACKET

U+2E20 LEFT VERTICAL BAR WITH QUILL

U+2E21 RIGHT VERTICAL BAR WITH QUILL

U+2E22 TOP LEFT HALF BRACKET

U+2E23 TOP RIGHT HALF BRACKET

U+2E24 BOTTOM LEFT HALF BRACKET

U+2E25 BOTTOM RIGHT HALF BRACKET

U+2E26 LEFT SIDEWAYS U BRACKET

U+2E27 RIGHT SIDEWAYS U BRACKET

U+2E28 LEFT DOUBLE PARENTHESIS

U+2E29 RIGHT DOUBLE PARENTHESIS

General punctuation

U+2016 DOUBLE VERTICAL LINE

Called double bar.b

An old standard reference mark used with footnotes. When used for this purpose with other signs, the traditional order is * † ‡ § ‖ ¶.b
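The traditional order can be expressed as a simple lookup. This is only a sketch: the function name reference_mark is my own, and what to do after the sixth mark is left as an error, since the notes above do not say.

```python
# Traditional order of reference marks: * † ‡ § ‖ ¶
REFERENCE_MARKS = ["*", "\u2020", "\u2021", "\u00a7", "\u2016", "\u00b6"]


def reference_mark(n: int) -> str:
    """Return the mark for the n-th footnote (1-based)."""
    if not 1 <= n <= len(REFERENCE_MARKS):
        raise ValueError("only six marks are listed; no fallback is assumed")
    return REFERENCE_MARKS[n - 1]


print(" ".join(reference_mark(i) for i in range(1, 7)))  # * † ‡ § ‖ ¶
```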

Also used as a standard symbol for bibliographic work.b

General sources: Unicode

U+2017 DOUBLE LOW LINE

U+2020 DAGGER

Called dagger, but also known as obelisk, obelus, or long cross.b321

A reference mark, used primarily with footnotes. When used for this purpose with other signs, the traditional order is * † ‡ § ‖ ¶.b68

Also a death sign in European typography, used to mark the year of death or the names of dead persons.b321

In lexicography it marks obsolete forms, and in editing of classical texts flags passages judged to be corrupt.b321

General sources: Unicode

U+2021 DOUBLE DAGGER

Called double dagger, but also known as diesis or double obelisk.b321

A reference mark used with footnotes. When used for this purpose with other signs, the traditional order is * † ‡ § ‖ ¶.b68

General sources: Unicode

U+2022 BULLET

U+2023 TRIANGULAR BULLET

U+2024 ONE DOT LEADER

Armenian punctuation miǰakēt

Used like a semi-colon – a shorter break than a full stop. u322

Ոչ ոք չպետք է լինի ստրկության կամ անազատ վիճակում․ պետք է արգելվեն ստրկատիրության ու ստրուկների առուծախի բոլոր ձևերը։ 

U+2025 TWO DOT LEADER

U+2026 HORIZONTAL ELLIPSIS

U+2027 HYPHENATION POINT

U+2030 PER MILLE SIGN

U+2031 PER TEN THOUSAND SIGN

U+2032 PRIME

Abbreviation for feet (1′ = 12″).b330

Also used for minutes of arc (eg. 60′=1°).b330

U+2033 DOUBLE PRIME

Abbreviation for inches (1′ = 12″).b321

Also used for seconds of arc (eg. 3600″=1°).b321
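As a worked example of both uses of the prime and double prime, the sketch below formats a length and an angle. It assumes the usual relationships of 12 inches to a foot, 60 seconds of arc to a minute, and 60 minutes to a degree; the function names are my own.

```python
PRIME = "\u2032"         # feet, minutes of arc
DOUBLE_PRIME = "\u2033"  # inches, seconds of arc


def format_length(total_inches: int) -> str:
    """Format a whole number of inches as feet and inches."""
    feet, inches = divmod(total_inches, 12)
    return f"{feet}{PRIME} {inches}{DOUBLE_PRIME}"


def format_arc(total_seconds: int) -> str:
    """Format a whole number of arc seconds as degrees, minutes, seconds."""
    degrees, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{degrees}\u00b0 {minutes}{PRIME} {seconds}{DOUBLE_PRIME}"


print(format_length(71))  # 5′ 11″
print(format_arc(3725))   # 1° 2′ 5″
```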

U+2034 TRIPLE PRIME

U+2035 REVERSED PRIME

U+2036 REVERSED DOUBLE PRIME

U+2037 REVERSED TRIPLE PRIME

U+2038 CARET

U+203B REFERENCE MARK

U+203D INTERROBANG

U+203E OVERLINE

U+203F UNDERTIE

U+2040 CHARACTER TIE

U+2041 CARET INSERTION POINT

U+2042 ASTERISM

U+2043 HYPHEN BULLET

U+2044 FRACTION SLASH

U+2045 LEFT SQUARE BRACKET WITH QUILL

U+2046 RIGHT SQUARE BRACKET WITH QUILL

U+204A TIRONIAN SIGN ET

U+204B REVERSED PILCROW SIGN

U+204C BLACK LEFTWARDS BULLET

U+204D BLACK RIGHTWARDS BULLET

U+204E LOW ASTERISK

U+204F REVERSED SEMICOLON

U+2050 CLOSE UP

U+2051 TWO ASTERISKS ALIGNED VERTICALLY

U+2052 COMMERCIAL MINUS SIGN

U+2053 SWUNG DASH

U+2054 INVERTED UNDERTIE

U+2055 FLOWER PUNCTUATION MARK

U+2057 QUADRUPLE PRIME

U+2E18 INVERTED INTERROBANG

U+2E19 PALM BRANCH

U+2E44 DOUBLE SUSPENSION MARK

Double punctuation for vertical text

U+203C DOUBLE EXCLAMATION MARK

U+2047 DOUBLE QUESTION MARK

U+2048 QUESTION EXCLAMATION MARK

U+2049 EXCLAMATION QUESTION MARK

Archaic/historical punctuation

U+2056 THREE DOT PUNCTUATION

U+2058 FOUR DOT PUNCTUATION

U+2059 FIVE DOT PUNCTUATION

U+205A TWO DOT PUNCTUATION

U+205B FOUR DOT MARK

U+205C DOTTED CROSS

U+205D TRICOLON

U+205E VERTICAL FOUR DOTS

U+2E2A TWO DOTS OVER ONE DOT PUNCTUATION

U+2E2B ONE DOT OVER TWO DOTS PUNCTUATION

U+2E2C SQUARED FOUR DOT PUNCTUATION

U+2E2D FIVE DOT MARK

U+2E2E REVERSED QUESTION MARK

U+2E2F VERTICAL TILDE

U+2E30 RING POINT

U+2E31 WORD SEPARATOR MIDDLE DOT

U+2E33 RAISED DOT

U+2E34 RAISED COMMA

U+2E3F CAPITULUM

Specialist symbols

New Testament editorial symbols

Ancient Greek textual symbols

Ancient Near-Eastern linguistic symbol

Dictionary punctuation

Palaeotype transliteration symbol

Typicon punctuation

Alternate forms of punctuation

Reversed punctuation

Invisible operators

U+2061 FUNCTION APPLICATION

U+2062 INVISIBLE TIMES

U+2063 INVISIBLE SEPARATOR

U+2064 INVISIBLE PLUS

Deprecated

U+206A INHIBIT SYMMETRIC SWAPPING

U+206B ACTIVATE SYMMETRIC SWAPPING

U+206C INHIBIT ARABIC FORM SHAPING

U+206D ACTIVATE ARABIC FORM SHAPING

U+206E NATIONAL DIGIT SHAPES

U+206F NOMINAL DIGIT SHAPES

CJK Symbols & punctuation

CJK Symbols & punctuation

CJK angle brackets

CJK corner brackets

CJK brackets

CJK symbols

Suzhou numerals

Combining tone marks

Kana repeat marks

Special CJK indicators

Halfwidth and Fullwidth Forms

Fullwidth ASCII variants

Fullwidth brackets

Halfwidth CJK punctuation

Halfwidth katakana variants

Halfwidth hangul variants

Fullwidth symbol variants

Halfwidth symbol variants

References