Hanzi/Chinese

Updated 20 September, 2021

This page gathers basic information about the Han Simplified and Traditional writing systems and their use for the Chinese language. It aims (generally) to provide an overview of the orthography and typographic features, and (specifically) to advise how to write Chinese using Unicode.

Samples

Simplified Chinese

第一条 人人生而自由,在尊严和权利上一律平等。他们赋有理性和良心,并应以兄弟关系的精神相对待。

第二条 人人有资格享有本宣言所载的一切权利和自由,不分种族、肤色、性别、语言、宗教、政治或其他见解、国籍或社会出身、财产、出生或其他身分等任何区别。并且不得因一人所属的国家或领土的政治的、行政的或者国际的地位之不同而有所区别,无论该领土是独立领土、托管领土、非自治领土或者处于其他任何主权受限制的情况之下。

Traditional Chinese

第一條 人人生而自由,在尊嚴和權利上一律平等。他們賦有理性和良心,並應以兄弟關係的精神相對待。

第二條 人人有資格享受本宣言所載的一切權利和自由,不分種族、膚色、性別、語言、宗教、政治或其他見解、國籍或社會出身、財產、出生或其他身分等任何區別。

Usage & history

Two styles of Han characters are used to write Chinese. The traditional orthography was used from the 5th century until 1949 in Mainland China. The simplified orthography was introduced in 1949 and is used in Mainland China and Singapore. Traditional Han is still used in Taiwan, Hong Kong and Macau, and for aesthetic purposes elsewhere in East Asia.

People speaking different Chinese dialects nevertheless write largely the same way, due to the way that the Han characters represent concepts rather than sounds.

Han characters are also widely used in Japan to represent the main part of Japanese words, and are sometimes used in Korea (though modern Korean text will contain very few, if any, Han characters).

汉字 hànzì Simplified Chinese 漢字 hànzì Traditional Chinese

Chinese writing dates from the second half of the second millennium BC. There is no evidence for a predecessor. The earliest inscriptions were on bones and shells used in divination during the Shang dynasty (1600-1046 BC), and employed a set of logographic symbols now known as the Oracle Bone Script. Although these symbols have been extinct since the end of the Bronze Age, the modern Han characters are direct descendants of them.

Sources: Scriptsource, Wikipedia.

Basic features

The Han script is an ideographic script. Characters typically represent a spoken syllable with its tone. See the table to the right for a brief overview of features for the modern Mandarin Chinese orthography, using the Simplified Chinese orthography. The character count reflects a typical set of characters needed for everyday reading and writing: there are many thousands more Han characters that could be added for other purposes (see chars).

The Simplified Chinese orthography has a smaller repertoire and simpler shapes than the Traditional version.

The Chinese script is used as a common writing system by people who may speak a wide variety of Chinese languages, and who may pronounce the written text very differently. This is possible because the characters represent concepts rather than phonetics.

Text can be written horizontally, left to right, or vertically with lines progressing from right to left. Vertically set text is more common in Traditional Chinese than in Simplified Chinese areas.

Words are not separated by spaces.

In its 'main' category, CLDR lists 2,210 characters for the Simplified Chinese orthography, and 2,180 for Traditional Chinese. Combined, this includes 3,026 unique characters, and an overlap of 1,364 characters.

The language is tonal, but the tones are not written explicitly.

Chinese has no combining marks, but has many punctuation marks. It also has a relatively complex set of typographic rules.

The visual forms of characters don't interact.

Character index

Ideographic characters

See cldr_character_lists.

For other associated blocks, see chars.

Counter styles

一␣丁␣七␣三␣丑␣丙␣乙␣九␣二␣五␣亥␣仟␣伍␣佰␣八␣六␣十␣千␣午␣卯␣叁␣參␣四␣壬␣壹␣子␣寅␣己␣巳␣庚␣戊␣戌␣拾␣捌␣未␣柒␣玖␣甲␣申␣癸␣百␣肆␣貳␣贰␣辛␣辰␣酉␣陆␣陸␣零␣𝍲␣𝍳␣𝍴␣𝍵␣𝍶

Numbers

〇␣㈠␣㈡␣㈢␣㈣␣㈤␣㈥␣㈦␣㈧␣㈨␣㈩␣㊀␣㊁␣㊂␣㊃␣㊄␣㊅␣㊆␣㊇␣㊈␣㊉

Punctuation

!␣"␣#␣%␣&␣(␣)␣*␣,␣-␣.␣/␣:␣;␣?␣@␣[␣\␣]␣_␣{␣}␣§␣·␣‐␣‑␣–␣—␣―␣‖␣‘␣’␣“␣”␣†␣‡␣•␣‥␣…␣‧␣‰␣′␣″␣‵␣※␣‼␣‾␣⁇␣⁈␣⁉␣⸺␣、␣。␣〃␣〈␣〉␣《␣》␣「␣」␣『␣』␣【␣】␣〔␣〕␣〖␣〗␣〝␣〞␣・␣!␣"␣#␣%␣&␣'␣(␣)␣*␣,␣-␣.␣/␣:␣;␣?␣@␣[␣\␣]␣_␣{␣}

CLDR additions

︰␣︱␣︲␣︳␣︴␣︵␣︶␣︷␣︸␣︹␣︺␣︻␣︼␣︽␣︾␣︿␣﹀␣﹁␣﹂␣﹃␣﹄␣﹉␣﹊␣﹋␣﹌␣﹍␣﹎␣﹏␣﹐␣﹑␣﹒␣﹔␣﹕␣﹖␣﹗␣﹘␣﹙␣﹚␣﹛␣﹜␣﹝␣﹞␣﹟␣﹠␣﹡␣﹣␣﹨␣﹪␣﹫

Symbols

Phonology

This section lists sounds for Mandarin Chinese, as spoken in the Beijing area.

Phones in a lighter colour are non-native or allophones.wp.

Vowel sounds

Plain vowels

i y y u ɤ ɤ o ə ə a

Diphthongs

ie ia iu ye ye uə ua uo ei ou ai au

Consonant sounds

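            | labial | dental | alveolar | post-alveolar | retroflex | palatal | velar
stop        | p      | -      | t        | -             | -         | -       | k
affricate   | -      | -      | t͡s t͡sʰ | t͡ɕ t͡ɕʰ | t͡ʂ t͡ʂʰ | -       | -
fricative   | f      | -      | s        | ɕ             | ʂ         | -       | x
nasal       | m      | -      | n        | -             | -         | -       | ŋ
approximant | -      | -      | l        | -             | ɻ         | -       | -
trill/flap  | -      | -      | -        | -             | -         | -       | -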

Characters

'Ideographic' characters

Chinese text is primarily constructed from characters that each correspond to a spoken syllable, including its tone. Some have pictographic origins that are still evident, whereas others have a more complicated structure.

It is said that Chinese people typically use around 3,000 to 4,000 characters for most communication, but a reasonable word processor would need to support at least 10,000. Unicode supports over 70,000 Han characters, most of which cover advanced or esoteric usage.

CLDR character lists

The listings here show the characters in CLDR's lists of characters (exemplarCharacters), as of CLDR version 36.

Show characters used for Simplified Chinese.
Main 一 丁 七 万 丈 三 上 下 丌 不 与 丑 专 且 世 丘 丙 业 东 丝 丢 两 严 丧 个 中 丰 串 临 丸 丹 为 主 丽 举 乃 久 么 义 之 乌 乍 乎 乏 乐 乔 乖 乘 乙 九 也 习 乡 书 买 乱 乾 了 予 争 事 二 于 亏 云 互 五 井 亚 些 亡 交 亥 亦 产 亨 享 京 亮 亲 人 亿 什 仁 仅 仇 今 介 仍 从 仔 他 付 仙 代 令 以 仪 们 仰 仲 件 价 任 份 仿 企 伊 伍 伏 伐 休 众 优 伙 会 伟 传 伤 伦 伯 估 伴 伸 似 伽 但 位 低 住 佐 佑 体 何 余 佛 作 你 佤 佩 佳 使 例 供 依 侠 侦 侧 侨 侬 侯 侵 便 促 俄 俊 俗 保 信 俩 修 俱 俾 倍 倒 候 倚 借 倦 值 倾 假 偌 偏 做 停 健 偶 偷 储 催 傲 傻 像 僧 儒 儿 允 元 兄 充 兆 先 光 克 免 兑 兔 党 入 全 八 公 六 兮 兰 共 关 兴 兵 其 具 典 兹 养 兼 兽 内 冈 册 再 冒 写 军 农 冠 冬 冰 冲 决 况 冷 准 凌 减 凝 几 凡 凤 凭 凯 凰 出 击 函 刀 分 切 刊 刑 划 列 刘 则 刚 创 初 判 利 别 到 制 刷 券 刺 刻 剂 前 剑 剧 剩 剪 副 割 力 劝 办 功 加 务 劣 动 助 努 劫 励 劲 劳 势 勇 勉 勋 勒 勤 勾 勿 包 匆 匈 化 北 匙 匹 区 医 十 千 升 午 半 华 协 卒 卓 单 卖 南 博 占 卡 卢 卫 卯 印 危 即 却 卷 厂 厄 厅 历 厉 压 厌 厍 厚 原 去 县 参 又 叉 及 友 双 反 发 叔 取 受 变 叙 口 古 句 另 只 叫 召 叭 可 台 史 右 叶 号 司 叹 吃 各 合 吉 吊 同 名 后 吐 向 吓 吗 君 吝 吟 否 吧 含 听 启 吵 吸 吹 吻 吾 呀 呆 呈 告 呐 员 呜 呢 呦 周 味 呵 呼 命 和 咖 咦 咧 咨 咪 咬 咯 咱 哀 品 哇 哈 哉 响 哎 哟 哥 哦 哩 哪 哭 哲 唉 唐 唤 唬 售 唯 唱 唷 商 啊 啡 啥 啦 啪 喀 喂 善 喇 喊 喏 喔 喜 喝 喵 喷 喻 嗒 嗨 嗯 嘉 嘛 嘴 嘻 嘿 器 四 回 因 团 园 困 围 固 国 图 圆 圈 土 圣 在 圭 地 圳 场 圾 址 均 坎 坐 坑 块 坚 坛 坜 坡 坤 坦 坪 垂 垃 型 垒 埃 埋 城 埔 域 培 基 堂 堆 堕 堡 堪 塑 塔 塞 填 境 增 墨 壁 壤 士 壬 壮 声 处 备 复 夏 夕 外 多 夜 够 夥 大 天 太 夫 央 失 头 夷 夸 夹 夺 奇 奈 奉 奋 奏 契 奔 奖 套 奥 女 奴 奶 她 好 如 妇 妈 妖 妙 妥 妨 妮 妹 妻 姆 姊 始 姐 姑 姓 委 姿 威 娃 娄 娘 娜 娟 娱 婆 婚 媒 嫁 嫌 嫩 子 孔 孕 字 存 孙 孜 孝 孟 季 孤 学 孩 宁 它 宇 守 安 宋 完 宏 宗 官 宙 定 宛 宜 宝 实 审 客 宣 室 宪 害 宴 家 容 宽 宾 宿 寂 寄 寅 密 寇 富 寒 寝 寞 察 寡 寨 寸 对 寻 导 寿 封 射 将 尊 小 少 尔 尖 尘 尚 尝 尤 就 尺 尼 尽 尾 局 屁 层 居 屋 屏 展 属 屠 山 岁 岂 岗 岘 岚 岛 岳 岸 峡 峰 崇 崩 崴 川 州 巡 工 左 巧 巨 巫 差 己 已 巳 巴 巷 币 市 布 帅 师 希 帐 帕 帖 帝 带 席 帮 常 帽 幅 幕 干 平 年 并 幸 幻 幼 幽 广 庆 床 序 库 应 底 店 庙 庚 府 庞 废 度 座 庭 康 庸 廉 廖 延 廷 建 开 异 弃 弄 弊 式 引 弗 弘 弟 张 弥 弦 弯 弱 弹 强 归 当 录 彝 形 彩 彬 彭 彰 影 彷 役 彻 彼 往 征 径 待 很 律 後 徐 徒 得 循 微 徵 德 心 必 忆 忌 忍 志 忘 忙 忠 忧 快 念 忽 怀 态 怎 怒 怕 怖 思 怡 急 性 怨 怪 总 恋 恐 恢 恨 恩 恭 息 恰 恶 恼 悄 悉 悔 悟 悠 患 您 悲 情 惑 惜 惠 惧 惨 惯 想 惹 愁 愈 愉 意 愚 感 愧 慈 慎 慕 慢 慧 慰 憾 懂 懒 戈 戊 戌 戏 成 我 戒 或 战 截 戴 户 房 所 扁 扇 手 才 扎 扑 打 托 扣 执 扩 扫 扬 扭 扮 扯 批 找 承 技 抄 把 抑 抓 投 抗 折 抢 护 报 披 抬 抱 抵 抹 抽 担 拆 拉 拍 拒 拔 拖 拘 招 拜 拟 拥 拦 拨 择 括 拳 拷 拼 拾 拿 持 指 按 挑 挖 挝 挡 挤 挥 挪 振 挺 捉 捐 捕 损 捡 换 据 捷 授 掉 掌 排 探 接 控 推 掩 措 掸 描 提 插 握 援 搜 搞 搬 搭 摄 摆 摊 摔 摘 摩 摸 撒 撞 播 操 擎 擦 支 收 改 攻 
放 政 故 效 敌 敏 救 教 敝 敢 散 敦 敬 数 敲 整 文 斋 斐 斗 料 斜 斥 断 斯 新 方 於 施 旁 旅 旋 族 旗 无 既 日 旦 旧 旨 早 旭 时 旺 昂 昆 昌 明 昏 易 星 映 春 昨 昭 是 显 晃 晋 晒 晓 晚 晨 普 景 晴 晶 智 暂 暑 暖 暗 暮 暴 曰 曲 更 曹 曼 曾 替 最 月 有 朋 服 朗 望 朝 期 木 未 末 本 札 术 朱 朵 机 杀 杂 权 杉 李 材 村 杜 束 条 来 杨 杯 杰 松 板 极 构 析 林 果 枝 枢 枪 枫 架 柏 某 染 柔 查 柬 柯 柳 柴 标 栋 栏 树 校 样 核 根 格 桃 框 案 桌 桑 档 桥 梁 梅 梦 梯 械 梵 检 棉 棋 棒 棚 森 椅 植 椰 楚 楼 概 榜 模 樱 檀 欠 次 欢 欣 欧 欲 欺 款 歉 歌 止 正 此 步 武 歪 死 殊 残 段 毅 母 每 毒 比 毕 毛 毫 氏 民 气 氛 水 永 求 汇 汉 汗 汝 江 池 污 汤 汪 汶 汽 沃 沈 沉 沙 沟 没 沧 河 油 治 沿 泉 泊 法 泛 泡 波 泣 泥 注 泰 泳 泽 洋 洗 洛 洞 津 洪 洲 活 洽 派 流 浅 测 济 浏 浑 浓 浙 浦 浩 浪 浮 浴 海 涅 消 涉 涛 涨 涯 液 涵 淋 淑 淘 淡 深 混 添 清 渐 渡 渣 温 港 渴 游 湖 湾 源 溜 溪 滋 滑 满 滥 滨 滴 漂 漏 演 漠 漫 潘 潜 潮 澎 澳 激 灌 火 灭 灯 灰 灵 灿 炉 炎 炮 炸 点 烂 烈 烤 烦 烧 热 焦 然 煌 煞 照 煮 熊 熟 燃 燕 爆 爪 爬 爱 爵 父 爷 爸 爽 片 版 牌 牙 牛 牡 牢 牧 物 牲 牵 特 牺 犯 状 犹 狂 狐 狗 狠 独 狮 狱 狼 猛 猜 猪 献 猴 玄 率 玉 王 玛 玩 玫 环 现 玲 玻 珀 珊 珍 珠 班 球 理 琊 琪 琳 琴 琼 瑙 瑜 瑞 瑟 瑰 瑶 璃 瓜 瓦 瓶 甘 甚 甜 生 用 田 由 甲 申 电 男 甸 画 畅 界 留 略 番 疆 疏 疑 疗 疯 疲 疼 疾 病 痕 痛 痴 癸 登 白 百 的 皆 皇 皮 盈 益 监 盒 盖 盘 盛 盟 目 直 相 盼 盾 省 眉 看 真 眠 眼 着 睛 睡 督 瞧 矛 矣 知 短 石 矶 码 砂 砍 研 破 础 硕 硬 确 碍 碎 碗 碟 碧 碰 磁 磅 磨 示 礼 社 祖 祚 祝 神 祥 票 祯 祸 禁 禅 福 离 秀 私 秋 种 科 秒 秘 租 秤 秦 秩 积 称 移 稀 程 稍 税 稣 稳 稿 穆 究 穷 穹 空 穿 突 窗 窝 立 站 竞 竟 章 童 端 竹 笑 笔 笛 符 笨 第 等 筋 筑 答 策 筹 签 简 算 管 箭 箱 篇 篮 簿 籍 米 类 粉 粒 粗 粤 粹 精 糊 糕 糖 糟 系 素 索 紧 紫 累 繁 红 约 级 纪 纯 纲 纳 纵 纷 纸 纽 线 练 组 细 织 终 绍 经 结 绕 绘 给 络 绝 统 继 绩 绪 续 维 绵 综 绿 缅 缓 编 缘 缠 缩 缴 缶 缸 缺 罐 网 罕 罗 罚 罢 罪 置 署 羊 美 羞 群 羯 羽 翁 翅 翔 翘 翠 翰 翻 翼 耀 老 考 者 而 耍 耐 耗 耳 耶 聊 职 联 聘 聚 聪 肉 肖 肚 股 肤 肥 肩 肯 育 胁 胆 背 胎 胖 胜 胞 胡 胶 胸 能 脆 脑 脱 脸 腊 腐 腓 腰 腹 腾 腿 臂 臣 自 臭 至 致 舌 舍 舒 舞 舟 航 般 舰 船 良 色 艺 艾 节 芒 芝 芦 芬 芭 花 芳 苍 苏 苗 若 苦 英 茂 范 茨 茫 茶 草 荐 荒 荣 药 荷 莉 莎 莪 莫 莱 莲 获 菜 菩 菲 萄 萍 萤 营 萧 萨 落 著 葛 葡 蒂 蒋 蒙 蓉 蓝 蓬 蔑 蔡 薄 薪 藉 藏 藤 虎 虑 虫 虹 虽 虾 蚁 蛇 蛋 蛙 蛮 蜂 蜜 蝶 融 蟹 蠢 血 行 街 衡 衣 补 表 袋 被 袭 裁 裂 装 裕 裤 西 要 覆 见 观 规 视 览 觉 角 解 言 誉 誓 警 计 订 认 讨 让 训 议 讯 记 讲 讷 许 论 设 访 证 评 识 诉 词 译 试 诗 诚 话 诞 询 该 详 语 误 说 请 诸 诺 读 课 谁 调 谅 谈 谊 谋 谓 谜 谢 谨 谱 谷 豆 象 豪 貌 贝 贞 负 贡 财 责 贤 败 货 质 贩 贪 购 贯 贱 贴 贵 贸 费 贺 贼 贾 资 赋 赌 赏 赐 赔 赖 赚 赛 赞 赠 赢 赤 赫 走 赵 起 趁 超 越 趋 趣 足 跃 跌 跑 距 跟 路 跳 踏 踢 踩 身 躲 车 轨 轩 转 轮 软 轰 轻 载 较 辅 辆 辈 辉 辑 输 辛 辞 辨 辩 辰 辱 边 达 迁 迅 过 迈 迎 运 近 返 还 这 进 远 违 连 迟 迦 迪 迫 述 迷 追 退 送 适 逃 逆 选 逊 透 逐 递 途 通 逛 逝 速 造 逢 逸 
逻 逼 遇 遍 道 遗 遭 遮 遵 避 邀 邓 那 邦 邪 邮 邱 邻 郎 郑 部 郭 都 鄂 酉 酋 配 酒 酷 酸 醉 醒 采 释 里 重 野 量 金 针 钓 钟 钢 钦 钱 钻 铁 铃 铜 铢 铭 银 铺 链 销 锁 锅 锋 错 锡 锦 键 锺 镇 镜 镭 长 门 闪 闭 问 闰 闲 间 闷 闹 闻 阁 阅 阐 阔 队 阮 防 阳 阴 阵 阶 阻 阿 陀 附 际 陆 陈 降 限 院 除 险 陪 陵 陶 陷 隆 随 隐 隔 障 难 雄 雅 集 雉 雨 雪 雯 雳 零 雷 雾 需 震 霍 霖 露 霸 霹 青 靖 静 非 靠 面 革 靼 鞋 鞑 韦 韩 音 页 顶 项 顺 须 顽 顾 顿 预 领 颇 频 颗 题 额 风 飘 飙 飞 食 餐 饭 饮 饰 饱 饼 馆 首 香 馨 马 驱 驶 驻 驾 验 骑 骗 骚 骤 骨 高 鬼 魂 魅 魔 鱼 鲁 鲜 鸟 鸡 鸣 鸭 鸿 鹅 鹤 鹰 鹿 麦 麻 黄 黎 黑 默 鼓 鼠 鼻 齐 齿 龄 龙 龟 2,210
Auxiliary 仂 侣 傈 傣 僳 卑 卞 厘 吕 坝 堤 奎 屿 巽 撤 楔 楠 滕 瑚 甫 盲 碑 禄 粟 脚 艮 谬 钯 铂 锑 镑 魁 乒 乓 仓 伞 冥 凉 刨 匕 厦 厨 呣 唇 啤 啮 喱 嗅 噘 噢 墟 妆 婴 媚 宅 寺 尬 尴 屑 巾 弓 彗 惊 戟 扔 扰 扳 抛 挂 捂 摇 撅 杆 杖 柜 柱 栗 栽 桶 棍 棕 棺 榈 槟 橙 洒 浆 涌 淇 滚 滩 灾 烛 烟 焰 煎 犬 猫 瓢 皱 盆 盔 眨 眯 瞌 矿 祈 祭 祷 稻 竿 笼 筒 篷 粮 纠 纬 缆 缎 耸 舔 舵 艇 芽 苜 苞 菇 菱 葫 葵 蒸 蓿 蔽 薯 蘑 蚂 蛛 蜗 蜘 蜡 蝎 蝴 螃 裹 谍 豚 账 跤 踪 躬 轴 辐 迹 郁 鄙 酢 钉 钥 钮 铅 铛 锄 锚 锤 闺 阱 隧 雕 霾 靴 靶 鞠 颠 馏 驼 骆 髦 鲤 鲸 鳄 鸽 181
Show characters used for Traditional Chinese.
Main 一 丁 七 丈 三 上 下 丌 不 丑 且 世 丘 丙 丟 並 中 串 丸 丹 主 乃 久 么 之 乎 乏 乖 乘 乙 九 也 乾 亂 了 予 事 二 于 云 互 五 井 些 亞 亡 交 亥 亦 亨 享 京 亮 人 什 仁 仇 今 介 仍 仔 他 付 仙 代 令 以 仰 仲 件 任 份 企 伊 伍 伐 休 伙 伯 估 伴 伸 似 伽 但 佈 佉 位 低 住 佔 何 余 佛 作 你 佩 佳 使 來 例 供 依 侯 侵 便 係 促 俄 俊 俗 保 俠 信 修 俱 俾 個 倍 們 倒 候 倚 借 倫 值 假 偉 偏 做 停 健 側 偵 偶 偷 傑 備 傢 傣 傲 傳 傷 傻 傾 僅 像 僑 僧 價 儀 億 儒 儘 優 允 元 兄 充 兇 先 光 克 免 兒 兔 入 內 全 兩 八 公 六 兮 共 兵 其 具 典 兼 冊 再 冒 冠 冬 冰 冷 准 凌 凝 凡 凰 凱 出 函 刀 分 切 刊 列 初 判 別 利 刪 到 制 刷 刺 刻 則 剌 前 剛 剩 剪 副 割 創 劃 劇 劉 劍 力 功 加 助 努 劫 勁 勇 勉 勒 動 務 勝 勞 勢 勤 勵 勸 勿 包 匈 化 北 匹 區 十 千 升 午 半 卒 卓 協 南 博 卜 卡 卯 印 危 即 卷 卻 厄 厘 厚 原 厭 厲 去 參 又 及 友 反 叔 取 受 口 古 句 另 只 叫 召 叭 可 台 史 右 司 吃 各 合 吉 吊 同 名 后 吐 向 吒 君 吝 吞 吟 吠 否 吧 含 吳 吵 吸 吹 吾 呀 呂 呆 告 呢 周 味 呵 呼 命 和 咖 咦 咧 咪 咬 咱 哀 品 哇 哈 哉 哎 員 哥 哦 哩 哪 哭 哲 唉 唐 唔 唬 售 唯 唱 唷 唸 商 啊 問 啟 啡 啥 啦 啪 喀 喂 善 喇 喊 喔 喜 喝 喬 單 喵 嗎 嗚 嗨 嗯 嘆 嘉 嘗 嘛 嘴 嘻 嘿 器 噴 嚇 嚴 囉 四 回 因 困 固 圈 國 圍 園 圓 圖 團 圜 土 在 圭 地 圾 址 均 坎 坐 坡 坤 坦 坪 垂 垃 型 埃 城 埔 域 執 培 基 堂 堅 堆 堡 堪 報 場 塊 塔 塗 塞 填 塵 境 增 墨 墮 壁 壇 壓 壘 壞 壢 士 壬 壯 壽 夏 夕 外 多 夜 夠 夢 夥 大 天 太 夫 央 失 夷 夸 夾 奇 奈 奉 奎 奏 契 奔 套 奧 奪 奮 女 奴 奶 她 好 如 妙 妝 妥 妨 妮 妳 妹 妻 姆 姊 始 姐 姑 姓 委 姿 威 娃 娘 娛 婁 婆 婚 婦 媒 媽 嫌 嫩 子 孔 字 存 孝 孟 季 孤 孩 孫 學 它 宅 宇 守 安 宋 完 宏 宗 官 宙 定 宛 宜 客 宣 室 宮 害 家 容 宿 寂 寄 寅 密 富 寒 寞 察 寢 實 寧 寨 審 寫 寬 寮 寵 寶 封 射 將 專 尊 尋 對 導 小 少 尖 尚 尤 就 尺 尼 尾 局 屁 居 屆 屋 屏 展 屠 層 屬 山 岡 岩 岸 峰 島 峽 崇 崙 崴 嵐 嶺 川 州 巡 工 左 巧 巨 巫 差 己 已 巳 巴 巷 市 布 希 帕 帖 帛 帝 帥 師 席 帳 帶 常 帽 幅 幕 幣 幫 干 平 年 幸 幹 幻 幼 幽 幾 庇 床 序 底 店 庚 府 度 座 庫 庭 康 庸 廉 廖 廠 廢 廣 廳 延 廷 建 弄 式 引 弗 弘 弟 弦 弱 張 強 彈 彊 彌 彎 彝 彞 形 彥 彩 彬 彭 彰 影 役 彼 往 征 待 很 律 後 徐 徑 徒 得 從 復 微 徵 德 徹 心 必 忌 忍 志 忘 忙 忠 快 念 忽 怎 怒 怕 怖 思 怡 急 性 怨 怪 恆 恐 恢 恥 恨 恩 恭 息 恰 悅 悉 悔 悟 悠 您 悲 悶 情 惑 惜 惠 惡 惱 想 惹 愁 愈 愉 意 愚 愛 感 慈 態 慕 慘 慢 慣 慧 慮 慰 慶 慾 憂 憐 憑 憲 憶 憾 懂 應 懶 懷 懼 戀 戈 戊 戌 成 我 戒 或 截 戰 戲 戴 戶 房 所 扁 扇 手 才 扎 打 托 扣 扥 扭 扯 批 找 承 技 抄 把 抓 投 抗 折 披 抬 抱 抵 抹 抽 拆 拉 拋 拍 拏 拒 拔 拖 招 拜 括 拳 拼 拾 拿 持 指 按 挑 挖 挪 振 挺 捐 捕 捨 捲 捷 掃 授 掉 掌 排 掛 採 探 接 控 推 措 描 提 插 揚 換 握 揮 援 損 搖 搜 搞 搬 搭 搶 摘 摩 摸 撐 撒 撞 撣 撥 播 撾 撿 擁 擇 擊 擋 操 擎 擔 據 擠 擦 擬 擴 擺 擾 攝 支 收 改 攻 放 政 故 效 敍 敏 救 敗 敘 教 敝 敢 散 敦 敬 整 敵 數 文 斐 斗 料 斯 新 斷 方 於 施 旁 旅 旋 族 旗 既 日 旦 早 旭 旺 昂 昆 昇 昌 明 昏 易 星 映 春 昨 昭 是 時 晉 晒 晚 晨 普 景 晴 晶 智 暑 暖 暗 暫 暴 曆 曉 曰 曲 更 書 曼 曾 替 最 會 月 有 朋 服 朗 望 
朝 期 木 未 末 本 札 朱 朵 杉 李 材 村 杜 束 杯 杰 東 松 板 析 林 果 枝 架 柏 某 染 柔 查 柬 柯 柳 柴 校 核 根 格 桃 案 桌 桑 梁 梅 條 梨 梯 械 梵 棄 棉 棋 棒 棚 森 椅 植 椰 楊 楓 楚 業 極 概 榜 榮 構 槍 樂 樓 標 樞 模 樣 樹 橋 機 橫 檀 檔 檢 欄 權 次 欣 欲 欺 欽 款 歉 歌 歐 歡 止 正 此 步 武 歲 歷 歸 死 殊 殘 段 殺 殼 毀 毅 母 每 毒 比 毛 毫 氏 民 氣 水 永 求 汗 汝 江 池 污 汪 汶 決 汽 沃 沈 沉 沒 沖 沙 河 油 治 沿 況 泉 泊 法 泡 波 泥 注 泰 泳 洋 洗 洛 洞 洩 洪 洲 活 洽 派 流 浦 浩 浪 浮 海 涇 消 涉 涯 液 涵 涼 淑 淚 淡 淨 深 混 淺 清 減 渡 測 港 游 湖 湯 源 準 溝 溪 溫 滄 滅 滋 滑 滴 滾 滿 漂 漏 演 漠 漢 漫 漲 漸 潔 潘 潛 潮 澤 澳 激 濃 濟 濤 濫 濱 瀏 灌 灣 火 灰 災 炎 炮 炸 為 烈 烏 烤 無 焦 然 煙 煞 照 煩 熊 熟 熱 燃 燈 燒 營 爆 爐 爛 爪 爬 爭 爵 父 爸 爺 爽 爾 牆 片 版 牌 牙 牛 牠 牧 物 牲 特 牽 犧 犯 狀 狂 狐 狗 狠 狼 猛 猜 猴 猶 獄 獅 獎 獨 獲 獸 獻 玄 率 玉 王 玩 玫 玲 玻 珊 珍 珠 珥 班 現 球 理 琉 琪 琴 瑙 瑜 瑞 瑟 瑤 瑪 瑰 環 瓜 瓦 瓶 甘 甚 甜 生 產 用 田 由 甲 申 男 甸 界 留 畢 略 番 畫 異 當 疆 疏 疑 疼 病 痕 痛 痴 瘋 療 癡 癸 登 發 白 百 的 皆 皇 皮 盃 益 盛 盜 盟 盡 監 盤 盧 目 盲 直 相 盼 盾 省 眉 看 真 眠 眼 眾 睛 睡 督 瞧 瞭 矛 矣 知 短 石 砂 砍 研 砲 破 硬 碎 碗 碟 碧 碩 碰 確 碼 磁 磨 磯 礎 礙 示 社 祕 祖 祚 祛 祝 神 祥 票 祿 禁 禍 禎 福 禪 禮 秀 私 秋 科 秒 秘 租 秤 秦 移 稅 程 稍 種 稱 稿 穆 穌 積 穩 究 穹 空 穿 突 窗 窩 窮 窶 立 站 竟 章 童 端 競 竹 笑 笛 符 笨 第 筆 等 筋 答 策 简 算 管 箭 箱 節 範 篇 築 簡 簫 簽 簿 籃 籌 籍 籤 米 粉 粗 粵 精 糊 糕 糟 系 糾 紀 約 紅 納 紐 純 紙 級 紛 素 索 紫 累 細 紹 終 組 結 絕 絡 給 統 絲 經 綜 綠 維 綱 網 緊 緒 線 緣 編 緩 緬 緯 練 縛 縣 縮 縱 總 績 繁 繆 織 繞 繪 繳 繼 續 缸 缺 罕 罪 置 罰 署 罵 罷 羅 羊 美 羞 群 義 羽 翁 習 翔 翰 翹 翻 翼 耀 老 考 者 而 耍 耐 耗 耳 耶 聊 聖 聚 聞 聯 聰 聲 職 聽 肉 肚 股 肥 肩 肯 育 背 胎 胖 胞 胡 胸 能 脆 脫 腓 腔 腦 腰 腳 腿 膽 臉 臘 臣 臥 臨 自 臭 至 致 臺 與 興 舉 舊 舌 舍 舒 舞 舟 航 般 船 艦 良 色 艾 芝 芬 花 芳 若 苦 英 茅 茫 茲 茶 草 荒 荷 荼 莉 莊 莎 莫 菜 菩 華 菲 萄 萊 萬 落 葉 著 葛 葡 蒂 蒙 蒲 蒼 蓋 蓮 蔕 蔡 蔣 蕭 薄 薦 薩 薪 藉 藍 藏 藝 藤 藥 蘆 蘇 蘭 虎 處 虛 號 虧 蛇 蛋 蛙 蜂 蜜 蝶 融 螢 蟲 蟹 蠍 蠻 血 行 術 街 衛 衝 衡 衣 表 袋 被 裁 裂 裕 補 裝 裡 製 複 褲 西 要 覆 見 規 視 親 覺 覽 觀 角 解 觸 言 訂 計 訊 討 訓 託 記 訥 訪 設 許 訴 註 証 評 詞 詢 試 詩 話 該 詳 誇 誌 認 誓 誕 語 誠 誤 說 誰 課 誼 調 談 請 諒 論 諸 諺 諾 謀 謂 講 謝 證 識 譜 警 譯 議 護 譽 讀 變 讓 讚 谷 豆 豈 豐 象 豪 豬 貌 貓 貝 貞 負 財 貢 貨 貪 貫 責 貴 買 費 貼 賀 資 賈 賓 賜 賞 賢 賣 賤 賦 質 賭 賴 賺 購 賽 贈 贊 贏 赤 赫 走 起 超 越 趕 趙 趣 趨 足 跌 跎 跑 距 跟 跡 路 跳 踏 踢 蹟 蹤 躍 身 躲 車 軌 軍 軒 軟 較 載 輔 輕 輛 輝 輩 輪 輯 輸 轉 轟 辛 辦 辨 辭 辯 辰 辱 農 迅 迎 近 返 迦 迪 迫 述 迴 迷 追 退 送 逃 逆 透 逐 途 這 通 逛 逝 速 造 逢 連 週 進 逸 逼 遇 遊 運 遍 過 道 達 違 遙 遜 遠 適 遭 遮 遲 遷 選 遺 避 邀 邁 還 邊 邏 那 邦 邪 邱 郎 部 郭 郵 都 鄂 鄉 鄭 鄰 酉 配 酒 酷 酸 醉 醒 醜 醫 采 釋 里 重 野 量 金 針 釣 鈴 鉢 銀 銅 銖 銘 銳 銷 鋒 鋼 錄 錢 錦 錫 錯 
鍋 鍵 鍾 鎊 鎖 鎮 鏡 鐘 鐵 鑑 長 門 閃 閉 開 閏 閒 間 閣 閱 闆 闊 闍 闐 關 闡 防 阻 阿 陀 附 降 限 院 陣 除 陪 陰 陳 陵 陶 陷 陸 陽 隆 隊 階 隔 際 障 隨 險 隱 隻 雄 雅 集 雉 雖 雙 雜 雞 離 難 雨 雪 雲 零 雷 電 需 震 霍 霧 露 霸 霹 靂 靈 青 靖 靜 非 靠 面 革 靼 鞋 韃 韋 韓 音 韻 響 頁 頂 項 順 須 預 頑 頓 頗 領 頞 頭 頻 顆 題 額 顏 願 類 顧 顯 風 飄 飛 食 飯 飲 飽 飾 餅 養 餐 餘 館 首 香 馬 駐 駕 駛 騎 騙 騷 驅 驗 驚 骨 體 高 髮 鬆 鬥 鬧 鬱 鬼 魁 魂 魅 魔 魚 魯 鮮 鳥 鳳 鳴 鴻 鵝 鷹 鹿 麗 麥 麵 麻 麼 黃 黎 黑 默 點 黨 鼓 鼠 鼻 齊 齋 齒 齡 龍 龜 2,180
Auxiliary 乍 仂 伏 佐 侶 僳 兆 兌 兹 别 券 勳 卑 卞 占 叶 堤 墎 壤 奥 孜 峇 嶼 巽 栗 楔 涅 渾 澎 燦 狄 琳 瑚 甫 碑 礁 芒 苗 茨 蓬 蚩 蜀 裘 謬 酋 隴 乳 划 匕 匙 匣 叉 吻 嘟 噘 妖 巾 帆 廁 廚 弋 弓 懸 戟 扳 捂 摔 暈 框 桶 桿 櫃 煎 燭 牡 皺 盒 眨 眩 筒 簍 糰 紋 紗 纏 纜 羯 聳 肖 艇 虹 蛛 蜘 蝴 蝸 蠟 裙 豚 躬 釘 鈔 鈕 鉛 鎚 鎬 鐺 鑰 鑽 霄 鞠 骰 骷 髏 鯉 鳶 115

In its 'main' category, CLDR lists 2,210 characters for the Simplified Chinese orthography, and 2,180 for Traditional Chinese. Combined, this includes 3,026 unique characters, and an overlap of 1,364 characters.
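The four counts in this paragraph are tied together by inclusion-exclusion: unique = simplified + traditional - overlap. The following is a minimal sketch of how such figures could be derived from two space-separated exemplar lists. The function name and the tiny sample strings are illustrative only; they are short excerpts, not the full CLDR data.

```python
def repertoire_stats(simplified: str, traditional: str) -> dict:
    """Count the unique and shared characters across two exemplar lists."""
    sc = set(simplified.split())
    tc = set(traditional.split())
    return {
        "simplified": len(sc),
        "traditional": len(tc),
        "unique": len(sc | tc),   # characters found in either list
        "overlap": len(sc & tc),  # characters found in both lists
    }


# Short excerpts for illustration, not the full CLDR lists.
stats = repertoire_stats("一 丁 七 万 东 丝", "一 丁 七 丟 東 絲")
print(stats)
```

Running the same function over the complete 'main' lists would give the totals for the whole repertoire.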

Radicals

A radical is an ideograph or, more commonly, a component of an ideograph that is used for indexing dictionaries and word lists, and as the basis for creating new ideographs. The 214 radicals of the KangXi dictionary are universally recognised.

The visual appearance of radicals may vary significantly from the original character on which they are based.

Han character for word/say/speak (top) and water (bottom), and associated radicals used in other characters (highlighted yellow).

The shape of the radical may be influenced by the arrangement with other elements of a character, or by standardised simplifications. In the figure above, the shape of the top right radical (word) is a product of the simplification process in China.

Unicode dedicates two blocks to radicals. The KangXi radicals block contains the base forms of the 214 radicals.

The CJK Radicals Supplement contains variant shapes of these radicals when they are used as parts of other characters or in simplified form. These have not been unified because they often appear independently in dictionary indices.

Characters in those blocks should never be used as ideographs.
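As a rough illustration of how text could be checked for this, the sketch below flags characters from the two radical blocks and replaces Kangxi radicals with the ideographs they decompose to. It assumes the block ranges U+2E80..U+2EFF and U+2F00..U+2FDF, and relies on the compatibility decompositions that the Unicode Character Database defines for the Kangxi Radicals block. Most characters in the CJK Radicals Supplement have no such decomposition, so they are only reported. The function names are hypothetical.

```python
import unicodedata

# Block ranges assumed from the Unicode block definitions.
CJK_RADICALS_SUPPLEMENT = (0x2E80, 0x2EFF)
KANGXI_RADICALS = (0x2F00, 0x2FDF)


def find_radical_characters(text):
    """Return (index, character) pairs for characters from either radical block."""
    blocks = (CJK_RADICALS_SUPPLEMENT, KANGXI_RADICALS)
    return [
        (i, ch)
        for i, ch in enumerate(text)
        if any(lo <= ord(ch) <= hi for lo, hi in blocks)
    ]


def replace_kangxi_radicals(text):
    """Replace Kangxi radical characters with their compatibility equivalents.

    Only characters in the Kangxi Radicals block are normalised, so the rest
    of the text (for example fullwidth punctuation) is left untouched.
    """
    lo, hi = KANGXI_RADICALS
    return "".join(
        unicodedata.normalize("NFKC", ch) if lo <= ord(ch) <= hi else ch
        for ch in text
    )
```

For example, U+2F08 KANGXI RADICAL MAN would be reported by the first function and turned into U+4EBA by the second.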

Symbols

No characters with the general property symbol are included in the CLDR sets.

Text direction

Text can be written horizontally, left to right, or vertically with lines progressing from right to left. Vertically set text is more common in Traditional Chinese than in Simplified Chinese areas.

Older horizontally set texts in Chinese also ran right to left.

If your browser supports vertical text, you can change the direction of the text sample here.

Article 7. 法律之前人人平等,并有权享受法律的平等保护,不受任何歧视。人人有权享受平等保护,以免受违反本宣言的任何歧视行为以及煽动这种歧视的任何行为之害。

It should be noted, however, that horizontal and vertical text is not usually identical. Apart from the question of what gets rotated and what does not, the two writing modes may show different preferences for emphasis marks, brackets, numbers, and so forth.

Glyph shaping & positioning

This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.

You can experiment with examples using the Chinese character app.

Han characters have no contextual variation or placement of glyphs. Nor is text cursive (in the sense of joined-up).

The orthography has no case distinction, and no special transforms are needed to convert between characters.

On the other hand, punctuation and embedded text in other scripts are affected by the direction of lines.

By default, all Han characters and punctuation are inside a character frame that is square and the same size for all characters. The box containing the actual symbol is called the letter face, and there should be some space left between the letter face and the character frame. There may be variations, particularly for punctuation, etc., in the size of the letter face.

Character frame and letter face.

Because of the regularity of the character frame size, it can be used to measure the size of the text area or other parts of a page (horizontally or vertically).

In principle, Han characters are set solid, i.e. with no space between the character frames. However, text alignment and justification can make adjustments to the placement of characters in the direction of the line flow. See justify and letterspace.
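Because frames are square and uniform, the size of a text area follows from simple multiplication. The sketch below shows that arithmetic for text set solid. The function and parameter names are hypothetical, and the fixed gap between lines is an assumption added for the example rather than something specified here.

```python
def text_area_size(chars_per_line, line_count, frame_size, line_gap=0.0):
    """Return (inline size, block size) of a text area built from character frames.

    Assumes text set solid: frames touch in the inline direction, and
    consecutive lines are separated by a fixed line_gap (an assumption).
    """
    if chars_per_line < 1 or line_count < 1:
        raise ValueError("need at least one character per line and one line")
    inline_size = chars_per_line * frame_size
    block_size = line_count * frame_size + (line_count - 1) * line_gap
    return inline_size, block_size


# 40 characters per line, 30 lines, 10pt frames, 5pt between lines.
print(text_area_size(40, 30, 10, 5))
```

For vertical text the same two values apply, with the inline size running down the page and the block size running across it.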

Context-based positioning

Dashes, ellipses, and brackets are rotated 90º to the right when they appear in vertical text. Here is a list of characters to which this applies.c,#table_of_punctuation_marks

⸺␣—␣…␣~␣-␣–␣/␣「␣」␣『␣』␣“␣”␣‘␣’␣(␣)␣《␣》␣〈␣〉␣【␣】␣〖␣〗␣〔␣〕␣[␣]␣{␣}

Baselines & inline alignment

The standard baseline for Han characters is slightly lower than the alphabetic baseline used for Latin characters. Mixed script text needs to align baselines correctly.

Han characters have no ascenders or descenders, but occupy the square space described earlier.

Font styles

tbd

Punctuation & inline features

Several of the following sections include icons after punctuation marks that indicate whether the punctuation is used in horizontal (H) or vertical (V) writing, and whether the character is rotated 90º in vertical text, or translated to a different location in the character frame.

When lines or other text decoration are used, they normally appear below horizontal text, and to the left of vertical text. However, emphasis marks appear to the right of a vertical line.

When two underlined items appear side-by-side, the underline should be broken between the two. If a line of Chinese text contains some text in another language and orthography, the position of any text decoration should follow the Chinese conventions.

Grapheme boundaries

Since there are no combining marks or decompositions, graphemes correspond to individual characters.

Unicode grapheme clusters can be applied to Chinese without problems. There are no special issues related to operations that use grapheme clusters as their basic unit of text.
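The claim that there are no combining marks or decompositions can be spot-checked for a given string. The sketch below is only such a check: it does not implement the grapheme cluster rules of UAX #29, and the function name is hypothetical.

```python
import unicodedata

MARK_CATEGORIES = {"Mn", "Mc", "Me"}  # nonspacing, spacing and enclosing marks


def has_marks_or_decompositions(text):
    """Report whether text contains combining marks or canonically decomposable characters."""
    has_marks = any(unicodedata.category(ch) in MARK_CATEGORIES for ch in text)
    decomposes = unicodedata.normalize("NFD", text) != text
    return has_marks or decomposes


print(has_marks_or_decompositions("人人生而自由"))  # ordinary Han text
print(has_marks_or_decompositions("e\u0301"))       # Latin letter plus combining acute
```

When the function returns False for a string, each code point in it can reasonably be treated as its own unit for operations such as cursor movement or deletion.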

Word boundaries

Chinese rarely uses spaces. In the sample text there are gaps around punctuation, but these are produced by a lack of 'ink' in parts of the square character glyphs.

The following sequence is made up of only three characters, and none of them is a space.

别。并

The gap between these characters is only an absence of ink. There is no space character.
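The same check can be made programmatically. This is a small sketch using Python's standard unicodedata module; the helper names are hypothetical.

```python
import unicodedata


def describe_characters(text):
    """List the code point, Unicode name and general category of each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


def contains_space(text):
    """True if any character has the general category Zs (space separator)."""
    return any(unicodedata.category(ch) == "Zs" for ch in text)


sample = "\u522B\u3002\u5E76"  # the three-character example above
for entry in describe_characters(sample):
    print(entry)
print(contains_space(sample))
```

The listing shows two ideographs around U+3002 IDEOGRAPHIC FULL STOP, with no space separator among them.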

Phrase & section boundaries

Chinese uses the following separators at the sentence level and below.c,#h-pause-or-stop-punctuation-marks

            | Character                            | SC: H        | SC: V        | TC: H        | TC: V
phrase      | [U+FF0C FULLWIDTH COMMA]             | Bottom left  | Bottom right | Bottom right | Bottom right
            | [U+3001 IDEOGRAPHIC COMMA] *         | Bottom left  | Bottom right | Bottom right | Bottom right
            | [U+FF1A FULLWIDTH COLON]             | Left         | Right        | Centre       | Centre
            | [U+FF1B FULLWIDTH SEMICOLON]         | Bottom right | Right        | Centre       | Centre
sentence    | [U+3002 IDEOGRAPHIC FULL STOP]       | Bottom left  | Bottom right | Bottom right | Bottom right
            | [U+FF0E FULLWIDTH FULL STOP] *       | Bottom left  | Bottom right | Bottom right | Bottom right
exclamation | [U+FF01 FULLWIDTH EXCLAMATION MARK]  | Bottom right | Right        | Centre       | Centre
question    | [U+FF1F FULLWIDTH QUESTION MARK]     | Bottom right | Right        | Centre       | Centre

[U+3001 IDEOGRAPHIC COMMA] is typically used as a list separator.

[U+FF0E FULLWIDTH FULL STOP] is used as sentence-final punctuation in, for example, college textbooks, science and technology literature, and grammar books of Western languages, most of which are set horizontally and make heavy use of Western-language text.

As the table shows, these punctuation marks are not rotated; however, in Simplified Chinese their position relative to the character frame differs between horizontal and vertical text. In Traditional Chinese they are all centred.

These different positions in Simplified Chinese require dedicated glyphs in the font, and cannot be achieved by simply rotating the glyph.

Chinese also uses the following doubled exclamation/question marks. They remain upright in vertical text.

[U+203C DOUBLE EXCLAMATION MARK]
[U+2047 DOUBLE QUESTION MARK]
[U+2048 QUESTION EXCLAMATION MARK] 
[U+2049 EXCLAMATION QUESTION MARK] 

Other punctuation used to separate phrases or items includes:

  H V
[U+2E3A TWO-EM DASH]  Bottom right Bottom right
—— [U+2014 EM DASH + U+2014 EM DASH] Bottom right Bottom right

If EM DASH characters are used, they are used in pairs.

Parentheses & brackets

For general parentheses in text, Chinese uses:

Characters | H | V
[U+FF08 FULLWIDTH LEFT PARENTHESIS] [U+FF09 FULLWIDTH RIGHT PARENTHESIS] | Right aligned, Left aligned | Bottom aligned, Top aligned

Dashes can also be used to offset information, in which case Chinese typically uses those listed in the previous section, doubled up.

Although there are a number of other bracket characters (listed just below), they are rarely used in Chinese publications.c,#id81

【␣】␣〖␣〗␣〔␣〕␣[␣]␣{␣}

Brackets are also used to indicate titles and proper names (see otherinline).

Quotations

Mainland China. Mainland China, where vertical text is not common, uses different quote marks for horizontal and vertical writing. The default quote marks are:

Characters | H | V
[U+201C LEFT DOUBLE QUOTATION MARK] [U+201D RIGHT DOUBLE QUOTATION MARK] | Top-right aligned, Top-left aligned | -
[U+300E LEFT WHITE CORNER BRACKET] [U+300F RIGHT WHITE CORNER BRACKET] | - | Bottom-right aligned, Top-left aligned

When an additional quote is embedded within the first, the quote marks are:

Characters | H | V
[U+2018 LEFT SINGLE QUOTATION MARK] [U+2019 RIGHT SINGLE QUOTATION MARK] | Top-right aligned, Bottom-left aligned | -
[U+300C LEFT CORNER BRACKET] [U+300D RIGHT CORNER BRACKET] | - | Bottom-right aligned, Top-left aligned

Taiwan. Taiwan tends to use a single set of quote marks for both writing directions, with the single and double corner brackets nested the other way around compared to vertical text in Mainland China. The default quote marks are:

Characters | H | V
[U+300C LEFT CORNER BRACKET] [U+300D RIGHT CORNER BRACKET] | Top-right aligned, Bottom-left aligned | Bottom-right aligned, Top-left aligned

When an additional quote is embedded within the first, the quote marks are:

Characters | H | V
[U+300E LEFT WHITE CORNER BRACKET] [U+300F RIGHT WHITE CORNER BRACKET] | Top-right aligned, Bottom-left aligned | Bottom-right aligned, Top-left aligned

Occasionally, Traditional Chinese text may use double brackets for the default, and single for the embedded. It may also use quotation marks, like Mainland China, but not commonly, and much less so for vertical text.
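The default conventions described in this section can be summarised as a lookup table. The sketch below encodes only those defaults, not the occasional variations just mentioned. The region and mode keys ("mainland", "taiwan", "horizontal", "vertical") and the function name are hypothetical labels chosen for the example.

```python
# (region, writing mode) -> (default pair, embedded pair)
QUOTE_MARKS = {
    ("mainland", "horizontal"): (("\u201C", "\u201D"), ("\u2018", "\u2019")),
    ("mainland", "vertical"): (("\u300E", "\u300F"), ("\u300C", "\u300D")),
    ("taiwan", "horizontal"): (("\u300C", "\u300D"), ("\u300E", "\u300F")),
    ("taiwan", "vertical"): (("\u300C", "\u300D"), ("\u300E", "\u300F")),
}


def quote(text, region, mode, embedded=False):
    """Wrap text in the default or embedded quote marks for a region and writing mode."""
    default_pair, embedded_pair = QUOTE_MARKS[(region, mode)]
    opening, closing = embedded_pair if embedded else default_pair
    return f"{opening}{text}{closing}"


inner = quote("你好", "mainland", "horizontal", embedded=True)
print(quote("他说" + inner, "mainland", "horizontal"))
print(quote("你好", "taiwan", "vertical"))
```

Keeping the pairs in data rather than in code makes it easier to add the less common conventions as extra entries.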

Emphasis

Straight or wavy lines alongside the text are typically used in Chinese to indicate proper nouns such as a person's name, a book title, or the name of a place, rather than for emphasis like the underlining associated with Latin-script text.c,#id82

Chinese uses dots or circles alongside characters to express emphasis, one dot per base character.

缔造真正全球通行的万维网 (emphasis marks on 全球)
Text emphasis in horizontal (left) and vertical (right) text.

In horizontal text, emphasis marks appear underneath the base text. In vertical text they usually appear to the right of the line, i.e. on the opposite side from the lines used for book titles, proper nouns, etc. (to avoid interference).

In the same way as for other line decorations, embedded text in other languages would have dots displayed on the same side as for Chinese.

Abbreviation, ellipsis & repetition

Ellipsis

An ellipsis in Chinese consists of six dots and takes up the space of two Hanzi characters. This is normally achieved using two [U+2026 HORIZONTAL ELLIPSIS] characters, side-by-side.c,#id83

  H V
…… [U+2026 HORIZONTAL ELLIPSIS] x2 Bottom right Left

Inline notes & annotations

Inter-line annotations are used to indicate pronunciation (usually only for children or foreigners), and to provide commentaries on or bilingual equivalents of the main text.

With the exception of zhuyin in horizontal text, all annotations appear within the standard inter-line space for the page, and don't create extra space if they appear on a single line. That said, the inter-line space is usually set at an appropriate size to accommodate annotations.

Unlike Japanese, it is rare to find annotations applied just to specific words; generally the whole text is annotated. If annotations are only needed for individual characters or words, they are often presented in parentheses following the annotated text.

These annotations do not appear alongside punctuation.

Indicating pronunciation with Latin characters

Pinyin is the most common way of representing pronunciation, although occasionally other transcriptions are used.

Examples of pinyin, word-based phonetic annotations (source).

The annotation usually appears above the main line of text, except when zhuyin and pinyin annotations are both present, in which case it commonly appears below the line.

Examples of pinyin and zhuyin phonetic annotations applied to the same base characters (source).

Latin annotations for pronunciation are usually only used with horizontal text.

For native children the annotations are usually applied character by character, whereas for foreign learners they are often applied word by word. The annotation is normally centred above the base text, and contains no spaces.

In order to avoid collisions or wrongly implied word boundaries, there should always be a 1/4em space between adjacent long annotations (usually up to 5 characters per syllable for pinyin). Letter-spacing is typically applied evenly across all the base text to allow room for annotations.

There is a preference for annotations to use a sans-serif font, and for the base text to use Kai.

Indicating pronunciation using Zhuyin Fuhao

The 國語注音符號 (guóyǔ zhùyīn fúhào) approach uses a set of characters referred to as bopomofo (after the initial letters in the alphabet), and is mostly used in Taiwan.

The bopomofo annotations usually appear in a vertical column to the right of each base character, in both horizontal and vertical text.

Examples of zhuyin phonetic annotations (source).

Each syllable is described by up to 3 bopomofo characters, plus a tone mark. The neutral tone mark appears above the stack, but the others appear to the right of the bopomofo column. The height of the tone mark depends on the number of bopomofo characters to its left. For details, see CLREQ.
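To make the structure of a syllable concrete, the sketch below splits a zhuyin string into its bopomofo letters and tone mark, and reports where the tone mark would be placed according to the rule above. It assumes that the letters come from U+3105..U+312F, that the tone marks are U+02CA, U+02C7 and U+02CB plus U+02D9 for the neutral tone, and that an unmarked syllable carries the first tone. The function name is hypothetical.

```python
BOPOMOFO_FIRST, BOPOMOFO_LAST = "\u3105", "\u312F"
NEUTRAL_TONE = "\u02D9"                    # assumed neutral tone mark
OTHER_TONES = {"\u02CA", "\u02C7", "\u02CB"}  # assumed marks for the other tones


def split_zhuyin(syllable):
    """Return (letters, tone mark or None, tone placement or None) for one syllable."""
    letters = [ch for ch in syllable if BOPOMOFO_FIRST <= ch <= BOPOMOFO_LAST]
    tones = [ch for ch in syllable if ch == NEUTRAL_TONE or ch in OTHER_TONES]
    if len(letters) + len(tones) != len(syllable):
        raise ValueError("unexpected character in zhuyin syllable")
    if not 1 <= len(letters) <= 3 or len(tones) > 1:
        raise ValueError("expected 1 to 3 bopomofo letters and at most one tone mark")
    if not tones:
        return letters, None, None
    tone = tones[0]
    placement = "above" if tone == NEUTRAL_TONE else "right"
    return letters, tone, placement


print(split_zhuyin("\u3113\u3128\u02CB"))  # two letters with a tone mark to the right
```

A layout engine would still need the number of letters to position the tone mark vertically, which is why the letters are returned as a list.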

Annotations representing meaning or commentaries

These annotations are common in light novels and translated works, and tend to describe phrases or words. They may contain casing, punctuation, and spaces, and may contain Chinese text explaining Latin base text, or vice versa.

They usually appear below a horizontal line of text, and to the left of a vertical line.

Examples of bilingual annotations (source).

Unlike phonetic annotations, these annotations are only attached to specific words or phrases.

Other inline ranges

Titles

Titles of works including books, articles, songs, movies, files, calligraphy and paintings are identified in Chinese in one of two ways:c,#id87

  1. Using angle brackets around the title.
  2. Underlining the title with a wavy line (this is rarely used in modern publications, but can still be seen in some textbooks and ancient publications).

Characters | H | V
[U+300A LEFT DOUBLE ANGLE BRACKET] [U+300B RIGHT DOUBLE ANGLE BRACKET] | Right aligned, Left aligned | Bottom aligned, Top aligned
[U+3008 LEFT ANGLE BRACKET] [U+3009 RIGHT ANGLE BRACKET] | Right aligned, Left aligned | Bottom aligned, Top aligned

The double brackets tend to be used for book and chapter titles, and the single brackets for articles.c,#glyphs_sizes_and_positions_in_character_faces_of_punctuation_marks

Proper names

Proper names can be highlighted using line decoration, but this time a straight underline. Note that the underline is not used for emphasis in this case. This is mostly used in textbooks and older publications.c,#glyphs_sizes_and_positions_in_character_faces_of_punctuation_marks

Other punctuation

Chinese uses a large number of punctuation marks, and that number is increased by the duplication of normal vs. fullwidth variants. The fullwidth punctuation often includes significant amounts of white space, so that character frames of the punctuation characters are the same size as Han characters.

CLDR lists 136 punctuation characters for the union of Simplified and Traditional Chinese, grouped here by Unicode block.

CJK Symbols & Punctuation:

、␣。␣〃␣〈␣〉␣《␣》␣「␣」␣『␣』␣【␣】␣〔␣〕␣〖␣〗␣〝␣〞

(Halfwidth &) Fullwidth Forms:

!␣"␣#␣%␣&␣'␣(␣)␣*␣,␣-␣.␣/␣:␣;␣?␣@␣[␣\␣]␣_␣{␣}

Basic & General punctuation:

!␣"␣#␣%␣&␣(␣)␣*␣,␣-␣.␣/␣:␣;␣?␣@␣[␣\␣]␣_␣{␣}␣‐␣‑␣–␣—␣―␣‖␣‘␣’␣“␣”␣†␣‡␣‥␣…␣‧␣‰␣′␣″␣‵␣※␣‾␣§␣·

CLDR also includes some compatibility characters, included for handling legacy implementations. These include vertical text forms, which should normally be automatically enabled by the font in a vertical context. The other forms should also be avoided in favour of normal characters, with the variant shapes provided by fonts or styling.u,284-5

CJK Compatibility Forms:

︰␣︱␣︲␣︳␣︴␣︵␣︶␣︷␣︸␣︹␣︺␣︻␣︼␣︽␣︾␣︿␣﹀␣﹁␣﹂␣﹃␣﹄␣﹉␣﹊␣﹋␣﹌␣﹍␣﹎␣﹏

Small Form Variants:

﹐␣﹑␣﹒␣﹔␣﹕␣﹖␣﹗␣﹘␣﹙␣﹚␣﹛␣﹜␣﹝␣﹞␣﹟␣﹠␣﹡␣﹣␣﹨␣﹪␣﹫

Dashes. The long dashes mentioned in bracketing can also be used to show a continuation of tone or sound, an abrupt change in thought, or the addition of new content to the context.c,#id82

Connectors. Connector marks are used "to indicate the beginning and end of time or space, to indicate quantity, to express the name of a chemical compound, to label a table or illustration, to connect a house number in an address, for a phone number, to separate digits which indicate the year, month and date, or to connect compound nouns and for the romanization, as well as the foreign text in the content".c,#id85

Chinese uses the following punctuation for this.c,#id85

Character                   SC: H          SC: V          TC: H          TC: V
[U+FF5E FULLWIDTH TILDE]    Bottom right   Bottom right   Bottom right   Bottom right
[U+2013 EN DASH]            Bottom right   Bottom right   Bottom right   Bottom right
[U+2014 EM DASH]            Bottom right   Bottom right   -              -

Separators. Interpuncts are used to separate the first name and family name in foreign or minority names rendered using Chinese characters, and with book title marks to separate chapters, articles and volumes in publications.c,#id86

Character                   H         V
· [U+00B7 MIDDLE DOT]       Centred   Centred

Middle dots sometimes take up only a halfwidth space in Simplified Chinese when used with dates, e.g. 2·11.c,#id86

The following characters are not recommended for this purpose: [U+FF0E FULLWIDTH FULL STOP], [U+2027 HYPHENATION POINT], [U+2022 BULLET], and [U+30FB KATAKANA MIDDLE DOT].c,#id86
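A cleanup pass for the separator could look like the sketch below. It is only an illustration: it assumes the separator always sits between two Han characters, and it tests only the main CJK Unified Ideographs range (U+4E00..U+9FFF), so that bullets in lists or katakana middle dots in Japanese text are left alone. The function name is hypothetical.

```python
import re

HAN = "[\u4e00-\u9fff]"  # assumption: main CJK Unified Ideographs block only
# U+FF0E FULLWIDTH FULL STOP, U+2027 HYPHENATION POINT,
# U+2022 BULLET, U+30FB KATAKANA MIDDLE DOT
DISCOURAGED = "\uff0e\u2027\u2022\u30fb"
_PATTERN = re.compile(f"(?<={HAN})[{DISCOURAGED}](?={HAN})")


def normalize_interpunct(text):
    """Replace a discouraged separator between two Han characters with U+00B7."""
    return _PATTERN.sub("\u00b7", text)
```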

Line & paragraph layout

Line breaking & word wrap

Lines are normally wrapped between characters – word boundaries have no significance for the wrapping. Chinese line breaking must, however, observe a few rules that dictate which characters cannot appear at the end or start of a line.


There is no hyphenation when Chinese characters are wrapped to the next line.

Line start/end rules

The following characters should not begin a line. Instead, they should bring the previous Han character with them.c,#table_of_punctuation_marks

。␣.␣,␣、␣:␣;␣!␣‼␣?␣⁇␣~␣-␣–␣—␣·␣・␣‧␣/␣/␣」␣』␣”␣’␣)␣》␣〉␣】␣〗␣〕␣]␣}

These rules are not always observed for Traditional Chinese in Taiwan, and may also be ignored in newsprint, which deals with narrow columns of text.c,#prohibition_rules_for_line_start_end

Also, where several punctuation marks appear together, for example 。』」, moving all characters from the previous line might create too large a gap for justification to handle elegantly, and so punctuation marks might be allowed to appear at the line start.c,#prohibition_rules_for_line_start_end

The following characters should not appear at the end of a line.

「␣『␣“␣‘␣(␣《␣〈␣【␣〖␣〔␣[␣{
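The line start and line end prohibitions can be sketched as a simple greedy line breaker. This is an illustration under stated assumptions, not a full layout algorithm: widths are counted in characters, both ASCII and fullwidth forms of the listed punctuation are treated as prohibited, and when no permitted break exists on a line the sketch falls back to a hard break. The names `wrap` and `break_allowed` are hypothetical.

```python
# Characters that should not begin a line (ASCII and fullwidth forms).
NO_LINE_START = set(
    "。.,、:;!‼?⁇~-–—·・‧/」』”’)》〉】〗〕]}"
    "\uff0e\uff0c\uff1a\uff1b\uff01\uff1f\uff5e\uff0f\uff09\uff3d\uff5d"
)
# Characters that should not end a line (ASCII and fullwidth forms).
NO_LINE_END = set("「『“‘(《〈【〖〔[{" "\uff08\uff3b\uff5b")


def break_allowed(text, i):
    """True if a line may end just before text[i]."""
    return text[i] not in NO_LINE_START and text[i - 1] not in NO_LINE_END


def wrap(text, width):
    """Break text into lines of at most `width` characters."""
    lines = []
    start = 0
    while len(text) - start > width:
        cut = start + width
        # Move the break left, carrying characters down to the next line,
        # until neither prohibition is violated.
        while cut > start + 1 and not break_allowed(text, cut):
            cut -= 1
        if not break_allowed(text, cut):
            cut = start + width  # no permitted break: hard break instead
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines
```

For example, `wrap("一二三四，五六七", 4)` ends the first line after 三, so that 四 is carried down together with the comma.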

Text alignment & justification

Chinese justifies text using a complex set of rules which adjust the space between characters on a line. Some characters are adjusted before others.

Use the control below to see how your browser justifies the text sample here.

法律之前人人平等,并有权享受法律的平等保护,不受任何歧视。人人有权享受平等保护,以免受违反本宣言的任何歧视行为以及煽动这种歧视的任何行为之害。

Letter spacing

tbd

Counters, lists, etc.

You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.

Chinese text uses a number of different counter styles. Some of the more common include full-width European numbers, which in vertical text stand upright. Unicode has various sets of numbers that can be useful here.

For the dotted-decimal numeric style Unicode provides precomposed characters from 1 to 20.

⒈␣⒉␣⒊␣⒋␣⒌␣⒍␣⒎␣⒏␣⒐␣⒑␣⒒␣⒓␣⒔␣⒕␣⒖␣⒗␣⒘␣⒙␣⒚␣⒛

For the circled-decimal numeric style Unicode provides characters from 0 to 50.

⓪␣①␣②␣③␣④␣⑤␣⑥␣⑦␣⑧␣⑨␣⑩␣⑪␣⑫␣⑬␣⑭␣⑮␣⑯␣⑰␣⑱␣⑲␣⑳␣㉑␣㉒␣㉓␣㉔␣㉕␣㉖␣㉗␣㉘␣㉙␣㉚␣㉛␣㉜␣㉝␣㉞␣㉟␣㊱␣㊲␣㊳␣㊴␣㊵␣㊶␣㊷␣㊸␣㊹␣㊺␣㊻␣㊼␣㊽␣㊾␣㊿
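The precomposed characters are not contiguous in Unicode, so a lookup needs a few range checks. The sketch below assumes the code point ranges U+2488..U+249B for the dotted forms, and U+24EA, U+2460..U+2473, U+3251..U+325F and U+32B1..U+32BF for the circled forms; the function names are hypothetical.

```python
def dotted_decimal(n):
    """Return the precomposed 'number + full stop' character for 1..20."""
    if not 1 <= n <= 20:
        raise ValueError("precomposed dotted numbers only cover 1 to 20")
    return chr(0x2488 + n - 1)


def circled_decimal(n):
    """Return the precomposed circled number for 0..50."""
    if n == 0:
        return "\u24ea"
    if 1 <= n <= 20:
        return chr(0x2460 + n - 1)
    if 21 <= n <= 35:
        return chr(0x3251 + n - 21)
    if 36 <= n <= 50:
        return chr(0x32B1 + n - 36)
    raise ValueError("precomposed circled numbers only cover 0 to 50")
```

Numbers outside these ranges have no precomposed character, so the sketch raises an error rather than guessing a fallback.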

Chinese orthographies also use ideographic characters to create 1 numeric, 2 fixed, 1 cyclic, 1 additive and 2 idiosyncratic styles.

Numeric

The cjk-decimal numeric style is decimal-based and uses these digits.rmcs

〇␣一␣二␣三␣四␣五␣六␣七␣八␣九

Examples:

一␣二␣三␣四␣一一␣二二␣三三␣四四␣一三一␣二四二␣三三三␣四六四
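The examples above follow from a plain digit-by-digit substitution, which can be sketched as follows. The function name is hypothetical, and the sketch handles only non-negative integers.

```python
CJK_DIGITS = "〇一二三四五六七八九"


def cjk_decimal(n):
    """Write a non-negative integer with the cjk-decimal digits."""
    if n < 0:
        raise ValueError("this sketch only covers non-negative integers")
    return "".join(CJK_DIGITS[int(d)] for d in str(n))
```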

Chinese-specific

Several ideographic-based counter styles have an algorithm that is like an additive style, but has some differences. The algorithm to use can be found in the CSS Counter Styles specification, where they are called Longhand East Asian styles.

These styles are all decimal-based, and use the same algorithm but with different characters. The CSS spec only defines the algorithm up to 9,999, because there appears to be some disagreement about how larger numbers are handled.

The simp-chinese-informal longhand style uses the characters shown just below. The separator for lists is 、 [U+3001 IDEOGRAPHIC COMMA] and the numbers can be negative when using the symbol 负 [U+8D1F CJK UNIFIED IDEOGRAPH-8D1F].

零␣一␣二␣三␣四␣五␣六␣七␣八␣九␣十␣百␣千

Examples:

一␣二␣三␣四␣十一␣二十二␣三十三␣四十四␣一百三十一␣二百四十二␣三百三十三␣四百六十四

The trad-chinese-informal style uses exactly the same characters, except that the negative symbol is 負 [U+8CA0 CJK UNIFIED IDEOGRAPH-8CA0].

The simp-chinese-formal longhand style uses the characters shown below. The separator for lists is 、 [U+3001 IDEOGRAPHIC COMMA] and the numbers can be negative when using the symbol 负 [U+8D1F CJK UNIFIED IDEOGRAPH-8D1F].

零␣壹␣贰␣叁␣肆␣伍␣陆␣柒␣捌␣玖␣拾␣佰␣仟

Examples:

壹␣贰␣叁␣肆␣壹拾壹␣贰拾贰␣叁拾叁␣肆拾肆␣壹佰叁拾壹␣贰佰肆拾贰␣叁佰叁拾叁␣肆佰陆拾肆

The trad-chinese-formal longhand style uses 3 different code points where there is a difference in shape (for 2, 3, and 6), shown below. The separator for lists is 、 [U+3001 IDEOGRAPHIC COMMA] and the numbers can be negative when using the symbol 負 [U+8CA0 CJK UNIFIED IDEOGRAPH-8CA0].

零␣壹␣貳␣參␣肆␣伍␣陸␣柒␣捌␣玖␣拾␣佰␣仟

Examples:

壹␣貳␣參␣肆␣壹拾壹␣貳拾貳␣參拾參␣肆拾肆␣壹佰參拾壹␣貳佰肆拾貳␣參佰參拾參␣肆佰陸拾肆
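The four longhand styles can share one routine that differs only in its character tables. The sketch below is one reading of the longhand algorithm for values from 0 to 9,999, not a normative implementation: each non-zero digit is followed by its marker, trailing zeros are dropped, interior zeros collapse to a single zero character, and the informal styles omit the leading digit for 10 to 19. Negative numbers are not handled, and the function and table names are hypothetical.

```python
INFORMAL_DIGITS = "零一二三四五六七八九"
INFORMAL_MARKERS = ["", "十", "百", "千"]
SIMP_FORMAL_DIGITS = "零壹贰叁肆伍陆柒捌玖"
TRAD_FORMAL_DIGITS = "零壹貳參肆伍陸柒捌玖"
FORMAL_MARKERS = ["", "拾", "佰", "仟"]


def chinese_longhand(n, digits, markers, informal):
    """Format 0..9999 in a Chinese longhand counter style."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0 to 9,999")
    if n == 0:
        return digits[0]
    text = str(n)
    out = ""
    zero_pending = False
    for pos, ch in enumerate(text):
        d = int(ch)
        power = len(text) - 1 - pos
        if d == 0:
            # Remember the zero; it is written only if a non-zero digit follows.
            zero_pending = True
            continue
        if zero_pending:
            out += digits[0]
            zero_pending = False
        # Informal styles write 10..19 without the leading "one".
        if not (informal and 10 <= n <= 19 and power == 1):
            out += digits[d]
        out += markers[power]
    return out
```

With the informal tables this reproduces the examples above, such as 十一 for 11, while the formal tables give 壹拾壹.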

Fixed

The cjk-earthly-branch fixed style uses the letters shown just below. It is only able to count to 12.

子␣丑␣寅␣卯␣辰␣巳␣午␣未␣申␣酉␣戌␣亥

The cjk-heavenly-stem fixed style uses the letters shown below. It is only able to count to 10.

甲␣乙␣丙␣丁␣戊␣己␣庚␣辛␣壬␣癸

The circled-ideograph fixed style uses the letters shown below. It is only able to count to 10.

㊀␣㊁␣㊂␣㊃␣㊄␣㊅␣㊆␣㊇␣㊈␣㊉

The parenthesised-ideograph fixed style uses the letters shown below. It is also only able to count to 10.

㈠␣㈡␣㈢␣㈣␣㈤␣㈥␣㈦␣㈧␣㈨␣㈩
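A fixed style is simply a lookup that runs out of symbols. The sketch below illustrates this with the four sets above; the function name is hypothetical, and the fallback to plain decimal digits beyond the last symbol is an assumption made for the example rather than something specified here.

```python
FIXED_STYLES = {
    "cjk-earthly-branch": "子丑寅卯辰巳午未申酉戌亥",
    "cjk-heavenly-stem": "甲乙丙丁戊己庚辛壬癸",
    "circled-ideograph": "㊀㊁㊂㊃㊄㊅㊆㊇㊈㊉",
    "parenthesised-ideograph": "㈠㈡㈢㈣㈤㈥㈦㈧㈨㈩",
}


def fixed_counter(n, style, fallback=str):
    """Return the n-th symbol (1-based), or fallback(n) outside the range."""
    symbols = FIXED_STYLES[style]
    if 1 <= n <= len(symbols):
        return symbols[n - 1]
    return fallback(n)
```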

Cyclic

The cjk-stem-branch cyclic style uses the pairs of characters shown just below. Once 60 is reached, the list begins over.

甲子␣乙丑␣丙寅␣丁卯␣戊辰␣己巳␣庚午␣辛未␣壬申␣癸酉␣甲戌␣乙亥␣丙子␣丁丑␣戊寅␣己卯␣庚辰␣辛巳␣壬午␣癸未␣甲申␣乙酉␣丙戌␣丁亥␣戊子␣己丑␣庚寅␣辛卯␣壬辰␣癸巳␣甲午␣乙未␣丙申␣丁酉␣戊戌␣己亥␣庚子␣辛丑␣壬寅␣癸卯␣甲辰␣乙巳␣丙午␣丁未␣戊申␣己酉␣庚戌␣辛亥␣壬子␣癸丑␣甲寅␣乙卯␣丙辰␣丁巳␣戊午␣己未␣庚申␣辛酉␣壬戌␣癸亥
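The 60 pairs above need not be stored: each one combines a heavenly stem, cycling every 10, with an earthly branch, cycling every 12. A minimal sketch, with a hypothetical function name and 1-based counting assumed:

```python
STEMS = "甲乙丙丁戊己庚辛壬癸"
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"


def cjk_stem_branch(n):
    """Return the stem-branch pair for counter value n (n >= 1), cycling every 60."""
    if n < 1:
        raise ValueError("this sketch only covers positive integers")
    i = (n - 1) % 60
    return STEMS[i % 10] + BRANCHES[i % 12]
```

The cycle length is 60 because that is the least common multiple of 10 and 12.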

Additive

𝍶␣𝍵␣𝍴␣𝍳␣𝍲

The cjk-tally-mark additive style uses the letters shown just above. It is based on only 5 basic characters, which were introduced in Unicode 11. The potential range of this style is very large, but counters rapidly grow in size, so smaller numbers are most likely.
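An additive style builds each counter by repeating the largest symbol that still fits. The sketch below assumes the five tally digits occupy U+1D372 (one) to U+1D376 (five); the function name is hypothetical, and values below 1 are rejected because the style has no symbol for zero.

```python
# Tally digits one to five, assumed at U+1D372..U+1D376.
TALLY = {value: chr(0x1D371 + value) for value in range(1, 6)}


def cjk_tally_mark(n):
    """Write n as repeated 'five' marks followed by one mark for the remainder."""
    if n < 1:
        raise ValueError("this sketch only covers positive integers")
    fives, rest = divmod(n, 5)
    return TALLY[5] * fives + (TALLY[rest] if rest else "")
```

A counter of 12 already needs three characters, which shows why small numbers are the most practical use.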

Prefixes and suffixes

The most common suffix is [U+3001 IDEOGRAPHIC COMMA]. The circled or parenthesised fixed styles have no prefix/suffix.

Examples:

一、 二、 三、 四、 五、
Separator for Chinese list counters.
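Attaching the suffix can be sketched as below. The function name and the style names passed to it are hypothetical, and the rule encoded is only the one stated above: a suffix of U+3001 everywhere except for the circled and parenthesised fixed styles.

```python
NO_SUFFIX_STYLES = {"circled-ideograph", "parenthesised-ideograph"}
SUFFIX = "\u3001"  # IDEOGRAPHIC COMMA


def list_marker(counter, style):
    """Append the ideographic comma unless the style takes no suffix."""
    return counter if style in NO_SUFFIX_STYLES else counter + SUFFIX
```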

Styling initials

tbd

Page & book layout

This section is for any features that are specific to this script and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.

Character lists

The Han script characters in Unicode 13.0 are spread across 7 blocks. The total number of these characters is 92,896.

There are also 2 compatibility blocks, containing 1,014 characters in total.

There are also various related blocks, containing 459 characters.

The following links give information about characters used for everyday use of Chinese. The numbers in parentheses are for non-ASCII characters.

Languages using the Han script

According to ScriptSource, the Han script is used for the following languages:

Hani

Hant

Hans

References