Chinese

Han orthography notes

Updated 13 September, 2024

This page brings together basic information about the Han Simplified and Traditional writing systems and their use for the Chinese language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Chinese using Unicode.

Referencing this document

Richard Ishida, Chinese (Han) Orthography Notes, 13-Sep-2024, https://r12a.github.io/scripts/hani/zh

Samples

Simplified Chinese

第一条 人人生而自由,在尊严和权利上一律平等。他们赋有理性和良心,并应以兄弟关系的精神相对待。

第二条 人人有资格享有本宣言所载的一切权利和自由,不分种族、肤色、性别、语言、宗教、政治或其他见解、国籍或社会出身、财产、出生或其他身分等任何区别。并且不得因一人所属的国家或领土的政治的、行政的或者国际的地位之不同而有所区别,无论该领土是独立领土、托管领土、非自治领土或者处于其他任何主权受限制的情况之下。

Traditional Chinese

第一條 人人生而自由,在尊嚴和權利上一律平等。他們賦有理性和良心,並應以兄弟關係的精神相對待。

第二條 人人有資格享受本宣言所載的一切權利和自由,不分種族、膚色、性別、語言、宗教、政治或其他見解、國籍或社會出身、財產、出生或其他身分等任何區別。

Usage & history

Two styles of Han characters are used to write Chinese. The traditional orthography was used from the 5th century until 1949 in Mainland China. The simplified orthography was introduced in 1949 and is used in Mainland China and Singapore. Traditional Han is still used in Taiwan, Hong Kong and Macau, and for aesthetic purposes elsewhere in East Asia.

People speaking different Chinese dialects nevertheless write largely the same way, due to the way that the Han characters represent concepts rather than sounds.

Han characters are also widely used in Japan to represent the main part of Japanese words, and are sometimes used in Korea (though modern Korean text will contain very few, if any, Han characters).

汉字 hànzì Simplified Chinese 漢字 hànzì Traditional Chinese

Chinese writing dates from the second half of the second millennium BC. There is no evidence for a predecessor. The earliest inscriptions were on bones and shells used in divination during the Shang dynasty (1600-1046 BC), and employed a set of logographic symbols now known as the Oracle Bone Script. Although these symbols have been extinct since the end of the Bronze Age, the modern Han characters are their direct descendants.

More information: Scriptsource, Wikipedia.

Basic features

The Han script is an ideographic script. Characters typically represent a spoken syllable with its tone. See the table to the right for a brief overview of features for the modern Mandarin Chinese orthography, using the Simplified Chinese orthography. The character count reflects a typical set of characters needed for everyday reading and writing: there are many thousands more Han characters that could be added for other purposes (see chars).

The Simplified Chinese orthography has a smaller repertoire and simpler shapes than the Traditional version.

The Chinese script is used as a common writing system by people who may speak a wide variety of Chinese languages, and who may pronounce the written text very differently. This is possible because the characters represent concepts rather than sounds.

Text can be written in one of 2 directions: horizontally, left to right, or vertically with lines progressing from right to left. Vertically set text is more common in Traditional Chinese than Simplified Chinese areas. It was possible until recently to find Chinese text written horizontally, right to left, but this doesn't normally occur in contemporary texts.

Words are not separated by spaces or any other character. There is no case distinction. The visual forms of characters don't interact.

In its 'main' category, CLDR lists 2,210 characters for the Simplified Chinese orthography, and 2,180 for Traditional Chinese. Combined, this includes 3,026 unique characters, and an overlap of 1,364 characters.

The language is tonal, but the tones are not written explicitly.

Chinese has no combining marks, but has many punctuation marks. It also has a relatively complex set of typographic rules.

Character index

Ideographic characters

See cldr_character_lists.

For other associated blocks, see chars.

Counter styles

一␣丁␣七␣三␣丑␣丙␣乙␣九␣二␣五␣亥␣仟␣伍␣佰␣八␣六␣十␣千␣午␣卯␣叁␣參␣四␣壬␣壹␣子␣寅␣己␣巳␣庚␣戊␣戌␣拾␣捌␣未␣柒␣玖␣甲␣申␣癸␣百␣肆␣貳␣贰␣辛␣辰␣酉␣陆␣陸␣零␣𝍲␣𝍳␣𝍴␣𝍵␣𝍶

Numbers

〇␣㈠␣㈡␣㈢␣㈣␣㈤␣㈥␣㈦␣㈧␣㈨␣㈩␣㊀␣㊁␣㊂␣㊃␣㊄␣㊅␣㊆␣㊇␣㊈␣㊉

Punctuation

§␣·␣‐␣‑␣–␣—␣―␣‖␣‘␣’␣“␣”␣†␣‡␣•␣‥␣…␣‧␣‰␣′␣″␣‵␣※␣‼␣‾␣⁇␣⁈␣⁉␣⸺␣、␣。␣〃␣〈␣〉␣《␣》␣「␣」␣『␣』␣【␣】␣〔␣〕␣〖␣〗␣〝␣〞␣・␣!␣"␣#␣%␣&␣'␣(␣)␣*␣,␣-␣.␣/␣:␣;␣?␣@␣[␣\␣]␣_␣{␣}

ASCII

!␣#␣%␣&␣(␣)␣*␣,␣-␣.␣/␣:␣;␣?␣@␣[␣\␣]␣_␣{␣}

CLDR additions

︰␣︱␣︲␣︳␣︴␣︵␣︶␣︷␣︸␣︹␣︺␣︻␣︼␣︽␣︾␣︿␣﹀␣﹁␣﹂␣﹃␣﹄␣﹉␣﹊␣﹋␣﹌␣﹍␣﹎␣﹏␣﹐␣﹑␣﹒␣﹔␣﹕␣﹖␣﹗␣﹘␣﹙␣﹚␣﹛␣﹜␣﹝␣﹞␣﹟␣﹠␣﹡␣﹣␣﹨␣﹪␣﹫

Symbols

Phonology

This section lists sounds for Mandarin Chinese, as spoken in the Beijing area.

Phones in a lighter colour are non-native or allophones. [wp]

Vowel sounds

Plain vowels

i y y u ɤ ɤ o ə ə a

Diphthongs

ie ia iu ye ye ua uo ei ou ai au

Consonant sounds

             labial  alveolar  post-alveolar  retroflex  palatal  velar
stops        p       t                                            k
aspirated
affricates           t͡s        t͡ɕ             t͡ʂ
aspirated            t͡sʰ       t͡ɕʰ            t͡ʂʰ
fricative    f       s         ɕ              ʂ                   x
nasal        m       n                                            ŋ
approximant          l                        ɻ

Tone

tbd

Structure

tbd

Characters

'Ideographic' characters

Chinese text is primarily constructed from characters that each correspond to a spoken syllable, including its tone. Some have pictographic origins that are still evident, whereas others have a more complicated structure.

It is said that Chinese people typically use around 3,000 to 4,000 characters for most communication, but a reasonable word processor would need to support at least 10,000. Unicode supports over 70,000 Han characters, most of which cover advanced or esoteric usage.

CLDR character lists

The lists below show the characters given in version 36 of CLDR's exemplar character lists (exemplarCharacters).

Show characters used for Simplified Chinese.
Main 一 丁 七 万 丈 三 上 下 丌 不 与 丑 专 且 世 丘 丙 业 东 丝 丢 两 严 丧 个 中 丰 串 临 丸 丹 为 主 丽 举 乃 久 么 义 之 乌 乍 乎 乏 乐 乔 乖 乘 乙 九 也 习 乡 书 买 乱 乾 了 予 争 事 二 于 亏 云 互 五 井 亚 些 亡 交 亥 亦 产 亨 享 京 亮 亲 人 亿 什 仁 仅 仇 今 介 仍 从 仔 他 付 仙 代 令 以 仪 们 仰 仲 件 价 任 份 仿 企 伊 伍 伏 伐 休 众 优 伙 会 伟 传 伤 伦 伯 估 伴 伸 似 伽 但 位 低 住 佐 佑 体 何 余 佛 作 你 佤 佩 佳 使 例 供 依 侠 侦 侧 侨 侬 侯 侵 便 促 俄 俊 俗 保 信 俩 修 俱 俾 倍 倒 候 倚 借 倦 值 倾 假 偌 偏 做 停 健 偶 偷 储 催 傲 傻 像 僧 儒 儿 允 元 兄 充 兆 先 光 克 免 兑 兔 党 入 全 八 公 六 兮 兰 共 关 兴 兵 其 具 典 兹 养 兼 兽 内 冈 册 再 冒 写 军 农 冠 冬 冰 冲 决 况 冷 准 凌 减 凝 几 凡 凤 凭 凯 凰 出 击 函 刀 分 切 刊 刑 划 列 刘 则 刚 创 初 判 利 别 到 制 刷 券 刺 刻 剂 前 剑 剧 剩 剪 副 割 力 劝 办 功 加 务 劣 动 助 努 劫 励 劲 劳 势 勇 勉 勋 勒 勤 勾 勿 包 匆 匈 化 北 匙 匹 区 医 十 千 升 午 半 华 协 卒 卓 单 卖 南 博 占 卡 卢 卫 卯 印 危 即 却 卷 厂 厄 厅 历 厉 压 厌 厍 厚 原 去 县 参 又 叉 及 友 双 反 发 叔 取 受 变 叙 口 古 句 另 只 叫 召 叭 可 台 史 右 叶 号 司 叹 吃 各 合 吉 吊 同 名 后 吐 向 吓 吗 君 吝 吟 否 吧 含 听 启 吵 吸 吹 吻 吾 呀 呆 呈 告 呐 员 呜 呢 呦 周 味 呵 呼 命 和 咖 咦 咧 咨 咪 咬 咯 咱 哀 品 哇 哈 哉 响 哎 哟 哥 哦 哩 哪 哭 哲 唉 唐 唤 唬 售 唯 唱 唷 商 啊 啡 啥 啦 啪 喀 喂 善 喇 喊 喏 喔 喜 喝 喵 喷 喻 嗒 嗨 嗯 嘉 嘛 嘴 嘻 嘿 器 四 回 因 团 园 困 围 固 国 图 圆 圈 土 圣 在 圭 地 圳 场 圾 址 均 坎 坐 坑 块 坚 坛 坜 坡 坤 坦 坪 垂 垃 型 垒 埃 埋 城 埔 域 培 基 堂 堆 堕 堡 堪 塑 塔 塞 填 境 增 墨 壁 壤 士 壬 壮 声 处 备 复 夏 夕 外 多 夜 够 夥 大 天 太 夫 央 失 头 夷 夸 夹 夺 奇 奈 奉 奋 奏 契 奔 奖 套 奥 女 奴 奶 她 好 如 妇 妈 妖 妙 妥 妨 妮 妹 妻 姆 姊 始 姐 姑 姓 委 姿 威 娃 娄 娘 娜 娟 娱 婆 婚 媒 嫁 嫌 嫩 子 孔 孕 字 存 孙 孜 孝 孟 季 孤 学 孩 宁 它 宇 守 安 宋 完 宏 宗 官 宙 定 宛 宜 宝 实 审 客 宣 室 宪 害 宴 家 容 宽 宾 宿 寂 寄 寅 密 寇 富 寒 寝 寞 察 寡 寨 寸 对 寻 导 寿 封 射 将 尊 小 少 尔 尖 尘 尚 尝 尤 就 尺 尼 尽 尾 局 屁 层 居 屋 屏 展 属 屠 山 岁 岂 岗 岘 岚 岛 岳 岸 峡 峰 崇 崩 崴 川 州 巡 工 左 巧 巨 巫 差 己 已 巳 巴 巷 币 市 布 帅 师 希 帐 帕 帖 帝 带 席 帮 常 帽 幅 幕 干 平 年 并 幸 幻 幼 幽 广 庆 床 序 库 应 底 店 庙 庚 府 庞 废 度 座 庭 康 庸 廉 廖 延 廷 建 开 异 弃 弄 弊 式 引 弗 弘 弟 张 弥 弦 弯 弱 弹 强 归 当 录 彝 形 彩 彬 彭 彰 影 彷 役 彻 彼 往 征 径 待 很 律 後 徐 徒 得 循 微 徵 德 心 必 忆 忌 忍 志 忘 忙 忠 忧 快 念 忽 怀 态 怎 怒 怕 怖 思 怡 急 性 怨 怪 总 恋 恐 恢 恨 恩 恭 息 恰 恶 恼 悄 悉 悔 悟 悠 患 您 悲 情 惑 惜 惠 惧 惨 惯 想 惹 愁 愈 愉 意 愚 感 愧 慈 慎 慕 慢 慧 慰 憾 懂 懒 戈 戊 戌 戏 成 我 戒 或 战 截 戴 户 房 所 扁 扇 手 才 扎 扑 打 托 扣 执 扩 扫 扬 扭 扮 扯 批 找 承 技 抄 把 抑 抓 投 抗 折 抢 护 报 披 抬 抱 抵 抹 抽 担 拆 拉 拍 拒 拔 拖 拘 招 拜 拟 拥 拦 拨 择 括 拳 拷 拼 拾 拿 持 指 按 挑 挖 挝 挡 挤 挥 挪 振 挺 捉 捐 捕 损 捡 换 据 捷 授 掉 掌 排 探 接 控 推 掩 措 掸 描 提 插 握 援 搜 搞 搬 搭 摄 摆 摊 摔 摘 摩 摸 撒 撞 播 操 擎 擦 支 收 改 攻 
放 政 故 效 敌 敏 救 教 敝 敢 散 敦 敬 数 敲 整 文 斋 斐 斗 料 斜 斥 断 斯 新 方 於 施 旁 旅 旋 族 旗 无 既 日 旦 旧 旨 早 旭 时 旺 昂 昆 昌 明 昏 易 星 映 春 昨 昭 是 显 晃 晋 晒 晓 晚 晨 普 景 晴 晶 智 暂 暑 暖 暗 暮 暴 曰 曲 更 曹 曼 曾 替 最 月 有 朋 服 朗 望 朝 期 木 未 末 本 札 术 朱 朵 机 杀 杂 权 杉 李 材 村 杜 束 条 来 杨 杯 杰 松 板 极 构 析 林 果 枝 枢 枪 枫 架 柏 某 染 柔 查 柬 柯 柳 柴 标 栋 栏 树 校 样 核 根 格 桃 框 案 桌 桑 档 桥 梁 梅 梦 梯 械 梵 检 棉 棋 棒 棚 森 椅 植 椰 楚 楼 概 榜 模 樱 檀 欠 次 欢 欣 欧 欲 欺 款 歉 歌 止 正 此 步 武 歪 死 殊 残 段 毅 母 每 毒 比 毕 毛 毫 氏 民 气 氛 水 永 求 汇 汉 汗 汝 江 池 污 汤 汪 汶 汽 沃 沈 沉 沙 沟 没 沧 河 油 治 沿 泉 泊 法 泛 泡 波 泣 泥 注 泰 泳 泽 洋 洗 洛 洞 津 洪 洲 活 洽 派 流 浅 测 济 浏 浑 浓 浙 浦 浩 浪 浮 浴 海 涅 消 涉 涛 涨 涯 液 涵 淋 淑 淘 淡 深 混 添 清 渐 渡 渣 温 港 渴 游 湖 湾 源 溜 溪 滋 滑 满 滥 滨 滴 漂 漏 演 漠 漫 潘 潜 潮 澎 澳 激 灌 火 灭 灯 灰 灵 灿 炉 炎 炮 炸 点 烂 烈 烤 烦 烧 热 焦 然 煌 煞 照 煮 熊 熟 燃 燕 爆 爪 爬 爱 爵 父 爷 爸 爽 片 版 牌 牙 牛 牡 牢 牧 物 牲 牵 特 牺 犯 状 犹 狂 狐 狗 狠 独 狮 狱 狼 猛 猜 猪 献 猴 玄 率 玉 王 玛 玩 玫 环 现 玲 玻 珀 珊 珍 珠 班 球 理 琊 琪 琳 琴 琼 瑙 瑜 瑞 瑟 瑰 瑶 璃 瓜 瓦 瓶 甘 甚 甜 生 用 田 由 甲 申 电 男 甸 画 畅 界 留 略 番 疆 疏 疑 疗 疯 疲 疼 疾 病 痕 痛 痴 癸 登 白 百 的 皆 皇 皮 盈 益 监 盒 盖 盘 盛 盟 目 直 相 盼 盾 省 眉 看 真 眠 眼 着 睛 睡 督 瞧 矛 矣 知 短 石 矶 码 砂 砍 研 破 础 硕 硬 确 碍 碎 碗 碟 碧 碰 磁 磅 磨 示 礼 社 祖 祚 祝 神 祥 票 祯 祸 禁 禅 福 离 秀 私 秋 种 科 秒 秘 租 秤 秦 秩 积 称 移 稀 程 稍 税 稣 稳 稿 穆 究 穷 穹 空 穿 突 窗 窝 立 站 竞 竟 章 童 端 竹 笑 笔 笛 符 笨 第 等 筋 筑 答 策 筹 签 简 算 管 箭 箱 篇 篮 簿 籍 米 类 粉 粒 粗 粤 粹 精 糊 糕 糖 糟 系 素 索 紧 紫 累 繁 红 约 级 纪 纯 纲 纳 纵 纷 纸 纽 线 练 组 细 织 终 绍 经 结 绕 绘 给 络 绝 统 继 绩 绪 续 维 绵 综 绿 缅 缓 编 缘 缠 缩 缴 缶 缸 缺 罐 网 罕 罗 罚 罢 罪 置 署 羊 美 羞 群 羯 羽 翁 翅 翔 翘 翠 翰 翻 翼 耀 老 考 者 而 耍 耐 耗 耳 耶 聊 职 联 聘 聚 聪 肉 肖 肚 股 肤 肥 肩 肯 育 胁 胆 背 胎 胖 胜 胞 胡 胶 胸 能 脆 脑 脱 脸 腊 腐 腓 腰 腹 腾 腿 臂 臣 自 臭 至 致 舌 舍 舒 舞 舟 航 般 舰 船 良 色 艺 艾 节 芒 芝 芦 芬 芭 花 芳 苍 苏 苗 若 苦 英 茂 范 茨 茫 茶 草 荐 荒 荣 药 荷 莉 莎 莪 莫 莱 莲 获 菜 菩 菲 萄 萍 萤 营 萧 萨 落 著 葛 葡 蒂 蒋 蒙 蓉 蓝 蓬 蔑 蔡 薄 薪 藉 藏 藤 虎 虑 虫 虹 虽 虾 蚁 蛇 蛋 蛙 蛮 蜂 蜜 蝶 融 蟹 蠢 血 行 街 衡 衣 补 表 袋 被 袭 裁 裂 装 裕 裤 西 要 覆 见 观 规 视 览 觉 角 解 言 誉 誓 警 计 订 认 讨 让 训 议 讯 记 讲 讷 许 论 设 访 证 评 识 诉 词 译 试 诗 诚 话 诞 询 该 详 语 误 说 请 诸 诺 读 课 谁 调 谅 谈 谊 谋 谓 谜 谢 谨 谱 谷 豆 象 豪 貌 贝 贞 负 贡 财 责 贤 败 货 质 贩 贪 购 贯 贱 贴 贵 贸 费 贺 贼 贾 资 赋 赌 赏 赐 赔 赖 赚 赛 赞 赠 赢 赤 赫 走 赵 起 趁 超 越 趋 趣 足 跃 跌 跑 距 跟 路 跳 踏 踢 踩 身 躲 车 轨 轩 转 轮 软 轰 轻 载 较 辅 辆 辈 辉 辑 输 辛 辞 辨 辩 辰 辱 边 达 迁 迅 过 迈 迎 运 近 返 还 这 进 远 违 连 迟 迦 迪 迫 述 迷 追 退 送 适 逃 逆 选 逊 透 逐 递 途 通 逛 逝 速 造 逢 逸 
逻 逼 遇 遍 道 遗 遭 遮 遵 避 邀 邓 那 邦 邪 邮 邱 邻 郎 郑 部 郭 都 鄂 酉 酋 配 酒 酷 酸 醉 醒 采 释 里 重 野 量 金 针 钓 钟 钢 钦 钱 钻 铁 铃 铜 铢 铭 银 铺 链 销 锁 锅 锋 错 锡 锦 键 锺 镇 镜 镭 长 门 闪 闭 问 闰 闲 间 闷 闹 闻 阁 阅 阐 阔 队 阮 防 阳 阴 阵 阶 阻 阿 陀 附 际 陆 陈 降 限 院 除 险 陪 陵 陶 陷 隆 随 隐 隔 障 难 雄 雅 集 雉 雨 雪 雯 雳 零 雷 雾 需 震 霍 霖 露 霸 霹 青 靖 静 非 靠 面 革 靼 鞋 鞑 韦 韩 音 页 顶 项 顺 须 顽 顾 顿 预 领 颇 频 颗 题 额 风 飘 飙 飞 食 餐 饭 饮 饰 饱 饼 馆 首 香 馨 马 驱 驶 驻 驾 验 骑 骗 骚 骤 骨 高 鬼 魂 魅 魔 鱼 鲁 鲜 鸟 鸡 鸣 鸭 鸿 鹅 鹤 鹰 鹿 麦 麻 黄 黎 黑 默 鼓 鼠 鼻 齐 齿 龄 龙 龟 2,210
Auxiliary 仂 侣 傈 傣 僳 卑 卞 厘 吕 坝 堤 奎 屿 巽 撤 楔 楠 滕 瑚 甫 盲 碑 禄 粟 脚 艮 谬 钯 铂 锑 镑 魁 乒 乓 仓 伞 冥 凉 刨 匕 厦 厨 呣 唇 啤 啮 喱 嗅 噘 噢 墟 妆 婴 媚 宅 寺 尬 尴 屑 巾 弓 彗 惊 戟 扔 扰 扳 抛 挂 捂 摇 撅 杆 杖 柜 柱 栗 栽 桶 棍 棕 棺 榈 槟 橙 洒 浆 涌 淇 滚 滩 灾 烛 烟 焰 煎 犬 猫 瓢 皱 盆 盔 眨 眯 瞌 矿 祈 祭 祷 稻 竿 笼 筒 篷 粮 纠 纬 缆 缎 耸 舔 舵 艇 芽 苜 苞 菇 菱 葫 葵 蒸 蓿 蔽 薯 蘑 蚂 蛛 蜗 蜘 蜡 蝎 蝴 螃 裹 谍 豚 账 跤 踪 躬 轴 辐 迹 郁 鄙 酢 钉 钥 钮 铅 铛 锄 锚 锤 闺 阱 隧 雕 霾 靴 靶 鞠 颠 馏 驼 骆 髦 鲤 鲸 鳄 鸽 181
Show characters used for Traditional Chinese.
Main 一 丁 七 丈 三 上 下 丌 不 丑 且 世 丘 丙 丟 並 中 串 丸 丹 主 乃 久 么 之 乎 乏 乖 乘 乙 九 也 乾 亂 了 予 事 二 于 云 互 五 井 些 亞 亡 交 亥 亦 亨 享 京 亮 人 什 仁 仇 今 介 仍 仔 他 付 仙 代 令 以 仰 仲 件 任 份 企 伊 伍 伐 休 伙 伯 估 伴 伸 似 伽 但 佈 佉 位 低 住 佔 何 余 佛 作 你 佩 佳 使 來 例 供 依 侯 侵 便 係 促 俄 俊 俗 保 俠 信 修 俱 俾 個 倍 們 倒 候 倚 借 倫 值 假 偉 偏 做 停 健 側 偵 偶 偷 傑 備 傢 傣 傲 傳 傷 傻 傾 僅 像 僑 僧 價 儀 億 儒 儘 優 允 元 兄 充 兇 先 光 克 免 兒 兔 入 內 全 兩 八 公 六 兮 共 兵 其 具 典 兼 冊 再 冒 冠 冬 冰 冷 准 凌 凝 凡 凰 凱 出 函 刀 分 切 刊 列 初 判 別 利 刪 到 制 刷 刺 刻 則 剌 前 剛 剩 剪 副 割 創 劃 劇 劉 劍 力 功 加 助 努 劫 勁 勇 勉 勒 動 務 勝 勞 勢 勤 勵 勸 勿 包 匈 化 北 匹 區 十 千 升 午 半 卒 卓 協 南 博 卜 卡 卯 印 危 即 卷 卻 厄 厘 厚 原 厭 厲 去 參 又 及 友 反 叔 取 受 口 古 句 另 只 叫 召 叭 可 台 史 右 司 吃 各 合 吉 吊 同 名 后 吐 向 吒 君 吝 吞 吟 吠 否 吧 含 吳 吵 吸 吹 吾 呀 呂 呆 告 呢 周 味 呵 呼 命 和 咖 咦 咧 咪 咬 咱 哀 品 哇 哈 哉 哎 員 哥 哦 哩 哪 哭 哲 唉 唐 唔 唬 售 唯 唱 唷 唸 商 啊 問 啟 啡 啥 啦 啪 喀 喂 善 喇 喊 喔 喜 喝 喬 單 喵 嗎 嗚 嗨 嗯 嘆 嘉 嘗 嘛 嘴 嘻 嘿 器 噴 嚇 嚴 囉 四 回 因 困 固 圈 國 圍 園 圓 圖 團 圜 土 在 圭 地 圾 址 均 坎 坐 坡 坤 坦 坪 垂 垃 型 埃 城 埔 域 執 培 基 堂 堅 堆 堡 堪 報 場 塊 塔 塗 塞 填 塵 境 增 墨 墮 壁 壇 壓 壘 壞 壢 士 壬 壯 壽 夏 夕 外 多 夜 夠 夢 夥 大 天 太 夫 央 失 夷 夸 夾 奇 奈 奉 奎 奏 契 奔 套 奧 奪 奮 女 奴 奶 她 好 如 妙 妝 妥 妨 妮 妳 妹 妻 姆 姊 始 姐 姑 姓 委 姿 威 娃 娘 娛 婁 婆 婚 婦 媒 媽 嫌 嫩 子 孔 字 存 孝 孟 季 孤 孩 孫 學 它 宅 宇 守 安 宋 完 宏 宗 官 宙 定 宛 宜 客 宣 室 宮 害 家 容 宿 寂 寄 寅 密 富 寒 寞 察 寢 實 寧 寨 審 寫 寬 寮 寵 寶 封 射 將 專 尊 尋 對 導 小 少 尖 尚 尤 就 尺 尼 尾 局 屁 居 屆 屋 屏 展 屠 層 屬 山 岡 岩 岸 峰 島 峽 崇 崙 崴 嵐 嶺 川 州 巡 工 左 巧 巨 巫 差 己 已 巳 巴 巷 市 布 希 帕 帖 帛 帝 帥 師 席 帳 帶 常 帽 幅 幕 幣 幫 干 平 年 幸 幹 幻 幼 幽 幾 庇 床 序 底 店 庚 府 度 座 庫 庭 康 庸 廉 廖 廠 廢 廣 廳 延 廷 建 弄 式 引 弗 弘 弟 弦 弱 張 強 彈 彊 彌 彎 彝 彞 形 彥 彩 彬 彭 彰 影 役 彼 往 征 待 很 律 後 徐 徑 徒 得 從 復 微 徵 德 徹 心 必 忌 忍 志 忘 忙 忠 快 念 忽 怎 怒 怕 怖 思 怡 急 性 怨 怪 恆 恐 恢 恥 恨 恩 恭 息 恰 悅 悉 悔 悟 悠 您 悲 悶 情 惑 惜 惠 惡 惱 想 惹 愁 愈 愉 意 愚 愛 感 慈 態 慕 慘 慢 慣 慧 慮 慰 慶 慾 憂 憐 憑 憲 憶 憾 懂 應 懶 懷 懼 戀 戈 戊 戌 成 我 戒 或 截 戰 戲 戴 戶 房 所 扁 扇 手 才 扎 打 托 扣 扥 扭 扯 批 找 承 技 抄 把 抓 投 抗 折 披 抬 抱 抵 抹 抽 拆 拉 拋 拍 拏 拒 拔 拖 招 拜 括 拳 拼 拾 拿 持 指 按 挑 挖 挪 振 挺 捐 捕 捨 捲 捷 掃 授 掉 掌 排 掛 採 探 接 控 推 措 描 提 插 揚 換 握 揮 援 損 搖 搜 搞 搬 搭 搶 摘 摩 摸 撐 撒 撞 撣 撥 播 撾 撿 擁 擇 擊 擋 操 擎 擔 據 擠 擦 擬 擴 擺 擾 攝 支 收 改 攻 放 政 故 效 敍 敏 救 敗 敘 教 敝 敢 散 敦 敬 整 敵 數 文 斐 斗 料 斯 新 斷 方 於 施 旁 旅 旋 族 旗 既 日 旦 早 旭 旺 昂 昆 昇 昌 明 昏 易 星 映 春 昨 昭 是 時 晉 晒 晚 晨 普 景 晴 晶 智 暑 暖 暗 暫 暴 曆 曉 曰 曲 更 書 曼 曾 替 最 會 月 有 朋 服 朗 望 
朝 期 木 未 末 本 札 朱 朵 杉 李 材 村 杜 束 杯 杰 東 松 板 析 林 果 枝 架 柏 某 染 柔 查 柬 柯 柳 柴 校 核 根 格 桃 案 桌 桑 梁 梅 條 梨 梯 械 梵 棄 棉 棋 棒 棚 森 椅 植 椰 楊 楓 楚 業 極 概 榜 榮 構 槍 樂 樓 標 樞 模 樣 樹 橋 機 橫 檀 檔 檢 欄 權 次 欣 欲 欺 欽 款 歉 歌 歐 歡 止 正 此 步 武 歲 歷 歸 死 殊 殘 段 殺 殼 毀 毅 母 每 毒 比 毛 毫 氏 民 氣 水 永 求 汗 汝 江 池 污 汪 汶 決 汽 沃 沈 沉 沒 沖 沙 河 油 治 沿 況 泉 泊 法 泡 波 泥 注 泰 泳 洋 洗 洛 洞 洩 洪 洲 活 洽 派 流 浦 浩 浪 浮 海 涇 消 涉 涯 液 涵 涼 淑 淚 淡 淨 深 混 淺 清 減 渡 測 港 游 湖 湯 源 準 溝 溪 溫 滄 滅 滋 滑 滴 滾 滿 漂 漏 演 漠 漢 漫 漲 漸 潔 潘 潛 潮 澤 澳 激 濃 濟 濤 濫 濱 瀏 灌 灣 火 灰 災 炎 炮 炸 為 烈 烏 烤 無 焦 然 煙 煞 照 煩 熊 熟 熱 燃 燈 燒 營 爆 爐 爛 爪 爬 爭 爵 父 爸 爺 爽 爾 牆 片 版 牌 牙 牛 牠 牧 物 牲 特 牽 犧 犯 狀 狂 狐 狗 狠 狼 猛 猜 猴 猶 獄 獅 獎 獨 獲 獸 獻 玄 率 玉 王 玩 玫 玲 玻 珊 珍 珠 珥 班 現 球 理 琉 琪 琴 瑙 瑜 瑞 瑟 瑤 瑪 瑰 環 瓜 瓦 瓶 甘 甚 甜 生 產 用 田 由 甲 申 男 甸 界 留 畢 略 番 畫 異 當 疆 疏 疑 疼 病 痕 痛 痴 瘋 療 癡 癸 登 發 白 百 的 皆 皇 皮 盃 益 盛 盜 盟 盡 監 盤 盧 目 盲 直 相 盼 盾 省 眉 看 真 眠 眼 眾 睛 睡 督 瞧 瞭 矛 矣 知 短 石 砂 砍 研 砲 破 硬 碎 碗 碟 碧 碩 碰 確 碼 磁 磨 磯 礎 礙 示 社 祕 祖 祚 祛 祝 神 祥 票 祿 禁 禍 禎 福 禪 禮 秀 私 秋 科 秒 秘 租 秤 秦 移 稅 程 稍 種 稱 稿 穆 穌 積 穩 究 穹 空 穿 突 窗 窩 窮 窶 立 站 竟 章 童 端 競 竹 笑 笛 符 笨 第 筆 等 筋 答 策 简 算 管 箭 箱 節 範 篇 築 簡 簫 簽 簿 籃 籌 籍 籤 米 粉 粗 粵 精 糊 糕 糟 系 糾 紀 約 紅 納 紐 純 紙 級 紛 素 索 紫 累 細 紹 終 組 結 絕 絡 給 統 絲 經 綜 綠 維 綱 網 緊 緒 線 緣 編 緩 緬 緯 練 縛 縣 縮 縱 總 績 繁 繆 織 繞 繪 繳 繼 續 缸 缺 罕 罪 置 罰 署 罵 罷 羅 羊 美 羞 群 義 羽 翁 習 翔 翰 翹 翻 翼 耀 老 考 者 而 耍 耐 耗 耳 耶 聊 聖 聚 聞 聯 聰 聲 職 聽 肉 肚 股 肥 肩 肯 育 背 胎 胖 胞 胡 胸 能 脆 脫 腓 腔 腦 腰 腳 腿 膽 臉 臘 臣 臥 臨 自 臭 至 致 臺 與 興 舉 舊 舌 舍 舒 舞 舟 航 般 船 艦 良 色 艾 芝 芬 花 芳 若 苦 英 茅 茫 茲 茶 草 荒 荷 荼 莉 莊 莎 莫 菜 菩 華 菲 萄 萊 萬 落 葉 著 葛 葡 蒂 蒙 蒲 蒼 蓋 蓮 蔕 蔡 蔣 蕭 薄 薦 薩 薪 藉 藍 藏 藝 藤 藥 蘆 蘇 蘭 虎 處 虛 號 虧 蛇 蛋 蛙 蜂 蜜 蝶 融 螢 蟲 蟹 蠍 蠻 血 行 術 街 衛 衝 衡 衣 表 袋 被 裁 裂 裕 補 裝 裡 製 複 褲 西 要 覆 見 規 視 親 覺 覽 觀 角 解 觸 言 訂 計 訊 討 訓 託 記 訥 訪 設 許 訴 註 証 評 詞 詢 試 詩 話 該 詳 誇 誌 認 誓 誕 語 誠 誤 說 誰 課 誼 調 談 請 諒 論 諸 諺 諾 謀 謂 講 謝 證 識 譜 警 譯 議 護 譽 讀 變 讓 讚 谷 豆 豈 豐 象 豪 豬 貌 貓 貝 貞 負 財 貢 貨 貪 貫 責 貴 買 費 貼 賀 資 賈 賓 賜 賞 賢 賣 賤 賦 質 賭 賴 賺 購 賽 贈 贊 贏 赤 赫 走 起 超 越 趕 趙 趣 趨 足 跌 跎 跑 距 跟 跡 路 跳 踏 踢 蹟 蹤 躍 身 躲 車 軌 軍 軒 軟 較 載 輔 輕 輛 輝 輩 輪 輯 輸 轉 轟 辛 辦 辨 辭 辯 辰 辱 農 迅 迎 近 返 迦 迪 迫 述 迴 迷 追 退 送 逃 逆 透 逐 途 這 通 逛 逝 速 造 逢 連 週 進 逸 逼 遇 遊 運 遍 過 道 達 違 遙 遜 遠 適 遭 遮 遲 遷 選 遺 避 邀 邁 還 邊 邏 那 邦 邪 邱 郎 部 郭 郵 都 鄂 鄉 鄭 鄰 酉 配 酒 酷 酸 醉 醒 醜 醫 采 釋 里 重 野 量 金 針 釣 鈴 鉢 銀 銅 銖 銘 銳 銷 鋒 鋼 錄 錢 錦 錫 錯 
鍋 鍵 鍾 鎊 鎖 鎮 鏡 鐘 鐵 鑑 長 門 閃 閉 開 閏 閒 間 閣 閱 闆 闊 闍 闐 關 闡 防 阻 阿 陀 附 降 限 院 陣 除 陪 陰 陳 陵 陶 陷 陸 陽 隆 隊 階 隔 際 障 隨 險 隱 隻 雄 雅 集 雉 雖 雙 雜 雞 離 難 雨 雪 雲 零 雷 電 需 震 霍 霧 露 霸 霹 靂 靈 青 靖 靜 非 靠 面 革 靼 鞋 韃 韋 韓 音 韻 響 頁 頂 項 順 須 預 頑 頓 頗 領 頞 頭 頻 顆 題 額 顏 願 類 顧 顯 風 飄 飛 食 飯 飲 飽 飾 餅 養 餐 餘 館 首 香 馬 駐 駕 駛 騎 騙 騷 驅 驗 驚 骨 體 高 髮 鬆 鬥 鬧 鬱 鬼 魁 魂 魅 魔 魚 魯 鮮 鳥 鳳 鳴 鴻 鵝 鷹 鹿 麗 麥 麵 麻 麼 黃 黎 黑 默 點 黨 鼓 鼠 鼻 齊 齋 齒 齡 龍 龜 2,180
Auxiliary 乍 仂 伏 佐 侶 僳 兆 兌 兹 别 券 勳 卑 卞 占 叶 堤 墎 壤 奥 孜 峇 嶼 巽 栗 楔 涅 渾 澎 燦 狄 琳 瑚 甫 碑 礁 芒 苗 茨 蓬 蚩 蜀 裘 謬 酋 隴 乳 划 匕 匙 匣 叉 吻 嘟 噘 妖 巾 帆 廁 廚 弋 弓 懸 戟 扳 捂 摔 暈 框 桶 桿 櫃 煎 燭 牡 皺 盒 眨 眩 筒 簍 糰 紋 紗 纏 纜 羯 聳 肖 艇 虹 蛛 蜘 蝴 蝸 蠟 裙 豚 躬 釘 鈔 鈕 鉛 鎚 鎬 鐺 鑰 鑽 霄 鞠 骰 骷 髏 鯉 鳶 115

In its 'main' category, CLDR lists 2,210 characters for the Simplified Chinese orthography, and 2,180 for Traditional Chinese. Combined, this includes 3,026 unique characters, and an overlap of 1,364 characters.
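The relationship between these counts can be checked with inclusion-exclusion: the size of the overlap equals the two list sizes added together, minus the number of unique characters. The sketch below assumes the three headline figures (2,210, 2,180 and 3,026) are correct; the function name and the tiny sample strings are illustrative only and are not CLDR data.

```python
def overlap_size(size_a: int, size_b: int, size_union: int) -> int:
    """Inclusion-exclusion: |A ∩ B| = |A| + |B| - |A ∪ B|."""
    return size_a + size_b - size_union


# Headline figures quoted for the CLDR 'main' lists (assumed correct).
shared = overlap_size(2210, 2180, 3026)

# The same identity on two tiny illustrative sets of characters.
simplified_sample = set("第一条人生而自由")
traditional_sample = set("第一條人生而自由")
sample_shared = simplified_sample & traditional_sample
sample_union = simplified_sample | traditional_sample

if __name__ == "__main__":
    print(shared)
    print(len(sample_shared), len(sample_union))
```

Running the same set operations over the full Main lists would give the exact overlap directly, without relying on the quoted totals.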

Radicals

A radical is an ideograph or, more commonly, a component of an ideograph that is used for indexing dictionaries and word lists, and as the basis for creating new ideographs. The 214 radicals of the KangXi dictionary are universally recognised.

The visual appearance of radicals may vary significantly from the original character on which they are based.

Han character for word/say/speak (top) and water (bottom), and associated radicals used in other characters (highlighted yellow).

The shape of the radical may be influenced by the arrangement with other elements of a character, or by standardised simplifications. In the figure above, the shape of the top right radical (word) is a product of the simplification process in China.

Unicode dedicates two blocks to radicals. The KangXi radicals block contains the base forms of the 214 radicals.

The CJK Radicals Supplement contains variant shapes of these radicals when they are used as parts of other characters or in simplified form. These have not been unified because they often appear independently in dictionary indices.

Characters in those blocks should never be used as ideographs.
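As a rough illustration of how text could be checked for radical-block characters used in place of ideographs, the sketch below flags code points in the two radical blocks and swaps them for their unified ideograph where Unicode defines a compatibility mapping. The block ranges are taken from the Unicode code charts; the function names are hypothetical, and radicals without a compatibility mapping (many of those in the CJK Radicals Supplement) are left unchanged.

```python
import unicodedata

# Block ranges as given in the Unicode code charts.
RADICAL_BLOCKS = {
    "CJK Radicals Supplement": (0x2E80, 0x2EFF),
    "Kangxi Radicals": (0x2F00, 0x2FDF),
}


def radical_block(ch):
    """Return the name of the radical block containing ch, or None."""
    cp = ord(ch)
    for name, (low, high) in RADICAL_BLOCKS.items():
        if low <= cp <= high:
            return name
    return None


def find_radical_characters(text):
    """List (index, character, block name) for every radical-block character."""
    return [(i, ch, radical_block(ch)) for i, ch in enumerate(text) if radical_block(ch)]


def radicals_to_ideographs(text):
    """Replace radical-block characters by their compatibility equivalents.

    Only radical-block characters are normalised, so fullwidth punctuation
    and other characters in the text are not touched.
    """
    return "".join(
        unicodedata.normalize("NFKC", ch) if radical_block(ch) else ch
        for ch in text
    )


if __name__ == "__main__":
    suspect = "\u2f94\u8a9e"  # KANGXI RADICAL SPEECH followed by the ideograph 語
    print(find_radical_characters(suspect))
    print(radicals_to_ideographs(suspect) == "\u8a00\u8a9e")
```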

Text direction

Text can be written horizontally, left to right, or vertically with lines progressing from right to left. Vertically set text is more common in Traditional Chinese than Simplified Chinese areas.

Older horizontally set texts in Chinese also ran right to left.

It should be noted, however, that horizontal and vertical text are not usually identical. Apart from the question of what gets rotated and what does not, the two writing modes may show different preferences for emphasis marks, brackets, numbers, and so forth.

Glyph shaping & positioning

This section brings together information about the following topics: font/writing styles; cursive text; context-based shaping; context-based positioning; letterform slopes, weights, & italics; case & other character transforms.

Experiment with examples using the Chinese character app.

Han characters have no contextual variation or placement of glyphs. Nor is text cursive (in the sense of joined-up).

The orthography has no case distinction, and no special transforms are needed to convert between characters.

On the other hand, punctuation and embedded text in other scripts are affected by the direction of lines.

By default, all Han characters and punctuation are inside a character frame that is square and the same size for all characters. The box containing the actual symbol is called the letter face, and there should be some space left between the letter face and the character frame. There may be variations, particularly for punctuation, etc., in the size of the letter face.

Character frame and letter face.

Because of the regularity of the character frame size, it can be used to measure the size of the text area or other parts of a page (horizontally or vertically).

In principle, Han characters are set solid, i.e. with no space between the character frames. However, text alignment and justification can make adjustments to the placement of characters in the direction of the line flow. See justify and letterspace.

Context-based shaping & positioning

Dashes, ellipses, and brackets are rotated 90° to the right when they appear in vertical text. Here is a list of characters to which this applies. [c,#table_of_punctuation_marks]

⸺␣—␣…␣~␣-␣–␣/␣「␣」␣『␣』␣“␣”␣‘␣’␣(␣)␣《␣》␣〈␣〉␣【␣】␣〖␣〗␣〔␣〕␣[␣]␣{␣}
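A layout tool might hold this list as a simple lookup. The sketch below is only an illustration: the set copies the characters exactly as they are listed above, the function name is hypothetical, and it says nothing about characters outside the list (such as embedded Latin text, which follows its own rules).

```python
# Punctuation listed above as rotated 90° in vertical text.
ROTATED_PUNCTUATION = set("⸺—…~-–/「」『』“”‘’()《》〈〉【】〖〗〔〕[]{}")


def is_listed_rotated_punctuation(ch):
    """True if ch is one of the listed marks that rotate in vertical text."""
    return ch in ROTATED_PUNCTUATION


if __name__ == "__main__":
    for ch in "「。人…":
        print(ch, is_listed_rotated_punctuation(ch))
```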

Typographic units

Word boundaries

Chinese rarely uses spaces. In the sample text there are gaps around punctuation, but these are produced by a lack of 'ink' in parts of the square character glyphs.

The following sequence from the sample text consists of only three characters, and none of them is a space.

别。并

The gap between these characters is only an absence of ink. There is no space character.
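This can be confirmed programmatically. The short sketch below lists the code point and Unicode name of each character in the sequence and reports whether it is whitespace; the helper name is illustrative.

```python
import unicodedata

sample = "别。并"


def describe(text):
    """Return (code point, Unicode name, is-whitespace) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch), ch.isspace()) for ch in text]


if __name__ == "__main__":
    for code_point, name, is_space in describe(sample):
        print(code_point, name, is_space)
```

The middle character is U+3002 IDEOGRAPHIC FULL STOP, whose glyph occupies only part of its character frame, which is what produces the visible gap.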

Graphemes

Since there are no combining marks or decompositions, graphemes correspond to individual characters.

Unicode grapheme clusters can be applied to Chinese without problems. There are no special issues related to operations that use grapheme clusters as their basic unit of text.
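The claim that graphemes correspond to individual characters can be spot-checked with a rough test: the text contains no characters in a Mark category, and canonical decomposition (NFD) leaves it unchanged. This is only an approximation under those two assumptions, not an implementation of the Unicode grapheme cluster rules, and the function name is hypothetical.

```python
import unicodedata


def one_character_per_grapheme(text):
    """Rough check that each code point stands alone as a grapheme."""
    # No combining marks (general categories Mn, Mc, Me).
    no_marks = all(not unicodedata.category(ch).startswith("M") for ch in text)
    # No character splits into a sequence under canonical decomposition.
    no_decomposition = unicodedata.normalize("NFD", text) == text
    return no_marks and no_decomposition


if __name__ == "__main__":
    print(one_character_per_grapheme("人人生而自由"))
    print(one_character_per_grapheme("e\u0301"))  # Latin e + combining acute
```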

Punctuation & inline features

Several of the following sections contain examples that show the position of punctuation marks within the character square and whether the character is rotated 90° in vertical text, or translated to a different location in the character frame. Column headings with H show horizontal, and V vertical writing modes. Where there is a systematic difference, the H or V are preceded by SC (Simplified Chinese) or TC (Traditional Chinese).

Phrase & section boundaries

Chinese uses the following separators at the sentence level and below. [c,#h-pause-or-stop-punctuation-marks]

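[Table: positions of phrase-level and sentence-level punctuation marks within the character frame, in columns SCH, TCH, SCV and TCV.]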

、 (U+3001 IDEOGRAPHIC COMMA) is typically used as a list separator.

. (U+FF0E FULLWIDTH FULL STOP) is used as sentence-final punctuation in, for example, college textbooks, science and technology literature, and grammar books of Western languages, most of which are set horizontally and make heavy use of Western-language text.

As the table shows, these punctuation marks are not rotated, however their position varies in Simplified Chinese for horizontal and vertical text, relative to the character frame. In Traditional Chinese they are all centred.

These different positions in Simplified Chinese require dedicated glyphs in the font, and cannot be achieved by simply rotating the glyph.

Chinese also uses the following doubled exclamation/question marks. They remain upright in vertical text.

Other punctuation used to separate phrases or items includes:

  H V
—— —— ——

If EM DASH characters are used, they are used in pairs.

Bracketed text

Chinese commonly uses fullwidth parentheses to insert parenthetical information into text.

    H V

Dashes can also be used to offset information, in which case Chinese typically uses those listed in the previous section, doubled up.

Although there are a number of other bracket characters (listed just below), they are rarely used in Chinese publications. [c,#id81]

【␣】␣〖␣〗␣〔␣〕␣[␣]␣{␣}

Brackets are also used to indicate titles and proper names (see otherinline).

Other punctuation

Chinese uses a large number of punctuation marks, and that number is increased by the duplication of normal vs. fullwidth variants. The fullwidth punctuation often includes significant amounts of white space, so that character frames of the punctuation characters are the same size as Han characters.
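The distinction between normal and fullwidth variants is recorded in Unicode data and can be inspected with Python's standard unicodedata module. The sketch below reports the East Asian Width class of a character and, where one exists, the character it folds to under NFKC compatibility normalisation; the helper names are illustrative.

```python
import unicodedata


def width_class(ch):
    """East Asian Width class, e.g. 'F' (fullwidth), 'W' (wide), 'Na' (narrow)."""
    return unicodedata.east_asian_width(ch)


def compatibility_counterpart(ch):
    """Character that ch folds to under NFKC, or None if it is unchanged."""
    folded = unicodedata.normalize("NFKC", ch)
    return folded if folded != ch else None


if __name__ == "__main__":
    for ch in ["\uff0c", ",", "\u3002"]:  # fullwidth comma, ASCII comma, ideographic full stop
        print(f"U+{ord(ch):04X}", width_class(ch), compatibility_counterpart(ch))
```

One practical consequence is that applying NFKC to Chinese text silently converts fullwidth punctuation to its ASCII counterpart, so that normalisation form is best avoided when the fullwidth forms need to be preserved.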

CLDR lists 136 punctuation characters for the union of Simplified and Traditional Chinese, grouped here by Unicode block.

CJK Symbols & Punctuation:

、␣。␣〃␣〈␣〉␣《␣》␣「␣」␣『␣』␣【␣】␣〔␣〕␣〖␣〗␣〝␣〞

(Halfwidth &) Fullwidth Forms:

!␣"␣#␣%␣&␣'␣(␣)␣*␣,␣-␣.␣/␣:␣;␣?␣@␣[␣\␣]␣_␣{␣}

Basic & General punctuation:

!␣"␣#␣%␣&␣(␣)␣*␣,␣-␣.␣/␣:␣;␣?␣@␣[␣\␣]␣_␣{␣}␣‐␣‑␣–␣—␣―␣‖␣‘␣’␣“␣”␣†␣‡␣‥␣…␣‧␣‰␣′␣″␣‵␣※␣‾␣§␣·

CLDR also includes some compatibility characters, included for handling legacy implementations. These include vertical text forms, which should normally be automatically enabled by the font in a vertical context. The other forms should also be avoided in favour of normal characters, with the variant shapes provided by fonts or styling. [u,284-5]

CJK Compatibility Forms:

︰␣︱␣︲␣︳␣︴␣︵␣︶␣︷␣︸␣︹␣︺␣︻␣︼␣︽␣︾␣︿␣﹀␣﹁␣﹂␣﹃␣﹄␣﹉␣﹊␣﹋␣﹌␣﹍␣﹎␣﹏

Small Form Variants:

﹐␣﹑␣﹒␣﹔␣﹕␣﹖␣﹗␣﹘␣﹙␣﹚␣﹛␣﹜␣﹝␣﹞␣﹟␣﹠␣﹡␣﹣␣﹨␣﹪␣﹫

Dashes. The long dashes mentioned in bracketing can also be used to show a continuation of tone or sound, to mark an abrupt change in thought, or to add new content to the context. [c,#id82]

Connectors. Connector marks are used "to indicate the beginning and end of time or space, to indicate quantity, to express the name of a chemical compound, to label a table or illustration, to connect a house number in an address, for a phone number, to separate digits which indicate the year, month and date, or to connect compound nouns and for the romanization, as well as the foreign text in the content". [c,#id85]

Chinese uses the following punctuation for this. [c,#id85]

  SC: H, V TC: H, V
Bottom right Bottom right Bottom right Bottom right
Bottom right Bottom right Bottom right Bottom right
Bottom right Bottom right - -

Separators. Interpuncts are used to separate the first name and family name in foreign or minority names rendered using Chinese characters, and with book title marks to separate chapters, articles and volumes in publications. [c,#id86]

  H V
· Centred Centred

Middle dots sometimes take up only a halfwidth space in Simplified Chinese when used with dates, e.g. 2·11. [c,#id86]

The following characters are not recommended for this purpose: U+FF0E FULLWIDTH FULL STOP, U+2027 HYPHENATION POINT, U+2022 BULLET, and U+30FB KATAKANA MIDDLE DOT. [c,#id86]

Quotations & citations

See type samples.

Quotations

Mainland China. Mainland China, where vertical text is not common, uses different quote marks for horizontal and vertical writing. The default quote marks are:

    H V

When an additional quote is embedded within the first, the quote marks are:

    H V

Taiwan. Taiwan tends to use a single set of quote marks, but the other way around compared to Mainland China. The default quote marks are:

    H V

When an additional quote is embedded within the first, the quote marks are:

    H V

Occasionally, Traditional Chinese text may use double brackets for the default, and single for the embedded. It may also use quotation marks, like Mainland China, but not commonly, and much less so for vertical text.

Proper names

Proper names can be highlighted using line decoration, with a straight, single underline. Note that the underline is not used for emphasis in this case. [c,#glyphs_sizes_and_positions_in_character_faces_of_punctuation_marks]

To achieve this in web pages the HTML specification recommends the use of the u element.

See also otherinline.

莫怨東風當自嗟:宋歐陽修明妃曲:「紅顏勝人多薄命,莫怨東風當自嗟。」
A footnote containing 2 proper nouns and a title of a poem, all indicated by underlining. Note the gaps in the lines between the three items. There are no gaps in the character sequence.
Translation: It is your own heart that causes you pain: Song, Ouyang Xiu, Ming Fei Song: 'Do not blame the east wind for your sorrows, for it is your own heart that causes you pain.' (Mò yuàn dōngfēng dāng zì jiē: Sòng Ōuyáng Xiū Míng fēi qū: 'Hóngyán shèng rén duō bómìng, mò yuàn dōngfēng dāng zì jiē.')

Titles

Titles of works including books, articles, songs, movies, files, calligraphy and paintings are cited in Chinese in one of two ways: [c,#id87]

  1. Using angle brackets around the title.
  2. Underlining the title with a wavy line (see fig_underlines).

The double brackets tend to be used for book and chapter titles, and the single brackets for articles. [c,#glyphs_sizes_and_positions_in_character_faces_of_punctuation_marks]

    H V

To achieve the wavy underline in web pages, use the u element with a class name such as title_of_work, and add the following line of CSS to your style sheet.

.title_of_work { text-decoration-style: wavy; }

Emphasis

To express emphasis Chinese uses dots or circles alongside characters, one dot per base character.

缔造真正全球通行的万维网 缔造真正全球通行的万维网
Text emphasis in horizontal (left) and vertical (right) text.
Translation: Making the World Wide Web truly world wide. (Dìzào zhēnzhèng quánqiú tōngxíng de wànwéiwǎng)

In horizontal text, emphasis marks appear underneath the base text. In vertical text, they run down the right-hand side. Regardless of the orientation of the line, the dot is centred alongside or below the base character.

Where both lines and emphasis dots decorate the same run of text, the lines and emphasis dots will usually appear on opposite sides of a vertical line of text, but will normally both appear below a line of horizontally set text. In horizontal text the line decoration is normally closer to the text than the emphasis dots.

In the same way as for other line decorations, embedded text in other languages that runs sideways up or down the line would have dots displayed on the same side as when decorating Chinese.

Straight or wavy lines alongside the text are not used for emphasis (unlike in Latin script text), but are instead used in Chinese to indicate proper nouns such as a person's name, a book title, or the name of a place. [c,#id82] See inline_titles and inline_propernames.

Abbreviation, ellipsis & repetition

Ellipsis

An ellipsis in Chinese consists of six dots and takes up the space of two Han characters. This is normally achieved using two U+2026 HORIZONTAL ELLIPSIS characters, side by side. [c,#id83]

  H V
…… U+2026 HORIZONTAL ELLIPSIS x2 Bottom right Left
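As a small sketch of the two-character convention, the code below builds the Chinese ellipsis from a doubled U+2026 and converts other common spellings to it. The normalise_ellipsis helper and its rule (any run of three or more ASCII full stops, or any run of U+2026, becomes exactly two U+2026) are assumptions made for illustration, not a documented requirement.

```python
import re
import unicodedata

# Two U+2026 HORIZONTAL ELLIPSIS characters, giving six dots in total.
CHINESE_ELLIPSIS = "\u2026" * 2

# Hypothetical clean-up rule: runs of ASCII dots or of U+2026.
_ELLIPSIS_RUN = re.compile(r"\.{3,}|\u2026+")


def normalise_ellipsis(text):
    """Replace each ellipsis-like run with the doubled U+2026 form."""
    return _ELLIPSIS_RUN.sub(CHINESE_ELLIPSIS, text)


if __name__ == "__main__":
    print(unicodedata.name(CHINESE_ELLIPSIS[0]), len(CHINESE_ELLIPSIS))
    print(normalise_ellipsis("等等..."))
```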

Inline notes & annotations

Inter-line annotations are used to indicate pronunciation (usually only for children or foreigners), and to provide commentaries on or bilingual equivalents of the main text.

With the exception of zhuyin in horizontal text, all annotations appear within the standard inter-line space for the page, and don't create extra space if they appear on a single line. That said, the inter-line space is usually set at an appropriate size to accommodate annotations.

Unlike Japanese, it is rare to find annotations applied just to specific words; generally the whole text is annotated. If annotations are only needed for individual characters or words, they are often presented in parentheses following the annotated text.

These annotations do not appear alongside punctuation.

Indicating pronunciation with Latin characters

Pinyin is the most common way of representing pronunciation, although occasionally other transcriptions are used.

Horizontal  semantic annotation.
Examples of pinyin, word-based phonetic annotations (source).

The annotation usually appears above the main line of text, except when zhuyin and pinyin annotations are both present, in which case it commonly appears below the line.

Pinyin and zhuyin together.
Examples of pinyin and zhuyin phonetic annotations applied to the same base characters (source).

Latin annotations for pronunciation are usually only used with horizontal text.

For native children the annotations are usually applied character by character, whereas for foreign learners they are often applied word by word. The annotation is normally centred above the base text, and contains no spaces.

In order to avoid collisions or wrongly implied word boundaries, there should always be a 1/4em space between adjacent long annotations (usually up to 5 characters per syllable for pinyin). Letter-spacing is typically applied evenly across all the base text to allow room for annotations.

There is a preference for annotations to use a sans-serif font, and for the base text to use Kai.

Indicating pronunciation using Zhuyin Fuhao

The 國語注音符號 (guóyǔ zhùyīn fúhào) approach uses a set of characters referred to as bopomofo (after the initial letters in the alphabet), and is mostly used in Taiwan.

The bopomofo annotations usually appear in a vertical column to the right of each base character, in both horizontal and vertical text.

Vertical zhuyin.    Vertical zhuyin.
Examples of zhuyin phonetic annotations (source).

Each syllable is described by up to 3 bopomofo characters, plus a tone mark. The neutral tone mark appears above the stack, but the others appear to the right of the bopomofo column. The height of the tone mark depends on the number of bopomofo characters to its left. For details, see CLREQ.

Annotations representing meaning or commentaries

These annotations are common in light novels and translated works, and tend to describe phrases or words. They may contain casing, punctuation, and spaces, and may contain Chinese text explaining Latin base text, or vice versa.

They usually appear below a horizontal line of text, and to the left of a vertical line.

Horizontal  semantic annotation. Vertical semantic annotation.
Examples of bilingual annotations (source).

Unlike phonetic annotations, these annotations are only attached to specific words or phrases.

Other text decoration & inline features

Text decoration characteristics

When lines or other text decoration are used, they normally appear below horizontal text, and to the left of vertical text. However, emphasis marks appear to the right of a vertical line. [c,#handling_interlinear_punctuation]

Where both lines and emphasis dots decorate the same run of text, the lines and emphasis dots will usually appear on opposite sides of a vertical line of text, but will normally both appear together below a line of horizontally set text. In horizontal text the line decoration is normally closer to the text than the emphasis dots. [c,#handling_interlinear_punctuation]

When two underlined items appear side-by-side, the underline should be broken between the two.c,#handling_interlinear_punctuation

宋歐陽修明妃曲
From fig_underlines, detail of the two proper nouns and a poem title, side by side, showing a small gap between each item. There is no gap in the character content.

To achieve this in web pages the CSS Text specification currently proposes the use of a text-decoration-skip-inset property, although it is not yet finalised and is not supported by any browser. The auto value should automatically produce the slightly shorter line lengths below the characters. For full details of the options available see the CSS spec.

Most of the time you will probably want to use the following:
text-decoration-skip-inset: auto;

If a line of Chinese text contains some text in another language and orthography, the position of any text decoration should follow the Chinese conventions (CLREQ, #handling_interlinear_punctuation).

Lines alongside the text are used to indicate personal names, rather than emphasis (see inline_propernames). Wavy lines may also be used to mark a title of a book or work of art (see inline_titles).

Emphasis can be indicated using dots alongside the line (see emphasis).

Line & paragraph layout

Line breaking & word wrap

Lines are normally wrapped between characters; word boundaries have no significance for the wrapping. Chinese line breaking should, however, take into account a few rules that dictate which characters cannot appear at the end or start of a line.

There is no hyphenation when Chinese characters are wrapped to the next line.

Line start/end rules

The following characters should not normally begin a line. Instead, they should bring the previous Han character with them (CLREQ, #prohibition_rules_for_line_start_end).

。␣.␣,␣、␣:␣;␣!␣‼␣?␣⁇␣~␣-␣–␣—␣·␣・␣‧␣」␣』␣”␣’␣)␣》␣〉␣】␣〗␣〕␣]␣}

A slightly stricter rule, called GB-style by CLReq, adds the following solidus characters (CLREQ, #prohibition_rules_for_line_start_end).

/␣/

A further level of strictness adds the following to the list. Where 2 characters are listed here, they should ideally not be broken across a line ending, but they may be split to reduce the length of text wrapped onto the next line (CLREQ, #prohibition_rules_for_line_start_end).

⸺␣——␣……␣⋯⋯␣

These rules can be modified by preferences, and in some cases are not observed at all, particularly for Traditional Chinese in Taiwan and Hong Kong, and especially for newsprint, to help deal with narrow columns of text (CLREQ, #prohibition_rules_for_line_start_end).

Also, where several punctuation marks appear together, for example 。』」, moving all characters from the previous line might create too large a gap for justification to handle elegantly, and so punctuation marks might be allowed to appear at the line start (CLREQ, #prohibition_rules_for_line_start_end).

The following characters should not appear at the end of a line.

「␣『␣“␣‘␣(␣《␣〈␣【␣〖␣〔␣[␣{
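The basic line start/end rules above can be sketched as a simple wrapping routine. This is a minimal sketch under stated assumptions, not a full line breaker: the function name `wrap_chinese` is hypothetical, every character is treated as one cell wide, only the basic (least strict) lists are used, and the fullwidth forms of any ASCII punctuation in those lists are added as an assumption. It does not model the relaxations described above for narrow columns or runs of punctuation.

```python
def _with_fullwidth(chars: str) -> set:
    """Return the characters plus the fullwidth form of any ASCII ones."""
    result = set(chars)
    for ch in chars:
        if "!" <= ch <= "~":
            result.add(chr(ord(ch) + 0xFEE0))
    return result


# Characters that should not begin a line (basic list from the text).
NO_LINE_START = _with_fullwidth("。.,、:;!‼?⁇~-–—·・‧」』”’)》〉】〗〕]}")
# Characters that should not end a line.
NO_LINE_END = _with_fullwidth("「『“‘(《〈【〖〔[{")


def wrap_chinese(text: str, width: int) -> list:
    """Break text into lines of at most `width` characters.

    A break may fall between any two characters, but is moved back when
    the next line would start with a prohibited character, or the current
    line would end with one.
    """
    if width < 2:
        raise ValueError("width must be at least 2")
    lines = []
    start = 0
    while start < len(text):
        end = min(start + width, len(text))
        if end < len(text):
            # Always keep at least one character on the line.
            while end - start > 1 and (
                text[end] in NO_LINE_START or text[end - 1] in NO_LINE_END
            ):
                end -= 1
        lines.append(text[start:end])
        start = end
    return lines
```

Moving the break point back means that the character before a closing punctuation mark travels with it to the next line, which leaves the current line short; a real implementation would then rely on justification to absorb the gap.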


Text alignment & justification

Chinese justifies text using a complex set of rules which adjust the space between characters on a line. Some characters are adjusted before others.

Use the control below to see how your browser justifies the text sample here.

法律之前人人平等,并有权享受法律的平等保护,不受任何歧视。人人有权享受平等保护,以免受违反本宣言的任何歧视行为以及煽动这种歧视的任何行为之害。

Baselines, line height, etc

The standard baseline for Han characters is slightly lower than the alphabetic baseline used for Latin characters. Mixed script text needs to align baselines correctly.

Han characters have no ascenders or descenders, but occupy the square space described earlier.

fig_baselines shows metrics for the Heiti TC font. In this font the maximum height of the Han characters reaches slightly higher than the Latin ascenders, but not as low as the Latin descenders.

qhx国家或领
Font metrics for text in the Heiti TC font.

Interline spacing. Interline spacing should be consistent across all lines in a given text. It should allow a gap of sufficient size to include interlinear text decorations, such as lines for proper names or book titles or dots for emphasis marks. If an interline space is likely to include both line decorations and emphasis marks in a single interline gap, then the interline spacing must be set to accommodate that. (Note that paragraphs on the Web may reflow such that a single interline gap may sometimes contain both line decorations and emphasis dots, while at other times the gap may only contain one.)

Counters, lists, etc.

You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.

Chinese text uses a number of different counter styles. Some of the more common include full-width European numbers, which in vertical text stand upright. Unicode has various sets of numbers that can be useful here.

For the dotted-decimal numeric style Unicode provides precomposed characters from 1 to 20.

⒈␣⒉␣⒊␣⒋␣⒌␣⒍␣⒎␣⒏␣⒐␣⒑␣⒒␣⒓␣⒔␣⒕␣⒖␣⒗␣⒘␣⒙␣⒚␣⒛

For the circled-decimal numeric style Unicode provides characters from 0 to 50.

⓪␣①␣②␣③␣④␣⑤␣⑥␣⑦␣⑧␣⑨␣⑩␣⑪␣⑫␣⑬␣⑭␣⑮␣⑯␣⑰␣⑱␣⑲␣⑳␣㉑␣㉒␣㉓␣㉔␣㉕␣㉖␣㉗␣㉘␣㉙␣㉚␣㉛␣㉜␣㉝␣㉞␣㉟␣㊱␣㊲␣㊳␣㊴␣㊵␣㊶␣㊷␣㊸␣㊹␣㊺␣㊻␣㊼␣㊽␣㊾␣㊿
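The precomposed characters for these two styles are not stored in one contiguous run of code points, so a lookup needs a little arithmetic. The following sketch shows one way to do it; the function names `dotted_decimal` and `circled_decimal` are hypothetical, and raising an error outside the supported range is a choice made for this example (a CSS counter style would fall back to another style instead).

```python
def dotted_decimal(n: int) -> str:
    """Return the precomposed 'number plus full stop' character for 1 to 20."""
    if not 1 <= n <= 20:
        raise ValueError("precomposed dotted numbers only cover 1 to 20")
    return chr(0x2488 + n - 1)          # U+2488 to U+249B


def circled_decimal(n: int) -> str:
    """Return the precomposed circled number character for 0 to 50."""
    if n == 0:
        return "\u24EA"                 # circled zero is encoded separately
    if 1 <= n <= 20:
        return chr(0x2460 + n - 1)      # U+2460 to U+2473
    if 21 <= n <= 35:
        return chr(0x3251 + n - 21)     # U+3251 to U+325F
    if 36 <= n <= 50:
        return chr(0x32B1 + n - 36)     # U+32B1 to U+32BF
    raise ValueError("precomposed circled numbers only cover 0 to 50")
```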

Chinese orthographies also use ideographic characters to create 1 numeric, 2 fixed, 1 cyclic, 1 additive and 2 idiosyncratic styles.

Numeric

The cjk-decimal numeric style is decimal-based and uses these digits (see Ready-made Counter Styles).

〇␣一␣二␣三␣四␣五␣六␣七␣八␣九

Examples:

一␣二␣三␣四␣一一␣二二␣三三␣四四␣一三一␣二四二␣三三三␣四六四
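Because this style is purely positional, a counter can be produced by swapping each decimal digit for its ideographic equivalent. A minimal sketch follows; the function name `cjk_decimal` is hypothetical, and negative values are simply rejected here.

```python
CJK_DIGITS = "〇一二三四五六七八九"


def cjk_decimal(n: int) -> str:
    """Write a non-negative integer using the cjk-decimal digits."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative numbers")
    return "".join(CJK_DIGITS[int(digit)] for digit in str(n))
```

For example, `cjk_decimal(131)` gives 一三一, matching the examples above.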

Chinese-specific

Several ideographic-based counter styles have an algorithm that is like an additive style, but has some differences. The algorithm to use can be found in the CSS Counter Styles specification, where they are called Longhand East Asian styles.

These styles are all decimal-based, and use the same algorithm but with different characters. The CSS spec only defines the algorithm up to 9,999, because there appears to be some disagreement about how larger numbers are handled.

The simp-chinese-informal longhand style uses the characters shown just below. The separator for lists is 、 and the numbers can be negative when using the symbol 负.

零␣一␣二␣三␣四␣五␣六␣七␣八␣九␣十␣百␣千

Examples:

一␣二␣三␣四␣十一␣二十二␣三十三␣四十四␣一百三十一␣二百四十二␣三百三十三␣四百六十四

The trad-chinese-informal style uses exactly the same characters, except that the negative symbol is 負.

The simp-chinese-formal longhand style uses the characters shown below. The separator for lists is 、 and the numbers can be negative when using the symbol 负.

零␣壹␣贰␣叁␣肆␣伍␣陆␣柒␣捌␣玖␣拾␣佰␣仟

Examples:

壹␣贰␣叁␣肆␣壹拾壹␣贰拾贰␣叁拾叁␣肆拾肆␣壹佰叁拾壹␣贰佰肆拾贰␣叁佰叁拾叁␣肆佰陆拾肆

The trad-chinese-formal longhand style uses 3 different code points where there is a difference in shape (for 2, 3, and 6), shown below. The separator for lists is 、 and the numbers can be negative when using the symbol 負.

零␣壹␣貳␣參␣肆␣伍␣陸␣柒␣捌␣玖␣拾␣佰␣仟

Examples:

壹␣貳␣參␣肆␣壹拾壹␣貳拾貳␣參拾參␣肆拾肆␣壹佰參拾壹␣貳佰肆拾貳␣參佰參拾參␣肆佰陸拾肆
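The four longhand styles share one algorithm and differ only in their characters. The sketch below follows the steps given for the Chinese longhand styles in the CSS Counter Styles specification as understood here: attach a marker to each non-zero digit, drop the leading digit for 10 to 19 in the informal styles, drop trailing zeros, and collapse runs of zeros into one. The function name `longhand` and the table layout are hypothetical, and negative numbers are left out of the sketch.

```python
# style name -> (digits 0 to 9, markers for 10 / 100 / 1000, informal?)
LONGHAND_STYLES = {
    "simp-chinese-informal": ("零一二三四五六七八九", "十百千", True),
    "trad-chinese-informal": ("零一二三四五六七八九", "十百千", True),
    "simp-chinese-formal": ("零壹贰叁肆伍陆柒捌玖", "拾佰仟", False),
    "trad-chinese-formal": ("零壹貳參肆伍陸柒捌玖", "拾佰仟", False),
}


def longhand(value: int, style: str) -> str:
    """Format 0 to 9999 in one of the Chinese longhand counter styles."""
    digits, markers, informal = LONGHAND_STYLES[style]
    if not 0 <= value <= 9999:
        raise ValueError("this sketch only covers 0 to 9999")
    if value == 0:
        return digits[0]

    decimal = str(value)
    groups = []                       # one entry per digit; None means zero
    for position, ch in enumerate(decimal):
        power = len(decimal) - 1 - position
        digit = int(ch)
        if digit == 0:
            groups.append(None)
        else:
            marker = markers[power - 1] if power else ""
            groups.append(digits[digit] + marker)

    # Informal styles write 10 to 19 without the leading 'one'.
    if informal and 10 <= value <= 19:
        groups[0] = markers[0]

    # Drop trailing zeros, then collapse each remaining run of zeros.
    while groups[-1] is None:
        groups.pop()
    result = []
    previous_was_zero = False
    for group in groups:
        if group is None:
            if not previous_was_zero:
                result.append(digits[0])
            previous_was_zero = True
        else:
            result.append(group)
            previous_was_zero = False
    return "".join(result)
```

With these tables, `longhand(11, "simp-chinese-informal")` gives 十一 while `longhand(11, "simp-chinese-formal")` gives 壹拾壹, in line with the examples above.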

Fixed

The cjk-earthly-branch fixed style uses the letters shown just below. It is only able to count to 12.

子␣丑␣寅␣卯␣辰␣巳␣午␣未␣申␣酉␣戌␣亥

The cjk-heavenly-stem fixed style uses the letters shown below. It is only able to count to 10.

甲␣乙␣丙␣丁␣戊␣己␣庚␣辛␣壬␣癸

The circled-ideograph fixed style uses the letters shown below. It is only able to count to 10.

㊀␣㊁␣㊂␣㊃␣㊄␣㊅␣㊆␣㊇␣㊈␣㊉

The parenthesised-ideograph fixed style uses the letters shown below. It is also only able to count to 10.

㈠␣㈡␣㈢␣㈣␣㈤␣㈥␣㈦␣㈧␣㈨␣㈩
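A fixed style is a plain lookup that runs out once its symbols are used up. The sketch below illustrates this for the four styles above; the function name `fixed_counter` is hypothetical, and falling back to ordinary decimal digits beyond the last symbol is an assumption made for this example.

```python
FIXED_STYLES = {
    "cjk-earthly-branch": "子丑寅卯辰巳午未申酉戌亥",
    "cjk-heavenly-stem": "甲乙丙丁戊己庚辛壬癸",
    "circled-ideograph": "㊀㊁㊂㊃㊄㊅㊆㊇㊈㊉",
    "parenthesised-ideograph": "㈠㈡㈢㈣㈤㈥㈦㈧㈨㈩",
}


def fixed_counter(n: int, style: str) -> str:
    """Return the nth symbol of a fixed style, or decimal digits if the
    style has no symbol for n (assumed fallback)."""
    symbols = FIXED_STYLES[style]
    if 1 <= n <= len(symbols):
        return symbols[n - 1]
    return str(n)
```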

Cyclic

The cjk-stem-branch cyclic style uses the pairs of characters shown just below. Once 60 is reached, the list starts again from the beginning.

甲子␣乙丑␣丙寅␣丁卯␣戊辰␣己巳␣庚午␣辛未␣壬申␣癸酉␣甲戌␣乙亥␣丙子␣丁丑␣戊寅␣己卯␣庚辰␣辛巳␣壬午␣癸未␣甲申␣乙酉␣丙戌␣丁亥␣戊子␣己丑␣庚寅␣辛卯␣壬辰␣癸巳␣甲午␣乙未␣丙申␣丁酉␣戊戌␣己亥␣庚子␣辛丑␣壬寅␣癸卯␣甲辰␣乙巳␣丙午␣丁未␣戊申␣己酉␣庚戌␣辛亥␣壬子␣癸丑␣甲寅␣乙卯␣丙辰␣丁巳␣戊午␣己未␣庚申␣辛酉␣壬戌␣癸亥
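The 60 pairs do not need to be stored: each one combines a heavenly stem, which repeats every 10 steps, with an earthly branch, which repeats every 12. A minimal sketch, with the hypothetical function name `cjk_stem_branch`:

```python
STEMS = "甲乙丙丁戊己庚辛壬癸"          # 10 heavenly stems
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"    # 12 earthly branches


def cjk_stem_branch(n: int) -> str:
    """Return the stem-branch pair for counter value n, cycling every 60."""
    index = (n - 1) % 60
    return STEMS[index % 10] + BRANCHES[index % 12]
```

Counter value 61 therefore produces the same pair as counter value 1.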

Additive

𝍶␣𝍵␣𝍴␣𝍳␣𝍲

The cjk-tally-mark additive style uses the letters shown just above. It is based on only 5 basic characters, which were introduced in Unicode 11. The potential range of this style is very large, but counters rapidly grow in size, so smaller numbers are most likely.
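An additive style builds a counter by repeatedly taking the largest symbol that still fits. The sketch below shows this for the tally marks; the function name `cjk_tally_mark` is hypothetical, the characters are written as escapes for U+1D372 to U+1D376 (tally digits one to five), and values below 1 are rejected because there is no symbol for zero.

```python
# (weight, symbol) pairs, largest first: tally digits five down to one.
TALLY_MARKS = [
    (5, "\U0001D376"),
    (4, "\U0001D375"),
    (3, "\U0001D374"),
    (2, "\U0001D373"),
    (1, "\U0001D372"),
]


def cjk_tally_mark(n: int) -> str:
    """Build a tally-mark counter by greedy addition of the weights."""
    if n < 1:
        raise ValueError("tally marks cannot represent values below 1")
    parts = []
    for weight, symbol in TALLY_MARKS:
        count, n = divmod(n, weight)
        parts.append(symbol * count)
    return "".join(parts)
```

Every additional 5 adds another character, which is why counters in this style grow quickly.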

Prefixes and suffixes

The most common suffix is 、. The circled or parenthesised fixed styles have no prefix/suffix.

Examples:

一、 二、 三、 四、 五、
Separator for Chinese list counters.

Page & book layout

Forms & user interaction

In vertically set text, form controls on Web pages should be rotated 90 degrees clockwise compared to the form controls for Western languages.

The following figures show examples of what is expected. Major browsers don't fully support forms with this orientation at the time of writing.

评语: 缔造真正全球通行的万维网

Text entry form controls.

A select control closed (right) and then open while the user makes a choice (left).

油位/文件进度/复制

Meter, progress, and button elements (right to left).

Character lists

The Han script characters in Unicode 13.0 are spread across 7 blocks. The total number of these characters is 92,896.

There are also 2 compatibility blocks, containing 1,014 characters in total.

There are also various related blocks, containing 459 characters.

The following links give information about characters used for everyday use of Chinese. The numbers in parentheses are for non-ASCII characters.

References