Updated 12 December, 2022
This page brings together basic information about the Han Simplified and Traditional writing systems and their use for the Chinese language. It aims to provide a brief, descriptive summary of the modern, printed orthography and typographic features, and to advise how to write Chinese using Unicode.
第一条 人人生而自由,在尊严和权利上一律平等。他们赋有理性和良心,并应以兄弟关系的精神相对待。
第二条 人人有资格享有本宣言所载的一切权利和自由,不分种族、肤色、性别、语言、宗教、政治或其他见解、国籍或社会出身、财产、出生或其他身分等任何区别。并且不得因一人所属的国家或领土的政治的、行政的或者国际的地位之不同而有所区别,无论该领土是独立领土、托管领土、非自治领土或者处于其他任何主权受限制的情况之下。
第一條 人人生而自由,在尊嚴和權利上一律平等。他們賦有理性和良心,並應以兄弟關係的精神相對待。
第二條 人人有資格享受本宣言所載的一切權利和自由,不分種族、膚色、性別、語言、宗教、政治或其他見解、國籍或社會出身、財產、出生或其他身分等任何區別。
Two styles of Han characters are used to write Chinese. The traditional orthography was used from the 5th century until 1949 in Mainland China. The simplified orthography was introduced in 1949 and is used in Mainland China and Singapore. Traditional Han is still used in Taiwan, Hong Kong and Macau, and for aesthetic purposes elsewhere in East Asia.
People speaking different Chinese dialects nevertheless write largely the same way, due to the way that the Han characters represent concepts rather than sounds.
Han characters are also widely used in Japan to represent the main part of Japanese words, and are sometimes used in Korea (though modern Korean text will contain very few, if any, Han characters).
汉字 hànzì Simplified Chinese 漢字 hànzì Traditional Chinese
Chinese writing dates from the second half of the second millennium BC. There is no evidence for a predecessor. The earliest inscriptions were on bones and shells used in divination during the Shang dynasty (1600-1046 BC), and employed a set of logographic symbols now known as the Oracle Bone Script. Although these symbols have been extinct since the end of the Bronze Age, the modern Han characters are their direct descendants.
Sources: Scriptsource, Wikipedia.
The Han script is an ideographic script. Characters typically represent a spoken syllable with its tone. See the table to the right for a brief overview of features for the modern Mandarin Chinese orthography, using the Simplified Chinese orthography. The character count reflects a typical set of characters needed for everyday reading and writing: there are many thousands more Han characters that could be added for other purposes (see chars).
The Simplified Chinese orthography has a smaller repertoire and simpler shapes than the Traditional version.
The Chinese script is used as a common writing system by people who may speak a wide variety of Chinese languages, and who may pronounce the written text very differently. This is possible because the characters represent concepts rather than phonetics.
Text can be written horizontally, left to right, or vertically with lines progressing from right to left. Vertically set text is more common in Traditional Chinese than in Simplified Chinese areas.
Words are not separated by spaces.
In its 'main' category, CLDR lists 2,210 characters for the Simplified Chinese orthography, and 2,180 for Traditional Chinese. Combined, this includes 3,026 unique characters, and an overlap of 1,064 characters.
The language is tonal, but the tones are not written explicitly.
Chinese has no combining marks, but has many punctuation marks. It also has a relatively complex set of typographic rules.
The visual forms of characters don't interact.
This section lists sounds for Mandarin Chinese, as spoken in the Beijing area.
Phones in a lighter colour are non-native or allophones (Wikipedia).
 | labial | dental | alveolar | post-alveolar | retroflex | palatal | velar |
---|---|---|---|---|---|---|---|
stops | p | | t | | | | k |
aspirated | pʰ | | tʰ | | | | kʰ |
affricates | | | t͡s | t͡ɕ | t͡ʂ | | |
aspirated | | | t͡sʰ | t͡ɕʰ | t͡ʂʰ | | |
fricative | f | | s | ɕ | ʂ | | x |
nasal | m | | n | | | | ŋ |
approximant | | | l | | ɻ | | |
Chinese text is primarily constructed from characters that each correspond to a spoken syllable, including its tone. Some have pictographic origins that are still evident, whereas others have a more complicated structure.
Chinese readers are commonly said to use around 3,000 to 4,000 characters for most communication, but a reasonable word processor would need to support at least 10,000. Unicode encodes over 70,000 Han characters, most of which cover advanced or esoteric usage.
The listings below show the characters in version 36 of CLDR's exemplar character lists (exemplarCharacters).
Main | 一 丁 七 万 丈 三 上 下 丌 不 与 丑 专 且 世 丘 丙 业 东 丝 丢 两 严 丧 个 中 丰 串 临 丸 丹 为 主 丽 举 乃 久 么 义 之 乌 乍 乎 乏 乐 乔 乖 乘 乙 九 也 习 乡 书 买 乱 乾 了 予 争 事 二 于 亏 云 互 五 井 亚 些 亡 交 亥 亦 产 亨 享 京 亮 亲 人 亿 什 仁 仅 仇 今 介 仍 从 仔 他 付 仙 代 令 以 仪 们 仰 仲 件 价 任 份 仿 企 伊 伍 伏 伐 休 众 优 伙 会 伟 传 伤 伦 伯 估 伴 伸 似 伽 但 位 低 住 佐 佑 体 何 余 佛 作 你 佤 佩 佳 使 例 供 依 侠 侦 侧 侨 侬 侯 侵 便 促 俄 俊 俗 保 信 俩 修 俱 俾 倍 倒 候 倚 借 倦 值 倾 假 偌 偏 做 停 健 偶 偷 储 催 傲 傻 像 僧 儒 儿 允 元 兄 充 兆 先 光 克 免 兑 兔 党 入 全 八 公 六 兮 兰 共 关 兴 兵 其 具 典 兹 养 兼 兽 内 冈 册 再 冒 写 军 农 冠 冬 冰 冲 决 况 冷 准 凌 减 凝 几 凡 凤 凭 凯 凰 出 击 函 刀 分 切 刊 刑 划 列 刘 则 刚 创 初 判 利 别 到 制 刷 券 刺 刻 剂 前 剑 剧 剩 剪 副 割 力 劝 办 功 加 务 劣 动 助 努 劫 励 劲 劳 势 勇 勉 勋 勒 勤 勾 勿 包 匆 匈 化 北 匙 匹 区 医 十 千 升 午 半 华 协 卒 卓 单 卖 南 博 占 卡 卢 卫 卯 印 危 即 却 卷 厂 厄 厅 历 厉 压 厌 厍 厚 原 去 县 参 又 叉 及 友 双 反 发 叔 取 受 变 叙 口 古 句 另 只 叫 召 叭 可 台 史 右 叶 号 司 叹 吃 各 合 吉 吊 同 名 后 吐 向 吓 吗 君 吝 吟 否 吧 含 听 启 吵 吸 吹 吻 吾 呀 呆 呈 告 呐 员 呜 呢 呦 周 味 呵 呼 命 和 咖 咦 咧 咨 咪 咬 咯 咱 哀 品 哇 哈 哉 响 哎 哟 哥 哦 哩 哪 哭 哲 唉 唐 唤 唬 售 唯 唱 唷 商 啊 啡 啥 啦 啪 喀 喂 善 喇 喊 喏 喔 喜 喝 喵 喷 喻 嗒 嗨 嗯 嘉 嘛 嘴 嘻 嘿 器 四 回 因 团 园 困 围 固 国 图 圆 圈 土 圣 在 圭 地 圳 场 圾 址 均 坎 坐 坑 块 坚 坛 坜 坡 坤 坦 坪 垂 垃 型 垒 埃 埋 城 埔 域 培 基 堂 堆 堕 堡 堪 塑 塔 塞 填 境 增 墨 壁 壤 士 壬 壮 声 处 备 复 夏 夕 外 多 夜 够 夥 大 天 太 夫 央 失 头 夷 夸 夹 夺 奇 奈 奉 奋 奏 契 奔 奖 套 奥 女 奴 奶 她 好 如 妇 妈 妖 妙 妥 妨 妮 妹 妻 姆 姊 始 姐 姑 姓 委 姿 威 娃 娄 娘 娜 娟 娱 婆 婚 媒 嫁 嫌 嫩 子 孔 孕 字 存 孙 孜 孝 孟 季 孤 学 孩 宁 它 宇 守 安 宋 完 宏 宗 官 宙 定 宛 宜 宝 实 审 客 宣 室 宪 害 宴 家 容 宽 宾 宿 寂 寄 寅 密 寇 富 寒 寝 寞 察 寡 寨 寸 对 寻 导 寿 封 射 将 尊 小 少 尔 尖 尘 尚 尝 尤 就 尺 尼 尽 尾 局 屁 层 居 屋 屏 展 属 屠 山 岁 岂 岗 岘 岚 岛 岳 岸 峡 峰 崇 崩 崴 川 州 巡 工 左 巧 巨 巫 差 己 已 巳 巴 巷 币 市 布 帅 师 希 帐 帕 帖 帝 带 席 帮 常 帽 幅 幕 干 平 年 并 幸 幻 幼 幽 广 庆 床 序 库 应 底 店 庙 庚 府 庞 废 度 座 庭 康 庸 廉 廖 延 廷 建 开 异 弃 弄 弊 式 引 弗 弘 弟 张 弥 弦 弯 弱 弹 强 归 当 录 彝 形 彩 彬 彭 彰 影 彷 役 彻 彼 往 征 径 待 很 律 後 徐 徒 得 循 微 徵 德 心 必 忆 忌 忍 志 忘 忙 忠 忧 快 念 忽 怀 态 怎 怒 怕 怖 思 怡 急 性 怨 怪 总 恋 恐 恢 恨 恩 恭 息 恰 恶 恼 悄 悉 悔 悟 悠 患 您 悲 情 惑 惜 惠 惧 惨 惯 想 惹 愁 愈 愉 意 愚 感 愧 慈 慎 慕 慢 慧 慰 憾 懂 懒 戈 戊 戌 戏 成 我 戒 或 战 截 戴 户 房 所 扁 扇 手 才 扎 扑 打 托 扣 执 扩 扫 扬 扭 扮 扯 批 找 承 技 抄 把 抑 抓 投 抗 折 抢 护 报 披 抬 抱 抵 抹 抽 担 拆 拉 拍 拒 拔 拖 拘 招 拜 拟 拥 拦 拨 择 括 拳 拷 拼 拾 拿 持 指 按 挑 挖 挝 挡 挤 挥 挪 振 挺 捉 捐 捕 损 捡 换 据 捷 授 掉 掌 排 探 接 控 推 掩 措 掸 描 提 插 握 援 搜 搞 搬 搭 摄 摆 摊 摔 摘 摩 摸 撒 撞 播 操 擎 擦 支 收 改 
攻 放 政 故 效 敌 敏 救 教 敝 敢 散 敦 敬 数 敲 整 文 斋 斐 斗 料 斜 斥 断 斯 新 方 於 施 旁 旅 旋 族 旗 无 既 日 旦 旧 旨 早 旭 时 旺 昂 昆 昌 明 昏 易 星 映 春 昨 昭 是 显 晃 晋 晒 晓 晚 晨 普 景 晴 晶 智 暂 暑 暖 暗 暮 暴 曰 曲 更 曹 曼 曾 替 最 月 有 朋 服 朗 望 朝 期 木 未 末 本 札 术 朱 朵 机 杀 杂 权 杉 李 材 村 杜 束 条 来 杨 杯 杰 松 板 极 构 析 林 果 枝 枢 枪 枫 架 柏 某 染 柔 查 柬 柯 柳 柴 标 栋 栏 树 校 样 核 根 格 桃 框 案 桌 桑 档 桥 梁 梅 梦 梯 械 梵 检 棉 棋 棒 棚 森 椅 植 椰 楚 楼 概 榜 模 樱 檀 欠 次 欢 欣 欧 欲 欺 款 歉 歌 止 正 此 步 武 歪 死 殊 残 段 毅 母 每 毒 比 毕 毛 毫 氏 民 气 氛 水 永 求 汇 汉 汗 汝 江 池 污 汤 汪 汶 汽 沃 沈 沉 沙 沟 没 沧 河 油 治 沿 泉 泊 法 泛 泡 波 泣 泥 注 泰 泳 泽 洋 洗 洛 洞 津 洪 洲 活 洽 派 流 浅 测 济 浏 浑 浓 浙 浦 浩 浪 浮 浴 海 涅 消 涉 涛 涨 涯 液 涵 淋 淑 淘 淡 深 混 添 清 渐 渡 渣 温 港 渴 游 湖 湾 源 溜 溪 滋 滑 满 滥 滨 滴 漂 漏 演 漠 漫 潘 潜 潮 澎 澳 激 灌 火 灭 灯 灰 灵 灿 炉 炎 炮 炸 点 烂 烈 烤 烦 烧 热 焦 然 煌 煞 照 煮 熊 熟 燃 燕 爆 爪 爬 爱 爵 父 爷 爸 爽 片 版 牌 牙 牛 牡 牢 牧 物 牲 牵 特 牺 犯 状 犹 狂 狐 狗 狠 独 狮 狱 狼 猛 猜 猪 献 猴 玄 率 玉 王 玛 玩 玫 环 现 玲 玻 珀 珊 珍 珠 班 球 理 琊 琪 琳 琴 琼 瑙 瑜 瑞 瑟 瑰 瑶 璃 瓜 瓦 瓶 甘 甚 甜 生 用 田 由 甲 申 电 男 甸 画 畅 界 留 略 番 疆 疏 疑 疗 疯 疲 疼 疾 病 痕 痛 痴 癸 登 白 百 的 皆 皇 皮 盈 益 监 盒 盖 盘 盛 盟 目 直 相 盼 盾 省 眉 看 真 眠 眼 着 睛 睡 督 瞧 矛 矣 知 短 石 矶 码 砂 砍 研 破 础 硕 硬 确 碍 碎 碗 碟 碧 碰 磁 磅 磨 示 礼 社 祖 祚 祝 神 祥 票 祯 祸 禁 禅 福 离 秀 私 秋 种 科 秒 秘 租 秤 秦 秩 积 称 移 稀 程 稍 税 稣 稳 稿 穆 究 穷 穹 空 穿 突 窗 窝 立 站 竞 竟 章 童 端 竹 笑 笔 笛 符 笨 第 等 筋 筑 答 策 筹 签 简 算 管 箭 箱 篇 篮 簿 籍 米 类 粉 粒 粗 粤 粹 精 糊 糕 糖 糟 系 素 索 紧 紫 累 繁 红 约 级 纪 纯 纲 纳 纵 纷 纸 纽 线 练 组 细 织 终 绍 经 结 绕 绘 给 络 绝 统 继 绩 绪 续 维 绵 综 绿 缅 缓 编 缘 缠 缩 缴 缶 缸 缺 罐 网 罕 罗 罚 罢 罪 置 署 羊 美 羞 群 羯 羽 翁 翅 翔 翘 翠 翰 翻 翼 耀 老 考 者 而 耍 耐 耗 耳 耶 聊 职 联 聘 聚 聪 肉 肖 肚 股 肤 肥 肩 肯 育 胁 胆 背 胎 胖 胜 胞 胡 胶 胸 能 脆 脑 脱 脸 腊 腐 腓 腰 腹 腾 腿 臂 臣 自 臭 至 致 舌 舍 舒 舞 舟 航 般 舰 船 良 色 艺 艾 节 芒 芝 芦 芬 芭 花 芳 苍 苏 苗 若 苦 英 茂 范 茨 茫 茶 草 荐 荒 荣 药 荷 莉 莎 莪 莫 莱 莲 获 菜 菩 菲 萄 萍 萤 营 萧 萨 落 著 葛 葡 蒂 蒋 蒙 蓉 蓝 蓬 蔑 蔡 薄 薪 藉 藏 藤 虎 虑 虫 虹 虽 虾 蚁 蛇 蛋 蛙 蛮 蜂 蜜 蝶 融 蟹 蠢 血 行 街 衡 衣 补 表 袋 被 袭 裁 裂 装 裕 裤 西 要 覆 见 观 规 视 览 觉 角 解 言 誉 誓 警 计 订 认 讨 让 训 议 讯 记 讲 讷 许 论 设 访 证 评 识 诉 词 译 试 诗 诚 话 诞 询 该 详 语 误 说 请 诸 诺 读 课 谁 调 谅 谈 谊 谋 谓 谜 谢 谨 谱 谷 豆 象 豪 貌 贝 贞 负 贡 财 责 贤 败 货 质 贩 贪 购 贯 贱 贴 贵 贸 费 贺 贼 贾 资 赋 赌 赏 赐 赔 赖 赚 赛 赞 赠 赢 赤 赫 走 赵 起 趁 超 越 趋 趣 足 跃 跌 跑 距 跟 路 跳 踏 踢 踩 身 躲 车 轨 轩 转 轮 软 轰 轻 载 较 辅 辆 辈 辉 辑 输 辛 辞 辨 辩 辰 辱 边 达 迁 迅 过 迈 迎 运 近 返 还 这 进 远 违 连 迟 迦 迪 迫 述 迷 追 退 送 适 逃 逆 选 逊 透 逐 递 途 通 逛 逝 速 造 逢 
逸 逻 逼 遇 遍 道 遗 遭 遮 遵 避 邀 邓 那 邦 邪 邮 邱 邻 郎 郑 部 郭 都 鄂 酉 酋 配 酒 酷 酸 醉 醒 采 释 里 重 野 量 金 针 钓 钟 钢 钦 钱 钻 铁 铃 铜 铢 铭 银 铺 链 销 锁 锅 锋 错 锡 锦 键 锺 镇 镜 镭 长 门 闪 闭 问 闰 闲 间 闷 闹 闻 阁 阅 阐 阔 队 阮 防 阳 阴 阵 阶 阻 阿 陀 附 际 陆 陈 降 限 院 除 险 陪 陵 陶 陷 隆 随 隐 隔 障 难 雄 雅 集 雉 雨 雪 雯 雳 零 雷 雾 需 震 霍 霖 露 霸 霹 青 靖 静 非 靠 面 革 靼 鞋 鞑 韦 韩 音 页 顶 项 顺 须 顽 顾 顿 预 领 颇 频 颗 题 额 风 飘 飙 飞 食 餐 饭 饮 饰 饱 饼 馆 首 香 馨 马 驱 驶 驻 驾 验 骑 骗 骚 骤 骨 高 鬼 魂 魅 魔 鱼 鲁 鲜 鸟 鸡 鸣 鸭 鸿 鹅 鹤 鹰 鹿 麦 麻 黄 黎 黑 默 鼓 鼠 鼻 齐 齿 龄 龙 龟 | 2,210 |
---|---|---|
Auxiliary | 仂 侣 傈 傣 僳 卑 卞 厘 吕 坝 堤 奎 屿 巽 撤 楔 楠 滕 瑚 甫 盲 碑 禄 粟 脚 艮 谬 钯 铂 锑 镑 魁 乒 乓 仓 伞 冥 凉 刨 匕 厦 厨 呣 唇 啤 啮 喱 嗅 噘 噢 墟 妆 婴 媚 宅 寺 尬 尴 屑 巾 弓 彗 惊 戟 扔 扰 扳 抛 挂 捂 摇 撅 杆 杖 柜 柱 栗 栽 桶 棍 棕 棺 榈 槟 橙 洒 浆 涌 淇 滚 滩 灾 烛 烟 焰 煎 犬 猫 瓢 皱 盆 盔 眨 眯 瞌 矿 祈 祭 祷 稻 竿 笼 筒 篷 粮 纠 纬 缆 缎 耸 舔 舵 艇 芽 苜 苞 菇 菱 葫 葵 蒸 蓿 蔽 薯 蘑 蚂 蛛 蜗 蜘 蜡 蝎 蝴 螃 裹 谍 豚 账 跤 踪 躬 轴 辐 迹 郁 鄙 酢 钉 钥 钮 铅 铛 锄 锚 锤 闺 阱 隧 雕 霾 靴 靶 鞠 颠 馏 驼 骆 髦 鲤 鲸 鳄 鸽 | 181 |
Main | 一 丁 七 丈 三 上 下 丌 不 丑 且 世 丘 丙 丟 並 中 串 丸 丹 主 乃 久 么 之 乎 乏 乖 乘 乙 九 也 乾 亂 了 予 事 二 于 云 互 五 井 些 亞 亡 交 亥 亦 亨 享 京 亮 人 什 仁 仇 今 介 仍 仔 他 付 仙 代 令 以 仰 仲 件 任 份 企 伊 伍 伐 休 伙 伯 估 伴 伸 似 伽 但 佈 佉 位 低 住 佔 何 余 佛 作 你 佩 佳 使 來 例 供 依 侯 侵 便 係 促 俄 俊 俗 保 俠 信 修 俱 俾 個 倍 們 倒 候 倚 借 倫 值 假 偉 偏 做 停 健 側 偵 偶 偷 傑 備 傢 傣 傲 傳 傷 傻 傾 僅 像 僑 僧 價 儀 億 儒 儘 優 允 元 兄 充 兇 先 光 克 免 兒 兔 入 內 全 兩 八 公 六 兮 共 兵 其 具 典 兼 冊 再 冒 冠 冬 冰 冷 准 凌 凝 凡 凰 凱 出 函 刀 分 切 刊 列 初 判 別 利 刪 到 制 刷 刺 刻 則 剌 前 剛 剩 剪 副 割 創 劃 劇 劉 劍 力 功 加 助 努 劫 勁 勇 勉 勒 動 務 勝 勞 勢 勤 勵 勸 勿 包 匈 化 北 匹 區 十 千 升 午 半 卒 卓 協 南 博 卜 卡 卯 印 危 即 卷 卻 厄 厘 厚 原 厭 厲 去 參 又 及 友 反 叔 取 受 口 古 句 另 只 叫 召 叭 可 台 史 右 司 吃 各 合 吉 吊 同 名 后 吐 向 吒 君 吝 吞 吟 吠 否 吧 含 吳 吵 吸 吹 吾 呀 呂 呆 告 呢 周 味 呵 呼 命 和 咖 咦 咧 咪 咬 咱 哀 品 哇 哈 哉 哎 員 哥 哦 哩 哪 哭 哲 唉 唐 唔 唬 售 唯 唱 唷 唸 商 啊 問 啟 啡 啥 啦 啪 喀 喂 善 喇 喊 喔 喜 喝 喬 單 喵 嗎 嗚 嗨 嗯 嘆 嘉 嘗 嘛 嘴 嘻 嘿 器 噴 嚇 嚴 囉 四 回 因 困 固 圈 國 圍 園 圓 圖 團 圜 土 在 圭 地 圾 址 均 坎 坐 坡 坤 坦 坪 垂 垃 型 埃 城 埔 域 執 培 基 堂 堅 堆 堡 堪 報 場 塊 塔 塗 塞 填 塵 境 增 墨 墮 壁 壇 壓 壘 壞 壢 士 壬 壯 壽 夏 夕 外 多 夜 夠 夢 夥 大 天 太 夫 央 失 夷 夸 夾 奇 奈 奉 奎 奏 契 奔 套 奧 奪 奮 女 奴 奶 她 好 如 妙 妝 妥 妨 妮 妳 妹 妻 姆 姊 始 姐 姑 姓 委 姿 威 娃 娘 娛 婁 婆 婚 婦 媒 媽 嫌 嫩 子 孔 字 存 孝 孟 季 孤 孩 孫 學 它 宅 宇 守 安 宋 完 宏 宗 官 宙 定 宛 宜 客 宣 室 宮 害 家 容 宿 寂 寄 寅 密 富 寒 寞 察 寢 實 寧 寨 審 寫 寬 寮 寵 寶 封 射 將 專 尊 尋 對 導 小 少 尖 尚 尤 就 尺 尼 尾 局 屁 居 屆 屋 屏 展 屠 層 屬 山 岡 岩 岸 峰 島 峽 崇 崙 崴 嵐 嶺 川 州 巡 工 左 巧 巨 巫 差 己 已 巳 巴 巷 市 布 希 帕 帖 帛 帝 帥 師 席 帳 帶 常 帽 幅 幕 幣 幫 干 平 年 幸 幹 幻 幼 幽 幾 庇 床 序 底 店 庚 府 度 座 庫 庭 康 庸 廉 廖 廠 廢 廣 廳 延 廷 建 弄 式 引 弗 弘 弟 弦 弱 張 強 彈 彊 彌 彎 彝 彞 形 彥 彩 彬 彭 彰 影 役 彼 往 征 待 很 律 後 徐 徑 徒 得 從 復 微 徵 德 徹 心 必 忌 忍 志 忘 忙 忠 快 念 忽 怎 怒 怕 怖 思 怡 急 性 怨 怪 恆 恐 恢 恥 恨 恩 恭 息 恰 悅 悉 悔 悟 悠 您 悲 悶 情 惑 惜 惠 惡 惱 想 惹 愁 愈 愉 意 愚 愛 感 慈 態 慕 慘 慢 慣 慧 慮 慰 慶 慾 憂 憐 憑 憲 憶 憾 懂 應 懶 懷 懼 戀 戈 戊 戌 成 我 戒 或 截 戰 戲 戴 戶 房 所 扁 扇 手 才 扎 打 托 扣 扥 扭 扯 批 找 承 技 抄 把 抓 投 抗 折 披 抬 抱 抵 抹 抽 拆 拉 拋 拍 拏 拒 拔 拖 招 拜 括 拳 拼 拾 拿 持 指 按 挑 挖 挪 振 挺 捐 捕 捨 捲 捷 掃 授 掉 掌 排 掛 採 探 接 控 推 措 描 提 插 揚 換 握 揮 援 損 搖 搜 搞 搬 搭 搶 摘 摩 摸 撐 撒 撞 撣 撥 播 撾 撿 擁 擇 擊 擋 操 擎 擔 據 擠 擦 擬 擴 擺 擾 攝 支 收 改 攻 放 政 故 效 敍 敏 救 敗 敘 教 敝 敢 散 敦 敬 整 敵 數 文 斐 斗 料 斯 新 斷 方 於 施 旁 旅 旋 族 旗 既 日 旦 早 旭 旺 昂 昆 昇 昌 明 昏 易 星 映 春 昨 昭 是 時 晉 晒 晚 晨 普 景 晴 晶 智 暑 暖 暗 暫 暴 曆 曉 曰 曲 更 書 曼 曾 替 最 會 月 有 朋 服 朗 
望 朝 期 木 未 末 本 札 朱 朵 杉 李 材 村 杜 束 杯 杰 東 松 板 析 林 果 枝 架 柏 某 染 柔 查 柬 柯 柳 柴 校 核 根 格 桃 案 桌 桑 梁 梅 條 梨 梯 械 梵 棄 棉 棋 棒 棚 森 椅 植 椰 楊 楓 楚 業 極 概 榜 榮 構 槍 樂 樓 標 樞 模 樣 樹 橋 機 橫 檀 檔 檢 欄 權 次 欣 欲 欺 欽 款 歉 歌 歐 歡 止 正 此 步 武 歲 歷 歸 死 殊 殘 段 殺 殼 毀 毅 母 每 毒 比 毛 毫 氏 民 氣 水 永 求 汗 汝 江 池 污 汪 汶 決 汽 沃 沈 沉 沒 沖 沙 河 油 治 沿 況 泉 泊 法 泡 波 泥 注 泰 泳 洋 洗 洛 洞 洩 洪 洲 活 洽 派 流 浦 浩 浪 浮 海 涇 消 涉 涯 液 涵 涼 淑 淚 淡 淨 深 混 淺 清 減 渡 測 港 游 湖 湯 源 準 溝 溪 溫 滄 滅 滋 滑 滴 滾 滿 漂 漏 演 漠 漢 漫 漲 漸 潔 潘 潛 潮 澤 澳 激 濃 濟 濤 濫 濱 瀏 灌 灣 火 灰 災 炎 炮 炸 為 烈 烏 烤 無 焦 然 煙 煞 照 煩 熊 熟 熱 燃 燈 燒 營 爆 爐 爛 爪 爬 爭 爵 父 爸 爺 爽 爾 牆 片 版 牌 牙 牛 牠 牧 物 牲 特 牽 犧 犯 狀 狂 狐 狗 狠 狼 猛 猜 猴 猶 獄 獅 獎 獨 獲 獸 獻 玄 率 玉 王 玩 玫 玲 玻 珊 珍 珠 珥 班 現 球 理 琉 琪 琴 瑙 瑜 瑞 瑟 瑤 瑪 瑰 環 瓜 瓦 瓶 甘 甚 甜 生 產 用 田 由 甲 申 男 甸 界 留 畢 略 番 畫 異 當 疆 疏 疑 疼 病 痕 痛 痴 瘋 療 癡 癸 登 發 白 百 的 皆 皇 皮 盃 益 盛 盜 盟 盡 監 盤 盧 目 盲 直 相 盼 盾 省 眉 看 真 眠 眼 眾 睛 睡 督 瞧 瞭 矛 矣 知 短 石 砂 砍 研 砲 破 硬 碎 碗 碟 碧 碩 碰 確 碼 磁 磨 磯 礎 礙 示 社 祕 祖 祚 祛 祝 神 祥 票 祿 禁 禍 禎 福 禪 禮 秀 私 秋 科 秒 秘 租 秤 秦 移 稅 程 稍 種 稱 稿 穆 穌 積 穩 究 穹 空 穿 突 窗 窩 窮 窶 立 站 竟 章 童 端 競 竹 笑 笛 符 笨 第 筆 等 筋 答 策 简 算 管 箭 箱 節 範 篇 築 簡 簫 簽 簿 籃 籌 籍 籤 米 粉 粗 粵 精 糊 糕 糟 系 糾 紀 約 紅 納 紐 純 紙 級 紛 素 索 紫 累 細 紹 終 組 結 絕 絡 給 統 絲 經 綜 綠 維 綱 網 緊 緒 線 緣 編 緩 緬 緯 練 縛 縣 縮 縱 總 績 繁 繆 織 繞 繪 繳 繼 續 缸 缺 罕 罪 置 罰 署 罵 罷 羅 羊 美 羞 群 義 羽 翁 習 翔 翰 翹 翻 翼 耀 老 考 者 而 耍 耐 耗 耳 耶 聊 聖 聚 聞 聯 聰 聲 職 聽 肉 肚 股 肥 肩 肯 育 背 胎 胖 胞 胡 胸 能 脆 脫 腓 腔 腦 腰 腳 腿 膽 臉 臘 臣 臥 臨 自 臭 至 致 臺 與 興 舉 舊 舌 舍 舒 舞 舟 航 般 船 艦 良 色 艾 芝 芬 花 芳 若 苦 英 茅 茫 茲 茶 草 荒 荷 荼 莉 莊 莎 莫 菜 菩 華 菲 萄 萊 萬 落 葉 著 葛 葡 蒂 蒙 蒲 蒼 蓋 蓮 蔕 蔡 蔣 蕭 薄 薦 薩 薪 藉 藍 藏 藝 藤 藥 蘆 蘇 蘭 虎 處 虛 號 虧 蛇 蛋 蛙 蜂 蜜 蝶 融 螢 蟲 蟹 蠍 蠻 血 行 術 街 衛 衝 衡 衣 表 袋 被 裁 裂 裕 補 裝 裡 製 複 褲 西 要 覆 見 規 視 親 覺 覽 觀 角 解 觸 言 訂 計 訊 討 訓 託 記 訥 訪 設 許 訴 註 証 評 詞 詢 試 詩 話 該 詳 誇 誌 認 誓 誕 語 誠 誤 說 誰 課 誼 調 談 請 諒 論 諸 諺 諾 謀 謂 講 謝 證 識 譜 警 譯 議 護 譽 讀 變 讓 讚 谷 豆 豈 豐 象 豪 豬 貌 貓 貝 貞 負 財 貢 貨 貪 貫 責 貴 買 費 貼 賀 資 賈 賓 賜 賞 賢 賣 賤 賦 質 賭 賴 賺 購 賽 贈 贊 贏 赤 赫 走 起 超 越 趕 趙 趣 趨 足 跌 跎 跑 距 跟 跡 路 跳 踏 踢 蹟 蹤 躍 身 躲 車 軌 軍 軒 軟 較 載 輔 輕 輛 輝 輩 輪 輯 輸 轉 轟 辛 辦 辨 辭 辯 辰 辱 農 迅 迎 近 返 迦 迪 迫 述 迴 迷 追 退 送 逃 逆 透 逐 途 這 通 逛 逝 速 造 逢 連 週 進 逸 逼 遇 遊 運 遍 過 道 達 違 遙 遜 遠 適 遭 遮 遲 遷 選 遺 避 邀 邁 還 邊 邏 那 邦 邪 邱 郎 部 郭 郵 都 鄂 鄉 鄭 鄰 酉 配 酒 酷 酸 醉 醒 醜 醫 采 釋 里 重 野 量 金 針 釣 鈴 鉢 銀 銅 銖 銘 銳 銷 鋒 鋼 錄 錢 錦 錫 
錯 鍋 鍵 鍾 鎊 鎖 鎮 鏡 鐘 鐵 鑑 長 門 閃 閉 開 閏 閒 間 閣 閱 闆 闊 闍 闐 關 闡 防 阻 阿 陀 附 降 限 院 陣 除 陪 陰 陳 陵 陶 陷 陸 陽 隆 隊 階 隔 際 障 隨 險 隱 隻 雄 雅 集 雉 雖 雙 雜 雞 離 難 雨 雪 雲 零 雷 電 需 震 霍 霧 露 霸 霹 靂 靈 青 靖 靜 非 靠 面 革 靼 鞋 韃 韋 韓 音 韻 響 頁 頂 項 順 須 預 頑 頓 頗 領 頞 頭 頻 顆 題 額 顏 願 類 顧 顯 風 飄 飛 食 飯 飲 飽 飾 餅 養 餐 餘 館 首 香 馬 駐 駕 駛 騎 騙 騷 驅 驗 驚 骨 體 高 髮 鬆 鬥 鬧 鬱 鬼 魁 魂 魅 魔 魚 魯 鮮 鳥 鳳 鳴 鴻 鵝 鷹 鹿 麗 麥 麵 麻 麼 黃 黎 黑 默 點 黨 鼓 鼠 鼻 齊 齋 齒 齡 龍 龜 | 2,180 |
---|---|---|
Auxiliary | 乍 仂 伏 佐 侶 僳 兆 兌 兹 别 券 勳 卑 卞 占 叶 堤 墎 壤 奥 孜 峇 嶼 巽 栗 楔 涅 渾 澎 燦 狄 琳 瑚 甫 碑 礁 芒 苗 茨 蓬 蚩 蜀 裘 謬 酋 隴 乳 划 匕 匙 匣 叉 吻 嘟 噘 妖 巾 帆 廁 廚 弋 弓 懸 戟 扳 捂 摔 暈 框 桶 桿 櫃 煎 燭 牡 皺 盒 眨 眩 筒 簍 糰 紋 紗 纏 纜 羯 聳 肖 艇 虹 蛛 蜘 蝴 蝸 蠟 裙 豚 躬 釘 鈔 鈕 鉛 鎚 鎬 鐺 鑰 鑽 霄 鞠 骰 骷 髏 鯉 鳶 | 115 |
In its 'main' category, CLDR lists 2,210 characters for the Simplified Chinese orthography, and 2,180 for Traditional Chinese. Combined, this includes 3,026 unique characters, and an overlap of 1,064 characters.
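The relationship between these counts can be checked with simple set arithmetic. The following is a minimal sketch only: `repertoire_stats` is a hypothetical helper name, and the two sample strings are just the first eight characters of each 'Main' list above, not the full CLDR data.

```python
def repertoire_stats(simplified: str, traditional: str) -> dict:
    """Compare two space-separated exemplar character lists."""
    sc = set(simplified.split())
    tc = set(traditional.split())
    return {
        "simplified": len(sc),
        "traditional": len(tc),
        "unique": len(sc | tc),   # characters found in either list
        "overlap": len(sc & tc),  # characters found in both lists
    }


# Illustrative samples only (first eight characters of each list).
sample_sc = "一 丁 七 万 丈 三 上 下"
sample_tc = "一 丁 七 丈 三 上 下 丌"
stats = repertoire_stats(sample_sc, sample_tc)
print(stats)
```

For any two sets, the number of unique characters equals the sum of the two counts minus the overlap. Applied to the totals quoted above, 2,210 + 2,180 - 1,064 gives 3,326 rather than 3,026, so one of those figures is worth re-checking against the CLDR data.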
A radical is an ideograph or, more commonly, a component of an ideograph that is used for indexing dictionaries and word lists, and as the basis for creating new ideographs. The 214 radicals of the KangXi dictionary are universally recognised.
The visual appearance of radicals may vary significantly from the original character on which they are based.
The shape of the radical may be influenced by the arrangement with other elements of a character, or by standardised simplifications. In the figure above, the shape of the top right radical (word) is a product of the simplification process in China.
Unicode dedicates two blocks to radicals. The KangXi radicals block contains the base forms of the 214 radicals.
The CJK Radicals Supplement contains variant shapes of these radicals when they are used as parts of other characters or in simplified form. These have not been unified because they often appear independently in dictionary indices.
Characters in those blocks should never be used as ideographs.
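A text-checking tool could flag characters from these two blocks where an ordinary ideograph was probably intended. The sketch below is one possible approach, not an established API: the function names are hypothetical, and the NFKC result is offered only as a hint, since KangXi radicals have compatibility mappings to unified ideographs but most characters in the CJK Radicals Supplement do not.

```python
import unicodedata

# Block ranges as defined by Unicode.
KANGXI_RADICALS = range(0x2F00, 0x2FE0)          # U+2F00..U+2FDF
CJK_RADICALS_SUPPLEMENT = range(0x2E80, 0x2F00)  # U+2E80..U+2EFF


def is_radical_block_char(ch: str) -> bool:
    """True if ch belongs to one of the two radical blocks."""
    cp = ord(ch)
    return cp in KANGXI_RADICALS or cp in CJK_RADICALS_SUPPLEMENT


def find_radical_block_chars(text: str) -> list:
    """Return (index, character, NFKC form) for each radical-block character."""
    return [
        (i, ch, unicodedata.normalize("NFKC", ch))
        for i, ch in enumerate(text)
        if is_radical_block_char(ch)
    ]


# U+2F08 KANGXI RADICAL MAN looks like U+4EBA but is a different character.
hits = find_radical_block_chars("\u2f08\u4eba")
print(hits)
```

Where the third element of a result differs from the second, it is a candidate replacement ideograph; where it is unchanged, no compatibility mapping exists and the character needs manual review.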
No characters with the general property symbol are included in the CLDR sets.
Text can be written horizontally, left to right, or vertically with lines progressing from right to left. Vertically set text is more common in Traditional Chinese than in Simplified Chinese areas.
Older horizontally set texts in Chinese also ran right to left.
Note, however, that horizontal and vertical text are not usually identical. Apart from the question of what gets rotated and what does not, the two writing modes may show different preferences for emphasis marks, brackets, numbers, and so forth.
This section brings together information about the following topics: writing styles; cursive text; context-based shaping; context-based positioning; baselines, line height, etc.; font styles; case & other character transforms.
You can experiment with examples using the Chinese character app.
Han characters have no contextual variation or placement of glyphs. Nor is text cursive (in the sense of joined-up).
The orthography has no case distinction, and no special transforms are needed to convert between characters.
On the other hand, punctuation and embedded text in other scripts is affected by the direction of lines.
By default, all Han characters and punctuation are inside a character frame that is square and the same size for all characters. The box containing the actual symbol is called the letter face, and there should be some space left between the letter face and the character frame. There may be variations, particularly for punctuation, etc., in the size of the letter face.
Because of the regularity of the character frame size, it can be used to measure the size of the text area or other parts of a page (horizontally or vertically).
In principle, Han characters are set solid, i.e. with no space between the character frames. However, text alignment and justification can make adjustments to the placement of characters in the direction of the line flow. See justify and letterspace.
Dashes, ellipses, and brackets are rotated 90° to the right when they appear in vertical text. Here is a list of characters to which this applies (CLREQ).
tbd
Since there are no combining marks or decompositions, graphemes correspond to individual characters.
Unicode grapheme clusters can be applied to Chinese without problems. There are no special issues related to operations that use grapheme clusters as their basic unit of text.
Several of the following sections include icons after punctuation marks that indicate whether the punctuation is used in horizontal (H) or vertical (V) writing, and whether the character is rotated 90° in vertical text, or translated to a different location in the character frame.
Chinese rarely uses spaces. In the sample text there are gaps around punctuation, but these are produced by a lack of 'ink' in parts of the square character glyphs, not by space characters.
Chinese uses the following separators at the sentence level and below (CLREQ).
level | punctuation |
---|---|
phrase | , [U+FF0C FULLWIDTH COMMA] |
 | 、 [U+3001 IDEOGRAPHIC COMMA] * |
 | : [U+FF1A FULLWIDTH COLON] |
 | ; [U+FF1B FULLWIDTH SEMICOLON] |
sentence | 。 [U+3002 IDEOGRAPHIC FULL STOP] |
 | . [U+FF0E FULLWIDTH FULL STOP] * |
exclamation | ! [U+FF01 FULLWIDTH EXCLAMATION MARK] |
question | ? [U+FF1F FULLWIDTH QUESTION MARK] |
、 [U+3001 IDEOGRAPHIC COMMA] is typically used as a list separator.
. [U+FF0E FULLWIDTH FULL STOP] is used as sentence-final punctuation in, for example, college textbooks, science and technology literature, and grammar books for Western languages. Most of these are set horizontally and make heavy use of Western-language text.
These punctuation marks are not rotated in vertical text. However, in Simplified Chinese their position relative to the character frame differs between horizontal and vertical text, whereas in Traditional Chinese they are all centred.
These different positions in Simplified Chinese require dedicated glyphs in the font, and cannot be achieved by simply rotating the glyph.
Chinese also uses the following doubled exclamation/question marks. They remain upright in vertical text.
‼ [U+203C DOUBLE EXCLAMATION MARK] |
⁇ [U+2047 DOUBLE QUESTION MARK] |
⁈ [U+2048 QUESTION EXCLAMATION MARK] |
⁉ [U+2049 EXCLAMATION QUESTION MARK] |
Other punctuation used to separate phrases or items includes:
H | V | |
---|---|---|
⸺ [U+2E3A TWO-EM DASH] | ||
—— [U+2014 EM DASH + U+2014 EM DASH] |
If EM DASH characters are used, they are used in pairs.
Chinese commonly uses fullwidth parentheses to insert parenthetical information into text.
H | V | ||
---|---|---|---|
( [U+FF08 FULLWIDTH LEFT PARENTHESIS] | ) [U+FF09 FULLWIDTH RIGHT PARENTHESIS] |
Dashes can also be used to offset information, in which case Chinese typically uses those listed in the previous section, doubled up.
Although there are a number of other bracket characters (listed just below), they are rarely used in Chinese publications (CLREQ).
Brackets are also used to indicate titles and proper names (see otherinline).
Mainland China. Mainland China, where vertical text is not common, uses different quote marks for horizontal and vertical writing. The default quote marks are:
opening | closing | H | V |
---|---|---|---|
“ [U+201C LEFT DOUBLE QUOTATION MARK] | ” [U+201D RIGHT DOUBLE QUOTATION MARK] | yes | - |
『 [U+300E LEFT WHITE CORNER BRACKET] | 』 [U+300F RIGHT WHITE CORNER BRACKET] | - | yes |
When an additional quote is embedded within the first, the quote marks are:
opening | closing | H | V |
---|---|---|---|
‘ [U+2018 LEFT SINGLE QUOTATION MARK] | ’ [U+2019 RIGHT SINGLE QUOTATION MARK] | yes | - |
「 [U+300C LEFT CORNER BRACKET] | 」 [U+300D RIGHT CORNER BRACKET] | - | yes |
Taiwan. Taiwan tends to use a single set of quote marks, but the other way around compared to Mainland China. The default quote marks are:
opening | closing | H | V |
---|---|---|---|
「 [U+300C LEFT CORNER BRACKET] | 」 [U+300D RIGHT CORNER BRACKET] | yes | yes |
When an additional quote is embedded within the first, the quote marks are:
opening | closing | H | V |
---|---|---|---|
『 [U+300E LEFT WHITE CORNER BRACKET] | 』 [U+300F RIGHT WHITE CORNER BRACKET] | yes | yes |
Occasionally, Traditional Chinese text may use double brackets for the default, and single for the embedded. It may also use quotation marks, like Mainland China, but not commonly, and much less so for vertical text.
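The default conventions described in this section can be summarised as a small lookup. This is a sketch under stated assumptions rather than a definitive rule set: the `QUOTES` table and `quote` function are hypothetical names, only the typical usage is modelled (not the occasional Traditional Chinese variations), and alternating the two pairs for nesting deeper than two levels is an assumption not stated above.

```python
# (region, direction) -> [(default pair), (embedded pair)]
QUOTES = {
    ("mainland", "horizontal"): [("\u201c", "\u201d"), ("\u2018", "\u2019")],
    ("mainland", "vertical"): [("\u300e", "\u300f"), ("\u300c", "\u300d")],
    ("taiwan", "horizontal"): [("\u300c", "\u300d"), ("\u300e", "\u300f")],
    ("taiwan", "vertical"): [("\u300c", "\u300d"), ("\u300e", "\u300f")],
}


def quote(text: str, region: str = "mainland",
          direction: str = "horizontal", depth: int = 0) -> str:
    """Wrap text in the quote marks typical for the region and direction.

    depth 0 is the default (outer) quotation, depth 1 an embedded one.
    Deeper levels alternate between the two pairs (an assumption).
    """
    pairs = QUOTES[(region, direction)]
    opening, closing = pairs[depth % len(pairs)]
    return opening + text + closing


inner = quote("好", region="taiwan", direction="vertical", depth=1)
outer = quote("他說" + inner, region="taiwan", direction="vertical")
print(outer)
```

Keeping the pairs in a table makes it straightforward to add an override for publications that follow a different house style.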
Titles of works including books, articles, songs, movies, files, calligraphy and paintings are cited in Chinese in one of two ways (CLREQ):
H | V | ||
---|---|---|---|
《 [U+300A LEFT DOUBLE ANGLE BRACKET] | 》 [U+300B RIGHT DOUBLE ANGLE BRACKET] | ||
〈 [U+3008 LEFT ANGLE BRACKET] | 〉 [U+3009 RIGHT ANGLE BRACKET] |
The double brackets tend to be used for book and chapter titles, and the single brackets for articles (CLREQ).
Proper names can also be highlighted using line decoration, in this case a straight underline. Note that the underline is not used for emphasis here. This is mostly used in textbooks and older publications (CLREQ).
To express emphasis Chinese uses dots or circles alongside characters, one dot per base character.
In horizontal text, emphasis marks appear underneath the base text. In vertical text, they run down the right-hand side. Regardless of the orientation of the line, the dot is centred alongside or below the base character.
Where both lines and emphasis dots decorate the same run of text, the lines and emphasis dots will usually appear on opposite sides of a vertical line of text, but will normally both appear below a line of horizontally set text. In horizontal text the line decoration is normally closer to the text than the emphasis dots.
In the same way as for other line decorations, embedded text in other languages that runs sideways up or down the line would have dots displayed on the same side as when decorating Chinese.
Straight or wavy lines alongside the text are not used for emphasis (unlike in Latin script text), but are instead used in Chinese to indicate proper nouns such as a person's name, a book title, or the name of a place (CLREQ). See inline_titles and inline_propernames.
An ellipsis in Chinese consists of six dots and takes up the space of two Han characters. This is normally achieved using two … [U+2026 HORIZONTAL ELLIPSIS] characters, side by side (CLREQ).
H | V | |
---|---|---|
…… [U+2026 HORIZONTAL ELLIPSIS] x2 |
Inter-line annotations are used to indicate pronunciation (usually only for children or foreigners), and to provide commentaries on or bilingual equivalents of the main text.
With the exception of zhuyin in horizontal text, all annotations appear within the standard inter-line space for the page, and don't create extra space if they appear on a single line. That said, the inter-line space is usually set at an appropriate size to accommodate annotations.
Unlike in Japanese, it is rare to find annotations applied just to specific words; generally the whole text is annotated. If annotations are only needed for individual characters or words, they are often presented in parentheses after the text they annotate.
These annotations do not appear alongside punctuation.
Pinyin is the most common way of representing pronunciation, although occasionally other transcriptions are used.
The annotation usually appears above the main line of text, except when zhuyin and pinyin annotations are both present, in which case it commonly appears below the line.
Latin annotations for pronunciation are usually only used with horizontal text.
For native children the annotations are usually applied character by character, whereas for foreign learners they are often applied word by word. The annotation is normally centred above the base text, and contains no spaces.
In order to avoid collisions or wrongly implied word boundaries, there should always be a 1/4em space between adjacent long annotations (usually up to 5 characters per syllable for pinyin). Letter-spacing is typically applied evenly across all the base text to allow room for annotations.
There is a preference for annotations to use a sans-serif font, and for the base text to use Kai.
The 國語注音符號 (guóyǔ zhùyīn fúhào) approach uses a set of characters referred to as bopomofo (after the first four letters of its alphabet), and is mostly used in Taiwan.
The bopomofo annotations usually appear in a vertical column to the right of each base character, in both horizontal and vertical text.
Each syllable is described by up to 3 bopomofo characters, plus a tone mark. The neutral tone mark appears above the stack, but the others appear to the right of the bopomofo column. The height of the tone mark depends on the number of bopomofo characters to its left. For details, see CLREQ.
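The structure of a bopomofo annotation can be modelled by splitting a syllable into its letters and tone mark. The sketch below is illustrative only: `split_zhuyin` is a hypothetical helper, the tone mark code points are the usual spacing modifier letters, and the first tone is assumed to be unmarked. The placement values simply restate the rule above.

```python
# Tone mark -> where it is drawn relative to the bopomofo column.
TONE_MARKS = {
    "\u02ca": "right",  # second tone
    "\u02c7": "right",  # third tone
    "\u02cb": "right",  # fourth tone
    "\u02d9": "above",  # neutral tone
}


def is_bopomofo_letter(ch: str) -> bool:
    """True for letters in the Bopomofo block (U+3105..U+312F)."""
    return 0x3105 <= ord(ch) <= 0x312F


def split_zhuyin(syllable: str):
    """Return (letters, tone mark or None, tone placement or None)."""
    letters = [ch for ch in syllable if is_bopomofo_letter(ch)]
    tones = [ch for ch in syllable if ch in TONE_MARKS]
    if len(letters) + len(tones) != len(syllable):
        raise ValueError("unexpected character in syllable")
    if not 1 <= len(letters) <= 3 or len(tones) > 1:
        raise ValueError("expected 1-3 letters and at most one tone mark")
    tone = tones[0] if tones else None
    placement = TONE_MARKS[tone] if tone else None
    return letters, tone, placement


# The syllable for the fourth-tone reading "han".
print(split_zhuyin("\u310f\u3122\u02cb"))
```

A layout engine would use the letter count to decide how high to place a right-hand tone mark; that positioning detail is left to CLREQ and is not modelled here.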
These annotations are common in light novels and translated works, and tend to describe phrases or words. They may contain casing, punctuation, and spaces, and may contain Chinese text explaining Latin base text, or vice versa.
They usually appear below a horizontal line of text, and to the left of a vertical line.
Unlike phonetic annotations, these annotations are only attached to specific words or phrases.
Chinese uses a large number of punctuation marks, and that number is increased by the duplication of normal vs. fullwidth variants. The fullwidth punctuation often includes significant amounts of white space, so that character frames of the punctuation characters are the same size as Han characters.
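The distinction between normal and fullwidth punctuation is visible in Unicode character properties. As a minimal sketch (the `describe` helper is a hypothetical name), Python's standard `unicodedata` module reports the East Asian Width of a character and its NFKC compatibility form:

```python
import unicodedata


def describe(ch: str) -> dict:
    """Report the name, East Asian Width and NFKC form of a character."""
    return {
        "char": ch,
        "name": unicodedata.name(ch),
        "width": unicodedata.east_asian_width(ch),   # e.g. F, W, Na
        "nfkc": unicodedata.normalize("NFKC", ch),
    }


for ch in ["\uff0c", "\u3002", ","]:
    print(describe(ch))
```

The fullwidth comma reports width `F` and maps to the ASCII comma under NFKC, whereas the ideographic full stop reports `W` and has no compatibility mapping. One consequence is that applying NFKC normalisation to Chinese text is likely to replace fullwidth punctuation with ASCII equivalents, so it is usually best avoided where the original punctuation matters.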
CLDR lists 136 punctuation characters for the union of Simplified and Traditional Chinese, grouped here by Unicode block.
CJK Symbols & Punctuation:
(Halfwidth &) Fullwidth Forms:
Basic & General punctuation:
CLDR also includes some compatibility characters, included for handling legacy implementations. These include vertical text forms, which should normally be automatically enabled by the font in a vertical context. The other forms should also be avoided in favour of normal characters, with the variant shapes provided by fonts or styling (Unicode Standard, pp. 284-5).
CJK Compatibility Forms:
Small Form Variants:
Dashes. The long dashes mentioned in bracketing can also be used to show a continuation of tone or sound, an abrupt change in thought, or the addition of new content to the context (CLREQ).
Connectors. Connector marks are used "to indicate the beginning and end of time or space, to indicate quantity, to express the name of a chemical compound, to label a table or illustration, to connect a house number in an address, for a phone number, to separate digits which indicate the year, month and date, or to connect compound nouns and for the romanization, as well as the foreign text in the content" (CLREQ).
Chinese uses the following punctuation for this (CLREQ).
SC: H, V | TC: H, V | |||
---|---|---|---|---|
~ [U+FF5E FULLWIDTH TILDE] | ||||
– [U+2013 EN DASH] | ||||
— [U+2014 EM DASH] | - | - |
Separators. Interpuncts are used to separate the first name and family name in foreign or minority names rendered using Chinese characters, and with book title marks to separate chapters, articles and volumes in publications (CLREQ).
H | V | |
---|---|---|
· [U+00B7 MIDDLE DOT] |
Middle dots sometimes take up only a halfwidth space in Simplified Chinese when used with dates, e.g. 2·11 (CLREQ).
The following characters are not recommended for this purpose: . [U+FF0E FULLWIDTH FULL STOP], ‧ [U+2027 HYPHENATION POINT], • [U+2022 BULLET], and ・ [U+30FB KATAKANA MIDDLE DOT] (CLREQ).
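A clean-up script could replace the discouraged dots with U+00B7 MIDDLE DOT. The following is a rough heuristic sketch, not a recommended algorithm: `normalise_interpunct` is a hypothetical name, the Han range covers only the basic CJK Unified Ideographs block, and the rule (replace only when the dot sits directly between two Han characters) will misfire where U+FF0E is used as a sentence-final stop between two sentences.

```python
import re

# Characters listed above as not recommended for use as interpuncts.
DISCOURAGED = "\uff0e\u2027\u2022\u30fb"
# Basic CJK Unified Ideographs block only (a simplifying assumption).
HAN = "\u4e00-\u9fff"

_PATTERN = re.compile(f"(?<=[{HAN}])[{DISCOURAGED}](?=[{HAN}])")


def normalise_interpunct(text: str) -> str:
    """Replace discouraged dots between Han characters with U+00B7."""
    return _PATTERN.sub("\u00b7", text)


fixed = normalise_interpunct("威廉\u30fb莎士比亚")
print(fixed)
```

Because of the ambiguity of U+FF0E, a real tool would probably want to report candidate replacements for review rather than apply them silently.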
When lines or other text decoration are used, they normally appear below horizontal text, and to the left of vertical text. However, emphasis marks appear to the right of a vertical line.
Where both lines and emphasis dots decorate the same run of text, the lines and emphasis dots will usually appear on opposite sides of a vertical line of text, but will normally both appear together below a line of horizontally set text. In horizontal text the line decoration is normally closer to the text than the emphasis dots.
When two underlined items appear side-by-side, the underline should be broken between the two.
If a line of Chinese text contains some text in another language and orthography, the position of any text decoration should follow the Chinese conventions.
Lines alongside the text are used to indicate personal names, rather than emphasis (see inline_propernames). Wavy lines may also be used to mark a title of a book or work of art (see inline_titles).
Emphasis can be indicated using dots alongside the line (see emphasis).
Lines are normally wrapped between characters – word boundaries have no significance for the wrapping. Line breaking should, however, take into account a few rules which dictate what characters cannot appear at the end or start of a line.
There is no hyphenation when Chinese characters are wrapped to the next line.
The following characters should not normally begin a line. Instead, they should bring the previous Han character with them (CLREQ).
A slightly stricter rule, called GB-style by CLREQ, adds the following solidus characters.
A further level of strictness adds the following to the list. Where two characters are listed here, they should ideally not be broken across a line ending, but they may be split to reduce the length of text wrapped onto the next line (CLREQ).
These rules can be modified by preferences, and in some cases are not observed at all – particularly for Traditional Chinese in Taiwan and Hong Kong, and especially for newsprint, to help deal with narrow columns of text (CLREQ).
Also, where several punctuation marks appear together, for example 。』」, moving all characters from the previous line might create too large a gap for justification to handle elegantly, and so punctuation marks might be allowed to appear at the line start (CLREQ).
The following characters should not appear at the end of a line.
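The line-start and line-end prohibitions described in this section can be sketched as a simple greedy wrapper. This is only an illustration under stated assumptions: the two character sets are a small, hypothetical subset chosen for the example (closing punctuation for line start, opening brackets for line end), not the full CLREQ lists, and every character is treated as one unit wide.

```python
# Illustrative subsets only; the real lists are longer.
NO_LINE_START = set("、,。.:;!?)」』》〉”’")
NO_LINE_END = set("(「『《〈“‘")


def wrap(text: str, width: int) -> list:
    """Break text into lines of at most `width` characters.

    A break is moved to the left while the next line would start with a
    prohibited character or the current line would end with one. If no
    acceptable break is found, the original break position is kept.
    """
    lines = []
    start, n = 0, len(text)
    while start < n:
        end = min(start + width, n)
        if end < n:
            candidate = end
            while candidate > start + 1 and (
                text[candidate] in NO_LINE_START
                or text[candidate - 1] in NO_LINE_END
            ):
                candidate -= 1
            if (text[candidate] not in NO_LINE_START
                    and text[candidate - 1] not in NO_LINE_END):
                end = candidate
        lines.append(text[start:end])
        start = end
    return lines


lines = wrap("人人生而自由,在尊严和权利上一律平等。", 6)
print(lines)
```

In the example, the comma would otherwise begin the second line, so the break moves left and the preceding Han character is carried down with it. A production implementation would also need justification to absorb the resulting short line, and could relax the rules for runs of adjacent punctuation as noted above.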
Chinese justifies text using a complex set of rules which adjust the space between characters on a line. Some characters are adjusted before others.
This section looks at ways in which spacing is applied between characters over and above that which is introduced during justification.
The standard baseline for Han characters is slightly lower than the alphabetic baseline used for Latin characters. Mixed script text needs to align baselines correctly.
Han characters have no ascenders or descenders, but occupy the square space described earlier.
The figure (fig_baselines) shows metrics for the Heiti TC font. In this font the maximum height of the Han characters reaches slightly higher than the Latin ascenders, but not as low as the Latin descenders.
Interline spacing. Interline spacing should be consistent across all lines in a given text. It should allow a gap of sufficient size to include interlinear text decorations, such as lines for proper names or book titles or dots for emphasis marks. If an interline space is likely to include both line decorations and emphasis marks in a single interline gap, then the interline spacing must be set to accommodate that. (Note that paragraphs on the Web may reflow such that a single interline gap may sometimes contain both line decorations and emphasis dots, while at other times the line may only contain one.)
You can experiment with counter styles using the Counter styles converter. Patterns for using these styles in CSS can be found in Ready-made Counter Styles, and we use the names of those patterns here to refer to the various styles.
Chinese text uses a number of different counter styles. Some of the more common include full-width European numbers, which in vertical text stand upright. Unicode has various sets of numbers that can be useful here.
For the dotted-decimal numeric style Unicode provides precomposed characters from 1 to 20.
For the circled-decimal numeric style Unicode provides characters from 1 to 50.
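The precomposed characters mentioned above sit in a few separate code point ranges, which the following Python sketch maps onto counter values. The function names are hypothetical. The ranges used are U+2488 to U+249B for the dotted numbers, and U+2460 to U+2473, U+3251 to U+325F and U+32B1 to U+32BF for the circled numbers.

```python
def dotted_decimal(n):
    """Return the precomposed 'number + full stop' character for 1-20."""
    if not 1 <= n <= 20:
        raise ValueError("precomposed dotted numbers only cover 1 to 20")
    return chr(0x2488 + n - 1)


def circled_decimal(n):
    """Return the precomposed circled number character for 1-50."""
    if 1 <= n <= 20:
        return chr(0x2460 + n - 1)   # CIRCLED DIGIT ONE .. CIRCLED NUMBER TWENTY
    if 21 <= n <= 35:
        return chr(0x3251 + n - 21)  # CIRCLED NUMBER TWENTY ONE .. THIRTY FIVE
    if 36 <= n <= 50:
        return chr(0x32B1 + n - 36)  # CIRCLED NUMBER THIRTY SIX .. FIFTY
    raise ValueError("precomposed circled numbers only cover 1 to 50")


if __name__ == "__main__":
    print(" ".join(circled_decimal(n) for n in (1, 20, 21, 35, 36, 50)))
    print(" ".join(dotted_decimal(n) for n in (1, 10, 20)))
```

Values outside these ranges have no precomposed character, so a list that grows beyond them needs a fallback style.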
Chinese orthographies also use ideographic characters to create 1 numeric, 2 fixed, 1 cyclic, 1 additive and 2 idiosyncratic styles.
The cjk-decimal numeric style is decimal-based and uses these digits.
Examples:
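A minimal Python sketch of the cjk-decimal style follows. It assumes the digit set 〇一二三四五六七八九, which is the set the CSS Counter Styles specification uses for this style, and simply substitutes each decimal digit in place. The function name is hypothetical.

```python
CJK_DECIMAL_DIGITS = "〇一二三四五六七八九"


def cjk_decimal(n):
    """Write a non-negative integer digit by digit with ideographic digits."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    return "".join(CJK_DECIMAL_DIGITS[int(d)] for d in str(n))


if __name__ == "__main__":
    for value in (0, 7, 10, 2022):
        print(value, cjk_decimal(value))
```

Because this is a positional style, 10 is written 一〇 rather than 十, unlike the longhand styles.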
Several ideographic-based counter styles have an algorithm that is like an additive style, but has some differences. The algorithm to use can be found in the CSS Counter Styles specification, where they are called Longhand East Asian styles.
These styles are all decimal-based, and use the same algorithm but with different characters. The CSS spec only defines the algorithm up to 9,999, because there appears to be some disagreement about how larger numbers are handled.
The simp-chinese-informal longhand style uses the characters shown just below. The separator for lists is 、 and the numbers can be negative when using the symbol 负.
Examples:
The trad-chinese-informal style uses exactly the same characters, except that the negative symbol is 負.
The simp-chinese-formal longhand style uses the characters shown below. The separator for lists is 、 and the numbers can be negative when using the symbol 负.
Examples:
The trad-chinese-formal longhand style uses 3 different code points where there is a difference in shape (for 2, 3, and 6), shown below. The separator for lists is 、 and the numbers can be negative when using the symbol 負.
Examples:
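The four longhand styles share one algorithm and differ only in their characters, which the following Python sketch tries to make concrete. It follows the steps for the Chinese longhand styles in the CSS Counter Styles specification as understood here: attach a marker to each non-zero digit, drop the leading digit for 10 to 19 in the informal styles, drop trailing zeros, and collapse runs of zeros. The names `Longhand`, `STYLES` and `longhand` are hypothetical, and the sketch stops at 9,999, as the specification does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Longhand:
    digits: str     # characters for 0 to 9
    markers: str    # characters for tens, hundreds, thousands
    negative: str   # prefix for negative values
    informal: bool  # informal styles write 10-19 without a leading "one"


STYLES = {
    "simp-chinese-informal": Longhand("零一二三四五六七八九", "十百千", "负", True),
    "trad-chinese-informal": Longhand("零一二三四五六七八九", "十百千", "負", True),
    "simp-chinese-formal": Longhand("零壹贰叁肆伍陆柒捌玖", "拾佰仟", "负", False),
    "trad-chinese-formal": Longhand("零壹貳參肆伍陸柒捌玖", "拾佰仟", "負", False),
}


def longhand(style, n):
    s = STYLES[style]
    if not -9999 <= n <= 9999:
        raise ValueError("outside the range covered by this sketch")
    if n == 0:
        return s.digits[0]
    text = str(abs(n))
    # Pair each digit with its marker; zeros and the ones digit get none.
    parts = []
    for pos, ch in enumerate(text):
        digit = int(ch)
        power = len(text) - pos - 1
        marker = s.markers[power - 1] if digit and power else ""
        parts.append([digit, marker])
    # Informal styles: for 10 to 19 keep only the tens marker.
    if s.informal and 10 <= abs(n) <= 19:
        parts[0][0] = None
    # Drop trailing zeros.
    while parts[-1][0] == 0:
        parts.pop()
    out = []
    previous_was_zero = False
    for digit, marker in parts:
        if digit == 0:
            # Collapse a run of zeros into a single zero character.
            if not previous_was_zero:
                out.append(s.digits[0])
            previous_was_zero = True
        else:
            previous_was_zero = False
            out.append(("" if digit is None else s.digits[digit]) + marker)
    sign = s.negative if n < 0 else ""
    return sign + "".join(out)


if __name__ == "__main__":
    for name in STYLES:
        print(name, [longhand(name, v) for v in (10, 15, 101, 1001, -236)])
```

Keeping the characters in a small table and the algorithm in one function mirrors the way the styles are described: same algorithm, different characters.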
The cjk-earthly-branch fixed style uses the letters shown just below. It is only able to count to 12.
The cjk-heavenly-stem fixed style uses the letters shown below. It is only able to count to 10.
The circled-ideograph fixed style uses the letters shown below. It is only able to count to 10.
The parenthesised-ideograph fixed style uses the letters shown below. It is also only able to count to 10.
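The four fixed styles above can be sketched in Python as lookups into short symbol lists. The names `FIXED_STYLES` and `fixed_counter` are hypothetical. The circled and parenthesised ideographs are taken from U+3280 to U+3289 and U+3220 to U+3229. Returning a plain decimal string once the symbols run out is an assumption of this sketch, standing in for whatever fallback style is configured.

```python
FIXED_STYLES = {
    "cjk-earthly-branch": "子丑寅卯辰巳午未申酉戌亥",
    "cjk-heavenly-stem": "甲乙丙丁戊己庚辛壬癸",
    # CIRCLED IDEOGRAPH ONE .. TEN
    "circled-ideograph": "".join(chr(0x3280 + i) for i in range(10)),
    # PARENTHESIZED IDEOGRAPH ONE .. TEN
    "parenthesised-ideograph": "".join(chr(0x3220 + i) for i in range(10)),
}


def fixed_counter(style, n):
    """Return the nth symbol of a fixed style, or a fallback when exhausted."""
    symbols = FIXED_STYLES[style]
    if 1 <= n <= len(symbols):
        return symbols[n - 1]
    # Assumption: fall back to ordinary decimal digits outside the range.
    return str(n)


if __name__ == "__main__":
    for name in FIXED_STYLES:
        print(name, " ".join(fixed_counter(name, n) for n in range(1, 14)))
```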
The cjk-stem-branch cyclic style uses the pairs of characters shown just below. Once 60 is reached, the list begins over.
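The 60 pairs of the cjk-stem-branch style come from stepping through the 10 heavenly stems and the 12 earthly branches in parallel, as this small Python sketch shows. The function name is hypothetical.

```python
STEMS = "甲乙丙丁戊己庚辛壬癸"          # 10 heavenly stems
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"  # 12 earthly branches


def cjk_stem_branch(n):
    """Return the stem-branch pair for counter value n, cycling every 60."""
    i = (n - 1) % 60
    return STEMS[i % 10] + BRANCHES[i % 12]


if __name__ == "__main__":
    print(" ".join(cjk_stem_branch(n) for n in (1, 2, 11, 13, 60, 61)))
```

The cycle length is 60 because that is the least common multiple of 10 and 12, so only half of the 120 possible stem and branch combinations ever occur.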
The cjk-tally-mark additive style uses the letters shown just above. It is based on only 5 basic characters, which were introduced in Unicode 11. The potential range of this style is very large, but counters rapidly grow in size, so smaller numbers are most likely.
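The growth in counter size is easy to see in a short Python sketch of the additive algorithm. It assumes the 5 basic characters are the ideographic tally marks at U+1D372 to U+1D376, with weights 1 to 5. The names `TALLY_MARKS` and `cjk_tally_mark` are hypothetical.

```python
# (weight, symbol) pairs, largest weight first, as an additive style requires.
TALLY_MARKS = [(w, chr(0x1D372 + w - 1)) for w in (5, 4, 3, 2, 1)]


def cjk_tally_mark(n):
    """Build the counter by repeatedly taking the largest weight that fits."""
    if n < 1:
        raise ValueError("this sketch only handles positive values")
    out = []
    for weight, symbol in TALLY_MARKS:
        count, n = divmod(n, weight)
        out.append(symbol * count)
    return "".join(out)


if __name__ == "__main__":
    for value in (1, 4, 5, 7, 23):
        marks = cjk_tally_mark(value)
        print(value, marks, len(marks))
```

Each additional 5 adds one more complete tally character, so a counter for 23 already needs 5 characters.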
The most common suffix is 、 [U+3001 IDEOGRAPHIC COMMA]. The circled or parenthesised fixed styles have no prefix/suffix.
Examples:
tbd
This section is for any features that are specific to this script and that relate to the following topics: general page layout & progression; grids & tables; notes, footnotes, etc; forms & user interaction; page numbering, running headers, etc.
The Han script characters in Unicode 13.0 are spread across 7 blocks. The total number of these characters is 92,896.
There are also 2 compatibility blocks, containing 1,014 characters in total.
There are also various related blocks, containing 459 characters.
The following links give information about characters used for everyday use of Chinese. The numbers in parentheses are for non-ASCII characters.