Japanese writing system summary

Updated 10-Feb-2019 • tags kana, scriptnotes

This page provides basic information about the Japanese writing system, which includes the use of Kanji, Hiragana, Katakana and Latin scripts. It is not authoritative, peer-reviewed information – these are just notes I have gathered or copied from various places as I learned.

For similar information related to other scripts, see the Script comparison table.

Sample (Japanese)

第1条 すべての人間は、生まれながらにして自由であり、かつ、尊厳と権利とについて平等である。人間は、理性と良心とを授けられており、互いに同胞の精神をもって行動しなければならない。

第2条 すべて人は、人種、皮膚の色、性、言語、宗教、政治上その他の意見、国民的もしくは社会的出身、財産、門地その他の地位又はこれに類するいかなる自由による差別をも受けることなく、この宣言に掲げるすべての権利と自由とを享有することができる。 さらに、個人の属する国又は地域が独立国であると、信託統治地域であると、非自治地域であると、又は他のなんらかの主権制限の下にあるとを問わず、その国又は地域の政治上、管轄上又は国際上の地位に基ずくいかなる差別もしてはならない。

Usage & history

From Scriptsource:

Kanji characters were introduced to Japan around the 3rd century, it is thought from Korea. Until the 7th or 8th century, the Japanese language was written exclusively in these Chinese characters. Initially these were used phonetically to represent similar-sounding Japanese syllables, regardless of their meaning in written Chinese. However, the process of writing Japanese solely in kanji was laborious; each symbol consisted of a number of strokes and only represented one syllable. Two simplified forms of writing began to emerge around the 7th century. The modern hiragana script developed from a simplified cursive style originally developed by women, who were discouraged from learning kanji, and katakana was developed by Buddhist scholars who wrote only one element of each kanji symbol as a form of shorthand.

From Wikipedia:

The modern Japanese writing system is a combination of two character types: logographic kanji, which are adopted from Chinese characters, and syllabic kana. Kana itself consists of a pair of syllabaries: hiragana, used primarily for native or naturalised Japanese words and grammatical elements, and katakana, used primarily for foreign words and names, loanwords, onomatopoeia, scientific names, and sometimes for emphasis. Almost all written Japanese sentences contain a mixture of kanji and kana. Because of this mixture of scripts, in addition to a large inventory of kanji characters, the Japanese writing system is often considered to be the most complicated in use anywhere in the world.

Distinctive features

Kanji characters are mostly derived from the Chinese Han script. They are used for word roots.

The term kana covers two syllabaries that are used with kanji characters (see Han) to write Japanese. See the table to the right for a brief overview of features, taken from the Script Comparison Table.

One syllabary is hiragana, the other katakana. In both cases, the repertoire includes 5 independent vowel sounds and one nasal sound; the rest are consonant+vowel combinations. There are a small number of additional characters with particular functions, such as a katakana lengthening mark, and a few small characters for representing medial glides.

Text can be written horizontally or vertically. The visual forms of characters don't interact.

Character lists

The Kana script characters in Unicode 10.0 are spread across 5 blocks:

The commonly used characters are in the Hiragana and Katakana blocks. The other characters are somewhat more esoteric.
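As a rough illustration of how the two commonly used blocks can be told apart in code, the following minimal sketch tests a character's code point against the Hiragana (U+3040 to U+309F) and Katakana (U+30A0 to U+30FF) block ranges. The function name kana_block is hypothetical, and the other kana-related blocks are deliberately not covered.

```python
def kana_block(ch):
    """Return the name of the main kana block containing ch, or None.

    Only the two commonly used blocks are checked; this is a sketch,
    not a full Unicode block lookup.
    """
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "Hiragana"
    if 0x30A0 <= cp <= 0x30FF:
        return "Katakana"
    return None


if __name__ == "__main__":
    for ch in "あア漢ー":
        print(ch, kana_block(ch))
```

Note that the prolonged sound mark ー (U+30FC) falls in the Katakana block even though it can also appear with hiragana.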

The listings at the bottom of the page show a list of characters used, per version 31 of CLDR's lists of characters (exemplarCharacters).


The Japanese language, unlike many neighbouring languages, uses polysyllabic words, and has no tones. It is an agglutinative language, and doesn't use spaces or other characters to separate words.

Wikipedia has a nicely written summary of the structural characteristics:

Text (文章 bunshō) is composed of sentences (文 bun), which are in turn composed of phrases (文節 bunsetsu), which are its smallest coherent components. Like Chinese and classical Korean, written Japanese does not typically demarcate words with spaces; its agglutinative nature further makes the concept of a word rather different from words in English. The reader identifies word divisions by semantic cues and a knowledge of phrase structure. Phrases have a single meaning-bearing word, followed by a string of suffixes, auxiliary verbs and particles to modify its meaning and designate its grammatical role. In the following example, phrases are indicated by vertical bars:

taiyō ga | higashi no | sora ni | noboru
sun SUBJECT | east POSSESSIVE | sky LOCATIVE | rise
The sun rises in the eastern sky.

Romanised Japanese text may add spaces between bunsetsu phrases, with hyphens separating suffixes (eg. higashi-no), or may also separate the suffixes using spaces (ie. higashi no).

Text direction

Text can be written horizontally, left to right, or vertically with lines progressing from right to left. Vertically set text is still common in Japan; most novels, newspapers and magazines are set vertically.

Older horizontally set texts in Japanese also ran right to left.

The following sample text can be set either horizontally or vertically.

第7条 すべての人は、法の下において平等であり、また、いかなる差別もなしに法の平等な保護を受ける権利を有する。すべての人は、この宣言に違反するいかなる差別に対しても、また、そのような差別をそそのかすいかなる行為に対しても、平等な保護を受ける権利を有する。

It should be noted, however, that horizontal and vertical text are not usually identical. Apart from the question of what gets rotated and what does not, the two writing modes may show different preferences for emphasis marks, brackets, numbers, and so forth.

Kanji characters

Kanji characters are mostly derived from the Chinese Han script. They are commonly used for word roots and compound words.


静電気 The compound word static electricity (seidenki), written with kanji.

Most kanji characters are built by combining components, typically one that hints at the meaning with one that hints at the pronunciation. Some have pictographic origins that are still evident, whereas others have a more complicated structure.

In reforms in the mid 20th century, the Japanese repertoire was standardised on around 2,000 characters; however, standardised character sets support a few thousand more.

Kanji characters in CLDR 31 (exemplarCharacters):
Kanji 々 一 丁 七 万 下 不 与 丑 且 世 丘 丙 両 並 中 丸 丹 主 久 乏 乗 乙 九 乱 乳 乾 亀 了 予 争 事 二 互 五 井 亜 亡 交 亥 亨 享 亭 人 仁 今 介 仏 仕 他 付 仙 代 以 仮 仰 仲 件 任 企 伊 伏 休 会 伝 伯 伴 伸 伺 似 但 位 佐 体 何 余 作 佳 併 使 例 侍 供 依 価 侮 侯 侵 便 係 促 俊 俗 保 信 修 俳 俵 俸 俺 倉 個 倍 倒 候 借 倣 値 倫 倹 偉 偏 停 健 側 偶 偽 傍 傑 傘 備 催 債 傷 傾 働 像 僕 僚 僧 儀 億 儒 償 優 元 兆 先 光 克 免 兎 児 党 入 全 八 六 共 兵 具 典 兼 内 円 冊 再 冒 冗 写 冠 冬 冷 准 凍 凝 凡 処 凶 凸 出 刀 刃 分 刈 刊 刑 列 初 判 別 利 到 制 券 刺 刻 則 削 前 剖 剛 剣 剤 副 剰 割 創 劇 力 功 加 劣 助 努 励 労 効 劾 勅 勇 勉 動 勘 務 勝 募 勢 勤 勧 勲 勺 匁 包 化 北 匠 匹 医 匿 十 千 升 午 半 卑 協 南 単 博 占 卯 危 即 卵 卸 厄 厘 厚 原 厳 去 参 又 及 収 叔 取 受 叙 口 句 叫 召 可 台 史 右 号 司 各 合 吉 同 向 君 吟 否 含 吸 吹 呈 告 周 味 呼 命 和 咲 哀 品 員 哲 唆 唇 唐 唯 唱 商 問 啓 善 喚 喜 喝 喪 喫 営 嗣 嘆 嘉 嘱 器 噴 嚇 囚 四 回 因 団 困 囲 図 固 国 圏 園 土 圧 在 地 坂 均 坊 坑 坪 垂 型 垣 埋 城 域 執 培 基 埼 堀 堂 堅 堕 堤 堪 報 場 塀 塁 塊 塑 塔 塗 塚 塩 塾 境 墓 増 墜 墨 墳 墾 壁 壇 壊 壌 士 壬 壮 声 売 変 夏 夕 外 多 夜 夢 大 天 夫 央 失 奇 奉 奏 契 奔 奥 奨 奪 奮 女 奴 好 如 妄 妊 妙 妥 妨 妹 妻 姉 始 姓 委 姫 姻 姿 威 娘 娠 娯 婆 婚 婦 婿 媒 嫁 嫌 嫡 嬢 子 孔 字 存 孝 季 孤 学 孫 宅 宇 安 完 宗 定 宜 宝 実 客 室 宮 宰 害 家 容 宿 寂 寄 密 富 寒 寛 寝 察 寡 寧 審 寮 寸 寺 対 寿 封 専 射 将 尉 尋 導 小 少 尚 就 尺 尼 局 居 屈 届 屋 展 属 層 履 屯 山 岐 岡 岩 岬 岳 岸 峠 峡 峰 島 崇 崎 崩 川 州 巡 巣 工 巨 差 己 巳 巻 市 布 帆 希 帝 帥 師 席 帯 帰 帳 常 帽 幅 幕 幣 干 年 幸 幹 幻 幾 庁 広 床 序 底 店 庚 府 度 座 庫 庭 庶 庸 廃 廉 廊 延 廷 建 弁 弊 式 弐 弓 引 弘 弟 弦 弧 弱 張 強 弾 当 形 彩 彫 彰 影 役 彼 往 征 径 待 律 後 徐 徒 従 得 御 復 循 微 徳 徴 徹 心 必 忌 忍 志 忙 応 忠 快 念 怒 怖 思 怠 急 性 怪 恋 恐 恒 恥 恨 恩 恭 息 恵 悔 悟 悠 患 悦 悩 悪 悲 悼 情 惑 惜 惨 惰 想 愁 愉 意 愚 愛 感 慈 態 慌 慎 慕 慢 慣 慨 慮 慰 慶 憂 憎 憤 憩 憲 憶 憾 懇 懐 懲 懸 戊 戌 成 戒 戦 戯 戸 戻 房 所 扇 扉 手 才 打 払 扱 扶 批 承 技 抄 把 抑 投 抗 折 抜 択 披 抱 抵 抹 押 抽 担 拍 拐 拒 拓 拘 拙 招 拝 拠 拡 括 拷 拾 持 指 挑 挙 挟 振 挿 捕 捜 捨 据 掃 授 掌 排 掘 掛 採 探 接 控 推 措 掲 描 提 揚 換 握 揮 援 揺 損 搬 搭 携 搾 摂 摘 摩 撃 撤 撮 撲 擁 操 擦 擬 支 改 攻 放 政 故 敏 救 敗 教 敢 散 敬 数 整 敵 敷 文 斉 斎 斗 料 斜 斤 斥 断 新 方 施 旅 旋 族 旗 既 日 旧 早 旬 昆 昇 昌 明 易 昔 星 映 春 昨 昭 是 昼 時 晩 普 景 晴 晶 暁 暇 暑 暖 暗 暦 暫 暮 暴 曇 曜 曲 更 書 曹 替 最 月 有 服 朕 朗 望 朝 期 木 未 札 朱 朴 机 朽 杉 材 村 束 条 来 杯 東 松 板 析 林 枚 果 枝 枠 枢 枯 架 柄 某 染 柔 柱 柳 査 栄 栓 校 株 核 根 格 栽 桃 案 桑 桜 桟 梅 械 棄 棋 棒 棚 棟 森 棺 植 検 業 極 楼 楽 概 構 様 槽 標 模 権 横 樹 橋 機 欄 欠 次 欧 欲 欺 款 歌 歓 止 正 武 歩 歯 歳 歴 死 殉 残 殖 殴 段 殺 殻 殿 母 毎 毒 比 毛 氏 民 気 水 氷 永 汁 求 汎 汗 汚 江 池 決 汽 沈 沖 没 沢 河 沸 油 治 沼 沿 況 泉 泊 泌 法 泡 泣 泥 注 泰 泳 洋 洗 洞 津 洪 活 派 流 浄 浅 
浜 浦 浪 浮 浴 海 浸 消 涙 涯 液 涼 淑 淡 深 混 添 清 渇 渉 渋 渓 減 渡 渦 温 測 港 湖 湯 湾 満 源 準 溝 溶 滅 滋 滑 滝 滞 滴 漁 漂 漆 漏 演 漠 漢 漫 漬 漸 潔 潜 潟 潤 潮 澄 激 濁 濃 濫 濯 瀬 火 灯 灰 災 炉 炊 炎 炭 点 為 烈 無 焦 然 焼 煙 照 煩 煮 熟 熱 燃 燥 爆 爵 父 片 版 牙 牛 牧 物 牲 特 犠 犬 犯 状 狂 狩 独 狭 猛 猟 猪 猫 献 猶 猿 獄 獣 獲 玄 率 玉 王 珍 珠 班 現 球 理 琴 環 璽 瓶 甘 甚 生 産 用 田 申 男 町 画 界 畑 畔 留 畜 畝 略 番 異 畳 疎 疑 疫 疲 疾 病 症 痘 痛 痢 痴 療 癒 癖 癸 発 登 白 百 的 皆 皇 皮 皿 盆 益 盗 盛 盟 監 盤 目 盲 直 相 盾 省 看 県 真 眠 眺 眼 着 睡 督 瞬 矛 矢 知 短 矯 石 砂 研 砕 砲 破 硝 硫 硬 碁 碑 確 磁 磨 礁 礎 示 礼 社 祈 祉 祖 祚 祝 神 祥 票 祭 禁 禄 禅 禍 福 秀 私 秋 科 秒 秘 租 秩 称 移 程 税 稚 種 稲 稼 稿 穀 穂 積 穏 穫 穴 究 空 突 窃 窒 窓 窮 窯 立 竜 章 童 端 競 竹 笑 笛 符 第 筆 等 筋 筒 答 策 箇 算 管 箱 節 範 築 篤 簡 簿 籍 米 粉 粋 粒 粗 粘 粛 粧 精 糖 糧 糸 系 糾 紀 約 紅 紋 納 純 紙 紛 素 索 紫 累 細 紳 紹 紺 終 組 経 結 絞 絡 給 統 絵 絶 絹 継 続 維 綱 網 綿 緊 総 緑 緒 線 締 編 緩 緯 練 縁 縄 縛 縦 縫 縮 績 繁 繊 織 繕 繭 繰 缶 罪 置 罰 署 罷 羅 羊 美 群 義 羽 翁 翌 習 翻 翼 老 考 者 耐 耕 耗 耳 聖 聞 聴 職 肉 肌 肖 肝 肢 肥 肩 肪 肯 育 肺 胃 胆 背 胎 胞 胴 胸 能 脂 脅 脈 脚 脱 脳 脹 腐 腕 腰 腸 腹 膚 膜 膨 臓 臣 臨 自 臭 至 致 興 舌 舎 舗 舞 舟 航 般 舶 船 艇 艦 良 色 芋 芝 花 芳 芸 芽 苗 若 苦 英 茂 茎 茶 草 荒 荘 荷 菊 菌 菓 菜 華 落 葉 著 葬 蒸 蓄 蔵 薄 薦 薪 薬 藤 藩 藻 虎 虐 虚 虜 虞 虫 蚊 蚕 蛇 蛍 蛮 融 血 衆 行 術 街 衛 衝 衡 衣 表 衰 衷 袋 被 裁 裂 装 裏 裕 補 裸 製 複 褐 褒 襟 襲 西 要 覆 覇 見 規 視 覚 覧 親 観 角 解 触 言 訂 計 討 訓 託 記 訟 訪 設 許 訳 訴 診 証 詐 詔 評 詞 詠 試 詩 詰 詳 誇 誉 誌 認 誓 誕 誘 語 誠 誤 説 読 誰 課 調 談 請 論 諭 諮 諸 諾 謀 謁 謄 謙 講 謝 謡 謹 識 譜 警 議 譲 護 谷 豆 豊 豚 象 豪 貝 貞 負 貢 貧 販 貫 責 貯 貴 買 貸 費 貿 賀 賃 賄 資 賊 賓 賛 賜 賞 賠 賢 賦 質 購 贈 赤 赦 走 赴 起 超 越 趣 足 距 跡 路 跳 践 踊 踏 躍 身 車 軌 軍 軒 軟 転 軸 軽 較 載 輝 輩 輪 輸 轄 辛 辞 辰 農 辺 込 迅 迎 近 返 迫 迭 述 迷 追 退 送 逃 逆 透 逐 逓 途 通 逝 速 造 連 逮 週 進 逸 遂 遅 遇 遊 運 遍 過 道 違 遠 遣 適 遭 遮 遵 遷 選 遺 避 還 邦 邪 邸 郊 郎 郡 部 郭 郵 郷 都 酉 酌 配 酒 酔 酢 酪 酬 酵 酷 酸 醜 醸 釈 里 量 金 針 釣 鈍 鈴 鉄 鉛 鉢 鉱 銀 銃 銅 銑 銘 銭 鋭 鋳 鋼 錘 錠 錬 錯 録 鍛 鎖 鎮 鏡 鐘 鑑 長 門 閉 開 閏 閑 間 関 閣 閥 閲 闘 阪 防 阻 附 降 限 陛 院 陥 陪 陰 陳 陵 陶 陸 険 陽 隅 隆 隊 階 随 隔 際 障 隠 隣 隷 隻 雄 雇 雉 雌 雑 離 難 雨 雪 雰 雲 零 雷 電 需 震 霊 霜 霧 露 青 静 非 面 革 靴 韓 音 韻 響 頂 頃 項 順 預 頒 領 頭 頻 頼 題 額 顔 顕 願 類 顧 風 飛 食 飢 飯 飲 飼 飾 養 餓 館 首 香 馬 駄 駆 駐 騎 騒 験 騰 驚 骨 髄 高 髪 鬼 魂 魅 魔 魚 鮮 鯨 鳥 鳴 鶏 鹿 麗 麦 麻 黄 黒 黙 鼓 鼠 鼻 齢 叱 剥 填 頬 1,908
Auxiliary 兌 拼 楔 錄 鳯 5

Kana syllabaries

Japanese uses two syllabaries: hiragana and katakana.

Katakana characters are used for foreign loan words, such as the word 'text' here:


テキスト The word text (tekisuto), written with katakana syllables.

Hiragana is used for indigenous Japanese words, such as the verb 'to be':


です The word for to be (desu), written with hiragana syllables.

as well as for grammatical endings after a word root written using kanji characters:


集まります The word for to collect (atsumarimasu), with the verb root (atsu) written using a kanji character, and the remainder in hiragana expressing the grammatical present tense.

The basic syllabary includes 5 independent vowel sounds, one nasal sound, and the rest are consonant+vowel combinations. In these lists we show hiragana (first) and katakana (second) together.
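Because the two syllabaries run in parallel, the Unicode Hiragana and Katakana blocks are laid out so that corresponding letters sit 0x60 code points apart. The following minimal sketch relies on that layout to convert hiragana to katakana; the function name hira_to_kata is hypothetical, and only the letters ぁ (U+3041) to ゖ (U+3096) are converted.

```python
HIRAGANA_FIRST = 0x3041   # ぁ
HIRAGANA_LAST = 0x3096    # ゖ
KATAKANA_OFFSET = 0x60    # distance from a hiragana letter to its katakana twin


def hira_to_kata(text):
    """Replace each hiragana letter with the matching katakana letter."""
    out = []
    for ch in text:
        cp = ord(ch)
        if HIRAGANA_FIRST <= cp <= HIRAGANA_LAST:
            out.append(chr(cp + KATAKANA_OFFSET))
        else:
            out.append(ch)  # leave kanji, punctuation, etc. untouched
    return "".join(out)


if __name__ == "__main__":
    print(hira_to_kata("てきすと"))
```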


Voiced consonants are indicated by attaching a dakuten mark (looks like a quote mark) to the unvoiced shape. Unicode provides precomposed code points for every combination of syllable+dakuten.


The ‘p’ sound is indicated in a similar way by the use of a han-dakuten (half-dakuten).


The Unicode hiragana block does contain code points for dakuten combining marks and modifiers, but these are not used in normal text.
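The relationship between the precomposed syllables and the combining marks can be seen through Unicode normalization. This small sketch uses Python's standard unicodedata module: NFD splits a voiced syllable into base plus combining dakuten (U+3099), and NFC puts a base plus combining han-dakuten (U+309A) back together.

```python
import unicodedata

# NFD decomposes the precomposed syllable が into か + combining dakuten.
decomposed = unicodedata.normalize("NFD", "が")

# NFC recomposes は + combining han-dakuten into the precomposed ぱ.
recomposed = unicodedata.normalize("NFC", "は\u309A")

if __name__ == "__main__":
    print([f"U+{ord(c):04X}" for c in decomposed])
    print(recomposed, f"U+{ord(recomposed):04X}")
```

Normal text uses the precomposed forms, so normalizing input to NFC before comparing or storing strings is a reasonable precaution.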

The basic set is completed by a number of small forms used for medial glides, foreign sounds, and gemination, and a vowel lengthener.


Small versions of や, ゆ, and よ are used to form syllables such as きゃ kya kja, きゅ kyu kju, and きょ kyo kjo.

ッ [U+30C3 KATAKANA LETTER SMALL TU] is used to lengthen a consonant sound.

ー [U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK] is used to elongate vowel sounds. This elongation is phonemically significant. It is used predominantly with katakana, but occasionally also with hiragana. [u, p. 720]

Archaic characters

A number of characters in the kana blocks are no longer used in modern text.

These characters were dropped by an orthographic reform shortly after World War 2.


Halfwidth katakana

Unicode has a set of halfwidth katakana forms for legacy encoding roundtrips. In principle, these characters should not be used. The normal, fullsized characters should be used instead.
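One way to replace halfwidth katakana with the normal characters is compatibility normalization. The sketch below uses NFKC from Python's standard unicodedata module. Be aware that NFKC also changes other compatibility characters (fullwidth Latin letters, for example), so treating it as a katakana-only fix is an assumption that may not suit every text.

```python
import unicodedata

# Halfwidth ﾃ ｷ ｽ ﾄ, written with escapes to make the code points explicit.
halfwidth_text = "\uFF83\uFF77\uFF7D\uFF84"

# Halfwidth ｶ followed by the halfwidth voiced sound mark (U+FF9E).
halfwidth_ga = "\uFF76\uFF9E"

fullwidth_text = unicodedata.normalize("NFKC", halfwidth_text)
fullwidth_ga = unicodedata.normalize("NFKC", halfwidth_ga)

if __name__ == "__main__":
    print(fullwidth_text)  # テキスト
    print(fullwidth_ga)    # ガ (a single precomposed character)
```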



CLDR 31 lists the following punctuation characters for Japanese. First the fullwidth forms of normal characters.


Then the halfwidth forms.


And finally, the other punctuation.


The katakana block contains two additional punctuation marks.


・ [U+30FB KATAKANA MIDDLE DOT] is used to separate words when writing non-Japanese phrases. [u, p. 720]

゠ [U+30A0 KATAKANA-HIRAGANA DOUBLE HYPHEN] is a delimiter occasionally used in analyzed Katakana or Hiragana textual material.

The hiragana block contains some combining and modifier characters used to represent dakuten and han-dakuten for compatibility with older systems.


The kana blocks each have two marks that are used to indicate repetition of a syllable – one for syllables with unvoiced consonants and another for voiced. The table below shows the hiragana first, then the katakana. In both cases there is a character for repetition of ordinary syllables, and one for repetition of syllables with dakuten.
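To make the behaviour of the repetition marks concrete, here is a minimal sketch that expands them back into full syllables. It assumes NFC input, handles only single-syllable repetition, and the function name expand_iteration_marks is hypothetical. The voiced marks are handled by attaching a combining dakuten (U+3099) to the unvoiced base and recomposing.

```python
import unicodedata

UNVOICED_MARKS = {"\u309D", "\u30FD"}  # ゝ (hiragana), ヽ (katakana)
VOICED_MARKS = {"\u309E", "\u30FE"}    # ゞ (hiragana), ヾ (katakana)


def expand_iteration_marks(text):
    """Replace kana iteration marks with the syllable they repeat."""
    out = []
    for ch in text:
        if (ch in UNVOICED_MARKS or ch in VOICED_MARKS) and out:
            # Strip any dakuten from the previous syllable to get its base.
            base = unicodedata.normalize("NFD", out[-1])[0]
            if ch in VOICED_MARKS:
                base = unicodedata.normalize("NFC", base + "\u3099")
            out.append(base)
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(expand_iteration_marks("すゞき"))
    print(expand_iteration_marks("こゝろ"))
```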


Unicode also has   [U+3000 IDEOGRAPHIC SPACE] for occasions where it is needed.

Structural boundaries & markers

Word boundaries

The concept of 'word' is difficult to define in any language (see What is a word?). Here, a word is a vaguely-defined, but recognisable semantic unit that is typically smaller than a phrase and may comprise one or more syllables.

Japanese rarely uses spaces. In the sample text there are gaps around punctuation, but these are produced by a lack of 'ink' in parts of the square character glyphs:

You can verify this by clicking on this example. The character list popup shows that only four characters make up this sequence, and none are spaces.
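The same check can be made programmatically. In this sketch the four-character sequence is an assumption (it is taken from Article 1 of the sample text at the top of the page, since the original example is not reproduced here). The code lists each character's Unicode name and tests whether any of them is a space separator (general category Zs).

```python
import unicodedata

# Assumed example: four characters from the sample text, including a comma.
sample = "り、かつ"

names = [unicodedata.name(ch) for ch in sample]
has_space = any(unicodedata.category(ch) == "Zs" for ch in sample)

if __name__ == "__main__":
    for ch, name in zip(sample, names):
        print(f"U+{ord(ch):04X} {name}")
    print("contains a space:", has_space)
```

The visible gap after the comma comes from the glyph of U+3001 IDEOGRAPHIC COMMA itself, not from a separate space character.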


Phrase boundaries

Japanese uses native bracket styles, commas and periods. Some of the punctuation looks like that for Latin (eg. parentheses), but the width of the punctuation is likely to include significant amounts of white space, so that punctuation characters occupy the same space as han characters.


The pronunciation of Japanese ideographic characters cannot be guessed, and so can pose difficulties for those learning the language. A common way around this problem is to annotate the text with kana characters. This is called ruby, or in its most common form furigana. The following example uses markup to indicate which is base text and which is ruby; however, not all browsers can display ruby correctly.
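As one possible illustration of such markup, the sketch below builds an HTML ruby annotation from a base string and its kana reading. The use of HTML here is an assumption about the kind of markup meant, and the helper name ruby is hypothetical. The rp elements supply fallback parentheses for browsers that cannot lay out ruby.

```python
from html import escape


def ruby(base, reading):
    """Return HTML that annotates base text with a ruby reading."""
    return (
        f"<ruby>{escape(base)}"
        f"<rp>(</rp><rt>{escape(reading)}</rt><rp>)</rp>"
        f"</ruby>"
    )


if __name__ == "__main__":
    print(ruby("漢字", "かんじ"))
```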



Japanese sometimes uses katakana characters to create visual emphasis. [u, p. 720]

Line & paragraph layout


Lines are normally wrapped between characters – word boundaries have no significance for the wrapping. Japanese line breaking should, however, take into account a few rules (kinsoku rules) which dictate which characters cannot appear at the end or start of a line.
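The following is a minimal sketch of character-based wrapping with a kinsoku-style check. The two character sets are small illustrative assumptions, not a complete kinsoku table, and the strategy shown (moving the break earlier until it is legal) is only one of several that real layout engines use. The names wrap, NO_LINE_START and NO_LINE_END are hypothetical.

```python
# Illustrative, incomplete sets (assumptions for this sketch).
NO_LINE_START = set("、。，．）」』】〕ーゝゞぁぃぅぇぉっゃゅょァィゥェォッャュョ")
NO_LINE_END = set("（「『【〔")


def _bad_break(text, i):
    """True if breaking before text[i] would violate a rule."""
    return text[i] in NO_LINE_START or text[i - 1] in NO_LINE_END


def wrap(text, width):
    """Split text into lines of at most width characters."""
    lines, start = [], 0
    while len(text) - start > width:
        end = start + width
        # Move the break point earlier until it no longer violates a rule.
        while end > start + 1 and _bad_break(text, end):
            end -= 1
        if _bad_break(text, end):
            end = start + width  # no legal break found: fall back to a hard break
        lines.append(text[start:end])
        start = end
    lines.append(text[start:])
    return lines


if __name__ == "__main__":
    for line in wrap("人間は、生まれながら", 3):
        print(line)
```

In this example a plain break after three characters would put the comma at the start of the second line, so the break is moved one character earlier instead.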

Text alignment & justification

Japanese justifies text using a complex set of rules which adjust the space between characters on a line. Some characters are adjusted before others.

Use the control below to see how your browser justifies the text sample here.



Further information needed for this section includes:

Glyph shaping & positioning
    Cursive text
    Context-based shaping
    Multiple combining characters
    Context-based positioning
    Transforming characters

Structural boundaries & markers
    Grapheme, word & phrase boundaries
    Hyphens & dashes
    Bracketing information
    Abbreviations, ellipsis, & repetition
    Emphasis & highlights
    Inline notes & annotations

Inline layout
    Inline text spacing
    Bidirectional text

Line & paragraph layout
    Text direction
    Line breaking
    Text alignment & justification
    Counters, lists, etc.
    Styling initials
    Baselines & inline alignment

Page & book layout
    General page layout & progression
    Directional layout features
    Grids & tables
    Notes, footnotes, etc.
    Forms & user interaction
    Page numbering, running headers, etc.


  1. [w] Wikipedia, Japanese Grammar.
  2. [u] The Unicode Standard, pp. 720-723.
Last changed 2019-02-10 17:29 GMT.  •  Licence CC-By © r12a.