Khmer picker notes/help

Updated 20-Feb-2018 • tags apps, pickers, khmer

This Unicode character picker allows you to produce or analyse runs of Khmer text. Character pickers are especially useful for people who don't know a script well, as characters are displayed in ways that aid identification.

If something is broken or missing, raise an issue. For version information, see the GitHub commit list.

About the chart

The default panel includes all the characters in the Unicode Khmer and Khmer Symbols blocks.
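As a rough illustration of what those two blocks cover, the following Python sketch reports the code point, Unicode name and block of each character in a string. The block ranges (U+1780 to U+17FF for Khmer, U+19E0 to U+19FF for Khmer Symbols) are those of the Unicode Standard; the names khmer_block and describe are made up for this example and are not part of the picker.

```python
import unicodedata

# Block ranges as defined by the Unicode Standard.
KHMER_BLOCKS = {
    "Khmer": (0x1780, 0x17FF),
    "Khmer Symbols": (0x19E0, 0x19FF),
}

def khmer_block(ch):
    """Return the name of the Khmer block containing ch, or None."""
    cp = ord(ch)
    for name, (low, high) in KHMER_BLOCKS.items():
        if low <= cp <= high:
            return name
    return None

def describe(text):
    """List (code point, Unicode name, block) for every character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), khmer_block(ch))
        for ch in text
    ]

if __name__ == "__main__":
    for row in describe("\u1780\u17B6"):  # KA followed by vowel sign AA
        print(row)
```

Characters outside both blocks (spaces, Latin letters, and so on) are reported with a block of None rather than being dropped.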

Controls above the text area

Transliterate. Converts the contents of the text area to a Latin transliteration based on a cross between the ALA-LC and UNGEGN systems, with a few small additions to help eliminate ambiguity. In addition, the transliterations show inherent vowels.

A transliteration provides a one-to-one correspondence between each Khmer symbol and a Latin symbol, so that the process is reversible. It doesn't represent the actual pronunciation very well, but it does allow you to convert the Latin text back to the original Khmer. (Transcriptions to UNGEGN or IPA provide a Latin version of the Khmer text that is closer to the actual pronunciation, but they are not reversible.)
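The reversibility property can be sketched in Python as follows. The table below is a hypothetical four-entry mapping invented for this example; it is not the picker's actual transliteration scheme. The point is only that a table with unique Latin values can be inverted, so a round trip returns the original text.

```python
# Hypothetical mapping for illustration only: NOT the picker's real table.
KHMER_TO_LATIN = {
    "\u1780": "k",       # KHMER LETTER KA
    "\u1793": "n",       # KHMER LETTER NO
    "\u17B6": "\u0101",  # KHMER VOWEL SIGN AA -> a with macron
    "\u17B7": "i",       # KHMER VOWEL SIGN I
}

# Inverting the table only works if every Latin value is unique.
LATIN_TO_KHMER = {latin: khmer for khmer, latin in KHMER_TO_LATIN.items()}
if len(LATIN_TO_KHMER) != len(KHMER_TO_LATIN):
    raise ValueError("mapping is not one-to-one, so it cannot be reversed")

def transliterate(text):
    """Replace each mapped Khmer character; leave anything else unchanged."""
    return "".join(KHMER_TO_LATIN.get(ch, ch) for ch in text)

def reverse(text):
    """Map each Latin symbol back to its Khmer character."""
    return "".join(LATIN_TO_KHMER.get(ch, ch) for ch in text)

if __name__ == "__main__":
    khmer = "\u1780\u17B6\u1793\u17B7"
    latin = transliterate(khmer)
    print(latin, reverse(latin) == khmer)
```

Because unmapped characters pass through unchanged, the round trip in this sketch is only guaranteed when the input does not already contain any of the Latin symbols used as values.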

Khmer to UNGEGN. This is still in development! Converts Khmer text into a transcription along the lines of the UNGEGN system. It still needs a lot of work.

Show syllables. This control attempts to split the Khmer text into syllables, so that an IPA transcription can be attempted. It is still in the prototype stage.
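The picker's own splitting code is not reproduced here. As a loose approximation of the orthographic side of the problem, the Python sketch below splits text into orthographic clusters, assuming that a cluster is a consonant or independent vowel followed by any mix of subjoined consonants (COENG plus a letter), dependent vowels and other signs. The pattern and the function name orthographic_clusters are assumptions made for this example.

```python
import re

# One orthographic cluster, as assumed for this sketch:
#   a base letter (consonant or independent vowel, U+1780-U+17B3),
#   followed by any mix of
#     - COENG (U+17D2) plus a letter, i.e. a subjoined consonant, and
#     - dependent vowels and other signs (U+17B4-U+17D1, U+17DD).
CLUSTER = re.compile(
    "[\u1780-\u17B3]"
    "(?:\u17D2[\u1780-\u17B3]|[\u17B4-\u17D1\u17DD])*"
)

def orthographic_clusters(text):
    """Return the orthographic clusters found in text.

    Characters that cannot start a cluster (digits, punctuation,
    spaces, non-Khmer text) are skipped.
    """
    return [match.group(0) for match in CLUSTER.finditer(text)]

if __name__ == "__main__":
    print(" ".join(orthographic_clusters("ប្រកាន់និទៀន")))
```

For the word ប្រកាន់និទៀន this yields six clusters: ប្រ, កា, ន់, និ, ទៀ and ន. The third and sixth are syllable-final consonants, which a purely orthographic split cannot tell apart from consonants carrying an inherent vowel. That is the kind of ambiguity that has to be resolved by hand.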

The algorithm for IPA conversion works on phonetic syllables rather than orthographic ones, so to produce an automatic IPA transcription (see below) you'll also need to do the following:

  1. Remove any space that separates a syllable-final consonant from the rest of its syllable; otherwise the consonant will be regarded as having an inherent vowel. There will probably always be problems distinguishing these consonants, due to the ambiguities of the Khmer writing system. Simply delete the space between the two parts of the syllable.
  2. Because phonetic rules for multisyllabic words need to know what makes up the word, you'll need to replace the space with a hyphen between syllables that make up a single word.

Khmer to IPA. Produces an output that is intended to approximately reflect actual pronunciation. It uses the rules in Franklin Huffman's Cambodian System of Writing. However, it needs some assistance from the user. This is because Khmer doesn't use spaces between words, and it is often ambiguous whether a consonant represents a syllable-final sound or a syllable in its own right. It also needs help to identify unstressed syllables. You should find that the Show syllables control helps to prepare the text (see above).

After the first syllable on the line, put an ordinary space before each consonant or independent vowel sign that begins a new syllable (not word). (Note that this may split consonant clusters. The Khmer text will look strange, but it will still work.) You should also indicate unstressed syllables by following the syllable with a hyphen rather than a space. For many bisyllabic words, this means putting a hyphen after the first of the two syllables. For example, converting ប្រកាន់និទៀន to ប្រ-កាន់ និ-ទៀន will produce the transcription prɑkannitiən. If you don't know Khmer well enough to know when a syllable is unstressed, you can still get an approximation to the pronunciation using only spaces. For instance, the previous example separated by spaces only will yield prɑːkanniʔtiən.
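To make the space and hyphen convention concrete, here is a minimal Python sketch of how such marked-up text could be read: each run of characters is a syllable, and a syllable is treated as unstressed when it is immediately followed by a hyphen. The function name parse_marked_text and the (syllable, stressed) tuple format are assumptions made for this example; the sketch does not attempt the IPA conversion itself.

```python
import re

# A syllable is a run of characters containing no whitespace and no hyphen,
# optionally followed by the hyphen that marks it as unstressed.
SYLLABLE = re.compile(r"([^\s-]+)(-?)")

def parse_marked_text(text):
    """Return a list of (syllable, stressed) pairs from marked-up text."""
    return [
        (match.group(1), match.group(2) != "-")
        for match in SYLLABLE.finditer(text)
    ]

if __name__ == "__main__":
    for syllable, stressed in parse_marked_text("ប្រ-កាន់ និ-ទៀន"):
        print(syllable, "stressed" if stressed else "unstressed")
```

With the example above, the first and third syllables come out as unstressed and the second and fourth as stressed. With spaces only, every syllable would be treated as stressed.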

Although the transcription is based on rules by Franklin Huffman in Cambodian System of Writing, some symbols are changed to be more recognizable to those familiar with IPA. The transcription rules are quite detailed and Khmer is largely regular, but a few exceptions (particularly in words from Sanskrit or Pali) and ambiguities (for example, in a few independent vowel signs) cause problems for the transcription. The transcription is non-reversible. I created it to help me quickly reproduce (simple) phonetic alternatives for examples in my notes on Khmer.

Remove space/hyphen. Removes the spaces from the highlighted range (or the whole output area, if nothing is highlighted).
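The behaviour can be sketched in Python as below. This assumes, going by the control's name, that hyphens are removed along with ordinary spaces, and that an empty or missing selection means the whole text. The function name remove_separators and its start and end parameters are hypothetical.

```python
def remove_separators(text, start=None, end=None):
    """Strip spaces and hyphens from text[start:end].

    If no range is given, or the range is empty, the whole text is
    processed. (Assumption: hyphens are removed as well as spaces.)
    """
    if start is None or end is None or start == end:
        start, end = 0, len(text)
    cleaned = text[start:end].replace(" ", "").replace("-", "")
    return text[:start] + cleaned + text[end:]

if __name__ == "__main__":
    marked = "ប្រ-កាន់ និ-ទៀន"
    print(remove_separators(marked))
```

Text outside the selected range is left untouched, so the markup in the rest of the output area is preserved.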

Input aids

Huffman transcription. Displays a panel that allows you to generate Khmer text from a transcription as used by Huffman in Cambodian System of Writing. Where there are multiple possible choices, these choices are presented in a small pop-up box; click on the choice you want to insert it into the output area.

A hyphen in a selection list in this panel or in the following transcription panel indicates that the sound is produced without a Khmer character, i.e. by the inherent vowel.

In a small number of cases, you will need to click twice on the components that make up the sound (e.g. when bantoc is used on the following consonant). These cases are indicated by a red plus sign between two clickable shapes (one of which may be just a hyphen). You need to click on the item to the left of the plus sign, then add a consonant, then click on the item to the right of the plus sign. In several cases the item to the left is a hyphen (representing the inherent vowel), in which case just add another consonant followed by the item to the right.
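The order of operations for these two-part sounds can be modelled with a short Python sketch. It is only an illustration of the click sequence described above, not the picker's code; the function name insert_split_sound and the use of the string "-" for the inherent vowel are assumptions.

```python
INHERENT_VOWEL = "-"  # shown as a hyphen in the panel; inserts no character

def insert_split_sound(output, left, consonant, right):
    """Append a two-part sound to output in click order.

    left and right are the items on either side of the plus sign;
    consonant is the consonant added between the two clicks. A hyphen
    on either side stands for the inherent vowel and adds nothing.
    """
    first = "" if left == INHERENT_VOWEL else left
    second = "" if right == INHERENT_VOWEL else right
    return output + first + consonant + second

if __name__ == "__main__":
    # Inherent vowel, then the consonant NO, then BANTOC on that consonant.
    print(insert_split_sound("\u1780", "-", "\u1793", "\u17CB"))
```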

Gilbert transcription. Displays a panel that allows you to generate Khmer text from a transcription as used by Gilbert and Hang in Cambodian for Beginners.

See also the notes for the Huffman transcription mentioned just above.

Copyright Licence CC-By.