Making a keyboard layout


You can see a previous year's version of this page here: Spring 2017/Making a keyboard layout

Designing a layout

Common approaches for input

Normally, a keyboard gives access to two "levels" of symbols. For example, the A key might allow one to input the characters ‹a› and (using Shift) ‹A›. Once numbers, punctuation, and other useful symbols are accounted for, there is not a lot of room on a keyboard for the standard set of characters used in a language, particularly if that set is much larger than 26 characters. Several strategies are used to make more characters available:

  • AltGr. When held down, this key allows access to a third and fourth level of keys.
  • Compose. The two (or more) keystrokes following pressing this key determine what character is input. For example, Compose followed by a followed by ' (or vice versa) might be a way to input ‹á›.
  • Dead keys. Using a dead key is like collapsing the first two keystrokes of a Compose combination into one. So for example, ' followed by a might be a way to input ‹á›. To get ‹'›, you would need to type ' followed by either a character that can't be combined with ‹◌́› or a special key that "neutralises" the dead key, like space (so to get ‹'› followed by a space, you'd need to press space twice). Note some differences in how different systems treat dead keys: IBus displays ‹'› unless something it can combine with is subsequently entered, but Windows has traditionally needed a neutralising key.
  • Combining diacritical marks. A combining diacritical mark is a separate character that follows a base character and functions as a diacritic on it. For example, ‹a› followed by ‹◌́› is ‹á›, which is represented as U+0061 U+0301, whereas the single character that looks the same, ‹á›, is represented as U+00E1. The ‹◌́› character may be mapped to any key, potentially even with AltGr access to it.
  • Key combinations / Many-character. There exist a variety of ways to input long words in non-alphabetic orthographies. This can be seen as similar to the "dead keys" approach.
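The difference between a combining sequence and a single precomposed character can be checked directly. The following is a minimal Python sketch using the standard unicodedata module; the helper name code_points is made up for this illustration.

```python
import unicodedata

decomposed = "a\u0301"   # U+0061 followed by the combining acute accent U+0301
precomposed = "\u00e1"   # the single character U+00E1


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# The two strings look the same on screen but are not equal as strings.
looks_same_but_equal = decomposed == precomposed

# Normalisation converts between the two representations.
composed_form = unicodedata.normalize("NFC", decomposed)     # one code point
decomposed_form = unicodedata.normalize("NFD", precomposed)  # two code points

print(code_points(decomposed), "vs", code_points(precomposed))
```

This matters when testing a layout: a key mapped to the combining mark produces the two-character sequence, while a dead key or Compose combination would typically produce the single character.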

What to base it on

What layout should you base a new keyboard layout on?

  • The keyboard of a similar script
  • A regionally dominant keyboard
  • U.S. English

There are some issues to keep in mind:

  • Do you want to send the message "your script is just a variety of X colonial script"?
  • What is easiest for the speakers? / What might they be familiar with already?
  • What do the speakers have access to? / What technology might they be using already?
  • Could a few characters be added that allow the speakers of this language to input another major language (of the area / world)?

For assignments, of course, you may choose to just make an input method that is easier for you, e.g. based on the U.S. English layout. But make sure you explain your decision.


  • Which of these xkb keyboard layouts do you think Kazakh speakers in Kazakhstan prefer to use to type Kazakh, and why? Note that almost all Kazakh speakers in Kazakhstan also speak Russian (but not all Russian speakers in Kazakhstan also speak Kazakh).
    • Kazakh - available in Windows for at least 15 years or so
    • Kazakh (with Russian)
    • Russian - the modern standard Russian keyboard layout
    • Russian (Kazakhstan, with Kazakh)

  • Here's a basic Spanish layout based on a U.S. English layout:

[keyboard layout table]

(Dead keys are in green)

Combinations of inputs that result in other inputs include the following:

  • ◌́ a = á
  • ◌́ e = é
  • ◌́ i = í
  • ◌́ o = ó
  • ◌́ u = ú
  • ◌̈ u = ü
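The combinations listed above can be modelled as a small state machine in which a dead key waits for the next keystroke. The Python sketch below is only an illustration, not how any input framework implements it. It assumes (this is not specified above) that a neutralised dead key produces the spacing accents ´ and ¨, and the names SPACING, COMBINATIONS, and type_keys are hypothetical.

```python
import unicodedata

ACUTE = "\u0301"      # combining acute accent, standing in for the acute dead key
DIAERESIS = "\u0308"  # combining diaeresis, standing in for the diaeresis dead key

# Assumption: what a dead key produces when nothing combines with it.
SPACING = {ACUTE: "\u00b4", DIAERESIS: "\u00a8"}

# Only the six combinations listed above; results are built by NFC composition.
PAIRS = [(ACUTE, vowel) for vowel in "aeiou"] + [(DIAERESIS, "u")]
COMBINATIONS = {
    (mark, base): unicodedata.normalize("NFC", base + mark) for mark, base in PAIRS
}


def type_keys(keys):
    """Simulate typing a sequence of keys, where dead keys are given as marks."""
    output = []
    pending = None  # the dead key waiting for its next keystroke, if any
    for key in keys:
        if pending is None:
            if key in SPACING:
                pending = key          # dead key pressed: output nothing yet
            else:
                output.append(key)
            continue
        dead, pending = pending, None
        if (dead, key) in COMBINATIONS:
            output.append(COMBINATIONS[(dead, key)])
        elif key == " ":
            output.append(SPACING[dead])        # space neutralises the dead key
        elif key in SPACING:
            output.append(SPACING[dead])        # a second dead key starts over
            pending = key
        else:
            output.append(SPACING[dead] + key)  # nothing to combine with
    if pending is not None:
        output.append(SPACING[pending])
    return "".join(output)


print(type_keys(["p", "i", "n", "g", DIAERESIS, "u", "i", "n", "o"]))
```

Under these assumptions, pressing a dead key and then space yields only the accent, so a second space is needed to get the accent followed by a space.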

Can you think of any issues with this layout?

Implementing a layout

Different operating systems have different input frameworks. Some of them are easier to make a new layout for than others.

In GNU/Linux, the most common frameworks are xkb and IBus. The former is basically the default for all graphical session managers, and the latter depends on it to some extent. Unfortunately, xkb doesn't support as many of the tricks mentioned above, and crucially, new layouts cannot be installed (or really even tested easily) without admin access to the machine. So I usually suggest using IBus (more specifically, its m17n module), which doesn't have these limitations, for keyboard layouts on Linux workstations.

See Making a keyboard layout using IBus for information on how to design an IBus layout. You can see 2017's keyboard layout assignment for information on how to make an xkb layout.

There is also information about developing keyboards for other operating systems (for Windows and macOS, and to some extent Android and iOS).

The assignment

Create a keyboard layout for your language for the operating system you primarily use for this class. For Linux, use IBus, or xkb if you have root access on your system. For Windows use MSKLC and for macOS use Ukelele. See instructions linked to above.

This assignment is due at midnight (wherever) at the end of the day on Friday (this semester February 26th, 2021—though technically midnight at the beginning of Saturday).

Where to submit

You will submit this assignment in a new git repository on titled ling073-xyz-keyboard, where xyz is the three-letter ISO code of your language. Create the repo under the Ling073_sp21 group, and set all members of your group as contributors of the repo.

You will also be asked to put some things on a new wiki page titled Language/Keyboard, where Language is the name of your language (the Language page itself should already exist). Add the category Sp21_KeyboardLayouts to the new page, as well as a category for your language.

You should also adjust your original Language page: move all the documented resources under an "External resources" section (you'll probably want to add an extra layer of =s to your current headings), create a new "Developed resources" section, and add a link to both the keyboard page on the wiki and the keyboard repository in git.

Background Research

To the wiki page for your keyboard, add a section on existing resources. List any keyboard layouts that already exist for your language. Check every operating system you can think of: GNU/Linux, Windows, and macOS on computers you have access to, Android or iOS on a phone or tablet, and anything else that comes up in a web search. You don't have to test everything you find (especially if you don't have access to a device with the relevant OS), but do list everything you find.

You may find that some keyboard layouts are not included in the standard OS, but are available for download or through an app store. There may also be old "fake" layouts, in the form of a font that replaces unused character points with characters of the language. Try searching in your language or a larger regional language if you can.

Does there seem to be a standard layout, available across platforms? Briefly discuss the similarities and differences between the layouts you find, and what you think are strengths or weaknesses of these particular layouts. Is there any discussion anywhere about what people who type in this language might use?

Your layout

If there are existing layouts, you have the option of choosing one of them. If there are none or you decide not to use one, for ease of use, it might make sense to base the layout on something likely to be common in the area. However, this can fail in several different ways, as seen with the Kazakh example above:

  • It could seem like the less-well-resourced language is being thought of as "derived from" or a "variant of" the better-resourced language.
  • It could become difficult to type on the new layout because of awkward placement of symbols:
    • "extra" symbols placed on, e.g., the number row
    • "extra" symbols placed on the third level (AltGr)
    • characters of the original layout are replaced with symbols of the new layout (and removed or rearranged)
    • symbols of the original layout are rearranged (!)
  • While it could be useful to maintain support on the same keyboard layout for other languages speakers might use (e.g., a larger language in the area or English), it isn't necessarily crucial: speakers can always just use an alternate keyboard layout.

Document your justification for each decision you made, whether it relates to existing keyboards for the language or the modification of some other layout. Include this in a file entitled README in your new repository, and in a section called "Justification" on the wiki page (see #Where to submit above). In the README file, include a link to the keyboard page on the wiki.

Prominently document your chosen layout on the wiki page, similar to the Spanish example above. To be clear, do not simply post a screenshot; use the wiki markup in the same way as I did on the wiki (you can click "Edit" on that section and copy-paste to get started). Be sure to make it clear what's a dead key or combining diacritical mark, and list any combinations of inputs that result in other inputs.

Development cycle

For IBus m17n layouts, the steps to develop a keyboard are generally something like this:

  1. Make a copy of an existing layout (.mim file) from /usr/share/m17n/, name it something else (e.g., the 3-letter ISO-639 code for your language), and put it in ~/.m17n.d/.
  2. Modify the new file as needed for your language.
  3. Restart IBus (right click on status bar icon, click "restart").
  4. Make more modifications and restart again to continue testing.

For more information, see the page on making a keyboard layout using IBus.
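Step 1 of the cycle above can be scripted. The following is a rough Python sketch, assuming the directories named in that step; the function name copy_layout and the file names in the usage note are hypothetical, and the sketch does nothing beyond copying (editing the file and restarting IBus remain manual).

```python
from pathlib import Path
import shutil

SYSTEM_DIR = Path("/usr/share/m17n")  # where existing .mim layouts live (step 1)
USER_DIR = Path.home() / ".m17n.d"    # per-user layout directory (step 1)


def copy_layout(source_file, new_name, user_dir=USER_DIR):
    """Copy an existing .mim file into user_dir as <new_name>.mim.

    Refuses to overwrite an existing file so earlier work is not lost.
    """
    source = Path(source_file)
    if source.suffix != ".mim":
        raise ValueError(f"not a .mim file: {source}")
    user_dir = Path(user_dir)
    user_dir.mkdir(parents=True, exist_ok=True)
    target = user_dir / f"{new_name}.mim"
    if target.exists():
        raise FileExistsError(f"layout already exists: {target}")
    shutil.copyfile(source, target)
    return target
```

A call might look like copy_layout(SYSTEM_DIR / "some-existing-layout.mim", "xyz"), where both the source file name and the code xyz are placeholders for whatever layout and ISO code you are actually working with.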

Other things to document

  • Document the steps needed to install your keyboard layout. Write up the steps in an INSTALL file in the repository and under an "Installation" section on the wiki page (write it up in one place and transfer it over—i.e., try to keep them in sync).
  • Add an AUTHORS file in the repository and put your name(s) (or alias(es)) and email address(es) there.
  • Decide on an open-source license for your keyboard layout and include the text of the license in a LICENSE file in the repo. Add a mention of how it's licensed on the wiki page.
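Before submitting, it may help to check that the repository contains the files this assignment asks for. The Python sketch below assumes the files are named exactly README, INSTALL, AUTHORS, and LICENSE with no extension; the function name missing_files is hypothetical.

```python
from pathlib import Path

# File names as given in the assignment text (assumed to have no extension).
REQUIRED_FILES = ("README", "INSTALL", "AUTHORS", "LICENSE")


def missing_files(repo_dir):
    """Return the required file names that are not present in repo_dir."""
    repo = Path(repo_dir)
    return [name for name in REQUIRED_FILES if not (repo / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Run from the top of the repository, this only checks that the files exist; it says nothing about whether their contents are complete.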

Extra steps

If designing your keyboard layout involves only minor changes to an existing layout (such as adding one or two letters to 3rd-level positions or a dead key or two) or if a layout for your language already exists for your platform, then also do one of the following:

  • Make a "transcription" layout that includes most phonetic symbols needed to transcribe your language. This should include both IPA symbols and any other symbols commonly used in academic transcriptions of the language.
  • Set up some graphic Compose combinations for your language. E.g., for Kazakh, ‹-› could compose with ‹о›, ‹ү›, and ‹г› to result in ‹ө›, ‹ұ›, and ‹ғ›, respectively, while ‹,› could compose with ‹к› and ‹н› to result in ‹қ› and ‹ң›, respectively. (This leaves only three more letters in Kazakh that aren't in Russian too: ‹ә›, ‹і›, ‹һ›.) Don't forget upper-case characters too (if relevant)!
  • Make your keyboard capable of being used to type not just your language, but a range of related languages, or other languages spoken in the area, or other languages of cultural import to the community. You can do this in part using the compose key idea above, so that with the Kazakh example above, ‹,› could combine with ‹х› to create the Uzbek and Tajik letter ‹ҳ›, with ‹ч› for the Tajik letter ‹ҷ›, with ‹с› and ‹з› for the Bashqort and Chuvash letter ‹ҫ› and the Bashqort letter ‹ҙ›. You could also do something similar with dead keys or combining diacritical marks, which are like using Compose combinations except that you don't need Compose (though dead key strategies aren't easily implemented in xkb for non-Latin orthographies). You may try another input option, such as IBus, for things like this, though that requires somewhat more effort. If you just use xkb, make sure you support the full orthography of at least two other languages besides your language and the larger language whose keyboard you're basing this on.
  • Make (an) additional keyboard layout(s) for (an)other orthography / orthographies that the language is written in, e.g. a script in use in another country, or a script that's no longer used.
  • If you want to make a completely different layout for your language (e.g., a Russian keyboard based on Zhuyin fuhao, or something based on character frequency for the language, like the Dvorak layout for English) and can justify it, by all means do that instead of something else here—but check with me first.
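The Kazakh Compose combinations mentioned in the list above can be written down as a lookup table before being implemented in any particular framework. The Python sketch below is only a planning aid, not xkb or IBus syntax; the names LOWER, with_upper_case, and compose are hypothetical. It relies on Python's str.upper() to derive the upper-case pairs, which works for these Cyrillic letters.

```python
# Lower-case pairs from the Kazakh example: (first key, second key) -> result.
# Escapes are used so that Cyrillic letters are not confused with Latin lookalikes.
LOWER = {
    ("-", "\u043e"): "\u04e9",  # - then о gives ө
    ("-", "\u04af"): "\u04b1",  # - then ү gives ұ
    ("-", "\u0433"): "\u0493",  # - then г gives ғ
    (",", "\u043a"): "\u049b",  # , then к gives қ
    (",", "\u043d"): "\u04a3",  # , then н gives ң
}


def with_upper_case(table):
    """Return a copy of table that also contains the upper-case pairs."""
    full = dict(table)
    for (mark, base), result in table.items():
        full[(mark, base.upper())] = result.upper()
    return full


COMPOSE = with_upper_case(LOWER)


def compose(first, second, table=COMPOSE):
    """Return the composed character, or None if the pair is not defined."""
    return table.get((first, second))


for (mark, base), result in sorted(COMPOSE.items()):
    print(f"Compose {mark} {base} = {result}")
```

Generating the upper-case entries from the lower-case ones is one way to avoid forgetting them; the printed list can then be used as a checklist when writing the actual layout file.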

Add what you did to the README file and the "Justification" section on the wiki.