This chapter explains the character sets used by Emacs for input commands and for the contents of files, and also explains the concepts of keys and commands, which are fundamental for understanding how Emacs interprets your keyboard and mouse input.
GNU Emacs uses an extension of the ASCII character set for keyboard input; it also accepts non-character input events including function keys and mouse button actions.
ASCII consists of 128 character codes. Some of these codes are assigned graphic symbols such as `a' and `='; the rest are control characters, such as Control-a (usually written C-a for short). C-a gets its name from the fact that you type it by holding down the CTRL key while pressing a.
Some ASCII control characters have special names, and most terminals have special keys you can type them with: for example, RET, TAB, DEL and ESC. The space character is usually referred to below as SPC, even though strictly speaking it is a graphic character whose graphic happens to be blank. Some keyboards have a key labeled "linefeed" which is an alias for C-j.
Emacs extends the ASCII character set with thousands more printing characters (see section International Character Set Support), additional control characters, and a few more modifiers that can be combined with any character.
On ASCII terminals, there are only 32 possible control characters. These are the control variants of letters and `@[]\^_'. In addition, the shift key is meaningless with control characters: C-a and C-A are the same character, and Emacs cannot distinguish them.
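The arithmetic behind these 32 control characters can be sketched in a few lines of Python. This is an illustration only, not Emacs code: the helper name `ascii_control` is invented for the example, and it relies on the standard ASCII convention that a control character keeps only the low five bits of the base character's code.

```python
def ascii_control(ch: str) -> int:
    """Return the ASCII code of the control variant of CH (hypothetical helper)."""
    # Keeping only the low five bits maps 'a' (0x61) and 'A' (0x41) to the same code,
    # which is why an ASCII terminal cannot tell C-a from C-A.
    return ord(ch) & 0x1F

# The letters plus the six punctuation characters named in the text.
bases = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "@[\\]^_"
control_codes = {ascii_control(c) for c in bases}

print(len(control_codes))                         # 32
print(ascii_control("a") == ascii_control("A"))   # True
print(ascii_control("j"))                         # 10, the linefeed character
```

The same calculation shows why "linefeed" is an alias for C-j: both are character code 10.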
But the Emacs character set has room for control variants of all printing characters, and for distinguishing between C-a and C-A. X Windows makes it possible to enter all these characters. For example, C-- (that's Control-Minus) and C-5 are meaningful Emacs commands under X.
Another Emacs character-set extension is additional modifier bits. Only one modifier bit is commonly used; it is called Meta. Every character has a Meta variant; examples include Meta-a (normally written M-a, for short), M-A (not the same character as M-a, but those two characters normally have the same meaning in Emacs), M-RET, and M-C-a. For reasons of tradition, we usually write C-M-a rather than M-C-a; logically speaking, the order in which the modifier keys CTRL and META are mentioned does not matter.
Some terminals have a META key, and allow you to type Meta characters by holding this key down. Thus, Meta-a is typed by holding down META and pressing a. The META key works much like the SHIFT key. Such a key is not always labeled META, however, as this function is often a special option for a key with some other primary purpose.
If there is no META key, you can still type Meta characters using two-character sequences starting with ESC. Thus, to enter M-a, you could type ESC a. To enter C-M-a, you would type ESC C-a. ESC is allowed on terminals with META keys, too, in case you have formed a habit of using it.
X Windows provides several other modifier keys that can be applied to any input character. These are called SUPER, HYPER and ALT. We write `s-', `H-' and `A-' to say that a character uses these modifiers. Thus, s-H-C-x is short for Super-Hyper-Control-x. Not all X terminals actually provide keys for these modifier flags--in fact, many terminals have a key labeled ALT which is really a META key. The standard key bindings of Emacs do not include any characters with these modifiers. But you can assign them meanings of your own by customizing Emacs.
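The ESC convention for typing Meta characters without a META key can be sketched as follows. This is a toy translation written in Python, not part of Emacs; the function name `as_terminal_input` is hypothetical, and the sketch handles only the C- and M- modifiers applied to characters that have an ASCII control variant.

```python
ESC = "\x1b"  # ASCII code 27

# Base characters that have an ASCII control variant.
CONTROL_BASES = "ABCDEFGHIJKLMNOPQRSTUVWXYZ@[\\]^_"

def as_terminal_input(key: str) -> str:
    """Return the characters to type for KEY on a terminal with no META key.

    KEY uses the manual's notation, for example "M-a" or "C-M-a".
    """
    modifiers = set()
    # Peel off leading "C-" and "M-" prefixes; what remains is the base character.
    while len(key) > 2 and key[0] in "CM" and key[1] == "-":
        modifiers.add(key[0])
        key = key[2:]
    if len(key) != 1:
        raise ValueError("expected a single base character")
    if "C" in modifiers:
        if key.upper() not in CONTROL_BASES:
            raise ValueError("no ASCII control variant for this character")
        key = chr(ord(key) & 0x1F)   # ASCII control variant
    if "M" in modifiers:
        key = ESC + key              # Meta becomes an ESC prefix
    return key

print(as_terminal_input("M-a") == ESC + "a")        # True: ESC a
print(as_terminal_input("C-M-a") == ESC + "\x01")   # True: ESC C-a
print(as_terminal_input("M-C-a") == ESC + "\x01")   # True: modifier order does not matter
```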
Keyboard input includes keyboard keys that are not characters at all: for example function keys and arrow keys. Mouse buttons are also outside the gamut of characters. You can modify these events with the modifier keys CTRL, META, SUPER, HYPER and ALT, just like keyboard characters.
Input characters and non-character inputs are collectively called input events. See section `Input Events' in The Emacs Lisp Reference Manual, for more information. If you are not doing Lisp programming, but simply want to redefine the meaning of some characters or non-character events, see section Customization.
ASCII terminals cannot really send anything to the computer except ASCII characters. These terminals use a sequence of characters to represent each function key. But that is invisible to the Emacs user, because the keyboard input routines recognize these special sequences and convert them to function key events before any other part of Emacs gets to see them.
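The conversion from character sequences to function key events can be pictured with a small Python sketch. It is not the Emacs input routine: the table below is an assumption covering only the arrow keys of a VT100-style terminal, and other terminals send different sequences.

```python
# Assumed sequences for a VT100-style terminal; real terminals vary.
FUNCTION_KEY_SEQUENCES = {
    "\x1b[A": "UP",
    "\x1b[B": "DOWN",
    "\x1b[C": "RIGHT",
    "\x1b[D": "LEFT",
}

def decode_input(chars):
    """Split raw terminal characters into events, recognizing function keys."""
    events = []
    i = 0
    while i < len(chars):
        for seq, name in FUNCTION_KEY_SEQUENCES.items():
            if chars.startswith(seq, i):
                events.append(name)      # a whole sequence becomes one event
                i += len(seq)
                break
        else:
            events.append(chars[i])      # an ordinary character is its own event
            i += 1
    return events

print(decode_input("a\x1b[Bb"))   # ['a', 'DOWN', 'b']
```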
A key sequence (key, for short) is a sequence of input events that are meaningful as a unit--as "a single command." Some Emacs command sequences are just one character or one event; for example, just C-f is enough to move forward one character. But Emacs also has commands that take two or more events to invoke.
If a sequence of events is enough to invoke a command, it is a complete key. Examples of complete keys include C-a, X, RET, NEXT (a function key), DOWN (an arrow key), C-x C-f, and C-x 4 C-f. If it isn't long enough to be complete, we call it a prefix key. The above examples show that C-x and C-x 4 are prefix keys. Every key sequence is either a complete key or a prefix key.
Most single characters constitute complete keys in the standard Emacs command bindings. A few of them are prefix keys. A prefix key combines with the following input event to make a longer key sequence, which may itself be complete or a prefix. For example, C-x is a prefix key, so C-x and the next input event combine to make a two-character key sequence. Most of these key sequences are complete keys, including C-x C-f and C-x b. A few, such as C-x 4 and C-x r, are themselves prefix keys that lead to three-character key sequences. There's no limit to the length of a key sequence, but in practice people rarely use sequences longer than four events.
By contrast, you can't add more events onto a complete key. For example, the two-character sequence C-f C-k is not a key, because the C-f is a complete key in itself. It's impossible to give C-f C-k an independent meaning as a command. C-f C-k is two key sequences, not one.
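The distinction between complete keys, prefix keys, and sequences that are not keys at all can be modeled with nested dictionaries. The following Python sketch is a simplified picture, not the data structure Emacs uses; the name `classify` and the tiny keymap are made up for the example, though the bindings shown match the standard ones.

```python
# A toy keymap: a prefix key maps to a nested dict, a complete key to a command name.
KEYMAP = {
    "C-f": "forward-char",
    "C-k": "kill-line",
    "C-x": {
        "C-f": "find-file",
        "b": "switch-to-buffer",
        "4": {"C-f": "find-file-other-window"},
    },
}

def classify(events):
    """Classify a list of events as 'complete', 'prefix', or 'not a key'."""
    binding = KEYMAP
    for event in events:
        # Once a complete key has been reached, no further event can extend it.
        if not isinstance(binding, dict) or event not in binding:
            return "not a key"
        binding = binding[event]
    return "prefix" if isinstance(binding, dict) else "complete"

print(classify(["C-f"]))                # complete
print(classify(["C-x"]))                # prefix
print(classify(["C-x", "4"]))           # prefix
print(classify(["C-x", "4", "C-f"]))    # complete
print(classify(["C-f", "C-k"]))         # not a key
```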
All told, the prefix keys in Emacs are C-c, C-h, C-x, C-x RET, C-x @, C-x a, C-x n, C-x r, C-x v, C-x 4, C-x 5, C-x 6, ESC, M-g and M-j. But this list is not cast in concrete; it is just a matter of Emacs's standard key bindings. If you customize Emacs, you can make new prefix keys, or eliminate these. See section Customizing Key Bindings.
If you do make or eliminate prefix keys, that changes the set of possible key sequences. For example, if you redefine C-f as a prefix, C-f C-k automatically becomes a key (complete, unless you define it too as a prefix). Conversely, if you remove the prefix definition of C-x 4, then C-x 4 f (or C-x 4 anything) is no longer a key.
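The effect of making or eliminating a prefix key can be shown with the same kind of nested-dictionary model. This Python sketch is only an analogy: `define_key`, `lookup` and `my-command` are hypothetical names, and the toy `define_key` silently turns C-f into a prefix, which does not model how Emacs itself handles such a request.

```python
def define_key(keymap, events, command):
    """Bind the event sequence EVENTS to COMMAND, creating prefix keys as needed."""
    *prefix, last = events
    for event in prefix:
        # Anything that is not already a prefix is replaced by a fresh prefix map.
        if not isinstance(keymap.get(event), dict):
            keymap[event] = {}
        keymap = keymap[event]
    keymap[last] = command

def lookup(keymap, events):
    """Return the binding of EVENTS, or None if EVENTS is not a key."""
    binding = keymap
    for event in events:
        if not isinstance(binding, dict):
            return None
        binding = binding.get(event)
    return binding

keymap = {"C-f": "forward-char", "C-x": {"4": {"f": "find-file-other-window"}}}

print(lookup(keymap, ["C-f", "C-k"]))              # None: C-f is a complete key
define_key(keymap, ["C-f", "C-k"], "my-command")   # C-f is now a prefix
print(lookup(keymap, ["C-f", "C-k"]))              # my-command

del keymap["C-x"]["4"]                             # remove the prefix C-x 4
print(lookup(keymap, ["C-x", "4", "f"]))           # None: no longer a key
```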
Typing the help character (C-h or F1) after a prefix character displays a list of the commands starting with that prefix. There are a few prefix characters for which C-h does not work--for historical reasons, they have other meanings for C-h which are not easy to change. But F1 should work for all prefix characters.
This manual is full of passages that tell you what particular keys do. But Emacs does not assign meanings to keys directly. Instead, Emacs assigns meanings to named commands, and then gives keys their meanings by binding them to commands.
Every command has a name chosen by a programmer. The name is usually made of a few English words separated by dashes; for example, next-line or forward-word. A command also has a function definition which is a Lisp program; this is what makes the command do what it does. In Emacs Lisp, a command is actually a special kind of Lisp function; one which specifies how to read arguments for it and call it interactively. For more information on commands and functions, see section `What Is a Function' in The Emacs Lisp Reference Manual. (The definition we use in this manual is simplified slightly.)
The bindings between keys and commands are recorded in various tables called keymaps. See section Keymaps.
When we say that "C-n moves down vertically one line" we are glossing over a distinction that is irrelevant in ordinary use but is vital in understanding how to customize Emacs. It is the command next-line that is programmed to move down vertically. C-n has this effect because it is bound to that command. If you rebind C-n to the command forward-word then C-n will move forward by words instead. Rebinding keys is a common method of customization.
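The two-step lookup from key to command name to function definition can be sketched in Python. This is a deliberately tiny model and not Emacs Lisp: the dictionaries, the `press` function and the `state` counters are assumptions made for illustration, with the command bodies reduced to incrementing a counter.

```python
def next_line(state):
    state["line"] += 1      # stand-in for moving down one line

def forward_word(state):
    state["word"] += 1      # stand-in for moving forward one word

# Command names map to function definitions; keys map to command names.
COMMANDS = {"next-line": next_line, "forward-word": forward_word}
KEYMAP = {"C-n": "next-line"}

def press(key, state):
    """Look up KEY's command name, then run that command's function."""
    COMMANDS[KEYMAP[key]](state)

state = {"line": 0, "word": 0}
press("C-n", state)                 # runs next-line
KEYMAP["C-n"] = "forward-word"      # rebind the key; the commands are untouched
press("C-n", state)                 # now runs forward-word
print(state)                        # {'line': 1, 'word': 1}
```

Rebinding changes only the table entry for the key; the command next-line still exists and still moves down, it is simply no longer reached through C-n.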
In the rest of this manual, we usually ignore this subtlety to keep things simple. To give the information needed for customization, we state the name of the command which really does the work in parentheses after mentioning the key that runs it. For example, we will say that "The command C-n (next-line) moves point vertically down," meaning that next-line is a command that moves vertically down and C-n is a key that is bound to it by default.
While we are on the subject of information for customization only, it's a good time to tell you about variables. Often the description of a command will say, "To change this, set the variable mumble-foo." A variable is a name used to remember a value. Most of the variables documented in this manual exist just to facilitate customization: some command or other part of Emacs examines the variable and behaves differently according to the value that you set. Until you are interested in customizing, you can ignore the information about variables. When you are ready to customize, read the basic information on variables, and then the information on individual variables will make sense. See section Variables.
Text in Emacs buffers is a sequence of 8-bit bytes. Each byte can hold a single ASCII character. Both ASCII control characters (octal codes 000 through 037, and 0177) and ASCII printing characters (codes 040 through 0176) are allowed; however, non-ASCII control characters cannot appear in a buffer. The other modifier flags used in keyboard input, such as Meta, are not allowed in buffers either.
Some ASCII control characters serve special purposes in text, and have special names. For example, the newline character (octal code 012) is used in the buffer to end a line, and the tab character (octal code 011) is used for indenting to the next tab stop column (normally every 8 columns). See section How Text Is Displayed.
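The rule that a tab indents to the next tab stop column can be written out as a short calculation. The Python sketch below assumes tab stops every 8 columns, as in the text; the names `next_tab_stop` and `display_width` are invented for the example, and the sketch ignores every display detail other than tabs.

```python
def next_tab_stop(column, tab_width=8):
    """Return the first tab stop column strictly after COLUMN."""
    return (column // tab_width + 1) * tab_width

def display_width(line, tab_width=8):
    """Return the column reached after displaying LINE, expanding tabs."""
    column = 0
    for ch in line:
        if ch == "\t":                       # tab is octal code 011
            column = next_tab_stop(column, tab_width)
        else:
            column += 1
    return column

print(next_tab_stop(0))          # 8
print(next_tab_stop(7))          # 8
print(next_tab_stop(8))          # 16
print(display_width("ab\tc"))    # 9
print(ord("\n") == 0o12)         # True: newline is octal code 012
```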
Non-ASCII printing characters can also appear in buffers. When multibyte characters are enabled, you can use any of the non-ASCII printing characters that Emacs supports. They have character codes starting at 256, octal 0400, and each one is represented as a sequence of two or more bytes. See section International Character Set Support.
If you disable multibyte characters, then you can use only one alphabet of non-ASCII characters, but they all fit in one byte. They use codes 0200 through 0377. See section Single-byte European Character Support.
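The code ranges mentioned in the last few paragraphs can be collected into one small classifier. This Python sketch simply restates the ranges given in the text; the function name `classify_code` and the category labels are made up for the example, and it says nothing about how Emacs stores the characters internally.

```python
def classify_code(code):
    """Name the range that a buffer character code falls into."""
    if code < 0:
        raise ValueError("character codes are non-negative")
    if code <= 0o37 or code == 0o177:
        return "ASCII control"            # 000 through 037, and 0177
    if code <= 0o176:
        return "ASCII printing"           # 040 through 0176
    if code <= 0o377:
        return "single-byte non-ASCII"    # 0200 through 0377
    return "multibyte non-ASCII"          # 0400 (decimal 256) and up

print(classify_code(0o12))    # ASCII control (newline)
print(classify_code(0o177))   # ASCII control (DEL)
print(classify_code(0o101))   # ASCII printing ('A')
print(classify_code(0o200))   # single-byte non-ASCII
print(classify_code(256))     # multibyte non-ASCII
```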