GNU Emacs Lisp Reference Manual

Loading Non-ASCII Characters

When Emacs Lisp programs contain string constants with non-ASCII characters, these can be represented within Emacs either as unibyte strings or as multibyte strings (see section Text Representations). Which representation is used depends on how the file is read into Emacs. If it is read with decoding into multibyte representation, the text of the Lisp program will be multibyte text, and its string constants will be multibyte strings. If a file containing Latin-1 characters (for example) is read without decoding, the text of the program will be unibyte text, and its string constants will be unibyte strings. See section Coding Systems.
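As a rough illustration of the distinction above, the following sketch reports which representation a given string has. The function name my-string-representation and the variable my-greeting are hypothetical, chosen only for this example; multibyte-string-p is the standard predicate.

```elisp
;; A minimal sketch, not part of Emacs itself.
;; `my-string-representation' and `my-greeting' are hypothetical names.
(defun my-string-representation (string)
  "Return the symbol `multibyte' or `unibyte' describing STRING."
  (if (multibyte-string-p string)
      'multibyte
    'unibyte))

;; A string constant containing a non-ASCII character.  Whether it ends
;; up multibyte or unibyte depends on how the file holding it was read.
(defvar my-greeting "café")

(my-string-representation my-greeting)
```

No expected result is shown for the last form, since it depends on whether the file holding the constant was decoded when it was read.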

To make the results more predictable, Emacs always performs decoding into the multibyte representation when loading Lisp files, even if it was started with the `--unibyte' option. This means that string constants with non-ASCII characters become multibyte strings. The only exception is when a particular file specifies no decoding.

Emacs is designed this way so that Lisp programs give predictable results, regardless of how Emacs was started. In addition, this enables programs that depend on using multibyte text to work even in a unibyte Emacs. Of course, such programs should be designed to notice whether the user prefers unibyte or multibyte text, by checking default-enable-multibyte-characters, and convert representations appropriately.
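One possible way to perform such a check is sketched below. The function name my-user-prefers-multibyte-p is hypothetical. The fallback branch is an assumption made for this sketch: it covers Emacs versions in which default-enable-multibyte-characters is not defined, by consulting the default value of enable-multibyte-characters instead.

```elisp
;; A minimal sketch; `my-user-prefers-multibyte-p' is a hypothetical name.
(defun my-user-prefers-multibyte-p ()
  "Return non-nil if new buffers use multibyte text by default."
  (if (boundp 'default-enable-multibyte-characters)
      ;; The variable named in the text above, when it is defined.
      (symbol-value 'default-enable-multibyte-characters)
    ;; Assumed fallback: the default value of the buffer-local variable.
    (default-value 'enable-multibyte-characters)))

;; Example use: choose a code path based on the user's preference.
(if (my-user-prefers-multibyte-p)
    (message "Using multibyte text")
  (message "Using unibyte text"))
```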

In most Emacs Lisp programs, the fact that non-ASCII strings are multibyte strings should not be noticeable, since inserting them in unibyte buffers converts them to unibyte automatically. However, if this does make a difference, you can force a particular Lisp file to be interpreted as unibyte by writing `-*-unibyte: t;-*-' in a comment on the file's first line. With that designator, the file will unconditionally be interpreted as unibyte, even in an ordinary multibyte Emacs session.
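For instance, the top of such a file might look like the following sketch. The file name my-latin1-strings.el and the variable my-latin1-label are hypothetical; only the designator on the first line comes from the description above.

```elisp
;; -*-unibyte: t;-*-
;; my-latin1-strings.el (hypothetical file name)
;;
;; Because of the designator on the first line, the string constants in
;; this file are meant to be read as unibyte strings.

;; Hypothetical variable; in a real file the constant would contain
;; Latin-1 bytes rather than plain ASCII text.
(defvar my-latin1-label "label text"
  "Hypothetical label, read as a unibyte string.")
```

The designator must appear in a comment on the first line of the file; placing it further down has no effect on how the file is interpreted.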
