A formal grammar is a mathematical construct. To define the language for Bison, you must write a file expressing the grammar in Bison syntax: a Bison grammar file. See section Bison Grammar Files.
A nonterminal symbol in the formal grammar is represented in Bison input as an identifier, like an identifier in C. By convention, it should be in lower case, such as expr, stmt or declaration.
The Bison representation for a terminal symbol is also called a token type. Token types can also be represented as C-like identifiers. By convention, these identifiers should be upper case to distinguish them from nonterminals: for example, INTEGER, IDENTIFIER, IF or RETURN. A terminal symbol that stands for a particular keyword in the language should be named after that keyword converted to upper case. The terminal symbol error is reserved for error recovery.
See section Symbols, Terminal and Nonterminal.
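As a minimal sketch of these naming conventions, the fragment below declares a few token types and uses them in rules. The grammar itself is hypothetical and is meant only to show upper-case terminals next to lower-case nonterminals.

```yacc
/* Sketch only: this small grammar is hypothetical. */
%token INTEGER IDENTIFIER   /* token types: upper-case identifiers */
%token RETURN               /* keyword token, named after the keyword `return` */

%%

/* Nonterminals are written in lower case. */
stmt: RETURN expr ;

expr: INTEGER
    | IDENTIFIER
    ;
```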
A terminal symbol can also be represented as a character literal, just like a C character constant. You should do this whenever a token is just a single character (parenthesis, plus-sign, etc.): use that same character in a literal as the terminal symbol for that token.
A third way to represent a terminal symbol is with a C string constant containing several characters. See section Symbols, Terminal and Nonterminal, for more information.
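The two literal forms can be sketched as follows. The token names NUMBER and LE and the rules are assumptions made for illustration. Character literals are used directly in rules without a declaration, and the string literal is given a symbolic alias with %token so that the lexical analyzer has a name to return for it.

```yacc
/* Sketch only: NUMBER, LE and the rules are hypothetical. */
%token NUMBER
%token LE "<="            /* alias: rules may write either LE or "<=" */

%%

expr: NUMBER
    | '(' expr ')'        /* '(' and ')' are character literal tokens */
    | NUMBER "<=" NUMBER  /* "<=" is a string literal token */
    ;
```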
The grammar rules also have an expression in Bison syntax. For example, here is the Bison rule for a C return statement. The semicolon in quotes is a literal character token, representing part of the C syntax for the statement; the naked semicolon, and the colon, are Bison punctuation used in every rule.
stmt: RETURN expr ';' ;
See section Syntax of Grammar Rules.
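To make the distinction between the two kinds of semicolon easier to see, here is the same rule with a second alternative and comments, placed in a small grammar that Bison should accept. The RETURN-without-value alternative and the NUMBER token are additions made for this sketch, not part of the rule above.

```yacc
/* Sketch only: the second alternative and NUMBER are hypothetical. */
%token RETURN NUMBER

%%

stmt: RETURN expr ';'   /* quoted ';' is a token of the language being parsed */
    | RETURN ';'        /* '|' introduces another alternative for stmt */
    ;                   /* bare ';' is Bison punctuation that ends the rule */

expr: NUMBER ;
```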