2. AutoGen Definitions File

This chapter describes the syntax and semantics of the AutoGen definition file. In order to instantiate a template, you normally must provide a definitions file that identifies itself and contains some value definitions. Consequently, we keep it very simple. For "advanced" users, there are preprocessing directives and comments that may be used as well.

The definitions file is used to associate values with names. When multiple values are associated with a single name, an implicit array of values is formed. Values may be either simple strings or compound collections of name-value pairs. An array may not contain both simple and compound members. Fundamentally, it is as simple as:

prog_name = "autogen";
flag = {
    name    = templ_dirs;
    value   = L;
    descrip = "Template search directory list";
};
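As a rough mental model only (this is not AutoGen's internal representation), the association of names with implicit arrays of values can be sketched in Python. The `Definitions` class and its `add` method are hypothetical names invented for this illustration.

```python
class Definitions:
    """Hypothetical model: each name maps to an implicit array of values."""

    def __init__(self):
        # name -> list of str (simple values) or Definitions (compound values)
        self.values = {}

    def add(self, name, value):
        """Append value to the array for name, rejecting mixed arrays."""
        array = self.values.setdefault(name, [])
        if array and isinstance(array[0], Definitions) != isinstance(value, Definitions):
            raise TypeError(f"{name!r} cannot mix simple and compound values")
        array.append(value)


# Mirror the two definitions shown above.
defs = Definitions()
defs.add("prog_name", "autogen")

flag = Definitions()
flag.add("name", "templ_dirs")
flag.add("value", "L")
flag.add("descrip", "Template search directory list")
defs.add("flag", flag)
```

Adding a second `flag` block would simply extend the array for `flag`, while adding a plain string under `flag` is rejected, matching the rule that an array may not contain both simple and compound members.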

For purposes of commenting and controlling the processing of the definitions, C-style comments and most C preprocessing directives are honored. The major exception is that the #if directive is ignored, along with all following text through the matching #endif directive. The C preprocessor is not actually invoked, so C macro substitution is not performed.

2.1 The Identification Definition

2.2 Named Definitions

2.3 Dynamic Text

2.4 Controlling What Gets Processed

2.5 Pre-defined Names

2.6 Commenting Your Definitions

2.7 What it all looks like.

2.8 YACC Language Grammar

2.9 Alternate Definition Forms

2.1 The Identification Definition

The first definition in this file is used to identify it as an AutoGen file. It consists of the two keywords, `autogen' and `definitions', followed by the default template name and a terminating semi-colon (;). That is:

AutoGen Definitions template-name;

Note that, other than the name template-name, the words `AutoGen' and `Definitions' are searched for without case sensitivity. Most lookups in this program are case insensitive.

Also, if the input contains more identification definitions, they will be ignored. This is done so that you may include (see section 2.4 Controlling What Gets Processed) other definition files without an identification conflict.

AutoGen uses the name of the template to find the corresponding template file. It searches for the file in the following way, stopping when it finds the file:

1. It tries to open `./template-name'.
2. If that fails, it tries `./template-name.tpl'.
3. It searches for either of these files in the directories listed in the templ-dirs command line option.

If AutoGen fails to find the template file in one of these places, it prints an error message and exits.
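The search order described above can be sketched in Python. This is an illustration, not AutoGen's code: the function name `find_template` is hypothetical, and the sketch assumes that the directories are searched in the order given and that, within each directory, the plain name is tried before the `.tpl` name.

```python
import os


def find_template(name, templ_dirs=()):
    """Return the first existing candidate path, or None if nothing matches."""
    # Steps 1 and 2: the current directory, plain name first, then ".tpl".
    candidates = [os.path.join(".", name), os.path.join(".", name + ".tpl")]
    # Step 3: either file name in each listed directory (order is an assumption).
    for directory in templ_dirs:
        candidates.append(os.path.join(directory, name))
        candidates.append(os.path.join(directory, name + ".tpl"))
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

A caller would treat a `None` result as the error case, where AutoGen prints a message and exits.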

2.2 Named Definitions

Any name may have multiple values associated with it in the definition file. If there is more than one instance, the only way to expand all of the copies of it is by using the FOR (see section 3.6.13 FOR - Emit a template block multiple times) text function on it, as described in the next chapter.

There are two kinds of definitions, `simple' and `compound'. They are defined thus (see section 2.8 YACC Language Grammar):

compound_name '=' '{' definition-list '}' ';'
simple_name '=' string ';'
no_text_name ';'

No_text_name is a simple definition with a shorthand empty string value. The string values for definitions may be specified using any of several formation rules.

2.2.1 Naming a Value

2.2.2 Definition List

2.2.3 Double Quote String

2.2.4 Single Quote String

2.2.5 Shell Output String

2.2.6 An Unquoted String

2.2.7 Scheme Result String

2.2.8 A Here String

2.2.9 Concatenated Strings

2.2.1 Naming a Value

A name may be a simple name, which takes the next available index, or it may specify an index by name or number. For example:

txt_name
txt_name[2]
txt_name[ DEF_NAME ]

DEF_NAME must be defined to have a numeric value. If you do specify an index, you must take care not to cause conflicts.

2.2.2 Definition List

definition-list is a list of definitions that may or may not contain nested compound definitions. Any such definitions may only be expanded within a FOR block iterating over the containing compound definition. See section 3.6.13 FOR - Emit a template block multiple times.

Here again are the example definitions from the previous chapter, with three additional name-value pairs: two with an empty value assigned (first and last), and a "global" group_name.

autogen definitions list;
group_name = example;
list = { list_element = alpha;  first;
         list_info = "some alpha stuff"; };
list = { list_info = "more beta stuff";
         list_element = beta; };
list = { list_element = omega;  last;
         list_info = "final omega stuff"; };

2.2.3 Double Quote String

The string follows the C-style escaping (\\, \n, \f, \v, etc.), plus octal character numbers specified as \ooo. The difference from "C" is that the string may span multiple lines. Like ANSI "C", a series of these strings, possibly intermixed with single quote strings, will be concatenated together.

2.2.4 Single Quote String

This is similar to the shell single-quote string. However, the escape character \ is honored before another escape, a single quote ', or a hash character #. The last of these is honored specifically to disambiguate lines starting with a hash character inside of a quoted string. In other words,

foo = '
#endif
';

could be misinterpreted by the definitions scanner, whereas this would not:

foo = '
\#endif
';

As with the double quote string, a series of these, even intermixed with double quote strings, will be concatenated together.
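The escape rule for single quote strings can be sketched in Python. This is a minimal illustration, not the scanner's actual code: the function name `unescape_single_quoted` is hypothetical, and the sketch assumes that the backslash is dropped from a honored escape and that any other backslash is kept literally, as in a shell single-quote string.

```python
def unescape_single_quoted(body):
    """Process the text found between the single quotes of a definition."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        # A backslash is honored only before a backslash, a single quote, or a hash.
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in "\\'#":
            out.append(body[i + 1])  # drop the escape, keep the escaped character
            i += 2
        else:
            out.append(ch)  # every other character, including a lone backslash
            i += 1
    return "".join(out)


# The body of the second example above: newline, \#endif, newline.
value = unescape_single_quoted("\n\\#endif\n")
```

Under these assumptions, a sequence such as `\n` stays as the two characters backslash and n, because single quote strings do not apply C-style escaping.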

2.2.5 Shell Output String

A string enclosed in back quotes (`) is assembled according to the same rules as the double quote string, except that there is no concatenation of strings and the resulting string is written to a shell server process. The definition takes on the value of the shell's output string.

NB The text is interpreted by a server shell. There may be leftover state from previous ` processing, and it may leave state for subsequent processing. However, a cd to the original directory is always issued before the new command is issued.

2.2.6 An Unquoted String

A simple string that does not contain white space may be left unquoted. The string must not contain any of the characters special to the definition text (i.e. ", #, ', (, ), ,, ;, <, =, >, [, ], `, {, or }). This list is subject to change, but it will never contain underscore (_), period (.), slash (/), colon (:), hyphen (-) or backslash (\). Basically, if the string looks like a normal DOS or UNIX file or variable name, and it is not one of the two keywords (`autogen' or `definitions'), then it is OK not to quote it; otherwise, you should.
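The rule above can be turned into a small check, sketched here in Python. The names `SPECIAL_CHARS`, `KEYWORDS` and `may_be_unquoted` are hypothetical. The sketch assumes that the keywords are compared without regard to case and that an empty string must always be quoted, and it encodes the current special-character list, which the text says is subject to change.

```python
# The characters listed above as special to the definition text.
SPECIAL_CHARS = set("\"#'(),;<=>[]`{}")
KEYWORDS = {"autogen", "definitions"}


def may_be_unquoted(text):
    """Return True if text could be written without quotes."""
    if not text or any(ch.isspace() for ch in text):
        return False
    if any(ch in SPECIAL_CHARS for ch in text):
        return False
    # Assumption: the two keywords are matched case-insensitively.
    return text.lower() not in KEYWORDS


ok = may_be_unquoted("/usr/local/share/autogen")
needs_quotes = not may_be_unquoted("Template search directory list")
```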

2.2.7 Scheme Result String

A Scheme result string must begin with an open parenthesis (. The Scheme expression will be evaluated by Guile and the value will be the result. The AutoGen expression functions are disabled at this stage, so do not use them.

2.2.8 A Here String

A `here string' is formed in much the same way as a shell here doc. It is denoted with a doubled less than character and, optionally, a hyphen. This is followed by optional horizontal white space and an ending marker-identifier. This marker must follow the syntax rules for identifiers. Unlike the shell version, however, you must not quote this marker. The resulting string will start with the first character on the next line and continue up to but not including the newline that precedes the line that begins with the marker token. No backslash or any other kind of processing is done on this string. The characters are copied directly into the result string.

Here are two examples:

str1 = <<-  STR_END
	$quotes = " ' `
	STR_END;

str2 = <<   STR_END
	$quotes = " ' `
	STR_END;
STR_END;

The first string contains no newline characters. The first character is the dollar sign, the last the back quote.

The second string contains one newline character. The first character is the tab character preceding the dollar sign. The last character is the semicolon after the STR_END. That STR_END does not end the string because it is not at the beginning of the line. In the preceding case, the leading tab was stripped.
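The here string rules can be sketched in Python and checked against the two examples above. This is an illustration under stated assumptions, not the scanner's implementation: the name `parse_here_string` is hypothetical, identifiers are assumed to match `[A-Za-z_][A-Za-z0-9_]*`, and the hyphen form is assumed to strip leading tabs from every line, as the shell does.

```python
import re

# "<<", an optional hyphen, optional horizontal white space, then the marker.
_HEADER = re.compile(r"<<(-?)[ \t]*([A-Za-z_][A-Za-z0-9_]*)[ \t]*\n")


def parse_here_string(text):
    """Return (value, remainder) for text that starts at the '<<' token."""
    match = _HEADER.match(text)
    if match is None:
        raise ValueError("not a here string")
    strip_tabs = match.group(1) == "-"
    marker = match.group(2)
    lines = text[match.end():].split("\n")
    body = []
    for index, line in enumerate(lines):
        if strip_tabs:
            line = line.lstrip("\t")  # assumption: "<<-" drops leading tabs
        rest = line[len(marker):]
        # The marker must begin the line and must not run on into a longer name.
        if line.startswith(marker) and not (rest[:1].isalnum() or rest[:1] == "_"):
            remainder = rest + "".join("\n" + later for later in lines[index + 1:])
            return "\n".join(body), remainder
        body.append(line)
    raise ValueError("missing end marker " + marker)


str1, _ = parse_here_string("<<- STR_END\n\t$quotes = \" ' `\n\tSTR_END;\n")
str2, _ = parse_here_string("<< STR_END\n\t$quotes = \" ' `\n\tSTR_END;\nSTR_END;\n")
```

Because the body lines are copied without any backslash processing, the quote characters in the examples pass through unchanged.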

2.2.9 Concatenated Strings

If single or double quote characters are used, then you also have the option, a la ANSI-C syntax, of implicitly concatenating a series of them together, with intervening white space ignored.

NB You cannot use directives to alter the string content. That is,

str = "foo"
#ifdef LATER
      "bar"
#endif
      ;

will result in a syntax error. The preprocessing directives are not carried out by the C preprocessor. However,

str = '"foo\n"
#ifdef LATER
" bar\n"
#endif
';

will work. It will enclose the `#ifdef LATER' and `#endif' in the string. But it may also wreak havoc with the definition processing directives. The hash characters in the first column should be disambiguated with an escape \ or joined with the previous lines: "foo\n#ifdef LATER....

2.3 Dynamic Text

There are several methods for including dynamic content inside a definitions file. Two of them are mentioned above (see section 2.2.5 Shell Output String and section 2.2.7 Scheme Result String) in the discussion of string formation rules. Another method uses the #shell processing directive. It will be discussed in the next section (see section 2.4 Controlling What Gets Processed). Guile/Scheme may also be used to create definitions.

When the Scheme expression is preceded by a backslash and single quote, then the expression is expected to be an alist of names and values that will be used to create AutoGen definitions.

This method can be used as follows:

\'( (name  (value-expression))
    (name2 (another-expr)) )

This is entirely equivalent to:

name  = (value-expression);
name2 = (another-expr);

Under the covers, the expression gets handed off to a Guile function named alist->autogen-def in an expression that looks like this:

(alist->autogen-def
    ( (name (value-expression))  (name2 (another-expr)) ) )

2.4 Controlling What Gets Processed

Definition processing directives can only be processed if the '#' character is the first character on a line. Also, if you want a '#' as the first character of a line in one of your string assignments, you should either escape it by preceding it with a backslash `\', or embed it in the string, as in "\n#".

All of the normal C preprocessing directives are recognized, though several are ignored. There is also an additional #shell - #endshell pair. Another minor difference is that AutoGen directives must have the hash character (#) in column 1.

The final tweak is that #! is treated as a comment line. Using this feature, you can use: `#! /usr/local/bin/autogen' as the first line of a definitions file, set the mode to executable and "run" the definitions file as if it were a direct invocation of AutoGen. This was done for its hack value.

The ignored directives are: `#assert', `#ident', `#pragma', and `#if'. Note that when ignoring the #if directive, all intervening text through its matching #endif is also ignored, including the #else clause.

The AutoGen directives that affect the processing of definitions are:

#define name [ <text> ]

Will add the name to the define list as if it were a DEFINE program argument. Its value will be the first non-whitespace token following the name. Quotes are not processed.

After the definitions file has been processed, any remaining entries in the define list will be added to the environment.

#elif

This must follow an #if; otherwise, it will generate an error. It will be ignored.

#else

This must follow an #if, #ifdef or #ifndef. If it follows the #if, then it will be ignored. Otherwise, it will change the processing state to the reverse of what it was.

#endif

This must follow an #if, #ifdef or #ifndef. In all cases, this will resume normal processing of text.

#endshell

Ends the text processed by a command shell into autogen definitions.

#error [ <descriptive text> ]

This directive will cause AutoGen to stop processing and exit with a status of EXIT_FAILURE.

#if [ <ignored conditional expression> ]

#if expressions are not analyzed. Everything from here to the matching #endif is skipped.

#ifdef name-to-test

The definitions that follow, up to the matching #endif will be processed only if there is a corresponding -Dname command line option.

#ifndef name-to-test

The definitions that follow, up to the matching #endif will be processed only if there is not a corresponding -Dname command line option or there was a canceling -Uname option.

#include unadorned-file-name

This directive will insert definitions from another file into the current collection. If the file name is adorned with double quotes or angle brackets (as in a C program), then the include is ignored.

#line

Alters the current line number and/or file name. You may wish to use this directive if you extract definition source from other files. getdefs uses this mechanism so AutoGen will report the correct file and approximate line number of any errors found in extracted definitions.

#shell

Invokes $SHELL or `/bin/sh' on a script that should generate AutoGen definitions. It does this using the same server process that handles the back-quoted ` text. CAUTION: let not your $SHELL be csh.

#undef name-to-undefine

Will remove any entries from the define list that match the undef name pattern.

2.5 Pre-defined Names

When AutoGen starts, it tries to determine several names from the operating environment and put them into environment variables for use in both #ifdef tests in the definitions files and in shell scripts with environment variable tests. __autogen__ is always defined. For other names, AutoGen will first try to use the POSIX version of the sysinfo(2) system call. Failing that, it will try for the POSIX uname(2) call. If neither is available, then only "__autogen__" will be inserted into the environment.

If sysinfo(2) is available, the strings associated with

SI_SYSNAME (e.g., "__sunos__")
SI_HOSTNAME (e.g., "__ellen__")
SI_ARCHITECTURE (e.g., "__sparc__")
SI_HW_PROVIDER (e.g., "__sun_microsystems__")
SI_PLATFORM (e.g., "__sun_ultra_5_10__")
SI_MACHINE (e.g., "__sun4u__")

are used. The associated names are converted to lower case, surrounded by doubled underscores, and have non-symbol characters replaced with underscores. The examples above show the definitions you would get for Solaris on a sparc platform.

For Linux and other operating systems that only support the uname(2) call, AutoGen will use these values:

sysname (e.g., "__linux__")
machine (e.g., "__i586__")
nodename (e.g., "__bach__")
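The name conversion described above (lower case, doubled underscores, non-symbol characters replaced with underscores) can be sketched in Python. The function name `predefined_name` is hypothetical, "symbol characters" are assumed to be letters, digits and the underscore, and the raw input strings used here are invented for illustration rather than taken from any particular system.

```python
import re


def predefined_name(raw):
    """Convert a system-reported string into a pre-defined name."""
    # Lower-case first, then replace anything that is not a symbol character.
    symbol = re.sub(r"[^a-z0-9_]", "_", raw.lower())
    return "__" + symbol + "__"


# Hypothetical raw values; real values come from sysinfo(2) or uname(2).
names = [predefined_name(raw) for raw in ("SunOS", "Sun Microsystems", "i586")]
```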

By testing these pre-defines in your definitions, you can select pieces of the definitions without resorting to writing shell scripts that parse the output of uname(1). You can also segregate real C code from autogen definitions by testing for "__autogen__".

#ifdef __bach__
  location = home;
#else
  location = work;
#endif
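The effect of the conditional directives on an example like this one can be sketched in Python. This is a simplified illustration, not AutoGen's scanner: the name `select_lines` is hypothetical, only #ifdef, #ifndef, #else and #endif are handled, and every other line (including other directives) is passed through untouched.

```python
def select_lines(text, defined):
    """Keep the lines that survive #ifdef / #ifndef / #else / #endif."""
    kept = []
    stack = []  # one bool per open conditional: is its current branch active?
    for line in text.split("\n"):
        words = line.split()
        # A directive is recognized only when '#' is in column 1.
        directive = words[0] if line.startswith("#") and words else ""
        if directive in ("#ifdef", "#ifndef"):
            present = words[1] in defined
            stack.append(present if directive == "#ifdef" else not present)
        elif directive == "#else":
            stack[-1] = not stack[-1]  # reverse the processing state
        elif directive == "#endif":
            stack.pop()  # resume normal processing
        elif all(stack):
            kept.append(line)
    return kept


example = "#ifdef __bach__\nlocation = home;\n#else\nlocation = work;\n#endif"
at_home = select_lines(example, {"__autogen__", "__bach__"})
at_work = select_lines(example, {"__autogen__"})
```

Here `defined` stands in for the define list built from -D options and the pre-defined names, so the same definitions file selects a different `location` on each host.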

2.6 Commenting Your Definitions

The definitions file may contain C and C++ style comments.

/*
 *  This is a comment.  It continues for several lines and closes
 *  when the characters '*' and '/' appear together.
 */
// this comment is a single line comment

2.7 What it all looks like.

This is an extended example:

autogen definitions `template-name';
/*
 *  This is a comment that describes what these
 *  definitions are all about.
 */
global = "value for a global text definition.";

/*
 *  Include a standard set of definitions
 */
#include standards.def

a_block = {
    a_field;
    a_subblock = {
        sub_name  = first;
        sub_field = "sub value.";
    };

#ifdef FEATURE
    a_subblock = {
        sub_name = second;
    };
#endif

};

2.8 YACC Language Grammar

The processing directives and comments are not part of the grammar. They are handled by the scanner/lexer. The following was extracted directly from the defParse.y source file:

definitions : identity def_list TK_END
              { $$ = (YYSTYPE)(rootDefCtx.pDefs = (tDefEntry*)$2); }
            ;

def_list    : definition { $$ = $1; }
            | definition def_list { $$ = addSibMacro( $1, $2 ); }
            | identity def_list { $$ = $2; }
            ;

identity    : TK_AUTOGEN TK_DEFINITIONS anyname ';'
              { $$ = identify( $3 ); }
            ;

definition  : value_name ';'
              { $$ = makeMacro( $1, (YYSTYPE)"", VALTYP_TEXT ); }
            | value_name '=' text_list ';'
              { $$ = makeMacroList( $1, $3, VALTYP_TEXT ); }
            | value_name '=' block_list ';'
              { $$ = makeMacroList( $1, $3, VALTYP_BLOCK ); }
            ;

text_list   : anystring { $$ = startList( $1 ); }
            | anystring ',' text_list { $$ = appendList( $1, $3 ); }
            ;

block_list  : def_block { $$ = startList( $1 ); }
            | def_block ',' block_list { $$ = appendList( $1, $3 ); }
            ;

def_block   : '{' def_list '}' { $$ = $2; }
            ;

anystring   : anyname { $$ = $1; }
            | TK_STRING { $$ = $1; }
            | TK_NUMBER { $$ = $1; }
            ;

anyname     : TK_OTHER_NAME { $$ = $1; }
            | TK_VAR_NAME { $$ = $1; }
            ;

value_name  : TK_VAR_NAME
              { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)NULL ); }
            | TK_VAR_NAME '[' TK_NUMBER ']'
              { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)$3 ); }
            | TK_VAR_NAME '[' TK_VAR_NAME ']'
              { $$ = findPlace( (YYSTYPE)$1, (YYSTYPE)$3 ); }
            ;

2.9 Alternate Definition Forms

It is entirely possible to write a template that does not depend upon external definitions. Such a template would likely have an unvarying output, but be convenient nonetheless because of an external library of either AutoGen or Scheme functions, or both. This can be accommodated by providing the --override-tpl and --no-definitions options on the command line. See section 5. Invoking autogen.

This document was generated by Bruce Korb on February, 4 2002 using texi2html