Static instance support.

diff --git a/doc/syntax.tex b/doc/syntax.tex

index 0b7456f..6c05acc 100644
--- a/doc/syntax.tex
+++ b/doc/syntax.tex
@@ -26,51 +26,6 @@
  \chapter{Module syntax} \label{ch:syntax}
  
  %%%--------------------------------------------------------------------------
-\section{Notation} \label{sec:syntax.notation}
-
-Fortunately, Sod is syntactically quite simple.  The notation is slightly
-unusual in order to make the presentation shorter and easier to read.
-
-Anywhere a simple nonterminal name $x$ may appear in the grammar, an
-\emph{indexed} nonterminal $x[a_1, \ldots, a_n]$ may also appear.  On the
-left-hand side of a production rule, the indices $a_1$, \ldots, $a_n$ are
-variables which vary over all nonterminal and terminal symbols, and the
-variables may also appear on the right-hand side in place of a nonterminal.
-Such a rule stands for a family of rules, in each variable is replaced by
-each possible simple nonterminal or terminal symbol.
-
-The letter $\epsilon$ denotes the empty nonterminal
-\begin{quote}
-  \syntax{$\epsilon$ ::=}
-\end{quote}
-
-The following indexed productions are used throughout the grammar, some often
-enough that they deserve special notation.
-\begin{itemize}
-\item @[$x$@] abbreviates @<optional>$[x]$, denoting an optional occurrence
-  of $x$:
-  \begin{quote}
-    \syntax{@[$x$@] ::= <optional>$[x]$ ::= $\epsilon$ @! $x$}
-  \end{quote}
-\item $x^*$ abbreviates @<zero-or-more>$[x]$, denoting a sequence of zero or
-  more occurrences of $x$:
-  \begin{quote}
-    \syntax{$x^*$ ::= <zero-or-more>$[x]$ ::=
-      $\epsilon$ @! <zero-or-more>$[x]$ $x$}
-  \end{quote}
-\item $x^+$ abbreviates @<one-or-more>$[x]$, denoting a sequence of one or
-  more occurrences of $x$:
-  \begin{quote}
-    \syntax{$x^+$ ::= <one-or-more>$[x]$ ::= <zero-or-more>$[x]$ $x$}
-  \end{quote}
-\item @<list>$[x]$ denotes a sequence of one or more occurrences of $x$
-  separated by commas:
-  \begin{quote}
-    \syntax{<list>$[x]$ ::= $x$ @! <list>$[x]$ "," $x$}
-  \end{quote}
-\end{itemize}
-
-%%%--------------------------------------------------------------------------
  \section{Lexical syntax} \label{sec:syntax.lex}
  
  Whitespace and comments are discarded.  The remaining characters are
@@ -96,20 +51,20 @@ could be a token.
  
  <id-start-char> ::= <alpha-char> | "_"
  
-<id-body-char> ::= <id-start-char> @! <digit-char>
+<id-body-char> ::= <id-start-char> | <digit-char>
  
-<alpha-char> ::= "A" | "B" | \dots\ | "Z"
-\alt "a" | "b" | \dots\ | "z"
-\alt <extended-alpha-char>
+<alpha-char> ::= "A" | "B" | $\cdots$ | "Z"
+  | "a" | "b" | $\cdots$ | "z"
+  | <extended-alpha-char>
  
  <digit-char> ::= "0" | <nonzero-digit-char>
  
-<nonzero-digit-char> ::= "1" | "2" $| \cdots |$ "9"
+<nonzero-digit-char> ::= "1" | "2" | $\cdots$ | "9"
  \end{grammar}
  
  The precise definition of @<alpha-char> is left to the function
-\textsf{alpha-char-p} in the hosting Lisp system.  For portability,
-programmers are encouraged to limit themselves to the standard ASCII letters.
+@|alpha-char-p| in the hosting Lisp system.  For portability, programmers are
+encouraged to limit themselves to the standard ASCII letters.
  
  There are no reserved words at the lexical level, but the higher-level syntax
  recognizes certain identifiers as \emph{keywords} in some contexts.  There is
@@ -125,11 +80,11 @@ level.
  
  <char-literal> ::= "'" <char-literal-char> "'"
  
-<string-literal-char> ::= any character other than "\\" or "\""
-\alt "\\" <char>
+<string-literal-char> ::= "\\" <char>
+  | any character other than "\\" or "\""
  
-<char-literal-char> ::= any character other than "\\" or "'"
-\alt "\\" <char>
+<char-literal-char> ::= "\\" <char>
+  | any character other than "\\" or "'"
  
  <char> ::= any single character
  \end{grammar}
@@ -141,29 +96,30 @@ and the simple syntax seems adequate.  For the sake of future compatibility,
  the use of character sequences which resemble C escape sequences is
  discouraged.
  
-\subsubsection{Integer literals} \label{sec:syntax.lex.int}
+
+\subsection{Integer literals} \label{sec:syntax.lex.int}
  
  \begin{grammar}
  <integer-literal> ::= <decimal-integer>
-\alt <binary-integer>
-\alt <octal-integer>
-\alt <hex-integer>
+  | <binary-integer>
+  | <octal-integer>
+  | <hex-integer>
  
  <decimal-integer> ::= "0" | <nonzero-digit-char> @<digit-char>^*
  
-<binary-integer> ::= "0" @("b"|"B"@) @<binary-digit-char>^+
+<binary-integer> ::= "0" @("b" | "B"@) @<binary-digit-char>^+
  
  <binary-digit-char> ::= "0" | "1"
  
-<octal-integer> ::= "0" @["o"|"O"@] @<octal-digit-char>^+
+<octal-integer> ::= "0" @["o" | "O"@] @<octal-digit-char>^+
  
-<octal-digit-char> ::= "0" | "1" $| \cdots |$ "7"
+<octal-digit-char> ::= "0" | "1" | $\cdots$ | "7"
  
-<hex-integer> ::= "0" @("x"|"X"@) @<hex-digit-char>^+
+<hex-integer> ::= "0" @("x" | "X"@) @<hex-digit-char>^+
  
  <hex-digit-char> ::= <digit-char>
-\alt "A" | "B" | "C" | "D" | "E" | "F"
-\alt "a" | "b" | "c" | "d" | "e" | "f"
+  | "A" | "B" | "C" | "D" | "E" | "F"
+  | "a" | "b" | "c" | "d" | "e" | "f"
  \end{grammar}
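
The integer-literal grammar above can be sketched as a small recognizer.  This is a minimal illustration, not Sod's lexer: the name `parse_integer_literal` is hypothetical, and the sketch matches a whole string rather than scanning a token out of a larger input.

```python
import re

# One (pattern, radix) pair per alternative of <integer-literal>.
# The radix prefixes follow the grammar: "b"/"B" and "x"/"X" are
# mandatory, while the octal "o"/"O" is optional.
_INTEGER_FORMS = [
    (re.compile(r"0[bB]([01]+)"), 2),          # <binary-integer>
    (re.compile(r"0[oO]?([0-7]+)"), 8),        # <octal-integer>
    (re.compile(r"0[xX]([0-9A-Fa-f]+)"), 16),  # <hex-integer>
    (re.compile(r"(0|[1-9][0-9]*)"), 10),      # <decimal-integer>
]


def parse_integer_literal(text):
    """Return the value of TEXT if it is a complete integer literal.

    Raises ValueError otherwise, e.g. for C-style length or
    signedness suffixes, which the grammar does not allow.
    """
    for pattern, radix in _INTEGER_FORMS:
        match = pattern.fullmatch(text)
        if match:
            return int(match.group(1), radix)
    raise ValueError(f"not an integer literal: {text!r}")
```

Under this reading of the grammar, `017` and `0o17` are both octal, a lone `0` is decimal, and `09` or `10u` match no alternative.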
  
  Sod understands only integers, not floating-point numbers; its integer syntax
@@ -174,15 +130,16 @@ binary.  However, length and signedness indicators are not permitted.
  \subsection{Punctuation} \label{sec:syntax.lex.punct}
  
  \begin{grammar}
-<punctuation> ::= any nonalphanumeric character other than "_", "\"" or "'"
+<punctuation> ::= "<<" | ">>" | "||" | "&&"
+  | "<=" | ">=" | "==" | "!=" | "\dots"
+\alt any nonalphanumeric character other than "_", "\"", or "'"
  \end{grammar}
  
  
  \subsection{Comments} \label{sec:syntax.lex.comment}
  
  \begin{grammar}
-<comment> ::= <block-comment>
-\alt <line-comment>
+<comment> ::= <block-comment> | <line-comment>
  
  <block-comment> ::=
    "/*"
@@ -196,16 +153,16 @@ binary.  However, length and signedness indicators are not permitted.
  
  <not-star-or-slash> ::= any character other than "*" or  "/"
  
-<line-comment> ::= "//" @<not-newline>^* <newline>
+<line-comment> ::= "/\,/" @<not-newline>^* <newline>
  
  <newline> ::= a newline character
  
  <not-newline> ::= any character other than newline
  \end{grammar}
  
-Comments are exactly as in C99: both traditional block comments `\texttt{/*}
-\dots\ \texttt{*/}' and \Cplusplus-style `\texttt{//} \dots' comments are
-permitted and ignored.
+Comments are exactly as in C99: both traditional block comments `@|/*| \dots\
+@|*/|' and \Cplusplus-style `@|/\,/| \dots' comments are permitted and
+ignored.
  
  
  \subsection{Special nonterminals} \label{sec:syntax.lex.special}
@@ -234,9 +191,356 @@ during translation.  They are read using a simple scanner which nonetheless
  understands C comments and string and character literals.
  
  A C fragment is terminated by one of a small number of delimiter characters
-determined by the immediately surrounding context -- usually a closing brace
-or bracket.  The first such delimiter character which is not enclosed in
-brackets, braces or parenthesis ends the fragment.
+determined by the immediately surrounding context -- usually some kind of
+bracket.  The first such delimiter character which is not enclosed in
+brackets, braces or parentheses ends the fragment.
+
+%%%--------------------------------------------------------------------------
+\section{C types} \label{sec:syntax.type}
+
+Sod's syntax for C types closely mirrors the standard C syntax.  A C type has
+two parts: a sequence of @<declaration-specifier>s and a @<declarator>.  In
+Sod, a type must contain at least one @<declaration-specifier> (i.e.,
+`implicit @|int|' is forbidden), and storage-class specifiers are not
+recognized.
+
+
+\subsection{Declaration specifiers} \label{sec:syntax.type.declspec}
+
+\begin{grammar}
+<declaration-specifier> ::= <type-name>
+\alt "struct" <identifier> | "union" <identifier> | "enum" <identifier>
+\alt "void" | "char" | "int" | "float" | "double"
+\alt "short" | "long"
+\alt "signed" | "unsigned"
+\alt "bool" | "_Bool"
+\alt "imaginary" | "_Imaginary" | "complex" | "_Complex"
+\alt <qualifier>
+\alt <storage-specifier>
+\alt <atomic-type>
+\alt <other-declspec>
+
+<qualifier> ::= <atomic> | "const" | "volatile" | "restrict"
+
+<plain-type> ::= @<declaration-specifier>^+ <abstract-declarator>
+
+<atomic-type> ::= <atomic> "(" <plain-type> ")"
+
+<atomic> ::= "atomic" | "_Atomic"
+
+<storage-specifier> ::= <alignas> "(" <c-fragment> ")"
+
+<alignas> ::= "alignas" | "_Alignas"
+
+<type-name> ::= <identifier>
+\end{grammar}
+
+Declaration specifiers may appear in any order.  However, not all
+combinations are permitted.  A declaration specifier must consist of zero or
+more @<qualifier>s, zero or more @<storage-specifier>s, and one of the
+following, up to reordering:
+\begin{itemize}
+\item @<type-name>;
+\item @<atomic-type>;
+\item @"struct" @<identifier>; @"union" @<identifier>; @"enum" @<identifier>;
+\item @"void";
+\item @"_Bool", @"bool";
+\item @"char"; @"unsigned char"; @"signed char";
+\item @"short", @"signed short", @"short int", @"signed short int";
+  @"unsigned short", @"unsigned short int";
+\item @"int", @"signed", @"signed int"; @"unsigned", @"unsigned int";
+\item @"long", @"signed long", @"long int", @"signed long int"; @"unsigned
+  long", @"unsigned long int";
+\item @"long long", @"signed long long", @"long long int", @"signed long long
+  int"; @"unsigned long long", @"unsigned long long int";
+\item @"float"; @"double"; @"long double";
+\item @"float _Imaginary", @"float imaginary"; @"double _Imaginary", @"double
+  imaginary"; @"long double _Imaginary", @"long double imaginary";
+\item @"float _Complex", @"float complex"; @"double _Complex", @"double
+  complex"; @"long double _Complex", @"long double complex".
+\end{itemize}
+Within each item, alternatives separated by commas mean the same thing, and
+Sod will not preserve the distinction between them.
+
+Almost all of these mean the same as they do in C.  There are some minor
+differences:
+\begin{itemize}
+\item In C, the `tag' namespace is shared between @|struct|, @|union|, and
+  @|enum|; Sod has three distinct namespaces for tags.  This may be fixed in
+  the future.
+\item The @<other-declspec> production is a syntactic extension point, where
+  extensions can introduce their own additions to the type system.
+\end{itemize}
+
+C standards from C99 onwards have tended to introduce new keywords beginning
+with an underscore followed by an uppercase letter, so as to avoid conflicts
+with existing code.  More conventional spellings are then provided by macros
+in new header files.  For example, C99 introduced @"_Bool", and a header file
+@|<stdbool.h>| which defines the macro @|bool|.  Sod recognizes both the ugly
+underscore names and the more conventional macro names on input, but always
+emits the ugly names.  This doesn't cause a compatibility problem in Sod,
+because Sod's parser recognizes keywords only in the appropriate context.
+For example, the (ill-advised) slot declaration
+\begin{prog}
+  bool bool;
+\end{prog}
+is completely acceptable, and will cause the C structure member
+\begin{prog}
+  \_Bool bool;
+\end{prog}
+to be emitted on output, which will be acceptable to C as long as
+@|<stdbool.h>| is not included.
+
+A @<type-name> is an identifier which has been declared as being a type name,
+using the @"typename" or @"class" definitions.  The following type names are
+defined in the built-in module.
+\begin{itemize}
+\item @|va_list|
+\item @|size_t|
+\item @|ptrdiff_t|
+\item @|wchar_t|
+\end{itemize}
+
+
+\subsection{Declarators} \label{sec:syntax.type.declarator}
+
+\begin{grammar}
+<declarator>$[k, a]$ ::= @<pointer>^* <primary-declarator>$[k, a]$
+
+<primary-declarator>$[k, a]$ ::= $k$
+\alt "(" <primary-declarator>$[k, a]$ ")"
+\alt <primary-declarator>$[k, a]$ @<declarator-suffix>$[a]$
+
+<pointer> ::= "*" @<qualifier>^*
+
+<declarator-suffix>$[a]$ ::= "[" <c-fragment> "]"
+\alt "(" $a$ ")"
+
+<argument-list> ::= $\epsilon$ | "\dots"
+\alt <list>$[\mbox{@<argument>}]$ @["," "\dots"@]
+
+<argument> ::= @<declaration-specifier>^+ <argument-declarator>
+
+<abstract-declarator> ::= <declarator>$[\epsilon, \mbox{@<argument-list>}]$
+
+<argument-declarator> ::=
+  <declarator>$[\mbox{@<identifier> | $\epsilon$}, \mbox{@<argument-list>}]$
+
+<simple-declarator> ::=
+  <declarator>$[\mbox{@<identifier>}, \mbox{@<argument-list>}]$
+\end{grammar}
+
+The declarator syntax is taken from C, but with some differences.
+\begin{itemize}
+\item Array dimensions are uninterpreted @<c-fragment>s, terminated by a
+  closing square bracket.  This allows array dimensions to contain arbitrary
+  constant expressions.
+\item A declarator may have either a single @<identifier> at its centre or a
+  pair of @<identifier>s separated by a @`.'; this is used to refer to
+  slots or messages defined in superclasses.
+\end{itemize}
+The remaining differences are (I hope) a matter of presentation rather than
+substance.
+
+There is additional syntax to support messages and methods which accept
+keyword arguments.
+
+\begin{grammar}
+<keyword-argument> ::= <argument> @["=" <c-fragment>@]
+
+<keyword-argument-list> ::=
+  @[<list>$[\mbox{@<argument>}]$@]
+  "?" @[<list>$[\mbox{@<keyword-argument>}]$@]
+
+<method-argument-list> ::= <argument-list> | <keyword-argument-list>
+
+<dotted-name> ::= <identifier> "." <identifier>
+
+<keyword-declarator>$[k]$ ::=
+  <declarator>$[k, \mbox{@<method-argument-list>}]$
+\end{grammar}
+
+%%%--------------------------------------------------------------------------
+\section{Properties} \label{sec:syntax.prop}
+
+\begin{grammar}
+<properties> ::= "[" <list>$[\mbox{@<property>}]$ "]"
+
+<property> ::= <identifier> "=" <expression>
+
+<expression> ::= <logical-or>
+
+<logical-or> ::= <logical-and>
+  | <logical-or> "||" <logical-and>
+
+<logical-and> ::= <bitwise-or>
+  | <logical-and> "&&" <bitwise-or>
+
+<bitwise-or> ::= <bitwise-xor>
+  | <bitwise-or> "|" <bitwise-xor>
+
+<bitwise-xor> ::= <bitwise-and>
+  | <bitwise-xor> "^" <bitwise-and>
+
+<bitwise-and> ::= <equality>
+  | <bitwise-and> "&" <equality>
+
+<equality> ::= <ordering>
+  | <equality> "==" <ordering>
+  | <equality> "!=" <ordering>
+
+<ordering> ::= <shift>
+  | <ordering> "<" <shift>
+  | <ordering> "<=" <shift>
+  | <ordering> ">=" <shift>
+  | <ordering> ">" <shift>
+
+<shift> ::= <additive>
+  | <shift> "<<" <additive>
+  | <shift> ">>" <additive>
+
+<additive> ::= <term>
+  | <additive> "+" <term>
+  | <additive> "--" <term>
+
+<term> ::= <factor>
+  | <term> "*" <factor>
+  | <term> "/" <factor>
+
+<factor> ::= <primary>
+  | "!" <factor> | "~" <factor>
+  | "+" <factor> | "--" <factor>
+
+<primary> ::=
+     <integer-literal> | <string-literal> | <char-literal> | <identifier>
+\alt "<" <plain-type> ">" | "{" <c-fragment> "}" | "?" <s-expression>
+  | "(" <expression> ")"
+\end{grammar}
+
+\emph{Property sets} are a means for associating miscellaneous information
+with compile-time metaobjects such as modules, classes, messages, methods,
+slots, and initializers.  By using property sets, additional information can
+be passed to extensions without the need to introduce idiosyncratic syntax.
+(That said, extensions can add additional first-class syntax, if necessary.)
+
+An error is reported if an unrecognized property is associated with an
+object.
+
+
+\subsection{Property values} \label{sec:syntax.prop.value}
+
+A property has a name, given as an @<identifier>, and a value computed by
+evaluating an @<expression>.  The value can be one of a number of types.
+
+\begin{itemize}
+
+\item An @<integer-literal> denotes a value of type @|int|.
+
+\item Similarly @<string-literal> and @<char-literal> denote @|string| and
+  @|char| values respectively.  Note that, as properties, characters are
+  quite distinct from integers, whereas in C, a character literal denotes a
+  value of type @|int|.
+
+\item There are no variables in the property-value syntax.  Rather, an
+  @<identifier> denotes that identifier, as a value of type @|id|.
+
+\item A C type (a @<plain-type>, as described in \xref{sec:syntax.type})
+  between angle brackets, e.g., @|<int>|, or @|<char *>|, or @|<void (*(int,
+  void (*)(int)))(int)>|, denotes that C type, as a value of type @|type|.
+
+\item A @<c-fragment> within braces denotes the tokens between (and not
+  including) the braces, as a value of type @|c-fragment|.
+
+\end{itemize}
+
+As shown in the grammar, there are four binary operators, @"+" (addition),
+@"--" (subtraction), @"*" (multiplication), and @"/" (division);
+multiplication and division have higher precedence than addition and
+subtraction, and operators of the same precedence associate left-to-right.
+There are also unary @"+" (no effect) and @"--" (negation) operators, with
+higher precedence.  All of the above operators act only on integer operands
+and yield integer results.  (Although the unary @"+" operator yields its
+operand unchanged, an error is still reported if it is applied to a
+non-integer value.)  There are currently no bitwise, logical, or comparison
+operators.
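
The precedence and associativity rules just described can be sketched as a small recursive-descent evaluator.  This is only an illustration under stated assumptions: the function names are hypothetical, only decimal integers are tokenized, subtraction is written with a plain `-`, and division truncating toward zero is a guess rather than documented behaviour.

```python
import re

_TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split TEXT into decimal integers, operators and parentheses."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def _truncating_divide(a, b):
    # Assumption: division truncates toward zero, as in C.
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def evaluate(text):
    """Evaluate an integer expression using +, -, *, / and parentheses."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def primary():
        token = take()
        if token == "(":
            value = additive()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        raise ValueError(f"unexpected token {token!r}")

    def factor():
        # Unary operators bind more tightly than any binary operator.
        if peek() == "+":
            take()
            return factor()
        if peek() == "-":
            take()
            return -factor()
        return primary()

    def term():
        # Looping rather than recursing gives left-to-right association.
        value = factor()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= factor()
            else:
                value = _truncating_divide(value, factor())
        return value

    def additive():
        value = term()
        while peek() in ("+", "-"):
            if take() == "+":
                value += term()
            else:
                value -= term()
        return value

    result = additive()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

Each precedence level is one function, so `10 - 4 - 3` groups as `(10 - 4) - 3`, and `1 + 2 * 3` multiplies first.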
+
+Finally, an S-expression preceded by @|?| causes the expression to be read in
+the current package (which is always @|sod-user| at the start of a module)
+and immediately evaluated (using @|eval|); the resulting value is converted
+into a property value using the \descref{gf}{decode-property}[generic
+function].
+
+
+\subsection{Property output types and coercions}
+\label{sec:syntax.prop.coerce}
+
+When a property value is inspected by the Sod translator, or an extension, it
+is \emph{coerced} so as to conform to a requested output type.  This coercion
+process is performed by the \descref{gf}{coerce-property-value}[generic
+function], and additional output types and coercions can be defined by
+extensions.  The built-in output types and coercions, from the value types
+listed above, are as follows.
+
+\begin{itemize}
+
+\item The output types @|int|, @|string|, @|char|, @|id|, and @|c-fragment|
+  correspond to the like-named value types described above.  No coercions to
+  these output types are defined for the described value types.\footnote{%
+    There is a coercion to @|id| from the value type @|symbol|, but it is
+    only possible to generate a property value of type @|symbol| using Lisp.}
+
+\item The output type @|type| denotes a C type, as does the value type
+  @|type|.  In addition, a value of type @|id| can be coerced to a C type if
+  it is the name of a class, a type name explicitly declared by @|typename|,
+  or it is one of: @|bool|, @|_Bool|, @|void|, @|char|, @|short|, @|int|,
+  @|signed|, @|unsigned|, @|long|, @|size_t|, @|ptrdiff_t|, @|wchar_t|,
+  or @|va_list|.
+
+\item The @|boolean| output type denotes a boolean value, which may be either
+  true or false.  A value of type @|id| is considered true if it is @|true|,
+  @|t|, @|yes|, @|on|, @|yup|, or @|verily|; or false if it is @|false|,
+  @|nil|, @|no|, @|off|, @|nope|, or @|nowise|; it is erroneous to provide
+  any other identifier where a boolean value is wanted.  A value of type
+  @|int| is considered true if it is nonzero, or false if it is zero.
+
+\item The @|symbol| output type denotes a Lisp symbol.
+
+  A value of type @|id| is coerced to a symbol as follows.  First, the
+  identifier name is subjected to \emph{case inversion}: if all of the
+  letters in the name have the same case, either upper or lower, then they
+  are replaced with the corresponding letters in the opposite case, lower or
+  upper; if the name contains letters of both cases, then it is not changed.
+  For example, @|foo45| becomes @|FOO45|, or \emph{vice-versa}; but @|Splat|
+  remains as it is.  Second, the name is subjected to \emph{separator
+  switching}: all underscores in the name are replaced with hyphens (and
+  \emph{vice-versa}, though hyphens aren't permitted in identifiers in the
+  first place).  Finally, the resulting name is interned in the current
+  package, which will usually be @|sod-user| unless changed explicitly by the
+  module.
+
+  A value of type @|string| is coerced to a symbol as follows.  If the string
+  contains no colons, then it is case-inverted (but not separator-switched)
+  and interned in the current package.  Otherwise, the string either has the
+  form $p @|:| q$, where $q$ does not begin with a colon (the
+  \emph{single-colon} case) or $p @|::| q$ (the \emph{double-colon} case);
+  where $p$ does not contain a colon.  Both $p$ and $q$ are case-inverted
+  (but not separator-switched).  If $p$ does not name a package, then an
+  error is reported; as a special case, if $p$ is empty, then it is
+  considered to name the @|keyword| package.  Otherwise, $q$ is looked up as
+  a symbol name in package~$p$; in the single-colon case, if the symbol is
+  not an exported symbol in package~$p$, then an error is reported; in the
+  double-colon case, $q$ is interned in package~$p$ (and so there needn't be
+  an exported symbol -- or, indeed, any symbol at all -- named $q$
+  beforehand).
+
+\item The @|keyword| output type denotes symbols within the @|keyword|
+  package.  Values of type @|id| or @|string| can be coerced to a @|keyword|
+  in the same way as to a @|symbol|, as described above, only the converted
+  name is looked up in the @|keyword| package rather than the current
+  package.  (A @|string| can override this by specifying an explicit package
+  name, but this is unlikely to be very helpful.)
+
+\end{itemize}
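
The @|boolean| coercion and the name transformation for @|id|-to-@|symbol| coercion described above can be sketched as follows.  This is a rough model, not the translator's code: identifiers are represented as Python strings, all function names are hypothetical, matching of the boolean identifiers is assumed to be case-sensitive, and interning in a package is not modelled.

```python
TRUE_IDS = frozenset({"true", "t", "yes", "on", "yup", "verily"})
FALSE_IDS = frozenset({"false", "nil", "no", "off", "nope", "nowise"})


def coerce_boolean(value):
    """Coerce an int or an identifier (a str here) to a boolean."""
    if isinstance(value, int):
        return value != 0
    if isinstance(value, str):
        if value in TRUE_IDS:
            return True
        if value in FALSE_IDS:
            return False
    raise ValueError(f"not a boolean value: {value!r}")


def case_invert(name):
    """Swap the case of NAME if all its letters share one case."""
    letters = [ch for ch in name if ch.isalpha()]
    if not letters:
        return name
    if all(ch.islower() for ch in letters):
        return name.upper()
    if all(ch.isupper() for ch in letters):
        return name.lower()
    return name  # mixed case: left unchanged


_SWITCH = str.maketrans("_-", "-_")


def switch_separators(name):
    """Exchange underscores and hyphens in NAME."""
    return name.translate(_SWITCH)


def id_to_symbol_name(name):
    """Name of the symbol to which identifier NAME is coerced."""
    return switch_separators(case_invert(name))
```

For instance, the identifier `module_class` would yield the symbol name `MODULE-CLASS`, while `Splat` is left alone because it mixes cases.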
  
  %%%--------------------------------------------------------------------------
  \section{Module syntax} \label{sec:syntax.module}
@@ -244,16 +548,30 @@ brackets, braces or parenthesis ends the fragment.
  \begin{grammar}
  <module> ::= @<definition>^*
  
-<definition> ::= <import-definition>
+<definition> ::= <property-definition> \fixme{undefined}
+\alt <import-definition>
  \alt <load-definition>
  \alt <lisp-definition>
  \alt <code-definition>
  \alt <typename-definition>
  \alt <class-definition>
+\alt <other-definition> \fixme{undefined}
  \end{grammar}
  
-A @<module> is the top-level syntactic item.  A module consists of a sequence
-of definitions.
+A @<module> is the top-level syntactic item: a source file presented to Sod
+is expected to conform with the @<module> syntax.
+
+A module consists of a sequence of definitions.
+
+\fixme{describe syntax; expand}
+Properties:
+\begin{description}
+\item[@|module_class|] A symbol naming the Lisp class to use to
+  represent the module.
+\item[@|guard|] An identifier to use as the guard symbol which prevents
+  multiple inclusion of the header file.
+\end{description}
+
  
  \subsection{Simple definitions} \label{sec:syntax.module.simple}
  
@@ -268,9 +586,9 @@ A search is made for a module source file as follows.
  \begin{itemize}
  \item The module name @<string> is converted into a filename by appending
    @`.sod', if it has no extension already.\footnote{%
-    Technically, what happens is \textsf{(merge-pathnames name (make-pathname
-    :type "SOD" :case :common))}, so exactly what this means varies
-    according to the host system.} %
+    Technically, what happens is @|(merge-pathnames name (make-pathname :type
+    "SOD" :case :common))|, so exactly what this means varies according to
+    the host system.} %
  \item The file is looked for relative to the directory containing the
    importing module.
  \item If that fails, then the file is looked for in each directory on the
@@ -280,7 +598,7 @@ A search is made for a module source file as follows.
  \end{itemize}
  At this point, if the file has previously been imported, nothing further
  happens.\footnote{%
-  This check is done using \textsf{truename}, so it should see through simple
+  This check is done using @|truename|, so it should see through simple
    tricks like symbolic links.  However, it may be confused by fancy things
    like bind mounts and so on.} %
  
@@ -297,23 +615,22 @@ A search is made for a Lisp source file as follows.
  \begin{itemize}
  \item The name @<string> is converted into a filename by appending @`.lisp',
    if it has no extension already.\footnote{%
-    Technically, what happens is \textsf{(merge-pathnames name (make-pathname
-    :type "LISP" :case :common))}, so exactly what this means varies
-    according to the host system.} %
+    Technically, what happens is @|(merge-pathnames name (make-pathname :type
+    "LISP" :case :common))|, so exactly what this means varies according to
+    the host system.} %
  \item A search is then made in the same manner as for module imports
    (\xref{sec:syntax-module}).
  \end{itemize}
-If the file is found, it is loaded using the host Lisp's \textsf{load}
-function.
+If the file is found, it is loaded using the host Lisp's @|load| function.
  
  Note that Sod doesn't attempt to compile Lisp files, or even to look for
  existing compiled files.  The right way to package a substantial extension to
  the Sod translator is to provide the extension as a standard ASDF system (or
-similar) and leave a dropping @"foo-extension.lisp" in the module path saying
+similar) and leave a dropping @|foo-extension.lisp| in the module path saying
  something like
-\begin{quote}
-  \textsf{(asdf:load-system :foo-extension)}
-\end{quote}
+\begin{prog}
+  (asdf:load-system :foo-extension)
+\end{prog}
  which will arrange for the extension to be compiled if necessary.
  
  (This approach means that the language doesn't need to depend on any
@@ -353,20 +670,24 @@ declarations instead.
  
  \begin{grammar}
  <code-definition> ::=
-  "code" <identifier> ":" <item-name> @[<constraints>@]
+  "code" <reason> ":" <item-name> @[<constraints>@]
    "{" <c-fragment> "}"
+\alt
+  "code" <reason> ":" <constraints> ";"
+
+<reason> ::= <identifier>
  
  <constraints> ::= "[" <list>$[\mbox{@<constraint>}]$ "]"
  
  <constraint> ::= @<item-name>^+
  
-<item-name> ::= <identifier> @! "(" @<identifier>^+ ")"
+<item-name> ::= <identifier> | "(" @<identifier>^+ ")"
  \end{grammar}
  
  The @<c-fragment> will be output unchanged to one of the output files.
  
  The first @<identifier> is the symbolic name of an output file.  Predefined
-output file names are @"c" and @"h", which are the implementation code and
+output file names are @|c| and @|h|, which are the implementation code and
  header file respectively; other output files can be defined by extensions.
  
  Output items are named with a sequence of identifiers, separated by
@@ -383,184 +704,31 @@ must appear in the order given -- though the translator is free to insert
  additional items in between them.  (The particular output items needn't be
  defined already -- indeed, they needn't be defined ever.)
  
-There is a predefined output item @"includes" in both the @"c" and @"h"
-output files which is a suitable place for inserting @"\#include"
+There is a predefined output item @|includes| in both the @|c| and @|h|
+output files which is a suitable place for inserting @|\#include|
  preprocessor directives in order to declare types and functions for use
  elsewhere in the generated output files.
  
  
-\subsection{Property sets} \label{sec:syntax.module.properties}
-\begin{grammar}
-<properties> ::= "[" <list>$[\mbox{@<property>}]$ "]"
-
-<property> ::= <identifier> "=" <expression>
-\end{grammar}
-
-Property sets are a means for associating miscellaneous information with
-classes and related items.  By using property sets, additional information
-can be passed to extensions without the need to introduce idiosyncratic
-syntax.
-
-A property has a name, given as an @<identifier>, and a value computed by
-evaluating an @<expression>.  The value can be one of a number of types,
-though the only operators currently defined act on integer values only.
-
-\subsubsection{The expression evaluator}
-\begin{grammar}
-<expression> ::= <term> | <expression> "+" <term> | <expression> "-" <term>
-
-<term> ::= <factor> | <term> "*" <factor> | <term> "/" <factor>
-
-<factor> ::= <primary> | "+" <factor> | "-" <factor>
-
-<primary> ::=
-     <integer-literal> | <string-literal> | <char-literal> | <identifier>
-\alt "?" <s-expression>
-\alt "(" <expression> ")"
-\end{grammar}
-
-The arithmetic expression syntax is simple and standard; there are currently
-no bitwise, logical, or comparison operators.
-
-A @<primary> expression may be a literal or an identifier.  Note that
-identifiers stand for themselves: they \emph{do not} denote values.  For more
-fancy expressions, the syntax
-\begin{quote}
-  @"?" @<s-expression>
-\end{quote}
-causes the @<s-expression> to be evaluated using the Lisp \textsf{eval}
-function.
-%%% FIXME crossref to extension docs
-
-
-\subsection{C types} \label{sec:syntax.module.types}
-
-Sod's syntax for C types closely mirrors the standard C syntax.  A C type has
-two parts: a sequence of @<declaration-specifier>s and a @<declarator>.  In
-Sod, a type must contain at least one @<declaration-specifier> (i.e.,
-`implicit @"int"' is forbidden), and storage-class specifiers are not
-recognized.
-
-\subsubsection{Declaration specifiers}
-\begin{grammar}
-<declaration-specifier> ::= <type-name>
-\alt "struct" <identifier> | "union" <identifier> | "enum" <identifier>
-\alt "void" | "char" | "int" | "float" | "double"
-\alt "short" | "long"
-\alt "signed" | "unsigned"
-\alt "bool" | "_Bool"
-\alt "imaginary" | "_Imaginary" | "complex" | "_Complex"
-\alt <qualifier>
-\alt <storage-specifier>
-\alt <atomic-type>
-
-<qualifier> ::= <atomic> | "const" | "volatile" | "restrict"
-
-<atomic-type> ::=
-  <atomic> "(" @<declaration-specifier>^+ <abstract-declarator> ")"
-
-<atomic> ::= "atomic" | "_Atomic"
-
-<storage-specifier> ::= <alignas> "(" <c-fragment> ")"
-
-<alignas> ::= "alignas" "_Alignas"
-
-<type-name> ::= <identifier>
-\end{grammar}
-
-A @<type-name> is an identifier which has been declared as being a type name,
-using the @"typename" or @"class" definitions.  The following type names are
-defined in the built-in module.
-\begin{itemize}
-\item @"va_list"
-\item @"size_t"
-\item @"ptrdiff_t"
-\item @"wchar_t"
-\end{itemize}
-
-Declaration specifiers may appear in any order.  However, not all
-combinations are permitted.  A declaration specifier must consist of zero or
-more @<qualifier>s, zero or more @<storage-specifier>s, and one of the
-following, up to reordering.
-\begin{itemize}
-\item @<type-name>
-\item @<atomic-type>
-\item @"struct" @<identifier>, @"union" @<identifier>, @"enum" @<identifier>
-\item @"void"
-\item @"_Bool", @"bool"
-\item @"char", @"unsigned char", @"signed char"
-\item @"short", @"unsigned short", @"signed short"
-\item @"short int", @"unsigned short int", @"signed short int"
-\item @"int", @"unsigned int", @"signed int", @"unsigned", @"signed"
-\item @"long", @"unsigned long", @"signed long"
-\item @"long int", @"unsigned long int", @"signed long int"
-\item @"long long", @"unsigned long long", @"signed long long"
-\item @"long long int", @"unsigned long long int", @"signed long long int"
-\item @"float", @"double", @"long double"
-\item @"float _Imaginary", @"double _Imaginary", @"long double _Imaginary"
-\item @"float imaginary", @"double imaginary", @"long double imaginary"
-\item @"float _Complex", @"double _Complex", @"long double _Complex"
-\item @"float complex", @"double complex", @"long double complex"
-\end{itemize}
-All of these have their usual C meanings.
+\subsection{Static instance definitions} \label{sec:syntax.module.instance}
  
-\subsubsection{Declarators}
  \begin{grammar}
-<declarator>$[k, a]$ ::= @<pointer>^* <primary-declarator>$[k, a]$
+<static-instance-definition> ::=
+  "instance" <identifier> <identifier>
+  @[":" <list>$[\mbox{@<instance-initializer>}]$@] ";"
  
-<primary-declarator>$[k, a]$ ::= $k$
-\alt "(" <primary-declarator>$[k, a]$ ")"
-\alt <primary-declarator>$[k, a]$ @<declarator-suffix>$[a]$
-
-<pointer> ::= "*" @<qualifier>^*
-
-<declarator-suffix>$[a]$ ::= "[" <c-fragment> "]"
-\alt "(" $a$ ")"
-
-<argument-list> ::= $\epsilon$ | "..."
-\alt <list>$[\mbox{@<argument>}]$ @["," "..."@]
-
-<argument> ::= @<declaration-specifier>^+ <argument-declarator>
-
-<abstract-declarator> ::= <declarator>$[\epsilon, \mbox{@<argument-list>}]$
-
-<argument-declarator> ::= <declarator>$[\mbox{@<identifier> @! $\epsilon$}]$
-<argument-declarator> ::=
-  <declarator>$[\mbox{@<identifier> @! $\epsilon$}, \mbox{@<argument-list>}]$
-
-<simple-declarator> ::=
-  <declarator>$[\mbox{@<identifier>}, \mbox{@<argument-list>}]$
+<instance-initializer> ::= <identifier> "." <identifier> "=" <c-fragment>
  \end{grammar}
  
-The declarator syntax is taken from C, but with some differences.
-\begin{itemize}
-\item Array dimensions are uninterpreted @<c-fragments>, terminated by a
-  closing square bracket.  This allows array dimensions to contain arbitrary
-  constant expressions.
-\item A declarator may have either a single @<identifier> at its centre or a
-  pair of @<identifier>s separated by a @`.'; this is used to refer to
-  slots or messages defined in superclasses.
-\end{itemize}
-The remaining differences are (I hope) a matter of presentation rather than
-substance.
-
-There is additional syntax to support messages and methods which accept
-keyword arguments.
-
-\begin{grammar}
-<keyword-argument> ::= <argument> @["=" <c-fragment>@]
-
-<keyword-argument-list> ::=
-  @[<list>$[\mbox{@<argument>}]$@]
-  "?" @[<list>$[\mbox{@<keyword-argument>}]$@]
-
-<method-argument-list> ::= <argument-list> @! <keyword-argument-list>
-
-<dotted-name> ::= <identifier> "." <identifier>
-
-<keyword-declarator>$[k]$ ::=
-  <declarator>$[k, \mbox{@<method-argument-list>}]$
-\end{grammar}
+Properties:
+\begin{description}
+\item[@"extern"] A boolean flag: if true, then the instance is public, and
+  will be declared in the output header file; if false (the default), then
+  the instance is only available to code defined within the module.
+\item[@"const"] A boolean flag: if true (the default), then the instance is
+  read-only, and may end up in write-protected storage at run-time; if false,
+  then the instance will be writable.
+\end{description}
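+
+For example, the following sketch defines a class and a static instance of
+it.  It assumes that the first @<identifier> names the class and the second
+names the instance, and that the first @<identifier> in each
+@<instance-initializer> is a class nickname, as for slot initializers; the
+names @|Example|, @|eg|, and @|example_default| are purely illustrative.
+\begin{prog}
+[nick = eg]                                                     \\
+class Example: SodObject \{                                     \\ \ind
+  int foo = 17;                                               \-\\
+\};                                                             \\+
+
+instance Example example_default: eg.foo = 23;
+\end{prog}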
  
  
  \subsection{Class definitions} \label{sec:syntax.module.class}
@@ -579,14 +747,14 @@ A @<class-forward-declaration> informs Sod that an @<identifier> will be used
  to name a class which is currently undefined.  Forward declarations are
  necessary in order to resolve certain kinds of circularity.  For example,
  \begin{prog}
-class Sub;
-\\+
-class Super : SodObject \{ \\ \ind
-  Sub *sub; \- \\
-\};
-\\+
-class Sub : Super \{ \\ \ind
-  /* \dots */ \- \\
+class Sub;                                                      \\+
+
+class Super: SodObject \{                                       \\ \ind
+  Sub *sub;                                                   \-\\
+\};                                                             \\+
+
+class Sub: Super \{                                             \\ \ind
+  /* \dots\ */                                                \-\\
  \};
  \end{prog}
  
@@ -605,6 +773,7 @@ class Sub : Super \{ \\ \ind
  \alt <fragment-item>
  \alt <message-item>
  \alt <method-item>
+\alt <other-item> \fixme{undefined}
  \end{grammar}
  
  A full class definition provides a complete description of a class.
@@ -618,24 +787,25 @@ The @<list>$[\mbox{@<identifier>}]$ names the direct superclasses for the new
  class.  It is an error if any of these @<identifier>s does not name a defined
  class.  The superclass list is required, and must not be empty; listing
  @|SodObject| as your class's superclass is a good choice if nothing else
-seems suitable.  It's not possible to define a \emph{root class} in the Sod
-language: you must use Lisp to do this, and it's quite involved.
+seems suitable.  A class with no direct superclasses is called a \emph{root
+class}.  It is not possible to define a root class in the Sod language: you
+must use Lisp to do this, and it's quite involved.
  
  The @<properties> provide additional information.  The standard class
  properties are as follows.
  \begin{description}
-\item[@"lisp_class"] The name of the Lisp class to use within the translator
+\item[@|lisp_class|] The name of the Lisp class to use within the translator
    to represent this class.  The property value must be an identifier; the
-  default is @"sod_class".  Extensions may define classes with additional
+  default is @|sod_class|.  Extensions may define classes with additional
    behaviour, and may recognize additional class properties.
-\item[@"metaclass"] The name of the Sod metaclass for this class.  In the
+\item[@|metaclass|] The name of the Sod metaclass for this class.  In the
    generated code, a class is itself an instance of another class -- its
    \emph{metaclass}.  The metaclass defines which slots the class will have,
    which messages it will respond to, and what its behaviour will be when it
    receives them.  The property value must be an identifier naming a defined
-  subclass of @"SodClass".  The default metaclass is @"SodClass".
-  %%% FIXME xref to theory
-\item[@"nick"] A nickname for the class, to be used to distinguish it from
+  subclass of @|SodClass|.  The default metaclass is @|SodClass|.
+  See \xref{sec:concepts.metaclasses} for more details.
+\item[@|nick|] A nickname for the class, to be used to distinguish it from
    other classes in various limited contexts.  The property value must be an
    identifier; the default is constructed by forcing the class name to
    lower-case.
@@ -661,21 +831,32 @@ It is not possible to declare a slot with function type: such an item is
  interpreted as being a @<message-item> or @<method-item>.  Pointers to
  functions are fine.
  
+Properties:
+\begin{description}
+\item[@|slot_class|] A symbol naming the Lisp class to use to represent the
+  direct slot.
+\item[@|initarg|] An identifier naming an initialization argument which can
+  be used to provide a value for the slot.  See
+  \xref{sec:concepts.lifecycle.birth} for the details.
+\item[@|initarg_class|] A symbol naming the Lisp class to use to represent
+  the initarg.  Only permitted if @|initarg| is also set.
+\end{description}
+
  An @<initializer>, if present, is treated as if a separate
  @<initializer-item> containing the slot name and initializer were present.
  For example,
  \begin{prog}
-[nick = eg] \\-
-class Example : Super \{ \\ \ind
-  int foo = 17; \- \\
+[nick = eg]                                                     \\
+class Example: Super \{                                         \\ \ind
+  int foo = 17;                                               \-\\
  \};
  \end{prog}
  means the same as
  \begin{prog}
-[nick = eg] \\-
-class Example : Super \{ \\ \ind
-  int foo; \\
-  eg.foo = 17; \- \\
+[nick = eg]                                                     \\
+class Example: Super \{                                         \\ \ind
+  int foo;                                                      \\
+  eg.foo = 17;                                                \-\\
  \};
  \end{prog}
  
@@ -685,21 +866,29 @@ class Example : Super \{ \\ \ind
  
  <slot-initializer> ::= <dotted-name> @["=" <initializer>@]
  
-<initializer> :: <c-fragment>
+<initializer> ::= <c-fragment>
  \end{grammar}
  
  An @<initializer-item> provides an initial value for one or more slots.  If
-prefixed by @"class", then the initial values are for class slots (i.e.,
+prefixed by @|class|, then the initial values are for class slots (i.e.,
  slots of the class object itself); otherwise they are for instance slots.
  
  The first component of the @<dotted-name> must be the nickname of one of the
  class's superclasses (including itself); the second must be the name of a
  slot defined in that superclass.
  
-An @|initarg| property may be set on an instance slot initializer (or a
-direct slot definition).  See \xref{sec:concepts.lifecycle.birth} for the
-details.  An initializer item must have either an @|initarg| property, or an
-initializer expression, or both.
+Properties:
+\begin{description}
+\item[@|initializer_class|] A symbol naming the Lisp class to use to
+  represent the initializer.
+\item[@|initarg|] An identifier naming an initialization argument which can
+  be used to provide a value for the slot.  See
+  \xref{sec:concepts.lifecycle.birth} for the details.  An initializer item
+  must have either an @|initarg| property, or an initializer expression, or
+  both.
+\item[@|initarg_class|] A symbol naming the Lisp class to use to represent
+  the initarg.  Only permitted if @|initarg| is also set.
+\end{description}
  
  Each class may define at most one initializer item with an explicit
  initializer expression for a given slot.
@@ -711,6 +900,11 @@ initializer expression for a given slot.
    @<declaration-specifier>^+
    <list>$[\mbox{@<init-declarator>}]$ ";"
  \end{grammar}
+Properties:
+\begin{description}
+\item[@|initarg_class|] A symbol naming the Lisp class to use to represent
+  the initarg.
+\end{description}
  
  \subsubsection{Fragment items}
  \begin{grammar}
@@ -726,6 +920,52 @@ initializer expression for a given slot.
    <keyword-declarator>$[\mbox{@<identifier>}]$
    @[<method-body>@]
  \end{grammar}
+Properties:
+\begin{description}
+\item[@|message_class|] A symbol naming the Lisp class to use to represent
+  the message.
+\item[@|readonly|] A boolean indicating whether the message guarantees not to
+  modify its receiver.  If this is true, the receiver will be declared
+  @"const".
+\item[@|combination|] A keyword naming the aggregating method combination to
+  use.
+\item[@|most_specific|] A keyword, either @`first' or @`last', according to
+  whether the most specific applicable method should be invoked first or
+  last.
+\end{description}
+
+Properties for the @|custom| aggregating method combination:
+\begin{description}
+\item[@|retvar|] An identifier naming the variable which holds the return
+  value of the effective method.  The default is @|sod__ret|.  Only permitted
+  if the message return type is not @|void|.
+\item[@|valvar|] An identifier naming the variable which holds, in turn, the
+  return value from each direct method called by the effective method.  The
+  default is @|sod__val|.  Only permitted if the method return type (see
+  @|methty| below) is not @|void|.
+\item[@|methty|] A C type, which is the return type for direct methods of
+  this message.  The default is the return type of the message.
+\item[@|decls|] A code fragment containing declarations to be inserted at the
+  head of the effective method body.  The default is to insert nothing.
+\item[@|before|] A code fragment containing initialization to be performed at
+  the beginning of the effective method body.  The default is to insert
+  nothing.
+\item[@|empty|] A code fragment executed if there are no primary methods;
+  it should usually store a suitable (identity) value in the variable named
+  by @|retvar|.  The default is not to emit an effective method at all if
+  there are no primary methods.
+\item[@|first|] A code fragment to set the return value after calling the
+  first applicable direct method.  The default is to use the @|each|
+  fragment.
+\item[@|each|] A code fragment to set the return value after calling a direct
+  method.  If @|first| is also set, then it is used after the first direct
+  method instead of this.  The default is to insert nothing, which is
+  probably not what you want.
+\item[@|after|] A code fragment inserted at the end of the effective method
+  body.  The default is to insert nothing.
+\item[@|count|] An identifier naming a variable to be declared in the
+  effective method body, of type @|size_t|, holding the number of applicable
+  methods.  The default is not to provide such a variable.
+\end{description}
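+
+As a rough sketch, a message whose effective method sums the values returned
+by its direct methods might be declared as follows.  This assumes that
+code-fragment property values are written in braces and that several
+properties may be listed together, separated by commas; the class, message,
+and variable names are invented for illustration.
+\begin{prog}
+class Basket: SodObject \{                                      \\ \ind
+  [combination = custom,                                        \\
+   retvar = total, valvar = n,                                  \\
+   before = \{ total = 0; \},                                   \\
+   each = \{ total += n; \}]                                    \\
+  int weight();                                               \-\\
+\};
+\end{prog}
+Here @|before| initializes the running total and @|each| folds in the value
+left in @|n| by each direct method.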
  
  \subsubsection{Method items}
  \begin{grammar}
@@ -736,6 +976,14 @@ initializer expression for a given slot.
  
  <method-body> ::= "{" <c-fragment> "}" | "extern" ";"
  \end{grammar}
+Properties:
+\begin{description}
+\item[@|method_class|] A symbol naming the Lisp class to use to represent
+  the direct method.
+\item[@|role|] A keyword naming the direct method's rôle.  For the built-in
+  `simple' message classes, the acceptable rôle names are @|before|,
+  @|after|, and @|around|.  By default, a primary method is constructed.
+\end{description}
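+
+For instance, assuming that properties are written in brackets before the
+item they modify (as they are for classes), a hypothetical class @|Example|
+with nickname @|eg| might attach a @|before| method to its own message
+@|frob| like this.
+\begin{prog}
+[nick = eg]                                                     \\
+class Example: SodObject \{                                     \\ \ind
+  void frob();                                                  \\
+  [role = before]                                               \\
+  void eg.frob() \{ /* \dots\ */ \}                           \-\\
+\};
+\end{prog}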
  
  %%%----- That's all, folks --------------------------------------------------