doc/: Sort out the manual structure. Write stuff.

author Mark Wooding <mdw@distorted.org.uk>

Sun, 30 Aug 2015 09:58:38 +0000 (10:58 +0100)

committer Mark Wooding <mdw@distorted.org.uk>

Thu, 17 Sep 2015 10:17:16 +0000 (11:17 +0100)
author Mark Wooding <mdw@distorted.org.uk>
Sun, 30 Aug 2015 09:58:38 +0000 (10:58 +0100)
committer Mark Wooding <mdw@distorted.org.uk>
Thu, 17 Sep 2015 10:17:16 +0000 (11:17 +0100)
diff --git a/doc/sod-protocol.tex b/doc/clang.tex

similarity index 68%

rename from doc/sod-protocol.tex

rename to doc/clang.tex

index 6ef829b..c26ece5 100644 (file)
--- a/doc/sod-protocol.tex
+++ b/doc/clang.tex
@@ -1,13 +1,13 @@
  %%% -*-latex-*-
  %%%
-%%% Description of the internal class structure and protocol
+%%% C language utilities
  %%%
-%%% (c) 2009 Straylight/Edgeware
+%%% (c) 2015 Straylight/Edgeware
  %%%
  
  %%%----- Licensing notice ---------------------------------------------------
  %%%
-%%% This file is part of the Simple Object Definition system.
+%%% This file is part of the Sensble Object Design, an object system for C.
  %%%
  %%% SOD is free software; you can redistribute it and/or modify
  %%% it under the terms of the GNU General Public License as published by
@@ -23,152 +23,12 @@
  %%% along with SOD; if not, write to the Free Software Foundation,
  %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
  
-\chapter{Protocol overview} \label{ch:proto}
-
-This chapter provides an overview of the Sod translator's internal object
-model.  It describes most of the important classes and generic functions, how
-they are used to build a model of a Sod module and produce output code, and
-how an extension might modify the translator's behaviour.
-
-I assume familiarity with the Common Lisp Object System (CLOS).  Familiarity
-with the CLOS Metaobject Protocol isn't necessary but may be instructive.
-
-%%%--------------------------------------------------------------------------
-\section{A tour through the translator}
-
-At the very highest level, the Sod translator works in two phases: it
-\emph{parses} source files into an internal representation, and then it
-\emph{generates} output files from the internal representation.
-
-The function @|read-module| is given a pathname for a file: it opens the
-file, parses the program text, and returns a @|module| instance describing
-the classes and other items found.
-
-At the other end, the main output function is @|output-module|, which is
-given a module, an output stream and a 
-
-
-%%%--------------------------------------------------------------------------
-\section{Specification conventions} \label{sec:proto.conventions}
-
-Throughout this specification, the phrase `it is an error' indicates that a
-particular circumstance is erroneous and results in unspecified and possibly
-incorrect behaviour.  In particular, the situation need not be immediately
-diagnosed, and the consequences may be far-reaching.
-
-The following conventions apply throughout this specification.
-
-\begin{itemize}
-
-\item If a specification describes an argument as having a particular type or
-  syntax, then it is an error to provide an argument not having that
-  particular type or syntax.
-
-\item If a specification describes a function then that function might be
-  implemented as a generic function; it is an error to attempt to (re)define
-  it as a generic function, or to attempt to add methods to it.  A function
-  specified as being a generic function will certainly be so; if user methods
-  are permitted on the generic function then this will be specified.
-
-\item Where a class precedence list is specified, either explicitly or
-  implicitly by a class hierarchy, the implementation may include additional
-  superclasses not specified here.  Such additional superclasses will not
-  affect the order of specified classes in the class precedence lists either
-  of specified classes themselves or of user-defined subclasses of specified
-  classes.
-
-\item Unless otherwise specified, generic functions use the standard method
-  combination.
-
-\item The specifications for methods are frequently brief; they should be
-  read in conjunction with and in the context of the specification for the
-  generic function and specializing classes, if any.
-
-\item An object $o$ is a \emph{direct instance} of a class $c$ if @|(eq
-  (class-of $o$) $c$)|; $o$ is an \emph{instance} of $c$ if it is a direct
-  instance of any subclass of $c$.
-
-\item If a class is specified as being \emph{abstract} then it is an error to
-  construct direct instances of it, e.g., using @|make-instance|.
-
-\item If an object is specified as being \emph{immutable} then it is an error
-  to mutate it, e.g., using @|(setf (slot-value \ldots) \ldots)|.  Programs
-  may rely on immutable objects retaining their state.
-
-\item A value is \emph{fresh} if it is guaranteed to be not @|eql| to any
-  previously existing value.
-
-\item Unless otherwise specified, it is an error to change the class of an
-  instance of any class described here; and it is an error to change the
-  class of an object to a class described here.
-
-\end{itemize}
-
-\subsection{Format of the entries} \label{sec:proto.conventions.format}
-
-Most symbols defined by the protocol have their own entries.  An entry begins
-with a header line, showing a synopsis of the symbol on the left, and the
-category (function, class, macro, etc.) on the right.
-
-\begin{describe}{fun}{example-function @<required>
-    \&optional @<optional>
-    \&rest @<rest>
-    \&key :keyword}
-  The synopsis for a function, generic function or method describes the
-  function's lambda-list using the usual syntax.  Note that keyword arguments
-  are shown by naming their keywords; in the description, the value passed
-  for the keyword argument @|:keyword| is shown as @<keyword>.
-
-  For a method, specializers are shown using the usual @|defmethod| syntax,
-  e.g.,
-  \begin{quote} \sffamily
-    some-generic-function ((@<specialized> list) @<unspecialized>)
-  \end{quote}
-\end{describe}
-
-\begin{describe}{mac}{example-macro
-  ( @{ @<symbol> @! (@<symbol> @<form>) @}^* ) \\ \push
-    @[[ @<declaration>^* @! @<documentation-string> @]] \\
-    @<body-form>^*}
-  The synopsis for a macro describes the acceptable syntax using the
-  following notation.
-  \begin{itemize}
-  \item Literal symbols, e.g., keywords and parenthesis, are shown in
-    @|monospace|.
-  \item Metasyntactic variables are shown in @<italics>.
-  \item Items are grouped together by braces `@{ $\dots$ @}'.  The notation
-    `@{ $\dots$ @}^*' indicates that the enclosed items may be repeated zero
-    or more times; `@{ $\dots$ @}^+' indicates that the enclosed items may be
-    repeated one or more times.  This notation may be applied to a single
-    item without the braces.
-  \item Optional items are shown enclosed in brackets `@[ $\dots$ @]'.
-  \item Alternatives are separated by vertical bars `@!'; the vertical bar
-    has low precedence, so alternatives extend as far as possible between
-    bars and up to the enclosing brackets if any.
-  \item A sequence of alternatives enclosed in double-brackets `@[[ $\ldots$
-    @]]' indicates that the alternatives may occur in any order, but each may
-    appear at most once unless marked by a star.
-  \end{itemize}
-  For example, the notation at the head of this example describes syntax
-  for @|let|.
-\end{describe}
-
-\begin{describe}{cls}{example-class (direct-super other-direct-super) \&key
-    :initarg}
-  The synopsis for a class lists the class's direct superclasses, and the
-  acceptable initargs in the form of a lambda-list.  The initargs may be
-  passed to @|make-instance| when constructing an instance of the class or a
-  subclass of it.  If instances of the class may be reinitialized, or if
-  objects can be changed to be instances of the class, then these initargs
-  may also be passed to @|reinitialize-instance| and/or @|change-class| as
-  applicable; the class description will state explicitly when these
-  operations are allowed.
-\end{describe}
+\chapter{C language utilities} \label{ch:clang}
  
  %%%--------------------------------------------------------------------------
-\section{C type representation} \label{sec:proto.c-types}
+\section{C type representation} \label{sec:clang.c-types}
  
-\subsection{Overview} \label{sec:proto.c-types.over}
+\subsection{Overview} \label{sec:clang.c-types.over}
  
  The Sod translator represents C types in a fairly simple and direct way.
  However, because it spends a fair amount of its time dealing with C types, it
@@ -178,11 +38,11 @@ The class hierarchy is shown in~\xref{fig:proto.c-types}.
  
  \begin{figure} \centering
    \parbox{10pt}{\begin{tabbing}
-    @|c-type| \\ \push
-      @|qualifiable-c-type| \\ \push
-        @|simple-c-type| \\ \push
+    @|c-type| \\ \ind
+      @|qualifiable-c-type| \\ \ind
+        @|simple-c-type| \\ \ind
            @|c-class-type| \- \\
-        @|tagged-c-type| \\ \push
+        @|tagged-c-type| \\ \ind
            @|c-struct-type| \\
            @|c-union-type| \\
            @|c-enum-type| \- \\
@@ -228,7 +88,7 @@ similar names.
  Neither generic function defines a default primary method; subclasses of
  @|c-type| must define their own methods in order to print correctly.
  
-\subsection{The C type root class} \label{sec:proto.c-types.root}
+\subsection{The C type root class} \label{sec:clang.c-types.root}
  
  \begin{describe}{cls}{c-type ()}
    The class @|c-type| marks the root of the built-in C type hierarchy.
@@ -240,7 +100,7 @@ Neither generic function defines a default primary method; subclasses of
    The class @|c-type| is abstract.
  \end{describe}
  
-\subsection{C type S-expression notation} \label{sec:proto.c-types.sexp}
+\subsection{C type S-expression notation} \label{sec:clang.c-types.sexp}
  
  The S-expression representation of a type is described syntactically as a
  type specifier.  Type specifiers fit into two syntactic categories.
@@ -254,13 +114,14 @@ type specifier.  Type specifiers fit into two syntactic categories.
    arguments to the type operator.
  \end{itemize}
  
-\begin{describe}{mac}{c-type @<type-spec> @to @<type>}
+\begin{describe}{mac}{c-type @<type-spec> @> @<c-type>}
    Evaluates to a C type object, as described by the type specifier
    @<type-spec>.
  \end{describe}
  
-\begin{describe}{mac}{
-    defctype @{ @<name> @! (@<name>^*) @} @<type-spec> @to @<names>}
+\begin{describe}{mac}
+    {defctype @{ @<name> @! (@<name> @<nickname>^*) @} @<type-spec>
+      @> @<names>}
    Defines a new symbolic type specifier @<name>; if a list of @<name>s is
    given, then all are defined in the same way.  The type constructed by using
    any of the @<name>s is as described by the type specifier @<type-spec>.
@@ -270,16 +131,14 @@ type specifier.  Type specifiers fit into two syntactic categories.
    @<name> is used in a type specifier.
  \end{describe}
  
-\begin{describe}{mac}{c-type-alias @<original> @<alias>^* @to @<aliases>}
+\begin{describe}{mac}{c-type-alias @<original> @<alias>^* @> @<aliases>}
    Defines each @<alias> as being a type operator identical in behaviour to
    @<original>.  If @<original> is later redefined then the behaviour of the
    @<alias>es changes too.
  \end{describe}
  
-\begin{describe}{mac}{%
-  define-c-type-syntax @<name> @<lambda-list> \\ \push
-    @<form>^* \-\\
-  @to @<name>}
+\begin{describe}{mac}
+    {define-c-type-syntax @<name> @<lambda-list> @<form>^* @> @<name>}
    Defines the symbol @<name> as a new type operator.  When a list of the form
    @|(@<name> @<argument>^*)| is used as a type specifier, the @<argument>s
    are bound to fresh variables according to @<lambda-list> (a destructuring
@@ -291,12 +150,12 @@ type specifier.  Type specifiers fit into two syntactic categories.
    type specifiers among its arguments.
  \end{describe}
  
-\begin{describe}{fun}{expand-c-type-spec @<type-spec> @to @<form>}
+\begin{describe}{fun}{expand-c-type-spec @<type-spec> @> @<form>}
    Returns the Lisp form that @|(c-type @<type-spec>)| would expand into.
  \end{describe}
  
-\begin{describe}{gf}{%
-    print-c-type @<stream> @<type> \&optional @<colon> @<atsign>}
+\begin{describe}{gf}
+    {print-c-type @<stream> @<type> \&optional @<colon> @<atsign>}
    Print the C type object @<type> to @<stream> in S-expression form.  The
    @<colon> and @<atsign> arguments may be interpreted in any way which seems
    appropriate: they are provided so that @|print-c-type| may be called via
@@ -307,14 +166,15 @@ type specifier.  Type specifiers fit into two syntactic categories.
    default method.
  \end{describe}
  
-\subsection{Comparing C types} \label{sec:proto.c-types.cmp}
+\subsection{Comparing C types} \label{sec:clang.c-types.cmp}
  
  It is necessary to compare C types for equality, for example when checking
  argument lists for methods.  This is done by @|c-type-equal-p|.
  
-\begin{describe}{gf}{c-type-equal-p @<type>_1 @<type>_2 @to @<boolean>}
-  The generic function @|c-type-equal-p| compares two C types @<type>_1 and
-  @<type>_2 for equality; it returns true if the two types are equal and
+\begin{describe}{gf}
+    {c-type-equal-p @<c-type>_1 @<c-type>_2 @> @<generalized-boolean>}
+  The generic function @|c-type-equal-p| compares two C types @<c-type>_1 and
+  @<c-type>_2 for equality; it returns true if the two types are equal and
    false if they are not.
  
    Two types are equal if they are structurally similar, where this property
@@ -323,24 +183,24 @@ argument lists for methods.  This is done by @|c-type-equal-p|.
  
    The generic function @|c-type-equal-p| uses the @|and| method combination.
  
-  \begin{describe}{meth}{c-type-equal-p @<type>_1 @<type>_2}
+  \begin{describe}{meth}{c-type-equal-p @<c-type>_1 @<c-type>_2}
      A default primary method for @|c-type-equal-p| is defined.  It simply
      returns @|nil|.  This way, methods can specialize on both arguments
      without fear that a call will fail because no methods are applicable.
    \end{describe}
-  \begin{describe}{ar-meth}{c-type-equal-p @<type>_1 @<type>_2}
+  \begin{describe}{ar-meth}{c-type-equal-p @<c-type>_1 @<c-type>_2}
      A default around-method for @|c-type-equal-p| is defined.  It returns
-    true if @<type>_1 and @<type>_2 are @|eql|; otherwise it delegates to the
-    primary methods.  Since several common kinds of C types are interned,
+    true if @<c-type>_1 and @<c-type>_2 are @|eql|; otherwise it delegates to
+    the primary methods.  Since several common kinds of C types are interned,
      this is a common case worth optimizing.
    \end{describe}
  \end{describe}
  
-\subsection{Outputting C types} \label{sec:proto.c-types.output}
+\subsection{Outputting C types} \label{sec:clang.c-types.output}
  
-\begin{describe}{gf}{pprint-c-type @<type> @<stream> @<kernel>}
+\begin{describe}{gf}{pprint-c-type @<c-type> @<stream> @<kernel>}
    The generic function @|pprint-c-type| pretty-prints to @<stream> a C-syntax
-  declaration of an object or function of type @<type>.  The result is
+  declaration of an object or function of type @<c-type>.  The result is
    written to @<stream>.
  
    A C declaration has two parts: a sequence of \emph{declaration specifiers}
@@ -385,7 +245,7 @@ argument lists for methods.  This is done by @|c-type-equal-p|.
    Every concrete subclass of @|c-type| is expected to provide a primary
    method on this function.  There is no default primary method.
  
-  \begin{describe}{ar-meth}{pprint-c-type @<type> @<stream> @<kernel>}
+  \begin{describe}{ar-meth}{pprint-c-type @<c-type> @<stream> @<kernel>}
      A default around method is defined on @|pprint-c-type| which `canonifies'
      non-function @<kernel> arguments.  In particular:
      \begin{itemize}
@@ -404,9 +264,8 @@ argument lists for methods.  This is done by @|c-type-equal-p|.
    specifiers.  The precise details are subject to change.
  \end{describe}
  
-\begin{describe}{mac}{%
-  maybe-in-parens (@<stream-var> @<guard-form>) \\ \push
-    @<form>^*}
+\begin{describe}{mac}
+    {maybe-in-parens (@<stream-var> @<guard-form>) @<form>^*}
    The @<guard-form> is evaluated, and then the @<form>s are evaluated in
    sequence within a pretty-printer logical block writing to the stream named
    by the symbol @<stream-var>.  If the @<guard-form> evaluates to nil, then
@@ -419,7 +278,7 @@ argument lists for methods.  This is done by @|c-type-equal-p|.
  \end{describe}
  
  \subsection{Type qualifiers and qualifiable types}
-\label{sec:proto.ctypes.qual}
+\label{sec:clang.ctypes.qual}
  
  \begin{describe}{cls}{qualifiable-c-type (c-type) \&key :qualifiers}
    The class @|qualifiable-c-type| describes C types which can bear
@@ -437,18 +296,18 @@ argument lists for methods.  This is done by @|c-type-equal-p|.
    The class @|qualifiable-c-type| is abstract.
  \end{describe}
  
-\begin{describe}{gf}{c-type-qualifiers @<type> @to @<list>}
-  Returns the qualifiers of the @|qualifiable-c-type| instance @<type> as an
-  immutable list.
+\begin{describe}{gf}{c-type-qualifiers @<c-type> @> @<list>}
+  Returns the qualifiers of the @|qualifiable-c-type| instance @<c-type> as
+  an immutable list.
  \end{describe}
  
-\begin{describe}{fun}{qualify-type @<type> @<qualifiers>}
-  The argument @<type> must be an instance of @|qualifiable-c-type|,
+\begin{describe}{fun}{qualify-type @<c-type> @<qualifiers> @> @<c-type>}
+  The argument @<c-type> must be an instance of @|qualifiable-c-type|,
    currently bearing no qualifiers, and @<qualifiers> a list of qualifier
    keywords.  The result is a C type object like @<c-type> except that it
    bears the given @<qualifiers>.
  
-  The @<type> is not modified.  If @<type> is interned, then the returned
+  The @<c-type> is not modified.  If @<c-type> is interned, then the returned
    type will be interned.
  \end{describe}
  
@@ -458,7 +317,7 @@ argument lists for methods.  This is done by @|c-type-equal-p|.
    non-null then the final character of the returned string will be a space.
  \end{describe}
  
-\subsection{Leaf types} \label{sec:proto.c-types.leaf}
+\subsection{Leaf types} \label{sec:clang.c-types.leaf}
  
  A \emph{leaf type} is a type which is not defined in terms of another type.
  In Sod, the leaf types are
@@ -522,19 +381,20 @@ In Sod, the leaf types are
    \label{tab:proto.c-types.simple}
  \end{table}
  
-\begin{describe}{fun}{make-simple-type @<name> \&optional @<qualifiers>}
+\begin{describe}{fun}
+    {make-simple-type @<name> \&optional @<qualifiers> @> @<c-type>}
    Return the (unique interned) simple C type object for the C type whose name
    is @<name> (a string) and which has the given @<qualifiers> (a list of
    keywords).
  \end{describe}
  
-\begin{describe}{gf}{c-type-name @<type>}
-  Returns the name of a @|simple-c-type| instance @<type> as an immutable
+\begin{describe}{gf}{c-type-name @<c-type> @> @<string>}
+  Returns the name of a @|simple-c-type| instance @<c-type> as an immutable
    string.
  \end{describe}
  
-\begin{describe}{mac}{%
-    define-simple-c-type @{ @<name> @! (@<name>^*) @} @<string>}
+\begin{describe}{mac}
+    {define-simple-c-type @{ @<name> @! (@<name>^*) @} @<string> @> @<name>}
    Define type specifiers for a new simple C type.  Each symbol @<name> is
    defined as a symbolic type specifier for the (unique interned) simple C
    type whose name is the value of @<string>.  Further, each @<name> is
@@ -559,13 +419,22 @@ In Sod, the leaf types are
    structs and unions.
  \end{boxy}
  
-\begin{describe}{gf}{c-tagged-type-kind @<type>}
-  Returns a symbol classifying the tagged @<type>: one of @|enum|, @|struct|
-  or @|union|.  User-defined subclasses of @|tagged-c-type| should return
-  their own classification symbols.  It is intended that @|(string-downcase
-  (c-tagged-type-kind @<type>))| be valid C syntax.\footnote{%
+\begin{describe}{gf}{c-tagged-type-kind @<c-type> @> @<keyword>}
+  Returns a keyword classifying the tagged @<c-type>: one of @|:enum|,
+  @|:struct| or @|:union|.  User-defined subclasses of @|tagged-c-type|
+  should return their own classification symbols.  It is intended that
+  @|(string-downcase (c-tagged-type-kind @<c-type>))| be valid C
+  syntax.\footnote{%
      Alas, C doesn't provide a syntactic category for these keywords;
      \Cplusplus\ calls them a @<class-key>.} %
+  There is a method defined for each of the built-in tagged type classes
+  @|c-struct-type|, @|c-union-type| and @|c-enum-type|.
+\end{describe}
+
+\begin{describe}{gf}{kind-c-tagged-type @<keyword> @> @<symbol>}
+  This is not quite the inverse of @|c-tagged-type-kind|.  Given a keyword
+  naming a kind of tagged type, return the name of the corresponding C
+  type class as a symbol.
  \end{describe}
  
  \begin{describe}{cls}{c-enum-type (tagged-c-type) \&key :qualifiers :tag}
@@ -576,7 +445,8 @@ In Sod, the leaf types are
    interned) enumerated type with the given @<tag> and @<qualifier>s (all
    evaluated).
  \end{describe}
-\begin{describe}{fun}{make-enum-type @<tag> \&optional @<qualifiers>}
+\begin{describe}{fun}
+    {make-enum-type @<tag> \&optional @<qualifiers> @> @<c-enum-type>}
    Return the (unique interned) C type object for the enumerated C type whose
    tag is @<tag> (a string) and which has the given @<qualifiers> (a list of
    keywords).
@@ -590,7 +460,8 @@ In Sod, the leaf types are
    interned) structured type with the given @<tag> and @<qualifier>s (all
    evaluated).
  \end{describe}
-\begin{describe}{fun}{make-struct-type @<tag> \&optional @<qualifiers>}
+\begin{describe}{fun}
+    {make-struct-type @<tag> \&optional @<qualifiers> @> @<c-struct-type>}
    Return the (unique interned) C type object for the structured C type whose
    tag is @<tag> (a string) and which has the given @<qualifiers> (a list of
    keywords).
@@ -605,24 +476,31 @@ In Sod, the leaf types are
    interned) union type with the given @<tag> and @<qualifier>s (all
    evaluated).
  \end{describe}
-\begin{describe}{fun}{make-union-type @<tag> \&optional @<qualifiers>}
+\begin{describe}{fun}
+    {make-union-type @<tag> \&optional @<qualifiers> @> @<c-union-type>}
    Return the (unique interned) C type object for the union C type whose tag
    is @<tag> (a string) and which has the given @<qualifiers> (a list of
    keywords).
  \end{describe}
  
-\subsection{Pointers and arrays} \label{sec:proto.c-types.ptr-array}
+\subsection{Compound C types} \label{sec:code.c-types.compound}
+
+Some C types are \emph{compound types}: they're defined in terms of existing
+types.  The classes which represent compound types implement a common
+protocol.
  
-Pointers and arrays are \emph{compound types}: they're defined in terms of
-existing types.  A pointer describes the type of objects it points to; an
-array describes the type of array element.
-\begin{describe}{gf}{c-type-subtype @<type>}
-  Returns the underlying type of a compound type @<type>.  Precisely what
-  this means depends on the class of @<type>.
+\begin{describe}{gf}{c-type-subtype @<c-type> @> @<subtype>}
+  Returns the underlying type of a compound type @<c-type>.  Precisely what
+  this means depends on the class of @<c-type>.
  \end{describe}
  
-\begin{describe}{cls}{c-pointer-type (qualifiable-c-type)
-    \&key :qualifiers :subtype}
+\subsection{Pointer types} \label{sec:clang.c-types.pointer}
+
+Pointers compound types.  The subtype of a pointer type is the type it points
+to.
+
+\begin{describe}{cls}
+    {c-pointer-type (qualifiable-c-type) \&key :qualifiers :subtype}
    Represents a C pointer type.  An instance denotes the C type @<subtype>
    @|*|@<qualifiers>.
  
@@ -639,12 +517,20 @@ array describes the type of array element.
    characters; the symbol @|const-string| is a type specifier for the type
    pointer to constant characters.
  \end{describe}
-\begin{describe}{fun}{make-pointer-type @<subtype> \&optional @<qualifiers>}
+
+\begin{describe}{fun}
+    {make-pointer-type @<c-type> \&optional @<qualifiers>
+      @> @<c-pointer-type>}
    Return an object describing the type of qualified pointers to @<subtype>.
    If @<subtype> is interned, then the returned pointer type object is
    interned also.
  \end{describe}
  
+\subsection{Array types} \label{sec:clang.c-types.array}
+
+Arrays implement the compound-type protocol.  The subtype of an array is the
+array element type.
+
  \begin{describe}{cls}{c-array-type (c-type) \&key :subtype :dimensions}
    Represents a multidimensional C array type.  The @<dimensions> are a list
    of dimension specifiers $d_0$, $d_1$, \ldots, $d_{n-1}$; an instance then
@@ -668,23 +554,66 @@ array describes the type of array element.
    single-dimensional array with unspecified extent.  The synonyms @|array|
    and @|vector| may be used in place of the brackets @`[]'.
  \end{describe}
-\begin{describe}{fun}{make-array-type @<subtype> @<dimensions>}
+
+\begin{describe}{fun}
+    {make-array-type @<subtype> @<dimensions> @> @<c-array-type>}
    Return an object describing the type of arrays with given @<dimensions> and
    with element type @<subtype> (an instance of @|c-type|).  The @<dimensions>
    argument is a list whose elements are strings or nil; see the description
    of the class @|c-array-type| above for details.
  \end{describe}
-\begin{describe}{gf}{c-array-dimensions @<type>}
-  Returns the dimensions of @<type>, an array type, as an immutable list.
+
+\begin{describe}{gf}{c-array-dimensions @<c-type> @> @<list>}
+  Returns the dimensions of @<c-type>, an array type, as an immutable list.
+\end{describe}
+
+\subsection{Function types} \label{sec:clang.c-types.fun}
+
+\begin{describe}{cls}{argument}
+\end{describe}
+
+\begin{describe}{fun}{argumentp @<value> @> @<generalized-boolean>}
+\end{describe}
+
+\begin{describe}{fun}{make-argument @<name> @<c-type> @> @<argument>}
+\end{describe}
+
+\begin{describe}{fun}{argument-name @<argument> @> @<string>}
+\end{describe}
+
+\begin{describe}{fun}{argument-type @<argument> @> @<c-type>}
  \end{describe}
  
-\subsection{Function types} \label{sec:proto.c-types.fun}
+\begin{describe}{fun}
+    {commentify-argument-name @<name> @> @<commentified-name>}
+\end{describe}
  
  \begin{describe}{cls}{c-function-type (c-type) \&key :subtype :arguments}
    Represents C function types.  An instance denotes the C type of a C
-  function which 
+  function which FIXME
+\end{describe}
+
+\begin{describe}{fun}
+    {c-function-arguments @<c-function-type> @> @<arguments>}
+\end{describe}
+
+\begin{describe}{fun}
+    {make-c-type @<c-type> @<arguments> @> @<c-function-type>}
+\end{describe}
+
+\begin{describe}{fun}
+    {commentify-argument-names @<arguments> @> @<commentified-arguments>}
+\end{describe}
+
+\begin{describe}{fun}
+    {commentify-function-type @<c-type> @> @<commentified-c-type>}
  \end{describe}
  
+\subsection{Parsing C types} \label{sec:clang.c-types.parsing}
+
+%%%--------------------------------------------------------------------------
+\section{Generating C code} \label{sec:clang.codegen}
+
  %%%----- That's all, folks --------------------------------------------------
  
  %%% Local variables:
diff --git a/doc/concepts.tex b/doc/concepts.tex

new file mode 100644 (file)

index 0000000..d554b51
--- /dev/null
+++ b/doc/concepts.tex
@@ -0,0 +1,42 @@
+%%% -*-latex-*-
+%%%
+%%% Conceptual background
+%%%
+%%% (c) 2015 Straylight/Edgeware
+%%%
+
+%%%----- Licensing notice ---------------------------------------------------
+%%%
+%%% This file is part of the Sensble Object Design, an object system for C.
+%%%
+%%% SOD is free software; you can redistribute it and/or modify
+%%% it under the terms of the GNU General Public License as published by
+%%% the Free Software Foundation; either version 2 of the License, or
+%%% (at your option) any later version.
+%%%
+%%% SOD is distributed in the hope that it will be useful,
+%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
+%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+%%% GNU General Public License for more details.
+%%%
+%%% You should have received a copy of the GNU General Public License
+%%% along with SOD; if not, write to the Free Software Foundation,
+%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+\chapter{Concepts}
+
+\section{Classes and slots}
+
+\section{Messages and methods}
+
+\section{Metaclasses}
+
+\section{Modules}
+
+%%%----- That's all, folks --------------------------------------------------
+
+%%% Local variables:
+%%% mode: LaTeX
+%%% TeX-master: "sod.tex"
+%%% TeX-PDF-mode: t
+%%% End:
diff --git a/doc/cutting-room-floor.tex b/doc/cutting-room-floor.tex

new file mode 100644 (file)

index 0000000..c8f241b
--- /dev/null
+++ b/doc/cutting-room-floor.tex
@@ -0,0 +1,489 @@
+%%% -*-latex-*-
+%%%
+%%% Conceptual background
+%%%
+%%% (c) 2015 Straylight/Edgeware
+%%%
+
+%%%----- Licensing notice ---------------------------------------------------
+%%%
+%%% This file is part of the Sensble Object Design, an object system for C.
+%%%
+%%% SOD is free software; you can redistribute it and/or modify
+%%% it under the terms of the GNU General Public License as published by
+%%% the Free Software Foundation; either version 2 of the License, or
+%%% (at your option) any later version.
+%%%
+%%% SOD is distributed in the hope that it will be useful,
+%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
+%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+%%% GNU General Public License for more details.
+%%%
+%%% You should have received a copy of the GNU General Public License
+%%% along with SOD; if not, write to the Free Software Foundation,
+%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+\chapter{Cutting-room floor}
+
+%%%--------------------------------------------------------------------------
+\section{Generated names}
+
+The generated names for functions and objects related to a class are
+constructed systematically so as not to interfere with each other.  The rules
+on class, slot and message naming exist so as to ensure that the generated
+names don't collide with each other.
+
+The following notation is used in this section.
+\begin{description}
+\item[@<class>] The full name of the `focus' class: the one for which we are
+  generating name.
+\item[@<super-nick>] The nickname of a superclass.
+\item[@<head-nick>] The nickname of the chain-head class of the chain
+  in question.
+\end{description}
+
+\subsection{Instance layout}
+
+%%%--------------------------------------------------------------------------
+\section{Class objects}
+
+\begin{listing}
+typedef struct SodClass__ichain_obj SodClass;
+
+struct sod_chain {
+  size_t n_classes;                     /* Number of classes in chain */
+  const SodClass *const *classes;       /* Vector of classes, head first */
+  size_t off_ichain;                    /* Offset of ichain from instance base */
+  const struct sod_vtable *vt;          /* Vtable pointer for chain */
+  size_t ichainsz;                      /* Size of the ichain structure */
+};
+
+struct sod_vtable {
+  SodClass *_class;                     /* Pointer to instance's class */
+  size_t _base;                         /* Offset to instance base */
+};
+
+struct SodClass__islots {
+
+  /* Basic information */
+  const char *name;                     /* The class's name as a string */
+  const char *nick;                     /* The nickname as a string */
+
+  /* Instance allocation and initialization */
+  size_t instsz;                        /* Instance layout size in bytes */
+  void *(*imprint)(void *);             /* Stamp instance with vtable ptrs */
+  void *(*init)(void *);                /* Initialize instance */
+
+  /* Superclass structure */
+  size_t n_supers;                      /* Number of direct superclasses */
+  const SodClass *const *supers;        /* Vector of direct superclasses */
+  size_t n_cpl;                         /* Length of class precedence list */
+  const SodClass *const *cpl;           /* Vector for class precedence list */
+
+  /* Chain structure */
+  const SodClass *link;                 /* Link to next class in chain */
+  const SodClass *head;                 /* Pointer to head of chain */
+  size_t level;                         /* Index of class in its chain */
+  size_t n_chains;                      /* Number of superclass chains */
+  const sod_chain *chains;              /* Vector of chain structures */
+
+  /* Layout */
+  size_t off_islots;                    /* Offset of islots from ichain base */
+  size_t islotsz;                       /* Size of instance slots */
+};
+
+struct SodClass__ichain_obj {
+  const SodClass__vt_obj *_vt;
+  struct SodClass__islots cls;
+};
+
+struct sod_instance {
+  struct sod_vtable *_vt;
+};
+\end{listing}
+
+\begin{listing}
+void *sod_convert(const SodClass *cls, const void *obj)
+{
+  const struct sod_instance *inst = obj;
+  const SodClass *real = inst->_vt->_cls;
+  const struct sod_chain *chain;
+  size_t i, index;
+
+  for (i = 0; i < real->cls.n_chains; i++) {
+    chain = &real->cls.chains[i];
+    if (chain->classes[0] == cls->cls.head) {
+      index = cls->cls.index;
+      if (index < chain->n_classes && chain->classes[index] == cls)
+        return ((char *)cls - inst->_vt._base + chain->off_ichain);
+      else
+        return (0);
+    }
+  }
+  return (0);
+}
+\end{listing}
+
+%%%--------------------------------------------------------------------------
+\section{Classes}
+\label{sec:class}
+
+\subsection{Classes and superclasses} \label{sec:class.defs}
+
+A @<full-class-definition> must list one or more existing classes to be the
+\emph{direct superclasses} for the new class being defined.  We make the
+following definitions.
+\begin{itemize}
+\item The \emph{superclasses} of a class consist of the class itself together
+  with the superclasses of its direct superclasses.
+\item The \emph{proper superclasses} of a class are its superclasses other
+  than itself.
+\item If $C$ is a (proper) superclass of $D$ then $D$ is a (\emph{proper})
+  \emph{subclass} of $C$.
+\end{itemize}
+The predefined class @|SodObject| has no direct superclasses; it is unique in
+this respect.  All classes are subclasses of @|SodObject|.
+
+\subsection{The class precedence list} \label{sec:class.cpl}
+
+Let $C$ be a class.  The superclasses of $C$ form a directed graph, with an
+edge from each class to each of its direct superclasses.  This is the
+\emph{superclass graph of $C$}.
+
+In order to resolve inheritance of items, we define a \emph{class precedence
+  list} (or CPL) for each class, which imposes a total order on that class's
+superclasses.  The default algorithm for computing the CPL is the \emph{C3}
+algorithm \cite{fixme-c3}, though extensions may implement other algorithms.
+
+The default algorithm works as follows.  Let $C$ be the class whose CPL we
+are to compute.  Let $X$ and $Y$ be two of $C$'s superclasses.
+\begin{itemize}
+\item $C$ must appear first in the CPL.
+\item If $X$ appears before $Y$ in the CPL of one of $C$'s direct
+  superclasses, then $X$ appears before $Y$ in the $C$'s CPL.
+\item If the above rules don't suffice to order $X$ and $Y$, then whichever
+  of $X$ and $Y$ has a subclass which appears further left in the list of
+  $C$'s direct superclasses will appear earlier in the CPL.
+\end{itemize}
+This last rule is sufficient to disambiguate because if both $X$ and $Y$ are
+superclasses of the same direct superclass of $C$ then that direct
+superclass's CPL will order $X$ and $Y$.
+
+We say that \emph{$X$ is more specific than $Y$ as a superclass of $C$} if
+$X$ is earlier than $Y$ in $C$'s class precedence list.  If $C$ is clear from
+context then we omit it, saying simply that $X$ is more specific than $Y$.
+
+\subsection{Instances and metaclasses} \label{sec:class.meta}
+
+A class defines the structure and behaviour of its \emph{instances}: run-time
+objects created (possibly) dynamically.  An instance is an instance of only
+one class, though structurally it may be used in place of an instance of any
+of that class's superclasses.  It is possible, with care, to change the class
+of an instance at run-time.
+
+Classes are themselves represented as instances -- called \emph{class
+  objects} -- in the running program.  Being instances, they have a class,
+called the \emph{metaclass}.  The metaclass defines the structure and
+behaviour of the class object.
+
+The predefined class @|SodClass| is the default metaclass for new classes.
+@|SodClass| has @|SodObject| as its only direct superclass.  @|SodClass| is
+its own metaclass.
+
+To make matters more complicated, Sod has \emph{two} distinct metalevels: as
+well as the runtime metalevel, as discussed above, there's a compile-time
+metalevel hosted in the Sod translator.  Since Sod is written in Common Lisp,
+a Sod class's compile-time metaclass is a CLOS class.  The usual compile-time
+metaclass is @|sod-class|.  The compile-time metalevel is the subject of
+\xref{ch:api}.
+
+\subsection{Items and inheritance} \label{sec:class.inherit}
+
+A class definition also declares \emph{slots}, \emph{messages},
+\emph{initializers} and \emph{methods} -- collectively referred to as
+\emph{items}.  In addition to the items declared in the class definition --
+the class's \emph{direct items} -- a class also \emph{inherits} items from
+its superclasses.
+
+The precise rules for item inheritance vary according to the kinds of items
+involved.
+
+Some object systems have a notion of `repeated inheritance': if there are
+multiple paths in the superclass graph from a class to one of its
+superclasses then items defined in that superclass may appear duplicated in
+the subclass.  Sod does not have this notion.
+
+\subsubsection{Slots} \label{sec:class.inherit.slots}
+A \emph{slot} is a unit of state.  In other object systems, slots may be
+called `fields', `member variables', or `instance variables'.
+
+A slot has a \emph{name} and a \emph{type}.  The name serves only to
+distinguish the slot from other direct slots defined by the same class.  A
+class inherits all of its proper superclasses' slots.  Slots inherited from
+superclasses do not conflict with each other or with direct slots, even if
+they have the same names.
+
+At run-time, each instance of the class holds a separate value for each slot,
+whether direct or inherited.  Changing the value of an instance's slot
+doesn't affect other instances.
+
+\subsubsection{Initializers} \label{sec:class.inherit.init}
+Mumble.
+
+\subsubsection{Messages} \label{sec:class.inherit.messages}
+A \emph{message} is the stimulus for behaviour.  In Sod, a class must define,
+statically, the name and format of the messages it is able to receive and the
+values it will return in reply.  In this respect, a message is similar to
+`abstract member functions' or `interface member functions' in other object
+systems.
+
+Like slots, a message has a \emph{name} and a \emph{type}.  Again, the name
+serves only to distinguish the message from other direct messages defined by
+the same class.  Messages inherited from superclasses do not conflict with
+each other or with direct messages, even if they have the same name.
+
+At run-time, one sends a message to an instance by invoking a function
+obtained from the instance's \emph{vtable}: \xref{sec:fixme-vtable}.
+
+\subsubsection{Methods} \label{sec:class.inherit.methods}
+A \emph{method} is a unit of behaviour.  In other object systems, methods may
+be called `member functions'.
+
+A method is associated with a message.  When a message is received by an
+instance, all of the methods associated with that message on the instance's
+class or any of its superclasses are \emph{applicable}.  The details of how
+the applicable methods are invoked are described fully in
+\xref{sec:fixme-method-combination}.
+
+\subsection{Chains and instance layout} \label{sec:class.layout}
+
+C is a rather low-level language, and in particular it exposes details of the
+way data is laid out in memory.  Since an instance of a class~$C$ should be
+(at least in principle) usable anywhere an instance of some superclass $B
+\succeq C$ is expected, this implies that an instance of the subclass $C$
+needs to contain within it a complete instance of each superclass $B$, laid
+out according to the rules of instances of $B$, so that if we have (the
+address of) an instance of $C$, we can easily construct a pointer to a thing
+which looks like an instance of $B$ contained within it.
+
+Specifically, the information we need to retain for an instance of a
+class~$C$ is:
+\begin{itemize}
+\item the values of each of the slots defined by $C$, including those defined
+  by superclasses;
+\item information which will let us convert a pointer to $C$ into a pointer
+  to any superclass $B \succeq C$;
+\item information which will let us call the appropriate effective method for
+  each message defined by $C$, including those defined by superclasses; and
+\item some additional meta-level information, such as how to find the class
+  object for $C$ given (the address of) one of its instances.
+\end{itemize}
+
+Observe that, while each distinct instance must clearly have its own storage
+for slots, all instances of $C$ can share a single copy of the remaining
+information.  The individual instance only needs to keep a pointer to this
+shared table, which, inspired by the similar structure in many \Cplusplus\
+ABIs, are called a \emph{vtable}.
+
+The easiest approach would be to decide that instances of $C$ are exactly
+like instances of $B$, only with extra space at the end for the extra slots
+which $C$ defines over and above those already existing in $B$.  Conversion
+is then trivial: a pointer to an instance of $C$ can be converted to a
+pointer to an instance of some superclass $B$ simply by casting.  Even though
+the root class @|SodObject| doesn't have any slots at all, its instances will
+still need a vtable so that you can find its class object: the address of the
+vtable therefore needs to be at the very start of the instance structure.
+Again, a vtable for a superclass would have a vtable for each of its
+superclasses as a prefix, with new items added afterwards.
+
+This appealing approach works well for an object system which only permits
+single inheritance of both state and behaviour.  Alas, it breaks down when
+multiple inheritance is allowed: $C$ can be a subclass of both $B$ and $B'$,
+even though $B$ is not a subclass of $B'$, nor \emph{vice versa}; so, in
+general, $B$'s instance structure will not be a prefix of $B'$'s, nor will
+$B'$'s be a prefix of $B$'s, and therefore $C$ cannot have both $B$ and $B'$
+as a prefix.
+
+A (non-root) class may -- though need not -- have a distinguished \emph{link}
+superclass, which need not be a direct superclass.  Furthermore, each
+class~$C$ must satisfy the \emph{chain condition}: for any superclass $A$ of
+$C$, there can be at most one other superclass of $C$ whose link superclass
+is $A$.\footnote{%
+  That is, it's permitted for two classes $B$ and $B'$ to have the same link
+  superclass $A$, but $B$ and $B'$ can't then both be superclasses of the
+  same class $C$.} %
+Therefore, the links partition the superclasses of~$C$ into nice linear
+\emph{chains}, such that each superclass is a member of exactly one chain.
+If a class~$B$ has a link superclass~$A$, then $B$'s \emph{level} is one more
+than that of $A$; otherwise $B$ is called a \emph{chain head} and its level
+is zero.  If the classes in a chain are written in a list, chain head first,
+then the level of each class gives its index in the list.
+
+Chains therefore allow us to recover some of the linearity properties which
+made layout simple in the case of single inheritance.  The instance structure
+for a class $C$ contains a substructure for each of $C$'s superclass chains;
+a pointer to an object of class $C$ actually points to the substructure for
+the chain containing $C$.  The order of these substructures is unimportant
+for now.\footnote{%
+  The chains appear in the order in which their most specific classes appear
+  in $C$'s class precedence list.  This guarantees that the chain containing
+  $C$ itself appears first, so that a pointer to $C$'s instance structure is
+  actually a pointer to $C$'s chain substructure.  Apart from that, it's a
+  simple, stable, but basically arbitrary choice which can't be changed
+  without breaking the ABI.} %
+The substructure for each chain begins with a pointer to a vtable, followed
+by a structure for each superclass in the chain containing the slots defined
+by that superclass, with the chain head (least specific class) first.
+
+Suppose we have a pointer to (static) type $C$, and want to convert it into a
+pointer to some superclass $B$ of $C$ -- an \emph{upcast}.\footnote{%
+  In the more general case, we have a pointer to static type $C$, which
+  actually points to an object of some subclass $D$ of $C$, and want to
+  convert it into a pointer to type $B$.  Such a conversion is called a
+  \emph{downcast} if $B$ is a subclass of $C$, or a \emph{cross-cast}
+  otherwise.  Downcasts and cross-casts require complicated run-time
+  checking, and can will fail unless $B$ is a superclass of $D$.} %
+If $B$ is in the same chain as $C$ -- an \emph{in-chain upcast} -- then the
+pointer value is already correct and it's only necessary to cast it
+appropriately.  Otherwise -- a \emph{cross-chain upcast} -- the pointer needs
+to be adjusted to point to a different chain substructure.  Since the lengths
+and relative positions of the chain substructures vary between classes, the
+adjustments are stored in the vtable.  Cross-chain upcasts are therefore a
+bit slower than in-chain upcasts.
+
+Each chain has its own separate vtable, because much of the metadata stored
+in the vtable is specific to a particular chain.  For example:
+\begin{itemize}
+\item offsets to other chains' substructures will vary depending on which
+  chain we start from; and
+\item entry points to methods 
+\end{itemize}
+%%%--------------------------------------------------------------------------
+\section{Superclass linearization}
+
+Before making any decisions about relationships between superclasses, Sod
+\emph{linearizes} them, i.e., imposes a total order consistent with the
+direct-subclass/superclass partial order.
+
+In the vague hope that we don't be completely bogged down in formalism by the
+end of this, let's introduce some notation.  We'll fix some class $z$ and
+consider its set of superclasses $S(z) = \{ a, b, \dots \}$.  We can define a
+relation $c \prec_1 d$ if $c$ is a direct subclass of $d$, and extend it by
+taking the reflexive, transitive closure: $c \preceq d$ if and only if
+\begin{itemize}
+\item $c = d$, or
+\item there exists some class $x$ such that $c \prec_1 x$ and $x \preceq d$.
+\end{itemize}
+This is the `is-subclass-of' relation we've been using so far.\footnote{%
+  In some object systems, notably Flavors, this relation is allowed to fail
+  to be a partial order because of cycles in the class graph.  I haven't
+  given a great deal of thought to how well Sod would cope with a cyclic
+  class graph.} %
+We write $d \succeq c$ and say that $d$ is a superclass of $c$ if and only if
+$c \preceq d$.
+
+The problem comes when we try to resolve inheritance questions.  A class
+should inherit behaviour from its superclasses; but, in a world of multiple
+inheritance, which one do we choose?  We get a simple version of this problem
+when we try to resolve inheritance of slot initializers: only one initializer
+can be inherited.
+
+We start by collecting into a set~$I$ the classes which define an initializer
+for the slot.  If $I$ contains both a class $x$ and one of $x$'s superclasses
+then we should prefer $x$ and consider the superclass to be overridden.  So
+we should confine our attention to \emph{least} classes: a member $x$ of a
+set $I$ is least, with respect to a particular partial order, if $y \preceq
+x$ only when $x = y$.  If there is a single least class in our set the we
+have a winner.  Otherwise we want some way to choose among them.
+
+This is not uncontroversial.  Languages such as \Cplusplus\ refuse to choose
+among least classes; instead, any program in which such a choice must be made
+is simply declared erroneous.
+
+Simply throwing up our hands in horror at this situation is satisfactory when
+we only wanted to pick one `winner', as we do for slot initializers.
+However, method combination is a much more complicated business.  We don't
+want to pick just one winner: we want to order all of the applicable methods
+in some way.  Insisting that there is a clear winner at every step along the
+chain is too much of an imposition.  Instead, we \emph{linearize} the
+classes.
+
+%%%--------------------------------------------------------------------------
+\section{Invariance, covariance, contravariance}
+
+In Sod, at least with regard to the existing method combinations, method
+types are \emph{invariant}.  This is not an accident, and it's not due to
+ignorance.
+
+The \emph{signature} of a function, method or message describes its argument
+and return-value types.  If a method's arguments are an integer and a string,
+and it returns a character, we might write its signature as
+\[ (@|int|, @|string|) \to @|char| \]
+In Sod, a method's arguments have to match its message's arguments precisely,
+and the return type must either be @|void| -- for a dæmon method -- or again
+match the message's return type.  This is argument and return-type
+\emph{invariance}.
+
+Some object systems allow methods with subtly different signatures to be
+defined on a single message.  In particular, since the idea is that instances
+of a subclass ought to be broadly compatible~(see \xref{sec:phil.lsp}) with
+existing code which expects instances of a superclass, we might be able to
+get away with bending method signatures one way or another to permit this.
+
+\Cplusplus\ permits \emph{return-type covariance}, where a method's return
+type can be a subclass of the return type specified by a less-specific
+method.  Eiffel allows \emph{argument covariance}, where a method's arguments
+can be subclasses of the arguments specified by a less-specific
+method.\footnote{%
+  Attentive readers will note that I ought to be talking about pointers to
+  instances throughout.  I'm trying to limit the weight of the notation.
+  Besides, I prefer data models as found in Lisp and Python where all values
+  are held by reference.} %
+
+Eiffel's argument covariance is unsafe.\footnote{%
+  Argument covariance is correct if you're doing runtime dispatch based on
+  argument types.  Eiffel isn't: it's single dispatch, like Sod is.} %
+Suppose that we have two pairs of classes, $a \prec_1 b$ and $c \prec_1 d$.
+Class $b$ defines a message $m$ with signature $d \to @|int|$; class $a$
+defines a method with signature $c \to @|int|$.  This means that it's wrong
+to send $m$ to an instance $a$ carrying an argument of type $d$.  But of
+course, we can treat an instance of $a$ as if it's an instance of $b$,
+whereupon it appears that we are permitted to pass a~$c$ in our message.  The
+result is a well-known hole in the type system.  Oops.
+
+\Cplusplus's return-type covariance is fine.  Also fine is argument
+\emph{contravariance}.  If $b$ defined its message to have signature $c \to
+@|int|$, and $a$ were to broaden its method to $d \to @|int|$, there'd be no
+problem.  All $c$s are $d$s, so viewing an $a$ as a $b$ does no harm.
+
+All of this fiddling with types is fine as long as method inheritance or
+overriding is an all-or-nothing thing.  But Sod has method combinations,
+where applicable methods are taken from the instance's class and all its
+superclasses and combined.  And this makes everything very messy.
+
+It's possible to sort all of the mess out in the generated effective method
+-- we'd just have to convert the arguments to the types that were expected by
+the direct methods.  This would require expensive run-time conversions of all
+of the non-invariant arguments and return values.  And we'd need some
+complicated rule so that we could choose sensible types for the method
+entries in our vtables.  Something like this:
+\begin{quote} \itshape
+  For each named argument of a message, there must be a unique greatest type
+  among the types given for that argument by the applicable methods; and
+  there must be a unique least type among all of the return types of the
+  applicable methods.
+\end{quote}
+I have visions of people wanting to write special no-effect methods whose
+only purpose is to guide the translator around the class graph properly.
+Let's not.
+
+%% things to talk about:
+%% Liskov substitution principle and why it's mad
+
+%%%----- That's all, folks --------------------------------------------------
+
+%%% Local variables:
+%%% mode: LaTeX
+%%% TeX-master: "sod.tex"
+%%% TeX-PDF-mode: t
+%%% End:
diff --git a/doc/lispintro.tex b/doc/lispintro.tex

new file mode 100644 (file)

index 0000000..5f1043f
--- /dev/null
+++ b/doc/lispintro.tex
@@ -0,0 +1,207 @@
+%%% -*-latex-*-
+%%%
+%%% Description of the internal class structure and protocol
+%%%
+%%% (c) 2009 Straylight/Edgeware
+%%%
+
+%%%----- Licensing notice ---------------------------------------------------
+%%%
+%%% This file is part of the Simple Object Definition system.
+%%%
+%%% SOD is free software; you can redistribute it and/or modify
+%%% it under the terms of the GNU General Public License as published by
+%%% the Free Software Foundation; either version 2 of the License, or
+%%% (at your option) any later version.
+%%%
+%%% SOD is distributed in the hope that it will be useful,
+%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
+%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+%%% GNU General Public License for more details.
+%%%
+%%% You should have received a copy of the GNU General Public License
+%%% along with SOD; if not, write to the Free Software Foundation,
+%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+\chapter{Protocol overview} \label{ch:proto}
+
+This chapter provides an overview of the Sod translator's internal object
+model.  It describes most of the important classes and generic functions, how
+they are used to build a model of a Sod module and produce output code, and
+how an extension might modify the translator's behaviour.
+
+I assume familiarity with the Common Lisp Object System (CLOS).  Familiarity
+with the CLOS Metaobject Protocol isn't necessary but may be instructive.
+
+%%%--------------------------------------------------------------------------
+\section{A tour through the translator}
+
+At the very highest level, the Sod translator works in two phases: it
+\emph{parses} source files into an internal representation, and then it
+\emph{generates} output files from the internal representation.
+
+The function @|read-module| is given a pathname for a file: it opens the
+file, parses the program text, and returns a @|module| instance describing
+the classes and other items found.  Parsing has a number of extension points
+which allow extensions to add their own module syntax.  Properties can be
+attached to modules and the items defined within them, which select which
+internal classes are used to represent them, and possibly provide additional
+parameters to them.
+
+Modules contain a variety of objects, but the most important objects are
+classes, which are associated with a menagerie of other objects representing
+the slots, messages, methods and so on defined in the module.  These various
+objects engage in a (fairly complicated) protocol to construct another
+collection of \emph{layout objects} describing the low-level data structures
+and tables which need to be creates.
+
+At the far end, the main output function is @|output-module|, which is given
+a module, an output stream and a \emph{reason}, which describes what kind of
+output is wanted.  The module items and the layout objects then engage in
+another protocol to work out what output needs to be produced, and which
+order it needs to be written in.
+
+%%%--------------------------------------------------------------------------
+\section{Specification conventions} \label{sec:proto.conventions}
+
+Throughout this specification, the phrase `it is an error' indicates that a
+particular circumstance is erroneous and results in unspecified and possibly
+incorrect behaviour.  In particular, the situation need not be immediately
+diagnosed, and the consequences may be far-reaching.
+
+The following conventions apply throughout this specification.
+
+\begin{itemize}
+
+\item If a specification describes an argument as having a particular type or
+  syntax, then it is an error to provide an argument not having that
+  particular type or syntax.
+
+\item If a specification describes a function then that function might be
+  implemented as a generic function; it is an error to attempt to (re)define
+  it as a generic function, or to attempt to add methods to it.  A function
+  specified as being a generic function will certainly be so; if user methods
+  are permitted on the generic function then this will be specified.
+
+\item Where a class precedence list is specified, either explicitly or
+  implicitly by a class hierarchy, the implementation may include additional
+  superclasses not specified here.  Such additional superclasses will not
+  affect the order of specified classes in the class precedence lists either
+  of specified classes themselves or of user-defined subclasses of specified
+  classes.
+
+\item Unless otherwise specified, generic functions use the standard method
+  combination.
+
+\item The specifications for methods are frequently brief; they should be
+  read in conjunction with and in the context of the specification for the
+  generic function and specializing classes, if any.
+
+\item An object $o$ is a \emph{direct instance} of a class $c$ if @|(eq
+  (class-of $o$) $c$)|; $o$ is an \emph{instance} of $c$ if it is a direct
+  instance of any subclass of $c$.
+
+\item If a class is specified as being \emph{abstract} then it is an error to
+  construct direct instances of it, e.g., using @|make-instance|.
+
+\item If an object is specified as being \emph{immutable} then it is an error
+  to mutate it, e.g., using @|(setf (slot-value \ldots) \ldots)|.  Programs
+  may rely on immutable objects retaining their state.
+
+\item A value is \emph{fresh} if it is guaranteed to be not @|eql| to any
+  previously existing value.  A list is \emph{fresh} if it is guaranteed that
+  none of the cons cells in its main cdr chain (i.e., the list head, its cdr,
+  and so on) are @|eql| to any previously existing value.
+
+\item Unless otherwise specified, it is an error to mutate any part of value
+  passed as an argument to, or a non-fresh part of a value returned by, a
+  function specified in this document.
+
+\item Unless otherwise specified, it is an error to change the class of an
+  instance of any class described here; and it is an error to change the
+  class of an object to a class described here.
+
+\end{itemize}
+
+\subsection{Format of the entries} \label{sec:proto.conventions.format}
+
+Most symbols defined by the protocol have their own entries.  An entry begins
+with a header line, showing a synopsis of the symbol on the left, and the
+category (function, class, macro, etc.) on the right.
+
+\begin{describe}{fun}{example-function @<required>
+    \&optional @<optional>
+    \&rest @<rest>
+    \&key :keyword
+    @> @<result>}
+  The synopsis for a function, generic function or method describes the
+  function's lambda-list using the usual syntax.  Note that keyword arguments
+  are shown by naming their keywords; in the description, the value passed
+  for the keyword argument @|:keyword| is shown as @<keyword>.
+
+  If no results are shown, then the return values (if any) are not
+  specified.  Functions may return more than one result, e.g.,
+  \begin{quote} \sffamily
+    floor @<dividend> \&optional (@<divisor> 1) @> @<quotient> @<remainder>
+  \end{quote}
+  or possibly multiple results, e.g.,
+  \begin{quote} \sffamily
+    values \&rest @<values> @> @<value>^*
+  \end{quote}
+
+  For a method, specializers are shown using the usual @|defmethod| syntax,
+  e.g.,
+  \begin{quote} \sffamily
+    some-generic-function ((@<specialized> list) @<unspecialized>)
+      @> @<result>
+  \end{quote}
+\end{describe}
+
+\begin{describe}{mac}{example-macro
+  ( @{ @<symbol> @! (@<symbol> @<form>) @}^* ) \\ \ind
+    @[[ @<declaration>^* @! @<documentation-string> @]] \\
+    @<body-form>^*
+      \nlret @<value>^*}
+  The synopsis for a macro describes the acceptable syntax using the
+  following notation.
+  \begin{itemize}
+  \item Literal symbols, e.g., keywords and parenthesis, are shown in
+    @|sans|.
+  \item Metasyntactic variables are shown in (roman) @<italics>.
+  \item Items are grouped together by braces `@{ $\dots$ @}'.  The notation
+    `@{ $\dots$ @}^*' indicates that the enclosed items may be repeated zero
+    or more times; `@{ $\dots$ @}^+' indicates that the enclosed items may be
+    repeated one or more times.  This notation may be applied to a single
+    item without the braces.
+  \item Optional items are shown enclosed in brackets `@[ $\dots$ @]'.
+  \item Alternatives are separated by vertical bars `@!'; the vertical bar
+    has low precedence, so alternatives extend as far as possible between
+    bars and up to the enclosing brackets if any.
+  \item A sequence of alternatives enclosed in double-brackets `@[[ $\ldots$
+    @]]' indicates that the alternatives may occur in any order, but each may
+    appear at most once unless marked by a star.
+  \item The notation for results is the same as for functions.
+  \end{itemize}
+  For example, the notation at the head of this example describes syntax
+  for @|let|.
+\end{describe}
+
+\begin{describe}{cls}{example-class (direct-super other-direct-super) \&key
+    :initarg}
+  The synopsis for a class lists the class's direct superclasses, and the
+  acceptable initargs in the form of a lambda-list.  The initargs may be
+  passed to @|make-instance| when constructing an instance of the class or a
+  subclass of it.  If instances of the class may be reinitialized, or if
+  objects can be changed to be instances of the class, then these initargs
+  may also be passed to @|reinitialize-instance| and/or @|change-class| as
+  applicable; the class description will state explicitly when these
+  operations are allowed.
+\end{describe}
+
+%%%----- That's all, folks --------------------------------------------------
+
+%%% Local variables:
+%%% mode: LaTeX
+%%% TeX-master: "sod.tex"
+%%% TeX-PDF-mode: t
+%%% End:
diff --git a/doc/output.tex b/doc/output.tex

new file mode 100644 (file)

index 0000000..29a8b4d
--- /dev/null
+++ b/doc/output.tex
@@ -0,0 +1,116 @@
+%%% -*-latex-*-
+%%%
+%%% Output machinery
+%%%
+%%% (c) 2015 Straylight/Edgeware
+%%%
+
+%%%----- Licensing notice ---------------------------------------------------
+%%%
+%%% This file is part of the Sensble Object Design, an object system for C.
+%%%
+%%% SOD is free software; you can redistribute it and/or modify
+%%% it under the terms of the GNU General Public License as published by
+%%% the Free Software Foundation; either version 2 of the License, or
+%%% (at your option) any later version.
+%%%
+%%% SOD is distributed in the hope that it will be useful,
+%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
+%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+%%% GNU General Public License for more details.
+%%%
+%%% You should have received a copy of the GNU General Public License
+%%% along with SOD; if not, write to the Free Software Foundation,
+%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+\chapter{The output system} \label{ch:output}
+
+%%%--------------------------------------------------------------------------
+
+%% output for `h' files
+%%
+%% prologue
+%% guard start
+%% typedefs start
+%% typedefs
+%% typedefs end
+%% includes start
+%% includes
+%% includes end
+%% classes start
+%% CLASS banner
+%% CLASS islots start
+%% CLASS islots slots
+%% CLASS islots end
+%% CLASS vtmsgs start
+%% CLASS vtmsgs CLASS start
+%% CLASS vtmsgs CLASS slots
+%% CLASS vtmsgs CLASS end
+%% CLASS vtmsgs end
+%% CLASS vtables start
+%% CLASS vtables CHAIN-HEAD start
+%% CLASS vtables CHAIN-HEAD slots
+%% CLASS vtables CHAIN-HEAD end
+%% CLASS vtables end
+%% CLASS vtable-externs
+%% CLASS vtable-externs-after
+%% CLASS methods start
+%% CLASS methods
+%% CLASS methods end
+%% CLASS ichains start
+%% CLASS ichains CHAIN-HEAD start
+%% CLASS ichains CHAIN-HEAD slots
+%% CLASS ichains CHAIN-HEAD end
+%% CLASS ichains end
+%% CLASS ilayout start
+%% CLASS ilayout slots
+%% CLASS ilayout end
+%% CLASS conversions
+%% CLASS object
+%% classes end
+%% guard end
+%% epilogue
+
+%% output for `c' files
+%%
+%% prologue
+%% includes start
+%% includes
+%% includes end
+%% classes start
+%% CLASS banner
+%% CLASS direct-methods start
+%% CLASS direct-methods METHOD start
+%% CLASS direct-methods METHOD body
+%% CLASS direct-methods METHOD end
+%% CLASS direct-methods end
+%% CLASS effective-methods
+%% CLASS vtables start
+%% CLASS vtables CHAIN-HEAD start
+%% CLASS vtables CHAIN-HEAD class-pointer METACLASS
+%% CLASS vtables CHAIN-HEAD base-offset
+%% CLASS vtables CHAIN-HEAD chain-offset TARGET-HEAD
+%% CLASS vtables CHAIN-HEAD vtmsgs CLASS start
+%% CLASS vtables CHAIN-HEAD vtmsgs CLASS slots
+%% CLASS vtables CHAIN-HEAD vtmsgs CLASS end
+%% CLASS vtables CHAIN-HEAD end
+%% CLASS vtables end
+%% CLASS object prepare
+%% CLASS object start
+%% CLASS object CHAIN-HEAD ichain start
+%% CLASS object SUPER slots start
+%% CLASS object SUPER slots
+%% CLASS object SUPER vtable
+%% CLASS object SUPER slots end
+%% CLASS object CHAIN-HEAD ichain end
+%% CLASS object end
+%% classes end
+%% epilogue
+
+%%%----- That's all, folks --------------------------------------------------
+
+%%% Local variables:
+%%% mode: LaTeX
+%%% TeX-master: "sod.tex"
+%%% TeX-PDF-mode: t
+%%% End:
diff --git a/doc/parsing.tex b/doc/parsing.tex

new file mode 100644 (file)

index 0000000..1c4c3cd
--- /dev/null
+++ b/doc/parsing.tex
@@ -0,0 +1,370 @@
+%%% -*-latex-*-
+%%%
+%%% Description of the parsing machinery
+%%%
+%%% (c) 2015 Straylight/Edgeware
+%%%
+
+%%%----- Licensing notice ---------------------------------------------------
+%%%
+%%% This file is part of the Sensble Object Design, an object system for C.
+%%%
+%%% SOD is free software; you can redistribute it and/or modify
+%%% it under the terms of the GNU General Public License as published by
+%%% the Free Software Foundation; either version 2 of the License, or
+%%% (at your option) any later version.
+%%%
+%%% SOD is distributed in the hope that it will be useful,
+%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
+%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+%%% GNU General Public License for more details.
+%%%
+%%% You should have received a copy of the GNU General Public License
+%%% along with SOD; if not, write to the Free Software Foundation,
+%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+\chapter{Parsing} \label{ch:parsing}
+
+%%%--------------------------------------------------------------------------
+\section{The parser protocol} \label{sec:parsing.proto}
+
+For the purpose of Sod's parsing library, \emph{parsing} is the process of
+reading a sequence of input items, in order, and computing an output value.
+
+A \emph{parser} is an expression which consumes zero or more input items and
+returns three values: a \emph{result}, a \emph{success flag}, and a
+\emph{consumed flag}.  The two flags are (generalized) booleans.  If the
+success flag is non-nil, then the parser is said to have \emph{succeeded},
+and the result is the parser's output.  If the success flag is nil then the
+parser is said to have \emph{failed}, and the result is a list of
+\emph{indicators}.  Finally, the consumed flag is non-nil if the parser
+consumed any input items.
+
+%%%--------------------------------------------------------------------------
+\section{File locations}
+
+%%%--------------------------------------------------------------------------
+\section{Scanners} \label{sec:parsing.scanner}
+
+A \emph{scanner} is an object which keeps track of a parser's progress as it
+works through its input.  There's no common base class for scanners: a
+scanner is simply any object which implements the scanner protocol described
+here.
+
+A scanner maintains a sequence of items to read.  It can step forwards
+through the items, one at a time, until it reaches the end (if, indeed, the
+sequence is finite, which it needn't be).  Until that point, there is a
+current item, though there's no protocol for accessing it at this level
+because the nature of the items is left unspecified.
+
+Some scanners support an additional \emph{place-capture} protocol which
+allows rewinding the scanner to an earlier point in the input so that it can
+be scanned again.
+
+\subsection{Basic scanner protocol} \label{sec:parsing.scanner.basic}
+
+The basic protocol supports stepping the scanner forward through its input
+sequence, and detecting the end of the sequence.
+
+\begin{describe}{gf}{scanner-step @<scanner>}
+  Advance the @<scanner> to the next item, which becomes current.
+
+  It is an error to step the scanner if the scanner is at end-of-file.
+\end{describe}
+
+\begin{describe}{gf}{scanner-at-eof-p @<scanner> @> @<generalized-boolean>}
+  Return non-nil if the scanner is at end-of-file, i.e., there are no more
+  items to read.
+
+  If nil is returned, there is a current item, and it is safe to step the
+  scanner again; otherwise, it is an error to query the current item or to
+  step the scanner.
+\end{describe}
+
+\subsection{Place-capture scanner protocol} \label{sec:parsing.scanner.place}
+
+The place-capture protocol allows rewinding to an earlier point in the
+sequence.  Not all scanners support the place-capture protocol.
+
+To rewind a scanner to a particular point, that point must be \emph{captured}
+as a \emph{place} when it's current -- so you must know in advance that this
+is an interesting place that's worth capturing.  The type of place returned
+depends on the type of scanner.  Given a captured place, the scanner can be
+rewound to the position held in it.
+
+Depending on how the scanner works, holding onto a captured place might
+consume a lot of memory or case poor performance.  For example, if the
+scanner is reading from an input stream, having a captured place means that
+data from that point on must be buffered in case the program needs to rewind
+the scanner and read that data again.  Therefore it's possible to
+\emph{release} a place when it turns out not to be needed any more.
+
+\begin{describe}{gf}{scanner-capture-place @<scanner> @> @<place>}
+  Capture the @<scanner>'s current position as a place, and return the place.
+\end{describe}
+
+\begin{describe}{gf}{scanner-restore-place @<scanner> @<place>}
+  Rewind the @<scanner> to the state it was in when @<place> was captured.
+  In particular, the item that was current when the @<place> was captured
+  becomes current again.
+
+  It is an error to restore a @<place> that has been released, or if the
+  @<place> wasn't captured from the @<scanner>.
+\end{describe}
+
+\begin{describe}{gf}{scanner-release-place @<scanner> @<place>}
+  Release the @<place>, to avoid having to maintaining the ability to restore
+  it after it's not needed any more..
+
+  It is an error if the @<place> wasn't captured from the @<scanner>.
+\end{describe}
+
+\begin{describe}{mac}
+    {with-scanner-place (@<place> @<scanner>) @<body-form>^* @> @<value>^*}
+  Capture the @<scanner>'s current position as a place, evaluate the
+  @<body-form>s as an implicit progn with the variable @<place> bound to the captured
+  place.  When control leaves the @<body-form>s, the place is released.  The return
+  values are the values of the final @<body-form>.
+\end{describe}
+
+\subsection{Scanner file-location protocol} \label{sec:parsing.scanner.floc}
+
+Some scanners participate in the file-location protocol (\xref{sec:floc}).
+They implement a method on @|file-location| which collects the necessary
+information using scanner-specific functions described here.
+
+\begin{describe}{fun}{scanner-file-location @<scanner> @> @<file-location>}
+  Return a @|file-location| object describing the current position of the
+  @<scanner>.
+
+  This calls the @|scanner-filename|, @|scanner-line| and @|scanner-column|
+  generic functions on the scanner, and uses these to fill in an appropriate
+  @|file-location|.
+
+  Since there are default methods on these generic functions, it is not an
+  error to call @|scanner-file-location| on any kind of value, but it might
+  not be very useful.  This function exists to do the work of appropriately
+  specialized methods on @|file-location|.
+\end{describe}
+
+\begin{describe}{gf}{scanner-filename @<scanner> @> @<string>}
+  Return the name of the file the scanner is currently processing, as a
+  string, or nil if the filename is not known.
+\end{describe}
+
+\begin{describe}{meth}{scanner-filename (@<scanner> t) @> @<string>}
+  Returns nil.
+\end{describe}
+
+\begin{describe}{gf}{scanner-line @<scanner> @> @<integer>}
+  Return the line number of the @<scanner>'s current position, as an integer,
+  or nil if the line number is not known.
+\end{describe}
+
+\begin{describe}{meth}{scanner-line (@<scanner> t) @> @<integer>}
+  Returns nil.
+\end{describe}
+
+\begin{describe}{gf}{scanner-column @<scanner> @> @<integer>}
+  Return the column number of the @<scanner>'s current position, as an
+  integer, or nil if the column number is not known.
+\end{describe}
+
+\begin{describe}{meth}{scanner-column (@<scanner> t) @> @<integer>}
+  Returns nil.
+\end{describe}
+
+\subsection{Character scanners} \label{sec:parsing.scanner.char}
+
+Character scanners are scanners which read sequences of characters.
+
+\begin{describe}{cls}{character-scanner () \&key}
+  Base class for character scanners.  This provides some very basic
+  functionality.
+
+  Not all character scanners are subclasses of @|character-scanner|.
+\end{describe}
+
+\begin{describe}{gf}{scanner-current-char @<scanner> @> @<character>}
+  Returns the current character.
+\end{describe}
+
+\begin{describe}{gf}{scanner-unread @<scanner> @<character>}
+  Rewind the @<scanner> by one step.  The @<chararacter> must be the previous
+  current character, and becomes the current character again.  It is an error
+  if: the @<scanner> has reached end-of-file; the @<scanner> is never been
+  stepped; or @<character> was not the previous current character.
+\end{describe}
+
+\begin{describe}{gf}
+    {scanner-interval @<scanner> @<place-a> \&optional @<place-b>
+        @> @<string>}
+  Return the characters in the @<scanner>'s input from @<place-a> up to (but
+  not including) @<place-b>.
+
+  The characters are returned as a string.  If @<place-b> is omitted, return
+  the characters up to (but not including) the current position.  It is an
+  error if @<place-b> precedes @<place-a> or they are from different
+  scanners.
+
+  This function is a character-scanner-specific extension to the
+  place-capture protocol; not all character scanners implement the
+  place-capture protocol, and some that do may not implement this function.
+\end{describe}
+
+\subsubsection{Stream access to character scanners}
+Sometimes it can be useful to apply the standard Lisp character input
+operations to the sequence of characters held by a character scanner.
+
+\begin{describe}{gf}{make-scanner-stream @<scanner> @> @<stream>}
+  Returns a fresh input @|stream| object which fetches input characters from
+  the character scanner object @<scanner>.  Reading characters from the
+  stream steps the scanner.  The stream will reach end-of-file when the
+  scanner reports end-of-file.  If the scanner implements the file-location
+  protocol then reading from the stream will change the file location in an
+  appropriate manner.
+
+  This is mostly useful for applying standard Lisp stream functions, most
+  particularly the @|read| function, in the middle of a parsing operation.
+\end{describe}
+
+\begin{describe}{cls}{character-scanner-stream (stream) \&key :scanner}
+  A Common Lisp input @|stream| object which works using the character
+  scanner protocol.  Any @<scanner> which implements the base scanner and
+  character scanner protocols is suitable.  See @|make-scanner-stream|.
+\end{describe}
+
+\subsection{String scanners} \label{sec:parsing.scanner.string}
+
+A \emph{string scanner} is a simple kind of character scanner which reads
+input from a string object.  String scanners implement the character scanner
+and place-capture protocols.
+
+\begin{describe}{cls}{string-scanner}
+  The class of string scanners.  The @|string-scanner| class is not a
+  subclass of @|character-scanner|.
+\end{describe}
+
+\begin{describe}{fun}{string-scanner-p @<value> @> @<generalized-boolean>}
+  Return non-nil if @<value> is a @|string-scanner| object; otherwise return
+  nil.
+\end{describe}
+
+\begin{describe}{fun}
+    {make-string-scanner @<string> \&key :start :end @> @<string-scanner>}
+  Construct and return a fresh @|string-scanner| object.  The new scanner
+  will read characters from @<string>, starting at index @<start> (which
+  defaults to zero), and continuing until it reaches index @<end> (defaults
+  to the end of the @<string>).
+\end{describe}
+
+\subsection{Character buffer scanners} \label{sec:parsing.scanner.charbuf}
+
+A \emph{character buffer scanner}, or \emph{charbuf scanner} for short, is an
+efficient scanner for reading characters from an input stream.  Charbuf
+scanners implements the basic scanner, character buffer, place-capture, and
+file-location protocols.
+
+\begin{describe}{cls}
+    {charbuf-scanner (character-scanner)
+        \&key :stream :filename :line :column}
+  The class of charbuf scanners.  The scanner will read characters from
+  @<stream>.  Charbuf scanners implement the file-location protocol: the
+  initial location is set from the given @<filename>, @<line> and @<column>;
+  the scanner will update the location as it reads its input.
+\end{describe}
+
+\begin{describe}{cls}{charbuf-scanner-place}
+  The class of place objects captured by a charbuf scanner.
+\end{describe}
+
+\begin{describe}{fun}
+    {charbuf-scanner-place-p @<value> @> @<generalized-boolean>}
+  Type predicate for charbuf scanner places: returns non-nil if @<value> is a
+  place captured by a charbuf scanner, and nil otherwise.
+\end{describe}
+
+\begin{describe}{gf}
+    {charbuf-scanner-map @<scanner> @<func> \&optional @<fail>
+      \nlret @<result> @<successp> @<consumedp>}
+  Read characters from the @<scanner>'s buffers.
+
+  This is intended to be an efficient and versatile interface for reading
+  characters from a scanner in bulk.  The function @<func> is invoked
+  repeatedly, as if by
+  \begin{prog}
+    (multiple-value-bind (@<donep> @<used>) \\ \ind\ind
+        (funcall @<func> @<buf> @<start> @<end>) \- \\
+      \textrm\ldots)
+  \end{prog}
+  The argument @<buf> is a simple string; @<start> and @<end> are two
+  nonnegative fixnums, indicating that the subsequence of @<buf> between
+  @<start> (inclusive) and @<end> (exclusive) should be processed.  If
+  @<func>'s return value @<donep> is nil then @<used> is ignored: the
+  function has consumed the entire buffer and wishes to read more.  If
+  @<donep> is non-nil, then it must be a fixnum such that $@<start> \le
+  @<used> \le @<end>$: the function has consumed the buffer as far as @<used>
+  (exclusive) and has completed successfully.
+
+  If end-of-file is encountered before @<func> completes successfully then it
+  fails: the @<fail> function is called with no arguments, and is expected to
+  return two values.  If omitted, @<fail> defaults to
+  \begin{prog}
+    (lambda () \\ \ind
+      (values nil nil))%
+  \end{prog}
+
+  The @|charbuf-scanner-map| function returns three values.  The first value
+  is the non-nil @<donep> value returned by @<func> if @|charbuf-scanner-map|
+  succeeded, or the first value returned by @<fail>; the second value is @|t|
+  on success, or the second value returned by @<fail>; the third value is
+  non-nil if @<func> consumed any input, i.e., it returned with @<donep> nil
+  at least once, or with $@<used> > @<start>$.
+\end{describe}
+
+\subsection{Token scanners} \label{sec:parsing.scanner.token}
+
+\begin{describe}{cls}
+    {token-scanner () \&key :filename (:line 1) (:column 0)}
+\end{describe}
+
+\begin{describe}{gf}{token-type @<scanner> @> @<type>}
+\end{describe}
+
+\begin{describe}{gf}{token-value @<scanner> @> @<value>}
+\end{describe}
+
+\begin{describe}{gf}{scanner-token @<scanner> @> @<type> @<value>}
+\end{describe}
+
+\begin{describe}{ty}{token-scanner-place}
+\end{describe}
+
+\begin{describe}{fun}
+    {token-scanner-place-p @<value> @> @<generalized-boolean>}
+\end{describe}
+
+\subsection{List scanners}
+
+\begin{describe}{ty}{list-scanner}
+\end{describe}
+
+\begin{describe}{fun}{list-scanner-p @<value> @> @<generalized-boolean>}
+\end{describe}
+
+\begin{describe}{fun}{make-list-scanner @<list> @> @<list-scanner>}
+\end{describe}
+
+%%%--------------------------------------------------------------------------
+\section{Parsing macros}
+
+%%%--------------------------------------------------------------------------
+\section{Lexical analyser}
+
+%%%----- That's all, folks --------------------------------------------------
+
+%%% Local variables:
+%%% mode: LaTeX
+%%% TeX-master: "sod.tex"
+%%% TeX-PDF-mode: t
+%%% End:
diff --git a/doc/sod-backg.tex b/doc/sod-backg.tex

deleted file mode 100644 (file)

index 0cb3877..0000000
--- a/doc/sod-backg.tex
+++ /dev/null
@@ -1,156 +0,0 @@
-%%% -*-latex-*-
-%%%
-%%% Background philosophy
-%%%
-%%% (c) 2009 Straylight/Edgeware
-%%%
-
-%%%----- Licensing notice ---------------------------------------------------
-%%%
-%%% This file is part of the Simple Object Definition system.
-%%%
-%%% SOD is free software; you can redistribute it and/or modify
-%%% it under the terms of the GNU General Public License as published by
-%%% the Free Software Foundation; either version 2 of the License, or
-%%% (at your option) any later version.
-%%%
-%%% SOD is distributed in the hope that it will be useful,
-%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
-%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-%%% GNU General Public License for more details.
-%%%
-%%% You should have received a copy of the GNU General Public License
-%%% along with SOD; if not, write to the Free Software Foundation,
-%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
-
-\chapter{Philosophical background}
-
-%%%--------------------------------------------------------------------------
-\section{Superclass linearization}
-
-Before making any decisions about relationships between superclasses, Sod
-\emph{linearizes} them, i.e., imposes a total order consistent with the
-direct-subclass/superclass partial order.
-
-In the vague hope that we don't be completely bogged down in formalism by the
-end of this, let's introduce some notation.  We'll fix some class $z$ and
-consider its set of superclasses $S(z) = \{ a, b, \dots \}$.  We can define a
-relation $c \prec_1 d$ if $c$ is a direct subclass of $d$, and extend it by
-taking the reflexive, transitive closure: $c \preceq d$ if and only if
-\begin{itemize}
-\item $c = d$, or
-\item there exists some class $x$ such that $c \prec_1 x$ and $x \preceq d$.
-\end{itemize}
-This is the `is-subclass-of' relation we've been using so far.\footnote{%
-  In some object systems, notably Flavors, this relation is allowed to fail
-  to be a partial order because of cycles in the class graph.  I haven't
-  given a great deal of thought to how well Sod would cope with a cyclic
-  class graph.} %
-We write $d \succeq c$ and say that $d$ is a superclass of $c$ if and only if
-$c \preceq d$.
-
-The problem comes when we try to resolve inheritance questions.  A class
-should inherit behaviour from its superclasses; but, in a world of multiple
-inheritance, which one do we choose?  We get a simple version of this problem
-when we try to resolve inheritance of slot initializers: only one initializer
-can be inherited.
-
-We start by collecting into a set~$I$ the classes which define an initializer
-for the slot.  If $I$ contains both a class $x$ and one of $x$'s superclasses
-then we should prefer $x$ and consider the superclass to be overridden.  So
-we should confine our attention to \emph{least} classes: a member $x$ of a
-set $I$ is least, with respect to a particular partial order, if $y \preceq
-x$ only when $x = y$.  If there is a single least class in our set the we
-have a winner.  Otherwise we want some way to choose among them.
-
-This is not uncontroversial.  Languages such as \Cplusplus\ refuse to choose
-among least classes; instead, any program in which such a choice must be made
-is simply declared erroneous.
-
-Simply throwing up our hands in horror at this situation is satisfactory when
-we only wanted to pick one `winner', as we do for slot initializers.
-However, method combination is a much more complicated business.  We don't
-want to pick just one winner: we want to order all of the applicable methods
-in some way.  Insisting that there is a clear winner at every step along the
-chain is too much of an imposition.  Instead, we \emph{linearize} the
-classes.
-
-%%%--------------------------------------------------------------------------
-\section{Invariance, covariance, contravariance}
-
-In Sod, at least with regard to the existing method combinations, method
-types are \emph{invariant}.  This is not an accident, and it's not due to
-ignorance.
-
-The \emph{signature} of a function, method or message describes its argument
-and return-value types.  If a method's arguments are an integer and a string,
-and it returns a character, we might write its signature as
-\[ (@|int|, @|string|) \to @|char| \]
-In Sod, a method's arguments have to match its message's arguments precisely,
-and the return type must either be @|void| -- for a dæmon method -- or again
-match the message's return type.  This is argument and return-type
-\emph{invariance}.
-
-Some object systems allow methods with subtly different signatures to be
-defined on a single message.  In particular, since the idea is that instances
-of a subclass ought to be broadly compatible~(see \xref{sec:phil.lsp}) with
-existing code which expects instances of a superclass, we might be able to
-get away with bending method signatures one way or another to permit this.
-
-\Cplusplus\ permits \emph{return-type covariance}, where a method's return
-type can be a subclass of the return type specified by a less-specific
-method.  Eiffel allows \emph{argument covariance}, where a method's arguments
-can be subclasses of the arguments specified by a less-specific
-method.\footnote{%
-  Attentive readers will note that I ought to be talking about pointers to
-  instances throughout.  I'm trying to limit the weight of the notation.
-  Besides, I prefer data models as found in Lisp and Python where all values
-  are held by reference.} %
-
-Eiffel's argument covariance is unsafe.\footnote{%
-  Argument covariance is correct if you're doing runtime dispatch based on
-  argument types.  Eiffel isn't: it's single dispatch, like Sod is.} %
-Suppose that we have two pairs of classes, $a \prec_1 b$ and $c \prec_1 d$.
-Class $b$ defines a message $m$ with signature $d \to @|int|$; class $a$
-defines a method with signature $c \to @|int|$.  This means that it's wrong
-to send $m$ to an instance $a$ carrying an argument of type $d$.  But of
-course, we can treat an instance of $a$ as if it's an instance of $b$,
-whereupon it appears that we are permitted to pass a~$c$ in our message.  The
-result is a well-known hole in the type system.  Oops.
-
-\Cplusplus's return-type covariance is fine.  Also fine is argument
-\emph{contravariance}.  If $b$ defined its message to have signature $c \to
-@|int|$, and $a$ were to broaden its method to $d \to @|int|$, there'd be no
-problem.  All $c$s are $d$s, so viewing an $a$ as a $b$ does no harm.
-
-All of this fiddling with types is fine as long as method inheritance or
-overriding is an all-or-nothing thing.  But Sod has method combinations,
-where applicable methods are taken from the instance's class and all its
-superclasses and combined.  And this makes everything very messy.
-
-It's possible to sort all of the mess out in the generated effective method
--- we'd just have to convert the arguments to the types that were expected by
-the direct methods.  This would require expensive run-time conversions of all
-of the non-invariant arguments and return values.  And we'd need some
-complicated rule so that we could choose sensible types for the method
-entries in our vtables.  Something like this:
-\begin{quote} \itshape
-  For each named argument of a message, there must be a unique greatest type
-  among the types given for that argument by the applicable methods; and
-  there must be a unique least type among all of the return types of the
-  applicable methods.
-\end{quote}
-I have visions of people wanting to write special no-effect methods whose
-only purpose is to guide the translator around the class graph properly.
-Let's not.
-
-%% things to talk about:
-%% Liskov substitution principle and why it's mad
-
-%%%----- That's all, folks --------------------------------------------------
-
-%%% Local variables:
-%%% mode: LaTeX
-%%% TeX-master: "sod.tex"
-%%% TeX-PDF-mode: t
-%%% End:
diff --git a/doc/sod.sty b/doc/sod.sty

new file mode 100644 (file)

index 0000000..105a6c7
--- /dev/null
+++ b/doc/sod.sty
@@ -0,0 +1,168 @@
+%%% -*-latex-*-
+%%%
+%%% Styles and other hacking for the Sod manual
+%%%
+%%% (c) 2015 Straylight/Edgeware
+%%%
+
+%%%----- Licensing notice ---------------------------------------------------
+%%%
+%%% This file is part of the Sensble Object Design, an object system for C.
+%%%
+%%% SOD is free software; you can redistribute it and/or modify
+%%% it under the terms of the GNU General Public License as published by
+%%% the Free Software Foundation; either version 2 of the License, or
+%%% (at your option) any later version.
+%%%
+%%% SOD is distributed in the hope that it will be useful,
+%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
+%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+%%% GNU General Public License for more details.
+%%%
+%%% You should have received a copy of the GNU General Public License
+%%% along with SOD; if not, write to the Free Software Foundation,
+%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+\ProvidesPackage{sod}
+
+%% More reference types.
+\defxref{p}{part}
+
+%% Other languages with special typesetting.
+\def\Cplusplus{C\kern-\p@++}
+\def\Csharp{C\#}
+
+%% Special maths notation.
+\def\chain#1#2{\mathsf{ch}_{#1}(#2)}
+\def\chainhead#1#2{\mathsf{hd}_{#1}(#2)}
+\def\chaintail#1#2{\mathsf{tl}_{#1}(#2)}
+
+%% Other mathematical tweaks.
+\let\implies\Rightarrow
+\let\epsilon\varepsilon
+
+%% Unix manpage references.
+\def\man#1#2{\textbf{#1}(#2)}
+
+%% Listings don't need to be small.
+\let\listingsize\relax
+
+%% Metavariables are italics without decoration.
+\def\syntleft{\normalfont\itshape}
+\let\syntright\empty
+
+%% Literal code is in sans face.
+\let\codeface\sffamily
+\def\code#1{\ifmmode\hbox\fi{\normalfont\codeface\/#1\/}}
+\def\ulitleft{\normalfont\codeface}
+\let\ulitright\empty
+
+%% Conditionally enter maths mode.  Can't use \ensuremath here because we
+%% aren't necessarily sure where the maths will actually end.
+\let\m@maybe@end\relax
+\def\m@maybe{\ifmmode\else$\let\m@maybe@end$\fi}
+
+%% Standard syntax shortcuts.
+\atdef <#1>{\synt{#1}\@scripts}
+\atdef "#1"{\lit*{#1}\@scripts}
+\atdef `#1'{\lit{#1}\@scripts}
+\atdef |#1|{\textsf{#1}\@scripts}
+
+%% A handy abbreviation; `\\' itself is too good to steal.
+\atdef \\{\textbackslash}
+
+%% Intercept grammar typesetting and replace the vertical bar with the
+%% maths-font version.
+\let\@@grammar\grammar
+\def\grammar{\def\textbar{\hbox{$|$}}\@@grammar}
+
+%% Collect super- and subscripts.  (Note that underscores are active for the
+%% most part.)  When we're done, end maths mode if we entered it
+%% conditionally.
+\def\@scripts{\futurelet\@ch\@scripts@i}
+\begingroup\lccode`\~=`\_\lowercase{\endgroup
+\def\@scripts@i{\if1\ifx\@ch~1\else\ifx\@ch^1\else0\fi\fi%
+  \expandafter\@scripts@ii\else\expandafter\m@maybe@end\fi}}
+\def\@scripts@ii#1#2{\m@maybe#1{#2}\@scripts}
+
+%% Doubling characters, maybe.  Either way, chain onto \@scripts.
+\def\dbl@maybe#1{\let\@tempa#1\futurelet\@ch\dbl@maybe@i}
+\def\dbl@maybe@i{\m@maybe\ifx\@ch\@tempa\@tempa\!\@tempa%
+  \expandafter\@firstoftwo\expandafter\@scripts%
+  \else\@tempa\expandafter\@scripts\fi}
+
+%% Extra syntax for Lisp templates.  These produce the maths-font versions of
+%% characters, which should contrast well against the sans face used for
+%% literals.
+\atdef [{\dbl@maybe[}
+\atdef ]{\dbl@maybe]}
+\atdef {{\m@maybe\{\@scripts}
+\atdef }{\m@maybe\}\@scripts}
+\atdef ({\m@maybe(\@scripts}
+\atdef ){\m@maybe)\@scripts}
+\atdef !{\m@maybe|\@scripts}
+\def\returns{\m@maybe\longrightarrow\m@maybe@end\hspace{0.5em}\ignorespaces}
+\atdef >{\leavevmode\unskip\hspace{0.5em}\returns}
+
+%% Comment setting.
+\atdef ;#1\\{\normalfont\itshape;#1\\}
+
+%% Environment for setting programs.  Newlines are explicit, because
+%% otherwise I need comments in weird places to make the vertical spacing
+%% come out properly.  You can write `\obeylines' if you really want to.
+\def\prog{\codeface\quote\tabbing}
+\def\endprog{\endtabbing\endquote}
+\def\ind{\quad\=\+\kill}
+
+%% Put a chunk of text in a box.
+\newenvironment{boxy}[1][\q@]{%
+  \dimen@\linewidth\advance\dimen@-1.2pt\advance\dimen@-2ex%
+  \medskip%
+  \vbox\bgroup\hrule\hbox\bgroup\vrule%
+  \vbox\bgroup\vskip1ex\hbox\bgroup\hskip1ex\minipage\dimen@%
+  \def\@temp{#1}\ifx\@temp\q@\else\leavevmode{\headfam\bfseries#1\quad}\fi%
+}{%
+  \endminipage\hskip1ex\egroup\vskip1ex\egroup%
+  \vrule\egroup\hrule\egroup%
+  \medskip%
+}
+
+%% Lisp documentation machinery.
+\def\definedescribecategory#1#2{\@namedef{cat!#1}{#2}}
+\def\describecategoryname#1{%
+  \expandafter\let\expandafter\@tempa\csname cat!#1\endcsname%
+  \ifx\@tempa\relax#1\else\@tempa\fi}
+\definedescribecategory{fun}{function}
+\definedescribecategory{gf}{generic function}
+\definedescribecategory{var}{variable}
+\definedescribecategory{const}{constant}
+\definedescribecategory{meth}{primary method}
+\definedescribecategory{ar-meth}{around-method}
+\definedescribecategory{be-meth}{before-method}
+\definedescribecategory{af-meth}{after-method}
+\definedescribecategory{cls}{class}
+\definedescribecategory{ty}{type}
+\definedescribecategory{mac}{macro}
+\def\nlret{\\\hspace{4em}\returns}
+
+\def\q@{\q@}
+\newenvironment{describe}[3][\q@]{%
+  \normalfont%
+  \par\goodbreak%
+  \vspace{\bigskipamount}%
+  \setbox\z@\hbox{\bfseries[\describecategoryname{#2}]}%
+  \dimen@\linewidth\advance\dimen@-\wd\z@%
+  \def\@temp##1 ##2\q@{\message{#2:##1}\label{#2:##1}}%
+  \def\@tempa{#1}\ifx\@tempa\q@\@temp#3 \q@\else\@temp{#1} \q@\fi%
+  \edef\@temp{{\the\linewidth}{@{}p{\the\dimen@}%
+      @{\extracolsep{\fill}}l@{\extracolsep{0pt}}}}%
+  \noindent\csname tabular*\expandafter\endcsname\@temp%
+  \tabbing\codeface#3\endtabbing&\unhbox\z@\\\endtabular%
+%  \@afterheading%
+  \list{}{\rightmargin\z@}\item%
+}{%
+  \endlist%
+}
+
+%%% ----- That's all, folks --------------------------------------------------
+\endinput
+\ No newline at end of file
diff --git a/doc/sod.tex b/doc/sod.tex

index ff0dea4..4979b23 100644 (file)
--- a/doc/sod.tex
+++ b/doc/sod.tex
@@ -1,5 +1,32 @@
+%%% -*-latex-*-
+%%%
+%%% Description of the internal class structure and protocol
+%%%
+%%% (c) 2009 Straylight/Edgeware
+%%%
+
+%%%----- Licensing notice ---------------------------------------------------
+%%%
+%%% This file is part of the Simple Object Definition system.
+%%%
+%%% SOD is free software; you can redistribute it and/or modify
+%%% it under the terms of the GNU General Public License as published by
+%%% the Free Software Foundation; either version 2 of the License, or
+%%% (at your option) any later version.
+%%%
+%%% SOD is distributed in the hope that it will be useful,
+%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
+%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+%%% GNU General Public License for more details.
+%%%
+%%% You should have received a copy of the GNU General Public License
+%%% along with SOD; if not, write to the Free Software Foundation,
+%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
  \documentclass[noarticle]{strayman}
  
+\errorcontextlines=999
+
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
  \usepackage[palatino, helvetica, courier, maths=cmr]{mdwfonts}
@@ -13,1195 +40,118 @@
  \usepackage{at}
  \usepackage{mdwref}
  
+\usepackage{sod}
+
  \title{A Sensible Object Design for C}
  \author{Mark Wooding}
  
-\makeatletter
-
-\errorcontextlines999
-
-\def\syntleft{\normalfont\itshape}
-\let\syntright\empty
-
-\let\codeface\sffamily
-
-\def\ulitleft{\normalfont\codeface}
-\let\ulitright\empty
-
-\let\listingsize\relax
-
-\let\epsilon\varepsilon
-
-\atdef <#1>{\synt{#1}\@scripts}
-\atdef "#1"{\lit*{#1}\@scripts}
-\atdef `#1'{\lit{#1}\@scripts}
-\atdef |#1|{\textsf{#1}\@scripts}
-\def\dbl@maybe#1{\let\@tempa#1\futurelet\@ch\dbl@maybe@i}
-\def\dbl@maybe@i{\m@maybe\ifx\@ch\@tempa\@tempa\!\@tempa%
-  \expandafter\@firstoftwo\expandafter\@scripts%
-  \else\@tempa\expandafter\@scripts\fi}
-\atdef [{\dbl@maybe[}
-\atdef ]{\dbl@maybe]}
-\atdef {{\m@maybe\{\@scripts}
-\atdef }{\m@maybe\}\@scripts}
-\atdef ({\m@maybe(\@scripts}
-\atdef ){\m@maybe)\@scripts}
-\atdef !{\m@maybe|\@scripts}
-\atdef to{\leavevmode\unskip\quad\m@maybe\longrightarrow\m@maybe@end\quad}
-\let\m@maybe@end\relax
-\def\m@maybe{\ifmmode\else$\let\m@maybe@end$\fi}
-\def\@scripts{\futurelet\@ch\@scripts@i}
-
-\def\chain#1#2{\mathsf{ch}_{#1}(#2)}
-\def\chainhead#1#2{\mathsf{hd}_{#1}(#2)}
-\def\chaintail#1#2{\mathsf{tl}_{#1}(#2)}
-
-\let\implies\Rightarrow
-
-\atdef ;#1\\{\normalfont\itshape;#1\\}
-\let\@@grammar\grammar
-\def\grammar{\def\textbar{\hbox{$|$}}\@@grammar}
-
-\begingroup\lccode`\~=`\_\lowercase{\endgroup
-\def\@scripts@i{\if1\ifx\@ch~1\else\ifx\@ch^1\else0\fi\fi%
-  \expandafter\@scripts@ii\else\expandafter\m@maybe@end\fi}}
-\def\@scripts@ii#1#2{\m@maybe#1{#2}\@scripts}
-
-\def\Cplusplus{C\kern-\p@++}
-\def\Csharp{C\#}
-\def\man#1#2{\textbf{#1}(#2)}
-
-\begingroup\lccode`\~=`\
-\lowercase{
-\endgroup
-\def\prog{%
-  \codeface%
-  \quote%
-  \let\old@nl\\%
-  \obeylines%
-  \tabbing%
-  \global\let~\\%
-  \global\let\\\textbackslash%
-}
-\def\endprog{%
-  \endtabbing%
-  \global\let\\\old@nl%
-  \endquote%
-}}
-
-\newenvironment{boxy}[1][\q@]{%
-  \dimen@\linewidth\advance\dimen@-1.2pt\advance\dimen@-2ex%
-  \medskip%
-  \vbox\bgroup\hrule\hbox\bgroup\vrule%
-  \vbox\bgroup\vskip1ex\hbox\bgroup\hskip1ex\minipage\dimen@%
-  \def\@temp{#1}\ifx\@temp\q@\else\leavevmode{\headfam\bfseries#1\quad}\fi%
-}{%
-  \endminipage\hskip1ex\egroup\vskip1ex\egroup%
-  \vrule\egroup\hrule\egroup%
-  \medskip%
-}
-
-\def\definedescribecategory#1#2{\@namedef{cat!#1}{#2}}
-\def\describecategoryname#1{%
-  \expandafter\let\expandafter\@tempa\csname cat!#1\endcsname%
-  \ifx\@tempa\relax#1\else\@tempa\fi}
-\definedescribecategory{fun}{function}
-\definedescribecategory{gf}{generic function}
-\definedescribecategory{var}{variable}
-\definedescribecategory{const}{constant}
-\definedescribecategory{meth}{primary method}
-\definedescribecategory{ar-meth}{around-method}
-\definedescribecategory{be-meth}{before-method}
-\definedescribecategory{af-meth}{after-method}
-\definedescribecategory{cls}{class}
-\definedescribecategory{ty}{type}
-\definedescribecategory{mac}{macro}
-
-\def\q@{\q@}
-\newenvironment{describe}[3][\q@]{%
-  \normalfont%
-  \par\goodbreak%
-  \vspace{\bigskipamount}%
-  \setbox\z@\hbox{\bfseries[\describecategoryname{#2}]}%
-  \dimen@\linewidth\advance\dimen@-\wd\z@%
-  \def\@temp##1 ##2\q@{\message{#2:##1}\label{#2:##1}}%
-  \def\@tempa{#1}\ifx\@tempa\q@\@temp#3 \q@\else\@temp{#1} \\\fi%
-  \edef\@temp{{\the\linewidth}{@{}p{\the\dimen@}%
-      @{\extracolsep{\fill}}l@{\extracolsep{0pt}}}}%
-  \noindent\csname tabular*\expandafter\endcsname\@temp%
-  \tabbing\codeface#3\endtabbing&\unhbox\z@\\\endtabular%
-%  \@afterheading%
-  \list{}{\rightmargin\z@}\item%
-}{%
-  \endlist%
-}
-
-\def\push{\quad\=\+\kill}
-
  \begin{document}
  
  \maketitle
  
-\include{sod-tut}
-
  %%%--------------------------------------------------------------------------
-\chapter{Internals}
-
-\section{Generated names}
-
-The generated names for functions and objects related to a class are
-constructed systematically so as not to interfere with each other.  The rules
-on class, slot and message naming exist so as to ensure that the generated
-names don't collide with each other.
-
-The following notation is used in this section.
-\begin{description}
-\item[@<class>] The full name of the `focus' class: the one for which we are
-  generating name.
-\item[@<super-nick>] The nickname of a superclass.
-\item[@<head-nick>] The nickname of the chain-head class of the chain
-  in question.
-\end{description}
-
-\subsection{Instance layout}
-
-%%%--------------------------------------------------------------------------
-\section{Syntax}
-\label{sec:syntax}
-
-Fortunately, Sod is syntactically quite simple.  I've used a little slightly
-unusual notation in order to make the presentation easier to read.  For any
-nonterminal $x$:
-\begin{itemize}
-\item $\epsilon$ denotes the empty nonterminal:
-  \begin{quote}
-    $\epsilon$ ::=
-  \end{quote}
-\item @[$x$@] means an optional $x$:
-  \begin{quote}
-    \syntax{@[$x$@] ::= $\epsilon$ @! $x$}
-  \end{quote}
-\item $x^*$ means a sequence of zero or more $x$s:
-  \begin{quote}
-    \syntax{$x^*$ ::= $\epsilon$ @! $x^*$ $x$}
-  \end{quote}
-\item $x^+$ means a sequence of one or more $x$s:
-  \begin{quote}
-    \syntax{$x^+$ ::= $x$ $x^*$}
-  \end{quote}
-\item $x$@<-list> means a sequence of one or more $x$s separated
-  by commas:
-  \begin{quote}
-    \syntax{$x$<-list> ::= $x$ @! $x$<-list> "," $x$}
-  \end{quote}
-\end{itemize}
-
-\subsection{Lexical syntax}
-\label{sec:syntax.lex}
-
-Whitespace and comments are discarded.  The remaining characters are
-collected into tokens according to the following syntax.
-
-\begin{grammar}
-<token> ::= <identifier>
-\alt <string-literal>
-\alt <char-literal>
-\alt <integer-literal>
-\alt <punctuation>
-\end{grammar}
-
-This syntax is slightly ambiguous, and is disambiguated by the \emph{maximal
-munch} rule: at each stage we take the longest sequence of characters which
-could be a token.
-
-\subsubsection{Identifiers} \label{sec:syntax.lex.id}
-
-\begin{grammar}
-<identifier> ::= <id-start-char> @<id-body-char>^*
-
-<id-start-char> ::= <alpha-char> | "_"
-
-<id-body-char> ::= <id-start-char> @! <digit-char>
-
-<alpha-char> ::= "A" | "B" | \dots\ | "Z"
-\alt "a" | "b" | \dots\ | "z"
-\alt <extended-alpha-char>
-
-<digit-char> ::= "0" | <nonzero-digit-char>
-
-<nonzero-digit-char> ::= "1" | "2" $| \cdots |$ "9"
-\end{grammar}
-
-The precise definition of @<alpha-char> is left to the function
-\textsf{alpha-char-p} in the hosting Lisp system.  For portability,
-programmers are encouraged to limit themselves to the standard ASCII letters.
-
-There are no reserved words at the lexical level, but the higher-level syntax
-recognizes certain identifiers as \emph{keywords} in some contexts.  There is
-also an ambiguity (inherited from C) in the declaration syntax which is
-settled by distinguishing type names from other identifiers at a lexical
-level.
-
-\subsubsection{String and character literals} \label{sec:syntax.lex.string}
-
-\begin{grammar}
-<string-literal> ::= "\"" @<string-literal-char>^* "\""
-
-<char-literal> ::= "'" <char-literal-char> "'"
-
-<string-literal-char> ::= any character other than "\\" or "\""
-\alt "\\" <char>
-
-<char-literal-char> ::= any character other than "\\" or "'"
-\alt "\\" <char>
-
-<char> ::= any single character
-\end{grammar}
-
-The syntax for string and character literals differs from~C.  In particular,
-escape sequences such as @`\textbackslash n' are not recognized.  The use
-of string and character literals in Sod, outside of C~fragments, is limited,
-and the simple syntax seems adequate.  For the sake of future compatibility,
-the use of character sequences which resemble C escape sequences is
-discouraged.
-
-\subsubsection{Integer literals} \label{sec:syntax.lex.int}
-
-\begin{grammar}
-<integer-literal> ::= <decimal-integer>
-\alt <binary-integer>
-\alt <octal-integer>
-\alt <hex-integer>
-
-<decimal-integer> ::= <nonzero-digit-char> @<digit-char>^*
-
-<binary-integer> ::= "0" @("b"|"B"@) @<binary-digit-char>^+
-
-<binary-digit-char> ::= "0" | "1"
-
-<octal-integer> ::= "0" @["o"|"O"@] @<octal-digit-char>^+
-
-<octal-digit-char> ::= "0" | "1" $| \cdots |$ "7"
-
-<hex-integer> ::= "0" @("x"|"X"@) @<hex-digit-char>^+
-
-<hex-digit-char> ::= <digit-char>
-\alt "A" | "B" | "C" | "D" | "E" | "F"
-\alt "a" | "b" | "c" | "d" | "e" | "f"
-\end{grammar}
-
-Sod understands only integers, not floating-point numbers; its integer syntax
-goes slightly beyond C in allowing a @`0o' prefix for octal and @`0b' for
-binary.  However, length and signedness indicators are not permitted.
-
-\subsubsection{Punctuation} \label{sec:syntax.lex.punct}
-
-\begin{grammar}
-<punctuation> ::= any nonalphanumeric character other than "_", "\"" or "'"
-\end{grammar}
-
-\subsubsection{Comments} \label{sec:lex-comment}
-
-\begin{grammar}
-<comment> ::= <block-comment>
-\alt <line-comment>
-
-<block-comment> ::=
-  "/*"
-  @<not-star>^* @(@<star>^+ <not-star-or-slash> @<not-star>^*@)^*
-  @<star>^*
-  "*/"
-
-<star> ::= "*"
-
-<not-star> ::= any character other than "*"
-
-<not-star-or-slash> ::= any character other than "*" or  "/"
-
-<line-comment> ::= "//" @<not-newline>^* <newline>
-
-<newline> ::= a newline character
-
-<not-newline> ::= any character other than newline
-\end{grammar}
-
-Comments are exactly as in C99: both traditional block comments `\texttt{/*}
-\dots\ \texttt{*/}' and \Cplusplus-style `\texttt{//} \dots' comments are
-permitted and ignored.
-
-\subsection{Special nonterminals}
-\label{sec:special-nonterminals}
-
-Aside from the lexical syntax presented above (\xref{sec:lexical-syntax}),
-two special nonterminals occur in the module syntax.
-
-\subsubsection{S-expressions} \label{sec:syntax-sexp}
-
-\begin{grammar}
-<s-expression> ::= an S-expression, as parsed by the Lisp reader
-\end{grammar}
-
-When an S-expression is expected, the Sod parser simply calls the host Lisp
-system's \textsf{read} function.  Sod modules are permitted to modify the
-read table to extend the S-expression syntax.
-
-S-expressions are self-delimiting, so no end-marker is needed.
-
-\subsubsection{C fragments} \label{sec:syntax.lex.cfrag}
-
-\begin{grammar}
-<c-fragment> ::= a sequence of C tokens, with matching brackets
-\end{grammar}
-
-Sequences of C code are simply stored and written to the output unchanged
-during translation.  They are read using a simple scanner which nonetheless
-understands C comments and string and character literals.
-
-A C fragment is terminated by one of a small number of delimiter characters
-determined by the immediately surrounding context -- usually a closing brace
-or bracket.  The first such delimiter character which is not enclosed in
-brackets, braces or parenthesis ends the fragment.
-
-\subsection{Module syntax} \label{sec:syntax-module}
-
-\begin{grammar}
-<module> ::= @<definition>^*
-
-<definition> ::= <import-definition>
-\alt <load-definition>
-\alt <lisp-definition>
-\alt <code-definition>
-\alt <typename-definition>
-\alt <class-definition>
-\end{grammar}
-
-A module is the top-level syntactic item.  A module consists of a sequence of
-definitions.
-
-\subsection{Simple definitions} \label{sec:syntax.defs}
-
-\subsubsection{Importing modules} \label{sec:syntax.defs.import}
-
-\begin{grammar}
-<import-definition> ::= "import" <string> ";"
-\end{grammar}
-
-The module named @<string> is processed and its definitions made available.
-
-A search is made for a module source file as follows.
-\begin{itemize}
-\item The module name @<string> is converted into a filename by appending
-  @`.sod', if it has no extension already.\footnote{%
-    Technically, what happens is \textsf{(merge-pathnames name (make-pathname
-    :type "SOD" :case :common))}, so exactly what this means varies
-    according to the host system.} %
-\item The file is looked for relative to the directory containing the
-  importing module.
-\item If that fails, then the file is looked for in each directory on the
-  module search path in turn.
-\item If the file still isn't found, an error is reported and the import
-  fails.
-\end{itemize}
-At this point, if the file has previously been imported, nothing further
-happens.\footnote{%
-  This check is done using \textsf{truename}, so it should see through simple
-  tricks like symbolic links.  However, it may be confused by fancy things
-  like bind mounts and so on.} %
-
-Recursive imports, either direct or indirect, are an error.
-
-\subsubsection{Loading extensions} \label{sec:syntax.defs.load}
-
-\begin{grammar}
-<load-definition> ::= "load" <string> ";"
-\end{grammar}
-
-The Lisp file named @<string> is loaded and evaluated.
-
-A search is made for a Lisp source file as follows.
-\begin{itemize}
-\item The name @<string> is converted into a filename by appending @`.lisp',
-  if it has no extension already.\footnote{%
-    Technically, what happens is \textsf{(merge-pathnames name (make-pathname
-    :type "LISP" :case :common))}, so exactly what this means varies
-    according to the host system.} %
-\item A search is then made in the same manner as for module imports
-  (\xref{sec:syntax-module}).
-\end{itemize}
-If the file is found, it is loaded using the host Lisp's \textsf{load}
-function.
-
-Note that Sod doesn't attempt to compile Lisp files, or even to look for
-existing compiled files.  The right way to package a substantial extension to
-the Sod translator is to provide the extension as a standard ASDF system (or
-similar) and leave a dropping @"foo-extension.lisp" in the module path saying
-something like
-\begin{quote}
-  \textsf{(asdf:load-system :foo-extension)}
-\end{quote}
-which will arrange for the extension to be compiled if necessary.
-
-(This approach means that the language doesn't need to depend on any
-particular system definition facility.  It's bad enough already that it
-depends on Common Lisp.)
-
-\subsubsection{Lisp escapes} \label{sec:syntax.defs.lisp}
-
-\begin{grammar}
-<lisp-definition> ::= "lisp" <s-expression> ";"
-\end{grammar}
-
-The @<s-expression> is evaluated immediately.  It can do anything it likes.
-
-\textbf{Warning!}  This means that hostile Sod modules are a security hazard.
-Lisp code can read and write files, start other programs, and make network
-connections.  Don't install Sod modules from sources that you don't
-trust.\footnote{%
-  Presumably you were going to run the corresponding code at some point, so
-  this isn't as unusually scary as it sounds.  But please be careful.} %
-
-\subsubsection{Declaring type names} \label{sec:syntax.defs.typename}
-
-\begin{grammar}
-<typename-definition> ::=
-  "typename" <identifier-list> ";"
-\end{grammar}
-
-Each @<identifier> is declared as naming a C type.  This is important because
-the C type syntax -- which Sod uses -- is ambiguous, and disambiguation is
-done by distinguishing type names from other identifiers.
-
-Don't declare class names using @"typename"; use @"class" forward
-declarations instead.
-
-\subsection{Literal code} \label{sec:syntax-code}
-
-\begin{grammar}
-<code-definition> ::=
-  "code" <identifier> ":" <identifier> @[<constraints>@]
-  "{" <c-fragment> "}"
-
-<constraints> ::= "[" <constraint-list> "]"
-
-<constraint> ::= @<identifier>^+
-\end{grammar}
-
-The @<c-fragment> will be output unchanged to one of the output files.
-
-The first @<identifier> is the symbolic name of an output file.  Predefined
-output file names are @"c" and @"h", which are the implementation code and
-header file respectively; other output files can be defined by extensions.
-
-The second @<identifier> provides a name for the output item.  Several C
-fragments can have the same name: they will be concatenated together in the
-order in which they were encountered.
-
-The @<constraints> provide a means for specifying where in the output file
-the output item should appear.  (Note the two kinds of square brackets shown
-in the syntax: square brackets must appear around the constraints if they are
-present, but that they may be omitted.)  Each comma-separated @<constraint>
-is a sequence of identifiers naming output items, and indicates that the
-output items must appear in the order given -- though the translator is free
-to insert additional items in between them.  (The particular output items
-needn't be defined already -- indeed, they needn't be defined ever.)
+\frontmatter
  
-There is a predefined output item @"includes" in both the @"c" and @"h"
-output files which is a suitable place for inserting @"\#include"
-preprocessor directives in order to declare types and functions for use
-elsewhere in the generated output files.
+\tableofcontents
  
-\subsection{Property sets} \label{sec:syntax.propset}
-
-\begin{grammar}
-<properties> ::= "[" <property-list> "]"
-
-<property> ::= <identifier> "=" <expression>
-\end{grammar}
-
-Property sets are a means for associating miscellaneous information with
-classes and related items.  By using property sets, additional information
-can be passed to extensions without the need to introduce idiosyncratic
-syntax.
-
-A property has a name, given as an @<identifier>, and a value computed by
-evaluating an @<expression>.  The value can be one of a number of types,
-though the only operators currently defined act on integer values only.
-
-\subsubsection{The expression evaluator} \label{sec:syntax.propset.expr}
-
-\begin{grammar}
-<expression> ::= <term> | <expression> "+" <term> | <expression> "-" <term>
-
-<term> ::= <factor> | <term> "*" <factor> | <term> "/" <factor>
-
-<factor> ::= <primary> | "+" <factor> | "-" <factor>
-
-<primary> ::=
-     <integer-literal> | <string-literal> | <char-literal> | <identifier>
-\alt "?" <s-expression>
-\alt "(" <expression> ")"
-\end{grammar}
-
-The arithmetic expression syntax is simple and standard; there are currently
-no bitwise, logical, or comparison operators.
-
-A @<primary> expression may be a literal or an identifier.  Note that
-identifiers stand for themselves: they \emph{do not} denote values.  For more
-fancy expressions, the syntax
-\begin{quote}
-  @"?" @<s-expression>
-\end{quote}
-causes the @<s-expression> to be evaluated using the Lisp \textsf{eval}
-function.
-%%% FIXME crossref to extension docs
-
-\subsection{C types} \label{sec:syntax.c-types}
-
-Sod's syntax for C types closely mirrors the standard C syntax.  A C type has
-two parts: a sequence of @<declaration-specifier>s and a @<declarator>.  In
-Sod, a type must contain at least one @<declaration-specifier> (i.e.,
-`implicit @"int"' is forbidden), and storage-class specifiers are not
-recognized.
-
-\subsubsection{Declaration specifiers} \label{sec:syntax.c-types.declspec}
-
-\begin{grammar}
-<declaration-specifier> ::= <type-name>
-\alt "struct" <identifier> | "union" <identifier> | "enum" <identifier>
-\alt "void" | "char" | "int" | "float" | "double"
-\alt "short" | "long"
-\alt "signed" | "unsigned"
-\alt <qualifier>
-
-<qualifier> ::= "const" | "volatile" | "restrict"
-
-<type-name> ::= <identifier>
-\end{grammar}
-
-A @<type-name> is an identifier which has been declared as being a type name,
-using the @"typename" or @"class" definitions.
-
-Declaration specifiers may appear in any order.  However, not all
-combinations are permitted.  A declaration specifier must consist of zero or
-more @<qualifiers>, and one of the following, up to reordering.
-\begin{itemize}
-\item @<type-name>
-\item @"struct" @<identifier>, @"union" @<identifier>, @"enum" @<identifier>
-\item @"void"
-\item @"char", @"unsigned char", @"signed char"
-\item @"short", @"unsigned short", @"signed short"
-\item @"short int", @"unsigned short int", @"signed short int"
-\item @"int", @"unsigned int", @"signed int", @"unsigned", @"signed"
-\item @"long", @"unsigned long", @"signed long"
-\item @"long int", @"unsigned long int", @"signed long int"
-\item @"long long", @"unsigned long long", @"signed long long"
-\item @"long long int", @"unsigned long long int", @"signed long long int"
-\item @"float", @"double", @"long double"
-\end{itemize}
-All of these have their usual C meanings.
-
-\subsubsection{Declarators} \label{sec:syntax.c-types.declarator}
-
-\begin{grammar}
-<declarator>$[k]$ ::= @<pointer>^* <primary-declarator>$[k]$
-
-<primary-declarator>$[k]$ ::= $k$
-\alt "(" <primary-declarator>$[k]$ ")"
-\alt <primary-declarator>$[k]$ @<declarator-suffix>^*
-
-<pointer> ::= "*" @<qualifier>^*
-
-<declarator-suffix> ::= "[" <c-fragment> "]"
-\alt "(" <arguments> ")"
-
-<arguments> ::= $\epsilon$ | "..."
-\alt <argument-list> @["," "..."@]
-
-<argument> ::= @<declaration-specifier>^+ <argument-declarator>
-
-<argument-declarator> ::= <declarator>@[<identifier> @! $\epsilon$@]
-
-<simple-declarator> ::= <declarator>@[<identifier>@]
-
-<dotted-name> ::= <identifier> "." <identifier>
-
-<dotted-declarator> ::= <declarator>@[<dotted-name>@]
-\end{grammar}
-
-The declarator syntax is taken from C, but with some differences.
-\begin{itemize}
-\item Array dimensions are uninterpreted @<c-fragments>, terminated by a
-  closing square bracket.  This allows array dimensions to contain arbitrary
-  constant expressions.
-\item A declarator may have either a single @<identifier> at its centre or a
-  pair of @<identifier>s separated by a @`.'; this is used to refer to
-  slots or messages defined in superclasses.
-\end{itemize}
-The remaining differences are (I hope) a matter of presentation rather than
-substance.
-
-\subsection{Defining classes} \label{sec:syntax.class}
-
-\begin{grammar}
-<class-definition> ::= <class-forward-declaration>
-\alt <full-class-definition>
-\end{grammar}
-
-\subsubsection{Forward declarations} \label{sec:class.class.forward}
-
-\begin{grammar}
-<class-forward-declaration> ::= "class" <identifier> ";"
-\end{grammar}
-
-A @<class-forward-declaration> informs Sod that an @<identifier> will be used
-to name a class which is currently undefined.  Forward declarations are
-necessary in order to resolve certain kinds of circularity.  For example,
-\begin{listing}
-class Sub;
-
-class Super : SodObject {
-  Sub *sub;
-};
-
-class Sub : Super {
-  /* ... */
-};
-\end{listing}
-
-\subsubsection{Full class definitions} \label{sec:class.class.full}
-
-\begin{grammar}
-<full-class-definition> ::=
-  @[<properties>@]
-  "class" <identifier> ":" <identifier-list>
-  "{" @<class-item>^* "}"
-
-<class-item> ::= <slot-item> ";"
-\alt <message-item>
-\alt <method-item>
-\alt  <initializer-item> ";"
-\end{grammar}
-
-A full class definition provides a complete description of a class.
-
-The first @<identifier> gives the name of the class.  It is an error to
-give the name of an existing class (other than a forward-referenced class),
-or an existing type name.  It is conventional to give classes `MixedCase'
-names, to distinguish them from other kinds of identifiers.
-
-The @<identifier-list> names the direct superclasses for the new class.  It
-is an error if any of these @<identifier>s does not name a defined class.
-
-The @<properties> provide additional information.  The standard class
-properties are as follows.
-\begin{description}
-\item[@"lisp_class"] The name of the Lisp class to use within the translator
-  to represent this class.  The property value must be an identifier; the
-  default is @"sod_class".  Extensions may define classes with additional
-  behaviour, and may recognize additional class properties.
-\item[@"metaclass"] The name of the Sod metaclass for this class.  In the
-  generated code, a class is itself an instance of another class -- its
-  \emph{metaclass}.  The metaclass defines which slots the class will have,
-  which messages it will respond to, and what its behaviour will be when it
-  receives them.  The property value must be an identifier naming a defined
-  subclass of @"SodClass".  The default metaclass is @"SodClass".
-  %%% FIXME xref to theory
-\item[@"nick"] A nickname for the class, to be used to distinguish it from
-  other classes in various limited contexts.  The property value must be an
-  identifier; the default is constructed by forcing the class name to
-  lower-case.
-\end{description}
-
-The class body consists of a sequence of @<class-item>s enclosed in braces.
-These items are discussed on the following sections.
-
-\subsubsection{Slot items} \label{sec:sntax.class.slot}
-
-\begin{grammar}
-<slot-item> ::=
-  @[<properties>@]
-  @<declaration-specifier>^+ <init-declarator-list>
-
-<init-declarator> ::= <declarator> @["=" <initializer>@]
-\end{grammar}
-
-A @<slot-item> defines one or more slots.  All instances of the class and any
-subclass will contain these slot, with the names and types given by the
-@<declaration-specifiers> and the @<declarators>.  Slot declarators may not
-contain qualified identifiers.
-
-It is not possible to declare a slot with function type: such an item is
-interpreted as being a @<message-item> or @<method-item>.  Pointers to
-functions are fine.
-
-An @<initializer>, if present, is treated as if a separate
-@<initializer-item> containing the slot name and initializer were present.
-For example,
-\begin{listing}
-[nick = eg]
-class Example : Super {
-  int foo = 17;
-};
-\end{listing}
-means the same as
-\begin{listing}
-[nick = eg]
-class Example : Super {
-  int foo;
-  eg.foo = 17;
-};
-\end{listing}
-
-\subsubsection{Initializer items} \label{sec:syntax.class.init}
-
-\begin{grammar}
-<initializer-item> ::= @["class"@] <slot-initializer-list>
-
-<slot-initializer> ::= <qualified-identifier> "=" <initializer>
-
-<initializer> :: "{" <c-fragment> "}" | <c-fragment>
-\end{grammar}
-
-An @<initializer-item> provides an initial value for one or more slots.  If
-prefixed by @"class", then the initial values are for class slots (i.e.,
-slots of the class object itself); otherwise they are for instance slots.
-
-The first component of the @<qualified-identifier> must be the nickname of
-one of the class's superclasses (including itself); the second must be the
-name of a slot defined in that superclass.
-
-The initializer has one of two forms.
-\begin{itemize}
-\item A @<c-fragment> enclosed in braces denotes an aggregate initializer.
-  This is suitable for initializing structure, union or array slots.
-\item A @<c-fragment> \emph{not} beginning with an open brace is a `bare'
-  initializer, and continues until the next @`,' or @`;' which is not within
-  nested brackets.  Bare initializers are suitable for initializing scalar
-  slots, such as pointers or integers, and strings.
-\end{itemize}
-
-\subsubsection{Message items} \label{sec:syntax.class.message}
-
-\begin{grammar}
-<message-item> ::=
-  @[<properties>@]
-  @<declaration-specifier>^+ <declarator> @[<method-body>@]
-\end{grammar}
-
-\subsubsection{Method items} \label{sec:syntax.class.method}
-
-\begin{grammar}
-<method-item> ::=
-  @[<properties>@]
-  @<declaration-specifier>^+ <declarator> <method-body>
-
-<method-body> ::= "{" <c-fragment> "}" | "extern" ";"
-\end{grammar}
+\mainmatter
  
  %%%--------------------------------------------------------------------------
-\section{Class objects}
-
-\begin{listing}
-typedef struct SodClass__ichain_obj SodClass;
-
-struct sod_chain {
-  size_t n_classes;                     /* Number of classes in chain */
-  const SodClass *const *classes;       /* Vector of classes, head first */
-  size_t off_ichain;                    /* Offset of ichain from instance base */
-  const struct sod_vtable *vt;          /* Vtable pointer for chain */
-  size_t ichainsz;                      /* Size of the ichain structure */
-};
-
-struct sod_vtable {
-  SodClass *_class;                     /* Pointer to instance's class */
-  size_t _base;                         /* Offset to instance base */
-};
-
-struct SodClass__islots {
-
-  /* Basic information */
-  const char *name;                     /* The class's name as a string */
-  const char *nick;                     /* The nickname as a string */
-
-  /* Instance allocation and initialization */
-  size_t instsz;                        /* Instance layout size in bytes */
-  void *(*imprint)(void *);             /* Stamp instance with vtable ptrs */
-  void *(*init)(void *);                /* Initialize instance */
-
-  /* Superclass structure */
-  size_t n_supers;                      /* Number of direct superclasses */
-  const SodClass *const *supers;        /* Vector of direct superclasses */
-  size_t n_cpl;                         /* Length of class precedence list */
-  const SodClass *const *cpl;           /* Vector for class precedence list */
+\part{Tutorial} \label{p:tut}
  
-  /* Chain structure */
-  const SodClass *link;                 /* Link to next class in chain */
-  const SodClass *head;                 /* Pointer to head of chain */
-  size_t level;                         /* Index of class in its chain */
-  size_t n_chains;                      /* Number of superclass chains */
-  const sod_chain *chains;              /* Vector of chain structures */
-
-  /* Layout */
-  size_t off_islots;                    /* Offset of islots from ichain base */
-  size_t islotsz;                       /* Size of instance slots */
-};
-
-struct SodClass__ichain_obj {
-  const SodClass__vt_obj *_vt;
-  struct SodClass__islots cls;
-};
-
-struct sod_instance {
-  struct sod_vtable *_vt;
-};
-\end{listing}
-
-\begin{listing}
-void *sod_convert(const SodClass *cls, const void *obj)
-{
-  const struct sod_instance *inst = obj;
-  const SodClass *real = inst->_vt->_cls;
-  const struct sod_chain *chain;
-  size_t i, index;
-
-  for (i = 0; i < real->cls.n_chains; i++) {
-    chain = &real->cls.chains[i];
-    if (chain->classes[0] == cls->cls.head) {
-      index = cls->cls.index;
-      if (index < chain->n_classes && chain->classes[index] == cls)
-        return ((char *)cls - inst->_vt._base + chain->off_ichain);
-      else
-        return (0);
-    }
-  }
-  return (0);
-}
-\end{listing}
+\include{tutorial}
  
  %%%--------------------------------------------------------------------------
-\section{Classes}
-\label{sec:class}
-
-\subsection{Classes and superclasses} \label{sec:class.defs}
-
-A @<full-class-definition> must list one or more existing classes to be the
-\emph{direct superclasses} for the new class being defined.  We make the
-following definitions.
-\begin{itemize}
-\item The \emph{superclasses} of a class consist of the class itself together
-  with the superclasses of its direct superclasses.
-\item The \emph{proper superclasses} of a class are its superclasses other
-  than itself.
-\item If $C$ is a (proper) superclass of $D$ then $D$ is a (\emph{proper})
-  \emph{subclass} of $C$.
-\end{itemize}
-The predefined class @|SodObject| has no direct superclasses; it is unique in
-this respect.  All classes are subclasses of @|SodObject|.
-
-\subsection{The class precedence list} \label{sec:class.cpl}
-
-Let $C$ be a class.  The superclasses of $C$ form a directed graph, with an
-edge from each class to each of its direct superclasses.  This is the
-\emph{superclass graph of $C$}.
-
-In order to resolve inheritance of items, we define a \emph{class precedence
-  list} (or CPL) for each class, which imposes a total order on that class's
-superclasses.  The default algorithm for computing the CPL is the \emph{C3}
-algorithm \cite{fixme-c3}, though extensions may implement other algorithms.
-
-The default algorithm works as follows.  Let $C$ be the class whose CPL we
-are to compute.  Let $X$ and $Y$ be two of $C$'s superclasses.
-\begin{itemize}
-\item $C$ must appear first in the CPL.
-\item If $X$ appears before $Y$ in the CPL of one of $C$'s direct
-  superclasses, then $X$ appears before $Y$ in the $C$'s CPL.
-\item If the above rules don't suffice to order $X$ and $Y$, then whichever
-  of $X$ and $Y$ has a subclass which appears further left in the list of
-  $C$'s direct superclasses will appear earlier in the CPL.
-\end{itemize}
-This last rule is sufficient to disambiguate because if both $X$ and $Y$ are
-superclasses of the same direct superclass of $C$ then that direct
-superclass's CPL will order $X$ and $Y$.
-
-We say that \emph{$X$ is more specific than $Y$ as a superclass of $C$} if
-$X$ is earlier than $Y$ in $C$'s class precedence list.  If $C$ is clear from
-context then we omit it, saying simply that $X$ is more specific than $Y$.
-
-\subsection{Instances and metaclasses} \label{sec:class.meta}
-
-A class defines the structure and behaviour of its \emph{instances}: run-time
-objects created (possibly) dynamically.  An instance is an instance of only
-one class, though structurally it may be used in place of an instance of any
-of that class's superclasses.  It is possible, with care, to change the class
-of an instance at run-time.
+\part{Reference} \label{p:ref}
  
-Classes are themselves represented as instances -- called \emph{class
-  objects} -- in the running program.  Being instances, they have a class,
-called the \emph{metaclass}.  The metaclass defines the structure and
-behaviour of the class object.
-
-The predefined class @|SodClass| is the default metaclass for new classes.
-@|SodClass| has @|SodObject| as its only direct superclass.  @|SodClass| is
-its own metaclass.
-
-To make matters more complicated, Sod has \emph{two} distinct metalevels: as
-well as the runtime metalevel, as discussed above, there's a compile-time
-metalevel hosted in the Sod translator.  Since Sod is written in Common Lisp,
-a Sod class's compile-time metaclass is a CLOS class.  The usual compile-time
-metaclass is @|sod-class|.  The compile-time metalevel is the subject of
-\xref{ch:api}.
-
-\subsection{Items and inheritance} \label{sec:class.inherit}
-
-A class definition also declares \emph{slots}, \emph{messages},
-\emph{initializers} and \emph{methods} -- collectively referred to as
-\emph{items}.  In addition to the items declared in the class definition --
-the class's \emph{direct items} -- a class also \emph{inherits} items from
-its superclasses.
-
-The precise rules for item inheritance vary according to the kinds of items
-involved.
-
-Some object systems have a notion of `repeated inheritance': if there are
-multiple paths in the superclass graph from a class to one of its
-superclasses then items defined in that superclass may appear duplicated in
-the subclass.  Sod does not have this notion.
-
-\subsubsection{Slots} \label{sec:class.inherit.slots}
-A \emph{slot} is a unit of state.  In other object systems, slots may be
-called `fields', `member variables', or `instance variables'.
-
-A slot has a \emph{name} and a \emph{type}.  The name serves only to
-distinguish the slot from other direct slots defined by the same class.  A
-class inherits all of its proper superclasses' slots.  Slots inherited from
-superclasses do not conflict with each other or with direct slots, even if
-they have the same names.
-
-At run-time, each instance of the class holds a separate value for each slot,
-whether direct or inherited.  Changing the value of an instance's slot
-doesn't affect other instances.
-
-\subsubsection{Initializers} \label{sec:class.inherit.init}
-Mumble.
-
-\subsubsection{Messages} \label{sec:class.inherit.messages}
-A \emph{message} is the stimulus for behaviour.  In Sod, a class must define,
-statically, the name and format of the messages it is able to receive and the
-values it will return in reply.  In this respect, a message is similar to
-`abstract member functions' or `interface member functions' in other object
-systems.
-
-Like slots, a message has a \emph{name} and a \emph{type}.  Again, the name
-serves only to distinguish the message from other direct messages defined by
-the same class.  Messages inherited from superclasses do not conflict with
-each other or with direct messages, even if they have the same name.
-
-At run-time, one sends a message to an instance by invoking a function
-obtained from the instance's \emph{vtable}: \xref{sec:fixme-vtable}.
-
-\subsubsection{Methods} \label{sec:class.inherit.methods}
-A \emph{method} is a unit of behaviour.  In other object systems, methods may
-be called `member functions'.
-
-A method is associated with a message.  When a message is received by an
-instance, all of the methods associated with that message on the instance's
-class or any of its superclasses are \emph{applicable}.  The details of how
-the applicable methods are invoked are described fully in
-\xref{sec:fixme-method-combination}.
-
-\subsection{Chains and instance layout} \label{sec:class.layout}
-
-C is a rather low-level language, and in particular it exposes details of the
-way data is laid out in memory.  Since an instance of a class~$C$ should be
-(at least in principle) usable anywhere an instance of some superclass $B
-\succeq C$ is expected, this implies that an instance of the subclass $C$
-needs to contain within it a complete instance of each superclass $B$, laid
-out according to the rules of instances of $B$, so that if we have (the
-address of) an instance of $C$, we can easily construct a pointer to a thing
-which looks like an instance of $B$ contained within it.
-
-Specifically, the information we need to retain for an instance of a
-class~$C$ is:
-\begin{itemize}
-\item the values of each of the slots defined by $C$, including those defined
-  by superclasses;
-\item information which will let us convert a pointer to $C$ into a pointer
-  to any superclass $B \succeq C$;
-\item information which will let us call the appropriate effective method for
-  each message defined by $C$, including those defined by superclasses; and
-\item some additional meta-level information, such as how to find the class
-  object for $C$ given (the address of) one of its instances.
-\end{itemize}
-
-Observe that, while each distinct instance must clearly have its own storage
-for slots, all instances of $C$ can share a single copy of the remaining
-information.  The individual instance only needs to keep a pointer to this
-shared table, which, inspired by the similar structure in many \Cplusplus\
-ABIs, are called a \emph{vtable}.
-
-The easiest approach would be to decide that instances of $C$ are exactly
-like instances of $B$, only with extra space at the end for the extra slots
-which $C$ defines over and above those already existing in $B$.  Conversion
-is then trivial: a pointer to an instance of $C$ can be converted to a
-pointer to an instance of some superclass $B$ simply by casting.  Even though
-the root class @|SodObject| doesn't have any slots at all, its instances will
-still need a vtable so that you can find its class object: the address of the
-vtable therefore needs to be at the very start of the instance structure.
-Again, a vtable for a superclass would have a vtable for each of its
-superclasses as a prefix, with new items added afterwards.
-
-This appealing approach works well for an object system which only permits
-single inheritance of both state and behaviour.  Alas, it breaks down when
-multiple inheritance is allowed: $C$ can be a subclass of both $B$ and $B'$,
-even though $B$ is not a subclass of $B'$, nor \emph{vice versa}; so, in
-general, $B$'s instance structure will not be a prefix of $B'$'s, nor will
-$B'$'s be a prefix of $B$'s, and therefore $C$ cannot have both $B$ and $B'$
-as a prefix.
-
-A (non-root) class may -- though need not -- have a distinguished \emph{link}
-superclass, which need not be a direct superclass.  Furthermore, each
-class~$C$ must satisfy the \emph{chain condition}: for any superclass $A$ of
-$C$, there can be at most one other superclass of $C$ whose link superclass
-is $A$.\footnote{%
-  That is, it's permitted for two classes $B$ and $B'$ to have the same link
-  superclass $A$, but $B$ and $B'$ can't then both be superclasses of the
-  same class $C$.} %
-Therefore, the links partition the superclasses of~$C$ into nice linear
-\emph{chains}, such that each superclass is a member of exactly one chain.
-If a class~$B$ has a link superclass~$A$, then $B$'s \emph{level} is one more
-than that of $A$; otherwise $B$ is called a \emph{chain head} and its level
-is zero.  If the classes in a chain are written in a list, chain head first,
-then the level of each class gives its index in the list.
-
-Chains therefore allow us to recover some of the linearity properties which
-made layout simple in the case of single inheritance.  The instance structure
-for a class $C$ contains a substructure for each of $C$'s superclass chains;
-a pointer to an object of class $C$ actually points to the substructure for
-the chain containing $C$.  The order of these substructures is unimportant
-for now.\footnote{%
-  The chains appear in the order in which their most specific classes appear
-  in $C$'s class precedence list.  This guarantees that the chain containing
-  $C$ itself appears first, so that a pointer to $C$'s instance structure is
-  actually a pointer to $C$'s chain substructure.  Apart from that, it's a
-  simple, stable, but basically arbitrary choice which can't be changed
-  without breaking the ABI.} %
-The substructure for each chain begins with a pointer to a vtable, followed
-by a structure for each superclass in the chain containing the slots defined
-by that superclass, with the chain head (least specific class) first.
-
-Suppose we have a pointer to (static) type $C$, and want to convert it into a
-pointer to some superclass $B$ of $C$ -- an \emph{upcast}.\footnote{%
-  In the more general case, we have a pointer to static type $C$, which
-  actually points to an object of some subclass $D$ of $C$, and want to
-  convert it into a pointer to type $B$.  Such a conversion is called a
-  \emph{downcast} if $B$ is a subclass of $C$, or a \emph{cross-cast}
-  otherwise.  Downcasts and cross-casts require complicated run-time
-  checking, and can will fail unless $B$ is a superclass of $D$.} %
-If $B$ is in the same chain as $C$ -- an \emph{in-chain upcast} -- then the
-pointer value is already correct and it's only necessary to cast it
-appropriately.  Otherwise -- a \emph{cross-chain upcast} -- the pointer needs
-to be adjusted to point to a different chain substructure.  Since the lengths
-and relative positions of the chain substructures vary between classes, the
-adjustments are stored in the vtable.  Cross-chain upcasts are therefore a
-bit slower than in-chain upcasts.
-
-Each chain has its own separate vtable, because much of the metadata stored
-in the vtable is specific to a particular chain.  For example:
-\begin{itemize}
-\item offsets to other chains' substructures will vary depending on which
-  chain we start from; and
-\item entry points to methods {
+\include{concepts}
+\include{cmdline}
+\include{syntax}
+\include{structures}
+\include{runtime}
  
  %%%--------------------------------------------------------------------------
-\chapter{The Lisp programming interface} \label{ch:api}
-
-%% output for `h' files
-%%
-%% prologue
-%% guard start
-%% typedefs start
-%% typedefs
-%% typedefs end
-%% includes start
-%% includes
-%% includes end
-%% classes start
-%% CLASS banner
-%% CLASS islots start
-%% CLASS islots slots
-%% CLASS islots end
-%% CLASS vtmsgs start
-%% CLASS vtmsgs CLASS start
-%% CLASS vtmsgs CLASS slots
-%% CLASS vtmsgs CLASS end
-%% CLASS vtmsgs end
-%% CLASS vtables start
-%% CLASS vtables CHAIN-HEAD start
-%% CLASS vtables CHAIN-HEAD slots
-%% CLASS vtables CHAIN-HEAD end
-%% CLASS vtables end
-%% CLASS vtable-externs
-%% CLASS vtable-externs-after
-%% CLASS methods start
-%% CLASS methods
-%% CLASS methods end
-%% CLASS ichains start
-%% CLASS ichains CHAIN-HEAD start
-%% CLASS ichains CHAIN-HEAD slots
-%% CLASS ichains CHAIN-HEAD end
-%% CLASS ichains end
-%% CLASS ilayout start
-%% CLASS ilayout slots
-%% CLASS ilayout end
-%% CLASS conversions
-%% CLASS object
-%% classes end
-%% guard end
-%% epilogue
-
-%% output for `c' files
-%%
-%% prologue
-%% includes start
-%% includes
-%% includes end
-%% classes start
-%% CLASS banner
-%% CLASS direct-methods start
-%% CLASS direct-methods METHOD start
-%% CLASS direct-methods METHOD body
-%% CLASS direct-methods METHOD end
-%% CLASS direct-methods end
-%% CLASS effective-methods
-%% CLASS vtables start
-%% CLASS vtables CHAIN-HEAD start
-%% CLASS vtables CHAIN-HEAD class-pointer METACLASS
-%% CLASS vtables CHAIN-HEAD base-offset
-%% CLASS vtables CHAIN-HEAD chain-offset TARGET-HEAD
-%% CLASS vtables CHAIN-HEAD vtmsgs CLASS start
-%% CLASS vtables CHAIN-HEAD vtmsgs CLASS slots
-%% CLASS vtables CHAIN-HEAD vtmsgs CLASS end
-%% CLASS vtables CHAIN-HEAD end
-%% CLASS vtables end
-%% CLASS object prepare
-%% CLASS object start
-%% CLASS object CHAIN-HEAD ichain start
-%% CLASS object SUPER slots start
-%% CLASS object SUPER slots
-%% CLASS object SUPER vtable
-%% CLASS object SUPER slots end
-%% CLASS object CHAIN-HEAD ichain end
-%% CLASS object end
-%% classes end
-%% epilogue
+\part{Lisp interface} \label{p:lisp}
+
+\include{lispintro}
+%% package.lisp
+%% sod.asd.in
+%% sod-frontend.asd.in
+%% auto.lisp.in
+
+\include{misc}
+%% pset-impl.lisp
+%% pset-parse.lisp
+%% pset-proto.lisp
+%% lexer-bits.lisp
+%% lexer-impl.lisp
+%% lexer-proto.lisp
+%% utilities.lisp
+%% optparse.lisp
+%% frontend.lisp
+%% final.lisp
+
+\include{parsing}
+%% package.lisp
+%% floc-impl.lisp
+%% floc-proto.lisp
+%% streams-impl.lisp
+%% streams-proto.lisp
+%% scanner-context-impl.lisp
+%% scanner-impl.lisp
+%% scanner-proto.lisp
+%% scanner-token-impl.lisp
+%% scanner-charbuf-impl.lisp
+%% parser-impl.lisp
+%% parser-proto.lisp
+%% parser-expr-impl.lisp
+%% parser-expr-proto.lisp
+
+\include{clang}
+%% c-types-class-impl.lisp
+%% c-types-impl.lisp
+%% c-types-parse.lisp
+%% c-types-proto.lisp
+%% codegen-impl.lisp
+%% codegen-proto.lisp
+%% fragment-parse.lisp
+
+\include{meta}
+%% classes.lisp
+%% class-utilities.lisp
+%% class-make-impl.lisp
+%% class-make-proto.lisp
+%% class-finalize-impl.lisp
+%% class-finalize-proto.lisp
+
+\include{layout}
+%% class-layout-impl.lisp
+%% class-layout-proto.lisp
+%% method-impl.lisp
+%% method-proto.lisp
+%% method-aggregate.lisp
+
+\include{module}
+%% module-impl.lisp
+%% module-parse.lisp
+%% module-proto.lisp
+%% builtin.lisp
+
+\include{output}
+%% output-impl.lisp
+%% output-proto.lisp
+%% class-output.lisp
+%% module-output.lisp
  
  %%%--------------------------------------------------------------------------
+\part{Appendices}
+\appendix
  
-\include{sod-backg}
-\include{sod-protocol}
+\include{cutting-room-floor}
  
+%%%----- That's all, folks --------------------------------------------------
  \end{document}
-\f
+
  %%% Local variables:
  %%% mode: LaTeX
  %%% TeX-PDF-mode: t
diff --git a/doc/sod.toc b/doc/sod.toc

new file mode 100644 (file)

index 0000000..55d77c6
--- /dev/null
+++ b/doc/sod.toc
@@ -0,0 +1,96 @@
+\contentsline {chapter}{Contents}{i}{chapter*.1}
+\contentsline {part}{I\hspace {1em}Tutorial}{1}{part.1}
+\contentsline {chapter}{\numberline {1}Tutorial}{3}{chapter.1}
+\contentsline {section}{\numberline {1.1}Introduction}{3}{section.1.1}
+\contentsline {subsection}{\numberline {1.1.1}Building programs with Sod}{4}{subsection.1.1.1}
+\contentsline {section}{\numberline {1.2}A traditional trivial introduction}{4}{section.1.2}
+\contentsline {part}{II\hspace {1em}Reference}{7}{part.2}
+\contentsline {chapter}{\numberline {2}Concepts}{9}{chapter.2}
+\contentsline {section}{\numberline {2.1}Classes and slots}{9}{section.2.1}
+\contentsline {section}{\numberline {2.2}Messages and methods}{9}{section.2.2}
+\contentsline {section}{\numberline {2.3}Metaclasses}{9}{section.2.3}
+\contentsline {section}{\numberline {2.4}Modules}{9}{section.2.4}
+\contentsline {chapter}{\numberline {3}Module syntax}{11}{chapter.3}
+\contentsline {subsection}{\numberline {3.0.1}Lexical syntax}{11}{subsection.3.0.1}
+\contentsline {subsubsection}{Identifiers}{11}{section*.2}
+\contentsline {subsubsection}{String and character literals}{12}{section*.3}
+\contentsline {subsubsection}{Integer literals}{12}{section*.4}
+\contentsline {subsubsection}{Punctuation}{13}{section*.5}
+\contentsline {subsubsection}{Comments}{13}{section*.6}
+\contentsline {subsection}{\numberline {3.0.2}Special nonterminals}{13}{subsection.3.0.2}
+\contentsline {subsubsection}{S-expressions}{13}{section*.7}
+\contentsline {subsubsection}{C fragments}{13}{section*.8}
+\contentsline {subsection}{\numberline {3.0.3}Module syntax}{14}{subsection.3.0.3}
+\contentsline {subsection}{\numberline {3.0.4}Simple definitions}{14}{subsection.3.0.4}
+\contentsline {subsubsection}{Importing modules}{14}{section*.9}
+\contentsline {subsubsection}{Loading extensions}{14}{section*.10}
+\contentsline {subsubsection}{Lisp escapes}{15}{section*.11}
+\contentsline {subsubsection}{Declaring type names}{15}{section*.12}
+\contentsline {subsection}{\numberline {3.0.5}Literal code}{15}{subsection.3.0.5}
+\contentsline {subsection}{\numberline {3.0.6}Property sets}{16}{subsection.3.0.6}
+\contentsline {subsubsection}{The expression evaluator}{16}{section*.13}
+\contentsline {subsection}{\numberline {3.0.7}C types}{16}{subsection.3.0.7}
+\contentsline {subsubsection}{Declaration specifiers}{17}{section*.14}
+\contentsline {subsubsection}{Declarators}{17}{section*.15}
+\contentsline {subsection}{\numberline {3.0.8}Defining classes}{18}{subsection.3.0.8}
+\contentsline {subsubsection}{Forward declarations}{18}{section*.16}
+\contentsline {subsubsection}{Full class definitions}{18}{section*.17}
+\contentsline {subsubsection}{Slot items}{19}{section*.18}
+\contentsline {subsubsection}{Initializer items}{20}{section*.19}
+\contentsline {subsubsection}{Message items}{20}{section*.20}
+\contentsline {subsubsection}{Method items}{20}{section*.21}
+\contentsline {part}{III\hspace {1em}Lisp interface}{21}{part.3}
+\contentsline {chapter}{\numberline {4}Protocol overview}{23}{chapter.4}
+\contentsline {section}{\numberline {4.1}A tour through the translator}{23}{section.4.1}
+\contentsline {section}{\numberline {4.2}Specification conventions}{23}{section.4.2}
+\contentsline {subsection}{\numberline {4.2.1}Format of the entries}{24}{subsection.4.2.1}
+\contentsline {chapter}{\numberline {5}Parsing}{27}{chapter.5}
+\contentsline {section}{\numberline {5.1}The parser protocol}{27}{section.5.1}
+\contentsline {section}{\numberline {5.2}File locations}{27}{section.5.2}
+\contentsline {section}{\numberline {5.3}Scanners}{27}{section.5.3}
+\contentsline {subsection}{\numberline {5.3.1}Basic scanner protocol}{27}{subsection.5.3.1}
+\contentsline {subsection}{\numberline {5.3.2}Place-capture scanner protocol}{28}{subsection.5.3.2}
+\contentsline {subsection}{\numberline {5.3.3}Scanner file-location protocol}{29}{subsection.5.3.3}
+\contentsline {subsection}{\numberline {5.3.4}Character scanners}{29}{subsection.5.3.4}
+\contentsline {subsubsection}{Stream access to character scanners}{30}{section*.22}
+\contentsline {subsection}{\numberline {5.3.5}String scanners}{31}{subsection.5.3.5}
+\contentsline {subsection}{\numberline {5.3.6}Character buffer scanners}{31}{subsection.5.3.6}
+\contentsline {subsection}{\numberline {5.3.7}Token scanners}{32}{subsection.5.3.7}
+\contentsline {subsection}{\numberline {5.3.8}List scanners}{33}{subsection.5.3.8}
+\contentsline {section}{\numberline {5.4}Parsing macros}{33}{section.5.4}
+\contentsline {section}{\numberline {5.5}Lexical analyser}{33}{section.5.5}
+\contentsline {chapter}{\numberline {6}C language utilities}{35}{chapter.6}
+\contentsline {section}{\numberline {6.1}C type representation}{35}{section.6.1}
+\contentsline {subsection}{\numberline {6.1.1}Overview}{35}{subsection.6.1.1}
+\contentsline {subsubsection}{Constructing C type objects}{35}{section*.23}
+\contentsline {subsubsection}{Printing}{35}{section*.24}
+\contentsline {subsection}{\numberline {6.1.2}The C type root class}{35}{subsection.6.1.2}
+\contentsline {subsection}{\numberline {6.1.3}C type S-expression notation}{36}{subsection.6.1.3}
+\contentsline {subsection}{\numberline {6.1.4}Comparing C types}{37}{subsection.6.1.4}
+\contentsline {subsection}{\numberline {6.1.5}Outputting C types}{38}{subsection.6.1.5}
+\contentsline {subsection}{\numberline {6.1.6}Type qualifiers and qualifiable types}{39}{subsection.6.1.6}
+\contentsline {subsection}{\numberline {6.1.7}Leaf types}{40}{subsection.6.1.7}
+\contentsline {subsection}{\numberline {6.1.8}Compound C types}{43}{subsection.6.1.8}
+\contentsline {subsection}{\numberline {6.1.9}Pointer types}{43}{subsection.6.1.9}
+\contentsline {subsection}{\numberline {6.1.10}Array types}{43}{subsection.6.1.10}
+\contentsline {subsection}{\numberline {6.1.11}Function types}{44}{subsection.6.1.11}
+\contentsline {subsection}{\numberline {6.1.12}Parsing C types}{45}{subsection.6.1.12}
+\contentsline {section}{\numberline {6.2}Generating C code}{45}{section.6.2}
+\contentsline {chapter}{\numberline {7}The output system}{47}{chapter.7}
+\contentsline {part}{IV\hspace {1em}Appendices}{49}{part.4}
+\contentsline {chapter}{\numberline {A}Cutting-room floor}{51}{appendix.A}
+\contentsline {section}{\numberline {A.1}Generated names}{51}{section.A.1}
+\contentsline {subsection}{\numberline {A.1.1}Instance layout}{51}{subsection.A.1.1}
+\contentsline {section}{\numberline {A.2}Class objects}{51}{section.A.2}
+\contentsline {section}{\numberline {A.3}Classes}{53}{section.A.3}
+\contentsline {subsection}{\numberline {A.3.1}Classes and superclasses}{53}{subsection.A.3.1}
+\contentsline {subsection}{\numberline {A.3.2}The class precedence list}{53}{subsection.A.3.2}
+\contentsline {subsection}{\numberline {A.3.3}Instances and metaclasses}{53}{subsection.A.3.3}
+\contentsline {subsection}{\numberline {A.3.4}Items and inheritance}{54}{subsection.A.3.4}
+\contentsline {subsubsection}{Slots}{54}{section*.25}
+\contentsline {subsubsection}{Initializers}{54}{section*.26}
+\contentsline {subsubsection}{Messages}{54}{section*.27}
+\contentsline {subsubsection}{Methods}{55}{section*.28}
+\contentsline {subsection}{\numberline {A.3.5}Chains and instance layout}{55}{subsection.A.3.5}
+\contentsline {section}{\numberline {A.4}Superclass linearization}{56}{section.A.4}
+\contentsline {section}{\numberline {A.5}Invariance, covariance, contravariance}{57}{section.A.5}
diff --git a/doc/syntax.tex b/doc/syntax.tex

new file mode 100644 (file)

index 0000000..de85ce8
--- /dev/null
+++ b/doc/syntax.tex
@@ -0,0 +1,666 @@
+%%% -*-latex-*-
+%%%
+%%% Module syntax
+%%%
+%%% (c) 2015 Straylight/Edgeware
+%%%
+
+%%%----- Licensing notice ---------------------------------------------------
+%%%
+%%% This file is part of the Sensble Object Design, an object system for C.
+%%%
+%%% SOD is free software; you can redistribute it and/or modify
+%%% it under the terms of the GNU General Public License as published by
+%%% the Free Software Foundation; either version 2 of the License, or
+%%% (at your option) any later version.
+%%%
+%%% SOD is distributed in the hope that it will be useful,
+%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
+%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+%%% GNU General Public License for more details.
+%%%
+%%% You should have received a copy of the GNU General Public License
+%%% along with SOD; if not, write to the Free Software Foundation,
+%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+
+\chapter{Module syntax} \label{ch:syntax}
+
+%%%--------------------------------------------------------------------------
+
+Fortunately, Sod is syntactically quite simple.  I've used a little slightly
+unusual notation in order to make the presentation easier to read.  For any
+nonterminal $x$:
+\begin{itemize}
+\item $\epsilon$ denotes the empty nonterminal:
+  \begin{quote}
+    $\epsilon$ ::=
+  \end{quote}
+\item @[$x$@] means an optional $x$:
+  \begin{quote}
+    \syntax{@[$x$@] ::= $\epsilon$ @! $x$}
+  \end{quote}
+\item $x^*$ means a sequence of zero or more $x$s:
+  \begin{quote}
+    \syntax{$x^*$ ::= $\epsilon$ @! $x^*$ $x$}
+  \end{quote}
+\item $x^+$ means a sequence of one or more $x$s:
+  \begin{quote}
+    \syntax{$x^+$ ::= $x$ $x^*$}
+  \end{quote}
+\item $x$@<-list> means a sequence of one or more $x$s separated
+  by commas:
+  \begin{quote}
+    \syntax{$x$<-list> ::= $x$ @! $x$<-list> "," $x$}
+  \end{quote}
+\end{itemize}
+
+\subsection{Lexical syntax}
+\label{sec:syntax.lex}
+
+Whitespace and comments are discarded.  The remaining characters are
+collected into tokens according to the following syntax.
+
+\begin{grammar}
+<token> ::= <identifier>
+\alt <string-literal>
+\alt <char-literal>
+\alt <integer-literal>
+\alt <punctuation>
+\end{grammar}
+
+This syntax is slightly ambiguous, and is disambiguated by the \emph{maximal
+munch} rule: at each stage we take the longest sequence of characters which
+could be a token.
+
+\subsubsection{Identifiers} \label{sec:syntax.lex.id}
+
+\begin{grammar}
+<identifier> ::= <id-start-char> @<id-body-char>^*
+
+<id-start-char> ::= <alpha-char> | "_"
+
+<id-body-char> ::= <id-start-char> @! <digit-char>
+
+<alpha-char> ::= "A" | "B" | \dots\ | "Z"
+\alt "a" | "b" | \dots\ | "z"
+\alt <extended-alpha-char>
+
+<digit-char> ::= "0" | <nonzero-digit-char>
+
+<nonzero-digit-char> ::= "1" | "2" $| \cdots |$ "9"
+\end{grammar}
+
+The precise definition of @<alpha-char> is left to the function
+\textsf{alpha-char-p} in the hosting Lisp system.  For portability,
+programmers are encouraged to limit themselves to the standard ASCII letters.
+
+There are no reserved words at the lexical level, but the higher-level syntax
+recognizes certain identifiers as \emph{keywords} in some contexts.  There is
+also an ambiguity (inherited from C) in the declaration syntax which is
+settled by distinguishing type names from other identifiers at a lexical
+level.
+
+\subsubsection{String and character literals} \label{sec:syntax.lex.string}
+
+\begin{grammar}
+<string-literal> ::= "\"" @<string-literal-char>^* "\""
+
+<char-literal> ::= "'" <char-literal-char> "'"
+
+<string-literal-char> ::= any character other than "\\" or "\""
+\alt "\\" <char>
+
+<char-literal-char> ::= any character other than "\\" or "'"
+\alt "\\" <char>
+
+<char> ::= any single character
+\end{grammar}
+
+The syntax for string and character literals differs from~C.  In particular,
+escape sequences such as @`\textbackslash n' are not recognized.  The use
+of string and character literals in Sod, outside of C~fragments, is limited,
+and the simple syntax seems adequate.  For the sake of future compatibility,
+the use of character sequences which resemble C escape sequences is
+discouraged.
+
+\subsubsection{Integer literals} \label{sec:syntax.lex.int}
+
+\begin{grammar}
+<integer-literal> ::= <decimal-integer>
+\alt <binary-integer>
+\alt <octal-integer>
+\alt <hex-integer>
+
+<decimal-integer> ::= <nonzero-digit-char> @<digit-char>^*
+
+<binary-integer> ::= "0" @("b"|"B"@) @<binary-digit-char>^+
+
+<binary-digit-char> ::= "0" | "1"
+
+<octal-integer> ::= "0" @["o"|"O"@] @<octal-digit-char>^+
+
+<octal-digit-char> ::= "0" | "1" $| \cdots |$ "7"
+
+<hex-integer> ::= "0" @("x"|"X"@) @<hex-digit-char>^+
+
+<hex-digit-char> ::= <digit-char>
+\alt "A" | "B" | "C" | "D" | "E" | "F"
+\alt "a" | "b" | "c" | "d" | "e" | "f"
+\end{grammar}
+
+Sod understands only integers, not floating-point numbers; its integer syntax
+goes slightly beyond C in allowing a @`0o' prefix for octal and @`0b' for
+binary.  However, length and signedness indicators are not permitted.
+
+\subsubsection{Punctuation} \label{sec:syntax.lex.punct}
+
+\begin{grammar}
+<punctuation> ::= any nonalphanumeric character other than "_", "\"" or "'"
+\end{grammar}
+
+\subsubsection{Comments} \label{sec:lex-comment}
+
+\begin{grammar}
+<comment> ::= <block-comment>
+\alt <line-comment>
+
+<block-comment> ::=
+  "/*"
+  @<not-star>^* @(@<star>^+ <not-star-or-slash> @<not-star>^*@)^*
+  @<star>^*
+  "*/"
+
+<star> ::= "*"
+
+<not-star> ::= any character other than "*"
+
+<not-star-or-slash> ::= any character other than "*" or  "/"
+
+<line-comment> ::= "//" @<not-newline>^* <newline>
+
+<newline> ::= a newline character
+
+<not-newline> ::= any character other than newline
+\end{grammar}
+
+Comments are exactly as in C99: both traditional block comments `\texttt{/*}
+\dots\ \texttt{*/}' and \Cplusplus-style `\texttt{//} \dots' comments are
+permitted and ignored.
+
+\subsection{Special nonterminals}
+\label{sec:special-nonterminals}
+
+Aside from the lexical syntax presented above (\xref{sec:lexical-syntax}),
+two special nonterminals occur in the module syntax.
+
+\subsubsection{S-expressions} \label{sec:syntax-sexp}
+
+\begin{grammar}
+<s-expression> ::= an S-expression, as parsed by the Lisp reader
+\end{grammar}
+
+When an S-expression is expected, the Sod parser simply calls the host Lisp
+system's \textsf{read} function.  Sod modules are permitted to modify the
+read table to extend the S-expression syntax.
+
+S-expressions are self-delimiting, so no end-marker is needed.
+
+\subsubsection{C fragments} \label{sec:syntax.lex.cfrag}
+
+\begin{grammar}
+<c-fragment> ::= a sequence of C tokens, with matching brackets
+\end{grammar}
+
+Sequences of C code are simply stored and written to the output unchanged
+during translation.  They are read using a simple scanner which nonetheless
+understands C comments and string and character literals.
+
+A C fragment is terminated by one of a small number of delimiter characters
+determined by the immediately surrounding context -- usually a closing brace
+or bracket.  The first such delimiter character which is not enclosed in
+brackets, braces or parenthesis ends the fragment.
+
+\subsection{Module syntax} \label{sec:syntax-module}
+
+\begin{grammar}
+<module> ::= @<definition>^*
+
+<definition> ::= <import-definition>
+\alt <load-definition>
+\alt <lisp-definition>
+\alt <code-definition>
+\alt <typename-definition>
+\alt <class-definition>
+\end{grammar}
+
+A module is the top-level syntactic item.  A module consists of a sequence of
+definitions.
+
+\subsection{Simple definitions} \label{sec:syntax.defs}
+
+\subsubsection{Importing modules} \label{sec:syntax.defs.import}
+
+\begin{grammar}
+<import-definition> ::= "import" <string> ";"
+\end{grammar}
+
+The module named @<string> is processed and its definitions made available.
+
+A search is made for a module source file as follows.
+\begin{itemize}
+\item The module name @<string> is converted into a filename by appending
+  @`.sod', if it has no extension already.\footnote{%
+    Technically, what happens is \textsf{(merge-pathnames name (make-pathname
+    :type "SOD" :case :common))}, so exactly what this means varies
+    according to the host system.} %
+\item The file is looked for relative to the directory containing the
+  importing module.
+\item If that fails, then the file is looked for in each directory on the
+  module search path in turn.
+\item If the file still isn't found, an error is reported and the import
+  fails.
+\end{itemize}
+At this point, if the file has previously been imported, nothing further
+happens.\footnote{%
+  This check is done using \textsf{truename}, so it should see through simple
+  tricks like symbolic links.  However, it may be confused by fancy things
+  like bind mounts and so on.} %
+
+Recursive imports, either direct or indirect, are an error.
+
+\subsubsection{Loading extensions} \label{sec:syntax.defs.load}
+
+\begin{grammar}
+<load-definition> ::= "load" <string> ";"
+\end{grammar}
+
+The Lisp file named @<string> is loaded and evaluated.
+
+A search is made for a Lisp source file as follows.
+\begin{itemize}
+\item The name @<string> is converted into a filename by appending @`.lisp',
+  if it has no extension already.\footnote{%
+    Technically, what happens is \textsf{(merge-pathnames name (make-pathname
+    :type "LISP" :case :common))}, so exactly what this means varies
+    according to the host system.} %
+\item A search is then made in the same manner as for module imports
+  (\xref{sec:syntax-module}).
+\end{itemize}
+If the file is found, it is loaded using the host Lisp's \textsf{load}
+function.
+
+Note that Sod doesn't attempt to compile Lisp files, or even to look for
+existing compiled files.  The right way to package a substantial extension to
+the Sod translator is to provide the extension as a standard ASDF system (or
+similar) and leave a dropping @"foo-extension.lisp" in the module path saying
+something like
+\begin{quote}
+  \textsf{(asdf:load-system :foo-extension)}
+\end{quote}
+which will arrange for the extension to be compiled if necessary.
+
+(This approach means that the language doesn't need to depend on any
+particular system definition facility.  It's bad enough already that it
+depends on Common Lisp.)
+
+\subsubsection{Lisp escapes} \label{sec:syntax.defs.lisp}
+
+\begin{grammar}
+<lisp-definition> ::= "lisp" <s-expression> ";"
+\end{grammar}
+
+The @<s-expression> is evaluated immediately.  It can do anything it likes.
+
+\textbf{Warning!}  This means that hostile Sod modules are a security hazard.
+Lisp code can read and write files, start other programs, and make network
+connections.  Don't install Sod modules from sources that you don't
+trust.\footnote{%
+  Presumably you were going to run the corresponding code at some point, so
+  this isn't as unusually scary as it sounds.  But please be careful.} %
+
+\subsubsection{Declaring type names} \label{sec:syntax.defs.typename}
+
+\begin{grammar}
+<typename-definition> ::=
+  "typename" <identifier-list> ";"
+\end{grammar}
+
+Each @<identifier> is declared as naming a C type.  This is important because
+the C type syntax -- which Sod uses -- is ambiguous, and disambiguation is
+done by distinguishing type names from other identifiers.
+
+Don't declare class names using @"typename"; use @"class" forward
+declarations instead.
+
+\subsection{Literal code} \label{sec:syntax-code}
+
+\begin{grammar}
+<code-definition> ::=
+  "code" <identifier> ":" <identifier> @[<constraints>@]
+  "{" <c-fragment> "}"
+
+<constraints> ::= "[" <constraint-list> "]"
+
+<constraint> ::= @<identifier>^+
+\end{grammar}
+
+The @<c-fragment> will be output unchanged to one of the output files.
+
+The first @<identifier> is the symbolic name of an output file.  Predefined
+output file names are @"c" and @"h", which are the implementation code and
+header file respectively; other output files can be defined by extensions.
+
+The second @<identifier> provides a name for the output item.  Several C
+fragments can have the same name: they will be concatenated together in the
+order in which they were encountered.
+
+The @<constraints> provide a means for specifying where in the output file
+the output item should appear.  (Note the two kinds of square brackets shown
+in the syntax: square brackets must appear around the constraints if they are
+present, but that they may be omitted.)  Each comma-separated @<constraint>
+is a sequence of identifiers naming output items, and indicates that the
+output items must appear in the order given -- though the translator is free
+to insert additional items in between them.  (The particular output items
+needn't be defined already -- indeed, they needn't be defined ever.)
+
+There is a predefined output item @"includes" in both the @"c" and @"h"
+output files which is a suitable place for inserting @"\#include"
+preprocessor directives in order to declare types and functions for use
+elsewhere in the generated output files.
+
+\subsection{Property sets} \label{sec:syntax.propset}
+
+\begin{grammar}
+<properties> ::= "[" <property-list> "]"
+
+<property> ::= <identifier> "=" <expression>
+\end{grammar}
+
+Property sets are a means for associating miscellaneous information with
+classes and related items.  By using property sets, additional information
+can be passed to extensions without the need to introduce idiosyncratic
+syntax.
+
+A property has a name, given as an @<identifier>, and a value computed by
+evaluating an @<expression>.  The value can be one of a number of types,
+though the only operators currently defined act on integer values only.
+
+\subsubsection{The expression evaluator} \label{sec:syntax.propset.expr}
+
+\begin{grammar}
+<expression> ::= <term> | <expression> "+" <term> | <expression> "-" <term>
+
+<term> ::= <factor> | <term> "*" <factor> | <term> "/" <factor>
+
+<factor> ::= <primary> | "+" <factor> | "-" <factor>
+
+<primary> ::=
+     <integer-literal> | <string-literal> | <char-literal> | <identifier>
+\alt "?" <s-expression>
+\alt "(" <expression> ")"
+\end{grammar}
+
+The arithmetic expression syntax is simple and standard; there are currently
+no bitwise, logical, or comparison operators.
+
+A @<primary> expression may be a literal or an identifier.  Note that
+identifiers stand for themselves: they \emph{do not} denote values.  For more
+fancy expressions, the syntax
+\begin{quote}
+  @"?" @<s-expression>
+\end{quote}
+causes the @<s-expression> to be evaluated using the Lisp \textsf{eval}
+function.
+%%% FIXME crossref to extension docs
+
+\subsection{C types} \label{sec:syntax.c-types}
+
+Sod's syntax for C types closely mirrors the standard C syntax.  A C type has
+two parts: a sequence of @<declaration-specifier>s and a @<declarator>.  In
+Sod, a type must contain at least one @<declaration-specifier> (i.e.,
+`implicit @"int"' is forbidden), and storage-class specifiers are not
+recognized.
+
+\subsubsection{Declaration specifiers} \label{sec:syntax.c-types.declspec}
+
+\begin{grammar}
+<declaration-specifier> ::= <type-name>
+\alt "struct" <identifier> | "union" <identifier> | "enum" <identifier>
+\alt "void" | "char" | "int" | "float" | "double"
+\alt "short" | "long"
+\alt "signed" | "unsigned"
+\alt <qualifier>
+
+<qualifier> ::= "const" | "volatile" | "restrict"
+
+<type-name> ::= <identifier>
+\end{grammar}
+
+A @<type-name> is an identifier which has been declared as being a type name,
+using the @"typename" or @"class" definitions.
+
+Declaration specifiers may appear in any order.  However, not all
+combinations are permitted.  A declaration specifier must consist of zero or
+more @<qualifiers>, and one of the following, up to reordering.
+\begin{itemize}
+\item @<type-name>
+\item @"struct" @<identifier>, @"union" @<identifier>, @"enum" @<identifier>
+\item @"void"
+\item @"char", @"unsigned char", @"signed char"
+\item @"short", @"unsigned short", @"signed short"
+\item @"short int", @"unsigned short int", @"signed short int"
+\item @"int", @"unsigned int", @"signed int", @"unsigned", @"signed"
+\item @"long", @"unsigned long", @"signed long"
+\item @"long int", @"unsigned long int", @"signed long int"
+\item @"long long", @"unsigned long long", @"signed long long"
+\item @"long long int", @"unsigned long long int", @"signed long long int"
+\item @"float", @"double", @"long double"
+\end{itemize}
+All of these have their usual C meanings.
+
+\subsubsection{Declarators} \label{sec:syntax.c-types.declarator}
+
+\begin{grammar}
+<declarator>$[k]$ ::= @<pointer>^* <primary-declarator>$[k]$
+
+<primary-declarator>$[k]$ ::= $k$
+\alt "(" <primary-declarator>$[k]$ ")"
+\alt <primary-declarator>$[k]$ @<declarator-suffix>^*
+
+<pointer> ::= "*" @<qualifier>^*
+
+<declarator-suffix> ::= "[" <c-fragment> "]"
+\alt "(" <arguments> ")"
+
+<arguments> ::= $\epsilon$ | "..."
+\alt <argument-list> @["," "..."@]
+
+<argument> ::= @<declaration-specifier>^+ <argument-declarator>
+
+<argument-declarator> ::= <declarator>@[<identifier> @! $\epsilon$@]
+
+<simple-declarator> ::= <declarator>@[<identifier>@]
+
+<dotted-name> ::= <identifier> "." <identifier>
+
+<dotted-declarator> ::= <declarator>@[<dotted-name>@]
+\end{grammar}
+
+The declarator syntax is taken from C, but with some differences.
+\begin{itemize}
+\item Array dimensions are uninterpreted @<c-fragments>, terminated by a
+  closing square bracket.  This allows array dimensions to contain arbitrary
+  constant expressions.
+\item A declarator may have either a single @<identifier> at its centre or a
+  pair of @<identifier>s separated by a @`.'; this is used to refer to
+  slots or messages defined in superclasses.
+\end{itemize}
+The remaining differences are (I hope) a matter of presentation rather than
+substance.
+
+\subsection{Defining classes} \label{sec:syntax.class}
+
+\begin{grammar}
+<class-definition> ::= <class-forward-declaration>
+\alt <full-class-definition>
+\end{grammar}
+
+\subsubsection{Forward declarations} \label{sec:class.class.forward}
+
+\begin{grammar}
+<class-forward-declaration> ::= "class" <identifier> ";"
+\end{grammar}
+
+A @<class-forward-declaration> informs Sod that an @<identifier> will be used
+to name a class which is currently undefined.  Forward declarations are
+necessary in order to resolve certain kinds of circularity.  For example,
+\begin{listing}
+class Sub;
+
+class Super : SodObject {
+  Sub *sub;
+};
+
+class Sub : Super {
+  /* ... */
+};
+\end{listing}
+
+\subsubsection{Full class definitions} \label{sec:class.class.full}
+
+\begin{grammar}
+<full-class-definition> ::=
+  @[<properties>@]
+  "class" <identifier> ":" <identifier-list>
+  "{" @<class-item>^* "}"
+
+<class-item> ::= <slot-item> ";"
+\alt <message-item>
+\alt <method-item>
+\alt  <initializer-item> ";"
+\end{grammar}
+
+A full class definition provides a complete description of a class.
+
+The first @<identifier> gives the name of the class.  It is an error to
+give the name of an existing class (other than a forward-referenced class),
+or an existing type name.  It is conventional to give classes `MixedCase'
+names, to distinguish them from other kinds of identifiers.
+
+The @<identifier-list> names the direct superclasses for the new class.  It
+is an error if any of these @<identifier>s does not name a defined class.
+
+The @<properties> provide additional information.  The standard class
+properties are as follows.
+\begin{description}
+\item[@"lisp_class"] The name of the Lisp class to use within the translator
+  to represent this class.  The property value must be an identifier; the
+  default is @"sod_class".  Extensions may define classes with additional
+  behaviour, and may recognize additional class properties.
+\item[@"metaclass"] The name of the Sod metaclass for this class.  In the
+  generated code, a class is itself an instance of another class -- its
+  \emph{metaclass}.  The metaclass defines which slots the class will have,
+  which messages it will respond to, and what its behaviour will be when it
+  receives them.  The property value must be an identifier naming a defined
+  subclass of @"SodClass".  The default metaclass is @"SodClass".
+  %%% FIXME xref to theory
+\item[@"nick"] A nickname for the class, to be used to distinguish it from
+  other classes in various limited contexts.  The property value must be an
+  identifier; the default is constructed by forcing the class name to
+  lower-case.
+\end{description}
+
+The class body consists of a sequence of @<class-item>s enclosed in braces.
+These items are discussed on the following sections.
+
+\subsubsection{Slot items} \label{sec:sntax.class.slot}
+
+\begin{grammar}
+<slot-item> ::=
+  @[<properties>@]
+  @<declaration-specifier>^+ <init-declarator-list>
+
+<init-declarator> ::= <declarator> @["=" <initializer>@]
+\end{grammar}
+
+A @<slot-item> defines one or more slots.  All instances of the class and any
+subclass will contain these slot, with the names and types given by the
+@<declaration-specifiers> and the @<declarators>.  Slot declarators may not
+contain qualified identifiers.
+
+It is not possible to declare a slot with function type: such an item is
+interpreted as being a @<message-item> or @<method-item>.  Pointers to
+functions are fine.
+
+An @<initializer>, if present, is treated as if a separate
+@<initializer-item> containing the slot name and initializer were present.
+For example,
+\begin{listing}
+[nick = eg]
+class Example : Super {
+  int foo = 17;
+};
+\end{listing}
+means the same as
+\begin{listing}
+[nick = eg]
+class Example : Super {
+  int foo;
+  eg.foo = 17;
+};
+\end{listing}
+
+\subsubsection{Initializer items} \label{sec:syntax.class.init}
+
+\begin{grammar}
+<initializer-item> ::= @["class"@] <slot-initializer-list>
+
+<slot-initializer> ::= <qualified-identifier> "=" <initializer>
+
+<initializer> :: "{" <c-fragment> "}" | <c-fragment>
+\end{grammar}
+
+An @<initializer-item> provides an initial value for one or more slots.  If
+prefixed by @"class", then the initial values are for class slots (i.e.,
+slots of the class object itself); otherwise they are for instance slots.
+
+The first component of the @<qualified-identifier> must be the nickname of
+one of the class's superclasses (including itself); the second must be the
+name of a slot defined in that superclass.
+
+The initializer has one of two forms.
+\begin{itemize}
+\item A @<c-fragment> enclosed in braces denotes an aggregate initializer.
+  This is suitable for initializing structure, union or array slots.
+\item A @<c-fragment> \emph{not} beginning with an open brace is a `bare'
+  initializer, and continues until the next @`,' or @`;' which is not within
+  nested brackets.  Bare initializers are suitable for initializing scalar
+  slots, such as pointers or integers, and strings.
+\end{itemize}
+
+\subsubsection{Message items} \label{sec:syntax.class.message}
+
+\begin{grammar}
+<message-item> ::=
+  @[<properties>@]
+  @<declaration-specifier>^+ <declarator> @[<method-body>@]
+\end{grammar}
+
+\subsubsection{Method items} \label{sec:syntax.class.method}
+
+\begin{grammar}
+<method-item> ::=
+  @[<properties>@]
+  @<declaration-specifier>^+ <declarator> <method-body>
+
+<method-body> ::= "{" <c-fragment> "}" | "extern" ";"
+\end{grammar}
+
+
+%%%----- That's all, folks --------------------------------------------------
+
+%%% Local variables:
+%%% mode: LaTeX
+%%% TeX-master: "sod.tex"
+%%% TeX-PDF-mode: t
+%%% End:
diff --git a/doc/sod-tut.tex b/doc/tutorial.tex

similarity index 86%

rename from doc/sod-tut.tex

rename to doc/tutorial.tex

index ca686aa..afc6109 100644 (file)
--- a/doc/sod-tut.tex
+++ b/doc/tutorial.tex
@@ -23,8 +23,7 @@
  %%% along with SOD; if not, write to the Free Software Foundation,
  %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
  
-\chapter{Tutorial}
-\label{ch:tut}
+\chapter{Tutorial} \label{ch:tutorial}
  
  This chapter provides a tutorial introduction to the Sod object system.  It
  intentionally misses out nitty-gritty details.  If you want those, the
@@ -35,7 +34,7 @@ You'll have to bear with him.  If you think you can do a better job, I'm sure
  that he'll be grateful for your contribution.
  
  %%%--------------------------------------------------------------------------
-\section{Introduction} \label{sec:tut.intro}
+\section{Introduction} \label{sec:tutorial.intro}
  
  Sod is an object system for the C~programming language.  Because it doesn't
  have enough already.  Actually, that's not right: it's got plenty already.
@@ -68,7 +67,7 @@ means that is has the following features.
  There's a good chance that half of that didn't mean anything to you.  Bear
  with me, though, because we'll explain it all eventually.
  
-\subsection{Building programs with Sod} \label{sec:tut.intro.build}
+\subsection{Building programs with Sod} \label{sec:tutorial.intro.build}
  
  Sod is basically a fancy preprocessor, in the same vein as Lex and Yacc.  It
  reads source files written in a vaguely C-like language.  It produces output
@@ -84,8 +83,8 @@ The main consequences of this are as follows.
  \item Sod hasn't made any attempt to improve C's syntax.  It's just as
    hostile to object-oriented programming as it ever was.  This means that
    you'll end up writing ugly things like
-  \begin{prog}%
-    thing->_vt->foo.frob(thing, mumble);%
+  \begin{prog}
+    thing->_vt->foo.frob(thing, mumble);
    \end{prog}
    fairly frequently.  This can be made somewhat less painful using macros,
    but we're basically stuck with C.  The upside is that you know exactly what
@@ -96,11 +95,11 @@ The main consequences of this are as follows.
  \end{itemize}
  Of course, this means that your build system needs to become more
  complicated.  If you use \man{make}{1}, then something like
-\begin{prog}%
-  SOD = sod
-
-  .SUFFIXES: .sod .c .h
-  .sod.c:; \$(SOD) -tc \$<
+\begin{prog}
+  SOD = sod \\
+  \\
+  .SUFFIXES: .sod .c .h \\
+  .sod.c:; \$(SOD) -tc \$< \\
    .sod.h:; \$(SOD) -th \$<
  \end{prog}
  ought to do the job.
@@ -109,48 +108,47 @@ ought to do the job.
  \section{A traditional trivial introduction}
  
  The following is a simple Sod input file.
-\begin{prog}\quad\=\quad\=\kill%
-/* -*-sod-*- */
-
-code c : includes \{
-\#include "greeter.h"
-\}
-
-code h : includes \{
-\#include <stdio.h>
-\#include <sod.h>
+\begin{prog}
+/* -*-sod-*- */ \\
+\\
+code c : includes \{ \\
+\#include "greeter.h" \\
+\} \\
+\\
+code h : includes \{ \\
+\#include <stdio.h> \\
+\#include <sod/sod.h> \\
+\} \\
+\\
+class Greeter : SodObject \{ \\ \ind
+  void greet(FILE *fp) \{ \\ \ind
+    fputs("Hello, world!\textbackslash n", fp); \- \\
+  \} \- \\
  \}
-
-class Greeter : SodObject \{ \+
-  void greet(FILE *fp) \{ \+
-    fputs("Hello, world!\textbackslash n", fp); \-
-  \} \-
-\} %
  \end{prog}
  Save it as @"greeter.sod", and run
-\begin{prog}%
-sod --gc --gh greeter %
+\begin{prog}
+sod --gc --gh greeter
  \end{prog}
  This will create files @"greeter.c" and @"greeter.h" in the current
  directory.  Here's how we might use such a simple thing.
-\begin{prog}\quad\=\kill%
-\#include "greeter.h"
-
-int main(void)
-\{ \+
-  struct Greeter__ilayout g_obj;
-  Greeter *g = Greeter__class->cls.init(\&g_obj);
-
-  g->_vt.greeter.greet(g, stdout);
-  return (0); \-
-\} %
+\begin{prog}
+\#include "greeter.h" \\
+\\
+int main(void) \\
+\{ \\ \ind
+  SOD_DECL(Greeter, g); \\
+  \\
+  Greeter_greet(g, stdout); \\
+  return (0); \- \\
+\}
  \end{prog}
  Compare this to the traditional
-\begin{prog}\quad\=\kill%
-\#include <stdio.h>
-
-int main(void) \+
-  \{ fputs("Hello, world\\n", stdout); return (0); \} %
+\begin{prog}
+\#include <stdio.h> \\
+\\
+int main(void) \\ \ind
+  \{ fputs("Hello, world@\\n", stdout); return (0); \}
  \end{prog}
  and I'm sure you'll appreciate the benefits of using Sod already -- mostly to
  do with finger exercise.  Trust me, it gets more useful.
@@ -161,13 +159,13 @@ it (after the comment which tells Emacs how to cope with it).
  The first part consists of the two @"code" stanzas.  Both of them define
  gobbets of raw C code to copy into output files.  The first one, @"code~:
  c"~\ldots, says that
-\begin{prog}%
-  \#include "greeter.h" %
+\begin{prog}
+  \#include "greeter.h"
  \end{prog}
  needs to appear in the generated @|greeter.c| file; the second says that
-\begin{prog}%
-  \#include <stdio.h>
-  \#include <sod.h> %
+\begin{prog}
+  \#include <stdio.h> \\
+  \#include <sod/sod.h>
  \end{prog}
  needs to appear in the header file @|greeter.h|.  The generated C files need
  to get declarations for external types and functions (e.g., @"FILE" and
@@ -176,10 +174,10 @@ declarations from the corresponding @".h" file.  Sod takes a very simple
  approach to all of this: it expects you, the programmer, to deal with it.
  
  The basic syntax for @"code" stanzas is
-\begin{prog}\quad\=\kill%
-  code @<file-label> : @<section> \{
-  \>  @<code>
-  \} %
+\begin{prog}
+  code @<file-label> : @<section> \{ \\ \ind
+    @<code> \- \\
+  \}
  \end{prog}
  The @<file-label> is either @"c" or @"h", and says which output file the code
  wants to be written to.  The @<section> is a name which explains where in the
@@ -193,13 +191,13 @@ message.
  
  So far, so good.  The C code, which we thought we understood, contains some
  bizarre looking runes.  Let's take it one step at a time.
-\begin{prog}%
-  struct Greeter__ilayout g_obj; %
+\begin{prog}
+  struct Greeter__ilayout g_obj;
  \end{prog}
  allocates space for an instance of class @"Greeter".  We're not going to use
  this space directly.  Instead, we do this frightening looking thing.
-\begin{prog}%
-  Greeter *g = Greeter__class->cls.init(\&g_obj); %
+\begin{prog}
+  Greeter *g = Greeter__class->cls.init(\&g_obj);
  \end{prog}
  Taking it slowly: @"Greeter__class" is a pointer to the object that
  represents our class @"Greeter".  This object contains a member, named
@@ -209,8 +207,8 @@ the instance, which we use in preference to grovelling about in the
  @"ilayout" structure.
  
  Having done this, we `send the instance a message':
-\begin{prog}%
-  g->_vt->greeter.greet(g, stdout); %
+\begin{prog}
+  g->_vt->greeter.greet(g, stdout);
  \end{prog}
  This looks horrific, and seems to repeat itself quite unnecessarily.  The
  first @"g" is the recipient of our `message'.  The second is indeed a copy of
author	Mark Wooding <mdw@distorted.org.uk>
	Sun, 30 Aug 2015 09:58:38 +0000 (10:58 +0100)
committer	Mark Wooding <mdw@distorted.org.uk>
	Thu, 17 Sep 2015 10:17:16 +0000 (11:17 +0100)
doc/clang.tex	[moved from doc/sod-protocol.tex with 68% similarity]	patch \| blob \| blame \| history
doc/concepts.tex	[new file with mode: 0644]	patch \| blob
doc/cutting-room-floor.tex	[new file with mode: 0644]	patch \| blob
doc/lispintro.tex	[new file with mode: 0644]	patch \| blob
doc/output.tex	[new file with mode: 0644]	patch \| blob
doc/parsing.tex	[new file with mode: 0644]	patch \| blob
doc/sod-backg.tex	[deleted file]	patch \| blob \| blame \| history
doc/sod.sty	[new file with mode: 0644]	patch \| blob
doc/sod.tex		patch \| blob \| blame \| history
doc/sod.toc	[new file with mode: 0644]	patch \| blob
doc/syntax.tex	[new file with mode: 0644]	patch \| blob
doc/tutorial.tex	[moved from doc/sod-tut.tex with 86% similarity]	patch \| blob \| blame \| history