@@@ mess!

diff --git a/doc/concepts.tex b/doc/concepts.tex

index d554b51..945e6b9 100644
--- a/doc/concepts.tex
+++ b/doc/concepts.tex
@@ -7,7 +7,7 @@
  
  %%%----- Licensing notice ---------------------------------------------------
  %%%
-%%% This file is part of the Sensble Object Design, an object system for C.
+%%% This file is part of the Sensible Object Design, an object system for C.
  %%%
  %%% SOD is free software; you can redistribute it and/or modify
  %%% it under the terms of the GNU General Public License as published by
@@ -23,15 +23,1302 @@
  %%% along with SOD; if not, write to the Free Software Foundation,
  %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
  
-\chapter{Concepts}
+\chapter{Concepts} \label{ch:concepts}
  
  
-\section{Classes and slots}
+%%%--------------------------------------------------------------------------
+\section{Modules} \label{sec:concepts.modules}
  
  
-\section{Messages and methods}
+A \emph{module} is the top-level syntactic unit of input to the Sod
+translator.  As described above, given an input module, the translator
+generates C source and header files.
  
  
-\section{Metaclasses}
+A module can \emph{import} other modules.  This makes the type names and
+classes defined in those other modules available to class definitions in the
+importing module.  Sod's module system is intentionally very simple.  There
+are no private declarations or attempts to hide things.
  
  
-\section{Modules}
+As well as importing existing modules, a module can include a number of
+different kinds of \emph{items}:
+\begin{itemize}
+\item \emph{class definitions} describe new classes, possibly in terms of
+  existing classes;
+\item \emph{type name declarations} introduce new type names to Sod's
+  parser;\footnote{%
+    This is unfortunately necessary because C syntax, upon which Sod's input
+    language is based for obvious reasons, needs to treat type names
+    differently from other kinds of identifiers.} %
+  and
+\item \emph{code fragments} contain literal C code to be dropped into an
+  appropriate place in an output file.
+\end{itemize}
+Each kind of item, and, indeed, a module as a whole, can have a collection of
+\emph{properties} associated with it.  A property has a \emph{name} and a
+\emph{value}.  Properties are an open-ended way of attaching additional
+information to module items, so extensions can make use of them without
+having to implement additional syntax.
+
+%%%--------------------------------------------------------------------------
+\section{Classes, instances, and slots} \label{sec:concepts.classes}
+
+For the most part, Sod takes a fairly traditional view of what it means to be
+an object system.
+
+An \emph{object} maintains \emph{state} and exhibits \emph{behaviour}.
+(Here, we're using the term `object' in the usual sense of `object-oriented
+programming', rather than that of the ISO~C standard.  Once we have defined
+an `instance' below, we shall generally prefer that term, so as to prevent
+further confusion between these two uses of the word.)
+
+An object's state is maintained in named \emph{slots}, each of which can
+store a C value of an appropriate (scalar or aggregate) type.  An object's
+behaviour is stimulated by sending it \emph{messages}.  A message has a name,
+and may carry a number of arguments, which are C values; sending a message
+may result in the state of the receiving object (or other objects) being changed,
+and a C value being returned to the sender.
+
+Every object is a \emph{direct instance} of exactly one \emph{class}.  The
+class determines which slots its instances have, which messages its instances
+can be sent, and which \emph{methods} are invoked when those messages are
+received.  The Sod translator's main job is to read class definitions and
+convert them into appropriate C declarations, tables, and functions.  An
+object cannot (usually) change its direct class, and the direct class of an
+object is not affected by, for example, the static type of a pointer to it.
+
+If an object~$x$ is a direct instance of some class~$C$, then we say that $C$
+is \emph{the class of}~$x$.  Note that the class of an object is a property
+of the object's value at runtime, and not of C's compile-time type system.
+We shall be careful in distinguishing C's compile-time notion of \emph{type}
+from Sod's run-time notion of \emph{class}.
+
+
+\subsection{Superclasses and inheritance}
+\label{sec:concepts.classes.inherit}
+
+\subsubsection{Class relationships}
+Each class has zero or more \emph{direct superclasses}.
+
+A class with no direct superclasses is called a \emph{root class}.  The Sod
+runtime library includes a root class named @|SodObject|; making new root
+classes is somewhat tricky, and won't be discussed further here.
+
+Classes can have more than one direct superclass, i.e., Sod supports
+\emph{multiple inheritance}.  A Sod class definition for a class~$C$ lists
+the direct superclasses of $C$ in a particular order.  This order is called
+the \emph{local precedence order} of $C$, and the list which consists of $C$
+followed by $C$'s direct superclasses in local precedence order is called
+$C$'s \emph{local precedence list}.
+
+The multiple inheritance in Sod works similarly to multiple inheritance in
+Lisp-like languages, such as Common Lisp, EuLisp, Dylan, and Python, which is
+very different from how multiple inheritance works in \Cplusplus.\footnote{%
+  The latter can be summarized as `badly'.  By default in \Cplusplus, an
+  instance receives an additional copy of a superclass's state for each path
+  through the class graph from the instance's direct class to that
+  superclass, though this behaviour can be overridden by declaring
+  superclasses to be @|virtual|.  Also, \Cplusplus\ offers only trivial
+  method combination (\xref{sec:concepts.methods}), leaving programmers to
+  deal with delegation manually and (usually) statically.} %
+
+If $C$ is a class, then the \emph{superclasses} of $C$ are
+\begin{itemize}
+\item $C$ itself, and
+\item the superclasses of each of $C$'s direct superclasses.
+\end{itemize}
+The \emph{proper superclasses} of a class $C$ are the superclasses of $C$
+except for $C$ itself.  If a class $B$ is a (direct, proper) superclass of
+$C$, then $C$ is a \emph{(direct, proper) subclass} of $B$.  If $C$ is a root
+class then the only superclass of $C$ is $C$ itself, and $C$ has no proper
+superclasses.
+
+If an object is a direct instance of class~$C$ then the object is also an
+(indirect) \emph{instance} of every superclass of $C$.
+
+If $C$ has a proper superclass $B$, then $B$ must not have $C$ as a direct
+superclass.  In different terms, if we construct a directed graph, whose
+nodes are classes, and draw an arc from each class to each of its direct
+superclasses, then this graph must be acyclic.  In yet other terms, the `is a
+superclass of' relation is a partial order on classes.
+
+\subsubsection{The class precedence list}
+This partial order is not quite sufficient for our purposes.  For each class
+$C$, we shall need to extend it into a total order on $C$'s superclasses.
+This calculation is called \emph{superclass linearization}, and the result is
+a \emph{class precedence list}, which lists each of $C$'s superclasses
+exactly once.  If a superclass $B$ precedes or follows some other superclass
+$A$ in $C$'s class precedence list, then we say that $B$ is respectively a
+more or less \emph{specific} superclass of $C$ than $A$.
+
+The superclass linearization algorithm isn't fixed, and extensions to the
+translator can introduce new linearizations for special effects, but the
+following properties are expected to hold.
+\begin{itemize}
+\item The first class in $C$'s class precedence list is $C$ itself; i.e.,
+  $C$ is always its own most specific superclass.
+\item If $A$ and $B$ are both superclasses of $C$, and $A$ is a proper
+  superclass of $B$ then $A$ appears after $B$ in $C$'s class precedence
+  list, i.e., $B$ is a more specific superclass of $C$ than $A$ is.
+\end{itemize}
+The default linearization algorithm used in Sod is the \emph{C3} algorithm,
+which has a number of good properties described
+in~\cite{barrett-1996:monot-super-linear-dylan}.  It works as follows.
+\begin{itemize}
+\item A \emph{merge} of some number of input lists is a single list
+  containing each item that is in any of the input lists exactly once, and no
+  other items; if an item $x$ appears before an item $y$ in any input list,
+  then $x$ also appears before $y$ in the merge.  If a collection of lists
+  have no merge then they are said to be \emph{inconsistent}.
+\item The class precedence list of a class $C$ is a merge of the local
+  precedence list of $C$ together with the class precedence lists of each of
+  $C$'s direct superclasses.  If these lists are inconsistent, then the
+  definition of $C$ is invalid.
+\item Suppose that there are multiple candidate merges.  Consider the
+  earliest position in these candidate merges at which they disagree.  The
+  \emph{candidate classes} at this position are the classes appearing at this
+  position in the candidate merges.  Each candidate class must be a
+  superclass of distinct direct superclasses of $C$, since otherwise the
+  candidates would be ordered by their common subclass's class precedence
+  list.  The class precedence list contains, at this position, that candidate
+  class whose subclass appears earliest in $C$'s local precedence order.
+\end{itemize}
+
+\begin{figure}
+  \centering
+  \begin{tikzpicture}[x=7.5mm, y=-14mm, baseline=(current bounding box.east)]
+    \node[lit] at ( 0,  0) (R) {SodObject};
+    \node[lit] at (-3, +1) (A) {A};     \draw[->] (A) -- (R);
+    \node[lit] at (-1, +1) (B) {B};     \draw[->] (B) -- (R);
+    \node[lit] at (+1, +1) (C) {C};     \draw[->] (C) -- (R);
+    \node[lit] at (+3, +1) (D) {D};     \draw[->] (D) -- (R);
+    \node[lit] at (-2, +2) (E) {E};     \draw[->] (E) -- (A);
+                                        \draw[->] (E) -- (B);
+    \node[lit] at (+2, +2) (F) {F};     \draw[->] (F) -- (A);
+                                        \draw[->] (F) -- (D);
+    \node[lit] at (-1, +3) (G) {G};     \draw[->] (G) -- (E);
+                                        \draw[->] (G) -- (C);
+    \node[lit] at (+1, +3) (H) {H};     \draw[->] (H) -- (F);
+    \node[lit] at ( 0, +4) (I) {I};     \draw[->] (I) -- (G);
+                                        \draw[->] (I) -- (H);
+  \end{tikzpicture}
+  \quad
+  \vrule
+  \quad
+  \begin{minipage}[c]{0.45\hsize}
+    \begin{nprog}
+      class A: SodObject \{ \}\quad\=@/* @|A|, @|SodObject| */  \\
+      class B: SodObject \{ \}\>@/* @|B|, @|SodObject| */       \\
+      class C: SodObject \{ \}\>@/* @|C|, @|SodObject| */       \\
+      class D: SodObject \{ \}\>@/* @|D|, @|SodObject| */       \\+
+      class E: A, B \{ \}\quad\=@/* @|E|, @|A|, @|B|, \dots */  \\
+      class F: A, D \{ \}\>@/* @|F|, @|A|, @|D|, \dots */       \\+
+      class G: E, C \{ \}\>@/* @|G|, @|E|, @|A|,
+                                @|B|, @|C|, \dots */            \\
+      class H: F \{ \}\>@/* @|H|, @|F|, @|A|, @|D|, \dots */    \\+
+      class I: G, H \{ \}\>@/* @|I|, @|G|, @|E|, @|H|, @|F|,
+                                @|A|, @|B|, @|C|, @|D|, \dots */
+    \end{nprog}
+  \end{minipage}
+
+  \caption{An example class graph and class precedence lists}
+  \label{fig:concepts.classes.cpl-example}
+\end{figure}
+
+\begin{example}
+  Consider the class relationships shown in
+  \xref{fig:concepts.classes.cpl-example}.
+
+  \begin{itemize}
+
+  \item @|SodObject| has no proper superclasses.  Its class precedence list
+    is therefore simply $\langle @|SodObject| \rangle$.
+
+  \item In general, if $X$ is a direct subclass only of $Y$, and $Y$'s class
+    precedence list is $\langle Y, \ldots \rangle$, then $X$'s class
+    precedence list is $\langle X, Y, \ldots \rangle$.  This explains $A$,
+    $B$, $C$, $D$, and $H$.
+
+  \item $E$'s list is found by merging its local precedence list $\langle E,
+    A, B \rangle$ with the class precedence lists of its direct superclasses,
+    which are $\langle A, @|SodObject| \rangle$ and $\langle B, @|SodObject|
+    \rangle$.  Clearly, @|SodObject| must be last, and $E$'s local precedence
+    list orders the rest, giving $\langle E, A, B, @|SodObject| \rangle$.
+    $F$ is similar.
+
+  \item We determine $G$'s class precedence list by merging the three lists
+    $\langle G, E, C \rangle$, $\langle E, A, B, @|SodObject| \rangle$, and
+    $\langle C, @|SodObject| \rangle$.  The class precedence list begins
+    $\langle G, E, \ldots \rangle$, but the individual lists don't order $A$
+    and $C$.  Comparing these to $G$'s direct superclasses, we see that $A$
+    is a superclass of $E$, while $C$ is a superclass of -- indeed equal to
+    -- $C$; so $A$ must precede $C$, as must $B$, and the final list is
+    $\langle G, E, A, B, C, @|SodObject| \rangle$.
+
+  \item Finally, we determine $I$'s class precedence list by merging $\langle
+    I, G, H \rangle$, $\langle G, E, A, B, C, @|SodObject| \rangle$, and
+    $\langle H, F, A, D, @|SodObject| \rangle$.  The list begins $\langle I,
+    G, \ldots \rangle$, and then we must break a tie between $E$ and $H$; but
+    $E$ is a superclass of $G$, so $E$ wins.  Next, $H$ and $F$ must precede
+    $A$, since these are ordered by $H$'s class precedence list.  Then $B$
+    and $C$ precede $D$, since the former are superclasses of $G$, and the
+    final list is $\langle I, G, E, H, F, A, B, C, D, @|SodObject| \rangle$.
+
+  \end{itemize}
+
+  (This example combines elements from
+  \cite{barrett-1996:monot-super-linear-dylan} and
+  \cite{ducournau-1994:monot-multip-inher-linear}.)
+\end{example}
+
+\subsubsection{Class links and chains}
+The definition for a class $C$ may distinguish one of its proper superclasses
+as being the \emph{link superclass} for class $C$.  Not every class need have
+a link superclass, and the link superclass of a class $C$, if it exists, need
+not be a direct superclass of $C$.
+
+Superclass links must obey the following rule: if $C$ is a class, then there
+must be no three distinct superclasses $X$, $Y$ and~$Z$ of $C$ such that $Z$
+is the link superclass of both $X$ and $Y$.  As a consequence of this rule,
+the superclasses of $C$ can be partitioned into linear \emph{chains}, such
+that superclasses $A$ and $B$ are in the same chain if and only if one can
+trace a path from $A$ to $B$ by following superclass links, or \emph{vice
+versa}.
+
+Since a class links only to one of its proper superclasses, the classes in a
+chain are naturally ordered from most- to least-specific.  The least specific
+class in a chain is called the \emph{chain head}; the most specific class is
+the \emph{chain tail}.  Chains are often named after their chain head
+classes.
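+
+\begin{example}
+  As a hypothetical illustration, take the class graph of
+  \xref{fig:concepts.classes.cpl-example} and suppose (the figure itself
+  says nothing about links) that $E$ names $A$ as its link superclass, $G$
+  names $E$, $I$ names $G$, $F$ names $D$, and $H$ names $F$.  No class is
+  the link superclass of two others, so the rule above is satisfied.  Under
+  these assumptions, the superclasses of $I$ fall into five chains:
+  \begin{itemize}
+  \item $\langle I, G, E, A \rangle$, with chain head $A$ and chain tail
+    $I$;
+  \item $\langle H, F, D \rangle$, with chain head $D$ and chain tail $H$;
+    and
+  \item the singleton chains $\langle B \rangle$, $\langle C \rangle$, and
+    $\langle @|SodObject| \rangle$, each of which is its own head and tail.
+  \end{itemize}
+\end{example}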
+
+
+\subsection{Names}
+\label{sec:concepts.classes.names}
+
+Classes have a number of other attributes:
+\begin{itemize}
+\item A \emph{name}, which is a C identifier.  Class names must be globally
+  unique.  The class name is used in the names of a number of associated
+  definitions, to be described later.
+\item A \emph{nickname}, which is also a C identifier.  Unlike names,
+  nicknames are not required to be globally unique.  However, if $C$ is any
+  class, then all the superclasses of $C$ must have distinct nicknames.
+\end{itemize}
+
+
+\subsection{Slots} \label{sec:concepts.classes.slots}
+
+Each class defines a number of \emph{slots}.  Much like a structure member, a
+slot has a \emph{name}, which is a C identifier, and a \emph{type}.  Unlike
+many other object systems, different superclasses of a class $C$ can define
+slots with the same name without ambiguity, since slot references are always
+qualified by the defining class's nickname.
+
+\subsubsection{Slot initializers}
+As well as defining slot names and types, a class can also associate an
+\emph{initial value} with each slot defined by itself or one of its
+superclasses.  A class $C$ provides an \emph{initialization message} (see
+\xref[\instead{sections}]{sec:concepts.lifecycle.birth}, and
+\ref{sec:structures.root.sodobject}) whose methods set the slots of a
+\emph{direct} instance of the class to the correct initial values.  If
+several of $C$'s superclasses define initializers for the same slot then the
+initializer from the most specific such class is used.  If none of $C$'s
+superclasses define an initializer for some slot then that slot will be left
+uninitialized.
+
+The initializer for a slot with scalar type may be any C expression.  The
+initializer for a slot with aggregate type must contain only constant
+expressions if the generated code is expected to be processed by an
+implementation of C89.  Initializers will be evaluated once each time an
+instance is initialized.
+
+Slots are initialized in reverse-precedence order of their defining classes;
+i.e., slots defined by a less specific superclass are initialized earlier
+than slots defined by a more specific superclass.  Slots defined by the same
+class are initialized in the order in which they appear in the class
+definition.
+
+The initializer for a slot may refer to other slots in the same object, via
+the @|me| pointer: in an initializer for a slot defined by a class $C$, @|me|
+has type `pointer to $C$'.  (Note that the type of @|me| depends only on the
+class which defined the slot, not the class which defined the initializer.)
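+
+The rules above can be illustrated by a small sketch.  The class and slot
+names here are invented for the purpose, and the notation for an initializer
+which overrides one inherited from a superclass (the slot name qualified by
+the defining class's nickname) is an assumption based on the slot-reference
+convention described above, not a definitive statement of the syntax.
+\begin{prog}
+  [nick = base]                                                 \\
+  class Base: SodObject \{                                      \\ \ind
+    int x = 1;                                                  \\
+    int y = me@->base.x + 1;                                  \-\\
+  \}                                                            \\+
+  [nick = derived]                                              \\
+  class Derived: Base \{                                        \\ \ind
+    base.x = 10;                                              \-\\
+  \}
+\end{prog}
+Under these assumptions, a direct instance of @|Base| would start with @|x|
+set to 1 and @|y| set to 2, while a direct instance of @|Derived| would start
+with @|x| set to 10 (the initializer from the most specific class is used)
+and @|y| set to 11 (since @|x| is initialized before @|y|).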
+
+A class can also define \emph{class slot initializers}, which provide values
+for a slot defined by its metaclass; see \xref{sec:concepts.metaclasses} for
+details.
+
+
+\subsection{C language integration} \label{sec:concepts.classes.c}
+
+It is very important to distinguish compile-time C \emph{types} from Sod's
+run-time \emph{classes}: see \xref{sec:concepts.classes}.
+
+For each class~$C$, the Sod translator defines a C type, the \emph{class
+type}, with the same name.  This is the usual type used when considering an
+object as an instance of class~$C$.  No entire object will normally have a
+class type,\footnote{%
+  In general, a class type only captures the structure of one of the
+  superclass chains of an instance.  A full instance layout contains multiple
+  chains.  See \xref{sec:structures.layout} for the full details.} %
+so access to instances is almost always via pointers.
+
+Usually, a value of type pointer-to-class-type of class~$C$ will point into
+an instance of class $C$.  However, clever (or foolish) use of pointer
+conversions can invalidate this relationship.
+
+\subsubsection{Access to slots}
+The class type for a class~$C$ is actually a structure.  It contains one
+member for each class in $C$'s superclass chain, named with that class's
+nickname.  Each of these members is also a structure, containing the
+corresponding class's slots, one member per slot.  There's nothing special
+about these slot members: C code can access them in the usual way.
+
+For example, given the definition
+\begin{prog}
+  [nick = mine]                                                 \\
+  class MyClass: SodObject \{                                   \\ \ind
+    int x;                                                    \-\\
+  \}
+\end{prog}
+the simple function
+\begin{prog}
+  int get_x(MyClass *m) \{ return (m@->mine.x); \}
+\end{prog}
+will extract the value of @|x| from an instance of @|MyClass|.
+
+All of this means that there's no such thing as `private' or `protected'
+slots.  If you want to hide implementation details, the best approach is to
+stash them in a dynamically allocated private structure, and leave a pointer
+to it in a slot.  (This will also help preserve binary compatibility, because
+the private structure can grow more members as needed.  See
+\xref{sec:concepts.compatibility} for more details.)
+
+Slots defined by $C$'s link superclass, or any other superclass in the same
+chain, can be accessed in the same way.  Slots defined by other superclasses
+can't be accessed directly: the instance pointer must be \emph{converted} to
+point to a different chain.  See the subsection `Conversions' below.
+
+
+\subsubsection{Sending messages}
+Sod defines a macro for each message.  If a class $C$ defines a message $m$,
+then the macro is called @|$C$_$m$|.  The macro takes a pointer to the
+receiving object as its first argument, followed by the message arguments, if
+any, and returns the value returned by the object's effective method for the
+message (if any).  If you have a pointer to an instance of any of $C$'s
+subclasses, then you can send it the message; it doesn't matter whether the
+subclass is on the same chain.  Note that the receiver argument is evaluated
+twice, so it's not safe to write a receiver expression which has
+side-effects.
+
+For example, suppose we defined
+\begin{prog}
+  [nick = soupy]                                                \\
+  class Super: SodObject \{                                     \\ \ind
+    void msg(const char *m);                                  \-\\
+  \}                                                            \\+
+  class Sub: Super \{                                           \\ \ind
+    void soupy.msg(const char *m)
+      \{ printf("sub sent `\%s'@\\n", m); \}                  \-\\
+  \}
+\end{prog}
+then we can send the message like this:
+\begin{prog}
+  Sub *sub = /* \dots\ */;                                      \\
+  Super_msg(sub, "hello");
+\end{prog}
+
+What happens under the covers is as follows.  The structure pointed to by the
+instance pointer has a member named @|_vt|, which points to a structure
+called a `virtual table', or \emph{vtable}, which contains various pieces of
+information about the object's direct class and layout, and holds pointers to
+method entries for the messages which the object can receive.  The
+message-sending macro in the example above expands to something similar to
+\begin{prog}
+  sub@->_vt@->soupy.msg(sub, "hello");
+\end{prog}
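+Because the receiver expression appears twice in such an expansion, a
+receiver with side-effects is best evaluated into a variable first.  A
+minimal sketch, in which the array @|subs| and the index @|i| are
+hypothetical names introduced only for illustration:
+\begin{prog}
+  Sub *s = subs[i++];  /* evaluate the receiver exactly once */ \\
+  Super_msg(s, "hello");
+\end{prog}
+Passing the expression @|subs[i++]| directly as the receiver would instead
+expand to code containing two increments of @|i|.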
+
+The vtable contains other useful information, such as a pointer to the
+instance's direct class's \emph{class object} (described below).  The full
+details of the contents and layout of vtables are given in
+\xref{sec:structures.layout.vtable}.
+
+
+\subsubsection{Class objects}
+In Sod's object system, classes are objects too.  Therefore classes are
+themselves instances; the class of a class is called a \emph{metaclass}.  The
+consequences of this are explored in \xref{sec:concepts.metaclasses}.  The
+\emph{class object} has the same name as the class, suffixed with
+`@|__class|'\footnote{%
+  This is not quite true.  @|$C$__class| is actually a macro.  See
+  \xref{sec:structures.layout.additional} for the gory details.} %
+and its type is usually @|SodClass|; @|SodClass|'s nickname is @|cls|.
+
+A class object's slots contain or point to useful information, tables and
+functions for working with that class's instances.  (The @|SodClass| class
+doesn't define any messages, so it doesn't have any methods other than for
+the @|SodObject| lifecycle messages @|init| and @|teardown|; see
+\xref{sec:concepts.lifecycle}.  In Sod, a class slot containing a function
+pointer is not at all the same thing as a method.)
+
+\subsubsection{Conversions}
+Suppose one has a value of type pointer-to-class-type for some class~$C$, and
+wants to convert it to a pointer-to-class-type for some other class~$B$.
+There are three main cases to distinguish.
+\begin{itemize}
+\item If $B$ is a superclass of~$C$, in the same chain, then the conversion
+  is an \emph{in-chain upcast}.  The conversion can be performed using the
+  appropriate generated upcast macro (see below), or by simply casting the
+  pointer, using C's usual cast operator (or the \Cplusplus\ @|static_cast<>|
+  operator).
+\item If $B$ is a superclass of~$C$, in a different chain, then the
+  conversion is a \emph{cross-chain upcast}.  The conversion is more than a
+  simple type change: the pointer value must be adjusted.  If the direct
+  class of the instance in question is not known, the conversion will require
+  a lookup at runtime to find the appropriate offset by which to adjust the
+  pointer.  The conversion can be performed using the appropriate generated
+  upcast macro (see below); the general case is handled by the macro
+  \descref{mac}{SOD_XCHAIN}.
+\item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast};
+  otherwise the conversion is a~\emph{cross-cast}.  In either case, the
+  conversion can fail: the object in question might not be an instance of~$B$
+  after all.  The macro \descref{mac}{SOD_CONVERT} and the function
+  \descref{fun}{sod_convert} perform general conversions.  They return a null
+  pointer if the conversion fails.  (These are therefore Sod's analogue of the
+  \Cplusplus\ @|dynamic_cast<>| operator.)
+\end{itemize}
+The Sod translator generates macros for performing both in-chain and
+cross-chain upcasts.  For each class~$C$, and each proper superclass~$B$
+of~$C$, a macro is defined: given an argument of type pointer to class type
+of~$C$, it returns a pointer to the same instance, only with type pointer to
+class type of~$B$, adjusted as necessary in the case of a cross-chain
+conversion.  The macro is named by concatenating
+\begin{itemize}
+\item the name of class~$C$, in upper case,
+\item the characters `@|__CONV_|', and
+\item the nickname of class~$B$, in upper case;
+\end{itemize}
+e.g., if $C$ is named @|MyClass|, and $B$'s name is @|SuperClass| with
+nickname @|super|, then the macro @|MYCLASS__CONV_SUPER| converts a
+@|MyClass~*| to a @|SuperClass~*|.  See
+\xref{sec:structures.layout.additional} for the formal description.
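+
+A brief sketch of these conversions follows, using the class names from the
+example just given.  The argument order shown for @|SOD_CONVERT| (target
+class name first, then the pointer) is an assumption made here; the macro's
+own description is the definitive reference.
+\begin{prog}
+  MyClass *m = /* \dots\ */;                                    \\
+  SuperClass *s = MYCLASS__CONV_SUPER(m);                       \\
+  MyClass *d = SOD_CONVERT(MyClass, s);                         \\
+  if (!d) \{ /* the conversion failed */ \}
+\end{prog}
+The upcast on the second line cannot fail.  The general conversion on the
+third line yields a null pointer if the object turns out not to be an
+instance of the requested class, so its result should be checked before use.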
+
+%%%--------------------------------------------------------------------------
+\section{Keyword arguments} \label{sec:concepts.keywords}
+
+In standard C, the actual arguments provided to a function are matched up
+with the formal arguments given in the function definition according to their
+ordering in a list.  Unless the (rather cumbersome) machinery for dealing
+with variable-length argument tails (@|<stdarg.h>|) is used, exactly the
+correct number of arguments must be supplied, and in the correct order.
+
+A \emph{keyword argument} is matched by its distinctive \emph{name}, rather
+than by its position in a list.  Keyword arguments may be \emph{omitted},
+causing some default behaviour by the function.  A function can detect
+whether a particular keyword argument was supplied: so the default behaviour
+need not be the same as that caused by any specific value of the argument.
+
+Keyword arguments can be provided in three ways.
+\begin{enumerate}
+\item Directly, as a variable-length argument tail, consisting (for the most
+  part) of alternating keyword names, as pointers to null-terminated strings,
+  and argument values, and terminated by a null pointer.  This is somewhat
+  error-prone, and the support library defines some macros which help ensure
+  that keyword argument lists are well formed.
+\item Indirectly, through a @|va_list| object capturing a variable-length
+  argument tail passed to some other function.  Such indirect argument tails
+  have the same structure as the direct argument tails described above.
+  Because @|va_list| objects are hard to copy, the keyword-argument support
+  library consistently passes @|va_list| objects \emph{by reference}
+  throughout its programming interface.
+\item Indirectly, through a vector of @|struct kwval| objects, each of which
+  contains a keyword name, as a pointer to a null-terminated string, and the
+  \emph{address} of a corresponding argument value.  (This indirection is
+  necessary so that the items in the vector can be of uniform size.)
+  Argument vectors are rather inconvenient to use, but are the only practical
+  way in which a caller can decide at runtime which arguments to include in a
+  call, which is useful when writing wrapper functions.
+\end{enumerate}
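+
+To make the first of these forms concrete, the following sketch shows a
+call carrying a keyword argument tail written out by hand.  The function
+@|draw_line|, its argument @|win|, and its keywords are hypothetical,
+invented for illustration; in practice the support library's macros are
+preferable to spelling the tail out like this.
+\begin{prog}
+  /* one positional argument, then name/value pairs */          \\
+  draw_line(win, "width", 3, "dashed", 1, (const char *)0);
+\end{prog}
+The terminating null pointer is written with an explicit cast because, in a
+variable-length argument tail, a bare @|0| would be passed as an @|int|
+rather than as a pointer.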
+
+Perhaps surprisingly, keyword arguments have a relatively small performance
+impact.  On the author's ageing laptop, a call to a simple function, passing
+two out of three keyword arguments, takes about 30 cycles longer than calling
+a standard function which just takes integer arguments.  On the other hand,
+quite a lot of code is involved in decoding keyword arguments, so code size
+will naturally suffer.
+
+Keyword arguments are provided as a general feature for C functions.
+However, Sod has special support for messages which accept keyword arguments
+(\xref{sec:concepts.methods.keywords}); and they play an essential rôle in
+the instance construction protocol (\xref{sec:concepts.lifecycle.birth}).
+
+%%%--------------------------------------------------------------------------
+\section{Messages and methods} \label{sec:concepts.methods}
+
+Objects can be sent \emph{messages}.  A message has a \emph{name}, and
+carries a number of \emph{arguments}.  When an object is sent a message, a
+function, determined by the receiving object's class, is invoked, passing it
+the receiver and the message arguments.  This function is called the
+class's \emph{effective method} for the message.  The effective method can do
+anything a C function can do, including reading or updating program state or
+object slots, sending more messages, calling other functions, issuing system
+calls, or performing I/O; if it finishes, it may return a value, which is
+returned in turn to the message sender.
+
+The set of messages an object can receive, characterized by their names,
+argument types, and return type, is determined by the object's class.  Each
+class can define new messages, which can be received by any instance of that
+class.  The messages defined by a single class must have distinct names:
+there is no `function overloading'.  As with slots
+(\xref{sec:concepts.classes.slots}), messages defined by distinct classes are
+always distinct, even if they have the same names: references to messages are
+always qualified by the defining class's name or nickname.
+
+Messages may take any number of arguments, of any non-array value type.
+Since message sends are effectively function calls, arguments of array type
+are implicitly converted to values of the corresponding pointer type.  While
+message definitions may ascribe an array type to an argument, the formal
+argument will have pointer type, as is usual for C functions.  A message may
+accept a variable-length argument suffix, denoted @|\dots|.
+
+A class definition may include \emph{direct methods} for messages defined by
+it or any of its superclasses.
+
+Like messages, direct methods define argument lists and return types, but
+they may also have a \emph{body}, and a \emph{rôle}.
+
+A direct method need not have the same argument list or return type as its
+message.  The acceptable argument lists and return types for a method depend
+on the message, in particular its method combination
+(\xref{sec:concepts.methods.combination}), and the method's rôle.
+
+A direct method body is a block of C code, and the Sod translator usually
+defines, for each direct method, a function with external linkage, whose body
+contains a copy of the direct method body.  Within the body of a direct
+method defined for a class $C$, the variable @|me|, of type pointer to class
+type of $C$, refers to the receiving object.
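+
+As a sketch of how these pieces fit together, the hypothetical class below
+(its name, nickname, slot, and message are all invented here) defines a
+slot, a message, and a direct method whose body reaches the slot through
+@|me|.  It follows the notation of the examples in
+\xref{sec:concepts.classes.c}; the form of the slot initializer is an
+assumption.
+\begin{prog}
+  [nick = ctr]                                                  \\
+  class Counter: SodObject \{                                   \\ \ind
+    int n = 0;                                                  \\
+    int bump(int by);                                           \\
+    int ctr.bump(int by)
+      \{ me@->ctr.n += by; return (me@->ctr.n); \}            \-\\
+  \}
+\end{prog}
+A caller holding a @|Counter~*| named @|c| could then write
+@|Counter_bump(c, 2)| to send the message and receive the updated count.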
+
+
+\subsection{Effective methods and method combinations}
+\label{sec:concepts.methods.combination}
+
+For each message a direct instance of a class might receive, there is a set
+of \emph{applicable methods}, which are exactly the direct methods defined on
+the object's class and its superclasses.  These direct methods are combined
+together to form the \emph{effective method} for that particular class and
+message.  Direct methods can be combined into an effective method in
+different ways, according to the \emph{method combination} specified by the
+message.  The method combination determines which direct method rôles are
+acceptable, and, for each rôle, the appropriate argument lists and return
+types.
+
+One direct method, $M$, is said to be more or less \emph{specific} than
+another, $N$, with respect to a receiving class~$C$, if the class defining
+$M$ is respectively a more or less specific superclass of~$C$ than the class
+defining $N$.
+
+\subsubsection{The standard method combination}
+The default method combination is called the \emph{standard method
+combination}; other method combinations are useful occasionally for special
+effects.  The standard method combination accepts four direct method rôles,
+called `primary' (the default), @|before|, @|after|, and @|around|.
+
+All direct methods subject to the standard method combination must have
+argument lists which \emph{match} the message's argument list:
+\begin{itemize}
+\item the method's arguments must have the same types as the message, though
+  the arguments may have different names; and
+\item if the message accepts a variable-length argument suffix then the
+  direct method must instead have a final argument of type @|va_list|.
+\end{itemize}
+Primary and @|around| methods must have the same return type as the message;
+@|before| and @|after| methods must return @|void| regardless of the
+message's return type.
+
+If there are no applicable primary methods then no effective method is
+constructed: the vtables contain null pointers in place of pointers to method
+entry functions.
+
+\begin{figure}
+  \hbox to\hsize{\hss\hbox{\begin{tikzpicture}
+    [order/.append style={color=green!70!black},
+     code/.append style={font=\sffamily},
+     action/.append style={font=\itshape},
+     method/.append style={rectangle, draw=black, thin, fill=blue!30,
+                           text height=\ht\strutbox, text depth=\dp\strutbox,
+                           minimum width=40mm}]
+
+    \def\delgstack#1#2#3{
+      \node (#10) [method, #2] {#3};
+      \node (#11) [method, above=6mm of #10] {#3};
+      \draw [->] ($(#10.north)!.5!(#10.north west) + (0mm, 1mm)$) --
+                 ++(0mm, 4mm)
+        node [code, left=4pt, midway] {next_method};
+      \draw [<-] ($(#10.north)!.5!(#10.north east) + (0mm, 1mm)$) --
+                 ++(0mm, 4mm)
+        node [action, right=4pt, midway] {return};
+      \draw [->] ($(#11.north)!.5!(#11.north west) + (0mm, 1mm)$) --
+                 ++(0mm, 4mm)
+        node [code, left=4pt, midway] {next_method}
+        node (ld) [above] {$\smash\vdots\mathstrut$};
+      \draw [<-] ($(#11.north)!.5!(#11.north east) + (0mm, 1mm)$) --
+                 ++(0mm, 4mm)
+        node [action, right=4pt, midway] {return}
+        node (rd) [above] {$\smash\vdots\mathstrut$};
+      \draw [->] ($(ld.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+        node [code, left=4pt, midway] {next_method};
+      \draw [<-] ($(rd.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+        node [action, right=4pt, midway] {return};
+      \node (p) at ($(ld.north)!.5!(rd.north)$) {};
+      \node (#1n) [method, above=5mm of p] {#3};
+      \draw [->, order] ($(#10.south east) + (4mm, 1mm)$) --
+                          ($(#1n.north east) + (4mm, -1mm)$)
+        node [midway, right, align=left]
+        {Most to \\ least \\ specific};}
+
+    \delgstack{a}{}{@|around| method}
+    \draw [<-] ($(a0.south)!.5!(a0.south west) - (0mm, 1mm)$) --
+               ++(0mm, -4mm);
+    \draw [->] ($(a0.south)!.5!(a0.south east) - (0mm, 1mm)$) --
+               ++(0mm, -4mm)
+      node [action, right=4pt, midway] {return};
+
+    \draw [->] ($(an.north)!.6!(an.north west) + (0mm, 1mm)$) --
+               ++(-8mm, 8mm)
+      node [code, midway, left=3mm] {next_method}
+      node (b0) [method, above left = 1mm + 4mm and -6mm - 4mm] {};
+    \node (b1) [method] at ($(b0) - (2mm, 2mm)$) {};
+    \node (bn) [method] at ($(b1) - (2mm, 2mm)$) {@|before| method};
+    \draw [->, order] ($(bn.west) - (6mm, 0mm)$) -- ++(12mm, 12mm)
+      node [midway, above left, align=center] {Most to \\ least \\ specific};
+    \draw [->] ($(b0.north east) + (-10mm, 1mm)$) -- ++(8mm, 8mm)
+      node (p) {};
+
+    \delgstack{m}{above right=1mm and 0mm of an.west |- p}{Primary method}
+    \draw [->] ($(mn.north)!.5!(mn.north west) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+      node [code, left=4pt, midway] {next_method}
+      node [above right = 0mm and -8mm]
+      {$\vcenter{\hbox{\Huge\textcolor{red}{!}}}
+        \vcenter{\hbox{\begin{tabular}[c]{l}
+                         @|next_method| \\
+                         pointer is null
+                       \end{tabular}}}$};
+
+    \draw [->, color=blue, dotted]
+        ($(m0.south)!.2!(m0.south east) - (0mm, 1mm)$) --
+        ($(an.north)!.2!(an.north east) + (0mm, 1mm)$)
+      node [midway, sloped, below] {Return value};
+
+    \draw [<-] ($(an.north)!.6!(an.north east) + (0mm, 1mm)$) --
+               ++(8mm, 8mm)
+      node [action, midway, right=3mm] {return}
+      node (f0) [method, above right = 1mm and -6mm] {};
+    \node (f1) [method] at ($(f0) + (-2mm, 2mm)$) {};
+    \node (fn) [method] at ($(f1) + (-2mm, 2mm)$) {@|after| method};
+    \draw [<-, order] ($(f0.east) + (6mm, 0mm)$) -- ++(-12mm, 12mm)
+      node [midway, above right, align=center]
+      {Least to \\ most \\ specific};
+    \draw [<-] ($(fn.north west) + (6mm, 1mm)$) -- ++(-8mm, 8mm);
+
+  \end{tikzpicture}}\hss}
+
+  \caption{The standard method combination}
+  \label{fig:concepts.methods.stdmeth}
+\end{figure}
+
+The effective method for a message with standard method combination works as
+follows (see also~\xref{fig:concepts.methods.stdmeth}).
+\begin{enumerate}
+
+\item If any applicable methods have the @|around| rôle, then the most
+  specific such method, with respect to the class of the receiving object, is
+  invoked.
+
+  Within the body of an @|around| method, the variable @|next_method| is
+  defined, having pointer-to-function type.  The method may call this
+  function, as described below, any number of times.
+
+  If there are any remaining @|around| methods, then @|next_method| invokes the
+  next most specific such method, returning whichever value that method
+  returns; otherwise the behaviour of @|next_method| is to invoke the
+  @|before| methods (if any), followed by the most specific primary method,
+  followed by the @|after| methods (if any), and to return whichever value
+  was returned by the most specific primary method, as described in the
+  following items.  That is, the behaviour of the least specific @|around|
+  method's @|next_method| function is exactly the behaviour that the
+  effective method would have if there were no @|around| methods.  Note that
+  if the least-specific @|around| method calls its @|next_method| more than
+  once then the whole sequence of @|before|, primary, and @|after| methods
+  occurs multiple times.
+
+  The value returned by the most specific @|around| method is the value
+  returned by the effective method.
+
+\item If any applicable methods have the @|before| rôle, then they are all
+  invoked, starting with the most specific.
+
+\item The most specific applicable primary method is invoked.
+
+  Within the body of a primary method, the variable @|next_method| is
+  defined, having pointer-to-function type.  If there are no remaining less
+  specific primary methods, then @|next_method| is a null pointer.
+  Otherwise, the method may call the @|next_method| function any number of
+  times.
+
+  The behaviour of the @|next_method| function, if it is not null, is to
+  invoke the next most specific applicable primary method, and to return
+  whichever value that method returns.
+
+  If there are no applicable @|around| methods, then the value returned by
+  the most specific primary method is the value returned by the effective
+  method; otherwise the value returned by the most specific primary method is
+  returned to the least specific @|around| method, which called it via its
+  own @|next_method| function.
+
+\item If any applicable methods have the @|after| rôle, then they are all
+  invoked, starting with the \emph{least} specific.  (Hence, the most
+  specific @|after| method is invoked with the most `afterness'.)
+
+\end{enumerate}
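+
+The ordering rules above can be illustrated with a small model in plain C.
+This is only a sketch: it is not the code which the Sod translator
+generates, and every function name in it is hypothetical.  It assumes a
+class @|Derived| with a single superclass @|Base|, where both classes define
+@|before|, primary and @|after| methods, and @|Base| alone defines an
+@|around| method.

```c
#include <assert.h>
#include <string.h>

/* Plain-C model of the call order only.  Not Sod-generated code: all of
 * these names are hypothetical. */

static char trace[256];                 /* space-separated log of calls */

static void note(const char *s)
{
  /* Room is needed for a separator, the name, and the terminator. */
  assert(strlen(trace) + strlen(s) + 2 <= sizeof(trace));
  if (trace[0]) strcat(trace, " ");
  strcat(trace, s);
}

static void derived_before(void) { note("derived.before"); }
static void base_before(void) { note("base.before"); }
static void derived_after(void) { note("derived.after"); }
static void base_after(void) { note("base.after"); }

/* Least specific primary method: its next_method would be null. */
static int base_primary(int x) { note("base.primary"); return x + 1; }

/* Most specific primary method: extends the behaviour when it can. */
static int derived_primary(int x, int (*next_method)(int))
{
  note("derived.primary");
  return next_method ? 2*next_method(x) : x;
}

/* What the least specific around method's next_method does. */
static int inner(int x)
{
  int rc;

  derived_before(); base_before();       /* most specific first */
  rc = derived_primary(x, base_primary); /* most specific primary */
  base_after(); derived_after();         /* least specific first */
  return rc;                             /* the primary's value */
}

/* The only around method: it brackets everything else. */
static int base_around(int x, int (*next_method)(int))
{
  int rc;

  note("around.enter");
  rc = next_method(x);
  note("around.leave");
  return rc;
}

/* Model of the effective method. */
int send_message(int x) { return base_around(x, inner); }
```

+In this model, @|send_message(3)| returns 8: the value computed by the most
+specific primary method is passed back through the @|around| method
+unchanged, and the @|before| and @|after| methods contribute nothing to it.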
+
+A typical use for @|around| methods is to allow a base class to set up the
+dynamic environment appropriately for the primary methods of its subclasses,
+e.g., by claiming a lock, and releasing it afterwards.
+
+The @|next_method| function provided to methods with the primary and
+@|around| rôles accepts the same arguments, and returns the same type, as the
+message, except that one or two additional arguments are inserted at the
+front of the argument list.  The first additional argument is always the
+receiving object, @|me|.  If the message accepts a variable argument suffix,
+then the second additional argument is a @|va_list|; otherwise there is no
+second additional argument.  In the former case, a variable
+@|sod__master_ap| of type @|va_list| is defined, containing a separate copy
+of the argument pointer (so the method body can process the variable argument
+suffix itself, and still pass a fresh copy on to the next method).
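+
+The reason for keeping a separate copy is a general property of C's
+@|va_list|: once a function has consumed arguments from one, it cannot be
+rewound.  The following plain C sketch shows the idiom using @|va_copy|; it
+does not use Sod at all, and the names @|next_sum|, @|count_then_sum| and
+@|demo| are hypothetical stand-ins for a next method, a method body and a
+message send respectively.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical stand-in for a next method: adds up n int arguments. */
static int next_sum(int n, va_list ap)
{
  int total = 0;

  while (n-- > 0) total += va_arg(ap, int);
  return total;
}

/* Hypothetical method body: it scans the arguments itself, and then
 * passes the untouched master copy on. */
static int count_then_sum(int n, va_list master_ap)
{
  va_list ap;
  int i, npos = 0;

  va_copy(ap, master_ap);               /* private working copy */
  for (i = 0; i < n; i++)
    if (va_arg(ap, int) > 0) npos++;
  va_end(ap);

  /* master_ap has not been advanced, so the callee sees every argument. */
  return npos*1000 + next_sum(n, master_ap);
}

/* Hypothetical sender: captures the variable-length suffix. */
int demo(int n, ...)
{
  va_list ap;
  int rc;

  va_start(ap, n);
  rc = count_then_sum(n, ap);
  va_end(ap);
  return rc;
}
```

+Here @|demo(3, 5, -2, 7)| finds two positive arguments and a total of 10.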
+
+A method with the primary or @|around| rôle may use the convenience macro
+@|CALL_NEXT_METHOD|, which takes no arguments itself, and simply calls
+@|next_method| with appropriate arguments: the receiver @|me| pointer, the
+argument pointer @|sod__master_ap| (if applicable), and the method's
+arguments.  If the method body has overwritten its formal arguments, then
+@|CALL_NEXT_METHOD| will pass along the updated values, rather than the
+original ones.
+
+A primary or @|around| method which invokes its @|next_method| function is
+said to \emph{extend} the message behaviour; a method which does not invoke
+its @|next_method| is said to \emph{override} the behaviour.  Note that a
+method may make a decision to override or extend at runtime.
+
+\subsubsection{Aggregating method combinations}
+A number of other method combinations are provided.  They are called
+`aggregating' method combinations because, instead of invoking just the most
+specific primary method, as the standard method combination does, they invoke
+the applicable primary methods in turn and aggregate the return values from
+each.
+
+The aggregating method combinations accept the same four rôles as the
+standard method combination, and @|around|, @|before|, and @|after| methods
+work in the same way.
+
+The aggregating method combinations provided are as follows.
+\begin{description} \let\makelabel\code
+\item[progn] The message must return @|void|.  The applicable primary methods
+  are simply invoked in turn, most specific first.
+\item[sum] The message must return a numeric type.\footnote{%
+    The Sod translator doesn't check this, since it doesn't have enough
+    insight into @|typedef| names.} %
+  The applicable primary methods are invoked in turn, and their return values
+  added up.  The final result is the sum of the individual values.
+\item[product] The message must return a numeric type.  The applicable
+  primary methods are invoked in turn, and their return values multiplied
+  together.  The final result is the product of the individual values.
+\item[min] The message must return a scalar type.  The applicable primary
+  methods are invoked in turn.  The final result is the smallest of the
+  individual values.
+\item[max] The message must return a scalar type.  The applicable primary
+  methods are invoked in turn.  The final result is the largest of the
+  individual values.
+\item[and] The message must return a scalar type.  The applicable primary
+  methods are invoked in turn.  If any method returns zero then the final
+  result is zero and no further methods are invoked.  If all of the
+  applicable primary methods return nonzero, then the final result is the
+  result of the last primary method.
+\item[or] The message must return a scalar type.  The applicable primary
+  methods are invoked in turn.  If any method returns nonzero then the final
+  result is that nonzero value and no further methods are invoked.  If all of
+  the applicable primary methods return zero, then the final result is zero.
+\end{description}
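+
+The short-circuit rules for @|and| and @|or| are easy to misread, so here is
+a plain C model of three of the combinations.  It is a sketch only: the
+type @|primary_fn|, the @|combine_|\dots\ functions and the sample methods
+are all hypothetical, the methods are assumed to be listed in invocation
+order, and the sketch assumes that there is at least one applicable primary
+method.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature for a primary method in this model. */
typedef int primary_fn(void);

static int ncalls;                      /* counts method invocations */

static int m_two(void) { ncalls++; return 2; }
static int m_zero(void) { ncalls++; return 0; }
static int m_five(void) { ncalls++; return 5; }

/* Sample applicable methods, in the order in which they are invoked. */
primary_fn *const demo_methods[] = { m_two, m_zero, m_five };

/* and: stop at the first zero; otherwise the last method's value. */
int combine_and(primary_fn *const *methods, size_t n)
{
  int rc = 0;
  size_t i;

  assert(n > 0);                        /* assumption of this sketch */
  for (i = 0; i < n; i++) {
    rc = methods[i]();
    if (!rc) break;
  }
  return rc;
}

/* or: stop at the first nonzero value and return it; otherwise zero. */
int combine_or(primary_fn *const *methods, size_t n)
{
  int rc = 0;
  size_t i;

  assert(n > 0);                        /* assumption of this sketch */
  for (i = 0; i < n; i++) {
    rc = methods[i]();
    if (rc) break;
  }
  return rc;
}

/* sum: every method is invoked, and the results are added up. */
int combine_sum(primary_fn *const *methods, size_t n)
{
  int total = 0;
  size_t i;

  assert(n > 0);                        /* assumption of this sketch */
  for (i = 0; i < n; i++) total += methods[i]();
  return total;
}
```

+With the three sample methods, @|and| stops after the second method,
+@|or| stops after the first, and @|sum| invokes all three.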
+
+There is also a @|custom| aggregating method combination, which is described
+in \xref{sec:fixme.custom-aggregating-method-combination}.
+
+
+\subsection{Method entries} \label{sec:concepts.methods.entry}
+
+The effective methods for each class are determined at translation time, by
+the Sod translator.  For each effective method, one or more \emph{method
+entry functions} are constructed.  A method entry function has three
+responsibilities.
+\begin{itemize}
+\item It converts the receiver pointer to the correct type.  Method entry
+  functions can perform these conversions extremely efficiently: there are
+  separate method entries for each chain of each class which can receive a
+  message, so method entry functions are in the privileged situation of
+  knowing the \emph{exact} class of the receiving object.
+\item If the message accepts a variable-length argument tail, then two method
+  entry functions are created for each chain of each class: one receives a
+  variable-length argument tail, as intended, and captures it in a @|va_list|
+  object; the other accepts an argument of type @|va_list| in place of the
+  variable-length tail and arranges for it to be passed along to the direct
+  methods.
+\item It invokes the effective method with the appropriate arguments.  There
+  might or might not be an actual function corresponding to the effective
+  method itself: the translator may instead open-code the effective method's
+  behaviour into each method entry function; and the machinery for handling
+  `delegation chains', such as is used for @|around| methods and primary
+  methods in the standard method combination, is necessarily scattered among
+  a number of small functions.
+\end{itemize}
+
+
+\subsection{Messages with keyword arguments}
+\label{sec:concepts.methods.keywords}
+
+A message or a direct method may declare that it accepts keyword arguments.
+A message which accepts keyword arguments is called a \emph{keyword message};
+a direct method which accepts keyword arguments is called a \emph{keyword
+method}.
+
+While method combinations may set their own rules, usually keyword methods
+can only be defined on keyword messages, and all methods defined on a keyword
+message must be keyword methods.  The direct methods defined on a keyword
+message may differ in the keywords they accept, both from each other, and
+from the message.  If two applicable methods on the same message both accept
+a keyword argument with the same name, then these two keyword arguments must
+also have the same type.  Different applicable methods may declare keyword
+arguments with the same name but different defaults; see below.
+
+The keyword arguments acceptable in a message sent to an object are the
+keywords listed in the message definition, together with all of the keywords
+accepted by any applicable method.  There is no easy way to determine at
+runtime whether a particular keyword is acceptable in a message to a given
+instance.
+
+At runtime, a direct method which accepts one or more keyword arguments
+receives an additional argument named @|suppliedp|.  This argument is a small
+structure.  For each keyword argument named $k$ accepted by the direct
+method, @|suppliedp| contains a one-bit-wide bitfield member of type
+@|unsigned|, also named $k$.  If a keyword argument named $k$ was passed in
+the message, then @|suppliedp.$k$| is one, and $k$ contains the argument
+value; otherwise @|suppliedp.$k$| is zero, and $k$ contains the default value
+from the direct method definition if there was one, or an unspecified value
+otherwise.
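+
+As an illustration, the following plain C sketch models a keyword method
+which accepts keywords named @|width| and @|height|.  The structure follows
+the description above, but everything else is an assumption made for the
+example: the names @|area|, @|area_square| and @|area_rect| are
+hypothetical, the senders are written by hand rather than generated, and
+@|height| is assumed to have no default in the method definition.

```c
#include <assert.h>

/* One one-bit member per keyword accepted by the (hypothetical) method. */
struct area_suppliedp {
  unsigned width: 1;
  unsigned height: 1;
};

/* Model of the method body.  Since height has no default here, its value
 * is unspecified unless the sender supplied it, so test the flag first. */
int area(struct area_suppliedp suppliedp, int width, int height)
{
  if (!suppliedp.height) height = width;        /* fall back to a square */
  return width*height;
}

/* Hand-written stand-in for a sender which passes only width. */
int area_square(int width)
{
  struct area_suppliedp s = { 1, 0 };
  int unspecified = -12345;             /* must never be used by area */

  return area(s, width, unspecified);
}

/* Hand-written stand-in for a sender which passes both keywords. */
int area_rect(int width, int height)
{
  struct area_suppliedp s = { 1, 1 };

  return area(s, width, height);
}
```

+The point of the flag is that a method body can distinguish `not supplied'
+from any particular argument value, which a sentinel value could not do
+reliably.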
+
+%%%--------------------------------------------------------------------------
+\section{The object lifecycle} \label{sec:concepts.lifecycle}
+
+\subsection{Creation} \label{sec:concepts.lifecycle.birth}
+
+Construction of a new instance of a class involves three steps.
+\begin{enumerate}
+\item \emph{Allocation} arranges for there to be storage space for the
+  instance's slots and associated metadata.
+\item \emph{Imprinting} fills in the instance's metadata, associating the
+  instance with its class.
+\item \emph{Initialization} stores appropriate initial values in the
+  instance's slots, and maybe links it into any external data structures as
+  necessary.
+\end{enumerate}
+The \descref{mac}{SOD_DECL}[macro] handles constructing instances with
+automatic storage duration (`on the stack').  Similarly, the
+\descref{mac}{SOD_MAKE}[macro] and the \descref*{fun}{sod_make} and
+\descref{fun}{sod_makev} functions construct instances allocated from the
+standard @|malloc| heap.  Programmers can add support for other allocation
+strategies by using the \descref{mac}{SOD_INIT}[macro] and the
+\descref*{fun}{sod_init} and \descref{fun}{sod_initv} functions, which
+package up imprinting and initialization.
+
+\subsubsection{Allocation}
+Instances of most classes (specifically including those classes defined by
+Sod itself) can be held in any storage of sufficient size.  The in-memory
+layout of an instance of some class~$C$ is described by the type @|struct
+$C$__ilayout|, and if the relevant class is known at compile time then the
+best way to discover the layout size is with the @|sizeof| operator.  Failing
+that, the size required to hold an instance of $C$ is available in a slot in
+$C$'s class object, as @|$C$__class@->cls.initsz|.  The necessary alignment,
+in bytes, is provided as @|$C$__class@->cls.align|, should this be necessary.
+
+It is not in general sufficient to declare, or otherwise allocate, an object
+of the class type $C$.  The class type only describes a single chain of the
+object's layout.  It is nearly always an error to use the class type as if it
+were a \emph{complete type}, e.g., to declare objects or arrays of the class
+type, or to enquire about its size or alignment requirements.
+
+Instance layouts may be declared as objects with automatic storage duration
+(colloquially, `allocated on the stack') or allocated dynamically, e.g.,
+using @|malloc|.  They may be included as members of structures or unions, or
+elements of arrays.  Sod's runtime system doesn't retain addresses of
+instances, so, for example, Sod doesn't make using fancy allocators which
+sometimes move objects around in memory any more difficult than it needs to
+be.
+
+The following simple function correctly allocates and returns space for an
+instance of a class given a pointer to its class object @<cls>.
+\begin{prog}
+  void *allocate_instance(const SodClass *cls)                  \\ \ind
+    \{ return malloc(cls@->cls.initsz); \}
+\end{prog}
+
+\subsubsection{Imprinting}
+Once storage has been allocated, it must be \emph{imprinted} before it can be
+used as an instance of a class, e.g., before any messages can be sent to it.
+
+Imprinting an instance stores some metadata about its direct class in the
+instance structure, so that the rest of the program (and Sod's runtime
+library) can tell what sort of object it is, and how to use it.\footnote{%
+  Specifically, imprinting an instance's storage involves storing the
+  appropriate vtable pointers in the right places in it.} %
+A class object's @|imprint| slot points to a function which will correctly
+imprint storage for one of that class's instances.
+
+Once an instance's storage has been imprinted, it is technically possible to
+send messages to the instance; however the instance's slots are still
+uninitialized at this point, so the applicable methods are unlikely to do
+much of any use unless they've been written specifically for the purpose.
+
+The following simple function imprints storage at address @<p> as an instance
+of a class, given a pointer to its class object @<cls>.
+\begin{prog}
+  void imprint_instance(const SodClass *cls, void *p)           \\ \ind
+    \{ cls@->cls.imprint(p); \}
+\end{prog}
+
+\subsubsection{Initialization}
+The final step for constructing a new instance is to \emph{initialize} it, to
+establish the necessary invariants for the instance itself and the
+environment in which it operates.
+
+Details of initialization are necessarily class-specific, but typically it
+involves setting the instance's slots to appropriate values, and possibly
+linking it into some larger data structure to keep track of it.  It is
+possible for initialization methods to attempt to allocate resources, but
+this must be done carefully: there is currently no way to report an error
+from object initialization, so the object must be marked as incompletely
+initialized, and left in a state where it will be safe to tear down later.
+
+Initialization is performed by sending the imprinted instance an @|init|
+message, defined by the @|SodObject| class.  This message uses a nonstandard
+method combination which works like the standard combination, except that the
+\emph{default behaviour}, if there is no overriding method, is to initialize
+the instance's slots, as described below, and to invoke each superclass's
+initialization fragments.  This default behaviour may be invoked multiple
+times if some method calls on its @|next_method| more than once, unless some
+other method takes steps to prevent this.
+
+Slots are initialized in a well-defined order.
+\begin{itemize}
+\item Slots defined by a more specific superclass are initialized after slots
+  defined by a less specific superclass.
+\item Slots defined by the same class are initialized in the order in which
+  their definitions appear.
+\end{itemize}
+
+A class can define \emph{initialization fragments}: pieces of literal code to
+be executed to set up a new instance.  Each superclass's initialization
+fragments are executed with @|me| bound to an instance pointer of the
+appropriate superclass type, immediately after that superclass's slots (if
+any) have been initialized; therefore, fragments defined by a more specific
+superclass are executed after fragments defined by a less specific
+superclass.  A class may define more than one initialization fragment: the
+fragments are executed in the order in which they appear in the class
+definition.  It is possible for an initialization fragment to use @|return|
+or @|goto| for special control-flow effects, but this is not likely to be a
+good idea.
+
+The @|init| message accepts keyword arguments
+(\xref{sec:concepts.methods.keywords}).  The set of acceptable keywords is
+determined by the applicable methods as usual, but also by the
+\emph{initargs} defined by the receiving instance's class and its
+superclasses, which are made available to slot initializers and
+initialization fragments.
+
+There are two kinds of initarg definitions.  \emph{User initargs} are defined
+by an explicit @|initarg| item appearing in a class definition: the item
+defines a name, a type, and (optionally) a default value for the initarg.
+\emph{Slot initargs} are defined by attaching an @|initarg| property to a
+slot or slot initializer item: the property's value determines the initarg's
+name, while the type is taken from the underlying slot type; slot initargs do
+not have default values.  Both kinds define a \emph{direct initarg} for the
+containing class.  (Note that a slot may have any number of slot initargs;
+and any number of slots may have initargs with the same name.)
+
+Initargs are inherited.  The \emph{applicable} direct initargs for an @|init|
+effective method are those defined by the receiving object's class, and all
+of its superclasses.  Applicable direct initargs with the same name are
+merged to form \emph{effective initargs}.  An error is reported if two
+applicable direct initargs have the same name but different types.  The
+default value of an effective initarg is taken from the most specific
+applicable direct initarg which specifies a default value; if no applicable
+direct initarg specifies a default value then the effective initarg has no
+default.
+
+All initarg values are made available at runtime to user code --
+initialization fragments and slot initializer expressions -- through local
+variables and a @|suppliedp| structure, as in a direct method
+(\xref{sec:concepts.methods.keywords}).  Furthermore, slot initarg
+definitions influence the initialization of slots.
+
+The process for deciding how to initialize a particular slot works as
+follows.
+\begin{enumerate}
+
+\item If there are any slot initargs defined on the slot, or any of its slot
+  initializers, \emph{and} the sender supplied a value for one or more of the
+  corresponding effective initargs, then the value of the most specific such
+  initarg is stored in the slot.  (For this purpose, initargs defined earlier
+  in a class definition are more specific than initargs defined
+  later.)\footnote{%
+    This is very different from the CLOS behaviour, in which a slot is
+    initialized from the first applicable initarg in the argument list.
+    Sod's keyword-argument machinery works in two stages: firstly, the
+    argument values are collected into a structure on entry into an
+    effective method, which loses track of the order in which the arguments
+    were passed; and only then are the direct methods invoked.}
+
+\item Otherwise, if there are any slot initializers defined which include an
+  initializer expression, then the initializer expression from the most
+  specific such slot initializer is evaluated and its value stored in the
+  slot.  (A class may define at most one initializer for any particular slot,
+  so no further disambiguation is required.)
+
+\item Otherwise, the slot is left uninitialized.
+
+\end{enumerate}
+Note that the default values (if any) of effective initargs do \emph{not}
+affect this procedure.
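+
+The decision procedure can be summarized as a small plain C function.  This
+is a sketch of the rule only, not of Sod's implementation: the types
+@|struct slot_initarg| and @|enum init_source| and the function
+@|choose_slot_value| are hypothetical, slot values are assumed to be
+@|int|, and the initargs are assumed to be listed most specific first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one slot initarg, as seen at runtime. */
struct slot_initarg {
  int supplied;                         /* did the sender pass it? */
  int value;                            /* meaningful only if supplied */
};

enum init_source { FROM_INITARG, FROM_INITIALIZER, UNINITIALIZED };

/* Decide how to initialize one slot.  The args are ordered most specific
 * first; initializer is null if no initializer expression applies.
 * Initarg defaults are deliberately not consulted. */
enum init_source choose_slot_value(const struct slot_initarg *args,
                                   size_t nargs,
                                   const int *initializer,
                                   int *value_out)
{
  size_t i;

  /* Step 1: the most specific supplied slot initarg wins. */
  for (i = 0; i < nargs; i++) {
    if (args[i].supplied) {
      *value_out = args[i].value;
      return FROM_INITARG;
    }
  }

  /* Step 2: otherwise use the most specific initializer expression. */
  if (initializer) {
    *value_out = *initializer;
    return FROM_INITIALIZER;
  }

  /* Step 3: otherwise leave the slot alone. */
  return UNINITIALIZED;
}
```

+Note that the function never looks at a default value: only initargs which
+the sender actually supplied can take part in the first step.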
+
+
+\subsection{Destruction}
+\label{sec:concepts.lifecycle.death}
+
+Destruction of an instance, when it is no longer required, consists of two
+steps.
+\begin{enumerate}
+\item \emph{Teardown} releases any resources held by the instance and
+  disentangles it from any external data structures.
+\item \emph{Deallocation} releases the memory used to store the instance so
+  that it can be reused.
+\end{enumerate}
+Teardown alone, for objects which require special deallocation, or for which
+deallocation occurs automatically (e.g., instances with automatic storage
+duration, or instances whose storage will be garbage-collected), is performed
+using the \descref{fun}{sod_teardown}[function].  Destruction of instances
+allocated from the standard @|malloc| heap is done using the
+\descref{fun}{sod_destroy}[function].
+
+\subsubsection{Teardown}
+Details of teardown are necessarily class-specific, but typically it
+involves releasing resources held by the instance, and disentangling it from
+any data structures it might be linked into.
+
+Teardown is performed by sending the instance the @|teardown| message,
+defined by the @|SodObject| class.  The message returns an integer, used as a
+boolean flag.  If the message returns zero, then the instance's storage
+should be deallocated.  If the message returns nonzero, then it is safe for
+the caller to forget about the instance, but the caller should not deallocate
+its storage.
+This is \emph{not} an error return: if some teardown method fails then the
+program may be in an inconsistent state and should not continue.
+
+This simple protocol can be used, for example, to implement a reference
+counting system, as follows.
+\begin{prog}
+  [nick = ref]                                                  \\
+  class ReferenceCountedObject: SodObject \{                    \\ \ind
+    unsigned nref = 1;                                          \\-
+    void inc() \{ me@->ref.nref++; \}                           \\-
+    [role = around]                                             \\
+    int obj.teardown()                                          \\
+    \{                                                          \\ \ind
+      if (@--me@->ref.nref) return (1);                           \\
+      else return (CALL_NEXT_METHOD);                         \-\\
+    \}                                                        \-\\
+  \}
+\end{prog}
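+
+The caller's side of this protocol can be modelled in plain C as follows.
+This is a sketch under stated assumptions, not Sod code: @|struct counted|
+and the @|counted_|\dots\ functions are hypothetical, the storage is assumed
+to come from @|malloc|, and the value returned by @|counted_destroy| is an
+invention of the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted instance. */
struct counted {
  unsigned nref;                        /* outstanding references */
};

/* Create an instance holding a single reference; null on failure. */
struct counted *counted_new(void)
{
  struct counted *c = malloc(sizeof(*c));

  if (c) c->nref = 1;
  return c;
}

/* Take another reference. */
void counted_inc(struct counted *c) { c->nref++; }

/* Model of the teardown message: nonzero means `keep the storage'. */
static int counted_teardown(struct counted *c)
{
  assert(c->nref > 0);
  if (--c->nref) return 1;              /* still referenced elsewhere */
  return 0;                             /* default behaviour: really done */
}

/* Model of destruction: free the storage only if teardown returned zero.
 * Returns 1 if the storage was released, and 0 otherwise. */
int counted_destroy(struct counted *c)
{
  if (counted_teardown(c)) return 0;    /* forget it, but don't free */
  free(c);
  return 1;
}
```

+An object created with @|counted_new| and then passed once to
+@|counted_inc| survives the first call to @|counted_destroy| and is
+released by the second.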
+
+The @|teardown| message uses a nonstandard method combination which works
+like the standard combination, except that the \emph{default behaviour}, if
+there is no overriding method, is to execute each superclass's teardown
+fragments, and to return zero.  This default behaviour may be invoked
+multiple times if some method calls on its @|next_method| more than once,
+unless some other method takes steps to prevent this.
+
+A class can define \emph{teardown fragments}: pieces of literal code to be
+executed to shut down an instance.  Each superclass's teardown fragments are
+executed with @|me| bound to an instance pointer of the appropriate
+superclass type; fragments defined by a more specific superclass are executed
+before fragments defined by a less specific superclass.  A class may define
+more than one teardown fragment: the fragments are executed in the order in
+which they appear in the class definition.  It is possible for a
+teardown fragment to use @|return| or @|goto| for special control-flow
+effects, but this is not likely to be a good idea.  Similarly, it's probably
+a better idea to use an @|around| method to influence the return value than
+to write an explicit @|return| statement in a teardown fragment.
+
+\subsubsection{Deallocation}
+The details of instance deallocation are obviously specific to the allocation
+strategy used by the instance, and this is often orthogonal to the object's
+class.
+
+The code which makes the decision to destroy an object may often not be aware
+of the object's direct class.  Low-level details of deallocation often
+require the proper base address of the instance's storage, which can be
+determined using the \descref{mac}{SOD_INSTBASE}[macro].
+
+%%%--------------------------------------------------------------------------
+\section{Metaclasses} \label{sec:concepts.metaclasses}
+
+In Sod, every object is an instance of some class, and -- unlike, say,
+\Cplusplus\ -- classes are proper objects.  It follows that, in Sod, every
+class~$C$ is itself an instance of some class~$M$, which is called $C$'s
+\emph{metaclass}.  Metaclass instances are usually constructed statically, at
+compile time, and marked read-only.
+
+As an added complication, Sod classes, and other metaobjects such as
+messages, methods, slots and so on, also have classes \emph{at translation
+time}.  These translation-time metaclasses are not Sod classes; they are CLOS
+classes, implemented in Common Lisp.
+
+
+\subsection{Runtime metaclasses}
+\label{sec:concepts.metaclasses.runtime}
+
+Like other classes, metaclasses can declare messages, and define slots and
+methods.  Slots defined by the metaclass are called \emph{class slots}, as
+opposed to \emph{instance slots}.  Similarly, messages and methods defined by
+the metaclass are termed \emph{class messages} and \emph{class methods}
+respectively, though these are used much less frequently.
+
+\subsubsection{The braid}
+Every object is an instance of some class.  There are only finitely many
+classes.
+
+\begin{figure}
+  \centering
+  \begin{tikzpicture}
+    \node[lit] (obj) {SodObject};
+    \node[lit] (cls) [right=10mm of obj] {SodClass};
+    \draw [->, dashed] (obj) to[bend right] (cls);
+    \draw [->] (cls) to[bend right] (obj);
+    \draw [->, dashed] (cls) to[loop right] (cls);
+  \end{tikzpicture}
+  \qquad
+  \fbox{\ \begin{tikzpicture}
+    \node (subclass) {subclass of};
+    \node (instance) [below=\jot of subclass] {instance of};
+    \draw [->] ($(subclass.west) - (10mm, 0)$) -- ++(8mm, 0);
+    \draw [->, dashed] ($(instance.west) - (10mm, 0)$) -- ++(8mm, 0);
+  \end{tikzpicture}}
+  \caption{The Sod braid} \label{fig:concepts.metaclasses.braid}
+\end{figure}
+
+Consider the directed graph whose nodes are classes, and where there is an
+arc from $C$ to $D$ if and only if $C$ is an instance of $D$.  There are only
+finitely many nodes.  Every node has an arc leaving it, because every object
+-- and hence every class -- is an instance of some class.  Therefore this
+graph must contain at least one cycle.
+
+In Sod, this situation is resolved in the simplest manner possible:
+@|SodClass| is the only predefined metaclass, and it is an instance of
+itself.  The only other predefined class is @|SodObject|, which is also an
+instance of @|SodClass|.  There is exactly one root class, namely
+@|SodObject|; consequently, @|SodClass| is a direct subclass of @|SodObject|.
+
+\Xref{fig:concepts.metaclasses.braid} shows a diagram of this situation.
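+
+As a minimal sketch, not part of Sod itself, the following Python fragment
+models the instance-of arcs of the braid as a dictionary and follows them
+until a class repeats.  The function name @|find_cycle| and the dictionary
+representation are assumptions made purely for illustration; only the two
+class names come from the text above.
+
```python
# Toy model of the instance-of relation.  Only the two class names are taken
# from the text; the representation is an illustrative assumption.
INSTANCE_OF = {
    "SodObject": "SodClass",  # SodObject is an instance of SodClass
    "SodClass": "SodClass",   # SodClass is an instance of itself
}


def find_cycle(start, instance_of):
    """Follow instance-of arcs from `start` until a class repeats.

    Every class has exactly one outgoing arc and there are finitely many
    classes, so the walk must eventually revisit a class.  The classes
    visited from that point onwards form the cycle, which is returned.
    """
    position = {}  # class name -> index at which it was first visited
    path = []
    node = start
    while node not in position:
        position[node] = len(path)
        path.append(node)
        node = instance_of[node]
    return path[position[node]:]


cycle_from_object = find_cycle("SodObject", INSTANCE_OF)
cycle_from_class = find_cycle("SodClass", INSTANCE_OF)
```
+
+Starting from either predefined class, the walk ends in the one-element
+cycle consisting of @|SodClass| alone.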
+
+\subsubsection{Class slots and initializers}
+Instance initializers were described in \xref{sec:concepts.classes.slots}.  A
+class can also define \emph{class initializers}, which provide values for
+slots defined by its metaclass.  The initial value for a class slot is
+determined as follows.
+\begin{itemize}
+\item Slots with a nonstandard slot class may be initialized by custom Lisp
+  code.  For example, all of the slots defined by @|SodClass| are of this
+  kind.  User initializers are not permitted for such slots.
+\item If the class or any of its superclasses defines a class initializer for
+  the slot, then the class initializer defined by the most specific such
+  class is used.
+\item Otherwise, if the metaclass or one of its superclasses defines an
+  instance initializer, then the instance initializer defined by the most
+  specific such class is used.
+\item Otherwise there is no initializer, and an error will be reported.
+\end{itemize}
+Initializers for class slots must be constant expressions (for scalar slots)
+or aggregate initializers containing constant expressions.
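+
+The lookup order described above can be sketched in Python as follows.  This
+is an illustrative model only, not the translator's actual code: the function
+name, the dictionary layout, and the class and slot names in the example are
+all hypothetical.  It also assumes that each list of classes is ordered with
+the most specific class first, and it leaves out the first case (slots
+initialized by custom Lisp code).
+
```python
def class_slot_initializer(slot, class_cpl, metaclass_cpl,
                           class_inits, instance_inits):
    """Return (origin, value) for a class slot, or raise LookupError.

    class_cpl      -- the class and its superclasses, most specific first
    metaclass_cpl  -- the metaclass and its superclasses, most specific first
    class_inits    -- {class name: {slot: value}} for class initializers
    instance_inits -- {class name: {slot: value}} for instance initializers
    (All four representations are assumptions made for this sketch.)
    """
    # A class initializer from the most specific class wins.
    for cls in class_cpl:
        if slot in class_inits.get(cls, {}):
            return ("class initializer", class_inits[cls][slot])
    # Otherwise fall back to the metaclass's instance initializers.
    for meta in metaclass_cpl:
        if slot in instance_inits.get(meta, {}):
            return ("instance initializer", instance_inits[meta][slot])
    # Otherwise there is no initializer, which is an error.
    raise LookupError(f"no initializer for class slot {slot!r}")


# Hypothetical example: Derived is a subclass of Base, with metaclass
# CountingClass, which defines the class slots `count` and `label`.
CLASS_CPL = ["Derived", "Base", "SodObject"]
METACLASS_CPL = ["CountingClass", "SodClass", "SodObject"]
CLASS_INITS = {"Base": {"count": 1}}
INSTANCE_INITS = {"CountingClass": {"count": 0, "label": "none"}}

count_init = class_slot_initializer(
    "count", CLASS_CPL, METACLASS_CPL, CLASS_INITS, INSTANCE_INITS)
label_init = class_slot_initializer(
    "label", CLASS_CPL, METACLASS_CPL, CLASS_INITS, INSTANCE_INITS)
```
+
+In this example, @|count| takes its value from the class initializer
+inherited from the hypothetical class @|Base|, whereas @|label| falls back to
+the instance initializer supplied by the metaclass.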
+
+\subsubsection{Metaclass selection and consistency}
+Sod enforces a \emph{metaclass consistency rule}: if $C$ has metaclass $M$,
+then any subclass of $C$ must have a metaclass which is a subclass of $M$.
+
+The definition of a new class can name the new class's metaclass explicitly,
+by defining a @|metaclass| property; the Sod translator will verify that the
+choice of metaclass is acceptable.
+
+If no @|metaclass| property is given, then the translator will select a
+default metaclass as follows.  Let $C_1$, $C_2$, \dots, $C_n$ be the direct
+superclasses of the new class, and let $M_1$, $M_2$, \dots, $M_n$ be their
+respective metaclasses (not necessarily distinct).  If there exists exactly
+one minimal metaclass $M_i$, i.e., there exists an $i$, with $1 \le i \le n$,
+such that $M_i$ is a subclass of every $M_j$, for $1 \le j \le n$, then $M_i$
+is selected as the new class's metaclass.  Otherwise the situation is
+ambiguous and an error will be reported.  Usually, the ambiguity can be
+resolved satisfactorily by defining a new class $M^*$ as a direct subclass of
+the minimal $M_j$.
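+
+As a hedged illustration of the default selection rule and the consistency
+check, the following Python sketch models classes as names in a dictionary
+mapping each class to its direct superclasses.  The function names and the
+example classes (@|A|, @|B|, @|MetaA|, @|MetaB|, @|MetaAB|) are hypothetical,
+and the subclass relation is taken to be reflexive, as the rule above
+requires.
+
```python
def is_subclass(c, d, supers):
    """True if c is d, or c is a (possibly indirect) subclass of d."""
    return c == d or any(is_subclass(s, d, supers) for s in supers[c])


def default_metaclass(direct_supers, metaclass_of, supers):
    """Pick the metaclass which is a subclass of all the others, if any."""
    metas = [metaclass_of[c] for c in direct_supers]
    for m in metas:
        if all(is_subclass(m, other, supers) for other in metas):
            return m
    raise ValueError("ambiguous metaclass: no unique minimal candidate")


def is_consistent(meta, direct_supers, metaclass_of, supers):
    """Check an explicitly chosen metaclass against the consistency rule."""
    return all(is_subclass(meta, metaclass_of[c], supers)
               for c in direct_supers)


# Direct superclasses of each metaclass in the hypothetical example.
SUPERS = {
    "SodObject": [],
    "SodClass": ["SodObject"],
    "MetaA": ["SodClass"],
    "MetaB": ["SodClass"],
    "MetaAB": ["MetaA", "MetaB"],
}
# Metaclasses of the ordinary classes in the example.
METACLASS_OF = {"SodObject": "SodClass", "A": "MetaA", "B": "MetaB"}

plain = default_metaclass(["SodObject"], METACLASS_OF, SUPERS)
mixed = default_metaclass(["A", "SodObject"], METACLASS_OF, SUPERS)
```
+
+A class with direct superclasses @|A| and @|B| has no default metaclass in
+this model, because neither @|MetaA| nor @|MetaB| is a subclass of the other.
+Naming @|MetaAB| explicitly satisfies the consistency check.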
+
+
+\subsection{Translation-time metaobjects}
+\label{sec:concepts.metaclasses.compile-time}
+
+Within the translator, modules, classes, slots and initializers, messages and
+methods are all represented as instances of classes.  Since the translator is
+written in Common Lisp, these translation-time metaobject classes are all
+CLOS classes.  Extensions can influence the translator's behaviour -- and
+hence the layout and behaviour of instances at runtime -- by subclassing the
+built-in metaobject classes and implementing methods on appropriate generic
+functions.
+
+Metaobject classes are chosen in a fairly standard way.
+\begin{itemize}
+\item All metaobject definitions support a symbol-valued property, usually
+  named @|@<thing>{}_class| (e.g., @|slot_class|, @|method_class|), which sets
+  the metaobject class explicitly.  (The class for a class metaobject is
+  taken from the @|lisp_class| property, because @|class_class| seems less
+  meaningful.)
+\item Failing that, the metaobject's parents choose a default metaobject
+  class, based on the new metaobject's properties; i.e., slots and messages
+  have their metaobject classes chosen by the defining class metaobject;
+  initializer and initarg classes are chosen by the defining class metaobject
+  and the direct slot metaobject; and method classes are chosen by the
+  defining class metaobject and the message metaobject.
+\item Classes have no parents; instead, the default is simply to use the
+  built-in metaobject class @|sod-class|.
+\item Modules are a special case because the property syntax is rather
+  awkward.  All modules are initially created as instances of the built-in
+  metaobject class @|module|.  Once the module has been parsed completely,
+  the module metaobject's class is changed, using @|change-class|, to the
+  class specified in the module's property set.
+\end{itemize}
+
+%%%--------------------------------------------------------------------------
+\section{Compatibility considerations} \label{sec:concepts.compatibility}
+
+Sod doesn't make source-level compatibility especially difficult.  As long as
+classes, slots, and messages don't change names or disappear, and slots and
+messages retain their approximate types, everything will be fine.
+
+Binary compatibility is much more difficult.  Unfortunately, Sod classes have
+rather fragile binary interfaces.\footnote{%
+  Research suggestion: investigate alternative instance and vtable layouts
+  which improve binary compatibility, probably at the expense of instance
+  compactness, and efficiency of slot access and message sending.  There may
+  be interesting trade-offs to be made.} %
+
+If instances are allocated \fixme{incomplete}
  
  %%%----- That's all, folks --------------------------------------------------
  
  