	1	%%% --latex--
	2	%%%
	3	%%% Conceptual background
	4	%%%
	5	%%% (c) 2015 Straylight/Edgeware
	6	%%%
	7
	8	%%%----- Licensing notice ---------------------------------------------------
	9	%%%
	10	%%% This file is part of the Sensible Object Design, an object system for C.
	11	%%%
	12	%%% SOD is free software; you can redistribute it and/or modify
	13	%%% it under the terms of the GNU General Public License as published by
	14	%%% the Free Software Foundation; either version 2 of the License, or
	15	%%% (at your option) any later version.
	16	%%%
	17	%%% SOD is distributed in the hope that it will be useful,
	18	%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
	19	%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
	20	%%% GNU General Public License for more details.
	21	%%%
	22	%%% You should have received a copy of the GNU General Public License
	23	%%% along with SOD; if not, write to the Free Software Foundation,
	24	%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
	25
	26	\chapter{Concepts} \label{ch:concepts}
	27
	28	%%%--------------------------------------------------------------------------
	29	\section{Modules} \label{sec:concepts.modules}
	30
	31	A \emph{module} is the top-level syntactic unit of input to the Sod
	32	translator. As described above, given an input module, the translator
	33	generates C source and header files.
	34
	35	A module can \emph{import} other modules. This makes the type names and
	36	classes defined in those other modules available to class definitions in the
	37	importing module. Sod's module system is intentionally very simple. There
	38	are no private declarations or attempts to hide things.
	39
	40	As well as importing existing modules, a module can include a number of
	41	different kinds of \emph{items}:
	42	\begin{itemize}
	43	\item \emph{class definitions} describe new classes, possibly in terms of
	44	existing classes;
	45	\item \emph{type name declarations} introduce new type names to Sod's
	46	parser;\footnote{%
	47	This is unfortunately necessary because C syntax, upon which Sod's input
	48	language is based for obvious reasons, needs to treat type names
	49	differently from other kinds of identifiers.} %
	50	and
	51	\item \emph{code fragments} contain literal C code to be dropped into an
	52	appropriate place in an output file.
	53	\end{itemize}
	54	Each kind of item, and, indeed, a module as a whole, can have a collection of
	55	\emph{properties} associated with it. A property has a \emph{name} and a
	56	\emph{value}. Properties are an open-ended way of attaching additional
	57	information to module items, so extensions can make use of them without
	58	having to implement additional syntax.
	59
	60	%%%--------------------------------------------------------------------------
	61	\section{Classes, instances, and slots} \label{sec:concepts.classes}
	62
	63	For the most part, Sod takes a fairly traditional view of what it means to be
	64	an object system.
	65
	66	An \emph{object} maintains \emph{state} and exhibits \emph{behaviour}.
	67	(Here, we're using the term `object' in the usual sense of `object-oriented
	68	programming', rather than that of the ISO~C standard. Once we have defined
	69	an `instance' below, we shall generally prefer that term, so as to prevent
	70	further confusion between these two uses of the word.)
	71
	72	An object's state is maintained in named \emph{slots}, each of which can
	73	store a C value of an appropriate (scalar or aggregate) type. An object's
	74	behaviour is stimulated by sending it \emph{messages}. A message has a name,
	75	and may carry a number of arguments, which are C values; sending a message
	76	may result in the state of the receiving object (or other objects) being changed,
	77	and a C value being returned to the sender.
	78
	79	Every object is a \emph{direct instance} of exactly one \emph{class}. The
	80	class determines which slots its instances have, which messages its instances
	81	can be sent, and which methods are invoked when those messages are received.
	82	The Sod translator's main job is to read class definitions and convert them
	83	into appropriate C declarations, tables, and functions. An object cannot
	84	(usually) change its direct class, and the direct class of an object is not
	85	affected by, for example, the static type of a pointer to it.
	86
	87	If an object~$x$ is a direct instance of some class~$C$, then we say that $C$
	88	is \emph{the class of}~$x$. Note that the class of an object is a property
	89	of the object's value at runtime, and not of C's compile-time type system.
	90	We shall be careful in distinguishing C's compile-time notion of \emph{type}
	91	from Sod's run-time notion of \emph{class}.
	92
	93
	94	\subsection{Superclasses and inheritance}
	95	\label{sec:concepts.classes.inherit}
	96
	97	\subsubsection{Class relationships}
	98	Each class has zero or more \emph{direct superclasses}.
	99
	100	A class with no direct superclasses is called a \emph{root class}. The Sod
	101	runtime library includes a root class named @|SodObject|; making new root
	102	classes is somewhat tricky, and won't be discussed further here.
	103
	104	Classes can have more than one direct superclass, i.e., Sod supports
	105	\emph{multiple inheritance}. A Sod class definition for a class~$C$ lists
	106	the direct superclasses of $C$ in a particular order. This order is called
	107	the \emph{local precedence order} of $C$, and the list which consists of $C$
	108	followed by $C$'s direct superclasses in local precedence order is called
	109	$C$'s \emph{local precedence list}.
	110
	111	The multiple inheritance in Sod works similarly to multiple inheritance in
	112	Lisp-like languages, such as Common Lisp, EuLisp, Dylan, and Python, which is
	113	very different from how multiple inheritance works in \Cplusplus.\footnote{%
	114	The latter can be summarized as `badly'. By default in \Cplusplus, an
	115	instance receives an additional copy of a superclass's state for each path
	116	through the class graph from the instance's direct class to that
	117	superclass, though this behaviour can be overridden by declaring
	118	superclasses to be @|virtual|. Also, \Cplusplus\ offers only trivial
	119	method combination (\xref{sec:concepts.methods}), leaving programmers to
	120	deal with delegation manually and (usually) statically.} %
	121
	122	If $C$ is a class, then the \emph{superclasses} of $C$ are
	123	\begin{itemize}
	124	\item $C$ itself, and
	125	\item the superclasses of each of $C$'s direct superclasses.
	126	\end{itemize}
	127	The \emph{proper superclasses} of a class $C$ are the superclasses of $C$
	128	except for $C$ itself. If a class $B$ is a (direct, proper) superclass of
	129	$C$, then $C$ is a \emph{(direct, proper) subclass} of $B$. If $C$ is a root
	130	class then the only superclass of $C$ is $C$ itself, and $C$ has no proper
	131	superclasses.
	132
	133	If an object is a direct instance of class~$C$ then the object is also an
	134	(indirect) \emph{instance} of every superclass of $C$.
	135
	136	If $C$ has a proper superclass $B$, then $B$ must not have $C$ as a direct
	137	superclass. In different terms, if we construct a directed graph, whose
	138	nodes are classes, and draw an arc from each class to each of its direct
	139	superclasses, then this graph must be acyclic. In yet other terms, the `is a
	140	superclass of' relation is a partial order on classes.
	141
	142	\subsubsection{The class precedence list}
	143	This partial order is not quite sufficient for our purposes. For each class
	144	$C$, we shall need to extend it into a total order on $C$'s superclasses.
	145	This calculation is called \emph{superclass linearization}, and the result is
	146	a \emph{class precedence list}, which lists each of $C$'s superclasses
	147	exactly once. If a superclass $B$ precedes or follows some other superclass
	148	$A$ in $C$'s class precedence list, then we say that $B$ is respectively a
	149	more or less \emph{specific} superclass of $C$ than $A$.
	150
	151	The superclass linearization algorithm isn't fixed, and extensions to the
	152	translator can introduce new linearizations for special effects, but the
	153	following properties are expected to hold.
	154	\begin{itemize}
	155	\item The first class in $C$'s class precedence list is $C$ itself; i.e.,
	156	$C$ is always its own most specific superclass.
	157	\item If $A$ and $B$ are both superclasses of $C$, and $A$ is a proper
	158	superclass of $B$ then $A$ appears after $B$ in $C$'s class precedence
	159	list, i.e., $B$ is a more specific superclass of $C$ than $A$ is.
	160	\end{itemize}
	161	The default linearization algorithm used in Sod is the \emph{C3} algorithm,
	162	which has a number of good properties described
	163	in~\cite{barrett-1996:monot-super-linear-dylan}. It works as follows.
	164	\begin{itemize}
	165	\item A \emph{merge} of some number of input lists is a single list
	166	containing each item that is in any of the input lists exactly once, and no
	167	other items; if an item $x$ appears before an item $y$ in any input list,
	168	then $x$ also appears before $y$ in the merge. If a collection of lists
	169	have no merge then they are said to be \emph{inconsistent}.
	170	\item The class precedence list of a class $C$ is a merge of the local
	171	precedence list of $C$ together with the class precedence lists of each of
	172	$C$'s direct superclasses.
	173	\item If there are no such merges, then the definition of $C$ is invalid.
	174	\item Suppose that there are multiple candidate merges. Consider the
	175	earliest position in these candidate merges at which they disagree. The
	176	\emph{candidate classes} at this position are the classes appearing at this
	177	position in the candidate merges. Each candidate class must be a
	178	superclass of distinct direct superclasses of $C$, since otherwise the
	179	candidates would be ordered by their common subclass's class precedence
	180	list. The class precedence list contains, at this position, that candidate
	181	class whose subclass appears earliest in $C$'s local precedence order.
	182	\end{itemize}
	183
	184	\begin{figure}
	185	\centering
	186	\begin{tikzpicture}[x=7.5mm, y=-14mm, baseline=(current bounding box.east)]
	187	\node[lit] at ( 0, 0) (R) {SodObject};
	188	\node[lit] at (-3, +1) (A) {A}; \draw[->] (A) -- (R);
	189	\node[lit] at (-1, +1) (B) {B}; \draw[->] (B) -- (R);
	190	\node[lit] at (+1, +1) (C) {C}; \draw[->] (C) -- (R);
	191	\node[lit] at (+3, +1) (D) {D}; \draw[->] (D) -- (R);
	192	\node[lit] at (-2, +2) (E) {E}; \draw[->] (E) -- (A);
	193	\draw[->] (E) -- (B);
	194	\node[lit] at (+2, +2) (F) {F}; \draw[->] (F) -- (A);
	195	\draw[->] (F) -- (D);
	196	\node[lit] at (-1, +3) (G) {G}; \draw[->] (G) -- (E);
	197	\draw[->] (G) -- (C);
	198	\node[lit] at (+1, +3) (H) {H}; \draw[->] (H) -- (F);
	199	\node[lit] at ( 0, +4) (I) {I}; \draw[->] (I) -- (G);
	200	\draw[->] (I) -- (H);
	201	\end{tikzpicture}
	202	\quad
	203	\vrule
	204	\quad
	205	\begin{minipage}[c]{0.45\hsize}
	206	\begin{nprog}
	207	class A: SodObject \{ \}\quad\=@/* @|A|, @|SodObject| */ \\
	208	class B: SodObject \{ \}\>@/* @|B|, @|SodObject| */ \\
	209	class C: SodObject \{ \}\>@/* @|C|, @|SodObject| */ \\
	210	class D: SodObject \{ \}\>@/* @|D|, @|SodObject| */ \\+
	211	class E: A, B \{ \}\quad\=@/* @|E|, @|A|, @|B|, \dots */ \\
	212	class F: A, D \{ \}\>@/* @|F|, @|A|, @|D|, \dots */ \\+
	213	class G: E, C \{ \}\>@/* @|G|, @|E|, @|A|,
	214	@|B|, @|C|, \dots */ \\
	215	class H: F \{ \}\>@/* @|H|, @|F|, @|A|, @|D|, \dots */ \\+
	216	class I: G, H \{ \}\>@/* @|I|, @|G|, @|E|, @|H|, @|F|,
	217	@|A|, @|B|, @|C|, @|D|, \dots */
	218	\end{nprog}
	219	\end{minipage}
	220
	221	\caption{An example class graph and class precedence lists}
	222	\label{fig:concepts.classes.cpl-example}
	223	\end{figure}
	224
	225	\begin{example}
	226	Consider the class relationships shown in
	227	\xref{fig:concepts.classes.cpl-example}.
	228
	229	\begin{itemize}
	230
	231	\item @|SodObject| has no proper superclasses. Its class precedence list
	232	is therefore simply $\langle @|SodObject| \rangle$.
	233
	234	\item In general, if $X$ is a direct subclass only of $Y$, and $Y$'s class
	235	precedence list is $\langle Y, \ldots \rangle$, then $X$'s class
	236	precedence list is $\langle X, Y, \ldots \rangle$. This explains $A$,
	237	$B$, $C$, $D$, and $H$.
	238
	239	\item $E$'s list is found by merging its local precedence list $\langle E,
	240	A, B \rangle$ with the class precedence lists of its direct superclasses,
	241	which are $\langle A, @|SodObject| \rangle$ and $\langle B, @|SodObject|
	242	\rangle$. Clearly, @|SodObject| must be last, and $E$'s local precedence
	243	list orders the rest, giving $\langle E, A, B, @|SodObject| \rangle$.
	244	$F$ is similar.
	245
	246	\item We determine $G$'s class precedence list by merging the three lists
	247	$\langle G, E, C \rangle$, $\langle E, A, B, @|SodObject| \rangle$, and
	248	$\langle C, @|SodObject| \rangle$. The class precedence list begins
	249	$\langle G, E, \ldots \rangle$, but the individual lists don't order $A$
	250	and $C$. Comparing these to $G$'s direct superclasses, we see that $A$
	251	is a superclass of $E$, while $C$ is a superclass of -- indeed equal to
	252	-- $C$; so $A$ must precede $C$, as must $B$, and the final list is
	253	$\langle G, E, A, B, C, @|SodObject| \rangle$.
	254
	255	\item Finally, we determine $I$'s class precedence list by merging $\langle
	256	I, G, H \rangle$, $\langle G, E, A, B, C, @|SodObject| \rangle$, and
	257	$\langle H, F, A, D, @|SodObject| \rangle$. The list begins $\langle I,
	258	G, \ldots \rangle$, and then we must break a tie between $E$ and $H$; but
	259	$E$ is a superclass of $G$, so $E$ wins. Next, $H$ and $F$ must precede
	260	$A$, since these are ordered by $H$'s class precedence list. Then $B$
	261	and $C$ precede $D$, since the former are superclasses of $G$, and the
	262	final list is $\langle I, G, E, H, F, A, B, C, D, @|SodObject| \rangle$.
	263
	264	\end{itemize}
	265
	266	(This example combines elements from
	267	\cite{barrett-1996:monot-super-linear-dylan} and
	268	\cite{ducournau-1994:monot-multip-inher-linear}.)
	269	\end{example}
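
The linearization worked through in the example can be reproduced
mechanically. The following is a minimal sketch in plain C, and not the
translator's own code. It uses the familiar head-selection formulation of
C3: repeatedly take the first list head which appears in no list's tail.
This formulation is assumed here to agree with the description given
earlier. All of the structure and function names are invented for
illustration, and the class table simply transcribes the figure.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXCLS 16                       /* capacity of one precedence list */
#define MAXSUP 4                        /* most direct superclasses */

/* Hypothetical class record: direct superclasses are indices into
 * `classes', listed in local precedence order. */
struct cls { const char *name; int nsupers; int supers[MAXSUP]; };

static const struct cls classes[] = {
  { "SodObject", 0, { 0 } },            /* 0 */
  { "A", 1, { 0 } }, { "B", 1, { 0 } }, /* 1, 2 */
  { "C", 1, { 0 } }, { "D", 1, { 0 } }, /* 3, 4 */
  { "E", 2, { 1, 2 } },                 /* 5 */
  { "F", 2, { 1, 4 } },                 /* 6 */
  { "G", 2, { 5, 3 } },                 /* 7 */
  { "H", 1, { 6 } },                    /* 8 */
  { "I", 2, { 7, 8 } }                  /* 9 */
};

/* An input list to the merge; items before `pos' are already consumed. */
struct list { int n, pos; int v[MAXCLS]; };

/* Does class c appear in l after l's current head? */
static int in_tail(const struct list *l, int c)
{
  int i;
  for (i = l->pos + 1; i < l->n; i++) if (l->v[i] == c) return (1);
  return (0);
}

/* Store the class precedence list of class c in out (room for MAXCLS
 * entries).  Return its length, or -1 if the lists are inconsistent. */
static int cpl(int c, int *out)
{
  const struct cls *k = &classes[c];
  struct list in[MAXSUP + 1];
  int nin = 0, nout = 0, i, j;

  /* Inputs: each direct superclass's list, then the local order. */
  for (i = 0; i < k->nsupers; i++, nin++) {
    in[nin].pos = 0; in[nin].n = cpl(k->supers[i], in[nin].v);
    if (in[nin].n < 0) return (-1);
  }
  in[nin].pos = 0; in[nin].n = k->nsupers;
  for (i = 0; i < k->nsupers; i++) in[nin].v[i] = k->supers[i];
  nin++;

  out[nout++] = c;                      /* c is its own most specific */
  for (;;) {
    int pick = -1, pending = 0;
    /* Take the first head which is in no list's tail. */
    for (i = 0; i < nin && pick < 0; i++) {
      if (in[i].pos >= in[i].n) continue;
      pending = 1;
      for (j = 0; j < nin; j++)
        if (in_tail(&in[j], in[i].v[in[i].pos])) break;
      if (j == nin) pick = in[i].v[in[i].pos];
    }
    if (pick < 0) return (pending ? -1 : nout);
    out[nout++] = pick;
    /* Consume the chosen class wherever it is a head. */
    for (i = 0; i < nin; i++)
      if (in[i].pos < in[i].n && in[i].v[in[i].pos] == pick) in[i].pos++;
  }
}

/* Format the names in class c's precedence list, separated by spaces. */
static const char *cpl_string(int c, char *buf, size_t sz)
{
  int out[MAXCLS], n = cpl(c, out), i;
  assert(sz > 0);
  buf[0] = 0;
  for (i = 0; i < n; i++) {
    size_t len = strlen(buf);
    snprintf(buf + len, sz - len, "%s%s", i ? " " : "", classes[out[i]].name);
  }
  return (buf);
}
```

Applied to class $I$ (index 9 in the table), the sketch produces the same
order as the figure: $I$, $G$, $E$, $H$, $F$, $A$, $B$, $C$, $D$,
SodObject.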
	270
	271	\subsubsection{Class links and chains}
	272	The definition for a class $C$ may distinguish one of its proper superclasses
	273	as being the \emph{link superclass} for class $C$. Not every class need have
	274	a link superclass, and the link superclass of a class $C$, if it exists, need
	275	not be a direct superclass of $C$.
	276
	277	Superclass links must obey the following rule: if $C$ is a class, then there
	278	must be no three distinct superclasses $X$, $Y$ and~$Z$ of $C$ such that $Z$
	279	is the link superclass of both $X$ and $Y$. As a consequence of this rule,
	280	the superclasses of $C$ can be partitioned into linear \emph{chains}, such
	281	that superclasses $A$ and $B$ are in the same chain if and only if one can
	282	trace a path from $A$ to $B$ by following superclass links, or \emph{vice
	283	versa}.
	284
	285	Since a class links only to one of its proper superclasses, the classes in a
	286	chain are naturally ordered from most- to least-specific. The least specific
	287	class in a chain is called the \emph{chain head}; the most specific class is
	288	the \emph{chain tail}. Chains are often named after their chain head
	289	classes.
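
As a rough illustration of how links partition classes into chains, the
sketch below represents each class by a small integer and records its link
superclass in an array. This representation, and every name in it, is an
assumption made for the example; it says nothing about how the translator
stores this information. Links are assumed to lead only to proper
superclasses, so following them always terminates.

```c
#include <assert.h>

#define NOLINK (-1)                     /* class has no link superclass */

/* Check the rule on a set of n classes: no two distinct classes may name
 * the same link superclass.  link[i] is the index of class i's link
 * superclass, or NOLINK. */
static int links_valid(const int *link, int n)
{
  int i, j;
  for (i = 0; i < n; i++)
    for (j = i + 1; j < n; j++)
      if (link[i] != NOLINK && link[i] == link[j]) return (0);
  return (1);
}

/* Follow links from class c to the least specific class in its chain. */
static int chain_head(const int *link, int c)
{
  assert(c >= 0);
  while (link[c] != NOLINK) c = link[c];
  return (c);
}

/* When the rule holds, two classes share a chain exactly when they
 * share a chain head. */
static int same_chain(const int *link, int a, int b)
  { return (chain_head(link, a) == chain_head(link, b)); }
```

Because the rule forbids two classes from sharing a link superclass, every
chain is a simple path, which is why comparing chain heads is enough.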
	290
	291
	292	\subsection{Names}
	293	\label{sec:concepts.classes.names}
	294
	295	Classes have a number of other attributes:
	296	\begin{itemize}
	297	\item A \emph{name}, which is a C identifier. Class names must be globally
	298	unique. The class name is used in the names of a number of associated
	299	definitions, to be described later.
	300	\item A \emph{nickname}, which is also a C identifier. Unlike names,
	301	nicknames are not required to be globally unique. If $C$ is any class,
	302	then all the superclasses of $C$ must have distinct nicknames.
	303	\end{itemize}
	304
	305
	306	\subsection{Slots} \label{sec:concepts.classes.slots}
	307
	308	Each class defines a number of \emph{slots}. Much like a structure member, a
	309	slot has a \emph{name}, which is a C identifier, and a \emph{type}. Unlike
	310	many other object systems, different superclasses of a class $C$ can define
	311	slots with the same name without ambiguity, since slot references are always
	312	qualified by the defining class's nickname.
	313
	314	\subsubsection{Slot initializers}
	315	As well as defining slot names and types, a class can also associate an
	316	\emph{initial value} with each slot defined by itself or one of its
	317	superclasses. A class $C$ provides an \emph{initialization message} (see
	318	\xref[\instead{sections}]{sec:concepts.lifecycle.birth}, and
	319	\ref{sec:structures.root.sodobject}) whose methods set the slots of a
	320	\emph{direct} instance of the class to the correct initial values. If
	321	several of $C$'s superclasses define initializers for the same slot then the
	322	initializer from the most specific such class is used. If none of $C$'s
	323	superclasses define an initializer for some slot then that slot will be left
	324	uninitialized.
	325
	326	The initializer for a slot with scalar type may be any C expression. The
	327	initializer for a slot with aggregate type must contain only constant
	328	expressions if the generated code is expected to be processed by an
	329	implementation of C89. Initializers will be evaluated once each time an
	330	instance is initialized.
	331
	332	Slots are initialized in reverse-precedence order of their defining classes;
	333	i.e., slots defined by a less specific superclass are initialized earlier
	334	than slots defined by a more specific superclass. Slots defined by the same
	335	class are initialized in the order in which they appear in the class
	336	definition.
	337
	338	The initializer for a slot may refer to other slots in the same object, via
	339	the @|me| pointer: in an initializer for a slot defined by a class $C$, @|me|
	340	has type `pointer to $C$'. (Note that the type of @|me| depends only on the
	341	class which defined the slot, not the class which defined the initializer.)
	342
	343	A class can also define \emph{class slot initializers}, which provide values
	344	for a slot defined by its metaclass; see \xref{sec:concepts.metaclasses} for
	345	details.
	346
	347
	348	\subsection{C language integration} \label{sec:concepts.classes.c}
	349
	350	It is very important to distinguish compile-time C \emph{types} from Sod's
	351	run-time \emph{classes}: see \xref{sec:concepts.classes}.
	352
	353	For each class~$C$, the Sod translator defines a C type, the \emph{class
	354	type}, with the same name. This is the usual type used when considering an
	355	object as an instance of class~$C$. No entire object will normally have a
	356	class type,\footnote{%
	357	In general, a class type only captures the structure of one of the
	358	superclass chains of an instance. A full instance layout contains multiple
	359	chains. See \xref{sec:structures.layout} for the full details.} %
	360	so access to instances is almost always via pointers.
	361
	362	Usually, a value of type pointer-to-class-type of class~$C$ will point into
	363	an instance of class $C$. However, clever (or foolish) use of pointer
	364	conversions can invalidate this relationship.
	365
	366	\subsubsection{Access to slots}
	367	The class type for a class~$C$ is actually a structure. It contains one
	368	member for each class in $C$'s superclass chain, named with that class's
	369	nickname. Each of these members is also a structure, containing the
	370	corresponding class's slots, one member per slot. There's nothing special
	371	about these slot members: C code can access them in the usual way.
	372
	373	For example, given the definition
	374	\begin{prog}
	375	[nick = mine] \\
	376	class MyClass: SodObject \{ \\ \ind
	377	int x; \-\\
	378	\}
	379	\end{prog}
	380	the simple function
	381	\begin{prog}
	382	int get_x(MyClass *m) \{ return (m@->mine.x); \}
	383	\end{prog}
	384	will extract the value of @|x| from an instance of @|MyClass|.
	385
	386	All of this means that there's no such thing as `private' or `protected'
	387	slots. If you want to hide implementation details, the best approach is to
	388	stash them in a dynamically allocated private structure, and leave a pointer
	389	to it in a slot. (This will also help preserve binary compatibility, because
	390	the private structure can grow more members as needed. See
	391	\xref{sec:concepts.compatibility} for more details.)
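
The private-structure approach is the ordinary C opaque-pointer idiom. A
minimal sketch follows, written as plain C rather than Sod input: the
structure standing in for a class's slots, and all of the function names,
are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Public view: callers see only an incomplete type and a pointer slot. */
struct widget_private;
struct widget_slots { struct widget_private *priv; };

/* Private view, normally confined to a single source file.  Members can
 * be added here without changing the size of `struct widget_slots'. */
struct widget_private { int refs; };

/* Allocate the private state; return 0 on success, -1 on failure. */
static int widget_init(struct widget_slots *w)
{
  w->priv = malloc(sizeof(*w->priv));
  if (!w->priv) return (-1);
  w->priv->refs = 0;
  return (0);
}

/* Touch the hidden state on the caller's behalf. */
static int widget_ref(struct widget_slots *w)
  { assert(w->priv); return (++w->priv->refs); }

/* Release the private state and clear the slot. */
static void widget_teardown(struct widget_slots *w)
  { free(w->priv); w->priv = 0; }
```

The cost of this design is an extra allocation and an extra indirection on
every access to the hidden state.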
	392
	393	Slots defined by $C$'s link superclass, or any other superclass in the same
	394	chain, can be accessed in the same way. Slots defined by other superclasses
	395	can't be accessed directly: the instance pointer must be \emph{converted} to
	396	point to a different chain. See the subsection `Conversions' below.
	397
	398
	399	\subsubsection{Sending messages}
	400	Sod defines a macro for each message. If a class $C$ defines a message $m$,
	401	then the macro is called @|$C$_$m$|. The macro takes a pointer to the
	402	receiving object as its first argument, followed by the message arguments, if
	403	any, and returns the value returned by the object's effective method for the
	404	message (if any). If you have a pointer to an instance of any of $C$'s
	405	subclasses, then you can send it the message; it doesn't matter whether the
	406	subclass is on the same chain. Note that the receiver argument is evaluated
	407	twice, so it's not safe to write a receiver expression which has
	408	side-effects.
	409
	410	For example, suppose we defined
	411	\begin{prog}
	412	[nick = soupy] \\
	413	class Super: SodObject \{ \\ \ind
	414	void msg(const char *m); \-\\
	415	\} \\+
	416	class Sub: Super \{ \\ \ind
	417	void soupy.msg(const char *m)
	418	\{ printf("sub sent `\%s'@\\n", m); \} \-\\
	419	\}
	420	\end{prog}
	421	then we can send the message like this:
	422	\begin{prog}
	423	Sub *sub = /* \dots\ */; \\
	424	Super_msg(sub, "hello");
	425	\end{prog}
	426
	427	What happens under the covers is as follows. The structure pointed to by the
	428	instance pointer has a member named @|_vt|, which points to a structure
	429	called a `virtual table', or \emph{vtable}, which contains various pieces of
	430	information about the object's direct class and layout, and holds pointers to
	431	method entries for the messages which the object can receive. The
	432	message-sending macro in the example above expands to something similar to
	433	\begin{prog}
	434	sub@->_vt@->soupy.msg(sub, "hello");
	435	\end{prog}
	436
	437	The vtable contains other useful information, such as a pointer to the
	438	instance's direct class's \emph{class object} (described below). The full
	439	details of the contents and layout of vtables are given in
	440	\xref{sec:structures.layout.vtable}.
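
To make the mechanism concrete, here is a hand-written miniature of the
same idea in plain C. It is only a sketch: the structures are far simpler
than those the translator really generates, and the names of the
structures, the slot, and the sending macro are all invented for the
example.

```c
#include <assert.h>
#include <stdio.h>

struct super;                           /* instance structure, below */

/* Hand-written stand-in for a vtable: method entry pointers are grouped
 * under the nickname of the class which defined the message. */
struct super_vt {
  struct { void (*msg)(struct super *me, const char *m); } soupy;
};

struct super {
  const struct super_vt *_vt;           /* must be set before any send */
  int nsent;                            /* illustrative slot */
};

/* Illustrative message-sending macro.  The receiver `me' appears twice
 * in the expansion, so an argument such as `p++' must be avoided. */
#define DEMO_MSG(me, m) ((me)->_vt->soupy.msg((me), (m)))

/* The method entry which a subclass would install for the message. */
static void sub_msg(struct super *me, const char *m)
{
  assert(me && m);
  me->nsent++;
  printf("sub sent `%s'\n", m);
}

/* The subclass's vtable points the message at its own method entry. */
static const struct super_vt sub_vt = { { sub_msg } };
```

A caller holding a pointer to an initialized instance writes
\texttt{DEMO\_MSG(p, "hello")}; which function actually runs is decided by
the vtable found through the instance, not by the static type of the
pointer.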
	441
	442
	443	\subsubsection{Class objects}
	444	In Sod's object system, classes are objects too. Therefore classes are
	445	themselves instances; the class of a class is called a \emph{metaclass}. The
	446	consequences of this are explored in \xref{sec:concepts.metaclasses}. The
	447	\emph{class object} has the same name as the class, suffixed with
	448	`@|__class|'\footnote{%
	449	This is not quite true. @|$C$__class| is actually a macro. See
	450	\xref{sec:structures.layout.additional} for the gory details.} %
	451	and its type is usually @|SodClass|; @|SodClass|'s nickname is @|cls|.
	452
	453	A class object's slots contain or point to useful information, tables and
	454	functions for working with that class's instances. (The @|SodClass| class
	455	doesn't define any messages, so it doesn't have any methods other than for
	456	the @|SodObject| lifecycle messages @|init| and @|teardown|; see
	457	\xref{sec:concepts.lifecycle}. In Sod, a class slot containing a function
	458	pointer is not at all the same thing as a method.)
	459
	460	\subsubsection{Conversions}
	461	Suppose one has a value of type pointer-to-class-type for some class~$C$, and
	462	wants to convert it to a pointer-to-class-type for some other class~$B$.
	463	There are three main cases to distinguish.
	464	\begin{itemize}
	465	\item If $B$ is a superclass of~$C$, in the same chain, then the conversion
	466	is an \emph{in-chain upcast}. The conversion can be performed using the
	467	appropriate generated upcast macro (see below), or by simply casting the
	468	pointer, using C's usual cast operator (or the \Cplusplus\ @|static_cast<>|
	469	operator).
	470	\item If $B$ is a superclass of~$C$, in a different chain, then the
	471	conversion is a \emph{cross-chain upcast}. The conversion is more than a
	472	simple type change: the pointer value must be adjusted. If the direct
	473	class of the instance in question is not known, the conversion will require
	474	a lookup at runtime to find the appropriate offset by which to adjust the
	475	pointer. The conversion can be performed using the appropriate generated
	476	upcast macro (see below); the general case is handled by the macro
	477	\descref{mac}{SOD_XCHAIN}.
	478	\item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast};
	479	otherwise the conversion is a~\emph{cross-cast}. In either case, the
	480	conversion can fail: the object in question might not be an instance of~$B$
	481	after all. The macro \descref{mac}{SOD_CONVERT} and the function
	482	\descref{fun}{sod_convert} perform general conversions. They return a null
	483	pointer if the conversion fails. (These are therefore your analogue to the
	484	\Cplusplus\ @|dynamic_cast<>| operator.)
	485	\end{itemize}
	486	The Sod translator generates macros for performing both in-chain and
	487	cross-chain upcasts. For each class~$C$, and each proper superclass~$B$
	488	of~$C$, a macro is defined: given an argument of type pointer to class type
	489	of~$C$, it returns a pointer to the same instance, only with type pointer to
	490	class type of~$B$, adjusted as necessary in the case of a cross-chain
	491	conversion. The macro is named by concatenating
	492	\begin{itemize}
	493	\item the name of class~$C$, in upper case,
	494	\item the characters `@|__CONV_|', and
	495	\item the nickname of class~$B$, in upper case;
	496	\end{itemize}
	497	e.g., if $C$ is named @|MyClass|, and $B$'s name is @|SuperClass| with
	498	nickname @|super|, then the macro @|MYCLASS__CONV_SUPER| converts a
	499	@|MyClass~*| to a @|SuperClass~*|. See
	500	\xref{sec:structures.layout.additional} for the formal description.
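
The pointer adjustment behind a cross-chain upcast can be pictured with
ordinary structures. The sketch below is an assumption-laden miniature: an
instance is modelled as a structure holding two chain structures, the
direct class is taken to be known at compile time (so the offsets are
constants), and all of the names are invented. The real layout is
described in the structures chapter.

```c
#include <assert.h>
#include <stddef.h>

/* Two chains of one instance, each with a single illustrative slot. */
struct chain_a { int a_slot; };
struct chain_b { int b_slot; };

/* The full instance layout holds both chains, one after the other. */
struct instance { struct chain_a a; struct chain_b b; };

/* Convert a pointer into chain `a' to a pointer into chain `b' of the
 * same instance: step back to the start of the instance, then forward
 * to the other chain. */
static struct chain_b *a_to_b(struct chain_a *p)
{
  char *base;
  assert(p);
  base = (char *)p - offsetof(struct instance, a);
  return ((struct chain_b *)(base + offsetof(struct instance, b)));
}
```

An in-chain upcast needs no such arithmetic, which is why a plain cast
suffices there. When the direct class is not known statically, the offsets
are not constants and must be looked up at run time, as described above.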
	501
	502	%%%--------------------------------------------------------------------------
	503	\section{Keyword arguments} \label{sec:concepts.keywords}
	504
	505	In standard C, the actual arguments provided to a function are matched up
	506	with the formal arguments given in the function definition according to their
	507	ordering in a list. Unless the (rather cumbersome) machinery for dealing
	508	with variable-length argument tails (@|<stdarg.h>|) is used, exactly the
	509	correct number of arguments must be supplied, and in the correct order.
	510
	511	A \emph{keyword argument} is matched by its distinctive \emph{name}, rather
	512	than by its position in a list. Keyword arguments may be \emph{omitted},
	513	causing some default behaviour by the function. A function can detect
	514	whether a particular keyword argument was supplied: so the default behaviour
	515	need not be the same as that caused by any specific value of the argument.
	516
	517	Keyword arguments can be provided in three ways.
	518	\begin{enumerate}
	519	\item Directly, as a variable-length argument tail, consisting (for the most
	520	part) of alternating keyword names, as pointers to null-terminated strings,
	521	and argument values, and terminated by a null pointer. This is somewhat
	522	error-prone, and the support library defines some macros which help ensure
	523	that keyword argument lists are well formed.
	524	\item Indirectly, through a @|va_list| object capturing a variable-length
	525	argument tail passed to some other function. Such indirect argument tails
	526	have the same structure as the direct argument tails described above.
	527	Because @|va_list| objects are hard to copy, the keyword-argument support
	528	library consistently passes @|va_list| objects \emph{by reference}
	529	throughout its programming interface.
	530	\item Indirectly, through a vector of @|struct kwval| objects, each of which
	531	contains a keyword name, as a pointer to a null-terminated string, and the
	532	\emph{address} of a corresponding argument value. (This indirection is
	533	necessary so that the items in the vector can be of uniform size.)
	534	Argument vectors are rather inconvenient to use, but are the only practical
	535	way in which a caller can decide at runtime which arguments to include in a
	536	call, which is useful when writing wrapper functions.
	537	\end{enumerate}
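
The first of these forms can be sketched using nothing but the standard
variable-argument machinery. The example below deliberately does not use
the keyword-argument support library or its macros: the function, its two
keywords, and the structure it fills in are all hypothetical, and error
handling is reduced to a return code.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Result of the hypothetical function, with a flag recording whether
 * the `height' keyword was actually supplied. */
struct box { int width, height; int height_given; };

/* Accept the keywords "width" and "height", both with `int' values, as
 * an argument tail of alternating names and values, ended by a null
 * pointer.  Return 0 on success, or -1 on an unknown keyword. */
static int make_box(struct box *b, ...)
{
  va_list ap;
  const char *kw;
  int rc = 0;

  assert(b);
  b->width = 80; b->height = 25; b->height_given = 0;   /* defaults */
  va_start(ap, b);
  while ((kw = va_arg(ap, char *)) != NULL) {
    if (strcmp(kw, "width") == 0)
      b->width = va_arg(ap, int);
    else if (strcmp(kw, "height") == 0)
      { b->height = va_arg(ap, int); b->height_given = 1; }
    else
      { rc = -1; break; }       /* value type unknown: stop decoding */
  }
  va_end(ap);
  return (rc);
}
```

A call such as \texttt{make\_box(\&b, "height", 10, (char *)0)} leaves the
width at its default. The terminator is written as an explicitly typed
null pointer because a bare \texttt{NULL} need not have pointer type in an
argument tail. This is the kind of detail which the helper macros
mentioned above exist to get right.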
	538
	539	Perhaps surprisingly, keyword arguments have a relatively small performance
	540	impact. On the author's aging laptop, a call to a simple function, passing
	541	two out of three keyword arguments, takes about 30 cycles longer than calling
	542	a standard function which just takes integer arguments. On the other hand,
	543	quite a lot of code is involved in decoding keyword arguments, so code size
	544	will naturally suffer.
	545
	546	Keyword arguments are provided as a general feature for C functions.
	547	However, Sod has special support for messages which accept keyword arguments
	548	(\xref{sec:concepts.methods.keywords}); and they play an essential rôle in
	549	the instance construction protocol (\xref{sec:concepts.lifecycle.birth}).
	550
	551	%%%--------------------------------------------------------------------------
	552	\section{Messages and methods} \label{sec:concepts.methods}
	553
	554	Objects can be sent \emph{messages}. A message has a \emph{name}, and
	555	carries a number of \emph{arguments}. When an object is sent a message, a
	556	function, determined by the receiving object's class, is invoked, passing it
	557	the receiver and the message arguments. This function is called the
	558	class's \emph{effective method} for the message. The effective method can do
	559	anything a C function can do, including reading or updating program state or
	560	object slots, sending more messages, calling other functions, issuing system
	561	calls, or performing I/O; if it finishes, it may return a value, which is
	562	returned in turn to the message sender.
	563
	564	The set of messages an object can receive, characterized by their names,
	565	argument types, and return type, is determined by the object's class. Each
	566	class can define new messages, which can be received by any instance of that
	567	class. The messages defined by a single class must have distinct names:
	568	there is no `function overloading'. As with slots
	569	(\xref{sec:concepts.classes.slots}), messages defined by distinct classes are
	570	always distinct, even if they have the same names: references to messages are
	571	always qualified by the defining class's name or nickname.
	572
	573	Messages may take any number of arguments, of any non-array value type.
	574	Since message sends are effectively function calls, arguments of array type
	575	are implicitly converted to values of the corresponding pointer type. While
	576	message definitions may ascribe an array type to an argument, the formal
	577	argument will have pointer type, as is usual for C functions. A message may
	578	accept a variable-length argument suffix, denoted @|\dots|.
	579
	580	A class definition may include \emph{direct methods} for messages defined by
	581	it or any of its superclasses.
	582
	583	Like messages, direct methods define argument lists and return types, but
	584	they may also have a \emph{body}, and a \emph{rôle}.
	585
	586	A direct method need not have the same argument list or return type as its
	587	message. The acceptable argument lists and return types for a method depend
	588	on the message, in particular its method combination
	589	(\xref{sec:concepts.methods.combination}), and the method's rôle.
	590
	591	A direct method body is a block of C code, and the Sod translator usually
	592	defines, for each direct method, a function with external linkage, whose body
	593	contains a copy of the direct method body. Within the body of a direct
	594	method defined for a class $C$, the variable @\|me\|, of type pointer to class
	595	type of $C$, refers to the receiving object.
	596
	597
	598	\subsection{Effective methods and method combinations}
	599	\label{sec:concepts.methods.combination}
	600
	601	For each message a direct instance of a class might receive, there is a set
	602	of \emph{applicable methods}, which are exactly the direct methods defined on
	603	the object's class and its superclasses. These direct methods are combined
	604	together to form the \emph{effective method} for that particular class and
	605	message. Direct methods can be combined into an effective method in
	606	different ways, according to the \emph{method combination} specified by the
	607	message. The method combination determines which direct method rôles are
	608	acceptable, and, for each rôle, the appropriate argument lists and return
	609	types.
	610
	611	One direct method, $M$, is said to be more or less \emph{specific} than
	612	another, $N$, with respect to a receiving class~$C$, if the class defining
	613	$M$ is respectively a more or less specific superclass of~$C$ than the class
	614	defining $N$.
	615
	616	\subsubsection{The standard method combination}
	617	The default method combination is called the \emph{standard method
	618	combination}; other method combinations are useful occasionally for special
	619	effects. The standard method combination accepts four direct method rôles,
	620	called `primary' (the default), @\|before\|, @\|after\|, and @\|around\|.
	621
	622	All direct methods subject to the standard method combination must have
	623	argument lists which \emph{match} the message's argument list:
	624	\begin{itemize}
	625	\item the method's arguments must have the same types as the message, though
	626	the arguments may have different names; and
	627	\item if the message accepts a variable-length argument suffix then the
	628	direct method must instead have a final argument of type @\|va_list\|.
	629	\end{itemize}
	630	Primary and @\|around\| methods must have the same return type as the message;
	631	@\|before\| and @\|after\| methods must return @\|void\| regardless of the
	632	message's return type.
	633
	634	If there are no applicable primary methods then no effective method is
	635	constructed: the vtables contain null pointers in place of pointers to method
	636	entry functions.
	637
	638	\begin{figure}
	639	\hbox to\hsize{\hss\hbox{\begin{tikzpicture}
	640	[order/.append style={color=green!70!black},
	641	code/.append style={font=\sffamily},
	642	action/.append style={font=\itshape},
	643	method/.append style={rectangle, draw=black, thin, fill=blue!30,
	644	text height=\ht\strutbox, text depth=\dp\strutbox,
	645	minimum width=40mm}]
	646
	647	\def\delgstack#1#2#3{
	648	\node (#10) [method, #2] {#3};
	649	\node (#11) [method, above=6mm of #10] {#3};
	650	\draw [->] ($(#10.north)!.5!(#10.north west) + (0mm, 1mm)$) --
	651	++(0mm, 4mm)
	652	node [code, left=4pt, midway] {next_method};
	653	\draw [<-] ($(#10.north)!.5!(#10.north east) + (0mm, 1mm)$) --
	654	++(0mm, 4mm)
	655	node [action, right=4pt, midway] {return};
	656	\draw [->] ($(#11.north)!.5!(#11.north west) + (0mm, 1mm)$) --
	657	++(0mm, 4mm)
	658	node [code, left=4pt, midway] {next_method}
	659	node (ld) [above] {$\smash\vdots\mathstrut$};
	660	\draw [<-] ($(#11.north)!.5!(#11.north east) + (0mm, 1mm)$) --
	661	++(0mm, 4mm)
	662	node [action, right=4pt, midway] {return}
	663	node (rd) [above] {$\smash\vdots\mathstrut$};
	664	\draw [->] ($(ld.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
	665	node [code, left=4pt, midway] {next_method};
	666	\draw [<-] ($(rd.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
	667	node [action, right=4pt, midway] {return};
	668	\node (p) at ($(ld.north)!.5!(rd.north)$) {};
	669	\node (#1n) [method, above=5mm of p] {#3};
	670	\draw [->, order] ($(#10.south east) + (4mm, 1mm)$) --
	671	($(#1n.north east) + (4mm, -1mm)$)
	672	node [midway, right, align=left]
	673	{Most to \\ least \\ specific};}
	674
	675	\delgstack{a}{}{@\|around\| method}
	676	\draw [<-] ($(a0.south)!.5!(a0.south west) - (0mm, 1mm)$) --
	677	++(0mm, -4mm);
	678	\draw [->] ($(a0.south)!.5!(a0.south east) - (0mm, 1mm)$) --
	679	++(0mm, -4mm)
	680	node [action, right=4pt, midway] {return};
	681
	682	\draw [->] ($(an.north)!.6!(an.north west) + (0mm, 1mm)$) --
	683	++(-8mm, 8mm)
	684	node [code, midway, left=3mm] {next_method}
	685	node (b0) [method, above left = 1mm + 4mm and -6mm - 4mm] {};
	686	\node (b1) [method] at ($(b0) - (2mm, 2mm)$) {};
	687	\node (bn) [method] at ($(b1) - (2mm, 2mm)$) {@\|before\| method};
	688	\draw [->, order] ($(bn.west) - (6mm, 0mm)$) -- ++(12mm, 12mm)
	689	node [midway, above left, align=center] {Most to \\ least \\ specific};
	690	\draw [->] ($(b0.north east) + (-10mm, 1mm)$) -- ++(8mm, 8mm)
	691	node (p) {};
	692
	693	\delgstack{m}{above right=1mm and 0mm of an.west \|- p}{Primary method}
	694	\draw [->] ($(mn.north)!.5!(mn.north west) + (0mm, 1mm)$) -- ++(0mm, 4mm)
	695	node [code, left=4pt, midway] {next_method}
	696	node [above right = 0mm and -8mm]
	697	{$\vcenter{\hbox{\Huge\textcolor{red}{!}}}
	698	\vcenter{\hbox{\begin{tabular}[c]{l}
	699	@\|next_method\| \\
	700	pointer is null
	701	\end{tabular}}}$};
	702
	703	\draw [->, color=blue, dotted]
	704	($(m0.south)!.2!(m0.south east) - (0mm, 1mm)$) --
	705	($(an.north)!.2!(an.north east) + (0mm, 1mm)$)
	706	node [midway, sloped, below] {Return value};
	707
	708	\draw [<-] ($(an.north)!.6!(an.north east) + (0mm, 1mm)$) --
	709	++(8mm, 8mm)
	710	node [action, midway, right=3mm] {return}
	711	node (f0) [method, above right = 1mm and -6mm] {};
	712	\node (f1) [method] at ($(f0) + (-2mm, 2mm)$) {};
	713	\node (fn) [method] at ($(f1) + (-2mm, 2mm)$) {@\|after\| method};
	714	\draw [<-, order] ($(f0.east) + (6mm, 0mm)$) -- ++(-12mm, 12mm)
	715	node [midway, above right, align=center]
	716	{Least to \\ most \\ specific};
	717	\draw [<-] ($(fn.north west) + (6mm, 1mm)$) -- ++(-8mm, 8mm);
	718
	719	\end{tikzpicture}}\hss}
	720
	721	\caption{The standard method combination}
	722	\label{fig:concepts.methods.stdmeth}
	723	\end{figure}
	724
	725	The effective method for a message with standard method combination works as
	726	follows (see also~\xref{fig:concepts.methods.stdmeth}).
	727	\begin{enumerate}
	728
	729	\item If any applicable methods have the @\|around\| rôle, then the most
	730	specific such method, with respect to the class of the receiving object, is
	731	invoked.
	732
	733	Within the body of an @\|around\| method, the variable @\|next_method\| is
	734	defined, having pointer-to-function type. The method may call this
	735	function, as described below, any number of times.
	736
	737	If there are any remaining @\|around\| methods, then @\|next_method\| invokes the
	738	next most specific such method, returning whichever value that method
	739	returns; otherwise the behaviour of @\|next_method\| is to invoke the
	740	@\|before\| methods (if any), followed by the most specific primary method,
	741	followed by the @\|after\| methods (if any), and to return whichever value
	742	was returned by the most specific primary method, as described in the
	743	following items. That is, the behaviour of the least specific @\|around\|
	744	method's @\|next_method\| function is exactly the behaviour that the
	745	effective method would have if there were no @\|around\| methods. Note that
	746	if the least-specific @\|around\| method calls its @\|next_method\| more than
	747	once then the whole sequence of @\|before\|, primary, and @\|after\| methods
	748	occurs multiple times.
	749
	750	The value returned by the most specific @\|around\| method is the value
	751	returned by the effective method.
	752
	753	\item If any applicable methods have the @\|before\| rôle, then they are all
	754	invoked, starting with the most specific.
	755
	756	\item The most specific applicable primary method is invoked.
	757
	758	Within the body of a primary method, the variable @\|next_method\| is
	759	defined, having pointer-to-function type. If there are no remaining less
	760	specific primary methods, then @\|next_method\| is a null pointer.
	761	Otherwise, the method may call the @\|next_method\| function any number of
	762	times.
	763
	764	The behaviour of the @\|next_method\| function, if it is not null, is to
	765	invoke the next most specific applicable primary method, and to return
	766	whichever value that method returns.
	767
	768	If there are no applicable @\|around\| methods, then the value returned by
	769	the most specific primary method is the value returned by the effective
	770	method; otherwise the value returned by the most specific primary method is
	771	returned to the least specific @\|around\| method, which called it via its
	772	own @\|next_method\| function.
	773
	774	\item If any applicable methods have the @\|after\| rôle, then they are all
	775	invoked, starting with the \emph{least} specific. (Hence, the most
	776	specific @\|after\| method is invoked with the most `afterness'.)
	777
	778	\end{enumerate}
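
The ordering described above can be modelled in plain C. What follows is only a sketch under stated assumptions, not the code the Sod translator generates: it assumes two hypothetical classes, Base and Derived (the more specific), one @\|around\| method on Base, and @\|before\|, primary, and @\|after\| methods on both. The receiver and the message arguments are dropped, and each method records its invocation in a trace string. The names @\|effective_method\| and @\|last_trace\| are invented for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer: each modelled method appends its name. */
static char trace[256];

static void note(const char *what)
{
    assert(strlen(trace) + strlen(what) + 2 <= sizeof trace);
    if (trace[0]) strcat(trace, " ");
    strcat(trace, what);
}

static void before_derived(void) { note("before:Derived"); }
static void before_base(void)    { note("before:Base"); }
static void after_derived(void)  { note("after:Derived"); }
static void after_base(void)     { note("after:Base"); }

/* Least specific primary method; in Sod its next_method would be null. */
static int primary_base(void) { note("primary:Base"); return 1; }

/* Most specific primary method: it extends the behaviour when it can. */
static int primary_derived(int (*next_method)(void))
{
    note("primary:Derived");
    return 10 + (next_method ? next_method() : 0);
}

/* What the least specific around method's next_method does. */
static int inner(void)
{
    int rc;

    before_derived(); before_base();      /* most specific first */
    rc = primary_derived(primary_base);   /* only the most specific primary */
    after_base(); after_derived();        /* least specific first */
    return rc;
}

/* The only around method: it wraps everything else. */
static int around_base(int (*next_method)(void))
{
    int rc;

    note("around:enter");
    rc = next_method();
    note("around:leave");
    return rc;
}

int effective_method(void)
{
    trace[0] = 0;
    return around_base(inner);
}

const char *last_trace(void) { return trace; }
```

In this model the value returned by @\|effective_method\| is whatever the @\|around\| method returns, which here is the value of the most specific primary method.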
	779
	780	A typical use for @\|around\| methods is to allow a base class to set up the
	781	dynamic environment appropriately for the primary methods of its subclasses,
	782	e.g., by claiming a lock, and releasing it afterwards.
	783
	784	The @\|next_method\| function provided to methods with the primary and
	785	@\|around\| rôles accepts the same arguments, and returns the same type, as the
	786	message, except that one or two additional arguments are inserted at the
	787	front of the argument list. The first additional argument is always the
	788	receiving object, @\|me\|. If the message accepts a variable argument suffix,
	789	then the second additional argument is a @\|va_list\|; otherwise there is no
	790	second additional argument. In the former case, a variable
	791	@\|sod__master_ap\| of type @\|va_list\| is defined, containing a separate copy
	792	of the argument pointer (so the method body can process the variable argument
	793	suffix itself, and still pass a fresh copy on to the next method).
	794
	795	A method with the primary or @\|around\| rôle may use the convenience macro
	796	@\|CALL_NEXT_METHOD\|, which takes no arguments itself, and simply calls
	797	@\|next_method\| with appropriate arguments: the receiver @\|me\| pointer, the
	798	argument pointer @\|sod__master_ap\| (if applicable), and the method's
	799	arguments. If the method body has overwritten its formal arguments, then
	800	@\|CALL_NEXT_METHOD\| will pass along the updated values, rather than the
	801	original ones.
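
The reason for keeping a separate master copy of the argument pointer can be seen in ordinary C. The sketch below is an illustration only and uses no Sod machinery: the function names are hypothetical, and @\|master_ap\| merely plays the part of @\|sod__master_ap\|. The method reads the variable-length suffix through its own copy, made with the standard @\|va_copy\| macro, and then hands a fresh copy to the next function.

```c
#include <assert.h>
#include <stdarg.h>

/* Stands in for the next method: adds up n int arguments. */
int sum_next(int n, va_list ap)
{
    int total = 0;

    while (n-- > 0) total += va_arg(ap, int);
    return total;
}

/* Stands in for a method body: it scans the suffix itself to find the
 * largest argument, then passes an unread copy on to the next method. */
int max_plus_sum_v(int n, va_list master_ap)
{
    va_list ap;
    int i, max = 0, total;

    assert(n >= 0);

    va_copy(ap, master_ap);             /* private copy for our own scan */
    for (i = 0; i < n; i++) {
        int x = va_arg(ap, int);
        if (i == 0 || x > max) max = x;
    }
    va_end(ap);

    va_copy(ap, master_ap);             /* fresh copy for the next method */
    total = sum_next(n, ap);
    va_end(ap);

    return max + total;
}

/* Variadic front end, capturing the suffix in a va_list. */
int max_plus_sum(int n, ...)
{
    va_list ap;
    int rc;

    va_start(ap, n);
    rc = max_plus_sum_v(n, ap);
    va_end(ap);
    return rc;
}
```

Without the second @\|va_copy\|, the next function would start reading where the first scan stopped, which is past the end of the arguments.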
	802
	803	A primary or @\|around\| method which invokes its @\|next_method\| function is
	804	said to \emph{extend} the message behaviour; a method which does not invoke
	805	its @\|next_method\| is said to \emph{override} the behaviour. Note that a
	806	method may make a decision to override or extend at runtime.
	807
	808	\subsubsection{Aggregating method combinations}
	809	A number of other method combinations are provided. They are called
	810	`aggregating' method combinations because, instead of invoking just the most
	811	specific primary method, as the standard method combination does, they invoke
	812	the applicable primary methods in turn and aggregate the return values from
	813	each.
	814
	815	The aggregating method combinations accept the same four rôles as the
	816	standard method combination, and @\|around\|, @\|before\|, and @\|after\| methods
	817	work in the same way.
	818
	819	The aggregating method combinations provided are as follows.
	820	\begin{description} \let\makelabel\code
	821	\item[progn] The message must return @\|void\|. The applicable primary methods
	822	are simply invoked in turn, most specific first.
	823	\item[sum] The message must return a numeric type.\footnote{%
	824	The Sod translator doesn't check this, since it doesn't have enough
	825	insight into @\|typedef\| names.} %
	826	The applicable primary methods are invoked in turn, and their return values
	827	added up. The final result is the sum of the individual values.
	828	\item[product] The message must return a numeric type. The applicable
	829	primary methods are invoked in turn, and their return values multiplied
	830	together. The final result is the product of the individual values.
	831	\item[min] The message must return a scalar type. The applicable primary
	832	methods are invoked in turn. The final result is the smallest of the
	833	individual values.
	834	\item[max] The message must return a scalar type. The applicable primary
	835	methods are invoked in turn. The final result is the largest of the
	836	individual values.
	837	\item[and] The message must return a scalar type. The applicable primary
	838	methods are invoked in turn. If any method returns zero then the final
	839	result is zero and no further methods are invoked. If all of the
	840	applicable primary methods return nonzero, then the final result is the
	841	result of the last primary method.
	842	\item[or] The message must return a scalar type. The applicable primary
	843	methods are invoked in turn. If any method returns nonzero then the final
	844	result is that nonzero value and no further methods are invoked. If all of
	845	the applicable primary methods return zero, then the final result is zero.
	846	\end{description}
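
As a rough illustration of how three of these combinations differ, the following plain C sketch runs an array of primary methods in order. It is a model under assumptions, not translator output: the @\|primary_fn\| type, the @\|combine_\|\dots\ functions, and the sample methods are all hypothetical, every method returns @\|int\|, and at least one primary method is assumed to be applicable.

```c
#include <assert.h>
#include <stddef.h>

/* A primary method in this model takes only the receiver. */
typedef int (*primary_fn)(void *me);

/* `and': stop at the first zero; otherwise return the last value. */
int combine_and(void *me, const primary_fn *methods, size_t n)
{
    int rc = 0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++) {
        rc = methods[i](me);
        if (!rc) return 0;
    }
    return rc;
}

/* `or': stop at the first nonzero value and return it. */
int combine_or(void *me, const primary_fn *methods, size_t n)
{
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++) {
        int rc = methods[i](me);
        if (rc) return rc;
    }
    return 0;
}

/* `sum': every method is invoked and the results are added up. */
int combine_sum(void *me, const primary_fn *methods, size_t n)
{
    int total = 0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++) total += methods[i](me);
    return total;
}

/* Sample methods: each counts its own invocation through `me'. */
int ret_two(void *me)  { ++*(int *)me; return 2; }
int ret_zero(void *me) { ++*(int *)me; return 0; }
int ret_five(void *me) { ++*(int *)me; return 5; }
```

The invocation counter makes the short-circuit behaviour of @\|and\| and @\|or\| visible: methods after the deciding one are never called.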
	847
	848	There is also a @\|custom\| aggregating method combination, which is described
	849	in \xref{sec:fixme.custom-aggregating-method-combination}.
	850
	851
	852	\subsection{Method entries} \label{sec:concepts.methods.entry}
	853
	854	The effective methods for each class are determined at translation time, by
	855	the Sod translator. For each effective method, one or more \emph{method
	856	entry functions} are constructed. A method entry function has three
	857	responsibilities.
	858	\begin{itemize}
	859	\item It converts the receiver pointer to the correct type. Method entry
	860	functions can perform these conversions extremely efficiently: there are
	861	separate method entries for each chain of each class which can receive a
	862	message, so method entry functions are in the privileged situation of
	863	knowing the \emph{exact} class of the receiving object.
	864	\item If the message accepts a variable-length argument tail, then two method
	865	entry functions are created for each chain of each class: one receives a
	866	variable-length argument tail, as intended, and captures it in a @\|va_list\|
	867	object; the other accepts an argument of type @\|va_list\| in place of the
	868	variable-length tail and arranges for it to be passed along to the direct
	869	methods.
	870	\item It invokes the effective method with the appropriate arguments. There
	871	might or might not be an actual function corresponding to the effective
	872	method itself: the translator may instead open-code the effective method's
	873	behaviour into each method entry function; and the machinery for handling
	874	`delegation chains', such as is used for @\|around\| methods and primary
	875	methods in the standard method combination, is necessarily scattered among
	876	a number of small functions.
	877	\end{itemize}
	878
	879
	880	\subsection{Messages with keyword arguments}
	881	\label{sec:concepts.methods.keywords}
	882
	883	A message or a direct method may declare that it accepts keyword arguments.
	884	A message which accepts keyword arguments is called a \emph{keyword message};
	885	a direct method which accepts keyword arguments is called a \emph{keyword
	886	method}.
	887
	888	While method combinations may set their own rules, usually keyword methods
	889	can only be defined on keyword messages, and all methods defined on a keyword
	890	message must be keyword methods. The direct methods defined on a keyword
	891	message may differ in the keywords they accept, both from each other, and
	892	from the message. If two applicable methods on the same message both accept
	893	a keyword argument with the same name, then these two keyword arguments must
	894	also have the same type. Different applicable methods may declare keyword
	895	arguments with the same name but different defaults; see below.
	896
	897	The keyword arguments acceptable in a message sent to an object are the
	898	keywords listed in the message definition, together with all of the keywords
	899	accepted by any applicable method. There is no easy way to determine at
	900	runtime whether a particular keyword is acceptable in a message to a given
	901	instance.
	902
	903	At runtime, a direct method which accepts one or more keyword arguments
	904	receives an additional argument named @\|suppliedp\|. This argument is a small
	905	structure. For each keyword argument named $k$ accepted by the direct
	906	method, @\|suppliedp\| contains a one-bit-wide bitfield member of type
	907	@\|unsigned\|, also named $k$. If a keyword argument named $k$ was passed in
	908	the message, then @\|suppliedp.$k$\| is one, and $k$ contains the argument
	909	value; otherwise @\|suppliedp.$k$\| is zero, and $k$ contains the default value
	910	from the direct method definition if there was one, or an unspecified value
	911	otherwise.
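
The shape of the @\|suppliedp\| structure can be sketched in plain C. The example is hypothetical throughout: it assumes a method accepting two keywords, @\|width\| (default 80) and @\|height\| (no default), and the function @\|send_resize\| is only a stand-in for whatever machinery delivers the keyword arguments, which this section does not describe.

```c
#include <assert.h>
#include <stddef.h>

/* One one-bit flag per keyword accepted by the hypothetical method. */
struct resize_suppliedp {
    unsigned width: 1;
    unsigned height: 1;
};

/* Model of the direct method body.  `height' has no default, so the body
 * must consult suppliedp before trusting its value. */
int resize_area(struct resize_suppliedp suppliedp, int width, int height)
{
    assert(width > 0);                  /* a limit of this model only */
    if (!suppliedp.height) height = width;
    return width * height;
}

/* Stand-in for the delivery machinery: a null pointer means that the
 * sender did not pass the corresponding keyword. */
int send_resize(const int *width, const int *height)
{
    struct resize_suppliedp suppliedp = { 0, 0 };
    int w = 80;                         /* default from the method */
    int h = 0;                          /* stands for an unspecified value */

    if (width)  { suppliedp.width = 1;  w = *width; }
    if (height) { suppliedp.height = 1; h = *height; }
    return resize_area(suppliedp, w, h);
}
```

A method whose keyword has a default can usually ignore the flag; a method whose keyword has none has to test it, as @\|resize_area\| does for @\|height\|.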
	912
	913	%%%--------------------------------------------------------------------------
	914	\section{The object lifecycle} \label{sec:concepts.lifecycle}
	915
	916	\subsection{Creation} \label{sec:concepts.lifecycle.birth}
	917
	918	Construction of a new instance of a class involves three steps.
	919	\begin{enumerate}
	920	\item \emph{Allocation} arranges for there to be storage space for the
	921	instance's slots and associated metadata.
	922	\item \emph{Imprinting} fills in the instance's metadata, associating the
	923	instance with its class.
	924	\item \emph{Initialization} stores appropriate initial values in the
	925	instance's slots, and maybe links it into any external data structures as
	926	necessary.
	927	\end{enumerate}
	928	The \descref{mac}{SOD_DECL}[macro] handles constructing instances with
	929	automatic storage duration (`on the stack'). Similarly, the
	930	\descref{mac}{SOD_MAKE}[macro] and the \descref*{fun}{sod_make} and
	931	\descref{fun}{sod_makev} functions construct instances allocated from the
	932	standard @\|malloc\| heap. Programmers can add support for other allocation
	933	strategies by using the \descref{mac}{SOD_INIT}[macro] and the
	934	\descref*{fun}{sod_init} and \descref{fun}{sod_initv} functions, which
	935	package up imprinting and initialization.
	936
	937	\subsubsection{Allocation}
	938	Instances of most classes (specifically including those classes defined by
	939	Sod itself) can be held in any storage of sufficient size. The in-memory
	940	layout of an instance of some class~$C$ is described by the type @\|struct
	941	$C$__ilayout\|, and if the relevant class is known at compile time then the
	942	best way to discover the layout size is with the @\|sizeof\| operator. Failing
	943	that, the size required to hold an instance of $C$ is available in a slot in
	944	$C$'s class object, as @\|$C$__class@->cls.initsz\|. The necessary alignment,
	945	in bytes, is provided as @\|$C$__class@->cls.align\|, should this be necessary.
	946
	947	It is not in general sufficient to declare, or otherwise allocate, an object
	948	of the class type $C$. The class type only describes a single chain of the
	949	object's layout. It is nearly always an error to use the class type as if it
	950	is a \emph{complete type}, e.g., to declare objects or arrays of the class
	951	type, or to enquire about its size or alignment requirements.
	952
	953	Instance layouts may be declared as objects with automatic storage duration
	954	(colloquially, `allocated on the stack') or allocated dynamically, e.g.,
	955	using @\|malloc\|. They may be included as members of structures or unions, or
	956	elements of arrays. Sod's runtime system doesn't retain addresses of
	957	instances, so, for example, Sod doesn't make using fancy allocators which
	958	sometimes move objects around in memory any more difficult than it needs to
	959	be.
	960
	961	The following simple function correctly allocates and returns space for an
	962	instance of a class given a pointer to its class object @<cls>.
	963	\begin{prog}
	964	void *allocate_instance(const SodClass *cls) \\ \ind
	965	\{ return malloc(cls@->cls.initsz); \}
	966	\end{prog}
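
When storage does not come from @\|malloc\|, the alignment matters as well as the size. The following sketch shows one way an allocator might use the two numbers. It is an assumption-laden illustration, not part of Sod: the @\|struct arena\| type and @\|arena_alloc\| are hypothetical, and the size and alignment are passed as plain arguments where real code would take them from the @\|initsz\| and @\|align\| class slots described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A trivial bump allocator over a caller-provided buffer. */
struct arena {
    unsigned char *base;                /* start of the buffer */
    size_t size;                        /* total bytes available */
    size_t used;                        /* bytes handed out so far */
};

/* Carve out initsz bytes aligned to align, or return null if the arena
 * does not have enough room left. */
void *arena_alloc(struct arena *a, size_t initsz, size_t align)
{
    uintptr_t start;
    size_t avail, pad;
    void *p;

    assert(align != 0);
    start = (uintptr_t)(a->base + a->used);
    avail = a->size - a->used;
    pad = (align - start % align) % align;   /* bytes to skip */

    if (pad > avail || initsz > avail - pad) return NULL;
    a->used += pad;
    p = a->base + a->used;
    a->used += initsz;
    return p;
}
```

Storage obtained this way still has to be imprinted and initialized before it can be used as an instance.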
	967
	968	\subsubsection{Imprinting}
	969	Once storage has been allocated, it must be \emph{imprinted} before it can be
	970	used as an instance of a class, e.g., before any messages can be sent to it.
	971
	972	Imprinting an instance stores some metadata about its direct class in the
	973	instance structure, so that the rest of the program (and Sod's runtime
	974	library) can tell what sort of object it is, and how to use it.\footnote{%
	975	Specifically, imprinting an instance's storage involves storing the
	976	appropriate vtable pointers in the right places in it.} %
	977	A class object's @\|imprint\| slot points to a function which will correctly
	978	imprint storage for one of that class's instances.
	979
	980	Once an instance's storage has been imprinted, it is technically possible to
	981	send messages to the instance; however the instance's slots are still
	982	uninitialized at this point, so the applicable methods are unlikely to do
	983	much of any use unless they've been written specifically for the purpose.
	984
	985	The following simple function imprints storage at address @<p> as an instance
	986	of a class, given a pointer to its class object @<cls>.
	987	\begin{prog}
	988	void imprint_instance(const SodClass *cls, void *p) \\ \ind
	989	\{ cls@->cls.imprint(p); \}
	990	\end{prog}
	991
	992	\subsubsection{Initialization}
	993	The final step for constructing a new instance is to \emph{initialize} it, to
	994	establish the necessary invariants for the instance itself and the
	995	environment in which it operates.
	996
	997	Details of initialization are necessarily class-specific, but typically it
	998	involves setting the instance's slots to appropriate values, and possibly
	999	linking it into some larger data structure to keep track of it. It is
	1000	possible for initialization methods to attempt to allocate resources, but
	1001	this must be done carefully: there is currently no way to report an error
	1002	from object initialization, so the object must be marked as incompletely
	1003	initialized, and left in a state where it will be safe to tear down later.
	1004
	1005	Initialization is performed by sending the imprinted instance an @\|init\|
	1006	message, defined by the @\|SodObject\| class. This message uses a nonstandard
	1007	method combination which works like the standard combination, except that the
	1008	\emph{default behaviour}, if there is no overriding method, is to initialize
	1009	the instance's slots, as described below, and to invoke each superclass's
	1010	initialization fragments. This default behaviour may be invoked multiple
	1011	times if some method calls on its @\|next_method\| more than once, unless some
	1012	other method takes steps to prevent this.
	1013
	1014	Slots are initialized in a well-defined order.
	1015	\begin{itemize}
	1016	\item Slots defined by a more specific superclass are initialized after slots
	1017	defined by a less specific superclass.
	1018	\item Slots defined by the same class are initialized in the order in which
	1019	their definitions appear.
	1020	\end{itemize}
	1021
	1022	A class can define \emph{initialization fragments}: pieces of literal code to
	1023	be executed to set up a new instance. Each superclass's initialization
	1024	fragments are executed with @\|me\| bound to an instance pointer of the
	1025	appropriate superclass type, immediately after that superclass's slots (if
	1026	any) have been initialized; therefore, fragments defined by a more specific
	1027	superclass are executed after fragments defined by a less specific
	1028	superclass. A class may define more than one initialization fragment: the
	1029	fragments are executed in the order in which they appear in the class
	1030	definition. It is possible for an initialization fragment to use @\|return\|
	1031	or @\|goto\| for special control-flow effects, but this is not likely to be a
	1032	good idea.
	1033
	1034	The @\|init\| message accepts keyword arguments
	1035	(\xref{sec:concepts.methods.keywords}). The set of acceptable keywords is
	1036	determined by the applicable methods as usual, but also by the
	1037	\emph{initargs} defined by the receiving instance's class and its
	1038	superclasses, which are made available to slot initializers and
	1039	initialization fragments.
	1040
	1041	There are two kinds of initarg definitions. \emph{User initargs} are defined
	1042	by an explicit @\|initarg\| item appearing in a class definition: the item
	1043	defines a name, type, and (optionally) a default value for the initarg.
	1044	\emph{Slot initargs} are defined by attaching an @\|initarg\| property to a
	1045	slot or slot initializer item: the property's value determines the initarg's
	1046	name, while the type is taken from the underlying slot type; slot initargs do
	1047	not have default values. Both kinds define a \emph{direct initarg} for the
	1048	containing class. (Note that a slot may have any number of slot initargs;
	1049	and any number of slots may have initargs with the same name.)
	1050
	1051	Initargs are inherited. The \emph{applicable} direct initargs for an @\|init\|
	1052	effective method are those defined by the receiving object's class, and all
	1053	of its superclasses. Applicable direct initargs with the same name are
	1054	merged to form \emph{effective initargs}. An error is reported if two
	1055	applicable direct initargs have the same name but different types. The
	1056	default value of an effective initarg is taken from the most specific
	1057	applicable direct initarg which specifies a default value; if no applicable
	1058	direct initarg specifies a default value then the effective initarg has no
	1059	default.
	1060
	1061	All initarg values are made available at runtime to user code --
	1062	initialization fragments and slot initializer expressions -- through local
	1063	variables and a @\|suppliedp\| structure, as in a direct method
	1064	(\xref{sec:concepts.methods.keywords}). Furthermore, slot initarg
	1065	definitions influence the initialization of slots.
	1066
	1067	The process for deciding how to initialize a particular slot works as
	1068	follows.
	1069	\begin{enumerate}
	1070
	1071	\item If there are any slot initargs defined on the slot, or any of its slot
	1072	initializers, \emph{and} the sender supplied a value for one or more of the
	1073	corresponding effective initargs, then the value of the most specific such
	1074	initarg is stored in the slot. (For this purpose, initargs defined earlier
	1075	in a class definition are more specific than initargs defined later.)
	1076
	1077	\item Otherwise, if there are any slot initializers defined which include an
	1078	initializer expression, then the initializer expression from the most
	1079	specific such slot initializer is evaluated and its value stored in the
	1080	slot. (A class may define at most one initializer for any particular slot,
	1081	so no further disambiguation is required.)
	1082
	1083	\item Otherwise, the slot is left uninitialized.
	1084
	1085	\end{enumerate}
	1086	Note that the default values (if any) of effective initargs do \emph{not}
	1087	affect this procedure.
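
The three-way decision can be written down as a small C function. This is a sketch of the rule only, with hypothetical names and an @\|int\| slot: a null pointer stands for `no value supplied by the sender' or `no initializer expression', and the most specific candidate of each kind is assumed to have been chosen already.

```c
#include <assert.h>
#include <stddef.h>

enum slot_source {
    SLOT_FROM_INITARG,                  /* sender supplied a slot initarg */
    SLOT_FROM_INITIALIZER,              /* initializer expression was used */
    SLOT_UNINITIALIZED                  /* slot was left alone */
};

/* Apply the decision procedure to one slot and report what happened. */
enum slot_source init_slot(int *slot, const int *supplied,
                           const int *initializer)
{
    assert(slot != NULL);
    if (supplied) {
        *slot = *supplied;
        return SLOT_FROM_INITARG;
    }
    if (initializer) {
        *slot = *initializer;
        return SLOT_FROM_INITIALIZER;
    }
    return SLOT_UNINITIALIZED;
}
```

The function has no parameter for an initarg default value, reflecting the point that such defaults play no part in choosing a slot's initial value.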
	1088
	1089
	1090	\subsection{Destruction}
	1091	\label{sec:concepts.lifecycle.death}
	1092
	1093	Destruction of an instance, when it is no longer required, consists of two
	1094	steps.
	1095	\begin{enumerate}
	1096	\item \emph{Teardown} releases any resources held by the instance and
	1097	disentangles it from any external data structures.
	1098	\item \emph{Deallocation} releases the memory used to store the instance so
	1099	that it can be reused.
	1100	\end{enumerate}
	1101	Teardown alone, for objects which require special deallocation, or for which
	1102	deallocation occurs automatically (e.g., instances with automatic storage
	1103	duration, or instances whose storage will be garbage-collected), is performed
	1104	using the \descref{fun}{sod_teardown}[function]. Destruction of instances
	1105	allocated from the standard @\|malloc\| heap is done using the
	1106	\descref{fun}{sod_destroy}[function].
	1107
	1108	\subsubsection{Teardown}
	1109	Details of teardown are necessarily class-specific, but typically it
	1110	involves releasing resources held by the instance, and disentangling it from
	1111	any data structures it might be linked into.
	1112
	1113	Teardown is performed by sending the instance the @\|teardown\| message,
	1114	defined by the @\|SodObject\| class. The message returns an integer, used as a
	1115	boolean flag. If the message returns zero, then the instance's storage
	1116	should be deallocated. If the message returns nonzero, then it is safe for
	1117	the caller to forget about the instance, but the caller should not deallocate its storage.
	1118	This is \emph{not} an error return: if some teardown method fails then the
	1119	program may be in an inconsistent state and should not continue.
	1120
	1121	This simple protocol can be used, for example, to implement a reference
	1122	counting system, as follows.
	1123	\begin{prog}
	1124	[nick = ref] \\
	1125	class ReferenceCountedObject: SodObject \{ \\ \ind
	1126	unsigned nref = 1; \\-
	1127	void inc() \{ me@->ref.nref++; \} \\-
	1128	[role = around] \\
	1129	int obj.teardown() \\
	1130	\{ \\ \ind
	1131	if (@--me@->ref.nref) return (1); \\
	1132	else return (CALL_NEXT_METHOD); \-\\
	1133	\} \-\\
	1134	\}
	1135	\end{prog}
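
The caller's side of this protocol can be modelled in plain C. The sketch below does not use Sod: @\|struct counted\| and the three functions are hypothetical stand-ins, with @\|counted_teardown\| playing the part of the @\|around\| method above and @\|counted_release\| playing the part of code which destroys a heap-allocated instance.

```c
#include <assert.h>
#include <stdlib.h>

struct counted {
    unsigned nref;                      /* outstanding references */
};

/* Nonzero: still referenced, keep the storage.  Zero: torn down, so the
 * caller should deallocate. */
int counted_teardown(struct counted *obj)
{
    assert(obj->nref > 0);
    if (--obj->nref) return 1;
    return 0;                           /* stands in for CALL_NEXT_METHOD */
}

/* Drop one reference; return 1 if the storage was actually freed. */
int counted_release(struct counted *obj)
{
    if (counted_teardown(obj)) return 0;
    free(obj);
    return 1;
}

/* Allocate an object holding a single reference, or return null. */
struct counted *counted_new(void)
{
    struct counted *obj = malloc(sizeof(*obj));

    if (obj) obj->nref = 1;
    return obj;
}
```

Only the release which drops the last reference frees the storage; earlier releases simply forget about the object.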
	1136
	1137	The @\|teardown\| message uses a nonstandard method combination which works
	1138	like the standard combination, except that the \emph{default behaviour}, if
	1139	there is no overriding method, is to execute each superclass's teardown
	1140	fragments, and to return zero. This default behaviour may be invoked
	1141	multiple times if some method calls on its @\|next_method\| more than once,
	1142	unless some other method takes steps to prevent this.
	1143
	1144	A class can define \emph{teardown fragments}: pieces of literal code to be
	1145	executed to shut down an instance. Each superclass's teardown fragments are
	1146	executed with @\|me\| bound to an instance pointer of the appropriate
	1147	superclass type; fragments defined by a more specific superclass are executed
	1148	before fragments defined by a less specific superclass. A class may define
	1149	more than one teardown fragment: the fragments are executed in the order in
	1150	which they appear in the class definition. It is possible for an
	1151	teardown fragment to use @\|return\| or @\|goto\| for special control-flow
	1152	effects, but this is not likely to be a good idea. Similarly, it's probably
	1153	a better idea to use an @\|around\| method to influence the return value than
	1154	to write an explicit @\|return\| statement in a teardown fragment.
	1155
	1156	\subsubsection{Deallocation}
	1157	The details of instance deallocation are obviously specific to the allocation
	1158	strategy used by the instance, and this is often orthogonal to the object's
	1159	class.
	1160
	1161	The code which makes the decision to destroy an object may often not be aware
	1162	of the object's direct class. Low-level details of deallocation often
	1163	require the proper base address of the instance's storage, which can be
	1164	determined using the \descref{mac}{SOD_INSTBASE}[macro].
	1165
	1166	%%%--------------------------------------------------------------------------
	1167	\section{Metaclasses} \label{sec:concepts.metaclasses}
	1168
	1169	In Sod, every object is an instance of some class, and -- unlike, say,
	1170	\Cplusplus\ -- classes are proper objects. It follows that, in Sod, every
	1171	class~$C$ is itself an instance of some class~$M$, which is called $C$'s
	1172	\emph{metaclass}. Metaclass instances are usually constructed statically, at
	1173	compile time, and marked read-only.
	1174
	1175	As an added complication, Sod classes, and other metaobjects such as
	1176	messages, methods, slots and so on, also have classes \emph{at translation
	1177	time}. These translation-time metaclasses are not Sod classes; they are CLOS
	1178	classes, implemented in Common Lisp.
	1179
	1180
	1181	\subsection{Runtime metaclasses}
	1182	\label{sec:concepts.metaclasses.runtime}
	1183
	1184	Like other classes, metaclasses can declare messages, and define slots and
	1185	methods. Slots defined by the metaclass are called \emph{class slots}, as
	1186	opposed to \emph{instance slots}. Similarly, messages and methods defined by
	1187	the metaclass are termed \emph{class messages} and \emph{class methods}
	1188	respectively, though these are used much less frequently.
	1189
	1190	\subsubsection{The braid}
	1191	Every object is an instance of some class. There are only finitely many
	1192	classes.
	1193
	1194	\begin{figure}
	1195	\centering
	1196	\begin{tikzpicture}
	1197	\node[lit] (obj) {SodObject};
	1198	\node[lit] (cls) [right=10mm of obj] {SodClass};
	1199	\draw [->, dashed] (obj) to[bend right] (cls);
	1200	\draw [->] (cls) to[bend right] (obj);
	1201	\draw [->, dashed] (cls) to[loop right] (cls);
	1202	\end{tikzpicture}
	1203	\qquad
	1204	\fbox{\ \begin{tikzpicture}
	1205	\node (subclass) {subclass of};
	1206	\node (instance) [below=\jot of subclass] {instance of};
	1207	\draw [->] ($(subclass.west) - (10mm, 0)$) -- ++(8mm, 0);
	1208	\draw [->, dashed] ($(instance.west) - (10mm, 0)$) -- ++(8mm, 0);
	1209	\end{tikzpicture}}
	1210	\caption{The Sod braid} \label{fig:concepts.metaclasses.braid}
	1211	\end{figure}
	1212
	1213	Consider the directed graph whose nodes are classes, and where there is an
	1214	arc from $C$ to $D$ if and only if $C$ is an instance of $D$. There are only
	1215	finitely many nodes. Every node has an arc leaving it, because every object
	1216	-- and hence every class -- is an instance of some class. Therefore this
	1217	graph must contain at least one cycle.
	1218
	1219	In Sod, this situation is resolved in the simplest manner possible:
	1220	@\|SodClass\| is the only predefined metaclass, and it is an instance of
	1221	itself. The only other predefined class is @\|SodObject\|, which is also an
	1222	instance of @\|SodClass\|. There is exactly one root class, namely
	1223	@\|SodObject\|; consequently, @\|SodClass\| is a direct subclass of @\|SodObject\|.
	1224
	1225	\Xref{fig:concepts.metaclasses.braid} shows a diagram of this situation.
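
The braid can also be modelled as a small C data structure. The sketch below
is illustrative only: @\|model_class\| and @\|metaclass_cycle\| are
hypothetical names, and the structure bears no relation to the layout of
real Sod class objects. It follows `instance of' arcs from a class until
they loop, which must happen for the reason given above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a class object: its metaclass and superclass. */
struct model_class {
  const char *name;
  const struct model_class *meta;   /* `instance of' arc */
  const struct model_class *super;  /* `subclass of' arc; null for the root */
};

/* The two predefined classes.  The tentative definitions on the first line
 * let each initializer refer to the other object.
 */
static const struct model_class sod_object, sod_class;
static const struct model_class sod_object = { "SodObject", &sod_class, 0 };
static const struct model_class sod_class =
  { "SodClass", &sod_class, &sod_object };

/* Follow `instance of' arcs from `c' until they loop, and return a class on
 * the loop.  This is Floyd's cycle-finding walk; it terminates because
 * every class has a metaclass and there are only finitely many classes.
 */
static const struct model_class *metaclass_cycle(const struct model_class *c)
{
  const struct model_class *slow, *fast;

  assert(c && c->meta);
  slow = c->meta; fast = c->meta->meta;
  while (slow != fast) { slow = slow->meta; fast = fast->meta->meta; }
  return (slow);
}
```

Starting from either predefined class, the walk ends at the model of
@\|SodClass\|, which is its own metaclass.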
	1226
	1227	\subsubsection{Class slots and initializers}
	1228	Instance initializers were described in \xref{sec:concepts.classes.slots}. A
	1229	class can also define \emph{class initializers}, which provide values for
	1230	slots defined by its metaclass. The initial value for a class slot is
	1231	determined as follows.
	1232	\begin{itemize}
	1233	\item Nonstandard slot classes may be initialized by custom Lisp code. For
	1234	example, all of the slots defined by @\|SodClass\| are of this kind. User
	1235	initializers are not permitted for such slots.
	1236	\item If the class or any of its superclasses defines a class initializer for
	1237	the slot, then the class initializer defined by the most specific such
	1238	superclass is used.
	1239	\item Otherwise, if the metaclass or one of its superclasses defines an
	1240	instance initializer, then the instance initializer defined by the most
	1241	specific such class is used.
	1242	\item Otherwise there is no initializer, and an error will be reported.
	1243	\end{itemize}
	1244	Initializers for class slots must be constant expressions (for scalar slots)
	1245	or aggregate initializers containing constant expressions.
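
The lookup order for ordinary class slots can be sketched in C as follows.
This is a model under stated assumptions, not the translator's
implementation (which is written in Lisp): the names @\|model_class\|,
@\|lookup\| and @\|class_slot_init\| are hypothetical, initializer values are
held as source text, each class lists its proper superclasses most specific
first, and slots initialized by custom Lisp code are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* An initializer: the slot it names and its value, as source text. */
struct init { const char *slot, *value; };

/* Hypothetical model of a class, reduced to what the lookup needs. */
struct model_class {
  const struct init *class_inits; size_t n_class_inits;
  const struct init *inst_inits; size_t n_inst_inits;
  const struct model_class *const *supers;  /* most specific first */
  size_t n_supers;
};

/* Search `c' and then its superclasses for an initializer for `slot'.
 * If `classp' is nonzero, look at class initializers; otherwise look at
 * instance initializers.
 */
static const char *lookup(const struct model_class *c,
                          const char *slot, int classp)
{
  size_t i, j, n;
  const struct model_class *k;
  const struct init *v;

  for (i = 0; i <= c->n_supers; i++) {
    k = i ? c->supers[i - 1] : c;
    v = classp ? k->class_inits : k->inst_inits;
    n = classp ? k->n_class_inits : k->n_inst_inits;
    for (j = 0; j < n; j++)
      if (!strcmp(v[j].slot, slot)) return (v[j].value);
  }
  return (0);
}

/* Initial value for class slot `slot' of `cls', whose metaclass is `meta'.
 * A null result means that there is no initializer, which the translator
 * would report as an error.
 */
static const char *class_slot_init(const struct model_class *cls,
                                   const struct model_class *meta,
                                   const char *slot)
{
  const char *v;

  assert(cls && meta && slot);
  if ((v = lookup(cls, slot, 1)) != 0) return (v);
  return (lookup(meta, slot, 0));
}
```

A class initializer anywhere in the class's own ancestry therefore takes
priority over any instance initializer supplied by the metaclass.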
	1246
	1247	\subsubsection{Metaclass selection and consistency}
	1248	Sod enforces a \emph{metaclass consistency rule}: if $C$ has metaclass $M$,
	1249	then any subclass of $C$ must have a metaclass which is a subclass of $M$.
	1250
	1251	The definition of a new class can name the new class's metaclass explicitly,
	1252	by defining a @\|metaclass\| property; the Sod translator will verify that the
	1253	choice of metaclass is acceptable.
	1254
	1255	If no @\|metaclass\| property is given, then the translator will select a
	1256	default metaclass as follows. Let $C_1$, $C_2$, \dots, $C_n$ be the direct
	1257	superclasses of the new class, and let $M_1$, $M_2$, \dots, $M_n$ be their
	1258	respective metaclasses (not necessarily distinct). If there exists exactly
	1259	one minimal metaclass $M_i$, i.e., there exists an $i$, with $1 \le i \le n$,
	1260	such that $M_i$ is a subclass of every $M_j$, for $1 \le j \le n$, then $M_i$
	1261	is selected as the new class's metaclass. Otherwise the situation is
	1262	ambiguous and an error will be reported. Usually, the ambiguity can be
	1263	resolved satisfactorily by defining a new class $M^*$ as a direct subclass of
	1264	the minimal $M_j$.
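
The consistency check and the default selection rule can be sketched in C.
The sketch is a model only, since the translator itself is written in Lisp:
@\|model_class\|, @\|subclassp\|, @\|metaclass_acceptable\| and
@\|default_metaclass\| are hypothetical names, and a class is reduced to its
name and its direct superclasses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a class: a name and its direct superclasses. */
struct model_class {
  const char *name;
  const struct model_class *const *supers;
  size_t n_supers;
};

/* Return nonzero if `sub' is `super' or inherits from it, directly or
 * indirectly.
 */
static int subclassp(const struct model_class *sub,
                     const struct model_class *super)
{
  size_t i;

  if (sub == super) return (1);
  for (i = 0; i < sub->n_supers; i++)
    if (subclassp(sub->supers[i], super)) return (1);
  return (0);
}

/* Consistency rule: `m' is acceptable as the new class's metaclass only if
 * it is a subclass of each direct superclass's metaclass M_j.
 */
static int metaclass_acceptable(const struct model_class *m,
                                const struct model_class *const *metas,
                                size_t n)
{
  size_t j;

  for (j = 0; j < n; j++)
    if (!subclassp(m, metas[j])) return (0);
  return (1);
}

/* Default rule: return the M_i which is a subclass of every M_j, or null
 * if there is none (the ambiguous case, reported as an error).
 */
static const struct model_class *
default_metaclass(const struct model_class *const *metas, size_t n)
{
  size_t i;

  assert(n > 0);
  for (i = 0; i < n; i++)
    if (metaclass_acceptable(metas[i], metas, n)) return (metas[i]);
  return (0);
}
```

If two direct superclasses have unrelated metaclasses, then
@\|default_metaclass\| finds nothing; a new metaclass which inherits from both
passes @\|metaclass_acceptable\|, and so could be named explicitly using the
@\|metaclass\| property.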
	1265
	1266
	1267	\subsection{Translation-time metaobjects}
	1268	\label{sec:concepts.metaclasses.compile-time}
	1269
	1270	Within the translator, modules, classes, slots and initializers, messages and
	1271	methods are all represented as instances of classes. Since the translator is
	1272	written in Common Lisp, these translation-time metaobject classes are all
	1273	CLOS classes. Extensions can influence the translator's behaviour -- and
	1274	hence the layout and behaviour of instances at runtime -- by subclassing the
	1275	built-in metaobject classes and implementing methods on appropriate generic
	1276	functions.
	1277
	1278	Metaobject classes are chosen in a fairly standard way.
	1279	\begin{itemize}
	1280	\item All metaobject definitions support a symbol-valued property, usually
	1281	named @\|@<thing>_class\| (e.g., @\|slot_class\|, @\|method_class\|), which sets
	1282	the metaobject class explicitly. (The class for a class metaobject is
	1283	taken from the @\|lisp_class\| property, because @\|class_class\| seems less
	1284	meaningful.)
	1285	\item Failing that, the metaobject's parents choose a default metaobject
	1286	class, based on the new metaobject's properties; i.e., slots and messages
	1287	have their metaobject classes chosen by the defining class metaobject;
	1288	initializer and initarg classes are chosen by the defining class metaobject
	1289	and the direct slot metaobject; and method classes are chosen by the
	1290	defining class metaobject and the message metaobject.
	1291	\item Classes have no parents; instead, the default is simply to use the
	1292	built-in metaobject class @\|sod-class\|.
	1293	\item Modules are a special case because the property syntax is rather
	1294	awkward. All modules are initially created as instances of the built-in
	1295	metaclass @\|module\|. Once the module has been parsed completely, the
	1296	module metaobject's class is changed, using @\|change-class\|, to the class
	1297	specified in the module's property set.
	1298	\end{itemize}
	1299
	1300	%%%--------------------------------------------------------------------------
	1301	\section{Compatibility considerations} \label{sec:concepts.compatibility}
	1302
	1303	Sod doesn't make source-level compatibility especially difficult. As long as
	1304	classes, slots, and messages don't change names or disappear, and slots and
	1305	messages retain their approximate types, everything will be fine.
	1306
	1307	Binary compatibility is much more difficult. Unfortunately, Sod classes have
	1308	rather fragile binary interfaces.\footnote{%
	1309	Research suggestion: investigate alternative instance and vtable layouts
	1310	which improve binary compatibility, probably at the expense of instance
	1311	compactness, and efficiency of slot access and message sending. There may
	1312	be interesting trade-offs to be made.} %
	1313
	1314	If instances are allocated \fixme{incomplete}
	1315
	1316	%%%----- That's all, folks --------------------------------------------------
	1317
	1318	%%% Local variables:
	1319	%%% mode: LaTeX
	1320	%%% TeX-master: "sod.tex"
	1321	%%% TeX-PDF-mode: t
	1322	%%% End: