	1	%%% -*-latex-*-
	2	%%%
	3	%%% Conceptual background
	4	%%%
	5	%%% (c) 2015 Straylight/Edgeware
	6	%%%
	7
	8	%%%----- Licensing notice ---------------------------------------------------
	9	%%%
	10	%%% This file is part of the Sensible Object Design, an object system for C.
	11	%%%
	12	%%% SOD is free software; you can redistribute it and/or modify
	13	%%% it under the terms of the GNU General Public License as published by
	14	%%% the Free Software Foundation; either version 2 of the License, or
	15	%%% (at your option) any later version.
	16	%%%
	17	%%% SOD is distributed in the hope that it will be useful,
	18	%%% but WITHOUT ANY WARRANTY; without even the implied warranty of
	19	%%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
	20	%%% GNU General Public License for more details.
	21	%%%
	22	%%% You should have received a copy of the GNU General Public License
	23	%%% along with SOD; if not, write to the Free Software Foundation,
	24	%%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
	25
	26	\chapter{Concepts} \label{ch:concepts}
	27
	28	%%%--------------------------------------------------------------------------
	29	\section{Modules} \label{sec:concepts.modules}
	30
	31	A \emph{module} is the top-level syntactic unit of input to the Sod
	32	translator. As described above, given an input module, the translator
	33	generates C source and header files.
	34
	35	A module can \emph{import} other modules. This makes the type names and
	36	classes defined in those other modules available to class definitions in the
	37	importing module. Sod's module system is intentionally very simple. There
	38	are no private declarations or attempts to hide things.
	39
	40	As well as importing existing modules, a module can include a number of
	41	different kinds of \emph{items}:
	42	\begin{itemize}
	43	\item \emph{class definitions} describe new classes, possibly in terms of
	44	existing classes;
	45	\item \emph{type name declarations} introduce new type names to Sod's
	46	parser;\footnote{%
	47	This is unfortunately necessary because C syntax, upon which Sod's input
	48	language is based for obvious reasons, needs to treat type names
	49	differently from other kinds of identifiers.} %
	50	and
	51	\item \emph{code fragments} contain literal C code to be dropped into an
	52	appropriate place in an output file.
	53	\end{itemize}
	54	Each kind of item, and, indeed, a module as a whole, can have a collection of
	55	\emph{properties} associated with it. A property has a \emph{name} and a
	56	\emph{value}. Properties are an open-ended way of attaching additional
	57	information to module items, so extensions can make use of them without
	58	having to implement additional syntax.
	59
	60	%%%--------------------------------------------------------------------------
	61	\section{Classes, instances, and slots} \label{sec:concepts.classes}
	62
	63	For the most part, Sod takes a fairly traditional view of what it means to be
	64	an object system.
	65
	66	An \emph{object} maintains \emph{state} and exhibits \emph{behaviour}. An
	67	object's state is maintained in named \emph{slots}, each of which can store a
	68	C value of an appropriate (scalar or aggregate) type. An object's behaviour
	69	is stimulated by sending it \emph{messages}. A message has a name, and may
	70	carry a number of arguments, which are C values; sending a message may result
	71	in the state of the receiving object (or other objects) being changed, and a C
	72	value being returned to the sender.
	73
	74	Every object is a (direct) instance of some \emph{class}. The class
	75	determines which slots its instances have, which messages its instances can
	76	be sent, and which methods are invoked when those messages are received. The
	77	Sod translator's main job is to read class definitions and convert them into
	78	appropriate C declarations, tables, and functions. An object cannot
	79	(usually) change its direct class, and the direct class of an object is not
	80	affected by, for example, the static type of a pointer to it.
	81
	82
	83	\subsection{Superclasses and inheritance}
	84	\label{sec:concepts.classes.inherit}
	85
	86	\subsubsection{Class relationships}
	87	Each class has zero or more \emph{direct superclasses}.
	88
	89	A class with no direct superclasses is called a \emph{root class}. The Sod
	90	runtime library includes a root class named @\|SodObject\|; making new root
	91	classes is somewhat tricky, and won't be discussed further here.
	92
	93	Classes can have more than one direct superclass, i.e., Sod supports
	94	\emph{multiple inheritance}. A Sod class definition for a class~$C$ lists
	95	the direct superclasses of $C$ in a particular order. This order is called
	96	the \emph{local precedence order} of $C$, and the list which consists of $C$
	97	followed by $C$'s direct superclasses in local precedence order is called
	98	$C$'s \emph{local precedence list}.
	99
	100	The multiple inheritance in Sod works similarly to multiple inheritance in
	101	Lisp-like languages, such as Common Lisp, EuLisp, Dylan, and Python, which is
	102	very different from how multiple inheritance works in \Cplusplus.\footnote{%
	103	The latter can be summarized as `badly'. By default in \Cplusplus, an
	104	instance receives an additional copy of a superclass's state for each path
	105	through the class graph from the instance's direct class to that
	106	superclass, though this behaviour can be overridden by declaring
	107	superclasses to be @\|virtual\|. Also, \Cplusplus\ offers only trivial
	108	method combination (\xref{sec:concepts.methods}), leaving programmers to
	109	deal with delegation manually and (usually) statically.} %
	110
	111	If $C$ is a class, then the \emph{superclasses} of $C$ are
	112	\begin{itemize}
	113	\item $C$ itself, and
	114	\item the superclasses of each of $C$'s direct superclasses.
	115	\end{itemize}
	116	The \emph{proper superclasses} of a class $C$ are the superclasses of $C$
	117	except for $C$ itself. If a class $B$ is a (direct, proper) superclass of
	118	$C$, then $C$ is a \emph{(direct, proper) subclass} of $B$. If $C$ is a root
	119	class then the only superclass of $C$ is $C$ itself, and $C$ has no proper
	120	superclasses.
	121
	122	If an object is a direct instance of class~$C$ then the object is also an
	123	(indirect) instance of every superclass of $C$.
	124
	125	If $C$ has a proper superclass $B$, then $B$ must not have $C$ as a direct
	126	superclass. In different terms, if we construct a graph, whose vertices are
	127	classes, and draw an edge from each class to each of its direct superclasses,
	128	then this graph must be acyclic. In yet other terms, the `is a superclass
	129	of' relation is a partial order on classes.
	130
	131	\subsubsection{The class precedence list}
	132	This partial order is not quite sufficient for our purposes. For each class
	133	$C$, we shall need to extend it into a total order on $C$'s superclasses.
	134	This calculation is called \emph{superclass linearization}, and the result is
	135	a \emph{class precedence list}, which lists each of $C$'s superclasses
	136	exactly once. If a superclass $B$ precedes (resp.\ follows) some other
	137	superclass $A$ in $C$'s class precedence list, then we say that $B$ is a more
	138	(resp.\ less) \emph{specific} superclass of $C$ than $A$ is.
	139
	140	The superclass linearization algorithm isn't fixed, and extensions to the
	141	translator can introduce new linearizations for special effects, but the
	142	following properties are expected to hold.
	143	\begin{itemize}
	144	\item The first class in $C$'s class precedence list is $C$ itself; i.e.,
	145	$C$ is always its own most specific superclass.
	146	\item If $A$ and $B$ are both superclasses of $C$, and $A$ is a proper
	147	superclass of $B$ then $A$ appears after $B$ in $C$'s class precedence
	148	list, i.e., $B$ is a more specific superclass of $C$ than $A$ is.
	149	\end{itemize}
	150	The default linearization algorithm used in Sod is the \emph{C3} algorithm,
	151	which has a number of good properties described in~\cite{Barrett:1996:MSL}.
	152	It works as follows.
	153	\begin{itemize}
	154	\item A \emph{merge} of some number of input lists is a single list
	155	containing each item that is in any of the input lists exactly once, and no
	156	other items; if an item $x$ appears before an item $y$ in any input list,
	157	then $x$ also appears before $y$ in the merge. If a collection of lists
	158	has no merge then the lists are said to be \emph{inconsistent}.
	159	\item The class precedence list of a class $C$ is a merge of the local
	160	precedence list of $C$ together with the class precedence lists of each of
	161	$C$'s direct superclasses.
	162	\item If there are no such merges, then the definition of $C$ is invalid.
	163	\item Suppose that there are multiple candidate merges. Consider the
	164	earliest position in these candidate merges at which they disagree. The
	165	\emph{candidate classes} at this position are the classes appearing at this
	166	position in the candidate merges. Each candidate class must be a
	167	superclass of distinct direct superclasses of $C$, since otherwise the
	168	candidates would be ordered by their common subclass's class precedence
	169	list. The class precedence list contains, at this position, that candidate
	170	class whose subclass appears earliest in $C$'s local precedence order.
	171	\end{itemize}
	172
	173	\begin{figure}
	174	\centering
	175	\begin{tikzpicture}[x=7.5mm, y=-14mm, baseline=(current bounding box.east)]
	176	\node[lit] at ( 0, 0) (R) {SodObject};
	177	\node[lit] at (-3, +1) (A) {A}; \draw[->] (A) -- (R);
	178	\node[lit] at (-1, +1) (B) {B}; \draw[->] (B) -- (R);
	179	\node[lit] at (+1, +1) (C) {C}; \draw[->] (C) -- (R);
	180	\node[lit] at (+3, +1) (D) {D}; \draw[->] (D) -- (R);
	181	\node[lit] at (-2, +2) (E) {E}; \draw[->] (E) -- (A);
	182	\draw[->] (E) -- (B);
	183	\node[lit] at (+2, +2) (F) {F}; \draw[->] (F) -- (A);
	184	\draw[->] (F) -- (D);
	185	\node[lit] at (-1, +3) (G) {G}; \draw[->] (G) -- (E);
	186	\draw[->] (G) -- (C);
	187	\node[lit] at (+1, +3) (H) {H}; \draw[->] (H) -- (F);
	188	\node[lit] at ( 0, +4) (I) {I}; \draw[->] (I) -- (G);
	189	\draw[->] (I) -- (H);
	190	\end{tikzpicture}
	191	\quad
	192	\vrule
	193	\quad
	194	\begin{minipage}[c]{0.45\hsize}
	195	\begin{nprog}
	196	class A: SodObject \{ \}\quad\=@/* @\|A\|, @\|SodObject\| */ \\
	197	class B: SodObject \{ \}\>@/* @\|B\|, @\|SodObject\| */ \\
	198	class C: SodObject \{ \}\>@/* @\|C\|, @\|SodObject\| */ \\
	199	class D: SodObject \{ \}\>@/* @\|D\|, @\|SodObject\| */ \\+
	200	class E: A, B \{ \}\quad\=@/* @\|E\|, @\|A\|, @\|B\|, \dots */ \\
	201	class F: A, D \{ \}\>@/* @\|F\|, @\|A\|, @\|D\|, \dots */ \\+
	202	class G: E, C \{ \}\>@/* @\|G\|, @\|E\|, @\|A\|,
	203	@\|B\|, @\|C\|, \dots */ \\
	204	class H: F \{ \}\>@/* @\|H\|, @\|F\|, @\|A\|, @\|D\|, \dots */ \\+
	205	class I: G, H \{ \}\>@/* @\|I\|, @\|G\|, @\|E\|, @\|H\|, @\|F\|,
	206	@\|A\|, @\|B\|, @\|C\|, @\|D\|, \dots */
	207	\end{nprog}
	208	\end{minipage}
	209
	210	\caption{An example class graph and class precedence lists}
	211	\label{fig:concepts.classes.cpl-example}
	212	\end{figure}
	213
	214	\begin{example}
	215	Consider the class relationships shown in
	216	\xref{fig:concepts.classes.cpl-example}.
	217
	218	\begin{itemize}
	219
	220	\item @\|SodObject\| has no proper superclasses. Its class precedence list
	221	is therefore simply $\langle @\|SodObject\| \rangle$.
	222
	223	\item In general, if $X$ is a direct subclass only of $Y$, and $Y$'s class
	224	precedence list is $\langle Y, \ldots \rangle$, then $X$'s class
	225	precedence list is $\langle X, Y, \ldots \rangle$. This explains $A$,
	226	$B$, $C$, $D$, and $H$.
	227
	228	\item $E$'s list is found by merging its local precedence list $\langle E,
	229	A, B \rangle$ with the class precedence lists of its direct superclasses,
	230	which are $\langle A, @\|SodObject\| \rangle$ and $\langle B, @\|SodObject\|
	231	\rangle$. Clearly, @\|SodObject\| must be last, and $E$'s local precedence
	232	list orders the rest, giving $\langle E, A, B, @\|SodObject\| \rangle$.
	233	$F$ is similar.
	234
	235	\item We determine $G$'s class precedence list by merging the three lists
	236	$\langle G, E, C \rangle$, $\langle E, A, B, @\|SodObject\| \rangle$, and
	237	$\langle C, @\|SodObject\| \rangle$. The class precedence list begins
	238	$\langle G, E, \ldots \rangle$, but the individual lists don't order $A$
	239	and $C$. Comparing these to $G$'s direct superclasses, we see that $A$
	240	is a superclass of $E$, while $C$ is a superclass of -- indeed equal to --
	241	$C$; $E$ precedes $C$ in $G$'s local precedence order, so $A$ must precede
	242	$C$, as must $B$, and the final list is $\langle G, E, A, B, C, @\|SodObject\| \rangle$.
	243
	244	\item Finally, we determine $I$'s class precedence list by merging $\langle
	245	I, G, H \rangle$, $\langle G, E, A, B, C, @\|SodObject\| \rangle$, and
	246	$\langle H, F, A, D, @\|SodObject\| \rangle$. The list begins $\langle I,
	247	G, \ldots \rangle$, and then we must break a tie between $E$ and $H$; but
	248	$E$ is a superclass of $G$, which precedes $H$ in $I$'s local precedence order, so $E$ wins. Next, $H$ and $F$ must precede
	249	$A$, since these are ordered by $H$'s class precedence list. Then $B$
	250	and $C$ precede $D$, since the former are superclasses of $G$, and the
	251	final list is $\langle I, G, E, H, F, A, B, C, D, @\|SodObject\| \rangle$.
	252
	253	\end{itemize}
	254
	255	(This example combines elements from \cite{Barrett:1996:MSL} and
	256	\cite{Ducournau:1994:PMM}.)
	257	\end{example}
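
The merge procedure can be sketched in C as follows. This is only an illustration, not the translator's code. It uses the common worklist formulation of C3 (repeatedly take the first input-list head which appears in no other list's tail), which is assumed here to agree with the description above, and it hard-codes the class graph from \xref{fig:concepts.classes.cpl-example}. The names @\|c3_cpl\| and @\|cpl_string\|, and the enumeration of classes, are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXSUP 2                /* most direct superclasses of any class here */
#define MAXCPL 16               /* capacity of a precedence list */

enum { OBJ, A, B, C, D, E, F, G, H, I };

struct cls {
  const char *name;
  int nsup;
  int sup[MAXSUP];              /* direct superclasses, local precedence order */
};

/* The class graph from the figure (hard-coded for this sketch). */
static const struct cls classes[] = {
  [OBJ] = { "SodObject", 0, { 0 } },
  [A] = { "A", 1, { OBJ } },  [B] = { "B", 1, { OBJ } },
  [C] = { "C", 1, { OBJ } },  [D] = { "D", 1, { OBJ } },
  [E] = { "E", 2, { A, B } }, [F] = { "F", 2, { A, D } },
  [G] = { "G", 2, { E, C } }, [H] = { "H", 1, { F } },
  [I] = { "I", 2, { G, H } }
};

struct list { int n; int v[MAXCPL]; };

/* Does class c occur in list l strictly after position pos? */
static int in_tail(const struct list *l, int pos, int c)
{
  for (int i = pos + 1; i < l->n; i++)
    if (l->v[i] == c) return 1;
  return 0;
}

/* Store the class precedence list of c in out.  Returns 0 on success,
 * or -1 if the input lists are inconsistent. */
int c3_cpl(int c, struct list *out)
{
  const struct cls *k = &classes[c];
  struct list in[MAXSUP + 1];
  int pos[MAXSUP + 1] = { 0 };  /* index of each input list's current head */
  int nin = 0;

  /* Inputs: each direct superclass's precedence list, then the list of
   * direct superclasses themselves. */
  for (int i = 0; i < k->nsup; i++)
    if (c3_cpl(k->sup[i], &in[nin++])) return -1;
  in[nin].n = k->nsup;
  for (int i = 0; i < k->nsup; i++) in[nin].v[i] = k->sup[i];
  nin++;

  out->n = 0;
  out->v[out->n++] = c;         /* a class is its own most specific superclass */
  for (;;) {
    int pick = -1, remaining = 0;
    for (int i = 0; i < nin && pick < 0; i++) {
      int blocked = 0;
      if (pos[i] >= in[i].n) continue;
      remaining = 1;
      for (int j = 0; j < nin; j++)
        if (in_tail(&in[j], pos[j], in[i].v[pos[i]])) blocked = 1;
      if (!blocked) pick = in[i].v[pos[i]];
    }
    if (pick < 0) return remaining ? -1 : 0;
    out->v[out->n++] = pick;
    for (int i = 0; i < nin; i++)
      if (pos[i] < in[i].n && in[i].v[pos[i]] == pick) pos[i]++;
  }
}

/* Write the names in c's class precedence list, space-separated, to buf. */
const char *cpl_string(int c, char *buf, size_t sz)
{
  struct list l;
  assert(sz > 0);
  buf[0] = '\0';
  if (c3_cpl(c, &l)) return NULL;
  for (int i = 0; i < l.n; i++) {
    if (i) strncat(buf, " ", sz - strlen(buf) - 1);
    strncat(buf, classes[l.v[i]].name, sz - strlen(buf) - 1);
  }
  return buf;
}
```

Under these assumptions, calling @\|cpl_string(I, buf, sizeof(buf))\| yields the names in the order $I$, $G$, $E$, $H$, $F$, $A$, $B$, $C$, $D$, @\|SodObject\|, matching the list derived above.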
	258
	259	\subsubsection{Class links and chains}
	260	The definition for a class $C$ may distinguish one of its proper superclasses
	261	as being the \emph{link superclass} for class $C$. Not every class need have
	262	a link superclass, and the link superclass of a class $C$, if it exists, need
	263	not be a direct superclass of $C$.
	264
	265	Superclass links must obey the following rule: if $C$ is a class, then there
	266	must be no three distinct superclasses $X$, $Y$ and~$Z$ of $C$ such that $Z$
	267	is the link superclass of both $X$ and $Y$. As a consequence of this rule,
	268	the superclasses of $C$ can be partitioned into linear \emph{chains}, such
	269	that superclasses $A$ and $B$ are in the same chain if and only if one can
	270	trace a path from $A$ to $B$ by following superclass links, or \emph{vice
	271	versa}.
	272
	273	Since a class links only to one of its proper superclasses, the classes in a
	274	chain are naturally ordered from most- to least-specific. The least specific
	275	class in a chain is called the \emph{chain head}; the most specific class is
	276	the \emph{chain tail}. Chains are often named after their chain head
	277	classes.
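
As a rough illustration of chains, the following C sketch finds a class's chain head by following superclass links, and checks the rule that no two distinct classes share a link superclass. The link choices in the table are purely hypothetical (the figure above does not specify any), and the names @\|link_super\|, @\|chain_head\| and @\|links_valid\| are invented for the sketch.

```c
#include <assert.h>

enum { OBJ, A, B, C, D, E, F, G, H, I, NCLASSES };
#define NOLINK (-1)

/* Hypothetical link superclasses for the classes of the earlier figure.
 * Note that E links to A and F links to D: sharing A would break the rule. */
static const int link_super[NCLASSES] = {
  [OBJ] = NOLINK, [A] = OBJ,    [B] = NOLINK, [C] = NOLINK, [D] = NOLINK,
  [E] = A,        [F] = D,      [G] = E,      [H] = F,      [I] = G
};

/* Follow links from c until reaching a class with no link superclass.
 * This terminates because a class links only to a proper superclass. */
int chain_head(int c)
{
  while (link_super[c] != NOLINK) c = link_super[c];
  return c;
}

/* Check that no two distinct classes in set[0..n) share a link superclass. */
int links_valid(const int *set, int n)
{
  for (int i = 0; i < n; i++)
    for (int j = i + 1; j < n; j++)
      if (link_super[set[i]] != NOLINK &&
          link_super[set[i]] == link_super[set[j]])
        return 0;
  return 1;
}

/* Exercise the sketch on the superclasses of I; returns 0 if all is well. */
int chain_demo(void)
{
  static const int supers_of_I[] = { I, G, E, H, F, A, B, C, D, OBJ };

  assert(links_valid(supers_of_I, 10));
  assert(chain_head(I) == OBJ);         /* chain: I, G, E, A, SodObject */
  assert(chain_head(H) == D);           /* chain: H, F, D */
  assert(chain_head(B) == B);           /* B is alone in its chain */
  return 0;
}
```

With these made-up links, the superclasses of $I$ fall into four chains, headed by @\|SodObject\|, $D$, $B$ and $C$.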
	278
	279	\subsection{Names}
	280	\label{sec:concepts.classes.names}
	281
	282	Classes have a number of other attributes:
	283	\begin{itemize}
	284	\item A \emph{name}, which is a C identifier. Class names must be globally
	285	unique. The class name is used in the names of a number of associated
	286	definitions, to be described later.
	287	\item A \emph{nickname}, which is also a C identifier. Unlike names,
	288	nicknames are not required to be globally unique. If $C$ is any class,
	289	then all the superclasses of $C$ must have distinct nicknames.
	290	\end{itemize}
	291
	292
	293	\subsection{Slots} \label{sec:concepts.classes.slots}
	294
	295	Each class defines a number of \emph{slots}. Much like a structure member, a
	296	slot has a \emph{name}, which is a C identifier, and a \emph{type}. Unlike
	297	many other object systems, different superclasses of a class $C$ can define
	298	slots with the same name without ambiguity, since slot references are always
	299	qualified by the defining class's nickname.
	300
	301	\subsubsection{Slot initializers}
	302	As well as defining slot names and types, a class can also associate an
	303	\emph{initial value} with each slot defined by itself or one of its
	304	superclasses. A class $C$ provides an \emph{initialization message} (see
	305	\xref{sec:concepts.lifecycle.birth} and \xref{sec:structures.root.sodclass})
	306	whose methods set the slots of a \emph{direct} instance of the class to the
	307	correct initial values. If several of $C$'s superclasses define initializers
	308	for the same slot then the initializer from the most specific such class is
	309	used. If none of $C$'s superclasses define an initializer for some slot then
	310	that slot will be left uninitialized.
	311
	312	The initializer for a slot with scalar type may be any C expression. The
	313	initializer for a slot with aggregate type must contain only constant
	314	expressions if the generated code is expected to be processed by an
	315	implementation of C89. Initializers will be evaluated once each time an
	316	instance is initialized.
	317
	318	Slots are initialized in reverse-precedence order of their defining classes;
	319	i.e., slots defined by a less specific superclass are initialized earlier
	320	than slots defined by a more specific superclass. Slots defined by the same
	321	class are initialized in the order in which they appear in the class
	322	definition.
	323
	324	The initializer for a slot may refer to other slots in the same object, via
	325	the @\|me\| pointer: in an initializer for a slot defined by a class $C$, @\|me\|
	326	has type `pointer to $C$'. (Note that the type of @\|me\| depends only on the
	327	class which defined the slot, not the class which defined the initializer.)
	328
	329	A class can also define \emph{class slot initializers}, which provide values
	330	for a slot defined by its metaclass; see \xref{sec:concepts.metaclasses} for
	331	details.
	332
	333
	334	\subsection{C language integration} \label{sec:concepts.classes.c}
	335
	336	For each class~$C$, the Sod translator defines a C type, the \emph{class
	337	type}, with the same name. This is the usual type used when considering an
	338	object as an instance of class~$C$. No entire object will normally have a
	339	class type,\footnote{%
	340	In general, a class type only captures the structure of one of the
	341	superclass chains of an instance. A full instance layout contains multiple
	342	chains. See \xref{sec:structures.layout} for the full details.} %
	343	so access to instances is almost always via pointers.
	344
	345	\subsubsection{Access to slots}
	346	The class type for a class~$C$ is actually a structure. It contains one
	347	member for each class in $C$'s superclass chain, named with that class's
	348	nickname. Each of these members is also a structure, containing the
	349	corresponding class's slots, one member per slot. There's nothing special
	350	about these slot members: C code can access them in the usual way.
	351
	352	For example, if @\|MyClass\| has the nickname @\|mine\|, and defines a slot @\|x\|
	353	of type @\|int\|, then the simple function
	354	\begin{prog}
	355	int get_x(MyClass *m) \{ return (m@->mine.x); \}
	356	\end{prog}
	357	will extract the value of @\|x\| from an instance of @\|MyClass\|.
	358
	359	All of this means that there's no such thing as `private' or `protected'
	360	slots. If you want to hide implementation details, the best approach is to
	361	stash them in a dynamically allocated private structure, and leave a pointer
	362	to it in a slot. (This will also help preserve binary compatibility, because
	363	the private structure can grow more members as needed. See
	364	\xref{sec:concepts.compatibility} for more details.)
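
The private-structure idiom might look something like the following sketch. The class type here is written by hand as a stand-in: a real one is generated by the translator and contains more (such as the vtable pointer). The names @\|MyClass_private\|, @\|myclass_setup\|, @\|myclass_x_squared\| and @\|myclass_release\| are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hidden details.  In a real program only a forward declaration of this
 * structure would appear in the public header. */
struct MyClass_private { int cache; int cache_valid; };

/* Hand-written stand-in for the class type of MyClass (nickname `mine'). */
typedef struct MyClass {
  struct {
    int x;                              /* an ordinary, visible slot */
    struct MyClass_private *priv;       /* slot pointing at private state */
  } mine;
} MyClass;

int get_x(MyClass *m) { return (m->mine.x); }

/* Set up the slots; returns 0 on success, -1 if allocation fails. */
int myclass_setup(MyClass *m, int x)
{
  struct MyClass_private *p = malloc(sizeof(*p));
  if (!p) return -1;
  p->cache = 0; p->cache_valid = 0;
  m->mine.x = x;
  m->mine.priv = p;
  return 0;
}

/* Return x squared, remembering the answer in the private structure. */
int myclass_x_squared(MyClass *m)
{
  struct MyClass_private *p = m->mine.priv;
  if (!p->cache_valid) {
    p->cache = m->mine.x * m->mine.x;
    p->cache_valid = 1;
  }
  return p->cache;
}

void myclass_release(MyClass *m) { free(m->mine.priv); m->mine.priv = NULL; }

/* Exercise the sketch; returns 0 if all is well. */
int private_demo(void)
{
  MyClass obj;

  if (myclass_setup(&obj, 7)) return -1;
  assert(get_x(&obj) == 7);
  assert(myclass_x_squared(&obj) == 49);
  myclass_release(&obj);
  return 0;
}
```

Because clients see only the pointer, members can later be added to the private structure without changing the size or layout of the slot itself.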
	365
	366
	367	\subsubsection{Sending messages}
	368	Sod defines a macro for each message. If a class $C$ defines a message $m$,
	369	then the macro is called @\|$C$_$m$\|. The macro takes a pointer to the
	370	receiving object as its first argument, followed by the message arguments, if
	371	any, and returns the value returned by the object's effective method for the
	372	message (if any). If you have a pointer to an instance of any of $C$'s
	373	subclasses, then you can send it the message; it doesn't matter whether the
	374	subclass is on the same chain. Note that the receiver argument is evaluated
	375	twice, so it's not safe to write a receiver expression which has
	376	side-effects.
	377
	378	For example, suppose we defined
	379	\begin{prog}
	380	[nick = soupy] \\
	381	class Super: SodObject \{ \\ \ind
	382	void msg(const char *m); \-\\
	383	\} \\+
	384	class Sub: Super \{ \\ \ind
	385	void soupy.msg(const char *m)
	386	\{ printf("sub sent `\%s'@\\n", m); \} \-\\
	387	\}
	388	\end{prog}
	389	then we can send the message like this:
	390	\begin{prog}
	391	Sub *sub = /* \dots\ */; \\
	392	Super_msg(sub, "hello");
	393	\end{prog}
	394
	395	What happens under the covers is as follows. The structure pointed to by the
	396	instance pointer has a member named @\|_vt\|, which points to a structure
	397	called a `virtual table', or \emph{vtable}, which contains various pieces of
	398	information about the object's direct class and layout, and holds pointers to
	399	method entries for the messages which the object can receive. The
	400	message-sending macro in the example above expands to something similar to
	401	\begin{prog}
	402	sub@->_vt@->soupy.msg(sub, "hello");
	403	\end{prog}
	404
	405	The vtable contains other useful information, such as a pointer to the
	406	instance's direct class's \emph{class object} (described below). The full
	407	details of the contents and layout of vtables are given in
	408	\xref{sec:structures.layout.vtable}.
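
The dispatch mechanism can be imitated by hand in plain C, as in the sketch below. The structures are simplified stand-ins rather than the translator's real output: the vtable holds only one method entry, and a single structure type serves for both @\|Super\| and @\|Sub\|. The helper names (@\|sub_msg_method\|, @\|find_receiver\|, and so on) are invented. The second send shows why a receiver expression with side-effects is unsafe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct Super Super;

/* Simplified vtable: method entries grouped under the nickname of the
 * class which defined the message. */
struct Super_vtable {
  struct {
    void (*msg)(Super *me, const char *m);
  } soupy;
};

struct Super { const struct Super_vtable *_vt; };

/* Hand-written imitation of a message-sending macro.  Note that `me'
 * appears twice in the expansion. */
#define Super_msg(me, m) ((me)->_vt->soupy.msg((me), (m)))

static char last_output[64];

/* Stand-in for the method entry which Sub supplies for the message. */
static void sub_msg_method(Super *me, const char *m)
{
  (void)me;
  snprintf(last_output, sizeof(last_output), "sub sent `%s'", m);
}

static const struct Super_vtable sub_vtable = { { sub_msg_method } };
static Super sub_instance = { &sub_vtable };

/* A receiver expression with a side-effect: it counts its evaluations. */
static int receiver_evaluations = 0;
static Super *find_receiver(void)
{
  receiver_evaluations++;
  return &sub_instance;
}

/* Send two messages; returns how often find_receiver was called. */
int vtable_demo(void)
{
  Super_msg(&sub_instance, "hello");
  assert(strcmp(last_output, "sub sent `hello'") == 0);

  Super_msg(find_receiver(), "again");  /* evaluates the receiver twice */
  return receiver_evaluations;
}
```

A single send through @\|find_receiver()\| calls that function twice, once to locate the vtable and once to pass the receiver as an argument.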
	409
	410
	411	\subsubsection{Class objects}
	412	In Sod's object system, classes are objects too. Therefore classes are
	413	themselves instances; the class of a class is called a \emph{metaclass}. The
	414	consequences of this are explored in \xref{sec:concepts.metaclasses}. The
	415	\emph{class object} has the same name as the class, suffixed with
	416	`@\|__class\|'\footnote{%
	417	This is not quite true. @\|$C$__class\| is actually a macro. See
	418	\xref{sec:structures.layout.additional} for the gory details.} %
	419	and its type is usually @\|SodClass\|; @\|SodClass\|'s nickname is @\|cls\|.
	420
	421	A class object's slots contain or point to useful information, tables and
	422	functions for working with that class's instances. (The @\|SodClass\| class
	423	doesn't define any messages, so it doesn't have any methods other than for
	424	the @\|SodObject\| lifecycle messages @\|init\| and @\|teardown\|; see
	425	\xref{sec:concepts.lifecycle}. In Sod, a class slot containing a function
	426	pointer is not at all the same thing as a method.)
	427
	428	\subsubsection{Conversions}
	429	Suppose one has a value of type pointer-to-class-type for some class~$C$, and
	430	wants to convert it to a pointer-to-class-type for some other class~$B$.
	431	There are three main cases to distinguish.
	432	\begin{itemize}
	433	\item If $B$ is a superclass of~$C$, in the same chain, then the conversion
	434	is an \emph{in-chain upcast}. The conversion can be performed using the
	435	appropriate generated upcast macro (see below), or by simply casting the
	436	pointer, using C's usual cast operator (or the \Cplusplus\ @\|static_cast<>\|
	437	operator).
	438	\item If $B$ is a superclass of~$C$, in a different chain, then the
	439	conversion is a \emph{cross-chain upcast}. The conversion is more than a
	440	simple type change: the pointer value must be adjusted. If the direct
	441	class of the instance in question is not known, the conversion will require
	442	a lookup at runtime to find the appropriate offset by which to adjust the
	443	pointer. The conversion can be performed using the appropriate generated
	444	upcast macro (see below); the general case is handled by the macro
	445	\descref{SOD_XCHAIN}{mac}.
	446	\item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast};
	447	otherwise the conversion is a~\emph{cross-cast}. In either case, the
	448	conversion can fail: the object in question might not be an instance of~$B$
	449	after all. The macro \descref{SOD_CONVERT}{mac} and the function
	450	\descref{sod_convert}{fun} perform general conversions. They return a null
	451	pointer if the conversion fails. (These are therefore your analogue to the
	452	\Cplusplus\ @\|dynamic_cast<>\| operator.)
	453	\end{itemize}
	454	The Sod translator generates macros for performing both in-chain and
	455	cross-chain upcasts. For each class~$C$, and each proper superclass~$B$
	456	of~$C$, a macro is defined: given an argument of type pointer to class type
	457	of~$C$, it returns a pointer to the same instance, only with type pointer to
	458	class type of~$B$, adjusted as necessary in the case of a cross-chain
	459	conversion. The macro is named by concatenating
	460	\begin{itemize}
	461	\item the name of class~$C$, in upper case,
	462	\item the characters `@\|__CONV_\|', and
	463	\item the nickname of class~$B$, in upper case;
	464	\end{itemize}
	465	e.g., if $C$ is named @\|MyClass\|, and $B$'s name is @\|SuperClass\| with
	466	nickname @\|super\|, then the macro @\|MYCLASS__CONV_SUPER\| converts a
	467	@\|MyClass~*\| to a @\|SuperClass~*\|. See
	468	\xref{sec:structures.layout.additional} for the formal description.
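
The difference between the two kinds of upcast can be sketched with hand-written structures. The layout below is an assumption made for illustration (the real one is described in \xref{sec:structures.layout}). It covers only the case where the direct class is known statically, so the offset is a compile-time constant. The class @\|Other\|, its nickname @\|other\|, and the structure @\|MyClass_instance\| are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified class types: one member per class, named by nickname. */
typedef struct SuperClass { struct { int a; } super; } SuperClass;
typedef struct Other { struct { int b; } other; } Other;

/* MyClass is in SuperClass's chain, so its class type begins with the
 * SuperClass part. */
typedef struct MyClass {
  SuperClass base;
  struct { int c; } mine;
} MyClass;

/* Assumed layout of a whole instance: one structure per chain. */
struct MyClass_instance {
  MyClass main_chain;           /* chain containing MyClass and SuperClass */
  Other other_chain;            /* separate chain containing Other */
};

/* In-chain upcast: only the pointer's type changes. */
#define MYCLASS__CONV_SUPER(p) ((SuperClass *)(p))

/* Cross-chain upcast: the pointer value must be adjusted as well. */
#define MYCLASS__CONV_OTHER(p)                                          \
  ((Other *)((char *)(p)                                                \
             + offsetof(struct MyClass_instance, other_chain)           \
             - offsetof(struct MyClass_instance, main_chain)))

/* Exercise both conversions; returns 0 if all is well. */
int conversion_demo(void)
{
  struct MyClass_instance obj;
  MyClass *m = &obj.main_chain;
  SuperClass *s;
  Other *o;

  obj.main_chain.base.super.a = 1;
  obj.main_chain.mine.c = 3;
  obj.other_chain.other.b = 2;

  s = MYCLASS__CONV_SUPER(m);
  o = MYCLASS__CONV_OTHER(m);
  assert((void *)s == (void *)m);       /* same address, different type */
  assert(o == &obj.other_chain);        /* different address */
  assert((void *)o != (void *)m);
  assert(s->super.a == 1 && o->other.b == 2);
  return 0;
}
```

The in-chain conversion leaves the address unchanged, whereas the cross-chain conversion moves the pointer to the start of the other chain's structure within the same instance.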
	469
	470	%%%--------------------------------------------------------------------------
	471	\section{Keyword arguments} \label{sec:concepts.keywords}
	472
	473	In standard C, the actual arguments provided to a function are matched up
	474	with the formal arguments given in the function definition according to their
	475	ordering in a list. Unless the (rather cumbersome) machinery for dealing
	476	with variable-length argument tails (@\|<stdarg.h>\|) is used, exactly the
	477	correct number of arguments must be supplied, and in the correct order.
	478
	479	A \emph{keyword argument} is matched by its distinctive \emph{name}, rather
	480	than by its position in a list. Keyword arguments may be \emph{omitted},
	481	causing some default behaviour by the function. A function can detect
	482	whether a particular keyword argument was supplied: so the default behaviour
	483	need not be the same as that caused by any specific value of the argument.
	484
	485	Keyword arguments can be provided in three ways.
	486	\begin{enumerate}
	487	\item Directly, as a variable-length argument tail, consisting (for the most
	488	part) of alternating keyword names, as pointers to null-terminated strings,
	489	and argument values, and terminated by a null pointer. This is somewhat
	490	error-prone, and the support library defines some macros which help ensure
	491	that keyword argument lists are well formed.
	492	\item Indirectly, through a @\|va_list\| object capturing a variable-length
	493	argument tail passed to some other function. Such indirect argument tails
	494	have the same structure as the direct argument tails described above.
	495	Because @\|va_list\| objects are hard to copy, the keyword-argument support
	496	library consistently passes @\|va_list\| objects \emph{by reference}
	497	throughout its programming interface.
	498	\item Indirectly, through a vector of @\|struct kwval\| objects, each of which
	499	contains a keyword name, as a pointer to a null-terminated string, and the
	500	\emph{address} of a corresponding argument value. (This indirection is
	501	necessary so that the items in the vector can be of uniform size.)
	502	Argument vectors are rather inconvenient to use, but are the only practical
	503	way in which a caller can decide at runtime which arguments to include in a
	504	call, which is useful when writing wrapper functions.
	505	\end{enumerate}
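
The first two conventions can be sketched using only @\|<stdarg.h>\|, as below. This does not use the support library's own macros or types. The function names @\|box_area\| and @\|parse_box_args\|, and the keywords @\|width\| and @\|height\|, are made up for the example. The parser receives its @\|va_list\| by reference, and records which keywords were actually supplied so that defaults need not correspond to any particular value.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

struct box_args {
  int width, height;
  unsigned have_width : 1, have_height : 1;     /* was each one supplied? */
};

/* Parse an argument tail of alternating keyword names and values, ended
 * by a null pointer.  Returns 0 on success, -1 on an unknown keyword. */
static int parse_box_args(struct box_args *a, va_list *ap)
{
  const char *kw;

  a->have_width = a->have_height = 0;
  while ((kw = va_arg(*ap, char *)) != NULL) {
    if (strcmp(kw, "width") == 0) {
      a->width = va_arg(*ap, int); a->have_width = 1;
    } else if (strcmp(kw, "height") == 0) {
      a->height = va_arg(*ap, int); a->have_height = 1;
    } else
      return -1;        /* can't skip a value whose type is unknown */
  }
  return 0;
}

/* Return scale * width * height.  An omitted width defaults to 10; an
 * omitted height defaults to whatever the width turned out to be. */
int box_area(int scale, ...)
{
  struct box_args a;
  va_list ap;
  int rc;

  va_start(ap, scale);
  rc = parse_box_args(&a, &ap);
  va_end(ap);
  if (rc) return -1;
  if (!a.have_width) a.width = 10;
  if (!a.have_height) a.height = a.width;
  return scale * a.width * a.height;
}

/* Exercise the sketch; returns 0 if all is well. */
int keyword_demo(void)
{
  assert(box_area(1, "width", 3, "height", 4, (char *)0) == 12);
  assert(box_area(1, "height", 4, "width", 3, (char *)0) == 12);
  assert(box_area(2, "height", 5, (char *)0) == 100);
  assert(box_area(1, "width", 3, (char *)0) == 9);
  assert(box_area(1, "depth", 3, (char *)0) == -1);
  return 0;
}
```

The terminating null pointer is written with an explicit cast, since a bare @\|NULL\| in a variable-length argument tail is not guaranteed to have pointer type. Forgetting the terminator, or mismatching a value's type, is exactly the kind of error the text above warns about.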
	506
	507	Keyword arguments are provided as a general feature for C functions.
	508	However, Sod has special support for messages which accept keyword arguments
	509	(\xref{sec:concepts.methods.keywords}); and they play an essential rôle in
	510	the instance construction protocol (\xref{sec:concepts.lifecycle.birth}).
	511
	512	%%%--------------------------------------------------------------------------
	513	\section{Messages and methods} \label{sec:concepts.methods}
	514
	515	Objects can be sent \emph{messages}. A message has a \emph{name}, and
	516	carries a number of \emph{arguments}. When an object is sent a message, a
	517	function, determined by the receiving object's class, is invoked, passing it
	518	the receiver and the message arguments. This function is called the
	519	class's \emph{effective method} for the message. The effective method can do
	520	anything a C function can do, including reading or updating program state or
	521	object slots, sending more messages, calling other functions, issuing system
	522	calls, or performing I/O; if it finishes, it may return a value, which is
	523	returned in turn to the message sender.
	524
	525	The set of messages an object can receive, characterized by their names,
	526	argument types, and return type, is determined by the object's class. Each
	527	class can define new messages, which can be received by any instance of that
	528	class. The messages defined by a single class must have distinct names:
	529	there is no `function overloading'. As with slots
	530	(\xref{sec:concepts.classes.slots}), messages defined by distinct classes are
	531	always distinct, even if they have the same names: references to messages are
	532	always qualified by the defining class's name or nickname.
	533
	534	Messages may take any number of arguments, of any non-array value type.
	535	Since message sends are effectively function calls, arguments of array type
	536	are implicitly converted to values of the corresponding pointer type. While
	537	message definitions may ascribe an array type to an argument, the formal
	538	argument will have pointer type, as is usual for C functions. A message may
	539	accept a variable-length argument suffix, denoted @\|\dots\|.
	540
	541	A class definition may include \emph{direct methods} for messages defined by
	542	it or any of its superclasses.
	543
	544	Like messages, direct methods define argument lists and return types, but
	545	they may also have a \emph{body}, and a \emph{rôle}.
	546
	547	A direct method need not have the same argument list or return type as its
	548	message. The acceptable argument lists and return types for a method depend
	549	on the message, in particular its method combination
	550	(\xref{sec:concepts.methods.combination}), and the method's rôle.
	551
	552	A direct method body is a block of C code, and the Sod translator usually
	553	defines, for each direct method, a function with external linkage, whose body
	554	contains a copy of the direct method body. Within the body of a direct
	555	method defined for a class $C$, the variable @\|me\|, of type pointer to class
	556	type of $C$, refers to the receiving object.
	557
	558
	559	\subsection{Effective methods and method combinations}
	560	\label{sec:concepts.methods.combination}
	561
	562	For each message a direct instance of a class might receive, there is a set
	563	of \emph{applicable methods}, which are exactly the direct methods defined on
	564	the object's class and its superclasses. These direct methods are combined
	565	together to form the \emph{effective method} for that particular class and
	566	message. Direct methods can be combined into an effective method in
	567	different ways, according to the \emph{method combination} specified by the
	568	message. The method combination determines which direct method rôles are
	569	acceptable, and, for each rôle, the appropriate argument lists and return
	570	types.
	571
	572	One direct method, $M$, is said to be more (resp.\ less) \emph{specific} than
	573	another, $N$, with respect to a receiving class~$C$, if the class defining
	574	$M$ is a more (resp.\ less) specific superclass of~$C$ than the class
	575	defining $N$.
	576
	577	\subsubsection{The standard method combination}
	578	The default method combination is called the \emph{standard method
	579	combination}; other method combinations are useful occasionally for special
	580	effects. The standard method combination accepts four direct method rôles,
	581	called `primary' (the default), @\|before\|, @\|after\|, and @\|around\|.
	582
	583	All direct methods subject to the standard method combination must have
	584	argument lists which \emph{match} the message's argument list:
	585	\begin{itemize}
	586	\item the method's arguments must have the same types as the message, though
	587	the arguments may have different names; and
	588	\item if the message accepts a variable-length argument suffix then the
	589	direct method must instead have a final argument of type @\|va_list\|.
	590	\end{itemize}
	591	Primary and @\|around\| methods must have the same return type as the message;
	592	@\|before\| and @\|after\| methods must return @\|void\| regardless of the
	593	message's return type.
	594
	595	If there are no applicable primary methods then no effective method is
	596	constructed: the vtables contain null pointers in place of pointers to method
	597	entry functions.
	598
	599	\begin{figure}
	600	\hbox to\hsize{\hss\hbox{\begin{tikzpicture}
	601	[order/.append style={color=green!70!black},
	602	code/.append style={font=\sffamily},
	603	action/.append style={font=\itshape},
	604	method/.append style={rectangle, draw=black, thin, fill=blue!30,
	605	text height=\ht\strutbox, text depth=\dp\strutbox,
	606	minimum width=40mm}]
	607
	608	\def\delgstack#1#2#3{
	609	\node (#10) [method, #2] {#3};
	610	\node (#11) [method, above=6mm of #10] {#3};
	611	\draw [->] ($(#10.north)!.5!(#10.north west) + (0mm, 1mm)$) --
	612	++(0mm, 4mm)
	613	node [code, left=4pt, midway] {next_method};
	614	\draw [<-] ($(#10.north)!.5!(#10.north east) + (0mm, 1mm)$) --
	615	++(0mm, 4mm)
	616	node [action, right=4pt, midway] {return};
	617	\draw [->] ($(#11.north)!.5!(#11.north west) + (0mm, 1mm)$) --
	618	++(0mm, 4mm)
	619	node [code, left=4pt, midway] {next_method}
	620	node (ld) [above] {$\smash\vdots\mathstrut$};
	621	\draw [<-] ($(#11.north)!.5!(#11.north east) + (0mm, 1mm)$) --
	622	++(0mm, 4mm)
	623	node [action, right=4pt, midway] {return}
	624	node (rd) [above] {$\smash\vdots\mathstrut$};
	625	\draw [->] ($(ld.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
	626	node [code, left=4pt, midway] {next_method};
	627	\draw [<-] ($(rd.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
	628	node [action, right=4pt, midway] {return};
	629	\node (p) at ($(ld.north)!.5!(rd.north)$) {};
	630	\node (#1n) [method, above=5mm of p] {#3};
	631	\draw [->, order] ($(#10.south east) + (4mm, 1mm)$) --
	632	($(#1n.north east) + (4mm, -1mm)$)
	633	node [midway, right, align=left]
	634	{Most to \\ least \\ specific};}
	635
	636	\delgstack{a}{}{@\|around\| method}
	637	\draw [<-] ($(a0.south)!.5!(a0.south west) - (0mm, 1mm)$) --
	638	++(0mm, -4mm);
	639	\draw [->] ($(a0.south)!.5!(a0.south east) - (0mm, 1mm)$) --
	640	++(0mm, -4mm)
	641	node [action, right=4pt, midway] {return};
	642
	643	\draw [->] ($(an.north)!.6!(an.north west) + (0mm, 1mm)$) --
	644	++(-8mm, 8mm)
	645	node [code, midway, left=3mm] {next_method}
	646	node (b0) [method, above left = 1mm + 4mm and -6mm - 4mm] {};
	647	\node (b1) [method] at ($(b0) - (2mm, 2mm)$) {};
	648	\node (bn) [method] at ($(b1) - (2mm, 2mm)$) {@\|before\| method};
	649	\draw [->, order] ($(bn.west) - (6mm, 0mm)$) -- ++(12mm, 12mm)
	650	node [midway, above left, align=center] {Most to \\ least \\ specific};
	651	\draw [->] ($(b0.north east) + (-10mm, 1mm)$) -- ++(8mm, 8mm)
	652	node (p) {};
	653
	654	\delgstack{m}{above right=1mm and 0mm of an.west \|- p}{Primary method}
	655	\draw [->] ($(mn.north)!.5!(mn.north west) + (0mm, 1mm)$) -- ++(0mm, 4mm)
	656	node [code, left=4pt, midway] {next_method}
	657	node [above right = 0mm and -8mm]
	658	{$\vcenter{\hbox{\Huge\textcolor{red}{!}}}
	659	\vcenter{\hbox{\begin{tabular}[c]{l}
	660	\textsf{next_method} \\
	661	pointer is null
	662	\end{tabular}}}$};
	663
	664	\draw [->, color=blue, dotted]
	665	($(m0.south)!.2!(m0.south east) - (0mm, 1mm)$) --
	666	($(an.north)!.2!(an.north east) + (0mm, 1mm)$)
	667	node [midway, sloped, below] {Return value};
	668
	669	\draw [<-] ($(an.north)!.6!(an.north east) + (0mm, 1mm)$) --
	670	++(8mm, 8mm)
	671	node [action, midway, right=3mm] {return}
	672	node (f0) [method, above right = 1mm and -6mm] {};
	673	\node (f1) [method] at ($(f0) + (-2mm, 2mm)$) {};
	674	\node (fn) [method] at ($(f1) + (-2mm, 2mm)$) {@\|after\| method};
	675	\draw [<-, order] ($(f0.east) + (6mm, 0mm)$) -- ++(-12mm, 12mm)
	676	node [midway, above right, align=center]
	677	{Least to \\ most \\ specific};
	678	\draw [<-] ($(fn.north west) + (6mm, 1mm)$) -- ++(-8mm, 8mm);
	679
	680	\end{tikzpicture}}\hss}
	681
	682	\caption{The standard method combination}
	683	\label{fig:concepts.methods.stdmeth}
	684	\end{figure}
	685
	686	The effective method for a message with standard method combination works as
	687	follows (see also~\xref{fig:concepts.methods.stdmeth}).
	688	\begin{enumerate}
	689
	690	\item If any applicable methods have the @\|around\| rôle, then the most
	691	specific such method, with respect to the class of the receiving object, is
	692	invoked.
	693
	694	Within the body of an @\|around\| method, the variable @\|next_method\| is
	695	defined, having pointer-to-function type. The method may call this
	696	function, as described below, any number of times.
	697
	698	If there are any remaining @\|around\| methods, then @\|next_method\| invokes the
	699	next most specific such method, returning whichever value that method
	700	returns; otherwise the behaviour of @\|next_method\| is to invoke the
	701	@\|before\| methods (if any), followed by the most specific primary method,
	702	followed by the @\|after\| methods (if any), and to return whichever value
	703	was returned by the most specific primary method, as described in the
	704	following items. That is, the behaviour of the least specific @\|around\|
	705	method's @\|next_method\| function is exactly the behaviour that the
	706	effective method would have if there were no @\|around\| methods. Note that
	707	if the least-specific @\|around\| method calls its @\|next_method\| more than
	708	once then the whole sequence of @\|before\|, primary, and @\|after\| methods
	709	occurs multiple times.
	710
	711	The value returned by the most specific @\|around\| method is the value
	712	returned by the effective method.
	713
	714	\item If any applicable methods have the @\|before\| rôle, then they are all
	715	invoked, starting with the most specific.
	716
	717	\item The most specific applicable primary method is invoked.
	718
	719	Within the body of a primary method, the variable @\|next_method\| is
	720	defined, having pointer-to-function type. If there are no remaining less
	721	specific primary methods, then @\|next_method\| is a null pointer.
	722	Otherwise, the method may call the @\|next_method\| function any number of
	723	times.
	724
	725	The behaviour of the @\|next_method\| function, if it is not null, is to
	726	invoke the next most specific applicable primary method, and to return
	727	whichever value that method returns.
	728
	729	If there are no applicable @\|around\| methods, then the value returned by
	730	the most specific primary method is the value returned by the effective
	731	method; otherwise the value returned by the most specific primary method is
	732	returned to the least specific @\|around\| method, which called it via its
	733	own @\|next_method\| function.
	734
	735	\item If any applicable methods have the @\|after\| rôle, then they are all
	736	invoked, starting with the \emph{least} specific. (Hence, the most
	737	specific @\|after\| method is invoked with the most `afterness'.)
	738
	739	\end{enumerate}
	740
	741	A typical use for @\|around\| methods is to allow a base class to set up the
	742	dynamic environment appropriately for the primary methods of its subclasses,
	743	e.g., by claiming a lock, and releasing it afterwards.
	744
	745	The @\|next_method\| function provided to methods with the primary and
	746	@\|around\| rôles accepts the same arguments, and returns the same type, as the
	747	message, except that one or two additional arguments are inserted at the
	748	front of the argument list. The first additional argument is always the
	749	receiving object, @\|me\|. If the message accepts a variable argument suffix,
	750	then the second additional argument is a @\|va_list\|; otherwise there is no
	751	second additional argument. In the former case, a variable
	752	@\|sod__master_ap\| of type @\|va_list\| is defined, containing a separate copy
	753	of the argument pointer (so the method body can process the variable argument
	754	suffix itself, and still pass a fresh copy on to the next method).
	755
	756	A method with the primary or @\|around\| rôle may use the convenience macro
	757	@\|CALL_NEXT_METHOD\|, which takes no arguments itself, and simply calls
	758	@\|next_method\| with appropriate arguments: the receiver @\|me\| pointer, the
	759	argument pointer @\|sod__master_ap\| (if applicable), and the method's
	760	arguments. If the method body has overwritten its formal arguments, then
	761	@\|CALL_NEXT_METHOD\| will pass along the updated values, rather than the
	762	original ones.
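
		As a minimal sketch of the locking idiom mentioned above, a base class might
		define an @\|around\| method like the one below. The class, its nickname
		@\|lk\|, the message @\|poke\|, and the functions @\|claim_lock\| and
		@\|release_lock\| are all hypothetical, chosen only for illustration.
		\begin{prog}
		[nick = lk] \\
		class LockedObject: SodObject \{ \\ \ind
		void poke(); \\
		[role = around] \\
		void lk.poke() \\
		\{ \\ \ind
		claim_lock(me); \\
		CALL_NEXT_METHOD; \\
		release_lock(me); \-\\
		\} \-\\
		\}
		\end{prog}
		Here the @\|around\| method extends the message behaviour: whichever
		@\|before\|, primary, and @\|after\| methods the subclasses define all run
		while the lock is held.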
	763
	764	A primary or @\|around\| method which invokes its @\|next_method\| function is
	765	said to \emph{extend} the message behaviour; a method which does not invoke
	766	its @\|next_method\| is said to \emph{override} the behaviour. Note that a
	767	method may make a decision to override or extend at runtime.
	768
	769	\subsubsection{Aggregating method combinations}
	770	A number of other method combinations are provided. They are called
	771	`aggregating' method combinations because, instead of invoking just the most
	772	specific primary method, as the standard method combination does, they invoke
	773	the applicable primary methods in turn and aggregate the return values from
	774	each.
	775
	776	The aggregating method combinations accept the same four rôles as the
	777	standard method combination, and @\|around\|, @\|before\|, and @\|after\| methods
	778	work in the same way.
	779
	780	The aggregating method combinations provided are as follows.
	781	\begin{description} \let\makelabel\code
	782	\item[progn] The message must return @\|void\|. The applicable primary methods
	783	are simply invoked in turn, most specific first.
	784	\item[sum] The message must return a numeric type.\footnote{%
	785	The Sod translator does not check this, since it doesn't have enough
	786	insight into @\|typedef\| names.} %
	787	The applicable primary methods are invoked in turn, and their return values
	788	added up. The final result is the sum of the individual values.
	789	\item[product] The message must return a numeric type. The applicable
	790	primary methods are invoked in turn, and their return values multiplied
	791	together. The final result is the product of the individual values.
	792	\item[min] The message must return a scalar type. The applicable primary
	793	methods are invoked in turn. The final result is the smallest of the
	794	individual values.
	795	\item[max] The message must return a scalar type. The applicable primary
	796	methods are invoked in turn. The final result is the largest of the
	797	individual values.
	798	\item[and] The message must return a scalar type. The applicable primary
	799	methods are invoked in turn. If any method returns zero then the final
	800	result is zero and no further methods are invoked. If all of the
	801	applicable primary methods return nonzero, then the final result is the
	802	result of the last primary method.
	803	\item[or] The message must return a scalar type. The applicable primary
	804	methods are invoked in turn. If any method returns nonzero then the final
	805	result is that nonzero value and no further methods are invoked. If all of
	806	the applicable primary methods return zero, then the final result is zero.
	807	\end{description}
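
		As a rough illustration, suppose that a message returning @\|int\| and taking
		one argument @\|x\| uses the @\|or\| combination, and that three primary
		methods are applicable. The effective method then behaves much like the
		following C fragment. The function names are hypothetical stand-ins for the
		direct method functions, the most-specific-first ordering is an assumption
		here, and the translator's real output will differ.
		\begin{prog}
		/* Stop at the first nonzero value. */ \\
		int v; \\
		v = method_a(me, x); if (v) return (v); \\
		v = method_b(me, x); if (v) return (v); \\
		return (method_c(me, x));
		\end{prog}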
	808
	809	There is also a @\|custom\| aggregating method combination, which is described
	810	in \xref{sec:fixme.custom-aggregating-method-combination}.
	811
	812
	813	\subsection{Method entries} \label{sec:concepts.methods.entry}
	814
	815	Each instance is associated with its direct class \fixme{direct instances}
	816
	817	The effective methods for each class are determined at translation time, by
	818	the Sod translator. For each effective method, one or more \emph{method
	819	entry functions} are constructed. A method entry function has three
	820	responsibilities.
	821	\begin{itemize}
	822	\item It converts the receiver pointer to the correct type. Method entry
	823	functions can perform these conversions extremely efficiently: there are
	824	separate method entries for each chain of each class which can receive a
	825	message, so method entry functions are in the privileged situation of
	826	knowing the \emph{exact} class of the receiving object.
	827	\item If the message accepts a variable-length argument tail, then two method
	828	entry functions are created for each chain of each class: one receives a
	829	variable-length argument tail, as intended, and captures it in a @\|va_list\|
	830	object; the other accepts an argument of type @\|va_list\| in place of the
	831	variable-length tail and arranges for it to be passed along to the direct
	832	methods.
	833	\item It invokes the effective method with the appropriate arguments. There
	834	might or might not be an actual function corresponding to the effective
	835	method itself: the translator may instead open-code the effective method's
	836	behaviour into each method entry function; and the machinery for handling
	837	`delegation chains', such as is used for @\|around\| methods and primary
	838	methods in the standard method combination, is necessarily scattered among
	839	a number of small functions.
	840	\end{itemize}
	841
	842
	843	\subsection{Messages with keyword arguments}
	844	\label{sec:concepts.methods.keywords}
	845
	846	A message or a direct method may declare that it accepts keyword arguments.
	847	A message which accepts keyword arguments is called a \emph{keyword message};
	848	a direct method which accepts keyword arguments is called a \emph{keyword
	849	method}.
	850
	851	While method combinations may set their own rules, usually keyword methods
	852	can only be defined on keyword messages, and all methods defined on a keyword
	853	message must be keyword methods. The direct methods defined on a keyword
	854	message may differ in the keywords they accept, both from each other, and
	855	from the message. If two superclasses of some common class both define
	856	keyword methods on the same message, and the methods both accept a keyword
	857	argument with the same name, then these two keyword arguments must also have
	858	the same type. Different applicable methods may declare keyword arguments
	859	with the same name but different defaults; see below.
	860
	861	The keyword arguments acceptable in a message sent to an object are the
	862	keywords listed in the message definition, together with all of the keywords
	863	accepted by any applicable method. There is no easy way to determine at
	864	runtime whether a particular keyword is acceptable in a message to a given
	865	instance.
	866
	867	At runtime, a direct method which accepts one or more keyword arguments
	868	receives an additional argument named @\|suppliedp\|. This argument is a small
	869	structure. For each keyword argument named $k$ accepted by the direct
	870	method, @\|suppliedp\| contains a one-bit-wide bitfield member of type
	871	@\|unsigned\|, also named $k$. If a keyword argument named $k$ was passed in
	872	the message, then @\|suppliedp.$k$\| is one, and $k$ contains the argument
	873	value; otherwise @\|suppliedp.$k$\| is zero, and $k$ contains the default value
	874	from the direct method definition if there was one, or an unspecified value
	875	otherwise.
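
		For instance, a direct method accepting two keyword arguments, hypothetically
		named @\|width\| and @\|colour\|, would see a @\|suppliedp\| argument shaped
		roughly as follows. The structure is shown untagged because its actual type
		name is not specified here.
		\begin{prog}
		struct \{ \\ \ind
		unsigned width: 1; \\
		unsigned colour: 1; \-\\
		\} suppliedp;
		\end{prog}
		The method body can then distinguish an explicitly passed value from a
		default by testing @\|suppliedp.width\| before trusting @\|width\|.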
	876
	877	%%%--------------------------------------------------------------------------
	878	\section{The object lifecycle} \label{sec:concepts.lifecycle}
	879
	880	\subsection{Creation} \label{sec:concepts.lifecycle.birth}
	881
	882	Construction of a new instance of a class involves three steps.
	883	\begin{enumerate}
	884	\item \emph{Allocation} arranges for there to be storage space for the
	885	instance's slots and associated metadata.
	886	\item \emph{Imprinting} fills in the instance's metadata, associating the
	887	instance with its class.
	888	\item \emph{Initialization} stores appropriate initial values in the
	889	instance's slots, and maybe links it into any external data structures as
	890	necessary.
	891	\end{enumerate}
	892	The \descref{SOD_DECL}[macro]{mac} handles constructing instances with
	893	automatic storage duration (`on the stack'). Similarly, the
	894	\descref{SOD_MAKE}[macro]{mac} and the \descref{sod_make}{fun} and
	895	\descref{sod_makev}{fun} functions construct instances allocated from the
	896	standard @\|malloc\| heap. Programmers can add support for other allocation
	897	strategies by using the \descref{SOD_INIT}[macro]{mac} and the
	898	\descref{sod_init}{fun} and \descref{sod_initv}{fun} functions, which package
	899	up imprinting and initialization.
	900
	901	\subsubsection{Allocation}
	902	Instances of most classes (specifically including those classes defined by
	903	Sod itself) can be held in any storage of sufficient size. The in-memory
	904	layout of an instance of some class~$C$ is described by the type @\|struct
	905	$C$__ilayout\|, and if the relevant class is known at compile time then the
	906	best way to discover the layout size is with the @\|sizeof\| operator. Failing
	907	that, the size required to hold an instance of $C$ is available in a slot in
	908	$C$'s class object, as @\|$C$__class@->cls.initsz\|.
	909
	910	It is not in general sufficient to declare, or otherwise allocate, an object
	911	of the class type $C$. The class type only describes a single chain of the
	912	object's layout. It is nearly always an error to use the class type as if it
	913	is a \emph{complete type}, e.g., to declare objects or arrays of the class
	914	type, or to enquire about its size or alignment requirements.
	915
	916	Instance layouts may be declared as objects with automatic storage duration
	917	(colloquially, `allocated on the stack') or allocated dynamically, e.g.,
	918	using @\|malloc\|. They may be included as members of structures or unions, or
	919	elements of arrays. Sod's runtime system doesn't retain addresses of
	920	instances, so, for example, Sod doesn't make using fancy allocators which
	921	sometimes move objects around in memory any more difficult than it needs to
	922	be.
	923
	924	There isn't any way to discover the alignment required for a particular
	925	class's instances at runtime; it's best to be conservative and assume that
	926	the platform's strictest alignment requirement applies.
	927
	928	The following simple function correctly allocates and returns space for an
	929	instance of a class given a pointer to its class object @<cls>.
	930	\begin{prog}
	931	void *allocate_instance(const SodClass *cls) \\ \ind
	932	\{ return malloc(cls@->cls.initsz); \}
	933	\end{prog}
	934
	935	\subsubsection{Imprinting}
	936	Once storage has been allocated, it must be \emph{imprinted} before it can be
	937	used as an instance of a class, e.g., before any messages can be sent to it.
	938
	939	Imprinting an instance stores some metadata about its direct class in the
	940	instance structure, so that the rest of the program (and Sod's runtime
	941	library) can tell what sort of object it is, and how to use it.\footnote{%
	942	Specifically, imprinting an instance's storage involves storing the
	943	appropriate vtable pointers in the right places in it.} %
	944	A class object's @\|imprint\| slot points to a function which will correctly
	945	imprint storage for one of that class's instances.
	946
	947	Once an instance's storage has been imprinted, it is technically possible to
	948	send messages to the instance; however the instance's slots are still
	949	uninitialized at this point, so the applicable methods are unlikely to do
	950	much of any use unless they've been written specifically for the purpose.
	951
	952	The following simple function imprints storage at address @<p> as an instance
	953	of a class, given a pointer to its class object @<cls>.
	954	\begin{prog}
	955	void imprint_instance(const SodClass *cls, void *p) \\ \ind
	956	\{ cls@->cls.imprint(p); \}
	957	\end{prog}
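
		Combining the two functions above, the following sketch allocates storage
		from the @\|malloc\| heap and imprints it, leaving initialization to the
		caller. The function name is made up for this example, and the null check is
		an addition not required by Sod itself.
		\begin{prog}
		void *allocate_imprinted(const SodClass *cls) \\
		\{ \\ \ind
		void *p = malloc(cls@->cls.initsz); \\
		if (p) cls@->cls.imprint(p); \\
		return (p); \-\\
		\}
		\end{prog}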
	958
	959	\subsubsection{Initialization}
	960	The final step for constructing a new instance is to \emph{initialize} it, to
	961	establish the necessary invariants for the instance itself and the
	962	environment in which it operates.
	963
	964	Details of initialization are necessarily class-specific, but typically it
	965	involves setting the instance's slots to appropriate values, and possibly
	966	linking it into some larger data structure to keep track of it. It is
	967	possible for initialization methods to attempt to allocate resources, but
	968	this must be done carefully: there is currently no way to report an error
	969	from object initialization, so the object must be marked as incompletely
	970	initialized, and left in a state where it will be safe to tear down later.
	971
	972	Initialization is performed by sending the imprinted instance an @\|init\|
	973	message, defined by the @\|SodObject\| class. This message uses a nonstandard
	974	method combination which works like the standard combination, except that the
	975	\emph{default behaviour}, if there is no overriding method, is to initialize
	976	the instance's slots, as described below, and to invoke each superclass's
	977	initialization fragments. This default behaviour may be invoked multiple
	978	times if some method calls on its @\|next_method\| more than once, unless some
	979	other method takes steps to prevent this.
	980
	981	Slots are initialized in a well-defined order.
	982	\begin{itemize}
	983	\item Slots defined by a more specific superclass are initialized after slots
	984	defined by a less specific superclass.
	985	\item Slots defined by the same class are initialized in the order in which
	986	their definitions appear.
	987	\end{itemize}
	988
	989	A class can define \emph{initialization fragments}: pieces of literal code to
	990	be executed to set up a new instance. Each superclass's initialization
	991	fragments are executed with @\|me\| bound to an instance pointer of the
	992	appropriate superclass type, immediately after that superclass's slots (if
	993	any) have been initialized; therefore, fragments defined by a more specific
	994	superclass are executed after fragments defined by a less specific
	995	superclass. A class may define more than one initialization fragment: the
	996	fragments are executed in the order in which they appear in the class
	997	definition. It is possible for an initialization fragment to use @\|return\|
	998	or @\|goto\| for special control-flow effects, but this is not likely to be a
	999	good idea.
	1000
	1001	The @\|init\| message accepts keyword arguments
	1002	(\xref{sec:concepts.methods.keywords}). The set of acceptable keywords is
	1003	determined by the applicable methods as usual, but also by the
	1004	\emph{initargs} defined by the receiving instance's class and its
	1005	superclasses, which are made available to slot initializers and
	1006	initialization fragments.
	1007
	1008	There are two kinds of initarg definitions. \emph{User initargs} are defined
	1009	by an explicit @\|initarg\| item appearing in a class definition: the item
	1010	defines a name, type, and (optionally) a default value for the initarg.
	1011	\emph{Slot initargs} are defined by attaching an @\|initarg\| property to a
	1012	slot or slot initializer item: the property's value determines the initarg's
	1013	name, while the type is taken from the underlying slot type; slot initargs do
	1014	not have default values. Both kinds define a \emph{direct initarg} for the
	1015	containing class.
	1016
	1017	Initargs are inherited. The \emph{applicable} direct initargs for an @\|init\|
	1018	effective method are those defined by the receiving object's class, and all
	1019	of its superclasses. Applicable direct initargs with the same name are
	1020	merged to form \emph{effective initargs}. An error is reported if two
	1021	applicable direct initargs have the same name but different types. The
	1022	default value of an effective initarg is taken from the most specific
	1023	applicable direct initarg which specifies a default value; if no applicable
	1024	direct initarg specifies a default value then the effective initarg has no
	1025	default.
	1026
	1027	All initarg values are made available at runtime to user code --
	1028	initialization fragments and slot initializer expressions -- through local
	1029	variables and a @\|suppliedp\| structure, as in a direct method
	1030	(\xref{sec:concepts.methods.keywords}). Furthermore, slot initarg
	1031	definitions influence the initialization of slots.
	1032
	1033	The process for deciding how to initialize a particular slot works as
	1034	follows.
	1035	\begin{enumerate}
	1036	\item If there are any slot initargs defined on the slot, or any of its slot
	1037	initializers, \emph{and} the sender supplied a value for one or more of the
	1038	corresponding effective initargs, then the value of the most specific slot
	1039	initarg is stored in the slot.
	1040	\item Otherwise, if there are any slot initializers defined which include an
	1041	initializer expression, then the initializer expression from the most
	1042	specific such slot initializer is evaluated and its value stored in the
	1043	slot.
	1044	\item Otherwise, the slot is left uninitialized.
	1045	\end{enumerate}
	1046	Note that the default values (if any) of effective initargs do \emph{not}
	1047	affect this procedure.
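
		As a sketch only, consider a hypothetical slot @\|x\| in a class with nickname
		@\|foo\|, carrying a slot initarg also named @\|x\| and a slot initializer
		expression @\|0\|. The effect of the procedure above is then roughly that of
		the following fragment, which is not the translator's literal output.
		\begin{prog}
		if (suppliedp.x) me@->foo.x = x; \\
		else me@->foo.x = 0;
		\end{prog}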
	1048
	1049
	1050	\subsection{Destruction}
	1051	\label{sec:concepts.lifecycle.death}
	1052
	1053	Destruction of an instance, when it is no longer required, consists of two
	1054	steps.
	1055	\begin{enumerate}
	1056	\item \emph{Teardown} releases any resources held by the instance and
	1057	disentangles it from any external data structures.
	1058	\item \emph{Deallocation} releases the memory used to store the instance so
	1059	that it can be reused.
	1060	\end{enumerate}
	1061	Teardown alone, for objects which require special deallocation, or for which
	1062	deallocation occurs automatically (e.g., instances with automatic storage
	1063	duration, or instances whose storage will be garbage-collected), is performed
	1064	using the \descref{sod_teardown}[function]{fun}. Destruction of instances
	1065	allocated from the standard @\|malloc\| heap is done using the
	1066	\descref{sod_destroy}[function]{fun}.
	1067
	1068	\subsubsection{Teardown}
	1069	Details of teardown are necessarily class-specific, but typically it
	1070	involves releasing resources held by the instance, and disentangling it from
	1071	any data structures it might be linked into.
	1072
	1073	Teardown is performed by sending the instance the @\|teardown\| message,
	1074	defined by the @\|SodObject\| class. The message returns an integer, used as a
	1075	boolean flag. If the message returns zero, then the instance's storage
	1076	should be deallocated. If the message returns nonzero, then it is safe for
	1077	the caller to forget about the instance, but the caller should not deallocate its storage.
	1078	This is \emph{not} an error return: if some teardown method fails then the
	1079	program may be in an inconsistent state and should not continue.
	1080
	1081	This simple protocol can be used, for example, to implement a reference
	1082	counting system, as follows.
	1083	\begin{prog}
	1084	[nick = ref] \\
	1085	class ReferenceCountedObject: SodObject \{ \\ \ind
	1086	unsigned nref = 1; \\-
	1087	void inc() \{ me@->ref.nref++; \} \\-
	1088	[role = around] \\
	1089	int obj.teardown() \\
	1090	\{ \\ \ind
	1091	if (--\,--me@->ref.nref) return (1); \\
	1092	else return (CALL_NEXT_METHOD); \-\\
	1093	\} \-\\
	1094	\}
	1095	\end{prog}
	1096
	1097	The @\|teardown\| message uses a nonstandard method combination which works
	1098	like the standard combination, except that the \emph{default behaviour}, if
	1099	there is no overriding method, is to execute each superclass's teardown
	1100	fragments, and to return zero. This default behaviour may be invoked
	1101	multiple times if some method calls on its @\|next_method\| more than once,
	1102	unless some other method takes steps to prevent this.
	1103
	1104	A class can define \emph{teardown fragments}: pieces of literal code to be
	1105	executed to shut down an instance. Each superclass's teardown fragments are
	1106	executed with @\|me\| bound to an instance pointer of the appropriate
	1107	superclass type; fragments defined by a more specific superclass are executed
	1108	before fragments defined by a less specific superclass. A class may define
	1109	more than one teardown fragment: the fragments are executed in the order in
	1110	which they appear in the class definition. It is possible for a
	1111	teardown fragment to use @\|return\| or @\|goto\| for special control-flow
	1112	effects, but this is not likely to be a good idea. Similarly, it's probably
	1113	a better idea to use an @\|around\| method to influence the return value than
	1114	to write an explicit @\|return\| statement in a teardown fragment.
	1115
	1116	\subsubsection{Deallocation}
	1117	The details of instance deallocation are obviously specific to the allocation
	1118	strategy used by the instance, and this is often orthogonal to the object's
	1119	class.
	1120
	1121	The code which makes the decision to destroy an object may often not be aware
	1122	of the object's direct class. Low-level details of deallocation often
	1123	require the proper base address of the instance's storage, which can be
	1124	determined using the \descref{SOD_INSTBASE}[macro]{mac}.
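
		For example, code holding a pointer @<obj> to some chain of a
		@\|malloc\|-allocated instance might destroy it roughly as follows. This
		sketch assumes that @\|sod_teardown\| returns the result of the @\|teardown\|
		message and that @\|SOD_INSTBASE\| takes the instance pointer as its only
		argument; consult the reference descriptions of both before relying on it.
		\begin{prog}
		if (sod_teardown(obj) == 0) free(SOD_INSTBASE(obj));
		\end{prog}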
	1125
	1126	%%%--------------------------------------------------------------------------
	1127	\section{Metaclasses} \label{sec:concepts.metaclasses}
	1128
	1129	In Sod, every object is an instance of some class, and -- unlike, say,
	1130	\Cplusplus\ -- classes are proper objects. It follows that, in Sod, every
	1131	class~$C$ is itself an instance of some class~$M$, which is called $C$'s
	1132	\emph{metaclass}. Metaclass instances are usually constructed statically, at
	1133	compile time, and marked read-only.
	1134
	1135	As an added complication, Sod classes, and other metaobjects such as
	1136	messages, methods, slots and so on, also have classes \emph{at translation
	1137	time}. These translation-time metaclasses are not Sod classes; they are CLOS
	1138	classes, implemented in Common Lisp.
	1139
	1140
	1141	\subsection{Runtime metaclasses}
	1142	\label{sec:concepts.metaclasses.runtime}
	1143
	1144	Like other classes, metaclasses can declare messages, and define slots and
	1145	methods. Slots defined by the metaclass are called \emph{class slots}, as
	1146	opposed to \emph{instance slots}. Similarly, messages and methods defined by
	1147	the metaclass are termed \emph{class messages} and \emph{class methods}
	1148	respectively, though these are used much less frequently.
	1149
	1150	\subsubsection{The braid}
	1151	Every object is an instance of some class. There are only finitely many
	1152	classes.
	1153
	1154	\begin{figure}
	1155	\centering
	1156	\begin{tikzpicture}
	1157	\node[lit] (obj) {SodObject};
	1158	\node[lit] (cls) [right=10mm of obj] {SodClass};
	1159	\draw [->, dashed] (obj) to[bend right] (cls);
	1160	\draw [->] (cls) to[bend right] (obj);
	1161	\draw [->, dashed] (cls) to[loop right] (cls);
	1162	\end{tikzpicture}
	1163	\qquad
	1164	\fbox{\ \begin{tikzpicture}
	1165	\node (subclass) {subclass of};
	1166	\node (instance) [below=\jot of subclass] {instance of};
	1167	\draw [->] ($(subclass.west) - (10mm, 0)$) -- ++(8mm, 0);
	1168	\draw [->, dashed] ($(instance.west) - (10mm, 0)$) -- ++(8mm, 0);
	1169	\end{tikzpicture}}
	1170	\caption{The Sod braid} \label{fig:concepts.metaclasses.braid}
	1171	\end{figure}
	1172
	1173	Consider the directed graph whose nodes are classes, and where there is an
	1174	arc from $C$ to $D$ if and only if $C$ is an instance of $D$. There are only
	1175	finitely many nodes. Every node has an arc leaving it, because every object
	1176	-- and hence every class -- is an instance of some class. Therefore this
	1177	graph must contain at least one cycle.
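
		The counting argument can be spelled out as follows; this is only a sketch,
		and the symbols $\mu$, $N$ and $C_0$ are introduced here for illustration.
		Write $\mu(C)$ for the metaclass of a class~$C$, and let $N$ be the number of
		classes. Starting from any class $C_0$, consider the sequence
		\[ C_0, \quad C_1 = \mu(C_0), \quad \dots, \quad C_N = \mu(C_{N-1}). \]
		This sequence has $N + 1$ terms drawn from only $N$ classes, so $C_i = C_j$
		for some $0 \le i < j \le N$. The arcs from $C_i$ through $C_{i+1}$, \dots,
		back to $C_j = C_i$ then form a cycle, which may be as short as a single
		self-loop.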
	1178
	1179	In Sod, this situation is resolved in the simplest manner possible:
	1180	@\|SodClass\| is the only predefined metaclass, and it is an instance of
	1181	itself. The only other predefined class is @\|SodObject\|, which is also an
	1182	instance of @\|SodClass\|. There is exactly one root class, namely
	1183	@\|SodObject\|; consequently, @\|SodClass\| is a direct subclass of @\|SodObject\|.
	1184
	1185	\Xref{fig:concepts.metaclasses.braid} shows a diagram of this situation.
	1186
	1187	\subsubsection{Class slots and initializers}
	1188	Instance initializers were described in \xref{sec:concepts.classes.slots}. A
	1189	class can also define \emph{class initializers}, which provide values for
	1190	slots defined by its metaclass. The initial value for a class slot is
	1191	determined as follows.
	1192	\begin{itemize}
	1193	\item Slots of nonstandard slot classes may be initialized by custom Lisp code. For
	1194	example, all of the slots defined by @\|SodClass\| are of this kind. User
	1195	initializers are not permitted for such slots.
	1196	\item If the class or any of its superclasses defines a class initializer for
	1197	the slot, then the class initializer defined by the most specific such
	1198	superclass is used.
	1199	\item Otherwise, if the metaclass or one of its superclasses defines an
	1200	instance initializer, then the instance initializer defined by the most
	1201	specific such class is used.
	1202	\item Otherwise there is no initializer, and an error will be reported.
	1203	\end{itemize}
	1204	Initializers for class slots must be constant expressions (for scalar slots)
	1205	or aggregate initializers containing constant expressions.
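
		As an illustration of these rules, consider the following hypothetical
		arrangement; the names $M$, $s$, $A$, $B$ and $D$ are assumptions made for
		the example and are not part of Sod. Let $M$ be a metaclass which defines an
		ordinary class slot $s$ together with an instance initializer $s = 0$. Let
		$A$ be a class with metaclass $M$ which defines a class initializer $s = 1$,
		and let $B$ be a direct subclass of $A$, also with metaclass $M$, which
		defines no class initializer of its own.
		\begin{itemize}
		\item For $A$, the class initializer defined by $A$ itself applies, so $A$'s
		copy of $s$ is initialized to $1$.
		\item For $B$, the most specific class initializer is the one inherited from
		$A$, so $B$'s copy of $s$ is also initialized to $1$.
		\item For a class $D$ with metaclass $M$ which neither defines nor inherits a
		class initializer for $s$, the instance initializer defined by $M$
		applies, so $D$'s copy of $s$ is initialized to $0$.
		\end{itemize}
		Had $M$ defined no instance initializer for $s$, the last case would instead
		be reported as an error.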
	1206
	1207	\subsubsection{Metaclass selection and consistency}
	1208	Sod enforces a \emph{metaclass consistency rule}: if $C$ has metaclass $M$,
	1209	then any subclass of $C$ must have a metaclass which is a subclass of $M$.
	1210
	1211	The definition of a new class can name the new class's metaclass explicitly,
	1212	by defining a @\|metaclass\| property; the Sod translator will verify that the
	1213	choice of metaclass is acceptable.
	1214
	1215	If no @\|metaclass\| property is given, then the translator will select a
	1216	default metaclass as follows. Let $C_1$, $C_2$, \dots, $C_n$ be the direct
	1217	superclasses of the new class, and let $M_1$, $M_2$, \dots, $M_n$ be their
	1218	respective metaclasses (not necessarily distinct). If there exists exactly
	1219	one minimal metaclass $M_i$, i.e., there exists an $i$, with $1 \le i \le n$,
	1220	such that $M_i$ is a subclass of every $M_j$, for $1 \le j \le n$, then $M_i$
	1221	is selected as the new class's metaclass. Otherwise the situation is
	1222	ambiguous and an error will be reported. Usually, the ambiguity can be
	1223	resolved satisfactorily by defining a new class $M^*$ as a direct subclass of
	1224	the minimal $M_j$.
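
		A small worked example may help here; the class names are hypothetical.
		Suppose that $M_1$ and $M_2$ are metaclasses, each a direct subclass of
		@\|SodClass\|, and that neither is a subclass of the other.
		\begin{itemize}
		\item If the new class has direct superclasses $C_1$, with metaclass
		@\|SodClass\|, and $C_2$, with metaclass $M_1$, then $M_1$ is a subclass
		of @\|SodClass\| and, trivially, of itself, so $M_1$ is selected.
		\item If instead the direct superclasses are $C_1$, with metaclass $M_1$, and
		$C_2$, with metaclass $M_2$, then neither metaclass is a subclass of the
		other, and an error is reported. Defining $M^*$ as a direct subclass of
		both $M_1$ and $M_2$, and naming it in the new class's @\|metaclass\|
		property, satisfies the metaclass consistency rule with respect to both
		$C_1$ and $C_2$.
		\end{itemize}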
	1225
	1226
	1227	\subsection{Translation-time metaobjects}
	1228	\label{sec:concepts.metaclasses.compile-time}
	1229
	1230
	1231
	1232	\fixme{unwritten}
	1233
	1234	%%%--------------------------------------------------------------------------
	1235	\section{Compatibility considerations} \label{sec:concepts.compatibility}
	1236
	1237	Sod doesn't make source-level compatibility especially difficult. As long as
	1238	classes, slots, and messages don't change names or disappear, and slots and
	1239	messages retain their approximate types, everything will be fine.
	1240
	1241	Binary compatibility is much more difficult. Unfortunately, Sod classes have
	1242	rather fragile binary interfaces.\footnote{%
	1243	Research suggestion: investigate alternative instance and vtable layouts
	1244	which improve binary compatibility, probably at the expense of instance
	1245	compactness, and efficiency of slot access and message sending. There may
	1246	be interesting trade-offs to be made.} %
	1247
	1248	If instances are allocated \fixme{incomplete}
	1249
	1250	%%%----- That's all, folks --------------------------------------------------
	1251
	1252	%%% Local variables:
	1253	%%% mode: LaTeX
	1254	%%% TeX-master: "sod.tex"
	1255	%%% TeX-PDF-mode: t
	1256	%%% End: