Commit | Line | Data |
---|---|---|
1f7d590d MW |
1 | %%% -*-latex-*- |
2 | %%% | |
3 | %%% Conceptual background | |
4 | %%% | |
5 | %%% (c) 2015 Straylight/Edgeware | |
6 | %%% | |
7 | ||
8 | %%%----- Licensing notice --------------------------------------------------- | |
9 | %%% | |
e0808c47 | 10 | %%% This file is part of the Sensible Object Design, an object system for C. |
1f7d590d MW |
11 | %%% |
12 | %%% SOD is free software; you can redistribute it and/or modify | |
13 | %%% it under the terms of the GNU General Public License as published by | |
14 | %%% the Free Software Foundation; either version 2 of the License, or | |
15 | %%% (at your option) any later version. | |
16 | %%% | |
17 | %%% SOD is distributed in the hope that it will be useful, | |
18 | %%% but WITHOUT ANY WARRANTY; without even the implied warranty of | |
19 | %%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
20 | %%% GNU General Public License for more details. | |
21 | %%% | |
22 | %%% You should have received a copy of the GNU General Public License | |
23 | %%% along with SOD; if not, write to the Free Software Foundation, | |
24 | %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. | |
25 | ||
3cc520db | 26 | \chapter{Concepts} \label{ch:concepts} |
1f7d590d | 27 | |
3cc520db MW |
28 | %%%-------------------------------------------------------------------------- |
29 | \section{Operational model} \label{sec:concepts.model} | |
1f7d590d | 30 | |
3cc520db MW |
31 | The Sod translator runs as a preprocessor, similar in nature to the |
32 | traditional Unix \man{lex}{1} and \man{yacc}{1} tools. The translator reads | |
33 | a \emph{module} file containing class definitions and other information, and | |
34 | writes C~source and header files. The source files contain function | |
35 | definitions and static tables which are fed directly to a C~compiler; the | |
36 | header files contain declarations for functions and data structures, and are | |
37 | included by source files -- whether hand-written or generated by Sod -- which | |
38 | make use of the classes defined in the module. | |
1f7d590d | 39 | |
3cc520db MW |
40 | Sod is not like \Cplusplus: it makes no attempt to `enhance' the C language |
41 | itself. Sod module files describe classes, messages, methods, slots, and | |
42 | other kinds of object-system things, and some of these descriptions need to | |
43 | contain C code fragments, but this code is entirely uninterpreted by the Sod | |
44 | translator.\footnote{% | |
45 | As long as a code fragment broadly follows C's lexical rules, and properly | |
46 | matches parentheses, brackets, and braces, the Sod translator will copy it | |
47 | into its output unchanged. It might, in fact, be some other kind of C-like | |
48 | language, such as Objective~C or \Cplusplus. Or maybe even | |
49 | Objective~\Cplusplus, because if having an object system is good, then | |
50 | having three must be really awesome.} % | |
1f7d590d | 51 | |
3cc520db MW |
52 | The Sod translator is not a closed system. It is written in Common Lisp, and |
53 | can load extension modules which add new input syntax, output formats, or | |
54 | altered behaviour. The interface for writing such extensions is described in | |
55 | \xref{p:lisp}. Extensions can change almost all details of the Sod object | |
56 | system, so the material in this manual must be read with this in mind: this | |
57 | manual describes the base system as provided in the distribution. | |
58 | ||
59 | %%%-------------------------------------------------------------------------- | |
60 | \section{Modules} \label{sec:concepts.modules} | |
61 | ||
62 | A \emph{module} is the top-level syntactic unit of input to the Sod | |
63 | translator. As described above, given an input module, the translator | |
64 | generates C source and header files. | |
65 | ||
66 | A module can \emph{import} other modules. This makes the type names and | |
67 | classes defined in those other modules available to class definitions in the | |
68 | importing module. Sod's module system is intentionally very simple. There | |
69 | are no private declarations or attempts to hide things. | |
70 | ||
71 | As well as importing existing modules, a module can include a number of | |
72 | different kinds of \emph{items}: | |
73 | \begin{itemize} | |
74 | \item \emph{class definitions} describe new classes, possibly in terms of | |
75 | existing classes; | |
76 | \item \emph{type name declarations} introduce new type names to Sod's | |
77 | parser;\footnote{% | |
78 | This is unfortunately necessary because C syntax, upon which Sod's input | |
79 | language is based for obvious reasons, needs to treat type names | |
80 | differently from other kinds of identifiers.} % | |
81 | and | |
82 | \item \emph{code fragments} contain literal C code to be dropped into an | |
83 | appropriate place in an output file. | |
84 | \end{itemize} | |
85 | Each kind of item, and, indeed, a module as a whole, can have a collection of | |
86 | \emph{properties} associated with it. A property has a \emph{name} and a | |
87 | \emph{value}. Properties are an open-ended way of attaching additional | |
88 | information to module items, so extensions can make use of them without | |
89 | having to implement additional syntax. | |
90 | ||
91 | %%%-------------------------------------------------------------------------- | |
92 | \section{Classes, instances, and slots} \label{sec:concepts.classes} | |
93 | ||
94 | For the most part, Sod takes a fairly traditional view of what it means to be | |
95 | an object system. | |
96 | ||
97 | An \emph{object} maintains \emph{state} and exhibits \emph{behaviour}. An | |
98 | object's state is maintained in named \emph{slots}, each of which can store a | |
99 | C value of an appropriate (scalar or aggregate) type. An object's behaviour | |
100 | is stimulated by sending it \emph{messages}. A message has a name, and may | |
101 | carry a number of arguments, which are C values; sending a message may result | |
102 | in the state of the receiving object (or other objects) being changed, and a C | |
103 | value being returned to the sender. | |
104 | ||
105 | Every object is a (direct) instance of some \emph{class}. The class | |
106 | determines which slots its instances have, which messages its instances can | |
107 | be sent, and which methods are invoked when those messages are received. The | |
108 | Sod translator's main job is to read class definitions and convert them into | |
109 | appropriate C declarations, tables, and functions. An object cannot | |
110 | (usually) change its direct class, and the direct class of an object is not | |
111 | affected by, for example, the static type of a pointer to it. | |
112 | ||
0a2d4b68 | 113 | |
3cc520db MW |
114 | \subsection{Superclasses and inheritance} |
115 | \label{sec:concepts.classes.inherit} | |
116 | ||
117 | \subsubsection{Class relationships} | |
118 | Each class has zero or more \emph{direct superclasses}. | |
119 | ||
120 | A class with no direct superclasses is called a \emph{root class}. The Sod | |
121 | runtime library includes a root class named @|SodObject|; making new root | |
122 | classes is somewhat tricky, and won't be discussed further here. | |
123 | ||
124 | Classes can have more than one direct superclass, i.e., Sod supports | |
125 | \emph{multiple inheritance}. A Sod class definition for a class~$C$ lists | |
126 | the direct superclasses of $C$ in a particular order. This order is called | |
127 | the \emph{local precedence order} of $C$, and the list which consists of $C$ | |
128 | followed by $C$'s direct superclasses in local precedence order is called | |
129 | $C$'s \emph{local precedence list}. | |
130 | ||
131 | The multiple inheritance in Sod works similarly to multiple inheritance in | |
132 | Lisp-like languages, such as Common Lisp, EuLisp, Dylan, and Python, which is | |
133 | very different from how multiple inheritance works in \Cplusplus.\footnote{% | |
134 | The latter can be summarized as `badly'. By default in \Cplusplus, an | |
135 | instance receives an additional copy of a superclass's state for each path | |
136 | through the class graph from the instance's direct class to that | |
137 | superclass, though this behaviour can be overridden by declaring | |
138 | superclasses to be @|virtual|. Also, \Cplusplus\ offers only trivial | |
139 | method combination (\xref{sec:concepts.methods}), leaving programmers to | |
140 | deal with delegation manually and (usually) statically.} % | |
141 | ||
142 | If $C$ is a class, then the \emph{superclasses} of $C$ are | |
143 | \begin{itemize} | |
144 | \item $C$ itself, and | |
145 | \item the superclasses of each of $C$'s direct superclasses. | |
146 | \end{itemize} | |
147 | The \emph{proper superclasses} of a class $C$ are the superclasses of $C$ | |
148 | except for $C$ itself. If a class $B$ is a (direct, proper) superclass of | |
149 | $C$, then $C$ is a \emph{(direct, proper) subclass} of $B$. If $C$ is a root | |
150 | class then the only superclass of $C$ is $C$ itself, and $C$ has no proper | |
151 | superclasses. | |
152 | ||
153 | If an object is a direct instance of class~$C$ then the object is also an | |
154 | (indirect) instance of every superclass of $C$. | |
155 | ||
156 | If $C$ has a proper superclass $B$, then $B$ is not allowed to have $C$ as a | |
157 | direct superclass. In different terms, if we construct a graph, whose | |
158 | vertices are classes, and draw an edge from each class to each of its direct | |
159 | superclasses, then this graph must be acyclic. In yet other terms, the `is a | |
160 | superclass of' relation is a partial order on classes. | |
161 | ||
162 | \subsubsection{The class precedence list} | |
163 | This partial order is not quite sufficient for our purposes. For each class | |
164 | $C$, we shall need to extend it into a total order on $C$'s superclasses. | |
165 | This calculation is called \emph{superclass linearization}, and the result is | |
166 | a \emph{class precedence list}, which lists each of $C$'s superclasses | |
167 | exactly once. If a superclass $B$ precedes (resp.\ follows) some other | |
168 | superclass $A$ in $C$'s class precedence list, then we say that $B$ is a more | |
169 | (resp.\ less) \emph{specific} superclass of $C$ than $A$ is. | |
170 | ||
171 | The superclass linearization algorithm isn't fixed, and extensions to the | |
172 | translator can introduce new linearizations for special effects, but the | |
173 | following properties are expected to hold. | |
174 | \begin{itemize} | |
175 | \item The first class in $C$'s class precedence list is $C$ itself; i.e., | |
176 | $C$ is always its own most specific superclass. | |
177 | \item If $A$ and $B$ are both superclasses of $C$, and $A$ is a proper | |
178 | superclass of $B$ then $A$ appears after $B$ in $C$'s class precedence | |
179 | list, i.e., $B$ is a more specific superclass of $C$ than $A$ is. | |
180 | \end{itemize} | |
181 | The default linearization algorithm used in Sod is the \emph{C3} algorithm, | |
182 | which has a number of good properties described in~\cite{FIXME:C3}. | |
183 | It works as follows. | |
184 | \begin{itemize} | |
185 | \item A \emph{merge} of some number of input lists is a single list | |
186 | containing each item that is in any of the input lists exactly once, and no | |
187 | other items; if an item $x$ appears before an item $y$ in any input list, | |
188 | then $x$ also appears before $y$ in the merge. If a collection of lists | |
189 | have no merge then they are said to be \emph{inconsistent}. | |
190 | \item The class precedence list of a class $C$ is a merge of the local | |
191 | precedence list of $C$ together with the class precedence lists of each of | |
192 | $C$'s direct superclasses. | |
193 | \item If there are no such merges, then the definition of $C$ is invalid. | |
194 | \item Suppose that there are multiple candidate merges. Consider the | |
195 | earliest position in these candidate merges at which they disagree. The | |
196 | \emph{candidate classes} at this position are the classes appearing at this | |
197 | position in the candidate merges. Each candidate class must be a | |
198 | superclass of exactly one of $C$'s direct superclasses, since otherwise the | |
199 | candidates would be ordered by their common subclass's class precedence | |
200 | list. The class precedence list contains, at this position, that candidate | |
201 | class whose subclass appears earliest in $C$'s local precedence order. | |
202 | \end{itemize} | |
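
The merge rules above can be sketched in C.  This is only an illustration,
not the translator's implementation (which is written in Common Lisp): it
uses the common formulation of C3 in which the next class is the head of the
earliest input list that appears in no list's tail, and it is assumed here
that this agrees with the tie-breaking rule described above.  The names
@|class_desc|, @|merge_list|, @|c3_linearize| and @|MAX_CLASSES| are
hypothetical, classes are identified by table index, and the class graph is
assumed to be acyclic.

```c
#include <assert.h>

#define MAX_CLASSES 16

/* Hypothetical class table entry: classes are identified by table index. */
struct class_desc {
  int n_supers;               /* number of direct superclasses */
  int supers[MAX_CLASSES];    /* direct superclasses, local precedence order */
};

/* An input list to the merge; items before `at' are already consumed. */
struct merge_list { int n, at; int item[MAX_CLASSES]; };

/* Does class c appear in the unconsumed part of l, after its head? */
static int in_tail(const struct merge_list *l, int c)
{
  for (int i = l->at + 1; i < l->n; i++)
    if (l->item[i] == c) return 1;
  return 0;
}

/* Store the class precedence list of class c in out[] (which needs room
 * for MAX_CLASSES entries) and return its length, or -1 if the input lists
 * are inconsistent.  Assumes at most MAX_CLASSES classes in total.
 */
int c3_linearize(const struct class_desc *tab, int c, int *out)
{
  struct merge_list in[MAX_CLASSES + 1];
  const struct class_desc *d = &tab[c];
  int n_in = 0, n_out = 0;

  assert(d->n_supers <= MAX_CLASSES);

  /* Inputs: each direct superclass's precedence list, followed by the
   * direct superclasses themselves in local precedence order.
   */
  for (int i = 0; i < d->n_supers; i++) {
    struct merge_list *l = &in[n_in++];
    l->at = 0;
    l->n = c3_linearize(tab, d->supers[i], l->item);
    if (l->n < 0) return -1;
  }
  in[n_in].at = 0; in[n_in].n = d->n_supers;
  for (int i = 0; i < d->n_supers; i++) in[n_in].item[i] = d->supers[i];
  n_in++;

  out[n_out++] = c;           /* a class is its own most specific superclass */
  for (;;) {
    int pick = -1, remaining = 0;

    /* Take the head of the earliest list which is in no list's tail. */
    for (int i = 0; i < n_in && pick < 0; i++) {
      int ok = 1;
      if (in[i].at >= in[i].n) continue;
      remaining = 1;
      for (int j = 0; j < n_in; j++)
        if (in_tail(&in[j], in[i].item[in[i].at])) ok = 0;
      if (ok) pick = in[i].item[in[i].at];
    }
    if (pick < 0) return remaining ? -1 : n_out;

    /* Emit the chosen class and consume it from every list it heads. */
    out[n_out++] = pick;
    for (int i = 0; i < n_in; i++)
      if (in[i].at < in[i].n && in[i].item[in[i].at] == pick) in[i].at++;
  }
}
```

For a diamond in which $C$ lists $A$ and then $B$ as direct superclasses, and
both inherit from a common root, the sketch yields $C$, $A$, $B$, root.  If
two direct superclasses disagree about the relative order of $A$ and $B$,
the lists are inconsistent and the function returns $-1$.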
203 | ||
204 | \subsubsection{Class links and chains} | |
205 | The definition for a class $C$ may distinguish one of its proper superclasses | |
206 | as being the \emph{link superclass} for class $C$. Not every class need have | |
207 | a link superclass, and the link superclass of a class $C$, if it exists, need | |
208 | not be a direct superclass of $C$. | |
209 | ||
210 | Superclass links must obey the following rule: if $C$ is a class, then there | |
211 | must be no three superclasses $X$, $Y$ and~$Z$ of $C$ such that $Z$ is | |
212 | the link superclass of both $X$ and $Y$. As a consequence of this rule, the | |
213 | superclasses of $C$ can be partitioned into linear \emph{chains}, such that | |
214 | superclasses $A$ and $B$ are in the same chain if and only if one can trace a | |
215 | path from $A$ to $B$ by following superclass links, or \emph{vice versa}. | |
216 | ||
217 | Since a class links only to one of its proper superclasses, the classes in a | |
218 | chain are naturally ordered from most- to least-specific. The least specific | |
219 | class in a chain is called the \emph{chain head}; the most specific class is | |
220 | the \emph{chain tail}. Chains are often named after their chain head | |
221 | classes. | |
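
As a rough illustration of the link rule and of chain heads, the following C
sketch represents each class by a table index and records its link
superclass in an array.  The representation and the names @|chain_head|,
@|same_chain| and @|links_valid| are assumptions made for this example only;
they are not part of Sod.

```c
#include <assert.h>

/* link[c] is the (hypothetical) table index of class c's link superclass,
 * or -1 if c has no link superclass.
 */

/* Follow superclass links from c until a class with no link is reached. */
int chain_head(const int *link, int c)
{
  assert(c >= 0);
  while (link[c] >= 0) c = link[c];
  return c;
}

/* Given that the link rule holds, two superclasses of a class are in the
 * same chain exactly when they share a chain head.
 */
int same_chain(const int *link, int a, int b)
  { return chain_head(link, a) == chain_head(link, b); }

/* Check the link rule over the n superclasses of some class listed in
 * supers[]: no two of them may have the same link superclass.
 */
int links_valid(const int *link, const int *supers, int n)
{
  for (int i = 0; i < n; i++)
    for (int j = i + 1; j < n; j++)
      if (link[supers[i]] >= 0 && link[supers[i]] == link[supers[j]])
        return 0;
  return 1;
}
```

With links $3 \to 1 \to 0$ and an unlinked class~2, classes 0, 1 and~3 form
one chain headed by class~0, and class~2 forms a chain by itself.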
222 | ||
223 | \subsection{Names} | |
224 | \label{sec:concepts.classes.names} | |
225 | ||
226 | Classes have a number of other attributes: | |
227 | \begin{itemize} | |
228 | \item A \emph{name}, which is a C identifier. Class names must be globally | |
229 | unique. The class name is used in the names of a number of associated | |
230 | definitions, to be described later. | |
231 | \item A \emph{nickname}, which is also a C identifier. Unlike names, | |
232 | nicknames are not required to be globally unique. If $C$ is any class, | |
233 | then all the superclasses of $C$ must have distinct nicknames. | |
234 | \end{itemize} | |
235 | ||
0a2d4b68 | 236 | |
3cc520db MW |
237 | \subsection{Slots} \label{sec:concepts.classes.slots} |
238 | ||
239 | Each class defines a number of \emph{slots}. Much like a structure member, a | |
240 | slot has a \emph{name}, which is a C identifier, and a \emph{type}. Unlike | |
241 | many other object systems, different superclasses of a class $C$ can define | |
242 | slots with the same name without ambiguity, since slot references are always | |
243 | qualified by the defining class's nickname. | |
244 | ||
245 | \subsubsection{Slot initializers} | |
246 | As well as defining slot names and types, a class can also associate an | |
247 | \emph{initial value} with each slot defined by itself or one of its | |
248 | superclasses. A class $C$ provides an \emph{initialization function} (see | |
249 | \xref{sec:concepts.classes.c}, and \xref{sec:structures.root.sodclass}) which | |
250 | sets the slots of a \emph{direct} instance of the class to the correct | |
251 | initial values. If several of $C$'s superclasses define initializers for the | |
252 | same slot then the initializer from the most specific such class is used. If | |
253 | none of $C$'s superclasses define an initializer for some slot then that slot | |
254 | will not be initialized. | |
255 | ||
256 | The initializer for a slot with scalar type may be any C expression. The | |
257 | initializer for a slot with aggregate type must contain only constant | |
258 | expressions if the generated code is expected to be processed by an | |
259 | implementation of C89. Initializers will be evaluated once each time an | |
260 | instance is initialized. | |
261 | ||
0a2d4b68 | 262 | |
3cc520db MW |
263 | \subsection{C language integration} \label{sec:concepts.classes.c} |
264 | ||
265 | For each class~$C$, the Sod translator defines a C type, the \emph{class | |
266 | type}, with the same name. This is the usual type used when considering an | |
267 | object as an instance of class~$C$. No entire object will normally have a | |
268 | class type,\footnote{% | |
269 | In general, a class type only captures the structure of one of the | |
270 | superclass chains of an instance. A full instance layout contains multiple | |
271 | chains. See \xref{sec:structures.layout} for the full details.} % | |
272 | so access to instances is almost always via pointers. | |
273 | ||
274 | \subsubsection{Access to slots} | |
275 | The class type for a class~$C$ is actually a structure. It contains one | |
276 | member for each class in $C$'s superclass chain, named with that class's | |
277 | nickname. Each of these members is also a structure, containing the | |
278 | corresponding class's slots, one member per slot. There's nothing special | |
279 | about these slot members: C code can access them in the usual way. | |
280 | ||
281 | For example, if @|MyClass| has the nickname @|mine|, and defines a slot @|x| | |
282 | of type @|int|, then the simple function | |
283 | \begin{prog} | |
c18d6aba | 284 | int get_x(MyClass *m) \{ return (m@->mine.x); \} |
3cc520db MW |
285 | \end{prog} |
286 | will extract the value of @|x| from an instance of @|MyClass|. | |
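
To make the shape of a class type more concrete, here is a hand-written
stand-in for what the translator might generate when @|MyClass| (nickname
@|mine|) and a superclass in the same chain (nickname @|super|) both define a
slot named @|x|.  It is a simplified sketch: the structure tags
@|SuperClass_slots| and @|MyClass_slots| are invented for the example, the
order of the members is an assumption, and the real generated structures
contain additional members (such as vtable pointers) which are omitted here.

```c
#include <assert.h>

/* Hypothetical per-class slot structures, one member per slot. */
struct SuperClass_slots { int x; };     /* slots defined by the superclass */
struct MyClass_slots { int x; };        /* slots defined by MyClass */

/* Simplified class type: one member per class in the chain, each named
 * with the defining class's nickname.
 */
typedef struct MyClass {
  struct SuperClass_slots super;
  struct MyClass_slots mine;
} MyClass;

/* Both slots are called x, but the nickname says which one is meant. */
int get_x(MyClass *m) { assert(m); return (m->mine.x); }
int get_super_x(MyClass *m) { assert(m); return (m->super.x); }
```

Because every slot reference goes through the nickname member, the two slots
named @|x| never collide.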
287 | ||
288 | All of this means that there's no such thing as `private' or `protected' | |
289 | slots. If you want to hide implementation details, the best approach is to | |
290 | stash them in a dynamically allocated private structure, and leave a pointer | |
291 | to it in a slot. (This will also help preserve binary compatibility, because | |
292 | the private structure can grow more members as needed. See | |
293 | \xref{sec:fixme.compatibility} for more details.) | |
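
The private-structure idea can be sketched as follows.  Everything here is
hypothetical: @|Counter|, its nickname @|ctr|, and the helper functions are
invented for illustration, and the class type is a simplified hand-written
stand-in and not translator output.  In a real program the definition of
@|struct counter_private| would live in a single source file, so that clients
see only an incomplete type.

```c
#include <assert.h>
#include <stdlib.h>

/* Private state; in practice defined only in the implementing source file,
 * so it can grow new members without changing the instance layout.
 */
struct counter_private { long count; };

/* Simplified stand-in for a class type with one slot, a pointer to the
 * private structure, under the hypothetical nickname ctr.
 */
typedef struct Counter {
  struct { struct counter_private *priv; } ctr;
} Counter;

/* Allocate and attach the private state; return 0 on success, -1 if the
 * allocation fails.
 */
int counter_setup(Counter *c)
{
  assert(c);
  c->ctr.priv = malloc(sizeof(*c->ctr.priv));
  if (!c->ctr.priv) return (-1);
  c->ctr.priv->count = 0;
  return (0);
}

/* Increment the hidden counter and return its new value. */
long counter_bump(Counter *c) { assert(c); return (++c->ctr.priv->count); }

/* Release the private state and clear the slot. */
void counter_teardown(Counter *c)
{
  assert(c);
  free(c->ctr.priv);
  c->ctr.priv = 0;
}
```

Only the pointer occupies space in the instance, so adding members to the
private structure leaves the size and layout of the instance unchanged.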
294 | ||
295 | \subsubsection{Class objects} | |
296 | In Sod's object system, classes are objects too. Therefore classes are | |
297 | themselves instances; the class of a class is called a \emph{metaclass}. The | |
298 | consequences of this are explored in \xref{sec:concepts.metaclasses}. The | |
299 | \emph{class object} has the same name as the class, suffixed with | |
300 | `@|__class|'\footnote{% | |
301 | This is not quite true. @|$C$__class| is actually a macro. See | |
302 | \xref{sec:structures.layout.additional} for the gory details.} % | |
303 | and its type is usually @|SodClass|; @|SodClass|'s nickname is @|cls|. | |
304 | ||
305 | A class object's slots contain or point to useful information, tables and | |
306 | functions for working with that class's instances. (The @|SodClass| class | |
307 | doesn't define any messages, so it doesn't have any methods. In Sod, a class | |
308 | slot containing a function pointer is not at all the same thing as a method.) | |
309 | ||
310 | \subsubsection{Instance allocation, imprinting, and initialization} | |
311 | It is in general not sufficient to declare (or @|malloc|) an object of the | |
312 | appropriate class type and fill it in, since the class type only describes an | |
313 | instance's layout from the point of view of a single superclass chain. The | |
314 | correct type to allocate, to store a direct instance of some class, is a | |
315 | structure whose tag is the class name suffixed with `@|__ilayout|'; e.g., the | |
316 | correct layout structure for a direct instance of @|MyClass| would be | |
317 | @|struct MyClass__ilayout|. | |
318 | ||
319 | Instance layouts may be declared as objects with automatic storage duration | |
320 | (colloquially, `allocated on the stack') or allocated dynamically, e.g., | |
321 | using @|malloc|. Sod's runtime system doesn't retain addresses of instances, | |
322 | so, for example, Sod doesn't make using a fancy allocator which sometimes | |
323 | moves objects around in memory any more difficult than it needs to be. | |
324 | ||
325 | Once storage for an instance has been allocated, it must be \emph{imprinted} | |
326 | before it can be used. Imprinting an instance stores some metadata about its | |
327 | direct class in the instance structure, so that the rest of the program (and | |
328 | Sod's runtime library) can tell what sort of object it is, and how to use | |
329 | it.\footnote{% | |
330 | Specifically, imprinting an instance's storage involves storing the | |
331 | appropriate vtable pointers in the right places in it.} % | |
332 | A class object's @|imprint| slot points to a function which will correctly | |
333 | imprint storage for one of that class's instances. | |
334 | ||
335 | Once an instance's storage has been imprinted, it is possible to send the | |
336 | instance messages; however, the instance's slots are uninitialized at this | |
337 | point, so most methods are unlikely to do much of any use. So, usually, you | |
338 | don't just want to imprint instance storage, but to \emph{initialize} an | |
339 | instance. Initialization includes imprinting, but also sets the new | |
340 | instance's slots to their initial values, as defined by the class. If | |
341 | neither the class nor any of its superclasses defines an initializer for a | |
342 | slot then it will not be initialized. | |
343 | ||
344 | There is currently no facility for providing parameters to the instance | |
345 | initialization process (e.g., for use by slot initializer expressions). | |
346 | Instance initialization is a complicated matter and for now I want to | |
347 | experiment with various approaches before committing to one. My current | |
348 | interim approach is to specify slot initializers where appropriate and send | |
349 | class-specific messages for more complicated parametrized initialization. | |
350 | ||
351 | Automatic-duration instances can be conveniently constructed and initialized | |
58f9b400 MW |
352 | using the \descref{SOD_DECL}[macro]{mac}. No special support is currently |
353 | provided for dynamically allocated instances. A simple function using | |
354 | @|malloc| might work as follows. | |
3cc520db MW |
355 | \begin{prog} |
356 | void *new_instance(const SodClass *c) \\ | |
357 | \{ \\ \ind | |
c18d6aba | 358 | void *p = malloc(c@->cls.initsz); \\ |
3cc520db | 359 | if (!p) return (0); \\ |
c18d6aba | 360 | c@->cls.init(p); \\ |
3cc520db MW |
361 | return (p); \- \\ |
362 | \} | |
363 | \end{prog} | |
364 | ||
365 | \subsubsection{Instance finalization and deallocation} | |
366 | There is currently no provided assistance for finalization or deallocation. | |
367 | It is the programmer's responsibility to decide and implement an appropriate | |
368 | protocol. Note that to free an instance allocated from the heap, one must | |
58f9b400 MW |
369 | correctly find its base address: the \descref{SOD_INSTBASE}[macro]{mac} will |
370 | do this for you. | |
3cc520db MW |
371 | |
372 | The following simple mixin class is suggested. | |
373 | \begin{prog} | |
374 | [nick = disposable] \\* | |
375 | class DisposableObject : SodObject \{ \\*[\jot] \ind | |
376 | void release() \{ ; \} \\* | |
377 | \quad /\=\+* Release resources held by the receiver. */ \-\- \\*[\jot] | |
378 | \} \\[\bigskipamount] | |
379 | code c : user \{ \\* \ind | |
380 | /\=\+* Free object p's instance storage. If p is a DisposableObject \\* | |
381 | {}* then release its resources beforehand. \\* | |
382 | {}*/ \- \\* | |
383 | void free_instance(void *p) \\* | |
384 | \{ \\* \ind | |
385 | DisposableObject *d = SOD_CONVERT(DisposableObject, p); \\* | |
386 | if (d) DisposableObject_release(d); \\* | |
387 | free(p); \- \\* | |
388 | \} \- \\* | |
389 | \} | |
390 | \end{prog} | |
391 | ||
392 | \subsubsection{Conversions} | |
393 | Suppose one has a value of type pointer to class type of some class~$C$, and | |
394 | wants to convert it to a pointer to class type of some other class~$B$. | |
395 | There are three main cases to distinguish. | |
396 | \begin{itemize} | |
397 | \item If $B$ is a superclass of~$C$, in the same chain, then the conversion | |
398 | is an \emph{in-chain upcast}. The conversion can be performed using the | |
399 | appropriate generated upcast macro (see below), or by simply casting the | |
400 | pointer, using C's usual cast operator (or the \Cplusplus\ @|static_cast<>| | |
401 | operator). | |
402 | \item If $B$ is a superclass of~$C$, in a different chain, then the | |
403 | conversion is a \emph{cross-chain upcast}. The conversion is more than a | |
404 | simple type change: the pointer value must be adjusted. If the direct | |
405 | class of the instance in question is not known, the conversion will require | |
406 | a lookup at runtime to find the appropriate offset by which to adjust the | |
407 | pointer. The conversion can be performed using the appropriate generated | |
408 | upcast macro (see below); the general case is handled by the macro | |
58f9b400 | 409 | \descref{SOD_XCHAIN}{mac}. |
3cc520db MW |
410 | \item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast}; |
411 | otherwise the conversion is a~\emph{cross-cast}. In either case, the | |
412 | conversion can fail: the object in question might not be an instance of~$B$ | |
58f9b400 MW |
413 | at all. The macro \descref{SOD_CONVERT}{mac} and the function |
414 | \descref{sod_convert}{fun} perform general conversions. They return a null | |
415 | pointer if the conversion fails. | |
3cc520db MW |
416 | \end{itemize} |
417 | The Sod translator generates macros for performing both in-chain and | |
418 | cross-chain upcasts. For each class~$C$, and each proper superclass~$B$ | |
419 | of~$C$, a macro is defined: given an argument of type pointer to class type | |
420 | of~$C$, it returns a pointer to the same instance, only with type pointer to | |
421 | class type of~$B$, adjusted as necessary in the case of a cross-chain | |
422 | conversion. The macro is named by concatenating | |
423 | \begin{itemize} | |
424 | \item the name of class~$C$, in upper case, | |
425 | \item the characters `@|__CONV_|', and | |
426 | \item the nickname of class~$B$, in upper case; | |
427 | \end{itemize} | |
428 | e.g., if $C$ is named @|MyClass|, and $B$'s name is @|SuperClass| with | |
429 | nickname @|super|, then the macro @|MYCLASS__CONV_SUPER| converts a | |
430 | @|MyClass~*| to a @|SuperClass~*|. See | |
431 | \xref{sec:structures.layout.additional} for the formal description. | |
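
The pointer adjustment involved in a cross-chain upcast can be illustrated
with a hand-written layout.  This is a sketch under stated assumptions: the
structure @|my_layout|, its member names, and the macro
@|SKETCH_MYCLASS__CONV_SUPER| are invented here, the class types are
simplified, and the direct class of the instance is assumed to be known
statically, so the offset is a compile-time constant and no runtime lookup is
needed.

```c
#include <assert.h>
#include <stddef.h>

struct SuperClass_slots { int y; };
struct MyClass_slots { int x; };

/* Simplified class types, each describing a single chain. */
typedef struct SuperClass { struct SuperClass_slots super; } SuperClass;
typedef struct MyClass { struct MyClass_slots mine; } MyClass;

/* Hypothetical whole-instance layout holding two chains side by side. */
struct my_layout {
  MyClass my_chain;
  SuperClass super_chain;
};

/* Cross-chain upcast for a known direct class: move the pointer by the
 * distance between the two chains within the instance layout.
 */
#define SKETCH_MYCLASS__CONV_SUPER(p)                                   \
  ((SuperClass *)((char *)(p) +                                         \
     ((ptrdiff_t)offsetof(struct my_layout, super_chain) -              \
      (ptrdiff_t)offsetof(struct my_layout, my_chain))))

/* Read a slot of the superclass given a pointer to the MyClass chain. */
int get_y(MyClass *m)
{
  assert(m);
  return (SKETCH_MYCLASS__CONV_SUPER(m)->super.y);
}
```

An in-chain upcast needs no such arithmetic, which is why a plain cast
suffices in that case.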
432 | ||
433 | %%%-------------------------------------------------------------------------- | |
434 | \section{Messages and methods} \label{sec:concepts.methods} | |
435 | ||
436 | Objects can be sent \emph{messages}. A message has a \emph{name}, and | |
437 | carries a number of \emph{arguments}. When an object is sent a message, a | |
438 | function, determined by the receiving object's class, is invoked, passing it | |
439 | the receiver and the message arguments. This function is called the | |
440 | class's \emph{effective method} for the message. The effective method can do | |
441 | anything a C function can do, including reading or updating program state or | |
442 | object slots, sending more messages, calling other functions, issuing system | |
443 | calls, or performing I/O; if it finishes, it may return a value, which is | |
444 | returned in turn to the message sender. | |
445 | ||
446 | The set of messages an object can receive, characterized by their names, | |
447 | argument types, and return type, is determined by the object's class. Each | |
448 | class can define new messages, which can be received by any instance of that | |
449 | class. The messages defined by a single class must have distinct names: | |
450 | there is no `function overloading'. As with slots | |
451 | (\xref{sec:concepts.classes.slots}), messages defined by distinct classes are | |
452 | always distinct, even if they have the same names: references to messages are | |
453 | always qualified by the defining class's name or nickname. | |
454 | ||
455 | Messages may take any number of arguments, of any non-array value type. | |
456 | Since message sends are effectively function calls, arguments of array type | |
457 | are implicitly converted to values of the corresponding pointer type. While | |
458 | message definitions may ascribe an array type to an argument, the formal | |
459 | argument will have pointer type, as is usual for C functions. A message may | |
460 | accept a variable-length argument suffix, denoted @|\dots|. | |
461 | ||
462 | A class definition may include \emph{direct methods} for messages defined by | |
463 | it or any of its superclasses. | |
464 | ||
465 | Like messages, direct methods define argument lists and return types, but | |
466 | they may also have a \emph{body}, and a \emph{role}. | |
467 | ||
468 | A direct method need not have the same argument list or return type as its | |
469 | message. The acceptable argument lists and return types for a method depend | |
470 | on the message, in particular its method combination | |
471 | (\xref{sec:concepts.methods.combination}), and the method's role. | |
472 | ||
473 | A direct method body is a block of C code, and the Sod translator usually | |
474 | defines, for each direct method, a function with external linkage, whose body | |
475 | contains a copy of the direct method body. Within the body of a direct | |
476 | method defined for a class $C$, the variable @|me|, of type pointer to class | |
477 | type of $C$, refers to the receiving object. | |
478 | ||
0a2d4b68 | 479 | |
3cc520db MW |
480 | \subsection{Effective methods and method combinations} |
481 | \label{sec:concepts.methods.combination} | |
482 | ||
483 | For each message a direct instance of a class might receive, there is a set | |
484 | of \emph{applicable methods}, which are exactly the direct methods defined on | |
485 | the object's class and its superclasses. These direct methods are combined | |
486 | together to form the \emph{effective method} for that particular class and | |
487 | message. Direct methods can be combined into an effective method in | |
488 | different ways, according to the \emph{method combination} specified by the | |
489 | message. The method combination determines which direct method roles are | |
490 | acceptable, and, for each role, the appropriate argument lists and return | |
491 | types. | |
492 | ||
493 | One direct method, $M$, is said to be more (resp.\ less) \emph{specific} than | |
494 | another, $N$, with respect to a receiving class~$C$, if the class defining | |
495 | $M$ is a more (resp.\ less) specific superclass of~$C$ than the class | |
496 | defining $N$. | |
497 | ||
498 | \subsection{The standard method combination} | |
499 | \label{sec:concepts.methods.standard} | |
500 | ||
501 | The default method combination is called the \emph{standard method | |
502 | combination}; other method combinations are useful occasionally for special | |
503 | effects. The standard method combination accepts four direct method roles, | |
504 | called @|primary| (the default), @|before|, @|after|, and @|around|. | |
505 | ||
506 | All direct methods subject to the standard method combination must have | |
507 | argument lists which \emph{match} the message's argument list: | |
508 | \begin{itemize} | |
509 | \item the method's arguments must have the same types as the message, though | |
510 | the arguments may have different names; and | |
511 | \item if the message accepts a variable-length argument suffix then the | |
512 | direct method must instead have a final argument of type @|va_list|. | |
513 | \end{itemize} | |
b1254eb6 MW |
514 | Primary and @|around| methods must have the same return type as the message; |
515 | @|before| and @|after| methods must return @|void| regardless of the | |
516 | message's return type. | |
3cc520db MW |
517 | |
518 | If there are no applicable primary methods then no effective method is | |
519 | constructed: the vtables contain null pointers in place of pointers to method | |
520 | entry functions. | |
521 | ||
522 | The effective method for a message with standard method combination works as | |
523 | follows. | |
524 | \begin{enumerate} | |
525 | ||
526 | \item If any applicable methods have the @|around| role, then the most | |
527 | specific such method, with respect to the class of the receiving object, is | |
528 | invoked. | |
529 | ||
b1254eb6 | 530 | Within the body of an @|around| method, the variable @|next_method| is |
3cc520db MW |
531 | defined, having pointer-to-function type. The method may call this |
532 | function, as described below, any number of times. | |
533 | ||
b1254eb6 MW |
534 | If there are any remaining @|around| methods, then @|next_method| invokes the |
535 | next most specific such method, returning whichever value that method | |
536 | returns; otherwise the behaviour of @|next_method| is to invoke the @|before| | |
537 | methods (if any), followed by the most specific primary method, followed by | |
538 | the @|after| methods (if any), and to return whichever value was returned | |
539 | by the most specific primary method. That is, the behaviour of the least | |
540 | specific @|around| method's @|next_method| function is exactly the | |
541 | behaviour that the effective method would have if there were no @|around| | |
542 | methods. | |
3cc520db | 543 | |
b1254eb6 MW |
544 | The value returned by the most specific @|around| method is the value |
545 | returned by the effective method. | |
3cc520db MW |
546 | |
547 | \item If any applicable methods have the @|before| role, then they are all | |
548 | invoked, starting with the most specific. | |
549 | ||
550 | \item The most specific applicable primary method is invoked. | |
551 | ||
552 | Within the body of a primary method, the variable @|next_method| is | |
553 | defined, having pointer-to-function type. If there are no remaining less | |
554 | specific primary methods, then @|next_method| is a null pointer. | |
555 | Otherwise, the method may call the @|next_method| function any number of | |
556 | times. | |
557 | ||
558 | The behaviour of the @|next_method| function, if it is not null, is to | |
559 | invoke the next most specific applicable primary method, and to return | |
560 | whichever value that method returns. | |
561 | ||
b1254eb6 MW |
562 | If there are no applicable @|around| methods, then the value returned by |
563 | the most specific primary method is the value returned by the effective | |
564 | method; otherwise the value returned by the most specific primary method is | |
565 | returned to the least specific @|around| method, which called it via its | |
566 | own @|next_method| function. | |
3cc520db MW |
567 | |
568 | \item If any applicable methods have the @|after| role, then they are all | |
569 | invoked, starting with the \emph{least} specific. (Hence, the most | |
b1254eb6 | 570 | specific @|after| method is invoked with the most `afterness'.) |
3cc520db MW |
571 | |
572 | \end{enumerate} | |
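The steps above can be sketched in plain C. This is only a model of the ordering rules, not the code which the Sod translator generates: the structure @|struct ctx|, the helper functions, and the example methods are all hypothetical, and each array of methods is assumed to be sorted with the most specific method first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ctx;
typedef int body_fn(struct ctx *c, size_t self);   /* around and primary */
typedef void side_fn(struct ctx *c);               /* before and after */

/* Applicable methods for one message send, most specific first. */
struct ctx {
  body_fn *const *around;  size_t n_around;
  side_fn *const *before;  size_t n_before;
  body_fn *const *primary; size_t n_primary;
  side_fn *const *after;   size_t n_after;
  char log[64];                       /* trace of the invocation order */
};

static void note(struct ctx *c, const char *s)
{
  assert(strlen(c->log) + strlen(s) < sizeof(c->log));
  strcat(c->log, s);
}

/* Steps 2-4: befores, most specific primary, afters (least specific first). */
static int inner(struct ctx *c)
{
  size_t i;
  int rc;

  for (i = 0; i < c->n_before; i++) c->before[i](c);
  rc = c->primary[0](c, 0);
  for (i = c->n_after; i > 0; i--) c->after[i - 1](c);
  return rc;
}

/* What `next_method' does for the around method at index SELF. */
static int next_around(struct ctx *c, size_t self)
{
  if (self + 1 < c->n_around) return c->around[self + 1](c, self + 1);
  return inner(c);
}

/* Whether the primary method at index SELF has a non-null `next_method'. */
static int has_next_primary(const struct ctx *c, size_t self)
  { return self + 1 < c->n_primary; }

static int next_primary(struct ctx *c, size_t self)
{
  assert(has_next_primary(c, self));
  return c->primary[self + 1](c, self + 1);
}

/* Step 1: enter through the most specific around method, if there is one. */
static int effective(struct ctx *c)
{
  assert(c->n_primary > 0);           /* otherwise no effective method */
  return c->n_around ? c->around[0](c, 0) : inner(c);
}

/* Example methods: lower case = subclass, upper case = base class. */
static int around_base(struct ctx *c, size_t self)
  { int rc; note(c, "["); rc = next_around(c, self); note(c, "]"); return rc; }
static void before_sub(struct ctx *c) { note(c, "b"); }
static void before_base(struct ctx *c) { note(c, "B"); }
static int primary_sub(struct ctx *c, size_t self)
{
  note(c, "p");
  return has_next_primary(c, self) ? next_primary(c, self) + 1 : 1;
}
static int primary_base(struct ctx *c, size_t self)
  { (void)self; note(c, "P"); return 10; }
static void after_sub(struct ctx *c) { note(c, "a"); }
static void after_base(struct ctx *c) { note(c, "A"); }

/* Send the message once; copy the trace into OUT and return the result. */
int demo_send(char *out, size_t sz)
{
  static body_fn *const around[] = { around_base };
  static side_fn *const before[] = { before_sub, before_base };
  static body_fn *const primary[] = { primary_sub, primary_base };
  static side_fn *const after[] = { after_sub, after_base };
  struct ctx c = { around, 1, before, 2, primary, 2, after, 2, "" };
  int rc = effective(&c);

  assert(strlen(c.log) < sz);
  strcpy(out, c.log);
  return rc;
}
```

In this model, @|demo_send| leaves the trace @|[bBpPAa]| in its buffer: the @|around| method brackets everything else, the @|before| methods run most specific first, and the @|after| methods run least specific first. The value returned is that of the most specific primary method, passed back through the @|around| method.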
573 | ||
b1254eb6 MW |
574 | A typical use for @|around| methods is to allow a base class to set up the |
575 | dynamic environment appropriately for the primary methods of its subclasses, | |
576 | e.g., by claiming a lock, and restore it afterwards. | |
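A minimal sketch of this pattern follows. It is written as ordinary C rather than as Sod module code, and the names @|struct counter|, @|base_around|, @|add_primary| and @|counter_add| are invented for the illustration; a simple flag stands in for a real lock.

```c
#include <assert.h>

struct counter { int locked; int value; };

/* Assumed shape of `next_method' for a message `int add(int delta)'. */
typedef int next_fn(struct counter *me, int delta);

/* Primary method of a hypothetical subclass.  It relies on the lock
 * being held, and reports failure (-1) if it is not. */
static int add_primary(struct counter *me, int delta)
{
  if (!me->locked) return -1;
  me->value += delta;
  return me->value;
}

/* Around method of the hypothetical base class: claim the lock, let the
 * rest of the effective method run, then release the lock again. */
static int base_around(struct counter *me, int delta, next_fn *next_method)
{
  int rc;

  me->locked = 1;
  rc = next_method(me, delta);
  me->locked = 0;
  return rc;
}

/* Stand-in for the effective method: the around method wraps the primary. */
int counter_add(struct counter *me, int delta)
  { return base_around(me, delta, add_primary); }
```

The subclass's primary method need not know how the lock is managed; it only sees an environment in which the lock is already held.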
3cc520db MW |
577 | |
578 | The @|next_method| function provided to methods with the @|primary| and | |
579 | @|around| roles accepts the same arguments, and returns the same type, as the | |
580 | message, except that one or two additional arguments are inserted at the | |
581 | front of the argument list. The first additional argument is always the | |
582 | receiving object, @|me|. If the message accepts a variable argument suffix, | |
583 | then the second additional argument is a @|va_list|; otherwise there is no | |
584 | second additional argument. In the former case, a variable | |
585 | @|sod__master_ap| of type @|va_list| is defined, containing a separate copy | |
586 | of the argument pointer (so the method body can process the variable argument | |
587 | suffix itself, and still pass a fresh copy on to the next method). | |
588 | ||
589 | A method with the @|primary| or @|around| role may use the convenience macro | |
590 | @|CALL_NEXT_METHOD|, which takes no arguments itself, and simply calls | |
591 | @|next_method| with appropriate arguments: the receiver @|me| pointer, the | |
592 | argument pointer @|sod__master_ap| (if applicable), and the method's | |
593 | arguments. If the method body has overwritten its formal arguments, then | |
594 | @|CALL_NEXT_METHOD| will pass along the updated values, rather than the | |
595 | original ones. | |
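The following C sketch illustrates why a separate copy of the argument pointer is useful. It is not the code generated by Sod: the variable @|master_ap| merely plays the role of @|sod__master_ap|, the macro @|DEMO_CALL_NEXT_METHOD| is a hypothetical stand-in for @|CALL_NEXT_METHOD|, and the message is assumed to have the signature @|int total(int n, ...)|.

```c
#include <assert.h>
#include <stdarg.h>

struct obj { int unused; };

/* Assumed shape of `next_method' for a message `int total(int n, ...)':
 * the receiver, then a va_list, then the message's fixed arguments. */
typedef int next_fn(struct obj *me, va_list ap, int n);

/* Hypothetical stand-in for CALL_NEXT_METHOD: it names the method's own
 * variables, so it passes along whatever values they currently hold. */
#define DEMO_CALL_NEXT_METHOD (next_method(me, master_ap, n))

/* Less specific direct method: adds up the N trailing ints.  As a direct
 * method, its va_list is the final argument. */
static int base_total(struct obj *me, int n, va_list ap)
{
  int sum = 0;

  (void)me;
  while (n-- > 0) sum += va_arg(ap, int);
  return sum;
}

/* Adapter giving base_total the argument order of `next_method'. */
static int base_total_next(struct obj *me, va_list ap, int n)
  { return base_total(me, n, ap); }

/* More specific direct method: scans the suffix itself to find the
 * largest value, then adds it to the next method's result. */
static int sub_total(struct obj *me, int n, va_list ap, next_fn *next_method)
{
  va_list master_ap;                  /* plays the role of sod__master_ap */
  int i, v, biggest = 0, rc;

  va_copy(master_ap, ap);             /* fresh copy, taken before reading */
  for (i = 0; i < n; i++) {
    v = va_arg(ap, int);
    if (v > biggest) biggest = v;
  }
  rc = DEMO_CALL_NEXT_METHOD;         /* next method sees all N arguments */
  va_end(master_ap);
  return rc + biggest;
}

/* Message entry point: turn the `...' suffix into a va_list. */
int send_total(struct obj *me, int n, ...)
{
  va_list ap;
  int rc;

  va_start(ap, n);
  rc = sub_total(me, n, ap, base_total_next);
  va_end(ap);
  return rc;
}
```

Had @|sub_total| passed its own, already consumed, @|ap| to the next method, the next method would have read past the end of the argument suffix. Because the macro expands to a use of the variable @|n|, assigning to @|n| before the call would change the value passed on, mirroring the behaviour described above for overwritten formal arguments.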
596 | ||
597 | \subsection{Aggregating method combinations} | |
598 | \label{sec:concepts.methods.aggregating} | |
599 | ||
600 | A number of other method combinations are provided. They are called | |
601 | `aggregating' method combinations because, instead of invoking just the most | |
602 | specific primary method, as the standard method combination does, they invoke | |
603 | the applicable primary methods in turn and aggregate the return values from | |
604 | each. | |
605 | ||
606 | The aggregating method combinations accept the same four roles as the | |
b1254eb6 MW |
607 | standard method combination, and @|around|, @|before|, and @|after| methods |
608 | work in the same way. | |
3cc520db MW |
609 | |
610 | The aggregating method combinations provided are as follows. | |
611 | \begin{description} \let\makelabel\code | |
612 | \item[progn] The message must return @|void|. The applicable primary methods | |
613 | are simply invoked in turn, most specific first. | |
614 | \item[sum] The message must return a numeric type.\footnote{% | |
615 | The Sod translator does not check this, since it doesn't have enough | |
616 | insight into @|typedef| names.} % | |
617 | The applicable primary methods are invoked in turn, and their return values | |
618 | added up. The final result is the sum of the individual values. | |
619 | \item[product] The message must return a numeric type. The applicable | |
620 | primary methods are invoked in turn, and their return values multiplied | |
621 | together. The final result is the product of the individual values. | |
622 | \item[min] The message must return a scalar type. The applicable primary | |
623 | methods are invoked in turn. The final result is the smallest of the | |
624 | individual values. | |
625 | \item[max] The message must return a scalar type. The applicable primary | |
626 | methods are invoked in turn. The final result is the largest of the | |
627 | individual values. | |
665a0455 MW |
628 | \item[and] The message must return a scalar type. The applicable primary |
629 | methods are invoked in turn. If any method returns zero then the final | |
630 | result is zero and no further methods are invoked. If all of the | |
631 | applicable primary methods return nonzero, then the final result is the | |
632 | result of the last primary method. | |
633 | \item[or] The message must return a scalar type. The applicable primary | |
634 | methods are invoked in turn. If any method returns nonzero then the final | |
635 | result is that nonzero value and no further methods are invoked. If all of | |
636 | the applicable primary methods return zero, then the final result is zero. | |
3cc520db MW |
637 | \end{description} |
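The aggregation rules can be modelled by simple loops over the applicable primary methods. The sketch below is an illustration only, under the assumption that the methods are held in an array with the most specific first and that there is at least one of them; the names @|agg_sum|, @|agg_product|, @|agg_max|, @|agg_and|, @|agg_or|, @|demo_methods| and @|demo_calls| are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int primary_fn(void);

int demo_calls;                       /* counts primary invocations */

static int two(void)  { demo_calls++; return 2; }
static int zero(void) { demo_calls++; return 0; }
static int five(void) { demo_calls++; return 5; }

/* Example applicable primary methods, most specific first. */
primary_fn *const demo_methods[] = { two, zero, five };

/* sum: invoke every method and add up the results. */
int agg_sum(primary_fn *const *m, size_t n)
{
  size_t i;
  int acc = 0;

  for (i = 0; i < n; i++) acc += m[i]();
  return acc;
}

/* product: invoke every method and multiply the results together. */
int agg_product(primary_fn *const *m, size_t n)
{
  size_t i;
  int acc = 1;

  for (i = 0; i < n; i++) acc *= m[i]();
  return acc;
}

/* max: invoke every method and keep the largest result. */
int agg_max(primary_fn *const *m, size_t n)
{
  size_t i;
  int v, best;

  assert(n > 0);
  best = m[0]();
  for (i = 1; i < n; i++) { v = m[i](); if (v > best) best = v; }
  return best;
}

/* and: stop at the first zero; otherwise return the last result. */
int agg_and(primary_fn *const *m, size_t n)
{
  size_t i;
  int v = 0;

  for (i = 0; i < n; i++) { v = m[i](); if (!v) return 0; }
  return v;
}

/* or: stop at the first nonzero result and return it; otherwise zero. */
int agg_or(primary_fn *const *m, size_t n)
{
  size_t i;
  int v;

  for (i = 0; i < n; i++) { v = m[i](); if (v) return v; }
  return 0;
}
```

The counter @|demo_calls| makes the short-circuiting visible: with the three example methods, @|agg_and| stops after the second method and @|agg_or| after the first, whereas @|agg_sum|, @|agg_product| and @|agg_max| always invoke all three.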
638 | ||
639 | There is also a @|custom| aggregating method combination, which is described | |
640 | in \xref{sec:fixme.custom-aggregating-method-combination}. | |
641 | ||
642 | %%%-------------------------------------------------------------------------- | |
643 | \section{Metaclasses} \label{sec:concepts.metaclasses} | |
1f7d590d MW |
644 | |
645 | %%%----- That's all, folks -------------------------------------------------- | |
646 | ||
647 | %%% Local variables: | |
648 | %%% mode: LaTeX | |
649 | %%% TeX-master: "sod.tex" | |
650 | %%% TeX-PDF-mode: t | |
651 | %%% End: |