X-Git-Url: https://git.distorted.org.uk/~mdw/sod/blobdiff_plain/a42893dda5f4dd2b89fbfe4e497da261159225ca..e4ea29d8e1853f8ae36ee0e65b9f5913303042d2:/doc/concepts.tex diff --git a/doc/concepts.tex b/doc/concepts.tex index 8151014..64f6320 100644 --- a/doc/concepts.tex +++ b/doc/concepts.tex @@ -26,37 +26,6 @@ \chapter{Concepts} \label{ch:concepts} %%%-------------------------------------------------------------------------- -\section{Operational model} \label{sec:concepts.model} - -The Sod translator runs as a preprocessor, similar in nature to the -traditional Unix \man{lex}{1} and \man{yacc}{1} tools. The translator reads -a \emph{module} file containing class definitions and other information, and -writes C~source and header files. The source files contain function -definitions and static tables which are fed directly to a C~compiler; the -header files contain declarations for functions and data structures, and are -included by source files -- whether hand-written or generated by Sod -- which -makes use of the classes defined in the module. - -Sod is not like \Cplusplus: it makes no attempt to `enhance' the C language -itself. Sod module files describe classes, messages, methods, slots, and -other kinds of object-system things, and some of these descriptions need to -contain C code fragments, but this code is entirely uninterpreted by the Sod -translator.\footnote{% - As long as a code fragment broadly follows C's lexical rules, and properly - matches parentheses, brackets, and braces, the Sod translator will copy it - into its output unchanged. It might, in fact, be some other kind of C-like - language, such as Objective~C or \Cplusplus. Or maybe even - Objective~\Cplusplus, because if having an object system is good, then - having three must be really awesome.} % - -The Sod translator is not a closed system. It is written in Common Lisp, and -can load extension modules which add new input syntax, output formats, or -altered behaviour. 
The interface for writing such extensions is described in -\xref{p:lisp}. Extensions can change almost all details of the Sod object -system, so the material in this manual must be read with this in mind: this -manual describes the base system as provided in the distribution. - -%%%-------------------------------------------------------------------------- \section{Modules} \label{sec:concepts.modules} A \emph{module} is the top-level syntactic unit of input to the Sod @@ -301,7 +270,10 @@ slots. If you want to hide implementation details, the best approach is to stash them in a dynamically allocated private structure, and leave a pointer to it in a slot. (This will also help preserve binary compatibility, because the private structure can grow more members as needed. See -\xref{sec:fixme.compatibility} for more details. +\xref{sec:fixme.compatibility} for more details.) + +\subsubsection{Vtables} + \subsubsection{Class objects} In Sod's object system, classes are objects too. Therefore classes are @@ -319,8 +291,8 @@ doesn't define any messages, so it doesn't have any methods. In Sod, a class slot containing a function pointer is not at all the same thing as a method.) \subsubsection{Conversions} -Suppose one has a value of type pointer to class type of some class~$C$, and -wants to convert it to a pointer to class type of some other class~$B$. +Suppose one has a value of type pointer-to-class-type for some class~$C$, and +wants to convert it to a pointer-to-class-type for some other class~$B$. There are three main cases to distinguish. \begin{itemize} \item If $B$ is a superclass of~$C$, in the same chain, then the conversion @@ -336,13 +308,13 @@ There are three main cases to distinguish. pointer. The conversion can be performed using the appropriate generated upcast macro (see below); the general case is handled by the macro \descref{SOD_XCHAIN}{mac}. 
-\item If $B$ is a subclass of~$C$ then the conversion is an \emph{upcast}; +\item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast}; otherwise the conversion is a~\emph{cross-cast}. In either case, the conversion can fail: the object in question might not be an instance of~$B$ - at all. The macro \descref{SOD_CONVERT}{mac} and the function + after all. The macro \descref{SOD_CONVERT}{mac} and the function \descref{sod_convert}{fun} perform general conversions. They return a null pointer if the conversion fails. (They are therefore your analogue to the - \Cplusplus @|dynamic_cast<>| operator.) + \Cplusplus\ @|dynamic_cast<>| operator.) \end{itemize} The Sod translator generates macros for performing both in-chain and cross-chain upcasts. For each class~$C$, and each proper superclass~$B$ @@ -616,6 +588,36 @@ There is also a @|custom| aggregating method combination, which is described in \xref{sec:fixme.custom-aggregating-method-combination}. +\subsection{Sending messages in C} \label{sec:concepts.methods.c} + +Each instance is associated with its direct class [FIXME] + +The effective methods for each class are determined at translation time, by +the Sod translator. For each effective method, one or more \emph{method +entry functions} are constructed. A method entry function has three +responsibilities. +\begin{itemize} +\item It converts the receiver pointer to the correct type. Method entry + functions can perform these conversions extremely efficiently: there are + separate method entries for each chain of each class which can receive a + message, so method entry functions are in the privileged situation of + knowing the \emph{exact} class of the receiving object. 
+\item If the message accepts a variable-length argument tail, then two method + entry functions are created for each chain of each class: one receives a + variable-length argument tail, as intended, and captures it in a @|va_list| + object; the other accepts an argument of type @|va_list| in place of the + variable-length tail and arranges for it to be passed along to the direct + methods. +\item It invokes the effective method with the appropriate arguments. There + might or might not be an actual function corresponding to the effective + method itself: the translator may instead open-code the effective method's + behaviour into each method entry function; and the machinery for handling + `delegation chains', such as is used for @|around| methods and primary + methods in the standard method combination, is necessarily scattered among + a number of small functions. +\end{itemize} + + \subsection{Messages with keyword arguments} \label{sec:concepts.methods.keywords} @@ -704,7 +706,7 @@ the platform's strictest alignment requirement applies. The following simple function correctly allocates and returns space for an instance of a class given a pointer to its class object @<cls>. \begin{prog} - void *allocate_instance(const SodClass *cls) \\ \ind + void *allocate_instance(const SodClass *cls) \\ \ind \{ return malloc(cls@->cls.initsz); \} \end{prog} @@ -728,7 +730,7 @@ of any use unless they've been written specifically for the purpose. The following simple function imprints storage at address @<p> as an instance of a class, given a pointer to its class object @<cls>. \begin{prog} - void imprint_instance(const SodClass *cls, void *p) \\ \ind + void imprint_instance(const SodClass *cls, void *p) \\ \ind \{ cls@->cls.imprint(p); \} \end{prog} @@ -745,11 +747,10 @@ Initialization is performed by sending the imprinted instance an @|init| message, defined by the @|SodObject| class. This message uses a nonstandard method combination which works like the standard combination, except that the \emph{default behaviour}, if there is no overriding method, is to initialize -the instance's slots using the initializers defined in the class and its -superclasses, and to invoke each superclass's initialization fragments. This -default behaviour may be invoked multiple times if some method calls on its -@|next_method| more than once, unless some other method takes steps to -prevent this. +the instance's slots, as described below, and to invoke each superclass's +initialization fragments. This default behaviour may be invoked multiple +times if some method calls on its @|next_method| more than once, unless some +other method takes steps to prevent this. Slots are initialized in a well-defined order. \begin{itemize} @@ -771,19 +772,53 @@ definition. It is possible for an initialization fragment to use @|return| or @|goto| for special control-flow effects, but this is not likely to be a good idea. -Note that an initialization fragment defined in a class is copied literally -into each subclass's initialization method. This is fine for simple cases -but wasteful if the initialization logic is complicated. More complex -initialization behaviour should be added either by having an initialization -fragments call functions (necessarily with external linkage), or by defining -@|after| methods on the @|init| message. These will be run after the slot -initializers have been applied, in reverse precedence order. 
- -Initialization is \emph{parametrized}, so the caller may select from a space -of possible initial states for the new instance, or to inform the new -instance about some other objects known to the caller. Specifically, the -@|init| message accepts keyword arguments (\xref{sec:concepts.keywords}) -which can be defined and used by methods defined on it. +The @|init| message accepts keyword arguments +(\xref{sec:concepts.methods.keywords}). The set of acceptable keywords is +determined by the applicable methods as usual, but also by the +\emph{initargs} defined by the receiving instance's class and its +superclasses, which are made available to slot initializers and +initialization fragments. + +There are two kinds of initarg definitions. \emph{User initargs} are defined +by an explicit @|initarg| item appearing in a class definition: the item +defines a name, type, and (optionally) a default value for the initarg. +\emph{Slot initargs} are defined by attaching an @|initarg| property to a +slot or slot initializer item: the property's value determines the initarg's name, +while the type is taken from the underlying slot type; slot initargs do not +have default values. Both kinds define a \emph{direct initarg} for the +containing class. + +Initargs are inherited. The \emph{applicable} direct initargs for an @|init| +effective method are those defined by the receiving object's class, and all +of its superclasses. Applicable direct initargs with the same name are +merged to form \emph{effective initargs}. An error is reported if two +applicable direct initargs have the same name but different types. The +default value of an effective initarg is taken from the most specific +applicable direct initarg which specifies a default value; if no applicable +direct initarg specifies a default value then the effective initarg has no +default. 
+ +All initarg values are made available at runtime to user code -- +initialization fragments and slot initializer expressions -- through local +variables and a @|suppliedp| structure, as in a direct method +(\xref{sec:concepts.methods.keywords}). Furthermore, slot initarg +definitions influence the initialization of slots. + +The process for deciding how to initialize a particular slot works as +follows. +\begin{enumerate} +\item If there are any slot initargs defined on the slot, or any of its slot + initializers, \emph{and} the sender supplied a value for one or more of the + corresponding effective initargs, then the value of the most specific slot + initarg is stored in the slot. +\item Otherwise, if there are any slot initializers defined which include an + initializer expression, then the initializer expression from the most + specific such slot initializer is evaluated and its value stored in the + slot. +\item Otherwise, the slot is left uninitialized. +\end{enumerate} +Note that the default values (if any) of effective initargs do \emph{not} +affect this procedure. \subsection{Destruction} @@ -820,16 +855,16 @@ program may be in an inconsistent state and should not continue. This simple protocol can be used, for example, to implement a reference counting system, as follows. \begin{prog} - [nick = ref] \\ - class ReferenceCountedObject \{ \\ \ind - unsigned nref = 1; \\- - void inc() \{ me@->ref.nref++; \} \\- - [role = around] \\ - int obj.teardown() \\ - \{ \\ \ind - if (--\,--me@->ref.nref) return (1); \\ - else return (CALL_NEXT_METHOD); \- \\ - \} \- \\ + [nick = ref] \\ + class ReferenceCountedObject \{ \\ \ind + unsigned nref = 1; \\- + void inc() \{ me@->ref.nref++; \} \\- + [role = around] \\ + int obj.teardown() \\ + \{ \\ \ind + if (--\,--me@->ref.nref) return (1); \\ + else return (CALL_NEXT_METHOD); \-\\ + \} \-\\ \} \end{prog} @@ -865,6 +900,22 @@ determined using the \descref{SOD_INSTBASE}[macro]{mac}. 
%%%-------------------------------------------------------------------------- \section{Metaclasses} \label{sec:concepts.metaclasses} +%%%-------------------------------------------------------------------------- +\section{Compatibility considerations} \label{sec:concepts.compatibility} + +Sod doesn't make source-level compatibility especially difficult. As long as +classes, slots, and messages don't change names or disappear, and slots and +messages retain their approximate types, everything will be fine. + +Binary compatibility is much more difficult. Unfortunately, Sod classes have +rather fragile binary interfaces.\footnote{% + Research suggestion: investigate alternative instance and vtable layouts + which improve binary compatibility, probably at the expense of instance + compactness, and efficiency of slot access and message sending. There may + be interesting trade-offs to be made.} % + +If instances are allocated [FIXME] + %%%----- That's all, folks -------------------------------------------------- %%% Local variables: