X-Git-Url: https://git.distorted.org.uk/~mdw/sod/blobdiff_plain/9761db0da830385bcc0fca81f56f24536a46aeda..8d952432c8b961e4e0891eb78620615a8ae14f05:/doc/concepts.tex diff --git a/doc/concepts.tex b/doc/concepts.tex index b8cdfe9..30c6735 100644 --- a/doc/concepts.tex +++ b/doc/concepts.tex @@ -195,7 +195,7 @@ It works as follows. earliest position in these candidate merges at which they disagree. The \emph{candidate classes} at this position are the classes appearing at this position in the candidate merges. Each candidate class must be a - superclass of exactly one of $C$'s direct superclasses, since otherwise the + superclass of distinct direct superclasses of $C$, since otherwise the candidates would be ordered by their common subclass's class precedence list. The class precedence list contains, at this position, that candidate class whose subclass appears earliest in $C$'s local precedence order. @@ -208,8 +208,8 @@ a link superclass, and the link superclass of a class $C$, if it exists, need not be a direct superclass of $C$. Superclass links must obey the following rule: if $C$ is a class, then there -must be no three superclasses $X$, $Y$ and~$Z$ of $C$ such that both $Z$ is -the link superclass of both $X$ and $Y$. As a consequence of this rule, the +must be no three superclasses $X$, $Y$ and~$Z$ of $C$ such that $Z$ is the +link superclass of both $X$ and $Y$. As a consequence of this rule, the superclasses of $C$ can be partitioned into linear \emph{chains}, such that superclasses $A$ and $B$ are in the same chain if and only if one can trace a path from $A$ to $B$ by following superclass links, or \emph{vice versa}. @@ -246,12 +246,12 @@ qualified by the defining class's nickname. As well as defining slot names and types, a class can also associate an \emph{initial value} with each slot defined by itself or one of its subclasses. 
A class $C$ provides an \emph{initialization function} (see -\xref{sec:concepts.classes.c}, and \xref{sec:structures.root.sodclass}) which -sets the slots of a \emph{direct} instance of the class to the correct +\xref{sec:concepts.lifecycle.birth}, and \xref{sec:structures.root.sodclass}) +which sets the slots of a \emph{direct} instance of the class to the correct initial values. If several of $C$'s superclasses define initializers for the same slot then the initializer from the most specific such class is used. If none of $C$'s superclasses define an initializer for some slot then that slot -will not be initialized. +will be left uninitialized. The initializer for a slot with scalar type may be any C expression. The initializer for a slot with aggregate type must contain only constant @@ -259,6 +259,17 @@ expressions if the generated code is expected to be processed by a implementation of C89. Initializers will be evaluated once each time an instance is initialized. +Slots are initialized in reverse-precedence order of their defining classes; +i.e., slots defined by a less specific superclass are initialized earlier +than slots defined by a more specific superclass. Slots defined by the same +class are initialized in the order in which they appear in the class +definition. + +The initializer for a slot may refer to other slots in the same object, via +the @|me| pointer: in an initializer for a slot defined by a class $C$, @|me| +has type `pointer to $C$'. (Note that the type of @|me| depends only on the +class which defined the slot, not the class which defined the initializer.) + \subsection{C language integration} \label{sec:concepts.classes.c} @@ -307,89 +318,6 @@ functions for working with that class's instances. (The @|SodClass| class doesn't define any messages, so it doesn't have any methods. In Sod, a class slot containing a function pointer is not at all the same thing as a method.) 
-\subsubsection{Instance allocation, imprinting, and initialization} -It is in general not sufficient to declare (or @|malloc|) an object of the -appropriate class type and fill it in, since the class type only describes an -instance's layout from the point of view of a single superclass chain. The -correct type to allocate, to store a direct instance of some class is a -structure whose tag is the class name suffixed with `@|__ilayout|'; e.g., the -correct layout structure for a direct instance of @|MyClass| would be -@|struct MyClass__ilayout|. - -Instance layouts may be declared as objects with automatic storage duration -(colloquially, `allocated on the stack') or allocated dynamically, e.g., -using @|malloc|. Sod's runtime system doesn't retain addresses of instances, -so, for example, Sod doesn't make using a fancy allocator which sometimes -moves objects around in memory any more difficult than it needs to be. - -Once storage for an instance has been allocated, it must be \emph{imprinted} -before it can be used. Imprinting an instance stores some metadata about its -direct class in the instance structure, so that the rest of the program (and -Sod's runtime library) can tell what sort of object it is, and how to use -it.\footnote{% - Specifically, imprinting an instance's storage involves storing the - appropriate vtable pointers in the right places in it.} % -A class object's @|imprint| slot points to a function which will correctly -imprint storage for one of that class's instances. - -Once an instance's storage has been imprinted, it is possible to send the -instance messages; however, the instance's slots are uninitialized at this -point, so most methods are unlikely to do much of any use. So, usually, you -don't just want to imprint instance storage, but to \emph{initialize} an -instance. Initialization includes imprinting, but also sets the new -instance's slots to their initial values, as defined by the class. 
If -neither the class nor any of its superclasses defines an initializer for a -slot then it will not be initialized. - -There is currently no facility for providing parameters to the instance -initialization process (e.g., for use by slot initializer expressions). -Instance initialization is a complicated matter and for now I want to -experiment with various approaches before committing to one. My current -interim approach is to specify slot initializers where appropriate and send -class-specific messages for more complicated parametrized initialization. - -Automatic-duration instances can be conveniently constructed and initialized -using the \descref{SOD_DECL}[macro]{mac}. No special support is currently -provided for dynamically allocated instances. A simple function using -@|malloc| might work as follows. -\begin{prog} - void *new_instance(const SodClass *c) \\ - \{ \\ \ind - void *p = malloc(c@->cls.initsz); \\ - if (!p) return (0); \\ - c@->cls.init(p); \\ - return (p); \- \\ - \} -\end{prog} - -\subsubsection{Instance finalization and deallocation} -There is currently no provided assistance for finalization or deallocation. -It is the programmer's responsibility to decide and implement an appropriate -protocol. Note that to free an instance allocated from the heap, one must -correctly find its base address: the \descref{SOD_INSTBASE}[macro]{mac} will -do this for you. - -The following simple mixin class is suggested. -\begin{prog} - [nick = disposable] \\ - class DisposableObject : SodObject \{ \\- \ind - void release() \{ ; \} \\ - \quad /* Release resources held by the receiver. */ \- \\- - \} - \\+ - code c : user \{ \\- \ind - /\=\+* Free object p's instance storage. If p is a DisposableObject \\ - {}* then release its resources beforehand. 
\\ - {}*/ \- \\ - void free_instance(void *p) \\ - \{ \\ \ind - DisposableObject *d = SOD_CONVERT(DisposableObject, p); \\ - if (d) DisposableObject_release(d); \\ - free(d); \- \\ - \} \- \\ - \} -\end{prog} - \subsubsection{Conversions} Suppose one has a value of type pointer to class type of some class~$C$, and wants to convert it to a pointer to class type of some other class~$B$. @@ -413,7 +341,8 @@ There are three main cases to distinguish. conversion can fail: the object in question might not be an instance of~$B$ at all. The macro \descref{SOD_CONVERT}{mac} and the function \descref{sod_convert}{fun} perform general conversions. They return a null - pointer if the conversion fails. + pointer if the conversion fails. (They are therefore your analogue to the + \Cplusplus @|dynamic_cast<>| operator.) \end{itemize} The Sod translator generates macros for performing both in-chain and cross-chain upcasts. For each class~$C$, and each proper superclass~$B$ @@ -470,7 +399,8 @@ Keyword arguments can be provided in three ways. Keyword arguments are provided as a general feature for C functions. However, Sod has special support for messages which accept keyword arguments -(\xref{sec:concepts.methods.keywords}). +(\xref{sec:concepts.methods.keywords}); and they play an essential role in +the instance construction protocol (\xref{sec:concepts.lifecycle.birth}). %%%-------------------------------------------------------------------------- \section{Messages and methods} \label{sec:concepts.methods} @@ -576,10 +506,13 @@ follows. returns; otherwise the behaviour of @|next_method| is to invoke the before methods (if any), followed by the most specific primary method, followed by the @|after| methods (if any), and to return whichever value was returned - by the most specific primary method. 
That is, the behaviour of the least - specific @|around| method's @|next_method| function is exactly the - behaviour that the effective method would have if there were no @|around| - methods. + by the most specific primary method, as described in the following items. + That is, the behaviour of the least specific @|around| method's + @|next_method| function is exactly the behaviour that the effective method + would have if there were no @|around| methods. Note that if the + least-specific @|around| method calls its @|next_method| more than once + then the whole sequence of @|before|, primary, and @|after| methods occurs + multiple times. The value returned by the most specific @|around| method is the value returned by the effective method. @@ -634,6 +567,11 @@ arguments. If the method body has overwritten its formal arguments, then @|CALL_NEXT_METHOD| will pass along the updated values, rather than the original ones. +A primary or @|around| method which invokes its @|next_method| function is +said to \emph{extend} the message behaviour; a method which does not invoke +its @|next_method| is said to \emph{override} the behaviour. Note that a +method may make a decision to override or extend at runtime. + \subsubsection{Aggregating method combinations} A number of other method combinations are provided. They are called `aggregating' method combinations because, instead of invoking just the most @@ -713,6 +651,251 @@ from the direct method definition if there was one, or an unspecified value otherwise. %%%-------------------------------------------------------------------------- +\section{The object lifecycle} \label{sec:concepts.lifecycle} + +\subsection{Creation} \label{sec:concepts.lifecycle.birth} + +Construction of a new instance of a class involves three steps. +\begin{enumerate} +\item \emph{Allocation} arranges for there to be storage space for the + instance's slots and associated metadata. 
+\item \emph{Imprinting} fills in the instance's metadata, associating the + instance with its class. +\item \emph{Initialization} stores appropriate initial values in the + instance's slots, and maybe links it into any external data structures as + necessary. +\end{enumerate} +The \descref{SOD_DECL}[macro]{mac} handles constructing instances with +automatic storage duration (`on the stack'). Similarly, the +\descref{SOD_MAKE}[macro]{mac} and the \descref{sod_make}{fun} and +\descref{sod_makev}{fun} functions construct instances allocated from the +standard @|malloc| heap. Programmers can add support for other allocation +strategies by using the \descref{SOD_INIT}[macro]{mac} and the +\descref{sod_init}{fun} and \descref{sod_initv}{fun} functions, which package +up imprinting and initialization. + +\subsubsection{Allocation} +Instances of most classes (specifically including those classes defined by +Sod itself) can be held in any storage of sufficient size. The in-memory +layout of an instance of some class~$C$ is described by the type @|struct +$C$__ilayout|, and if the relevant class is known at compile time then the +best way to discover the layout size is with the @|sizeof| operator. Failing +that, the size required to hold an instance of $C$ is available in a slot in +$C$'s class object, as @|$C$__class@->cls.initsz|. + +It is not in general sufficient to declare, or otherwise allocate, an object +of the class type $C$. The class type only describes a single chain of the +object's layout. It is nearly always an error to use the class type as if it +is a \emph{complete type}, e.g., to declare objects or arrays of the class +type, or to enquire about its size or alignment requirements. + +Instance layouts may be declared as objects with automatic storage duration +(colloquially, `allocated on the stack') or allocated dynamically, e.g., +using @|malloc|. They may be included as members of structures or unions, or +elements of arrays. 
Sod's runtime system doesn't retain addresses of +instances, so, for example, Sod doesn't make using fancy allocators which +sometimes move objects around in memory any more difficult than it needs to +be. + +There isn't any way to discover the alignment required for a particular +class's instances at runtime; it's best to be conservative and assume that +the platform's strictest alignment requirement applies. + +The following simple function correctly allocates and returns space for an +instance of a class given a pointer to its class object @<cls>. +\begin{prog} + void *allocate_instance(const SodClass *cls) \\ \ind + \{ return malloc(cls@->cls.initsz); \} +\end{prog} + +\subsubsection{Imprinting} +Once storage has been allocated, it must be \emph{imprinted} before it can be +used as an instance of a class, e.g., before any messages can be sent to it. + +Imprinting an instance stores some metadata about its direct class in the +instance structure, so that the rest of the program (and Sod's runtime +library) can tell what sort of object it is, and how to use it.\footnote{% + Specifically, imprinting an instance's storage involves storing the + appropriate vtable pointers in the right places in it.} % +A class object's @|imprint| slot points to a function which will correctly +imprint storage for one of that class's instances. + +Once an instance's storage has been imprinted, it is technically possible to +send messages to the instance; however, because the instance's slots are still +uninitialized at this point, the applicable methods are unlikely to do much +of any use unless they've been written specifically for the purpose. + +The following simple function imprints storage at address @<p> as an instance +of a class, given a pointer to its class object @<cls>. 
+\begin{prog} + void imprint_instance(const SodClass *cls, void *p) \\ \ind + \{ cls@->cls.imprint(p); \} +\end{prog} + +\subsubsection{Initialization} +The final step for constructing a new instance is to \emph{initialize} it, to +establish the necessary invariants for the instance itself and the +environment in which it operates. + +Details of initialization are necessarily class-specific, but typically it +involves setting the instance's slots to appropriate values, and possibly +linking it into some larger data structure to keep track of it. + +Initialization is performed by sending the imprinted instance an @|init| +message, defined by the @|SodObject| class. This message uses a nonstandard +method combination which works like the standard combination, except that the +\emph{default behaviour}, if there is no overriding method, is to initialize +the instance's slots, as described below, and to invoke each superclass's +initialization fragments. This default behaviour may be invoked multiple +times if some method calls on its @|next_method| more than once, unless some +other method takes steps to prevent this. + +Slots are initialized in a well-defined order. +\begin{itemize} +\item Slots defined by a more specific superclass are initialized after + slots defined by a less specific superclass. +\item Slots defined by the same class are initialized in the order in which + their definitions appear. +\end{itemize} + +A class can define \emph{initialization fragments}: pieces of literal code to +be executed to set up a new instance. Each superclass's initialization +fragments are executed with @|me| bound to an instance pointer of the +appropriate superclass type, immediately after that superclass's slots (if +any) have been initialized; therefore, fragments defined by a more specific +superclass are executed after fragments defined by a less specific +superclass. 
A class may define more than one initialization fragment: the +fragments are executed in the order in which they appear in the class +definition. It is possible for an initialization fragment to use @|return| +or @|goto| for special control-flow effects, but this is not likely to be a +good idea. + +The @|init| message accepts keyword arguments +(\xref{sec:concepts.methods.keywords}). The set of acceptable keywords is +determined by the applicable methods as usual, but also by the +\emph{initargs} defined by the receiving instance's class and its +superclasses, which are made available to slot initializers and +initialization fragments. + +There are two kinds of initarg definitions. \emph{User initargs} are defined +by an explicit @|initarg| item appearing in a class definition: the item +defines a name, type, and (optionally) a default value for the initarg. +\emph{Slot initargs} are defined by attaching an @|initarg| property to a +slot or slot initializer item: the property's value determines the initarg's name, +while the type is taken from the underlying slot type; slot initargs do not +have default values. Both kinds define a \emph{direct initarg} for the +containing class. + +Initargs are inherited. The \emph{applicable} direct initargs for an @|init| +effective method are those defined by the receiving object's class, and all +of its superclasses. Applicable direct initargs with the same name are +merged to form \emph{effective initargs}. An error is reported if two +applicable direct initargs have the same name but different types. The +default value of an effective initarg is taken from the most specific +applicable direct initarg which specifies a default value; if no applicable +direct initarg specifies a default value then the effective initarg has no +default. 
+ +All initarg values are made available at runtime to user code -- +initialization fragments and slot initializer expressions -- through local +variables and a @|suppliedp| structure, as in a direct method +(\xref{sec:concepts.methods.keywords}). Furthermore, slot initarg +definitions influence the initialization of slots. + +The process for deciding how to initialize a particular slot works as +follows. +\begin{enumerate} +\item If there are any slot initargs defined on the slot, or any of its slot + initializers, \emph{and} the sender supplied a value for one or more of the + corresponding effective initargs, then the value of the most specific slot + initarg is stored in the slot. +\item Otherwise, if there are any slot initializers defined which include an + initializer expression, then the initializer expression from the most + specific such slot initializer is evaluated and its value stored in the + slot. +\item Otherwise, the slot is left uninitialized. +\end{enumerate} +Note that the default values (if any) of effective initargs do \emph{not} +affect this procedure. + + +\subsection{Destruction} +\label{sec:concepts.lifecycle.death} + +Destruction of an instance, when it is no longer required, consists of two +steps. +\begin{enumerate} +\item \emph{Teardown} releases any resources held by the instance and + disentangles it from any external data structures. +\item \emph{Deallocation} releases the memory used to store the instance so + that it can be reused. +\end{enumerate} +Teardown alone, for objects which require special deallocation, or for which +deallocation occurs automatically (e.g., instances with automatic storage +duration, or instances whose storage will be garbage-collected), is performed +using the \descref{sod_teardown}[function]{fun}. Destruction of instances +allocated from the standard @|malloc| heap is done using the +\descref{sod_destroy}[function]{fun}. 
+ +\subsubsection{Teardown} +Details of teardown are necessarily class-specific, but typically it +involves releasing any resources held by the instance, and disentangling it +from any external data structures. + +Teardown is performed by sending the instance the @|teardown| message, +defined by the @|SodObject| class. The message returns an integer, used as a +boolean flag. If the message returns zero, then the instance's storage +should be deallocated. If the message returns nonzero, then it is safe for +the caller to forget about the instance, but the caller should not deallocate its storage. +This is \emph{not} an error return: if some teardown method fails then the +program may be in an inconsistent state and should not continue. + +This simple protocol can be used, for example, to implement a reference +counting system, as follows. +\begin{prog} + [nick = ref] \\ + class ReferenceCountedObject \{ \\ \ind + unsigned nref = 1; \\- + void inc() \{ me@->ref.nref++; \} \\- + [role = around] \\ + int obj.teardown() \\ + \{ \\ \ind + if (--\,--me@->ref.nref) return (1); \\ + else return (CALL_NEXT_METHOD); \- \\ + \} \- \\ + \} +\end{prog} + +This message uses a nonstandard method combination which works like the +standard combination, except that the \emph{default behaviour}, if there is +no overriding method, is to execute each superclass's teardown fragments, and +to return zero. This default behaviour may be invoked multiple times if some +method calls on its @|next_method| more than once, unless some other method +takes steps to prevent this. + +A class can define \emph{teardown fragments}: pieces of literal code to be +executed to shut down an instance. Each superclass's teardown fragments are +executed with @|me| bound to an instance pointer of the appropriate +superclass type; fragments defined by a more specific superclass are executed +before fragments defined by a less specific superclass. 
A class may define +more than one teardown fragment: the fragments are executed in the order in +which they appear in the class definition. It is possible for a +teardown fragment to use @|return| or @|goto| for special control-flow +effects, but this is not likely to be a good idea. Similarly, it's probably +a better idea to use an @|around| method to influence the return value than +to write an explicit @|return| statement in a teardown fragment. + +\subsubsection{Deallocation} +The details of instance deallocation are obviously specific to the allocation +strategy used by the instance, and this is often orthogonal to the object's +class. + +The code which makes the decision to destroy an object may often not be aware +of the object's direct class. Low-level details of deallocation often +require the proper base address of the instance's storage, which can be +determined using the \descref{SOD_INSTBASE}[macro]{mac}. + +%%%-------------------------------------------------------------------------- \section{Metaclasses} \label{sec:concepts.metaclasses} %%%----- That's all, folks --------------------------------------------------