X-Git-Url: https://git.distorted.org.uk/~mdw/sod/blobdiff_plain/a42893dda5f4dd2b89fbfe4e497da261159225ca..e4ea29d8e1853f8ae36ee0e65b9f5913303042d2:/doc/concepts.tex

diff --git a/doc/concepts.tex b/doc/concepts.tex
index 8151014..64f6320 100644
--- a/doc/concepts.tex
+++ b/doc/concepts.tex
@@ -26,37 +26,6 @@
 \chapter{Concepts} \label{ch:concepts}
 
 %%%--------------------------------------------------------------------------
-\section{Operational model} \label{sec:concepts.model}
-
-The Sod translator runs as a preprocessor, similar in nature to the
-traditional Unix \man{lex}{1} and \man{yacc}{1} tools.  The translator reads
-a \emph{module} file containing class definitions and other information, and
-writes C~source and header files.  The source files contain function
-definitions and static tables which are fed directly to a C~compiler; the
-header files contain declarations for functions and data structures, and are
-included by source files -- whether hand-written or generated by Sod -- which
-makes use of the classes defined in the module.
-
-Sod is not like \Cplusplus: it makes no attempt to `enhance' the C language
-itself.  Sod module files describe classes, messages, methods, slots, and
-other kinds of object-system things, and some of these descriptions need to
-contain C code fragments, but this code is entirely uninterpreted by the Sod
-translator.\footnote{%
-  As long as a code fragment broadly follows C's lexical rules, and properly
-  matches parentheses, brackets, and braces, the Sod translator will copy it
-  into its output unchanged.  It might, in fact, be some other kind of C-like
-  language, such as Objective~C or \Cplusplus.  Or maybe even
-  Objective~\Cplusplus, because if having an object system is good, then
-  having three must be really awesome.} %
-
-The Sod translator is not a closed system.  It is written in Common Lisp, and
-can load extension modules which add new input syntax, output formats, or
-altered behaviour.  The interface for writing such extensions is described in
-\xref{p:lisp}.  Extensions can change almost all details of the Sod object
-system, so the material in this manual must be read with this in mind: this
-manual describes the base system as provided in the distribution.
-
-%%%--------------------------------------------------------------------------
 \section{Modules} \label{sec:concepts.modules}
 
 A \emph{module} is the top-level syntactic unit of input to the Sod
@@ -301,7 +270,10 @@ slots.  If you want to hide implementation details, the best approach is to
 stash them in a dynamically allocated private structure, and leave a pointer
 to it in a slot.  (This will also help preserve binary compatibility, because
 the private structure can grow more members as needed.  See
-\xref{sec:fixme.compatibility} for more details.
+\xref{sec:fixme.compatibility} for more details.)
+
+\subsubsection{Vtables}
+
 
 \subsubsection{Class objects}
 In Sod's object system, classes are objects too.  Therefore classes are
@@ -319,8 +291,8 @@ doesn't define any messages, so it doesn't have any methods.  In Sod, a class
 slot containing a function pointer is not at all the same thing as a method.)
 
 \subsubsection{Conversions}
-Suppose one has a value of type pointer to class type of some class~$C$, and
-wants to convert it to a pointer to class type of some other class~$B$.
+Suppose one has a value of type pointer-to-class-type for some class~$C$, and
+wants to convert it to a pointer-to-class-type for some other class~$B$.
 There are three main cases to distinguish.
 \begin{itemize}
 \item If $B$ is a superclass of~$C$, in the same chain, then the conversion
@@ -336,13 +308,13 @@ There are three main cases to distinguish.
   pointer.  The conversion can be performed using the appropriate generated
   upcast macro (see below); the general case is handled by the macro
   \descref{SOD_XCHAIN}{mac}.
-\item If $B$ is a subclass of~$C$ then the conversion is an \emph{upcast};
+\item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast};
   otherwise the conversion is a~\emph{cross-cast}.  In either case, the
   conversion can fail: the object in question might not be an instance of~$B$
-  at all.  The macro \descref{SOD_CONVERT}{mac} and the function
+  after all.  The macro \descref{SOD_CONVERT}{mac} and the function
   \descref{sod_convert}{fun} perform general conversions.  They return a null
   pointer if the conversion fails.  (There are therefore your analogue to the
-  \Cplusplus @|dynamic_cast<>| operator.)
+  \Cplusplus\ @|dynamic_cast<>| operator.)
 \end{itemize}
 The Sod translator generates macros for performing both in-chain and
 cross-chain upcasts.  For each class~$C$, and each proper superclass~$B$
@@ -616,6 +588,36 @@ There is also a @|custom| aggregating method combination, which is described
 in \xref{sec:fixme.custom-aggregating-method-combination}.
 
 
+\subsection{Sending messages in C} \label{sec:concepts.methods.c}
+
+Each instance is associated with its direct class [FIXME]
+
+The effective methods for each class are determined at translation time, by
+the Sod translator.  For each effective method, one or more \emph{method
+entry functions} are constructed.  A method entry function has three
+responsibilities.
+\begin{itemize}
+\item It converts the receiver pointer to the correct type.  Method entry
+  functions can perform these conversions extremely efficiently: there are
+  separate method entries for each chain of each class which can receive a
+  message, so method entry functions are in the privileged situation of
+  knowing the \emph{exact} class of the receiving object.
+\item If the message accepts a variable-length argument tail, then two method
+  entry functions are created for each chain of each class: one receives a
+  variable-length argument tail, as intended, and captures it in a @|va_list|
+  object; the other accepts an argument of type @|va_list| in place of the
+  variable-length tail and arranges for it to be passed along to the direct
+  methods.
+\item It invokes the effective method with the appropriate arguments.  There
+  might or might not be an actual function corresponding to the effective
+  method itself: the translator may instead open-code the effective method's
+  behaviour into each method entry function; and the machinery for handling
+  `delegation chains', such as is used for @|around| methods and primary
+  methods in the standard method combination, is necessarily scattered among
+  a number of small functions.
+\end{itemize}
+
+
 \subsection{Messages with keyword arguments}
 \label{sec:concepts.methods.keywords}
 
@@ -704,7 +706,7 @@ the platform's strictest alignment requirement applies.
 The following simple function correctly allocates and returns space for an
 instance of a class given a pointer to its class object @<cls>.
 \begin{prog}
-  void *allocate_instance(const SodClass *cls) \\ \ind
+  void *allocate_instance(const SodClass *cls)                  \\ \ind
     \{ return malloc(cls@->cls.initsz); \}
 \end{prog}
 
@@ -728,7 +730,7 @@ of any use unless they've been written specifically for the purpose.
 The following simple function imprints storage at address @<p> as an instance
 of a class, given a pointer to its class object @<cls>.
 \begin{prog}
-  void imprint_instance(const SodClass *cls, void *p) \\ \ind
+  void imprint_instance(const SodClass *cls, void *p)           \\ \ind
     \{ cls@->cls.imprint(p); \}
 \end{prog}
 
@@ -745,11 +747,10 @@ Initialization is performed by sending the imprinted instance an @|init|
 message, defined by the @|SodObject| class.  This message uses a nonstandard
 method combination which works like the standard combination, except that the
 \emph{default behaviour}, if there is no overriding method, is to initialize
-the instance's slots using the initializers defined in the class and its
-superclasses, and to invoke each superclass's initialization fragments.  This
-default behaviour may be invoked multiple times if some method calls on its
-@|next_method| more than once, unless some other method takes steps to
-prevent this.
+the instance's slots, as described below, and to invoke each superclass's
+initialization fragments.  This default behaviour may be invoked multiple
+times if some method calls on its @|next_method| more than once, unless some
+other method takes steps to prevent this.
 
 Slots are initialized in a well-defined order.
 \begin{itemize}
@@ -771,19 +772,53 @@ definition.  It is possible for an initialization fragment to use @|return|
 or @|goto| for special control-flow effects, but this is not likely to be a
 good idea.
 
-Note that an initialization fragment defined in a class is copied literally
-into each subclass's initialization method.  This is fine for simple cases
-but wasteful if the initialization logic is complicated.  More complex
-initialization behaviour should be added either by having an initialization
-fragments call functions (necessarily with external linkage), or by defining
-@|after| methods on the @|init| message.  These will be run after the slot
-initializers have been applied, in reverse precedence order.
-
-Initialization is \emph{parametrized}, so the caller may select from a space
-of possible initial states for the new instance, or to inform the new
-instance about some other objects known to the caller.  Specifically, the
-@|init| message accepts keyword arguments (\xref{sec:concepts.keywords})
-which can be defined and used by methods defined on it.
+The @|init| message accepts keyword arguments
+(\xref{sec:concepts.methods.keywords}).  The set of acceptable keywords is
+determined by the applicable methods as usual, but also by the
+\emph{initargs} defined by the receiving instance's class and its
+superclasses, which are made available to slot initializers and
+initialization fragments.
+
+There are two kinds of initarg definitions.  \emph{User initargs} are defined
+by an explicit @|initarg| item appearing in a class definition: the item
+defines a name, type, and (optionally) a default value for the initarg.
+\emph{Slot initargs} are defined by attaching an @|initarg| property to a
+slot or slot initializer item: the property's determines the initarg's name,
+while the type is taken from the underlying slot type; slot initargs do not
+have default values.  Both kinds define a \emph{direct initarg} for the
+containing class.
+
+Initargs are inherited.  The \emph{applicable} direct initargs for an @|init|
+effective method are those defined by the receiving object's class, and all
+of its superclasses.  Applicable direct initargs with the same name are
+merged to form \emph{effective initargs}.  An error is reported if two
+applicable direct initargs have the same name but different types.  The
+default value of an effective initarg is taken from the most specific
+applicable direct initarg which specifies a defalt value; if no applicable
+direct initarg specifies a default value then the effective initarg has no
+default.
+
+All initarg values are made available at runtime to user code --
+initialization fragments and slot initializer expressions -- through local
+variables and a @|suppliedp| structure, as in a direct method
+(\xref{sec:concepts.methods.keywords}).  Furthermore, slot initarg
+definitions influence the initialization of slots.
+
+The process for deciding how to initialize a particular slot works as
+follows.
+\begin{enumerate}
+\item If there are any slot initargs defined on the slot, or any of its slot
+  initializers, \emph{and} the sender supplied a value for one or more of the
+  corresponding effective initargs, then the value of the most specific slot
+  initarg is stored in the slot.
+\item Otherwise, if there are any slot initializers defined which include an
+  initializer expression, then the initializer expression from the most
+  specific such slot initializer is evaluated and its value stored in the
+  slot.
+\item Otherwise, the slot is left uninitialized.
+\end{enumerate}
+Note that the default values (if any) of effective initargs do \emph{not}
+affect this procedure.
 
 
 \subsection{Destruction}
@@ -820,16 +855,16 @@ program may be in an inconsistent state and should not continue.
 This simple protocol can be used, for example, to implement a reference
 counting system, as follows.
 \begin{prog}
-  [nick = ref] \\
-  class ReferenceCountedObject \{ \\ \ind
-    unsigned nref = 1; \\-
-    void inc() \{ me@->ref.nref++; \} \\-
-    [role = around] \\
-    int obj.teardown() \\
-    \{ \\ \ind
-      if (--\,--me@->ref.nref) return (1); \\
-      else return (CALL_NEXT_METHOD); \- \\
-    \} \- \\
+  [nick = ref]                                                  \\
+  class ReferenceCountedObject \{                               \\ \ind
+    unsigned nref = 1;                                          \\-
+    void inc() \{ me@->ref.nref++; \}                           \\-
+    [role = around]                                             \\
+    int obj.teardown()                                          \\
+    \{                                                          \\ \ind
+      if (--\,--me@->ref.nref) return (1);                      \\
+      else return (CALL_NEXT_METHOD);                         \-\\
+    \}                                                        \-\\
   \}
 \end{prog}
 
@@ -865,6 +900,22 @@ determined using the \descref{SOD_INSTBASE}[macro]{mac}.
 %%%--------------------------------------------------------------------------
 \section{Metaclasses} \label{sec:concepts.metaclasses}
 
+%%%--------------------------------------------------------------------------
+\section{Compatibility considerations} \label{sec:concepts.compatibility}
+
+Sod doesn't make source-level compatibility especially difficult.  As long as
+classes, slots, and messages don't change names or dissappear, and slots and
+messages retain their approximate types, everything will be fine.
+
+Binary compatibility is much more difficult.  Unfortunately, Sod classes have
+rather fragile binary interfaces.\footnote{%
+  Research suggestion: investigate alternative instance and vtable layouts
+  which improve binary compatibility, probably at the expense of instance
+  compactness, and efficiency of slot access and message sending.  There may
+  be interesting trade-offs to be made.} %
+
+If instances are allocated [FIXME]
+
 %%%----- That's all, folks --------------------------------------------------
 
 %%% Local variables: