X-Git-Url: https://git.distorted.org.uk/~mdw/sod/blobdiff_plain/020b9e2b4364c61a668d47c4c5405295349817d8..3a774b55edfea441c1715994924c2999e9202143:/doc/concepts.tex

diff --git a/doc/concepts.tex b/doc/concepts.tex
index b4e80ca..9cc5be7 100644
--- a/doc/concepts.tex
+++ b/doc/concepts.tex
@@ -26,37 +26,6 @@
 \chapter{Concepts} \label{ch:concepts}
 
 %%%--------------------------------------------------------------------------
-\section{Operational model} \label{sec:concepts.model}
-
-The Sod translator runs as a preprocessor, similar in nature to the
-traditional Unix \man{lex}{1} and \man{yacc}{1} tools.  The translator reads
-a \emph{module} file containing class definitions and other information, and
-writes C~source and header files.  The source files contain function
-definitions and static tables which are fed directly to a C~compiler; the
-header files contain declarations for functions and data structures, and are
-included by source files -- whether hand-written or generated by Sod -- which
-makes use of the classes defined in the module.
-
-Sod is not like \Cplusplus: it makes no attempt to `enhance' the C language
-itself.  Sod module files describe classes, messages, methods, slots, and
-other kinds of object-system things, and some of these descriptions need to
-contain C code fragments, but this code is entirely uninterpreted by the Sod
-translator.\footnote{%
-  As long as a code fragment broadly follows C's lexical rules, and properly
-  matches parentheses, brackets, and braces, the Sod translator will copy it
-  into its output unchanged.  It might, in fact, be some other kind of C-like
-  language, such as Objective~C or \Cplusplus.  Or maybe even
-  Objective~\Cplusplus, because if having an object system is good, then
-  having three must be really awesome.} %
-
-The Sod translator is not a closed system.  It is written in Common Lisp, and
-can load extension modules which add new input syntax, output formats, or
-altered behaviour.  The interface for writing such extensions is described in
-\xref{p:lisp}.  Extensions can change almost all details of the Sod object
-system, so the material in this manual must be read with this in mind: this
-manual describes the base system as provided in the distribution.
-
-%%%--------------------------------------------------------------------------
 \section{Modules} \label{sec:concepts.modules}
 
 A \emph{module} is the top-level syntactic unit of input to the Sod
@@ -245,13 +214,13 @@ qualified by the defining class's nickname.
 \subsubsection{Slot initializers}
 As well as defining slot names and types, a class can also associate an
 \emph{initial value} with each slot defined by itself or one of its
-subclasses.  A class $C$ provides an \emph{initialization function} (see
+subclasses.  A class $C$ provides an \emph{initialization message} (see
 \xref{sec:concepts.lifecycle.birth}, and \xref{sec:structures.root.sodclass})
-which sets the slots of a \emph{direct} instance of the class to the correct
-initial values.  If several of $C$'s superclasses define initializers for the
-same slot then the initializer from the most specific such class is used.  If
-none of $C$'s superclasses define an initializer for some slot then that slot
-will be left uninitialized.
+whose methods set the slots of a \emph{direct} instance of the class to the
+correct initial values.  If several of $C$'s superclasses define initializers
+for the same slot then the initializer from the most specific such class is
+used.  If none of $C$'s superclasses define an initializer for some slot then
+that slot will be left uninitialized.
 
 The initializer for a slot with scalar type may be any C expression.  The
 initializer for a slot with aggregate type must contain only constant
@@ -301,7 +270,10 @@ slots.  If you want to hide implementation details, the best approach is to
 stash them in a dynamically allocated private structure, and leave a pointer
 to it in a slot.  (This will also help preserve binary compatibility, because
 the private structure can grow more members as needed.  See
-\xref{sec:fixme.compatibility} for more details.
+\xref{sec:fixme.compatibility} for more details.)
+
+\subsubsection{Vtables}
+
 
 \subsubsection{Class objects}
 In Sod's object system, classes are objects too.  Therefore classes are
@@ -319,8 +291,8 @@ doesn't define any messages, so it doesn't have any methods.  In Sod, a class
 slot containing a function pointer is not at all the same thing as a method.)
 
 \subsubsection{Conversions}
-Suppose one has a value of type pointer to class type of some class~$C$, and
-wants to convert it to a pointer to class type of some other class~$B$.
+Suppose one has a value of type pointer-to-class-type for some class~$C$, and
+wants to convert it to a pointer-to-class-type for some other class~$B$.
 There are three main cases to distinguish.
 \begin{itemize}
 \item If $B$ is a superclass of~$C$, in the same chain, then the conversion
@@ -336,13 +308,13 @@ There are three main cases to distinguish.
   pointer.  The conversion can be performed using the appropriate generated
   upcast macro (see below); the general case is handled by the macro
   \descref{SOD_XCHAIN}{mac}.
-\item If $B$ is a subclass of~$C$ then the conversion is an \emph{upcast};
+\item If $B$ is a subclass of~$C$ then the conversion is a \emph{downcast};
   otherwise the conversion is a~\emph{cross-cast}.  In either case, the
   conversion can fail: the object in question might not be an instance of~$B$
-  at all.  The macro \descref{SOD_CONVERT}{mac} and the function
+  after all.  The macro \descref{SOD_CONVERT}{mac} and the function
   \descref{sod_convert}{fun} perform general conversions.  They return a null
   pointer if the conversion fails.  (There are therefore your analogue to the
-  \Cplusplus @|dynamic_cast<>| operator.)
+  \Cplusplus\ @|dynamic_cast<>| operator.)
 \end{itemize}
 The Sod translator generates macros for performing both in-chain and
 cross-chain upcasts.  For each class~$C$, and each proper superclass~$B$
@@ -489,8 +461,96 @@ If there are no applicable primary methods then no effective method is
 constructed: the vtables contain null pointers in place of pointers to method
 entry functions.
 
+\begin{figure}
+  \begin{tikzpicture}
+    [>=stealth, thick,
+     order/.append style={color=green!70!black},
+     code/.append style={font=\sffamily},
+     action/.append style={font=\itshape},
+     method/.append style={rectangle, draw=black, thin, fill=blue!30,
+                           text height=\ht\strutbox, text depth=\dp\strutbox,
+                           minimum width=40mm}]
+
+    \def\delgstack#1#2#3{
+      \node (#10) [method, #2] {#3};
+      \node (#11) [method, above=6mm of #10] {#3};
+      \draw [->] ($(#10.north)!.5!(#10.north west) + (0mm, 1mm)$) --
+                 ++(0mm, 4mm)
+        node [code, left=4pt, midway] {next_method};
+      \draw [<-] ($(#10.north)!.5!(#10.north east) + (0mm, 1mm)$) --
+                 ++(0mm, 4mm)
+        node [action, right=4pt, midway] {return};
+      \draw [->] ($(#11.north)!.5!(#11.north west) + (0mm, 1mm)$) --
+                 ++(0mm, 4mm)
+        node [code, left=4pt, midway] {next_method}
+        node (ld) [above] {$\smash\vdots\mathstrut$};
+      \draw [<-] ($(#11.north)!.5!(#11.north east) + (0mm, 1mm)$) --
+                 ++(0mm, 4mm)
+        node [action, right=4pt, midway] {return}
+        node (rd) [above] {$\smash\vdots\mathstrut$};
+      \draw [->] ($(ld.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+        node [code, left=4pt, midway] {next_method};
+      \draw [<-] ($(rd.north) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+        node [action, right=4pt, midway] {return};
+      \node (p) at ($(ld.north)!.5!(rd.north)$) {};
+      \node (#1n) [method, above=5mm of p] {#3};
+      \draw [->, order] ($(#10.south east) + (4mm, 1mm)$) --
+                          ($(#1n.north east) + (4mm, -1mm)$)
+        node [midway, right, align=left]
+        {Most to \\ least \\ specific};}
+
+    \delgstack{a}{}{Around method}
+    \draw [<-] ($(a0.south)!.5!(a0.south west) - (0mm, 1mm)$) --
+               ++(0mm, -4mm);
+    \draw [->] ($(a0.south)!.5!(a0.south east) - (0mm, 1mm)$) --
+               ++(0mm, -4mm)
+      node [action, right=4pt, midway] {return};
+
+    \draw [->] ($(an.north)!.6!(an.north west) + (0mm, 1mm)$) --
+               ++(-8mm, 8mm)
+      node [code, midway, left=3mm] {next_method}
+      node (b0) [method, above left = 1mm + 4mm and -6mm - 4mm] {};
+    \node (b1) [method] at ($(b0) - (2mm, 2mm)$) {};
+    \node (bn) [method] at ($(b1) - (2mm, 2mm)$) {Before method};
+    \draw [->, order] ($(bn.west) - (6mm, 0mm)$) -- ++(12mm, 12mm)
+      node [midway, above left, align=center] {Most to \\ least \\ specific};
+    \draw [->] ($(b0.north east) + (-10mm, 1mm)$) -- ++(8mm, 8mm)
+      node (p) {};
+
+    \delgstack{m}{above right=1mm and 0mm of an.west |- p}{Primary method}
+    \draw [->] ($(mn.north)!.5!(mn.north west) + (0mm, 1mm)$) -- ++(0mm, 4mm)
+      node [code, left=4pt, midway] {next_method}
+      node [above right = 0mm and -8mm]
+      {$\vcenter{\hbox{\Huge\textcolor{red}{!}}}
+        \vcenter{\hbox{\begin{tabular}[c]{l}
+                         \textsf{next_method} \\
+                         pointer is null
+                       \end{tabular}}}$};
+
+    \draw [->, color=blue, dotted]
+        ($(m0.south)!.2!(m0.south east) - (0mm, 1mm)$) --
+        ($(an.north)!.2!(an.north east) + (0mm, 1mm)$)
+      node [midway, sloped, below] {Return value};
+
+    \draw [<-] ($(an.north)!.6!(an.north east) + (0mm, 1mm)$) --
+               ++(8mm, 8mm)
+      node [action, midway, right=3mm] {return}
+      node (f0) [method, above right = 1mm and -6mm] {};
+    \node (f1) [method] at ($(f0) + (-2mm, 2mm)$) {};
+    \node (fn) [method] at ($(f1) + (-2mm, 2mm)$) {After method};
+    \draw [<-, order] ($(f0.east) + (6mm, 0mm)$) -- ++(-12mm, 12mm)
+      node [midway, above right, align=center]
+      {Least to \\ most \\ specific};
+    \draw [<-] ($(fn.north west) + (6mm, 1mm)$) -- ++(-8mm, 8mm);
+
+  \end{tikzpicture}
+
+  \caption{The standard method combination}
+  \label{fig:concepts.methods.stdmeth}
+\end{figure}
+
 The effective method for a message with standard method combination works as
-follows.
+follows (see also~\xref{fig:concepts.methods.stdmeth}).
 \begin{enumerate}
 
 \item If any applicable methods have the @|around| role, then the most
@@ -616,6 +676,36 @@ There is also a @|custom| aggregating method combination, which is described
 in \xref{sec:fixme.custom-aggregating-method-combination}.
 
 
+\subsection{Sending messages in C} \label{sec:concepts.methods.c}
+
+Each instance is associated with its direct class [FIXME]
+
+The effective methods for each class are determined at translation time, by
+the Sod translator.  For each effective method, one or more \emph{method
+entry functions} are constructed.  A method entry function has three
+responsibilities.
+\begin{itemize}
+\item It converts the receiver pointer to the correct type.  Method entry
+  functions can perform these conversions extremely efficiently: there are
+  separate method entries for each chain of each class which can receive a
+  message, so method entry functions are in the privileged situation of
+  knowing the \emph{exact} class of the receiving object.
+\item If the message accepts a variable-length argument tail, then two method
+  entry functions are created for each chain of each class: one receives a
+  variable-length argument tail, as intended, and captures it in a @|va_list|
+  object; the other accepts an argument of type @|va_list| in place of the
+  variable-length tail and arranges for it to be passed along to the direct
+  methods.
+\item It invokes the effective method with the appropriate arguments.  There
+  might or might not be an actual function corresponding to the effective
+  method itself: the translator may instead open-code the effective method's
+  behaviour into each method entry function; and the machinery for handling
+  `delegation chains', such as is used for @|around| methods and primary
+  methods in the standard method combination, is necessarily scattered among
+  a number of small functions.
+\end{itemize}
+
+
 \subsection{Messages with keyword arguments}
 \label{sec:concepts.methods.keywords}
 
@@ -739,7 +829,11 @@ environment in which it operates.
 
 Details of initialization are necessarily class-specific, but typically it
 involves setting the instance's slots to appropriate values, and possibly
-linking it into some larger data structure to keep track of it.
+linking it into some larger data structure to keep track of it.  It is
+possible for initialization methods to attempt to allocate resources, but
+this must be done carefully: there is currently no way to report an error
+from object initialization, so the object must be marked as incompletely
+initialized, and left in a state where it will be safe to tear down later.
 
 Initialization is performed by sending the imprinted instance an @|init|
 message, defined by the @|SodObject| class.  This message uses a nonstandard
@@ -763,7 +857,7 @@ be executed to set up a new instance.  Each superclass's initialization
 fragments are executed with @|me| bound to an instance pointer of the
 appropriate superclass type, immediately after that superclass's slots (if
 any) have been initialized; therefore, fragments defined by a more specific
-superclass are executed after fragments defined by a more specific
+superclass are executed after fragments defined by a less specific
 superclass.  A class may define more than one initialization fragment: the
 fragments are executed in the order in which they appear in the class
 definition.  It is possible for an initialization fragment to use @|return|
@@ -838,9 +932,9 @@ allocated from the standard @|malloc| heap is done using the
 \descref{sod_destroy}[function]{fun}.
 
 \subsubsection{Teardown}
-Details of initialization are necessarily class-specific, but typically it
-involves setting the instance's slots to appropriate values, and possibly
-linking it into some larger data structure to keep track of it.
+Details of teardown are necessarily class-specific, but typically it
+involves releasing resources held by the instance, and disentangling it from
+any data structures it might be linked into.
 
 Teardown is performed by sending the instance the @|teardown| message,
 defined by the @|SodObject| class.  The message returns an integer, used as a
@@ -854,7 +948,7 @@ This simple protocol can be used, for example, to implement a reference
 counting system, as follows.
 \begin{prog}
   [nick = ref]                                                  \\
-  class ReferenceCountedObject \{                               \\ \ind
+  class ReferenceCountedObject: SodObject \{                    \\ \ind
     unsigned nref = 1;                                          \\-
     void inc() \{ me@->ref.nref++; \}                           \\-
     [role = around]                                             \\
@@ -866,18 +960,18 @@ counting system, as follows.
   \}
 \end{prog}
 
-This message uses a nonstandard method combination which works like the
-standard combination, except that the \emph{default behaviour}, if there is
-no overriding method, is to execute the superclass's teardown fragments, and
-to return zero.  This default behaviour may be invoked multiple times if some
-method calls on its @|next_method| more than once, unless some other method
-takes steps to prevent this.
+The @|teardown| message uses a nonstandard method combination which works
+like the standard combination, except that the \emph{default behaviour}, if
+there is no overriding method, is to execute the superclass's teardown
+fragments, and to return zero.  This default behaviour may be invoked
+multiple times if some method calls on its @|next_method| more than once,
+unless some other method takes steps to prevent this.
 
 A class can define \emph{teardown fragments}: pieces of literal code to be
 executed to shut down an instance.  Each superclass's teardown fragments are
 executed with @|me| bound to an instance pointer of the appropriate
 superclass type; fragments defined by a more specific superclass are executed
-before fragments defined by a more specific superclass.  A class may define
+before fragments defined by a less specific superclass.  A class may define
 more than one teardown fragment: the fragments are executed in the order in
 which they appear in the class definition.  It is possible for an
 initialization fragment to use @|return| or @|goto| for special control-flow
@@ -898,6 +992,22 @@ determined using the \descref{SOD_INSTBASE}[macro]{mac}.
 %%%--------------------------------------------------------------------------
 \section{Metaclasses} \label{sec:concepts.metaclasses}
 
+%%%--------------------------------------------------------------------------
+\section{Compatibility considerations} \label{sec:concepts.compatibility}
+
+Sod doesn't make source-level compatibility especially difficult.  As long as
+classes, slots, and messages don't change names or dissappear, and slots and
+messages retain their approximate types, everything will be fine.
+
+Binary compatibility is much more difficult.  Unfortunately, Sod classes have
+rather fragile binary interfaces.\footnote{%
+  Research suggestion: investigate alternative instance and vtable layouts
+  which improve binary compatibility, probably at the expense of instance
+  compactness, and efficiency of slot access and message sending.  There may
+  be interesting trade-offs to be made.} %
+
+If instances are allocated [FIXME]
+
 %%%----- That's all, folks --------------------------------------------------
 
 %%% Local variables: