From: Mark Wooding Date: Mon, 14 Sep 2015 21:34:48 +0000 (+0100) Subject: Merge branch 'master' into doc X-Git-Url: https://git.distorted.org.uk/~mdw/sod/commitdiff_plain/3dca7758421664a838c54b273bd9221f02072045?hp=58a9509014d2d133b91e96c7956957f82bc000a4 Merge branch 'master' into doc * master: (93 commits) Eliminate the separately maintained Lisp system version number. src/{builtin,final,frontent}.lisp: `clear-the-decks' makes builtin module. src/class-{finalize,layout}-impl.lisp: Error checking on layout slots. src/class-finalize-impl.lisp: Remove FIXME which was fixed ages ago. src/: Introduce a macro for defining on-demand slots. Major effort to plug slot-name leaks. doc/list-exports.lisp: Report on generic function methods. doc/list-exports.lisp: Strip duplicate exports. doc/list-exports.lisp: Better pretty formatting for keywords. doc/list-exports.lisp: Mark the start of the class tree dump. src/codegen-proto.lisp, doc/list-exports.lisp: Export `inst' readers. src/: More missing exports. doc/list-exports.lisp: Sort sibling classes by name in the tree. doc/list-exports.lisp: Search for exports inside `eval-when' blocks. doc/list-exports.lisp: Don't get confused and thing `nil' isn't interned. doc/list-exports.lisp: Check for anomalies when preparing reports. doc/list-exports.lisp: Some sketchy code to report on exported symbols. src/: Fix up some wrong exports. src/final.lisp, src/frontend.lisp: Compile methods before dumping. src/frontend.lisp: Prepare the builtin module at load time. ... --- diff --git a/doc/sod-backg.tex b/doc/sod-backg.tex index 0f19c7d..0cb3877 100644 --- a/doc/sod-backg.tex +++ b/doc/sod-backg.tex @@ -46,6 +46,8 @@ This is the `is-subclass-of' relation we've been using so far.\footnote{% to be a partial order because of cycles in the class graph. I haven't given a great deal of thought to how well Sod would cope with a cyclic class graph.} % +We write $d \succeq c$ and say that $d$ is a superclass of $c$ if and only if +$c \preceq d$. The problem comes when we try to resolve inheritance questions. A class should inherit behaviour from its superclasses; but, in a world of multiple diff --git a/doc/sod-protocol.tex b/doc/sod-protocol.tex index f0bd115..6ef829b 100644 --- a/doc/sod-protocol.tex +++ b/doc/sod-protocol.tex @@ -117,11 +117,11 @@ category (function, class, macro, etc.) on the right. The synopsis for a function, generic function or method describes the function's lambda-list using the usual syntax. Note that keyword arguments are shown by naming their keywords; in the description, the value passed - for the keyword argument @|keyword| is shown as @. + for the keyword argument @|:keyword| is shown as @. For a method, specializers are shown using the usual @|defmethod| syntax, e.g., - \begin{quote} + \begin{quote} \sffamily some-generic-function ((@ list) @) \end{quote} \end{describe} @@ -153,7 +153,6 @@ category (function, class, macro, etc.) on the right. for @|let|. \end{describe} - \begin{describe}{cls}{example-class (direct-super other-direct-super) \&key :initarg} The synopsis for a class lists the class's direct superclasses, and the diff --git a/doc/sod-tut.tex b/doc/sod-tut.tex index cff9859..ca686aa 100644 --- a/doc/sod-tut.tex +++ b/doc/sod-tut.tex @@ -100,8 +100,8 @@ complicated. If you use \man{make}{1}, then something like SOD = sod .SUFFIXES: .sod .c .h - .sod.c:; \$(SOD) -gc -o \$@@ \$< - .sod.h:; \$(SOD) -gh -o \$@@ \$< % + .sod.c:; \$(SOD) -tc \$< + .sod.h:; \$(SOD) -th \$< \end{prog} ought to do the job. diff --git a/doc/sod.tex b/doc/sod.tex index 8a46735..ff0dea4 100644 --- a/doc/sod.tex +++ b/doc/sod.tex @@ -3,9 +3,12 @@ \usepackage[T1]{fontenc} \usepackage[utf8]{inputenc} \usepackage[palatino, helvetica, courier, maths=cmr]{mdwfonts} +\usepackage{tikz} \usepackage{syntax} \usepackage{sverb} \usepackage{mdwtab} +\usepackage[mdwmargin]{mdwthm} +\usepackage{amssymb} \usepackage{footnote} \usepackage{at} \usepackage{mdwref} @@ -49,6 +52,12 @@ \def\m@maybe{\ifmmode\else$\let\m@maybe@end$\fi} \def\@scripts{\futurelet\@ch\@scripts@i} +\def\chain#1#2{\mathsf{ch}_{#1}(#2)} +\def\chainhead#1#2{\mathsf{hd}_{#1}(#2)} +\def\chaintail#1#2{\mathsf{tl}_{#1}(#2)} + +\let\implies\Rightarrow + \atdef ;#1\\{\normalfont\itshape;#1\\} \let\@@grammar\grammar \def\grammar{\def\textbar{\hbox{$|$}}\@@grammar} @@ -161,28 +170,29 @@ The following notation is used in this section. \label{sec:syntax} Fortunately, Sod is syntactically quite simple. I've used a little slightly -unusual notation in order to make the presentation easier to read. +unusual notation in order to make the presentation easier to read. For any +nonterminal $x$: \begin{itemize} \item $\epsilon$ denotes the empty nonterminal: \begin{quote} $\epsilon$ ::= \end{quote} -\item @[@@] means an optional @: +\item @[$x$@] means an optional $x$: \begin{quote} - \syntax{@[@] ::= $\epsilon$ @! } + \syntax{@[$x$@] ::= $\epsilon$ @! $x$} \end{quote} -\item @^* means a sequence of zero or more @s: +\item $x^*$ means a sequence of zero or more $x$s: \begin{quote} - \syntax{@^* ::= $\epsilon$ @! @^* } + \syntax{$x^*$ ::= $\epsilon$ @! $x^*$ $x$} \end{quote} -\item @^+ means a sequence of one or more @s: +\item $x^+$ means a sequence of one or more $x$s: \begin{quote} - \syntax{@^+ ::= @^*} + \syntax{$x^+$ ::= $x$ $x^*$} \end{quote} -\item @ means a sequence of one or more @s separated +\item $x$@<-list> means a sequence of one or more $x$s separated by commas: \begin{quote} - \syntax{ ::= @! "," } + \syntax{$x$<-list> ::= $x$ @! $x$<-list> "," $x$} \end{quote} \end{itemize} @@ -194,22 +204,15 @@ collected into tokens according to the following syntax. \begin{grammar} ::= -\alt \alt \alt \alt \alt \end{grammar} -This syntax is slightly ambiguous. The following two rules serve to -disambiguate: -\begin{enumerate} -\item Reserved words take precedence. All @s are - syntactically @s; Sod resolves the ambiguity in favour of - @. -\item `Maximal munch'. In other cases, at each stage we take the longest - sequence of characters which could be a token. -\end{enumerate} +This syntax is slightly ambiguous, and is disambiguated by the \emph{maximal +munch} rule: at each stage we take the longest sequence of characters which +could be a token. \subsubsection{Identifiers} \label{sec:syntax.lex.id} @@ -233,19 +236,11 @@ The precise definition of @ is left to the function \textsf{alpha-char-p} in the hosting Lisp system. For portability, programmers are encouraged to limit themselves to the standard ASCII letters. -\subsubsection{Reserved words} \label{sec:syntax.lex.reserved} - -\begin{grammar} - ::= -"char" | "class" | "code" | "const" | "double" | "enum" | -"extern" | "float" | "import" | "int" | "lisp" | "load" | "long" -| "restrict" | "short" | "signed" | "struct" | "typename" | -"union" | "unsigned" | "void" | "volatile" -\end{grammar} - -Many of these are borrowed from~C; however, some (e.g., @"import" and -@"lisp") are not, and some C reserved words are not reserved (e.g., -@"static"). +There are no reserved words at the lexical level, but the higher-level syntax +recognizes certain identifiers as \emph{keywords} in some contexts. There is +also an ambiguity (inherited from C) in the declaration syntax which is +settled by distinguishing type names from other identifiers at a lexical +level. \subsubsection{String and character literals} \label{sec:syntax.lex.string} @@ -302,12 +297,9 @@ binary. However, length and signedness indicators are not permitted. \subsubsection{Punctuation} \label{sec:syntax.lex.punct} \begin{grammar} - ::= any character other than "\"" or "'" + ::= any nonalphanumeric character other than "_", "\"" or "'" \end{grammar} -Due to the `maximal munch' rule, @ tokens cannot be -alphanumeric. - \subsubsection{Comments} \label{sec:lex-comment} \begin{grammar} @@ -444,9 +436,9 @@ existing compiled files. The right way to package a substantial extension to the Sod translator is to provide the extension as a standard ASDF system (or similar) and leave a dropping @"foo-extension.lisp" in the module path saying something like -\begin{listing} -(asdf:operate 'asdf:load-op :foo-extension) -\end{listing} +\begin{quote} + \textsf{(asdf:load-system :foo-extension)} +\end{quote} which will arrange for the extension to be compiled if necessary. (This approach means that the language doesn't need to depend on any @@ -888,8 +880,9 @@ void *sod_convert(const SodClass *cls, const void *obj) %%%-------------------------------------------------------------------------- \section{Classes} +\label{sec:class} -\subsection{Classes and superclasses} +\subsection{Classes and superclasses} \label{sec:class.defs} A @ must list one or more existing classes to be the \emph{direct superclasses} for the new class being defined. We make the @@ -905,7 +898,7 @@ following definitions. The predefined class @|SodObject| has no direct superclasses; it is unique in this respect. All classes are subclasses of @|SodObject|. -\subsection{The class precedence list} +\subsection{The class precedence list} \label{sec:class.cpl} Let $C$ be a class. The superclasses of $C$ form a directed graph, with an edge from each class to each of its direct superclasses. This is the @@ -934,7 +927,7 @@ We say that \emph{$X$ is more specific than $Y$ as a superclass of $C$} if $X$ is earlier than $Y$ in $C$'s class precedence list. If $C$ is clear from context then we omit it, saying simply that $X$ is more specific than $Y$. -\subsection{Instances and metaclasses} +\subsection{Instances and metaclasses} \label{sec:class.meta} A class defines the structure and behaviour of its \emph{instances}: run-time objects created (possibly) dynamically. An instance is an instance of only @@ -951,7 +944,14 @@ The predefined class @|SodClass| is the default metaclass for new classes. @|SodClass| has @|SodObject| as its only direct superclass. @|SodClass| is its own metaclass. -\subsection{Items and inheritance} +To make matters more complicated, Sod has \emph{two} distinct metalevels: as +well as the runtime metalevel, as discussed above, there's a compile-time +metalevel hosted in the Sod translator. Since Sod is written in Common Lisp, +a Sod class's compile-time metaclass is a CLOS class. The usual compile-time +metaclass is @|sod-class|. The compile-time metalevel is the subject of +\xref{ch:api}. + +\subsection{Items and inheritance} \label{sec:class.inherit} A class definition also declares \emph{slots}, \emph{messages}, \emph{initializers} and \emph{methods} -- collectively referred to as @@ -967,7 +967,7 @@ multiple paths in the superclass graph from a class to one of its superclasses then items defined in that superclass may appear duplicated in the subclass. Sod does not have this notion. -\subsubsection{Slots} +\subsubsection{Slots} \label{sec:class.inherit.slots} A \emph{slot} is a unit of state. In other object systems, slots may be called `fields', `member variables', or `instance variables'. @@ -981,10 +981,10 @@ At run-time, each instance of the class holds a separate value for each slot, whether direct or inherited. Changing the value of an instance's slot doesn't affect other instances. -\subsubsection{Initializers} +\subsubsection{Initializers} \label{sec:class.inherit.init} Mumble. -\subsubsection{Messages} +\subsubsection{Messages} \label{sec:class.inherit.messages} A \emph{message} is the stimulus for behaviour. In Sod, a class must define, statically, the name and format of the messages it is able to receive and the values it will return in reply. In this respect, a message is similar to @@ -999,7 +999,7 @@ each other or with direct messages, even if they have the same name. At run-time, one sends a message to an instance by invoking a function obtained from the instance's \emph{vtable}: \xref{sec:fixme-vtable}. -\subsubsection{Methods} +\subsubsection{Methods} \label{sec:class.inherit.methods} A \emph{method} is a unit of behaviour. In other object systems, methods may be called `member functions'. @@ -1009,7 +1009,193 @@ class or any of its superclasses are \emph{applicable}. The details of how the applicable methods are invoked are described fully in \xref{sec:fixme-method-combination}. -\subsection{Chains and instance layout} +\subsection{Chains and instance layout} \label{sec:class.layout} + +C is a rather low-level language, and in particular it exposes details of the +way data is laid out in memory. Since an instance of a class~$C$ should be +(at least in principle) usable anywhere an instance of some superclass $B +\succeq C$ is expected, this implies that an instance of the subclass $C$ +needs to contain within it a complete instance of each superclass $B$, laid +out according to the rules of instances of $B$, so that if we have (the +address of) an instance of $C$, we can easily construct a pointer to a thing +which looks like an instance of $B$ contained within it. + +Specifically, the information we need to retain for an instance of a +class~$C$ is: +\begin{itemize} +\item the values of each of the slots defined by $C$, including those defined + by superclasses; +\item information which will let us convert a pointer to $C$ into a pointer + to any superclass $B \succeq C$; +\item information which will let us call the appropriate effective method for + each message defined by $C$, including those defined by superclasses; and +\item some additional meta-level information, such as how to find the class + object for $C$ given (the address of) one of its instances. +\end{itemize} + +Observe that, while each distinct instance must clearly have its own storage +for slots, all instances of $C$ can share a single copy of the remaining +information. The individual instance only needs to keep a pointer to this +shared table, which, inspired by the similar structure in many \Cplusplus\ +ABIs, are called a \emph{vtable}. + +The easiest approach would be to decide that instances of $C$ are exactly +like instances of $B$, only with extra space at the end for the extra slots +which $C$ defines over and above those already existing in $B$. Conversion +is then trivial: a pointer to an instance of $C$ can be converted to a +pointer to an instance of some superclass $B$ simply by casting. Even though +the root class @|SodObject| doesn't have any slots at all, its instances will +still need a vtable so that you can find its class object: the address of the +vtable therefore needs to be at the very start of the instance structure. +Again, a vtable for a superclass would have a vtable for each of its +superclasses as a prefix, with new items added afterwards. + +This appealing approach works well for an object system which only permits +single inheritance of both state and behaviour. Alas, it breaks down when +multiple inheritance is allowed: $C$ can be a subclass of both $B$ and $B'$, +even though $B$ is not a subclass of $B'$, nor \emph{vice versa}; so, in +general, $B$'s instance structure will not be a prefix of $B'$'s, nor will +$B'$'s be a prefix of $B$'s, and therefore $C$ cannot have both $B$ and $B'$ +as a prefix. + +A (non-root) class may -- though need not -- have a distinguished \emph{link} +superclass, which need not be a direct superclass. Furthermore, each +class~$C$ must satisfy the \emph{chain condition}: for any superclass $A$ of +$C$, there can be at most one other superclass of $C$ whose link superclass +is $A$.\footnote{% + That is, it's permitted for two classes $B$ and $B'$ to have the same link + superclass $A$, but $B$ and $B'$ can't then both be superclasses of the + same class $C$.} % +Therefore, the links partition the superclasses of~$C$ into nice linear +\emph{chains}, such that each superclass is a member of exactly one chain. +If a class~$B$ has a link superclass~$A$, then $B$'s \emph{level} is one more +than that of $A$; otherwise $B$ is called a \emph{chain head} and its level +is zero. If the classes in a chain are written in a list, chain head first, +then the level of each class gives its index in the list. + +Chains therefore allow us to recover some of the linearity properties which +made layout simple in the case of single inheritance. The instance structure +for a class $C$ contains a substructure for each of $C$'s superclass chains; +a pointer to an object of class $C$ actually points to the substructure for +the chain containing $C$. The order of these substructures is unimportant +for now.\footnote{% + The chains appear in the order in which their most specific classes appear + in $C$'s class precedence list. This guarantees that the chain containing + $C$ itself appears first, so that a pointer to $C$'s instance structure is + actually a pointer to $C$'s chain substructure. Apart from that, it's a + simple, stable, but basically arbitrary choice which can't be changed + without breaking the ABI.} % +The substructure for each chain begins with a pointer to a vtable, followed +by a structure for each superclass in the chain containing the slots defined +by that superclass, with the chain head (least specific class) first. + +Suppose we have a pointer to (static) type $C$, and want to convert it into a +pointer to some superclass $B$ of $C$ -- an \emph{upcast}.\footnote{% + In the more general case, we have a pointer to static type $C$, which + actually points to an object of some subclass $D$ of $C$, and want to + convert it into a pointer to type $B$. Such a conversion is called a + \emph{downcast} if $B$ is a subclass of $C$, or a \emph{cross-cast} + otherwise. Downcasts and cross-casts require complicated run-time + checking, and can will fail unless $B$ is a superclass of $D$.} % +If $B$ is in the same chain as $C$ -- an \emph{in-chain upcast} -- then the +pointer value is already correct and it's only necessary to cast it +appropriately. Otherwise -- a \emph{cross-chain upcast} -- the pointer needs +to be adjusted to point to a different chain substructure. Since the lengths +and relative positions of the chain substructures vary between classes, the +adjustments are stored in the vtable. Cross-chain upcasts are therefore a +bit slower than in-chain upcasts. + +Each chain has its own separate vtable, because much of the metadata stored +in the vtable is specific to a particular chain. For example: +\begin{itemize} +\item offsets to other chains' substructures will vary depending on which + chain we start from; and +\item entry points to methods { + +%%%-------------------------------------------------------------------------- +\chapter{The Lisp programming interface} \label{ch:api} + +%% output for `h' files +%% +%% prologue +%% guard start +%% typedefs start +%% typedefs +%% typedefs end +%% includes start +%% includes +%% includes end +%% classes start +%% CLASS banner +%% CLASS islots start +%% CLASS islots slots +%% CLASS islots end +%% CLASS vtmsgs start +%% CLASS vtmsgs CLASS start +%% CLASS vtmsgs CLASS slots +%% CLASS vtmsgs CLASS end +%% CLASS vtmsgs end +%% CLASS vtables start +%% CLASS vtables CHAIN-HEAD start +%% CLASS vtables CHAIN-HEAD slots +%% CLASS vtables CHAIN-HEAD end +%% CLASS vtables end +%% CLASS vtable-externs +%% CLASS vtable-externs-after +%% CLASS methods start +%% CLASS methods +%% CLASS methods end +%% CLASS ichains start +%% CLASS ichains CHAIN-HEAD start +%% CLASS ichains CHAIN-HEAD slots +%% CLASS ichains CHAIN-HEAD end +%% CLASS ichains end +%% CLASS ilayout start +%% CLASS ilayout slots +%% CLASS ilayout end +%% CLASS conversions +%% CLASS object +%% classes end +%% guard end +%% epilogue + +%% output for `c' files +%% +%% prologue +%% includes start +%% includes +%% includes end +%% classes start +%% CLASS banner +%% CLASS direct-methods start +%% CLASS direct-methods METHOD start +%% CLASS direct-methods METHOD body +%% CLASS direct-methods METHOD end +%% CLASS direct-methods end +%% CLASS effective-methods +%% CLASS vtables start +%% CLASS vtables CHAIN-HEAD start +%% CLASS vtables CHAIN-HEAD class-pointer METACLASS +%% CLASS vtables CHAIN-HEAD base-offset +%% CLASS vtables CHAIN-HEAD chain-offset TARGET-HEAD +%% CLASS vtables CHAIN-HEAD vtmsgs CLASS start +%% CLASS vtables CHAIN-HEAD vtmsgs CLASS slots +%% CLASS vtables CHAIN-HEAD vtmsgs CLASS end +%% CLASS vtables CHAIN-HEAD end +%% CLASS vtables end +%% CLASS object prepare +%% CLASS object start +%% CLASS object CHAIN-HEAD ichain start +%% CLASS object SUPER slots start +%% CLASS object SUPER slots +%% CLASS object SUPER vtable +%% CLASS object SUPER slots end +%% CLASS object CHAIN-HEAD ichain end +%% CLASS object end +%% classes end +%% epilogue + +%%%-------------------------------------------------------------------------- \include{sod-backg} \include{sod-protocol}