1\documentclass[noarticle]{strayman}
2
3\usepackage[T1]{fontenc}
4\usepackage[utf8]{inputenc}
5\usepackage[palatino, helvetica, courier, maths=cmr]{mdwfonts}
6\usepackage{syntax}
7\usepackage{sverb}
8\usepackage{at}
9\usepackage{mdwref}
10
11\title{A Sensible Object Design for C}
12\author{Mark Wooding}
13
14\def\syntleft{\normalfont\itshape}
15\let\syntright\empty
16
17\def\ulitleft{\normalfont\sffamily}
18\let\ulitright\empty
19
20\let\listingsize\relax
21
22\let\epsilon\varepsilon
23
24\atdef <#1>{\synt{#1}}
25\atdef "#1"{\lit*{#1}}
26\atdef `#1'{\lit{#1}}
27\atdef |#1|{\textsf{#1}}
28
29\def\Cplusplus{C\kern-1pt++}
30\def\Csharp{C\#}
31\def\man#1#2{\textbf{#1}(#2)}
32
33\begingroup\lccode`\~=`\
34\lowercase{
35\endgroup
36\def\prog{%
37 \sffamily%
38 \quote%
39 \let\oldnl\\%
40 \obeylines%
41 \tabbing%
42 \global\let~\\%
43 \global\let\\\textbackslash%
44}
45\def\endprog{%
46 \endtabbing%
47 \global\let\\\oldnl%
48 \endquote%
49}}
50
51\begin{document}
52
53\maketitle
54
55\include{sod-tut}
56
57%%%--------------------------------------------------------------------------
58\chapter{Internals}
59
60\section{Generated names}
61
The generated names for functions and objects related to a class are
constructed systematically. The rules on class, slot and message naming
exist to ensure that these generated names don't collide with each other.
66
67The following notation is used in this section.
68\begin{description}
\item[@<class>] The full name of the `focus' class: the one for which we are
 generating names.
71\item[@<super-nick>] The nickname of a superclass.
72\item[@<head-nick>] The nickname of the chain-head class of the chain
73 in question.
74\end{description}
75
76\subsection{Instance layout}
77
78%%%--------------------------------------------------------------------------
79\section{Syntax}
80\label{sec:syntax}
81
Fortunately, Sod is syntactically quite simple. I've used some slightly
unusual notation in order to make the presentation easier to read.
84\begin{itemize}
85\item $\epsilon$ denotes the empty nonterminal:
86 \begin{quote}
87 $\epsilon$ ::=
88 \end{quote}
89\item $[$@<item>$]$ means an optional @<item>:
90 \begin{quote}
91 \syntax{$[$<item>$]$ ::= $\epsilon$ | <item>}
92 \end{quote}
93\item @<item>$^*$ means a sequence of zero or more @<item>s:
94 \begin{quote}
95 \syntax{<item>$^*$ ::= $\epsilon$ | <item>$^*$ <item>}
96 \end{quote}
97\item @<item>$^+$ means a sequence of one or more @<item>s:
98 \begin{quote}
99 \syntax{<item>$^+$ ::= <item> <item>$^*$}
100 \end{quote}
101\item @<item-list> means a sequence of one or more @<item>s separated
102 by commas:
103 \begin{quote}
104 \syntax{<item-list> ::= <item> | <item-list> "," <item>}
105 \end{quote}
106\end{itemize}
107
108\subsection{Lexical syntax}
109\label{sec:syntax.lex}
110
111Whitespace and comments are discarded. The remaining characters are
112collected into tokens according to the following syntax.
113
114\begin{grammar}
115<token> ::= <identifier>
116\alt <reserved-word>
117\alt <string-literal>
118\alt <char-literal>
119\alt <integer-literal>
120\alt <punctuation>
121\end{grammar}
122
123This syntax is slightly ambiguous. The following two rules serve to
124disambiguate:
125\begin{enumerate}
126\item Reserved words take precedence. All @<reserved-word>s are
127 syntactically @<identifier>s; Sod resolves the ambiguity in favour of
128 @<reserved-word>.
129\item `Maximal munch'. In other cases, at each stage we take the longest
130 sequence of characters which could be a token.
131\end{enumerate}
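
As an illustration of these two rules, here is how a few character sequences
would be split into tokens, assuming only the grammar given in this section.
\begin{listing}
class       one <reserved-word>, though it also matches <identifier>
classy      one <identifier>: `class' then `y' is not the longest match
class_2x    one <identifier>
x+1         three tokens: <identifier>, <punctuation>, <integer-literal>
\end{listing}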
132
133\subsubsection{Identifiers} \label{sec:syntax.lex.id}
134
135\begin{grammar}
136<identifier> ::= <id-start-char> <id-body-char>$^*$
137
138<id-start-char> ::= <alpha-char> $|$ "_"
139
140<id-body-char> ::= <id-start-char> $|$ <digit-char>
141
142<alpha-char> ::= "A" $|$ "B" $|$ \dots\ $|$ "Z"
143\alt "a" $|$ "b" $|$ \dots\ $|$ "z"
144\alt <extended-alpha-char>
145
146<digit-char> ::= "0" $|$ <nonzero-digit-char>
147
148<nonzero-digit-char> ::= "1" $|$ "2" $| \cdots |$ "9"
149\end{grammar}
150
151The precise definition of @<alpha-char> is left to the function
152\textsf{alpha-char-p} in the hosting Lisp system. For portability,
153programmers are encouraged to limit themselves to the standard ASCII letters.
154
155\subsubsection{Reserved words} \label{sec:syntax.lex.reserved}
156
157\begin{grammar}
158<reserved-word> ::=
159"char" $|$ "class" $|$ "code" $|$ "const" $|$ "double" $|$ "enum" $|$
160"extern" $|$ "float" $|$ "import" $|$ "int" $|$ "lisp" $|$ "load" $|$ "long"
161$|$ "restrict" $|$ "short" $|$ "signed" $|$ "struct" $|$ "typename" $|$
162"union" $|$ "unsigned" $|$ "void" $|$ "volatile"
163\end{grammar}
164
165Many of these are borrowed from~C; however, some (e.g., @"import" and
166@"lisp") are not, and some C reserved words are not reserved (e.g.,
167@"static").
168
169\subsubsection{String and character literals} \label{sec:syntax.lex.string}
170
171\begin{grammar}
172<string-literal> ::= "\"" <string-literal-char>$^*$ "\""
173
174<char-literal> ::= "'" <char-literal-char> "'"
175
176<string-literal-char> ::= any character other than "\\" or "\""
177\alt "\\" <char>
178
179<char-literal-char> ::= any character other than "\\" or "'"
180\alt "\\" <char>
181
182<char> ::= any single character
183\end{grammar}
184
185The syntax for string and character literals differs from~C. In particular,
186escape sequences such as @`\textbackslash n' are not recognized. The use
187of string and character literals in Sod, outside of C~fragments, is limited,
188and the simple syntax seems adequate. For the sake of future compatibility,
189the use of character sequences which resemble C escape sequences is
190discouraged.
191
192\subsubsection{Integer literals} \label{sec:syntax.lex.int}
193
194\begin{grammar}
195<integer-literal> ::= <decimal-integer>
196\alt <binary-integer>
197\alt <octal-integer>
198\alt <hex-integer>
199
200<decimal-integer> ::= <nonzero-digit-char> <digit-char>$^*$
201
202<binary-integer> ::= "0" $($"b"$|$"B"$)$ <binary-digit-char>$^+$
203
204<binary-digit-char> ::= "0" $|$ "1"
205
206<octal-integer> ::= "0" $[$"o"$|$"O"$]$ <octal-digit-char>$^+$
207
208<octal-digit-char> ::= "0" $|$ "1" $| \cdots |$ "7"
209
210<hex-integer> ::= "0" $($"x"$|$"X"$)$ <hex-digit-char>$^+$
211
212<hex-digit-char> ::= <digit-char>
213\alt "A" $|$ "B" $|$ "C" $|$ "D" $|$ "E" $|$ "F"
214\alt "a" $|$ "b" $|$ "c" $|$ "d" $|$ "e" $|$ "f"
215\end{grammar}
216
217Sod understands only integers, not floating-point numbers; its integer syntax
218goes slightly beyond C in allowing a @`0o' prefix for octal and @`0b' for
219binary. However, length and signedness indicators are not permitted.
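
For example, going by the grammar above, each of the following literals
denotes the value forty-two.
\begin{listing}
42          decimal
0b101010    binary
0o52        octal, with the explicit prefix
052         octal, in the C style
0x2a        hexadecimal
\end{listing}
By contrast, C spellings such as @`42u' or @`42L' are not integer literals in
Sod, since length and signedness indicators are not permitted.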
220
221\subsubsection{Punctuation} \label{sec:syntax.lex.punct}
222
223\begin{grammar}
224<punctuation> ::= any character other than "\"" or "'"
225\end{grammar}
226
227Due to the `maximal munch' rule, @<punctuation> tokens cannot be
228alphanumeric.
229
230\subsubsection{Comments} \label{sec:lex-comment}
231
232\begin{grammar}
233<comment> ::= <block-comment>
234\alt <line-comment>
235
236<block-comment> ::=
237 "/*"
238 <not-star>$^*$ $($<star>$^+$ <not-star-or-slash> <not-star>$^*)^*$
239 <star>$^*$
240 "*/"
241
242<star> ::= "*"
243
244<not-star> ::= any character other than "*"
245
246<not-star-or-slash> ::= any character other than "*" or "/"
247
248<line-comment> ::= "//" <not-newline>$^*$ <newline>
249
250<newline> ::= a newline character
251
252<not-newline> ::= any character other than newline
253\end{grammar}
254
255Comments are exactly as in C99: both traditional block comments `\texttt{/*}
256\dots\ \texttt{*/}' and \Cplusplus-style `\texttt{//} \dots' comments are
257permitted and ignored.
258
259\subsection{Special nonterminals}
260\label{sec:special-nonterminals}
261
Aside from the lexical syntax presented above (\xref{sec:syntax.lex}),
two special nonterminals occur in the module syntax.
264
265\subsubsection{S-expressions} \label{sec:syntax-sexp}
266
267\begin{grammar}
268<s-expression> ::= an S-expression, as parsed by the Lisp reader
269\end{grammar}
270
271When an S-expression is expected, the Sod parser simply calls the host Lisp
272system's \textsf{read} function. Sod modules are permitted to modify the
273read table to extend the S-expression syntax.
274
275S-expressions are self-delimiting, so no end-marker is needed.
276
277\subsubsection{C fragments} \label{sec:syntax.lex.cfrag}
278
279\begin{grammar}
280<c-fragment> ::= a sequence of C tokens, with matching brackets
281\end{grammar}
282
283Sequences of C code are simply stored and written to the output unchanged
284during translation. They are read using a simple scanner which nonetheless
285understands C comments and string and character literals.
286
287A C fragment is terminated by one of a small number of delimiter characters
288determined by the immediately surrounding context -- usually a closing brace
289or bracket. The first such delimiter character which is not enclosed in
brackets, braces or parentheses ends the fragment.
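
As a sketch of how this works, consider the following definition, in which
the item name @"example" and the C code are made up for illustration. The
fragment here is delimited by the braces of a @"code" definition.
\begin{listing}
code c : example {
  /* This } is inside a comment, and so does not end the fragment. */
  static const char brace[] = "}";
  static int twice(int x) { return (2 * x); }
}
\end{listing}
Only the final closing brace ends the fragment: the earlier ones are inside a
comment, inside a string literal, or matched by an opening brace within the
fragment itself.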
291
292\subsection{Module syntax} \label{sec:syntax-module}
293
294\begin{grammar}
295<module> ::= <definition>$^*$
296
297<definition> ::= <import-definition>
298\alt <load-definition>
299\alt <lisp-definition>
300\alt <code-definition>
301\alt <typename-definition>
302\alt <class-definition>
303\end{grammar}
304
305A module is the top-level syntactic item. A module consists of a sequence of
306definitions.
307
308\subsection{Simple definitions} \label{sec:syntax.defs}
309
310\subsubsection{Importing modules} \label{sec:syntax.defs.import}
311
312\begin{grammar}
313<import-definition> ::= "import" <string> ";"
314\end{grammar}
315
316The module named @<string> is processed and its definitions made available.
317
318A search is made for a module source file as follows.
319\begin{itemize}
320\item The module name @<string> is converted into a filename by appending
321 @`.sod', if it has no extension already.\footnote{%
322 Technically, what happens is \textsf{(merge-pathnames name (make-pathname
323 :type "SOD" :case :common))}, so exactly what this means varies
324 according to the host system.} %
325\item The file is looked for relative to the directory containing the
326 importing module.
327\item If that fails, then the file is looked for in each directory on the
328 module search path in turn.
329\item If the file still isn't found, an error is reported and the import
330 fails.
331\end{itemize}
332At this point, if the file has previously been imported, nothing further
333happens.\footnote{%
334 This check is done using \textsf{truename}, so it should see through simple
335 tricks like symbolic links. However, it may be confused by fancy things
336 like bind mounts and so on.} %
337
338Recursive imports, either direct or indirect, are an error.
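
For instance, supposing a module called @"shapes" (a made-up name), the
definition
\begin{listing}
import "shapes";
\end{listing}
would look for a file named something like @`shapes.sod', first relative to
the directory containing the importing module, and then in each directory on
the module search path. A second import of the same file would do nothing
further.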
339
340\subsubsection{Loading extensions} \label{sec:syntax.defs.load}
341
342\begin{grammar}
343<load-definition> ::= "load" <string> ";"
344\end{grammar}
345
346The Lisp file named @<string> is loaded and evaluated.
347
348A search is made for a Lisp source file as follows.
349\begin{itemize}
350\item The name @<string> is converted into a filename by appending @`.lisp',
351 if it has no extension already.\footnote{%
352 Technically, what happens is \textsf{(merge-pathnames name (make-pathname
353 :type "LISP" :case :common))}, so exactly what this means varies
354 according to the host system.} %
\item A search is then made in the same manner as for module imports
 (\xref{sec:syntax.defs.import}).
357\end{itemize}
358If the file is found, it is loaded using the host Lisp's \textsf{load}
359function.
360
361Note that Sod doesn't attempt to compile Lisp files, or even to look for
362existing compiled files. The right way to package a substantial extension to
363the Sod translator is to provide the extension as a standard ASDF system (or
364similar) and leave a dropping @"foo-extension.lisp" in the module path saying
365something like
366\begin{listing}
367(asdf:operate 'asdf:load-op :foo-extension)
368\end{listing}
369which will arrange for the extension to be compiled if necessary.
370
371(This approach means that the language doesn't need to depend on any
372particular system definition facility. It's bad enough already that it
373depends on Common Lisp.)
374
375\subsubsection{Lisp escapes} \label{sec:syntax.defs.lisp}
376
377\begin{grammar}
378<lisp-definition> ::= "lisp" <s-expression> ";"
379\end{grammar}
380
381The @<s-expression> is evaluated immediately. It can do anything it likes.
382
383\textbf{Warning!} This means that hostile Sod modules are a security hazard.
384Lisp code can read and write files, start other programs, and make network
385connections. Don't install Sod modules from sources that you don't
386trust.\footnote{%
387 Presumably you were going to run the corresponding code at some point, so
388 this isn't as unusually scary as it sounds. But please be careful.} %
389
390\subsubsection{Declaring type names} \label{sec:syntax.defs.typename}
391
392\begin{grammar}
393<typename-definition> ::=
394 "typename" <identifier-list> ";"
395\end{grammar}
396
397Each @<identifier> is declared as naming a C type. This is important because
398the C type syntax -- which Sod uses -- is ambiguous, and disambiguation is
399done by distinguishing type names from other identifiers.
400
401Don't declare class names using @"typename"; use @"class" forward
402declarations instead.
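
For example, a module which wants to use the C library's @"FILE" type in a
slot might say something like the following. The class and slot names are
invented for illustration, and I assume here that @"FILE" has not been
declared as a type name already.
\begin{listing}
typename FILE;

class Logger : SodObject {
  FILE *fp;
};
\end{listing}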
403
404\subsection{Literal code} \label{sec:syntax-code}
405
406\begin{grammar}
407<code-definition> ::=
408 "code" <identifier> ":" <identifier> $[$<constraints>$]$
409 "{" <c-fragment> "}"
410
411<constraints> ::= "[" <constraint-list> "]"
412
413<constraint> ::= <identifier>$^+$
414\end{grammar}
415
416The @<c-fragment> will be output unchanged to one of the output files.
417
418The first @<identifier> is the symbolic name of an output file. Predefined
419output file names are @"c" and @"h", which are the implementation code and
420header file respectively; other output files can be defined by extensions.
421
422The second @<identifier> provides a name for the output item. Several C
423fragments can have the same name: they will be concatenated together in the
424order in which they were encountered.
425
426The @<constraints> provide a means for specifying where in the output file
the output item should appear. (Note the two kinds of square brackets shown
in the syntax: literal square brackets must appear around the constraints if
they are present, but the constraints may be omitted entirely.) Each
comma-separated @<constraint>
430is a sequence of identifiers naming output items, and indicates that the
431output items must appear in the order given -- though the translator is free
432to insert additional items in between them. (The particular output items
433needn't be defined already -- indeed, they needn't be defined ever.)
434
435There is a predefined output item @"includes" in both the @"c" and @"h"
436output files which is a suitable place for inserting @"\#include"
437preprocessor directives in order to declare types and functions for use
438elsewhere in the generated output files.
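
By way of illustration, the following sketch places an @"\#include"
directive in the header file and a helper function in the implementation
file. The item names @"helpers" and @"trailer" are invented for the example;
only @"includes" is predefined.
\begin{listing}
code h : includes {
#include <stdio.h>
}

code c : helpers [includes helpers, helpers trailer] {
  static int square(int x) { return (x * x); }
}
\end{listing}
The first constraint asks for @"includes" to be written before @"helpers";
the second asks for @"helpers" to be written before @"trailer", whether or
not anything ever defines an item of that name.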
439
440\subsection{Property sets} \label{sec:syntax.propset}
441
442\begin{grammar}
443<properties> ::= "[" <property-list> "]"
444
445<property> ::= <identifier> "=" <expression>
446\end{grammar}
447
448Property sets are a means for associating miscellaneous information with
449classes and related items. By using property sets, additional information
450can be passed to extensions without the need to introduce idiosyncratic
451syntax.
452
453A property has a name, given as an @<identifier>, and a value computed by
454evaluating an @<expression>. The value can be one of a number of types,
though the operators currently defined act only on integer values.
456
457\subsubsection{The expression evaluator} \label{sec:syntax.propset.expr}
458
459\begin{grammar}
460<expression> ::= <term> | <expression> "+" <term> | <expression> "-" <term>
461
462<term> ::= <factor> | <term> "*" <factor> | <term> "/" <factor>
463
464<factor> ::= <primary> | "+" <factor> | "-" <factor>
465
466<primary> ::=
467 <integer-literal> | <string-literal> | <char-literal> | <identifier>
468\alt "?" <s-expression>
469\alt "(" <expression> ")"
470\end{grammar}
471
472The arithmetic expression syntax is simple and standard; there are currently
473no bitwise, logical, or comparison operators.
474
475A @<primary> expression may be a literal or an identifier. Note that
identifiers stand for themselves: they \emph{do not} denote values. For
fancier expressions, the syntax
478\begin{quote}
479 @"?" @<s-expression>
480\end{quote}
481causes the @<s-expression> to be evaluated using the Lisp \textsf{eval}
482function.
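
As a worked example, here is a property set which exercises the grammar
above. Only @"nick" is a property described in this document; the other
property names are invented, and how the translator treats properties it
doesn't recognize is not covered here.
\begin{listing}
[nick = eg,
 sum = 2 + 3 * 4,
 grouped = (2 + 3) * 4,
 negated = -(1 + 2),
 label = "forty-two",
 computed = ?(expt 2 10)]
\end{listing}
Multiplication binds more tightly than addition, so @"sum" is 14 whereas
@"grouped" is 20, and @"negated" is $-3$. The value of @"nick" is the
identifier @"eg" itself, and the value of @"computed" is whatever
\textsf{eval} returns for the S-expression, here 1024.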
483%%% FIXME crossref to extension docs
484
485\subsection{C types} \label{sec:syntax.c-types}
486
487Sod's syntax for C types closely mirrors the standard C syntax. A C type has
488two parts: a sequence of @<declaration-specifier>s and a @<declarator>. In
489Sod, a type must contain at least one @<declaration-specifier> (i.e.,
490`implicit @"int"' is forbidden), and storage-class specifiers are not
491recognized.
492
493\subsubsection{Declaration specifiers} \label{sec:syntax.c-types.declspec}
494
495\begin{grammar}
496<declaration-specifier> ::= <type-name>
497\alt "struct" <identifier> | "union" <identifier> | "enum" <identifier>
498\alt "void" | "char" | "int" | "float" | "double"
499\alt "short" | "long"
500\alt "signed" | "unsigned"
501\alt <qualifier>
502
503<qualifier> ::= "const" | "volatile" | "restrict"
504
505<type-name> ::= <identifier>
506\end{grammar}
507
508A @<type-name> is an identifier which has been declared as being a type name,
509using the @"typename" or @"class" definitions.
510
Declaration specifiers may appear in any order. However, not all
combinations are permitted. A sequence of declaration specifiers must consist
of zero or more @<qualifier>s, and one of the following, up to reordering.
514\begin{itemize}
515\item @<type-name>
\item @"struct" @<identifier>, @"union" @<identifier>, @"enum" @<identifier>
517\item @"void"
518\item @"char", @"unsigned char", @"signed char"
519\item @"short", @"unsigned short", @"signed short"
520\item @"short int", @"unsigned short int", @"signed short int"
521\item @"int", @"unsigned int", @"signed int", @"unsigned", @"signed"
522\item @"long", @"unsigned long", @"signed long"
523\item @"long int", @"unsigned long int", @"signed long int"
524\item @"long long", @"unsigned long long", @"signed long long"
525\item @"long long int", @"unsigned long long int", @"signed long long int"
526\item @"float", @"double", @"long double"
527\end{itemize}
528All of these have their usual C meanings.
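
For example, the slot declarations in the following sketch (with invented
class and slot names) all use permitted combinations.
\begin{listing}
class Example : SodObject {
  unsigned long int a;
  int long unsigned b;
  const char *name;
  volatile long long ticks;
};
\end{listing}
Here @"a" and @"b" have the same type, since the specifiers may appear in any
order. On the other hand, a combination such as @"short double" or
@"unsigned float" doesn't match any entry in the list, and so is not
permitted.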
529
530\subsubsection{Declarators} \label{sec:syntax.c-types.declarator}
531
532\begin{grammar}
533<declarator> ::=
534 <pointer>$^*$ <inner-declarator> <declarator-suffix>$^*$
535
536<inner-declarator> ::= <identifier> | <qualified-identifier>
537\alt "(" <declarator> ")"
538
539<qualified-identifier> ::= <identifier> "." <identifier>
540
541<pointer> ::= "*" <qualifier>$^*$
542
543<declarator-suffix> ::= "[" <c-fragment> "]"
544\alt "(" <arguments> ")"
545
<arguments> ::= $\epsilon$ | "..."
547\alt <argument-list> $[$"," "..."$]$
548
549<argument> ::= <declaration-specifier>$^+$ <argument-declarator>
550
551<argument-declarator> ::= <declarator> | $[$<abstract-declarator>$]$
552
553<abstract-declarator> ::=
554 <pointer>$^+$ | <pointer>$^*$ <inner-abstract-declarator>
555
556<inner-abstract-declarator> ::= "(" <abstract-declarator> ")"
557\alt $[$<inner-abstract-declarator>$]$ <declarator-suffix>$^+$
558\end{grammar}
559
560The declarator syntax is taken from C, but with some differences.
561\begin{itemize}
\item Array dimensions are uninterpreted @<c-fragment>s, terminated by a
563 closing square bracket. This allows array dimensions to contain arbitrary
564 constant expressions.
565\item A declarator may have either a single @<identifier> at its centre or a
566 pair of @<identifier>s separated by a @`.'; this is used to refer to
567 slots or messages defined in superclasses.
568\end{itemize}
569The remaining differences are (I hope) a matter of presentation rather than
570substance.
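
The following sketch shows a few declarators as they might appear in slot
definitions. All of the names, including the C macro @"N_ENTRIES", are
invented for illustration.
\begin{listing}
class Example : SodObject {
  int table[N_ENTRIES * 2 + 1];
  const char *const *names;
  int (*compare)(const void *, const void *);
  void (*report)(const char *fmt, ...);
};
\end{listing}
The dimension of @"table" is simply copied through as a C fragment, so Sod
needn't know what @"N_ENTRIES" means. The last two slots are pointers to
functions; their argument lists use abstract declarators, a named argument,
and a trailing ellipsis.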
571
572\subsection{Defining classes} \label{sec:syntax.class}
573
574\begin{grammar}
575<class-definition> ::= <class-forward-declaration>
576\alt <full-class-definition>
577\end{grammar}
578
579\subsubsection{Forward declarations} \label{sec:class.class.forward}
580
581\begin{grammar}
582<class-forward-declaration> ::= "class" <identifier> ";"
583\end{grammar}
584
585A @<class-forward-declaration> informs Sod that an @<identifier> will be used
586to name a class which is currently undefined. Forward declarations are
587necessary in order to resolve certain kinds of circularity. For example,
588\begin{listing}
589class Sub;
590
591class Super : SodObject {
592 Sub *sub;
593};
594
595class Sub : Super {
596 /* ... */
597};
598\end{listing}
599
600\subsubsection{Full class definitions} \label{sec:class.class.full}
601
602\begin{grammar}
603<full-class-definition> ::=
604 $[$<properties>$]$
605 "class" <identifier> ":" <identifier-list>
606 "{" <class-item>$^*$ "}"
607
608<class-item> ::= <slot-item> ";"
609\alt <message-item>
610\alt <method-item>
611\alt <initializer-item> ";"
612\end{grammar}
613
614A full class definition provides a complete description of a class.
615
616The first @<identifier> gives the name of the class. It is an error to
617give the name of an existing class (other than a forward-referenced class),
618or an existing type name. It is conventional to give classes `MixedCase'
619names, to distinguish them from other kinds of identifiers.
620
621The @<identifier-list> names the direct superclasses for the new class. It
622is an error if any of these @<identifier>s does not name a defined class.
623
624The @<properties> provide additional information. The standard class
625properties are as follows.
626\begin{description}
627\item[@"lisp_class"] The name of the Lisp class to use within the translator
628 to represent this class. The property value must be an identifier; the
629 default is @"sod_class". Extensions may define classes with additional
630 behaviour, and may recognize additional class properties.
631\item[@"metaclass"] The name of the Sod metaclass for this class. In the
632 generated code, a class is itself an instance of another class -- its
633 \emph{metaclass}. The metaclass defines which slots the class will have,
634 which messages it will respond to, and what its behaviour will be when it
635 receives them. The property value must be an identifier naming a defined
636 subclass of @"SodClass". The default metaclass is @"SodClass".
637 %%% FIXME xref to theory
638\item[@"nick"] A nickname for the class, to be used to distinguish it from
639 other classes in various limited contexts. The property value must be an
640 identifier; the default is constructed by forcing the class name to
641 lower-case.
642\end{description}
643
644The class body consists of a sequence of @<class-item>s enclosed in braces.
These items are discussed in the following sections.
646
647\subsubsection{Slot items} \label{sec:sntax.class.slot}
648
649\begin{grammar}
650<slot-item> ::=
651 $[$<properties>$]$
652 <declaration-specifier>$^+$ <init-declarator-list>
653
654<init-declarator> ::= <declarator> $[$"=" <initializer>$]$
655\end{grammar}
656
A @<slot-item> defines one or more slots. All instances of the class and any
subclass will contain these slots, with the names and types given by the
@<declaration-specifier>s and the @<declarator>s. Slot declarators may not
contain qualified identifiers.
661
662It is not possible to declare a slot with function type: such an item is
663interpreted as being a @<message-item> or @<method-item>. Pointers to
664functions are fine.
665
666An @<initializer>, if present, is treated as if a separate
667@<initializer-item> containing the slot name and initializer were present.
668For example,
669\begin{listing}
670[nick = eg]
671class Example : Super {
672 int foo = 17;
673};
674\end{listing}
675means the same as
676\begin{listing}
677[nick = eg]
678class Example : Super {
679 int foo;
680 eg.foo = 17;
681};
682\end{listing}
683
684\subsubsection{Initializer items} \label{sec:syntax.class.init}
685
686\begin{grammar}
687<initializer-item> ::= $[$"class"$]$ <slot-initializer-list>
688
689<slot-initializer> ::= <qualified-identifier> "=" <initializer>
690
<initializer> ::= "{" <c-fragment> "}" | <c-fragment>
692\end{grammar}
693
694An @<initializer-item> provides an initial value for one or more slots. If
695prefixed by @"class", then the initial values are for class slots (i.e.,
696slots of the class object itself); otherwise they are for instance slots.
697
698The first component of the @<qualified-identifier> must be the nickname of
699one of the class's superclasses (including itself); the second must be the
700name of a slot defined in that superclass.
701
702The initializer has one of two forms.
703\begin{itemize}
704\item A @<c-fragment> enclosed in braces denotes an aggregate initializer.
705 This is suitable for initializing structure, union or array slots.
706\item A @<c-fragment> \emph{not} beginning with an open brace is a `bare'
707 initializer, and continues until the next @`,' or @`;' which is not within
708 nested brackets. Bare initializers are suitable for initializing scalar
709 slots, such as pointers or integers, and strings.
710\end{itemize}
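
The following sketch shows both forms of initializer. The class and slot
names are invented, and @"MAX" is assumed to be a C macro defined elsewhere.
\begin{listing}
[nick = eg]
class Example : SodObject {
  int point[2] = { 3, 4 };
  int limit = MAX(10, 20), count = 0;
  const char *name = "example";
};

[nick = sub]
class SubExample : Example {
  eg.count = 1;
};
\end{listing}
The initializer for @"point" is an aggregate initializer. The bare
initializer for @"limit" runs up to the comma following the closing
parenthesis: the comma inside @"MAX(10, 20)" is within nested brackets and so
doesn't end it. In @"SubExample", the initializer item names the inherited
slot using the nickname of the superclass which defines it.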
711
712\subsubsection{Message items} \label{sec:syntax.class.message}
713
714\begin{grammar}
715<message-item> ::=
716 $[$<properties>$]$
717 <declaration-specifier>$^+$ <declarator> $[$<method-body>$]$
718\end{grammar}
719
720\subsubsection{Method items} \label{sec:syntax.class.method}
721
722\begin{grammar}
723<method-item> ::=
724 $[$<properties>$]$
725 <declaration-specifier>$^+$ <declarator> <method-body>
726
727<method-body> ::= "{" <c-fragment> "}" | "extern" ";"
728\end{grammar}
729
730%%%--------------------------------------------------------------------------
731\section{Class objects}
732
733\begin{listing}
734typedef struct SodClass__ichain_obj SodClass;
735
736struct sod_chain {
737 size_t n_classes; /* Number of classes in chain */
738 const SodClass *const *classes; /* Vector of classes, head first */
739 size_t off_ichain; /* Offset of ichain from instance base */
740 const struct sod_vtable *vt; /* Vtable pointer for chain */
741 size_t ichainsz; /* Size of the ichain structure */
742};
743
744struct sod_vtable {
745 SodClass *_class; /* Pointer to instance's class */
746 size_t _base; /* Offset to instance base */
747};
748
749struct SodClass__islots {
750
751 /* Basic information */
752 const char *name; /* The class's name as a string */
753 const char *nick; /* The nickname as a string */
754
755 /* Instance allocation and initialization */
756 size_t instsz; /* Instance layout size in bytes */
757 void *(*imprint)(void *); /* Stamp instance with vtable ptrs */
758 void *(*init)(void *); /* Initialize instance */
759
760 /* Superclass structure */
761 size_t n_supers; /* Number of direct superclasses */
762 const SodClass *const *supers; /* Vector of direct superclasses */
763 size_t n_cpl; /* Length of class precedence list */
764 const SodClass *const *cpl; /* Vector for class precedence list */
765
766 /* Chain structure */
767 const SodClass *link; /* Link to next class in chain */
768 const SodClass *head; /* Pointer to head of chain */
769 size_t level; /* Index of class in its chain */
770 size_t n_chains; /* Number of superclass chains */
 const struct sod_chain *chains; /* Vector of chain structures */
772
773 /* Layout */
774 size_t off_islots; /* Offset of islots from ichain base */
775 size_t islotsz; /* Size of instance slots */
776};
777
778struct SodClass__ichain_obj {
779 const SodClass__vt_obj *_vt;
780 struct SodClass__islots cls;
781};
782
783struct sod_instance {
784 struct sod_vtable *_vt;
785};
786\end{listing}
787
788\begin{listing}
void *sod_convert(const SodClass *cls, const void *obj)
{
  const struct sod_instance *inst = obj;
  const SodClass *real = inst->_vt->_class;
  const struct sod_chain *chain;
  size_t i, index;

  /* Find the chain of the instance's class with the same head as cls. */
  for (i = 0; i < real->cls.n_chains; i++) {
    chain = &real->cls.chains[i];
    if (chain->classes[0] == cls->cls.head) {
      /* cls must appear at its own level in that chain. */
      index = cls->cls.level;
      if (index < chain->n_classes && chain->classes[index] == cls)
        return ((char *)obj - inst->_vt->_base + chain->off_ichain);
      else
        return (0);
    }
  }
  return (0);
}
808\end{listing}
809
810%%%--------------------------------------------------------------------------
811\section{Classes}
812
813\subsection{Classes and superclasses}
814
815A @<full-class-definition> must list one or more existing classes to be the
816\emph{direct superclasses} for the new class being defined. We make the
817following definitions.
818\begin{itemize}
819\item The \emph{superclasses} of a class consist of the class itself together
820 with the superclasses of its direct superclasses.
821\item The \emph{proper superclasses} of a class are its superclasses other
822 than itself.
823\item If $C$ is a (proper) superclass of $D$ then $D$ is a (\emph{proper})
824 \emph{subclass} of $C$.
825\end{itemize}
826The predefined class @|SodObject| has no direct superclasses; it is unique in
827this respect. All classes are subclasses of @|SodObject|.
828
829\subsection{The class precedence list}
830
831Let $C$ be a class. The superclasses of $C$ form a directed graph, with an
832edge from each class to each of its direct superclasses. This is the
833\emph{superclass graph of $C$}.
834
835In order to resolve inheritance of items, we define a \emph{class precedence
836 list} (or CPL) for each class, which imposes a total order on that class's
837superclasses. The default algorithm for computing the CPL is the \emph{C3}
838algorithm \cite{fixme-c3}, though extensions may implement other algorithms.
839
840The default algorithm works as follows. Let $C$ be the class whose CPL we
841are to compute. Let $X$ and $Y$ be two of $C$'s superclasses.
842\begin{itemize}
843\item $C$ must appear first in the CPL.
844\item If $X$ appears before $Y$ in the CPL of one of $C$'s direct
 superclasses, then $X$ appears before $Y$ in $C$'s CPL.
846\item If the above rules don't suffice to order $X$ and $Y$, then whichever
847 of $X$ and $Y$ has a subclass which appears further left in the list of
848 $C$'s direct superclasses will appear earlier in the CPL.
849\end{itemize}
850This last rule is sufficient to disambiguate because if both $X$ and $Y$ are
851superclasses of the same direct superclass of $C$ then that direct
852superclass's CPL will order $X$ and $Y$.
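
As a worked example of the rules above, consider the following `diamond' of
classes, whose names are invented for illustration.
\begin{listing}
class A : SodObject { /* ... */ };
class B : A { /* ... */ };
class C : A { /* ... */ };
class D : B, C { /* ... */ };
\end{listing}
The CPL of @|A| is @|A|, @|SodObject|; that of @|B| is @|B|, @|A|,
@|SodObject|; and that of @|C| is @|C|, @|A|, @|SodObject|. For @|D|, the
first rule puts @|D| itself first. The second rule puts @|B| before @|A|
(from the CPL of @|B|), @|C| before @|A| (from the CPL of @|C|), and @|A|
before @|SodObject|. Neither direct superclass's CPL orders @|B| and @|C|,
so the third rule applies: @|B| is further left in the list of direct
superclasses, and therefore comes earlier. The CPL of @|D| is thus @|D|,
@|B|, @|C|, @|A|, @|SodObject|.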
853
854We say that \emph{$X$ is more specific than $Y$ as a superclass of $C$} if
855$X$ is earlier than $Y$ in $C$'s class precedence list. If $C$ is clear from
856context then we omit it, saying simply that $X$ is more specific than $Y$.
857
858\subsection{Instances and metaclasses}
859
860A class defines the structure and behaviour of its \emph{instances}: run-time
861objects created (possibly) dynamically. An instance is an instance of only
862one class, though structurally it may be used in place of an instance of any
863of that class's superclasses. It is possible, with care, to change the class
864of an instance at run-time.
865
866Classes are themselves represented as instances -- called \emph{class
867 objects} -- in the running program. Being instances, they have a class,
868called the \emph{metaclass}. The metaclass defines the structure and
869behaviour of the class object.
870
871The predefined class @|SodClass| is the default metaclass for new classes.
872@|SodClass| has @|SodObject| as its only direct superclass. @|SodClass| is
873its own metaclass.
874
875\subsection{Items and inheritance}
876
877A class definition also declares \emph{slots}, \emph{messages},
878\emph{initializers} and \emph{methods} -- collectively referred to as
879\emph{items}. In addition to the items declared in the class definition --
880the class's \emph{direct items} -- a class also \emph{inherits} items from
881its superclasses.
882
883The precise rules for item inheritance vary according to the kinds of items
884involved.
885
886Some object systems have a notion of `repeated inheritance': if there are
887multiple paths in the superclass graph from a class to one of its
888superclasses then items defined in that superclass may appear duplicated in
889the subclass. Sod does not have this notion.
890
891\subsubsection{Slots}
892A \emph{slot} is a unit of state. In other object systems, slots may be
893called `fields', `member variables', or `instance variables'.
894
895A slot has a \emph{name} and a \emph{type}. The name serves only to
896distinguish the slot from other direct slots defined by the same class. A
897class inherits all of its proper superclasses' slots. Slots inherited from
898superclasses do not conflict with each other or with direct slots, even if
899they have the same names.
900
901At run-time, each instance of the class holds a separate value for each slot,
902whether direct or inherited. Changing the value of an instance's slot
903doesn't affect other instances.
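
As a sketch of what this means in practice, the following classes (whose
names are invented for illustration) each define a direct slot called @"x".
\begin{listing}
[nick = a] class A : SodObject { int x; };
[nick = b] class B : SodObject { int x; };

[nick = c]
class C : A, B {
  int x;
  a.x = 1;
  b.x = 2;
  c.x = 3;
};
\end{listing}
Each instance of @|C| holds three separate slots, all named @"x". The
initializer items tell them apart by qualifying the slot name with the
nickname of the class which defines the slot.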
904
905\subsubsection{Initializers}
906Mumble.
907
908\subsubsection{Messages}
909A \emph{message} is the stimulus for behaviour. In Sod, a class must define,
910statically, the name and format of the messages it is able to receive and the
values it will return in reply. In this respect, a message is similar to an
`abstract member function' or `interface member function' in other object
systems.
914
915Like slots, a message has a \emph{name} and a \emph{type}. Again, the name
916serves only to distinguish the message from other direct messages defined by
917the same class. Messages inherited from superclasses do not conflict with
918each other or with direct messages, even if they have the same name.
919
920At run-time, one sends a message to an instance by invoking a function
921obtained from the instance's \emph{vtable}: \xref{sec:fixme-vtable}.
922
923\subsubsection{Methods}
924A \emph{method} is a unit of behaviour. In other object systems, methods may
925be called `member functions'.
926
927A method is associated with a message. When a message is received by an
928instance, all of the methods associated with that message on the instance's
929class or any of its superclasses are \emph{applicable}. The details of how
930the applicable methods are invoked are described fully in
931\xref{sec:fixme-method-combination}.
932
933\subsection{Chains and instance layout}
934
\include{sod-backg}
\include{sod-protocol}
937
938\end{document}
939\f
940%%% Local variables:
941%%% mode: LaTeX
942%%% TeX-PDF-mode: t
943%%% End: