| 1 | %%% -*-latex-*- |
| 2 | %%% |
| 3 | %%% Module syntax |
| 4 | %%% |
| 5 | %%% (c) 2015 Straylight/Edgeware |
| 6 | %%% |
| 7 | |
| 8 | %%%----- Licensing notice --------------------------------------------------- |
| 9 | %%% |
| 10 | %%% This file is part of the Sensible Object Design, an object system for C. |
| 11 | %%% |
| 12 | %%% SOD is free software; you can redistribute it and/or modify |
| 13 | %%% it under the terms of the GNU General Public License as published by |
| 14 | %%% the Free Software Foundation; either version 2 of the License, or |
| 15 | %%% (at your option) any later version. |
| 16 | %%% |
| 17 | %%% SOD is distributed in the hope that it will be useful, |
| 18 | %%% but WITHOUT ANY WARRANTY; without even the implied warranty of |
| 19 | %%% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
| 20 | %%% GNU General Public License for more details. |
| 21 | %%% |
| 22 | %%% You should have received a copy of the GNU General Public License |
| 23 | %%% along with SOD; if not, write to the Free Software Foundation, |
| 24 | %%% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. |
| 25 | |
| 26 | \chapter{Module syntax} \label{ch:syntax} |
| 27 | |
| 28 | %%%-------------------------------------------------------------------------- |
| 29 | |
| 30 | Fortunately, Sod is syntactically quite simple. The grammar notation below is |
| 31 | slightly unusual, in order to make the presentation shorter and easier to read. |
| 32 | |
| 33 | Anywhere a simple nonterminal name $x$ may appear in the grammar, an |
| 34 | \emph{indexed} nonterminal $x[a_1, \ldots, a_n]$ may also appear. On the |
| 35 | left-hand side of a production rule, the indices $a_1$, \ldots, $a_n$ are |
| 36 | variables which vary over all nonterminal and terminal symbols, and the |
| 37 | variables may also appear on the right-hand side in place of a nonterminal. |
| 38 | Such a rule stands for a family of rules, in which each variable is replaced |
| 39 | by each possible simple nonterminal or terminal symbol. |
| 40 | |
| 41 | The letter $\epsilon$ denotes the empty nonterminal: |
| 42 | \begin{quote} |
| 43 | \syntax{$\epsilon$ ::=} |
| 44 | \end{quote} |
| 45 | |
| 46 | The following indexed productions are used throughout the grammar, some often |
| 47 | enough that they deserve special notation. |
| 48 | \begin{itemize} |
| 49 | \item @[$x$@] abbreviates @<optional>$[x]$, denoting an optional occurrence |
| 50 | of $x$: |
| 51 | \begin{quote} |
| 52 | \syntax{@[$x$@] ::= <optional>$[x]$ ::= $\epsilon$ @! $x$} |
| 53 | \end{quote} |
| 54 | \item $x^*$ abbreviates @<zero-or-more>$[x]$, denoting a sequence of zero or |
| 55 | more occurrences of $x$: |
| 56 | \begin{quote} |
| 57 | \syntax{$x^*$ ::= <zero-or-more>$[x]$ ::= |
| 58 | $\epsilon$ @! <zero-or-more>$[x]$ $x$} |
| 59 | \end{quote} |
| 60 | \item $x^+$ abbreviates @<one-or-more>$[x]$, denoting a sequence of one or |
| 61 | more occurrences of $x$: |
| 62 | \begin{quote} |
| 63 | \syntax{$x^+$ ::= <one-or-more>$[x]$ ::= <zero-or-more>$[x]$ $x$} |
| 64 | \end{quote} |
| 65 | \item @<list>$[x]$ denotes a sequence of one or more occurrences of $x$ |
| 66 | separated by commas: |
| 67 | \begin{quote} |
| 68 | \syntax{<list>$[x]$ ::= $x$ @! <list>$[x]$ "," $x$} |
| 69 | \end{quote} |
| 70 | \end{itemize} |
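
As a worked instance of the rule-family convention, taking $x$ to be
@<identifier> in the last rule above gives the ordinary production
\begin{quote}
  \syntax{<list>$[\mbox{@<identifier>}]$ ::=
    <identifier> @! <list>$[\mbox{@<identifier>}]$ "," <identifier>}
\end{quote}
which matches @"a", @"a, b" and @"a, b, c", but neither the empty sequence
nor a sequence with a trailing comma.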
| 71 | |
| 72 | \subsection{Lexical syntax} |
| 73 | \label{sec:syntax.lex} |
| 74 | |
| 75 | Whitespace and comments are discarded. The remaining characters are |
| 76 | collected into tokens according to the following syntax. |
| 77 | |
| 78 | \begin{grammar} |
| 79 | <token> ::= <identifier> |
| 80 | \alt <string-literal> |
| 81 | \alt <char-literal> |
| 82 | \alt <integer-literal> |
| 83 | \alt <punctuation> |
| 84 | \end{grammar} |
| 85 | |
| 86 | This syntax is slightly ambiguous, and is disambiguated by the \emph{maximal |
| 87 | munch} rule: at each stage we take the longest sequence of characters which |
| 88 | could be a token. |
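
As a small illustration (a token-level sketch, not a complete module), the
text before each comment below is read as a single token under maximal
munch, even though a proper prefix of it would also be a valid token.
\begin{listing}
classy          /* one identifier, not `class' followed by `y' */
x2              /* one identifier, not `x' followed by `2' */
0x1f            /* one integer literal, not `0' followed by `x1f' */
\end{listing}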
| 89 | |
| 90 | \subsubsection{Identifiers} \label{sec:syntax.lex.id} |
| 91 | |
| 92 | \begin{grammar} |
| 93 | <identifier> ::= <id-start-char> @<id-body-char>^* |
| 94 | |
| 95 | <id-start-char> ::= <alpha-char> | "_" |
| 96 | |
| 97 | <id-body-char> ::= <id-start-char> @! <digit-char> |
| 98 | |
| 99 | <alpha-char> ::= "A" | "B" | \dots\ | "Z" |
| 100 | \alt "a" | "b" | \dots\ | "z" |
| 101 | \alt <extended-alpha-char> |
| 102 | |
| 103 | <digit-char> ::= "0" | <nonzero-digit-char> |
| 104 | |
| 105 | <nonzero-digit-char> ::= "1" | "2" $| \cdots |$ "9" |
| 106 | \end{grammar} |
| 107 | |
| 108 | The precise definition of @<alpha-char> is left to the function |
| 109 | \textsf{alpha-char-p} in the hosting Lisp system. For portability, |
| 110 | programmers are encouraged to limit themselves to the standard ASCII letters. |
| 111 | |
| 112 | There are no reserved words at the lexical level, but the higher-level syntax |
| 113 | recognizes certain identifiers as \emph{keywords} in some contexts. There is |
| 114 | also an ambiguity (inherited from C) in the declaration syntax which is |
| 115 | settled by distinguishing type names from other identifiers at a lexical |
| 116 | level. |
| 117 | |
| 118 | \subsubsection{String and character literals} \label{sec:syntax.lex.string} |
| 119 | |
| 120 | \begin{grammar} |
| 121 | <string-literal> ::= "\"" @<string-literal-char>^* "\"" |
| 122 | |
| 123 | <char-literal> ::= "'" <char-literal-char> "'" |
| 124 | |
| 125 | <string-literal-char> ::= any character other than "\\" or "\"" |
| 126 | \alt "\\" <char> |
| 127 | |
| 128 | <char-literal-char> ::= any character other than "\\" or "'" |
| 129 | \alt "\\" <char> |
| 130 | |
| 131 | <char> ::= any single character |
| 132 | \end{grammar} |
| 133 | |
| 134 | The syntax for string and character literals differs from~C. In particular, |
| 135 | escape sequences such as @`\textbackslash n' are not recognized. The use |
| 136 | of string and character literals in Sod, outside of C~fragments, is limited, |
| 137 | and the simple syntax seems adequate. For the sake of future compatibility, |
| 138 | the use of character sequences which resemble C escape sequences is |
| 139 | discouraged. |
| 140 | |
| 141 | \subsubsection{Integer literals} \label{sec:syntax.lex.int} |
| 142 | |
| 143 | \begin{grammar} |
| 144 | <integer-literal> ::= <decimal-integer> |
| 145 | \alt <binary-integer> |
| 146 | \alt <octal-integer> |
| 147 | \alt <hex-integer> |
| 148 | |
| 149 | <decimal-integer> ::= "0" | <nonzero-digit-char> @<digit-char>^* |
| 150 | |
| 151 | <binary-integer> ::= "0" @("b"|"B"@) @<binary-digit-char>^+ |
| 152 | |
| 153 | <binary-digit-char> ::= "0" | "1" |
| 154 | |
| 155 | <octal-integer> ::= "0" @["o"|"O"@] @<octal-digit-char>^+ |
| 156 | |
| 157 | <octal-digit-char> ::= "0" | "1" $| \cdots |$ "7" |
| 158 | |
| 159 | <hex-integer> ::= "0" @("x"|"X"@) @<hex-digit-char>^+ |
| 160 | |
| 161 | <hex-digit-char> ::= <digit-char> |
| 162 | \alt "A" | "B" | "C" | "D" | "E" | "F" |
| 163 | \alt "a" | "b" | "c" | "d" | "e" | "f" |
| 164 | \end{grammar} |
| 165 | |
| 166 | Sod understands only integers, not floating-point numbers; its integer syntax |
| 167 | goes slightly beyond C in allowing a @`0o' prefix for octal and @`0b' for |
| 168 | binary. However, length and signedness indicators are not permitted. |
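
For example, according to the grammar above, each of the following literals
denotes the value forty-two. This is only a token-level illustration, not a
complete module.
\begin{listing}
42              /* decimal */
0b101010        /* binary */
0o52            /* octal, with the `0o' prefix */
052             /* octal, written as in C */
0x2a            /* hexadecimal */
\end{listing}
A C-style literal such as @"42UL" is not a single integer literal in Sod,
because suffixes are not part of the syntax.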
| 169 | |
| 170 | \subsubsection{Punctuation} \label{sec:syntax.lex.punct} |
| 171 | |
| 172 | \begin{grammar} |
| 173 | <punctuation> ::= any nonalphanumeric character other than "_", "\"" or "'" |
| 174 | \end{grammar} |
| 175 | |
| 176 | \subsubsection{Comments} \label{sec:lex-comment} |
| 177 | |
| 178 | \begin{grammar} |
| 179 | <comment> ::= <block-comment> |
| 180 | \alt <line-comment> |
| 181 | |
| 182 | <block-comment> ::= |
| 183 | "/*" |
| 184 | @<not-star>^* @(@<star>^+ <not-star-or-slash> @<not-star>^*@)^* |
| 185 | @<star>^* |
| 186 | "*/" |
| 187 | |
| 188 | <star> ::= "*" |
| 189 | |
| 190 | <not-star> ::= any character other than "*" |
| 191 | |
| 192 | <not-star-or-slash> ::= any character other than "*" or "/" |
| 193 | |
| 194 | <line-comment> ::= "//" @<not-newline>^* <newline> |
| 195 | |
| 196 | <newline> ::= a newline character |
| 197 | |
| 198 | <not-newline> ::= any character other than newline |
| 199 | \end{grammar} |
| 200 | |
| 201 | Comments are exactly as in C99: both traditional block comments `\texttt{/*} |
| 202 | \dots\ \texttt{*/}' and \Cplusplus-style `\texttt{//} \dots' comments are |
| 203 | permitted and ignored. |
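
A brief sketch showing both forms; the class name is purely illustrative.
\begin{listing}
/* A block comment
 * may span several lines.
 */
class Example;          // a line comment runs to the end of the line
\end{listing}
As the grammar implies, block comments do not nest: the first \texttt{*/}
after the opening \texttt{/*} ends the comment.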
| 204 | |
| 205 | \subsection{Special nonterminals} |
| 206 | \label{sec:special-nonterminals} |
| 207 | |
| 208 | Aside from the lexical syntax presented above (\xref{sec:syntax.lex}), |
| 209 | two special nonterminals occur in the module syntax. |
| 210 | |
| 211 | \subsubsection{S-expressions} \label{sec:syntax-sexp} |
| 212 | |
| 213 | \begin{grammar} |
| 214 | <s-expression> ::= an S-expression, as parsed by the Lisp reader |
| 215 | \end{grammar} |
| 216 | |
| 217 | When an S-expression is expected, the Sod parser simply calls the host Lisp |
| 218 | system's \textsf{read} function. Sod modules are permitted to modify the |
| 219 | read table to extend the S-expression syntax. |
| 220 | |
| 221 | S-expressions are self-delimiting, so no end-marker is needed. |
| 222 | |
| 223 | \subsubsection{C fragments} \label{sec:syntax.lex.cfrag} |
| 224 | |
| 225 | \begin{grammar} |
| 226 | <c-fragment> ::= a sequence of C tokens, with matching brackets |
| 227 | \end{grammar} |
| 228 | |
| 229 | Sequences of C code are simply stored and written to the output unchanged |
| 230 | during translation. They are read using a simple scanner which nonetheless |
| 231 | understands C comments and string and character literals. |
| 232 | |
| 233 | A C fragment is terminated by one of a small number of delimiter characters |
| 234 | determined by the immediately surrounding context -- usually a closing brace |
| 235 | or bracket. The first such delimiter character which is not enclosed in |
| 236 | brackets, braces or parentheses ends the fragment. |
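
To illustrate the termination rule, here is a sketch using a @"code"
definition (\xref{sec:syntax-code}), whose fragment is terminated by a
closing brace. The item name @"helpers" and the function are invented for
the example.
\begin{listing}
code c : helpers {
static int is_close(int ch)
{
  /* Neither this } nor the one in the literal below ends the fragment. */
  return (ch == '}');
}
}
\end{listing}
The braces of the function body are balanced, and the scanner skips over the
comment and the character literal, so only the final closing brace ends the
fragment.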
| 237 | |
| 238 | \subsection{Module syntax} \label{sec:syntax-module} |
| 239 | |
| 240 | \begin{grammar} |
| 241 | <module> ::= @<definition>^* |
| 242 | |
| 243 | <definition> ::= <import-definition> |
| 244 | \alt <load-definition> |
| 245 | \alt <lisp-definition> |
| 246 | \alt <code-definition> |
| 247 | \alt <typename-definition> |
| 248 | \alt <class-definition> |
| 249 | \end{grammar} |
| 250 | |
| 251 | A module is the top-level syntactic item. A module consists of a sequence of |
| 252 | definitions. |
| 253 | |
| 254 | \subsection{Simple definitions} \label{sec:syntax.defs} |
| 255 | |
| 256 | \subsubsection{Importing modules} \label{sec:syntax.defs.import} |
| 257 | |
| 258 | \begin{grammar} |
| 259 | <import-definition> ::= "import" <string> ";" |
| 260 | \end{grammar} |
| 261 | |
| 262 | The module named @<string> is processed and its definitions made available. |
| 263 | |
| 264 | A search is made for a module source file as follows. |
| 265 | \begin{itemize} |
| 266 | \item The module name @<string> is converted into a filename by appending |
| 267 | @`.sod', if it has no extension already.\footnote{% |
| 268 | Technically, what happens is \textsf{(merge-pathnames name (make-pathname |
| 269 | :type "SOD" :case :common))}, so exactly what this means varies |
| 270 | according to the host system.} % |
| 271 | \item The file is looked for relative to the directory containing the |
| 272 | importing module. |
| 273 | \item If that fails, then the file is looked for in each directory on the |
| 274 | module search path in turn. |
| 275 | \item If the file still isn't found, an error is reported and the import |
| 276 | fails. |
| 277 | \end{itemize} |
| 278 | At this point, if the file has previously been imported, nothing further |
| 279 | happens.\footnote{% |
| 280 | This check is done using \textsf{truename}, so it should see through simple |
| 281 | tricks like symbolic links. However, it may be confused by fancy things |
| 282 | like bind mounts and so on.} % |
| 283 | |
| 284 | Recursive imports, either direct or indirect, are an error. |
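
For instance, assuming a hypothetical module file @"widgets.sod", the
definition
\begin{listing}
import "widgets";
\end{listing}
has the default extension appended and, on a typical host system, causes a
search for @"widgets.sod", first relative to the directory containing the
importing module and then along the module search path.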
| 285 | |
| 286 | \subsubsection{Loading extensions} \label{sec:syntax.defs.load} |
| 287 | |
| 288 | \begin{grammar} |
| 289 | <load-definition> ::= "load" <string> ";" |
| 290 | \end{grammar} |
| 291 | |
| 292 | The Lisp file named @<string> is loaded and evaluated. |
| 293 | |
| 294 | A search is made for a Lisp source file as follows. |
| 295 | \begin{itemize} |
| 296 | \item The name @<string> is converted into a filename by appending @`.lisp', |
| 297 | if it has no extension already.\footnote{% |
| 298 | Technically, what happens is \textsf{(merge-pathnames name (make-pathname |
| 299 | :type "LISP" :case :common))}, so exactly what this means varies |
| 300 | according to the host system.} % |
| 301 | \item A search is then made in the same manner as for module imports |
| 302 | (\xref{sec:syntax.defs.import}). |
| 303 | \end{itemize} |
| 304 | If the file is found, it is loaded using the host Lisp's \textsf{load} |
| 305 | function. |
| 306 | |
| 307 | Note that Sod doesn't attempt to compile Lisp files, or even to look for |
| 308 | existing compiled files. The right way to package a substantial extension to |
| 309 | the Sod translator is to provide the extension as a standard ASDF system (or |
| 310 | similar) and leave a dropping @"foo-extension.lisp" in the module path saying |
| 311 | something like |
| 312 | \begin{quote} |
| 313 | \textsf{(asdf:load-system :foo-extension)} |
| 314 | \end{quote} |
| 315 | which will arrange for the extension to be compiled if necessary. |
| 316 | |
| 317 | (This approach means that the language doesn't need to depend on any |
| 318 | particular system definition facility. It's bad enough already that it |
| 319 | depends on Common Lisp.) |
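
Continuing the example above, a module which needs the (hypothetical)
extension would then contain
\begin{listing}
load "foo-extension";
\end{listing}
which finds @"foo-extension.lisp" by the search just described and loads it.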
| 320 | |
| 321 | \subsubsection{Lisp escapes} \label{sec:syntax.defs.lisp} |
| 322 | |
| 323 | \begin{grammar} |
| 324 | <lisp-definition> ::= "lisp" <s-expression> ";" |
| 325 | \end{grammar} |
| 326 | |
| 327 | The @<s-expression> is evaluated immediately. It can do anything it likes. |
| 328 | |
| 329 | \begin{boxy}[Warning!] |
| 330 | This means that hostile Sod modules are a security hazard. Lisp code can |
| 331 | read and write files, start other programs, and make network connections. |
| 332 | Don't install Sod modules from sources that you don't trust.\footnote{% |
| 333 | Presumably you were going to run the corresponding code at some point, so |
| 334 | this isn't as unusually scary as it sounds. But please be careful.} % |
| 335 | \end{boxy} |
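
A minimal sketch of a Lisp escape; the message text is arbitrary.
\begin{listing}
lisp (format t "~&Translating an example module~%");
\end{listing}
The form is read by the Lisp reader and evaluated as soon as the definition
is parsed; here it merely prints a line to standard output.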
| 336 | |
| 337 | \subsubsection{Declaring type names} \label{sec:syntax.defs.typename} |
| 338 | |
| 339 | \begin{grammar} |
| 340 | <typename-definition> ::= |
| 341 | "typename" <list>$[\mbox{@<identifier>}]$ ";" |
| 342 | \end{grammar} |
| 343 | |
| 344 | Each @<identifier> is declared as naming a C type. This is important because |
| 345 | the C type syntax -- which Sod uses -- is ambiguous, and disambiguation is |
| 346 | done by distinguishing type names from other identifiers. |
| 347 | |
| 348 | Don't declare class names using @"typename"; use @"class" forward |
| 349 | declarations instead. |
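
For example, a module which mentions the standard C types @"FILE" and
@"size_t" in its declarations could declare them as follows.
\begin{listing}
typename FILE, size_t;
\end{listing}
This only tells Sod's parser that the identifiers name types. The C
declarations themselves still have to come from somewhere, typically a
header named in an @"\#include" directive (\xref{sec:syntax-code}).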
| 350 | |
| 351 | \subsection{Literal code} \label{sec:syntax-code} |
| 352 | |
| 353 | \begin{grammar} |
| 354 | <code-definition> ::= |
| 355 | "code" <identifier> ":" <identifier> @[<constraints>@] |
| 356 | "{" <c-fragment> "}" |
| 357 | |
| 358 | <constraints> ::= "[" <list>$[\mbox{@<constraint>}]$ "]" |
| 359 | |
| 360 | <constraint> ::= @<identifier>^+ |
| 361 | \end{grammar} |
| 362 | |
| 363 | The @<c-fragment> will be output unchanged to one of the output files. |
| 364 | |
| 365 | The first @<identifier> is the symbolic name of an output file. Predefined |
| 366 | output file names are @"c" and @"h", which are the implementation code and |
| 367 | header file respectively; other output files can be defined by extensions. |
| 368 | |
| 369 | The second @<identifier> provides a name for the output item. Several C |
| 370 | fragments can have the same name: they will be concatenated together in the |
| 371 | order in which they were encountered. |
| 372 | |
| 373 | The @<constraints> provide a means for specifying where in the output file |
| 374 | the output item should appear. (Note the two kinds of square brackets shown |
| 375 | in the syntax: literal square brackets must surround the constraints if they |
| 376 | are present, but the constraints may be omitted.) Each comma-separated |
| 377 | @<constraint> is a sequence of identifiers naming output items, and indicates |
| 378 | that the output items must appear in the order given -- though the translator |
| 379 | is free to insert additional items in between them. (The particular output |
| 380 | items needn't be defined already -- indeed, they needn't be defined ever.) |
| 381 | |
| 382 | There is a predefined output item @"includes" in both the @"c" and @"h" |
| 383 | output files which is a suitable place for inserting @"\#include" |
| 384 | preprocessor directives in order to declare types and functions for use |
| 385 | elsewhere in the generated output files. |
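
The following sketch puts these pieces together. The item name @"helpers"
and the function @"twice" are made up for the example; only @"c", @"h" and
@"includes" are predefined.
\begin{listing}
code h : includes {
#include <stddef.h>
}

code c : helpers [includes helpers] {
static int twice(int x) { return 2 * x; }
}
\end{listing}
The constraint @"includes helpers" asks for the @"helpers" item to be
written somewhere after the @"includes" item in the implementation file.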
| 386 | |
| 387 | \subsection{Property sets} \label{sec:syntax.propset} |
| 388 | |
| 389 | \begin{grammar} |
| 390 | <properties> ::= "[" <list>$[\mbox{@<property>}]$ "]" |
| 391 | |
| 392 | <property> ::= <identifier> "=" <expression> |
| 393 | \end{grammar} |
| 394 | |
| 395 | Property sets are a means for associating miscellaneous information with |
| 396 | classes and related items. By using property sets, additional information |
| 397 | can be passed to extensions without the need to introduce idiosyncratic |
| 398 | syntax. |
| 399 | |
| 400 | A property has a name, given as an @<identifier>, and a value computed by |
| 401 | evaluating an @<expression>. The value can be one of a number of types, |
| 402 | though the only operators currently defined act on integer values only. |
| 403 | |
| 404 | \subsubsection{The expression evaluator} \label{sec:syntax.propset.expr} |
| 405 | |
| 406 | \begin{grammar} |
| 407 | <expression> ::= <term> | <expression> "+" <term> | <expression> "-" <term> |
| 408 | |
| 409 | <term> ::= <factor> | <term> "*" <factor> | <term> "/" <factor> |
| 410 | |
| 411 | <factor> ::= <primary> | "+" <factor> | "-" <factor> |
| 412 | |
| 413 | <primary> ::= |
| 414 | <integer-literal> | <string-literal> | <char-literal> | <identifier> |
| 415 | \alt "?" <s-expression> |
| 416 | \alt "(" <expression> ")" |
| 417 | \end{grammar} |
| 418 | |
| 419 | The arithmetic expression syntax is simple and standard; there are currently |
| 420 | no bitwise, logical, or comparison operators. |
| 421 | |
| 422 | A @<primary> expression may be a literal or an identifier. Note that |
| 423 | identifiers stand for themselves: they \emph{do not} denote values. For |
| 424 | fancier expressions, the syntax |
| 425 | \begin{quote} |
| 426 | @"?" @<s-expression> |
| 427 | \end{quote} |
| 428 | causes the @<s-expression> to be evaluated using the Lisp \textsf{eval} |
| 429 | function. |
| 430 | %%% FIXME crossref to extension docs |
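
As a sketch, the following property set uses several kinds of @<primary>.
The property names other than @"nick" are invented for the example.
\begin{listing}
[nick = eg,
 answer = 6 * (3 + 4),
 label = "forty-two",
 computed = ?(* 6 7)]
\end{listing}
Here @"nick" has the identifier @"eg" itself as its value, @"answer"
evaluates to 42 by ordinary integer arithmetic, and @"computed" is whatever
the host Lisp returns from evaluating @"(* 6 7)", which is also 42.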
| 431 | |
| 432 | \subsection{C types} \label{sec:syntax.c-types} |
| 433 | |
| 434 | Sod's syntax for C types closely mirrors the standard C syntax. A C type has |
| 435 | two parts: a sequence of @<declaration-specifier>s and a @<declarator>. In |
| 436 | Sod, a type must contain at least one @<declaration-specifier> (i.e., |
| 437 | `implicit @"int"' is forbidden), and storage-class specifiers are not |
| 438 | recognized. |
| 439 | |
| 440 | \subsubsection{Declaration specifiers} \label{sec:syntax.c-types.declspec} |
| 441 | |
| 442 | \begin{grammar} |
| 443 | <declaration-specifier> ::= <type-name> |
| 444 | \alt "struct" <identifier> | "union" <identifier> | "enum" <identifier> |
| 445 | \alt "void" | "char" | "int" | "float" | "double" |
| 446 | \alt "short" | "long" |
| 447 | \alt "signed" | "unsigned" |
| 448 | \alt <qualifier> |
| 449 | |
| 450 | <qualifier> ::= "const" | "volatile" | "restrict" |
| 451 | |
| 452 | <type-name> ::= <identifier> |
| 453 | \end{grammar} |
| 454 | |
| 455 | A @<type-name> is an identifier which has been declared as being a type name, |
| 456 | using the @"typename" or @"class" definitions. |
| 457 | |
| 458 | Declaration specifiers may appear in any order. However, not all |
| 459 | combinations are permitted. A sequence of declaration specifiers must consist |
| 460 | of zero or more @<qualifier>s and exactly one of the following, up to reordering. |
| 461 | \begin{itemize} |
| 462 | \item @<type-name> |
| 463 | \item @"struct" @<identifier>, @"union" @<identifier>, @"enum" @<identifier> |
| 464 | \item @"void" |
| 465 | \item @"char", @"unsigned char", @"signed char" |
| 466 | \item @"short", @"unsigned short", @"signed short" |
| 467 | \item @"short int", @"unsigned short int", @"signed short int" |
| 468 | \item @"int", @"unsigned int", @"signed int", @"unsigned", @"signed" |
| 469 | \item @"long", @"unsigned long", @"signed long" |
| 470 | \item @"long int", @"unsigned long int", @"signed long int" |
| 471 | \item @"long long", @"unsigned long long", @"signed long long" |
| 472 | \item @"long long int", @"unsigned long long int", @"signed long long int" |
| 473 | \item @"float", @"double", @"long double" |
| 474 | \end{itemize} |
| 475 | All of these have their usual C meanings. |
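
For example, by the rules above each of the following specifier sequences
is acceptable. These are fragments of declarations, not complete items.
\begin{listing}
unsigned long int       /* listed above */
long unsigned           /* `unsigned long', reordered */
const char              /* `char' with a qualifier */
volatile int unsigned   /* `unsigned int', reordered, with a qualifier */
\end{listing}
By contrast, @"signed float" and @"short double" are not permitted, since
neither appears in the list in any order.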
| 476 | |
| 477 | \subsubsection{Declarators} \label{sec:syntax.c-types.declarator} |
| 478 | \begin{grammar} |
| 479 | <declarator>$[k]$ ::= @<pointer>^* <primary-declarator>$[k]$ |
| 480 | |
| 481 | <primary-declarator>$[k]$ ::= $k$ |
| 482 | \alt "(" <declarator>$[k]$ ")" |
| 483 | \alt <primary-declarator>$[k]$ @<declarator-suffix> |
| 484 | |
| 485 | <pointer> ::= "*" @<qualifier>^* |
| 486 | |
| 487 | <declarator-suffix> ::= "[" <c-fragment> "]" |
| 488 | \alt "(" <arguments> ")" |
| 489 | |
| 490 | <arguments> ::= $\epsilon$ | "..." |
| 491 | \alt <list>$[\mbox{@<argument>}]$ @["," "..."@] |
| 492 | |
| 493 | <argument> ::= @<declaration-specifier>^+ <argument-declarator> |
| 494 | |
| 495 | <argument-declarator> ::= <declarator>$[\mbox{@<identifier> @! $\epsilon$}]$ |
| 496 | |
| 497 | <simple-declarator> ::= <declarator>$[\mbox{@<identifier>}]$ |
| 498 | |
| 499 | <dotted-name> ::= <identifier> "." <identifier> |
| 500 | |
| 501 | <dotted-declarator> ::= <declarator>$[\mbox{@<dotted-name>}]$ |
| 502 | \end{grammar} |
| 503 | |
| 504 | The declarator syntax is taken from C, but with some differences. |
| 505 | \begin{itemize} |
| 506 | \item Array dimensions are uninterpreted @<c-fragment>s, terminated by a |
| 507 | closing square bracket. This allows array dimensions to contain arbitrary |
| 508 | constant expressions. |
| 509 | \item A declarator may have either a single @<identifier> at its centre or a |
| 510 | pair of @<identifier>s separated by a @`.'; this is used to refer to |
| 511 | slots or messages defined in superclasses. |
| 512 | \end{itemize} |
| 513 | The remaining differences are (I hope) a matter of presentation rather than |
| 514 | substance. |
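
As a sketch of the two differences, here are three declarators together
with their declaration specifiers. The names @"buf", @"BUFSZ", @"names",
@"eg" and @"frob" are invented for the example.
\begin{listing}
char buf[2 * BUFSZ]             /* the dimension is a C fragment */
char *names[16]                 /* an array of pointers to char */
int eg.frob(int n, ...)         /* a dotted name at the centre */
\end{listing}
The last line is the kind of declarator that might refer to a message
@"frob" defined by a superclass whose nickname is @"eg".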
| 515 | |
| 516 | \subsection{Defining classes} \label{sec:syntax.class} |
| 517 | |
| 518 | \begin{grammar} |
| 519 | <class-definition> ::= <class-forward-declaration> |
| 520 | \alt <full-class-definition> |
| 521 | \end{grammar} |
| 522 | |
| 523 | \subsubsection{Forward declarations} \label{sec:class.class.forward} |
| 524 | \begin{grammar} |
| 525 | <class-forward-declaration> ::= "class" <identifier> ";" |
| 526 | \end{grammar} |
| 527 | |
| 528 | A @<class-forward-declaration> informs Sod that an @<identifier> will be used |
| 529 | to name a class which is currently undefined. Forward declarations are |
| 530 | necessary in order to resolve certain kinds of circularity. For example, |
| 531 | \begin{listing} |
| 532 | class Sub; |
| 533 | |
| 534 | class Super : SodObject { |
| 535 | Sub *sub; |
| 536 | }; |
| 537 | |
| 538 | class Sub : Super { |
| 539 | /* ... */ |
| 540 | }; |
| 541 | \end{listing} |
| 542 | |
| 543 | \subsubsection{Full class definitions} \label{sec:class.class.full} |
| 544 | |
| 545 | \begin{grammar} |
| 546 | <full-class-definition> ::= |
| 547 | @[<properties>@] |
| 548 | "class" <identifier> ":" <list>$[\mbox{@<identifier>}]$ |
| 549 | "{" @<class-item>^* "}" |
| 550 | |
| 551 | <class-item> ::= <slot-item> ";" |
| 552 | \alt <initializer-item> ";" |
| 553 | \alt <message-item> |
| 554 | \alt <method-item> |
| 555 | \end{grammar} |
| 556 | |
| 557 | A full class definition provides a complete description of a class. |
| 558 | |
| 559 | The first @<identifier> gives the name of the class. It is an error to |
| 560 | give the name of an existing class (other than a forward-referenced class), |
| 561 | or an existing type name. It is conventional to give classes `MixedCase' |
| 562 | names, to distinguish them from other kinds of identifiers. |
| 563 | |
| 564 | The @<list>$[\mbox{@<identifier>}]$ names the direct superclasses for the new class. It |
| 565 | is an error if any of these @<identifier>s does not name a defined class. |
| 566 | |
| 567 | The @<properties> provide additional information. The standard class |
| 568 | properties are as follows. |
| 569 | \begin{description} |
| 570 | \item[@"lisp_class"] The name of the Lisp class to use within the translator |
| 571 | to represent this class. The property value must be an identifier; the |
| 572 | default is @"sod_class". Extensions may define classes with additional |
| 573 | behaviour, and may recognize additional class properties. |
| 574 | \item[@"metaclass"] The name of the Sod metaclass for this class. In the |
| 575 | generated code, a class is itself an instance of another class -- its |
| 576 | \emph{metaclass}. The metaclass defines which slots the class will have, |
| 577 | which messages it will respond to, and what its behaviour will be when it |
| 578 | receives them. The property value must be an identifier naming a defined |
| 579 | subclass of @"SodClass". The default metaclass is @"SodClass". |
| 580 | %%% FIXME xref to theory |
| 581 | \item[@"nick"] A nickname for the class, to be used to distinguish it from |
| 582 | other classes in various limited contexts. The property value must be an |
| 583 | identifier; the default is constructed by forcing the class name to |
| 584 | lower-case. |
| 585 | \end{description} |
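
As an illustration, the following sketch spells out the default value of
each standard property explicitly, so it should mean the same as the
definition without a property set. It follows the style of the other
listings in this chapter.
\begin{listing}
[lisp_class = sod_class, metaclass = SodClass, nick = example]
class Example : SodObject {
  /* ... */
};
\end{listing}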
| 586 | |
| 587 | The class body consists of a sequence of @<class-item>s enclosed in braces. |
| 588 | These items are discussed in the following sections. |
| 589 | |
| 590 | \subsubsection{Slot items} \label{sec:sntax.class.slot} |
| 591 | |
| 592 | \begin{grammar} |
| 593 | <slot-item> ::= |
| 594 | @[<properties>@] |
| 595 | @<declaration-specifier>^+ <list>$[\mbox{@<init-declarator>}]$ |
| 596 | |
| 597 | <init-declarator> ::= <simple-declarator> @["=" <initializer>@] |
| 598 | \end{grammar} |
| 599 | |
| 600 | A @<slot-item> defines one or more slots. All instances of the class and any |
| 601 | subclass will contain these slots, with the names and types given by the |
| 602 | @<declaration-specifier>s and the @<declarator>s. Slot declarators may not |
| 603 | contain dotted names. |
| 604 | |
| 605 | It is not possible to declare a slot with function type: such an item is |
| 606 | interpreted as being a @<message-item> or @<method-item>. Pointers to |
| 607 | functions are fine. |
| 608 | |
| 609 | An @<initializer>, if present, is treated as if a separate |
| 610 | @<initializer-item> containing the slot name and initializer were present. |
| 611 | For example, |
| 612 | \begin{listing} |
| 613 | [nick = eg] |
| 614 | class Example : Super { |
| 615 | int foo = 17; |
| 616 | }; |
| 617 | \end{listing} |
| 618 | means the same as |
| 619 | \begin{listing} |
| 620 | [nick = eg] |
| 621 | class Example : Super { |
| 622 | int foo; |
| 623 | eg.foo = 17; |
| 624 | }; |
| 625 | \end{listing} |
| 626 | |
| 627 | \subsubsection{Initializer items} \label{sec:syntax.class.init} |
| 628 | |
| 629 | \begin{grammar} |
| 630 | <initializer-item> ::= @["class"@] <list>$[\mbox{@<slot-initializer>}]$ |
| 631 | |
| 632 | <slot-initializer> ::= <dotted-name> "=" <initializer> |
| 633 | |
| 634 | <initializer> ::= "{" <c-fragment> "}" | <c-fragment> |
| 635 | \end{grammar} |
| 636 | |
| 637 | An @<initializer-item> provides an initial value for one or more slots. If |
| 638 | prefixed by @"class", then the initial values are for class slots (i.e., |
| 639 | slots of the class object itself); otherwise they are for instance slots. |
| 640 | |
| 641 | The first component of the @<dotted-name> must be the nickname of one of the |
| 642 | class's superclasses (including itself); the second must be the name of a |
| 643 | slot defined in that superclass. |
| 644 | |
| 645 | The initializer has one of two forms. |
| 646 | \begin{itemize} |
| 647 | \item A @<c-fragment> enclosed in braces denotes an aggregate initializer. |
| 648 | This is suitable for initializing structure, union or array slots. |
| 649 | \item A @<c-fragment> \emph{not} beginning with an open brace is a `bare' |
| 650 | initializer, and continues until the next @`,' or @`;' which is not within |
| 651 | nested brackets. Bare initializers are suitable for initializing scalar |
| 652 | slots, such as pointers or integers, and strings. |
| 653 | \end{itemize} |
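
The following sketch shows both forms. The slot names and the macro @"MAX"
are invented for the example.
\begin{listing}
[nick = eg]
class Example : Super {
  int limit;
  int origin[2];
  eg.limit = MAX(16, 64), eg.origin = { 0, 0 };
};
\end{listing}
The comma inside the parentheses of @"MAX(16, 64)" does not end the bare
initializer, because it is within nested brackets; the comma after the
closing parenthesis does.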
| 654 | |
| 655 | \subsubsection{Message items} \label{sec:syntax.class.message} |
| 656 | |
| 657 | \begin{grammar} |
| 658 | <message-item> ::= |
| 659 | @[<properties>@] |
| 660 | @<declaration-specifier>^+ <declarator> @[<method-body>@] |
| 661 | \end{grammar} |
| 662 | |
| 663 | \subsubsection{Method items} \label{sec:syntax.class.method} |
| 664 | |
| 665 | \begin{grammar} |
| 666 | <method-item> ::= |
| 667 | @[<properties>@] |
| 668 | @<declaration-specifier>^+ <declarator> <method-body> |
| 669 | |
| 670 | <method-body> ::= "{" <c-fragment> "}" | "extern" ";" |
| 671 | \end{grammar} |
| 672 | |
| 673 | %%%----- That's all, folks -------------------------------------------------- |
| 674 | |
| 675 | %%% Local variables: |
| 676 | %%% mode: LaTeX |
| 677 | %%% TeX-master: "sod.tex" |
| 678 | %%% TeX-PDF-mode: t |
| 679 | %%% End: |