From: Mark Wooding
Date: Sat, 27 Jul 2019 12:27:50 +0000 (+0100)
Subject: doc/syntax.tex: Hoist `C-types' to section level.
X-Git-Url: https://git.distorted.org.uk/~mdw/sod/commitdiff_plain/a58527f39183dd0b06f0eb1eebc86a6909f94774

doc/syntax.tex: Hoist `C-types' to section level.
---

diff --git a/doc/syntax.tex b/doc/syntax.tex
index 9a02090..18690fc 100644
--- a/doc/syntax.tex
+++ b/doc/syntax.tex
@@ -195,6 +195,171 @@ or bracket. The first such delimiter character which is not enclosed in brackets, braces or parenthesis ends the fragment.
 %%%--------------------------------------------------------------------------
+\section{C types} \label{sec:syntax.type}
+
+Sod's syntax for C types closely mirrors the standard C syntax. A C type has
+two parts: a sequence of @s and a @. In
+Sod, a type must contain at least one @ (i.e.,
+`implicit @|int|' is forbidden), and storage-class specifiers are not
+recognized.
+
+
+\subsection{Declaration specifiers} \label{sec:syntax.type.declspec}
+
+\begin{grammar}
+ ::=
+\alt "struct" | "union" | "enum"
+\alt "void" | "char" | "int" | "float" | "double"
+\alt "short" | "long"
+\alt "signed" | "unsigned"
+\alt "bool" | "_Bool"
+\alt "imaginary" | "_Imaginary" | "complex" | "_Complex"
+\alt
+\alt
+\alt
+\alt
+
+ ::= | "const" | "volatile" | "restrict"
+
+ ::= @^+
+
+ ::= "(" ")"
+
+ ::= "atomic" | "_Atomic"
+
+ ::= "(" ")"
+
+ ::= "alignas" "_Alignas"
+
+ ::=
+\end{grammar}
+
+Declaration specifiers may appear in any order. However, not all
+combinations are permitted. A declaration specifier must consist of zero or
+more @s, zero or more @s, and one of the
+following, up to reordering:
+\begin{itemize}
+\item @;
+\item @;
+\item @"struct" @; @"union" @; @"enum" @;
+\item @"void";
+\item @"_Bool", @"bool";
+\item @"char"; @"unsigned char"; @"signed char";
+\item @"short", @"signed short", @"short int", @"signed short int";
+ @"unsigned short", @"unsigned short int";
+\item @"int", @"signed", @"signed int"; @"unsigned", @"unsigned int";
+\item @"long", @"signed long", @"long int", @"signed long int"; @"unsigned
+ long", @"unsigned long int";
+\item @"long long", @"signed long long", @"long long int", @"signed long long
+ int"; @"unsigned long long", @"unsigned long long int";
+\item @"float"; @"double"; @"long double";
+\item @"float _Imaginary", @"float imaginary"; @"double _Imaginary", @"double
+ imaginary"; @"long double _Imaginary", @"long double imaginary";
+\item @"float _Complex", @"float complex"; @"double _Complex", @"double
+ complex"; @"long double _Complex", @"long double complex".
+\end{itemize}
+All of these have their usual C meanings. Groups separated by commas mean
+the same thing, and Sod will not preserve the distinction.
+
+Almost all of these mean the same as they do in C. There are some minor
+differences:
+\begin{itemize}
+\item In C, the `tag' namespace is shared between @|struct|, @|union|, and
+ @|enum|; Sod has three distinct namespaces for tags. This may be fixed in
+ the future.
+\item The @ production is a syntactic extension point, where
+ extensions can introduce their own additions to the type system.
+\end{itemize}
+
+C standards from C99 onwards have tended to introduce new keywords beginning
+with an underscore followed by an uppercase letter, so as to avoid conflicts
+with existing code. More conventional spellings are then provided by macros
+in new header files. For example, C99 introduced @"_Bool", and a header file
+@|| which defines the macro @|bool|. Sod recognizes both the ugly
+underscore names and the more conventional macro names on input, but always
+emits the ugly names. This doesn't cause a compatibility problem in Sod,
+because Sod's parser recognizes keywords only in the appropriate context.
+For example, the (ill-advised) slot declaration
+\begin{prog}
+ bool bool;
+\end{prog}
+is completely acceptable, and will cause the C structure member
+\begin{prog}
+ \_Bool bool;
+\end{prog}
+to be emitted on output, which will be acceptable to C as long as
+@|| is not included.
+
+A @ is an identifier which has been declared as being a type name,
+using the @"typename" or @"class" definitions. The following type names are
+defined in the built-in module.
+\begin{itemize}
+\item @|va_list|
+\item @|size_t|
+\item @|ptrdiff_t|
+\item @|wchar_t|
+\end{itemize}
+
+
+\subsection{Declarators} \label{sec:syntax.type.declarator}
+
+\begin{grammar}
+$[k, a]$ ::= @^* $[k, a]$
+
+$[k, a]$ ::= $k$
+\alt "(" $[k, a]$ ")"
+\alt $[k, a]$ @$[a]$
+
+ ::= "*" @^*
+
+$[a]$ ::= "[" "]"
+\alt "(" $a$ ")"
+
+ ::= $\epsilon$ | "\dots"
+\alt $[\mbox{@}]$ @["," "\dots"@]
+
+ ::= @^+
+
+ ::= $[\epsilon, \mbox{@}]$
+
+ ::=
+ $[\mbox{@ @! $\epsilon$}, \mbox{@}]$
+
+ ::=
+ $[\mbox{@}, \mbox{@}]$
+\end{grammar}
+
+The declarator syntax is taken from C, but with some differences.
+\begin{itemize}
+\item Array dimensions are uninterpreted @, terminated by a
+ closing square bracket. This allows array dimensions to contain arbitrary
+ constant expressions.
+\item A declarator may have either a single @ at its centre or a
+ pair of @s separated by a @`.'; this is used to refer to
+ slots or messages defined in superclasses.
+\end{itemize}
+The remaining differences are (I hope) a matter of presentation rather than
+substance.
+
+There is additional syntax to support messages and methods which accept
+keyword arguments.
+
+\begin{grammar}
+ ::= @["=" @]
+
+ ::=
+ @[$[\mbox{@}]$@]
+ "?" @[$[\mbox{@}]$@]
+
+ ::= @!
+
+ ::= "."
+
+$[k]$ ::=
+ $[k, \mbox{@}]$
+\end{grammar}
+
+%%%--------------------------------------------------------------------------
 \section{Module syntax} \label{sec:syntax.module}
 \begin{grammar}
@@ -400,137 +565,6 @@ function.
 %%% FIXME crossref to extension docs
-\subsection{C types} \label{sec:syntax.module.types}
-
-Sod's syntax for C types closely mirrors the standard C syntax. A C type has
-two parts: a sequence of @s and a @. In
-Sod, a type must contain at least one @ (i.e.,
-`implicit @"int"' is forbidden), and storage-class specifiers are not
-recognized.
-
-\subsubsection{Declaration specifiers}
-\begin{grammar}
- ::=
-\alt "struct" | "union" | "enum"
-\alt "void" | "char" | "int" | "float" | "double"
-\alt "short" | "long"
-\alt "signed" | "unsigned"
-\alt "bool" | "_Bool"
-\alt "imaginary" | "_Imaginary" | "complex" | "_Complex"
-\alt
-\alt
-\alt
-
- ::= | "const" | "volatile" | "restrict"
-
- ::= @^+
-
- ::=
- "(" ")"
-
- ::= "atomic" | "_Atomic"
-
- ::= "(" ")"
-
- ::= "alignas" "_Alignas"
-
- ::=
-\end{grammar}
-
-A @ is an identifier which has been declared as being a type name,
-using the @"typename" or @"class" definitions. The following type names are
-defined in the built-in module.
-\begin{itemize}
-\item @"va_list"
-\item @"size_t"
-\item @"ptrdiff_t"
-\item @"wchar_t"
-\end{itemize}
-
-Declaration specifiers may appear in any order. However, not all
-combinations are permitted. A declaration specifier must consist of zero or
-more @s, zero or more @s, and one of the
-following, up to reordering.
-\begin{itemize}
-\item @
-\item @
-\item @"struct" @, @"union" @, @"enum" @
-\item @"void"
-\item @"_Bool", @"bool"
-\item @"char", @"unsigned char", @"signed char"
-\item @"short", @"unsigned short", @"signed short"
-\item @"short int", @"unsigned short int", @"signed short int"
-\item @"int", @"unsigned int", @"signed int", @"unsigned", @"signed"
-\item @"long", @"unsigned long", @"signed long"
-\item @"long int", @"unsigned long int", @"signed long int"
-\item @"long long", @"unsigned long long", @"signed long long"
-\item @"long long int", @"unsigned long long int", @"signed long long int"
-\item @"float", @"double", @"long double"
-\item @"float _Imaginary", @"double _Imaginary", @"long double _Imaginary"
-\item @"float imaginary", @"double imaginary", @"long double imaginary"
-\item @"float _Complex", @"double _Complex", @"long double _Complex"
-\item @"float complex", @"double complex", @"long double complex"
-\end{itemize}
-All of these have their usual C meanings.
-
-\subsubsection{Declarators}
-\begin{grammar}
-$[k, a]$ ::= @^* $[k, a]$
-
-$[k, a]$ ::= $k$
-\alt "(" $[k, a]$ ")"
-\alt $[k, a]$ @$[a]$
-
- ::= "*" @^*
-
-$[a]$ ::= "[" "]"
-\alt "(" $a$ ")"
-
- ::= $\epsilon$ | "\dots"
-\alt $[\mbox{@}]$ @["," "\dots"@]
-
- ::= @^+
-
- ::= $[\epsilon, \mbox{@}]$
-
- ::=
- $[\mbox{@ @! $\epsilon$}, \mbox{@}]$
-
- ::=
- $[\mbox{@}, \mbox{@}]$
-\end{grammar}
-
-The declarator syntax is taken from C, but with some differences.
-\begin{itemize}
-\item Array dimensions are uninterpreted @, terminated by a
- closing square bracket. This allows array dimensions to contain arbitrary
- constant expressions.
-\item A declarator may have either a single @ at its centre or a
- pair of @s separated by a @`.'; this is used to refer to
- slots or messages defined in superclasses.
-\end{itemize}
-The remaining differences are (I hope) a matter of presentation rather than
-substance.
-
-There is additional syntax to support messages and methods which accept
-keyword arguments.
-
-\begin{grammar}
- ::= @["=" @]
-
- ::=
- @[$[\mbox{@}]$@]
- "?" @[$[\mbox{@}]$@]
-
- ::= @!
-
- ::= "."
-
-$[k]$ ::=
- $[k, \mbox{@}]$
-\end{grammar}
-
-
 \subsection{Class definitions} \label{sec:syntax.module.class}
 \begin{grammar}