X-Git-Url: https://git.distorted.org.uk/~mdw/sod/blobdiff_plain/cbed66ad2c8fd59081c900be8f94956ebe181a23..12949379840101e2d65883f29c5e8f0f6de49e9c:/STYLE diff --git a/STYLE b/STYLE index 49ad150..e85be96 100644 --- a/STYLE +++ b/STYLE @@ -321,12 +321,275 @@ there aren't implementation-specific declarations with crazy syntax mixed in there, but it's more work than seems worthwhile. +* C style + +** Language subset and extensions + +I'm trying to support C89 still. There are few really worthwhile +features in C99 and later, though there are some. For now, I want Sod +to continue working if built with a C89 compiler, even if some things -- +most notably the macro sugar for varargs messages -- are +unavailable. + +Similarly, I'll use compiler-specific features if they don't adversely +affect portability. For example, I'll use GCC attributes to improve +compiler diagnostics, but they're wrapped up in preprocessor hacking so +that they won't be noticed by compilers which don't understand them. +I'm generally happy to accept contributions which make similar +improvements for other compilers. + +Sod is supposed to have minimal dependencies. It should be able to work +in what the ISO C standard names a `freestanding environment', without +most of the standard C library. The keyword-argument library is +carefully split into a piece which is fully portable and a piece which +depends on features which are only available in hosted environments, +like being able to print stuff to ~stderr~, so that users targeting +embedded systems have an easy porting job. + +** Naming + +I usually give local variables, arguments, and structure members very +short names, just one or two characters long. I find that longer names +are harder to distinguish, and take up horizontal space. Besides, +mathematicians have been using single-letter variable names quite +successfully for hundreds of years. + +I usually choose variable names to match their types in an informal way.
+Loop counters are often called ~i~, ~j~, ~k~; generic pointers, and +pointers to bytes or characters, are usually ~p~ or ~q~; a character is +often ~ch~; a ~FILE~ pointer is ~fp~ following long tradition; sizes of +things, in bytes, are ~sz~, while lengths of vectors, in elements, are +~n~. I often name values of, or pointers to, structures or custom types +with the first letter of the type. If I have two things of the same +kind, I'll often double the name of one of them; e.g., if I have two +pointers to ~whatsit~ structures, I might call them ~w~ and ~ww~. + +I don't (any more) give ~typedef~ names to structures or unions. This +makes it possible to have a variable with the same name as the structure +tag without serious trouble. + +In variable names, I tend to just squash pieces of words together; in +longer names, sometimes I'll put in underscores to split things up a +bit. Camel case is bletcherous. + +File-scope names with /internal/ linkage -- i.e., things marked ~static~ +-- generally deserve somewhat longer names. I don't give them any other +kind of marking; e.g., I'd probably name the pointer to the head of a +list of ~foo~ things something like ~foohead~. + +Names with /external/ linkage want more care because they're playing in +a shared global namespace. + +** Layout + +The C indent quantum is two columns. + +Declarations go at the top of functions. I don't put declarations in +inner blocks, and I certainly don't scatter declarations throughout a +block. I find that having the declarations all in one place makes it +easier for me to keep track of what things the function is going to be +thinking about. + +If I can't set a variable to its proper value immediately, I'll leave it +uninitialized until I can. That way, the compiler will warn me if I +forget. + +Most of my style is an attempt to get as much interesting code on the +screen at a time, and still be able to read it.
The short variable +names keep things distinct while keeping statements short; short +statements don't need to be split across multiple lines. And keeping +the overall line length limit low means I can fit more /columns/ of code +on my screen. + +If there are several related variables with the same declaration +specifiers, I'll usually write a single declaration for all of them -- +even if they have different actual types. For example, + +: struct foo f, *fp = &f; + +Note that a ~*~ declarator operator has a space to its left, but never +to its right. (Stroustrup's style horribly misrepresents the underlying +syntax.) + +I will often write multiple statements on a single line, usually to +indicate that these things are part of the same thought, and they +shouldn't be separated. For example, if I'm working through an array of +things, I might have a pointer ~p~ to the element I'm hacking on, and a +count ~n~ of things left to hack; I'll have a loop + +: while (n) { +: /* hack on *p */ +: p++; n--; +: } + +so that the two updates don't get separated. + +I don't wrap braces around individual statements that fit on a single +line. For example, I'll write + +: while (*p == ' ') p++; + +On the other hand, if a single substatement is going to take more than +one line then it gets wrapped in braces. + +I don't write blocks which aren't part of larger compound statements, +e.g., ~if~ or ~while~. I'll write a compound statement on a single line +if I can; but I'll split ~if~ with an ~else~ over two lines. For +example, + +: if (a == 1) x = 0; +: else if (b == 3) { y = 2; z = 1; } +: else w = 15; + +On the other hand, if I can't write all of the branches of an +~if~\relax/\relax ~else if~ ladder like this, then /all/ of the +substatements get their own lines. (I write ~do~\relax/\relax ~while~ +loops in the same way, but this comes up much less frequently.)
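A minimal sketch of how the layout rules above might combine in practice. The functions ~countch~ and ~sgn~ are hypothetical illustrations, not part of Sod, and putting the opening brace of a function body on its own line is an assumption which the text above doesn't settle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the occurrences of the character CH among
 * the N bytes starting at P.
 */
size_t countch(const char *p, size_t n, int ch)
{
  size_t c = 0;                 /* declarations all at the top */

  assert(p || !n);              /* a null P is only OK if N is zero */
  while (n) {
    if (*p == ch) c++;          /* one-line substatement: no braces */
    p++; n--;                   /* the two updates stay together */
  }
  return (c);
}

/* Hypothetical helper: return -1, 0, or +1 according to the sign of X. */
int sgn(int x)
{
  int s;                        /* left uninitialized until it's known */

  if (x < 0) s = -1;
  else if (x > 0) s = +1;
  else s = 0;
  return (s);
}
```

Every branch of the ladder in ~sgn~ fits on one line, so none of them needs braces. If any one branch grew beyond a line, the rule above would put all of the substatements on their own lines.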
+ +If I can't write a block on the same line, then the opening brace goes +on the same line as the statement head, and the closing brace gets its +own line. A trailing ~else~ or ~while~ goes on the same line as the +previous closing brace, if there is one. + +I don't write spaces inside parentheses or square brackets, or between +unary operators and their operands. I always write ~sizeof~ as if it +were a function, even though I know it isn't. I write a single space +either side of non-multiplicative binary operators -- i.e., other than +~*~, ~/~, ~%~, and ~&~; I don't write spaces around multiplicative +operators any more. The comma operator is special, and gets a space +after, but not before. + +If I'm breaking a long line at a binary operator, the break comes +/after/ the operator, not before. + +** Common conventions + +A /predicate/ is a function which answers a yes/no question -- and has +no side-effects. I don't use ~bool~ or similar; predicates return +~int~, such that zero is false and nonzero is true. Predicates usually +have names ending in ~p~ or ~_p~. (Note that function names +beginning ~is...~ are reserved for future ~<ctype.h>~ macros.) + +On the other hand, an /operation/ is a function whose main purpose is to +have an effect -- maybe create a thing, or update some state. In the +absence of better ideas, operations also return ~int~, but zero +indicates success, and nonzero -- usually $-1$ -- indicates failure. + +** Error handling and resource management + +I've tried many techniques. I think the following is the best approach +so far. + +I try to arrange that every type which represents some resource which +might need releasing has an easily recognizable `inert' value which +indicates that the resource has not been acquired. At the top of a +function, I initialize all of the variables which might hold onto +resources to their inert values. At the end of the function, I place a +label, ~end~ or ~fail~.
An ~end~ label is for common cleanup; a ~fail~ +label is for cleanup that's only needed on unsuccessful completion. + + +** Miscellaneous style issues + +I write ~0~, not ~NULL~. Doing this prevents a common error in +null-terminated variable-length argument lists, e.g., ~execlp~, where +~NULL~ is actually an integer ~0~ in disguise and ends up being an ~int~ +where a pointer was wanted. + +I don't usually write redundant comparisons against ~0~, or ~NULL~, or +well-known return codes indicating success. Again, this helps with +compression. I'll write + +: rc = do_something(foo, bar); if (rc) goto end; + +(yes, one line) rather than comparing ~rc~ against some ~STATUS_SUCCESS~ +code or similar. Exception: I still haven't decided whether I prefer +leaving the explicit relational in ~strcmp~ and similar tests. + +I always write parentheses around the expression in a ~return~ +statement. + +In declarations, storage classes come first (e.g., ~static~, ~extern~, +~typedef~), followed by qualifiers (~const~, ~volatile~; I never use +~restrict~), and then the type specifiers, signedness indicators first +where they aren't redundant (so maybe ~signed char~ for special effects, +but never ~signed int~), then length indicators, then the base type. I +omit ~int~ if there are other type specifiers, so ~unsigned~ or ~long~, +rather than ~unsigned int~ or ~long int~. + +The full declarator syntax for function pointers is pretty ugly. I often +simplify it by defining a ~typedef~ for the /function/ type, not the +function pointer type. For example + +: typedef int callbackfn(struct thing */*t*/, void */*p*/); + +I'd then use variables (structure members, arguments, etc.) of type +~callbackfn *~. + +In header files, I comment out argument names to prevent problems with +macros defined by client translation units. Also, I explicitly mark +function declarations as being ~extern~.
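A minimal sketch pulling together the inert-value cleanup pattern from the previous section with several of the conventions above: ~0~ rather than ~NULL~, no redundant comparisons, parenthesized ~return~ expressions, and a ~typedef~ for a function type. The names ~checkfn~, ~checkeq~ and ~withjoined~ are hypothetical, not part of Sod. Starting ~rc~ off at $-1$ as a pessimistic default is an assumption of this sketch, not something the text prescribes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A typedef for the /function/ type; users hold a `checkfn *'. */
typedef int checkfn(const char */*s*/, void */*p*/);

/* Hypothetical operation: succeed (zero) if S equals the string at P. */
int checkeq(const char *s, void *p)
  { return (strcmp(s, p) ? -1 : 0); }

/* Hypothetical operation: join A and B in a fresh buffer and pass the
 * result to FN along with P.  Returns zero on success, -1 on failure.
 */
int withjoined(const char *a, const char *b, checkfn *fn, void *p)
{
  char *s = 0;                  /* inert value: nothing allocated yet */
  size_t na, nb;
  int rc = -1;                  /* assumed default: fail unless told otherwise */

  assert(a && b && fn);
  na = strlen(a); nb = strlen(b);
  s = malloc(na + nb + 1); if (!s) goto end;
  memcpy(s, a, na); memcpy(s + na, b, nb + 1);
  rc = fn(s, p); if (rc) goto end;
  rc = 0;
end:
  free(s);                      /* harmless if S is still inert */
  return (rc);
}
```

Because ~s~ starts out as a null pointer and ~free~ accepts one, a single ~end~ label can serve both the success and failure paths without tracking which resources were actually acquired.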
+ +** Comments and file structuring + +I never use C++-style ~//~ comments except for temporary special +effects. + +If a comment fits on one line, then its closing ~*/~ is on the same +line; otherwise, the ending ~*/~ is on a line by itself, and there's a +spine of ~*~ characters in a column on the left. + +A file starts with a big comment bearing the Emacs ~-*-c-*-~ marker, a +quick description, and copyright and licensing boilerplate. + +Header files are wrapped up with multiple-inclusion and C++ guards, with + +: #ifndef HEADER_H +: #define HEADER_H +: +: #ifdef __cplusplus +: extern "C" { +: #endif + +at the top. + +The rest of the file consists of C code. I don't use page boundaries +~^L~ to split files up. Instead, I use big banner comments for this: + +: /*----- Section title -----------------------------------------------------*/ + +Following long tradition, functions and macros are documented in a +preceding comment which looks like this. + +: /* --- @name@ --- * +: * +: * Arguments: @type fmm@ = a five-minute argument +: * @type fhh@ = the full half-hour +: * +: * Returns: A return value. +: * +: * Use: It does a thing. Otherwise I wouldn't have bothered. +: */ + +Sometimes (rarely) the description of the return value explains +sufficiently what the thing does. If so, the `Use' part can be omitted. +Fragments of C code in this comment are surrounded by ~@~ characters. +There can also be \LaTeX\ maths in here, in ~%$~...\relax ~$%~. + +Files end, as a result of long tradition, with a comment + +: /*----- That's all, folks -------------------------------------------------*/ + +The closing ~#endif~ of a header file comes after this final comment. + + * COMMENT Emacs cruft #+LATEX_CLASS: strayman ## LocalWords: CLOS ish destructure destructured accessor specializers -## LocalWords: accessors DSLs gensym gensyms +## LocalWords: accessors DSLs gensym gensyms bletcherous Stroustrup +## LocalWords: Stroustrup's signedness ## Local variables: ## mode: org