+\versionid $Id$
+
\C{input} Halibut input format
This chapter describes the format in which you should write
example, if the text is capitalised, that's usually good enough. If
I talk about the Pentium's \cw{EAX} and \cw{EDX} registers, for
example, you don't need quotes to notice that those are special; so
-I would write that in Halibut as \q{\c{the Pentium's \\cw\{EAX\} and
-\\cw\{EDX\} registers}}. But if I'm talking about the Unix command
+I would write that in Halibut as \cq{the Pentium's \\cw\{EAX\} and
+\\cw\{EDX\} registers}. But if I'm talking about the Unix command
\c{man}, which is an ordinary English word in its own right, a reader
might be slightly confused if it appeared in the middle of a
-sentence undecorated; so I would write that as \q{\c{the Unix command
-\\c\{man\}}}.
+sentence undecorated; so I would write that as \cq{the Unix command
+\\c\{man\}}.
In summary:
In really extreme cases, you might want Halibut to use \i{quotation
marks} even in output formats which can change font. In
\k{input-date}, for example, I mention the special formatting
-command \q{\cw{\\.}}. If that appeared at the end of a sentence
+command \cq{\\.}. If that appeared at the end of a sentence
\e{without} the quotes, then the two adjacent full stops would look
-pretty strange even if they were obviously in different fonts. So I
-used the \c{\\q} command to provide my own set of quotes, and then
-used \c{\\cw} rather than \c{\\c} to ensure that none of Halibut's
-output formats would add another set of quotes:
+pretty strange even if they were obviously in different fonts.
+
+For this, Halibut supports the \i\c{\\cq} command, which is exactly
+equivalent to using \c{\\q} to provide quotes and then using
+\c{\\cw} inside the quotes. So in the paragraph above, for example,
+I wrote
+
+\c the special formatting command \cq{\\.}.
+
+and I could equivalently have written
\c the special formatting command \q{\cw{\\.}}.
}
and in every output format Halibut generates, it will choose the
-best quote characters available to it in that format.
+best quote characters available to it in that format. (The quote
+characters to use can be configured with the \c{\\cfg} command.)
You can still use the ordinary quote characters of your choice if
you prefer; or you could even use the \c{\\u} command (see
built-in \c{\\q} command in most cases, because it's simple and does
the best it can everywhere.
-(Note that if you're using the \c{\\c} or \c{\\cw} commands to
-display literal computer code, you probably \e{will} want to use
-literal \i{ASCII quote characters}, because it is likely to matter
-precisely which quote character you use.)
+If you're using the \c{\\c} or \c{\\cw} commands to display literal
+computer code, you will probably want to use literal \i{ASCII quote
+characters}, because it is likely to matter precisely which quote
+character you use. In fact, Halibut \e{disallows} the use
+of \c{\\q} within either of \c{\\c} and \c{\\cw}, since this
+simplifies some of the output formats.
\S{input-nonbreaking} \c{\\-} and \c{\\_}: \ii{Non-breaking hyphens}
and \I{non-breaking spaces}spaces
(such as writing \c{\\dateZ}) then Halibut will assume you are
trying to invoke the name of a macro command you have defined
yourself, and will complain if no such command exists. To get round
-this you can use the special \q{\cw{\\.}} do-nothing command. See
+this you can use the special \cq{\\.} do-nothing command. See
\k{input-macro} for more about general Halibut command syntax and
-\q{\cw{\\.}}.
+\cq{\\.}.
If you would prefer the date to be generated in a specific format,
you can follow the \c{\\date} command with a format specification in
If you read it in other formats, you may see different results.
-\S{input-xref} \i\c{\\k} and \i\c{\\K}: \ii{Cross-references} to
-other sections
+\S{input-xref} \i\c{\\k} and \I{\\K-upper}\c{\\K}:
+\ii{Cross-references} to other sections
\K{intro-features} mentions that Halibut \I{section numbers}numbers
the sections of your document automatically, and can generate
braces, and does \e{not} need to escape them in the way described in
\k{input-basics}. This is because code paragraphs formatted in this
way are a special case; the intention is that you can just copy and
-paste a lump of code out of your program, put \q{\cw{\\c }} at the
+paste a lump of code out of your program, put \cq{\\c } at the
start of every line, and simply \e{not have to worry} about the
details - you don't have to go through the whole block looking for
characters to escape.
}
+If you really want to, you are allowed to use \c{\\dt} and \c{\\dd}
+without strictly interleaving them (multiple consecutive \c{\\dt}s
+or consecutive \c{\\dd}s, or a description list starting with
+\c{\\dd} or ending with \c{\\dt}). This is probably most useful if
+you are listing a sequence of things with \c{\\dt}, but only some of
+them actually need \c{\\dd} descriptions. You should \e{not} use
+multiple consecutive \c{\\dd}s to provide a multi-paragraph
+definition of something; that's what \c{\\lcont} is for, as
+explained in \k{input-list-continuation}.
+
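+As a minimal sketch (the item names are invented purely for
+illustration), a list in which only the first term needs a
+description might be written like this:
+
+\c \dt Apples
+\c
+\c \dd The only item in this list that needs explaining.
+\c
+\c \dt Oranges
+\c
+\c \dt Pears
+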
\S2{input-list-continuation} \ii{Continuing list items} into further
paragraphs
}
-\S{input-sections} \i\c{\\C}, \i\c{\\H}, \i\c{\\S}, \i\c{\\A},
-\i\c{\\U}: Chapter and \i{section headings}
+\S{input-sections} \I{\\C-upper}\c{\\C}, \i\c{\\H}, \i\c{\\S},
+\i\c{\\A}, \I{\\U-upper}\c{\\U}: Chapter and \i{section headings}
\K{intro-features} mentions that Halibut \I{section
numbering}numbers the sections of your document automatically, and
\dd This defines the overall title of the entire document. This
title is treated specially in some output formats (for example, it's
-used in a \cw{<title>} tag in the HTML output), so it needs a
+used in a \cw{<TITLE>} tag in the HTML output), so it needs a
special paragraph type to point it out.
\dt \i\cw{\\copyright}
papers, books, websites, whatever), you might find a bibliography
feature useful.
-You can define a bibliography entry using the \i\c{\\B} command. This
-looks very like the \c{\\C} command and friends: it expects a
-keyword in braces, followed by some text describing the document
-being referred to. For example:
+You can define a bibliography entry using the \I{\\B-upper}\c{\\B}
+command. This looks very like the \c{\\C} command and friends: it
+expects a keyword in braces, followed by some text describing the
+document being referred to. For example:
\c \B{freds-book} \q{The Taming Of The Mongoose}, by Fred Bloggs.
\c Published by Paperjam & Notoner, 1993.
Sometimes you might want to index a term which is not explicitly
mentioned, but which is highly relevant to the text and you think
that somebody looking up that term in the index might find it useful
-to be directed here. To do this you can use the \i\c{\\I} command,
-to create an \i{\e{invisible} index tag}:
+to be directed here. To do this you can use the \I{\\I-upper}\c{\\I}
+command, to create an \i{\e{invisible} index tag}:
\c If your printer runs out of toner, \I{replacing toner
\c cartridge}here is what to do:
\c{\\i\{frogs\}}, so that you'd end up with two separate index
entries for what really ought to be the same concept.
-\b You might well not want the word \q{\cw{grep}} to appear in the
+\b You might well not want the word \cq{grep} to appear in the
index without explanation; you might prefer it to say something more
\I{rewriting index terms}verbose such as \q{\cw{grep} command}, so
that a user encountering it in the index has some idea of what it is
\c \IM{grep} \cw{grep} command
This will arrange that the set of places in the document where you
-asked Halibut to index \q{\cw{grep}} will be listed under
-\q{\cw{grep} command} rather than just under \q{\cw{grep}}.
+asked Halibut to index \cq{grep} will be listed under
+\q{\cw{grep} command} rather than just under \cq{grep}.
You can specify more than one index term in a \c{\\IM} command; so
to merge the index terms \q{frog} and \q{frogs} into a single term,
Halibut discards its default implicit one, and you must then specify
that one explicitly as well if you wanted to keep it.
+\S{input-index-case} Indexing terms that differ only in case
+
+The \e{tags} you use to define an index term (that is, the text in
+the braces after \c{\\i}, \c{\\I} and \c{\\IM}) are treated
+case-insensitively by Halibut. So if, as in this manual itself, you
+need two index terms that differ only in case, doing this will not
+work:
+
+\c The \i\c{\\c} command defines computer code.
+\c
+\c The \i\c{\\C} command defines a chapter.
+
+Halibut will treat these terms as the same, and will fold the two
+sets of references into one combined list (although it will warn you
+that it is doing this). The idea is to ensure that people who forget
+to use \c{\\ii} find out about it rather than Halibut silently
+generating a bad index; checking an index for errors is very hard
+work, so Halibut tries to avoid errors in the first place as much as
+it can.
+
+If you do come across this situation, you will need to define two
+distinguishable index terms. What I did in this manual was something
+like this:
+
+\c The \i\c{\\c} command defines computer code.
+\c
+\c The \I{\\C-upper}\c{\\C} command defines a chapter.
+\c
+\c \IM{\\C-upper} \c{\\C}
+
+The effect of this will be two separate index entries, one reading
+\c{\\c} and the other reading \c{\\C}, pointing to the right places.
+
\H{input-config} \ii{Configuring} Halibut
Halibut uses the \i\c{\\cfg} command to allow you to configure various
configure, and the meaning of the one(s) after that depends on the
first keyword.
-The current list of configuration keywords in the main Halibut code
-is quite small. Here it is in full:
+Each output format supports a range of configuration options of its
+own (and some configuration is shared between similar output formats
+- the PDF and PostScript formats share most of their configuration,
+as described in \k{output-paper}). The configuration keywords for
+each output format are listed in the manual section for that format;
+see \k{output}.
+
+There are also a small number of configuration options which apply
+across all output formats:
\dt \I\cw{\\cfg\{chapter\}}\cw{\\cfg\{chapter\}\{}\e{new chapter name}\cw{\}}
\dd Exactly like \c{chapter}, but changes the name given to
appendices.
+\dt \I\cw{\\cfg\{contents\}}\cw{\\cfg\{contents\}\{}\e{new contents name}\cw{\}}
+
+\dd This changes the name given to the contents section (by default
+\q{Contents}) in back ends which generate one.
+
+\dt \I\cw{\\cfg\{index\}}\cw{\\cfg\{index\}\{}\e{new index name}\cw{\}}
+
+\dd This changes the name given to the index section (by default
+\q{Index}) in back ends which generate one.
+
\dt \I\cw{\\cfg\{input-charset\}}\cw{\\cfg\{input-charset\}\{}\e{character set name}\cw{\}}
\dd This tells Halibut what \i{character set} you are writing your
You can specify any well-known name for any supported character set.
For example, \c{iso-8859-1}, \c{iso8859-1} and \c{iso_8859-1} are
all recognised, \c{GB2312} and \c{EUC-CN} both work, and so on.
+(You can list the character sets known to Halibut by invoking it
+with the \cw{--list-charsets} option; see \k{running-options}.)
This directive takes effect immediately after the \c{\\cfg} command.
-All text after that in the file is expected to be in the new
-character set. You can even change character set several times
-within a file if you really want to.
+All text after that until the end of the input file is expected to be
+in the new character set. You can even change character set several
+times within a file if you really want to.
When Halibut reads the input file, everything you type will be
converted into \i{Unicode} from the character set you specify here,
}
+\dt \I\cw{\\cfg\{quotes\}}\cw{\\cfg\{quotes\}\{}\e{open-quote}\cw{\}\{}\e{close-quote}\cw{\}}[\cw{\{}\e{open-quote}\cw{\}\{}\e{close-quote}...\cw{\}}]
+
+\dd This specifies the quote characters which should be used. You
+should separately specify the open and close quote marks; each
+quote mark can be one character (\cw{\\cfg\{quotes\}\{`\}\{'\}}), or
+more than one (\cw{\\cfg\{quotes\}\{<<\}\{>>\}}).
+
+\lcont{
+
+\cw{\\cfg\{quotes\}} can be overridden by configuration directives for
+each individual backend (see \k{output}); it is a convenient way of
+setting quote characters for all backends at once.
+
+All backends use these characters in response to the \c{\\q} command
+(see \k{input-quotes}). Some (such as the text backend) use them for
+other purposes too.
+
+You can specify multiple fallback options in this command (a pair of
+open and close quotes, each in their own braces, then another pair,
+then another if you like), and Halibut will choose the first pair
+which the output character set supports (Halibut will always use a
+matching pair). This lets you configure quote characters once,
+generate output in several different character sets, and have
+Halibut adapt each time to make the best use of the current
+encoding. For example, you might write
+
+\c \cfg{quotes}{\u201c}{\u201d}{"}{"}
+
+and Halibut would use the Unicode matched double quote characters if
+possible, and fall back to ASCII double quotes otherwise. If the
+output character set were to contain U+201C but not U+201D, then
+Halibut would fall back to using the ASCII double quote character as
+\e{both} open and close quotes. (No known character set is that
+silly; I mention it only as an example.)
+
+\cw{\\cfg\{quotes\}} (and the backend-specific versions) apply to the
+\e{entire} output; it's not possible to change quote characters
+partway through the output.
+
+}
+
In addition to these configuration commands, there are also
configuration commands provided by each individual output format.
These configuration commands are discussed along with each output
\c \cfg{chapter}{Chapter}
\c \cfg{section}{Section}
\c \cfg{appendix}{Appendix}
+\c \cfg{contents}{Contents}
+\c \cfg{index}{Index}
\c \cfg{input-charset}{ASCII}
+The default for \cw{\\cfg\{input-charset\}} can be changed with the
+\cw{--input-charset} option; see \k{running-options}. The default
+settings for \cw{\\cfg\{quotes\}} are backend-specific; see
+\k{output}.
+
\H{input-macro} Defining \i{macros}
If there's a complicated piece of Halibut source which you think