.\" -*-nroff-*-
.de VS
.sp 1
.RS
.nf
.ft B
..
.de VE
.ft R
.fi
.RE
.sp 1
..
.TH atom 3 "21 January 2001" "Straylight/Edgeware" "mLib utilities library"
.SH NAME
atom \- atom table manager
.\" @atom_createtable
.\" @atom_destroytable
.\" @atom_intern
.\" @atom_nintern
.\" @atom_gensym
.\" @atom_name
.\" @atom_len
.\" @atom_hash
.\" @atom_mkiter
.\" @atom_next
.\"
.\" @ATOM_GLOBAL
.\" @INTERN
.\" @GENSYM
.\" @ATOM_NAME
.\" @ATOM_LEN
.\" @ATOM_HASH
.\"
.SH SYNOPSIS
.nf
.B "#include <mLib/atom.h>"

.B "typedef struct { ...\& } atom_table;"
.B "typedef struct { ...\& } atom;"

.BI "void atom_createtable(atom_table *" t );
.BI "void atom_destroytable(atom_table *" t );

.BI "atom *atom_intern(atom_table *" t ", const char *" p );
.BI "atom *atom_nintern(atom_table *" t ", const char *" p ", size_t " n );
.BI "atom *atom_gensym(atom_table *" t );
.BI "atom *INTERN(const char *" p );
.BI "atom *GENSYM;"

.BI "const char *atom_name(const atom *" a );
.BI "size_t atom_len(const atom *" a );
.BI "uint32 atom_hash(const atom *" a );
.BI "const char *ATOM_NAME(const atom *" a );
.BI "size_t ATOM_LEN(const atom *" a );
.BI "uint32 ATOM_HASH(const atom *" a );

.BI "void atom_mkiter(atom_iter *" i ", atom_table *" t );
.BI "atom *atom_next(atom_iter *" i );

.BI "extern atom_table *ATOM_GLOBAL;"
.fi
.SH DESCRIPTION
The
.B atom
functions and macros implement a data type similar to immutable strings,
with one additional property: the addresses of two atoms from the same
table are equal if and only if they contain the same text. Atom strings
don't have to be null-terminated, although the interface is simpler if
you know that all of your atoms are null-terminated. It's also possible
to make
.IR "uninterned atoms" :
see below.
.PP
If a necessary memory allocation fails during an atom operation, the
exception
.B EXC_NOMEM
is raised.
.PP
Atoms are useful for speeding up string comparisons, since two atoms
from the same table can be compared by address, and for saving memory
when many possibly-identical strings need storing.
.PP
There is a global atom table, named
.BR ATOM_GLOBAL ,
available for general use. You can initialize your own atom table
either if you want to ensure that its atoms are not shared with some
other table, or if you want to be able to free the atoms later. A
private atom table has the type
.BR atom_table ;
initialize it using the function
.B atom_createtable
and, when you're finished with it, free it using
.BR atom_destroytable .
.SS "Creating atoms from strings"
The process of making atoms from strings is called
.IR interning .
The function
.B atom_nintern
takes an atom table, a string, and a length, and returns an atom with
the same text. If your string is null-terminated, you can instead use
.BR atom_intern ,
which has no length argument; if, in addition, you want to use the
global atom table, you can use the single-argument
.B INTERN
macro, which takes just a null-terminated string.
.PP
A terminating null byte is always appended to an atom name. This is not
considered to be a part of the name itself, and does not contribute to
the atom's length as reported by the
.B ATOM_LEN
macro.
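.PP
As a rough sketch of the calls described above, the following interns
the same text in the global table in three different ways and checks
that all three calls yield the same address. The function name
.B same_text_same_atom
is made up for this example and is not part of the library.
.VS
#include <stddef.h>
#include <mLib/atom.h>

/* Hypothetical helper: nonzero if all three atoms coincide. */
int same_text_same_atom(void)
{
  /* Only the first 7 bytes of buf are interned below. */
  static const char buf[] = "example text";
  atom *a = INTERN("example");
  atom *b = atom_intern(ATOM_GLOBAL, "example");
  atom *c = atom_nintern(ATOM_GLOBAL, buf, 7);

  return (a == b && b == c && ATOM_LEN(c) == 7);
}
.VE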
.SS "Uninterned atoms"
You can make an atom which is guaranteed to be distinct from every other
atom, and has no sensible text string, by calling
.BR atom_gensym ,
passing it the address of your atom table. The macro
.B GENSYM
(which doesn't look like a function, and has no parentheses following
it!) will return a unique atom from the global table. Uninterned atoms
have a generated name of the form
.RB ` *gen- \fInnn * '
where
.I nnn
is an atom-table-specific sequence number. This text is there as a
debugging convenience, and doesn't mean that the uninterned atom has the
same address as an interned atom with the same text.
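.PP
A small sketch of what this means in practice, assuming only the
behaviour described above; the function name
.B gensyms_are_distinct
is invented for illustration. Two uninterned atoms differ from each
other, and interning the generated name of one of them gives a
different atom again.
.VS
#include <mLib/atom.h>

/* Hypothetical helper: nonzero if the atoms are all distinct. */
int gensyms_are_distinct(void)
{
  atom *g1 = GENSYM;                    /* note: no parentheses */
  atom *g2 = atom_gensym(ATOM_GLOBAL);
  atom *named = INTERN(ATOM_NAME(g1));  /* same text, interned */

  return (g1 != g2 && g1 != named && g2 != named);
}
.VE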
.SS "Other enquiries about atoms"
Atoms can be interrogated to find their names and hashes. The macro
.B ATOM_NAME
returns a pointer to the atom's name (its text);
.B ATOM_LEN
returns the length of the atom's name, excluding the terminating null;
and
.B ATOM_HASH
returns the atom's hash value, which is useful if you want to use the
atom as a key in some other structure. The lower-case functions
.BR atom_name ,
.B atom_len
and
.B atom_hash
have the same effect as the corresponding macros; there is little point
in using the functions.
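.PP
The following sketch shows one way these enquiries might be used. The
names
.BR NBUCKET ,
.B bucket_of
and
.B name_is
are hypothetical and not part of the library. The comparison uses the
length rather than relying on the terminating null, since an atom's
name need not have come from a null-terminated string.
.VS
#include <stddef.h>
#include <string.h>
#include <mLib/atom.h>

#define NBUCKET 64u     /* hypothetical bucket count, power of two */

/* Hypothetical helper: choose a bucket for a in some other table. */
size_t bucket_of(const atom *a)
{
  return (ATOM_HASH(a) & (NBUCKET - 1));
}

/* Hypothetical helper: nonzero if a's name is exactly the string p. */
int name_is(const atom *a, const char *p)
{
  size_t n = strlen(p);

  return (ATOM_LEN(a) == n && memcmp(ATOM_NAME(a), p, n) == 0);
}
.VE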
.SS "Enumerating atoms"
You can iterate over the atoms in an atom table. The
.B atom_mkiter
function initializes an iterator object, of type
.BR atom_iter ,
to iterate over a particular atom table;
.B atom_next
returns the next atom from the iterator. Atoms are not returned in any
particular order.
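.PP
As an illustration, the sketch below counts the atoms in a private
table. It assumes that
.B atom_next
returns a null pointer once every atom has been visited, which this
page does not state; check the header before relying on it. The names
.B count_atoms
and
.B count_demo
are made up for the example.
.VS
#include <stddef.h>
#include <mLib/atom.h>

/* Hypothetical helper: count the atoms currently in table t.
 * Assumes atom_next yields a null pointer when the atoms run out.
 */
size_t count_atoms(atom_table *t)
{
  atom_iter i;
  size_t n = 0;

  atom_mkiter(&i, t);
  while (atom_next(&i)) n++;
  return (n);
}

/* Hypothetical usage: a private table holding two distinct texts. */
size_t count_demo(void)
{
  atom_table t;
  size_t n;

  atom_createtable(&t);
  atom_intern(&t, "red");
  atom_intern(&t, "green");
  atom_intern(&t, "red");       /* same atom as the first call */
  n = count_atoms(&t);
  atom_destroytable(&t);        /* frees the table and its atoms */
  return (n);
}
.VE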
.SH SEE ALSO
.BR assoc (3),
.BR hash (3),
.BR mLib (3).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>