@@@ man wip
[mLib] / struct / sym.3
CommitLineData
b6b9d458 1.\" -*-nroff-*-
2.de VS
3.sp 1
d66d7727 4.RS
b6b9d458 5.nf
6.ft B
7..
8.de VE
9.ft R
10.fi
11.RE
12.sp 1
13..
d056fbdf
MW
14.ie t \{\
15. de VP
16. sp .4v
17..
18\}
19.el \{\
20. de VP
21. sp
22..
23\}
fbf20b5b 24.TH sym 3 "8 May 1999" "Straylight/Edgeware" "mLib utilities library"
b6b9d458 25.SH NAME
26sym \- symbol table manager
08da152e 27.\" @sym_create
28.\" @sym_destroy
29.\" @sym_find
30.\" @sym_remove
31.\" @sym_mkiter
32.\" @sym_next
33.\"
34.\" @SYM_NAME
0c404077 35.\" @SYM_LEN
36.\" @SYM_HASH
08da152e 37.\"
b6b9d458 38.SH SYNOPSIS
39.nf
40.B "#include <mLib/sym.h>"
d056fbdf 41.PP
4729aa69
MW
42.B "type struct { ...\& } sym_table;"
43.B "type struct { ...\& } sym_base;"
44.B "type struct { ...\& } sym_iter;"
d056fbdf 45.PP
b6b9d458 46.BI "void sym_create(sym_table *" t );
47.BI "void sym_destroy(sym_table *" t );
d056fbdf 48.PP
adec5584
MW
49.ta \w'\fBvoid *sym_find('u
50.BI "void *sym_find(sym_table *" t ,
51.BI " const char *" n ", long " l ,
52.BI " size_t " sz ", unsigned *" f );
b6b9d458 53.BI "void sym_remove(sym_table *" t ", void *" b );
d056fbdf 54.PP
0c404077 55.BI "const char *SYM_NAME(const void *" p );
56.BI "size_t SYM_LEN(const void *" p );
57.BI "uint32 SYM_HASH(const void *" p );
d056fbdf 58.PP
b6b9d458 59.BI "void sym_mkiter(sym_iter *" i ", sym_table *" t );
60.BI "void *sym_next(sym_iter *" i );
61.fi
0c404077 62.SH "DESCRIPTION"
b6b9d458 63The
64.B sym
65functions implement a data structure often described as a dictionary, a
66finite map, an associative array, or a symbol table. It associates
67.I values
68with
69.I keys
70such that the value corresponding to a given key can be found quickly.
71Additionally, all stored associations can be enumerated.
72.PP
73The interface provides an
74.I intrusive
75symbol table. The data objects stored in the table must include a small
76header used by the symbol table manager. This reduces the amount of
77pointer fiddling that needs to be done, and in practice doesn't seem to
78be much of a problem. It's also fairly easy to construct a
79non-intrusive interface if you really want one.
80.PP
81There are three main data structures involved in the interface:
82.TP
83.B sym_table
84Keeps track of the information associated with a particular table.
85.TP
86.B sym_base
87The header which must be attached to the front of all the value
88objects.
89.TP
90.B sym_iter
91An iterator object, used for enumerating all of the associations stored
92in a symbol table.
93.PP
94All of the above data structures should be considered
95.IR opaque :
96don't try looking inside. Representations have changed in the past, and
97they may change again in the future.
0c404077 98.SS "Creation and destruction"
b6b9d458 99The
100.B sym_table
101object itself needs to be allocated by the caller. It is initialized by
102passing it to the function
103.BR sym_create .
104After initialization, the table contains no entries.
105.PP
106Initializing a symbol table involves allocating some memory. If this
d2a91066 107allocation fails, an
b6b9d458 108.B EXC_NOMEM
109exception is raised.
110.PP
111When a symbol table is no longer needed, the memory occupied by the
112values and other maintenance structures can be reclaimed by calling
113.BR sym_destroy .
0c404077 114Any bits of user data attached to values should previously have been
115destroyed.
116.SS "Adding, searching and removing"
b6b9d458 117Most of the actual work is done by the function
118.BR sym_find .
119It does both lookup and creation, depending on its arguments. To do its
120job, it needs to know the following bits of information:
121.TP
ff76c38f 122.BI "sym_table *" t
b6b9d458 123A pointer to a symbol table to manipulate.
124.TP
ff76c38f 125.BI "const char *" n
b6b9d458 126The address of the
127.I key
128to look up or create. Usually this will be a simple text string,
129although it can actually be any arbitrary binary data.
130.TP
ff76c38f 131.BI "long " l
b6b9d458 132The length of the key. If this is \-1,
133.B sym_find
134assumes that the key is a null-terminated string, and calculates its
0c404077 135length itself. This is entirely equivalent to passing
136.BI strlen( n )\fR.
b6b9d458 137.TP
ff76c38f 138.BI "size_t " sz
b6b9d458 139The size of the value block to allocate if the key could not be found.
140If this is zero, no value is allocated, and a null pointer is returned
141to indicate an unsuccessful lookup.
142.TP
ff76c38f 143.BI "unsigned *" f
b6b9d458 144The address of a `found' flag to set. This is an output parameter. On
145exit,
146.B sym_find
147will set the value of
148.BI * f
149to zero if the key could not be found, or nonzero if it was found. This
150can be used to tell whether the value returned has been newly allocated,
151or whether it was already in the table.
152.PP
0c404077 153A terminating null byte is appended to the copy of the symbol's name in
154memory. This is not considered to be a part of the symbol's name, and
155does not contribute to the name's length as reported by the
156.B SYM_LEN
157macro.
b6b9d458 158.PP
159A symbol can be removed from the table by calling
160.BR sym_remove ,
161passing the symbol table itself, and the value block that needs
162removing.
0c404077 163.SS "Enquiries about symbols"
164Three macros are provided to enable simple enquiries about a symbol.
165Given a pointer
166.I s
167to a symbol table entry,
168.BI SYM_LEN( s )
169returns the length of the symbol's name (excluding any terminating null
d4efbcd9 170byte);
0c404077 171.BI SYM_NAME( s )
172returns a pointer to the symbol's name; and
173.BI SYM_HASH( s )
174returns the symbol's hash value.
175.SS "Enumerating symbols"
b6b9d458 176Enumerating the values in a symbol table is fairly simple. Allocate a
177.B sym_iter
178object from somewhere. Attach it to a symbol table by calling
179.BR sym_mkiter ,
180and passing in the addresses of the iterator and the symbol table.
181Then, each call to
182.B sym_next
183will return a different value from the symbol table, until all of them
184have been enumerated, at which point,
185.B sym_next
186returns a null pointer.
187.PP
188It's safe to remove the symbol you've just been returned by
189.BR sym_next .
190However, it's not safe to remove any other symbol. So don't do that.
191.PP
192When you've finished with an iterator, it's safe to just throw it away.
193You don't need to call any functions beforehand.
0c404077 194.SS "Use in practice"
b6b9d458 195In normal use, the keys are simple strings (usually identifiers from
196some language), and the values are nontrivial structures providing
197information about types and values.
198.PP
199In this case, you'd define something like the following structure for
200your values:
201.VS
d056fbdf 202.ta 2 20m
b6b9d458 203typedef struct val {
d056fbdf
MW
204 sym_base _base; /* Symbol header */
205 unsigned type; /* Type of this symbol */
206 int dispoff; /* Which display variable is in */
207 size_t frameoff; /* Offset of variable in frame */
b6b9d458 208} val;
209.VE
210Given a pointer
211.I v
212to a
213.BR val ,
214you can find the variable's name by calling
215.BI SYM_NAME( v )\fR.
216.PP
217You can look up a name in the table by saying something like:
218.VS
d056fbdf 219.ta 2n
b6b9d458 220val *v = sym_find(t, name, -1, 0, 0);
221if (!v)
d056fbdf 222 error("unknown variable `%s'", name);
b6b9d458 223.VE
224You can add in a new variable by saying something like
225.VS
d056fbdf 226.ta 2n
b6b9d458 227unsigned f;
228val *v = sym_find(t, name, -1, sizeof(val), &f);
229if (f)
d056fbdf 230 error("variable `%s' already exists", name);
b6b9d458 231/* fill in v */
232.VE
233You can examine all the variables in your symbol table by saying
234something like:
235.VS
d056fbdf 236.ta 2n
b6b9d458 237sym_iter i;
238val *v;
d056fbdf 239.VP
b6b9d458 240for (sym_mkiter(&i, t); (v = sym_next(&i)) != 0; ) {
d056fbdf 241 /* ... */
b6b9d458 242}
243.VE
244That ought to be enough examples to be getting on with.
0c404077 245.SS Implementation
6f444bda 246The symbol table is an extensible hashtable, using the universal hash
247function described in
248.BR unihash (3)
249and the global hashing key. The hash chains are kept very short
250(probably too short, actually). Every time a symbol is found, its block
251is promoted to the front of its bin chain so it gets found faster next
252time.
b6b9d458 253.SH SEE ALSO
0c404077 254.BR hash (3),
08da152e 255.BR mLib (3).
b6b9d458 256.SH AUTHOR
9b5ac6ff 257Mark Wooding, <mdw@distorted.org.uk>