@@@ timeout wip
[mLib] / struct / sym.3
CommitLineData
b6b9d458 1.\" -*-nroff-*-
2.de VS
3.sp 1
d66d7727 4.RS
b6b9d458 5.nf
6.ft B
7..
8.de VE
9.ft R
10.fi
11.RE
12.sp 1
13..
fbf20b5b 14.TH sym 3 "8 May 1999" "Straylight/Edgeware" "mLib utilities library"
b6b9d458 15.SH NAME
16sym \- symbol table manager
08da152e 17.\" @sym_create
18.\" @sym_destroy
19.\" @sym_find
20.\" @sym_remove
21.\" @sym_mkiter
22.\" @sym_next
23.\"
24.\" @SYM_NAME
0c404077 25.\" @SYM_LEN
26.\" @SYM_HASH
08da152e 27.\"
b6b9d458 28.SH SYNOPSIS
29.nf
30.B "#include <mLib/sym.h>"
31
4729aa69
MW
32.B "type struct { ...\& } sym_table;"
33.B "type struct { ...\& } sym_base;"
34.B "type struct { ...\& } sym_iter;"
35
b6b9d458 36.BI "void sym_create(sym_table *" t );
37.BI "void sym_destroy(sym_table *" t );
38
2b1924c2
MW
39.ds mT \fBvoid *sym_find(
40.BI "\*(mTsym_table *" t ,
41.BI "\h'\w'\*(mT'u'const char *" n ", long " l ,
42.BI "\h'\w'\*(mT'u'size_t " sz ", unsigned *" f );
b6b9d458 43.BI "void sym_remove(sym_table *" t ", void *" b );
44
0c404077 45.BI "const char *SYM_NAME(const void *" p );
46.BI "size_t SYM_LEN(const void *" p );
47.BI "uint32 SYM_HASH(const void *" p );
48
b6b9d458 49.BI "void sym_mkiter(sym_iter *" i ", sym_table *" t );
50.BI "void *sym_next(sym_iter *" i );
51.fi
0c404077 52.SH "DESCRIPTION"
b6b9d458 53The
54.B sym
55functions implement a data structure often described as a dictionary, a
56finite map, an associative array, or a symbol table. It associates
57.I values
58with
59.I keys
60such that the value corresponding to a given key can be found quickly.
61Additionally, all stored associations can be enumerated.
62.PP
63The interface provides an
64.I intrusive
65symbol table. The data objects stored in the table must include a small
66header used by the symbol table manager. This reduces the amount of
67pointer fiddling that needs to be done, and in practice doesn't seem to
68be much of a problem. It's also fairly easy to construct a
69non-intrusive interface if you really want one.
70.PP
71There are three main data structures involved in the interface:
72.TP
73.B sym_table
74Keeps track of the information associated with a particular table.
75.TP
76.B sym_base
77The header which must be attached to the front of all the value
78objects.
79.TP
80.B sym_iter
81An iterator object, used for enumerating all of the associations stored
82in a symbol table.
83.PP
84All of the above data structures should be considered
85.IR opaque :
86don't try looking inside. Representations have changed in the past, and
87they may change again in the future.
0c404077 88.SS "Creation and destruction"
b6b9d458 89The
90.B sym_table
91object itself needs to be allocated by the caller. It is initialized by
92passing it to the function
93.BR sym_create .
94After initialization, the table contains no entries.
95.PP
96Initializing a symbol table involves allocating some memory. If this
d2a91066 97allocation fails, an
b6b9d458 98.B EXC_NOMEM
99exception is raised.
100.PP
101When a symbol table is no longer needed, the memory occupied by the
102values and other maintenance structures can be reclaimed by calling
103.BR sym_destroy .
0c404077 104Any bits of user data attached to values should previously have been
105destroyed.
106.SS "Adding, searching and removing"
b6b9d458 107Most of the actual work is done by the function
108.BR sym_find .
109It does both lookup and creation, depending on its arguments. To do its
110job, it needs to know the following bits of information:
111.TP
ff76c38f 112.BI "sym_table *" t
b6b9d458 113A pointer to a symbol table to manipulate.
114.TP
ff76c38f 115.BI "const char *" n
b6b9d458 116The address of the
117.I key
118to look up or create. Usually this will be a simple text string,
119although it can actually be any arbitrary binary data.
120.TP
ff76c38f 121.BI "long " l
b6b9d458 122The length of the key. If this is \-1,
123.B sym_find
124assumes that the key is a null-terminated string, and calculates its
0c404077 125length itself. This is entirely equivalent to passing
126.BI strlen( n )\fR.
b6b9d458 127.TP
ff76c38f 128.BI "size_t " sz
b6b9d458 129The size of the value block to allocate if the key could not be found.
130If this is zero, no value is allocated, and a null pointer is returned
131to indicate an unsuccessful lookup.
132.TP
ff76c38f 133.BI "unsigned *" f
b6b9d458 134The address of a `found' flag to set. This is an output parameter. On
135exit,
136.B sym_find
137will set the value of
138.BI * f
139to zero if the key could not be found, or nonzero if it was found. This
140can be used to tell whether the value returned has been newly allocated,
141or whether it was already in the table.
142.PP
0c404077 143A terminating null byte is appended to the copy of the symbol's name in
144memory. This is not considered to be a part of the symbol's name, and
145does not contribute to the name's length as reported by the
146.B SYM_LEN
147macro.
b6b9d458 148.PP
149A symbol can be removed from the table by calling
150.BR sym_remove ,
151passing the symbol table itself, and the value block that needs
152removing.
0c404077 153.SS "Enquiries about symbols"
154Three macros are provided to enable simple enquiries about a symbol.
155Given a pointer
156.I s
157to a symbol table entry,
158.BI SYM_LEN( s )
159returns the length of the symbol's name (excluding any terminating null
d4efbcd9 160byte);
0c404077 161.BI SYM_NAME( s )
162returns a pointer to the symbol's name; and
163.BI SYM_HASH( s )
164returns the symbol's hash value.
165.SS "Enumerating symbols"
b6b9d458 166Enumerating the values in a symbol table is fairly simple. Allocate a
167.B sym_iter
168object from somewhere. Attach it to a symbol table by calling
169.BR sym_mkiter ,
170and passing in the addresses of the iterator and the symbol table.
171Then, each call to
172.B sym_next
173will return a different value from the symbol table, until all of them
174have been enumerated, at which point,
175.B sym_next
176returns a null pointer.
177.PP
178It's safe to remove the symbol you've just been returned by
179.BR sym_next .
180However, it's not safe to remove any other symbol. So don't do that.
181.PP
182When you've finished with an iterator, it's safe to just throw it away.
183You don't need to call any functions beforehand.
0c404077 184.SS "Use in practice"
b6b9d458 185In normal use, the keys are simple strings (usually identifiers from
186some language), and the values are nontrivial structures providing
187information about types and values.
188.PP
189In this case, you'd define something like the following structure for
190your values:
191.VS
192typedef struct val {
193 sym_base _base; /* Symbol header */
194 unsigned type; /* Type of this symbol */
195 int dispoff; /* Which display variable is in */
196 size_t frameoff; /* Offset of variable in frame */
197} val;
198.VE
199Given a pointer
200.I v
201to a
202.BR val ,
203you can find the variable's name by calling
204.BI SYM_NAME( v )\fR.
205.PP
206You can look up a name in the table by saying something like:
207.VS
208val *v = sym_find(t, name, -1, 0, 0);
209if (!v)
210 error("unknown variable `%s'", name);
211.VE
212You can add in a new variable by saying something like
213.VS
214unsigned f;
215val *v = sym_find(t, name, -1, sizeof(val), &f);
216if (f)
217 error("variable `%s' already exists", name);
218/* fill in v */
219.VE
220You can examine all the variables in your symbol table by saying
221something like:
222.VS
223sym_iter i;
224val *v;
225
226for (sym_mkiter(&i, t); (v = sym_next(&i)) != 0; ) {
227 /* ... */
228}
229.VE
230That ought to be enough examples to be getting on with.
0c404077 231.SS Implementation
6f444bda 232The symbol table is an extensible hashtable, using the universal hash
233function described in
234.BR unihash (3)
235and the global hashing key. The hash chains are kept very short
236(probably too short, actually). Every time a symbol is found, its block
237is promoted to the front of its bin chain so it gets found faster next
238time.
b6b9d458 239.SH SEE ALSO
0c404077 240.BR hash (3),
08da152e 241.BR mLib (3).
b6b9d458 242.SH AUTHOR
9b5ac6ff 243Mark Wooding, <mdw@distorted.org.uk>