*/*.3: Align arguments properly to the right of the opening `('.
[mLib] / struct / sym.3
CommitLineData
b6b9d458 1.\" -*-nroff-*-
2.de VS
3.sp 1
d66d7727 4.RS
b6b9d458 5.nf
6.ft B
7..
8.de VE
9.ft R
10.fi
11.RE
12.sp 1
13..
fbf20b5b 14.TH sym 3 "8 May 1999" "Straylight/Edgeware" "mLib utilities library"
b6b9d458 15.SH NAME
16sym \- symbol table manager
08da152e 17.\" @sym_create
18.\" @sym_destroy
19.\" @sym_find
20.\" @sym_remove
21.\" @sym_mkiter
22.\" @sym_next
23.\"
24.\" @SYM_NAME
0c404077 25.\" @SYM_LEN
26.\" @SYM_HASH
08da152e 27.\"
b6b9d458 28.SH SYNOPSIS
29.nf
30.B "#include <mLib/sym.h>"
31
32.BI "void sym_create(sym_table *" t );
33.BI "void sym_destroy(sym_table *" t );
34
2b1924c2
MW
35.ds mT \fBvoid *sym_find(
36.BI "\*(mTsym_table *" t ,
37.BI "\h'\w'\*(mT'u'const char *" n ", long " l ,
38.BI "\h'\w'\*(mT'u'size_t " sz ", unsigned *" f );
b6b9d458 39.BI "void sym_remove(sym_table *" t ", void *" b );
40
0c404077 41.BI "const char *SYM_NAME(const void *" p );
42.BI "size_t SYM_LEN(const void *" p );
43.BI "uint32 SYM_HASH(const void *" p );
44
b6b9d458 45.BI "void sym_mkiter(sym_iter *" i ", sym_table *" t );
46.BI "void *sym_next(sym_iter *" i );
47.fi
0c404077 48.SH "DESCRIPTION"
b6b9d458 49The
50.B sym
51functions implement a data structure often described as a dictionary, a
52finite map, an associative array, or a symbol table. It associates
53.I values
54with
55.I keys
56such that the value corresponding to a given key can be found quickly.
57Additionally, all stored associations can be enumerated.
58.PP
59The interface provides an
60.I intrusive
61symbol table. The data objects stored in the table must include a small
62header used by the symbol table manager. This reduces the amount of
63pointer fiddling that needs to be done, and in practice doesn't seem to
64be much of a problem. It's also fairly easy to construct a
65non-intrusive interface if you really want one.
66.PP
67There are three main data structures involved in the interface:
68.TP
69.B sym_table
70Keeps track of the information associated with a particular table.
71.TP
72.B sym_base
73The header which must be attached to the front of all the value
74objects.
75.TP
76.B sym_iter
77An iterator object, used for enumerating all of the associations stored
78in a symbol table.
79.PP
80All of the above data structures should be considered
81.IR opaque :
82don't try looking inside. Representations have changed in the past, and
83they may change again in the future.
0c404077 84.SS "Creation and destruction"
b6b9d458 85The
86.B sym_table
87object itself needs to be allocated by the caller. It is initialized by
88passing it to the function
89.BR sym_create .
90After initialization, the table contains no entries.
91.PP
92Initializing a symbol table involves allocating some memory. If this
d2a91066 93allocation fails, an
b6b9d458 94.B EXC_NOMEM
95exception is raised.
96.PP
97When a symbol table is no longer needed, the memory occupied by the
98values and other maintenance structures can be reclaimed by calling
99.BR sym_destroy .
0c404077 100Any bits of user data attached to values should previously have been
101destroyed.
102.SS "Adding, searching and removing"
b6b9d458 103Most of the actual work is done by the function
104.BR sym_find .
105It does both lookup and creation, depending on its arguments. To do its
106job, it needs to know the following bits of information:
107.TP
ff76c38f 108.BI "sym_table *" t
b6b9d458 109A pointer to a symbol table to manipulate.
110.TP
ff76c38f 111.BI "const char *" n
b6b9d458 112The address of the
113.I key
114to look up or create. Usually this will be a simple text string,
115although it can actually be any arbitrary binary data.
116.TP
ff76c38f 117.BI "long " l
b6b9d458 118The length of the key. If this is \-1,
119.B sym_find
120assumes that the key is a null-terminated string, and calculates its
0c404077 121length itself. This is entirely equivalent to passing
122.BI strlen( n )\fR.
b6b9d458 123.TP
ff76c38f 124.BI "size_t " sz
b6b9d458 125The size of the value block to allocate if the key could not be found.
126If this is zero, no value is allocated, and a null pointer is returned
127to indicate an unsuccessful lookup.
128.TP
ff76c38f 129.BI "unsigned *" f
b6b9d458 130The address of a `found' flag to set. This is an output parameter. On
131exit,
132.B sym_find
133will set the value of
134.BI * f
135to zero if the key could not be found, or nonzero if it was found. This
136can be used to tell whether the value returned has been newly allocated,
137or whether it was already in the table.
138.PP
0c404077 139A terminating null byte is appended to the copy of the symbol's name in
140memory. This is not considered to be a part of the symbol's name, and
141does not contribute to the name's length as reported by the
142.B SYM_LEN
143macro.
b6b9d458 144.PP
145A symbol can be removed from the table by calling
146.BR sym_remove ,
147passing the symbol table itself, and the value block that needs
148removing.
0c404077 149.SS "Enquiries about symbols"
150Three macros are provided to enable simple enquiries about a symbol.
151Given a pointer
152.I s
153to a symbol table entry,
154.BI SYM_LEN( s )
155returns the length of the symbol's name (excluding any terminating null
d4efbcd9 156byte);
0c404077 157.BI SYM_NAME( s )
158returns a pointer to the symbol's name; and
159.BI SYM_HASH( s )
160returns the symbol's hash value.
161.SS "Enumerating symbols"
b6b9d458 162Enumerating the values in a symbol table is fairly simple. Allocate a
163.B sym_iter
164object from somewhere. Attach it to a symbol table by calling
165.BR sym_mkiter ,
166and passing in the addresses of the iterator and the symbol table.
167Then, each call to
168.B sym_next
169will return a different value from the symbol table, until all of them
170have been enumerated, at which point,
171.B sym_next
172returns a null pointer.
173.PP
174It's safe to remove the symbol you've just been returned by
175.BR sym_next .
176However, it's not safe to remove any other symbol. So don't do that.
177.PP
178When you've finished with an iterator, it's safe to just throw it away.
179You don't need to call any functions beforehand.
0c404077 180.SS "Use in practice"
b6b9d458 181In normal use, the keys are simple strings (usually identifiers from
182some language), and the values are nontrivial structures providing
183information about types and values.
184.PP
185In this case, you'd define something like the following structure for
186your values:
187.VS
188typedef struct val {
189 sym_base _base; /* Symbol header */
190 unsigned type; /* Type of this symbol */
191 int dispoff; /* Which display variable is in */
192 size_t frameoff; /* Offset of variable in frame */
193} val;
194.VE
195Given a pointer
196.I v
197to a
198.BR val ,
199you can find the variable's name by calling
200.BI SYM_NAME( v )\fR.
201.PP
202You can look up a name in the table by saying something like:
203.VS
204val *v = sym_find(t, name, -1, 0, 0);
205if (!v)
206 error("unknown variable `%s'", name);
207.VE
208You can add in a new variable by saying something like
209.VS
210unsigned f;
211val *v = sym_find(t, name, -1, sizeof(val), &f);
212if (f)
213 error("variable `%s' already exists", name);
214/* fill in v */
215.VE
216You can examine all the variables in your symbol table by saying
217something like:
218.VS
219sym_iter i;
220val *v;
221
222for (sym_mkiter(&i, t); (v = sym_next(&i)) != 0; ) {
223 /* ... */
224}
225.VE
226That ought to be enough examples to be getting on with.
0c404077 227.SS Implementation
6f444bda 228The symbol table is an extensible hashtable, using the universal hash
229function described in
230.BR unihash (3)
231and the global hashing key. The hash chains are kept very short
232(probably too short, actually). Every time a symbol is found, its block
233is promoted to the front of its bin chain so it gets found faster next
234time.
b6b9d458 235.SH SEE ALSO
0c404077 236.BR hash (3),
08da152e 237.BR mLib (3).
b6b9d458 238.SH AUTHOR
9b5ac6ff 239Mark Wooding, <mdw@distorted.org.uk>