.\" -*-nroff-*-
.de VS
.sp 1
.RS
.nf
.ft B
..
.de VE
.ft R
.fi
.RE
.sp 1
..
.de hP
.IP
.ft B
\h'-\w'\\$1\ 'u'\\$1\ \c
.ft P
..
.ie t .ds o \(bu
.el .ds o o
.TH dstr 3 "8 May 1999" "Straylight/Edgeware" "mLib utilities library"
.SH NAME
dstr \- a simple dynamic string type
.\" @dstr_create
.\" @dstr_destroy
.\" @dstr_reset
.\" @dstr_ensure
.\" @dstr_tidy
.\"
.\" @dstr_putc
.\" @dstr_putz
.\" @dstr_puts
.\" @dstr_putf
.\" @dstr_putd
.\" @dstr_putm
.\" @dstr_putline
.\" @dstr_write
.\"
.\" @DSTR_INIT
.\" @DCREATE
.\" @DDESTROY
.\" @DRESET
.\" @DENSURE
.\" @DPUTC
.\" @DPUTZ
.\" @DPUTS
.\" @DPUTD
.\" @DPUTM
.\" @DWRITE
.\"
.SH SYNOPSIS
.nf
.B "#include <mLib/dstr.h>"
.PP
.B "typedef struct { ...\& } dstr;"
.B "#define DSTR_INIT ..."
.PP
.BI "void dstr_create(dstr *" d );
.BI "void dstr_destroy(dstr *" d );
.BI "void dstr_reset(dstr *" d );
.PP
.BI "void dstr_ensure(dstr *" d ", size_t " sz );
.BI "void dstr_tidy(dstr *" d );
.PP
.BI "void dstr_putc(dstr *" d ", int " ch );
.BI "void dstr_putz(dstr *" d );
.BI "void dstr_puts(dstr *" d ", const char *" s );
.BI "int dstr_vputf(dstr *" d ", const char *" p ", va_list *" ap );
.BI "int dstr_putf(dstr *" d ", const char *" p ", ...);"
.BI "void dstr_putd(dstr *" d ", const dstr *" p );
.BI "void dstr_putm(dstr *" d ", const void *" p ", size_t " sz );
.BI "int dstr_putline(dstr *" d ", FILE *" fp );
.BI "size_t dstr_write(const dstr *" d ", FILE *" fp );
.PP
.BI "void DCREATE(dstr *" d );
.BI "void DDESTROY(dstr *" d );
.BI "void DRESET(dstr *" d );
.BI "void DENSURE(dstr *" d ", size_t " sz );
.BI "void DPUTC(dstr *" d ", char " ch );
.BI "void DPUTZ(dstr *" d );
.BI "void DPUTS(dstr *" d ", const char *" s );
.BI "void DPUTD(dstr *" d ", const dstr *" p );
.BI "void DPUTM(dstr *" d ", const void *" p ", size_t " sz );
.BI "size_t DWRITE(const dstr *" d ", FILE *" fp );
.fi
.SH DESCRIPTION
The header
.B dstr.h
declares a type for representing dynamically extending strings, and a
small collection of useful operations on them. None of the operations
returns a failure result on an out-of-memory condition; instead, the
exception
.B EXC_NOMEM
is raised.
.PP
Many of the functions which act on dynamic strings have macro
equivalents. These equivalent macros may evaluate their arguments
multiple times.
.SS "Underlying type"
A
.B dstr
object is a small structure with the following members.
The
.B buf
member points to the actual character data in the string. The data may
or may not be null-terminated, depending on what operations have
recently been performed on it. None of the
.B dstr
functions depend on the string being null-terminated; indeed, all of
them work fine on strings containing arbitrary binary data. You can
force null-termination by calling the
.B dstr_putz
function, or the
.B DPUTZ
macro.
.PP
The
.B sz
member describes the current size of the buffer. This reflects the
maximum possible length of string that can be represented in
.B buf
without allocating a new buffer.
.PP
The
.B len
member describes the current length of the string. It is the number of
bytes in the string which are actually interesting. The length does
.I not
include a null-terminating byte, if there is one.
.PP
The following invariants are maintained by
.B dstr
and must hold when any function is called:
.hP \*o
If
.B sz
is nonzero, then
.B buf
points to a block of memory of length
.BR sz .
If
.B sz
is zero, then
.B buf
is a null pointer.
.hP \*o
At all times,
.BR sz " \(>= " len .
.PP
Note that there is no equivalent of the standard C distinction between
the empty string (a pointer to an array of characters whose first
element is zero) and the nonexistent string (a null pointer). Any
.B dstr
whose
.B len
is zero is an empty string.
.PP
The
.B a
member refers to the arena from which the string's buffer has been
allocated. Immediately after creation, this is set to be
.BR arena_stdlib (3);
you can set it to point to any other arena of your choice before the
buffer is allocated.
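.PP
The invariants above can be restated as a small checking function. This
is only a sketch: the
.B toy_dstr
type and both functions are hypothetical stand-ins written for this
example, not the real definitions from
.BR dstr.h ,
and the arena member is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for illustration; not the real mLib definition.
 * The arena member is deliberately omitted. */
typedef struct {
  char *buf;    /* character data, or NULL when sz is zero */
  size_t sz;    /* bytes available in buf */
  size_t len;   /* bytes of buf that are actually in use */
} toy_dstr;

/* Return nonzero if the documented invariants hold for d. */
int toy_invariants_ok(const toy_dstr *d)
{
  if (d->sz == 0 && d->buf != NULL) return 0;  /* no size, so no buffer */
  if (d->sz != 0 && d->buf == NULL) return 0;  /* a size implies a buffer */
  return d->sz >= d->len;                      /* contents fit the buffer */
}

/* There is no separate `nonexistent' state: empty just means len == 0. */
int toy_is_empty(const toy_dstr *d) { return d->len == 0; }
```

A check like this is mainly useful in debugging code that modifies the
members directly.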
.SS "Creation and destruction"
The caller is responsible for allocating the
.B dstr
structure. It can be initialized:
.hP \*o
using the macro
.B DSTR_INIT
as an initializer in the declaration of the object,
.hP \*o
passing its address to the
.B dstr_create
function, or
.hP \*o
passing its address to the (equivalent)
.B DCREATE
macro.
.PP
The initial value of a
.B dstr
is the empty string.
.PP
The additional storage space for a string's contents may be reclaimed by
passing it to the
.B dstr_destroy
function, or the
.B DDESTROY
macro. After destruction, a string's value is reset to the empty
string:
.I "it's still a valid"
.BR dstr .
However, once a string has been destroyed, it's safe to deallocate the
underlying
.B dstr
object.
.PP
The
.B dstr_reset
function empties a string
.I without
deallocating any memory. Therefore appending more characters is quick,
because the old buffer is still there and doesn't need to be allocated
again.
Calling
.VS
dstr_reset(d);
.VE
is equivalent to directly assigning
.VS
d->len = 0;
.VE
There's also a macro
.B DRESET
which does the same job as the
.B dstr_reset
function.
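.PP
The difference between resetting and destroying can be seen in a small
model. The code below is a sketch under stated assumptions: every
.B toy_
name is hypothetical, the fixed growth step is arbitrary, and
.BR abort (3)
stands in for raising
.BR EXC_NOMEM .

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a dynamic string; not the mLib implementation. */
typedef struct { char *buf; size_t sz, len; } toy_dstr;
#define TOY_INIT { NULL, 0, 0 }   /* plays the role of DSTR_INIT */

/* Initialize to the empty string with no buffer. */
void toy_create(toy_dstr *d) { d->buf = NULL; d->sz = 0; d->len = 0; }

/* Forget the contents but keep the buffer for reuse. */
void toy_reset(toy_dstr *d) { d->len = 0; }

/* Release the buffer; d is left as a valid empty string. */
void toy_destroy(toy_dstr *d) { free(d->buf); toy_create(d); }

/* Append one byte, growing by a fixed step (an arbitrary toy policy). */
void toy_putc(toy_dstr *d, int ch)
{
  if (d->len == d->sz) {
    char *p = realloc(d->buf, d->sz + 16);
    if (!p) abort();              /* the real library raises EXC_NOMEM */
    d->buf = p;
    d->sz += 16;
  }
  d->buf[d->len++] = (char)ch;
}
```

In this model, a reset followed by an append reuses the same buffer,
whereas a destroy leaves no buffer behind at all.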
.SS "Extending a string"
All memory allocation for strings is done by the function
.BR dstr_ensure .
Given a pointer
.I d
to a
.B dstr
and a size
.IR sz ,
the function ensures that there are at least
.I sz
unused bytes in the string's buffer. The current algorithm for
extending the buffer is fairly unsophisticated, but seems to work
relatively well \- see the source if you really want to know what it's
doing.
.PP
Extending a string never returns a failure result. Instead, if there
isn't enough memory for a longer string, the exception
.B EXC_NOMEM
is raised. See
.BR exc (3)
for more information about
.BR mLib 's
exception handling system.
.PP
Note that if an ensure operation needs to reallocate a string buffer,
any pointers you've taken into the string become invalid.
.PP
There's a macro
.B DENSURE
which does a quick inline check to see whether there's enough space in
a string's buffer. This saves a procedure call when no reallocation
needs to be done. The
.B DENSURE
macro is called in the same way as the
.B dstr_ensure
function.
.PP
The function
.B dstr_tidy
`trims' a string's buffer so that it's just large enough for the string
contents and a null terminating byte. This might raise an exception due
to lack of memory. (There are two possible ways this might happen.
Firstly, the underlying allocator might just be brain-damaged enough to
fail on reducing a block's size. Secondly, tidying an empty string with no
buffer allocated for it causes allocation of a buffer large enough for
the terminating null byte.)
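.PP
One way to picture the ensure operation, its inline fast path, and the
pointer-invalidation hazard is the sketch below. It makes several
assumptions: the
.B toy_
names are hypothetical, the doubling policy is a guess made for the
example (the real growth algorithm may differ), and
.BR abort (3)
stands in for
.BR EXC_NOMEM .

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a dynamic string; not the mLib implementation. */
typedef struct { char *buf; size_t sz, len; } toy_dstr;

/* Make sure at least `want' unused bytes follow the current contents.
 * Doubling is an assumption; the real growth policy may differ. */
void toy_ensure(toy_dstr *d, size_t want)
{
  size_t need = d->len + want, nsz = d->sz ? d->sz : 16;
  char *p;
  if (need <= d->sz) return;            /* already enough room */
  while (nsz < need) nsz *= 2;
  p = realloc(d->buf, nsz);             /* may move the buffer */
  if (!p) abort();                      /* mLib would raise EXC_NOMEM */
  d->buf = p;
  d->sz = nsz;
}

/* Inline fast path in the spirit of DENSURE: test first, call only when
 * needed.  Like the real macros, it evaluates its arguments repeatedly. */
#define TOY_ENSURE(d, want) \
  do { if ((d)->sz - (d)->len < (want)) toy_ensure((d), (want)); } while (0)

/* Append n bytes from p to the end of d. */
void toy_putm(toy_dstr *d, const void *p, size_t n)
{
  if (n == 0) return;
  TOY_ENSURE(d, n);
  memcpy(d->buf + d->len, p, n);
  d->len += n;
}
```

Because the buffer may move whenever it grows, it is safer to remember
an offset such as
.B d->len
than a pointer into
.BR d->buf ,
and to recompute the pointer after appending.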
.SS "Contributing data to a string"
There is a collection of functions which add data to a string. All of
these functions add their new data to the
.I end
of the string. This is good, because programs usually build strings
left-to-right. If you want to do something more clever, that's up to
you.
.PP
Several of these functions have equivalent macros which do the main work
inline. (There still might need to be a function call if the buffer
needs to be extended.)
.PP
Any of these functions might extend the string, causing pointers into
the string buffer to be invalidated. If you don't want that to happen,
pre-ensure enough space before you start.
.PP
The simplest function is
.B dstr_putc
which appends a single character
.I ch
to the end of the string. It has a macro equivalent called
.BR DPUTC .
.PP
The function
.B dstr_putz
places a zero byte at the end of the string. It does
.I not
affect the string's length, so any other data added to the string will
overwrite the null terminator. This is useful if you want to pass your
string to one of the standard C library string-handling functions. The
macro
.B DPUTZ
does the same thing.
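.PP
The interaction between appending and terminating is easy to get wrong,
so here is a small model of it. This is a sketch only: the
.B toy_
names are hypothetical, the growth step is arbitrary, and
.BR abort (3)
stands in for
.BR EXC_NOMEM .

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a dynamic string; not the mLib implementation. */
typedef struct { char *buf; size_t sz, len; } toy_dstr;

/* Guarantee `want' unused bytes (toy growth policy, chosen arbitrarily). */
void toy_room(toy_dstr *d, size_t want)
{
  if (d->sz - d->len < want) {
    size_t nsz = d->len + want + 16;
    char *p = realloc(d->buf, nsz);
    if (!p) abort();                /* mLib would raise EXC_NOMEM */
    d->buf = p;
    d->sz = nsz;
  }
}

/* Append one character and count it in len. */
void toy_putc(toy_dstr *d, int ch)
{
  toy_room(d, 1);
  d->buf[d->len++] = (char)ch;
}

/* Store a zero byte just past the contents; len is left alone. */
void toy_putz(toy_dstr *d)
{
  toy_room(d, 1);
  d->buf[d->len] = 0;
}
```

In the model, a character appended after a terminator lands on top of
it, so the string must be terminated again before it is next handed to
a C library function.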
.PP
The function
.B dstr_puts
writes a C-style null-terminated string to the end of a dynamic string.
A terminating zero byte is also written, as if
.B dstr_putz
were called. The macro
.B DPUTS
does the same job.
.PP
The function
.B dstr_putf
works similarly to the standard
.BR sprintf (3)
function. It accepts a
.BR printf (3)-style
format string and an arbitrary number of arguments to format and writes
the resulting text to the end of a dynamic string, returning the number
of characters so written. A terminating zero byte is also appended.
The formatting is intended to be convenient and safe rather than
efficient, so don't expect blistering performance. Similarly, there may
be differences between the formatting done by
.B dstr_putf
and
.BR sprintf (3)
because the former has to do most of its work itself. In particular,
.B dstr_putf
understands the POSIX
.RB ` n$ '
positional parameter notation accepted by many Unix C libraries, even if
the underlying C library does not. There is no macro equivalent of
.BR dstr_putf .
.PP
The function
.B dstr_vputf
provides access to the `guts' of
.BR dstr_putf :
given a format string and a pointer to a
.BR va_list ,
it will format the arguments according to the format string, just as
.B dstr_putf
does. (Note: that's a
.BR "va_list *" ,
not a plain
.BR va_list ,
so that it gets updated properly on exit.)
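.PP
The reason for taking a pointer can be shown in plain standard C,
without any
.B dstr
involved. In the sketch below both function names are hypothetical; the
point is only that a callee which receives a
.B "va_list *"
advances the caller's own list, so the caller can carry on reading
arguments afterwards.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: consume one int through a pointer to the list. */
int take_int(va_list *ap) { return va_arg(*ap, int); }

/* Read the first value via the helper and the second one directly.
 * Continuing to use ap after the call is well defined because the
 * helper was handed a pointer to it rather than the list itself. */
int first_minus_second(int dummy, ...)
{
  va_list ap;
  int a, b;
  (void)dummy;            /* only needed as the anchor for va_start */
  va_start(ap, dummy);
  a = take_int(&ap);      /* the helper advances the caller's list */
  b = va_arg(ap, int);    /* so this fetches the second argument */
  va_end(ap);
  return a - b;
}
```

A wrapper of this shape can format a prefix of its arguments with one
call and still consume the remainder itself.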
.PP
The function
.B dstr_putd
appends the contents of one dynamic string to another. A null
terminator is also appended. The macro
.B DPUTD
does the same thing.
.PP
The function
.B dstr_putm
puts an arbitrary block of memory, addressed by
.IR p ,
with length
.I sz
bytes, at the end of a dynamic string. No terminating null is appended:
it's assumed that if you're playing with arbitrary chunks of memory then
you're probably not going to be using the resulting data as a normal
text string. The macro
.B DPUTM
works the same way.
.PP
The function
.B dstr_putline
reads a line from an input stream
.I fp
and appends it to a string. If an error occurs, or end-of-file is
encountered, before any characters have been read, then
.B dstr_putline
returns the value
.B EOF
and does not extend the string. Otherwise, it reads until it encounters
a newline character, an error, or end-of-file, and returns the number of
characters read. If reading was terminated by a newline character, the
newline character is
.I not
inserted in the buffer. A terminating null is appended, as by
.BR dstr_putz .
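.PP
The line-reading behaviour described above might be modelled as follows.
This is a sketch, not the library's code: the
.B toy_
names are hypothetical, the growth step is arbitrary,
.BR abort (3)
stands in for
.BR EXC_NOMEM ,
and the return value here counts only the characters stored, which is
an assumption about how the newline is counted.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a dynamic string; not the mLib implementation. */
typedef struct { char *buf; size_t sz, len; } toy_dstr;

/* Guarantee `want' unused bytes (toy growth policy, chosen arbitrarily). */
void toy_room(toy_dstr *d, size_t want)
{
  if (d->sz - d->len < want) {
    size_t nsz = d->len + want + 16;
    char *p = realloc(d->buf, nsz);
    if (!p) abort();                    /* mLib would raise EXC_NOMEM */
    d->buf = p;
    d->sz = nsz;
  }
}

/* Append one line from fp, dropping the newline.  Returns the number of
 * characters stored (an assumption), or EOF if nothing could be read. */
int toy_putline(toy_dstr *d, FILE *fp)
{
  int ch, n = 0;
  if ((ch = getc(fp)) == EOF) return EOF;   /* leave d untouched */
  while (ch != EOF && ch != '\n') {
    toy_room(d, 1);
    d->buf[d->len++] = (char)ch;
    n++;
    ch = getc(fp);
  }
  toy_room(d, 1);
  d->buf[d->len] = 0;                       /* terminator, not in len */
  return n;
}
```

Note that an empty line yields zero while end-of-file yields
.BR EOF ,
so a reading loop should compare the result against
.B EOF
rather than test for zero.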
.SS "Other functions"
The
.B dstr_write
function writes a string to an output stream
.IR fp .
It returns the number of characters written, or
.B 0
if an error occurred before the first write. No newline character is
written to the stream, unless the string actually contains one already.
The macro
.B DWRITE
is equivalent.
.SH "SECURITY CONSIDERATIONS"
The implementation of the
.B dstr
functions is designed to do string handling in security-critical
programs. However, there may be bugs in the code somewhere. In
particular, the
.B dstr_putf
functions are quite complicated, and could do with some checking by
independent people who know what they're doing.
.SH "SEE ALSO"
.BR exc (3),
.BR mLib (3).
.SH AUTHOR
Mark Wooding, <mdw@distorted.org.uk>