| 1 | .\" -*-nroff-*- |
| 2 | .TH lbuf 3 "6 July 1999" "Straylight/Edgeware" "mLib utilities library" |
| 3 | .SH "NAME" |
| 4 | lbuf \- split lines out of asynchronously received blocks |
| 5 | .\" @lbuf_flush |
| 6 | .\" @lbuf_close |
| 7 | .\" @lbuf_free |
| 8 | .\" @lbuf_snarf |
| 9 | .\" @lbuf_setsize |
| 10 | .\" @lbuf_init |
| 11 | .\" @lbuf_destroy |
| 12 | .SH "SYNOPSIS" |
| 13 | .nf |
| 14 | .B "#include <mLib/lbuf.h>" |
| 15 | |
| 16 | .B "enum {" |
| 17 | .B "\h'4n'LBUF_CRLF," |
| 18 | .B "\h'4n'LBUF_STRICTCRLF," |
| 19 | .B "\h'4n'..." |
| 20 | .B "};" |
| 21 | .B "#define LBUF_ENABLE ..." |
| 22 | |
| 23 | .B "typedef struct {" |
| 24 | .B "\h'4n'unsigned f;" |
| 25 | .B "\h'4n'..." |
| 26 | .B "} lbuf;" |
| 27 | |
| 28 | .BI "typedef void lbuf_func(char *" s ", size_t " len ", void *" p ); |
| 29 | |
| 30 | .BI "void lbuf_flush(lbuf *" b ", char *" p ", size_t " len ); |
| 31 | .BI "void lbuf_close(lbuf *" b ); |
| 32 | .BI "size_t lbuf_free(lbuf *" b ", char **" p ); |
| 33 | .BI "void lbuf_snarf(lbuf *" b ", const void *" p ", size_t " sz ); |
| 34 | .BI "void lbuf_setsize(lbuf *" b ", size_t " sz ); |
| 35 | .BI "void lbuf_init(lbuf *" b ", lbuf_func *" func ", void *" p ); |
| 36 | .BI "void lbuf_destroy(lbuf *" b ); |
| 37 | .fi |
| 38 | .SH "DESCRIPTION" |
| 39 | The declarations in |
| 40 | .B <mLib/lbuf.h> |
| 41 | implement a handy object called a |
| 42 | .IR "line buffer" . |
| 43 | Given unpredictably-sized chunks of data, the line buffer extracts |
| 44 | completed lines of text and passes them to a caller-supplied function. |
| 45 | This is useful in nonblocking network servers, for example: the server |
| 46 | can feed input from a client into a line buffer as it arrives and deal |
| 47 | with completed text lines as they appear, without having to block |
| 48 | until a newline character arrives. |
| 49 | .PP |
| 50 | The state of a line buffer is stored in an object of type |
| 51 | .BR lbuf . |
| 52 | This is a structure which must be allocated by the caller. The |
| 53 | structure should normally be considered opaque (see the section on |
| 54 | .B Disablement |
| 55 | for an exception to this). |
| 56 | .SS "Initialization and finalization" |
| 57 | The function |
| 58 | .B lbuf_init |
| 59 | initializes a line buffer ready for use. It is given three arguments: |
| 60 | .TP |
| 61 | .BI "lbuf *" b |
| 62 | A pointer to the block of memory to use for the line buffer. The line |
| 63 | buffer will allocate memory to store incoming data automatically: this |
| 64 | structure just contains bookkeeping information. |
| 65 | .TP |
| 66 | .BI "lbuf_func *" func |
| 67 | The |
| 68 | .I line-handler |
| 69 | function to which the line buffer should pass completed lines of text. |
| 70 | See |
| 71 | .B "Line-handler functions" |
| 72 | below for a description of this function. |
| 73 | .TP |
| 74 | .BI "void *" p |
| 75 | A pointer argument to be passed to the function when a completed line of |
| 76 | text arrives. |
| 77 | .PP |
| 78 | The amount of memory set aside for reading lines is configurable. It |
| 79 | may be set by calling |
| 80 | .B lbuf_setsize |
| 81 | at any time when the buffer is empty. The default limit is 256 bytes. |
| 82 | Lines longer than the limit are truncated. By default, the buffer is |
| 83 | allocated from the current arena, |
| 84 | .BR arena_global (3); |
| 85 | this may be changed by altering the buffer's |
| 86 | .B a |
| 87 | member to refer to a different arena at any time when the buffer is |
| 88 | unallocated. |
| 89 | .PP |
| 90 | A line buffer must be destroyed after use by calling |
| 91 | .BR lbuf_destroy , |
| 92 | passing it the address of the buffer block. |
| 93 | .SS "Inserting data into the buffer" |
| 94 | There are two interfaces for inserting data into the buffer. One's much |
| 95 | simpler than the other, although it's less expressive. |
| 96 | .PP |
| 97 | The simple interface is |
| 98 | .BR lbuf_snarf . |
| 99 | This function is given three arguments: a pointer |
| 100 | .I b |
| 101 | to a line buffer structure; a pointer |
| 102 | .I p |
| 103 | to a chunk of data to read; and the size |
| 104 | .I sz |
| 105 | of the chunk of data. The data is pushed through the line buffer and |
| 106 | any complete lines are passed on to the line handler. |
| 107 | .PP |
| 108 | The complex interface is the pair of functions |
| 109 | .B lbuf_free |
| 110 | and |
| 111 | .BR lbuf_flush . |
| 112 | .PP |
| 113 | The |
| 114 | .B lbuf_free |
| 115 | function returns the address and size of a free portion of the line |
| 116 | buffer's memory into which data may be written. The function is passed |
| 117 | the address |
| 118 | .I b |
| 119 | of the line buffer. Its result is the size of the free area, and it |
| 120 | writes the base address of this free space to the location pointed to by |
| 121 | the argument |
| 122 | .IR p . |
| 123 | The caller's data must be written to ascending memory locations starting |
| 124 | at |
| 125 | .BI * p |
| 126 | and no data may be written beyond the end of the free space. However, |
| 127 | it isn't necessary to completely fill the buffer. |
| 128 | .PP |
| 129 | Once the free area has had some data written to it, |
| 130 | .B lbuf_flush |
| 131 | is called to examine the new data and break it into text lines. This is |
| 132 | given three arguments: |
| 133 | .TP |
| 134 | .BI "lbuf *" b |
| 135 | The address of the line buffer. |
| 136 | .TP |
| 137 | .BI "char *" p |
| 138 | The address at which the new data has been written. This must be the |
| 139 | base address returned from |
| 140 | .BR lbuf_free . |
| 141 | .TP |
| 142 | .BI "size_t " len |
| 143 | The number of bytes which have been written to the buffer. |
| 144 | .PP |
| 145 | The |
| 146 | .B lbuf_flush |
| 147 | function breaks the new data into lines as described below, and passes |
| 148 | each one in turn to the line-handler function. |
| 149 | .PP |
| 150 | The |
| 151 | .B lbuf_snarf |
| 152 | function is trivially implemented in terms of the more complex |
| 153 | .BR lbuf_free / lbuf_flush |
| 154 | interface. |
| 155 | .SS "Line breaking" |
| 156 | By default, the line buffer considers a line to end with either a simple |
| 157 | linefeed character (the normal Unix convention) or a |
| 158 | carriage-return/linefeed pair (the Internet convention). This can be |
| 159 | changed by modifying the |
| 160 | .B delim |
| 161 | member of the |
| 162 | .B lbuf |
| 163 | structure: the default value is |
| 164 | .BR LBUF_CRLF . |
| 165 | If set to |
| 166 | .BR LBUF_STRICTCRLF , |
| 167 | only a carriage-return/linefeed pair will terminate a line. Any other |
| 168 | value is a single character which is considered to be the line terminator. |
| 169 | .PP |
| 170 | The line buffer has a fixed amount of memory available to it. This is |
| 171 | deliberate, to prevent a trivial attack whereby a remote user sends a |
| 172 | stream of data containing no newline markers, wasting the server's |
| 173 | memory. Instead, the buffer will truncate overly long lines (silently) |
| 174 | and return only the initial portion. It will ignore the rest of the |
| 175 | line completely. |
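.PP
The default line-ending rule and the truncation behaviour described
above can be illustrated with a small standalone sketch.  It does not
use mLib and is not the library's implementation: the names
sketch_init, sketch_feed and collect_line, and the 16-byte limit
SKETCH_MAX, are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit for this sketch (the lbuf default is 256 bytes). */
#define SKETCH_MAX 16

typedef void line_fn(char *s, size_t len, void *p);

struct sketch {
  char buf[SKETCH_MAX];   /* fixed storage, with room for the zero byte */
  size_t len;             /* bytes of the current line stored so far */
  int cr;                 /* nonzero if the previous byte was a stored CR */
  line_fn *func;          /* called for each completed line */
  void *p;                /* passed through to func */
};

void sketch_init(struct sketch *b, line_fn *func, void *p)
{
  b->len = 0; b->cr = 0; b->func = func; b->p = p;
}

/* Feed a chunk of input, emitting each line completed by a linefeed. */
void sketch_feed(struct sketch *b, const char *data, size_t sz)
{
  size_t i;

  for (i = 0; i < sz; i++) {
    char ch = data[i];
    if (ch == '\n') {
      if (b->cr) b->len--;          /* CR LF pair: drop the CR too */
      b->buf[b->len] = 0;
      b->func(b->buf, b->len, b->p);
      b->len = 0; b->cr = 0;
    } else if (b->len < SKETCH_MAX - 1) {
      b->buf[b->len++] = ch;
      b->cr = (ch == '\r');
    } else
      b->cr = 0;                    /* line too long: drop the byte */
  }
}

/* Example handler: remember up to eight lines for later inspection. */
struct collect { size_t n; char lines[8][SKETCH_MAX]; };

void collect_line(char *s, size_t len, void *p)
{
  struct collect *c = p;
  if (!s || c->n >= 8) return;
  memcpy(c->lines[c->n++], s, len + 1);
}
```

.PP
In this sketch a line may arrive split across any number of chunks, and
a line of more than 15 characters is silently cut to its first 15.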
| 176 | .SS "Line-handler functions" |
| 177 | Completed lines, as described above, are passed to the caller's |
| 178 | line-handler function. It is given three arguments: the address |
| 179 | .I s |
| 180 | of the line which has just been read; the length |
| 181 | .I len |
| 182 | of the line (not including the null terminator); and the pointer |
| 183 | .I p |
| 184 | which was set up in the call to |
| 185 | .BR lbuf_init . |
| 186 | The line passed is null-terminated, and has had its trailing line |
| 187 | terminator stripped. The area of memory in which the string is located may be |
| 188 | overwritten by the line-handler function, although writing beyond the |
| 189 | terminating zero byte is not permitted. |
| 190 | .PP |
| 191 | The line pointer argument |
| 192 | .I s |
| 193 | may be null to signify end-of-file; in this case, the length |
| 194 | .I len |
| 195 | will be zero. See the next section. |
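.PP
A line handler might be written along the following lines.  This is a
sketch only: the structure conn_state and the function handle_line are
hypothetical names, and the handler does nothing more than count what
it is given.  It has the same argument list as
.B lbuf_func
but does not itself depend on the library.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-connection state, supplied as the `p' argument. */
struct conn_state {
  unsigned nlines;        /* number of complete lines seen so far */
  size_t nbytes;          /* total length of those lines */
  int eof;                /* nonzero once end-of-file was reported */
};

/* Same argument list as `lbuf_func'. */
void handle_line(char *s, size_t len, void *p)
{
  struct conn_state *st = p;

  if (!s) {               /* null line pointer: end-of-file */
    st->eof = 1;
    return;
  }
  st->nlines++;
  st->nbytes += len;
  puts(s);                /* the line is null-terminated */
}
```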
| 196 | .SS "Flushing the remaining data" |
| 197 | When the client program knows that there's no more data arriving (for |
| 198 | example, an end-of-file condition exists on its data source) it should |
| 199 | call the function |
| 200 | .B lbuf_close |
| 201 | to flush out the remaining data in the buffer as one last (improperly |
| 202 | terminated) line. This will pass the remaining text to the line |
| 203 | handler, if there is any, and then call the handler one final time with |
| 204 | a null pointer rather than the address of a text line to inform it of |
| 205 | the end-of-file. |
| 206 | .SS "Disablement" |
| 207 | The line buffer is intended to be used in higher-level program objects, |
| 208 | such as the buffer selector described in |
| 209 | .BR selbuf (3). |
| 210 | Unfortunately, a concept from this high level needs to exist at the line |
| 211 | buffer level, which complicates the description somewhat. The idea is |
| 212 | that, when a line-handler attached to some higher-level object decides |
| 213 | that it's read enough, it can |
| 214 | .I disable |
| 215 | the object so that it doesn't see any more data. |
| 216 | .PP |
| 217 | Clearly, since an |
| 218 | .B lbuf_flush |
| 219 | call can emit more than one line, it must be aware that the line handler |
| 220 | isn't interested in any more lines. However, this fact must also be |
| 221 | signalled to the higher-level object so that it can detach itself from |
| 222 | its data source. |
| 223 | .PP |
| 224 | Rather than invent some complex interface for this, the line buffer |
| 225 | exports one of its structure members, a flags word called |
| 226 | .BR f . |
| 227 | A higher-level object wishing to disable the line buffer simply clears |
| 228 | the bit |
| 229 | .B LBUF_ENABLE |
| 230 | in this flags word. |
| 231 | .PP |
| 232 | Disabling a buffer causes an immediate return from |
| 233 | .BR lbuf_flush . |
| 234 | However, it is not permitted for the functions |
| 235 | .B lbuf_flush |
| 236 | or |
| 237 | .B lbuf_close |
| 238 | to be called on a disabled buffer. (This condition isn't checked for; |
| 239 | it'll just do the wrong thing.) Furthermore, the |
| 240 | .B lbuf_snarf |
| 241 | function does not handle disablement at all, because it would complicate |
| 242 | the interface so much that it wouldn't have any advantage over the more |
| 243 | general |
| 244 | .BR lbuf_free / lbuf_flush . |
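.PP
The early return on disablement can be pictured with the following
standalone sketch.  It makes several assumptions: DEMO_ENABLE merely
stands in for
.B LBUF_ENABLE
(whose actual value is not assumed here), and demo_deliver is a
hypothetical stand-in for the delivery loop inside
.BR lbuf_flush ,
whose real implementation is not shown in this page.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for LBUF_ENABLE; the real value is not assumed. */
#define DEMO_ENABLE 1u

struct demo {
  unsigned f;             /* flags word, like the `f' member of lbuf */
  unsigned seen;          /* lines the handler has accepted */
};

/* Handler which loses interest after two lines and disables itself. */
void demo_line(char *s, size_t len, void *p)
{
  struct demo *d = p;

  (void)s; (void)len;
  if (++d->seen == 2) d->f &= ~DEMO_ENABLE;
}

/* Deliver lines in order, stopping as soon as the flag is cleared.
 * Returns the number of lines actually delivered. */
size_t demo_deliver(struct demo *d, char **lines, size_t n)
{
  size_t i;

  for (i = 0; i < n && (d->f & DEMO_ENABLE); i++)
    demo_line(lines[i], strlen(lines[i]), d);
  return i;
}
```

.PP
The caller can compare the returned count with the number of lines
offered, or test the flag directly, to learn that it should detach
from its data source.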
| 245 | .SH "SEE ALSO" |
| 246 | .BR selbuf (3), |
| 247 | .BR mLib (3). |
| 248 | .SH "AUTHOR" |
| 249 | Mark Wooding, <mdw@distorted.org.uk> |