 * Potential future TODO items. Points marked ISSUE need to be
 * resolved one way or another, with good justification for the
 * decision made, before implementation begins.
 *
 * - Multiple buffers, multiple on-screen windows.
 *    + ^X^F to open new file
 *    + ^X^R to open new file RO
 *    + ^X b to switch buffers in a window
 *    + ^X o to switch windows
 *    + ^X 2 to split a window
 *    + ^X 1 to destroy all windows but this
 *    + ^X 0 to destroy this window
 *    + ^X ^ to enlarge this window by one line
 *    + width settings vary per buffer (aha, _that's_ why I wanted
 *      a buffer structure surrounding the raw B-tree)
 *    + hex-editor-style minibuffer for entering search terms,
 *      rather than the current rather crap one; in particular
 *      this enables pasting into the search string.
 *    + ISSUE: how exactly do we deal with the problem of saving
 *      over a file which we're maintaining references to in
 *      another buffer? The _current_ buffer can at least be
 *      sorted out by replacing it with a fresh tree containing a
 *      single file-data block, but other buffers are in trouble.
 *       * if we can rely on Unix fd semantics, one option is just
 *         to keep the fd open on the original file, and then the
 *         data stays around even after we rename(2) our new
 *         version over the top. Disk space usage gets silly after
 *         a few iterations, but it's better than nothing.
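 *
 * A minimal sketch of the fd-semantics point above, assuming a POSIX
 * system where rename(2) atomically replaces the target. The helper
 * name old_fd_survives_rename and both path arguments are
 * hypothetical, made up for this illustration; nothing like it
 * exists in Tweak. It only demonstrates that a descriptor opened
 * before the rename keeps reading the old contents.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

// Hypothetical demo: returns 1 if a descriptor opened before rename()
// still reads the old bytes, 0 if it does not, -1 on any I/O failure.
static int old_fd_survives_rename(const char *path, const char *tmp_path)
{
    char seen[3];
    int fd, ok;
    FILE *fp;

    // Create the "original" file.
    fp = fopen(path, "wb");
    if (!fp) return -1;
    fwrite("old", 1, 3, fp);
    fclose(fp);

    // Keep a descriptor open on the original, as another buffer might.
    fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    // Write the "saved" version elsewhere and rename it over the top.
    fp = fopen(tmp_path, "wb");
    if (!fp) { close(fd); return -1; }
    fwrite("new", 1, 3, fp);
    fclose(fp);
    if (rename(tmp_path, path) != 0) { close(fd); return -1; }

    // The old descriptor still refers to the old, now unlinked, data.
    ok = (read(fd, seen, 3) == 3 && memcmp(seen, "old", 3) == 0);
    close(fd);
    remove(path);
    return ok;
}
```

 * Both paths must live on the same filesystem for rename(2) to work.
 * The old data is only freed once the last descriptor is closed,
 * which is where the disk space cost mentioned above comes from.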
 *    + this actually doesn't seem _too_ horrid. For a start, one
 *      simple approach would be to clone the entire buffer B-tree
 *      every time we perform an operation! That's actually not
 *      _too_ expensive, if we maintain a limit on the number of
 *      operations we may undo.
 *    + I had also thought of cloning the tree we insert for each
 *      buf_insert_data and cloning the one removed for each
 *      buf_delete_data (both must be cloned for an overwrite),
 *      but I'm not convinced that simply cloning the entire thing
 *      isn't a superior option.
 *    + this really starts to show up the distinction between a
 *      `buffer' and a bare tree. A buffer is something which has
 *      an undo chain attached; so, in particular, the cut buffer
 *      shouldn't be one. Sort that out.
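 *
 * A rough sketch of the "clone everything, but cap the history"
 * idea above. Everything here is an assumption for illustration:
 * the snapshot type is a plain byte copy standing in for a cloned
 * B-tree, and UNDO_LIMIT, undo_chain, undo_push and undo_pop are
 * hypothetical names, not part of the buffer API.

```c
#include <stdlib.h>
#include <string.h>

#define UNDO_LIMIT 4   // hypothetical cap on undoable operations

// Stand-in for a cloned buffer tree: just an owned copy of the bytes.
typedef struct { unsigned char *data; size_t len; } snapshot;

// Fixed-size ring of snapshots; the oldest is dropped when full.
typedef struct {
    snapshot ring[UNDO_LIMIT];
    int start, count;
} undo_chain;

static void undo_init(undo_chain *u) { u->start = u->count = 0; }

// Record the state as it was before an operation. Returns 0 on
// allocation failure, 1 otherwise.
static int undo_push(undo_chain *u, const unsigned char *data, size_t len)
{
    snapshot s;
    int slot;

    s.data = malloc(len ? len : 1);
    if (!s.data) return 0;
    memcpy(s.data, data, len);
    s.len = len;

    if (u->count == UNDO_LIMIT) {
        // History is full: forget the oldest snapshot.
        free(u->ring[u->start].data);
        u->start = (u->start + 1) % UNDO_LIMIT;
        u->count--;
    }
    slot = (u->start + u->count) % UNDO_LIMIT;
    u->ring[slot] = s;
    u->count++;
    return 1;
}

// Take back the most recent snapshot; the caller then owns out->data.
// Returns 0 if there is nothing left to undo.
static int undo_pop(undo_chain *u, snapshot *out)
{
    if (u->count == 0) return 0;
    u->count--;
    *out = u->ring[(u->start + u->count) % UNDO_LIMIT];
    return 1;
}
```

 * With a real tree the memcpy would become a tree clone, and the
 * cap is what keeps the memory cost of the simple approach bounded.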
 *    + this is an extra option useful for editing disk devices
 *      directly (!), or other situations in which it's impossible
 *      or impractical to rename(2) your new file over the old
 *      one. It causes a change of semantics when saving: instead
 *      of constructing a new backup file and writing it over the
 *      old one, we simply seek within the original file and write
 *      out all the pieces that have changed.
 *    + Saving the file involves identifying the bits of the file
 *      that need to change, and changing them. A piece of file
 *      can be discarded as `no change required' if it's
 *      represented in the buffer by a from-file block whose file
 *      offset is equal to its offset in the buffer.
 *       * Once we have identified all the bits that do need to
 *         change, we have to draw up a dependency graph to
 *         indicate which bits want to be copied from which other
 *         bits. (You don't want to overwrite a piece of file if
 *         you still have from-file blocks pointing at that
 *         piece.) This is a directed graph with nodes
 *         corresponding to intervals of the file, and edges
 *         indicating that the source node's interval is intended
 *         to end up containing the data from the target node's
 *         interval in the original file. Another node type is
 *         `literal data', which can be the target of an edge but
 *          - note that this means any two nodes connected by an
 *            edge must represent intervals of the same length.
 *            Sometimes this means that an interval must be split
 *            into pieces even though it is represented in the
 *            buffer by a single large from-file block (if
 *            from-file blocks copying _from_ it don't cover the
 *            whole of it). I suspect the simplest approach here
 *            is just to start by making a B-tree of division
 *            points in the file: every from-file block adds four
 *            division points (for start and end of both source
 *            and dest interval), and once the tree is complete,
 *            each graph node represents the interval between two
 *            adjacent division points.
 *          - ISSUE: actually, that strategy is inadequate:
 *            consider a large from-file block displaced by only
 *            one byte from its source location. The above
 *            strategy gives division points at x, x+1, x+y,
 *            x+y+1, but the interval [x,x+1] actually wants to
 *            point to [x+1,x+2] and we don't have a division
 *            point for that. Worse still, finding a way to add
 *            the remaining division points is also undesirable
 *            because there'd be so many of them. Needs design
 *       * Then, any node which is not the target of any edge
 *         represents a piece of file which it's safe to write
 *         over, so we do so and throw away the node.
 *       * If we run out of such nodes and the graph is still
 *         non-empty, it's because all remaining nodes are part of
 *         loops. A loop must represent a set of disjoint
 *         intervals in the file, all the same length, which need
 *         to be permuted cyclically. So we deal with such a loop
 *         by reading a chunk of data from the start of one of the
 *         intervals and holding it, then copying from the next
 *         interval to that one, and so on until we've gone round
 *          + the intervals in the loop might be far too big to
 *            hold an entire interval's worth of real data in
 *            memory, so we might have to do it piecewise.
 *    + ISSUE: I wonder if a warning of some sort might be in
 *      order in case you accidentally request most of the file be
 *      moved about. This sort of trickery is really intended for
 *      small changes to a large file; if you (say) enable insert
 *      mode while editing a hard disk and accidentally leave
 *      everything one byte further up, you _really_ don't want to
 *      hit Save. The semantics of the warning are difficult,
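 *
 * The loop-handling step described above (hold one chunk, shuffle
 * the rest round, put the held chunk back, repeat piecewise) might
 * look roughly like this. It is only a sketch under assumptions: an
 * in-memory array stands in for the file, and rotate_intervals,
 * CHUNK and the start[] array are hypothetical names invented here.

```c
#include <string.h>

#define CHUNK 4   // hypothetical; a real chunk would be far larger

// Cyclically permute n disjoint intervals of `file`, each `len` bytes
// long, so that interval i ends up holding what interval (i+1) % n
// held originally. Only CHUNK bytes are ever held in memory at once.
static void rotate_intervals(unsigned char *file, const size_t *start,
                             size_t n, size_t len)
{
    unsigned char held[CHUNK];
    size_t done, i;

    if (n < 2) return;   // nothing to permute

    for (done = 0; done < len; done += CHUNK) {
        size_t piece = (len - done < CHUNK) ? len - done : CHUNK;

        // Hold a chunk from the first interval...
        memcpy(held, file + start[0] + done, piece);
        // ...copy each following interval's chunk one step back...
        for (i = 0; i + 1 < n; i++)
            memcpy(file + start[i] + done,
                   file + start[i + 1] + done, piece);
        // ...and close the loop with the held chunk.
        memcpy(file + start[n - 1] + done, held, piece);
    }
}
```

 * On a real file each memcpy would become a seek-and-read followed
 * by a seek-and-write, and the intervals must not overlap.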
 * - Custom display and/or input formats?
 *    + for example, Zap on RISC OS is able to display a binary
 *      file at 4 bytes per line and show the ARM disassembly of
 *      each word. For added credit, ability to type an ARM
 *      instruction back _in_ and have it reassembled into binary
 *      would be even better.
 *    + a simpler example is that sometimes you want to view a
 *      file as a sequence of little-endian 32-bit words rather
 *    + this would have to involve some sort of scripting or
 *      internal API. I'd really rather the interface was nailed
 *      down very early on and people were then free to develop
 *      custom formats without my involvement; I might be
 *      persuaded to keep a library of them or a list of
 *      hyperlinks or something, but actually _maintaining_ them
 *      is more effort than I want.
 *    + ARM assembler is all very well, but what about x86, with
 *      its variable instruction length? You can start
 *      disassembling from any byte position and work forwards
 *      unambiguously, but going backwards or jumping to an
 *      arbitrary byte position is much harder. You might have to
 *      shift your current file view back or forward by one byte
 *      to resynchronise, and the semantics of insert mode become
 *      generally confused, and even trying to _predict_ what a
 *      sensible synchronisation point would be when jumping to a
 *      bit of the file you've never seen before ... yuck.
 *       * The key thing that makes this horrid is that the custom
 *         display mode looks at the file _contents_, not merely
 *         its length, when deciding how many bytes per line to
 *         display. File-position-dependent number of bytes per
 *         line is fine, but _data_ dependency is doom.
 *       * So I think that in the interests of not causing tension
 *         between random things people would like in _some_ hex
 *         editor and what makes Tweak Tweak, I am going to put my
 *         foot down and say that I will not implement any
 *         mechanism which permits a data-dependent number of
 *         bytes per line. Anything short of that, fine, send me a
 *         patch or a detailed and well thought out design and
 *         I'll consider it on its merits.
 *       * I don't, OTOH, see any reason why a custom display
 *         function couldn't be permitted to see data before or
 *         after the current lineful if it wanted to. So x86
 *         disassembly could be done in a one-byte-per-line sort
 *         of fashion in which each line shows the machine
 *         instruction which the CPU would see if it started
 *         executing at that byte, and also gave its length. Then
 *         you could pick out the sequence of instructions you
 *         were interested in from the various out-of-sync ones.
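 *
 * As an illustration of a custom display format whose bytes per
 * line do not depend on the data, here is a sketch of the
 * little-endian 32-bit word view mentioned above. The function name
 * format_le32_line and its calling convention are hypothetical; no
 * such display hook is defined anywhere yet.

```c
#include <stdio.h>
#include <string.h>

// Format up to 16 bytes of `data` as space-separated little-endian
// 32-bit words in upper-case hex. Trailing bytes that do not fill a
// whole word are ignored. `out` must have room for at least 36 chars.
static void format_le32_line(const unsigned char *data, size_t len,
                             char *out)
{
    size_t i;

    out[0] = '\0';
    for (i = 0; i + 4 <= len && i < 16; i += 4) {
        // Least significant byte comes first in the file.
        unsigned long w = (unsigned long)data[i]
                        | ((unsigned long)data[i + 1] << 8)
                        | ((unsigned long)data[i + 2] << 16)
                        | ((unsigned long)data[i + 3] << 24);
        sprintf(out + strlen(out), "%s%08lX", i ? " " : "", w);
    }
}
```

 * The line always consumes a fixed 16 bytes regardless of their
 * values, so it stays on the permitted side of the rule above.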
#if defined(unix) && !defined(GO32)
#include <sys/ioctl.h>
static void init(void);
static void done(void);
static void load_file (char *);
char toprint[256];                     /* LUT: printable versions of chars */
char hex[256][3];                      /* LUT: binary to hex, 1 byte */
char decstatus[] = "%s TWEAK "VER": %-18.18s %s posn=%-10"OFF"d size=%-10"OFF"d";
char hexstatus[] = "%s TWEAK "VER": %-18.18s %s posn=0x%-8"OFF"X size=0x%-8"OFF"X";
char *statfmt = hexstatus;
char *filename = NULL;
buffer *filedata, *cutbuffer = NULL;
int fix_mode = FALSE;
int look_mode = FALSE;
int eager_mode = FALSE;
int insert_mode = FALSE;
int edit_type = 1;                     /* 1,2 are hex digits, 0=ascii */
int finished = FALSE;
int modified = FALSE;
int new_file = FALSE;                  /* shouldn't need initialisation -
                                        * but let's not take chances :-) */
fileoffset_t width = 16;
fileoffset_t realoffset = 0, offset = 16;
int ascii_enabled = TRUE;
fileoffset_t file_size = 0, top_pos = 0, cur_pos = 0, mark_point = 0;
int main(int argc, char **argv) {
    fileoffset_t newoffset = -1, newwidth = -1;
     * Parse command line arguments
    pname = *argv;                     /* program name */
                "usage: %s [-f] [-l] [-e] filename\n"
                " or %s -D to write default tweak.rc to stdout\n",
        char c, *p = *++argv, *value;
            while (*p) switch (c = *p++) {
                 * these parameters require arguments
                        fprintf(stderr, "%s: option `-%c' requires an argument\n",
                    newoffset = parse_num(value, NULL);
                    newwidth = parse_num(value, NULL);
                fprintf(stderr, "%s: multiple filenames specified\n", pname);
        fprintf(stderr, "%s: no filename specified\n", pname);
        realoffset = newoffset;
    load_file (filename);
 * Fix up `offset' to match `realoffset'. Also, while we're here,
 * enable or disable ASCII mode and sanity-check the width.
void fix_offset(void) {
    if (3*width+11 > display_cols) {
        width = (display_cols-11) / 3;
        sprintf (message, "Width reduced to %"OFF"d to fit on the screen", width);
    if (4*width+14 > display_cols) {
        ascii_enabled = FALSE;
            edit_type = 1;             /* force to hex mode */
        ascii_enabled = TRUE;
    offset = realoffset % width;
 * Initialise stuff at the beginning of the program: mostly the
static void init(void) {
    display_define_colour(COL_BUFFER, -1, -1, FALSE);
    display_define_colour(COL_SELECT, 0, 7, TRUE);
    display_define_colour(COL_STATUS, 11, 4, TRUE);
    display_define_colour(COL_ESCAPE, 9, 0, FALSE);
    display_define_colour(COL_INVALID, 11, 0, FALSE);
    for (i=0; i<256; i++) {
        sprintf(hex[i], "%02X", i);
        toprint[i] = (i>=32 && i<127 ? i : '.');
 * Clean up all the stuff that init() did.
static void done(void) {
 * Load the file specified on the command line.
static void load_file (char *fname) {
    if ( (fp = fopen (fname, "rb")) ) {
            static char buffer[4096];
            filedata = buf_new_empty();
             * We've opened the file. Load it.
            while ( (len = fread (buffer, 1, sizeof(buffer), fp)) > 0 ) {
                buf_insert_data (filedata, buffer, len, file_size);
            assert(file_size == buf_length(filedata));
            sprintf(message, "loaded %s (size %"OFF"d == 0x%"OFF"X).",
                    fname, file_size, file_size);
            filedata = buf_new_from_file(fp);
            file_size = buf_length(filedata);
            sprintf(message, "opened %s (size %"OFF"d == 0x%"OFF"X).",
                    fname, file_size, file_size);
        if (look_mode || fix_mode) {
            fprintf(stderr, "%s: file %s not found, and %s mode active\n",
                    pname, fname, (look_mode ? "LOOK" : "FIX"));
        filedata = buf_new_empty();
        sprintf(message, "New file %s.", fname);
 * Save the file. Return TRUE on success, FALSE on error.
int save_file (void) {
    fileoffset_t pos = 0;
        return FALSE;                  /* do nothing! */
    if ( (fp = fopen (filename, "wb")) ) {
        static char buffer[SAVE_BLKSIZ];
        while (pos < file_size) {
            fileoffset_t size = file_size - pos;
            if (size > SAVE_BLKSIZ)
            buf_fetch_data (filedata, buffer, size, pos);
            if (size != fwrite (buffer, 1, size, fp)) {
 * Make a backup of the file, if such has not already been done.
 * Return TRUE on success, FALSE on error.
int backup_file (void) {
    char backup_name[FILENAME_MAX];
        return TRUE;                   /* unnecessary - pretend it's done */
    strcpy (backup_name, filename);
#if defined(unix) && !defined(GO32)
    strcat (backup_name, ".bak");
        for (p = backup_name; *p; p++) {
    remove (backup_name);              /* don't care if this fails */
    return !rename (filename, backup_name);
static unsigned char *scrbuf = NULL;
static int scrbufsize = 0;
 * Draw the screen, for normal usage.
void draw_scr (void) {
    int scrsize, scroff, llen, i, j;
    fileoffset_t currpos;
    fileoffset_t marktop, markbot;
    scrlines = display_rows - 2;
    scrsize = scrlines * width;
    if (scrsize > scrbufsize) {
        scrbuf = (scrbuf ? realloc(scrbuf, scrsize) : malloc(scrsize));
            fprintf(stderr, "%s: out of memory!\n", pname);
        scrbufsize = scrsize;
    linebuf = malloc(width*4+20);
        fprintf(stderr, "%s: out of memory!\n", pname);
    memset (linebuf, ' ', width*4+13);
    linebuf[width*4+13] = '\0';
        scroff = width - offset;
    if (scrsize > file_size - top_pos)
        scrsize = file_size - top_pos;
    buf_fetch_data (filedata, scrbuf, scrsize, top_pos);
    scrsize += scroff;                 /* hack but it'll work */
    mark = marking && (cur_pos != mark_point);
        if (cur_pos > mark_point)
            marktop = mark_point, markbot = cur_pos;
            marktop = cur_pos, markbot = mark_point;
        marktop = markbot = 0;         /* placate gcc */
    for (i=0; i<scrlines; i++) {
        display_moveto (i, 0);
        if (currpos<=cur_pos || currpos<file_size) {
            p = hex[(currpos >> 24) & 0xFF];
            p = hex[(currpos >> 16) & 0xFF];
            p = hex[(currpos >> 8) & 0xFF];
            p = hex[currpos & 0xFF];
            for (j=0; j<width; j++) {
                    if (currpos == 0 && j < width-offset)
                        p = hex[*q], c = *q++;
                linebuf[11+3*j]=p[0];
                linebuf[12+3*j]=p[1];
                linebuf[13+3*width+j]=toprint[c];
            llen = (currpos ? width : offset);
            if (mark && currpos<markbot && currpos+llen>marktop) {
                 * Some of this line is marked. Maybe all. Whatever
                 * the precise details, there will be two regions
                 * requiring highlighting: a hex bit and an ascii
                fileoffset_t localstart = (currpos<marktop ? marktop :
                fileoffset_t localstop = (currpos+llen>markbot ? markbot :
                                          currpos+llen) - currpos;
                localstart += width-llen;
                localstop += width-llen;
                display_write_chars(linebuf, 11+3*localstart);
                display_set_colour(COL_SELECT);
                display_write_chars(linebuf+11+3*localstart,
                                    3*(localstop-localstart)-1);
                display_set_colour(COL_BUFFER);
                    display_write_chars(linebuf+10+3*localstop,
                                        3+3*width+localstart-3*localstop);
                    display_set_colour(COL_SELECT);
                    display_write_chars(linebuf+13+3*width+localstart,
                                        localstop-localstart);
                    display_set_colour(COL_BUFFER);
                    display_write_chars(linebuf+13+3*width+localstop,
                    display_write_chars(linebuf+10+3*localstop,
                                        2+3*width-3*localstop);
                display_set_colour(COL_BUFFER);
                display_write_chars(linebuf,
                                    ascii_enabled ? 13+4*width : 10+3*width);
            currpos += (currpos ? width : offset);
        display_clear_to_eol();
    display_moveto (display_rows-2, 0);
    display_set_colour(COL_STATUS);
    sprintf(status, statfmt,
            (modified ? "**" : " "),
            (insert_mode ? "(Insert)" :
             look_mode ? "(LOOK) " :
             fix_mode ? "(FIX) " : "(Ovrwrt)"),
    slen = strlen(status);
    if (slen > display_cols)
    display_write_chars(status, slen);
    while (slen++ < display_cols)
        display_write_str(" ");
    display_set_colour(COL_BUFFER);
    display_moveto (display_rows-1, 0);
    display_write_str (message);
    display_clear_to_eol();
    i = cur_pos - top_pos;
    j = (edit_type ? (i%width)*3+10+edit_type : (i%width)+13+3*width);
    if (j >= display_cols)
    display_moveto (i/width, j);
volatile int safe_update, update_required;
 * Get a string, in the "minibuffer". Return TRUE on success, FALSE
 * on break. Possibly syntax-highlight the entered string for
 * backslash-escapes, depending on the "highlight" parameter.
int get_str (char *prompt, char *buf, int highlight) {
    int maxlen = 79 - strlen(prompt);  /* limit to 80 - who cares? :) */
        display_moveto (display_rows-1, 0);
        display_set_colour (COL_MINIBUF);
        display_write_str (prompt);
            char *q, *p = buf, *r = buf+len;
                    if (p<r && *p == '\\')
                        p++, display_set_colour(COL_ESCAPE);
                    else if (p>=r || !isxdigit ((unsigned char)*p))
                        display_set_colour(COL_INVALID);
                    else if (p+1>=r || !isxdigit ((unsigned char)p[1]))
                        p++, display_set_colour(COL_INVALID);
                        p+=2, display_set_colour(COL_ESCAPE);
                    while (p<r && *p != '\\')
                    display_set_colour (COL_MINIBUF);
                display_write_chars (q, p-q);
            display_write_chars (buf, len);
        display_set_colour (COL_MINIBUF);
        display_clear_to_eol();
        c = display_getkey();
        if (c == 13 || c == 10) {
        } else if (c == 27 || c == 7) {
            display_post_error();
            strcpy (message, "User Break!");
        if (c >= 32 && c <= 126) {
        if ((c == 127 || c == 8) && len > 0)
        if (c == 'U'-'@')              /* ^U kill line */
 * Take a buffer containing possible backslash-escapes, and return
 * a buffer containing a (binary!) string. Since the string is
 * binary, it cannot be null terminated: hence the length is
 * returned from the function. The string is processed in place.
 *
 * Escapes are simple: a backslash followed by two hex digits
 * represents that character; a doubled backslash represents a
 * backslash itself; a backslash followed by anything else is
 * invalid. (-1 is returned if an invalid sequence is detected.)
int parse_quoted (char *buffer) {
        while (*p && *p != '\\')
        else if (p[1] && isxdigit((unsigned char)*p) &&
                 isxdigit((unsigned char)p[1])) {
            *q++ = strtol(buf, NULL, 16);
 * Suspend program. (Or shell out, depending on OS, of course.)
#if defined(unix) && !defined(GO32)
    spawnl (P_WAIT, getenv("COMSPEC"), "", NULL);
    strcpy(message, "Suspend function not yet implemented.");
    display_recheck_size();
void schedule_update(void) {
    update_required = TRUE;
fileoffset_t parse_num (char *buffer, int *error) {
    if (!buffer[strspn(buffer, "0123456789")]) {
        /* interpret as decimal */
        return ATOOFF(buffer);
    } else if (buffer[0]=='0' && (buffer[1]=='X' || buffer[1]=='x') &&
               !buffer[2+strspn(buffer+2,"0123456789ABCDEFabcdef")]) {
        return STRTOOFF(buffer+2, NULL, 16);
    } else if (buffer[0]=='$' &&
               !buffer[1+strspn(buffer+1,"0123456789ABCDEFabcdef")]) {
        return STRTOOFF(buffer+1, NULL, 16);