X-Git-Url: https://git.distorted.org.uk/~mdw/sgt/agedu/blobdiff_plain/8b1f55d6d5f6bc6cc8ea29fbe905952e2aa9b8ac..afe761f3b2a97873cfe6363cefacaf1aafc22d84:/TODO

diff --git a/TODO b/TODO
index 6df0eb3..f8c723d 100644
--- a/TODO
+++ b/TODO
@@ -1,37 +1,23 @@
 TODO list for agedu
 ===================
 
-Before it's non-embarrassingly releasable:
+ - we could still be using more of the information coming from
+   autoconf. Our config.h is defining a whole bunch of HAVE_FOOs for
+   particular functions (e.g. HAVE_INET_NTOA, HAVE_MEMCHR,
+   HAVE_FNMATCH). We could usefully supply alternatives for some of
+   these functions (e.g. cannibalise the PuTTY wildcard matcher for
+   use in the absence of fnmatch, switch to vanilla truncate() in
+   the absence of ftruncate); where we don't have alternative code,
+   it would perhaps be polite to throw an error at configure time
+   rather than allowing the subsequent build to fail.
+    + however, I don't see anything here that looks very
+      controversial; IIRC it's all in POSIX, for one thing. So more
+      likely this should simply wait until somebody complains.
 
- - sort out the command line syntax
-    * I think there should be a unified --mode / -M for every
-      running mode, possibly without the one-letter option for the
-      diagnostic sorts of things
-    * there should be some configurable options:
-       + range limits on the age display
-       + server address in httpd mode
-       + HTTP authentication: specify username and/or password, the
-         latter by at least some means which doesn't involve it
-         showing up in "ps"
-
- - do some configurability for the disk scan
-    * wildcard-based includes and excludes
-       + wildcards can act on the last pathname component or the
-         whole lot
-       + include and exclude can be interleaved; implicit "include
-         *" before any
-    * reinstate filesystem crossing, though not doing so should
-      remain the default
-
- - work out what to do about atimes on directories
-    * one option is to read them during the scan and reinstate them
-      after each recursion pop. Race-condition prone.
-    * marking them in a distinctive colour in the reports is the
-      other option.
-
- - make a final decision on the name!
-
-Future directions:
+ - IPv6 support in the HTTP server
+    * of course, Linux magic auth can still work in this context; we
+      merely have to be prepared to open one of /proc/net/tcp or
+      /proc/net/tcp6 as appropriate.
 
  - run-time configuration in the HTTP server
     * I think this probably works by having a configuration form, or
@@ -44,60 +30,51 @@ Future directions:
     * All the same options should have their starting states
       configurable on the command line too.
 
- - polish the plain-text output:
-    + do the same formatting as in HTML, by showing files as a
-      single unit and also sorting by size? (Probably the other way
-      up, due to scrolling.)
-    + configurable recursive output depth
-
  - curses-ish equivalent of the web output
     + try using xterm 256-colour mode. Can (n)curses handle that? If
       not, try doing it manually.
+    + I think my current best idea is to bypass ncurses and go
+      straight to terminfo: generate lines of attribute-interleaved
+      text and display them, so we only really need the sequences
+      "go here and display stuff", "scroll up", "scroll down".
+    + Infrastructure work before doing any of this would be to split
+      html.c into two: one part to prepare an abstract data
+      structure describing an HTML-like report (in particular, all
+      the index lookups, percentage calculation, vector arithmetic
+      and line sorting), and another part to generate the literal
+      HTML. Then the former can be reused to produce very similar
+      reports in coloured plain text.
 
- - cross-module:
-    + figure out what to do about scans starting in the root
-      directory!
-       * Currently we end up with a double leading slash on the
-         pathnames, which is ugly, and we also get a zero-length
-         href in between those slashes which means the web interface
-         doesn't let you click back up to the top level at all.
-       * One big problem here is that a lot of the code assumes that
-         you can find the extent of a pathname by searching for
-         "foo" and "foo^A", trusting that anything inside the
-         directory will begin "foo/". So I'd need to consistently
-         fix this everywhere so that a trailing slash is disregarded
-         while doing it, but not actually removed.
-       * The text output gets it all wrong.
-       * The HTML output is fiddly even at the design stage: where
-         would I _ideally_ put the link to click on to get back to
-         /? It's unclear!
-
- - more flexible running modes
-    + decouple the disk scan from the index building code, so that
-      the former can optionally output in the same format as --dump
-      and the latter can optionally work from input on stdin (having
-      also fixed the --dump format in the process so it's perfectly
-      general). Then we could scan on one machine and transfer the
-      results over the net to another machine where they'd be
-      indexed; in particular, this way the indexing machine could be
-      64-bit even if the machine owning the filesystems was only 32.
-    + in the other direction, ability to build a database _and_
-      immediately run one of the ongoing interactive report modes
-      (httpd, curses) in a single invocation would seem handy.
-
- - portability
-    + between Unices:
-       * autoconf?
-       * configure use of stat64
-       * configure use of /proc/net/tcp
-       * configure use of /dev/random
-       * configure use of Linux syscall magic replacing readdir
-       * what do we do elsewhere about _GNU_SOURCE?
-    + http://msdn.microsoft.com/en-us/library/ms724290.aspx suggest
-      modern Windowses support atime-equivalents, so a Windows port
-      is possible in principle. Would need to modify the current
+ - http://msdn.microsoft.com/en-us/library/ms724290.aspx suggest
+   modern Windowses support atime-equivalents, so a Windows port is
+   possible in principle.
+    + For a full Windows port, would need to modify the current
       structure a lot, to abstract away (at least) memory-mapping of
-      files, details of disk scan procedure, networking for httpd,
-      the path separator character (yuck). Unclear what the right UI
-      would be on Windows, too; command-line exactly as now might be
-      considered just a _little_ unfriendly. Or perhaps not.
+      files, details of disk scan procedure, networking for httpd.
+      Unclear what the right UI would be on Windows, too;
+      command-line exactly as now might be considered just a
+      _little_ unfriendly. Or perhaps not.
+       * Disk scan procedure: the FindFirstFile / FindNextFile
+         functions to scan a directory automatically return the file
+         times along with the filenames, so there's no need to stat
+         them later. Would want to fiddle the shape of the
+         abstraction layer to reflect this.
+    + Alternatively, a much easier approach would be to write a
+      Windows version of just the --scan-dump mode, which does a
+      filesystem scan via the Windows API and generates a valid
+      agedu dump file on standard output. Then one would simply feed
+      that over the network connection of one's choice to the rest
+      of agedu running on Unix as usual.
+
+ - it might conceivably be useful to support a choice of indexing
+   strategies. The current "continuous index" mechanism' tradeoff of
+   taking O(N log N) space in order to be able to support any age
+   cutoff you like is not going to be ideal for everybody. A second
+   more conventional "discrete index" mechanism which allows the
+   user to specify a number of fixed cutoffs and just indexes each
+   directory on those alone would undoubtedly be a useful thing for
+   large-scale users. This will require considerable thought about
+   how to make the indexers pluggable at both index-generation time
+   and query time.
+    * however, now we have the cut-down version of the continuous
+      index, the space saving is less compelling.