From: simon
Date: Mon, 20 Dec 2004 09:27:44 +0000 (+0000)
Subject: The end condition in the binary search loop in the new getType() was
X-Git-Url: https://git.distorted.org.uk/u/mdw/putty/commitdiff_plain/197c43ddf6420b9941133f218b73b83cd1ea8a6c

The end condition in the binary search loop in the new getType() was
incorrect. I must have written that binary search idiom a hundred
times, so it's rather embarrassing that I can't _automatically_ get
it right! This was causing all kinds of characters to be classified
as ON when they should have been various other classes.

Also while I'm here, I've added another test case to utf8.txt (a
small piece of Arabic within a predominantly L->R line), and also
supplied a means to compile minibidi.c with -DTEST_GETTYPE to
produce a command-line character class lookup tool. (Not sure what
use that'll be _other_ than debugging this precise problem, but I
don't like to throw it away now I've written it :-)

git-svn-id: svn://svn.tartarus.org/sgt/putty@5016 cda61777-01e9-0310-a592-d414129be87e
---

diff --git a/minibidi.c b/minibidi.c
index c13276d0..d36dee07 100644
--- a/minibidi.c
+++ b/minibidi.c
@@ -38,6 +38,14 @@
 #define OISL 0x80 /* Override is L */
 #define OISR 0x40 /* Override is R */
 
+/* For standalone compilation in a testing mode.
+ * Still depends on the PuTTY headers for snewn and sfree, but can avoid
+ * _linking_ with any other PuTTY code. */
+#ifdef TEST_GETTYPE
+#define safemalloc malloc
+#define safefree free
+#endif
+
 /* Shaping Helpers */
 #define STYPE(xh) ((((xh) >= SHAPE_FIRST) && ((xh) <= SHAPE_LAST)) ? \
     shapetypes[(xh)-SHAPE_FIRST].type : SU) /*))*/
@@ -848,7 +856,7 @@ unsigned char getType(int ch)
 
     i = -1;
     j = lenof(lookup);
-    while (j - i > 2) {
+    while (j - i > 1) {
         k = (i + j) / 2;
         if (ch < lookup[k].first)
             j = k;
@@ -1810,3 +1818,47 @@ void doMirror(wchar_t* ch)
         }
     }
 }
+
+#ifdef TEST_GETTYPE
+
+#include <stdio.h>
+#include <assert.h>
+
+int main(int argc, char **argv)
+{
+    static const struct { int type; char *name; } typetoname[] = {
+#define TYPETONAME(X) { X , #X }
+        TYPETONAME(L),
+        TYPETONAME(LRE),
+        TYPETONAME(LRO),
+        TYPETONAME(R),
+        TYPETONAME(AL),
+        TYPETONAME(RLE),
+        TYPETONAME(RLO),
+        TYPETONAME(PDF),
+        TYPETONAME(EN),
+        TYPETONAME(ES),
+        TYPETONAME(ET),
+        TYPETONAME(AN),
+        TYPETONAME(CS),
+        TYPETONAME(NSM),
+        TYPETONAME(BN),
+        TYPETONAME(B),
+        TYPETONAME(S),
+        TYPETONAME(WS),
+        TYPETONAME(ON),
+#undef TYPETONAME
+    };
+    int i;
+
+    for (i = 1; i < argc; i++) {
+        unsigned long chr = strtoul(argv[i], NULL, 0);
+        int type = getType(chr);
+        assert(typetoname[type].type == type);
+        printf("U+%04x: %s\n", chr, typetoname[type].name);
+    }
+
+    return 0;
+}
+
+#endif
diff --git a/testdata/utf8.txt b/testdata/utf8.txt
index 7ad058f6..a8a87f9b 100644
--- a/testdata/utf8.txt
+++ b/testdata/utf8.txt
@@ -18,3 +18,4 @@
 Arabic and bidirectional text: (من مجمع الزوائد ومنبع الفوائد للهيثمي ، ج 1 ، ص 74-84) عن جرير رضي الله عنه قال قال رسول الله صلى الله عليه وسلم: بني الاسلام على خمس شهادة ان لا اله الا الله واقام
+Mixed LTR and RTL text: جرير رضي back to LTR.