X-Git-Url: https://git.distorted.org.uk/~mdw/sgt/charset/blobdiff_plain/c6d25d8d73da77087aa3e413af2ae72f6300891f..40724963a1889611c42e0ba1472e610fbec5429c:/README

diff --git a/README b/README
index bcfb27f..8eb7c25 100644
--- a/README
+++ b/README
@@ -25,11 +25,10 @@ not currently support. Those that I know of are:
    machinery in place, and support all the underlying character
    sets.
 
-  - ISO-2022-CN and ISO-2022-CN-EXT (RFC 1922), and EUC-TW. These
-    encodings depend on the CNS 11643-1992 character set. Mapping
-    table data for this set is available from unicode.org, but only
-    in the Unihan database
-    ftp://ftp.unicode.org/Public/UNIDATA/Unihan.zip
+  - ISO-2022-CN and ISO-2022-CN-EXT (RFC 1922). These are a little tricky
+    as they allow use of both GB2312 (simplified Chinese) and CNS 11643
+    (traditional Chinese), so we may need some way to specify which to
+    prefer.
 
  - The Hong Kong (HKSCS) extension to Big5. Again, mapping tables
    are available in the Unihan database.