[Proj] Unicode

Wed Jun 10 10:13:23 EST 2009

On Wednesday 10 June 2009 10:21:33 am Glynn Clements wrote:
> Gerald I. Evenden wrote:
> > How about considering it this way with proj:
> >
> > Data and keyword entries, associated symbols and numerics will be basic
> > ASCII, thus the simple caseless comparison can be safely made.  The only
> > exception to ASCII control input would be non-format control characters
> > in format statements---thus degree marks.  However, this will probably be
> > a problem on input data scanning.
>
> Recognising degree symbols on input is relatively straightforward.
>
> You can use the ANSI mbstowcs() function to convert to wide character
> representation; if __STDC_ISO_10646__ is defined, this will be
> Unicode.
>
> Alternatively, you can use nl_langinfo() to obtain the locale's
> encoding, and the iconv() library to convert from this to e.g.
> ISO-8859-1.
>
> Or, given the constraints of the input format, you could just forget
> about encodings altogether and treat the byte sequences \xb0
> (ISO-8859-1 [1]) or \xc2\xb0 (UTF-8) as degree symbols.
>
> [1] Actually, it's the same for all of the ISO-8859-* encodings except
> 5 (Cyrillic), 6 (Arabic), 11 (Thai), 14 (Celtic) and 12 (doesn't
> exist; it was supposed to be Devanagari, but was abandoned).
>
> > However, comments and descriptive material may be UTF-8.  That is, the
> > long descriptive output for Putnins may carry full and proper accents.
>
> The problem with output is handling the case where the user's locale
> doesn't support the characters. iconv() will terminate a conversion at
> the first character which cannot be represented in the output
> encoding.
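
The last option above (ignoring encodings and matching the raw bytes) could
be sketched in C roughly as follows.  The helper name degree_mark_len is
hypothetical and not part of proj; it only reports how many bytes at the
start of a string form a degree mark, using the \xb0 and \xc2\xb0 sequences
mentioned above.

```c
#include <stddef.h>

/* Hypothetical helper (not part of proj): returns the number of bytes
 * occupied by a degree mark at the start of s, or 0 if there is none. */
static size_t degree_mark_len(const char *s)
{
    /* Compare as unsigned bytes so values >= 0x80 are not sign-extended. */
    const unsigned char *p = (const unsigned char *)s;

    /* p[1] is only read when p[0] is non-NUL, so this never runs past
     * the terminating NUL of a valid C string. */
    if (p[0] == 0xC2 && p[1] == 0xB0)
        return 2;               /* UTF-8 encoding of U+00B0 */
    if (p[0] == 0xB0)
        return 1;               /* ISO-8859-1 and most other ISO-8859-* */
    return 0;
}
```

A scanner could call this after reading the numeric part of a field and skip
the returned number of bytes.  One caveat: in ISO-8859-1 the bytes \xc2\xb0
also spell "Â°", so testing the two-byte form first assumes that sequence
never appears in valid Latin-1 input.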

In the interest of basic sanity, I think I will withdraw back into my hardened
shell and forget about the whole thing.  It is way too much for my plebeian 
mind.

LONG LIVE ASCII!!

Have a nice day.

-- 
The whole religious complexion of the modern world is due
to the absence from Jerusalem of a lunatic asylum.
-- Havelock Ellis (1859-1939) British psychologist