[Shapelib] Re: shapelib improvements

Bram de Greve bram.degreve at bramz.net
Thu Dec 6 10:51:33 EST 2007


Mateusz Loskot wrote:
> fopen() function accepts encoding specific for local environment.
> For example, on Polish version of Windows, it accepts Windows 1250
> (latin2), on English - Windows 1252 (latin1) etc.
> Every ASCII character is valid UTF-8, but not the other way.
>
> This implies that fopen accepts only a subset of Unicode characters
> included in particular code page but does not accept the whole range of
> UTF-8.
>
>   
OK, this at least confirms my suspicion: that not all possible filenames
are accessible using fopen, on Windows that is.

> A possible solution is to use the Unicode-aware API available on
> Windows: _wfopen() (or CreateFile)
> Some time ago I ported Shapelib to use wide-character versions of
> I/O functions as I needed it on Windows CE (Unicode-only system).
>
> Unfortunately, I have lost these modifications but it isn't a big
> deal to do it.
>
> Cheers
>   
I've made a similar modification for pyshapelib (I chose _wfopen because
that's similar to what Python does), but as Frank is going to implement
CreateFile IO hooks anyway, I think we can rely on that.  As long as he
uses the Unicode versions of it, that is =).  However, it still isn't
without issues:

The CreateFile IO hooks will still accept char* filenames, which would
be OK if we treated them as UTF-8 and decoded them with
MultiByteToWideChar.  But that would cause an asymmetry with the
(default) traditional IO hooks, which use fopen and still treat char*
filenames as ANSI encoded (in whatever code page is set by the
regional settings).  If it looks like a duck ...
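
To make the UTF-8 option concrete, here is a rough sketch of what
"treat the char* as UTF-8 and decode it" could look like.  The names
open_utf8 and utf8_to_wide are made up for illustration and are not
part of Shapelib; I use _wfopen rather than CreateFile only to keep it
short, and the non-Windows branch simply assumes the OS takes the bytes
as they are.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string into a
   newly allocated wide string.  Returns NULL on conversion or
   allocation failure; the caller frees the result. */
static wchar_t *utf8_to_wide(const char *utf8)
{
    wchar_t *wide;
    /* First call only asks for the required length, NUL included. */
    int len = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
    if (len <= 0)
        return NULL;
    wide = (wchar_t *)malloc((size_t)len * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;
    /* Second call performs the actual conversion into the buffer. */
    if (MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, len) == 0) {
        free(wide);
        return NULL;
    }
    return wide;
}
#endif

/* Hypothetical helper: open a file whose name is given as UTF-8,
   independent of the ANSI code page that fopen would assume. */
FILE *open_utf8(const char *utf8_name, const char *mode)
{
#ifdef _WIN32
    wchar_t *wname = utf8_to_wide(utf8_name);
    wchar_t *wmode = utf8_to_wide(mode);
    FILE *fp = NULL;
    if (wname != NULL && wmode != NULL)
        fp = _wfopen(wname, wmode);
    free(wname);   /* free(NULL) is a no-op */
    free(wmode);
    return fp;
#else
    /* Assumption: elsewhere fopen hands the bytes to the OS unchanged. */
    return fopen(utf8_name, mode);
#endif
}
```

A caller would then write open_utf8("caf\xc3\xa9.shp", "rb") and get
the same file on any Windows locale, whereas plain fopen would read
those two bytes as two separate ANSI characters.  That difference is
exactly the asymmetry I mean.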

Bram

