[Proj] Modern C functions

support.mn at elisanet.fi
Sat Feb 28 11:54:04 EST 2009


Hi,

The latest MSVC environments want people to use the safe string functions, and they usually issue several warnings during compilation if you do not. With Unicode and other multibyte character sets you have to handle characters instead of bytes. See the following text from the MSVC help:


"Safe String Functions

String safe functions duplicate or enhance familiar string functions from the standard C run-time (CRT) library. 

Many enhancements enable the string functions to work with Unicode or extended character sets. 

The strcmp function compares string1 and string2 lexicographically and returns a value indicating their relationship. wcscmp and _mbscmp are, respectively, wide-character and multibyte-character versions of strcmp. _mbscmp recognizes multibyte-character sequences according to the current multibyte code page and returns _NLSCMPERROR on an error. (For more information, see Code Pages.) Also, if string1 or string2 is a null pointer, _mbscmp invokes the invalid parameter handler, as described in Parameter Validation. If execution is allowed to continue, _mbscmp returns _NLSCMPERROR and sets errno to EINVAL. strcmp and wcscmp do not validate their parameters. These three functions behave identically otherwise.

The _stricmp function lexicographically compares lowercase versions of string1 and string2 and returns a value indicating their relationship. _stricmp differs from _stricoll in that the _stricmp comparison is affected by LC_CTYPE, whereas the _stricoll comparison is according to the LC_CTYPE and LC_COLLATE categories of the locale. For more information on the LC_COLLATE category, see setlocale and Locale Categories. The versions of these functions without the _l suffix use the current locale for locale-dependent behavior. The versions with the suffix are identical except that they use the locale passed in instead. For more information, see Locale. 

The _strcmpi function is equivalent to _stricmp and is provided for backward compatibility only. 

Because stricmp does lowercase comparisons, it may result in unexpected behavior. 

To illustrate when case conversion by stricmp affects the outcome of a comparison, assume that you have the two strings JOHNSTON and JOHN_HENRY. The string JOHN_HENRY will be considered less than JOHNSTON because the "_" has a lower ASCII value than a lowercase S. In fact, any character that has an ASCII value between 91 and 96 will be considered less than any letter. 

If the strcmp function is used instead of stricmp, JOHN_HENRY will be greater than JOHNSTON. 

_wcsicmp and _mbsicmp are wide-character and multibyte-character versions of _stricmp. The arguments and return value of _wcsicmp are wide-character strings; those of _mbsicmp are multibyte-character strings. _mbsicmp recognizes multibyte-character sequences according to the current multibyte code page and returns _NLSCMPERROR on an error. (For more information, see Code Pages.) These three functions behave identically otherwise. 

_wcsicmp and wcscmp behave identically except that wcscmp does not convert its arguments to lowercase before comparing them. _mbsicmp and _mbscmp behave identically except that _mbscmp does not convert its arguments to lowercase before comparing them. 

You will need to call setlocale for _wcsicmp to work with Latin 1 characters. The C locale is in effect by default, so, for example, ä will not compare equal to Ä. Call setlocale with any locale other than the C locale before the call to _wcsicmp. The following sample demonstrates how _wcsicmp is sensitive to the locale: 

The strncmp function lexicographically compares, at most, the first count characters in string1 and string2 and returns a value indicating the relationship between the substrings. strncmp is a case-sensitive version of _strnicmp. wcsncmp and _mbsncmp are case-sensitive versions of _wcsnicmp and _mbsnicmp. 

wcsncmp and _mbsncmp are wide-character and multibyte-character versions of strncmp. The arguments and return value of wcsncmp are wide-character strings; those of _mbsncmp are multibyte-character strings. _mbsncmp recognizes multibyte-character sequences according to a multibyte code page and returns _NLSCMPERROR on an error. 

Also, _mbsncmp validates its parameters. If string1 or string2 is a null pointer, the invalid parameter handler is invoked, as described in Parameter Validation. If execution is allowed to continue, _mbsncmp returns _NLSCMPERROR and sets errno to EINVAL. strncmp and wcsncmp do not validate their parameters. These three functions behave identically otherwise. 

The output value is affected by the LC_CTYPE category setting of the locale; see setlocale for more information. The versions of these functions without the _l suffix use the current locale for this locale-dependent behavior; the versions with the _l suffix are identical except that they use the locale parameter passed in instead. For more information, see Locale.

The _strnicmp function lexicographically compares, at most, the first count characters of string1 and string2. The comparison is performed without regard to case; _strnicmp is a case-insensitive version of strncmp. The comparison ends if a terminating null character is reached in either string before count characters are compared. If the strings are equal when a terminating null character is reached in either string before count characters are compared, the shorter string is lesser. 

The characters from 91 to 96 in the ASCII table ('[', '\', ']', '^', '_', and '`') will evaluate as less than any alphabetic character. This ordering is identical to that of stricmp. 

_wcsnicmp and _mbsnicmp are wide-character and multibyte-character versions of _strnicmp. The arguments and return value of _wcsnicmp are wide-character strings; those of _mbsnicmp are multibyte-character strings. _mbsnicmp recognizes multibyte-character sequences according to the current multibyte code page and returns _NLSCMPERROR on an error. For more information, see Code Pages. These three functions behave identically otherwise. These functions are affected by the locale setting. The versions without the _l suffix use the current locale for their locale-dependent behavior. The versions with the _l suffix are identical except that they use the locale passed in instead. For more information, see Locale. 

All of these functions validate their parameters. If either string1 or string2 is a null pointer, the invalid parameter handler is invoked, as described in Parameter Validation. If execution is allowed to continue, these functions return _NLSCMPERROR and set errno to EINVAL. "
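
The JOHNSTON / JOHN_HENRY ordering described in the help text can be reproduced with a few lines of portable C. This is only a minimal sketch: lowercase_compare and orderings_differ are hypothetical helpers written for this illustration, not CRT functions, and the sketch assumes plain single-byte ASCII strings in the default "C" locale. The assert on the pointers stands in for the parameter validation mentioned above; it is not the MSVC invalid parameter handler.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compares two strings after lowercasing each
 * character, which is the behaviour the help text describes for
 * _stricmp. Returns <0, 0 or >0 like strcmp. */
int lowercase_compare(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);   /* simple stand-in for validation */
    for (;; ++a, ++b) {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca < cb ? -1 : 1;
        if (ca == 0)                  /* both strings ended together */
            return 0;
    }
}

/* Hypothetical helper: returns 1 when strcmp and lowercase_compare
 * disagree about the relative order of a and b, 0 otherwise. */
int orderings_differ(const char *a, const char *b)
{
    int cs = strcmp(a, b);
    int ci = lowercase_compare(a, b);
    return (cs < 0) != (ci < 0) || (cs > 0) != (ci > 0);
}
```

With these helpers, lowercase_compare("JOHNSTON", "JOHN_HENRY") is positive because '_' (95) sorts below lowercase 's' (115), while strcmp on the same pair is negative because uppercase 'S' (83) sorts below '_'.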



What a mess, somebody might think!  :)

Regards: Janne.

--------------------------------



"Gerald I. Evenden" [geraldi.evenden at gmail.com] wrote: 
> This is a quick probe of the programming types in this group.
> 
> In finalizing Rel. 2.0 of geodesic and libgeodesy I ran across functions under 
> the <string.h> heading which are not in the typical (read Harbison & Steele) 
> library references: strcasecmp and strncasecmp.  These are noted in the GNU 
> man.3 as POSIX 2001 standard routines.  Obviously, these two routines replace 
> strcmp and strncmp and make comparisons case independent.
> 
> As soon as I became aware of them I have included them to eliminate the nasty, 
> nasty fact of having to shift when typing in WGS84 and can now use 
> ellps=wgs84.
> 
> A little net browsing has yielded mixed results as to how well known these 
> functions are.
> 
> Thus my question here is: do non-Gnu C compilers used by this audience 
> recognize these functions?
> 
> -- 
> The whole religious complexion of the modern world is due
> to the absence from Jerusalem of a lunatic asylum.
> -- Havelock Ellis (1859-1939) British psychologist
> _______________________________________________
> Proj mailing list
> Proj at lists.maptools.org
> http://lists.maptools.org/mailman/listinfo/proj
> 
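
One common way to cope with compilers that lack strcasecmp and strncasecmp is a small wrapper header. The following is only a sketch under stated assumptions: the names proj_strcasecmp, proj_strncasecmp and ellps_matches are made up for this example and are not taken from the geodesic or libgeodesy sources. It relies on MSVC providing _stricmp and _strnicmp in <string.h>, as the help text above describes, and on POSIX systems declaring the two functions in <strings.h>. Other compilers may need their own branch.

```c
#include <string.h>

#if defined(_MSC_VER)
/* MSVC: no strcasecmp, use the CRT equivalents from <string.h>. */
#define proj_strcasecmp  _stricmp
#define proj_strncasecmp _strnicmp
#else
/* POSIX 2001 systems declare these in <strings.h>. */
#include <strings.h>
#define proj_strcasecmp  strcasecmp
#define proj_strncasecmp strncasecmp
#endif

/* Hypothetical helper: returns 1 if the user-supplied ellipsoid
 * argument matches the table name regardless of case, else 0. */
int ellps_matches(const char *arg, const char *name)
{
    return proj_strcasecmp(arg, name) == 0;
}
```

With this in place, ellps_matches("wgs84", "WGS84") returns 1, so ellps=wgs84 and ellps=WGS84 are treated alike.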


