[Proj] Numerical Precision in Proj, WKT, and everywhere

Craig Bruce csbruce at cubewerx.com
Thu Apr 1 17:32:15 EST 2010

```"Gerald I. Evenden" <geraldi.evenden at gmail.com> wrote:

> Prey tell, why is someone using %,16g to print out a numbet that is
> rarely, if ever, specified to more 4 to 6 digits? %g would be completely
> adequate and avoid the absurd gyration suggested in the original email.

Sometimes those 4 to 6 digits aren't round in decimal, so %g truncates the
value quite a lot.  When these values are used for subsquent calculations,
the error they contain grows.  Suppose that tiles are requested from
an OGC WMS using one-minute longitude intervals.  If generated using
%g, an example interval would be from 135.067 to 135.083 degrees.
If the application computes the span of this interval (max - min), the
result has only two significant digits (really, log10(16) = 1.2 digits).
Subsequent calculations have an enormous error.  If all sixteen digits
had been used in the WMS BBOX= parameter, the span result still has 11
significant digits.

When doing things like generating GML output, it's most practical
to truncate the numbers to 7 or so digits to reduce the data volume,
depending on the practical accuracy of the data and any transformations
that were applied.  When generating Shapefiles, you get full doubles
whether you want them or not.

In an projlib CRS definition like (proj-4.6.0, "epsg" file):

<2086> +proj=lcc +lat_1=20.71666666666667 +lat_0=20.71666666666667 +lon_0=-76.83333333333333 +k_0=0.99994848 +x_0=500000 +y_0=229126.939 +ellps=clrk66 +datum=NAD27 +units=m +no_defs  <>

what is the accumulated error in the full transformation if lat_1 is
specified to only 4 to 6 digits in the config file instead of 16?  In a
situation like this, the specified value is infinitely precise (defined by
law), so it would be a shame not to represent it as accurately as feasible.

> Of course, the other question is why is the number, apparently from a
> data base, store in binary in the first place?

It isn't.  It's stored in BCD, base 100.  However, it is converted to a
double for processing inside of application programs.  I was saying that
Oracle bizarrely has an off-by-one error in its conversion, but this bug
is suppressed if the error control is always applied.

OTOH, the "epsg" file contains some bizarre off-by-one errors, such as in:

# NAD83 / Mississippi East (ftUS)
<2254> +proj=tmerc +lat_0=29.5 +lon_0=-88.83333333333333 +k=0.9999500000000001 +x_0=300000.0000000001 +y_0=0 +ellps=GRS80 +datum=NAD83 +to_meter=0.3048006096012192 +no_defs  <>

The x_0 value here presumably should be the integer 300000 which has
an infinitely precise double representation.  I can't imagine how
the .0000000001 got added to it.  Perhaps MS Access has some strange
BCD->double conversion faults.

> Sorry for the sarcasm but the nature of this non-problem overwhelms me.

The issue is extremely general, not just applying to projection parameters.
Most values pass through systems unchanged, except for the arbitrary
truncations that people apply blindly.  Values that are suspected to
have full source precision (e.g., those which haven't been derived from
complex computations) should remain unmolested, and the 16-digit+error
control approach allows this while suppressing off-by-one eyesores in
converting to text.

--------------------------+----------------------+--------------------------
Dr. Craig S. Bruce        | Ph 819-771-8303 x205 |             CubeWerx Inc.
Senior Software Developer |   Fax 819-771-8388   |  Gatineau, Québec, Canada
csbruce at cubewerx.com      |  http://csbruce.com/ |  http://www.cubewerx.com/
--------------------------+----------------------+--------------------------
"XML is like violence; if it doesn't solve the problem,
just use more of it."
```