[Proj] How to Reengineering the cs2cs tool

Fri Jun 5 04:37:57 EST 2009

Gerald I. Evenden wrote:

> >           I am very new to projection concepts. I need to rewrite
> > the cs2cs tool in PHP or Perl to convert coordinates from source
> > EPSG:2393 to target EPSG:4326. Initially I tried to understand the
> > cs2cs tool code. I can't understand the whole structure of the code,
> > as it contains many mathematical functions. So I started to trace the
> > code line by line using the Anjuta debugger. But I don't think this is
> > the correct way. As the code contains many 'if' checks, I may miss
> > some flow of the code.
> >
> >    Please give some guidelines for understanding the concept of the
> > cs2cs tool. Where can I get the mathematical formulas directly? Then I
> > can code it in PHP or Perl without looking at the C code.
> >
> > Thanks in Advance,
> > Bala
> 
> Given this initial request and subsequent interchange I cannot help but feel 
> that you are very deeply in over your head.
> 
> First of all, all of the mathematics of the material in cs2cs is out there but 
> it is widely disseminated and would require a good deal of work to merely 
> assemble the sources.  I have done a portion of this in the libproj4 manual 
> but it is not complete nor would I consider it a prime source.
> 
> Secondly, one does need to be reasonably competent in mathematics, at a 
> collegiate sophomore level. I get the feeling that you are not.
> 
> Thirdly, you mention in one of the emails that you were looking for 
> performance and needed to compute quantities of points.  I do *not* consider 
> php nor perl suited for this type of work.  From my understanding, php is 
> primarily for html/web site purposes and perl is an interpreted language and 
> this certainly does not make it suitable for high volume applications.
> 
> I believe the previous advice to link to the existing DLLs is your best 
> option.

Ordinarily, the best option would be to run cs2cs once and feed it
multiple coordinates.

However, there's a problem with this: cs2cs uses the default stdio
buffering mode and doesn't call fflush() after each line. If cs2cs is
run with its stdout associated with a pipe, the default buffering mode
will be block buffering, so you only get output a block at a time.

This means that you can't use something like:

	while (more coordinates) {
		send input to cs2cs
		read output from cs2cs
	}

as the first read will block until cs2cs generates some output, which
won't happen until it gets more input, which won't happen because the
program supplying its input is blocked waiting for output, i.e. 
deadlock.

You would need to either:

1. Write the input coordinates to a file, then run cs2cs with its
stdin associated with that file, and read its output from your
program.

2. Run cs2cs with its stdout associated with a new file, feed it input
from your program, and read the output coordinates from that file once
cs2cs completes.

3. Use multiple threads, with one thread feeding input to cs2cs and
another reading the output.

4. Use non-blocking I/O on the output, so that the program can
continuously feed input to cs2cs and read output when it's available
without blocking when it isn't.

5. Modify cs2cs to either call setvbuf(stdout, buf, _IOLBF, bufsize)
at startup, or call fflush(stdout) after each line has been written.
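
As a rough sketch of options 1 and 3, assuming the caller is written in
Python rather than php or perl: the function names below are
hypothetical, and STAND_IN is a placeholder filter used only so the
sketch runs without PROJ installed. With PROJ 4 the command would
presumably be something like
["cs2cs", "+init=epsg:2393", "+to", "+init=epsg:4326"].

```python
import subprocess
import sys
import tempfile
import threading

# Placeholder child process (an assumption, not cs2cs): it upper-cases
# each input line, standing in for any line-oriented filter.
STAND_IN = [sys.executable, "-c",
            "import sys\nfor l in sys.stdin: sys.stdout.write(l.upper())"]


def convert_via_file(cmd, lines):
    """Option 1: put all input in a file and run the tool on that file."""
    with tempfile.TemporaryFile(mode="w+") as infile:
        infile.writelines(line + "\n" for line in lines)
        infile.flush()
        infile.seek(0)
        # The child reads the file itself, so the caller only has to
        # collect the output once the child has finished.
        result = subprocess.run(cmd, stdin=infile, stdout=subprocess.PIPE,
                                text=True, check=True)
    return result.stdout.splitlines()


def convert_with_threads(cmd, lines):
    """Option 3: one thread feeds stdin while the caller reads stdout."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)

    def feed():
        for line in lines:
            proc.stdin.write(line + "\n")
        # Closing stdin signals end of input; the child then flushes
        # whatever is still sitting in its output buffer and exits.
        proc.stdin.close()

    writer = threading.Thread(target=feed)
    writer.start()
    # Reading here cannot deadlock on buffering: the writer thread keeps
    # supplying input no matter when the child chooses to emit output.
    results = [line.rstrip("\n") for line in proc.stdout]
    writer.join()
    proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError("child exited with status %d" % proc.returncode)
    return results


if __name__ == "__main__":
    print(convert_via_file(STAND_IN, ["a b", "c d"]))
    print(convert_with_threads(STAND_IN, ["a b", "c d"]))
```

Both functions return one output line per input line for a filter that
behaves that way. Neither gives you each result immediately after its
input is sent; results only arrive as the child fills a block or
reaches end of input. Getting per-line responses still requires the
child to flush per line, which is what option 5 addresses.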

Personally, I suspect that #5 is likely to be the easiest option.

Better still would be for this change to be incorporated into a future
cs2cs release. Note that the PROJ executable has the same problem.

-- 
Glynn Clements <glynn at gclements.plus.com>