[FWTools] gdaltindex - performance issue with lots of files?

Frank Warmerdam warmerdam at pobox.com
Fri Feb 1 14:32:46 EST 2008


Paul McCullough wrote:
> When I run 
>     gdaltindex -tileindex location level0 *.tif
> in a directory with 33672 tif files, the process takes about 24 hours.
> While it does finish properly, 24 hours seems too long.
> (windows xp - sp 4 - 2GB RAM; FWTools 2.0.2)
> In other layers of my image pyramid, gdaltindex completes in what I consider
> reasonable times.
> For example, another layer with about 2000 files, runs in about 90 seconds.
> It appears that the increase in time with larger file counts is non-linear.
> I can collect more data if that would help, but I suspect there is a Big-O kind
> of problem here.
> Also, it seems quite reasonable to have 30000 tile polygons.
> 
> I have tried these forms of the cmd line on medium file counts and seen
> little difference:
>     gdaltindex -tileindex location test1 *.tif
>     gdaltindex -tileindex location test2 --optfile files
> I have not tried this unix cmd line form:
>     find $PWD -name "*.tif" -exec gdaltindex srtm-index3 {} \;

Paul,

This is:

   http://trac.osgeo.org/gdal/ticket/2158

A workaround is available now in trunk and will appear in 1.5.1.  Basically,
for GDAL 1.5, changes were made to read the file list from the directory when
a file is opened, and this turns out to be very slow in some circumstances
when there are a lot of files.  Would the directory in question happen to be
on a network drive?
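
To give a rough sense of why this would scale so badly, here is a small
sketch of the arithmetic.  It assumes the simplest possible cost model: every
file open lists the whole directory once, so N files cost about N * N
directory entries read.  The function names are hypothetical and are not part
of GDAL; the 2000-file / 90-second and 33672-file figures are taken from the
report above.

```python
# Illustrative cost model only; these helpers are hypothetical, not GDAL API.
# Assumption: each file open triggers one full listing of the directory.

def directory_entries_read(num_files: int) -> int:
    """Total directory entries scanned when every open lists all files."""
    return num_files * num_files


def scaled_runtime_seconds(base_files: int, base_seconds: float,
                           target_files: int) -> float:
    """Extrapolate a measured runtime assuming purely quadratic growth."""
    ratio = target_files / base_files
    return base_seconds * ratio * ratio


if __name__ == "__main__":
    small, large = 2000, 33672  # file counts from the report

    print("entries read, small layer:", directory_entries_read(small))
    print("entries read, large layer:", directory_entries_read(large))

    # Scale the reported 90 seconds for 2000 files up to 33672 files.
    estimate = scaled_runtime_seconds(small, 90.0, large)
    print("quadratic estimate, hours:", round(estimate / 3600, 1))
```

Under this model the large layer comes out at several hours rather than
minutes.  The reported 24 hours is longer still, so pure quadratic growth is
probably not the whole story; each listing may itself get slower on a very
large or remote directory, but that part is a guess.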

If you need to work around this in the meantime, you can use an FWTools
release from early 2007 or older.

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmerdam at pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | President OSGeo, http://osgeo.org


