[mapserver-users] [Fwd: Re: [postgis-users] PostGIS spatial query performance]

Alexander Pucher pucher@atlas.gis.univie.ac.at
Mon, 12 Aug 2002 09:01:37 +0200


This is a multi-part message in MIME format.
--------------030601060709070809030907
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Just to complete this, here is an answer I got from the PostGIS mailinglist.

I was able to speed up my query from 6 min to 2 seconds :-)

regards
alex

-- 
________________________________________________________

Institut fuer Geographie und Regionalforschung
Universitaet Wien
Kartografie und Geoinformation

Departement of Geography and Regional Research
University of Vienna
Cartography and GIS

Universitaetstr. 7, A-1010 Wien, AUSTRIA

Tel: (+43 1) 4277 48644
Fax: (+43 1) 4277 48649
E-mail: alexander.pucher@univie.ac.at

FTP: ftp://ftp.gis.univie.ac.at
WWW: http://www.gis.univie.ac.at/karto
--------------------------------------------------------
Atlas of Eastern and Southeastern Europe: http://www.aos.ac.at
--------------------------------------------------------

M$ is not the answer. M$ is the question!
No is the answer -- Eric Naggum


--------------030601060709070809030907
Content-Type: message/rfc822;
 name="Re: [postgis-users] PostGIS spatial query performance"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline;
 filename="Re: [postgis-users] PostGIS spatial query performance"

>From - Fri Aug  9 08:15:09 2002
X-Mozilla-Status2: 00000000
Received: from mx1.univie.ac.at (mx1.univie.ac.at [131.130.1.28])
	by atlas.gis.univie.ac.at (8.9.3/8.9.3) with ESMTP id UAA10164
	for <pucher@atlas.gis.univie.ac.at>; Thu, 8 Aug 2002 20:16:25 +0200
Received: from offsite.refractions.net (informationpartnership.entirety.ca [204.174.116.40])
	by mx1.univie.ac.at (8.12.2/8.12.2) with ESMTP id g78IFmJN072556
	for <pucher@atlas.gis.univie.ac.at>; Thu, 8 Aug 2002 20:15:51 +0200
Received: from offsite.refractions.net (localhost.localdomain [127.0.0.1])
	by offsite.refractions.net (Postfix) with ESMTP
	id B396843639; Thu,  8 Aug 2002 11:16:01 -0700 (PDT)
Delivered-To: postgis-users@offsite.refractions.net
Received: from uvapsy.fmg.uva.nl (uvapsy.fmg.uva.nl [145.18.152.23])
	by offsite.refractions.net (Postfix) with ESMTP id 9A06E43639
	for <postgis-users@postgis.refractions.net>; Thu,  8 Aug 2002 11:15:00 -0700 (PDT)
Received: from jhartmann (jhartmann.fmg.uva.nl [145.18.125.57])
	by uvapsy.fmg.uva.nl (8.9.3/8.9.3) with SMTP id UAA05165
	for <postgis-users@postgis.refractions.net>; Thu, 8 Aug 2002 20:15:08 +0200 (MET DST)
From: "Jan Hartmann" <jhart@frw.uva.nl>
To: <postgis-users@postgis.refractions.net>
Subject: RE: [postgis-users] PostGIS spatial query performance
Message-ID: <DIEALLGCLLCNIHBDCMAEAEJACEAA.jhart@frw.uva.nl>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
In-Reply-To: <3D528DDD.9977BB10@refractions.net>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300
Importance: Normal
Sender: postgis-users-admin@postgis.refractions.net
Errors-To: postgis-users-admin@postgis.refractions.net
X-BeenThere: postgis-users@postgis.refractions.net
X-Mailman-Version: 2.0.8
Precedence: bulk
Reply-To: postgis-users@postgis.refractions.net
List-Unsubscribe: <http://postgis.refractions.net/mailman/listinfo/postgis-users>,
	<mailto:postgis-users-request@postgis.refractions.net?subject=unsubscribe>
List-Id: PostGIS Users Discussion <postgis-users.postgis.refractions.net>
List-Post: <mailto:postgis-users@postgis.refractions.net>
List-Help: <mailto:postgis-users-request@postgis.refractions.net?subject=help>
List-Subscribe: <http://postgis.refractions.net/mailman/listinfo/postgis-users>,
	<mailto:postgis-users-request@postgis.refractions.net?subject=subscribe>
List-Archive: <http://postgis.refractions.net/pipermail/postgis-users/>
Date: Thu, 8 Aug 2002 20:15:08 +0200

It is possible to do this much faster. As Paul said, you can speed up a
distance query (which doesn't use indices) by preceding it with an overlap
query, like:

select park.*
from park,roads
where park.the_geom && roads.the_geom
and distance(park.the_geom,mypoints.the_geom) = 0

This will select all parks whose bounding boxes overlap with roads, using an
index. The distance function then uses this subset to retrieve those parks
that actually overlap. In your case, you also want all parks within 2 km (?)
of roads, so the && operator is too restrictive. However, the same argument
holds for the box3d operator: you can first test on overlapping bounding
boxes (plus your margin), and use the distance function on the resulting set
only. Like:

select park.*
from park,roads
where xmin(box3d(park.the_geom)) <= xmax(box3d(roads.the_geom)) + 2
and   xmax(box3d(park.the_geom)) >= xmin(box3d(roads.the_geom)) - 2
and   ymin(box3d(park.the_geom)) <= ymax(box3d(roads.the_geom)) + 2
and   ymax(box3d(park.the_geom)) >= ymin(box3d(roads.the_geom)) - 2
and distance(park.the_geom,roads.the_geom) <= 2


This should give you much faster retrieval times; it did so on my data by
some orders of magnitude. Of course, if an individual  road winds through
the whole area, its bounding box will overlap with everything and you won't
have much time gain then. If that's really a problem, you could split up the
roads into smaller parts.

I'm not sure David's point is valid. As far as I understand it, PostgreSQL
optimizes joins, among other things by using indexes. The problem here seems
to be that indexes won't used with the distance function, but will when
using && and bbox. I guess it would be reasonably difficult to program the
distance function in such a way that it can be used by PostgreSQL
index-based functions

regards,

Jan Hartmann
Department of Geography
University of Amsterdam
jhart@frw.uva.nl

-----Original Message-----
From: postgis-users-admin@postgis.refractions.net
[mailto:postgis-users-admin@postgis.refractions.net]On Behalf Of Paul
Ramsey
Sent: Thursday, August 08, 2002 5:27 PM
To: postgis-users@postgis.refractions.net
Subject: Re: [postgis-users] PostGIS spatial query performance


Yes, this is sadly true. In order to figure out the answer, there is
choice by to interogate every feature in at least one of the tables.

In the distance-from-a-fixed-point case, it is possible to leverage the
indexing to make the process very fast. Construct a query rectagle which
is slightly larger than your desired filter radius, and do an overlap
query (&&) as well as the distance() query. The && query will use the
index and drastically subset the amount of features which need the
distance() test applied to them.

Ideally, distance() would do this automatically in some way, so that
even the full join distance(geomcolumn,geomcolumn case would at least
finish in closer to a*n time rather than n*m time.

P.

David Garnier wrote:
>
> Hello,
> Spatial indexing can't do anything for you since you're explicitly
> asking for a full join between two tables. If you think about it, your
> queries computes the distance between each possible pairs of shapes in
> your tables. So if you have 10000 shapes in each table,this means that
> distance will be called 100.000.000 times. Ouch.
>
> You should try to find another way to get the data you're looking for.
>
> Best Regards,
> David Garnier
>
> le jeu 08-08-2002 à 12:31, Alexander Pucher a écrit :
> > Hi,
> >
> > I have a question concerning the performance of a spatial query in
PostGIS.
> >
> > -) I put all GMAP demo layers into my PostgreSQL/PostGIS database.
> > -) I created GIST indexes for all tables.
> >
> > create index park_gist on park
> > using GIST (the_geom GIST_GEOMETRY_OPS)
> > .
> > .
> > .
> >
> > -) Did vacuum analyze all tables.
> >
> > OK, I want to query all parks in the GMAP demo that have a road running
> > through them.
> >
> > My SQL query:
> >
> > SELECT park.*
> > FROM park, road
> > WHERE DISTANCE(park.the_geom, road.the_geom) < 2
> >
> >
> >
> > EXPLAIN SELECT park.*
> > FROM park, road
> > WHERE DISTANCE(park.the_geom, road.the_geom) < 2
> >
> > gives me the following output:
> >
> > Nested Loop  (cost=0.00..8917.66 rows=23736 width=163)
> >   ->  Seq Scan on park  (cost=0.00..7.46 rows=46 width=131)
> >   ->  Seq Scan on road  (cost=0.00..170.48 rows=1548 width=32)
> >
> >
> > the query works fine and gives me the correct result after.....about 6
> > minutes ( on my 1Ghz Notebook)!!
> >
> > I expected the result to be there within seconds?!?
> >
> > Something wrong with the indexes??
> >
> > Any help is welcome.
> >
> > regards
> > alex
> >
> >
> >
> >
> > --
> > ________________________________________________________
> >
> > Institut fuer Geographie und Regionalforschung
> > Universitaet Wien
> > Kartografie und Geoinformation
> >
> > Departement of Geography and Regional Research
> > University of Vienna
> > Cartography and GIS
> >
> > Universitaetstr. 7, A-1010 Wien, AUSTRIA
> >
> > Tel: (+43 1) 4277 48644
> > Fax: (+43 1) 4277 48649
> > E-mail: alexander.pucher@univie.ac.at
> >
> > FTP: ftp://ftp.gis.univie.ac.at
> > WWW: http://www.gis.univie.ac.at/karto
> > --------------------------------------------------------
> > Atlas of Eastern and Southeastern Europe: http://www.aos.ac.at
> > --------------------------------------------------------
> >
> > M$ is not the answer. M$ is the question!
> > No is the answer -- Eric Naggum
> >
> >
> >
> > _______________________________________________
> > postgis-users mailing list
> > postgis-users@postgis.refractions.net
> > http://postgis.refractions.net/mailman/listinfo/postgis-users
> >
>
> _______________________________________________
> postgis-users mailing list
> postgis-users@postgis.refractions.net
> http://postgis.refractions.net/mailman/listinfo/postgis-users

_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users



--------------030601060709070809030907--