Channel: SQL Server Spatial forum

Select on table with 1800 rows is slow

I have a table with 1800 rows. Each entry has a geometry position and a geometry polygon around the position. I am using the polygon to detect which (other) entries are near the current entry.

In the following test data and query, I filter on 625 (of 1875) rows and then use the .STContains method to find other rows. (The test data is fully found by this query; in the live database the values are not as regular as in the test data.)

The query takes 6500 ms. In the live database, only 800 records are in the table so far, and it takes 2200 ms.

select SlowQueryTable.id
from SlowQueryTable  
inner join dbo.SlowQueryTable as SlowQueryTableSeen
    on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1         
where   SlowQueryTable.userId = 2   

(The query in the live system is more complex, but this is the main part of it, and even simplified like this it takes too long.)

This script generates test data and runs the query:

-- The number table is just needed to generate test data
CREATE TABLE [dbo].[numbers](
    [number] [int] NOT NULL
)
go
declare @t table (number int) 
insert into @t  select 0 union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9 

insert into numbers 
    select * from 
    (
        select 
            t1.number + t2.number*10 + t3.number*100 + t4.number*1000 as x
        from 
            @t as t1,  
            @t as t2, 
            @t as t3, 
            @t as t4 
    ) as  t1
    order by x       
go


-- this is the table which has the slow query. The Columns [userId], [position] and [box] are the relevant ones
CREATE TABLE [dbo].SlowQueryTable(
    [id] [int] IDENTITY(1,1) NOT NULL,
    [userId] [int] NOT NULL,    
    [position] [geometry] NOT NULL, 
    [box] [geometry] NULL,  
    constraint SlowQueryTable_primary primary key clustered (id)
);
create nonclustered index SlowQueryTable_UserIdKey on [dbo].SlowQueryTable(userId);

-- Insert test data: three users with 625 entries each. Each entry per user has a unique position and a rectangle (box) around it.
-- In the database in question, the positions are a bit more random; often tens of entries share the same position. The slow query is nevertheless reproducible with this test data.
declare @range int;
set @range = 5;
INSERT INTO [dbo].SlowQueryTable (userId,position,box)
select 
    users.number,
    geometry::STGeomFromText('POINT (' + convert(varchar(15), X) + ' ' + convert(varchar(15), Y) + ')',0),
    geometry::STPolyFromText('POLYGON ((' + convert(varchar(15), X - @range) + ' ' + convert(varchar(15), Y - @range) + ', ' + convert(varchar(15), X + @range) + ' ' + convert(varchar(15), Y - @range) + ', ' + convert(varchar(15), X + @range) + ' ' + convert(varchar(15), Y + @range) + ', ' + convert(varchar(15), X - @range) + ' ' + convert(varchar(15), Y + @range) + ',' + convert(varchar(15), X - @range) + ' ' + convert(varchar(15), Y - @range) + '))', 0)
from (
select 
    (numberX.number * 40) + 4520 as X
,(numberY.number * 40) + 4520 as Y
from numbers  as numberX  
cross apply numbers  as numberY 
where numberX.number < (1000 / 40)
and numberY.number < (1000 / 40)) as positions
cross apply numbers  as users 
where users.number < 3


CREATE SPATIAL INDEX [SlowQueryTable_position]  
   ON [dbo].SlowQueryTable([position])
   USING GEOMETRY_GRID
   WITH ( 
   BOUNDING_BOX = ( 4500, 4500, 5500, 5500 ),
   GRIDS =(LEVEL_1 =   HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH),  
   CELLS_PER_OBJECT = 64, PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF,   ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
go
ALTER INDEX [SlowQueryTable_position] ON [dbo].SlowQueryTable
REBUILD;
go

CREATE SPATIAL INDEX [SlowQueryTable_box] 
   ON [dbo].SlowQueryTable(box)
   USING GEOMETRY_GRID
   WITH ( BOUNDING_BOX = ( 4500, 4500, 5500, 5500 ) ,
   GRIDS =(LEVEL_1 =   HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH),  
   CELLS_PER_OBJECT = 64, PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF,   ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
go
ALTER INDEX [SlowQueryTable_box] ON [dbo].SlowQueryTable
REBUILD;
go


SET STATISTICS IO ON
SET STATISTICS TIME ON




-- Finally, this is the query. It takes about 6500 ms.
select SlowQueryTable.id
    into #t1
    from SlowQueryTable  
    inner join dbo.SlowQueryTable as SlowQueryTableSeen
        on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1 
        --on SlowQueryTable.position.STDistance(SlowQueryTableSeen.position) < 5
    where   SlowQueryTable.userId = 2   



drop table #t1
drop table SlowQueryTable
drop table numbers

Using an explicit index hint does the job, but then the query gets slow if I change the where clause:

select SlowQueryTable.id
    into #t1
    from SlowQueryTable 
    with (index([SlowQueryTable_box])) 
    inner join dbo.SlowQueryTable as SlowQueryTableSeen
        on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1             
    where   SlowQueryTable.userId = 2   

This takes 600 ms, and changing the where clause to

where   SlowQueryTable.id = 100

slows it down again to 1200 ms. Filtering on id is slowed down massively when the index hint on the spatial index is used.
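
To see how selective a spatial index is for a typical query window, the system procedure sp_help_spatial_geometry_index can be run against it. The following is only a sketch: it assumes the row with id = 100 exists, uses its box as the sample window, and must run before the drop statements in the test script. The variable name @sample is made up for illustration.

```sql
-- Take the box of one row as a representative query window (assumes id = 100 exists).
declare @sample geometry;
select @sample = [box] from dbo.SlowQueryTable where id = 100;

-- Report grid and filter statistics of the position index for that window.
-- @verboseoutput = 1 asks for the full property list instead of the core set.
exec sp_help_spatial_geometry_index
    @tabname       = 'dbo.SlowQueryTable',
    @indexname     = 'SlowQueryTable_position',
    @verboseoutput = 1,
    @query_sample  = @sample;
```

The same call with 'SlowQueryTable_box' as the index name shows the corresponding figures for the box index.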

Since the table in the live system will grow to 10000+ rows and the query is called often by users, I badly need a more efficient query.

Do I have to create different queries for each use case, some with index hints and some without?
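
One variant that may be worth testing, as a sketch only: move the hint from the outer table to the joined alias, so that the spatial index on position (rather than the one on box) is probed once per filtered row. Whether this is faster for both where clauses is an assumption, not a measured result, and the optimizer may reject the hint with an error if it cannot build a plan around it.

```sql
-- Same query as above, but the hint is placed on the joined alias and names
-- the index on [position]. The outer table is left to the optimizer, so a
-- filter on userId or on id can use the regular indexes.
select SlowQueryTable.id
    into #t1
    from SlowQueryTable
    inner join dbo.SlowQueryTable as SlowQueryTableSeen
        with (index([SlowQueryTable_position]))
        on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1
    where   SlowQueryTable.userId = 2   -- or: SlowQueryTable.id = 100

drop table #t1
```

If this form performs acceptably for both filters, a single query text would be enough; if not, separate hinted and unhinted queries remain the fallback.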



