I have a table with roughly 1800 rows. Each entry has a geometry position and a geometry polygon around that position. I use the polygon to detect which other entries are near the current entry.
In the following test data and the subsequent query, I filter on 625 (of 1875) rows and then use the .STContains method to find other rows. (The test data is fully matched by this query; in the live database the values are not as regular as in the test data.)
The query takes 6500 ms. In the live database, the table holds only 800 records so far, and the query takes 2200 ms.
select SlowQueryTable.id
from SlowQueryTable
inner join dbo.SlowQueryTable as SlowQueryTableSeen
on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1
where SlowQueryTable.userId = 2
(The query in the live system is more complex, but this is the main part of it, and even in this simplified form it takes too long.)
This script generates test data and runs the query:
-- The number table is just needed to generate test data
CREATE TABLE [dbo].[numbers](
[number] [int] NOT NULL
)
go
declare @t table (number int)
insert into @t select 0 union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9
insert into numbers
select * from
(
select
t1.number + t2.number*10 + t3.number*100 + t4.number*1000 as x
from
@t as t1,
@t as t2,
@t as t3,
@t as t4
) as t1
order by x
go
-- this is the table which has the slow query. The Columns [userId], [position] and [box] are the relevant ones
CREATE TABLE [dbo].SlowQueryTable(
[id] [int] IDENTITY(1,1) NOT NULL,
[userId] [int] NOT NULL,
[position] [geometry] NOT NULL,
[box] [geometry] NULL,
constraint SlowQueryTable_primary primary key clustered (id)
);
create nonclustered index SlowQueryTable_UserIdKey on [dbo].SlowQueryTable(userId);
--insert testdata: three users with each 625 entries. Each entry per user has its unique position, and a rectangle (box) around it.
-- In the database in question, the positions are a bit more random, often tens of entries have the same position. The slow query is nevertheless visible with these testdata
declare @range int;
set @range = 5;
INSERT INTO [dbo].SlowQueryTable (userId,position,box)
select
users.number,
geometry::STGeomFromText('POINT (' + convert(varchar(15), X) + ' ' + convert(varchar(15), Y) + ')',0),
geometry::STPolyFromText('POLYGON ((' + convert(varchar(15), X - @range) + ' ' + convert(varchar(15), Y - @range) + ', '+ convert(varchar(15), X + @range) + ' ' + convert(varchar(15), Y - @range) + ', '+ convert(varchar(15), X + @range) + ' ' + convert(varchar(15), Y + @range) + ', '+ convert(varchar(15), X - @range) + ' ' + convert(varchar(15), Y + @range) + ','+ convert(varchar(15), X - @range) + ' ' + convert(varchar(15), Y - @range) + '))', 0)
from (
select
(numberX.number * 40) + 4520 as X
,(numberY.number * 40) + 4520 as Y
from numbers as numberX
cross apply numbers as numberY
where numberX.number < (1000 / 40)
and numberY.number < (1000 / 40)) as positions
cross apply numbers as users
where users.number < 3
CREATE SPATIAL INDEX [SlowQueryTable_position]
ON [dbo].SlowQueryTable([position])
USING GEOMETRY_GRID
WITH (
BOUNDING_BOX = ( 4500, 4500, 5500, 5500 ),
GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH),
CELLS_PER_OBJECT = 64, PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
go
ALTER INDEX [SlowQueryTable_position] ON [dbo].SlowQueryTable
REBUILD;
go
CREATE SPATIAL INDEX [SlowQueryTable_box]
ON [dbo].SlowQueryTable(box)
USING GEOMETRY_GRID
WITH ( BOUNDING_BOX = ( 4500, 4500, 5500, 5500 ) ,
GRIDS =(LEVEL_1 = HIGH,LEVEL_2 = HIGH,LEVEL_3 = HIGH,LEVEL_4 = HIGH),
CELLS_PER_OBJECT = 64, PAD_INDEX = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
go
ALTER INDEX [SlowQueryTable_box] ON [dbo].SlowQueryTable
REBUILD;
go
SET STATISTICS IO ON
SET STATISTICS TIME ON
-- this is finally the query. it takes about 6500 ms
select SlowQueryTable.id
into #t1
from SlowQueryTable
inner join dbo.SlowQueryTable as SlowQueryTableSeen
on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1
--on SlowQueryTable.position.STDistance(SlowQueryTableSeen.position) < 5
where SlowQueryTable.userId = 2
drop table #t1
drop table SlowQueryTable
drop table numbers
Using an explicit index hint does the job, but the query becomes slow again when I change the where clause:
select SlowQueryTable.id
into #t1
from SlowQueryTable
with (index([SlowQueryTable_box]))
inner join dbo.SlowQueryTable as SlowQueryTableSeen
on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1
where SlowQueryTable.userId = 2
With the hint, this query takes 600 ms. Changing the where clause to
where SlowQueryTable.id = 100
slows it down again, to 1200 ms. Filtering on id becomes much slower when the index hint on the spatial index is in place.
Since the table in the live system will grow to 10000+ rows, and the query is called often by users, I badly need a more efficient query.
Do I have to create a different query for each use case, some with index hints and some without?
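To make that last question concrete, this is roughly the kind of split I would like to avoid. It is only a sketch: the procedure name dbo.FindSeenEntries and its parameters are hypothetical, and I have not measured it. It simply picks the hinted variant for the userId filter and the unhinted variant for the id filter.

```sql
-- Hypothetical wrapper (name and parameters are made up for illustration).
-- It branches between the two query variants described above.
CREATE PROCEDURE dbo.FindSeenEntries
    @userId int = NULL,
    @id     int = NULL
AS
BEGIN
    SET NOCOUNT ON;

    IF @id IS NOT NULL
    BEGIN
        -- Filter on the primary key: no index hint, since the hint
        -- on the spatial index made this case slower in my test.
        select SlowQueryTable.id
        from dbo.SlowQueryTable
        inner join dbo.SlowQueryTable as SlowQueryTableSeen
            on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1
        where SlowQueryTable.id = @id;
    END
    ELSE
    BEGIN
        -- Filter on userId: force the spatial index on [box],
        -- which brought this case down to about 600 ms.
        select SlowQueryTable.id
        from dbo.SlowQueryTable with (index([SlowQueryTable_box]))
        inner join dbo.SlowQueryTable as SlowQueryTableSeen
            on SlowQueryTable.[box].STContains(SlowQueryTableSeen.position) = 1
        where SlowQueryTable.userId = @userId;
    END
END
```

It would be called as exec dbo.FindSeenEntries @userId = 2 or exec dbo.FindSeenEntries @id = 100. Maintaining one branch per filter like this is exactly what I hope is not necessary.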