Is there a way to do nearest neighbour search in MySQL? - mysql

I have the following table:
CREATE TABLE number_t ( v DOUBLE, id INTEGER, INDEX(v) )
and I want to run a query with a parameter q that sorts the rows by the distance abs( q - v ). For example
SELECT v, id, ABS( q - v ) AS d FROM number_t ORDER BY d
I tried the query above and this one:
SELECT v, id, (v - q) AS d FROM number_t WHERE (v - q) > 0
ORDER BY d
I also tried slight variations of the above:
SELECT v, id, (v - q) AS d FROM number_t WHERE v > q ORDER BY v
They are not equivalent, but I don't mind running two queries and keeping two independent cursors. However, in all cases EXPLAIN reports a filesort and says that no index would be used. Can I get MySQL to use an index for this problem?

Did you try:
SELECT MIN(v), id FROM number_t WHERE v >= q
UNION
SELECT MAX(v), id FROM number_t WHERE v < q
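This is MySQL-specific rather than standard SQL, because id is selected next to an aggregate without a GROUP BY, so the id returned is not guaranteed to come from the row that holds the MIN or MAX. The id can be retrieved in a second step once you have the exact values.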
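The two-query idea from the question can be sketched as follows. This is only an illustration: it uses Python's sqlite3 module as a stand-in for MySQL, the helper name nearest_by_value and the sample rows are made up, and whether MySQL actually avoids the filesort for these range scans should be confirmed with EXPLAIN. Each side is a plain range condition on v with ORDER BY v and a LIMIT, and the two short result lists are merged by distance in the client, which also returns the correct id for each value.

```python
import sqlite3


def nearest_by_value(conn, q, k=1):
    """Return up to k (v, id, distance) tuples closest to q.

    Runs one range scan on each side of q so that the ORDER BY can follow
    the index on v, then merges the two short lists in the client.
    """
    above = conn.execute(
        "SELECT v, id FROM number_t WHERE v >= ? ORDER BY v ASC LIMIT ?", (q, k)
    ).fetchall()
    below = conn.execute(
        "SELECT v, id FROM number_t WHERE v < ? ORDER BY v DESC LIMIT ?", (q, k)
    ).fetchall()
    candidates = [(v, row_id, abs(q - v)) for v, row_id in above + below]
    candidates.sort(key=lambda row: row[2])
    return candidates[:k]


# Hypothetical sample data, only for the demonstration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE number_t (v REAL, id INTEGER)")
conn.execute("CREATE INDEX idx_v ON number_t (v)")
conn.executemany(
    "INSERT INTO number_t VALUES (?, ?)",
    [(1.0, 1), (4.0, 2), (6.5, 3), (10.0, 4)],
)
print(nearest_by_value(conn, 5.0, 2))
```

Each query fetches at most k rows, so the client-side sort only ever handles 2k candidates.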

You can use the spatial extension and a POINT data type. Then you can do a proximity search, for example by testing whether a point is inside a bounding box. You can also use my quadkey library, which uses a Hilbert curve and a Mercator projection. You can download my PHP class hilbert curve at phpclasses.org.

Related

Select the nearest point for each row in table

I use this to find the nearest point
SELECT
id,
ST_Distance(
POINT(52.760667210533,-7.22646337599035),
geo_point
) as distance
from Points
order by distance limit 1
I have a temp table TempPoints with all my candidate points and I want to normalise them onto OSM nodes, but there are a lot of them, so I need a single query that resolves them all in one call. UNION won't let me use ORDER BY, and my DB raw query interface won't let me fire a series of queries separated by ';'.
The temp table has lat and lon but could just as easily hold a POINT. How can I do something like
select id,NearestTo(TempPoint.geo_point,Points) from TempPoints;
EDIT: I can parenthesise each select in my large union query, which solves my issue.
I would still like to be able to join on nearest row.
This might work for you:
SELECT t.id as tid, p.id as pid, p.geo_point
FROM TempPoints t
JOIN Points p ON p.id = (
SELECT p1.id
FROM Points p1
ORDER BY ST_Distance(p1.geo_point, t.geo_point)
LIMIT 1
)
My solution is to issue a series of queries, one for each row, and bind them together with a UNION. The MySQL stack will overflow eventually, so you need to run them in blocks, but 1000 per block is OK on a default install.
You have to parenthesize the queries because each one includes an ORDER BY. Some points may fail to match, so I label every query with a literal line_no sequence number so you can edit and filter the originals. You also need to restrict each query with a
WHERE Contains(<polygon>,point)
clause, where polygon is a bounding box you have to build with GEOMFROMTEXT() and POLYGON(); otherwise MySQL will try to sort the whole table. And of course you need a spatial index on the column. Here's some code:
var SMALL=0.001 // half-width of the bounding box, in degrees
var line_no=0 // sequence number used to label each sub-query
var query=points
  .map(function(point){
    var bottom=point.lat+SMALL
    var top=point.lat-SMALL
    var left=point.lon-SMALL
    var right=point.lon+SMALL
    // Closed ring: the first and last vertex must be identical
    var polygon=[
      [bottom,left],
      [top,left],
      [top,right],
      [bottom,right],
      [bottom,left]
    ]
    polygon="POLYGON(("+polygon.map(function(vertex){
      return vertex.join(' ')
    })
    .join(",")+"))"
    point.line_no=line_no++
    // One parenthesized SELECT per point, so each can carry its own ORDER BY
    return "(SELECT "+point.line_no+" as line_no,id, ST_Distance(POINT("+
      point.lat+","+point.lon+
      "),geo_point) as distance"+
      " from Points "+
      " WHERE Contains(GeomFromText('"+polygon+"'),geo_point) "+
      " order by distance limit 1) "
  })
  .join(" UNION ")+" order by line_no"
// This snippet is taken from inside a function, hence the return
return sequelize.query(query)
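The batching mentioned above (blocks of about 1000 sub-queries per UNION) is not shown in the snippet. A minimal sketch of it, written in Python purely for illustration, could look like the following; the helper names in_blocks and union_block are made up here, and the SELECT strings are placeholders rather than the real spatial queries.

```python
def in_blocks(items, block_size=1000):
    """Yield consecutive slices of items, each at most block_size long."""
    for start in range(0, len(items), block_size):
        yield items[start:start + block_size]


def union_block(selects):
    """Parenthesize each SELECT and join the block into one UNION statement."""
    return " UNION ".join("(" + s + ")" for s in selects) + " ORDER BY line_no"


# Placeholder sub-queries; the real ones would carry the ST_Distance logic
selects = ["SELECT %d AS line_no" % n for n in range(2500)]
statements = [union_block(block) for block in in_blocks(selects)]
print(len(statements))
```

Each entry of statements can then be sent as a separate call, and because line_no keeps counting across blocks, the results can be stitched back together afterwards.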

Speed up SQL nested query

I have a table called ntr_perf with the following column: data, cos, network, tot_reg, tot_rej.
I need to get the sums of tot_reg and tot_rej for each pair data-cos (I need to take all data-cos pairs and make the sum of the values for all the networks with the same data-cos pair).
I'm using the following MySQL query:
SELECT DISTINCT
data AS d,
cos AS c,
(SELECT SUM(tot_reg) FROM ntr_perf WHERE data=d AND cos=c) AS sumattempts,
(SELECT SUM(tot_rej) FROM ntr_perf WHERE data=d AND cos=c) AS sumrej FROM ntr_perf
It takes a very long time even though the table has only 91,450 rows (the table has a multi-column index on data and cos).
Is it possible to speed up the query?
This is exactly what group by is designed for.
Try this:
SELECT data,cos,SUM(tot_reg),SUM(tot_rej) from ntr_perf group by data,cos
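As a quick check that the GROUP BY form returns the same rows as the nested version, here is a small sketch. It uses Python's sqlite3 module instead of MySQL, the sample rows are invented, and the nested query is written with explicit table aliases (o and i) rather than the column aliases from the question, since alias resolution inside subqueries differs between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ntr_perf (data TEXT, cos TEXT, network TEXT,
                       tot_reg INTEGER, tot_rej INTEGER);
-- Hypothetical sample rows
INSERT INTO ntr_perf VALUES
    ('2013-01-01', 'gold',   'n1', 10, 1),
    ('2013-01-01', 'gold',   'n2', 20, 2),
    ('2013-01-01', 'silver', 'n1',  5, 0),
    ('2013-01-02', 'gold',   'n1',  7, 3);
""")

# Nested form: two correlated subqueries are evaluated for every row
nested = conn.execute("""
SELECT DISTINCT o.data, o.cos,
    (SELECT SUM(i.tot_reg) FROM ntr_perf i
      WHERE i.data = o.data AND i.cos = o.cos) AS sumattempts,
    (SELECT SUM(i.tot_rej) FROM ntr_perf i
      WHERE i.data = o.data AND i.cos = o.cos) AS sumrej
FROM ntr_perf o
ORDER BY o.data, o.cos
""").fetchall()

# GROUP BY form: a single pass that aggregates each (data, cos) pair
grouped = conn.execute("""
SELECT data, cos, SUM(tot_reg) AS sumattempts, SUM(tot_rej) AS sumrej
FROM ntr_perf
GROUP BY data, cos
ORDER BY data, cos
""").fetchall()

print(grouped)
```

Both queries yield one row per data-cos pair; the grouped one simply avoids re-running two subqueries for each of the table's rows.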
You can use this query:
SELECT data AS d, cos AS c,
SUM(tot_reg), SUM(tot_rej) FROM ntr_perf WHERE
data='d' AND cos='c' GROUP BY data, cos;
Hope that helps. If not, let me know and I will help you.
Try this, a group by
SELECT data d,
cos c,
SUM(tot_reg) sumattempts,
SUM(tot_rej) sumrej
FROM ntr_perf
WHERE data = 'd' -- if these are values, put in single quotes
AND cos = 'c' -- if these are values, put in single quotes
GROUP BY data, -- even though aliased, the original name needs to be used on the GROUP BY
cos -- even though aliased, the original name needs to be used on the GROUP BY
This will group your query and filter your sums, using the WHERE conditions you posted (MySQL spells the conditional function IF, not IIF):
SELECT
data AS d,
cos AS c,
SUM(IF(data='d' AND cos='c', tot_reg, 0)) AS sumattempts,
SUM(IF(data='d' AND cos='c', tot_rej, 0)) AS sumrej
FROM
ntr_perf
GROUP BY
data,
cos

MySQL Select within another select

I have a query as follows
select
Sum(If(departments.vat, If(weeklytransactions.weekendingdate Between
'2011-01-04' And '2099-12-31', weeklytransactions.takings / 1.2,
If(weeklytransactions.weekendingdate Between '2008-11-30' And '2010-01-01',
weeklytransactions.takings / 1.15, weeklytransactions.takings / 1.175)),
weeklytransactions.takings)) As Total,
weeklytransactions.weekendingdate,......
and another that returns a vat rate as follows
select format(Max(Distinct vat_rates.Vat_Rate),3) From vat_rates Where
vat_rates.Vat_From <= '2011-01-03'
I want to replace the hard coded if statement with the lower query, replacing the date in the lower query with weeklytransactions.weekendingdate.
After Kevin's comments, here is the full query I'm trying to get to work;
Select Max(vat_rates.vat_rate) As r,
If(departments.vat, weeklytransactions.takings / r, weeklytransactions.takings) As Total,
weeklytransactions.weekendingdate,
Week(weeklytransactions.weekendingdate),
round(datediff(weekendingdate, (if(month(weekendingdate)>5,concat(year(weekendingdate),'-06-01'),concat(year(weekendingdate)-1,'-06-01'))))/7,0)+1 as fyweek,
cast((Case When Month(weeklytransactions.weekendingdate) >5 Then Concat(Year(weeklytransactions.weekendingdate), '-',Year(weeklytransactions.weekendingdate) + 1) Else Concat(Year(weeklytransactions.weekendingdate) - 1, '-',Year(weeklytransactions.weekendingdate)) End) as char) As fy,
business_units.business_unit
From departments Inner Join (business_units Inner Join weeklytransactions On business_units.buID = weeklytransactions.businessUnit) On departments.deptid = weeklytransactions.departmentId
Where (vat_rates.vat_from <= weeklytransactions.weekendingdate and business_units.Active = true and business_units.sales=1)
Group By weeklytransactions.weekendingdate, business_units.business_unit Order By fy desc, business_unit, fyweek
Regards
Pete
Assuming I read your question correctly, your problem is about returning the result of another SELECT as part of the result of your main query (and, depending on how acquainted you are with SQL, maybe you haven't had the occasion to learn about JOINs yet?).
You can extract data from a subquery within a SELECT, provided you define it in the FROM clause. The following query will work, for example:
SELECT A.a, B.b
FROM A
JOIN (SELECT aggregate(c) AS b FROM C) AS B
Notice that there is no reference to table A within the subquery. You cannot simply add one, because a subquery in the FROM clause is evaluated on its own and cannot see the tables of the outer query. So the following won't work:
SELECT A.a, B.b
FROM A
JOIN (SELECT aggregate(c) AS b FROM C WHERE C.someValue = A.someValue) AS B
Back to basics. What you visibly want to do here is aggregate some data associated with each record of another table. For that, you will need to merge your SELECT queries and use GROUP BY:
SELECT A.a, aggregate(C.c)
FROM A, C
WHERE C.someValue = A.someValue
GROUP BY A.a
Back to your tables, the following should work:
SELECT w.weekendingdate, FORMAT(MAX(v.Vat_Rate), 3)
FROM weeklytransactions AS w, vat_rates AS v
WHERE v.Vat_From <= w.weekendingdate
GROUP BY w.weekendingdate
Feel free to add and remove fields and conditions as you see fit. I wouldn't be surprised if you also wanted a lower bound when filtering the records from vat_rates, since the way I have written it above, a given weekendingdate matches every rate that started on or before that date, not only the one in force that week.
So it looks like my first try did not address the actual problem. With the additional information provided in the comments, as well as the new complete query, let's see how this goes.
We are still missing error messages, but normally the query as written should result in MySQL having the following complaint:
ERROR 1109 (42S02): Unknown table 'vat_rates' in field list
Why? Because the vat_rates table does not appear in the FROM clause, whereas it should. Let's make that more obvious by simplifying the query, removing all references to the business_units table as well as the fields, calculations and order that do not add or remove anything to the problem, leaving us with the following:
SELECT MAX(vat_rates.vat_rate) AS r,
IF(d.vat, w.takings / r, w.takings) AS Total
FROM departments AS d
INNER JOIN weeklytransactions AS w ON w.departmentId = d.deptid
WHERE vat_rates.vat_from <= w.weekendingdate
GROUP BY w.weekendingdate
That cannot work, and will produce the error mentioned above. It looks like there is no foreign key between the weeklytransactions and vat_rates tables, so we have no choice but to do a CROSS JOIN for the moment, hoping that the condition in the WHERE clause and the aggregate function used to get r are enough to fit the business logic at hand here. The following query should return the expected data instead of an error message (I also removed r, since it seems to be an intermediate value judging by the comments that were written):
SELECT IF(d.vat, w.takings / MAX(v.vat_rate), w.takings) AS Total
FROM vat_rates AS v, departments AS d
INNER JOIN weeklytransactions AS w ON w.departmentId = d.deptid
WHERE v.vat_from <= w.weekendingdate
GROUP BY w.weekendingdate
From there, assuming it works, you will only need to put back all the parts I removed to get your final query. I am a tad doubtful about the way the VAT rate is gotten here, but I have no idea what your requirements are in that regard so I leave it up to you to make sure that works as expected.
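The doubt about how the VAT rate is obtained can be made concrete with a small experiment. The sketch below uses Python's sqlite3 module as a stand-in for MySQL and compares MAX(vat_rate) with an alternative that is not part of the answer above: a correlated subquery that picks the rate with the latest vat_from on or before each weekendingdate. The vat_rates rows are invented so that they reproduce the hard-coded IF from the question (the 2000-01-01 start date in particular is an assumption), and the tables are reduced to the columns that matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vat_rates (vat_from TEXT, vat_rate REAL);
CREATE TABLE weeklytransactions (weekendingdate TEXT, takings REAL);
-- Invented rows mirroring the hard-coded IF in the question
INSERT INTO vat_rates VALUES
    ('2000-01-01', 1.175),
    ('2008-11-30', 1.15),
    ('2010-01-02', 1.175),
    ('2011-01-04', 1.2);
INSERT INTO weeklytransactions VALUES
    ('2009-06-07', 115.0),
    ('2010-06-06', 117.5),
    ('2011-06-05', 120.0);
""")

# Rate chosen by MAX over every rate that started on or before the date
by_max = conn.execute("""
SELECT w.weekendingdate, MAX(v.vat_rate)
FROM weeklytransactions w
JOIN vat_rates v ON v.vat_from <= w.weekendingdate
GROUP BY w.weekendingdate
ORDER BY w.weekendingdate
""").fetchall()

# Rate in force on the date: the row with the most recent vat_from
by_latest = conn.execute("""
SELECT w.weekendingdate,
       (SELECT v.vat_rate
          FROM vat_rates v
         WHERE v.vat_from <= w.weekendingdate
         ORDER BY v.vat_from DESC
         LIMIT 1) AS rate
FROM weeklytransactions w
ORDER BY w.weekendingdate
""").fetchall()

print(by_max)
print(by_latest)
```

With these sample rows the two approaches disagree for the 2009 week: MAX still sees the older 1.175 rate, while the latest-row lookup returns 1.15, which is what the original hard-coded IF would have used. If rates only ever increased, both would agree.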

Do I need a temporary table?

I have the following query and I need to add "and distance < 10" to the where clause. Because 'distance' is a computed column, it can't be used in the where clause. I have tried HAVING instead of WHERE, but that breaks the replace part.
I think the answer is to use a temporary table for the distance computation, but I can't figure out the syntax, as nothing I have tried works.
All help appreciated please. Thanks.
select
Contractor.contractorID,
Contractor.firstName,
Contractor.lastName,
Contractor.emailAddress,
Contractor.nationality,
Contractor.dateOfBirth,
Contractor.address1,
Contractor.address2,
Contractor.city,
Contractor.county,
Contractor.postcode,
Contractor.country,
Contractor.tel,
Contractor.mob,
postcodes.Grid_N Grid_N1,
postcodes.Grid_E Grid_E1,
(select Grid_N from postcodes where pCode='".$postcode."') Grid_N2,
(select Grid_E from postcodes where pCode='".$postcode."') Grid_E2,
( (select sqrt(((Grid_N1-Grid_N2)*(Grid_N1-Grid_N2))+((Grid_E1-Grid_E2)*(Grid_E1-Grid_E2))) ))/1000*0.621371192 as distance
from
Contractor,
postcodes
where
postcodes.Pcode = replace(substring(Contractor.postcode,1,length(Contractor.postcode)-3),'','')
order by
distance asc
In MySQL, you can do this with:
where . . .
having distance < 10
order by distance;
You don't need to put the other conditions in the having clause. Also, your query could benefit from using ANSI standard join syntax (using the on clause, for instance).
SELECT list
, of
, columns
, including
, distance
FROM (
<your existing query>
/* minus the ORDER BY clause (it needs to be on the outermost query) */
) As a_subquery
WHERE distance < 10
ORDER
BY some_stuff
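Here is a small runnable sketch of that derived-table pattern. It runs on SQLite through Python's sqlite3 module rather than on MySQL, the places table and its rows are invented, and sqrt is registered from Python because not every SQLite build ships it. The point is only that the outer WHERE can refer to distance once the computation lives in the inner query.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Register sqrt explicitly, since SQLite math functions are optional
conn.create_function("sqrt", 1, math.sqrt)
conn.executescript("""
CREATE TABLE places (name TEXT, grid_n REAL, grid_e REAL);
-- Hypothetical sample rows
INSERT INTO places VALUES
    ('a', 3.0, 4.0),
    ('b', 6.0, 8.0),
    ('c', 30.0, 40.0),
    ('d', 0.0, 1.0);
""")

origin = {"n": 0.0, "e": 0.0}
within_range = conn.execute("""
SELECT name, distance
FROM (
    SELECT name,
           sqrt((grid_n - :n) * (grid_n - :n)
              + (grid_e - :e) * (grid_e - :e)) AS distance
    FROM places
) AS a_subquery
WHERE distance < 10
ORDER BY distance
""", origin).fetchall()
print(within_range)
```

Row 'b' sits at exactly distance 10 and is therefore excluded by the strict comparison.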

Exclude mysql table records that have matching, newer records

I have a system that queries a db table of game activity at various coordinates. I have a need to query for activities by a certain player but I need to exclude any that match newer entries at the same coordinates (by other players or not, just anything).
A sample query is:
SELECT * FROM prism_actions
WHERE world = 'world'
AND (action_type = 'block-place')
AND (player = 'viveleroi')
AND (x BETWEEN -448.7667627678472 AND -438.7667627678472)
AND (y BETWEEN 62.0 AND 72.0)
AND (z BETWEEN -291.17236958025796 AND -281.17236958025796)
ORDER BY x,y,z ASC
LIMIT 0,1000000
I've tried making it work with a subquery and inner join but just can't get it. I really need to be able to make this speedy as well.
Essentially I need this query to exclude any record at coords X,Y,Z when there's another record with the same X,Y,Z but a newer action_time.
I've also considered some way of expiring records at the same x,y,z when a new match is entered but that seems not quite as efficient as I'd like as well.
You can do this by joining on a subquery that finds the latest action_time for each set of coordinates, so that only the newest record at each x,y,z survives the join:
SELECT prism_actions.*
FROM prism_actions
JOIN (
SELECT x, y, z, max(action_time) as action_time
FROM prism_actions
GROUP BY x, y, z) latest
ON prism_actions.action_time = latest.action_time
AND prism_actions.x = latest.x
AND prism_actions.y = latest.y
AND prism_actions.z = latest.z
WHERE prism_actions.world = 'world'
AND (prism_actions.action_type = 'block-place')
AND (prism_actions.player = 'viveleroi')
AND (prism_actions.x BETWEEN -448.7667627678472 AND -438.7667627678472)
AND (prism_actions.y BETWEEN 62.0 AND 72.0)
AND (prism_actions.z BETWEEN -291.17236958025796 AND -281.17236958025796)
ORDER BY prism_actions.x, prism_actions.y, prism_actions.z ASC
LIMIT 0,1000000
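To check the behaviour of that join on a tiny data set, here is a sketch using Python's sqlite3 module in place of MySQL. The rows are invented, action_time is a plain integer for simplicity, and the coordinate range filters are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prism_actions (
    id INTEGER PRIMARY KEY, world TEXT, action_type TEXT, player TEXT,
    x INTEGER, y INTEGER, z INTEGER, action_time INTEGER
);
-- Hypothetical sample rows
INSERT INTO prism_actions VALUES
    (1, 'world', 'block-place', 'viveleroi', 1, 1, 1, 100),
    (2, 'world', 'block-break', 'other',     1, 1, 1, 200),
    (3, 'world', 'block-place', 'viveleroi', 2, 2, 2, 150),
    (4, 'world', 'block-place', 'viveleroi', 3, 3, 3,  50),
    (5, 'world', 'block-place', 'viveleroi', 3, 3, 3, 300);
""")

surviving_ids = [row[0] for row in conn.execute("""
SELECT a.id
FROM prism_actions a
JOIN (
    SELECT x, y, z, MAX(action_time) AS action_time
    FROM prism_actions
    GROUP BY x, y, z
) latest
  ON a.action_time = latest.action_time
 AND a.x = latest.x AND a.y = latest.y AND a.z = latest.z
WHERE a.world = 'world'
  AND a.action_type = 'block-place'
  AND a.player = 'viveleroi'
ORDER BY a.x, a.y, a.z
""")]
print(surviving_ids)
```

Row 1 is dropped because another player acted later at the same coordinates, and row 4 is dropped in favour of the newer row 5. Note that the subquery here groups by x, y, z only, as in the answer; if the same coordinates can occur in several worlds, world probably belongs in the GROUP BY and the join condition as well.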