Select the nearest point for each row in table

Select the nearest point for each row in table - mysql

I use this to find the nearest point
SELECT
id,
ST_Distance(
POINT(52.760667210533,-7.22646337599035),
geo_point
) as distance
from Points
order by distance limit 1
I have a temp table TempPoints with all my candidate points and I want to normalise them onto OSM nodes, but there's lots, so I need a single query to resolve them all in one call. UNION wont let me use order by, and my DB raw query interface wont let me just fire a series of queries separated by ';'.
The temp table has lat and lon but can just as easily have a POINT. How can I go
select id,NearestTo(TempPoint.geo_point,Points) from TempPoints;
EDIT: I can parenthesise each select in my large union query, which solves my issue.
I would still like to be able to join on nearest row.

This might work for you:
SELECT t.id as tid, p.id as pid, p.geo_point
FROM TempPoint t
JOIN Points p ON p.id = (
SELECT p1.id
FROM Points p1
ORDER BY ST_Distance(p1.geo_point, t.geo_point)
LIMIT 1
)

My solution is to issue a series of queries, one for each row, and bind them together with a UNION. The mysql stack will blow eventually so you need to do them in blocks, but 1000 is OK on a default install.
You have to parenthesize the queries as they include an order by. Some points may fail so I label them all with a literal line_no sequence so you can edit and filter the originals. You also need to restrict the query with a
WHERE Contains(<polygon>,point)
clause, else it will try and sort the whole table, where polygon is a bounding box you have to cook up with GEOMFROMTEXT() and POLYGON(). And of course you need a special spatial index on the column!. Here's some code
var SMALL=0.001
var=query=points
.map(function(point){
var bottom=point.lat+SMALL
var top=point.lat-SMALL
var left=point.lon-SMALL
var right=point.lon+SMALL
var polygon=[
[bottom,left],
[top,left],
[top,right],
[bottom,right],
[bottom,left]
]
polygon="POLYGON(("+polygon.map(function(point){
return point.join(' ')
})
.join(",")+"))"
point.line_no=line_no++
return "(SELECT "+point.line_no+" as line_no,id, ST_Distance(POINT("+
point.lat+","+point.lon+
"),geo_point) as distance"+
" from Points "+
" WHERE Contains(GeomFromText('"+polygon+"'),geo_point) "+
" order by distance limit 1) "
})
.join(" UNION ")+" order by line_no"
return sequelize.query(query)

Related

JOIN on keys that don't have the same value

I am trying to do an INNER JOIN on two tables that have similar values, but not quite the same. One table has a fully qualified host name for its primary key, and the other the hosts short name, as well as the subdomain. It it safe to assume that the short name and the subdomain together are unique.
So I've tried:
SELECT table1.nisinfo.* FROM table1.nisinfo INNER JOIN table2.hosts ON (table1.nisinfo.shortname + '.' + table1.nisinfo.subdomainname + '.domain.com') = table2.hosts.fqhn WHERE table2.hosts.package = 'somepkg';
This doesn't return the results I expect, it returns the first result hundreds of times. I'd like to return distinct rows. It takes a long time to run as well.
What am I doing wrong? I was thinking of running a subquery to get the hostnames, but I don't know what the right path from here is.
Thank you!

You can use group by in your query so you can achieve the desired results you want
please see this two links
Group by with 2 distinct columns in SQL Server
http://www.sqlteam.com/article/how-to-use-group-by-with-distinct-aggregates-and-derived-tables

Try putting your results into a temp table and then view the table to make sure that the columns are as expected.
SELECT table1.nisinfo.*, table1.nisinfo.shortname + '.' + table1.nisinfo.subdomainname + '.domain.com' AS ColID
INTO #temp
FROM table1.nisinfo;
Select *
from #temp INNER JOIN table2.hosts ON ##temp.ColID = table2.hosts.fqhn
WHERE table2.hosts.package = 'somepkg'
;
Put a Group By clause at the end of the second statement

So in this case, I used a subquery to get the initial results, and then used a join.
SELECT table1.nisinfo.* FROM table1.nisinfo JOIN (SELECT distinct(fqhn) FROM table2.hosts WHERE package = 'bash') AS FQ ON ((SUBSTRING_INDEX(FQ.fqhn, '.', 1)) = table1.nisinfo.shortname);

Using a COUNT value in an expression getting..does not include specified expression as part of an aggregate function

I am trying to display a warning if a bike station gets to over 90% full or less than 10% full. When i run this query I get "you are trying to execute query that does not include the iif statment... as part of an aggregate function.
Bike_locations table - Bicycle_id and Locations_ID
Locations table - Locations_ID, No_of_Spaces, Location_Address
SELECT Locations.Location_Address, Count(Bike_Locations.Bicycle_ID) AS CountOfBicycle_ID,
IIf(((([CountOfBicycle_ID]/[LOCATIONS]![No_Of_Spaces])*100)>90),"This Station is nearly full.
Need to move some bicycles out of here",IIf(((([CountOfBicycle_ID]/[LOCATIONS]![No_Of_Spaces])*100)
<10),"This station is nearly empty. Need to move some bicycles here","")) AS Warnings
FROM Locations INNER JOIN Bike_Locations ON Locations.[LOCATIONS_ID] = Bike_Locations.[LOCATIONS_ID]
GROUP BY Locations.Location_Address;
Anyone got a scooby

When you use a GROUP BY, you should have the exact same fields in both your SELECT and GROUP BY statements, except for the aggregate function that should only be specified in the SELECT
The aggregate function in your case is the COUNT(*)
The fields you aggregate on are:
in the SELECT : Location_Address and Warnings
in the GROUP BY : Location_Address only
The error message is telling you that you don't have the same in both statements.
2 solutions:
Remove the Warnings from the SELECT statement
Add the Warnings to the GROUP BY statement
Note that in MS Access SQL, you can't (unfortunately) use in the GROUP BY, the Aliases specified in the SELECT. So you have to copy over the whole field, which would be the long iif in your case
Edit: better solution proposal:
I would radically change your approach as you'll go no where with all those nested iff
Create the following Query and Name it (for instance) Stations_Occupation
SELECT L.Locations_ID AS ID,
L.Location_Address AS Addr,
L.No_of_Spaces AS TotSpace,
BL.cnt AS OccSpace,
ROUND((BL.cnt/L.No_of_Spaces*100),0) AS OccPourc
FROM Locations L
LEFT JOIN
(
SELECT Locations_ID, COUNT(*) AS cnt
FROM Bike_Locations
GROUP BY LOCATIONS_ID
) AS BL ON L.Locations_ID = BL.Locations_ID
This query will probably be a lot helpfull in many parts of your application, and not only here, as it calculates the occupation % of each station
Some examples:
Get all stations with >90% occupation:
SELECT Addr
FROM Stations_Occupation
WHERE OccPourc > 90
Get all stations with <10% occupation:
SELECT Addr
FROM Stations_Occupation
WHERE OccPourc < 10
Get Occupation level of a specific station:
SELECT OccPourc
FROM Stations_Occupation
WHERE ID=specific_station_ID
Get number of bikes and max on a specific station:
SELECT OccSpace & "/" & TotSpace
FROM Stations_Occupation
WHERE ID=specific_station_ID

MYSQL GROUP_CONCAT and IN

I have a little query, it goes like this:
It's slightly more complex than it looks, the only issue is using the output of one subquery as the parameter for an IN clause to generate another. It works to some degree - but it only provides the results from the first id in the "IN" clause. Oddly, if I manually insert the record ids "00003,00004,00005" it does give the proper results.
What I am seeking to do is get second level many to many relationship - basically tour_stops have items, which in turn have images. I am trying to get all the images from all the items to be in a JSON string as 'item_images'. As stated, it runs quickly, but only returns the images from the first related item.
SELECT DISTINCT
tour_stops.record_id,
(SELECT
GROUP_CONCAT( item.record_id ) AS in_item_ids
FROM tour_stop_item
LEFT OUTER JOIN item
ON item.record_id = tour_stop_item.item_id
WHERE tour_stop_item.tour_stops_id = tour_stops.record_id
GROUP BY tour_stops.record_id
) AS rel_items,
(SELECT
CONCAT('[ ',
GROUP_CONCAT(
CONCAT('{ \"record_id\" : \"',record_id,'\",
\"photo_credit\" : \"',photo_credit,'\" }')
)
,' ]')
FROM images
WHERE
images.attached_to IN(rel_items) AND
images.attached_table = 'item'
ORDER BY img_order ASC) AS item_images
FROM tour_stops
WHERE
tour_stops.attached_to_tour = $record_id
ORDER BY tour_stops.stop_order ASC
Both of these below answers I tried, but it did not help. The second example (placing the entire first subquery inside he "IN" statement) not only produced the same results I am already getting, but also increased query time exponentially.
EDIT: I replaced my IN statement with
IN(SELECT item_id FROM tour_stop_item WHERE tour_stops_id = tour_stops.record_id)
and it works, but it brutally slow now. Assuming I have everything indexed correctly, is this the best way to do it?
using group_concat in PHPMYADMIN will show the result as [BLOB - 3B]
GROUP_CONCAT in IN Subquery
Any insights are appreciated. Thanks

I am surprised that you can use rel_items in the subquery.
You might try:
concat(',', images.attached_to, ',') like concat('%,', rel_items, ',%') and
This may or may not be faster. The original version was fast presumably because there are no matches.
Or, you can try to change your in clause. Sometimes, these are poorly optimized:
exists (select 1
from tour_stop_item
where tour_stops_id = tour_stops.record_id and images.attached_to = item_id
)
And then be sure you have an index on tour_stop_item(tour_stops_id, item_id).

MYSQL: How to use outer aliases in subquery properly?

The first query works just fine. It returns one row from the table 'routepoint'. It has a certain 'route_id' and 'geo_distance()' is on its minimum given the parameters. I know that the subquery in the FROM section seems unnecessarily complicated but in my eyes it helps to highlight the problem with the second query.
The differences are in the last two rows.
SELECT rp.*
FROM routepoint rp, route r, (SELECT * FROM ride_offer WHERE id = 6) as ro
WHERE rp.route_id = r.id
AND r.id = ro.current_route_id
AND geo_distance(rp.lat,rp.lng,52372070,9735690) =
(SELECT MIN(geo_distance(lat,lng,52372070,9735690))
FROM routepoint rp1, ride_offer ro1
WHERE rp1.route_id = ro1.current_route_id AND ro1.id = 6);
The next query does not work at all. It completely freezes mysql and I have to restart.
What am I doing wrong? The first subquery returns excactly one row. I don't understand the difference.
SELECT rp.*
FROM routepoint rp, route r, (SELECT * FROM ride_offer WHERE id = 6) as ro
WHERE
rp.route_id = r.id
AND r.id = ro.current_route_id
AND geo_distance(rp.lat,rp.lng,52372070,9735690) =
(SELECT MIN(geo_distance(lat,lng,52372070,9735690))
FROM routepoint rp1
WHERE rp1.route_id = ro.current_route_id);

The problem is, as pointed out by Romain, that this is costly.
This article describes an algorithm that reduces the cost by a 2-step process.
Step 1: Find a bounding box that contains at least one point.
Step 2: Find the closest point by examining all points in the bounding box, which should be a comparatively small number, thus not so costly.

indexes in mysql SELECT AS or using Views

I'm in over my head with a big mysql query (mysql 5.0), and i'm hoping somebody here can help.
Earlier I asked how to get distinct values from a joined query
mysql count only for distinct values in joined query
The response I got worked (using a subquery with join as)
select *
from media m
inner join
( select uid
from users_tbl
limit 0,30) map
on map.uid = m.uid
inner join users_tbl u
on u.uid = m.uid
unfortunately, my query has grown more unruly, and though I have it running, joining into a derived table is taking too long because there is no indexes available to the derived query.
my query now looks like this
SELECT mdate.bid, mdate.fid, mdate.date, mdate.time, mdate.title, mdate.name,
mdate.address, mdate.rank, mdate.city, mdate.state, mdate.lat, mdate.`long`,
ext.link,
ext.source, ext.pre, meta, mdate.img
FROM ext
RIGHT OUTER JOIN (
SELECT media.bid,
media.date, media.time, media.title, users.name, users.img, users.rank, media.address,
media.city, media.state, media.lat, media.`long`,
GROUP_CONCAT(tags.tagname SEPARATOR ' | ') AS meta
FROM media
JOIN users ON media.bid = users.bid
LEFT JOIN tags ON users.bid=tags.bid
WHERE `long` BETWEEN -122.52224684058 AND -121.79760915942
AND lat BETWEEN 37.07500915942 AND 37.79964684058
AND date = '2009-02-23'
GROUP BY media.bid, media.date
ORDER BY media.date, users.rank DESC
LIMIT 0, 30
) mdate ON (mdate.bid = ext.bid AND mdate.date = ext.date)
phew!
SO, as you can see, if I understand my problem correctly, i have two derivative tables without indexes (and i don't deny that I may have screwed up the Join statements somehow, but I kept messing with different types, is this ended up giving me the result I wanted).
What's the best way to create a query similar to this which will allow me to take advantage of the indexes?
Dare I say, I actually have one more table to add into the mix at a later date.
Currently, my query is taking .8 seconds to complete, but I'm sure if I could take advantage of the indexes, this could be significantly faster.

First, check for indices on ext(bid, date), users(bid) and tags(bid), you should really have them.
It seems, though, that it's LONG and LAT that cause you most problems. You should try keeping your LONG and LAT as a (coordinate POINT), create a SPATIAL INDEX on this column and query like that:
WHERE MBRContains(#MySquare, coordinate)
If you can't change your schema for some reason, you can try creating additional indices that include date as a first field:
CREATE INDEX ix_date_long ON media (date, `long`)
CREATE INDEX ix_date_lat ON media (date, lat)
These indices will be more efficient for you query, as you use exact search on date combined with a ranged search on axes.

Starting fresh:
Question - why are you grouping by both media.bid and media.date? Can a bid have records for more than one date?
Here's a simpler version to try:
SELECT
mdate.bid,
mdate.fid,
mdate.date,
mdate.time,
mdate.title,
mdate.name,
mdate.address,
mdate.rank,
mdate.city,
mdate.state,
mdate.lat,
mdate.`long`,
ext.link,
ext.source,
ext.pre,
meta,
mdate.img,
( SELECT GROUP_CONCAT(tags.tagname SEPARATOR ' | ')
FROM tags
WHERE ext.bid = tags.bid
ORDER BY tags.bid GROUP BY tags.bid
) AS meta
FROM
ext
LEFT JOIN
media ON ext.bid = media.bid AND ext.date = media.date
JOIN
users ON ext.bid = users.bid
WHERE
`long` BETWEEN -122.52224684058 AND -121.79760915942
AND lat BETWEEN 37.07500915942 AND 37.79964684058
AND ext.date = '2009-02-23'
AND users.userid IN
(
SELECT userid FROM users ORDER BY rank DESC LIMIT 30
)
ORDER BY
media.date,
users.rank DESC
LIMIT 0, 30

You might want to compare your perforamnces against using a temp table for each selection, and joining those tables together.
create table #whatever
create table #whatever2
insert into #whatever select...
insert into #whatever2 select...
select from #whatever join #whatever 2
....
drop table #whatever
drop table #whatever2
If your system has enough memory to hold full tables this might work out much faster. It depends on how big your database is.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Select the nearest point for each row in table - mysql

This might work for you: SELECT t.id as tid, p.id as pid, p.geo_point FROM TempPoint t JOIN Points p ON p.id = ( SELECT p1.id FROM Points p1 ORDER BY ST_Distance(p1.geo_point, t.geo_point) LIMIT 1 )

Related

JOIN on keys that don't have the same value

Using a COUNT value in an expression getting..does not include specified expression as part of an aggregate function

MYSQL GROUP_CONCAT and IN

MYSQL: How to use outer aliases in subquery properly?

indexes in mysql SELECT AS or using Views

Categories

Resources