SQL: Two different queries to merge - mysql

I have these two different queries.
This query pulls the records from "posts" table as per their replies counter. Only posts with replies are returned with this query:
SELECT posts.title, posts.num, posts.status, COUNT( posts_replies.post_num) AS count
FROM posts_replies
INNER JOIN posts ON ( posts_replies.post_num = posts.num )
WHERE posts.status = 1
AND posts.category='uncategorized'
GROUP BY posts.num
And this is a new query that i want to merge with the above one to pull and sort records as per gps.
SELECT num, title, ( 3959 * acos( cos( radians( 37 ) ) * cos( radians( lat ) ) * cos( radians( lon ) - radians( -122 ) ) + sin( radians( 37 ) ) * sin( radians( lat ) ) ) ) AS distance
FROM posts
HAVING distance <75
ORDER BY distance
This query uses the columns lat and long to return records that are within the 75 miles radius of the user.
I am not a sql expert and don't know how to merge both of the queries to gather results having the following criteria:
Only return posts with replies
Sort by their distance
Sort by their number of replies
Any help would be highly appreciated.
Thanks!

The having clause in the second query does not look correct. In most dialects of SQL is would not be allowed without a group by. I forget if MySQL t implicitly treats the whole query as an aggregation (returning one row) or if the having gets converted to a where. In either case, you should be explicit and use where when there are no aggregations.
You can just combine them by putting in the where clause. I would do it with a subquery, to make the variable definitions clearer:
SELECT p.title, p.num, p.status, p.distance,
COUNT( p_replies.post_num) AS count
FROM posts_replies pr INNER JOIN
(select p.*,
( 3959 * acos( cos( radians( 37 ) ) * cos( radians( lat ) ) * cos( radians( lon ) - radians( -122 ) ) + sin( radians( 37 ) ) * sin( radians( lat ) ) ) ) AS distance
from posts p
) p
ON pr.post_num = p.num
WHERE p.status = 1 AND
p.category='uncategorized' and
distance < 75
GROUP BY p.num
order by distance

Related

using first sql statement result into another sql statement

What basically i want to do is pick all the coordinates from roadData
one by one and then find all the point in tweetMelbourne within 20
miles of it and insert those point into another table.
So for every (x,y) in roadData table find neighbouring data point from
tweetMelbourne and insert those points into another new table.
So I have to do this:
SELECT geo_coordinates_latitude, geo_coordinates_longitude
FROM tweetmelbourne
HAVING ( 3959 * acos( cos( radians(latitude) ) * cos( radians( geo_coordinates_latitude ) ) *
cos( radians( geo_coordinates_longitude ) - radians(longitude) ) + sin( radians(latitude) ) *
sin( radians( geo_coordinates_latitude ) ) ) ) < .1 ORDER BY distance LIMIT 0 , 20;
in which the value of latitude and longitude i have to get from another table :
select longitude,latitude from roadData;
describe tweetmelbourne;
describe roadData;
SELECT geo_coordinates_latitude, geo_coordinates_longitude
FROM tweetmelbourne;
select longitude,latitude from roadData;
The correct syntax of IN() with multiple arguments is : (Val1,Val2) IN(SELECT VAL1,val2..
SELECT t.address,(t.x+t.y) as z
FROM student t
WHERE (t.x,t.y) IN(SELECT x,y FROM tweet)
Also can be done with a join :
SELECT t.address,(t.x+t.y) as z
FROM student t
JOIN tweet s
ON(t.x = s.x and t.y = s.y)
EDIT: I think what you want is:
SELECT s.address,t.x+t.y as z
FROM student s
CROSS JOIN tweet t
Try this:
SELECT s.address, (t.x + t.y) as z
from (SELECT id,x,y FROM `tweet`) as t, student s
WHERE t.id = s.id;
You need to join the two tables, calculating the distance in the ON clause to select the nearby rows.
SELECT *
FROM tweetmelbourne
JOIN roadData
ON ( 3959 * acos( cos( radians(latitude) ) * cos( radians( geo_coordinates_latitude ) ) *
cos( radians( geo_coordinates_longitude ) - radians(longitude) ) + sin( radians(latitude) ) *
sin( radians( geo_coordinates_latitude ) ) ) ) < .1
This will be very slow if the tables are large. It's not possible to use indexes to implement the join, so it will have to perform that complex formula on every pair of rows. You might want to look at MySQL's Spatial Data extensions.

How to use GROUP By and HAVING in MySQL

The first two queries work fine, the 3rd one runs but brings back nothing when there should be results. How can I get the 3rd one to bring back results. It seems to me the GROUP BY and HAVING are not working together.
The 2nd query returns 32 Active Status and 7 Pending Status, so the 3rd query should return a summary of the 2nd query, but it is not.
SELECT COUNT(DISTINCT MLSNumber) AS TOTAL, `Status`
FROM Residential
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
GROUP BY `Status`;
SELECT MLSNumber, `Status`,
( 3959 * acos( cos( radians(21.380936) ) * cos( radians( Latitude ) )
* cos( radians( Longitude ) - radians(-157.757438) ) + sin( radians(21.380936) )
* sin(radians(Latitude)) ) ) AS distance
FROM Residential
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
HAVING distance < 2;
SELECT COUNT(DISTINCT MLSNumber) AS TOTAL, `Status`,
( 3959 * acos( cos( radians(21.380936) ) * cos( radians( Latitude ) )
* cos( radians( Longitude ) - radians(-157.757438) ) + sin( radians(21.380936) )
* sin(radians(Latitude)) ) ) AS distance
FROM Residential
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
GROUP BY `Status`
HAVING distance < 2;
When you are using GROUP BY, you need to be using aggregate functions for all fields not included in your GROUP BY clause.
What I think you want is to have the calculated distance be part of your where clause and get rid of the HAVING clause.
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
AND ( 3959 * acos( cos( radians(21.380936) ) * cos( radians( Latitude ) ) * cos( radians( Longitude ) - radians(-157.757438) ) + sin( radians(21.380936) ) * sin(radians(Latitude)) ) ) < 2
You cannot use HAVING in this example, as that clause is used for specifying conditions on aggregate columns.
In your example, distance is not an aggregated column, but one that is calculated for each row, so if you want to look for rows with distance less than two you should use a WHERE clause. However, I'm not sure you want to be grouping in this case, as distance looks like it applies to individual rows, not something for the group.
The only column in these examples that would belong in HAVING is your count functions.
I'm assuming the goal of #3 is to find the total count of MLS entries within 2 miles.
The problem you're having is that by grouping by Status in the final query, you're applying that first, before doing the calculation on latitude and longitude. Therefore, only one record's latitude and longitude get's calculated for each Status grouped group.
Try wrapping Query #2 inside another select that groups by status:
SELECT COUNT(DISTINCT i.MLSNumber) AS TOTAL, i.Status FROM
( SELECT MLSNumber, `Status`,
( 3959 * acos( cos( radians(21.380936) ) * cos( radians( Latitude ) ) * cos( radians( Longitude ) - radians(-157.757438) ) + sin( radians(21.380936) ) * sin(radians(Latitude)) ) ) AS distance
FROM Residential
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
HAVING distance < 2) as i Group by Status
You may need to tweak the query a bit, but that's the gist--I don't have your schema to experiment.
Also, as noted by a comment to your earlier post, you can ditch the HAVING and just use a where since you're not grouping.

GTFS SQL Find Closest Routes

I have a MySQL database setup with GTFS data and I am trying to query the database to return a list of routes (no duplicates) ordered by the distance to the closest stop on each respective route. (Note: The coordinates will change depending on where the user is)
The database is rather large (millions of rows in the stop_times table, 10,000s of rows in the stops table) so I want to make this as efficient as possible.
One thing I tried was to create a temporary table called stop_routes_link (created when the GTFS data is imported into my database) that links stops with routes such that there is an entry for each stop-route pair like so:
route_id | stop_id
-----------------------
25 | 366072709
21 | 366072709
21 | 194326291
F | 60745282
Q | 198000482
And then I ran this query which works perfectly:
SELECT routes.route_short_name AS route_short_name
FROM routes
LEFT JOIN (
SELECT ( 3959 * acos( cos( radians(39.94868155755109) ) * cos( radians( stops.stop_lat ) ) * cos( radians( stops.stop_lon ) - radians(-75.15972534860013) ) + sin( radians(39.94868155755109) ) * sin( radians( stops.stop_lat ) ) ) ) AS distance, stop_routes_link.route_id AS route_id
FROM stops
LEFT JOIN stop_routes_link ON stops.stop_id = stop_routes_link.stop_id ORDER BY distance)
AS stops ON routes.route_id = stops.route_id
ORDER BY stops.distance
And it returns:
route_short_name
----------------
23
23
12
12
9
21
38
Which is what I want (I know those are the closest routes to that location), except it returns duplicates for routes since I think it is returning an row for each stop on a route.
How do I return only unique routes? I thought that the "LEFT JOIN" would cause only one row per entry in the routes table but it didn't.
I've also tried:
SELECT DINSTINCT routes.route_short_name AS route_short_name
FROM routes
LEFT JOIN (
SELECT ( 3959 * acos( cos( radians(39.94868155755109) ) * cos( radians( stops.stop_lat ) ) * cos( radians( stops.stop_lon ) - radians(-75.15972534860013) ) + sin( radians(39.94868155755109) ) * sin( radians( stops.stop_lat ) ) ) ) AS distance, stop_routes_link.route_id AS route_id
FROM stops
LEFT JOIN stop_routes_link ON stops.stop_id = stop_routes_link.stop_id ORDER BY distance)
AS stops ON routes.route_id = stops.route_id
ORDER BY stops.distance
And:
SELECT routes.route_short_name AS route_short_name
FROM routes
LEFT JOIN (
SELECT ( 3959 * acos( cos( radians(39.94868155755109) ) * cos( radians( stops.stop_lat ) ) * cos( radians( stops.stop_lon ) - radians(-75.15972534860013) ) + sin( radians(39.94868155755109) ) * sin( radians( stops.stop_lat ) ) ) ) AS distance, stop_routes_link.route_id AS route_id
FROM stops
LEFT JOIN stop_routes_link ON stops.stop_id = stop_routes_link.stop_id ORDER BY distance)
AS stops ON routes.route_id = stops.route_id
GROUP BY route_short_name
ORDER BY stops.distance
But they both don't return the closest routes they return a random ordered list of routes (I'm not sure how it is calculated) which I'm assuming is because the grouping messes it up.
Any help would be greatly appreciated!

Two identical formulas producing differing results

I have a table of items and a table of categories. Each item is saved with it co-ordinates, latitude (lat) and longitude (lon), to allow users to search geographically.
When I do a search for items, those which have exactly the same lat and lon as the user, show in one query but not the other.
One query simply selects all items within a category (2), within a range (<1).
SELECT *, c.name as category, c.category_id as CATid,
( 3959 * acos( cos( radians(52.993252) )
* cos( radians( i.latitude ) )
* cos( radians( i.longitude ) - radians(-0.412470) )
+ sin( radians(52.993252) )
* sin( radians( i.latitude ) ) ) ) AS distance
from items i
join categories c on i.category=c.category_id
where i.category=2 group by i.item_id
HAVING distance < 1
order by distance
The other query selects all the categories and counts the number of items within each category, within the specified geographic range (<1)
SELECT *, ( SELECT ( count( 3959 * acos( cos( radians(52.993252) )
* cos( radians( latitude ) )
* cos( radians( longitude )
- radians(-0.412470) )
+ sin( radians(52.993252) )
* sin( radians( latitude ) ) ) )) AS distance
FROM items
WHERE category = category_id
HAVING distance < 2 ) AS howmanyCat,
( SELECT name FROM categories WHERE category_id = c.parent ) AS parname
FROM categories c ORDER BY category_id, parent
Strangely, if you change the search parameter for distance to 2 on the second query it finds it!
Any ideas?
Here is a fiddle to show what I mean
The second query is assigning the count() value as distance.
The first is assigning the arithmetic calculation as distance.
The first is doing what you want, and it is a clearer query.
EDIT:
I also note that the first query is aggregating by item_id. The second is not doing an explicit aggregation in the outer query, but it is choosing all categories. This is another difference between the versions.

MySQL distinct or group by in combination with having not giving a result when result is a single row

It seems that my query is not exactly doing what I want. The query gets a result aslong as the result is 2 or more rows. When I get a single row the query is not getting any result.
In the SELECT I can do DISTINCT (ct.name) but this gives the same problem as the group by.
SELECT
ct.name,
( 3959 * acos(cos(radians(52.779716)) * cos(radians( com.gps_lat )) * cos(radians( com.gps_lon ) -
radians(21.84803)) + sin( radians(52.779716) ) * sin( radians( com.gps_lat )))) as distance
FROM cuisine_types as ct
Left joining company to check if a company is attached to the cuisine_type
LEFT JOIN company AS com ON (com.cuisine_type_id = ct.id)
Here I'm grouping the results so no Cuisine Type appears twice.
this only seems to work when the result is 2 or more rows...
GROUP BY ct.name
Here I'm checking if the distance of the company is within the users preferenced search radius
HAVING distance < 20;
for example if I had 'Fastfood', 'Vegan', and 'Healthy' as Cuisine Types, I only want one of each Cuisine Types no matter how many companies in the search distance are related to that Cuisine Type. So I filter the double Cuisine Types away using the GROUP BY I hope this helps with understanding my approach in this query.
NOTE: There is only one Cuisine Type attached to a company.
Full sql query without comments down here
SELECT ct.name, ( 3959 * acos( cos( radians(52.779716) ) * cos(
radians( com.gps_lat ) ) * cos( radians( com.gps_lon ) -
radians(21.84803) ) + sin( radians(52.779716) ) * sin( radians(
com.gps_lat ) ) ) ) as distance FROM cuisine_types as ct LEFT JOIN
company AS com ON (com.cuisine_type_id = ct.id) GROUP BY ct.name
HAVING distance < 20;
Try this:
SELECT
ct.name,
min( ( 3959 * acos( cos( radians(52.779716) ) * cos( radians( com.gps_lat ) ) * cos( radians( com.gps_lon ) - radians(21.84803) ) + sin( radians(52.779716) ) * sin( radians( com.gps_lat ) ) ) ) ) as distance
FROM
cuisine_types as ct
LEFT JOIN company AS com ON (com.cuisine_type_id = ct.id)
GROUP BY
ct.name
HAVING
distance < 20;