How to use GROUP BY and HAVING in MySQL

The first two queries work fine. The third one runs but returns nothing when there should be results. How can I get the third query to return results? It seems to me that GROUP BY and HAVING are not working together.
The second query returns 32 rows with Active status and 7 with Pending status, so the third query should return a summary of the second, but it does not.
SELECT COUNT(DISTINCT MLSNumber) AS TOTAL, `Status`
FROM Residential
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
GROUP BY `Status`;
SELECT MLSNumber, `Status`,
( 3959 * acos( cos( radians(21.380936) ) * cos( radians( Latitude ) )
* cos( radians( Longitude ) - radians(-157.757438) ) + sin( radians(21.380936) )
* sin(radians(Latitude)) ) ) AS distance
FROM Residential
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
HAVING distance < 2;
SELECT COUNT(DISTINCT MLSNumber) AS TOTAL, `Status`,
( 3959 * acos( cos( radians(21.380936) ) * cos( radians( Latitude ) )
* cos( radians( Longitude ) - radians(-157.757438) ) + sin( radians(21.380936) )
* sin(radians(Latitude)) ) ) AS distance
FROM Residential
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
GROUP BY `Status`
HAVING distance < 2;

When you use GROUP BY, every field not listed in the GROUP BY clause should be wrapped in an aggregate function.
What I think you want is to make the calculated distance part of your WHERE clause and drop the HAVING clause:
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
AND ( 3959 * acos( cos( radians(21.380936) ) * cos( radians( Latitude ) ) * cos( radians( Longitude ) - radians(-157.757438) ) + sin( radians(21.380936) ) * sin(radians(Latitude)) ) ) < 2

HAVING is the wrong tool in this example, as that clause is meant for specifying conditions on aggregate columns.
In your example, distance is not an aggregated column but one that is calculated for each row, so if you want rows with a distance of less than two you should use a WHERE clause. However, I'm not sure you want to be grouping in this case, as distance looks like it applies to individual rows, not to the group.
The only expressions in these examples that would belong in HAVING are your COUNT results.

I'm assuming the goal of #3 is to find the total count of MLS entries within 2 miles.
The problem you're having is that by grouping by Status in the final query, you're applying that first, before doing the calculation on latitude and longitude. Therefore, only one record's latitude and longitude gets calculated for each Status group.
Try wrapping Query #2 inside another select that groups by status:
SELECT COUNT(DISTINCT i.MLSNumber) AS TOTAL, i.Status
FROM
( SELECT MLSNumber, `Status`,
( 3959 * acos( cos( radians(21.380936) ) * cos( radians( Latitude ) ) * cos( radians( Longitude ) - radians(-157.757438) ) + sin( radians(21.380936) ) * sin(radians(Latitude)) ) ) AS distance
FROM Residential
WHERE PropertyType='Single Family' AND Status IN ("Active", "Pending")
HAVING distance < 2) AS i
GROUP BY i.Status;
You may need to tweak the query a bit, but that's the gist--I don't have your schema to experiment.
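Also, as noted by a comment to your earlier post, you can ditch the HAVING in the inner query and use a WHERE instead, since the inner query does no grouping. You would have to repeat the distance expression there, because WHERE cannot reference a SELECT alias.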
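The filter-then-group idea from these answers can be sketched end to end. This is only an illustration under stated assumptions: it uses Python's sqlite3 as a stand-in for MySQL, a hypothetical registered helper `distance_miles` in place of the inline formula, and made-up sample rows rather than the asker's data.

```python
import math
import sqlite3


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (spherical law of cosines, as in the thread)."""
    cos_angle = (
        math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon2) - math.radians(lon1))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    # Clamp to [-1, 1]: rounding can push the value slightly outside acos's domain.
    return 3959 * math.acos(max(-1.0, min(1.0, cos_angle)))


conn = sqlite3.connect(":memory:")
# Hypothetical helper, registered so the SQL below can call it.
conn.create_function("distance_miles", 4, distance_miles)
conn.execute(
    "CREATE TABLE Residential (MLSNumber INTEGER, Status TEXT, "
    "PropertyType TEXT, Latitude REAL, Longitude REAL)"
)
conn.executemany(
    "INSERT INTO Residential VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Active", "Single Family", 21.380936, -157.757438),   # at the center
        (2, "Active", "Single Family", 21.390000, -157.760000),   # under a mile away
        (3, "Pending", "Single Family", 21.385000, -157.750000),  # under a mile away
        (4, "Active", "Single Family", 21.300000, -157.850000),   # several miles away
        (5, "Sold", "Single Family", 21.380936, -157.757438),     # wrong status
    ],
)

# The inner query computes the distance per row; the outer query filters on it
# and only then groups by status.
rows = conn.execute(
    """
    SELECT Status, COUNT(DISTINCT MLSNumber) AS total
    FROM (
        SELECT MLSNumber, Status,
               distance_miles(21.380936, -157.757438, Latitude, Longitude) AS distance
        FROM Residential
        WHERE PropertyType = 'Single Family' AND Status IN ('Active', 'Pending')
    ) AS nearby
    WHERE distance < 2
    GROUP BY Status
    ORDER BY Status
    """
).fetchall()
print(rows)
```

The point of the shape is that the per-row condition is applied before any grouping happens, so every listing's own distance is tested rather than one row per status.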

Related

using first sql statement result into another sql statement

Basically, I want to pick all the coordinates from roadData one by one, find all the points in tweetMelbourne within 20 miles of each, and insert those points into another table.
So for every (x,y) in the roadData table, find the neighbouring data points from tweetMelbourne and insert those points into a new table.
So I have to do this:
SELECT geo_coordinates_latitude, geo_coordinates_longitude
FROM tweetmelbourne
HAVING ( 3959 * acos( cos( radians(latitude) ) * cos( radians( geo_coordinates_latitude ) ) *
cos( radians( geo_coordinates_longitude ) - radians(longitude) ) + sin( radians(latitude) ) *
sin( radians( geo_coordinates_latitude ) ) ) ) < .1 ORDER BY distance LIMIT 0 , 20;
where the values of latitude and longitude have to come from another table:
select longitude,latitude from roadData;
describe tweetmelbourne;
describe roadData;
SELECT geo_coordinates_latitude, geo_coordinates_longitude
FROM tweetmelbourne;
select longitude,latitude from roadData;
The correct syntax of IN() with multiple arguments is: (val1, val2) IN (SELECT val1, val2 ...)
SELECT t.address,(t.x+t.y) as z
FROM student t
WHERE (t.x,t.y) IN(SELECT x,y FROM tweet)
Also can be done with a join :
SELECT t.address,(t.x+t.y) as z
FROM student t
JOIN tweet s
ON(t.x = s.x and t.y = s.y)
EDIT: I think what you want is:
SELECT s.address,t.x+t.y as z
FROM student s
CROSS JOIN tweet t
Try this:
SELECT s.address, (t.x + t.y) as z
from (SELECT id,x,y FROM `tweet`) as t, student s
WHERE t.id = s.id;
You need to join the two tables, calculating the distance in the ON clause to select the nearby rows.
SELECT *
FROM tweetmelbourne
JOIN roadData
ON ( 3959 * acos( cos( radians(latitude) ) * cos( radians( geo_coordinates_latitude ) ) *
cos( radians( geo_coordinates_longitude ) - radians(longitude) ) + sin( radians(latitude) ) *
sin( radians( geo_coordinates_latitude ) ) ) ) < .1
This will be very slow if the tables are large. It's not possible to use indexes to implement the join, so it will have to perform that complex formula on every pair of rows. You might want to look at MySQL's Spatial Data extensions.
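A rough sketch of that join feeding an INSERT, which is what the question ultimately asks for. The assumptions here are mine: Python's sqlite3 stands in for MySQL, `distance_miles` is a hypothetical registered helper replacing the inline formula, `nearby_tweets` is a made-up name for the target table, and the coordinates are invented.

```python
import math
import sqlite3


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (spherical law of cosines)."""
    cos_angle = (
        math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon2) - math.radians(lon1))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    # Clamp to [-1, 1] so rounding never leaves acos's domain.
    return 3959 * math.acos(max(-1.0, min(1.0, cos_angle)))


conn = sqlite3.connect(":memory:")
conn.create_function("distance_miles", 4, distance_miles)  # hypothetical helper
conn.executescript(
    """
    CREATE TABLE roadData (latitude REAL, longitude REAL);
    CREATE TABLE tweetmelbourne (
        geo_coordinates_latitude REAL, geo_coordinates_longitude REAL);
    -- Hypothetical target table for the matched pairs.
    CREATE TABLE nearby_tweets (
        road_latitude REAL, road_longitude REAL,
        tweet_latitude REAL, tweet_longitude REAL);
    """
)
conn.execute("INSERT INTO roadData VALUES (-37.8136, 144.9631)")
conn.executemany(
    "INSERT INTO tweetmelbourne VALUES (?, ?)",
    [
        (-37.8140, 144.9633),  # well under 0.1 miles from the road point
        (-37.9000, 145.1000),  # several miles away
    ],
)

# Every (road point, tweet) pair is tested, which is why this gets slow on big tables.
conn.execute(
    """
    INSERT INTO nearby_tweets
    SELECT r.latitude, r.longitude,
           t.geo_coordinates_latitude, t.geo_coordinates_longitude
    FROM roadData AS r
    JOIN tweetmelbourne AS t
      ON distance_miles(r.latitude, r.longitude,
                        t.geo_coordinates_latitude,
                        t.geo_coordinates_longitude) < 0.1
    """
)
matches = conn.execute(
    "SELECT tweet_latitude, tweet_longitude FROM nearby_tweets"
).fetchall()
print(matches)
```

In MySQL itself the same INSERT ... SELECT shape should apply with the inline formula in the ON clause; the helper function is only there to keep the sketch short.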

Two identical formulas producing differing results

I have a table of items and a table of categories. Each item is saved with its coordinates, latitude (lat) and longitude (lon), to allow users to search geographically.
When I do a search for items, those which have exactly the same lat and lon as the user, show in one query but not the other.
One query simply selects all items within a category (2), within a range (<1).
SELECT *, c.name as category, c.category_id as CATid,
( 3959 * acos( cos( radians(52.993252) )
* cos( radians( i.latitude ) )
* cos( radians( i.longitude ) - radians(-0.412470) )
+ sin( radians(52.993252) )
* sin( radians( i.latitude ) ) ) ) AS distance
from items i
join categories c on i.category=c.category_id
where i.category=2 group by i.item_id
HAVING distance < 1
order by distance
The other query selects all the categories and counts the number of items within each category, within the specified geographic range (<1)
SELECT *, ( SELECT ( count( 3959 * acos( cos( radians(52.993252) )
* cos( radians( latitude ) )
* cos( radians( longitude )
- radians(-0.412470) )
+ sin( radians(52.993252) )
* sin( radians( latitude ) ) ) )) AS distance
FROM items
WHERE category = category_id
HAVING distance < 2 ) AS howmanyCat,
( SELECT name FROM categories WHERE category_id = c.parent ) AS parname
FROM categories c ORDER BY category_id, parent
Strangely, if you change the search parameter for distance to 2 on the second query it finds it!
Any ideas?
Here is a fiddle to show what I mean
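The second query is assigning the count() value as distance, so its HAVING compares the number of items rather than a distance: a category with a single item has a count of 1, which fails distance < 1 but passes distance < 2.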
The first is assigning the arithmetic calculation as distance.
The first is doing what you want, and it is a clearer query.
EDIT:
I also note that the first query is aggregating by item_id. The second is not doing an explicit aggregation in the outer query, but it is choosing all categories. This is another difference between the versions.
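One way the per-category count could be written so that the distance test applies to each item, with the count taken afterwards. This is a sketch under assumptions of mine, not the asker's schema: Python's sqlite3 stands in for MySQL, `distance_miles` is a hypothetical registered helper, and the category names and item rows are made up.

```python
import math
import sqlite3


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (spherical law of cosines)."""
    cos_angle = (
        math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon2) - math.radians(lon1))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    # Clamp to [-1, 1]; an item at the user's exact position can round past 1.
    return 3959 * math.acos(max(-1.0, min(1.0, cos_angle)))


conn = sqlite3.connect(":memory:")
conn.create_function("distance_miles", 4, distance_miles)  # hypothetical helper
conn.executescript(
    """
    CREATE TABLE categories (category_id INTEGER, name TEXT);
    CREATE TABLE items (
        item_id INTEGER, category INTEGER, latitude REAL, longitude REAL);
    """
)
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)", [(1, "Books"), (2, "Tools")]
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (10, 2, 52.993252, -0.412470),  # exactly at the user's position
        (11, 2, 53.500000, -0.412470),  # roughly 35 miles north
        (12, 1, 53.500000, -0.412470),  # roughly 35 miles north
    ],
)

# The distance condition sits in the subquery's WHERE, so COUNT(*) only sees
# items that are actually within range.
counts = conn.execute(
    """
    SELECT c.category_id,
           (SELECT COUNT(*)
            FROM items AS i
            WHERE i.category = c.category_id
              AND distance_miles(52.993252, -0.412470,
                                 i.latitude, i.longitude) < 1
           ) AS howmanyCat
    FROM categories AS c
    ORDER BY c.category_id
    """
).fetchall()
print(counts)
```

Here the item sitting exactly on the user's coordinates is counted with a 1-mile radius, because the comparison is against its distance and not against the count.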

MySQL distinct or group by in combination with having not giving a result when result is a single row

It seems that my query is not doing exactly what I want. It returns a result as long as the result has 2 or more rows. When the result should be a single row, the query returns nothing.
In the SELECT I can do DISTINCT (ct.name) but this gives the same problem as the group by.
SELECT
ct.name,
( 3959 * acos(cos(radians(52.779716)) * cos(radians( com.gps_lat )) * cos(radians( com.gps_lon ) -
radians(21.84803)) + sin( radians(52.779716) ) * sin( radians( com.gps_lat )))) as distance
FROM cuisine_types as ct
-- Left joining company to check if a company is attached to the cuisine_type
LEFT JOIN company AS com ON (com.cuisine_type_id = ct.id)
-- Here I'm grouping the results so no Cuisine Type appears twice.
-- This only seems to work when the result is 2 or more rows...
GROUP BY ct.name
-- Here I'm checking if the distance of the company is within the user's preferred search radius
HAVING distance < 20;
For example, if I had 'Fastfood', 'Vegan', and 'Healthy' as Cuisine Types, I only want one of each Cuisine Type, no matter how many companies within the search distance are related to it. So I filter out the duplicate Cuisine Types using GROUP BY. I hope this helps with understanding my approach in this query.
NOTE: There is only one Cuisine Type attached to a company.
Full sql query without comments down here
SELECT ct.name, ( 3959 * acos( cos( radians(52.779716) ) * cos(
radians( com.gps_lat ) ) * cos( radians( com.gps_lon ) -
radians(21.84803) ) + sin( radians(52.779716) ) * sin( radians(
com.gps_lat ) ) ) ) as distance FROM cuisine_types as ct LEFT JOIN
company AS com ON (com.cuisine_type_id = ct.id) GROUP BY ct.name
HAVING distance < 20;
Try this:
SELECT
ct.name,
min( ( 3959 * acos( cos( radians(52.779716) ) * cos( radians( com.gps_lat ) ) * cos( radians( com.gps_lon ) - radians(21.84803) ) + sin( radians(52.779716) ) * sin( radians( com.gps_lat ) ) ) ) ) as distance
FROM
cuisine_types as ct
LEFT JOIN company AS com ON (com.cuisine_type_id = ct.id)
GROUP BY
ct.name
HAVING
distance < 20;
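As far as I can tell, the reason MIN helps is that with GROUP BY ct.name a non-aggregated distance is taken from one unspecified company per group, so a cuisine type can disappear when that company happens to be far away. A small sketch of the aggregated version follows, with assumptions labeled: Python's sqlite3 stands in for MySQL, `distance_miles` is a hypothetical registered helper, the company rows are invented, and the HAVING repeats the aggregate instead of using the alias to stay portable.

```python
import math
import sqlite3


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles; NULL in, NULL out."""
    if None in (lat1, lon1, lat2, lon2):
        return None  # cuisine types without a company have no coordinates
    cos_angle = (
        math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
        * math.cos(math.radians(lon2) - math.radians(lon1))
        + math.sin(math.radians(lat1)) * math.sin(math.radians(lat2))
    )
    # Clamp to [-1, 1] so rounding never leaves acos's domain.
    return 3959 * math.acos(max(-1.0, min(1.0, cos_angle)))


conn = sqlite3.connect(":memory:")
conn.create_function("distance_miles", 4, distance_miles)  # hypothetical helper
conn.executescript(
    """
    CREATE TABLE cuisine_types (id INTEGER, name TEXT);
    CREATE TABLE company (
        id INTEGER, cuisine_type_id INTEGER, gps_lat REAL, gps_lon REAL);
    """
)
conn.executemany(
    "INSERT INTO cuisine_types VALUES (?, ?)",
    [(1, "Fastfood"), (2, "Vegan"), (3, "Healthy")],
)
conn.executemany(
    "INSERT INTO company VALUES (?, ?, ?, ?)",
    [
        (1, 1, 52.80, 21.85),  # Fastfood, about 1.4 miles away
        (2, 2, 54.50, 21.85),  # Vegan, over 100 miles away
        (3, 2, 52.78, 21.85),  # Vegan, well under a mile away
        # Healthy has no companies at all
    ],
)

# MIN makes the group's distance well defined: the nearest company decides.
rows = conn.execute(
    """
    SELECT ct.name,
           MIN(distance_miles(52.779716, 21.84803,
                              com.gps_lat, com.gps_lon)) AS distance
    FROM cuisine_types AS ct
    LEFT JOIN company AS com ON com.cuisine_type_id = ct.id
    GROUP BY ct.name
    HAVING MIN(distance_miles(52.779716, 21.84803,
                              com.gps_lat, com.gps_lon)) < 20
    ORDER BY ct.name
    """
).fetchall()
names = [name for name, _ in rows]
print(names)
```

Vegan survives because its nearest company is in range even though another one is far away, while Healthy drops out because its minimum distance is NULL.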

SQL: Two different queries to merge

I have these two different queries.
This query pulls the records from "posts" table as per their replies counter. Only posts with replies are returned with this query:
SELECT posts.title, posts.num, posts.status, COUNT( posts_replies.post_num) AS count
FROM posts_replies
INNER JOIN posts ON ( posts_replies.post_num = posts.num )
WHERE posts.status = 1
AND posts.category='uncategorized'
GROUP BY posts.num
And this is a new query that I want to merge with the one above to pull and sort records by GPS position.
SELECT num, title, ( 3959 * acos( cos( radians( 37 ) ) * cos( radians( lat ) ) * cos( radians( lon ) - radians( -122 ) ) + sin( radians( 37 ) ) * sin( radians( lat ) ) ) ) AS distance
FROM posts
HAVING distance <75
ORDER BY distance
This query uses the columns lat and lon to return records that are within a 75-mile radius of the user.
I am not a sql expert and don't know how to merge both of the queries to gather results having the following criteria:
Only return posts with replies
Sort by their distance
Sort by their number of replies
Any help would be highly appreciated.
Thanks!
The HAVING clause in the second query does not look correct. In most dialects of SQL it would not be allowed without a GROUP BY. I forget whether MySQL implicitly treats the whole query as an aggregation (returning one row) or whether the HAVING gets converted to a WHERE. In either case, you should be explicit and use WHERE when there are no aggregations.
You can just combine them by putting in the where clause. I would do it with a subquery, to make the variable definitions clearer:
SELECT p.title, p.num, p.status, p.distance,
COUNT( pr.post_num) AS count
FROM posts_replies pr INNER JOIN
(select p.*,
( 3959 * acos( cos( radians( 37 ) ) * cos( radians( lat ) ) * cos( radians( lon ) - radians( -122 ) ) + sin( radians( 37 ) ) * sin( radians( lat ) ) ) ) AS distance
from posts p
) p
ON pr.post_num = p.num
WHERE p.status = 1 AND
p.category='uncategorized' and
distance < 75
GROUP BY p.num
order by distance

How to improve this huge SQL query?

I have a big SQL query (for MySQL) that is slow. It's a union of two select statements. I have tried different things, but any slight variance gives me a different result set from the original. Any help with improving it will be greatly appreciated. Thanks. Here is the SQL:
(SELECT
CONCAT(city_name,', ',region) value,
latitude,
longitude,
id,
population,
( 3959 * acos( cos( radians($latitude) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians($longitude) ) + sin( radians($latitude) ) * sin( radians( latitude ) ) ) )
AS distance,
CASE region
WHEN '$region' THEN 1
ELSE 0
END AS region_match
FROM `cities`
$where and foo_count > 5
ORDER BY region_match desc, foo_count desc
limit 0, 11)
UNION
(SELECT
CONCAT(city_name,', ',region) value,
latitude,
longitude,
id,
population,
( 3959 * acos( cos( radians($latitude) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians($longitude) ) + sin( radians($latitude) ) * sin( radians( latitude ) ) ) )
AS distance,
CASE region
WHEN '$region' THEN 1
ELSE 0
END AS region_match
FROM `cities`
$where
ORDER BY region_match desc, population desc, distance asc
limit 0, 11)
limit 0, 11
The SQL does take some interpolated values (prefixed with the dollar sign($)).
The following might give the same result. The idea is that you need two fields derived from foo_count: one that separates the items of the first part of your UNION from those of the second, and one that allows ordering within the first part without disturbing the order in the second. (In MySQL, the scalar minimum and maximum functions are LEAST and GREATEST.) Of course, you later need a second query to throw the additional fields out again:
SELECT
CONCAT(city_name,', ',region) value,
latitude,
longitude,
id,
population,
( 3959 * acos( cos( radians($latitude) ) * cos( radians( latitude ) ) * cos( radians( longitude ) - radians($longitude) ) + sin( radians($latitude) ) * sin( radians( latitude ) ) ) )
AS distance,
LEAST(6, GREATEST(foo_count, 5)) AS group_discriminator,
GREATEST(6, foo_count) AS rank_for_use_in_first_group,
CASE region
WHEN '$region' THEN 1
ELSE 0
END AS region_match
FROM `cities`
$where
ORDER BY group_discriminator desc, region_match desc, rank_for_use_in_first_group desc, population desc, distance asc
limit 0, 11
EDIT: Improvements