two queries give different results in the same database - mysql

Please help. I'm using MySQL 5.1.30 Community Edition.
I have four tables: nts, operator, country, cooperationtype
table `nts` has one column(`operatorId`) which is a foreign key to column `id` in table `operator` and one column(`voice`) which is a foreign key to column `id` in table cooperationtype
table operator has one column(`country_id`) which is a foreign key to column (`id`) in table country
I want to get the counts of operators and countries where all of the value of voice not equals to 'N/A' and grouped them by cooperationtype.id with this query:
SELECT cooperationtype.id AS cooptype,
COUNT(DISTINCT country_id) AS country, COUNT(DISTINCT operatorId) AS operator
FROM nts INNER JOIN operator ON operator.id = nts.operatorId INNER JOIN country ON operator.country_id = country.id
INNER JOIN cooperationtype ON cooperationtype.id = nts.voice
WHERE cooperationtype.code <> 'N/A' GROUP BY cooperationtype.id
I got this result:
cooptype country operator
1 128 348
2 11 11
3 15 17
The sum of this query is 154 countries and 376 operators.
But then when I want to get all of the counts of operators and countries where all of the value of voice not equals to 'N/A', regardless the of cooperationtype.id with this query:
SELECT COUNT(DISTINCT country_id) AS country, COUNT(DISTINCT operatorId) AS operator
FROM nts INNER JOIN operator ON operator.id = nts.operatorId INNER JOIN country ON operator.country_id = country.id
INNER JOIN cooperationtype ON cooperationtype.id = nts.voice
WHERE cooperationtype.code <> 'N/A'
I got this result:
country operator
133 372
My questions are:
Why is the sum of the result from the first query doesn't equal to the result from the second query?
Which one is the right result?
Data example:
voice country operator
1 US 1
1 US 2
1 UK 3
1 UK 4
2 US 1
2 US 2
For the first query, the data should generate:
cooptype country operator
1 2 4
2 2 2
For the second query, the data should generate:
country operator
2 4

Why is the sum of the result from the first query doesn't equal to the result from the second query?
Because you use COUNT(DISTINCT).
It counts distinct records group-wise.
Your first query counts two records with the same country but different cooptype twice (since it groups by cooptype), while the second one counts them once.
Which one is the right result?
Both are right.
For the given data:
cooptype country
1 US
1 US
1 UK
1 UK
2 US
2 US
the first query will return:
1 2
2 1
and the second will return
2
, since you have:
2 distinct countries in cooptype = 1 (US and UK)
1 distinct country in cooptype = 2 (US)
2 distinct countries overall (US and UK)
Which is "right" in your definition of "right", depends, well, on this definition.
If you just want the second query to match the results of the first one, use
SELECT COUNT(DISTINCT cootype, country_id) AS country,
COUNT(DISTINCT cooptype, operatorId) AS operator
FROM nts
INNER JOIN
operator
ON operator.id = nts.operatorId
INNER JOIN
country
ON operator.country_id = country.id
INNER JOIN
cooperationtype
ON cooperationtype.id = nts.voice
WHERE cooperationtype.code <> 'N/A'
but, again, this may be as wrong as your first query is.
For these data:
cooptype country operator
1 US 1
1 US 1
1 UK 2
1 UK 2
2 US 1
2 US 1
, what would be a correct resultset?

Related

SQL GROUP BY "HAVING" the required rows

Is there a succinct way of using HAVING to check if the required rows are within the GROUP BY?
With the example date:
turtle_id name
1 Mike
2 Ralph
3 Leo
4 Don
and
turtle_id crush_for
1 Pizza
1 April Oneil
2 April Oneil
3 Pizza
3 April Oneil
4 Pizza
4 Pizza
4 Science
4 April Oneil
And the SQL:
SELECT turtle.name
FROM turtle
JOIN turtle_crush ON turtle_crush.turtle_id = turtle.turtle_id
WHERE turtle_crush.crush_for IN ('Pizza', 'April Oneil')
GROUP BY turtle.turtle_id
HAVING (a crush on both "Pizza" and "April Oneil")
I realize I could do something like HAVING COUNT(*) > 1, but that would give a false positive for Don (id 4) because he likes 'Pizza' twice.
Edit:
Just adding a WHERE clause will return Ralph where he doesn't have a crush_for 'Pizza'
This should work:
SELECT t.turtle_name
FROM turtle t
INNER JOIN (SELECT turtle_id
FROM turtle_crush
WHERE crush_for IN ('Pizza','April Oneil')
GROUP BY turtle_id
HAVING COUNT(DISTINCT crush_for) = 2) tc
on t.turtle_id = tc.turtle_id;
In this code, the subquery first filter the results where crush_for is either 'Pizza' or 'April Oneil'. Then it groups by turtle_id, with another condition of choosing those turtle_ids that have 2 different crush_for values (hence ensuring that you get only the turtle_ids that have both crushes). Then it's joined with the turtle table to get the name.
Put the list of crushes in the WHERE clause, group by turtle ID, count the distinct values of crush types then keep only the groups that have at least 2 values (or how many crushes you put in the query):
SELECT turtle.name
FROM turtle
INNER JOIN turtle_crush ON turtle_crush.turtle_id = turtle.turtle_id
WHERE crush_for IN ("Pizza", "April Oneil")
GROUP BY turtle.turtle_id
HAVING COUNT(DISTINCT crush_for) = 2

How to select one column with all distinct values based on some clause

I essentially like to have one query which I'll execute one time and like to have the result (no multiple query execution) and definitely, the query should use simple MySQL structure (no complex/advanced structure to be used like BEGIN, loop, cursor).
Say I've two tables.
1st Table = Country (id(PK), name);
2nd Table = Businessman (id(PK), name, city, country_id(FK))
Like to SELECT all countries, whose businessmen are from distinct cities. No two businessmen exist in one country, who are from the same city. If so, that country will not be selected by the SELECT clause.
Country
id name
1 India
2 China
3 Bahrain
4 Finland
5 Germany
6 France
Businessman
id name city country_id
1 BM1 Kolkata 1
2 BM2 Delhi 1
3 BM3 Mumbai 1
4 BM4 Beijing 2
5 BM5 Paris 6
6 BM6 Beijing 2
7 BM7 Forssa 4
8 BM8 Anqing 2
9 BM9 Berlin 5
10 BM10 Riffa 3
11 BM11 Nice 6
12 BM12 Helsinki 4
13 BM13 Bremen 5
14 BM14 Wiesbaden 5
15 BM15 Angers 6
16 BM16 Sitra  3
17 BM17 Adliya 3
18 BM18 Caen 6
19 BM19 Jinjiang 2
20 BM20 Tubli 3
21 BM21 Duisburg 5
22 BM22 Helsinki 4
23 BM23 Kaarina 4
24 BM24 Bonn 5
25 BM25 Kemi 4
In this respect, China and Finland shouldn't be listed.
I've attempted using count and group by, but no luck.
Can you please help me to build up this query.
Here it is, all you need is to join Businessman table and count cities and distinct cities and if they equal that means all businessmen are from different cities:
SELECT
c.`id`,
c.`name`,
COUNT(b.`id`) AS BusinessmanCount,
COUNT(b.`city`) AS CityCount,
COUNT(DISTINCT b.`city`) AS DistinctCityCount
FROM `countries` c
INNER JOIN Businessman b ON c.`id` = b.`country_id`
GROUP BY c.`id`
HAVING CityCount = DistinctCityCount
For minified version what you exactly need:
SELECT
c.`id`,
c.`name`
FROM `countries` c
INNER JOIN Businessman b ON c.`id` = b.`country_id`
GROUP BY c.`id`
HAVING COUNT(b.`city`) = COUNT(DISTINCT b.`city`)
Well, I think we should have waited for you to show your own query, because one learns best from mistakes and their explanations. However, now that you've got answers already:
Yes, you need group by and count. I'd group by cities to see if I got duplicates. Then select countries and exclude those that have duplicate cities.
select *
from country
where id not in
(
select country_id
from businessmen
group by city, country_id
having count(*) > 1
);
You need either nested aggregations:
select *
from Country
where id in
(
select country_id
from
(
select city, country_id,
count(*) as cnt -- get the number of rows per country/city
from Businessman
group by city, country_id
) as dt
group by country_id
having max(cnt) = 1 -- return only those countries where all counts are unique
)
Or compare two counts:
select *
from Country
where id in
(
select country_id
from Businessman
group by country_id
having count(*) = count(distinct city) -- number of cities is equal to umber of rows
)

how can combine two columns in the same table and count its unique occurrences

I have been struggling and would really appreciate some assistance:
I have two tables cars and rides
cars
car_id car_manuf car_model
1 Honda CRV
2 Honda Accord
3 Toyota Corolla
4 Toyota Camry
5 Ford Fusion
rides
ride_id car_id ride_destination
1 3 Boston
2 5 New York
3 5 Washington DC
4 1 California
5 2 Dallas
6 5 Canada
I would like to count the number of rides by each car type which will have the combination of car_manuf and car_model and should be sorted from most to fewest number of rides.
Output should be:
CarType-NumberofRides
Honda_CRV-1
Honda_Accord-1
Toyota_Corolla-1
Toyota_Camry-0
Ford_Fusion-3
Sorted output with most-few rides
CarType-NumberofRides
Toyota_Camry-0
Honda_Accord-1
Toyota_Corolla-1
Honda_CRV-1
Ford_Fusion-3
mycode:
select
c.car_manuf + '_' + c.car_model AS 'Car Type',
(select count(*) from rides r where r.car_id = c.car_id) AS 'Number of Rides'
from cars c;
I am kinda stuck here and not sure which direction I should go in regards to getting the correct output.
You have to use GROUP BY and an ORDER BY when COUNTing the occurences. I use CONCAT to concatenate the strings instead of a + sign. Makes clearer what is going on, and is not mistaken as an arithmetic operation.
SELECT CONCAT(c.car_manuf, '_', c.car_model) AS CarType, COUNT(r.car_id) AS NumberOfRides
FROM rides r
LEFT JOIN cars c ON (r.car_id = c.car_id)
GROUP BY CarType
ORDER BY NumberOfRides ASC
However this omits the 0 occurences.
If you want to see the 0s as well swap the table order to:
SELECT CONCAT(c.car_manuf, '_', c.car_model) AS CarType, COUNT(r.car_id) AS NumberOfRides
FROM cars c
LEFT JOIN rides r ON (r.car_id = c.car_id)
GROUP BY CarType
ORDER BY NumberOfRides ASC

SQL: Fetch rows having a column (group by column) being the MAX value

I would like to know how to retrieve rows matching the maximum value for a column.
SCHEMA
assignments:
id student_id subject_id
1 10 1
2 10 2
3 20 1
4 30 3
5 30 3
6 40 2
students:
id name
10 A
20 B
30 C
subjects:
id name
1 Math
2 Science
3 English
Queries:
Provide the SQL for:
1. Display the names of the students who have taken most number of assignments
2. Display the names of the subjects which have been taken the most number of times
Results:
1.
A
C
2.
Math
English
Thanks !
The previous answer is not quite right - you won't get the instances where there are two with the same count. Try this - the second will be easy to replicate once understand the concept.
SELECT a.student_id, s.name, COUNT(a.subject_id) as taken_subjects
FROM assignments a
INNER JOIN students s ON a.student_id = s.id
GROUP BY a.student_id, s.name
HAVING COUNT(a.subject_id) = (SELECT COUNT(*) FROM assignments GROUP BY student_id LIMIT 1)
Alternate query:
SELECT a.subject_id, s.subject_name, COUNT(a.subject_id) FROM assignment a, subjects s
WHERE a.subject_id = s.subject_id
GROUP BY a.student_id, s.subject_name
HAVING COUNT(a.subject_id) = (SELECT MAX(COUNT(1)) FROM assignment GROUP BY subject_id)

Grouping and Accumulating Records at the same time

I am writing a query against an advanced many-to-many table in my database. I call it an advanced table because it is a many-to-many table with and extra field. The table maps data between the fields table and the students table. The fields table holds potential fields that a student can used, kind of like a contact system (i.e. name, school, address, etc). The studentvalues table that I need to query against holds the field id, student id, and the field answer (i.e. studentid=1; fieldid=2; response=Dave Long).
So my table looks like this:
What I need to do is take a few passed in values and create a grouped accumulated report. I would like to do as much in the SQL as possible.
So that data that I have will be the group by field (a field id), the cumulative field (a field id) and I need to group the students by the group by field and then in each group count the amount of students in the cumulative fields.
So for example I have this data
ID STUDENTID FIELDID RESPONSE
1 1 2 *(city)* Wallingford
2 1 3 *(state)* CT
3 2 2 *(city)* Wallingford
4 2 3 *(state)* CT
5 3 2 *(city)* Berlin
6 3 3 *(state)* CT
7 4 2 *(city)* Costa Mesa
8 4 3 *(state)* CA
I am hoping to write one query that I can generate a report that looks like this:
CA - 1 Student
Costa Mesa 1
CT - 3 Students
Berlin 1
Wallingford 2
Is this possible to do with a single SQL statement or do I have to get all the groups and then loop over them?
EDIT Here is the code that I have gotten so far, but it doesn't give the proper stateSubtotal (the stateSubtotal is the same as the citySubtotal)
SELECT state, count(state) AS stateSubtotal, city, count(city) AS citySubtotal
FROM(
SELECT s1.response AS city, s2.response AS state
FROM studentvalues s1
INNER JOIN studentvalues s2
ON s1.studentid = s2.studentid
WHERE s1.fieldid = 5
AND s2.fieldid = 6
) t
GROUP BY city, state
So to make a table that looks like that, I would assume something like
State StateSubtotal City CitySubtotal
CA 1 Costa Mesa 1
CT 3 Berlin 1
CT 3 Wallingford 2
Would be what you want. We can't just group on Response, since if you had a student answer LA for city, and another student that responds LA for state (Louisiana) they would add. Also, if the same city is in different states, we need to first lay out the association between a city and a state by joining on the student id.
edit - indeed, flawed first approach. The different aggregates need different groupings, so really, one select per aggregation is required. This gives the right result but it's ugly and I bet it could be improved on. If you were on SQL Server I would think a CTE would help but that's not an option.
select t2.stateAbb, stateSubtotal, t2.city, t2.citySubtotal from
(
select city, count(city) as citySubTotal, stateAbb from (
select s1.Response as city, s2.Response as StateAbb
from aaa s1 inner join aaa s2 on s1.studentId = s2.studentId
where s1.fieldId = 2 and s2.fieldId=3
) t1
group by city, stateabb
) t2 inner join (
select stateAbb, count(stateabb) as stateSubTotal from (
select s1.Response as city, s2.Response as StateAbb
from aaa s1 inner join aaa s2 on s1.studentId = s2.studentId
where s1.fieldId = 2 and s2.fieldId=3
) t3
group by stateabb
) t4 on t2.stateabb = t4.stateabb