How to count unique values on one column without double values from another column - mysql

My very first question as a newbie in SQL.
I want to count the unique values in one column, Transport, grouped by ID, without counting duplicate Transport values that arise from the Product column. It could be very simple, but at this point I need another point of view.
This is the data:
ID Product Transport
1 A Plane
1 B Plane
2 A Train
2 B Train
2 C Ship
3 A Plane
3 B Train
3 C Ship
3 D Ship
I want one row per ID, with a count for each distinct Transport value. If I do it with a normal GROUP BY, the counts are inflated by the multiple Product rows.
The result needs a separate column for each Transport value, with no value counted twice because of the Product column. It should look something like:
ID Plane Train Ship
1 1 0 0
2 0 1 1
3 1 1 1
I think it's simple but maybe I'm missing something. Any help would be appreciated!
Thank you.

You can get a pivot by combining CASE with MAX(), as in:
select
  id,
  max(case when transport = 'Plane' then 1 else 0 end) as plane,
  max(case when transport = 'Train' then 1 else 0 end) as train,
  max(case when transport = 'Ship' then 1 else 0 end) as ship
from t
group by id
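To check the pivot against the sample data from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the query itself is plain SQL); the table name t comes from the answer, while the column types are my assumption.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, product TEXT, transport TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "A", "Plane"), (1, "B", "Plane"),
        (2, "A", "Train"), (2, "B", "Train"), (2, "C", "Ship"),
        (3, "A", "Plane"), (3, "B", "Train"), (3, "C", "Ship"), (3, "D", "Ship"),
    ],
)

# MAX() over a 0/1 flag answers "does this id use this transport at all?",
# so repeated products cannot inflate the value the way SUM() or COUNT() would.
pivot_rows = conn.execute(
    """
    SELECT id,
           MAX(CASE WHEN transport = 'Plane' THEN 1 ELSE 0 END) AS plane,
           MAX(CASE WHEN transport = 'Train' THEN 1 ELSE 0 END) AS train,
           MAX(CASE WHEN transport = 'Ship'  THEN 1 ELSE 0 END) AS ship
    FROM t
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in pivot_rows:
    print(row)
# (1, 1, 0, 0)
# (2, 0, 1, 1)
# (3, 1, 1, 1)
```

The output matches the table requested in the question: id 1 has two Plane rows but still shows 1.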

Just adding something to @The Impater's answer:
SELECT
  id,
  MAX(transport = 'Plane') AS plane,
  MAX(transport = 'Train') AS train,
  MAX(transport = 'Ship') AS ship
FROM `test_table`
GROUP BY id
I was taught that there is no need to spell out 1 and 0 with CASE: in MySQL a comparison already evaluates to 1 or 0, so it can be aggregated directly.
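One difference between the two forms is worth knowing: a comparison against NULL evaluates to NULL rather than 0, and MAX() skips NULLs. The sketch below illustrates this, again assuming sqlite3 as a stand-in for MySQL (both treat NULL comparisons this way); the rows, including the NULL transport for id 3, are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute("CREATE TABLE test_table (id INTEGER, transport TEXT)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?)",
    # Hypothetical rows; id 3 has no transport recorded (NULL).
    [(1, "Plane"), (1, "Plane"), (2, "Train"), (3, None)],
)

rows = conn.execute(
    """
    SELECT id,
           MAX(transport = 'Plane') AS bool_form,
           MAX(CASE WHEN transport = 'Plane' THEN 1 ELSE 0 END) AS case_form
    FROM test_table
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
# (1, 1, 1)
# (2, 0, 0)
# (3, None, 0)   <- the boolean form yields NULL when every value is NULL
```

If the column can contain NULLs and you need a strict 0, the CASE ... ELSE 0 form (or wrapping the result in COALESCE) is the safer choice.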

Related

SQL query statement Self Join?

New to SQL.
I have the following set of data:
A X Y Z
1 Wind 1 1
2 Wind 2 1
3 Hail 1 1
4 Flood 1 1
4 Rain 1 1
4 Fire 1 1
I would like to select all distinct values of A for which the rows with that value include both Flood and Rain in X.
So in this example, the query would return only the number 4, since the rows with A = 4 include both Flood and Rain.
In other words, I need every value 'a' of A for which rows exist containing all of the provided X values (in the example, Flood and Rain).
Please let me know if you need further clarification.
You can use aggregation, and filter with a having clause:
select a
from mytable t
where x in ('Flood', 'Rain') -- either one or the other
group by a
having count(*) = 2 -- both match
If the (a, x) tuples are not unique, then you want having count(distinct x) = 2 instead.
You should use count(distinct X) with GROUP BY A and HAVING.
count(distinct ...) handles the situation where the same value of X appears twice for one A.
select A
from my_table
WHERE x in ('Flood', 'Rain')
group by A
having count(distinct X) = 2
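To see why the DISTINCT matters, here is a small sketch, assuming sqlite3 as a stand-in for the real database. The first six rows are the data from the question; the two rows with A = 5 are hypothetical, added only to create a duplicate (a, x) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database (assumption)
conn.execute("CREATE TABLE my_table (a INTEGER, x TEXT, y INTEGER, z INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, "Wind", 1, 1), (2, "Wind", 2, 1), (3, "Hail", 1, 1),
        (4, "Flood", 1, 1), (4, "Rain", 1, 1), (4, "Fire", 1, 1),
        # Hypothetical extra rows: Flood twice, but never Rain.
        (5, "Flood", 1, 1), (5, "Flood", 2, 1),
    ],
)

def matching_ids(having_clause):
    """Return the values of a that satisfy the given HAVING condition."""
    sql = (
        "SELECT a FROM my_table WHERE x IN ('Flood', 'Rain') "
        "GROUP BY a HAVING " + having_clause + " ORDER BY a"
    )
    return [row[0] for row in conn.execute(sql)]

with_count_star = matching_ids("COUNT(*) = 2")
with_count_distinct = matching_ids("COUNT(DISTINCT x) = 2")

print(with_count_star)      # [4, 5]  (5 is a false positive)
print(with_count_distinct)  # [4]
```

With unique (a, x) pairs both forms agree; COUNT(DISTINCT x) is the one that stays correct when duplicates can occur.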

MySQL count different value in same table

I am working on a database right now, and I am trying to select some specific data.
The table looks like this:
name title type
The type column holds two different values, "book" and "paper".
This is the result I would like to get:
name book paper
person A 0 1
person B 1 2
person C 0 5
What is the best way to write this query in MySQL?
You may use conditional aggregation:
SELECT
name,
SUM(CASE WHEN type = 'book' THEN 1 ELSE 0 END) AS book,
SUM(CASE WHEN type = 'paper' THEN 1 ELSE 0 END) AS paper
FROM yourTable
GROUP BY
name;
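As a quick check of the conditional aggregation above, here is a sketch that assumes sqlite3 as a stand-in for MySQL. The question does not show the underlying rows, so the titles below are invented to reproduce the expected counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute("CREATE TABLE yourTable (name TEXT, title TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    # Hypothetical rows chosen to match the result table in the question.
    [("person A", "a-paper-1", "paper")]
    + [("person B", "b-book-1", "book")]
    + [("person B", f"b-paper-{i}", "paper") for i in range(1, 3)]
    + [("person C", f"c-paper-{i}", "paper") for i in range(1, 6)],
)

# SUM() over a 0/1 flag counts the rows of each type per name.
counts = conn.execute(
    """
    SELECT name,
           SUM(CASE WHEN type = 'book'  THEN 1 ELSE 0 END) AS book,
           SUM(CASE WHEN type = 'paper' THEN 1 ELSE 0 END) AS paper
    FROM yourTable
    GROUP BY name
    ORDER BY name
    """
).fetchall()

for row in counts:
    print(row)
# ('person A', 0, 1)
# ('person B', 1, 2)
# ('person C', 0, 5)
```

Note that SUM() is used here because every row should be counted; MAX() would only report whether a type is present at all.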

Searching large (6 million) rows MySQL with stored queries?

I have a database with roughly 6 million entries - and will grow - where I'm running queries to return for a HighCharts charting functionality. I need to read longitudinally over years, so I'm running queries like this:
foreach($states as $state_id) { //php code
SELECT //mysql pseudocode
sum(case when mydatabase.Year = '2003' then 1 else 0 end) Year_2003,
sum(case when mydatabase.Year = '2004' then 1 else 0 end) Year_2004,
sum(case when mydatabase.Year = '2005' then 1 else 0 end) Year_2005,
sum(case when mydatabase.Year = '2006' then 1 else 0 end) Year_2006,
sum(case when mydatabase.Year = '2007' then 1 else 0 end) Year_2007,
sum(case when mydatabase.Year = '$more_years' then 1 else 0 end) Year_$whatever_year
FROM mytable
WHERE State='$state_id'
AND Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
AND other_filters IN (etc, etc, etc)
} //end php code
But for several states at once. So returning, let's say, 5 states means running the above statement 5 times, each with a different state ID substituted. Meanwhile the years can be any number of years, and the Sex (male/female/other), Age segment and other modifiers keep changing based on filters. The queries are long (at minimum 30-40 seconds apiece). So a thought I had, unless I'm totally doing it wrong, is to store the above query in a second table along with its results, first check that "meta query" table to see whether the query was already "cached", and if so return the stored results without reading the main table (which won't be updated very often).
Is this a good method or are there potential problems I'm not seeing?
EDIT: changed to table, not db (duh).
Table structure is:
id | Year | Sex | Age_segment | Another_filter | Etc
Nothing more complicated than that and no joining anything else. There are keys on id, Year, Sex, and Age_segment right now.
Proper indexing is what is needed to speed up the query. Start by doing an "EXPLAIN" on the query and post the results here.
I would suggest the following to start off. This way avoids the for loop and returns the data in 1 query. Not knowing the number of rows and cardinality of each column I suggest a composite index on State and Year.
SELECT mytable.State,mytable.Year,count(*)
FROM mytable
WHERE Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
AND other_filters IN (etc, etc, etc)
GROUP BY mytable.State,mytable.Year
The above query can be further optimised by checking the cardinality of some of the columns. Run the following to list a column's distinct values; the number of rows returned is its cardinality:
SELECT Age_segment FROM mytable GROUP BY Age_segment;
Pseudo code...
SELECT Year
, COUNT(*) total
FROM my_its_not_a_database_its_a_table
WHERE State = $state_id
AND Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
GROUP
BY Year;
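Here is a rough sketch of the single-query idea: group by State and Year once, then reshape the rows into per-state series in application code instead of looping over states. It assumes sqlite3 as a stand-in for MySQL and uses Python rather than PHP for the reshaping; the sample rows, the state codes and the index name idx_state_year are all made up for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, State TEXT, Year INTEGER,"
    " Sex INTEGER, Age_segment INTEGER)"
)
# Composite index as suggested above; the index name is hypothetical.
conn.execute("CREATE INDEX idx_state_year ON mytable (State, Year)")
conn.executemany(
    "INSERT INTO mytable (State, Year, Sex, Age_segment) VALUES (?, ?, ?, ?)",
    [
        ("CA", 2003, 0, 1), ("CA", 2003, 1, 2), ("CA", 2004, 1, 3),
        ("CA", 2004, 2, 3),   # excluded by the Sex filter
        ("NY", 2003, 0, 5), ("NY", 2005, 1, 4),
        ("NY", 2005, 0, 6),   # excluded by the Age_segment filter
    ],
)

# One round trip for all states and years, instead of one query per state.
grouped = conn.execute(
    """
    SELECT State, Year, COUNT(*) AS total
    FROM mytable
    WHERE Sex IN (0, 1)
      AND Age_segment IN (5, 4, 3, 2, 1)
    GROUP BY State, Year
    ORDER BY State, Year
    """
).fetchall()

# Reshape into {state: {year: total}} for the charting layer.
series = defaultdict(dict)
for state, year, total in grouped:
    series[state][year] = total

print(dict(series))
# {'CA': {2003: 2, 2004: 1}, 'NY': {2003: 1, 2005: 1}}
```

Years with no matching rows are simply absent from the result, so the charting code needs to fill them in with 0 (for example with series[state].get(year, 0)).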

MySql - Dynamic logic calculator

Please find the table "MarkCompare" below
SEMESTER - PAPER - TEACHER 1 - TEACHER 2
1 - ENG - PASS - PASS
1 - MATH - PASS - FAIL
2 - ENG - PASS - FAIL
2 - MATH - FAIL - FAIL
I want to calculate a ratio like the one below:
number of times both teachers gave the same result / number of times the two teachers' results differed
I am writing a query like this for each paper:
select (select count(*) from MarkCompare where teacher1=teacher2 and paper='ENG') / (select count(*) from MarkCompare where teacher1<>teacher2 and paper='ENG')
select (select count(*) from MarkCompare where teacher1=teacher2 and paper='MATH') / (select count(*) from MarkCompare where teacher1<>teacher2 and paper='MATH')
In future the number of papers may increase or decrease, and I am unable to find a dynamic query that works for any number of papers.
Is there a way to do this with just a query, without any procedure or function?
You can use a GROUP BY clause to aggregate the results for each paper listed in the table.
SELECT paper,
SUM(CASE WHEN teacher1 = teacher2 THEN 1 ELSE 0 END) AS AgreeCount,
SUM(CASE WHEN teacher1 <> teacher2 THEN 1 ELSE 0 END) AS DisagreeCount
FROM MarkCompare
GROUP BY paper;
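The query above returns the two counts; the ratio asked for in the question can be derived from them in the same statement. The sketch below is one way to do it, assuming sqlite3 as a stand-in for MySQL. NULLIF guards against papers with no disagreements, and the SCI row is a hypothetical addition to show that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE MarkCompare (semester INTEGER, paper TEXT,"
    " teacher1 TEXT, teacher2 TEXT)"
)
conn.executemany(
    "INSERT INTO MarkCompare VALUES (?, ?, ?, ?)",
    [
        (1, "ENG", "PASS", "PASS"), (1, "MATH", "PASS", "FAIL"),
        (2, "ENG", "PASS", "FAIL"), (2, "MATH", "FAIL", "FAIL"),
        (1, "SCI", "PASS", "PASS"),  # hypothetical paper with no disagreement
    ],
)

ratios = conn.execute(
    """
    SELECT paper,
           SUM(CASE WHEN teacher1 =  teacher2 THEN 1 ELSE 0 END) AS AgreeCount,
           SUM(CASE WHEN teacher1 <> teacher2 THEN 1 ELSE 0 END) AS DisagreeCount,
           -- 1.0 forces non-integer division; NULLIF turns a zero divisor into NULL
           SUM(CASE WHEN teacher1 = teacher2 THEN 1 ELSE 0 END) * 1.0
             / NULLIF(SUM(CASE WHEN teacher1 <> teacher2 THEN 1 ELSE 0 END), 0)
             AS AgreeRatio
    FROM MarkCompare
    GROUP BY paper
    ORDER BY paper
    """
).fetchall()

for row in ratios:
    print(row)
# ('ENG', 1, 1, 1.0)
# ('MATH', 1, 1, 1.0)
# ('SCI', 1, 0, None)
```

Because the grouping is on paper, new papers show up automatically without changing the query.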

SQL where particular column values appears

I wasn't sure how to search for this.
Let's say I have a simple table like this:
ID Type
1 0
1 1
2 1
3 0
4 0
4 1
How could I select all IDs that have rows of both type 0 and type 1?
SELECT id
FROM t
GROUP BY id
HAVING SUM(type=0)>0
AND SUM(type=1)>0
You group by id, then use HAVING to filter after aggregation, keeping only the ids that have at least one row of type 0 and at least one row of type 1.
HAVING is pretty expensive, and that query can't use keys. A self-join is an alternative:
SELECT ID FROM foo AS foo0 JOIN foo AS foo1 USING (ID) WHERE foo0.Type=0 AND foo1.Type=1 GROUP BY foo0.id;
A more general way of doing this is to use a CASE column for each value you need to test, combined with a GROUP BY on the id column. If you have n conditions to test for, you get n columns, each indicating whether its condition is met for a given id. The filter then becomes trivial: you can use the grouped query as a subquery and filter it like any multi-column table, which keeps the code simple and the logic easy to follow.
SELECT id, Type0, Type1
FROM (
  SELECT id,
    MAX(CASE WHEN type = 0 THEN TRUE END) AS Type0
    , MAX(CASE WHEN type = 1 THEN TRUE END) AS Type1
  FROM t
  GROUP BY id
) pivot
WHERE Type0 = TRUE and Type1 = TRUE
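For comparison, the sketch below runs the three approaches from this thread (HAVING, self-join, pivot subquery) against the sample table and confirms they pick the same ids. It assumes sqlite3 as a stand-in for MySQL; the self-join is written with ON and DISTINCT rather than USING and GROUP BY, and 1 is used in place of TRUE, which are my rewordings rather than the answers' exact text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.execute("CREATE TABLE t (id INTEGER, type INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 0), (1, 1), (2, 1), (3, 0), (4, 0), (4, 1)],
)

def ids(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# 1. Aggregate, then filter the groups with HAVING.
having_ids = ids(
    "SELECT id FROM t GROUP BY id "
    "HAVING SUM(type = 0) > 0 AND SUM(type = 1) > 0 ORDER BY id"
)

# 2. Self-join: pair each type-0 row with a type-1 row of the same id.
join_ids = ids(
    "SELECT DISTINCT t0.id FROM t AS t0 JOIN t AS t1 ON t1.id = t0.id "
    "WHERE t0.type = 0 AND t1.type = 1 ORDER BY t0.id"
)

# 3. Pivot in a subquery, then filter it like an ordinary table.
pivot_ids = ids(
    """
    SELECT id FROM (
        SELECT id,
               MAX(CASE WHEN type = 0 THEN 1 END) AS type0,
               MAX(CASE WHEN type = 1 THEN 1 END) AS type1
        FROM t
        GROUP BY id
    ) AS p
    WHERE type0 = 1 AND type1 = 1
    ORDER BY id
    """
)

print(having_ids, join_ids, pivot_ids)  # [1, 4] [1, 4] [1, 4]
```

Which one performs best depends on the table's indexes and size, so it is worth running EXPLAIN on each against the real data before choosing.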