Separating/Sorting single column values into several columns using case function - mysql

I have two tables that I want to join and split with a case function depending on the values in one of the columns. (I know, sounds weird so let me explain)
It's a process where I run separate batches. Every batch has several samples, and each sample is measured as a series of voltage readings at several locations. My two tables look like this:
Sample: id, BatchesID, ...
Readings: id, SampleID, voltage, location
When a batch is run, it takes one sample at a time and for every location (25 locations) it takes about 20 readings of the voltage before moving on to the next one.
I want to look at one batch at a time, and for every Sample.id, I want to gather the AVG(voltage) for all the locations. My table for Readings turns out like:
SampleID location voltage
1 1 5.23
1 1 4.53
... ... ...
1 25 7.89
2 1 4.96
2 1 5.04
... ... ...
2 25 6.09
...
But I want it to look like:
SampleID avg_v_for_1 avg_v_for_2 ... avg_v_for_25
1 4.73 5.24 ... 6.35
2 3.87 4.76 ... 9.32
... ... ... ... ...
200 6.73 3.87 ... 8.23
Basically, for every separate sample, I want to take the average voltage of all the measurements at every location and put it on a single row. My current query looks like this:
SELECT Readings.SampleID, Sample.BatchesID,
(case when location = '1' then AVG(voltage) else 0 end) avg_v_for_1,
(case when location = '2' then AVG(voltage) else 0 end) avg_v_for_2,
...
(case when location = '25' then AVG(voltage) else 0 end) avg_v_for_25
FROM DB.Readings
INNER JOIN Sample
ON Readings.SampleID = Sample.id
WHERE Sample.BatchesID = 'specific_batch_id'
GROUP BY Readings.location, Sample.id;
The problem is that this generates the following table:
SampleID avg_v_for_1 avg_v_for_2 ... avg_v_for_25
1 4.73 0 ... 0
1 0 4.76 ... 0
1 0 0 ... 6.73
2 3.87 0 ... 0
2 0 4.83 ... 0
...
How can I get MySQL to gather ALL the average values for EVERY location on a SINGLE row? I have tried removing the group by location and only group by sampleID but then I only get the values for the first location and everything else becomes 0.
Any help is appreciated, thank you!

I'm adding another answer to explain how the query with AVG(case ... when ... then ... end) works, and why the version with case ... when ... then AVG(...) end doesn't give the expected results.
First remark: in ANSI SQL, a GROUP BY query has the following general form:
SELECT column1, column2, ... column_n, aggregate_function (expression)
FROM tables
WHERE predicates
GROUP BY column1, column2, ... column_n;
where aggregate_function can be a function such as SUM, MAX, MIN, COUNT or AVG.
There are several rules (restrictions) for the GROUP BY clause; see this link for details: http://etutorials.org/SQL/Mastering+Oracle+SQL/Chapter+4.+Group+Operations/4.2+The+GROUP+BY+Clause/
One of them says that:
GROUP BY clause must include all nonaggregate expressions
This means that every non-aggregated column in the SELECT clause must also be listed in the GROUP BY clause;
for example this query:
SELECT col1, col2, AVG( expression )
FROM table
GROUP BY col2
is wrong, because col1 is not listed in the GROUP BY clause. This query won't work on most databases (Oracle, PostgreSQL, MS SQL Server etc.), with the exception of MySQL (I'll explain why later).
The expression within the aggregate function can refer to any column of the table, regardless of whether the column is listed in the GROUP BY clause or not.
Because of the above the query:
SELECT Readings.SampleID,
(case when location = '1' then AVG(voltage) else 0 end) avg_v_for_1
....
GROUP BY sampleId
simply won't work on databases that are compliant with ANSI SQL: it raises an error because location is used outside the AVG function but is not listed in the GROUP BY clause.
The question is: why does this query work on MySQL?
Because MySQL implements its own extension to GROUP BY queries, see this link --> http://dev.mysql.com/doc/refman/5.6/en/group-by-extensions.html
In MySQL the select list can refer to nonaggregated columns not listed in the GROUP BY clause (since MySQL 5.7.5 the ONLY_FULL_GROUP_BY SQL mode is enabled by default, and the extension only applies when that mode is turned off). Because of this extension our query is syntactically correct and runs on MySQL, but gives unexpected (unwanted) results, since the order in which the expressions are evaluated is different:
1. it first runs the aggregated (GROUP BY) query and evaluates AVG( voltage ),
2. then it evaluates CASE WHEN ... THEN, but against the result set returned by the aggregated query from point 1, where location is taken from an arbitrary row of each group.
The query with the clause AVG( case when ... then ):
1. first calculates the expression CASE-WHEN-THEN for all table rows,
2. then runs an aggregated query on the result set returned by #1 and calculates the AVG.
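The second evaluation order can be checked on a tiny data set. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the rows, the batch id 10 and the reduction to two locations are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sample (id INTEGER PRIMARY KEY, BatchesID INTEGER);
    CREATE TABLE Readings (SampleID INTEGER, location INTEGER, voltage REAL);
    -- Hypothetical data: samples 1 and 2 belong to batch 10, sample 3 does not.
    INSERT INTO Sample VALUES (1, 10), (2, 10), (3, 11);
    INSERT INTO Readings VALUES
        (1, 1, 5.0), (1, 1, 4.0), (1, 2, 7.0), (1, 2, 8.0),
        (2, 1, 3.0), (2, 1, 5.0), (2, 2, 6.0),
        (3, 1, 9.0);
""")

# CASE runs per row first (yielding voltage or NULL), then AVG aggregates
# per sample and skips the NULLs, so each sample ends up on a single row.
rows = conn.execute("""
    SELECT r.SampleID,
           AVG(CASE WHEN r.location = 1 THEN r.voltage END) AS avg_v_for_1,
           AVG(CASE WHEN r.location = 2 THEN r.voltage END) AS avg_v_for_2
    FROM Readings r
    INNER JOIN Sample s ON r.SampleID = s.id
    WHERE s.BatchesID = 10
    GROUP BY r.SampleID
    ORDER BY r.SampleID
""").fetchall()

for row in rows:
    print(row)
```

Each sample in batch 10 comes back as exactly one row, with one average per location column.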

Try:
SELECT Readings.SampleID, Sample.BatchesID,
AVG(case when location = '1' then voltage else null end) avg_v_for_1,
AVG(case when location = '2' then voltage else null end) avg_v_for_2,
...
AVG(case when location = '25' then voltage else null end) avg_v_for_25
FROM DB.Readings
........
GROUP BY Readings.SampleID, Sample.BatchesID
--- EDIT --> use the IFNULL function to change NULLs into 0
SELECT Readings.SampleID, Sample.BatchesID,
ifnull( AVG(case when location = '1' then voltage else null end), 0 ) avg_v_for_1,
ifnull( AVG(case when location = '2' then voltage else null end), 0 ) avg_v_for_2,
...
ifnull( AVG(case when location = '25' then voltage else null end), 0 ) avg_v_for_25
FROM DB.Readings
........
GROUP BY Readings.SampleID, Sample.BatchesID
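To see what the IFNULL wrapper changes, here is a small sketch, again using Python's sqlite3 module in place of MySQL (an assumption; SQLite also provides IFNULL). The single sample with readings at location 1 only is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Readings (SampleID INTEGER, location INTEGER, voltage REAL);
    -- Hypothetical data: sample 1 has readings at location 1 but none at location 2.
    INSERT INTO Readings VALUES (1, 1, 5.0), (1, 1, 4.0);
""")

row = conn.execute("""
    SELECT SampleID,
           AVG(CASE WHEN location = 1 THEN voltage ELSE NULL END) AS avg_v_for_1,
           -- No rows match location 2, so AVG sees only NULLs and returns NULL.
           AVG(CASE WHEN location = 2 THEN voltage ELSE NULL END) AS avg_v_for_2_raw,
           -- IFNULL replaces that NULL with 0.
           IFNULL(AVG(CASE WHEN location = 2 THEN voltage ELSE NULL END), 0) AS avg_v_for_2
    FROM Readings
    GROUP BY SampleID
""").fetchone()

print(row)
```

Whether 0 or NULL is the better marker for "no readings at this location" depends on how the result is consumed; 0 can be mistaken for a measured average.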

Related

jDatabase request with 2 times the same field under condition

I would like to request a table which looks like this:
Table1
record element value
62 56 637689
62 163 12/1990
...
joined with another table:
Table2
user_id record
64 62
I expect this result: 637689,"12/1990" based on user_id=64 (the user_id filter is not implemented in my request, as I am unable to write the correct JOIN syntax).
I tried this request:
SELECT record,
(CASE WHEN element = 163 THEN value END) AS numserie,
(CASE WHEN element = 56 THEN value END) AS dateprod
FROM Table1
GROUP BY record
ORDER BY record DESC;
but in HeidiSQL I get this result:
numserie dateprod
637689 NULL
I tried other "element" numbers and always get NULL.
When I swap the two CASE lines, the results swap as well.
What is wrong?
You need to aggregate the CASE expressions. Using the MAX function is one option:
SELECT
record,
MAX(CASE WHEN element = 163 THEN value END) AS numserie,
MAX(CASE WHEN element = 56 THEN value END) AS dateprod
FROM Table1
GROUP BY record
ORDER BY record DESC;
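As a rough illustration of why the aggregate is needed, and of how the missing user_id filter could be added, here is a sketch using Python's sqlite3 module as a stand-in for MySQL. The join on Table2.record, the neutral column aliases value_56 and value_163, and the extra record 63 are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (record INTEGER, element INTEGER, value TEXT);
    CREATE TABLE Table2 (user_id INTEGER, record INTEGER);
    INSERT INTO Table1 VALUES (62, 56, '637689'), (62, 163, '12/1990');
    -- Hypothetical extra record belonging to another user.
    INSERT INTO Table1 VALUES (63, 56, '111111');
    INSERT INTO Table2 VALUES (64, 62), (65, 63);
""")

# Each CASE yields the value on the one matching row and NULL on the others;
# MAX ignores the NULLs, so every column keeps its single non-NULL value.
rows = conn.execute("""
    SELECT t1.record,
           MAX(CASE WHEN t1.element = 56 THEN t1.value END) AS value_56,
           MAX(CASE WHEN t1.element = 163 THEN t1.value END) AS value_163
    FROM Table1 t1
    INNER JOIN Table2 t2 ON t2.record = t1.record
    WHERE t2.user_id = 64
    GROUP BY t1.record
    ORDER BY t1.record DESC
""").fetchall()

print(rows)
```

Without MAX, the engine evaluates each CASE against a single arbitrary row of the group, which is why one of the two columns always came back NULL.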

summation in mysql does not work properly and returns 0

I have a SQL query, shown below.
The problem is that if the second SELECT statement (with DataitemType=3) returns NULL, the whole calculation becomes 0. For example, the first SELECT gives 100 and the second returns NULL. Adding them should give 100, but I get back 0!
Can anyone explain the reason, and what to do to get rid of it?
Here is the copyable code:
select( (
SELECT sum(Sentiment)
FROM entity_epoch_data
WHERE EpochID IN
(SELECT ID
FROM epoch
WHERE StartDateTime>='2013-11-1'
AND EndDateTime<='2013-11-30')
AND EntityID =86
AND DataitemType=0
)+
(SELECT sum(Sentiment)
FROM entity_epoch_data
WHERE EpochID IN
(SELECT ID
FROM epoch
WHERE StartDateTime>='2013-11-1'
AND EndDateTime<='2013-11-30')
AND EntityID =86
AND DataitemType=3)
)
Just wrap each subquery that produces a value in IFNULL, like IFNULL((select sum()......), 0), and it should work fine.
But a little piece of advice: you should improve that query.
I believe this query does the same thing:
SELECT sum(entity_epoch_data.Sentiment)
FROM entity_epoch_data INNER JOIN epoch
ON entity_epoch_data.EpochID = epoch.id
WHERE epoch.StartDateTime>='2013-11-1'
and epoch.EndDateTime<='2013-11-30'
AND entity_epoch_data.EntityID =86
and entity_epoch_data.DataitemType in (0,3)
You are adding the sum for DataitemType 3 to the sum for DataitemType 0, so it can simply be one query with a join.
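The behaviour described above can be reproduced with a few rows. This is a sketch only: it runs on Python's sqlite3 module rather than MySQL, the data is invented (two rows of type 0, none of type 3), the date filter is left out of the subqueries for brevity, and the dates are written in ISO form because SQLite compares them as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE epoch (ID INTEGER PRIMARY KEY, StartDateTime TEXT, EndDateTime TEXT);
    CREATE TABLE entity_epoch_data (EpochID INTEGER, EntityID INTEGER,
                                    DataitemType INTEGER, Sentiment INTEGER);
    INSERT INTO epoch VALUES (1, '2013-11-05', '2013-11-06');
    -- Hypothetical data: two rows of type 0 (60 + 40), no rows of type 3.
    INSERT INTO entity_epoch_data VALUES (1, 86, 0, 60), (1, 86, 0, 40);
""")


def type_sum(item_type):
    """Build the scalar subquery that sums Sentiment for one DataitemType."""
    return (
        "(SELECT SUM(Sentiment) FROM entity_epoch_data "
        f"WHERE EntityID = 86 AND DataitemType = {int(item_type)})"
    )


# SUM over zero rows is NULL, and 100 + NULL is NULL.
naive = conn.execute(f"SELECT {type_sum(0)} + {type_sum(3)}").fetchone()[0]

# Wrapping each subquery in IFNULL turns the missing sum into 0.
guarded = conn.execute(
    f"SELECT IFNULL({type_sum(0)}, 0) + IFNULL({type_sum(3)}, 0)"
).fetchone()[0]

# The single joined query sums both types at once, so no addition is needed.
joined = conn.execute("""
    SELECT SUM(d.Sentiment)
    FROM entity_epoch_data d
    INNER JOIN epoch e ON d.EpochID = e.ID
    WHERE e.StartDateTime >= '2013-11-01'
      AND e.EndDateTime <= '2013-11-30'
      AND d.EntityID = 86
      AND d.DataitemType IN (0, 3)
""").fetchone()[0]

print(naive, guarded, joined)
```

Note that the joined query still returns NULL when no row at all matches, so an outer IFNULL may still be wanted there.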
Use CASE statements instead of the statements you're using.
select sum(case when StartDateTime>='2013-11-1' and EndDateTime<='2013-11-30' and DataitemType=0 then Sentiment else 0 end) as Sentiment_0,
sum(case when StartDateTime>='2013-11-1' and EndDateTime<='2013-11-30' and DataitemType=3 then Sentiment else 0 end) as Sentiment3
from (tables_joined)
where EntityID =86
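In SQL, adding NULL to anything yields NULL, not the other operand. You can use a CASE expression to replace NULL values with 0. Note that SUM over zero matching rows still returns NULL, so this only helps when the NULLs are in the sentiment values themselves.
SELECT SUM(CASE WHEN sentiment IS NULL THEN 0 ELSE sentiment END) FROM ....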
If your DB does not support CASE inside the SUM() function create a VIEW that uses CASE to substitute the null values with 0.

MAX with extra criteria

I have the following part of a query I'm working on in MYSQL.
SELECT
MAX(CAST(MatchPlayerBatting.BatRuns AS SIGNED)) AS HighestScore
FROM
MatchPlayerBatting
It returns the correct result. However there is another column I need it to work off.
That is if the maximum value it finds also has a value of "not out" within "BatHowOut", it should show the result as for example 96* rather than just 96.
How could this be done?
To help make the data concrete, consider two cases:
BatRuns BatHowOut
96 not out
96 lbw
BatRuns BatHowOut
96 not out
102 lbw
For the first data, the answer should be '96*'; for the second, '102'.
You can achieve this using self-join like this:
SELECT t1.ID
, CONCAT(t1.BatRuns,
CASE WHEN t1.BatHowOut = 'Not Out' THEN '*' ELSE '' END
) AS HighScore
FROM MatchPlayerBatting t1
JOIN
(
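-- Cast as in the question, so the scores are compared as numbers, not strings
SELECT MAX(CAST(BatRuns AS SIGNED)) AS HighestScore
FROM MatchPlayerBatting
) t2
ON CAST(t1.BatRuns AS SIGNED) = t2.HighestScore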
See this sample SQLFiddle with highest "Not Out"
See another sample SQLFiddle with highest "Out"
See another sample SQLFiddle with two highest scores
How about ordering the scores in descending order and selecting only the first record?
select concat(BatRuns , case when BatHowOut = 'not out' then '*' else '' end)
from mytable
order by cast(BatRuns as signed) desc,
(case when BatHowOut = 'not out' then 1 else 2 end)
limit 1;
Sample here.
If you want to find the highest score for each player, here is a solution that may not be elegant, but is quite effective.
select PlayerID,
case when runs != round(runs)
then concat(round(runs),'*')
else
round(runs)
end highest_score
from (select PlayerID,
max(cast(BatRuns as decimal) +
case when BatHowOut = 'not out' then 0.1 else 0 end
) runs
from MatchPlayerBatting
group by PlayerID) max_runs;
This takes advantage of the fact that runs can never be fractions, only whole numbers. When there is a tie for the highest score and one of the innings is unbeaten,
adding 0.1 to the unbeaten score makes it the highest. The 0.1 is later removed and replaced with a trailing *.
Sample here.
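The 0.1 trick can be tried against the two data cases from the question. This sketch uses Python's sqlite3 module instead of MySQL, so the query is adapted (an assumption): CAST(... AS INTEGER) replaces round(), and || replaces concat(). The PlayerID values 1 and 2 are invented to hold the two cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MatchPlayerBatting (PlayerID INTEGER, BatRuns TEXT, BatHowOut TEXT);
    -- Player 1: tie at 96, one innings not out. Player 2: 102 beats 96 not out.
    INSERT INTO MatchPlayerBatting VALUES
        (1, '96', 'not out'), (1, '96', 'lbw'),
        (2, '96', 'not out'), (2, '102', 'lbw');
""")

rows = conn.execute("""
    SELECT PlayerID,
           CASE WHEN runs != CAST(runs AS INTEGER)
                -- A fractional part means the best innings was not out.
                THEN CAST(runs AS INTEGER) || '*'
                ELSE CAST(CAST(runs AS INTEGER) AS TEXT)
           END AS highest_score
    FROM (SELECT PlayerID,
                 MAX(CAST(BatRuns AS INTEGER)
                     + CASE WHEN BatHowOut = 'not out' THEN 0.1 ELSE 0 END) AS runs
          FROM MatchPlayerBatting
          GROUP BY PlayerID) AS max_runs
    ORDER BY PlayerID
""").fetchall()

print(rows)
```

The tie at 96 resolves to the unbeaten innings, while a higher dismissed score still wins outright.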

mysql - performing addition, subtraction etc on two rows

I have the following MySQL table storing meter readings of different power stations.
Date, station_name, reading
2013-05-06, ABC, 102
2013-05-06, PQR, 122
I want a SQL query that gives the following result for a particular date.
Date, ABC, PQR, ABC-PQR
2013-05-06,102,122,-20
You could use CASE statements:
SELECT Date
, SUM(CASE WHEN station_name = 'ABC' THEN reading ELSE 0 END) as ABC
, SUM(CASE WHEN station_name = 'PQR' THEN reading ELSE 0 END) as PQR
, SUM(CASE WHEN station_name = 'ABC' THEN reading ELSE 0 END) - SUM(CASE WHEN station_name = 'PQR' THEN reading ELSE 0 END) as 'ABC-PQR'
FROM table
WHERE Date = '20130506'
GROUP BY Date
You can search for MySQL PIVOT to find out other methods people use.
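A quick way to check the conditional-sum pivot is to run it on the two rows from the question. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the table name meter_readings, the alias abc_minus_pqr and the extra row for 2013-05-07 are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meter_readings (Date TEXT, station_name TEXT, reading INTEGER);
    INSERT INTO meter_readings VALUES
        ('2013-05-06', 'ABC', 102),
        ('2013-05-06', 'PQR', 122),
        ('2013-05-07', 'ABC', 110);  -- hypothetical row for another date
""")

# Each SUM(CASE ...) only adds up the readings of one station, which turns
# the station rows into columns; the difference reuses the same two sums.
row = conn.execute("""
    SELECT Date,
           SUM(CASE WHEN station_name = 'ABC' THEN reading ELSE 0 END) AS ABC,
           SUM(CASE WHEN station_name = 'PQR' THEN reading ELSE 0 END) AS PQR,
           SUM(CASE WHEN station_name = 'ABC' THEN reading ELSE 0 END)
             - SUM(CASE WHEN station_name = 'PQR' THEN reading ELSE 0 END) AS abc_minus_pqr
    FROM meter_readings
    WHERE Date = '2013-05-06'
    GROUP BY Date
""").fetchone()

print(row)
```

The station names are hard-coded in the query, so a new station means a new column expression.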
I believe it is not possible to create columns dynamically from row values in a single plain query. I believe you should do that in the application layer rather than the database layer.
See this post: mysql select dynamic row values as column names, another column as value.

Comparing 2 Columns in same table

I need to compare 2 columns in a table and give 3 things:
Count of rows checked (Total Rows that were checked)
Count of rows matching (Rows in which the 2 columns matched)
Count of rows different (Rows in which the 2 columns differed)
I've been able to get just rows matching using a join on itself, but I'm unsure how to get the others all at once. The importance of getting all of the information at the same time is because this is a very active table and the data changes with great frequency.
I cannot post the table schema as there is a lot of data in it that is irrelevant to this issue. The columns in question are both int(11) unsigned NOT NULL DEFAULT '0'. For purposes of this, I'll call them mask and mask_alt.
select
count(*) as rows_checked,
sum(col = col2) as rows_matching,
sum(col != col2) as rows_different
from table
Note the elegant use of sum(condition).
This works because in MySQL true is 1 and false is 0. Summing these counts the number of times the condition is true. It's much more elegant than case when condition then 1 else 0 end, which is the SQL equivalent of coding if (condition) return true else return false; instead of simply return condition;.
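The counting behaviour of sum(condition) is easy to verify on a handful of rows. This sketch uses Python's sqlite3 module, where comparisons also evaluate to 1 or 0, in place of MySQL; the table name masks and its four rows are invented, with the column names mask and mask_alt taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE masks (mask INTEGER NOT NULL DEFAULT 0,
                        mask_alt INTEGER NOT NULL DEFAULT 0);
    -- Hypothetical data: three matching rows and one differing row.
    INSERT INTO masks VALUES (1, 1), (2, 3), (4, 4), (0, 0);
""")

# A comparison evaluates to 1 (true) or 0 (false), so SUM counts the rows
# where it holds. All three figures come from one pass over the table.
row = conn.execute("""
    SELECT COUNT(*) AS rows_checked,
           SUM(mask = mask_alt) AS rows_matching,
           SUM(mask != mask_alt) AS rows_different
    FROM masks
""").fetchone()

print(row)
```

Because both columns are NOT NULL, matching plus different always equals the total; with nullable columns a NULL comparison would count toward neither.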
Assuming you mean you want to count the rows where col is or is not equal to col2, you can use an aggregate SUM() coupled with CASE:
SELECT
COUNT(*) AS total,
SUM(CASE WHEN col = col2 THEN 1 ELSE 0 END) AS matching,
SUM(CASE WHEN col <> col2 THEN 1 ELSE 0 END) AS non_matching
FROM table
It may be more efficient to get the total COUNT(*) in a subquery though, and use that value to subtract the matching to get the non-matching, if the above is not performant enough.
SELECT
total,
matching,
total - matching AS non_matching
FROM
(
SELECT
COUNT(*) AS total,
SUM(CASE WHEN col = col2 THEN 1 ELSE 0 END) AS matching
FROM table
) sumtbl