Row count where column1='value1' and column1='value2' group by column2 - mysql

I have a table like this:
date day weather
2000-01-01 Monday Sunny
2000-01-02 Tuesday Rainy
. . .
I want to get the number of rainy Mondays and sunny Mondays in one query, like this:
day rainy_d sunny_d
Monday 2 5
How can I accomplish that in MySQL and PostgreSQL?

select `Day`,
SUM(case when weather = 'Sunny' THEN 1 ELSE 0 end) as Sunny_D,
SUM(case when weather = 'Rainy' THEN 1 ELSE 0 end) as Rainy_D
FROM YOURTABLENAME
Where day = 'Monday'
Group by `Day`

Standard SQL, works in both:
SELECT
day,
SUM(CASE WHEN weather = 'Rainy' THEN 1 ELSE 0 END) AS rainy_d,
SUM(CASE WHEN weather = 'Sunny' THEN 1 ELSE 0 END) AS sunny_d
FROM yourtable
GROUP BY day
More concise version - MySQL only:
SELECT
day,
SUM(weather = 'Rainy') AS rainy_d,
SUM(weather = 'Sunny') AS sunny_d
FROM yourtable
GROUP BY day
More concise version - PostgreSQL only:
SELECT
day,
SUM((weather = 'Rainy')::int) AS rainy_d,
SUM((weather = 'Sunny')::int) AS sunny_d
FROM yourtable
GROUP BY day
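To see the standard-SQL variant in action, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in here for MySQL or PostgreSQL, and the function name weather_counts and the sample rows are made up for illustration.

```python
import sqlite3

def weather_counts(rows):
    """Count rainy and sunny days per weekday using SUM(CASE ...)."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL/PostgreSQL
    conn.execute("CREATE TABLE yourtable (date TEXT, day TEXT, weather TEXT)")
    conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT day,
               SUM(CASE WHEN weather = 'Rainy' THEN 1 ELSE 0 END) AS rainy_d,
               SUM(CASE WHEN weather = 'Sunny' THEN 1 ELSE 0 END) AS sunny_d
        FROM yourtable
        GROUP BY day
        ORDER BY day
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical sample data: three Mondays and one Tuesday.
sample = [
    ("2000-01-03", "Monday", "Sunny"),
    ("2000-01-04", "Tuesday", "Rainy"),
    ("2000-01-10", "Monday", "Rainy"),
    ("2000-01-17", "Monday", "Sunny"),
]
print(weather_counts(sample))
```

Each CASE expression turns a matching row into 1 and every other row into 0, so the SUM is simply a per-group count of matches.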

The column day is probably redundant. I would delete it without substitution, since the column date already holds all the information. Then your query could look like this.
In PostgreSQL (the FM prefix suppresses the blank padding that 'Day' otherwise adds):
SELECT to_char(date, 'FMDay') AS day
,COUNT(NULLIF(weather,'Sunny')) AS rainy_d
,COUNT(NULLIF(weather,'Rainy')) AS sunny_d
FROM tbl
GROUP BY 1;
In MySQL:
SELECT DAYNAME(date) AS day
... rest identical
The NULLIF() construct works for exactly two distinct (non-null) values in the column weather and is standard SQL. For more values, use the alternatives provided by @Mark and @xQbert.
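The two-value limitation is easy to check. The sketch below runs the NULLIF() counts in an in-memory SQLite database (used only as a convenient stand-in), once with two weather values and once with a third. The function name nullif_counts and the value 'Cloudy' are assumptions for illustration.

```python
import sqlite3

def nullif_counts(weather_values):
    """Return (rainy_d, sunny_d) computed with COUNT(NULLIF(...))."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite as a stand-in engine
    conn.execute("CREATE TABLE tbl (weather TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?)", [(w,) for w in weather_values])
    row = conn.execute(
        """
        SELECT COUNT(NULLIF(weather, 'Sunny')) AS rainy_d,
               COUNT(NULLIF(weather, 'Rainy')) AS sunny_d
        FROM tbl
        """
    ).fetchone()
    conn.close()
    return row

# With exactly two distinct values the counts are right.
two_values = nullif_counts(["Sunny", "Rainy", "Sunny"])
# A third value ('Cloudy', hypothetical) is counted in both columns.
three_values = nullif_counts(["Sunny", "Rainy", "Sunny", "Cloudy"])
print(two_values, three_values)
```

COUNT(NULLIF(weather, 'Sunny')) really counts "everything that is not Sunny", which equals the rainy count only while Rainy is the sole other value.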

Related

Trying to calculate difference between 2 select queries in SQL

Fairly new to SQL but I'm trying to get the difference between 2 select queries from the same Table. I have tried the following
SELECT
(SELECT KwhMeter,IndexElek,CalorieMeter,IndexWarmte,IndexWarmWater,IndexKoudWater,Date FROM Energiemeters WHERE Date = '2017-05-01')
-
(SELECT KwhMeter,IndexElek,CalorieMeter,IndexWarmte,IndexWarmWater,IndexKoudWater,Date FROM Energiemeters WHERE Date = '2017-04-01') AS Difference
but I end up having the following error :
#1241 - Operand should contain 1 column(s)
If you want rows that are on May 1st but not April 1st, then one way is to use aggregation:
SELECT KwhMeter, IndexElek, CalorieMeter, IndexWarmte, IndexWarmWater, IndexKoudWater, MIN(Date) AS Date
FROM Energiemeters
WHERE Date IN ('2017-04-01', '2017-05-01')
GROUP BY KwhMeter, IndexElek, CalorieMeter, IndexWarmte, IndexWarmWater, IndexKoudWater
HAVING MIN(Date) = '2017-05-01';
Using Cross Join. This is with the assumption that you get only 1 row per date.
Select
(a.KwhMeter-b.KwhMeter) as KwhMeter,
(a.IndexElek-b.IndexElek) as IndexElek,
(a.CalorieMeter-b.CalorieMeter) CalorieMeter,
(a.IndexWarmte-b.IndexWarmte) IndexWarmte,
(a.IndexWarmWater-b.IndexWarmWater) IndexWarmWater,
(a.IndexKoudWater-b.IndexKoudWater) IndexKoudWater,
DATEDIFF(a.Date, b.Date) as DaysBetween
from
(
SELECT distinct KwhMeter,IndexElek,CalorieMeter,IndexWarmte,IndexWarmWater,IndexKoudWater,Date
FROM Energiemeters
WHERE Date = '2017-05-01'
) a
cross join
(
SELECT distinct KwhMeter,IndexElek,CalorieMeter,IndexWarmte,IndexWarmWater,IndexKoudWater,Date
FROM Energiemeters
WHERE Date = '2017-04-01'
) b;
It seems as though you want to subtract the respective values of columns from two rows determined by dates 2017-05-01 and 2017-04-01?
If yes, then the query can be written as follows:
SELECT SUM(CASE Date
WHEN '2017-05-01' THEN KwhMeter
WHEN '2017-04-01' THEN -KwhMeter
END) AS KwhMeter,
SUM(CASE Date
WHEN '2017-05-01' THEN IndexElek
WHEN '2017-04-01' THEN -IndexElek
END) AS IndexElek,
SUM(CASE Date
WHEN '2017-05-01' THEN CalorieMeter
WHEN '2017-04-01' THEN -CalorieMeter
END) AS CalorieMeter,
SUM(CASE Date
WHEN '2017-05-01' THEN IndexWarmte
WHEN '2017-04-01' THEN -IndexWarmte
END) AS IndexWarmte,
SUM(CASE Date
WHEN '2017-05-01' THEN IndexWarmWater
WHEN '2017-04-01' THEN -IndexWarmWater
END) AS IndexWarmWater,
SUM(CASE Date
WHEN '2017-05-01' THEN IndexKoudWater
WHEN '2017-04-01' THEN -IndexKoudWater
END) AS IndexKoudWater
FROM Energiemeters
WHERE Date IN ('2017-05-01', '2017-04-01')
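As a rough illustration of the signed-sum idea, the following sketch computes the difference for two of the columns in an in-memory SQLite database. SQLite, the integer column types, the function name meter_difference and the readings are all assumptions, since the real table definition is not shown in the question.

```python
import sqlite3

def meter_difference(rows, later, earlier):
    """Subtract the 'earlier' row from the 'later' row with SUM(CASE ...)."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    conn.execute(
        "CREATE TABLE Energiemeters (Date TEXT, KwhMeter INTEGER, IndexElek INTEGER)"
    )
    conn.executemany("INSERT INTO Energiemeters VALUES (?, ?, ?)", rows)
    row = conn.execute(
        """
        SELECT SUM(CASE Date WHEN :later   THEN KwhMeter
                             WHEN :earlier THEN -KwhMeter END) AS KwhMeter,
               SUM(CASE Date WHEN :later   THEN IndexElek
                             WHEN :earlier THEN -IndexElek END) AS IndexElek
        FROM Energiemeters
        WHERE Date IN (:later, :earlier)
        """,
        {"later": later, "earlier": earlier},
    ).fetchone()
    conn.close()
    return row

# Hypothetical meter readings; the mid-April row is filtered out by WHERE.
readings = [
    ("2017-04-01", 1000, 250),
    ("2017-04-15", 1090, 280),
    ("2017-05-01", 1180, 310),
]
difference = meter_difference(readings, "2017-05-01", "2017-04-01")
print(difference)
```

The later row contributes its values with a plus sign and the earlier row with a minus sign, so each SUM collapses the two rows into one difference. Like the cross join, this relies on there being one row per date.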
Works in most RDBMSs; in MySQL only from version 8.0, which added common table expressions:
If I wanted to compute per-column difference I would use common table expressions to prepare subresults and then compute difference.
WITH
res1 AS
(SELECT KwhMeter,IndexElek,CalorieMeter,IndexWarmte,IndexWarmWater,IndexKoudWater,Date FROM Energiemeters WHERE Date = '2017-05-01'),
res2 AS
(SELECT KwhMeter,IndexElek,CalorieMeter,IndexWarmte,IndexWarmWater,IndexKoudWater,Date FROM Energiemeters WHERE Date = '2017-04-01')
SELECT
r1.KwhMeter - r2.KwhMeter, r1.OtherColumnName - r2.OtherColumnName ... FROM res1 r1, res2 r2
However, this only works with one row per subselect (date). Do you guarantee one entry per date? Is that a PK? You need to clarify your question, mainly what you mean by 'difference'.

SQL query summary issue

I'm new to SQL and trying to create a total summary of a working SQL query, which lists the daily results for one month of data.
Now I need the totals over the outcome of that query.
So I created a 'query in a query' piece of SQL, but it isn't working because of my lack of SQL knowledge. I guess it's an easy fix for you pros :-)
The working SQL query with the daily outcome of one month:
SELECT
DATE_FORMAT(date, '%d/%m/%y') AS Datum,
COUNT(*) AS Berichten,
SUM(CASE WHEN virusinfected>0 THEN 1 ELSE 0 END) AS Virus,
SUM(CASE WHEN (virusinfected=0 OR virusinfected IS NULL) AND isspam>0 THEN 1 ELSE 0 END) AS Ongewenst,
SUM(CASE WHEN (virusinfected=0 OR virusinfected IS NULL) AND (isspam=1) AND isrblspam>0 THEN 1 ELSE 0 END) AS RBL,
SUM(size) AS Grootte
FROM
maillog
WHERE
1=1
AND (1=1)
AND
date < '2017-04-01'
AND
date >= '2017-03-01'
AND
to_domain = 'domain1.nl'
OR
date < '2017-04-01'
AND
date >= '2017-03-01'
AND
to_domain = 'domain2.nl'
GROUP BY
Datum
ORDER BY
date
The incorrect query trying to create the monthly totals:
SELECT Datum,
SUM(Berichten) AS Berichten,
SUM(Virus) AS Virus,
SUM(Ongewenst) AS Ongewenst,
SUM(RBL) AS RBL,
SUM(Grootte) AS Grootte,
FROM ( SELECT
DATE_FORMAT(date, '%d/%m/%y') AS Datum,
COUNT(*) AS Berichten,
SUM(CASE WHEN virusinfected>0 THEN 1 ELSE 0 END) AS Virus,
SUM(CASE WHEN (virusinfected=0 OR virusinfected IS NULL) AND isspam>0 THEN 1 ELSE 0 END) AS Ongewenst,
SUM(CASE WHEN (virusinfected=0 OR virusinfected IS NULL) AND (isspam=1) AND isrblspam>0 THEN 1 ELSE 0 END) AS RBL,
SUM(size) AS Grootte
FROM
maillog
WHERE
1=1
AND (1=1)
AND
date < '2017-04-01'
AND
date >= '2017-03-01'
AND
to_domain = 'domain1.nl'
OR
date < '2017-04-01'
AND
date >= '2017-03-01'
AND
to_domain = 'domain2.nl'
GROUP BY
Datum
ORDER BY
date
) t
GROUP BY Datum;
Thanks in advance.
What you want can be done with a small addition to your first SQL statement: add WITH ROLLUP after the GROUP BY clause:
GROUP BY Datum WITH ROLLUP
This runs more efficiently than the sub-query version. The sub-query can also work, but then you should remove the outer GROUP BY clause and not select Datum there, since you want overall totals rather than totals per date.
Still, you would then lose the details and get only the overall totals. You would have to use a UNION with your original query to get both levels of totals, and the WITH ROLLUP modifier does that job more efficiently.
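For comparison, here is a sketch of the union approach mentioned above, which returns the daily rows plus one overall total row. It runs in an in-memory SQLite database, which has no WITH ROLLUP, so this only shows the shape of the result; in MySQL the rollup row likewise carries NULL in the Datum column. The reduced two-column maillog table, the function name and the sample rows are assumptions.

```python
import sqlite3

def daily_rows_with_total(messages):
    """Return per-day counts and sizes followed by one overall total row."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite, which lacks WITH ROLLUP
    conn.execute("CREATE TABLE maillog (date TEXT, size INTEGER)")
    conn.executemany("INSERT INTO maillog VALUES (?, ?)", messages)
    result = conn.execute(
        """
        SELECT Datum, Berichten, Grootte
        FROM (
            SELECT date AS Datum, COUNT(*) AS Berichten, SUM(size) AS Grootte
            FROM maillog
            GROUP BY date
            UNION ALL
            SELECT NULL, COUNT(*), SUM(size)
            FROM maillog
        )
        ORDER BY Datum IS NULL, Datum  -- detail rows first, total row last
        """
    ).fetchall()
    conn.close()
    return result

# Hypothetical log: two messages on 1 March, one on 2 March.
log = [("2017-03-01", 100), ("2017-03-01", 50), ("2017-03-02", 70)]
print(daily_rows_with_total(log))
```

The union scans the table twice, which is the extra work that WITH ROLLUP avoids.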

Compute average sales per day in MySQL

In my database I have a table with two columns. The first column contains dates and the second is a count variable. I was wondering if it is possible to compute the average counts for each weekday based on the dates and counts.
In the following a small example:
Table:
Date Count
02/01/2005 100
02/02/2005 200
02/03/2005 300
... ...
Output:
Days Average
Monday 120.5
Tuesday 200.2
Wednesday 300.5
You could use a series of AVG calls on CASE expressions keyed on the day of the week (DAYOFWEEK returns 1 for Sunday through 7 for Saturday):
SELECT AVG(CASE DAYOFWEEK(`date`) WHEN 2 THEN `count` ELSE NULL END) AS Monday,
AVG(CASE DAYOFWEEK(`date`) WHEN 3 THEN `count` ELSE NULL END) AS Tuesday,
AVG(CASE DAYOFWEEK(`date`) WHEN 4 THEN `count` ELSE NULL END) AS Wednesday,
AVG(CASE DAYOFWEEK(`date`) WHEN 5 THEN `count` ELSE NULL END) AS Thursday,
AVG(CASE DAYOFWEEK(`date`) WHEN 6 THEN `count` ELSE NULL END) AS Friday
FROM mytable
EDIT:
Given the updated expected output in the edited post, it's much easier to do - just group by the dayname:
SELECT DAYNAME(`date`), AVG(`count`)
FROM mytable
WHERE DAYOFWEEK(`date`) BETWEEN 2 AND 6
GROUP BY DAYNAME(`date`)
@Mureinik's answer also pivots the data set. If you need the weekdays as rows, not columns (I'm not sure from your question), the query gets even easier (untested):
SELECT DAYNAME(`date`) AS day_of_week,
AVG(`count`) AS average
FROM yourtable
GROUP BY DAYOFWEEK(`date`), DAYNAME(`date`)
ORDER BY DAYOFWEEK(`date`)
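The row-per-weekday grouping can be tried out with a small sketch like the one below. It uses an in-memory SQLite database, which has no DAYNAME or DAYOFWEEK, so strftime('%w', ...) (0 for Sunday through 6 for Saturday) is swapped in and the names are mapped in Python. ISO-formatted dates, the function name and the sample counts are assumptions.

```python
import sqlite3

DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

def average_per_weekday(rows):
    """Average the count column per weekday, ordered Sunday to Saturday."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of MySQL
    conn.execute('CREATE TABLE mytable (date TEXT, "count" INTEGER)')
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    grouped = conn.execute(
        """
        SELECT CAST(strftime('%w', date) AS INTEGER) AS dow,
               AVG("count") AS average
        FROM mytable
        GROUP BY dow
        ORDER BY dow
        """
    ).fetchall()
    conn.close()
    # Translate the weekday number into a name, as DAYNAME would in MySQL.
    return [(DAY_NAMES[dow], average) for dow, average in grouped]

# Hypothetical data: two Mondays and one Tuesday in January 2005.
sales = [("2005-01-03", 100), ("2005-01-10", 141), ("2005-01-04", 200)]
print(average_per_weekday(sales))
```

Grouping on the weekday number rather than the name keeps the rows in calendar order instead of alphabetical order.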

Different conditions in one select statement

How can you select two columns and have each column test for its own condition and not the other's?
Let's say I have a select that counts every record in a table. In one column I want every record from this week, and in the second one I want all records since the beginning of the year.
I have two conditions, but each applies to a specific column:
WHERE date BETWEEN @Monday AND @SUNDAY /* Weekly */
WHERE date >= @JanuaryFirst /* Annual */
But I can't just put it like this, because I would only get this week's records in both columns. I thought I could use an IF condition, but I don't think I can simply say "If you are column A test for this, if not test for the second one".
Here is a version that doesn't yield multiple scans:
select vehicule,
weekly = SUM(CASE WHEN date BETWEEN @Monday AND @SUNDAY THEN 1 ELSE 0 END),
annual = SUM(CASE WHEN date >= @JanuaryFirst THEN 1 ELSE 0 END)
from dbo.tablename AS t
GROUP BY vehicule;
Or you could also try the slightly less verbose:
select vehicule,
weekly = COUNT(CASE WHEN date BETWEEN @Monday AND @SUNDAY THEN 1 END),
annual = COUNT(CASE WHEN date >= @JanuaryFirst THEN 1 END)
from dbo.tablename AS t
GROUP BY vehicule;
Use INNER SELECTS, like this:
select vehicule,
(select count(*) from tablename t1 where t1.vehicule = t.vehicule and date BETWEEN @Monday AND @SUNDAY) as 'Weekly',
(select count(*) from tablename t1 where t1.vehicule = t.vehicule and date >= @JanuaryFirst) as 'Annual'
from tablename t
If you want to avoid subqueries you can use:
select vehicule,
sum(case when date BETWEEN @Monday AND @SUNDAY then 1 else 0 end) as 'Weekly',
sum(case when date >= @JanuaryFirst then 1 else 0 end) as 'Annual'
from tablename
group by vehicule
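Here is a minimal sketch of the single-scan version, run against an in-memory SQLite database with named parameters playing the role of the date variables. SQLite as the engine, ISO date strings, the function name weekly_and_annual and the sample trips are assumptions for illustration.

```python
import sqlite3

def weekly_and_annual(rows, monday, sunday, january_first):
    """Count this week's and this year's records per vehicle in one pass."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of SQL Server
    conn.execute("CREATE TABLE tablename (vehicule TEXT, date TEXT)")
    conn.executemany("INSERT INTO tablename VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT vehicule,
               COUNT(CASE WHEN date BETWEEN :monday AND :sunday THEN 1 END) AS weekly,
               COUNT(CASE WHEN date >= :january_first THEN 1 END) AS annual
        FROM tablename
        GROUP BY vehicule
        ORDER BY vehicule
        """,
        {"monday": monday, "sunday": sunday, "january_first": january_first},
    ).fetchall()
    conn.close()
    return result

# Hypothetical records; the 2022 row falls outside both conditions.
trips = [
    ("truck", "2022-12-30"),
    ("truck", "2023-01-05"),
    ("truck", "2023-06-13"),
    ("van", "2023-06-14"),
]
counts = weekly_and_annual(trips, "2023-06-12", "2023-06-18", "2023-01-01")
print(counts)
```

Because each condition lives inside its own aggregate instead of the WHERE clause, neither column restricts the rows seen by the other.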

Select single column multiple times at different points in time

I have a simple table with 4 columns - ID, Date, Category, Value.
I have 5 distinct categories that have certain values daily. I would like to select value column at different points in time and display result along with the appropriate category.
This is the code that I'm using:
select
Category,
case when date=DATE_SUB(CURDATE(),INTERVAL 1 DAY) then value else 0 end as Today,
case when date=DATE_SUB(CURDATE(),INTERVAL 1 MONTH) then value else 0 end as "Month Ago",
case when date=DATE_SUB(CURDATE(),INTERVAL 1 Year) then value else 0 end as "Year Ago"
from table
group by category
It's not working. I'm using mysql database but will run the query in SSRS through an ODBC connection.
The problem with your query is that, as written, the case statements need to be embedded in aggregation functions:
select Category,
avg(case when date=DATE_SUB(CURDATE(),INTERVAL 1 DAY) then value end) as Today,
avg(case when date=DATE_SUB(CURDATE(),INTERVAL 1 MONTH) then value end) as "Month Ago",
avg(case when date=DATE_SUB(CURDATE(),INTERVAL 1 Year) then value end) as "Year Ago"
from table
group by category
I chose "avg" since this seems reasonable if there are multiple values and the "value" column is numeric. You might prefer min() or max() to get other values.
Also, I removed the "else 0" clause, so you will see NULL rather than 0 when there is no value.
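Dropping "else 0" matters for more than display when avg is used, which the sketch below tries to show. It runs both forms side by side in an in-memory SQLite database; the table name readings, the fixed target date (instead of CURDATE()), the function name and the sample rows are assumptions.

```python
import sqlite3

def average_on_date(rows, target):
    """Compare AVG(CASE ...) with and without an ELSE 0 branch."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of MySQL
    conn.execute("CREATE TABLE readings (category TEXT, date TEXT, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT category,
               AVG(CASE WHEN date = :target THEN value END) AS without_else,
               AVG(CASE WHEN date = :target THEN value ELSE 0 END) AS with_else_zero
        FROM readings
        GROUP BY category
        ORDER BY category
        """,
        {"target": target},
    ).fetchall()
    conn.close()
    return result

# Hypothetical data: category A has a row on the target date, B does not.
data = [
    ("A", "2024-01-01", 10),
    ("A", "2024-01-02", 30),
    ("B", "2024-01-02", 5),
]
comparison = average_on_date(data, "2024-01-01")
print(comparison)
```

Without ELSE, non-matching rows become NULL and are ignored by AVG. With ELSE 0 they are averaged in as zeros, which drags the result down and hides the difference between "no value" and "value of 0".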
This type of query is best done with three separate queries:
SELECT 'Today' AS `When`, Category, value FROM `table`
WHERE date = DATE_SUB(CURDATE(),INTERVAL 1 DAY)
UNION ALL
SELECT 'Month Ago' AS `When`, Category, value FROM `table`
WHERE date = DATE_SUB(CURDATE(),INTERVAL 1 MONTH)
UNION ALL
SELECT 'Year Ago' AS `When`, Category, value FROM `table`
WHERE date = DATE_SUB(CURDATE(),INTERVAL 1 YEAR)
Try something like this:
SELECT
t1.Category, t1.Value, t2.Value, t3.Value
FROM YourTable t1
LEFT OUTER JOIN YourTable t2 ON t1.Category=t2.Category
AND t2.Date=DATE_SUB(CURDATE(),INTERVAL 1 Month)
LEFT OUTER JOIN YourTable t3 ON t1.Category=t3.Category
AND t3.Date=DATE_SUB(CURDATE(),INTERVAL 1 Year)
WHERE t1.Date=DATE_SUB(CURDATE(),INTERVAL 1 DAY)
This assumes that you have only one row per interval. If you have multiple rows per interval, you need to decide which value you want to show for that interval (min, max, etc.) and then aggregate the rows. If this is the case, the OP should provide some sample data and expected query output so testing is possible.