Counting multiple columns based on datetable - MySQL

With a table of dates, I'm trying to count different columns by week.
I managed to do it with one column, and it works fine. But when I count multiple columns I get either wrong or duplicated results. I think it's because of the join.
This works as expected for one column:
SELECT
DATE_FORMAT(thedate, '%u') as week
,COUNT(t.completed_date) as completed
FROM datetable
LEFT JOIN projects t ON t.completed_date = thedate
WHERE thedate BETWEEN YEAR(NOW()) AND NOW()
GROUP BY YEARWEEK(thedate,7)
By adding ,COUNT(t.sales_date) as sales to the select, I get duplicated counts for completed and sales.
Based on this sample (projects)
| id | completed_date | sales_date |
| 1 | NULL | NULL |
| 2 | NULL | 2013-08-26 |
| 3 | NULL | 2013-08-28 |
| 4 | 2013-09-06 | NULL |
I'm looking for
| week | completed | sales |
| 34 | 0 | 0 |
| 35 | 0 | 2 |
| 36 | 1 | 0 |
I'm using a datetable because I need every date to appear, with 0 where there are no matching dates.
I think I could solve it with subqueries, but there are 12 other date fields I need to count in this query as well (excluded from the sample).
Is there a better way of solving this than by using lots of subqueries? My SQL is a bit rusty.

One way is to use subqueries that group each value by week, then join them all together.
SELECT d.week, completed, sales
FROM (SELECT YEARWEEK(thedate) week
      FROM datetable
      -- MAKEDATE(YEAR(NOW()), 1) is January 1 of the current year
      WHERE thedate BETWEEN MAKEDATE(YEAR(NOW()), 1) AND NOW()
      GROUP BY week) d
LEFT JOIN (SELECT YEARWEEK(completed_date) week, COUNT(*) completed
           FROM projects
           WHERE completed_date BETWEEN MAKEDATE(YEAR(NOW()), 1) AND NOW()
           GROUP BY week) c
ON c.week = d.week
LEFT JOIN (SELECT YEARWEEK(sales_date) week, COUNT(*) sales
           FROM projects
           WHERE sales_date BETWEEN MAKEDATE(YEAR(NOW()), 1) AND NOW()
           GROUP BY week) s
ON s.week = d.week
This way is more easily extended to additional columns:
SELECT DATE_FORMAT(thedate, '%u') AS week,
       IFNULL(SUM(completed_date = thedate), 0) AS completed,
       IFNULL(SUM(sales_date = thedate), 0) AS sales
FROM datetable
LEFT JOIN projects
ON thedate IN (completed_date, sales_date)
-- MAKEDATE(YEAR(NOW()), 1) is January 1 of the current year
WHERE thedate BETWEEN MAKEDATE(YEAR(NOW()), 1) AND NOW()
GROUP BY week
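As a rough check of the single-join approach against the sample rows, here is a self-contained Python sketch that uses SQLite as a stand-in for MySQL. The precomputed week column on datetable, the fixed three-week date range and COALESCE in place of IFNULL are assumptions made for the sketch (SQLite has no DATE_FORMAT or YEARWEEK), so treat it as an illustration rather than a drop-in query.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE datetable (thedate TEXT PRIMARY KEY, week INTEGER);
CREATE TABLE projects (id INTEGER, completed_date TEXT, sales_date TEXT);
INSERT INTO projects VALUES
  (1, NULL, NULL),
  (2, NULL, '2013-08-26'),
  (3, NULL, '2013-08-28'),
  (4, '2013-09-06', NULL);
""")

# Assumption: the date table stores a precomputed ISO week number per day.
d = date(2013, 8, 19)
while d <= date(2013, 9, 8):
    conn.execute("INSERT INTO datetable VALUES (?, ?)",
                 (d.isoformat(), d.isocalendar()[1]))
    d += timedelta(days=1)

# One join on "any of the date columns", then one conditional SUM per column.
# A comparison with NULL yields NULL, which SUM ignores.
rows = conn.execute("""
SELECT d.week,
       COALESCE(SUM(p.completed_date = d.thedate), 0) AS completed,
       COALESCE(SUM(p.sales_date = d.thedate), 0) AS sales
FROM datetable d
LEFT JOIN projects p
  ON d.thedate IN (p.completed_date, p.sales_date)
GROUP BY d.week
ORDER BY d.week
""").fetchall()

for row in rows:
    print(row)
```

Adding another date column means adding it to the IN list and adding one more SUM line, which is why this form scales better than one subquery per column.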

Related

MySQL sum where inner join is missing from right or left table

I have a turnover table on the one side that has:
| Storeid | Turnover | myDate |
| 1 | 1000 | 2020-01-01 |
| 1 | 200 | 2020-01-02 |
| 1 | 4000 | 2020-01-03 |
| 1 | 1000 | 2020-01-05 |
On the other side I have a table with the number of transactions:
| Storeid | Transactions | myDate |
| 1 | 20 | 2020-01-01 |
| 1 | 40 | 2020-01-03 |
| 1 | 20 | 2020-01-04 |
| 1 | 60 | 2020-01-05 |
I need to work out the sum of the turnover and the sum of the transactions for a given date range. However, dates may be missing from either table. If I sum them individually I get the correct answer for each, but with any sort of inner or left join I get incomplete answers (as shown below):
select sum(Turnover), sum(transactions) from TurnoverTable
left join TransactionTable on TurnoverTable.storeid = TransactionTable.storeid and
TurnoverTable.myDate = TransactionTable.myDate where TurnoverTable.myDate >= '2020-01-01'
This will produce a sum for Turnover of 6200 and for Transactions of 120 (20 is missing from the 2020-01-04 date as this date is not available in the Turnover table, therefore fails in the join).
Short of running 2 select sum queries, is there a way to run these sums?
Much appreciated.
You have dates missing in both tables, which rules out a left join solution. Conceptually, you want a full join. MySQL does not support that syntax, but you can use union all; the rest is just aggregation:
select sum(turnover) turnover, sum(transactions) transactions
from (
    select mydate, turnover, 0 transactions from TurnoverTable
    union all
    select mydate, 0, transactions from TransactionTable
) t
where mydate >= '2020-01-01'
For this kind of statistic you should not use a JOIN, because duplicated rows can give you wrong results, especially when many tables have to be joined, as is common in practice.
So I recommend using UNION ALL as follows, with the date filter inside each branch of the UNION:
SELECT
    Storeid,
    SUM(Turnover),
    SUM(Transactions)
FROM
    (SELECT
        Storeid,
        myDate,
        Turnover,
        0 AS Transactions
    FROM turnovers
    WHERE myDate BETWEEN '2020-01-01' AND '2020-08-21'
    UNION ALL
    SELECT
        Storeid,
        myDate,
        0 AS Turnover,
        Transactions
    FROM Transactions
    WHERE myDate BETWEEN '2020-01-01' AND '2020-08-21') AS t
GROUP BY Storeid;
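To see the difference on the sample data, here is a small Python sketch that runs both the left join and the UNION ALL version. It uses SQLite as a stand-in for MySQL, and the table names TurnoverTable and TransactionTable are taken from the question's query; the integer column types are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TurnoverTable (Storeid INTEGER, Turnover INTEGER, myDate TEXT);
CREATE TABLE TransactionTable (Storeid INTEGER, Transactions INTEGER, myDate TEXT);
INSERT INTO TurnoverTable VALUES
  (1, 1000, '2020-01-01'), (1, 200, '2020-01-02'),
  (1, 4000, '2020-01-03'), (1, 1000, '2020-01-05');
INSERT INTO TransactionTable VALUES
  (1, 20, '2020-01-01'), (1, 40, '2020-01-03'),
  (1, 20, '2020-01-04'), (1, 60, '2020-01-05');
""")

# Left join: the 2020-01-04 transactions have no turnover row and are lost.
left_join_totals = conn.execute("""
SELECT SUM(t.Turnover), SUM(x.Transactions)
FROM TurnoverTable t
LEFT JOIN TransactionTable x
  ON x.Storeid = t.Storeid AND x.myDate = t.myDate
WHERE t.myDate >= '2020-01-01'
""").fetchone()

# UNION ALL: stack both tables into one shape, padding the missing measure
# with 0, then aggregate once.
union_totals = conn.execute("""
SELECT Storeid, SUM(Turnover), SUM(Transactions)
FROM (
  SELECT Storeid, myDate, Turnover, 0 AS Transactions FROM TurnoverTable
  UNION ALL
  SELECT Storeid, myDate, 0 AS Turnover, Transactions FROM TransactionTable
) AS u
WHERE myDate >= '2020-01-01'
GROUP BY Storeid
""").fetchall()

print(left_join_totals)
print(union_totals)
```

UNION ALL (rather than plain UNION) matters here: plain UNION removes duplicate rows, which would silently drop two identical records on the same date.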

Conditional subquery in where clause

I'm trying to create a query with conditional logic where I only calculate revenue for the most recent records by each month using a datetime column (start_date), but only if there are multiple records in that month from the same account_id.
Here's a basic example of the schema after I join two tables (full schema in sqlfiddle link).
| account_id | plan_id | start_date | plan_interval | price |
|------------|---------|----------------------|---------------|-------|
| 1 | 1 | 2018-01-03T14:52:13Z | month | 39 |
| 1 | 3 | 2018-02-07T11:10:17Z | year | 999 |
| 1 | 2 | 2018-02-07T11:11:17Z | month | 99 |
In the above example, I would only like to include rows 1 and 3 in my output, as it's the one record from account_id 1 in January and the most recent of two records for account_id 1 in February.
SELECT
MONTH(start_date) AS month,
SUM(CASE WHEN plan_interval = 'month'
THEN price * .01
ELSE (price * .01)/12 END) AS mrr
FROM subscriptions
JOIN plans
ON plans.id = subscriptions.plan_id
WHERE Year(start_date) = 2018 AND
CASE WHEN (account_id = account_id
AND MONTH(start_date) = MONTH(start_date))
THEN (SELECT MAX(start_date) FROM subscriptions)
ELSE (SELECT start_date FROM subscriptions)
END
GROUP BY month
ORDER BY month ASC;
The case statement in the subquery above does not seem to work in doing this. It returns the data without filtering out records when the first condition is met.
Here is an example: sqlfiddle
This query returns the rows that you are asking for in the question:
SELECT s.*, p.plan_interval, p.price,
(CASE WHEN p.plan_interval = 'month'
THEN p.price * 0.01
ELSE (p.price * 0.01)/12
END) AS mrr
FROM subscriptions s JOIN
plans p
ON p.id = s.plan_id
WHERE YEAR(s.start_date) = 2018 AND
s.start_date = (SELECT MAX(s2.start_date)
FROM subscriptions s2
WHERE s2.account_id = s.account_id AND
EXTRACT(YEAR_MONTH FROM s2.start_date) = EXTRACT(YEAR_MONTH FROM s.start_date)
)
ORDER BY s.start_date ASC;
This uses a correlated subquery to get the most recent record for each account in each month.
You can then aggregate this however you wish.
Notes about the query:
Table aliases make the query easier to write and to read.
The subquery uses the handy YEAR_MONTH option of EXTRACT(), so it handles both years and months.
For numeric constants between -1 and 1, I always prepend a 0, so 0.12 rather than .12. I find that this makes the decimal point more obvious.
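For the "aggregate this however you wish" step, here is one possible sketch in Python with SQLite standing in for MySQL. The prices are assumed to be stored in cents (as the question's price * .01 suggests), the account 2 row is taken from the fiddle output shown in another answer, and substr(start_date, 1, 7) is a stand-in for EXTRACT(YEAR_MONTH FROM ...), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plans (id INTEGER, plan_interval TEXT, price INTEGER);
CREATE TABLE subscriptions (account_id INTEGER, plan_id INTEGER, start_date TEXT);
INSERT INTO plans VALUES (1, 'month', 3900), (2, 'month', 9900), (3, 'year', 99900);
INSERT INTO subscriptions VALUES
  (1, 1, '2018-01-03 14:52:13'),
  (1, 3, '2018-02-07 11:10:17'),
  (1, 2, '2018-02-07 11:11:17'),
  (2, 3, '2018-01-03 17:40:05');
""")

# Keep only the latest subscription per account and month, then sum the
# monthly revenue (yearly plans are spread over 12 months).
rows = conn.execute("""
SELECT substr(s.start_date, 1, 7) AS month,
       SUM(CASE WHEN p.plan_interval = 'month'
                THEN p.price / 100.0
                ELSE p.price / 100.0 / 12
           END) AS mrr
FROM subscriptions s
JOIN plans p ON p.id = s.plan_id
WHERE s.start_date = (SELECT MAX(s2.start_date)
                      FROM subscriptions s2
                      WHERE s2.account_id = s.account_id
                        AND substr(s2.start_date, 1, 7) = substr(s.start_date, 1, 7))
GROUP BY month
ORDER BY month
""").fetchall()

for month, mrr in rows:
    print(month, mrr)
```

In February only the later of account 1's two records (the monthly plan) is counted; the yearly plan started a minute earlier is filtered out by the subquery.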
First work out the last entry by account and month (subquery a), join it to subscriptions to get the plan_id, and then join plans to get the plan:
SELECT S.ACCOUNT_id,s.plan_id,s.start_date,p.Price,p.plan_interval,
case when p.plan_interval = 'month' then p.price * .01 else p.price * .01 /12 end as rev
from subscriptions s
join (select s.account_id,month(s.start_date), max(s.start_date) start_date
from subscriptions s
group by account_id,month(start_date)) a on a.account_id = s.account_id and a.start_date = s.start_date
join plans p on p.id = s.plan_id;
+------------+---------+---------------------+----------+---------------+--------------+
| ACCOUNT_id | plan_id | start_date          | Price    | plan_interval | rev          |
+------------+---------+---------------------+----------+---------------+--------------+
|          1 |       1 | 2018-01-03 14:52:13 |  3900.00 | month         |  39.00000000 |
|          1 |       2 | 2018-02-07 11:11:17 |  9900.00 | month         |  99.00000000 |
|          2 |       3 | 2018-01-03 17:40:05 | 99900.00 | year          |  83.25000000 |
+------------+---------+---------------------+----------+---------------+--------------+
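In your case, the WHERE clause does not filter anything. Both comparisons in the CASE condition compare a column with itself, so the first branch is taken for every row, and the value it returns (the latest start_date in the whole table) is non-zero, which WHERE treats as true: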
CASE WHEN (account_id = account_id
AND MONTH(start_date) = MONTH(start_date))
THEN (SELECT MAX(start_date) FROM subscriptions)
ELSE (SELECT start_date FROM subscriptions)
END
Another approach to what you are building would involve using a subquery to order the rows the way you want within the groups.
SELECT
account_id,
month,
CASE WHEN plan_interval = 'month'
THEN price * .01
ELSE (price * .01)/12
END AS mrr
FROM (
SELECT *, MONTH(start_date) AS month
FROM subscriptions
INNER JOIN plans ON plans.id = subscriptions.plan_id
ORDER BY account_id, start_date DESC
) sq
GROUP BY account_id, month
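This relies on MySQL picking, for each group, the first row returned by the subquery when non-aggregated columns are selected with GROUP BY. That behavior is not guaranteed: MySQL documents the chosen values as nondeterministic, the optimizer may ignore an ORDER BY inside a derived table, and the query is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5).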
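If window functions are available (MySQL 8.0 and later), ROW_NUMBER() is a deterministic way to pick the first row per group. This is a different technique from the GROUP BY trick above, sketched here in Python with SQLite (3.25 or newer) standing in for MySQL; substr(start_date, 1, 7) is an assumed stand-in for a year-month expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscriptions (account_id INTEGER, plan_id INTEGER, start_date TEXT);
INSERT INTO subscriptions VALUES
  (1, 1, '2018-01-03 14:52:13'),
  (1, 3, '2018-02-07 11:10:17'),
  (1, 2, '2018-02-07 11:11:17');
""")

# Number the rows of each (account, month) group from newest to oldest,
# then keep only the newest one (rn = 1).
rows = conn.execute("""
SELECT account_id, month, plan_id
FROM (
  SELECT s.account_id,
         s.plan_id,
         substr(s.start_date, 1, 7) AS month,
         ROW_NUMBER() OVER (
           PARTITION BY s.account_id, substr(s.start_date, 1, 7)
           ORDER BY s.start_date DESC
         ) AS rn
  FROM subscriptions s
) ranked
WHERE rn = 1
ORDER BY account_id, month
""").fetchall()

for row in rows:
    print(row)
```

The ordering that decides which row wins is stated explicitly in the OVER clause, so the result does not depend on how the server happens to read the rows.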

How to group by month and return zero if no value for certain month?

This is my mysql income table.
+----+------------------+---------------------------+------------+---------+
| id | title | description | date | amount |
+----+------------------+---------------------------+------------+---------+
| 1 | Vehicle sales up | From new sale up | 2016-09-09 | 9999.99 |
| 2 | Jem 2 Sales | From rathnapura store | 2016-05-15 | 9545.25 |
| 3 | Jem 2 Sales 2 | From rathnapura store | 2016-05-15 | 9545.25 |
| 4 | Jem 2 Sales 2 | From rathnapura store 234 | 2016-05-15 | 9545.25 |
+----+------------------+---------------------------+------------+---------+
The field 'date' is a standard SQL date. I ran this query to get the sum of incomes by month and return zero if there is no income for a certain month. I want the zeros because I want to display this data in a chart.
This is the query.
SELECT MONTHNAME(`date`) AS mName, MONTH(`date`) AS mOrder, ifnull(sum(amount),0) AS total_num FROM income GROUP BY mOrder ORDER BY mOrder DESC
But I only get output like the following, with no zeros for the months that have no values.
+-----------+--------+-----------+
| mName | mOrder | total_num |
+-----------+--------+-----------+
| September | 9 | 9999.99 |
| May | 5 | 28635.75 |
+-----------+--------+-----------+
I want the other months in the table above as well, with total_num as zero. How can I do this? There is a similar question, but it has no working answer:
Group by month and return 0 if data not found
Please help me solve this issue. The language I use for this application is Node.js :)
Have a table of all the months and then left join it to your table, matching on month and year rather than on the exact date:
SELECT MONTHNAME(m.month) AS mName,
       MONTH(m.month) AS mOrder,
       IFNULL(SUM(i.amount), 0) AS total_num
FROM months m
LEFT JOIN income i
ON MONTH(i.`date`) = MONTH(m.month)
AND YEAR(i.`date`) = YEAR(m.month)
GROUP BY mOrder
ORDER BY mOrder DESC
If you don't want to create a months table, you can use this derived table in its place:
(select STR_TO_DATE('01/01/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/02/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/03/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/04/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/05/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/06/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/07/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/08/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/09/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/10/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/11/2016', '%d/%m/%Y') as month union
select STR_TO_DATE('01/12/2016', '%d/%m/%Y') as month)
You should create a CALENDAR table, with the precision you need, in this case months.
+-----------+
| Month |
+-----------+
| January |
| February |
.......
and join on it.
Maybe this is not the best way to do it, but it will solve your problem. As a quick solution:
SELECT 'January' AS mName, 1 AS mOrder, COALESCE(SUM(amount),0) AS total_num
FROM income i
WHERE month(i.date) = 1
UNION
SELECT 'February' AS mName, 2 AS mOrder, COALESCE(SUM(amount),0) AS total_num
FROM income i
WHERE month(i.date) = 2
UNION
...and go on
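The calendar-table idea can also be tried without creating a table, by generating the month numbers on the fly. The sketch below does that with a recursive common table expression (available in MySQL 8.0 and later) and runs it from Python against SQLite as a stand-in; strftime('%m', ...) replaces MONTH() and the description column is left out, both assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE income (id INTEGER, title TEXT, date TEXT, amount REAL);
INSERT INTO income VALUES
  (1, 'Vehicle sales up', '2016-09-09', 9999.99),
  (2, 'Jem 2 Sales',      '2016-05-15', 9545.25),
  (3, 'Jem 2 Sales 2',    '2016-05-15', 9545.25),
  (4, 'Jem 2 Sales 2',    '2016-05-15', 9545.25);
""")

# Generate month numbers 1..12, then left join the income rows so that
# months without income still produce a row with a total of 0.
rows = conn.execute("""
WITH RECURSIVE months(m) AS (
  SELECT 1
  UNION ALL
  SELECT m + 1 FROM months WHERE m < 12
)
SELECT months.m AS mOrder,
       COALESCE(SUM(i.amount), 0) AS total_num
FROM months
LEFT JOIN income i
  ON CAST(strftime('%m', i.date) AS INTEGER) = months.m
GROUP BY months.m
ORDER BY months.m DESC
""").fetchall()

totals = dict(rows)
for month, total in rows:
    print(month, total)
```

The key point is that the month list is on the left side of the LEFT JOIN, so every month survives whether or not income has matching rows.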

How merge two select with different WHERE and special conditions

I have table something like this:
date|status|value
date is the date,
status is 1 for pending, 2 for confirmed,
and value is the value of the order.
I want to get 3 columns:
date|#status confirmed|#status pending+confirmed
example of data:
+------------+-----------------+-----------------+
| date | status | value |
+------------+-----------------+-----------------+
| 2015-11-17 | 1 | 89|
| 2015-11-16 | 1 | 6 |
| 2015-11-16 | 2 | 16 |
| 2015-11-16 | 2 | 26 |
| 2015-11-15 | 2 | 26 |
| 2015-11-14 | 2 | 24 |
+------------+-----------------+-----------------+
example of what I want:
+------------+-----------------+-----------------+
| date | confirmed |confirmed+pending|
+------------+-----------------+-----------------+
| 2015-11-17 | 0 | 1 |
| 2015-11-16 | 2 | 3 |
| 2015-11-15 | 1 | 1 |
| 2015-11-14 | 1 | 1 |
+------------+-----------------+-----------------+
I am trying to do:
SELECT array1.DATE
,array1.confirmed
,array2.total
FROM (
SELECT DATE (DATE) AS DATE
,count(value) AS confirmed
FROM Orders
WHERE STATUS = '2'
GROUP BY DATE (DATE) DESC limit 5
) AS array1
INNER JOIN (
SELECT DATE (DATE) AS DATE
,count(value) AS total
FROM Orders
GROUP BY DATE (DATE) DESC limit 5
) AS array2
But I get 4 results per date with repeated confirmed value and different total transactions.
If I run them separately, I get the correct information from both.
This lists only the count of confirmed orders for the last 5 days:
SELECT DATE (DATE) AS DATE
,count(valor) AS confirmed
FROM Orders
WHERE STATUS = '2'
GROUP BY DATE (DATE) DESC limit 5
This lists the count of all orders for the last 5 days:
SELECT DATE (DATE) AS DATE
,count(valor) AS total
FROM Orders
GROUP BY DATE (DATE) DESC limit 5
I observed at least one big problem:
Sometimes we will have one day with a lot of not confirmed orders and zero confirmed, so probably inner join will fail.
You can use CASE WHEN to get the expected output you have given.
SELECT `date`,
(SUM(CASE WHEN `status`=2 THEN 1 ELSE 0 END)) AS Confirmed,
(SUM(CASE WHEN `status`=1 OR `status`=2 THEN 1 ELSE 0 END)) AS Confirmed_Pending
FROM
table_name
GROUP BY DATE(`date`) DESC
Hope this helps.
You are missing an ON clause in your INNER JOIN. Or, since in your case the column you join on is the same on both sides, you can use USING:
SELECT array1.DATE
,array1.confirmed
,array2.total
FROM (
SELECT DATE (DATE) AS DATE
,count(value) AS confirmed
FROM Orders
WHERE STATUS = '2'
GROUP BY DATE (DATE) DESC limit 5
) AS array1
INNER JOIN (
SELECT DATE (DATE) AS DATE
,count(value) AS total
FROM Orders
GROUP BY DATE (DATE) DESC limit 5
) AS array2
USING (DATE)
An easier approach could be to use a case expression to evaluate whether the status is something you'd like to count, and apply the count function to that:
SELECT DATE (`date`) AS `date`,
COUNT(CASE status WHEN 2 THEN 1 END) AS `confirmed`,
COUNT(CASE WHEN status IN (1, 2) THEN 1 END) AS `pending and confirmed`
FROM orders
GROUP BY DATE (`date`) DESC
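To check the conditional count against the sample data, here is a short Python sketch with SQLite standing in for MySQL. The column alias pending_and_confirmed and the explicit ORDER BY ... DESC (instead of MySQL's older GROUP BY ... DESC form) are choices made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (date TEXT, status INTEGER, value INTEGER);
INSERT INTO orders VALUES
  ('2015-11-17', 1, 89), ('2015-11-16', 1, 6),  ('2015-11-16', 2, 16),
  ('2015-11-16', 2, 26), ('2015-11-15', 2, 26), ('2015-11-14', 2, 24);
""")

# COUNT ignores NULLs, and a CASE without ELSE yields NULL when no branch
# matches, so each COUNT only sees the rows with the wanted status.
rows = conn.execute("""
SELECT date,
       COUNT(CASE WHEN status = 2 THEN 1 END) AS confirmed,
       COUNT(CASE WHEN status IN (1, 2) THEN 1 END) AS pending_and_confirmed
FROM orders
GROUP BY date
ORDER BY date DESC
LIMIT 5
""").fetchall()

for row in rows:
    print(row)
```

Because everything comes from one pass over the table, a day with pending orders but no confirmed ones still shows up, with 0 in the confirmed column.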

MySQL grouping by date range with multiple joins

I currently have quite a messy query, which joins data from multiple tables involving two subqueries. I now have a requirement to group this data by DAY(), WEEK(), MONTH(), and QUARTER().
I have three tables: days, qos and employees. An employee is self-explanatory, a day is a summary of an employee's performance on a given day, and qos is a random quality inspection, which can be performed many times a day.
At the moment, I am selecting all employees, and LEFT JOINing day and qos, which works well. However, now, I need to group the data in order to breakdown a team or individual's performance over a date range.
Taking this data:
Employee
id | name
------------------
1 | Bob Smith
Day
id | employee_id | day_date | calls_taken
---------------------------------------------
1 | 1 | 2011-03-01 | 41
2 | 1 | 2011-03-02 | 24
3 | 1 | 2011-04-01 | 35
Qos
id | employee_id | qos_date | score
----------------------------------------
1 | 1 | 2011-03-03 | 85
2 | 1 | 2011-03-03 | 95
3 | 1 | 2011-04-01 | 91
If I were to start by grouping by DAY(), I would need to see the following results:
Day__date | Day__Employee__id | Day__calls | Day__qos_score
------------------------------------------------------------
2011-03-01 | 1 | 41 | NULL
2011-03-02 | 1 | 24 | NULL
2011-03-03 | 1 | NULL | 90
2011-04-01 | 1 | 35 | 91
As you see, Day__calls should be SUM(calls_taken) and Day__qos_score is AVG(score). I've tried using a similar method as above, but as the date isn't known until one of the tables has been joined, it's only displaying a record where there's a day saved.
Is there any way of doing this, or am I going about things the wrong way?
Edit: As requested, here's what I've come up with so far. However, it only shows dates where there's a day.
SELECT COALESCE(`day`.day_date, qos.qos_date) AS Day__date,
employee.id AS Day__Employee__id,
`day`.calls_taken AS Day__Day__calls,
qos.score AS Day__Qos__score
FROM faults_employees `employee`
LEFT JOIN (SELECT `day`.employee_id AS employee_id,
SUM(`day`.calls_taken) AS `calls_in`,
FROM faults_days AS `day`
WHERE employee.id = 7
GROUP BY (`day`.day_date)
) AS `day`
ON `day`.employee_id = `employee`.id
LEFT JOIN (SELECT `qos`.employee_id AS employee_id,
AVG(qos.score) AS `score`
FROM faults_qos qos
WHERE employee.id = 7
GROUP BY (qos.qos_date)
) AS `qos`
ON `qos`.employee_id = `employee`.id AND `qos`.qos_date = `day`.day_date
WHERE employee.id = 7
GROUP BY Day__date
ORDER BY `day`.day_date ASC
The solution I've come up with looks like this:
SELECT
`date`,
`employee_id`,
SUM(`union`.`calls_taken`) AS `calls_taken`,
AVG(`union`.`score`) AS `score`
FROM ( -- select from union table
(SELECT -- first select all calls taken, leaving qos_score null
`day`.`day_date` AS `date`,
`day`.`employee_id`,
`day`.`calls_taken`,
NULL AS `score`
FROM `employee`
LEFT JOIN
`day`
ON `day`.`employee_id` = `employee`.`id`
)
UNION ALL -- union both tables, keeping duplicate rows so SUM and AVG see every record
(
SELECT -- now select qos score, leaving calls taken null
`qos`.`qos_date` AS `date`,
`qos`.`employee_id`,
NULL AS `calls_taken`,
`qos`.`score`
FROM `employee`
LEFT JOIN
`qos`
ON `qos`.`employee_id` = `employee`.`id`
)
) `union`
GROUP BY `union`.`date`, `union`.`employee_id` -- group union table by date and employee
For the UNION to work, both branches need the same columns, so we select NULL as score in the day branch and NULL as calls_taken in the qos branch. If we don't, both calls_taken and score would be selected into the same column by the UNION statement.
After this, I selected the required fields with the aggregation functions SUM() and AVG() from the union'd table, grouping by the date field in the union table.
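Here is a small Python sketch of the same union-then-aggregate idea on the sample data, with SQLite standing in for MySQL. It reads the day and qos tables directly (the employee join is left out for brevity), and the summarize helper with its prefix_len parameter is a hypothetical convenience: it groups by a prefix of the ISO date string, so 10 characters give days and 7 give months.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE day (id INTEGER, employee_id INTEGER, day_date TEXT, calls_taken INTEGER);
CREATE TABLE qos (id INTEGER, employee_id INTEGER, qos_date TEXT, score INTEGER);
INSERT INTO day VALUES
  (1, 1, '2011-03-01', 41), (2, 1, '2011-03-02', 24), (3, 1, '2011-04-01', 35);
INSERT INTO qos VALUES
  (1, 1, '2011-03-03', 85), (2, 1, '2011-03-03', 95), (3, 1, '2011-04-01', 91);
""")

def summarize(prefix_len):
    """Group by the first prefix_len characters of the date: 10 = day, 7 = month."""
    return conn.execute("""
        SELECT substr(u.date, 1, ?) AS bucket,
               u.employee_id,
               SUM(u.calls_taken) AS calls_taken,
               AVG(u.score) AS score
        FROM (
            SELECT day_date AS date, employee_id, calls_taken, NULL AS score
            FROM day
            UNION ALL
            SELECT qos_date AS date, employee_id, NULL AS calls_taken, score
            FROM qos
        ) AS u
        GROUP BY bucket, u.employee_id
        ORDER BY bucket
    """, (prefix_len,)).fetchall()

by_day = summarize(10)
by_month = summarize(7)
print(by_day)
print(by_month)
```

SUM and AVG both skip NULLs, so a date that only has qos rows gets NULL for calls_taken, and a date that only has day rows gets NULL for score. In MySQL the bucket expression could be swapped for WEEK(), MONTH() or QUARTER() on the date column.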