MySQL query help required - mysql

I am using a MySQL database. I have an employee leave table that holds information about employee leave.
Please find the table details below:
CREATE TABLE IF NOT EXISTS `APPLY_LEAVE` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`EMP_ID` varchar(100) NOT NULL,
`TYPE_OF_LEAVE` varchar(100) NOT NULL,
`DAYS` varchar(100) NOT NULL,
`REASON` varchar(200) NOT NULL,
`START_DATE` date NOT NULL,
`END_DATE` date NOT NULL,
`STATUS` tinyint(2) NOT NULL,
`CREATED_ON` date NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1;
--
-- Dumping data for table `APPLY_LEAVE`
--
INSERT INTO `APPLY_LEAVE` (`ID`, `EMP_ID`, `TYPE_OF_LEAVE`, `DAYS`, `REASON`, `START_DATE`, `END_DATE`, `STATUS`, `CREATED_ON`) VALUES
(1, 'EMP001', 'SL', '2', 'Sick Leave', '2018-11-30', '2018-12-01', 1,'2018-11-06'),
(2, 'EMP002', 'EL', '1', 'Personal', '2018-12-13', '2018-12-13', 1,'2018-11-09'),
(3, 'EMP003', 'CL', '2', 'Casual Leave due to Birthday', '2018-08-31', '2018-09-01', 1,'2018-08-20'),
(4, 'EMP001', 'CL', '3', 'Casual Leave', '2018-12-04', '2018-12-06', 1,'2018-11-27'),
(5, 'EMP002', 'SL', '4', 'Sick Leave', '2018-09-10', '2018-09-13', 1,'2018-10-04'),
(6, 'EMP003', 'SL', '3', 'Sick Leave', '2018-10-30', '2018-11-01', 1,'2018-11-25');
Required output:
I want to generate a report/Excel sheet showing employee leave data month by month, broken down by leave type (i.e. month-wise, leave-type data).
The format should be as below:
Requirement: I want a MySQL query that fetches the attached result: month-wise leave data per leave type (SL/CL/EL) taken by each employee.
Query tried:
SELECT EMP_ID,
SUM(CASE WHEN TYPE_OF_LEAVE = 'EL' AND MONTH( START_DATE ) =11 THEN DAYS ELSE 0 END ) AS EL_NOV,
SUM(CASE WHEN TYPE_OF_LEAVE = 'CL' AND MONTH( START_DATE ) =11 THEN DAYS ELSE 0 END ) AS CL_NOV,
SUM(CASE WHEN TYPE_OF_LEAVE = 'SL' AND MONTH( START_DATE ) =11 THEN DAYS ELSE 0 END ) AS SL_NOV,
SUM(CASE WHEN TYPE_OF_LEAVE = 'LOP' AND MONTH( START_DATE ) =11 THEN DAYS ELSE 0 END ) AS LOP_NOV,
SUM(CASE WHEN TYPE_OF_LEAVE = 'EL' AND MONTH( START_DATE ) =12 THEN DAYS ELSE 0 END ) AS EL_DEC,
SUM(CASE WHEN TYPE_OF_LEAVE = 'CL' AND MONTH( START_DATE ) =12 THEN DAYS ELSE 0 END ) AS CL_DEC,
SUM(CASE WHEN TYPE_OF_LEAVE = 'SL' AND MONTH( START_DATE ) =12 THEN DAYS ELSE 0 END ) AS SL_DEC,
SUM(CASE WHEN TYPE_OF_LEAVE = 'LOP' AND MONTH( START_DATE ) =12 THEN DAYS ELSE 0 END ) AS LOP_DEC
FROM APPLY_LEAVE
GROUP BY EMP_ID
Issue I am facing:
Suppose an employee takes leave on a Friday and Saturday that fall in two different months (e.g. EMP001 took SL from 2018-11-30 to 2018-12-01: Friday is the last day of November and Saturday is the first day of December). When the employee applies for leave from the application, I insert a single record into the table. Here the result should be:
EMP001 - SL
November - 1 leave
December - 1 leave
How can I write this MySQL query?

Dear Dipti, kindly find below a query for the required result.
SELECT
  EMP_ID,
  TYPE_OF_LEAVE,
  DATE_FORMAT(date_day, '%Y-%m') AS leave_month,
  SUM(DAYS) AS DAYS
FROM
  (
    -- Days that fall in the month of START_DATE
    SELECT
      EMP_ID,
      TYPE_OF_LEAVE,
      START_DATE AS date_day,
      IF(
        MONTH(START_DATE) <> MONTH(END_DATE),
        DAY(LAST_DAY(START_DATE)) + 1 - DAY(START_DATE),
        DAYS
      ) AS DAYS
    FROM
      APPLY_LEAVE
    UNION ALL
    -- Days that spill over into the month of END_DATE
    -- (assumes a leave spans at most two consecutive months)
    SELECT
      EMP_ID,
      TYPE_OF_LEAVE,
      END_DATE AS date_day,
      DAY(END_DATE) AS DAYS
    FROM
      APPLY_LEAVE
    WHERE
      MONTH(START_DATE) <> MONTH(END_DATE)
  ) AS a
GROUP BY
  EMP_ID,
  TYPE_OF_LEAVE,
  leave_month;
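If the split is easier to do in the application than in SQL, the same idea can be sketched in a few lines of Python. This is only an illustration, not part of the query above: the function name `split_leave_by_month` is made up, and it assumes that DAYS always equals the number of calendar days between START_DATE and END_DATE inclusive (which holds for the sample rows).

```python
from collections import defaultdict
from datetime import date, timedelta


def split_leave_by_month(start, end):
    """Count leave days per (year, month) for an inclusive date range."""
    days_per_month = defaultdict(int)
    current = start
    while current <= end:
        # Each calendar day is credited to the month it falls in.
        days_per_month[(current.year, current.month)] += 1
        current += timedelta(days=1)
    return dict(days_per_month)


# EMP001's sick leave from the sample data: 2018-11-30 to 2018-12-01
print(split_leave_by_month(date(2018, 11, 30), date(2018, 12, 1)))
```

Unlike the two-branch UNION, walking the range day by day also handles a leave that covers three or more months.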

Related

Mysql query : how to calculate statistic growth rate between two period per category

From this input table of dates and categories:
I'd like to get the following table, showing the number of rows per category for the first and second quarter, and the growth in the number of rows between the second and first quarter, both as a value and as a percentage. It is a simple statistics table of the kind used to report the number of items per period and its growth.
What would be the SQL query to get this table?
Here is the SQL code to create the table, so that you can reproduce the schema:
CREATE TABLE `table_a_test_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date_row` timestamp NULL DEFAULT NULL,
`category` varchar(255) NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
Here is the SQL code to fill the table with data:
INSERT INTO `table_a_test_table` (`date_row`, `category`)
VALUES
('2020-01-03 00:00:00', 'A'),
('2020-02-02 00:00:00', 'A'),
('2020-03-08 00:00:00', 'B'),
('2020-02-06 00:00:00', 'C'),
('2020-04-07 00:00:00', 'B'),
('2020-05-21 00:00:00', 'A'),
('2020-06-07 00:00:00', 'C'),
('2020-06-08 00:00:00', 'B')
;
I've tried the following SQL code to get the result table, but I do not know where and how I should introduce the category field in order to get a grouped output.
SELECT
nb_rows_q1.nb_of_rows AS `number of row in q1`,
nb_rows_q2.nb_of_rows AS `number of row in q2`,
((nb_rows_q2.nb_of_rows - nb_rows_q1.nb_of_rows)/nb_rows_q1.nb_of_rows)*100 AS `growth nb of rows between q2 vs q1`
FROM
(
SELECT COUNT(id) AS nb_of_rows
FROM table_a_test_table
WHERE
date_row >= '2020-01-01 00:00:00'
AND date_row < '2020-04-01 00:00:00'
) AS nb_rows_q1,
(
SELECT COUNT(id) AS nb_of_rows
FROM table_a_test_table
WHERE
date_row >= '2020-04-01 00:00:00'
AND date_row < '2020-07-01 00:00:00'
) AS nb_rows_q2
;
The above code returns the following:
So now I'd like to bring the category field into the code, but I do not know how to do it.
Any ideas?
You would need to aggregate by categories within your subqueries first so you wouldn't lose the category details in the final projection. See a sample working fiddle and results below:
Query #1
SELECT
nb_rows_q1.category,
nb_rows_q1.nb_of_rows AS `number of row in q1`,
nb_rows_q2.nb_of_rows AS `number of row in q2`,
(nb_rows_q2.nb_of_rows - nb_rows_q1.nb_of_rows) as `variation of nb of row q2 vs q1`,
((nb_rows_q2.nb_of_rows - nb_rows_q1.nb_of_rows)/nb_rows_q1.nb_of_rows)*100 AS `growth nb of rows between q2 vs q1`
FROM
(
SELECT category, COUNT(id) AS nb_of_rows
FROM table_a_test_table
WHERE
date_row >= '2020-01-01 00:00:00'
AND date_row < '2020-04-01 00:00:00'
GROUP BY category
) AS nb_rows_q1
INNER JOIN
(
SELECT category,COUNT(id) AS nb_of_rows
FROM table_a_test_table
WHERE
date_row >= '2020-04-01 00:00:00'
AND date_row < '2020-07-01 00:00:00'
GROUP BY category
) AS nb_rows_q2 ON nb_rows_q1.category = nb_rows_q2.category
;
category | number of row in q1 | number of row in q2 | variation of nb of row q2 vs q1 | growth nb of rows between q2 vs q1
A | 2 | 1 | -1 | -50.0000
B | 1 | 2 | 1 | 100.0000
C | 1 | 1 | 0 | 0.0000
Update 1
Another approach is included below, where the aggregation is done with a case expression. I have also added to your test data: D (only in q2), E (only in q1) and F (in neither q1 nor q2). This approach includes categories that occur in either quarter. I also added a case expression for categories that are new, i.e. occur only in quarter 2, where q1 is 0 and the growth rate would otherwise be null. It is up to you whether the growth rate is returned as null or as a default value; I used 100.
CREATE TABLE `table_a_test_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date_row` timestamp NULL DEFAULT NULL,
`category` varchar(255) NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
INSERT INTO `table_a_test_table` (`date_row`, `category`)
VALUES
('2020-01-03 00:00:00', 'A'),
('2020-02-02 00:00:00', 'A'),
('2020-03-08 00:00:00', 'B'),
('2020-02-06 00:00:00', 'C'),
('2020-04-07 00:00:00', 'B'),
('2020-05-21 00:00:00', 'A'),
('2020-06-07 00:00:00', 'C'),
('2020-06-08 00:00:00', 'B'),
('2020-06-08 00:00:00', 'D'),
('2020-01-03 00:00:00', 'E'),
('2019-01-03 00:00:00', 'F')
;
Query #1
SELECT
category,
q1 as `number of row in q1`,
q2 as `number of row in q2`,
(q2 - q1) as `variation of nb of row q2 vs q1`,
CASE
WHEN q1=0 THEN 100.0
ELSE((q2-q1)/q1)*100
END as `growth nb of rows between q2 vs q1`
FROM (
SELECT
category,
SUM(CASE WHEN date_row < '2020-04-01 00:00:00' THEN 1 ELSE 0 END ) as q1,
SUM(CASE WHEN date_row >= '2020-04-01 00:00:00' THEN 1 ELSE 0 END ) as q2
FROM
table_a_test_table
WHERE
date_row >= '2020-01-01 00:00:00'
AND date_row < '2020-07-01 00:00:00'
GROUP BY
category
) summary;
category | number of row in q1 | number of row in q2 | variation of nb of row q2 vs q1 | growth nb of rows between q2 vs q1
A | 2 | 1 | -1 | -50.0000
B | 1 | 2 | 1 | 100.0000
C | 1 | 1 | 0 | 0.0000
D | 0 | 1 | 1 | 100.0
E | 1 | 0 | -1 | -100.0000
Feel free to experiment with other filtering options. For example, if you would like to filter out entries like D or E that exist in only one quarter you may consider adding a WHERE clause similar to WHERE q1 > 0 and q2 > 0; as shown below:
SELECT
category,
q1 as `number of row in q1`,
q2 as `number of row in q2`,
(q2 - q1) as `variation of nb of row q2 vs q1`,
CASE
WHEN q1=0 THEN 100.0
ELSE((q2-q1)/q1)*100
END as `growth nb of rows between q2 vs q1`
FROM (
SELECT
category,
SUM(CASE WHEN date_row < '2020-04-01 00:00:00' THEN 1 ELSE 0 END ) as q1,
SUM(CASE WHEN date_row >= '2020-04-01 00:00:00' THEN 1 ELSE 0 END ) as q2
FROM
table_a_test_table
WHERE
date_row >= '2020-01-01 00:00:00'
AND date_row < '2020-07-01 00:00:00'
GROUP BY
category
) summary
WHERE q1 > 0 and q2 > 0;
category | number of row in q1 | number of row in q2 | variation of nb of row q2 vs q1 | growth nb of rows between q2 vs q1
A | 2 | 1 | -1 | -50.0000
B | 1 | 2 | 1 | 100.0000
C | 1 | 1 | 0 | 0.0000
I hope this helps.
You probably just need one subquery and don't need JOIN at all. Something like this:
SELECT category,
q1 AS `number of row in q1`,
q2 AS `number of row in q2`,
q2-q1 AS `variation of nb of q2 vs q1`,
((q2-q1)/q1)*100 AS `growth nb of rows between q2 vs q1`
FROM
(SELECT category,
SUM(CASE WHEN date_row >= '2020-01-01 00:00:00'
AND date_row < '2020-04-01 00:00:00'
THEN 1 ELSE 0 END) AS 'q1',
SUM(CASE WHEN date_row >= '2020-04-01 00:00:00'
AND date_row < '2020-07-01 00:00:00'
THEN 1 ELSE 0 END) AS 'q2'
FROM table_a_test_table
GROUP BY category) A;
In the base query I use SUM() with a CASE expression. The conditions from your two previous subqueries become the conditions of the CASE expressions, followed by a GROUP BY on the category column. As you can see, I've given the counts the short aliases q1 and q2, and instead of doing all the difference and percentage calculations in the same SELECT, I turned the query into a subquery and do the calculation outside. This, in my opinion, makes the query much more readable.
Demo fiddle
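For anyone who wants to try the single-subquery approach without a MySQL server, here is a minimal sketch that runs the same conditional aggregation through Python's built-in sqlite3 module. It is a stand-in, not the MySQL setup from the question: the aliases `growth_pct` and `summary` and the reduced data set (categories A and D only) are chosen for illustration. It uses NULLIF so that a category with no rows in q1 gets an explicit NULL growth rate rather than a division by zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a_test_table ("
    "id INTEGER PRIMARY KEY, date_row TEXT, category TEXT)"
)
conn.executemany(
    "INSERT INTO table_a_test_table (date_row, category) VALUES (?, ?)",
    [
        ("2020-01-03 00:00:00", "A"),
        ("2020-02-02 00:00:00", "A"),
        ("2020-05-21 00:00:00", "A"),
        ("2020-06-08 00:00:00", "D"),  # D only occurs in q2
    ],
)

query = """
SELECT category, q1, q2,
       (q2 - q1) * 100.0 / NULLIF(q1, 0) AS growth_pct
FROM (
    SELECT category,
           SUM(CASE WHEN date_row >= '2020-01-01 00:00:00'
                     AND date_row <  '2020-04-01 00:00:00'
                    THEN 1 ELSE 0 END) AS q1,
           SUM(CASE WHEN date_row >= '2020-04-01 00:00:00'
                     AND date_row <  '2020-07-01 00:00:00'
                    THEN 1 ELSE 0 END) AS q2
    FROM table_a_test_table
    GROUP BY category
) AS summary
ORDER BY category
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Category A comes back with a growth of -50.0, while D comes back with None (NULL), which the caller can then display however it likes.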

Custom output from a mysql query

Consider the following schema:
CREATE TABLE `Result` (
`startDate` date NOT NULL,
`description` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci DEFAULT NULL,
`value` decimal(15,4) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `Result`
(`startDate`,
`description`,
`value`)
VALUES
('2020-09-01' ,'Allowance' ,4000),
('2020-09-01' ,'Salary' ,1500),
('2020-10-01' ,'Allowance' ,2000),
('2020-10-01' ,'Salary' ,3000),
('2020-10-01' ,'Deduction' ,-200);
Given a date, the result should show the total per description for that start date and for the previous month, plus the difference between the two. So if October was the month selected, the result of the query should be:
description SeptemberTotal OctoberTotal Variance
Allowance 4000 2000 -2000
Salary 1500 3000 1500
Deduction 0 -200 -200
My attempt using a union & a pivot,
SELECT @selectDate:='2020-10-01'; -- set desired date
SELECT
t.month,
t.description,
Gross
from (
SELECT
DATE_FORMAT(pi.startDate, '%b/%y') AS 'Month',
SUM(pi.value) AS gross,
description
FROM
Result pi
WHERE
pi.startDate = DATE_SUB(@selectDate, INTERVAL 1 MONTH) -- select previous month
GROUP BY description
UNION SELECT
DATE_FORMAT(pi.startDate, '%b/%y') AS 'Month',
SUM(pi.value) AS gross,
description
FROM
Result pi
WHERE
pi.startDate = @selectDate
GROUP BY description) t
GROUP BY t.Month,t.description
;
which gives the result as,
Month description Gross
Sep/20 Allowance 4000
Sep/20 Salary 1500
Oct/20 Allowance 2000
Oct/20 Salary 3000
Oct/20 Deduction -200
which is not exactly what the requirement is. I have tried a pivot query as well, that too is not showing the output as required.
db-fiddle
SET @m1 := '2020-09-01';
SET @m2 := '2020-10-01';
SELECT Result.Description,
COALESCE(SUM(CASE WHEN Result.startDate = @m1 THEN value END), 0) Total1,
COALESCE(SUM(CASE WHEN Result.startDate = @m2 THEN value END), 0) Total2,
COALESCE(SUM(CASE WHEN Result.startDate = @m2 THEN value END), 0) -
COALESCE(SUM(CASE WHEN Result.startDate = @m1 THEN value END), 0) Variance
FROM ( SELECT @m1 startDate UNION ALL SELECT @m2 ) baseDates
LEFT JOIN Result USING (startDate)
GROUP BY Result.Description
fiddle
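The same pivot can be tried locally with Python's sqlite3 module. This is a sketch rather than the MySQL answer itself: SQLite stands in for MySQL, the helper `previous_month_start` and the column aliases are made up for the example, and the previous month is computed in Python instead of with DATE_SUB.

```python
import sqlite3
from datetime import date


def previous_month_start(day):
    """Return the first day of the month before the given date."""
    if day.month == 1:
        return date(day.year - 1, 12, 1)
    return date(day.year, day.month - 1, 1)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Result (startDate TEXT, description TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO Result VALUES (?, ?, ?)",
    [
        ("2020-09-01", "Allowance", 4000),
        ("2020-09-01", "Salary", 1500),
        ("2020-10-01", "Allowance", 2000),
        ("2020-10-01", "Salary", 3000),
        ("2020-10-01", "Deduction", -200),
    ],
)

selected = date(2020, 10, 1)
params = {
    "cur": selected.isoformat(),
    "prev": previous_month_start(selected).isoformat(),
}
query = """
SELECT description,
       COALESCE(SUM(CASE WHEN startDate = :prev THEN value END), 0) AS prev_total,
       COALESCE(SUM(CASE WHEN startDate = :cur THEN value END), 0) AS cur_total,
       COALESCE(SUM(CASE WHEN startDate = :cur THEN value END), 0)
         - COALESCE(SUM(CASE WHEN startDate = :prev THEN value END), 0) AS variance
FROM Result
WHERE startDate IN (:prev, :cur)
GROUP BY description
ORDER BY description
"""
rows = conn.execute(query, params).fetchall()
for row in rows:
    print(row)
```

The COALESCE is what turns the missing September Deduction into 0 instead of NULL, so the variance for that row is still a number.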
select
description,
sum(if(startDate between '2020-11-01 00:00:00' and '2020-11-30 23:59:59', value, 0)) `1st`,
sum(if(startDate between '2020-12-01 00:00:00' and '2020-12-31 23:59:59', value, 0)) `2nd`,
-- Variance = later month minus earlier month
sum(if(startDate between '2020-12-01 00:00:00' and '2020-12-31 23:59:59', value, 0)) - sum(if(startDate between '2020-11-01 00:00:00' and '2020-11-30 23:59:59', value, 0)) Variance
from
Result
where startDate between '2020-11-01 00:00:00' and '2020-12-31 23:59:59' group by description

How To Write A Query With A CTE And A Left Join

I am trying to join a calendar table, built with a CTE, to my data so that the dates in my query result display like this:
Jan 18, Jan 19, Feb 18, Feb 19
Below are my DDL and the query I attempted, but MySQL Workbench reports that there is an error somewhere in my SQL.
This is the exact error:
Query Error: Error: ER_PARSE_ERROR: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'RECURSIVE cte_months_to_pull AS ( SELECT DATE_FORMAT(@start_date, '%Y-%m-01'' at line 1
Can someone assist?
If it's easier, this is a SQL Fiddle of everything: http://sqlfiddle.com/#!9/300f9d/1
CREATE TABLE PrevYear (
`EmployeeNumber` char(8) NOT NULL,
`SaleAmount` int DEFAULT NULL,
`SaleDate` date NOT NULL,
`EmployeeName` char(17) NOT NULL
);
CREATE TABLE CurrentYear (
`EmployeeNumber` char(8) NOT NULL,
`SaleAmount` int DEFAULT NULL,
`SaleDate` date NOT NULL,
`EmployeeName` char(17) NOT NULL
);
INSERT INTO CurrentYear
VALUES ('ea12', '100', '2019-01-10', 'Maggie Samuels');
INSERT INTO CurrentYear
VALUES ('ea12', '199', '2019-01-13', 'Sam Stoner');
INSERT INTO CurrentYear
VALUES ('ea12', '100', '2019-03-01', 'Jake Jolel');
INSERT INTO CurrentYear
VALUES ('ls22', '100', '2019-05-01', 'Maggie Samuels');
INSERT INTO PrevYear
VALUES ('ea12', '100', '2018-01-10', 'Maggie Samuels');
INSERT INTO PrevYear
VALUES ('ea12', '199', '2018-01-13', 'Sam Stoner');
INSERT INTO PrevYear
VALUES ('ea12', '100', '2018-03-01', 'Sam Stoner');
INSERT INTO PrevYear
VALUES ('ls22', '100', '2018-05-01', 'Maggie Samuels');
And this is the query I tried:
SET @start_date = '20190102';
SET @number_of_months = 12;
WITH RECURSIVE
cte_months_to_pull AS (
SELECT DATE_FORMAT(@start_date, '%Y-%m-01')
- INTERVAL @number_of_months MONTH AS month_to_pull
UNION ALL
SELECT month_to_pull + INTERVAL 1 MONTH
FROM cte_months_to_pull
WHERE month_to_pull < @start_date + INTERVAL @number_of_months - 2 MONTH
)
SELECT Date_format(saledate, '%m-%Y') AS Month,
employeename,
Sum(saleamount) AS IA
FROM currentyear
WHERE employeename = 'Maggie Samuels'
GROUP BY Date_format(saledate, '%m-%Y'), employeename
UNION ALL
SELECT Date_format(saledate, '%m-%Y') AS Month,
employeename,
Sum(saleamount) AS IA
FROM prevyear
WHERE employeename = 'Maggie Samuels'
GROUP BY Date_format(saledate, '%m-%Y'), employeename
LEFT JOIN cte_months_to_pull (
Select DATE_Format(month_to_pull, '%b %y')
FROM cte_months_to_pull
) AS YRS ON month_to_pull = saledate
ORDER BY MONTH(month_to_pull), YEAR(month_to_pull)
As far as I can see, you are using a MySQL version older than 8.0, which doesn't support recursive CTEs. I tried your query with some minor updates on 8.0 and it worked fine:
WITH RECURSIVE
cte_months_to_pull AS (
SELECT DATE_FORMAT(@start_date, '%Y-%m-01')
- INTERVAL @number_of_months MONTH AS month_to_pull
UNION ALL
SELECT month_to_pull + INTERVAL 1 MONTH
FROM cte_months_to_pull
WHERE month_to_pull < @start_date + INTERVAL @number_of_months - 2 MONTH
)
SELECT YRS.months_to_pull
,T.employeename
,COALESCE(T.IA, 0) IA
FROM (SELECT DATE_Format(month_to_pull, '%b-%Y') months_to_pull
FROM cte_months_to_pull
ORDER BY months_to_pull
) AS YRS
LEFT JOIN (SELECT Date_format(saledate, '%b-%Y') AS `Month`
,employeename
,Sum(saleamount) AS IA
FROM CurrentYear
WHERE employeename = 'Maggie Samuels'
GROUP BY Date_format(saledate, '%b-%Y'), employeename
UNION ALL
SELECT Date_format(saledate, '%b-%Y')
,employeename
,Sum(saleamount)
FROM PrevYear
WHERE employeename = 'Maggie Samuels'
GROUP BY Date_format(saledate, '%b-%Y'), employeename) T
ON YRS.months_to_pull = T.`Month`
ORDER BY month(STR_TO_DATE(CONCAT('01-',months_to_pull), '%d-%b-%Y'))
,YEAR(STR_TO_DATE(CONCAT('01-',months_to_pull), '%d-%b-%Y'))
Here is the Fiddle
Since no expected output was given, I have only verified that the query runs.
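To see the calendar-CTE-plus-LEFT-JOIN pattern in isolation, here is a small sketch using Python's sqlite3 module, whose bundled SQLite also supports WITH RECURSIVE. It is not the MySQL query above: SQLite's date(..., '+1 month') replaces MySQL's INTERVAL arithmetic, the CTE name `months` and the six-month range are arbitrary choices, and only three PrevYear rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PrevYear ("
    "EmployeeNumber TEXT, SaleAmount INTEGER, SaleDate TEXT, EmployeeName TEXT)"
)
conn.executemany(
    "INSERT INTO PrevYear VALUES (?, ?, ?, ?)",
    [
        ("ea12", 100, "2018-01-10", "Maggie Samuels"),
        ("ea12", 199, "2018-01-13", "Sam Stoner"),
        ("ls22", 100, "2018-05-01", "Maggie Samuels"),
    ],
)

query = """
WITH RECURSIVE months(month_start) AS (
    SELECT '2018-01-01'                      -- anchor row: first month
    UNION ALL
    SELECT date(month_start, '+1 month')     -- recursive step: next month
    FROM months
    WHERE month_start < '2018-06-01'
)
SELECT strftime('%Y-%m', m.month_start) AS ym,
       COALESCE(SUM(s.SaleAmount), 0) AS total
FROM months AS m
LEFT JOIN PrevYear AS s
       ON strftime('%Y-%m', s.SaleDate) = strftime('%Y-%m', m.month_start)
      AND s.EmployeeName = 'Maggie Samuels'
GROUP BY ym
ORDER BY ym
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The employee filter sits in the ON clause rather than in WHERE; putting it in WHERE would discard the months with no sales and defeat the purpose of the LEFT JOIN.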

How to get summary data for every months in mysql

I want to count the number of items sold(item_count) every month for every item,
--
-- Table structure for table `sales`
--
CREATE TABLE `sales` (
`id` int(11) NOT NULL,
`item_id` int(11) NOT NULL,
`date` date NOT NULL,
`item_count` int(11) NOT NULL,
`amount` float NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
--
-- Dumping data for table `sales`
--
INSERT INTO `sales` (`id`, `item_id`, `date`, `item_count`, `amount`) VALUES
(1, 1, '2018-01-15', 11, 110),
(2, 2, '2018-01-21', 5, 1000),
(3, 1, '2018-02-02', 7, 700),
(4, 2, '2018-02-11', 3, 3000);
I have tried this SQL, but it's not showing the data correctly.
SELECT `sales`.`item_id`,
(CASE WHEN MONTH(sales.date)=1 THEN sum(sales.item_count) ELSE NULL END) as JAN,
(case when MONTH(sales.date)=2 THEN sum(sales.item_count) ELSE NULL END) as FEB
FROM sales WHERE 1
GROUP BY sales.item_id
ORDER BY sales.item_id
This is my expected result,
item_id JAN FEB
1 11 7
2 5 3
I am getting this,
item_id JAN FEB
1 18 NULL
2 8 NULL
Here is an immediate fix to your query. You need to sum over a CASE expression, rather than the other way around.
SELECT
s.item_id,
SUM(CASE WHEN MONTH(s.date) = 1 THEN s.item_count END) AS JAN,
SUM(CASE WHEN MONTH(s.date) = 2 THEN s.item_count END) AS FEB
FROM sales s
GROUP BY
s.item_id
ORDER BY
s.item_id;
But the potential problem with this query is that in order to support more months, you need to add more columns. Also, if you want to cover multiple years, this approach might not scale. Assuming you only have a few items, here is another way to do this:
SELECT
DATE_FORMAT(date, '%Y-%m') AS ym,
SUM(CASE WHEN item_id = 1 THEN item_count END) AS item1_total,
SUM(CASE WHEN item_id = 2 THEN item_count END) AS item2_total
FROM sales
GROUP BY
DATE_FORMAT(date, '%Y-%m');
This would generate output looking something like:
ym item1_total item2_total
2018-01 11 5
2018-02 7 3
Which version you use depends on how many months your report requires versus how many items might appear in your data.
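If you stay with one column per month, the repetitive SUM(CASE ...) columns can be generated instead of typed by hand. Below is a rough sketch of that idea in Python with the built-in sqlite3 module standing in for MySQL; `build_pivot_query` is a made-up helper, and strftime('%m', ...) replaces MySQL's MONTH(). The month labels are interpolated into the SQL text, so they must come from trusted constants, never from user input.

```python
import sqlite3


def build_pivot_query(months):
    """Build one SUM(CASE ...) column per (month_number, label) pair."""
    columns = ",\n".join(
        f"  SUM(CASE WHEN CAST(strftime('%m', date) AS INTEGER) = {int(number)} "
        f"THEN item_count END) AS {label}"
        for number, label in months
    )
    return (
        f"SELECT item_id,\n{columns}\n"
        "FROM sales\nGROUP BY item_id\nORDER BY item_id"
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "id INTEGER, item_id INTEGER, date TEXT, item_count INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "2018-01-15", 11, 110),
        (2, 2, "2018-01-21", 5, 1000),
        (3, 1, "2018-02-02", 7, 700),
        (4, 2, "2018-02-11", 3, 3000),
    ],
)

sql = build_pivot_query([(1, "JAN"), (2, "FEB")])
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

Adding March is then a matter of appending (3, "MAR") to the list rather than editing the query text.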

get specific data along with group by

I have a table named logs.
Table: logs
ID user_id time_of_action
I want to get result for each user for each date i.e. group by date,user_id.
So, here's the expected output structure:
user_id date occurred_in_afternoon occurred_at_night total_action_count
Explanation:
occurred_in_afternoon: whether any action of a user occurred in between 12:00 PM to 4:00 PM
occurred_at_night: whether any action of a user occurred between 8:00 PM to 12:00 AM (next day)
Schema and sample data:
DROP TABLE IF EXISTS `logs`;
CREATE TABLE `logs` (
`Id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`time_of_action` timestamp NULL DEFAULT NULL ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`Id`)
);
INSERT INTO `logs` VALUES ('1', '71', '2016-03-10 10:07:34');
INSERT INTO `logs` VALUES ('2', '66', '2016-03-10 14:07:57');
INSERT INTO `logs` VALUES ('3', '71', '2016-03-10 22:08:27');
INSERT INTO `logs` VALUES ('4', '71', '2016-03-10 15:08:40');
And here's my current query:
SELECT
user_id,
DATE(time_of_action) `date`,
CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,12,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,16,DATE(time_of_action)) THEN 1 ELSE 0 END occurred_in_afternoon,
CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,20,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,24,DATE(time_of_action)) THEN 1 ELSE 0 END occurred_at_night,
COUNT(*) total_action_count
FROM `logs`
GROUP BY `date`,user_id
my current output:
user_id date occurred_in_afternoon occurred_at_night total_action_count
66 2016-03-10 1 0 1
71 2016-03-10 0 0 3
Expected output:
user_id date occurred_in_afternoon occurred_at_night total_action_count
66 2016-03-10 1 0 1
71 2016-03-10 1 1 3
The problem is that I am not getting the expected result. I guess the occurred_in_afternoon value is overwritten by another time_of_action that does not fall in the afternoon range.
And is it possible to implement it in a single query?
You forgot to use an aggregate function. You can use MAX() or BIT_OR() for your purpose:
SELECT
user_id,
DATE(time_of_action) `date`,
MAX(CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,12,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,16,DATE(time_of_action)) THEN 1 ELSE 0 END) occurred_in_afternoon,
MAX(CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,20,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,24,DATE(time_of_action)) THEN 1 ELSE 0 END) occurred_at_night,
COUNT(*) total_action_count
FROM `logs`
GROUP BY `date`,user_id
Update: I would also prefer a more readable version like
SELECT
user_id,
DATE(time_of_action) `date`,
BIT_OR(TIME(time_of_action) BETWEEN '12:00:00' AND '16:00:00') occurred_in_afternoon,
BIT_OR(TIME(time_of_action) BETWEEN '20:00:00' AND '23:59:59') occurred_at_night,
COUNT(*) total_action_count
FROM `logs`
GROUP BY `date`,user_id
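To check the "aggregate the flag" idea against the sample rows, here is a small sketch using Python's sqlite3 module. SQLite is only a stand-in here: it has no BIT_OR aggregate, so the sketch uses MAX over the 0/1 comparison result, and the alias `day` is used instead of `date` to keep the example unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (Id INTEGER PRIMARY KEY, user_id INTEGER, time_of_action TEXT)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?)",
    [
        (1, 71, "2016-03-10 10:07:34"),
        (2, 66, "2016-03-10 14:07:57"),
        (3, 71, "2016-03-10 22:08:27"),
        (4, 71, "2016-03-10 15:08:40"),
    ],
)

query = """
SELECT user_id,
       date(time_of_action) AS day,
       -- each comparison yields 0 or 1 per row; MAX is 1 if any row matched
       MAX(time(time_of_action) BETWEEN '12:00:00' AND '16:00:00') AS occurred_in_afternoon,
       MAX(time(time_of_action) >= '20:00:00') AS occurred_at_night,
       COUNT(*) AS total_action_count
FROM logs
GROUP BY user_id, day
ORDER BY user_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

User 71 now gets 1 in both flag columns, because the flag is evaluated for every row in the group and not just for whichever row the engine happens to pick.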
I was thinking of wrapping the result in an aliased subquery: use SUM to count the matching rows, then turn those counts into binary values for the two fields.
SELECT
t.user_id,
t.date,
CASE WHEN t.occurred_in_afternoon > 0 THEN 1 ELSE 0 END AS occurred_in_afternoon,
CASE WHEN t.occurred_at_night > 0 THEN 1 ELSE 0 END AS occurred_at_night,
t.total_action_count
FROM
(SELECT
user_id,
DATE(time_of_action) `date`,
SUM(CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,12,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,16,DATE(time_of_action)) THEN 1 ELSE 0 END) occurred_in_afternoon,
SUM(CASE WHEN time_of_action BETWEEN TIMESTAMPADD(HOUR,20,DATE(time_of_action)) AND TIMESTAMPADD(HOUR,24,DATE(time_of_action)) THEN 1 ELSE 0 END) occurred_at_night,
COUNT(*) total_action_count
FROM `logs`
GROUP BY `date`,user_id) t