I have a database containing tickets. Each ticket has its own number, but this number is not unique in the table. So, for example, ticket #1000 can appear multiple times in the table with different values in the other columns (which I have removed here for this example).
create table countries
(
isoalpha varchar(2),
pole varchar(50)
);
insert into countries values ('DE', 'EMEA'),('FR', 'EMEA'),('IT', 'EMEA'),('US','USCAN'),('CA', 'USCAN');
create table tickets
(
id int primary key auto_increment,
number int,
isoalpha varchar(2),
created datetime
);
insert into tickets (number, isoalpha, created) values
(1000, 'DE', '2021-01-01 00:00:00'),
(1001, 'US', '2021-01-01 00:00:00'),
(1002, 'FR', '2021-01-01 00:00:00'),
(1003, 'CA', '2021-01-01 00:00:00'),
(1000, 'DE', '2021-01-01 00:00:00'),
(1000, 'DE', '2021-01-01 00:00:00'),
(1004, 'DE', '2021-01-02 00:00:00'),
(1001, 'US', '2021-01-01 00:00:00'),
(1002, 'FR', '2021-01-01 00:00:00'),
(1005, 'IT', '2021-01-02 00:00:00'),
(1006, 'US', '2021-01-02 00:00:00'),
(1007, 'DE', '2021-01-02 00:00:00');
Here is an example:
http://sqlfiddle.com/#!9/3f4ba4/6
What I need as output is the number of newly created tickets for each day, divided into tickets from USCAN and the rest of the world.
So for this example the resulting data should be:
Date | USCAN | Other
'2021-01-01' | 2 | 2
'2021-01-02' | 1 | 3
At the moment I use these two queries to fetch all new tickets and then count the number of rows with the same date in my application code:
SELECT MIN(ti.created) AS date
FROM tickets ti
LEFT JOIN countries ct ON (ct.isoalpha = ti.isoalpha)
WHERE ct.pole = 'USCAN'
GROUP BY ti.number
ORDER BY date
SELECT MIN(ti.created) AS date
FROM tickets ti
LEFT JOIN countries ct ON (ct.isoalpha = ti.isoalpha)
WHERE ct.pole <> 'USCAN'
GROUP BY ti.number
ORDER BY date
but that doesn't look like a very clean method. So how can I improve the query to get the needed data with less overhead?
It should preferably work with MySQL 5.7.
You may logically combine the queries using conditional aggregation:
SELECT
MIN(CASE WHEN ct.pole = 'USCAN' THEN ti.created END) AS date_uscan,
MIN(CASE WHEN ct.pole <> 'USCAN' THEN ti.created END) AS date_other
FROM tickets ti
LEFT JOIN countries ct ON ct.isoalpha = ti.isoalpha
GROUP BY ti.number
ORDER BY date_uscan;
You can create unique entries for each date/country, then use those to count USCAN and non-USCAN:
SELECT created,
SUM(1) as total,
SUM(CASE WHEN pole = 'USCAN' THEN 1 ELSE 0 END) as uscan,
SUM(CASE WHEN pole != 'USCAN' THEN 1 ELSE 0 END) as nonuscan
FROM (
SELECT created, t.isoalpha, MIN(pole) AS pole
FROM tickets t JOIN countries c ON t.isoalpha = c.isoalpha
GROUP BY created,isoalpha
) AS uniqueTickets
GROUP BY created
Results:
created total uscan nonuscan
2021-01-01T00:00:00Z 4 2 2
2021-01-02T00:00:00Z 3 1 2
http://sqlfiddle.com/#!9/3f4ba4/45/0
Based on the answer from SQL Hacks, I found the right solution:
SELECT created,
SUM(1) as total,
SUM(CASE WHEN pole = 'USCAN' THEN 1 ELSE 0 END) as uscan,
SUM(CASE WHEN pole != 'USCAN' THEN 1 ELSE 0 END) as nonuscan
FROM (
SELECT created, t.isoalpha, MIN(pole) AS pole
FROM tickets t JOIN countries c ON t.isoalpha = c.isoalpha
GROUP BY t.number
) AS uniqueTickets
GROUP BY SUBSTR(created, 1, 10)
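Side note: the solution above selects created without aggregating it while grouping by t.number (and by SUBSTR(created, 1, 10) in the outer query), which MySQL 5.7 rejects when ONLY_FULL_GROUP_BY is enabled. A minimal sketch of an equivalent form that aggregates explicitly, assuming every row of a given ticket number has the same country:
SELECT DATE(first_created) AS created,
       SUM(1) as total,
       SUM(CASE WHEN pole = 'USCAN' THEN 1 ELSE 0 END) as uscan,
       SUM(CASE WHEN pole != 'USCAN' THEN 1 ELSE 0 END) as nonuscan
FROM (
    -- one row per ticket number: its first creation time and its pole
    SELECT t.number, MIN(t.created) AS first_created, MIN(pole) AS pole
    FROM tickets t JOIN countries c ON t.isoalpha = c.isoalpha
    GROUP BY t.number
) AS uniqueTickets
GROUP BY DATE(first_created);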
Related
There are 2 tables ost_ticket and ost_ticket_action_history.
create table ost_ticket(
ticket_id int not null PRIMARY KEY,
created timestamp,
staff bool,
status varchar(50),
city_id int
);
create table ost_ticket_action_history(
ticket_id int not null,
action_id int not null PRIMARY KEY,
action_name varchar(50),
started timestamp,
FOREIGN KEY(ticket_id) REFERENCES ost_ticket(ticket_id)
);
In the ost_ticket_action_history table the data is:
INSERT INTO newdb.ost_ticket_action_history (ticket_id, action_id, action_name, started) VALUES (1, 1, 'Consultation', '2022-01-06 18:30:29');
INSERT INTO newdb.ost_ticket_action_history (ticket_id, action_id, action_name, started) VALUES (2, 2, 'Bank Application', '2022-02-06 18:30:45');
INSERT INTO newdb.ost_ticket_action_history (ticket_id, action_id, action_name, started) VALUES (3, 3, 'Consultation', '2022-05-06 18:42:48');
In the ost_ticket table the data is:
INSERT INTO newdb.ost_ticket (ticket_id, created, staff, status, city_id) VALUES (1, '2022-04-04 18:26:41', 1, 'open', 2);
INSERT INTO newdb.ost_ticket (ticket_id, created, staff, status, city_id) VALUES (2, '2022-05-05 18:30:48', 0, 'open', 3);
INSERT INTO newdb.ost_ticket (ticket_id, created, staff, status, city_id) VALUES (3, '2022-04-06 18:42:53', 1, 'open', 4);
My task is to get the conversion from the “Consultation” stage to the “Bank Application” stage, broken down by month (based on the start date of the “Bank Application” stage). Conversion is calculated according to the following formula: (number of applications with the “Bank Application” stage / number of applications with the “Consultation” stage) * 100%.
My query is like this:
select SUM(action_name='Bank Application')/SUM(action_name='Consultation') * 2 as 'Conversion'
from ost_ticket_action_history
JOIN ost_ticket ot on ot.ticket_id = ost_ticket_action_history.ticket_id
where status = 'open'
  and created > '2020-01-01 00:00:00'
group by action_name, started
having action_name = 'Bank Application';
As a result I get:
Another query:
SELECT
SUM(CASE
WHEN b.ticket_id IS NOT NULL THEN 1
ELSE 0
END) / COUNT(*) conversion,
YEAR(a.started) AS 'year',
MONTH(a.started) AS 'month'
FROM
ost_ticket_action_history a
LEFT JOIN
ost_ticket_action_history b ON a.ticket_id = b.ticket_id
AND b.action_name = 'Bank Application'
WHERE
a.action_name = 'Consultation'
AND a.status = 'open'
AND a.created > '2020-01-01 00:00:00'
GROUP BY YEAR(a.started) , MONTH(a.started)
I apologize if I didn't write very clearly. Please explain what to do.
As I explained in my comment, you exclude rows with your HAVING clause.
In the following, I will show you how to debug this.
First, check what the raw result of the SELECT query is.
As you can see, when you remove the calculation and simply SELECT *, what you actually get is only 1 row with 'Bank Application', because the HAVING clause excludes all other rows:
SELECT
*
FROM
ost_ticket_action_history
JOIN
ost_ticket ot ON ot.ticket_id = ost_ticket_action_history.ticket_id
WHERE
status = 'open'
AND created > '2020-01-01 00:00:00'
GROUP BY
action_name, started
HAVING
action_name = 'Bank Application';
Output:
ticket_id | action_id | action_name | started | ticket_id | created | staff | status | city_id
--------: | --------: | ----------- | ------- | --------: | ------- | ----: | ------ | ------:
2 | 2 | Bank Application | 2022-02-06 18:30:45 | 2 | 2022-05-05 18:30:48 | 0 | open | 3
Second step, see what the result set is without calculating anything.
As you can see, you end up dividing by 0, which, as you learned in school, is not allowed; that is why your result is NULL:
SELECT
SUM(action_name = 'Bank Application')
#/
,SUM(action_name = 'Consultation') * 2 AS 'Conversion'
FROM
ost_ticket_action_history
JOIN
ost_ticket ot ON ot.ticket_id = ost_ticket_action_history.ticket_id
WHERE
status = 'open'
AND created > '2020-01-01 00:00:00'
GROUP BY action_name , started
HAVING action_name = 'Bank Application';
SUM(action_name = 'Bank Application') | Conversion
------------------------------------: | ---------:
1 | 0
db<>fiddle here
Third, what you can do is exclude division by 0. Here I didn't remove all the other rows, as this is only for emphasis:
SELECT
SUM(action_name = 'Bank Application')
/
SUM(action_name = 'Consultation') * 2 AS 'Conversion'
FROM
ost_ticket_action_history
JOIN
ost_ticket ot ON ot.ticket_id = ost_ticket_action_history.ticket_id
WHERE
status = 'open'
AND created > '2020-01-01 00:00:00'
GROUP BY action_name , started
HAVING SUM(action_name = 'Consultation') > 0;
| Conversion |
| ---------: |
| 0.0000 |
| 0.0000 |
db<>fiddle here
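As a side note, another common way to guard against dividing by zero is NULLIF, which keeps the row and simply returns NULL for the conversion instead of filtering the row out. A minimal sketch based on the query above:
SELECT
    SUM(action_name = 'Bank Application')
    /
    NULLIF(SUM(action_name = 'Consultation'), 0) * 2 AS 'Conversion'  -- NULLIF(x, 0) is NULL when x = 0
FROM
    ost_ticket_action_history
JOIN
    ost_ticket ot ON ot.ticket_id = ost_ticket_action_history.ticket_id
WHERE
    status = 'open'
    AND created > '2020-01-01 00:00:00'
GROUP BY action_name , started;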
Final words: if you get a strange result, simply go back, remove everything that doesn't matter, and try to get all the raw values, so that you can check your math.
From this input table of dates and categories:
I'd like to get the following table, showing the number of rows per category for the first and second quarter, and the growth in the number of rows between the second and the first quarter, both as a value and as a percentage; this is a simple type of statistics table that is often used to get the number of items per period of time and its growth.
What would be the SQL query to get this table?
Here is the SQL code to create the table, so you can reproduce the schema:
CREATE TABLE `table_a_test_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date_row` timestamp NULL DEFAULT NULL,
`category` varchar(255) NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
Here is the SQL code to fill the table with the data:
INSERT INTO `table_a_test_table` (`date_row`, `category`)
VALUES
('2020-01-03 00:00:00', 'A'),
('2020-02-02 00:00:00', 'A'),
('2020-03-08 00:00:00', 'B'),
('2020-02-06 00:00:00', 'C'),
('2020-04-07 00:00:00', 'B'),
('2020-05-21 00:00:00', 'A'),
('2020-06-07 00:00:00', 'C'),
('2020-06-08 00:00:00', 'B')
;
I've tried the following SQL code to get the result table, but I do not know where and how I should introduce the category field in order to get the output grouped by category.
SELECT
nb_rows_q1.nb_of_rows AS `number of row in q1`,
nb_rows_q2.nb_of_rows AS `number of row in q2`,
((nb_rows_q2.nb_of_rows - nb_rows_q1.nb_of_rows)/nb_rows_q1.nb_of_rows)*100 AS `growth nb of rows between q2 vs q1`
FROM
(
SELECT COUNT(id) AS nb_of_rows
FROM table_a_test_table
WHERE
date_row >= '2020-01-01 00:00:00'
AND date_row < '2020-04-01 00:00:00'
) AS nb_rows_q1,
(
SELECT COUNT(id) AS nb_of_rows
FROM table_a_test_table
WHERE
date_row >= '2020-04-01 00:00:00'
AND date_row < '2020-07-01 00:00:00'
) AS nb_rows_q2
;
The above code returns the following:
So now, I'd like to place the category field into the code, but I do not know how to do it.
Any ideas?
You would need to aggregate by categories within your subqueries first so you wouldn't lose the category details in the final projection. See a sample working fiddle and results below:
CREATE TABLE `table_a_test_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date_row` timestamp NULL DEFAULT NULL,
`category` varchar(255) NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
INSERT INTO `table_a_test_table` (`date_row`, `category`)
VALUES
('2020-01-03 00:00:00', 'A'),
('2020-02-02 00:00:00', 'A'),
('2020-03-08 00:00:00', 'B'),
('2020-02-06 00:00:00', 'C'),
('2020-04-07 00:00:00', 'B'),
('2020-05-21 00:00:00', 'A'),
('2020-06-07 00:00:00', 'C'),
('2020-06-08 00:00:00', 'B')
;
Query #1
SELECT
nb_rows_q1.category,
nb_rows_q1.nb_of_rows AS `number of row in q1`,
nb_rows_q2.nb_of_rows AS `number of row in q2`,
(nb_rows_q2.nb_of_rows - nb_rows_q1.nb_of_rows) as `variation of nb of row q2 vs q1`,
((nb_rows_q2.nb_of_rows - nb_rows_q1.nb_of_rows)/nb_rows_q1.nb_of_rows)*100 AS `growth nb of rows between q2 vs q1`
FROM
(
SELECT category, COUNT(id) AS nb_of_rows
FROM table_a_test_table
WHERE
date_row >= '2020-01-01 00:00:00'
AND date_row < '2020-04-01 00:00:00'
GROUP BY category
) AS nb_rows_q1
INNER JOIN
(
SELECT category,COUNT(id) AS nb_of_rows
FROM table_a_test_table
WHERE
date_row >= '2020-04-01 00:00:00'
AND date_row < '2020-07-01 00:00:00'
GROUP BY category
) AS nb_rows_q2 ON nb_rows_q1.category = nb_rows_q2.category
;
category | number of row in q1 | number of row in q2 | variation of nb of row q2 vs q1 | growth nb of rows between q2 vs q1
-------- | ------------------: | ------------------: | ------------------------------: | ----------------------------------:
A | 2 | 1 | -1 | -50.0000
B | 1 | 2 | 1 | 100.0000
C | 1 | 1 | 0 | 0.0000
View on DB Fiddle
Update 1
Another approach is included below, where the aggregation is done with the help of a CASE expression. I have also added to your test data: D only in q2, E only in q1, and F in neither q1 nor q2. This approach includes categories that appear in either quarter. I also added a CASE expression for categories that are new or only occur in quarter 2, which would otherwise have a NULL growth rate. It is up to you whether you would like the growth rate returned as NULL or as a default value; I used 100.
CREATE TABLE `table_a_test_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`date_row` timestamp NULL DEFAULT NULL,
`category` varchar(255) NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
INSERT INTO `table_a_test_table` (`date_row`, `category`)
VALUES
('2020-01-03 00:00:00', 'A'),
('2020-02-02 00:00:00', 'A'),
('2020-03-08 00:00:00', 'B'),
('2020-02-06 00:00:00', 'C'),
('2020-04-07 00:00:00', 'B'),
('2020-05-21 00:00:00', 'A'),
('2020-06-07 00:00:00', 'C'),
('2020-06-08 00:00:00', 'B'),
('2020-06-08 00:00:00', 'D'),
('2020-01-03 00:00:00', 'E'),
('2019-01-03 00:00:00', 'F')
;
Query #1
SELECT
category,
q1 as `number of row in q1`,
q2 as `number of row in q2`,
(q2 - q1) as `variation of nb of row q2 vs q1`,
CASE
WHEN q1=0 THEN 100.0
ELSE((q2-q1)/q1)*100
END as `growth nb of rows between q2 vs q1`
FROM (
SELECT
category,
SUM(CASE WHEN date_row < '2020-04-01 00:00:00' THEN 1 ELSE 0 END ) as q1,
SUM(CASE WHEN date_row >= '2020-04-01 00:00:00' THEN 1 ELSE 0 END ) as q2
FROM
table_a_test_table
WHERE
date_row BETWEEN '2020-01-01 00:00:00' AND '2020-07-01 00:00:00'
GROUP BY
category
) summary;
category | number of row in q1 | number of row in q2 | variation of nb of row q2 vs q1 | growth nb of rows between q2 vs q1
-------- | ------------------: | ------------------: | ------------------------------: | ----------------------------------:
A | 2 | 1 | -1 | -50.0000
B | 1 | 2 | 1 | 100.0000
C | 1 | 1 | 0 | 0.0000
D | 0 | 1 | 1 | 100.0
E | 1 | 0 | -1 | -100.0000
View on DB Fiddle
Feel free to experiment with other filtering options. For example, if you would like to filter out entries like D or E that exist in only one quarter, you may consider adding a WHERE clause similar to WHERE q1 > 0 AND q2 > 0, as shown below:
SELECT
category,
q1 as `number of row in q1`,
q2 as `number of row in q2`,
(q2 - q1) as `variation of nb of row q2 vs q1`,
CASE
WHEN q1=0 THEN 100.0
ELSE((q2-q1)/q1)*100
END as `growth nb of rows between q2 vs q1`
FROM (
SELECT
category,
SUM(CASE WHEN date_row < '2020-04-01 00:00:00' THEN 1 ELSE 0 END ) as q1,
SUM(CASE WHEN date_row >= '2020-04-01 00:00:00' THEN 1 ELSE 0 END ) as q2
FROM
table_a_test_table
WHERE
date_row BETWEEN '2020-01-01 00:00:00' AND '2020-07-01 00:00:00'
GROUP BY
category
) summary
WHERE q1 > 0 and q2 > 0;
category | number of row in q1 | number of row in q2 | variation of nb of row q2 vs q1 | growth nb of rows between q2 vs q1
-------- | ------------------: | ------------------: | ------------------------------: | ----------------------------------:
A | 2 | 1 | -1 | -50.0000
B | 1 | 2 | 1 | 100.0000
C | 1 | 1 | 0 | 0.0000
View on DB Fiddle
I hope this helps.
You probably just need one subquery and don't need JOIN at all. Something like this:
SELECT category,
q1 AS `number of row in q1`,
q2 AS `number of row in q2`,
q2-q1 AS `variation of nb of q2 vs q1`,
((q2-q1)/q1)*100 AS `growth nb of rows between q2 vs q1`
FROM
(SELECT category,
SUM(CASE WHEN date_row >= '2020-01-01 00:00:00'
AND date_row < '2020-04-01 00:00:00'
THEN 1 ELSE 0 END) AS 'q1',
SUM(CASE WHEN date_row >= '2020-04-01 00:00:00'
AND date_row < '2020-07-01 00:00:00'
THEN 1 ELSE 0 END) AS 'q2'
FROM table_a_test_table
GROUP BY category) A;
In the base query I use SUM() with a CASE expression. The conditions from both of your previous subqueries are used as the conditions in the CASE expressions, followed by a GROUP BY on the category column. As you can see, I've assigned the short aliases q1 and q2, and instead of doing all the calculations for the differences and the percentage in the same SELECT, I turned the query into a subquery and do the calculations outside. This, in my opinion, makes the query much more readable.
Demo fiddle
I have a table with a few fields: id, country, ip, created_at. I am trying to get the delta between the total entries of one day and the total entries of the next day.
CREATE TABLE session (
id int NOT NULL AUTO_INCREMENT,
country varchar(50) NOT NULL,
ip varchar(255),
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (id)
);
INSERT INTO `session` (`id`, `country`, `ip`, `created_at`) VALUES
('1', 'IN', '10.100.102.11', '2021-04-05 20:26:02'),
('2', 'IN', '10.100.102.11', '2021-04-05 19:26:02'),
('3', 'US', '10.120.102.11', '2021-04-17 10:26:02'),
('4', 'US', '10.100.112.11', '2021-04-16 12:26:02'),
('5', 'AU', '10.100.102.122', '2021-04-12 19:36:02'),
('6', 'AU', '10.100.102.122', '2021-04-12 18:20:02'),
('7', 'AU', '10.100.102.122', '2021-04-12 23:26:02'),
('8', 'US', '10.100.102.2', '2021-04-16 21:33:01'),
('9', 'AU', '10.100.102.122', '2021-04-18 20:46:02'),
('10', 'AU', '10.100.102.111', '2021-04-04 13:19:12'),
('11', 'US', '10.100.112.11', '2021-04-16 12:26:02'),
('12', 'IN', '10.100.102.11', '2021-04-05 15:26:02'),
('13', 'IN', '10.100.102.11', '2021-04-05 19:26:02');
Now I have written this query to get the delta:
SELECT T1.date1 as date, IFNULL(T1.cnt1-T2.cnt2, T1.cnt1) as delta from (
select TA.dateA as date1, MAX(TA.countA) as cnt1 from (
select DATE(created_at) AS dateA, COUNT(*) AS countA
FROM session
GROUP BY DATE(created_at)
UNION
select DISTINCT DATE(DATE(created_at)+1) AS dateA, 0 AS countA
FROM session
) as TA
group by TA.dateA
) as T1
LEFT OUTER JOIN (
select DATE(DATE(created_at)+1) AS date2,
COUNT(*) AS cnt2
FROM session
GROUP BY DATE(created_at)
) as T2
ON T1.date1=T2.date2
ORDER BY date;
http://sqlfiddle.com/#!9/4f5fd26/60
Then I am getting these results:
date delta
2021-04-04 1
2021-04-05 3
2021-04-06 -4
2021-04-12 3
2021-04-13 -3
2021-04-16 3
2021-04-17 -2
2021-04-18 0
2021-04-19 -1
Now, is there any room to improve or optimize it, perhaps with window functions? (I am a complete beginner with SQL, still playing around.)
Try a shorter version:
with grp as (
SELECT t.dateA, SUM(t.cnt) AS countA
FROM session,
LATERAL (
select DATE(created_at) AS dateA, 1 as cnt
union all
select DATE(DATE(created_at)+1), 0 as cnt
) t
GROUP BY dateA
)
select t1.dateA as date, IFNULL(t1.countA-t2.countA, t1.countA) as delta
from grp t1
left join grp t2 on DATE(t2.dateA + 1) = t1.dateA
order by t1.dateA
db<>fiddle
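Since you asked about window functions: on MySQL 8.0+ you could also use LAG() to take the difference between consecutive days. A minimal sketch; note that, unlike the query above, it compares each day with the previous day that has rows and does not produce the trailing zero-count "day after" rows:
select d.dt as date,
       d.cnt - IFNULL(LAG(d.cnt) OVER (ORDER BY d.dt), 0) as delta
from (
    -- one row per calendar day with its session count
    select DATE(created_at) as dt, COUNT(*) as cnt
    from session
    group by DATE(created_at)
) d
order by d.dt;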
SQL God...I need some help!
I have a data table that has a route_complete_percentage column and a created_at column.
I need two pieces of data:
the time stamp (within the created_at column) when route_complete_percentage is at its minimum but not zero
the time stamp (within the created_at column) when route_complete_percentage is at its maximum; it might be 100% or not, but when it is at its highest.
Here is the kicker: there might be multiple time stamps for the highest route completion percentage. For example:
Example Table
I have multiple values when the route_completion_percentage is at its maximum, but I need the minimum time stamp value.
Here is the query so far...but the two time stamps are the same.
SELECT
A.fc,
A.plan_id,
A.route_id,
mintime.first_scan AS First_Batch_Scan,
min(route_complete_percentage),
maxtime.last_scan AS Last_Batch_Scan,
max(route_complete_percentage)
FROM
(SELECT
fc,
plan_id,
route_id,
route_complete_percentage,
CONCAT(plan_id, '-', route_id) AS JOINKEY
FROM
houdini_ops.BATCHINATOR_SCAN_LOGS_V2
WHERE
fc <> ''
AND order_id <> 'Can\'t find order'
AND source = 'scan'
AND created_at > DATE_ADD(CURDATE(), INTERVAL - 3 DAY)) A
LEFT JOIN
(SELECT
l.fc,
l.route_id,
l.plan_id,
CONCAT(l.plan_id, '-', l.route_id) AS JOINKEY,
CASE
WHEN MIN(route_complete_percentage) THEN CONVERT_TZ(l.created_at, 'UTC', s.time_zone)
END AS first_scan
FROM
houdini_ops.BATCHINATOR_SCAN_LOGS_V2 l
JOIN houdini_ops.O_SERVICE_AREA_ATTRIBUTES s ON l.fc = s.default_station_code
WHERE
l.fc <> ''
AND l.order_id <> 'Can\'t find order'
AND l.source = 'scan'
AND l.created_at > DATE_ADD(CURDATE(), INTERVAL - 3 DAY)
GROUP BY fc , plan_id , route_id) mintime ON A.JOINKEY = mintime.JOINKEY
LEFT JOIN
(SELECT
l.fc,
l.route_id,
l.plan_id,
CONCAT(l.plan_id, '-', l.route_id) AS JOINKEY,
CASE
WHEN MAX(route_complete_percentage) THEN CONVERT_TZ(l.created_at, 'UTC', s.time_zone)
END AS last_scan
FROM
houdini_ops.BATCHINATOR_SCAN_LOGS_V2 l
JOIN houdini_ops.O_SERVICE_AREA_ATTRIBUTES s ON l.fc = s.default_station_code
WHERE
l.fc <> ''
AND l.order_id <> 'Can\'t find order'
AND l.source = 'scan'
AND l.created_at > DATE_ADD(CURDATE(), INTERVAL - 3 DAY)
GROUP BY fc , plan_id , route_id) maxtime ON mintime.JOINKEY = maxtime.JOINKEY
GROUP BY fc , plan_id , route_id
I don't want to meddle with the rest of your query. Here is something that will do what it sounds like you need, with sample data included. (I interpreted the blank values in your sample data as NULLs.)
Basically, what you are looking for is the minimum created_at value inside each of the route_complete_percentage groups, so I treated route_complete_percentage as a group identifier. But you only care about two of the groups, so I identify those groups first in the CTE and use them to filter the aggregate query.
if object_id('tempdb.dbo.#Data') is not null drop table #Data
go
create table #Data (
route_complete_percentage int,
created_at datetime
)
insert into #Data (route_complete_percentage, created_at)
values
(0, '20170531 19:58'),
(1, null),
(2, null),
(3, null),
(4, null),
(5, null),
(6, null),
(7, null),
(80, null),
(90, null),
(100, '20170531 20:10'),
(100, '20170531 20:12'),
(100, '20170531 20:15')
;with cteMinMax(min_route_complete_percentage, max_route_complete_percentage) as (
select
min(route_complete_percentage),
max(route_complete_percentage)
from #Data D
-- This ensures the condition that you don't get the timestamp for 0
where D.route_complete_percentage > 0
)
select
route_complete_percentage,
min_created_at = min(created_at)
from #Data D
join cteMinMax MM on D.route_complete_percentage in (MM.min_route_complete_percentage, MM.max_route_complete_percentage)
group by route_complete_percentage
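If you are on MySQL rather than SQL Server, the same idea can be written without temp tables; a rough sketch, using a hypothetical table name scan_logs with the same two columns:
SELECT d.route_complete_percentage,
       MIN(d.created_at) AS min_created_at
FROM scan_logs d
JOIN (
    -- the two groups of interest: lowest non-zero and highest percentage
    SELECT MIN(route_complete_percentage) AS min_pct,
           MAX(route_complete_percentage) AS max_pct
    FROM scan_logs
    WHERE route_complete_percentage > 0
) mm ON d.route_complete_percentage IN (mm.min_pct, mm.max_pct)
GROUP BY d.route_complete_percentage;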
I want to group by keyword name, get the sum of cf1+cf2 (where bug_status = CLOSED or RESOLVED), and get the total sum (irrespective of bug status). The output will have 3 columns, as shown below.
I tried the following query, but no luck:
SELECT keyworddefs.name as keyword,
IFNULL(SUM(bugs.cf1 + bugs.cf2),0) as completed,
(SELECT IFNULL(SUM(bugs.cf1 + bugs.cf2) ,0)
FROM bugs, keywords, keyworddefs
WHERE (keywords.bug_id = bugs .bug_id)
AND (keyworddefs.id=keywords.keywordid)
AND (keyworddefs.name LIKE 'K%')) as total
FROM bugs, keywords, keyworddefs
WHERE (keywords.bug_id = bugs .bug_id)
AND (keyworddefs.id=keywords.keywordid)
AND (bugs.bug_status = 'VERIFIED' OR bugs.bug_status = 'CLOSED')
GROUP BY keyworddefs.name DESC;
SQL Fiddle:
http://sqlfiddle.com/#!9/a11b4/7
Expected:
Matching records:
cf1+cf2 (bug_id, keyword) bug_status
5 (102, 'K1') CLOSED
3 (565, 'K2') CLOSED
3 (1352, 'K1') VERIFIED
4 (13634, 'K1') NEW
# Query output should be:
keyword completed total
K1 8 12
K2 3 3
DDLs:
bugs table 1 (master table):
CREATE TABLE `bugs` (
`bug_id` int(11) NOT NULL,
`bug_date` date NOT NULL,
`cf1` int(11) NOT NULL,
`cf2` int(11) NOT NULL,
`bug_status` varchar(200) NOT NULL)
ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `bugs` (`bug_id`, `bug_date`, `cf1`, `cf2`, `bug_status`) VALUES
(102, '2016-07-19', 2, 1, 'CLOSED'),
(72123, '2016-07-19', 2, 1, 'VERIFIED'),
(57234, '2016-07-19', 2, 1, 'VERIFIED'),
(1352, '2016-07-19', 2, 1, 'VERIFIED'),
(565, '2016-07-19', 2, 1, 'CLOSED'),
(13634, '2016-07-22', 2, 2, 'NEW');
keywords table2 (having keyword ids):
CREATE TABLE `keywords` (
`bug_id` int(11) NOT NULL,
`keywordid` varchar(11) NOT NULL)
ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `keywords` (`bug_id`, `keywordid`) VALUES
(102, '3'),
(565, '4'),
(398, '1'),
(565, '2'),
(1352, '1'),
(57234, '2'),
(1363, '1'),
(72123, '2'),
(13634, '3');
keyworddefs table3 (having keyword names according to keywordid):
CREATE TABLE `keyworddefs` (
`id` int(11) NOT NULL,
`name` varchar(200) NOT NULL,
`description` varchar(200) NOT NULL)
ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `keyworddefs` (`id`, `name`, `description`) VALUES
(1, 'J1', 'My J1 item'),
(2, 'J2', 'My J2 item'),
(3, 'K1', 'My K1 item'),
(4, 'K2', 'My K2 item');
How can I get output like expected?
It looks to me like you're making this too complicated.
For one thing, you should join your keywords and keyworddefs tables ON keywords.keywordid = keyworddefs.name. You're using keyworddefs.id. That's a number. So, your old-timey early 1990s comma join yields no results.
For another thing, you don't need to join the keyworddefs table to get your result.
SUM() operations rarely yield NULL results. So, you should put your conditionals inside the parentheses of SUM() rather than outside.
Finally, you need a GROUP BY operation with two SUM() aggregates in it. One should be conditioned on the bug_status and the other should not.
http://sqlfiddle.com/#!9/a11b4/11/0
Something like this should work.
SELECT keywords.keywordid,
SUM(CASE WHEN bugs.bug_status IN ('CLOSED', 'RESOLVED')
THEN bugs.cf1 + bugs.cf2
ELSE 0 END) completed,
SUM(bugs.cf1 + bugs.cf2) total
FROM bugs
JOIN keywords ON bugs.bug_id = keywords.bug_id
GROUP BY keywords.keywordid
ORDER BY keywords.keywordid
If you need to filter your results by keywords.keywordid LIKE 'K%', you can just add a where clause.
Extended query from Ollie's comment; it works fine with a couple of changes.
Highly appreciated Ollie!
SELECT keyworddefs.name,
SUM(CASE WHEN bugs.bug_status IN ('CLOSED', 'VERIFIED') THEN bugs.cf1 + bugs.cf2 ELSE 0 END) completed,
SUM(bugs.cf1 + bugs.cf2) total
FROM bugs
JOIN keywords ON bugs.bug_id = keywords.bug_id
JOIN keyworddefs ON keyworddefs.id = keywords.keywordid
WHERE keyworddefs.name LIKE 'K%'
GROUP BY keywords.keywordid
ORDER BY keyworddefs.name DESC;