How can I summarize rows that occur only once? - mysql

I have a query which returns the number of rows of a distinct device_type which occur more than once.
SELECT COUNT(*) AS C1,device_type FROM stat
WHERE stat_date = '2012-02-08'
GROUP BY 2 HAVING C1 > 1
ORDER BY 1 DESC
I would like to summarize the remaining rows (those with a count of 1) as 'other'.
How can I return the sum of those counts, with 'other' as the second column, starting from the following query?
SELECT COUNT(*) AS C2,device_type FROM stat
WHERE stat_date = '2012-02-08'
GROUP BY 2 HAVING C2 = 1
ORDER BY 1 DESC
Sample data in DB
device_type
dt1
dt1
dt1
dt2
dt2
dt3
dt4
dt5
expected result
3 dt1
2 dt2
3 other

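I would do this with a UNION, counting the single-occurrence device types in a subquery:
SELECT COUNT(*) AS C1, device_type FROM stat
WHERE stat_date = '2012-02-08'
GROUP BY device_type HAVING C1 > 1
UNION ALL
SELECT COUNT(*), 'other' FROM (
SELECT device_type FROM stat
WHERE stat_date = '2012-02-08'
GROUP BY device_type HAVING COUNT(*) = 1) singles
ORDER BY 1 DESC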

You can also try:
SELECT SUM(C1) AS C1, CASE WHEN C1 = 1 THEN 'other' ELSE device_type END as device_type
FROM ( SELECT COUNT(*) AS C1,
device_type
FROM stat
WHERE stat_date = '2012-02-08'
GROUP BY device_type) A
GROUP BY CASE WHEN C1 = 1 THEN 'other' ELSE device_type END
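To check the CASE-based grouping against the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL here, and the column aliases (total, label, per_type) are made up for the illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stat (device_type TEXT, stat_date TEXT)")
sample = [("dt1",), ("dt1",), ("dt1",), ("dt2",), ("dt2",),
          ("dt3",), ("dt4",), ("dt5",)]
conn.executemany("INSERT INTO stat VALUES (?, '2012-02-08')", sample)

# Inner query: one row per device_type with its count.
# Outer query: fold every type that occurs once into a single 'other' row.
query = """
SELECT SUM(c) AS total,
       CASE WHEN c = 1 THEN 'other' ELSE device_type END AS label
FROM (SELECT device_type, COUNT(*) AS c
      FROM stat
      WHERE stat_date = '2012-02-08'
      GROUP BY device_type) AS per_type
GROUP BY CASE WHEN c = 1 THEN 'other' ELSE device_type END
ORDER BY total DESC, label
"""
result = conn.execute(query).fetchall()
print(result)  # [(3, 'dt1'), (3, 'other'), (2, 'dt2')]
```

The three singleton types (dt3, dt4, dt5) collapse into one 'other' row with a total of 3, which matches the expected result in the question.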

Related

Need to Pick Max Date when status = N otherwise No in MYSQL

I have a table which has records like this:
ID DATEADD STATUS
'A0011' '04/01/2018 11:58:31' 'C'
'A0011' '31/05/2019 10:02:36' 'N'
'B0022' '04/01/2018 11:58:31' 'N'
'B0022' '31/05/2019 10:02:36' 'N'
'B0022' '30/04/2020 19:44:36' 'C'
'C0033' '04/01/2018 11:58:31' 'N'
'C0033' '30/05/2019 06:02:36' 'C'
'C0033' '29/04/2020 05:44:36' 'C'
I'm trying to get, for each ID, the row with the max date, but only if that row has STATUS = 'N'. If the row with the max date has STATUS = 'C', I don't want that record.
Output :
ID DATEADD STATUS
'A0011' '31/05/2019 10:02:36' 'N'
SCRIPT :
SELECT I.* FROM INVOICE I
INNER JOIN (
Select ID,MAX(DATEADD)DATEADD,STATUS FROM INVOICE WHERE STATUS = 'N'
GROUP BY ID,STATUS) O
ON I.ID = O.ID AND O.DATEADD = I.DATEADD
But I'm not able to get desired output.
If your MySQL version supports window functions (8.0 and later), you can use the ROW_NUMBER window function to get the latest DATEADD for each ID and then check its STATUS:
SELECT *
FROM (
SELECT *,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY DATEADD DESC) rn
FROM INVOICE
) t1
WHERE rn = 1 AND STATUS = 'N'
If your MySQL version doesn't support window functions, you can use a correlated subquery that counts the later rows for the same ID; the latest row is the one with no later rows:
SELECT *
FROM (
SELECT *, (SELECT COUNT(*)
FROM INVOICE tt
WHERE tt.ID = t1.ID AND tt.DATEADD > t1.DATEADD) rn
FROM INVOICE t1
) t2
WHERE rn = 0 AND STATUS = 'N'
You can use NOT EXISTS:
SELECT i1.*
FROM INVOICE i1
WHERE i1.STATUS = 'N'
AND NOT EXISTS (
SELECT 1
FROM INVOICE i2
WHERE i2.ID = i1.ID
AND STR_TO_DATE(i2.DATEADD, '%d/%m/%Y %H:%i:%s') > STR_TO_DATE(i1.DATEADD, '%d/%m/%Y %H:%i:%s')
);
If the DATEADD column's data type is DATETIME or TIMESTAMP, the last condition is simpler:
...AND i2.DATEADD > i1.DATEADD
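As a rough check of the NOT EXISTS idea, this sketch loads the question's rows into an in-memory SQLite database through Python's sqlite3 module. It assumes the dates are stored as ISO-formatted strings (so plain string comparison orders them correctly) and uses SQLite only as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoice (id TEXT, dateadd TEXT, status TEXT)")
# Same rows as in the question, with dates rewritten to ISO format (assumption).
conn.executemany("INSERT INTO invoice VALUES (?, ?, ?)", [
    ("A0011", "2018-01-04 11:58:31", "C"),
    ("A0011", "2019-05-31 10:02:36", "N"),
    ("B0022", "2018-01-04 11:58:31", "N"),
    ("B0022", "2019-05-31 10:02:36", "N"),
    ("B0022", "2020-04-30 19:44:36", "C"),
    ("C0033", "2018-01-04 11:58:31", "N"),
    ("C0033", "2019-05-30 06:02:36", "C"),
    ("C0033", "2020-04-29 05:44:36", "C"),
])

# Keep an 'N' row only when no row with the same id has a later date.
latest_n = conn.execute("""
    SELECT i1.id, i1.dateadd, i1.status
    FROM invoice i1
    WHERE i1.status = 'N'
      AND NOT EXISTS (SELECT 1
                      FROM invoice i2
                      WHERE i2.id = i1.id
                        AND i2.dateadd > i1.dateadd)
""").fetchall()
print(latest_n)  # [('A0011', '2019-05-31 10:02:36', 'N')]
```

Only A0011 survives, because the latest rows for B0022 and C0033 have status 'C'.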
If a single row is enough, ORDER BY and LIMIT 1 return it without any functions, subqueries, or CTEs (first query).
Thank you to D-Shih for the test schema.
If we want the maximum date with status 'N' for each ID, we can use the second query.
SELECT
ID,
DATEADD,
STATUS
FROM INVOICE
ORDER BY
STATUS DESC,
DATEADD DESC
LIMIT 1;
ID | DATEADD | STATUS
:---- | :--------- | :-----
A0011 | 2019-05-31 | N
SELECT
ID,
MAX(DATEADD) AS DATEADD,
STATUS
FROM INVOICE
WHERE STATUS = 'N'
GROUP BY ID
ORDER BY ID;
ID | DATEADD | STATUS
:---- | :--------- | :-----
A0011 | 2019-05-31 | N
B0022 | 2019-05-31 | N
C0033 | 2018-01-04 | N

How would I return the result of SQL math operations?

So I was taking a test recently with some higher level SQL problems. I only have what I would consider "intermediate" experience in SQL and I've been working on this for a day or so now. I just can't figure it out.
Here's the problem:
You have a table with 4 columns as such:
EmployeeID int unique
EmployeeType int
EmployeeSalary int
Created date
Goal: I need to retrieve the difference between the latest two EmployeeSalary for any EmployeeType with more than 1 entry. It has to be done in one statement (nested queries are fine).
Example Data Set: http://sqlfiddle.com/#!9/0dfc7
EmployeeID | EmployeeType | EmployeeSalary | Created
-----------|--------------|----------------|--------------------
1 | 53 | 50 | 2015-11-15 00:00:00
2 | 66 | 20 | 2014-11-11 04:20:23
3 | 66 | 30 | 2015-11-03 08:26:21
4 | 66 | 10 | 2013-11-02 11:32:47
5 | 78 | 70 | 2009-11-08 04:47:47
6 | 78 | 45 | 2006-11-01 04:42:55
So for this data set, the proper return would be:
EmployeeType | EmployeeSalary
-------------|---------------
66 | 10
78 | 25
The 10 comes from subtracting the latest two EmployeeSalary values (30 - 20) for the EmployeeType of 66. The 25 comes from subtracting the latest two EmployeeSalary values (70-45) for EmployeeType of 78. We skip EmployeeID 53 completely because it only has one value.
This one has been destroying my brain. Any clues?
Thanks!
How to make really simple query complex?
One funny way (not the best performance) to do it is:
SELECT final.EmployeeType, SUM(salary) AS difference
FROM (
SELECT b.EmployeeType, b.EmployeeSalary AS salary
FROM tab b
JOIN (SELECT EmployeeType, GROUP_CONCAT(EmployeeSalary ORDER BY Created DESC) AS c
FROM tab
GROUP BY EmployeeType
HAVING COUNT(*) > 1) AS sub
ON b.EmployeeType = sub.EmployeeType
AND FIND_IN_SET(b.EmployeeSalary, sub.c) = 1
UNION ALL
SELECT b.EmployeeType, -b.EmployeeSalary AS salary
FROM tab b
JOIN (SELECT EmployeeType, GROUP_CONCAT(EmployeeSalary ORDER BY Created DESC) AS c
FROM tab
GROUP BY EmployeeType
HAVING COUNT(*) > 1) AS sub
ON b.EmployeeType = sub.EmployeeType
AND FIND_IN_SET(b.EmployeeSalary, sub.c) = 2
) AS final
GROUP BY final.EmployeeType;
EDIT:
The key point is that MySQL before 8.0 doesn't support window functions, so you need to use equivalent code.
For example, a solution in SQL Server:
SELECT EmployeeType, SUM(CASE rn WHEN 1 THEN EmployeeSalary
ELSE -EmployeeSalary END) AS difference
FROM (SELECT *,
ROW_NUMBER() OVER(PARTITION BY EmployeeType ORDER BY Created DESC) AS rn
FROM #tab
) AS sub
WHERE rn IN (1,2)
GROUP BY EmployeeType
HAVING COUNT(EmployeeType) > 1
And MySQL equivalent:
SELECT EmployeeType, SUM(CASE rn WHEN 1 THEN EmployeeSalary
ELSE -EmployeeSalary END) AS difference
FROM (
SELECT t1.EmployeeType, t1.EmployeeSalary,
count(t2.Created) + 1 as rn
FROM tab t1
LEFT JOIN tab t2
ON t1.EmployeeType = t2.EmployeeType
AND t1.Created < t2.Created
GROUP BY t1.EmployeeType, t1.EmployeeSalary
) AS sub
WHERE rn IN (1,2)
GROUP BY EmployeeType
HAVING COUNT(EmployeeType) > 1;
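To see the self-join ranking at work on the question's data, here is a small sketch using Python's sqlite3 module. SQLite stands in for MySQL, the table is named tab as in the first query above, and grouping by EmployeeID (instead of salary) is my own choice for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (EmployeeID INTEGER PRIMARY KEY, EmployeeType INTEGER, "
    "EmployeeSalary INTEGER, Created TEXT)"
)
conn.executemany("INSERT INTO tab VALUES (?, ?, ?, ?)", [
    (1, 53, 50, "2015-11-15 00:00:00"),
    (2, 66, 20, "2014-11-11 04:20:23"),
    (3, 66, 30, "2015-11-03 08:26:21"),
    (4, 66, 10, "2013-11-02 11:32:47"),
    (5, 78, 70, "2009-11-08 04:47:47"),
    (6, 78, 45, "2006-11-01 04:42:55"),
])

# rn = 1 + number of rows of the same type created later, so the newest
# row gets rn = 1 and the second newest gets rn = 2.
salary_diffs = conn.execute("""
    SELECT EmployeeType,
           SUM(CASE rn WHEN 1 THEN EmployeeSalary
                       ELSE -EmployeeSalary END) AS difference
    FROM (
        SELECT t1.EmployeeID, t1.EmployeeType, t1.EmployeeSalary,
               COUNT(t2.EmployeeID) + 1 AS rn
        FROM tab t1
        LEFT JOIN tab t2
               ON t2.EmployeeType = t1.EmployeeType
              AND t2.Created > t1.Created
        GROUP BY t1.EmployeeID, t1.EmployeeType, t1.EmployeeSalary
    ) AS ranked
    WHERE rn IN (1, 2)
    GROUP BY EmployeeType
    HAVING COUNT(*) > 1
    ORDER BY EmployeeType
""").fetchall()
print(salary_diffs)  # [(66, 10), (78, 25)]
```

EmployeeType 53 drops out because it only has a row with rn = 1, so the HAVING clause filters it.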
The dataset of the fiddle is different from the example above, which is confusing (not to mention a little perverse). Anyway, there's lots of ways to skin this particular cat. Here's one (not the fastest, however):
SELECT a.employeetype, ABS(a.employeesalary-b.employeesalary) diff
FROM
( SELECT x.*
, COUNT(*) rank
FROM employees x
JOIN employees y
ON y.employeetype = x.employeetype
AND y.created >= x.created
GROUP
BY x.employeetype
, x.created
) a
JOIN
( SELECT x.*
, COUNT(*) rank
FROM employees x
JOIN employees y
ON y.employeetype = x.employeetype
AND y.created >= x.created
GROUP
BY x.employeetype
, x.created
) b
ON b.employeetype = a.employeetype
AND b.rank = a.rank+1
WHERE a.rank = 1;
A very similar but faster solution looks like this (although you sometimes need to use different variables in tables a and b, for reasons I still don't fully understand)...
SELECT a.employeetype
, ABS(a.employeesalary-b.employeesalary) diff
FROM
( SELECT x.*
, CASE WHEN @prev = x.employeetype THEN @i:=@i+1 ELSE @i:=1 END i
, @prev := x.employeetype prev
FROM employees x
, (SELECT @prev := 0, @i:=1) vars
ORDER
BY x.employeetype
, x.created DESC
) a
JOIN
( SELECT x.*
, CASE WHEN @prev = x.employeetype THEN @i:=@i+1 ELSE @i:=1 END i
, @prev := x.employeetype prev
FROM employees x
, (SELECT @prev := 0, @i:=1) vars
ORDER
BY x.employeetype
, x.created DESC
) b
ON b.employeetype = a.employeetype
AND b.i = a.i + 1
WHERE a.i = 1;

MySQL query to select count until you hit a specific value

I have this table e.g.:
Id StatusDate Status
1 20-08-2014
1 15-08-2014
1 09-08-2014 P
2 17-08-2014
1 10-08-2014
2 12-08-2014
2 06-07-2014 P
1 30-07-2014
2 02-07-2014
2 01-07-2014 P
...... and so on
I want to select count by ID where status is blank until I hit the first 'P' in ascending order of date group by ID. So my results will be like this.
ID Count
1 3
2 2
Try it out. Not tested
SELECT t1.ID, count(*) FROM `table` t1
WHERE t1.StatusDate > (SELECT MAX(t2.StatusDate) FROM `table` t2
WHERE t1.ID = t2.ID AND t2.Status = 'P')
GROUP BY t1.ID
Assuming your table name is StatusTable, this will work:
SELECT
ID,
COUNT(*) AS `Count`
FROM StatusTable AS st
WHERE
st.Status = ''
AND st.StatusDate > (
SELECT st2.StatusDate
FROM `StatusTable` AS st2
WHERE st.ID = st2.ID
AND st2.Status = 'P'
ORDER BY st2.StatusDate DESC
LIMIT 1
)
GROUP BY st.ID
ORDER BY st.ID
One option is to use a JOIN and count the rows whose statusdate is later than that of the 'P' row, like this:
SELECT t1.id, SUM(CASE WHEN t1.statusdate > t2.statusdate THEN 1 ELSE 0 END) AS mycount
FROM t t1 JOIN (
SELECT id, MIN(statusdate) statusdate
FROM t
WHERE status = 'P'
GROUP BY id
) t2
ON t1.id = t2.id
GROUP BY t1.id
Working Demo: http://sqlfiddle.com/#!2/d9d91/2
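The expected output in the question (3 for ID 1, 2 for ID 2) matches counting the blank rows dated after each ID's most recent 'P'. Here is a hedged sketch of that reading, run with Python's sqlite3 module as a stand-in for MySQL. It assumes the table is called StatusTable, that blank statuses are empty strings, and that dates are stored in ISO format so they compare correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StatusTable (Id INTEGER, StatusDate TEXT, Status TEXT)")
# The question's rows, with dates rewritten as YYYY-MM-DD (assumption).
conn.executemany("INSERT INTO StatusTable VALUES (?, ?, ?)", [
    (1, "2014-08-20", ""), (1, "2014-08-15", ""), (1, "2014-08-09", "P"),
    (2, "2014-08-17", ""), (1, "2014-08-10", ""), (2, "2014-08-12", ""),
    (2, "2014-07-06", "P"), (1, "2014-07-30", ""), (2, "2014-07-02", ""),
    (2, "2014-07-01", "P"),
])

# Derived table p: the most recent 'P' date per Id.
# Outer query: count blank rows dated after that 'P'.
blank_counts = conn.execute("""
    SELECT s.Id, COUNT(*) AS blank_count
    FROM StatusTable s
    JOIN (SELECT Id, MAX(StatusDate) AS last_p
          FROM StatusTable
          WHERE Status = 'P'
          GROUP BY Id) p
      ON p.Id = s.Id
    WHERE s.Status = ''
      AND s.StatusDate > p.last_p
    GROUP BY s.Id
    ORDER BY s.Id
""").fetchall()
print(blank_counts)  # [(1, 3), (2, 2)]
```

Using MIN instead of MAX in the derived table would pick the earliest 'P' and give a different count for ID 2 on this data, so the choice depends on which 'P' is meant.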

Difficult MySQL Query - Getting Max difference between dates

I have a MySQL table of the following form
account_id | call_date
1 2013-06-07
1 2013-06-09
1 2013-06-21
2 2012-05-01
2 2012-05-02
2 2012-05-06
I want to write a MySQL query that will get the maximum difference (in days) between successive dates in call_date for each account_id. So for the above example, the result of this query would be
account_id | max_diff
1 12
2 4
I'm not sure how to do this. Is this even possible to do in a MySQL query?
I can do datediff(max(call_date),min(call_date)) but this would ignore dates in between the first and last call dates. I need some way of getting the datediff() between each successive call_date for each account_id, then finding the maximum of those.
I'm sure fp's answer will be faster, but just for fun...
SELECT account_id
, MAX(diff) max_diff
FROM
( SELECT x.account_id
, DATEDIFF(MIN(y.call_date),x.call_date) diff
FROM my_table x
JOIN my_table y
ON y.account_id = x.account_id
AND y.call_date > x.call_date
GROUP
BY x.account_id
, x.call_date
) z
GROUP
BY account_id;
CREATE TABLE t
(`account_id` int, `call_date` date)
;
INSERT INTO t
(`account_id`, `call_date`)
VALUES
(1, '2013-06-07'),
(1, '2013-06-09'),
(1, '2013-06-21'),
(2, '2012-05-01'),
(2, '2012-05-02'),
(2, '2012-05-06')
;
select account_id, max(diff) from (
select
account_id,
timestampdiff(day, coalesce(@prev, call_date), call_date) diff,
@prev := call_date
from
t
, (select @prev:=null) v
order by account_id, call_date
) sq
group by account_id
| ACCOUNT_ID | MAX(DIFF) |
|------------|-----------|
| 1 | 12 |
| 2 | 4 |
If you have an index on account_id, call_date, then you can do this rather efficiently without variables:
select account_id, max(datediff(call_date, prev_call_date)) as diff
from (select t.*,
(select t2.call_date
from `table` t2
where t2.account_id = t.account_id and t2.call_date < t.call_date
order by t2.call_date desc
limit 1
) as prev_call_date
from `table` t
) t
group by account_id;
Just for educational purposes, doing it with JOIN:
SELECT t1.account_id,
MAX(DATEDIFF(t2.call_date, t1.call_date)) AS max_diff
FROM t t1
LEFT JOIN t t2
ON t2.account_id = t1.account_id
AND t2.call_date > t1.call_date
LEFT JOIN t t3
ON t3.account_id = t1.account_id
AND t3.call_date > t1.call_date
AND t3.call_date < t2.call_date
WHERE t3.account_id IS NULL
GROUP BY t1.account_id
Since you didn't specify, this shows max_diff of NULL for accounts with only 1 call.
SELECT a1.account_id, MAX(DATEDIFF(a1.call_date, a2.call_date)) AS max_diff
FROM account a2, account a1
WHERE a1.account_id = a2.account_id
AND a1.call_date > a2.call_date
AND NOT EXISTS
(SELECT 1 FROM account a3 WHERE a3.account_id = a1.account_id AND a1.call_date > a3.call_date AND a2.call_date < a3.call_date)
GROUP BY a1.account_id
Which gives:
ACCOUNT_ID MAX_DIFF
1 12
2 4
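For anyone who wants to try the "next call date" idea without a MySQL server, here is a minimal sketch with Python's sqlite3 module. SQLite has no DATEDIFF, so the day difference is computed with julianday() instead; that swap, and the aliases next_date and gaps, are assumptions made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (account_id INTEGER, call_date TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    (1, "2013-06-07"), (1, "2013-06-09"), (1, "2013-06-21"),
    (2, "2012-05-01"), (2, "2012-05-02"), (2, "2012-05-06"),
])

# For each call, find the next call date of the same account, then take
# the largest gap per account. The last call of each account has no next
# date, so its gap is NULL and MAX ignores it.
max_gaps = conn.execute("""
    SELECT account_id,
           MAX(CAST(julianday(next_date) - julianday(call_date) AS INTEGER))
               AS max_diff
    FROM (SELECT c.account_id,
                 c.call_date,
                 (SELECT MIN(n.call_date)
                  FROM t n
                  WHERE n.account_id = c.account_id
                    AND n.call_date > c.call_date) AS next_date
          FROM t c) AS gaps
    GROUP BY account_id
    ORDER BY account_id
""").fetchall()
print(max_gaps)  # [(1, 12), (2, 4)]
```

In MySQL itself, DATEDIFF(next_date, call_date) would take the place of the julianday() arithmetic.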

Using multiple group by having in single query

I have 2 queries to get the count of families having count = 1 and count = 2.
SELECT Name, count(*) as c FROM Tablename GROUP BY HOUSE_NO HAVING c<=1;
SELECT Name, count(*) as c FROM Tablename GROUP BY HOUSE_NO HAVING c>=2 and c<=4;
But I need to combine those queries into a single query, like:
count1 count2
nooffamiliesHavingcount = 1 nooffamiliesHavingcount = 2
Please help me....Thanks in advance..
You need to put your first count into a subquery:
SELECT COUNT(CASE WHEN C = 1 THEN 1 END) AS nooffamiliesHavingcount1,
COUNT(CASE WHEN C = 2 THEN 1 END) AS nooffamiliesHavingcount2
FROM ( SELECT COUNT(*) AS C
FROM TableName
GROUP BY House_No
) t
WHERE c IN (1, 2);
EDIT
If you need to do ranges in your count you can use this:
SELECT COUNT(CASE WHEN C <= 1 THEN 1 END) AS nooffamiliesHavingcount1,
COUNT(CASE WHEN C BETWEEN 2 AND 4 THEN 1 END) AS nooffamiliesHavingcount2,
COUNT(CASE WHEN C > 4 THEN 1 END) AS nooffamiliesHavingcount3
FROM ( SELECT COUNT(*) AS C
FROM TableName
GROUP BY House_No
) t
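Here is a quick way to try the conditional-count pattern, sketched with Python's sqlite3 module as a stand-in for MySQL. The household sizes and person names below are invented test data, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (Name TEXT, House_No INTEGER)")

# Hypothetical data: house number -> number of people living there.
house_sizes = {1: 1, 2: 2, 3: 3, 4: 5, 5: 1}
people = [(f"person_{house}_{n}", house)
          for house, size in house_sizes.items()
          for n in range(size)]
conn.executemany("INSERT INTO TableName VALUES (?, ?)", people)

# Inner query: one row per house with its head count C.
# Outer query: COUNT only counts non-NULL values, so each CASE
# counts the houses that fall into its size range.
size_buckets = conn.execute("""
    SELECT COUNT(CASE WHEN C <= 1 THEN 1 END)          AS count1,
           COUNT(CASE WHEN C BETWEEN 2 AND 4 THEN 1 END) AS count2,
           COUNT(CASE WHEN C > 4 THEN 1 END)           AS count3
    FROM (SELECT COUNT(*) AS C
          FROM TableName
          GROUP BY House_No) t
""").fetchone()
print(size_buckets)  # (2, 2, 1)
```

Houses 1 and 5 land in the first bucket, houses 2 and 3 in the second, and house 4 in the third.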
SELECT CASE WHEN c <= 1 THEN "<=1"
WHEN c BETWEEN 2 and 4 THEN "2-4"
END familysize,
COUNT(*) nooffamilies
FROM (SELECT HOUSE_NO, count(*) c
FROM Tablename
GROUP BY HOUSE_NO) x
GROUP BY familysize
HAVING familysize IS NOT NULL