merge sum of counts from id in mysql - mysql

I am not a MySQL expert and I need help with a count query. I need to merge the counts for ids 5 and 11 into one number (286) and report the platform name as GCS in this case.
SELECT DISTINCT (p.id) AS id, (p.name) AS platform,
IFNULL(count(e.id), 0) AS count
FROM event e, lu_platform p
WHERE e.platform_id = p.id
AND p.id NOT IN ( 10, 15, 17, 18 )
AND e.sourcetype_id = 1
AND e.event_datetime BETWEEN '2013-11-4'
AND '2013-11-10' AND e.sender_id NOT IN ( 759, 73 )
GROUP BY p.id ORDER BY id;
+----+---------------------------+-------+
| id | platform | count |
+----+---------------------------+-------+
| 3 | GGG | 414 |
| 4 | KIKI | 156 |
| 5 | KJC | 284 |
| 6 | LOLO | 4 |
| 7 | MOD | 1147 |
| 8 | MARKT | 1049 |
| 11 | GCS | 2 |
| 12 | POLAR | 30 |
| 14 | GUAE | 145 |
+----+---------------------------+-------+

One possible way to do it: wrap your query in a subquery and sum over it, using either the IF function or a CASE WHEN ... THEN expression:
SELECT sum( case when id in ( 5, 11 ) then count else 0 end ) as count_5_11,
sum( if( id in ( 3, 4 ), count, 0 ) ) As count_3_4
FROM (
-- your query goes here
SELECT DISTINCT (p.id) AS id, (p.name) AS platform,
IFNULL(count(e.id), 0) AS count
....
....
....
....
....
) AS some_alias

Try using a subquery:
SELECT SUM(count) as count, 'GCS' as platform
FROM
(
SELECT DISTINCT (p.id) AS id, (p.name) AS platform,
IFNULL(count(e.id), 0) AS count
FROM event e, lu_platform p
WHERE e.platform_id = p.id
AND p.id NOT IN ( 10, 15, 17, 18 )
AND e.sourcetype_id = 1
AND e.event_datetime BETWEEN '2013-11-4'
AND '2013-11-10' AND e.sender_id NOT IN ( 759, 73 )
GROUP BY p.id
) T
WHERE id IN (5,11)

Thank you for reply.
With both of the answers above I get only one row instead of the list. I think the best way is to do this in the application, as sarwar026 mentioned.
I also noticed that IFNULL(count(e.id), 0) does not work as I expected: platforms without records do not return 0, they are skipped.
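Both answers return a single row because the outer query aggregates without a GROUP BY, and platforms without events are skipped because the inner join drops them before COUNT runs. A minimal sketch of one way to keep the list, fold ids 5 and 11 into one GCS row, and still report 0 for empty platforms is below. It uses Python's built-in sqlite3 purely as a stand-in for MySQL, with made-up sample rows; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lu_platform (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE event (id INTEGER PRIMARY KEY, platform_id INTEGER);
INSERT INTO lu_platform VALUES (4, 'KIKI'), (5, 'KJC'), (6, 'LOLO'), (11, 'GCS');
-- hypothetical sample events: 3 for KJC, 2 for GCS, 1 for KIKI, none for LOLO
INSERT INTO event (platform_id) VALUES (5), (5), (5), (11), (11), (4);
""")

# Group on a CASE expression so ids 5 and 11 land in the same bucket.
# LEFT JOIN keeps platforms with no events; COUNT(e.id) ignores the NULLs.
query = """
SELECT CASE WHEN p.id IN (5, 11) THEN 'GCS' ELSE p.name END AS platform,
       COUNT(e.id) AS cnt
FROM lu_platform p
LEFT JOIN event e ON e.platform_id = p.id
GROUP BY CASE WHEN p.id IN (5, 11) THEN 'GCS' ELSE p.name END
ORDER BY platform
"""
rows = conn.execute(query).fetchall()
for platform, cnt in rows:
    print(platform, cnt)
```

If this is carried over to the original query, the filters on e (sourcetype_id, event_datetime, sender_id) would have to move from WHERE into the ON clause; left in WHERE, they turn the LEFT JOIN back into an inner join.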

Related

How to get max value of a grouped counted variable in MySQL

I have a MySQL table like this;
recordID| netcall | sign | activity | netid
1 | group1 | wa1 | 1 | 20
2 | group2 | wa2 | 2 | 30
3 | group1 | wa2 | 1 | 20
4 | group2 | wa3 | 2 | 30
5 | group1 | wa1 | 1 | 40
6 | group3 | wa4 | 3 | 50
7 | group3 | wa4 | 3 | 50
8 | group1 | wa2 | 1 | 40
9 | group1 | wa1 | 1 | 40
10 | group2 | wa4 | 2 | 60
What I need from that is:
Netcall | count | activity | netid
Group1 | 3 | 1 | 40
Group2 | 2 | 2 | 30
Group3 | 2 | 3 | 50
I thought I could;
SELECT MAX(xx.mycount) AS MAXcount
FROM (SELECT COUNT(tt.sign) AS mycount ,tt.activity
FROM NetLog tt
WHERE ID <> 0
GROUP BY netcall) xx
But this only brings up the grand total not broken down by netcall. I don't see an example of this question but I'm sure there is one, I'm just asking it wrong.
Your example and desired output are too basic; you should expand them to include more cases.
Right now you can get the desired output with:
SELECT `netcall`, COUNT(*) as `total`, MAX(`activity`) as `activity`
FROM t
GROUP BY `netcall`;
My guess is that a group can have different activities, so you need multiple steps:
Calculate the COUNT() for GROUP BY netcall, activity. I call this q.
Then find the MAX(total) for each netcall. I call this p.
Now reuse q as o, which holds all the counts, and select the rows whose count equals the max.
SELECT o.`netcall`, o.total, o.`activity`
FROM (
SELECT `netcall`, COUNT(*) `total`, `activity`
FROM t
GROUP BY `netcall`, `activity`
) o
JOIN (
SELECT `netcall`, MAX(`total`) as `total`
FROM (
SELECT `netcall`, COUNT(*) `total`
FROM t
GROUP BY `netcall`, `activity`
) q
GROUP BY `netcall`
) p
ON o.`netcall` = p.`netcall`
AND o.`total` = p.`total`
With MySQL v8+ you can use CTEs and a window function to simplify this a little:
with group_count as (
SELECT `netcall`, COUNT(*) as total, `activity`
FROM t
GROUP BY `netcall`, `activity`
), group_sort as (
SELECT `netcall`, total, `activity`,
RANK() OVER (PARTITION BY `netcall` ORDER BY total DESC) as rnk
FROM group_count
)
SELECT *
FROM group_sort
WHERE rnk = 1
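As a rough, runnable check of the window-function approach on the question's sample rows, the sketch below groups by netcall, netid and activity and keeps the top-ranked group per netcall. Grouping by netid as well is an assumption about what the asker wants, based on the expected output listing the busiest netid for each netcall. Python's sqlite3 (SQLite 3.25 or newer, for window functions) stands in for MySQL 8 here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE NetLog (recordID INTEGER, netcall TEXT, sign TEXT, "
    "activity INTEGER, netid INTEGER)"
)
# Sample rows copied from the question.
conn.executemany("INSERT INTO NetLog VALUES (?, ?, ?, ?, ?)", [
    (1, 'group1', 'wa1', 1, 20), (2, 'group2', 'wa2', 2, 30),
    (3, 'group1', 'wa2', 1, 20), (4, 'group2', 'wa3', 2, 30),
    (5, 'group1', 'wa1', 1, 40), (6, 'group3', 'wa4', 3, 50),
    (7, 'group3', 'wa4', 3, 50), (8, 'group1', 'wa2', 1, 40),
    (9, 'group1', 'wa1', 1, 40), (10, 'group2', 'wa4', 2, 60),
])

query = """
WITH group_count AS (
    SELECT netcall, netid, activity, COUNT(*) AS total
    FROM NetLog
    GROUP BY netcall, netid, activity
), ranked AS (
    -- rank the groups inside each netcall, largest count first
    SELECT netcall, total, activity, netid,
           RANK() OVER (PARTITION BY netcall ORDER BY total DESC) AS rnk
    FROM group_count
)
SELECT netcall, total, activity, netid
FROM ranked
WHERE rnk = 1
ORDER BY netcall
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

RANK() returns every tied group; swapping in ROW_NUMBER() with an extra tie-breaker in the ORDER BY would force exactly one row per netcall.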
This question is asked (and answered) every day on SO; it even has its own chapter in the MySQL manual, but anyway...
SELECT a.netcall
, b.total
, a.activity
FROM netlog a
JOIN
( SELECT netcall
, MAX(record_id) record_id
, COUNT(*) total
FROM netlog
GROUP
BY netcall
) b
ON b.netcall = a.netcall
AND b.record_id = a.record_id
SELECT k.netcall, k.netID, MAX(k.logins) highest,
AVG(k.logins) average, netDate, activity
FROM
(SELECT netID, netcall, COUNT(*) logins, DATE(`logdate`) as netDate, activity
FROM NetLog
WHERE netID <> 0 AND status = 1
AND netcall <> '0' AND netcall <> ''
GROUP BY netcall, netID) k
GROUP BY netcall
ORDER BY highest DESC
Resulted in:
Net Call | Highest | Average | Net ID | Sub Net Of... | ICS | Map | Date | Activity
MESN | 65 | 41.5294 | 339 | | 214 309 | MAP | 2017-09-03 | MESN
W0KCN | 34 | 14.9597 | 1 | | 214 309 | MAP | 2016-03-15 | KCNARES Weekly 2m Voice Net
W0ERH | 31 | 31.0000 | 883 | | 214 309 | MAP | 2018-10-12 | Johnson Co. Radio Amateurs Club Meeting Net
KCABARC | 29 | 22.3333 | 57 | | 214 309 | MAP | 2016-10-10 | KCA Blind Amateurs Weekly 2m Voice Net
....

Optimizing SQL Query for max value with various conditions from a single MySQL table

I have the following SQL query
SELECT *
FROM `sensor_data` AS `sd1`
WHERE (sd1.timestamp BETWEEN '2017-05-13 00:00:00'
AND '2017-05-14 00:00:00')
AND (`id` =
(
SELECT `id`
FROM `sensor_data` AS `sd2`
WHERE sd1.mid = sd2.mid
AND sd1.sid = sd2.sid
ORDER BY `value` DESC, `id` DESC
LIMIT 1)
)
Background:
I've checked the validity of the query by changing LIMIT 1 to LIMIT 0, and the query works without any problem. However with LIMIT 1 the query doesn't complete, it just states loading until I shutdown and restart.
Breaking the Query down:
I have broken down the query with the date boundary as follows:
SELECT *
FROM `sensor_data` AS `sd1`
WHERE (sd1.timestamp BETWEEN '2017-05-13 00:00:00'
AND '2017-05-14 00:00:00')
This takes about 0.24 seconds to return the query with 8200 rows each having 5 columns.
Question:
I suspect the second half of my query is not correct or not well optimized.
The tables are as follows:
Current Table:
+------+-------+-------+-----+-----------------------+
| id | mid | sid | v | timestamp |
+------+-------+-------+-----+-----------------------+
| 51 | 10 | 1 | 40 | 2015-05-13 11:56:01 |
| 52 | 10 | 2 | 39 | 2015-05-13 11:56:25 |
| 53 | 10 | 2 | 40 | 2015-05-13 11:56:42 |
| 54 | 10 | 2 | 40 | 2015-05-13 11:56:45 |
| 55 | 10 | 2 | 40 | 2015-05-13 11:57:01 |
| 56 | 11 | 1 | 50 | 2015-05-13 11:57:52 |
| 57 | 11 | 2 | 18 | 2015-05-13 11:58:41 |
| 58 | 11 | 2 | 19 | 2015-05-13 11:58:59 |
| 59 | 11 | 3 | 58 | 2015-05-13 11:59:01 |
| 60 | 11 | 3 | 65 | 2015-05-13 11:59:29 |
+------+-------+-------+-----+-----------------------+
Q: How would I get the MAX(v) for each sid for each mid?
NB#1: In the example above ROW 53, 54, 55 have all the same value (40), but I would like to retrieve the row with the most recent timestamp, which is ROW 55.
Expected Output:
+------+-------+-------+-----+-----------------------+
| id | mid | sid | v | timestamp |
+------+-------+-------+-----+-----------------------+
| 51 | 10 | 1 | 40 | 2015-05-13 11:56:01 |
| 55 | 10 | 2 | 40 | 2015-05-13 11:57:01 |
| 56 | 11 | 1 | 50 | 2015-05-13 11:57:52 |
| 58 | 11 | 2 | 19 | 2015-05-13 11:58:59 |
| 60 | 11 | 3 | 65 | 2015-05-13 11:59:29 |
+------+-------+-------+-----+-----------------------+
Structure of the table:
NB#2:
Since this table has over 110 million entries, it is critical to have date boundaries, which limit the result to ~8000 entries over a 24 hour period.
The query can be written as follows:
SELECT t1.id, t1.mid, t1.sid, t1.v, t1.ts
FROM yourtable t1
INNER JOIN (
SELECT mid, sid, MAX(v) as v
FROM yourtable
WHERE ts BETWEEN '2015-05-13 00:00:00' AND '2015-05-14 00:00:00'
GROUP BY mid, sid
) t2
ON t1.mid = t2.mid
AND t1.sid = t2.sid
AND t1.v = t2.v
INNER JOIN (
SELECT mid, sid, v, MAX(ts) as ts
FROM yourtable
WHERE ts BETWEEN '2015-05-13 00:00:00' AND '2015-05-14 00:00:00'
GROUP BY mid, sid, v
) t3
ON t1.mid = t3.mid
AND t1.sid = t3.sid
AND t1.v = t3.v
AND t1.ts = t3.ts;
Edit and Explanation:
The first sub-query (first INNER JOIN) fetches MAX(v) per (mid, sid) combination. The second sub-query is to identify MAX(ts) for every (mid, sid, v). At this point, the two queries do not influence each others' results. It is also important to note that ts date range selection is done in the two sub-queries independently such that the final query has fewer rows to examine and no additional WHERE filters to apply.
Effectively, this translates into getting MAX(v) per (mid, sid) combination initially (first sub-query); and if there is more than one record with the same value MAX(v) for a given (mid, sid) combo, then the excess records get eliminated by the selection of MAX(ts) for every (mid, sid, v) combination obtained by the second sub-query. We then simply associate the output of the two queries by the two INNER JOIN conditions to get to the id of the desired records.
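To see the two derived tables working together, here is a small sketch that loads the question's ten sample rows and runs the same double-join query. Python's sqlite3 is used only as a stand-in for MySQL; the table is named sensor_data and the timestamp column ts, following the answer's naming, and storing the timestamps as text is an assumption of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sensor_data (id INTEGER PRIMARY KEY, mid INTEGER, "
    "sid INTEGER, v INTEGER, ts TEXT)"
)
# Sample rows copied from the question.
conn.executemany("INSERT INTO sensor_data VALUES (?, ?, ?, ?, ?)", [
    (51, 10, 1, 40, '2015-05-13 11:56:01'), (52, 10, 2, 39, '2015-05-13 11:56:25'),
    (53, 10, 2, 40, '2015-05-13 11:56:42'), (54, 10, 2, 40, '2015-05-13 11:56:45'),
    (55, 10, 2, 40, '2015-05-13 11:57:01'), (56, 11, 1, 50, '2015-05-13 11:57:52'),
    (57, 11, 2, 18, '2015-05-13 11:58:41'), (58, 11, 2, 19, '2015-05-13 11:58:59'),
    (59, 11, 3, 58, '2015-05-13 11:59:01'), (60, 11, 3, 65, '2015-05-13 11:59:29'),
])

query = """
SELECT t1.id, t1.mid, t1.sid, t1.v, t1.ts
FROM sensor_data t1
INNER JOIN (            -- t2: highest v per (mid, sid)
    SELECT mid, sid, MAX(v) AS v
    FROM sensor_data
    WHERE ts BETWEEN '2015-05-13 00:00:00' AND '2015-05-14 00:00:00'
    GROUP BY mid, sid
) t2 ON t1.mid = t2.mid AND t1.sid = t2.sid AND t1.v = t2.v
INNER JOIN (            -- t3: latest ts per (mid, sid, v), breaks ties on v
    SELECT mid, sid, v, MAX(ts) AS ts
    FROM sensor_data
    WHERE ts BETWEEN '2015-05-13 00:00:00' AND '2015-05-14 00:00:00'
    GROUP BY mid, sid, v
) t3 ON t1.mid = t3.mid AND t1.sid = t3.sid AND t1.v = t3.v AND t1.ts = t3.ts
ORDER BY t1.id
"""
rows = conn.execute(query).fetchall()
ids = [row[0] for row in rows]
print(ids)
```

On this data the query keeps ids 51, 55, 56, 58 and 60, which matches the expected output in the question, including row 55 winning the three-way tie on v = 40.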
select * from sensor_data s1 where s1.v in (select max(v) from sensor_data s2 group by s2.mid)
union
select * from sensor_data s1 where s1.v in (select max(v) from sensor_data s2 group by s2.sid);
IN ( SELECT ... ) does not optimize well. It is even worse because of being correlated.
What you are looking for is a groupwise-max.
Please provide SHOW CREATE TABLE; we need to know at least what the PRIMARY KEY is.
Suggested code
You will need:
With the WHERE: INDEX(timestamp, mid, sid, v, id)
Without the WHERE: INDEX(mid, sid, v, timestamp, id)
Code:
SELECT id, mid, sid, v, timestamp
FROM ( SELECT @prev_mid := 99999, -- some value not in table
@prev_sid := 99999,
@n := 0 ) AS init
JOIN (
SELECT @n := if(mid != @prev_mid OR
sid != @prev_sid,
1, @n + 1) AS n,
@prev_mid := mid,
@prev_sid := sid,
id, mid, sid, v, timestamp
FROM sensor_data
WHERE timestamp >= '2017-05-13'
AND timestamp < '2017-05-13' + INTERVAL 1 DAY
ORDER BY mid DESC, sid DESC, v DESC, timestamp DESC
) AS x
WHERE n = 1
ORDER BY mid, sid; -- optional
Notes:
The index is 'composite' and 'covering'.
This should make one pass over the index, thereby providing 'good' performance.
The final ORDER BY is optional; the results may be in reverse order.
All the DESC in the inner ORDER BY must be in place to work correctly (unless you are using MySQL 8.0).
Note how the WHERE avoids including both midnights? And avoids manually computing leap-days, year-ends, etc?
With the WHERE (and associated INDEX), there will be filtering, but a 'sort' will be needed.
Without the WHERE (and the other INDEX), sort will not be needed.
You can test the performance of any competing formulations via this trick, even if you do not have enough rows (yet) to get reliable timings:
FLUSH STATUS;
SELECT ...
SHOW SESSION STATUS LIKE 'Handler%';
This can also be used to compare different versions of MySQL and MariaDB -- I have seen 3 significantly different performance characteristics in a related groupwise-max test.

MySQL top 2 records per group

Basically I need to get only the last 2 records for each user, considering the last created_datetime:
id | user_id | created_datetime
1 | 34 | '2015-09-10'
2 | 34 | '2015-10-11'
3 | 34 | '2015-05-23'
4 | 34 | '2015-09-13'
5 | 159 | '2015-10-01'
6 | 159 | '2015-10-02'
7 | 159 | '2015-10-03'
8 | 159 | '2015-10-06'
Returns (expected output):
2 | 34 | '2015-10-11'
4 | 34 | '2015-09-13'
7 | 159 | '2015-10-03'
8 | 159 | '2015-10-06'
I was trying with this idea:
select user_id, created_datetime,
@num := if(@user_id = user_id, @num + 1, 1) as row_number,
@id := user_id as dummy
from logs group by user_id
having row_number <= 2
The idea is to keep only these top 2 rows and remove all the others.
Any ideas?
Your idea is close. I think this will work better:
select u.*
from (select user_id, created_datetime,
@num := if(@user_id = user_id, @num + 1,
if(@user_id := user_id, 1, 1)
) as row_number
from logs cross join
(select @user_id := 0, @num := 0) params
order by user_id, created_datetime desc
) u
where row_number <= 2 ;
Here are the changes:
The variables are set in only one expression. MySQL does not guarantee the order of evaluation of expressions, so this is important.
The work is done in a subquery, which is then processed in the outer query.
The subquery uses order by, not group by.
The outer query uses where instead of having (actually, in MySQL having would work, but where is more appropriate).
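On MySQL 8.0 or later, the same "top 2 per user" result can be had without user variables by numbering rows with ROW_NUMBER(). The sketch below is an alternative to the answer above, not a restatement of it, and runs on Python's sqlite3 (SQLite 3.25+) as a stand-in for MySQL 8, using the question's sample rows. Note that for user 34 the two most recent rows by created_datetime are ids 2 and 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "created_datetime TEXT)"
)
# Sample rows copied from the question.
conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", [
    (1, 34, '2015-09-10'), (2, 34, '2015-10-11'),
    (3, 34, '2015-05-23'), (4, 34, '2015-09-13'),
    (5, 159, '2015-10-01'), (6, 159, '2015-10-02'),
    (7, 159, '2015-10-03'), (8, 159, '2015-10-06'),
])

query = """
SELECT id, user_id, created_datetime
FROM (
    -- number each user's rows, newest first
    SELECT l.*,
           ROW_NUMBER() OVER (PARTITION BY user_id
                              ORDER BY created_datetime DESC) AS rn
    FROM logs l
) ranked
WHERE rn <= 2
ORDER BY user_id, created_datetime DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The alias is rn rather than row_number because ROW_NUMBER is a reserved word in MySQL 8.0.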

grouping resultset - mysql

I have the following sql which returns the total number of books grouped by status
select COUNT(BOOK_ID) AS book_num, BOOK_STATUS_FK from BOOKS group by BOOK_STATUS_FK;
+---------+------------------+
| book_num | BOOK_STATUS_FK |
+---------+------------------+
| 57 | 2 |
| 162 | 3 |
| 9736 | 4 |
| 104 | 5 |
| 29 | 22 |
| 1 | 23 |
| 5 | 25 |
| 14 | 54 |
+---------+------------------+
I would like to group the resultset into 2 rows only where one row represents the number of books with BOOK_STATUS_FK > 4 and the 2nd to represent the number of books with BOOK_STATUS_FK <= 4
Is there a way of doing that in sql?
Thanks for your suggestions.
The 2 row solution Gordon Linoff suggests won't produce 2 rows when one of the counts is 0.
The following will give both counts in a single row:
select ifnull( sum( if( book_status_fk > 4, 1, 0 ) ), 0), ifnull( sum( if( book_status_fk <= 4, 1, 0 ) ), 0 )
from books
Edit: added ifnull's
This is an aggregation with a case expression:
select (case when book_status_fk > 4 then '>4' else '<=4' end) as grp, count(*)
from books
group by (case when book_status_fk > 4 then '>4' else '<=4' end)
If you always need two rows, even if count of a group is 0, you can use palindrom's solution or you can use this slightly modified version of Gordon Linoff's query:
select grp.g, count(BOOK_STATUS_FK)
from
(select '<=4' g union all select '>4') grp left join books
on grp.g = case when book_status_fk > 4 then '>4' else '<=4' end
group by grp.g
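The difference between these variants only shows up when one bucket is empty. A small sketch, assuming a books table in which every status is 4 or lower (made-up rows), runs both the single-row conditional sum and the LEFT JOIN version; Python's sqlite3 stands in for MySQL, so CASE and COALESCE are used in place of IF and IFNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (book_id INTEGER PRIMARY KEY, book_status_fk INTEGER)"
)
# Hypothetical data: four books, none with a status above 4.
conn.executemany("INSERT INTO books (book_status_fk) VALUES (?)",
                 [(2,), (3,), (4,), (4,)])

# One row, two columns: counts for status > 4 and status <= 4.
single_row = conn.execute("""
SELECT COALESCE(SUM(CASE WHEN book_status_fk > 4 THEN 1 ELSE 0 END), 0),
       COALESCE(SUM(CASE WHEN book_status_fk <= 4 THEN 1 ELSE 0 END), 0)
FROM books
""").fetchone()

# Always two rows: the derived table supplies both labels,
# and the LEFT JOIN keeps a label even when no book matches it.
two_rows = conn.execute("""
SELECT grp.g, COUNT(b.book_status_fk)
FROM (SELECT '<=4' AS g UNION ALL SELECT '>4') grp
LEFT JOIN books b
  ON grp.g = CASE WHEN b.book_status_fk > 4 THEN '>4' ELSE '<=4' END
GROUP BY grp.g
ORDER BY grp.g
""").fetchall()

print(single_row)
print(two_rows)
```

A plain GROUP BY on the CASE expression would return only the '<=4' row for this data, which is the gap the LEFT JOIN version closes.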

nested query & transaction

Update #1: query gives me syntax error on Left Join line (running the query within the left join independently works perfectly though)
SELECT b1.company_id, ((sum(b1.credit)-sum(b1.debit)) as 'Balance'
FROM MyTable b1
JOIN CustomerInfoTable c on c.id = b1.company_id
#Filter for Clients of particular brand, package and active status
where c.brand_id = 2 and c.status = 2 and c.package_id = 3
LEFT JOIN
(
SELECT b2.company_id, sum(b2.debit) as 'Current_Usage'
FROM MyTable b2
WHERE year(b2.timestamp) = '2012' and month(b2.timestamp) = '06'
GROUP BY b2.company_id
)
b3 on b3.company_id = b1.company_id
group by b1.company_id;
Original Post:
I keep track of debits and credits in the same table. The table has the following schema:
| company_id | timestamp | credit | debit |
| 10 | MAY-25 | 100 | 000 |
| 11 | MAY-25 | 000 | 054 |
| 10 | MAY-28 | 000 | 040 |
| 12 | JUN-01 | 100 | 000 |
| 10 | JUN-25 | 150 | 000 |
| 10 | JUN-25 | 000 | 025 |
As my result, I want to see:
| Grouped by: company_id | Balance* | Current_Usage (in June) |
| 10 | 185 | 25 |
| 12 | 100 | 0 |
| 11 | -54 | 0 |
Balance: Calculated by (sum(credit) - sum(debits))* - timestamp does not matter
Current_Usage: Calculated by sum(debits) - but only for debits in JUN.
The problem: If I filter by JUN timestamp right away, it does not calculate the balance of all time but only the balance of any transactions in June.
How can I calculate the current usage by month but the balance on all transactions in the table. I have everything working, except that it filters only the JUN results into the current usage calculation in my code:
SELECT b.company_id, ((sum(b.credit)-sum(b.debit))/1024/1024/1024/1024) as 'BW_remaining', sum(b.debit/1024/1024/1024/1024/28*30) as 'Usage_per_month'
FROM mytable b
#How to filter this only for the current_usage calculation?
WHERE month(a.timestamp) = 'JUN' and a.credit = 0
#Group by company in order to sum all entries for balance
group by b.company_id
order by b.balance desc;
What you need here is a join with a subquery that filters by month.
SELECT T1.company_id,
((sum(T1.credit)-sum(T1.debit))/1024/1024/1024/1024) as 'BW_remaining',
MAX(T3.DEBIT_PER_MONTH)
FROM MYTABLE T1
LEFT JOIN
(
SELECT T2.company_id, SUM(T2.debit) AS DEBIT_PER_MONTH
FROM MYTABLE T2
WHERE month(T2.timestamp) = 6
GROUP BY T2.company_id
)
T3 ON T1.company_id = T3.company_id
GROUP BY T1.company_id
I haven't tested the query. The point I am trying to make is how you can join to your existing query to get the usage per month.
Alright, thanks to @Kshitij I got it working. In case somebody else runs into the same issue, this is how I solved it:
SELECT b1.company_id, (sum(b1.credit)-sum(b1.debit)) as 'Balance',
(
SELECT sum(b2.debit)
FROM MYTABLE b2
WHERE b2.company_id = b1.company_id and year(b2.timestamp) = '2012' and month(b2.timestamp) = '06'
GROUP BY b2.company_id
) AS 'Usage_June'
FROM MYTABLE b1
#Group by company in order to add sum of all zones the company is using
group by b1.company_id
order by Usage_June desc;
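Another way to get an all-time balance and a single month's usage in one pass is conditional aggregation: the monthly filter goes inside the SUM instead of the WHERE clause. The sketch below is an alternative to the correlated subquery above, run on Python's sqlite3 as a stand-in for MySQL. The ts column name and the full 2012 dates are assumptions filled in for the question's MAY-25 style sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (company_id INTEGER, ts TEXT, "
    "credit INTEGER, debit INTEGER)"
)
# The question's sample rows, with assumed ISO dates in 2012.
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", [
    (10, '2012-05-25', 100, 0), (11, '2012-05-25', 0, 54),
    (10, '2012-05-28', 0, 40),  (12, '2012-06-01', 100, 0),
    (10, '2012-06-25', 150, 0), (10, '2012-06-25', 0, 25),
])

query = """
SELECT company_id,
       SUM(credit) - SUM(debit) AS balance,          -- all rows count
       SUM(CASE WHEN ts >= '2012-06-01' AND ts < '2012-07-01'
                THEN debit ELSE 0 END) AS usage_june -- only June debits count
FROM mytable
GROUP BY company_id
ORDER BY company_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The half-open date range is used instead of year() and month() so the comparison can work directly on the stored value; in MySQL the same CASE expression should apply to a DATETIME column.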