Achieving top N results and 'Other' category in MySQL

I'm trying to get the top 10 currencies (and their request counts) from a table in a MySQL database and group all other currencies into an 'Other' category whose request count equals the sum of those currencies' request counts.
The query below gives me the correct result, but is probably highly inefficient.
SET @row_number = 0;
SELECT
rank,
CASE WHEN rank <=10 THEN ccy ELSE 'Other' END as ccy,
SUM(req_count) AS requests
FROM (SELECT CASE WHEN rank <= 10 THEN rank ELSE 11 END AS rank, ccy, req_count
FROM (SELECT (@row_number:=@row_number + 1) AS rank, ccy, req_count
FROM (SELECT have_currency AS ccy, COUNT(*) AS req_count
FROM db1.table1
GROUP BY have_currency
ORDER BY req_count DESC)
AS currencies)
AS currencies)
AS currencies
GROUP BY rank ASC;
Result:
# rank, ccy, requests
'1', 'SGD', '184481'
'2', 'USD', '10723'
'3', 'MYR', '8044'
'4', 'HKD', '7316'
'5', 'THB', '5725'
'6', 'JPY', '4930'
'7', 'INR', '2767'
'8', 'AUD', '2164'
'9', 'VND', '2130'
'10', 'CNY', '1965'
'11', 'Other', '10217'
Any way to make this more efficient?
Bonus question: is it possible to return the % of total for request counts instead of the absolute number?

Is this what you are looking for?
SELECT rank, IF(rank>10,'other',ccy) AS ccy,sum(req_count)
FROM (
SELECT (@rank := @rank + 1) AS rank, ccy, req_count
FROM (
SELECT have_currency AS ccy, COUNT(*) AS req_count
FROM table1
GROUP BY have_currency
ORDER BY req_count DESC
) AS d1
) AS d2
CROSS JOIN (SELECT @rank := 0) AS param
GROUP BY LEAST(rank, 11);
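On MySQL 8.0 or later, the same idea can be written without user variables, and the bonus question (percentage of total) falls out of a window aggregate. The following is only a sketch under that version assumption: the names rnk, bucket, ranked and pct_of_total are illustrative and not from the question, and rnk is used because RANK is a reserved word in MySQL 8.0.

```sql
-- Sketch for MySQL 8.0+: top 10 currencies plus an 'Other' bucket,
-- with each bucket's share of all requests.
SELECT
    bucket AS rnk,
    -- buckets 1..10 hold exactly one currency, so MAX(ccy) is that currency
    CASE WHEN bucket <= 10 THEN MAX(ccy) ELSE 'Other' END AS ccy,
    SUM(req_count) AS requests,
    -- inner SUM is the per-bucket total, outer SUM ... OVER () is the grand total
    ROUND(100 * SUM(req_count) / SUM(SUM(req_count)) OVER (), 2) AS pct_of_total
FROM (
    SELECT
        have_currency AS ccy,
        COUNT(*) AS req_count,
        -- positions 11 and above collapse into bucket 11
        LEAST(ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, have_currency), 11) AS bucket
    FROM db1.table1
    GROUP BY have_currency
) AS ranked
GROUP BY bucket
ORDER BY bucket;
```

ROW_NUMBER() is used rather than RANK() so that ties in request count cannot push more than ten currencies into the top ten; the tie-break on have_currency is an arbitrary choice.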

Related

How to check values in MySQL for preceding rows and use them to find a sum

I need to find how long the system this table is based on has spent with code '100'. My first thought was to find the newest row of each xID group and then check the previous rows: if a row's code is 100, I keep going back through earlier rows until I hit a 200, and then compute the time from the row following that 200 until now (value 100).
ID xID createdDate CODE
1 '1', '2019-07-27 11:52:01', '100'
2 '1', '2019-07-27 11:54:01', '200'
3 '2', '2019-09-03 05:10:02', '200'
4 '2', '2019-09-03 05:12:02', '200'
5 '3', '2019-09-02 05:12:02', '200'
6 '3', '2019-09-02 05:12:02', '100'
7 '3', '2019-09-02 05:12:02', '200'
8 '4', '2019-09-02 05:13:02', '200'
9 '5', '2019-09-03 05:10:03', '200'
10 '6', '2018-12-13 05:03:02', '200'
So the query must find, for each xID group, the total time the system has spent with code 100 until now. I hope that's clear. Here is the SQL so far:
select id, createdDate, code
from wlogs
where id in (
select max(id)
from wlogs
group by xid
)
order by xid;
EDIT:
MYSQL VERSION 8.0
The result should look something like this, where the column totTimeWithCode100 shows the time (in seconds or minutes, it doesn't matter) for each xID.
xID totTimeWithCode100
'1', '500'
'2', '2'
'3', '33'
'4', '200'
'5', '40'
'6', '200'
These rows
Prior to MySQL 8.0, we can make use of user-defined variables (in a way that is unsupported) in carefully crafted SQL that takes advantage of behavior that is repeatably observed but not guaranteed. (The MySQL Reference Manual warns specifically about this usage of user-defined variables.)
Something like this:
SELECT s.xid AS `xID`
, IFNULL(SUM(s.secs_in_code100),0) AS `totTimeWithCode100`
FROM (
SELECT IF(@prev_xid = t.xid AND @prev_code = 100, TIMESTAMPDIFF(SECOND,@prev_date,t.createddate),0) AS secs_in_code100
, @prev_xid := t.xid AS xid
, @prev_date := t.createddate AS createddate
, @prev_code := t.code AS code
FROM ( SELECT @prev_xid := ''
, @prev_date := '1970-01-02 03:00'
, @prev_code := ''
) i
CROSS
JOIN wlogs t
ORDER
BY t.xid
, t.createddate
) s
GROUP BY s.xid
ORDER BY s.xid
With MySQL 8.0, we can avoid the user-defined variables by using analytic/window functions.
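A minimal sketch of that window-function approach, assuming MySQL 8.0 and the wlogs columns shown in the question (ID, xID, createdDate, CODE). The alias secs_in_code100 is reused from the query above; breaking ties on ID is an assumption for rows that share a timestamp.

```sql
-- Sketch for MySQL 8.0+: for every row with CODE = 100, measure the seconds
-- until the next row of the same xID, then total those intervals per xID.
SELECT s.xID,
       COALESCE(SUM(s.secs_in_code100), 0) AS totTimeWithCode100
FROM (
    SELECT t.xID,
           CASE WHEN t.CODE = 100
                THEN TIMESTAMPDIFF(
                         SECOND,
                         t.createdDate,
                         -- timestamp of the following row within the same xID
                         LEAD(t.createdDate) OVER (PARTITION BY t.xID
                                                   ORDER BY t.createdDate, t.ID))
           END AS secs_in_code100
    FROM wlogs t
) AS s
GROUP BY s.xID
ORDER BY s.xID;
```

As in the variable-based query, a code-100 row with no later row contributes nothing, because LEAD() returns NULL there. If an open interval should instead run until the current time, one option would be to wrap the LEAD() call in COALESCE(..., NOW()).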
You can get the result you want by finding all the rows with CODE = 100 for a given xID that are immediately followed (in time) by a row with CODE != 100. This can be done by LEFT JOINing the rows with CODE != 100 to the preceding row if that row has CODE = 100:
SELECT w.xID, COALESCE(SUM(TIMESTAMPDIFF(SECOND, w1.createdDate, w2.createdDate)), 0) AS totTimeWithCode100
FROM (SELECT DISTINCT xID FROM wlogs) w
LEFT JOIN (SELECT *
FROM wlogs
WHERE CODE = 100) w1 ON w1.xID = w.xID
LEFT JOIN wlogs w2 ON w2.xID = w1.xID
AND w2.createdDate = (SELECT MIN(createdDate)
FROM wlogs w3
WHERE w3.xID = w1.xID AND
w3.createdDate > w1.createdDate)
GROUP BY w.xID
ORDER BY w.xID;

How can I check streaks by team and by result using MySQL?

I have created a View that includes the match id, team, result (using "W", "D", "L") and date.
e.g.:
'32750', 'Team 1', 'D', '2019-02-16 21:30:00'
'32750', 'Team 2', 'D', '2019-02-16 21:30:00'
'32748', 'Team 3', 'L', '2019-02-16 19:20:00'
'32748', 'Team 4', 'W', '2019-02-16 19:20:00'
I've adapted a query I found here that calculates streaks using that table and shows me the Team, Start date of the streak, end date, streak count and the matches that are included in that streak.
'Team 1', '1992-07-05', '1992-09-06', '5', '9522,9142,9161,9167,9180'
The problem I have is that I have to include the team name in a WHERE clause or the query shows incorrect results. Here is the query I'm using for winning streaks:
select
team_name,
min(match_date) as start_date,
max(match_date) as end_date,
count(match_date) as streak,
group_concat(idmatch) as gameid_list
from
(
select *,
IF(
result = "W"
and
@result = "W",
@gn, @gn := @gn + 1)
as group_number,
@match_date as old_date, @idmatch as old_gameid,
@result as old_result,
@match_date := match_date, @idmatch := idmatch,
@result := result
from my_football_database.`Resultados Globales WDL`
cross join
(
select
@match_date := CAST(null as date) as xa,
@idmatch := null + 0 as xb,
@result := null + 0 as xc, @gn := 0
) x where team_name = "Team 1"
order by match_date
) as y
group by group_number
order by streak desc;
As you can see, I have to specify the name of the team in a WHERE clause, so if I want to get, for example, the top 3 winning streaks in history, I have to run this query 150 times, once for each team, and then compare the results manually.
How can I modify this to get a full list of streaks correctly grouped by teams?
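One possible direction, sketched here under the assumption that MySQL 8.0 or later is available, is the "difference of row numbers" technique, which numbers the rows per team instead of relying on one global counter. The view and column names are taken from the query above; grp is an illustrative alias.

```sql
-- Sketch for MySQL 8.0+: winning streaks for every team in one pass.
-- Within a team, consecutive rows with the same result share the same
-- difference between the two row numbers, so grp identifies one streak.
SELECT team_name,
       MIN(match_date) AS start_date,
       MAX(match_date) AS end_date,
       COUNT(*)        AS streak,
       GROUP_CONCAT(idmatch ORDER BY match_date) AS gameid_list
FROM (
    SELECT team_name, idmatch, result, match_date,
           ROW_NUMBER() OVER (PARTITION BY team_name
                              ORDER BY match_date, idmatch)
         - ROW_NUMBER() OVER (PARTITION BY team_name, result
                              ORDER BY match_date, idmatch) AS grp
    FROM my_football_database.`Resultados Globales WDL`
) AS y
WHERE result = 'W'
GROUP BY team_name, grp
ORDER BY streak DESC
LIMIT 3;
```

Changing the filter to 'D' or 'L' would give drawing or losing streaks, and dropping the LIMIT returns the full list for all teams.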

Counting the occurrence of records in MySQL

I am trying to count lactations, which means counting the calving dates of an animal: whenever calvDate changes for an animalid, 1 is added to LactationID to keep count.
These are the columns:
#ID,LactationID,CalvDateLactationID, animalidLactationID,animalid, calvDate
'1', '1', '1 - 2018-08-08', '1 - T81', 'T81', '2018-08-08'
'2', '1', '1 - 2017-12-18', '1 - T66', 'T66', '2017-12-18'
'3', '2', '3 - 2017-12-28', '4 - T66', 'T66', '2017-12-28'
The query I am using to generate this output is
SELECT
dt.ID,
@row_num := IF(@aid <> dt.animalid, 1, @row_num + 1) as LactationID,
concat(@row_num := IF(@aid <> dt.animalid, 1, @row_num + 1),' - ',calvDate) AS CalvDateLactationID,
concat(@row_num := IF(@aid <> dt.animalid, 1, @row_num + 1),' - ',animalid) AS animalidLactationID,
@aid := dt.animalid AS animalid, dt.calvDate
FROM
(SELECT ID,animalid,calvDate FROM calvingdatecombined ORDER BY animalid, calvDate, ID) AS dt
CROSS JOIN (SELECT @row_num := 0, @aid := '') AS user_init_vars
where calvDate <> '' and calvDate <> '0000-00-00' ORDER BY dt.ID
My expected output is
#ID, LactationID,CalvDateLactationID, animalidLactationID,animalid, calvDate
'1', '1', '1 - 2018-08-08', '1 - T81', 'T81', '2018-08-08'
'2', '1', '1 - 2017-12-18', '1 - T66', 'T66', '2017-12-18'
'3', '2', '2 - 2017-12-28', '2 - T66', 'T66', '2017-12-28'
What can I improve in my query to generate my expected output?
My calvingdatecombined table has the following columns and sample data
# ID, animalid, calvDate
'1', 'T81', '2018-08-08'
'2', 'T66', '2017-12-18'
'3', 'T66', '2017-12-28'
Not an answer, just a suggestion, and too long for a comment:
A data set like this would lose none of the meaning, and would be considerably easier to read:
ID, animalid, calvDate
186, 81, '2018-08-08'
188, 66, '2017-12-18'
189, 66, '2017-12-28'
You don't need to increment the @row_num value again for CalvDateLactationID and animalidLactationID.
Also, I have moved the WHERE condition on calvDate into the inner SELECT query as a further optimization. Use the following query instead:
SELECT
dt.ID,
@row_num := IF(@aid <> dt.animalid, 1, @row_num + 1) as LactationID,
concat(@row_num, ' - ', calvDate) AS CalvDateLactationID,
concat(@row_num, ' - ', animalid) AS animalidLactationID,
@aid := dt.animalid AS animalid,
dt.calvDate
FROM
(
SELECT ID,animalid,calvDate
FROM calvingdatecombined
WHERE calvDate > '0000-00-00'
ORDER BY animalid, calvDate, ID
) AS dt
CROSS JOIN (SELECT @row_num := 0,
@aid := '') AS user_init_vars
ORDER BY dt.ID
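If MySQL 8.0 or later is available (an assumption, since the question does not state a version), the per-animal counter can be sketched with ROW_NUMBER() instead of user variables, which avoids depending on the order in which variable assignments are evaluated. The alias lactation is illustrative; the filter on calvDate is copied from the query above.

```sql
-- Sketch for MySQL 8.0+: number each animal's calvings in date order.
SELECT ID,
       lactation AS LactationID,
       CONCAT(lactation, ' - ', calvDate) AS CalvDateLactationID,
       CONCAT(lactation, ' - ', animalid) AS animalidLactationID,
       animalid,
       calvDate
FROM (
    SELECT ID, animalid, calvDate,
           -- restarts at 1 for every animalid
           ROW_NUMBER() OVER (PARTITION BY animalid
                              ORDER BY calvDate, ID) AS lactation
    FROM calvingdatecombined
    WHERE calvDate > '0000-00-00'
) AS dt
ORDER BY ID;
```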

How to SUM the results of both SUMs in SQL?

With this query I am trying to sum the results of both SUM functions:
select
DAY(created_at) AS day,
SUM(if(status = '1', 1, 0)) AS result,
SUM(if(status = '2', 1, 0)) AS noresult,
SUM(result + noresult)
from `clients` where `doctor_id` = 2 and MONTH(created_at) = MONTH(CURRENT_TIMESTAMP) group by `day`
I am trying to do that in this line:
SUM(result + noresult)
Try this:
select
DAY(created_at) AS day,
SUM(if(status = '1', 1, 0)) AS result,
SUM(if(status = '2', 1, 0)) AS noresult,
SUM(if(status in ('1', '2'), 1, 0))
from `clients`
where `doctor_id` = 2 and MONTH(created_at) = MONTH(CURRENT_TIMESTAMP)
group by `day`
You can't reference a column alias from another expression in the same SELECT list; you must repeat the expression:
select
DAY(created_at) AS day,
SUM(if(status = '1', 1, 0)) AS result,
SUM(if(status = '2', 1, 0)) AS noresult,
SUM(if(status = '1', 1, 0)) + SUM(if(status = '2', 1, 0)) AS all_result
from `clients` where `doctor_id` = 2 and MONTH(created_at) = MONTH(CURRENT_TIMESTAMP) group by `day`
You must repeat the expression because the clauses of a SQL statement are processed in a specific logical order (FROM, then WHERE, then GROUP BY, then SELECT, and so on), and aliases defined in the SELECT list are not yet available to other expressions in that same list.
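If repeating the expressions is undesirable, one alternative is to compute the two sums in a derived table and add them in the outer query, where the aliases are ordinary column names. This is a sketch only: the alias t is illustrative, and all_result is reused from the query above.

```sql
-- Sketch: the inner query defines the aliases, the outer query can use them.
SELECT t.day,
       t.result,
       t.noresult,
       t.result + t.noresult AS all_result
FROM (
    SELECT DAY(created_at) AS day,
           SUM(IF(status = '1', 1, 0)) AS result,
           SUM(IF(status = '2', 1, 0)) AS noresult
    FROM `clients`
    WHERE `doctor_id` = 2
      AND MONTH(created_at) = MONTH(CURRENT_TIMESTAMP)
    GROUP BY DAY(created_at)
) AS t;
```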
As several other people have stated, you cannot reference column aliases elsewhere in the same SELECT list. However, to keep it cleaner, you could combine both conditions rather than adding the two SUM fields.
select
DAY(created_at) AS day,
SUM(if(status = '1', 1, 0)) AS result,
SUM(if(status = '2', 1, 0)) AS noresult,
SUM(if(status = '1' OR status = '2', 1, 0)) AS newcolumn
from `clients` where `doctor_id` = 2 and MONTH(created_at) = MONTH(CURRENT_TIMESTAMP) group by `day`

Group by, with rank and sum - not getting correct output

I'm trying to sum a column with rank function and group by month, my code is
select dbo.UpCase( REPLACE( p.Agent_name,'.',' '))as Agent_name, SUM(convert ( float ,
p.Amount))as amount,
RANK() over( order by SUM(convert ( float ,Amount )) desc ) as arank
from dbo.T_Client_Pc_Reg p
group by p.Agent_name ,p.Sale_status ,MONTH(Reg_date)
having [p].Sale_status='Activated'
Currently I'm getting the overall total for that column, not the month-wise total:
Name amount rank
a 100 1
b 80 2
c 50 3
For a, the amount 100 is the total to date, but I want the total for the current month only, not including previous months.
Maybe you just need to add a WHERE clause? Here is a minor re-write that I think works generally better. Some setup in tempdb:
USE tempdb;
GO
CREATE TABLE dbo.T_Client_Pc_Reg
(
Agent_name VARCHAR(32),
Amount INT,
Sale_Status VARCHAR(32),
Reg_date DATETIME
);
INSERT dbo.T_Client_Pc_Reg
SELECT 'a', 50, 'Activated', GETDATE()
UNION ALL SELECT 'a', 50, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'NotActivated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()-40;
Then the query:
SELECT
Agent_name = UPPER(REPLACE(Agent_name, '.', '')),
Amount = SUM(CONVERT(FLOAT, Amount)),
arank = RANK() OVER (ORDER BY SUM(CONVERT(FLOAT, Amount)) DESC)
FROM dbo.T_Client_Pc_Reg
WHERE Reg_date >= DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP), 0)
AND Reg_date < DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) + 1, 0)
AND Sale_status = 'Activated'
GROUP BY UPPER(REPLACE(Agent_name, '.', ''))
ORDER BY arank;
Now cleanup:
USE tempdb;
GO
DROP TABLE dbo.T_Client_Pc_Reg;