How can I check streaks by team and by result using MySQL?

I have created a View that includes the match id, team, result (using "W", "D", "L") and date.
e.g.:
'32750', 'Team 1', 'D', '2019-02-16 21:30:00'
'32750', 'Team 2', 'D', '2019-02-16 21:30:00'
'32748', 'Team 3', 'L', '2019-02-16 19:20:00'
'32748', 'Team 4', 'W', '2019-02-16 19:20:00'
I've adapted a query I found here that calculates streaks using that table and shows me the Team, Start date of the streak, end date, streak count and the matches that are included in that streak.
'Team 1', '1992-07-05', '1992-09-06', '5', '9522,9142,9161,9167,9180'
The problem I have is that I have to include the team name in a WHERE clause, or the query shows incorrect results. Here is the query I'm using for winning streaks:
select
    team_name,
    min(match_date) as start_date,
    max(match_date) as end_date,
    count(match_date) as streak,
    group_concat(idmatch) as gameid_list
from
(
    select *,
        IF(result = "W" and @result = "W", @gn, @gn := @gn + 1) as group_number,
        @match_date as old_date, @idmatch as old_gameid,
        @result as old_rersult,
        @match_date := match_date, @idmatch := idmatch,
        @result := result
    from my_football_database.`Resultados Globales WDL`
    cross join
    (
        select
            @match_date := CAST(null as date) as xa,
            @idmatch := null + 0 as xb,
            @result := null + 0 as xc, @gn := 0
    ) x
    where team_name = "Team 1"
    order by match_date
) as y
group by group_number
order by streak desc;
As you can see, I have to specify the name of the team in a WHERE clause. So if I want to get, for example, the top 3 winning streaks in history, I have to run this query once for each of the 150 teams and then compare the results manually.
How can I modify this to get a full list of streaks correctly grouped by teams?
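One way to get streaks for every team in a single pass is the "gaps and islands" pattern with window functions, which MySQL supports from 8.0. The sketch below is not the asker's query. It runs the idea against SQLite from Python so that it is self-contained (this assumes the bundled SQLite is 3.25 or newer), and the table name `results` and the sample rows are made up for illustration. Within each team, the difference between the overall row number and the per-result row number stays constant while a run of the same result continues, so it can serve as the streak's group key.

```python
import sqlite3

# Hypothetical rows in the shape of the question's view:
# (idmatch, team_name, result, match_date)
rows = [
    (1, "Team 1", "W", "2019-01-01"),
    (2, "Team 1", "W", "2019-01-08"),
    (3, "Team 1", "L", "2019-01-15"),
    (4, "Team 1", "W", "2019-01-22"),
    (1, "Team 2", "L", "2019-01-01"),
    (2, "Team 2", "W", "2019-01-08"),
    (5, "Team 2", "W", "2019-01-16"),
    (6, "Team 2", "W", "2019-01-23"),
]

STREAKS_SQL = """
WITH numbered AS (
    SELECT idmatch, team_name, result, match_date,
           -- constant within each unbroken run of one result for one team
           ROW_NUMBER() OVER (PARTITION BY team_name ORDER BY match_date)
         - ROW_NUMBER() OVER (PARTITION BY team_name, result ORDER BY match_date)
           AS grp
    FROM results
)
SELECT team_name,
       MIN(match_date) AS start_date,
       MAX(match_date) AS end_date,
       COUNT(*)        AS streak
FROM numbered
WHERE result = 'W'
GROUP BY team_name, grp
ORDER BY streak DESC, team_name, start_date
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (idmatch INTEGER, team_name TEXT, result TEXT, match_date TEXT)"
)
conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)

streaks = conn.execute(STREAKS_SQL).fetchall()
for row in streaks:
    print(row)
```

Adding `LIMIT 3` to the final SELECT would give the three longest winning streaks across all teams. In MySQL the list of matches could be added back with `GROUP_CONCAT(idmatch ORDER BY match_date)`.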

Related

SQL group by school name

school_name  class  medium   total
srk          1      english  13
srk          2      english  14
srk          3      english  15
srk          1      french   16
srk          2      french   16
srk          3      french   18
vrk          1      english  17
vrk          1      french   18
I want the output to have one row per school, with these columns:
school_name  class1eng  class1french  class2eng  class2french  class3english  class3french
You're looking for conditional aggregation: one SUM over a CASE expression for each output column.
This should work for you:
Select
    school_name,
    Sum(Case when (class=1 and medium='english') then total else 0 end) as class1english,
    Sum(Case when (class=1 and medium='french') then total else 0 end) as class1french,
    Sum(Case when (class=2 and medium='english') then total else 0 end) as class2english,
    Sum(Case when (class=2 and medium='french') then total else 0 end) as class2french,
    Sum(Case when (class=3 and medium='english') then total else 0 end) as class3english,
    Sum(Case when (class=3 and medium='french') then total else 0 end) as class3french
From
    table_name
Group by
    school_name
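The conditional-aggregation query can be checked against the question's sample rows. The following is a minimal sketch that runs it on an in-memory SQLite database from Python; the table name `school_totals` is a placeholder, and the column list is generated in a loop only to keep the example short.

```python
import sqlite3

# Sample rows from the question: (school_name, class, medium, total)
rows = [
    ("srk", 1, "english", 13), ("srk", 2, "english", 14), ("srk", 3, "english", 15),
    ("srk", 1, "french", 16), ("srk", 2, "french", 16), ("srk", 3, "french", 18),
    ("vrk", 1, "english", 17), ("vrk", 1, "french", 18),
]

# One SUM(CASE ...) expression per (class, medium) output column.
pairs = [(c, m) for c in (1, 2, 3) for m in ("english", "french")]
columns = ",\n       ".join(
    f"SUM(CASE WHEN class = {c} AND medium = '{m}' THEN total ELSE 0 END) AS class{c}{m}"
    for c, m in pairs
)
pivot_sql = (
    f"SELECT school_name,\n       {columns}\n"
    "FROM school_totals\n"
    "GROUP BY school_name\n"
    "ORDER BY school_name"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE school_totals (school_name TEXT, class INTEGER, medium TEXT, total INTEGER)"
)
conn.executemany("INSERT INTO school_totals VALUES (?, ?, ?, ?)", rows)

pivot = conn.execute(pivot_sql).fetchall()
for row in pivot:
    print(row)
```

Combinations that a school does not have (for example class 2 and 3 for vrk) come out as 0 because of the `ELSE 0` branch.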
This seems to be a simple ask; I assume you also want to order your results. Please check whether the query below helps:
SELECT school_name, class, medium, SUM(total) AS Total
FROM <Table Name>
GROUP BY school_name, class, medium
ORDER BY school_name, class, medium
This solution is general-purpose: complex, but functional.
I wrote it for myself as an exercise and a challenge.
/* --------------- TABLE --------------- */
CREATE TABLE schools_tab
(school VARCHAR(9), class INT, subj VARCHAR(9), total INT);
INSERT INTO schools_tab VALUES
('srk', 1, 'english', 13),
('srk', 2, 'english', 14),
('srk', 3, 'english', 15),
('srk', 1, 'french', 16),
('srk', 2, 'french', 16),
('srk', 3, 'french', 18),
('vrk', 1, 'english', 17),
('vrk', 1, 'french', 18);
/* -------------- DYNAMIC QUERY --------------- */
SET @sql = NULL;
WITH cte AS (
SELECT school, class, subj, ROW_NUMBER() OVER (PARTITION BY school) AS idx, DENSE_RANK() OVER (ORDER BY school) AS ids
FROM (SELECT DISTINCT school FROM schools_tab) A LEFT JOIN (SELECT DISTINCT class, subj FROM schools_tab) B ON (1=1)
), cte2 AS (
SELECT A.ids, A.idx, A.school, A.class, A.subj, COALESCE(B.total, 0) AS total
FROM cte A LEFT JOIN schools_tab B ON (A.school=B.school AND A.class=B.class AND A.subj=B.subj)
), cte3 AS (
SELECT DISTINCT class, subj
FROM schools_tab
ORDER BY class, subj
)
SELECT CONCAT('WITH RECURSIVE cte AS (
SELECT school, class, subj, ROW_NUMBER() OVER (PARTITION BY school) AS idx, DENSE_RANK() OVER (ORDER BY school) AS ids
FROM (SELECT DISTINCT school FROM schools_tab) A LEFT JOIN (SELECT DISTINCT class, subj FROM schools_tab) B ON (1=1)
), cte2 AS (
SELECT A.ids, A.idx, A.school, A.class, A.subj, COALESCE(B.total, 0) AS total
FROM cte A LEFT JOIN schools_tab B ON (A.school=B.school AND A.class=B.class AND A.subj=B.subj)
), ctx AS ('
'SELECT (SELECT MAX(ids) FROM cte2) AS n,',
GROUP_CONCAT(DISTINCT CONCAT( '(SELECT total FROM cte2 WHERE idx=',idx,' AND ids=n) AS class',class,subj ) ORDER BY class, subj),
' UNION ALL SELECT n-1 AS n,',
GROUP_CONCAT(DISTINCT CONCAT( '(SELECT total FROM cte2 WHERE idx=',idx,' AND ids=n) AS class',class,subj ) ORDER BY class, subj),
' FROM ctx WHERE n>0',
') SELECT DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(''srk,vrk'', '','', n+1), '','', -1) AS school,',
GROUP_CONCAT(DISTINCT CONCAT('class',class,subj)),
' FROM ctx ORDER BY school'
) INTO @sql
FROM cte2;
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
DEALLOCATE PREPARE stmt1;

SSRS Predicament with expressions

I have an SSRS dataset that looks like this:
The dataset rows are generated independent of each other using UNION ALL.
I need to display these rows in my report as is, but I need to add an additional row that will calculate Total Won / Total Lost, so the result should look like this:
This is just a sample: I have more columns (one per month) and the whole thing is broken down by product, so if I have 10 different products, I will have 10 different tablix tables.
Basically I need to somehow create an expression that will only calculate values in 2 rows of the tablix out of 3 (based on the value of the Status column) and take into consideration that some values can be zeroes.
Here's the query (I simplified it a bit for better understanding):
select * from
(
select 'Created' as 'State', fo.groupidname, fo.businessidname ' Business', fo.opportunityid
from FilteredOpportunity fo
where fo.regionidname = 'Americas Region'
and fo.createdon >= dateadd(year, -1, getdate())
and fo.regionalfeeincome >= 250000
) created
pivot
(
count(created.opportunityid)
for created.groupidname in ([Boston], [Chicago], [Colombia], [Group D.C.], [Houston], [Los Angeles], [New York], [San Francisco], [Seattle], [Toronto])
) pivCreated
union all
select * from
(
select 'Won' as 'State', fo.groupidname, fo.businessidname ' Business', fo.opportunityid
from FilteredOpportunity fo
where regionidname = 'Americas Region'
and fo.actualclosedate >= dateadd(year, -1, getdate())
and regionalfeeincome >= 250000
and fo.jna is not null
) won
pivot
(
count(won.opportunityid)
for won.groupidname in ([Boston], [Chicago], [Colombia], [Group D.C.], [Houston], [Los Angeles], [New York], [San Francisco], [Seattle], [Toronto])
) pivWon
union all
select * from
(
select 'Lost' as 'State', fo.groupidname, fo.businessidname ' Business', fo.opportunityid
from FilteredOpportunity fo
where fo.regionidname = 'Americas Region'
and fo.actualclosedate >= dateadd(year, -1, getdate())
and fo.regionalfeeincome >= 250000
and fo.sys_phasename <> 'Pre-Bid'
) lost
pivot
(
count(lost.opportunityid)
for lost.groupidname in ([Boston], [Chicago], [Colombia], [Group D.C.], [Houston], [Los Angeles], [New York], [San Francisco], [Seattle], [Toronto])
) pivLost
TIA
-TS.
I can't fully test this as I don't have time to build sample data but it should work...
If you use this as your report's dataset query then you should be able to add a simple matrix with a row group by State and a column group by City
/*
You can optimise this a bit but I've kept it fairly procedural so it's easier to follow
*/
-- First do your 3 basic queries, but this time we don't pivot and we dump the results into temp tables
-- I've also removed columns that don't appear to be used based on your sample output and renamed a column to City to make it easier to read
-- Won and Lost filter on actualclosedate, as in the question's queries
select 'Total Created' as 'State', fo.groupidname as City, COUNT(*) AS Cnt
INTO #Created
from FilteredOpportunity fo
where fo.regionidname = 'Americas Region'
and fo.createdon >= dateadd(year, -1, getdate())
and fo.regionalfeeincome >= 250000
group by fo.groupidname -- COUNT(*) per city needs a GROUP BY
select 'Total Won' as 'State', fo.groupidname as City, COUNT(*) AS Cnt
INTO #Won
from FilteredOpportunity fo
where fo.regionidname = 'Americas Region'
and fo.actualclosedate >= dateadd(year, -1, getdate())
and fo.regionalfeeincome >= 250000
and fo.jna is not null
group by fo.groupidname
select 'Total Lost' as 'State', fo.groupidname as City, COUNT(*) AS Cnt
INTO #Lost
from FilteredOpportunity fo
where fo.regionidname = 'Americas Region'
and fo.actualclosedate >= dateadd(year, -1, getdate())
and fo.regionalfeeincome >= 250000
and fo.sys_phasename <> 'Pre-Bid'
group by fo.groupidname
-- Now we calculate your Ratio
SELECT
'Win Ratio' as 'State'
, coalesce(w.City, l.City) as City -- use coalesce in case 1 side of join is null
, CASE WHEN ISNULL(l.Cnt, 0) = 0 THEN 0 ELSE ISNULL(w.Cnt, 0) * 1.0 / l.Cnt END as Cnt -- if lost is null or 0 return 0, else calc ratio; * 1.0 avoids integer division
into #Ratio
FROM #Won w
full join #Lost l on w.City = l.City
-- finally, get the results and add a sort to help the report row sorting.
SELECT 1 as Sort, * FROM #Created
UNION SELECT 2, * FROM #Won
UNION SELECT 3, * FROM #Lost
UNION SELECT 4, * FROM #Ratio

MySQL: Why would a query run faster with literal conditions compared to variables

I'm not sure whether the actual query matters, but I have a MySQL stored procedure in which I commented out everything except the following query:
INSERT INTO temp_attribution (`attribute_type`, `domain`, `id`, `name`, `score`, `rank`, `partner_match`, `person_match`, `sponsor_match`, `date_match`)
SELECT 'Campaign' AS attribute_type, domain, id, name, score, (@proc_counter := @proc_counter + 1) AS rank,
partner_match, person_match, sponsor_match, date_match
FROM (
SELECT m_c.domain, m_c.campaign_id AS id, m_c.name, m_c.client_id, m_c.sent_date,
proc_sponsors AS invoice_sponsor, bs.sponsor AS campaign_sponsor,
proc_email AS invoice_email, aes_decrypt(m_r.email, in_encrypt_key) as campaign_email,
if (m_c.client_id = proc_client_id COLLATE latin1_general_ci, 'Yes', 'No') AS partner_match,
if (aes_encrypt(proc_email, in_encrypt_key) = m_r.email, 'Exact Email', 'Email Domain') AS person_match,
if (LOCATE(CONVERT(bs.sponsor USING utf8mb4), proc_sponsors) > 0, 'Sponsor',
if (CONVERT(bs.vendor USING utf8mb4) = proc_vendor, 'Vendor', 'No') ) AS sponsor_match,
if (datediff(proc_invoice_date, m_c.sent_date) BETWEEN 0 AND 92, 'Within Three', 'Within Six') AS date_match,
(
if (m_c.client_id = proc_client_id COLLATE latin1_general_ci, 45, 10) + 30 +
if (LOCATE(CONVERT(bs.sponsor USING utf8mb4), proc_sponsors) > 0, 10,
if (CONVERT(bs.vendor USING utf8mb4) = proc_vendor, 5, 0) ) +
if (datediff(proc_invoice_date, m_c.sent_date) BETWEEN 0 AND 92, 15, 5)
) AS score
FROM campaign_table m_c
INNER JOIN recipient_table m_r ON m_c.domain = m_r.domain AND m_c.campaign_id = m_r.campaign_id
LEFT JOIN booking_sponsor bs ON m_c.domain = bs.domain AND m_c.campaign_id = bs.campaign_id
WHERE datediff(proc_invoice_date, m_c.sent_date) BETWEEN 0 AND 185
AND ( aes_encrypt(proc_email, in_encrypt_key) = m_r.email OR m_r.email_domain = proc_email_domain )
) T ORDER BY score DESC, sent_date DESC LIMIT 5;
The names starting with 'proc_' are variables declared at the beginning of the procedure. Initialising them takes only 0.385 seconds, whereas the entire procedure takes 15 seconds.
In a separate query window, I copied the relevant query and replaced the variables starting with 'proc_' with literal values to test speed and optimise, like so:
INSERT INTO temp_attribution (`attribute_type`, `domain`, `id`, `name`, `score`, `rank`, `partner_match`, `person_match`, `sponsor_match`, `date_match`)
SELECT 'Campaign' AS attribute_type, domain, id, name, score, (@proc_counter := @proc_counter + 1) AS rank,
partner_match, person_match, sponsor_match, date_match
FROM (
SELECT m_c.domain, m_c.campaign_id AS id, m_c.name, m_c.client_id, m_c.sent_date,
'VENDOR SPONSOR VALUE' AS invoice_sponsor, bs.sponsor AS campaign_sponsor,
'johnsmith@domain.com' AS invoice_email, aes_encrypt('johnsmith@domain.com', 'secret_key') as campaign_email,
if (m_c.client_id = m_c.client_id COLLATE latin1_general_ci, 'Yes', 'No') AS partner_match,
if (aes_encrypt('johnsmith@domain.com', 'secret_key'), 'Exact Email', 'Email Domain') AS person_match,
if (LOCATE(CONVERT(bs.sponsor USING utf8mb4), 'VENDOR SPONSOR VALUE') > 0, 'Sponsor',
if (CONVERT(bs.vendor USING utf8mb4) = 'VENDOR', 'Vendor', 'No') ) AS sponsor_match,
if (datediff('2016-10-14', m_c.sent_date) BETWEEN 0 AND 92, 'Within Three', 'Within Six') AS date_match,
(
if (m_c.client_id = m_c.client_id COLLATE latin1_general_ci, 45, 10) + 30 +
if (LOCATE(CONVERT(bs.sponsor USING utf8mb4), 'VENDOR SPONSOR VALUE') > 0, 10,
if (CONVERT(bs.vendor USING utf8mb4) = 'VENDOR', 5, 0) ) +
if (datediff('2016-10-14', m_c.sent_date) BETWEEN 0 AND 92, 15, 5)
) AS score
FROM campaign_table m_c
INNER JOIN recipient_table m_r ON m_c.domain = m_r.domain AND m_c.campaign_id = m_r.campaign_id
LEFT JOIN booking_sponsor bs ON m_c.domain = bs.domain AND m_c.campaign_id = bs.campaign_id
WHERE datediff('2016-10-14', m_c.sent_date) BETWEEN 0 AND 185
AND ( aes_encrypt('johnsmith@domain.com', 'secret_key') = m_r.email OR m_r.email_domain = 'domain.com' )
) T ORDER BY score DESC, sent_date DESC LIMIT 5;
Now, magically without doing anything else, the query runs within two seconds. How is that possible?
Figured it out. Some of the declared variables had a different type from the columns they were compared against, so I guess MySQL could not compare them in the most efficient way possible.

Achieving top N results and 'Other' category in MySQL

I'm trying to get the top 10 currencies (and their request counts) from a table in a MySQL database and group all other currencies into an 'Other' category whose request count is the sum of those currencies' request counts.
The below query gives me the correct result, but is probably highly inefficient.
SET @row_number = 0;
SELECT
    rank,
    CASE WHEN rank <= 10 THEN ccy ELSE 'Other' END AS ccy,
    SUM(req_count) AS requests
FROM (SELECT CASE WHEN rank <= 10 THEN rank ELSE 11 END AS rank, ccy, req_count
      FROM (SELECT (@row_number := @row_number + 1) AS rank, ccy, req_count
            FROM (SELECT have_currency AS ccy, COUNT(*) AS req_count
                  FROM db1.table1
                  GROUP BY have_currency
                  ORDER BY req_count DESC)
            AS currencies)
      AS currencies)
AS currencies
GROUP BY rank ASC;
Result:
# rank, ccy, requests
'1', 'SGD', '184481'
'2', 'USD', '10723'
'3', 'MYR', '8044'
'4', 'HKD', '7316'
'5', 'THB', '5725'
'6', 'JPY', '4930'
'7', 'INR', '2767'
'8', 'AUD', '2164'
'9', 'VND', '2130'
'10', 'CNY', '1965'
'11', 'Other', '10217'
Any way to make this more efficient?
Bonus question: is it possible to return the % of total for request counts instead of the absolute number?
Is this what you are looking for?
SELECT rank, IF(rank > 10, 'other', ccy) AS ccy, SUM(req_count)
FROM (
    SELECT @rank := (@rank := @rank + 1) AS rank, ccy, req_count
    FROM (
        SELECT have_currency AS ccy, COUNT(*) AS req_count
        FROM table1
        GROUP BY have_currency
        ORDER BY req_count DESC
    ) AS d1
) AS d2
CROSS JOIN (SELECT @rank := 0) AS param
GROUP BY LEAST(rank, 11);
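For the bonus question (percentage of the total), one possible approach is to rank with a window function instead of a user variable and divide each bucket by the grand total. The sketch below is an assumption-laden illustration rather than a tuned MySQL query: it uses SQLite from Python so that it is self-contained, keeps the top 3 instead of the top 10 to stay short, and uses invented request counts. The same CTE structure should carry over to MySQL 8.0 or later, where window functions are available.

```python
import sqlite3

TOP_N = 3  # the question uses 10; 3 keeps the sample small
# Hypothetical request counts per currency
requests = {"SGD": 50, "USD": 20, "MYR": 15, "HKD": 10, "THB": 5}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (have_currency TEXT)")
for ccy, n in requests.items():
    conn.executemany("INSERT INTO table1 VALUES (?)", [(ccy,)] * n)

TOP_N_SQL = """
WITH counts AS (
    SELECT have_currency AS ccy, COUNT(*) AS req_count
    FROM table1
    GROUP BY have_currency
),
ranked AS (
    SELECT ccy, req_count,
           ROW_NUMBER() OVER (ORDER BY req_count DESC, ccy) AS rnk
    FROM counts
),
bucketed AS (
    -- everything past the top N collapses into one 'Other' bucket
    SELECT CASE WHEN rnk <= :n THEN rnk ELSE :n + 1 END AS bucket,
           CASE WHEN rnk <= :n THEN ccy ELSE 'Other' END AS ccy,
           req_count
    FROM ranked
)
SELECT bucket, ccy,
       SUM(req_count) AS requests,
       ROUND(100.0 * SUM(req_count) / (SELECT SUM(req_count) FROM counts), 1) AS pct
FROM bucketed
GROUP BY bucket, ccy
ORDER BY bucket
"""

top = conn.execute(TOP_N_SQL, {"n": TOP_N}).fetchall()
for row in top:
    print(row)
```

Grouping by both `bucket` and `ccy` avoids selecting a non-aggregated column, which strict SQL modes reject.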

Group by, with rank and sum - not getting correct output

I'm trying to sum a column with rank function and group by month, my code is
select dbo.UpCase(REPLACE(p.Agent_name, '.', ' ')) as Agent_name,
       SUM(convert(float, p.Amount)) as amount,
       RANK() over (order by SUM(convert(float, Amount)) desc) as arank
from dbo.T_Client_Pc_Reg p
group by p.Agent_name, p.Sale_status, MONTH(Reg_date)
having [p].Sale_status = 'Activated'
Currently I'm getting the all-time total of that column rather than a month-wise total:
Name amount rank
a 100 1
b 80 2
c 50 3
For agent a, the amount 100 is the total to date, but I want the total for the current month only, not including previous months.
Maybe you just need to add a WHERE clause? Here is a minor re-write that I think works generally better. Some setup in tempdb:
USE tempdb;
GO
CREATE TABLE dbo.T_Client_Pc_Reg
(
Agent_name VARCHAR(32),
Amount INT,
Sale_Status VARCHAR(32),
Reg_date DATETIME
);
INSERT dbo.T_Client_Pc_Reg
SELECT 'a', 50, 'Activated', GETDATE()
UNION ALL SELECT 'a', 50, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'Activated', GETDATE()
UNION ALL SELECT 'b', 20, 'NotActivated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()
UNION ALL SELECT 'c', 25, 'Activated', GETDATE()-40;
Then the query:
SELECT
Agent_name = UPPER(REPLACE(Agent_name, '.', '')),
Amount = SUM(CONVERT(FLOAT, Amount)),
arank = RANK() OVER (ORDER BY SUM(CONVERT(FLOAT, Amount)) DESC)
FROM dbo.T_Client_Pc_Reg
WHERE Reg_date >= DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP), 0)
AND Reg_date < DATEADD(MONTH, DATEDIFF(MONTH, 0, CURRENT_TIMESTAMP) + 1, 0)
AND Sale_status = 'Activated'
GROUP BY UPPER(REPLACE(Agent_name, '.', ''))
ORDER BY arank;
Now cleanup:
USE tempdb;
GO
DROP TABLE dbo.T_Client_Pc_Reg;
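The key idea in the rewritten query is the half-open date range: keep rows on or after the first day of the current month and strictly before the first day of the next month, then aggregate and rank. As a rough, portable illustration of that filter (not T-SQL), the sketch below uses SQLite from Python, where `date(x, 'start of month')` plays the role of the `DATEADD`/`DATEDIFF` expression. The fixed `TODAY` value, the lowercase table name, and the sample rows are assumptions made so the result is reproducible.

```python
import sqlite3

TODAY = "2024-03-15"  # stand-in for the current date, fixed for reproducibility

# Hypothetical rows: (agent_name, amount, sale_status, reg_date)
rows = [
    ("a", 50, "Activated", "2024-03-02"),
    ("a", 50, "Activated", "2024-03-10"),
    ("b", 20, "Activated", "2024-03-05"),
    ("b", 20, "NotActivated", "2024-03-06"),  # wrong status, excluded
    ("c", 25, "Activated", "2024-03-01"),     # first day of month, included
    ("c", 25, "Activated", "2024-02-04"),     # previous month, excluded
]

MONTHLY_RANK_SQL = """
WITH monthly AS (
    SELECT agent_name, SUM(amount) AS amount
    FROM t_client_pc_reg
    WHERE sale_status = 'Activated'
      AND reg_date >= date(:today, 'start of month')
      AND reg_date <  date(:today, 'start of month', '+1 month')
    GROUP BY agent_name
)
SELECT agent_name, amount, RANK() OVER (ORDER BY amount DESC) AS arank
FROM monthly
ORDER BY arank
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_client_pc_reg (agent_name TEXT, amount INTEGER, sale_status TEXT, reg_date TEXT)"
)
conn.executemany("INSERT INTO t_client_pc_reg VALUES (?, ?, ?, ?)", rows)

ranked = conn.execute(MONTHLY_RANK_SQL, {"today": TODAY}).fetchall()
for row in ranked:
    print(row)
```

Filtering in WHERE before grouping, rather than grouping by month and status and filtering in HAVING, is what keeps each agent to a single row for the current month.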