Better optimized SELECT SQL query for 50,000+ records - mysql

I have a query which works great for 1000 records or less but now I need to optimize it for 50,000+ records and when I run it on that it just stalls...
Here is my code:
SELECT
b1.account_num,b1.effective_date as ed1,b1.amount as am1,
b2.effective_date as ed2,b2.amount as am2
FROM bill b1
left join bill b2 on (b1.account_num=b2.account_num)
where b1.effective_date = (select max(effective_date) from bill where account_num = b1.account_num)
and (b2.effective_date = (select max(effective_date) from bill where account_num = b1.account_num and effective_date < (select max(effective_date) from bill where account_num = b1.account_num)) or b2.effective_date is null)
ORDER BY b1.effective_date DESC
My objective is to get the latest two effective dates and amounts from one table with many records.

Here is a working answer from your SQL-Fiddle baseline
First, the inner pre-query (pq1) gets the max date per account. That is then left-joined to the bill table per account, keeping only effective dates earlier than the max already detected, which yields the second-latest date.
That result is then joined back to the respective bill rows to get their amounts.
select
FB1.account_num,
FB1.effective_date as ed1,
FB1.amount as am1,
FB2.effective_date as ed2,
FB2.amount as am2
from
( select
pq1.account_num,
pq1.latestBill,
max( b2.effective_date ) as secondLastBill
from
( SELECT
b1.account_num,
max( b1.effective_date ) latestBill
from
bill b1
group by
b1.account_num ) pq1
LEFT JOIN bill b2
on pq1.account_num = b2.account_num
AND b2.effective_date < pq1.latestBill
group by
pq1.account_num ) Final
JOIN bill FB1
on Final.account_num = FB1.account_num
AND Final.latestBill = FB1.effective_date
LEFT JOIN bill FB2
on Final.account_num = FB2.account_num
AND Final.secondLastBill = FB2.effective_date
ORDER BY
Final.latestBill DESC
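Whichever form of the query is used, the per-account max(effective_date) lookups are likely to benefit from a composite index. A minimal sketch, assuming no such index already exists on bill; the index name idx_bill_account_date is hypothetical:

```sql
-- Hypothetical index name; the columns are the ones used in the query above.
-- Lets MySQL resolve max(effective_date) per account_num from the index.
CREATE INDEX idx_bill_account_date ON bill (account_num, effective_date);

-- Inspect the plan afterwards to check whether the index is picked up.
EXPLAIN
SELECT account_num, MAX(effective_date) AS latestBill
FROM bill
GROUP BY account_num;
```

Whether this helps depends on the actual data, so it is worth comparing the EXPLAIN output before and after creating the index.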

MySQL versions before 8.0 have no window functions such as ROW_NUMBER(), so we can simulate one with user variables.
The good thing is that the table is scanned only once with this approach.
A row number is assigned within each account_num partition, ordered by effective_date descending, and only the first 2 rows of each partition are kept.
select account_num,
max(case when row_number =1 then effective_date end) as ed1,
max(case when row_number =1 then amount end) as am1,
max(case when row_number =2 then effective_date end) as ed2,
max(case when row_number =2 then amount end )as am2
from (
select account_num, effective_date, amount,
@num := if(@prevacct= account_num , @num + 1, 1) as row_number,
@prevacct := account_num as dummy
from bill, (select @num:=0, @prevacct := '' ) as var
order by account_num , effective_date desc
)T
where row_number <=2
group by account_num
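On MySQL 8.0 or later, where window functions are available, the same idea can probably be written without user variables. A sketch under that version assumption; the aliases ranked and rn are illustrative and not part of the original answer:

```sql
-- Assumes MySQL 8.0+ (window functions).
SELECT account_num,
       MAX(CASE WHEN rn = 1 THEN effective_date END) AS ed1,
       MAX(CASE WHEN rn = 1 THEN amount END)         AS am1,
       MAX(CASE WHEN rn = 2 THEN effective_date END) AS ed2,
       MAX(CASE WHEN rn = 2 THEN amount END)         AS am2
FROM (
    -- Number the bills of each account from newest to oldest.
    SELECT account_num, effective_date, amount,
           ROW_NUMBER() OVER (PARTITION BY account_num
                              ORDER BY effective_date DESC) AS rn
    FROM bill
) ranked
WHERE rn <= 2
GROUP BY account_num
ORDER BY ed1 DESC;
```

Accounts with only one bill get NULL for ed2 and am2, matching the behaviour of the LEFT JOIN in the original query.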

Related

I need to get the last created eligible rider ids and pinged rider ids according to an orderId using a SQL query

I need to get my data set as this table
I am trying to get eligible set like this, need to group_concat pinged set also
x.id IN (SELECT MAX(x.id) FROM x WHERE ping rider id IS NULL GROUP BY orderId)
You can assign a group based on the cumulative number of non-null values in eligible_riders. Then aggregate and take the last value:
select og.*
from (select order_id, grp, max(eligible_riders) as eligible_riders,
group_concat(rider_id) as riders,
row_number() over (partition by order_id order by min(id) desc) as seqnum
from (select t.*,
sum(eligible_riders <> '') over (partition by order_id order by id) as grp
from t
) t
group by order_id, grp
) og
where seqnum = 1;
Hmmm . . . You could also do this with a correlated subquery, which might look a bit simpler:
select order_id, max(eligible_riders) as eligible_riders,
group_concat(rider_id) as riders
from t
where t.id >= (select max(t2.id)
from t t2
where t2.order_id = t.order_id and
t2.eligible_riders <> ''
)
group by order_id;
For performance, you want an index on (order_id, eligible_riders).

Get maximum value from two values

I have a table which gives the no of rides by a rider at each stand point. I need to find the stand for each rider for which he has the maximum rides.
My first result is in this format: 1
I require my final result like this: 2
I'm currently using this query, but I know it can be done in a better manner. Any suggestions would be helpful.
select c.rider_id, c.end_stand, b.max_rides
from
(select rider_id, max(rides) as max_rides
from
(select rider_id, end_stand, count(id) as rides
from ride where end_stand is not null
group by 1,2) a
group by 1
order by 2 desc, 1) b
join
(select rider_id, end_stand, count(id) as rides
from ride where end_stand is not null
group by 1,2) c
on c.rider_id = b.rider_id and c.rides = b.max_rides
order by 3 desc, 2,1
Before window functions, one method is a correlated subquery in the having clause:
select rider_id, end_stand, count(*) as rides
from ride r
where end_stand is not null
group by rider_id, end_stand
having count(*) = (select count(*)
from ride r2
where r2.end_stand is not null and
r2.rider_id = r.rider_id
group by r2.rider_id, r2.end_stand
order by count(*) desc
limit 1
);
With window functions, this is, of course, much simpler:
select *
from (select rider_id, end_stand, count(*) as rides,
rank() over (partition by rider_id order by count(*) desc) as seqnum
from ride r
where end_stand is not null
group by rider_id, end_stand
) r
where seqnum = 1;
Both these will return duplicates, if there are ties for the max. The second version is easy to fix, if you want only one row: use row_number() instead of rank().
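The row_number() variant mentioned above might look like the following. The secondary sort on end_stand is an assumption added here so that the row chosen among ties is deterministic; any other tie-breaker would work equally well:

```sql
-- Assumes MySQL 8.0+ (window functions).
SELECT rider_id, end_stand, rides
FROM (SELECT rider_id, end_stand, COUNT(*) AS rides,
             -- row_number() never repeats within a partition, so exactly
             -- one row per rider gets seqnum = 1 even when counts tie.
             ROW_NUMBER() OVER (PARTITION BY rider_id
                                ORDER BY COUNT(*) DESC, end_stand) AS seqnum
      FROM ride
      WHERE end_stand IS NOT NULL
      GROUP BY rider_id, end_stand
     ) r
WHERE seqnum = 1;
```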

MySql GROUP BY Max Date

I have a table called votes with 4 columns: id, name, choice, date.
id   name   vote   date
1    sam    A      01-01-17
2    sam    B      01-05-30
3    jon    A      01-01-19
My ultimate goal is to count up all the votes, but I only want to count 1 vote per person, and specifically each person's most recent vote.
In the example above, the result should be 1 vote for A, and 1 vote for B.
Here is what I currently have:
select name,
sum(case when uniques.choice = A then 1 else 0 end) votesA,
sum(case when uniques.choice = B then 1 else 0 end) votesB
FROM (
SELECT id, name, choice, max(date)
FROM votes
GROUP BY name
) uniques;
However, this doesn't work because the subquery is indeed selecting the max date, but it's not including the correct choice that is associated with that max date.
Don't think "group by" to get the most recent vote. Think of join or some other option. Here is one way:
SELECT v.name,
SUM(v.choice = 'A') as votesA,
SUM(v.choice = 'B') as votesB
FROM votes v
WHERE v.date = (SELECT MAX(v2.date) FROM votes v2 WHERE v2.name = v.name)
GROUP BY v.name;
Here is a SQL Fiddle.
Your attempt is close, but you need a self join.
The subquery gets the max date per name, then joins back to the table itself.
select
sum(case when T.vote = 'A' then 1 else 0 end) votesA,
sum(case when T.vote = 'B' then 1 else 0 end) votesB
FROM (
SELECT name,Max(date) as date
FROM T
GROUP BY name
) AS T1 INNER JOIN T ON T1.name = T.name AND T1.date = T.date
SQLFiddle
Try this
SELECT
v.choice,
COUNT(1)
FROM
votes v
INNER JOIN
(
SELECT
name,
max(date) AS max_date
FROM
votes
GROUP BY
name
) tmp ON
v.name = tmp.name
AND v.date = tmp.max_date
GROUP BY
v.choice;
Something like this (if you really need count only last vote of person)
SELECT
sum(case when vote='A' then cnt else 0 end) voteA,
sum(case when vote='B' then cnt else 0 end) voteB
FROM
(SELECT vote,count(distinct name) cnt
FROM (
SELECT name,vote,date,max(date) over (partition by name) maxd
FROM votes
) t
WHERE date=maxd
GROUP BY vote
) c
PS. MySQL v 8
select
name,
sum( case when choice = 'A' then 1 else 0 end) voteA,
sum( case when choice = 'B' then 1 else 0 end) voteB
from
(
select id, name, choice
from votes
where date = (select max(date) from votes t2
where t2.name = votes.name )
) t
group by name
Or output just one row for the total counts of VoteA and VoteB:
select
sum( case when choice = 'A' then 1 else 0 end) voteA,
sum( case when choice = 'B' then 1 else 0 end) voteB
from
(
select id, name, choice
from votes
where date = (select max(date) from votes t2
where t2.name = votes.name )
) t
Based on @d-shish's solution: since ONLY_FULL_GROUP_BY became part of the default SQL mode (in MySQL 5.7.5), the GROUP BY clause must be placed in the subquery, like this:
SELECT v.`name`,
SUM(v.`choice` = 'A') as `votesA`,
SUM(v.`choice` = 'B') as `votesB`
FROM `votes` v
WHERE (
SELECT MAX(v2.`date`)
FROM `votes` v2
WHERE v2.`name` = v.`name`
GROUP BY v.`name` # << after
) = v.`date`
# GROUP BY v.`name` << before
Otherwise, it won't work anymore !

How to select last and last but one records

I have a table with 3 columns id, type, value like in image below.
What I'm trying to do is to make a query to get the data in this format:
type previous current
month-1 666 999
month-2 200 15
month-3 0 12
I made this query but it gets just the last value
select *
from statistics
where id in (select max(id) from statistics group by type)
order by type
EDIT: Live example http://sqlfiddle.com/#!9/af81da/1
Thanks!
I would write this as:
select s.*,
(select s2.value
from statistics s2
where s2.type = s.type
order by id desc
limit 1, 1
) value_prev
from statistics s
where id in (select max(id) from statistics s group by type) order by type;
This should be relatively efficient with an index on statistics(type, id).
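The index mentioned above could be created as follows; the index name idx_statistics_type_id is a hypothetical choice:

```sql
-- Hypothetical index name. Covers both the max(id) per type lookup and
-- the "order by id desc limit 1, 1" subquery for each type.
CREATE INDEX idx_statistics_type_id ON statistics (type, id);
```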
select
type,
ifnull(max(case when seq = 2 then value end),0 ) previous,
max( case when seq = 1 then value end ) current
from
(
select *, (select count(*)
from statistics s
where s.type = statistics.type
and s.id >= statistics.id) seq
from statistics ) t
where seq <= 2
group by type
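On MySQL 8.0 or later, a window-function version is another option. This is a sketch under that version assumption; the output columns are named previous_value and current_value here (rather than previous and current) purely to steer clear of keyword clashes, and the aliases s and rn are illustrative:

```sql
-- Assumes MySQL 8.0+ (window functions).
SELECT type,
       COALESCE(previous_value, 0) AS previous_value,
       current_value
FROM (
    SELECT type,
           value AS current_value,
           -- Value of the row just before this one within the same type.
           LAG(value) OVER (PARTITION BY type ORDER BY id) AS previous_value,
           -- rn = 1 marks the newest row of each type.
           ROW_NUMBER() OVER (PARTITION BY type ORDER BY id DESC) AS rn
    FROM statistics
) s
WHERE rn = 1
ORDER BY type;
```

Types with a single row get 0 as the previous value, which mirrors the ifnull(..., 0) in the query above.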

MySql Start and End price (Min,Max) with Inner Joins

I have a table of prices for 2 types: metal 1 and metal 2.
I have succeeded in getting the max and min price for each metal, grouped by day.
How can I also select the start (first) and end (last) price of every day too?
I am nearly there, but struggling to get these two final prices...
My SQL fiddle with example data:
http://sqlfiddle.com/#!9/ca4867/1
My query so far:
select
highp.metal_price_datetime_IST AS high_price_metal_price_datetime_IST
, highp.metal_price as highest_price
, lowp.report_term
, lowp.metal_id
, lowp.metal_price as lowest_price
, lowp.metal_price_datetime_IST AS low_price_metal_price_datetime_IST
from (select @report_term:=concat(day(metal_price_datetime_IST), ' ', monthname(metal_price_datetime_IST), ' ', year(metal_price_datetime_IST)) as report_term
, metal_price_datetime_IST
, metal_price
, metal_id
, case when @report_term=@old_report_term then @rn1:=@rn1+1 else @rn1:=1 end as rn
, @old_report_term:=@report_term
from metal_prices
cross join (select @rn1:=0, @old_report_term:='') inituservar1
where metal_price_datetime_IST BETWEEN '2018-02-01' AND LAST_DAY('2018-02-01')
order by metal_id, report_term, metal_price asc) lowp
inner join (select @report_term2:=concat(day(metal_price_datetime_IST), ' ', monthname(metal_price_datetime_IST), ' ', year(metal_price_datetime_IST)) as report_term
, metal_price_datetime_IST
, metal_price
, metal_id
, case when @report_term2=@old_report_term2 then @rn2:=@rn2+1 else @rn2:=1 end as rn
, @old_report_term2:=@report_term2
from metal_prices
cross join (select @rn2:=0, @old_report_term2:='') inituservar1
where metal_price_datetime_IST BETWEEN '2018-02-01' AND LAST_DAY('2018-02-01')
order by metal_id, report_term, metal_price desc) highp
on lowp.rn=highp.rn
and lowp.metal_id = highp.metal_id
and lowp.report_term = highp.report_term
and lowp.rn = 1
and (lowp.metal_id = 1 or lowp.metal_id = 2)
order by lowp.metal_price_datetime_IST DESC
The query in your fiddle seems too complex for what needs to be done, so I have refactored and rewritten it. The query is split into two parts. The first, maxminprice, determines the max and min price for each metal on each day, which is fairly straightforward. The second, firstlastprice, is a bit more complex: it finds the earliest and latest timestamps for each metal on each day, then joins back to the main table to get the prices at those timestamps. The CASE expressions there merge the results for the first and last time, so we don't have to run that query twice.
SELECT maxminprice.metal_id,
maxminprice.metal_price_datetime,
maxminprice.max_price,
maxminprice.min_price,
firstlastprice.first_price,
firstlastprice.last_price
FROM (SELECT metal_id,
DATE(metal_price_datetime) metal_price_datetime,
MAX(metal_price) max_price,
MIN(metal_price) min_price
FROM metal_prices
GROUP BY metal_id,
DATE(metal_price_datetime)
ORDER BY metal_id,
DATE(metal_price_datetime)) maxminprice
INNER JOIN (SELECT mp.metal_id,
day_range.metal_price_datetimefl,
SUM(CASE
WHEN TIME(mp.metal_price_datetime) = first_time
THEN
mp.metal_price
ELSE NULL
END) first_price,
SUM(CASE
WHEN TIME(mp.metal_price_datetime) = last_time
THEN
mp.metal_price
ELSE NULL
END) last_price
FROM metal_prices mp
INNER JOIN (SELECT metal_id,
DATE(metal_price_datetime)
metal_price_datetimefl,
MAX(TIME(metal_price_datetime))
last_time,
MIN(TIME(metal_price_datetime))
first_time
FROM metal_prices
GROUP BY metal_id,
DATE(metal_price_datetime))
day_range
ON mp.metal_id = day_range.metal_id
AND DATE(mp.metal_price_datetime) =
day_range.metal_price_datetimefl
AND TIME(mp.metal_price_datetime) IN
( last_time, first_time )
GROUP BY mp.metal_id,
day_range.metal_price_datetimefl) firstlastprice
ON maxminprice.metal_id = firstlastprice.metal_id
AND maxminprice.metal_price_datetime =
firstlastprice.metal_price_datetimefl
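On MySQL 8.0 or later, the same four values per metal and day can probably be obtained in a single pass with window functions. A sketch under that version assumption, using the column name metal_price_datetime as in the answer above; the aliases price_date and w are illustrative:

```sql
-- Assumes MySQL 8.0+ (window functions and named windows).
SELECT DISTINCT
       metal_id,
       DATE(metal_price_datetime) AS price_date,
       MAX(metal_price) OVER w AS max_price,
       MIN(metal_price) OVER w AS min_price,
       -- Earliest price of the day: first row when sorted ascending.
       FIRST_VALUE(metal_price)
           OVER (w ORDER BY metal_price_datetime)      AS first_price,
       -- Latest price of the day: first row when sorted descending.
       FIRST_VALUE(metal_price)
           OVER (w ORDER BY metal_price_datetime DESC) AS last_price
FROM metal_prices
WINDOW w AS (PARTITION BY metal_id, DATE(metal_price_datetime))
ORDER BY metal_id, price_date;
```

Every row in a partition carries the same six values, so DISTINCT collapses the output to one row per metal and day. If two prices share the exact same timestamp, which of them counts as first or last is not determined by this sketch.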