SQL join table lower date and lower id - mysql

I have the following two tables, START and REPEAT.
START
INSPECID  SCORE
1         3
2         1
3         4
REPEAT
ID  INSPECID  SCORE  DATE
1   1         9      12/01/2016
2   1         1      11/01/2016
3   2         2      29/01/2016
4   2         4      01/01/2016
5   2         3      22/01/2016
6   2         5      02/01/2016
7   2         1      11/01/2016
8   2         1      01/01/2016
9   3         1      02/01/2016
10  3         2      09/01/2016
I expect the following result:
INCREASED------1
DECREASED------2
EQUAL----------0
Rules
1) Join the tables by INSPECID.
2) When more than one row for an INSPECID is found in the REPEAT table, use the score from the row with the earliest date.
3) When several rows share both the INSPECID and the earliest date, use the row with the lowest ID in the REPEAT table. For example, ID 4 and ID 8 have the same date and the same INSPECID, so use the score of ID 4, which is 4.

Do a self join on the REPEAT table to pick the oldest row for each INSPECID:
select s.*,a.*
from `START` s
join `REPEAT` a on s.INSPECID = a.INSPECID
left join `REPEAT` b on a.INSPECID = b.INSPECID
and case when a.DATE = b.DATE
then a.ID > b.ID
else a.DATE > b.DATE
end
where b.INSPECID is null
When INSPECID and DATE are both the same, the CASE expression breaks the tie by choosing the row with the lowest ID.
Demo
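As a quick check of the self-join idea, here is a minimal sketch that runs the same anti-join against the sample data. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, a hypothetical table name repeat_scores for the REPEAT table, and dates rewritten as ISO strings so that they compare chronologically.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; repeat_scores is a
# hypothetical name for the REPEAT table, with dates stored as ISO strings.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE repeat_scores (id INTEGER, inspecid INTEGER, score INTEGER, date TEXT);
INSERT INTO repeat_scores VALUES
  (1, 1, 9, '2016-01-12'), (2, 1, 1, '2016-01-11'),
  (3, 2, 2, '2016-01-29'), (4, 2, 4, '2016-01-01'),
  (5, 2, 3, '2016-01-22'), (6, 2, 5, '2016-01-02'),
  (7, 2, 1, '2016-01-11'), (8, 2, 1, '2016-01-01'),
  (9, 3, 1, '2016-01-02'), (10, 3, 2, '2016-01-09');
""")

# Keep row a only when no row b of the same INSPECID is "older":
# an earlier date, or the same date with a lower id.
earliest = conn.execute("""
SELECT a.inspecid, a.id, a.score
FROM repeat_scores a
LEFT JOIN repeat_scores b
  ON a.inspecid = b.inspecid
 AND CASE WHEN a.date = b.date THEN a.id > b.id ELSE a.date > b.date END
WHERE b.inspecid IS NULL
ORDER BY a.inspecid
""").fetchall()

print(earliest)  # [(1, 2, 1), (2, 4, 4), (3, 9, 1)]
```

For INSPECID 2 the tie between ID 4 and ID 8 on the same date resolves to ID 4, as rule 3 asks.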
Updated for desired result set
select t.result,count(t1.result) cnt
from (
select 'Increased' result
union
select 'Decreased' result
union
select 'Equal' result
) t
left join (
select s.score,a.id,a.DATE,
case when a.SCORE > s.SCORE
then 'Increased'
when a.SCORE < s.SCORE
then 'Decreased'
else 'Equal'
end result
from `START` s
join `REPEAT` a on s.INSPECID = a.INSPECID
left join `REPEAT` b on a.INSPECID = b.INSPECID
and case when a.DATE = b.DATE
then a.ID > b.ID
else a.DATE > b.DATE
end
where b.INSPECID is null
) t1 using(result)
group by t.result
Demo
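To check the counting step against the expected result in the question, here is a small sketch of the same shape of query. It assumes SQLite via Python's sqlite3 in place of MySQL, hypothetical table names start_scores and repeat_scores, and ISO date strings. The comparison treats the earliest REPEAT score as the "after" value and the START score as the "before" value, which is the direction that reproduces the counts the question expects.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE start_scores (inspecid INTEGER, score INTEGER);
CREATE TABLE repeat_scores (id INTEGER, inspecid INTEGER, score INTEGER, date TEXT);
INSERT INTO start_scores VALUES (1, 3), (2, 1), (3, 4);
INSERT INTO repeat_scores VALUES
  (1, 1, 9, '2016-01-12'), (2, 1, 1, '2016-01-11'),
  (3, 2, 2, '2016-01-29'), (4, 2, 4, '2016-01-01'),
  (5, 2, 3, '2016-01-22'), (6, 2, 5, '2016-01-02'),
  (7, 2, 1, '2016-01-11'), (8, 2, 1, '2016-01-01'),
  (9, 3, 1, '2016-01-02'), (10, 3, 2, '2016-01-09');
""")

# The fixed list of labels on the left guarantees a row (with count 0)
# even for a category that no INSPECID falls into.
counts = dict(conn.execute("""
SELECT t.result, COUNT(t1.result)
FROM (SELECT 'Increased' AS result
      UNION ALL SELECT 'Decreased'
      UNION ALL SELECT 'Equal') t
LEFT JOIN (
  SELECT CASE WHEN a.score > s.score THEN 'Increased'
              WHEN a.score < s.score THEN 'Decreased'
              ELSE 'Equal' END AS result
  FROM start_scores s
  JOIN repeat_scores a ON s.inspecid = a.inspecid
  LEFT JOIN repeat_scores b
    ON a.inspecid = b.inspecid
   AND CASE WHEN a.date = b.date THEN a.id > b.id ELSE a.date > b.date END
  WHERE b.inspecid IS NULL
) t1 ON t1.result = t.result
GROUP BY t.result
""").fetchall())

print(counts)
```

COUNT(t1.result) counts only matched rows, so the unmatched 'Equal' label comes back as 0 rather than disappearing.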

This is a bit tricky. The following uses the group_concat() trick for calculating the first and last scores. It then puts these into the categories that you want:
select w.which, count(r.INSPECID)
from (select 'DECREASING' as which union all
select 'INCREASING' as which union all
select 'EQUAL' as which
) w left join
(select r.INSPECID,
(substring_index(group_concat(score order by date), ',', 1) + 0) as first_score,
(substring_index(group_concat(score order by date desc), ',', 1) + 0) as last_score
from `REPEAT` r
group by INSPECID
) r
ON (last_score > first_score and w.which = 'INCREASING') or
(last_score < first_score and w.which = 'DECREASING') or
(last_score = first_score and w.which = 'EQUAL')
group by w.which;
Note that the first table is not necessary.

Related

Why integer cast is not working with integer group_concat() list?

I'm stuck on a query where I need to concatenate the IDs of a table and then fetch the rows with those IDs in a subquery. When I try this, MySQL treats the group_concat() result as a single string, so the IN condition is false.
select count(*)
from rides r
where r.ride_status = 'cancelled'
and r.id IN (group_concat(rides.id))
*************** Original Query Below **************
-- Daily Earnings for 7 days [Final]
select
group_concat(rides.id) as ids,
group_concat(ride_category.name) as rideType,
group_concat(ride_cars.amount + ride_cars.commission) as rideAmount ,
group_concat(ride_types.name) as carType,
count(*) as numberOfRides,
(
select count(*) from rides r where r.ride_status = 'cancelled' and r.id IN (group_concat(rides.id) )
) as cancelledRides,
(
select count(*) from rides r where r.`ride_status` = 'completed' and r.id IN (group_concat(rides.id))
) as completedRides,
group_concat(ride_cars.status) as status,
sum(ride_cars.commission) + sum(ride_cars.amount) as amount,
date_format(from_unixtime(rides.requested_at/1000 + rides.offset*60), '%Y-%m-%d') as requestedDate,
date_format(from_unixtime(rides.requested_at/1000 + rides.offset*60), '%V') as week
from
ride_cars,
rides,
ride_category,
ride_type_cars,
ride_types
where
ride_cars.user_id = 166
AND (rides.ride_status = 'completed' or rides.ride_status = 'cancelled')
AND ride_cars.ride_id = rides.id
AND (rides.requested_at >= 1559347200000 AND requested_at < 1561852800000)
AND rides.ride_category = ride_category.id
AND ride_cars.car_model_id = ride_type_cars.car_model_id
AND ride_cars.ride_type_id = ride_types.id
group by
requestedDate;
Any solutions will be appreciated.
Try replacing the subquery
(select count(*) from rides r where r.ride_status = 'cancelled' and r.id IN (group_concat(rides.id) )) as cancelledRides,
with the expression below, which counts using SUM and CASE and therefore respects the GROUP BY:
SUM(CASE WHEN rides.ride_status = 'cancelled' THEN 1 ELSE 0 END) as cancelledRides
Do the same for completedRides.
Also consider moving to explicit JOIN syntax instead of implicit joins.
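A minimal sketch of the SUM(CASE ...) counting, assuming SQLite via Python's sqlite3 instead of MySQL and a cut-down, hypothetical rides table with a ready-made requested_date column:

```python
import sqlite3

# Assumption: a simplified rides table with made-up rows; the real schema
# in the question has more columns and derives the date from a timestamp.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rides (id INTEGER PRIMARY KEY, ride_status TEXT, requested_date TEXT);
INSERT INTO rides VALUES
  (1, 'completed', '2019-06-01'),
  (2, 'cancelled', '2019-06-01'),
  (3, 'completed', '2019-06-01'),
  (4, 'cancelled', '2019-06-02');
""")

# Each CASE yields 1 for matching rows and 0 otherwise, so SUM counts
# matches inside every group without any subquery.
daily = conn.execute("""
SELECT requested_date,
       COUNT(*) AS numberOfRides,
       SUM(CASE WHEN ride_status = 'cancelled' THEN 1 ELSE 0 END) AS cancelledRides,
       SUM(CASE WHEN ride_status = 'completed' THEN 1 ELSE 0 END) AS completedRides
FROM rides
GROUP BY requested_date
ORDER BY requested_date
""").fetchall()

print(daily)  # [('2019-06-01', 3, 1, 2), ('2019-06-02', 1, 1, 0)]
```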

Convert SumIfs Excel Function to MySQL

The formula in cell G2 "ReplenQty" is:
=SUMIFS(D:D,A:A,A2,B:B,B2,C:C,">=" & E2,C:C,"<=" &F2)
The formula in cell H2 "RpInVar" is:
=IF($A2<>$A1,ROUND(VAR(IF($A:$A=$A2,$G:$G)),2),0)
I attempted this in MySQL:
SELECT DISTINCT
Part,
Customer,
OrdDt,
OrdQty,
StartDate,
ReplenDate,
SUM(CASE WHEN Part = Part AND Customer = Customer AND OrdDt >= StartDate AND OrdDt <= ReplenDate THEN OrdQty ELSE 0 END) AS ReplenQty,
VARIANCE(CASE WHEN Part = Part AND Customer = Customer AND OrdDt >= StartDate AND OrdDt <= ReplenDate THEN OrdQty ELSE 0 END) AS RpInVar
FROM
BeforeReplenQty
GROUP BY
Part,
Customer,
OrdDt,
OrdQty,
StartDate,
ReplenDate;
The problem is that ReplenQty always equals OrdQty and RpInVar is always 0.
This query is quite long and complicated, but it works in this demo: http://sqlfiddle.com/#!9/3b3334/70
One task is to sum the order quantity where the order date is between the start date and the replenish date.
The other is to find the rows where the part is new compared to the previous row.
The first part of the query computes the variance, the second subquery computes the sum of the ordered quantity, and the subquery at the bottom finds the rows where the part column has changed.
select tab.Part,tab.Customer,tab.OrdDt,tab.OrdQty,tab.StartDate,tab.ReplenDate,tab.ReplenQty,
case when sumtab.Rnk=1 then
(select variance(ReplenQty)
from (select sum(t2.OrdQty) as ReplenQty
from BeforeReplenQty t2
inner join BeforeReplenQty t1
where t2.part=t1.part and t2.customer=t1.customer
and t2.OrdDt between t1.StartDate and t1.ReplenDate
group by t1.Part,t1.Customer,t1.OrdDt,t1.OrdQty,t1.StartDate,t1.ReplenDate) t3) else 0 end as ReplenVar
from (
select t1.*,sum(t2.OrdQty) as ReplenQty
from BeforeReplenQty t2
inner join BeforeReplenQty t1
where t2.part=t1.part and t2.customer=t1.customer
and t2.OrdDt between t1.StartDate and t1.ReplenDate
group by t1.Part,t1.Customer,t1.OrdDt,t1.OrdQty,t1.StartDate,t1.ReplenDate) tab
left join (select part,customer,orddt,rnk
from (
select t.part,t.customer,t.OrdDt,
@s:=CASE WHEN @c <> t.part THEN 1 ELSE @s+1 END AS rnk,
@c:=t.part AS partSet
from (SELECT @s:= 0) s
inner join (SELECT @c:= 'A') c
inner join (SELECT * from BeforeReplenQty
order by Part, Customer, OrdDt) t
) tab
where rnk = 1
) sumtab
on tab.part=sumtab.part and tab.customer=sumtab.customer and tab.orddt=sumtab.orddt;
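The SUMIFS part on its own can be sketched as a correlated subquery: for each row, sum OrdQty over the rows with the same Part and Customer whose OrdDt falls inside that row's StartDate..ReplenDate window. This is only an illustration, assuming SQLite via Python's sqlite3 rather than MySQL, with made-up rows and ISO date strings; it does not cover the variance column.

```python
import sqlite3

# Assumption: hypothetical sample rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BeforeReplenQty (
  Part TEXT, Customer TEXT, OrdDt TEXT, OrdQty INTEGER,
  StartDate TEXT, ReplenDate TEXT);
INSERT INTO BeforeReplenQty VALUES
  ('A', 'C1', '2020-01-01', 10, '2020-01-01', '2020-01-03'),
  ('A', 'C1', '2020-01-02',  5, '2020-01-02', '2020-01-05'),
  ('A', 'C1', '2020-01-05',  7, '2020-01-05', '2020-01-07'),
  ('B', 'C1', '2020-01-02',  3, '2020-01-02', '2020-01-04');
""")

# cur is the "current row" (A2, B2, E2, F2 in the Excel formula);
# o ranges over the whole table, like the D:D / A:A / B:B / C:C columns.
rows = conn.execute("""
SELECT cur.Part, cur.Customer, cur.OrdDt, cur.OrdQty,
       (SELECT SUM(o.OrdQty)
        FROM BeforeReplenQty o
        WHERE o.Part = cur.Part
          AND o.Customer = cur.Customer
          AND o.OrdDt BETWEEN cur.StartDate AND cur.ReplenDate) AS ReplenQty
FROM BeforeReplenQty cur
ORDER BY cur.Part, cur.Customer, cur.OrdDt
""").fetchall()

print([r[4] for r in rows])  # [15, 12, 7, 3]
```

The attempt in the question compares each column with itself (Part = Part), which is always true within a single row, so it cannot see the other rows the way the aliases cur and o do here.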

MYSQL Query with Multiple Selects from Same Table

I am getting this error:
Operand should contain 1 column(s)
The primary key is ID.
The application just dumps data into the table.
I need to get the quantity at the earliest date and the quantity at the latest date and display both in the same result.
Any help is appreciated.
SELECT ebx_r_history.ItemNumber,
(SELECT r.QuantitySold as newqty, r.lastupdate as lu
FROM ebx_r_history r
WHERE ebx_r_history.ItemNumber = r.ItemNumber AND ebx_r_history.SKU = r.SKU
ORDER BY r.LastUpdate ASC
LIMIT 1),
(SELECT r.QuantitySold as newqty, r.lastupdate as lu
FROM ebx_r_history r
WHERE ebx_r_history.ItemNumber = r.ItemNumber AND ebx_r_history.SKU = r.SKU
ORDER BY r.LastUpdate DESC
LIMIT 1)
FROM
ebx_r_history
GROUP BY ebx_r_history.ItemNumber,
ebx_r_history.SKU
ORDER BY ebx_r_history.LastUpdate
This version may offer a simpler and faster alternative. The inner query "AllItems" computes both the MIN and the MAX of lastupdate per item number/SKU, although they may turn out to be one and the same record.
Now join that result back to the history data by item/SKU, keeping only the rows that match either the min or the max date. With a true date/time value you would expect only one row for each, as opposed to a date-only value. Since there can be two rows per group (one for the min, one for the max), I apply a conditional MAX() to pick the quantity matching the minimum and the maximum date respectively, and must retain the GROUP BY clause.
Note: if you are dealing with date-only entries, or if the same item/SKU can have identical lastupdate values down to the second, you would need an approach closer to LIMIT 1 on an ascending/descending sort.
SELECT
AllItems.ItemNumber,
AllItems.SKU,
AllItems.MinUpdate,
MAX( IF( rh.lastupdate = AllItems.MinUpdate, rh.QuantitySold, 0 )) as QtyAtMinDate,
AllItems.MaxUpdate,
MAX( IF( rh.lastupdate = AllItems.MaxUpdate, rh.QuantitySold, 0 )) as QtyAtMaxDate
from
( SELECT
r.ItemNumber,
r.SKU,
MIN( r.lastupdate ) as MinUpdate,
MAX( r.lastupdate ) as MaxUpdate
FROM
ebx_r_history r
group by
r.ItemNumber,
r.SKU ) AllItems
JOIN ebx_r_history rh
ON AllItems.ItemNumber = rh.ItemNumber
AND AllItems.SKU = rh.SKU
AND ( rh.lastUpdate = AllItems.MinUpdate
OR rh.lastUpdate = AllItems.MaxUpdate )
group by
AllItems.ItemNumber,
AllItems.SKU
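Here is a small sketch of that min/max join-back, assuming SQLite via Python's sqlite3 in place of MySQL, a CASE expression in place of the conditional function, and made-up history rows:

```python
import sqlite3

# Assumption: hypothetical rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ebx_r_history (
  ID INTEGER PRIMARY KEY, ItemNumber TEXT, SKU TEXT,
  QuantitySold INTEGER, LastUpdate TEXT);
INSERT INTO ebx_r_history VALUES
  (1, 'I1', 'S1',  5, '2020-01-01'),
  (2, 'I1', 'S1',  8, '2020-01-05'),
  (3, 'I1', 'S1', 12, '2020-01-09'),
  (4, 'I2', 'S2',  1, '2020-01-02'),
  (5, 'I2', 'S2',  4, '2020-01-03');
""")

# AllItems holds the first and last update per item/SKU; the join keeps
# only the history rows at those two timestamps, and the conditional MAX
# pulls each quantity into its own column.
result = conn.execute("""
SELECT AllItems.ItemNumber, AllItems.SKU,
       AllItems.MinUpdate,
       MAX(CASE WHEN rh.LastUpdate = AllItems.MinUpdate
                THEN rh.QuantitySold ELSE 0 END) AS QtyAtMinDate,
       AllItems.MaxUpdate,
       MAX(CASE WHEN rh.LastUpdate = AllItems.MaxUpdate
                THEN rh.QuantitySold ELSE 0 END) AS QtyAtMaxDate
FROM (SELECT ItemNumber, SKU,
             MIN(LastUpdate) AS MinUpdate,
             MAX(LastUpdate) AS MaxUpdate
      FROM ebx_r_history
      GROUP BY ItemNumber, SKU) AllItems
JOIN ebx_r_history rh
  ON AllItems.ItemNumber = rh.ItemNumber
 AND AllItems.SKU = rh.SKU
 AND (rh.LastUpdate = AllItems.MinUpdate OR rh.LastUpdate = AllItems.MaxUpdate)
GROUP BY AllItems.ItemNumber, AllItems.SKU, AllItems.MinUpdate, AllItems.MaxUpdate
ORDER BY AllItems.ItemNumber
""").fetchall()

print(result)
# [('I1', 'S1', '2020-01-01', 5, '2020-01-09', 12),
#  ('I2', 'S2', '2020-01-02', 1, '2020-01-03', 4)]
```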
Per another answer, where you only wanted to IGNORE items older than 14 days, you can add a WHERE clause like this to the inner query:
WHERE r.LastUpdate >= CURDATE() - INTERVAL 14 DAY
If your history table has an auto-incrementing ID column, AND the rows are stamped with lastUpdate sequentially, i.e. when they are added and never modified by any other operation, then you could apply the same idea to MIN/MAX of the ID column and join back TWICE on the ID, so that each join hits exactly ONE row, such as...
SELECT
AllItems.ItemNumber,
AllItems.SKU,
rhMin.LastUpdate as MinUpdate,
rhMin.QuantitySold as MinSold,
rhMax.LastUpdate as MaxUpdate,
rhMax.QuantitySold as MaxSold
from
( SELECT
r.ItemNumber,
r.SKU,
MIN( r.AutoIncrementColumn ) as MinAutoID,
MAX( r.AutoIncrementColumn ) as MaxAutoID
FROM
ebx_r_history r
group by
r.ItemNumber,
r.SKU ) AllItems
JOIN ebx_r_history rhMin
ON AllItems.MinAutoID = rhMin.AutoIncrementColumn
JOIN ebx_r_history rhMax
ON AllItems.MaxAutoID = rhMax.AutoIncrementColumn
order by
rhMax.LastUpdate
Try something like this:
SELECT r1.ItemNumber,
(
SELECT r.QuantitySold
FROM ebx_r_history r
WHERE r1.ItemNumber = r.ItemNumber
AND r1.SKU = r.SKU
ORDER BY r.LastUpdate ASC LIMIT 1
) AS earliestDateQty,
(
SELECT r.QuantitySold
FROM ebx_r_history r
WHERE r1.ItemNumber = r.ItemNumber
AND r1.SKU = r.SKU
ORDER BY r.LastUpdate DESC LIMIT 1
) AS latestDateQty
FROM ebx_r_history r1
GROUP BY r1.ItemNumber,r1.SKU
ORDER BY 3
You had a couple of errors: you were selecting two columns inside each inner select, and there were a couple of places where you might get an ambiguous column name error.
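The one-column-per-subquery rule can be tried out with a small sketch. It assumes SQLite via Python's sqlite3 rather than MySQL, and made-up history rows; each subquery in the select list returns exactly one column and one row.

```python
import sqlite3

# Assumption: hypothetical rows; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ebx_r_history (
  ID INTEGER PRIMARY KEY, ItemNumber TEXT, SKU TEXT,
  QuantitySold INTEGER, LastUpdate TEXT);
INSERT INTO ebx_r_history VALUES
  (1, 'I1', 'S1',  5, '2020-01-01'),
  (2, 'I1', 'S1',  8, '2020-01-05'),
  (3, 'I1', 'S1', 12, '2020-01-09'),
  (4, 'I2', 'S2',  1, '2020-01-02'),
  (5, 'I2', 'S2',  4, '2020-01-03');
""")

# A subquery used as a select-list value must yield a single column, so
# the quantity is selected alone and the date is only used for ordering.
qtys = conn.execute("""
SELECT r1.ItemNumber, r1.SKU,
       (SELECT r.QuantitySold FROM ebx_r_history r
        WHERE r.ItemNumber = r1.ItemNumber AND r.SKU = r1.SKU
        ORDER BY r.LastUpdate ASC LIMIT 1) AS earliestDateQty,
       (SELECT r.QuantitySold FROM ebx_r_history r
        WHERE r.ItemNumber = r1.ItemNumber AND r.SKU = r1.SKU
        ORDER BY r.LastUpdate DESC LIMIT 1) AS latestDateQty
FROM ebx_r_history r1
GROUP BY r1.ItemNumber, r1.SKU
ORDER BY r1.ItemNumber
""").fetchall()

print(qtys)  # [('I1', 'S1', 5, 12), ('I2', 'S2', 1, 4)]
```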
sqlFiddle here
SELECT T1.ItemNumber,
T1.SKU,
T1.Old_QuantitySold,
T1.Old_LastUpdate,
T2.New_QuantitySold,
T2.New_LastUpdate
FROM
(SELECT itemNumber,SKU,QuantitySold as Old_QuantitySold,LastUpdate as Old_LastUpdate
FROM ebx_r_history r
WHERE NOT EXISTS (SELECT 1 FROM ebx_r_history e
WHERE e.itemNumber = r.itemNumber AND e.SKU = r.SKU
AND e.LastUpdate < r.LastUpdate)
)T1
LEFT JOIN
(SELECT itemNumber,SKU,QuantitySold as New_QuantitySold,LastUpdate as New_LastUpdate
FROM ebx_r_history r
WHERE NOT EXISTS (SELECT 1 FROM ebx_r_history e
WHERE e.itemNumber = r.itemNumber AND e.SKU = r.SKU
AND e.LastUpdate > r.LastUpdate)
)T2 ON (T2.itemNumber = T1.itemNumber AND T2.SKU = T1.SKU)
WHERE T1.Old_LastUpdate >= CURDATE() - INTERVAL 14 DAY
AND T2.New_LastUpdate >= CURDATE() - INTERVAL 14 DAY
ORDER BY T2.New_LastUpdate;
You can use LEFT JOIN or INNER JOIN, it's up to you, since T1 will always hold the earliest record and T2 the latest record for each ItemNumber, SKU group.
UPDATED TO IGNORE DATA OLDER THAN 14 DAYS
SELECT T1.ItemNumber,
T1.SKU,
T1.Old_QuantitySold,
T1.Old_LastUpdate,
T2.New_QuantitySold,
T2.New_LastUpdate
FROM
(SELECT itemNumber,SKU,QuantitySold as Old_QuantitySold,LastUpdate as Old_LastUpdate
FROM ebx_r_history r
WHERE LastUpdate >= CURDATE() - INTERVAL 14 DAY
AND NOT EXISTS (SELECT 1 FROM ebx_r_history e
WHERE e.itemNumber = r.itemNumber AND e.SKU = r.SKU
AND e.LastUpdate >= CURDATE() - INTERVAL 14 DAY
AND e.LastUpdate < r.LastUpdate)
)T1
LEFT JOIN
(SELECT itemNumber,SKU,QuantitySold as New_QuantitySold,LastUpdate as New_LastUpdate
FROM ebx_r_history r
WHERE LastUpdate >= CURDATE() - INTERVAL 14 DAY
AND NOT EXISTS (SELECT 1 FROM ebx_r_history e
WHERE e.itemNumber = r.itemNumber AND e.SKU = r.SKU
AND e.LastUpdate >= CURDATE() - INTERVAL 14 DAY
AND e.LastUpdate > r.LastUpdate)
)T2 ON (T2.itemNumber = T1.itemNumber AND T2.SKU = T1.SKU)
ORDER BY T2.New_LastUpdate;
ignore data older than 14 days sqlFiddle here
If you want to use the exact time (14 days ago), you can replace occurrences of CURDATE() with NOW().

Entries a specific distance away from others

My table has a NAME and a DISTANCE column. I'd like to list all the names that are within N units of another row with the same name. For example, given:
NAME DISTANCE
a 2
a 4
a 3
a 7
a 1
b 3
b 1
b 2
b 5
(let's say N = 2)
I would like
a 2
a 4
a 3
a 1
...
...
Instead of
a 2
a 2 (because it double counts)
I'm trying to apply this method in order to solve for a customerID with claim dates (stored as number) that appear in clusters around each other. I'd like to be able to label the customerID and the claim date that is within say 10 days of another claim by that same customer. i.e., |a.claimdate - b.claimdate| <= 10. When I use this method
WHERE a.CUSTID = b.CUSTID
AND a.CLDATE BETWEEN b.CLDATE - 10 AND b.CLDATE + 10
AND a.CLAIMID <> b.CLAIMID
I double count. CLAIMID is unique.
Since you don't need the text, and just want the values, you can accomplish that using DISTINCT:
select distinct t.name, t.distance
from yourtable t
join yourtable t2 on t.name = t2.name
and (t.distance = t2.distance+1 or t.distance = t2.distance-1)
order by t.name
SQL Fiddle Demo
Given your edits, if you're looking for results between a certain distance, you can use >= and <= (or BETWEEN):
select distinct t.name, t.distance
from yourtable t
join yourtable t2 on t.name = t2.name
and t.distance >= t2.distance-1
and t.distance <= t2.distance+1
and t.distance <> t2.distance
order by t.name
You need to add the final criteria of t.distance <> t2.distance so you don't return the entire dataset -- technically every distance is between itself. This would be better if you had a primary key to add to the join, but if you don't, you could utilize ROW_NUMBER() as well to achieve the same results.
with cte as (
select name, distance, row_number() over (partition by name order by (select null)) rn
from yourtable
)
select distinct t.name, t.distance
from cte t
join cte t2 on t.name = t2.name
and t.distance >= t2.distance-1
and t.distance <= t2.distance+1
and t.rn <> t2.rn
order by t.name
Updated SQL Fiddle
I like @sgeddes' solution, but you can also get rid of the DISTINCT and the OR in the join condition like this:
select * from yourtable a
where exists (
select 1 from yourtable b
where b.name = a.name
and b.distance between a.distance - 1 and a.distance + 1
)
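This also ensures that rows with equal distance are included, and it considers the whole range rather than only rows whose distance differs by exactly n, as suggested by @HABO. Note that as written every row also matches itself, so if the table has a unique key, exclude the row itself in the subquery (for example with b.id <> a.id, where id stands for that key).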
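Here is a sketch of the EXISTS variant on the sample data from the question with N = 2. It assumes SQLite via Python's sqlite3 in place of the original database, plus a hypothetical unique id column (playing the role CLAIMID plays in the real table) so that a row cannot match itself.

```python
import sqlite3

N = 2  # maximum allowed difference, as in the question

# Assumption: id is a hypothetical unique key added for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE yourtable (id INTEGER PRIMARY KEY, name TEXT, distance INTEGER);
INSERT INTO yourtable VALUES
  (1, 'a', 2), (2, 'a', 4), (3, 'a', 3), (4, 'a', 7), (5, 'a', 1),
  (6, 'b', 3), (7, 'b', 1), (8, 'b', 2), (9, 'b', 5);
""")

# EXISTS returns each outer row at most once, so nothing is double
# counted no matter how many neighbours a row has.
near = conn.execute("""
SELECT a.name, a.distance
FROM yourtable a
WHERE EXISTS (
  SELECT 1 FROM yourtable b
  WHERE b.name = a.name
    AND b.id <> a.id
    AND b.distance BETWEEN a.distance - ? AND a.distance + ?
)
ORDER BY a.id
""", (N, N)).fetchall()

print(near)
# [('a', 2), ('a', 4), ('a', 3), ('a', 1),
#  ('b', 3), ('b', 1), ('b', 2), ('b', 5)]
```

Only ('a', 7) drops out, because its nearest neighbour with the same name is 3 units away.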

mysql find date where no row exists for previous day

I need to select how many days since there is a break in my data. It's easier to show:
Table format:
id (autoincrement), user_id (int), start (datetime), end (datetime)
Example data (times left out as only need days):
1, 5, 2011-12-18, 2011-12-18
2, 5, 2011-12-17, 2011-12-17
3, 5, 2011-12-16, 2011-12-16
4, 5, 2011-12-13, 2011-12-13
As you can see, there is a break between 2011-12-13 and 2011-12-16. Now, I need to be able to say:
Using the date 2011-12-18, how many days are there until a break:
2011-12-18: Lowest sequential date = 2011-12-16: Total consecutive days: 3
Probably: DATE_DIFF(2011-12-18, 2011-12-16)
So my problem is, how can I select 2011-12-16 as the lowest sequential date? Remember that the data applies to particular user_ids.
It's kinda like the example here: http://www.artfulsoftware.com/infotree/queries.php#72 but in the reverse.
I'd like this done in SQL only, no php code
Thanks
SELECT qmin.start, qmax.end, DATEDIFF( qmax.end, qmin.start ) FROM `table` AS qmin
CROSS JOIN (
SELECT t1.end FROM `table` AS t1
LEFT JOIN `table` AS t2 ON
t2.start > t1.end AND
t2.start <= DATE_ADD( t1.end, INTERVAL 1 DAY )
WHERE t1.end >= '2011-12-18' AND t2.start IS NULL
ORDER BY t1.end ASC LIMIT 1
) AS qmax
LEFT JOIN `table` AS t2 ON
t2.end < qmin.start AND
t2.end >= DATE_SUB( qmin.start, INTERVAL 1 DAY )
WHERE qmin.start <= '2011-12-18' AND t2.start IS NULL
ORDER BY qmin.start DESC LIMIT 1
This should work: each left join looks for an adjacent record in the sequence, so the max can be found by taking the nearest record that has no following record (t2.anyfield is null), and the same is done for the min date with the preceding record.
If you can calculate the days between the two dates in a script, do it using unions (e.g. row 1 is the minimum, row 2 the maximum).
Check this,
SELECT DATEDIFF((SELECT MAX(`start`) FROM testtbl WHERE `user_id`=1),
(select a.`start` from testtbl as a
left outer join testtbl as b on a.user_id = b.user_id
AND a.`start` = b.`start` + INTERVAL 1 DAY
where a.user_id=1 AND b.`start` is null
ORDER BY a.`start` desc LIMIT 1))
DATEDIFF() returns the difference between the two dates in days; if you want the number of consecutive days, add one to that result.
If it's not a beauty contest then you may try something like:
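The left-join idea can be checked on the sample rows with a short sketch. It assumes SQLite via Python's sqlite3 instead of MySQL, so date(x, '+1 day') and julianday() take the place of INTERVAL 1 DAY and DATEDIFF(), and it keeps only the start column with date-only values.

```python
import sqlite3

# Assumption: simplified testtbl with date-only start values for user 5.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE testtbl (id INTEGER PRIMARY KEY, user_id INTEGER, start TEXT);
INSERT INTO testtbl VALUES
  (1, 5, '2011-12-18'), (2, 5, '2011-12-17'),
  (3, 5, '2011-12-16'), (4, 5, '2011-12-13');
""")

# A row starts a streak when no row exists for the previous day.
# The latest such row on or before the reference date is the streak start.
row = conn.execute("""
SELECT a.start AS streak_start,
       CAST(julianday(:ref) - julianday(a.start) AS INTEGER) + 1 AS consecutive_days
FROM testtbl a
LEFT JOIN testtbl b
  ON b.user_id = a.user_id
 AND a.start = date(b.start, '+1 day')
WHERE a.user_id = :uid
  AND a.start <= :ref
  AND b.start IS NULL
ORDER BY a.start DESC
LIMIT 1
""", {"ref": "2011-12-18", "uid": 5}).fetchone()

print(row)  # ('2011-12-16', 3)
```

The sketch takes for granted that the reference date itself has a row; if it might not, that needs a separate check.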
select t.start, t2.start, datediff(t.start, t2.start) + 1 as consecutive_days
from tab t
join tab t2 on t2.start = (select min(start) from (
select c1.*, case when c2.id is null then 1 else 0 end as gap
from tab c1
left join tab c2 on c1.start = adddate(c2.start, 1)
) t4 where t4.start <= t.start and t4.start >= (select max(start) from (
select c1.*, case when c2.id is null then 1 else 0 end as gap
from tab c1
left join tab c2 on c1.start = adddate(c2.start, 1)
) t3 where t3.start <= t.start and t3.gap = 1))
where t.start = '2011-12-18'
Result should be:
start start consecutive_days
2011-12-18 2011-12-16 3