Table qtlsdmx

Id  pid (Receipt Id)  qrrq (Unix Datetime)  sl (QTY)  je (Total Price, don't * QTY)  spmc (Product Name)  zdcx_ids (Promo Id)
1   1                 1653839999            3         127.26                         Product1             175,167
2   1                 1653839999            2         84.84                          Product2             175,167
3   1                 1653839999            1         183.42                         Product3             167
4   1                 1653839999            1         165.74                         Product4             167
5   1                 1653839999            1         165.74                         Product4             167
Table zdcxd (Id = Table qtlsdmx column zdcx_ids)

Id (Promo Id)  hdmc (Promo Name)
167            $500 - $40
175            $25 VOUCHER
177            $50 VOUCHER
179            $75 VOUCHER
Table qtlsd (Id = Table qtlsdmx column pid)

Id (Receipt Id)  zddm (Shop name)
1                SHOP01
2                SHOP02
3                SHOP03
I need to analyze promotions using the tables above.
The main issue is that zdcx_ids can contain multiple promotion ids, but I need to group by one promotion at a time. I tried creating a new column with the split zdcx_ids (see REF1 output), but that method duplicates the records, so Id is no longer unique and QTY, price, etc. come out wrong.
How can I code this? If possible, please avoid the IN and LIKE '%175%' approach, because the table has over 8 million records and the loading time is huge.
REF1 Output (same columns as qtlsdmx, plus splitted_zdcx_ids):

Id  pid  qrrq        sl  je      spmc      zdcx_ids  splitted_zdcx_ids
1   1    1653839999  3   127.26  Product1  175,167   175
1   1    1653839999  3   127.26  Product1  175,167   167
2   1    1653839999  2   84.84   Product2  175,167   175
2   1    1653839999  2   84.84   Product2  175,167   167
3   1    1653839999  1   183.42  Product3  167       167
4   1    1653839999  1   165.74  Product4  167       167
5   1    1653839999  1   165.74  Product4  167       167
Here are my 3 expected outputs.

Expected output 1:

Id (Promo Id)  Count
167            5
175            2
177            0
179            0
Expected output 2:

Id (Promo Id)  Price (only sum je where zdcx_ids contains the Id)  <--- Result formula
167            $727          127.26 + 84.84 + 183.42 + 165.74 + 165.74
175            $212.1        127.26 + 84.84
177            $0
179            $0
Expected output 3:

Id (Promo Id)  Price (if zdcx_ids contains the Id, sum je over all rows with the same pid)  <--- Result formula
167            $727          127.26 + 84.84 + 183.42 + 165.74 + 165.74
175            $727          127.26 + 84.84 + 183.42 + 165.74 + 165.74
177            $0
179            $0
My progress for solution 1:

SET @tDate := '2022-05-29';
SELECT COUNT(*) FROM (
    SELECT COUNT(*) FROM qtlsdmx
    WHERE FROM_UNIXTIME(qrrq) >= @tDate AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
      AND zdcx_ids LIKE '%175%'
    GROUP BY pid
) a;
My progress for solution 2:

SELECT SUM(je)
FROM qtlsdmx
WHERE FROM_UNIXTIME(qrrq) >= @tDate AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
  AND pid IN (SELECT pid FROM qtlsdmx WHERE zdcx_ids LIKE '%175%');
My progress for solution 3:

SET @tDate := '2022-05-29';
SELECT SUM(je) AS TOTALPRICE
FROM qtlsdmx
WHERE FROM_UNIXTIME(qrrq) >= @tDate AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
  AND pid IN (SELECT pid FROM qtlsdmx WHERE zdcx_ids LIKE '%175%');
Here's an idea using SUBSTRING_INDEX() and UNION ALL to expand the comma-separated values into one row each. So, we are going to turn this row:
Id  pid  qrrq        sl  je      spmc      zdcx_ids
1   1    1653839999  3   127.26  Product1  175,167

into:

Id  pid  qrrq        sl  je      spmc      zdcx_ids
1   1    1653839999  3   127.26  Product1  175
1   1    1653839999  3   127.26  Product1  167
With this query, we'll get the first of the comma-separated values in zdcx_ids:
SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1)
FROM qtlsdmx;
The reason we call SUBSTRING_INDEX() twice is to cater for the next value after the comma. There's another part we need to add to the query: the count of comma-separated values. I think I've seen a different method somewhere, but the one on top of my head is subtracting the length of the string with the commas removed from the original length, then adding 1:
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
The length subtraction on zdcx_ids = '175,167' returns 1, since there is only one comma, so by itself the count is not correct. Obviously '175,167' holds two values, which is why we add +1 to the subtraction result. The >= 1 just tells the query to return the row if the value count is equal to or greater than the checked number. We'll add this into the query:
SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1)
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
Now we'll add the remaining conditions from your original query and UNION ALL the same query structure with an incremented position number. Two parts need to change in the second query:
SET @tDate := '2022-05-29';
SELECT Id, pid, qrrq, sl, je, spmc,
       SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1)
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
/*adding the rest of your condition*/
  AND FROM_UNIXTIME(qrrq) >= @tDate
  AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
UNION ALL
SELECT Id, pid, qrrq, sl, je, spmc,
       SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',2),',',-1)
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 2
  AND FROM_UNIXTIME(qrrq) >= @tDate
  AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
Notice the changes in the second query: the position argument in the inner SUBSTRING_INDEX() and the value-count check. Depending on your data, you might have to repeat the same query up to 10 times, incrementing the value in those two places each time. You can probably use a PREPARED STATEMENT for this; I'll get to that later.
Once the UNION ALL query is done, you can make it a subquery, LEFT JOIN it, and do the calculation:
For expected output 1:
SELECT t1.Id, COUNT(t2.Id)
FROM zdcxd AS t1
LEFT JOIN
(SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1) AS val /*assigning alias*/
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
UNION ALL
SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',2),',',-1)
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 2) AS t2
ON t1.Id=t2.val
GROUP BY t1.Id;
For expected output 2, you just need to change COUNT(t2.Id) to SUM(t2.je), or COALESCE(SUM(t2.je), 0) if you want $0 instead of NULL for promos with no match:
SELECT t1.Id, SUM(je)
FROM zdcxd AS t1
LEFT JOIN
...
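Spelled out in full, the output-2 query might look like this (a sketch based on the output-1 query above; the date conditions are omitted for brevity, and COALESCE turns the NULL sum for unmatched promos into 0):

```sql
SELECT t1.Id, COALESCE(SUM(t2.je), 0) AS Price
FROM zdcxd AS t1
LEFT JOIN
    (SELECT Id, pid, qrrq, sl, je, spmc,
            SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1) AS val
     FROM qtlsdmx
     WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
     UNION ALL
     SELECT Id, pid, qrrq, sl, je, spmc,
            SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',2),',',-1)
     FROM qtlsdmx
     WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 2) AS t2
    ON t1.Id = t2.val
GROUP BY t1.Id;
```

On the sample data this should give $727 for 167, $212.1 for 175, and $0 for 177 and 179.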
As for expected output 3, I'm still trying to understand how to make that work, so I don't have a suggestion for it yet. I'll update this post if I figure it out, but hopefully you manage to figure it out yourself too.
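One possible direction to explore in the meantime (an unverified sketch, not from the original answer): deduplicate the (promo, receipt) pairs produced by the split, then join back to qtlsdmx on pid so that je is summed over the whole receipt rather than only the matching rows:

```sql
SELECT t1.Id, COALESCE(SUM(r.je), 0) AS Price
FROM zdcxd AS t1
LEFT JOIN (
    -- distinct (promo, receipt) pairs, so each receipt is counted once per promo
    SELECT DISTINCT val, pid FROM (
        SELECT pid, SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1) AS val
        FROM qtlsdmx
        WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
        UNION ALL
        SELECT pid, SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',2),',',-1)
        FROM qtlsdmx
        WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 2
    ) s
) AS t2 ON t1.Id = t2.val
LEFT JOIN qtlsdmx AS r ON r.pid = t2.pid   -- pull in every row of the same receipt
GROUP BY t1.Id;
```

Without the DISTINCT, a promo appearing in several rows of the same receipt would multiply the receipt total.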
Now, since this UNION ALL query depends on how many comma-separated values there are, it will definitely be a long query. So I suggest using a MySQL prepared statement to generate the query with a customized numbering sequence.
Here's a fiddle of the full prepared-statement queries:
Fiddle 1.
Fiddle 2 - in case the first fiddle is not working.
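The general shape of that prepared statement might look like the following (a sketch, assuming at most 10 promo ids per row and the default sql_mode, where double quotes delimit strings; `nums` is just an inline numbers table):

```sql
-- GROUP_CONCAT output is capped at group_concat_max_len (default 1024),
-- so raise it first if the generated query is long:
SET SESSION group_concat_max_len = 100000;

SELECT GROUP_CONCAT(
         CONCAT("SELECT Id, pid, qrrq, sl, je, spmc, ",
                "SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',", n, "),',',-1) AS val ",
                "FROM qtlsdmx ",
                "WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= ", n)
         SEPARATOR " UNION ALL ")
  INTO @sql
FROM (SELECT 1 AS n UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5
      UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10) nums;

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

Each row of `nums` produces one branch of the UNION ALL, with the position argument and the value-count check incremented together.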
I have a table with id, city_id, and stock which looks like this.
id  city_id  stock
1   1        1000
2   2        500
3   3        11000
4   2        600
5   3        12000
6   1        2000
9   3        13000
10  1        3000
11  1        4000
12  2        700
To select the latest stock values for each city_id I used the following query, which works fine.
SELECT `stock`
FROM `stock_table`
WHERE id in ( SELECT MAX(id)
FROM `stock_table`
GROUP BY city_id
);
It returns
stock
13000
4000
700
Now I want to select 2nd latest stock entry for each city_id. So the output should be like the following table.
stock
12000
3000
600
Any help is greatly appreciated. Thanks!
In MySQL 8 you can use the row_number() window function to assign a number to each row ordered by the id per partition of city_id. Then just filter on that being 2 (in your example; you can use any number).
SELECT x.stock
FROM (SELECT s.stock,
row_number() OVER (PARTITION BY s.city_id
ORDER BY s.id DESC) rn
FROM stock_table s) x
WHERE x.rn = 2;
You can also use ROW_NUMBER() with PARTITION BY city_id and filter on row number 2; note the ORDER BY id DESC so that row number 2 is the second latest:
ROW_NUMBER() OVER (PARTITION BY city_id ORDER BY id DESC) AS rn
I have a table which has a combination of item_id and payment_id columns as a key.
Every two rows have the same item_id value.
This is how the table looks.
item_id payment_id amount
1 140 1000
1 141 3000
2 141 500
2 145 600
3 4 4000
3 735 9000
How do I subtract the amount of the least payment_id row from the amount of the maximum payment_id row (of the two rows with the same item_id) using MySQL?
To clarify, this is the result I want.
item_id payment_id amount
1 140 1000
1 141 2000 : 3000 - 1000
2 141 500
2 145 100 : 600 - 500
3 4 4000
3 735 5000 : 9000 - 4000
Cheers!
You can get the new amount with this query:
select p1.item_id, p1.payment_id, p1.amount - (
select p0.amount
from payments p0
where p0.item_id = p1.item_id
and p0.payment_id < p1.payment_id
order by p0.payment_id
limit 1
) as new_amount
from payments p1
having new_amount is not null;
It will subtract the amount of the "last" row with the same item_id (if present).
You can then use that query in the UPDATE statement as a derived table joined to your original table:
update payments p
join (
select p1.item_id, p1.payment_id, p1.amount - (
select p0.amount
from payments p0
where p0.item_id = p1.item_id
and p0.payment_id < p1.payment_id
order by p0.payment_id
limit 1
) as new_amount
from payments p1
having new_amount is not null
) p1 using (item_id, payment_id)
set p.amount = p1.new_amount;
Demo: http://rextester.com/DJD86481
UPDATE tt
JOIN (SELECT item_id, MAX(payment_id) mp,
             (SUBSTRING_INDEX(GROUP_CONCAT(amount ORDER BY payment_id DESC), ',', 1)
              - SUBSTRING_INDEX(GROUP_CONCAT(amount ORDER BY payment_id), ',', 1)) maxdif
      FROM tt
      GROUP BY item_id) s
  ON tt.item_id = s.item_id
SET tt.amount = s.maxdif
WHERE tt.payment_id = s.mp AND tt.item_id = s.item_id;
SELECT * FROM tt;
SELECT * FROM tt;
See it working
I am trying to get the second-to-last records using MySQL.
I did some research; some samples assume a fixed gap between numbers or dates, but in my situation the contract_id is not always +1 from the previous one. Any ideas? Thank you so much.
merchant_id contract_id start_date
10 501 2016-05-01
10 506 2016-06-01
13 456 2015-12-01
13 462 2016-01-01
14 620 2016-06-01
14 642 2016-07-01
14 656 2016-07-05
merchant_id Second_last_contract_id
10 501
13 456
14 642
contract_id != previous contract_id + X (X is not fixed).
start_date tells us the order in which the contracts were created.
Here's one option using user-defined variables to establish a row number per group of merchants, then filtering on the 2nd in each group ordered by contract id:
select *
from (
  select *,
         @rn := if(@prevMerchantId = merchant_id,
                   @rn + 1,
                   if(@prevMerchantId := merchant_id, 1, 1)
                  ) as rn
  from yourtable
  cross join (select @rn := 0, @prevMerchantId := null) vars
  order by merchant_id, contract_id desc
) t
where rn = 2
SQL Fiddle Demo
Here's another option, filtering the results of GROUP_CONCAT() using SUBSTRING_INDEX():
SELECT merchant_id,
SUBSTRING_INDEX(SUBSTRING_INDEX(
GROUP_CONCAT(contract_id ORDER BY start_date DESC),
',', 2), ',', -1) AS Second_last_contract_id
FROM the_table
GROUP BY merchant_id
See it on sqlfiddle.
I have a database table that represents a concert hall, and it looks as follows:
Columns: ActionID, RegionName, RowNumber, Price, Quantity
The sample data:
ActionID RegionName RowNumber Price Quantity
1 Region1 22 8000 7
1 Region1 - 8000 1
1 Region2 10 5000 2
1 Region2 10 5000 2
1 Region2 10 5000 1
I need to display the regions with the overall quantity per region, grouped by ActionID, RegionName and Price:
1 Region1 22 8000 8
1 Region2 10 5000 5
As you can see, RowNumber should be displayed but not taken into account in the grouping.
How can I do that?
You could use a CTE and a ranking function like ROW_NUMBER:
WITH CTE AS
(
SELECT t.*,
rn = ROW_NUMBER()OVER(Partition By ActionID, RegionName, Price
Order By RowNumber),
OverallQuantity = SUM(Quantity) OVER (Partition By ActionID, RegionName, Price)
FROM dbo.TableName t
)
SELECT ActionID, RegionName, RowNumber, Price, Quantity = OverallQuantity
FROM CTE
WHERE RN = 1
Result:
ActionID RegionName RowNumber Price Quantity
1 Region1 - 8000 8
1 Region2 10 5000 5
Demo
You could use something like this:
SELECT ActionId,
RegionName,
MAX(RowNumber),
MAX(Price),
SUM(Quantity)
FROM #tblTest
GROUP BY ActionId, RegionName
But if there are multiple RowNumber or Price values per group, the query would need to be modified.
You can apply a MAX or MIN to collapse the rows:
SELECT MAX(ActionID), RegionName, MAX(RowNumber), MAX(Price), SUM(Quantity)
FROM my_table
GROUP BY RegionName
This obviously gets more complicated if you have multiple ActionID or RowNumber or Price per region. In that case, only use this if you don't care which one of these multiple values you get back.
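If the exact Price does matter (the question groups by Price as well), a variant that keeps Price in the GROUP BY avoids mixing quantities across different prices within a region; `my_table` is a placeholder name as above:

```sql
-- Group by Price too, so Quantity is only summed within one price level.
-- MAX(RowNumber) still picks an arbitrary representative row number.
SELECT ActionID, RegionName, MAX(RowNumber) AS RowNumber, Price, SUM(Quantity) AS Quantity
FROM my_table
GROUP BY ActionID, RegionName, Price;
```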