MySQL: How to group by Id when the Id column contains multiple Ids

Table qtlsdmx
Id | pid (Receipt Id) | qrrq (Unix Datetime) | sl (QTY) | je (Total Price Don't * QTY) | spmc (Product Name) | zdcx_ids (Promo Id)
1 | 1 | 1653839999 | 3 | 127.26 | Product1 | 175,167
2 | 1 | 1653839999 | 2 | 84.84 | Product2 | 175,167
3 | 1 | 1653839999 | 1 | 183.42 | Product3 | 167
4 | 1 | 1653839999 | 1 | 165.74 | Product4 | 167
5 | 1 | 1653839999 | 1 | 165.74 | Product4 | 167
Table zdcxd (Id = Table qtlsdmx Column zdcx_ids)
Id (Promo Id) | hdmc (Promo Name)
167 | $500 - $40
175 | $25 VOUCHER
177 | $50 VOUCHER
179 | $75 VOUCHER
Table qtlsd (Id = Table qtlsdmx Column pid)
Id (Receipt Id) | zddm (Shop name)
1 | SHOP01
2 | SHOP02
3 | SHOP03
I need to analyze some promotions using the tables above.
The main issue is that zdcx_ids can contain multiple promotion Ids, but I need to group by one promotion Id at a time. I tried creating a new column with the split zdcx_ids values (see REF1 output). But this method duplicates the records, so Id is no longer unique and QTY, price, etc. come out wrong.
How can I code this? If possible, please avoid IN and LIKE "%175%", because the table has over 8 million records and the loading time is huge.
REF1 Output:
Id | pid (Receipt Id) | qrrq (Unix Datetime) | sl (QTY) | je (Total Price Don't * QTY) | spmc (Product Name) | zdcx_ids (Promo Id) | splitted_zdcx_ids (Splitted Promo Id)
1 | 1 | 1653839999 | 3 | 127.26 | Product1 | 175,167 | 175
1 | 1 | 1653839999 | 3 | 127.26 | Product1 | 175,167 | 167
2 | 1 | 1653839999 | 2 | 84.84 | Product2 | 175,167 | 175
2 | 1 | 1653839999 | 2 | 84.84 | Product2 | 175,167 | 167
3 | 1 | 1653839999 | 1 | 183.42 | Product3 | 167 | 167
4 | 1 | 1653839999 | 1 | 165.74 | Product4 | 167 | 167
5 | 1 | 1653839999 | 1 | 165.74 | Product4 | 167 | 167
Here are my 3 expected outputs.
Expected output 1 :
Id (Promo Id) | Count
167 | 5
175 | 2
177 | 0
179 | 0
Expected output 2 :
Id (Promo Id) | Price (Only Sum zdcx_ids if exist) | <--- RESULT FORMULA
167 | $727 | 127.26 + 84.84 + 183.42 + 165.74 + 165.74
175 | $212.1 | 127.26 + 84.84
177 | $0 |
179 | $0 |
Expected output 3 :
Id (Promo Id) | Price (If zdcx_ids exist Id, Sum je with same pid) | <--- RESULT FORMULA
167 | $727 | 127.26 + 84.84 + 183.42 + 165.74 + 165.74
175 | $727 | 127.26 + 84.84 + 183.42 + 165.74 + 165.74
177 | $0 |
179 | $0 |
My progress for solution 1:
SET @tDate := '2022-05-29';
SELECT COUNT(*) FROM (
SELECT COUNT(*) FROM qtlsdmx
WHERE FROM_UNIXTIME(qrrq) >= @tDate AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
and zdcx_ids like '%175%' group by pid
) a ;
My progress for solution 2:
select sum(je)
from qtlsdmx
where FROM_UNIXTIME(qrrq) >= @tDate and FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
and pid in (select pid from qtlsdmx where zdcx_ids like '%175%');
My progress for solution 3:
SET @tDate := '2022-05-29';
SELECT SUM(je) AS TOTALPRICE
FROM qtlsdmx
WHERE FROM_UNIXTIME(qrrq) >= @tDate AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
and pid in (select pid from qtlsdmx where zdcx_ids like '%175%');

Here's an idea that uses SUBSTRING_INDEX() and UNION ALL to put each comma-separated value on its own row. So we are going to turn this row:
Id | pid | qrrq | sl | je | spmc | zdcx_ids
1 | 1 | 1653839999 | 3 | 127.26 | Product1 | 175,167
to:
Id | pid | qrrq | sl | je | spmc | zdcx_ids
1 | 1 | 1653839999 | 3 | 127.26 | Product1 | 175
1 | 1 | 1653839999 | 3 | 127.26 | Product1 | 167
This query gets the first comma-separated value of zdcx_ids:
SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1)
FROM qtlsdmx;
The reason for calling SUBSTRING_INDEX() twice is to cater for the values after the first comma: the inner call keeps the first n values and the outer call takes the last of them. Another part we need to add to the query is the count of comma-separated values. I think I've seen a different method somewhere, but the one off the top of my head is to subtract the length of the value with the commas removed from the length of the original value, and then add 1:
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
The length subtraction on zdcx_ids=175,167 returns 1, since there's only one comma, so on its own the count is off by one. 175,167 obviously holds two values, which is why we add +1 to the subtraction result. The >= 1 tells the query to return a row only if its value count is equal to or greater than the position being extracted. We'll add this to the query:
SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1)
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
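To make the two building blocks concrete, here is a minimal Python sketch. It runs the same LENGTH/REPLACE expression through SQLite (used here only because it ships with Python and supports both functions; it is not MySQL) and mimics the nested SUBSTRING_INDEX() call with a hypothetical helper, nth_value. Both function names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def value_count(ids: str) -> int:
    """Number of comma-separated values: commas removed, lengths subtracted, plus 1."""
    sql = "SELECT LENGTH(?) - LENGTH(REPLACE(?, ',', '')) + 1"
    return conn.execute(sql, (ids, ids)).fetchone()[0]

def nth_value(ids: str, n: int) -> str:
    """Python stand-in for SUBSTRING_INDEX(SUBSTRING_INDEX(ids, ',', n), ',', -1), n >= 1."""
    # Keep the first n values (inner call), then take the last of them (outer call).
    return ids.split(",")[:n][-1]

print(value_count("175,167"))   # 2
print(value_count("167"))       # 1
print(nth_value("175,167", 2))  # 167
print(nth_value("167", 2))      # 167 (position 2 does not exist, yet a value comes back)
```

The last call shows why the count filter matters: asking for position 2 of a single-value string still returns that value, so without the WHERE condition such rows would be counted twice.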
Now we'll add the remaining conditions from your original query and UNION ALL it with a second query of the same structure but the next position number. Two parts change in the second query:
SET @tDate := '2022-05-29';
SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1)
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
/*adding the rest of your condition*/
AND FROM_UNIXTIME(qrrq) >= @tDate
AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
UNION ALL
SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',2),',',-1)
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 2
AND FROM_UNIXTIME(qrrq) >= @tDate
AND FROM_UNIXTIME(qrrq) < DATE_ADD(@tDate, INTERVAL 1 DAY)
Notice the changes in the second query: the count in the inner SUBSTRING_INDEX() and the value count in the WHERE clause. Depending on your data, you might have to repeat the same query up to 10 times, incrementing those two parts each time. You could consider using a prepared statement for this; I'll get to that later.
Once the UNION ALL queries are done, you can use them as a subquery, LEFT JOIN against it and do the calculation:
For expected output 1:
SELECT t1.Id, COUNT(t2.Id)
FROM zdcxd AS t1
LEFT JOIN
(SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',1),',',-1) AS val /*assigning alias*/
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 1
UNION ALL
SELECT Id, pid, qrrq, sl, je, spmc,
SUBSTRING_INDEX(SUBSTRING_INDEX(zdcx_ids,',',2),',',-1)
FROM qtlsdmx
WHERE LENGTH(zdcx_ids)-LENGTH(REPLACE(zdcx_ids,',','')) +1 >= 2) AS t2
ON t1.Id=t2.val
GROUP BY t1.Id;
For expected output 2, you need to just change COUNT(t2.Id) to SUM(je):
SELECT t1.Id, SUM(je)
FROM zdcxd AS t1
LEFT JOIN
...
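As a cross-check of what the LEFT JOIN queries for expected outputs 1 and 2 should return, here is a small Python sketch that does the same split-then-aggregate over the sample rows from the question. The function name count_and_sum is hypothetical, and the rows are reduced to the columns that matter here.

```python
# Sample rows from the question: (Id, pid, je, zdcx_ids)
ROWS = [
    (1, 1, 127.26, "175,167"),
    (2, 1, 84.84, "175,167"),
    (3, 1, 183.42, "167"),
    (4, 1, 165.74, "167"),
    (5, 1, 165.74, "167"),
]
PROMO_IDS = [167, 175, 177, 179]  # Ids from table zdcxd

def count_and_sum(rows, promo_ids):
    """Per promo Id: number of rows using it and the sum of je over those rows."""
    counts = {p: 0 for p in promo_ids}
    totals = {p: 0.0 for p in promo_ids}
    for _id, _pid, je, ids in rows:
        # A set avoids counting a row twice if an Id is repeated in the list.
        for promo in {int(x) for x in ids.split(",")}:
            if promo in counts:
                counts[promo] += 1
                totals[promo] += je
    return counts, {p: round(t, 2) for p, t in totals.items()}

counts, totals = count_and_sum(ROWS, PROMO_IDS)
print(counts)  # {167: 5, 175: 2, 177: 0, 179: 0}
print(totals)  # {167: 727.0, 175: 212.1, 177: 0.0, 179: 0.0}
```

Note that the sketch starts every promo at zero, whereas SUM(je) over a LEFT JOIN yields NULL for promos with no matching rows; wrapping it in COALESCE(SUM(je), 0) would be one way to get the $0 shown in the expected output.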
As for expected output 3, I'm still trying to understand how to make that work, so I don't have a suggestion for it yet. I'll update this post if I figure it out, but hopefully you manage to figure it out yourself too.
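For what it's worth, one possible reading of expected output 3 is: find the receipts (pid) in which a promo Id appears at least once, then sum je over every row of those receipts. The Python sketch below only illustrates that reading on the sample rows; it is an assumption about the intent, not a MySQL solution, and the function name is hypothetical.

```python
from collections import defaultdict

# Sample rows from the question: (Id, pid, je, zdcx_ids)
ROWS = [
    (1, 1, 127.26, "175,167"),
    (2, 1, 84.84, "175,167"),
    (3, 1, 183.42, "167"),
    (4, 1, 165.74, "167"),
    (5, 1, 165.74, "167"),
]
PROMO_IDS = [167, 175, 177, 179]

def receipt_totals_per_promo(rows, promo_ids):
    """Per promo Id: total je of every receipt in which that promo appears."""
    receipt_total = defaultdict(float)          # pid -> sum of je on that receipt
    promo_receipts = {p: set() for p in promo_ids}  # promo -> set of pids using it
    for _id, pid, je, ids in rows:
        receipt_total[pid] += je
        for x in ids.split(","):
            promo = int(x)
            if promo in promo_receipts:
                promo_receipts[promo].add(pid)
    return {
        promo: round(sum(receipt_total[pid] for pid in pids), 2)
        for promo, pids in promo_receipts.items()
    }

print(receipt_totals_per_promo(ROWS, PROMO_IDS))
```

On the sample data this gives 727.0 for both 167 and 175 and 0 for 177 and 179, which matches the figures listed under expected output 3.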
Now, since this UNION ALL query depends on how many comma-separated values there are, it will definitely be a long query. So I suggest using a MySQL prepared statement to generate the query with some customized sequence numbering.
Here's a fiddle of the full prepare statement queries
Fiddle 1.
Fiddle 2 - in case the first fiddle is not working.
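The fiddles are not reproduced here, so as a rough illustration of the idea, this Python sketch builds the same repeated UNION ALL text for positions 1 to N. It only generates the SQL string; the helper name build_split_query and its parameters are hypothetical, and the generated text would still have to be run against MySQL (for example through PREPARE and EXECUTE).

```python
def build_split_query(max_values: int, table: str = "qtlsdmx",
                      column: str = "zdcx_ids") -> str:
    """Return one SELECT per position 1..max_values, joined with UNION ALL."""
    if max_values < 1:
        raise ValueError("max_values must be at least 1")
    branches = []
    for n in range(1, max_values + 1):
        # Only two things change per branch: the position n in the inner
        # SUBSTRING_INDEX() and the minimum value count in the WHERE clause.
        branches.append(
            f"SELECT Id, pid, qrrq, sl, je, spmc,\n"
            f"  SUBSTRING_INDEX(SUBSTRING_INDEX({column},',',{n}),',',-1) AS val\n"
            f"FROM {table}\n"
            f"WHERE LENGTH({column})-LENGTH(REPLACE({column},',','')) + 1 >= {n}"
        )
    return "\nUNION ALL\n".join(branches)

print(build_split_query(2))
```

Setting max_values to the largest number of Ids any row holds keeps the query as short as the data allows.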

Related

Running total with condition and always looking at the previous value

I want to compute a sequential sum over a table while taking a few extra conditions into account.
The sum must be taken sequentially and capped at 130: if an id has +40, the sum becomes 130; if the next one is +1, the sum is still 130; and if the next one is -1, the sum becomes 129.
100 is added to the sum the first time only; from there on just the count is added, subject to the conditions.
The minimum value of the sum must also be capped, so it can't be less than 70.
I have tried the query below, but it does not seem to look at the prior value.
Example that I tried:
create table tableA (id int not null, count int not null);
insert into tableA(id, count) values(1,11), (2,21),(3, -3); -- case 1
insert into tableA(id, count) values(1,35), (2,-3); -- case 2
insert into tableA(id, count) values(1,-45),(2,67); -- case3
Query tried:
select t.id, t.count,
case when (100+(select ifnull(sum(count),0) from tableA x where x.id <= t.id)) >= 130 then 130
when (100+(select ifnull(sum(count),0) from tableA x where x.id <= t.id)) <= 70 then 70
else (100+(select ifnull(sum(count),0) from tableA x where x.id <= t.id))
end as xxxx
from tableA t;
I expect my output to look like:
Case1 Result:
id count Sum
1 11 111
2 21 130
3 -4 126
Case2 Result:
id count Sum
1 35 130
2 -3 127
Case3 Result:
id count Sum
1 -45 70
2 67 137
THIS ANSWERS THE ORIGINAL VERSION OF THE QUESTION.
I think this does what you want:
select a.*, (@sum := least(@sum + count, 130)) as "sum"
from (select a.*
from tablea a
order by a.id
) a cross join
(select @sum := 0) params;
I don't understand where the 100 is coming from. It is not part of your explanation.
Here is a db<>fiddle that illustrates how this works using 30 as the limit (which seems to be your intention).
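For comparison, the rule as the question describes it (start at 100, add each count in id order, keep the result between 70 and 130) can be sketched in a few lines of Python. The starting value and both bounds are taken from the question text, not from the query above, and the function name is hypothetical.

```python
def clamped_running_total(counts, start=100, low=70, high=130):
    """Running sum where each step is clamped to [low, high] before the next one."""
    total = start
    result = []
    for count in counts:
        # The clamped value, not the raw sum, carries over to the next row.
        total = max(low, min(high, total + count))
        result.append(total)
    return result

print(clamped_running_total([11, 21, -3]))  # [111, 130, 127]
print(clamped_running_total([35, -3]))      # [130, 127]
print(clamped_running_total([-45, 67]))     # [70, 130]
```

Under this reading the last row of case 3 comes out as 130, whereas the question lists 137 there, so the intended rule for that case may differ from this sketch.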

Get the first n lines for group after summing up

I am practising my skills in MySQL using the Sakila DB.
I have created a view called rentals_customer_store_film_category, which is the union of the customers, payments, rentals, film, and category tables.
I would now like to get the top 5 films by income. Meaning that I would like to sum up all income of each film by store and then return the first 5 for each store.
I tried the code below but it does not work, and I cannot figure out what is wrong with it.
Any help?
SELECT store_id, film_id, income
FROM
(SELECT film_id, store_id, sum(amount) as income,
@store_rank := IF(@current_store = store_id, @store_rank + 1, 1) AS store_rank,
@current_store := store_id
FROM rentals_customer_store_film_category
group by store_id, film_id
ORDER BY store_id, income DESC, film_id
) ranked
WHERE store_rank <= 5
RESULTS BELOW. As you can see, it does not stop at the fifth row per store. It shows all films by store, while I would like only the top 5 for store id 1 and the top 5 for store id 2.
store_id film_id income
1 971 134.82
1 879 132.85
1 938 127.82
1 973 123.83
1 865 119.84
1 941 117.82
1 267 116.83
1 327 110.85
1 580 106.86
1 715 105.85
1 897 104.85
...
...
...
...
2 878 127.83
2 791 127.82
2 854 123.83
2 946 117.86
2 396 113.81
2 369 111.84
2 764 110.85
2 260 110.84
2 838 110.82
2 527 109.83
2 893 106.84
2 71 102.87
2 8 102.82
...
...
...
...
The order matters in this case: compare the previous store_id with the current one first, then assign it. Try this:
SELECT store_id, film_id, income
FROM
(SELECT film_id, store_id, sum(amount) as income,
-- first compare previous with current
@store_rank := IF(@prev_store = store_id, @store_rank + 1, 1) AS store_rank,
-- then assign the previous store
@prev_store := store_id
FROM films
group by store_id, film_id
ORDER BY store_id, income DESC, film_id
) ranked
WHERE store_rank <= 5
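The intended result (sum income per store and film, then keep the best n films in each store) can be sketched in Python as follows. The input rows are made-up (store_id, film_id, amount) tuples rather than Sakila data, and the function name is hypothetical.

```python
from collections import defaultdict

def top_n_per_store(rows, n=5):
    """Sum amount per (store, film), then keep the n highest incomes per store."""
    income = defaultdict(float)
    for store_id, film_id, amount in rows:
        income[(store_id, film_id)] += amount
    # Same ordering as the query: store_id, income DESC, film_id.
    ranked = sorted(income.items(), key=lambda kv: (kv[0][0], -kv[1], kv[0][1]))
    kept_per_store = defaultdict(int)
    result = []
    for (store_id, film_id), total in ranked:
        if kept_per_store[store_id] < n:
            kept_per_store[store_id] += 1
            result.append((store_id, film_id, round(total, 2)))
    return result

# Hypothetical payments: (store_id, film_id, amount)
payments = [(1, 10, 5.0), (1, 10, 3.0), (1, 11, 9.0), (1, 12, 1.0),
            (2, 10, 2.0), (2, 13, 4.0)]
print(top_n_per_store(payments, n=2))
# [(1, 11, 9.0), (1, 10, 8.0), (2, 13, 4.0), (2, 10, 2.0)]
```

The key point is that the ranking is applied after the sums are complete and sorted, which is what the per-store counter in the query is meant to achieve.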

Use a sub query result

I have a table with numbers and dates (1 number each date and dates aren't necessarily at regular intervals).
I would like to get the count of dates when a number isn't in the table.
Where I am :
select *
from
(
select
date from nums
where chiffre=1
order by date desc
limit 2
) as f
I get this :
date
--------------
2014-09-07
--------------
2014-07-26
Basically, I have this query dynamically:
select * from nums where date between "2014-07-26" and "2014-09-07"
Then, as a second step, walk through the whole table (here I limited it to the first 2 rows, but I want to compare rows 2 and 3, 3 and 4, etc.)
The goal is to get this:
date | actual_number_of_real_dates_between_two_given_dates
2014-09-07 - 2014-07-26 | 20
2014-04-02 - 2014-02-12 | 13
etc...
How can I do this? Thanks.
Edit:
What I have (just an example, dates and "chiffre" are more complex) :
date | chiffre
2014-09-30 | 2
2014-09-29 | 1
2014-09-28 | 2
2014-09-27 | 2
2014-09-26 | 1
2014-09-25 | 2
2014-09-24 | 2
etc...
What I need for the number "1":
actual_number_of_real_dates_between_two_given_dates
1
3
etc...
Edit 2:
My updated query thanks to Gordon Linoff
select count(n.id) as difference
from nums n inner join
(select min(date) as d1, max(date) as d2
from (select date from nums where chiffre=1 order by date desc limit 2) d
) dd
where n.date between dd.d1 and dd.d2
How can I test row 2 with 3? 3 with 4 etc... Not only last 2?
Should I use a loop? Or I can do it without?
Does this do what you want?
select count(distinct n.date) as numDates,
(datediff(dd.d2, dd.d1) + 1) as datesInPeriod,
(datediff(dd.d2, dd.d1) + 1 - count(distinct n.date)) as missingDates
from nums n cross join
(select date('2014-07-26') as d1, date('2014-09-07') as d2) dd
where n.date between dd.d1 and dd.d2;
EDIT:
If you just want the last two dates:
select count(distinct n.date) as numDates,
(datediff(dd.d2, dd.d1) + 1) as datesInPeriod,
(datediff(dd.d2, dd.d1) + 1 - count(distinct n.date)) as missingDates
from nums n cross join
(select min(date) as d1, max(date) as d2
from (select date from nums order by date desc limit 2) d
) dd
where n.date between dd.d1 and dd.d2;
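The arithmetic behind the three columns is easy to check outside SQL. This Python sketch computes the same numbers for a given period; the sample dates and the function name are made up for illustration.

```python
from datetime import date

def date_gap_stats(present, d1, d2):
    """Return (distinct dates present, days in period, missing days) for d1..d2 inclusive."""
    in_range = {d for d in present if d1 <= d <= d2}   # count(distinct n.date)
    dates_in_period = (d2 - d1).days + 1               # datediff(d2, d1) + 1
    return len(in_range), dates_in_period, dates_in_period - len(in_range)

# Hypothetical table content: five dates present in a seven-day window.
present = [date(2014, 9, 24), date(2014, 9, 25), date(2014, 9, 26),
           date(2014, 9, 28), date(2014, 9, 30)]
print(date_gap_stats(present, date(2014, 9, 24), date(2014, 9, 30)))  # (5, 7, 2)
```

To compare every consecutive pair of dates rather than only the last two, the same function could be called once per adjacent pair of the sorted dates.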

How to make a fake column with an autoincrement number in a "group by" query

I have data in a table like this:
fgid qty ntid
1 100 10
2 90 10
6 200 11
1 80 11
1 120 12
6 100 12
6 30 13
And I run this query:
SELECT fgid, SUM(qty) AS total_qty, COUNT(ntid) AS nt_count FROM sofg
GROUP BY fgid
And the result is:
fgid total_qty nt_count
1 300 3
2 90 1
6 330 3
Then I want the result to look like this:
no fgid total_qty nt_count
1 1 300 3
2 2 90 1
3 6 330 3
How can I do that with a query, where 'no' behaves like an auto-increment number?
Try this query.
SELECT
@rownum := @rownum + 1 rownum,
t.*
FROM (SELECT @rownum:=0) r,
(
SELECT fgid, SUM(qty) AS total_qty, COUNT(ntid) AS nt_count FROM sofg GROUP BY fgid
) t;
Basically the same as Dhinakaran's answer, but there's no need to put the whole main query into a subquery. There's no difference from his answer apart from maybe being more pleasing to the eye, but please accept Dhinakaran's answer, as he was faster.
SELECT
@rownum:=@rownum + 1 as rownumber,
fgid,
SUM(qty) AS total_qty,
COUNT(ntid) AS nt_count
FROM sofg
, (select @rownum:=0) v
GROUP BY fgid
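As a quick check of the expected result, the sketch below loads the sample rows into an in-memory SQLite database (an assumption made only so the example is runnable; it is not MySQL), runs the GROUP BY, and numbers the grouped rows in Python with enumerate instead of a user variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sofg (fgid INTEGER, qty INTEGER, ntid INTEGER)")
conn.executemany(
    "INSERT INTO sofg VALUES (?, ?, ?)",
    [(1, 100, 10), (2, 90, 10), (6, 200, 11), (1, 80, 11),
     (1, 120, 12), (6, 100, 12), (6, 30, 13)],
)

grouped = conn.execute(
    "SELECT fgid, SUM(qty) AS total_qty, COUNT(ntid) AS nt_count "
    "FROM sofg GROUP BY fgid ORDER BY fgid"
).fetchall()

# Number the already-grouped rows, starting at 1.
numbered = [(no, *row) for no, row in enumerate(grouped, start=1)]
print(numbered)  # [(1, 1, 300, 3), (2, 2, 90, 1), (3, 6, 330, 3)]
```

Numbering after grouping mirrors the subquery form of the answer, where the counter is applied to the finished GROUP BY result.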

SQL query - how to construct multiple SUMs (based on different parameters) in one query

Please review my tables below... Is it possible to build a single query capable of
1) calculating the SUM of total_time for all vehicles that have class_id 1 (regardless of feature_id)(result would be 6:35)
2) calculating the SUM of total_time for all vehicles that have class_id 1 AND have feature_id 2(result would be 5:35 based on vehicle_id 22 and 24)
I'm able to get the results in two separate queries, but I was hoping to retrieve them in one single query.... something like:
SELECT
SUM((CASE WHEN (VEHICLE_TABLE.class_id = 1) then LOG_TABLE.total_time else 0 end)) AS TOTAL_ALL,
...here goes statement for 2)... AS TOTAL_DIESEL...
FROM LOG_TABLE, VEHICLE_TABLE .....
WHERE VEHICLE_TABLE.vehicle_id = LOG_TABLE.vehicle_id ......
TABLE 1: LOG_TABLE (vehicle_id is NOT unique)
vehicle_id | total_time
--------------|--------------
22 2:00
22 0:30
23 1:00
24 2:20
24 0:45
TABLE 2: VEHICLE_TABLE (vehicle_id is unique)
vehicle_id | class_id
--------------|--------------
22 1
23 3
24 1
TABLE 3: VEHICLE_FEATURES_TABLE (vehicle_id is NOT unique but feature_id is unique per vehicle_id)
vehicle_id | feature_id
--------------|--------------
22 1
22 2
23 1
23 2
23 6
24 2
24 6
SELECT SUM(lt.total_time) AS TOTAL_ALL,
SUM(CASE WHEN (vft.feature_id IS NOT NULL) then lt.total_time else 0 end) AS FEATURE_TOTAL
FROM VEHICLE_TABLE vt
JOIN LOG_TABLE lt
ON vt.vehicle_id = lt.vehicle_id
LEFT JOIN VEHICLE_FEATURES_TABLE vft
ON vt.vehicle_id = vft.vehicle_id AND vft.feature_id = 2
WHERE vt.class_id = 1
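To see the conditional aggregation at work, here is a runnable sketch that loads the sample tables into an in-memory SQLite database and runs the same query shape. Two assumptions are made for the sake of the example: SQLite stands in for MySQL, and total_time is stored as whole minutes in a column named total_minutes (2:00 becomes 120, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE log_table (vehicle_id INTEGER, total_minutes INTEGER);
CREATE TABLE vehicle_table (vehicle_id INTEGER PRIMARY KEY, class_id INTEGER);
CREATE TABLE vehicle_features_table (vehicle_id INTEGER, feature_id INTEGER);
INSERT INTO log_table VALUES (22, 120), (22, 30), (23, 60), (24, 140), (24, 45);
INSERT INTO vehicle_table VALUES (22, 1), (23, 3), (24, 1);
INSERT INTO vehicle_features_table VALUES
  (22, 1), (22, 2), (23, 1), (23, 2), (23, 6), (24, 2), (24, 6);
""")

total_all, feature_total = conn.execute("""
SELECT SUM(lt.total_minutes) AS total_all,
       SUM(CASE WHEN vft.feature_id IS NOT NULL
                THEN lt.total_minutes ELSE 0 END) AS feature_total
FROM vehicle_table vt
JOIN log_table lt ON vt.vehicle_id = lt.vehicle_id
-- at most one feature row per vehicle matches, so log rows are not duplicated
LEFT JOIN vehicle_features_table vft
       ON vt.vehicle_id = vft.vehicle_id AND vft.feature_id = 2
WHERE vt.class_id = 1
""").fetchone()

print(total_all, feature_total)  # 335 335
```

With the rows as posted, both class 1 vehicles (22 and 24) have feature 2, so both totals come to 335 minutes, that is 5:35.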
It seems that there is not much point in putting both of them in one query unless you want the results together.
If so, just add a UNION between the 2 queries.
If you want to have both values in the same row try something like this:
SELECT (SELECT Sum(X)
FROM TBL
WHERE CLASS_ID = 1) AS CLS_id1,
(SELECT Sum(X)
FROM TBL
WHERE CLASS_ID = 1
AND FEATURE_ID = 2) AS CLS_id1_FTR_ID2