MySQL - Update table with row number per group

Sample Data
id | order_id | instalment_num | date_due
---------------------------------------------------------
1 | 10000 | 1 | 2010-07-09 00:00:00
2 | 10000 | 1 | 2010-09-06 11:39:56
3 | 10001 | 1 | 2014-04-25 15:46:52
4 | 10002 | 1 | 2010-01-11 00:00:00
5 | 10003 | 1 | 2010-01-04 00:00:00
6 | 10003 | 1 | 2016-05-31 00:00:00
7 | 10003 | 1 | 2010-01-08 00:00:00
8 | 10003 | 1 | 2010-01-06 09:06:26
9 | 10004 | 1 | 2010-01-11 11:25:07
10 | 10004 | 1 | 2010-01-12 07:06:42
Desired Result
id | order_id | instalment_num | date_due
---------------------------------------------------------
1 | 10000 | 1 | 2010-07-09 00:00:00
2 | 10000 | 2 | 2010-09-06 11:39:56
3 | 10001 | 1 | 2014-04-25 15:46:52
4 | 10002 | 1 | 2010-01-11 00:00:00
5 | 10003 | 1 | 2010-01-04 00:00:00
8 | 10003 | 2 | 2010-01-06 09:06:26
7 | 10003 | 3 | 2010-01-08 00:00:00
6 | 10003 | 4 | 2016-05-31 00:00:00
9 | 10004 | 1 | 2010-01-11 11:25:07
10 | 10004 | 2 | 2010-01-12 07:06:42
As you can see, I have an instalment_num column which should show the number/index of each row belonging to the order_id, determined by the date_due ASC, id ASC order.
How can I update the instalment_num column like this?
Additional Notes
The date_due column is not unique, and there may be many ids or order_ids with the exact same timestamp.
If the timestamp is the same for two rows belonging to the same order_id, it should order them by id as a fallback.
I require a query which will update this column.

This is how I would do it:
SELECT a.id,
a.order_id,
COUNT(b.id)+1 AS instalment_num,
a.date_due
FROM sample_data a
LEFT JOIN sample_data b ON a.order_id=b.order_id AND (a.date_due>b.date_due OR (a.date_due=b.date_due AND a.id>b.id))
GROUP BY a.id, a.order_id, a.date_due
ORDER BY a.order_id, a.date_due, a.id
UPDATE version (the derived table uses a LEFT JOIN so that the first row of each order, which has no earlier rows, gets 0+1=1 instead of NULL):
UPDATE sample_data
LEFT JOIN (SELECT a.id,
COUNT(b.id)+1 AS instalment_num
FROM sample_data a
LEFT JOIN sample_data b ON a.order_id=b.order_id AND (a.date_due>b.date_due OR (a.date_due=b.date_due AND a.id>b.id))
GROUP BY a.id) c ON c.id=sample_data.id
SET sample_data.instalment_num=c.instalment_num

For the numbering to begin with 1, count each row itself as well as the rows before it (a.id >= b.id in the tie-break, and no +1 on the COUNT):
UPDATE sample_data
LEFT JOIN (SELECT a.id,
COUNT(b.id) AS instalment_num
FROM sample_data a
JOIN sample_data b ON a.order_id = b.order_id AND (a.date_due > b.date_due OR (a.date_due = b.date_due AND a.id >= b.id))
GROUP BY a.id) c ON c.id = sample_data.id
SET sample_data.instalment_num = c.instalment_num

You are trying to achieve what ROW_NUMBER with a partition would do in something like SQL Server or Oracle. You can simulate this with an appropriate query:
SELECT t.id, t.order_id,
(
SELECT 1 + COUNT(*)
FROM sampleData
WHERE (date_due < t.date_due OR (date_due = t.date_due AND id < t.id)) AND
order_id = t.order_id
) AS instalment_num,
t.date_due
FROM sampleData t
ORDER BY t.order_id, t.date_due, t.id
This query numbers the rows of each order_id by date_due in ascending order. In the case of a tie in date_due, it falls back to the id in ascending order.
Follow the link below for a demo:
SQLFiddle
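The counting rule above (1 plus the number of rows in the same order that sort earlier by date_due, then id) can be checked against the sample data. The following is only a sketch: it uses Python's built-in sqlite3 module rather than MySQL, and it writes the UPDATE as a correlated subquery, which SQLite accepts. MySQL rejects a subquery that reads the table being updated, which is why the MySQL answers above join against a derived table instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample_data ("
    "id INTEGER PRIMARY KEY, order_id INTEGER, "
    "instalment_num INTEGER, date_due TEXT)"
)
rows = [
    (1, 10000, 1, "2010-07-09 00:00:00"),
    (2, 10000, 1, "2010-09-06 11:39:56"),
    (3, 10001, 1, "2014-04-25 15:46:52"),
    (4, 10002, 1, "2010-01-11 00:00:00"),
    (5, 10003, 1, "2010-01-04 00:00:00"),
    (6, 10003, 1, "2016-05-31 00:00:00"),
    (7, 10003, 1, "2010-01-08 00:00:00"),
    (8, 10003, 1, "2010-01-06 09:06:26"),
    (9, 10004, 1, "2010-01-11 11:25:07"),
    (10, 10004, 1, "2010-01-12 07:06:42"),
]
conn.executemany("INSERT INTO sample_data VALUES (?, ?, ?, ?)", rows)

# Rank = 1 + number of rows of the same order that sort earlier
# by (date_due, id). SQLite syntax; not valid as-is in MySQL.
conn.execute(
    """
    UPDATE sample_data
    SET instalment_num = 1 + (
        SELECT COUNT(*)
        FROM sample_data AS b
        WHERE b.order_id = sample_data.order_id
          AND (b.date_due < sample_data.date_due
               OR (b.date_due = sample_data.date_due
                   AND b.id < sample_data.id))
    )
    """
)

result = conn.execute(
    "SELECT id, instalment_num FROM sample_data "
    "ORDER BY order_id, date_due, id"
).fetchall()
print(result)
```

The printed pairs of (id, instalment_num) follow the same order as the Desired Result table in the question.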

select
sub.order_id, sub.date_due,
@group_rn := case
when @group_order_id = sub.order_id then @group_rn + 1
else 1
end as instalment_num,
@group_order_id := sub.order_id
FROM (select @group_rn := 0, @group_order_id := 0) init,
(select *
from the_table
order by order_id, date_due, id) sub
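The logic of the query above is a running counter that resets whenever order_id changes while walking the rows in sorted order. As a rough illustration of that idea only (plain Python, not MySQL; the variable names are made up for this sketch), the same pass over a few of the sample rows looks like this:

```python
# (id, order_id, date_due) for two of the orders in the sample data
rows = [
    (5, 10003, "2010-01-04 00:00:00"),
    (6, 10003, "2016-05-31 00:00:00"),
    (7, 10003, "2010-01-08 00:00:00"),
    (8, 10003, "2010-01-06 09:06:26"),
    (9, 10004, "2010-01-11 11:25:07"),
    (10, 10004, "2010-01-12 07:06:42"),
]

# Same ordering as the inner query: order_id, then date_due, then id.
ordered = sorted(rows, key=lambda r: (r[1], r[2], r[0]))

numbered = []
prev_order_id = None  # plays the role of the order-id session variable
counter = 0           # plays the role of the row-number session variable
for row_id, order_id, date_due in ordered:
    # Increment within the same order, restart at 1 on a new order.
    counter = counter + 1 if order_id == prev_order_id else 1
    prev_order_id = order_id
    numbered.append((row_id, order_id, counter))

print(numbered)
```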

Related

Sum childrens in two tables of a table

I have 3 tables:
a (id,date,ckey) b(id,a.ckey,hht,hha) c(id,a.ckey,date_ini,date_fin)
where b keeps all the activities to be done and their respective hours in two columns (hht, hha), while c stores the activities carried out with their start and end dates (the hours executed are obtained by subtracting the dates).
Now I need to know, for each record in a, how many hours have been assigned (b) and how many hours have been completed (c).
Currently I have this:
a:
+----------+----------+------------+
| id | date | ckey |
+----------+----------+------------+
| 1 |2018-01-20| 18 |
+----------+----------+------------+
b:
+----------+----------+--------+--------+
| id | a.ckey | hht | hha |
+----------+----------+--------+--------+
| 1 | 18 | 2 | 3 |
| 2 | 18 | 2 | 5 |
| 3 | 18 | 0 | 7 |
+----------+----------+--------+--------+
c:
+----------+----------+----------------------+----------------------+
| id | a.ckey | date_ini | date_fin |
+----------+----------+----------------------+----------------------+
| 1 | 18 | 2019-01-23 13:30:00 | 2019-01-23 14:00:00 |
| 1 | 18 | 2019-01-23 14:00:00 | 2019-01-23 14:30:00 |
+----------+----------+----------------------+----------------------+
I need this:
+----------+----------+----------------------+----------------------+
| id | a.ckey | hours | hours2 |
+----------+----------+----------------------+----------------------+
| 1 | 18 | 19 | 1 |
+----------+----------+----------------------+----------------------+
I get this:
+----------+----------+----------------------+----------------------+
| id | a.ckey | hours | hours2 |
+----------+----------+----------------------+----------------------+
| 1 | 18 | 38 | 37.5 |
+----------+----------+----------------------+----------------------+
This is my query:
SELECT
(b.hht+b.hha) AS hours,
(SUM(b.hht+b.hha) -
FORMAT(IFNULL((TIMESTAMPDIFF(MINUTE, c.date_ini, c.date_fin)/60),0),2)) AS hours2
FROM a
LEFT JOIN b ON a.key=b.akey
INNER JOIN c ON a.key=c.akey
GROUP BY a.ckey
Because you have multiple rows in tables b and c for each value of ckey, you need to do the aggregation within a subquery for each table. Otherwise the join pairs every b row with every c row, and the duplicated rows lead to incorrect sums.
SELECT a.id, a.key, b.hours, FORMAT(c.minutes/60, 2) AS hours2
FROM a
LEFT JOIN (SELECT akey, SUM(hht+hha) AS hours
FROM b
GROUP BY akey) b ON b.akey = a.key
LEFT JOIN (SELECT akey, SUM(TIMESTAMPDIFF(MINUTE, date_ini, date_fin)) AS minutes
FROM c
GROUP BY akey) c ON c.akey = a.key
ORDER BY a.id
Output:
id key hours hours2
1 18 19 1.00
Demo on SQLFiddle
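To see the duplication concretely, here is a small sketch using Python's sqlite3 module with the sample rows from the question. It is an assumption-laden stand-in for MySQL: the join columns are named akey as in the answer's query, and the minute difference is computed with SQLite's strftime('%s', ...) because SQLite has no TIMESTAMPDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER, date TEXT, ckey INTEGER);
    CREATE TABLE b (id INTEGER, akey INTEGER, hht INTEGER, hha INTEGER);
    CREATE TABLE c (id INTEGER, akey INTEGER, date_ini TEXT, date_fin TEXT);
    INSERT INTO a VALUES (1, '2018-01-20', 18);
    INSERT INTO b VALUES (1, 18, 2, 3), (2, 18, 2, 5), (3, 18, 0, 7);
    INSERT INTO c VALUES
        (1, 18, '2019-01-23 13:30:00', '2019-01-23 14:00:00'),
        (1, 18, '2019-01-23 14:00:00', '2019-01-23 14:30:00');
    """
)

# Joining b and c directly: 3 b rows x 2 c rows = 6 rows per ckey,
# so every b row is counted twice.
fanned_out = conn.execute(
    """
    SELECT SUM(b.hht + b.hha)
    FROM a
    JOIN b ON b.akey = a.ckey
    JOIN c ON c.akey = a.ckey
    GROUP BY a.ckey
    """
).fetchone()[0]

# Aggregating each child table first leaves one row per ckey to join.
fixed = conn.execute(
    """
    SELECT a.id, a.ckey, bs.hours, cs.minutes / 60.0 AS hours2
    FROM a
    LEFT JOIN (SELECT akey, SUM(hht + hha) AS hours
               FROM b GROUP BY akey) AS bs ON bs.akey = a.ckey
    LEFT JOIN (SELECT akey,
                      SUM((strftime('%s', date_fin)
                           - strftime('%s', date_ini)) / 60) AS minutes
               FROM c GROUP BY akey) AS cs ON cs.akey = a.ckey
    """
).fetchone()

print(fanned_out)  # the doubled sum reported in the question
print(fixed)
```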
You're doing an m-to-n-join, try UNION ALL instead:
select ckey, sum(hours) as hours, sum(hours) - sum(hours2) as hours2
from
(
SELECT ckey, (b.hht+b.hha) AS hours, NULL as hours2
FROM b
UNION ALL
SELECT ckey, NULL AS hours,
FORMAT(IFNULL((TIMESTAMPDIFF(MINUTE, c.date_ini, c.date_fin)/60),0),2) as hours2
FROM c
) as dt
group by ckey
If you actually need columns from table a put this Select in a Derived Table and join to it.
Please check this:
SELECT
(SELECT SUM(hha + hht) from b where b.ckey = a.ckey) hours,
FORMAT((SELECT SUM(TIMESTAMPDIFF(MINUTE, c.date_ini, c.date_fin)/60) from c where c.ckey = a.ckey),2) as hours2
FROM a
Fiddle

query not returning the right data

Payment_Detail_Table
payment_detail_id | payment_id | payment_status | total | user_id | company_id
10001 | 10 | 1 | 100 | 1 | 103
10002 | 11 | 2 | 200 | 1 | 103
10003 | 12 | 2 | 300 | 2 | 104
10004 | 13 | 1 | 400 | 2 | 104
10005 | 14 | 0 | 500 | 1 | 105
10006 | 15 | 2 | 600 | 1 | 103
Payment_Table
payment_id | payment_type
10 | 1
11 | 1
12 | 1
13 | 1
14 | 0
15 | 0
How can I get the user_ids that have both a payment_type of 1 and a payment_type of 0 in Payment_Table?
The purpose is to find the users who have made both kinds of payment; for those users, the payments must have a payment_status of 2.
For example, if the user_id is 1 and the company_id is 103, the output must be 100+200+600=900.
This user with this company_id has payment_type 0 and 1, and has finished both of them (payment_type=1 and payment_type=0) successfully with a payment_status of 2, even though there was a failed payment earlier.
For example, payment_detail_id 10001 has a payment_status of 1.
Is this what you are looking for?
SELECT user_id, company_id
FROM (SELECT payment_detail_table.user_id AS user_id, payment_detail_table.company_id AS company_id
FROM payment_detail_table
WHERE EXISTS(SELECT * FROM payment_table WHERE payment_table.payment_id=payment_detail_table.payment_id AND payment_table.payment_type=1) AND payment_detail_table.payment_status = 2
GROUP BY payment_detail_table.user_id, payment_detail_table.company_id) T1
INNER JOIN
(SELECT payment_detail_table.user_id AS user_id, payment_detail_table.company_id AS company_id
FROM payment_detail_table
WHERE EXISTS(SELECT * FROM payment_table WHERE payment_table.payment_id=payment_detail_table.payment_id AND payment_table.payment_type=0) AND payment_detail_table.payment_status = 2
GROUP BY payment_detail_table.user_id, payment_detail_table.company_id) T2
USING (user_id, company_id)
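SELECT DISTINCT D.user_id
FROM Payment_Detail_Table D
WHERE EXISTS(
    -- the user has at least one payment of type 1 ...
    SELECT *
    FROM Payment_Detail_Table D1
    JOIN Payment_Table P1 ON P1.payment_id = D1.payment_id
    WHERE D1.user_id = D.user_id
    AND P1.payment_type = 1
)
AND EXISTS(
    -- ... and at least one payment of type 0
    -- (a single payment_id has only one type, so both checks
    -- must be correlated on user_id, not on payment_id)
    SELECT *
    FROM Payment_Detail_Table D2
    JOIN Payment_Table P2 ON P2.payment_id = D2.payment_id
    WHERE D2.user_id = D.user_id
    AND P2.payment_type = 0
)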
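Another way to express "has both payment types" is to group and count the distinct types. The sketch below runs that idea on the sample data with Python's sqlite3 module; it is not taken from either answer, and it assumes the intended rule is that only rows with payment_status = 2 count and that the result is wanted per (user_id, company_id) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Payment_Detail_Table (
        payment_detail_id INTEGER, payment_id INTEGER,
        payment_status INTEGER, total INTEGER,
        user_id INTEGER, company_id INTEGER);
    CREATE TABLE Payment_Table (payment_id INTEGER, payment_type INTEGER);
    INSERT INTO Payment_Detail_Table VALUES
        (10001, 10, 1, 100, 1, 103),
        (10002, 11, 2, 200, 1, 103),
        (10003, 12, 2, 300, 2, 104),
        (10004, 13, 1, 400, 2, 104),
        (10005, 14, 0, 500, 1, 105),
        (10006, 15, 2, 600, 1, 103);
    INSERT INTO Payment_Table VALUES
        (10, 1), (11, 1), (12, 1), (13, 1), (14, 0), (15, 0);
    """
)

# Keep only successful payments (status 2), then require that a
# (user_id, company_id) pair has seen both payment types 0 and 1.
both_types = conn.execute(
    """
    SELECT d.user_id, d.company_id
    FROM Payment_Detail_Table d
    JOIN Payment_Table p ON p.payment_id = d.payment_id
    WHERE d.payment_status = 2
    GROUP BY d.user_id, d.company_id
    HAVING COUNT(DISTINCT p.payment_type) = 2
    """
).fetchall()

print(both_types)
```

Only user 1 with company 103 has a successful payment of each type in the sample data, so that is the single pair returned.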

Query with a subquery where results could be many

I have a main table, tbl_vluchtgegevens. I want to JOIN tbl_photos to it and show one "random" row from tbl_photos for each row of the main table.
My problem is that only one column in tbl_vluchtgegevens (inschrijvingnmr) has a directly matching column in tbl_photos (img_nmr). The second column it needs to match on, luchtvaartmaatschappij, is stored in tbl_photos as a code (img_lvm), and a third table, tbl_luchtvaartmaatschappij, maps that code to the value used in tbl_vluchtgegevens.
I just can't figure out the MySQL code for MariaDB. I'll try to display this below.
tbl_vluchtgegevens | tbl_luchtvaartmaatschappij | tbl_photos
luchtvaartmaatschappij | luchtvaartmaatschappijID |
 | IATACode | img_lvm
inschrijvingnmr | | img_nmr
SAMPLE DATA:
tbl_vluchtgegevens
gegevenID | luchtvaartmaatschappij | inschrijvingnmr | vertrekdatum2
1 | 911 | N803NW | 2018-01-01 12:00:00
2 | 1702 | PH-AON | 2018-01-15 17:00:00
3 | 911 | N853NW | 2018-01-17 11:00:00
tbl_luchtvaartmaatschappij
luchtvaartmaatschappijID | IATACode
911 | DL
1702 | KL
1803 | LH
tbl_photos
photoID | img_lvm | img_nmr | file
1 | DL | N853NW | somefile.jpg
2 | DL | N803NW | somefile2.jpg
3 | DL | N853NW | somefile3.jpg
4 | KL | PH-AON | somefile4.jpg
5 | KL | PH-AON | somefile5.jpg
6 | LH | D-AUBC | somefile6.jpg
7 | DL | N805NW | somefile7.jpg
The query would return something like:
gegevenID | vertrekdatum2 | luchtvaartmaatschappij | inschrijvingnmr | file
1 | 2018-01-01 12:00:00 | 911 | N803NW | somefile2.jpg
2 | 2018-01-15 17:00:00 | 1702 | PH-AON | somefile4.jpg
3 | 2018-01-17 11:00:00 | 911 | N853NW | somefile3.jpg
sqlfiddle: http://www.sqlfiddle.com/#!9/19e222/1
At one point, I've tried using the code below, but if multiple rows exist in tbl_photos, then it displays each row from tbl_vluchtgegevens with all of the rows in tbl_photos.
SELECT DISTINCT vg.gegevenID, vg.vertrekdatum2, vg.inschrijvingnmr, lvm.luchtvaartmaatschappij, lvm.luchtvaartmaatschappijID, p.*
FROM tbl_vluchtgegevens vg
LEFT JOIN tbl_luchtvaartmaatschappij lvm
ON vg.luchtvaartmaatschappij = lvm.luchtvaartmaatschappijID
LEFT JOIN tbl_photos p
ON lvm.IATACode = p.img_lvm
AND vg.inschrijvingnmr = p.img_nmr
WHERE vg.vertrekdatum2 <=NOW()
ORDER BY vg.vertrekdatum2 DESC
I've tried to do a subquery, too, but I've only done one and I can't get this to work no matter how I rework the code.
SELECT vg.gegevenID, vg.vertrekdatum2, vg.inschrijvingnmr, lvm.luchtvaartmaatschappij, lvm.luchtvaartmaatschappijID, p.*
FROM tbl_vluchtgegevens vg
LEFT JOIN tbl_luchtvaartmaatschappij lvm
ON vg.luchtvaartmaatschappij = lvm.luchtvaartmaatschappijID
( SELECT p.*, lvm.IATACode, lvm.luchtvaartmaatschappijID
FROM tbl_photos p
LEFT JOIN tbl_luchtvaartmaatschappij lvm
ON vg.luchtvaartmaatschappij = lvm.luchtvaartmaatschappijID
ORDER BY RAND()
LIMIT 1 ) pho
WHERE vg.vertrekdatum2 <=NOW() AND vg.luchtvaartmaatschappij = pho.luchtvaartnamatschappij AND vg.inschrijvingnmr = pho.img_nmr
ORDER BY vg.vertrekdatum2 DESC
One way to do it is with a correlated subquery.
Query
SELECT
tbl_vluchtgegevens.gegevenID
, tbl_vluchtgegevens.vertrekdatum2
, tbl_vluchtgegevens.luchtvaartmaatschappij
, tbl_vluchtgegevens.inschrijvingnmr
, (
SELECT
tbl_photos.file
FROM
tbl_photos
WHERE
tbl_photos.img_nmr = tbl_vluchtgegevens.inschrijvingnmr
ORDER BY
RAND()
LIMIT 1
) AS `file`
FROM
tbl_vluchtgegevens
WHERE
tbl_vluchtgegevens.vertrekdatum2 <=NOW()
ORDER BY
tbl_vluchtgegevens.vertrekdatum2 DESC
One Possible Result
| gegevenID | vertrekdatum2 | luchtvaartmaatschappij | inschrijvingnmr | file |
|-----------|----------------------|------------------------|-----------------|---------------|
| 2 | 2018-01-01T17:00:00Z | 1702 | PH-AON | somefile5.jpg |
| 1 | 2018-01-01T12:00:00Z | 911 | N803NW | somefile2.jpg |
| 4 | 2017-03-01T17:00:00Z | 911 | N809NW | (null) |
| 3 | 2017-01-17T11:00:00Z | 911 | N853NW | somefile7.jpg |
| 4 | 2016-03-01T17:00:00Z | 1702 | PH-AON | somefile3.jpg |
see demo http://www.sqlfiddle.com/#!9/be9f7/29
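The query above matches photos on the registration number only. If the airline should match as well, as the question describes, the correlated subquery can also compare img_lvm with the IATA code looked up through tbl_luchtvaartmaatschappij. The following is a sketch of that variant, run on the question's sample data with Python's sqlite3 module; it uses SQLite's RANDOM() where MySQL/MariaDB would use RAND(), and it leaves out the vertrekdatum2 <= NOW() filter for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_vluchtgegevens (
        gegevenID INTEGER, luchtvaartmaatschappij INTEGER,
        inschrijvingnmr TEXT, vertrekdatum2 TEXT);
    CREATE TABLE tbl_luchtvaartmaatschappij (
        luchtvaartmaatschappijID INTEGER, IATACode TEXT);
    CREATE TABLE tbl_photos (
        photoID INTEGER, img_lvm TEXT, img_nmr TEXT, file TEXT);
    INSERT INTO tbl_vluchtgegevens VALUES
        (1, 911, 'N803NW', '2018-01-01 12:00:00'),
        (2, 1702, 'PH-AON', '2018-01-15 17:00:00'),
        (3, 911, 'N853NW', '2018-01-17 11:00:00');
    INSERT INTO tbl_luchtvaartmaatschappij VALUES
        (911, 'DL'), (1702, 'KL'), (1803, 'LH');
    INSERT INTO tbl_photos VALUES
        (1, 'DL', 'N853NW', 'somefile.jpg'),
        (2, 'DL', 'N803NW', 'somefile2.jpg'),
        (3, 'DL', 'N853NW', 'somefile3.jpg'),
        (4, 'KL', 'PH-AON', 'somefile4.jpg'),
        (5, 'KL', 'PH-AON', 'somefile5.jpg'),
        (6, 'LH', 'D-AUBC', 'somefile6.jpg'),
        (7, 'DL', 'N805NW', 'somefile7.jpg');
    """
)

# One random photo per flight, matched on both the registration
# number and the airline's IATA code.
picks = dict(
    conn.execute(
        """
        SELECT vg.gegevenID,
               (SELECT p.file
                FROM tbl_photos p
                WHERE p.img_nmr = vg.inschrijvingnmr
                  AND p.img_lvm = lvm.IATACode
                ORDER BY RANDOM()
                LIMIT 1) AS file
        FROM tbl_vluchtgegevens vg
        JOIN tbl_luchtvaartmaatschappij lvm
          ON lvm.luchtvaartmaatschappijID = vg.luchtvaartmaatschappij
        ORDER BY vg.vertrekdatum2 DESC
        """
    ).fetchall()
)

print(picks)  # the file chosen for flights 2 and 3 varies between runs
```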

Impala/SQL: Can I have different time-period for each group?

I have the following table:
id | timestamp | team
----------------------------
1 | 2016-05-06 | A
2 | 2016-03-02 | A
3 | 2015-12-01 | A
4 | 2016-07-05 | B
5 | 2016-06-30 | B
6 | 2016-06-28 | B
7 | 2016-04-05 | C
8 | 2016-04-02 | C
9 | 2016-01-02 | C
I want to group by team and find the last timestamp for each team, so I did:
select team, max(timestamp) from my_table group by team
It's all working fine so far. However, now I want to find out how many distinct id in the last month of each team. For example, for team A, it would be from 2016-04-07 to 2016-05-06, so such count is 1. For team B, the last month is from 2016-06-06 to 2016-07-05, so the count is 3. And for team C, the last month is from 2016-03-06 to 2016-04-05, and the count is 2. My expected output should look like:
team | max(timestamp) | count_in_last_month
------------------------------------------------
A | 2016-05-06 | 1
B | 2016-07-05 | 3
C | 2016-04-05 | 2
Can this be derived using an Impala query? Thanks!
Join the original table with the subquery that gets the max timestamp.
SELECT t1.team, t2.month_end, COUNT(DISTINCT t1.id) AS count_in_last_month
FROM my_table AS t1
JOIN (SELECT team, MAX(timestamp) AS month_end
FROM my_table
GROUP BY team) AS t2
ON t1.team = t2.team
AND t1.timestamp BETWEEN DATE_SUB(month_end, INTERVAL 1 MONTH) AND month_end
GROUP BY t1.team, t2.month_end
DEMO
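The join above can be tried on the sample rows without an Impala cluster. This sketch uses Python's sqlite3 module, so the date arithmetic is SQLite's date(x, '-1 month') rather than DATE_SUB(..., INTERVAL 1 MONTH); dates are stored as ISO text, which compares correctly as strings. Note that BETWEEN is inclusive at both ends, so the window starts one day earlier than the ranges written in the question; that does not change the counts for this data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table (id INTEGER, "timestamp" TEXT, team TEXT)')
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "2016-05-06", "A"), (2, "2016-03-02", "A"), (3, "2015-12-01", "A"),
        (4, "2016-07-05", "B"), (5, "2016-06-30", "B"), (6, "2016-06-28", "B"),
        (7, "2016-04-05", "C"), (8, "2016-04-02", "C"), (9, "2016-01-02", "C"),
    ],
)

# Per team: find the latest timestamp, then count the ids that fall
# within the month ending at that timestamp.
counts = conn.execute(
    """
    SELECT t1.team, t2.month_end, COUNT(DISTINCT t1.id) AS count_in_last_month
    FROM my_table AS t1
    JOIN (SELECT team, MAX("timestamp") AS month_end
          FROM my_table
          GROUP BY team) AS t2
      ON t1.team = t2.team
     AND t1."timestamp" BETWEEN date(t2.month_end, '-1 month') AND t2.month_end
    GROUP BY t1.team, t2.month_end
    ORDER BY t1.team
    """
).fetchall()

print(counts)
```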

MySQL Joining Tables While Counting & Grouping

I want to join 2 tables:
source_table
----------------------------------
| source_id label |
|----------------------------------|
| 1 Contact Form |
| 2 E-Mail |
| 3 Inbound Call |
| 4 Referral |
----------------------------------
related_table
---------------------------------------
| id created_at source |
|---------------------------------------|
| 1 2013-12-26 2 |
| 2 2013-12-26 2 |
| 3 2013-12-26 4 |
| 4 2013-12-25 1 |
| 5 2013-12-18 2 |
| 6 2013-12-16 4 |
| 7 2013-11-30 2 |
---------------------------------------
So that it looks like this:
---------------------------------------
| created_at source amount |
|---------------------------------------|
| 2013-12-26 E-Mail 2 |
| 2013-12-26 Referral 1 |
| 2013-12-25 Contact Form 1 |
| 2013-12-18 E-Mail 1 |
| 2013-12-16 Referral 1 |
---------------------------------------
I want to count the occurrences of each source in related_table grouped by the source for each date in the range.
But I'm not sure how to write the query.
Here's what I have so far:
SELECT DISTINCT
source_table.source_id,
source_table.label AS source,
related_table.created_at,
COUNT(*) AS amount
FROM source_table
INNER JOIN related_table
ON related_table.source=source_table.source_id AND
related_table.created_at>='2013-12-01' AND
related_table.created_at<='2013-12-31'
GROUP BY `source`
ORDER BY `created_at` ASC
I'm not very good with SQL, so the above query might be far off from what I need to have. All I know is that it doesn't work as expected.
My implementation:
select created_at, s.label, amount
from
(
select count(r.Source) as amount, r.source, r.created_at
from related_table r
group by r.source, r.created_at) a inner join source_table s
on a.source = s.source_id
where created_at between '2013-12-01' and '2013-12-31'
order by amount desc, created_at desc
http://sqlfiddle.com/#!2/841bd/2
adjusted demo to your example...
SELECT
created_at
,label as source
,COUNT(*) AS amount
FROM source_table
INNER JOIN related_table
ON source_table.source_id = related_table.source
WHERE created_at BETWEEN '2013-12-01' AND '2013-12-31'
GROUP BY label, created_at
ORDER BY created_at DESC
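As a quick check of the grouping, the sketch below runs a per-date, per-source count on the question's sample rows using Python's sqlite3 module (SQLite standing in for MySQL). The date filter and the secondary sort on label are additions made here so that the output is deterministic and limited to December 2013.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE source_table (source_id INTEGER, label TEXT);
    CREATE TABLE related_table (id INTEGER, created_at TEXT, source INTEGER);
    INSERT INTO source_table VALUES
        (1, 'Contact Form'), (2, 'E-Mail'),
        (3, 'Inbound Call'), (4, 'Referral');
    INSERT INTO related_table VALUES
        (1, '2013-12-26', 2), (2, '2013-12-26', 2), (3, '2013-12-26', 4),
        (4, '2013-12-25', 1), (5, '2013-12-18', 2), (6, '2013-12-16', 4),
        (7, '2013-11-30', 2);
    """
)

# Group by both the date and the source so each (date, source)
# pair gets its own count; filter to the requested date range.
amounts = conn.execute(
    """
    SELECT r.created_at, s.label AS source, COUNT(*) AS amount
    FROM source_table s
    INNER JOIN related_table r ON r.source = s.source_id
    WHERE r.created_at BETWEEN '2013-12-01' AND '2013-12-31'
    GROUP BY r.created_at, s.label
    ORDER BY r.created_at DESC, s.label
    """
).fetchall()

for row in amounts:
    print(row)
```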