MySQL: SUM from two different tables with GROUP BY

I have two tables.
I need to combine each row of these two tables into a row in table3. I managed to get the table1 SUM amount but not table2.
Eg.
table user
+---------+-----------+
| user_id | user_name |
+---------+-----------+
| 001 | JOHN |
| 002 | ADAM |
+---------+-----------+
table1
+-----------+----------------+-------------------+---------------------+
| table1_id | table1_user_id | table1_amount | table1_date |
+-----------+----------------+-------------------+---------------------+
| 6 | 001 | 100 | 01/11/2014 10:55 |
| 7 | 002 | 100 | 01/11/2014 10:55 |
| 8 | 001 | 50 | 25/10/2014 10:55 |
| 9 | 001 | 100 | 23/10/2014 11:00 |
| 10 | 002 | 0 | 21/10/2014 11:00 |
+-----------+----------------+-------------------+---------------------+
table2
+-----------+----------------+----------------+--------------------+
| table2_id | table2_user_id | table2_amount | table2_date |
+-----------+----------------+----------------+--------------------+
| 1 | 001 | 100 | 15/11/2014 10:55 |
| 2 | 001 | 100 | 15/10/2014 10:55 |
| 3 | 002 | 100 | 11/10/2014 10:55 |
| 4 | 001 | 50 | 11/10/2014 10:55 |
+-----------+----------------+----------------+--------------------+
Expected Result:
Table3
+-----+---------+---------------+---------------+----------+---------+
| id | user_id | table1_amount | table2_amount | Year | Month |
+-----+---------+---------------+---------------+----------+---------+
| 1 | 001 | 100 | 100 | 2014 | 11 |
| 2 | 002 | 100 | 0 | 2014 | 11 |
| 3 | 001 | 150 | 150 | 2014 | 10 |
| 4 | 002 | 0 | 100 | 2014 | 10 |
+-----+---------+---------------+---------------+----------+---------+
My attempt does not produce the expected result: table2_amount is NULL in every row.
SQL=" INSERT INTO table3
SELECT user_id,SUM(table1_amount),t2.amount2,
YEAR(table1_date),MONTH(table1_date) FROM table1 a
LEFT JOIN
(SELECT c.table2_user_id,SUM(c.table2_amount) as amount2,c.table2_date
FROM table2 c
GROUP BY DATE_FORMAT(c.table2_date,'%Y-%m'),c.table2_user_id ASC
) t2
on t2.table2_user_id = a.table1_user_id AND t2.table2_date = a.table1_date
GROUP BY DATE_FORMAT(a.table1_date,'%Y-%m'),table1_user_id ASC ";

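This is a nice task for UNION ALL (a plain UNION would silently drop duplicate rows, e.g. two entries by the same user with the same amount and timestamp):
SELECT tx.uid, SUM(tx.a1), SUM(tx.a2), YEAR(tx.d), MONTH(tx.d)
FROM
(
SELECT t1.table1_user_id AS uid,
t1.table1_amount AS a1,
0 AS a2,
t1.table1_date AS d
FROM table1 t1
UNION ALL
SELECT t2.table2_user_id AS uid,
0 AS a1,
t2.table2_amount AS a2,
t2.table2_date AS d
FROM table2 t2
) tx
-- GROUP BY ... ASC was removed in MySQL 8.0; add an ORDER BY if the order matters
GROUP BY YEAR(tx.d), MONTH(tx.d), tx.uid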
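For anyone who wants to try the UNION-based approach above without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is a stand-in under stated assumptions, not the original setup: the dates are rewritten in ISO format, SQLite's strftime() replaces MySQL's YEAR()/MONTH(), and UNION ALL is used so that identical rows are not collapsed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question; dates converted to ISO text (assumption).
conn.executescript("""
CREATE TABLE table1 (table1_id INTEGER, table1_user_id TEXT,
                     table1_amount INTEGER, table1_date TEXT);
CREATE TABLE table2 (table2_id INTEGER, table2_user_id TEXT,
                     table2_amount INTEGER, table2_date TEXT);
INSERT INTO table1 VALUES
  (6,  '001', 100, '2014-11-01 10:55'),
  (7,  '002', 100, '2014-11-01 10:55'),
  (8,  '001', 50,  '2014-10-25 10:55'),
  (9,  '001', 100, '2014-10-23 11:00'),
  (10, '002', 0,   '2014-10-21 11:00');
INSERT INTO table2 VALUES
  (1, '001', 100, '2014-11-15 10:55'),
  (2, '001', 100, '2014-10-15 10:55'),
  (3, '002', 100, '2014-10-11 10:55'),
  (4, '001', 50,  '2014-10-11 10:55');
""")

# Stack both tables into one row set, padding the "other" amount with 0,
# then aggregate once per (year, month, user).
rows = conn.execute("""
SELECT tx.uid, SUM(tx.a1), SUM(tx.a2),
       strftime('%Y', tx.d) AS yr, strftime('%m', tx.d) AS mon
FROM (
  SELECT table1_user_id AS uid, table1_amount AS a1, 0 AS a2, table1_date AS d
  FROM table1
  UNION ALL
  SELECT table2_user_id, 0, table2_amount, table2_date
  FROM table2
) tx
GROUP BY yr, mon, tx.uid
ORDER BY yr DESC, mon DESC, tx.uid
""").fetchall()

for row in rows:
    print(row)
```

The four rows printed correspond to the amounts in the expected Table3 from the question (100/100 and 100/0 for November, 150/150 and 0/100 for October).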

Thanks to David162795 for the enlightening discussion.
The point I missed: the dates in the two tables are different, so the rows cannot be matched by joining on the date.
Each table has to be grouped by its own date (month) and user ID in the inner query, and the main SELECT then groups the combined rows by year, month and user.
Here is my answer for this case:
$SQL = "
INSERT INTO table3 (user_id, table1_amount, table2_amount, Year, Month)
SELECT tx.uid, SUM(tx.sum1), SUM(tx.sum2), YEAR(tx.d) AS year, MONTH(tx.d) AS month
FROM
-- per-user, per-month totals of table1 (any date within the month identifies it)
(SELECT b.table1_user_id AS uid, SUM(b.table1_amount) AS sum1, 0 AS sum2,
MIN(b.table1_date) AS d FROM table1 b
GROUP BY DATE_FORMAT(b.table1_date,'%Y-%m'), b.table1_user_id
UNION ALL
-- per-user, per-month totals of table2
SELECT c.table2_user_id AS uid, 0 AS sum1,
SUM(c.table2_amount) AS sum2, MIN(c.table2_date) AS d
FROM table2 c
GROUP BY DATE_FORMAT(c.table2_date,'%Y-%m'), c.table2_user_id
) tx
GROUP BY year, month, uid";

Related

Having two MySQL tables, get the last result for each value of first table key

I have two tables:
TABLE_01
-------------------------------
| ID | Key1 | Key2 |
-------------------------------
| 1 | 504 | 101 |
| 2 | 504 | 102 |
| 3 | 505 | 101 |
| 4 | 505 | 103 |
| 5 | 508 | 101 |
| 6 | 510 | 104 |
| 7 | 509 | 101 |
-------------------------------
TABLE_02
----------------------------------------
| ID | T_01 | timestamp | data |
----------------------------------------
| 1 | 1 | ts_01 | ..abc.. |
| 2 | 1 | ts_02 | ..dfg.. |
| 3 | 2 | ts_03 | ..hij.. |
| 4 | 3 | ts_04 | ..klm.. |
| 5 | 1 | ts_05 | ..nop.. |
| 6 | 4 | ts_06 | ..qrs.. |
| 7 | 3 | ts_07 | ..tuv.. |
| 8 | 5 | ts_08 | ..wxy.. |
| 9 | 2 | ts_09 | ..z.... |
| 10 | 4 | ts_10 | ..abc.. |
----------------------------------------
In both tables, ID is the auto-increment primary key.
In TABLE_01, the columns Key1 + Key2 form a unique key (there can't be more than one row per Key1/Key2 pair).
In TABLE_02, the column T_01 references TABLE_01.ID.
My goal: given a Key1 value, get the last TABLE_02 entry for each matching TABLE_01.ID, with the results ordered by timestamp descending.
For example, if I give 505, the output should be:
KEY1 | KEY2 | TIMESTAMP
---------------------------
505 | 103 | ts_10 ---> FROM TABLE_01.Id = 4
505 | 101 | ts_07 ---> FROM TABLE_01.Id = 3
As you can see, it shows only the last entry for each ID; for TABLE_01.ID = 4 (which is 505 | 103), only ts_10 appears and not ts_06.
I have tried to do something like this:
SELECT `t1`.`Key1`, `t1`.`key2`, `t2`.`timestamp`
FROM `TABLE_02` AS t2
INNER JOIN `TABLE_01` AS t1
WHERE `t1`.`key1` = '505'
ORDER BY `t2`.`ID`
DESC LIMIT 100
The problem with this query is that since I am using t2.timestamp, I am receiving all the results instead of only ONE for EACH. Also, I'm not using TABLE_01.ID correctly to link the rows of TABLE_02.
If you just want the latest timestamp in the second table per combination of keys in the first table, you can join and aggregate:
select t1.key1, t1.key2, max(t2.timestamp) max_t2_timestamp
from table_01 t1
inner join table_02 t2 on t2.t_01 = t1.id
group by t1.key1, t1.key2
If you want the entire row of the second table, then I would recommend window functions (available in MySQL 8.0 and later). Note that no GROUP BY is needed here; the window function does the partitioning:
select *
from (
select t1.key1, t1.key2, t2.*,
row_number() over(partition by t1.key1, t1.key2 order by t2.timestamp desc) rn
from table_01 t1
inner join table_02 t2 on t2.t_01 = t1.id
) t
where rn = 1
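Both approaches can be tried against the sample data from the question. The following is only a sketch under assumptions: it runs in SQLite through Python's sqlite3 module instead of MySQL (window functions need SQLite 3.25 or later), the ts_NN placeholders are stored as text, and a `WHERE t1.key1 = ?` filter is added because the question asks for a single Key1 value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_01 (id INTEGER PRIMARY KEY, key1 INTEGER, key2 INTEGER,
                       UNIQUE (key1, key2));
CREATE TABLE table_02 (id INTEGER PRIMARY KEY,
                       t_01 INTEGER REFERENCES table_01(id),
                       timestamp TEXT, data TEXT);
INSERT INTO table_01 VALUES
  (1,504,101),(2,504,102),(3,505,101),(4,505,103),
  (5,508,101),(6,510,104),(7,509,101);
INSERT INTO table_02 VALUES
  (1,1,'ts_01','..abc..'),(2,1,'ts_02','..dfg..'),(3,2,'ts_03','..hij..'),
  (4,3,'ts_04','..klm..'),(5,1,'ts_05','..nop..'),(6,4,'ts_06','..qrs..'),
  (7,3,'ts_07','..tuv..'),(8,5,'ts_08','..wxy..'),(9,2,'ts_09','..z....'),
  (10,4,'ts_10','..abc..');
""")

# Approach 1: join + aggregate, returning only the latest timestamp per key pair.
latest = conn.execute("""
SELECT t1.key1, t1.key2, MAX(t2.timestamp) AS max_ts
FROM table_01 t1
JOIN table_02 t2 ON t2.t_01 = t1.id
WHERE t1.key1 = ?
GROUP BY t1.key1, t1.key2
ORDER BY max_ts DESC
""", (505,)).fetchall()

# Approach 2: ROW_NUMBER() keeps the whole latest row (here including data).
full_rows = conn.execute("""
SELECT key1, key2, timestamp, data
FROM (
  SELECT t1.key1, t1.key2, t2.timestamp, t2.data,
         ROW_NUMBER() OVER (PARTITION BY t1.key1, t1.key2
                            ORDER BY t2.timestamp DESC) AS rn
  FROM table_01 t1
  JOIN table_02 t2 ON t2.t_01 = t1.id
  WHERE t1.key1 = ?
) t
WHERE rn = 1
ORDER BY timestamp DESC
""", (505,)).fetchall()

print(latest)
print(full_rows)
```

For Key1 = 505 both queries return one row per Key2, ts_10 for 103 and ts_07 for 101, which is the output the question asks for.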

How can I get the last row from each given row value in a column through date? [duplicate]

I have the following table.
+--------------------+--------------+-------+
Date | SymbolNumber | Value
+--------------------+--------------+-------+
2018-08-31 15:00:00 | 123 | data
2018-09-31 15:00:00 | 456 | data
2018-09-31 15:00:00 | 123 | data
2018-09-31 15:00:00 | 555 | data
2018-10-31 15:00:00 | 555 | data
2018-10-31 15:00:00 | 231 | data
2018-10-31 15:00:00 | 123 | data
2018-11-31 15:00:00 | 123 | data
2018-11-31 15:00:00 | 555 | data
2018-12-31 15:00:00 | 123 | data
2018-12-31 15:00:00 | 555 | data
I need a query that can select the last row of each SymbolNumber stated in the query.
SELECT
*
FROM
MyTable
WHERE
symbolNumber IN (123, 555)
AND
**lastOfRow ordered by latest-date**
Expected results:
2018-12-31 15:00:00 | 123 | data
2018-12-31 15:00:00 | 555 | data
How can I do this?
First, you need a query that gets the latest date for each symbolNumber. Second, you can inner join the table to that result (on symbolNumber and date) to get the rest of the columns. Like this:
SELECT
t.*
FROM
<table_name> AS t
INNER JOIN
(SELECT
symbolNumber,
MAX(date) AS maxDate
FROM
<table_name>
GROUP BY
symbolNumber) AS latest_date ON latest_date.symbolNumber = t.symbolNumber AND latest_date.maxDate = t.date
The previous query returns the latest row for every symbolNumber in the table. If you want to restrict it to symbolNumbers 123 and 555, make the following modification:
SELECT
t.*
FROM
<table_name> AS t
INNER JOIN
(SELECT
symbolNumber,
MAX(date) AS maxDate
FROM
<table_name>
WHERE
symbolNumber IN (123, 555)
GROUP BY
symbolNumber) AS latest_date ON latest_date.symbolNumber = t.symbolNumber AND latest_date.maxDate = t.date
We can do a "self-left-join" on symbolNumber, matching each row to the other rows in the same group that have a higher Date value on the right side.
We then keep only those rows for which no higher date could be found (meaning the current row has the highest date in its group).
Here is a solution that avoids the subquery and uses a Left Join:
SELECT t1.*
FROM MyTable AS t1
LEFT JOIN MyTable AS t2
ON t2.symbolNumber = t1.symbolNumber AND
t2.Date > t1.Date -- Joining to a row in same group with higher date
WHERE t1.symbolNumber IN (123, 555) AND
t2.symbolNumber IS NULL -- Higher date not found; so this is highest row
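To check that the derived-table join and the exclusion (left) join really pick the same rows, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The table contents are an assumption: a reduced sample with valid month-end dates and made-up one-letter values in place of "data".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyTable (date TEXT, symbolNumber INTEGER, value TEXT);
INSERT INTO MyTable VALUES
  ('2018-08-31 15:00:00', 123, 'a'),
  ('2018-09-30 15:00:00', 456, 'b'),
  ('2018-09-30 15:00:00', 123, 'c'),
  ('2018-10-31 15:00:00', 555, 'd'),
  ('2018-12-31 15:00:00', 123, 'e'),
  ('2018-12-31 15:00:00', 555, 'f');
""")

# Derived table: compute MAX(date) per symbol, then join back for the full row.
derived = conn.execute("""
SELECT t.*
FROM MyTable AS t
JOIN (SELECT symbolNumber, MAX(date) AS maxDate
      FROM MyTable
      WHERE symbolNumber IN (123, 555)
      GROUP BY symbolNumber) AS latest
  ON latest.symbolNumber = t.symbolNumber AND latest.maxDate = t.date
ORDER BY t.symbolNumber
""").fetchall()

# Exclusion join: keep rows for which no later row of the same symbol exists.
exclusion = conn.execute("""
SELECT t1.*
FROM MyTable AS t1
LEFT JOIN MyTable AS t2
  ON t2.symbolNumber = t1.symbolNumber AND t2.date > t1.date
WHERE t1.symbolNumber IN (123, 555)
  AND t2.symbolNumber IS NULL
ORDER BY t1.symbolNumber
""").fetchall()

print(derived)
print(exclusion)
```

Both queries return the 2018-12-31 rows for 123 and 555. If two rows of the same symbol share the latest date, both approaches return all of them.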
EDIT:
Benchmark comparing the Left Join method vs. the Derived Table (Subquery)
@Strawberry ran a little benchmark test in MySQL 5.6.21. Here's what he found...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,dense_user INT NOT NULL
,sparse_user INT NOT NULL
);
INSERT INTO my_table (dense_user,sparse_user)
SELECT RAND()*100,RAND()*100000;
INSERT INTO my_table (dense_user,sparse_user)
SELECT RAND()*100,RAND()*100000 FROM my_table;
-- REPEAT THIS LINE A FEW TIMES !!!
SELECT COUNT(DISTINCT dense_user) dense
, COUNT(DISTINCT sparse_user) sparse
, COUNT(*) total
FROM my_table;
+-------+--------+---------+
| dense | sparse | total |
+-------+--------+---------+
| 101 | 99999 | 1048576 |
+-------+--------+---------+
ALTER TABLE my_table ADD INDEX(dense_user);
ALTER TABLE my_table ADD INDEX(sparse_user);
--dense_test
SELECT x.*
FROM my_table x
LEFT
JOIN my_table y
ON y.dense_user = x.dense_user
AND y.id < x.id
WHERE y.id IS NULL
ORDER
BY dense_user
LIMIT 10;
+------+------------+-------------+
| id | dense_user | sparse_user |
+------+------------+-------------+
| 1212 | 0 | 1950 |
| 153 | 1 | 23193 |
| 255 | 2 | 27472 |
| 28 | 3 | 86440 |
| 18 | 4 | 47886 |
| 291 | 5 | 76563 |
| 15 | 6 | 85049 |
| 16 | 7 | 78384 |
| 135 | 8 | 52304 |
| 62 | 9 | 40930 |
+------+------------+-------------+
10 rows in set (2.64 sec)
SELECT x.*
FROM my_table x
JOIN
( SELECT dense_user, MIN(id) id FROM my_table GROUP BY dense_user ) y
ON y.dense_user = x.dense_user
AND y.id = x.id
ORDER
BY dense_user
LIMIT 10;
+------+------------+-------------+
| id | dense_user | sparse_user |
+------+------------+-------------+
| 1212 | 0 | 1950 |
| 153 | 1 | 23193 |
| 255 | 2 | 27472 |
| 28 | 3 | 86440 |
| 18 | 4 | 47886 |
| 291 | 5 | 76563 |
| 15 | 6 | 85049 |
| 16 | 7 | 78384 |
| 135 | 8 | 52304 |
| 62 | 9 | 40930 |
+------+------------+-------------+
10 rows in set (0.05 sec)
Uncorrelated query is 50 times faster.
--sparse test
SELECT x.*
FROM my_table x
LEFT
JOIN my_table y
ON y.sparse_user = x.sparse_user
AND y.id < x.id
WHERE y.id IS NULL
ORDER
BY sparse_user
LIMIT 10;
+--------+------------+-------------+
| id | dense_user | sparse_user |
+--------+------------+-------------+
| 165055 | 75 | 0 |
| 37598 | 63 | 1 |
| 170596 | 70 | 2 |
| 46142 | 87 | 3 |
| 33546 | 21 | 4 |
| 323114 | 87 | 5 |
| 86592 | 96 | 6 |
| 156711 | 36 | 7 |
| 17148 | 62 | 8 |
| 139965 | 71 | 9 |
+--------+------------+-------------+
10 rows in set (0.03 sec)
SELECT x.*
FROM my_table x
JOIN ( SELECT sparse_user, MIN(id) id FROM my_table GROUP BY sparse_user ) y
ON y.sparse_user = x.sparse_user
AND y.id = x.id
ORDER
BY sparse_user
LIMIT 10;
+--------+------------+-------------+
| id | dense_user | sparse_user |
+--------+------------+-------------+
| 165055 | 75 | 0 |
| 37598 | 63 | 1 |
| 170596 | 70 | 2 |
| 46142 | 87 | 3 |
| 33546 | 21 | 4 |
| 323114 | 87 | 5 |
| 86592 | 96 | 6 |
| 156711 | 36 | 7 |
| 17148 | 62 | 8 |
| 139965 | 71 | 9 |
+--------+------------+-------------+
10 rows in set (4.73 sec)
Exclusion Join is 150 times faster
However, as you move further up the result set, the picture begins to change very dramatically...
SELECT x.*
FROM my_table x
JOIN ( SELECT sparse_user, MIN(id) id FROM my_table GROUP BY sparse_user ) y
ON y.sparse_user = x.sparse_user
AND y.id = x.id
ORDER
BY sparse_user
LIMIT 10000,10;
+--------+------------+-------------+
| id | dense_user | sparse_user |
+--------+------------+-------------+
| 9810 | 93 | 10000 |
| 162438 | 4 | 10001 |
| 467371 | 62 | 10002 |
| 8258 | 13 | 10003 |
| 297049 | 17 | 10004 |
| 68354 | 23 | 10005 |
| 192701 | 64 | 10006 |
| 176225 | 92 | 10007 |
| 156595 | 37 | 10008 |
| 318266 | 1 | 10009 |
+--------+------------+-------------+
10 rows in set (9.17 sec)
SELECT x.*
FROM my_table x
LEFT
JOIN my_table y
ON y.sparse_user = x.sparse_user
AND y.id < x.id
WHERE y.id IS NULL
ORDER
BY sparse_user
LIMIT 10000,10;
+--------+------------+-------------+
| id | dense_user | sparse_user |
+--------+------------+-------------+
| 9810 | 93 | 10000 |
| 162438 | 4 | 10001 |
| 467371 | 62 | 10002 |
| 8258 | 13 | 10003 |
| 297049 | 17 | 10004 |
| 68354 | 23 | 10005 |
| 192701 | 64 | 10006 |
| 176225 | 92 | 10007 |
| 156595 | 37 | 10008 |
| 318266 | 1 | 10009 |
+--------+------------+-------------+
10 rows in set (32.19 sec) -- !!!
In summary, the exclusion join (the so-called 'strawberry query') can be (significantly) faster in certain, limited situations. More generally, an uncorrelated query will be faster.

mysql select most recent row by date for each user

I'm trying to select the most recent rows for every unique userid where pid = 50 and active = 1. I haven't been able to figure it out.
Here is a sample table
+-----+----------+-------+-----------------------+---------+
| id | userid | pid | start_date | active |
+-----+----------+-------+-----------------------+---------+
| 1 | 4 | 50 | 2015-05-15 12:00:00 | 1 |
| 2 | 4 | 50 | 2015-05-16 12:00:00 | 1 |
| 3 | 4 | 50 | 2015-05-17 12:00:00 | 0 |
| 4 | 4 | 51 | 2015-06-29 12:00:00 | 1 |
| 5 | 4 | 51 | 2015-06-30 12:00:00 | 1 |
| 6 | 5 | 50 | 2015-07-05 12:00:00 | 1 |
| 7 | 5 | 50 | 2015-07-06 12:00:00 | 1 |
| 8 | 5 | 51 | 2015-07-08 12:00:00 | 1 |
+-----+----------+-------+-----------------------+---------+
Desired Result
+-----+----------+-------+-----------------------+---------+
| id | userid | pid | start_date | active |
+-----+----------+-------+-----------------------+---------+
| 2 | 4 | 50 | 2015-05-16 12:00:00 | 1 |
| 7 | 5 | 50 | 2015-07-06 12:00:00 | 1 |
+-----+----------+-------+-----------------------+---------+
I've tried a bunch of things and this is the closest I got, but unfortunately it is not quite there.
SELECT *
FROM mytable t1
WHERE
(
SELECT COUNT(*)
FROM mytable t2
WHERE
t1.userid = t2.userid
AND t1.start_date < t2.start_date
) < 1
AND pid = 50
AND active = 1
ORDER BY start_date DESC
plan
get last record grouping by userid where pid is 50 and is active
inner join to mytable to get the record info associated with last
query
select
my.*
from
(
select userid, pid, active, max(start_date) as lst
from mytable
where pid = 50
and active = 1
group by userid, pid, active
) maxd
inner join mytable my
on maxd.userid = my.userid
and maxd.pid = my.pid
and maxd.active = my.active
and maxd.lst = my.start_date
;
output
+----+--------+-----+------------------------+--------+
| id | userid | pid | start_date | active |
+----+--------+-----+------------------------+--------+
| 2 | 4 | 50 | May, 16 2015 12:00:00 | 1 |
| 7 | 5 | 50 | July, 06 2015 12:00:00 | 1 |
+----+--------+-----+------------------------+--------+
notes
As suggested by @Strawberry, updated to also join on pid and active. This avoids returning a record that is not active or not pid 50 but happens to have exactly the same date.
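Here is a runnable sketch of this plan on the sample table, using Python's sqlite3 module as a stand-in for MySQL (an assumption; the dates are stored as ISO text). It also includes a variant of the asker's own attempt: the original subquery compares against every later row of the user, including inactive ones and other pids, so the variant restricts the comparison rows with the same pid/active filter. That variant is an illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER PRIMARY KEY, userid INTEGER, pid INTEGER,
                      start_date TEXT, active INTEGER);
INSERT INTO mytable VALUES
  (1, 4, 50, '2015-05-15 12:00:00', 1),
  (2, 4, 50, '2015-05-16 12:00:00', 1),
  (3, 4, 50, '2015-05-17 12:00:00', 0),
  (4, 4, 51, '2015-06-29 12:00:00', 1),
  (5, 4, 51, '2015-06-30 12:00:00', 1),
  (6, 5, 50, '2015-07-05 12:00:00', 1),
  (7, 5, 50, '2015-07-06 12:00:00', 1),
  (8, 5, 51, '2015-07-08 12:00:00', 1);
""")

# Plan from the answer: latest start_date per user among active pid-50 rows,
# joined back on all grouping columns to fetch the full record.
joined = conn.execute("""
SELECT my.*
FROM (SELECT userid, pid, active, MAX(start_date) AS lst
      FROM mytable
      WHERE pid = 50 AND active = 1
      GROUP BY userid, pid, active) maxd
JOIN mytable my
  ON maxd.userid = my.userid AND maxd.pid = my.pid
 AND maxd.active = my.active AND maxd.lst = my.start_date
ORDER BY my.id
""").fetchall()

# Variant of the asker's correlated subquery with the filters applied
# to the comparison rows as well.
correlated = conn.execute("""
SELECT *
FROM mytable t1
WHERE t1.pid = 50 AND t1.active = 1
  AND NOT EXISTS (SELECT 1
                  FROM mytable t2
                  WHERE t2.userid = t1.userid
                    AND t2.pid = 50 AND t2.active = 1
                    AND t2.start_date > t1.start_date)
ORDER BY t1.id
""").fetchall()

print(joined)
print(correlated)
```

Both queries return the rows with id 2 and id 7, matching the desired result.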

How to fetch data from two tables using mysql query with some conditions applied to the second table?

I want to generate a result from a MySql query with below requirement.
Table 1 :
---------------
| nid | type |
---------------
| 1 | forum |
| 2 | forum |
| 3 | forum |
| 4 | forum |
---------------
Table 2
-----------------------
| nid | cid | created |
-----------------------
| 1 | 32 | 123456 |
| 2 | 65 | 123457 |
| 4 | 67 | 123458 |
| 1 | 61 | 123491 |
| 1 | 78 | 123497 |
| 2 | 23 | 123498 |
| 1 | 12 | 123698 |
| 4 | 54 | 132365 |
| 4 | 81 | 135698 |
| 1 | 30 | 168965 |
-----------------------
Now I require a result like the one below. (Condition: I need the nid from the first table and the smallest cid for the corresponding nid in the second table, WHERE type = 'forum')
--------------
| nid | cid |
--------------
| 1 | 12 |
| 2 | 23 |
| 4 | 54 |
--------------
You can try this
SELECT tbl1.nid,
min(tbl2.cid) as cid
FROM table1 tbl1
INNER JOIN table2 tbl2 ON tbl1.nid=tbl2.nid
GROUP BY tbl2.nid;
Try this
SELECT t1.nid,
min(t2.cid) as cid
FROM table1 t1
INNER JOIN table2 t2 ON t1.nid=t2.nid
GROUP BY t2.nid;
This could also work
select nid, min(cid) cid
from table2
group by nid
The above queries have issues in the GROUP BY clause. Kindly check this query.
SELECT t1.NID, MIN(t2.CID) AS cid from
TAB1 t1 inner join TAB2 t2 on t1.nid = t2.nid
group by t1.nid
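None of the queries above applies the type = 'forum' condition from the question. A sketch that adds it, run in SQLite through Python's sqlite3 module as a stand-in for MySQL: the row with nid 5 and type 'page' is hypothetical, added only so that the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (nid INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE table2 (nid INTEGER, cid INTEGER, created INTEGER);
INSERT INTO table1 VALUES (1,'forum'),(2,'forum'),(3,'forum'),(4,'forum'),
                          (5,'page');              -- hypothetical non-forum row
INSERT INTO table2 VALUES
  (1,32,123456),(2,65,123457),(4,67,123458),(1,61,123491),(1,78,123497),
  (2,23,123498),(1,12,123698),(4,54,132365),(4,81,135698),(1,30,168965),
  (5,5,140000);                                    -- hypothetical comment on nid 5
""")

# Smallest cid per forum node; the inner join drops nodes without comments (nid 3).
rows = conn.execute("""
SELECT t1.nid, MIN(t2.cid) AS cid
FROM table1 t1
JOIN table2 t2 ON t2.nid = t1.nid
WHERE t1.type = 'forum'
GROUP BY t1.nid
ORDER BY t1.nid
""").fetchall()

print(rows)
```

With the sample data, nid 4 has the cids 67, 54 and 81, so its smallest cid is 54.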

Get MAX row for GROUP in MySQL

I have the following data:
+---------+----------+----------+--------+
| id | someId | number | data |
+---------+----------+----------+--------+
| 27 | 123 | 1 | abcde1 |
| 28 | 123 | 3 | abcde2 |
| 29 | 123 | 1 | abcde3 |
| 30 | 123 | 5 | abcde4 |
| 31 | 124 | 4 | abcde1 |
| 32 | 124 | 8 | abcde2 |
| 33 | 124 | 1 | abcde3 |
| 34 | 124 | 2 | abcde4 |
| 35 | 123 | 16 | abcde1 |
| 245 | 123 | 3 | abcde2 |
| 250 | 125 | 0 | abcde3 |
| 251 | 125 | 1 | abcde4 |
| 252 | 125 | 7 | abcde1 |
| 264 | 125 | 0 | abcde2 |
| 294 | 123 | 0 | abcde3 |
| 295 | 126 | 0 | abcde4 |
| 296 | 126 | 0 | abcde1 |
| 376 | 126 | 0 | abcde2 |
+---------+----------+----------+--------+
And I want to get a MySQL query that gets me the data of the row with the highest number for each someId. Note that id is unique, but number isn't
SELECT someid, highest_number, data
FROM test_1
INNER JOIN (SELECT someid sid, max(number) highest_number
FROM test_1
GROUP BY someid) t
ON (someid=sid and number=highest_number)
Unfortunately, it does not look very efficient. In Oracle it would be possible to use the OVER clause without subqueries, but MySQL…
Update 1
If the highest number occurs several times for a someid, this query also returns several rows for that pair of someid and number.
To get only one row per someid, we should pre-aggregate the source table to make the someid and number pairs unique (see the t1 subquery).
SELECT someid, highest_number, data
FROM
(SELECT someid, number, MIN(data) data
FROM test_1
GROUP BY
someid, number) t1
INNER JOIN
(SELECT someid sid, max(number) highest_number
FROM test_1
GROUP BY someid) t2
ON (someid=sid and number=highest_number)
Update 2
It is possible to simplify the previous solution:
SELECT someid, highest_number,
(select min(data)
from test_1
where someid=t1.someid and number=highest_number) AS data
FROM
(SELECT someid, max(number) highest_number
FROM test_1
GROUP BY someid) t1
Once the unique pairs of someid and highest number are materialized, it is possible to use a correlated subquery. Unlike a JOIN, it does not produce additional rows if the highest value of number is repeated several times.
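The difference between the plain join and the correlated subquery shows up for someId 126, where the highest number (0) occurs three times. A minimal sketch on the sample data, assuming SQLite via Python's sqlite3 module in place of MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_1 (id INTEGER PRIMARY KEY, someid INTEGER,
                     number INTEGER, data TEXT);
INSERT INTO test_1 VALUES
  (27,123,1,'abcde1'),(28,123,3,'abcde2'),(29,123,1,'abcde3'),
  (30,123,5,'abcde4'),(31,124,4,'abcde1'),(32,124,8,'abcde2'),
  (33,124,1,'abcde3'),(34,124,2,'abcde4'),(35,123,16,'abcde1'),
  (245,123,3,'abcde2'),(250,125,0,'abcde3'),(251,125,1,'abcde4'),
  (252,125,7,'abcde1'),(264,125,0,'abcde2'),(294,123,0,'abcde3'),
  (295,126,0,'abcde4'),(296,126,0,'abcde1'),(376,126,0,'abcde2');
""")

# Plain join on (someid, MAX(number)): ties produce several rows per someid.
join_rows = conn.execute("""
SELECT t.someid, t.number, t.data
FROM test_1 t
JOIN (SELECT someid AS sid, MAX(number) AS highest_number
      FROM test_1
      GROUP BY someid) m
  ON t.someid = m.sid AND t.number = m.highest_number
ORDER BY t.someid, t.id
""").fetchall()

# Correlated subquery with MIN(data): exactly one row per someid.
one_per_group = conn.execute("""
SELECT t1.someid, t1.highest_number,
       (SELECT MIN(data)
        FROM test_1
        WHERE someid = t1.someid AND number = t1.highest_number) AS data
FROM (SELECT someid, MAX(number) AS highest_number
      FROM test_1
      GROUP BY someid) t1
ORDER BY t1.someid
""").fetchall()

print(len(join_rows), join_rows)
print(one_per_group)
```

The join returns six rows (three of them for someId 126), while the correlated subquery returns four, one per someId, using MIN(data) as the tie-breaker.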
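A slight tweak to Naeel's answer: to return just a single data result for any someId even if there's a tie, you should add a GROUP BY. (This relies on MySQL's permissive GROUP BY handling; with ONLY_FULL_GROUP_BY enabled, the default since MySQL 5.7.5, selecting the non-aggregated data column is rejected.)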
SELECT t1.someid, t1.number, t1.data
FROM Table1 t1
INNER JOIN (SELECT someId sid, max(number) max_number
FROM Table1
GROUP BY someId) t2
ON (someId = sid AND number = max_number)
GROUP BY t1.someId