Add the values of two rows based on a condition in MySQL

I have the following table structure:
+------+-------+-------+--------+
| mid  | a     | b     | points |
+------+-------+-------+--------+
|   69 |  3137 | 13316 |    210 |
|   70 | 13316 |  3137 |    350 |
|   71 |  3497 | 13316 |    200 |
|   72 | 13316 |  3497 |     25 |
|   73 |  3605 | 13316 |    205 |
|   74 | 13316 |  3605 |    290 |
+------+-------+-------+--------+
I want to add the "points" values of two rows when "a" of row 1 = "b" of row 2 and "a" of row 2 = "b" of row 1.
The output needs to be something like this:
+------+-------+-------+--------+
| mid  | a     | b     | points |
+------+-------+-------+--------+
|   69 |  3137 | 13316 |    560 |
|   71 |  3497 | 13316 |    225 |
|   73 |  3605 | 13316 |    495 |
+------+-------+-------+--------+

You can try it this way
SELECT t1.mid, t1.a, t1.b, t1.points + t2.points AS points
FROM table1 t1
JOIN table1 t2
  ON t1.a = t2.b
 AND t1.b = t2.a
 AND t1.mid < t2.mid
or, leveraging MySQL's non-standard GROUP BY extension:
SELECT mid, a, b, SUM(points) AS points
FROM
(
    SELECT mid, a, b, points
    FROM table1
    ORDER BY mid
) q
GROUP BY LEAST(a, b), GREATEST(a, b)
Output:
| MID | A    | B     | POINTS |
|-----|------|-------|--------|
|  69 | 3137 | 13316 |    560 |
|  71 | 3497 | 13316 |    225 |
|  73 | 3605 | 13316 |    495 |
Here is SQLFiddle demo
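Note that the GROUP BY variant relies on MySQL returning the non-aggregated mid, a and b values from an arbitrary row of each group, which is rejected under ONLY_FULL_GROUP_BY (the default SQL mode since MySQL 5.7.5). Below is a minimal sketch that only selects grouped or aggregated expressions, assuming the same table1; here a always ends up holding the smaller value of each pair and b the larger one:
-- Sketch: valid under ONLY_FULL_GROUP_BY because every select item is either
-- an aggregate or an expression that also appears in the GROUP BY clause.
SELECT MIN(mid)       AS mid,
       LEAST(a, b)    AS a,
       GREATEST(a, b) AS b,
       SUM(points)    AS points
FROM table1
GROUP BY LEAST(a, b), GREATEST(a, b);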

If I've understood correctly then you want:
SELECT p1.mid, p1.a, p1.b, p1.points + p2.points points
FROM points p1
INNER JOIN points p2 ON p1.a = p2.b AND p1.b = p2.a
WHERE p1.mid < p2.mid
I have knocked it up in SQL Fiddle and it gives the output that you want.

Related

Percentage Query in MySQL

I have 2 tables as shown below:
Tabel_1
| idUnit | Budget |
|--------|--------|
| 112    | 1000   |
| 112    | 2000   |
| 113    | 4000   |
Tabel_2
| idUnit | Real2 |
|--------|-------|
| 112    | 500   |
| 112    | 100   |
| 113    | 200   |
My question: how do I produce the table below, with a percentage column?
| idUnit | TotalBudget | TotalReal2 | Percentage |
|--------|-------------|------------|------------|
| 112    | 3000        | 600        | ? (15%)    |
| 113    | 4000        | 200        | ? (5%)     |
My query so far:
SELECT t1.idUnit, SUM(Budget) AS TotalBudget, t2.TotalReal2
FROM Tabel_1 AS t1
JOIN (
    SELECT idUnit, SUM(Real2) AS TotalReal2
    FROM Tabel_2
    GROUP BY idUnit
) AS t2 ON t1.idUnit = t2.idUnit
GROUP BY t1.idUnit;
You can use two subqueries and JOIN them, then calculate the percentage column from TotalReal2 and TotalBudget:
Query 1:
SELECT t1.idUnit,
       t1.TotalBudget,
       t2.TotalReal2,
       CONCAT((t2.TotalReal2 / t1.TotalBudget) * 100, '%') AS Percentage
FROM (
    SELECT idUnit, SUM(Budget) AS TotalBudget
    FROM Tabel_1
    GROUP BY idUnit
) AS t1
INNER JOIN (
    SELECT idUnit,
           SUM(Real2) AS TotalReal2
    FROM Tabel_2
    GROUP BY idUnit
) AS t2 ON t1.idUnit = t2.idUnit
Results:
| idUnit | TotalBudget | TotalReal2 | Percentage |
|--------|-------------|------------|------------|
| 112    | 3000        | 600        | 20.0000%   |
| 113    | 4000        | 200        | 5.0000%    |
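If you want the percentage rounded to a fixed number of decimals (e.g. 20.00% rather than 20.0000%), a small variation of the query above should work; this is just a sketch reusing the same subquery aliases:
-- Round the ratio to two decimals before formatting it as text
SELECT t1.idUnit,
       t1.TotalBudget,
       t2.TotalReal2,
       CONCAT(ROUND(t2.TotalReal2 / t1.TotalBudget * 100, 2), '%') AS Percentage
FROM (
    SELECT idUnit, SUM(Budget) AS TotalBudget
    FROM Tabel_1
    GROUP BY idUnit
) AS t1
INNER JOIN (
    SELECT idUnit, SUM(Real2) AS TotalReal2
    FROM Tabel_2
    GROUP BY idUnit
) AS t2 ON t1.idUnit = t2.idUnit;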

Select row based on another row having same id in same table in MySQL

I have a table view T like:
id | C1   | C2  |
---+------+-----+
 1 | pat  | 190 |
 1 | pat1 | 191 |
 1 | A5   | 302 |
 2 | pet  | 190 |
 2 | pet1 | 191 |
 2 | A5   | 302 |
 3 | pit  | 190 |
 3 | pit1 | 191 |
 3 | A6   | 302 |
Would like to get:
id | C1  | C2  |
---+-----+-----+
 1 | pat | 190 |
 2 | pet | 190 |
In other words, return the rows where C2 = 190 and the same id also has a row where C1 = 'A5'.
I have tried several LEFT JOIN approaches but haven't gotten anywhere. Please help. Thanks.
You need EXISTS:
select t.*
from tablename t
where c2 = 190
and exists (
select 1 from tablename where id = t.id and c1 = 'A5'
)
See the demo.
Results:
| id  | C1  | C2  |
| --- | --- | --- |
| 1   | pat | 190 |
| 2   | pet | 190 |
You can use EXISTS to check if a row with c1 = 'A5' exists for an ID.
SELECT *
FROM t t1
WHERE t1.c2 = 190
  AND EXISTS (SELECT *
              FROM t t2
              WHERE t2.id = t1.id
                AND t2.c1 = 'A5');
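If you prefer a join over EXISTS, a rough equivalent is sketched below, using the same table and column names as the answer above; DISTINCT guards against duplicate rows in case an id ever has more than one C1 = 'A5' row:
-- Join each c2 = 190 row to an 'A5' row with the same id
SELECT DISTINCT t1.*
FROM t t1
JOIN t t2
  ON t2.id = t1.id
 AND t2.c1 = 'A5'
WHERE t1.c2 = 190;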

How can I get the last row from each given row value in a column through date? [duplicate]

This question already has answers here:
Retrieving the last record in each group - MySQL
(33 answers)
Closed 4 years ago.
I have the following table.
+---------------------+--------------+-------+
| Date                | SymbolNumber | Value |
+---------------------+--------------+-------+
| 2018-08-31 15:00:00 | 123          | data  |
| 2018-09-30 15:00:00 | 456          | data  |
| 2018-09-30 15:00:00 | 123          | data  |
| 2018-09-30 15:00:00 | 555          | data  |
| 2018-10-31 15:00:00 | 555          | data  |
| 2018-10-31 15:00:00 | 231          | data  |
| 2018-10-31 15:00:00 | 123          | data  |
| 2018-11-30 15:00:00 | 123          | data  |
| 2018-11-30 15:00:00 | 555          | data  |
| 2018-12-31 15:00:00 | 123          | data  |
| 2018-12-31 15:00:00 | 555          | data  |
+---------------------+--------------+-------+
I need a query that can select the last row of each SymbolNumber stated in the query.
SELECT *
FROM MyTable
WHERE symbolNumber IN (123, 555)
  AND **lastOfRow ordered by latest-date**
Expected results:
2018-12-31 15:00:00 | 123 | data
2018-12-31 15:00:00 | 555 | data
How can I do this?
First, you will need a query that gets the latest date for each symbolNumber. Second, you can inner join this back to the table (on symbolNumber and date) to get the rest of the columns. Like this:
SELECT t.*
FROM <table_name> AS t
INNER JOIN (
    SELECT symbolNumber,
           MAX(date) AS maxDate
    FROM <table_name>
    GROUP BY symbolNumber
) AS latest_date
    ON latest_date.symbolNumber = t.symbolNumber
   AND latest_date.maxDate = t.date
The previous query gets the latest data for each symbolNumber in the table. If you want to restrict it to symbolNumbers 123 and 555, you will need to make the following modification:
SELECT t.*
FROM <table_name> AS t
INNER JOIN (
    SELECT symbolNumber,
           MAX(date) AS maxDate
    FROM <table_name>
    WHERE symbolNumber IN (123, 555)
    GROUP BY symbolNumber
) AS latest_date
    ON latest_date.symbolNumber = t.symbolNumber
   AND latest_date.maxDate = t.date
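If you are on MySQL 8.0 or later, a window function is an alternative to the derived table; here is a minimal sketch, assuming the table is called MyTable as in the question:
-- MySQL 8.0+ only: number the rows per symbolNumber, newest first, and keep the first of each group
SELECT Date, SymbolNumber, Value
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY SymbolNumber ORDER BY Date DESC) AS rn
    FROM MyTable t
    WHERE SymbolNumber IN (123, 555)
) ranked
WHERE rn = 1;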
We can do a "self left join" on symbolNumber, matching each row to other rows in the same group that have a higher Date value on the right side.
We then keep only those rows for which no higher date was found (meaning the current row has the highest date in its group).
Here is a solution avoiding a subquery and utilizing a LEFT JOIN:
SELECT t1.*
FROM MyTable AS t1
LEFT JOIN MyTable AS t2
ON t2.symbolNumber = t1.symbolNumber AND
t2.Date > t1.Date -- Joining to a row in same group with higher date
WHERE t1.symbolNumber IN (123, 555) AND
t2.symbolNumber IS NULL -- Higher date not found; so this is highest row
EDIT:
Benchmark comparing the LEFT JOIN method vs. the derived table (subquery) method.
@Strawberry ran a little benchmark test in MySQL 5.6.21. Here's what he found...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,dense_user INT NOT NULL
,sparse_user INT NOT NULL
);
INSERT INTO my_table (dense_user,sparse_user)
SELECT RAND()*100,RAND()*100000;
INSERT INTO my_table (dense_user,sparse_user)
SELECT RAND()*100,RAND()*100000 FROM my_table;
-- REPEAT THIS LINE A FEW TIMES !!!
SELECT COUNT(DISTINCT dense_user) dense
, COUNT(DISTINCT sparse_user) sparse
, COUNT(*) total
FROM my_table;
+-------+--------+---------+
| dense | sparse | total   |
+-------+--------+---------+
|   101 |  99999 | 1048576 |
+-------+--------+---------+
ALTER TABLE my_table ADD INDEX(dense_user);
ALTER TABLE my_table ADD INDEX(sparse_user);
--dense_test
SELECT x.*
FROM my_table x
LEFT
JOIN my_table y
ON y.dense_user = x.dense_user
AND y.id < x.id
WHERE y.id IS NULL
ORDER
BY dense_user
LIMIT 10;
+------+------------+-------------+
| id   | dense_user | sparse_user |
+------+------------+-------------+
| 1212 |          0 |        1950 |
|  153 |          1 |       23193 |
|  255 |          2 |       27472 |
|   28 |          3 |       86440 |
|   18 |          4 |       47886 |
|  291 |          5 |       76563 |
|   15 |          6 |       85049 |
|   16 |          7 |       78384 |
|  135 |          8 |       52304 |
|   62 |          9 |       40930 |
+------+------------+-------------+
10 rows in set (2.64 sec)
SELECT x.*
FROM my_table x
JOIN
( SELECT dense_user, MIN(id) id FROM my_table GROUP BY dense_user ) y
ON y.dense_user = x.dense_user
AND y.id = x.id
ORDER
BY dense_user
LIMIT 10;
+------+------------+-------------+
| id   | dense_user | sparse_user |
+------+------------+-------------+
| 1212 |          0 |        1950 |
|  153 |          1 |       23193 |
|  255 |          2 |       27472 |
|   28 |          3 |       86440 |
|   18 |          4 |       47886 |
|  291 |          5 |       76563 |
|   15 |          6 |       85049 |
|   16 |          7 |       78384 |
|  135 |          8 |       52304 |
|   62 |          9 |       40930 |
+------+------------+-------------+
10 rows in set (0.05 sec)
Uncorrelated query is 50 times faster.
--sparse test
SELECT x.*
FROM my_table x
LEFT
JOIN my_table y
ON y.sparse_user = x.sparse_user
AND y.id < x.id
WHERE y.id IS NULL
ORDER
BY sparse_user
LIMIT 10;
+--------+------------+-------------+
| id     | dense_user | sparse_user |
+--------+------------+-------------+
| 165055 |         75 |           0 |
|  37598 |         63 |           1 |
| 170596 |         70 |           2 |
|  46142 |         87 |           3 |
|  33546 |         21 |           4 |
| 323114 |         87 |           5 |
|  86592 |         96 |           6 |
| 156711 |         36 |           7 |
|  17148 |         62 |           8 |
| 139965 |         71 |           9 |
+--------+------------+-------------+
10 rows in set (0.03 sec)
SELECT x.*
FROM my_table x
JOIN ( SELECT sparse_user, MIN(id) id FROM my_table GROUP BY sparse_user ) y
ON y.sparse_user = x.sparse_user
AND y.id = x.id
ORDER
BY sparse_user
LIMIT 10;
+--------+------------+-------------+
| id     | dense_user | sparse_user |
+--------+------------+-------------+
| 165055 |         75 |           0 |
|  37598 |         63 |           1 |
| 170596 |         70 |           2 |
|  46142 |         87 |           3 |
|  33546 |         21 |           4 |
| 323114 |         87 |           5 |
|  86592 |         96 |           6 |
| 156711 |         36 |           7 |
|  17148 |         62 |           8 |
| 139965 |         71 |           9 |
+--------+------------+-------------+
10 rows in set (4.73 sec)
Exclusion join is 150 times faster.
However, as you move further up the result set, the picture begins to change very dramatically...
SELECT x.*
FROM my_table x
JOIN ( SELECT sparse_user, MIN(id) id FROM my_table GROUP BY sparse_user ) y
ON y.sparse_user = x.sparse_user
AND y.id = x.id
ORDER
BY sparse_user
LIMIT 10000,10;
+--------+------------+-------------+
| id     | dense_user | sparse_user |
+--------+------------+-------------+
|   9810 |         93 |       10000 |
| 162438 |          4 |       10001 |
| 467371 |         62 |       10002 |
|   8258 |         13 |       10003 |
| 297049 |         17 |       10004 |
|  68354 |         23 |       10005 |
| 192701 |         64 |       10006 |
| 176225 |         92 |       10007 |
| 156595 |         37 |       10008 |
| 318266 |          1 |       10009 |
+--------+------------+-------------+
10 rows in set (9.17 sec)
SELECT x.*
FROM my_table x
LEFT
JOIN my_table y
ON y.sparse_user = x.sparse_user
AND y.id < x.id
WHERE y.id IS NULL
ORDER
BY sparse_user
LIMIT 10000,10;
+--------+------------+-------------+
| id     | dense_user | sparse_user |
+--------+------------+-------------+
|   9810 |         93 |       10000 |
| 162438 |          4 |       10001 |
| 467371 |         62 |       10002 |
|   8258 |         13 |       10003 |
| 297049 |         17 |       10004 |
|  68354 |         23 |       10005 |
| 192701 |         64 |       10006 |
| 176225 |         92 |       10007 |
| 156595 |         37 |       10008 |
| 318266 |          1 |       10009 |
+--------+------------+-------------+
10 rows in set (32.19 sec) -- !!!
In summary, the exclusion join (the so-called 'strawberry query') can be significantly faster in certain, limited situations. More generally, an uncorrelated query will be faster.

MySQL query for row sequences

Is there a way to express the following query in MySQL:
Let a table have types of rows A, B, C, D, E ... Z and each row represents an event. Find the timestamps and ids of all event sequences A, .. , B, ... , C ordered by timestamp so that timestamp(C) - timestamp(A) < Thresh.
For example consider the following table
| type | timestamp | id  |
|------+-----------+-----|
| Z    | 19:00     | 20  |
| A    | 19:01     | 21  |
|      |           |     |
| .    | ...       | ..  |
|      |           |     |
| A    | 20:13     | 50  | *
| B    | 20:14     | 51  | *
| D    | 20:17     | 52  |
| C    | 20:19     | 53  | *
|      |           |     |
| .    | ...       | ..  |
|      |           |     |
| A    | 22:13     | 80  | *
| D    | 22:14     | 81  |
| B    | 22:15     | 82  | *
| K    | 22:16     | 83  |
| J    | 22:17     | 84  |
| C    | 22:19     | 85  | *
|      |           |     |
| .    | ...       | ..  |
|      |           |     |
| A    | 23:13     | 100 |
| B    | 23:14     | 101 |
| C    | 23:50     | 102 |
With Thresh = 10 minutes, the query should yield something along the lines of:
| A_id | B_id | C_id |
|------+------+------|
|   50 |   51 |   53 |
|   80 |   82 |   85 |
See how the last triplet of A, B and C is not present: the time distance between the last A event and the last C event is more than Thresh.
I suspect that the answer would be something along the lines of "MySQL is not the right tool if you need to ask this kind of question". In that case the followup is, which database is a good candidate to handle this kind of task?
Edit: provided an example
I think you can express this using a self join:
SELECT A.id AS A_id, B.id AS B_id, C.id AS C_id
FROM (
    SELECT *
    FROM the_table
    WHERE type = 'A'
) A
JOIN (
    SELECT *
    FROM the_table
    WHERE type = 'B'
) B
JOIN (
    SELECT *
    FROM the_table
    WHERE type = 'C'
) C ON (
    (C.timestamp - A.timestamp) < 10 -- threshold here
    AND B.timestamp BETWEEN A.timestamp AND C.timestamp
)
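One caveat with the sketch above: subtracting two DATETIME values directly performs a numeric subtraction rather than returning a number of minutes, so the threshold line is only a placeholder. Assuming timestamp is a DATETIME or TIMESTAMP column, the same join can be written with an explicit 10-minute threshold:
-- Sketch: threshold expressed in minutes via TIMESTAMPDIFF, results ordered by the A event
SELECT A.id AS A_id, B.id AS B_id, C.id AS C_id
FROM the_table A
JOIN the_table C
  ON C.type = 'C'
 AND C.timestamp >= A.timestamp
 AND TIMESTAMPDIFF(MINUTE, A.timestamp, C.timestamp) < 10
JOIN the_table B
  ON B.type = 'B'
 AND B.timestamp BETWEEN A.timestamp AND C.timestamp
WHERE A.type = 'A'
ORDER BY A.timestamp;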

Calculate difference between one column over multiple rows using coalesce and join

I have following table
+-----+--------+-----------+-----------+-------+
| id  | job_id | source_id | target_id | value |
+-----+--------+-----------+-----------+-------+
| 204 |   5283 |       247 |       228 |  1201 |
| 349 |   4006 |       247 |       228 |   100 |
| 350 |   4007 |       247 |       228 |   500 |
| 351 |   4008 |       247 |       228 |  1000 |
| 352 |   4009 |         1 |       100 |   100 |
| 353 |   4010 |         1 |       100 |   500 |
| 354 |   4011 |         1 |       100 |    50 |
+-----+--------+-----------+-----------+-------+
I want to create a diff of the value column, grouped by source_id and target_id. The older row (smaller id) should be compared with the newer one.
I searched a little bit and found COALESCE. I have written a small query and it works "in general", but not as expected:
SELECT c.id, c.source_id, c.target_id, c.value,
       COALESCE(c1.value - c.value, -1) AS diff
FROM changes c
LEFT JOIN changes c1
    ON (c1.source_id = c.source_id AND c1.target_id = c.target_id)
GROUP BY c.source_id, c.target_id, c.job_id
ORDER BY c.id
I got the following result:
+-----+-----------+-----------+-------+------+
| id  | source_id | target_id | value | diff |
+-----+-----------+-----------+-------+------+
| 204 |       247 |       228 |  1201 |    0 |
| 349 |       247 |       228 |   100 | 1101 |
| 350 |       247 |       228 |   500 |  701 |
| 351 |       247 |       228 |  1000 |  201 |
| 352 |         1 |       100 |   100 |    0 |
| 353 |         1 |       100 |   500 | -400 |
| 354 |         1 |       100 |    50 |   50 |
+-----+-----------+-----------+-------+------+
You can see the diff works for ids 349 and 353; I want this for all rows, like the following expected result:
+-----+-----------+-----------+-------+------+
| id  | source_id | target_id | value | diff |
+-----+-----------+-----------+-------+------+
| 204 |       247 |       228 |  1201 | 1201 |
| 349 |       247 |       228 |   100 | 1101 |
| 350 |       247 |       228 |   500 | -400 |
| 351 |       247 |       228 |  1000 | -500 |
| 352 |         1 |       100 |   100 |  100 |
| 353 |         1 |       100 |   500 | -400 |
| 354 |         1 |       100 |    50 |  450 |
+-----+-----------+-----------+-------+------+
It would be no problem if the diff result is inverted.
What did I miss?
Thanks for any hints.
If you use user-defined variables you don't need to join the table to itself; just do a row-by-row comparison like so:
SELECT
    id,
    job_id,
    target_id,
    IF(@a = source_id, @b - value, value) AS diff, -- same group: previous value minus current value; new group: the value itself
    @b := value AS value,                          -- remember this row's value for the next row
    @a := source_id AS source_id                   -- remember this row's source_id for the next row
FROM changes
CROSS JOIN (SELECT @a := 0, @b := 0) t
DEMO
I suspect that you're looking for something like this - although the COALESCE bit seems misleading to me...
SELECT a.*, COALESCE(b.value - a.value, a.value) AS diff
FROM
  ( SELECT x.*, COUNT(*) AS rank
    FROM changes x JOIN changes y ON y.id <= x.id
    GROUP BY x.id ) a
LEFT JOIN
  ( SELECT x.*, COUNT(*) AS rank
    FROM changes x JOIN changes y ON y.id <= x.id
    GROUP BY x.id ) b
  ON b.source_id = a.source_id
 AND b.rank = a.rank - 1;
I think you want:
SELECT c.id,
       c.source_id,
       c.target_id,
       c.value,
       c.value - COALESCE(co.value, 0) AS delta
FROM changes c
LEFT JOIN (
    SELECT ci.id, MAX(cio.id) AS prev_id
    FROM changes ci
    JOIN changes cio
      ON cio.source_id = ci.source_id
     AND cio.target_id = ci.target_id
     AND cio.id < ci.id
    GROUP BY ci.id
) link ON link.id = c.id
LEFT JOIN changes co ON co.id = link.prev_id
ORDER BY c.id
I have changed the logic slightly.
In your expected results, the first diff has gone from unknown (0?) to 1201 and is reported as a positive diff, but the second has gone from 1201 to 100 and is still reported as positive.
I have changed the name to delta, and given you the number required to move from the previous value to the new value. Obviously you can change this if you want to:
COALESCE(co.value-c.value, c.value) diff
which will get you the results you provided (with the diff 500 changed to -500, which I believe was a typo).
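On MySQL 8.0 or later, a window function avoids both the self-joins and the user variables; here is a minimal sketch using the same changes table and the delta convention from the last answer:
-- MySQL 8.0+ only: LAG() returns the previous value within each (source_id, target_id) group, or NULL for the first row
SELECT id,
       source_id,
       target_id,
       value,
       value - COALESCE(LAG(value) OVER (PARTITION BY source_id, target_id ORDER BY id), 0) AS delta
FROM changes
ORDER BY id;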