I'm trying to calculate row differences (like MySQL difference between two rows of a SELECT Statement) over a grouped result set:
create table test (i int not null auto_increment, a int, b int, primary key (i));
insert into test (a,b) value (1,1),(1,2),(2,4),(2,8);
Gives
| a | b
---------
| 1 | 1
| 1 | 2
| 2 | 4
| 2 | 8
This is the simple SQL with group and max(group) result columns:
select
data.a,
max(data.b)
from
(
select a, b
from test
order by i
) as data
group by a
order by a
The obvious result is
| a | max(data.b)
-----------------
| 1 | 2
| 2 | 8
Where I'm failing is when I want to calculate the row-by-row differences on the grouped column:
set @c:=0;
select
data.a,
max(data.b),
@c:=max(data.b)-@c
from
(
select a, b
from test
order by i
) as data
group by a
order by a
Still gives:
| a | max(data.b) | @c:=max(data.b)-@c
--------------------------------------
| 1 | 2 | 2 (expected 2-0=2)
| 2 | 8 | 8 (expected 8-2=6)
Could anybody explain why the @c variable is not updating from grouped row to grouped row as expected?
SELECT data.a
, data.b
, @c := data.b - @c
FROM (
SELECT a
, max(b) AS b
FROM test
GROUP BY a
) AS data
ORDER BY a
The 'documented' solution might look like this...
SELECT x.*
, @c := b - @c c
FROM test x
JOIN
( SELECT a,MAX(b) max_b FROM test GROUP BY a ) y
ON y.a = x.a
AND y.max_b = x.b
JOIN (SELECT @c:= 0) vals;
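As a cross-check that does not depend on user-variable evaluation order, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. It assumes the goal is the difference between consecutive group maxima, with 0 as the starting value. The function name grouped_max_diffs and the CTE name g are made up for illustration.

```python
import sqlite3


def grouped_max_diffs(rows):
    """Return (a, max_b, diff) per group, ordered by a.

    diff is max_b minus the previous group's max_b (0 for the first group).
    SQLite is used here only as a runnable stand-in for MySQL.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table test (i integer primary key autoincrement, a int, b int)"
    )
    conn.executemany("insert into test (a, b) values (?, ?)", rows)
    query = """
        with g as (select a, max(b) as max_b from test group by a)
        select cur.a,
               cur.max_b,
               cur.max_b - coalesce(
                   -- max_b of the closest group with a smaller a, if any
                   (select p.max_b
                      from g as p
                     where p.a < cur.a
                     order by p.a desc
                     limit 1), 0) as diff
          from g as cur
         order by cur.a
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in grouped_max_diffs([(1, 1), (1, 2), (2, 4), (2, 8)]):
        print(row)
```

With the sample data from the question this yields 2 and 6 in the last column, matching the expected values.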
Related
I have a table with 2 columns: the first is called ID and the second is called TRACKING. The ID column has duplicates. I want to consolidate those duplicates into one row per ID, where each TRACKING value from the duplicate rows is placed into its own column of that row, so that no duplicate IDs remain.
I have tried a few suggested things where all of the values would be concatenated into one column but I want these TRACKING values for the duplicate IDs to be in separate columns. The code below did not do what I intended it to.
SELECT ID, TRACKING =
STUFF((SELECT DISTINCT ', ' + TRACKING
FROM #t b
WHERE b.ID = a.ID
FOR XML PATH('')), 1, 2, '')
FROM #t a
GROUP BY ID
I am looking to take this:
| ID | TRACKING |
-----------------
| 5 | 13t3in3i |
| 5 | g13g13gg |
| 3 | egqegqgq |
| 2 | 14y2y24y |
| 2 | 42yy44yy |
| 5 | 8i535i35 |
And turn it into this:
| ID | TRACKING | TRACKING1 | TRACKING2 |
-----------------------------------------
| 5 | 13t3in3i | g13g13gg | 8i535i35 |
| 3 | egqegqgq | | |
| 2 | 14y2y24y | 42yy44yy | |
One (relatively) painful way to do this in MySQL is to use correlated subqueries:
select i.id,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 0, 1
) as tracking_1,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 1, 1
) as tracking_2,
(select t.tracking
from t
where t.id = i.id
order by t.tracking
limit 2, 1
) as tracking_3
from (select distinct id from t
) i;
As bad as this looks, it will probably have surprisingly decent performance with an index on (id, tracking).
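The correlated-subquery pivot can be tried out with a minimal sketch like the following, which uses Python's built-in sqlite3 module in place of MySQL. The function name pivot_tracking is made up for illustration, and the sketch assumes at most three TRACKING values per ID. Because each subquery orders by tracking, values land in the columns in sorted order, not insertion order.

```python
import sqlite3


def pivot_tracking(rows):
    """Spread up to three tracking values per id into separate columns.

    Each scalar subquery picks the n-th tracking value (sorted) for the id;
    ids with fewer values get NULL (None in Python) in the remaining columns.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id int, tracking text)")
    conn.executemany("insert into t (id, tracking) values (?, ?)", rows)
    query = """
        select i.id,
               (select t.tracking from t where t.id = i.id
                 order by t.tracking limit 1 offset 0) as tracking_1,
               (select t.tracking from t where t.id = i.id
                 order by t.tracking limit 1 offset 1) as tracking_2,
               (select t.tracking from t where t.id = i.id
                 order by t.tracking limit 1 offset 2) as tracking_3
          from (select distinct id from t) as i
         order by i.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [
        (5, "13t3in3i"), (5, "g13g13gg"), (3, "egqegqgq"),
        (2, "14y2y24y"), (2, "42yy44yy"), (5, "8i535i35"),
    ]
    for row in pivot_tracking(sample):
        print(row)
```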
By the way, your original code with stuff() puts everything into one column. The MySQL equivalent of that is:
select id, group_concat(tracking)
from t
group by id;
with test_tbl as
(
select 5 id, 'goog' tracking,'goog' tracking1
union all
select 5 id, 'goog1','goo'
union all
select 2 , 'yahoo','yah'
union all
select 2, 'yahoo1','ya'
union all
select 3,'azure','azu'
), modified_tbl as
(
select id,array_agg(concat(tracking)) Tracking,array_agg(concat(tracking1)) Tracking1 from test_tbl group by 1
)
select id, tracking[safe_offset(0)] Tracking_1,tracking1[safe_offset(0)] Tracking_2, tracking[safe_offset(1)] Tracking_3,tracking1[safe_offset(1)] Tracking_4 from modified_tbl where array_length(Tracking) > 1
Let's assume I have two columns: letters and numbers in a table called tbl;
letters numbers
a 1
b 2
c 3
d 4
Doing a cartesian product will lead to :
a 1
a 2
a 3
a 4
b 1
b 2
b 3
b 4
c 1
c 2
c 3
c 4
d 1
d 2
d 3
d 4
Write a query that reverts the cartesian product of these two columns back to the original table.
I tried multiple methods from using ROWNUM to selecting distinct values and joining them (which leads me back to the cartesian product)
SELECT DISTINCT *
FROM (SELECT DISTINCT NUMBERS
FROM TBL
ORDER BY NUMBERS) AS NB
JOIN (SELECT DISTINCT LETTERS
FROM TBL
ORDER BY LETTERS) AS LT1
which led me back to the cartesian product....
This is a version that works with 5.7.
SELECT `numbers`,`letters` FROM
(SELECT `numbers`,
@curRank := @curRank + 1 AS rank
FROM Table1 t, (SELECT @curRank := 0) r
GROUP By `numbers`
ORDER BY `numbers`) NB1
INNER JOIN
(SELECT `letters`,
@curRank1 := @curRank1 + 1 AS rank
FROM (
Select `letters` FROM Table1 t
GROUP By `letters`) t2, (SELECT @curRank1 := 0) r
ORDER BY `letters`) LT1 ON NB1.rank = LT1.rank;
https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=cc17c2cfeff049edc73e437e5e4fd892
As Raymond and Ankit pointed out, you have to know the order of the letters, and the order of the numbers also has to be defined beforehand, or else you will never get a correct answer.
Another way of writing this:
SELECT numbers
, letters
FROM
( SELECT numbers
, @curRank := @curRank + 1 rank
FROM
-- deduplicate first; DISTINCT next to the counter would not collapse rows
( SELECT DISTINCT numbers
FROM Table1 t
) t1
, (SELECT @curRank := 0) r
ORDER
BY numbers
) NB1
JOIN
( SELECT letters
, @curRank1 := @curRank1 + 1 rank
FROM
( SELECT DISTINCT letters
FROM Table1 t
) t2
, (SELECT @curRank1 := 0) r
ORDER
BY letters
) LT1
ON NB1.rank = LT1.rank;
If you are sure that the order is deterministic and will never change, you can use the dense_rank() analytic function to get the original table back:
SELECT LT1.LETTERS, NB.NUMBERS
FROM (SELECT DISTINCT NUMBERS
FROM TBL
ORDER BY NUMBERS) AS NB
JOIN (SELECT DISTINCT LETTERS, RN
FROM (SELECT LETTERS, DENSE_RANK() OVER (ORDER BY LETTERS) RN
FROM TBL
ORDER BY LETTERS) T) AS LT1
ON NB.NUMBERS = LT1.RN
Perhaps this is oversimplifying the problem, but it should be seen that this, or some variation of it, would suffice...
SELECT * FROM my_table;
+---------+---------+
| letters | numbers |
+---------+---------+
| a | 1 |
| a | 2 |
| a | 3 |
| a | 4 |
| b | 1 |
| b | 2 |
| b | 3 |
| b | 4 |
| c | 1 |
| c | 2 |
| c | 3 |
| c | 4 |
| d | 1 |
| d | 2 |
| d | 3 |
| d | 4 |
+---------+---------+
16 rows in set (0.00 sec)
SELECT x.*
, @i:=@i+1 numbers
FROM
( SELECT DISTINCT letters
FROM my_table
) x
, (SELECT @i:=0) vars
ORDER
BY letters;
+---------+---------+
| letters | numbers |
+---------+---------+
| a | 1 |
| b | 2 |
| c | 3 |
| d | 4 |
+---------+---------+
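The pairing-by-rank idea used in these answers can be sketched without user variables or window functions. The following is a minimal illustration with Python's built-in sqlite3 module standing in for MySQL. It assumes the original rows paired the n-th smallest letter with the n-th smallest number, which the cartesian product alone cannot confirm. The function name revert_cartesian is made up for illustration.

```python
import sqlite3


def revert_cartesian(pairs):
    """Pair the n-th distinct letter with the n-th distinct number.

    The rank of a value is computed as the count of distinct values
    less than or equal to it, so no row-numbering feature is needed.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("create table tbl (letters text, numbers int)")
    conn.executemany("insert into tbl (letters, numbers) values (?, ?)", pairs)
    query = """
        select l.letters, n.numbers
          from (select distinct letters from tbl) as l
          join (select distinct numbers from tbl) as n
            on (select count(distinct t.letters) from tbl as t
                 where t.letters <= l.letters)
             = (select count(distinct t.numbers) from tbl as t
                 where t.numbers <= n.numbers)
         order by l.letters
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    product = [(letter, number) for letter in "abcd" for number in range(1, 5)]
    for row in revert_cartesian(product):
        print(row)
```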
I want to group rows according to the same column value.
Suppose this is a table
id name topic
1 A t
2 B a
3 c t
4 d b
5 e b
6 f a
I want result something like this.
id name topic
1 A t
3 c t
2 B a
6 f a
4 d b
5 e b
As you can see, the rows are ordered neither by topic nor by id. Topics are sorted by first appearance: t appears first, so the t rows come first; a appears second, so the a rows follow; then b.
ORDER BY topic sorts a, b, t (or t, b, a with DESC), but the required order is t, a, b.
Any suggestions?
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,topic CHAR(1) NOT NULL
);
INSERT INTO my_table VALUES
(1,'t'),
(2,'a'),
(3,'t'),
(4,'b'),
(5,'b'),
(6,'a');
SELECT x.*
FROM my_table x
JOIN
( SELECT topic, MIN(id) id FROM my_table GROUP BY topic ) y
ON y.topic = x.topic
ORDER
BY y.id,x.id;
+----+-------+
| id | topic |
+----+-------+
| 1 | t |
| 3 | t |
| 2 | a |
| 6 | a |
| 4 | b |
| 5 | b |
+----+-------+
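The MIN(id) join above can be reproduced with a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The function name order_by_first_appearance and the alias first_id are made up for illustration.

```python
import sqlite3


def order_by_first_appearance(rows):
    """Order rows by the smallest id of their topic, then by their own id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table my_table (id integer primary key, topic text)")
    conn.executemany("insert into my_table (id, topic) values (?, ?)", rows)
    query = """
        select x.id, x.topic
          from my_table as x
          join (select topic, min(id) as first_id
                  from my_table
                 group by topic) as y
            on y.topic = x.topic
         order by y.first_id, x.id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [(1, "t"), (2, "a"), (3, "t"), (4, "b"), (5, "b"), (6, "a")]
    for row in order_by_first_appearance(sample):
        print(row)
```

Unlike a hard-coded CASE list, this keeps working when new topics appear, because the ordering key is derived from the data.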
You can use a CASE expression in ORDER BY.
Query
select * from `your_table_name`
order by
case `topic`
when 't' then 1
when 'a' then 2
when 'b' then 3
else 4 end
, `name`;
I have a table with a column A that is INT(11) (it's a timestamp, but for now I just use small numbers)
id | A | diff |
---+----+------+
1 | 12 | |
2 | 7 | |
3 | 23 | |
4 | 9 | |
5 | 2 | |
6 | 30 | |
I'd like to update diff with the difference between A and its nearest smaller neighbour. So if A=12 its nearest smaller neighbour is A=9; if A=30 it is A=23. I should end up with a table like this (sorted on A):
id | A | diff |
---+----+------+
5 | 2 | - |
2 | 7 | 5 | (7-2)
4 | 9 | 2 | (9-7)
1 | 12 | 3 | (12-9)
3 | 23 | 11 | (23-12)
6 | 30 | 7 | (30-23)
I can calculate the difference at the moment of insertion, as I know A then (here: A=15):
INSERT INTO `table` (`A`,`diff`)
(SELECT 15 , 15-`A` FROM `table` WHERE `A` < 15 ORDER BY `A` DESC LIMIT 1)
This results in a new record:
id | A | diff |
---+----+------+
7 | 15 | 3 | (3 being the difference between A=12 and A=15)
(NOTE: This fails miserably when A=1, being the new smallest value and having no smaller neighbour, so no value of diff)
But now the value of diff in record 3 is wrong, because it is still based on 23 - 12, whereas it should now be 23 - 15.
So I just want to insert the A value and then run an update on the table, refreshing diff where necessary. But that's where my knowledge of MySQL ends...
I crafted this query, but it fails saying `You can't specify table 't1' for update in FROM clause`:
UPDATE `table` AS t1
SET
t1.`diff` = t1.`A` - (SELECT `A` FROM `table`
WHERE `A` < t1.`A`
ORDER BY `A` DESC LIMIT 1
)
Here's a query:
SELECT x.*
, x.a-MAX(y.a) diff
FROM my_table x
LEFT
JOIN my_table y
ON y.a < x.a
GROUP
BY x.id
ORDER
BY a;
I'm not sure why you would want to store derived data, but you can I guess...
UPDATE my_table m
JOIN
( SELECT x.*
, x.a-MAX(y.a) q
FROM my_table x
JOIN my_table y
ON y.a < x.a
GROUP
BY x.id
) n
ON n.id = m.id
SET m.diff = q;
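To see how the self-join recomputes every diff after a new value arrives, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The function name nearest_smaller_diffs is made up for illustration, and ids are assumed to be assigned in insertion order.

```python
import sqlite3


def nearest_smaller_diffs(values):
    """Return (id, a, diff) sorted by a.

    diff is a minus the largest smaller a, or None for the smallest value.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table my_table (id integer primary key autoincrement, a int)"
    )
    conn.executemany("insert into my_table (a) values (?)", [(v,) for v in values])
    query = """
        select x.id, x.a, x.a - max(y.a) as diff
          from my_table as x
          left join my_table as y
            on y.a < x.a          -- every smaller value is a candidate
         group by x.id, x.a       -- max(y.a) keeps the nearest one
         order by x.a
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(nearest_smaller_diffs([12, 7, 23, 9, 2, 30]))
    # Adding 15 changes the diff of 23 from 11 to 8.
    print(nearest_smaller_diffs([12, 7, 23, 9, 2, 30, 15]))
```

Since the diffs are derived from the whole column each time, no separate fix-up step is needed for the row whose neighbour changed.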
You may try this after inserting new value :
UPDATE x
SET
x.diff = iq2.new_diff
FROM
#t x
INNER JOIN
(SELECT id, A, diff, new_diff
FROM
(select id,A,15 as new_number,
CASE WHEN (A-15) < 0 THEN NULL ELSE (A-15) END as new_diff,diff
from #t
) iq
WHERE
iq.new_diff <= iq.diff
AND iq.new_diff <> 0
)iq2
on x.A = iq2.A
The inner query compares the previous difference with the new one and then updates only the rows where the new difference is smaller.
I'm using the following mysql query to create a pagination array -- for a list of documents -- in the form "Ab-Cf | Cg-Le | Li-Ru " etc...
The subquery 'Subquery' selects the entire list of documents, and is variable depending on the user privileges, requirements etc -- so I'm trying to avoid altering that part of the query (and have used a simplified version here).
I'm then selecting the first and last row of each page range -- i.e, the 1st and 10th row, the 11th and 20th row etc., determined by $num_rows_per_page.
SELECT * FROM
(
SELECT @row := @row + 1 AS `rownum`, `sort_field` FROM
(
SELECT @row := 0 ) r, (
SELECT D.`id`, D.`display_name` as display_field,
D.`sort_name` as sort_field
FROM Document D ORDER BY `sort_field` ASC
) Subquery
) Sorted
WHERE rownum % $num_rows_per_page = 1 OR rownum % $num_rows_per_page = 0
This is working just fine, and gives me a result set like:
+---------+-----------------------------------+
| rownum | index_field |
+---------+-----------------------------------+
| 1 | Alternaria humicola |
| 10 | Anoplophora chinensis |
| 11 | Atherigona soccata |
| 20 | Carlavirus RCVMV |
| 21 | Cephus pygmaeus |
| 30 | Colletotrichum truncatum |
| 31 | Fusarium oxysporium f. sp. ciceri |
| 40 | Homalodisca vitripennis |
| 41 | Hordevirus BSMV |
| 50 | Mayetiola hordei |
| 51 | Meromyza saltatrix |
| 60 | Phyllophaga |
| 61 | Pyrenophora teres |
+---------+-----------------------------------+
However -- I can't for the life of me work out how to include the last row of the subquery in the result set. I.e., the row with rownum 67 (or whatever) that does not meet the criteria of the WHERE clause.
I was hoping to somehow pull the maximum value of rownum and add it to the WHERE clause, but I'm having no joy.
Any ideas?
Happy to try to rephrase if this isn't clear!
Edit -- here's a more appropriate version of the subquery:
SELECT * FROM
(
SELECT @row := @row + 1 AS `rownum`, `sort_field` FROM
(
SELECT @row := 0 ) r,
(
SELECT D.`id`, D.`display_name` as display_field,
D.`sort_name` as sort_field
FROM Document D INNER JOIN
(
SELECT DS.* FROM Document_Status DS INNER JOIN
(
SELECT `document_id`, max(`datetime`) as `MaxDateTime`
FROM Document_Status GROUP BY `document_id`
)
GS ON DS.`document_id` = GS.`document_id`
AND DS.`datetime` = GS.`MaxDateTime`
AND DS.`status` = 'approved' INNER JOIN
(
SELECT `id` FROM Document WHERE `template_id`= 2 ) GD
ON DS.`document_id` = GD.`id`
)
AG ON D.id = AG.document_id ORDER BY `sort_field` ASC
) Subquery
) Sorted
WHERE rownum % $num_rows_per_page = 1 OR rownum % $num_rows_per_page = 0
But, a key point to remember is that the subquery will change depending on the context.
Please try adding
OR rownum=@row
to your WHERE clause (this works in my test case).
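The selection logic of that WHERE clause, including the extra last-row condition, can be sketched in plain Python. This assumes the counter holds the total row count by the time the outer filter runs, which is what the suggestion relies on. The function name page_boundaries is made up for illustration.

```python
def page_boundaries(sorted_names, rows_per_page):
    """Return (rownum, name) for the first and last row of every page.

    The final row is always included, even when the last page is short,
    mirroring the extra "rownum equals the final counter value" condition.
    """
    total = len(sorted_names)
    picked = []
    for rownum, name in enumerate(sorted_names, start=1):
        first_of_page = rownum % rows_per_page == 1
        last_of_page = rownum % rows_per_page == 0
        if first_of_page or last_of_page or rownum == total:
            picked.append((rownum, name))
    return picked


if __name__ == "__main__":
    names = ["doc%02d" % n for n in range(1, 26)]
    for rownum, name in page_boundaries(names, 10):
        print(rownum, name)
```

Because the three conditions are combined with OR, a final row that already closes a full page is not listed twice.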