Update of MySQL table column to sequential digit based on another column - mysql

The current table looks something like this:
[id | section | order | thing]
[1 | fruits | 0 | apple]
[2 | fruits | 0 | banana]
[3 | fruits | 0 | avocado]
[4 | veggies | 0 | tomato]
[5 | veggies | 0 | potato]
[6 | veggies | 0 | spinach]
I'm wondering how to make the table look more like this:
[id | section | order | thing]
[1 | fruits | 1 | apple]
[2 | fruits | 2 | banana]
[3 | fruits | 3 | avocado]
[4 | veggies | 1 | tomato]
[5 | veggies | 2 | potato]
[6 | veggies | 3 | spinach]
The "order" column should be updated to a sequential number that starts at 1 within each "section" and follows the "id" order.

You can do this with an update that uses a join. The derived table in the join calculates the ordering, which is then used for the update:
update t join
       (select s.id,
               @rn := if(@prev = s.section, @rn + 1,
                         if((@prev := s.section) is not null, 1, 1)) as rn
        from (select id, section from t order by section, id) s cross join
             (select @rn := 0, @prev := '') const
       ) tsum
       on t.id = tsum.id
    set t.`order` = tsum.rn
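On MySQL 8.0 or later, window functions avoid user variables altogether. The following is a minimal sketch, assuming the table is named `t` as in the answer above and the column is literally called `order` (hence the backticks):

```sql
-- Assumes MySQL 8.0+; `t` and its columns follow the question's layout.
UPDATE t
JOIN (
    -- Number the rows within each section, in id order.
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY section ORDER BY id) AS rn
    FROM t
) AS numbered ON numbered.id = t.id
SET t.`order` = numbered.rn;
```

Because the numbering comes from `ROW_NUMBER()`, the result does not depend on the order in which MySQL evaluates variable assignments.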

You don't want to do this as an UPDATE, as that will be really slow.
Instead, do this on INSERT. Here's a simple single-statement INSERT that grabs the next order number and inserts a record called 'kiwi' in the section 'fruits'. The COALESCE covers the case where the section has no rows yet and MAX returns NULL.
INSERT INTO `table_name` (`section`, `order`, `thing`)
SELECT 'fruits', COALESCE(MAX(`order`), 0) + 1, 'kiwi'
FROM `table_name`
WHERE `section` = 'fruits'
EDIT: You could also do this using an insert trigger, e.g.:
DELIMITER $$
CREATE TRIGGER `trigger_name`
BEFORE INSERT ON `table_name`
FOR EACH ROW
BEGIN
    SET NEW.`order` = (SELECT COALESCE(MAX(`order`), 0) + 1 FROM `table_name` WHERE `section` = NEW.`section`);
END$$
DELIMITER ;
Then you could just insert your records as usual, and they will auto-update the order value.
INSERT INTO `table_name` (`section`, `thing`)
VALUES ('fruits', 'kiwi')

Rather than storing the ordering, you could derive it:
SELECT t.id
     , t.section
     , @row_num := IF(@prev_section = t.section, @row_num + 1, 1) AS ordering
     , t.thing
     , @prev_section := t.section
FROM myTable t
   , (SELECT @row_num := 1) x
   , (SELECT @prev_section := '') y
ORDER BY t.section, t.id
Note that ORDER is a reserved word in MySQL and is therefore not the greatest choice for a column name. You could quote the column name with backticks or give it a different name.
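Both options from the note above can be sketched briefly. The table name `myTable` is taken from the query above; `sort_order` is a hypothetical replacement name, and `RENAME COLUMN` assumes MySQL 8.0 or later:

```sql
-- Option 1: keep the name and quote it with backticks wherever it is used.
SELECT id, section, `order`, thing
FROM myTable
ORDER BY section, `order`;

-- Option 2: rename the column so no quoting is needed.
-- sort_order is a hypothetical name; RENAME COLUMN needs MySQL 8.0+.
ALTER TABLE myTable RENAME COLUMN `order` TO sort_order;
```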

Related

MySQL: Sequentially number a column based on change in a different column

If I have a table with the following columns and values, ordered by parent_id:
id parent_id line_no
-- --------- -------
1 2
2 2
3 2
4 3
5 4
6 4
And I want to populate line_no with a sequential number that starts over at 1 every time the value of parent_id changes:
id parent_id line_no
-- --------- -------
1 2 1
2 2 2
3 2 3
4 3 1
5 4 1
6 4 2
What would the query or sproc look like?
NOTE: I should point out that I only need to do this once. There's a new function in my PHP code that automatically creates the line_no every time a new record is added. I just need to update the records that already exist.
MySQL versions before 8.0 do not support row_number(), so there you can do this using variables. But you have to be very careful. MySQL does not guarantee the order of evaluation of expressions in the select, so a variable should not be assigned and referenced in different expressions.
So:
select t.*,
       (@rn := if(@p = parent_id, @rn + 1,
                  if(@p := parent_id, 1, 1)
                 )
       ) as line_no
from (select t.* from t order by parent_id, id) t cross join
     (select @p := 0, @rn := 0) params;
The subquery that sorts the table was not needed in older versions. Somewhere around version 5.7, it became necessary when using variables.
EDIT:
Updating with variables is fun. In this case, I would just use subqueries with the above:
update t join
       (select t.*,
               (@rn := if(@p = parent_id, @rn + 1,
                          if(@p := parent_id, 1, 1)
                         )
               ) as new_line_no
        from (select t.* from t order by parent_id, id) t cross join
             (select @p := 0, @rn := 0) params
       ) tt
       on t.id = tt.id
    set t.line_no = tt.new_line_no;
Or, a little more old school...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,parent_id INT NOT NULL
);
INSERT INTO my_table VALUES
(1, 2),
(2 , 2),
(3 , 2),
(4 , 3),
(5 , 4),
(6 , 4);
SELECT x.*
, CASE WHEN @prev = parent_id THEN @i := @i+1 ELSE @i := 1 END i
, @prev := parent_id prev
FROM my_table x
, (SELECT @prev:=null,@i:=0) vars
ORDER
BY parent_id,id;
+----+-----------+------+------+
| id | parent_id | i | prev |
+----+-----------+------+------+
| 1 | 2 | 1 | 2 |
| 2 | 2 | 2 | 2 |
| 3 | 2 | 3 | 2 |
| 4 | 3 | 1 | 3 |
| 5 | 4 | 1 | 4 |
| 6 | 4 | 2 | 4 |
+----+-----------+------+------+
If row_number() is not available, you can use a correlated subquery instead:
select t.*,
       (select count(*)
        from t t1
        where t1.parent_id = t.parent_id and t1.id <= t.id
       ) as line_no
from t;
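The counting idea can also drive the one-off update the question asks for, without window functions or user variables. MySQL rejects a subquery in SET that reads the table being updated, so this sketch joins to a derived table instead. It assumes the table is named `t`, as in the first answer:

```sql
-- Assumes a table `t` with columns id, parent_id and line_no.
UPDATE t
JOIN (
    -- For each row, count the rows that share its parent_id and have an id
    -- less than or equal to its own; that count is its line number.
    SELECT a.id, COUNT(*) AS new_line_no
    FROM t AS a
    JOIN t AS b
      ON b.parent_id = a.parent_id
     AND b.id <= a.id
    GROUP BY a.id
) AS counted ON counted.id = t.id
SET t.line_no = counted.new_line_no;
```

The self-join compares every row with every other row of the same parent, so it may be slow for very large groups, which is probably acceptable for a one-time fix.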

SQL fill new column dependant on uniqueness of other columns

I have a table in the database that has among others 2 columns named item and slot. I want to create another column (call it cid) that is filled with numbers so that:
two rows that have the same item and same slot always have the same cid
rows with different item may have the same cid
the amount of distinct cid values is minimal
rows with the same item but different slot need to have different cid
If possible I'd like to just run an sql query that does that.
edit by request:
| item | slot | what cid should be
| a | x | 1
| a | y | 2
| a | y | 2
| a | z | 3
| b | x | 1
| b | y | 2
| b | q | 3
| c | x | 1
You can do what you want just by enumerating the slots for each item:
select t.*,
       (@cid := if(@i = item,
                   if(@s = slot, @cid, if((@s := slot) is not null, @cid + 1, @cid + 1)),
                   if((@i := item) is not null and (@s := slot) is not null, 1, 1)
                  )
       ) as cid
from (select * from t order by item, slot) t cross join
     (select @i := '', @cid := 0, @s := '') param;
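On MySQL 8.0 or later, enumerating the slots per item is what `DENSE_RANK()` does. This is a minimal sketch, assuming a table named `t` with the `item` and `slot` columns from the question:

```sql
-- Assumes MySQL 8.0+ and a table `t` with columns item and slot.
SELECT item,
       slot,
       -- Equal slots within an item tie and share a rank;
       -- each new slot gets the next consecutive number.
       DENSE_RANK() OVER (PARTITION BY item ORDER BY slot) AS cid
FROM t;
```

The numbers follow the sort order of `slot`, so item b would get q = 1, x = 2, y = 3 rather than the exact values in the question's example. The stated rules are still satisfied.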

Can grouped expressions be used with variable assignments?

I'm trying to calculate row differences (like MySQL difference between two rows of a SELECT Statement) over a grouped result set:
create table test (i int not null auto_increment, a int, b int, primary key (i));
insert into test (a,b) value (1,1),(1,2),(2,4),(2,8);
Gives
| a | b
---------
| 1 | 1
| 1 | 2
| 2 | 4
| 2 | 8
This is the simple SQL with group and max(group) result columns:
select
data.a,
max(data.b)
from
(
select a, b
from test
order by i
) as data
group by a
order by a
The obvious result is
| a | max(data.b)
-----------------
| 1 | 2
| 2 | 8
Where I'm failing is when I want to calculate the row-by-row differences on the grouped column:
set @c:=0;
select
data.a,
max(data.b),
@c:=max(data.b)-@c
from
(
select a, b
from test
order by i
) as data
group by a
order by a
Still gives:
| a | max(data.b) | @c:=max(data.b)-@c
--------------------------------------
| 1 | 2 | 2 (expected 2-0=2)
| 2 | 8 | 8 (expected 8-2=6)
Could anybody highlight why the @c variable is not updating from grouped row to grouped row as expected?
SELECT data.a
, data.b
, @c := data.b - @c
FROM (
SELECT a
, max(b) AS b
FROM test
GROUP BY a
) AS data
ORDER BY a
The 'documented' solution might look like this...
SELECT x.*
, @c := b - @c c
FROM test x
JOIN
( SELECT a,MAX(b) max_b FROM test GROUP BY a ) y
ON y.a = x.a
AND y.max_b = x.b
JOIN (SELECT @c:= 0) vals;
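If MySQL 8.0 or later is available, the difference between consecutive groups can be computed with `LAG()` instead of a user variable. This sketch uses the `test` table from the question; the aliases `max_b` and `diff` are illustrative:

```sql
-- Assumes MySQL 8.0+ and the `test` table from the question.
SELECT a,
       MAX(b) AS max_b,
       -- LAG reads the previous group's MAX(b); 0 is the default for the first group.
       MAX(b) - LAG(MAX(b), 1, 0) OVER (ORDER BY a) AS diff
FROM test
GROUP BY a
ORDER BY a;
```

Window functions are evaluated after GROUP BY, so `LAG` sees one row per group. For the sample data the diff column should come out as 2 and 6.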

Sql to find timediff between two rows based on ID

The subject of the question is not very explanatory, sorry for that.
So the question is as follows:
I have a database structure as below, where pk is the primary key and id is a value shared by many rows.
+------+------+---------------------+
| pk | id | value |
+------+------+---------------------+
| 99 | 1 | 2013-08-06 11:10:00 |
| 100 | 1 | 2013-08-06 11:15:00 |
| 101 | 1 | 2013-08-06 11:20:00 |
| 102 | 1 | 2013-08-06 11:25:00 |
| 103 | 2 | 2013-08-06 15:10:00 |
| 104 | 2 | 2013-08-06 15:15:00 |
| 105 | 2 | 2013-08-06 15:20:00 |
+------+------+---------------------+
What I really need to get is the value difference between the first two rows (ordered by value) of each group (where rows are grouped by id). So according to the above structure I need
timediff(value100, value99) [which is for the id 1 group]
and timediff(value104, value103) [which is for the id 2 group]
i.e. the time difference between the first two rows of each group, ordered by value.
One way I can think of is 3 self joins (or 3 subqueries): two of them to find the first two rows, and a third to subtract them. Any suggestions?
Try this. CTEs are pretty powerful! (In MySQL this needs version 8.0 or later, which added CTEs and window functions.)
WITH CTE AS (
    SELECT
        value, pk, id,
        ROW_NUMBER() OVER (PARTITION BY id ORDER BY value) AS rnk
    FROM test
)
SELECT
    curr.id, prev.pk AS prev_pk, curr.pk AS curr_pk, prev.value AS prev_value, curr.value AS curr_value,
    TIMEDIFF(curr.value, prev.value) AS diff
FROM CTE curr
INNER JOIN CTE prev ON curr.id = prev.id AND curr.rnk = prev.rnk + 1
WHERE curr.rnk = 2
Looks a bit weird, but you can try it this way:
SET @previous = 0;
SET @temp = 0;
SET @previousID = 0;
The above step may not be needed, but it makes sure nothing goes wrong.
SELECT pk, id, diff, `value` FROM (
    SELECT IF(@previousID = id, @temp := @temp + 1, @temp := 1) occ, @previousID := id,
           TIMEDIFF(`value`, @previous) diff, pk, id, `value`, @previous := `value`
    FROM testtable
    ORDER BY id, `value`) a WHERE occ = 2
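On MySQL 8.0 or later, the same result can be sketched with `LEAD()`, which pairs each row with the next timestamp in its group. The table name `testtable` is taken from the answer above; the aliases are illustrative:

```sql
-- Assumes MySQL 8.0+ and the `testtable` name used in the answer above.
SELECT id,
       TIMEDIFF(next_value, `value`) AS diff
FROM (
    SELECT id,
           `value`,
           -- Next timestamp within the same id, in time order.
           LEAD(`value`) OVER (PARTITION BY id ORDER BY `value`) AS next_value,
           -- Position of the row within its id group.
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY `value`) AS rn
    FROM testtable
) AS ordered
WHERE rn = 1;
```

Keeping only `rn = 1` leaves the first row of each group, paired with the second row's value. A group with a single row has no next value, so its diff is NULL.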

Returning the last row from a mysql subquery when paginating a result set

I'm using the following mysql query to create a pagination array -- for a list of documents -- in the form "Ab-Cf | Cg-Le | Li-Ru " etc...
The subquery 'Subquery' selects the entire list of documents, and is variable depending on the user privileges, requirements etc -- so I'm trying to avoid altering that part of the query (and have used a simplified version here).
I'm then selecting the first and last row of each page range -- i.e, the 1st and 10th row, the 11th and 20th row etc., determined by $num_rows_per_page.
SELECT * FROM
(
SELECT @row := @row + 1 AS `rownum`, `sort_field` FROM
(
SELECT @row := 0 ) r, (
SELECT D.`id`, D.`display_name` as display_field,
D.`sort_name` as sort_field
FROM Document D ORDER BY `sort_field` ASC
) Subquery
) Sorted
WHERE rownum % $num_rows_per_page = 1 OR rownum % $num_rows_per_page = 0
This is working just fine, and gives me a result set like:
+---------+-----------------------------------+
| rownum | index_field |
+---------+-----------------------------------+
| 1 | Alternaria humicola |
| 10 | Anoplophora chinensis |
| 11 | Atherigona soccata |
| 20 | Carlavirus RCVMV |
| 21 | Cephus pygmaeus |
| 30 | Colletotrichum truncatum |
| 31 | Fusarium oxysporium f. sp. ciceri |
| 40 | Homalodisca vitripennis |
| 41 | Hordevirus BSMV |
| 50 | Mayetiola hordei |
| 51 | Meromyza saltatrix |
| 60 | Phyllophaga |
| 61 | Pyrenophora teres |
+---------+-----------------------------------+
However -- I can't for the life of me work out how to include the last row of the subquery in the result set. I.e., the row with rownum 67 (or whatever) that does not meet the criteria of the WHERE clause.
I was hoping to somehow pull the maximum value of rownum and add it to the WHERE clause, but I'm having no joy.
Any ideas?
Happy to try to rephrase if this isn't clear!
Edit -- here's a more appropriate version of the subquery:
SELECT * FROM
(
SELECT @row := @row + 1 AS `rownum`, `sort_field` FROM
(
SELECT @row := 0 ) r,
(
SELECT D.`id`, D.`display_name` as display_field,
D.`sort_name` as sort_field
FROM Document D INNER JOIN
(
SELECT DS.* FROM Document_Status DS INNER JOIN
(
SELECT `document_id`, max(`datetime`) as `MaxDateTime`
FROM Document_Status GROUP BY `document_id`
)
GS ON DS.`document_id` = GS.`document_id`
AND DS.`datetime` = GS.`MaxDateTime`
AND DS.`status` = 'approved' INNER JOIN
(
SELECT `id` FROM Document WHERE `template_id`= 2 ) GD
ON DS.`document_id` = GD.`id`
)
AG ON D.id = AG.document_id ORDER BY `sort_field` ASC
) Subquery
) Sorted
WHERE rownum % $num_rows_per_page = 1 OR rownum % $num_rows_per_page = 0
But, a key point to remember is that the subquery will change depending on the context.
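Please try adding
OR rownum = @row
to your WHERE clause (this works in my test case). Once the inner query has numbered every row, @row should hold the last row number, so the condition matches the final row.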
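On MySQL 8.0 or later, the row count can be carried alongside the row number, so the last row is matched without relying on a user variable. This is a sketch built on the simplified subquery from the question; the literal 10 stands in for $num_rows_per_page, and `total_rows` is an illustrative alias:

```sql
-- Assumes MySQL 8.0+; 10 stands in for $num_rows_per_page.
SELECT rownum, sort_field
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY sort_field) AS rownum,
           -- Total number of rows produced by the subquery.
           COUNT(*) OVER () AS total_rows,
           sort_field
    FROM (
        SELECT D.`id`, D.`display_name` AS display_field,
               D.`sort_name` AS sort_field
        FROM Document D
    ) Subquery
) Sorted
WHERE rownum % 10 = 1
   OR rownum % 10 = 0
   OR rownum = total_rows
ORDER BY rownum;
```

The inner `Subquery` can be swapped for any context-dependent version, since the numbering and the count are both computed outside it.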