SQL fill new column dependant on uniqueness of other columns - mysql

I have a table in the database that has among others 2 columns named item and slot. I want to create another column (call it cid) that is filled with numbers so that:
two rows that have the same item and same slot always have the same cid
rows with different item may have the same cid
the amount of distinct cid values is minimal
rows with the same item but different slot need to have different cid
If possible I'd like to just run an sql query that does that.
edit by request:
| item | slot | what cid should be
| a | x | 1
| a | y | 2
| a | y | 2
| a | z | 3
| b | x | 1
| b | y | 2
| b | q | 3
| c | x | 1

You can do what you want just by enumerating the slots for each item:
select t.*,
       (@cid := if(@i = item,
                   if(@s = slot, @cid,
                      if(@s := slot, @cid + 1, @cid + 1)),
                   if(@i := item,
                      if(@s := slot, 1, 1),
                      if(@s := slot, 1, 1))
                  )
       ) as cid
from t cross join
     (select @i := null, @cid := 0, @s := null) param
order by item, slot;
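To check the result of such a query, the intended numbering can be reproduced outside the database. The Python below is only a sketch of the rule stated in the question, not part of the original answer; `assign_cid` and the `rows` list are hypothetical names, and slots are numbered in order of first appearance within each item, which happens to match the sample table.

```python
def assign_cid(rows):
    """Number the distinct slots of each item, restarting at 1 per item."""
    seen = {}      # item -> {slot: cid}
    result = []
    for item, slot in rows:
        slots = seen.setdefault(item, {})
        if slot not in slots:
            # First time this slot is seen for this item: take the next number.
            slots[slot] = len(slots) + 1
        result.append((item, slot, slots[slot]))
    return result


# Sample rows from the question, in the same order.
rows = [("a", "x"), ("a", "y"), ("a", "y"), ("a", "z"),
        ("b", "x"), ("b", "y"), ("b", "q"), ("c", "x")]
cids = [cid for _, _, cid in assign_cid(rows)]
print(cids)
```

Rows with the same item and slot share a cid, and the largest cid equals the largest number of distinct slots any single item has, so no smaller set of values would work.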

Related

MySQL: Sequentially number a column based on change in a different column

If I have a table with the following columns and values, ordered by parent_id:
id parent_id line_no
-- --------- -------
1 2
2 2
3 2
4 3
5 4
6 4
And I want to populate line_no with a sequential number that starts over at 1 every time the value of parent_id changes:
id parent_id line_no
-- --------- -------
1 2 1
2 2 2
3 2 3
4 3 1
5 4 1
6 4 2
What would the query or sproc look like?
NOTE: I should point out that I only need to do this once. There's a new function in my PHP code that automatically creates the line_no every time a new record is added. I just need to update the records that already exist.
Versions of MySQL before 8.0 do not support row_number(), so you can do this using variables. But you have to be very careful: MySQL does not guarantee the order of evaluation of expressions in the select, so a variable should not be assigned in one expression and referenced in another.
So:
select t.*,
       (@rn := if(@p = parent_id, @rn + 1,
                  if(@p := parent_id, 1, 1)
                 )
       ) as line_no
from (select t.* from t order by id) t cross join
     (select @p := 0, @rn := 0) params;
In older versions, the subquery that sorts the table may not be necessary. Somewhere around version 5.7, it became necessary when using variables.
EDIT:
Updating with variables is fun. In this case, I would just use subqueries with the above:
update t join
       (select t.*,
               (@rn := if(@p = parent_id, @rn + 1,
                          if(@p := parent_id, 1, 1)
                         )
               ) as new_line_no
        from (select t.* from t order by id) t cross join
             (select @p := 0, @rn := 0) params
       ) tt
       on t.id = tt.id
    set t.line_no = tt.new_line_no;
Or, a little more old school...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id SERIAL PRIMARY KEY
,parent_id INT NOT NULL
);
INSERT INTO my_table VALUES
(1, 2),
(2 , 2),
(3 , 2),
(4 , 3),
(5 , 4),
(6 , 4);
SELECT x.*
     , CASE WHEN @prev = parent_id THEN @i := @i+1 ELSE @i := 1 END i
     , @prev := parent_id prev
  FROM my_table x
     , (SELECT @prev:=null,@i:=0) vars
 ORDER
    BY parent_id,id;
+----+-----------+------+------+
| id | parent_id | i | prev |
+----+-----------+------+------+
| 1 | 2 | 1 | 2 |
| 2 | 2 | 2 | 2 |
| 3 | 2 | 3 | 2 |
| 4 | 3 | 1 | 3 |
| 5 | 4 | 1 | 4 |
| 6 | 4 | 2 | 4 |
+----+-----------+------+------+
If row_number() is not available, you can use a correlated subquery instead:
select t.*,
(select count(*)
from table t1
where t1.parent_id = t.parent_id and t1.id <= t.id
) as line_no
from table t;
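The counting rule behind these queries (the line number is the count of rows with the same parent_id and an id less than or equal to the current one) can be sketched in Python to verify the expected values. This is an illustration only; `number_lines` and `rows` are hypothetical names, and the data is the sample from the question.

```python
def number_lines(rows):
    """Map each id to a counter that restarts at 1 for every parent_id.

    rows is a list of (id, parent_id) tuples; ids are processed in
    ascending order, mirroring ORDER BY id.
    """
    counters = {}   # parent_id -> rows seen so far for that parent
    line_no = {}
    for row_id, parent_id in sorted(rows):
        counters[parent_id] = counters.get(parent_id, 0) + 1
        line_no[row_id] = counters[parent_id]
    return line_no


rows = [(1, 2), (2, 2), (3, 2), (4, 3), (5, 4), (6, 4)]
line_no = number_lines(rows)
print(line_no)
```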

Mysql Ranking within grouped results

I have read posts that answer how to rank results in mysql, but my question is how to assign ranks within a group
Let me explain with an example
Data:
sem_id | result_month
--------------------
1 |1313907325000
1 |1345529725000
2 |1329804925000
2 |1361427325000
3 |1377065725000
3 |1440137725000
What I am able to achieve with the below query:
SELECT @ss := @ss + 1 AS rank,
       res.sem_id,
       res.result_month
FROM (SELECT sem_id, result_month
      FROM xx_table
      GROUP BY sem_id,
               result_month) AS res, (SELECT @ss := 0) AS ss;
Current results:
rank | sem_id | result_month
----------------------------
1 | 1 |1313907325000
2 | 1 |1345529725000
3 | 2 |1329804925000
4 | 2 |1361427325000
5 | 3 |1377065725000
6 | 3 |1440137725000
What I actually want :
rank | sem_id | result_month
----------------------------
1 | 1 |1345529725000
2 | 1 |1313907325000
1 | 2 |1361427325000
2 | 2 |1329804925000
1 | 3 |1440137725000
2 | 3 |1377065725000
In the above results, note that each group is ranked within itself and each group is ordered by result_month desc.
How can I achieve the above results?
Thanks in advance!
You're almost there; use another variable to keep track of the group:
SELECT @ss := CASE WHEN @grp = sem_id THEN @ss + 1 ELSE 1 END AS rank,
       sem_id, result_month, @grp := sem_id
FROM (SELECT * FROM xx_table ORDER BY sem_id, result_month DESC) m
CROSS JOIN (SELECT @ss := 0, @grp := null) ss;
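The expected ranking can be reproduced in a few lines of Python, which may help when checking the query output. This is a sketch under the assumption that rows are (sem_id, result_month) pairs; `rank_within_groups` and `rows` are hypothetical names.

```python
from itertools import groupby


def rank_within_groups(rows):
    """Rank rows inside each sem_id group, newest result_month first."""
    # Sort by group, then by result_month descending, like the ORDER BY.
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))
    ranked = []
    for sem_id, group in groupby(ordered, key=lambda r: r[0]):
        # enumerate restarts at 1 for every group, like resetting the variable.
        for rank, (_, result_month) in enumerate(group, start=1):
            ranked.append((rank, sem_id, result_month))
    return ranked


rows = [(1, 1313907325000), (1, 1345529725000),
        (2, 1329804925000), (2, 1361427325000),
        (3, 1377065725000), (3, 1440137725000)]
ranked = rank_within_groups(rows)
for row in ranked:
    print(row)
```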

MySQL: Find max consecutive rows in a table based on value

Using MySQL, I am trying to find the highest number of consecutive rows in a table based on a value. For the sake of simplicity, my table looks like this:
+----+-------+
| ID | VALUE |
+----+-------+
| 1 | A |
| 2 | B |
| 3 | A |
| 4 | A |
| 5 | B |
| 6 | B |
| 7 | A |
| 8 | A |
| 9 | A |
| 10 | B |
+----+-------+
In this example, if I wanted the highest number of consecutive rows for 'A', I would get 3. For 'B', I would get 2. Even returning a result set of the counts of consecutive rows for 'A' would be preferable. I am newer to SQL so hints would be appreciated too. Any suggestions?
You can do it using variables:
SELECT VALUE, MAX(cnt) AS maxCount
FROM (
   SELECT VALUE, COUNT(grp) AS cnt
   FROM (
      SELECT ID, VALUE, rn - rnByVal AS grp
      FROM (
         SELECT ID, VALUE,
                @rn := @rn + 1 AS rn,
                @rnByVal := IF (@val = VALUE,
                                IF (@val := VALUE, @rnByVal + 1, @rnByVal + 1),
                                IF (@val := VALUE, 1, 1)) AS rnByVal
         FROM mytable
         CROSS JOIN (SELECT @rn := 0, @rnByVal := 0, @val := '') AS vars
         ORDER BY ID) AS t
   ) AS s
   GROUP BY VALUE, grp ) AS u
GROUP BY VALUE
Variables @rn and @rnByVal are used to simulate the ROW_NUMBER window function, which is not available in MySQL before version 8.0. The second variable (@rnByVal) restarts its count at 1 every time VALUE changes from one row to the next.
Using rn - rnByVal in an outer query, we can calculate the grp field, which identifies islands of consecutive rows having the same VALUE. Performing a GROUP BY on VALUE, grp gives the size of each island and, finally, the outermost query returns the largest island per VALUE.
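The row-number difference trick is easier to follow with the intermediate values written out. The Python sketch below mirrors it for illustration only: `max_consecutive`, `run_pos` and `values` are hypothetical names, and the list is the VALUE column of the sample table in ID order.

```python
from collections import Counter


def max_consecutive(values):
    """Longest run of consecutive equal values, per value."""
    island_sizes = Counter()
    prev = object()     # sentinel that equals no real value
    run_pos = 0         # position inside the current run (like rnByVal)
    for rn, value in enumerate(values, start=1):
        run_pos = run_pos + 1 if value == prev else 1
        prev = value
        # rn - run_pos is constant within a run, so it labels the island.
        island_sizes[(value, rn - run_pos)] += 1
    longest = {}
    for (value, _), size in island_sizes.items():
        longest[value] = max(longest.get(value, 0), size)
    return longest


values = list("ABAABBAAAB")
longest = max_consecutive(values)
print(longest)
```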

MySQL group chunks by column value, sorted by other column

I have a table like:
time | status
1390836600 | 1
1390836605 | 1
1390836610 | 0
1390836615 | 0
1390836620 | 1
1390836625 | 1
1390836630 | 1
I need to output the data "grouped" by the status, and sorted by time. The trick is that I need the groupings in chunks for each time the status changes, with the fields: MIN(time), status
So for the example data above I'd need an output like
MIN(time) | status
1390836600 | 1
1390836610 | 0
1390836620 | 1
This is not the behaviour of GROUP BY, which would just group ALL rows with the same status and only output 2 rows. But is something like this possible?
This (grouping of continuous ranges) is called the gaps-and-islands problem. It can be solved effectively with analytic functions (specifically ROW_NUMBER()), which MySQL did not support before version 8.0.
But you can emulate ROW_NUMBER() with session variables in the following way:
SELECT MIN(time) time, status
  FROM
(
  SELECT time, status,
         @n := @n + 1 rnum,
         @g := IF(status = @s, @g + 1, 1) rnum2,
         @s := status
    FROM table1 CROSS JOIN (SELECT @n := 0, @g := 0, @s := NULL) i
   ORDER BY time
) q
 GROUP BY rnum - rnum2
Output:
| TIME | STATUS |
|------------|--------|
| 1390836600 | 1 |
| 1390836610 | 0 |
| 1390836620 | 1 |
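For comparison, the same chunking can be sketched in Python, where `itertools.groupby` starts a new group whenever the key changes and so behaves like the islands above. This is only an illustration; `status_chunks` and `rows` are hypothetical names, and the rows are the (time, status) pairs from the question.

```python
from itertools import groupby


def status_chunks(rows):
    """Return (min_time, status) for each run of consecutive equal status."""
    ordered = sorted(rows)          # sort by time, like ORDER BY time
    chunks = []
    for status, chunk in groupby(ordered, key=lambda row: row[1]):
        times = [t for t, _ in chunk]
        chunks.append((min(times), status))
    return chunks


rows = [(1390836600, 1), (1390836605, 1), (1390836610, 0), (1390836615, 0),
        (1390836620, 1), (1390836625, 1), (1390836630, 1)]
chunks = status_chunks(rows)
print(chunks)
```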

Update of MySQL table column to sequential digit based on another column

The current table looks something like this:
[id | section | order | thing]
[1 | fruits | 0 | apple]
[2 | fruits | 0 | banana]
[3 | fruits | 0 | avocado]
[4 | veggies | 0 | tomato]
[5 | veggies | 0 | potato]
[6 | veggies | 0 | spinach]
I'm wondering how to make the table look more like this:
[id | section | order | thing]
[1 | fruits | 1 | apple]
[2 | fruits | 2 | banana]
[3 | fruits | 3 | avocado]
[4 | veggies | 1 | tomato]
[5 | veggies | 2 | potato]
[6 | veggies | 3 | spinach]
"order" column updated to a sequential number, starting at 1, based on "section" column and "id" column.
You can do this with an update by using a join. The second table in the join calculates the ordering, which is then used for the update:
update t join
       (select t.id,
               (@rn := if(@prev = t.section, @rn + 1,
                          if(@prev := t.section, 1, 1)
                         )
               ) as rn
        from (select t.* from t order by section, id) t cross join
             (select @rn := 0, @prev := '') const
       ) tsum
       on t.id = tsum.id
    set t.`order` = tsum.rn;
You don't want to do this as an UPDATE, as that will be really slow.
Instead, do this on INSERT. Here's a simple one-line INSERT that grabs the next order number and inserts a record called 'kiwi' in the section 'fruits'.
INSERT INTO `table_name` (`section`, `order`, `thing`)
SELECT 'fruits', COALESCE(MAX(`order`), 0) + 1, 'kiwi'
FROM `table_name`
WHERE `section` = 'fruits'
EDIT: You could also do this using an insert trigger, e.g.:
DELIMITER $$
CREATE TRIGGER `trigger_name`
BEFORE INSERT ON `table_name`
FOR EACH ROW
BEGIN
    SET NEW.`order` = (SELECT COALESCE(MAX(`order`), 0) + 1 FROM `table_name` WHERE `section` = NEW.`section`);
END$$
DELIMITER ;
Then you could just insert your records as usual, and they will auto-update the order value.
INSERT INTO `table_name` (`section`, `thing`)
VALUES ('fruits', 'kiwi')
Rather than storing the ordering, you could derive it:
SELECT t.id
      ,t.section
      ,@row_num := IF (@prev_section = t.section, @row_num+1, 1) AS ordering
      ,t.thing
      ,@prev_section := t.section
  FROM myTable t
      ,(SELECT @row_num := 1) x
      ,(SELECT @prev_section := '') y
 ORDER BY t.section, t.id
Note that ORDER is a reserved word in MySQL and is therefore not the greatest choice for a column name. You could quote the column name with backticks or give it a different name.