Caching row ranks in MySQL? - mysql

I have a column I want to sort by, and the rank is updated periodically (daily). I currently do this in code:
get all rows from table order by column
rank = 1
foreach row in table
update row's rank to rank
rank++
This takes one UPDATE per row in MySQL. Is there a more efficient way to do this?

Use an update with a join:
set @rank := 0;
update tbl a join
(select id, @rank := @rank + 1 as new_rank from tbl order by col) b
on a.id = b.id set a.rank = b.new_rank;
If you expect a lot of rows, you'll get the best performance by joining against a table that is indexed, e.g.:
set @rank := 0;
create temporary table tmp (id int primary key, rank int)
select id, @rank := @rank + 1 as rank from tbl order by col;
update tbl join tmp on tbl.id = tmp.id set tbl.rank = tmp.rank;
Finally, you could potentially make it faster by skipping the update step entirely and swapping in a new table (not always feasible):
set @rank := 0;
create table new_tbl (id int primary key, rank int, col char(10),
col2 char(20)) select id, @rank := @rank + 1 as rank, col, col2
from tbl order by col;
drop table tbl;
rename table new_tbl to tbl;
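On MySQL 8.0 or later, the same join-based update can probably be written without a user variable by using the ROW_NUMBER() window function. This is only a sketch: it assumes MySQL 8.0+ and reuses the placeholder names from the answer (tbl, id, col, rank). The extra id in the ORDER BY is an assumption added to make ties deterministic.

```sql
-- Sketch, assuming MySQL 8.0+ (window functions are not available in 5.x).
-- `rank` is a reserved word from MySQL 8.0.2 onward, so it is backtick-quoted here.
UPDATE tbl AS a
JOIN (
    -- Number the rows once, in the desired order; id breaks ties.
    SELECT id, ROW_NUMBER() OVER (ORDER BY col, id) AS new_rank
    FROM tbl
) AS b ON a.id = b.id
SET a.`rank` = b.new_rank;
```

Because the derived table contains a window function, it is materialized before the update runs, so the whole table is still ranked in a single statement.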

Related

Weird result in MySQL query

I have this query:
SELECT COUNT(1), name, (@i := @i + 1) AS counter FROM mytbl, (SELECT @i := 0) tmp_tbl GROUP BY counter
With this query, the counter column increases by 2 for each row.
But if I remove COUNT(1), as in:
SELECT name, (@i := @i + 1) AS counter FROM mytbl, (SELECT @i := 0) tmp_tbl GROUP BY counter
the counter column increases by 1.
Can anyone explain this behavior?
Table would be:
create table mytbl (name VARCHAR(20));
With data:
INSERT INTO mytbl VALUES
('a1'),
('a2'),
('a3');
As the MySQL documentation mentions, we should not assign a value to a user variable and read that value within the same statement. We might get the expected results, but this is not guaranteed. Changing the statement (for example, by adding a GROUP BY, HAVING, or ORDER BY clause) may cause MySQL to select an execution plan with a different order of evaluation.
In your query, the counter field is evaluated in the SELECT clause and then used in the GROUP BY clause. It seems that when an aggregate function is added to the SELECT list, the field used in GROUP BY is evaluated twice.
I've created a demo you can check. In the demo, I have this query:
SELECT Count(1),
name,
( @i := @i + 1 ) AS counter,
( @j := @j + 1 ) AS group_field
FROM (SELECT 'A' AS name
UNION
SELECT 'B' AS name
UNION
SELECT 'C' AS name) mytable,
(SELECT @i := 0) tmp_tbl,
(SELECT @j := 0) tmp_tbl1
GROUP BY group_field;
In the execution result, the counter field increases by only 1, while group_field increases by 2.
To make the counter field increase by only 1, you could try this:
SELECT Count(1),
name,
counter
FROM (SELECT name,
( @i := @i + 1 ) AS counter
FROM mytbl,
(SELECT @i := 0) tmp_tbl) data
GROUP BY counter;

Find all NULL values and set them to lowest unused number using MySQL query

I want to find all NULL values in the parameter_id column and set them to the lowest unused parameter_id.
I have a query that finds the lowest unused parameter_id, and I also know how to get the list of NULL values.
SELECT MIN(t1.parameter_id)+1 FROM `table` AS t1 WHERE NOT EXISTS (SELECT * FROM `table` AS t2 WHERE t2.parameter_id = t1.parameter_id+1)
I can get the list of all rows with parameter_id=NULL, then run a query to find the current lowest unused parameter_id, and then update parameter_id to that number. Since the table has 50,000 rows, this approach would create thousands of queries (2 queries per row).
Is there a way to run a "single query" that finds all rows with parameter_id=NULL and updates each of them to the current lowest unused parameter_id?
Here is the table description (MySQL 5.5):
id (INT) primary key, auto_increment
parameter_id (INT) default NULL
Sample data:
# id, parameter_id
1, NULL
2, 1
3, NULL
4, 5
5, 3
Desired result:
# id, parameter_id
1, 2
2, 1
3, 4
4, 5
5, 3
EDIT:
I distilled what I want into a single query. I simply need to run this query until the UPDATE affects 0 rows.
UPDATE `table`
SET parameter_id=
(SELECT *
FROM
(SELECT MIN(t1.parameter_id)+1
FROM `table` AS t1
WHERE NOT EXISTS
(SELECT *
FROM `table` AS t2
WHERE t2.parameter_id = t1.parameter_id+1)) AS t4)
WHERE parameter_id IS NULL LIMIT 1
The following enumerates the unused parameter ids:
select t.*, (@rn := @rn + 1) as seqnum
from `table` t cross join
(select @rn := 0) params
where not exists (select 1 from `table` t2 where t2.parameter_id = t.id)
order by t.id;
(You might want to put this in a temporary table with an index on seqnum for the subsequent query.)
The problem is getting a join key for the update. Here is a bit of a kludge: I'm going to add a column, enumerate it, and then drop it:
alter table `table` add column null_seqnum int;
update `table` t cross join (select @rn1 := 0) params
set null_seqnum = (@rn1 := @rn1 + 1)
where parameter_id is null;
update `table` t join
(select t.*, (@rn := @rn + 1) as seqnum
from `table` t cross join
(select @rn := 0) params
where not exists (select 1 from `table` t2 where t2.parameter_id = t.id)
order by t.id
) tnull
on t.null_seqnum = tnull.seqnum
set t.parameter_id = tnull.id;
alter table `table` drop column null_seqnum;
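For readers on MySQL 8.0 or later (the question targets 5.5, so this does not apply there), the helper column can probably be avoided by numbering both sides with ROW_NUMBER(). The sketch below reuses the answer's own candidate set, namely id values that no row uses as a parameter_id; the aliases n, f, c, u and free_id are hypothetical names chosen for illustration.

```sql
-- Sketch, assuming MySQL 8.0+ and the table layout from the question.
UPDATE `table` AS t
JOIN (
    -- Number the rows that still need a parameter_id.
    SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS seqnum
    FROM `table`
    WHERE parameter_id IS NULL
) AS n ON n.id = t.id
JOIN (
    -- Number the candidate values: ids not used as a parameter_id anywhere.
    SELECT c.id AS free_id, ROW_NUMBER() OVER (ORDER BY c.id) AS seqnum
    FROM `table` AS c
    WHERE NOT EXISTS (SELECT 1 FROM `table` AS u WHERE u.parameter_id = c.id)
) AS f ON f.seqnum = n.seqnum
SET t.parameter_id = f.free_id;
```

With the sample data from the question, the NULL rows are ids 1 and 3 and the free values are 2 and 4, so the pairing matches the desired result. If there are more NULL rows than free values, the inner join leaves the surplus rows as NULL.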

Increment a column value based on uniqueness of other column values

I have a table to which I need to add an increment column, but the increment should be based on the existing values in the other columns.
select * from mytable;
first_col second_col
A B
A C
A D
A E
A B
A D
Now I want to add another column, say new_col, whose value increments for each repeated combination of first_col and second_col.
The column should be populated like this:
first_col second_col new_col
A B 1
A C 1
A D 1
A E 1
A B 2
A D 2
A B 3
Is it possible to do this using some sort of built-in MySQL auto-increment strategy?
Using a temporary table with an auto-incremented id column, you could do:
create temporary table tt (
id int auto_increment primary key,
col1 varchar(32),col2 varchar(32));
insert into tt (col1, col2)
select col1, col2 from origtable;
select col1, col2,
(select count(*)+1 from tt s
where s.col1=m.col1 and s.col2=m.col2
and s.id<m.id) n
from tt m;
There is no built-in increment method in MySQL, but you can do this with either correlated subqueries or variables:
select t.*,
(@rn := if(@c = concat(first_col, ':', second_col), @rn + 1,
if(@c := concat(first_col, ':', second_col), 1, 1)
)
) as new_col
from mytable t cross join
(select @rn := 0, @c := '') params
order by first_col, second_col;
Note: this re-orders the results. If you want the results in the original order, then you need a column that specifies that ordering.
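On MySQL 8.0 or later, the per-group counter can likely be expressed with a window function instead of variables. This is a sketch that assumes MySQL 8.0+ and the mytable layout from the question. Since the table has no column that records the original row order, the numbering within each (first_col, second_col) group is arbitrary.

```sql
-- Sketch, assuming MySQL 8.0+.
-- Each (first_col, second_col) combination gets its own 1, 2, 3, ... sequence.
SELECT first_col,
       second_col,
       ROW_NUMBER() OVER (PARTITION BY first_col, second_col) AS new_col
FROM mytable;
```

If the table had an ordering column, adding ORDER BY on that column inside the OVER clause would make the numbering repeatable.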
Here's how you can do it; replace col1val and col2val with the values to be inserted.
INSERT INTO mytable (first_col, second_col, new_col)
SELECT col1val, col2val, COUNT(*) + 1
FROM mytable
WHERE first_col = col1val AND second_col = col2val;
Note that this is an insert query, and it will affect only newly inserted values.

getting the ranking of the rows in mysql ORDER BY statements

Suppose I have
SELECT * FROM t ORDER BY j
Is there a way to make the query also return an auto-incremented column that gives each row's rank in that ordering?
This column should also work with ranged LIMITs, e.g.
SELECT * FROM t ORDER BY j LIMIT 10,20
should have the auto-incremented column return 11, 12, 13, 14, etc.
Oracle, MSSQL, etc. support ranking functions that do exactly what you want. Unfortunately, MySQL has some catching up to do in this regard.
The closest I've ever been able to get to approximating ROW_NUMBER() OVER() in MySQL is like this:
SELECT t.*,
@rank := @rank + 1 AS rank
FROM t, (SELECT @rank := 0) r
ORDER BY j
I don't know how that would rank with a ranged LIMIT, unless perhaps you used it in a subquery (although performance may suffer with large datasets):
SELECT t2.*
FROM (
SELECT t.*,
@rank := @rank + 1 AS rank
FROM t, (SELECT @rank := 0) r
ORDER BY j
) t2
LIMIT 10,20
The other option would be to create a temporary table:
CREATE TEMPORARY TABLE myRank
(
`rank` INT(11) NOT NULL AUTO_INCREMENT,
`id` INT(11) NOT NULL,
PRIMARY KEY (`rank`),
KEY (`id`)
);
INSERT INTO myRank (id)
SELECT t.id
FROM t
ORDER BY j;
SELECT t.*, r.`rank`
FROM t
INNER JOIN myRank r
ON t.id = r.id
ORDER BY r.`rank`
LIMIT 10,20;
Of course, the temporary table would need to be persisted between calls.
I wish there was a better way, but without ROW_NUMBER() you must resort to some hackery to get the behavior you want.
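MySQL 8.0 added window functions, including ROW_NUMBER(), so on that version or later the workarounds above should no longer be necessary. The following is a minimal sketch assuming MySQL 8.0+ and the table t and column j from the question; row_rank is a hypothetical alias chosen to avoid the reserved word RANK.

```sql
-- Sketch, assuming MySQL 8.0+.
-- Window functions are evaluated before LIMIT is applied, so the numbering
-- covers the full ordered result and the page simply picks its slice.
SELECT t.*,
       ROW_NUMBER() OVER (ORDER BY j) AS row_rank
FROM t
ORDER BY j
LIMIT 10, 20;  -- skip 10 rows, return the next 20: row_rank starts at 11
```

If j contains duplicate values, adding a unique column as a tie-breaker to both ORDER BY clauses keeps the numbering and the output order consistent with each other.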

Mysql - calculate difference between two columns in 2 different tables and insert

I have the following statement, and I need to add one more subquery to it. The subquery should calculate the difference between the number of followers in the tweeps table and the number in the current column of the ranking table, and insert that difference into a column called latest in the ranking table. The primary key is screenname.
For example, if the followers column in the tweeps table is 10 and the current column in the ranking table is 5 for the same screenname, the value added to latest should be +5.
mysql_query ("
INSERT INTO ranking
SELECT @rank := @rank + 1, tweeps.* FROM tweeps
JOIN( SELECT @rank := 0 ) AS init
ORDER BY followers DESC
ON DUPLICATE KEY UPDATE
ranking.ranking = @rank,
ranking.name = tweeps.name,
ranking.followers = tweeps.followers,
ranking.tweets = tweeps.tweets,
ranking.location = tweeps.location,
ranking.`join date` = tweeps.join_date,
ranking.avatar = tweeps.avatar;");
mysql_close($con);
Try this:
INSERT INTO ranking
SELECT @rank := @rank + 1, tweeps.* FROM tweeps
JOIN( SELECT @rank := 0 ) AS init
ORDER BY followers DESC
ON DUPLICATE KEY
UPDATE ranking set
ranking = @rank,
name = tweeps.name,
followers = tweeps.followers - followers,
tweets = tweeps.tweets,
location = tweeps.location,
`join date` = tweeps.join_date,
avatar = tweeps.avatar;
I changed the syntax to ON DUPLICATE KEY UPDATE ranking set...