MySQL - Recursive Reorder Algorithm / UPDATE with Variable Incrementation - mysql

Sorry for the kind of meaningless title, but I couldn't come up with a more fitting one.
I have a MySQL table, which looks like this:
SELECT * FROM `table`
+----+-----------+----------+-------+
| id | dimension | order_by | value |
+----+-----------+----------+-------+
| 1 | 1 | 1 | 1st |
| 2 | 1 | 100 | 3rd |
| 3 | 2 | 300 | 5th |
| 4 | 3 | 999 | 6th |
| 5 | 1 | 2 | 2nd |
| 6 | 2 | 1 | 4th |
+----+-----------+----------+-------+
I am listing all entries ordered by dimension (first) and order_by (second), which looks like this:
SELECT * FROM `table` ORDER BY `dimension`, `order_by`
+----+-----------+----------+-------+
| id | dimension | order_by | value |
+----+-----------+----------+-------+
| 1 | 1 | 1 | 1st |
| 5 | 1 | 2 | 2nd |
| 2 | 1 | 100 | 3rd |
| 6 | 2 | 1 | 4th |
| 3 | 2 | 300 | 5th |
| 4 | 3 | 999 | 6th |
+----+-----------+----------+-------+
Now I'd like to write a function that renumbers order_by within each dimension, if possible with just one UPDATE query, so that the result looks like this:
SELECT * FROM `table` ORDER BY `dimension`, `order_by`
+----+-----------+----------+-------+
| id | dimension | order_by | value |
+----+-----------+----------+-------+
| 1 | 1 | 1 | 1st |
| 5 | 1 | 2 | 2nd |
| 2 | 1 | 3 | 3rd |
| 6 | 2 | 1 | 4th |
| 3 | 2 | 2 | 5th |
| 4 | 3 | 1 | 6th |
+----+-----------+----------+-------+
What I got so far (which, unfortunately, doesn't start recounting for each dimension):
UPDATE `table` AS `l`
JOIN (SELECT @i=1 FROM `table`) AS `i`
SET `order_by` = @i:=i
Now, my question would be: Is it possible to do it with just one UPDATE query?

You have to introduce another variable holding the value of the previous row.
UPDATE Table1 t
INNER JOIN (
    SELECT
        id, /*your primary key I assume*/
        @new_ob := if(@prev != dimension, 1, @new_ob + 1) as new_ob,
        @prev := dimension /*In this line, the value of the current row is assigned. In the previous line, the variable still holds the value of the previous row*/
    FROM
        Table1
        , (SELECT @prev := null, @new_ob := 0) var_init_subquery
    ORDER BY dimension, order_by
) st ON t.id = st.id
SET t.order_by = st.new_ob;
see it working live in an sqlfiddle
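To see why the second variable is needed, the same pass over the sorted rows can be traced outside the database. This is only a minimal sketch in Python, not part of the answer above; `renumber` and the tuple layout are made-up names for illustration, and `prev` / `new_ob` stand in for `@prev` / `@new_ob`.

```python
# Rows from the question as (id, dimension, order_by).
rows = [(1, 1, 1), (2, 1, 100), (3, 2, 300), (4, 3, 999), (5, 1, 2), (6, 2, 1)]


def renumber(rows):
    """Return {id: new_order_by}, restarting the counter for each dimension."""
    prev = None   # plays the role of @prev
    new_ob = 0    # plays the role of @new_ob
    result = {}
    # Same ordering as the subquery: ORDER BY dimension, order_by
    for row_id, dimension, _old_order in sorted(rows, key=lambda r: (r[1], r[2])):
        # Restart at 1 when the dimension changes, otherwise keep counting.
        new_ob = 1 if dimension != prev else new_ob + 1
        prev = dimension
        result[row_id] = new_ob
    return result


new_order = renumber(rows)
print(new_order)
```

Without `prev` the counter would simply keep growing across dimensions, which is the behaviour of the attempt in the question. On MySQL 8.0 and later, `ROW_NUMBER() OVER (PARTITION BY dimension ORDER BY order_by)` expresses the same numbering without user variables, which is worth considering because assigning user variables inside expressions is deprecated there.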

Related

How to create a simple crosstab query in MySQL

I have two tables containing fields as below.
Table 1
| SetID | InQty | Day |
| 1 | 10 | 1 |
| 2 | 10 | 2 |
| 3 | 10 | 3 |
Table 2
| SetID | OtQty | Day |
| 1 | 1 | 5 |
| 1 | 2 | 6 |
| 1 | 3 | 7 |
SetID in Table 2 refers to SetID in Table 1. Day stands in for a date, for convenience only. Expected output:
| Day | InQty | OtQty |
| 1 | 10 | |
| 5 | | 1 |
| 6 | | 2 |
| 7 | | 3 |
Blank Space can be filled with NULL or Zero.
It appears you are querying only for SetID = 1; otherwise I would expect to see in/out values for sets 2 and 3. You should be able to get this with a simple UNION (UNION ALL here, so that identical rows are not collapsed). The ORDER BY has to come after the last SELECT, where it sorts the combined result:
select t1.Day, t1.InQty, 0 as OtQty
from Table1 t1
where t1.SetID = 1
union all
select t2.Day, 0, t2.OtQty
from Table2 t2
where t2.SetID = 1
order by Day;
Now, if you want totals spanning different SetIDs while keeping them separate from each other, add SetID as a column, wrap the quantities in SUM(), and include SetID in the GROUP BY clause.
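The UNION approach can be checked against the sample data. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for MySQL (an assumption made so the example is self-contained), and it fills the blanks with NULL rather than zero, which the question allows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (SetID INT, InQty INT, Day INT);
    CREATE TABLE Table2 (SetID INT, OtQty INT, Day INT);
    INSERT INTO Table1 VALUES (1, 10, 1), (2, 10, 2), (3, 10, 3);
    INSERT INTO Table2 VALUES (1, 1, 5), (1, 2, 6), (1, 3, 7);
""")

# One SELECT per source table, padded with NULL for the column it lacks;
# the single ORDER BY at the end sorts the combined result.
crosstab = conn.execute("""
    SELECT Day, InQty, NULL AS OtQty FROM Table1 WHERE SetID = 1
    UNION ALL
    SELECT Day, NULL, OtQty FROM Table2 WHERE SetID = 1
    ORDER BY Day
""").fetchall()

for row in crosstab:
    print(row)
```

Each row comes back as `(Day, InQty, OtQty)` with `None` where the question shows a blank.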

MYSQL - how do i select no more than x rows max with the same field value y?

this question is a bit tricky to formulate, so probably has been asked before.
i am selecting rows from a table of interrelating data. i only want a maximum of n rows which have the same value x of some field/column in the table to show up in my set. there is a global limit, in essence i always want the query to return the same amount of rows, with no more than n rows sharing value x. how do i do this?
here's an example of the data (dots are supposed to indicate that this table is large, let's say 20000 rows of data):
some_table
+----+----------+-------------+------------+
| id | some_id | some_column | another_id |
+----+----------+-------------+------------+
| 1 | 10 | value | 8 |
| 2 | 10 | value | 5 |
| 3 | 10 | value | 2 |
| 4 | 20 | value | 3 |
| 5 | 30 | value | 9 |
| 6 | 30 | value | 1 |
| 7 | 30 | value | 4 |
| 8 | 30 | value | 6 |
| 9 | 30 | value | 7 |
| 10 | 40 | value | 10 |
| .. | ... | ... | ... |
| .. | ... | ... | ... |
| .. | ... | ... | ... |
| .. | ... | ... | ... |
+----+----------+-------------+------------+
now here's my select:
select * from some_table where some_column="value" order by another_id limit 6
but instead of returning rows with another_id = 1 thru 6 i want to get no more than 2 rows with the same value of some_id. in other words, i'd like to get:
result set
+----+----------+-------------+------------+
| id | some_id | some_column | another_id |
+----+----------+-------------+------------+
| 6 | 30 | value | 1 |
| 3 | 10 | value | 2 |
| 4 | 20 | value | 3 |
| 7 | 30 | value | 4 |
| 2 | 10 | value | 5 |
| 10 | 40 | value | 10 |
+----+----------+-------------+------------+
note that the results are ordered by another_id, but there are no more than 2 results with the same value of some_id.
how can i best (meaning preferably in one query and reasonably fast) get there? thanks!
select id, some_id, some_column, another_id from (
    select
        t.*,
        @rn := if(@prev = some_id, @rn + 1, 1) as rownumber,
        @prev := some_id
    from some_table t
    , (select @prev := null, @rn := 0) var_init
    where some_column = "value"
    order by some_id, another_id
) sq where rownumber <= 2
order by another_id
limit 6;
see it working live in an sqlfiddle
First we order by some_id, another_id in the subquery, so that within each some_id the rows are numbered starting from the smallest another_id. Then we keep only row numbers 1 and 2, order by another_id in the outer query to get the requested ordering, and apply the global limit.
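As a cross-check of the expected rows, the rule "at most n rows per some_id, ordered by another_id, with a global limit" can be written as one pass in Python. This is a different technique from the user-variable query (a counter per group instead of a previous-row variable), and `capped` and its parameters are hypothetical names.

```python
# (id, some_id, another_id) for the ten sample rows in the question.
rows = [
    (1, 10, 8), (2, 10, 5), (3, 10, 2), (4, 20, 3), (5, 30, 9),
    (6, 30, 1), (7, 30, 4), (8, 30, 6), (9, 30, 7), (10, 40, 10),
]


def capped(rows, per_group=2, limit=6):
    """Walk rows by another_id, keeping at most per_group rows per some_id."""
    taken = {}  # some_id -> number of rows already kept
    out = []
    for row in sorted(rows, key=lambda r: r[2]):
        some_id = row[1]
        if taken.get(some_id, 0) < per_group:
            taken[some_id] = taken.get(some_id, 0) + 1
            out.append(row)
            if len(out) == limit:
                break
    return out


picked = capped(rows)
print([r[0] for r in picked])
```

Walking the rows in another_id order and taking the first two of each group selects the same rows as numbering each group by another_id and keeping row numbers 1 and 2.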

SQL, difficult fetching data query

Suppose I have such a table:
+-----+---------+-------+
| ID | TIME | DAY |
+-----+---------+-------+
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 3 | 1 |
| 1 | 1 | 2 |
| 2 | 2 | 2 |
| 3 | 3 | 2 |
| 1 | 1 | 3 |
| 2 | 2 | 3 |
| 3 | 3 | 3 |
| 1 | 1 | 4 |
| 2 | 2 | 4 |
| 3 | 3 | 4 |
| 1 | 1 | 5 |
| 2 | 2 | 5 |
| 3 | 3 | 5 |
+-----+---------+-------+
I want to fetch the 2 IDs with the largest sum of TIME within the last 3 days (that is, DAY values 3 to 5).
So the correct result would be:
+-----+---------+
| ID | SUM |
+-----+---------+
| 3 | 9 |
| 2 | 6 |
+-----+---------+
The original table is much larger and more complex. So i need a generic approach.
Thanks in advance.
And so I just learned that MySQL used LIMIT instead of TOP...
fiddle
CREATE TABLE tbl (ID INT, tm INT, dy INT);
INSERT INTO tbl (ID, tm, dy) VALUES
(1,1,1)
,(2,2,1)
,(3,3,1)
,(1,1,2)
,(1,1,1);
SELECT ID
      ,SUM(SumTimeForDay) SumTimeFromLastThreeDays
FROM (SELECT ID
            ,SUM(tm) SumTimeForDay
      FROM tbl
      /* compare against the table-wide maximum; a HAVING on MAX(dy) after GROUP BY ID, dy would compare each day only with itself */
      WHERE dy > (SELECT MAX(dy) FROM tbl) - 3
      GROUP BY ID, dy) a
GROUP BY ID
ORDER BY SUM(SumTimeForDay) DESC
LIMIT 2;
select t1.`id`, sum(t1.`time`) as `sum`
from `table` t1
inner join ( select distinct `day` from `table` order by `day` desc limit 3 ) t2
on t2.`day` = t1.`day`
group by t1.`id`
order by sum(t1.`time`) desc
limit 2;
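The join against the three most recent days can be tried on the question's full 15-row table. A minimal sketch, again assuming Python's built-in `sqlite3` as a stand-in for MySQL; the column names `tm` and `dy` follow the first answer, and the alias `total` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INT, tm INT, dy INT)")
# The question's data: ids 1..3 with TIME equal to the id, on days 1..5.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(i, i, d) for d in range(1, 6) for i in range(1, 4)],
)

top = conn.execute("""
    SELECT t1.id, SUM(t1.tm) AS total
    FROM tbl AS t1
    JOIN (SELECT DISTINCT dy FROM tbl ORDER BY dy DESC LIMIT 3) AS t2
      ON t2.dy = t1.dy
    GROUP BY t1.id
    ORDER BY total DESC
    LIMIT 2
""").fetchall()
print(top)
```

Selecting the distinct days first means "last 3 days" is taken from the days that actually occur in the table, so gaps in the day numbering do not shrink the window.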

Create a column that counts the number of times a value shows up on another column, from the same table - MySQL

In MySQL:
Let's say I have this table:
id | name | count |
1 | John | |
2 | John | |
3 | John | |
4 | Mary | |
5 | Lewis| |
6 | Lewis| |
7 | Max | |
8 | Max | |
The names are already grouped, so the same name comes up together.
Now I want the table to be like this:
id | name | count |
1 | John | 1 |
2 | John | 2 |
3 | John | 3 |
4 | Mary | 1 |
5 | Lewis| 1 |
6 | Lewis| 2 |
7 | Max | 1 |
8 | Max | 2 |
Notice that it auto-increments the value of count every time the same name is repeated.
Thanks!
You can use a user variable.
Something like this:-
UPDATE somepeople a
INNER JOIN (
    SELECT id, name, IF(@PrevName=name, @aCnt := @aCnt + 1, @aCnt := 1) AS sequence, @PrevName:=name
    FROM somepeople,
    (SELECT @aCnt:=1, @PrevName:='') Sub1
    ORDER BY name, id) b
ON a.id = b.id
SET a.count = b.sequence
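The numbers this should produce can also be written as a correlated count: for each row, how many rows with the same name have an id less than or equal to this one. The sketch below checks that on the sample data with Python's built-in `sqlite3`, used here only as a stand-in. MySQL itself rejects this direct form with error 1093 (the target table of an UPDATE cannot be read in a subquery), which is one reason the answer above goes through a joined derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE somepeople (id INTEGER PRIMARY KEY, name TEXT, "count" INT)'
)
names = ["John", "John", "John", "Mary", "Lewis", "Lewis", "Max", "Max"]
conn.executemany("INSERT INTO somepeople (name) VALUES (?)", [(n,) for n in names])

# Position of each row among the rows sharing its name, ordered by id.
conn.execute("""
    UPDATE somepeople
    SET "count" = (SELECT COUNT(*)
                   FROM somepeople AS p
                   WHERE p.name = somepeople.name
                     AND p.id <= somepeople.id)
""")

counts = conn.execute(
    'SELECT id, name, "count" FROM somepeople ORDER BY id'
).fetchall()
for row in counts:
    print(row)
```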

Top 'n' results for each keyword

I have a query to get the top 'n' users who commented on a specific keyword,
SELECT `user` , COUNT( * ) AS magnitude
FROM `results`
WHERE `keyword` = "economy"
GROUP BY `user`
ORDER BY magnitude DESC
LIMIT 5
I have approx 6000 keywords, and would like to run this query to get me the top 'n' users for each and every keyword we have data for. Assistance appreciated.
Since you haven't given the schema for results, I'll assume it's this or very similar (maybe extra columns):
create table results (
id int primary key,
user int,
foreign key (user) references <some_other_table>(id),
keyword varchar(<30>)
);
Step 1: aggregate by keyword/user as in your example query, but for all keywords:
create view user_keyword as (
select
keyword,
user,
count(*) as magnitude
from results
group by keyword, user
);
Step 2: rank each user within each keyword group (note the use of the subquery to rank the rows):
create view keyword_user_ranked as (
select
    keyword,
    user,
    magnitude,
    (select count(*)
     from user_keyword
     where l.keyword = keyword and magnitude >= l.magnitude
    ) as `rank` -- rank is a reserved word from MySQL 8.0.2 on, hence the backticks
from
    user_keyword l
);
Step 3: select only the rows where the rank is at most some number:
select *
from keyword_user_ranked
where `rank` <= 3;
Example:
Base data used:
mysql> select * from results;
+----+------+---------+
| id | user | keyword |
+----+------+---------+
| 1 | 1 | mysql |
| 2 | 1 | mysql |
| 3 | 2 | mysql |
| 4 | 1 | query |
| 5 | 2 | query |
| 6 | 2 | query |
| 7 | 2 | query |
| 8 | 1 | table |
| 9 | 2 | table |
| 10 | 1 | table |
| 11 | 3 | table |
| 12 | 3 | mysql |
| 13 | 3 | query |
| 14 | 2 | mysql |
| 15 | 1 | mysql |
| 16 | 1 | mysql |
| 17 | 3 | query |
| 18 | 4 | mysql |
| 19 | 4 | mysql |
| 20 | 5 | mysql |
+----+------+---------+
Grouped by keyword and user:
mysql> select * from user_keyword order by keyword, magnitude desc;
+---------+------+-----------+
| keyword | user | magnitude |
+---------+------+-----------+
| mysql | 1 | 4 |
| mysql | 2 | 2 |
| mysql | 4 | 2 |
| mysql | 3 | 1 |
| mysql | 5 | 1 |
| query | 2 | 3 |
| query | 3 | 2 |
| query | 1 | 1 |
| table | 1 | 2 |
| table | 2 | 1 |
| table | 3 | 1 |
+---------+------+-----------+
Users ranked within keywords:
mysql> select * from keyword_user_ranked order by keyword, rank asc;
+---------+------+-----------+------+
| keyword | user | magnitude | rank |
+---------+------+-----------+------+
| mysql | 1 | 4 | 1 |
| mysql | 2 | 2 | 3 |
| mysql | 4 | 2 | 3 |
| mysql | 3 | 1 | 5 |
| mysql | 5 | 1 | 5 |
| query | 2 | 3 | 1 |
| query | 3 | 2 | 2 |
| query | 1 | 1 | 3 |
| table | 1 | 2 | 1 |
| table | 3 | 1 | 3 |
| table | 2 | 1 | 3 |
+---------+------+-----------+------+
Only top 2 from each keyword:
mysql> select * from keyword_user_ranked where rank <= 2 order by keyword, rank asc;
+---------+------+-----------+------+
| keyword | user | magnitude | rank |
+---------+------+-----------+------+
| mysql | 1 | 4 | 1 |
| query | 2 | 3 | 1 |
| query | 3 | 2 | 2 |
| table | 1 | 2 | 1 |
+---------+------+-----------+------+
Note that when there are ties -- see users 2 and 4 for keyword "mysql" in the examples -- all parties in the tie get the "last" rank, i.e. if the 2nd and 3rd are tied, both are assigned rank 3.
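If tied users should share the better rank instead (so that users 2 and 4 both count as 2nd for "mysql"), one option is to count only the strictly larger magnitudes and add 1. This variant is not part of the answer above; the sketch runs it on the sample data with Python's built-in `sqlite3` as a stand-in for MySQL, and the column name `rnk` is made up to sidestep the reserved word.

```python
import sqlite3

# (user, keyword) pairs copied from the sample `results` table, ids 1..20.
pairs = [
    (1, "mysql"), (1, "mysql"), (2, "mysql"), (1, "query"), (2, "query"),
    (2, "query"), (2, "query"), (1, "table"), (2, "table"), (1, "table"),
    (3, "table"), (3, "mysql"), (3, "query"), (2, "mysql"), (1, "mysql"),
    (1, "mysql"), (3, "query"), (4, "mysql"), (4, "mysql"), (5, "mysql"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, user INT, keyword TEXT)")
conn.executemany("INSERT INTO results (user, keyword) VALUES (?, ?)", pairs)
conn.execute("""
    CREATE VIEW user_keyword AS
    SELECT keyword, user, COUNT(*) AS magnitude
    FROM results
    GROUP BY keyword, user
""")

# Rank = 1 + number of users with a strictly larger magnitude for the keyword,
# so tied users share the better rank.
top2 = conn.execute("""
    SELECT keyword, user, magnitude, rnk
    FROM (SELECT l.keyword, l.user, l.magnitude,
                 1 + (SELECT COUNT(*)
                      FROM user_keyword AS u
                      WHERE u.keyword = l.keyword
                        AND u.magnitude > l.magnitude) AS rnk
          FROM user_keyword AS l) AS ranked
    WHERE rnk <= 2
    ORDER BY keyword, rnk, user
""").fetchall()
for row in top2:
    print(row)
```

With this definition a "top 2" filter can return more than two rows per keyword when there are ties, so whether it is preferable depends on how ties should be treated.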
Performance: adding an index to the keyword and user columns will help. I have a table being queried in a similar way with 4000 and 1300 distinct values for the two columns (in a 600000-row table). You can add the index like this:
alter table results add index keyword_user (keyword, user);
In my case, query time dropped from about 6 seconds to about 2 seconds.
You can use a pattern like this (from Within-group quotas (Top N per group)):
SELECT tmp.ID, tmp.entrydate
FROM (
    SELECT
        ID, entrydate,
        IF( @prev <> ID, @rownum := 1, @rownum := @rownum+1 ) AS `rank`,
        @prev := ID
    FROM test t
    JOIN (SELECT @rownum := NULL, @prev := 0) AS r
    ORDER BY t.ID
) AS tmp
WHERE tmp.`rank` <= 2
ORDER BY ID, entrydate;
+------+------------+
| ID | entrydate |
+------+------------+
| 1 | 2007-05-01 |
| 1 | 2007-05-02 |
| 2 | 2007-06-03 |
| 2 | 2007-06-04 |
| 3 | 2007-07-01 |
| 3 | 2007-07-02 |
+------+------------+