Need a grouping option in MySQL

I have a MySQL table like this:
+--------+--------------+
| idform | user_id |
+--------+--------------+
| 17 | 2 |
| 16 | 2 |
| 15 | 2 |
| 14 | 2 |
| 13 | 18 |
| 12 | 18 |
| 11 | 18 |
| 10 | 18 |
| 9 | 18 |
| 8 | 1 |
| 6 | 2 |
| 5 | 2 |
| 3 | 2 |
| 1 | 2 |
+--------+--------------+
14 rows in set (0.00 sec)
I need a query that gives me a result like this:
+----------------+--------------+
| idform | user_id |
+----------------+--------------+
| 17,16,15,14 | 2 |
| 13,12,11,10,9 | 18 |
| 8 | 1 |
| 6,5,3,1 | 2 |
+----------------+--------------+
4 rows in set (0.00 sec)
I tried to use MySQL's GROUP_CONCAT() function, but I couldn't make the result look like this. All I want is for MySQL to return the results in order but start a new group whenever the user_id changes: collect the ids of consecutive rows in a comma-separated list, then on a new user_id, start a new group.
I know I can do it programmatically with PHP, but if I do it in PHP, I run into problems with pagination.
Any idea how to do this?
Thanks in advance.
EDIT:
I also tried a query like this: SELECT GROUP_CONCAT(idform) FROM story GROUP BY user_id. But it gives a result like this:
+---------------------+--------------+
| idform | user_id |
+---------------------+--------------+
| 8 | 1 |
| 1,3,5,6,14,15,16,17 | 2 |
| 9,10,11,12,13 | 18 |
+---------------------+--------------+
3 rows in set (0.00 sec)

You need to compare the user_id of consecutive rows and assign each run of equal values a group number. You can then apply GROUP_CONCAT to the data, grouped by that number.
I think the query below should work for you.
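SELECT GROUP_CONCAT(idform ORDER BY idform DESC) AS idform, user_id
FROM (
    SELECT
        story.*
        -- bump the group number whenever user_id differs from the previous row
        , @groupNumber := IF(@prev_userID != user_id, @groupNumber + 1, @groupNumber) AS gn
        , @prev_userID := user_id AS prev_user_id
    FROM story
    , (SELECT @groupNumber := 0, @prev_userID := NULL) subquery
    -- walk the rows in the order the groups should appear
    ORDER BY idform DESC
) sq
GROUP BY gn, user_id
ORDER BY gn;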
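The same "number each run, then group by that number" idea can also be written with window functions instead of user variables. The following is only a sketch under assumptions: it presumes MySQL 8.0 or later (where LAG() and SUM() OVER are available), and it runs the SQL against SQLite through Python's sqlite3 module purely as a stand-in (SQLite 3.25+ is needed). The final comma-joining is done in Python here; in MySQL, GROUP_CONCAT(idform ORDER BY idform DESC) grouped by gn would presumably do the same job.

```python
import sqlite3
from itertools import groupby

# Sample rows from the question: (idform, user_id)
story = [(17, 2), (16, 2), (15, 2), (14, 2), (13, 18), (12, 18), (11, 18),
         (10, 18), (9, 18), (8, 1), (6, 2), (5, 2), (3, 2), (1, 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE story (idform INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO story VALUES (?, ?)", story)

# is_new is 1 when user_id differs from the previous row (or there is none);
# the running sum of is_new then numbers each consecutive run.
query = """
SELECT idform, user_id,
       SUM(is_new) OVER (ORDER BY idform DESC) AS gn
FROM (
    SELECT idform, user_id,
           CASE WHEN user_id = LAG(user_id) OVER (ORDER BY idform DESC)
                THEN 0 ELSE 1 END AS is_new
    FROM story
) AS flagged
ORDER BY idform DESC
"""
rows = conn.execute(query).fetchall()

# Join the ids of each run; rows already arrive in idform DESC order.
grouped = [
    (",".join(str(r[0]) for r in run), user_id)
    for (gn, user_id), run in groupby(rows, key=lambda r: (r[2], r[1]))
]

for idforms, user_id in grouped:
    print(idforms, user_id)
```

Because the run number is computed inside the database, pagination could be applied to whole groups (for example by filtering on gn) rather than to individual rows.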

Related

How to select the first match of a group of conditions in MySQL

I have a table like this:
MyTable
-------------------------------
| ID | from | to |
-------------------------------
| 1 | U_002 | C_005 |
| 2 | U_015 | C_004 |
| 3 | C_005 | U_011 |
| 4 | U_008 | C_001 |
| 5 | U_007 | C_005 |
| 6 | U_001 | C_005 |
| 7 | C_004 | U_015 |
| 8 | U_002 | C_002 |
| 9 | U_001 | C_009 |
| 10 | U_010 | C_005 |
| 11 | C_005 | U_001 |
| 12 | U_004 | C_003 |
| 13 | U_005 | C_005 |
| 14 | U_010 | C_001 |
| 15 | C_005 | U_001 |
-------------------------------
ID is the unique auto-increment key of the table.
The goal is:
Given a value (for example C_005, U_001, C_010, etc.), obtain the first match of these two conditions: ((from == value) || (to == value)), starting from the highest ID.
This means the data can contain "duplicates", but I only want the first result of each group.
For example, C_004 and U_015 have TWO entries (C_004 -> U_015 and U_015 -> C_004). This should return only ONE.
Since we start from the highest ID, it would return only 7 | C_004 | U_015.
Here is an example:
Value = C_005
The expected output is:
15 | C_005 | U_001
13 | U_005 | C_005
10 | U_010 | C_005
5 | U_007 | C_005
3 | C_005 | U_011
1 | U_002 | C_005
The idea is to get the "last" match (because we start from the highest ID) between TWO values.
As I said, two values can match multiple times, but I only want the "last" match (highest ID).
Use MAX():
select max(id) id,`from`,`to`
from table_name
group by `from`,`to`
Your data is messed up because you have duplicates and potentially cycles too. Arrggh. You should fix the data.
But you can still do what you want with a recursive CTE:
with recursive cte as (
select id, f, t, 1 as lev, cast(t as char(1000)) as visited
from t
where f in ('C_005') /*, 'U_001', 'C_010') */
union all
select t.id, t.f, t.t, lev + 1, concat_ws(',', cte.visited, t.f)
from cte join
t
on cte.f = t.t
where cte.visited not like concat('%', t.f, '%')
)
select distinct id, f, t
from cte
order by id desc;
Here is a db<>fiddle.
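If the goal is just "the highest-ID row per partner of the given value", a plain GROUP BY on the partner column may be enough. This is a sketch under assumptions, not the method of either answer above: the table name my_table and the column names f and t (standing in for from and to, as in the CTE answer) are illustrative, and the SQL is run against SQLite through Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

# Sample rows from the question: (id, from, to), stored as columns f and t.
rows = [
    (1, "U_002", "C_005"), (2, "U_015", "C_004"), (3, "C_005", "U_011"),
    (4, "U_008", "C_001"), (5, "U_007", "C_005"), (6, "U_001", "C_005"),
    (7, "C_004", "U_015"), (8, "U_002", "C_002"), (9, "U_001", "C_009"),
    (10, "U_010", "C_005"), (11, "C_005", "U_001"), (12, "U_004", "C_003"),
    (13, "U_005", "C_005"), (14, "U_010", "C_001"), (15, "C_005", "U_001"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, f TEXT, t TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

# The CASE expression yields "the other side" of each matching row, so
# grouping by it collapses A -> B and B -> A into one group per partner.
query = """
SELECT m.id, m.f, m.t
FROM my_table AS m
JOIN (
    SELECT MAX(id) AS id
    FROM my_table
    WHERE f = :value OR t = :value
    GROUP BY CASE WHEN f = :value THEN t ELSE f END
) AS latest ON latest.id = m.id
ORDER BY m.id DESC
"""
result = conn.execute(query, {"value": "C_005"}).fetchall()

for row in result:
    print(row)
```

For C_005 this yields the six rows listed as the expected output in the question, and for C_004 it yields only row 7.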

MYSQL - how do i select no more than x rows max with the same field value y?

this question is a bit tricky to formulate, so probably has been asked before.
i am selecting rows from a table of interrelating data. i only want a maximum of n rows which have the same value x of some field/column in the table to show up in my set. there is a global limit, in essence i always want the query to return the same amount of rows, with no more than n rows sharing value x. how do i do this?
here's an example of the data (dots are supposed to indicate that this table is large, let's say 20000 rows of data):
some_table
+----+----------+-------------+------------+
| id | some_id | some_column | another_id |
+----+----------+-------------+------------+
| 1 | 10 | value | 8 |
| 2 | 10 | value | 5 |
| 3 | 10 | value | 2 |
| 4 | 20 | value | 3 |
| 5 | 30 | value | 9 |
| 6 | 30 | value | 1 |
| 7 | 30 | value | 4 |
| 8 | 30 | value | 6 |
| 9 | 30 | value | 7 |
| 10 | 40 | value | 10 |
| .. | ... | ... | ... |
| .. | ... | ... | ... |
| .. | ... | ... | ... |
| .. | ... | ... | ... |
+----+----------+-------------+------------+
now here's my select:
select * from some_table where some_column="value" order by another_id limit 6
but instead of returning rows with another_id = 1 thru 6 i want to get no more than 2 rows with the same value of some_id. in other words, i'd like to get:
result set
+----+----------+-------------+------------+
| id | some_id | some_column | another_id |
+----+----------+-------------+------------+
| 6 | 30 | value | 1 |
| 3 | 10 | value | 2 |
| 1 | 10 | value | 3 |
| 7 | 30 | value | 4 |
| 4 | 20 | value | 8 |
| 10 | 40 | value | 10 |
+----+----------+-------------+------------+
note that the results are ordered by another_id, but there are no more than 2 results with the same value of some_id.
how can i best (meaning preferably in one query and reasonably fast) get there? thanks!
select id, some_id, some_column, another_id from (
    select
        t.*,
        -- restart the counter whenever some_id changes
        @rn := if(@prev = some_id, @rn + 1, 1) as rownumber,
        @prev := some_id
    from some_table t
    , (select @prev := null, @rn := 0) var_init
    where some_column="value"
    order by some_id, id
) sq where rownumber <= 2
order by another_id;
see it working live in an sqlfiddle
First we order by some_id, id in the subquery to do the right calculations. Then we order by another_id in the outer query to have correct ordering.
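On MySQL 8.0 or later the per-group counter can presumably be replaced by ROW_NUMBER(). The sketch below makes two assumptions that are not in the answer above: rows within each some_id are ranked by another_id (so the two lowest another_id values per group survive), and the global limit of 6 from the question is applied at the end. It runs against SQLite via Python's sqlite3 module only as a stand-in, using the ten sample rows from the question.

```python
import sqlite3

# Sample rows from the question: (id, some_id, some_column, another_id)
data = [
    (1, 10, "value", 8), (2, 10, "value", 5), (3, 10, "value", 2),
    (4, 20, "value", 3), (5, 30, "value", 9), (6, 30, "value", 1),
    (7, 30, "value", 4), (8, 30, "value", 6), (9, 30, "value", 7),
    (10, 40, "value", 10),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table ("
    "id INTEGER PRIMARY KEY, some_id INTEGER, some_column TEXT, another_id INTEGER)"
)
conn.executemany("INSERT INTO some_table VALUES (?, ?, ?, ?)", data)

# rn counts rows inside each some_id group, lowest another_id first;
# keeping rn <= per_group caps every group before the global LIMIT applies.
query = """
SELECT id, some_id, some_column, another_id
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY some_id ORDER BY another_id) AS rn
    FROM some_table AS s
    WHERE some_column = 'value'
) AS ranked
WHERE rn <= :per_group
ORDER BY another_id
LIMIT :total
"""
result = conn.execute(query, {"per_group": 2, "total": 6}).fetchall()

for row in result:
    print(row)
```

With the sample data as listed, this returns ids 6, 3, 4, 7, 2 and 10, in that order.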

Listing results with unique column

I want to list the top 6 race records with unique holders only, meaning a holder who is already in the list shouldn't be listed again with another of their records. I currently use the query below to list the top 6 times.
mysql> select * from racerecords order by record_time asc, date asc;
+----+---------+------------+-------------+---------------------+----------+
| id | race_id | holder | record_time | date | position |
+----+---------+------------+-------------+---------------------+----------+
| 2 | 10 | Stav | 15 | 2014-08-11 19:43:49 | 1 |
| 1 | 10 | Jennifer | 15 | 2014-08-13 19:43:19 | 1 |
| 4 | 10 | Jennifer | 16 | 2014-08-02 19:44:27 | 1 |
| 5 | 10 | Osman | 17 | 2014-08-04 19:44:57 | 1 |
| 7 | 10 | Gokhan | 18 | 2014-08-15 19:45:37 | 1 |
| 3 | 10 | MotherLode | 25 | 2014-08-01 19:44:11 | 1 |
+----+---------+------------+-------------+---------------------+----------+
6 rows in set (0.00 sec)
As you can see, the holder "Jennifer" is listed twice. I want MySQL to skip her once she is in the list. The result I want is:
+----+---------+------------+-------------+---------------------+----------+
| id | race_id | holder | record_time | date | position |
+----+---------+------------+-------------+---------------------+----------+
| 2 | 10 | Stav | 15 | 2014-08-11 19:43:49 | 1 |
| 1 | 10 | Jennifer | 15 | 2014-08-13 19:43:19 | 1 |
| 5 | 10 | Osman | 17 | 2014-08-04 19:44:57 | 1 |
| 7 | 10 | Gokhan | 18 | 2014-08-15 19:45:37 | 1 |
| 3 | 10 | MotherLode | 25 | 2014-08-01 19:44:11 | 1 |
+----+---------+------------+-------------+---------------------+----------+
I tried everything. GROUP BY holder generates wrong results: it picks the very first record of each holder, even when that is not the best one. In this table it happens to produce the output above only because id 1 is the first record I inserted for Jennifer.
How can I generate a result like the one above?
The desired result can be achieved with this query, but it is performance-intensive. I have reproduced the result in SQL Fiddle: http://sqlfiddle.com/#!2/f8ee7/3
select * from racerecords
where
(HOLDER, RECORD_TIME) in (
select HOLDER,min(RECORD_TIME) from racerecords
group by HOLDER)
It seems you forgot to include the WHERE clause in the subquery. Try this:
select * from racerecords
where
(HOLDER, RECORD_TIME) in (
select HOLDER,min(RECORD_TIME) from racerecords where race_id =17
group by HOLDER )
And race_id =17
Order by RECORD_TIME
You should use the DISTINCT clause:
SELECT DISTINCT column_name,column_name
FROM table_name;
See http://www.w3schools.com/sql/sql_distinct.asp
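Another way to keep only each holder's best record is to number the records per holder and keep number 1. This is a sketch assuming MySQL 8.0 or later for ROW_NUMBER(); it is executed against SQLite via Python's sqlite3 module only as a stand-in, with the six sample rows from the question and race_id 10 as shown there. The tie-break on date (earlier record wins) is an assumption.

```python
import sqlite3

# Sample rows from the question: (id, race_id, holder, record_time, date, position)
records = [
    (2, 10, "Stav", 15, "2014-08-11 19:43:49", 1),
    (1, 10, "Jennifer", 15, "2014-08-13 19:43:19", 1),
    (4, 10, "Jennifer", 16, "2014-08-02 19:44:27", 1),
    (5, 10, "Osman", 17, "2014-08-04 19:44:57", 1),
    (7, 10, "Gokhan", 18, "2014-08-15 19:45:37", 1),
    (3, 10, "MotherLode", 25, "2014-08-01 19:44:11", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE racerecords (id INTEGER PRIMARY KEY, race_id INTEGER, "
    "holder TEXT, record_time INTEGER, date TEXT, position INTEGER)"
)
conn.executemany("INSERT INTO racerecords VALUES (?, ?, ?, ?, ?, ?)", records)

# rn = 1 marks each holder's fastest record (earliest date on equal times).
query = """
SELECT id, holder, record_time, date
FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (PARTITION BY holder ORDER BY record_time, date) AS rn
    FROM racerecords AS r
    WHERE race_id = 10
) AS best
WHERE rn = 1
ORDER BY record_time, date
LIMIT 6
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

Unlike the (HOLDER, RECORD_TIME) IN (...) form, this returns exactly one row per holder even if a holder has set the same best time twice.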

How to get this specific user rankings query in mysql?

I have a table tbl_items in my user database, and I want to rank users by their quantity of a particular item with a certain id (514). My dev environment has this test data:
mysql> select * from tbl_items where classid=514;
+---------+---------+----------+
| ownerId | classId | quantity |
+---------+---------+----------+
| 1 | 514 | 3 |
| 2 | 514 | 5 |
| 3 | 514 | 11 |
| 4 | 514 | 46 |
| 5 | 514 | 57 |
| 6 | 514 | 6 |
| 7 | 514 | 3 |
| 8 | 514 | 27 |
| 10 | 514 | 2 |
| 11 | 514 | 73 |
| 12 | 514 | 18 |
| 13 | 514 | 31 |
+---------+---------+----------+
12 rows in set (0.00 sec)
so far so good :) I wrote the following query:
set @row=0;
select a.*, @row:=@row+1 as rank
from (select a.ownerid,a.quantity from tbl_items a
where a.classid=514) a order by quantity desc;
+---------+----------+------+
| ownerid | quantity | rank |
+---------+----------+------+
| 11 | 73 | 1 |
| 5 | 57 | 2 |
| 4 | 46 | 3 |
| 13 | 31 | 4 |
| 8 | 27 | 5 |
| 12 | 18 | 6 |
| 3 | 11 | 7 |
| 6 | 6 | 8 |
| 2 | 5 | 9 |
| 7 | 3 | 10 |
| 1 | 3 | 11 |
| 10 | 2 | 12 |
+---------+----------+------+
12 rows in set (0.00 sec)
That ranks the users correctly. However, in a table with lots of records, I need to do the following:
1) Be able to get a small portion of the list around the user's own ranking, i.e. the surrounding records, while preserving the overall rank:
I tried to do this by setting a user variable to the ranking of the current user and using offset and limit, but couldn't preserve the overall ranking.
This should get me something like the following (for instance ownerId=2 with a surroundings limit of 5):
+---------+----------+------+
| ownerid | quantity | rank |
+---------+----------+------+
| 3 | 11 | 7 |
| 6 | 6 | 8 |
| 2 | 5 | 9 | --> ownerId=2
| 7 | 3 | 10 |
| 1 | 3 | 11 |
+---------+----------+------+
5 rows in set (0.00 sec)
2) I'd also need another query (preferably a single query) that gets me the top 3 places plus the ranking of a particular user with a certain id, no matter whether he's among the top 3 or not. I couldn't get this to work either.
It would look like the following (for instance ownerId=2 again):
+---------+----------+------+
| ownerid | quantity | rank |
+---------+----------+------+
| 11 | 73 | 1 |
| 5 | 57 | 2 |
| 4 | 46 | 3 |
| 2 | 5 | 9 | --> ownerId=2
+---------+----------+------+
4 rows in set (0.00 sec)
I'm also a bit concerned about the performance of the queries on a table with millions of records...
Hope someone helps :)
1) 5 entries around a given id.
set @row=0;
set @rk2=-1;
set @id=2;
select b.* from (
    select a.*, @row:=@row+1 as rank, if(a.ownerid=@id, @rk2:=@row, -1) as rank2
    from (
        select a.ownerid,a.quantity
        from tbl_items a
        where a.classid=514) a
    order by quantity desc) b
where b.rank > @rk2 - 3
limit 5;
You'll get an extra column rank2, though: you probably want to filter it out with an explicit list of columns instead of b.*. It may be possible with a HAVING clause rather than an extra level of nesting.
2) 3 top ranked entries + 1 specific id
set @row=0;
select b.* from (
    select a.*, @row:=@row+1 as rank
    from (
        select a.ownerid,a.quantity
        from tbl_items a
        where a.classid=514) a
    order by quantity desc) b
where b.rank < 4 or b.ownerid=@id;
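Both requests can also be sketched with a single ranked CTE, assuming MySQL 8.0 or later (CTEs and ROW_NUMBER()). The SQL below is run against SQLite through Python's sqlite3 module purely as a stand-in. The alias rnk and the tie-break ownerId DESC are assumptions; the tie-break was chosen only so that the two users with quantity 3 come out in the same order as in the question's listing.

```python
import sqlite3

# Sample rows from the question: (ownerId, quantity), all with classId 514
items = [(1, 3), (2, 5), (3, 11), (4, 46), (5, 57), (6, 6),
         (7, 3), (8, 27), (10, 2), (11, 73), (12, 18), (13, 31)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_items (ownerId INTEGER, classId INTEGER, quantity INTEGER)"
)
conn.executemany("INSERT INTO tbl_items VALUES (?, 514, ?)", items)

# Shared prefix: the overall ranking for one classId.
ranked_cte = """
WITH ranked AS (
    SELECT ownerId, quantity,
           ROW_NUMBER() OVER (ORDER BY quantity DESC, ownerId DESC) AS rnk
    FROM tbl_items
    WHERE classId = 514
)
"""

# 1) The owner's row plus two neighbours on each side, overall rank preserved.
around_query = ranked_cte + """
SELECT ownerId, quantity, rnk
FROM ranked
WHERE rnk BETWEEN (SELECT rnk FROM ranked WHERE ownerId = :owner) - 2
              AND (SELECT rnk FROM ranked WHERE ownerId = :owner) + 2
ORDER BY rnk
"""

# 2) The top three places plus the owner's own row.
top_query = ranked_cte + """
SELECT ownerId, quantity, rnk
FROM ranked
WHERE rnk <= 3 OR ownerId = :owner
ORDER BY rnk
"""

around = conn.execute(around_query, {"owner": 2}).fetchall()
top_plus_owner = conn.execute(top_query, {"owner": 2}).fetchall()

print(around)
print(top_plus_owner)
```

Neither query depends on session variables, so there is nothing to reset between runs. How this performs on millions of rows is not measured here.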

Top 'n' results for each keyword

I have a query to get the top 'n' users who commented on a specific keyword,
SELECT `user` , COUNT( * ) AS magnitude
FROM `results`
WHERE `keyword` = "economy"
GROUP BY `user`
ORDER BY magnitude DESC
LIMIT 5
I have approx 6000 keywords, and would like to run this query to get me the top 'n' users for each and every keyword we have data for. Assistance appreciated.
Since you haven't given the schema for results, I'll assume it's this or very similar (maybe extra columns):
create table results (
id int primary key,
user int,
foreign key (user) references <some_other_table>(id),
keyword varchar(<30>)
);
Step 1: aggregate by keyword/user as in your example query, but for all keywords:
create view user_keyword as (
select
keyword,
user,
count(*) as magnitude
from results
group by keyword, user
);
Step 2: rank each user within each keyword group (note the use of the subquery to rank the rows):
create view keyword_user_ranked as (
select
keyword,
user,
magnitude,
(select count(*)
from user_keyword
where l.keyword = keyword and magnitude >= l.magnitude
) as rank
from
user_keyword l
);
Step 3: select only the rows where the rank is less than some number:
select *
from keyword_user_ranked
where rank <= 3;
Example:
Base data used:
mysql> select * from results;
+----+------+---------+
| id | user | keyword |
+----+------+---------+
| 1 | 1 | mysql |
| 2 | 1 | mysql |
| 3 | 2 | mysql |
| 4 | 1 | query |
| 5 | 2 | query |
| 6 | 2 | query |
| 7 | 2 | query |
| 8 | 1 | table |
| 9 | 2 | table |
| 10 | 1 | table |
| 11 | 3 | table |
| 12 | 3 | mysql |
| 13 | 3 | query |
| 14 | 2 | mysql |
| 15 | 1 | mysql |
| 16 | 1 | mysql |
| 17 | 3 | query |
| 18 | 4 | mysql |
| 19 | 4 | mysql |
| 20 | 5 | mysql |
+----+------+---------+
Grouped by keyword and user:
mysql> select * from user_keyword order by keyword, magnitude desc;
+---------+------+-----------+
| keyword | user | magnitude |
+---------+------+-----------+
| mysql | 1 | 4 |
| mysql | 2 | 2 |
| mysql | 4 | 2 |
| mysql | 3 | 1 |
| mysql | 5 | 1 |
| query | 2 | 3 |
| query | 3 | 2 |
| query | 1 | 1 |
| table | 1 | 2 |
| table | 2 | 1 |
| table | 3 | 1 |
+---------+------+-----------+
Users ranked within keywords:
mysql> select * from keyword_user_ranked order by keyword, rank asc;
+---------+------+-----------+------+
| keyword | user | magnitude | rank |
+---------+------+-----------+------+
| mysql | 1 | 4 | 1 |
| mysql | 2 | 2 | 3 |
| mysql | 4 | 2 | 3 |
| mysql | 3 | 1 | 5 |
| mysql | 5 | 1 | 5 |
| query | 2 | 3 | 1 |
| query | 3 | 2 | 2 |
| query | 1 | 1 | 3 |
| table | 1 | 2 | 1 |
| table | 3 | 1 | 3 |
| table | 2 | 1 | 3 |
+---------+------+-----------+------+
Only top 2 from each keyword:
mysql> select * from keyword_user_ranked where rank <= 2 order by keyword, rank asc;
+---------+------+-----------+------+
| keyword | user | magnitude | rank |
+---------+------+-----------+------+
| mysql | 1 | 4 | 1 |
| query | 2 | 3 | 1 |
| query | 3 | 2 | 2 |
| table | 1 | 2 | 1 |
+---------+------+-----------+------+
Note that when there are ties -- see users 2 and 4 for keyword "mysql" in the examples -- all parties in the tie get the "last" rank, i.e. if the 2nd and 3rd are tied, both are assigned rank 3.
Performance: adding an index to the keyword and user columns will help. I have a table being queried in a similar way with 4000 and 1300 distinct values for the two columns (in a 600000-row table). You can add the index like this:
alter table results add index keyword_user (keyword, user);
In my case, query time dropped from about 6 seconds to about 2 seconds.
You can use a pattern like this (from Within-group quotas (Top N per group)):
SELECT tmp.ID, tmp.entrydate
FROM (
SELECT
ID, entrydate,
IF( @prev <> ID, @rownum := 1, @rownum := @rownum+1 ) AS rank,
@prev := ID
FROM test t
JOIN (SELECT @rownum := NULL, @prev := 0) AS r
ORDER BY t.ID
) AS tmp
WHERE tmp.rank <= 2
ORDER BY ID, entrydate;
+------+------------+
| ID | entrydate |
+------+------------+
| 1 | 2007-05-01 |
| 1 | 2007-05-02 |
| 2 | 2007-06-03 |
| 2 | 2007-06-04 |
| 3 | 2007-07-01 |
| 3 | 2007-07-02 |
+------+------------+
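For the original results table, the aggregate-then-rank steps can be sketched with window functions, assuming MySQL 8.0 or later; the SQL here runs against SQLite via Python's sqlite3 module only as a stand-in. The data is the 20-row example from the first answer. Note one deliberate difference: ROW_NUMBER() with a tie-break on user (an assumption) returns exactly n rows per keyword, whereas the view-based ranking above gives tied users the same "last" rank.

```python
import sqlite3

# Example rows from the first answer: (id, user, keyword)
results = [
    (1, 1, "mysql"), (2, 1, "mysql"), (3, 2, "mysql"), (4, 1, "query"),
    (5, 2, "query"), (6, 2, "query"), (7, 2, "query"), (8, 1, "table"),
    (9, 2, "table"), (10, 1, "table"), (11, 3, "table"), (12, 3, "mysql"),
    (13, 3, "query"), (14, 2, "mysql"), (15, 1, "mysql"), (16, 1, "mysql"),
    (17, 3, "query"), (18, 4, "mysql"), (19, 4, "mysql"), (20, 5, "mysql"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, user INTEGER, keyword TEXT)")
conn.executemany("INSERT INTO results VALUES (?, ?, ?)", results)

# counts: comments per (keyword, user); ranked: position within each keyword.
query = """
WITH counts AS (
    SELECT keyword, user, COUNT(*) AS magnitude
    FROM results
    GROUP BY keyword, user
),
ranked AS (
    SELECT keyword, user, magnitude,
           ROW_NUMBER() OVER (PARTITION BY keyword
                              ORDER BY magnitude DESC, user) AS rn
    FROM counts
)
SELECT keyword, user, magnitude, rn
FROM ranked
WHERE rn <= :n
ORDER BY keyword, rn
"""
top = conn.execute(query, {"n": 2}).fetchall()

for row in top:
    print(row)
```

Swapping ROW_NUMBER() for RANK() would keep all tied users at the cut-off instead of breaking the tie by user id.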