Get the most frequently occurring value in the table - MySQL

I have the following tables:
1. user
+----+-------------------+
| id | email |
+----+-------------------+
| 2 | user1@example.com |
| 3 | user2@example.com |
| 1 | user3@example.com |
+----+-------------------+
2. answer
+----+---------+-------------+-----------+---------------------+
| id | user_id | question_id | option_id | created |
+----+---------+-------------+-----------+---------------------+
| 1 | 2 | 1 | 5 | 2015-12-19 15:15:07 |
| 2 | 2 | 1 | 5 | 2015-12-19 15:16:05 |
| 3 | 2 | 2 | 3 | 2015-12-19 15:16:06 |
| 4 | 2 | 3 | 3 | 2015-12-19 15:16:08 |
| 5 | 2 | 1 | 1 | 2015-12-19 15:32:46 |
| 6 | 2 | 1 | 4 | 2015-12-19 15:39:22 |
| 7 | 2 | 1 | 2 | 2015-12-19 15:39:23 |
| 8 | 2 | 1 | 2 | 2015-12-19 15:40:38 |
| 9 | 2 | 1 | 1 | 2015-12-19 15:41:25 |
+----+---------+-------------+-----------+---------------------+
I want to fetch the option_id with the most occurrences, grouped by user, with the following condition:
If two or more option_ids tie for the maximum number of occurrences, return the one from the last record.
In the answer table above, four option_ids tie for the maximum number of occurrences. In this case I want the last one in the list to be returned, which is option_id 1.
Here is the query I used to achieve this:
SELECT
option_id,
COUNT(option_id) as occurence
FROM
answer
GROUP BY
option_id
ORDER BY
occurence DESC LIMIT 1;
This works; however, when I add a WHERE condition, it gives me option_id 5, whereas I expect option_id 1:
SELECT
option_id,
COUNT(option_id) as occurence
FROM
answer
WHERE
user_id = 2
GROUP BY
option_id
ORDER BY
occurence DESC LIMIT 1;
What am I missing here?
Note: This is a follow-up question from this link; I am re-posting here to give a simplified version of the same question.

I did something similar on my tables, and that's my solution:
SELECT option_id
FROM answer
WHERE id = (SELECT MAX(id)
FROM answer
WHERE user_id = 2
GROUP BY option_id
ORDER BY COUNT(option_id) DESC, MAX(id) DESC
LIMIT 1)
Or see what happens if you do ORDER BY occurence DESC, MAX(id) DESC in your query.

If it's really only the option_id you are interested in, you can simply add MAX(created) DESC to your ORDER BY:
SELECT
option_id,
COUNT(option_id) as occurence
FROM
answer
WHERE
user_id = 2
GROUP BY
option_id
ORDER BY
occurence DESC, MAX(created) DESC LIMIT 1;
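A quick way to check this tie-break is to load the question's rows into an in-memory database and run the query. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained). The table and column names come from the question; the alias spelling `occurrence` is mine.

```python
import sqlite3

# SQLite stands in for MySQL here (assumption); rows are copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE answer (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "question_id INTEGER, option_id INTEGER, created TEXT)"
)
conn.executemany(
    "INSERT INTO answer VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2, 1, 5, "2015-12-19 15:15:07"),
        (2, 2, 1, 5, "2015-12-19 15:16:05"),
        (3, 2, 2, 3, "2015-12-19 15:16:06"),
        (4, 2, 3, 3, "2015-12-19 15:16:08"),
        (5, 2, 1, 1, "2015-12-19 15:32:46"),
        (6, 2, 1, 4, "2015-12-19 15:39:22"),
        (7, 2, 1, 2, "2015-12-19 15:39:23"),
        (8, 2, 1, 2, "2015-12-19 15:40:38"),
        (9, 2, 1, 1, "2015-12-19 15:41:25"),
    ],
)

# Highest count first; among ties, the option whose latest answer is most recent.
top = conn.execute(
    """
    SELECT option_id, COUNT(option_id) AS occurrence
    FROM answer
    WHERE user_id = 2
    GROUP BY option_id
    ORDER BY occurrence DESC, MAX(created) DESC
    LIMIT 1
    """
).fetchone()
print(top)  # (1, 2): option_id 1, seen twice
```

Without the second sort key, the order among the four tied options is left to the database, which is why the original query could return option_id 5.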

Step by step:
1. Count option_id occurrences (group by option_id and count).
2. Get the maximum count from the figures above (with max, or with order by and limit 1).
3. Get all option_ids with this count.
4. Get all records with one of those option_ids.
5. Keep only the last of the records found (order by created desc, limit 1).
Query:
select *
from answer
where option_id in
(
select option_id
from answer
group by option_id
having count(*) =
(
select count(*) as cnt
from answer
group by option_id
order by count(*) desc limit 1
)
)
order by created desc limit 1;
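The same five steps can be traced outside the database. The following is only an illustrative sketch in plain Python over the question's rows; the function name `last_most_frequent` and the tuple layout are hypothetical.

```python
from collections import Counter

# (id, option_id, created) copied from the question's answer table.
rows = [
    (1, 5, "2015-12-19 15:15:07"),
    (2, 5, "2015-12-19 15:16:05"),
    (3, 3, "2015-12-19 15:16:06"),
    (4, 3, "2015-12-19 15:16:08"),
    (5, 1, "2015-12-19 15:32:46"),
    (6, 4, "2015-12-19 15:39:22"),
    (7, 2, "2015-12-19 15:39:23"),
    (8, 2, "2015-12-19 15:40:38"),
    (9, 1, "2015-12-19 15:41:25"),
]


def last_most_frequent(rows):
    """Return the latest record among the most frequent option_ids."""
    counts = Counter(option_id for _, option_id, _ in rows)      # step 1
    top = max(counts.values())                                    # step 2
    tied = {opt for opt, n in counts.items() if n == top}         # step 3
    candidates = [r for r in rows if r[1] in tied]                # step 4
    # Step 5: timestamps in this format sort correctly as strings.
    return max(candidates, key=lambda r: r[2])


result = last_most_frequent(rows)
print(result)  # (9, 1, '2015-12-19 15:41:25')
```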

Related

Random record from the table

I have a customer table with 10 columns, in which the customer id is repeated. I need to take only one record for every customer, chosen at random.
Suppose the customer table contains 10,000 records in total, but only 500 distinct customers.
So I need 500 records, one per distinct customer, picked randomly.
I am using MySQL 5.7.
Consider the following...
SELECT * FROM my_table;
+----+-------------+
| id | customer_id |
+----+-------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 3 |
| 4 | 5 |
| 5 | 3 |
| 6 | 2 |
| 7 | 1 |
| 8 | 4 |
| 9 | 5 |
| 10 | 2 |
| 11 | 3 |
| 12 | 1 |
| 13 | 4 |
+----+-------------+
SELECT id
, customer_id
FROM
( SELECT id
, customer_id
, CASE WHEN @prev=customer_id THEN @i:=@i+1 ELSE @i:=1 END i
, @prev:=customer_id
FROM
( SELECT id
, customer_id
FROM my_table
ORDER
BY customer_id
, RAND()
) x
JOIN (SELECT @prev:=null,@i:=0) vars
) n
WHERE i = 1
ORDER
BY customer_id;
-- sample output, different each time --
+----+-------------+
| id | customer_id |
+----+-------------+
| 12 | 1 |
| 10 | 2 |
| 3 | 3 |
| 8 | 4 |
| 9 | 5 |
+----+-------------+
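For experimenting with the idea of "one random row per customer" without a MySQL server, here is a small sketch using Python's sqlite3 module. It is not the user-variable query above: it swaps in a correlated subquery, and it uses SQLite's RANDOM() where MySQL would use RAND(). Both substitutions are assumptions made to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, customer_id INTEGER)")

# customer_id values for ids 1..13, copied from the sample table above.
customers = [1, 1, 3, 5, 3, 2, 1, 4, 5, 2, 3, 1, 4]
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    list(enumerate(customers, start=1)),
)

# For each distinct customer, pick one of its rows at random.
# RANDOM() is SQLite-specific; the MySQL counterpart is RAND().
picks = conn.execute(
    """
    SELECT c.customer_id,
           (SELECT t.id
              FROM my_table t
             WHERE t.customer_id = c.customer_id
             ORDER BY RANDOM()
             LIMIT 1) AS id
    FROM (SELECT DISTINCT customer_id FROM my_table) c
    ORDER BY c.customer_id
    """
).fetchall()

for customer_id, row_id in picks:
    print(customer_id, row_id)  # the chosen id differs from run to run
```

The shuffle happens only within each customer's rows, so the cost depends on how many rows each customer has rather than on one sort of the whole table.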
You do not want to ORDER BY RAND(), because it will be extremely slow for a large table: it actually has to sort all of those random records.
Instead, pick a random int less than the number of rows in the table (random_num_less_than_row_count) and do this, which is faster but not perfect:
SELECT * FROM atable LIMIT $random_num_less_than_row_count, 1
Or, if you have a primary key that is an auto_increment, you can pick a random int less than the highest id in the table (random_num_less_than_last_id) and do the following, which is pretty fast:
SELECT * FROM atable WHERE id >= $random_num_less_than_last_id ORDER BY id ASC LIMIT 1
I used >= and ORDER BY id ASC so that you still get a result if ids are missing. But if you have many large gaps, you need the slower first option above.
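A minimal sketch of the id-based variant, again using Python's sqlite3 as a stand-in for MySQL. The `payload` column and the id values with gaps are hypothetical, chosen only to show that a lookup still succeeds when ids are missing.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (id INTEGER PRIMARY KEY, payload TEXT)")

# Hypothetical rows with gaps in the id sequence (3, 4, 7 and 8 are missing).
ids = [1, 2, 5, 6, 9, 10]
conn.executemany(
    "INSERT INTO atable VALUES (?, ?)",
    [(i, f"row {i}") for i in ids],
)

max_id = conn.execute("SELECT MAX(id) FROM atable").fetchone()[0]
target = random.randint(1, max_id)  # random id, at most the highest id

# >= with ascending order returns the next existing row when target falls in a gap.
row = conn.execute(
    "SELECT id, payload FROM atable WHERE id >= ? ORDER BY id ASC LIMIT 1",
    (target,),
).fetchone()
print(target, row)
```

Note the bias this introduces: a row that follows a gap (id 5 here) is returned for every target that lands in the gap, so it is chosen more often than its neighbours.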
I'm not sure about this, but here is a beginner-level query which might get the desired result:
SELECT Distinct column FROM table
ORDER BY RAND()
LIMIT 500
PS: This code hasn't been tested on MySQL 5.7. If anyone has a better query, I'm more than happy to be corrected.

Sum of Counted records that calculated using "group by" with condition and "group by"

I'm sorry for the fuzzy title of this question.
I have 2 tables in my database and want to count the records of first_table using "group by" on a foreign key id that appears in a column of second_table (which stores ids as a comma-separated list such as "1,2,3,4,5").
id | name | fk_id
1 | john | 1
2 | mike | 1
3 | jane | 2
4 | tailor | 1
5 | jane | 3
6 | tailor | 5
7 | jane | 4
8 | tailor | 5
9 | jane | 5
10 | tailor | 5
id | name | fk_ids | s_fk_id
1 | xxx | 1,5,6 | 1
2 | yyy | 2,3 | 1
3 | zzz | 9 | 1
4 | www | 7,8 | 1
I wrote the following query, but it does not work properly and displays wrong numbers.
I want to:
1. Count the records in first_table grouped by "fk_id".
2. Sum the counts of those fk_ids that exist in "fk_ids".
3. Display the result (the sum of the related counts) grouped by id.
Note: the symbol ' in the query below stands for a backtick (`).
select sum(if(FIND_IN_SET('fk_id', 'fk_ids')>0,'count',0)) 'sum', 'count', 'from'.'fk_id', 'second_table'.* FROM 'second_table'
LEFT JOIN
(
SELECT 'fk_id', count(*) 'count'
FROM 'first_table'
group BY 'fk_id'
) AS 'from'
ON FIND_IN_SET('fk_id', 'fk_ids')>0
WHERE 'second_table'.'s_fk_id'=1
GROUP BY 'id'
ORDER by 'count' DESC
This table has many data and we have no plan to change the structure.
Edit:
Desired output:
id | name | sum
1 | xxx | 7 (3+4+0)
2 | yyy | 2 (1+1)
3 | zzz | 0 (0)
4 | www | 0 (0+0)
After two days off I came back to work and found out that the FIND_IN_SET function does not work properly when the string contains spaces.
The problem was that I had overlooked the spaces too (same as this question).
Finally, this query worked:
select sum(`count`) `sum`, `count`, `from`.`fk_id`, `second_table`.* FROM `second_table`
LEFT JOIN
(
SELECT `fk_id`, count(*) `count`
FROM `first_table`
group BY `fk_id`
) AS `from`
ON FIND_IN_SET(`fk_id`, replace(`fk_ids`,' ',''))>0
WHERE `second_table`.`s_fk_id`=1
GROUP BY `id`
ORDER by `count` DESC
And the magic is replace(fk_ids,' ','')
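To see why the spaces matter, here is a small plain-Python sketch. The helper `find_in_set` is a rough, hypothetical emulation of MySQL's FIND_IN_SET (exact match against the comma-separated items), and the spaces after the commas in `second_table` are an assumption based on the follow-up above; the table as posted shows none.

```python
from collections import Counter


def find_in_set(needle, csv):
    """Rough emulation of MySQL FIND_IN_SET: 1-based position, or 0 if absent."""
    items = csv.split(",")
    return items.index(needle) + 1 if needle in items else 0


# fk_id column of first_table, copied from the question.
first_table_fk_ids = [1, 1, 2, 1, 3, 5, 4, 5, 5, 5]
counts = Counter(str(fk) for fk in first_table_fk_ids)

# name -> fk_ids; the spaces after the commas are assumed, not shown in the post.
second_table = {"xxx": "1, 5, 6", "yyy": "2, 3", "zzz": "9", "www": "7, 8"}


def sums(strip_spaces):
    """Sum the per-fk_id counts whose fk_id appears in each row's fk_ids list."""
    result = {}
    for name, fk_ids in second_table.items():
        csv = fk_ids.replace(" ", "") if strip_spaces else fk_ids
        result[name] = sum(n for fk, n in counts.items() if find_in_set(fk, csv) > 0)
    return result


raw = sums(strip_spaces=False)
cleaned = sums(strip_spaces=True)
print(raw)      # {'xxx': 3, 'yyy': 1, 'zzz': 0, 'www': 0}
print(cleaned)  # {'xxx': 7, 'yyy': 2, 'zzz': 0, 'www': 0}
```

With the spaces left in, " 5" is not equal to "5", so only the first item of each list ever matches. Stripping them reproduces the desired output.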

MySQL - Return last row after using GROUP BY

Here is a log of user activity on my project, who have each "voted" on various items, giving each item either a "1", "2" or "3" rating.
rec_id | user_id | item_id | value
-----------------------------------
1 | 1 | 2 | 3
2 | 1 | 2 | 2
3 | 2 | 1 | 1
4 | 3 | 1 | 1
5 | 3 | 2 | 2
6 | 1 | 2 | 1
7 | 1 | 4 | 2
I'm trying to return all the item_id's user_id "1" has voted on, and the last value they gave each item. So, my goal is to return the following rows from the full table above:
rec_id | user_id | item_id | value
-----------------------------------
6 | 1 | 2 | 1
7 | 1 | 4 | 2
In the first example, user_id "1" has voted on item_id "2" three times, so I want to ignore the previous instances in which user 1 has voted on it.
Here is my statement so far, but this returns "3" for the rating of item_id 2, when it should be "1":
SELECT MAX(rec_id), user_id, item_id, value
FROM logs
WHERE user_id=1
GROUP BY user_id, item_id
What do I need to add to reach my goal?
Basically, you just need a subquery that restricts rec_id to the maximum rec_id for each item.
QUERY:
SELECT
rec_id, user_id, item_id, value
FROM logs
WHERE user_id = 1
AND rec_id IN
( SELECT
MAX(rec_id)
FROM logs
WHERE user_id = 1
GROUP BY item_id
)
DEMO
OUTPUT:
+-------+---------+----------+-------+
|rec_id | user_id | item_id | value |
+-------+---------+----------+-------+
| 6 | 1 | 2 | 1 |
| 7 | 1 | 4 | 2 |
+-------+---------+----------+-------+
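The sketch below runs a variant of this query against the question's data, using Python's sqlite3 as a stand-in for MySQL (an assumption for self-containment). Row 8 is a hypothetical addition, not part of the question: it shows what happens when the subquery is not limited to the same user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (rec_id INTEGER PRIMARY KEY, user_id INTEGER, "
    "item_id INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO logs VALUES (?, ?, ?, ?)",
    [
        (1, 1, 2, 3), (2, 1, 2, 2), (3, 2, 1, 1), (4, 3, 1, 1),
        (5, 3, 2, 2), (6, 1, 2, 1), (7, 1, 4, 2),
        (8, 2, 2, 3),  # hypothetical extra row: user 2 votes on item 2 later
    ],
)

QUERY = """
    SELECT rec_id, user_id, item_id, value
    FROM logs
    WHERE user_id = 1
      AND rec_id IN (SELECT MAX(rec_id) FROM logs {where} GROUP BY item_id)
    ORDER BY rec_id
"""

# Subquery limited to user 1: the latest vote of user 1 per item.
latest = conn.execute(QUERY.format(where="WHERE user_id = 1")).fetchall()
# Subquery over all users: user 2's later vote hides user 1's row for item 2.
unfiltered = conn.execute(QUERY.format(where="")).fetchall()

print(latest)      # [(6, 1, 2, 1), (7, 1, 4, 2)]
print(unfiltered)  # [(7, 1, 4, 2)]
```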
To get the last row, you would usually use a combination of ORDER BY and LIMIT 1.
In your case I would use two separate queries, though. But I would first restructure my database to avoid redundancy and duplicates.

Select the latest one of each result in MySQL

Say I have a table similar to this, but with more columns and more rows (these are the only relevant ones):
+-------+----+
| name | id |
+-------+----+
| james | 1 |
| james | 2 |
| james | 3 |
| adam | 4 |
| max | 5 |
| adam | 6 |
| max | 7 |
| adam | 8 |
+-------+----+
How could I get it so that it would only show the max(id) from each name like:
+-------+----+
| name | id |
+-------+----+
| adam | 8 |
| max | 7 |
| james | 3 |
+-------+----+
I currently just have this
"select * from table order by id desc"
but this just shows the latest ids. I only want to see one row for each name.
So basically, show only the highest id for each name.
You would use aggregation and max():
select name, max(id)
from table t
group by name
order by max(id) desc
limit 40;
EDIT:
If you need select * with the highest id, then use the not exists approach:
select *
from table t
where not exists (select 1 from table t2 where t2.name = t.name and t2.id > t.id)
order by id desc
limit 40;
The "not exists" essentially says: "Get me all rows in the table where there is no other row with the same name and a higher id". That is a round-about way of getting the maximum row.
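A self-contained check of the NOT EXISTS approach, using Python's sqlite3 in place of MySQL (an assumption). The table name `people` is hypothetical, since the question only calls it "table".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("james", 1), ("james", 2), ("james", 3), ("adam", 4),
        ("max", 5), ("adam", 6), ("max", 7), ("adam", 8),
    ],
)

# Keep a row only if no other row has the same name and a higher id.
rows = conn.execute(
    """
    SELECT name, id
    FROM people AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM people AS t2
        WHERE t2.name = t.name AND t2.id > t.id
    )
    ORDER BY id DESC
    """
).fetchall()
print(rows)  # [('adam', 8), ('max', 7), ('james', 3)]
```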
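One way to achieve this is to leverage a non-standard GROUP BY extension in MySQL. Note that MySQL does not guarantee which row of each group is returned, and the query is rejected when ONLY_FULL_GROUP_BY is enabled (the default since MySQL 5.7.5):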
SELECT *
FROM
(
SELECT *
FROM table1
ORDER BY id DESC
) q
GROUP BY name
-- LIMIT 40
or another way is to grab a max id per name first and then join back to your table to fetch all other columns
SELECT t.*
FROM
(
SELECT MAX(id) id
FROM table1
GROUP BY name
-- LIMIT 40
) q JOIN table1 t
ON q.id = t.id
ORDER BY name;
Output:
| NAME | ID |
|-------|----|
| adam | 8 |
| james | 3 |
| max | 7 |
Here is SQLFiddle demo

how to find duplicates and gaps in this scenario in mysql

Hi, I have a table that looks like this:
-----------------------------------------------------------
| id | group_id | source_id | target_id | sortsequence |
-----------------------------------------------------------
| 2 | 1 | 2 | 4 | 1 |
-----------------------------------------------------------
| 4 | 1 | 20 | 2 | 1 |
-----------------------------------------------------------
| 5 | 1 | 2 | 14 | 1 |
-----------------------------------------------------------
| 7 | 1 | 2 | 7 | 3 |
-----------------------------------------------------------
| 20 | 2 | 20 | 4 | 3 |
-----------------------------------------------------------
| 21 | 2 | 20 | 4 | 1 |
-----------------------------------------------------------
Scenario
There are two scenarios that need to be handled.
1. The sortsequence value should be unique for a given source_id and group_id. For example, all the records having group_id = 1 AND source_id = 2 should have unique sortsequence values. In the example above, the records with id = 2 and 5, which have group_id = 1 and source_id = 2, share the same sortsequence, which is 1. These are faulty records, and I need to find them.
2. If group_id and source_id are the same, the sortsequence values should be continuous, with no gaps. For example, in the table above the records with id = 20 and 21 have the same group_id and source_id, and their sortsequence values are 3 and 1. Although these are unique, there is a gap in the sortsequence values. I need to find these records as well.
My effort so far
I have written a query:
SELECT source_id,`group_id`,GROUP_CONCAT(id) AS children
FROM
table
GROUP BY source_id,
sortsequence,
`group_id`
HAVING COUNT(*) > 1
This query only addresses scenario 1. How do I handle scenario 2? Is there a way to do it in the same query, or do I have to write another one for the second scenario?
By the way, the query will be dealing with millions of records in the table, so performance must be very good.
Got the answer from Tere J's comments. The following query covers both of the criteria mentioned above.
SELECT
source_id, `group_id`, GROUP_CONCAT(id) AS faultyIDS
FROM
table
GROUP BY
source_id,group_id
HAVING
COUNT(DISTINCT sortsequence) <> COUNT(sortsequence) OR COUNT(sortsequence) <> MAX(sortsequence) OR MIN(sortsequence) <> 1
Maybe it can help others.
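To see how the three HAVING conditions behave on the sample rows, here is a sketch using Python's sqlite3 as a stand-in for MySQL (an assumption). The table name `links` is hypothetical, since the question only calls it "table".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE links (id INTEGER PRIMARY KEY, group_id INTEGER, "
    "source_id INTEGER, target_id INTEGER, sortsequence INTEGER)"
)
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?, ?, ?)",
    [
        (2, 1, 2, 4, 1),
        (4, 1, 20, 2, 1),
        (5, 1, 2, 14, 1),
        (7, 1, 2, 7, 3),
        (20, 2, 20, 4, 3),
        (21, 2, 20, 4, 1),
    ],
)

# A (source_id, group_id) pair is faulty if it has duplicate sequence numbers,
# if its highest number differs from its row count (a gap), or if it does not start at 1.
faulty = conn.execute(
    """
    SELECT source_id, group_id, GROUP_CONCAT(id) AS faulty_ids
    FROM links
    GROUP BY source_id, group_id
    HAVING COUNT(DISTINCT sortsequence) <> COUNT(sortsequence)
        OR COUNT(sortsequence) <> MAX(sortsequence)
        OR MIN(sortsequence) <> 1
    ORDER BY source_id, group_id
    """
).fetchall()

for source_id, group_id, faulty_ids in faulty:
    print(source_id, group_id, faulty_ids)
```

On this data the pair (source_id 2, group_id 1) is flagged for the duplicate value 1, and (source_id 20, group_id 2) for the gap between 1 and 3. The pair (source_id 20, group_id 1), with a single row numbered 1, passes all three checks. The query reports every id of a faulty pair, not only the offending rows.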
Try this query; it should handle both of the cases you mention in the question.
SELECT
a.*
FROM
tbl a
INNER JOIN
(select
@rn:=IF(@prevG = group_id AND @prevS = source_id, @rn + 1, 1) As rId,
@prevG:=group_id AS group_id,
@prevS:=source_id AS source_id,
id,
sortsequence
FROM
tbl
join
(select @rn:=0, @prevS:=0, @prevG:=0)b
order by group_id, source_id, id) b
ON a.id = b.id AND a.SORTSEQUENCE <> b.RID;
FIDDLE