How to select one random row from each group in MySQL [duplicate]

I have a database with an Items table that looks something like this:
id
name
category (int)
There are several hundred thousand records. Each item can be in one of 7 different categories, which correspond to a categories table:
id
category
I want a query that chooses 1 random item from each category. What's the best way of approaching that? I know to use ORDER BY RAND() and LIMIT 1 for similar random queries, but I've never done something like this.

This query returns all items joined to categories in random order:
SELECT
c.id AS cid, c.category, i.id AS iid, i.name
FROM categories c
INNER JOIN items i ON c.id = i.category
ORDER BY RAND()
To restrict each category to one, wrap the query in a partial GROUP BY:
SELECT * FROM (
SELECT
c.id AS cid, c.category, i.id AS iid, i.name
FROM categories c
INNER JOIN items i ON c.id = i.category
ORDER BY RAND()
) AS shuffled_items
GROUP BY cid
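Note that when a query has both GROUP BY and ORDER BY clauses, the grouping is performed before the sorting. This is why I have used two queries: the inner one shuffles the rows, the outer one groups them. Be aware that this relies on MySQL's permissive GROUP BY handling: with ONLY_FULL_GROUP_BY enabled (the default since MySQL 5.7.5) the outer query is rejected, and the server is not obliged to honor the ORDER BY of a derived table when it picks the row for each group.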
I understand that this query isn't going to win any race. I am open to suggestions.
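For comparison, here is a minimal sketch of the same "one random row per category" idea using a window function instead of the shuffle-then-group trick. It is not the query from this answer: it assumes a server with window functions (MySQL 8.0+, where RANDOM() would be spelled RAND()), and it runs against an in-memory SQLite database (3.25 or newer) through Python's sqlite3 module purely so that it can be tried anywhere. The sample rows and the category names are made up for illustration.

```python
import sqlite3


def one_random_item_per_category(conn):
    """Return one randomly chosen (category_id, item_id, item_name) per category."""
    # ROW_NUMBER() restarts at 1 inside each category. Because every partition
    # is ordered randomly, the row numbered 1 is a random item of that category.
    # SQLite spells the function RANDOM(); MySQL 8.0+ would use RAND().
    sql = """
        SELECT cid, iid, name
        FROM (
            SELECT c.id AS cid, i.id AS iid, i.name AS name,
                   ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY RANDOM()) AS rn
            FROM categories c
            INNER JOIN items i ON c.id = i.category
        ) AS ranked
        WHERE rn = 1
        ORDER BY cid
    """
    return conn.execute(sql).fetchall()


# Hypothetical sample data: three categories with three items each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, category INTEGER);
    INSERT INTO categories VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO items VALUES
        (1, 'A', 1), (2, 'B', 1), (3, 'C', 1),
        (4, 'D', 2), (5, 'E', 2), (6, 'F', 2),
        (7, 'G', 3), (8, 'H', 3), (9, 'I', 3);
""")

# Prints three rows, one per category; the chosen items vary between runs.
print(one_random_item_per_category(conn))
```

Unlike the GROUP BY version, this does not depend on which row the server happens to keep for each group, at the cost of still sorting every joined row.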

Here is a simple solution. Suppose you have this table:
id name category
1 A 1
2 B 1
3 C 1
4 D 2
5 E 2
6 F 2
7 G 3
8 H 3
9 I 3
Use this query
select
c.id,
c.category,
(select name from category where category = c.category group by id order by rand() limit 1) as CatName
from category as c
group by category
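The correlated-subquery idea above can be sketched as follows. This is an illustration under assumptions, not the exact query from the answer: the table is called items here (the answer calls it category), the distinct category list is built with SELECT DISTINCT rather than GROUP BY, and it runs on SQLite through Python's sqlite3 module, where RAND() is spelled RANDOM().

```python
import sqlite3


def random_name_per_category(conn):
    """Return (category, random item name) pairs, one per category."""
    # For every distinct category, the scalar subquery shuffles only the
    # rows of that category and keeps the first one.
    sql = """
        SELECT c.category,
               (SELECT i.name
                FROM items AS i
                WHERE i.category = c.category
                ORDER BY RANDOM()
                LIMIT 1) AS cat_name
        FROM (SELECT DISTINCT category FROM items) AS c
        ORDER BY c.category
    """
    return conn.execute(sql).fetchall()


# Same sample rows as the table shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, category INTEGER);
    INSERT INTO items VALUES
        (1, 'A', 1), (2, 'B', 1), (3, 'C', 1),
        (4, 'D', 2), (5, 'E', 2), (6, 'F', 2),
        (7, 'G', 3), (8, 'H', 3), (9, 'I', 3);
""")

# One name per category; which name is picked changes from run to run.
print(random_name_per_category(conn))
```

With only 7 categories the subquery runs 7 times, and an index on the category column lets each run touch only that category's rows.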

Try this
SELECT id, name, category from Items where
(
select count(*) from Items i where i.category = Items.category
GROUP BY i.category ORDER BY rand()
) <= 1
REF: http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

Change order of the original table (random order), before final select:
select * from
(select category, id, name from categories order by rand()) as tab
group by 1

Please note: in the following example I am assuming your table is named "items" not "Items" because you also said the other table was named "categories" (second table name not capitalized).
The SQL for what you want to do would roughly be:
SELECT items.id AS item_id,
items.name AS item_name,
items.category AS item_category_id,
categories.id AS category_id,
categories.category AS category_name
FROM items, categories
WHERE items.category = categories.id
ORDER BY rand()
LIMIT 1

Related

Couchbase/N1QL: SELECT FROM list of values provided by parameter

As a follow-up to Get top rows by "category" from a collection, I still want to get the top 5 products per categoryId, but I want to provide a pre-selected list of categoryIds that are relevant to me.
Starting with vsr's answer from the original question, I could do something like:
SELECT u.*
FROM (SELECT DISTINCT RAW p.categoryId
FROM products AS p
WHERE p.categoryId IN $categoryIds) AS c
UNNEST (SELECT p1.*
FROM products AS p1
WHERE p1.categoryId = c
ORDER BY p1.categoryId, p1.price DESC
LIMIT 5) AS u;
where the named parameter $categoryIds will be provided as an array ['cat1', 'cat2'].
It feels a bit inefficient to run SELECT DISTINCT RAW p.categoryId FROM products AS p WHERE p.categoryId IN $categoryIds just to get back what is essentially my list of provided categoryIds again.
I am sure there is a more efficient way to express this. Something like:
SELECT u.*
FROM (VALUES IN $categoryIds) AS c
UNNEST ...;
Create an index on the category and the price:
CREATE INDEX ix1 ON products(categoryId, price DESC);
With this index, the subquery inside the UNNEST can use index order and retrieve only the top 5 entries per category, irrespective of how many entries a category has.
If $categoryIds contains unique entries:
SELECT u.*
FROM $categoryIds AS c
UNNEST (SELECT p1.*
FROM products AS p1
WHERE p1.categoryId = c
ORDER BY p1.categoryId, p1.price DESC
LIMIT 5) AS u;
For non-unique entries:
SELECT u.*
FROM (SELECT DISTINCT RAW c1
FROM $categoryIds AS c1 ) AS c
UNNEST (SELECT p1.*
FROM products AS p1
WHERE p1.categoryId = c
ORDER BY p1.categoryId, p1.price DESC
LIMIT 5) AS u;
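To make the intended result concrete, here is a plain-Python model of what the queries above compute. It does not call Couchbase at all; the function name, the in-memory product list and the prices are hypothetical, and only the field names categoryId and price come from the question.

```python
from collections import defaultdict


def top_n_per_category(products, category_ids, n=5):
    """Return the n most expensive products for each requested category id."""
    # Bucket the products by category once.
    by_category = defaultdict(list)
    for product in products:
        by_category[product["categoryId"]].append(product)

    result = []
    # dict.fromkeys() drops duplicate ids while keeping first-seen order,
    # which plays the role of SELECT DISTINCT RAW over $categoryIds.
    for category_id in dict.fromkeys(category_ids):
        ranked = sorted(by_category.get(category_id, []),
                        key=lambda p: p["price"], reverse=True)
        result.extend(ranked[:n])
    return result


# Hypothetical data: 7 products in cat1, 2 in cat2, 1 in cat3.
products = (
    [{"categoryId": "cat1", "price": price} for price in range(1, 8)]
    + [{"categoryId": "cat2", "price": price} for price in (10, 20)]
    + [{"categoryId": "cat3", "price": 99}]
)

# The duplicate 'cat1' is ignored, just like in the non-unique variant.
for row in top_n_per_category(products, ["cat1", "cat2", "cat1"]):
    print(row)
```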

Display the category list even if it does not contain any item

In connection with my earlier question,
How to display the list of categories which contain items in mysql,
I would like to ask how to display the list of categories even if a category does not contain any record in itemstbl. Here is my query:
SELECT *, count(*) as cnt
FROM categorytbl LEFT JOIN itemstbl
ON itemstbl.cat_id=categorytbl.cat_id
GROUP BY itemstbl.cat_id
ORDER BY cnt DESC
The result is:
Pet(1)
person(2)
I want the result to be:
Pet(1)
person(2)
Places(0)
After a couple of minutes of working on this problem, I got this answer:
SELECT categorytbl.cat_id AS cat_id , count(itemstbl.cat_id) as cnt
FROM categorytbl LEFT JOIN itemstbl
ON itemstbl.cat_id=categorytbl.cat_id
GROUP BY cat_id
ORDER BY cnt DESC
The result now is:
Pet(1) person(2) Places(0)
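The idea is to group by cat_id from categorytbl, so that every category gets a row, and to count cat_id from itemstbl, which is NULL (and therefore not counted) for categories without items.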
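The difference between counting rows and counting a column of the joined table can be sketched like this. It is a small illustration on SQLite through Python's sqlite3 module; the cat_name and item_id columns and the sample rows are assumptions, since the question does not show the table definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categorytbl (cat_id INTEGER PRIMARY KEY, cat_name TEXT);
    CREATE TABLE itemstbl (item_id INTEGER PRIMARY KEY, cat_id INTEGER);
    INSERT INTO categorytbl VALUES (1, 'Pet'), (2, 'person'), (3, 'Places');
    INSERT INTO itemstbl VALUES (1, 1), (2, 2), (3, 2);
""")

# COUNT(i.cat_id) skips the NULL that the LEFT JOIN produces for a category
# without items, so 'Places' is reported with 0.
COUNT_COLUMN = """
    SELECT c.cat_name, COUNT(i.cat_id) AS cnt
    FROM categorytbl c
    LEFT JOIN itemstbl i ON i.cat_id = c.cat_id
    GROUP BY c.cat_id
    ORDER BY cnt DESC, c.cat_id
"""

# COUNT(*) counts joined rows, including the all-NULL row, so an empty
# category is reported with 1 instead of 0.
COUNT_STAR = COUNT_COLUMN.replace("COUNT(i.cat_id)", "COUNT(*)")

column_counts = conn.execute(COUNT_COLUMN).fetchall()
star_counts = conn.execute(COUNT_STAR).fetchall()
print(column_counts)
print(star_counts)
```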

MySQL : Top K with duplicates

Example: let's say I have 10 categories (a,b,c,d,e,f,g,h,i,j) and the counts of products in each category are (5,6,10,4,10,4,6,10,10,4). Now if I want to find the top 5 categories with the most products:
c - 10
e - 10
h - 10
i - 10
(b,g) - 6 (sometimes it will be b and sometimes it will be g, if I use the LIMIT 5 option.)
What I need: if several categories have the same count and there is no fixed rule for which of them to return, I want the SQL query to return all of them. In the above example, I want the query to return 6 rows. If all categories have 10 products, then querying for the top 5 should return 10 rows.
I saw this question : Selecting the top 5 in a column with duplicates. But it has a different requirement.
You can achieve this with an inner select. First get the counts of the top k categories, then get all categories that have those counts.
-- distinct is needed because a tied count appears several times in t1
select distinct cat_count, category from
(select count(category) as top_count
from products group by category order by count(category) desc limit 5)
as t1 inner join
(select count(category) as cat_count, category
from products group by category) as t2 on t1.top_count = t2.cat_count
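Or written differently (MySQL does not accept LIMIT directly inside an IN subquery, so the limited query is wrapped in a derived table):
select count(category), category
from products
group by category
having count(category) in
(select top_count from
(select count(category) as top_count
from products
group by category
order by count(category) desc limit 5) as top_counts)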
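Here is a runnable sketch of the "top K counts, then every category with one of those counts" approach, using the counts from the question. It runs on SQLite through Python's sqlite3 module; the products table layout and the function name are assumptions made for the example.

```python
import sqlite3


def top_k_with_ties(conn, k):
    """Return (category, count) for every category tied into the top k."""
    # The innermost query lists the k largest per-category counts (with
    # repeats); the outer query keeps every category whose count is in it.
    sql = """
        SELECT category, COUNT(*) AS cat_count
        FROM products
        GROUP BY category
        HAVING COUNT(*) IN (
            SELECT top_count FROM (
                SELECT COUNT(*) AS top_count
                FROM products
                GROUP BY category
                ORDER BY top_count DESC
                LIMIT ?
            ) AS top_counts
        )
        ORDER BY cat_count DESC, category
    """
    return conn.execute(sql, (k,)).fetchall()


# Categories a..j with the product counts given in the question.
counts = dict(zip("abcdefghij", (5, 6, 10, 4, 10, 4, 6, 10, 10, 4)))
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, category TEXT)")
conn.executemany(
    "INSERT INTO products (category) VALUES (?)",
    [(cat,) for cat, n in counts.items() for _ in range(n)],
)

# Six rows: c, e, h, i with 10 products each, then both b and g with 6.
for row in top_k_with_ties(conn, 5):
    print(row)
```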

SQL select ... in (select...) taking long time

I have a table of items that users have bought, within that table there is a category identifier. So, I want to show users other items from the same table, with the same categories they have already bought from.
The query I'm trying is taking over 22 seconds to run and the main items table is not even 3000 lines... Why so inefficient? Should I index, if so which columns?
Here's the query:
select * from items
where category in (
select category from items
where ((user_id = '63') AND (category <> '0'))
group by category
)
order by dateadded desc limit 20
Here is a query that joins against the distinct categories instead. And yes, add an index on category, user_id, dateadded:
select i1.*
from items i1
inner join
(select distinct
category
from items
where ((user_id = '63') AND (category <> '0'))
) i2 on (i1.Category=i2.Category)
order by i1.dateadded desc limit 20
Appropriate places to put an index if necessary would be on dateadded, user_id and/or category
Try using self join for better performance as:
select i1.* from items i1 JOIN items i2 on i1.category= i2.category
where i2.user_id = '63' AND i2.category <> '0'
group by i2.category
order by i1.dateadded desc limit 20
A join is often much faster than a nested IN subquery, especially on older MySQL versions.
EDIT: Try without group by as :
select i1.* from items i1 JOIN items i2 on i1.category= i2.category
where i2.user_id = '63' AND i2.category <> '0'
order by i1.dateadded desc limit 20
If you index on category and possibly dateadded, it should speed things up a bit.
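The rewrites suggested above can be checked against each other on a small data set. The sketch below uses SQLite through Python's sqlite3 module, so it says nothing about MySQL timings; the rows, the integer user ids and the two index names are made up. It shows that the join against a DISTINCT derived table returns the same rows as the IN query, while the plain self join without DISTINCT repeats items when the user bought more than once from a category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id INTEGER PRIMARY KEY, user_id INTEGER,
        category INTEGER, dateadded TEXT
    );
    -- Hypothetical indexes covering the filter and the join/sort columns.
    CREATE INDEX idx_items_user_category ON items (user_id, category);
    CREATE INDEX idx_items_category_date ON items (category, dateadded);
    INSERT INTO items VALUES
        (1, 63, 1, '2020-01-01'), (2, 63, 1, '2020-01-02'),
        (3, 63, 2, '2020-01-03'), (4, 63, 0, '2020-01-04'),
        (5,  7, 1, '2020-01-05'), (6,  7, 2, '2020-01-06'),
        (7,  7, 3, '2020-01-07'), (8,  7, 0, '2020-01-08');
""")

IN_QUERY = """
    SELECT id FROM items
    WHERE category IN (SELECT category FROM items
                       WHERE user_id = 63 AND category <> 0)
    ORDER BY dateadded DESC LIMIT 20
"""

JOIN_DISTINCT_QUERY = """
    SELECT i1.id FROM items i1
    INNER JOIN (SELECT DISTINCT category FROM items
                WHERE user_id = 63 AND category <> 0) i2
        ON i1.category = i2.category
    ORDER BY i1.dateadded DESC LIMIT 20
"""

PLAIN_JOIN_QUERY = """
    SELECT i1.id FROM items i1
    JOIN items i2 ON i1.category = i2.category
    WHERE i2.user_id = 63 AND i2.category <> 0
    ORDER BY i1.dateadded DESC LIMIT 20
"""

in_ids = [row[0] for row in conn.execute(IN_QUERY)]
join_ids = [row[0] for row in conn.execute(JOIN_DISTINCT_QUERY)]
plain_ids = [row[0] for row in conn.execute(PLAIN_JOIN_QUERY)]
print(in_ids)     # items from the categories user 63 bought from
print(join_ids)   # same rows as the IN query
print(plain_ids)  # category 1 items appear twice
```

On a real MySQL server, EXPLAIN on each variant is the way to see whether the indexes are actually used.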

How can I use MySQL to COUNT with a LEFT JOIN?

How can I use MySQL to count with a LEFT JOIN?
I have two tables. Sometimes the Loves table does not have ratings for a photo, so I thought a LEFT JOIN is needed, but I also have a COUNT expression.
Photos
id  name  src
1   car   bmw.jpg
2   bike  baracuda.jpg
Loves (picid is a foreign key to the photos id)
id  picid  ratersip
4   1      81.0.0.0
6   1      84.0.0.0
7   2      81.0.0.0
Here the user can only rate one image with their IP.
I want to combine the two tables in order of the highest rating. New table
Combined
id name src picid
1 car bmw.jpg 1
2 bike baracuda.jpg 2
(bmw is highest rated)
My MySQL code:
SELECT * FROM photos
LEFT JOIN ON photos.id=loves.picid
ORDER BY COUNT (picid);
My PHP Code: (UPDATED AND ADDED - Working Example...)
$sqlcount = "SELECT p . *
FROM `pics` p
LEFT JOIN (
SELECT `loves`.`picid`, count( 1 ) AS piccount
FROM `loves`
GROUP BY `loves`.`picid`
)l ON p.`id` = l.`picid`
ORDER BY coalesce( l.piccount, 0 ) DESC";
$pics = mysql_query($sqlcount);
MySQL allows you to group by just the id column:
select
p.*
from
photos p
left join loves l on
p.id = l.picid
group by
p.id
order by
count(l.picid) desc
That being said, MySQL can be slow at GROUP BY over a join, so you can try moving the loves count into a subquery in your join to optimize it:
select
p.*
from
photos p
left join (select picid, count(1) as piccount from loves group by picid) l on
p.id = l.picid
order by
coalesce(l.piccount, 0) desc
I don't have a MySQL instance to test out which is faster, so test them both.
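As a quick way to try the pre-aggregated variant, here is a sketch that runs it on SQLite through Python's sqlite3 module with the rows from the question. The third photo ('boat') is a hypothetical addition that shows what happens to a photo without any loves, and the highest count is sorted first, as the question asks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, name TEXT, src TEXT);
    CREATE TABLE loves (id INTEGER PRIMARY KEY, picid INTEGER, ratersip TEXT);
    INSERT INTO photos VALUES
        (1, 'car', 'bmw.jpg'),
        (2, 'bike', 'baracuda.jpg'),
        (3, 'boat', 'boat.jpg');  -- hypothetical photo with no loves
    INSERT INTO loves VALUES
        (4, 1, '81.0.0.0'), (6, 1, '84.0.0.0'), (7, 2, '81.0.0.0');
""")


def photos_by_love_count(conn):
    """Return (id, name, love_count) with the most loved photo first."""
    # The derived table counts loves once per photo; COALESCE turns the
    # NULL of an unmatched photo into 0 so it still sorts sensibly.
    sql = """
        SELECT p.id, p.name, COALESCE(l.piccount, 0) AS love_count
        FROM photos p
        LEFT JOIN (SELECT picid, COUNT(*) AS piccount
                   FROM loves
                   GROUP BY picid) l ON p.id = l.picid
        ORDER BY love_count DESC, p.id
    """
    return conn.execute(sql).fetchall()


for row in photos_by_love_count(conn):
    print(row)
```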
You need to use a subquery:
SELECT id, name, src FROM (
SELECT photos.id, photos.name, photos.src, count(loves.picid) as the_count
FROM photos
LEFT JOIN loves ON photos.id=loves.picid
GROUP BY photos.id
) t
ORDER BY the_count DESC
Another option is to pre-aggregate the love counts and join them to the photos:
select
p.ID,
p.name,
p.src,
PreSum.LoveCount
from
Photos p
left join ( select L.picid,
count(*) as LoveCount
from
Loves L
group by
L.PicID ) PreSum
on p.id = PreSum.PicID
order by
PreSum.LoveCount DESC
I believe you just need to join the data and do a count(*) in your select. Make sure you specify which table you want to use for ambiguous columns. Also, don't forget to add a GROUP BY clause when you do a count(*). Here is an example query that I run on MS SQL.
Select CmsAgentInfo.LOGID, LOGNAME, hCmsAgent.SOURCEID, count(*) as COUNT from hCmsAgent
LEFT JOIN CmsAgentInfo on hCmsAgent.logid=CmsAgentInfo.logid
where SPLIT = '990'
GROUP BY CmsAgentInfo.LOGID, LOGNAME, hCmsAgent.SOURCEID
The example results from this will be something like this:
77615 SMITH, JANE 1 36
29422 DOE, JOHN 1 648
Hope that helps. Good Luck.