I have a table of items that users have bought, and each row carries a category identifier. I want to show users other items from the same table, in the categories they have already bought from.
The query I'm trying takes over 22 seconds to run, and the main items table has fewer than 3,000 rows. Why is it so inefficient? Should I add an index, and if so, on which columns?
Here's the query:
select * from items
where category in (
select category from items
where ((user_id = '63') AND (category <> '0'))
group by category
)
order by dateadded desc limit 20
Here is a query that joins against a derived table instead of using IN. And yes, add indexes on category, user_id and dateadded.
select i1.*
from items i1
inner join
(select distinct
category
from items
where ((user_id = '63') AND (category <> '0'))
) i2 on (i1.Category=i2.Category)
order by i1.dateadded desc limit 20
Appropriate places to put an index if necessary would be on dateadded, user_id and/or category
Try using self join for better performance as:
select i1.* from items i1 JOIN items i2 on i1.category= i2.category
where i2.user_id = '63' AND i2.category <> '0'
group by i2.category
order by i1.dateadded desc limit 20
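In older MySQL versions (before 5.6) a join is usually much faster than an IN subquery, because the optimizer tends to re-run such a subquery for every outer row.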
EDIT: Try it without the GROUP BY, using DISTINCT so that an item is not repeated once per matching purchase:
select distinct i1.* from items i1 JOIN items i2 on i1.category= i2.category
where i2.user_id = '63' AND i2.category <> '0'
order by i1.dateadded desc limit 20
If you index on category and possibly dateadded, it should speed things up a bit.
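To sanity-check that the rewrites above return the same items as the original IN query, here is a rough sketch that runs them on toy data. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example self-contained). The rows and index names are made up, and the timings on a real MySQL server will of course differ.

```python
import sqlite3

# Toy data: column names follow the question, the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        category INTEGER,
        dateadded TEXT
    );
    -- Hypothetical index names; the columns are the ones suggested above.
    CREATE INDEX idx_items_user_category ON items (user_id, category);
    CREATE INDEX idx_items_category_date ON items (category, dateadded);
    INSERT INTO items (id, user_id, category, dateadded) VALUES
        (1, 63, 1, '2015-01-01'),
        (2, 63, 1, '2015-01-02'),
        (3, 63, 0, '2015-01-03'),
        (4, 7,  1, '2015-01-04'),
        (5, 7,  2, '2015-01-05'),
        (6, 8,  0, '2015-01-06');
""")

# Original shape: IN with a subquery.
in_subquery = """
    SELECT id FROM items
    WHERE category IN (
        SELECT category FROM items
        WHERE user_id = 63 AND category <> 0
    )
    ORDER BY dateadded DESC LIMIT 20
"""

# Join against a derived table of distinct categories.
derived_join = """
    SELECT i1.id
    FROM items i1
    JOIN (
        SELECT DISTINCT category FROM items
        WHERE user_id = 63 AND category <> 0
    ) i2 ON i1.category = i2.category
    ORDER BY i1.dateadded DESC LIMIT 20
"""

# Self join with neither DISTINCT nor GROUP BY.
plain_self_join = """
    SELECT i1.id
    FROM items i1
    JOIN items i2 ON i1.category = i2.category
    WHERE i2.user_id = 63 AND i2.category <> 0
    ORDER BY i1.dateadded DESC LIMIT 20
"""

ids_in = [row[0] for row in conn.execute(in_subquery)]
ids_join = [row[0] for row in conn.execute(derived_join)]
ids_self = [row[0] for row in conn.execute(plain_self_join)]
print(ids_in, ids_join, ids_self)
```

On this data the first two queries agree, while the self join with no DISTINCT returns each item once per purchase the user made in that category, which eats into the LIMIT 20.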
Let me start in plain English first.
Query: Get the top 100 paying users and their current active item (just one item).
Here is a draft query:
SELECT `user_id`, SUM(p.`amount`) as `total`
FROM `users_purcahse` AS p
LEFT JOIN (SELECT `ui`.`item_id` as `item_id`, `ui`.`user_id` as `user_id`
FROM `user_items` AS `ui`
LEFT OUTER JOIN `items` AS `i` ON `ui`.`item_id` = `i`.`id`
LEFT OUTER JOIN `categories` AS `cat` ON `i`.`category_id` = `cat`.`id`
WHERE `ui`.isActive = 1
) AS `ui` ON p.`user_id` = `ui`.`user_id`
GROUP BY `user_id`, `ui`.`item_id`
ORDER BY `total` DESC
LIMIT 0, 100;
The problem is that the inner query reads the whole user_items table and only then joins it to the top 100 paying users.
user_items is a very large table, so the query takes too long.
I simply want to attach the current active item for each user after doing the calculations.
Note: a user can have many items but only one active item.
Note 2: the database does not enforce that user_items has only one active row per user.
This is a job for some well-chosen subqueries.
First, let's find the user_id values of your top-paying users.
SELECT user_id, SUM(amount) total
FROM users_purcahse
GROUP BY user_id
ORDER BY SUM(amount) DESC
LIMIT 100
Next, let's find the item_id values for your users. If more than one item is active, we'll take the one with the smallest item_id value to get just one.
SELECT user_id, MIN(item_id) item_id
FROM user_items
WHERE isActive = 1
GROUP BY user_id
Then, in an outer query we can fetch the details of your items.
SELECT top_users.user_id, top_users.total,
active_items.item_id,
items.*, categories.*
FROM (
SELECT user_id, SUM(amount) total
FROM users_purcahse
GROUP BY user_id
ORDER BY SUM(amount) DESC
LIMIT 100
) top_users
LEFT JOIN (
SELECT user_id, MIN(item_id) item_id
FROM user_items
WHERE isActive = 1
GROUP BY user_id
) active_items ON top_users.user_id = active_items.user_id
LEFT JOIN items ON active_items.item_id = items.id
LEFT JOIN categories ON items.category_id = categories.id
ORDER BY top_users.total DESC, top_users.user_id
The trick here is to use GROUP BY subqueries to get the data items where you need just one value per user_id.
Once you have the resultset you need, you can use EXPLAIN to help you sort out any performance problems.
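As a quick check of the approach, here is a minimal sketch on toy data, again using Python's sqlite3 in place of MySQL (an assumption for the sake of a runnable example). The table and column names follow the question, the rows are invented, the categories join is left out for brevity, and LIMIT 3 stands in for LIMIT 100. Note the GROUP BY user_id inside the top-users subquery: without it, SUM() collapses everything into a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_purcahse (user_id INTEGER, amount INTEGER);
    CREATE TABLE user_items (user_id INTEGER, item_id INTEGER, isActive INTEGER);
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);

    INSERT INTO users_purcahse VALUES (1, 10), (1, 15), (2, 40), (3, 5), (4, 1);
    INSERT INTO user_items VALUES
        (1, 101, 1), (1, 102, 1),   -- two active rows: MIN() keeps 101
        (2, 103, 0), (2, 104, 1),
        (3, 105, 0);                -- no active row: the LEFT JOIN gives NULLs
    INSERT INTO items VALUES
        (101, 'sword'), (102, 'shield'), (103, 'bow'), (104, 'helm'), (105, 'ring');
""")

query = """
    SELECT top_users.user_id, top_users.total, active_items.item_id, items.name
    FROM (
        SELECT user_id, SUM(amount) AS total
        FROM users_purcahse
        GROUP BY user_id
        ORDER BY SUM(amount) DESC
        LIMIT 3
    ) AS top_users
    LEFT JOIN (
        SELECT user_id, MIN(item_id) AS item_id
        FROM user_items
        WHERE isActive = 1
        GROUP BY user_id
    ) AS active_items ON top_users.user_id = active_items.user_id
    LEFT JOIN items ON active_items.item_id = items.id
    ORDER BY top_users.total DESC, top_users.user_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

User 4 falls outside the top three, user 1 gets only the smaller of the two active item ids, and user 3 still appears but with no item attached.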
I have a database with an Items table that looks something like this:
id
name
category (int)
There are several hundred thousand records. Each item can be in one of 7 different categories, which correspond to a categories table:
id
category
I want a query that chooses 1 random item from each category. What's the best way of approaching that? I know to use ORDER BY RAND() and LIMIT 1 for similar random queries, but I've never done something like this.
This query returns all items joined to categories in random order:
SELECT
c.id AS cid, c.category, i.id AS iid, i.name
FROM categories c
INNER JOIN items i ON c.id = i.category
ORDER BY RAND()
To restrict each category to one, wrap the query in a partial GROUP BY:
SELECT * FROM (
SELECT
c.id AS cid, c.category, i.id AS iid, i.name
FROM categories c
INNER JOIN items i ON c.id = i.category
ORDER BY RAND()
) AS shuffled_items
GROUP BY cid
Note that when a query has both GROUP BY and ORDER BY clauses, the grouping is performed before the sorting. This is why I have used two queries: the inner one sorts the rows, the outer one groups them.
I understand that this query isn't going to win any race. I am open to suggestions.
Here is a simple solution. Let's suppose you have this table.
id  name  category
1   A     1
2   B     1
3   C     1
4   D     2
5   E     2
6   F     2
7   G     3
8   H     3
9   I     3
Use this query
select
c.id,
c.category,
(select name from category where category = c.category group by id order by rand() limit 1) as CatName
from category as c
group by category
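The correlated-subquery idea can be tried out on the sample rows above. This is only a sketch: it uses Python's sqlite3 instead of MySQL, so the function is random() rather than RAND(). It assumes the two-table layout from the question (items plus categories), and the category labels are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, category INTEGER);
    INSERT INTO categories VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO items VALUES
        (1, 'A', 1), (2, 'B', 1), (3, 'C', 1),
        (4, 'D', 2), (5, 'E', 2), (6, 'F', 2),
        (7, 'G', 3), (8, 'H', 3), (9, 'I', 3);
""")

# For every category row, the scalar subquery shuffles only that
# category's items and keeps the first one.
query = """
    SELECT c.id, c.category,
           (SELECT i.id
            FROM items i
            WHERE i.category = c.id
            ORDER BY random()
            LIMIT 1) AS item_id
    FROM categories c
    ORDER BY c.id
"""

picks = conn.execute(query).fetchall()

# Lookup used to confirm each pick really belongs to its category.
item_category = dict(conn.execute("SELECT id, category FROM items"))
for category_id, label, item_id in picks:
    print(category_id, label, item_id, item_category[item_id] == category_id)
```

The subquery runs once per category, so with 7 categories that is 7 small shuffles rather than one shuffle of several hundred thousand rows. An index on items(category) would probably help each of them.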
Try this
SELECT id, name, category from Items where
(
select count(*) from Items i where i.category = Items.category
GROUP BY i.category ORDER BY rand()
) <= 1
REF: http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
Shuffle the original table into random order before the final select:
select * from
(select category, id, name from categories order by rand()) as tab
group by 1
Please note: in the following example I am assuming your table is named "items" not "Items" because you also said the other table was named "categories" (second table name not capitalized).
The SQL for what you want to do would roughly be:
SELECT items.id AS item_id,
items.name AS item_name,
items.category AS item_category_id,
categories.id AS category_id,
categories.category AS category_name
FROM items, categories
WHERE items.category = categories.id
ORDER BY rand()
LIMIT 1
I want to select the last inserted date and at the same time I want to select the user-name and count how many times the user-profile is visited.
So I am using this query
SELECT v.visitor_date, i.info_name, count(DISTINCT v.visitor_date) AS counted
FROM profile_visitors v
INNER JOIN profile_info i ON i.info_userId = v.visitor_accountId
ORDER BY v.visitor_date DESC
LIMIT 1
The result of the fiddle is wrong and SHOULD be
2015-07-28 11:05:16 - Testname - 5
Does anyone know what is wrong with the query?
http://sqlfiddle.com/#!9/2814c/1
DISTINCT does NOT give you the first or last record of any group. In fact, you cannot guarantee which record DISTINCT will display within a group (nor does this matter, by the way). So select the MAX visitor date.
Try the query below:
SELECT MAX( v.visitor_date ), i.info_name, COUNT( DISTINCT v.visitor_date ) AS counted
FROM profile_visitors v
INNER JOIN profile_info i ON i.info_userId = v.visitor_accountId
ORDER BY v.visitor_date DESC
LIMIT 1
You can try it:
SELECT v.visitor_date,
i.info_name,
COUNT(*) AS counted
FROM profile_visitors v
INNER JOIN profile_info i ON i.info_userId = v.visitor_accountId
GROUP BY v.visitor_accountId
ORDER BY v.visitor_date DESC
LIMIT 1
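The two suggestions can also be combined: take MAX() for the date and group by the profile, so the result does not depend on which row the server happens to pick for a non-aggregated column. Below is a minimal sketch on made-up rows shaped to match the expected result from the question. It runs on Python's sqlite3 rather than MySQL, and the second profile ("Other") is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile_info (info_userId INTEGER PRIMARY KEY, info_name TEXT);
    CREATE TABLE profile_visitors (visitor_accountId INTEGER, visitor_date TEXT);

    INSERT INTO profile_info VALUES (1, 'Testname'), (2, 'Other');
    INSERT INTO profile_visitors VALUES
        (1, '2015-07-24 09:00:00'),
        (1, '2015-07-25 10:00:00'),
        (1, '2015-07-26 10:30:00'),
        (1, '2015-07-27 08:15:00'),
        (1, '2015-07-28 11:05:16'),
        (2, '2015-07-20 12:00:00'),
        (2, '2015-07-21 12:00:00');
""")

# One row per profile: its latest visit and its number of visits.
# ORDER BY ... LIMIT 1 then keeps the most recently visited profile.
query = """
    SELECT MAX(v.visitor_date) AS last_visit,
           i.info_name,
           COUNT(*) AS counted
    FROM profile_visitors v
    INNER JOIN profile_info i ON i.info_userId = v.visitor_accountId
    GROUP BY i.info_userId, i.info_name
    ORDER BY last_visit DESC
    LIMIT 1
"""

result = conn.execute(query).fetchone()
print(result)
```

Grouping by both info_userId and info_name keeps the query valid under MySQL's ONLY_FULL_GROUP_BY mode as well.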
I need to combine 2 identical tables (posts and posts2) and display them as a single list sorted by id.
Previously we worked with only 1 table, but we now use a second table (posts2) to store the new data from a certain id onward.
This is the query I used when I worked with 1 table (posts), and it works fine.
select posts.id_usu,posts.id_cat,posts.titulo,posts.html,posts.slug,posts.fecha,hits.id,hits.hits,usuarios.id,usuarios.usuario,posts.id
From posts
Join hits On posts.id = hits.id
Join usuarios On posts.id_usu = usuarios.id
where posts.id_cat='".$catid."' order by posts.id desc
Now I have tried to apply this query to a UNION of the 2 tables, but I don't know where to put the JOINs. I tried several ways, but each one gives a MySQL error. The following query merges the 2 tables and orders by id, but it still needs the JOINs.
select * from (
SELECT posts.id,posts.id_usu,posts.id_cat,posts.titulo,posts.html,posts.slug,posts.fecha
FROM posts where id_cat='6' ORDER BY id
)X
UNION ALL
SELECT posts2.id,posts2.id_usu,posts2.id_cat,posts2.titulo,posts2.html,posts2.slug,posts2.fecha FROM posts2 where id_cat='4' ORDER BY id DESC limit 20
I need to add this to the query above:
Join hits On posts.id = hits.id
Join usuarios On posts.id_usu = usuarios.id
Thanks in advance guys.
If you want the same result as your first query, but over the union of posts and its identical table posts2, you can do it like this:
select
p.id_usu,p.id_cat,p.titulo,p.html,p.slug,p.fecha
,hits.id,hits.hits,usuarios.id,usuarios.usuario
from (
(select
id_usu,id_cat,titulo,html,slug,fecha ,id
From posts
where id_cat='".$catid."' order by id desc limit 20)
UNION ALL
(select
id_usu,id_cat,titulo,html,slug,fecha ,id
From posts2
where id_cat='".$catid."' order by id desc limit 20)
) p
Join hits On p.id = hits.id
Join usuarios On p.id_usu = usuarios.id
order by p.id desc limit 20
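Here is a small sketch of the same pattern (UNION ALL inside a derived table, joins applied once on the outside), run on invented rows with Python's sqlite3 standing in for MySQL. Two things differ from the query above and are my own substitutions: the per-branch ORDER BY ... LIMIT is dropped because SQLite does not accept it inside a compound SELECT, and the category id is passed as a bound parameter instead of being concatenated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts  (id INTEGER PRIMARY KEY, id_usu INTEGER, id_cat INTEGER, titulo TEXT);
    CREATE TABLE posts2 (id INTEGER PRIMARY KEY, id_usu INTEGER, id_cat INTEGER, titulo TEXT);
    CREATE TABLE hits (id INTEGER PRIMARY KEY, hits INTEGER);
    CREATE TABLE usuarios (id INTEGER PRIMARY KEY, usuario TEXT);

    INSERT INTO posts  VALUES (1, 1, 6, 'old a'), (2, 2, 6, 'old b'), (3, 1, 4, 'other cat');
    INSERT INTO posts2 VALUES (10, 2, 6, 'new a'), (11, 1, 6, 'new b');
    INSERT INTO hits VALUES (1, 5), (2, 7), (3, 1), (10, 2), (11, 9);
    INSERT INTO usuarios VALUES (1, 'ana'), (2, 'luis');
""")

# Both tables are stacked first; hits and usuarios are joined to the
# combined set, and the final ORDER BY / LIMIT applies to everything.
query = """
    SELECT p.id, p.titulo, hits.hits, usuarios.usuario
    FROM (
        SELECT id, id_usu, id_cat, titulo FROM posts  WHERE id_cat = ?
        UNION ALL
        SELECT id, id_usu, id_cat, titulo FROM posts2 WHERE id_cat = ?
    ) AS p
    JOIN hits ON p.id = hits.id
    JOIN usuarios ON p.id_usu = usuarios.id
    ORDER BY p.id DESC
    LIMIT 20
"""

cat_id = 6
rows = conn.execute(query, (cat_id, cat_id)).fetchall()
for row in rows:
    print(row)
```

In MySQL the parenthesised branches with their own ORDER BY ... LIMIT 20 are still worth keeping, since they cap how many rows each table contributes before the joins run.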
My table contains votes of users for different items. It has the following fields:
id, user_id, item_id, vote, utc_time
I understand how to get the last vote of #user# for #item#, but it uses subquery:
SELECT votes.*, items.name, items.price
FROM votes JOIN items ON items.id = votes.item_id
WHERE user_id = #user# AND item_id = #item#
AND utc_time = (
SELECT MAX(utc_time) FROM votes
WHERE user_id = #user# AND item_id = #item#
)
It works, but it looks quite stupid to me... There should be a more elegant way to get this one record. I tried the approach suggested here, but I cannot make it work yet, so I'll appreciate your help: How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?
There is a second part to this question: Count rows with DISTINCT(several columns) and MAX(another column)
You want just one row from the result, the one with MAX(utc_time). In MySQL, there is a LIMIT clause you can apply with ORDER BY:
SELECT votes.*, items.name, items.price
FROM votes JOIN items ON items.id = votes.item_id
WHERE user_id = #user# AND item_id = #item#
ORDER BY votes.utc_time DESC
LIMIT 1 ;
An index on either (user_id, item_id, utc_time) or (item_id, user_id, utc_time) will be good for efficiency.
Simple: if the date/time is the maximum date there will not exist a "higher" (more recent) date/time (for the same {user,item} ).
SELECT vo.*
, it.name, it.price
FROM votes vo
JOIN items it ON it.id = vo.item_id
WHERE vo.user_id = #user# AND vo.item_id = #item#
AND NOT EXISTS (
SELECT *
FROM votes nx
WHERE nx.user_id = vo.user_id
AND nx.item_id = vo.item_id
AND nx.utc_time > vo.utc_time
);
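To compare the two answers side by side, here is a rough sketch that runs both on the same invented votes, using Python's sqlite3 in place of MySQL. The #user# and #item# placeholders are written as named parameters, and the index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE votes (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        item_id INTEGER,
        vote INTEGER,
        utc_time TEXT
    );
    -- Composite index in the column order suggested above (name is made up).
    CREATE INDEX idx_votes_user_item_time ON votes (user_id, item_id, utc_time);

    INSERT INTO items VALUES (1, 'lamp', 9.5), (2, 'desk', 120.0);
    INSERT INTO votes VALUES
        (1, 7, 1, 3, '2015-01-01 10:00:00'),
        (2, 7, 1, 5, '2015-01-03 10:00:00'),
        (3, 7, 2, 1, '2015-01-04 10:00:00'),
        (4, 8, 1, 2, '2015-01-05 10:00:00');
""")

params = {"user": 7, "item": 1}

# Answer 1: sort this user's votes for the item, keep the newest.
limit_query = """
    SELECT votes.id, votes.vote, items.name
    FROM votes JOIN items ON items.id = votes.item_id
    WHERE votes.user_id = :user AND votes.item_id = :item
    ORDER BY votes.utc_time DESC
    LIMIT 1
"""

# Answer 2: keep the vote for which no later vote exists.
not_exists_query = """
    SELECT vo.id, vo.vote, it.name
    FROM votes vo JOIN items it ON it.id = vo.item_id
    WHERE vo.user_id = :user AND vo.item_id = :item
      AND NOT EXISTS (
          SELECT 1
          FROM votes nx
          WHERE nx.user_id = vo.user_id
            AND nx.item_id = vo.item_id
            AND nx.utc_time > vo.utc_time
      )
"""

latest_by_limit = conn.execute(limit_query, params).fetchall()
latest_by_not_exists = conn.execute(not_exists_query, params).fetchall()
print(latest_by_limit, latest_by_not_exists)
```

The two agree here. They differ only when two votes share the same utc_time: LIMIT 1 returns a single row, while NOT EXISTS returns every tied row.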