Rewrite this IN query using JOIN for speedy results - mysql

I have this query, which works but takes 30 minutes to run. I know IN is slow, so I'm looking for a JOIN alternative.
SELECT *, COUNT(*) as Results from member_preferences_products_data
WHERE member_preferences_products_data.Member_ID IN (SELECT Member_ID from member_preferences_products_data WHERE Product_ID = '623')
GROUP by Product_ID
ORDER by Results asc
LIMIT 10

A direct replacement for your query would be:
SELECT *, COUNT(*) as Results
FROM member_preferences_products_data
INNER JOIN
(
SELECT DISTINCT Member_ID
FROM member_preferences_products_data
WHERE Product_ID = '623'
) Sub1
ON member_preferences_products_data.Member_ID = Sub1.Member_ID
GROUP by Product_ID
ORDER by Results asc
LIMIT 10
However, I am a bit confused about what you are trying to find. It seems you want a count of the rows for each product id where that product id has been bought by a member who also bought product id 623.
Also, is the Product_ID field a string or an INT? If it is an INT, the quotes are not required. If it is a string holding numeric values, comparisons on it will probably be slower than on an INT field.
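To check that the JOIN form returns the same counts as the IN form, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here, Product_ID is assumed to be an integer, and the sample rows are made up for illustration. The DISTINCT in the derived table matters: member 1 has two rows for product 623, and without DISTINCT each of that member's rows would be counted twice.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member_preferences_products_data "
    "(Member_ID INTEGER, Product_ID INTEGER)"
)
# Illustrative data: members 1 and 2 have product 623, member 3 does not.
conn.executemany(
    "INSERT INTO member_preferences_products_data VALUES (?, ?)",
    [(1, 623), (1, 623), (1, 10), (2, 623), (2, 10), (2, 20), (3, 10)],
)

in_query = """
SELECT Product_ID, COUNT(*) AS Results
FROM member_preferences_products_data
WHERE Member_ID IN (SELECT Member_ID
                    FROM member_preferences_products_data
                    WHERE Product_ID = 623)
GROUP BY Product_ID
ORDER BY Results ASC, Product_ID ASC
"""

join_query = """
SELECT d.Product_ID, COUNT(*) AS Results
FROM member_preferences_products_data AS d
INNER JOIN (SELECT DISTINCT Member_ID
            FROM member_preferences_products_data
            WHERE Product_ID = 623) AS Sub1
        ON d.Member_ID = Sub1.Member_ID
GROUP BY d.Product_ID
ORDER BY Results ASC, d.Product_ID ASC
"""

in_rows = conn.execute(in_query).fetchall()
join_rows = conn.execute(join_query).fetchall()
print(in_rows)
print(join_rows)
```

Both queries should print the same list of (Product_ID, Results) pairs. Whether the JOIN is actually faster on MySQL depends on the server version and on the indexes available, so it is worth comparing the EXPLAIN output of both forms.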

Related

Mysql query showing extra fields

I am trying this query:
SELECT * FROM heath_check where cid = '1' and eid in('3','5','7','1','6')
My table structure:
I want one row per distinct eid, with all other data as it is. For example, I have two entries with an eid of 1 and my query fetched both, but I want only the one which is in the second column.
SELECT *
FROM heath_check AS hc
INNER JOIN (
SELECT MAX(id) AS lastId
FROM heath_check
WHERE cid = '1' and eid in('3','5','7','1','6')
GROUP BY eid) AS lastIDs
ON hc.id = lastIDs.lastId
;
You need a subquery, like the above, to find the records you want for each value. If you had wanted the first ones, you could use MIN(id) instead; if you cannot count on sequential ids, it becomes much more complex with use of potentially non-unique timestamps (if they are even available).
Create a row number partitioned by eid and keep only the rows where RowNumber = 1 to get the expected result (window functions such as ROW_NUMBER() require MySQL 8.0 or later):
SELECT id, eid, cid,weight, s_blood_pressure
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY eid ORDER BY id DESC) AS RowNumber
FROM heath_check
WHERE cid = '1' AND eid IN ('3','5','7','1','6')
) A
WHERE RowNumber = 1
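As a rough illustration of the MAX(id) approach, the sketch below runs it on a tiny made-up table with Python's built-in sqlite3 module. SQLite stands in for MySQL, and the column set (id, cid, eid, weight) and the sample rows are assumptions, since the real table structure is not shown.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE heath_check "
    "(id INTEGER PRIMARY KEY, cid TEXT, eid TEXT, weight INTEGER)"
)
# Illustrative rows: eid '1' appears twice for cid '1' (ids 1 and 2).
conn.executemany(
    "INSERT INTO heath_check VALUES (?, ?, ?, ?)",
    [
        (1, "1", "1", 70),
        (2, "1", "1", 72),
        (3, "1", "3", 80),
        (4, "2", "1", 90),  # different cid, filtered out
        (5, "1", "9", 60),  # eid not in the IN list, filtered out
    ],
)

query = """
SELECT hc.id, hc.eid, hc.weight
FROM heath_check AS hc
INNER JOIN (
    SELECT MAX(id) AS lastId
    FROM heath_check
    WHERE cid = '1' AND eid IN ('3', '5', '7', '1', '6')
    GROUP BY eid
) AS lastIDs ON hc.id = lastIDs.lastId
ORDER BY hc.id
"""

# One row per eid: the row with the highest id in each group.
latest = conn.execute(query).fetchall()
print(latest)
```

Only the row with id 2 survives for eid '1', because it has the highest id in that group.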

MySQL - Group and total, but return all rows in each group

I'm trying to write a query that finds every occurrence of the same person in my table within a specific date range, groups the rows by person, and totals each person's spending for that range. If a person's total is greater than X, the query should return each and every row for that person within the date range, not just the grouped total. This is what I have so far:
SELECT member_id,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50
This retrieves the correct totals and returns the members spending over $50, but not each and every row; it returns only each member and their grand total. I'm currently querying the whole table, as I haven't added the date ranges yet.
JOIN this subquery with the original table:
SELECT si1.*
FROM sold_items AS si1
JOIN (SELECT member_id
FROM sold_items
GROUP BY member_id
HAVING SUM(amount) > 50) AS si2
ON si1.member_id = si2.member_id
The general rule is that the subquery groups by the same column(s) that it's selecting, and then you join that with the original query using the same columns.
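A small sketch of that rule in action, using Python's built-in sqlite3 module as a stand-in for MySQL. The id column and the sample amounts are invented for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sold_items "
    "(id INTEGER PRIMARY KEY, member_id INTEGER, amount INTEGER)"
)
# Member 1 spends 55 in total, member 2 spends 25, member 3 spends 60.
conn.executemany(
    "INSERT INTO sold_items VALUES (?, ?, ?)",
    [(1, 1, 30), (2, 1, 25), (3, 2, 10), (4, 2, 15), (5, 3, 60)],
)

query = """
SELECT si1.*
FROM sold_items AS si1
JOIN (SELECT member_id
      FROM sold_items
      GROUP BY member_id
      HAVING SUM(amount) > 50) AS si2
  ON si1.member_id = si2.member_id
ORDER BY si1.id
"""

# Every individual row for members whose total exceeds 50.
big_spender_rows = conn.execute(query).fetchall()
print(big_spender_rows)
```

Member 2 is dropped entirely, while members 1 and 3 keep all of their individual rows. A date-range filter would presumably go in both the subquery and the outer query so that the total and the returned rows cover the same period.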
SELECT member_id, amount
FROM sold_items si
INNER JOIN (SELECT member_id,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50) spenders USING (member_id)
The query you have already built can be used as a derived (temporary) table to join with. If member_id is not indexed on the table, this will become slow as the table grows.
The word spenders is a table alias, you can use any valid alias in its stead.
There are a few syntaxes that will get the result you are looking for. Here is one using an inner join, which ensures that every row returned has a member_id in the list produced by the GROUP BY, and that the total is repeated on each row a given member has:
SELECT si.*, gb.total from sold_items as si, (SELECT member_id as mid,
SUM(amount) AS total
FROM `sold_items`
GROUP BY member_id
HAVING total > 50) as gb where gb.mid=si.member_id;
I think that this might help:
SELECT
member_id,
SUM(amount) AS amount_value,
'TOTAL' as amount_type
FROM
`sold_items`
GROUP BY
member_id
HAVING
SUM(amount) > 50
UNION ALL
SELECT
`sold_items`.member_id,
amount AS amount_value,
'DETAILED' as amount_type
FROM
`sold_items`
INNER JOIN
(
SELECT
A.member_id,
SUM(amount) AS total
FROM
`sold_items` A
GROUP BY
member_id
HAVING
total <= 50
) AS A
ON `sold_items`.member_id = A.member_id
Results of the above query should be like the following:
member_id  amount_value  amount_type
====================================
1          55            TOTAL
2          10            DETAILED
2          15            DETAILED
2          10            DETAILED
The amount_type column then distinguishes the two member groups.
You could use a subquery with EXISTS as an alternative:
select *
from sold_items t1
where exists (
select * from sold_items t2
where t1.member_id=t2.member_id
group by member_id
having sum(amount)>50
)
ref: http://dev.mysql.com/doc/refman/5.7/en/exists-and-not-exists-subqueries.html
In case you need to group by multiple columns, you can use a composite identifier built with CONCAT in combination with a GROUP BY subquery (key and group are reserved words in MySQL, so they are quoted with backticks, and a -- comment needs a space after the dashes):
select id, `key`, language, `group`
from translation
-- query all key-language entries by composite identifier...
where concat(`key`, '_', language) in (
-- by lookup of all key-language combinations...
select concat(`key`, '_', language)
from translation
group by `key`, language
-- that occur more than once
having count(*) > 1
)
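One caveat with a concatenated identifier is that two different column pairs can produce the same string. The sketch below, run with Python's built-in sqlite3 module as a stand-in for MySQL, compares the concatenation approach with a join on both columns. The column name msg_key is a hypothetical replacement for key (chosen to avoid reserved words), SQLite's || operator plays the role of CONCAT, and the rows are invented to trigger the collision.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE translation "
    "(id INTEGER PRIMARY KEY, msg_key TEXT, language TEXT)"
)
# ('a', 'b_c') is a true duplicate; ('a_b', 'c') occurs once but
# concatenates to the same string 'a_b_c'.
conn.executemany(
    "INSERT INTO translation VALUES (?, ?, ?)",
    [(1, "a", "b_c"), (2, "a", "b_c"), (3, "a_b", "c"), (4, "bye", "en")],
)

concat_query = """
SELECT id
FROM translation
WHERE msg_key || '_' || language IN (
    SELECT msg_key || '_' || language
    FROM translation
    GROUP BY msg_key, language
    HAVING COUNT(*) > 1
)
ORDER BY id
"""

join_query = """
SELECT t.id
FROM translation AS t
JOIN (SELECT msg_key, language
      FROM translation
      GROUP BY msg_key, language
      HAVING COUNT(*) > 1) AS dup
  ON t.msg_key = dup.msg_key AND t.language = dup.language
ORDER BY t.id
"""

concat_ids = [row[0] for row in conn.execute(concat_query)]
join_ids = [row[0] for row in conn.execute(join_query)]
print(concat_ids)  # includes the colliding row
print(join_ids)    # only the true duplicates
```

If the separator can never appear in either column, the two approaches should agree; joining on the individual columns simply avoids having to rely on that.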

Obtain a list with the items found the minimum amount of times in a table

I have a MySQL table where I have a certain id as a foreign key coming from another table. This id is not unique to this table so I can have many records holding the same id.
I need to find out which ids are seen the least amount of times in this table and pull up a list containing them.
For example, if I have 5 records with id=1, 3 records with id=2 and 3 records with id=3, I want to pull up only ids 2 & 3. However, the data in the table changes quite often so I don't know what that minimum value is going to be at any given moment. The task is quite trivial if I use two queries but I'm trying to do it with just one. Here's what I have:
SELECT id
FROM table
GROUP BY id
HAVING COUNT(*) = MIN(SELECT COUNT(*) FROM table GROUP BY id)
If I substitute COUNT(*) = 3, then the results come up but using the query above gives me an error that MIN is not used properly. Any tips?
I would try with:
SELECT id
FROM table
GROUP BY id
HAVING COUNT(*) = (SELECT COUNT(*) FROM table GROUP BY id ORDER BY COUNT(*) LIMIT 1);
This gets the minimum by selecting the first row from the set of counts sorted in ascending order.
You need a double select in the having clause:
SELECT id
FROM table
GROUP BY id
HAVING COUNT(*) = (SELECT MIN(cnt) FROM (SELECT COUNT(*) as cnt FROM table GROUP BY id) t);
The MIN() aggregate function is supposed to take a column, not a query. So, I see two ways to solve this:
1. Write the subquery properly, or
2. Use a user-defined variable
First alternative:
select id
from yourTable
group by id
having count(id) = (
select min(c) from (
select count(*) as c from yourTable group by id
) as a
)
Second alternative:
set @minCount = (
select min(c) from (
select count(*) as c from yourTable group by id
) as a
);
select id
from yourTable
group by id
having count(*) = @minCount;
You need a GROUP BY to produce the set of per-id counts and an additional SELECT to get the MIN value from that set; only then can you match against it in HAVING:
SELECT id FROM table GROUP BY id
HAVING COUNT(*) =
(SELECT MIN(X.CNT) AS M FROM (SELECT COUNT(*) CNT FROM table GROUP BY id) AS X)
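The example from the question (5 records with id=1, 3 with id=2, 3 with id=3) can be reproduced with a short sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The table name records is hypothetical, since table itself is a reserved word.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER)")
# 5 rows with id=1, 3 rows with id=2, 3 rows with id=3.
conn.executemany(
    "INSERT INTO records VALUES (?)",
    [(1,)] * 5 + [(2,)] * 3 + [(3,)] * 3,
)

query = """
SELECT id
FROM records
GROUP BY id
HAVING COUNT(*) = (SELECT MIN(cnt)
                   FROM (SELECT COUNT(*) AS cnt
                         FROM records
                         GROUP BY id) AS t)
ORDER BY id
"""

# ids whose row count equals the smallest per-id count.
least_seen = [row[0] for row in conn.execute(query)]
print(least_seen)
```

The inner query computes one count per id, the middle query takes the smallest of those counts, and the outer HAVING keeps every id that matches it, so ties are all returned.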

How Can I Dynamically Select the Names With the Maximum Frequency in MySQL

My current project basically imports my friend list from Facebook and then selects the first name with the highest frequency, i.e. the most common name. I've been trying to set up the subquery like this:
SELECT COUNT(*) as count, first_name
FROM Friends GROUP BY first_name ORDER BY count DESC;
and after that I'm stumped. I tried using a MAX function in the WHERE clause but it wouldn't compile, so then I tried putting it in a subquery and I still couldn't get it to work. Do I need to use a join?
SELECT first_name
FROM friends
GROUP BY first_name
ORDER BY COUNT(*) DESC
LIMIT 1
Or this, which can return more than one row if several names tie for the maximum number of repetitions:
SELECT first_name
FROM friends
GROUP BY first_name
HAVING COUNT(*) = (SELECT COUNT(*) FROM friends
GROUP BY first_name ORDER BY COUNT(*) DESC LIMIT 1)
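To see the difference between the two queries when there is a tie, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The names are made up, and the friends table is reduced to a single first_name column for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friends (first_name TEXT)")
# 'Anna' and 'Ben' both appear twice, 'Cara' once.
conn.executemany(
    "INSERT INTO friends VALUES (?)",
    [("Anna",), ("Ben",), ("Anna",), ("Cara",), ("Ben",)],
)

limit_query = """
SELECT first_name
FROM friends
GROUP BY first_name
ORDER BY COUNT(*) DESC
LIMIT 1
"""

ties_query = """
SELECT first_name
FROM friends
GROUP BY first_name
HAVING COUNT(*) = (SELECT COUNT(*)
                   FROM friends
                   GROUP BY first_name
                   ORDER BY COUNT(*) DESC
                   LIMIT 1)
ORDER BY first_name
"""

single_name = conn.execute(limit_query).fetchall()
top_names = [row[0] for row in conn.execute(ties_query)]
print(single_name)  # exactly one row; which tied name wins is not guaranteed
print(top_names)    # every name sharing the maximum count
```

The LIMIT 1 form always returns exactly one name, while the HAVING form returns every name that shares the top count.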

Optimize Group by & Order by query

This is the structure of the product table.
It currently has more than 1 million records.
I have a performance issue when I use a query with GROUP BY and ORDER BY.
Query:
SELECT product_name FROM vs_product GROUP BY store_id ORDER BY id DESC LIMIT 2
How can I improve this query to run faster? I indexed store_id, and id is the primary key.
SELECT x.*
FROM my_table x
JOIN (SELECT store_id, MAX(id) max_id FROM my_table GROUP BY store_id) y
ON y.store_id = x.store_id
AND y.max_id = x.id
ORDER BY x.store_id DESC
LIMIT 2;
A hacky (but fast) solution:
SELECT product_name
FROM (
SELECT id
FROM vs_product
GROUP BY store_id DESC
LIMIT 2) as ids
JOIN vs_product USING (id);
How it works:
Your index on store_id stores (store_id, id) pairs in ascending order. GROUP BY ... DESC makes MySQL read the index in reverse order, so the subquery fetches the maximum id for each store_id. Then you just join those ids back to the whole table to fetch the product names.
Note that the query fetches product names for the two store ids with the highest values. Also note that the GROUP BY ... DESC syntax is deprecated in MySQL 5.7 and was removed in MySQL 8.0, so this only works on older versions.
You want a query like this:
select p.*
from product p join
(select store_id, max(id) as maxid
from product p
group by store_id
) psum
on psum.store_id = p.store_id and p.id = maxid
You don't have a date column in the table, so I'm assuming the largest id is the most recent.