MySQL Select parent row from a many to many relation pivot - mysql

I have three tables, products, ingredients and ingredient_product.
I need to find all products that have all of the given ingredients.
Each related ingredient also needs a matching value in the percentage column.
For example, one product exists with two related ingredients:
+-----+------------+---------------+------------+
| id | product_id | ingredient_id | percentage |
+-----+------------+---------------+------------+
| 1 | 1 | 1 | 50 |
| 2 | 1 | 2 | 50 |
+-----+------------+---------------+------------+
SQL to retrieve:
SELECT
products.id
FROM
products,
ingredient_product
WHERE
ingredient_product.product_id = products.id
AND
(ingredient_product.ingredient_id = 1 AND ingredient_product.percentage = 50)
AND
(ingredient_product.ingredient_id = 2 AND ingredient_product.percentage = 50)
But this returns an empty result. Empty set (0.00 sec)
Products:
+-------------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+-----------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(255) | NO | | NULL | |
| short_description | text | YES | | NULL | |
| long_description | text | YES | | NULL | |
+-------------------+-----------------------+------+-----+---------+----------------+
Ingredients:
+-------------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+-----------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| short_name | varchar(50) | NO | | NULL | |
| thumbnail | varchar(255) | NO | | NULL | |
+-------------------+-----------------------+------+-----+---------+----------------+
ingredient_product
+---------------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------------+------+-----+---------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| product_id | int(10) unsigned | NO | MUL | NULL | |
| ingredient_id | int(10) unsigned | NO | MUL | NULL | |
| percentage | tinyint(3) unsigned | NO | | NULL | |
+---------------+---------------------+------+-----+---------+----------------+

Based on Strawberry's answer:
SELECT product_id
FROM ingredient_product
WHERE ingredient_id IN (1,2) AND percentage = 50
GROUP BY product_id
HAVING COUNT(*) = 2;
However, this doesn't work if each ingredient needs its own percentage. In that case you can do this:
SELECT product_id FROM
(SELECT product_id
FROM ingredient_product
WHERE ingredient_id = 1 AND percentage = 50
UNION ALL
SELECT product_id
FROM ingredient_product
WHERE ingredient_id = 2 AND percentage = 50) AS tmp
GROUP BY product_id
HAVING COUNT(*) = 2;
Essentially, you add a UNION ALL branch for each requirement (an id and percentage combination) and then set the count in the HAVING condition to the number of requirements.
Note: Since the requirements are mutually exclusive in this case, UNION ALL is quicker than UNION and gives the same result.
Instead of a UNION in a subquery you could use OR, but we found that MySQL seems to like the UNION format better. That might change from version to version, however. For completeness' sake, here is that solution as well:
SELECT product_id
FROM ingredient_product
WHERE (ingredient_id = 1 AND percentage = 50) OR (ingredient_id = 2 AND percentage = 50)
GROUP BY product_id
HAVING COUNT(*) = 2;
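The original query returns an empty set because the WHERE clause is evaluated one row at a time, and no single ingredient_product row can have ingredient_id = 1 and ingredient_id = 2 at once. A minimal sketch of the OR / HAVING COUNT(*) variant follows, using Python's built-in sqlite3 as a stand-in for MySQL so it can be run anywhere. The helper name products_matching and the extra sample rows for products 2 and 3 are made up for illustration, and it assumes (product_id, ingredient_id) is unique:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ingredient_product (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL,
    ingredient_id INTEGER NOT NULL,
    percentage INTEGER NOT NULL
);
INSERT INTO ingredient_product (product_id, ingredient_id, percentage) VALUES
    (1, 1, 50), (1, 2, 50),   -- product 1: 50/50
    (2, 1, 50), (2, 2, 30),   -- product 2 (hypothetical): different percentage
    (3, 1, 50);               -- product 3 (hypothetical): only one ingredient
""")

def products_matching(requirements):
    """Return product ids satisfying every (ingredient_id, percentage) pair.

    Assumes each (product_id, ingredient_id) pair occurs at most once,
    otherwise COUNT(*) could over-count.
    """
    clause = " OR ".join(
        "(ingredient_id = ? AND percentage = ?)" for _ in requirements
    )
    params = [value for pair in requirements for value in pair]
    sql = (
        "SELECT product_id FROM ingredient_product "
        f"WHERE {clause} "
        "GROUP BY product_id HAVING COUNT(*) = ? "
        "ORDER BY product_id"
    )
    return [row[0] for row in conn.execute(sql, params + [len(requirements)])]

both_fifty = products_matching([(1, 50), (2, 50)])
mixed = products_matching([(1, 50), (2, 30)])
print(both_fifty, mixed)
```

The HAVING count is taken from the number of requirements, so adding a third ingredient only means passing a third pair.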

On the assumption that id is redundant, and that you have a perfectly serviceable natural key on (product_id,ingredient_id)...
SELECT product_id
FROM ingredient_product
WHERE ingredient_id IN (1,2)
GROUP BY product_id
HAVING COUNT(*) = 2;

Related

Nested queries in mysql which return primary key?

Q. Print the complete details of the product that is ordered by the maximum number of customers and whose price is greater than 3.0.
products
+--------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-------------+------+-----+---------+----------------+
| productID | int | NO | PRI | NULL | auto_increment |
| Name | varchar(30) | NO | | NULL | |
| Price | double(3,2) | NO | | NULL | |
| CoffeeOrigin | varchar(30) | YES | | NULL | |
+--------------+-------------+------+-----+---------+----------------+
orders
+------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+----------+------+-----+---------+----------------+
| orderID | int | NO | PRI | NULL | auto_increment |
| productID | int | YES | MUL | NULL | |
| customerID | int | YES | MUL | NULL | |
| Date_Time | DateTime | NO | | NULL | |
+------------+----------+------+-----+---------+----------------+
query:
select * from products where productID=y.id (
select y.id from (
select products.productsID as id, count(*) as counter
from orders join products on orders.productID=products.productID
group by productID order by counter desc limit 1
) y
);
What am I doing wrong?
Two things:
First, if you use a scalar subquery, you don't need to give it a table alias y. You only need to assign a table alias if you use a subquery as a derived table. That is, in the FROM clause.
Second, if you compare productID to the result of the scalar subquery, you don't need to reference the y.ID. The subquery expression itself can be the right hand side of the comparison.
You can write expressions to compare to a scalar subquery like this:
WHERE productID = ( ... subquery... )
No table alias following the subquery, and no need to reference y.ID.
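As a rough illustration of that shape, here is a sketch run through Python's built-in sqlite3 rather than MySQL. The sample rows are invented, and it reads "ordered by the maximum number of customers" as "the most distinct customerID values among products priced above 3.0", which is only one possible interpretation of the exercise:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    productID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Price REAL NOT NULL,
    CoffeeOrigin TEXT
);
CREATE TABLE orders (
    orderID INTEGER PRIMARY KEY,
    productID INTEGER,
    customerID INTEGER,
    Date_Time TEXT NOT NULL
);
-- Hypothetical sample data
INSERT INTO products VALUES
    (1, 'Espresso', 2.50, 'Brazil'),
    (2, 'Latte',    3.50, 'Colombia'),
    (3, 'Mocha',    4.00, 'Kenya');
INSERT INTO orders (productID, customerID, Date_Time) VALUES
    (1, 10, '2024-01-01 09:00:00'),
    (1, 11, '2024-01-01 09:05:00'),
    (1, 12, '2024-01-01 09:10:00'),  -- most customers, but price <= 3.0
    (2, 10, '2024-01-02 09:00:00'),
    (2, 11, '2024-01-02 09:05:00'),
    (3, 10, '2024-01-03 09:00:00');
""")

# The scalar subquery is the right-hand side of "=": no alias, no y.id.
rows = conn.execute("""
SELECT *
FROM products
WHERE productID = (
    SELECT o.productID
    FROM orders o
    JOIN products p ON o.productID = p.productID
    WHERE p.Price > 3.0
    GROUP BY o.productID
    ORDER BY COUNT(DISTINCT o.customerID) DESC
    LIMIT 1
)
""").fetchall()
print(rows)
```

Note that the column is productID; the products.productsID in the question's inner query looks like a typo.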

HQL/MySQL for listing distincts and duplicates

I have a list of 20,000+ objects (tipps). These objects have an FK to a table called title. Two tipps are considered duplicates if they are linked to the same title and belong to the same package (tipp_pkg_fk, which is a parameter).
I need a list of all objects, with the duplicates listed together. For example:
tippA.title.name = "One"
tippB.title.name = "Two"
tippC.title.name = "Two"
Ideally from the above I will get a list result like this: [[tippA],[tippB,tippC]]
I am not sure how to do this. I have made an attempt (first in MySQL so I can test it, then I'll change it to HQL):
select tipp.tipp_id, 1 as sortOrder
from (select distinct a.tipp_id as id
from title_instance_package_platform a, title_instance_package_platform b
where a.tipp_pkg_fk= 1 and b.tipp_pkg_fk = 1 and a.tipp_ti_fk = b.tipp_ti_fk) duplicates,
title_instance_package_platform tipp
where tipp.tipp_id != duplicates.id
union all
select duplicates.id, 2 as sortOrder
from (select distinct a.tipp_id as id
from title_instance_package_platform a , title_instance_package_platform b
where a.tipp_pkg_fk = 1 and b.tipp_pkg_fk=1 and a.tipp_ti_fk = b.tipp_ti_fk) duplicates
order by sortOrder, id;
This ran for 330 seconds, then MySQL Workbench showed the message "fetching", and my computer started dying at that point. The idea is that I first select all the IDs that are not duplicates, then all the IDs that are duplicates, and then merge and order them so that they appear together. I am looking for the most efficient way to do this, as I will be executing this query several times during an overnight job.
For my TIPP model, the following are part of the mapping:
static mapping = {
pkg column:'tipp_pkg_fk', index: 'tipp_idx'
title column:'tipp_ti_fk', index: 'tipp_idx'
}
+-----------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------------+--------------+------+-----+---------+----------------+
| tipp_id | bigint(20) | NO | PRI | NULL | auto_increment |
| tipp_version | bigint(20) | NO | | NULL | |
| tipp_pkg_fk | bigint(20) | NO | MUL | NULL | |
| tipp_plat_fk | bigint(20) | NO | MUL | NULL | |
| tipp_ti_fk | bigint(20) | NO | MUL | NULL | |
| date_created | datetime | NO | | NULL | |
| last_updated | datetime | NO | | NULL | |
+-----------------------------+--------------+------+-----+---------+----------------+
+-----------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+---------------+------+-----+---------+----------------+
| ti_id | bigint(20) | NO | PRI | NULL | auto_increment |
| ti_version | bigint(20) | NO | | NULL | |
| date_created | datetime | NO | | NULL | |
| ti_imp_id | varchar(255) | NO | MUL | NULL | |
| last_updated | datetime | NO | | NULL | |
| ti_title | varchar(1024) | YES | | NULL | |
| ti_key_title | varchar(1024) | YES | | NULL | |
| ti_norm_title | varchar(1024) | YES | | NULL | |
| sort_title | varchar(1024) | YES | | NULL | |
+-----------------+---------------+------+-----+---------+----------------+
Update
After some changes it is working:
select tipp.tipp_id as id, 1 as sortOrder
from
title_instance_package_platform tipp
where tipp.tipp_id not in (select distinct a.tipp_id as id
from title_instance_package_platform a, title_instance_package_platform b
where a.tipp_pkg_fk= 1 and b.tipp_pkg_fk = 1 and a.tipp_ti_fk = b.tipp_ti_fk)
union all
select duplicates.id as id, 2 as sortOrder
from (select distinct a.tipp_id as id
from title_instance_package_platform a , title_instance_package_platform b
where a.tipp_pkg_fk = 1 and b.tipp_pkg_fk=1 and a.tipp_ti_fk = b.tipp_ti_fk) duplicates
order by sortOrder, id;
I still haven't got the duplicates grouped together, though. Instead everything comes back as a flat list, which means I still need to group them.
Can you do your select from the other side?
That is, select all titles and packages, list the tipps for each of them only if a tipp exists (count > 0), and bundle these together to get the array you showed.
Seems like you could compute both the dups and the non-dups at the same time. Something like
SELECT ( a.tipp_ti_fk = b.tipp_ti_fk ) AS sortOrder,
a.tipp_id as id
from title_instance_package_platform a ,
title_instance_package_platform b
where a.tipp_pkg_fk = 1
and b.tipp_pkg_fk = 1
You might need a DISTINCT.
This composite index would help:
INDEX(tipp_pkg_fk, tipp_ti_fk, tipp_id)
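Another way to get the [[tippA],[tippB,tippC]] shape is to skip the self-join entirely: read the package's rows ordered by title and group consecutive rows in application code. The sketch below is only an illustration of that idea, using Python's built-in sqlite3 and itertools.groupby instead of MySQL and HQL. The function name grouped_tipps and the sample rows are hypothetical, and the table is cut down to the three columns that matter:

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title_instance_package_platform (
    tipp_id INTEGER PRIMARY KEY,
    tipp_pkg_fk INTEGER NOT NULL,
    tipp_ti_fk INTEGER NOT NULL
);
-- Composite index as suggested above; it covers the whole query.
CREATE INDEX tipp_idx
    ON title_instance_package_platform (tipp_pkg_fk, tipp_ti_fk, tipp_id);
-- Hypothetical sample data
INSERT INTO title_instance_package_platform VALUES
    (1, 1, 100),  -- tippA, title "One"
    (2, 1, 200),  -- tippB, title "Two"
    (3, 1, 200),  -- tippC, title "Two"
    (4, 2, 200);  -- same title but another package, so not a duplicate
""")

def grouped_tipps(pkg_id):
    """Return tipp ids of one package, grouped by title (duplicates together)."""
    rows = conn.execute(
        "SELECT tipp_ti_fk, tipp_id "
        "FROM title_instance_package_platform "
        "WHERE tipp_pkg_fk = ? "
        "ORDER BY tipp_ti_fk, tipp_id",
        (pkg_id,),
    )
    # groupby only merges adjacent rows, hence the ORDER BY on the title key.
    return [
        [tipp_id for _, tipp_id in group]
        for _, group in groupby(rows, key=lambda row: row[0])
    ]

groups = grouped_tipps(1)
print(groups)
```

This reads each row once instead of pairing every row with every other row in the package, which is where the original query spends its time.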

EAV - Add rows with null value if row does not exist

I have the following table storing data in the EAV model:
+-------------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------+---------+------+-----+---------+-------+
| user_id | int(11) | NO | MUL | NULL | |
| question_id | int(11) | NO | MUL | NULL | |
| answer | blob | YES | | NULL | |
+-------------+---------+------+-----+---------+-------+
With a table to hold the different types of question:
+----------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| question | blob | YES | | NULL | |
+----------+---------+------+-----+---------+----------------+
As well as a users table:
+------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_name | varchar(128) | YES | | NULL | |
+------------------+--------------+------+-----+---------+----------------+
How would I write a query to insert a row with a null value for answer for each question_id that each user does not currently have a row for?
For example, if I have question_ids 1,2,3,4 and my table storing data looks like:
+---------+-------------+---------+
| user_id | question_id | answer  |
+---------+-------------+---------+
| 1       | 1           | example |
| 1       | 3           | example |
| 1       | 4           | example |
+---------+-------------+---------+
I want to insert a row that looks like :
+---------+-------------+--------+
| user_id | question_id | answer |
+---------+-------------+--------+
| 1       | 2           | NULL   |
+---------+-------------+--------+
I tried something like this:
INSERT INTO profile_answers
(
user_id,
question_id,
answer
)
SELECT
id,
profile_answers.question_id,
null
FROM users
LEFT JOIN profile_answers ON profile_answers.user_id = users.id
WHERE NOT EXISTS (
SELECT answer
FROM profile_answers
WHERE user_id = id
AND question_id IN (
SELECT GROUP_CONCAT(id SEPARATOR ',') FROM profile_questions
)
)
But I ended up inserting rows with a question id of 0.
I've given this some thought, but I couldn't find anything better than using a cartesian product between users and questions, and a filtering subquery:
SELECT u.id, q.id
FROM users u,
profile_questions q
WHERE NOT EXISTS(
SELECT *
FROM profile_answers a
WHERE a.question_id = q.id AND a.user_id = u.id
);
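To turn that SELECT into the insert that was asked for, it can be placed directly under an INSERT INTO ... SELECT with NULL as the third column. The following is a sketch only, run through Python's built-in sqlite3 rather than MySQL. It assumes the questions table is called profile_questions (the name used in the question's own query), and the second user is invented sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, user_name TEXT);
CREATE TABLE profile_questions (id INTEGER PRIMARY KEY, question TEXT);
CREATE TABLE profile_answers (
    user_id INTEGER NOT NULL,
    question_id INTEGER NOT NULL,
    answer TEXT
);
-- Hypothetical sample data: user 1 is missing question 2, user 2 has no rows
INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO profile_questions VALUES (1, 'q1'), (2, 'q2'), (3, 'q3'), (4, 'q4');
INSERT INTO profile_answers VALUES
    (1, 1, 'example'), (1, 3, 'example'), (1, 4, 'example');
""")

# Every (user, question) pair that has no answer row yet gets a NULL answer.
conn.execute("""
INSERT INTO profile_answers (user_id, question_id, answer)
SELECT u.id, q.id, NULL
FROM users u
CROSS JOIN profile_questions q
WHERE NOT EXISTS (
    SELECT 1
    FROM profile_answers a
    WHERE a.user_id = u.id AND a.question_id = q.id
)
""")

missing = conn.execute(
    "SELECT user_id, question_id FROM profile_answers "
    "WHERE answer IS NULL ORDER BY user_id, question_id"
).fetchall()
print(missing)
```

Running the insert a second time adds nothing, because every pair then already has a row.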

MySQL merge results into table from count of 2 other tables, matching ids

I've got 3 tables: model, model_views, and model_views2. In an effort to have one column per row to hold aggregated views, I've done a migration to make the model look something like this, with a new column for the views:
+---------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_id | int(11) | NO | | NULL | |
| [...] | | | | | |
| views | int(20) | YES | | 0 | |
+---------------+---------------+------+-----+---------+----------------+
This is what the columns for model_views and model_views2 look like:
+------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_id | smallint(5) | NO | MUL | NULL | |
| model_id | smallint(5) | NO | MUL | NULL | |
| time | int(10) unsigned | NO | | NULL | |
| ip_address | varchar(16) | NO | MUL | NULL | |
+------------+------------------+------+-----+---------+----------------+
model_views and model_views2 are gargantuan, each totalling tens of millions of rows. Each row represents one view, and this is a terrible mess for performance. So far, I've got this MySQL command to count the rows in both of these tables and add the counts up per model_id:
SELECT model_id, SUM(c) FROM (
SELECT model_views.model_id, COUNT(*) AS c FROM model_views
GROUP BY model_views.model_id
UNION ALL
SELECT model_views2.model_id, COUNT(*) AS c FROM model_views2
GROUP BY model_views2.model_id)
AS foo GROUP BY model_id
So that I get a nice big table with the following:
+----------+--------+
| model_id | SUM(c) |
+----------+--------+
| 1 | 1451 |
| [...] | |
+----------+--------+
What would be the safest way to proceed from here to merge the values of SUM(c) into the column model.views, matching model.id to the model_id values I get out of the above SQL query? I only want to fill the rows for models that still exist. There are probably model_views rows referring to rows in the model table that have been deleted.
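You can use UPDATE with a JOIN on your subquery. Keep the outer SUM from your query, otherwise the UNION ALL yields up to two rows per model_id and only one of the two counts ends up in model.views. The inner join also skips view rows whose model has been deleted:
UPDATE model
JOIN (
SELECT model_id, SUM(c) AS total
FROM (
SELECT model_views.model_id, COUNT(*) AS c
FROM model_views
GROUP BY model_views.model_id
UNION ALL
SELECT model_views2.model_id, COUNT(*) AS c
FROM model_views2
GROUP BY model_views2.model_id) AS foo
GROUP BY model_id) toupdate ON model.id = toupdate.model_id
SET model.views = toupdate.total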
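Since each view table contributes its own count per model_id, the two counts have to be added together before they are written to model.views. The sketch below checks that on a tiny data set with Python's built-in sqlite3. It is not the MySQL statement itself: it stages the totals in a temporary table (view_totals, a name made up here) and updates through a correlated subquery, which is one cautious way to do it and avoids relying on join-style UPDATE syntax. All sample rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE model (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    views INTEGER DEFAULT 0
);
CREATE TABLE model_views  (id INTEGER PRIMARY KEY, model_id INTEGER NOT NULL);
CREATE TABLE model_views2 (id INTEGER PRIMARY KEY, model_id INTEGER NOT NULL);

-- Hypothetical sample data; model 99 has been deleted from model
INSERT INTO model (id, user_id) VALUES (1, 7), (2, 7);
INSERT INTO model_views  (model_id) VALUES (1), (1), (2), (99);
INSERT INTO model_views2 (model_id) VALUES (1), (99);

-- Step 1: compute the combined totals once and keep them for inspection.
CREATE TEMP TABLE view_totals AS
SELECT model_id, SUM(c) AS total
FROM (
    SELECT model_id, COUNT(*) AS c FROM model_views  GROUP BY model_id
    UNION ALL
    SELECT model_id, COUNT(*) AS c FROM model_views2 GROUP BY model_id
) AS foo
GROUP BY model_id;

-- Step 2: copy totals onto models that still exist; others are untouched.
UPDATE model
SET views = (SELECT total FROM view_totals
             WHERE view_totals.model_id = model.id)
WHERE id IN (SELECT model_id FROM view_totals);
""")

views = conn.execute("SELECT id, views FROM model ORDER BY id").fetchall()
print(views)
```

Staging the totals first also means the expensive counting over the large tables happens once, and the result can be checked before model is modified.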

SQL query to get latest record for all distinct items in a table

I have a table of all sales defined like:
mysql> describe saledata;
+-------------------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+---------------------+------+-----+---------+-------+
| SaleDate | datetime | NO | | NULL | |
| StoreID | bigint(20) unsigned | NO | | NULL | |
| Quantity | int(10) unsigned | NO | | NULL | |
| Price | decimal(19,4) | NO | | NULL | |
| ItemID | bigint(20) unsigned | NO | | NULL | |
+-------------------+---------------------+------+-----+---------+-------+
I need to get the last sale price for all items (as the price may change). I know I can run a query like:
SELECT price FROM saledata WHERE itemID = 1234 AND storeID = 111 ORDER BY saledate DESC LIMIT 1
However, I want to be able to get the last sale price for all items (the ItemIDs are stored in a separate item table) and insert them into a separate table. How can I get this data? I've tried queries like this:
SELECT storeID, itemID, price FROM saledata WHERE itemID IN (SELECT itemID from itemmap) ORDER BY saledate DESC LIMIT 1
and then wrap that into an insert, but it's not getting the proper data. Is there one query I can run to get the last price for each item and insert that into a table defined like:
mysql> describe lastsale;
+-------------------+---------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+---------------------+------+-----+---------+-------+
| StoreID | bigint(20) unsigned | NO | | NULL | |
| Price | decimal(19,4) | NO | | NULL | |
| ItemID | bigint(20) unsigned | NO | | NULL | |
+-------------------+---------------------+------+-----+---------+-------+
This is the greatest-n-per-group problem that comes up frequently on Stack Overflow.
INSERT INTO lastsale (StoreID, Price, ItemID)
SELECT s1.StoreID, s1.Price, s1.ItemID
FROM saledata s1
LEFT OUTER JOIN saledata s2
ON (s1.Itemid = s2.Itemid AND s1.SaleDate < s2.SaleDate)
WHERE s2.ItemID IS NULL;
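The self-join keeps a row s1 only when no row s2 exists for the same item with a later SaleDate, which is exactly the latest sale. Here is a small sketch of that on made-up rows, using Python's built-in sqlite3 instead of MySQL (Price is stored as REAL here for simplicity, not decimal(19,4)):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE saledata (
    SaleDate TEXT NOT NULL,
    StoreID INTEGER NOT NULL,
    Quantity INTEGER NOT NULL,
    Price REAL NOT NULL,
    ItemID INTEGER NOT NULL
);
CREATE TABLE lastsale (
    StoreID INTEGER NOT NULL,
    Price REAL NOT NULL,
    ItemID INTEGER NOT NULL
);
-- Hypothetical sample data: item 1234 was sold twice at different prices
INSERT INTO saledata VALUES
    ('2024-01-01 10:00:00', 111, 1,  9.99, 1234),
    ('2024-02-01 10:00:00', 111, 2, 10.49, 1234),
    ('2024-01-15 10:00:00', 222, 1,  5.25, 5678);

-- Keep s1 only if no later sale (s2) exists for the same item.
INSERT INTO lastsale (StoreID, Price, ItemID)
SELECT s1.StoreID, s1.Price, s1.ItemID
FROM saledata s1
LEFT OUTER JOIN saledata s2
    ON (s1.ItemID = s2.ItemID AND s1.SaleDate < s2.SaleDate)
WHERE s2.ItemID IS NULL;
""")

last = conn.execute(
    "SELECT StoreID, Price, ItemID FROM lastsale ORDER BY ItemID"
).fetchall()
print(last)
```

Two caveats: if an item has two sales with the same latest SaleDate, both rows are kept, and the join is on ItemID only. To get the last price per item and store, StoreID would have to be added to the ON condition as well.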