I'm dealing with a large query that maps data from one table into a CSV file, so it essentially looks like a basic select query--
SELECT * FROM item_table
--except that * is actually a hundred lines of CASE, IF, IFNULL, and other logic.
I've been told to add a "similar items" line to the select statement, which should be a string of comma-separated item numbers. The similar items are found in a category_table, which can join to item_table on two data points, column_a and column_b, with category_table.category_id having the data that identifies the similar items.
Additionally, I've been told NOT to use a subquery.
So I need to join category_table and group_concat item numbers from that table having the same category_id value (but not having the item number of whatever the current record would be).
If I can only do it with a subquery regardless of the instructions, I will accept that, but I want to do it with a join and group_concat as instructed if possible--I just can't figure it out. How can I do this?
You can make use of a MySQL "feature" called hidden columns.
I am going to assume you have an item id in the item table that uniquely identifies each row. And, if I have your logic correct, the following query does what you want:
select i.*, group_concat(c.category_id)
from item_table i left outer join
category_table c
on i.column_a = c.column_a and
i.column_b = c.column_b and
i.item_id <> c.category_id
group by i.item_id
I think this is what you're looking for, although I wasn't sure what uniquely identified your item_table so I used column_a and column_b (those may be incorrect):
SELECT
...,
GROUP_CONCAT(ct.category_id SEPARATOR ',') AS CategoryIDs
FROM item_table i
JOIN category_table ct ON i.column_a = ct.column_a AND
i.column_b = ct.column_b
GROUP BY i.column_a, i.column_b
I've used a regular INNER JOIN, but if the category_table might not have any related records, you may need to use a LEFT JOIN instead to get your desired results.
Maybe something like this?
SELECT i.*, GROUP_CONCAT(c.category_id) AS similar_items
FROM item_table i
INNER JOIN category_table c ON (i.column_a = c.column_a AND
i.column_b = c.column_b)
GROUP BY i.column_a, i.column_b
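To make the "exclude the current item" requirement concrete, here is a minimal sketch that runs the join plus GROUP_CONCAT idea against an in-memory SQLite database from Python. SQLite only stands in for MySQL here (both provide GROUP_CONCAT). The item_id column, the sample rows, and the extra self-join through category_id are assumptions about the schema, not something the question confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_table (item_id INTEGER PRIMARY KEY, column_a INTEGER, column_b INTEGER);
    CREATE TABLE category_table (column_a INTEGER, column_b INTEGER, category_id INTEGER);
    -- Hypothetical sample data: items 1-3 share category 10, item 4 is alone in category 20.
    INSERT INTO item_table VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 9, 9);
    INSERT INTO category_table VALUES (1, 1, 10), (1, 2, 10), (2, 1, 10), (9, 9, 20);
""")

SIMILAR_ITEMS_SQL = """
    SELECT i.item_id, GROUP_CONCAT(i2.item_id) AS similar_items
    FROM item_table i
    LEFT JOIN category_table c            -- category of the current item
           ON c.column_a = i.column_a AND c.column_b = i.column_b
    LEFT JOIN category_table c2           -- every row in that same category
           ON c2.category_id = c.category_id
    LEFT JOIN item_table i2               -- the items behind those rows, minus the current one
           ON i2.column_a = c2.column_a AND i2.column_b = c2.column_b
          AND i2.item_id <> i.item_id
    GROUP BY i.item_id
"""

# GROUP_CONCAT skips NULLs, so an item with no similar items yields NULL (None in Python).
similar = {}
for item_id, joined in conn.execute(SIMILAR_ITEMS_SQL):
    similar[item_id] = set(joined.split(",")) if joined else set()
```

The LEFT JOINs keep items that have no similar items in the output, which matters when every row of item_table must reach the CSV.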
I have this query I need to optimize further since it requires too much cpu time and I can't seem to find any other way to write it more efficiently. Is there another way to write this without altering the tables?
SELECT category, b.fruit_name, u.name
, r.count_vote, r.text_c
FROM Fruits b, Customers u
, Categories c
, (SELECT * FROM
(SELECT *
FROM Reviews
ORDER BY fruit_id, count_vote DESC, r_id
) a
GROUP BY fruit_id
) r
WHERE b.fruit_id = r.fruit_id
AND u.customer_id = r.customer_id
AND category = "Fruits";
This is your query re-written with explicit joins:
SELECT
category, b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN
(
SELECT * FROM
(
SELECT *
FROM Reviews
ORDER BY fruit_id, count_vote DESC, r_id
) a
GROUP BY fruit_id
) r on r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
CROSS JOIN Categories c
WHERE c.category = 'Fruits';
(I am guessing here that the category column belongs to the categories table.)
There are some parts that look suspicious:
Why do you cross join the Categories table, when you don't even display a column of the table?
What is ORDER BY fruit_id, count_vote DESC, r_id supposed to do? Subquery results are considered unordered sets, so an ORDER BY is superfluous and can be ignored by the DBMS. What do you want to achieve here?
SELECT * FROM [ reviews ] GROUP BY fruit_id is invalid. If you group by fruit_id, which count_vote and which text_c do you expect to get for each ID? You don't tell the DBMS (that would be something like MAX(count_vote) and MIN(text_c), for instance). MySQL should throw an error, but when ONLY_FULL_GROUP_BY is not enabled it silently replaces count_vote, text_c by ANY_VALUE(count_vote), ANY_VALUE(text_c) instead. This means you get arbitrarily picked values for a fruit.
The answer hence to your question is: Don't try to speed it up, but fix it instead. (Maybe you want to place a new request showing the query and explaining what it is supposed to do, so people can help you with that.)
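If the ORDER BY plus GROUP BY was meant to keep the top-voted review per fruit (that is a guess, the question never says so), one deterministic way to express it is an anti-join of Reviews against itself. The sketch below runs it on an in-memory SQLite database from Python as a stand-in for MySQL; the sample rows and the tie-break on the lowest r_id are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Reviews (r_id INTEGER PRIMARY KEY, fruit_id INTEGER,
                          customer_id INTEGER, count_vote INTEGER, text_c TEXT);
    -- Hypothetical sample data: fruit 1 has a tie for the most votes.
    INSERT INTO Reviews VALUES
        (1, 1, 100, 5, 'ok'),
        (2, 1, 101, 9, 'great'),
        (3, 1, 102, 9, 'also great'),
        (4, 2, 100, 1, 'meh');
""")

# Keep a review only if no "better" review exists for the same fruit:
# better = more votes, or the same votes with a lower r_id.
TOP_REVIEW_SQL = """
    SELECT r.fruit_id, r.r_id, r.count_vote, r.text_c
    FROM Reviews r
    LEFT JOIN Reviews better
           ON better.fruit_id = r.fruit_id
          AND (better.count_vote > r.count_vote
               OR (better.count_vote = r.count_vote AND better.r_id < r.r_id))
    WHERE better.r_id IS NULL
    ORDER BY r.fruit_id
"""
top_reviews = conn.execute(TOP_REVIEW_SQL).fetchall()
```

Unlike GROUP BY without aggregates, every column here comes from one well-defined row per fruit.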
Your Categories table does not seem to be joined/related to the others, so this produces a Cartesian product between all the rows.
If you want distinct results, don't use GROUP BY but DISTINCT, so you can avoid an unnecessary subquery,
and you don't need an ORDER BY on a subquery.
SELECT category
, b.fruit_name
, u.name
, r.count_vote
, r.text_c
FROM Fruits b
INNER JOIN (
SELECT DISTINCT fruit_id, count_vote, text_c, customer_id
FROM Reviews
) r ON b.fruit_id = r.fruit_id
INNER JOIN Customers u ON u.customer_id = r.customer_id
INNER JOIN Categories c ON ?????? /* Your Categories table seems not joined/related to the others */
WHERE category = "Fruits";
For better readability, you should use explicit join syntax and avoid the old join syntax based on comma-separated table names and WHERE conditions.
The next time you want help optimizing a query, please include the table/index structure, an indication of the cardinality of the indexes and the EXPLAIN plan for the query.
There appears to be absolutely no reason for a single sub-query here, let alone 2. Using sub-queries mostly prevents the DBMS optimizer from doing its job. So your biggest win will come from eliminating these sub-queries.
The CROSS JOIN creates a deliberate cartesian join. It's also unclear whether any attributes from this table are actually required for the result, whether it is there to produce multiples of the same row in the output, or whether it is just an error.
The attribute category in the last line of your query is not attributed to any of the tables (but I suspect it comes from the categories table).
Further, your code uses a GROUP BY clause with no aggregation function. This will produce non-deterministic results and is a bug. Assuming that you are not exploiting a side-effect of that, the query can be re-written as:
SELECT
category, b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN Reviews r
ON r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
ORDER BY r.fruit_id, count_vote DESC, r_id;
Since there are no predicates other than joins in your query, there is no scope for further optimization beyond ensuring there are indexes on the join predicates.
As all too frequently, the biggest benefit may come from simply asking the question of why you need to retrieve every single row in the tables in a single query.
I'm creating a product filter for an e-commerce store. I have a product table, a characteristics table, and a table in which I store product_id, characteristic_id and a single filter value.
shop_products - id, name
shop_characteristics - id, values (json)
shop_values - product_id, characteristic_id, value
I can build a query to get all the products by a single value like this:
SELECT `p`.* FROM `shop_products` `p`
LEFT JOIN `shop_values` `fv` ON `p`.`id` = `fv`.`product_id`
WHERE ((`fv`.`characteristic_id`=3) AND (`fv`.`value`='outdoor'))
It works fine. Also, I can modify this query and get all the products by multiple values that belong to the very same characteristics group (have an identical characteristic_id) like this:
SELECT `p`.* FROM `shop_products` `p`
LEFT JOIN `shop_values` `fv` ON `p`.`id` = `fv`.`product_id`
WHERE ((`fv`.`characteristic_id`=3) AND (`fv`.`value`='outdoor'))
OR ((`fv`.`characteristic_id`=3) AND (`fv`.`value`='indoor'))
But when I try to create a query for multiple conditions with different characteristic_id values, I get nothing:
SELECT `p`.* FROM `shop_products` `p`
LEFT JOIN `shop_values` `fv` ON `p`.`id` = `fv`.`product_id`
WHERE ((`fv`.`characteristic_id`=3) AND (`fv`.`value`='outdoor'))
AND ((`fv`.`characteristic_id`=5) AND (`fv`.`value`='white'))
My guess is that it fails because I am using the AND operator incorrectly here: no single record in the shop_values table has both characteristic_id 3 and 5.
So my question is: how do I combine or modify my query to get all related products? Or is it a flaw to store data like this, so that I need a different kind of shop_values table?
Use aggregation. You can also use tuples with the in clause. So:
SELECT p.*
FROM shop_products p JOIN
shop_values v
ON p.id = v.product_id
WHERE (v.characteristic_id, v.value) IN ( (3, 'outdoor'), (5, 'white'))
GROUP BY p.id
HAVING COUNT(DISTINCT v.characteristic_id) = 2;
Notes:
Unnecessarily escaping column and table aliases (with backticks) just makes the query harder to write and to read.
In general, using SELECT p.* and GROUP BY p.id is really, really bad form. The one exception is when you are grouping by a unique or primary key. This latter form is actually supported in the ANSI standard.
A LEFT JOIN is not needed. You need to find matches between the tables for the logic to work.
The use of AND and OR is fine for the WHERE clause. MySQL happens to support tuples with IN, which somewhat simplifies the logic.
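Here is a small sketch of how a filter form could drive that query, assuming an in-memory SQLite database from Python as a stand-in for MySQL. The helper name find_products and the sample rows are made up for illustration. It uses the plain AND/OR form of the WHERE clause with bound parameters rather than tuple IN, and it assumes each filter names a different characteristic_id.

```python
import sqlite3

def find_products(conn, filters):
    """Return ids of products that match every (characteristic_id, value) pair."""
    if not filters:
        raise ValueError("at least one filter is required")
    # One "(characteristic_id = ? AND value = ?)" term per filter, OR-ed together.
    where = " OR ".join("(v.characteristic_id = ? AND v.value = ?)" for _ in filters)
    params = [x for pair in filters for x in pair]
    # A product qualifies only if it matched as many distinct characteristics as requested.
    params.append(len(filters))
    sql = f"""
        SELECT p.id
        FROM shop_products p
        JOIN shop_values v ON p.id = v.product_id
        WHERE {where}
        GROUP BY p.id
        HAVING COUNT(DISTINCT v.characteristic_id) = ?
        ORDER BY p.id
    """
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shop_products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shop_values (product_id INTEGER, characteristic_id INTEGER, value TEXT);
    -- Hypothetical sample data: characteristic 3 = placement, 5 = colour.
    INSERT INTO shop_products VALUES (1, 'lamp'), (2, 'chair'), (3, 'table');
    INSERT INTO shop_values VALUES
        (1, 3, 'outdoor'), (1, 5, 'white'),
        (2, 3, 'outdoor'), (2, 5, 'black'),
        (3, 3, 'indoor'),  (3, 5, 'white');
""")

outdoor_and_white = find_products(conn, [(3, "outdoor"), (5, "white")])
outdoor_only = find_products(conn, [(3, "outdoor")])
```

Each row of shop_values is tested on its own, which is why the original AND version found nothing; the HAVING clause is what checks the conditions across rows.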
I'm trying to pull results where 1 row from tableA (profiles.category) matches 1 row from tableB (projects.categorysecond) however I'm not getting results.
*IMPORTANT: projects.categorysecond varies between having only 1 category and several categories delimited by ; EXAMPLE: OTTAWA OR OTTAWA;TORONTO;MONTREAL OR OTTAWA;MONTREAL OR TORONTO;MONTREAL
profiles.category will always have only 1 category, never delimited.
I need to make sure that, regardless of whether projects.categorysecond contains OTTAWA or OTTAWA;TORONTO;MONTREAL, it pulls results as long as 1 word matches.
I'm currently trying the following query:
SELECT p.*, up.* FROM users_profiles up INNER JOIN projects p ON find_in_set(up.category, p.categorysecond) > 0
FIND_IN_SET() only understands comma as a separator. Read https://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_find-in-set
So you could substitute ; with , and then do the comparison:
SELECT p.*, up.*
FROM users_profiles up
INNER JOIN projects p
ON FIND_IN_SET(up.category, REPLACE(p.categorysecond, ';', ',')) > 0
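To see why the separator matters, here is a rough Python emulation of the lookup. It is a simplified sketch of the documented behaviour (position in a comma-separated list, 0 when absent), not MySQL's implementation, and it ignores NULL handling.

```python
def find_in_set(needle, haystack):
    """Return the 1-based position of needle in a comma-separated list, or 0 if absent."""
    items = haystack.split(",")  # only commas split the list, as in FIND_IN_SET()
    return items.index(needle) + 1 if needle in items else 0

# With semicolons the whole string is a single list element, so nothing matches.
raw = find_in_set("TORONTO", "OTTAWA;TORONTO;MONTREAL")

# After swapping ';' for ',' (what REPLACE() does in the query) the match is found.
fixed = find_in_set("TORONTO", "OTTAWA;TORONTO;MONTREAL".replace(";", ","))
```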
I have to comment that this is not a good way to store data if you want to write expressions that find individual words in your semicolon-separated string. See my answer to "Is storing a delimited list in a database column really that bad?"
You should store one project category per row in a project_categories table. Then your query would be easier:
SELECT p.*, up.*
FROM users_profiles up
INNER JOIN project_categories pc
ON up.category = pc.category
INNER JOIN projects p
ON p.project = pc.project;
With a compound index on project_categories(category,project), this query should be optimized pretty well.
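A one-off migration to that layout could look roughly like this. The sketch uses an in-memory SQLite database from Python as a stand-in for MySQL; the user_id column, the index name, and the sample rows are assumptions, while project, category and categorysecond follow the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (project INTEGER PRIMARY KEY, categorysecond TEXT);
    CREATE TABLE users_profiles (user_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE project_categories (project INTEGER, category TEXT);
    CREATE INDEX idx_pc_category_project ON project_categories (category, project);
    -- Hypothetical sample data.
    INSERT INTO projects VALUES
        (1, 'OTTAWA'), (2, 'OTTAWA;TORONTO;MONTREAL'), (3, 'TORONTO;MONTREAL');
    INSERT INTO users_profiles VALUES (10, 'OTTAWA'), (11, 'MONTREAL');
""")

# Split each delimited string into one (project, category) row per category.
existing = conn.execute("SELECT project, categorysecond FROM projects").fetchall()
rows = [
    (project, category.strip())
    for project, categories in existing
    for category in categories.split(";")
    if category.strip()
]
conn.executemany("INSERT INTO project_categories VALUES (?, ?)", rows)

# The match is now a plain equality join, with no string functions involved.
matches = conn.execute("""
    SELECT up.user_id, p.project
    FROM users_profiles up
    INNER JOIN project_categories pc ON up.category = pc.category
    INNER JOIN projects p ON p.project = pc.project
    ORDER BY up.user_id, p.project
""").fetchall()
```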
I have a translations table (simplified) like this:
id, lang, item, value
I want to find which items exist for one language but do not exist for some of the other languages, so I'm looking for orphaned language items.
I've tried something like
SELECT a.*
FROM translations a
NATURAL LEFT JOIN translations b
WHERE b.item IS NULL
but this doesn't seem to work.
Another option is to iterate through all languages asking for items in lang_1 that are not in lang_2, but this needs several queries.
Is there an easy way to filter this in sql ?
You can do this by generating all the combinations of languages and words and then determining which ones do not exist. The first part can be done using a cross join. The second part has a couple of options. Here is a left join version:
select i.item, l.lang
from (select distinct item from translations) i cross join
(select distinct lang from translations) l left join
translations t
on t.item = i.item and t.lang = l.lang
where t.item is null;
This will give the list of all item/lang pairs that are not in the translations table.
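For a quick check of that behaviour, the same query can be run against a few sample rows. This sketch uses an in-memory SQLite database from Python as a stand-in for MySQL; the rows are invented, and an ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE translations (id INTEGER PRIMARY KEY, lang TEXT, item TEXT, value TEXT);
    -- Hypothetical sample data: 'hello' has no German row, 'bye' has no French row.
    INSERT INTO translations VALUES
        (1, 'en', 'hello', 'Hello'),
        (2, 'en', 'bye',   'Bye'),
        (3, 'fr', 'hello', 'Bonjour'),
        (4, 'de', 'bye',   'Auf Wiedersehen');
""")

# All item/lang combinations, left-joined to the rows that actually exist.
missing = conn.execute("""
    SELECT i.item, l.lang
    FROM (SELECT DISTINCT item FROM translations) i
    CROSS JOIN (SELECT DISTINCT lang FROM translations) l
    LEFT JOIN translations t
           ON t.item = i.item AND t.lang = l.lang
    WHERE t.item IS NULL
    ORDER BY i.item, l.lang
""").fetchall()
```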
I have a MySQL query in which I want to include a list of IDs from another table. On the website, people are able to add certain items, and people can then add those items to their favourites. I basically want to get the list of IDs of people who have favourited that item (this is a bit simplified, but this is what it boils down to).
Basically, I do something like this:
SELECT *,
GROUP_CONCAT((SELECT userid FROM favourites WHERE itemid = items.id) SEPARATOR ',') AS idlist
FROM items
WHERE id = $someid
This way, I would be able to show who favourited some item, by splitting the idlist later on to an array in PHP further on in my code, however I am getting the following MySQL error:
1242 - Subquery returns more than 1 row
I thought that was kind of the point of using GROUP_CONCAT instead of, for example, CONCAT? Am I going about this the wrong way?
Ok, thanks for the answers so far, that seems to work. However, there is a catch. An item is also considered a favourite if it was added by that user. So I would need an additional check to see if creator = userid. Can someone help me come up with a smart (and hopefully efficient) way to do this?
Thank you!
Edit: I just tried to do this:
SELECT [...] LEFT JOIN favourites ON (userid = itemid OR creator = userid)
And idlist is empty. Note that if I use INNER JOIN instead of LEFT JOIN I get an empty result. Even though I am sure there are rows that meet the ON requirement.
OP almost got it right. GROUP_CONCAT should be wrapping the columns in the subquery and not the complete subquery (I'm dismissing the separator because comma is the default):
SELECT i.*,
(SELECT GROUP_CONCAT(userid) FROM favourites f WHERE f.itemid = i.id) AS idlist
FROM items i
WHERE i.id = $someid
This will yield the desired result and also means that the accepted answer is partially wrong, because you can access outer scope variables in a subquery.
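As a rough check of the correlated subquery, and of splitting idlist into an array afterwards, here is a sketch against an in-memory SQLite database from Python (standing in for MySQL and the PHP side). The helper name favourite_ids, the name column, and the sample rows are assumptions; the item id is passed as a bound parameter instead of being interpolated like $someid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, creator INTEGER);
    CREATE TABLE favourites (userid INTEGER, itemid INTEGER);
    -- Hypothetical sample data.
    INSERT INTO items VALUES (1, 'kettle', 7), (2, 'teapot', 8);
    INSERT INTO favourites VALUES (8, 1), (9, 1), (7, 2);
""")

def favourite_ids(conn, item_id):
    """Return the user ids that favourited item_id, as a sorted list of ints."""
    row = conn.execute("""
        SELECT i.id,
               (SELECT GROUP_CONCAT(f.userid)
                FROM favourites f
                WHERE f.itemid = i.id) AS idlist   -- i.id comes from the outer query
        FROM items i
        WHERE i.id = ?
    """, (item_id,)).fetchone()
    # No such item, or no favourites (GROUP_CONCAT over zero rows is NULL).
    if row is None or row[1] is None:
        return []
    return sorted(int(uid) for uid in row[1].split(","))
```

Calling favourite_ids(conn, 1) on the sample data returns the two users who favourited the kettle.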
You can't access variables in the outer scope in such queries (can't use items.id there). You should rather try something like
SELECT
items.name,
items.color,
GROUP_CONCAT(favourites.userid) as idlist
FROM
items
INNER JOIN favourites ON items.id = favourites.itemid
WHERE
items.id = $someid
GROUP BY
items.name,
items.color;
Expand the list of fields as needed (name, color...).
I think you may have the "userid = itemid" wrong, shouldn't it be like this:
SELECT ITEMS.id,GROUP_CONCAT(FAVOURITES.UserId) AS IdList
FROM FAVOURITES
INNER JOIN ITEMS ON (ITEMS.Id = FAVOURITES.ItemId OR FAVOURITES.UserId = ITEMS.Creator)
WHERE ITEMS.Id = $someid
GROUP BY ITEMS.ID
The purpose of GROUP_CONCAT is correct but the subquery is unnecessary and causing the problem. Try this instead:
SELECT ITEMS.id,GROUP_CONCAT(FAVOURITES.UserId)
FROM FAVOURITES INNER JOIN ITEMS ON ITEMS.Id = FAVOURITES.ItemId
WHERE ITEMS.Id = $someid
GROUP BY ITEMS.ID
Yes, soulmerge's solution is ok. But I needed a query where I had to collect data from more child tables, for example:
main table: sessions (presentation sessions) (uid, name, ..)
1st child table: events with key session_id (uid, session_uid, date, time_start, time_end)
2nd child table: accessories_needed (laptop, projector, microphones, etc.) with key session_id (uid, session_uid, accessory_name)
3rd child table: session_presenters (presenter persons) with key session_id (uid, session_uid, presenter_name, address...)
Every session has several rows in the child tables (more time schedules, more accessories).
And I needed to collect them into one collection for every session, to display in one row (some of them):
session_id | session_name | date | time_start | time_end | accessories | presenters
My solution (after many hours of experiments):
SELECT sessions.uid, sessions.name
,(SELECT GROUP_CONCAT( `events`.date ORDER BY `events`.date SEPARATOR '</li><li>')
FROM `events`
WHERE `events`.session_id = sessions.uid) AS date
,(SELECT GROUP_CONCAT( `events`.time_start ORDER BY `events`.date SEPARATOR '</li><li>')
FROM `events`
WHERE `events`.session_id = sessions.uid) AS time_start
,(SELECT GROUP_CONCAT( `events`.time_end ORDER BY `events`.date SEPARATOR '</li><li>')
FROM `events`
WHERE `events`.session_id = sessions.uid) AS time_end
,(SELECT GROUP_CONCAT( accessories.name ORDER BY accessories.name SEPARATOR '</li><li>')
FROM accessories
WHERE accessories.session_id = sessions.uid) AS accessories
,(SELECT GROUP_CONCAT( presenters.name ORDER BY presenters.name SEPARATOR '</li><li>')
FROM presenters
WHERE presenters.session_id = sessions.uid) AS presenters
FROM sessions
So no JOIN or GROUP BY needed.
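The reason separate subqueries help is that joining several child tables at once pairs every row of one child with every row of the other. A small sketch, using an in-memory SQLite database from Python as a stand-in for MySQL and invented sample rows, shows the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (uid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (uid INTEGER PRIMARY KEY, session_id INTEGER, date TEXT);
    CREATE TABLE accessories (uid INTEGER PRIMARY KEY, session_id INTEGER, name TEXT);
    -- Hypothetical sample data: one session, two events, three accessories.
    INSERT INTO sessions VALUES (1, 'Keynote');
    INSERT INTO events (session_id, date) VALUES (1, '2024-01-10'), (1, '2024-01-11');
    INSERT INTO accessories (session_id, name)
        VALUES (1, 'laptop'), (1, 'projector'), (1, 'microphone');
""")

# Joining both child tables pairs every event with every accessory (2 x 3 rows),
# so a GROUP_CONCAT over this join would repeat values.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM sessions s
    JOIN events e ON e.session_id = s.uid
    JOIN accessories a ON a.session_id = s.uid
""").fetchone()[0]

# One correlated subquery per child table keeps each list independent.
name, dates, accessories = conn.execute("""
    SELECT s.name,
           (SELECT GROUP_CONCAT(e.date) FROM events e WHERE e.session_id = s.uid),
           (SELECT GROUP_CONCAT(a.name) FROM accessories a WHERE a.session_id = s.uid)
    FROM sessions s
""").fetchone()
```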
Another useful thing for displaying the data in a friendly way (when "echoing" it):
you can wrap events.date, time_start, time_end, etc. in "<UL><LI> ... </LI></UL>", so the "</li><li>" used as separator in the query will split the results into list items.
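The wrapping step itself is tiny. Here is a sketch in Python (the answer presumably echoes from PHP; the helper name wrap_as_list is made up):

```python
def wrap_as_list(joined):
    """Wrap a '</li><li>'-separated string from the query in a <ul> list."""
    if not joined:
        # NULL or empty result: render nothing instead of an empty list.
        return ""
    # Note: the values are not HTML-escaped here, so this assumes trusted data.
    return "<ul><li>" + joined + "</li></ul>"

html = wrap_as_list("laptop</li><li>projector")
```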
I hope this helps someone. Cheers!