Messy self-join Mysql SELECT - mysql

I need to return a list of product id's that are...
Within a specific category (for example 'Clothing')
Which have various attributes, such as 'Red' or 'Green'
Which are themselves within attribute 'groups' such as 'Color'
I'm getting stuck when I need to select MULTIPLE attribute options within MULTIPLE attribute groups. For example, if I need to return a list of products where Color is 'blue' OR 'red' AND size is 'Medium' OR 'XXL'.
This is my code:
SELECT products.id
FROM
`products` ,
`categories` ,
`attributes` att1,
`attributes` att2
WHERE products.id = categories.productid
AND categories.id = 3
AND att1.productid = products.id
AND att1.productid = att2.productid
AND
(att1.attributeid = 58 OR att1.attributeid = 60)
AND
(att2.attributeid = 12 OR att2.attributeid = 9)
I believe this code works, but it looks pretty messy, and I'm not sure my 'dirty' self-join is the correct way to go. Does anyone have ideas for a more 'elegant' solution to my problem?

Please use the modern join syntax:
SELECT products.id
FROM products
join categories on products.id = categories.productid
join attributes att1 on att1.productid = products.id
join attributes att2 on att1.productid = att2.productid
WHERE categories.id = 3
AND att1.attributeid IN (58, 60)
AND att2.attributeid IN (12, 9)
It's easier to read because it clearly separates join conditions from row-filtering conditions. For inner joins the optimizer usually produces the same plan either way, so the benefit is mainly readability and fewer accidental cross joins.
Edited
I also added the use of IN (...). It looks nicer and states the intent more directly. IN and the equivalent chain of ORs on one column mean the same thing, and MySQL can generally use an index on that column for either form.

SELECT p.id
FROM products p
JOIN categories c ON c.productid = p.id
JOIN attributes a1 ON a1.productid = p.id
JOIN attributes a2 ON a2.productid = p.id
WHERE c.id = 3
AND a1.attributeid IN (58, 60)
AND a2.attributeid IN (12, 9)
I think you had a mistake where you join the second attribute to the first attribute instead of joining it to the product. I fixed that.
On second thought, this may be intentional, and my correction wrong. It is a messy design, though, to mix attributes with attribute groups in the same table.
I also simplified your syntax and used explicit JOINs, which are more readable.
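A minimal sketch of the one-alias-per-attribute-group join, run from Python against an in-memory SQLite database as a stand-in for MySQL (the JOIN and IN syntax used here is the same in both). Only the table and column names come from the question; the sample rows and the idea that 58/60 are colours and 9/12 are sizes are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products   (id INTEGER PRIMARY KEY);
    CREATE TABLE categories (id INTEGER, productid INTEGER);
    CREATE TABLE attributes (productid INTEGER, attributeid INTEGER);

    INSERT INTO products VALUES (1), (2), (3);
    INSERT INTO categories VALUES (3, 1), (3, 2), (3, 3);
    -- Hypothetical mapping: 58 and 60 are colours, 9 and 12 are sizes.
    INSERT INTO attributes VALUES
        (1, 58), (1, 12),   -- colour and size both match
        (2, 60),            -- colour matches, no matching size
        (3, 9);             -- size matches, no matching colour
""")

def matching_products(conn):
    """Products in category 3 with a matching colour AND a matching size."""
    rows = conn.execute("""
        SELECT DISTINCT p.id
        FROM products p
        JOIN categories c  ON c.productid  = p.id
        JOIN attributes a1 ON a1.productid = p.id
        JOIN attributes a2 ON a2.productid = p.id
        WHERE c.id = 3
          AND a1.attributeid IN (58, 60)
          AND a2.attributeid IN (12, 9)
        ORDER BY p.id
    """).fetchall()
    return [r[0] for r in rows]

print(matching_products(conn))
```

Each attribute group gets its own alias of the attributes table, so the two IN lists are checked against different rows of the same product. DISTINCT guards against duplicate ids when a product carries more than one matching attribute within a group.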

Related

Join tables with one table having multiple identical IDs

I want to join three tables. One of these tables (modx_article_category) can have rows with identical IDs (articles have multiple categories).
I would like to put the values of these joins in a single column where the results are comma separated.
I was looking for a solution but am not even sure what to google...
Here's my code so far:
CREATE TABLE article_en AS
SELECT *
FROM mod_article_c, category_c, modx_article_category
WHERE mod_article_c.article_id = modx_article_category.article
AND modx_article_category.category = category_c.category_id
AND mod_article_c.article_lang = "en"
AND category_c.category_lang = "en"
DB samples:
https://raslan.de/index.php/s/cK9mxGyj9wKzFsS
This only selects one category even though there might be more.
If you need further infos just let me know.
Thanks in advance.
You can use GROUP_CONCAT.
SELECT c2.category_id,
GROUP_CONCAT(DISTINCT c1.article ORDER BY c1.article) AS articles
FROM mod_article_c c
INNER JOIN modx_article_category c1
ON c.article_id = c1.article
INNER JOIN category_c c2
ON c1.category = c2.category_id
WHERE c.article_lang = 'en'
AND c2.category_lang = 'en'
GROUP BY c2.category_id;
The above query returns all the articles for each category_id, as a comma-separated list.
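The question asks for the opposite grouping: one row per article, with its categories in a single comma-separated column. A minimal sketch of that, using Python's sqlite3 module as a stand-in for MySQL (both provide GROUP_CONCAT). The sample rows are invented, and the tables are reduced to the columns that appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mod_article_c (article_id INTEGER, article_lang TEXT);
    CREATE TABLE category_c (category_id INTEGER, category_lang TEXT);
    CREATE TABLE modx_article_category (article INTEGER, category INTEGER);

    -- Hypothetical sample data.
    INSERT INTO mod_article_c VALUES (1, 'en'), (2, 'en'), (3, 'de');
    INSERT INTO category_c VALUES (10, 'en'), (11, 'en'), (12, 'de');
    INSERT INTO modx_article_category VALUES (1, 10), (1, 11), (2, 10), (3, 12);
""")

rows = conn.execute("""
    SELECT a.article_id, GROUP_CONCAT(c.category_id) AS categories
    FROM mod_article_c a
    JOIN modx_article_category ac ON ac.article = a.article_id
    JOIN category_c c ON c.category_id = ac.category
    WHERE a.article_lang = 'en'
      AND c.category_lang = 'en'
    GROUP BY a.article_id
    ORDER BY a.article_id
""").fetchall()

# The order of ids inside each list is not guaranteed without an ORDER BY
# inside GROUP_CONCAT, so sort after splitting.
categories_by_article = {aid: sorted(cats.split(",")) for aid, cats in rows}
print(categories_by_article)
```

Grouping by the article rather than the category is what collapses the repeated article rows into one. In MySQL the ordering can be fixed inside the aggregate with GROUP_CONCAT(... ORDER BY ...), as in the answer above.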

Select from table where names have same initial letters

I have been looking for a solution for this in SQL. I am trying to find records in one table that have the same first two characters and the same birth date. I thought about doing a self-join, but I doubt I am getting the right results. Here is my query, please tell me what's missing:
select p1.frst_name
from person p1 inner join person p2
on upper(left(p1.frst_name,2)) like upper(left(p2.frst_name,2))
and upper(p1.last_name) LIKE upper(p2.last_name)
and p1.birth_date = p2.birth_date
Join on last_name and birth_date, since you want those to match exactly, then filter on the first two characters matching.
You may not need upper() at all: with a case-insensitive collation (the MySQL default) the comparison already ignores case. With a case-sensitive collation, keep it, because two different rows can store the same name in different cases.
Try...
select p1.frst_name
from person p1
join person p2
on p1.last_name = p2.last_name
and p1.birth_date = p2.birth_date
where upper(left(p1.frst_name,2)) like upper(left(p2.frst_name,2))
Change LIKE to = (you want an exact match), and add a join condition to prevent rows from joining to themselves:
select p1.id, p1.frst_name, p1.last_name, p1.birth_date
from person p1
join person p2
on upper(left(p1.frst_name,2)) = upper(left(p2.frst_name,2))
and upper(p1.last_name) = upper(p2.last_name)
and p1.birth_date = p2.birth_date
and p1.id != p2.id
Without the addition of and p1.id != p2.id, every row would be returned, because of course every row would otherwise match itself.
The question was tagged with both mysql and oracle. The above query works in MySQL. For Oracle, which doesn't support left(col, 2), use substr(col, 1, 2) instead.
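A small runnable sketch of the self-join with the p1.id != p2.id guard, using Python's sqlite3 module as a stand-in. SQLite has no left(), so the sketch uses substr(col, 1, 2), the same form suggested for Oracle. The person rows are invented, and the id column is assumed to exist as in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, frst_name TEXT,
                         last_name TEXT, birth_date TEXT);
    -- Hypothetical sample data.
    INSERT INTO person VALUES
        (1, 'Jonathan', 'Smith', '1980-01-01'),
        (2, 'joe',      'SMITH', '1980-01-01'),
        (3, 'Mary',     'Smith', '1980-01-01'),
        (4, 'Jon',      'Smith', '1975-05-05');
""")

rows = conn.execute("""
    SELECT DISTINCT p1.id, p1.frst_name
    FROM person p1
    JOIN person p2
      ON upper(substr(p1.frst_name, 1, 2)) = upper(substr(p2.frst_name, 1, 2))
     AND upper(p1.last_name) = upper(p2.last_name)
     AND p1.birth_date = p2.birth_date
     AND p1.id != p2.id          -- never pair a row with itself
    ORDER BY p1.id
""").fetchall()
print(rows)
```

Rows 1 and 2 pair with each other. Row 3 shares no prefix with anyone, and row 4 has a different birth date, so neither is returned. DISTINCT keeps a person from appearing once per partner when three or more rows match.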

MySQL complex join for multiple inclusion and exclusion

I've got a client that wants me to make a tagging system. I've got three tables: items, tags, and item_tag_assoc - the last one only serves to associate item_ids and tag_ids. Here's the problem I'm having:
If the user requests included tags [1, 2, 3] and excluded tags [4, 5, 6], the result should be all items that have EVERY tag in [1, 2, 3] (not just one) and NO tags in [4, 5, 6]. How do I write a query to accomplish this?
I researched enough to figure out inner joins for the tag inclusion:
SELECT i.item_id, i.item_title FROM items AS i INNER JOIN tag_item_assoc AS tia1 ON (tia1.item_id = i.item_id AND tia1.tag_id = 1)
...and just chain on the same inner join pattern for as many tags as you want to include. It may be a little bulky, but users won't be choosing more than 4 or 5 tags before they move on, so it'll do.
I was really hoping that I could exclude the same way, and wrap everything into one query:
INNER JOIN tag_item_assoc AS tia2 ON (tia2.item_id = i.item_id AND tia2.tag_id!= 2)
But it became obvious very quickly that wasn't going to work. I read a couple of articles that said LEFT OUTER JOINs could let me exclude while I include, but I couldn't figure them out, mostly because of the stray WHERE clauses. Any permutation of LEFT OUTER JOINs and INNER JOINs I tried either yielded an error or very confusing results.
All that to say - does anyone here know how I can accomplish this? I apologize for not having any useful code examples to provide. I'm ok with starting from scratch if the INNER JOINs are an obstacle - I just need a way to accomplish multiple association inclusion and exclusion at the same time. Thanks in advance for the help and expertise!
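SELECT a.*
FROM items a
INNER JOIN item_tag_assoc b
ON a.item_id = b.item_id
INNER JOIN tags c
ON b.tag_id = c.tag_id
-- Each predicate below is tested against a single joined row, so a row
-- would need tag_id 1, 2 and 3 at once; as written this matches nothing.
WHERE c.tag_id IN(1) AND
c.tag_id IN(2) AND
c.tag_id IN(3) AND
c.tag_id NOT IN (4) AND
c.tag_id NOT IN (5) AND
c.tag_id NOT IN (6) ;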
You need to match the number of matching records to the number of included tags, and check that no excluded tag matched:
SELECT a.item_id -- , add other columns here
FROM items a
INNER JOIN item_tag_assoc b
ON a.item_id = b.item_id
WHERE b.tag_id IN (1, 2, 3, 4, 5, 6)
GROUP BY a.item_id
-- assumes each (item_id, tag_id) pair appears at most once
HAVING SUM(b.tag_id IN (1, 2, 3)) = 3
AND SUM(b.tag_id IN (4, 5, 6)) = 0
Found the answer. Instead of adding INNER JOINs like so:
INNER JOIN tag_item_assoc AS tia2 ON (tia2.item_id = i.item_id AND tia2.tag_id!= 2)
I added a sub-query with a single inner join:
WHERE i.item_id NOT IN ( SELECT s.item_id FROM items AS s INNER JOIN tag_item_assoc AS tia1 ON (tia1.item_id = s.item_id AND tia1.tag_id IN (4, 5, 6)))
The sub-query creates a set of all items that have any of the tags in [4, 5, 6], and WHERE i.item_id NOT IN eliminates any records that match that set.
Thanks everyone for the effort! Hopefully this will help someone down the road.
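Inclusion and exclusion can be combined in one statement: a GROUP BY/HAVING count for the "every included tag" part and a NOT IN sub-query for the "no excluded tag" part. A minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The helper name find_items and the sample rows are hypothetical, and the sketch assumes both tag lists are non-empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, item_title TEXT);
    CREATE TABLE tag_item_assoc (item_id INTEGER, tag_id INTEGER,
                                 UNIQUE (item_id, tag_id));
    -- Hypothetical sample data.
    INSERT INTO items VALUES (1, 'all three'), (2, 'only two'),
                             (3, 'all three plus excluded');
    INSERT INTO tag_item_assoc VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 2),
        (3, 1), (3, 2), (3, 3), (3, 4);
""")

def find_items(conn, include, exclude):
    """Items carrying every tag in `include` and no tag in `exclude`."""
    inc = ",".join("?" * len(include))   # one placeholder per included tag
    exc = ",".join("?" * len(exclude))   # one placeholder per excluded tag
    sql = f"""
        SELECT i.item_id, i.item_title
        FROM items i
        JOIN tag_item_assoc t ON t.item_id = i.item_id
        WHERE t.tag_id IN ({inc})
          AND i.item_id NOT IN (
              SELECT x.item_id FROM tag_item_assoc x WHERE x.tag_id IN ({exc})
          )
        GROUP BY i.item_id, i.item_title
        HAVING COUNT(DISTINCT t.tag_id) = ?
        ORDER BY i.item_id
    """
    # Parameters follow the order of the placeholders in the statement.
    return conn.execute(sql, [*include, *exclude, len(include)]).fetchall()

print(find_items(conn, [1, 2, 3], [4, 5, 6]))
```

Item 2 is dropped by the HAVING count because it only has two of the three required tags, and item 3 is dropped by the NOT IN sub-query because it carries tag 4. Unlike the chained INNER JOIN approach, the statement does not grow with the number of included tags.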

sql query help join (i think)

I am having trouble figuring out how to get results only when products.published, product_types.published, and product_cats.published are all 1; my query isn't working. Please help:
SELECT
`products`.`title`,
`products`.`menu_id`,
`products`.`short_description`,
`products`.`datasheet_icon`,
`products`.`datasheet`,
`products`.`ordering`,
`products`.`product_type_id`,
CASE WHEN CHAR_LENGTH(`products`.`alias`)
THEN CONCAT_WS(':', `products`.`id`, `products`.`alias`)
ELSE `products`.`id`
END AS slug
FROM
`products`,
`product_cats`,
`product_types`
WHERE
`products`.published=1 AND
`product_cats`.published=1 AND
`product_types`.published=1 AND
`products`.`product_cat_id`='42' AND
`product_types`.`id` IN (1,40,48,49,50)
GROUP BY `products`.`id`
ORDER BY `product_types`.`ordering`, `products`.`ordering`
I'm going to assume that the product_cats and product_types tables have product ids in them as well, and I'll call that column pid here:
SELECT
p.title,
p.menu_id,
p.short_description,
p.datasheet_icon,
p.datasheet,
p.ordering,
p.product_type_id,
CASE
WHEN CHAR_LENGTH(p.alias)
THEN CONCAT_WS(':', p.id, p.alias)
ELSE p.id
END AS slug
FROM products p
JOIN product_cats pc ON pc.pid = p.id
JOIN product_types pt ON pt.pid = p.id
WHERE
p.published=1 AND
pc.published=1 AND
pt.published=1
GROUP BY p.id
ORDER BY pt.ordering,p.ordering
You need to join the tables!
FROM
`products`,
`product_cats`,
`product_types`
Use the relational fields (the columns that link one table to another) to do it and your problem will be gone!
I'm afraid your query is a bit of a mess. Without the table structures we can only guess at what you're trying to do. The critical information is how the three tables are related to each other.
Note the following:
You are using three tables in your SELECT, but are not JOINing them. You will need to explicitly JOIN the tables you use. The lack of explicit JOINs is the reason you're getting too many rows back and are having to use GROUP BY to eliminate duplicates. Your final solution should not use GROUP BY.
If you're only searching for product.cat_id of 42, I presume you know whether that cat_id is published and you don't need to involve the product_cats table. Is that correct?
Presumably there's a column product.type_id or something similar. Since you are searching for a limited number of these, do you know in advance that the ids in that list are published?
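A sketch of what the explicit joins might look like, run through Python's sqlite3 module. The table structures were never posted, so the join keys here are an assumption: that products.product_cat_id points at product_cats.id and products.product_type_id points at product_types.id. The sample rows are invented and the column list is cut down to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema: the real table definitions are not in the question.
    CREATE TABLE product_cats  (id INTEGER PRIMARY KEY, published INTEGER);
    CREATE TABLE product_types (id INTEGER PRIMARY KEY, published INTEGER,
                                ordering INTEGER);
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT,
                           published INTEGER, ordering INTEGER,
                           product_cat_id INTEGER, product_type_id INTEGER);
    INSERT INTO product_cats  VALUES (42, 1), (43, 0);
    INSERT INTO product_types VALUES (1, 1, 2), (40, 1, 1), (48, 0, 3);
    INSERT INTO products VALUES
        (1, 'visible A',        1, 1, 42, 1),
        (2, 'visible B',        1, 1, 42, 40),
        (3, 'unpublished',      0, 1, 42, 1),
        (4, 'unpublished type', 1, 1, 42, 48),
        (5, 'other category',   1, 1, 43, 1);
""")

titles = [row[0] for row in conn.execute("""
    SELECT p.title
    FROM products p
    JOIN product_cats  pc ON pc.id = p.product_cat_id
    JOIN product_types pt ON pt.id = p.product_type_id
    WHERE p.published = 1
      AND pc.published = 1
      AND pt.published = 1
      AND p.product_cat_id = 42
      AND pt.id IN (1, 40, 48, 49, 50)
    ORDER BY pt.ordering, p.ordering
""")]
print(titles)
```

Because every product row joins to exactly one category row and one type row under these assumptions, no duplicates appear and the GROUP BY from the original query is not needed.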

Joining Tables: case statement for no matches?

I have this query:
SELECT p.text,se.name,s.sub_name,SUM((p.volume / (SELECT SUM(p.volume)
FROM phrase p
WHERE p.volume IS NOT NULL) * sp.position))
AS `index`
FROM phrase p
LEFT JOIN `position` sp ON sp.phrase_id = p.id
LEFT JOIN `engines` se ON se.id = sp.engine_id
LEFT JOIN item s ON s.id = sp.site_id
WHERE p.volume IS NOT NULL
AND s.ignored = 0
GROUP BY se.name,s.sub_name
ORDER BY se.name,s.sub_name
There are a few things I want to do with it:
1) At the end of the calculation for 'index', I multiply it all by sp.position, then get its SUM. If there is NO MATCH in the first LEFT JOIN 'position', I want to give sp.position a value of 200. So basically, if the 'phrase' table has an ID=2 that does not exist as sp.phrase_id anywhere in the 'position' table, then sp.position=200 for the 'index' calculation; otherwise it will be whatever value is stored in the 'position' table. I hope that makes sense.
2) I do a GROUP BY se.name. I would like to actually SUM the entire 'index' values for similar se.name fields. So in the resultset as it stands now, if there were 20 p.text rows with the same se.name, I would like to SUM the index column for the same se.name(s).
I am more of a PHP guy, but trying to learn more MySQL. I have become a big believer in making the DB do as much of the work as possible instead of trying to manipulate the dataset after it's been returned.
I hope the questions were clear. Anyways, can both 1) and 2) be done? There's much more I want to modify this query to do, but I think if I need more help in the future on it, it would require a different question.
The position table has an engines_id, phrase_id, and item_id, which together make an entry unique. The value I am trying to calculate with is the sp.position value. But there are cases when there is no entry for these IDs combined. If there is no entry for the combination of the 3 IDs I just listed, I would like to use sp.position=200 in my calculation.
How's this:
select x.name, sum(x.`index`) from
(
SELECT p.text,se.name,s.sub_name,SUM((p.volume / (SELECT SUM(p.volume)
FROM phrase p
WHERE p.volume IS NOT NULL) * if(sp.position is null,200,sp.position)))
AS `index`
FROM phrase p
LEFT JOIN `position` sp ON sp.phrase_id = p.id
LEFT JOIN `engines` se ON se.id = sp.engine_id
LEFT JOIN item s ON s.id = sp.site_id
WHERE p.volume IS NOT NULL
AND s.ignored = 0
GROUP BY se.name,s.sub_name
ORDER BY se.name,s.sub_name
)x
GROUP BY x.name
Try the following:
1.) Use IFNULL(), in your case IFNULL(sp.position, 200)
2.) I am not entirely clear on this part, but it seems like you already have part of what you are asking.
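A minimal sketch of the IFNULL default on an unmatched LEFT JOIN row, using Python's sqlite3 module as a stand-in for MySQL (both provide IFNULL). It is cut down to the phrase and position tables, with invented rows, and leaves out the engines and item joins from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE phrase (id INTEGER PRIMARY KEY, text TEXT, volume INTEGER);
    CREATE TABLE position (phrase_id INTEGER, position INTEGER);
    -- Hypothetical sample data; phrase 2 has no row in position.
    INSERT INTO phrase VALUES (1, 'ranked', 100), (2, 'unranked', 300);
    INSERT INTO position VALUES (1, 5);
""")

rows = conn.execute("""
    SELECT p.text,
           IFNULL(sp.position, 200) AS position_or_default,
           -- share of total volume, weighted by the (defaulted) position
           1.0 * p.volume / (SELECT SUM(volume) FROM phrase
                             WHERE volume IS NOT NULL)
               * IFNULL(sp.position, 200) AS weighted
    FROM phrase p
    LEFT JOIN position sp ON sp.phrase_id = p.id
    WHERE p.volume IS NOT NULL
    ORDER BY p.id
""").fetchall()
print(rows)
```

One thing to watch in the full query: a WHERE condition on a column from a left-joined table, such as s.ignored = 0, is NULL for unmatched rows and so filters them out before the default of 200 can apply. Moving that kind of condition into the corresponding ON clause keeps the unmatched phrases in the result.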