Duplicate values from database when joining - mysql

I want to get the tags and genres that are connected to the items using two relation tables, but I'm getting duplicate values.
This is my query. I'm grouping the values by the item's id, so I don't understand why it is giving me duplicate values.
SELECT
name,
GROUP_CONCAT(tag) AS tags,
GROUP_CONCAT(genre) AS genres
FROM items
LEFT JOIN tagsItemsRelation ON
tagsItemsRelation.itemId = items.id
LEFT JOIN tags ON
tags.id = tagsItemsRelation.tagId
LEFT JOIN genresItemsRelation ON
genresItemsRelation.itemId = items.id
LEFT JOIN genres ON
genres.id = genresItemsRelation.genreId
GROUP BY items.id
Here is a SQLFiddle
As you can see it gives me duplicate values:
NAME TAGS GENRES
item1 tag2,tag1 genre1,genre1

You are aggregating along two different dimensions at the same time. That is why you are getting duplicates. So, if a name has tags, t1, t2, and t3 along with genres g1 and g2, then your joins are producing six rows for the name, with all combinations of the tags and genres.
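To see the row multiplication directly, here is a minimal sketch that runs the question's joins without the GROUP BY. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample data (three tags, two genres for one item) is made up to match the example above.

```python
import sqlite3

# Hypothetical miniature schema; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags   (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE genres (id INTEGER PRIMARY KEY, genre TEXT);
    CREATE TABLE tagsItemsRelation   (itemId INTEGER, tagId INTEGER);
    CREATE TABLE genresItemsRelation (itemId INTEGER, genreId INTEGER);

    INSERT INTO items  VALUES (1, 'item1');
    INSERT INTO tags   VALUES (1, 't1'), (2, 't2'), (3, 't3');
    INSERT INTO genres VALUES (1, 'g1'), (2, 'g2');
    INSERT INTO tagsItemsRelation   VALUES (1, 1), (1, 2), (1, 3);
    INSERT INTO genresItemsRelation VALUES (1, 1), (1, 2);
""")

# Same joins as in the question, but without GROUP BY, to expose the raw rows.
joined_rows = conn.execute("""
    SELECT items.name, tags.tag, genres.genre
    FROM items
    LEFT JOIN tagsItemsRelation   ON tagsItemsRelation.itemId = items.id
    LEFT JOIN tags                ON tags.id = tagsItemsRelation.tagId
    LEFT JOIN genresItemsRelation ON genresItemsRelation.itemId = items.id
    LEFT JOIN genres              ON genres.id = genresItemsRelation.genreId
""").fetchall()

for row in joined_rows:
    print(row)
print(len(joined_rows), "rows for one item")
```

Every tag is paired with every genre, so each tag reaches GROUP_CONCAT twice and each genre three times.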
If you have just a handful of multiple values for tags and genres, then the easiest solution is to use distinct:
SELECT name, GROUP_CONCAT(DISTINCT tag) AS tags, GROUP_CONCAT(DISTINCT genre) AS genres
FROM items LEFT JOIN
tagsItemsRelation
ON tagsItemsRelation.itemId = items.id LEFT JOIN
tags
ON tags.id = tagsItemsRelation.tagId LEFT JOIN
genresItemsRelation
ON genresItemsRelation.itemId = items.id LEFT JOIN
genres
ON genres.id = genresItemsRelation.genreId
GROUP BY items.name;
If you have lots of duplicates (dozens or hundreds per name), then the generation and handling of the duplicates can be a real performance issue. In that case, you would want to pre-aggregate the values along each dimension and then do the join.
Note that I changed the group by condition to be on name rather than id. It is good form for the group by columns to match the select columns.
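For the pre-aggregation approach mentioned above, a rough sketch could look like the following. It is run through Python's sqlite3 as a stand-in for MySQL; the sample rows and the tag_list / genre_list aliases are assumptions for illustration only.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags   (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE genres (id INTEGER PRIMARY KEY, genre TEXT);
    CREATE TABLE tagsItemsRelation   (itemId INTEGER, tagId INTEGER);
    CREATE TABLE genresItemsRelation (itemId INTEGER, genreId INTEGER);

    INSERT INTO items  VALUES (1, 'item1'), (2, 'item2');
    INSERT INTO tags   VALUES (1, 'tag1'), (2, 'tag2');
    INSERT INTO genres VALUES (1, 'genre1');
    INSERT INTO tagsItemsRelation   VALUES (1, 1), (1, 2);
    INSERT INTO genresItemsRelation VALUES (1, 1);
""")

# Each dimension is aggregated on its own, then joined to items once,
# so tags and genres never multiply each other.
pre_aggregated = conn.execute("""
    SELECT i.name, t.tag_list AS tags, g.genre_list AS genres
    FROM items i
    LEFT JOIN (
        SELECT r.itemId, GROUP_CONCAT(tags.tag) AS tag_list
        FROM tagsItemsRelation r
        JOIN tags ON tags.id = r.tagId
        GROUP BY r.itemId
    ) t ON t.itemId = i.id
    LEFT JOIN (
        SELECT r.itemId, GROUP_CONCAT(genres.genre) AS genre_list
        FROM genresItemsRelation r
        JOIN genres ON genres.id = r.genreId
        GROUP BY r.itemId
    ) g ON g.itemId = i.id
    ORDER BY i.id
""").fetchall()

for row in pre_aggregated:
    print(row)
```

Items without any tags or genres still appear, with NULL in the corresponding column, because of the outer LEFT JOINs.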

Related

Select on join table with exact number of items

I have two tables
tracks
tags
One track has many tags.
I want a list of tracks that have both of two tags, for example tag_id 1 and tag_id 2.
SELECT * FROM tracks
LEFT JOIN tags ON tracks.tag_id = tags.id
WHERE tags.id in (1,2)
GROUP BY track.id
HAVING count(tags.id) = 2
The problem is that if a track has tags 1 and 3, it is listed too.
Any help, please?
Add distinct to count
SELECT tracks.id FROM tracks
LEFT JOIN tags ON tracks.tag_id = tags.id
WHERE tags.id in (1,2)
GROUP BY tracks.id
HAVING count(Distinct tags.id) = 2
You can also change the LEFT JOIN to an INNER JOIN, since the WHERE clause implicitly converts it to one anyway.
Your code should do what you want it to. I would write it as:
SELECT tracks.id
FROM tracks INNER JOIN
tags
ON tracks.tag_id = tags.id
WHERE tags.id in (1, 2)
GROUP BY tracks.id
HAVING count(tags.id) = 2;
Note:
The LEFT JOIN is turned into an INNER JOIN by the WHERE clause. You might as well be specific.
If you have duplicates in tracks, then you want to use COUNT(DISTINCT) rather than COUNT().
Because you are returning non-aggregated columns, you might get unexpected results in other columns.
Actually, this can be further simplified to:
SELECT t.id
FROM tracks t
WHERE t.tag_id in (1, 2)
GROUP BY t.id
HAVING count(t.id) = 2;
The JOIN is not needed at all, because you have the information you need in tracks.tag_id.
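As a quick check of the difference between COUNT() and COUNT(DISTINCT) here, a small sketch using Python's sqlite3 (standing in for MySQL). The table layout, one row per track and tag pair, is an assumption inferred from tracks.tag_id, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical layout: one row per (track, tag) pair, as tracks.tag_id implies.
    CREATE TABLE tracks (id INTEGER, tag_id INTEGER);
    INSERT INTO tracks VALUES
        (1, 1), (1, 2),   -- track 1 has both tags
        (2, 1), (2, 3),   -- track 2 has tag 1 and tag 3
        (3, 1), (3, 1);   -- track 3 has tag 1 twice (a duplicate row)
""")

QUERY = """
    SELECT t.id
    FROM tracks t
    WHERE t.tag_id IN (1, 2)
    GROUP BY t.id
    HAVING {count_expr} = 2
    ORDER BY t.id
"""

plain = [r[0] for r in conn.execute(QUERY.format(count_expr="COUNT(t.tag_id)"))]
distinct = [r[0] for r in conn.execute(QUERY.format(count_expr="COUNT(DISTINCT t.tag_id)"))]

print(plain)     # [1, 3]
print(distinct)  # [1]
```

Track 2 is excluded either way, because its tag 3 row is filtered out by the WHERE clause before counting. Track 3 only slips through the plain COUNT() because of its duplicate row.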

mySQL multiple Joins for 5 tables with timeout errors

I have 5 tables:
Categories_groups_cat1
id|order|title(text)
Categories_vs_groups
id|categories_groups_cat1_id|categories_id
Categories
id|title(text)
Offers
id|category (text)
Coupons
id|category (text)
I want to display titles from Categories_groups_cat1 only if:
At least one row exists in Categories_vs_groups via the categories_groups_cat1_id column (Categories_vs_groups.categories_groups_cat1_id==Categories_groups_cat1.id)
AND the categories_id from Categories_vs_groups matches at least one row in table Categories (Categories.id==Categories_vs_groups.categories_id) AND that category has at least one row in table Offers or Coupons via the category column (offers.category==categories.title).
I wrote the following query, but it times out because the offers and coupons tables have more than 500,000 rows.
SELECT
offers.category_gr,
categories.id, categories.title_gr,
categories_vs_groups.categories_id,
categories_vs_groups.categories_groups_cat1_id AS cat1,
categories_vs_groups.categories_groups_cat2_id AS cat2
FROM offers
LEFT JOIN categories
ON categories.title_gr=offers.category_gr
LEFT JOIN categories_vs_groups
ON categories_vs_groups.categories_id=categories.id
WHERE categories.title_gr='$row_best_offer[category_gr]'
GROUP BY categories.id
order by categories.id
If you have the right set-up, your query should not take more than 60 seconds.
Your WHERE clause is undoing the first LEFT JOIN (and you probably don't need the second either). Because you are aggregating by categories, I would suggest starting with that table. Then, the offers table is only needed for filtering, so you should consider whether it is really necessary.
So, consider this query:
SELECT c.category_gr, c.id, c.title_gr,
cvg.categories_id, cvg.categories_groups_cat1_id AS cat1,
cvg.categories_groups_cat2_id AS cat2
FROM categories c JOIN
offers o
ON c.title_gr = o.category_gr LEFT JOIN
categories_vs_groups cvg
ON cvg.categories_id = c.id
WHERE c.title_gr = '$row_best_offer[category_gr]'
GROUP BY c.id
ORDER BY c.id;
For best results, you want indexes on categories(title_gr, category_gr, id), offers(category_gr), and categories_vs_groups(categories_id).
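A sketch of those indexes, created and exercised through Python's sqlite3 as a stand-in for MySQL. The index names, column types, and sample rows are hypothetical, and the `?` placeholder takes the place of the PHP-interpolated `$row_best_offer[category_gr]` value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical column types; only the columns used by the query are modelled.
    CREATE TABLE categories (id INTEGER PRIMARY KEY, title_gr TEXT, category_gr TEXT);
    CREATE TABLE offers (id INTEGER PRIMARY KEY, category_gr TEXT);
    CREATE TABLE categories_vs_groups (
        id INTEGER PRIMARY KEY,
        categories_id INTEGER,
        categories_groups_cat1_id INTEGER,
        categories_groups_cat2_id INTEGER
    );

    -- The three indexes suggested above (index names are made up).
    CREATE INDEX idx_categories_title ON categories (title_gr, category_gr, id);
    CREATE INDEX idx_offers_category  ON offers (category_gr);
    CREATE INDEX idx_cvg_categories   ON categories_vs_groups (categories_id);

    INSERT INTO categories VALUES (1, 'shoes', 'shoes'), (2, 'hats', 'hats');
    INSERT INTO offers VALUES (1, 'shoes'), (2, 'shoes'), (3, 'hats');
    INSERT INTO categories_vs_groups VALUES (1, 1, 10, 20);
""")

index_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)

# The rewritten query, with a bound parameter instead of string interpolation.
rows = conn.execute("""
    SELECT c.category_gr, c.id, c.title_gr,
           cvg.categories_id, cvg.categories_groups_cat1_id AS cat1,
           cvg.categories_groups_cat2_id AS cat2
    FROM categories c
    JOIN offers o ON c.title_gr = o.category_gr
    LEFT JOIN categories_vs_groups cvg ON cvg.categories_id = c.id
    WHERE c.title_gr = ?
    GROUP BY c.id
    ORDER BY c.id
""", ("shoes",)).fetchall()

print(index_names)
print(rows)
```

In MySQL you would run the CREATE INDEX statements once against the real tables and then check the plan with EXPLAIN; the in-memory data here only shows that the query shape is valid.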
Instead of putting the condition categories.title_gr='$row_best_offer[category_gr]' in the WHERE clause, put it in the ON clause.
The first LEFT JOIN is unnecessary.
SELECT
offers.category_gr,
categories.id, categories.title_gr,
categories_vs_groups.categories_id,
categories_vs_groups.categories_groups_cat1_id AS cat1,
categories_vs_groups.categories_groups_cat2_id AS cat2
FROM offers
JOIN categories
ON
(categories.title_gr=offers.category_gr
and categories.title_gr='$row_best_offer[category_gr]' )
LEFT JOIN categories_vs_groups
ON categories_vs_groups.categories_id=categories.id
GROUP BY categories.id
order by categories.id

Retrieve records from multiple tables some distinct, some not

I have 4 tables in an existing mysql database of a directory type site.
Table mt_links contains basic info for each listing
Table mt_cl contains which listing above is in what category (I only want cat_id=1)
Table mt_cfvalues contains more details for each listing. It can have repeated values.
Table mt_images contains image names for each listing.
I want all records from mt_links where the mt_cl cat_id=1, and for each of those records, I need all records in mt_cfvalues and mt_images matching the link_id.
I set up a select with Group_Concat and left joins, but ended up with repeating values in my results. I added Distinct, which cured the repeating values, but mt_cfvalues can have records with the same value, so now I'm missing a value I should have.
SELECT a.link_id,
a.link_name,
a.link_desc,
GROUP_CONCAT(DISTINCT b.value ORDER BY b.cf_ID) AS details,
GROUP_CONCAT(DISTINCT c.filename ORDER BY c.ordering) AS images
FROM mt_links a
LEFT JOIN mt_cfvalues b ON a.link_id = b.link_ID
LEFT JOIN mt_images c ON b.link_id = c.link_ID
LEFT JOIN mt_cl d ON a.link_id = d.link_ID WHERE d.cat_ID = '1'
GROUP BY a.link_id
I put together a SQLFiddle here: http://www.sqlfiddle.com/#!2/f39e9/1
Is there an easier way? How do I fix the repeating / no repeating issue?
Here is one way of accomplishing what you seek. Because the two subqueries return independent results, you can't combine the GROUP BY, which is why you were getting duplicates.
SELECT a.link_id,
a.link_name,
a.link_desc,
cvf.details,
imgs.images
FROM mt_links a
LEFT JOIN (
SELECT link_ID, GROUP_CONCAT(value ORDER BY cf_ID) AS details
FROM mt_cfvalues
GROUP BY link_ID
) cvf ON cvf.link_ID = a.link_id
LEFT JOIN (
SELECT link_ID, GROUP_CONCAT(filename ORDER BY ordering) AS images
FROM mt_images
GROUP BY link_ID
) imgs ON imgs.link_ID = a.link_id
INNER JOIN mt_cl d ON a.link_id = d.link_ID
WHERE d.cat_ID = '1'
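To illustrate that separate subqueries keep legitimately repeated values, here is a small sketch run through Python's sqlite3 as a stand-in for MySQL. The sample rows are invented, and the ORDER BY inside GROUP_CONCAT is left out because older SQLite versions do not accept it, so the order of the concatenated values is not guaranteed in this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data for a single listing in category 1.
    CREATE TABLE mt_links (link_id INTEGER PRIMARY KEY, link_name TEXT, link_desc TEXT);
    CREATE TABLE mt_cl (link_id INTEGER, cat_id INTEGER);
    CREATE TABLE mt_cfvalues (cf_id INTEGER, link_id INTEGER, value TEXT);
    CREATE TABLE mt_images (link_id INTEGER, filename TEXT, ordering INTEGER);

    INSERT INTO mt_links VALUES (1, 'Cafe', 'A cafe');
    INSERT INTO mt_cl VALUES (1, 1);
    -- 'yes' appears twice on purpose: a repeated value that must survive.
    INSERT INTO mt_cfvalues VALUES (1, 1, 'yes'), (2, 1, 'yes'), (3, 1, 'no');
    INSERT INTO mt_images VALUES (1, 'a.jpg', 1), (1, 'b.jpg', 2);
""")

row = conn.execute("""
    SELECT a.link_id, cvf.details, imgs.images
    FROM mt_links a
    LEFT JOIN (
        SELECT link_id, GROUP_CONCAT(value) AS details
        FROM mt_cfvalues
        GROUP BY link_id
    ) cvf ON cvf.link_id = a.link_id
    LEFT JOIN (
        SELECT link_id, GROUP_CONCAT(filename) AS images
        FROM mt_images
        GROUP BY link_id
    ) imgs ON imgs.link_id = a.link_id
    INNER JOIN mt_cl d ON d.link_id = a.link_id
    WHERE d.cat_id = 1
""").fetchone()

details = sorted(row[1].split(","))
images = sorted(row[2].split(","))
print(details)  # ['no', 'yes', 'yes']
print(images)   # ['a.jpg', 'b.jpg']
```

Both 'yes' values are kept and neither image is repeated, which DISTINCT on a single combined join cannot achieve.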

SQL Multiple Joins multiple where

I have three tables in question: categories, vocabulary and tex. I am trying to figure out how to have multiple joins in my query. I thought you could just add as many joins as you wanted, as long as you reference them properly.
So, the following two work perfectly on their own:
1.
SELECT
categories.ID AS ID,
categories.ParentID AS ID,
vocabulary.value AS Name
FROM categories
INNER JOIN vocabulary
ON categories.sid=vocabulary.sid
WHERE vocabulary.langid=1
2.
SELECT
categories.ID AS ID,
categories.ParentID AS ID,
tex.value AS Description
FROM categories
INNER JOIN tex
ON categories.tid=tex.tid
WHERE tex.langid=1
However, if i try to combine them as follows, it does not work.
categories.ID AS ID,
categories.ParentID AS ID,
vocabulary.value AS Name
tex.value AS Description
FROM categories
INNER JOIN tex
ON categories.tid=tex.tid
WHERE tex.langid=1
INNER JOIN vocabulary
ON categories.sid=vocabulary.sid
WHERE vocabulary.langid=1
Any ideas?
Thanks in advance
John
In MySQL, when you have columns with the same name, only one of them will be shown. You need to identify them uniquely by supplying an ALIAS. You can put the condition in either the ON clause or the WHERE clause; both yield the same result here because the joins are INNER JOINs.
SELECT categories.ID AS CategoryID,
categories.ParentID AS CategoryParentID,
vocabulary.value AS Name,
tex.value AS Description
FROM categories
INNER JOIN tex
ON categories.tid = tex.tid
INNER JOIN vocabulary
ON categories.sid = vocabulary.sid
WHERE vocabulary.langid = 1 AND
tex.langid = 1
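A small sketch comparing the two placements of the langid conditions, run through Python's sqlite3 as a stand-in for MySQL. The column types and sample rows are assumptions; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: one category with texts in two languages.
    CREATE TABLE categories (ID INTEGER, ParentID INTEGER, sid INTEGER, tid INTEGER);
    CREATE TABLE vocabulary (sid INTEGER, langid INTEGER, value TEXT);
    CREATE TABLE tex (tid INTEGER, langid INTEGER, value TEXT);

    INSERT INTO categories VALUES (10, 1, 100, 200);
    INSERT INTO vocabulary VALUES (100, 1, 'Books'), (100, 2, 'Livres');
    INSERT INTO tex VALUES (200, 1, 'Printed books'), (200, 2, 'Livres imprimes');
""")

SELECT_LIST = """
    SELECT categories.ID AS CategoryID,
           categories.ParentID AS CategoryParentID,
           vocabulary.value AS Name,
           tex.value AS Description
    FROM categories
"""

# All joins first, then a single WHERE clause holding both filters.
filter_in_where = conn.execute(SELECT_LIST + """
    INNER JOIN tex ON categories.tid = tex.tid
    INNER JOIN vocabulary ON categories.sid = vocabulary.sid
    WHERE vocabulary.langid = 1 AND tex.langid = 1
""").fetchall()

# The same filters moved into the ON clauses of the inner joins.
filter_in_on = conn.execute(SELECT_LIST + """
    INNER JOIN tex ON categories.tid = tex.tid AND tex.langid = 1
    INNER JOIN vocabulary ON categories.sid = vocabulary.sid AND vocabulary.langid = 1
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

With INNER JOINs the two forms return the same rows. With LEFT JOINs they would differ, because a filter in WHERE discards the unmatched rows that the outer join would otherwise keep.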

SQL: How to get an occurrence count of unique values from a query complicated by joins?

I have three tables:
Content
Content_Category
Content_Class
There are four types of Content Classes, and Content_Class has a Many:1 relationship with Content rows to link each content with 1 or more classes.
The same rule applies to categories, which also has a Many:1 relationship with content rows.
My goal is, given a set of content rows possibly filtered on category, what are the aggregate counts of the Content_Class rows per Class?
my current query:
SELECT cc.class_Id, COUNT(*) AS `records`
FROM Content_Class cc
LEFT JOIN Content c ON c.id = cc.content_id
LEFT JOIN Content_Category ccat ON c.id = ccat.content_id
WHERE cc.class_id IS NOT NULL
GROUP BY cc.class_id;
This does not provide accurate counts, because for content that contain more than one category relation, the content row shows up once per category link in the query response, inflating the count of class occurrences.
For instance, if a content row is mapped to two categories, and two classes, it will show up four times, doubling the actual count of unique content-to-class relationships in the result...
What's the best query to get the UNIQUE count of all content class occurrences by UNIQUE content row?
You should try this:
SELECT COUNT(t.class_id)
FROM (SELECT DISTINCT cc.class_id, COUNT(*) AS `records`
FROM Content_Class cc
LEFT JOIN Content c ON c.id = cc.content_id
LEFT JOIN Content_Category ccat ON c.id = ccat.content_id
WHERE cc.class_id IS NOT NULL
GROUP BY cc.class_id) t;
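One way to avoid the inflation described in the question, offered as a sketch rather than a definitive answer, is to count distinct content ids per class instead of counting joined rows. The example runs through Python's sqlite3 as a stand-in for MySQL; the category_id column and the sample rows are hypothetical, and it assumes each (content_id, class_id) pair appears only once in Content_Class.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data: content 1 is in two classes and two categories.
    CREATE TABLE Content (id INTEGER PRIMARY KEY);
    CREATE TABLE Content_Class (content_id INTEGER, class_id INTEGER);
    CREATE TABLE Content_Category (content_id INTEGER, category_id INTEGER);

    INSERT INTO Content VALUES (1), (2);
    INSERT INTO Content_Class VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO Content_Category VALUES (1, 7), (1, 8), (2, 7);
""")

QUERY = """
    SELECT cc.class_id, {count_expr} AS records
    FROM Content_Class cc
    LEFT JOIN Content c ON c.id = cc.content_id
    LEFT JOIN Content_Category ccat ON c.id = ccat.content_id
    WHERE cc.class_id IS NOT NULL
    GROUP BY cc.class_id
    ORDER BY cc.class_id
"""

# COUNT(*) counts one row per category link, which inflates the totals.
inflated = conn.execute(QUERY.format(count_expr="COUNT(*)")).fetchall()
# COUNT(DISTINCT cc.content_id) counts each content row once per class.
unique = conn.execute(QUERY.format(count_expr="COUNT(DISTINCT cc.content_id)")).fetchall()

print(inflated)  # [(1, 3), (2, 2)]
print(unique)    # [(1, 2), (2, 1)]
```

A category filter can still be added to the WHERE clause on ccat; the DISTINCT keeps each content row from being counted more than once per class.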