I need to group my anime index by AniDB ID and show the results in descending order of the files' auto-increment ID.
Here's what I have so far:
SELECT
f.id, f.category, f.anidb, f.mal_id, COUNT( * ) AS dupes, f.filename,
a.titles, a.synopsis, a.episodes, a.image, a.rating,
c.name as cat_name, c.id as categoryid
FROM table_files f
LEFT JOIN table_anidb a ON a.id = f.anidb
LEFT JOIN table_categories c ON c.id = f.category
GROUP BY a.id ORDER BY f.id DESC
PROBLEM:
I have 8 episodes of Naruto. Episode 8 has ID 204 and episode 1 has ID 160. The query returns this:
id | anidb | filename | dupes | cat_name
--------------------------------------------------------
201 | 8692 | SAO | 1 | Series
200 | 9251 | RYO | 1 | Movie
.....
.......
160 | 239 | Naruto ep.1 | 8 | Series
But I want Naruto episode 8 to be shown at the top of the results, instead of episode 1 at the bottom.
Also, how do I group by anidb and mal_id at the same time with OR logic, so that the grouping still works when no AniDB ID is provided?
Ad. 1.
Since id, anidb and filename are all in one table, I'm afraid you can't get away from joining against a subquery:
SQLFiddle
SELECT f.id, f.anidb, f.filename
FROM files f
JOIN
(SELECT MAX(id) as id FROM files GROUP BY anidb) AS f2
ON f2.id = f.id
ORDER BY f.id DESC
(the data is flattened for the sake of readability, but you can get the general idea)
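To see the MAX(id) join at work, here is a minimal sketch run against an in-memory SQLite database from Python. SQLite only stands in for MySQL here, the sample rows are loosely based on the question (only some Naruto episodes are included), and the extra dupes column in the subquery is my own addition to keep the duplicate count from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, anidb INTEGER, filename TEXT)")

# Hypothetical sample: Naruto episodes 1-4 (ids 160..163) plus episode 8 (id 204).
rows = [(160 + n, 239, f"Naruto ep.{n + 1}") for n in range(4)]
rows += [(200, 9251, "RYO"), (201, 8692, "SAO"), (204, 239, "Naruto ep.8")]
conn.executemany("INSERT INTO files VALUES (?, ?, ?)", rows)

# The subquery picks the newest id per anidb; joining back fetches that row's columns.
latest = conn.execute("""
    SELECT f.id, f.anidb, f.filename, f2.dupes
    FROM files f
    JOIN (SELECT MAX(id) AS id, COUNT(*) AS dupes
          FROM files
          GROUP BY anidb) AS f2 ON f2.id = f.id
    ORDER BY f.id DESC
""").fetchall()

for row in latest:
    print(row)
```

With this sample the Naruto group is represented by episode 8 (id 204) and therefore comes first.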
Ad. 2.
As for the second problem, you really just have to add a second grouping column to the joined subquery above:
SQLFiddle
SELECT f.id, f.anidb, f.mal_id, f.filename
FROM files f
JOIN
(SELECT MAX(id) as id FROM files GROUP BY anidb, mal_id) AS f2 on f2.id = f.id
ORDER BY f.id DESC
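Be aware that although NULL = NULL is never true in a comparison, GROUP BY places all NULL values in the same group. Rows with a NULL anidb are therefore kept apart only by their mal_id; rows where both anidb and mal_id are NULL end up in a single group.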
For the first problem you can use ORDER BY dupes
SELECT
f.id, f.category, f.anidb, f.mal_id, COUNT( * ) AS dupes, f.filename,
a.titles, a.synopsis, a.episodes, a.image, a.rating,
c.name as cat_name, c.id as categoryid
FROM table_files f
LEFT JOIN table_anidb a ON a.id = f.anidb
LEFT JOIN table_categories c ON c.id = f.category
GROUP BY a.id ORDER BY dupes DESC
For the second problem you can use CASE to check if f.anidb is null
SELECT
f.id, f.category, f.anidb, f.mal_id, COUNT( * ) AS dupes, f.filename,
a.titles, a.synopsis, a.episodes, a.image, a.rating,
c.name as cat_name, c.id as categoryid
FROM table_files f
LEFT JOIN table_anidb a ON a.id = f.anidb
LEFT JOIN table_categories c ON c.id = f.category
GROUP BY
(CASE WHEN f.anidb IS NULL THEN f.mal_id ELSE f.anidb END )
ORDER BY dupes DESC
A seemingly generic SQL query has left me clueless.
Here's the case.
I have 3 generic tables (simplified versions here):
Movie
id | title
-----------------------
1 | Evil Dead
-----------------------
2 | Bohemian Rhapsody
....
Genre
id | title
-----------------------
1 | Horror
-----------------------
2 | Comedy
....
Rating
id | title
-----------------------
1 | PG-13
-----------------------
2 | R
....
And 2 many-to-many tables to connect them:
Movie_Genre
movie_id | genre_id
Movie_Rating
movie_id | rating_id
The initial challenge was to write a query which allows me to fetch movies that belong to multiple genres (e.g. horror comedies or sci-fi action).
Thankfully, I was able to find this solution here
MySQL: Select records where joined table matches ALL values
However, what would be the correct way to fetch records that match across multiple many-to-many tables? E.g. rated R horror comedies. Is there any way to do so without a subquery (or with only a single one)?
One method uses correlated subqueries:
select m.*
from movie m
where (select count(*)
from movie_genre mg
where mg.movie_id = m.id
) > 1 and
(select count(*)
from movie_rating mr
where mr.movie_id = m.id
) > 1 ;
With indexes on movie_genre(movie_id) and movie_rating(movie_id) this probably has quite reasonable performance.
The above is possibly the most efficient method. However, if you wanted to avoid subqueries, one method would be:
select mg.movie_id
from movie_genre mg join
movie_rating mr
on mg.movie_id = mr.movie_id
group by mg.movie_id
having count(distinct mg.genre_id) > 1 and
count(distinct mr.rating_id) > 1;
More efficient than the above is aggregating before the join:
select mg.movie_id
from (select movie_id
from movie_genre
group by movie_id
having count(*) >= 2
) mg join
(select movie_id
from movie_rating
group by movie_id
having count(*) >= 2
) mr
on mg.movie_id = mr.movie_id;
Although you state that you want to avoid subqueries, the irony is that the version with no subqueries probably has the worst performance of these three options.
E.g. rated R horror comedies
You can join all the tables together, aggregate by movie and filter with a HAVING clause:
select m.id, m.title
from movie m
inner join movie_genre mg on mg.movie_id = m.id
inner join genre g on g.id = mg.genre_id
inner join movie_rating mr on mr.movie_id = m.id
inner join rating r on r.id = mr.rating_id
group by m.id, m.title
having
max(r.title = 'R') = 1
and max(g.title = 'Horror') = 1
and max(g.title = 'Comedy') = 1
You can also use a couple of exists conditions along with correlated subqueries:
select m.*
from movie m
where
exists (
select 1
from movie_genre mg
inner join genre g on g.id = mg.genre_id
where mg.movie_id = m.id and g.title = 'Horror'
)
and exists (
select 1
from movie_genre mg
inner join genre g on g.id = mg.genre_id
where mg.movie_id = m.id and g.title = 'Comedy'
)
and exists (
select 1
from movie_rating mr
inner join rating r on r.id = mr.rating_id
where mr.movie_id = m.id and r.title = 'R'
)
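For anyone who wants to try the join-and-HAVING variant, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The film titles and their genre and rating assignments are made up for the demonstration; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE genre (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE rating (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE movie_genre (movie_id INTEGER, genre_id INTEGER);
    CREATE TABLE movie_rating (movie_id INTEGER, rating_id INTEGER);

    -- Hypothetical data: Film A is an R-rated horror comedy,
    -- Film B is an R-rated comedy, Film C is a PG-13 horror comedy.
    INSERT INTO movie VALUES (1, 'Film A'), (2, 'Film B'), (3, 'Film C');
    INSERT INTO genre VALUES (1, 'Horror'), (2, 'Comedy');
    INSERT INTO rating VALUES (1, 'PG-13'), (2, 'R');
    INSERT INTO movie_genre VALUES (1, 1), (1, 2), (2, 2), (3, 1), (3, 2);
    INSERT INTO movie_rating VALUES (1, 2), (2, 2), (3, 1);
""")

# Each comparison evaluates to 0 or 1, so MAX(...) = 1 means
# "at least one joined row of this movie satisfies the condition".
matches = conn.execute("""
    SELECT m.id, m.title
    FROM movie m
    JOIN movie_genre mg ON mg.movie_id = m.id
    JOIN genre g ON g.id = mg.genre_id
    JOIN movie_rating mr ON mr.movie_id = m.id
    JOIN rating r ON r.id = mr.rating_id
    GROUP BY m.id, m.title
    HAVING MAX(r.title = 'R') = 1
       AND MAX(g.title = 'Horror') = 1
       AND MAX(g.title = 'Comedy') = 1
    ORDER BY m.id
""").fetchall()

print(matches)
```

Only Film A satisfies all three conditions; Film B lacks the Horror genre and Film C lacks the R rating.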
This is the query:
SELECT a.id, a.userName,if(o.userId=1,'C',if(i.userId=1,'I','N')) AS relation
FROM tbl_users AS a
LEFT JOIN tbl_contacts AS o ON a.id = o.contactId
LEFT JOIN tbl_invites AS i ON a.id = i.invitedId
ORDER BY relation
This returns the output as follows:
+----+--------------+-------------+
| ID | USERNAME | RELATION |
+----+--------------+-------------+
| 1 | ray | C |
+----+--------------+-------------+
| 2 | john | I |
+----+--------------+-------------+
| 1 | ray | N |
+----+--------------+-------------+
I need to remove the third row from the select query by checking, if possible, whether the id is a duplicate. The priority is as follows:
C -> I -> N. So since there is already a "ray" with a C, I don't want it again with an I or N.
I tried adding distinct(a.id) but it doesn't work. How do I do this?
Why doesn't DISTINCT work for this?
From the specs you gave, all you have to do is group by ID and username, then pick the lowest value of relation you can find (since C < I < N)
SELECT a.id, a.userName, MIN(if(o.userId=1,'C',if(i.userId=1,'I','N'))) AS relation
FROM tbl_users AS a
LEFT JOIN tbl_contacts AS o ON a.id = o.contactId
LEFT JOIN tbl_invites AS i ON a.id = i.invitedId
GROUP BY a.id, a.userName
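DISTINCT does not help here because it compares whole rows, and (1, ray, C) differs from (1, ray, N). Below is a rough sketch of the MIN approach using Python's sqlite3 as a stand-in for MySQL. The table contents are guesses chosen to reproduce the output in the question, and the nested IF() is written as an equivalent CASE expression because older SQLite versions have no IF() function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_users (id INTEGER PRIMARY KEY, userName TEXT);
    CREATE TABLE tbl_contacts (userId INTEGER, contactId INTEGER);
    CREATE TABLE tbl_invites (userId INTEGER, invitedId INTEGER);
    -- Hypothetical rows: ray is a contact of users 1 and 3, john was invited by user 1.
    INSERT INTO tbl_users VALUES (1, 'ray'), (2, 'john');
    INSERT INTO tbl_contacts VALUES (1, 1), (3, 1);
    INSERT INTO tbl_invites VALUES (1, 2);
""")

# CASE equivalent of IF(o.userId=1,'C',IF(i.userId=1,'I','N')).
relation = """CASE WHEN o.userId = 1 THEN 'C'
                   WHEN i.userId = 1 THEN 'I'
                   ELSE 'N' END"""
joins = """FROM tbl_users AS a
           LEFT JOIN tbl_contacts AS o ON a.id = o.contactId
           LEFT JOIN tbl_invites AS i ON a.id = i.invitedId"""

# Original query: ray appears twice because he has two contact rows.
raw = conn.execute(
    f"SELECT a.id, a.userName, {relation} AS relation {joins} ORDER BY relation"
).fetchall()

# Grouped query: MIN keeps the alphabetically lowest code, and C < I < N.
best = conn.execute(
    f"SELECT a.id, a.userName, MIN({relation}) AS relation {joins} "
    "GROUP BY a.id, a.userName ORDER BY a.id"
).fetchall()

print(raw)
print(best)
```

This only works because the priority order happens to match alphabetical order; a different priority needs an explicit ranking.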
There are multiple ways to get the group-wise maximum/minimum, as you can see in this manual page.
The first one suits you best if the order of the rows cannot be defined alphabetically.
In that case, e.g. if the desired order were z-a-m (see Rams' comment), you'd need the FIELD() function.
So your answer is
SELECT
a.id,
a.userName,
if(o.userId=1,'C',if(i.userId=1,'I','N')) AS relation
FROM tbl_users a
LEFT JOIN tbl_contacts AS o ON a.id = o.contactId
LEFT JOIN tbl_invites AS i ON a.id = i.invitedId
WHERE
if(o.userId=1,'C',if(i.userId=1,'I','N')) = (
SELECT
if(o.userId=1,'C',if(i.userId=1,'I','N')) AS relation
FROM tbl_users aa
LEFT JOIN tbl_contacts AS o ON aa.id = o.contactId
LEFT JOIN tbl_invites AS i ON aa.id = i.invitedId
WHERE aa.id = a.id AND aa.userName = a.userName
ORDER BY FIELD(relation, 'N', 'I', 'C') DESC
LIMIT 1
)
Note that you can also write ORDER BY FIELD(relation, 'C', 'I', 'N'), which is more readable and intuitive. I turned it the other way round because, if an 'X' could ever appear in relation, FIELD() would return 0 for it (X is not among the parameters), so it would be sorted before 'C'. Sorting in descending order with the parameters reversed prevents this.
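The effect of an unlisted value is easy to check outside the database. The sketch below emulates FIELD() in plain Python (the field helper is my own stand-in and ignores MySQL details such as collation and NULL handling) and sorts a list that contains an unexpected 'X'.

```python
def field(value, *candidates):
    """Emulate MySQL FIELD(): 1-based position of value in candidates, 0 if absent."""
    for position, candidate in enumerate(candidates, start=1):
        if value == candidate:
            return position
    return 0


relations = ["N", "X", "C", "I"]

# ORDER BY FIELD(relation, 'C', 'I', 'N') ASC: the unknown 'X' scores 0 and sorts first.
ascending = sorted(relations, key=lambda r: field(r, "C", "I", "N"))

# ORDER BY FIELD(relation, 'N', 'I', 'C') DESC: 'C' scores highest and 'X' (0) sorts last.
descending = sorted(relations, key=lambda r: field(r, "N", "I", "C"), reverse=True)

print(ascending)   # ['X', 'C', 'I', 'N']
print(descending)  # ['C', 'I', 'N', 'X']
```

With LIMIT 1, the ascending form would pick 'X' while the reversed, descending form still picks 'C'.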
I am trying to figure out how to find duplicates based on several different columns and tables.
I've got these tables:
products
tags (lists tagid and productid - one row per tag)
groups (lists groupid and productid - one row per groupid)
I want to find exact matches in my products table on the columns productName, brandid and origin. But to class a row as a duplicate, I also need to check that the products have exactly the same tags (column: tagid) and groups (column: groupid) assigned.
Every product may have multiple tags and multiple groups.
This is what I've come up with... but it's not quite doing what I need it to.
SQLFiddle
http://sqlfiddle.com/#!9/43f19/1
In my SQL fiddle example I have listed 10 different products.
For example, products 1,2 are exact matches and thus should be listed as a duplicate.
Product number 3 has only one group assigned and thus differs from products 1 and 2, even though every other parameter matches (it should not be listed). My intention with the dupid column is to list the first entry of a set of duplicates.
id | name | brandid | origin | tags | groups | dupid
1 | prod | 1 | England | 1,2 | 1,2 | 1
2 | prod | 1 | England | 1,2 | 1,2 | 1
3 | prod | 1 | England | 1,2 | 1 | 3
The complete set of items that should be listed as exact matches in my SQL fiddle is:
id 1
id 2
id 4
id 5
My guess is that this fails because I have not managed to involve the tags and the groups correctly in my comparison.
SELECT m.*,dup.id AS dupid,GROUP_CONCAT(DISTINCT t.tagid ORDER BY t.tagid ASC) AS alltags,GROUP_CONCAT(DISTINCT g.groupid ORDER BY g.groupid ASC) AS groups
FROM `products` m
JOIN (SELECT id,`productName`, brandid, origin, COUNT(*) AS c FROM products
GROUP BY `productName`, brandid, origin HAVING c > 1) dup ON m.`productName` = dup.`productName` AND m.brandid = dup.brandid AND m.origin = dup.origin
LEFT JOIN tags AS t ON t.productid = m.id
LEFT JOIN groups AS g ON g.productid = m.id
GROUP BY m.id
ORDER BY `productName`,brandid,origin
Any help and/or advice on how to achieve this is highly appreciated.
My guess is that you are missing an aggregate function on the id field in the subquery. You also need to group by productName, origin and brandid rather than by id, so try this:
SELECT m.*,dup.id AS dupid,GROUP_CONCAT(DISTINCT t.tagid ORDER BY t.tagid ASC) AS alltags,GROUP_CONCAT(DISTINCT g.groupid ORDER BY g.groupid ASC) AS groups
FROM `products` m
JOIN (SELECT min(id) as id,`productName`, brandid, origin, COUNT(*) AS c FROM products
GROUP BY `productName`, brandid, origin HAVING c > 1) dup ON m.`productName` = dup.`productName` AND m.brandid = dup.brandid AND m.origin = dup.origin
LEFT JOIN tags AS t ON t.productid = m.id
LEFT JOIN groups AS g ON g.productid = m.id
GROUP BY m.`productName`,m.brandid,m.origin
ORDER BY m.`productName`,m.brandid,m.origin
EDIT: You can use this query:
SELECT tt.*
FROM(
SELECT m.*,GROUP_CONCAT(DISTINCT t.tagid ORDER BY t.tagid ASC) AS alltags,GROUP_CONCAT(DISTINCT g.groupid ORDER BY g.groupid ASC) AS groups
FROM `products` m
LEFT JOIN tags AS t ON t.productid = m.id
LEFT JOIN groups AS g ON g.productid = m.id
GROUP BY m.id) tt
INNER JOIN
(SELECT productName,brandid,origin,alltags,groups
FROM
(SELECT m.*,GROUP_CONCAT(DISTINCT t.tagid ORDER BY t.tagid ASC) AS alltags,GROUP_CONCAT(DISTINCT g.groupid ORDER BY g.groupid ASC) AS groups
FROM `products` m
LEFT JOIN tags AS t ON t.productid = m.id
LEFT JOIN groups AS g ON g.productid = m.id
GROUP BY m.id) s
GROUP BY productName,brandid,origin,alltags,groups
HAVING COUNT(*) > 1) ss
ON(tt.productName = ss.productName and tt.brandid = ss.brandid and tt.origin = ss.origin
and tt.alltags = ss.alltags and tt.groups = ss.groups)
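The idea behind that query is to build one "signature" per product (name, brand, origin, sorted tags, sorted groups) and keep only the signatures that occur more than once. Here is a plain Python sketch of the same logic, using the three example rows from the question's table; the ids_for helper and variable names are mine.

```python
from collections import defaultdict

# Rows from the question's example table: (id, productName, brandid, origin).
products = [(1, "prod", 1, "England"), (2, "prod", 1, "England"), (3, "prod", 1, "England")]
tags = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]   # (productid, tagid)
groups = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1)]         # (productid, groupid)


def ids_for(product_id, links):
    """Sorted tuple of linked ids, the counterpart of GROUP_CONCAT(... ORDER BY ...)."""
    return tuple(sorted(linked for pid, linked in links if pid == product_id))


by_signature = defaultdict(list)
for pid, name, brandid, origin in products:
    signature = (name, brandid, origin, ids_for(pid, tags), ids_for(pid, groups))
    by_signature[signature].append(pid)

# Keep signatures shared by more than one product (HAVING COUNT(*) > 1);
# the smallest id in each set plays the role of dupid.
duplicates = {min(ids): sorted(ids) for ids in by_signature.values() if len(ids) > 1}
print(duplicates)  # {1: [1, 2]}
```

Product 3 drops out because its group list differs, which is exactly the comparison the first attempt was missing.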
We have the following query, which is quite complex (at least for us).
Since, as far as we know, there's no such thing as INTERSECT in MySQL, we are wondering how we can fix this:
(
SELECT GROUP_CONCAT(APA_T.district ORDER BY APA_T.district), t.name
FROM tbl_activity AS t
INNER JOIN tbl_activity_package AS ap ON t.id = ap.id_activity
INNER JOIN (
SELECT DISTINCT apa.district AS district, (
SELECT s1.id_activity_package
FROM tbl_activity_package_address s1
WHERE apa.district = s1.district
ORDER BY s1.id DESC
LIMIT 1
) AS idActivityPackage
FROM
tbl_activity_package_address apa
ORDER BY apa.district
) AS APA_T
ON ap.id = APA_T.idActivityPackage
GROUP BY t.name
ORDER BY APA_T.district
)
UNION DISTINCT
(
SELECT GROUP_CONCAT(DISTINCT apa2.district ORDER BY apa2.district), t2.name
FROM tbl_activity AS t2
INNER JOIN tbl_activity_package AS ap2 ON t2.id = ap2.id_activity
INNER JOIN tbl_activity_package_address AS apa2 ON ap2.id = apa2.id_activity_package
GROUP BY t2.name
ORDER BY apa2.district
)
#LIMIT 6, 6
Here are the results:
GROUP_CONCAT(APA_T.DISTRICT ORDER BY APA_T.DISTRICT) NAME
Beja,Faro,Setubal activity-1
Evora activity-2
Sintra activity-4
Braga,Sines activity-5
Santarem activity-6
Guarda,Matosinhos,Sagres activity-7
Lisboa,Montemor,Porto,Rio de Janeiro activity-8
Beja,Evora,Faro,Setubal activity-1
Faro activity-3
Here are the results as we wish they were:
GROUP_CONCAT(APA_T.DISTRICT ORDER BY APA_T.DISTRICT) NAME
Beja,Faro,Setubal activity-1
Evora activity-2
Sintra activity-4
Braga,Sines activity-5
Santarem activity-6
Guarda,Matosinhos,Sagres activity-7
Lisboa,Montemor,Porto,Rio de Janeiro activity-8
Faro activity-3
ISSUE
This line should NOT appear. No activity should appear twice.
Beja,Evora,Faro,Setubal activity-1
We understand that the UNION DISTINCT doesn't remove it, because indeed:
Beja, Faro, Setubal IS DIFFERENT THAN Beja,Evora,Faro,Setubal
HOWEVER, we do NOT want Evora to appear in the first result. So that row is fine as it is; the first query in the UNION does its job as it should.
Still, that second activity-1 that appears, should be removed.
Any advice on how to solve this?
THE BIG PICTURE
As you can see, this is quite a huge select that will perhaps get worse and slower over time.
We wish to have an INFINITE SCROLL of activities, and the first results of that infinite scroll should be activities happening in different districts. Why can't we just "order by date" or something, you may ask?
Because if a back-end user inserts the last 20 records all from a single district, the first results in the infinite scroll will show only activities from that district, as if we had nothing outside it.
So, the point is to LIST ALL the results in a certain (complex) ORDER. :)
Any other, perhaps better way, would be great.
http://sqlfiddle.com/#!2/37dd94/51
Does the query below (SQL Fiddle) produce the results you are looking for? I wrapped the union so I could then sort on the name field. If you don't want it that way, you can remove the wrapper or sort on the DistCon field instead.
SELECT * FROM
(
SELECT GROUP_CONCAT(APA_T.district) AS DistCon, t.name
FROM tbl_activity AS t
JOIN tbl_activity_package AS ap ON t.id = ap.id_activity
JOIN
(
SELECT DISTINCT apa.district AS district,
(
SELECT s1.id_activity_package
FROM tbl_activity_package_address s1
WHERE apa.district = s1.district
ORDER BY s1.id DESC
LIMIT 1
) AS idActivityPackage
FROM
tbl_activity_package_address apa
ORDER BY apa.district
) AS APA_T
ON ap.id = APA_T.idActivityPackage
GROUP BY t.name
UNION
SELECT GROUP_CONCAT(apa.district), t.name
FROM tbl_activity AS t
JOIN tbl_activity_package AS ap ON t.id = ap.id_activity
JOIN tbl_activity_package_address AS apa ON ap.id = apa.id_activity_package
WHERE t.name NOT IN
(
SELECT DISTINCT t.name
FROM tbl_activity AS t
JOIN tbl_activity_package AS ap ON t.id = ap.id_activity
JOIN
(
SELECT DISTINCT apa.district AS district,
(
SELECT s1.id_activity_package
FROM tbl_activity_package_address s1
WHERE apa.district = s1.district
ORDER BY s1.id DESC
LIMIT 1
) AS idActivityPackage
FROM
tbl_activity_package_address apa
) AS APA_T
ON ap.id = APA_T.idActivityPackage
)
GROUP BY t.name
) AS Mm
ORDER BY Mm.name
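In plain terms, the NOT IN filter means: take every row from the first query, and take a row from the second query only if its activity name did not already appear in the first. A small Python sketch of that rule follows. The rows are a subset of the listing in the question; the second-query rows for activity-2 and activity-4 are assumed, since UNION DISTINCT would have hidden them in the original output.

```python
# (districts, name) rows, illustrative subset of the question's listing.
first_query = [
    ("Beja,Faro,Setubal", "activity-1"),
    ("Evora", "activity-2"),
    ("Sintra", "activity-4"),
]
second_query = [
    ("Beja,Evora,Faro,Setubal", "activity-1"),
    ("Evora", "activity-2"),   # assumed row
    ("Faro", "activity-3"),
    ("Sintra", "activity-4"),  # assumed row
]

# Names already covered by the first query.
seen = {name for _, name in first_query}

# Counterpart of "WHERE t.name NOT IN (names returned by the first query)".
fallback = [row for row in second_query if row[1] not in seen]

# Counterpart of the outer "ORDER BY Mm.name".
combined = sorted(first_query + fallback, key=lambda row: row[1])

for districts, name in combined:
    print(districts, name)
```

Each activity now appears once, and activity-1 keeps the district list from the first query.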
This query provides a slightly different result from that specified above because it employs slightly different rules.
Basically, it says "Give me at least one district for every activity. Where multiple districts offer the same activity, exclude any that are sole providers of another activity."
SELECT x.activity
, GROUP_CONCAT(DISTINCT x.district) districts
FROM
( SELECT a.name activity
, apa.district
FROM tbl_activity a
JOIN tbl_activity_package ap
ON ap.id_activity = a.id
JOIN tbl_activity_package_address apa
ON apa.id_activity_package = ap.id
) x
LEFT
JOIN
( SELECT a.name activity
, apa.district
FROM tbl_activity a
JOIN tbl_activity_package ap
ON ap.id_activity = a.id
JOIN tbl_activity_package_address apa
ON apa.id_activity_package = ap.id
GROUP
BY activity
HAVING COUNT(*) = 1
) y
ON y.district = x.district
AND y.activity <> x.activity
WHERE y.activity IS NULL
GROUP
BY activity;
+------------+--------------------------------------+
| activity | districts |
+------------+--------------------------------------+
| activity-1 | Beja,Setubal |
| activity-2 | Evora |
| activity-3 | Faro |
| activity-4 | Sintra |
| activity-5 | Braga,Sines |
| activity-6 | Santarem |
| activity-7 | Guarda,Sagres,Matosinhos |
| activity-8 | Lisboa,Porto,Rio de Janeiro,Montemor |
+------------+--------------------------------------+
I have three tables.
1.fi_category
+----+-----------------+-----------------+
| id | name | slug |
+----+-----------------+-----------------+
2.fi_subcategory
+----+-----------------+-----------------+-------------+
| id | name | slug | category_id |
+----+-----------------+-----------------+-------------+
3.fi_business_subcategory
+----+-------------+----------------+
| id | business_id | subcategory_id |
+----+-------------+----------------+
What I am basically trying to do is:
fetch all categories
fetch all subcategories that belong to each category
count the number of businesses that exist for each subcategory
This is what I tried:
SELECT
f.id,
f.name,
f.slug,
f2.id,
f2.name,
f2.slug,
COUNT(f3.business_id) as count
FROM
fi_category f
LEFT JOIN
fi_subcategory f2 ON f.id = f2.category_id
LEFT JOIN
fi_business_subcategory f3 ON f2.id = f3.subcategory_id
However, the above query fetches only 1 record. How do I fetch what I want?
I would add a GROUP BY clause:
SELECT
f.id,
f.name,
f.slug,
f2.id,
f2.name,
f2.slug,
COUNT(f3.business_id) as count
FROM fi_category f
LEFT JOIN fi_subcategory f2
ON f.id = f2.category_id
LEFT JOIN fi_business_subcategory f3
ON f2.id = f3.subcategory_id
GROUP BY f.id,
f.name,
f.slug,
f2.id,
f2.name,
f2.slug
Or get your count() in a sub-query:
SELECT f.id,
f.name,
f.slug,
f2.id,
f2.name,
f2.slug,
f3.cnt
FROM fi_category f
LEFT JOIN fi_subcategory f2
ON f.id = f2.category_id
LEFT JOIN
(
select count(business_id) cnt, subcategory_id
from fi_business_subcategory
group by subcategory_id
) f3
ON f2.id = f3.subcategory_id
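One difference between the two versions is worth checking: for a subcategory with no businesses, the GROUP BY version counts 0, while the derived-table version yields NULL unless you wrap it in COALESCE. Here is a quick sketch with Python's sqlite3 as a stand-in for MySQL; the sample rows are invented and the select lists are shortened.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fi_category (id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
    CREATE TABLE fi_subcategory (id INTEGER PRIMARY KEY, name TEXT, slug TEXT,
                                 category_id INTEGER);
    CREATE TABLE fi_business_subcategory (id INTEGER PRIMARY KEY, business_id INTEGER,
                                          subcategory_id INTEGER);
    -- Hypothetical rows: Bakery has two businesses, Cafe has none.
    INSERT INTO fi_category VALUES (1, 'Food', 'food');
    INSERT INTO fi_subcategory VALUES (10, 'Bakery', 'bakery', 1), (11, 'Cafe', 'cafe', 1);
    INSERT INTO fi_business_subcategory VALUES (1, 100, 10), (2, 101, 10);
""")

# Version 1: aggregate after the joins; COUNT ignores the NULLs from the LEFT JOIN.
grouped = conn.execute("""
    SELECT f2.name, COUNT(f3.business_id) AS cnt
    FROM fi_category f
    LEFT JOIN fi_subcategory f2 ON f.id = f2.category_id
    LEFT JOIN fi_business_subcategory f3 ON f2.id = f3.subcategory_id
    GROUP BY f.id, f2.id
    ORDER BY f2.id
""").fetchall()

# Version 2: aggregate first, then join; unmatched subcategories get NULL.
derived = conn.execute("""
    SELECT f2.name, f3.cnt, COALESCE(f3.cnt, 0) AS cnt_or_zero
    FROM fi_category f
    LEFT JOIN fi_subcategory f2 ON f.id = f2.category_id
    LEFT JOIN (SELECT subcategory_id, COUNT(business_id) AS cnt
               FROM fi_business_subcategory
               GROUP BY subcategory_id) f3 ON f2.id = f3.subcategory_id
    ORDER BY f2.id
""").fetchall()

print(grouped)
print(derived)
```

If the application expects a number for every subcategory, COALESCE(f3.cnt, 0) in the second version is probably what you want.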