So, I am creating a music database.
I am using three tables (files, categories, categories_assignments).
I want to be able to select a file that is in multiple categories (e.g. a song that is both pop and rock).
I have already written the OR variant (included below for reference):
SELECT DISTINCT `files`.`filename` FROM `files`
INNER JOIN `categories_assignments`
ON `files`.`id` = `categories_assignments`.`fileid`
INNER JOIN `categories`
ON `categories_assignments`.`catid` = `categories`.`id`
WHERE `categories`.`name` = 'rock' OR `categories`.`name`='pop';
This is a "set-within-sets" problem -- you are looking for songs that have a set of categories. I like to solve this using group by and having:
SELECT f.filename
FROM files f JOIN
categories_assignments ca
ON f.id = ca.fileid JOIN
categories c
ON ca.catid = c.id
WHERE c.name IN ('rock', 'pop')
GROUP BY f.filename
HAVING COUNT(*) = 2;
Notes:
Table aliases make the query easier to write and to read.
I don't see a need to put backticks around every identifier. That just makes the query harder to read.
You should use IN instead of multiple OR comparisons.
If you are learning SQL, then SELECT DISTINCT is almost never useful. Learn to use GROUP BY first.
Group by the file and keep only those groups that have both categories:
SELECT f.filename
FROM files f
INNER JOIN categories_assignments ca ON f.id = ca.fileid
INNER JOIN categories c ON ca.catid = c.id
WHERE c.name in ('rock', 'pop')
GROUP BY f.filename
HAVING count(c.name) = 2
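Here is a minimal sketch of the GROUP BY / HAVING approach that you can run end to end. It assumes an in-memory SQLite database driven from Python instead of MySQL (only so the example is self-contained), and the sample rows are invented; the table and column names follow the question. It uses COUNT(DISTINCT c.name) rather than COUNT(*), on the assumption that a duplicate assignment row could otherwise inflate the count.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, filename TEXT);
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE categories_assignments (fileid INTEGER, catid INTEGER);

    INSERT INTO files VALUES (1, 'a.mp3'), (2, 'b.mp3'), (3, 'c.mp3');
    INSERT INTO categories VALUES (1, 'rock'), (2, 'pop'), (3, 'jazz');
    -- a.mp3: rock + pop, b.mp3: rock only, c.mp3: pop + jazz
    INSERT INTO categories_assignments VALUES (1, 1), (1, 2), (2, 1), (3, 2), (3, 3);
""")

wanted = ["rock", "pop"]
placeholders = ", ".join("?" for _ in wanted)

# Keep only files whose matching rows cover every wanted category.
query = f"""
    SELECT f.filename
    FROM files f
    JOIN categories_assignments ca ON f.id = ca.fileid
    JOIN categories c ON ca.catid = c.id
    WHERE c.name IN ({placeholders})
    GROUP BY f.id, f.filename
    HAVING COUNT(DISTINCT c.name) = ?
"""
both = [row[0] for row in conn.execute(query, (*wanted, len(wanted)))]
print(both)
```

Passing len(wanted) as the HAVING value keeps the category list and the required count in sync if you later ask for three or more categories.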
Related
I have products and categories tables, and a pivot table named product_catalog. I need to update the product_catalog table so that I can remove the categories which have fewer than five products. The products in these redundant categories should move to their parent categories. I have written a query for this, but the problem is that the product_catalog table has 55213277 records in it and the query takes a lot of time to run.
Basically it is a nested query, and we have to run it repeatedly until there is no category left with fewer than five products.
Here is the SQL query I tested.
Can you propose an optimized solution?
UPDATE product_catalogT AS C
INNER JOIN
(SELECT
COUNT(*) AS tp, catalog_id cid, g.parent_id pid
FROM
product_catalog AS p
LEFT JOIN catalog AS g ON p.catalog_id = g.id
Where g.parent_id <> 0
GROUP BY catalog_id
HAVING tp < 5)
AS A ON C.catalog_id = A.cid
SET
C.catalog_id = A.pid
Here's a little less writing, but for performance we'd need to see your tables, indexes, and the EXPLAIN, as mentioned.
UPDATE product_catalogT C
JOIN
( SELECT p.catalog_id AS cid
, g.parent_id AS pid
FROM product_catalog p
JOIN catalog g
ON p.catalog_id = g.id
WHERE g.parent_id <> 0
GROUP
BY p.catalog_id, g.parent_id
HAVING COUNT(*) < 5
) A
ON C.catalog_id = A.cid
SET C.catalog_id = A.pid
Also, I might mention that this seems like a rather strange request.
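To illustrate the "run it until nothing is left" part of the question, here is a rough sketch of the loop. It is an assumption-laden toy, not a tuned solution for 55 million rows: it uses in-memory SQLite from Python instead of MySQL, the sample data is invented, and small_cat and collapse_small_categories are hypothetical names. Each pass snapshots the small categories into a temporary table and then moves their products to the parent in one UPDATE.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data and helper names are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catalog (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE product_catalog (product_id INTEGER, catalog_id INTEGER);
    CREATE INDEX idx_pc_catalog ON product_catalog (catalog_id);
    CREATE TEMP TABLE small_cat (cid INTEGER PRIMARY KEY, pid INTEGER);

    -- category 1 is a root (parent_id = 0), 2 is under 1, 3 is under 2
    INSERT INTO catalog VALUES (1, 0), (2, 1), (3, 2);
    INSERT INTO product_catalog VALUES
        (1, 1),
        (2, 2), (3, 2), (4, 2), (5, 2),
        (6, 3), (7, 3);
""")


def collapse_small_categories(conn, min_products=5):
    """Move products out of small non-root categories until none remain."""
    passes = 0
    while True:
        # Snapshot the categories that are currently too small.
        conn.execute("DELETE FROM small_cat")
        conn.execute("""
            INSERT INTO small_cat (cid, pid)
            SELECT p.catalog_id, g.parent_id
            FROM product_catalog p
            JOIN catalog g ON g.id = p.catalog_id
            WHERE g.parent_id <> 0
            GROUP BY p.catalog_id, g.parent_id
            HAVING COUNT(*) < ?
        """, (min_products,))
        (found,) = conn.execute("SELECT COUNT(*) FROM small_cat").fetchone()
        if found == 0:
            return passes
        # Reassign every affected product to its parent category in one statement.
        conn.execute("""
            UPDATE product_catalog
            SET catalog_id = (SELECT pid FROM small_cat
                              WHERE cid = product_catalog.catalog_id)
            WHERE catalog_id IN (SELECT cid FROM small_cat)
        """)
        passes += 1


passes = collapse_small_categories(conn)
counts = conn.execute("""
    SELECT catalog_id, COUNT(*) FROM product_catalog
    GROUP BY catalog_id ORDER BY catalog_id
""").fetchall()
print(passes, counts)
```

The loop only terminates if the parent chain eventually reaches a root (parent_id = 0); a cycle in the catalog table would need a guard. The index on catalog_id is what the grouping and the UPDATE lean on, so that is the first thing to check in your EXPLAIN output.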
For this example I got 3 simple tables (Page, Subs and Followers):
For each page I need to know how many subs and followers it has.
My result is supposed to look like this:
I tried using the COUNT function in combination with a GROUP BY like this:
SELECT p.ID, COUNT(s.UID) AS SubCount, COUNT(f.UID) AS FollowCount
FROM page p, subs s, followers f
WHERE p.ID = s.ID AND p.ID = f.ID AND s.ID = f.ID
GROUP BY p.ID
Obviously this statement returns a wrong result.
My other attempt was using two different SELECT statements and then combining the two subresults into one table.
SELECT p.ID, COUNT(s.UID) AS SubCount FROM page p, subs s WHERE p.ID = s.ID GROUP BY p.ID
and
SELECT p.ID, COUNT(f.UID) AS FollowCount FROM page p, followers f WHERE p.ID = f.ID GROUP BY p.ID
I feel like there has to be a simpler / shorter way of doing it but I'm too inexperienced to find it.
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax.
Next, learn what COUNT() does. It counts the number of non-NULL values. So, your expressions are going to return the same value -- because f.UID and s.UID are never NULL (due to the JOIN conditions).
The issue is that the different dimensions are multiplying the amounts. A simple fix is to use COUNT(DISTINCT):
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p JOIN
subs s
ON p.ID = s.ID JOIN
followers f
ON s.ID = f.ID
GROUP BY p.ID;
The inner joins are equivalent to the original query. You probably want left joins so you can get counts of zero:
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p LEFT JOIN
subs s
ON p.ID = s.ID LEFT JOIN
followers f
ON p.ID = f.ID
GROUP BY p.ID;
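A small sketch of the multiplication effect and the COUNT(DISTINCT) fix, assuming in-memory SQLite from Python rather than your actual database, with invented sample rows. Page 1 has three subs and two followers; page 2 has no subs and one follower.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (ID INTEGER PRIMARY KEY);
    CREATE TABLE subs (ID INTEGER, UID INTEGER);
    CREATE TABLE followers (ID INTEGER, UID INTEGER);

    INSERT INTO page VALUES (1), (2);
    INSERT INTO subs VALUES (1, 10), (1, 11), (1, 12);
    INSERT INTO followers VALUES (1, 20), (1, 21), (2, 30);
""")

# Inner joins plus plain COUNT: 3 subs x 2 followers = 6 joined rows for page 1,
# and page 2 disappears because it has no subs.
naive = conn.execute("""
    SELECT p.ID, COUNT(s.UID), COUNT(f.UID)
    FROM page p
    JOIN subs s ON p.ID = s.ID
    JOIN followers f ON p.ID = f.ID
    GROUP BY p.ID
    ORDER BY p.ID
""").fetchall()

# Left joins plus COUNT(DISTINCT): each user is counted once, zeros are kept.
fixed = conn.execute("""
    SELECT p.ID, COUNT(DISTINCT s.UID), COUNT(DISTINCT f.UID)
    FROM page p
    LEFT JOIN subs s ON p.ID = s.ID
    LEFT JOIN followers f ON p.ID = f.ID
    GROUP BY p.ID
    ORDER BY p.ID
""").fetchall()

print(naive)
print(fixed)
```

Note that COUNT(DISTINCT) still builds the multiplied intermediate rows before collapsing them, so on large tables it may be noticeably slower than counting each table separately.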
A scalar subquery should work in this case:
SELECT p.id,
(SELECT Count(s_uid)
FROM subs s1
WHERE s1.s_id = p.id) AS cnt_subs,
(SELECT Count(f_uid)
FROM followers f1
WHERE f1.f_id = p.id) AS cnt_fol
FROM page p
GROUP BY p.id;
We are maintaining a history of Content. We want to get the latest entry of each content, but with the create time and update time of the content's first entry. The query contains multiple selects and WHERE clauses with many left joins. The dataset is very large, so the query takes more than 60 seconds to execute. Kindly help in improving it. Query:
select * from (select * from (
SELECT c.*, initCMS.initcreatetime, initCMS.initupdatetime, user.name as partnerName, r.name as rightsName, r1.name as copyRightsName, a.name as agelimitName, ct.type as contenttypename, cat.name as categoryname, lang.name as languagename FROM ContentCMS c
left join ContentCategoryType ct on ct.id = c.contentType
left join User user on c.contentPartnerId = user.id
left join Category cat on cat.id = c.categoryId
left join Language lang on lang.id = c.languageCode
left join CopyRights r on c.rights = r.id
left join CopyRights r1 on c.copyrights = r1.id
left join Age a on c.ageLimit = a.id
left outer join (
SELECT contentId, createTime as initcreatetime, updateTime as initupdatetime from ContentCMS cms where cms.deleted='0'
) as initCMS on initCMS.contentId = c.contentId WHERE c.deleted='0' order by c.id DESC
) as temp group by contentId) as c where c.editedBy='0'
Any help would be highly appreciated. Thank you.
This is just a partial evaluation and suggestion, because your query does not seem to be properly formed.
This left join seems useless:
FROM ContentCMS c
......
left join (
SELECT contentId
, createTime as initcreatetime
, updateTime as initupdatetime
from ContentCMS cms
where cms.deleted='0'
) as initCMS on initCMS.contentId = c.contentId
It joins ContentCMS to itself.
The ORDER BY (without LIMIT) inside a joined subquery is also useless, because joining ordered or unordered rows produces the same result.
The GROUP BY contentId is strange because there are no aggregate functions; using GROUP BY without aggregate functions is deprecated in SQL
and, in the most recent versions of MySQL, is not allowed by default. If you need distinct values or just one row per contentId, you should use DISTINCT or select the values deterministically (GROUP BY without aggregate functions returns arbitrary values for the non-aggregated columns).
As a partial evaluation, your query could be refactored as:
SELECT c.*
-- assumption: with the self-join removed, the init* columns come straight from c
, c.createTime as initcreatetime
, c.updateTime as initupdatetime
, user.name as partnerName
, r.name as rightsName
, r1.name as copyRightsName
, a.name as agelimitName
, ct.type as contenttypename
, cat.name as categoryname
, lang.name as languagename
FROM ContentCMS c
left join ContentCategoryType ct on ct.id = c.contentType
left join User user on c.contentPartnerId = user.id
left join Category cat on cat.id = c.categoryId
left join Language lang on lang.id = c.languageCode
left join CopyRights r on c.rights = r.id
left join CopyRights r1 on c.copyrights = r1.id
left join Age a on c.ageLimit = a.id
WHERE c.deleted='0'
For the rest, you should explicitly select the columns you actually need and add proper aggregate functions for the others.
Also, a nested subquery used only to (improperly) reduce the rows doesn't help performance. You should also re-evaluate your data modelling and design.
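One possible way to get "latest entry, but with the first entry's times" deterministically is sketched below. It is only an illustration under assumptions: it uses in-memory SQLite from Python rather than MySQL, the title column and sample rows are invented, and it assumes id grows with every new revision of a content so that MIN(id) and MAX(id) identify the first and latest entries. The lookup joins (User, Category and so on) are left out for brevity.

```python
import sqlite3

# Assumptions: SQLite instead of MySQL, an invented `title` column,
# and `id` increasing with each revision of the same contentId.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ContentCMS (
        id INTEGER PRIMARY KEY, contentId INTEGER, title TEXT,
        createTime INTEGER, updateTime INTEGER, deleted TEXT, editedBy TEXT
    );
    INSERT INTO ContentCMS VALUES
        (1, 100, 'v1',   10, 10, '0', '0'),
        (2, 100, 'v2',   20, 25, '0', '0'),
        (3, 200, 'only', 30, 30, '0', '0'),
        (4, 100, 'v3',   40, 40, '1', '0');  -- deleted revision, ignored
""")

# Aggregate once per contentId, then join back for the first and latest rows.
latest = conn.execute("""
    SELECT newest.id, newest.contentId, newest.title,
           oldest.createTime AS initcreatetime,
           oldest.updateTime AS initupdatetime
    FROM (
        SELECT contentId, MIN(id) AS first_id, MAX(id) AS last_id
        FROM ContentCMS
        WHERE deleted = '0'
        GROUP BY contentId
    ) AS bounds
    JOIN ContentCMS AS newest ON newest.id = bounds.last_id
    JOIN ContentCMS AS oldest ON oldest.id = bounds.first_id
    WHERE newest.editedBy = '0'
    ORDER BY newest.contentId
""").fetchall()
print(latest)
```

Because the grouping touches only contentId and id, an index on (deleted, contentId, id) would probably let the inner query run from the index alone; check that against your own EXPLAIN output.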
I'm trying to LEFT JOIN on the same table multiple times, to get all the values of the specific topics. It works like I thought it would, see: http://sqlfiddle.com/#!9/9cda67/4
However, using the above fiddle, the database returns a single row for each different course. I'd like to group them, using GROUP BY PersonID, but then it would only take the first value (a 6 for Math) and a (null) value for all the other columns. See: http://sqlfiddle.com/#!9/9cda67/5
What do I need to change so that I get single row per Person, with all the grades filled in into their respective columns (when available)?
MySQL allows you to include columns in a SELECT that are not in the GROUP BY. This actually violates the ANSI standard and is not supported by any other database (although in some cases the ANSI standard does allow it). The result is indeterminate values from a single row in the output.
The solution is aggregation functions:
SELECT p.id AS PersonID, p.name AS PersonName,
max(pc1.grade) AS Math,
max(pc2.grade) AS Chemistry,
max(pc3.grade) AS Physics
FROM Person p LEFT JOIN
Person_Course pc
on p.id = pc.user_id LEFT JOIN
Course c on c.id = pc.course_id LEFT JOIN
Person_Course pc1
on pc1.id = pc.id AND pc1.course_id = 1 LEFT JOIN
Person_Course pc2
on pc2.id = pc.id AND pc2.course_id = 2 LEFT JOIN
Person_Course pc3
on pc3.id = pc.id AND pc3.course_id = 3
GROUP BY PersonID;
You might want group_concat() if people could take the same course multiple times. Also, don't use single quotes for column names. Only use them for string and date constants.
Hardwiring the course ids into the code seems like a bad idea. I would write this more simply using conditional aggregation:
SELECT p.id AS PersonID, p.name AS PersonName,
max(case when c.name = 'Math' then pc.grade end) AS Math,
max(case when c.name = 'Chemistry' then pc.grade end) AS Chemistry,
max(case when c.name = 'Physics' then pc.grade end) AS Physics
FROM Person p LEFT JOIN
Person_Course pc
on p.id = pc.user_id LEFT JOIN
Course c
on c.id = pc.course_id
GROUP BY PersonID;
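A runnable sketch of the conditional aggregation above, assuming in-memory SQLite from Python in place of the MySQL fiddle, with invented people and grades. The table and column names follow the answer's query.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; people and grades are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Course (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Person_Course (
        id INTEGER PRIMARY KEY, user_id INTEGER, course_id INTEGER, grade INTEGER
    );

    INSERT INTO Person VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO Course VALUES (1, 'Math'), (2, 'Chemistry'), (3, 'Physics');
    INSERT INTO Person_Course VALUES (1, 1, 1, 6), (2, 1, 2, 8), (3, 2, 3, 7);
""")

# One row per person; each CASE picks the grade for a single course,
# and MAX() collapses the per-course rows into one value (NULL if absent).
rows = conn.execute("""
    SELECT p.id AS PersonID, p.name AS PersonName,
           MAX(CASE WHEN c.name = 'Math' THEN pc.grade END) AS Math,
           MAX(CASE WHEN c.name = 'Chemistry' THEN pc.grade END) AS Chemistry,
           MAX(CASE WHEN c.name = 'Physics' THEN pc.grade END) AS Physics
    FROM Person p
    LEFT JOIN Person_Course pc ON p.id = pc.user_id
    LEFT JOIN Course c ON c.id = pc.course_id
    GROUP BY p.id, p.name
    ORDER BY p.id
""").fetchall()
print(rows)
```

Grouping by both p.id and p.name keeps the query valid on databases that reject non-aggregated columns outside the GROUP BY.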
I have a query that selects data from several tables using LEFT JOINS. The problem is data is being duplicated.
Here's the query
SELECT
A.ID,
T.T_ID,
T.name,
T.pic,
T.timestamp AS T_ts,
(SELECT COUNT(*) FROM track_plays WHERE T_ID = T.T_ID) AS plays,
(SELECT COUNT(*) FROM track_downloads WHERE T_ID = T.T_ID) AS downloads,
S.S_ID,
S.status,
S.timestamp AS S_ts,
G.G_ID,
G.gig_name,
G.date_time,
G.lineup,
G.price,
G.currency,
G.pic AS G_pic,
G.ticket,
G.venue,
G.timestamp AS G_ts
FROM artists A
LEFT JOIN TRACKS T
ON T.ID = A.ID
LEFT JOIN STATUS S
ON S.ID = A.ID
LEFT JOIN GIGS G
ON G.ID = A.ID
WHERE A.ID = '$ID'
ORDER BY S_ts, G_ts AND T_ts DESC LIMIT 20
The problem is data is duplicated if one of the tables in the join has more data than another. So if tracks has 1 row, status has 2 and gigs has no rows you would get the data from tracks doubled.
I have tried using GROUP BY A.ID but that eliminates data. So in the example given before there would only be one row of status shown.
I've also tried GROUP_CONCAT but am unsure on that function so can't tell you much.
Using SELECT DISTINCT has the same effect as just the GROUP BY A.ID.
Assuming that artists -> gigs and artists -> tracks are 1-N mappings, then you have two choices (both of which were covered in the comments on your OP):
1) Specify which of the N rows you want to get back to achieve a 1-1 map:
FROM artists A
LEFT JOIN TRACKS T ON T.ID = A.ID AND T.<SOMETHING> = SOMETHING
LEFT JOIN STATUS S ON S.ID = A.ID
LEFT JOIN GIGS G ON G.ID = A.ID AND G.<SOMETHING> = SOMETHING
2) Do the joins as you wrote and get multiple entries for tracks and gigs and then pivot them in your calling application. Generally you'd put an ORDER BY clause in the query and check for the same artist key and pivot the list.
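For option 2, a rough sketch of what "pivot them in your calling application" could look like. It assumes Python with in-memory SQLite rather than your PHP/MySQL stack, uses a cut-down version of the tables with invented rows, and leaves gigs out to stay short. The joined rows are folded into one entry per artist, keyed by each child table's primary key so the repeated track collapses.

```python
import sqlite3

# Assumptions: SQLite instead of MySQL, trimmed-down tables, invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (ID INTEGER PRIMARY KEY);
    CREATE TABLE tracks (T_ID INTEGER PRIMARY KEY, ID INTEGER, name TEXT);
    CREATE TABLE status (S_ID INTEGER PRIMARY KEY, ID INTEGER, status TEXT);

    INSERT INTO artists VALUES (1);
    INSERT INTO tracks VALUES (1, 1, 'Song');
    INSERT INTO status VALUES (1, 1, 'hello'), (2, 1, 'on tour');
""")

# One track x two statuses = two joined rows, with the track repeated.
rows = conn.execute("""
    SELECT A.ID, T.T_ID, T.name, S.S_ID, S.status
    FROM artists A
    LEFT JOIN tracks T ON T.ID = A.ID
    LEFT JOIN status S ON S.ID = A.ID
    WHERE A.ID = ?
    ORDER BY A.ID, T.T_ID, S.S_ID
""", (1,)).fetchall()

# Pivot: one entry per artist, children de-duplicated by their own keys.
artists = {}
for artist_id, t_id, t_name, s_id, s_status in rows:
    entry = artists.setdefault(artist_id, {"tracks": {}, "statuses": {}})
    if t_id is not None:          # NULL means the LEFT JOIN found no track
        entry["tracks"][t_id] = t_name
    if s_id is not None:          # NULL means the LEFT JOIN found no status
        entry["statuses"][s_id] = s_status

print(artists)
```

The artist ID is passed as a bound parameter here instead of being interpolated into the SQL string like '$ID', which also avoids SQL injection.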