GROUP BY and ORDER BY issues - mysql

I have the following query:
SELECT DISTINCT (
s.styleTitle
), COUNT(p.id) AS `PictureCount`
FROM `style` s
LEFT JOIN `instagram_picture_style` ps ON s.id = ps.style_id
LEFT JOIN `instagram_shop_picture` p ON ps.picture_id = p.id
LEFT JOIN `instagram_picture_category` c ON c.picture_id = p.id
LEFT JOIN `instagram_second_level_category` sl ON c.second_level_category_id = sl.id
WHERE sl.id =25
GROUP BY p.id
ORDER BY PictureCount
However, this query returns a PictureCount of 1 for every style.
I basically wanted the list ordered so that the style with the most pictures comes first. What did I do wrong? Why is it giving me 1 for all of the styles? I am pretty sure there are more pictures than that for each style.

ORDER BY doesn't have underscores. But equally important, you are using DISTINCT in a way where you seem to think that it is a function. It is not. It is a modifier on the SELECT and it applies to all columns.
You should group by the same column you have in the DISTINCT. Grouping by p.id creates one group per picture, which is why every count comes back as 1. Something like this:
SELECT s.styleTitle, COUNT(p.id) AS `PictureCount`
FROM `style` s
LEFT JOIN `instagram_picture_style` ps ON s.id = ps.style_id
LEFT JOIN `instagram_shop_picture` p ON ps.picture_id = p.id
LEFT JOIN `instagram_picture_category` c ON c.picture_id = p.id
LEFT JOIN `instagram_second_level_category` sl ON c.second_level_category_id = sl.id
WHERE sl.id = 25
GROUP BY s.styleTitle
ORDER BY PictureCount DESC;
In fact, you almost never need DISTINCT with GROUP BY. If you are using both, you need to think about why it would be necessary.
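The effect of the grouping column can be reproduced on a toy dataset. The sketch below is only an illustration: it uses Python's built-in sqlite3 module in place of MySQL, and the table `picture_style` and its rows are made up for the example, not taken from the question's schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE style (id INTEGER PRIMARY KEY, styleTitle TEXT);
CREATE TABLE picture_style (style_id INTEGER, picture_id INTEGER);
INSERT INTO style VALUES (1, 'Boho'), (2, 'Casual');
-- Hypothetical data: 'Boho' has three pictures, 'Casual' has one.
INSERT INTO picture_style VALUES (1, 10), (1, 11), (1, 12), (2, 20);
""")

# Grouping by the picture id: one group per picture, so every count is 1.
per_picture = conn.execute("""
    SELECT s.styleTitle, COUNT(ps.picture_id) AS PictureCount
    FROM style s
    JOIN picture_style ps ON ps.style_id = s.id
    GROUP BY ps.picture_id
    ORDER BY PictureCount DESC
""").fetchall()

# Grouping by the style title: one group per style, so the count is per style.
per_style = conn.execute("""
    SELECT s.styleTitle, COUNT(ps.picture_id) AS PictureCount
    FROM style s
    JOIN picture_style ps ON ps.style_id = s.id
    GROUP BY s.styleTitle
    ORDER BY PictureCount DESC
""").fetchall()

print(per_picture)
print(per_style)
```

With this data the first query yields four rows that each count 1, while the second yields one row per style with the real totals.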

Related

SQL LIKE seems not to work like I think it should

I have a SELECT statement with a huge number of LEFT JOINs and I want to filter some rows out.
When I take the total number of records and subtract the number of records that pass my NOT LIKE restrictions, I should get the number of records that the restrictions filter out.
But when I negate the restrictions to select exactly those filtered-out records, I get a different number than the one I calculated.
SQL without restrictions (Record count: 13.251.981)
SELECT p.product_number
FROM product p
LEFT JOIN product_category pc on p.id = pc.product_id
LEFT JOIN product_category_tree pct on p.id = pct.product_id
LEFT JOIN product_configurator_setting pcs on p.id = pcs.product_id
LEFT JOIN product_cross_selling pcs2 on p.id = pcs2.product_id
LEFT JOIN product_cross_selling_assigned_products pcsap on p.id = pcsap.product_id
LEFT JOIN product_cross_selling_translation pcst on pcs2.id = pcst.product_cross_selling_id
LEFT JOIN product_custom_field_set pcfs on p.id = pcfs.product_id
LEFT JOIN product_media pm on p.id = pm.product_id
LEFT JOIN product_option po on p.id = po.product_id
LEFT JOIN product_price pp on p.id = pp.product_id
LEFT JOIN product_property pp2 on p.id = pp2.product_id
LEFT JOIN product_review pr on p.id = pr.product_id
LEFT JOIN product_search_keyword psk on p.id = psk.product_id
LEFT JOIN product_tag pt on p.id = pt.product_id
LEFT JOIN product_translation pt2 on p.id = pt2.product_id
LEFT JOIN product_visibility pv on p.id = pv.product_id
With restriction (Record count: 9.285.545)
WHERE p.product_number NOT LIKE 'SW%'
AND p.product_number NOT LIKE '%.%'
AND pt2.name NOT LIKE '%Gutschein'
AND pt2.name NOT LIKE '%Test%'
With negated restriction (Record count: 100.851)
WHERE p.product_number LIKE 'SW%'
OR p.product_number LIKE '%.%'
OR pt2.name LIKE '%Gutschein'
OR pt2.name LIKE '%Test%';
From my calculations I should get 3.966.436 records that are filtered out. (13.251.981 - 9.285.545 = 3.966.436)
But instead I get 100.851.
How is that possible?
The solution for me was actually this WHERE:
WHERE p.product_number < 'SW'
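One possible explanation for the mismatch, offered as an assumption because the thread does not confirm it: with LEFT JOINs, pt2.name is NULL for products without a translation row, and both `NULL LIKE ...` and `NULL NOT LIKE ...` evaluate to NULL rather than true. Such rows are dropped by the restriction and by its negation alike, so the two counts need not add up to the total. The sketch below uses Python's sqlite3 module and made-up rows to show the effect on a reduced version of the query.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, product_number TEXT);
CREATE TABLE product_translation (product_id INTEGER, name TEXT);
INSERT INTO product VALUES (1, 'A100'), (2, 'SW200'), (3, 'B300');
-- Product 3 has no translation, so the LEFT JOIN yields name = NULL for it.
INSERT INTO product_translation VALUES (1, 'Chair'), (2, 'Table');
""")

base = """
    SELECT COUNT(*)
    FROM product p
    LEFT JOIN product_translation pt2 ON p.id = pt2.product_id
"""

total = conn.execute(base).fetchone()[0]

# Restriction: every condition must be TRUE; a NULL result drops the row.
kept = conn.execute(base + """
    WHERE p.product_number NOT LIKE 'SW%'
      AND pt2.name NOT LIKE '%Test%'
""").fetchone()[0]

# Negated restriction: FALSE OR NULL is NULL, so the NULL row is dropped again.
negated = conn.execute(base + """
    WHERE p.product_number LIKE 'SW%'
       OR pt2.name LIKE '%Test%'
""").fetchone()[0]

# Rows that satisfy neither filter because the predicate evaluated to NULL.
unknown = total - kept - negated
print(total, kept, negated, unknown)
```

If this is what happens in the original query, wrapping the nullable column as `COALESCE(pt2.name, '')` in both filters would make the two counts add up to the total again.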

Multiple aggregate functions in SQL query

For this example I got 3 simple tables (Page, Subs and Followers):
For each page I need to know how many subs and followers it has.
My result is supposed to look like this:
I tried using the COUNT function in combination with a GROUP BY like this:
SELECT p.ID, COUNT(s.UID) AS SubCount, COUNT(f.UID) AS FollowCount
FROM page p, subs s, followers f
WHERE p.ID = s.ID AND p.ID = f.ID AND s.ID = f.ID
GROUP BY p.ID
Obviously this statement returns a wrong result.
My other attempt was using two different SELECT statements and then combining the two subresults into one table.
SELECT p.ID, COUNT(s.UID) AS SubCount FROM page p, subs s WHERE p.ID = s.ID GROUP BY p.ID
and
SELECT p.ID, COUNT(f.UID) AS FollowCount FROM page p, followers f WHERE p.ID = f.ID GROUP BY p.ID
I feel like there has to be a simpler / shorter way of doing it but I'm too inexperienced to find it.
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax.
Next, learn what COUNT() does. It counts the number of non-NULL values. So, your expressions are going to return the same value -- because f.UID and s.UID are never NULL (due to the JOIN conditions).
The issue is that the different dimensions are multiplying the amounts. A simple fix is to use COUNT(DISTINCT):
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p JOIN
subs s
ON p.ID = s.ID JOIN
followers f
ON s.ID = f.ID
GROUP BY p.ID;
The inner joins are equivalent to the original query. You probably want left joins so you can get counts of zero:
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p LEFT JOIN
subs s
ON p.ID = s.ID LEFT JOIN
followers f
ON p.ID = f.ID
GROUP BY p.ID;
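To see the multiplication concretely, the sketch below runs the LEFT JOIN query with and without DISTINCT on a small dataset. It is an illustration only: it uses Python's sqlite3 module in place of MySQL, and the rows are invented (page 1 has 2 subs and 3 followers, page 2 has 1 sub and no followers).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (ID INTEGER PRIMARY KEY);
CREATE TABLE subs (ID INTEGER, UID INTEGER);
CREATE TABLE followers (ID INTEGER, UID INTEGER);
INSERT INTO page VALUES (1), (2);
INSERT INTO subs VALUES (1, 101), (1, 102), (2, 103);
INSERT INTO followers VALUES (1, 201), (1, 202), (1, 203);
""")

# Same query twice; {modifier} is either empty or DISTINCT.
TEMPLATE = """
    SELECT p.ID,
           COUNT({modifier} s.UID) AS SubCount,
           COUNT({modifier} f.UID) AS FollowCount
    FROM page p
    LEFT JOIN subs s ON p.ID = s.ID
    LEFT JOIN followers f ON p.ID = f.ID
    GROUP BY p.ID
    ORDER BY p.ID
"""

# Page 1 joins to 2 x 3 = 6 rows, so plain COUNT reports 6 for both columns.
plain_counts = conn.execute(TEMPLATE.format(modifier="")).fetchall()

# COUNT(DISTINCT ...) collapses the repeated UIDs back to the real totals.
distinct_counts = conn.execute(TEMPLATE.format(modifier="DISTINCT")).fetchall()

print(plain_counts)
print(distinct_counts)
```

COUNT(DISTINCT) is convenient here, but it still builds the multiplied intermediate rows, so on large tables aggregating each table separately before joining may be cheaper.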
A scalar subquery should also work in this case.
SELECT p.id,
(SELECT Count(s_uid)
FROM subs s1
WHERE s1.s_id = p.id) AS cnt_subs,
(SELECT Count(f_uid)
FROM followers f1
WHERE f1.f_id = p.id) AS cnt_fol
FROM page p
GROUP BY p.id;

How to ORDER BY/MAX before GROUP BY after a LEFT JOIN to a many table?

There are 3 tables in question - properties, specials, and properties_specials.
I want n results back where n is the number of properties, and the corresponding special is the one with the MAX(specials.startdate).
I first tried doing this after aggregation but the MIN/MAX doesn't seem to affect the result set at all:
SELECT
p.id,
s.*,
MAX( s.startdate )
FROM
properties p
LEFT JOIN properties_specials ps ON ps.properties_id = p.id
LEFT JOIN specials s ON s.id = ps.specials_id
GROUP BY p.id
Using a subquery with max doesn't work because it just grabs the total max:
SELECT
p.id,
s.*
FROM
properties p
LEFT JOIN properties_specials ps ON ps.properties_id = p.id
LEFT JOIN (
SELECT id, MAX( specials.startdate )
FROM specials
) AS s ON s.id = ps.specials_id
GROUP BY p.id
And finally doing an ORDER BY in the subquery doesn't seem to work either because even though I specify ORDER BY specials.startdate DESC or ORDER BY specials.startdate ASC, the result is the same for:
properties table
id  name
----------------
11  Hotel
properties_specials table
properties_id  specials_id
--------------------------
11             33
11             34
specials table
id  startdate
----------------
33  2016-01-02
34  2016-01-10
How can I adjust this properly so I get the most recent / max after the join to properties_specials?
EDIT: Came up with a sqlfiddle for this. I think the differing factor may have been that the same special can be applied to different properties in the mapping table - if that's the case sorry for not specifying earlier.
You can do this by using an aggregation to get the maxdate per property and then joining these back in:
select p.*, s.*
from properties p join
(select ps.properties_id, max(s.startdate) as maxsd
from properties_specials ps join
specials s
on s.id = ps.specials_id
group by ps.properties_id
) maxps
on p.id = maxps.properties_id join
properties_specials ps
on ps.properties_id = p.id join
specials s
on ps.specials_id = s.id and s.startdate = maxps.maxsd;
EDIT:
Note the above query had an error. It was missing an on clause, which resulted in many duplicates (and would have been an error in any database other than MySQL).
Another approach is to just use the id instead of the date, which assumes that specials with a later startdate also have a higher id. Plugging directly into the above query:
select p.*, s.*
from properties p join
(select ps.properties_id, max(s.id) as maxid
from properties_specials ps join
specials s
on s.id = ps.specials_id
group by ps.properties_id
) maxps
on p.id = maxps.properties_id join
specials s
on s.id = maxps.maxid;
Here is the quick answer:
SELECT ps.properties_id, max(s.startdate)
FROM specials s, properties_specials ps
WHERE s.id = ps.specials_id
GROUP BY 1
If you want all the information, try the following:
SELECT p.id pid, p.name, s.id sid, s.startdate
FROM properties p, specials s, properties_specials ps, (
SELECT ps.properties_id pid, max(s.startdate) maxdate
FROM specials s, properties_specials ps
WHERE s.id = ps.specials_id
GROUP BY 1
) as pmax
WHERE p.id = ps.properties_id AND s.id = ps.specials_id
AND p.id = pmax.pid AND pmax.maxdate = s.startdate
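The "aggregate per property, then join back" approach from the answers can be checked against the sample data in the question. This is a sketch rather than a drop-in solution: it uses Python's sqlite3 module in place of MySQL, and property 12 ('Motel') is a hypothetical row added to cover the case from the question's edit, where the same special is mapped to more than one property.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE properties (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE specials (id INTEGER PRIMARY KEY, startdate TEXT);
CREATE TABLE properties_specials (properties_id INTEGER, specials_id INTEGER);
INSERT INTO properties VALUES (11, 'Hotel'), (12, 'Motel');  -- 12 is hypothetical
INSERT INTO specials VALUES (33, '2016-01-02'), (34, '2016-01-10');
-- Special 33 is shared by both properties.
INSERT INTO properties_specials VALUES (11, 33), (11, 34), (12, 33);
""")

# Step 1 (subquery): latest startdate per property.
# Step 2: join back through the mapping table to fetch the matching special.
latest_specials = conn.execute("""
    SELECT p.id, p.name, s.id, s.startdate
    FROM properties p
    JOIN (SELECT ps.properties_id, MAX(s.startdate) AS maxsd
          FROM properties_specials ps
          JOIN specials s ON s.id = ps.specials_id
          GROUP BY ps.properties_id) maxps
      ON p.id = maxps.properties_id
    JOIN properties_specials ps ON ps.properties_id = p.id
    JOIN specials s ON ps.specials_id = s.id
                   AND s.startdate = maxps.maxsd
    ORDER BY p.id
""").fetchall()

print(latest_specials)
```

Each property comes back once with its most recent special. If two specials of the same property share the maximum startdate, this form returns both of them.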

Count matched words from IN operator

I have this little MySQL query:
select t.title FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where k.keyword IN (
select k.keyword
FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where t.id = 166282
)
LIMIT 15
As you can see, it returns all titles from title that share at least one keyword with the movie with id 166282.
Now I have a problem, because I also want to count how many keywords were matched by the IN operator (let's say I want to see only titles that share 3 or more keywords). I tried something with aggregate functions, but everything failed, so I came here with my problem. Maybe somebody can give me some advice or a code example.
I'm also not sure if this "subquery way" is good, so if there are better options for solving my problem, I am open to any suggestions or tips.
Thank you!
#Edit
So after some problems, I have one more. This is my current query:
SELECT s.title,s.vote,s.rating,count(dk.key) as keywordCnt, count(dg.name) as genreCnt
FROM series s
INNER JOIN series_has_genre shg ON shg.series_id = s.id
INNER JOIN dict_genre dg ON dg.id = shg.dict_genre_id
INNER JOIN series_has_keyword shk ON shk.series_id = s.id
INNER JOIN dict_keyword dk ON dk.id = shk.dict_keyword_id
WHERE dk.key IN (
SELECT dki.key FROM series si
INNER JOIN series_has_keyword shki ON shki.series_id = si.id
INNER JOIN dict_keyword dki ON dki.id = shki.dict_keyword_id
WHERE si.title LIKE 'The Wire'
)
and dg.name IN (
SELECT dgo.name FROM series so
INNER JOIN series_has_genre shgo ON shgo.series_id = so.id
INNER JOIN dict_genre dgo ON dgo.id = shgo.dict_genre_id
WHERE so.title LIKE 'The Wire'
)
and s.production_year > 2000
GROUP BY s.title
ORDER BY s.vote DESC, keywordCnt DESC ,s.rating DESC, genreCnt DESC
LIMIT 5
The problem is that it is very, very slow. Any tips on what I should change to make it run faster?
Will this work for you?
select t.title, count(k.keyword) as keywordCount FROM title t
inner join movie_keyword mk on mk.movie_id = t.id
inner join keyword k on k.id = mk.keyword_id
where k.keyword IN (
select ki.keyword
FROM title ti
inner join movie_keyword mki on mki.movie_id = ti.id
inner join keyword ki on ki.id = mki.keyword_id
where ti.id = 166282
) group by t.title
LIMIT 15
Note that I have changed the table names inside the nested query to avoid confusion.
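The question also asked for only those titles that share 3 or more keywords. One way to express that, sketched here as a suggestion rather than as part of the answer above, is a HAVING clause on the count. The example uses Python's sqlite3 module in place of MySQL; the rows, the extra `t.id <> 166282` filter (to leave out the reference movie itself), and grouping by `t.id` as well as `t.title` are assumptions made for the demo.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE keyword (id INTEGER PRIMARY KEY, keyword TEXT);
CREATE TABLE movie_keyword (movie_id INTEGER, keyword_id INTEGER);
-- Hypothetical data: movie 1 shares three keywords with 166282, movie 2 one.
INSERT INTO title VALUES (166282, 'The Reference'), (1, 'Close Match'), (2, 'Weak Match');
INSERT INTO keyword VALUES (1, 'crime'), (2, 'police'), (3, 'drugs'),
                           (4, 'baltimore'), (5, 'comedy');
INSERT INTO movie_keyword VALUES
    (166282, 1), (166282, 2), (166282, 3), (166282, 4),
    (1, 1), (1, 2), (1, 3), (1, 5),
    (2, 1), (2, 5);
""")

# Count the shared keywords per title and keep only titles with 3 or more.
matches = conn.execute("""
    SELECT t.title, COUNT(k.keyword) AS keywordCount
    FROM title t
    JOIN movie_keyword mk ON mk.movie_id = t.id
    JOIN keyword k ON k.id = mk.keyword_id
    WHERE k.keyword IN (
        SELECT ki.keyword
        FROM title ti
        JOIN movie_keyword mki ON mki.movie_id = ti.id
        JOIN keyword ki ON ki.id = mki.keyword_id
        WHERE ti.id = 166282
    )
      AND t.id <> 166282          -- skip the reference movie itself
    GROUP BY t.id, t.title
    HAVING COUNT(k.keyword) >= 3  -- filter on the aggregate, not in WHERE
    ORDER BY keywordCount DESC
    LIMIT 15
""").fetchall()

print(matches)
```

The threshold has to go in HAVING because WHERE is evaluated before the rows are grouped, when the count does not exist yet.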

How to remove a row if sub query returns null value?

I have the following query.
select
Product.*,
(
select
group_concat(features.feature_image order by product_features.feature_order)
from product_features
inner join features
on features.id = product_features.feature_id
where
product_features.product_id = Product.id
and product_features.feature_id in(1)
) feature_image
from products as Product
where
Product.main_product_id=1
and Product.product_category_id='1'
I want to skip the row if feature_image is empty.
Your query looks a bit strange because you are doing most of the work in a subquery:
select p.*, (select group_concat(f.feature_image order by pf.feature_order)
from product_features pf inner join
features f
on f.id = pf.feature_id
where pf.product_id = p.id and pf.feature_id in (1)
) as feature_image
from products p
where p.main_product_id=1 and p.product_category_id='1';
A more common way to phrase the query is as an inner join in the outer query:
select p.*, group_concat(f.feature_image order by pf.feature_order) as feature_image
from products p join
product_features pf
on pf.product_id = p.id and pf.feature_id in (1) join
features f
on f.id = pf.feature_id
where p.main_product_id=1 and p.product_category_id='1'
group by p.id;
This will automatically include only products that have matching features. You would use left outer join to get all products.
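The difference between the two join types can be seen on a small dataset. The sketch below is an approximation: it uses Python's sqlite3 module in place of MySQL, the rows are invented, and the `order by` inside group_concat is left out because older SQLite versions do not accept it there.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, main_product_id INTEGER,
                       product_category_id TEXT);
CREATE TABLE features (id INTEGER PRIMARY KEY, feature_image TEXT);
CREATE TABLE product_features (product_id INTEGER, feature_id INTEGER,
                               feature_order INTEGER);
-- Hypothetical data: product 1 has feature 1, product 2 only has feature 2.
INSERT INTO products VALUES (1, 1, '1'), (2, 1, '1');
INSERT INTO features VALUES (1, 'a.png'), (2, 'b.png');
INSERT INTO product_features VALUES (1, 1, 1), (2, 2, 1);
""")

# {join} is replaced by JOIN or LEFT JOIN; everything else stays the same.
TEMPLATE = """
    SELECT p.id, group_concat(f.feature_image) AS feature_image
    FROM products p
    {join} product_features pf
        ON pf.product_id = p.id AND pf.feature_id IN (1)
    {join} features f ON f.id = pf.feature_id
    WHERE p.main_product_id = 1 AND p.product_category_id = '1'
    GROUP BY p.id
    ORDER BY p.id
"""

# Inner join: products without a matching feature disappear from the result.
inner_rows = conn.execute(TEMPLATE.format(join="JOIN")).fetchall()

# Left join: every product is kept, with NULL where no feature matched.
left_rows = conn.execute(TEMPLATE.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

Note that the `pf.feature_id IN (1)` condition sits in the ON clause. In the LEFT JOIN variant, moving it to WHERE would filter out the unmatched products again.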