This should be a very simple solution, which means you'll probably think this is a dumb question, but I've tried everything I can think of.
I have two tables, one for possible poll choices, and the other for actual responses. They're structured like so:
choices        responses
-----------    ----------
poll_id        poll_id
choice_id      user_id
choice_text    choice_id
I have a poll with two choices (yes/no), so I'm trying to fetch results so that if no one has voted for a certain choice, that choice shows up in the result set with a null value. So if 3 users have voted "yes" and none have voted "no", I want the result set to be:
choice_text num
-----------------------------
yes 3
no null
I would have thought this simply an outer join like so:
select
c.choice_text,
count(*) num
from
choices c
left outer join responses r
on c.poll_id = r.poll_id
and c.choice_id = r.choice_id
where
r.poll_id = 1
group by r.choice_id
order by r.choice_id asc;
But alas, that is giving me:
choice_text num
-----------------------------
yes 3
...without any record for "no".
I've tried every join syntax I can think of, with the same erroneous result.
Thoughts?
Your WHERE condition on the outer-joined table turns the outer join into an inner join. Move that condition into the JOIN's ON clause:
select c.choice_text,
count(*) num
from choices c
left outer join responses r
on c.poll_id = r.poll_id
and c.choice_id = r.choice_id
and r.poll_id = 1
group by r.choice_id
order by r.choice_id asc;
Btw: your usage of group by is incorrect and every other DBMS would reject that statement. MySQL simply chooses to return random data instead of failing with an error. For more details please see these blogs:
Debunking GROUP BY myths
Wrong GROUP BY makes your queries fragile
Figured it out thanks to @a_horse_with_no_name pointing me in the right direction.
The winning query:
select c.choice_text,
count(r.user_id) num
from choices c
left outer join responses r
on c.poll_id = r.poll_id
and c.choice_id = r.choice_id
and c.poll_id = 1
group by c.choice_id
order by c.choice_id asc;
Moved the poll_id condition into the join, like @a_horse_with_no_name suggested, but then had to group by the choices table instead of the responses table, and finally changed my count(*) to count(r.user_id), and voilà.
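Here is a minimal sketch of the three variants side by side, using Python's sqlite3 as a stand-in for MySQL. The sample rows, including a second poll with a "maybe" choice, are made up for illustration. One thing worth checking: because the join is an outer join, a condition on c.poll_id inside ON does not remove choices that belong to other polls, it only stops them from matching. If the choices table holds more than one poll, filtering c.poll_id in WHERE is probably what you want. Also note that COUNT over a column of the outer-joined table gives 0 for unmatched choices, not NULL.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE choices   (poll_id INTEGER, choice_id INTEGER, choice_text TEXT);
    CREATE TABLE responses (poll_id INTEGER, user_id INTEGER, choice_id INTEGER);
    INSERT INTO choices   VALUES (1, 1, 'yes'), (1, 2, 'no'), (2, 1, 'maybe');
    INSERT INTO responses VALUES (1, 10, 1), (1, 11, 1), (1, 12, 1);
""")

# 1) Filter on the outer-joined table in WHERE: the NULL-extended row for
#    'no' fails r.poll_id = 1 and disappears.
where_filter = conn.execute("""
    SELECT c.choice_text, COUNT(*) AS num
    FROM choices c
    LEFT OUTER JOIN responses r
           ON c.poll_id = r.poll_id AND c.choice_id = r.choice_id
    WHERE r.poll_id = 1
    GROUP BY c.choice_id, c.choice_text
    ORDER BY c.choice_id
""").fetchall()

# 2) Filter on c.poll_id inside ON: 'no' survives, but so does the
#    other poll's choice, with a count of 0.
on_filter = conn.execute("""
    SELECT c.choice_text, COUNT(r.user_id) AS num
    FROM choices c
    LEFT OUTER JOIN responses r
           ON c.poll_id = r.poll_id AND c.choice_id = r.choice_id
          AND c.poll_id = 1
    GROUP BY c.poll_id, c.choice_id, c.choice_text
    ORDER BY c.poll_id, c.choice_id
""").fetchall()

# 3) Filter the preserved table (choices) in WHERE and count a column
#    from responses, so unmatched choices count as 0.
fixed = conn.execute("""
    SELECT c.choice_text, COUNT(r.user_id) AS num
    FROM choices c
    LEFT OUTER JOIN responses r
           ON c.poll_id = r.poll_id AND c.choice_id = r.choice_id
    WHERE c.poll_id = 1
    GROUP BY c.choice_id, c.choice_text
    ORDER BY c.choice_id
""").fetchall()

print(where_filter, on_filter, fixed)
```

Every selected non-aggregate column is listed in GROUP BY here, so the statements should also be accepted by engines that enforce the standard rule.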
The query below grabs some information about a category of toys and shows the most recent sale price for three levels of condition (e.g., Brand New, Used, Refurbished). The price for each sale is almost always different. One other thing: the sales table row IDs are not necessarily in chronological order (e.g., a toy with a sale id of 5 could have happened later than a toy with a sale id of 10).
This query works but is not performant. It runs in a manageable amount of time, usually about 1s. However, I need to add yet another left join to include some more data, which causes the query time to balloon up to about 9s, no bueno.
Here is the working but nonperformant query:
SELECT b.brand_name, t.toy_id, t.toy_name, t.toy_number, tt.toy_type_name, cp.catalog_product_id, s.date_sold, s.condition_id, s.sold_price FROM brands AS b
LEFT JOIN toys AS t ON t.brand_id = b.brand_id
JOIN toy_types AS tt ON t.toy_type_id = tt.toy_type_id
LEFT JOIN catalog_products AS cp ON cp.toy_id = t.toy_id
LEFT JOIN toy_category AS tc ON tc.toy_category_id = t.toy_category_id
LEFT JOIN (
SELECT date_sold, sold_price, catalog_product_id, condition_id
FROM sales
WHERE invalid = 0 AND condition_id <= 3
ORDER BY date_sold DESC
) AS s ON s.catalog_product_id = cp.catalog_product_id
WHERE tc.toy_category_id = 1
GROUP BY t.toy_id, s.condition_id
ORDER BY t.toy_id ASC, s.condition_id ASC
But like I said it's slow. The sales table has about 200k rows.
What I tried to do was create the subquery as a view, e.g.,
CREATE VIEW sales_view AS
SELECT date_sold, sold_price, catalog_product_id, condition_id
FROM sales
WHERE invalid = 0 AND condition_id <= 3
ORDER BY date_sold DESC
Then replace the subquery with the view, like
SELECT b.brand_name, t.toy_id, t.toy_name, t.toy_number, tt.toy_type_name, cp.catalog_product_id, s.date_sold, s.condition_id, s.sold_price FROM brands AS b
LEFT JOIN toys AS t ON t.brand_id = b.brand_id
JOIN toy_types AS tt ON t.toy_type_id = tt.toy_type_id
LEFT JOIN catalog_products AS cp ON cp.toy_id = t.toy_id
LEFT JOIN toy_category AS tc ON tc.toy_category_id = t.toy_category_id
LEFT JOIN sales_view AS s ON s.catalog_product_id = cp.catalog_product_id
WHERE tc.toy_category_id = 1
GROUP BY t.toy_id, s.condition_id
ORDER BY t.toy_id ASC, s.condition_id ASC
Unfortunately, this change causes the query to no longer grab the most recent sale, and the sales price it returns is no longer the most recent.
Why doesn't the view return the same result as the identical SELECT used as a subquery?
After reading just about every top-n-per-group stackoverflow question and blog article I could find, getting a query that actually worked was fantastic. But now that I need to extend the query one more step I'm running into performance issues. If anybody wants to sidestep the above question and offer some ways to optimize the original query, I'm all ears!
Thanks for any and all help.
The solution to the subquery performance issue was to use the answer provided here: Groupwise maximum
I thought that this approach could only be used when querying a single table, but indeed it works even when you've joined many other tables. You just have to left join the same table twice using the s.date_sold < s2.date_sold join condition and make sure the where clause looks for the null value in the second table's id column.
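The self-join described above can be sketched roughly as follows, again with sqlite3 standing in for MySQL. The sale_id column name and the sample rows are assumptions; the other column names come from the question. Each sale row s is kept only if no valid, later sale s2 exists for the same product and condition.

```python
import sqlite3

# SQLite stands in for MySQL; sale_id and the sample rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (
        sale_id            INTEGER PRIMARY KEY,
        catalog_product_id INTEGER,
        condition_id       INTEGER,
        date_sold          TEXT,
        sold_price         REAL,
        invalid            INTEGER
    );
    INSERT INTO sales VALUES
        (1,  100, 1, '2020-01-05', 10.0, 0),
        (5,  100, 1, '2020-03-01', 12.0, 0),  -- latest valid sale, lower id
        (10, 100, 1, '2020-02-01', 11.0, 0),
        (11, 100, 2, '2020-01-10',  7.0, 0),
        (12, 100, 1, '2020-04-01', 99.0, 1);  -- invalid, must be ignored
""")

latest = conn.execute("""
    SELECT s.catalog_product_id, s.condition_id, s.date_sold, s.sold_price
    FROM sales s
    LEFT JOIN sales s2
           ON s2.catalog_product_id = s.catalog_product_id
          AND s2.condition_id       = s.condition_id
          AND s2.invalid            = 0             -- same filter on both copies
          AND s.date_sold           < s2.date_sold  -- s2 is a later sale
    WHERE s.invalid = 0
      AND s.condition_id <= 3
      AND s2.sale_id IS NULL                        -- no later sale was found
    ORDER BY s.catalog_product_id, s.condition_id
""").fetchall()

print(latest)
```

If two sales in the same group share the latest date_sold, both rows come back, so a tie-breaker (for example on the id column) may be needed. An index covering catalog_product_id, condition_id and date_sold is likely to help on a table of this size, though that is worth measuring rather than assuming.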
I'm having some trouble formulating a complex SQL query. I'm getting the result I'm looking for and the performance is fine but whenever I try to grab distinct rows for my LEFT JOIN of product_groups, I'm either hitting some performance issues or getting incorrect results.
Here's my query:
SELECT
pl.name, pl.description,
p.rows, p.columns,
pr.sku,
m.filename, m.ext, m.type,
ptg.product_group_id AS `group`
FROM
product_region AS pr
INNER JOIN
products AS p ON (p.product_id = pr.product_id)
INNER JOIN
media AS m ON (p.media = m.media_id)
INNER JOIN
product_language AS pl ON (p.product_id = pl.product_id)
LEFT JOIN
products_groups AS ptg ON (ptg.product_id = pr.product_id)
WHERE
(pl.lang = :lang) AND
(pr.region = :region) AND
(pt.product_id = p.product_id)
GROUP BY
p.product_id
LIMIT
:offset, :limit
The result I'm getting is correct; however, I want only distinct rows returned for "group". For example, I'm getting many results back that have the same "group" value, but I only want the first row to show, and the following records that have the same group value to be left out.
I tried GROUP BY group and DISTINCT, but both give me incorrect results. Also note that group can come back as NULL, and I don't want those rows to be affected.
Thanks in advance.
I worked out a solution an hour after posting this. My goal was to group by product_group_id first and then the individual product_id. The requirement was that I would eliminate product duplicates and have ONE product represent the group set.
I ended up using COALESCE(ptg.product_group_id, p.product_id). This accounts for the fact that most of my group IDs were null except for a few dispersed products. In using COALESCE I'm first grouping by the group ID, if that value is null it ignores the group and collects by product_id.
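A minimal sketch of that grouping, with sqlite3 standing in for MySQL. The table contents are invented, and picking MIN(p.product_id) as the representative of each group is an assumption made for illustration. The sample group ID (900) is deliberately chosen not to collide with any product_id; if the two ID ranges can overlap, an ungrouped product could be merged into an unrelated group, so that is worth checking against real data.

```python
import sqlite3

# SQLite stands in for MySQL; rows and the MIN() representative are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products        (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products_groups (product_id INTEGER, product_group_id INTEGER);
    INSERT INTO products VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    -- products 2 and 3 belong to group 900; 1 and 4 have no group
    INSERT INTO products_groups VALUES (2, 900), (3, 900);
""")

rows = conn.execute("""
    SELECT COALESCE(ptg.product_group_id, p.product_id) AS group_key,
           MIN(p.product_id)                            AS representative_id,
           COUNT(*)                                     AS members
    FROM products p
    LEFT JOIN products_groups ptg ON ptg.product_id = p.product_id
    -- grouped products collapse into one row; ungrouped ones keep their own
    GROUP BY COALESCE(ptg.product_group_id, p.product_id)
    ORDER BY representative_id
""").fetchall()

print(rows)
```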
I have a query that shows results for a search, and I want to add one more field that checks whether each result appears as a favourited result in another table. To keep it simple, as most of the search parameters only add some JOIN and WHERE clauses and are not important to my question, let's assume there is a results table that has all the right fields:
id | title | description
And the result_favourites table:
userid | resultid
Here is the MySQL query to get results (once again without all the search criteria, for simplicity):
SELECT id, title, description FROM results
What I want is something like that (let's say the user is #1):
SELECT r.id, r.title, r.description, (something here) AS is_favourited
FROM results AS r
RIGHT JOIN result_favourites AS rf ON rf.resultid = r.id
WHERE userid = 1
With is_favourited being either 1 (there is at least one row in result_favourites with both userid and resultid matching r.id and userid = 1) or 0 (there is none).
I've tried to use COUNT(rf.userid) AS is_favourited but that didn't work. Any help is welcome!
I think you want a left join, not right join. A left join keeps all rows from the first table. Then just check if there is a match:
SELECT r.id, r.title, r.description, (rf.resultid is not null) AS is_favourited
FROM results r LEFT JOIN
result_favourites rf
ON rf.resultid = r.id and
rf.userid = 1;
Try this one:
SELECT r.id, r.title, r.description, if(rf.userid is null,0,1) AS is_favourited
FROM results AS r
left JOIN result_favourites AS rf ON (rf.resultid = r.id AND rf.userid = 1)
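To see why the placement of the userid condition matters, here is a small sketch with sqlite3 standing in for MySQL and made-up rows. With the condition in ON, every result is kept and flagged 0 or 1. With the condition in WHERE, the rows that have no favourite for user 1 are filtered out after the join, so only favourited results remain.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results           (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE result_favourites (userid INTEGER, resultid INTEGER);
    INSERT INTO results VALUES (1, 'a', ''), (2, 'b', ''), (3, 'c', '');
    -- user 1 favourited result 2; user 2 favourited results 1 and 2
    INSERT INTO result_favourites VALUES (1, 2), (2, 1), (2, 2);
""")

# userid condition in ON: all results are kept, flagged 1 when matched.
flagged = conn.execute("""
    SELECT r.id, r.title, (rf.resultid IS NOT NULL) AS is_favourited
    FROM results r
    LEFT JOIN result_favourites rf
           ON rf.resultid = r.id AND rf.userid = 1
    ORDER BY r.id
""").fetchall()

# userid condition in WHERE: unmatched results are removed after the join.
where_version = conn.execute("""
    SELECT r.id, r.title, (rf.resultid IS NOT NULL) AS is_favourited
    FROM results r
    LEFT JOIN result_favourites rf ON rf.resultid = r.id
    WHERE rf.userid = 1
    ORDER BY r.id
""").fetchall()

print(flagged, where_version)
```

This assumes at most one row per (userid, resultid) pair in result_favourites; duplicates there would duplicate result rows.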
My brain is turning to mush over this one, but I suspect there's an easy answer.
I have a table of theatre shows and I also have a table of reviews of those shows. The reviews have a flag to signify whether the review is in-house or an audience review, i.e. 1 for in-house, 0 for audience.
Now, what I want to do is return all shows that don't have an in-house review. I tried the following, but got no results (obviously because r.id IS NULL and r.author = 1 conflict):
SELECT s.title FROM shows s LEFT OUTER JOIN reviews r ON s.id = r.showid WHERE r.id is NULL AND r.author = 1
If I take off the r.author = 1 then I get results, but false positives if there's an audience review.
Move r.author = 1 into the ON clause so that the reviews table is filtered before it is joined with shows.
SELECT s.title
FROM shows s
LEFT OUTER JOIN reviews r
ON s.id = r.showid AND
r.author = 1
WHERE r.showid is NULL
The difference between ON and WHERE is that ON filters the rows from a specific table before joining it to the other table, while WHERE filters the result after the tables have been joined.
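The ON versus WHERE difference can be checked with a small sketch like this one, with sqlite3 standing in for MySQL and invented rows. It runs the three variants discussed in this thread against the same data: the author filter in ON, no author filter at all, and the conflicting WHERE.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shows   (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE reviews (id INTEGER PRIMARY KEY, showid INTEGER, author INTEGER);
    INSERT INTO shows VALUES (1, 'Hamlet'), (2, 'Macbeth'), (3, 'Othello');
    -- author: 1 = in-house, 0 = audience
    INSERT INTO reviews VALUES (1, 1, 1), (2, 2, 0), (3, 1, 0);
""")

def titles(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# Filter in ON: only in-house reviews take part in the join, so a show
# with audience reviews only still gets a NULL-extended row.
no_in_house = titles("""
    SELECT s.title FROM shows s
    LEFT OUTER JOIN reviews r ON s.id = r.showid AND r.author = 1
    WHERE r.showid IS NULL
    ORDER BY s.id
""")

# No author filter: finds shows with no reviews of any kind.
no_reviews = titles("""
    SELECT s.title FROM shows s
    LEFT OUTER JOIN reviews r ON s.id = r.showid
    WHERE r.id IS NULL
    ORDER BY s.id
""")

# Both conditions in WHERE: they can never be true for the same row.
conflicting = titles("""
    SELECT s.title FROM shows s
    LEFT OUTER JOIN reviews r ON s.id = r.showid
    WHERE r.id IS NULL AND r.author = 1
""")

print(no_in_house, no_reviews, conflicting)
```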
I am trying to create a custom sort that involves the count of some records in another table. For example, if one record has no records associated with it in the other table, it should appear higher in the sort than if it had one or more records. Here's what I have so far:
SELECT People.*, Organizations.Name AS Organization_Name,
(CASE
WHEN Sent IS NULL AND COUNT(SELECT * FROM Graphics WHERE People.Organization_ID = Graphics.Organization_ID) = 0 THEN 0
ELSE 1
END) AS Status
FROM People
LEFT JOIN Organizations ON Organizations.ID = People.Organization_ID
ORDER BY Status ASC
The subquery within the COUNT is not working. What is the correct way to do something like this?
Update: I moved the case statement into the order by clause and added a join:
SELECT People.*, Organizations.Name AS Organization_Name
FROM People
LEFT JOIN Organizations ON Organizations.ID = People.Organization_ID
LEFT JOIN Graphics ON Graphics.Organization_ID = People.Organization_ID
GROUP BY People.ID
ORDER BY
CASE
WHEN Sent IS NULL AND Graphics.ID IS NULL THEN 0
ELSE 1
END ASC
So if the People record does not have any graphics, Graphics.ID will be null. This achieves the immediate need.
If what you tried does not work, it can be done by joining against a subquery, and placing the CASE expression into ORDER BY as well:
SELECT
People.*,
orgcount.num
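FROM People LEFT JOIN (
-- one row per organization that has at least one graphic
SELECT Organization_ID, COUNT(*) AS num FROM Graphics GROUP BY Organization_ID
) orgcount ON People.Organization_ID = orgcount.Organization_ID
ORDER BY
-- orgcount.num is NULL when the organization has no graphics
CASE WHEN Sent IS NULL AND orgcount.num IS NULL THEN 0 ELSE 1 END,
orgcount.num DESC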
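As a rough, runnable sketch of sorting by whether related rows exist, the following pre-aggregates Graphics per organization and outer-joins the counts to People. sqlite3 stands in for MySQL, and the Name column and sample rows are assumptions. The outer join matters: with an inner join, people whose organization has no graphics would be dropped, and a COUNT(*) from the subquery can never be 0.

```python
import sqlite3

# SQLite stands in for MySQL; the Name column and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People   (ID INTEGER PRIMARY KEY, Name TEXT,
                           Organization_ID INTEGER, Sent TEXT);
    CREATE TABLE Graphics (ID INTEGER PRIMARY KEY, Organization_ID INTEGER);
    INSERT INTO People VALUES
        (1, 'Ann', 10, NULL),
        (2, 'Bob', 20, NULL),
        (3, 'Cy',  10, '2020-01-01'),
        (4, 'Di',  20, '2020-01-02');
    -- only organization 10 has graphics
    INSERT INTO Graphics VALUES (1, 10), (2, 10);
""")

rows = conn.execute("""
    SELECT p.ID, p.Name,
           COALESCE(gc.num, 0) AS graphics_count,
           CASE WHEN p.Sent IS NULL AND COALESCE(gc.num, 0) = 0
                THEN 0 ELSE 1 END AS Status
    FROM People p
    LEFT JOIN (
        SELECT Organization_ID, COUNT(*) AS num
        FROM Graphics
        GROUP BY Organization_ID
    ) gc ON gc.Organization_ID = p.Organization_ID
    ORDER BY Status ASC, p.ID ASC
""").fetchall()

print(rows)
```

Because the aggregation happens inside the subquery, the outer query needs no GROUP BY, which sidesteps the question of selecting non-grouped columns.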
You could use an outer join to the Graphics table to get the data needed for your sort.
Since I don't know your schema, I made an assumption that the People table has a primary key column called ID. If the PK column has a different name, you should substitute that in the GROUP BY clause.
Something like this should work for you:
SELECT People.*, (count(Distinct Graphics.Organization_ID) > 0) as Status
FROM People
LEFT OUTER JOIN Graphics ON People.Organization_ID = Graphics.Organization_ID
GROUP BY People.ID
ORDER BY Status ASC
Fairly straightforward with a LEFT JOIN, provided you have some kind of primary key in the People table to GROUP on:
SELECT p.*, sent IS NOT NULL or COUNT(g.Organization_ID) Status
FROM People p LEFT JOIN Graphics g ON g.Organization_ID = p.Organization_ID
GROUP BY p.primary_key
ORDER BY Status
Demo here.