valid MySQL subquery syntax and placement - mysql

I've never fully understood several points of SQL subquery syntax. Mainly I'm interested in where in the parent query it is valid to place a subquery.
Here's an example which throws an error:
SELECT
sum(votes.vote) AS sum,
votes.vote IS NOT NULL AS did_i_vote,
purchase_id, subject, full_name
FROM (
SELECT vote FROM votes
where votes.acct_id=3 AND
votes.column_name='purchase_id'
) votes
RIGHT JOIN items_purchased
ON votes.parent_id=items_purchased.purchase_id
JOIN accounts
ON items_purchased.purchaser_account_id=accounts.acct_id
JOIN items
ON items_purchased.item_id=items.folder_id
WHERE purchase_id='2'
GROUP BY items_purchased.purchase_id
How do I make this query work?

One error is in the GROUP BY part!
In the SELECT list, you can only have columns that appear in the GROUP BY clause, plus aggregate functions of the columns that do not.
Check THIS info!

Your subquery must select every column that you wish to reference afterwards.
SELECT
sum(votes.vote) AS sum,
votes.vote IS NOT NULL AS did_i_vote,
purchase_id, subject, full_name
FROM (
SELECT vote, parent_id FROM votes
where votes.acct_id=3 AND
votes.column_name='purchase_id'
) votes
RIGHT JOIN items_purchased
ON votes.parent_id=items_purchased.purchase_id
JOIN accounts
ON items_purchased.purchaser_account_id=accounts.acct_id
JOIN items
ON items_purchased.item_id=items.folder_id
WHERE purchase_id='2'
GROUP BY items_purchased.purchase_id
That is what I would expect to work (note that the subquery now selects both vote and parent_id).
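A minimal sketch of the point made in this answer, run against Python's built-in sqlite3 module as a stand-in for MySQL (the tables, rows and the simplified LEFT JOIN form are assumptions made for illustration, not the asker's schema). The outer query can only see the columns that the derived table selects, so the join column has to be in the subquery's SELECT list:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items_purchased (purchase_id INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE votes (parent_id INTEGER, acct_id INTEGER, vote INTEGER);
    INSERT INTO items_purchased VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO votes VALUES (2, 3, 1), (2, 4, -1), (1, 3, 1);
""")

# The derived table exposes only `vote`, so the ON clause cannot see parent_id.
broken = """
    SELECT ip.purchase_id, SUM(v.vote) AS vote_sum
    FROM items_purchased AS ip
    LEFT JOIN (SELECT vote FROM votes WHERE acct_id = 3) AS v
           ON v.parent_id = ip.purchase_id
    GROUP BY ip.purchase_id
"""
try:
    conn.execute(broken)
    broken_error = None
except sqlite3.OperationalError as exc:
    broken_error = str(exc)  # the engine reports the unknown column

# Selecting parent_id inside the subquery makes it visible to the outer query.
fixed = """
    SELECT ip.purchase_id, SUM(v.vote) AS vote_sum
    FROM items_purchased AS ip
    LEFT JOIN (SELECT vote, parent_id FROM votes WHERE acct_id = 3) AS v
           ON v.parent_id = ip.purchase_id
    GROUP BY ip.purchase_id
    ORDER BY ip.purchase_id
"""
fixed_rows = conn.execute(fixed).fetchall()
print(broken_error)
print(fixed_rows)
```

Purchases that account 3 never voted on still appear, with a NULL sum, because the purchases table is on the preserved side of the outer join.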

Related

Multiply output of subquery (MySQL)

I'm trying to multiply the result of a subquery with a field from the 'main' query. See the following example:
Table: subscriptions
id
title
price
Table: users
subscription_id
SELECT
subscriptions.id,
subscriptions.title,
(select count(*) from users where users.subscription_id = subscriptions.id) AS qty,
SUM(qty * subscriptions.price) AS total
FROM subscriptions
This gives the error Unknown column 'qty' in 'field list'. So it seems that the result of the subquery isn't available elsewhere in the SELECT list. After searching StackOverflow I found some similar questions, and it seems I need to move the subquery from the SELECT list to a JOIN. This seems simple enough, but I'm having trouble modifying my own query to work like this. Can anyone push me in the right direction?
Don't put the subquery in the SELECT list, join with it.
SELECT s.id, s.title, u.qty, s.price * u.qty AS total
FROM subscriptions AS s
JOIN (
SELECT subscription_id, COUNT(*) AS qty
FROM users
GROUP BY subscription_id
) AS u ON s.id = u.subscription_id
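As a rough, runnable sketch of the join-with-a-derived-table approach above, here is the same query executed with Python's built-in sqlite3 module standing in for MySQL. The sample rows (and the integer prices) are assumptions made for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, title TEXT, price INTEGER);
    CREATE TABLE users (subscription_id INTEGER);
    INSERT INTO subscriptions VALUES (1, 'basic', 5), (2, 'pro', 20), (3, 'unused', 99);
    INSERT INTO users VALUES (1), (1), (1), (2);
""")

# The derived table computes qty once per subscription; the outer query
# can then refer to u.qty like any other column.
totals = conn.execute("""
    SELECT s.id, s.title, u.qty, s.price * u.qty AS total
    FROM subscriptions AS s
    JOIN (
        SELECT subscription_id, COUNT(*) AS qty
        FROM users
        GROUP BY subscription_id
    ) AS u ON s.id = u.subscription_id
    ORDER BY s.id
""").fetchall()
print(totals)
```

Note that the inner JOIN drops subscriptions that have no users (the 'unused' row here). If those should appear with a quantity of zero, a LEFT JOIN combined with COALESCE(u.qty, 0) is one option.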
Almost right.
SELECT
s.id,
s.title,
SUM(s.price * (select count(*) from users u where u.subscription_id = s.id)) AS total
FROM subscriptions s
GROUP BY s.id, s.title
I tried to resolve your query; check it:
https://dbfiddle.uk/xrMrT7Y4
I don't know why someone deleted my answer. The issue I found in your query is that you used an aggregate function without a GROUP BY, and if you are comparing IDs, then both tables should be considered. @Vinze

Implementing Count Function In SQL Query With Inner Joins

I have a query which is the following :
select person.ID, person.personName, round(avg(TIMESTAMPDIFF(DAY,orderDate,shippedDate)),2) as 'Average' from orders inner join person on person.personID = orders.personID where shippedDate is not null group by orders.personID;
The query above outputs 10 rows. I want to add a field that counts how many rows the query returns in total.
I have tried to use the SQL COUNT function, but I am struggling with the syntax because the query has an INNER JOIN.
If you are running MySQL 8.0, you can do a window count:
select
p.personID,
p.personName,
round(avg(timestampdiff(day, o.orderDate, o.shippedDate)), 2) average,
count(*) over() total_no_rows
from orders o
inner join person p on p.personID = o.personID
where o.shippedDate is not null
group by p.personID, p.personName
Note that I made a few fixes to your query:
table aliases make the query easier to read and write
it is a good practice to qualify all column names with the table they belong to - I made a few assumptions that you might need to review
every non-aggregated column should belong to the group by clause (this is a good practice, and a requirement in most databases)
If you are not using MySQL 8.0, you can use a subquery:
select count(*) from (
select person.ID,
person.personName,
round(avg(TIMESTAMPDIFF(DAY, orderDate, shippedDate)), 2) as 'Average'
from orders inner join person on person.personID = orders.personID
where shippedDate is not null
group by orders.personID
) as t;
And if you are using MySQL 8.0, use a window function like below:
select
p.personID,
p.personName,
round(avg(timestampdiff(day, o.orderDate, o.shippedDate)), 2) average,
count(*) over() total_no_rows
from orders o
inner join person p on p.personID = o.personID
where o.shippedDate is not null
group by p.personID, p.personName
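The two approaches in this thread can be compared side by side in a small sketch. It uses Python's built-in sqlite3 module rather than MySQL, so a few things are assumptions: the sample rows are invented, julianday() differences stand in for TIMESTAMPDIFF(DAY, ...), and the window function needs a bundled SQLite of version 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (personID INTEGER PRIMARY KEY, personName TEXT);
    CREATE TABLE orders (personID INTEGER, orderDate TEXT, shippedDate TEXT);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES
        (1, '2020-01-01', '2020-01-03'),
        (1, '2020-02-01', '2020-02-05'),
        (2, '2020-03-01', '2020-03-02'),
        (3, '2020-04-01', NULL);
""")

# Approach 1: wrap the grouped query in a derived table and count its rows.
subquery_count = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT p.personID
        FROM orders AS o
        JOIN person AS p ON p.personID = o.personID
        WHERE o.shippedDate IS NOT NULL
        GROUP BY p.personID, p.personName
    ) AS t
""").fetchone()[0]

# Approach 2: COUNT(*) OVER () runs after GROUP BY, so it counts groups
# and repeats that total on every output row.
windowed = conn.execute("""
    SELECT p.personID, p.personName,
           ROUND(AVG(julianday(o.shippedDate) - julianday(o.orderDate)), 2) AS average,
           COUNT(*) OVER () AS total_no_rows
    FROM orders AS o
    JOIN person AS p ON p.personID = o.personID
    WHERE o.shippedDate IS NOT NULL
    GROUP BY p.personID, p.personName
    ORDER BY p.personID
""").fetchall()
print(subquery_count)
print(windowed)
```

The subquery form returns only the total, whereas the window form keeps the per-person rows and attaches the total to each of them.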

SQL Query doesn't get correct count

I'm trying to write a query that returns, for each user, how many orders they made and at how many different stores:
SELECT user_id,display_name,count(store_id) as stores,count(order_id) as orders
FROM `orders` natural join users
where user_id NOT IN (7,766,79)
group by user_id
order by orders desc
It seems to me that this query should do it, but for some reason I get the same values in the orders and stores columns (and I have double-checked that there should be a difference). Can anyone help with that?
BTW, order_id is the primary key; store_id and user_id are foreign keys on orders.
EDIT:
SELECT user_id,display_name,count(distinct store_id) as stores,count(order_id) as orders
FROM `orders` natural join users
where user_id NOT IN (7,766,79)
group by user_id
order by orders desc
worked. Can anyone explain why I have to add the DISTINCT keyword in this case?
RESOLVED:
Using the comments by @McAdam331, I understood that the original query counts the rows with the same user id in both counts. Since order_ids are unique, the count of unique ids is the same as the row count, while store_id, which can have duplicates, has fewer distinct values than there are rows. Therefore using DISTINCT solved the problem.
I would just use COUNT(*) to get number of orders, and COUNT(DISTINCT store_id) to get the number of stores:
SELECT u.user_id, u.display_name, COUNT(*) AS numOrders, COUNT(DISTINCT store_id) AS numStores
FROM orders o
JOIN users u
...
The reason your query fails is because you're omitting DISTINCT on store_id. Because for every row you have a store_id and an order_id, you have the same number of store_ids and order_ids in each group. However, that's not what you're really looking for, you're looking for the number of DISTINCT store_ids.
If it's possible for the user to have the same order_id twice (though it doesn't make sense to me) you can add the DISTINCT keyword in that count function too.
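The difference between the counts described above can be seen in a small sketch. It runs on Python's built-in sqlite3 module as a stand-in for MySQL, and the sample orders are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, store_id INTEGER);
    -- user 10 orders twice from store 100 and once from store 200
    INSERT INTO orders VALUES (1, 10, 100), (2, 10, 100), (3, 10, 200), (4, 20, 300);
""")

rows = conn.execute("""
    SELECT user_id,
           COUNT(store_id)          AS store_rows,  -- non-NULL store_id values, one per row
           COUNT(DISTINCT store_id) AS stores,      -- different stores
           COUNT(*)                 AS orders       -- rows in the group
    FROM orders
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()
print(rows)
```

For user 10, COUNT(store_id) equals the number of orders, and only COUNT(DISTINCT store_id) reports the two different stores.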
You've got it backwards. Start with users and join to orders. Something like this:
SELECT users.user_id, display_name,
count(DISTINCT store_id) as store_count,
count(order_id) as order_count
FROM users
LEFT OUTER JOIN orders ON orders.user_id=users.user_id
where users.user_id NOT IN (7,766,79)
group by users.user_id
order by order_count desc;
Note the use of DISTINCT within the store_count because you may have multiple orders from the same store.

SQL query is not retrieving all the fields

I have two tables in my database. The first one (participants) looks like this:
And I have another one called votes, in which I can vote for any participant.
My problem is that I'm trying to get all the votes for each participant, but when I execute my query it only retrieves four rows, sorted by the count of votes, and the remaining participants do not appear in the result:
SELECT COUNT(DISTINCT `votes`.`id`) AS count_id, participants.name
AS participant_name FROM `participants` LEFT OUTER JOIN `votes` ON
`votes`.`participant_id` = `participants`.`id` GROUP BY votes.participant_id ORDER BY
votes.participant_id DESC;
Retrieves:
I think the problem is that you're grouping by votes.participant_id, rather than participants.id, which limits you to participants with votes, the outer join notwithstanding. Check out http://sqlfiddle.com/#!2/c5d3d/5/0
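To illustrate the grouping problem described in this answer, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL (table contents are invented for the example). With a LEFT JOIN, every participant without votes has a NULL votes.participant_id, and GROUP BY puts all NULLs into one group, so those participants collapse into a single row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE participants (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE votes (id INTEGER PRIMARY KEY, participant_id INTEGER);
    INSERT INTO participants VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
    INSERT INTO votes VALUES (1, 1), (2, 1), (3, 2);
""")

# Grouping by the column from the outer-joined table: Cy and Dee share one NULL group.
by_votes = conn.execute("""
    SELECT COUNT(DISTINCT v.id) AS count_id, p.name AS participant_name
    FROM participants AS p
    LEFT JOIN votes AS v ON v.participant_id = p.id
    GROUP BY v.participant_id
""").fetchall()

# Grouping by the preserved table's key keeps one row per participant.
by_participant = conn.execute("""
    SELECT COUNT(DISTINCT v.id) AS count_id, p.name AS participant_name
    FROM participants AS p
    LEFT JOIN votes AS v ON v.participant_id = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()
print(len(by_votes))
print(by_participant)
```

The first query returns three rows for four participants; the second returns all four, with a count of zero for those who received no votes.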
As I understand your query, you are selecting the distinct ids from the votes table, so I assume that the id column is not an identity (auto-increment) column. It would be better if it were. If it is, replace your SELECT with this:
SELECT COUNT(votes.participant_id) AS count_id, participants.name AS participant_name
FROM participants JOIN votes
ON participants.id = votes.participant_id
GROUP BY participants.name
ORDER BY count_id
just let me know if it works
cheers

MySQL is not using INDEX in subquery

I have these tables and queries as defined in sqlfiddle.
First my problem was to group people showing LEFT JOINed visits rows with the newest year. That I solved using subquery.
Now my problem is that that subquery is not using INDEX defined on visits table. That is causing my query to run nearly indefinitely on tables with approx 15000 rows each.
Here's the query. The goal is to list every person once with his newest (by year) record in visits table.
Unfortunately on large tables it gets real sloooow because it's not using INDEX in subquery.
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id
Does anyone know how to force MySQL to use INDEX already defined on visits table?
Your query:
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id;
First, it uses non-standard SQL syntax (items appear in the SELECT list that are not part of the GROUP BY clause, are not aggregate functions, and do not depend on the grouping items). This can give indeterminate (semi-random) results.
Second, to avoid the indeterminate results, you have added an ORDER BY inside a subquery. Non-standard or not, nothing in the MySQL documentation says that this should work as expected. So it may be working now, but it may stop working in the not so distant future, when you upgrade to MySQL version X (where the optimizer will be clever enough to understand that an ORDER BY inside a derived table is redundant and can be eliminated).
Try using this query:
SELECT
p.*, v.*
FROM
people AS p
LEFT JOIN
( SELECT
id_people
, MAX(year) AS year
FROM
visits
GROUP BY
id_people
) AS vm
JOIN
visits AS v
ON v.id_people = vm.id_people
AND v.year = vm.year
ON v.id_people = p.id;
The: SQL-fiddle
A compound index on (id_people, year) would help efficiency.
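A minimal sketch of the MAX(year) ... GROUP BY approach above, run on Python's built-in sqlite3 module as a stand-in for MySQL. The rows, the note column values and the index name are assumptions for illustration, and the nested join is written as a derived table, which is an equivalent but slightly different form from the query in the answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (id INTEGER PRIMARY KEY, id_people INTEGER, year INTEGER, note TEXT);
    -- compound index suggested in the answer (the name is hypothetical)
    CREATE INDEX idx_visits_people_year ON visits (id_people, year);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO visits VALUES (1, 1, 2010, 'a'), (2, 1, 2012, 'b'), (3, 2, 2011, 'c');
""")

latest_visits = conn.execute("""
    SELECT p.id, p.name, latest.year, latest.note
    FROM people AS p
    LEFT JOIN (
        -- keep only the visit rows whose year is the maximum for that person
        SELECT v.id_people, v.year, v.note
        FROM visits AS v
        JOIN (
            SELECT id_people, MAX(year) AS year
            FROM visits
            GROUP BY id_people
        ) AS vm ON v.id_people = vm.id_people AND v.year = vm.year
    ) AS latest ON latest.id_people = p.id
    ORDER BY p.id
""").fetchall()
print(latest_visits)
```

People without any visits still appear, with NULL visit columns, because people is on the preserved side of the LEFT JOIN.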
A different approach. It works fine if you limit the persons to a sensible limit (say 30) first and then join to the visits table:
SELECT
p.*, v.*
FROM
( SELECT *
FROM people
ORDER BY name
LIMIT 30
) AS p
LEFT JOIN
visits AS v
ON v.id_people = p.id
AND v.year =
( SELECT
year
FROM
visits
WHERE
id_people = p.id
ORDER BY
year DESC
LIMIT 1
)
ORDER BY name ;
Why do you have a subquery when all you need is a table name for joining?
It is also not obvious to me why your query has a GROUP BY clause in it. GROUP BY is ordinarily used with aggregate functions like MAX or COUNT, but you don't have those.
How about this? It may solve your problem.
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
If you need to show the person, the most recent visit, and the note from the most recent visit, you're going to have to explicitly join the visits table again to the summary query (virtual table) like so.
SELECT a.id, a.name, a.year, v.note
FROM (
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
)a
JOIN visits v ON (a.id = v.id_people and a.year = v.year)
Go fiddle: http://www.sqlfiddle.com/#!2/d67fc/20/0
If you need to show something for people that have never had a visit, you should try switching the JOIN items in my statement with LEFT JOIN.
As someone else wrote, an ORDER BY clause in a subquery is not standard, and generates unpredictable results. In your case it baffled the optimizer.
Edit: GROUP BY is a big hammer. Don't use it unless you need it. And, don't use it unless you use an aggregate function in the query.
Notice that if you have more than one row in visits for a person and the most recent year, this query will generate multiple rows for that person, one for each visit in that year. If you want just one row per person, and you DON'T need the note for the visit, then the first query will do the trick. If you have more than one visit for a person in a year, and you only need the latest one, you have to identify which row IS the latest one. Usually it will be the one with the highest ID number, but only you know that for sure. I added another person to your fiddle with that situation. http://www.sqlfiddle.com/#!2/4f644/2/0
This is complicated. But: if your visits.id numbers are automatically assigned and they are always in time order, you can simply report the highest visit id, and be guaranteed that you'll have the latest year. This will be a very efficient query.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT id_people, max(id) id
FROM visits
GROUP BY id_people
)m
JOIN people p ON (p.id = m.id_people)
JOIN visits v ON (m.id = v.id)
http://www.sqlfiddle.com/#!2/4f644/1/0 But this is not the way your example is set up. So you need another way to disambiguate your latest visit, so you just get one row per person. The only trick we have at our disposal is to use the largest id number.
So, we need to get a list of the visit.id numbers that are the latest ones, by this definition, from your tables. This query does that, with a MAX(year)...GROUP BY(id_people) nested inside a MAX(id)...GROUP BY(id_people) query.
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON (p.id_people = v.id_people AND p.year = v.year)
GROUP BY v.id_people
The overall query (http://www.sqlfiddle.com/#!2/c2da2/1/0) is this.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON ( p.id_people = v.id_people
AND p.year = v.year)
GROUP BY v.id_people
)m
JOIN people p ON (m.id_people = p.id)
JOIN visits v ON (m.id = v.id)
Disambiguation in SQL is a tricky business to learn, because it takes some time to wrap your head around the idea that there's no inherent order to rows in a DBMS.
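The tie-breaking idea from the overall query above can be tried out in a small sketch. It runs on Python's built-in sqlite3 module as a stand-in for MySQL; the sample rows and the alias names (latest_year, tie) are assumptions for illustration. Ann has two visits in her latest year, so a join on MAX(year) alone yields two rows for her, and picking MAX(id) among those rows narrows it to one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visits (id INTEGER PRIMARY KEY, id_people INTEGER, year INTEGER, note TEXT);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO visits VALUES
        (1, 1, 2010, 'old'),
        (2, 1, 2012, 'first 2012'),
        (3, 1, 2012, 'second 2012'),
        (4, 2, 2011, 'only');
""")

# Joining on the latest year alone keeps both of Ann's 2012 visits.
tied_rows = conn.execute("""
    SELECT COUNT(*)
    FROM visits AS v
    JOIN (SELECT id_people, MAX(year) AS year FROM visits GROUP BY id_people) AS latest_year
      ON v.id_people = latest_year.id_people AND v.year = latest_year.year
""").fetchone()[0]

# Among the latest-year rows, keep the one with the highest visit id.
one_per_person = conn.execute("""
    SELECT p.id, p.name, v.year, v.note
    FROM (
        SELECT tie.id_people, MAX(tie.id) AS id
        FROM (
            SELECT id_people, MAX(year) AS year
            FROM visits
            GROUP BY id_people
        ) AS latest_year
        JOIN visits AS tie
          ON tie.id_people = latest_year.id_people AND tie.year = latest_year.year
        GROUP BY tie.id_people
    ) AS m
    JOIN people AS p ON p.id = m.id_people
    JOIN visits AS v ON v.id = m.id
    ORDER BY p.id
""").fetchall()
print(tied_rows)
print(one_per_person)
```

Whether the highest id really is the latest visit depends on how ids are assigned, as the answer points out.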