I have three tables that are joined. I almost have the solution, but there is one small problem. Here is the statement:
SELECT items.item,
COUNT(ratings.item_id) AS total,
COUNT(comments.item_id) AS comments,
AVG(ratings.rating) AS rate
FROM `items`
LEFT JOIN ratings ON (ratings.item_id = items.items_id)
LEFT JOIN comments ON (comments.item_id = items.items_id)
WHERE items.cat_id = '{$cat_id}' AND items.spam < 5
GROUP BY items_id ORDER BY TRIM(LEADING 'The ' FROM items.item) ASC;
I have a table called items, each item has an id called items_id (notice it's plural). I have a table of individual user comments for each item, and one for ratings for each item. (The last two have a corresponding column called 'item_id').
I simply want to count comments and ratings (per item) separately. With my SQL statement as written above, the two counts come out as one combined total.
Note: total is the total of ratings. It's a bad naming scheme that I need to fix!
UPDATE: 'total' seems to count OK, but when I add a comment to the 'comments' table, the COUNT function affects both 'comments' and 'total', and each seems to equal the combined output.
The problem is that you're counting rows from the result of all three tables joined together. Try:
SELECT i.item,
r.ratetotal AS total,
c.commtotal AS comments,
r.rateav AS rate
FROM items AS i
LEFT JOIN
(SELECT item_id,
COUNT(item_id) AS ratetotal,
AVG(rating) AS rateav
FROM ratings GROUP BY item_id) AS r
ON r.item_id = i.items_id
LEFT JOIN
(SELECT item_id,
COUNT(item_id) AS commtotal
FROM comments GROUP BY item_id) AS c
ON c.item_id = i.items_id
WHERE i.cat_id = '{$cat_id}' AND i.spam < 5
ORDER BY TRIM(LEADING 'The ' FROM i.item) ASC;
In this query, we make the subqueries do the counting properly, then send that value to the main query and filter the results.
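To see the difference concretely, here is a minimal sketch that runs both shapes of query against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL here, and the sample rows (one item with 2 ratings and 3 comments) and the `body` column are made up for illustration; the WHERE and ORDER BY clauses are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items    (items_id INTEGER PRIMARY KEY, item TEXT);
CREATE TABLE ratings  (item_id INTEGER, rating INTEGER);
CREATE TABLE comments (item_id INTEGER, body TEXT);  -- body is a made-up column
INSERT INTO items VALUES (1, 'The Widget');
INSERT INTO ratings VALUES (1, 4), (1, 2);
INSERT INTO comments VALUES (1, 'a'), (1, 'b'), (1, 'c');
""")

# Original shape: both child tables joined directly, then counted.
# Every rating is paired with every comment, giving 2 * 3 = 6 joined rows.
joined = conn.execute("""
SELECT COUNT(ratings.item_id), COUNT(comments.item_id), AVG(ratings.rating)
FROM items
LEFT JOIN ratings  ON ratings.item_id  = items.items_id
LEFT JOIN comments ON comments.item_id = items.items_id
GROUP BY items.items_id
""").fetchone()

# Answer's shape: each child table is aggregated on its own before the join.
pre_aggregated = conn.execute("""
SELECT r.ratetotal, c.commtotal, r.rateav
FROM items AS i
LEFT JOIN (SELECT item_id, COUNT(item_id) AS ratetotal, AVG(rating) AS rateav
           FROM ratings GROUP BY item_id) AS r ON r.item_id = i.items_id
LEFT JOIN (SELECT item_id, COUNT(item_id) AS commtotal
           FROM comments GROUP BY item_id) AS c ON c.item_id = i.items_id
""").fetchone()

print(joined)          # (6, 6, 3.0): both counts are the product 2 * 3
print(pre_aggregated)  # (2, 3, 3.0): 2 ratings, 3 comments
```

The average happens to survive the direct join because every rating is repeated the same number of times, but both counts become the product of the two row counts.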
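I'm guessing this is a cardinality issue. Try COUNT(DISTINCT ...) on each table's own primary key (for example COUNT(DISTINCT comments.id), if the comments table has an id column) rather than on comments.item_id, which has the same value for every row in a group.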
I'd like to sum two columns from two different tables and then group the result by user ID (uid). I set up a fiddle, but the query seems to be multiplying the results by the number of rows.
http://sqlfiddle.com/#!9/433a5e/3
You have multiple rows for each uid in both tables. Hence, for a given uid, you get a Cartesian product: 2 rows in one table and 3 rows in the other become 6 rows with lots of duplicated data.
So, aggregate the data before doing the join:
select s.uid, sumscore, sumorder
from (select s.uid, sum(s.score) as sumscore
from scores s
group by s.uid
) s left join
(select o.uid, sum(o.order) sumorder
from orders o
group by o.uid
) o
on o.uid = s.uid;
Here are the results in a SQL Fiddle.
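The 2 x 3 = 6 row effect can be reproduced with a small sketch using Python's sqlite3 module as a stand-in for MySQL. The sample data is invented, and the `order` column is renamed to a hypothetical `amount` because ORDER is a reserved word that SQLite would require quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (uid INTEGER, score INTEGER);
CREATE TABLE orders (uid INTEGER, amount INTEGER);  -- amount replaces `order`
INSERT INTO scores VALUES (1, 10), (1, 20);
INSERT INTO orders VALUES (1, 1), (1, 2), (1, 3);
""")

# Join first, aggregate afterwards: each score row is repeated 3 times
# and each order row 2 times, so both sums are inflated.
naive = conn.execute("""
SELECT s.uid, COUNT(*), SUM(s.score), SUM(o.amount)
FROM scores s
JOIN orders o ON o.uid = s.uid
GROUP BY s.uid
""").fetchone()

# Aggregate each table first, then join one row per uid to one row per uid.
fixed = conn.execute("""
SELECT s.uid, s.sumscore, o.sumorder
FROM (SELECT uid, SUM(score) AS sumscore FROM scores GROUP BY uid) s
LEFT JOIN (SELECT uid, SUM(amount) AS sumorder FROM orders GROUP BY uid) o
       ON o.uid = s.uid
""").fetchone()

print(naive)  # (1, 6, 90, 12): 6 joined rows, sums multiplied
print(fixed)  # (1, 30, 6): the real totals
```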
You may be looking for this. Try it and let me know whether it helps.
SELECT s.uid,
SUM(s.score) AS score_tot,
(SELECT SUM(orders.order)
FROM orders
WHERE orders.uid = s.uid
GROUP BY orders.uid) AS order_tot
FROM scores AS s
GROUP BY s.uid
sqlfiddle here
I have three tables: projects, discussions, and comments.
I have tried it like this:
SELECT p.PRO_Name, COUNT( d.DIS_Id ) AS nofdisc, COUNT( c.COM_Id ) AS nofcom
FROM projects p
LEFT JOIN discussions d ON p.PRO_Id = d.PRO_Id
LEFT JOIN comments c ON d.DIS_Id = c.DIS_Id
GROUP BY p.PRO_Name LIMIT 0 , 30
But it's taking all the rows from discussions and the count of comments is the same as the count of discussions.
count counts the number of non-null values of the given parameter. The join you have will create a row per comment, where both dis_id and com_id are not null, so their counts would be the same. Since these are IDs, you could just count the distinct number of occurrences to get the response you'd want:
(EDIT: Added an order by clause as per the request in the comments)
SELECT p.PRO_Name,
COUNT(DISTINCT d.DIS_Id) AS nofdisc,
COUNT(DISTINCT c.COM_Id) AS nofcom
FROM projects p
LEFT JOIN discussions d ON p.PRO_Id = d.PRO_Id
LEFT JOIN comments c ON d.DIS_Id = c.DIS_Id
GROUP BY p.PRO_Name
ORDER BY 2,3
LIMIT 0 , 30
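As a quick check of the COUNT versus COUNT(DISTINCT) behaviour, here is a small sketch run through Python's sqlite3 module (SQLite standing in for MySQL). The sample data is invented: one project with two discussions, holding 2 comments and 1 comment respectively.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE projects    (PRO_Id INTEGER PRIMARY KEY, PRO_Name TEXT);
CREATE TABLE discussions (DIS_Id INTEGER PRIMARY KEY, PRO_Id INTEGER);
CREATE TABLE comments    (COM_Id INTEGER PRIMARY KEY, DIS_Id INTEGER);
INSERT INTO projects VALUES (1, 'Alpha');
INSERT INTO discussions VALUES (10, 1), (11, 1);
INSERT INTO comments VALUES (100, 10), (101, 10), (102, 11);
""")

# The same joins are used for both queries; only the aggregate differs.
base = """
SELECT p.PRO_Name, {disc}, {com}
FROM projects p
LEFT JOIN discussions d ON p.PRO_Id = d.PRO_Id
LEFT JOIN comments c ON d.DIS_Id = c.DIS_Id
GROUP BY p.PRO_Name
"""

# One joined row per comment, so both plain counts see 3 non-null values.
plain = conn.execute(
    base.format(disc="COUNT(d.DIS_Id)", com="COUNT(c.COM_Id)")
).fetchone()

# DISTINCT collapses the repeated discussion ids.
distinct = conn.execute(
    base.format(disc="COUNT(DISTINCT d.DIS_Id)", com="COUNT(DISTINCT c.COM_Id)")
).fetchone()

print(plain)     # ('Alpha', 3, 3)
print(distinct)  # ('Alpha', 2, 3)
```

This works here because both counted columns are unique ids; it would not help for sums or for columns where repeated values are legitimate.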
I have 4 tables in an existing mysql database of a directory type site.
Table mt_links contains basic info for each listing
Table mt_cl contains which listing above is in what category (I only want cat_id=1)
Table mt_cfvalues contains more details for each listing; it can have repeated values
Table mt_images contains image names for each listing.
I want all records from mt_links where the mt_cl cat_id=1, and for each of those records, I need all records in mt_cfvalues and mt_images matching the link_id.
I set up a select with Group_Concat and left joins, but ended up with repeating values in my results. I added Distinct, which cured the repeating values, but mt_cfvalues can have records with the same value, so now I'm missing a value I should have.
SELECT a.link_id,
a.link_name,
a.link_desc,
GROUP_CONCAT(DISTINCT b.value ORDER BY b.cf_ID) AS details,
GROUP_CONCAT(DISTINCT c.filename ORDER BY c.ordering) AS images
FROM mt_links a
LEFT JOIN mt_cfvalues b ON a.link_id = b.link_ID
LEFT JOIN mt_images c ON b.link_id = c.link_ID
LEFT JOIN mt_cl d ON a.link_id = d.link_ID WHERE d.cat_ID = '1'
GROUP BY a.link_id
I put together a SQLFiddle here: http://www.sqlfiddle.com/#!2/f39e9/1
Is there an easier way? How do I fix the repeating / no repeating issue?
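Here is one way of accomplishing what you seek. The details and the images are independent of each other, so joining both tables to mt_links in a single grouped query pairs every detail row with every image row, which is why you were getting duplicates. Aggregating each table in its own subquery avoids that.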
SELECT a.link_id,
a.link_name,
a.link_desc,
cvf.details,
imgs.images
FROM mt_links a
LEFT JOIN (
SELECT link_ID, GROUP_CONCAT(value ORDER BY cf_ID) AS details
FROM mt_cfvalues
GROUP BY link_ID
) cvf ON cvf.link_ID = a.link_id
LEFT JOIN (
SELECT link_ID, GROUP_CONCAT(filename ORDER BY ordering) AS images
FROM mt_images
GROUP BY link_ID
) imgs ON imgs.link_ID = a.link_id
INNER JOIN mt_cl d ON a.link_id = d.link_ID
WHERE d.cat_ID = '1'
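The following sketch illustrates why DISTINCT loses a legitimately repeated value while per-table subqueries keep it. It uses Python's sqlite3 module as a stand-in for MySQL with invented sample rows. The ORDER BY inside GROUP_CONCAT and the mt_cl category filter are omitted for brevity, so the results are compared after sorting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mt_links    (link_id INTEGER PRIMARY KEY, link_name TEXT);
CREATE TABLE mt_cfvalues (link_ID INTEGER, cf_ID INTEGER, value TEXT);
CREATE TABLE mt_images   (link_ID INTEGER, ordering INTEGER, filename TEXT);
INSERT INTO mt_links VALUES (1, 'Listing');
INSERT INTO mt_cfvalues VALUES (1, 1, 'red'), (1, 2, 'red'), (1, 3, 'blue');
INSERT INTO mt_images VALUES (1, 1, 'a.jpg'), (1, 2, 'b.jpg');
""")

# Both tables joined directly: DISTINCT hides the 3 x 2 duplication,
# but it also merges the two genuine 'red' values into one.
with_distinct = conn.execute("""
SELECT GROUP_CONCAT(DISTINCT b.value), GROUP_CONCAT(DISTINCT c.filename)
FROM mt_links a
LEFT JOIN mt_cfvalues b ON a.link_id = b.link_ID
LEFT JOIN mt_images c ON a.link_id = c.link_ID
GROUP BY a.link_id
""").fetchone()

# Each table concatenated in its own subquery: no duplication, nothing lost.
with_subqueries = conn.execute("""
SELECT cvf.details, imgs.images
FROM mt_links a
LEFT JOIN (SELECT link_ID, GROUP_CONCAT(value) AS details
           FROM mt_cfvalues GROUP BY link_ID) cvf ON cvf.link_ID = a.link_id
LEFT JOIN (SELECT link_ID, GROUP_CONCAT(filename) AS images
           FROM mt_images GROUP BY link_ID) imgs ON imgs.link_ID = a.link_id
""").fetchone()

distinct_details = sorted(with_distinct[0].split(","))
subquery_details = sorted(with_subqueries[0].split(","))
subquery_images = sorted(with_subqueries[1].split(","))
print(distinct_details)  # ['blue', 'red']
print(subquery_details)  # ['blue', 'red', 'red']
```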
I'm having trouble finding a similar example to what I'm trying to achieve. I have 3 tables. From one table I want to get the linking ID number. From another table I want to find the same IDs and add up another column of numbers in that table where the ID number from the 1st table matches. Then on the 3rd table, which is text, I want to group all the text together where the ID matches the main ID number... and return all this in 1 go. My diagram should show what I mean:
So I have 2 queries that will each return part of the results on their own, but I'm struggling to build them into 1 single query.
SELECT ticket_charges.ticket_id
, sum(ticket_charges.charge_time) AS Seconds
FROM
ticket_charges
LEFT OUTER JOIN tickets
ON ticket_charges.ticket_id = tickets.id
GROUP BY
ticket_charges.ticket_id
, tickets.id
The 77 and 937 for ticket ID 3 have been added up correctly!!
SELECT tickets.id AS `Ticket Number`
, left(tickets_messages.message, 500) AS `Ticket Message`
FROM
tickets
INNER JOIN tickets_messages
ON tickets.id = tickets_messages.id
GROUP BY
tickets_messages.ticket_id
, tickets.id
The messages are joined together correctly.
I've tried some concatenation on messages, selects within selects, different methods of grouping, a couple of sums, etc., but I just can't seem to get the correct results from both queries as 1 single query. Either the summed numbers from "charge_time" are very wrong and bear no resemblance to anything, or I end up with hundreds of "message" rows and strange numbers for "charge_time".
FYI.. If I try this, I get "Sub query returned more than 1 row" but it's what I thought I should be doing.
SELECT ticket_charges.ticket_id
, sum(ticket_charges.charge_time) AS Seconds
FROM
ticket_charges
LEFT OUTER JOIN tickets
ON ticket_charges.ticket_id = tickets.id
Where (SELECT left(tickets_messages.message, 500)
FROM
tickets
INNER JOIN tickets_messages
ON tickets.id = tickets_messages.id
GROUP BY
tickets.id)
GROUP BY
ticket_charges.ticket_id
, tickets.id
If you really need to do that with a single query, the solution is to use a subquery in one of the joins.
SELECT t.id, t.person_id, SUM(tc.charge_time), mc.concat
FROM tickets t
INNER JOIN ticket_charges tc ON tc.ticket_id = t.id
INNER JOIN (
SELECT ticket_id, GROUP_CONCAT(message SEPARATOR ' ') as concat
FROM tickets_messages
GROUP BY ticket_id) AS mc
ON mc.ticket_id = t.id
GROUP BY t.id
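Here is a rough sketch of why the subquery matters, run through Python's sqlite3 module as a stand-in for MySQL. The charges 77 and 937 for ticket 3 come from the question; the two messages are invented, and SQLite's two-argument GROUP_CONCAT(message, ' ') replaces MySQL's SEPARATOR syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tickets          (id INTEGER PRIMARY KEY);
CREATE TABLE ticket_charges   (ticket_id INTEGER, charge_time INTEGER);
CREATE TABLE tickets_messages (ticket_id INTEGER, message TEXT);
INSERT INTO tickets VALUES (3);
INSERT INTO ticket_charges VALUES (3, 77), (3, 937);
INSERT INTO tickets_messages VALUES (3, 'hello'), (3, 'world');
""")

# Joining charges and messages directly pairs each charge with each message,
# so every charge is summed once per message.
naive = conn.execute("""
SELECT t.id, SUM(tc.charge_time)
FROM tickets t
JOIN ticket_charges tc ON tc.ticket_id = t.id
JOIN tickets_messages tm ON tm.ticket_id = t.id
GROUP BY t.id
""").fetchone()

# Messages are collapsed to one row per ticket before the join,
# so each charge appears exactly once.
fixed = conn.execute("""
SELECT t.id, SUM(tc.charge_time), mc.msgs
FROM tickets t
JOIN ticket_charges tc ON tc.ticket_id = t.id
JOIN (SELECT ticket_id, GROUP_CONCAT(message, ' ') AS msgs
      FROM tickets_messages
      GROUP BY ticket_id) AS mc ON mc.ticket_id = t.id
GROUP BY t.id
""").fetchone()

print(naive[1])  # 2028, i.e. (77 + 937) * 2 messages
print(fixed[1])  # 1014
```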
Try this query -
SELECT
t.id,
t.person_id,
SUM(tc.charge_time) Seconds,
GROUP_CONCAT(LEFT(tm.message, 20)) Message
FROM
tickets t
LEFT JOIN ticket_charges tc
ON tc.ticket_id = t.id
LEFT JOIN tickets_messages tm
ON tm.ticket_id = t.id
GROUP BY
t.id;
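Note that I used 'LEFT(tm.message, 20)' because the length of the GROUP_CONCAT result is limited by group_concat_max_len. Be aware that a ticket with several charges and several messages produces one joined row per charge/message pair, so SUM(charge_time) counts each charge once per message; aggregate one of the two tables in a subquery if that applies to your data.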
I have these tables and queries as defined in sqlfiddle.
First my problem was to group people showing LEFT JOINed visits rows with the newest year. That I solved using subquery.
Now my problem is that that subquery is not using INDEX defined on visits table. That is causing my query to run nearly indefinitely on tables with approx 15000 rows each.
Here's the query. The goal is to list every person once with his newest (by year) record in visits table.
Unfortunately on large tables it gets real sloooow because it's not using INDEX in subquery.
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id
Does anyone know how to force MySQL to use INDEX already defined on visits table?
Your query:
SELECT *
FROM people
LEFT JOIN (
SELECT *
FROM visits
ORDER BY visits.year DESC
) AS visits
ON people.id = visits.id_people
GROUP BY people.id;
First, it uses non-standard SQL syntax: items appear in the SELECT list that are not part of the GROUP BY clause, are not aggregate functions, and do not depend on the grouping items. This can give indeterminate (semi-random) results.
Second, to avoid the indeterminate results, you have added an ORDER BY inside a subquery. Non-standard or not, nothing in the MySQL documentation says this should work as expected. So it may be working now, but it may stop working in the not-so-distant future when you upgrade to MySQL version X, where the optimizer is clever enough to recognize that an ORDER BY inside a derived table is redundant and can be eliminated.
Try using this query:
SELECT
p.*, v.*
FROM
people AS p
LEFT JOIN
( SELECT
id_people
, MAX(year) AS year
FROM
visits
GROUP BY
id_people
) AS vm
JOIN
visits AS v
ON v.id_people = vm.id_people
AND v.year = vm.year
ON v.id_people = p.id;
The: SQL-fiddle
A compound index on (id_people, year) would help efficiency.
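A small sketch of the MAX(year) approach together with the suggested compound index, using Python's sqlite3 module as a stand-in for MySQL. The sample people and visits are invented, the index name is hypothetical, and the nested join is written as two chained LEFT JOINs, which should be equivalent for this purpose and is easier to port between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE visits (id INTEGER PRIMARY KEY, id_people INTEGER,
                     year INTEGER, note TEXT);
-- Hypothetical index name; the column pair is the one suggested above.
CREATE INDEX idx_visits_people_year ON visits (id_people, year);
INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO visits VALUES (1, 1, 2010, 'a'), (2, 1, 2012, 'b'), (3, 2, 2011, 'c');
""")

rows = conn.execute("""
SELECT p.id, p.name, v.year, v.note
FROM people AS p
LEFT JOIN (SELECT id_people, MAX(year) AS year
           FROM visits
           GROUP BY id_people) AS vm ON vm.id_people = p.id
LEFT JOIN visits AS v ON v.id_people = vm.id_people
                     AND v.year = vm.year
ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
# (1, 'Ann', 2012, 'b')
# (2, 'Bob', 2011, 'c')
# (3, 'Cy', None, None)   person with no visits is still listed
```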
A different approach. It works fine if you limit the persons to a sensible limit (say 30) first and then join to the visits table:
SELECT
p.*, v.*
FROM
( SELECT *
FROM people
ORDER BY name
LIMIT 30
) AS p
LEFT JOIN
visits AS v
ON v.id_people = p.id
AND v.year =
( SELECT
year
FROM
visits
WHERE
id_people = p.id
ORDER BY
year DESC
LIMIT 1
)
ORDER BY name ;
Why do you have a subquery when all you need is a table name for joining?
It is also not obvious to me why your query has a GROUP BY clause in it. GROUP BY is ordinarily used with aggregate functions like MAX or COUNT, but you don't have those.
How about this? It may solve your problem.
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
If you need to show the person, the most recent visit, and the note from the most recent visit, you're going to have to explicitly join the visits table again to the summary query (virtual table) like so.
SELECT a.id, a.name, a.year, v.note
FROM (
SELECT people.id, people.name, MAX(visits.year) year
FROM people
JOIN visits ON people.id = visits.id_people
GROUP BY people.id, people.name
)a
JOIN visits v ON (a.id = v.id_people and a.year = v.year)
Go fiddle: http://www.sqlfiddle.com/#!2/d67fc/20/0
If you need to show something for people that have never had a visit, you should try switching the JOIN items in my statement with LEFT JOIN.
As someone else wrote, an ORDER BY clause in a subquery is not standard, and generates unpredictable results. In your case it baffled the optimizer.
Edit: GROUP BY is a big hammer. Don't use it unless you need it. And, don't use it unless you use an aggregate function in the query.
Notice that if you have more than one row in visits for a person and the most recent year, this query will generate multiple rows for that person, one for each visit in that year. If you want just one row per person, and you DON'T need the note for the visit, then the first query will do the trick. If you have more than one visit for a person in a year, and you only need the latest one, you have to identify which row IS the latest one. Usually it will be the one with the highest ID number, but only you know that for sure. I added another person to your fiddle with that situation. http://www.sqlfiddle.com/#!2/4f644/2/0
This is complicated. But: if your visits.id numbers are automatically assigned and they are always in time order, you can simply report the highest visit id, and be guaranteed that you'll have the latest year. This will be a very efficient query.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT id_people, max(id) id
FROM visits
GROUP BY id_people
)m
JOIN people p ON (p.id = m.id_people)
JOIN visits v ON (m.id = v.id)
http://www.sqlfiddle.com/#!2/4f644/1/0
But this is not the way your example is set up. So you need another way to disambiguate your latest visit, so you just get one row per person. The only trick we have at our disposal is to use the largest id number.
So, we need to get a list of the visit.id numbers that are the latest ones, by this definition, from your tables. This query does that, with a MAX(year)...GROUP BY(id_people) nested inside a MAX(id)...GROUP BY(id_people) query.
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON (p.id_people = v.id_people AND p.year = v.year)
GROUP BY v.id_people
The overall query (http://www.sqlfiddle.com/#!2/c2da2/1/0) is this.
SELECT p.id, p.name, v.year, v.note
FROM (
SELECT v.id_people,
MAX(v.id) id
FROM (
SELECT id_people,
MAX(year) year
FROM visits
GROUP BY id_people
)p
JOIN visits v ON ( p.id_people = v.id_people
AND p.year = v.year)
GROUP BY v.id_people
)m
JOIN people p ON (m.id_people = p.id)
JOIN visits v ON (m.id = v.id)
Disambiguation in SQL is a tricky business to learn, because it takes some time to wrap your head around the idea that there's no inherent order to rows in a DBMS.
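To illustrate the two-level disambiguation (latest year first, then highest id within that year), here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The data is invented: Ann has two visits in 2012 plus an older 2010 visit that was entered later and therefore has the highest id, which is the case where MAX(id) alone would pick the wrong row. The inner alias is renamed to `latest` for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE visits (id INTEGER PRIMARY KEY, id_people INTEGER,
                     year INTEGER, note TEXT);
INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO visits VALUES
  (1, 1, 2012, 'first'),
  (2, 1, 2012, 'second'),
  (3, 1, 2010, 'old'),     -- highest id, but not the latest year
  (4, 2, 2011, 'only');
""")

rows = conn.execute("""
SELECT p.id, p.name, v.year, v.note
FROM (
    -- Step 2: among the visits in each person's latest year, keep the top id.
    SELECT v.id_people, MAX(v.id) AS id
    FROM (
        -- Step 1: latest year per person.
        SELECT id_people, MAX(year) AS year
        FROM visits
        GROUP BY id_people
    ) latest
    JOIN visits v ON latest.id_people = v.id_people
                 AND latest.year = v.year
    GROUP BY v.id_people
) m
JOIN people p ON m.id_people = p.id
JOIN visits v ON m.id = v.id
ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
# (1, 'Ann', 2012, 'second')
# (2, 'Bob', 2011, 'only')
```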