I am trying to list all tasks with the count of finished/completed tasks (in submissions). The problem is that I also want to show the tasks that no user has finished. This query does not list tasks whose count is 0 (NULL). Is there any way to do this?
Wanted result:
Date | title | completed
2014-05-20 | Case 1 | 45
2014-05-24 | Case 10 | 11
2014-05-20 | Case 2 | 0
I have tried so far:
Select date, title, count(*) as completed
from users u, submissions s, task t
where u.userPK = s.user
and s.task= t.taskPK
group by taskPK
order by completed desc;
You need to use an OUTER JOIN to get your desired results. However, considering the previous answer didn't suffice, I would also guess you don't want to GROUP BY the taskPK field, but rather by the date and title fields.
Perhaps this is what you're looking for:
-- COUNT(s.task) skips the NULLs produced for tasks without submissions, so they show 0
SELECT t.date, t.title, COUNT(s.task) cnt
FROM task t
LEFT JOIN submissions s ON s.task = t.taskPK
GROUP BY t.date, t.title
ORDER BY cnt DESC
I also removed the user table as I'm not sure how it affects the results. If you need it back, just add an additional join.
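One detail is worth checking with any LEFT JOIN count: COUNT(*) counts the NULL-padded row that the outer join produces for an unmatched task, while COUNT on a column from the joined table skips it. Below is a minimal sketch of the difference, assuming SQLite (through Python's sqlite3 module) as a stand-in for MySQL; the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL schema; the sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (taskPK INTEGER PRIMARY KEY, date TEXT, title TEXT);
    CREATE TABLE submissions (user INTEGER, task INTEGER);
    INSERT INTO task VALUES (1, '2014-05-20', 'Case 1'), (2, '2014-05-20', 'Case 2');
    INSERT INTO submissions VALUES (10, 1), (11, 1);
""")

query = """
    SELECT t.title,
           COUNT(*)      AS count_star,    -- also counts the NULL-padded row
           COUNT(s.task) AS count_column   -- skips NULLs, so unmatched tasks get 0
    FROM task t
    LEFT JOIN submissions s ON s.task = t.taskPK
    GROUP BY t.taskPK, t.date, t.title
    ORDER BY t.title
"""
counts = conn.execute(query).fetchall()
for row in counts:
    print(row)
# ('Case 1', 2, 2)
# ('Case 2', 1, 0)
```

Case 2 has no submissions, yet COUNT(*) reports 1 for it; only the column count gives the 0 asked for in the question.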
I think you should be able to achieve this using a LEFT JOIN:
SELECT date, title, COUNT(u.userPK) completed FROM task t
LEFT JOIN submissions s ON s.task = t.taskPK
LEFT JOIN users u ON s.user = u.userPK
GROUP BY t.taskPK
ORDER BY completed;
Working example: http://sqlfiddle.com/#!9/80995/20
I have three tables, a user table, a user_group table, and a link table.
The link table contains the dates that users were added to user groups. I need a query that returns the count of users currently in each group. The most recent date determines the group that the user is currently in.
SELECT
user_groups.name,
COUNT(l.name) AS ct,
GROUP_CONCAT(l.`name` separator ", ") AS members
FROM user_groups
LEFT JOIN
(SELECT MAX(added), group_id, name FROM link LEFT JOIN users ON users.id = link.user_id GROUP BY user_id) l
ON l.group_id = user_groups.id
GROUP BY user_groups.id
My question is if the query I have written could be optimized, or written better.
Thanks!
Ben
Your actual query is not giving you the answer you want; at least, as far as I understand your question. John actually joined group 2 on 2017-01-05, yet in your results he appears in group 1 (which he joined on 2017-01-01). Note also that Group 4 is missing from your results.
Using standard SQL, I think the following query is what you're looking for. The comments in the query should clarify what each part is doing:
SELECT
user_groups.name AS group_name,
COUNT(u.name) AS member_count,
group_concat(u.name separator ', ') AS members
FROM
user_groups
LEFT JOIN
(
SELECT * FROM
(-- For each user, find most recent date s/he got into a group
SELECT
user_id AS the_user_id, MAX(added) AS last_added
FROM
link
GROUP BY
user_id
) AS u_a
-- Join back to the link table, so that the `group_id` can be retrieved
JOIN link l2 ON l2.user_id = u_a.the_user_id AND l2.added = u_a.last_added
) AS most_recent_group ON most_recent_group.group_id = user_groups.id
-- And get the users...
LEFT JOIN users u ON u.id = most_recent_group.the_user_id
GROUP BY
user_groups.id, user_groups.name
ORDER BY
user_groups.name ;
This can be written more compactly in MySQL, by abusing the fact that older versions of MySQL do not enforce the SQL standard's GROUP BY restrictions.
That's what you'll get:
group_name | member_count | members
:--------- | -----------: | :-------------
Group 1 | 2 | Mikie, Dominic
Group 2 | 2 | John, Paddy
Group 3 | 0 | null
Group 4 | 1 | Nellie
dbfiddle here
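The "latest row per user, then join back" pattern can be tried on a small data set. This is only a sketch, assuming SQLite via Python's sqlite3 as a stand-in for MySQL; the rows are invented (they are not the data from the question) and the member list column is left out to keep the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_groups (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE link (user_id INTEGER, group_id INTEGER, added TEXT);
    INSERT INTO users VALUES (1, 'John'), (2, 'Mikie'), (3, 'Nellie');
    INSERT INTO user_groups VALUES (1, 'Group 1'), (2, 'Group 2'), (3, 'Group 3');
    INSERT INTO link VALUES
        (1, 1, '2017-01-01'),  -- John's first group
        (1, 2, '2017-01-05'),  -- John later moves to Group 2
        (2, 1, '2017-01-02'),
        (3, 1, '2017-01-03');
""")

query = """
    SELECT g.name, COUNT(u.id) AS member_count
    FROM user_groups g
    LEFT JOIN (
        -- most recent link row per user, recovered by joining back on the max date
        SELECT l2.user_id, l2.group_id
        FROM (SELECT user_id, MAX(added) AS last_added
              FROM link
              GROUP BY user_id) AS latest
        JOIN link l2 ON l2.user_id = latest.user_id
                    AND l2.added = latest.last_added
    ) AS current_group ON current_group.group_id = g.id
    LEFT JOIN users u ON u.id = current_group.user_id
    GROUP BY g.id, g.name
    ORDER BY g.name
"""
group_counts = conn.execute(query).fetchall()
print(group_counts)
# [('Group 1', 2), ('Group 2', 1), ('Group 3', 0)]
```

John is counted only in Group 2, his most recent group, and the empty Group 3 still appears with a count of 0.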
Note that this query can be simplified if you use a database with window functions (such as MariaDB 10.2). Then, you can use:
SELECT
user_groups.name AS group_name,
COUNT(u.name) AS member_count,
group_concat(u.name separator ', ') AS members
FROM
user_groups
LEFT JOIN
(
SELECT DISTINCT
user_id AS the_user_id,
-- newest row first, so first_value() returns the most recent group
first_value(group_id) OVER (PARTITION BY user_id ORDER BY added DESC) AS group_id
FROM
link
) AS most_recent_group ON most_recent_group.group_id = user_groups.id
-- And get the users...
LEFT JOIN users u ON u.id = most_recent_group.the_user_id
GROUP BY
user_groups.id, user_groups.name
ORDER BY
user_groups.name ;
dbfiddle here
I cannot find the answer to my problem here on stackoverflow. I have a query that spans 3 tables:
newsitem
+------+----------+----------+----------+--------+----------+
| Guid | Supplier | LastEdit | ShowDate | Title | Contents |
+------+----------+----------+----------+--------+----------+
newsrating
+----+----------+--------+--------+
| Id | NewsGuid | UserId | Rating |
+----+----------+--------+--------+
usernews
+----+----------+--------+----------+
| Id | NewsGuid | UserId | ReadDate |
+----+----------+--------+----------+
Newsitem obviously contains newsitems, newsrating contains ratings that users give to newsitems, and usernews contains the date when a user has read a newsitem.
In my query I want to get every newsitem, including the number of ratings for that newsitem and the average rating, and how many times that newsitem has been read by the current user.
What I have so far is:
select newsitem.guid, supplier, count(newsrating.id) as numberofratings,
avg(newsrating.rating) as rating,
count(case usernews.UserId when 3 then 1 else null end) as numberofreads from newsitem
left join newsrating on newsitem.guid = newsrating.newsguid
left join usernews on newsitem.guid = usernews.newsguid
group by newsitem.guid
I have created an sql fiddle here: http://sqlfiddle.com/#!9/c8add/8
Both count() calls don't return the numbers I want. numberofratings should return the total number of ratings for that newsitem (by all users). numberofreads should return the number of reads for the current user for that newsitem.
So, newsitem with guid d104c330-c319-40e8-8be3-a7c4f549d35c should have 2 ratings and 3 reads for the current user with userid = 3.
I have tried conditional counts and sums, but no success yet. How can this be accomplished?
The main problem is that you're joining both tables at once, so each news item's rows are multiplied by both counts, which is why your totals are wrong. For example, if the news item has been read 3 times by the user and rated by 8 users, you end up with 24 rows, so it looks as if it has been rated 24 times. Adding DISTINCT to the COUNT of the rating IDs corrects that. The average is unaffected because every rating is repeated the same number of times: the average of 1 and 2 is the same as the average of 1, 1, 2 and 2.
You can then handle the reads by adding the userid to the JOIN condition (since it's an OUTER JOIN it shouldn't cause any loss of results) instead of in a CASE statement for your COUNT, then you can do a COUNT on distinct id values from Usernews. The resulting query would be:
SELECT
I.guid,
I.supplier,
COUNT(DISTINCT R.id) AS number_of_ratings,
AVG(R.rating) AS avg_rating,
COUNT(DISTINCT UN.id) AS number_of_reads
FROM
NewsItem I
LEFT OUTER JOIN NewsRating R ON R.newsguid = I.guid
LEFT OUTER JOIN UserNews UN ON
UN.newsguid = I.guid AND
UN.userid = @userid
GROUP BY
I.guid,
I.supplier
While that should work, you might get better results from a subquery, as the above needs to explode out the results and then aggregate them, perhaps unnecessarily. Also, some people might find the below to be a little clearer.
SELECT
I.guid,
I.supplier,
R.number_of_ratings,
R.avg_rating,
COUNT(UN.id) AS number_of_reads
FROM
NewsItem I
LEFT OUTER JOIN
(
SELECT
newsguid,
COUNT(*) AS number_of_ratings,
AVG(rating) AS avg_rating
FROM
NewsRating
GROUP BY
newsguid
) R ON R.newsguid = I.guid
LEFT OUTER JOIN UserNews UN ON UN.newsguid = I.guid AND UN.userid = @userid
GROUP BY
I.guid,
I.supplier,
R.number_of_ratings,
R.avg_rating
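The row multiplication described above is easy to reproduce. The following is a rough sketch, assuming SQLite via Python's sqlite3 as a stand-in for MySQL, with invented rows: item 'a' has 2 ratings and 3 reads by user 3 (plus one read by another user), and item 'b' has neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE newsitem (guid TEXT PRIMARY KEY);
    CREATE TABLE newsrating (id INTEGER PRIMARY KEY, newsguid TEXT, userid INTEGER, rating INTEGER);
    CREATE TABLE usernews (id INTEGER PRIMARY KEY, newsguid TEXT, userid INTEGER);
    INSERT INTO newsitem VALUES ('a'), ('b');
    INSERT INTO newsrating (newsguid, userid, rating) VALUES ('a', 1, 2), ('a', 2, 4);
    INSERT INTO usernews (newsguid, userid) VALUES ('a', 3), ('a', 3), ('a', 3), ('a', 7);
""")

query = """
    SELECT i.guid,
           COUNT(r.id)           AS naive_ratings,  -- inflated: 2 ratings x 3 reads
           COUNT(DISTINCT r.id)  AS ratings,
           AVG(r.rating)         AS avg_rating,
           COUNT(DISTINCT un.id) AS reads
    FROM newsitem i
    LEFT JOIN newsrating r ON r.newsguid = i.guid
    LEFT JOIN usernews un ON un.newsguid = i.guid AND un.userid = ?
    GROUP BY i.guid
    ORDER BY i.guid
"""
news_rows = conn.execute(query, (3,)).fetchall()
for row in news_rows:
    print(row)
# ('a', 6, 2, 3.0, 3)
# ('b', 0, 0, None, 0)
```

The plain COUNT reports 6 ratings for item 'a', while COUNT(DISTINCT ...) gives the expected 2 ratings and 3 reads, and the average stays at 3.0 either way.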
I'm with Tom: you should use a subquery to calculate the user's read count.
SQL Fiddle Demo
SELECT NI.guid,
NI.supplier,
COUNT(NR.ID) as numberofratings,
AVG(NR.rating) as rating,
user_read as numberofreads
FROM newsitem NI
LEFT JOIN newsrating NR
ON NI.guid = NR.newsguid
LEFT JOIN (SELECT NewsGuid, COUNT(*) user_read
FROM usernews
WHERE UserId = 3 -- use a variable @user_id here
GROUP BY NewsGuid) UR
ON NI.guid = UR.NewsGuid
GROUP BY NI.guid,
NI.supplier,
numberofreads;
I have two tables. One is a call history table which logs calls made (starttime, endtime, phone number, user, etc.). The other is an orders table which logs order details (order number, customer info, orderdate, etc.). Orders are not always created when a call is created, so there isn't a guaranteed ID to match them up. Right now, I'm interested in getting totals by day. When I try to run a query to sum calls and join orders by day, I get the following error:
The SELECT would examine more than MAX_JOIN_SIZE rows; check your WHERE and use SET SQL_BIG_SELECTS=1 or SET MAX_JOIN_SIZE=# if the SELECT is okay
This is the query I use:
SELECT
DATE_FORMAT(c.date_call_start,'%Y-%m-%d') as date,
COUNT(c.id) as calls,
COUNT(o.id) as orders
FROM tbl_calls c
LEFT OUTER JOIN tbl_orders o
ON DATE_FORMAT(c.date_call_start,'%Y-%m-%d') = DATE_FORMAT(o.created,'%Y-%m-%d')
WHERE c.campaign_id = 1
AND DATE_FORMAT(c.date_call_start,'%Y-%m-%d') = '2013-12-09'
GROUP BY DATE_FORMAT(c.date_call_start,'%Y-%m-%d')
Even when there are only a few calls for a particular day, it still shows the same error. So I'm pretty sure it's my query that needs work.
I have also tried a sub query, but that doesn't rollup the totals from the subquery.
SELECT
DATE_FORMAT(c.date_call_start,'%Y-%m-%d') as date,
count(c.id) as calls,
(select count(DISTINCT o.id)
FROM tbl_orders o
WHERE DATE_FORMAT(o.created,'%Y-%m-%d') = DATE_FORMAT(c.date_call_start,'%Y-%m-%d')
) as orders
FROM tbl_calls c
WHERE c.campaign_id = 1
AND DATE_FORMAT(c.date_call_start,'%Y-%m-%d') BETWEEN '2013-12-09' AND '2013-12-15'
GROUP BY DATE_FORMAT(c.date_call_start,'%Y-%m-%d')
WITH ROLLUP
Any thoughts on how I can get this query to work? Ultimately I'd like a result like below so I can do other calculations like % orders etc.
date | calls | orders
------------------------------------
2013-12-01 | 100| 10
2013-12-02 | 125| 20
NULL | 225| 30
UPDATED:
Based on the answer I did the following:
Added a call_date field of type DATE (not DATETIME) to tbl_calls.
Added a date_order field of type DATE (not DATETIME) to tbl_orders.
Updated each table, setting the new field to date_format(the_date_time_stamp,'%Y-%m-%d') from the same table.
Added an index on each of the new date fields.
That made the following query work:
SELECT
c.call_date as date,
COUNT(DISTINCT c.id) as calls,
COUNT(DISTINCT o.id) as orders,
ROUND((COUNT(DISTINCT o.id) / COUNT(DISTINCT c.id))*100,2) as conversion
FROM tbl_calls c
JOIN tbl_orders o
ON c.call_date = o.date_order
WHERE c.campaign_id = 1
AND c.call_date BETWEEN '2013-12-09' AND '2013-12-15'
GROUP BY c.call_date
WITH ROLLUP
Which gives me the following result and I can build off of this. Thanks to each of you who provided suggestions. I tried each. All make sense. However, since I ultimately had to create the additional date fields I chose the answer by
date | calls | orders| conversion
-------------------------------------------
2013-12-09 | 151 | 6 | 3.97
2013-12-10 | 164 | 2 | 1.22
2013-12-11 | 165 | 6 | 3.64
2013-12-12 | 189 | 1 | 0.53
2013-12-13 | 116 | 4 | 3.45
null | 785 | 19 | 2.42
First, look at the output of EXPLAIN SELECT ..., where ... is the rest of your select query above.
Since you're performing the join on two fields which have a function applied to them, I'd guess that MySQL is performing two full table scans and using type ALL for the join. See the MySQL documentation for an explanation of the EXPLAIN output.
DATE_FORMAT(c.date_call_start,'%Y-%m-%d') = DATE_FORMAT(o.created,'%Y-%m-%d')
You'll most likely want to create a separate field in each table that contains just the result of the DATE_FORMAT call. Then create an index for each of these new fields. Then join on these new indexed fields. MySQL should like that much better.
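The effect of wrapping an indexed column in a function can be observed in a query plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 purely as an analogy: MySQL's optimizer and EXPLAIN output are different, and the call_date column and the idx_calls_call_date index are hypothetical names for the separate date field suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_calls (
        id INTEGER PRIMARY KEY,
        campaign_id INTEGER,
        date_call_start TEXT,   -- full timestamp
        call_date TEXT          -- hypothetical date-only copy of date_call_start
    );
    CREATE INDEX idx_calls_call_date ON tbl_calls (call_date);
""")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the readable plan step.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Filtering on a function of the timestamp: no index can be used here.
wrapped_plan = plan(
    "SELECT campaign_id FROM tbl_calls WHERE date(date_call_start) = '2013-12-09'"
)
# Filtering on the plain, indexed date column.
indexed_plan = plan(
    "SELECT campaign_id FROM tbl_calls WHERE call_date = '2013-12-09'"
)
print(wrapped_plan)
print(indexed_plan)
```

On SQLite the first plan is a full scan of tbl_calls and the second names idx_calls_call_date; the exact wording of the plan text varies between SQLite versions.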
Presumably you want to count the calls and orders for each date. However, that is not what your query does, because it creates a cartesian product for all orders on a given date.
Instead, summarize the data first by date and then combine the results. This may be what you want:
select c.date, calls, orders
from (select DATE_FORMAT(c.date_call_start, '%Y-%m-%d') as date, count(*) as calls
from tbl_calls c
WHERE c.campaign_id = 1 and
DATE_FORMAT(c.date_call_start, '%Y-%m-%d') = '2013-12-09'
group by DATE_FORMAT(c.date_call_start, '%Y-%m-%d')
) c left outer join
(select DATE_FORMAT(o.created,'%Y-%m-%d') as date, count(*) as orders
from tbl_orders o
group by DATE_FORMAT(o.created, '%Y-%m-%d')
) o
on c.date = o.date;
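To see why summarizing first matters, here is a small sketch comparing the two shapes of query. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL (so date() replaces DATE_FORMAT), the rows are invented, and the COALESCE that turns a missing order count into 0 is an addition not present in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_calls (id INTEGER PRIMARY KEY, campaign_id INTEGER, date_call_start TEXT);
    CREATE TABLE tbl_orders (id INTEGER PRIMARY KEY, created TEXT);
    INSERT INTO tbl_calls (campaign_id, date_call_start) VALUES
        (1, '2013-12-09 09:00:00'), (1, '2013-12-09 10:30:00'),
        (1, '2013-12-09 16:45:00'), (1, '2013-12-10 11:00:00');
    INSERT INTO tbl_orders (created) VALUES
        ('2013-12-09 09:20:00'), ('2013-12-09 17:00:00');
""")

# Join on the day, then count: every call pairs with every order of that day.
joined_first = conn.execute("""
    SELECT date(c.date_call_start), COUNT(c.id), COUNT(o.id)
    FROM tbl_calls c
    LEFT JOIN tbl_orders o ON date(o.created) = date(c.date_call_start)
    WHERE c.campaign_id = 1
    GROUP BY date(c.date_call_start)
    ORDER BY date(c.date_call_start)
""").fetchall()

# Count per day in each table first, then join the two small summaries.
aggregated_first = conn.execute("""
    SELECT c.day, c.calls, COALESCE(o.orders, 0)
    FROM (SELECT date(date_call_start) AS day, COUNT(*) AS calls
          FROM tbl_calls
          WHERE campaign_id = 1
          GROUP BY date(date_call_start)) c
    LEFT JOIN (SELECT date(created) AS day, COUNT(*) AS orders
               FROM tbl_orders
               GROUP BY date(created)) o ON o.day = c.day
    ORDER BY c.day
""").fetchall()

print(joined_first)      # [('2013-12-09', 6, 6), ('2013-12-10', 1, 0)]
print(aggregated_first)  # [('2013-12-09', 3, 2), ('2013-12-10', 1, 0)]
```

With 3 calls and 2 orders on the same day, the join-first version reports 6 of each, whereas the pre-aggregated version reports the real totals.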
If @Barmar's suggestion does not work, then you may need to split the fields into DATE and TIME.
A different direction is to make two temp tables (giving you three queries in total):
CREATE TEMPORARY TABLE `tbl_calls_temp` SELECT * FROM tbl_calls c WHERE DATE(c.date_call_start) = '2013-12-09' AND c.campaign_id = 1
Then do the same restricting for the tbl_orders TABLE
CREATE TEMPORARY TABLE `tbl_orders_temp` SELECT * FROM tbl_orders o WHERE DATE(o.created) = '2013-12-09'
Finally query against the two temporary tables. Depending on how much data you get, you may want to add indexes to the temporary tables... but in all likelihood you are facing a full-join
SELECT
DATE_FORMAT(c.date_call_start,'%Y-%m-%d') as date,
COUNT(c.id) as calls,
COUNT(o.id) as orders
FROM tbl_calls_temp c
LEFT OUTER JOIN tbl_orders_temp o
ON DATE_FORMAT(c.date_call_start,'%Y-%m-%d') = DATE_FORMAT(o.created,'%Y-%m-%d')
GROUP BY DATE_FORMAT(c.date_call_start,'%Y-%m-%d')
And that should be much faster... assuming you have any indexes in your initial tables that can be queried.
I'm having trouble figuring out the sql for the following problem of mine. I have two tables like this:
+----------+------------+------------+
| event_id | event_name | event_date |
+----------+------------+------------+
+---------------+----------+---------+--------+
| attendance_id | event_id | user_id | status |
+---------------+----------+---------+--------+
What I am trying to do is to get a table like this:
+----------+--------+
| event_id | status |
+----------+--------+
Where the conditional for the second attendance table is the user_id. I'm trying to get a list of all the events as well as the status of a user for each one of those events, even if there is no record inside attendance (NULL is ok for now). And again, the status data from the attendance table needs to be chosen by the user_id.
From my initial research, I thought this would work:
SELECT event_id, status FROM events LEFT JOIN attendance WHERE attendance.user_id='someoutsideinput' ORDER BY event_date ASC
But that is not working for me as expected. How should I go about this?
Thanks!
All you need to do is move the condition from the WHERE clause into the ON clause.
SELECT events.event_id, COALESCE(attendance.status, 0) status
FROM events LEFT JOIN attendance
ON events.event_id = attendance.event_id AND
attendance.user_id='someoutsideinput'
ORDER BY events.event_date ASC
You need to move that condition to the JOIN clause instead of the WHERE clause.
BTW, you have not specified the join criteria between the tables, I have also corrected that below.
SELECT E.event_id
,A.status
FROM events E
LEFT JOIN
attendance A
ON E.event_id = A.event_id
AND A.user_id='someoutsideinput'
ORDER BY
E.event_date ASC
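The difference between filtering in WHERE and filtering in ON can be checked on a tiny data set. This is a minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in for MySQL; the event rows and the numeric user id 42 are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_id INTEGER PRIMARY KEY, event_name TEXT, event_date TEXT);
    CREATE TABLE attendance (attendance_id INTEGER PRIMARY KEY, event_id INTEGER,
                             user_id INTEGER, status TEXT);
    INSERT INTO events VALUES (1, 'Kickoff', '2013-01-10'), (2, 'Review', '2013-02-10');
    INSERT INTO attendance (event_id, user_id, status) VALUES
        (1, 42, 'going'), (2, 99, 'declined');
""")

# Condition in WHERE: rows where a.user_id is NULL or belongs to someone else are discarded.
filter_in_where = conn.execute("""
    SELECT e.event_id, a.status
    FROM events e
    LEFT JOIN attendance a ON a.event_id = e.event_id
    WHERE a.user_id = ?
    ORDER BY e.event_date
""", (42,)).fetchall()

# Condition in ON: it only limits which attendance rows match; every event is kept.
filter_in_on = conn.execute("""
    SELECT e.event_id, a.status
    FROM events e
    LEFT JOIN attendance a ON a.event_id = e.event_id AND a.user_id = ?
    ORDER BY e.event_date
""", (42,)).fetchall()

print(filter_in_where)  # [(1, 'going')]
print(filter_in_on)     # [(1, 'going'), (2, None)]
```

Event 2 disappears in the WHERE version because the filter runs after the outer join, which effectively turns it back into an inner join.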
I have 2 tables. One is items and another is votes for those items.
Items table has: |item_id|name|post_date
Votes table has: |votes_id|item_id|answer|total_yes|total_no
What I want to do is show all items based on post_date and show the answer in the votes table with the HIGHEST total_yes. So I want to show only a SINGLE answer from the votes table with the highest total_yes vote.
I was trying:
SELECT a.*, b.* FROM Items a
INNER JOIN Votes b ON a.item_id = b.item_id
GROUP by a.item_id
ORDER by a.post_date DESC, b.total_yes DESC
But that doesn't work.
The result I would like to see is:
<item><answer><yes votes>
Buick | Fastest | 2 yes votes
Mercedes | Shiny | 32 yes votes
Honda | Quick | 39 yes votes
Any help is appreciated!
SELECT a.*, b.*
FROM Items a
LEFT JOIN Votes b on a.item_id = b.item_id
and b.total_yes = (select max(total_yes)
from Votes v
where v.item_id = a.item_id)
ORDER BY a.post_date DESC, b.total_yes DESC
N.B.: if you have for an item 2 answers with the same total_yes = max, you will have 2 rows for that item.
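The tie behaviour mentioned in the note can be seen in a short sketch. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL, and the items, answers and vote totals are invented; the secondary ORDER BY on votes_id is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (item_id INTEGER PRIMARY KEY, name TEXT, post_date TEXT);
    CREATE TABLE Votes (votes_id INTEGER PRIMARY KEY, item_id INTEGER, answer TEXT,
                        total_yes INTEGER, total_no INTEGER);
    INSERT INTO Items VALUES
        (1, 'Buick', '2013-03-03'), (2, 'Honda', '2013-03-01'), (3, 'Volvo', '2013-02-01');
    INSERT INTO Votes (item_id, answer, total_yes, total_no) VALUES
        (1, 'Fastest', 2, 0), (1, 'Loud', 1, 4),
        (2, 'Quick', 39, 1), (2, 'Cheap', 39, 5);   -- tie on total_yes
""")

top_answers = conn.execute("""
    SELECT a.name, b.answer, b.total_yes
    FROM Items a
    LEFT JOIN Votes b ON a.item_id = b.item_id
        AND b.total_yes = (SELECT MAX(total_yes)
                           FROM Votes v
                           WHERE v.item_id = a.item_id)
    ORDER BY a.post_date DESC, b.votes_id
""").fetchall()
for row in top_answers:
    print(row)
# ('Buick', 'Fastest', 2)
# ('Honda', 'Quick', 39)
# ('Honda', 'Cheap', 39)
# ('Volvo', None, None)
```

Honda appears twice because two answers share the maximum, and Volvo, which has no votes at all, is still listed thanks to the LEFT JOIN.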
Add LIMIT 1 to the end of your query :)
That will return only one record, but at the moment you're ordering by the date first, so you'll get the highest vote for the last date voted on. Is that what you want?
If you want the highest total vote regardless, you'll need to order by that first.