DISTINCT with GROUP BY in MySQL - mysql

If I'm selecting distinct values, but then I group the results, are the values only distinct within each grouping, or across all groupings?
For example:
month | id
----------
01 | 17
01 | 17
01 | 19
04 | 17
04 | 20
If I run
select month, count(distinct id)
from table
group by month
What counts will I get for the two months?

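You would get 2 for each month. COUNT(DISTINCT id) is evaluated separately within each group: month 01 contains ids 17 and 19, and month 04 contains ids 17 and 20, so id 17 is counted once in each month.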
Here is a SQL Fiddle showing the results.
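The per-group behaviour can be checked with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs without a server); the table name visits is hypothetical, since the question's table is unnamed.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for portability).
conn = sqlite3.connect(":memory:")
# 'visits' is a hypothetical table name; the columns mirror the question.
conn.execute("CREATE TABLE visits (month TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("01", 17), ("01", 17), ("01", 19), ("04", 17), ("04", 20)],
)

# DISTINCT is applied inside each month group.
per_month = conn.execute(
    "SELECT month, COUNT(DISTINCT id) FROM visits GROUP BY month ORDER BY month"
).fetchall()

# Without GROUP BY, DISTINCT is applied across the whole table.
overall = conn.execute("SELECT COUNT(DISTINCT id) FROM visits").fetchone()[0]

print(per_month)  # [('01', 2), ('04', 2)]
print(overall)    # 3
```

The two per-month counts add up to 4 while the overall count is 3, because id 17 appears in both months.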

Related

MYSQL - Select query generating a column for each month in date range

I have a request for a report that must return the total qty of items shipped, grouped by item, within a date range. No big deal. BUT they want the report to display one column for each month in the range.
Example: Customer specified start date = 2017-11-15 and end date = 2018-03-14
There is a bunch of columns prior to the month columns, but for the sake of simplicity, I'll use only ITEMCODE in the example. The result should look like the following:
ITEMCODE | NOV 17 | DEC 17 | JAN 18 | FEB 18 | MAR 18
SHIRT123 | 10 | 25 | 33 | 20 | 7
PANTS123 | 5 | 20 | 20 | 18 | 6
I don't even know if it's possible to do this. Note, under each month is the sum of the QTY column.

Count unique values from duplicates

I have following data on the table.
Uid | comm | status
-------------------
12 23 eve
15 23 eve
20 23 mon
12 23 mon
20 23 eve
17 23 mon
How do I query to get the result below, avoiding duplicates, so that if a uid is counted for "eve" and the same uid also appears under "mon", it is counted only for "eve"?
count | status
-------------------
3 eve
1 mon
Thanks for the help!
You can use the following query to pick each Uid value once. MIN(status) returns 'eve' whenever a Uid has both statuses, because 'eve' sorts before 'mon':
SELECT Uid, MIN(status)
FROM mytable
GROUP BY Uid
Output:
Uid MIN(status)
---------------
12 eve
15 eve
17 mon
20 eve
Using the above query as a derived table, you can get the desired result like this:
SELECT status, count(*)
from (
SELECT Uid, MIN(status) AS status
FROM mytable
GROUP BY Uid ) AS t
GROUP BY status
Demo here
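As a quick check of the two-level GROUP BY, here is a minimal sketch that loads the question's rows and runs the same query. It uses Python's built-in sqlite3 module instead of MySQL (an assumption, chosen only so the example is self-contained); the ORDER BY is added to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (Uid INTEGER, comm INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (12, 23, "eve"), (15, 23, "eve"), (20, 23, "mon"),
        (12, 23, "mon"), (20, 23, "eve"), (17, 23, "mon"),
    ],
)

# Inner query: one row per Uid, preferring 'eve' because it sorts before 'mon'.
# Outer query: count how many Uids ended up with each status.
counts = conn.execute(
    """
    SELECT status, COUNT(*)
    FROM (SELECT Uid, MIN(status) AS status
          FROM mytable
          GROUP BY Uid) AS t
    GROUP BY status
    ORDER BY status
    """
).fetchall()

print(counts)  # [('eve', 3), ('mon', 1)]
```

Note that the MIN(status) trick relies on the preferred status sorting first alphabetically; with other status names, an explicit ranking (for example a CASE expression) would be needed.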

Mysql: find active users who logged in once a week

I have a table users and another table logins. Every time a user logs in to the website, we record a row in logins, e.g.
Users
-----
14 | name1
17 | name2
20 | name3
21 | name4
25 | name5
logins
----
14 | 2015-03-01
14 | 2015-03-07
14 | 2015-03-16
14 | 2015-03-24
14 | 2015-03-30
17 | 2015-03-01
17 | 2015-03-07
17 | 2015-03-16
17 | 2015-03-17
17 | 2015-03-30
20 | 2015-03-01
20 | 2015-03-07
20 | 2015-03-08
20 | 2015-03-16
20 | 2015-03-25
20 | 2015-03-30
If the start date is 2015-03-01 and the end date is 2015-04-01, then 14 and 20 should be selected, while 17 won't be selected, since he didn't log in during the week of 03-22 to 03-28. So the result would be
Result
------
2
First you get the list of users who logged in at least once in each week, then you count the number of users per week:
SELECT LoginYear,LoginWeek,COUNT(*) as NumbUsers
FROM (
SELECT Year(logins.date) as LoginYear, Week(logins.date) as LoginWeek, logins.UserID
FROM logins
WHERE logins.date>='2015-03-01'
GROUP BY LoginYear, LoginWeek, logins.UserID
HAVING COUNT(*)>0
) t
GROUP BY LoginYear,LoginWeek;
Week numbering: MySQL can count the weeks in different ways (such as starting on a Sunday/Monday) using the mode: WEEK(date,mode). See the WEEK MySQL documentation.
Update: to get the number of users who logged in at least once every week, first we get the users who logged in at least once per week in the subquery weektable. Then we select the users whose week count equals the total number of weeks in that period (meaning they were online every week). Finally we count those users.
SELECT COUNT(*)
FROM (
SELECT UserID
FROM (
SELECT Year(logins.date) as LoginYear, Week(logins.date) as LoginWeek, logins.UserID
FROM logins
WHERE logins.date>='2015-03-01'
GROUP BY LoginYear, LoginWeek, logins.UserID
HAVING COUNT(*)>0
) weektable
GROUP BY UserID
HAVING COUNT(*)>=TIMESTAMPDIFF(WEEK,'2015-03-01',NOW())
) subq;
Note 1: I put the date '2015-03-01' as an example but you can change this or put as a variable.
Note 2: depending on the dates you choose it can be that the week count by TIMESTAMPDIFF is less than the maximum number of weeks (counted by COUNT(*)), since it does not count half weeks. Therefore I put >= in the last line: HAVING COUNT(*)>=TIMESTAMPDIFF(WEEK,'2015-03-01',NOW()).
I cannot test it here at the moment, but something like
SELECT COUNT(Users.id) FROM Users JOIN logins ON logins.UserID = Users.id WHERE logins.date>=XXXX AND logins.date<=XXXX GROUP BY Users.id
should work.

Group and sum data based on a day of the month

I have a recurring payment day on the 14th of each month and want to group a subset of data by month/year and sum the Sent column. For example, for the given data:
Table `Counter`
Id Date Sent
1 10/04/2013 2
2 11/04/2013 4
3 15/04/2013 7
4 10/05/2013 3
5 14/05/2013 5
6 15/05/2013 3
7 16/05/2013 4
The output I want is something like:
From Count
14/03/2013 6
14/04/2013 10
14/05/2013 12
I am not worried about how the From column is formatted, or whether it's easier to split it into month/year, as I can recreate a date from multiple columns in the GUI. So the output could just as easily be:
FromMth FromYr Count
03 2013 6
04 2013 10
05 2013 12
or even
toMth toYr Count
04 2013 6
05 2013 10
06 2013 12
If the payment date is for example the 31st then the date comparison would need to be the last date of each month. I am also not worried about missing months in the result-set.
I will also turn this into a stored procedure so that I can pass in the payment date and other filter criteria. It is also worth mentioning that the data can span multiple years.
Try this query
select
if(day(STR_TO_DATE(date, "%Y-%d-%m")) >= 14,
concat('14/', month(STR_TO_DATE(date, "%Y-%d-%m")), '/', year(STR_TO_DATE(date, "%Y-%d-%m"))) ,
concat('14/', if ((month(STR_TO_DATE(date, "%Y-%d-%m")) - 1) = 0,
concat('12/', year(STR_TO_DATE(date, "%Y-%d-%m")) - 1),
concat(month(STR_TO_DATE(date, "%Y-%d-%m"))-1,'/',year(STR_TO_DATE(date, "%Y-%d-%m")))
)
)
) as fromDate,
sum(sent)
from tbl
group by fromDate
FIDDLE
| FROMDATE | SUM(SENT) |
--------------------------
| 14/10/2013 | 3 |
| 14/12/2012 | 1 |
| 14/3/2013 | 6 |
| 14/4/2013 | 10 |
| 14/5/2013 | 12 |
| 14/9/2013 | 1 |
The pay date could be grouped by month and year separately:
select Sum(Sent) as "Count",
Extract(Month from Date - 13) as FromMth,
Extract(Year from Date - 13) as FromYr
from Counter
group by Extract(Year from Date - 13),
Extract(Month from Date - 13)
Be careful, since the field name "Date" coincides with the keyword "date" in ANSI SQL.
I think the simplest way to do what you want is to shift each date back by 13 days, so that the 14th lands on the 1st of its own month, and group by the resulting month:
select date_format(date - interval 13 day, '%Y-%m'), sum(sent)
from counter
group by date_format(date - interval 13 day, '%Y-%m')
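The shift-and-group idea can be checked against the question's sample data. The sketch below is an illustration under a few assumptions: it uses Python's built-in sqlite3 module rather than MySQL, stores the dates as ISO strings (YYYY-MM-DD), and uses SQLite's date() and strftime() where MySQL would use interval arithmetic and DATE_FORMAT. The variable pay_day is a hypothetical parameter. With a payment day of 14, shifting by pay_day - 1 = 13 days moves the 14th onto the 1st of the same month, so each period falls inside one calendar month.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Counter (Id INTEGER, "Date" TEXT, Sent INTEGER)')
conn.executemany(
    "INSERT INTO Counter VALUES (?, ?, ?)",
    [
        (1, "2013-04-10", 2), (2, "2013-04-11", 4), (3, "2013-04-15", 7),
        (4, "2013-05-10", 3), (5, "2013-05-14", 5), (6, "2013-05-15", 3),
        (7, "2013-05-16", 4),
    ],
)

pay_day = 14                      # hypothetical parameter: day of month the period starts
shift = f"-{pay_day - 1} days"    # SQLite date modifier, here '-13 days'

# Shift every date back so that the payment day becomes the 1st,
# then group by the year-month of the shifted date.
totals = conn.execute(
    """
    SELECT strftime('%Y-%m', date("Date", ?)) AS period, SUM(Sent)
    FROM Counter
    GROUP BY period
    ORDER BY period
    """,
    (shift,),
).fetchall()

print(totals)  # [('2013-03', 6), ('2013-04', 10), ('2013-05', 12)]
```

Each period label is the month in which the period starts, matching the FromMth/FromYr layout in the question. A fixed shift like this does not cover the end-of-month case mentioned in the question (a payment day of 31 in shorter months); that would need separate handling.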

Select highest 3 scores in each day for every user

I have a MYSQL table like this:
id | userid | score | datestamp |
-----------------------------------------------------
1 | 1 | 5 | 2012-12-06 03:55:16
2 | 2 | 0,5 | 2012-12-06 04:25:21
3 | 1 | 7 | 2012-12-06 04:35:33
4 | 3 | 12 | 2012-12-06 04:55:45
5 | 2 | 22 | 2012-12-06 05:25:11
6 | 1 | 16,5 | 2012-12-06 05:55:21
7 | 1 | 19 | 2012-12-06 13:55:16
8 | 2 | 8,5 | 2012-12-07 06:27:16
9 | 2 | 7,5 | 2012-12-07 08:33:16
10 | 1 | 10 | 2012-12-07 09:25:19
11 | 1 | 6,5 | 2012-12-07 13:33:16
12 | 3 | 6 | 2012-12-07 15:45:44
13 | 2 | 4 | 2012-12-07 16:05:16
14 | 2 | 34 | 2012-12-07 18:33:55
15 | 2 | 22 | 2012-12-07 18:42:11
I would like to display user scores like this:
If a user has more than 3 scores on a given day, only the highest 3 should count. Repeat that for every day for this user, then add all days together. I want to display this sum for every user.
EDIT:
So in the example above for user 1 on 06.12. I would add top 3 scores together and ignore 4th score, then add to that number top 3 from the next day and so on. I need that number for every user.
EDIT 2:
Expected output is:
userid | score
--------------------
1 | 59 //19 + 16.5 + 7 (06.12.) + 10 + 6.5 (07.12.)
2 | 87 //22 + 0.5 (06.12.) + 34 + 22 + 8.5 (07.12.)
3 | 18 //12 (06.12.) + 6 (07.12.)
I hope this is more clear :)
I would really appreciate the help because I am stuck.
Please take a look at the following code, if your answer to my comment is yes :) Since your data is all from December 2012, I grouped by day.
SQLFIDDLE sample
Query:
select y.id, y.userid, y.score, y.datestamp
from (select id, userid, score, datestamp
from scores
group by day(datestamp)) as y
where (select count(*)
from (select id, userid, score, datestamp
from scores group by day(datestamp)) as x
where y.score >= x.score
and y.userid = x.userid
) =1 -- Top 3rd, 2nd, 1st
order by y.score desc
;
Results:
ID USERID SCORE DATESTAMP
8 2 8.5 December, 07 2012 00:00:00+0000
20 3 6 December, 08 2012 00:00:00+0000
1 1 5 December, 06 2012 00:00:00+0000
Based on your latter updates to question.
If you need a sum per user by year/month/day and then want to find the highest, you can simply add an aggregate function such as sum to the above query. To repeat myself: since your sample data covers just one month of one year, there's no point in grouping by year or month. That's why I took day.
select y.id, y.userid, y.score, y.datestamp
from (select id, userid, sum(score) as score,
datestamp
from scores
group by userid, day(datestamp)) as y
where (select count(*)
from (select id, userid, sum(score) as score
, datestamp
from scores
group by userid, day(datestamp)) as x
where y.score >= x.score
and y.userid = x.userid
) =1 -- Top 3rd, 2nd, 1st
order by y.score desc
;
Results based on sum:
ID USERID SCORE DATESTAMP
1 1 47.5 December, 06 2012 00:00:00+0000
8 2 16 December, 07 2012 00:00:00+0000
20 3 6 December, 08 2012 00:00:00+0000
UPDATED WITH NEW SOURCE DATA SAMPLE
Simon, please take a look at my own sample. As your data kept changing, I used mine.
Here is the reference. I have used plain ANSI-style SQL, without any OVER (PARTITION BY ...) or DENSE_RANK.
Also note that the queries below take the top 2 scores, not the top 3. You can change it accordingly.
Guess what, the answer is 10 times simpler than the first impression your first data gave....
SQLFIDDLE
Query to 1:
-- for top 2 sum by user by each day
SELECT userid, sum(Score), datestamp
FROM scores t1
where 2 >=
(SELECT count(*)
from scores t2
where t1.score <= t2.score
and t1.userid = t2.userid
and day(t1.datestamp) = day(t2.datestamp)
order by t2.score desc)
group by userid, datestamp
;
Results for query 1:
USERID SUM(SCORE) DATESTAMP
1 70 December, 06 2012 00:00:00+0000
1 30 December, 07 2012 00:00:00+0000
2 22 December, 06 2012 00:00:00+0000
2 25 December, 07 2012 00:00:00+0000
3 30 December, 06 2012 00:00:00+0000
3 30 December, 07 2012 00:00:00+0000
Final Query:
-- for all two days top 2 sum by user
SELECT userid, sum(Score)
FROM scores t1
where 2 >=
(SELECT count(*)
from scores t2
where t1.score <= t2.score
and t1.userid = t2.userid
and day(t1.datestamp) = day(t2.datestamp)
order by t2.score desc)
group by userid
;
Final Results:
USERID SUM(SCORE)
1 100
2 47
3 60
Here goes a snapshot of direct calculations of data I used.
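To see the correlated-subquery approach on the data from the question itself, here is a minimal sketch with the cutoff set to 3. It runs on Python's built-in sqlite3 module rather than MySQL (an assumption made for a self-contained example), writes the scores with decimal points, and compares full dates with date() instead of day(), so that the same day number in different months is not mixed up.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (id INTEGER, userid INTEGER, score REAL, datestamp TEXT)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?, ?)",
    [
        (1, 1, 5, "2012-12-06 03:55:16"), (2, 2, 0.5, "2012-12-06 04:25:21"),
        (3, 1, 7, "2012-12-06 04:35:33"), (4, 3, 12, "2012-12-06 04:55:45"),
        (5, 2, 22, "2012-12-06 05:25:11"), (6, 1, 16.5, "2012-12-06 05:55:21"),
        (7, 1, 19, "2012-12-06 13:55:16"), (8, 2, 8.5, "2012-12-07 06:27:16"),
        (9, 2, 7.5, "2012-12-07 08:33:16"), (10, 1, 10, "2012-12-07 09:25:19"),
        (11, 1, 6.5, "2012-12-07 13:33:16"), (12, 3, 6, "2012-12-07 15:45:44"),
        (13, 2, 4, "2012-12-07 16:05:16"), (14, 2, 34, "2012-12-07 18:33:55"),
        (15, 2, 22, "2012-12-07 18:42:11"),
    ],
)

# A row is kept when at most 3 rows of the same user on the same day
# have a score greater than or equal to its own (the row itself included).
totals = conn.execute(
    """
    SELECT userid, SUM(score)
    FROM scores AS t1
    WHERE 3 >= (SELECT COUNT(*)
                FROM scores AS t2
                WHERE t2.userid = t1.userid
                  AND date(t2.datestamp) = date(t1.datestamp)
                  AND t2.score >= t1.score)
    GROUP BY userid
    ORDER BY userid
    """
).fetchall()

print(totals)  # [(1, 59.0), (2, 87.0), (3, 18.0)]
```

This reproduces the expected output from the question (59, 87 and 18). One caveat: tied scores at the cutoff count against each other, so if several rows of one user on one day share the third-highest score, all of them can be dropped. On MySQL 8.0 or later, ROW_NUMBER() OVER (PARTITION BY ...) would be a way to break such ties.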
SELECT
*
FROM
table1
LEFT JOIN
(SELECT * FROM table1 ORDER BY score LIMIT 3) as lr on DATE(lr.datestamp) = DATE(table1.datestamp)
GROUP BY
table1.datestamp