Counting Occurrences of Day Names By Distinct Weeks - mysql

Apologies in advance if this question is badly worded!
I have a MySQL table which has a datetime field. How would one count the number of occurrences of each day name, for distinct weeks?
Here is the query I'm trying, and the result it gives from just over 2 weeks of data. It is (obviously) counting every occurrence for each record, whereas I want the output to be either 2 or 3.
SELECT dayname(datetime), COUNT(dayofweek(datetime))
FROM mytable GROUP BY dayofweek(datetime);
+-------------------+----------------------------+
| dayname(datetime) | count(dayofweek(datetime)) |
+-------------------+----------------------------+
| Monday | 404 |
| Tuesday | 275 |
| Wednesday | 251 |
| Thursday | 196 |
| Friday | 201 |
| Saturday | 128 |
+-------------------+----------------------------+
Grouping by week did not solve the problem. I feel as though I need to "count where week is distinct" but I'm not sure if this is possible.
Any guidance is much appreciated, thank you!

Count the distinct dates (date part only) of your datetime field. COUNT(DISTINCT expr,[expr...]) counts each unique value once, and DATE(expr) returns only the date part of a datetime value.
SELECT dayname(datetime), COUNT(distinct date(datetime))
FROM mytable GROUP BY dayname(datetime);

You normally group on what you are not counting.
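To see the difference between counting rows and counting distinct dates, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here: strftime('%w', ...) (weekday number, 0 = Sunday) replaces DAYNAME(), and the sample rows and the function name are made up for illustration.

```python
import sqlite3


def weekday_occurrences(rows):
    """Return (weekday_number, row_count, distinct_dates) per weekday."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE mytable ("datetime" TEXT)')
    conn.executemany("INSERT INTO mytable VALUES (?)", [(r,) for r in rows])
    # COUNT(*) counts every record; COUNT(DISTINCT date(...)) counts each
    # calendar date once, which is the "2 or 3" the question asks for.
    query = """
        SELECT strftime('%w', "datetime") AS dow,
               COUNT(*) AS row_count,
               COUNT(DISTINCT date("datetime")) AS distinct_dates
        FROM mytable
        GROUP BY dow
        ORDER BY dow
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical sample: three Monday rows on two dates, one Tuesday row.
sample = [
    "2020-11-02 09:00:00",  # Monday
    "2020-11-02 17:30:00",  # Monday, same date
    "2020-11-09 08:15:00",  # Monday, following week
    "2020-11-03 12:00:00",  # Tuesday
]
print(weekday_occurrences(sample))
```

With this sample, Monday has three rows but only two distinct dates, so the distinct count is the number of Mondays that actually occur in the data.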

Self Join? Were Staff Who Worked the Previous Week Active 3 Weeks ago - MYSQL

I'm trying to add a column to a production hours dataset that will tell if a provider who worked last week was also working three weeks earlier. The current dataset looks something like this:
RowID | ProviderID | ClientID | DOS | DOS (Week) | Hours
1 | 1111111111 | 22222222 | 11/2/2020 | 11/1/2020 | 2.5
2 | 1111111111 | 33333333 | 11/5/2020 | 11/1/2020 | 1
3 | 1111111111 | 44444444 | 10/13/2020 | 10/11/2020 | 3
I'm trying to get an extra column 'Active 3 Weeks Prior' with y/n or 1/0 for values. For the above table, let's assume the provider started on 10/13/20. The new column would ideally populate like this:
RowID | ProviderID | ClientID | DOS | DOS (Week) | Hours | Active 3 weeks Prior
1 | 1111111111 | 22222222 | 11/2/2020 | 11/1/2020 | 2.5 | Yes
2 | 1111111111 | 33333333 | 11/5/2020 | 11/1/2020 | 1 | Yes
3 | 1111111111 | 44444444 | 10/13/2020 | 10/11/2020 | 3 | No
A couple extra tidbits: our org uses Sunday as the start of the week so DOS (Week) is the Sunday prior to the date of service. From what I've been reading so far, it seems like the solution here is some kind of self join, where the base production records are aggregated into weekly hours and compared with that same providerID's records for DOS (Week) - 21.
The trouble I'm having is: whether I'm on the right track in the first place with the self-join and how I would generate the y/n values based on the success or failure to find a matching value. Also, I suspect that joining based on a concatenate of ProviderID and DOS(Week) might be flawed? This is what I've been playing with so far.
Please let me know if I can clarify the question at all or am missing something very obvious. I truly appreciate any help, as I've been trying to figure out the right search terms to get a clue on the answer for a few days now.
If you are running MySQL 8.0, you can use window functions and a range specification:
select t.*,
(
max(providerid) over(
partition by providerid
order by dos
range between interval 3 week preceding and interval 3 week preceding
) is not null
) as active_3_weeks_before
from mytable t
It is not really clear from your explanation and data what you mean by "was also working three weeks earlier". For each row, the query checks whether another row exists with the same provider and a dos that is exactly 3 weeks before the dos of the current row. This can easily be adapted to other requirements.
Edit: if you want to check for any record within the last 3 weeks, you would change the window range to:
range between interval 3 week preceding and interval 1 day preceding
And if you want this in MySQL < 8.0, where window functions are not available, then you would use a correlated subquery:
select t.*,
exists (
select 1
from mytable t1
where
t1.providerid = t.providerid
and t1.dos >= t.dos - interval 3 week
and t1.dos < t.dos
) as active_3_weeks_before
from mytable t
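As a rough check of the correlated-subquery version, the sketch below runs it against the three sample rows from the question using SQLite from Python. It assumes dates are stored as ISO strings, uses SQLite's date(x, '-21 days') in place of MySQL's interval arithmetic, and adds a hypothetical row_id column; none of that comes from the original schema.

```python
import sqlite3


def active_three_weeks_before(rows):
    """Return (row_id, flag); flag is 1 if the same provider has an
    earlier row within the 21 days before this row's dos."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (row_id INTEGER, providerid TEXT, dos TEXT)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    query = """
        SELECT t.row_id,
               EXISTS (
                   SELECT 1
                   FROM mytable t1
                   WHERE t1.providerid = t.providerid
                     AND t1.dos >= date(t.dos, '-21 days')
                     AND t1.dos < t.dos
               ) AS active_3_weeks_before
        FROM mytable t
        ORDER BY t.row_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# The three rows from the question, with dates rewritten in ISO format.
sample = [
    (1, "1111111111", "2020-11-02"),
    (2, "1111111111", "2020-11-05"),
    (3, "1111111111", "2020-10-13"),
]
print(active_three_weeks_before(sample))
```

Note that row 2 is flagged because of the 2020-11-02 row in the same week, not because of the October row. If "active" has to mean an earlier week, the upper bound of the range would need tightening.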

How can I select the last timestamp reading per day for multiple user id's and multiple days?

I have a database that contains user id, calories burned (value), and the timestamp at which those calories burned were recorded (reading_date). An individual could have multiple calorie readings for the same day, but I'm only interested in the last reading since it's a total of all the previous readings for that day.
IN:
SELECT
DISTINCT ON (date, user_contents.content_id)
date_trunc('day',reading_date + time '05:00') date,
user_id,
created_at,
value
FROM data
OUT:
date | user_id | created_at | value
2019-01-13 00:00:00 | 138 | 2019-01-18 06:07:52 | 81.0
2019-01-15 00:00:00 | 137 | 2019-01-15 15:43:25 | 87.0
2019-01-16T00:00:00 | 137 | 2019-01-18 04:22:11 | 143.0
2019-01-16T00:00:00 | 137 | 2019-01-18 06:12:11 | 230.0
additional values omitted
I want to be able to select the maximum reading value for each day per person. I've tried using DISTINCT statements such as:
SELECT
DISTINCT ON (date, user_contents.content_id)
date_trunc('day',reading_date + time '05:00') date,
Sometimes that results in an error message:
SELECT DISTINCT ON expressions must match initial ORDER BY expressions
Sometimes it filters out some results, but isn't always giving me the last reading of the day or only one result per person per day.
My optimal end result would look like this (the third record having been removed):
date | user_id | created_at | value
2019-01-13 00:00:00 | 138 | 2019-01-18 06:07:52 | 81.0
2019-01-15 00:00:00 | 137 | 2019-01-15 15:43:25 | 87.0
2019-01-16T00:00:00 | 137 | 2019-01-18 06:12:11 | 230.0
additional values omitted
Ultimately, I'm going to use this data to sum up the value column and determine the total number of calories burned by everyone in the dataset over a time period.
You appear to be using Postgres.
Follow the instructions in the error message. You want something like this:
SELECT DISTINCT ON (user_id, reading_date::date)
date_trunc('day',reading_date + time '05:00') date,
user_id, created_at,value
FROM data
ORDER BY user_id, reading_date::date DESC, reading_date DESC

MySQL: A table having start_date and end_date: How to select one row for each day of a record?

One of my tables has a start_date and an end_date, both of type DATE.
Usually they are equal, but in some cases the end_date is later than the start_date.
What I would like to achieve if possible is a SELECT which returns a ROW for each day.
So e.g. if start_date is 2018-06-28 and end date is 2018-06-30, the SELECT should return 3 rows for this record. My favourite way would be to change the start_date, like:
+-----------+------------+------------+
| id | start_date | end_date |
+-----------+------------+------------+
| 45 | 2018-06-28 | 2018-06-30 |
| 45 | 2018-06-29 | 2018-06-30 |
| 45 | 2018-06-30 | 2018-06-30 |
+-----------+------------+------------+
Could you give me a push in the right direction, if this is possible? Searching for this didn't bring up anything useful.
Thanks a lot
Philipp
The simplest way would be to add a very basic "dates" table with every date, then join to it on dates.date BETWEEN start_date AND end_date.
Depending on your MySQL version (recursive common table expressions are available from 8.0), a more complicated recursive CTE query could also generate the dates without an extra table.
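A minimal sketch of the recursive-CTE idea, run against SQLite from Python so it can be tried without a MySQL server. The table name bookings and the second sample row are hypothetical, and date(day, '+1 day') is SQLite syntax; in MySQL 8.0 the step would presumably be written as day + INTERVAL 1 DAY.

```python
import sqlite3


def expand_date_ranges(rows):
    """Return one (id, day, end_date) row per day in each range."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bookings (id INTEGER, start_date TEXT, end_date TEXT)"
    )
    conn.executemany("INSERT INTO bookings VALUES (?, ?, ?)", rows)
    # Anchor member: each record's first day.
    # Recursive member: add one day until end_date is reached.
    query = """
        WITH RECURSIVE days(id, day, end_date) AS (
            SELECT id, start_date, end_date FROM bookings
            UNION ALL
            SELECT id, date(day, '+1 day'), end_date
            FROM days
            WHERE day < end_date
        )
        SELECT id, day, end_date FROM days ORDER BY id, day
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


sample = [
    (45, "2018-06-28", "2018-06-30"),  # the range from the question
    (46, "2018-07-01", "2018-07-01"),  # hypothetical single-day record
]
for row in expand_date_ranges(sample):
    print(row)
```

Record 45 comes back as three rows with the first date advancing one day at a time, which matches the layout requested in the question; the single-day record stays as one row.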

Use Max date to create a date range

I need to create a date range in a table that houses transaction information. The table updates sporadically throughout the week from a manual process. Each time the table is updated transactions are added up to the previous Sunday. For instance, the upload took place yesterday and so transactions were loaded through last Sunday (Feb 26th). If it had been loaded on Wednesday it would still be dated for Sunday. The point is that I have a moving target with my transactions and also when the data is loaded to the table. I am trying to fix my look back period to the date of the latest transaction then go three weeks back. Here is the query that I came up with:
SELECT distinct TransactionDate
FROM TransactionTABLE TB
inner join (
SELECT distinct top 21 TransactionDate FROM TransactionTABLE ORDER BY TransactionDate desc
) A on TB.TransactionDate = A.TransactionDate
ORDER BY TB.TransactionDate desc
Technically this code works. The problem that I am running into now is when there were no transactions on a given date, such as bank holidays (in this case Martin Luther King Day), then the query looks back one day too far.
I have tried a few different options including MAX(TransactionDate) but if I use that in a sub-query or CTE then use the new value in a WHERE statement as a reference I only get the max value or the value I subtract that statement by. For instance if I say WHERE TransactionDate >= MAX(TransactionDate)-21 and the max date is Feb 26th then the result is Feb 2nd instead of the range of dates from Feb 2nd through Feb 26th.
IN SUMMARY, what I need is a date range looking three weeks back from the date of the latest transaction date. This is for a daily report so I cannot hardcode the date in. Since I am also using Excel Connections the use of Declare statements is prohibited.
Thank you StackOverflow gurus in advance!
You could use something like this:
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
select top (21)
[Date]=convert(date,dateadd(day, row_number() over (order by (select 1))-1
, dateadd(day,-20,(select max(TransactionDate) from t) ) ) )
from n as deka
cross join n as hecto
order by [Date]
)
select Date=convert(varchar(10),dates.date,120) from dates
rextester demo: http://rextester.com/ZFYV25543
returns:
+------------+
| Date |
+------------+
| 2017-02-06 |
| 2017-02-07 |
| 2017-02-08 |
| 2017-02-09 |
| 2017-02-10 |
| 2017-02-11 |
| 2017-02-12 |
| 2017-02-13 |
| 2017-02-14 |
| 2017-02-15 |
| 2017-02-16 |
| 2017-02-17 |
| 2017-02-18 |
| 2017-02-19 |
| 2017-02-20 |
| 2017-02-21 |
| 2017-02-22 |
| 2017-02-23 |
| 2017-02-24 |
| 2017-02-25 |
| 2017-02-26 |
+------------+
I just found this for looking up dates that fall within a given week. The code can be manipulated to change the week start date.
select convert(datetime,dateadd(dd,-datepart(dw,convert(datetime,convert(varchar(10),DateAdd(dd,-1/*this # changes the week start day*/,getdate()),101)))+1/*this # is used to change the week start date*/,
convert(datetime,convert(varchar(10),getdate(),21))))/*also can enter # here to change the week start date*/
I've included a screenshot of the results if you were to include this with a full query. This way you can see how it looks with a range of dates. I did a little manipulation so that the week starts on Monday and references Monday's date.
Since I am only looking back three weeks a simple GETDATE()-21 is sufficient because as the query moves forward through the week it will look back 21 days and pick the Monday at the beginning of the week as my start date.
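For comparison, the look-back from the latest transaction date can also be written as a plain filter against a scalar subquery, which avoids counting rows and so is not thrown off by days without transactions. The sketch below is an assumption-laden illustration in SQLite via Python, not the T-SQL from this thread: dates are ISO strings, the sample dates are invented, and date(x, '-20 days') stands in for what would presumably be DATEADD(day, -20, ...) in SQL Server.

```python
import sqlite3


def last_three_weeks(dates):
    """Return distinct transaction dates within the 21 calendar days
    ending at the latest transaction date, newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TransactionTABLE (TransactionDate TEXT)")
    conn.executemany(
        "INSERT INTO TransactionTABLE VALUES (?)", [(d,) for d in dates]
    )
    # The subquery yields a single cutoff date; the outer WHERE compares
    # every row against it, so no aggregate appears in the WHERE itself.
    query = """
        SELECT DISTINCT TransactionDate
        FROM TransactionTABLE
        WHERE TransactionDate >= (
            SELECT date(MAX(TransactionDate), '-20 days')
            FROM TransactionTABLE
        )
        ORDER BY TransactionDate DESC
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


# Hypothetical data with gaps (e.g. holidays) and one duplicate date.
sample = ["2017-02-26", "2017-02-24", "2017-02-24",
          "2017-02-06", "2017-02-05", "2017-01-16"]
print(last_three_weeks(sample))
```

The cutoff is the latest date minus 20 days, so the window covers 21 calendar days including the latest date, the same span as the generated calendar earlier in this thread.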

MySQL how to present day results (starting value, total change and day-end value) from table

I have this table (have a look on SQLFiddle)
In previous steps the record number has been determined and the values for "PrevVal" and "NewVal" have been calculated.
The record's end value ("NewVal"), becomes the next record's starting value ("PrevVal")
I would like to condense the table in such a way that there is only one record per day, containing:
the date starting value "StartOfDay",
the total change during the day "TotalChange" and
the resulting day-end value "EndOfDay"
The desired result can be seen in the demo table "ChangesPerDayCondensed"
Can anyone help me solve this? (A stored procedure is OK.)
Thnx
I am a little confused why the record numbers run in the opposite direction to the dates. Nevertheless, you could solve this by evaluating the starting value and the sum of mutations separately, then adding them together to get the ending value.
The results are ordered by date descending, because the record number again needs to be lower for a later date.
insert into ChangesPerDayCondensed
select @recrd:=@recrd+1, a.MyDate, b.PrevVal, a.Mutation, b.PrevVal+a.Mutation
from
(select MyDate, sum(Mutation) as Mutation from MutationsPerDay group by MyDate) a,
(select b.MyDate, b.PrevVal from (select MyDate, max(RecNo) as RecNo from MutationsPerDay group by MyDate) a, MutationsPerDay b where a.RecNo = b.RecNo) b,
(select @recrd:=0) c
where a.MyDate = b.MyDate order by MyDate desc;
I'd do it this way:
First create a lookup for each day (find the first and last RecNo), then join it twice to the MutationsPerDay table and calculate the changes:
SELECT first_.MyDate,
first_.PrevVal AS StartOfDay,
last_.NewVal AS EndOfDay,
(last_.NewVal - first_.PrevVal) AS TotalChange
FROM
(SELECT mpd1.MyDate,
max(mpd1.RecNo) AS first_rec_no,
min(mpd1.RecNo) AS last_rec_no
FROM MutationsPerDay mpd1
GROUP BY MyDate) AS lo
JOIN MutationsPerDay AS first_ ON lo.first_rec_no = first_.RecNo
JOIN MutationsPerDay AS last_ ON lo.last_rec_no = last_.RecNo
Explanation:
What you actually want is:
For every day the first and the last value (and the difference).
So what you need to find first is for every date the id of the first and the last value:
SELECT mpd1.MyDate,
max(mpd1.RecNo) AS first_rec_no,
min(mpd1.RecNo) AS last_rec_no
FROM MutationsPerDay mpd1
GROUP BY MyDate
----------------------------------------------------
| MyDate | first_rec_no | last_rec_no |
----------------------------------------------------
| 2016-12-05 00:00:00 | 16 | 13 |
| 2016-12-07 00:00:00 | 12 | 12 |
| 2016-12-12 00:00:00 | 11 | 8 |
| 2016-12-14 00:00:00 | 7 | 7 |
| 2016-12-20 00:00:00 | 6 | 6 |
| 2016-12-21 00:00:00 | 5 | 4 |
| 2016-12-28 00:00:00 | 3 | 3 |
| 2016-12-29 00:00:00 | 2 | 2 |
| 2016-12-30 00:00:00 | 1 | 1 |
----------------------------------------------------
Then you can use these first and last ids to find the corresponding values in the source table. For example, for 2016-12-21 you'd get the rows with the ids first: 5 and last: 4.
The PrevVal in record no 5 is the first value seen on that day, and the NewVal in record no 4 is the last value seen on that day. Subtracting the first from the last gives the change for the day.
I hope this clarifies the methodology a bit.
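The lookup-and-join approach can be tried on a few rows with the sketch below, which runs essentially the same query in SQLite from Python. The four sample records are invented (they are not the SQLFiddle data), and an ORDER BY is added so the output order is predictable; as in the original table, higher record numbers belong to earlier entries.

```python
import sqlite3


def condense_per_day(rows):
    """Return (MyDate, StartOfDay, EndOfDay, TotalChange) per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MutationsPerDay "
        "(RecNo INTEGER, MyDate TEXT, PrevVal INTEGER, NewVal INTEGER)"
    )
    conn.executemany("INSERT INTO MutationsPerDay VALUES (?, ?, ?, ?)", rows)
    # RecNo decreases over time, so the day's first record has the
    # highest RecNo and the day's last record has the lowest.
    query = """
        SELECT first_.MyDate,
               first_.PrevVal AS StartOfDay,
               last_.NewVal AS EndOfDay,
               (last_.NewVal - first_.PrevVal) AS TotalChange
        FROM (SELECT MyDate,
                     MAX(RecNo) AS first_rec_no,
                     MIN(RecNo) AS last_rec_no
              FROM MutationsPerDay
              GROUP BY MyDate) AS lo
        JOIN MutationsPerDay AS first_ ON lo.first_rec_no = first_.RecNo
        JOIN MutationsPerDay AS last_ ON lo.last_rec_no = last_.RecNo
        ORDER BY first_.MyDate
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


# Hypothetical records: (RecNo, MyDate, PrevVal, NewVal).
sample = [
    (4, "2016-12-20", 100, 110),
    (3, "2016-12-21", 110, 105),
    (2, "2016-12-21", 105, 120),
    (1, "2016-12-28", 120, 90),
]
for row in condense_per_day(sample):
    print(row)
```

For 2016-12-21 the two records collapse into a single row that starts at 110 and ends at 120, so the intermediate value 105 no longer appears.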