Average of a distinct count by day in Access - ms-access

I'm trying to get an average of a distinct count per day out of an Access 2010 database and nothing I do seems to be working. I've tried using a subquery but I can't figure out how to link it to the main one. The table has a row for every office visit, like this:
patient_id, visit_year, visit_date, dow, doctor_id
12345, 2012, 3/5/12, Monday, 987
12567, 2012, 3/5/12, Monday, 986
12789, 2012, 3/6/12, Tuesday, 987
I need to get the average number of doctors available per day of week where year = 2012. In my head this should work but it doesn't:
Select dow, AVG(COUNT(DISTINCT(doctor_id))) AS AvgDocsInOffice
From visits
WHERE visit_year = 2012
GROUP BY dow
I'm trying to get to this output:
DOW, AvgDocsInOffice
Monday, 5
Tuesday, 6
Wednesday, 4
Any ideas? Unfortunately I'm stuck doing this in Access.

SELECT dow, AVG(CntDocsInOffice) AS AvgDocsInOffice FROM
(SELECT dow, visit_date, COUNT(*) AS CntDocsInOffice FROM
(SELECT DISTINCT dow, visit_date, doctor_id FROM
visits
WHERE visit_year = 2012)
GROUP BY dow, visit_date)
GROUP BY dow;
The innermost SELECT returns the unique doctors per day.
The next SELECT counts the number of unique doctors per day.
The outermost SELECT finally calculates the average doctor count per day of week.
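To see the three levels working on concrete rows, here is a minimal sketch. It runs the same query shape through Python's sqlite3 module, which is only a stand-in for Access so that it can be executed anywhere. The sample rows and the aliases d and c are made up for illustration.

```python
import sqlite3

# SQLite stands in for Access here (an assumption made only so the sketch is runnable).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits "
    "(patient_id INT, visit_year INT, visit_date TEXT, dow TEXT, doctor_id INT)"
)
# Hypothetical rows: two Mondays and one Tuesday in 2012.
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2012, "2012-03-05", "Monday", 987),
        (2, 2012, "2012-03-05", "Monday", 986),
        (3, 2012, "2012-03-05", "Monday", 987),  # same doctor again, counted once
        (4, 2012, "2012-03-12", "Monday", 987),
        (5, 2012, "2012-03-06", "Tuesday", 987),
    ],
)

query = """
SELECT dow, AVG(CntDocsInOffice) AS AvgDocsInOffice
FROM (SELECT dow, visit_date, COUNT(*) AS CntDocsInOffice
      FROM (SELECT DISTINCT dow, visit_date, doctor_id
            FROM visits
            WHERE visit_year = 2012) AS d
      GROUP BY dow, visit_date) AS c
GROUP BY dow
ORDER BY dow
"""
avg_by_dow = dict(conn.execute(query).fetchall())
print(avg_by_dow)
```

With these rows, the first Monday has 2 distinct doctors and the second has 1, so Monday averages 1.5 and Tuesday 1.0.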

How about:
SELECT q.dow,
Avg(q.countofdoctor_id) AS avgofcountofdoctor_id
FROM (SELECT v.visit_date,
v.dow,
COUNT(v.doctor_id) AS countofdoctor_id
FROM (SELECT DISTINCT visit_date,
dow,
doctor_id
FROM visits
WHERE visit_year = 2012) AS v
GROUP BY v.visit_date,
v.dow) AS q
GROUP BY q.dow;

Related

Getting the number of users for this year and last year in SQL

My table is like this:
root_tstamp                      userId
2022-01-26T00:13:24.725+00:00    d2212
2022-01-26T00:13:24.669+00:00    ad323
2022-01-26T00:13:24.629+00:00    adfae
2022-01-26T00:13:24.573+00:00    adfa3
2022-01-26T00:13:24.552+00:00    adfef
...                              ...
2021-01-26T00:12:24.725+00:00    d2212
2021-01-26T00:15:24.669+00:00    daddfe
2021-01-26T00:14:24.629+00:00    adfda
2021-01-26T00:12:24.573+00:00    466eff
2021-01-26T00:12:24.552+00:00    adfafe
I want to get the number of users in the current year and in previous year like below using SQL.
Date Users previous_year
2022-01-01 10 5
2022-01-02 20 15
The code is written as follows.
select CAST(root_tstamp as DATE) as Date,
count(DISTINCT userid) as users,
count(Distinct case when CAST(root_tstamp as DATE) = dateadd(MONTH,-12,CAST(root_tstamp as DATE)) then userid end) as previous_year
FROM table1
But it returns 0 for previous_year values.
How can I fix that?
Possible solution for SQL Server:
WITH cte AS ( SELECT 2022 [year]
UNION ALL
SELECT 2021 )
SELECT cte.[year],
COUNT(DISTINCT test.userId) current_users_amount,
COUNT(DISTINCT CASE WHEN YEAR(test.root_tstamp) < cte.[year]
THEN test.userId
END) previous_users_amount
FROM test
JOIN cte ON YEAR(test.root_tstamp) <= cte.[year]
GROUP BY cte.[year]
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=88b78aad9acd965bdbac4c85a0b81927
This query (for MySQL) returns the number of unique userIds per day where root_tstamp is in the current year, together with the number of unique userIds for the same day last year. If there is no record for a day in the current year, no row is displayed for that day. If there are rows for the current year but none for the same day last year, the last-year column shows 0, because COUNT ignores the NULLs produced by the LEFT JOIN.
SELECT cast(ty.root_tstamp as date) as Dte,
COUNT(DISTINCT ty.userId) as users_this_day,
count(distinct lysd.userid) as users_sameday_lastyear
FROM test ty
left join
test lysd
on cast(lysd.root_tstamp as date)=date_add(cast(ty.root_tstamp as date), interval -1 year)
WHERE YEAR(ty.root_tstamp) = year(current_date())
GROUP BY Dte
If you wish to show output rows for calendar days even if there are no rows in current year and/or last year, then you also need a calendar table to be introduced (let's hope that it is not what you need)
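A rough way to check the self-join idea without a MySQL server is to run the same pattern in SQLite through Python. This is only a sketch under assumptions: SQLite's date(x, '-1 year') replaces MySQL's date_add, the year is hard-coded to 2022 instead of year(current_date()) so the result is repeatable, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (root_tstamp TEXT, userId TEXT)")
# Hypothetical rows: two days in 2022, only one of which has data a year earlier.
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [
        ("2022-01-26 00:13:24", "d2212"),
        ("2022-01-26 00:13:25", "ad323"),
        ("2022-01-27 09:00:00", "adfae"),  # nothing recorded on 2021-01-27
        ("2021-01-26 00:12:24", "d2212"),
        ("2021-01-26 00:15:24", "daddfe"),
        ("2021-01-26 00:14:24", "adfda"),
    ],
)

query = """
SELECT date(ty.root_tstamp) AS dte,
       COUNT(DISTINCT ty.userId) AS users_this_day,
       COUNT(DISTINCT lysd.userId) AS users_sameday_lastyear
FROM test AS ty
LEFT JOIN test AS lysd
  ON date(lysd.root_tstamp) = date(ty.root_tstamp, '-1 year')
WHERE strftime('%Y', ty.root_tstamp) = '2022'  -- fixed year, for repeatability
GROUP BY dte
ORDER BY dte
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In this sketch, 2022-01-26 reports 2 users this year and 3 for the same day last year, while 2022-01-27 reports 1 and 0. A day that exists only in 2021 produces no row at all, which is where a calendar table would come in.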

SQL reporting active users and logins

I have a SQL Table of logins as a data source, and each row has an id, timestamp and user_id.
Similar to this:
id    timestamp                   user_id
1     2022-01-01T15:17:13.000Z    234
2     2022-01-02T15:17:13.000Z    235
I want to build a report that shows an aggregate of logins by year. So something like (for all months, just using January as an example.):
Year    Active Users in January    Logins in January
2019    500                        10000
2020    600                        10002
Essentially, the active users would be grouping the rows of logins by user_id, and the logins would just aggregate the timestamps by month.
Is this kind of view something I build using a SQL query?
You may use aggregation here:
SELECT
YEAR(timestamp) AS Year,
COUNT(DISTINCT user_id) AS `Active Users in January`,
COUNT(*) AS `Logins in January`
FROM yourTable
WHERE
MONTH(timestamp) = 1
GROUP BY
YEAR(timestamp);
The number of active users is given by the distinct count of users for a given year, in the month of January. The number of logins is just the number of entries for a given year in January.
If you want to report for all months and all years, then use:
SELECT
DATE_FORMAT(timestamp, '%Y-%m') AS ym,
COUNT(DISTINCT user_id) AS `Active Users per month`,
COUNT(*) AS `Logins in month`
FROM yourTable
GROUP BY 1;
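The difference between the two aggregates can be sketched on a handful of rows. The example below assumes SQLite (via Python's sqlite3) instead of MySQL, so strftime('%Y-%m', ...) takes the place of DATE_FORMAT. The table name logins and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logins (id INTEGER PRIMARY KEY, timestamp TEXT, user_id INT)"
)
# Hypothetical logins: user 234 logs in twice in January 2019.
conn.executemany(
    "INSERT INTO logins (timestamp, user_id) VALUES (?, ?)",
    [
        ("2019-01-01 15:17:13", 234),
        ("2019-01-02 15:17:13", 234),
        ("2019-01-03 08:00:00", 235),
        ("2019-02-01 08:00:00", 234),
        ("2020-01-05 10:00:00", 236),
    ],
)

query = """
SELECT strftime('%Y-%m', timestamp) AS ym,
       COUNT(DISTINCT user_id) AS active_users,  -- each user counted once per month
       COUNT(*) AS logins                         -- every login row counted
FROM logins
GROUP BY ym
ORDER BY ym
"""
monthly = conn.execute(query).fetchall()
for row in monthly:
    print(row)
```

For January 2019 this gives 2 active users but 3 logins, because the repeated login by user 234 only counts towards the second figure.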

Query to combine two tables group by month

I have tried to connect two tables by join and group them to get the count. But unfortunately, these two tables don't have any common value to join (Or I have misunderstood the solutions).
select date_format(check_in.date,'%M') as Month, count(check_in.Id) as checkInCount
from check_in
group by month(check_in.date);
Month      checkInCount
July       1
October    2
This is the first table.
select date_format(reservation.date,'%M') as Month, count(reservation.id) as reserveCount
from reservation
group by month(reservation.date);
Month      reserveCount
July       3
October    5
This is the second table.
I want to show these two tables in one table.
Month      checkInCount    reserveCount
July       1               3
October    2               5
Thank you for trying this and sorry if this is too easy.
You will need to join the result by month from your two subqueries.
This query assumes that every month (July, August, September, ...) is present in both subqueries, monthCheckInStat and monthCheckOutStat. Because of the INNER JOIN, a month that is missing from either subquery does not appear in the result.
SELECT monthCheckInStat.Month, monthCheckInStat.checkInCount, monthCheckOutStat.reserveCount
FROM
(
select date_format(check_in.date,'%M') as Month, count(check_in.Id) as checkInCount
from check_in
group by month(check_in.date)
) monthCheckInStat
INNER JOIN
(
select date_format(reservation.date,'%M') as Month, count(reservation.id) as reserveCount
from reservation
group by month(reservation.date)
) monthCheckOutStat
ON monthCheckInStat.Month = monthCheckOutStat.Month;
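What happens to a month that exists in only one of the two tables can be tried out with a small sketch. It assumes SQLite through Python's sqlite3, so strftime('%m', ...) (month number) replaces date_format(..., '%M'). The rows, the aliases r and c, and the LEFT JOIN variant with COALESCE are illustrative additions, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE check_in (Id INTEGER PRIMARY KEY, date TEXT)")
conn.execute("CREATE TABLE reservation (id INTEGER PRIMARY KEY, date TEXT)")
conn.executemany(
    "INSERT INTO check_in (date) VALUES (?)",
    [("2021-07-10",), ("2021-10-01",), ("2021-10-15",)],
)
conn.executemany(
    "INSERT INTO reservation (date) VALUES (?)",
    [
        ("2021-07-01",), ("2021-07-02",), ("2021-07-03",),
        ("2021-08-20",),  # August has a reservation but no check-in
        ("2021-10-01",), ("2021-10-02",),
    ],
)

# Same idea as the answer: aggregate each table per month, then join on month.
template = """
SELECT r.month, COALESCE(c.checkInCount, 0) AS checkInCount, r.reserveCount
FROM (SELECT strftime('%m', date) AS month, COUNT(id) AS reserveCount
      FROM reservation GROUP BY month) AS r
{join} JOIN
     (SELECT strftime('%m', date) AS month, COUNT(Id) AS checkInCount
      FROM check_in GROUP BY month) AS c
  ON c.month = r.month
ORDER BY r.month
"""
inner_rows = conn.execute(template.format(join="INNER")).fetchall()
left_rows = conn.execute(template.format(join="LEFT")).fetchall()
print(inner_rows)
print(left_rows)
```

With the inner join, August is missing from the output. With the left join starting from reservation, August appears with a check-in count of 0. Months that have check-ins but no reservations would still be lost in this variant.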

sql: group by multiple correlated fields (date, weekday, month)

I am working on a SQL task. The goal is to know how many flights there are on average, for a given day in a given month from the flights table.
Input table:
flights
id BIGINT
dep_day_of_week varchar (255)
dep_month varchar (255)
dep_date text
An example of the flights table. There could be multiple entries for the same date.
id dep_day_of_week dep_month dep_date
1 Thursday January 4/7/2005 15:24:00
2 Friday February 5/6/2005 12:12:12
3 Friday February 5/6/2005 15:12:12
I read the following solution:
SELECT a.dep_month,
a.dep_day_of_week,
AVG(a.flight_count) AS average_flights
FROM (
SELECT dep_month, dep_day_of_week, dep_date,
COUNT(*) AS flight_count
FROM flights
GROUP BY 1,2,3
) a
GROUP BY 1,2
ORDER BY 1,2;
My question is about the subquery, which calculates the number of flights per day:
SELECT dep_month, dep_day_of_week, dep_date, COUNT(*) AS flight_count
FROM flights
GROUP BY 1,2,3
Since dep_month, dep_day_of_week, and dep_date are three correlated attributes, and dep_date is the most detailed of the three, I thought GROUP BY 1,2,3 would behave the same as GROUP BY 3.
To examine the possible differences, I used count(*) to count the rows returned by the above subquery:
Select count(*) from (
SELECT dep_month, dep_day_of_week, dep_date, COUNT(*) AS flight_count
FROM flights
GROUP BY 1,2,3 -- or: GROUP BY 3
) t
In the output, the counts for GROUP BY 1,2,3 and GROUP BY 3 are 447 and 441, respectively. Why is there a difference between these two grouping methods?
Updates:
Thanks to @trincot for the excellent answer. I used the suggested query and found an inconsistency in the input data.
SELECT dep_date, count(distinct dep_month), count(distinct dep_day_of_week)
FROM flights
GROUP BY dep_date
HAVING count(distinct dep_month) > 1
OR count(distinct dep_day_of_week) > 1
Output:
dep_date count(distinct dep_month) count(distinct dep_day_of_week)
1/16/2001 1 2
10/25/2003 1 2
2/23/2000 1 2
3/29/2001 1 2
4/3/2001 1 2
5/13/2000 1 2
Specifically, the database assigns Monday to 1/16/2001 8:25:00 and Tuesday to 1/16/2001 7:56:00. That is the reason for the inconsistency.
As the date field has a time component, the count(*) in your subquery is going to be 1 every time, since the time component will be different and generate a new group. Your groups are actually per second.
You could get your results without subquery, like this:
select dep_month,
dep_day_of_week,
count(*) /
count(distinct substring_index(dep_date, ' ', 1)) avg_flights
from flights
group by dep_month,
dep_day_of_week
This counts all the flight records, and divides that by the number of different dates these flights are on. The date is extracted by only taking the part before the space.
Note that this means that when you don't have a record at all for a certain date, this day will not count in the average and might give a false impression. For instance, if in January there is only one Friday for which you have flights (let's say 10 of them), but there are 4 Fridays in January, you will still get an average of 10, even though 2.5 would be more reasonable.
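Both effects (per-second groups versus per-date division) can be reproduced on a few rows. This is a sketch under assumptions: it uses SQLite via Python, where substr/instr replaces MySQL's substring_index, and 1.0 * is needed because SQLite divides two integers as integers. The third Friday row is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights "
    "(id INTEGER, dep_day_of_week TEXT, dep_month TEXT, dep_date TEXT)"
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?)",
    [
        (1, "Thursday", "January", "4/7/2005 15:24:00"),
        (2, "Friday", "February", "5/6/2005 12:12:12"),
        (3, "Friday", "February", "5/6/2005 15:12:12"),
        (4, "Friday", "February", "5/13/2005 09:00:00"),  # hypothetical extra row
    ],
)

# Grouping on the raw dep_date: every timestamp is its own group of size 1.
per_timestamp = conn.execute("""
SELECT a.dep_month, a.dep_day_of_week, AVG(a.flight_count)
FROM (SELECT dep_month, dep_day_of_week, dep_date, COUNT(*) AS flight_count
      FROM flights
      GROUP BY dep_month, dep_day_of_week, dep_date) AS a
GROUP BY a.dep_month, a.dep_day_of_week
ORDER BY a.dep_month, a.dep_day_of_week
""").fetchall()

# Dividing all flights by the number of distinct calendar dates instead.
per_date = conn.execute("""
SELECT dep_month, dep_day_of_week,
       1.0 * COUNT(*)
       / COUNT(DISTINCT substr(dep_date, 1, instr(dep_date, ' ') - 1)) AS avg_flights
FROM flights
GROUP BY dep_month, dep_day_of_week
ORDER BY dep_month, dep_day_of_week
""").fetchall()

print(per_timestamp)
print(per_date)
```

The first query reports 1.0 for every group, while the second reports 3 flights over 2 dates, i.e. 1.5, for the February Fridays.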
About the difference in count
You state that this query returns 447 records:
Select count(*) from (
SELECT dep_month, dep_day_of_week, dep_date, COUNT(*) AS flight_count
FROM flights
GROUP BY 1,2,3)
And this only 441:
Select count(*) from (
SELECT dep_month, dep_day_of_week, dep_date, COUNT(*) AS flight_count
FROM flights
GROUP BY 3)
This seems to indicate that you have identical dates in multiple records, but yet with difference in one of the first two columns, which would be an inconsistency. You can find out with this query:
SELECT dep_date, count(distinct dep_month), count(distinct dep_day_of_week)
FROM flights
GROUP BY dep_date
HAVING count(distinct dep_month) > 1
OR count(distinct dep_day_of_week) > 1
In a healthy data set, this query should return 0 records. If it returns records, you'll get the dates for which the month is not correctly set in at least one record, or the day of the week is not correctly set in at least one record.
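As a small illustration of how one mislabeled record changes the number of groups, here is a sketch using SQLite via Python. The rows are hypothetical, with date-only values to keep the effect easy to see. One date carries two different weekday labels.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flights (dep_month TEXT, dep_day_of_week TEXT, dep_date TEXT)"
)
# Hypothetical rows: 1/16/2001 is labeled both Monday and Tuesday.
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?)",
    [
        ("January", "Monday", "1/16/2001"),
        ("January", "Tuesday", "1/16/2001"),
        ("January", "Wednesday", "1/17/2001"),
    ],
)

# Number of groups when grouping by all three columns.
groups_all = conn.execute("""
SELECT COUNT(*) FROM (SELECT 1 FROM flights
                      GROUP BY dep_month, dep_day_of_week, dep_date) AS t
""").fetchone()[0]

# Number of groups when grouping by the date alone.
groups_date = conn.execute("""
SELECT COUNT(*) FROM (SELECT 1 FROM flights GROUP BY dep_date) AS t
""").fetchone()[0]

# The consistency check: dates with more than one month or weekday label.
inconsistent = conn.execute("""
SELECT dep_date, COUNT(DISTINCT dep_month), COUNT(DISTINCT dep_day_of_week)
FROM flights
GROUP BY dep_date
HAVING COUNT(DISTINCT dep_month) > 1
    OR COUNT(DISTINCT dep_day_of_week) > 1
""").fetchall()

print(groups_all, groups_date, inconsistent)
```

Here the three-column grouping yields 3 groups and the date-only grouping yields 2, and the check query points at 1/16/2001 as the date with two weekday labels.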

MySQL query to retrieve DISTINCT COUNT between moving DATE period

I'm trying to write a query that returns a list of dates and the DISTINCT COUNT of User IDs for the 7 days preceding each date. The table I'm working with is simple, and looks like this:
Started UserId
"2012-09-25 00:01:04" 164382
"2012-09-25 00:01:39" 164382
"2012-09-25 00:02:37" 166121
"2012-09-25 00:03:35" 155682
"2012-09-25 00:04:18" 160947
"2012-09-25 00:08:19" 165806
I can write the query for output of an individual COUNT as follows:
SELECT COUNT(DISTINCT UserId)
FROM Session
WHERE Started BETWEEN '2012-09-18 00:00' AND '2012-09-25 00:00';
But what I'm trying to do is output this COUNT for every day in the table AND the 7 days preceding it. To clarify, the value for September 25th would be the count of DISTINCT User IDs between the 18th and 25th, the 24th the count between 17th and 24th, etc.
I tried the following query but it provides just the COUNT for each day:
SELECT
DATE(A.Started),
Count(DISTINCT A.UserId)
FROM Session AS A
WHERE DATE(A.Started) BETWEEN DATE(DATE_SUB(DATE(DATE(A.Started)),INTERVAL 7 DAY)) AND DATE(DATE(A.Started))
GROUP BY DATE(A.Started)
ORDER BY DATE(A.Started);
And the output looks like this:
DATE(A.Started) "Count(DISTINCT A.UserId)"
2012-09-18 709
2012-09-19 677
2012-09-20 658
2012-09-21 556
2012-09-22 530
2012-09-23 479
2012-09-24 528
2012-09-25 480
...
But as I said, those are just the daily counts. Initially I thought I could just sum the 7 day values, but that will invalidate the DISTINCT clause. I need the DISTINCT UserId counts for each 7 day period preceding a given date.
This query should work for you:
SELECT
DATE_FORMAT(d1.Started, '%Y-%m-%d') AS Started,
COUNT(DISTINCT d2.UserID) Users
FROM
(
SELECT
DATE(Started) AS Started
FROM
Session
GROUP BY
DATE(Started)
) d1
INNER JOIN
(
SELECT DISTINCT
DATE(Started) AS Started,
UserID
FROM
Session
) d2
ON d2.Started BETWEEN d1.Started - INTERVAL 7 DAY AND d1.Started
GROUP BY
d1.Started
ORDER BY
d1.Started DESC
Visit http://sqlfiddle.com/#!2/9339c/5 to see this query in action.
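If the fiddle is unavailable, the same join pattern can be tried locally. The sketch below assumes SQLite through Python's sqlite3, so date(day, '-7 days') replaces MySQL's INTERVAL arithmetic. The rows and the column alias day are made up. Like the query above, the BETWEEN range includes both end dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Session (Started TEXT, UserId INT)")
# Hypothetical sessions spread over four days.
conn.executemany(
    "INSERT INTO Session VALUES (?, ?)",
    [
        ("2012-09-17 10:00:00", 1),
        ("2012-09-18 10:00:00", 1),
        ("2012-09-18 11:00:00", 2),
        ("2012-09-24 09:00:00", 2),
        ("2012-09-25 00:01:04", 3),
        ("2012-09-25 00:01:39", 3),
    ],
)

query = """
SELECT d1.day, COUNT(DISTINCT d2.UserId) AS users
FROM (SELECT DISTINCT date(Started) AS day FROM Session) AS d1
JOIN (SELECT DISTINCT date(Started) AS day, UserId FROM Session) AS d2
  ON d2.day BETWEEN date(d1.day, '-7 days') AND d1.day
GROUP BY d1.day
ORDER BY d1.day
"""
rolling = conn.execute(query).fetchall()
for row in rolling:
    print(row)
```

For 2012-09-25 the window runs from the 18th to the 25th and contains users 1, 2 and 3, so the count is 3 even though only one user was active on the 25th itself.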
try:
SELECT DATE(a.Started) AS Day, COUNT(DISTINCT b.UserId) AS Users
FROM Session a
JOIN Session b
ON DATE(b.Started) BETWEEN DATE(a.Started) - INTERVAL 7 DAY AND DATE(a.Started)
GROUP BY DATE(a.Started)
I'm not a MySQL guy, so the syntax might not be correct, but the pattern will work....