MySQL GROUP BY, most recent - mysql

Here is the data set:
Person Status Date
Eric 1 1/1/2015
Eric 2 2/1/2015
Eric 3 3/1/2015
John 1 3/1/2015
John 2 2/1/2015
John 1 1/1/2015
I'd like to get the most recent date, and its correlated status, grouped by Person. I tried using a subquery to first identify the most recent date:
SELECT MAX(Date), Person FROM tbl1 GROUP BY Person
I then joined that back to the original table, so that for each person I know which date is the most recent. But I'm struggling to identify the most recent status. I just don't see the appropriate aggregate function. Thanks.

select tbl1.*
from tbl1
join
(
SELECT Person, MAX(Date) as m_date
FROM tbl1
GROUP BY Person
) tmp on tbl1.Person = tmp.Person
and tbl1.date = tmp.m_date
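A minimal sketch of the join above, run against an in-memory SQLite database through Python's sqlite3 module rather than MySQL. The dates are rewritten as ISO strings and read as month/day/year, which is an assumption about the sample data.

```python
import sqlite3

# In-memory copy of the sample data. Dates are stored as ISO strings
# (an assumption) so that MAX() compares them chronologically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (Person TEXT, Status INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO tbl1 VALUES (?, ?, ?)",
    [
        ("Eric", 1, "2015-01-01"),
        ("Eric", 2, "2015-02-01"),
        ("Eric", 3, "2015-03-01"),
        ("John", 1, "2015-03-01"),
        ("John", 2, "2015-02-01"),
        ("John", 1, "2015-01-01"),
    ],
)

# The derived table finds each person's latest date; joining it back
# on (Person, Date) recovers the status stored on that row.
latest = conn.execute(
    """
    SELECT tbl1.Person, tbl1.Status, tbl1.Date
    FROM tbl1
    JOIN (
        SELECT Person, MAX(Date) AS m_date
        FROM tbl1
        GROUP BY Person
    ) tmp ON tbl1.Person = tmp.Person
         AND tbl1.Date = tmp.m_date
    ORDER BY tbl1.Person
    """
).fetchall()

for row in latest:
    print(row)
```

If a person has two rows on the same latest date, both rows come back, so a tie-breaker column would be needed to guarantee one row per person.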

Related

SQL nested query under WHERE

One of the test questions came with the following schemas and asked for the best doctor in terms of:
Best score;
Most times/attempts;
For each medical procedure (by name).
[doctor] table
id first_name last_name age
1 Phillip Singleton 50
2 Heidi Elliott 34
3 Beulah Townsend 35
4 Gary Pena 36
5 Doug Lowe 45
[medical_procedure] table
id doctor_id name score
1 3 colonoscopy 44
2 1 colonoscopy 37
3 4 ulcer surgery 98
4 2 angiography 79
5 3 angiography 84
6 3 embolization 87
and the list goes on...
The given solution is as follows:
WITH cte AS(
SELECT
name,
first_name,
last_name,
COUNT(*) AS procedure_count,
RANK() OVER(
PARTITION BY name
ORDER BY COUNT(*) DESC) AS place
FROM
medical_procedure p JOIN doctor d
ON p.doctor_id = d.id
WHERE
score >= (
SELECT AVG(score)
FROM medical_procedure pp
WHERE pp.name = p.name)
GROUP BY
name,
first_name,
last_name
)
SELECT
name,
first_name,
last_name
FROM cte
WHERE place = 1;
I would really appreciate an explanation of how the WHERE clause with the subquery works:
How it works in general
Why we must match pp.name and p.name for it to return the correct rows...
...
WHERE
score >= (
SELECT AVG(score)
FROM medical_procedure pp
WHERE pp.name = p.name)
...
Thanks a heap!
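The query joins doctor and medical_procedure and groups by procedure name and doctor, because you want, for each procedure, the doctor with the most attempts among those who scored well.
The subquery is correlated: for each row of p it computes the average score of the procedures with the same name (pp.name = p.name), and the WHERE clause keeps only the rows whose score is at least that average. Without that match, the subquery would return the average over all procedures instead of the average for that procedure.
Several doctors can be at or above the average for a procedure, so RANK() orders them by procedure count, putting the doctor with the most attempts first, and WHERE place = 1 then picks the top one.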
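To see what the pp.name = p.name match changes, here is a small sketch using Python's sqlite3 module and only the six medical_procedure rows shown in the question (the real table has more). It runs the filter once with the correlated subquery and once with a plain overall average for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE medical_procedure "
    "(id INTEGER, doctor_id INTEGER, name TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO medical_procedure VALUES (?, ?, ?, ?)",
    [
        (1, 3, "colonoscopy", 44),
        (2, 1, "colonoscopy", 37),
        (3, 4, "ulcer surgery", 98),
        (4, 2, "angiography", 79),
        (5, 3, "angiography", 84),
        (6, 3, "embolization", 87),
    ],
)

# Correlated: the inner query is re-evaluated for each outer row p,
# averaging only the rows that share p's procedure name.
correlated = conn.execute(
    """
    SELECT p.id, p.name, p.score,
           (SELECT AVG(score) FROM medical_procedure pp
            WHERE pp.name = p.name) AS avg_for_name
    FROM medical_procedure p
    WHERE p.score >= (SELECT AVG(score) FROM medical_procedure pp
                      WHERE pp.name = p.name)
    ORDER BY p.id
    """
).fetchall()

# Uncorrelated: one average over every procedure, for comparison only.
uncorrelated = conn.execute(
    """
    SELECT p.id
    FROM medical_procedure p
    WHERE p.score >= (SELECT AVG(score) FROM medical_procedure)
    ORDER BY p.id
    """
).fetchall()

correlated_ids = [row[0] for row in correlated]
uncorrelated_ids = [row[0] for row in uncorrelated]
print(correlated_ids, uncorrelated_ids)
```

With the correlated version each procedure keeps its own above-average rows. With the overall average, both colonoscopy rows drop out (their scores are low compared with other procedures) and the angiography row with score 79 is kept even though it is below the angiography average.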

How to display the days when there are no records in MariaDB?

I have the following table called employees:
employee name
101 John
102 Alexandra
103 Ruth
And the table called records:
employee assistance
101 2022-02-01
101 2022-02-02
101 2022-02-07
Let's suppose that I want to display the employee number, name and the days of the month in which there were absences between 2022-02-01 and 2022-02-07 (taking into account that days 05 and 06 are weekends). In that case, the result would be the following:
employee name absence
101 John 4,5
How do I get that result?
So far I have developed a query where the days of the month in which there are attendances are displayed. Said query is as follows:
SELECT e.employee,
e.name,
r.assistance AS assistance
FROM employees e
LEFT JOIN (SELECT employee, GROUP_CONCAT(DISTINCT EXTRACT(DAY FROM assistance)
ORDER BY EXTRACT(DAY FROM assistance)) AS assistance FROM records
WHERE assistance BETWEEN '2022-02-01' AND '2022-02-07' GROUP BY employee) r ON e.employee = r.employee
WHERE (r.employee IS NOT NULL) ORDER BY name ASC
I would like to know how to implement the days in which there were absences and not consider the weekends. I've done several tests but I'm still stuck. I'm working with MariaDB 10.4.11
You can use a recursive common table expression (requires MariaDB 10.2+ or MySQL 8) to get the list of dates in the date range, and join against that:
with recursive date_range as (
select '2022-02-01' dt
union all
select dt + interval 1 day from date_range where dt < '2022-02-07'
)
select employees.employee, group_concat(day(date_range.dt) order by date_range.dt) faults
from date_range
cross join employees
left join records on records.employee=employees.employee and records.assistance=date_range.dt
where weekday(date_range.dt) < 5 and records.employee is null
group by employees.employee
If you are just looking for one employee, add that as a where condition.
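The same idea can be sketched with Python's sqlite3 module for a quick local check. This is not MariaDB: SQLite's date(dt, '+1 day') and strftime('%w', ...) stand in for interval arithmetic and weekday(), and the comma-separated list is built in Python instead of with group_concat. The sample rows are the ones from the question.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employees (employee INTEGER, name TEXT);
    CREATE TABLE records (employee INTEGER, assistance TEXT);
    INSERT INTO employees VALUES (101, 'John'), (102, 'Alexandra'), (103, 'Ruth');
    INSERT INTO records VALUES
        (101, '2022-02-01'), (101, '2022-02-02'), (101, '2022-02-07');
    """
)

# Generate every date in the range, pair it with every employee, and keep
# the weekday pairs that have no matching attendance record.
rows = conn.execute(
    """
    WITH RECURSIVE date_range(dt) AS (
        SELECT '2022-02-01'
        UNION ALL
        SELECT date(dt, '+1 day') FROM date_range WHERE dt < '2022-02-07'
    )
    SELECT e.employee, e.name, CAST(strftime('%d', d.dt) AS INTEGER) AS day
    FROM date_range d
    CROSS JOIN employees e
    LEFT JOIN records r
           ON r.employee = e.employee AND r.assistance = d.dt
    WHERE strftime('%w', d.dt) NOT IN ('0', '6')  -- 0 = Sunday, 6 = Saturday
      AND r.employee IS NULL
    ORDER BY e.employee, d.dt
    """
).fetchall()

# Collapse the per-day rows into one comma-separated string per employee.
absences = defaultdict(list)
for employee, name, day in rows:
    absences[(employee, name)].append(str(day))
report = {key: ",".join(days) for key, days in absences.items()}

for (employee, name), days in report.items():
    print(employee, name, days)
```

Employees with no attendance records at all (102 and 103 here) are listed as absent on every weekday in the range, because the cross join pairs each of them with every date.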

SQL group by function to categorize only the most recent data

I have a table called title that stores all of the titles held by each employee. It looks like this:
emp_no title start_date
101 Engineer 2019-01-01
101 Senior Engineer 2020-02-01
102 Engineer 2019-01-11
102 Senior Engineer 2020-02-11
103 Engineer 2019-01-21
104 Engineer 2019-01-31
105 Associate 2019-01-01
106 Associate 2019-01-11
106 Manager 2020-02-11
107 Associate 2019-01-21
107 Manager 2020-02-21
108 Associate 2019-01-31
Notice that each employee can have more than one title. For example, employee 101 was an Engineer on 1 January 2019 but was promoted to Senior Engineer a year later.
Now let's say I want to count how many employees hold each position. I have tried using the count function along with group by (to group the number of employees by title), but the problem is that the query also counts the past positions of every employee.
To be exact, I only want to include the most recent role that each employee currently has. So in this case, the result I am expecting is:
Engineer: 2 employees (because the other 2 have been promoted to Senior Engineer),
Senior Engineer: 2 employees,
Associate: 2 employees (because the other 2 have been promoted to Manager),
Manager: 2 employees
Is there some way to achieve that?
NOTE: this table format is from an online SQL course that I'm taking, so I'm not the one who made the table. The original table also contains tens of thousands of rows.
You can use not exists as follows:
select title, count(*) as Count
from your_table t
where not exists
(select 1 from your_table tt
where tt.emp_no = t.emp_no and tt.start_date> t.start_date)
group by title
select title,COUNT(*) numberOfEmp from
(
select distinct emp_no
,(select top 1 title from [dbo].[Tbl_title] a where a.emp_no=m.emp_no
order by [start_date] desc
) title
from [dbo].[Tbl_title] m
) mTable
group by title
I am going to recommend a correlated subquery, but for a very particular reason:
select title, count(*)
from t
where t.start_date = (select max(t2.start_date)
from t t2
where t2.emp_no = t.emp_no
)
group by title;
The particular reason for suggesting this is that it is easy to modify this for the number of employees "as of" a particular date. For instance, if you want the number of employees as of 2019-01-01, you change the where to:
where t2.emp_no = t.emp_no and t2.start_date <= '2019-01-01'
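As a rough illustration of the "as of" variation, the sketch below loads the sample rows into SQLite via Python's sqlite3 module. The helper name headcount_by_title and its as_of parameter are made up for this example, and the far-future default date simply means "no cutoff".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (emp_no INTEGER, title TEXT, start_date TEXT)")
conn.executemany(
    "INSERT INTO title VALUES (?, ?, ?)",
    [
        (101, "Engineer", "2019-01-01"), (101, "Senior Engineer", "2020-02-01"),
        (102, "Engineer", "2019-01-11"), (102, "Senior Engineer", "2020-02-11"),
        (103, "Engineer", "2019-01-21"), (104, "Engineer", "2019-01-31"),
        (105, "Associate", "2019-01-01"),
        (106, "Associate", "2019-01-11"), (106, "Manager", "2020-02-11"),
        (107, "Associate", "2019-01-21"), (107, "Manager", "2020-02-21"),
        (108, "Associate", "2019-01-31"),
    ],
)


def headcount_by_title(as_of="9999-12-31"):
    """Count employees per title, using each employee's latest row on or
    before as_of (hypothetical helper; ISO date strings assumed)."""
    query = """
        SELECT t.title, COUNT(*)
        FROM title t
        WHERE t.start_date = (
            SELECT MAX(t2.start_date)
            FROM title t2
            WHERE t2.emp_no = t.emp_no AND t2.start_date <= ?
        )
        GROUP BY t.title
    """
    return dict(conn.execute(query, (as_of,)).fetchall())


print(headcount_by_title())              # current titles
print(headcount_by_title("2019-12-31"))  # titles at the end of 2019
```

Employees whose first title starts after the cutoff have no qualifying row, so the subquery returns NULL for them and they are left out of the count.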
You can find each employee's latest start_date in a derived table and join on it while counting. The query is as follows:
select t.title, count(distinct t.emp_no) as Count
from your_table t
join (select emp_no, max(start_date) as start_date
from your_table
group by emp_no) subset
on subset.emp_no = t.emp_no and subset.start_date = t.start_date
group by t.title

MySQL Query for Row with Most Recent Date

Let's say I have a table called "signup_info" as follows:
p_key_id name gender signup_date
1 Bob male 10/5/17
2 Mary female 9/23/14
3 Jamie female 2/6/15
4 Jamie male 3/22/17
How would I write a query that would only give me the row with the most recent signup_date for each person's name?
I would use a correlated subquery:
select si.*
from signup_info si
where si.signup_date = (select max(si2.signup_date) from signup_info si2 where si2.name = si.name);
If p_key_id is autoincrementing, then it might provide a more reliable way to get the most recent:
select si.*
from signup_info si
where si.p_key_id = (select max(si2.p_key_id) from signup_info si2 where si2.name = si.name);
If someone signs up twice on the same date, then the first query will return duplicate rows for that person.
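A short sketch of that difference, using Python's sqlite3 module. The dates are rewritten as ISO strings, and row 5 (a second Bob signup on the same date) is a hypothetical row added only to produce a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE signup_info "
    "(p_key_id INTEGER PRIMARY KEY, name TEXT, gender TEXT, signup_date TEXT)"
)
conn.executemany(
    "INSERT INTO signup_info VALUES (?, ?, ?, ?)",
    [
        (1, "Bob", "male", "2017-10-05"),
        (2, "Mary", "female", "2014-09-23"),
        (3, "Jamie", "female", "2015-02-06"),
        (4, "Jamie", "male", "2017-03-22"),
        (5, "Bob", "male", "2017-10-05"),  # hypothetical duplicate-date signup
    ],
)

# Match on the latest date per name: ties on the date all come back.
by_date = [
    row[0]
    for row in conn.execute(
        """
        SELECT si.p_key_id
        FROM signup_info si
        WHERE si.signup_date = (SELECT MAX(si2.signup_date)
                                FROM signup_info si2
                                WHERE si2.name = si.name)
        ORDER BY si.p_key_id
        """
    )
]

# Match on the highest key per name: exactly one row per name.
by_key = [
    row[0]
    for row in conn.execute(
        """
        SELECT si.p_key_id
        FROM signup_info si
        WHERE si.p_key_id = (SELECT MAX(si2.p_key_id)
                             FROM signup_info si2
                             WHERE si2.name = si.name)
        ORDER BY si.p_key_id
        """
    )
]

print(by_date)
print(by_key)
```

The key-based version only reflects the most recent signup if larger p_key_id values always correspond to later signups, which is the assumption stated above.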

Get highest value for each date

I have a table that logs every time a user completes a survey. It looks a bit like this:
surveyID author timestamp
-----------------------------------------------
1 person1 1461840669000
2 person2 1461840670000
3 person1 1461840680000
I'm trying to run a query that shows me the top surveyor every day (i.e. the person that does the highest number of surveys per day) since April 1st.
So far I've tried this:
SELECT author,
COUNT (DISTINCT surveyid) AS num_surveys,
STRFTIME_UTC_USEC(creation_time*1000, "%Y-%m-%d") AS date,
FROM myTable
WHERE creation_time > 1459468800000 //since April 1st
GROUP BY date, author
ORDER BY 3 DESC,2 DESC;
Which gives me this result:
author num_surveys date
------------------------------------
user1 116 2016-04-27
user2 109 2016-04-27
user3 99 2016-04-27
user3 102 2016-04-28
user1 98 2016-04-28
user2 97 2016-04-28
However, I would really just like the top record from each day:
author num_surveys date
------------------------------------
user1 116 2016-04-27
user3 102 2016-04-28 etc...
I've tried MAX() and TOP() in various places but none of them have worked so far hence the above example of my query that gets me closest to what I want... Any suggestions would be much appreciated. I'm very new to SQL!
EDIT
Thanks for the suggestions so far. I have managed to get it to work with:
DEFINE INLINE TABLE A
SELECT author,
COUNT (DISTINCT featureid) AS num_surveys,
STRFTIME_UTC_USEC(creation_time*1000, "%Y-%m-%d") AS date,
FROM placesense.surveys
WHERE creation_time > 1459468800000
GROUP BY date, author
ORDER BY 3 DESC,2 DESC;
SELECT
MAX(num_surveys),
date
FROM A AS B
WHERE date = B.date
GROUP BY date
Any other more efficient suggestions welcome though.
A pretty simple way uses a correlated subquery:
select t.*
from t
where t.num_surveys = (select max(t2.num_surveys) from t t2 where t2.date = t.date);
Note: this will return duplicates for a date in the case of ties.
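For a quick check of the correlated subquery, the sketch below uses Python's sqlite3 module (not BigQuery) and assumes the per-day counts have already been computed into a table t holding the result rows shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# t stands for the already aggregated result (author, num_surveys, date).
conn.execute("CREATE TABLE t (author TEXT, num_surveys INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("user1", 116, "2016-04-27"),
        ("user2", 109, "2016-04-27"),
        ("user3", 99, "2016-04-27"),
        ("user3", 102, "2016-04-28"),
        ("user1", 98, "2016-04-28"),
        ("user2", 97, "2016-04-28"),
    ],
)

# Keep only the rows whose count equals that day's maximum.
top_per_day = conn.execute(
    """
    SELECT t.author, t.num_surveys, t.date
    FROM t
    WHERE t.num_surveys = (SELECT MAX(t2.num_surveys)
                           FROM t t2
                           WHERE t2.date = t.date)
    ORDER BY t.date
    """
).fetchall()

for row in top_per_day:
    print(row)
```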
SELECT MAX( surveyid) AS m_surveys,
STRFTIME_UTC_USEC(creation_time*1000, "%Y-%m-%d") AS date,
FROM myTable
WHERE creation_time > 1459468800000 //since April 1st
GROUP BY date, author
ORDER BY 2 DESC, 1 DESC;