SQL group by function to categorize only the most recent data - mysql

I have a table called title that stores every title held by each employee. It looks like this:
emp_no  title            start_date
101     Engineer         2019-01-01
101     Senior Engineer  2020-02-01
102     Engineer         2019-01-11
102     Senior Engineer  2020-02-11
103     Engineer         2019-01-21
104     Engineer         2019-01-31
105     Associate        2019-01-01
106     Associate        2019-01-11
106     Manager          2020-02-11
107     Associate        2019-01-21
107     Manager          2020-02-21
108     Associate        2019-01-31
Notice that each employee can have more than one title. For example, employee 101 was an Engineer on 1 January 2019 but was promoted to Senior Engineer about a year later.
Now let's say I want to count how many employees hold each position. I have tried using the count function along with group by (to group the number of employees by title), but the problem is that the query also counts the past positions of every employee.
To be exact, I only want to include the most recent role that each employee currently has. So in this case, the result I am expecting is
Engineer: 2 employees (because the other 2 have been promoted to Senior Engineer),
Senior Engineer: 2 employees,
Associate: 2 employees (because the other 2 have been promoted to Manager),
Manager: 2 employees
Is there a way to achieve that?
NOTE: this table format is from an online SQL course that I'm taking, so I'm not the one who designed the table. Also, the original table contains tens of thousands of rows.

You can use not exists as follows:
select title, count(*) as Count
from your_table t
where not exists
(select 1 from your_table tt
where tt.emp_no = t.emp_no and tt.start_date> t.start_date)
group by title
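A quick way to check this query against the sample rows from the question is the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), and the table name title comes from the question.

```python
import sqlite3

# Sample rows copied from the question; SQLite stands in for MySQL here.
rows = [
    (101, "Engineer", "2019-01-01"), (101, "Senior Engineer", "2020-02-01"),
    (102, "Engineer", "2019-01-11"), (102, "Senior Engineer", "2020-02-11"),
    (103, "Engineer", "2019-01-21"), (104, "Engineer", "2019-01-31"),
    (105, "Associate", "2019-01-01"),
    (106, "Associate", "2019-01-11"), (106, "Manager", "2020-02-11"),
    (107, "Associate", "2019-01-21"), (107, "Manager", "2020-02-21"),
    (108, "Associate", "2019-01-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (emp_no INTEGER, title TEXT, start_date TEXT)")
conn.executemany("INSERT INTO title VALUES (?, ?, ?)", rows)

# Keep a row only if the same employee has no later row, then count per title.
current_counts = dict(conn.execute("""
    SELECT t.title, COUNT(*)
    FROM title t
    WHERE NOT EXISTS (SELECT 1 FROM title tt
                      WHERE tt.emp_no = t.emp_no
                        AND tt.start_date > t.start_date)
    GROUP BY t.title
""").fetchall())
print(current_counts)
```

Each title should come out with a count of 2, matching the result the question expects.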

select title, count(*) as numberOfEmp
from (
select distinct emp_no,
(select a.title from Tbl_title a
where a.emp_no = m.emp_no
order by a.start_date desc
limit 1) as title
from Tbl_title m
) mTable
group by title

I am going to recommend a correlated subquery, but for a very particular reason:
select t.title, count(*)
from t
where t.start_date = (select max(t2.start_date)
from t t2
where t2.emp_no = t.emp_no
)
group by t.title;
The particular reason for suggesting this is that it is easy to modify this for the number of employees "as of" a particular date. For instance, if you want the number of employees as of 2019-01-01, you change the where to:
where t2.emp_no = t.emp_no and t2.start_date <= '2019-01-01'
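The "as of" variant can be tried out with the sketch below. It runs on SQLite through Python's sqlite3 module rather than MySQL, and the helper name counts_as_of is hypothetical; the rows are the sample data from the question.

```python
import sqlite3

# Sample rows copied from the question; SQLite stands in for MySQL here.
rows = [
    (101, "Engineer", "2019-01-01"), (101, "Senior Engineer", "2020-02-01"),
    (102, "Engineer", "2019-01-11"), (102, "Senior Engineer", "2020-02-11"),
    (103, "Engineer", "2019-01-21"), (104, "Engineer", "2019-01-31"),
    (105, "Associate", "2019-01-01"),
    (106, "Associate", "2019-01-11"), (106, "Manager", "2020-02-11"),
    (107, "Associate", "2019-01-21"), (107, "Manager", "2020-02-21"),
    (108, "Associate", "2019-01-31"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (emp_no INTEGER, title TEXT, start_date TEXT)")
conn.executemany("INSERT INTO title VALUES (?, ?, ?)", rows)


def counts_as_of(as_of_date):
    """Count employees per title, using each employee's latest row on or before as_of_date."""
    query = """
        SELECT t.title, COUNT(*)
        FROM title t
        WHERE t.start_date = (SELECT MAX(t2.start_date)
                              FROM title t2
                              WHERE t2.emp_no = t.emp_no
                                AND t2.start_date <= ?)
        GROUP BY t.title
    """
    return dict(conn.execute(query, (as_of_date,)).fetchall())


print(counts_as_of("2019-01-15"))
print(counts_as_of("2020-02-15"))
```

Employees with no title on or before the cutoff drop out, because the subquery returns NULL for them and the equality never matches.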

You can find each employee's latest start date in a derived table, join back to it, and then count. Query as follows:
select t.title, count(*) as Count
from title t
join (select emp_no, max(start_date) as start_date
from title
group by emp_no) subset
on subset.emp_no = t.emp_no and subset.start_date = t.start_date
group by t.title

Related

SQL nested query under WHERE

One of the test questions came with the following schemas and asked for the best doctor in terms of:
Best scored;
The most times/attempts;
For each medical procedure (in terms of name)
[doctor] table
id  first_name  last_name  age
1   Phillip     Singleton  50
2   Heidi       Elliott    34
3   Beulah      Townsend   35
4   Gary        Pena       36
5   Doug        Lowe       45
[medical_procedure] table
id  doctor_id  name           score
1   3          colonoscopy    44
2   1          colonoscopy    37
3   4          ulcer surgery  98
4   2          angiography    79
5   3          angiography    84
6   3          embolization   87
and the list goes on...
The given solution is as follows:
WITH cte AS(
SELECT
name,
first_name,
last_name,
COUNT(*) AS procedure_count,
RANK() OVER(
PARTITION BY name
ORDER BY COUNT(*) DESC) AS place
FROM
medical_procedure p JOIN doctor d
ON p.doctor_id = d.id
WHERE
score >= (
SELECT AVG(score)
FROM medical_procedure pp
WHERE pp.name = p.name)
GROUP BY
name,
first_name,
last_name
)
SELECT
name,
first_name,
last_name
FROM cte
WHERE place = 1;
I would really appreciate an explanation of how the WHERE clause with the subquery works:
How it works in general
Why we must match pp.name and p.name for it to return the correct rows
...
WHERE
score >= (
SELECT AVG(score)
FROM medical_procedure pp
WHERE pp.name = p.name)
...
Thanks a heap!

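The query joins doctor with medical_procedure and groups by procedure name and doctor, because for each procedure you need the doctor with the most attempts among those who scored well.
The subquery is correlated: for each row p of the outer query it computes the average score of the procedures with the same name (pp.name = p.name), and the outer WHERE keeps only the rows that score at or above that per-procedure average. Without the match on name, every row would be compared with the overall average across all procedures.
Several doctors can be at or above the average for a procedure, so RANK() orders them by procedure count within each procedure name, and the final WHERE place = 1 picks the doctor (or tied doctors) with the most such attempts.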
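To see what matching on the name changes, the sketch below runs the filter with and without the correlation on the six medical_procedure rows shown in the question. It uses Python's sqlite3 module as a stand-in database, which is an assumption made only to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE medical_procedure (id INTEGER, doctor_id INTEGER, name TEXT, score INTEGER)"
)
conn.executemany(
    "INSERT INTO medical_procedure VALUES (?, ?, ?, ?)",
    [
        (1, 3, "colonoscopy", 44),
        (2, 1, "colonoscopy", 37),
        (3, 4, "ulcer surgery", 98),
        (4, 2, "angiography", 79),
        (5, 3, "angiography", 84),
        (6, 3, "embolization", 87),
    ],
)

# Correlated: the inner AVG is recomputed for the procedure name of each outer row.
correlated_ids = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM medical_procedure p
    WHERE p.score >= (SELECT AVG(pp.score)
                      FROM medical_procedure pp
                      WHERE pp.name = p.name)
    ORDER BY p.id
""")]

# Uncorrelated: every row is compared with one overall average instead.
uncorrelated_ids = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM medical_procedure p
    WHERE p.score >= (SELECT AVG(pp.score) FROM medical_procedure pp)
    ORDER BY p.id
""")]

print(correlated_ids, uncorrelated_ids)
```

The colonoscopy with score 44 passes the correlated filter (its procedure average is 40.5) but fails the uncorrelated one, while the angiography with score 79 does the opposite.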

How to display the days when there are no records in MariaDB?

I have the following table called employees:
employee  name
101       John
102       Alexandra
103       Ruth
And the table called records:
employee  assistance
101       2022-02-01
101       2022-02-02
101       2022-02-07
Let's suppose that I want to display the employee number, name and the days of the month in which there were absences between 2022-02-01 and 2022-02-07 (taking into account that days 05 and 06 are weekends). In that case, the result would be the following:
employee  name  absence
101       John  3,4
How do I get that result?
So far I have developed a query where the days of the month in which there are attendances are displayed. Said query is as follows:
SELECT e.employee,
e.name,
r.assistance AS assistance
FROM employees e
LEFT JOIN (SELECT employee, GROUP_CONCAT(DISTINCT EXTRACT(DAY FROM assistance)
ORDER BY EXTRACT(DAY FROM assistance)) AS assistance FROM records
WHERE assistance BETWEEN '2022-02-01' AND '2022-02-07' GROUP BY employee) r ON e.employee = r.employee
WHERE (r.employee IS NOT NULL) ORDER BY name ASC
I would like to know how to implement the days in which there were absences and not consider the weekends. I've done several tests but I'm still stuck. I'm working with MariaDB 10.4.11

You can use a recursive common table expression (requires MariaDB 10.2+ or MySQL 8) to generate the list of dates in the date range, and join against that:
with recursive date_range as (
select '2022-02-01' dt
union all
select dt + interval 1 day from date_range where dt < '2022-02-07'
)
select employees.employee, group_concat(day(date_range.dt) order by date_range.dt) faults
from date_range
cross join employees
left join records on records.employee=employees.employee and records.assistance=date_range.dt
where weekday(date_range.dt) < 5 and records.employee is null
group by employees.employee
If you are just looking for one employee, add that as a where condition.
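The same date-range technique can be sketched on SQLite through Python's sqlite3 module. This is an adaptation, not the MariaDB query itself: SQLite's date() and strftime() replace interval arithmetic and weekday(), and the days are grouped in Python instead of with group_concat. The tables and rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (employee INTEGER, name TEXT);
    CREATE TABLE records (employee INTEGER, assistance TEXT);
    INSERT INTO employees VALUES (101, 'John'), (102, 'Alexandra'), (103, 'Ruth');
    INSERT INTO records VALUES
        (101, '2022-02-01'), (101, '2022-02-02'), (101, '2022-02-07');
""")

# Generate every date in the range, pair it with every employee, and keep
# the weekday dates that have no matching attendance record.
absent_rows = conn.execute("""
    WITH RECURSIVE date_range(dt) AS (
        SELECT '2022-02-01'
        UNION ALL
        SELECT date(dt, '+1 day') FROM date_range WHERE dt < '2022-02-07'
    )
    SELECT e.employee, CAST(strftime('%d', d.dt) AS INTEGER) AS day_of_month
    FROM date_range d
    CROSS JOIN employees e
    LEFT JOIN records r
           ON r.employee = e.employee AND r.assistance = d.dt
    WHERE strftime('%w', d.dt) NOT IN ('0', '6')  -- SQLite: 0 = Sunday, 6 = Saturday
      AND r.employee IS NULL
    ORDER BY e.employee, d.dt
""").fetchall()

# Collect the absent days per employee.
absences = {}
for employee, day_of_month in absent_rows:
    absences.setdefault(employee, []).append(day_of_month)
print(absences)
```

Employees with no attendance records at all (102 and 103 here) come out absent on every weekday in the range.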

count() results without using group by

I am attempting something very similar to last example (Using GROUP BY) on this page:
https://thecodedeveloper.com/mysql-count-function/
Referring to the following table of data:
id name salary department
1 Tom 4000 Technology
2 Sam 6000 Sales
3 Bob 3000 Technology
4 Alan 8000 Technology
5 Jack 12000 Marketing
The following query:
SELECT department, COUNT(*) AS "Number of employees"
FROM employees
GROUP BY department;
Will produce the following output:
department Number of employees
Marketing 1
Sales 1
Technology 3
Except I want to see the number of employees in each department as well as every user in the table.
So I want the output to look like this:
id name salary department employees per department
1 Tom 4000 Technology 3
2 Sam 6000 Sales 1
3 Bob 3000 Technology 3
4 Alan 8000 Technology 3
5 Jack 12000 Marketing 1
I have managed to achieve what I want using a second query to test every result from the first query but it is extremely slow and I am convinced that there is a faster way to do it in a single query.

That's a window count. In MySQL 8.0:
select e.*, count(*) over(partition by e.department) as number_of_employees
from employees e
In earlier versions, an alternative uses a correlated subquery:
select e.*,
(select count(*) from employees e1 where e1.department = e.department) as number_of_employees
from employees e
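Both forms can be compared on the sample data with the sketch below, which assumes Python's sqlite3 module with a bundled SQLite of 3.25 or later (the first version with window functions) in place of MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER, name TEXT, salary INTEGER, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Tom", 4000, "Technology"),
        (2, "Sam", 6000, "Sales"),
        (3, "Bob", 3000, "Technology"),
        (4, "Alan", 8000, "Technology"),
        (5, "Jack", 12000, "Marketing"),
    ],
)

# Window count: every row is kept and carries the size of its department.
window_rows = conn.execute("""
    SELECT e.*, COUNT(*) OVER (PARTITION BY e.department) AS number_of_employees
    FROM employees e
    ORDER BY e.id
""").fetchall()

# Correlated subquery: the same result without window functions.
subquery_rows = conn.execute("""
    SELECT e.*,
           (SELECT COUNT(*) FROM employees e1
            WHERE e1.department = e.department) AS number_of_employees
    FROM employees e
    ORDER BY e.id
""").fetchall()

for row in window_rows:
    print(row)
```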

mySQL GROUP, most recent

Here is the data set:
Person Status Date
Eric 1 1/1/2015
Eric 2 2/1/2015
Eric 3 3/1/2015
John 1 3/1/2015
John 2 2/1/2015
John 1 1/1/2015
I'd like to get the most recent date, and its correlated status, grouped by Person. I tried using a subquery to first identify the most recent date:
SELECT MAX(Date), Person FROM tbl1 GROUP BY Person
I then joined that back to the original table, so that for each person I know which date is the most recent. But I'm struggling with how to identify the most recent status. I just don't see the appropriate aggregate function. Thanks.

select tbl1.*
from tbl1
join
(
SELECT Person, MAX(Date) as m_date
FROM tbl1
GROUP BY Person
) tmp on tbl1.Person = tmp.Person
and tbl1.date = tmp.m_date
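A small sketch of this join, run on SQLite through Python's sqlite3 module, is shown below. It assumes the dates are stored in ISO format (reading 1/1/2015 as month/day/year), since MAX() on free-form date strings would compare them as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (Person TEXT, Status INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO tbl1 VALUES (?, ?, ?)",
    [
        ("Eric", 1, "2015-01-01"),
        ("Eric", 2, "2015-02-01"),
        ("Eric", 3, "2015-03-01"),
        ("John", 1, "2015-03-01"),
        ("John", 2, "2015-02-01"),
        ("John", 1, "2015-01-01"),
    ],
)

# The derived table finds each person's latest date; joining back on both
# columns recovers the full row, including the status for that date.
latest_rows = conn.execute("""
    SELECT tbl1.Person, tbl1.Status, tbl1.Date
    FROM tbl1
    JOIN (SELECT Person, MAX(Date) AS m_date
          FROM tbl1
          GROUP BY Person) tmp
      ON tbl1.Person = tmp.Person
     AND tbl1.Date = tmp.m_date
    ORDER BY tbl1.Person
""").fetchall()
print(latest_rows)
```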

Comparing two tables and returning answers that do not appear in both in mysql

The problem I am having relates to comparing two tables and returning the results that do not appear in both. I am using a theatre-based situation with bookings and seat numbers.
So my first table is the seat table looking like this.
row_no area_name
a01 front stalls
a02 front stalls
there are several area names that can be used, but they all use the same format as the above. For this example I will use seats a01 through a20 only in the front stalls.
The second table is the booking table looking like this
ticket_no row_no date_time customer_name
001070714 a01 21:00 7.7.14 John Doe
002070714 a02 21:00 7.7.14 John Doe
What I am trying to achieve is to compare the list of booked seats at a specific showtime with the full list of seats from the seat table, then group the results by area_name so that I hopefully get results like
area_name row_no
front stalls 18
where 18 would be the number of free seats from the complete set of 20 described in the seat table.
How would I set about achieving this answer?
EDIT
This is what I've tried so far
SELECT
DISTINCT s.area_name, row_no
FROM seat AS s
FROM booking AS b
COUNT * WHERE s.row_no != b.row_no
WHERE b.title = rammstein
WHERE s.area_name = 'front stalls'
GROUP BY s.area_name;

Try something like:
SELECT s.area_name, COUNT(*) - COUNT(b.row_no) AS free_seats
FROM seat s
LEFT JOIN booking b
ON b.row_no = s.row_no
AND b.date_time = '21:00, 7.7.14'
WHERE s.area_name = 'front stalls'
GROUP BY s.area_name
COUNT(*) is the total number of seats in the area and COUNT(b.row_no) counts only the seats with a matching booking for that showtime, so the difference is the number of free seats.
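The idea of subtracting the booked seats from the total seats per area can be checked with the sketch below. It is an illustration under assumptions: SQLite via Python's sqlite3 module replaces MySQL, the seat table is filled with the 20 front-stalls seats a01 to a20 described in the question, and date_time is stored as the plain text '21:00 7.7.14' shown in the booking table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seat (row_no TEXT, area_name TEXT)")
conn.execute(
    "CREATE TABLE booking (ticket_no TEXT, row_no TEXT, date_time TEXT, customer_name TEXT)"
)

# Seats a01 .. a20, all in the front stalls.
conn.executemany(
    "INSERT INTO seat VALUES (?, ?)",
    [(f"a{n:02d}", "front stalls") for n in range(1, 21)],
)
conn.executemany(
    "INSERT INTO booking VALUES (?, ?, ?, ?)",
    [
        ("001070714", "a01", "21:00 7.7.14", "John Doe"),
        ("002070714", "a02", "21:00 7.7.14", "John Doe"),
    ],
)

# LEFT JOIN keeps every seat; COUNT(b.row_no) ignores the NULLs produced for
# unbooked seats, so total seats minus booked seats gives the free seats.
free_seats = conn.execute("""
    SELECT s.area_name, COUNT(*) - COUNT(b.row_no) AS free_seats
    FROM seat s
    LEFT JOIN booking b
           ON b.row_no = s.row_no
          AND b.date_time = ?
    WHERE s.area_name = 'front stalls'
    GROUP BY s.area_name
""", ("21:00 7.7.14",)).fetchall()
print(free_seats)
```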
