Advanced MySQL Query select and compare time in one field - mysql

Employees
EmpID : int(10)
Firstname: varchar(100)
Lastname: varchar(100)
HireDate: timestamp
TerminationDate: timestamp
AnnualReviews
EmpID: int(10)
ReviewDate: timestamp
What query returns each employee and, for each employee row, includes the greatest number of employees that worked for the company at any time during their tenure and the first date that maximum was reached?
So far, this is my query:
select *, (select count(empid) from employees where terminationdate between t.hiredate and t.terminationdate)
from employees as t
group by empid

What you have is close.
But there's more work to do.
We'd need to work out the conditions that determine how many employees were "working" at any point in time (i.e. at a given timestamp value). The condition I'd check:
HireDate <= timestamp < TerminationDate
We'd need to extend that comparison so that a NULL value for TerminationDate is handled as if it were a point in time after the timestamp value. That's easy enough to do:
HireDate <= timestamp AND ( timestamp < TerminationDate OR TerminationDate IS NULL )
So, something like this:
SELECT COUNT(1)
FROM Employees e
WHERE ( :timestamp >= e.HireDate )
AND ( :timestamp < e.TerminationDate OR e.TerminationDate IS NULL)
That "count" value stays the same between events; it only changes at a "hire" or "terminate" event.
If we got a distinct list of all timestamps for all "hire" and "terminate" events, we could get the number of employees at that point in time.
So, this query would give us the employee count every time the employee count might change:
SELECT t.ts AS `as_of`
, COUNT(1) AS `employee_count`
FROM Employees e
JOIN ( SELECT t.TerminationDate AS ts
FROM Employees t
WHERE t.TerminationDate IS NOT NULL
GROUP BY t.TerminationDate
UNION
SELECT h.HireDate AS ts
FROM Employees h
WHERE h.HireDate IS NOT NULL
GROUP BY h.HireDate
) t
ON ( t.ts >= e.HireDate )
AND ( t.ts < e.TerminationDate OR e.TerminationDate IS NULL)
GROUP BY t.ts
We could use that result (as an inline view), join it to a particular employee, and keep just the rows whose as_of timestamp falls within that employee's period of employment. Then we just pull out the maximum employee_count. It wouldn't be difficult to identify the earliest of multiple as_of dates if that maximum employee_count occurred more than once.
(The wording of the question leaves one point open: does it want the earliest date ever on which the employee count met or exceeded the maximum that occurred during an employee's tenure, or just the earliest date within the employee's tenure on which the maximum was reached? It's possible to get either result.)
That's just one way to approach the problem.
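The steps described above can be sketched end to end. This is only a minimal sketch under stated assumptions, not a definitive implementation: it uses Python's built-in sqlite3 module with an in-memory database in place of MySQL, the sample rows are invented for illustration, dates are stored as ISO-8601 text so that they compare correctly as strings, and it takes the second reading of the question (the earliest date within the employee's own tenure). The CTE names events, headcount and tenure are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (
    EmpID INTEGER PRIMARY KEY,
    HireDate TEXT NOT NULL,
    TerminationDate TEXT          -- NULL means still employed
);
-- Invented sample data, for illustration only
INSERT INTO Employees VALUES
    (1, '2020-01-01', '2020-06-01'),
    (2, '2020-02-01', NULL),
    (3, '2020-03-01', '2020-04-01'),
    (4, '2020-07-01', NULL);
""")

QUERY = """
WITH events AS (        -- every timestamp at which the count can change
    SELECT HireDate AS ts FROM Employees
    UNION
    SELECT TerminationDate FROM Employees WHERE TerminationDate IS NOT NULL
),
headcount AS (          -- employee count as of each event
    SELECT ev.ts AS as_of, COUNT(*) AS employee_count
    FROM events ev
    JOIN Employees e
      ON ev.ts >= e.HireDate
     AND (ev.ts < e.TerminationDate OR e.TerminationDate IS NULL)
    GROUP BY ev.ts
),
tenure AS (             -- events that fall inside each employee's tenure
    SELECT e.EmpID, h.as_of, h.employee_count
    FROM Employees e
    JOIN headcount h
      ON h.as_of >= e.HireDate
     AND (h.as_of < e.TerminationDate OR e.TerminationDate IS NULL)
)
SELECT t.EmpID, t.employee_count AS max_count, MIN(t.as_of) AS first_reached
FROM tenure t
WHERE t.employee_count = (SELECT MAX(t2.employee_count)
                          FROM tenure t2
                          WHERE t2.EmpID = t.EmpID)
GROUP BY t.EmpID, t.employee_count
ORDER BY t.EmpID
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this sample data, employees 1 to 3 all see a maximum of 3 colleagues first reached on 2020-03-01, while employee 4 sees a maximum of 2 on 2020-07-01. In MySQL 8 the same CTE structure should carry over; on older versions the CTEs would need to be written as inline views.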

Related

How to get weekly data from a timestamp?

I have two tables, "Gate_Logs" and "Employee".
The "Gate_Logs" table has three columns.
Employee ID - A 4 Digit Unique Number assigned to every employee
Status – In or Out
Timestamp – A recorded timestamp
The "Employee" Table has
Employee ID
Level
Designation
Joining Date
Reporting Location
Reporting Location ID - Single Digit ID
I want to find out which employee had the highest weekly work time over the past year, and I am trying to get this data for each individual location. I want to look at the cumulative highest. Let's say Employee X at Location L worked 60 hours in a particular week, which was the highest at that location, so X will be the person I wanted to query.
Please provide any pointers on how I can proceed with this, have been stuck at it for a while.
SQL version 8.0.27
You can use the window function LAG to pair In/Out records. The query is built in steps:
periods - pair in/out records
sumup_weekly - compute weekly work hours for each employee
rank_weekly - rank employees per location per week
and finally select the rows ranked first
WITH periods AS (
SELECT
`employee_id`,
`status` to_status,
`timestamp` to_timestamp,
LAG(`status`) OVER w AS fr_status,
LAG(`timestamp`) OVER w AS fr_timestamp
FROM gate_log
WINDOW w AS (PARTITION BY `employee_id` ORDER BY `timestamp` ASC)
),
sumup_weekly AS (
SELECT
`employee_id`,
WEEKOFYEAR(fr_timestamp) week,
SUM(TIMESTAMPDIFF(SECOND, fr_timestamp, to_timestamp)) seconds
FROM periods
WHERE fr_status = 'In' AND to_status = 'Out'
GROUP BY `employee_id`, `week`
),
rank_weekly AS (
SELECT
e.`employee_id`,
e.`location_id`,
w.`week`,
SEC_TO_TIME(w.`seconds`) work_hours,
RANK() OVER(PARTITION BY e.`location_id`, w.`week` ORDER BY w.`seconds` DESC) rank_hours
FROM sumup_weekly w
JOIN employee e ON w.`employee_id` = e.`employee_id`
)
SELECT *
FROM rank_weekly
WHERE rank_hours = 1

Select NULL otherwise latest date per group

I am trying to pick the Account with a NULL End Date first, and otherwise the one with the latest date, when there are several accounts for the same item.
Table Sample
Result expected
Select distinct *
from Sample
where End Date is null
Need help to display the output.
Select *
from Sample
order by End_Date is not null, End_date desc
Based on the sample, it seems you need a UNION and a correlated NOT EXISTS subquery:
select * from table_name t where t.enddate is null
union
select * from table_name t
where t.enddate = ( select max(t1.enddate) from table_name t1 where t1.Item = t.Item and t1.Account = t.Account )
and not exists ( select 1 from table_name t2
where t2.enddate is null and t2.Item = t.Item
)
SELECT * FROM YourTable ORDER BY End_Date IS NOT NULL, End_Date DESC
In a derived table, you can determine the end_date_to_consider for every Item (using GROUP BY Item). Aggregates such as MIN() and MAX() skip NULLs, so compare COUNT(*) with COUNT(end_date) to detect a NULL end date: if any row in the group has a NULL end_date, we consider NULL; otherwise we consider the MAX() date.
Now, we can join this back to the main table on Item and the end_date to get the required rows.
Try:
SELECT t.*
FROM
Sample AS t
JOIN
(
SELECT
Item,
IF(COUNT(*) > COUNT(end_date),
NULL,
MAX(end_date)) AS end_date_to_consider
FROM Sample
GROUP BY Item
) AS dt
ON dt.Item = t.Item AND
(dt.end_date_to_consider = t.end_date OR
(dt.end_date_to_consider IS NULL AND
t.end_date IS NULL)
)
First of all you should state clearly which result rows you want: You want one result row per Item and TOU. For each Item/TOU pair you want the row with highest date, with null having precedence (i.e. being considered the highest possible date).
Is this correct? Does that work with your real accounts? In your example it is always that all rows for one account have a higher date than all other account rows. If that is not the case with your real accounts, you need something more sophisticated than the following solution.
The highest date you can store in MySQL is 9999-12-31. Use this to treat the null dates as desired. Then it's just two steps:
Get the highest date per item and tou.
Get the row for these item, tou and date.
The query:
select * from
sample
where (item, tou, coalesce(enddate, date '9999-12-31')) in
(
select item, tou, max(coalesce(enddate, date '9999-12-31'))
from sample
group by item, tou
)
order by item, tou;
(If it is possible for your enddate to have the value 9999-12-31 and you want null to have precedence over it, then you must account for this in the query, i.e. you can no longer simply substitute this date for null, and the query will get more complicated.)
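The COALESCE approach above can be tried on a few rows. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL (the bundled SQLite must be recent enough to support row values in IN), dates stored as ISO-8601 text, and invented sample rows; the account column and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample (item TEXT, tou TEXT, account TEXT, enddate TEXT);
-- Invented sample data, for illustration only
INSERT INTO sample VALUES
    ('A', 'peak', 'acc1', '2018-01-31'),
    ('A', 'peak', 'acc2', NULL),           -- open-ended row should win for item A
    ('B', 'peak', 'acc3', '2017-05-31'),
    ('B', 'peak', 'acc4', '2018-09-30');   -- latest date should win for item B
""")

# NULL end dates are mapped to the sentinel '9999-12-31' so that MAX()
# treats them as the highest possible date within each (item, tou) group.
QUERY = """
SELECT item, tou, account, enddate
FROM sample
WHERE (item, tou, COALESCE(enddate, '9999-12-31')) IN (
    SELECT item, tou, MAX(COALESCE(enddate, '9999-12-31'))
    FROM sample
    GROUP BY item, tou
)
ORDER BY item, tou
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this data, item A yields the acc2 row (NULL end date) and item B yields the acc4 row (latest end date).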

MySQL using count in query to search for availability of multiple rows

I am using one table, mrp to store multi room properties and a second table booking to store the dates the property was booked on.
I thus have the following tables:
mrp(property_id, property_name, num_rooms)
booking(property_id, booking_id, date)
Whenever a property is booked, an entry is made in the booking table, and because each property has multiple rooms, it can have multiple bookings on the same day.
I am using the following query:
SELECT * FROM mrp
WHERE property_id
NOT IN (SELECT property_id FROM booking WHERE `date` >= {$checkin_date} AND `date` <= {$checkout_date}
)
This query works fine for a property with a single room (that is, it only lists properties which have not been booked at all between the dates you provide), but it does not display properties that have been booked and still have vacant rooms. How can we use COUNT and the num_rooms column to show the properties that still have vacant rooms, even if they already have a booking between the selected dates, along with the number of rooms that are free?
You need 3 levels of query. The innermost query will list properties and dates where all rooms are fully booked (or overbooked) on any day within your date range. The middle query narrows that down to just a list of property_id's. The outermost query lists all properties that are NOT in that list.
SELECT *
FROM mrp
WHERE property_id NOT IN (
-- List all properties sold-out on any day in range
SELECT DISTINCT Z.property_id
FROM (
-- List sold-out properties by date
SELECT MM.property_id, MM.num_rooms, BB.adate
, COUNT(*) as rooms_booked
FROM mrp MM
INNER JOIN booking BB on MM.property_id = BB.property_id
WHERE BB.adate >= #checkin AND BB.adate <= #checkout
GROUP BY MM.property_id, MM.num_rooms, BB.adate
HAVING MM.num_rooms - COUNT(*) <= 0
) as Z
)
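The question also asks for the number of free rooms, which the query above does not report. One possible way to get it is sketched below, assuming Python's built-in sqlite3 module in place of MySQL, invented sample rows, and the interpretation that a property's free rooms over a stay equal num_rooms minus the busiest single day in the range. The alias rooms_free and the :checkin / :checkout parameter names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mrp (property_id INTEGER PRIMARY KEY, property_name TEXT, num_rooms INTEGER);
CREATE TABLE booking (property_id INTEGER, booking_id INTEGER, date TEXT);
-- Invented sample data, for illustration only
INSERT INTO mrp VALUES (1, 'Seaview', 2), (2, 'Hilltop', 3), (3, 'Lakeside', 1);
INSERT INTO booking VALUES
    (1, 101, '2024-05-01'), (1, 102, '2024-05-01'),   -- Seaview is sold out on May 1
    (2, 103, '2024-05-01'), (2, 104, '2024-05-02'), (2, 105, '2024-05-02');
""")

# Inner query: rooms booked per property per day within the range.
# Outer query: subtract the busiest day from num_rooms; properties with no
# bookings in range have no matching rows, so COALESCE turns NULL into 0.
QUERY = """
SELECT m.property_id, m.property_name,
       m.num_rooms - COALESCE(MAX(d.rooms_booked), 0) AS rooms_free
FROM mrp m
LEFT JOIN (
    SELECT property_id, date, COUNT(*) AS rooms_booked
    FROM booking
    WHERE date >= :checkin AND date <= :checkout
    GROUP BY property_id, date
) d ON d.property_id = m.property_id
GROUP BY m.property_id, m.property_name, m.num_rooms
HAVING m.num_rooms - COALESCE(MAX(d.rooms_booked), 0) > 0
ORDER BY m.property_id
"""

rows = conn.execute(QUERY, {"checkin": "2024-05-01", "checkout": "2024-05-03"}).fetchall()
for row in rows:
    print(row)
```

Here Seaview is excluded because both of its rooms are taken on one day of the stay, while Hilltop and Lakeside each report one free room. Passing the dates as bound parameters, rather than interpolating them into the SQL string, also avoids quoting problems.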
You are close but you need to change the dates condition and add a condition to match the records from the outer and inner queries (all in the inner query's WHERE clause):
SELECT * FROM srp
WHERE NOT EXISTS
(SELECT * FROM bookings_srp
WHERE srp.booking_id = bookings_srp.booking_id
AND `date` >= {$check-in_date} AND `date` <= {$check-out_date})
You have to exclude the properties which are booked between the checkin date and checkout date. This query should do:
SELECT * FROM srp WHERE property_id NOT IN (
SELECT property_id FROM booking WHERE `date` >= {$checkin_date} AND `date` <= {$checkout_date}
)

SQL GROUP BY return empty set

I have a table rental:
rentalId int(11)
Customer_customerId int(11)
Vehicle_registrationNumber varchar(20)
startDate datetime
endDate datetime
pickUpLocation int(11)
returnLocation int(11)
booking_time timestamp
timePickedUp timestamp
timeReturned timestamp
and table payment:
paymentId int(11)
Rental_rentalId int(11)
amountDue decimal(10,2)
amountPaid decimal(10,2)
paymentDate timestamp
I run two GROUP BY queries. The first one counts the number of reservations and sums the payments by day. It only works as expected when the `pickUpLocation` condition in HAVING is omitted; otherwise it returns incorrect values:
SELECT COUNT(rentalId) AS number_of_rentals, MONTH(booking_time) AS month,
YEAR(booking_time) AS year,
CONCAT(DAY(booking_time), '-', MONTH(booking_time), '-',
YEAR(booking_time) ) AS date, SUM(amountDue) AS total_value, SUM(amountPaid) AS
total_paid, `pickUpLocation`
FROM (`rental`)
JOIN `payment` ON `payment`.`Rental_rentalId` = `rental`.`rentalId`
GROUP BY DAY(booking_time)
HAVING `month` = 2
AND `year` = 2012
AND `pickUpLocation` = 1
ORDER BY `booking_time` desc
LIMIT 31
The second function is expected to sum the reservations and payments (both due and received) for the entire month, for a specific location:
SELECT COUNT(rentalId) AS number_of_rentals, MONTH(booking_time) AS month,
YEAR(booking_time) AS year, SUM(amountDue) AS total_value,
SUM(amountPaid) AS total_paid,
`pickUpLocation`
FROM (`rental`)
JOIN `payment` ON `payment`.`Rental_rentalId` = `rental`.`rentalId`
GROUP BY MONTH(booking_time)
HAVING `month` = 2
AND `year` = 2012
AND `pickUpLocation` = 1
ORDER BY `booking_time` desc
It works for some locations and doesn't work for others (returns correct set when there are many reservations, but when there are only few, it returns empty set). I use MySQL. Any help greatly appreciated.
You're doing an inner join between rental and payment which means you will only ever get rentals that have been paid for. If you want to find rentals without payment info too in your result, you need to use a LEFT JOIN instead of just an (inner) JOIN.
Note that that may result in NULLs in your result if there are no payments to account for, so you may have to adjust the output of your query using one of the control flow functions.
Edit: You're also GROUPing before your conditions, that will GROUP all rows for a month into one single row. Since the year and the PickupLocation may vary, you will get random values (of the ones available) in those two fields. HAVING will then filter on those random fields, leaving you with a possibly empty result set. WHERE on the other hand will see every row before GROUPing and do the right thing (tm) on a row to row basis, so the conditions should be put there instead.
(The same change should probably be done to your first, working, query)
You may need to push some conditions from HAVING to WHERE clause:
WHERE YEAR(booking_time) = 2012
AND MONTH(booking_time) = 2
AND `pickUpLocation` = 1
GROUP BY DAY(booking_time)
LIMIT 31
For a specific month, you don't even need the GROUP BY:
WHERE YEAR(booking_time) = 2012
AND MONTH(booking_time) = 2
AND `pickUpLocation` = 1
The above condition is not very good regarding performance:
WHERE YEAR(booking_time) = 2012
AND MONTH(booking_time) = 2
You should change it into:
WHERE booking_time >= '2012-02-01'
AND booking_time < '2012-03-01'
so the query can use an index on booking_time (if you have or you add one in the future) and so it doesn't call the YEAR() and MONTH() functions for every row of the table.
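The advice above (filter in WHERE before grouping, use a half-open date range, and use LEFT JOIN to keep unpaid rentals) can be combined into one query. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL, timestamps stored as ISO-8601 text, at most one payment row per rental, and invented sample rows. SQLite's DATE() is used for the per-day grouping here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rental (rentalId INTEGER PRIMARY KEY, pickUpLocation INTEGER, booking_time TEXT);
CREATE TABLE payment (paymentId INTEGER PRIMARY KEY, Rental_rentalId INTEGER,
                      amountDue REAL, amountPaid REAL);
-- Invented sample data, for illustration only
INSERT INTO rental VALUES
    (1, 1, '2012-02-03 10:00:00'),
    (2, 1, '2012-02-03 15:30:00'),
    (3, 2, '2012-02-03 09:00:00'),   -- other location, filtered out
    (4, 1, '2012-03-01 00:00:00'),   -- outside February, filtered out
    (5, 1, '2012-02-29 23:59:59');   -- in range, but no payment row yet
INSERT INTO payment VALUES
    (1, 1, 100.0, 100.0), (2, 2, 50.0, 20.0), (3, 3, 70.0, 70.0), (4, 4, 80.0, 0.0);
""")

# Row-level conditions live in WHERE, so they are applied before grouping.
# The half-open range [2012-02-01, 2012-03-01) covers all of February.
QUERY = """
SELECT DATE(r.booking_time) AS day,
       COUNT(r.rentalId) AS number_of_rentals,
       COALESCE(SUM(p.amountDue), 0) AS total_value,
       COALESCE(SUM(p.amountPaid), 0) AS total_paid
FROM rental r
LEFT JOIN payment p ON p.Rental_rentalId = r.rentalId
WHERE r.booking_time >= '2012-02-01'
  AND r.booking_time <  '2012-03-01'
  AND r.pickUpLocation = 1
GROUP BY DATE(r.booking_time)
ORDER BY day
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this data the result has one row for 2012-02-03 (two rentals) and one for 2012-02-29 (one rental with zero totals, kept by the LEFT JOIN). If a rental can have several payment rows, COUNT(DISTINCT r.rentalId) would be needed to avoid counting it more than once.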

Is this query possible?

Here is my table:
table employee (
char(50) name
datetime startdate
datetime finishdate
)
Assume that at least one employee started every day since the start of the business, so that
select distinct startdate from employee
would return every day the business was open. I've already provided a query to get every day the business was open, but would it be possible to pair each day with the number of employees that were employed on that day? Essentially, I am asking whether it is possible to count the number of employees for which (startdate <= day AND finishdate >= day) is true for each day, and to return that relation in one query.
SELECT x.startDate, COUNT(e.startDate)
FROM (SELECT DISTINCT startDate FROM employee) AS x
INNER JOIN employee AS e
ON x.startDate BETWEEN e.startDate AND e.finishDate
GROUP BY x.startDate
SELECT e1.startdate, COUNT(*) AS num_employed_on_date
FROM employee e1 INNER JOIN employee e2
ON e1.startdate BETWEEN e2.startdate and e2.finishdate
GROUP BY e1.startdate
'startdate' holds the date an employee joined the company.
If you get MIN for startdate and MAX for finishdate, you'll have the entire span of time the company was open.
What you suggested would return every date at least one employee joined the company / started working.
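The first join-based query above can be tried on a few rows. The following is a minimal sketch, assuming Python's built-in sqlite3 module in place of MySQL, dates stored as ISO-8601 text, and invented employee rows; the alias num_employed is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (name TEXT, startdate TEXT, finishdate TEXT);
-- Invented sample data, for illustration only
INSERT INTO employee VALUES
    ('Ann', '2021-01-04', '2021-01-06'),
    ('Bob', '2021-01-05', '2021-01-05'),
    ('Cy',  '2021-01-06', '2021-01-08');
""")

# The derived table x lists each distinct start date once; the join then
# matches every employee whose employment interval contains that date.
QUERY = """
SELECT x.startdate, COUNT(e.startdate) AS num_employed
FROM (SELECT DISTINCT startdate FROM employee) AS x
JOIN employee AS e
  ON x.startdate BETWEEN e.startdate AND e.finishdate
GROUP BY x.startdate
ORDER BY x.startdate
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

BETWEEN is inclusive on both ends, so Ann still counts on her finish date. The DISTINCT in the derived table matters: in the variant that joins employee to itself directly, two employees starting on the same day would each contribute a copy of the matches and inflate the count for that day.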