MySQL calculating time difference when one data point is null - mysql

I have a query that runs on a patient record system:
SELECT WardTransactions.Id ID,
Genders.Description Gender,
Wards.Code Ward,
TIME_TO_SEC(TIMEDIFF(DischargeDateTime, AdmissionDateTime)) Duration
from WardTransactions
JOIN Wards on WardTransactions.WardId=Wards.Id
JOIN Demographics on WardTransactions.DemographicId=Demographics.Id
JOIN Genders on Demographics.GenderId=Genders.Id
JOIN Visits on WardTransactions.VisitId=Visits.Id
The issue is that, at the time the query is run, DischargeDateTime may be null because the patient is still in the ward. I need to include that record in the calculation, but with DischargeDateTime set to the current time. I intend to use the Duration data in a JasperReports variable to calculate total, max, min and average times.
I am not sure how to build a query to resolve this issue.

SELECT WardTransactions.Id ID,
Genders.Description Gender,
Wards.Code Ward,
TIME_TO_SEC(TIMEDIFF(
(case WHEN DischargeDateTime IS NULL THEN NOW() ELSE DischargeDateTime END), AdmissionDateTime)) Duration
from WardTransactions
JOIN Wards on WardTransactions.WardId=Wards.Id
JOIN Demographics on WardTransactions.DemographicId=Demographics.Id
JOIN Genders on Demographics.GenderId=Genders.Id
JOIN Visits on WardTransactions.VisitId=Visits.Id

If I understand you, you want NOW() to be used as the discharge timestamp if the patient isn't yet discharged.
Use this expression:
TIME_TO_SEC(TIMEDIFF(IFNULL(DischargeDateTime,NOW()), AdmissionDateTime)) Duration
and you'll get what you need.
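To see the null substitution in action, here is a minimal sketch. It makes several assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, the table is cut down to three columns, the sample rows are invented, and a fixed `now` value replaces NOW() so the result is reproducible.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption so the sketch runs anywhere).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE WardTransactions ("
    "Id INTEGER PRIMARY KEY, AdmissionDateTime TEXT, DischargeDateTime TEXT)"
)
# Invented sample rows: one discharged patient, one still in the ward.
conn.executemany(
    "INSERT INTO WardTransactions VALUES (?, ?, ?)",
    [
        (1, "2024-01-01 08:00:00", "2024-01-01 10:00:00"),
        (2, "2024-01-01 09:00:00", None),
    ],
)

# Fixed stand-in for NOW(); MySQL would evaluate NOW() at query time.
now = "2024-01-01 12:00:00"

# IFNULL swaps in the current time only where the discharge time is NULL.
rows = conn.execute(
    """
    SELECT Id,
           CAST(strftime('%s', IFNULL(DischargeDateTime, :now)) AS INTEGER)
         - CAST(strftime('%s', AdmissionDateTime) AS INTEGER) AS Duration
    FROM WardTransactions
    ORDER BY Id
    """,
    {"now": now},
).fetchall()

print(rows)  # [(1, 7200), (2, 10800)]
```

One thing worth checking on the MySQL side: TIMEDIFF() returns a TIME value, which is limited to 838:59:59, so very long stays may be clipped. TIMESTAMPDIFF(SECOND, AdmissionDateTime, IFNULL(DischargeDateTime, NOW())) returns the seconds directly and does not have that limit.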

Related

Subquery returns more than 1 row, how can I solve?

Question: Show the category of competitions that have always been hosted in the same country during May 2010. What is wrong with my query?
select Category
from competition
where Date >= '2010-01-01' and Date <= '2010-12-31'
group by Country, Category
having count(*) = (select count(*)
from competition
where Date >= '2010-01-01' and Date <= '2010-12-31'
group by Category)
You don't need two queries. Just use one query that checks that the count of countries is 1.
select category, count(DISTINCT country) AS country_count
from competition
where Date BETWEEN '2010-05-01' and '2010-05-31'
group by Category
HAVING country_count = 1
I also corrected the dates to be just May, not the whole year 2010.
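A quick way to check the COUNT(DISTINCT ...) approach is to run it on a tiny table. This is only a sketch: SQLite stands in for MySQL, and the sample rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE competition (Category TEXT, Country TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO competition VALUES (?, ?, ?)",
    [
        ("Chess", "France", "2010-05-03"),
        ("Chess", "France", "2010-05-20"),   # same country twice in May
        ("Rowing", "Italy", "2010-05-05"),
        ("Rowing", "Spain", "2010-05-25"),   # second country in May
        ("Rowing", "Italy", "2010-06-02"),   # outside May, ignored
    ],
)

# Keep only the categories whose May 2010 events all share one country.
rows = conn.execute(
    """
    SELECT Category, COUNT(DISTINCT Country) AS country_count
    FROM competition
    WHERE Date BETWEEN '2010-05-01' AND '2010-05-31'
    GROUP BY Category
    HAVING COUNT(DISTINCT Country) = 1
    """
).fetchall()

print(rows)  # [('Chess', 1)]
```

If Date is a DATETIME rather than a DATE, BETWEEN ... AND '2010-05-31' stops at midnight on the 31st, so `Date >= '2010-05-01' AND Date < '2010-06-01'` is the safer filter.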
Remove the GROUP BY from the subquery in your HAVING clause; that is what makes it return more than one row. If you give me an example dataset and the output you want, I can help you more.
I'd try something like this to start with:
SELECT COUNTRY
, CATEGORY
, COUNT(COUNTRY)
FROM COMPETITION
WHERE DATE BETWEEN '2010-05-01' AND '2010-05-31'
GROUP BY COUNTRY, CATEGORY
ORDER BY CATEGORY DESC, COUNT(COUNTRY) DESC
;
Your original query's date limits cover the whole of 2010, but you specified you only wanted May 2010. If the Date column is a date or datetime type, you may need to cast the string to the appropriate datatype.
Your question asked "always hosted by one country": do you know that a competition is only going to be hosted by one country during that particular month? If you do, you're pretty much done. If you don't, however, then you need to clarify what your criteria really are.

How to count the number of rows with a join in Talend on Oracle

I have 3 tables:
supplier(id_supp, name, adress, ...)
Customer(id_cust, name, adress, ...)
Order(id_order, ref_cust, ref_supp, date_order...)
I want to make a job that counts the number of orders by Supplier, for last_week, last_two_weeks with Talend
select
supp.name,
(
select
count(*)
from
order
where
date_order between sysdate-7 and sysdate
and ref_supp=id_supp
) as week_1,
(
select
count(*)
from
order
where
date_order between sysdate-14 and sysdate-7
and ref_supp=id_supp
) as week_2
from supplier supp
The reason I'm asking is that my query takes too much time.
You need a join between supplier and order to get supplier names. I show an inner join, but if you need ALL suppliers (even those with no orders in the order table) you may change it to a left outer join.
Other than that, you should only have to read the order table once and get all the info you need. Your query does more than one pass (read EXPLAIN PLAN for your query), which may be why it is taking too long.
NOTE: sysdate has a time-of-day component (and perhaps the date_order value does too); the way you wrote the query may or may not do exactly what you want it to do. You may have to surround sysdate by trunc().
select s.name,
count(case when o.date_order between sysdate - 7 and sysdate then 1 end)
as week_1,
count(case when o.date_order between sysdate - 14 and sysdate - 7 then 1 end)
as week_2
from supplier s inner join order o
on s.id_supp = o.ref_supp
group by s.name
;
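The single-pass idea can be tried on a small data set. The sketch below rests on several assumptions: SQLite stands in for Oracle, the table is named `orders` because ORDER is a reserved word, the rows are invented, and a fixed `today` replaces sysdate. It uses a left join so suppliers without orders still appear, and half-open ranges so an order dated exactly seven days ago is counted in one week only.

```python
import sqlite3

# SQLite stands in for Oracle; "orders" replaces the reserved word ORDER.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE supplier (id_supp INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id_order INTEGER PRIMARY KEY, ref_supp INTEGER, date_order TEXT);
    INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO orders VALUES
        (1, 1, '2024-03-09'),
        (2, 1, '2024-03-08'),
        (3, 1, '2024-03-01'),
        (4, 2, '2024-02-28');
    """
)

# Fixed stand-in for sysdate so the counts are reproducible.
today = "2024-03-10"

# One pass over orders: each CASE yields 1 or NULL, and COUNT skips NULLs.
rows = conn.execute(
    """
    SELECT s.name,
           COUNT(CASE WHEN o.date_order >  date(:today, '-7 days')
                       AND o.date_order <= :today THEN 1 END) AS week_1,
           COUNT(CASE WHEN o.date_order >  date(:today, '-14 days')
                       AND o.date_order <= date(:today, '-7 days') THEN 1 END) AS week_2
    FROM supplier s
    LEFT JOIN orders o ON s.id_supp = o.ref_supp
    GROUP BY s.name
    ORDER BY s.name
    """,
    {"today": today},
).fetchall()

print(rows)  # [('Acme', 2, 1), ('Globex', 0, 1), ('Initech', 0, 0)]
```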

MySQL right outer join query

I have a question about a query in MySQL.
I have 2 tables. One contains SalesRep details like name, email, etc. The other holds the sales data, which has reportDate, customers served and a link to the sales rep via a foreign key. One thing to note is that the reportDate is always a Friday.
So the requirement is this: I need to find sales data for a 13 week period for a given list of sales reps, with 0 as customers served if there is no data for a particular Friday. The query result is consumed by a Java application which relies on the 13 rows of data per sales rep.
I have created a table with all the Friday dates populated and wrote an outer join like below:
select * from (
select name, customersServed, reportDate
from Sales_Data salesData
join `SALES_REPRESENTATIVE` salesRep on salesRep.`employeeId` = salesData.`employeeId`
where employeeId = 1
) as result
right outer join fridays on fridays.datefield = reportDate
where fridays.datefield between '2014-10-01' and '2014-12-31'
order by datefield
Now my questions:
Is there any way I can get the name to be populated for all 13 rows in the above query?
If there are 2 sales reps, I'd like to use an IN clause and expect 26 rows in total: 13 rows per sales rep (even if there is no record for that person, I'd still like to see 13 rows of nulls), and 39 for 3 sales reps.
Can this be done in MySQL and, if so, can anyone point me in the right direction?
First select the rows you need (every sales rep paired with every Friday, without customersServed), then outer join the sales data to pick up customersServed.
Something like this:
select records.name, records.datefield, IFNULL(salesData.customersServed,0)
from (
select employeeId, name, datefield
from `SALES_REPRESENTATIVE`, fridays
where fridays.datefield between '2014-10-01' and '2014-12-31'
and employeeId in (...)
) as records
left outer join `Sales_Data` salesData on (salesData.employeeId = records.employeeId and salesData.reportDate = records.datefield)
order by records.name, records.datefield
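The pattern above (build every rep and Friday pair first, then left join the sales data) can be sketched on a small data set. Assumptions: SQLite stands in for MySQL, only three Fridays are loaded instead of 13, and the names and figures are invented.

```python
import sqlite3

# SQLite stands in for MySQL; reps, Fridays and figures are invented.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE SALES_REPRESENTATIVE (employeeId INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fridays (datefield TEXT PRIMARY KEY);
    CREATE TABLE Sales_Data (employeeId INTEGER, reportDate TEXT, customersServed INTEGER);
    INSERT INTO SALES_REPRESENTATIVE VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO fridays VALUES ('2014-10-03'), ('2014-10-10'), ('2014-10-17');
    INSERT INTO Sales_Data VALUES (1, '2014-10-03', 5), (1, '2014-10-17', 7);
    """
)

# CROSS JOIN yields one row per (rep, Friday); the LEFT JOIN then attaches
# sales data where it exists, and IFNULL turns the gaps into 0.
rows = conn.execute(
    """
    SELECT r.name, f.datefield, IFNULL(d.customersServed, 0) AS customersServed
    FROM SALES_REPRESENTATIVE r
    CROSS JOIN fridays f
    LEFT JOIN Sales_Data d
           ON d.employeeId = r.employeeId AND d.reportDate = f.datefield
    WHERE f.datefield BETWEEN '2014-10-01' AND '2014-12-31'
      AND r.employeeId IN (1, 2)
    ORDER BY r.name, f.datefield
    """
).fetchall()

for row in rows:
    print(row)
```

Every rep gets one row per Friday, including Bob, who has no sales rows at all, so the name is populated on every line.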
Alternatively, use two levels of nesting. In your nested query, change to an outer join on the sales rep table so that you have at least 1 record for each rep; then join with fridays without any condition so that you have at least 13 records for each rep; then do the final right outer join with the condition (fridays.datefield = innerfriday.datefield and (reportDate is null or reportDate=innerfriday.datefield)).
This is very inefficient; unless the data set is very small, try doing it in application code instead.

Mysql time calculation with join

I have two tables: sales, actions
Sales table:
id, datetime, status
--------------------
Actions table:
id, datetime, sales_id, action
------------------------------
There's a many-to-one relationship between the actions and sales tables. For each sales record, there could be numerous actions. I am trying to determine, for each hour of the day, the average time difference between when a sales record is first created and when the first action record associated with its respective sales record was created.
In other words, how fast (in hours) are sales agents responding to leads, based on what hour of the day the lead came in.
Here's what I tried:
SELECT
FROM_UNIXTIME(sales.datetime, '%H') as Hour,
count(actions.id) AS actions,
(MIN(actions.datetime) - sales.datetime) / 3600 as Lag
FROM
actions
INNER JOIN sales ON actions.sales_id = sales.id
group by Hour
I get what looks like reasonable hours numbers for 'Lag', but I am not convinced they're accurate:
Hour Actions Lag
00 66 11.0442
01 30 11.2758
02 50 8.2900
03 25 5.7492
.
.
.
23 77 34.4744
My question is, is this the correct way to get the values for the first action that was recorded for a given sales record? :
(MIN(actions.createDate) - sales.createDate) / 3600 as Lag
It should be:
MIN(actions.datetime - sales.datetime) / 3600 AS Lag
Your version takes the first action from any sale within the hour and subtracts each sale's timestamp from that single timestamp. You need to do the subtraction only between actions and sales that are joined by the ID.
This query has two layers, and it's helpful to crawl through them both.
The lowest layer should compute the lag time from sales.datetime to the earliest action.datetime for each row of sales. That will probably use a MIN() function.
The next layer will compute the statistics for those lag times, worked out in the lowest layer, by hour of the day. That will use an AVG() function.
Here's the lowest layer:
SELECT s.id, s.datetime, s.status,
TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds
FROM sales AS s
JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
GROUP BY s.id, s.datetime, s.status
The second part of that ON clause makes sure that you only consider actions taken after the sales order was entered. It may be unnecessary, but I thought I'd throw it in.
Here's the second layer.
SELECT HOUR(datetime) AS hour_Sale_entered,
COUNT(*) AS number_in_that_hour,
AVG(lag_seconds) / 3600.0 AS Lag_to_first_action
FROM (
SELECT s.id, s.datetime, s.status,
TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds
FROM sales AS s
JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
GROUP BY s.id, s.datetime, s.status
) AS d
GROUP BY HOUR(datetime)
ORDER BY HOUR(datetime)
See how there are two nested aggregation (GROUP BY) operations? The inner one identifies the first action, and the outer one does the hourly averaging.
One more tidbit. If you want to include sales items that have not yet been acted on, you can do this:
SELECT HOUR(datetime) AS hour_Sale_entered,
COUNT(*) AS number_in_that_hour,
SUM(no_action) AS not_acted_upon_yet,
AVG(lag_seconds) / 3600.0 AS Lag_to_first_action
FROM (
SELECT s.id, s.datetime, s.status,
TIMESTAMPDIFF(SECOND, s.datetime, MIN(a.datetime)) AS lag_seconds,
IF(COUNT(a.id) = 0, 1, 0) AS no_action
FROM sales AS s
LEFT JOIN actions AS a ON s.id = a.sales_id AND a.datetime > s.datetime
GROUP BY s.id, s.datetime, s.status
) AS d
GROUP BY HOUR(datetime)
ORDER BY HOUR(datetime)
The average of lag_seconds will still be correct, because the sales rows with no action rows will have NULL values for that, and AVG() ignores nulls.
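Here is a small runnable sketch of the two-layer idea. It assumes the datetime columns hold Unix timestamps (as the FROM_UNIXTIME call in the question suggests), uses SQLite in place of MySQL, and works on invented rows, so the lag is a plain subtraction rather than TIMESTAMPDIFF.

```python
import sqlite3

# SQLite stands in for MySQL; timestamps are Unix seconds (UTC), rows invented.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (id INTEGER PRIMARY KEY, datetime INTEGER);
    CREATE TABLE actions (id INTEGER PRIMARY KEY, datetime INTEGER, sales_id INTEGER);
    -- Two sales in hour 00 and one in hour 01.
    INSERT INTO sales VALUES (1, 0), (2, 600), (3, 3600);
    INSERT INTO actions VALUES
        (1, 3600, 1), (2, 9000, 1),   -- sale 1: first action after 1 h
        (3, 11400, 2),                -- sale 2: first action after 3 h
        (4, 5400, 3);                 -- sale 3: first action after 0.5 h
    """
)

# Inner layer: lag to the first action, per sale.
# Outer layer: average those lags by the hour in which the sale was created.
rows = conn.execute(
    """
    SELECT strftime('%H', d.created, 'unixepoch') AS sale_hour,
           COUNT(*) AS sales_in_hour,
           AVG(d.lag_seconds) / 3600.0 AS avg_lag_hours
    FROM (
        SELECT s.id, s.datetime AS created,
               MIN(a.datetime) - s.datetime AS lag_seconds
        FROM sales s
        JOIN actions a ON a.sales_id = s.id
        GROUP BY s.id, s.datetime
    ) AS d
    GROUP BY sale_hour
    ORDER BY sale_hour
    """
).fetchall()

print(rows)  # [('00', 2, 2.0), ('01', 1, 0.5)]
```

Sale 1 has two actions, but only the earlier one feeds the average, which is the point of aggregating per sale before aggregating per hour.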

Calculate salary of tutor based on distinct sittings using mysql

I have the following table denoting a tutor teaching pupils in small groups. Each pupil has an entry in the database. A pupil may be alone or in a group. I wish to calculate the tutor's "salary" as follows: payment is based on time spent, which means that each sitting (with one or more pupils) is counted only once, i.e. distinct sittings. The start and end times are Unix times.
<pre>
start end attendance
1359882000 1359882090 1
1359867600 1359867690 0
1359867600 1359867690 1
1359867600 1359867690 0
1360472400 1360477800 1
1360472400 1360477800 1
1359867600 1359867690 1
1359914400 1359919800 1
1360000800 1360006200 1
1360000800 1360006200 0
1360000800 1360006200 1
</pre>
This is what I tried: with no success - I can't get the right duration (number of hours for all distinct sittings)
SELECT YEAR(FROM_UNIXTIME(start)) AS year,
MONTHNAME(STR_TO_DATE(MONTH(FROM_UNIXTIME(start)), '%m')) AS month,
COUNT(DISTINCT start) AS sittings,
SUM(TRUNCATE((end-start)/3600, 1)) as duration
FROM schedules
GROUP BY
YEAR(FROM_UNIXTIME(start)),
MONTH(FROM_UNIXTIME(start))
Thanks for your proposals / support!
EDIT: Required results
Rate = 25
Year Month Sittings Duration Bounty
2013 February 2 2.2 2.2*25
2013 April 4 12.0 12.0*25
You could probably do something with subqueries. I've had a play with SQL Fiddle; how does this look to you? Link to SQL Fiddle: http://sqlfiddle.com/#!2/50718c/3
SELECT
YEAR(d.date) AS year,
MONTH(d.date) AS month,
COUNT(*) AS sittings,
SUM(d.duration) AS duration_secs
FROM (
SELECT
DATE(FROM_UNIXTIME(s.start)) AS date,
s.attendance,
end-start AS duration
FROM schedules s
) d
GROUP BY
year,
month
I couldn't really see where attendance comes into this at present, you didn't specify. The inner query is responsible for taking the schedules, extracting a start date, and a duration (in seconds).
The outer query then uses these derived values but groups them up to get the sums. You could elaborate from here i.e. maybe you only want to select where attendance > 0, or maybe you want to multiply by attendance.
In this next example I have done this, calculating the duration in hours instead, and calculating the applicable duration for sessions where attendance > 0, along with the appropriate bounty, assuming bounty == hours * rate: http://sqlfiddle.com/#!2/50718c/21
SELECT
YEAR(d.date) AS year,
MONTH(d.date) AS month,
COUNT(*) AS sittings,
SUM(d.duration) AS duration,
SUM(
IF(d.attendance>0,1,0)
) AS sittingsWorthBounty,
SUM(
IF(d.attendance>0,d.duration,0)
) AS durationForBounty,
SUM(
IF(d.attendance>0,d.bounty,0)
) AS bounty
FROM (
SELECT
DATE(FROM_UNIXTIME(s.start)) AS date,
s.attendance,
(end-start)/3600 AS duration,
(end-start)/3600 * @rate AS bounty
FROM schedules s,
(SELECT @rate := 25) v
) d
GROUP BY
year,
month
The key point here, is that in the subquery you do all the calculation per-row. The main query then is responsible for grouping up the results and getting your totals. The IF statements in the outer query could easily be moved into the subquery instead, for example. I just included them like this so you could see where the values came from.
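The question also asks for each sitting to be counted once, however many pupils attend. One possible way to do that, sketched below, is to collapse the rows to distinct (start, end) pairs in the subquery before grouping. Assumptions: SQLite stands in for MySQL, a sitting is identified by its (start, end) pair, a sitting counts regardless of the attendance flag, and months are taken in UTC. The rows are the sample data from the question.

```python
import sqlite3

# SQLite stands in for MySQL; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE schedules ("start" INTEGER, "end" INTEGER, attendance INTEGER)')
conn.executemany(
    "INSERT INTO schedules VALUES (?, ?, ?)",
    [
        (1359882000, 1359882090, 1),
        (1359867600, 1359867690, 0),
        (1359867600, 1359867690, 1),
        (1359867600, 1359867690, 0),
        (1360472400, 1360477800, 1),
        (1360472400, 1360477800, 1),
        (1359867600, 1359867690, 1),
        (1359914400, 1359919800, 1),
        (1360000800, 1360006200, 1),
        (1360000800, 1360006200, 0),
        (1360000800, 1360006200, 1),
    ],
)

# Inner query: one row per distinct sitting (assumed to be a start/end pair).
# Outer query: count the sittings and sum their length per month.
rows = conn.execute(
    """
    SELECT strftime('%Y', d."start", 'unixepoch') AS year,
           strftime('%m', d."start", 'unixepoch') AS month,
           COUNT(*) AS sittings,
           SUM(d."end" - d."start") AS duration_secs
    FROM (SELECT DISTINCT "start", "end" FROM schedules) AS d
    GROUP BY year, month
    """
).fetchall()

print(rows)  # [('2013', '02', 5, 16380)]
```

Dividing duration_secs by 3600 gives the hours, and multiplying those by the rate gives the bounty. In MySQL the same shape should work with YEAR(FROM_UNIXTIME(start)) and MONTH(FROM_UNIXTIME(start)) in place of the strftime calls.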