SQL query total vs a single value - mysql

I've been working on a query with a peer and it has been returning some unusual numbers. The query is a productivity report. I'm trying to total all of the billable units for a specific end user, compare that total to a single expected value, and then calculate the difference between those two numbers within a one-week period. Here is what we have come up with so far:
SELECT
Employees.emp_id,
Employees.last_name+', '+Employees.first_name as staff_name,
SUM(VisitQuery.billed_value)/60 AS billed_value,
SUM(StandardQuery.num8) as expected_value
FROM
Employees
INNER JOIN
(
SELECT
ClientVisit.duration AS billed_value,
ClientVisit.emp_id,
ClientVisit.client_id
FROM
ClientVisit
WHERE
ClientVisit.non_billable = 0 AND
ClientVisit.rev_timeout >= @param1 AND
ClientVisit.rev_timeout <= @param2
) VisitQuery
ON VisitQuery.emp_id = Employees.emp_id
INNER JOIN
(
SELECT DISTINCT
CaseloadQuery.emp_id,
ClientsExt.num8
FROM
(
SELECT
ClientVisit.duration AS billed_value,
ClientVisit.emp_id,
ClientVisit.client_id
FROM
ClientVisit
WHERE
ClientVisit.non_billable = 0 AND
ClientVisit.rev_timeout >= @param1 AND
ClientVisit.rev_timeout <= @param2
) CaseloadQuery
INNER JOIN ClientsExt
ON CaseloadQuery.client_id = ClientsExt.client_id
) StandardQuery
ON Employees.emp_id = StandardQuery.emp_id
GROUP BY
Employees.emp_id,
Employees.last_name+', '+Employees.first_name
The return comes out looking like this:
emp_id staff_name billed_value expected_value
X X 74 231
XX XX 108 279
XXX XXX 19 72
Does anyone have any thoughts? The expected value should really not be any higher than 40 hours for the week.

In the table ClientVisit, can the same employee (emp_id) have multiple rows that lead to multiple values of client_id? If the answer is yes, then I think you should also do a GROUP BY on client_id.
Below I tried rewriting your query (pay attention to the lines marked with "add" and "delete").
Disclaimer: I don't have your actual DB tables to test my query, so it may have syntax and semantic bugs.
SELECT
Employees.emp_id,
StandardQuery.client_id, -- add
Employees.last_name+', '+Employees.first_name as staff_name,
SUM(VisitQuery.billed_value)/60 AS billed_value,
SUM(StandardQuery.num8) as expected_value
FROM
Employees
INNER JOIN
(
SELECT
ClientVisit.duration AS billed_value,
ClientVisit.emp_id,
ClientVisit.client_id
FROM
ClientVisit
WHERE
ClientVisit.non_billable = 0 AND
ClientVisit.rev_timeout >= @param1 AND
ClientVisit.rev_timeout <= @param2
) VisitQuery
ON VisitQuery.emp_id = Employees.emp_id
INNER JOIN
(
SELECT DISTINCT
CaseloadQuery.emp_id,
ClientsExt.num8,
ClientsExt.client_id -- add
FROM
(
SELECT
-- ClientVisit.duration AS billed_value, -- delete
ClientVisit.emp_id,
ClientVisit.client_id
FROM
ClientVisit
WHERE
ClientVisit.non_billable = 0 AND
ClientVisit.rev_timeout >= @param1 AND
ClientVisit.rev_timeout <= @param2
)CaseloadQuery
INNER JOIN ClientsExt
ON CaseloadQuery.client_id = ClientsExt.client_id
)StandardQuery
ON Employees.emp_id = StandardQuery.emp_id
GROUP BY
Employees.emp_id,
-- Employees.last_name+', '+Employees.first_name -- delete
StandardQuery.client_id -- add
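One possible reason for the inflated totals is join fan-out: when an employee has several visit rows and several caseload rows, joining the two sets on emp_id alone pairs every visit with every caseload row, so both SUMs are multiplied. The sketch below reproduces that effect and then aggregates each side before joining. It is only an illustration under assumptions: it uses Python's built-in sqlite3 instead of MySQL, the sample rows are invented, the date and non_billable filters are left out, and num8 is assumed to be a per-client expected value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ClientVisit (emp_id INTEGER, client_id INTEGER, duration INTEGER);
CREATE TABLE ClientsExt (client_id INTEGER PRIMARY KEY, num8 INTEGER);
-- Invented sample data: one employee, three visits (minutes), two clients.
INSERT INTO ClientVisit VALUES (1, 100, 60), (1, 100, 120), (1, 200, 60);
INSERT INTO ClientsExt VALUES (100, 10), (200, 5);
""")

# Joining raw visit rows to caseload rows on emp_id alone: 3 x 2 = 6 rows,
# so each SUM counts its values more than once.
inflated = conn.execute("""
SELECT v.emp_id, SUM(v.duration) / 60.0, SUM(s.num8)
FROM ClientVisit v
JOIN (
    SELECT DISTINCT cv.emp_id, ce.client_id, ce.num8
    FROM ClientVisit cv
    JOIN ClientsExt ce ON ce.client_id = cv.client_id
) s ON s.emp_id = v.emp_id
GROUP BY v.emp_id
""").fetchall()

# Aggregating each side to one row per employee first avoids the fan-out.
fixed = conn.execute("""
SELECT b.emp_id, b.billed_hours, e.expected_value
FROM (
    SELECT emp_id, SUM(duration) / 60.0 AS billed_hours
    FROM ClientVisit
    GROUP BY emp_id
) b
JOIN (
    SELECT emp_id, SUM(num8) AS expected_value
    FROM (
        SELECT DISTINCT cv.emp_id, ce.client_id, ce.num8
        FROM ClientVisit cv
        JOIN ClientsExt ce ON ce.client_id = cv.client_id
    ) AS caseload
    GROUP BY emp_id
) e ON e.emp_id = b.emp_id
""").fetchall()

print(inflated)  # totals doubled and tripled by the join
print(fixed)     # one row per employee with the true totals
```

If num8 is actually stored per employee rather than per client, the second subquery would need to pick a single value instead of summing.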

Related

MySql - Selecting MAX & MIN and returning the corresponding rows

I am trying to get the min and max prices in my table for the last 6 months and display them grouped by month. My query is not returning the corresponding row values, such as the date and time at which the max or min price occurred.
I want to select the min and max prices, the date and time at which each occurred, and the rest of the data for those rows.
(I use CONCAT for report_term because I need to print it with the dataset when displaying results, e.g. February 2018 -> ...., January 2018 -> ...)
SELECT metal_price_id, CONCAT(MONTHNAME(metal_price_datetime), ' ', YEAR(metal_price_datetime)) AS report_term, max(metal_price) as highest_gold_price, metal_price_datetime FROM metal_prices_v2
WHERE metal_id = 1
AND DATEDIFF(NOW(), metal_price_datetime) BETWEEN 0 AND 180
GROUP BY report_term
ORDER BY metal_price_datetime DESC
I have made an example, extract from my DB:
http://sqlfiddle.com/#!9/617bcb2/4/0
My desired result would be the min and max prices grouped by month, with the date of the min and the date of the max, all within the last 6 months.
thanks
UPDATE.
The below code works, but it returns rows from beyond the 180 days specified. I have just checked, and it is because it joins on the price, which may be duplicated a number of times over the years; see: http://sqlfiddle.com/#!9/5f501b/1
You could use an inner join twice, once on a subselect for the max and once on a subselect for the min:
select a.metal_price_datetime
, t1.highest_gold_price
, t1.report_term
, t2.lowest_gold_price
,t2.metal_price_datetime
from metal_prices_v2 a
inner join (
SELECT CONCAT(MONTHNAME(metal_price_datetime), ' ', YEAR(metal_price_datetime)) AS report_term
, max(metal_price) as highest_gold_price
from metal_prices_v2
WHERE metal_id = 1
AND DATEDIFF(NOW(), metal_price_datetime) BETWEEN 0 AND 180
GROUP BY report_term
) t1 on t1.highest_gold_price = a.metal_price
inner join (
select a.metal_price_datetime
, t.lowest_gold_price
, t.report_term
from metal_prices_v2 a
inner join (
SELECT CONCAT(MONTHNAME(metal_price_datetime), ' ', YEAR(metal_price_datetime)) AS report_term
, min(metal_price) as lowest_gold_price
from metal_prices_v2
WHERE metal_id = 1
AND DATEDIFF(NOW(), metal_price_datetime) BETWEEN 0 AND 180
GROUP BY report_term
) t on t.lowest_gold_price = a.metal_price
) t2 on t2.report_term = t1.report_term
Here is a simplified version of what you should do, so you can learn the working process.
You need to calculate the min() and max() for the periods you need. That is the first brick of this building.
You have tableA; you calculate min(), let's call it R1:
SELECT group_field, min(value) as min_value
FROM TableA
GROUP BY group_field
Same for max(), call it R2:
SELECT group_field, max(value) as max_value
FROM TableA
GROUP BY group_field
Now you need to bring in all the data from the original fields, so you join each result with your original table.
We call those T1 and T2:
SELECT tableA.group_field, tableA.value, tableA.date
FROM tableA
JOIN ( ... .. ) as R1
ON tableA.group_field = R1.group_field
AND tableA.value = R1.min_value
SELECT tableA.group_field, tableA.value, tableA.date
FROM tableA
JOIN ( ... .. ) as R2
ON tableA.group_field = R2.group_field
AND tableA.value = R2.max_value
Now we join T1 and T2.
SELECT *
FROM ( .... ) as T1
JOIN ( .... ) as T2
ON t1.group_field = t2.group_field
So the idea is: if you can build one brick, you build the next one. Then you can also add filters, like the last 6 months or anything else you need.
In this case the group_field is the CONCAT() value.
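The brick-by-brick steps can be sketched end to end as follows. This is a minimal illustration under assumptions, not the asker's schema: it runs on Python's built-in sqlite3 rather than MySQL, the table `prices` and its rows are invented, and the month key comes from SQLite's strftime instead of CONCAT(MONTHNAME(...), ...). The important detail is that T1 and T2 join back on both the group field and the value, so a price that repeats in another month cannot match the wrong rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (dt TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?)",
    [
        ("2018-01-05", 100), ("2018-01-20", 120), ("2018-01-25", 90),
        # 120 also appears in February, where it is not the max.
        ("2018-02-03", 120), ("2018-02-10", 130), ("2018-02-15", 110),
    ],
)

query = """
WITH monthly AS (              -- original rows plus the group field
    SELECT strftime('%Y-%m', dt) AS month, dt, price FROM prices
),
r1 AS (                        -- brick 1: min per group
    SELECT month, MIN(price) AS min_value FROM monthly GROUP BY month
),
r2 AS (                        -- brick 2: max per group
    SELECT month, MAX(price) AS max_value FROM monthly GROUP BY month
),
t1 AS (                        -- full rows that carry the min
    SELECT m.month, m.price, m.dt
    FROM monthly m
    JOIN r1 ON m.month = r1.month AND m.price = r1.min_value
),
t2 AS (                        -- full rows that carry the max
    SELECT m.month, m.price, m.dt
    FROM monthly m
    JOIN r2 ON m.month = r2.month AND m.price = r2.max_value
)
SELECT t1.month, t1.price, t1.dt, t2.price, t2.dt
FROM t1
JOIN t2 ON t1.month = t2.month
ORDER BY t1.month
"""
min_max_by_month = conn.execute(query).fetchall()
for row in min_max_by_month:
    print(row)
```

If the min or max price occurs more than once within a month, this returns one row per matching combination; a date filter such as the last 6 months would go inside `monthly`.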

Select Status That Last for More Than 1 Second

I have a problem that looks simple, but I could not find the solution.
So, I have a table with two columns like this:
Time Status
00:00:00.111 Off
00:00:00.222 On
00:00:00.345 On
00:00:01.555 On
00:00:01.666 Off
00:00:02.222 On
00:00:02.422 On
00:00:02.622 Off
00:00:05.888 Off
00:00:05.999 Off
I want to select all statuses of On which lasted for more than 1 second,
in this example, I want the sequence:
00:00:00.222 On
00:00:00.345 On
00:00:01.555 On
Could you guys give me any clue? Many thanks!
A simple GROUP BY and SUM can not do this on your current dataset, so my idea is to add a helper column:
CREATE TABLE someTable(
`time` DATETIME,
status CHAR(3),
helperCol INT
);
The helperCol is an INT and will be set as follows:
CREATE PROCEDURE setHelperCol()
BEGIN
DECLARE finished, v_helperCol INT DEFAULT 0;
DECLARE v_status CHAR(3);
DECLARE ts DATETIME;
-- Handy for re-use: only rows without a helperCol are visited, so the stored values can be kept permanently.
DECLARE st CURSOR FOR SELECT `time`, status FROM someTable WHERE helperCol IS NULL ORDER BY `time`;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET finished = 1;
-- Continue from the highest value assigned so far, or start at 1.
SELECT MAX(helperCol) INTO v_helperCol FROM someTable;
IF v_helperCol IS NULL THEN SET v_helperCol = 1; END IF;
OPEN st;
FETCH st INTO ts, v_status;
WHILE finished = 0 DO
-- Every 'Off' row starts a new group.
IF v_status = 'Off' THEN SET v_helperCol = v_helperCol + 1; END IF;
UPDATE someTable SET helperCol = v_helperCol WHERE `time` = ts; -- Assuming `time` is unique
FETCH st INTO ts, v_status;
END WHILE;
CLOSE st;
END;
Execute the procedure and the result is:
Time Status helperCol
00:00:00.111 Off 2
00:00:00.222 On 2
00:00:00.345 On 2
00:00:01.555 On 2
00:00:01.666 Off 3
00:00:02.222 On 3
00:00:02.422 On 3
00:00:02.622 Off 4
This can now be grouped and processed:
SELECT MAX(`time`)-MIN(`time`) AS diffTime
FROM someTable
WHERE status='ON'
GROUP BY helperCol
HAVING MAX(`time`)-MIN(`time`)>1;
The result of that is (you need to search for the correct datetime functions to apply in the MAX-MIN part):
1.333
Alternative:
You can also process the MAX-MIN in the stored procedure, but that would not be efficiently repeatable as the helperColumn solution is.
SELECT a.time start
, MIN(c.time) end
, TIMEDIFF(MIN(c.time),a.time) duration
FROM
( SELECT x.*, COUNT(*) rank FROM my_table x JOIN my_table y ON y.time <= x.time GROUP BY time ) a
LEFT
JOIN
( SELECT x.*, COUNT(*) rank FROM my_table x JOIN my_table y ON y.time <= x.time GROUP BY time ) b
ON b.status = a.status
AND b.rank = a.rank - 1
JOIN
( SELECT x.*, COUNT(*) rank FROM my_table x JOIN my_table y ON y.time <= x.time GROUP BY time ) c
ON c.rank >= a.rank
LEFT
JOIN
( SELECT x.*, COUNT(*) rank FROM my_table x JOIN my_table y ON y.time <= x.time GROUP BY time ) d
ON d.status = c.status
AND d.rank = c.rank + 1
WHERE b.rank IS NULL
AND d.rank IS NULL
AND a.status = 1
GROUP
BY a.time
HAVING duration >= 1;
Another, faster, method might be along these lines - unfortunately I don't think the data types and functions in my version of MySQL support fractions of a second, so this is probably a little bit wrong (there may also be a logical error)...
SELECT time
, status
, cumulative
FROM
( SELECT *
, CASE WHEN @prev = status THEN @i:=@i+duration ELSE @i:=0 END cumulative
, @prev:=status
FROM
( SELECT x.*
, TIME_TO_SEC(MIN(y.time))-TIME_TO_SEC(x.time) duration
FROM my_table x
JOIN my_table y
ON y.time > x.time
GROUP
BY x.time
) n
ORDER
BY time
) a
WHERE cumulative >= 1
AND status = 1;
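The helper-column idea can also be tried without a stored procedure by computing the group number on the fly: for each row, count the 'Off' rows up to and including it. The sketch below is an illustration under assumptions, not a MySQL solution: it uses Python's built-in sqlite3, stores the times from the question as integer milliseconds in a hypothetical column `t_ms` to sidestep fractional-second types, and keeps the table name someTable from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE someTable (t_ms INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO someTable VALUES (?, ?)",
    [
        (111, "Off"), (222, "On"), (345, "On"), (1555, "On"), (1666, "Off"),
        (2222, "On"), (2422, "On"), (2622, "Off"), (5888, "Off"), (5999, "Off"),
    ],
)

query = """
WITH grouped AS (
    -- helperCol = number of 'Off' rows seen so far, so every run of
    -- consecutive 'On' rows shares one value.
    SELECT s.t_ms, s.status,
           (SELECT COUNT(*) FROM someTable o
            WHERE o.status = 'Off' AND o.t_ms <= s.t_ms) AS helperCol
    FROM someTable s
)
SELECT g.t_ms, g.status
FROM grouped g
JOIN (
    -- Keep only the 'On' runs whose first and last rows are
    -- more than one second (1000 ms) apart.
    SELECT helperCol
    FROM grouped
    WHERE status = 'On'
    GROUP BY helperCol
    HAVING MAX(t_ms) - MIN(t_ms) > 1000
) runs ON runs.helperCol = g.helperCol
WHERE g.status = 'On'
ORDER BY g.t_ms
"""
long_on_rows = conn.execute(query).fetchall()
print(long_on_rows)
```

This measures a run from its first 'On' row to its last 'On' row, as in the MAX-MIN query above; measuring up to the following 'Off' row would need a slightly different HAVING clause.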

Calculating payment breakages

I'm currently working on a report to highlight payment breakages. A breakage is a customer paying in one month (say June) but then failing to pay in the next (July).
I've currently got it set up as an EXCEPT query that checks one month and compares it to the next, similar to the below (the syntax may not be correct, as I have had to edit certain data).
DECLARE @StartDatePaid AS DATETIME
DECLARE @EndDatePaid AS DATETIME
DECLARE @StartDateMissed AS DATETIME
DECLARE @EndDateMissed AS DATETIME
SET @StartDatePaid = '01-Oct-2013'
SET @EndDatePaid = '31-Oct-2013'
SET @StartDateMissed = '01-Nov-2013'
SET @EndDateMissed = '05-Dec-2013'
SELECT d.StoreNo
, d.CustNo
FROM (
--Paid Range
SELECT c.CustNo, m.StoreNo
FROM dbo.tblCont AS c INNER JOIN
dbo.tblContDep AS cd ON c.ContractNo = cd.ContractNo INNER JOIN
dbo.tblCust AS m ON c.CustNo = m.CustNo INNER JOIN
dbo.tblTrans AS mx ON m.CustNo = mx.CustNo AND cd.AgendaCode = mx.AgendaCode INNER JOIN
dbo.tblCalender AS cl ON mx.DateEvent = cl.Date
WHERE (cd.Payment > 0) AND (m.Closed <> 'Y') AND (cd.AgendaCode <> 'OPCLIPMT')
AND mx.DateEvent BETWEEN @StartDatePaid AND @EndDatePaid
GROUP BY c.CustNo, m.StoreNo, mx.DateEvent
EXCEPT
--Missed Range
SELECT c.CustNo, m.StoreNo
FROM dbo.tblCont AS c INNER JOIN
dbo.tblContDep AS cd ON c.ContractNo = cd.ContractNo INNER JOIN
dbo.tblCust AS m ON c.CustNo = m.CustNo INNER JOIN
dbo.tblTrans AS mx ON m.CustNo = mx.CustNo AND cd.AgendaCode = mx.AgendaCode INNER JOIN
dtLookups.dbo.tblCalender AS cl ON mx.DateEvent = cl.Date
WHERE (cd.Payment > 0) AND (m.Closed <> 'Y') AND (cd.AgendaCode <> 'OPCLIPMT') AND (mx.DateEvent BETWEEN @StartDateMissed AND @EndDateMissed )
GROUP BY c.CustNo, m.StoreNo, mx.DateEvent
) AS d
WHERE d.StoreNo IN (72, 114, 121, 139, 185, 241, 266)
GROUP BY
d.StoreNo, d.CustNo
I will be switching it over to be based on calendar months instead of date ranges. My question is: how can I best generate several months of breakages at once, to get a month-on-month comparison? As it is, I can only get it to produce one month's breakages based on the supplied dates.
Example of desired output
Month| breakages
June | 201
July | 189
Aug | 250
Open to suggestions on best practice or ways to improve.
I admit I don't understand your query. Assuming a breakage is the first occurrence of a missed payment, not the subsequent ones, you can produce the desired output like this:
-- prepare your source data
with cte1 as
(
select
user_id,
date, -- representing month by the first day
missed -- bool flag if payment was missed in that month
from ...
)
-- add a sequence number to the source data ordered by date
,cte2 as
(
select *,
row_number() over(partition by user_id order by date) rn
from cte1
)
-- select those records where payment was missed but the previous was ok
,cte3 as
(
select user_id, date from cte2 a
where a.missed = 1
and exists (
select * from cte2 b
where b.missed = 0
and b.user_id = a.user_id
and b.rn = a.rn -1
)
)
select date, count(*) as breakage from cte3 group by date
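As a runnable illustration of counting breakages for several months at once, the sketch below uses a different technique from the row_number() approach above: an anti-join that looks for customers who paid in one month and have no payment in the following month. Everything in it is an assumption for demonstration: it runs on Python's built-in sqlite3 rather than SQL Server, and the `payments` table, its 'YYYY-MM' month column and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (cust_no INTEGER, month TEXT)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [
        (1, "2013-06"), (1, "2013-07"), (1, "2013-08"),  # never misses
        (2, "2013-06"), (2, "2013-08"),                  # misses July
        (3, "2013-06"), (3, "2013-07"),                  # misses August
        (4, "2013-07"),                                  # misses August
    ],
)

query = """
WITH paid AS (
    -- One row per customer and paid month, plus the month that follows it.
    SELECT DISTINCT cust_no, month,
           strftime('%Y-%m', date(month || '-01', '+1 month')) AS next_month
    FROM payments
)
SELECT p.next_month AS missed_month, COUNT(*) AS breakages
FROM paid p
-- Do not report months that lie beyond the data we have.
WHERE p.next_month <= (SELECT MAX(month) FROM payments)
  AND NOT EXISTS (
      SELECT 1 FROM payments n
      WHERE n.cust_no = p.cust_no AND n.month = p.next_month
  )
GROUP BY p.next_month
ORDER BY p.next_month
"""
breakages_by_month = conn.execute(query).fetchall()
print(breakages_by_month)
```

Because only the month right after a payment is checked, a customer who stays away for several months is counted once, in the first missed month.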

SQL Group By Number Of Users Within Range

I have the following query which will return the number of users in table transactions who have earned between $100 and $200
SELECT COUNT(users.id)
FROM transactions
LEFT JOIN users ON users.id = transactions.user_id
WHERE transactions.amount > 100 AND transactions.amount < 200
The above query returns the correct result below:
COUNT(users.id)
559
I would like to extend it so that the query can return data in the following format:
COUNT(users.id) : amount
1678 : 0-100
559 : 100-200
13 : 200-300
How can I do this?
You can use a CASE expression inside of your aggregate function which will get the result in columns:
SELECT
COUNT(case when amount >= 0 and amount <= 100 then users.id end) Amt0_100,
COUNT(case when amount >= 101 and amount <= 200 then users.id end) Amt101_200,
COUNT(case when amount >= 201 and amount <= 300 then users.id end) Amt201_300
FROM transactions
LEFT JOIN users
ON users.id = transactions.user_id;
See SQL Fiddle with Demo
You will notice that I altered the ranges to 0-100, 101-200, 201-300; otherwise, user ids with an amount of exactly 100 or 200 would be counted twice.
If you want the values in rows, then you can use:
select count(u.id),
CASE
WHEN amount >=0 and amount <=100 THEN '0-100'
WHEN amount >=101 and amount <=200 THEN '101-200'
WHEN amount >=201 and amount <=300 THEN '201-300'
END Amount
from transactions t
left join users u
on u.id = t.user_id
group by
CASE
WHEN amount >=0 and amount <=100 THEN '0-100'
WHEN amount >=101 and amount <=200 THEN '101-200'
WHEN amount >=201 and amount <=300 THEN '201-300'
END
See SQL Fiddle with Demo
But if you have many ranges that you need to calculate the counts on, then you might want to consider creating a table with the ranges, similar to the following:
create table report_range
(
start_range int,
end_range int
);
insert into report_range values
(0, 100),
(101, 200),
(201, 300);
Then you can use this table to join to your current tables and group by the range values:
select count(u.id) Total, concat(start_range, '-', end_range) amount
from transactions t
left join users u
on u.id = t.user_id
left join report_range r
on t.amount >= r.start_range
and t.amount<= r.end_range
group by concat(start_range, '-', end_range);
See SQL Fiddle with Demo.
If you don't want to create a new table with the ranges, then you can always use a derived table to get the same result:
select count(u.id) Total, concat(start_range, '-', end_range) amount
from transactions t
left join users u
on u.id = t.user_id
left join
(
select 0 start_range, 100 end_range union all
select 101 start_range, 200 end_range union all
select 201 start_range, 300 end_range
) r
on t.amount >= r.start_range
and t.amount<= r.end_range
group by concat(start_range, '-', end_range);
See SQL Fiddle with Demo
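To see the range-table join produce the row format asked for in the question, here is a small self-contained sketch. It is an illustration under assumptions: it uses Python's built-in sqlite3 rather than MySQL (so `||` replaces concat()), the transaction rows are invented, and the join is driven from report_range so that a range with no transactions would still appear with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount INTEGER);
CREATE TABLE report_range (start_range INTEGER, end_range INTEGER);
-- Invented amounts, including the boundary values 100, 101 and 200.
INSERT INTO transactions (amount) VALUES (50), (100), (150), (200), (250), (101);
INSERT INTO report_range VALUES (0, 100), (101, 200), (201, 300);
""")

query = """
SELECT r.start_range || '-' || r.end_range AS amount,
       COUNT(t.id) AS total
FROM report_range r
LEFT JOIN transactions t
       ON t.amount BETWEEN r.start_range AND r.end_range
GROUP BY r.start_range, r.end_range
ORDER BY r.start_range
"""
counts_by_range = conn.execute(query).fetchall()
for label, total in counts_by_range:
    print(total, ":", label)
```

Because the ranges do not overlap, each boundary amount lands in exactly one bucket.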
One way to do this would be to use a CASE expression in your GROUP BY.
SELECT
-- NB this must match your group by statement exactly
-- otherwise you will get an error
CASE
WHEN amount <= 100
THEN '0-100'
WHEN amount <= 200
THEN '100-200'
ELSE '201+'
END Amount,
COUNT(*)
FROM
transactions
GROUP BY
CASE
WHEN amount <= 100
THEN '0-100'
WHEN amount <= 200
THEN '100-200'
ELSE '201+'
END
If you plan on using the grouping elsewhere, it probably makes sense to define it as a scalar function (it will also look cleaner)
e.g.
SELECT
AmountGrouping(amount),
COUNT(*)
FROM
transactions
GROUP BY
AmountGrouping(amount)
If you want to be fully generic:
SELECT
concat(((amount DIV 100) * 100),'-',(((amount DIV 100) + 1) * 100)) AmountGroup,
COUNT(*)
FROM
transactions
GROUP BY
AmountGroup
Sql Fiddle
Bilbo, I tried to be creative and found a very nice solution [ for those who love math (like me) ]
It's always surprising when MySQL integer division operator solves our problems.
DROP SCHEMA IF EXISTS `stackoverflow3`;
CREATE SCHEMA `stackoverflow3`;
USE `stackoverflow3`;
CREATE TABLE users (
id INT UNSIGNED PRIMARY KEY NOT NULL AUTO_INCREMENT,
name VARCHAR(25) NOT NULL DEFAULT "-");
CREATE TABLE transactions(
id INT UNSIGNED PRIMARY KEY NOT NULL AUTO_INCREMENT,
user_id INT UNSIGNED NOT NULL,
amount INT UNSIGNED DEFAULT 0,
FOREIGN KEY (user_id) REFERENCES users (id));
INSERT users () VALUES (),(),();
INSERT transactions (user_id,amount)
VALUES (1,120),(2,270),(3, 350),
(2,500), (1,599), (1,550), (3,10),
(3,20), (3,30), (3,50), (3,750);
SELECT
COUNT(t.id),
CONCAT(
((t.amount DIV 100)*100)," to ",((t.amount DIV 100 + 1)*100-1)
) AS amount_range
FROM transactions AS t
GROUP BY amount_range;
Awaiting your questions, Mr. Baggins.
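The integer-division bucketing can be checked quickly with the sample amounts from the INSERT above. This sketch is an assumption-laden stand-in, not the MySQL query itself: it runs on Python's built-in sqlite3, where `/` on two integers plays the role of DIV and `||` replaces CONCAT, and it drops the users table because the grouping only needs the amounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO transactions (amount) VALUES (?)",
    [(a,) for a in (120, 270, 350, 500, 599, 550, 10, 20, 30, 50, 750)],
)

query = """
SELECT ((amount / 100) * 100) || ' to ' || ((amount / 100) * 100 + 99) AS amount_range,
       COUNT(*) AS total
FROM transactions
GROUP BY amount / 100      -- integer division gives the bucket number
ORDER BY amount / 100
"""
bucket_counts = conn.execute(query).fetchall()
for amount_range, total in bucket_counts:
    print(total, ":", amount_range)
```

Buckets with no transactions (here 400 to 499 and 600 to 699) simply do not appear, which is one reason to prefer a range table when empty ranges must be listed.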

MySQL Complex Inner Join

Suppose equity has a column called TickerID. I would like to replace the 111's with equity.TickerID. MySQL can't seem to resolve the scope and returns an unknown column when I try that. This SQL statement works but I need to run it for each ticker. Would be nice if I could get a full table.
SELECT Ticker,
IF(tbl_m200.MA200_Count = 200,tbl_m200.MA200,-1) AS MA200,
IF(tbl_m50.MA50_Count = 50,tbl_m50.MA50,-1) AS MA50,
IF(tbl_m20.MA20_Count = 20,tbl_m20.MA20,-1) AS MA20
FROM equity
INNER JOIN
(SELECT TickerID,AVG(Y.Close) AS MA200,COUNT(Y.Close) AS MA200_Count FROM
(
SELECT Close,TickerID FROM equity_pricehistory_daily
WHERE TickerID = 111
ORDER BY Timestamp DESC LIMIT 0,200
) AS Y
) AS tbl_m200
USING(TickerID)
INNER JOIN
(SELECT TickerID,AVG(Y.Close) AS MA50,COUNT(Y.Close) AS MA50_Count FROM
(
SELECT Close,TickerID FROM equity_pricehistory_daily
WHERE TickerID = 111
ORDER BY Timestamp DESC LIMIT 50
) AS Y
) AS tbl_m50
USING(TickerID)
INNER JOIN
(SELECT TickerID,AVG(Y.Close) AS MA20,COUNT(Y.Close) AS MA20_Count FROM
(
SELECT Close,TickerID FROM equity_pricehistory_daily
WHERE TickerID = 111
ORDER BY Timestamp DESC LIMIT 0,20
) AS Y
) AS tbl_m20
USING(TickerID)
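MySQL does not let a derived table (a subquery in the FROM clause) reference columns of the outer query unless it is declared LATERAL (available from MySQL 8.0.14), which is why equity.TickerID is reported as an unknown column. Many people run into the same scoping problem.
Anyway... You could create functions that retrieve the information you want: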
DROP FUNCTION IF EXISTS AveragePriceHistory_20;
CREATE FUNCTION AveragePriceHistory_20(MyTickerID INT)
RETURNS DECIMAL(9,2) DETERMINISTIC
RETURN (
SELECT AVG(Y.Close)
FROM (
SELECT Z.Close
FROM equity_pricehistory_daily Z
WHERE Z.TickerID = MyTickerID
ORDER BY Timestamp DESC
LIMIT 20
) Y
HAVING COUNT(*) = 20
);
SELECT
E.TickerID,
E.Ticker,
AveragePriceHistory_20(E.TickerID) AS MA20
FROM equity E;
You would get NULL instead of -1. If this is undesirable, you could wrap the function-call with IFNULL(...,-1).
Another way of solving this would be to select by time frame instead of using LIMIT. Note that this averages the last 20 calendar days rather than the last 20 rows.
SELECT
E.TickerID,
E.Ticker,
(
SELECT AVG(Y.Close)
FROM equity_pricehistory_daily Y
WHERE Y.TickerID = E.TickerID
AND Y.Timestamp > ADDDATE(CURRENT_TIMESTAMP, INTERVAL -20 DAY)
) AS MA20
FROM equity E;
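If the average has to cover the last N rows per ticker (not N days) and a function is not wanted, one option is to rank each price row by counting the newer rows for the same ticker and keep only those with fewer than N newer rows. The sketch below illustrates this under assumptions: it uses Python's built-in sqlite3 rather than MySQL, invented tables `equity` and `prices` with made-up rows, and N = 3 instead of 20 to keep the data small. As in the question, tickers with fewer than N rows get -1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE equity (ticker_id INTEGER PRIMARY KEY, ticker TEXT);
CREATE TABLE prices (ticker_id INTEGER, ts INTEGER, close REAL);
INSERT INTO equity VALUES (1, 'AAA'), (2, 'BBB');
-- AAA has four price rows, BBB only two.
INSERT INTO prices VALUES (1, 1, 10), (1, 2, 20), (1, 3, 30), (1, 4, 40),
                          (2, 1, 5), (2, 2, 7);
""")

query = """
SELECT e.ticker,
       CASE WHEN COUNT(recent.close) = 3 THEN AVG(recent.close) ELSE -1 END AS ma3
FROM equity e
LEFT JOIN (
    -- Keep a row only if fewer than 3 rows of the same ticker are newer,
    -- i.e. it is one of that ticker's 3 most recent rows.
    SELECT p.ticker_id, p.close
    FROM prices p
    WHERE (SELECT COUNT(*) FROM prices n
           WHERE n.ticker_id = p.ticker_id AND n.ts > p.ts) < 3
) recent ON recent.ticker_id = e.ticker_id
GROUP BY e.ticker_id, e.ticker
ORDER BY e.ticker
"""
moving_averages = conn.execute(query).fetchall()
print(moving_averages)
```

The counting subquery runs once per price row, so on a large history table an index on (ticker_id, ts) would likely matter.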