Percentage change from previous row when multiple rows have the same date? - mysql

I have a query that calculates ratios and returns them for each hour and server on a given day:
SELECT a.day,
a.hour,
Sum(a.gemspurchased),
Sum(b.gems),
Sum(b.shadowgems),
( Sum(b.gems) / Sum(a.gemspurchased) ) AS GemRatio,
( Sum(b.shadowgems) / Sum(a.gemspurchased) ) AS ShadowGemRatio
FROM (SELECT Date(Date_sub(createddate, INTERVAL 7 hour)) AS day,
Hour(Date_sub(createddate, INTERVAL 7 hour)) AS hour,
serverid,
Sum(gems) AS GemsPurchased
FROM dollartransactions
WHERE Date(Date_sub(createddate, INTERVAL 7 hour)) BETWEEN
Curdate() - INTERVAL 14 day AND Curdate()
GROUP BY 1,
2,
3) a,
/*Gems recorded from DollarTransactions Table after purchasing gem package*/
(SELECT Date(Date_sub(createddate, INTERVAL 7 hour)) AS day,
Hour(Date_sub(createddate, INTERVAL 7 hour)) AS hour,
serverid,
Sum(acceptedamount) AS Gems,
Sum(acceptedshadowamount) AS ShadowGems
FROM gemtransactions
WHERE Date(Date_sub(createddate, INTERVAL 7 hour)) BETWEEN
Curdate() - INTERVAL 14 day AND Curdate()
AND transactiontype IN ( 990, 2 )
AND fullfilled = 1
AND gemtransactionid >= 130000000
GROUP BY 1,
2,
3) b
/*Gems & Shadow Gems spent, recorded from GemTransactions Table */
WHERE a.day = b.day
AND a.serverid = b.serverid
GROUP BY 1,
2
This code returns the component parts of the ratios, as well as the ratios themselves (which are sometimes null):
day hour sum(a.GemsPurchased) sum(b.Gems) sum(b.ShadowGems) GemRatio ShadowGemRatio
9/5/2014 0 472875 465499 60766 0.9844 0.1285
9/5/2014 1 350960 371092 45408 1.0574 0.1294
9/5/2014 2 472985 509618 58329 1.0775 0.1233
9/5/2014 3 1023905 629310 71017 0.6146 0.0694
9/5/2014 4 1273170 628697 74896 0.4938 0.0588
9/5/2014 5 998920 637709 64145 0.6384 0.0642
9/5/2014 6 876470 651451 68977 0.7433 0.0787
9/5/2014 7 669100 667217 81599 0.9972 0.122
What I'd like to do is create an 8th and 9th column which calculate the % change from previous row for both GemRatio and ShadowGemRatio. I've seen other threads here on how to do this for specific queries, but I couldn't get it to work for my particular MySQL query...

First, create a view for that query. Let's call it v1:
CREATE VIEW v1 AS SELECT YOUR QUERY HERE;
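The following query computes the percentage change of each ratio from the previous hour. It assumes every day has 24 hourly rows (hours 0 to 23). The first row has no predecessor, so its change is reported as zero.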
SELECT cur.*,
       -- No previous row (or a NULL previous ratio): report a change of 0
       CASE
         WHEN prev.GemRatio IS NULL THEN 0
         ELSE 100 * (cur.GemRatio - prev.GemRatio) / prev.GemRatio
       END AS gemChange,
       CASE
         WHEN prev.ShadowGemRatio IS NULL THEN 0
         ELSE 100 * (cur.ShadowGemRatio - prev.ShadowGemRatio) / prev.ShadowGemRatio
       END AS shadowGemChange
FROM v1 cur
LEFT OUTER JOIN v1 prev
  ON (cur.day = prev.day AND cur.hour = prev.hour + 1)   -- same day, previous hour
  OR (DATEDIFF(cur.day, prev.day) = 1 AND cur.hour = 0 AND prev.hour = 23);   -- midnight rollover
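As a cross-check of the join logic, here is a minimal Python sketch of the same idea: look up the row for the previous hour (rolling back to hour 23 of the previous day at midnight) and report the percentage change, or 0 when there is no previous row. The function name `pct_changes` and the dictionary layout are assumptions made for illustration. The sample values are the first three GemRatio figures from the question.

```python
from datetime import date, timedelta

def pct_changes(ratios):
    """Map (day, hour) -> % change from the previous hour's ratio.

    `ratios` maps (date, hour) to a ratio (or None). Rows without a
    previous value get 0, mirroring the CASE ... THEN 0 branch in the SQL.
    Unlike the SQL, which would yield NULL, this sketch also returns 0
    when the previous ratio is 0 or the current ratio is missing.
    """
    changes = {}
    for (day, hour), value in ratios.items():
        # Previous hour: same day, or hour 23 of the day before at midnight.
        if hour > 0:
            prev_key = (day, hour - 1)
        else:
            prev_key = (day - timedelta(days=1), 23)
        prev = ratios.get(prev_key)
        if prev is None or prev == 0 or value is None:
            changes[(day, hour)] = 0
        else:
            changes[(day, hour)] = 100 * (value - prev) / prev
    return changes

day = date(2014, 9, 5)
sample = {(day, 0): 0.9844, (day, 1): 1.0574, (day, 2): 1.0775}
result = pct_changes(sample)
for key in sorted(result):
    print(key[0].isoformat(), key[1], round(result[key], 2))
```

Hour 0 has no predecessor in the sample, so it gets 0; the other rows are computed relative to the hour before.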

Related

How to select records but exclude if one type is outside a subquery?

We have multiple invStatus values (1-10) and want to exclude only one status type (1), but only records of that type that are older than X days. So all records should show except those whose invStatus = 1 and that are older than X days. Records with invStatus = 1 that are newer than X days should be included in the recordset.
Do I select all records generically, then in a subquery filter those of status = 1 that are older than X days?
The query below uses NOT IN in an attempt to select the records to exclude, but it is not working and also seems inefficient, as it takes a couple of seconds to execute.
SELECT
tblinventory.invId,
tblinventory.invTitle,
tblinventory.invStatus,
tblhouseinfo.Address,
tblhouseinfo.City,
tblhouseinfo.`State`,
tblhouseinfo.Zip,
tblhouseinfo.Update_date,
CURRENT_DATE() - INTERVAL 10 DAY AS dateEx
FROM
tblinventory
LEFT OUTER JOIN tblhouseinfo ON tblinventory.invId = tblhouseinfo.addInfoID
WHERE
invReleased = 0
AND invStatus NOT IN (SELECT invId from tblhouseinfo WHERE invStatus = 1
AND tblhouseinfo.Update_date < CURRENT_DATE() - INTERVAL 10 DAY )
ORDER BY
`tblhouseinfo`.`Update_date` DESC
I could filter the results with PHP at the page level, but that also seems less than efficient, and I would prefer to perform this task using best practices.
UPDATE:
There are a total of 155 rows.
All tblhouseinfo.Update_date (timestamp) values are "2017-09-06 10:53:17" (Sept 6th) except three that I changed for testing to "2017-07-06 10:53:17" (July 6th).
Using the suggested condition:
AND NOT (invStatus = 1 AND tblhouseinfo.Update_date > CURRENT_DATE() - INTERVAL 10 DAY )
60 records are excluded, not the expected 3.
"2017-08-28" is the current result of CURRENT_DATE() - INTERVAL 10 DAY, so "2017-09-06 10:53:17" should fall within the 10-day range and only the three records dated "2017-07-06 10:53:17" should be excluded.
FINAL WORKING SOLUTION/Query:
SELECT
tblinventory.invId,
tblinventory.invTitle,
tblinventory.invStatus,
tblhouseinfo.Address,
tblhouseinfo.City,
tblhouseinfo.`State`,
tblhouseinfo.Zip,
tblhouseinfo.Update_date,
CURRENT_DATE() - INTERVAL 10 DAY AS dateEx
FROM
tblinventory
LEFT OUTER JOIN tblhouseinfo ON tblinventory.invId = tblhouseinfo.addInfoID
WHERE
invReleased = 0
AND NOT (invStatus = 1 AND tblhouseinfo.Update_date < CURRENT_DATE() - INTERVAL 10 DAY )
ORDER BY
`tblhouseinfo`.`Update_date` DESC
You don't need to select invId from the other table if you already know which status you never want (invStatus 1). You can simply add an AND condition for the number of days.
I always use UNIX timestamps to record data entry and modification times, which keeps range checks simple:
AND (timestamp >= beginTimestamp AND timeStamp <= endTimestamp)
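To see how the NOT (invStatus = 1 AND Update_date < cutoff) condition from the working query behaves, here is a small sketch using Python's built-in sqlite3 module. It is a simplification under stated assumptions: a single hypothetical table `inv` stands in for the joined tblinventory/tblhouseinfo rows, and a fixed cutoff string replaces CURRENT_DATE() - INTERVAL 10 DAY so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the joined inventory / house info rows.
    CREATE TABLE inv (invId INTEGER, invStatus INTEGER, update_date TEXT);
    INSERT INTO inv VALUES
        (1, 1, '2017-09-06 10:53:17'),  -- status 1, recent: kept
        (2, 1, '2017-07-06 10:53:17'),  -- status 1, old: excluded
        (3, 2, '2017-07-06 10:53:17'),  -- other status, old: kept
        (4, 3, '2017-09-06 10:53:17'),  -- other status, recent: kept
        (5, 1, NULL);                   -- status 1, no date: excluded (NULL)
""")

cutoff = "2017-08-28"  # fixed value in place of CURRENT_DATE() - INTERVAL 10 DAY
rows = conn.execute(
    "SELECT invId FROM inv "
    "WHERE NOT (invStatus = 1 AND update_date < ?) "
    "ORDER BY invId",
    (cutoff,),
).fetchall()
kept_ids = [row[0] for row in rows]
print(kept_ids)
```

Row 5 illustrates a side effect worth knowing about with a LEFT JOIN: when the date is NULL and invStatus is 1, the condition evaluates to NULL rather than true, so the row is filtered out as well.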

Finding percentage difference from different values in the same SQL Table?

I have a table that tracks total values for months against years in a particular location.
Desired outcome: I want to compare a month's value for the current year against last year's value, and then compute the percentage increase.
e.g. January 2014 = 140 and January 2013 = 150, so (140 - 150) / 150 * 100 = -6.67
Table name: donation_tracker
Thank you in advance.
As I understand it, you want the percentage increase from last year to the current year for the same month and the same location. Use this query:
SELECT D1.month,
       ROUND((D2.Donation_amount - D1.Donation_amount) * 100 / D1.Donation_amount, 2) AS pct_increase
FROM donation_tracker D1
INNER JOIN donation_tracker D2
   ON D1.month = D2.month              -- same month
  AND D1.year = D2.year - 1            -- D1 is the year before D2
  AND D1.Location_ID = D2.Location_ID; -- same location
Let's say you need to compare the immediately-completed twelve months with the twelve months prior to that, month-by-month. I am guessing at your table and column names because, well, I don't know them.
Let's build this from the ground up.
Here's a query that will find the most recent twelve months of donations month by month.
SELECT YEAR(donation_date) AS donation_year,
MONTH(donation_date) AS donation_month,
SUM(donation_amount) AS donation_amount
FROM donations
WHERE donation_date >= LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL 13 MONTH
AND donation_date < LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL 1 MONTH
GROUP BY YEAR(donation_date), MONTH(donation_date)
That gives you a twelve-row result set like this (when NOW() happens to be in the middle of November 2014):
2013 11 145
2013 12 220
2014 1 123
2014 2 11
...
2014 10 45
The trick is picking the right range of donation_date values.
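To make the date arithmetic concrete, here is a minimal Python sketch of what LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL n MONTH evaluates to: the first day of next month, moved back n months. The helper name `months_before_next_month_start` is an assumption made for illustration, and the fixed date stands in for NOW() in mid-November 2014.

```python
from datetime import date

def months_before_next_month_start(today, months):
    """Analogue of LAST_DAY(today) + INTERVAL 1 DAY - INTERVAL `months` MONTH."""
    # Index of the month *after* today's month, counted in months since
    # year 0, then stepped back by `months`.
    index = today.year * 12 + (today.month - 1) + 1 - months
    return date(index // 12, index % 12 + 1, 1)

today = date(2014, 11, 15)  # stand-in for NOW()

# Half-open ranges: start <= donation_date < end
this_year_window = (
    months_before_next_month_start(today, 13),
    months_before_next_month_start(today, 1),
)
last_year_window = (
    months_before_next_month_start(today, 25),
    months_before_next_month_start(today, 13),
)
print(this_year_window, last_year_window)
```

With these bounds the recent window covers November 2013 through October 2014, and the window backed up by one more year covers November 2012 through October 2013.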
So, now you need two of those result sets, one for mostly-2014 and one for mostly-2013. The one for mostly-2013 looks very similar. You simply back up one more year like this.
SELECT YEAR(donation_date) AS donation_year,
MONTH(donation_date) AS donation_month,
SUM(donation_amount) AS donation_amount
FROM donations
WHERE donation_date >= LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL 25 MONTH
AND donation_date < LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL 13 MONTH
GROUP BY YEAR(donation_date), MONTH(donation_date)
This is going to be one of those notorious club-sandwich queries, made of those two basic queries. You join them by month like so, then do the percentage computation in the SELECT clause.
SELECT a.donation_month,
a.donation_amount AS this_year,
b.donation_amount AS last_year,
100.0 * (a.donation_amount - b.donation_amount) / b.donation_amount as pct_increase
FROM (
/* this year's query */
) AS a
JOIN (
/* last year's query */
) AS b ON a.donation_month = b.donation_month
ORDER BY a.donation_year, a.donation_month
Here's the whole club sandwich for your server to chew on. Yummy!
SELECT a.donation_month,
a.donation_amount AS this_year,
b.donation_amount AS last_year,
100.0 * (a.donation_amount - b.donation_amount) / b.donation_amount as pct_increase
FROM (
SELECT YEAR(donation_date) AS donation_year,
MONTH(donation_date) AS donation_month,
SUM(donation_amount) AS donation_amount
FROM donations
WHERE donation_date >= LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL 13 MONTH
AND donation_date < LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL 1 MONTH
GROUP BY YEAR(donation_date), MONTH(donation_date)
) AS a
JOIN (
SELECT YEAR(donation_date) AS donation_year,
MONTH(donation_date) AS donation_month,
SUM(donation_amount) AS donation_amount
FROM donations
WHERE donation_date >= LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL 25 MONTH
AND donation_date < LAST_DAY(NOW()) + INTERVAL 1 DAY - INTERVAL 13 MONTH
GROUP BY YEAR(donation_date), MONTH(donation_date)
) AS b ON a.donation_month = b.donation_month
ORDER BY a.donation_year, a.donation_month
Once you stack up the whole club sandwich, it looks complicated. But it's actually a stack of simple subqueries.
This should give you an idea :)
Sample data:
CREATE TABLE t
(`month` varchar(3), `year` int, `amount` int)
;
INSERT INTO t
(`month`, `year`, `amount`)
VALUES
('jan', 2013, 150),
('feb', 2013, 180),
('jan', 2014, 140),
('feb', 2014, 160)
;
Query:
select
t1.month, round((t2.amount - t1.amount) * 100 / t1.amount, 2)
from
t t1
inner join t t2 on t1.month = t2.month and t1.year < t2.year;
Result:
| MONTH | ROUND((T2.AMOUNT - T1.AMOUNT) * 100 / T1.AMOUNT, 2) |
|-------|-----------------------------------------------------|
| jan | -6.67 |
| feb | -11.11 |
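One thing to watch with the join condition t1.year < t2.year: it pairs every earlier year with every later year, which is fine for two years of data but produces extra pairs once a third year is present. The sketch below uses Python's built-in sqlite3 module to show the difference between that condition and an adjacent-year condition (t2.year = t1.year + 1). The 2012 row is hypothetical data added for illustration, and 100.0 is used because SQLite, unlike MySQL, performs integer division on integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (month TEXT, year INTEGER, amount INTEGER);
    INSERT INTO t VALUES
        ('jan', 2012, 120),  -- hypothetical extra year
        ('jan', 2013, 150),
        ('jan', 2014, 140);
""")

# Every earlier year is paired with every later year.
loose_pairs = conn.execute("""
    SELECT t1.year, t2.year
    FROM t t1
    JOIN t t2 ON t1.month = t2.month AND t1.year < t2.year
    ORDER BY t1.year, t2.year
""").fetchall()

# Only consecutive years are paired.
adjacent = conn.execute("""
    SELECT t1.year, t2.year,
           ROUND((t2.amount - t1.amount) * 100.0 / t1.amount, 2)
    FROM t t1
    JOIN t t2 ON t1.month = t2.month AND t2.year = t1.year + 1
    ORDER BY t1.year
""").fetchall()

print(loose_pairs)
print(adjacent)
```

With three years, the loose condition yields three pairs for January (including 2012 against 2014), while the adjacent-year condition yields only the two year-over-year comparisons.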

Incorrect values with multiple left joins (MySQL)

I am trying to make a report. It is supposed to give me a list of the machines at a specific customer and the sum of hours and material that was put into each machine.
In the following examples, I select the sum of materials and hours in different fields to make the problem clearer. But I really want to sum the material and hours, then group them by the machine field.
I can query the list of machines and the cost of hours without problems.
SELECT CONCAT(`customer`.`PREFIX`, `wo`.`machine_id`) AS `machine`,
ROUND(COALESCE(SUM(`wohours`.`length` * `wohours`.`price`), 0), 2) AS `hours`
FROM `wo`
JOIN `customer` ON `customer`.`id`=`wo`.`customer_id`
LEFT JOIN `wohours` ON `wohours`.`wo_id`=`wo`.`id` AND `wohours`.`wo_customer_id`=`wo`.`customer_id`
AND `wohours`.`wo_machine_id`=`wo`.`machine_id` AND `wohours`.`date`>=(CURDATE() - INTERVAL DAY(CURDATE() - INTERVAL 1 DAY) DAY) - INTERVAL 11 MONTH
WHERE `wo`.`customer_id`=1
GROUP BY `wo`.`machine_id`;
This gives me the correct values for hours. But when I add the material like this:
SELECT CONCAT(`customer`.`PREFIX`, `wo`.`machine_id`) AS `machine`,
ROUND(COALESCE(SUM(`wohours`.`length` * `wohours`.`price`), 0), 2) AS `hours`,
ROUND(COALESCE(SUM(`womaterial`.`multiplier` * `womaterial`.`price`), 0), 2) AS `material`
FROM `wo`
JOIN `customer` ON `customer`.`id`=`wo`.`customer_id`
LEFT JOIN `wohours` ON `wohours`.`wo_id`=`wo`.`id` AND `wohours`.`wo_customer_id`=`wo`.`customer_id`
AND `wohours`.`wo_machine_id`=`wo`.`machine_id` AND `wohours`.`date`>=(CURDATE() - INTERVAL DAY(CURDATE() - INTERVAL 1 DAY) DAY) - INTERVAL 11 MONTH
LEFT JOIN `womaterial` ON `womaterial`.`wo_id`=`wo`.`id` AND `womaterial`.`wo_customer_id`=`wo`.`customer_id`
AND `womaterial`.`wo_machine_id`=`wo`.`machine_id` AND `wohours`.`date`>=(CURDATE() - INTERVAL DAY(CURDATE() - INTERVAL 1 DAY) DAY) - INTERVAL 11 MONTH
WHERE `wo`.`customer_id`=1
GROUP BY `wo`.`machine_id`;
then both hour and material values are incorrect.
I have read other threads where people with similar problems could solve this by splitting it in multiple queries or subqueries. But I don't think that is possible in this case.
Any help is appreciated.
//John
Your other reading is correct: you need to put each of them into its own subquery for the join. You are probably getting invalid values because the materials table has multiple records per machine, which causes a Cartesian result against the hours rows from your original query. Each hours row is repeated for every matching material row (and vice versa), and since you don't know which side has many rows and which has just one, both sums look incorrect.
In the query I've written, each innermost query pre-aggregates woHours or woMaterial and produces a single record per wo_id and machine_id to join back to the wo table. Each of these queries is restricted to the single customer ID you are running the report for.
Then, re-joined to the work order (wo) table, the query grabs all records and applies ROUND() and COALESCE() in case no hours or materials are present. That returns something like:
| WO | Machine ID | Machine | Hours | Material |
|----|------------|---------|-------|----------|
| 1  | 1          | CustX 1 | 2     | 0        |
| 2  | 4          | CustY 4 | 2.5   | 6.5      |
| 3  | 4          | CustY 4 | 1.2   | .5       |
| 4  | 1          | CustX 1 | 1.5   | 1.2      |
Finally, you can roll up the SUM() of all these entries into a single row per machine ID:
| Machine | Hours | Material |
|---------|-------|----------|
| CustX 1 | 3.5   | 1.2      |
| CustY 4 | 3.7   | 7.0      |
SELECT
AllWO.Machine,
SUM( AllWO.Hours ) Hours,
SUM( AllWO.Material ) Material
from
( SELECT
wo.id AS wo_id,
wo.Machine_ID,
CONCAT(customer.PREFIX, wo.machine_id) AS machine,
ROUND( COALESCE( PreSumHours.MachineHours, 0), 2) AS hours,
ROUND( COALESCE( PreSumMaterial.materialHours, 0), 2) AS material
FROM
wo
JOIN customer
ON wo.customer_id = customer.id
LEFT JOIN ( select wohours.wo_id,
wohours.wo_machine_id,
SUM( wohours.length * wohours.price ) as machinehours
from
wohours
where
wohours.wo_customer_id = 1
AND wohours.date >= ( CURDATE() - INTERVAL DAY( CURDATE() - INTERVAL 1 DAY) DAY) - INTERVAL 11 MONTH
group by
wohours.wo_id,
wohours.wo_machine_id ) as PreSumHours
ON wo.id = PreSumHours.wo_id
AND wo.machine_id = PreSumHours.wo_machine_id
LEFT JOIN ( select womaterial.wo_id,
womaterial.wo_machine_id,
SUM( womaterial.multiplier * womaterial.price ) as materialHours
from
womaterial
where
womaterial.wo_customer_id = 1
AND womaterial.date >= ( CURDATE() - INTERVAL DAY( CURDATE() - INTERVAL 1 DAY) DAY) - INTERVAL 11 MONTH
group by
womaterial.wo_id,
womaterial.wo_machine_id ) as PreSumMaterial
ON wo.id = PreSumMaterial.wo_id
AND wo.machine_id = PreSumMaterial.wo_machine_id
WHERE
wo.customer_id = 1 ) AllWO
group by
AllWO.Machine
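The Cartesian effect described above is easy to reproduce on a tiny data set. The following sketch uses Python's built-in sqlite3 module with simplified, hypothetical versions of the tables (a single `cost` column instead of length, multiplier and price) to compare the naive double LEFT JOIN with pre-aggregated subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema: one work order, 2 hours rows, 3 material rows.
    CREATE TABLE wo (id INTEGER PRIMARY KEY);
    CREATE TABLE wohours (wo_id INTEGER, cost REAL);
    CREATE TABLE womaterial (wo_id INTEGER, cost REAL);
    INSERT INTO wo VALUES (1);
    INSERT INTO wohours VALUES (1, 10.0), (1, 20.0);
    INSERT INTO womaterial VALUES (1, 1.0), (1, 2.0), (1, 3.0);
""")

# Naive: both child tables joined at once, giving 2 x 3 = 6 rows before SUM().
naive = conn.execute("""
    SELECT SUM(h.cost), SUM(m.cost)
    FROM wo
    LEFT JOIN wohours h ON h.wo_id = wo.id
    LEFT JOIN womaterial m ON m.wo_id = wo.id
    GROUP BY wo.id
""").fetchone()

# Pre-aggregated: each child table is reduced to one row per work order first.
pre_aggregated = conn.execute("""
    SELECT COALESCE(h.total, 0), COALESCE(m.total, 0)
    FROM wo
    LEFT JOIN (SELECT wo_id, SUM(cost) AS total
               FROM wohours GROUP BY wo_id) h ON h.wo_id = wo.id
    LEFT JOIN (SELECT wo_id, SUM(cost) AS total
               FROM womaterial GROUP BY wo_id) m ON m.wo_id = wo.id
""").fetchone()

print(naive)           # hours tripled, material doubled
print(pre_aggregated)  # true totals
```

In the naive query each hours row appears once per material row, so the hours total is multiplied by 3 and the material total by 2. Pre-aggregating removes the multiplication because each subquery contributes at most one row per work order.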

Changing start-date in MySQL for week

I found the following code to help in creating a weekly report based on a start date of Friday. The instructions say to replace ".$startWeekDay." with a 4. When I put '".$startDay."' as '2013-01-30', I get errors.
Also I get a report by day rather than week as I desire.
SELECT SUM(cost) AS total,
CONCAT(IF(date - INTERVAL 6 day < '".$startDay."',
'".$startDay."',
IF(WEEKDAY(date - INTERVAL 6 DAY) = ".$startWeekDay.",
date - INTERVAL 6 DAY,
date - INTERVAL ((WEEKDAY(date) - ".$startWeekDay.")) DAY)),
' - ', date) AS week,
IF((WEEKDAY(date) - ".$startWeekDay.") >= 0,
TO_DAYS(date) - (WEEKDAY(date) - ".$startWeekDay."),
TO_DAYS(date) - (7 - (".$startWeekDay." - WEEKDAY(date)))) AS sortDay
FROM daily_expense
WHERE date BETWEEN '".$startDay."' AND '".$endDay."'
GROUP BY sortDay;
The following code is what I am using:
SELECT count(DISTINCT (
UserID)
) AS total, CONCAT(IF(date(LastModified) - INTERVAL 6 day < date(LastModified),
date(LastModified),
IF(WEEKDAY(date(LastModified) - INTERVAL 6 DAY) = 4,
date(LastModified) - INTERVAL 6 DAY,
date(LastModified) - INTERVAL ((WEEKDAY(date(LastModified)) - 4)) DAY)),
' - ', date(LastModified)) AS week
FROM `Purchase`
WHERE `OfferingID` =87
AND `Status`
IN ( 1, 4 )
GROUP BY week
The output I get is
total week
3 2013-01-30 - 2013-01-30
1 2013-01-31 - 2013-01-31
I'm not sure exactly how you want to display your week; the SQL above is attempting to display date ranges. If that isn't a requirement, your query can be very simple: offset the date by two days (since Friday is two days before Sunday, the start of the week in WEEK()'s default mode) and use the WEEK() function to get the week number.
The query would look like this:
select count(distinct (UserID)) as total
, year( LastModified + interval 2 day ) as year
, week( LastModified + interval 2 day ) as week_number
FROM `Purchase`
WHERE `OfferingID` =87
AND `Status`
IN ( 1, 4 )
group by year, week_number;
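If the date range of each week is still wanted for display, the Friday that starts the week can be derived with modular arithmetic on the weekday number. This is a minimal Python sketch of that calculation, not part of the query above; the function name `friday_week_start` is an assumption made for illustration. The two dates are the ones from the output in the question.

```python
from collections import Counter
from datetime import date, timedelta

FRIDAY = 4  # Monday is 0 in both Python's date.weekday() and MySQL's WEEKDAY()

def friday_week_start(d):
    """Return the Friday on or before `d`."""
    return d - timedelta(days=(d.weekday() - FRIDAY) % 7)

dates = [date(2013, 1, 30), date(2013, 1, 31), date(2013, 2, 1)]

# Count how many dates fall into each Friday-based week.
per_week = Counter(friday_week_start(d) for d in dates)
for start, count in sorted(per_week.items()):
    end = start + timedelta(days=6)
    print(f"{start} - {end}: {count}")
```

2013-01-30 and 2013-01-31 land in the week starting Friday 2013-01-25, while 2013-02-01 is itself a Friday and starts a new week.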

Date calculation ranking - Assign a value for event age

I'm trying to assign a value to events based on their age using this:
SELECT u.iduser, timeIsImportant
FROM user AS u
LEFT OUTER JOIN(
SELECT action, user_id, action_time,
(IF(bb.action_time < DATE_SUB(CURDATE(), INTERVAL 7 DAY),
(CASE
WHEN bb.action_time BETWEEN DATE_SUB(CURDATE(), INTERVAL 3 MONTH) AND CURDATE() THEN 0.1
WHEN bb.action_time BETWEEN DATE_SUB(CURDATE(), INTERVAL 6 MONTH) AND CURDATE() THEN 0.2
WHEN bb.action_time BETWEEN DATE_SUB(CURDATE(), INTERVAL 12 MONTH) AND CURDATE() THEN 0.4
WHEN bb.action_time BETWEEN DATE_SUB(CURDATE(), INTERVAL 18 MONTH) AND CURDATE() THEN 0.7
WHEN bb.action_time BETWEEN DATE_SUB(CURDATE(), INTERVAL 24 MONTH) AND CURDATE() THEN 1.0
END), 0))
AS timeIsImportant
FROM bigbrother AS bb
ORDER BY bb.uid DESC LIMIT 1)
AS bbb
ON
bbb.user_id = u.iduser AND bbb.action = "C"
WHERE u.iduser = 2;
The idea is that older 'events' in the bigbrother table need to subtract different values from a ranking query calculation. The timeIsImportant value from the query above would be the agePoints in the following example.
sample data:
row1 row2
------------- -------------
rank: 4.7 rank 4.9
agePoints: 0.1 agePoints 0.4
timedRank: (rank-AgePoints) timedRank: (rank-AgePoints)
-------------------------------------------------------------
SQL: ORDER BY timedRank DESC
row1, row2
SQL: ORDER BY timedRank ASC
row2, row1
I wonder if there's another way to assign the values based on event age, since I'm doing this calculation on every page load in order to rank search results, and I found that this piece of code slows overall performance when nested within the search query.
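For reference, the tier mapping in the CASE expression can be written as a sorted-threshold lookup. The sketch below is in Python and only restates the tiers; it makes no claim about database performance. It rests on an assumption that differs from the SQL: months are approximated as 30 days, whereas DATE_SUB(..., INTERVAL n MONTH) uses calendar months. The names `AGE_LIMITS_DAYS`, `AGE_POINTS` and `age_points` are hypothetical.

```python
from bisect import bisect_left

# Upper bounds (inclusive) of each tier, in days.
# Assumption: 1 month is approximated as 30 days.
AGE_LIMITS_DAYS = [7, 90, 180, 360, 540, 720]

# Points per tier; the last entry covers events older than 24 months,
# for which the CASE expression has no branch and yields NULL.
AGE_POINTS = [0.0, 0.1, 0.2, 0.4, 0.7, 1.0, None]

def age_points(age_days):
    """Return the points to subtract for an event that is `age_days` old."""
    return AGE_POINTS[bisect_left(AGE_LIMITS_DAYS, age_days)]

for age in (3, 8, 91, 400, 721):
    print(age, age_points(age))
```

Events up to 7 days old get 0, matching the outer IF, and each later tier takes over once the previous bound is exceeded.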