I have a table containing the following fields:
date, time, node, result
describing a numeric result for different nodes at different dates and times throughout each day. A typical listing looks something like this:
date | time | node | result
----------------------------------
2011-03-01 | 10:02 | A | 10
2011-03-01 | 11:02 | A | 20
2011-03-02 | 03:13 | A | 23
2011-03-02 | 12:15 | A | 18
2011-03-02 | 13:15 | A | 8
2011-03-01 | 13:12 | B | 2
2011-03-01 | 14:26 | B | 1
2011-03-02 | 08:00 | B | 6
2011-03-02 | 07:22 | B | 3
2011-03-02 | 21:19 | B | 4
I want to form a query that'll get the last result from each day for each node, such that I'd get something like this:
date | time | node | latest
-----------------------------------
2011-03-01 | 11:02 | A | 20
2011-03-01 | 14:26 | B | 1
2011-03-02 | 13:15 | A | 8
2011-03-02 | 21:19 | B | 4
I thought about doing a GROUP BY date, node, but then extracting the last value was a mess (I used group_concat(result order by time) and then SUBSTRING() to get the last value. Baah, I know). Is there a simple way to do this in MySQL?
I'm pretty sure I have seen a similar request solved very nicely without an INNER JOIN, but I can't find it right now (and it might have been for SQL Server). The following should work nevertheless.
SELECT n.*
FROM Nodes n
INNER JOIN (
SELECT MAX(time) AS Time
, Date
, Node
FROM Nodes
GROUP BY
Date
, Node
) nm ON nm.time = n.time
AND nm.Date = n.Date
AND nm.Node = n.Node
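To make the join-back approach above concrete, here is a minimal sketch that runs the same query against the sample rows from the question. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL (an assumption for the sake of a runnable example); the column alias latest and the ORDER BY are added only so the output lines up with the desired listing.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Nodes (date TEXT, time TEXT, node TEXT, result INTEGER)")
conn.executemany(
    "INSERT INTO Nodes VALUES (?, ?, ?, ?)",
    [
        ("2011-03-01", "10:02", "A", 10),
        ("2011-03-01", "11:02", "A", 20),
        ("2011-03-02", "03:13", "A", 23),
        ("2011-03-02", "12:15", "A", 18),
        ("2011-03-02", "13:15", "A", 8),
        ("2011-03-01", "13:12", "B", 2),
        ("2011-03-01", "14:26", "B", 1),
        ("2011-03-02", "08:00", "B", 6),
        ("2011-03-02", "07:22", "B", 3),
        ("2011-03-02", "21:19", "B", 4),
    ],
)

# Find the latest time per (date, node), then join back to fetch the full row.
query = """
SELECT n.date, n.time, n.node, n.result AS latest
FROM Nodes n
INNER JOIN (
    SELECT date, node, MAX(time) AS time
    FROM Nodes
    GROUP BY date, node
) nm ON nm.date = n.date AND nm.node = n.node AND nm.time = n.time
ORDER BY n.date, n.node
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

If two rows for the same node share the same date and time, both are returned; a unique key on (date, time, node) would rule that out.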
I would think you would use something like the MAX() function. Sorry, I don't have MySQL so I can't test this, but I would try something like:
select t.date, t.node, Max(t.time) as time from Table t Group By t.node, t.date
The aggregate returns only one row per group. Note that this only gives you the latest time for each date and node; to get the result that goes with that time you still have to join it back to the table.
I have a query. I want to subtract the first row of each day from the last row of the same day. I wrote this query, but I'm not sure about its performance. Is there an alternative way to solve this problem?
| imei | date | km |
|-----------------------------------------|
| 123 | 2019-01-15 00:00:01 | 15 |
| 123 | 2019-01-15 12:12:08 | 8 |
| 123 | 2019-01-15 23:00:59 | 30 |
| 456 | 2019-01-15 00:03:12 | 232 |
| 456 | 2019-01-15 07:04:00 | 123 |
| 456 | 2019-01-15 23:16:18 | 464 |
My query:
SELECT
gg.imei,
DATE_FORMAT(gg.datee, '%Y-%m-%d'),
gg.km - (SELECT
g.km
FROM
gps g
WHERE
g.datee LIKE '2019-01-15%'
AND g.datee = (SELECT
MIN(t.datee)
FROM
gps t
WHERE
t.datee LIKE '2019-01-15%'
AND t.imei = g.imei)
AND g.imei = gg.imei
GROUP BY g.imei) AS km
FROM
gps gg
WHERE
gg.datee LIKE '2019-01-15%'
AND gg.datee = (SELECT
MAX(ts.datee)
FROM
gps ts
WHERE
ts.datee LIKE '2019-01-15%'
AND gg.imei = ts.imei)
The result is correct.
| imei | date | km |
|------------------------------|
| 123 | 2019-01-15 | 15 |
| 456 | 2019-01-15 | 232 |
But the query is too complicated.
Edit: There are 3 million records in the table.
You can find first and last datetime for each imei-date pair in a sub query then join with it:
SELECT agg.imei, agg.date_date, gps_last.km - gps_frst.km AS diff
FROM (
SELECT imei, DATE(date) AS date_date, MIN(date) AS date_frst, MAX(date) AS date_last
FROM gps
GROUP BY imei, DATE(date)
) AS agg
JOIN gps AS gps_frst ON agg.imei = gps_frst.imei AND agg.date_frst = gps_frst.date
JOIN gps AS gps_last ON agg.imei = gps_last.imei AND agg.date_last = gps_last.date
You need appropriate indexes on your table though. The DATE(date) part in particular will be slow, so you might want to consider adding another column for storing the date part only.
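As a rough check of the subquery-and-join approach above, the sketch below runs it on the six sample rows from the question. SQLite (via Python's sqlite3) stands in for MySQL here, and the index name idx_gps_imei_date is a hypothetical choice for illustration; a composite index on (imei, date) is one plausible candidate for the "appropriate indexes" mentioned.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL gps table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gps (imei TEXT, date TEXT, km INTEGER)")
# Hypothetical composite index; it covers both the grouping and the join-back.
conn.execute("CREATE INDEX idx_gps_imei_date ON gps (imei, date)")
conn.executemany(
    "INSERT INTO gps VALUES (?, ?, ?)",
    [
        ("123", "2019-01-15 00:00:01", 15),
        ("123", "2019-01-15 12:12:08", 8),
        ("123", "2019-01-15 23:00:59", 30),
        ("456", "2019-01-15 00:03:12", 232),
        ("456", "2019-01-15 07:04:00", 123),
        ("456", "2019-01-15 23:16:18", 464),
    ],
)

# First and last timestamp per (imei, day), then two joins to fetch the km values.
query = """
SELECT agg.imei, agg.date_date, gps_last.km - gps_frst.km AS diff
FROM (
    SELECT imei, DATE(date) AS date_date, MIN(date) AS date_frst, MAX(date) AS date_last
    FROM gps
    GROUP BY imei, DATE(date)
) AS agg
JOIN gps AS gps_frst ON agg.imei = gps_frst.imei AND agg.date_frst = gps_frst.date
JOIN gps AS gps_last ON agg.imei = gps_last.imei AND agg.date_last = gps_last.date
ORDER BY agg.imei
"""
diffs = conn.execute(query).fetchall()
for row in diffs:
    print(row)
```

The output matches the result table in the question (15 for imei 123 and 232 for imei 456).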
I came across a task where I have to return the total COUNT and SUM of issued policies for each day of the month and compare it to the previous year.
Table PolicyOrder has fields:
PolicyOrderId - primary key
CreatedAt (DATETIME)
CalculatedPremium - cost of policy or "premium"
PolicyOrderStatusId - irrelevant to the question but still - status of the policy.
To solve this I came up with a query that joins the table to itself and sums/counts by grouping on the DAY of the creation date.
SELECT
DATE(po1.CreatedAt) AS dayDate_2017,
SUM(po1.CalculatedPremium) AS premiumSum_2017,
COUNT(po1.PolicyOrderId) AS policyCount_2017,
po2.*
FROM
PolicyOrder po1
INNER JOIN (
SELECT
DATE(CreatedAt) AS dayDate_2018,
SUM(CalculatedPremium) AS premiumSum_2018,
COUNT(PolicyOrderId) AS policyCount_2018
FROM
PolicyOrder po2
WHERE
YEAR(CreatedAt) = 2018 AND
MONTH(CreatedAt) = 10 AND
PolicyOrderStatusId = 6
GROUP BY
DAY(CreatedAt)
) po2 ON (
DAY(po2.dayDate_2018) = DAY(po1.CreatedAt)
)
WHERE
YEAR(po1.CreatedAt) = 2017 AND
MONTH(po1.CreatedAt) = 10 AND
PolicyOrderStatusId = 6
GROUP BY
DAY(po1.CreatedAt)
The above query returns these results:
dayDate_2017 | premiumSum_2017 | policyCount_2017 | dayDate_2018 | premiumSum_2018 | policyCount_2018
2017-10-01 | 4699.36 | 98 | 2018-10-01 | 8524.21 | 144
2017-10-02 | 9114.55 | 168 | 2018-10-02 | 7942.25 | 140
2017-10-03 | 9512.43 | 178 | 2018-10-03 | 9399.61 | 161
2017-10-04 | 9291.77 | 155 | 2018-10-04 | 6922.83 | 137
2017-10-05 | 8063.27 | 155 | 2018-10-05 | 9278.58 | 178
2017-10-06 | 9743.40 | 184 | 2018-10-06 | 6139.38 | 136
...
2017-10-31 | ...
The problem is that I now have to add two more columns, in which policies are counted and amounts summed from the start of the year up until each returned row.
Desired results:
dayDate_2017 | premiumSum_2017 | policyCount_2017 | sumFromYearBegining | countFromYearBegining
2017-10-01 | 4699.36 | 98 | 150000.34 | 5332
2017-10-02 | 9114.55 | 168 | 156230.55 | 5443
2017-10-03 | 9512.43 | 178 | 160232.44 | 5663
...
2017-10-31 | ...
where:
sumFromYearBegining (150000.34) - SUM of premiumSum from 2017-01-01 until 2017-10-01 (exclusive)
countFromYearBegining (5332) - COUNT of policies from 2017-01-01 until 2017-10-01 (exclusive)
sumFromYearBegining (156230.55) - SUM of premiumSum from 2017-01-01 until 2017-10-02 (exclusive)
countFromYearBegining (5443) - COUNT of policies from 2017-01-01 until 2017-10-02 (exclusive)
sumFromYearBegining (160232.44) - SUM of premiumSum from 2017-01-01 until 2017-10-03 (exclusive)
countFromYearBegining (5663) - COUNT of policies from 2017-01-01 until 2017-10-03 (exclusive)
I have tried inner joining the same table with COUNT and SUM, which failed because I cannot specify the range up to which I need to count and sum. I have also tried LEFT joining and then counting, which fails because the results are counted not up to each row but up to the last result, etc.
DB Fiddle: https://www.db-fiddle.com/f/ckM8HyTD6NjLbK41Mq1gct/5
Any help from you SQL ninjas highly appreciated.
We can use User-defined variables to calculate Rolling Sum / Count, in absence of Window Functions' availability.
We will first need to determine the Sum and Count for every day in the year 2017 (even though you need rows for a particular month only). This is because, to calculate the rolling Sum for days in, say, March, we need the sum/count values from January and February as well. One possible optimization is to restrict the calculation to the range from the first month up to the required month only.
Note that ORDER BY daydate_2017 is necessary to calculate the rolling sum correctly. By default, the data is unordered; without defining the order, we cannot guarantee that the Sum will be correct.
Also, we need two levels of sub-select queries. The first level calculates the rolling sum values. The second level restricts the result to February only. Since WHERE is executed before SELECT, we cannot restrict the result to February in the first level itself.
If you need similar rolling Sum for the year 2018 as well; similar query logic can be implemented in other set of sub-select queries.
SELECT dt2_2017.*, dt_2018.*
FROM
(
SELECT dt_2017.*,
@totsum := @totsum + dt_2017.premiumsum_2017 AS sumFromYearBegining_2017,
@totcount := @totcount + dt_2017.policycount_2017 AS countFromYearBeginning_2017
FROM (SELECT Date(po1.createdat) AS dayDate_2017,
Sum(po1.calculatedpremium) AS premiumSum_2017,
Count(po1.policyorderid) AS policyCount_2017
FROM PolicyOrder AS po1
WHERE po1.policyorderstatusid = 6 AND
YEAR(po1.createdat) = 2017 AND
MONTH(po1.createdat) <= 2 -- calculate upto February for 2017
GROUP BY daydate_2017
ORDER BY daydate_2017) AS dt_2017
CROSS JOIN (SELECT @totsum := 0, @totcount := 0) AS user_init_vars
) AS dt2_2017
INNER JOIN (
SELECT
DATE(po2.CreatedAt) AS dayDate_2018,
SUM(po2.CalculatedPremium) AS premiumSum_2018,
COUNT(po2.PolicyOrderId) AS policyCount_2018
FROM
PolicyOrder po2
WHERE
YEAR(po2.CreatedAt) = 2018 AND
MONTH(po2.CreatedAt) = 2 AND
po2.PolicyOrderStatusId = 6
GROUP BY
dayDate_2018
) dt_2018 ON DAY(dt_2018.dayDate_2018) = DAY(dt2_2017.dayDate_2017)
WHERE YEAR(dt2_2017.daydate_2017) = 2017 AND
MONTH(dt2_2017.daydate_2017) = 2;
RESULT: View on DB Fiddle
| dayDate_2017 | premiumSum_2017 | policyCount_2017 | sumFromYearBegining_2017 | countFromYearBeginning_2017 | dayDate_2018 | premiumSum_2018 | policyCount_2018 |
| ------------ | --------------- | ---------------- | ------------------------ | --------------------------- | ------------ | --------------- | ---------------- |
| 2017-02-01 | 4131.16 | 131 | 118346.77 | 3627 | 2018-02-01 | 8323.91 | 149 |
| 2017-02-02 | 2712.74 | 85 | 121059.51000000001 | 3712 | 2018-02-02 | 9469.33 | 153 |
| 2017-02-03 | 3888.59 | 111 | 124948.1 | 3823 | 2018-02-03 | 6409.21 | 97 |
| 2017-02-04 | 2447.99 | 74 | 127396.09000000001 | 3897 | 2018-02-04 | 5693.69 | 120 |
| 2017-02-05 | 1437.5 | 45 | 128833.59000000001 | 3942 | 2018-02-05 | 8574.97 | 129 |
| 2017-02-06 | 4254.48 | 127 | 133088.07 | 4069 | 2018-02-06 | 8277.51 | 133 |
| 2017-02-07 | 4746.49 | 136 | 137834.56 | 4205 | 2018-02-07 | 9853.75 | 173 |
| 2017-02-08 | 3898.05 | 125 | 141732.61 | 4330 | 2018-02-08 | 9116.33 | 144 |
| 2017-02-09 | 8306.86 | 286 | 150039.46999999997 | 4616 | 2018-02-09 | 8818.32 | 166 |
| 2017-02-10 | 6740.99 | 204 | 156780.45999999996 | 4820 | 2018-02-10 | 7880.17 | 134 |
| 2017-02-11 | 4290.38 | 133 | 161070.83999999997 | 4953 | 2018-02-11 | 8394.15 | 180 |
| 2017-02-12 | 3687.58 | 122 | 164758.41999999995 | 5075 | 2018-02-12 | 10378.29 | 171 |
| 2017-02-13 | 4939.31 | 159 | 169697.72999999995 | 5234 | 2018-02-13 | 9383.15 | 160 |
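For comparison, where window functions are available (MySQL 8.0 and later), the same two-level idea can be sketched without user variables: compute the daily totals for the whole period from January, take a running total over them, and only then filter to the month of interest. The sketch below is an illustration, not the answer's method. It runs on SQLite through Python's sqlite3 as a stand-in for MySQL, uses a handful of made-up rows with whole-number premiums, and assumes the running total should exclude the current day, as the question's "(excluding)" notes ask.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8+ (assumption); needs SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PolicyOrder (PolicyOrderId INTEGER PRIMARY KEY, CreatedAt TEXT, "
    "CalculatedPremium INTEGER, PolicyOrderStatusId INTEGER)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO PolicyOrder VALUES (?, ?, ?, ?)",
    [
        (1, "2017-01-10 09:00:00", 100, 6),
        (2, "2017-01-10 15:00:00", 50, 6),
        (3, "2017-01-20 10:00:00", 200, 6),
        (4, "2017-02-01 08:00:00", 30, 6),
        (5, "2017-02-01 12:00:00", 70, 5),  # different status, ignored by the query
        (6, "2017-02-02 11:00:00", 40, 6),
        (7, "2017-02-02 16:00:00", 60, 6),
    ],
)

query = """
SELECT dayDate, premiumSum, policyCount, sumFromYearBeginning, countFromYearBeginning
FROM (
    -- Level 1: running totals over every day since the start of the year,
    -- excluding the current day (frame ends at 1 PRECEDING).
    SELECT dayDate, premiumSum, policyCount,
           COALESCE(SUM(premiumSum) OVER w, 0) AS sumFromYearBeginning,
           COALESCE(SUM(policyCount) OVER w, 0) AS countFromYearBeginning
    FROM (
        SELECT DATE(CreatedAt) AS dayDate,
               SUM(CalculatedPremium) AS premiumSum,
               COUNT(PolicyOrderId) AS policyCount
        FROM PolicyOrder
        WHERE PolicyOrderStatusId = 6
          AND CreatedAt >= '2017-01-01' AND CreatedAt < '2017-03-01'
        GROUP BY DATE(CreatedAt)
    ) AS daily
    WINDOW w AS (ORDER BY dayDate ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
) AS running
-- Level 2: only now restrict the output to February.
WHERE dayDate >= '2017-02-01'
ORDER BY dayDate
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

Changing the frame to end at CURRENT ROW would give an inclusive running total, which is what the user-variable query above produces.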
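If you want a way that avoids using @variables in the select list, and also avoids analytic (window) functions (only MySQL 8 supports them), you can do it with a semi-cartesian product. The query below uses CTEs for readability; WITH also requires MySQL 8, so on older versions inline each CTE as a derived table wherever it is referenced: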
WITH prevYr AS(
SELECT
YEAR(CreatedAt) AS year_prev,
MONTH(CreatedAt) AS month_prev,
DAY(CreatedAt) AS day_prev,
SUM(CalculatedPremium) AS premiumSum_prev,
COUNT(PolicyOrderId) AS policyCount_prev
FROM
PolicyOrder
WHERE
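CreatedAt >= '2017-02-01' AND CreatedAt < '2017-03-01' AND -- half-open range: on a DATETIME, BETWEEN ... AND '2017-02-28' stops at midnight on the 28th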
PolicyOrderStatusId = 6
GROUP BY
YEAR(CreatedAt), MONTH(CreatedAt), DAY(CreatedAt)
),
currYr AS (
SELECT
YEAR(CreatedAt) AS year_curr,
MONTH(CreatedAt) AS month_curr,
DAY(CreatedAt) AS day_curr,
SUM(CalculatedPremium) AS premiumSum_curr,
COUNT(PolicyOrderId) AS policyCount_curr
FROM
PolicyOrder
WHERE
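CreatedAt >= '2018-02-01' AND CreatedAt < '2018-03-01' AND -- half-open range: on a DATETIME, BETWEEN ... AND '2018-02-28' stops at midnight on the 28th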
PolicyOrderStatusId = 6
GROUP BY
YEAR(CreatedAt), MONTH(CreatedAt), DAY(CreatedAt)
)
SELECT
*
FROM
prevYr
INNER JOIN
currYr
ON
currYr.day_curr = prevYr.day_prev
INNER JOIN
(
SELECT
main.day_prev AS dayRolling_prev,
SUM(pre.premiumSum_prev) AS premiumSumRolling_prev,
SUM(pre.policyCount_prev) AS policyCountRolling_prev
FROM
prevYr main LEFT OUTER JOIN prevYr pre ON pre.day_prev < main.day_prev
GROUP BY
main.day_prev
) rollingPrev
ON
currYr.day_curr = rollingPrev.dayRolling_prev
ORDER BY 1,2,3
We summarise the 2017 and 2018 data into two CTEs because it makes things a lot cleaner and neater later, particularly for this rolling count. You can probably follow the logic of the CTEs easily because it's lifted more or less straight from your query. I only dropped the DATE column in favour of a year/month/day triplet because it made other things cleaner (joins), and it can be recombined into a date if needed. I also swapped the WHERE clauses to filter on a date range (BETWEEN x AND y, or a half-open >= x AND < y range) because a range on the bare column can use an index, whereas YEAR(date) = x AND MONTH(date) = y might not.
The rolling count works via something I referred to as a semi-cartesian. It's actually a cartesian product; any database join that results in rows from one or both tables multiplying and being represented repeatedly in the output is a cartesian product. Rather than being a full product (every row crossed with every other row), in this case it uses a less-than comparison, so every row is only crossed with a subset of rows. As the date increases, more rows match the predicate, because a date of the 30th has 29 rows that are less than it.
This thus causes the following pattern of data:
maindate predate maincount precount
2017-02-01 NULL 10 NULL
2017-02-02 2017-02-01 20 10
2017-02-03 2017-02-01 30 10
2017-02-03 2017-02-02 30 20
2017-02-04 2017-02-01 40 10
2017-02-04 2017-02-02 40 20
2017-02-04 2017-02-03 40 30
You can see that any given main date repeats N - 1 times, because there are N - 1 dates lower than it that satisfy the join condition predate < maindate.
If we group by the maindate and sum the counts associated with each predate, we get the rolling sum of all the pre-counts on that main date. So, on the 4th day of the month, it's the SUM of the pre-counts for the 1st to the 3rd, i.e. 10+20+30 = 60. On the 5th day, we sum the counts for days 1 to 4. On the 6th day, we sum days 1 to 5, and so on.
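The pattern described above can be reproduced on the toy numbers from the listing. This is a minimal sketch using SQLite through Python's sqlite3 (a stand-in for MySQL); the table name daily and its columns are hypothetical and simply mirror the maindate/maincount columns shown.

```python
import sqlite3

# In-memory SQLite database; "daily" is a hypothetical table of per-day counts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (maindate TEXT, maincount INTEGER)")
conn.executemany(
    "INSERT INTO daily VALUES (?, ?)",
    [
        ("2017-02-01", 10),
        ("2017-02-02", 20),
        ("2017-02-03", 30),
        ("2017-02-04", 40),
    ],
)

# Each main row is joined to every earlier row; grouping by the main row
# and summing the earlier counts gives the rolling total of previous days.
query = """
SELECT main.maindate, main.maincount, SUM(pre.maincount) AS rolling_precount
FROM daily main
LEFT JOIN daily pre ON pre.maindate < main.maindate
GROUP BY main.maindate, main.maincount
ORDER BY main.maindate
"""
rolling = conn.execute(query).fetchall()
for row in rolling:
    print(row)
```

The first day has no earlier rows, so its rolling value is NULL (None in Python); wrap the SUM in COALESCE(..., 0) if a zero is preferred. Keep in mind that the join produces on the order of N²/2 intermediate rows for N days.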
I am looking for a solution to SELECT (or otherwise derive) the values for Column C (minimum price for last 3 days only, not for the whole column).
----------------------------------------
Date | Unit_ | Low_3_days |
| price | |
----------------------------------------
2015-01-01 | 15 | should be: 15 |
2015-01-02 | 17 | should be: 15 |
2015-01-03 | 21 | should be: 15 |
2015-01-04 | 18 | should be: 17 |
2015-01-05 | 12 | should be: 12 |
2015-01-06 | 14 | should be: 12 |
2015-01-07 | 16 | should be: 12 |
----------------------------------------
My thought revolves around the following, but yielding an error:
select S.Date,Unit_price,
(SELECT min(LOW_3_days)
FROM table
where S.DATE BETWEEN S.DATE-1
and S.DATE-3)
AS min_price_3_days
FROM table AS S
What is the correct query to get this to work? Database used MySQL.
You are pretty close. When working with correlated subqueries, always use table aliases to be absolutely clear about where the columns are coming from:
select S.Date, Unit_price,
(SELECT min(s2.Unit_Price)
FROM table s2
WHERE s2.DATE BETWEEN s.DATE - interval 3 day and
s.DATE - interval 1 day
) as min_price_3_days
FROM table S;
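One thing to check against the expected column in the question: the "should be" values include the current day (2015-01-05 expects 12, which is that day's own price), so the window that reproduces them is the current day plus the two days before it. The sketch below tries that variant on the sample data. It uses SQLite through Python's sqlite3 as a stand-in, so the date arithmetic is written date(s.date, '-2 day'); the MySQL form would be s.date - INTERVAL 2 DAY. The table name prices is a placeholder.

```python
import sqlite3

# In-memory SQLite database; "prices" is a placeholder table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (date TEXT, unit_price INTEGER)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?)",
    [
        ("2015-01-01", 15),
        ("2015-01-02", 17),
        ("2015-01-03", 21),
        ("2015-01-04", 18),
        ("2015-01-05", 12),
        ("2015-01-06", 14),
        ("2015-01-07", 16),
    ],
)

# Correlated subquery: minimum price over the current day and the two days before it.
query = """
SELECT s.date, s.unit_price,
       (SELECT MIN(s2.unit_price)
        FROM prices s2
        WHERE s2.date BETWEEN date(s.date, '-2 day') AND s.date
       ) AS low_3_days
FROM prices s
ORDER BY s.date
"""
lows = [row[2] for row in conn.execute(query)]
print(lows)
```

With the window shifted to exclude the current day (3 days back to 1 day back), the first row would come out as NULL and 2015-01-05 as 17, so pick the bounds that match the intended meaning of "last 3 days".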
Sorry for the confusing Title. I have been struggling with this query for quite a long time and I will do my best to explain what I am looking for.
I have 3 tables I am trying to pull from. We will call them:
headers
details
users
The headers table contains two important fields:
ref_num
headers_uid
The details table has the following important columns:
details_uid
headers_uid
work_time
user_uid
disposition
date_time
The users table has the following:
user_uid
username
An example of the details table which contains the majority of the information I need is as follows:
details_uid | headers_uid | work_time | user_uid | disposition | date_time
1 | 10 | 25:00 | 5 | o | 2013-07-02 12:14:48
2 | 10 | 10:00 | 7 | p | 2013-07-02 13:55:37
3 | 10 | 5:00 | 5 | c | 2013-07-02 15:04:28
4 | 12 | 7:00 | 5 | o | 2013-07-02 15:20:21
5 | 12 | 12:00 | 7 | p | 2013-07-02 15:35:27
6 | 12 | 3:00 | 7 | c | 2013-07-02 15:40:19
What I'm trying to do is display the headers.ref_num, the sum of work_time per user over ALL details rows with the same details.headers_uid, and only the LAST disposition for that details.headers_uid and user. The results must be restricted to a specific date_time (I generally search by > CURDATE() to grab events for today). Also, instead of displaying the user_uid, I will be filtering in a WHERE clause by users.username (I have usernames stored in a txt file which is turned into an IN statement).
Ideally, this is what I would like to see:
ref_num | work_time | username | disposition |
A10 | 30:00 | mike | c |
A10 | 10:00 | james | p |
A12 | 7:00 | mike | o |
A12 | 15:00 | james | c |
Any help is greatly appreciated! I know this will probably involve a good deal of join statements and subqueries and I've been banging my head on the table trying to get it right. I know this would be much easier using php, but sadly, I don't have php access at work yet (don't ask..)
I think this does what you want:
select h.ref_num, sum(d.work_time), u.username, d.disposition
from details d join
headers h
on d.headers_uid = h.headers_uid join
users u
on d.user_uid = u.user_uid
where d.disposition = (select disposition
from details d2
where d2.headers_uid = d.headers_uid and
d2.user_uid = d.user_uid
order by date_time desc
limit 1
)
group by h.ref_num, u.username, d.disposition;
The key is the where clause that selects the last disposition for a given set of details records.
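One caveat worth testing: filtering rows by the last disposition in the WHERE clause also removes the earlier rows from the SUM, so mike's total for A10 would come out as 5:00 rather than the 30:00 in the desired output. A possible variant, sketched below, sums per (headers_uid, user_uid) first and looks up the last disposition separately. It runs on SQLite through Python's sqlite3 as a stand-in for MySQL, and it stores work time as whole minutes in a hypothetical work_minutes column to keep the arithmetic simple; the date and username filters from the question are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE headers (headers_uid INTEGER, ref_num TEXT);
CREATE TABLE users (user_uid INTEGER, username TEXT);
-- work_minutes is a hypothetical integer column replacing work_time (assumption).
CREATE TABLE details (details_uid INTEGER, headers_uid INTEGER, work_minutes INTEGER,
                      user_uid INTEGER, disposition TEXT, date_time TEXT);
INSERT INTO headers VALUES (10, 'A10'), (12, 'A12');
INSERT INTO users VALUES (5, 'mike'), (7, 'james');
INSERT INTO details VALUES
    (1, 10, 25, 5, 'o', '2013-07-02 12:14:48'),
    (2, 10, 10, 7, 'p', '2013-07-02 13:55:37'),
    (3, 10,  5, 5, 'c', '2013-07-02 15:04:28'),
    (4, 12,  7, 5, 'o', '2013-07-02 15:20:21'),
    (5, 12, 12, 7, 'p', '2013-07-02 15:35:27'),
    (6, 12,  3, 7, 'c', '2013-07-02 15:40:19');
""")

# Sum all rows per (header, user) first, then fetch that pair's most recent disposition.
query = """
SELECT h.ref_num, t.work_minutes, u.username,
       (SELECT d2.disposition
        FROM details d2
        WHERE d2.headers_uid = t.headers_uid AND d2.user_uid = t.user_uid
        ORDER BY d2.date_time DESC
        LIMIT 1) AS disposition
FROM (
    SELECT headers_uid, user_uid, SUM(work_minutes) AS work_minutes
    FROM details
    GROUP BY headers_uid, user_uid
) t
JOIN headers h ON h.headers_uid = t.headers_uid
JOIN users u ON u.user_uid = t.user_uid
ORDER BY h.ref_num, u.username DESC
"""
summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```

The rows match the desired output in the question, with 30, 10, 7 and 15 minutes standing for 30:00, 10:00, 7:00 and 15:00.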
I'm trying to do some report line chart graphs and find it easiest if I return one data row from my query for each column (date) of data that will appear in a line chart. The challenge is that I want more than one line.
Here is what I can do with a simplified example of data:
| DATE | SALES | LOCATION |
| 2012-01-07 | 500 | 1 |
| 2012-01-07 | 600 | 2 |
| 2012-01-14 | 700 | 1 |
| 2012-01-14 | 400 | 2 |
| 2012-01-21 | 450 | 1 |
| 2012-01-21 | 550 | 2 |
SELECT date, sum(sales) as SalesTotal1 FROM TABLE WHERE location = '1' group by date
Which returns:
| DATE | SalesTotal1 |
| 2012-01-07 | 500 |
| 2012-01-14 | 700 |
| 2012-01-21 | 450 |
That's fine if I just have one line in my graph, but what I really want is more than one alias of the same column, still grouped by date, that would return this:
| DATE | SalesTotal1 | SalesTotal2 |
| 2012-01-07 | 500 | 600 |
| 2012-01-14 | 700 | 400 |
| 2012-01-21 | 450 | 550 |
Is this possible? Sub query? I've tried many things, thanks ~
You could do something like:
SELECT `date`,
SUM(IF(location=1,sales,0)) As SalesTotal1,
SUM(IF(location=2,sales,0)) As SalesTotal2
FROM my_table
GROUP BY `date`
You'd have to add in as many columns as there are locations though, and if you have many locations it would be annoying. Perhaps you could consider doing the re-arranging on the code side (if you have many, many locations)?
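As an illustration of doing the re-arranging on the code side, here is a small sketch in Python (the language is an arbitrary choice; the question does not say what the charting code is written in). It takes the plain (date, sales, location) rows as they would come back from an unpivoted query and builds one row per date with one total per location.

```python
from collections import defaultdict

# Rows as returned by a plain "SELECT date, sales, location FROM ..." query (sample data).
rows = [
    ("2012-01-07", 500, 1),
    ("2012-01-07", 600, 2),
    ("2012-01-14", 700, 1),
    ("2012-01-14", 400, 2),
    ("2012-01-21", 450, 1),
    ("2012-01-21", 550, 2),
]

# totals[date][location] accumulates the sales for that date and location.
totals = defaultdict(lambda: defaultdict(int))
for date, sales, location in rows:
    totals[date][location] += sales

# One output column per location, in a stable order; missing combinations become 0.
locations = sorted({location for _, _, location in rows})
pivot = [
    (date, *[totals[date].get(location, 0) for location in locations])
    for date in sorted(totals)
]
for row in pivot:
    print(row)
```

This works for any number of locations without changing the query.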
This is called a pivot query, and MySQL has no built-in PIVOT operator for it. However, if you know the locations in advance, you can do it with one subquery per location joined together. If you don't know the locations, you have to write a stored procedure that looks them up, builds a query string and then executes it. Here's how to do two locations:
SELECT
S1.date,
SalesTotal1,
SalesTotal2
FROM (
SELECT
date,
sum(sales) as SalesTotal1
FROM TABLE
WHERE location = '1'
group by date
) S1
INNER JOIN (
SELECT
date,
sum(sales) as SalesTotal2
FROM TABLE
WHERE location = '2'
group by date
) S2 ON S1.date=S2.date
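The "look up the locations, build a query string, execute it" idea does not have to live in a stored procedure; it can also be done from application code. The sketch below shows one way, using Python's sqlite3 as a stand-in for a MySQL connection and conditional aggregation rather than one joined subquery per location. The table name my_table and the SalesTotalN aliases are placeholders.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (date TEXT, sales INTEGER, location INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("2012-01-07", 500, 1),
        ("2012-01-07", 600, 2),
        ("2012-01-14", 700, 1),
        ("2012-01-14", 400, 2),
        ("2012-01-21", 450, 1),
        ("2012-01-21", 550, 2),
    ],
)

# Step 1: look up which locations exist.
locations = [
    row[0]
    for row in conn.execute("SELECT DISTINCT location FROM my_table ORDER BY location")
]

# Step 2: build one conditional SUM per location. Values are bound as parameters;
# only the alias is interpolated, and int() keeps it to a plain number.
columns = ", ".join(
    f"SUM(CASE WHEN location = ? THEN sales ELSE 0 END) AS SalesTotal{int(loc)}"
    for loc in locations
)
query = f"SELECT date, {columns} FROM my_table GROUP BY date ORDER BY date"

# Step 3: execute the generated query.
cursor = conn.execute(query, locations)
header = [column[0] for column in cursor.description]
pivot = cursor.fetchall()
print(header)
for row in pivot:
    print(row)
```

Unlike the INNER JOIN of per-location subqueries, this keeps dates on which some location has no sales, reporting 0 for it.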