Hopefully the title makes sense.
For this example I have the following table in my database:
measurements
==================================
stn | date | temp | time =
1 | 01-12-2001 | 2.0 | 14:30 =
1 | 01-12-2001 | 2.1 | 14:31 =
1 | 03-12-2001 | 1.9 | 21:34 =
2 | 01-12-2001 | 4.5 | 12:48 =
2 | 01-12-2001 | 4.7 | 12:49 =
2 | 03-12-2001 | 4.9 | 11:01 =
==================================
And so on and so forth.
Each station (stn) has many measurements, one per second. Now I want to select the temp of each station for the last 30 days of measurements, but only for stations that have at least 30 temperature measurements.
I was playing with subqueries and GROUP BY, but I can't seem to figure it out.
Hope someone can help me out here.
edited the table
My example was oversimplified leaving a critical piece of information out. Please review the question.
select t1.stn,t1.date,t1.temp,t1.rn from (
select *,
@num := if(@stn = stn, @num + 1, 1) as rn,
@stn := stn as id_stn
from table,(select @stn := 0, @num := 1) as r
order by stn asc, date desc) as t1
inner join (select `stn`
from table
where concat_ws(' ',date,time) >= now() - interval 30 day
group by `stn`
having count(*) >= 30) as t
on t1.stn = t.stn
and t1.rn <= 30
order by stn,date desc,time desc
This is the query that should select the last 30 entries for each station that has at least 30 entries.
This query is based on the answer here by nick rulez, so please upvote him
SELECT t1.stn, t1.date, t1.temp, t1.time FROM
(
SELECT *,
@num := if(@stn = stn, @num + 1, 1) as rn,
@stn := stn as id_stn
FROM
`tablename`,
(SELECT @stn := 0, @num := 1) as r
ORDER BY stn asc, date desc
) as t1
INNER JOIN
(
SELECT `stn`
FROM `tablename`
GROUP BY `stn`
HAVING COUNT(*) >= 30
) as t
ON t1.stn = t.stn
AND t1.rn <= 30
ORDER BY stn, date desc, time desc
I have tested it on a sample database I made based on your schema and it is working fine.
To learn more about such queries, have a look at Within-group quotas (Top N per group).
SELECT stn, date, temp FROM
(
SELECT stn, date, temp, @a:=IF(@lastStn=stn, @a+1, 1) countPerStn, @lastStn:=stn
FROM cache
GROUP BY stn, date
ORDER BY stn, date DESC
) as tempTable
WHERE countPerStn > 30;
This is the query I was looking for. Sorry if my question was so poorly worded that it pushed you all in the wrong direction. I'll upvote the answers that helped me find the needed query.
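For comparison, on an engine with window functions (MySQL 8.0+, or SQLite 3.25+ as used here) the "newest N rows per station, only for stations with at least N rows" idea can be sketched without user variables. This is a minimal sketch, not the query from the answers above: it assumes SQLite through Python's sqlite3 module, ISO-formatted dates (so that text ordering matches date ordering), and a hypothetical helper name `last_n_per_station`.

```python
import sqlite3

# Sample rows modelled on the question's table; dates are rewritten in
# ISO format (an assumption) so that ORDER BY on text sorts chronologically.
rows = [
    (1, "2001-12-01", 2.0, "14:30"),
    (1, "2001-12-01", 2.1, "14:31"),
    (1, "2001-12-03", 1.9, "21:34"),
    (2, "2001-12-01", 4.5, "12:48"),
    (2, "2001-12-03", 4.9, "11:01"),
]

def last_n_per_station(rows, n):
    """Return the newest n rows of every station that has at least n rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE measurements (stn INTEGER, date TEXT, temp REAL, time TEXT)"
    )
    con.executemany("INSERT INTO measurements VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT stn, date, temp, time
        FROM (
            SELECT m.*,
                   -- rn = 1 is the newest row of each station
                   ROW_NUMBER() OVER (PARTITION BY stn
                                      ORDER BY date DESC, time DESC) AS rn,
                   -- cnt = total number of rows of that station
                   COUNT(*) OVER (PARTITION BY stn) AS cnt
            FROM measurements AS m
        )
        WHERE cnt >= ? AND rn <= ?
        ORDER BY stn, date DESC, time DESC
    """
    result = con.execute(query, (n, n)).fetchall()
    con.close()
    return result

print(last_n_per_station(rows, 2))
```

With n = 2 both stations qualify; with n = 3 only station 1 has enough rows, so station 2 drops out entirely.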
I am trying to use a query from this SO question Check for x consecutive days - given timestamps in database to count a number of consecutive days a user has submitted an activity i.e. 3 days, 5 days, 7 days etc.
The query is:
SELECT IF(COUNT(1) > 0, 1, 0) AS has_consec
FROM
(
SELECT *
FROM
(
SELECT IF(b.dateAdded IS NULL, @val:=@val+1, @val) AS consec_set
FROM activity a
CROSS JOIN (SELECT @val:=0) var_init
LEFT JOIN activity b ON
a.userID = b.userID AND
a.dateAdded = b.dateAdded + INTERVAL 1 DAY
WHERE a.userID = 1
) a
GROUP BY a.consec_set
HAVING COUNT(1) >= 3
) a
The code works great when the date field is not dateTime but how would I modify the code to ignore the time component of dateTime? I have tried using DATE(dateAdded) but that didn't work.
My data looks like:
userID dateAdded
1 2016-07-01 17:01:56
1 2016-07-02 12:45:49
1 2016-07-03 13:06:27
1 2016-07-04 12:51:10
1 2016-07-05 15:51:10
2 2016-07-06 16:51:10
2 2016-07-07 11:51:10
1 2016-07-08 11:26:38
Thanks
I cast the dateAdded field to DATE.
Please give it a try and let me know if it resolves the issue:
SELECT IF(COUNT(1) > 0, 1, 0) AS has_consec
FROM
(
SELECT *
FROM
(
SELECT IF(b.dateAdded IS NULL, @val:=@val+1, @val) AS consec_set
FROM activity a
CROSS JOIN (SELECT @val:=0) var_init
LEFT JOIN activity b ON
a.userID = b.userID AND
DATE(a.dateAdded) = DATE(b.dateAdded) + INTERVAL 1 DAY
WHERE a.userID = 1
) a
GROUP BY a.consec_set
HAVING COUNT(1) >= 3
) a;
Note: comparing the raw DATETIME values returns correct output only if the rows share the same time of day (hh:mm:ss).
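The "strip the time, then look for runs of consecutive days" idea can also be sketched as a gaps-and-islands query. This is a minimal sketch under assumptions, not the answer's query: it uses SQLite through Python's sqlite3 module (window functions need SQLite 3.25+), and `longest_streak` is a hypothetical helper name. Consecutive dates minus their row number give the same value, so each run of days collapses into one group.

```python
import sqlite3

# Sample rows copied from the question.
rows = [
    (1, "2016-07-01 17:01:56"),
    (1, "2016-07-02 12:45:49"),
    (1, "2016-07-03 13:06:27"),
    (1, "2016-07-04 12:51:10"),
    (1, "2016-07-05 15:51:10"),
    (2, "2016-07-06 16:51:10"),
    (2, "2016-07-07 11:51:10"),
    (1, "2016-07-08 11:26:38"),
]

def longest_streak(rows, user_id):
    """Return the longest run of consecutive calendar days for one user."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE activity (userID INTEGER, dateAdded TEXT)")
    con.executemany("INSERT INTO activity VALUES (?, ?)", rows)
    query = """
        WITH days AS (
            -- DATE() drops the time component; DISTINCT collapses
            -- several activities on the same day into one row
            SELECT DISTINCT DATE(dateAdded) AS day
            FROM activity
            WHERE userID = ?
        ),
        islands AS (
            -- day number minus row number is constant within a run
            SELECT day,
                   CAST(julianday(day) AS INTEGER)
                     - ROW_NUMBER() OVER (ORDER BY day) AS grp
            FROM days
        )
        SELECT COALESCE(MAX(run_length), 0)
        FROM (SELECT COUNT(*) AS run_length FROM islands GROUP BY grp)
    """
    (longest,) = con.execute(query, (user_id,)).fetchone()
    con.close()
    return longest

# "Has the user been active on at least 3 consecutive days?"
print(longest_streak(rows, 1) >= 3)
```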
I have the following structure in my user table:
id(INT) registered(DATETIME)
1 2016-04-01 23:23:01
2 2016-04-02 03:23:02
3 2016-04-02 05:23:03
4 2016-04-03 04:04:04
I want to get the total (accumulated) user count per day, for all days in DB
So result should be something like
day total
2016-04-01 1
2016-04-02 3
2016-04-03 4
I tried some subqueries, but I have no idea how to achieve this with, if possible, one SQL statement. Of course I could group by day, count, and add the counts up programmatically, but I don't want to do that if possible.
You can use a GROUP BY that does all the counts, without the need of doing anything programmatically, please have a look at this query:
select
d.dt,
count(*) as total
from
(select distinct date(registered) dt from table1) d inner join
table1 r on d.dt>=date(r.registered)
group by
d.dt
order by
d.dt
The first subquery returns all distinct dates; we then join each date with every registration made on or before it and count the matches, all in one query.
An alternative join condition that can give some improvements in performance is:
on d.dt + interval 1 day > r.registered
Not sure why you would not just use GROUP BY; without it this gets more complicated. Anyway, try this ;)
select
date_format(main.registered, '%Y-%m-%d') as `day`,
main.total
from (
select
table1.*,
@cnt := @cnt + 1 as total
from table1
cross join (select @cnt := 0) t
) main
inner join (
select
a.*,
if(@param = date_format(registered, '%Y-%m-%d'), @rowno := @rowno + 1 ,@rowno := 1) as rowno,
@param := date_format(registered, '%Y-%m-%d')
from (select * from table1 order by registered desc) a
cross join (select @param := null, @rowno := 0) tmp
having rowno = 1
) sub on main.id = sub.id
SQLFiddle DEMO
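Where window functions are available, a running total is usually written as a per-day count followed by a cumulative SUM. A minimal sketch of that approach, assuming SQLite through Python's sqlite3 module (3.25+) rather than MySQL; the table name `users` and the helper name `cumulative_users_per_day` are hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
rows = [
    (1, "2016-04-01 23:23:01"),
    (2, "2016-04-02 03:23:02"),
    (3, "2016-04-02 05:23:03"),
    (4, "2016-04-03 04:04:04"),
]

def cumulative_users_per_day(rows):
    """Return (day, accumulated user count) for every day that has a registration."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE users (id INTEGER, registered TEXT)")
    con.executemany("INSERT INTO users VALUES (?, ?)", rows)
    query = """
        SELECT day,
               -- running total of the per-day counts, in date order
               SUM(cnt) OVER (ORDER BY day) AS total
        FROM (
            SELECT DATE(registered) AS day, COUNT(*) AS cnt
            FROM users
            GROUP BY DATE(registered)
        )
        ORDER BY day
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

print(cumulative_users_per_day(rows))
```

Days without any registration do not appear in the result, which matches the output requested in the question.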
So I was taking a test recently with some higher level SQL problems. I only have what I would consider "intermediate" experience in SQL and I've been working on this for a day or so now. I just can't figure it out.
Here's the problem:
You have a table with 4 columns as such:
EmployeeID int unique
EmployeeType int
EmployeeSalary int
Created date
Goal: I need to retrieve the difference between the latest two EmployeeSalary for any EmployeeType with more than 1 entry. It has to be done in one statement (nested queries are fine).
Example Data Set: http://sqlfiddle.com/#!9/0dfc7
EmployeeID | EmployeeType | EmployeeSalary | Created
-----------|--------------|----------------|--------------------
1 | 53 | 50 | 2015-11-15 00:00:00
2 | 66 | 20 | 2014-11-11 04:20:23
3 | 66 | 30 | 2015-11-03 08:26:21
4 | 66 | 10 | 2013-11-02 11:32:47
5 | 78 | 70 | 2009-11-08 04:47:47
6 | 78 | 45 | 2006-11-01 04:42:55
So for this data set, the proper return would be:
EmployeeType | EmployeeSalary
-------------|---------------
66 | 10
78 | 25
The 10 comes from subtracting the latest two EmployeeSalary values (30 - 20) for the EmployeeType of 66. The 25 comes from subtracting the latest two EmployeeSalary values (70 - 45) for the EmployeeType of 78. We skip EmployeeType 53 completely because it only has one value.
This one has been destroying my brain. Any clues?
Thanks!
How to make a really simple query complex?
One funny way (not the best performance) to do it is:
SELECT final.EmployeeType, SUM(salary) AS difference
FROM (
SELECT b.EmployeeType, b.EmployeeSalary AS salary
FROM tab b
JOIN (SELECT EmployeeType, GROUP_CONCAT(EmployeeSalary ORDER BY Created DESC) AS c
FROM tab
GROUP BY EmployeeType
HAVING COUNT(*) > 1) AS sub
ON b.EmployeeType = sub.EmployeeType
AND FIND_IN_SET(b.EmployeeSalary, sub.c) = 1
UNION ALL
SELECT b.EmployeeType, -b.EmployeeSalary AS salary
FROM tab b
JOIN (SELECT EmployeeType, GROUP_CONCAT(EmployeeSalary ORDER BY Created DESC) AS c
FROM tab
GROUP BY EmployeeType
HAVING COUNT(*) > 1) AS sub
ON b.EmployeeType = sub.EmployeeType
AND FIND_IN_SET(b.EmployeeSalary, sub.c) = 2
) AS final
GROUP BY final.EmployeeType;
SqlFiddleDemo
EDIT:
The key point is that MySQL before 8.0 doesn't support window functions, so you need to use equivalent code.
For example, a solution in SQL Server:
SELECT EmployeeType, SUM(CASE rn WHEN 1 THEN EmployeeSalary
ELSE -EmployeeSalary END) AS difference
FROM (SELECT *,
ROW_NUMBER() OVER(PARTITION BY EmployeeType ORDER BY Created DESC) AS rn
FROM #tab
) AS sub
WHERE rn IN (1,2)
GROUP BY EmployeeType
HAVING COUNT(EmployeeType) > 1
LiveDemo
And MySQL equivalent:
SELECT EmployeeType, SUM(CASE rn WHEN 1 THEN EmployeeSalary
ELSE -EmployeeSalary END) AS difference
FROM (
SELECT t1.EmployeeType, t1.EmployeeSalary,
count(t2.Created) + 1 as rn
FROM tab t1
LEFT JOIN tab t2
ON t1.EmployeeType = t2.EmployeeType
AND t1.Created < t2.Created
GROUP BY t1.EmployeeType, t1.EmployeeSalary
) AS sub
WHERE rn IN (1,2)
GROUP BY EmployeeType
HAVING COUNT(EmployeeType) > 1;
LiveDemo2
The dataset of the fiddle is different from the example above, which is confusing (not to mention a little perverse). Anyway, there's lots of ways to skin this particular cat. Here's one (not the fastest, however):
SELECT a.employeetype, ABS(a.employeesalary-b.employeesalary) diff
FROM
( SELECT x.*
, COUNT(*) rank
FROM employees x
JOIN employees y
ON y.employeetype = x.employeetype
AND y.created >= x.created
GROUP
BY x.employeetype
, x.created
) a
JOIN
( SELECT x.*
, COUNT(*) rank
FROM employees x
JOIN employees y
ON y.employeetype = x.employeetype
AND y.created >= x.created
GROUP
BY x.employeetype
, x.created
) b
ON b.employeetype = a.employeetype
AND b.rank = a.rank+1
WHERE a.rank = 1;
A very similar but faster solution looks like this (although you sometimes need to use different variables for tables a and b, for reasons I still don't fully understand)...
SELECT a.employeetype
, ABS(a.employeesalary-b.employeesalary) diff
FROM
( SELECT x.*
, CASE WHEN @prev = x.employeetype THEN @i:=@i+1 ELSE @i:=1 END i
, @prev := x.employeetype prev
FROM employees x
, (SELECT @prev := 0, @i:=1) vars
ORDER
BY x.employeetype
, x.created DESC
) a
JOIN
( SELECT x.*
, CASE WHEN @prev = x.employeetype THEN @i:=@i+1 ELSE @i:=1 END i
, @prev := x.employeetype prev
FROM employees x
, (SELECT @prev := 0, @i:=1) vars
ORDER
BY x.employeetype
, x.created DESC
) b
ON b.employeetype = a.employeetype
AND b.i = a.i + 1
WHERE a.i = 1;
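The ROW_NUMBER approach shown for SQL Server also runs on any engine with window functions, so it can be checked against the question's sample data. A minimal sketch, assuming SQLite through Python's sqlite3 module (3.25+); the table name `employees` follows the last answer and `latest_salary_difference` is a hypothetical helper name.

```python
import sqlite3

# Sample rows copied from the question.
rows = [
    (1, 53, 50, "2015-11-15 00:00:00"),
    (2, 66, 20, "2014-11-11 04:20:23"),
    (3, 66, 30, "2015-11-03 08:26:21"),
    (4, 66, 10, "2013-11-02 11:32:47"),
    (5, 78, 70, "2009-11-08 04:47:47"),
    (6, 78, 45, "2006-11-01 04:42:55"),
]

def latest_salary_difference(rows):
    """Return (EmployeeType, latest salary minus second-latest salary)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE employees (EmployeeID INTEGER, EmployeeType INTEGER,"
        " EmployeeSalary INTEGER, Created TEXT)"
    )
    con.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT EmployeeType,
               -- newest salary counts positive, second newest negative
               SUM(CASE rn WHEN 1 THEN EmployeeSalary
                           ELSE -EmployeeSalary END) AS difference
        FROM (
            SELECT EmployeeType, EmployeeSalary,
                   ROW_NUMBER() OVER (PARTITION BY EmployeeType
                                      ORDER BY Created DESC) AS rn
            FROM employees
        )
        WHERE rn <= 2
        GROUP BY EmployeeType
        HAVING COUNT(*) > 1      -- skip types with a single entry
        ORDER BY EmployeeType
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

print(latest_salary_difference(rows))
```

Note that this returns a signed difference (latest minus previous), whereas the self-join answers above wrap the subtraction in ABS().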
I believe this can be solved with a temp table or stored procedure, but I would like to know whether it can be done in a single SQL statement.
Goal: list all rows with a countdown per year; the number of rows differs from year to year. Rows can be ordered by date.
Desired result:
|-Count Down-|-Date-------|
| 3 | 2013-01-01 | <- Start with number of Row of each year
| 2 | 2013-03-15 |
| 1 | 2013-06-07 |
| 5 | 2014-01-01 | <- Start with number of Row of each year
| 4 | 2014-03-17 |
| 3 | 2014-07-11 |
| 2 | 2014-08-05 |
| 1 | 2014-11-12 |
SQL:
Select @row_number:=@row_number-1 AS CountDown, Date
FROM table JOIN
(Select @row_number:=COUNT(*), year(date) FROM table GROUP BY year(date))
Is there any solution for that?
The subquery that gets the count by year needs to return the year, so you can join it with the main table to get the starting number for the countdown. And you need to detect when the year changes, so you need another variable for that.
SELECT @row_number := IF(YEAR(d.Date) = @prevYear, @row_number-1, y.c) AS CountDown,
d.Date, @prevYear := YEAR(d.Date)
FROM (SELECT Date
FROM Table1
ORDER BY Date) AS d
JOIN
(Select count(*) AS c, year(date) AS year
FROM Table1
GROUP BY year(date)) AS y
ON YEAR(d.Date) = y.year
CROSS JOIN (SELECT @prevYear := NULL) AS x
DEMO
You can do the count down using variables (or correlated subqueries). The following does the count, but the returned data is not in the order you specify:
select (@rn := if(@y = year(date), @rn + 1,
if(@y := year(date), 1, 1)
)
) as CountDown, t1.*
from table1 cross join
(select @y := 0, @rn := 0) vars
order by date desc;
That is easily fixed with another subquery:
select t.*
from (select (@rn := if(@y = year(date), @rn + 1,
if(@y := year(date), 1, 1)
)
) as CountDown, t1.*
from table1 cross join
(select @y := 0, @rn := 0) vars
order by date desc
) t
order by date;
Note the complicated expression for assigning CountDown. It sets both variables (@y and @rn) in a single expression. MySQL does not guarantee the order of evaluation of expressions in a SELECT, so if you assign them in separate expressions, they might be evaluated in the wrong order.
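With window functions the evaluation-order concern disappears, because a countdown is just a row number taken in descending date order within each year. A minimal sketch, assuming SQLite through Python's sqlite3 module (3.25+); the column name `event_date` and the helper name `countdown_by_year` are hypothetical, and the sample dates are the ones from the question.

```python
import sqlite3

# Dates copied from the desired result in the question.
dates = [
    "2013-01-01", "2013-03-15", "2013-06-07",
    "2014-01-01", "2014-03-17", "2014-07-11", "2014-08-05", "2014-11-12",
]

def countdown_by_year(dates):
    """Return (CountDown, date) rows: each year counts down to 1 at its last date."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE table1 (event_date TEXT)")
    con.executemany("INSERT INTO table1 VALUES (?)", [(d,) for d in dates])
    query = """
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY strftime('%Y', event_date)  -- restart per year
                   ORDER BY event_date DESC                 -- newest row gets 1
               ) AS CountDown,
               event_date
        FROM table1
        ORDER BY event_date
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

for row in countdown_by_year(dates):
    print(row)
```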
EDIT: To clarify, there are many users and each user has many records; this is a log table of user activities.
How do I find the timestamp difference between every record and the subsequent record that satisfies some condition?
for example assuming the table is something like this
| id |u_id| .. | timestamp |
|----|----|----|--------------------|
| 50 | 1 | .. | 2014-04-22 15:35:44|
| 90 | 2 | .. | 2014-04-22 13:35:44|
| .. | .. | .. | ..... |
How do I find the time difference between every record and the next record for only one user id ?
Assuming that you want to do this for all users, the easiest way is to use variables:
select t.*,
if(u_id = @u_id, timediff(`timestamp`, @timestamp), NULL) as diff,
@timestamp := `timestamp`, @u_id := u_id
from table t cross join
(select @timestamp := 0, @u_id := 0) var
order by u_id, timestamp;
It is important that you explicitly order the records to be sure that the processing occurs in sequential order.
Try
select timediff(`timestamp`, @lasttime),
@lasttime := `timestamp`
from your_table
cross join (select @lasttime := 0) d
where u_id = 1
order by id
There are a couple of ways you can do this, the first would be to use a correlated subquery:
SELECT T.id,
T.u_id,
timestamp,
( SELECT T2.timestamp
FROM T AS T2
WHERE T2.u_id = T.u_id
AND T2.timestamp > T.timestamp
ORDER BY T2.Timestamp
LIMIT 1
) AS NextTimeStamp
FROM T;
Or you could do this using JOIN.
SELECT T.id,
T.u_id,
T.timestamp,
T2.timestamp AS NextTimeStamp
FROM T
LEFT JOIN T AS T2
ON T2.u_id = T.u_id
AND T2.timestamp > T.timestamp
LEFT JOIN T AS T3
ON T3.u_id = T.u_id
AND T3.timestamp > T.timestamp
AND T3.timestamp < T2.timestamp
WHERE T3.id IS NULL;
Which one is best will depend on your actual requirements, amount of data, and indexes.
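A third option, where window functions are available, is LEAD(), which fetches the next row's value within each user's partition. A minimal sketch, assuming SQLite through Python's sqlite3 module (3.25+) instead of MySQL; the column name `ts`, the table name `log`, the helper name `next_record_gaps`, and the rows with ids 51 and 91 are hypothetical additions for illustration.

```python
import sqlite3

# Rows 50 and 90 come from the question; 51 and 91 are made-up follow-ups.
rows = [
    (50, 1, "2014-04-22 15:35:44"),
    (51, 1, "2014-04-22 15:36:14"),
    (90, 2, "2014-04-22 13:35:44"),
    (91, 2, "2014-04-22 14:35:44"),
]

def next_record_gaps(rows):
    """Return (id, u_id, ts, next_ts, diff_seconds) per record, per user."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE log (id INTEGER, u_id INTEGER, ts TEXT)")
    con.executemany("INSERT INTO log VALUES (?, ?, ?)", rows)
    query = """
        SELECT id, u_id, ts, next_ts,
               -- strftime('%s', ...) converts to Unix seconds;
               -- the result is NULL for a user's last record
               CAST(strftime('%s', next_ts) AS INTEGER)
                 - CAST(strftime('%s', ts) AS INTEGER) AS diff_seconds
        FROM (
            SELECT id, u_id, ts,
                   LEAD(ts) OVER (PARTITION BY u_id ORDER BY ts) AS next_ts
            FROM log
        )
        ORDER BY u_id, ts
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

for row in next_record_gaps(rows):
    print(row)
```

To restrict the output to one user, add a WHERE u_id = ? filter to the inner query.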