find all rows that have two minutes difference in date - mysql

I have a table ACQUISITION, with 1 720 208 rows.
------------------------------------------------------
| id | date | value |
|--------------|-------------------------|-----------|
| 1820188 | 2011-01-22 17:48:56 | 1.287 |
| 1820187 | 2011-01-21 21:55:11 | 2.312 |
| 1820186 | 2011-01-21 21:54:00 | 2.313 |
| 1820185 | 2011-01-20 17:46:10 | 1.755 |
| 1820184 | 2011-01-20 17:45:05 | 1.785 |
| 1820183 | 2011-01-19 18:21:02 | 2.001 |
------------------------------------------------------
Following a problem, I need to find every row whose timestamp is less than two minutes away from another row's.
Ideally I should be able to find here:
| 1820187 | 2011-01-21 21:55:11 | 2.312 |
| 1820186 | 2011-01-21 21:54:00 | 2.313 |
| 1820185 | 2011-01-20 17:46:10 | 1.755 |
| 1820184 | 2011-01-20 17:45:05 | 1.785 |
I'm quite lost here, so any ideas are welcome.

Let's restate your question slightly so the query can complete before the heat death of the universe.
"I need to know the consecutive records in the table with timestamps closer together than two minutes."
We can tie the notion of "consecutive" to your id values.
Try this query and see if you get decent performance (http://sqlfiddle.com/#!9/28738/2/0)
SELECT a.date first_date, a.id first_id, a.value first_value,
b.id second_id, b.value second_value,
TIMESTAMPDIFF(SECOND, a.date, b.date) delta_t
FROM thetable AS a
JOIN thetable AS b ON b.id = a.id + 1
AND b.date <= a.date + INTERVAL 2 MINUTE
The ON b.id = a.id + 1 condition keeps the self-join workload under control, because each row is compared with only one other row. Avoiding a function call on either of the two date column values also lets the query exploit any index that's available on that column.
Creating a covering index on (id,date,value) will help performance of this query.
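Here is a minimal sketch of the consecutive-id join that can be tried without a MySQL server, using Python's sqlite3 module as a stand-in. SQLite has no INTERVAL or TIMESTAMPDIFF, so this sketch substitutes strftime('%s', ...) to get epoch seconds; the function name close_pairs and the index name are assumptions made for illustration only.

```python
import sqlite3

# Sample rows copied from the question: (id, date, value)
ROWS = [
    (1820188, "2011-01-22 17:48:56", 1.287),
    (1820187, "2011-01-21 21:55:11", 2.312),
    (1820186, "2011-01-21 21:54:00", 2.313),
    (1820185, "2011-01-20 17:46:10", 1.755),
    (1820184, "2011-01-20 17:45:05", 1.785),
    (1820183, "2011-01-19 18:21:02", 2.001),
]

def close_pairs(rows, max_seconds=120):
    """Return (first_id, second_id, delta_t) for consecutive ids close in time."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE thetable (id INTEGER PRIMARY KEY, date TEXT, value REAL)")
    con.executemany("INSERT INTO thetable VALUES (?, ?, ?)", rows)
    # Covering index on (id, date, value), as suggested in the answer
    con.execute("CREATE INDEX idx_id_date_value ON thetable (id, date, value)")
    sql = """
        SELECT first_id, second_id, delta_t FROM (
            SELECT a.id AS first_id, b.id AS second_id,
                   CAST(strftime('%s', b.date) AS INTEGER)
                 - CAST(strftime('%s', a.date) AS INTEGER) AS delta_t
            FROM thetable AS a
            JOIN thetable AS b ON b.id = a.id + 1   -- only the next id
        )
        WHERE delta_t <= ?
        ORDER BY first_id
    """
    return con.execute(sql, (max_seconds,)).fetchall()

print(close_pairs(ROWS))
```

On the six sample rows this pairs 1820184 with 1820185 and 1820186 with 1820187, which are the four rows the question asks for.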
If the consecutive-row assumption doesn't work in this dataset, you can try this, to compare each row to the next ten rows. It will be slower. (http://sqlfiddle.com/#!9/28738/6/0)
SELECT a.date first_date, a.id first_id, a.value first_value,
b.id second_id, b.value second_value,
TIMESTAMPDIFF(SECOND, a.date, b.date) delta_t
FROM thetable AS a
JOIN thetable AS b ON b.id <= a.id + 10
AND b.id > a.id
AND b.date <= a.date + INTERVAL 2 MINUTE
If the id values are entirely worthless as a way of ordering your rows, you'll need this. And, it will be very slow. (http://sqlfiddle.com/#!9/28738/5/0)
SELECT a.date first_date, a.id first_id, a.value first_value,
b.id second_id, b.value second_value,
TIMESTAMPDIFF(SECOND, a.date, b.date) delta_t
FROM thetable AS a
JOIN thetable AS b ON b.date <= a.date + INTERVAL 2 MINUTE
AND b.date > a.date
AND b.id <> a.id
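If the ids really are useless for ordering, one alternative to the slow date-range join is to fetch the rows sorted by date and compare each row with its neighbour on the client. This is only a sketch under the assumption that pulling (id, date) pairs to the client is acceptable; the function name rows_within is hypothetical. It relies on the fact that a row lies within two minutes of some other row exactly when it lies within two minutes of one of its neighbours in date order.

```python
from datetime import datetime, timedelta

# (id, date) pairs copied from the question
ROWS = [
    (1820188, "2011-01-22 17:48:56"),
    (1820187, "2011-01-21 21:55:11"),
    (1820186, "2011-01-21 21:54:00"),
    (1820185, "2011-01-20 17:46:10"),
    (1820184, "2011-01-20 17:45:05"),
    (1820183, "2011-01-19 18:21:02"),
]

def rows_within(rows, window=timedelta(minutes=2)):
    """Return the sorted ids of rows closer than `window` to another row."""
    # Sort by timestamp so that only adjacent rows need to be compared
    parsed = sorted(
        (datetime.strptime(date, "%Y-%m-%d %H:%M:%S"), row_id)
        for row_id, date in rows
    )
    hits = set()
    for (t_prev, id_prev), (t_next, id_next) in zip(parsed, parsed[1:]):
        if t_next - t_prev < window:  # strictly less than two minutes
            hits.update((id_prev, id_next))
    return sorted(hits)

print(rows_within(ROWS))
```

The scan is linear after the sort, so it avoids the row-by-row range lookups of the self-join. Note that it uses a strict "less than" comparison, as in the question, while the queries above use <=.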

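Do a self join on the table and compare the timestamps with TIMESTAMPDIFF(). (TIMEDIFF() returns a TIME value, so comparing its result with 2 does not test for two minutes.)
SELECT DISTINCT t1.*
FROM ACQUISITION t1 JOIN ACQUISITION t2
ON t1.id <> t2.id
AND ABS(TIMESTAMPDIFF(SECOND, t1.`date`, t2.`date`)) < 120;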

Related

Is there any way to do calculate the MAX() quickly?

I am trying to do this query:
SELECT
A.*
, (SELECT MAX(B.Date2) FROM Tab2 B WHERE A.ID = B.ID AND A.Date > B.Date2) AS MaxDate
FROM
Tab A
This works but it takes a lot of time to run when you have a lot of rows. Is there any quicker way to do this which give the same results?
Thank you!
Edit:
The table definitions are as follows:
Tab : (dd-mm-yyyy)
ID | Date
1 | 19-01-2018
1 | 14-01-2018
2 | 18-02-2019
3 | 20-03-2019
Tab2:
ID | Date2
1 | 10-01-2018
1 | 15-01-2018
1 | 20-01-2018
2 | 15-02-2019
2 | 21-02-2019
3 | 25-03-2019
I want my query to return:
ID | Date | MaxDate
1 | 19-01-2018 | 15-01-2018
1 | 14-01-2018 | 10-01-2018
2 | 18-02-2019 | 15-02-2019
3 | 20-03-2019 | NULL
Thanks!
It was unexpected for me but this query worked:
SELECT
A.ID
, A.Date
, MAX(B.Date2) AS MaxDate
FROM
Tab A
left outer join Tab2 B
on A.ID = B.ID and A.Date > B.Date2
GROUP BY
A.ID, A.Date
;
I didn't know that a column from one table can go in the GROUP BY while the column inside MAX() comes from another table.
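To see why this works: the LEFT JOIN first produces one row per (Tab row, earlier Tab2 row) combination, and the GROUP BY then collapses those back to one row per Tab row, with MAX() ignoring the NULLs that come from unmatched rows. Below is a small sketch that reproduces the expected output with Python's sqlite3 module standing in for MySQL. The dates are stored as ISO strings so that string comparison matches date order, which is an assumption about the schema, and the function name is made up for the example.

```python
import sqlite3

def max_earlier_dates():
    """Run the LEFT JOIN + GROUP BY query on the sample data from the question."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Tab  (ID INTEGER, Date  TEXT);
        CREATE TABLE Tab2 (ID INTEGER, Date2 TEXT);
        INSERT INTO Tab VALUES
            (1, '2018-01-19'), (1, '2018-01-14'),
            (2, '2019-02-18'), (3, '2019-03-20');
        INSERT INTO Tab2 VALUES
            (1, '2018-01-10'), (1, '2018-01-15'), (1, '2018-01-20'),
            (2, '2019-02-15'), (2, '2019-02-21'), (3, '2019-03-25');
    """)
    return con.execute("""
        SELECT A.ID, A.Date, MAX(B.Date2) AS MaxDate
        FROM Tab A
        LEFT OUTER JOIN Tab2 B
            ON A.ID = B.ID AND A.Date > B.Date2
        GROUP BY A.ID, A.Date
        ORDER BY A.ID, A.Date DESC
    """).fetchall()

for row in max_earlier_dates():
    print(row)
```

One caveat: if Tab can contain two identical (ID, Date) rows, the GROUP BY merges them into one, whereas the correlated subquery in the question returns both.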

Select two rows where the sum of two rows is equal to a value

Say I have a table as follows:
| id | value |
--------------
| 1 | 6 |
| 2 | 8 |
| 3 | 5 |
| 4 | 12 |
| 5 | 6 |
I want to return the two rows for which added together will equal a certain value
e.g. I want to get 2 rows where the total is 18, so in the above table it should return:
| id | value |
--------------
| 1 | 6 |
| 4 | 12 |
...as the sum of values here is 18. It shouldn't match on the other 3 rows even if they add up to the total as it can only be sum of 2 rows in this case.
Also, if there are multiple pairs that add up to the required value, it should only return the first match.
edit:
Came up with this which seems to do the trick but I'm not sure it's the best method
SELECT *, (t1.value+t2.value) AS total
FROM test t1, test t2
WHERE t1.id != t2.id
HAVING total = 18
LIMIT 1
Here's something similar to what you're after...
SELECT x.id x_id
, y.id y_id
FROM my_table x
JOIN my_table y
ON y.id > x.id
WHERE x.value + y.value = 18
ORDER
BY x.id
, y.id
LIMIT 1;
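A quick way to check the query above against the sample table is to run it through Python's sqlite3 module, used here only as a stand-in for MySQL. The function name first_pair is hypothetical, and the two value columns are added to the select list purely to make the output easier to read.

```python
import sqlite3

def first_pair(target):
    """Return (x_id, x_value, y_id, y_value) for the first pair summing to target."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, value INTEGER)")
    con.executemany(
        "INSERT INTO my_table VALUES (?, ?)",
        [(1, 6), (2, 8), (3, 5), (4, 12), (5, 6)],
    )
    return con.execute("""
        SELECT x.id, x.value, y.id, y.value
        FROM my_table x
        JOIN my_table y ON y.id > x.id      -- each unordered pair only once
        WHERE x.value + y.value = ?
        ORDER BY x.id, y.id
        LIMIT 1
    """, (target,)).fetchone()

print(first_pair(18))
```

Joining on y.id > x.id rather than y.id <> x.id means each pair is considered once, so (4, 1) never competes with (1, 4). When no pair matches, fetchone() returns None.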
MySQL doesn't really lend itself to such queries. The only (admittedly god-awful) solution I can think of is to self-join the table to get the two records, then put this join in a CTE (which needs MySQL 8.0 or later) and use UNION ALL to get the two records on separate rows:
WITH summer AS (
SELECT a.id AS a_id, a.value AS a_value, b.id AS b_id, b.value AS b_value
FROM mytable a
JOIN mytable b ON a.id <> b.id AND a.value + b.value = 18
ORDER BY a.id, b.id
LIMIT 1
)
SELECT a_id, a_value
FROM summer
UNION ALL
SELECT b_id, b_value
FROM summer
Here is a simple query for doing it:
SELECT
number1.id, number2.id
FROM
number AS number1
JOIN
number AS number2 ON number1.value + number2.value = 18
AND number2.id > number1.id

How to avoid self-joins that result in symmetric results in MySQL?

I was looking for records that are within 2 weeks of each other, in the same table, as such:
SELECT stuff
FROM mytable AS a
JOIN mytable AS b
ON a.ID = b.ID
WHERE
(
a.Date = b.Date
OR
a.Date BETWEEN DATE_SUB(b.Date, INTERVAL 14 DAY) AND DATE_ADD(b.Date, INTERVAL 14 DAY)
OR
b.Date BETWEEN DATE_SUB(a.Date, INTERVAL 14 DAY) AND DATE_ADD(a.Date, INTERVAL 14 DAY)
)
;
It worked fine, but now I have a result with this type of structure:
| ID | a.Date | b.Date | a.Value | b.Value |
|----|------------|------------|---------|---------|
| 1 | 2016-01-01 | 2016-01-02 | foo | bar |
| 1 | 2016-01-02 | 2016-01-01 | bar | foo |
Either I did my join in a bad way which is leading to this duplicated structure, or the join is okay but I need some way to remove the chiral record. Can anyone advise me on how to proceed?
Add:
a.Value < b.Value
to the WHERE clause.
Or, better yet, if you have a primary key (and all tables should have a primary key):
a.pk < b.pk
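The effect of the a.pk < b.pk condition can be seen in a small sketch, run here with Python's sqlite3 module as a stand-in for MySQL. The pk column, the third sample row and the function name pairs are assumptions added for illustration, and julianday() replaces DATE_SUB/DATE_ADD because SQLite lacks them.

```python
import sqlite3

def pairs(dedupe):
    """Return rows within 14 days of each other; dedupe keeps one row per pair."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (pk INTEGER PRIMARY KEY, ID INTEGER, Date TEXT, Value TEXT)"
    )
    con.executemany(
        "INSERT INTO mytable VALUES (?, ?, ?, ?)",
        [
            (1, 1, "2016-01-01", "foo"),
            (2, 1, "2016-01-02", "bar"),
            (3, 1, "2016-03-01", "baz"),  # more than 14 days from the others
        ],
    )
    # '<' keeps each unordered pair once; '<>' keeps both mirror images
    pk_condition = "a.pk < b.pk" if dedupe else "a.pk <> b.pk"
    sql = (
        "SELECT a.ID, a.Date, b.Date, a.Value, b.Value "
        "FROM mytable AS a JOIN mytable AS b "
        "ON a.ID = b.ID AND " + pk_condition + " "
        "WHERE ABS(julianday(a.Date) - julianday(b.Date)) <= 14 "
        "ORDER BY a.pk, b.pk"
    )
    return con.execute(sql).fetchall()

print(pairs(dedupe=False))  # both orderings of the foo/bar pair
print(pairs(dedupe=True))   # only one of them
```

Comparing on the primary key is safer than comparing on Value, because two different rows can share the same Value and would then be dropped entirely.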

Grouping and aggregating with fields that don't need it

I have the following data:
| ID | Date | Code |
--------------------------
| 1 | 26/02/14 | 10 |
| 1 | 25/02/14 | 11 |
| 1 | 24/02/14 | 10 |
| 2 | 25/02/14 | 13 |
| 2 | 24/02/14 | 11 |
| 2 | 23/02/14 | 10 |
All I want is to group by the ID field and return the maximum value from the date field (i.e. most recent). So the final result should look like this:
| ID | Date | Code |
--------------------------
| 1 | 26/02/14 | 10 |
| 2 | 25/02/14 | 13 |
It seems though that if I want the "Code" field showing in the same query I also have to group or aggregate it as well... which makes sense because there could potentially be more than one value left on that field after the others are grouped/aggregated (even though there won't be in this case).
I thought I could handle this problem by doing the GroupBy and Max in a subquery on just those fields and then do a join on that subquery to bring in the "Code" field I don't want grouped or aggregated:
SELECT Q.ID, Q.MaxOfDate, A.Code
FROM
(SELECT B.ID, Max(B.Date) As MaxOfDate
FROM myTable As B
GROUP BY B.ID) As Q
LEFT JOIN myTable As A ON Q.ID = A.ID;
This isn't working though as it is still only giving me the original number of records I started with.
How do you do grouping and aggregation with fields you don't necessarily want grouped/aggregated?
An alternative to the answer I accepted:
SELECT Q.ID, Q.MaxOfDate, A.Code
FROM
(SELECT B.ID, Max(B.Date) As MaxOfDate
FROM myTable As B
GROUP BY B.ID) As Q
LEFT JOIN myTable As A ON (Q.ID = A.ID) AND (A.Date = Q.MaxOfDate);
Needed to do the LEFT JOIN on the Date field as well as the ID field.
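The corrected join can be checked against the sample data with a short sketch, again using Python's sqlite3 module as a stand-in for the real database. Dates are stored as ISO strings so that MAX() picks the latest one, which is an assumption about how the column is typed; the function name latest_codes is made up for the example.

```python
import sqlite3

def latest_codes():
    """Return (ID, MaxOfDate, Code) for the most recent row of each ID."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE myTable (ID INTEGER, Date TEXT, Code INTEGER)")
    con.executemany(
        "INSERT INTO myTable VALUES (?, ?, ?)",
        [
            (1, "2014-02-26", 10), (1, "2014-02-25", 11), (1, "2014-02-24", 10),
            (2, "2014-02-25", 13), (2, "2014-02-24", 11), (2, "2014-02-23", 10),
        ],
    )
    return con.execute("""
        SELECT Q.ID, Q.MaxOfDate, A.Code
        FROM (SELECT B.ID, MAX(B.Date) AS MaxOfDate
              FROM myTable AS B
              GROUP BY B.ID) AS Q
        LEFT JOIN myTable AS A
            ON Q.ID = A.ID AND A.Date = Q.MaxOfDate   -- match the max row only
        ORDER BY Q.ID
    """).fetchall()

print(latest_codes())
```

Joining on ID alone matches every row of that ID, which is why the first attempt returned the original number of records. If two rows share the same ID and the same maximum date, this query still returns both of them.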
If you want the CODE associated with the Max Date, you will have to use a subquery with a top 1, like this:
SELECT B.ID, Max(B.Date) As MaxOfDate,
(select top 1 C.Code
from myTable As C
where B.ID = C.ID
order by C.Date desc, C.Code) as Code
FROM myTable As B
GROUP BY B.ID

Number of increments through period in MySQL

I think this question is going to be hard to solve.
I have a table in my database like this one:
+----+--------+-------+
| ID | MONTH | VALUE |
+----+--------+-------+
| 1 | 1-2000 | 20.00 |
| 1 | 2-2000 | 21.00 |
| 1 | 3-2000 | 7.00 |
| 1 | 4-2000 | 8.00 |
+----+--------+-------+
With the following definition:
ID INTEGER(7) ZEROFILL NOT NULL
MONTH VARCHAR(7) NOT NULL
VALUE DOUBLE(20,2)
What I'm trying to retrieve is the number of times, over a period, that the field {VALUE} has increased from its previous value.
In the example above, if the period is from "1-2000" to "4-2000", {VALUE} has increased 2 times: [20.00->21.00, 7.00->8.00]
In the end, I would like to have the following output:
+----+------------+
| ID | NUM_OF_INC |
+----+------------+
| 1 | 2 |
+----+------------+
What I see as the main issue is that {MONTH} is not a DATE type field (of course, it cannot be).
Is there any way to achieve this?
I'm afraid that the solution is to get all the values and then compare one by one from the engine that is executing the queries.
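For reference, the client-side fallback described above could look roughly like the sketch below: fetch the (ID, MONTH, VALUE) rows, turn each "M-YYYY" string into a sortable (year, month) key, and count the rises between consecutive months. The function names month_key and count_increments are hypothetical, and the sketch assumes every MONTH value has the "M-YYYY" shape shown in the table.

```python
# Sample rows copied from the question: (ID, MONTH, VALUE)
ROWS = [
    (1, "1-2000", 20.00),
    (1, "2-2000", 21.00),
    (1, "3-2000", 7.00),
    (1, "4-2000", 8.00),
]

def month_key(month):
    """Turn 'M-YYYY' into a sortable (year, month) tuple."""
    m, y = month.split("-")
    return int(y), int(m)

def count_increments(rows, start, end):
    """Count, per ID, how often VALUE rose versus the previous month in range."""
    lo, hi = month_key(start), month_key(end)
    by_id = {}
    for row_id, month, value in rows:
        key = month_key(month)
        if lo <= key <= hi:
            by_id.setdefault(row_id, []).append((key, value))
    result = {}
    for row_id, series in by_id.items():
        series.sort()  # chronological order
        result[row_id] = sum(
            1 for (_, prev), (_, cur) in zip(series, series[1:]) if cur > prev
        )
    return result

print(count_increments(ROWS, "1-2000", "4-2000"))
```

For the sample data this counts the rises 20.00 to 21.00 and 7.00 to 8.00. It compares each row with the previous row present in the data, so a missing month is skipped over rather than treated as a break.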
Due to your date format and MySQL's lack of CTEs to convert the values a single time, the query gets pretty verbose. It searches the whole range, but it's fairly easy to add a range check using the same pattern:
SELECT a.id, COUNT(*) NUM_OF_INC
FROM Table1 a
JOIN Table1 b
ON a.id = b.id
AND a.value < b.value
AND STR_TO_DATE(CONCAT(a.`MONTH`, '-1'), '%c-%Y-%d')
< STR_TO_DATE(CONCAT(b.`MONTH`, '-1'), '%c-%Y-%d')
LEFT JOIN Table1 c
ON a.id = c.id
AND STR_TO_DATE(CONCAT(a.`MONTH`, '-1'), '%c-%Y-%d')
< STR_TO_DATE(CONCAT(c.`MONTH`, '-1'), '%c-%Y-%d')
AND STR_TO_DATE(CONCAT(c.`MONTH`, '-1'), '%c-%Y-%d')
< STR_TO_DATE(CONCAT(b.`MONTH`, '-1'), '%c-%Y-%d')
WHERE c.id IS NULL
GROUP BY a.id;
An SQLfiddle to test with.
Sadly, this query will definitely not use any index you have on MONTH.
If it is an option, consider changing the datatype of MONTH into something calculable. Then you can join each row to the previous month (MONTH - 1) and select on a difference > 0:
SELECT
t1.ID, count(*)
FROM
Entity t1
INNER JOIN Entity t2
ON t1.ID = t2.ID
AND t2.MONTH = t1.MONTH - 1
WHERE
t1.VALUE - t2.VALUE > 0
AND t1.MONTH BETWEEN :beginDate AND :endDate
GROUP BY t1.ID
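As a sketch of what "something calculable" might mean, the example below stores MONTH as an integer month index (year * 12 + month), so that consecutive months differ by exactly 1 even across a year boundary. That encoding, the helper month_index and the function count_increments are assumptions for illustration, not part of the original schema. The query is run through Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

def month_index(year, month):
    """Hypothetical encoding: consecutive months map to consecutive integers."""
    return year * 12 + month

def count_increments(begin, end):
    """Count, per ID, the months in [begin, end] whose VALUE beat the prior month."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE Entity (ID INTEGER, MONTH INTEGER, VALUE REAL)")
    con.executemany(
        "INSERT INTO Entity VALUES (?, ?, ?)",
        [
            (1, month_index(2000, 1), 20.00),
            (1, month_index(2000, 2), 21.00),
            (1, month_index(2000, 3), 7.00),
            (1, month_index(2000, 4), 8.00),
        ],
    )
    return con.execute("""
        SELECT t1.ID, COUNT(*)
        FROM Entity t1
        INNER JOIN Entity t2
            ON t1.ID = t2.ID
           AND t2.MONTH = t1.MONTH - 1          -- previous month
        WHERE t1.VALUE - t2.VALUE > 0
          AND t1.MONTH BETWEEN :beginDate AND :endDate
        GROUP BY t1.ID
    """, {"beginDate": begin, "endDate": end}).fetchall()

print(count_increments(month_index(2000, 1), month_index(2000, 4)))
```

Because the join is a plain equality on an integer column, an index on (ID, MONTH) can be used, unlike the STR_TO_DATE version. An ID with no increase in the period produces no row at all rather than a count of 0.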
If you can't change the data type, you have to replace t1.MONTH - 1 with some MySQL functions:
DATE_FORMAT(
SUBDATE(
STR_TO_DATE(CONCAT(t1.MONTH, "-1"), "%c-%Y-%d"),
INTERVAL 1 MONTH),
"%c-%Y")
and likewise replace t1.MONTH BETWEEN :beginDate AND :endDate with:
STR_TO_DATE(CONCAT(t1.MONTH, "-1"), "%c-%Y-%d")
BETWEEN :beginDate AND :endDate