MySQL: select row with most recent date per user - make it faster

The records in my table look like this:
id |sensor_id|val |audit_date
255245| 1|22.12|2017-02-18 08:26:47
and I want to get the latest records using this:
SELECT `sensor_id`, `val`, `audit_date`
FROM `tests` t1
JOIN (SELECT max(`audit_date`) as audit_date, `sensor_id`
from `tests` group by `sensor_id`) t2
USING (`audit_date`, `sensor_id`)
where `id` > (select max(`id`)-1000 from `tests`)
It takes more than one second; without the last "where" clause it takes a second and a half.
"id" is the primary key and there are no other indexes.
What can I do to make this query faster?

This query returns the latest inserted record per sensor using the max() function:
SELECT t1.sensor_id,val,t1.audit_date
FROM `tests` t1
JOIN (SELECT max(`audit_date`) as audit_date, max(`sensor_id`) as max_sensor_id
FROM `tests` group by `sensor_id`) t2
ON t2.max_sensor_id = t1.sensor_id
AND t2.audit_date =t1.audit_date

You can check whether a self-exclusion join is faster:
SELECT t1.sensor_id, t1.val, t1.audit_date
FROM tests t1
LEFT JOIN tests t2
ON t1.sensor_id = t2.sensor_id
AND t2.audit_date > t1.audit_date
where
t2.id is null
In other words: return the records for which no row with a later audit_date exists for the same sensor_id.
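To make the self-exclusion join concrete, here is a minimal sketch that runs it on a few sample rows. It uses Python's built-in sqlite3 module in place of MySQL purely so the example is self-contained; the sample data and the composite index idx_tests_sensor_date are assumptions, not part of the original question. An index on (sensor_id, audit_date) is a common thing to try for this kind of query, but whether it helps on the real table should be checked with EXPLAIN.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table `tests`.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tests ("
    " id INTEGER PRIMARY KEY, sensor_id INTEGER, val REAL, audit_date TEXT)"
)
# Hypothetical sample data: two sensors, two readings each.
conn.executemany(
    "INSERT INTO tests (sensor_id, val, audit_date) VALUES (?, ?, ?)",
    [
        (1, 22.12, "2017-02-18 08:26:47"),
        (1, 22.50, "2017-02-18 08:27:47"),
        (2, 19.00, "2017-02-18 08:26:50"),
        (2, 18.75, "2017-02-18 08:25:50"),
    ],
)
# Assumed index; lets the join look up later rows per sensor.
conn.execute("CREATE INDEX idx_tests_sensor_date ON tests (sensor_id, audit_date)")

# Keep a row only if no row with a later audit_date exists for the same sensor.
latest = conn.execute(
    """
    SELECT t1.sensor_id, t1.val, t1.audit_date
    FROM tests t1
    LEFT JOIN tests t2
      ON t1.sensor_id = t2.sensor_id
     AND t2.audit_date > t1.audit_date
    WHERE t2.id IS NULL
    ORDER BY t1.sensor_id
    """
).fetchall()
print(latest)
```

Each sensor appears once, with its most recent reading. If two rows of one sensor share the same latest audit_date, both are returned.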

Related

Delete records based on another query in mysql

I have a query in MySQL that finds duplicate records based on some columns.
select max(id), count(*) as cnt
from table group by start_id, end_id, mysqltable
having cnt>1;
The above query gives me the max(id) and the number of records that share the same start_id, end_id and mysqltable values.
I want to delete all the records whose id matches the max(id) column of the above query.
How can I do that?
I have tried the following:
delete from table
where (select max(id), count(*) as cnt
from table group by start_id,end_id,mysqltable
having cnt>1)
but it does not delete any records.
You can remove duplicate records using JOIN.
DELETE t1 FROM table t1
INNER JOIN
table t2
WHERE
t1.id > t2.id AND t1.start_id = t2.start_id AND t1.end_id = t2.end_id AND t1.mysqltable = t2.mysqltable;
This query keeps the row with the lowest id in each group of duplicates and removes all rows with a higher id.
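I think this command should work. The inner query is wrapped in a derived table because MySQL does not allow a DELETE to select from its own target table directly in a subquery:
delete from table
where id in
( select max_id from
( select max(id) as max_id from table
group by start_id, end_id, mysqltable
having count(*) > 1
) as dup
);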
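The two delete strategies behave differently when a group has more than two duplicates. The sketch below illustrates this using Python's sqlite3 module as a stand-in for MySQL; the table name links and the sample rows are hypothetical. Deleting only max(id) removes one row per duplicate group per run, while keeping only min(id) removes all extra rows at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `links` is a hypothetical name standing in for the placeholder `table`.
conn.execute(
    "CREATE TABLE links ("
    " id INTEGER PRIMARY KEY, start_id INTEGER, end_id INTEGER, mysqltable TEXT)"
)
conn.executemany(
    "INSERT INTO links (start_id, end_id, mysqltable) VALUES (?, ?, ?)",
    [
        (1, 2, "a"),  # id 1
        (1, 2, "a"),  # id 2, duplicate of id 1
        (1, 2, "a"),  # id 3, duplicate of id 1
        (3, 4, "b"),  # id 4, unique
    ],
)

# Strategy 1: delete only the row with max(id) in each duplicate group.
conn.execute(
    """
    DELETE FROM links
    WHERE id IN (
        SELECT max_id FROM (
            SELECT MAX(id) AS max_id
            FROM links
            GROUP BY start_id, end_id, mysqltable
            HAVING COUNT(*) > 1
        ) AS dup
    )
    """
)
after_max_delete = [r[0] for r in conn.execute("SELECT id FROM links ORDER BY id")]
print(after_max_delete)  # ids 1 and 2 are still duplicates of each other

# Strategy 2: keep only the lowest id of every group, delete everything else.
conn.execute(
    """
    DELETE FROM links
    WHERE id NOT IN (
        SELECT min_id FROM (
            SELECT MIN(id) AS min_id
            FROM links
            GROUP BY start_id, end_id, mysqltable
        ) AS keep
    )
    """
)
after_keep_lowest = [r[0] for r in conn.execute("SELECT id FROM links ORDER BY id")]
print(after_keep_lowest)
```

If the goal is to end up with no duplicates at all, the first strategy has to be repeated until it deletes nothing, whereas the second one finishes in a single statement.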

How to find latest record from two different tables

There are two tables, table1 and table2.
The first column, foreign_id, is common to both tables.
The data types of all the related columns are the same.
Now we need to find the latest record, based on the timestamp column, for each foreign_id across both tables, plus an extra column from_table that shows which table the row was selected from.
One method that I can think of is:
combine both tables,
then find the latest row for each foreign_id.
Is there a better way to do this, given that there could be 5000+ rows in each table?
Try this:
SELECT
t1.foreign_id,
MAX(t1.timestamp) max_time_table1,
MAX(t2.timestamp) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id;
Note: this can be a bit slow if the number of records is large.
However you can also use this:
SELECT a.foreign_id,
IF(a.max_time_table1 > a.max_time_table2, a.max_time_table1, a.max_time_table2) latest_update
FROM(
SELECT
t1.foreign_id,
SUBSTRING_INDEX(GROUP_CONCAT(t1.timestamp ORDER BY t1.id DESC),',',1) max_time_table1,
SUBSTRING_INDEX(GROUP_CONCAT(t2.timestamp ORDER BY t2.id DESC),',',1) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id) a;
Make sure the id columns in both tables are auto_increment.
From your explanation, this would do then:
SELECT
foreign_id,
CASE
WHEN max_time_table1 < max_time_table2 THEN max_time_table2
WHEN max_time_table2 < max_time_table1 THEN max_time_table1
END as timestamps
FROM(
SELECT
t1.foreign_id,
SUBSTRING_INDEX(GROUP_CONCAT(t1.timestamp ORDER BY t1.id DESC),',',1) max_time_table1,
SUBSTRING_INDEX(GROUP_CONCAT(t2.timestamp ORDER BY t2.id DESC),',',1) max_time_table2
FROM table1 t1
LEFT JOIN table2 t2 USING (foreign_id)
GROUP BY foreign_id) a;
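The method proposed in the question (combine both tables, then pick the latest row per foreign_id) can be sketched as follows. This is an illustration only, run through Python's sqlite3 module instead of MySQL, with hypothetical sample rows. The WITH clause needs MySQL 8.0 or later; on older versions the UNION ALL would have to be written inline as a derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, foreign_id INTEGER, timestamp TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, foreign_id INTEGER, timestamp TEXT);
    -- Hypothetical sample data; foreign_id 3 exists only in table2.
    INSERT INTO table1 (foreign_id, timestamp) VALUES
        (1, '2017-01-01 10:00:00'),
        (1, '2017-01-03 10:00:00'),
        (2, '2017-01-02 10:00:00');
    INSERT INTO table2 (foreign_id, timestamp) VALUES
        (1, '2017-01-02 10:00:00'),
        (2, '2017-01-05 10:00:00'),
        (3, '2017-01-04 10:00:00');
    """
)

rows = conn.execute(
    """
    WITH combined AS (
        -- Tag every row with the table it came from.
        SELECT foreign_id, timestamp, 'table1' AS from_table FROM table1
        UNION ALL
        SELECT foreign_id, timestamp, 'table2' AS from_table FROM table2
    )
    SELECT c.foreign_id, c.timestamp, c.from_table
    FROM combined c
    JOIN (
        -- Latest timestamp per foreign_id across both tables.
        SELECT foreign_id, MAX(timestamp) AS max_ts
        FROM combined
        GROUP BY foreign_id
    ) m
      ON m.foreign_id = c.foreign_id
     AND m.max_ts = c.timestamp
    ORDER BY c.foreign_id
    """
).fetchall()
print(rows)
```

Unlike a LEFT JOIN that starts from table1, this form also returns foreign_id values that appear only in table2, and it yields the from_table column directly.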

How to optimize mysql on left join

I will try to explain at a very high level.
I have two complex SELECT queries (for the sake of example I have reduced them to the following):
SELECT id, t3_id FROM t1;
SELECT t3_id, MAX(added) as last FROM t2 GROUP BY t3_id;
Query 1 returns 16k rows and query 2 returns 15k.
Each query individually takes less than 1 second to run.
However, I need to sort the results by the column added from query 2. When I try to use a LEFT JOIN:
SELECT
t1.id, t1.t3_Id
FROM
t1
LEFT JOIN
(SELECT t3_id, MAX(added) as last FROM t2 GROUP BY t3_id) AS t_t2
ON t_t2.t3_id = t1.t3_id
GROUP BY t1.t3_id
ORDER BY t_t2.last
the execution time goes up to over 1 minute.
I would like to understand the reason:
what causes such a huge increase?
NOTE:
All the columns used in every table are indexed,
e.g.:
table t1 has index on id,t3_Id
table t2 has index on t3_id and added
EDIT1
After @Tim Biegeleisen's suggestion, I changed the query to the following, and it now executes in about 16 seconds. If I remove the ORDER BY, the query executes in less than 1 second, so the ORDER BY alone is responsible for the slowdown.
SELECT
t1.id, t1.t3_Id
FROM
t1
LEFT JOIN
t2 ON t2.t3_id = t1.t3_id
GROUP BY t1.t3_id
ORDER BY MAX(t2.added)
Even though table t2 has an index on column t3_id, when you join t1 you are actually joining to a derived table, which either can't use the index, or can't use it completely effectively. Since t1 has 16K rows and you are doing a LEFT JOIN, this means the database engine will need to scan the entire derived table for each record in t1.
You should use MySQL's EXPLAIN to see what the exact execution strategy is, but my suspicion is that the derived table is what is slowing you down.
The correct query should be:
SELECT
t1.id,
t1.t3_Id,
MAX(t2.added) as last
FROM t1
LEFT JOIN t2 on t1.t3_Id = t2.t3_Id
GROUP BY t2.t3_id
ORDER BY last;
This happens because a temporary table is generated for each record.
I think you could try ordering everything after the records are available. Maybe:
select * from (
select * from
(select t3_id,max(t1_id) from t1 group by t3_id) as t1
left join (select t3_id,max(added) as last from t2 group by t3_id) as t2
on t1.t3_id = t2.t3_id ) as xx
order by last

How to translate this sql to left join without subquery?

I want to create a view in MySQL, but MySQL views don't support subqueries.
How can I write this SQL without a subquery?
select * from dev_location t1
inner join
(
select
`dev_location`.`device_id` AS `device_id`,
max(`dev_location`.`id`) AS `id`
from
`dev_location`
group by `dev_location`.`device_id`) t2
on t1.id = t2.id
MySQL views don't support subqueries in the from clause. The following should work in a view:
select dl.*
from dev_location dl
where not exists (select 1
from dev_location dl2
where dl2.device_id = dl.device_id and
dl2.id > dl.id
);
This reformulates the query to say: "Get me all the rows from dev_location where the device_id has no greater id." This is an awkward way of getting the max.
And, with an index on dev_location(device_id, id), it might perform better than your version.
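As a rough check of the NOT EXISTS form inside a view, the sketch below creates one and queries it. It runs on Python's sqlite3 module rather than MySQL; the view name latest_dev_location, the reading column and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- `reading` is a hypothetical payload column for illustration.
    CREATE TABLE dev_location (
        id INTEGER PRIMARY KEY,
        device_id INTEGER,
        reading TEXT
    );
    INSERT INTO dev_location (device_id, reading) VALUES
        (10, 'a'),  -- id 1
        (10, 'b'),  -- id 2
        (20, 'c'),  -- id 3
        (10, 'd'),  -- id 4
        (20, 'e');  -- id 5

    -- Index suggested in the answer above.
    CREATE INDEX idx_dev_location_device_id ON dev_location (device_id, id);

    -- The view has no subquery in its FROM clause, only in WHERE.
    CREATE VIEW latest_dev_location AS
    SELECT dl.*
    FROM dev_location dl
    WHERE NOT EXISTS (
        SELECT 1
        FROM dev_location dl2
        WHERE dl2.device_id = dl.device_id
          AND dl2.id > dl.id
    );
    """
)

latest_rows = conn.execute(
    "SELECT id, device_id, reading FROM latest_dev_location ORDER BY device_id"
).fetchall()
print(latest_rows)
```

The view returns exactly one row per device_id, the one with the highest id.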

How do I write this kind of query (returning the latest available data for each row)

I have a table defined like this:
CREATE TABLE mytable (id INT NOT NULL AUTO_INCREMENT, PRIMARY KEY(id),
user_id INT REFERENCES user(id) ON UPDATE CASCADE ON DELETE RESTRICT,
amount REAL NOT NULL CHECK (amount > 0),
record_date DATE NOT NULL
);
CREATE UNIQUE INDEX idxu_mybl_key ON mytable (user_id, amount, record_date);
I want to write a query that will have two columns:
user_id
amount
There should be only ONE entry in the returned result set for a given user. Furthermore, the amount returned should be the last recorded amount for the user (i.e. the one with MAX(record_date)).
The complication arises because weights are recorded on different dates for different users, so there is no single LAST record_date for all users.
How may I write a query (preferably in ANSI SQL) that returns the columns mentioned previously, while ensuring that only the last recorded amount for each user is returned?
As an aside, it is probably a good idea to return the 'record_date' column as well in the query, so that it is eas(ier) to verify that the query is working as required.
I am using MySQL as my backend db, but ideally the query should be db agnostic (i.e. ANSI SQL) if possible.
First you need the last record_date for each user:
select user_id, max(record_date) as last_record_date
from mytable
group by user_id
Now, you can join previous query with mytable itself to get amount for this record_date:
select
t1.user_id, last_record_date, amount
from
mytable t1
inner join
( select user_id, max(record_date) as last_record_date
from mytable
group by user_id
) t2
on t1.user_id = t2.user_id
and t1.record_date = t2.last_record_date
A problem appears because a user can have several rows for the same last_record_date (with different amounts). In that case you have to pick one of them, for example the max of the different amounts:
select
t1.user_id, t1.record_date as last_record_date, max(t1.amount)
from
mytable t1
inner join
( select user_id, max(record_date) as last_record_date
from mytable
group by user_id
) t2
on t1.user_id = t2.user_id
and t1.record_date = t2.last_record_date
group by t1.user_id, t1.record_date
I do not know about MySQL, but in general SQL you need a sub-query for that. You must join the query that calculates the greatest record_date with the original table to get the corresponding amount. Roughly like this:
SELECT B.*
FROM
(select user_id, max(record_date) max_date from mytable group by user_id) A
join
mytable B
on A.user_id = B.user_id and A.max_date = B.record_date
SELECT datatable.* FROM
mytable AS datatable
INNER JOIN (
SELECT user_id,max(record_date) AS max_record_date FROM mytable GROUP BY user_id
) AS selectortable ON
selectortable.user_id=datatable.user_id
AND
selectortable.max_record_date=datatable.record_date
In some SQL dialects you might need
SELECT MAX(user_id), ...
in the selectortable subquery instead of simply SELECT user_id, ...
The definition of maximum: there is no larger(or: "more recent") value than this one. This naturally leads to a NOT EXISTS query, which should be available in any DBMS.
SELECT user_id, amount
FROM mytable mt
WHERE mt.user_id = $user
AND NOT EXISTS ( SELECT *
FROM mytable nx
WHERE nx.user_id = mt.user_id
AND nx.record_date > mt.record_date
)
;
BTW: your table definition allows more than one record to exist for a given {user_id, record_date}, but with different amounts. This query will return them all.
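The tie behaviour is easy to see on a small data set. The sketch below is an illustration under assumptions: it uses Python's sqlite3 module instead of MySQL, drops the $user filter so all users are returned, and uses made-up rows in which user 1 has two amounts on the same latest date. It compares the NOT EXISTS query with the GROUP BY variant that picks MAX(amount).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    " id INTEGER PRIMARY KEY, user_id INTEGER,"
    " amount REAL NOT NULL, record_date TEXT NOT NULL)"
)
# Hypothetical data: user 1 has two different amounts on its latest date.
conn.executemany(
    "INSERT INTO mytable (user_id, amount, record_date) VALUES (?, ?, ?)",
    [
        (1, 70.5, "2012-01-01"),
        (1, 71.0, "2012-02-01"),
        (1, 72.0, "2012-02-01"),
        (2, 80.0, "2012-01-15"),
    ],
)

# NOT EXISTS: keeps every row that has no later row for the same user,
# so ties on the latest date are all returned.
with_ties = conn.execute(
    """
    SELECT mt.user_id, mt.amount, mt.record_date
    FROM mytable mt
    WHERE NOT EXISTS (
        SELECT 1
        FROM mytable nx
        WHERE nx.user_id = mt.user_id
          AND nx.record_date > mt.record_date
    )
    ORDER BY mt.user_id, mt.amount
    """
).fetchall()
print(with_ties)

# Join on the per-user max date, then collapse ties with MAX(amount).
one_per_user = conn.execute(
    """
    SELECT t1.user_id, t1.record_date AS last_record_date, MAX(t1.amount)
    FROM mytable t1
    INNER JOIN (
        SELECT user_id, MAX(record_date) AS last_record_date
        FROM mytable
        GROUP BY user_id
    ) t2
      ON t1.user_id = t2.user_id
     AND t1.record_date = t2.last_record_date
    GROUP BY t1.user_id, t1.record_date
    ORDER BY t1.user_id
    """
).fetchall()
print(one_per_user)
```

Which tie-breaking rule is right (largest amount, highest id, or something else) depends on what the data means; MAX(amount) is only one possible choice.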