How to rewrite a NOT IN subquery as a join - MySQL

Let's assume that the following tables in MySQL describe documents contained in folders.
mysql> select * from folder;
+----+----------------+
| ID | PATH           |
+----+----------------+
|  1 | matches/1      |
|  2 | matches/2      |
|  3 | shared/3       |
|  4 | no/match/4     |
|  5 | unreferenced/5 |
+----+----------------+
mysql> select * from DOC;
+----+------+------------+
| ID | F_ID | DATE       |
+----+------+------------+
|  1 |    1 | 2000-01-01 |
|  2 |    2 | 2000-01-02 |
|  3 |    2 | 2000-01-03 |
|  4 |    3 | 2000-01-04 |
|  5 |    3 | 2000-01-05 |
|  6 |    3 | 2000-01-06 |
|  7 |    4 | 2000-01-07 |
|  8 |    4 | 2000-01-08 |
|  9 |    4 | 2000-01-09 |
| 10 |    4 | 2000-01-10 |
+----+------+------------+
The ID columns are the primary keys, and the F_ID column of table DOC is a not-null foreign key referencing the primary key of table FOLDER. Using the DATE of documents in the WHERE clause, I would like to find which folders contain only the selected documents. For documents earlier than 2000-01-05, this could be written as:
SELECT DISTINCT d1.F_ID
FROM DOC d1
WHERE d1.DATE < '2000-01-05'
  AND d1.F_ID NOT IN (
    SELECT d2.F_ID
    FROM DOC d2
    WHERE NOT (d2.DATE < '2000-01-05')
  );
and it correctly returns 1 and 2. According to
http://dev.mysql.com/doc/refman/5.5/en/rewriting-subqueries.html
the performance for big tables could be improved if the subquery is replaced with a join. I already found questions related to NOT IN and JOINs, but not exactly what I was looking for. So, any ideas how this could be written with joins?

The general answer is that a query of the form
select t.*
from t
where t.id not in (select id from s)
can be rewritten as:
select t.*
from t left outer join
     (select distinct id from s) s
     on t.id = s.id
where s.id is null
I think you can apply this to your situation.

select distinct d1.F_ID
from DOC d1
left outer join (
select F_ID
from DOC
where date >= '2000-01-05'
) d2 on d1.F_ID = d2.F_ID
where d1.date < '2000-01-05'
and d2.F_ID is null
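An equivalent NOT EXISTS form may also be worth benchmarking. This is just a sketch; because F_ID is NOT NULL it returns the same rows (1 and 2) as the NOT IN version:
SELECT DISTINCT d1.F_ID
FROM DOC d1
WHERE d1.DATE < '2000-01-05'
  AND NOT EXISTS (
    SELECT 1
    FROM DOC d2
    WHERE d2.F_ID = d1.F_ID
      AND d2.DATE >= '2000-01-05'
  );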

If I understand your question correctly, and you want to find the F_IDs of folders that contain only documents dated before '2000-01-05', then simply:
SELECT F_ID
FROM DOC
GROUP BY F_ID
HAVING MAX(DATE) < '2000-01-05'
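Against the sample data above this returns F_ID 1 and 2. A composite index should let MySQL answer it from the index alone; a minimal sketch, assuming you are free to add indexes (the index name is arbitrary):
ALTER TABLE DOC ADD INDEX idx_doc_fid_date (F_ID, DATE);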

Sample Table and Insert Statements
CREATE TABLE `tleft` (
  `id` int(2) NOT NULL,
  `name` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `tright` (
  `id` int(2) NOT NULL,
  `t_left_id` int(2) DEFAULT NULL,
  `description` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `tleft` (`id`, `name`)
VALUES
(1, 'henry'),
(2, 'steve'),
(3, 'jeff'),
(4, 'richards'),
(5, 'elon');
INSERT INTO `tright` (`id`, `t_left_id`, `description`)
VALUES
(1, 1, 'sample'),
(2, 2, 'sample');
Left Join : SELECT l.id,l.name FROM tleft l LEFT JOIN tright r ON l.id = r.t_left_id ;
Returns Id : 1, 2, 3, 4, 5
Right Join : SELECT l.id,l.name FROM tleft l RIGHT JOIN tright r ON l.id = r.t_left_id ;
Returns Id : 1,2
Subquery Not in tright : select id from tleft where id not in ( select t_left_id from tright);
Returns Id : 3,4,5
Equivalent Join For above subquery :
SELECT l.id,l.name FROM tleft l LEFT JOIN tright r ON l.id = r.t_left_id WHERE r.t_left_id IS NULL;
The AND clause is applied during the JOIN, while the WHERE clause is applied after the JOIN.
Example : SELECT l.id,l.name FROM tleft l LEFT JOIN tright r ON l.id = r.t_left_id AND r.description ='hello' WHERE r.t_left_id IS NULL ;
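One caveat, shown here with a hypothetical extra row: because t_left_id is nullable, a single NULL in tright makes the NOT IN query return an empty set, while the LEFT JOIN ... IS NULL form keeps working.
INSERT INTO `tright` (`id`, `t_left_id`, `description`) VALUES (3, NULL, 'orphan');
-- NOT IN now returns no rows at all:
SELECT id FROM tleft WHERE id NOT IN (SELECT t_left_id FROM tright);
-- the anti-join still returns 3, 4, 5:
SELECT l.id FROM tleft l LEFT JOIN tright r ON l.id = r.t_left_id WHERE r.t_left_id IS NULL;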
Hope this helps

Related

MySql: sum orders in an efficient way (OR too slow)

I want to sum up orders. There are products p and ordered items i like:
DROP TABLE IF EXISTS p;
CREATE TABLE p (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`combine` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `combine`(`combine`)
) ENGINE=InnoDB;
DROP TABLE IF EXISTS i;
CREATE TABLE i (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`p` int(10) unsigned DEFAULT NULL,
`quantity` decimal(15,2) NOT NULL,
PRIMARY KEY (`id`),
INDEX `p`(`p`)
) ENGINE=InnoDB;
INSERT INTO p SET id=1, combine=NULL;
INSERT INTO p SET id=2, combine=1;
INSERT INTO p SET id=3, combine=1;
INSERT INTO p SET id=4, combine=NULL;
INSERT INTO i SET id=1, p=1, quantity=5;
INSERT INTO i SET id=2, p=1, quantity=2;
INSERT INTO i SET id=3, p=2, quantity=1;
INSERT INTO i SET id=4, p=3, quantity=4;
INSERT INTO i SET id=5, p=4, quantity=2;
INSERT INTO i SET id=6, p=4, quantity=1;
The idea is that products may be combined, which means all sales are combined for these products. For example, products 1, 2 and 3 should all have the same result: all sales of these products summed up. So I do:
SELECT p.id, SUM(i.quantity)
FROM p
LEFT JOIN p AS p_all ON (p_all.id = p.id OR p_all.combine=p.combine OR p_all.id = p.combine OR p_all.combine = p.id)
LEFT JOIN i ON i.p = p_all.id
GROUP BY p.id;
which gives the required result:
p=1: 12 (i: 1, 2, 3, 4 added)
p=2: 12 (i: 1, 2, 3, 4 added)
p=3: 12 (i: 1, 2, 3, 4 added)
p=4: 3 (i: 5, 6 added)
My problem is that, on the real data, the OR conditions in the self-join of the products (p_all) make the query very slow. Querying without the combination takes 0.2 seconds, while with the OR it takes more than 30 seconds.
How could I make this query more efficient in MySQL?
Added: There are some more constraints on the real query like:
SELECT p.id, SUM(i.quantity)
FROM p
LEFT JOIN p AS p_all ON (p_all.id = p.id OR p_all.combine=p.combine OR p_all.id = p.combine OR p_all.combine = p.id)
LEFT JOIN i ON i.p = p_all.id
LEFT JOIN orders o ON o.id = i.order
WHERE o.ordered <= '2018-05-10'
AND i.flag=false
AND ...
GROUP BY p.id;
Added: EXPLAIN on real data:
+----+-------------+------------------+------------+-------+-----------------------------+---------+---------+--------------+------+----------+-------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+-------+-----------------------------+---------+---------+--------------+------+----------+-------------------------------------------------+
| 1 | SIMPLE | p | NULL | index | PRIMARY,...combine... | PRIMARY | 4 | NULL | 6556 | 100.00 | NULL |
| 1 | SIMPLE | p_all | NULL | ALL | PRIMARY,combine | NULL | NULL | NULL | 6556 | 100.00 | Range checked for each record (index map: 0x41) |
| 1 | SIMPLE | p | NULL | ref | p | p | 5 | p_all.id | 43 | 100.00 | NULL |
+----+-------------+------------------+------------+-------+-----------------------------+---------+---------+--------------+------+----------+-------------------------------------------------+
I don't know if you have the flexibility to do this, but you could speed it up by changing the combine field in p:
UPDATE p SET combine=id WHERE combine IS NULL;
Then you can massively simplify the ON condition to:
ON p_all.combine = p.combine
making the query (SQLFiddle):
SELECT p.id, SUM(i.quantity) AS qty
FROM p
JOIN p AS p_all
ON p_all.combine = p.combine
JOIN i
ON i.p = p_all.id
GROUP BY p.id
Output:
id qty
1 12
2 12
3 12
4 3
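Note that this version uses inner joins, so a product whose group has no ordered items at all would drop out of the result. If you need such rows too (as in your original query), a sketch keeping the outer joins:
SELECT p.id, SUM(i.quantity) AS qty
FROM p
LEFT JOIN p AS p_all ON p_all.combine = p.combine
LEFT JOIN i ON i.p = p_all.id
GROUP BY p.id;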
Using subqueries can sometimes be faster than joins.
e.g.
Select p.id, (Select sum(quantity)
              from i
              where p in (Select id
                          from p as p2
                          where p2.id = p.id or
                                p2.combine = p.id or
                                p2.id = p.combine or
                                p2.combine = p.combine)
             ) as orders
from p
You could add all of your constraints on i inside the 'orders' subquery.
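For example, folding in the constraints from your extended query could look like this; it is only a sketch and assumes the orders table and the column names (i.order, i.flag, o.ordered) from your example:
Select p.id,
       (Select sum(i.quantity)
        from i
        join orders o on o.id = i.order
        where o.ordered <= '2018-05-10'
          and i.flag = false
          and i.p in (Select id from p as p2
                      where p2.id = p.id or
                            p2.combine = p.id or
                            p2.id = p.combine or
                            p2.combine = p.combine)
       ) as orders
from p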

SQL improvement in MySQL

I have these tables in MySQL.
CREATE TABLE `tableA` (
  `id_a` int(11) NOT NULL,
  `itemCode` varchar(50) NOT NULL,
  `qtyOrdered` decimal(15,4) DEFAULT NULL,
  :
  PRIMARY KEY (`id_a`),
  KEY `INDEX_A1` (`itemCode`)
) ENGINE=InnoDB;

CREATE TABLE `tableB` (
  `id_b` int(11) NOT NULL AUTO_INCREMENT,
  `qtyDelivered` decimal(15,4) NOT NULL,
  `id_a` int(11) DEFAULT NULL,
  `opType` int(11) NOT NULL, -- '0' delivered to customer, '1' returned from customer
  :
  PRIMARY KEY (`id_b`),
  KEY `INDEX_B1` (`id_a`),
  KEY `INDEX_B2` (`opType`)
) ENGINE=InnoDB;
tableA shows the quantity ordered by the customer, and tableB shows the quantity delivered to the customer for each order.
I want to write a SQL query that computes the quantity remaining to be delivered for each itemCode.
The SQL is below. It works, but it is slow.
SELECT T1.itemCode,
SUM(IFNULL(T1.qtyOrdered,'0')-IFNULL(T2.qtyDelivered,'0')+IFNULL(T3.qtyReturned,'0')) as qty
FROM tableA AS T1
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyDelivered FROM tableB WHERE opType = '0' GROUP BY id_a)
AS T2 on T1.id_a = T2.id_a
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyReturned FROM tableB WHERE opType = '1' GROUP BY id_a)
AS T3 on T1.id_a = T3.id_a
WHERE T1.itemCode = '?'
GROUP BY T1.itemCode
I tried explain on this SQL, and the result is as below.
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| 1 | PRIMARY | T1 | ref | INDEX_A1 | INDEX_A1 | 152 | const | 1 | Using where |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 21211 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 10 | |
| 3 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 96 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 55614 | Using where; Using temporary; Using filesort |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
I want to improve my query. How can I do that?
First, your table B has an int for opType, but you are comparing it to the strings '0' and '1'. Compare against the numeric 0 and 1 instead. To optimize your pre-aggregates, you should not rely on individual single-column indexes but on a composite, and in this case covering, index: a single index on tableB (opType, id_a, qtyDelivered). The opType optimizes the WHERE, id_a optimizes the GROUP BY, and qtyDelivered lets the aggregate be computed from the index without touching the raw data pages.
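In DDL form, that covering index might be created like this (a sketch; the index name is arbitrary, and you may want to drop INDEX_B2 once it exists):
ALTER TABLE tableB ADD INDEX IDX_B_COVER (opType, id_a, qtyDelivered);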
Since you are looking at both types, you can roll them up into a single subquery that tests for either one in a single pass, then join that to your tableA results.
SELECT
T1.itemCode,
SUM( IFNULL(T1.qtyOrdered, 0 )
- IFNULL(T2.qtyDelivered, 0)
+ IFNULL(T2.qtyReturned, 0)) as qty
FROM
tableA AS T1
LEFT JOIN ( SELECT
id_a,
SUM( IF( opType=0,qtyDelivered, 0)) as qtyDelivered,
SUM( IF( opType=1,qtyDelivered, 0)) as qtyReturned
FROM
tableB
WHERE
opType IN ( 0, 1 )
GROUP BY
id_a) AS T2
on T1.id_a = T2.id_a
WHERE
T1.itemCode = '?'
GROUP BY
T1.itemCode
Now, depending on the size of your tables, you might be better off joining the inner table to tableA so that you only aggregate rows for the itemCode you are expecting. If you have 50k items and only 120 of them qualify, the inner query is still aggregating over all 50k, which is overkill. In that case, I would suggest an index on tableA (itemCode, id_a) and adjusting the inner query to:
LEFT JOIN ( SELECT
b.id_a,
SUM( IF( b.opType = 0, b.qtyDelivered, 0)) as qtyDelivered,
SUM( IF( b.opType = 1, b.qtyDelivered, 0)) as qtyReturned
FROM
( select distinct id_a
from tableA
where itemCode = '?' ) pqA
JOIN tableB b
on PQA.id_A = b.id_a
AND b.opType IN ( 0, 1 )
GROUP BY
id_a) AS T2
My Query against your SQLFiddle

SQL JOIN: Prefix fields with table name

I have the following tables
CREATE TABLE `constraints` (
`id` int(11),
`name` varchar(64),
`type` varchar(64)
);
CREATE TABLE `groups` (
`id` int(11),
`name` varchar(64)
);
CREATE TABLE `constraints_to_group` (
`groupid` int(11),
`constraintid` int(11)
);
With the following data :
INSERT INTO `groups` (`id`, `name`) VALUES
(1, 'group1'),
(2, 'group2');
INSERT INTO `constraints` (`id`, `name`, `type`) VALUES
(1, 'cons1', 'eq'),
(2, 'cons2', 'inf');
INSERT INTO `constraints_to_group` (`groupid`, `constraintid`) VALUES
(1, 1),
(1, 2),
(2, 2);
I want to get all constraints for all groups, so I do the following:
SELECT groups.*, t.* FROM groups
LEFT JOIN
(SELECT * FROM constraints
LEFT JOIN constraints_to_group
ON constraints.id=constraints_to_group.constraintid) as t
ON t.groupid=groups.id
And get the following result:
id | name   | id | name  | type | groupid | constraintid
--------------------------------------------------------
1  | group1 | 1  | cons1 | eq   | 1       | 1
1  | group1 | 2  | cons2 | inf  | 1       | 2
2  | group2 | 2  | cons2 | inf  | 2       | 2
What I'd like to get:
group_id | group_name | cons_id | cons_name | cons_type | groupid | constraintid
--------------------------------------------------------------------------------
1        | group1     | 1       | cons1     | eq        | 1       | 1
1        | group1     | 2       | cons2     | inf       | 1       | 2
2        | group2     | 2       | cons2     | inf       | 2       | 2
This is just an example; in my real case the tables have many more columns, so writing SELECT groups.name as group_name, ... by hand would lead to queries that are very hard to maintain.
Try this way
SELECT groups.id as group_id, groups.name as group_name ,
t.id as cons_id, t.name as cons_name, t.type as cons_type,
a.groupid , a.constraintid
FROM constraints_to_group as a
JOIN groups on groups.id=a.groupid
JOIN constraints as t on t.id=a.constraintid
The only difference I see is the column names. For that matter, use an AS alias:
SELECT
groups.id AS group_id,
groups.name AS group_name,
t.id AS cons_id,
t.name AS cons_name,
t.groupid, t.constraintid
FROM groups
LEFT JOIN
(SELECT * FROM constraints
LEFT JOIN constraints_to_group
ON constraints.id=constraints_to_group.constraintid) as t
ON t.groupid=groups.id
Besides, a better join-construction is:
SELECT G.id AS group_id,
G.name AS group_name,
C.id AS cons_id,
C.name AS cons_name,
CG.groupid, CG.constraintid
FROM constraints_to_group CG
LEFT JOIN constraints C
ON CG.constraintid = C.id
LEFT JOIN groups G
ON CG.groupid = G.id;
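If hand-writing the aliases for wide tables is the real pain point, the alias list can be generated once from information_schema and pasted into the query. A sketch, with 'mydb' standing in for your schema name:
SELECT GROUP_CONCAT(CONCAT('groups.', COLUMN_NAME, ' AS group_', COLUMN_NAME)
                    ORDER BY ORDINAL_POSITION SEPARATOR ', ')
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydb' AND TABLE_NAME = 'groups';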

phpmyadmin update m to n relationship table to remove duplicates from separate table

I have 2 tables.
CREATE TABLE designs
( game_id INT NOT NULL,
  des_id INT NOT NULL,
  PRIMARY KEY(game_id, des_id),
  FOREIGN KEY(game_id) REFERENCES Game(id)
    ON UPDATE CASCADE);
CREATE TABLE designer
( name VARCHAR(30) NOT NULL,
  id INT NOT NULL,
  PRIMARY KEY(id),
  FOREIGN KEY(id) REFERENCES designs(des_id)
    ON UPDATE CASCADE);
Let's say I have data:
designs:
0---0
0---1
1---2
2---3
2---4
.............................
designer:
Bob---0
Jill---1
Bob---2
Rob---3
Jill---4
After the update, I would like the "designs" table to look like:
0---0
0---1
1---0
2---3
2---1
What update query would I need to accomplish this?
Some queries I tried are:
UPDATE designs
SET des_id = (
SELECT a.id
FROM designer as a
JOIN designer as b
ON a.name=b.name AND a.id < b.id
WHERE des_id = b.id);
...
UPDATE `designs` as a
JOIN designer as b
ON a.des_id=b.id
SET a.des_id = b.id
WHERE b.id = (
SELECT c.id
FROM designer as c
LEFT JOIN designer as d
ON c.name=d.name
WHERE c.id<d.id)
Here's one idea. Note that it relies on an undocumented hack in the form of a 'group by/order by' trick:
UPDATE designs d
JOIN
( select d1.id matcher_id
, d2.id select_id
from `designer` d1
JOIN designer d2
ON d1.name = d2.name
group
by d1.id
Order
by d2.id
) x
ON x.matcher_id = d.des_id
SET d.des_id = select_id
Your LEFT JOIN idea is almost right, but here's another idea which is faster...
DROP TABLE IF EXISTS designs;
CREATE TABLE designs
( game_id INT NOT NULL
, designer_id INT NOT NULL
, PRIMARY KEY(game_id, designer_id)
);
DROP TABLE IF EXISTS designers;
CREATE TABLE designers
( name VARCHAR(30) NOT NULL
, designer_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
);
INSERT INTO designs VALUES
(1,1),
(1,2),
(2,3),
(3,4),
(3,5);
INSERT INTO designers VALUES
('Bob',1),
('Jill',2),
('Bob',3),
('Rob',4),
('Jill',5);
SELECT * FROM designs;
+---------+-------------+
| game_id | designer_id |
+---------+-------------+
|       1 |           1 |
|       1 |           2 |
|       2 |           3 |
|       3 |           4 |
|       3 |           5 |
+---------+-------------+
SELECT * FROM designers;
+------+-------------+
| name | designer_id |
+------+-------------+
| Bob  |           1 |
| Jill |           2 |
| Bob  |           3 |
| Rob  |           4 |
| Jill |           5 |
+------+-------------+
UPDATE designs g
JOIN designers d
ON d.designer_id = g.designer_id
JOIN designers x ON x.name = d.name
JOIN
( SELECT name
, MIN(designer_id) min_designer_id
FROM designers
GROUP
BY name
) y
ON y.name = x.name
AND y.min_designer_id = x.designer_id
SET g.designer_id = x.designer_id;
SELECT * FROM designs;
+---------+-------------+
| game_id | designer_id |
+---------+-------------+
|       1 |           1 |
|       1 |           2 |
|       2 |           1 |
|       3 |           2 |
|       3 |           4 |
+---------+-------------+
Actually, in the special case of an UPDATE, I think this will work just as well, and I'm not really sure that it's any less performant...
UPDATE designs g
JOIN designers x
ON x.designer_id = g.designer_id
JOIN designers y
ON y.name = x.name
AND y.designer_id < x.designer_id
SET g.designer_id = y.designer_id;
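Before running either UPDATE, the same join can be reused as a read-only preview of which designs rows would change (a sketch):
SELECT g.game_id, g.designer_id AS current_id, y.designer_id AS new_id
FROM designs g
JOIN designers x ON x.designer_id = g.designer_id
JOIN designers y ON y.name = x.name AND y.designer_id < x.designer_id;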

MySQL Query Optimization with MAX()

I have 3 tables with the following schema:
CREATE TABLE `devices` (
`device_id` int(11) NOT NULL auto_increment,
`name` varchar(20) default NULL,
`appliance_id` int(11) default '0',
`sensor_type` int(11) default '0',
`display_name` VARCHAR(100),
PRIMARY KEY USING BTREE (`device_id`)
)
CREATE TABLE `channels` (
`channel_id` int(11) NOT NULL AUTO_INCREMENT,
`device_id` int(11) NOT NULL,
`channel` varchar(10) NOT NULL,
PRIMARY KEY (`channel_id`),
KEY `device_id_idx` (`device_id`)
)
CREATE TABLE `historical_data` (
`date_time` datetime NOT NULL,
`channel_id` int(11) NOT NULL,
`data` float DEFAULT NULL,
`unit` varchar(10) DEFAULT NULL,
KEY `devices_datetime_idx` (`date_time`) USING BTREE,
KEY `channel_id_idx` (`channel_id`)
)
The setup is that a device can have one or more channels, and each channel has many rows of (historical) data.
I use the following query to get the last historical data for one device and all its related channels:
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND ( c.channel = 'ch1')
GROUP BY c.channel
ORDER BY h.date_time, channel
The query plan looks as follows:
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
| 1 | SIMPLE | c | ALL | PRIMARY,device_id_idx | NULL | NULL | NULL | 34 | Using where |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | c.device_id | 1 | Using where |
| 1 | SIMPLE | h | ref | channel_id_idx | channel_id_idx | 4 | c.channel_id | 322019 | |
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
3 rows in set (0.00 sec)
The above query currently takes approximately 15 seconds, and I wanted to know if there are any tips or ways to improve it.
Edit:
Example data from historical_data
+---------------------+------------+------+------+
| date_time           | channel_id | data | unit |
+---------------------+------------+------+------+
| 2011-11-20 21:30:57 |         34 | 23.5 | C    |
| 2011-11-20 21:30:57 |          9 |   68 | W    |
| 2011-11-20 21:30:54 |         34 | 23.5 | C    |
| 2011-11-20 21:30:54 |          5 |  316 | W    |
| 2011-11-20 21:30:53 |         34 | 23.5 | C    |
| 2011-11-20 21:30:53 |          2 |   34 | W    |
| 2011-11-20 21:30:51 |         34 | 23.4 | C    |
| 2011-11-20 21:30:51 |          9 |   68 | W    |
| 2011-11-20 21:30:49 |         34 | 23.4 | C    |
| 2011-11-20 21:30:49 |          4 |  193 | W    |
+---------------------+------------+------+------+
10 rows in set (0.00 sec)
Edit 2:
Multiple channel SELECT example:
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND ( c.channel = 'ch1' OR c.channel = 'ch2' OR c.channel = 'ch3')
GROUP BY c.channel
ORDER BY h.date_time, channel
I've used OR in the c.channel WHERE clause because it was easier to generate programmatically, but it can be changed to use IN if necessary.
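For reference, the same query with IN instead of the chained ORs (equivalent here):
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND c.channel IN ('ch1', 'ch2', 'ch3')
GROUP BY c.channel
ORDER BY h.date_time, channel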
Edit 3:
Example result of what I'm trying to achieve:
+-----------+------------+---------+---------------------+-------+
| device_id | channel_id | channel | max(h.date_time)    | data  |
+-----------+------------+---------+---------------------+-------+
|        28 |          9 | ch1     | 2011-11-21 20:39:36 |     0 |
|        28 |         35 | ch2     | 2011-11-21 20:30:55 | 32767 |
+-----------+------------+---------+---------------------+-------+
I have added device_id to the example, but my select only needs to return channel_id, channel, the last date_time (i.e. the max) and the data. The result should be the last record from the historical_data table for each channel of one device.
It seems that dropping and re-creating the index on date_time sped my original SQL up to around 2 seconds.
I haven't been able to test this, so I'd like to ask you to run it and let us know what happens: whether it gives you the desired result and whether it runs faster than your current query:
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetLatestHistoricalData_EXAMPLE`
(
IN param_device_name VARCHAR(20)
, IN param_appliance_id INT
, IN param_sensor_type INT
, IN param_channel VARCHAR(10)
)
BEGIN
SELECT
h.date_time, h.data
FROM
historical_data h
INNER JOIN
(
SELECT c.channel_id
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
WHERE
d.name = param_device_name
AND d.appliance_id = param_appliance_id
AND d.sensor_type = param_sensor_type
AND c.channel = param_channel
)
c ON h.channel_id = c.channel_id
ORDER BY h.date_time DESC
LIMIT 1;
END
Then to run a test:
CALL GetLatestHistoricalData_EXAMPLE ('livingroom', 0, 1, 'ch1');
I tried working it into a stored procedure so that even if you get the desired results using this for one device, you can try it with another device and see the results... Thanks!
[edit]: In response to Danny's comment, here's an updated test version:
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetLatestHistoricalData_EXAMPLE_3Channel`
(
IN param_device_name VARCHAR(20)
, IN param_appliance_id INT
, IN param_sensor_type INT
, IN param_channel_1 VARCHAR(10)
, IN param_channel_2 VARCHAR(10)
, IN param_channel_3 VARCHAR(10)
)
BEGIN
SELECT
h.date_time, h.data
FROM
historical_data h
INNER JOIN
(
SELECT c.channel_id
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
WHERE
d.name = param_device_name
AND d.appliance_id = param_appliance_id
AND d.sensor_type = param_sensor_type
AND (
c.channel IN (param_channel_1
,param_channel_2
,param_channel_3
)
)
)
c ON h.channel_id = c.channel_id
ORDER BY h.date_time DESC
LIMIT 1;
END
Then to run a test:
CALL GetLatestHistoricalData_EXAMPLE_3Channel ('livingroom', 0, 1, 'ch1', 'ch2' , 'ch3');
Again, this is just for testing, so you'll be able to see if it meets your needs..
I would first add an index on the devices table (appliance_id, sensor_type, name) to match your query. I don't know how many entries are in this table, but if it is large, with many channels per device, this is the first thing to do.
Second, on your channels table, index on ( device_id, channel )
Third, on your history data, index on ( channel_id, date_time )
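In DDL form, those three indexes might look like this (a sketch; the index names are arbitrary):
ALTER TABLE devices ADD INDEX idx_dev_lookup (appliance_id, sensor_type, name);
ALTER TABLE channels ADD INDEX idx_chan_dev_channel (device_id, channel);
ALTER TABLE historical_data ADD INDEX idx_hist_chan_dt (channel_id, date_time);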
then,
SELECT STRAIGHT_JOIN
PreQuery.MostRecent,
PreQuery.Channel_ID,
PreQuery.Channel,
H2.Data,
H2.Unit
from
( select
c.channel_id,
c.channel,
max( h.date_time ) as MostRecent
from
devices d
join channels c
on d.device_id = c.device_id
and c.channel in ( 'ch1', 'ch2', 'ch3' )
join historical_data h
on c.channel_id = h.channel_id
where
d.appliance_id = 0
and d.sensor_type = 1
and d.name = 'livingroom'
group by
c.channel_id ) PreQuery
JOIN Historical_Data H2
on PreQuery.Channel_ID = H2.Channel_ID
AND PreQuery.MostRecent = H2.Date_Time
order by
PreQuery.MostRecent,
PreQuery.Channel