MySQL LEFT JOIN not working when rows not found

I have created a MySQL query that looks up a user's favorites and sorts the results by the last time the user viewed the page.
Everything works, except when the LEFT JOIN finds no matching rows.
Here are some example queries:
EXPLAIN
SELECT a.*,
COUNT(b.SID) AS mcount,
MAX(b.time) AS mtime
FROM posts a
LEFT JOIN impressions b ON a.SID = b.SID
AND b.time > '1444679848'
AND (b.MID = '3'
OR b.FBID = '418'
OR b.TID = '152')
WHERE a.pending != 1
AND a.sponsor != '1'
AND a.fbpost = '0'
AND a.roundup = '0'
AND a.404 != '1'
AND a.hide = '0'
AND a.END >= '2015-10-22'
AND a.added >= '2015-10-12'
AND a.url_image > ''
AND CHAR_LENGTH(a.states) < 80
AND a.usa = '1'
GROUP BY a.SID
ORDER BY mcount DESC, (a.int_clicks + (a.int_views2b * 100)) DESC
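+----+-------------+-------+-------+------------------------------------------------------------------+--------------+---------+--------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |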
+----+-------------+-------+-------+------------------------------------------------------------------+--------------+---------+--------------------------+------+----------------------------------------------+
| 1 | SIMPLE | a | range | end,added,usa,sponsor,pending | added | 4 | NULL | 2474 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | b | ref | MID,FBID,TID,SID | SID | 5 | sp282q.a.SID | 280 | |
+----+-------------+-------+-------+------------------------------------------------------------------+--------------+---------+--------------------------+------+----------------------------------------------+
Here is the same query, but with invalid MID, FBID and TID values, so that nothing matches:
EXPLAIN
SELECT a.*,
COUNT(b.SID) AS mcount,
MAX(b.time) AS mtime
FROM posts a
LEFT JOIN impressions b ON a.SID = b.SID
AND b.time > '1444679848'
AND (b.MID = '3234234'
OR b.FBID = '423423423418'
OR b.TID = '152342342342')
WHERE a.pending != 1
AND a.sponsor != '1'
AND a.fbpost = '0'
AND a.roundup = '0'
AND a.404 != '1'
AND a.hide = '0'
AND a.END >= '2015-10-22'
AND a.added >= '2015-10-12'
AND a.url_image > ''
AND CHAR_LENGTH(a.states) < 80
AND a.usa = '1'
GROUP BY a.SID
ORDER BY mcount DESC, (a.int_clicks + (a.int_views2b * 100)) DESC
+----+-------------+-------+-------+------------------------------------------------------------------+--------------+---------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------------------------------------------------------+--------------+---------+------+----------+----------------------------------------------+
| 1 | SIMPLE | a | range | end,added,usa,sponsor,pending | added | 4 | NULL | 2474 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | b | ALL | MID,FBID,TID,SID | NULL | NULL | NULL | 13029182 | Using where |
+----+-------------+-------+-------+------------------------------------------------------------------+--------------+---------+------+----------+----------------------------------------------+
For the second query, when no rows match the MID, FBID or TID values, EXPLAIN shows a scan of the whole impressions table, over 13 million rows.
When I switch these queries to an INNER JOIN, the result comes back fast when no records are found.
When I run the query with a LEFT JOIN and no records are found, it locks MySQL up and the query runs for hours.
What am I doing wrong here?
The expected output should be:
POSTID mcount COUNT ORDER mtime
883 25 10 1444279848
823 22 22 1444249848
813 20 8 1444672448
816 6 18 1444672248
810 0 50 1444679848
811 0 45 1444479865
815 0 30 1444673468
The mtime column is MAX(b.time) AS mtime, and COUNT ORDER is (a.int_clicks + (a.int_views2b * 100)).
Updated with SHOW CREATE TABLE impressions
CREATE TABLE `impressions` (
`id` int(5) NOT NULL AUTO_INCREMENT,
`SID` int(11) DEFAULT NULL,
`MID` bigint(20) DEFAULT NULL,
`FBID` bigint(20) DEFAULT NULL,
`TID` bigint(20) DEFAULT NULL,
`time` bigint(20) DEFAULT NULL,
`ip` int(11) unsigned DEFAULT NULL,
`data` tinyint(4) NOT NULL DEFAULT '0',
UNIQUE KEY `id` (`id`),
KEY `MID` (`MID`),
KEY `FBID` (`FBID`),
KEY `TID` (`TID`),
KEY `SID` (`SID`),
KEY `ip` (`ip`)
) ENGINE=MyISAM AUTO_INCREMENT=13029800 DEFAULT CHARSET=utf8

A LEFT JOIN should return records with NULL values. The docs at https://dev.mysql.com/doc/refman/5.0/en/join.html say this about LEFT JOIN:
If there is no matching row for the right table in the ON or USING
part in a LEFT JOIN, a row with all columns set to NULL is used for
the right table. You can use this fact to find rows in a table that
have no counterpart in another table
If the values of MID, FBID or TID match nothing, every row of posts that passes the WHERE clause is still returned, with NULL in the columns that come from impressions. An INNER JOIN, or just JOIN, excludes the rows that have no match.
Before the quote above, in general about the ON clause:
The conditional_expr used with ON is any conditional expression of the
form that can be used in a WHERE clause. Generally, you should use the
ON clause for conditions that specify how to join tables, and the
WHERE clause to restrict which rows you want in the result set.
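The difference between the two join types can be seen on a toy data set. The following is only a minimal sketch: it uses Python's sqlite3 module as a stand-in for MySQL, and the cut-down posts and impressions tables and their rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (SID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE impressions (id INTEGER PRIMARY KEY, SID INTEGER, MID INTEGER, time INTEGER);
INSERT INTO posts VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO impressions (SID, MID, time) VALUES (1, 3, 100), (1, 3, 200), (2, 7, 150);
""")

def counts(join_type, mid):
    """Count impressions per post, with the MID filter placed in the ON clause."""
    sql = f"""
        SELECT a.SID, COUNT(b.SID) AS mcount, MAX(b.time) AS mtime
        FROM posts a
        {join_type} JOIN impressions b ON a.SID = b.SID AND b.MID = ?
        GROUP BY a.SID
        ORDER BY a.SID
    """
    return conn.execute(sql, (mid,)).fetchall()

left_match = counts("LEFT", 3)     # some impressions match
left_none = counts("LEFT", 999)    # no impression matches: every post is kept, mcount = 0
inner_none = counts("INNER", 999)  # no impression matches: no rows at all

print(left_match)
print(left_none)
print(inner_none)
```

With the filter in the ON clause, the LEFT JOIN keeps every post and reports a count of 0 where nothing matched, while the INNER JOIN drops those posts entirely.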

After many "Oh, I know what it must be" moments, I figured it out!
To LEFT JOIN null rows, simply add OR b.id IS NULL to the ON clause.
EXPLAIN
SELECT a.*,
COUNT(b.SID) AS mcount,
MAX(b.time) AS mtime
FROM posts a
LEFT JOIN impressions b ON a.SID = b.SID
AND b.time > '1444679848'
AND ((b.MID = '3'
OR b.FBID = '418'
OR b.TID = '152')
OR b.id IS NULL)
WHERE a.pending != 1
AND a.sponsor != '1'
AND a.fbpost = '0'
AND a.roundup = '0'
AND a.404 != '1'
AND a.hide = '0'
AND a.END >= '2015-10-22'
AND a.added >= '2015-10-12'
AND a.url_image > ''
AND CHAR_LENGTH(a.states) < 80
AND a.usa = '1'
GROUP BY a.SID
ORDER BY mcount DESC, (a.int_clicks + (a.int_views2b * 100)) DESC

Related

Slow MySQL query when using ORDER BY id

I have a very slow query where the first part is created by a gem (https://github.com/CanCanCommunity/cancancan, it creates the select and the inner query) and where I add an ORDER BY and LIMIT for a cursor based pagination.
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 3.48 sec)
These are the tables:
CREATE TABLE `spree_products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`available_on` datetime DEFAULT NULL,
`permalink` varchar(255) DEFAULT NULL,
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
`count_on_hand` int(11) DEFAULT NULL,
`vendor_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_spree_products_on_vendor_id` (`vendor_id`)
) ENGINE=InnoDB AUTO_INCREMENT=37209248 DEFAULT CHARSET=utf8mb4
CREATE TABLE `spree_vendors` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`active` tinyint(1) DEFAULT '0',
`created_at` datetime NOT NULL,
`updated_at` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4413 DEFAULT CHARSET=utf8mb4
(I removed unnecessary fields to keep it tidy)
The EXPLAIN on the query above returns this:
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
| 1 | SIMPLE | spree_vendors | NULL | ALL | PRIMARY | NULL | NULL | NULL | 3465 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | spree_products | NULL | ref | PRIMARY,index_spree_products_on_vendor_id | index_spree_products_on_vendor_id | 5 | _hubert_test.spree_vendors.id | 8613 | 100.00 | Using index |
| 1 | SIMPLE | spree_products | NULL | eq_ref | PRIMARY | PRIMARY | 4 | _hubert_test.spree_products.id | 1 | 100.00 | NULL |
+----+-------------+----------------+------------+--------+-------------------------------------------+-----------------------------------+---------+--------------------------------+------+----------+----------------------------------------------+
When I remove the ORDER BY the query is fast:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
LIMIT 50;
=> 50 rows in set (0.00 sec)
When I keep the ORDER BY part from the outer query, but remove the WHERE part from the sub query, the query also is fast:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
I tried adding a composite index to spree_vendors.id / spree_vendors.active, but that didn't help.
Any idea how to optimise this query?
UPDATE 1:
A JOIN Variant of this is also slow. The DISTINCT is added by the gem to prevent duplicate records in case you don't select all columns:
SELECT DISTINCT `spree_products`.*
FROM `spree_products`
LEFT OUTER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 43.13 sec)
Without the DISTINCT the query is fast.
UPDATE 2
It was pointed out that using a LEFT OUTER JOIN inside the subquery returns the whole table. But with an INNER JOIN it is still slow:
SELECT `spree_products`.*
FROM `spree_products`
WHERE `spree_products`.`id` IN
(SELECT `spree_products`.`id`
FROM `spree_products`
INNER JOIN `spree_vendors` ON `spree_vendors`.`id` = `spree_products`.`vendor_id`
WHERE `spree_vendors`.`active` = TRUE)
ORDER BY `spree_products`.`id` ASC
LIMIT 50;
=> 50 rows in set (1 min 3.98 sec)
Given that id must be PRIMARY, your query must be functionally identical to this:
SELECT [DISTINCT] p.*
FROM spree_products p
JOIN spree_vendors v
ON v.id = p.vendor_id
WHERE v.active = 1
ORDER BY p.id ASC
LIMIT 50;
This would benefit from an index on p.vendor_id, and perhaps v.active.
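The claimed equivalence can be checked on toy data. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, with hypothetical cut-down versions of the two tables. Because the WHERE clause filters on a column of spree_vendors, the NULL-extended rows produced by the LEFT OUTER JOIN are discarded, so it behaves like an inner join.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE spree_vendors (id INTEGER PRIMARY KEY, active INTEGER DEFAULT 0);
CREATE TABLE spree_products (id INTEGER PRIMARY KEY, vendor_id INTEGER);
INSERT INTO spree_vendors VALUES (1, 1), (2, 0);
-- Product 3 has no vendor, product 2 belongs to an inactive vendor.
INSERT INTO spree_products VALUES (1, 1), (2, 2), (3, NULL), (4, 1);
""")

# Shape of the query generated by the gem: IN subquery with a LEFT OUTER JOIN.
subquery_ids = [r[0] for r in conn.execute("""
    SELECT p.id
    FROM spree_products p
    WHERE p.id IN (SELECT p2.id
                   FROM spree_products p2
                   LEFT OUTER JOIN spree_vendors v ON v.id = p2.vendor_id
                   WHERE v.active = 1)
    ORDER BY p.id ASC
    LIMIT 50
""")]

# Plain inner join, as suggested in the answer.
join_ids = [r[0] for r in conn.execute("""
    SELECT p.id
    FROM spree_products p
    JOIN spree_vendors v ON v.id = p.vendor_id
    WHERE v.active = 1
    ORDER BY p.id ASC
    LIMIT 50
""")]

print(subquery_ids, join_ids)
```

Both forms return the same product ids here. This says nothing about the execution plan MySQL picks on the real 37-million-row table, which still has to be checked with EXPLAIN.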

MySql: sum orders in an efficient way (OR too slow)

I want to sum up orders. There are products p and ordered items i like:
DROP TABLE IF EXISTS p;
CREATE TABLE p (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`combine` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `combine`(`combine`)
) ENGINE=InnoDB;
DROP TABLE IF EXISTS i;
CREATE TABLE i (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`p` int(10) unsigned DEFAULT NULL,
`quantity` decimal(15,2) NOT NULL,
PRIMARY KEY (`id`),
INDEX `p`(`p`)
) ENGINE=InnoDB;
INSERT INTO p SET id=1, combine=NULL;
INSERT INTO p SET id=2, combine=1;
INSERT INTO p SET id=3, combine=1;
INSERT INTO p SET id=4, combine=NULL;
INSERT INTO i SET id=1, p=1, quantity=5;
INSERT INTO i SET id=2, p=1, quantity=2;
INSERT INTO i SET id=3, p=2, quantity=1;
INSERT INTO i SET id=4, p=3, quantity=4;
INSERT INTO i SET id=5, p=4, quantity=2;
INSERT INTO i SET id=6, p=4, quantity=1;
The idea is that products may be combined which means all sales are combined for these products. This means for example that products 1, 2 and 3 should have the same result: All sales of these products summed up. So I do:
SELECT p.id, SUM(i.quantity)
FROM p
LEFT JOIN p AS p_all ON (p_all.id = p.id OR p_all.combine=p.combine OR p_all.id = p.combine OR p_all.combine = p.id)
LEFT JOIN i ON i.p = p_all.id
GROUP BY p.id;
which gives the required result:
p=1: 12 (i: 1, 2, 3, 4 added)
p=2: 12 (i: 1, 2, 3, 4 added)
p=3: 12 (i: 1, 2, 3, 4 added)
p=4: 3 (i: 5, 6 added)
My problem is that on the real data the OR in the JOIN of the products for p_all makes the query very slow. Just querying without the combination takes 0.2 sec, while the OR makes it take more than 30 sec.
How could I make this query more efficient in MySql?
Added: There are some more constraints on the real query like:
SELECT p.id, SUM(i.quantity)
FROM p
LEFT JOIN p AS p_all ON (p_all.id = p.id OR p_all.combine=p.combine OR p_all.id = p.combine OR p_all.combine = p.id)
LEFT JOIN i ON i.p = p_all.id
LEFT JOIN orders o ON o.id = i.order
WHERE o.ordered <= '2018-05-10'
AND i.flag=false
AND ...
GROUP BY p.id;
Added: EXPLAIN on real data:
+----+-------------+------------------+------------+-------+-----------------------------+---------+---------+--------------+------+----------+-------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------------+------------+-------+-----------------------------+---------+---------+--------------+------+----------+-------------------------------------------------+
| 1 | SIMPLE | p | NULL | index | PRIMARY,...combine... | PRIMARY | 4 | NULL | 6556 | 100.00 | NULL |
| 1 | SIMPLE | p_all | NULL | ALL | PRIMARY,combine | NULL | NULL | NULL | 6556 | 100.00 | Range checked for each record (index map: 0x41) |
| 1 | SIMPLE | p | NULL | ref | p | p | 5 | p_all.id | 43 | 100.00 | NULL |
+----+-------------+------------------+------------+-------+-----------------------------+---------+---------+--------------+------+----------+-------------------------------------------------+
I don't know if you have the flexibility to do this, but you could speed it up by changing the combine field in p:
UPDATE p SET combine=id WHERE combine IS NULL;
Then you can massively simplify the ON condition to:
ON p_all.combine = p.combine
making the query (SQLFiddle):
SELECT p.id, SUM(i.quantity) AS qty
FROM p
JOIN p AS p_all
ON p_all.combine = p.combine
JOIN i
ON i.p = p_all.id
GROUP BY p.id
Output:
id qty
1 12
2 12
3 12
4 3
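As a quick check that the simplified join gives the same totals as the original OR condition, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The rows are the sample data from the question; the quantity column is declared as an integer here for simplicity, which is an assumption.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE p (id INTEGER PRIMARY KEY, combine INTEGER);
CREATE TABLE i (id INTEGER PRIMARY KEY, p INTEGER, quantity INTEGER);
INSERT INTO p VALUES (1, NULL), (2, 1), (3, 1), (4, NULL);
INSERT INTO i VALUES (1, 1, 5), (2, 1, 2), (3, 2, 1), (4, 3, 4), (5, 4, 2), (6, 4, 1);
""")

# Original query from the question, with the four-way OR in the join.
with_or = conn.execute("""
    SELECT p.id, SUM(i.quantity)
    FROM p
    LEFT JOIN p AS p_all ON (p_all.id = p.id OR p_all.combine = p.combine
                             OR p_all.id = p.combine OR p_all.combine = p.id)
    LEFT JOIN i ON i.p = p_all.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

# Make every product point at its group leader, including itself.
conn.execute("UPDATE p SET combine = id WHERE combine IS NULL")

# Simplified query: a single equality on combine.
simplified = conn.execute("""
    SELECT p.id, SUM(i.quantity) AS qty
    FROM p
    JOIN p AS p_all ON p_all.combine = p.combine
    JOIN i ON i.p = p_all.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

print(with_or)
print(simplified)
```

Note that the simplified version uses inner joins, so a product without any ordered items would drop out of the result instead of appearing with NULL.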
Using subqueries can sometimes be faster than joins.
e.g.
Select p.id, (Select sum(quantity) from i where p in
(Select id from p as p2 where
p2.id = p.id or
p2.combine=p.id or
p2.id = p.combine or
p2.combine = p.combine)
) as orders
from p
You could add all of your constraints on i inside the 'orders' subquery

SQL improvement in MySQL

I have these tables in MySQL.
CREATE TABLE `tableA` (
`id_a` int(11) NOT NULL,
`itemCode` varchar(50) NOT NULL,
`qtyOrdered` decimal(15,4) DEFAULT NULL,
:
PRIMARY KEY (`id_a`),
KEY `INDEX_A1` (`itemCode`)
) ENGINE=InnoDB
CREATE TABLE `tableB` (
`id_b` int(11) NOT NULL AUTO_INCREMENT,
`qtyDelivered` decimal(15,4) NOT NULL,
`id_a` int(11) DEFAULT NULL,
`opType` int(11) NOT NULL, -- '0' delivered to customer, '1' returned from customer
:
PRIMARY KEY (`id_b`),
KEY `INDEX_B1` (`id_a`),
KEY `INDEX_B2` (`opType`)
) ENGINE=InnoDB
tableA holds the quantity ordered by the customer; tableB holds the quantity delivered to the customer for each order.
I want to write a query that computes the quantity still to be delivered for each itemCode.
The SQL is below. It works, but it is slow.
SELECT T1.itemCode,
SUM(IFNULL(T1.qtyOrdered,'0')-IFNULL(T2.qtyDelivered,'0')+IFNULL(T3.qtyReturned,'0')) as qty
FROM tableA AS T1
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyDelivered FROM tableB WHERE opType = '0' GROUP BY id_a)
AS T2 on T1.id_a = T2.id_a
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyReturned FROM tableB WHERE opType = '1' GROUP BY id_a)
AS T3 on T1.id_a = T3.id_a
WHERE T1.itemCode = '?'
GROUP BY T1.itemCode
I tried explain on this SQL, and the result is as below.
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| 1 | PRIMARY | T1 | ref | INDEX_A1 | INDEX_A1 | 152 | const | 1 | Using where |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 21211 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 10 | |
| 3 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 96 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 55614 | Using where; Using temporary; Using filesort |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
I want to improve my query. How can I do that?
First, tableB declares opType as int, but you compare it to the strings '0' and '1'; use the numeric literals 0 and 1 instead. To optimize the pre-aggregates, replace the single-column indexes with one composite covering index on tableB (opType, id_a, qtyDelivered): opType serves the WHERE, id_a serves the GROUP BY, and qtyDelivered lets the aggregate be computed from the index without going to the raw data pages.
Since you need both types, you can roll them up into a single subquery that handles both in one pass, and then join the result to tableA.
SELECT
T1.itemCode,
SUM( IFNULL(T1.qtyOrdered, 0 )
- IFNULL(T2.qtyDelivered, 0)
+ IFNULL(T2.qtyReturned, 0)) as qty
FROM
tableA AS T1
LEFT JOIN ( SELECT
id_a,
SUM( IF( opType=0,qtyDelivered, 0)) as qtyDelivered,
SUM( IF( opType=1,qtyDelivered, 0)) as qtyReturned
FROM
tableB
WHERE
opType IN ( 0, 1 )
GROUP BY
id_a) AS T2
on T1.id_a = T2.id_a
WHERE
T1.itemCode = '?'
GROUP BY
T1.itemCode
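The single-pass conditional aggregation can be tried on toy data. This is only a sketch, using Python's sqlite3 module as a stand-in for MySQL: CASE replaces MySQL's IF(), the index name idx_b_cover and the sample rows are made up, and the quantity columns are plain integers here.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (id_a INTEGER PRIMARY KEY, itemCode TEXT NOT NULL, qtyOrdered INTEGER);
CREATE TABLE tableB (id_b INTEGER PRIMARY KEY, qtyDelivered INTEGER NOT NULL,
                     id_a INTEGER, opType INTEGER NOT NULL);
-- Composite index in the column order suggested above (hypothetical name).
CREATE INDEX idx_b_cover ON tableB (opType, id_a, qtyDelivered);
INSERT INTO tableA VALUES (1, 'X', 10), (2, 'X', 5), (3, 'Y', 7);
-- Order 1: 4 delivered, 1 returned. Order 2: 5 delivered. Order 3: nothing yet.
INSERT INTO tableB (qtyDelivered, id_a, opType) VALUES (4, 1, 0), (1, 1, 1), (5, 2, 0);
""")

def remaining(item_code):
    """Quantity still to deliver for one item code, aggregating tableB in one pass."""
    return conn.execute("""
        SELECT T1.itemCode,
               SUM(IFNULL(T1.qtyOrdered, 0)
                   - IFNULL(T2.qtyDelivered, 0)
                   + IFNULL(T2.qtyReturned, 0)) AS qty
        FROM tableA AS T1
        LEFT JOIN (SELECT id_a,
                          SUM(CASE WHEN opType = 0 THEN qtyDelivered ELSE 0 END) AS qtyDelivered,
                          SUM(CASE WHEN opType = 1 THEN qtyDelivered ELSE 0 END) AS qtyReturned
                   FROM tableB
                   WHERE opType IN (0, 1)
                   GROUP BY id_a) AS T2
               ON T1.id_a = T2.id_a
        WHERE T1.itemCode = ?
        GROUP BY T1.itemCode
    """, (item_code,)).fetchall()

qty_x = remaining('X')  # (10 - 4 + 1) + (5 - 5 + 0)
qty_y = remaining('Y')  # no deliveries yet, so the full ordered quantity remains
print(qty_x, qty_y)
```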
Now, depending on the size of your tables, you might be better off joining tableA inside the inner query, so that it only aggregates rows for the item code you are looking for. If you have 50k items and only 120 of them qualify, the inner query above still aggregates across all 50k, which is overkill. In that case, I would suggest an index on tableA (itemCode, id_a) and adjusting the inner query to:
LEFT JOIN ( SELECT
b.id_a,
SUM( IF( b.opType = 0, b.qtyDelivered, 0)) as qtyDelivered,
SUM( IF( b.opType = 1, b.qtyDelivered, 0)) as qtyReturned
FROM
( select distinct id_a
from tableA
where itemCode = '?' ) pqA
JOIN tableB b
on pqA.id_a = b.id_a
AND b.opType IN ( 0, 1 )
GROUP BY
b.id_a) AS T2
My Query against your SQLFiddle

MySQL Query Optimization with MAX()

I have 3 tables with the following schema:
CREATE TABLE `devices` (
`device_id` int(11) NOT NULL auto_increment,
`name` varchar(20) default NULL,
`appliance_id` int(11) default '0',
`sensor_type` int(11) default '0',
`display_name` VARCHAR(100),
PRIMARY KEY USING BTREE (`device_id`)
)
CREATE TABLE `channels` (
`channel_id` int(11) NOT NULL AUTO_INCREMENT,
`device_id` int(11) NOT NULL,
`channel` varchar(10) NOT NULL,
PRIMARY KEY (`channel_id`),
KEY `device_id_idx` (`device_id`)
)
CREATE TABLE `historical_data` (
`date_time` datetime NOT NULL,
`channel_id` int(11) NOT NULL,
`data` float DEFAULT NULL,
`unit` varchar(10) DEFAULT NULL,
KEY `devices_datetime_idx` (`date_time`) USING BTREE,
KEY `channel_id_idx` (`channel_id`)
)
The setup is that a device can have one or more channels and each channel has many (historical) data.
I use the following query to get the last historical data for one device and all its related channels:
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND ( c.channel = 'ch1')
GROUP BY c.channel
ORDER BY h.date_time, channel
The query plan looks as follows:
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
| 1 | SIMPLE | c | ALL | PRIMARY,device_id_idx | NULL | NULL | NULL | 34 | Using where |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | c.device_id | 1 | Using where |
| 1 | SIMPLE | h | ref | channel_id_idx | channel_id_idx | 4 | c.channel_id | 322019 | |
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
3 rows in set (0.00 sec)
The above query is currently taking approximately 15 secs and I wanted to know if there are any tips or way to improve the query?
Edit:
Example data from historical_data
+---------------------+------------+------+------+
| date_time | channel_id | data | unit |
+---------------------+------------+------+------+
| 2011-11-20 21:30:57 | 34 | 23.5 | C |
| 2011-11-20 21:30:57 | 9 | 68 | W |
| 2011-11-20 21:30:54 | 34 | 23.5 | C |
| 2011-11-20 21:30:54 | 5 | 316 | W |
| 2011-11-20 21:30:53 | 34 | 23.5 | C |
| 2011-11-20 21:30:53 | 2 | 34 | W |
| 2011-11-20 21:30:51 | 34 | 23.4 | C |
| 2011-11-20 21:30:51 | 9 | 68 | W |
| 2011-11-20 21:30:49 | 34 | 23.4 | C |
| 2011-11-20 21:30:49 | 4 | 193 | W |
+---------------------+------------+------+------+
10 rows in set (0.00 sec)
Edit 2:
Multiple channel SELECT example:
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND ( c.channel = 'ch1' OR c.channel = 'ch2' OR c.channel = 'ch3')
GROUP BY c.channel
ORDER BY h.date_time, channel
I've used OR in the c.channel WHERE clause because it was easier to generate programmatically, but it can be changed to use IN if necessary.
Edit 3:
Example result of what I'm trying to achieve:
+-----------+------------+---------+---------------------+-------+
| device_id | channel_id | channel | max(h.date_time) | data |
+-----------+------------+---------+---------------------+-------+
| 28 | 9 | ch1 | 2011-11-21 20:39:36 | 0 |
| 28 | 35 | ch2 | 2011-11-21 20:30:55 | 32767 |
+-----------+------------+---------+---------------------+-------+
I have added the device_id to the example but my select will only need to return channel_id, channel, last date_time i.e max and the data. The results should be the last record from the historical_data table for each channel for one device.
It seems that dropping and re-creating the index on date_time sped up my original SQL to around 2 secs.
I haven't been able to test this, so I'd like to ask you to run it and let us know whether it gives you the desired result and whether it runs faster than your current query:
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetLatestHistoricalData_EXAMPLE`
(
IN param_device_name VARCHAR(20)
, IN param_appliance_id INT
, IN param_sensor_type INT
, IN param_channel VARCHAR(10)
)
BEGIN
SELECT
h.date_time, h.data
FROM
historical_data h
INNER JOIN
(
SELECT c.channel_id
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
WHERE
d.name = param_device_name
AND d.appliance_id = param_appliance_id
AND d.sensor_type = param_sensor_type
AND c.channel = param_channel
)
c ON h.channel_id = c.channel_id
ORDER BY h.date_time DESC
LIMIT 1;
END
Then to run a test:
CALL GetLatestHistoricalData_EXAMPLE ('livingroom', 0, 1, 'ch1');
I tried working it into a stored procedure so that even if you get the desired results using this for one device, you can try it with another device and see the results... Thanks!
[edit]: In response to Danny's comment, here's an updated test version:
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetLatestHistoricalData_EXAMPLE_3Channel`
(
IN param_device_name VARCHAR(20)
, IN param_appliance_id INT
, IN param_sensor_type INT
, IN param_channel_1 VARCHAR(10)
, IN param_channel_2 VARCHAR(10)
, IN param_channel_3 VARCHAR(10)
)
BEGIN
SELECT
h.date_time, h.data
FROM
historical_data h
INNER JOIN
(
SELECT c.channel_id
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
WHERE
d.name = param_device_name
AND d.appliance_id = param_appliance_id
AND d.sensor_type = param_sensor_type
AND c.channel IN (param_channel_1
,param_channel_2
,param_channel_3
)
)
c ON h.channel_id = c.channel_id
ORDER BY h.date_time DESC
LIMIT 1;
END
Then to run a test:
CALL GetLatestHistoricalData_EXAMPLE_3Channel ('livingroom', 0, 1, 'ch1', 'ch2' , 'ch3');
Again, this is just for testing, so you'll be able to see if it meets your needs..
I would first add an index on the devices table ( appliance_id, sensor_type, name ) to match your query. I don't know how many entries are in this table, but if large, and many elements per device, get right to it.
Second, on your channels table, index on ( device_id, channel )
Third, on your history data, index on ( channel_id, date_time )
then,
SELECT STRAIGHT_JOIN
PreQuery.MostRecent,
PreQuery.Channel_ID,
PreQuery.Channel,
H2.Data,
H2.Unit
from
( select
c.channel_id,
c.channel,
max( h.date_time ) as MostRecent
from
devices d
join channels c
on d.device_id = c.device_id
and c.channel in ( 'ch1', 'ch2', 'ch3' )
join historical_data h
on h.channel_id = c.channel_id
where
d.appliance_id = 0
and d.sensor_type = 1
and d.name = 'livingroom'
group by
c.channel_id ) PreQuery
JOIN historical_data H2
on PreQuery.Channel_ID = H2.Channel_ID
AND PreQuery.MostRecent = H2.Date_Time
order by
PreQuery.MostRecent,
PreQuery.Channel
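The pattern used here, finding MAX(date_time) per channel in a derived table and then joining back to historical_data to pick up the matching row, can be tried on a small data set. The sketch below is an illustration only: it uses Python's sqlite3 module as a stand-in for MySQL (so STRAIGHT_JOIN is left out), the index name hist_channel_time_idx is hypothetical, and the rows are made up around the example result from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE devices (device_id INTEGER PRIMARY KEY, name TEXT,
                      appliance_id INTEGER, sensor_type INTEGER);
CREATE TABLE channels (channel_id INTEGER PRIMARY KEY, device_id INTEGER NOT NULL,
                       channel TEXT NOT NULL);
CREATE TABLE historical_data (date_time TEXT NOT NULL, channel_id INTEGER NOT NULL,
                              data REAL, unit TEXT);
-- Composite index suggested above for the per-channel MAX lookup (hypothetical name).
CREATE INDEX hist_channel_time_idx ON historical_data (channel_id, date_time);
INSERT INTO devices VALUES (28, 'livingroom', 0, 1), (99, 'kitchen', 0, 1);
INSERT INTO channels VALUES (9, 28, 'ch1'), (35, 28, 'ch2'), (40, 99, 'ch1');
INSERT INTO historical_data VALUES
    ('2011-11-21 20:00:00', 9, 5.0, 'W'),
    ('2011-11-21 20:39:36', 9, 0.0, 'W'),
    ('2011-11-21 20:10:00', 35, 1.0, 'W'),
    ('2011-11-21 20:30:55', 35, 32767.0, 'W'),
    ('2011-11-22 00:00:00', 40, 42.0, 'W');
""")

latest = conn.execute("""
    SELECT pq.channel_id, pq.channel, pq.most_recent, h2.data
    FROM (SELECT c.channel_id, c.channel, MAX(h.date_time) AS most_recent
          FROM devices d
          JOIN channels c ON d.device_id = c.device_id
                         AND c.channel IN ('ch1', 'ch2', 'ch3')
          JOIN historical_data h ON h.channel_id = c.channel_id
          WHERE d.appliance_id = 0 AND d.sensor_type = 1 AND d.name = 'livingroom'
          GROUP BY c.channel_id, c.channel) pq
    JOIN historical_data h2 ON h2.channel_id = pq.channel_id
                           AND h2.date_time = pq.most_recent
    ORDER BY pq.most_recent, pq.channel
""").fetchall()

for row in latest:
    print(row)
```

If two rows of one channel share the same latest date_time, the join back returns both of them, so a unique (channel_id, date_time) pair is assumed.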

How to fetch 3 first places of each game from the score table in mysql?

I have the following table:
CREATE TABLE `score` (
`score_id` int(10) unsigned NOT NULL auto_increment,
`user_id` int(10) unsigned NOT NULL,
`game_id` int(10) unsigned NOT NULL,
`thescore` bigint(20) unsigned NOT NULL,
`timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (`score_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
This is a score table that stores the user_id, game_id and score of each game.
There are trophies for the first 3 places of each game.
I have a user_id and I would like to check whether that specific user got any trophies in any of the games.
Can I somehow write this query without creating a temporary table?
SELECT s1.*
FROM score s1 LEFT OUTER JOIN score s2
ON (s1.game_id = s2.game_id AND s1.thescore < s2.thescore)
GROUP BY s1.score_id
HAVING COUNT(*) < 3;
This query returns the rows for all winning games. Although ties are included; if the scores are 10,16,16,16,18 then there are four winners: 16,16,16,18. I'm not sure how you handle that. You need some way to resolve ties in the join condition.
For example, if ties are resolved by the earlier game winning, then you could modify the query this way:
SELECT s1.*
FROM score s1 LEFT OUTER JOIN score s2
ON (s1.game_id = s2.game_id AND (s1.thescore < s2.thescore
OR s1.thescore = s2.thescore AND s1.score_id > s2.score_id))
GROUP BY s1.score_id
HAVING COUNT(*) < 3;
You could also use the timestamp column to resolve ties, if you can depend on it being UNIQUE.
However, MySQL tends to create a temporary table for this kind of query anyway. Here's the output of EXPLAIN for this query:
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| 1 | SIMPLE | s1 | ALL | NULL | NULL | NULL | NULL | 9 | Using temporary; Using filesort |
| 1 | SIMPLE | s2 | ALL | PRIMARY | NULL | NULL | NULL | 9 | |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
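The effect of the tie-breaking condition can be seen on the scores 10, 16, 16, 16, 18 mentioned above. The following is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL, with made-up rows and a cut-down score table. In this sketch a tie goes to the row with the lower score_id, that is, s2 beats s1 on equal scores when s2.score_id < s1.score_id.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE score (
    score_id INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL,
    game_id  INTEGER NOT NULL,
    thescore INTEGER NOT NULL
);
INSERT INTO score VALUES
    (1, 101, 1, 10), (2, 102, 1, 16), (3, 103, 1, 16), (4, 104, 1, 16), (5, 105, 1, 18);
""")

def winners(beats):
    """Return score_ids beaten by fewer than three rows; `beats` says when s2 beats s1."""
    sql = f"""
        SELECT s1.score_id
        FROM score s1 LEFT OUTER JOIN score s2
          ON s1.game_id = s2.game_id AND ({beats})
        GROUP BY s1.score_id
        HAVING COUNT(*) < 3
        ORDER BY s1.score_id
    """
    return [row[0] for row in conn.execute(sql)]

# Without tie-breaking, all three 16s qualify next to the 18.
plain = winners("s1.thescore < s2.thescore")

# With tie-breaking, an equal score with a lower score_id also counts as beating s1.
tie_broken = winners(
    "s1.thescore < s2.thescore"
    " OR (s1.thescore = s2.thescore AND s2.score_id < s1.score_id)"
)

print(plain)
print(tie_broken)
```

Without the extra condition four rows qualify; with it, exactly three do.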
SELECT game_id, user_id
FROM score score1
WHERE (SELECT COUNT(*) FROM score score2
WHERE score1.game_id = score2.game_id AND score2.thescore > score1.thescore) < 3
ORDER BY game_id ASC, thescore DESC;
A clearer way to do it, semi-tested:
SELECT DISTINCT user_id
FROM
(
select s.user_id, s.game_id, s.thescore,
(SELECT count(1)
from score
where game_id = s.game_id
AND thescore > s.thescore
) AS acount FROM score s
) AS a
WHERE acount < 3
Didn't test it, but it should work fine:
SELECT
*,
@position := @position + 1 AS position
FROM
score
JOIN (SELECT @position := 0) p
WHERE
user_id = <INSERT_USER_ID>
AND game_id = <INSERT_GAME_ID>
ORDER BY
thescore DESC
There you can check the position field to see whether it's between 1 and 3.