Can a SQL query perform this job? - MySQL

Here is my schema:
CREATE TABLE `tbltransactions` (
`transactionid` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`transactiondate` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`customerid` bigint(20) unsigned NOT NULL,
`transactiondetail` varchar(255) NOT NULL,
`transactionamount` decimal(10,2) NOT NULL,
UNIQUE KEY `transactionid` (`transactionid`),
KEY `customerid` (`customerid`),
CONSTRAINT `tbltransactions_ibfk_1` FOREIGN KEY (`customerid`) REFERENCES `tblcustomers` (`customerid`)
) ENGINE=InnoDB AUTO_INCREMENT=240 DEFAULT CHARSET=latin1;
transactionamount contains positive values for purchases and negative values for payments.
I would like to list, for each customer, all records in tbltransactions from the point where that customer's balance was last zero onwards. Any help?
EDIT: Please consider this dataset:
purchase 10
payment -10 // balance is zero
purchase 5
payment -5 // balance again zero
purchase 7 // show this transaction and onwards
purchase 2
payment -5 // show this also since balance is not zero
EDIT: sample of actual data:
INSERT INTO `tbltransactions` VALUES (1,'2014-06-22 22:51:00',39,'Balance when computerized',8851.00),(2,'2014-06-22 22:55:05',35,'Balance when computerized',5395.00),(3,'2014-06-22 22:56:17',53,'Balance when computerized',60.00),(4,'2014-06-22 22:57:15',54,'Balance when computerized',2671.00),(5,'2014-06-22 22:57:41',55,'Balance when computerized',1586.00),(6,'2014-06-22 22:58:34',61,'Balance when computerized',50.00),(7,'2014-06-22 22:59:22',56,'Balance when computerized',344.00),(8,'2014-06-22 22:59:42',71,'Balance when computerized',650.00),(9,'2014-06-22 23:01:10',63,'Balance when computerized',1573.00),(10,'2014-06-22 23:01:51',32,'Balance when computerized',7515.00),(11,'2014-06-22 23:02:22',72,'Balance when computerized',466.00),(12,'2014-06-22 23:03:10',64,'Balance when computerized',4774.00),(13,'2014-06-22 23:03:32',42,'Balance when computerized',2992.00),(14,'2014-06-22 23:05:24',41,'Balance when computerized',2218.00),(15,'2014-06-22 23:05:39',40,'Balance when computerized',7149.00),(16,'2014-06-22 23:06:25',80,'Balance when computerized',2607.00),(17,'2014-06-22 23:09:18',67,'Balance when computerized',357.00),(18,'2014-06-22 23:20:39',10,'Balance when computerized',677.00),(19,'2014-06-22 23:13:17',57,'Balance when computerized',135.00),(20,'2014-06-22 23:13:47',58,'Balance when computerized',5872.00),(21,'2014-06-24 11:36:10',73,'Balance when computerized',355.00),(22,'2014-06-22 23:14:30',74,'Balance when computerized',173.00),(23,'2014-06-22 23:16:45',59,'Balance when computerized',995.00),(24,'2014-06-22 23:17:44',19,'Balance when computerized',1704.00),(25,'2014-06-22 23:19:00',23,'Balance when computerized',690.00),(26,'2014-06-22 23:21:17',34,'Balance when computerized',10331.00),(27,'2014-06-22 23:21:43',38,'Balance when computerized',495.00),(28,'2014-06-22 23:22:01',65,'Balance when computerized',6676.00),(29,'2014-06-22 23:23:31',8,'Balance when computerized',4148.00),(30,'2014-06-22 23:23:53',24,'Balance when computerized',3124.00),(31,'2014-06-22 23:27:02',68,'Balance when computerized',3364.00),(35,'2014-06-22 23:35:22',46,'Balance when computerized',19105.00),(36,'2014-06-22 23:36:26',36,'Balance when computerized',2471.00),(37,'2014-06-22 23:36:42',60,'Balance when computerized',910.00),(38,'2014-06-22 23:37:11',75,'Balance when computerized',5203.00),(39,'2014-06-22 23:37:29',77,'Balance when computerized',2342.00),(40,'2014-06-22 23:37:42',13,'Balance when computerized',4555.00),(41,'2014-06-22 23:38:24',62,'Balance when computerized',271.00),(42,'2014-06-22 23:42:43',26,'Balance when computerized',5040.00),(43,'2014-06-22 23:43:13',33,'Balance when computerized',6792.00),(44,'2014-06-22 23:43:57',9,'Balance when computerized',1101.00),(45,'2014-06-22 23:44:27',21,'Balance when computerized',1010.00),(46,'2014-06-22 23:45:16',69,'Balance when computerized',89.00),(47,'2014-06-22 23:45:52',81,'Balance when computerized',220.00),(48,'2014-06-22 23:46:37',82,'Balance when computerized',205.00),(49,'2014-06-22 23:47:26',83,'Balance when computerized',731.00),(50,'2014-06-22 23:48:00',84,'Balance when computerized',155.00),(51,'2014-06-22 23:48:54',5,'Balance when computerized',475.00),(52,'2014-06-22 23:50:13',85,'Balance when computerized',1375.00),(53,'2014-06-22 23:51:04',86,'Balance when computerized',28.00),(54,'2014-06-22 23:51:39',87,'Balance when computerized',26.00),(55,'2014-06-22 23:52:23',88,'Balance when computerized',30.00),(56,'2014-06-22 23:52:53',89,'Balance when computerized',45.00),(57,'2014-06-22 23:53:23',90,'Balance when 
computerized',140.00),(58,'2014-06-22 23:54:13',91,'Balance when computerized',40.00),(59,'2014-06-22 23:55:38',93,'Balance when computerized',3350.00),(60,'2014-06-22 23:57:13',3,'Balance when computerized',60.00),(61,'2014-06-22 23:59:05',94,'Balance when computerized',3372.00),(62,'2014-06-23 00:00:12',20,'Balance when computerized',562.00),(63,'2014-06-23 00:00:48',18,'Balance when computerized',3227.00),(64,'2014-06-23 00:01:26',7,'Balance when computerized',1023.00),(65,'2014-06-23 00:01:46',29,'Balance when computerized',20.00),(66,'2014-06-23 00:02:57',15,'Balance when computerized',160.00),(67,'2014-06-23 00:04:14',11,'Balance when computerized',345.00),(68,'2014-06-23 00:04:50',31,'Balance when computerized',45.00),(69,'2014-06-23 00:08:45',50,'Balance when computerized',50.00),(70,'2014-06-23 00:09:05',6,'Balance when computerized',2880.00),(71,'2014-06-23 00:11:29',96,'Balance when computerized',1300.00),(72,'2014-06-23 00:12:40',4,'Balance when computerized',601.00),(74,'2014-06-24 10:21:26',97,'Balance when computerized',1250.00),(76,'2014-06-24 10:35:31',32,'1.5 ltr etc.',510.00),(77,'2014-06-24 15:04:13',97,'parchi',535.00),(78,'2014-06-24 15:05:51',32,'parchi',400.00),(79,'2014-06-24 15:08:08',32,'parchi',1924.00),(80,'2014-06-24 15:14:38',35,'suger berd',840.00),(81,'2014-06-24 15:16:49',39,'bottel',85.00),(82,'2014-06-24 15:21:51',20,'salt tusho',250.00),(83,'2014-06-24 15:23:49',26,'eggs',45.00),(84,'2014-06-24 15:24:54',38,'waldah',200.00),(85,'2014-06-24 15:26:12',78,'Balance when computerized',1557.00),(86,'2014-06-24 15:27:12',78,'haldi',70.00),(87,'2014-06-24 15:28:37',68,'eggs butter',87.00),(88,'2014-06-24 15:30:19',98,'Balance when computerized',550.00),(89,'2014-06-24 15:32:13',44,'2 coke',50.00),(90,'2014-06-24 15:33:05',81,'self',-220.00),(91,'2014-06-24 15:33:52',46,'razor',30.00),(92,'2014-06-24 15:34:37',75,'dues',40.00),(93,'2014-06-24 15:35:35',9,'oil ghee',625.00),(94,'2014-06-24 15:36:57',99,'bread',93.00),(95,'2014-06-24 15:38:14',100,'bottle razor',55.00),(96,'2014-06-24 15:38:54',7,'dues',40.00),(97,'2014-06-24 15:39:41',75,'ltr',60.00),(98,'2014-06-24 15:40:08',69,'1.5 ltr',60.00),(99,'2014-06-24 15:40:27',42,'2 1.5 ltr',120.00),(100,'2014-06-24 15:42:02',26,'bread bottle',110.00),(101,'2014-06-24 15:45:39',78,'saman',140.00),(102,'2014-06-26 15:19:20',101,'Oil dues',105.00),(103,'2014-06-26 15:19:59',26,'bread etc',55.00),(104,'2014-06-26 15:20:15',97,'parchi',290.00),(105,'2014-06-26 15:20:33',35,'parchi',355.00),(106,'2014-06-26 15:20:46',81,'bread',100.00),(107,'2014-06-26 15:21:26',102,'razor',40.00),(108,'2014-06-26 15:22:51',38,'dues',30.00),(109,'2014-06-26 15:23:35',20,'register, bottle',275.00),(110,'2014-06-26 15:23:55',46,'bottle dues etc',540.00),(112,'2014-06-26 15:26:08',46,'wife',-5000.00),(113,'2014-06-26 15:26:52',39,'bottle',65.00),(114,'2014-06-26 15:27:05',66,'1.5 ltr',85.00),(115,'2014-06-26 15:27:22',34,'cheeni etc',780.00),(116,'2014-06-26 15:27:46',97,'parchi',260.00),(117,'2014-06-26 15:28:04',81,'surf',370.00),(118,'2014-06-26 15:28:38',103,'rooh afza',150.00),(119,'2014-06-26 15:28:57',35,'parchi oil etc',623.00),(120,'2014-06-26 15:29:19',52,'easy paisa',1060.00),(121,'2014-06-26 15:29:51',35,'cake 1.5 ltr',185.00),(122,'2014-06-26 15:30:06',97,'parchi',243.00),(123,'2014-06-26 15:32:04',18,'dues',13.00),(124,'2014-06-26 15:32:28',26,'bread',50.00),(125,'2014-06-26 15:33:47',78,'bread',150.00),(126,'2014-06-26 15:34:52',9,'cheeni',280.00),(127,'2014-06-26 15:36:17',20,'oil',205.00),(128,'2014-06-26 15:39:31',96,'more 
load',500.00),(129,'2014-06-26 15:40:38',75,'water etc',125.00),(130,'2014-06-26 15:40:57',35,'dues',30.00),(131,'2014-06-26 15:41:10',18,'half role',90.00),(132,'2014-06-26 15:41:32',88,'geometery dues',20.00),(133,'2014-06-26 15:41:56',4,'dues',10.00),(134,'2014-06-26 15:42:18',41,'dues',60.00),(135,'2014-06-26 15:42:36',20,'ciggeret half role',190.00),(136,'2014-06-26 15:43:02',87,'always',30.00),(137,'2014-06-26 15:43:42',104,'dues',73.00),(138,'2014-06-26 15:44:07',13,'dues',946.00),(139,'2014-06-26 15:44:20',18,'surf',130.00),(140,'2014-06-26 15:44:29',35,'parchi',240.00),(141,'2014-06-26 15:44:46',85,'dues',30.00),(142,'2014-06-26 15:45:05',75,'milk',140.00),(143,'2014-06-26 15:45:24',74,'cream',40.00),(144,'2014-06-26 15:45:39',88,'milk',40.00),(145,'2014-06-26 15:46:00',38,'perfume',90.00),(146,'2014-06-26 15:46:20',32,'chilka etc',70.00),(147,'2014-06-26 15:47:05',90,'payment',-140.00),(148,'2014-06-26 15:47:26',18,'ghee dues',30.00),(149,'2014-06-26 15:47:45',98,'color',15.00),(150,'2014-06-26 15:48:00',85,'taala',50.00),(151,'2014-06-26 15:48:25',103,'ball',15.00),(153,'2014-06-26 15:51:21',64,'catchup',130.00),(154,'2014-06-26 15:51:42',65,'dues',10.00),(155,'2014-06-26 15:52:10',20,'dues',10.00),(156,'2014-06-26 15:52:35',18,'mirch',115.00),(157,'2014-06-26 15:52:56',18,'dues',10.00),(158,'2014-06-26 15:53:13',46,'half role etc',150.00),(159,'2014-06-26 15:53:37',33,'ghee',330.00),(160,'2014-06-26 15:54:06',36,'dues',10.00),(161,'2014-06-26 15:54:37',18,'dues',10.00),(162,'2014-06-26 15:54:50',18,'dues',30.00),(163,'2014-06-26 15:55:20',99,'dues',10.00),(164,'2014-06-26 15:58:14',92,'maidah',30.00),(165,'2014-06-26 16:16:23',26,'dues',856.00),(166,'2014-06-26 16:18:28',20,'load plus others',562.00),(167,'2014-06-26 16:51:35',75,'chanay',50.00),(168,'2014-06-26 16:54:22',103,'dettol',17.00),(169,'2014-06-26 16:55:00',42,'load',100.00),(171,'2014-06-26 17:15:23',85,'dues',125.00),(172,'2014-06-26 17:17:40',46,'tape',25.00),(173,'2014-06-26 17:33:50',66,'chana',40.00),(174,'2014-06-26 17:35:11',75,'shampoo',5.00),(175,'2014-06-26 17:36:37',106,'wiper',50.00),(176,'2014-06-26 17:37:23',43,'bottle',15.00),(177,'2014-06-26 17:37:51',87,'dues',60.00),(178,'2014-06-26 17:38:05',100,'bottle brush',125.00),(179,'2014-06-26 17:38:29',36,'shampoo',180.00),(180,'2014-06-26 17:39:49',32,'dues',20.00),(181,'2014-06-26 17:40:01',55,'dues',7.00),(182,'2014-06-26 17:41:01',41,'dues',15.00),(183,'2014-06-26 18:55:39',66,'bar haf',50.00),(184,'2014-06-26 19:40:30',103,'payment',-150.00),(185,'2014-06-26 20:24:00',61,'chohay maar',30.00),(186,'2014-06-26 21:47:45',97,'payment',-2578.00),(187,'2014-06-26 23:51:17',35,'boteletc',70.00),(188,'2014-06-27 00:00:18',66,'half',17.00),(189,'2014-06-27 00:02:05',99,'self',-107.00),(190,'2014-06-27 00:03:00',68,'tazab',30.00),(192,'2014-06-27 00:07:15',75,'due',25.00),(193,'2014-06-27 00:12:15',108,'dal',35.00),(194,'2014-06-27 00:14:54',57,'due',20.00),(195,'2014-06-27 00:15:30',65,'sig',45.00),(196,'2014-06-27 00:16:21',69,'shapener',15.00),(197,'2014-06-27 00:17:36',39,'botel',150.00),(198,'2014-06-27 00:19:27',37,'ice juice',140.00),(199,'2014-06-27 00:20:31',8,'sweet',250.00),(200,'2014-06-27 00:22:27',106,'botel',20.00),(201,'2014-06-27 00:23:24',22,'due',15.00),(202,'2014-06-27 00:24:08',81,'due',15.00),(203,'2014-06-27 00:26:31',19,'juice',50.00),(204,'2014-06-27 00:29:03',91,'kochyetc',30.00),(205,'2014-06-27 10:16:40',20,'payment',-2054.00),(206,'2014-06-27 10:39:38',78,'bread eggs',135.00),(208,'2014-06-27 
10:41:27',74,'payment',-120.00),(209,'2014-06-27 10:45:24',109,'Balance when computerized',12287.00),(210,'2014-06-27 11:04:40',57,'payment',-155.00),(211,'2014-06-27 11:04:55',68,'blue band',55.00),(212,'2014-06-27 11:14:37',32,'sarmad soday',959.00),(213,'2014-06-27 11:28:59',78,'biscuit',40.00),(214,'2014-06-27 11:54:03',71,'bun',30.00),(215,'2014-06-27 15:26:06',92,'cocomo',20.00),(216,'2014-06-27 15:26:20',100,'paste',110.00),(217,'2014-06-27 15:26:48',71,'20 out of 30',-20.00),(218,'2014-06-27 15:27:22',32,'chetos',30.00),(219,'2014-06-27 15:27:44',75,'ghee cheeni',233.00),(220,'2014-06-27 15:29:32',18,'dues',45.00),(221,'2014-06-27 15:30:25',3,'ball',50.00),(222,'2014-06-27 15:31:15',100,'2 bottles',50.00),(223,'2014-06-27 15:32:10',3,'payment',-110.00),(224,'2014-06-27 15:32:38',4,'chips',40.00),(225,'2014-06-27 15:34:11',75,'ghee cheeni daal chawal',433.00),(226,'2014-06-27 15:34:52',41,'spray',400.00),(227,'2014-06-27 21:24:38',40,'katch bred',351.00),(228,'2014-06-27 22:02:04',8,'botel',60.00),(229,'2014-06-27 23:58:40',78,'half',90.00),(230,'2014-06-27 23:59:56',68,'rice',190.00),(231,'2014-06-28 00:00:40',97,'parchi',400.00),(232,'2014-06-28 00:01:15',97,'milk',70.00),(233,'2014-06-28 00:01:53',16,'ice',250.00),(234,'2014-06-28 00:02:53',35,'sig cake',20.00),(235,'2014-06-28 00:03:41',46,'botel cake',95.00),(236,'2014-06-28 00:05:17',75,'parchi rice bottels',750.00),(237,'2014-06-28 00:06:47',78,'sigret wife etc',230.00),(238,'2014-06-28 00:07:23',37,'nimko',10.00),(239,'2014-06-28 00:07:59',41,'rice',160.00);

SELECT a.*
FROM tbltransactions AS a
JOIN (
    SELECT customerid, MAX(transactiondate) AS last_zero_bal
    FROM (
        SELECT customerid, transactiondate,
               @balance := IF(customerid = @prev_cust,
                              @balance + transactionamount,
                              transactionamount) AS balance,
               @prev_cust := customerid
        FROM (SELECT *
              FROM tbltransactions
              ORDER BY customerid, transactiondate) AS t
        CROSS JOIN (SELECT @balance := 0, @prev_cust := NULL) AS v
    ) AS running_balances
    WHERE balance = 0
    GROUP BY customerid
) AS b ON a.customerid = b.customerid AND a.transactiondate > b.last_zero_bal
The subquery with the alias running_balances calculates each customer's running balance. Then the subquery b finds the most recent date where each customer had a zero balance. Finally, this is joined with the original transaction table to show all the transactions after this.
DEMO
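If you are on MySQL 8.0 or later, the same running balance can be computed with a window function instead of user variables. This is only a minimal sketch of that idea (same table and columns as above, ordering by transactionid rather than transactiondate):

SELECT t.*
FROM tbltransactions AS t
JOIN (
    SELECT customerid, MAX(transactionid) AS last_zero_id
    FROM (
        SELECT customerid, transactionid,
               SUM(transactionamount) OVER (PARTITION BY customerid
                                            ORDER BY transactionid) AS running_balance
        FROM tbltransactions
    ) AS rb
    WHERE running_balance = 0
    GROUP BY customerid
) AS z ON z.customerid = t.customerid
      AND t.transactionid > z.last_zero_id;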

Here you go
select b.*
from (
    select a.customerid, max(a.transactionid) last_zero_transactionid
    from (
        select a.customerid, a.transactionid,
               if(@prev_customer_id = customerid,
                  @running_total := @running_total + @last_transaction,
                  @running_total := 0) running_total,
               @prev_customer_id := customerid prev_customer_id,
               @last_transaction := transactionamount last_transaction
        from tbltransactions a
        join (select @prev_customer_id := 0, @running_total := 0, @last_transaction := 0) b
        order by customerid, transactionid) a
    where a.running_total = 0
    group by a.customerid) a
join tbltransactions b on a.customerid = b.customerid
                      and a.last_zero_transactionid <= b.transactionid;

SELECT t1.* FROM
tbltransactions t1
JOIN (
SELECT MAX(t1.transactionid) max_zero_id, t1.customerid
FROM tbltransactions t1
JOIN tbltransactions t2 ON t1.customerid = t2.customerid
AND t1.transactiondate >= t2.transactiondate
GROUP BY t1.transactiondate, t1.customerid
HAVING SUM(t2.transactionamount) = 0
) t2 ON t1.transactionid > t2.max_zero_id AND t2.customerid = t1.customerid
ORDER BY t1.customerid, t1.transactionid
Output
+---------------+---------------------+------------+-------------------+-------------------+
| transactionid | transactiondate     | customerid | transactiondetail | transactionamount |
+---------------+---------------------+------------+-------------------+-------------------+
|           106 | 2014-06-26 15:20:46 |         81 | bread             |            100.00 |
|           117 | 2014-06-26 15:28:04 |         81 | surf              |            370.00 |
|           202 | 2014-06-27 00:24:08 |         81 | due               |             15.00 |
|           231 | 2014-06-28 00:00:40 |         97 | parchi            |            400.00 |
|           232 | 2014-06-28 00:01:15 |         97 | milk              |             70.00 |
+---------------+---------------------+------------+-------------------+-------------------+
Query Plan
+----+-------------+------------+------+--------------------------+------------+---------+--------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+--------------------------+------------+---------+--------------------+------+---------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 6 | Using temporary; Using filesort |
| 1 | PRIMARY | t1 | ref | transactionid,customerid | customerid | 8 | t2.customerid | 1 | Using where |
| 2 | DERIVED | t1 | ALL | customerid | NULL | NULL | NULL | 229 | Using temporary; Using filesort |
| 2 | DERIVED | t2 | ref | customerid | customerid | 8 | test.t1.customerid | 1 | Using where |
+----+-------------+------------+------+--------------------------+------------+---------+--------------------+------+---------------------------------+

Related

mysql: Search for records where two columns match another record and specify conditions on both records

I have a recurring_bill table which describes a subscription transaction. I have some duplicates that have the same customerId and chargeId, and I want to identify these duplicate records. I also want to make sure IsDeleted is false for both records and that LastDay is in the future or NULL (so it's an ongoing charge). My current query shows records that do not have a duplicate. Please help me correct my query.
SELECT * FROM recurring_bills b1
WHERE EXISTS
(SELECT * FROM recurring_bills b2 WHERE b1.customerId = b2.customerId
AND b1.chargeId = b2.chargeId
AND (b2.LastDay > '2022-03-10' OR b2.LastDay IS NULL)
AND b2.IsDeleted = 0)
AND (b1.LastDay > '2022-03-10' OR b1.LastDay IS NULL) AND b1.IsDeleted = 0;
Lets say this is the input
customerId | chargeId | LastDay | IsDeleted
1 | charge1 | NULL | 0
1 | charge1 | 05-23-2022 | 0
2 | charge2 | 05-23-2022 | 0
2 | charge2 | 05-23-2021 | 0
3 | charge3 | NULL | 1
3 | charge3 | NULL | 0
The correct output would be
customerId | chargeId | LastDay | IsDeleted
1 | charge1 | NULL | 0
MySQL version is 5.5.59
Try using EXISTS to identify records with the same customerId, chargeId, etc., but having a different unique record id. (I didn't know the name of your column, so I used "theUniqueId" for the example.)
Note, when mixing AND/OR operators you must use parentheses to ensure expressions are evaluated in the expected order; otherwise the query may return the wrong results.
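To make the precedence point concrete, here is a small illustration against the same table (it only shows how MySQL parses the predicate; it is not part of the fix below):

-- As written, AND binds tighter than OR, so this filter...
SELECT * FROM recurring_bills
WHERE LastDay > '2022-03-10' OR LastDay IS NULL AND IsDeleted = 0;
-- ...is evaluated as: LastDay > '2022-03-10' OR (LastDay IS NULL AND IsDeleted = 0),
-- which lets deleted rows with a future LastDay slip through.
-- The intended filter needs the parentheses spelled out:
SELECT * FROM recurring_bills
WHERE (LastDay > '2022-03-10' OR LastDay IS NULL) AND IsDeleted = 0;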
That said, wouldn't you want to see all of the "duplicates", so you could take action on them, if needed? If so, try:
CREATE TABLE recurring_bills
(`theUniqueId` int auto_increment primary key
, `customerId` int
, `chargeId` varchar(7)
, `LastDay` date
, `IsDeleted` int
)
;
INSERT INTO recurring_bills
(`customerId`, `chargeId`, `LastDay`, `IsDeleted`)
VALUES
(1, 'charge1', NULL, 0),
(1, 'charge1', '2022-05-23', 0),
(2, 'charge2', '2022-05-23', 0),
(2, 'charge2', '2021-05-23', 0),
(3, 'charge3', NULL, 1),
(3, 'charge3', NULL, 0)
;
-- Show all duplicates
SELECT *
FROM recurring_bills dupe
WHERE dupe.IsDeleted = 0
AND ( dupe.LastDay > '2022-03-10' OR
dupe.LastDay IS NULL
)
AND EXISTS (
SELECT NULL
FROM recurring_bills b
WHERE b.customerId = dupe.customerId
AND b.chargeId = dupe.chargeId
AND b.IsDeleted = dupe.IsDeleted
AND b.theUniqueId <> dupe.theUniqueId
AND ( b.LastDay > '2022-03-10' OR
b.LastDay IS NULL
)
)
Results:
theUniqueId | customerId | chargeId | LastDay | IsDeleted
----------: | ---------: | :------- | :--------- | --------:
1 | 1 | charge1 | null | 0
2 | 1 | charge1 | 2022-05-23 | 0
If for some reason you really want a single record per chargeId + customerId, add a GROUP BY
-- Show single record per dupe combination
SELECT *
FROM recurring_bills dupe
WHERE dupe.IsDeleted = 0
AND ( dupe.LastDay > '2022-03-10' OR
dupe.LastDay IS NULL
)
AND EXISTS (
SELECT NULL
FROM recurring_bills b
WHERE b.customerId = dupe.customerId
AND b.chargeId = dupe.chargeId
AND b.IsDeleted = dupe.IsDeleted
AND b.theUniqueId <> dupe.theUniqueId
AND ( b.LastDay > '2022-03-10' OR
b.LastDay IS NULL
)
)
GROUP BY dupe.customerId, dupe.chargeId
Results:
theUniqueId | customerId | chargeId | LastDay | IsDeleted
----------: | ---------: | :------- | :------ | --------:
1 | 1 | charge1 | null | 0
db<>fiddle here

SQL improvement in MySQL

I have these tables in MySQL.
CREATE TABLE `tableA` (
`id_a` int(11) NOT NULL,
`itemCode` varchar(50) NOT NULL,
`qtyOrdered` decimal(15,4) DEFAULT NULL,
:
PRIMARY KEY (`id_a`),
KEY `INDEX_A1` (`itemCode`)
) ENGINE=InnoDB
CREATE TABLE `tableB` (
`id_b` int(11) NOT NULL AUTO_INCREMENT,
`qtyDelivered` decimal(15,4) NOT NULL,
`id_a` int(11) DEFAULT NULL,
`opType` int(11) NOT NULL, -- '0' delivered to customer, '1' returned from customer
:
PRIMARY KEY (`id_b`),
KEY `INDEX_B1` (`id_a`),
KEY `INDEX_B2` (`opType`)
) ENGINE=InnoDB
tableA shows how many quantity we received order from customer, tableB shows how many quantity we delivered to customer for each order.
I want to make a SQL which counts how many quantity remaining for delivery on each itemCode.
The SQL is as below. This SQL works, but slow.
SELECT T1.itemCode,
SUM(IFNULL(T1.qtyOrdered,'0')-IFNULL(T2.qtyDelivered,'0')+IFNULL(T3.qtyReturned,'0')) as qty
FROM tableA AS T1
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyDelivered FROM tableB WHERE opType = '0' GROUP BY id_a)
AS T2 on T1.id_a = T2.id_a
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyReturned FROM tableB WHERE opType = '1' GROUP BY id_a)
AS T3 on T1.id_a = T3.id_a
WHERE T1.itemCode = '?'
GROUP BY T1.itemCode
I tried explain on this SQL, and the result is as below.
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| 1 | PRIMARY | T1 | ref | INDEX_A1 | INDEX_A1 | 152 | const | 1 | Using where |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 21211 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 10 | |
| 3 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 96 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 55614 | Using where; Using temporary; Using filesort |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
I want to improve my query. How can I do that?
First, your tableB stores opType as an int, but you are comparing it to the strings '0' and '1'; compare against the numeric values 0 and 1 instead. To optimize your pre-aggregates, you should not rely on individual single-column indexes but on a composite, in this case covering, index: index tableB on (opType, id_a, qtyDelivered) as a single index. opType optimizes the WHERE, id_a optimizes the GROUP BY, and qtyDelivered lets the aggregate be served from the index without going to the raw data pages.
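Expressed as DDL, that covering index would look roughly like this (the index name is only illustrative):

ALTER TABLE tableB
  ADD INDEX idx_optype_ida_qty (opType, id_a, qtyDelivered);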
Since you are looking for the two types, you can roll them up into a single subquery testing for either in a single pass result. THEN, Join to your tableA results.
SELECT
T1.itemCode,
SUM( IFNULL(T1.qtyOrdered, 0 )
- IFNULL(T2.qtyDelivered, 0)
+ IFNULL(T2.qtyReturned, 0)) as qty
FROM
tableA AS T1
LEFT JOIN ( SELECT
id_a,
SUM( IF( opType=0,qtyDelivered, 0)) as qtyDelivered,
SUM( IF( opType=1,qtyDelivered, 0)) as qtyReturned
FROM
tableB
WHERE
opType IN ( 0, 1 )
GROUP BY
id_a) AS T2
on T1.id_a = T2.id_a
WHERE
T1.itemCode = '?'
GROUP BY
T1.itemCode
Now, depending on the size of your tables, you might be better off joining the inner subquery to tableA so that it only aggregates rows for the itemCode you are expecting. If you have 50k items but only 120 of them qualify for the item code in question, the inner query above still aggregates over all 50k, which is overkill. In that case, I would suggest an index on tableA on (itemCode, id_a) and would adjust the inner query to:
LEFT JOIN ( SELECT
b.id_a,
SUM( IF( b.opType = 0, b.qtyDelivered, 0)) as qtyDelivered,
SUM( IF( b.opType = 1, b.qtyDelivered, 0)) as qtyReturned
FROM
( select distinct id_a
from tableA
where itemCode = '?' ) pqA
JOIN tableB b
on PQA.id_A = b.id_a
AND b.opType IN ( 0, 1 )
GROUP BY
id_a) AS T2
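For reference, the supporting index on tableA suggested above would look roughly like this (the index name is only illustrative):

ALTER TABLE tableA
  ADD INDEX idx_itemcode_ida (itemCode, id_a);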
My Query against your SQLFiddle

MySQL show used index in query

For example I have created 3 index:
click_date - transaction table, daily_metric table
order_date - transaction table
I want to check does my query use index, I use EXPLAIN function and get this result:
+----+--------------+--------------+-------+---------------+------------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+--------------+-------+---------------+------------+---------+------+--------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 668 | Using temporary; Using filesort |
| 2 | DERIVED | <derived3> | ALL | NULL | NULL | NULL | NULL | 645 | |
| 2 | DERIVED | <derived4> | ALL | NULL | NULL | NULL | NULL | 495 | |
| 4 | DERIVED | transaction | ALL | order_date | NULL | NULL | NULL | 291257 | Using where; Using temporary; Using filesort |
| 3 | DERIVED | daily_metric | range | click_date | click_date | 3 | NULL | 812188 | Using where; Using temporary; Using filesort |
| 5 | UNION | <derived7> | ALL | NULL | NULL | NULL | NULL | 495 | |
| 5 | UNION | <derived6> | ALL | NULL | NULL | NULL | NULL | 645 | Using where; Not exists |
| 7 | DERIVED | transaction | ALL | order_date | NULL | NULL | NULL | 291257 | Using where; Using temporary; Using filesort |
| 6 | DERIVED | daily_metric | range | click_date | click_date | 3 | NULL | 812188 | Using where; Using temporary; Using filesort |
| NULL | UNION RESULT | <union2,5> | ALL | NULL | NULL | NULL | NULL | NULL | |
+----+--------------+--------------+-------+---------------+------------+---------+------+--------+----------------------------------------------+
From the EXPLAIN results I see that the order_date index of the transaction table is not used; do I understand that correctly?
Was the click_date index of the daily_metric table used correctly?
Please tell me how to determine from the EXPLAIN result whether the indexes I created are used properly in the query.
My query:
SELECT
partner_id,
the_date,
SUM(clicks) as clicks,
SUM(total_count) as total_count,
SUM(count) as count,
SUM(total_sum) as total_sum,
SUM(received_sum) as received_sum,
SUM(partner_fee) as partner_fee
FROM (
SELECT
clicks.partner_id,
clicks.click_date as the_date,
clicks,
orders.total_count,
orders.count,
orders.total_sum,
orders.received_sum,
orders.partner_fee
FROM
(SELECT
partner_id, click_date, sum(clicks) as clicks
FROM
daily_metric WHERE DATE(click_date) BETWEEN '2013-04-01' AND '2013-04-30'
GROUP BY partner_id , click_date) as clicks
LEFT JOIN
(SELECT
partner_id,
DATE(order_date) as order_dates,
SUM(order_sum) as total_sum,
SUM(customer_paid_sum) as received_sum,
SUM(partner_fee) as partner_fee,
count(*) as total_count,
count(CASE
WHEN status = 1 THEN 1
ELSE NULL
END) as count
FROM
transaction WHERE DATE(order_date) BETWEEN '2013-04-01' AND '2013-04-30'
GROUP BY DATE(order_date) , partner_id) as orders ON orders.partner_id = clicks.partner_id AND clicks.click_date = orders.order_dates
UNION ALL SELECT
orders.partner_id,
orders.order_dates as the_date,
clicks,
orders.total_count,
orders.count,
orders.total_sum,
orders.received_sum,
orders.partner_fee
FROM
(SELECT
partner_id, click_date, sum(clicks) as clicks
FROM
daily_metric WHERE DATE(click_date) BETWEEN '2013-04-01' AND '2013-04-30'
GROUP BY partner_id , click_date) as clicks
RIGHT JOIN
(SELECT
partner_id,
DATE(order_date) as order_dates,
SUM(order_sum) as total_sum,
SUM(customer_paid_sum) as received_sum,
SUM(partner_fee) as partner_fee,
count(*) as total_count,
count(CASE
WHEN status = 1 THEN 1
ELSE NULL
END) as count
FROM
transaction WHERE DATE(order_date) BETWEEN '2013-04-01' AND '2013-04-30'
GROUP BY DATE(order_date) , partner_id) as orders ON orders.partner_id = clicks.partner_id AND clicks.click_date = orders.order_dates
WHERE
clicks.partner_id is NULL
ORDER BY the_date DESC
) as t
GROUP BY the_date ORDER BY the_date DESC LIMIT 50 OFFSET 0
Although I can't explain everything the EXPLAIN has dumped, I thought there must be an easier way to do what you have, and came up with the following. I would suggest the indexes below to optimize your existing query for the WHERE date range and the grouping by partner.
Additionally, when a query applies a function to a column, such as your DATE(order_date) and DATE(click_date), it can't take advantage of the index on that column. To let the index be used, bound the full date/time range, from 12:00am on the first day up to (but not including) the first day after the range. I would typically do this via
x >= someDate AND x < firstDayAfterRange
which in your example would be (note: less than May 1st, which covers through April 30th at 11:59:59pm):
click_date >= '2013-04-01' AND click_date < '2013-05-01'
Table Index
transaction (order_date, partner_id)
daily_metric (click_date, partner_id)
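In DDL form, those two composite indexes would be roughly as follows (the index names are only illustrative):

ALTER TABLE transaction  ADD INDEX idx_orderdate_partner (order_date, partner_id);
ALTER TABLE daily_metric ADD INDEX idx_clickdate_partner (click_date, partner_id);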
Now, an adjustment. Since your clicks table may have entries the transactions table doesn't, and vice versa, I would adjust this query to do a pre-query of all possible date/partner combinations, then left-join to the respective aggregate queries, such as:
SELECT
AllPartners.Partner_ID,
AllPartners.the_Date,
coalesce( clicks.clicks, 0 ) Clicks,
coalesce( orders.total_count, 0 ) TotalCount,
coalesce( orders.count, 0 ) OrderCount,
coalesce( orders.total_sum, 0 ) OrderSum,
coalesce( orders.received_sum, 0 ) ReceivedSum,
coalesce( orders.partner_fee, 0 ) PartnerFee
from
( select distinct
dm.partner_id,
DATE( dm.click_date ) as the_Date
FROM
daily_metric dm
WHERE
dm.click_date >= '2013-04-01' AND dm.click_date < '2013-05-01'
UNION
select
t.partner_id,
DATE(t.order_date) as the_Date
FROM
transaction t
WHERE
t.order_date >= '2013-04-01' AND t.order_date < '2013-05-01' ) AllPartners
LEFT JOIN
( SELECT
dm.partner_id,
DATE( dm.click_date ) sumDate,
sum( dm.clicks) as clicks
FROM
daily_metric dm
WHERE
dm.click_date >= '2013-04-01' AND dm.click_date < '2013-05-01'
GROUP BY
dm.partner_id,
DATE( dm.click_date ) ) as clicks
ON AllPartners.partner_id = clicks.partner_id
AND AllPartners.the_date = clicks.sumDate
LEFT JOIN
( SELECT
t.partner_id,
DATE(t.order_date) as sumDate,
SUM(t.order_sum) as total_sum,
SUM(t.customer_paid_sum) as received_sum,
SUM(t.partner_fee) as partner_fee,
count(*) as total_count,
count(CASE WHEN t.status = 1 THEN 1 ELSE NULL END) as COUNT
FROM
transaction t
WHERE
t.order_date >= '2013-04-01' AND t.order_date < '2013-05-01'
GROUP BY
t.partner_id,
DATE(t.order_date) ) as orders
ON AllPartners.partner_id = orders.partner_id
AND AllPartners.the_date = orders.sumDate
order by
AllPartners.the_date DESC
limit 50 offset 0
This way, the first query will be quick on the index to get all possible combinations from EITHER table. Then the left-join will AT MOST join to one row per set. If found, get the number, if not, I am applying COALESCE() so if null, defaults to zero.
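For what it's worth, COALESCE() simply returns its first non-NULL argument, which is why a missing left-joined aggregate row turns into zero here:

SELECT COALESCE(NULL, 0);   -- returns 0, as when the left-joined aggregate row is missing
SELECT COALESCE(42, 0);     -- returns 42, as when the aggregate row is present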
CLARIFICATION.
Like your pre-aggregate queries "clicks" and "orders", "AllPartners" is the alias of the result of the SELECT DISTINCT of partners and dates within the date range you were interested in. Its resulting columns are "partner_id" and "the_date", matching your other subqueries, so it is the basis for joining to the "clicks" and "orders" aggregates. And since those two columns are in the alias "AllPartners", I grab them for the field list, because the other aliases are LEFT JOINed and a given row may not exist in either or both of them.

What's the most efficient way to structure a 2-dimensional MySQL query?

I have a MySQL database with the following tables and fields:
Student (id)
Class (id)
Grade (id, student_id, class_id, grade)
The student and class tables are indexed on id (primary keys). The grade table is indexed on id (primary key) and student_id, class_id and grade.
I need to construct a query which, given a class ID, gives a list of all other classes and the number of students who scored more in that other class.
Essentially, given the following data in the grades table:
student_id | class_id | grade
--------------------------------------
1 | 1 | 87
1 | 2 | 91
1 | 3 | 75
2 | 1 | 68
2 | 2 | 95
2 | 3 | 84
3 | 1 | 76
3 | 2 | 88
3 | 3 | 71
Querying with class ID 1 should yield:
class_id | total
-------------------
2 | 3
3 | 1
Ideally I'd like this to execute in a few seconds, as I'd like it to be part of a web interface.
The issue I have is that in my database, I have over 1300 classes and 160,000 students. My grade table has almost 15 million rows and as such, the query takes a long time to execute.
Here's what I've tried so far along with the times each query took:
-- I manually stopped execution after 2 hours
SELECT c.id, COUNT(*) AS total
FROM classes c
INNER JOIN grades a ON a.class_id = c.id
INNER JOIN grades b ON b.grade < a.grade AND
a.student_id = b.student_id AND
b.class_id = 1
WHERE c.id != 1
GROUP BY c.id
-- I manually stopped execution after 20 minutes
SELECT c.id,
(
SELECT COUNT(*)
FROM grades g
WHERE g.class_id = c.id AND g.grade > (
SELECT grade
FROM grades
WHERE student_id = g.student_id AND
class_id = 1
)
) AS total
FROM classes c
WHERE c.id != 1;
-- 1 min 12 sec
CREATE TEMPORARY TABLE temp_blah (student_id INT(11) PRIMARY KEY, grade INT);
INSERT INTO temp_blah SELECT student_id, grade FROM grades WHERE class_id = 1;
SELECT c.id,
(
SELECT COUNT(*)
FROM grades g
INNER JOIN temp_blah t ON g.student_id = t.student_id
WHERE g.class_id = c.id AND t.grade < g.grade
) AS total
FROM classes c
WHERE c.id != 1;
-- Same thing but with joins instead of a subquery - 1 min 54 sec
SELECT c.id,
COUNT(*) AS total
FROM classes c
INNER JOIN grades g ON c.id = g.class_id
INNER JOIN temp_blah t ON g.student_id = t.student_id
WHERE c.id != 1
GROUP BY c.id;
I also considered creating a 2D table, with students as rows and classes as columns, however I can see two issues with this:
MySQL implements a maximum column count (4096) and maximum row size (in bytes) which may be exceeded by this approach
I can't think of a good way to query that structure to get the results I need
I also considered performing these calculations as background jobs and storing the results somewhere, but for the information to remain current (it must), they would need to be recalculated every time a student, class or grade record was created or updated.
Does anyone know a more efficient way to construct this query?
EDIT: Create table statements:
CREATE TABLE `classes` (
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1331 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci$$
CREATE TABLE `students` (
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=160803 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci$$
CREATE TABLE `grades` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`student_id` int(11) DEFAULT NULL,
`class_id` int(11) DEFAULT NULL,
`grade` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_grades_on_student_id` (`student_id`),
KEY `index_grades_on_class_id` (`class_id`),
KEY `index_grades_on_grade` (`grade`)
) ENGINE=InnoDB AUTO_INCREMENT=15507698 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci$$
Output of explain on the most efficient query (the 1 min 12 sec one):
id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 | PRIMARY | c | range | PRIMARY | PRIMARY | 4 | | 683 | Using where; Using index
2 | DEPENDENT SUBQUERY | g | ref | index_grades_on_student_id,index_grades_on_class_id,index_grades_on_grade | index_grades_on_class_id | 5 | mydb.c.id | 830393 | Using where
2 | DEPENDENT SUBQUERY | t | eq_ref | PRIMARY | PRIMARY | 4 | mydb.g.student_id | 1 | Using where
Another edit - explain output for sgeddes suggestion:
+----+-------------+------------+--------+---------------+------+---------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+------+---------+------+----------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 14953992 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | <derived3> | system | NULL | NULL | NULL | NULL | 1 | Using filesort |
| 2 | DERIVED | G | ALL | NULL | NULL | NULL | NULL | 15115388 | |
| 3 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
+----+-------------+------------+--------+---------------+------+---------+------+----------+----------------------------------------------+
I think this should work for you using SUM and CASE:
SELECT C.Id,
SUM(
CASE
WHEN G.Grade > C2.Grade THEN 1 ELSE 0
END
)
FROM Class C
INNER JOIN Grade G ON C.Id = G.Class_Id
LEFT JOIN (
SELECT Grade, Student_Id, Class_Id
FROM Class
JOIN Grade ON Class.Id = Grade.Class_Id
WHERE Class.Id = 1
) C2 ON G.Student_Id = C2.Student_Id
WHERE C.Id <> 1
GROUP BY C.Id
Sample Fiddle Demo
--EDIT--
In response to your comment, here is another attempt that should be much faster:
SELECT
Class_Id,
SUM(CASE WHEN Grade > minGrade THEN 1 ELSE 0 END)
FROM
(
SELECT
Student_Id,
@classToCheck :=
IF(G.Class_Id = 1, Grade, @classToCheck) minGrade ,
Class_Id,
Grade
FROM Grade G
JOIN (SELECT @classToCheck := 0) t
ORDER BY Student_Id, IF(Class_Id = 1, 0, 1)
) t
WHERE Class_Id <> 1
GROUP BY Class_ID
And more sample fiddle.
Can you give this a try on the original data as well! It is only one join :)
select
final.class_id, count(*) as total
from
(
select * from
(select student_id as p_student_id, grade as p_grade from table1 where class_id = 1) as partial
inner join table1 on table1.student_id = partial.p_student_id
where table1.class_id <> 1 and table1.grade > partial.p_grade
) as final
group by
final.class_id;
sqlfiddle link

MySQL Query Optimization with MAX()

I have 3 tables with the following schema:
CREATE TABLE `devices` (
`device_id` int(11) NOT NULL auto_increment,
`name` varchar(20) default NULL,
`appliance_id` int(11) default '0',
`sensor_type` int(11) default '0',
`display_name` VARCHAR(100),
PRIMARY KEY USING BTREE (`device_id`)
)
CREATE TABLE `channels` (
`channel_id` int(11) NOT NULL AUTO_INCREMENT,
`device_id` int(11) NOT NULL,
`channel` varchar(10) NOT NULL,
PRIMARY KEY (`channel_id`),
KEY `device_id_idx` (`device_id`)
)
CREATE TABLE `historical_data` (
`date_time` datetime NOT NULL,
`channel_id` int(11) NOT NULL,
`data` float DEFAULT NULL,
`unit` varchar(10) DEFAULT NULL,
KEY `devices_datetime_idx` (`date_time`) USING BTREE,
KEY `channel_id_idx` (`channel_id`)
)
The setup is that a device can have one or more channels and each channel has many (historical) data.
I use the following query to get the last historical data for one device and all it's related channels:
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND ( c.channel = 'ch1')
GROUP BY c.channel
ORDER BY h.date_time, channel
The query plan looks as follows:
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
| 1 | SIMPLE | c | ALL | PRIMARY,device_id_idx | NULL | NULL | NULL | 34 | Using where |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | c.device_id | 1 | Using where |
| 1 | SIMPLE | h | ref | channel_id_idx | channel_id_idx | 4 | c.channel_id | 322019 | |
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
3 rows in set (0.00 sec)
The above query is currently taking approximately 15 secs and I wanted to know if there are any tips or way to improve the query?
Edit:
Example data from historical_data
+---------------------+------------+------+------+
| date_time | channel_id | data | unit |
+---------------------+------------+------+------+
| 2011-11-20 21:30:57 | 34 | 23.5 | C |
| 2011-11-20 21:30:57 | 9 | 68 | W |
| 2011-11-20 21:30:54 | 34 | 23.5 | C |
| 2011-11-20 21:30:54 | 5 | 316 | W |
| 2011-11-20 21:30:53 | 34 | 23.5 | C |
| 2011-11-20 21:30:53 | 2 | 34 | W |
| 2011-11-20 21:30:51 | 34 | 23.4 | C |
| 2011-11-20 21:30:51 | 9 | 68 | W |
| 2011-11-20 21:30:49 | 34 | 23.4 | C |
| 2011-11-20 21:30:49 | 4 | 193 | W |
+---------------------+------------+------+------+
10 rows in set (0.00 sec)
Edit 2:
Mutliple channel SELECT example:
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND ( c.channel = 'ch1' OR c.channel = 'ch2' OR c.channel = 'ch2')
GROUP BY c.channel
ORDER BY h.date_time, channel
I've used OR in the c.channel WHERE clause because it was easier to generate programmatically, but it can be changed to use IN if necessary.
Edit 3:
Example result of what I'm trying to achieve:
+-----------+------------+---------+---------------------+-------+
| device_id | channel_id | channel | max(h.date_time) | data |
+-----------+------------+---------+---------------------+-------+
| 28 | 9 | ch1 | 2011-11-21 20:39:36 | 0 |
| 28 | 35 | ch2 | 2011-11-21 20:30:55 | 32767 |
+-----------+------------+---------+---------------------+-------+
I have added the device_id to the example but my select will only need to return channel_id, channel, last date_time i.e max and the data. The results should be the last record from the historical_data table for each channel for one device.
It seems that dropping and re-creating the index on date_time brought my original SQL down to around 2 secs.
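If anyone wants to reproduce that, the rebuild would look roughly like this (assuming the index name from the schema above):

ALTER TABLE historical_data DROP INDEX devices_datetime_idx;
ALTER TABLE historical_data ADD INDEX devices_datetime_idx (date_time) USING BTREE;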
I haven't been able to test this, so I'd like to ask you to run it and let us know what happens: whether it gives you the desired result and whether it runs faster than your current query:
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetLatestHistoricalData_EXAMPLE`
(
IN param_device_name VARCHAR(20)
, IN param_appliance_id INT
, IN param_sensor_type INT
, IN param_channel VARCHAR(10)
)
BEGIN
SELECT
h.date_time, h.data
FROM
historical_data h
INNER JOIN
(
SELECT c.channel_id
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
WHERE
d.name = param_device_name
AND d.appliance_id = param_appliance_id
AND d.sensor_type = param_sensor_type
AND c.channel = param_channel
)
c ON h.channel_id = c.channel_id
ORDER BY h.date_time DESC
LIMIT 1;
END
Then to run a test:
CALL GetLatestHistoricalData_EXAMPLE ('livingroom', 0, 1, 'ch1');
I tried working it into a stored procedure so that even if you get the desired results using this for one device, you can try it with another device and see the results... Thanks!
[edit] : : In response to Danny's comment here's an updated test version:
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetLatestHistoricalData_EXAMPLE_3Channel`
(
IN param_device_name VARCHAR(20)
, IN param_appliance_id INT
, IN param_sensor_type INT
, IN param_channel_1 VARCHAR(10)
, IN param_channel_2 VARCHAR(10)
, IN param_channel_3 VARCHAR(10)
)
BEGIN
SELECT
h.date_time, h.data
FROM
historical_data h
INNER JOIN
(
SELECT c.channel_id
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
WHERE
d.name = param_device_name
AND d.appliance_id = param_appliance_id
AND d.sensor_type = param_sensor_type
AND (
c.channel IN (param_channel_1
,param_channel_2
,param_channel_3
)
)
)
c ON h.channel_id = c.channel_id
ORDER BY h.date_time DESC
LIMIT 1;
END
Then to run a test:
CALL GetLatestHistoricalData_EXAMPLE_3Channel ('livingroom', 0, 1, 'ch1', 'ch2' , 'ch3');
Again, this is just for testing, so you'll be able to see if it meets your needs..
I would first add an index on the devices table ( appliance_id, sensor_type, name ) to match your query. I don't know how many entries are in this table, but if large, and many elements per device, get right to it.
Second, on your channels table, index on ( device_id, channel )
Third, on your history data, index on ( channel_id, date_time )
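In DDL form, those three indexes would be roughly as follows (the index names are only illustrative):

ALTER TABLE devices         ADD INDEX idx_dev_lookup (appliance_id, sensor_type, name);
ALTER TABLE channels        ADD INDEX idx_chan_dev_channel (device_id, channel);
ALTER TABLE historical_data ADD INDEX idx_hist_chan_date (channel_id, date_time);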
then,
SELECT STRAIGHT_JOIN
PreQuery.MostRecent,
PreQuery.Channel_ID,
PreQuery.Channel,
H2.Data,
H2.Unit
from
( select
c.channel_id,
c.channel,
max( h.date_time ) as MostRecent
from
devices d
join channels c
on d.device_id = c.device_id
and c.channel in ( 'ch1', 'ch2', 'ch3' )
join historical_data h
on c.channel_id = h.channel_id
where
d.appliance_id = 0
and d.sensor_type = 1
and d.name = 'livingroom'
group by
c.channel_id ) PreQuery
JOIN Historical_Data H2
on PreQuery.Channel_ID = H2.Channel_ID
AND PreQuery.MostRecent = H2.Date_Time
order by
PreQuery.MostRecent,
PreQuery.Channel