Understand EXPLAIN in MySQL

I am trying to interpret MySQL's EXPLAIN output for a query. This is the table:
create table text_mess(
datamess timestamp(3) DEFAULT 0,
sender bigint ,
recipient bigint ,
roger boolean,
msg char(255),
foreign key(recipient)
references users (tel)
on delete cascade
on update cascade,
primary key(datamess,sender)
)
engine = InnoDB
This is the first query:
EXPLAIN
select /*!STRAIGHT_JOIN*/datamess, sender,recipient,roger,msg
from text_mess join (select max(datamess)as dmess
from text_mess
where roger = true
group by sender,recipient) as max
on text_mess.datamess=max.dmess ;
and this is the second:
EXPLAIN
select /*!STRAIGHT_JOIN*/datamess, sender,recipient,roger,msg
from (select max(datamess)as dmess
from text_mess
where roger = true
group by sender,recipient) as max
join
text_mess
on max.dmess = text_mess.datamess ;
The two queries ask for the same thing; the only difference is the order of the ref table (driving table): in the first case it is text_mess, in the second case it is the subquery:
[screenshot: EXPLAIN output of the first and second query]
As you can see, the difference is in the order of the first two rows. My question is mainly about the second query (the faster one).
The second row should be the inner table, but if so, why does the ref column show max.dmess, which should be a column of the ref table (the subquery)?
Also, does the last row refer to how the first one is built?
And finally, do you think there is a more efficient query?

Related

MYSQL ERROR CODE: 1288 - can't update with join statement

Thanks for past help.
While doing an update using a join, I am getting 'Error Code: 1288. The target table _____ of the UPDATE is not updatable' and can't figure out why. I can update the table with a simple update statement (UPDATE sales.customerABC SET contractID = 'x';) but can't using a join like this:
UPDATE (
SELECT * #where '*' contains columns a.uniqueID and a.contractID
FROM sales.customerABC
WHERE contractID IS NULL
) as a
LEFT JOIN (
SELECT uniqueID, contractID
FROM sales.tblCustomers
WHERE contractID IS NOT NULL
) as b
ON a.uniqueID = b.uniqueID
SET a.contractID = b.contractID;
If I change that update statement to a SELECT such as:
SELECT * FROM (
SELECT *
FROM opwSales.dealerFilesCTS
WHERE pcrsContractID IS NULL
) as a
LEFT JOIN (
SELECT uniqueID, pcrsContractID
FROM opwSales.dealerFileLoad
WHERE pcrsContractID IS NOT NULL
) as b
ON a."Unique ID" = b.uniqueID;
the result table would contain these columns:
a.uniqueID, a.contractID, b.uniqueID, b.contractID
59682204, NULL, NULL, NULL
a3e8e81d, NULL, NULL, NULL
cfd1dbf9, NULL, NULL, NULL
5ece009c, , 5ece009c, B123
5ece0d04, , 5ece0d04, B456
5ece7ab0, , 5ece7ab0, B789
cfd21d2a, NULL, NULL, NULL
cfd22701, NULL, NULL, NULL
cfd23032, NULL, NULL, NULL
I pretty much have all database privileges and can't find any restrictions on the referenced tables. I can't find much information online concerning the error code, either.
Thanks in advance guys.
You cannot update a sub-select because it's not a "real" table - MySQL cannot easily determine how the sub-select assignment maps back to the originating table.
Try:
UPDATE customerABC
JOIN tblCustomers USING (uniqueID)
SET customerABC.contractID = tblCustomers.contractID
WHERE customerABC.contractID IS NULL AND tblCustomers.contractID IS NOT NULL
Notes:
You can use a plain (inner) JOIN instead of a LEFT JOIN, since you want uniqueID to exist and be non-NULL in both tables. A LEFT JOIN would generate extra NULL-extended rows from tblCustomers, only to have them filtered out by the requirement that tblCustomers.contractID not be NULL. Since they allow tighter use of indexes, inner JOINs tend to be more efficient than LEFT JOINs.
Since the field has the same name in both tables, you can replace ON (a.field1 = b.field1) with the USING (field1) shortcut.
You really want a covering index on (uniqueID, contractID) on both tables to maximize efficiency.
This is not going to work unless you have "real" tables for the UPDATE. tblCustomers may be a view or a subselect, but customerABC may not. You might need a more complicated JOIN to pull out a complex WHERE that would otherwise be hidden inside a subselect, if the original 'SELECT * FROM customerABC' was really a more complex query than a straight SELECT. What this boils down to is: MySQL needs a strong unique key to know what it has to update, and that key must be in a single table. To reliably update more than one table, I think you need two UPDATEs inside a properly write-locked transaction.
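To make the last two notes concrete, here is a rough sketch. The index names and the second UPDATE statement are purely illustrative assumptions; only uniqueID and contractID come from the question.
-- covering indexes suggested above (index names are made up)
CREATE INDEX idx_customerABC_uid_contract ON customerABC (uniqueID, contractID);
CREATE INDEX idx_tblCustomers_uid_contract ON tblCustomers (uniqueID, contractID);

-- if more than one table really has to change, use separate UPDATEs
-- inside one transaction instead of a single multi-table UPDATE
START TRANSACTION;

UPDATE customerABC
JOIN tblCustomers USING (uniqueID)
SET customerABC.contractID = tblCustomers.contractID
WHERE customerABC.contractID IS NULL
  AND tblCustomers.contractID IS NOT NULL;

-- hypothetical second statement touching the other table
UPDATE tblCustomers
JOIN customerABC USING (uniqueID)
SET tblCustomers.contractSyncedAt = NOW()   -- assumed column, for illustration only
WHERE customerABC.contractID = tblCustomers.contractID;

COMMIT;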

How to update a certain column in SQL using a SELECT and WHERE clause on the same table?

I have a table called employeepostinghistory with the following columns:
employee_posting_id, employee_posting_to, employee_posting_from, emp_emp_cnic
employee_posting_id is the primary key and emp_emp_cnic is a foreign key.
Basically this table holds employee posting history with from and to dates, i.e. employee_posting_from and employee_posting_to.
I want to update all records, setting employee_posting_to = NULL where the record is the latest one for each employee, so I have used DESC on employee_posting_from. But the UPDATE query says the subquery returns more than 1 record. What could be a solution for this problem?
UPDATE employeepostinghistory
SET employeepostinghistory.posting_to=NULL
WHERE employeepostinghistory.emp_emp_cnic=(
SELECT DISTINCT employeepostinghistory.emp_emp_cnic
from employeepostinghistory
GROUP BY employeepostinghistory.emp_emp_cnic
ORDER BY employeepostinghistory.posting_from DESC
)
;
UPDATE employeepostinghistory
NATURAL JOIN (
SELECT emp_emp_cnic, MAX(datetime_column) datetime_column
FROM employeepostinghistory
GROUP BY 1
) last_row_per_employee
SET employeepostinghistory.posting_to=NULL
;
(emp_emp_cnic, datetime_column) must be unique, so that the NATURAL JOIN matches exactly one row per employee.
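Mapped onto the column names from the question (assuming employee_posting_from is the datetime column meant above, and that (emp_emp_cnic, employee_posting_from) is unique), the same idea would look roughly like this:
UPDATE employeepostinghistory
NATURAL JOIN (
    -- latest posting date per employee; the NATURAL JOIN then matches on
    -- both emp_emp_cnic and employee_posting_from
    SELECT emp_emp_cnic, MAX(employee_posting_from) AS employee_posting_from
    FROM employeepostinghistory
    GROUP BY emp_emp_cnic
) AS last_row_per_employee
SET employeepostinghistory.employee_posting_to = NULL;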

group update mysql command

I have the following MySQL tables (tests, questions) with the corresponding column types. The field correct_answer of the questions table can hold the value 'yes' or 'no'. When it is 'yes', it is counted as correct; when it is 'no', it is counted as incorrect. The fields correct and incorrect in the tests table hold the sums of those counts. I want a single SQL command that updates the tests table based on the values in the questions table. A record is initially inserted into the tests table with the counts set to 0, while the questions table is filled up progressively.
tests(test_id integer primary key, correct integer, incorrect integer)
questions(test_id integer foreign key, question varchar(35), correct_answer varchar(3))
Test data
tests
10,0,0
11,0,0
questions
10,'textbook','yes'
10,'fire','no'
10,'card','yes'
11,'lamp','yes'
After I run the SQL command, the tests table must read:
10,2,1
11,1,0
I tried "update tests set correct=select count(test_id) from questions where correct_answer='oui',incorrect=select count(test_id) from questions where correct_answer='non'" but does not work
You can do the aggregation inside a subquery and join it with the tests table to update the total counts:
update tests t
join ( select test_id,
sum(correct_answer='yes') as correctCount,
sum(correct_answer='no') as incorrectCount
from questions
group by test_id) aggr
on t.test_id = aggr.test_id
set t.correct = aggr.correctCount,
t.incorrect = aggr.incorrectCount
Try this.
UPDATE tests
SET tests.correct = (
        SELECT count(*)
        FROM questions
        WHERE tests.test_id = questions.test_id
          AND questions.correct_answer = 'yes'
        GROUP BY test_id
    ),
    tests.incorrect = (
        SELECT count(*)
        FROM questions
        WHERE tests.test_id = questions.test_id
          AND questions.correct_answer = 'no'
        GROUP BY test_id
    )
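One caveat about this second version (an observation, not part of the original answer): because of the GROUP BY inside each subquery, a test with no matching rows, such as test 11 which has no 'no' answers, gets NULL instead of 0. Dropping the GROUP BY makes COUNT(*) always return a single row, so the count comes back as 0 in that case:
UPDATE tests
SET tests.correct = (
        -- scalar subquery: with no GROUP BY, COUNT(*) returns 0 when nothing matches
        SELECT COUNT(*) FROM questions
        WHERE questions.test_id = tests.test_id
          AND questions.correct_answer = 'yes'
    ),
    tests.incorrect = (
        SELECT COUNT(*) FROM questions
        WHERE questions.test_id = tests.test_id
          AND questions.correct_answer = 'no'
    );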

What should I use instead of IN?

I have a query like this:
SELECT DISTINCT devices1_.id AS id27_, devices1_.createdTime AS createdT2_27_, devices1_.deletedOn AS deletedOn27_,
devices1_.deviceAlias AS deviceAl4_27_, devices1_.deviceName AS deviceName27_, devices1_.deviceTypeId AS deviceT21_27_,
devices1_.equipmentVendor AS equipmen6_27_, devices1_.exceptionDetail AS exceptio7_27_, devices1_.hardwareVersion AS hardware8_27_,
devices1_.ipAddress AS ipAddress27_, devices1_.isDeleted AS isDeleted27_, devices1_.loopBack AS loopBack27_,
devices1_.modifiedTime AS modifie12_27_, devices1_.osVersion AS osVersion27_, devices1_.productModel AS product14_27_,
devices1_.productName AS product15_27_, devices1_.routerType AS routerType27_, devices1_.rundate AS rundate27_,
devices1_.serialNumber AS serialN18_27_, devices1_.serviceName AS service19_27_, devices1_.siteId AS siteId27_,
devices1_.siteIdA AS siteIdA27_, devices1_.status AS status27_, devices1_.creator AS creator27_, devices1_.lastModifier AS lastMod25_27_
FROM goldenvariation goldenconf0_
INNER JOIN devices devices1_ ON goldenconf0_.deviceId=devices1_.id
CROSS JOIN devices devices2_
WHERE goldenconf0_.deviceId=devices2_.id
AND (goldenconf0_.classType = 'policy-options')
AND DATE(goldenconf0_.rundate)=DATE('2014-04-14 00:00:00')
AND devices2_.isDeleted=0
AND EXISTS (SELECT DISTINCT(deviceId) FROM goldenvariation goldenconf3_
WHERE (goldenconf3_.goldenVariationType = 'MISMATCH')
AND (goldenconf3_.classType = 'policy-options')
AND DATE(goldenconf3_.rundate)=DATE('2014-04-14 00:00:00'))
AND EXISTS (SELECT DISTINCT (deviceId) FROM goldenvariation goldenconf4_
WHERE (goldenconf4_.goldenVariationType = 'MISSING')
AND (goldenconf4_.classType = 'policy-options')
AND DATE(goldenconf4_.rundate)=DATE('2014-04-14 00:00:00'));
It's taking too much time. How can I rewrite the query to make it faster?
The table structure of goldenvariation is:
CREATE TABLE `goldenvariation` (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
`classType` VARCHAR(255) DEFAULT NULL,
`createdTime` DATETIME DEFAULT NULL,
`goldenValue` LONGTEXT,
`goldenXpath` VARCHAR(255) DEFAULT NULL,
`isMatched` TINYINT(1) DEFAULT NULL,
`modifiedTime` DATETIME DEFAULT NULL,
`pathValue` LONGTEXT,
`rundate` DATETIME DEFAULT NULL,
`value` LONGTEXT,
`xpath` VARCHAR(255) DEFAULT NULL,
`deviceId` BIGINT(20) DEFAULT NULL,
`goldenXpathId` BIGINT(20) DEFAULT NULL,
`creator` INT(10) UNSIGNED DEFAULT NULL,
`lastModifier` INT(10) UNSIGNED DEFAULT NULL,
`goldenVariationType` VARCHAR(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `FK6804472AD99F2D15` (`deviceId`),
KEY `FK6804472A98002838` (`goldenXpathId`),
KEY `FK6804472A27C863B` (`creator`),
KEY `FK6804472A3617A57C` (`lastModifier`),
KEY `rundateindex` (`rundate`),
KEY `varitionidindex` (`id`),
KEY `classTypeindex` (`classType`),
CONSTRAINT `FK6804472A27C863B` FOREIGN KEY (`creator`) REFERENCES `users` (`userid`),
CONSTRAINT `FK6804472A3617A57C` FOREIGN KEY (`lastModifier`) REFERENCES `users` (`userid`),
CONSTRAINT `FK6804472A98002838` FOREIGN KEY (`goldenXpathId`) REFERENCES `goldenconfigurationxpath` (`id`),
CONSTRAINT `FK6804472AD99F2D15` FOREIGN KEY (`deviceId`) REFERENCES `devices` (`id`)
) ENGINE=INNODB AUTO_INCREMENT=1868865 DEFAULT CHARSET=latin1;
And the EXPLAIN plan of the query is:
"1" "PRIMARY" "goldenconf0_" "ref" "FK6804472AD99F2D15,classTypeindex" "classTypeindex" "258" "const" "179223" "Using where; Using temporary"
"1" "PRIMARY" "devices2_" "eq_ref" "PRIMARY,deviceindex" "PRIMARY" "8" "cmdb.goldenconf0_.deviceId" "1" "Using where"
"1" "PRIMARY" "devices1_" "eq_ref" "PRIMARY,deviceindex" "PRIMARY" "8" "cmdb.goldenconf0_.deviceId" "1" ""
"3" "DEPENDENT SUBQUERY" "goldenconf4_" "index_subquery" "FK6804472AD99F2D15,classTypeindex" "FK6804472AD99F2D15" "9" "func" "19795" "Using where"
"2" "DEPENDENT SUBQUERY" "goldenconf3_" "index_subquery" "FK6804472AD99F2D15,classTypeindex" "FK6804472AD99F2D15" "9" "func" "19795" "Using where"
Replace the first EXISTS with a join on deviceId, like this:
INNER JOIN goldenvariation goldenconf4_
ON goldenconf4_.deviceId = goldenconf0_.deviceId
AND (goldenconf4_.goldenVariationType = 'MISSING')
AND (goldenconf4_.classType = 'policy-options')
AND DATE(goldenconf4_.rundate)=DATE('2014-04-14 00:00:00')
Change the other EXISTS in the same way. I think this should work much faster. Also, a small tip from me: try to use shorter aliases. Your query is really hard to read.
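Putting both replacements together, the rewritten FROM/WHERE part of the query would look roughly like this (a sketch of the suggestion above, keeping the original SELECT DISTINCT column list and folding the isDeleted filter onto devices1_, since both device joins used the same id):
FROM goldenvariation goldenconf0_
INNER JOIN devices devices1_
    ON goldenconf0_.deviceId = devices1_.id
-- first EXISTS rewritten as a join
INNER JOIN goldenvariation goldenconf3_
    ON goldenconf3_.deviceId = goldenconf0_.deviceId
    AND goldenconf3_.goldenVariationType = 'MISMATCH'
    AND goldenconf3_.classType = 'policy-options'
    AND DATE(goldenconf3_.rundate) = DATE('2014-04-14 00:00:00')
-- second EXISTS rewritten as a join
INNER JOIN goldenvariation goldenconf4_
    ON goldenconf4_.deviceId = goldenconf0_.deviceId
    AND goldenconf4_.goldenVariationType = 'MISSING'
    AND goldenconf4_.classType = 'policy-options'
    AND DATE(goldenconf4_.rundate) = DATE('2014-04-14 00:00:00')
WHERE goldenconf0_.classType = 'policy-options'
  AND DATE(goldenconf0_.rundate) = DATE('2014-04-14 00:00:00')
  AND devices1_.isDeleted = 0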
SELECT DISTINCT
devices1_.id AS id27_,
devices1_.createdTime AS createdT2_27_,
devices1_.deletedOn AS deletedOn27_,
devices1_.deviceAlias AS deviceAl4_27_,
devices1_.deviceName AS deviceName27_,
devices1_.deviceTypeId AS deviceT21_27_,
devices1_.equipmentVendor AS equipmen6_27_,
devices1_.exceptionDetail AS exceptio7_27_,
devices1_.hardwareVersion AS hardware8_27_,
devices1_.ipAddress AS ipAddress27_,
devices1_.isDeleted AS isDeleted27_,
devices1_.loopBack AS loopBack27_,
devices1_.modifiedTime AS modifie12_27_,
devices1_.osVersion AS osVersion27_,
devices1_.productModel AS product14_27_,
devices1_.productName AS product15_27_,
devices1_.routerType AS routerType27_,
devices1_.rundate AS rundate27_,
devices1_.serialNumber AS serialN18_27_,
devices1_.serviceName AS service19_27_,
devices1_.siteId AS siteId27_,
devices1_.siteIdA AS siteIdA27_,
devices1_.status AS status27_,
devices1_.creator AS creator27_,
devices1_.lastModifier AS lastMod25_27_
FROM goldenvariation goldenconf0_
INNER JOIN devices devices1_ ON goldenconf0_.deviceId=devices1_.id
INNER JOIN goldenvariation a on a.deviceId = goldenconf0_.deviceId and a.goldenVariationType = 'MISMATCH'
INNER JOIN goldenvariation b on b.deviceId = goldenconf0_.deviceId and b.goldenVariationType = 'MISSING'
WHERE (goldenconf0_.classType = 'policy-options')
AND DATE(goldenconf0_.rundate) = '2014-04-14'
AND devices1_.isDeleted=0
Try this one. It should work much faster than your query. You joined a table using CROSS JOIN, but not even one column from it was used in the SELECT.
You are looking for elements associated with the goldenvariation table via EXISTS. I would start with that table to get distinct IDs, then join to your devices table. Also, when you wrap the date column in a function, you won't be able to take advantage of an index on it (if rundate is part of an index).
INDEX... ( classType, rundate, goldenVariationType, deviceID )
Change the date clause to >= the date and < the date + 1 day. This way you get the entire range from 12:00:00 AM to 11:59:59 PM of the same day, and the index can use the date column without converting the value for every record.
Also, you are joining to the devices table TWICE on the same matching ID from the goldenvariation table (devices1_ and devices2_), which is wasteful and doesn't accomplish anything.
Your devices table should have an index ON (id, isDeleted)
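In DDL form, those two suggested indexes might look like this (the index names are made up for illustration):
-- composite index covering the filter and join columns on goldenvariation
CREATE INDEX idx_gv_class_rundate_type_device
    ON goldenvariation (classType, rundate, goldenVariationType, deviceId);

-- index supporting the join to devices plus the isDeleted filter
CREATE INDEX idx_devices_id_isdeleted
    ON devices (id, isDeleted);
And the rewritten query itself: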
SELECT
d1.id AS id27,
d1.createdTime AS createdT2_27,
d1.deletedOn AS deletedOn27,
d1.deviceAlias AS deviceAl4_27_,
d1.deviceName AS deviceName27_,
d1.deviceTypeId AS deviceT21_27_,
d1.equipmentVendor AS equipmen6_27_,
d1.exceptionDetail AS exceptio7_27_,
d1.hardwareVersion AS hardware8_27_,
d1.ipAddress AS ipAddress27_,
d1.isDeleted AS isDeleted27_,
d1.loopBack AS loopBack27_,
d1.modifiedTime AS modifie12_27_,
d1.osVersion AS osVersion27_,
d1.productModel AS product14_27_,
d1.productName AS product15_27_,
d1.routerType AS routerType27_,
d1.rundate AS rundate27_,
d1.serialNumber AS serialN18_27_,
d1.serviceName AS service19_27_,
d1.siteId AS siteId27_,
d1.siteIdA AS siteIdA27_,
d1.status AS status27_,
d1.creator AS creator27_,
d1.lastModifier AS lastMod25_27_
from
( SELECT distinct
gv.deviceID
from
goldenVariation gv
where
gv.classType = 'policy-options'
AND gv.runDate >= '2014-04-14'
AND gv.runDate < '2014-04-15'
AND gv.goldenVariationType IN ( 'MISSING', 'MISMATCH' )) PQ
JOIN devices d1
ON PQ.deviceId = d1.id
AND d1.isDeleted = 0
Yes, the query could be rewritten to improve performance (though it looks like a query generated by Hibernate, and getting Hibernate to use a different query can be a challenge.)
How sure are you that this query is returning the resultset you expect? Because the query is rather odd.
Dollars to donuts, it's the repeated executions of the dependent subqueries that are really eating your lunch (and your lunchbox) in terms of performance. It looks like MySQL is using the index on the deviceId column to satisfy those subqueries, and that doesn't look like the most appropriate index.
We notice that there are two JOIN operations to the devices table; there is no reason this table needs to be joined twice. Both JOIN operations require a match to the deviceId column of goldenvariation; the second join to the devices table just adds the isDeleted=0 filter. The keywords INNER and CROSS don't have any impact on the statement at all, and the second join to the devices table isn't really a "cross" join, it's really an inner join. (We prefer to see the join predicates in an ON clause rather than the WHERE clause.)
The DATE() function wrapped around the rundate column disables an index range scan operation. These predicates can be rewritten to take advantage of an appropriate index.
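For illustration, the rewrite amounts to replacing the function-wrapped predicate with a half-open range on the bare column:
-- not sargable: DATE() hides rundate from an index range scan
WHERE DATE(goldenconf0_.rundate) = DATE('2014-04-14 00:00:00')

-- sargable: a range on the bare column can use an index on rundate
WHERE goldenconf0_.rundate >= '2014-04-14'
  AND goldenconf0_.rundate <  '2014-04-14' + INTERVAL 1 DAY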
The DISTINCT(deviceId) in the SELECT list of an EXISTS subquery is very strange. Firstly, DISTINCT is a keyword, not a function. There's no need for parens around deviceId. But beyond that, it doesn't matter what is returned in the SELECT list of the EXISTS subquery, it could just be SELECT 1.
It's odd to see an EXISTS predicate whose subquery doesn't reference any expression from the outer query (i.e. is not a correlated subquery). It's valid syntax. With a correlated subquery, MySQL performs that subquery for each and every row returned by the outer query, and the EXPLAIN output suggests MySQL is doing exactly that here; it didn't recognize any optimization.
The way those EXISTS predicates are written, unless there is both a 'policy-options' row with 'MISMATCH' and a 'policy-options' row with 'MISSING' for the specified date, the query will not return any rows. If a row of each type is found for the specified date, then ALL of the 'policy-options' rows for that date are returned. (It's syntactically valid, but it's rather odd.)
Assuming that the id column on the devices table is UNIQUE (i.e. it's the PRIMARY KEY or there's a UNIQUE index on that column), the DISTINCT keyword is unnecessary on the outermost query. (From the EXPLAIN output, it looks like MySQL already optimized away the usual DISTINCT operations; that is, MySQL recognized that the DISTINCT keyword is unnecessary.)
But bottom line, it's the dependent subqueries that are killing performance; the absence of suitable indexes, and the predicate on the date column wrapped in a function.
To answer your question, yes, this query can be rewritten to return an equivalent resultset more efficiently. (It's not entirely clear that the query is returning the resultset you expect.)
SELECT d1.id AS id27_
, d1.createdTime AS createdT2_27_
, d1.deletedOn AS deletedOn27_
, d1.deviceAlias AS deviceAl4_27_
, d1.deviceName AS deviceName27_
, d1.deviceTypeId AS deviceT21_27_
, d1.equipmentVendor AS equipmen6_27_
, d1.exceptionDetail AS exceptio7_27_
, d1.hardwareVersion AS hardware8_27_
, d1.ipAddress AS ipAddress27_
, d1.isDeleted AS isDeleted27_
, d1.loopBack AS loopBack27_
, d1.modifiedTime AS modifie12_27_
, d1.osVersion AS osVersion27_
, d1.productModel AS product14_27_
, d1.productName AS product15_27_
, d1.routerType AS routerType27_
, d1.rundate AS rundate27_
, d1.serialNumber AS serialN18_27_
, d1.serviceName AS service19_27_
, d1.siteId AS siteId27_
, d1.siteIdA AS siteIdA27_
, d1.status AS status27_
, d1.creator AS creator27_
, d1.lastModifier AS lastMod25_27_
FROM devices d1
JOIN (SELECT g.deviceId
FROM goldenvariation g
CROSS
JOIN (SELECT 1
FROM goldenvariation x3
WHERE x3.goldenVariationType = 'MISMATCH'
AND x3.classType = 'policy-options'
AND x3.rundate >= '2014-04-14'
AND x3.rundate < '2014-04-14' + INTERVAL 1 DAY
LIMIT 1
) t3
CROSS
JOIN (SELECT 1
FROM goldenvariation x4
WHERE x4.goldenVariationType = 'MISSING'
AND x4.classType = 'policy-options'
AND x4.rundate >= '2014-04-14'
AND x4.rundate < '2014-04-14' + INTERVAL 1 DAY
LIMIT 1
) t4
WHERE g.classType = 'policy-options'
AND g.rundate >= '2014-04-14'
AND g.rundate < '2014-04-14' + INTERVAL 1 DAY
GROUP BY g.deviceId
) t2
ON t2.deviceId = d1.id
WHERE d1.isDeleted=0

Need Help Speeding up an Aggregate SQLite Query

I have a table defined like the following...
CREATE TABLE actions (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    end BOOLEAN,
    type VARCHAR(15) NOT NULL,
    subtype_a VARCHAR(15),
    subtype_b VARCHAR(15)
);
I'm trying to query for the last end action of some type to happen on each unique (subtype_a, subtype_b) pair, similar to a group by (except SQLite doesn't say what row is guaranteed to be returned by a group by).
On an SQLite database of about 1MB, the query I have now can take upwards of two seconds, but I need to speed it up to take under a second (since this will be called frequently).
example query:
SELECT * FROM actions a_out
WHERE id =
(SELECT MAX(a_in.id) FROM actions a_in
WHERE a_out.subtype_a = a_in.subtype_a
AND a_out.subtype_b = a_in.subtype_b
AND a_in.status IS NOT NULL
AND a_in.type = "some_type");
If it helps, I know all the unique possibilities for a (subtype_a,subtype_b)
eg:
(a,1)
(a,2)
(b,3)
(b,4)
(b,5)
(b,6)
Beginning with version 3.7.11, SQLite guarantees which record is returned in a group:
Queries of the form: "SELECT max(x), y FROM table" returns the value of y on the same row that contains the maximum x value.
So greatest-n-per-group can be implemented in a much simpler way:
SELECT *, max(id)
FROM actions
WHERE type = 'some_type'
GROUP BY subtype_a, subtype_b
Is this any faster?
SELECT *
FROM actions
WHERE id IN (SELECT max(id)
             FROM actions
             WHERE type = 'some_type'
             GROUP BY subtype_a, subtype_b);
This is the greatest-n-per-group problem that comes up frequently on Stack Overflow.
Here's how I solve it:
SELECT a_out.* FROM actions a_out
LEFT OUTER JOIN actions a_in ON a_out.subtype_a = a_in.subtype_a
AND a_out.subtype_b = a_in.subtype_b
AND a_out.id < a_in.id
WHERE a_out.type = "some type" AND a_in.id IS NULL
If you have an index on (type, subtype_a, subtype_b, id) this should run very fast.
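In SQLite that index could be created like this (the index name is arbitrary):
-- composite index matching the join and filter columns used above
CREATE INDEX idx_actions_type_subtypes_id
    ON actions (type, subtype_a, subtype_b, id);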
See also my answers to similar SQL questions:
Fetch the row which has the Max value for a column
Retrieving the last record in each group
SQL join: selecting the last records in a one-to-many relationship
Or this brilliant article by Jan Kneschke: Groupwise Max.