I have a MySQL update statement that is running too long: 52 seconds.
update table_ea ea, table_a a
set ea.match_creator='S', a.match_state=N
where
ea.source_id=a.asset_id and
ea.source_name='S' and
ea.match_creator='S' and
ea.entity_id like 'S'
Question:
a) Can we run EXPLAIN on this update statement in MySQL, as we do for SELECT statements?
b) Any suggestions on how to minimize the update time?
See how the corresponding select statement is performing. You are probably missing an index.
You'll need to post the table information if you want us to check.
Try posting SHOW CREATE TABLE table_ea and SHOW CREATE TABLE table_a
EXPLAIN SELECT ea.match_creator, a.match_state
FROM table_ea ea, table_a a
WHERE ea.source_id=a.asset_id
AND ea.source_name='S'
AND ea.match_creator='S'
AND ea.entity_id like 'S'
You should create indexes on the following columns to make it quicker (they speed up both the join and the filtering):
ea.source_id
a.asset_id
ea.source_name
ea.match_creator
ea.entity_id
I also recommend that you replace the like operator for entity_id with an equals operator, because with no wildcard in the pattern the two match the same rows here.
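As a rough illustration of the advice above, the sketch below runs the equivalent SELECT through a query planner before and after adding indexes. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the plan text differs from MySQL's EXPLAIN output. The column types, the index names (idx_ea_filter, idx_a_asset) and the choice of one composite index on table_ea are assumptions of this sketch, not something taken from the posts. On MySQL 5.6 and later, EXPLAIN can also be applied to the UPDATE statement itself.

```python
import sqlite3

# SQLite stands in for MySQL; the table layouts are assumed, not taken from the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_ea (
        source_id     INTEGER,
        source_name   TEXT,
        match_creator TEXT,
        entity_id     TEXT
    );
    CREATE TABLE table_a (
        asset_id    INTEGER,
        match_state TEXT
    );
""")

# SELECT equivalent of the slow UPDATE, with LIKE replaced by '=' as suggested.
QUERY = """
    SELECT ea.match_creator, a.match_state
    FROM table_ea ea
    JOIN table_a a ON ea.source_id = a.asset_id
    WHERE ea.source_name = 'S'
      AND ea.match_creator = 'S'
      AND ea.entity_id = 'S'
"""


def plan(connection, sql):
    """Return the human-readable 'detail' column of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in connection.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan(conn, QUERY)

# Hypothetical index names; a composite index is one way to cover the listed columns.
conn.executescript("""
    CREATE INDEX idx_ea_filter
        ON table_ea (source_name, match_creator, entity_id, source_id);
    CREATE INDEX idx_a_asset ON table_a (asset_id);
""")

plan_after = plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

Comparing the two printed plans shows whether the planner switches from full scans to the named indexes; the same before/after comparison is what you would look for in MySQL's EXPLAIN output.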
Related
Using an UPDATE query, is it possible to refer to a stored select query?
I'd like to accomplish something like this:
UPDATE ... WHERE ... IN [MY_STORED_PROCEDURE]
Perhaps something on these lines:
UPDATE ...
WHERE ID IN (SELECT ID FROM MyStoredProc)
Depending on your set up, a join may also be possible. You can add stored queries to the query design window, but you do not always end up with an updatable query, it usually depends on your indexes.
How do I optimize the following update because the sub-query is being executed for each row in table a?
update
a
set
col = 1
where
col_foreign_id not in (select col_foreign_id in b)
You could potentially use an outer join where there are no matching records instead of your not in:
update table1 a
left join table2 b on a.col_foreign_id = b.col_foreign_id
set a.col = 1
where b.col_foreign_id is null
This should use a simple select type rather than a dependent subquery.
Your current query (or the one that actually works since the example in the OP doesn't look like it would) is potentially dangerous in that a NULL in b.col_foreign_id would cause nothing to match, and you'd update no rows.
not exists would also be something to look at if you want to replace not in.
I can't tell you that this will make your query any faster, but there is some good info here. You'll have to test in your environment.
Here's a SQL Fiddle illuminating the differences between in, exists, and outer join (check the rows returned, null handling, and execution plans).
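The NULL pitfall described above can be reproduced with a tiny data set. This is a minimal sketch using Python's sqlite3 module rather than MySQL, with made-up rows; it selects the ids that each variant of the WHERE clause would update, since SQLite does not accept MySQL's UPDATE ... LEFT JOIN syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, col_foreign_id INTEGER, col INTEGER DEFAULT 0);
    CREATE TABLE b (col_foreign_id INTEGER);
    INSERT INTO a (id, col_foreign_id) VALUES (1, 1), (2, 2), (3, 3);
    -- One matching value and one NULL: the NULL is what breaks NOT IN.
    INSERT INTO b (col_foreign_id) VALUES (1), (NULL);
""")


def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]


# NOT IN: 'x NOT IN (1, NULL)' is never true, so no row qualifies.
not_in_ids = ids("""
    SELECT id FROM a
    WHERE col_foreign_id NOT IN (SELECT col_foreign_id FROM b)
    ORDER BY id
""")

# NOT EXISTS: only asks whether a matching row exists, so NULLs in b are harmless.
not_exists_ids = ids("""
    SELECT id FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.col_foreign_id = a.col_foreign_id)
    ORDER BY id
""")

# LEFT JOIN ... IS NULL: keeps the rows of a that found no partner in b.
left_join_ids = ids("""
    SELECT a.id FROM a
    LEFT JOIN b ON a.col_foreign_id = b.col_foreign_id
    WHERE b.col_foreign_id IS NULL
    ORDER BY a.id
""")

print(not_in_ids, not_exists_ids, left_join_ids)
```

With this data the NOT IN variant selects nothing, while NOT EXISTS and the outer join both select the rows with ids 2 and 3. Which of the latter two is faster still has to be measured on the real tables.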
I'm trying to find the most efficient way to determine if a table row exists.
I have in mind 3 options:
SELECT EXISTS(SELECT 1 FROM table1 WHERE some_condition);
SELECT 1 FROM table1 WHERE some_condition LIMIT 0,1;
SELECT COUNT(1) FROM table1 WHERE some_condition;
It seems that for MySQL the first approach is more efficient:
Best way to test if a row exists in a MySQL table
Is it true in general for any database?
UPDATE:
I've added a third option.
UPDATE2:
Let's assume the database products are MySQL, Oracle and SQL Server.
I would do
SELECT COUNT(1) FROM table1 WHERE some_condition;
But I don't think it makes a significant difference unless you call it a lot (in which case, I'd probably use a different strategy).
If you mean to use as a test if AT LEAST ONE row exists with some condition (1 or 0, true or false), then:
select count(1) from my_table where ... and rownum < 2;
Oracle can stop counting after it gets a hit.
EXISTS is faster because it only has to report whether at least one row matches the subquery, so it can stop at the first match instead of producing the whole result.
The different methods have different pros and cons:
SELECT EXISTS(SELECT 1 FROM table1 WHERE some_condition);
might be the fastest on MySQL, but
SELECT COUNT(1) FROM table1 WHERE some_condition;
as in @Luis's answer gives you the count.
More to the point I recommend you take a look at your business logic: Very seldom is it necessary to just see if a row exists, more often you will want to
either use these rows, so just do the select and handle the 0-rows case
or you will want to change these rows, in which case just do your update and check mysql_affected_rows()
If you want to INSERT a row if it doesn't already exist, take a look at INSERT .. ON DUPLICATE KEY or REPLACE INTO
EXISTS is defined in standard SQL; it isn't only a MySQL function: http://www.techonthenet.com/sql/exists.php
and I usually use this function to test if a particular row exists.
However in Oracle I've seen many times the other approach suggested before:
SELECT COUNT(1) FROM table1 WHERE some_condition;
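To make the three options from the question concrete, here is a minimal sketch that wraps each one in a small helper and checks that they agree. It uses Python's sqlite3 module, not MySQL, Oracle or SQL Server, and the status column and helper names are hypothetical. It shows only that the three forms answer the same yes/no question; it says nothing about their relative speed on any particular database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO table1 (status) VALUES ('open'), ('open'), ('closed');
""")


def exists_via_exists(status):
    # Option 1: EXISTS always returns exactly one row containing 0 or 1.
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM table1 WHERE status = ?)", (status,)
    ).fetchone()
    return bool(row[0])


def exists_via_limit(status):
    # Option 2: LIMIT 1 returns one row if anything matches, otherwise no row at all.
    row = conn.execute(
        "SELECT 1 FROM table1 WHERE status = ? LIMIT 1", (status,)
    ).fetchone()
    return row is not None


def exists_via_count(status):
    # Option 3: COUNT visits every matching row, then we compare the total with zero.
    row = conn.execute(
        "SELECT COUNT(1) FROM table1 WHERE status = ?", (status,)
    ).fetchone()
    return row[0] > 0


checks = (exists_via_exists, exists_via_limit, exists_via_count)
found = [check("open") for check in checks]
missing = [check("archived") for check in checks]
print(found, missing)
```

Note the difference in how the caller handles the result: the LIMIT form has to test for a missing row, whereas the EXISTS and COUNT forms always return a single value.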
From a comment on https://stackoverflow.com/a/11064/247702
You save the query planner from needing to figure that out by using
either Answer.Text or a.Text. It doesn't matter whether you use the
table name or the alias, but qualifying the field helps.
Is this true for SQL Server 2008 when querying a single table? For example, will this
select
mt.myfield
from
mytable mt
where
mt.myid = 1
be faster than this?
select
myfield
from
mytable
where
myid = 1
I could test this of course, but I don't have a large enough dataset, nor do I know how to reliably test SQL Server performance.
In the case that you presented I think it would be the same thing.
The only problem might be when you join multiple tables and the query optimizer needs to locate the column used in the where clause (among all the tables). If you use aliases, the query optimizer already knows which table each column belongs to.
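The point about joins can be seen in terms of name resolution rather than timing. The sketch below uses Python's sqlite3 module, not SQL Server 2008, and a hypothetical second table (othertable) that shares the myid column. On a single table both spellings return the same row, but once a join brings in a second myid, the unqualified name can no longer be resolved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (myid INTEGER PRIMARY KEY, myfield TEXT);
    -- Hypothetical second table that happens to share the column name myid.
    CREATE TABLE othertable (myid INTEGER PRIMARY KEY, note TEXT);
    INSERT INTO mytable VALUES (1, 'hello');
    INSERT INTO othertable VALUES (1, 'world');
""")

# Single table: qualified and unqualified column names resolve to the same thing.
unqualified = conn.execute(
    "SELECT myfield FROM mytable WHERE myid = 1"
).fetchall()
qualified = conn.execute(
    "SELECT mt.myfield FROM mytable mt WHERE mt.myid = 1"
).fetchall()

# Join: the bare 'myid' in the WHERE clause now exists in both tables.
try:
    conn.execute("""
        SELECT myfield
        FROM mytable mt
        JOIN othertable ot ON mt.myid = ot.myid
        WHERE myid = 1
    """)
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualifying the column with its alias removes the ambiguity.
joined = conn.execute("""
    SELECT mt.myfield
    FROM mytable mt
    JOIN othertable ot ON mt.myid = ot.myid
    WHERE mt.myid = 1
""").fetchall()

print(unqualified, qualified, ambiguous_error, joined)
```

This demonstrates the correctness side of qualifying columns only. Whether it changes execution time on a given engine would need to be measured there.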
I just read this article:
http://use-the-index-luke.com/sql/clustering/index-only-scan-covering-index
And at the bottom is this statement:
Queries that do not select any table columns are often executed as index-only scan.
Can you think of a meaningful example?
Problem is, there is no comments section, so I just want to verify, this is one example, correct?
SELECT 1 FROM `table_name` WHERE `indexed_column` = ?
This is to check whether a specified row exists.
So the questions:
Are there any more practical uses for that?
As a side note, I read somewhere that the above query might be more performant if encapsulated in EXISTS, I'm not sure how to check if it's true:
SELECT EXISTS(SELECT 1 FROM `table_name` WHERE `indexed_column` = ? LIMIT 1)
Is it?
Well, possibly the canonical example would be select count(*) from mytable to get a row count.
That selects no data from the table and would most likely be satisfied by the primary key index, if available.
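One way to check whether a query is answered from the index alone is to look at the query plan. The sketch below does that with Python's sqlite3 module, so the wording of the plan is SQLite's, not MySQL's (where EXPLAIN reports "Using index" in the Extra column). The payload column and the index name idx_indexed_column are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (
        id             INTEGER PRIMARY KEY,
        indexed_column TEXT,
        payload        TEXT
    );
    CREATE INDEX idx_indexed_column ON table_name (indexed_column);
    INSERT INTO table_name (indexed_column, payload) VALUES ('x', 'row data');
""")


def plan(sql, params=()):
    """Return the 'detail' column of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Selects no table columns: the index alone can answer it.
no_columns_plan = plan(
    "SELECT 1 FROM table_name WHERE indexed_column = ?", ("x",)
)

# Same query wrapped in EXISTS, as in the question.
exists_plan = plan(
    "SELECT EXISTS(SELECT 1 FROM table_name WHERE indexed_column = ? LIMIT 1)",
    ("x",),
)

# Selects a column that is not in the index: the table row must be visited too.
with_payload_plan = plan(
    "SELECT payload FROM table_name WHERE indexed_column = ?", ("x",)
)

print(no_columns_plan)
print(exists_plan)
print(with_payload_plan)
```

In SQLite the first plan reports a covering index while the third reports an ordinary index lookup, which is the distinction the article's statement is about.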