mysql SQL optimization

mysql SQL optimization - mysql

this query takes an hour
select *,
unix_timestamp(finishtime)-unix_timestamp(submittime) timetaken
from joblog
where jobname like '%cas%'
and submittime>='2013-01-01 00:00:00'
and submittime<='2013-01-10 00:00:00'
order by id desc limit 300;
but the same query with one submittime finishes in like .03 seconds
the table has 2.1 Million rows
Any idea whats causing the issue or how to debug it

Your first step should be to use MySQL EXPLAIN to see what the query is doing. It'll probably give you some insight on how to fix your issue.
My guess is that jobname LIKE '%cas%' is the slowest part because you're doing a wildcard text search. Adding an index here won't even help, because you have a leading wildcard. Is there any way to do this query without a leading wildcard like that? Also adding an index on submittime might improve the speed of this query.

You might try adding a LIMIT to the query and see if that increases the speed that it returns ...
Excerpt from http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
"Sometimes MySQL does not use an index, even if one is available. One circumstance under which this occurs is when the optimizer estimates that using the index would require MySQL to access a very large percentage of the rows in the table. (In this case, a table scan is likely to be much faster because it requires fewer seeks.) However, if such a query uses LIMIT to retrieve only some of the rows, MySQL uses an index anyway, because it can much more quickly find the few rows to return in the result. "

select *,unix_timestamp(finishtime)-unix_timestamp(submittime) timetaken
from joblog
where (submittime between '2013-01-10 00:00:00' and '2013-01-19 00:00:00')
and jobname is not null
and jobname like '%cas%';
this helped
(0.93 seconds)

Related

MySQL query takes more time to fetch data [MySQL]

I have 500000 records table in my MySQL server. When running a query it takes more time for query execution. sometimes it goes beyond a minute.
Below I have added my MySQL machine detail.
RAM-16GB
Processor : Intel(R) -Core™ i5-4460M CPU #3.20GHz
OS: Windows server 64 bit
I know there is no problem with my machine since it is a standalone machine and no other applications there.
Maybe the problem with my query. I have gone through the MySql site and found that I have used proper syntax. But I don't know exactly the reason for the delay in the result.
SELECT SUM(`samplesalesdata50000`.`UnitPrice`) AS `UnitPrice`, `samplesalesdata50000`.`SalesChannel` AS `Channel`
FROM `samplesalesdata50000` AS `samplesalesdata50000`
GROUP BY `samplesalesdata50000`.`SalesChannel`
ORDER BY 2 ASC
LIMIT 200 OFFSET 0
Can anyone please let me know whether the duration, depends on the table or the query that I have used?
Note: Even if try with indexing, there is no much difference in result time.
Thanks

Two approaches to this:
One approach is to create a covering index on the columns needed to satisfy your query. The correct index for your query contains these columns in this order: (SalesChannel, UnitPrice).
Why does this help? For one thing, the index itself contains all data needed to satisfy your query, and nothing else. This means your server does less work.
For another thing, MySQL's indexes are BTREE-organized. That means they're accessible in order. So your query can be satisfied one SalesChannel at a time, and MySQL doesn't need an internal temporary table. That's faster.
A second approach involves recognizing that ORDER BY ... LIMIT is a notorious performance antipattern. You require MySQL to sort a big mess of data, and then discard most of it.
You could try this:
SELECT SUM(UnitPrice) UnitPrice,
SalesChannel Channel
FROM samplesalesdata50000
WHERE SalesChannel IN (
SELECT SalesChannel
FROM samplesalesdata50000
ORDER BY Channel LIMIT 200 OFFSET 0
)
GROUP BY SalesChannel
ORDER BY SalesChannel
LIMIT 200 OFFSET 0
If you have an index on SalesChannel (the covering index mentioned above works) this should speed you up a lot, because your aggregate (GROUP BY) query need only consider a subset of your table.

Your problem with "ORDER BY 2 ASC". Try this "ORDER BY Channel".

If it was MS SQL Server you would use the WITH (NOLOCK)
and the MYSQL equivalent is
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED ;
SELECT SUM(`samplesalesdata50000`.`UnitPrice`) AS `UnitPrice`, `samplesalesdata50000`.`SalesChannel` AS `Channel`
FROM `samplesalesdata50000` AS `samplesalesdata50000`
GROUP BY `samplesalesdata50000`.`SalesChannel`
ORDER BY SalesChannel ASC
LIMIT 200 OFFSET 0
COMMIT ;

To improve on OJones's answer, note that
SELECT SalesChannel FROM samplesalesdata50000
ORDER BY SalesChannel LIMIT 200, 1
will quickly (assuming the index given) find the end of the desired list. Then adding this limits the main query to only the rows needed:
WHERE SalesChannel < (that-select)
There is, however, a problem. If there are fewer than 200 rows in the table, the subquery will return nothing.
You seem to be setting up for "paginating"? In that case, a similar technique can be used to find the starting value:
WHERE SalesChannel >= ...
AND SalesChannel < ...
This also avoids using the inefficient OFFSET, which has to read, then toss, all the rows being skipped over. More
But the real solution may be to build and maintain a Summary Table of the data. It would contain subtotals for each, say, month. Then run the query against the Summary table -- it might be 10x faster. More

Optimizing mysql query with the proper index

I have a table of 15.1 million records. I'm running the following query on it to process the records for duplicate checking.
select id, name, state, external_id
from companies
where dup_checked=0
order by name
limit 500;
When I use explain extended on the query it tells me it's using the index_companies_on_name index which is just an index on the company name. I'm assuming this is due to the ordering. I tried creating other indexes based on the name and dup_checked fields hoping it would use this one as it may be faster, but it still uses the index_companies_on_name index.
Initially it was fast enough, but now we're down to 3.3 million records left to check and this query is taking up to 90 seconds to execute. I'm not quite sure what else to do to make this run faster. Is a different index the answer or something else I'm not thinking of? Thanks.

Generally the trick here is to create an index that filters first, reducing the number of rows ("Cardinality"), and has the ordering applied secondarily:
CREATE INDEX `index_companies_on_dup_checked_name`
ON `companies` (`dup_checked`,`name`)
That should give you the scope you need.

Very slow query, any other ways to format this with better performace?

I have this query (I didn't write) that was working fine for a client until the table got more then a few thousand rows in it, now it's taking 40 seconds+ on only 4200 rows.
Any suggetions on how to optimize and get the same result?
I've tried a few other methods but didn't get the correct result that this slower query returned...
SELECT COUNT(*) AS num
FROM `fl_events`
WHERE id IN(
SELECT DISTINCT (e2.id)
FROM `fl_events` AS e1, fl_events AS e2
WHERE e1.startdate >= now() AND e1.startdate = e2.startdate
)
ORDER BY `startdate`
Any help would be greatly appriciated!

Appart from the obvious indexes needed, I don't really get why you are joining your table with itself for choosing the IN condition. The ORDER BY is also not needed. Are you sure that your query can't be written just like this?:
SELECT COUNT(*) AS num
FROM `fl_events` AS e1
WHERE e1.startdate >= now()

I don't think rewriting the query will help. The key to your question is "until the table got more than a few thousand rows." This implies that important columns aren't indexed. Prior to a certain number of records, all the data fit on a single memory block - over that point, it takes a new block. And index is the only way to speed up the search.
first - check to see that the ID in fl_events is actually marked as a primary key. That physically orders the records and without it you can see data corruption and occasionally super-slow results. The use of distinct in the query makes it look like it might NOT be a unique value. That will pose a problem.
Then, make sure to add an index on the start_date.

The slowness is probably related to the join of the event table with itself, and possibly startdate not having an index.

Removing 'using filesort' from query

I have the following query:
SELECT *
FROM shop_user_member_spots
WHERE delete_flag = 0
ORDER BY date_spotted desc
LIMIT 10
When run, it takes a few minutes. The table is around 2.5 million rows.
Here is the table (not designed by me but I am able to make some changes):
And finally, here are the indexes (again, not made by me!):
I've been attempting to get this query running fast for hours now, to no avail.
Here is the output of EXPLAIN:
Any help / articles are much appreciated.

Based on your query, it seems the index you would want would be on (delete_flag, date_spotted). You have an index that has the two columns, but the id column is in between them, which would make the index unhelpful in sorting based on date_spotted. Now whether mysql will use the index based on Zohaib's answer I can't say (sorry, I work most often with SQL Server).

The problem that I see in the explain plan is that the index on spotted date is not being used, insted filesort mechanism is being used to sort (as far as index on delete flag is concerned, we actually gain performance benefit of index if the column on which index is being created contains unique values)
the mysql documentation says
Index will not used for order by clause if
The key used to fetch the rows is not the same as the one used in the ORDER BY:
SELECT * FROM t1 WHERE key2=constant ORDER BY key1;
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
I guess same is the case here. Although you can try using Force Index

Mysql, does using index for selects with operators >,< improve performance?

I need to select a row from table that has more than 5 millions of rows. It is table of all IP ranges. Every row has columns upperbound and lowerbound. All are bigintegers, and the number is integer representation of IP address.
The select is:
select *
from iptocity
where lowerbound < 3529167967
and upperbound >= 3529167967
limit 1;
My problem is
...that the select takes too long time. Without index on an InnoDB table, it takes 20 seconds. If it is on MyISAM then it takes 6 seconds. I need it to be less than 0.1 sec. And I need to be able to handle hundreds of such selects per second.
I tried to create index on InnoDB, on both columns upperbound and lowerbound. But it didn't help, it took even more time, like 40 seconds to select ip. What is wrong? Why did index not help? Does index work only on = operator, and >,< operators can't use indexes? What should I do to make it faster to have less than 0.1 sec selection time?

Did you create one index on both columns, or two indexes one column each? If only one index, then this could be your problem, make one index for each. Indexes should still work for < and >
Aside from changing indexes around, run:
EXPLAIN
select *
from iptocity
where lowerbound<3529167967
and upperbound>=3529167967
limit 1;
Same query as you had, just added EXPLAIN and some linebreaks for readability.
The EXPLAIN keyword will make MySQL explain to you how it's running the query, and it will tell you whether indexes are being used or not. For any slow running query, always use explain to try figuring out what's going on.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

mysql SQL optimization - mysql

select *,unix_timestamp(finishtime)-unix_timestamp(submittime) timetaken from joblog where (submittime between '2013-01-10 00:00:00' and '2013-01-19 00:00:00') and jobname is not null and jobname like '%cas%'; this helped (0.93 seconds)

Related

MySQL query takes more time to fetch data [MySQL]

Optimizing mysql query with the proper index

Very slow query, any other ways to format this with better performace?

Removing 'using filesort' from query

Mysql, does using index for selects with operators >,< improve performance?

Categories

Resources