mysql query is taking too much time to execute - mysql

Hello everyone I am working on phpmyadmin database. Whenever I try to execute query it takes too much time more than 10 mins to show results. Is there any way to speed it up. please response.
The query is
SELECT ib.*, b.brand_name, m.model_name,
s.id as sale_id, br.branch_code,br.branch_name,r.rentry_date,r.id as rid
from in_book ib
left join brand b on ib.brand_id=b.id
left join model m on ib.vehicle_id=m.id
left join re_entry r on r.in_book_id=ib.id
left join sale s on ib.id=s.in_book_id
left join branch br on ib.branch_id=br.id
where ib.id !=''
and ib.branch_id='65'
group by ib.id
order by r.id ASC,
count(r.in_book_id) DESC ,
ib.purchaes_date ASC,
ib.id ASC
there are almost 7 tables

make sure you got an index on every key you use to join the tables.
from http://dev.mysql.com/doc/refman/5.5/en/optimization-indexes.html:
The best way to improve the performance of SELECT operations is to create indexes on one or more of the columns that are tested in the query. The index entries act like pointers to the table rows, allowing the query to quickly determine which rows match a condition in the WHERE clause, and retrieve the other column values for those rows. All MySQL data types can be indexed.
.. this of course also applies to the JOIN conditions.

You don't list any such indexes, however, I would start with the following suggested indexes
table index
in_book ( branch_id, id, brand_id, vehicle_id )
brand ( id, brand_name )
model ( id, model_name )
re_entry ( in_book_id, id, reentry_date )
sale ( in_book_id, id )
branch ( id )
Also, with MySQL, you can use a special keyword "STRAIGHT_JOIN" which tells the engine to query in the order you have selected the tables... Although you are doing LEFT JOINs, I don't think it will matter as it appears the secondary tables are all lookup type of tables and in_book is your primary. But as just a try it would be..
SELECT STRAIGHT_JOIN (...rest of query...)

Related

MySQL Spring complicated query - ways to order and query efficiency

I run this complicated query on Spring JPA Repository.
My goal is to get all info from the site table, ordering it by events severity on each site.
This is my query:
SELECT alls.* FROM sites AS alls JOIN
(
SELECT distinct ets.id FROM
(
SELECT s.id, et.`type`, et.severity_level, COUNT(et.`type`) FROM sites AS s
JOIN users_sites AS us ON (s.id=us.site_id)
JOIN users AS u ON (us.user_id=u.user_id)
JOIN areas AS a ON (s.id=a.site_id)
JOIN panels AS p ON (a.id=p.area_id)
JOIN events AS e ON (p.id=e.panel_id)
JOIN event_types AS et ON (e.event_type_id=et.id)
WHERE u.user_id="98765432-123a-1a23-123b-11a1111b2cd3"
GROUP BY s.id , et.`type`, et.severity_level
ORDER BY et.severity_level, COUNT(et.`type`) DESC
) AS ets
) as etsd ON alls.id = etsd.id
The second select (the one with "distinct") returns site_ids ordered correctly by severity.
Note that there are different event_types + severity in each site, and I use pagination on the answer, so I need the distinct.
The problem is - the main select doesn't keep this order.
Is there any way to keep the order in one complicated query?
Another related question - one of my ideas was making two queries:
The "select distinct" query that will return me the order --> saved in a list "order list"
The main "sites" query (that becomes very simple) with "where id in {"order list"}
Order the second query in code by "order list".
I use the query every 10 seconds, so it is very sensitive on performance.
What seems to be faster in this case - original complicated query or those 2?
Any insight will be appreciated.
Tnx a lot.
A quirk of SQL's declarative set-oriented syntax for us procedural programmers: ORDER by clauses in subqueries are not carried through to the outer query, except sometimes by accident. If you want ordering at any query level, you must specify it at that level or you will get unpredictable results. The query optimizers are usually smart enough to avoid wasting sort operations.
Your requirement: give at most one sites row for each sites.id value, ordered by the worst event. Worst: lowest event severity, and if there are more than one event with lowest severity, the largest count.
Use this sort of thing to get the "worst" for each id, in place of DISTINCT.
SELECT id, MIN(severity_level) severity_level, MAX(num) num
FROM (
/* your inner query */
) ets
GROUP BY id
This gives at most one row per sites.id value. Then your outer query is
SELECT alls.*
FROM sites alls
JOIN (
SELECT id, MIN(severity_level) severity_level, MAX(num) num
FROM (
/* your inner query */
) ets
GROUP BY id
) worstevents ON alls.id = worstevents.id
ORDER BY worstevents.severity_level, worstevents.num DESC, alls.id
Putting it all together:
SELECT alls.*
FROM sites alls
JOIN (
SELECT id, MIN(severity_level) severity_level, MAX(num) num
FROM (
SELECT s.id, et.severity_level, COUNT(et.`type`) num
FROM sites AS s
JOIN users_sites AS us ON (s.id=us.site_id)
JOIN users AS u ON (us.user_id=u.user_id)
JOIN areas AS a ON (s.id=a.site_id)
JOIN panels AS p ON (a.id=p.area_id)
JOIN events AS e ON (p.id=e.panel_id)
JOIN event_types AS et ON (e.event_type_id=et.id)
WHERE u.user_id="98765432-123a-1a23-123b-11a1111b2cd3"
GROUP BY s.id , et.`type`, et.severity_level
) ets
GROUP BY id
) worstevents ON alls.id = worstevents.id
ORDER BY worstevents.severity_level, worstevents.num DESC, alls.id
An index on users.user_id will help performance for these single-user queries.
If you still have performance trouble, please read this and ask another question.

MySQL: Optimizing Sub-queries

I have this query I need to optimize further since it requires too much cpu time and I can't seem to find any other way to write it more efficiently. Is there another way to write this without altering the tables?
SELECT category, b.fruit_name, u.name
, r.count_vote, r.text_c
FROM Fruits b, Customers u
, Categories c
, (SELECT * FROM
(SELECT *
FROM Reviews
ORDER BY fruit_id, count_vote DESC, r_id
) a
GROUP BY fruit_id
) r
WHERE b.fruit_id = r.fruit_id
AND u.customer_id = r.customer_id
AND category = "Fruits";
This is your query re-written with explicit joins:
SELECT
category, b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN
(
SELECT * FROM
(
SELECT *
FROM Reviews
ORDER BY fruit_id, count_vote DESC, r_id
) a
GROUP BY fruit_id
) r on r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
CROSS JOIN Categories c
WHERE c.category = 'Fruits';
(I am guessing here that the category column belongs to the categories table.)
There are some parts that look suspicious:
Why do you cross join the Categories table, when you don't even display a column of the table?
What is ORDER BY fruit_id, count_vote DESC, r_id supposed to do? Sub query results are considered unordered sets, so an ORDER BY is superfluous and can be ignored by the DBMS. What do you want to achieve here?
SELECT * FROM [ revues ] GROUP BY fruit_id is invalid. If you group by fruit_id, what count_vote and what r.text_c do you expect to get for the ID? You don't tell the DBMS (which would be something like MAX(count_vote) and MIN(r.text_c)for instance. MySQL should through an error, but silently replacescount_vote, r.text_cbyANY_VALUE(count_vote), ANY_VALUE(r.text_c)` instead. This means you get arbitrarily picked values for a fruit.
The answer hence to your question is: Don't try to speed it up, but fix it instead. (Maybe you want to place a new request showing the query and explaining what it is supposed to do, so people can help you with that.)
Your Categories table seems not joined/related to the others this produce a catesia product between all the rows
If you want distinct resut don't use group by but distint so you can avoid an unnecessary subquery
and you dont' need an order by on a subquery
SELECT category
, b.fruit_name
, u.name
, r.count_vote
, r.text_c
FROM Fruits b
INNER JOIN Customers u ON u.customer_id = r.customer_id
INNER JOIN Categories c ON ?????? /Your Categories table seems not joined/related to the others /
INNER JOIN (
SELECT distinct fruit_id, count_vote, text_c, customer_id
FROM Reviews
) r ON b.fruit_id = r.fruit_id
WHERE category = "Fruits";
for better reading you should use explicit join syntax and avoid old join syntax based on comma separated tables name and where condition
The next time you want help optimizing a query, please include the table/index structure, an indication of the cardinality of the indexes and the EXPLAIN plan for the query.
There appears to be absolutely no reason for a single sub-query here, let alone 2. Using sub-queries mostly prevents the DBMS optimizer from doing its job. So your biggest win will come from eliminating these sub-queries.
The CROSS JOIN creates a deliberate cartesian join - its also unclear if any attributes from this table are actually required for the result, if it is there to produce multiples of the same row in the output, or just an error.
The attribute category in the last line of your query is not attributed to any of the tables (but I suspect it comes from the categories table).
Further, your code uses a GROUP BY clause with no aggregation function. This will produce non-deterministic results and is a bug. Assuming that you are not exploiting a side-effect of that, the query can be re-written as:
SELECT
category, b.fruit_name, u.name, r.count_vote, r.text_c
FROM Fruits b
JOIN Reviews r
ON r.fruit_id = b.fruit_id
JOIN Customers u ON u.customer_id = r.customer_id
ORDER BY r.fruit_id, count_vote DESC, r_id;
Since there are no predicates other than joins in your query, there is no scope for further optimization beyond ensuring there are indexes on the join predicates.
As all too frequently, the biggest benefit may come from simply asking the question of why you need to retrieve every single row in the tables in a single query.

MySQL Query Optimization using multiple joins

I'm having trouble optimizing a query and could use some help. I'm currently pulling in events in a system that has to join several other tables to make sure the event is supposed to display, etc... The query was running smoothly (around 480ms) until I introduced another table in the mix. The query is as follows:
SELECT
keyword_terms,
`esf`.*,
`venue`.`name` AS venue_name,
...
`venue`.`zip`, ase.region_id,
(DATE(NOW()) BETWEEN...AND ase.region_id IS NULL) as featured,
getDistance(`venue`.`lat`, `venue`.`lng`, 36.073, -79.7903) as distance,
`network_exclusion`.`id` as net_exc_id
FROM (`event_search_flat` esf)
# Problematic part of query (pulling in the very next date for the event)
LEFT JOIN (
SELECT event_id, MIN(TIMESTAMP(CONCAT(event_date.date, ' ', event_date.end_time))) AS next_date FROM event_date WHERE
event_date.date >= CURDATE() OR (event_date.date = CURDATE() AND TIME(event_date.end_time) >= TIME(NOW()))
GROUP BY event_id
) edate ON edate.event_id=esf.object_id
# Pull in associated ad space
LEFT JOIN `ad_space` ads ON `ads`.`data_type`=`esf`.`data_type` AND ads.object_id=esf.object_id
# and make sure it is featured within region
LEFT JOIN `ad_space_exclusion` ase ON ase.ad_space_id=ads.id AND region_id =5
# Get venue details
LEFT JOIN `venue` ON `esf`.`venue_id`=`venue`.`id`
# Make sure this event should be listed
LEFT JOIN `network_exclusion` ON network_exclusion.data_type=esf.data_type
AND network_exclusion.object_id=esf.object_id
AND network_exclusion.region_id=5
WHERE `esf`.`event_type` IN ('things to do')
AND (`edate`.`next_date` >= '2013-07-18 16:23:53')
GROUP BY `esf`.`esf_id`
HAVING `net_exc_id` IS NULL
AND `distance` <= 40
ORDER BY DATE(edate.next_date) asc,
`distance` asc
LIMIT 6
It seems that the issue lies with the event_date table, but I'm unsure how to optimize this query (I tried various views, indexes, etc... to no avail). I ran EXPLAIN and received the following: http://cl.ly/image/3r3u1o0n2A46 .
At the moment, the query is taking 6.6 seconds. Any help would be greatly appreciated.
You may be able to get Using index on the event_date subquery by creating a compound index over (event_id, date, end_time). That may turn the subquery into an index-only query, which should speed it up slightly.
The subquery might be better written as the following, without GROUP BY:
SELECT event_id, TIMESTAMP(CONCAT(event_date.date, ' ', event_date.end_time))) AS next_date
FROM event_date
WHERE event_date.date >= CURDATE()
OR (event_date.date = CURDATE() AND TIME(event_date.end_time) >= TIME(NOW()))
ORDER BY next_date LIMIT 1
I'm more concerned that your EXPLAIN shows so many tables with type=ALL. That means it has to read every row from those tables and compare to them rows in other tables. You can get an idea of how much work it's doing by multiplying the values in the rows column. Basically, it's making billions of row comparisons to resolve the joins. As the tables grow, this query will get a lot worse.
Using LEFT [OUTER] JOIN has a specific purpose, and if you really mean to use INNER JOIN you should do that, because using an outer join where it doesn't belong can mess up the optimization. Use an outer join like A LEFT JOIN B only if you want rows in A that may not have matching rows in B.
For example, I assume based on column naming convention that LEFT JOIN venue ON esf.venue_id=venue.id should be an inner join, because there should always be a venue referenced by esf.venue_id (unless esf.venue_id is sometimes null).
event_search_flat should have a compound index with columns used in the WHERE clause first, then columns to join to other tables: (event_type, object_id, data_type, event_id)
ad_space should have a compound index for the join: (data_type, object_id). Does this need to be an inner join too?
ad_space_exclusion should have a compound index for the join: (ad_space_id, region_id)
network_exclusion should have a compound index for the join: (data_type, object_id, region_id)
venue is okay because it's doing a primary key lookup already.

how can I make this query more efficient?

edit: here is a simplified version of the original query (runs in 3.6 secs on a products table of 475K rows)
SELECT p.*, shop FROM products p JOIN
users u ON p.date >= u.prior_login and u.user_id = 22 JOIN
shops s ON p.shop_id = s.shop_id
ORDER BY shop, date, product_id;
this is the explain plan
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE u const PRIMARY,prior_login,user_id PRIMARY 4 const 1 Using temporary; Using filesort
1 SIMPLE s ALL PRIMARY NULL NULL NULL 90
1 SIMPLE p ref shop_id,date,shop_id_2,shop_id_3 shop_id 4 bitt3n_minxa.s.shop_id 5338 Using where
the bottleneck seems to be ORDER BY date,product_id. Removing these two orderings, the query runs in 0.06 seconds. (Removing either one of the two (but not both) has virtually no effect, query still takes over 3 seconds.) I have indexes on both product_id and date in the products table. I have also added an index on (product,date) with no improvement.
newtover suggests the problem is the fact that the INNER JOIN users u1 ON products.date >= u1.prior_login requirement is preventing use of the index on products.date
Two variations of the query that execute in ~0.006 secs (as opposed to 3.6 secs for the original) have been suggested to me (not from this thread).
this one uses a subquery, which appears to force the order of the joins
SELECT p.*, shop
FROM
(
SELECT p.*
FROM products p
WHERE p.date >= (select prior_login FROM users where user_id = 22)
) as p
JOIN shops s
ON p.shop_id = s.shop_id
ORDER BY shop, date, product_id;
this one uses the WHERE clause to do the same thing (although the presence of SQL_SMALL_RESULT doesn't change the execution time, 0.006 secs without it as well)
SELECT SQL_SMALL_RESULT p . * , shop
FROM products p
INNER JOIN shops s ON p.shop_id = s.shop_id
WHERE p.date >= (
SELECT prior_login
FROM users
WHERE user_id =22 )
ORDER BY shop, DATE, product_id;
My understanding is that these queries work much faster on account of reducing the relevant number of rows of the product table before joining it to the shops table. I am wondering if this is correct.
Use the EXPLAIN statement to see the execution plan. Also you can try adding an index to products.date and u1.prior_login.
Also please just make sure you have defined your foreign keys and they are indexed.
Good luck.
We do need an explain plan... but
Be very careful of select * from table where id in (select id from another_table) This is a notorious. Generally these can be replaced by a join. The following query might run, although I haven't tested it.
SELECT shop,
shops.shop_id AS shop_id,
products.product_id AS product_id,
brand,
title,
price,
image AS image,
image_width,
image_height,
0 AS sex,
products.date AS date,
fav1.favorited AS circle_favorited,
fav2.favorited AS session_user_favorited,
u2.username AS circle_username
FROM products
LEFT JOIN favorites fav2
ON fav2.product_id = products.product_id
AND fav2.user_id = 22
AND fav2.current = 1
INNER JOIN shops
ON shops.shop_id = products.shop_id
INNER JOIN users u1
ON products.date >= u1.prior_login AND u1.user_id = 22
LEFT JOIN favorites fav1
ON products.product_id = fav1.product_id
LEFT JOIN friends f1
ON f1.star_id = fav1.user_id
LEFT JOIN users u2
ON fav1.user_id = u2.user_id
WHERE f1.fan_id = 22 OR fav1.user_id = 22
ORDER BY shop,
DATE,
product_id,
circle_favorited
the fact that the query is slow because of the ordering is rather obvious since it is hard to find an index that would to apply ORDER BY in this case. The main problem is products.date >= comparison which breaks using any index for ORDER BY. And since you have a lot of data to output, MySQL starts using temporary tables for sorting.
what i would to is to try to force MySQL output data in the order of an index which already has the required order and remove the ORDER BY clause.
I am not at a computer to test, but how would I do it:
I would do all inner joins
then I would LEFT JOIN to a subquery which makes all computations on favorites ordered by product_id, circle_favourited (which would provide the last ordering criterion).
So, the question is how to make the data be sorted on shop, date, product_id
I am going to write about it a bit later =)
UPD1:
You should probably read something on how btree indexes work in MySQL. There is a good article on mysqlperformanceblog.com about it (I currently write from a mobile and don't have the link at hand). In short, you seem to talk about one-column indexes which arrange pointers to rows based on values sorted in a single column. Compound indexes store an order based on several columns. Indexes mostly used to operate on clearly defined ranges of them to obtain most of the information before retrieving data from the rows they point at. Indexes usually do not know about other indexes on the same table, as result they are rarely merged. when there is no more info to take from the index, MySQL starts to operate directly on data.
That is an index on date can not make use of the index on product_id, but an index on (date, product_id) can get some more info on product_id after a condition on date (sort on product id for a specific date match).
Nevertheless, a range condition on date (>=) breaks this. That is what I was talking about.
UPD2:
As I uderstand the problem can be reduced to (most of the time it spends on that):
SELECT p.*, shop
FROM products p
JOIN users u ON p.`date` >= u.prior_login and u.user_id = 22
JOIN shops s ON p.shop_id = s.shop_id
ORDER BY shop, `date`, product_id;
Now add an index (user_id, prior_login) on users and (date) on products, and try the following query:
SELECT STRAIGHT_JOIN p.*, shop
FROM (
SELECT product_id, shop
FROM users u
JOIN products p
user_id = 22 AND p.`date` >= prior_login
JOIN shops s
ON p.shop_id = s.shop_id
ORDER BY shop, p.`date`, product_id
) as s
JOIN products p USING (product_id);
If I am correct the query should return the same result but quicker. If would be nice if you would post the result of EXPLAIN for the query.

How can I make these two queries into one?

I have two tables, one for downloads and one for uploads. They are almost identical but with some other columns that differs them. I want to generate a list of stats for each date for each item in the table.
I use these two queries but have to merge the data in php after running them. I would like to instead run them in a single query, where it would return the columns from both queries in each row grouped by the date. Sometimes there isn't any download data, only upload data, and in all my previous tries it skipped the row if it couldn't find log data from both rows.
How do I merge these two queries into one, where it would display data even if it's just available in one of the tables?
SELECT DATE(upload_date_added) as upload_date, SUM(upload_size) as upload_traffic, SUM(upload_files) as upload_files
FROM packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
ORDER BY upload_date DESC
SELECT DATE(download_date_added) as download_date, SUM(download_size) as download_traffic, SUM(download_files) as download_files
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
ORDER BY download_date DESC
I want to get result rows like this:
date, upload_traffic, upload_files, download_traffic, download_files
All help appreciated!
Your two queries can be executed and then combined with the UNION cluase along with an extra field to identify Uploads and Downloads on separate lines:
SELECT
'Uploads' TransmissionType,
DATE(upload_date_added) as TransmissionDate,
SUM(upload_size) as TransmissionTraffic,
SUM(upload_files) as TransmittedFileCount
FROM
packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
ORDER BY upload_date DESC
UNION
SELECT
'Downloads',
DATE(download_date_added),
SUM(download_size),
SUM(download_files)
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
ORDER BY download_date DESC;
Give it a Try !!!
What you're asking can only work for rows that have the same add date for upload and download. In this case I think this SQL should work:
SELECT
DATE(u.upload_date_added) as date,
SUM(u.upload_size) as upload_traffic,
SUM(u.upload_files) as upload_files,
SUM(d.download_size) as download_traffic,
SUM(d.download_files) as download_files
FROM
packages_uploads u, packages_downloads d
WHERE u.upload_date_added = d.download_date_added
AND u.upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY date
ORDER BY date DESC
Without knowing the schema is hard to give the exact answer so please see the following as a concept not a direct answer.
You could try left join, im not sure if the table package exists but the following may be food for thought
SELECT
p.id,
up.date as upload_date
dwn.date as download_date
FROM
package p
LEFT JOIN package_uploads up ON
( up.package_id = p.id WHERE up.upload_date = 'etc' )
LEFT JOIN package_downloads dwn ON
( dwn.package_id = p.id WHERE up.upload_date = 'etc' )
The above will select all the packages and attempt to join and where the value does not join it will return null.
There is number of ways that you can do this. You can join using primary key and foreign key. In case if you do not have relationship between tables,
You can use,
LEFT JOIN / LEFT OUTER JOIN
Returns all records from the left table and the matched
records from the right table. The result is NULL from the
right side when there is no match.
RIGHT JOIN / RIGHT OUTER JOIN
Returns all records from the right table and the matched
records from the left table. The result is NULL from the left
side when there is no match.
FULL OUTER JOIN
Return all records when there is a match in either left or right table records.
UNION
Is used to combine the result-set of two or more SELECT statements.
Each SELECT statement within UNION must have the same number of,
columns The columns must also have similar data types The columns in,
each SELECT statement must also be in the same order.
INNER JOIN
Select records that have matching values in both tables. -this is good for your situation.
INTERSECT
Does not support MySQL.
NATURAL JOIN
All the column names should be matched.
Since you dont need to update these you can create a view from joining tables then you can use less query in your PHP. But views cannot update. And you did not mentioned about relationship between tables. Because of that I have to go with the UNION.
Like this,
CREATE VIEW checkStatus
AS
SELECT
DATE(upload_date_added) as upload_date,
SUM(upload_size) as upload_traffic,
SUM(upload_files) as upload_files
FROM packages_uploads
WHERE upload_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY upload_date
ORDER BY upload_date DESC
UNION
SELECT
DATE(download_date_added) as download_date,
SUM(download_size) as download_traffic,
SUM(download_files) as download_files
FROM packages_downloads
WHERE download_date_added BETWEEN '2011-10-26' AND '2011-11-16'
GROUP BY download_date
ORDER BY download_date DESC
Then anywhere you want to select you just need one line:
SELECT * FROM checkStatus
learn more.