Optimising a Query with 1 million rows - mysql

I've been trying to optimise this query. Originally I was using an INNER JOIN for the vip.tvip table, but I noticed that people who didn't exist in that table weren't showing up, and I read that I have to use a LEFT JOIN instead, which has caused further issues.
SELECT sb_admins.srv_group AS role, rankme.lastconnect, rankme.steam, rankme.name, rankme.pfp, vip.tvip.vip_level FROM bans.sb_admins
INNER JOIN rankme ON CONCAT("STEAM_0:", rankme.authid) = sb_admins.authid
LEFT JOIN vip.tvip ON tvip.playerid = rankme.authid
AND gid > 0 ORDER BY rankme.name;
This is the query I'm currently using. It takes around 5 seconds to return a result, which I put down to the rankme table having 1.3 million rows. I'm also attaching the EXPLAIN for this query; I'm not that well versed in MySQL queries, so apologies if I'm butchering this.
If someone could give some insight into how to fix this, it would be tremendously helpful. I have created keys on everything I could, such as a FULLTEXT key on name, but to no avail.
Cheers.

Could you try:
SELECT sb_admins.srv_group AS role, rankme.lastconnect, rankme.steam, rankme.name, rankme.pfp, vip.tvip.vip_level FROM bans.sb_admins
INNER JOIN rankme ON rankme.authid = REPLACE(sb_admins.authid,"STEAM_0:","")
LEFT JOIN vip.tvip ON tvip.playerid = rankme.authid
AND gid > 0 ORDER BY rankme.name;
This should be able to use the index on rankme.authid (if one exists...).
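The point of moving the string manipulation to the sb_admins side is that the function then runs against the small admins table, leaving rankme.authid bare so an index on it can be used. If no such index exists yet, something along these lines would be the first thing to try (the index name is made up):
CREATE INDEX idx_rankme_authid ON rankme (authid);
With only a handful of rows in bans.sb_admins, the REPLACE() there is cheap, while the lookup into the 1.3-million-row rankme table becomes an index seek instead of a full scan.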

Related

Incorrect key file for table '/tmp/#sql_18b4_0.MYI'; try to repair it

I got a query from our developers that is not executing on the server and is giving the error below:
Incorrect key file for table '/tmp/#sql_18b4_0.MYI'; try to repair it
I have checked all the tables individually, along with their indexes, and everything seems fine. I have even checked these tables in another query join that fetches more data than this one, and it works fine.
These tables hardly hold more than 1000 records each.
The query is:
SELECT `PsMasterSubject`.`id`, `PsMasterSubject`.`name`, `PsProgram`.`name`, `PsStreamLevel`.`id`
FROM `misdb`.`ps_master_subjects` AS `PsMasterSubject`
LEFT JOIN `misdb`.`ps_programs` AS `PsProgram` ON (`PsMasterSubject`.`ps_program_id` = `PsProgram`.`id`)
LEFT JOIN `misdb`.`ps_stream_levels` AS `PsStreamLevel` ON (`PsStreamLevel`.`id` AND `PsProgram`.`ps_stream_level_id`)
LEFT JOIN `misdb`.`ps_program_levels` AS `PsProgramLevel` ON (`PsProgramLevel`.`id` AND `PsStreamLevel`.`ps_program_level_id`)
WHERE 1 = 1
ORDER BY `PsMasterSubject`.`id` DESC LIMIT 10;
I found some issues similar to this, but I have checked and my tables are not corrupt.
Any quick help would be highly appreciated.
This turned out to be a silly mistake on my developer's end. After 30 minutes of brainstorming how to design this query differently, I realised the developer was writing the joins the wrong way; because of this, MySQL could not join the table data properly, consumed all the space in the /tmp directory, and threw this error. The correct query is here:
SELECT `PsMasterSubject`.`id`, `PsMasterSubject`.`name`, `PsProgram`.`name`, `PsStreamLevel`.`id`
FROM `misdb`.`ps_master_subjects` AS `PsMasterSubject`
LEFT JOIN `misdb`.`ps_programs` AS `PsProgram` ON (`PsMasterSubject`.`ps_program_id` = `PsProgram`.`id`)
LEFT JOIN `misdb`.`ps_stream_levels` AS `PsStreamLevel` ON (`PsStreamLevel`.`id` = `PsProgram`.`ps_stream_level_id`)
LEFT JOIN `misdb`.`ps_program_levels` AS `PsProgramLevel` ON (`PsProgramLevel`.`id` = `PsStreamLevel`.`ps_program_level_id`)
WHERE 1 = 1
ORDER BY `PsMasterSubject`.`id` DESC LIMIT 10;
Now the question is: is this a MySQL bug? I would expect MySQL to throw a syntax error, but here it tries to create a temporary table for the temp data.
I would be very thankful if anyone could clear this up for me.
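For what it's worth, ON accepts any boolean expression, so ON (a.id AND b.x) is legal syntax: it simply evaluates both columns for truthiness instead of comparing them, which matches almost every row pair (close to a cross join) and is what filled /tmp. A quick sketch of how MySQL evaluates such an expression:
SELECT 5 AND 7;      -- 1 (true), not a syntax error
SELECT 5 AND 0;      -- 0 (false)
SELECT 5 AND NULL;   -- NULL (treated as no match)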

Optimize "JOIN" query

This is my query from my source code:
SELECT `truyen`.*, MAX(chapter.chapter) AS last_chapter
FROM (`truyen`)
LEFT JOIN `chapter` ON `chapter`.`truyen` = `truyen`.`Id`
WHERE `truyen`.`title` LIKE '%%'
GROUP BY `truyen`.`Id`
LIMIT 250
When I installed it on the iFastnet host, it caused over 500,000 rows to be examined due to the join, and the query got blocked (it would use over 100% of a CPU, which would ultimately cause server instability).
I also tried adding this line before the query; it fixed the problem above but led to another issue where some functions could not run correctly:
mysql_query("SET SQL_BIG_SELECTS=1");
How can I fix this problem without buying different hosting?
Thanks.
You might be looking for an INNER JOIN. That would remove results that do not match. I find INNER JOINs to be faster than LEFT JOINs.
However, I'm not sure what results you are actually looking for. But because you are using the GROUP BY, it looks like the INNER JOIN might work for you.
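If that fits your data (i.e. you don't need truyen rows that have no chapters at all), the change is just the join keyword. A sketch based on the query in the question:
SELECT `truyen`.*, MAX(chapter.chapter) AS last_chapter
FROM `truyen`
INNER JOIN `chapter` ON `chapter`.`truyen` = `truyen`.`Id`
WHERE `truyen`.`title` LIKE '%queryString%'
GROUP BY `truyen`.`Id`
LIMIT 250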
One thing I would recommend is to copy the query it generates and paste it into SQL with DESCRIBE in front of it.
So if the query ended up being:
SELECT truyen.*, MAX(chapter.chapter) AS last_chapter FROM truyen
LEFT JOIN chapter ON chapter.truyen = truyen.Id
WHERE truyen.title LIKE '%queryString%'
You would type:
DESCRIBE SELECT truyen.*, MAX(chapter.chapter) AS last_chapter FROM truyen
LEFT JOIN chapter ON chapter.truyen = truyen.Id
WHERE truyen.title LIKE '%queryString%'
This will tell you whether you could add an index to your table to make the JOIN faster.
I hope this at least points you in the right direction.
Michael Berkowski seems to agree with the indexing, which you will be able to see from the DESCRIBE.
Please check whether you have indexes on chapter.chapter and chapter.truyen. If not, add them and try again (there is an index sketch after the suggestions below). If that is not successful, try these suggestions:
Do you have the option of permanently flagging the last chapter in a column of your chapter table on insert/update? Then you could use that flag to reduce the joined rows and drop the GROUP BY. Maybe in this way:
SELECT `truyen`.*, `chapter`.`chapter` as `last_chapter`
FROM `truyen`, `chapter`
WHERE `chapter`.`truyen` = `truyen`.`Id`
AND `chapter`.`flag_last_chapter` = 1
AND `truyen`.`title` LIKE '%queryString%'
LIMIT 250
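A minimal sketch of how such a flag could be maintained, assuming a TINYINT column named flag_last_chapter and running this whenever a chapter is inserted (:truyen_id is a placeholder for the series that was just updated):
-- Clear the old flag for this series, then set it on the newest chapter.
UPDATE `chapter` SET `flag_last_chapter` = 0 WHERE `truyen` = :truyen_id;
UPDATE `chapter` SET `flag_last_chapter` = 1
WHERE `truyen` = :truyen_id
ORDER BY `chapter` DESC
LIMIT 1;
A trigger could do the same thing automatically if your host allows triggers.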
Or create a new table for that instead:
INSERT INTO new_table (truyen, last_chapter)
SELECT truyen, MAX(chapter) FROM chapter GROUP BY truyen;
SELECT `truyen`.*, `new_table`.`last_chapter`
FROM (`truyen`)
LEFT JOIN `new_table` ON `new_table`.`truyen` = `truyen`.`Id`
WHERE `truyen`.`title` LIKE '%queryString%'
GROUP BY `truyen`.`Id`
LIMIT 250
Otherwise you could just fetch the 250 rows of truyen, collect the truyen ids in an array, and build another SQL statement to select the matching rows from the chapter table. I have seen from your original question that you can use PHP for that, so you could merge the results afterwards:
SELECT * FROM truyen
WHERE title LIKE '%queryString%'
LIMIT 250
SELECT truyen, MAX(chapter) AS last_chapter
FROM chapter
WHERE truyen in (comma_separated_ids_from_first_select)
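Regarding the index suggestion at the top of this answer: a composite index that starts with the join column and ends with the aggregated column lets MySQL both join on truyen and read MAX(chapter) per series straight out of the index. The index name below is made up:
CREATE INDEX idx_chapter_truyen_chapter ON `chapter` (`truyen`, `chapter`);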

MySQL - how to speed up or change this query

I did not write this query; I am working on someone else's old code. I am looking into changing what is needed for this query, but if I could simply speed it up, that would solve my problem temporarily. I am looking at adding indexes. When I ran SHOW INDEXES, there were a great many indexes on the orders table; can that also slow down a query?
I am no database expert. I guess I will learn more from this effort. :)
SELECT
orders.ORD_ID,
orders.ORD_TotalAmt,
orders.PAYMETH_ID,
orders.SCHOOL_ID,
orders.ORD_AddedOn,
orders.AMAZON_PurchaseDate,
orders.ORDSTATUS_ID,
orders.ORD_InvoiceNumber,
orders.ORD_CustFirstName,
orders.ORD_CustLastName,
orders.AMAZON_ORD_ID,
orders.ORD_TrackingNumber,
orders.ORD_SHIPPINGCNTRY_ID,
orders.AMAZON_IsExpedited,
orders.ORD_ShippingStreet1,
orders.ORD_ShippingStreet2,
orders.ORD_ShippingCity,
orders.ORD_ShippingStateProv,
orders.ORD_ShippingZipPostalCode,
orders.CUST_ID,
orders.ORD_ShippingName,
orders.AMAZON_ShipOption,
orders.ORD_ShipLabelGenOn,
orders.ORD_SHIPLABELGEN,
orders.ORD_AddressVerified,
orders.ORD_IsResidential,
orderstatuses.ORDSTATUS_Name,
paymentmethods.PAYMETH_Name,
shippingoptions.SHIPOPT_Name,
SUM(orderitems.ORDITEM_Qty) AS ORD_ItemCnt,
SUM(orderitems.ORDITEM_Weight * orderitems.ORDITEM_Qty) AS ORD_ItemTotalWeight
FROM
orders
LEFT JOIN orderstatuses ON
orders.ORDSTATUS_ID = orderstatuses.ORDSTATUS_ID
LEFT JOIN orderitems ON
orders.ORD_ID = orderitems.ORD_ID
LEFT JOIN paymentmethods ON
orders.PAYMETH_ID = paymentmethods.PAYMETH_ID
LEFT JOIN shippingoptions ON
orders.SHIPOPT_ID = shippingoptions.SHIPOPT_ID
WHERE
(orders.AMAZON_ORD_ID IS NOT NULL AND (orders.ORD_SHIPLABELGEN IS NULL OR orders.ORD_SHIPLABELGEN = '') AND orderstatuses.ORDSTATUS_ID <> 101 AND orderstatuses.ORDSTATUS_ID <> 40)
GROUP BY
orders.ORD_ID,
orders.ORD_TotalAmt,
orders.PAYMETH_ID,
orders.SCHOOL_ID,
orders.ORD_AddedOn,
orders.ORDSTATUS_ID,
orders.ORD_InvoiceNumber,
orders.ORD_CustFirstName,
orders.ORD_CustLastName,
orderstatuses.ORDSTATUS_Name,
paymentmethods.PAYMETH_Name,
shippingoptions.SHIPOPT_Name
ORDER BY
orders.ORD_ID
One simple thing you should consider is whether you really need LEFT JOINs, or whether INNER JOINs would satisfy some of the joins. The new query would not be the same as the original query, so you would need to think carefully about what you really want back. If your foreign key relationships are indexed correctly, this could help substantially, especially between orders and orderitems, because I would imagine these are your largest tables. The following post has a good explanation: INNER JOIN vs LEFT JOIN performance in SQL Server. There are lots of other things that can be done, but you would need to post the query plan so people can dive deeper.
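As a sketch of what that could look like here (not a drop-in replacement, and the column list is shortened for brevity): the WHERE conditions on orderstatuses.ORDSTATUS_ID already discard rows where that LEFT JOIN found no match, so that particular join can be written as an INNER JOIN without changing the result:
SELECT
orders.ORD_ID,
orderstatuses.ORDSTATUS_Name,
SUM(orderitems.ORDITEM_Qty) AS ORD_ItemCnt
FROM
orders
INNER JOIN orderstatuses ON
orders.ORDSTATUS_ID = orderstatuses.ORDSTATUS_ID
LEFT JOIN orderitems ON
orders.ORD_ID = orderitems.ORD_ID
WHERE
(orders.AMAZON_ORD_ID IS NOT NULL AND (orders.ORD_SHIPLABELGEN IS NULL OR orders.ORD_SHIPLABELGEN = '') AND orderstatuses.ORDSTATUS_ID NOT IN (101, 40))
GROUP BY
orders.ORD_ID,
orderstatuses.ORDSTATUS_Name
ORDER BY
orders.ORD_ID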
It looks like just adding the index was all that was needed.
create index orderitems_ORD_ID_index on orderitems(ORD_ID);

How can I modify this query with two Inner Joins so that it stops giving duplicate results?

EDIT: I will leave the post here as is, but what I really needed to accomplish needed to be reposted. I didn't explain the problem well enough. After trying again with quite a different starting point, I was able to get the query that I needed. That is explained here.
ORIGINAL QUESTION:
I'm having trouble. I have looked at similar threads, and I am unable to find a solution specific to this query. The database is very large, and group by seems to slow it down immensely.
The problem is I am getting duplicate results. Here is my query which causes duplicates:
SELECT
itpitems.identifier,
itpitems.name,
itpitems.subtitle,
itpitems.description,
itpitems.itemimg,
itpitems.mainprice,
itpitems.upc,
itpitems.isbn,
itpitems.weight,
itpitems.pages,
itpitems.publisher,
itpitems.medium_abbr,
itpitems.medium_desc,
itpitems.series_abbr,
itpitems.series_desc,
itpitems.voicing_desc,
itpitems.pianolevel_desc,
itpitems.bandgrade_desc,
itpitems.category_code,
itprank.overall_ranking,
itpitnam.name AS artist,
itpitnam.type_code
FROM itpitems
INNER JOIN itprank ON ( itprank.item_number = itpitems.identifier )
INNER JOIN itpitnam ON ( itpitems.identifier = itpitnam.item_number )
WHERE mainprice >1
The results are actually not complete duplicates: itpitnam.type_code has a different value in the otherwise duplicated rows.
Since adding GROUP BY to the end of the query puts too much strain on the server (it's searching through about 300,000 records), what else can I do?
Can this be rewritten as a subquery? I just can't figure out how to eliminate the second instances, where only type_code has changed.
Thank you for your help and assistance.
I also tried SELECT DISTINCT itpitems.identifier, but this returned the same results, still with the duplicates (where type_code was the only difference). I don't want the second instance where type_code has changed; I just want one result per identifier, regardless of whether type_code has multiple values.
Without seeing examples of the output, it's hard to say. But have you tried the exact same query with a simple DISTINCT added to the SELECT?
SELECT DISTINCT
itpitems.identifier, itpitems.name, itpitems.subtitle, itpitems.description, itpitems.itemimg, itpitems.mainprice,
itpitems.upc, itpitems.isbn, itpitems.weight, itpitems.pages, itpitems.publisher, itpitems.medium_abbr, itpitems.medium_desc,
itpitems.series_abbr, itpitems.series_desc, itpitems.voicing_desc, itpitems.pianolevel_desc, itpitems.bandgrade_desc, itpitems.category_code,
itprank.overall_ranking, itpitnam.name AS artist, itpitnam.type_code
FROM itpitems
INNER JOIN itprank ON ( itprank.item_number = itpitems.identifier )
INNER JOIN itpitnam ON ( itpitems.identifier = itpitnam.item_number )
WHERE mainprice >1
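If DISTINCT doesn't help (as noted above, the rows differ in type_code, so they are not duplicates as far as DISTINCT is concerned), one approach that avoids grouping the whole three-table join is to collapse itpitnam to one row per item_number in a derived table and join to that instead. This is only a sketch: it assumes you genuinely don't care which artist/type_code row survives (MIN() simply picks one), and the itpitems column list is shortened for brevity:
SELECT
itpitems.identifier,
itpitems.name,
itpitems.mainprice,
itprank.overall_ranking,
one_name.artist
FROM itpitems
INNER JOIN itprank ON ( itprank.item_number = itpitems.identifier )
INNER JOIN (
SELECT item_number, MIN(name) AS artist
FROM itpitnam
GROUP BY item_number
) AS one_name ON ( one_name.item_number = itpitems.identifier )
WHERE itpitems.mainprice > 1
If itprank can also hold more than one row per item_number, the same derived-table trick applies to it as well.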

Need help speeding up a MySQL query

I need a query that quickly shows the articles within a particular module (a subset of articles) that a user has NOT uploaded a PDF for. The query I am using below takes about 37 seconds, given there are 300,000 articles in the Article table, and 6,000 articles in the Module.
SELECT *
FROM article a
INNER JOIN article_module_map amm ON amm.article=a.id
WHERE amm.module = 2 AND
a.id NOT IN (
SELECT afm.article
FROM article_file_map afm
INNER JOIN article_module_map amm ON amm.article = afm.article
WHERE afm.organization = 4 AND
amm.module = 2
)
What I am doing in the above query is first narrowing the list of articles to the selected module, and then further narrowing that list to the articles that are not in the subquery. The subquery generates a list of the articles that an organization has already uploaded PDFs for. Hence, the end result is a list of articles that an organization has not yet uploaded PDFs for.
Help would be hugely appreciated, thanks in advance!
EDIT 2012/10/25
With #fthiella's help, the below query ran in an astonishing 1.02 seconds, down from 37+ seconds!
SELECT a.* FROM (
SELECT article.* FROM article
INNER JOIN article_module_map
ON article.id = article_module_map.article
WHERE article_module_map.module = 2
) AS a
LEFT JOIN article_file_map
ON a.id = article_file_map.article
AND article_file_map.organization=4
WHERE article_file_map.id IS NULL
I am not sure that I understand the logic and the structure of the tables correctly. This is my query:
SELECT
article.id
FROM
article
INNER JOIN
article_module_map
ON article.id = article_module_map.article
AND article_module_map.module=2
LEFT JOIN
article_file_map
ON article.id = article_file_map.article
AND article_file_map.organization=4
WHERE
article_file_map.id IS NULL
I extract all of the articles that belong to module 2, and then select those for which organization 4 hasn't provided a file.
I used a LEFT JOIN instead of a subquery. In some circumstances this could be faster.
EDIT Thank you for your comment. I wasn't sure it would run faster, but it surprises me that it is so much slower! Anyway, it was worth a try!
Now, out of curiosity, I would like to try all the combinations of LEFT/INNER JOIN and subquery, to see which one runs faster, eg:
SELECT *
FROM
(SELECT *
FROM
article INNER JOIN article_module_map
ON article.id = article_module_map.article
WHERE
article_module_map.module=2)
LEFT JOIN
etc.
maybe removing *, and I would like to see what changes between putting the condition in the WHERE clause and in the ON clause... anyway, I think it wouldn't help much; you should concentrate on indexes now.
Indexes on keys/foreign keys should be okay already, but what if you add an index on article_module_map.module and/or article_file_map.organization?
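A minimal sketch of what that could look like (the index names are made up, and making them composite so they also cover the join column gives the optimizer a little more to work with):
CREATE INDEX idx_amm_module_article ON article_module_map (module, article);
CREATE INDEX idx_afm_org_article ON article_file_map (organization, article);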
When optimizing queries, I usually check the following points:
First: I would avoid using * in the SELECT clause; instead, name the specific fields you want. This can speed things up dramatically (I had one query which took 7 seconds with *, and naming the fields brought it down to 0.1s).
Second: As #Adder says, add indexes to your tables.
Third: Try using INNER JOINs instead of WHERE amm.module = 2 AND a.id NOT IN ( ... ). I think I read (I don't remember it well, so take it with a grain of salt) that MySQL usually optimizes INNER JOINs well, and since your subquery is just a filter, maybe using three INNER JOINs plus a WHERE clause would be faster.
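For reference, a NOT EXISTS form of the original filter is another common way to express this anti-join. This is only a sketch, but it should be equivalent to the NOT IN here (the module = 2 check inside the original subquery is redundant, because the outer join already restricts the articles to module 2), and it avoids the NULL pitfalls NOT IN can have:
SELECT a.*
FROM article a
INNER JOIN article_module_map amm ON amm.article = a.id
WHERE amm.module = 2
AND NOT EXISTS (
SELECT 1
FROM article_file_map afm
WHERE afm.article = a.id
AND afm.organization = 4
)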