I have a MySQL database and one query which I'm trying to optimize as well as I can.
I'm not familiar with indexes, so I don't know which ones I should create. At the moment I don't have any indexes, and I think my query is too slow. In fact, using JOIN has made it much slower. I expected JOIN to make it faster, and I don't understand why it is so much slower now.
Any suggestions for indexes? Is there anything else I could do to make my query faster?
SELECT ka_ki.kierrosnumero AS kierrosnumero
, ka_ki.kierroskoodi AS kierroskoodi
, ka_ki_ot.ottelunumero AS ottelunumero
, ka_ki.haviajien_sijat_tekstina AS haviajien_sijat_tekstina
, ka_ki.voittajien_puolelta_cupiin AS voittajien_puolelta_cupiin
, ka_ki.haviajien_puolelta_cupiin AS haviajien_puolelta_cupiin
, ka_ki_ot.paikka_a_ja_b_kaaviokoodi AS paikka_a_ja_b_kaaviokoodi
, ka_ki_ot.paikka_a_kaaviokoodi AS paikka_a_kaaviokoodi
, ka_ki_ot.paikka_b_kaaviokoodi AS paikka_b_kaaviokoodi
, ki_ka_ot.id AS ki_ka_ot_id
, ki_ka_ot.kaaviopaikka_id
, ki_ka_ot.peli_monesko_peli_ottelussa
, ki_ka_ot.peli_pelipaikka_id
, ki_ka_ot.peli_pelimuoto_id
, ki_ka_ot.peli_voittopisteet
, ki_ka_ot.peli_ajankohta_aikataulutus
, ki_ka_ot.peli_ajankohta_alkamisaika
, ki_ka_ot.peli_ajankohta_loppumisaika
, ki_ka_ot.peli_paikka_a_tiimiilmo_id
, ki_ka_ot.peli_paikka_b_tiimiilmo_id
, ki_ka_ot.peli_paikka_a_peluri1ilm_id
, ki_ka_ot.peli_paikka_a_peluri2ilm_id
, ki_ka_ot.peli_paikka_b_peluri1ilm_id
, ki_ka_ot.peli_paikka_b_peluri2ilm_id
, ki_il.pari_joukkue_nimi_txt AS peli_paikka_b_pari_joukkue_nimi_txt
, ki_il.sijoitusnumero_syotetty AS peli_paikka_b_sijoitusnumero_syotetty
, ki_il.sijoitusnumero_arvottu AS peli_paikka_b_sijoitusnumero_arvottu
, ka_il.pelaaja_oma_nimi_txt AS peli_paikka_b_peluri1_oma_nimi_txt
FROM ki_ka_ot
JOIN ki_il ON ki_ka_ot.peli_paikka_b_tiimiilmo_id = ki_il.id
JOIN ka_il ON ki_ka_ot.peli_paikka_b_peluri1ilm_id = ka_il.id
JOIN ki_ka ON ki_ka.id = ki_ka_ot.kaavio_id
JOIN ka_ki_ot ON
ki_ka.kaaviopohja_id = ka_ki_ot.kaaviopohja_id
AND ka_ki_ot.id = ki_ka_ot.kaaviopaikka_id
JOIN kaa ON ka_ki_ot.kaaviopohja_id = kaa.id
JOIN ka_ki ON ka_ki_ot.kierros_id = ka_ki.id
WHERE ki_ka_ot.kaavio_id = 107
ORDER BY ka_ki_ot.ottelunumero ASC
Update
Instead of using JOIN, I could use FROM ki_ka_ot, ki_il, ka_il, ki_ka, ka_ki_ot, kaa, ka_ki and add several AND conditions to the WHERE clause. The result of the query would be exactly the same, but it would be faster. Should I do it?
SELECT ka_ki.kierrosnumero AS kierrosnumero
, ka_ki.kierroskoodi AS kierroskoodi
, ka_ki_ot.ottelunumero AS ottelunumero
, ka_ki.haviajien_sijat_tekstina AS haviajien_sijat_tekstina
, ka_ki.voittajien_puolelta_cupiin AS voittajien_puolelta_cupiin
, ka_ki.haviajien_puolelta_cupiin AS haviajien_puolelta_cupiin
, ka_ki_ot.paikka_a_ja_b_kaaviokoodi AS paikka_a_ja_b_kaaviokoodi
, ka_ki_ot.paikka_a_kaaviokoodi AS paikka_a_kaaviokoodi
, ka_ki_ot.paikka_b_kaaviokoodi AS paikka_b_kaaviokoodi
, ki_ka_ot.id AS ki_ka_ot_id
, ki_ka_ot.kaaviopaikka_id
, ki_ka_ot.peli_monesko_peli_ottelussa
, ki_ka_ot.peli_pelipaikka_id
, ki_ka_ot.peli_pelimuoto_id
, ki_ka_ot.peli_voittopisteet
, ki_ka_ot.peli_ajankohta_aikataulutus
, ki_ka_ot.peli_ajankohta_alkamisaika
, ki_ka_ot.peli_ajankohta_loppumisaika
, ki_ka_ot.peli_paikka_a_tiimiilmo_id
, ki_ka_ot.peli_paikka_b_tiimiilmo_id
, ki_ka_ot.peli_paikka_a_peluri1ilm_id
, ki_ka_ot.peli_paikka_a_peluri2ilm_id
, ki_ka_ot.peli_paikka_b_peluri1ilm_id
, ki_ka_ot.peli_paikka_b_peluri2ilm_id
, ki_il.pari_joukkue_nimi_txt AS peli_paikka_b_pari_joukkue_nimi_txt
, ki_il.sijoitusnumero_syotetty AS peli_paikka_b_sijoitusnumero_syotetty
, ki_il.sijoitusnumero_arvottu AS peli_paikka_b_sijoitusnumero_arvottu
, ka_il.pelaaja_oma_nimi_txt AS peli_paikka_b_peluri1_oma_nimi_txt
FROM ki_ka_ot
, ki_il
, ka_il
, kaa
, ka_ki
, ka_ki_ot
, ki_ka
WHERE ki_ka_ot.kaavio_id = 107
AND ki_ka_ot.peli_paikka_b_tiimiilmo_id = ki_il.id
AND ki_ka_ot.peli_paikka_b_peluri1ilm_id = ka_il.id
AND ki_ka.id = ki_ka_ot.kaavio_id
AND ki_ka.kaaviopohja_id = ka_ki_ot.kaaviopohja_id
AND ka_ki_ot.kaaviopohja_id = kaa.id
AND ka_ki_ot.kierros_id = ka_ki.id
AND ka_ki_ot.id = ki_ka_ot.kaaviopaikka_id
ORDER BY ka_ki_ot.ottelunumero ASC
Update 2
Now I have modified my original query, which uses JOIN. I think it works better and faster, but maybe there is still something to fix.
SELECT ka_ki.kierrosnumero AS kierrosnumero
, ka_ki.kierroskoodi AS kierroskoodi
, ka_ki_ot.ottelunumero AS ottelunumero
, ka_ki.haviajien_sijat_tekstina AS haviajien_sijat_tekstina
, ka_ki.voittajien_puolelta_cupiin AS voittajien_puolelta_cupiin
, ka_ki.haviajien_puolelta_cupiin AS haviajien_puolelta_cupiin
, ka_ki_ot.paikka_a_ja_b_kaaviokoodi AS paikka_a_ja_b_kaaviokoodi
, ka_ki_ot.paikka_a_kaaviokoodi AS paikka_a_kaaviokoodi
, ka_ki_ot.paikka_b_kaaviokoodi AS paikka_b_kaaviokoodi
, ki_ka_ot.id AS ki_ka_ot_id
, ki_ka_ot.kaaviopaikka_id
, ki_ka_ot.peli_monesko_peli_ottelussa
, ki_ka_ot.peli_pelipaikka_id
, ki_ka_ot.peli_pelimuoto_id
, ki_ka_ot.peli_voittopisteet
, ki_ka_ot.peli_ajankohta_aikataulutus
, ki_ka_ot.peli_ajankohta_alkamisaika
, ki_ka_ot.peli_ajankohta_loppumisaika
, ki_ka_ot.peli_paikka_a_tiimiilmo_id
, ki_ka_ot.peli_paikka_b_tiimiilmo_id
, ki_ka_ot.peli_paikka_a_peluri1ilm_id
, ki_ka_ot.peli_paikka_a_peluri2ilm_id
, ki_ka_ot.peli_paikka_b_peluri1ilm_id
, ki_ka_ot.peli_paikka_b_peluri2ilm_id
, ki_il.pari_joukkue_nimi_txt AS peli_paikka_b_pari_joukkue_nimi_txt
, ki_il.sijoitusnumero_syotetty AS peli_paikka_b_sijoitusnumero_syotetty
, ki_il.sijoitusnumero_arvottu AS peli_paikka_b_sijoitusnumero_arvottu
, ka_il.pelaaja_oma_nimi_txt AS peli_paikka_b_peluri1_oma_nimi_txt
FROM ki_ka_ot
JOIN ki_il ON ki_ka_ot.peli_paikka_b_tiimiilmo_id = ki_il.id
JOIN ka_il ON ki_ka_ot.peli_paikka_b_peluri1ilm_id = ka_il.id
JOIN ki_ka ON ki_ka.id = ki_ka_ot.kaavio_id
JOIN ka_ki_ot ON
ki_ka.kaaviopohja_id = ka_ki_ot.kaaviopohja_id
/* AND ka_ki_ot.id = ki_ka_ot.kaaviopaikka_id */
JOIN kaa ON ka_ki_ot.kaaviopohja_id = kaa.id
JOIN ka_ki ON ka_ki_ot.kierros_id = ka_ki.id
WHERE ki_ka_ot.kaavio_id = 107
AND ka_ki_ot.id = ki_ka_ot.kaaviopaikka_id /* this was moved here */
ORDER BY ka_ki_ot.ottelunumero ASC
JOIN is a costly operation. The optimizer tries to perform it efficiently, but it cannot do so without some context about your data. I cannot suggest specific indexes, since I have no context for your data either, but you should index the columns used by the operations you want to perform. For example, if you have a JOIN on column A, index A so the optimizer can perform that JOIN efficiently.
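As a rough sketch of that advice applied to the query in the question: the column names below are taken from the query, but the index names are made up, and it is an assumption that each table's id column is already its primary key (and therefore already indexed). Whether these indexes actually help depends on the data, so comparing EXPLAIN output before and after creating them is a reasonable check.

```sql
-- Assumption: id is the PRIMARY KEY of every table, so the joins that
-- look up ki_il.id, ka_il.id, ki_ka.id, kaa.id and ka_ki.id are covered.

-- Supports WHERE ki_ka_ot.kaavio_id = 107 (index name is hypothetical).
CREATE INDEX idx_ki_ka_ot_kaavio ON ki_ka_ot (kaavio_id);

-- Supports the join on ka_ki_ot.kaaviopohja_id (index name is hypothetical).
CREATE INDEX idx_ka_ki_ot_kaaviopohja ON ka_ki_ot (kaaviopohja_id);

-- Prefix a query with EXPLAIN to see which indexes MySQL chooses.
EXPLAIN
SELECT ki_ka_ot.id
FROM ki_ka_ot
JOIN ki_ka ON ki_ka.id = ki_ka_ot.kaavio_id
JOIN ka_ki_ot ON ka_ki_ot.id = ki_ka_ot.kaaviopaikka_id
             AND ka_ki_ot.kaaviopohja_id = ki_ka.kaaviopohja_id
WHERE ki_ka_ot.kaavio_id = 107;
```

In the EXPLAIN result, a non-NULL value in the key column for a table means an index is being used for that table.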
Related
When executing a query statement, the speed is very slow.
SELECT
T1.APPL_SEQ
, T1.COMP_CD
, (SELECT COMP_NM FROM tb_company WHERE COMP_CD = T1.COMP_CD) AS COMP_NM
, T1.GPROD_CD
, (SELECT GPROD_NM FROM tb_gprod WHERE GPROD_CD = T1.GPROD_CD) AS GPROD_NM
, T1.SITE_CD
, (SELECT SITE_NM FROM tb_site WHERE SITE_CD = T1.SITE_CD) AS SITE_NM
, T1.INFLOW_CD
, T1.INFLOW_URL
, T1.STATUS
, T1.REG_DTM
, DECRYPTO(T1.NAME) AS NAME
, DECRYPTO(T1.HP) AS HP
, ifnull(T1.AGE,T1.`115`) AS AGE
, ifnull(T1.GENDER,T1.`116`) AS GENDER
, ifnull(T1.MEMO,T1.`120`) AS MEMO
, ifnull(T1.`105`,T1.`124`) AS TIME
, T1.`125` AS AGE_CHILD
, T2.API_YN
, T2.API_START_DT
, T2.API_END_DT
, T2.API_CD
, T2.DATA_INFLOWCD
, T2.CONFIRM_YN
, T2.SALE_YN
, T2.SALE_PRICE
, T2.BREAKDOWN
, T2.INPUT_DATE
, T3.DIST_YN
, T3.DIST_DT
,(select ifnull((select timestampdiff(DAY, T11.REG_DTM,T1.REG_DTM) AS DIFF2REGTIME from tb_applicant T11 WHERE T11.HP = T1.HP AND T11.GPROD_CD = T1.GPROD_CD AND T11.REG_DTM < T1.REG_DTM order by T11.REG_DTM desc limit 1),-1)) AS HP2_COUNT
FROM
tb_applicant T1
LEFT JOIN mm_applicant T2
ON T1.APPL_SEQ = T2.APPL_SEQ
LEFT JOIN dist_applicant T3
ON T1.APPL_SEQ = T3.APPL_SEQ
LEFT JOIN tb_site T4
ON T4.site_cd = T1.SITE_CD and T4.comp_cd = T1.COMP_CD and T4.gprod_cd = T1.GPROD_CD
WHERE 1=1
AND T1.APPL_SEQ > 147293
AND T4.is_use = 'Y'
$Sql_Search
ORDER BY
$Sql_OrderBy
) U1
, (SELECT @ROWNUM := 0) U2
) V1";
,(select ifnull((...),-1)) AS HP2_COUNT
This is part of why it's so slow.
This subquery calculates the difference in days between REG_DTM values when the tb_applicant table has an earlier row with the same HP and GPROD_CD.
I don't need to get the date difference. Is there any way to improve the query speed?
The main problem is the subselects in the SELECT list. As @Akina suggested, you should move them into the FROM clause and turn them into joins.
The way you have written it, each subselect is executed once for each row returned by the main select.
You have 4 subselects, which means that for 100 rows you execute 1 (main select) + (4*100) queries: 401 instead of 1.
Using joins allows the optimizer to choose the best strategy for the query; written your way, practically no optimization is applied.
Here is a short example of how your query should look. I didn't refactor the whole query, since without the database that is difficult and I could easily produce a wrong query.
Notice that you select from tb_site twice with different conditions, so it is up to you to pick the correct one.
SELECT T1.APPL_SEQ, T1.COMP_CD, T1.GPROD_CD, T1.SITE_CD,
TC.COMP_NM,
TG.GPROD_NM,
TS.SITE_NM,
......
FROM tb_applicant T1
LEFT JOIN mm_applicant T2 ON T1.APPL_SEQ = T2.APPL_SEQ
/* use LEFT JOIN below if a lookup row may be missing */
JOIN tb_company TC ON TC.COMP_CD = T1.COMP_CD
JOIN tb_gprod TG ON TG.GPROD_CD = T1.GPROD_CD
JOIN tb_site TS ON TS.SITE_CD = T1.SITE_CD
.......
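The HP2_COUNT subquery is correlated too, and it cannot be turned into a plain join as easily because of its ORDER BY ... LIMIT 1. One thing that might help there is a composite index matching its conditions. This is only a sketch: the index name is hypothetical, and it assumes HP has an indexable type such as VARCHAR (a TEXT or BLOB column would need a prefix length).

```sql
-- Hypothetical index name; the columns come from the HP2_COUNT subquery:
-- equality on HP and GPROD_CD, then a range filter and ORDER BY on REG_DTM.
CREATE INDEX idx_applicant_hp_gprod_reg
    ON tb_applicant (HP, GPROD_CD, REG_DTM);
```

With the equality columns first and REG_DTM last, MySQL may be able to find the latest earlier row for each applicant directly from the index instead of scanning and sorting.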
The query listed below works perfectly, but for the life of me I cannot figure out how to convert it to a SELECT statement and preview the results. I know how to do it when SET and WHERE are used, but the JOIN statement is messing things up. I'd appreciate suggestions.
UPDATE WA.contacts c
JOIN National.zips z
ON c.zipcode = z.zipcode
SET c.county = z.county
, c.population = z.population
, c.MA_Penetration = z.MA_Penetration
, c.MA_Eligibles = z.MA_Eligibles
WHERE state = 'WA';
Use SELECT and FROM, keeping the same JOIN and WHERE:
SELECT WA.contacts.county
, National.zips.county
, WA.contacts.population
, National.zips.population
, WA.contacts.MA_Penetration
, National.zips.MA_Penetration
, WA.contacts.MA_Eligibles
, National.zips.MA_Eligibles
FROM WA.contacts
JOIN National.zips ON WA.contacts.zipcode = National.zips.zipcode
WHERE state = 'WA';
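If the preview should list only the rows the UPDATE would actually change, a variation along these lines might work. It is a sketch: the old_/new_ aliases are made up, only two of the four columns are compared for brevity, and state is left unqualified exactly as in the original UPDATE.

```sql
SELECT c.zipcode
     , c.county     AS old_county
     , z.county     AS new_county
     , c.population AS old_population
     , z.population AS new_population
FROM WA.contacts c
JOIN National.zips z ON c.zipcode = z.zipcode
WHERE state = 'WA'
  -- <=> is MySQL's NULL-safe equality, so NULL vs. value counts as a change
  AND NOT (c.county <=> z.county AND c.population <=> z.population);
```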
I am having trouble with a SQL query. In my project a user can reserve a ride. I want to display the rides reserved by a user's ID (passenger_id), but the query returns all of a driver's (driver_id) advertisements even when the user reserved a ride for only one of that driver's advertisements.
SELECT advertisement.id
, COUNT(review.driver_id) AS 'review_count'
, ROUND(AVG(review.mark) ,1) AS 'rating'
, users.unique_id
, users.name
, users.surname
, users.phone
, YEAR(CURDATE()) - YEAR(users.birthdate) AS age
, users.image
, advertisement.from_city
, advertisement.to_city
, users.car_name
, users.car_model
, users.car_make_year
, advertisement.number_of_places
, advertisement.price
, advertisement.datetime
, advertisement.info
FROM reserved_rides
JOIN advertisement
ON reserved_rides.driver_id = advertisement.user_id
LEFT JOIN review
ON reserved_rides.driver_id = review.driver_id
JOIN users
ON reserved_rides.driver_id = users.unique_id
WHERE reserved_rides.passenger_id = ?
GROUP BY advertisement.id
ORDER BY advertisement.datetime ASC
What is going wrong here?
I hope replacing GROUP BY advertisement.id with GROUP BY reserved_rides.driver_id solves your problem. Cheers.
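Another possible cause, offered only as a guess because the table definitions are not shown: the query joins reserved_rides to advertisement on the driver, so every advertisement of that driver matches. If reserved_rides also stores which advertisement was reserved (the column name advertisement_id below is hypothetical), joining on it would restrict the result to the reserved rides. The users join and most columns are left out for brevity.

```sql
SELECT advertisement.id
     , COUNT(review.driver_id)    AS review_count
     , ROUND(AVG(review.mark), 1) AS rating
     , advertisement.from_city
     , advertisement.to_city
     , advertisement.datetime
FROM reserved_rides
JOIN advertisement
  ON reserved_rides.advertisement_id = advertisement.id  -- hypothetical column
LEFT JOIN review
  ON advertisement.user_id = review.driver_id
WHERE reserved_rides.passenger_id = ?
GROUP BY advertisement.id
ORDER BY advertisement.datetime ASC;
```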
The given query works fine, but I want to modify it. In the 3rd line I have
LEFT JOIN tbl_emp ON inventory.itemGiven = tbl_emp.Sname
This filters the output when there is a match.
The problem is that not every inventory.itemGiven is present in tbl_emp.Sname, and when there is no match I want the result stored in a different field.
SELECT `inventory`.`ID` , `inventory`.`out_` , `inventory`.`userName` , date_format( `inventory`.`date` , '%Y-%m-%d' ) AS date, `department`.`id` , `tbl_emp`.`emp_id_number`
FROM `inventory`
LEFT JOIN `tbl_emp` ON `inventory`.`itemGiven` = `tbl_emp`.`Sname`
LEFT JOIN `department` ON `inventory`.`givenDept` = `department`.`dept`
WHERE `out_` >0
LIMIT 0 , 30
I know this is possible, please anybody help me.
Thanks in advance.
Replace tbl_emp.emp_id_number with IFNULL(tbl_emp.emp_id_number, inventory.itemGiven) in the SELECT list:
SELECT `inventory`.`ID` , `inventory`.`out_` , `inventory`.`userName` , date_format( `inventory`.`date` , '%Y-%m-%d' ) AS date, `department`.`id` , IFNULL( `tbl_emp`.`emp_id_number` , `inventory`.`itemGiven` ) AS emp_id_number
FROM `inventory`
LEFT JOIN `tbl_emp` ON `inventory`.`itemGiven` = `tbl_emp`.`Sname`
LEFT JOIN `department` ON `inventory`.`givenDept` = `department`.`dept`
WHERE `out_` >0
I have around 2.5 lakh (250K) products and 2600 subcategories in a Magento application (Community Edition).
Query
SELECT 1 status
, e.entity_id
, e.type_id
, e.attribute_set_id
, cat_index.position AS cat_index_position
, e.name
, e.description
, e.short_description
, e.price
, e.special_price
, e.special_from_date
, e.special_to_date
, e.cost
, e.small_image
, e.thumbnail
, e.color
, e.color_value
, e.news_from_date
, e.news_to_date
, e.url_key
, e.required_options
, e.image_label
, e.small_image_label
, e.thumbnail_label
, e.msrp_enabled
, e.msrp_display_actual_price_type
, e.msrp
, e.tax_class_id
, e.price_type
, e.weight_type
, e.price_view
, e.shipment_type
, e.links_purchased_separately
, e.links_exist
, e.open_amount_min
, e.open_amount_max
, e.custom_h1
, e.awards
, e.region
, e.grape_type
, e.food_match
, e.udropship_vendor
, e.upc_barcode
, e.ean_barcode
, e.mpn
, e.size
, e.author
, e.format
, e.pagination
, e.publish_date
, price_index.price
, price_index.tax_class_id
, price_index.final_price
, IF(price_index.tier_price IS NOT NULL
, LEAST(price_index.min_price
, price_index.tier_price)
, price_index.min_price) AS minimal_price
, price_index.min_price
, price_index.max_price
, price_index.tier_price
FROM catalog_product_flat_1 e
JOIN catalog_category_product_index cat_index
ON cat_index.product_id = e.entity_id
AND cat_index.store_id = 1
AND cat_index.visibility IN(2,4)
AND cat_index.category_id = 163
JOIN catalog_product_index_price price_index
ON price_index.entity_id = e.entity_id
AND price_index.website_id = 1
AND price_index.customer_group_id = 0
GROUP BY e.entity_id
ORDER BY cat_index_position ASC
, cat_index.position ASC
LIMIT 15;
Whenever any product on this Magento site is accessed, the query creates a huge amount of data, around 10 GB, under the /tmp directory on the server.
How can I fix this? Please suggest a solution.
The database size is 50 GB and the web server is nginx.
You are misusing GROUP BY. Please learn how it works. There's a misfeature in MySQL which allows you to misuse it. Unfortunately, queries that misuse it are very difficult to troubleshoot.
It is difficult to infer what you are trying to do from your query. When you're dealing with result sets of that size, it helps to know your intent.
You should know, if you don't already, that queries of the form
SELECT <<many columns>>
FROM large_table
JOIN another_large_table ON something
JOIN another_large_table ON something
ORDER BY some_arbitrary_column
LIMIT some_small_number
can be grossly inefficient because they have to generate an enormous result set, then sort the whole thing, then return the first results. The sort operation carries the whole result set with it. You could be instructing the MySQL server to sort a crore or two of rows (dozens of megarows).
It looks like you want the first fifteen results starting with the lowest cat_index.position value. Accordingly, you may be able to make your query faster by joining with an appropriate subset of the table you call cat_index, like so:
SELECT 1 status, many_other_columns
FROM catalog_product_flat_1 e
JOIN ( /* join only with fifteen lowest eligible position values in cat_index */
SELECT *
FROM catalog_category_product_index
WHERE store_id = 1
AND visibility IN(2,4)
AND category_id = 163
ORDER BY position ASC
LIMIT 15
) AS cat_index ON cat_index.product_id = e.entity_id
JOIN catalog_product_index_price price_index
ON price_index.entity_id = e.entity_id
AND price_index.website_id = 1
AND price_index.customer_group_id = 0
GROUP BY e.entity_id /*wrong!!*/
ORDER BY cat_index_position ASC, /* redundant!*/
cat_index.position ASC
LIMIT 15;
It's worth a try.
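One way to check whether the rewrite helps is to run EXPLAIN on the pieces. As a sketch, here it is on the inner subquery from the example above:

```sql
EXPLAIN
SELECT *
FROM catalog_category_product_index
WHERE store_id = 1
  AND visibility IN (2,4)
  AND category_id = 163
ORDER BY position ASC
LIMIT 15;
```

In the output, "Using temporary" or "Using filesort" in the Extra column, together with a large estimate in the rows column, would suggest MySQL is still materializing and sorting a big intermediate result, which is the kind of work that can spill into /tmp.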
Do you have sufficient hardware resources to run a query this big? Please also update the question with your server's hardware configuration.