Optimize MySQL query

I have a problem with a query for a web site. This is the situation:
I have 3 tables:
articoli = contains all the articles
clasart = contains all the matches between article code and class code - 32,314 rows
classificazioni = contains all the matches between class code and class name - 2,401 rows
And this is the query:
SELECT a.CLAR_CLASSI, b.CLA_DESCRI
FROM clasart a
JOIN (
    SELECT art.AI_CAPOCODI, art.AI_CODIREST
    FROM (SELECT * FROM clasart WHERE CLAR_AZIENDA = 'SRL') a
    JOIN (
        SELECT AI_CAPOCODI, AI_CODIREST, AI_DT_CREAZ,
               AI_DESCRIZI, AI_CATEMERC, CONCAT(AI_CAPOCODI, AI_CODIREST) AS codice, AI_GRUPSCON
        FROM articoli
        WHERE AI_AZIENDA = 'SRL' AND AI_CATEMERC LIKE '0101______' AND AI_FLAG_NOW = 0 AND AI_CAPOCODI <> 'zzz'
    ) art ON TRIM(a.CLAR_ARTICO) = art.AI_CODIREST
    JOIN classificazioni b ON a.CLAR_CLASSI = b.CLA_CODICE
    WHERE b.CLA_CODICE LIKE 'AA51__'
    GROUP BY CLAR_ARTICO
) art ON TRIM(CLAR_ARTICO) = CONCAT(art.AI_CAPOCODI, art.AI_CODIREST)
JOIN classificazioni b ON a.CLAR_CLASSI = b.CLA_CODICE
WHERE CLAR_AZIENDA = 'SRL' AND CLAR_CLASSI LIKE 'CO____'
The run time is 16 seconds; the time jumps to 16 seconds when I join with classificazioni.
Can you help me? Thanks.

Introduce the following indexes using the queries below, and after that the query should start running within a second or two:
ALTER TABLE articoli ADD INDEX idx_artc_az_cat_flg_cap (AI_AZIENDA, AI_FLAG_NOW, AI_CAPOCODI, AI_CATEMERC);
The query above introduces a multi-column (composite) index on the articoli table. Indexes work somewhat like hash tables or array keys: they let the engine jump directly to the rows where the target value(s) match. A multi-column index means fewer rows have to be compared.
Do not use trim(a.CLAR_ARTICO) in the join: make sure the values are trimmed before insertion, not at join time. Wrapping the column in a function prevents the index from being used, which makes the join comparison expensive.
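As a one-time cleanup, that could look like the sketch below (my own illustration, assuming stray spaces only ever appear in CLAR_ARTICO); after it, the join can compare CLAR_ARTICO directly and stay index-friendly:

-- strip stray spaces once, so the join no longer needs TRIM()
UPDATE clasart
SET CLAR_ARTICO = TRIM(CLAR_ARTICO)
WHERE CHAR_LENGTH(CLAR_ARTICO) <> CHAR_LENGTH(TRIM(CLAR_ARTICO));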
Let's move to the next steps:
Introduce an index on clar_azienda using the following query:
ALTER TABLE clasart ADD INDEX idx_cls_az (clar_azienda);
If classificazioni.CLA_CODICE is not already a primary/foreign key, you'll need to add an index on it using the query below:
ALTER TABLE classificazioni ADD INDEX idx_clsi_cd (CLA_CODICE);
We are almost done: you'll just need to index CLAR_CLASSI in the same way I indexed the columns above. Let me also tell you what is what in the last index query so you can write your own.
ALTER TABLE <tableName> ADD INDEX <indexName> (<column to be indexed>);
Let me know if you still have issues. Remember, you can run these queries after selecting your database in phpMyAdmin (SQL tab) or on the mysql console.
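To verify that the new indexes are actually used (my own suggestion, not part of the original answer), prefix the query with EXPLAIN and check the key column of the output; here is a simplified sketch of the outer join from the question:

EXPLAIN
SELECT a.CLAR_CLASSI, b.CLA_DESCRI
FROM clasart a
JOIN classificazioni b ON a.CLAR_CLASSI = b.CLA_CODICE
WHERE a.CLAR_AZIENDA = 'SRL' AND a.CLAR_CLASSI LIKE 'CO____';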

Related

Is this the proper use of a MySQL index? Why does it seem that it is not working?

I have a PHP website that shows, on a specific page, a list of all comments related to that specific URL.
My query
I do a SELECT query and I get some results. I wanted to add an index in order to make the query quicker:
SELECT
commentID, comment, users.userID
FROM comments
LEFT JOIN users
ON comments.userID = users.userID
WHERE contentID = ?
Original query in spanish:
SELECT
comentarioID, comentario, usuarios.userID
FROM comentarios
LEFT JOIN usuarios
ON comentarios.userID = usuarios.userID
WHERE contenidoID = ?
My indexes
As you can see, it is a simple query, but MySQL needs to search through the 14,000+ comments in order to show them, so I added indexes:
ALTER TABLE comments ADD INDEX(userID);
ALTER TABLE users ADD INDEX(userID);
So here is how the comments table's indexes look without the new index:
The result
And here is after I added it:
In both cases (before and after adding the indexes), if I use EXPLAIN for the SELECT query that I've shown at the beginning, I get:
The tables are all InnoDB.
Why is there no real difference?
The speed of the query is almost the same before and after adding the index (the query took around 0.0163 seconds in both cases).
Is this post a duplicate?
Before declaring this a duplicate, please note that I've already read this post, and this other one, and this other one... but I didn't find the replies there useful, because in my opinion my case is different.
(I presume that the ambiguous attributes in your query are from the comentarios table - you should have qualified these)
Because you are using a LEFT JOIN, the DBMS will always find the matching rows in comentarios first, before it goes looking for data in usuarios. An index is a fast way to find rows. So by the time it has found those matching rows, it has no reason to use the new index.
On the other hand, if you specified a predicate on the usuarios table, it would use your new userID index to find the matching rows in the comentarios table:
SELECT
comentarioID, comentario, usuarios.userID
FROM comentarios
INNER JOIN usuarios
ON comentarios.userID = usuarios.userID
WHERE usuarios.name = ?
I would expect "UserID" to be unique / the primary key, hence adding a second index on the same attribute is redundant.
Further, if my assumption above holds, your query only outputs attributes which exist in the comentarios table, hence unless you allow comments to be created without a matching user, the join is redundant / expensive and the query can be written as just:
SELECT
comentarioID, comentario, userID
FROM comentarios
WHERE contenidoID = ?
WHERE contenidoID = ? needs INDEX(contenidoID)
WHERE usuarios.name = ? needs INDEX(name)
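As statements, those two suggestions would be something like this (the index names are my own; the columns come from the question):

ALTER TABLE comentarios ADD INDEX idx_contenido (contenidoID);
ALTER TABLE usuarios ADD INDEX idx_name (name);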

select * from table1, table2 where table1.id = 1 showing values with other id

I just can't see the problem with how I'm making my foreign keys, and I'm really confused about why I keep getting the wrong result. Here are screenshots from my Workbench.
Here are my tables:
And here's my diagram
I've also tried to normalize my tables, and I was expecting my query to return a result similar to the sample table (the Questions table) in this image, where it only shows 2 rows, since I want to query where idsurvey = 1:
My question is: how do I fix my foreign key so that if I run the query
select * from survey.survey, survey.questions where idsurvey = 1
it will only return 2 rows? (based on sample data in the workbench screenshot)
Any comments and suggestions on my diagram would also be greatly appreciated.
When you have two tables in the from clause, every row from the first table is matched with every row from the second table. This is known as a Cartesian product. Usually this isn't the behavior you want (as it isn't in this case), and you'd add a condition to tell the database how to match the two tables:
SELECT *
FROM survey.survey s, survey.questions q
WHERE s.idsurvey = q.survey_id AND idsurvey = 1
While this should work, it's quite outdated to use multiple tables in the same from clause. You should probably use an explicit join clause instead:
SELECT *
FROM survey.survey s
JOIN survey.questions q ON s.idsurvey = q.survey_id
WHERE idsurvey = 1

How do I append data from one table with data from another table in MySQL

I need to add information to a column where the first name, last name, state, and zip match between 2 different tables. I do not think the current query I am using is efficient enough; it has been taking days to run and never seems to finish. I have columns from both tables indexed.
UPDATE Table_1 INNER JOIN
Table_2
ON Table_2.fn = table_1.fn and Table_2.ln = table_1.ln and
Table_2.State = table_1.state and table_2.zip = table_1.zip
SET Table_1.app_phone = table_2.phone
I have also tried writing this query with a WHERE clause instead, and was unsuccessful.
If you want this to run effectively, then you need a composite index. I would suggest: table2(fn, ln, state, zip, phone).
The composite index should greatly help performance.
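A sketch of that suggestion as a statement (the index name is made up):

ALTER TABLE Table_2 ADD INDEX idx_t2_name_state_zip (fn, ln, state, zip, phone);
-- fn/ln/state/zip drive the join lookup; phone is last so the UPDATE can read it from the index alone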

MySQL - How do I figure out which columns to index?

How do I figure out which columns to index?
SELECT a.ORD_ID AS Manual_Added_Orders,
a.ORD_poOrdID_List AS Auto_Added_Orders,
a.ORDPOITEM_ModelNumber,
a.ORDPO_Number,
a.ORDPOITEM_ID,
(SELECT sum(ORDPOITEM_Qty) AS ORDPOITEM_Qty
FROM orderpoitems
WHERE ORDPOITEM_ModelNumber = a.ORDPOITEM_ModelNumber
AND ORDPO_Number = 123007)
AS ORDPOITEM_Qty,
a.ORDPO_TrackingNumber,
a.ORDPOITEM_Received,
a.ORDPOITEM_ReceivedQty,
a.ORDPOITEM_ReceivedBy,
b.ORDPO_ID
FROM orderpoitems a
LEFT JOIN orderpo b ON (a.ORDPO_Number = b.ORDPO_Number)
WHERE a.ORDPO_Number = 123007
GROUP BY a.ORDPOITEM_ModelNumber
ORDER BY a.ORD_poOrdID_List, a.ORD_ID
I ran EXPLAIN; that is how I am getting these pictures... I added a few indexes, but it's still not looking good.
Well, firstly, your query could be simplified to:
SELECT a.ORD_ID AS Manual_Added_Orders,
a.ORD_poOrdID_List AS Auto_Added_Orders,
a.ORDPOITEM_ModelNumber,
a.ORDPO_Number,
a.ORDPOITEM_ID,
SUM(ORDPOITEM_Qty) AS ORDPOITEM_Qty,
a.ORDPO_TrackingNumber,
a.ORDPOITEM_Received,
a.ORDPOITEM_ReceivedQty,
a.ORDPOITEM_ReceivedBy,
b.ORDPO_ID
FROM orderpoitems a
LEFT JOIN orderpo b ON (a.ORDPO_Number = b.ORDPO_Number)
WHERE a.ORDPO_Number = 123007
GROUP BY a.ORDPOITEM_ModelNumber
ORDER BY a.ORD_poOrdID_List, a.ORD_ID
Secondly, I would start by creating an index on orderpoitems.ORDPO_Number and orderpo.ORDPO_Number.
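As statements, those starting indexes might look like this (the index names are my own invention):

ALTER TABLE orderpoitems ADD INDEX idx_poitems_ponumber (ORDPO_Number);
ALTER TABLE orderpo ADD INDEX idx_po_ponumber (ORDPO_Number);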
It's a bit hard to say more without the table structures.
Read up on indexes and covering indexes.
From what you have, start with the columns in your WHERE clause AND the join criteria to the other table. Also include, if possible and practical, the columns used in GROUP BY / ORDER BY, as ORDER BY is typically a killer when finishing a query.
That said, I would have an index on your OrderPOItems table on
( ordpo_number, orderpoitem_ModelNumber, ord_poordid_list, ord_id )
This way, the FIRST element hits your WHERE clause, the next column serves your grouping, and the final columns serve your ORDER BY. The joins and qualifying components can then be "covered" from the index alone, without having to go to the raw data pages for the rest of the columns being returned. Hopefully that's a good jump start specific to your scenario.
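Spelled out as a statement, that composite index would be something like this (the index name is hypothetical):

ALTER TABLE orderpoitems
  ADD INDEX idx_poitems_cover (ORDPO_Number, ORDPOITEM_ModelNumber, ORD_poOrdID_List, ORD_ID);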

How can I speed up my queries?

So I have a 560 MB database with the largest table at 500 MB (over 10 million rows).
My query has to join 5 tables and takes about 10 seconds to finish...
SELECT DISTINCT trips.tripid AS tripid,
stops.stopdescrption AS "perron",
Date_format(segments.segmentstart, "%H:%i") AS "time",
Date_format(trips.tripend, "%H:%i") AS "arrival",
Upper(routes.routepublicidentifier) AS "lijn",
plcend.placedescrption AS "destination"
FROM calendar
JOIN trips
ON calendar.vsid = trips.vsid
JOIN routes
ON routes.routeid = trips.routeid
JOIN places plcstart
ON plcstart.placeid = trips.placeidstart
JOIN places plcend
ON plcend.placeid = trips.placeidend
JOIN segments
ON segments.tripid = trips.tripid
JOIN stops
ON segments.stopid = stops.stopid
WHERE stops.stopid IN ( 43914, 23899, 23925, 23908,
23913, 19899, 23871, 43902,
23876, 25563, 18956, 19912,
23889, 23861, 23879, 23884,
23856, 19920, 19898, 23916,
23894, 20985, 23930, 20932,
20986, 22434, 20021, 19893,
19903, 19707, 19935 )
AND calendar.vscdate = Str_to_date('25-10-2011', "%e-%c-%Y")
AND segments.segmentstart >= Str_to_date('15:56', "%H:%i")
AND routes.routeservicetype = 0
AND segments.segmentstart > "00:00:00"
ORDER BY segments.segmentstart
What are things I can do to speed this up? Any tips are welcome; I'm pretty new to SQL...
But I can't change the structure of the DB because it's not mine...
Use EXPLAIN to find the bottlenecks: http://dev.mysql.com/doc/refman/5.0/en/explain.html
Then perhaps, add indexes.
If you don't need to select ALL rows, use LIMIT to limit returned result count.
Just looking at the query, I would say that you should make sure that you have indexes on trips.vsid, calendar.vscdate, segments.segmentstart and routes.routeservicetype. I assume that there are already indexes on all the primary keys in the tables.
Using explain as Briedis suggested would show you how well the indexes work.
You might want to add covering indexes for some tables, like for example an index on trips.vsid where tripid and routeid are included. That way the database can use only the index for the data that is needed from the table, and not read from the actual table.
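A hedged sketch of those suggestions as statements (the index names are invented; the last one is the covering index on trips mentioned above, which also takes care of the plain vsid lookup):

ALTER TABLE calendar ADD INDEX idx_calendar_vscdate (vscdate);
ALTER TABLE segments ADD INDEX idx_segments_start (segmentstart);
ALTER TABLE routes ADD INDEX idx_routes_servicetype (routeservicetype);
-- covering variant: satisfies the vsid lookup and also supplies tripid and routeid
ALTER TABLE trips ADD INDEX idx_trips_vsid_cover (vsid, tripid, routeid);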
Edit:
The execution plan tells you that it successfully uses indexes for everything except the segments table, where it does a table scan and filters by the where condition. You should try to make a covering index for segments.segmentstart by including tripid and stopid.
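For example (a sketch, with a made-up index name):

ALTER TABLE segments ADD INDEX idx_segments_start_cover (segmentstart, tripid, stopid);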
Try adding a composite index to the routes table on both routeservicetype and routeid.
Depending on the frequency of the data within the routeservicetype field, you may get an improvement by shrinking the amount of data being compared in the join to the trips table.
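That could be built like this (the index name is my own; the filtered column goes first):

ALTER TABLE routes ADD INDEX idx_routes_type_route (routeservicetype, routeid);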
Looking at the explain plan, you may also want to force the sequence of the table usage by using STRAIGHT_JOIN instead of JOIN (or INNER JOIN), as I've had real improvements with this technique.
Essentially, put the table with the smallest row count of extracted data at the beginning of the query, and the largest-row-count table at the end (in this case possibly the segments table?), with the exception of simple lookups (e.g. for descriptions).
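As a minimal sketch of the operator form on a cut-down, two-table version of the query above (my own illustration, not from the original answer), STRAIGHT_JOIN forces calendar to be read before trips regardless of what the optimizer would otherwise choose:

-- reads calendar first, then trips, in exactly this order
SELECT calendar.vsid, trips.tripid
FROM calendar
STRAIGHT_JOIN trips ON calendar.vsid = trips.vsid
WHERE calendar.vscdate = Str_to_date('25-10-2011', '%e-%c-%Y');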
You may also consider altering the WHERE clause to filter the segments table on stopid instead of the stops table, and creating a composite index on the segments table on (stopid, tripid and segmentstart) - this index will effectively be able to satisfy two joins and two WHERE clauses from a single index...
To build the index...
ALTER TABLE segments ADD INDEX idx_qry_helper ( stopid, tripid, segmentstart );
And the altered WHERE clause...
WHERE segments.stopid IN ( 43914, 23899, 23925, 23908,
23913, 19899, 23871, 43902,
23876, 25563, 18956, 19912,
23889, 23861, 23879, 23884,
23856, 19920, 19898, 23916,
23894, 20985, 23930, 20932,
20986, 22434, 20021, 19893,
19903, 19707, 19935 )
... (the rest of the query stays the same)
At the end of the day, a 10-second response for what appears to be a complex query on a fairly large dataset isn't all that bad!