What is the performance difference between these two mysql queries?


select sum(a) from tbl1 where id in (1,2,3) (0.1 seconds)
and
select sum(a) from tbl1 where id in (select id from tbl2) (60 seconds)
select id from tbl2 returns 1,2,3 in 0.001 seconds;
tbl1 has roughly 2.2M entries;

(1,2,3) is already known when the code runs, while the second example has to query another table, so it is perfectly normal for the latter to take longer. However, a 600x slowdown needs further explanation. Without more information about your situation we can only guess, but the potential problems are as follows:
there are many ids in the second table and many records in the first table
the inner query runs for each record of the outer query (highly probable)
lack of indexes on columns you are using for filtering
If the result of the inner query should be similar to the first example's set, then you might want to separate your queries into two. You could load the ids separately and then use the result for the second query.
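The two-step approach suggested above could look roughly like the following. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are invented, and sum_for_ids is a hypothetical helper name. Table and column names follow the question.

```python
import sqlite3

# SQLite stands in for MySQL here; the sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (id INTEGER, a INTEGER);
    CREATE TABLE tbl2 (id INTEGER);
    INSERT INTO tbl1 VALUES (1, 10), (2, 20), (3, 30), (4, 40);
    INSERT INTO tbl2 VALUES (1), (2), (3);
""")

def sum_for_ids(conn):
    # Step 1: run the cheap inner query on its own.
    ids = [row[0] for row in conn.execute("SELECT id FROM tbl2")]
    if not ids:
        return None  # an empty IN () list is a syntax error in MySQL
    # Step 2: pass the ids as a literal list, one placeholder per value.
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT SUM(a) FROM tbl1 WHERE id IN ({placeholders})"
    return conn.execute(sql, ids).fetchone()[0]

total = sum_for_ids(conn)
print(total)
```

The second statement then sees a constant list, exactly like the fast in (1,2,3) form from the question.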

This situation happens on mysql < 5.5. There seems to be a bug which causes long query time. The solution is to add an index on tbl2.id or upgrade to a higher version of mysql.

Related

is it better/faster to make a select table.field than make a table.* when using big databases?

By simple logic I'd think yes, it is faster because the DBMS brings back less info and needs less memory... however, I don't have a valid argument for why it would be faster.
Say, for example, I want to select from 2 related tables, with indexes and everything.
But I want to know why select tableA.field, tableA.field2, tableA.field3, tableB.field1, tableB.field2 from tableA, tableB
is actually faster than
select * from tableA,tableB
Both tables have about 3 million records and table A has about 14 fields and tableB got 18.
Any idea?
Thanks.

Reducing the number of fields selected means that less data has to be transmitted from the server to the client. It also reduces the amount of memory that the server and client have to use to hold the data selected. So these should improve performance once the server determines which rows should be in the result set.
It's not likely to have any significant impact on the speed of processing the query itself within the database server. That's dominated by the cost of joining the tables, filtering the rows based on the WHERE clause, and performing any calculations specified in the SELECT clause. These are all independent of the columns being selected. If you use EXPLAIN on the two queries, you won't see any difference.

You are joining two tables with 3 million rows each with no filter. That will produce 9x10^12 rows. Generating and transmitting to the client a result set of a few fields, as opposed to all 32 fields, will make a difference.

If you select all fields in the first query it's the same thing because you request the same amount of data. Check this http://sqlfiddle.com/#!9/27987/2
Maybe the performance difference has another cause, such as other selects running at the same time.
Essentially select * from tableA,tableB is the Cartesian product of the two tables, for a total of 3 million x 3 million rows.
Therefore:
select * from tableA,tableB
With the wildcard * you retrieve a table of 9x10^12 rows x 32 columns (14 + 18), while
select tableA.field, tableA.field2, tableA.field3, tableB.field1, tableB.field2 from tableA, tableB
with the explicit form you have a table of 9x10^12 rows x 5 columns... so less data!
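To make the arithmetic concrete, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The tiny tables (3 and 4 rows, 3 and 2 columns) are invented for illustration. The last lines repeat the same multiplication at the scale from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (field INTEGER, field2 INTEGER, field3 INTEGER);
    CREATE TABLE tableB (field1 INTEGER, field2 INTEGER);
    INSERT INTO tableA VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO tableB VALUES (1, 1), (2, 2), (3, 3), (4, 4);
""")

# A comma join with no condition pairs every row of A with every row of B.
cur = conn.execute("SELECT * FROM tableA, tableB")
rows = cur.fetchall()
row_count = len(rows)                # rows of A times rows of B
star_columns = len(cur.description)  # columns of A plus columns of B

# Same arithmetic at the scale described in the question.
full_rows = 3_000_000 * 3_000_000         # 9 * 10**12 rows
full_star_values = full_rows * (14 + 18)  # values sent with SELECT *
full_explicit_values = full_rows * 5      # values sent with 5 named columns

print(row_count, star_columns, full_rows)
```

The row count is the same for both forms of the query; only the number of values per row changes.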

Slow query takes up entire HDD space resulting in a "1030 Got error 28 from storage engine"

Fairly new to MySQL.
Slow query takes up the entire HDD space ending up with 1030 error code.
INSERT INTO schema.Table C
SELECT a.`Date`, a.Store, a.SKU,
floor((a.QTY / ((b.CASEQTY * b.CASEPERLAYER) * b.LAYERPERPALLET))) AS Pallets,
floor(((a.QTY / ((b.CASEPERLAYER * b.LAYERPERPALLET) * b.CASEQTY)) /.CASEQTY)) AS Cases,
(a.QTY * b.CASEQTY) AS Pieces
FROM
(schema.table1 AS a
INNER JOIN schema.table2 AS b)
WHERE a.Description = 'BLAH';
Problem:
When I run the above query I get the results I need in 0.01 sec with a limit of 100 rows. However, when I try to insert the query results into a prepared table, it fails.
The above query will basically run for hours until the HDD is full. Table A contains millions of records and table B only a few thousand. Storage engine is InnoDB. I've run a similar query for 3hrs and have had it succeed. Any help will be greatly appreciated.

That's something special in MySQL. In spite of calling it INNER JOIN, you can do a CROSS JOIN by leaving out the ON clause which is exactly what you are doing. (Another dbms would raise a syntax error.)
So by not specifying the ON clause to match records from table1 and table2 you match every record in table1 with every record in table2. These can be many :-)

Your inner join statement contains no join criteria. This will result in something (bad) called a "cartesian product". So, if table A has a million records and table b contains a thousand, then a cartesian product will match each row in table A to EVERY row in the other table. This should give you (at least) a billion records.
To fix this, you need to define/constrain the relationship between the two tables by using an "ON" clause for your join or it could go in the WHERE clause.
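A minimal sketch of the effect of the ON clause, again with sqlite3 standing in for MySQL (SQLite also accepts INNER JOIN without ON). The question does not say which column relates the two tables, so joining on SKU is purely an assumption for illustration, as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (SKU TEXT, QTY INTEGER);
    CREATE TABLE table2 (SKU TEXT, CASEQTY INTEGER);
    INSERT INTO table1 VALUES ('A', 10), ('B', 20), ('C', 30);
    INSERT INTO table2 VALUES ('A', 2), ('B', 5), ('C', 6);
""")

def count(sql):
    # Return the single COUNT(*) value produced by the statement.
    return conn.execute(sql).fetchone()[0]

# No join condition: every row of table1 is paired with every row of table2.
without_on = count(
    "SELECT COUNT(*) FROM table1 AS a INNER JOIN table2 AS b"
)

# Assumed join condition (SKU): each row is paired only with its match.
with_on = count(
    "SELECT COUNT(*) FROM table1 AS a INNER JOIN table2 AS b ON a.SKU = b.SKU"
)

print(without_on, with_on)
```

With millions of rows on one side and thousands on the other, the first form is what fills the disk.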

Left joining two views is slow?

SELECT DISTINCT
viewA.TRID,
viewA.hits,
viewA.department,
viewA.admin,
viewA.publisher,
viewA.employee,
viewA.logincount,
viewA.registrationdate,
viewA.firstlogin,
viewA.lastlogin,
viewA.`month`,
viewA.`year`,
viewA.businesscategory,
viewA.mail,
viewA.givenname,
viewA.sn,
viewA.departmentnumber,
viewA.sa_title,
viewA.title,
viewA.supemail,
viewA.regionname
FROM
viewA
LEFT JOIN viewB ON viewA.TRID = viewB.TRID
WHERE viewB.TRID IS NULL
I have two views with about 10K and 5K records in them. They each come back very quickly, in a fraction of a second. When I try to get all of the records from ViewA that are not in ViewB, it works but it is very slow. All of the underlying TRID fields use the same character set, are all varchar(10) and indexed, and the tables are all InnoDB. Right now the query is taking 16 seconds. Is there anything that I can do?

Normally, with JOIN, MySQL has to do a lookup for each joined record. Lookups are fast when using keys, but in your case, there aren't really any keys because the joined table is a view.
To try to keep MySQL from running the query behind the second view once per record in the first view, we can use a subquery.
SELECT *
FROM viewA
WHERE TRID NOT IN (SELECT TRID FROM viewB);
This should allow MySQL to get all the TRID values for viewB in the subquery (in a temp table) then do a search over them for each record in viewA.
From MySQL docs:
MySQL executes uncorrelated subqueries only once. Use EXPLAIN to make
sure that a given subquery really is uncorrelated.

It is hard to optimize queries with views in MySQL. My first suggestion is to get rid of distinct unless you absolutely know that it is needed.
Then you might compare the performance with this query:
select viewA.*
from viewA
where not exists (select 1 from viewB where viewB.TRID = viewA.TRID);
It is hard to say whether one will be better than the other, but it is worth trying to see if this is better.
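The three formulations discussed in this thread (LEFT JOIN ... IS NULL, NOT IN, NOT EXISTS) can be checked against each other on a small data set. The sketch below uses sqlite3 as a stand-in for MySQL and plain tables named viewA and viewB instead of real views; the rows are invented. Note the NOT NULL constraint: if the subquery of NOT IN produces a NULL, NOT IN returns no rows at all, so the forms are only interchangeable when TRID cannot be NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE viewA (TRID TEXT NOT NULL);
    CREATE TABLE viewB (TRID TEXT NOT NULL);
    INSERT INTO viewA VALUES ('a'), ('b'), ('c'), ('d');
    INSERT INTO viewB VALUES ('b'), ('d');
""")

queries = {
    "left_join": """
        SELECT viewA.TRID FROM viewA
        LEFT JOIN viewB ON viewA.TRID = viewB.TRID
        WHERE viewB.TRID IS NULL
        ORDER BY viewA.TRID
    """,
    "not_in": """
        SELECT TRID FROM viewA
        WHERE TRID NOT IN (SELECT TRID FROM viewB)
        ORDER BY TRID
    """,
    "not_exists": """
        SELECT TRID FROM viewA
        WHERE NOT EXISTS (SELECT 1 FROM viewB WHERE viewB.TRID = viewA.TRID)
        ORDER BY TRID
    """,
}

# Run each variant and collect the TRID values it returns.
results = {name: [r[0] for r in conn.execute(sql)] for name, sql in queries.items()}
for name, trids in results.items():
    print(name, trids)
```

Which variant is fastest on a given MySQL version is something only EXPLAIN and timing on the real views can tell.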

mysql Subquery with JOIN bad performance

My problem is this:
select * from
(
select * from barcodesA
UNION ALL
select * from barcodesB
)
as barcodesTOTAL, boxes
where barcodesTotal.code=boxes.code;
Table barcodesA has 4000 entries
Table barcodesB has 4000 entries
Table boxes has like 180.000 entries
It takes 30 seconds to process the query.
Another problematic query:
select * from
viewBarcodesTotal, boxes
where barcodesTotal.code=boxes.code;
viewBarcodesTotal contains the UNION ALL from both barcodes tables. It also takes forever.
Meanwhile,
select * from barcodesA , boxes where barcodesA.code=boxes.code
UNION ALL
select * from barcodesB , boxes where barcodesB.code=boxes.code
This one takes <1second.
The question is obviously WHY? Is my code bugged? Is mysql bugged?
I have to migrate from access to mysql, and I would have to rewrite all my code if the first option is bugged.

Add an index on boxes.code if you don't already have one. Joining 8000 records (4K+4K) to the 180,000 will benefit from an index on the 180K side of the equation.
Also, be explicit and specify the fields you need back in your SELECT statements. Using * in a production-use query is bad form as it encourages not having to think about what fields (and how big they might be), not to mention the fact that you have 2 different tables in your example, barcodesa and barcodesb with potentially different data types and column orders that you're UNIONing....

The REASON for the performance difference...
The first query says... First, do a complete union of EVERY record in A UNIONed with EVERY record in B, THEN Join it to boxes on the code. The union does not have an index to be optimized against.
By writing it as in your SECOND query, each table individually IS optimized on the join (apparently there IS an index, judging by the performance of the second query, but I would ensure both tables have an index on the "code" column).
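As a sanity check that the rewrite is safe, the sketch below confirms that joining after the union and taking the union of two joins return the same rows. It uses sqlite3 as a stand-in for MySQL; the label column, the index name and the sample rows are invented, and the timing behaviour described in this thread is specific to MySQL and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE barcodesA (code TEXT);
    CREATE TABLE barcodesB (code TEXT);
    CREATE TABLE boxes (code TEXT, label TEXT);
    CREATE INDEX idx_boxes_code ON boxes (code);  -- index on the large side
    INSERT INTO barcodesA VALUES ('x1'), ('x2');
    INSERT INTO barcodesB VALUES ('x3'), ('x9');
    INSERT INTO boxes VALUES ('x1', 'box 1'), ('x2', 'box 2'),
                             ('x3', 'box 3'), ('x4', 'box 4');
""")

# Form 1: build the whole union first, then join the derived table to boxes.
join_after_union = """
    SELECT t.code, boxes.label
    FROM (SELECT code FROM barcodesA
          UNION ALL
          SELECT code FROM barcodesB) AS t, boxes
    WHERE t.code = boxes.code
"""

# Form 2: join each base table to boxes, then combine the two results.
union_of_joins = """
    SELECT barcodesA.code, boxes.label
    FROM barcodesA, boxes WHERE barcodesA.code = boxes.code
    UNION ALL
    SELECT barcodesB.code, boxes.label
    FROM barcodesB, boxes WHERE barcodesB.code = boxes.code
"""

# Sort both results so the comparison does not depend on row order.
rows_form1 = sorted(conn.execute(join_after_union).fetchall())
rows_form2 = sorted(conn.execute(union_of_joins).fetchall())
print(rows_form1 == rows_form2, rows_form1)
```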

Slow query when using ORDER BY

Here's the query (the largest table has about 40,000 rows)
SELECT
Course.CourseID,
Course.Description,
UserCourse.UserID,
UserCourse.TimeAllowed,
UserCourse.CreatedOn,
UserCourse.PassedOn,
UserCourse.IssuedOn,
C.LessonCnt
FROM
UserCourse
INNER JOIN
Course
USING(CourseID)
INNER JOIN
(
SELECT CourseID, COUNT(*) AS LessonCnt FROM CourseSection GROUP BY CourseID
) C
USING(CourseID)
WHERE
UserCourse.UserID = 8810
If I run this, it executes very quickly (.05 seconds roughly). It returns 13 rows.
When I add an ORDER BY clause at the end of the query (ordering by any column) the query takes about 10 seconds.
I'm using this database in production now, and everything is working fine. All my other queries are speedy.
Any ideas of what it could be? I ran the query in MySQL's Query Browser, and from the command line. Both places it was dead slow with the ORDER BY.
EDIT: Tolgahan ALBAYRAK solution works, but can anyone explain why it works?

maybe this helps:
SELECT * FROM (
SELECT
Course.CourseID,
Course.Description,
UserCourse.UserID,
UserCourse.TimeAllowed,
UserCourse.CreatedOn,
UserCourse.PassedOn,
UserCourse.IssuedOn,
C.LessonCnt
FROM
UserCourse
INNER JOIN
Course
USING(CourseID)
INNER JOIN
(
SELECT CourseID, COUNT(*) AS LessonCnt FROM CourseSection GROUP BY CourseID
) C
USING(CourseID)
WHERE
UserCourse.UserID = 8810
) AS t ORDER BY CourseID

Is the column you're ordering by indexed?
Indexing drastically speeds up ordering and filtering.

You are selecting from "UserCourse" which I assume is a joining table between courses and users (Many to Many).
You should index the column that you need to order by, in the "UserCourse" table.
Suppose you want to "order by CourseID", then you need to index it on UserCourse table.
Ordering by any other column that is not present in the joining table (i.e. UserCourse) may require further denormalization and indexing on the joining table to be optimized for speed;
In other words, you need to have a copy of that column in the joining table and index it.
P.S.
The answer given by Tolgahan Albayrak, although correct for this question, would not produce the desired result, in cases where one is doing a "LIMIT x" query.
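The LIMIT caveat can be illustrated with a tiny sketch. It uses sqlite3 as a stand-in for MySQL, and the items table and its values are hypothetical. Sorting and then limiting returns the smallest values, while limiting inside a subquery and sorting outside only sorts whichever rows the inner query happened to return first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (v INTEGER);
    INSERT INTO items VALUES (3), (1), (2);
""")

# ORDER BY and LIMIT in the same query: the two smallest values.
sort_then_limit = [
    r[0] for r in conn.execute("SELECT v FROM items ORDER BY v LIMIT 2")
]

# LIMIT inside, ORDER BY outside: two rows chosen without any ordering
# guarantee, and only those two rows are then sorted.
limit_then_sort = [
    r[0]
    for r in conn.execute(
        "SELECT v FROM (SELECT v FROM items LIMIT 2) AS sub ORDER BY v"
    )
]

print(sort_then_limit, limit_then_sort)
```

So the subquery trick is only a drop-in replacement when no LIMIT is involved, or when the LIMIT is moved to the outer query together with the ORDER BY.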

Have you updated the statistics on your database? I ran into something similar on mine, where I had 2 identical queries whose only difference was a capital letter; one returned in half a second and the other took nearly 5 minutes. Updating the statistics resolved the issue.

I realise this answer is too late, however I have just had a similar problem: adding order by increased the query time from seconds to 5 minutes. Having tried most other suggestions for speeding it up, I noticed that the /tmp files were growing to 12G for this query. I changed the query so that a varchar(20000) field being returned was wrapped in trim(), and performance dramatically improved (back to seconds). So I guess it's worth checking whether you are returning large varchars as part of your query and, if so, processing them (maybe substring(x, 1, length(x))?) if you don't want to trim them.
The query was returning 500k rows and the /tmp file indicated that each row was using about 20k of data.

A similar question was asked before here.
It might help you as well. Basically it describes using composite indexes and how order by works.

Today I ran into the same kind of problem. As soon as I sorted the result set by a field from a joined table, the whole query became horribly slow and took more than a hundred seconds.
The server was running MySQL 5.0.51a, and by chance I noticed that the same query ran as fast as it always should have on a server with MySQL 5.1. When comparing the explains for that query, I saw that the usage and handling of indexes has obviously changed a lot (at least from 5.0 -> 5.1).
So if you encounter such a problem, maybe your resolution is to simply upgrade your MySQL.
