The first queries behave erratically - MySQL

I ran the same query on a number of tables (containing different numbers of records):
SELECT * FROM `tblTest`
ORDER BY `tblTest`.`DateAccess` DESC;
Why do the first queries behave erratically (take longer than the second, third, ...)?
I calculated the average of the second, third, and fourth queries, excluding the first.
For example, on a table with 1,000,000 records, the first run takes 4.8410 s to process and the second only 0.8940 s. Why is this happening?
P.S. I am using the phpMyAdmin tool.

A DBMS is a really smart application and maintains multiple catalogues to optimize its execution. When a query is run, it generates many entries in these catalogues; depending on the DBMS, the catalogues will be more or less optimized, and some systems can even automatically generate indexes for frequently run queries. They also all have what is called a query optimizer, which analyzes the query in order to choose an efficient execution plan.
In your specific case, you should look at query and result caching; the following articles should help you understand how MySQL natively tries to optimize query processing.
http://dev.mysql.com/doc/refman/5.5/en/query-cache.html
http://www.cyberciti.biz/tips/enable-the-query-cache-in-mysql-to-improve-performance.html
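As a quick check, the query cache can be inspected and bypassed directly; here is a minimal sketch (the status variables are standard MySQL, and the query is the one from the question):
SHOW VARIABLES LIKE 'query_cache%';  -- is the cache enabled, and how big is it?
SHOW STATUS LIKE 'Qcache%';          -- cache hits, inserts, and low-memory prunes
SELECT SQL_NO_CACHE * FROM `tblTest` ORDER BY `tblTest`.`DateAccess` DESC;  -- bypass the cache to time the "cold" path again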
Here is a comparison between Oracle, MySQL, and Postgres (not a new article, but it will give you a basic idea of how different DBMSs have different ways of handling complex queries on large databases):
http://dcdbappl1.cern.ch:8080/dcdb/archive/ttraczyk/db_compare/db_compare.html#Query+optimization
Cheers,

Related

Is selecting fewer columns speeding up my query?

I have seen several questions comparing SELECT * to selecting all columns explicitly, but what about selecting fewer columns vs. more?
In other words, is:
SELECT id,firstname,lastname,lastlogin,email,phone
More than negligibly faster than:
SELECT id,firstname,lastlogin
I realize there will be small differences because more data is transferred through the system and to the application, but that is a total data/load difference, not a cost of the query itself (larger data in the cells would have the same effect anyway, I believe) - I'm only trying to optimize my query, as I will have to load ALL the data at some point anyway...
When my admin user logs in, I'm going to load the entire user database into a cache, but I can either query only critical data upfront to shave some execution time, or just get everything - if it works out roughly the same. I know more rows equals longer query execution - but what about more selected values in my query?
Under most circumstances, the only difference is going to be slightly larger data for these fields and the additional time to fetch them.
There are two things to consider:
If the additional fields are very big, then this could be a big difference in performance.
If there is an index that covers the columns you actually want, then the index can be used for the query. This could speed up the query in the database.
In general, though, the advice is to return the columns you want to the application. If there is complex processing, you should consider doing that in the database rather than the application.
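To illustrate the covering-index point, here is a minimal sketch (the users table and the index name are assumptions for illustration, not from the question):
-- If the narrow query only needs these three columns, an index that covers them
-- lets MySQL answer the query from the index alone:
CREATE INDEX idx_users_login_summary ON users (id, firstname, lastlogin);
EXPLAIN SELECT id, firstname, lastlogin FROM users;  -- the Extra column should show "Using index"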

sql query count for sum of two "unknown" columns

I need to query for the COUNT of rows that fulfill multiple filter criteria. However, I do not know which filters will be combined, so I cannot create appropriate indexes.
SELECT COUNT(id) FROM tbl WHERE filterA > 1000 AND filterD < 500
This is very slow since it has to do a full table scan. Is there any way to get a performant query in my situation?
id, filterA, filterB, filterC, filterD, filterE
1, 2394, 23240, 8543, 3241, 234, 23
The issue here is that there are certain limitations in how you can index data on multiple criteria. These are standard, fundamental issues; to the extent that ElasticSearch is able to get around them, it does so through brute-force parallelism and indexes on everything you may want to filter by.
Usually some filters will be more commonly used and more selective than others, so one would typically start by looking at actual examples of queries and build indexes around the queries that have performed slowly in the past.
This means you start with slow query logging and then focus on the most important queries first, until everything performs tolerably.
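A minimal sketch of that workflow (the one-second threshold and the index name are illustrative assumptions):
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- log anything slower than one second
-- after collecting slow queries, index the filter combinations that actually show up, e.g.:
CREATE INDEX idx_tbl_filterA_filterD ON tbl (filterA, filterD);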

Mysql SELECT query and performance

I was wondering if there is a performance difference between a SELECT query with a not very specific WHERE clause and another SELECT query with a more specific WHERE clause.
For instance is the query:
SELECT * FROM table1 WHERE first_name='Georges';
slower than this one:
SELECT * FROM table1 WHERE first_name='Georges' AND nickname='Gigi';
In other words, is there a time factor that is linked to the precision of the WHERE clause?
I'm not sure I'm being very clear, or whether my question takes into account all the components involved in a database query (MySQL in my case).
My question is related to the Django framework, because I would like to cache an evaluated queryset and, on a later request, take back this cached, evaluated queryset, filter it further, and evaluate it again.
There is no hard and fast rule about this.
There can be either an increase or decrease in performance by adding more conditions to the WHERE clause, as it depends on, among other things, the:
indexing
schema
data quantity
data cardinality
statistics
intelligence of the query engine
You need to test with your data set and determine what will perform the best.
The MySQL server must compare all columns in your WHERE clause (if they are all joined by AND).
So if you don't have an index on the nickname column, the second query will be slightly slower.
Here you can read how column indexes works (with examples similar to your question): http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html
I think it is difficult to answer this question; too many aspects (e.g. indexes) are involved. I would say that the first query is faster than the second one, but I can't say for sure.
If this is crucial for you, why don't you run a simulation (e.g. run 1,000,000 queries) and check the time?
Yes, it can be slower. It will all depend on indexes you have and data distribution.
Check the link Understanding the Query Execution Plan for information on how to know what MySQL is going to do when executing your query.
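To see the difference concretely, both queries can be compared with EXPLAIN (the composite index below is an assumption for illustration, not something from the question):
EXPLAIN SELECT * FROM table1 WHERE first_name='Georges';
EXPLAIN SELECT * FROM table1 WHERE first_name='Georges' AND nickname='Gigi';
-- with an index covering both filter columns, the more specific query can examine fewer rows:
CREATE INDEX idx_table1_name_nick ON table1 (first_name, nickname);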

How big is too big for a view in MySQL InnoDB?

BACKGROUND
I'm working with a MySQL InnoDB database with 60+ tables, and I'm creating different views in order to make dynamic queries fast and easy in the code. I have a couple of views with INNER JOINs (without many-to-many relationships) of 20 to 28 tables, SELECTing 100 to 120 columns, with a row count below 5,000, and they work lightning fast.
ACTUAL PROBLEM
I'm creating a master view with INNER JOINs (without many-to-many relationships) of 34 tables, SELECTing about 150 columns, with a row count below 5,000, and it seems like it's too much. It takes forever to do a single SELECT. I'm wondering if I've hit some kind of view-size limit, and whether there is any way of increasing it, or any tricks that would help me get past this apparent limit.
It's important to note that I'm NOT USING aggregate functions, because I know about their negative impact on performance, which, by the way, I'm very concerned about.
MySQL does not use the "System R algorithm" (used by PostgreSQL, Oracle, and SQL Server, I think), which considers not only different join algorithms (MySQL only has nested-loop joins, although you can fake a hash join by using a hash index), but also the possible orders in which to join the tables and the possible index combinations. The result seems to be that parsing and execution of queries can be very quick up to a point, but performance can drop off dramatically once the optimizer chooses the wrong path through the data.
Take a look at your explain plans and try to see whether a) the drop in performance is due to the number of columns you are returning (just do SELECT 1 or something), or b) it is due to the optimizer choosing a table scan instead of using an index.
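A minimal sketch of that check (master_view is a placeholder name for the 34-table view):
EXPLAIN SELECT * FROM master_view;  -- does any joined table show type=ALL, i.e. a full table scan?
SELECT 1 FROM master_view;          -- still slow? then the join plan, not the column count, is the problem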
A view is just a named query. When you refer to a view in MySQL, it simply replaces the name with the underlying query and runs it.
It seems that you are confusing it with materialized views, which are tables created from a query. Afterwards you can query that table without having to run the original query again.
Materialized views are not implemented in MySQL.
To improve performance, try using the EXPLAIN keyword to see where you can optimize your query/view.
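Since MySQL lacks materialized views, one workaround is to materialize the result by hand into a regular table and refresh it on your own schedule (a sketch; master_view and the refresh strategy are assumptions):
CREATE TABLE master_snapshot AS SELECT * FROM master_view;  -- one-off materialization
-- refresh later, e.g. from a cron job or a scheduled event:
TRUNCATE TABLE master_snapshot;
INSERT INTO master_snapshot SELECT * FROM master_view;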

mysql performance comparison

Which is more efficient, and by how much?
type 1:
insert into table_name (column1, column2, ...)
select column1, column2, ... from another_table
where columnX in (value_list)
type 2:
insert into table_name (column1, column2, ...)
values (column1_0, column2_0, ...), (column1_1, column2_1, ...)
The first version looks short, and the second may become extremely long when value_list contains, say, 500 or even more values.
But I have no idea which will perform better, though intuitively it feels like the first should be more efficient.
The first is cleaner, especially if your data is already in MySQL (which I'm assuming you are saying?). You would save some time in network overhead sending data, and in parsing time, and you would have to worry less about hitting whatever query-size limit your client has.
However, in general, I would expect the performance to be similar as the number of rows grows larger, especially on a well-indexed table. Most of the time for inserts with large queries is spent doing things like building indexes (see here), and both of those queries, unless you turn indexes off, would have to do that.
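For large bulk loads, the index-maintenance cost that dominates both forms can sometimes be deferred; a sketch, noting that DISABLE KEYS only affects nonunique indexes on MyISAM tables and that the value list here is illustrative:
ALTER TABLE table_name DISABLE KEYS;
INSERT INTO table_name (column1, column2) SELECT column1, column2 FROM another_table WHERE columnX IN (1, 2, 3);
ALTER TABLE table_name ENABLE KEYS;  -- nonunique indexes are rebuilt in one pass here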
I agree with Todd, the first query is cleaner and will be faster to send to the MySQL server and faster to compile. And it's probably true that as the number of inserted records increases, the speed differential will drop.
But the first form has substantial other benefits to consider:
It's far easier to maintain: you only have to add or modify a field every now and then.
You avoid the expense of querying another_table and processing the results to concatenate the second query (a hidden cost of that approach).
If you need to run this update more than once, the first query can be cached in the MySQL server along with its compiled form and query plan. This makes subsequent invocations of the query run a bit faster.