I was wondering: what's the speed difference between multiple MySQL queries and one single complex query? Is there a difference? Does anyone have any benchmarks or tips?
Example
"SELECT *, ( SELECT COUNT(DISTINCT stuff) FROM stuff where stuff.id = id) as stuff, ( SELECT SUM(morestuff) FROM morestuff where morestuff.id = id) as morestuff, (SELECT COUNT(alotmorestuff) FROM alotmorestuff where alotmorestuff.id = id) as alotmorestuff FROM inventory, blah WHERE id=id"
vs. separate SELECT queries for each value.
Well, your complex query actually isn't complex. It's a bunch of independent selects mashed together, which may or may not work - I see no reason there should be any noticeable difference. You may save a bit if you've got a high-latency connection to your db.
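For illustration, the "complex" version above is roughly equivalent to running these independent queries one at a time (a sketch using the names from the question; the unqualified id references are ambiguous in the original, so treat them as placeholders):

SELECT COUNT(DISTINCT stuff) FROM stuff WHERE stuff.id = ?;
SELECT SUM(morestuff) FROM morestuff WHERE morestuff.id = ?;
SELECT COUNT(alotmorestuff) FROM alotmorestuff WHERE alotmorestuff.id = ?;
SELECT * FROM inventory, blah WHERE id = ?;

The combined form saves three round trips to the server, which is the only place that high-latency connection would show up.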
Related
I have a MySQL (MariaDB) database with numerous tables, and all the tables have the same structure.
For the sake of simplicity, let's assume the structure is as below.
UserID - Varchar (primary)
Email - Varchar (indexed)
Is it possible to query all the tables together for the Email field?
Edit: I have not finalized the db design yet; I could put all the data in a single table. But I am afraid that a large table will slow down operations, and if it crashes, it will be painful to restore. Thoughts?
I have read some answers that suggested dumping all data together in a temporary table, but that is not an option for me.
MySQL Workbench or phpMyAdmin is not useful either; I am looking for a SQL query, not a frontend search technique.
There's no concise way in SQL to say this sort of thing:
SELECT a,b,c FROM <<<all tables>>> WHERE b LIKE 'whatever%'
If you know all your table names in advance, you can write a query like this.
SELECT a,b,c FROM table1 WHERE b LIKE 'whatever%'
UNION ALL
SELECT a,b,c FROM table2 WHERE b LIKE 'whatever%'
UNION ALL
SELECT a,b,c FROM table3 WHERE b LIKE 'whatever%'
UNION ALL
SELECT a,b,c FROM table4 WHERE b LIKE 'whatever%'
...
Or you can create a view like this.
CREATE VIEW everything AS
SELECT * FROM table1
UNION ALL
SELECT * FROM table2
UNION ALL
SELECT * FROM table3
UNION ALL
SELECT * FROM table4
...
Then use
SELECT a,b,c FROM everything WHERE b LIKE 'whatever%'
If you don't know the names of all the tables in advance, you can retrieve them from MySQL's information_schema and write a program to create a query like one of my suggestions. If you decide to do that and need help, please ask another question.
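For example, a minimal sketch of retrieving the table names (assuming a hypothetical schema name your_database; add a LIKE filter on table_name if only some of your tables share the structure):

SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'your_database';

Your program would then splice those names into the UNION ALL pattern shown above.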
These sorts of queries will, unfortunately, always be significantly slower than querying just one table. Why? MySQL must repeat the overhead of running the query on each table, and a single index is faster to use than multiple indexes on different tables.
Pro tip: Try to design your databases so you don't add tables when you add users (or customers, or whatever).
Edit: You may be tempted to use multiple tables for query-performance reasons. With respect, please don't do that. Correct indexing will almost always give you better query performance than searching multiple tables. For what it's worth, a "huge" table for MySQL, one which challenges its capabilities, usually has at least a hundred million rows. Truly. Hundreds of thousands of rows are in its performance sweet spot, as long as they're indexed correctly. Here's a good reference about that, one of many: https://use-the-index-luke.com/
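To make that concrete, here is a minimal sketch using the schema from the question, assuming everything is consolidated into a single hypothetical users table (the column lengths are assumptions):

-- One table for all users, with an index on the column you search by.
CREATE TABLE users (
  UserID VARCHAR(64) PRIMARY KEY,
  Email  VARCHAR(255),
  INDEX idx_email (Email)
);

-- This lookup uses idx_email and stays fast even at millions of rows.
SELECT UserID FROM users WHERE Email = 'someone@example.com';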
Another reason to avoid a design where you routinely create new tables in production: it's a pain in the neck to maintain and optimize databases with large numbers of tables. Six months from now, as your database scales up, you'll almost certainly need to add indexes to help speed up some slow queries. If you have to add many indexes, you, or your successor, won't like it.
You may also be tempted to use multiple tables to make your database more resilient to crashes. With respect, it doesn't work that way. Crashes are rare, and catastrophic unrecoverable crashes are vanishingly rare on reliable hardware. And a crash can corrupt multiple tables anyway. (Crash resilience comes from decent backups.)
Keep in mind that MySQL has been in development for over a quarter-century (as have the other RDBMSs). Thousands of programmer years have gone into making it fast and resilient. You may as well leverage all that work, because you can't outsmart it. I know this because I've tried and failed.
Keep your database simple. Spend your time (your only irreplaceable asset) making your application excellent so you actually get millions of users.
I have a project (Laravel 5.4) where I need to improve performance as much as I can.
So I was wondering: what is the performance difference between:
$model->get()
The get method fetches all columns ('created_at', 'updated_at', etc.), so the select version should be faster.
$model->select('many variables to select')->get();
Or is the select method an additional call that takes more time, so plain get is faster?
I wanted to know whether select plus get is better in all cases, or whether there are situations where plain get is better.
The difference between Model::get() and Model::select(['f1', 'f2'])->get() is only in the generated query:
// Model::get()
SELECT * FROM table
// Model::select(['f1', 'f2'])->get()
SELECT f1, f2 FROM table
Both run the database query ONCE and prepare the collection of model instances for you. select simply instructs Eloquent to select only the fields you need. The performance gain is almost negligible, and it can even be worse. Read about it here: Is it bad for performance to select all columns?
In MySQL, is it generally a good idea to always do a COUNT(*) first to determine if you should do a SELECT * to actually fetch the rows, or is it better to just do the SELECT * directly and then check if it returned any rows?
Unless you lock the table(s) in question, doing a SELECT COUNT(*) first is useless. Consider:
Process 1:
SELECT COUNT(*) FROM T;
Process 2:
INSERT INTO T
Process 1:
...now doing something based on the obsolete count retrieved before...
Of course, locking a table is not a very good idea in a server environment.
It depends on whether you need the number, but in MySQL specifically there's SQL_CALC_FOUND_ROWS, IIRC. Look it up in the docs.
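For reference, a minimal sketch of that approach (SQL_CALC_FOUND_ROWS and FOUND_ROWS() are real MySQL syntax, though deprecated in recent 8.0 releases; the table and column names here are placeholders):

SELECT SQL_CALC_FOUND_ROWS *
FROM t
WHERE some_col = 'some_value'
LIMIT 10;

-- The total number of matching rows, ignoring the LIMIT:
SELECT FOUND_ROWS();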
Always do the SELECT [field1, field2 | *] FROM ... directly. A preliminary SELECT COUNT(*) will just bloat your code, add extra transport and data overhead, and generally be unmaintainable.
The former is 2 queries; the latter is 1 query. Each query needs a round trip to the database server. Do the math.
The answer, as with many questions of this kind, is "it depends". What you shouldn't do is perform those two queries when you do not have an index on the table. In general, performing just a COUNT is a waste of IO time, so the two-query approach is only an option if it will save you IO in MOST cases.
In some cases, some db driver implementations may not report how many rows a SELECT returned until you have fetched them all. The COUNT(*) issued beforehand is useful when you need to know the precise size of the resulting recordset before you select the actual data.
I am performance-profiling my queries and need to be able to see a lot more than the last 100 queries. The maximum profiling_history_size is 100, but I've seen debug tools that somehow manage to save more than the last 100 queries (for example, the Django debug toolbar).
I do
SET profiling=1;
set profiling_history_size = 100;
SHOW PROFILES;
It would be fine if I could move the records to another table. Or maybe I need to be looking somewhere else altogether?
My program runs the same queries a lot of times, so what I really want is an aggregate of all the times that a particular query was executed. I was going to do some kind of GROUP BY once I had all the queries, but maybe there is some other place to look? (I don't mean to ask 2 questions, but maybe knowing what I eventually need will change the answer to the above question.)
With MySQL 5.6, the statements (queries) are instrumented by the performance schema.
You get the time a query takes, plus many more attributes such as the number of rows returned, etc, and you get many different aggregations.
See
http://dev.mysql.com/doc/refman/5.6/en/statement-summary-tables.html
http://dev.mysql.com/doc/refman/5.6/en/performance-schema-statements-tables.html
An aggregation that is particularly useful is the aggregation by digest, to group several "similar queries" together, where similar means the same structure but possibly different literal values ("select * from t1 where id = 1" and "select * from t1 where id = 2" will be aggregated together as "select * from t1 where id = ?")
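For example, a minimal sketch of pulling the ten most expensive digests (these are real performance_schema tables and columns; SUM_TIMER_WAIT is measured in picoseconds):

SELECT DIGEST_TEXT,
       COUNT_STAR AS exec_count,
       SUM_TIMER_WAIT / 1e12 AS total_seconds,
       SUM_ROWS_SENT AS rows_sent
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;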
If you need information for more than the last 100 queries, you will simply need to collect the existing data every 100 queries. I have implemented that approach in PHP PDO:
https://github.com/gajus/doll
This is an application-layer solution.
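If you'd rather stay in SQL, here is a sketch of the same idea (assuming the deprecated information_schema.profiling table, which exposes per-state detail for the same queries SHOW PROFILES lists, and a hypothetical archive table my_profiles):

-- Create the archive once, copying the structure of the source.
CREATE TABLE my_profiles AS
SELECT * FROM information_schema.profiling WHERE FALSE;

-- Run this before the 100-query ring buffer wraps around;
-- dedupe on (QUERY_ID, SEQ) if you run it more than once per session.
INSERT INTO my_profiles
SELECT * FROM information_schema.profiling;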
I have read that creating a temporary table is best if the number of parameters passed in the IN criteria is large. This is for SELECT queries. Does this hold true for UPDATE queries as well? I have an UPDATE query that uses 3 table joins (INNER JOINs) and passes 1000 parameters in the IN criteria, and this query runs in a loop 200 or more times. What is the best approach to execute this query?
IN operations on long lists are usually slow. Passing 1000 parameters to any query sounds awful; if you can avoid that, do it. That said, I'd really give the temp table a go. You can even play with the indexing of the table: instead of just putting values in it, add the indexes that would help you optimize your searches.
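A minimal sketch of that idea, with hypothetical names (yourTable, yourTempTable, an id join key, and a status column to update):

-- Stage the 1000 ids in an indexed temporary table.
CREATE TEMPORARY TABLE yourTempTable (
  id INT NOT NULL,
  PRIMARY KEY (id)
);
INSERT INTO yourTempTable (id) VALUES (1), (2), (3); -- ...and the rest of your ids

-- Join against it instead of passing a 1000-item IN list.
UPDATE yourTable yt
JOIN yourTempTable ytt ON yt.id = ytt.id
SET yt.status = 'processed';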
On the other hand, inserting into an indexed table is slower than inserting into an unindexed one, so run an empirical test there. One thing I consider a must: bear in mind that when using the extra table you don't need the IN clause, because you can use the EXISTS clause, which usually results in better performance, e.g.:
SELECT * FROM yourTable yt
WHERE EXISTS (
    SELECT * FROM yourTempTable ytt
    WHERE yt.id = ytt.id
);
I don't know your query or your data, but that should give you an idea of how to do it. Note that the inner SELECT * is as fast as SELECT aSingleField, since the database engine optimizes it.
Those are all my thoughts. But remember: to be 100% sure of what is best for your problem, there is nothing like performing both tests and timing them :) Hope this helps.