Basically, I have multiple queries like this:
SELECT a, b, c FROM (LONG QUERY) X WHERE ...
The thing is, I am using this LONG QUERY really frequently. I am looking to give this subquery an alias which would primarily:
Shorten & simplify queries (also reduce errors and code duplication)
Possibly optimize performance (I believe MySQL query caching does this by default)
Until now, I have been storing the query in a variable like this:
variable = LONG QUERY;
Query("SELECT a, b, c FROM ("+variable+") X WHERE ...");
This is not bad, but I am looking for a way to do it inside MySQL itself.
Is it possible to create a simple, read-only view that generates NO overhead, so that I could write the query below everywhere? I believe this would be a more proper and readable way of doing it:
SELECT a, b, c FROM myTable WHERE ...
Typically these are called views. For example:
CREATE VIEW vMyLongQuery
AS
SELECT a, b, c FROM (LONG QUERY) X WHERE ...
Which can then be referenced like this:
SELECT a, b, c FROM vMyLongQuery
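See http://dev.mysql.com/doc/refman/5.0/en/create-view.html for more info on syntax. Note that MySQL versions before 5.7.7 do not allow a subquery in the FROM clause of a view definition, so on those versions the view has to be defined on LONG QUERY directly.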
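As a rough illustration of the idea, here is a minimal Python sketch that defines a view once and then compares it with the string-concatenation approach from the question. It runs against an in-memory SQLite database purely as a stand-in for MySQL, and the orders table, its columns, and the contents of LONG_QUERY are all made up for the example.

```python
import sqlite3

# Hypothetical stand-in for LONG QUERY; the table and column names are invented.
LONG_QUERY = """
    SELECT o.customer_id AS a, o.amount AS b, o.status AS c
    FROM orders AS o
    WHERE o.status <> 'cancelled'
"""

# SQLite is used here only so the sketch is runnable without a MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER, status TEXT);
    INSERT INTO orders VALUES (1, 50, 'paid'), (2, 75, 'cancelled'), (3, 120, 'paid');
""")

# Define the view once...
conn.execute(f"CREATE VIEW vMyLongQuery AS {LONG_QUERY}")

# ...so that every caller can use the short form.
via_view = conn.execute(
    "SELECT a, b, c FROM vMyLongQuery WHERE b > 60 ORDER BY a"
).fetchall()

# The original string-concatenation approach, for comparison.
via_subquery = conn.execute(
    f"SELECT a, b, c FROM ({LONG_QUERY}) X WHERE b > 60 ORDER BY a"
).fetchall()

print(via_view == via_subquery)
```

Both forms return the same rows. The sketch says nothing about how MySQL will execute the view, which is what the rest of this answer is about.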
As far as performance goes, in the best case it will be almost exactly the same as what you're doing now, and in the worst case it will kill your app. It depends on what you do with the views and how MySQL processes them.
MySQL implements views in two ways: MERGE and TEMPTABLE. The MERGE option is pretty much exactly what you're doing now: your view is merged into your query as a subquery. With TEMPTABLE, MySQL actually spools all the data into a temporary table and then selects from or joins to that temporary table. You also lose index benefits when data is joined to the temporary table.
As a heads up, the MERGE algorithm cannot be used if your view contains any of the following:
Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
DISTINCT
GROUP BY
HAVING
LIMIT
UNION or UNION ALL
Subquery in the select list
Reference to literals without an underlying table
So if your subquery uses these, you're likely going to hurt performance.
Also, heed OMG Ponies' advice carefully: a view is NOT the same as a base class. Views have their place in a DB but can easily be misused. When an engineer comes to the database from an OO background, views seem like a convenient way to promote inheritance and reusability of code. Often people eventually find themselves in a position where they have nested views joined to nested views of nested views. SQL processes nested views by essentially taking the definition of each individual view and expanding that into a beast of a query that will make your DBA cry.
Also, you followed excellent practice in your example and I encourage you to continue this: you specified all your columns individually. Never, ever use SELECT * to specify the results of your view. It will, eventually, ruin your day.
I'm not sure if this is what you are looking for, but you can use stored procedures to reuse MySQL queries. I'm not sure whether you can use one inside another query, though.
http://www.mysqltutorial.org/getting-started-with-mysql-stored-procedures.aspx
I have a MySQL (MariaDB) database with numerous tables, and all the tables have the same structure.
For the sake of simplicity, let's assume the structure is as below.
UserID - Varchar (primary)
Email - Varchar (indexed)
Is it possible to query all the tables together for the Email field?
Edit: I have not finalized the db design yet, I could put all the data in single table. But I am afraid that large table will slow down the operations, and if it crashes, it will be painful to restore. Thoughts?
I have read some answers that suggested dumping all data together in a temporary table, but that is not an option for me.
MySQL Workbench and phpMyAdmin are not useful either; I am looking for a SQL query, not a frontend search technique.
There's no concise way in SQL to say this sort of thing.
SELECT a,b,c FROM <<<all tables>>> WHERE b LIKE 'whatever%'
If you know all your table names in advance, you can write a query like this.
SELECT a,b,c FROM table1 WHERE b LIKE 'whatever%'
UNION ALL
SELECT a,b,c FROM table2 WHERE b LIKE 'whatever%'
UNION ALL
SELECT a,b,c FROM table3 WHERE b LIKE 'whatever%'
UNION ALL
SELECT a,b,c FROM table4 WHERE b LIKE 'whatever%'
...
Or you can create a view like this.
CREATE VIEW everything AS
SELECT * FROM table1
UNION ALL
SELECT * FROM table2
UNION ALL
SELECT * FROM table3
UNION ALL
SELECT * FROM table4
...
Then use
SELECT a,b,c FROM everything WHERE b LIKE 'whatever%'
If you don't know the names of all the tables in advance, you can retrieve them from MySQL's information_schema and write a program to create a query like one of my suggestions. If you decide to do that and need help, please ask another question.
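A program of that kind could look roughly like the Python sketch below. The helper names quote_identifier and build_union_query are hypothetical, and the demo runs against in-memory SQLite (which lists tables in sqlite_master) only so that it is self-contained. On MySQL the table names would come from information_schema.tables instead, and the placeholder style depends on the driver.

```python
import sqlite3

def quote_identifier(name):
    # Backtick-quote an identifier MySQL-style, doubling any embedded backticks.
    return "`" + name.replace("`", "``") + "`"

def build_union_query(table_names, columns, where_clause):
    """Return one SELECT per table, glued together with UNION ALL."""
    column_list = ", ".join(quote_identifier(c) for c in columns)
    selects = [
        f"SELECT {column_list} FROM {quote_identifier(t)} WHERE {where_clause}"
        for t in table_names
    ]
    return "\nUNION ALL\n".join(selects)

# Demo data: two identically structured tables (names are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b TEXT, c TEXT);
    CREATE TABLE table2 (a INTEGER, b TEXT, c TEXT);
    INSERT INTO table1 VALUES (1, 'whatever-1', 'x');
    INSERT INTO table2 VALUES (2, 'whatever-2', 'y'), (3, 'other', 'z');
""")

# SQLite stand-in for the lookup; on MySQL this would be something like
# SELECT table_name FROM information_schema.tables WHERE table_schema = DATABASE()
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

query = build_union_query(table_names, ["a", "b", "c"], "b LIKE ?")
params = ["whatever%"] * len(table_names)  # one bound value per SELECT
rows = sorted(conn.execute(query, params).fetchall())
print(query)
print(rows)
```

The search value is passed as a bound parameter rather than pasted into the SQL text. Only the identifiers are interpolated, which is why they are quoted.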
These sorts of queries will, unfortunately, always be significantly slower than querying just one table. Why? MySQL must repeat the overhead of running the query on each table, and a single index is faster to use than multiple indexes on different tables.
Pro tip: Try to design your databases so you don't add tables when you add users (or customers or whatever).
Edit: You may be tempted to use multiple tables for query-performance reasons. With respect, please don't do that. Correct indexing will almost always give you better query performance than searching multiple tables. For what it's worth, a "huge" table for MySQL, one which challenges its capabilities, usually has at least a hundred million rows. Truly. Hundreds of thousands of rows are in its performance sweet spot, as long as they're indexed correctly. Here's a good reference about that, one of many. https://use-the-index-luke.com/
Another reason to avoid a design where you routinely create new tables in production: it's a pain in the neck to maintain and optimize databases with large numbers of tables. Six months from now, as your database scales up, you'll almost certainly need to add indexes to help speed up some slow queries. If you have to add many indexes, you, or your successor, won't like it.
You may also be tempted to use multiple tables to make your database more resilient to crashes. With respect, it doesn't work that way. Crashes are rare, and catastrophic unrecoverable crashes are vanishingly rare on reliable hardware. And crashes can corrupt multiple tables. (For crash resilience, keep decent backups.)
Keep in mind that MySQL has been in development for over a quarter-century (as have the other RDBMSs). Thousands of programmer years have gone into making it fast and resilient. You may as well leverage all that work, because you can't outsmart it. I know this because I've tried and failed.
Keep your database simple. Spend your time (your only irreplaceable asset) making your application excellent so you actually get millions of users.
I have a view (say 'v') that is the combination of 10 tables using several Joins and complex calculations. In that view, there are around 10 Thousand rows.
I then select 1 row from it using WHERE id = 23456.
Another possible way is to use a larger query in which I can cut the dataset down to 1% before the complex calculation starts.
Question: Are SQL views optimized in some form?
MySQL views are just syntactic sugar. There is no special optimization. Think of views as being textually merged, then optimized. That is, you could get the same optimizations (or not) by manually writing the equivalent SELECT.
If you would like to discuss the particular query further, please provide SHOW CREATE TABLE/VIEW and EXPLAIN SELECT .... It may be that you are missing a useful 'composite' index.
I have a view which queries 2 tables that don't change often (they are updated once or twice a day) and have a maximum of 2000 and 1000 rows.
Which algorithm should perform better, MERGE or TEMPTABLE?
Wondering, will MySQL cache the query result, making TEMPTABLE the best choice in my case?
Reading https://dev.mysql.com/doc/refman/5.7/en/view-algorithms.html I understood that, basically, the MERGE algorithm will inject the view code into the query that is calling it and then run it. The TEMPTABLE algorithm will run the view first, store its result in a temporary table, and then use that. But there is no mention of caching.
I know I have the option to implement Materialized Views myself (http://www.fromdual.com/mysql-materialized-views). Can MySQL automatically cache the TEMPTABLE result and use it instead?
Generally speaking the MERGE algorithm is preferred as it allows your view to utilize table indexes, and doesn't introduce a delay in creating temporary tables (as TEMPTABLE does).
In fact, this is what the MySQL Optimizer does by default: when a view's algorithm is UNDEFINED (as it is by default), MySQL will use MERGE if it can; otherwise it'll use TEMPTABLE.
One thing to note (which has caused me a lot of pain) is that MySQL will not use the MERGE algorithm if your view contains any of the following constructs:
Constructs that prevent merging are the same for derived tables and view references:
Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
DISTINCT
GROUP BY
HAVING
LIMIT
UNION or UNION ALL
Subqueries in the select list
Assignments to user variables
References only to literal values (in this case, there is no underlying table)
In these cases, TEMPTABLE will be used, which can cause performance issues without any clear indication why. It is then best to use a stored procedure or a subquery instead of a view.
Thanks, MySQL 😠
Which algorithm? It depends on the particular query and schema. Usually the Optimizer picks the better approach, and you should not specify.
But... sometimes the Optimizer picks a really bad approach. At that point, the only real solution is not to use Views. That is, some Views cannot be optimized as well as the equivalent SELECT.
If you want to discuss a particular case, please provide the SHOW CREATE VIEW and SHOW CREATE TABLEs, plus a SELECT calling the view. And construct the equivalent SELECT. Also include EXPLAIN for both SELECTs.
I am renaming multiple tables in a large application. I need to preserve the old table names because some parts of the application will take longer to be updated, and we can have no downtime.
My idea is to create a view that selects all from the new table, like this:
create view old_table_name as select a as x, b as y, c as z from new_table_name;
According to this article (http://dev.mysql.com/doc/refman/5.7/en/view-updatability.html) I will be able to make inserts and updates and deletes with this view.
My question is (considering that this is only a temporary solution in the meantime, until we are able to migrate all legacy code to use this new table): will I be able to pull this off?
Will I get decent enough performance in joins and the like?
Will I be able to make complex updates or deletes (involving joins) with this approach?
Is there a better way to approach this problem?
Thanks in advance for your help.
The performance should be essentially identical.
For simple views without aggregate functions/group by/having, distinct, limit, unions, scalar subqueries, and views that return literals only, MySQL uses the MERGE algorithm by default, which effectively rewrites a query referencing such a view as if you had used the columns in the base tables directly.
See View Algorithms in the documentation.
Determining what algorithm MySQL view is using may be informative as well.
I've Googled this question and can't seem to find a consistent opinion, or many opinions that are based on solid data. I simply would like to know if using the wildcard in a SQL SELECT statement incurs additional overhead than calling each item out individually. I have compared the execution plans of both in several different test queries, and it seems that the estimates always read the same. Is it possible that some overhead is incurred elsewhere, or are they truly handled identically?
What I am referring to specifically:
SELECT *
vs.
SELECT item1, item2, etc.
SELECT * FROM...
and
SELECT every, column, list, ... FROM...
will perform the same because both are an unoptimised scan
The difference is:
the extra lookup in sys.columns to resolve *
the contract/signature change when the table schema changes
inability to create a covering index. In fact, no tuning options at all, really
views have to be refreshed if they are not schema-bound
cannot index or schema-bind a view that uses *
...and other stuff
Other SO questions on the same subject...
What is the reason not to use select * ?
Is there a difference betweeen Select * and Select list each col
SQL Query Question - Select * from view or Select col1,col2…from view
“select * from table” vs “select colA,colB,etc from table” interesting behaviour in SqlServer2005
Do you mean select * from ... instead of select col1, col2, col3 from ...?
I think it's always better to name the column and retrieve the minimal amount of information, because
your code will work independently of the physical order of the columns in the db. The column order should not impact your application, but it will be the case if you use *. It can be dangerous in case of db migration, etc.
if you name the columns, the DBMS can further optimize the execution. For instance, if there is an index that contains all the data you are interested in, the table will not be accessed at all.
If you mean something else with "wildcard", just ignore my answer...
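The covering-index point can be seen in a small experiment. The Python sketch below uses in-memory SQLite as a stand-in (MySQL's EXPLAIN output looks different, but the principle is the same). The users table, its columns, and the helper query_plan are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, bio TEXT);
    CREATE INDEX idx_users_email ON users (email);
    INSERT INTO users (email, bio) VALUES ('a@example.com', 'long text');
""")

def query_plan(sql):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Only the indexed column is requested, so the index alone can answer the query.
named_plan = query_plan("SELECT email FROM users WHERE email = 'a@example.com'")

# SELECT * also needs bio, so the table itself has to be read after the index lookup.
star_plan = query_plan("SELECT * FROM users WHERE email = 'a@example.com'")

print(named_plan)
print(star_plan)
```

The exact wording of the plan varies between SQLite versions, but the first plan should mention a covering index while the second should not.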
EDIT: If you are talking about the asterisk wild card as in Select * From ... then see other responses...
If you are talking about wildcards in predicate clauses, or other query expressions using Like operator, (_ , % ) as described below, then:
This has to do with whether using the wildcard affects whether the SQL is "SARG-able" or not. SARGable (Search-ARGument-able) means that the query's search or sort arguments can be used as entry parameters to an existing index. If you prepend the wildcard to the beginning of an argument,
Where Name Like '%ing'
Then there is no way to traverse an index on the name field to find the nodes that end in 'ing'.
If, on the other hand, you append the wildcard to the end,
Where Name like 'Donald%'
then the optimizer can still use an index on the name column, and the query is still SARG-able
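To see why the position of the wildcard matters, here is an analogy in Python rather than a real database: a sorted list stands in for the index, and the function names prefix_search and suffix_search are made up for the illustration. Real B-tree indexes and collations are more involved (this sketch compares strings case-sensitively, for one), so treat it as a sketch of the principle only.

```python
from bisect import bisect_left

def prefix_search(sorted_names, prefix):
    """LIKE 'prefix%': seek to the first candidate, then read a contiguous range."""
    matches, examined = [], 0
    i = bisect_left(sorted_names, prefix)  # binary search, like descending a B-tree
    while i < len(sorted_names):
        examined += 1
        if not sorted_names[i].startswith(prefix):
            break  # sorted order guarantees no later entry can match
        matches.append(sorted_names[i])
        i += 1
    return matches, examined

def suffix_search(sorted_names, suffix):
    """LIKE '%suffix': sort order says nothing about endings, so check every entry."""
    matches = [name for name in sorted_names if name.endswith(suffix)]
    return matches, len(sorted_names)

names = sorted(["Daisy", "Darling", "Donald Duck", "Donald Knuth", "Eating", "Zed"])
print(prefix_search(names, "Donald"))  # (['Donald Duck', 'Donald Knuth'], 3)
print(suffix_search(names, "ing"))     # (['Darling', 'Eating'], 6)
```

The second number in each result is how many entries had to be looked at: the prefix search stops as soon as it leaves the matching range, while the suffix search has to touch all six.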
If what you call the SQL wildcard is *, it does not imply performance overhead by itself. However, if the table is extended, you could find yourself retrieving fields you didn't ask for.
In general not being specific in the fields you search or insert is a bad habit.
Consider
insert into mytable values(1,2)
What happens if the table is extended to three fields?
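A quick way to answer that is to try it. The Python sketch below uses in-memory SQLite as a stand-in. The exact error message differs in MySQL, but there too a positional INSERT with too few values is rejected once the column count changes. The third column c is an assumption added for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a INTEGER, b INTEGER)")

# Works while the table has exactly two columns.
conn.execute("INSERT INTO mytable VALUES (1, 2)")

# The table is extended to three fields.
conn.execute("ALTER TABLE mytable ADD COLUMN c INTEGER")

try:
    # Exactly the same statement as before.
    conn.execute("INSERT INTO mytable VALUES (1, 2)")
    positional_insert_failed = False
except sqlite3.Error as exc:
    positional_insert_failed = True
    print("positional insert now fails:", exc)

# Naming the columns keeps working; the new column is simply left NULL.
conn.execute("INSERT INTO mytable (a, b) VALUES (3, 4)")
rows = conn.execute("SELECT a, b, c FROM mytable ORDER BY a").fetchall()
print(rows)
```

The statement with an explicit column list survives the schema change, which is the same argument as naming columns in a SELECT.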
It may not be more work from an execution plan standpoint. But if you're fetching columns you don't actually need, that's additional network bandwidth being used between the database and your application. Also if you're using a high-level client API that performs some work on the returned data (for example, Perl's selectall_hashref) then those extra columns will impose performance cost on the client side. How much? Depends.