MySQL Rename table while keeping view for legacy code - mysql

I am renaming multiple tables in a large application. I need to preserve the old table names because some parts of the application will take longer to update, and we cannot have any downtime.
My idea is to create a view that selects all from the new table, like this:
create view old_table_name as select a as x, b as y, c as z from new_table_name;
According to this article (http://dev.mysql.com/doc/refman/5.7/en/view-updatability.html) I will be able to make inserts and updates and deletes with this view.
My question is (considering that this is only a temporary solution in the meantime, until we are able to migrate all legacy code to use the new table): will I be able to pull this off?
Will I get decent performance in joins and the like?
Will I be able to make complex updates or deletes (involving joins) with this approach?
Is there a better way to approach this problem?
Thanks in advance for your help.

The performance should be essentially identical.
For simple views without aggregate functions/group by/having, distinct, limit, unions, scalar subqueries, and views that return literals only, MySQL uses the MERGE algorithm by default, which effectively rewrites a query referencing such a view as if you had used the columns in the base tables directly.
See View Algorithms in the documentation.
The question "Determining what algorithm MySQL view is using" may be informative as well.
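As a rough sketch of the approach discussed above, the statements below create the compatibility view from the question and check how MySQL treats it. It assumes new_table_name already exists with columns a, b, c, that legacy code expects old_table_name with columns x, y, z, and that y is numeric (the UPDATE is purely illustrative). If the old table is renamed first, there is a short window before the view exists in which the old name does not resolve.

```sql
-- Assumption: new_table_name(a, b, c) already exists; legacy code expects
-- old_table_name(x, y, z). Requesting MERGE explicitly makes MySQL warn
-- if the view cannot be merged.
CREATE ALGORITHM = MERGE VIEW old_table_name AS
    SELECT a AS x, b AS y, c AS z
    FROM new_table_name;

-- Check whether MySQL considers the view updatable.
SELECT TABLE_NAME, IS_UPDATABLE
FROM information_schema.VIEWS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'old_table_name';

-- A legacy-style write through the old names (illustrative values only).
UPDATE old_table_name SET y = y + 1 WHERE x = 42;

-- When the view is merged, EXPLAIN should list new_table_name directly,
-- with no <derived> row.
EXPLAIN SELECT x, y FROM old_table_name WHERE x = 42;
```

Comparing this EXPLAIN output with the same query written against new_table_name is a quick way to confirm that the view adds no extra step.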

Related

Join 10 tables on a single join id called session_id that's stored in session table. Is this good/bad practice?

There are 10 tables, all with a session_id column, and a single session table. The goal is to join them all on the session table. I get the feeling that this is a major code smell. Is this good or bad practice?
What problems could occur?
Whether this is a good design or not depends deeply on what you are trying to represent with it. So, it might be OK or it might not be... there's no way to tell just from your question in its current form.
That being said, there are a couple of ways to speed up a join:
Use indexes.
Use covering indexes.
Under the right DBMS, you could use a materialized view to store pre-joined rows. You should be able to simulate that under MySQL by maintaining a special table via triggers (or even manually).
Don't join a table unless you actually need its fields. List only the fields you need in the SELECT list (instead of blindly using *). The fastest operation is the one you don't have to do!
And above all, measure on representative amounts of data! Possible results:
It's lightning fast. Yay!
It's slow, but it doesn't matter that it's slow (i.e. rarely used / not important).
It's slow and it matters that it's slow. Strap in, you have work to do!
Please post the query with the 11 joins and its EXPLAIN output in the original question when available. To spare the community follow-up requests, also post SHOW CREATE TABLE tblname and SHOW INDEX FROM tblname for every table involved, so that we know the scope of the data and the cardinality of each indexed column.
Of course, more joins hurt performance.
But it depends! If your data model is like that, there is not much you can do short of a complete redesign of the data model.
1) Is it an online (real-time transaction) DB or an offline DB (data warehouse)?
If online, it is better to maintain a single table: keep the data in one table and let the number of columns grow.
If offline, it is better to maintain separate tables, because you will not always need all the columns.
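To illustrate the covering-index suggestion above, here is a minimal sketch. The page_view table and its columns are hypothetical; only the session table and the session_id column come from the question.

```sql
-- Hypothetical tables: session(session_id PRIMARY KEY, ...)
-- and page_view(session_id, url, ...).
-- The index contains every page_view column the query touches,
-- so the rows of page_view itself need not be read.
CREATE INDEX idx_page_view_session_url ON page_view (session_id, url);

EXPLAIN
SELECT s.session_id, pv.url
FROM session AS s
JOIN page_view AS pv ON pv.session_id = s.session_id
WHERE s.session_id = 123;
```

If the index covers the query, the Extra column of the EXPLAIN row for pv should show "Using index". Selecting pv.* instead would typically lose that benefit, which is one reason to list only the fields you need.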

MySQL view performance TEMPTABLE or MERGE?

I have a view which queries from 2 tables that don't change often (they are updated once or twice a day) and have a maximum of 2000 and 1000 rows.
Which algorithm should perform better, MERGE or TEMPTABLE?
Wondering, will MySQL cache the query result, making TEMPTABLE the best choice in my case?
Reading https://dev.mysql.com/doc/refman/5.7/en/view-algorithms.html I understood that, basically, the MERGE algorithm will inject the view code into the query that is calling it, then run it. The TEMPTABLE algorithm will run the view first, store its result in a temporary table, and then use that. But there is no mention of caching.
I know I have the option to implement Materialized Views myself (http://www.fromdual.com/mysql-materialized-views). Can MySQL automatically cache the TEMPTABLE result and use it instead?
Generally speaking the MERGE algorithm is preferred as it allows your view to utilize table indexes, and doesn't introduce a delay in creating temporary tables (as TEMPTABLE does).
In fact this is what the MySQL Optimizer does by default: when a view's algorithm is UNDEFINED (as it is by default), MySQL will use MERGE if it can, otherwise it'll use TEMPTABLE.
One thing to note (which has caused me a lot of pain) is that MySQL will not use the MERGE algorithm if your view contains any of the following constructs:
Constructs that prevent merging are the same for derived tables and view references:
Aggregate functions (SUM(), MIN(), MAX(), COUNT(), and so forth)
DISTINCT
GROUP BY
HAVING
LIMIT
UNION or UNION ALL
Subqueries in the select list
Assignments to user variables
References only to literal values (in this case, there is no underlying table)
If the view contains any of these, TEMPTABLE will be used, which can cause performance issues without any clear reason why. In that case it's best to use a stored procedure or a subquery instead of a view.
Thanks, MySQL 😠
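A small sketch of how this shows up in practice, using a hypothetical orders(customer_id, amount) table. The exact plan depends on the MySQL version (newer releases can push some conditions into derived tables), so treat the comments as what to look for rather than guaranteed output.

```sql
-- Hypothetical table: orders(customer_id, amount).
-- GROUP BY and SUM() prevent the MERGE algorithm for this view.
CREATE VIEW customer_totals AS
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id;

-- Look for a <derivedN> row in the output: it means the view was
-- materialized into a temporary table before the outer WHERE was applied.
EXPLAIN SELECT * FROM customer_totals WHERE customer_id = 42;

-- Equivalent direct query: the filter applies to orders itself,
-- so an index on orders(customer_id) can be used.
EXPLAIN
SELECT customer_id, SUM(amount) AS total
FROM orders
WHERE customer_id = 42
GROUP BY customer_id;
```

Running both EXPLAIN statements side by side makes the cost of the temporary table visible before deciding whether to replace the view.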
Which algorithm? It depends on the particular query and schema. Usually the Optimizer picks the better approach, and you should not specify.
But... sometimes the Optimizer picks a really bad approach. At that point, the only real solution is not to use Views. That is, some Views cannot be optimized as well as the equivalent SELECT.
If you want to discuss a particular case, please provide the SHOW CREATE VIEW and SHOW CREATE TABLEs, plus a SELECT calling the view. And construct the equivalent SELECT. Also include EXPLAIN for both SELECTs.

MySQL Best Practice for adding columns

So I started working for a company where they had 3 to 5 different tables that were often queried in either a complex join or through a double, triple query (I'm probably the 4th person to start working here, it's very messy).
Anyhow, I created a table that when querying the other 3 or 5 tables at the same time inserts that data into my table along with whatever information normally got inserted there. It has drastically sped up the page speeds for many applications and I'm wondering if I made a mistake here.
In the future I'm hoping to stop inserting into those other tables, insert all that information only into the table I've started, and switch the applications to that one table. It's just a lot faster.
Could someone tell me why it's much faster to group all the information into one massive table and if there is any downside to doing it this way?
If the joins are slow, it may be because the tables did not have FOREIGN KEY relationships and indexes properly defined. If the tables had been properly normalized before, it is probably not a good idea to denormalize them into a single table unless they were not performant with proper indexing. FOREIGN KEY constraints require indexing on both the PK table and the related FK column, so simply defining those constraints if they don't already exist may go a long way toward improving performance.
The first course of action is to make sure the table relationships are defined correctly and the tables are indexed, before you begin denormalizing it.
There is a concept called materialized views, which serve as a sort of cache for views or queries whose result sets are deterministic, by storing the results of a view's query into a temporary table. MySQL does not support materialized views directly, but you can implement them by occasionally selecting all rows from a multi-table query and storing the output into a table. When the data in that table is stale, you overwrite it with a new rowset. For simple SELECT queries which are used to display data that doesn't change often, you may be able to speed up your pageloads using this method. It is not advisable to use it for data which is constantly changing though.
A good use for materialized views might be constructing rows to populate your site's dropdown lists or to store the result of complicated reports which are only run once a week. A bad use for them would be to store customer order information, which requires timely access.
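One way to sketch the "overwrite it with a new rowset" step is to build the fresh rows in a second table and swap the two with a single RENAME TABLE, so readers never see a half-filled table. All table and column names below are hypothetical.

```sql
-- Hypothetical source tables: customers(id, name), orders(id, customer_id, total).
-- One-time setup of the table that plays the role of the materialized view.
CREATE TABLE report_customer_orders (
    customer_id   INT            NOT NULL PRIMARY KEY,
    customer_name VARCHAR(255)   NOT NULL,
    order_count   INT            NOT NULL,
    order_total   DECIMAL(12, 2) NOT NULL
);

-- Periodic refresh: fill an empty copy with the current result set.
CREATE TABLE report_customer_orders_new LIKE report_customer_orders;

INSERT INTO report_customer_orders_new
SELECT c.id, c.name, COUNT(o.id), COALESCE(SUM(o.total), 0)
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.id, c.name;

-- Swap old and new in one statement, then discard the stale copy.
RENAME TABLE report_customer_orders     TO report_customer_orders_old,
             report_customer_orders_new TO report_customer_orders;
DROP TABLE report_customer_orders_old;
```

The refresh part would typically be run from a scheduled job; how often depends on how stale the data is allowed to be.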
Without seeing the table structures, etc., it would be guesswork. But it sounds like the database was possibly over-normalized.
It is hard to say exactly what the issue is without seeing it. But you might want to look at adding indexes, and foreign keys to the tables.
If you are adding a table with all of the data in it, you might be denormalizing the database.
There are some cases where de-normalizing your tables has its advantages, but I would be more interested in finding out if the problem really lies with the table schema or with how the queries are being written. You need to know if the queries utilize indexes (or whether indexes need to be added to the table), whether the original query writer did things like using subselects when they could have been using joins to make a query more efficient, etc.
I would not just denormalize because it makes things faster unless there is a good reason for it.
Having a separate copy of the data in your newly defined table is a valid performance-enhancing practice, but on the other hand it might become a total mess when it comes to keeping the data in your table and the other ones in sync. You essentially have two sources of truth, without a good idea of how to invalidate this "cache" when it comes to updates/deletes.
Read more about "normalization" and about "EXPLAIN" in MySQL. It will tell you why the other queries are slow, and you might get away with a few proper indexes and foreign keys instead of copying the data.

Is it more efficient to query from a view in database than from table?

Suppose I have a table A, creating a view V from that table.
Then I run several queries against V. I wonder whether V will be reconstructed each time I query, or whether it will be constructed only once and saved somewhere in memory by the DBMS for subsequent queries (which I think would be similar to querying a table)?
In general, no. V is a transient set of rows that is computed when requested by a query. Because you can apply additional WHERE and ORDER BY criteria when querying from a view, the execution plan for two queries against the same view could conceivably be quite different. The database generally cannot reuse the results of a previous query against a view to satisfy the next query against that view.
That said, there is a relatively new technology in some engines called Materialized Views. I have never used them myself, but my understanding is that these views are pre-computed based on updates that are made to the underlying tables. So with Materialized Views you do get improved SELECT performance, but at the expense of decreased INSERT, UPDATE, and DELETE performance.
You should also be aware that multi-column indexes can be used to precompute certain selections and sort orders involving individual tables. If you issue a query against a table that can be satisfied using a compound index (only the columns in the index are required by the query, and the sort order matches the index) then the table itself need never be read, only the index.
Views in MySQL are not a de facto caching solution.
MySQL runs the query against the base tables every time you query a view on those base tables. The results of the query are not stored for the view.
As a result, there is no need to "refresh" the view as there is when using materialized views in Oracle or Microsoft SQL Server. Even the SQL in a MySQL view definition is re-evaluated every time you query the view.
If you need something like materialized views in MySQL, one tool that might help is FlexViews. This stores the results of a query in an ordinary base table, and then monitors changes recorded in MySQL's binary log, applying relevant changes to the base table. This tool can be quite useful, but it has some caveats:
FlexViews is written in PHP, and as such it has some performance limitations. Depending on your write traffic load, FlexViews may not be able to keep up.
It doesn't support every possible type of SELECT query.
FlexViews-managed materialized view tables are not updateable. That is, you can UPDATE this view table, but the change will not apply to the base tables.
According to Pinal Dave, a view must be refreshed in order to reflect changes made to its referenced table(s). I'm not sure this makes a view of a simple 1-table query any more efficient than querying the table directly (it probably doesn't) but I think it means that views containing complex joins and subqueries may be more efficient than their non-view counterparts.
Pinal Dave has more to say about the other limitations of SQL views (or features, if you like). Maybe you can learn something useful there.
MySQL views do not support indexes (unlike Oracle, where a materialized view can be indexed). However, MySQL views can use the indexes of the underlying tables when created with the MERGE algorithm.
If you have to use views, then adjust your join buffer,
using something like this:
set global join_buffer_size=314572800;
Do profile the differences before and after changing the buffer size.
After increasing the join buffer, I have seen a view query execute in the same time (in ms) as a query against a table of the same size.

What is a good way to denormalize a mysql database?

I have a large database of normalized order data that is becoming very slow to query for reporting. Many of the queries that I use in reports join five or six tables and are having to examine tens or hundreds of thousands of lines.
There are lots of queries and most have been optimized as much as possible to reduce server load and increase speed. I think it's time to start keeping a copy of the data in a denormalized format.
Any ideas on an approach? Should I start with a couple of my worst queries and go from there?
I know more about mssql than mysql, but I don't think the number of joins or the number of rows you are talking about should cause you too many problems with the correct indexes in place. Have you analyzed the query plan to see if you are missing any?
http://dev.mysql.com/doc/refman/5.0/en/explain.html
That being said, once you are satisfied with your indexes and have exhausted all other avenues, de-normalization might be the right answer. If you just have one or two queries that are problems, a manual approach is probably appropriate, whereas some sort of data warehousing tool might be better for creating a platform to develop data cubes.
Here's a site I found that touches on the subject:
http://www.meansandends.com/mysql-data-warehouse/?link_body%2Fbody=%7Bincl%3AAggregation%7D
Here's a simple technique that you can use to keep denormalizing queries simple, if you're just doing a few at a time (and I'm not replacing your OLTP tables, just creating a new one for reporting purposes). Let's say you have this query in your application:
select a.name, b.address from tbla a
join tblb b on b.fk_a_id = a.id where a.id = 1;
You could create a denormalized table and populate with almost the same query:
create table tbl_ab (a_id, a_name, b_address);
-- (types elided)
Notice that the underscores match the table aliases you use:
insert into tbl_ab select a.id, a.name, b.address from tbla a
join tblb b on b.fk_a_id = a.id;
-- no where clause because you want everything
Then to fix your app to use the new denormalized table, switch the dots for underscores.
select a_name as name, b_address as address
from tbl_ab where a_id = 1;
For huge queries this can save a lot of time and makes it clear where the data came from, and you can re-use the queries you already have.
Remember, I'm only advocating this as the last resort. I bet there's a few indexes that would help you. And when you de-normalize, don't forget to account for the extra space on your disks, and figure out when you will run the query to populate the new tables. This should probably be at night, or whenever activity is low. And the data in that table, of course, will never exactly be up to date.
[Yet another edit] Don't forget that the new tables you create need to be indexed too! The good part is that you can index to your heart's content and not worry about update lock contention, since aside from your bulk insert the table will only see selects.
MySQL 5 does support views, which may be helpful in this scenario. It sounds like you've already done a lot of optimizing, but if not you can use MySQL's EXPLAIN syntax to see what indexes are actually being used and what is slowing down your queries.
As far as going about denormalizing data (whether you're using views or just duplicating data in a more efficient manner), I think starting with the slowest queries and working your way through is a good approach to take.
I know this is a bit tangential, but have you tried seeing if there are more indexes you can add?
I don't have a lot of DB background, but I am working with databases a lot recently, and I've been finding that a lot of the queries can be improved just by adding indexes.
We are using DB2, and there are commands called db2expln and db2advis: the first will indicate whether table scans or index scans are being used, and the second will recommend indexes you can add to improve performance. I'm sure MySQL has similar tools...
Anyways, if this is something you haven't considered yet, it has been helping a lot with me... but if you've already gone this route, then I guess it's not what you are looking for.
Another possibility is a "materialized view" (or as they call it in DB2), which lets you specify a table that is essentially built of parts from multiple tables. Thus, rather than normalizing the actual columns, you could provide this view to access the data... but I don't know if this has severe performance impacts on inserts/updates/deletes (but if it is "materialized", then it should help with selects since the values are physically stored separately).
In line with some of the other comments, I would definitely have a look at your indexing.
One thing I discovered earlier this year on our MySQL databases was the power of composite indexes. For example, if you are reporting on order numbers over date ranges, a composite index on the order number and order date columns could help. I believe MySQL generally uses only one index per table in a query, so if you just had separate indexes on the order number and order date, it would have to decide on just one of them to use. Using the EXPLAIN command can help determine this.
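A minimal sketch of the composite index described above, assuming a hypothetical orders table with order_number and order_date columns (names and values are illustrative):

```sql
-- Hypothetical table: orders(order_number, order_date, ...).
-- The equality column goes first and the range column last, so one index
-- can narrow the search on both conditions.
CREATE INDEX idx_orders_number_date ON orders (order_number, order_date);

-- The "key" column of the EXPLAIN output shows which index was chosen.
EXPLAIN
SELECT order_number, order_date
FROM orders
WHERE order_number = 1001
  AND order_date BETWEEN '2024-01-01' AND '2024-01-31';
```

With two separate single-column indexes instead, EXPLAIN would usually list both under possible_keys but pick only one under key.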
To give an indication of the performance with good indexes (including numerous composite indexes), I can run queries joining 3 tables in our database and get almost instant results in most cases. For more complex reporting most of the queries run in under 10 seconds. These 3 tables have 33 million, 110 million and 140 million rows respectively. Note that we had also already normalised these slightly to speed up our most common query on the database.
More information regarding your tables and the types of reporting queries may allow further suggestions.
For MySQL I like this talk: Real World Web: Performance & Scalability, MySQL Edition. This contains a lot of different pieces of advice for getting more speed out of MySQL.
You might also want to consider selecting into a temporary table and then performing queries on that temporary table. This would avoid the need to rejoin your tables for every single query you issue (assuming that you can use the temporary table for numerous queries, of course). This basically gives you denormalized data, but if you are only doing select calls, there's no concern about data consistency.
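The temporary-table idea might look roughly like this. The orders and customers tables and their columns are hypothetical, and the temporary table is only visible to the session that created it.

```sql
-- Hypothetical tables: orders(id, customer_id, order_date, total),
-- customers(id, name).
-- Do the join once and keep the result for the rest of the session.
CREATE TEMPORARY TABLE tmp_report AS
SELECT o.id AS order_id, o.order_date, o.total, c.name AS customer_name
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.order_date >= '2024-01-01';

-- Several reports can now reuse the pre-joined rows.
SELECT customer_name, SUM(total) AS revenue
FROM tmp_report
GROUP BY customer_name;

SELECT order_date, COUNT(*) AS orders_per_day
FROM tmp_report
GROUP BY order_date;

-- Dropped automatically when the session ends; explicit cleanup shown here.
DROP TEMPORARY TABLE tmp_report;
```

This only pays off when the same session issues several queries against the joined data, as noted above.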
Further to my previous answer, another approach we have taken in some situations is to store key reporting data in separate summary tables. There are certain reporting queries which are just going to be slow even after denormalising and optimisations and we found that creating a table and storing running totals or summary information throughout the month as it came in made the end of month reporting much quicker as well.
We found this approach easy to implement as it didn't break anything that was already working - it's just additional database inserts at certain points.
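A sketch of such a summary table, kept current with one extra statement whenever an order is recorded. The table, columns, and values are hypothetical, not the schema described above.

```sql
-- Hypothetical summary table: one row per month and product.
CREATE TABLE monthly_sales_summary (
    sales_month DATE           NOT NULL,
    product_id  INT            NOT NULL,
    order_count INT            NOT NULL DEFAULT 0,
    revenue     DECIMAL(14, 2) NOT NULL DEFAULT 0,
    PRIMARY KEY (sales_month, product_id)
);

-- Run alongside each order insert: creates the row the first time,
-- otherwise adds to the running totals.
INSERT INTO monthly_sales_summary (sales_month, product_id, order_count, revenue)
VALUES ('2024-01-01', 7, 1, 19.99)
ON DUPLICATE KEY UPDATE
    order_count = order_count + 1,
    revenue     = revenue + VALUES(revenue);
```

End-of-month reports can then read a handful of summary rows instead of scanning the order tables. Note that the VALUES() form inside ON DUPLICATE KEY UPDATE is deprecated in recent MySQL 8.0 releases in favour of a row alias, though it still works.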
I've been toying with composite indexes and have seen some real benefits... maybe I'll set up some tests to see if that can save me here, at least for a little longer.