I have to handle database that is read (and only read) by third party software I cannot verify. It has a table which stores partial copy of other table.
Would it be safe to replace this table with view?
Besides the view restrictions, when using views in MySQL you have to be aware of one very important issue: The performance of WHERE statements suffers greatly:
For example:
SELECT column_a FROM table_n WHERE column_a="some value";
is quick assuming an index is in place on column_a.
Now create a view:
CREATE VIEW column_a_view AS SELECT column_a FROM table_n;
SELECT * FROM column_a_view b WHERE b.column_a="some value";
can lead to a full table scan since MySQL (at least until 5.6) did not always recognize the fact that it can use the index.
So especially if you are working with large tables, it can be more beneficial to create "copies" of the related data and replace that data once per time interval then working with views.
Related
I'm experiencing huge performance problem in one legacy application.
There is a search form where user can search records with given value.
A result row contains 10 columns. Then a SP returns any row which contains in any column that value.
This SP uses 8 Tables and some of them have about million records. Every minute I get a new record. This SP conducts paging as well.
Execution of this SP takes sometimes around 40 seconds.
What I did was, I created a new table and put there all records by using a query from this SP, but without conditions.
When there is a new update or update in one of source table I use a trigger and update this new "cache" table.
Now waiting for results from this new table takes only 1-3 seconds.
Has someone experience with something like this?
One of my colleagues said I better use view, but then every time I will be making JOINS.
What do you think? Is there another way?
Often times temporary tables can help you resolve performance issues. One approach might be to collect only the records that you need to consider into temporary tables and then create your final select statement from the temporary tables joined to any other tables that you're not filtering.
As an example, let's say one of the fields you are searching for is field1 in table1. Start by inserting into table #table1 only records that have the value of field1 you are looking for:
select PrimaryKeyTable1, Field1, Field2, Field3, etc...
into #table1
from table1
where Field1 = 'Whatever you are looking for'
This should be pretty fast even for a big tables, especially if you have an index on Field1. You do this for every table with search fields to collect all the records that have relevant records you are searching.
Then you also need to be sure to insert any records into your temporary tables that might have foreign key references to any of your other temporary tables. So let's say you also built a table #table2 with the above method that has a foreign key to table1 called PrimaryKeyTable1. You would insert those records like:
Insert into #table1
(PrimaryKeyTable1, Field1, Field2, Field3, etc...)
select table1.PrimaryKeyTable1, table1.Field1, table1.Field2, table1.Field3, etc...
from table1
join #table2
on table1.PrimaryKeyTable1 = table2.PrimaryKeyTable1
where table1.PrimaryKeyTable1 not in
(Select PrimaryKeyTable1 from #table1)
Now you will also have any records in #table1 that match to a record in #table2 that contain records that match the search criteria. You do this for all your temporary tables that have relevant foreign keys. The order that you do the inserts matters; be sure that you don't reference any temporary tables until after the last insert statement while collecting the foreign key referenced records.
Then you can simply do your final select statement, replacing the actual tables with the temporary tables you have built and eliminating all the filters that search your field data. Depending on the structure of your query there might be other optimizations, but that is the general idea.
If you've already explored all of your indexing options and this still doesn't help, MS SQL Server has "Change Tracking" features that maybe be of use to you in building your cache table. You enable the database for change tracking and configure which tables you wish to track. SQL Server then creates change records on every update, insert, delete on a table and then lets you query for changes to records that have been made since the last time you checked. This is very useful for syncing changes and is more efficient than using triggers. It's also easier to manage than making your own tracking tables. This has been a feature since SQL Server 2005.
How to: Use SQL Server Change Tracking
Change tracking only captures the primary keys of the tables and let's you query which fields might have been modified. Then you can query the tables join on those keys to get the current data. If you want it to capture the data also you can use Change Capture, but it requires more overhead and at least SQL Server 2008 enterprise edition.
Change Data Capture
Your solution is a robust way of doing what is called in Microsoft SQL Server "an indexed view" or "materialized view" in Oracle.
Basically you are correct - it's faster to navigate single indexed table then a dozen ones which are updated constantly.
You should really try creating an indexed view (some start here https://technet.microsoft.com/en-us/library/dd171921(v=sql.100).aspx) and it will probably solve all your performance issues.
You can use schema binding View and create cluster index on view.it will store your view data physically.but after creating schema binding view you can not alter your table.
I want to replicate certain table from one database into another database in the same server. This tables contain exactly the same fields.
I was considering to use MySQL Replication to replicate that table but some people said that it will increase IO so i find another way to create 3 Trigger (Insert, update and Delete) that will perform exactly the same thing like what i expect.
My Question is, which way is better? Is it using MySQL replication is better even though it's in the same server or using Trigger to replicate the data is better.
Thanks.
I don't know what is your goal, but I got mine getting use of the VIEW functionality.
I had two different applications with separate databases but in the same Mysql server. Application2 needed to get a few data from Application1. In general, this is a trivial situation that you can handle with USE DB1; or USE DB2; as your needing, but my programming framework does not work very well with multiple DBs.
So, lets see my solution...
Here is my select query to retrieve this data:
SELECT id, name FROM DB1.customers;
So, using DB2 as default schema, I've created a VIEW:
USE DB2;
CREATE VIEW app1_customers AS SELECT id, name FROM DB1.customers;
Now I can retrieve this data in DB2 as a regular table with a regular SELECT statement.
SELECT * FROM DB2.app1_customers;
Hope ts useful. BR
Assuming you have two databases on the same server i.e DB1 and DB2 and the table is called tbl1 and it is sitting in DB1 you can query the table like this:
USE DB1;
SELECT * FROM tbl1;
USE DB2;
SELECT * FROM DB1.tbl1;
This way you wont need to copy the data and worry about extra space and extra code. You can query a table in another database on the same server. Replication and triggers are not your answer here. You could also create a view to encapsulate the SQL statement.
Definitely triggers is the way to go. Having another server (slave) will need to spare several MB for installation, logs, cpu and memory usage.
I'd use triggers to keep both tables equal. If you want to create a table with the same columns definition and data use:
USE db2;
CREATE TABLE t1 AS SELECT * FROM db1.t1;
After that, go ahead and create the triggers for Update, Insert and Delete statemetns.
Also you could ALTER the new table to a different engine like MEMORY or add indexes to see if you can improve something.
I'm extremely new to Views so please forgive me if this is a silly question, but I have a View that is really helpful in optimizing a pretty unwieldy query, and allows me to select against a small subset of columns in the View, however, I was hoping that the View would actually be stored somewhere so that selecting against it wouldn't take very long.
I may be mistaken, but I get the sense (from the speed with which create view executes and from the duration of my queries against my View) that the View is actually run as a query prior to the external query, every time I select against it.
I'm really hoping that I'm overlooking some mechanism whereby when I run CREATE VIEW it can do the hard work of querying the View query *then, so that my subsequent select against this static View would be really swift.
BTW, I totally understand that obviously this VIEW would be a snapshot of the data that existed at the time the VIEW was created and wouldn't reflect any new info that was inserted/updated subsequent to the VIEW's creation. That's actually EXACTLY what I need.
TIA
What you want to do is materialize your view. Have a look at http://www.fromdual.com/mysql-materialized-views.
What you're talking about are materialised views, a feature of (at least) DB2 but not MySQL as far as I know.
There are ways to emulate them by creating/populating a table periodically, or on demand, but a true materialised view knows when the underlying data has changed, and only recalculates if required.
If the data will never change once the view is created (as you seem to indicate in a comment), just create a brand new table to hold the subset of data and query that. People always complain about slow speed but rarely about data storage requirements :-)
You can do this with:
A MySQL Event
A separate table (for caching)
The REPLACE INTO ... SELECT statement.
Here's a working example.
-- create dummy data for testing
CREATE TABLE MyTable (
id INT NOT NULL,
groupvar INT NOT NULL,
myvar INT
);
INSERT INTO MyTable VALUES
(1,1,1),
(2,1,1),
(3,2,1);
-- create the view, making sure rows have a unique identifier (groupvar)
CREATE VIEW MyView AS
SELECT groupvar, SUM(myvar) as myvar_sum
FROM MyTable
GROUP BY groupvar;
-- create cache table, setting primary key to unique identifier (groupvar)
CREATE TABLE MyView_Cache (PRIMARY KEY (groupvar))
SELECT *
FROM MyView;
-- create a table to keep track of when the cache has been updated (optional)
CREATE TABLE MyView_Cache_updated (update_id INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (update_id));
-- create event to update cache table (e.g., daily)
DELIMITER |
CREATE EVENT MyView_Cache_Event
ON SCHEDULE EVERY 1 DAY STARTS CURRENT_TIMESTAMP + INTERVAL 1 HOUR
DO
BEGIN
REPLACE INTO MyView_Cache
SELECT *
FROM MyView_Cache;
INSERT INTO MyView_Cache_updated
SELECT NULL, NOW() AS last_updated;
END |
DELIMITER ;
You can now query MyView_Cache for faster response times, and query MyView_Cache_updated to inform users of the last time the cache was updated (in this example, daily).
Since a view is basically a SELECT statement you can use query cache to improve performance.
But first you should check if :
you can add indexes in the tables involved to speed up the query (use EXPLAIN)
the data isn't changing very often you can materialize the view (make snapshots)
Use a materiallised view.. It can store data like count sum etc but yes after updating the table you need to refresh the view to get correct results as they are not auto updated.. Moreover after querying from view the results are stored in cache so the memory cycles reduces to 2 which are 4 in case of querying from the table itself. So it gets efficient from the second time.. When you query for 1st time from view the data is fetched from main memory and is stored in cache after it.
Is there a manner to add something like a where clause as a 'global' parameter for a mysql session.
For example an company has multiple user and you want to query for the user in this company, normally you would use a statement like:
SELECT * FROM users WHERE users.companyId = 2;
The issue is that adding the WHERE clauses would mean a huge impact on the code. Though, we defined the relations and thus I image (though, I don't think it exists), that you could create a session with the "global" constrained that all queries in that session should comply to.
You can create a view
CREATE VIEW view2 AS SELECT * FROM table1 WHERE companyid = 2;
If slowness is your curse, there are a few things you can do:
put an index on the where field(s) in this case companyid.
if you need more speed you can partition the table by companyid.
make the table a memory table, and use hash indexes for = and IN fields.
use InnoDB, instead of MyISAM. InnoDB has faster indexes.
Do not use select *, explicitly select only the fields you need.
See: http://dev.mysql.com/doc/refman/5.0/en/create-view.html
http://dev.mysql.com/doc/refman/5.5/en/index-btree-hash.html
http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
The answer is NO.
As said before, you should put an index on that coloumn. And you can create a view.
Also you could use a temporary table.
From mysql docs:
You can use the TEMPORARY keyword when creating a table. A TEMPORARY
table is visible only to the current connection, and is dropped
automatically when the connection is closed. This means that two
different connections can use the same temporary table name without
conflicting with each other or with an existing non-TEMPORARY table of
the same name. (The existing table is hidden until the temporary table
is dropped.)
As a final thought, I can say that if you're performing the same query over and over again, you should rethink your model diagram, maybe do some denormalization.
What will happen when you perform select from table that has no companyId :) You can create views however, and select from them instead
What is the purpose of a temporary table like in the following statement? How is it different than a regular table?
CREATE TEMPORARY TABLE tmptable
SELECT A.* FROM batchinfo_2009 AS A, calibration_2009 AS B
WHERE A.reporttime LIKE '%2010%'
AND A.rowid = B.rowid;
Temp tables are kept only for the duration of your session with the sever. Once the connection's severed for any reason, the table's automatically dropped. They're also only visible to the current user, so multiple users can use the same temporary table name without conflict.
Temporary table ceases to exist when connection is closed. So, its purpose is for instance to hold temporary result set that has to be worked on, before it will be used.
Temporary tables are mostly used to store query results that need further processing, for instance if the result needs to be queried or refined again or is going to be used at different occasions by your application. Usually the data stored in a temporary database contains information from several regular tables (like in your example).
Temporary tables are deleted automatically when the current database session is terminated.
Support for temporary tables exists to allow procedural paradigms in a set-based 4GL, either because the coder has not switched their 3GL mindset to the new paradigm or to work around a performance or syntax issue (perceived or otherwise).