How can I improve the performance of queries on a federated table created on top of remote views?
I have some questions before implementing the following scenario:
I have Database A (it contains multiple tables with lots of data, and is queried by multiple clients).
This database contains a users table on which I need to create some triggers, but the database is managed by a partner, and we don't have permission to create triggers.
Database B is managed by me. It is much lighter, its queries come from only one source, and I need access to the users table data from Database A so I can create triggers and take action on every update, insert, or delete against the users table in Database A.
My main concern is: how could this federated table impact performance on Database A? Database B is not the problem.
Both databases stay in the same geographic location, just different servers.
My goal is to make it possible to take action on every transaction against the users table in Database A.
Queries that read federated tables definitely have performance issues.
https://dev.mysql.com/doc/refman/8.0/en/federated-usagenotes.html says:
A FEDERATED table does not support indexes in the usual sense; because access to the table data is handled remotely, it is actually the remote table that makes use of indexes. This means that, for a query that cannot use any indexes and so requires a full table scan, the server fetches all rows from the remote table and filters them locally. This occurs regardless of any WHERE or LIMIT used with this SELECT statement; these clauses are applied locally to the returned rows.
Queries that fail to use indexes can thus cause poor performance and network overload. In addition, since returned rows must be stored in memory, such a query can also lead to the local server swapping, or even hanging.
(emphasis mine)
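To illustrate the behaviour the manual describes, here is a minimal sketch of a FEDERATED table on server B pointing at the users table on server A. The table definition, connection string, host, and credentials are all placeholders, not taken from the question:

```sql
-- Hypothetical FEDERATED table on server B mirroring Database A's users table.
CREATE TABLE users_fed (
    id    INT NOT NULL,
    email VARCHAR(255),
    PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://fed_user:fed_pass@server-a:3306/database_a/users';

-- If `email` has no usable index on the REMOTE table, this query fetches
-- every row from server A and filters them locally on server B: the WHERE
-- clause is applied to the returned rows, not pushed to the remote server.
SELECT * FROM users_fed WHERE email LIKE '%@example.com';
```

This is exactly the case the manual warns about: the load lands on Database A (a full fetch of the remote table) and on the network, regardless of how selective the local predicate is.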
The reason the federated engine was created was to support applications that need to write to tables at a rate greater than a single server can support. If you are inserting to a table and overwhelming the I/O of that server, you can use a federated table so you can write to a table on a different server.
Reading from federated tables is likely to be worse than reading local tables, and cannot be optimized with indexes.
If you need good performance, use replication or a CDC tool to maintain a real table on server B that you can query as a local table, not a federated table.
Another solution would be to cache the users table in the client application, so you don't have to read it on every query.
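As a sketch of the replication route: server B could subscribe to server A as a replica, giving you a real local copy of users to query and attach triggers to. The host, user, and password below are placeholders, and this assumes binary logging is enabled on server A (the statement shown is the MySQL 8.0.23+ syntax; older versions use CHANGE MASTER TO):

```sql
-- On server B: point replication at server A.
CHANGE REPLICATION SOURCE TO
    SOURCE_HOST = 'server-a',
    SOURCE_USER = 'repl_user',
    SOURCE_PASSWORD = 'repl_pass',
    SOURCE_AUTO_POSITION = 1;

START REPLICA;

-- Caveat for the trigger use case: whether triggers defined on the replica
-- fire for replicated changes depends on the binlog format (they fire for
-- statement-based events but not for row-based events), so verify this
-- against your servers' configuration before relying on it.
```

One design note: with replication, the read load of your triggers and queries stays entirely on server B; server A only pays the (small) cost of shipping its binary log.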
I am working on an application which will have approximately 1 million records. To fetch data I have two options:
Join 10-12 tables (indexed) at run time and get the result
Create views from these tables. Query the view at run time instead of joining tables. I think MySQL won't let me use indexes on views.
Currently, for testing, I have just 10-20 records and both options take similar time. But when the real data is loaded, which option will give better performance?
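For context on why the two options often perform the same: a MySQL view is a stored query, not a stored result set, so querying the view re-runs the underlying join. A minimal sketch (table and column names here are made up):

```sql
-- A view is a saved SELECT, not precomputed data.
CREATE VIEW order_summary AS
SELECT o.id, c.name, p.title
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN products  p ON p.id = o.product_id;

-- With the MERGE algorithm, MySQL folds this query into the view's SELECT,
-- so the WHERE clause can still use the base tables' indexes. Indexing the
-- underlying tables is what matters; you cannot index the view itself.
SELECT * FROM order_summary WHERE id = 42;
```

So at 1 million rows the deciding factor is the indexes on the 10-12 base tables, which both options use, not whether the join is written inline or wrapped in a view.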
I am currently trying to figure out why the site I am working on (Laravel 4.2 framework) is really slow at times, and I think it has to do with my database setup. I am not a pro at all, so I assume that is where the problem lies.
My sessions table has roughly 2.2 million records in it. When I run `SHOW PROCESSLIST;`, the queries that take the longest all relate to that table.
(Screenshots of the `SHOW PROCESSLIST` output and the sessions table structure are omitted.)
Surely I am doing something wrong, or the table is not indexed properly? I'm not sure; I'm not great with databases.
We don't see the complete SQL being executed, so we can't recommend appropriate indexes. But if the only predicate on the DELETE statements is on the last_activity column i.e.
DELETE FROM `sessions` WHERE last_activity <= 'somevalue' ;
Then performance of the DELETE statement will likely be improved by adding an index with last_activity as its leading column, e.g.
CREATE INDEX sessions_IX1 ON sessions (last_activity);
Also, if this table is using MyISAM storage engine, then DML statements cannot execute concurrently; DML statements will block while waiting to obtain exclusive lock on the table. The InnoDB storage engine uses row level locking, so some DML operations can be concurrent. (InnoDB doesn't eliminate lock contention, but locks will be on rows and index blocks, rather than on the entire table.)
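If the table does turn out to be MyISAM, converting it is a single statement. A sketch (note that the ALTER rebuilds the whole 2.2-million-row table and blocks writes while it runs, so schedule it off-peak):

```sql
-- Check which storage engine the table currently uses.
SHOW TABLE STATUS LIKE 'sessions';

-- Rebuild the table on InnoDB to get row-level locking instead of
-- table-level locking.
ALTER TABLE sessions ENGINE=InnoDB;
```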
Also consider using a different storage mechanism (other than MySQL database) for storing and retrieving info for web server "sessions".
Also, is it necessary (is there some requirement) to persist 2.2 million "sessions" rows? Are we sure that all of those rows are actually needed? If some of that data is historical, and isn't specifically needed to support the current web server sessions, we might consider moving the historical data to another table.
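If much of the data is indeed historical, pruning it in small batches avoids holding locks for the duration of one huge DELETE. A sketch, assuming last_activity stores a Unix timestamp (as Laravel's sessions table does); the 30-day cutoff and batch size are illustrative:

```sql
-- Delete old sessions in chunks so each statement holds locks only briefly.
-- Re-run (e.g. from a cron job or MySQL event) until it affects 0 rows.
DELETE FROM sessions
WHERE last_activity <= UNIX_TIMESTAMP(NOW() - INTERVAL 30 DAY)
LIMIT 10000;
```

This pairs well with the index on last_activity suggested above: without it, each batch still scans the table to find candidate rows.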
I have a many-to-many relationship database with 3 tables. It's very slow to load the data into the tables, especially the join table: several hours for 3 million rows.
It was suggested that I create the tables first without indexes. I am using Hibernate. If I don't annotate indexes in my classes, then what is the best time and way to add them? Should I do it directly on the MySQL database using SQL statements? Or should the indexes be added somewhere in Hibernate, without affecting loading performance?
You should add the indexes directly to the MySQL database using CREATE INDEX statements.
If you have a very big table, you can use pt-online-schema-change (from Percona Toolkit) to avoid blocking your application.
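A sketch of the load-then-index approach, run directly against MySQL after the bulk insert finishes (the table and column names are hypothetical, chosen to match a typical many-to-many join table):

```sql
-- Build the indexes in one pass AFTER the bulk load. Maintaining these
-- indexes row-by-row during a 3-million-row insert is a large part of
-- what makes the loading phase slow.
CREATE INDEX idx_book_author_book   ON book_author (book_id);
CREATE INDEX idx_book_author_author ON book_author (author_id);
```

Hibernate will happily use indexes that exist only in the database; they do not need to appear in the entity annotations.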
I've found MySQL views to be one of the most common performance pitfalls, mostly for reasons mentioned in this answer.
The biggest drawback is in how MySQL processes a view, whether it's stored or inline. MySQL will always run the view query and materialize the results from that query as a temporary MyISAM table.
One big drawback of a view is that predicates from the outer query NEVER get pushed down into the view.
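The quoted behaviour is easiest to reproduce with views that force materialization, such as any view containing GROUP BY (the names below are illustrative; note that recent MySQL versions can push some outer conditions into materialized derived tables, so the "never" in the quote reflects older versions):

```sql
-- This view requires the TEMPTABLE algorithm: MySQL materializes the
-- full aggregate into a temporary table before the outer WHERE runs.
CREATE VIEW user_order_counts AS
SELECT user_id, COUNT(*) AS order_count
FROM orders
GROUP BY user_id;

-- The user_id filter is applied to the temporary result rather than
-- being pushed into the scan of `orders`, so the whole table is
-- aggregated even though only one user's count is needed.
SELECT * FROM user_order_counts WHERE user_id = 123;
```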
What are the performance differences (if any) for views in Aurora compared to MySQL?
Do Aurora views necessarily materialize without considering predicates on the outer query?