Could a federated table impact database performance? - mysql

I have some questions before implementing the following scenario:
I have Database A (it contains multiple tables with lots of data and is queried by multiple clients).
This database contains a users table on which I need to create some triggers, but the database is managed by a partner, and we don't have permission to create triggers.
Database B is managed by me. It is much lighter, its queries come from a single source, and I need access to the users table data from Database A so that I can create triggers and take action on every update, create, or delete in the users table of Database A.
My main concern is: how can this federated table impact performance in Database A? Database B is not the problem.
Both databases are in the same geographic location, just on different servers.
My goal is to be able to take action on every transaction in the users table of Database A.

Queries that read federated tables definitely have performance issues.
https://dev.mysql.com/doc/refman/8.0/en/federated-usagenotes.html says:
A FEDERATED table does not support indexes in the usual sense; because access to the table data is handled remotely, it is actually the remote table that makes use of indexes. This means that, for a query that cannot use any indexes and so requires a full table scan, the server fetches all rows from the remote table and filters them locally. This occurs regardless of any WHERE or LIMIT used with this SELECT statement; these clauses are applied locally to the returned rows.
Queries that fail to use indexes can thus cause poor performance and network overload. In addition, since returned rows must be stored in memory, such a query can also lead to the local server swapping, or even hanging.
(emphasis mine)
The reason the federated engine was created was to support applications that need to write to tables at a rate greater than a single server can support. If you are inserting into a table and overwhelming the I/O of that server, you can use a federated table so you can write to a table on a different server.
Reading from federated tables is likely to be slower than reading local tables, and cannot be optimized with local indexes; only indexes on the remote table can help.
If you need good performance, you should use replication or a CDC tool, to maintain a real table on server B that you can query as a local table, not a federated table.
Another solution would be to cache the users table in the client application, so you don't have to read it on every query.
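As a rough sketch of the replication option: assuming the partner is willing to enable binary logging on server A and grant a replication account, server B could replicate only the users table. The host name, account, password, and database name db_a below are hypothetical, and the sketch assumes MySQL 8.0.23 or later with GTIDs enabled on both servers and a distinct server_id on each.

```sql
-- Run on server B (the replica). All names and credentials are placeholders.
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST = 'server-a.example.com',
  SOURCE_PORT = 3306,
  SOURCE_USER = 'repl_user',
  SOURCE_PASSWORD = 'change-me',
  SOURCE_AUTO_POSITION = 1;   -- assumes GTID-based replication

-- Apply only changes to db_a.users; skip every other table.
CHANGE REPLICATION FILTER REPLICATE_DO_TABLE = (db_a.users);

START REPLICA;

-- Check that the I/O and SQL threads are both running.
SHOW REPLICA STATUS;
```

One caveat for the trigger use case: with row-based binary logging, replicated row changes do not fire triggers on the replica, so reacting to each change may call for a CDC tool that reads the binary log instead.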

Related

Comparison between MySQL Federated, Trigger, and Event Schedule?

I have a very specific problem that requires multiple MySQL DB instances, and I need to "sync" all data from each DB/table into one DB/table.
Basically, [tableA.db1, tableB.db2, tableC.db3] into [TableAll.db4].
Some of the DB instances are on the same machine, and some are on a separate machine.
About 80,000 rows are added to each table per day, and there are 3 tables (one per DB).
So, about 240,000 rows would be "synced" to a single table per day.
I've just been using the Event Scheduler to copy the data from each DB into the "All-For-One" DB every hour.
However, I've been wondering lately if that's the best solution.
I considered using triggers, but I've been told they put a heavy burden on the DB.
Using a statement-level trigger may be better, but it depends too much on how the statement is formed.
Then I heard about FEDERATED (in Oracle terms, "DBLink"),
and I thought I could use it to link each table and create a VIEW over those tables.
But I don't know much about databases, so I don't really know the implications of each method.
So, my question is:
Considering the "All-For-One" DB only needs to be Read-Only,
which method would be better, performance and resource wise, in order to copy data from multiple databases into one database regularly?
Thanks!

MySQL FEDERATED tables

"A FEDERATED table does not support indexes in the usual sense; because access to the table data is handled remotely, it is actually the remote table that makes use of indexes. This means that, for a query that cannot use any indexes and so requires a full table scan, the server fetches all rows from the remote table and filters them locally. This occurs regardless of any WHERE or LIMIT used with this SELECT statement; these clauses are applied locally to the returned rows.
Queries that fail to use indexes can thus cause poor performance and network overload. In addition, since returned rows must be stored in memory, such a query can also lead to the local server swapping, or even hanging."
16.8.3 FEDERATED Storage Engine Notes and Tips
Can anybody explain with examples what this means?
What is a "query that cannot use any indexes"?
Does this mean that I always get the full data from the remote server, or not?
The documentation means to say that if you run a query against a federated table, it generates another query that it runs against the remote base table. If the query that runs on the remote server cannot make use of an index, this forces a table-scan on the remote server, and therefore all the rows of that table are copied across the network.
You might think that the query should filter rows on the remote server before sending them back, but it seems it does not do that. It can filter rows on the remote server only if the filtering can be done on the remote side using an index.
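To make that concrete, here is a small sketch. The table name, columns, and connection string are made up for illustration; the only assumption that matters is that the remote table has an index on id and none on last_login. The FEDERATED engine must also be enabled on the local server.

```sql
-- Local FEDERATED table pointing at a remote table (placeholder credentials).
CREATE TABLE users_fed (
  id         INT NOT NULL,
  username   VARCHAR(64) NOT NULL,
  last_login DATETIME,
  PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://fed_user:change-me@remote-host:3306/db_a/users';

-- The lookup column is indexed, so the remote server can use its index
-- and only the matching row has to be sent back.
SELECT * FROM users_fed WHERE id = 42;

-- No index covers last_login, so per the documentation every remote row is
-- fetched and the WHERE and LIMIT are applied locally.
SELECT * FROM users_fed WHERE last_login < '2020-01-01' LIMIT 10;
```

The second statement is the "query that cannot use any indexes" case: it returns at most 10 rows, yet the whole remote table may cross the network first.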
There are very few cases where MySQL's federated storage engine is a good idea to use. I avoid it.

Updating MySQL InnoDB Index Statistics

We have a large MySQL 5.5 database in which many rows are inserted daily and never deleted or updated. There are also users querying the live database. Tables are MyISAM.
But it is effectively impossible to run ANALYZE TABLE because it takes far too long (15 hours, and it sometimes crashes the tables), so the query optimizer often picks the wrong index.
We want to try switching to all InnoDB. Will we need to run ANALYZE TABLE or not?
The MySQL docs say:
The cardinality (the number of different key values) in every index of a table
is calculated when a table is opened, at SHOW TABLE STATUS and ANALYZE TABLE and
on other circumstances (like when the table has changed too much).
But that begs the question: when is a table opened? If that means accessed during a connection then we need do nothing special. But I do not think that that is the case for InnoDB.
So what is the best approach? Run ANALYZE TABLE periodically? Perhaps with an increased dive count?
Or will it all happen automatically?
The users who run queries use apps to get the data, so each run is a separate connection. They generally do NOT expect the rows to be up-to-date within just minutes.
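For reference, a periodic refresh of the kind asked about might look like the sketch below. The table name mydb.events is hypothetical, and the sample-page value is an arbitrary illustration rather than a recommendation. In MySQL 5.5, innodb_stats_sample_pages controls how many index pages InnoDB samples when estimating cardinality, which appears to be the "dive count" referred to above.

```sql
-- Sample more index pages per estimate (the 5.5 default is 8).
SET GLOBAL innodb_stats_sample_pages = 64;

-- Recompute index statistics for one table (hypothetical name).
ANALYZE TABLE mydb.events;

-- Inspect the resulting cardinality estimates.
SHOW INDEX FROM mydb.events;
```

Because InnoDB estimates cardinality from a sample of pages rather than a full scan, this should be far cheaper than the MyISAM run described above, at the cost of less exact numbers.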

Drawbacks of using manually created temporary tables in MySQL

I have many queries that use manually created temporary tables in MySQL.
I want to understand if there are any drawbacks associated with this.
I ask this because I use temporary tables for queries that fetch data shown on the home screen of a web application in the form of multiple widgets. In an organization with a significant number of users, this involves creating and dropping temporary tables numerous times. How does this affect the MySQL Database Server?
Speaking about databases in general, execution plans can't be optimal when you frequently add, use, and remove tables. Because it takes time to generate an execution plan, the DB is unable to create one when you use the described approach.

MySQL Federated storage engine vs replication (performance)

Long story short - I am dealing with a largish database where basic user details (userid (index), username, password, parent user, status) are stored in one database and extended user details (same userid (index), full name, address etc. etc.) are stored in another database on another server.
I need to do a query where I select all users owned by a particular user (via the parent user field from basic user details database), sorted by their full name (from the extended user details field) and just 25 at a time (there are thousands, maybe tens of thousands for any one user).
As far as I can work out there are three possible solutions:
1. No JOIN - get all the user IDs in one query and run the second query based on those IDs. This would be fine, except the number of user IDs could get so high that it would exceed the maximum query length, or be horribly inefficient.
2. Replicate the database table with the basic user details onto the server with the extended details so I can do a JOIN.
3. Use a federated storage engine table to achieve the same results as #2.
It seems that 3 is the best option, but I have been able to find little information about performance and I also found one comment to be careful using this on production databases.
I would appreciate any suggestions on what would be the best implementation.
Thanks!
FEDERATED tables are a nice feature, but they do not support indexes in the usual sense, which would slow down your application dramatically.
If (!) you only read from the users database on the remote server, replication would be more effective and also faster.
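As an illustration of why a replicated copy helps here: once the basic user details exist as a real local table next to the extended details, the paged query becomes an ordinary JOIN that can use local indexes. The table and column names below are assumptions made up for the sketch, not taken from the question's schema.

```sql
-- basic_users is the replicated copy; extended_users is the local table.
SELECT b.userid, b.username, e.full_name
FROM basic_users AS b
JOIN extended_users AS e ON e.userid = b.userid
WHERE b.parent_user = 1001   -- hypothetical parent user ID
ORDER BY e.full_name
LIMIT 25 OFFSET 0;           -- first page of 25 rows
```

An index on basic_users(parent_user) would let the server find the child users without scanning the whole table.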
In terms of performance and limitations, the Federated engine has a lot of limitations. It doesn't support transactions, performance on a FEDERATED table when performing bulk inserts is slower than with other table types, and so on.
The Replication and Federated engines are not meant to do the same things. First of all, did you try both?