MySQL FEDERATED tables

"A FEDERATED table does not support indexes in the usual sense; because access to the table data is handled remotely, it is actually the remote table that makes use of indexes. This means that, for a query that cannot use any indexes and so requires a full table scan, the server fetches all rows from the remote table and filters them locally. This occurs regardless of any WHERE or LIMIT used with this SELECT statement; these clauses are applied locally to the returned rows.
Queries that fail to use indexes can thus cause poor performance and network overload. In addition, since returned rows must be stored in memory, such a query can also lead to the local server swapping, or even hanging."
16.8.3 FEDERATED Storage Engine Notes and Tips
Can anybody explain to me, with examples, what this means?
What is a "query that cannot use any indexes"?
Does this mean that I get the full data from the remote server in every case, or not?

The documentation means to say that if you run a query against a federated table, it generates another query that it runs against the remote base table. If the query that runs on the remote server cannot make use of an index, this forces a table-scan on the remote server, and therefore all the rows of that table are copied across the network.
You might think that the query should filter rows on the remote server before sending them back, but it seems it does not do that. It can filter rows on the remote server only if the filtering can be done on the remote side using an index.
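As a sketch, suppose the remote base table has an index on id but none on email (the table and both columns here are hypothetical):
-- The remote server can use its index on id, so only matching rows cross the network:
SELECT * FROM federated_users WHERE id = 42;
-- No usable index on email: the remote server does a full table scan, every row is
-- sent over the network, and the WHERE filter is applied locally to the returned rows:
SELECT * FROM federated_users WHERE email = 'user@example.com';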
There are very few cases where MySQL's federated storage engine is a good idea to use. I avoid it.

Could a federated table impact database performance?

I have some questions before implementing the following scenario:
I have Database A (it contains multiple tables with lots of data, and it is queried by multiple clients).
This database contains a users table on which I need to create some triggers, but the database is managed by a partner and we don't have permission to create triggers.
Database B is managed by me and is much lighter; its queries come from only one source. I need access to the users table data from Database A so I can create triggers and take action on every update, insert, or delete in the users table of Database A.
My main concern is: how could this federated table impact performance in Database A? Database B is not the problem.
Both databases are in the same geographic location, just on different servers.
My goal is to make it possible to take action on every transaction in Database A's users table.
Queries that read federated tables definitely have performance issues.
https://dev.mysql.com/doc/refman/8.0/en/federated-usagenotes.html says:
A FEDERATED table does not support indexes in the usual sense; because access to the table data is handled remotely, it is actually the remote table that makes use of indexes. This means that, for a query that cannot use any indexes and so requires a full table scan, the server fetches all rows from the remote table and filters them locally. This occurs regardless of any WHERE or LIMIT used with this SELECT statement; these clauses are applied locally to the returned rows.
Queries that fail to use indexes can thus cause poor performance and network overload. In addition, since returned rows must be stored in memory, such a query can also lead to the local server swapping, or even hanging.
(emphasis mine)
The reason the federated engine was created was to support applications that need to write to tables at a rate greater than a single server can support. If you are inserting to a table and overwhelming the I/O of that server, you can use a federated table so you can write to a table on a different server.
Reading from federated tables is likely to be worse than reading local tables, and cannot be optimized with indexes.
If you need good performance, you should use replication or a CDC tool to maintain a real table on server B that you can query as a local table, not a federated table.
Another solution would be to cache the users table in the client application, so you don't have to read it on every query.
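For reference, a minimal sketch of how such a federated table on Database B might be declared (the table name, columns, and connection string here are all hypothetical):
CREATE TABLE users_federated (
  id INT NOT NULL,
  name VARCHAR(100),
  PRIMARY KEY (id)
) ENGINE=FEDERATED
CONNECTION='mysql://fed_user:fed_pass@server-a:3306/database_a/users';
Note also that a trigger created on this table in Database B fires only for statements executed through Database B; it does not fire when clients modify the users table directly on Database A, which is another reason to prefer the replication/CDC approach above.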

MySQL: one query executed on two identical servers uses different indexes

I have two identical databases on separate server machines, and if I execute one query on both machines, on one server it runs smoothly while on the other it ends up in the slow query log. EXPLAIN shows me that they are not using the same indexes. Any suggestions or advice would be helpful.
The index statistics that MySQL keeps sometimes become inaccurate (I don't know why or when).
Running ANALYZE TABLE <table> on both servers should correct the statistics.
If the problem appears again, you can use index hints and/or IF logic to force MySQL to use the correct index.
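A sketch of both suggestions (the table and index names are hypothetical):
-- Refresh the index statistics the optimizer relies on:
ANALYZE TABLE orders;
-- If the optimizer still picks the wrong index, pin it explicitly with an index hint:
SELECT * FROM orders FORCE INDEX (idx_customer) WHERE customer_id = 42;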

Setting up MySQL 5.6 with Memcache fails without error

I am trying to set up MySQL 5.6 with the memcached plugin enabled. I followed the procedure on the MySQL website and a couple of other tutorials that I found online. Going by those tutorials, this should be really simple to set up and test.
I am trying to verify that the setup works as expected using telnet. When I set the value of a key from telnet, I get the return status of STORED. I can even fetch the value immediately from memcache. However, when I log into the DB, I do not see the new row. I don't see any errors in the logs either. "SHOW PLUGINS" shows that the daemon_memcached plugin is enabled.
[Edited]
Actually, things don't even work the other way. I added a new row into the demo_test table and tried fetching it through the memcache interface. That didn't work either.
Any pointers about how to go about identifying what's wrong?
The memcache integration in MySQL communicates directly with the InnoDB storage engine, not the higher MySQL "server layer." As a result, changes made to table data through this interface do not invalidate query results against that table that are already stored in the query cache. This is in contrast to normal operations through the SQL interface, where any change to a table's data immediately evicts all cached results for queries against that table, without regard to whether the change actually invalidated each specific cached query.
Repeat your query, but instead of SELECT, use SELECT SQL_NO_CACHE. If you get the result you expect, this is the explanation.
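For example (assuming the stock demo_test table created by the MySQL memcached setup scripts, where c1 is the key column and c2 the value):
SELECT SQL_NO_CACHE c1, c2 FROM test.demo_test WHERE c1 = 'mykey';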
Once you have established that this is the cause, you will find that any SQL INSERT, DELETE, or UPDATE against the table also has the effect of making memcache-changed data visible to SELECT queries, without the need for the SQL_NO_CACHE directive. This holds true even when the INSERT, DELETE, or UPDATE does not directly touch the rows in question, so long as it modifies something in that table.
Duh!! There was already a memcached instance running on port 11211. Unfortunately, MySQL doesn't error out in this situation. When I was using telnet to connect to port 11211, I was reaching the existing memcached instance. It was storing and retrieving the values it had seen but wasn't communicating with MySQL.
I stopped the existing memcached instance and restarted mysql. I am now able to connect to port 11211. Using telnet, when I do a "get", I get back values from the db. Also, when I set new values from telnet, they get reflected in the DB (and can be retrieved using SQL).

Does a person's internet connection affect the speed of SQL queries or PHP parsing?

I started benchmarking with Zend_Db_Profiler by saving queries that take too long. For one user, this query:
SELECT chapter, `order`, topic, id, name
FROM topics
WHERE id = '1'
AND hidden = 'no'
took 2.97 seconds. I performed an Explain:
select_type  table   possible_keys  key  key_len  ref    rows  Extra
SIMPLE       topics  id             id   4        const  42    Using where
and ran the query myself from phpMyAdmin, where it took only 0.0108 seconds. I thought that perhaps the size of the table might have an effect, as there is one varchar column that is 8000 characters long, but it's not part of the SELECT. I also just switched over to semi-dedicated hosting, but I can't imagine that this would have had a negative effect. Any thoughts as to how I could troubleshoot would be appreciated.
No. PHP and MySQL are server-side technologies: your server does the processing, and the client's connection has no bearing on it. If your server is slow, it will just be slower in returning the response to the client.
Sadly, your premise about the bottleneck here is not right. Also, when testing how one query behaves from your browser and then from phpMyAdmin (or any other GUI), you have to clear the query cache before running the same query again. You didn't mention whether you did that.
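A minimal sketch of that step (the query cache exists in MySQL up to version 5.7 and was removed in 8.0):
-- Flush cached result sets so the next run is measured cold:
RESET QUERY CACHE;
-- Or bypass the cache for a single statement:
SELECT SQL_NO_CACHE chapter, `order`, topic, id, name FROM topics WHERE id = '1' AND hidden = 'no';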
The second part of tracking down what might be wrong is confirming that your database's configuration variables have been set optimally, that you chose the proper storage engine, and that your indexing strategy is sound (such as choosing an INT primary key instead of a VARCHAR, and avoiding similar atrocities).
That means that in most cases you'd go with the InnoDB storage engine. It's free, and it's quick if optimized (the server variable innodb_buffer_pool_size does wonders when set to a proper size and when you have sufficient RAM). Since you said you use semi-dedicated hosting, you probably don't have control over those configuration variables.
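For illustration (resizing via SET GLOBAL works only on MySQL 5.7.5 and later; on older servers the variable must be changed in my.cnf and the server restarted):
-- Check the current buffer pool size:
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
-- Resize it; 1 GB here is purely an illustrative value:
SET GLOBAL innodb_buffer_pool_size = 1073741824;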
Only when you're sure that
1) you're not testing the same query off of the cache, and
2) you've done everything within your power to make it optimal (this includes making sure that you don't have rogue processes hogging your server)
can you assume there might be an error in communication between the server and the client.
As both PHP and SQL run on the server side, the user's internet connection does not affect the speed of the query.
Maybe the database server was under too much load at the time and couldn't process the query in time.

Entity Framework code first with MySQL in production

I am creating an ASP.NET MVC application using EF code first. I had used SQL Azure as my database, but it turns out SQL Azure is not reliable. So I am thinking of using MySQL/PostgreSQL for the database.
I wanted to know the repercussions/implications of using EF code first with MySQL/PostgreSQL in regard to performance.
Has anyone used this combo in production or knows anyone who has used it?
EDIT
I keep on getting the following exceptions in SQL Azure.
SqlException: "A transport-level error has occurred when receiving results from the server.
(provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.)"
SqlException: "Database 'XXXXXXXXXXXXXXXX' on server 'XXXXXXXXXXXXXXXX' is not
currently available. Please retry the connection later. If the problem persists, contact
customer support, and provide them the session tracing ID of '4acac87a-bfbe-4ab1-bbb6c-4b81fb315da'.
Login failed for user 'XXXXXXXXXXXXXXXX'."
First, your problem seems to be a network issue, perhaps with your ISP. If you move to a remotely hosted PostgreSQL or MySQL db, I think you will run into the same problems.
Secondly, comparing MySQL and PostgreSQL performance is relatively tricky. In general, MySQL is optimized for primary-key lookups, and PostgreSQL is optimized for more complex use cases. This may be a bit low-level, but...
MySQL InnoDB tables are basically B-tree indexes where the leaf node includes the table data. The primary key is the key of the index. If no primary key is provided, one will be created for you. This means two things:
SELECT * FROM my_large_table will be slow, as there is no support for a physical-order scan.
SELECT * FROM my_large_table WHERE secondary_index_value = 2 requires two index traversals, since the secondary index can only refer to the primary key values.
In contrast, a lookup by primary key value will be faster than on PostgreSQL, because the index itself contains the data.
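A sketch of the two access patterns (the table is hypothetical):
-- InnoDB clusters the rows on the primary key:
CREATE TABLE my_large_table (
  id INT PRIMARY KEY,
  secondary_index_value INT,
  payload VARCHAR(255),
  KEY idx_secondary (secondary_index_value)
) ENGINE=InnoDB;
-- One traversal: the clustered-index leaf already holds the row:
SELECT * FROM my_large_table WHERE id = 42;
-- Two traversals: idx_secondary yields primary-key values, which are then
-- looked up in the clustered index:
SELECT * FROM my_large_table WHERE secondary_index_value = 2;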
PostgreSQL, by comparison, stores rows unordered in a series of heap pages, with the indexes separate from the data. If you want to pull by primary key, you scan the index, read the data page in which the row is found, and then pull the data; pulling from a secondary index is no slower. Additionally, the tables are structured so that sequential disk access is possible, so a long SELECT * FROM my_large_table benefits significantly from the operating system's read-ahead cache.
In short, if your queries are simply joinless selections by primary key, then MySQL will give you better performance. If you have joins and such, PostgreSQL will do better.