I'm working on a TCP server application in Java that needs to dispatch around 1000 frames per second. Each incoming frame generates an UPDATE (to save state) and an INSERT (for logging) against MySQL. That is, 1000 updates and 1000 inserts per second! These updates and inserts are my bottleneck. At this point I don't know whether MySQL is a bad fit for my scenario, because I'm running the benchmarks on a PC, not on a server. I have tried executing the updates and inserts in periodic batches, but each batch takes a long time.
I'm using the MyISAM engine, Mina to handle the TCP connections, and a pool for the MySQL connections.
Should I use memcached in front of MySQL, or a NoSQL store, or could it simply be a hardware resource problem?
Thanks!
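For reference, the kind of periodic batching described above might look roughly like the following JDBC sketch. It is only an illustration: the table and column names are made up, the Mina handler and connection pool setup are omitted, and enabling rewriteBatchedStatements on the Connector/J URL is an assumption about the driver configuration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Minimal sketch: flush a buffered list of frames as one batched UPDATE
// and one batched INSERT per cycle. Table and column names are made up.
public class FrameBatchWriter {

    public void flush(Connection conn, List<Frame> frames) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement update = conn.prepareStatement(
                 "UPDATE device_state SET state = ? WHERE device_id = ?");
             PreparedStatement insert = conn.prepareStatement(
                 "INSERT INTO frame_log (device_id, payload, received_at) VALUES (?, ?, NOW())")) {

            for (Frame f : frames) {
                update.setString(1, f.state());
                update.setLong(2, f.deviceId());
                update.addBatch();

                insert.setLong(1, f.deviceId());
                insert.setBytes(2, f.payload());
                insert.addBatch();
            }
            update.executeBatch();   // one round trip per batch instead of one per frame
            insert.executeBatch();   // (helped by rewriteBatchedStatements=true on the MySQL driver)
            conn.commit();           // note: commit/rollback only matter on InnoDB; MyISAM ignores them
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }

    // Hypothetical immutable frame record.
    record Frame(long deviceId, String state, byte[] payload) {}
}
```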
I have been working on the server migration of a legacy ecommerce application using PHP 5.6.
The switch involved two Dedicated 32 servers from Linode.
One server is for NginX + PHP and the other is for MySQL only.
The legacy application leverages memcached.
After the switch, I can see heavy internal traffic caused by private inbound and outbound connections.
So far this hasn't caused any performance problems.
However, I was under the impression that the queries would be cached on the local machine, and not on the remote.
Because if the query is cached on the remote host, it still has to transmit the result set over the private network, instead of retrieving it from RAM or the local SSD.
Am I assuming this wrong?
It may be that I'm missing why the private network traffic to a remote cache can still be more beneficial for overall performance than a local cache.
MySQL has a feature called the Query Cache, but this caches query result sets in the mysqld server process, not on the client. If you run the exact same query again after the result has been cached in the Query Cache, it will copy the result from the Query Cache and avoid the cost of running the query again. But this will not avoid the time to transfer the result across the network from mysqld to your PHP application.
Also keep in mind that the MySQL Query Cache is being deprecated and retired.
Alternatively, your application may store data from query results in memcached, but typically this is done by the application code (I know there are UDFs to read and write memcached from MySQL triggers, but this is a bad idea).
If your memcached service is not on the same host as your PHP code, there is a network transfer on both sides of the cache: once when querying the data from MySQL the first time, again when writing it into memcached, and then every time you fetch the cached data out of memcached.
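The usual application-side approach is a cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache. The question is about PHP, but here is a minimal Java sketch of the same pattern using the spymemcached client; the host name, key format, and query are made up.

```java
import java.net.InetSocketAddress;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import net.spy.memcached.MemcachedClient;

// Cache-aside sketch: check memcached first, fall back to MySQL, then
// populate the cache. Host name, key format and query are made up.
public class ProductCache {

    private final MemcachedClient cache;
    private final Connection db;

    public ProductCache(Connection db) throws Exception {
        // memcached may be local or remote; either way the client talks
        // to it over its own protocol, on port 11211 by default.
        this.cache = new MemcachedClient(new InetSocketAddress("cache-host", 11211));
        this.db = db;
    }

    public String productName(long id) throws Exception {
        String key = "product:" + id + ":name";
        String name = (String) cache.get(key);          // first network hop: memcached
        if (name != null) {
            return name;                                // cache hit: MySQL is never contacted
        }
        try (PreparedStatement ps = db.prepareStatement(
                "SELECT name FROM products WHERE id = ?")) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {    // cache miss: second hop, MySQL
                name = rs.next() ? rs.getString(1) : null;
            }
        }
        if (name != null) {
            cache.set(key, 300, name);                  // third hop: write back with a 300 s TTL
        }
        return name;
    }
}
```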
PHP also has some features to do in-memory caching, such as APCu. I don't have any experience with this, and it's not clear from a brief scan of the documentation where it stores cached data.
PHP is designed to be a "shared nothing" language. Every PHP request has its own data, and data doesn't normally last until the next request. This is why a cache is typically not kept in PHP memory. Applications rely on either memcached or the database itself, because those will hold data longer than a single PHP request.
If you have a fast enough network, it shouldn't be a high cost to fetch items out of a cache over a network. The performance architects at a past job of mine developed this wisdom:
"Remote memory is faster than local storage."
They meant that if the data is in RAM on a server, then reading it from RAM even with the additional overhead of transferring it across a network is usually better than reading the data from persistent (disk) storage on the local host.
A typical Elixir web application will usually have a postgresql backend, with Ecto queries coupled with the API logic.
However, since Cowboy creates a child GenServer process (containing the app logic) per request, will this have the effect of producing n PostgreSQL connections for n concurrent requests, even with the pooling that cowboy/poolboy provides?
Then, moving to a scenario where multiple instances of the application exist (for example, a Docker container cluster), will this not add an extra factor to the total number of open database connections?
Cowboy does create a new Erlang process for each request, but executing an Ecto query from that process will not result in a new database connection. Ecto keeps a pool of connections to the database (using db_connection/poolboy). The size of this pool is set with the pool_size option in the Repo configuration. When you initiate a query, a connection is borrowed from this pool and used to execute the query; it is returned to the pool after execution completes. Ecto never creates a new connection per query. If no connection is available in the pool, the caller waits for one to be checked back in, and eventually times out if none is returned within the configured timeout (defaults to 30 seconds).
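The same borrow/return behavior is how connection pools work in general. For comparison only, here is an analogous Java sketch using HikariCP rather than Ecto; the JDBC URL, credentials, and sizes are made up, and the comments map them loosely onto the Ecto settings mentioned above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

// Analogy in Java: a fixed-size pool hands out existing connections, and
// callers wait (up to a timeout) when the pool is exhausted, much like
// Ecto's pool_size and checkout timeout. Names and sizes are made up.
public class PoolExample {

    public static void main(String[] args) throws Exception {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://localhost:5432/app_db");
        config.setUsername("app");
        config.setPassword("secret");
        config.setMaximumPoolSize(10);        // ~ pool_size: 10 in the Repo config
        config.setConnectionTimeout(30_000);  // ~ how long a caller waits for a free connection

        try (HikariDataSource pool = new HikariDataSource(config)) {
            // Each "request handler" borrows a connection and returns it on close;
            // no new database connection is opened per request.
            try (Connection conn = pool.getConnection();
                 PreparedStatement ps = conn.prepareStatement("SELECT 1")) {
                ps.executeQuery();
            } // conn.close() here only returns the connection to the pool
        }
    }
}
```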
I'm trying to test the performance of using memcached on a MySQL server to improve performance.
I want to be able to use the normal MySQL command line, but I can't seem to get it to connect to memcached, even when I specify the right port.
I'm running the MySQL command on the same machine as both the memcached process and the MySQL server.
I've looked around online, but I can't seem to find anything about using memcached other than with program APIs. Any ideas?
Memcached has its own protocol. The MySQL client cannot connect directly to a memcached server.
You may be thinking of the MySQL 5.6 feature that allows MySQL server to respond to connections using a memcached-compatible protocol, and read and write directly to InnoDB tables. See http://dev.mysql.com/doc/refman/5.6/en/innodb-memcached.html
But this does not allow MySQL clients to connect to memcached -- it's the opposite, allowing memcached clients to connect to mysqld.
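So if the goal is to try that 5.6 plugin, the connection has to be made with a memcached client, not the mysql command line. A rough Java sketch with the spymemcached client is below; it assumes the daemon_memcached plugin is installed and listening on its default port 11211, and the key-to-table mapping depends on how innodb_memcache.containers is configured.

```java
import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

// Sketch: talking to the InnoDB memcached plugin with a memcached client.
// Assumes the daemon_memcached plugin is enabled on the MySQL host and
// listening on its default port 11211; the key/value mapping depends on
// the innodb_memcache.containers configuration.
public class InnoDbMemcachedExample {

    public static void main(String[] args) throws Exception {
        MemcachedClient client =
            new MemcachedClient(new InetSocketAddress("mysql-host", 11211));

        client.set("some_key", 0, "some value");   // written through to the mapped InnoDB table
        Object value = client.get("some_key");     // read back over the memcached protocol
        System.out.println(value);

        client.shutdown();
    }
}
```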
Re your comment:
The InnoDB memcached interface is not really a caching solution per se, it's a solution for using a familiar key/value API for persistent data in InnoDB tables. InnoDB does do transparent caching of data pages in its buffer pool, but this is no different from conventional data reads with SQL. InnoDB also commits all changes to its transaction log synchronously on commit.
Here's a blog from my colleague at Percona. He tested whether the MySQL 5.6 memcached API could be used as a caching layer, and found that actually using memcached is still superior.
http://www.mysqlperformanceblog.com/2013/03/29/mysql-5-6-innodb-memcached-plugin-as-a-caching-layer/
Here's one conclusion from that blog:
As expected, there is a slowdown for write operations when using the InnoDB version. But there is also a slight increase in the average fetch time.
I am writing a DB logging Ruby gem which will simply take a job from a Beanstalk queue and write it to the DB.
That is, one process on Server A puts a job (that it wants to log) into the Beanstalk queue on Server B, and my logging process on Server B takes it out and writes it to the MySQL DB on Server B.
I want to know whether this is worth it.
Is putting a job in the Beanstalk queue faster than writing to the DB? Or could the process that wants to log just write directly to the DB instead of going through the logging process?
Note that both the beanstalk server and DB are on another server.
Beanstalk internally makes a socket call from Server A to Server B.
I believe mysql would need to do the same as well?
So is writing to MySQL on another server going to be slower than putting the job in the Beanstalk queue?
It'll be much faster, primarily because Beanstalkd jobs, by default, are stored in-memory and are lost if, for example, you lose power on your server, whereas MySQL is a strongly ACID-compliant relational database, and hence will go to a lot of effort and flush each of your logs to disk.
I think you'll find, after you do some benchmarking with a lot of logs being made by your system, that disk I/O will be your limiting factor rather than the speed of TCP/IP sockets. Your current system's advantage is that when Server A files a log on Server B's beanstalkd instance, it takes up very little of Server A's time, and Server B can periodically flush out many logs at once from beanstalkd to MySQL, making the process more efficient. The disadvantage is that the more you batch up the logs, the more logs you will lose in the event of a software or power failure, unless you use beanstalkd's "-b" parameter, which makes jobs durable by writing them to disk (and hence makes the process slower).
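To make the batching tradeoff concrete, the consumer on Server B could drain jobs and write them in groups, roughly like the Java sketch below. It is only an illustration: the real gem is Ruby and reads from beanstalkd, whereas here a BlockingQueue stands in for the tube, and the table name is made up.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Sketch of the consumer side: drain queued log entries and write them to
// MySQL in batches. A BlockingQueue stands in for the beanstalkd tube; the
// bigger the batch, the fewer round trips, but the more you can lose on a crash.
public class LogFlusher implements Runnable {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Connection db;

    public LogFlusher(Connection db) { this.db = db; }

    public void submit(String logLine) { queue.offer(logLine); }

    @Override
    public void run() {
        List<String> batch = new ArrayList<>();
        try {
            while (true) {
                String first = queue.poll(1, TimeUnit.SECONDS);  // wait for work
                if (first == null) continue;
                batch.add(first);
                queue.drainTo(batch, 499);                       // up to 500 entries per flush
                try (PreparedStatement ps = db.prepareStatement(
                        "INSERT INTO logs (message) VALUES (?)")) {
                    for (String line : batch) {
                        ps.setString(1, line);
                        ps.addBatch();
                    }
                    ps.executeBatch();                           // one round trip per batch
                }
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();                  // stop on shutdown
        } catch (SQLException e) {
            e.printStackTrace();                                 // a real gem would retry or re-queue
        }
    }
}
```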
Of course, the only way to truly settle this question is to benchmark!
Is it normal for mysql to be slow when connecting to a remote host or should it have the same performance as connecting to a local host?
I noticed a small performance difference, when I tried to connect to a remote host, so I'm wondering if that's normal?
Assuming that the remote machine is equal in terms of processing power to your local machine, the primary difference in speed should be network latency, i.e. the round-trip time for network traffic. If you are sending huge amounts of data (e.g., reading or writing large BLOBs), then network bandwidth can come into play as well and "slow" things down. But in general, the round-trip cost is often the biggest factor. If you are executing a large number of "small" queries, this cost difference can be fairly significant when comparing a local connection to a remote connection.
Out of curiosity, I just now ran a test that I had already built that simply runs a bunch of update queries. This is not using MySQL but another client/server DBMS. Thus the results would likely be different, but the idea is the same and I would imagine the relative differences would not be significantly different.
Local host (using IPC comm): 5.3 seconds
Remote host (UDP comm): 20.2 seconds
This involved about 50,000 operations. The remote host was 2 hops away on the LAN with (if I measured it correctly) a round-trip latency of approximately 0.25 ms for a packet with a 1-byte payload. (50,000 round trips at 0.25 ms each is roughly 12.5 seconds of latency alone, which accounts for most of the difference.)
It depends entirely on the network connection between the program and the MySQL database server. A slow network will make the database appear slow.
I'd expect a "small performance difference" (as you described it) to be normal for a remote connection.
By default the MySQL server will perform a reverse DNS lookup the first time a client connects to it. It then stores this in its cache. This can potentially give a performance hit depending on the speed of the reverse DNS resolution.
It can depend on how many MySQL queries you're doing: Slow MySQL Remote Connection
You can optimize your code by converting many small queries into larger ones.
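As an illustration of that last point, combining rows into one statement pays a single round trip instead of one per row. The Java/JDBC sketch below is hypothetical (table and column names are made up); the second method builds one multi-row INSERT.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Many small queries: one network round trip per row.
// One larger query: a single multi-row INSERT, one round trip total.
public class MultiRowInsert {

    public static void insertOneByOne(Connection conn, List<String> names) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO items (name) VALUES (?)")) {
            for (String name : names) {        // N round trips over the network
                ps.setString(1, name);
                ps.executeUpdate();
            }
        }
    }

    public static void insertInOneStatement(Connection conn, List<String> names) throws SQLException {
        StringBuilder sql = new StringBuilder("INSERT INTO items (name) VALUES ");
        for (int i = 0; i < names.size(); i++) {
            sql.append(i == 0 ? "(?)" : ", (?)");
        }
        try (PreparedStatement ps = conn.prepareStatement(sql.toString())) {
            for (int i = 0; i < names.size(); i++) {
                ps.setString(i + 1, names.get(i));
            }
            ps.executeUpdate();                // one round trip, however many rows
        }
    }
}
```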