We have a Facebook game that stores all persistent data in a MySQL database that is running on a large Amazon RDS instance. One of our tables is 2GB in size. If I run any queries on that table that take more than a couple of seconds, any SQL actions performed by our game will fail with the error:
HTTP/1.1 503 Service Unavailable: Back-end server is at capacity
This obviously brings down our game!
I've monitored CPU usage on the RDS instance during these periods, and though it does spike, it doesn't go much over 50%. Previously we were on a smaller instance size and it did hit 100%, so I'd hoped just throwing more CPU capacity at the problem would solve it. I now think it's an issue with the number of open connections. However, I've only been working with SQL for 8 months or so, so I'm no expert on MySQL configuration.
Is there perhaps some configuration setting I can change to prevent these queries from overloading the server, or should I just not be running them whilst our game is up?
I'm using MySQL Workbench to run the queries.
Any help would be very much appreciated - Thanks!
EDIT:
Here's an example....
SELECT *
FROM BlueBoxEngineDB.Transfer
WHERE Amount = 1000
AND FromUserId = 4
AND Status='Complete';
The table looks like this:
TransferId | Started             | Status   | Expires | FromUserId | ToUserId | CurrencyId | Amount | SessionId
1177       | 2012-06-04 21:43:18 | Added    |         | 150001     | 2        | 4          | 1      | 12156
1179       | 2012-06-04 21:48:50 | ISF      |         | 150001     | 2        | 4          | 1      | 12156
1181       | 2012-06-04 22:08:33 | Added    |         | 150001     | 2        | 4          | 25     | 12156
1183       | 2012-06-04 22:08:41 | Complete |         | 150001     | 2        | 4          | 50     | 12156
1185       | 2012-06-04 22:08:46 | Added    |         | 150001     | 2        | 4          | 200    | 12156
You should REALLY consider running a high availability RDS and setting up a read replica off of it. That way you can run complex queries on the replica to your heart's content and not interfere with the production database.
A 2GB database is really not all that large. If you have the proper indexes on the tables you are querying, you should not be locking your DB up.
Above all - don't be running queries on a high capacity production database if you don't know what it is going to do. From the comments above it seems clear that you are not a very experienced DB admin. That's ok. Working on a high volume server will definitely be a learning experience for you, just try not to make your lessons ones where you crash your service. Again, this is why having a replica, or creating a DB snapshot and setting up a test DB before trying queries on large tables is a very good idea.
An index on (FromUserId, Amount, Status) would probably help this query a lot.
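As a sketch, with an index name of my own choosing:

CREATE INDEX idx_transfer_from_amount_status
    ON BlueBoxEngineDB.Transfer (FromUserId, Amount, Status);

With that in place, the example query should become an index lookup instead of a full scan of the 2GB table.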
You may, though, have many more variations of queries that hit this table. Adding an index for every one of them will leave you with tens of indexes on the table, and that can bring other problems.
Try to analyze the slow query log and then optimize the slowest queries (and the ones that use more percentage of the CPU).
You probably need to tweak your schema (adding indexes is the immediate step).
To analyze your situation you can access the MySQL slow query logs for your database to determine if there are slow-running SQL queries and, if so, the performance characteristics of each. You could set the "slow_query_log" DB Parameter and query the mysql.slow_log table to review the slow-running SQL queries. Please refer to the Amazon RDS User Guide to learn more.
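A rough sketch of what that looks like; on RDS these parameters are normally changed through the DB Parameter Group rather than with SET GLOBAL:

-- Parameter Group values (illustrative):
--   slow_query_log  = 1
--   long_query_time = 2       -- seconds; lower it to catch more queries
--   log_output      = TABLE   -- write entries to mysql.slow_log

-- Then review the slowest statements:
SELECT start_time, query_time, rows_examined, sql_text
FROM mysql.slow_log
ORDER BY query_time DESC
LIMIT 20;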
Some of your tables are probably worth offloading to DynamoDB or Redis. Both give single-digit-millisecond latency and are therefore very popular with game developers. You just need to think about your data structures.
My DB has around 15 tables, each with 40 columns and about 10,000 rows.
Most columns are VARCHAR, with some indexes and foreign keys.
Sometimes I need to rebuild my database (design flaw, working on it), which takes about 40 seconds locally. Now I'm trying to do the same on an AWS RDS MySQL 5.7 instance, but it takes forever, something like 40-50 minutes. The last time I ran this same process it took no more than 5 minutes; still way more than the local 40 seconds, but I was happy with it.
My internet speed is at about 35 Mbps Download / 5 Mbps Upload.
I know it's not fast, but it's consistent, and it hasn't changed since my last rebuild.
I enabled General Logs, but all I can see are the INSERT queries, occasionally some "SELECT 1".
I do have some room for improvement in my code, but still, going from 40 seconds to 50 minutes suggests that something else is going on.
Any ideas on how to diagnose and find the bottleneck?
Thanks
--
Additional relevant information:
It is a micro instance from AWS, and all of the relevant monitoring indicators are basically flat: CPU at 4%, free storage space at 20,000 MB, freeable memory at 200 MB, write IOPS at around 2.5. The server runs MySQL 5.7.25 with 1 vCPU, 1 GB of RAM, and 20 GB of SSD. This is the same as 3 months ago, when I last rebuilt the database.
SHOW GLOBAL STATUS: https://pastebin.com/jSrAzYZP
SHOW GLOBAL VARIABLES: https://pastebin.com/YxD7dVhR
SHOW ENGINE INNODB STATUS: https://pastebin.com/r5wffB5t
SHOW PROCESS LIST: https://pastebin.com/kWwiyGwf
SELECT * FROM information_schema...: https://pastebin.com/eXGBmetP
I haven't made any big changes to the server configuration, except enabling logs, maxing out max_allowed_packet, and saving logs to file.
In my backend I have a Flask app running. When it receives the API call, it takes a bunch of pickled objects, appends them (as Flask-SQLAlchemy model instances) to a list, and then runs db.session.add_all(entries), trying to run a bulk operation. The code is the same for localhost and my remote server.
It does get slower on three specific tables, most of them with VARCHAR columns, but nothing different from my last inserts. It seems odd that the problem would be the data or the way the code is structured; at least, it doesn't seem reasonable that this would stretch a 20-second (localhost) run to 40 minutes (hosted server), especially when the rest of the tables work mostly the same.
Enable the slow log, set long_query_time=0, run your code, then put the resulting log through mysqldumpslow.
Establish which queries contribute most to slowness and take it from there.
Compare the config between your old server and your new one.
Also, are they the same version of MySQL? 5.6, 5.7 and 8.0 can produce very different execution plans (with 5.6 usually coming up with the sane one if they differ).
Rate Per Second = RPS
Suggestions to consider for your AWS RDS Parameters group
thread_cache_size=24 # from 8 to reduce threads_created count
innodb_io_capacity=1900 # from 200 to enable more use of SSD IOPS capacity
read_rnd_buffer_size=128K # from 512K to reduce handler_read_rnd_next RPS of 21
query_cache_size=0 # from 1M since you have the QC turned off with query_cache_type=OFF
Determine why com_flush is running 13 times per hour and get it stopped to avoid table open thrashing.
I found that after migrating to RDS all my database indexes were gone! They weren't migrated along with the schema and data. Make sure your indexes are there.
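For example, to confirm what actually made it across (the table name is just a placeholder):

SHOW INDEX FROM your_table;       -- lists every index on the table
SHOW CREATE TABLE your_table\G    -- full DDL, including indexes and foreign keys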
Also, MySQL query cache is OFF by default in RDS. This won't help the performance of your initial query, but it may speed things up in general.
You can set query_cache_type to 1 and define a value for query_cache_size. I also changed thread_cache_size from 8 to 24 and innodb_io_capacity from 200 to 1900; I don't know if that helps you.
Also creating AWS DB Parameter Groups helped me a lot with configuring and tuning DB variables. Here you can read more:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_WorkingWithParamGroups.html
I run a website with ~500 real-time visitors, ~50k daily visitors, and ~1.3 million total users. I host my server on AWS, where I use several instances of different kinds. When I started the website, the different instances cost roughly the same. When the website started to gain users, the RDS instance's (MySQL DB) CPU constantly kept hitting the roof; I had to upgrade it several times, and now it has started to take up the main part of the performance and monthly cost (around 95% of ~$2.8k/month). I currently use a database server with 16 vCPUs and 64 GiB of RAM, and I also use Multi-AZ deployment to protect against failures. I wonder if it is normal for the database to be that expensive, or if I have done something terribly wrong?
Database Info
At the moment my database has 40 tables; most of them have ~100k rows, some have ~2 million, and one has 30 million.
I have a system that archives rows older than 21 days, once they are no longer needed.
Website Info
The website mainly use PHP, but also some NodeJS and python.
Most of the functions of the website work like this (there's a rough SQL sketch after the list):
Start transaction
Insert row
Get last inserted id (lastrowid)
Do some calculations
Update the inserted row
Update the user
Commit transaction
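Roughly, in SQL, with made-up table and column names:

START TRANSACTION;
INSERT INTO orders (user_id, amount) VALUES (42, 100);     -- insert row
SET @order_id = LAST_INSERT_ID();                          -- get last inserted id
-- ...application-side calculations happen here...
UPDATE orders SET status = 'done' WHERE id = @order_id;    -- update the inserted row
UPDATE users  SET balance = balance - 100 WHERE id = 42;   -- update the user
COMMIT;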
I also run around 100 bots which poll the database at 10-30 second intervals; they also insert/update the database sometimes.
Extra
I have done several things to try to lower the load on the database, such as enabling the database cache, using a Redis cache for some queries, removing very slow queries, and upgrading the storage type to "Provisioned IOPS SSD". But nothing seems to help.
These are the changes I have made to the parameter settings:
I have thought about creating a MySQL cluster of several smaller instances, but I don't know whether that would help, and I also don't know how well it works with transactions.
If you need any more information, please ask; any help on this issue is greatly appreciated!
In my experience, as soon as you ask the question "how can I scale up performance?" you know you have outgrown RDS (edit: I admit my experience that leads me to this opinion may be outdated).
It sounds like your query load is pretty write-heavy. Lots of inserts and updates. You should increase the innodb_log_file_size if you can on your version of RDS. Otherwise you may have to abandon RDS and move to an EC2 instance where you can tune MySQL more easily.
I would also disable the MySQL query cache. On every insert/update, MySQL has to scan the query cache to see if there are any cached results that need to be purged. This is a waste of time if you have a write-heavy workload. Increasing your query cache to 2.56GB makes it even worse! Set the cache size to 0 and the cache type to 0.
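On RDS these are DB Parameter Group settings; as a sketch of the values, plus a quick way to confirm they took effect:

-- Parameter Group (illustrative):
--   query_cache_size = 0
--   query_cache_type = 0   -- OFF

SHOW VARIABLES LIKE 'query_cache%';   -- verify the cache really is disabled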
I have no idea what queries you run, or how well you have optimized them. MySQL's optimizer is limited, so it's frequently the case that you can get huge benefits from redesigning SQL queries. That is, changing the query syntax, as well as adding the right indexes.
You should do a query audit to find out which queries are accounting for your high load. A great free tool to do this is https://www.percona.com/doc/percona-toolkit/2.2/pt-query-digest.html, which can give you a report based on your slow query log. Download the RDS slow query log with the http://docs.aws.amazon.com/cli/latest/reference/rds/download-db-log-file-portion.html CLI command.
Set your long_query_time=0, let it run for a while to collect information, then change long_query_time back to the value you normally use. It's important to collect all queries in this log, because you might find that 75% of your load is from queries under 2 seconds, but they are run so frequently that it's a burden on the server.
After you know which queries are accounting for the load, you can make some informed strategy about how to address them:
Query optimization or redesign
More caching in the application
Scale out to more instances
I think the answer is "you're doing something wrong". It is very unlikely you have reached an RDS limitation, although you may be hitting limits on some parts of it.
Start by enabling detailed monitoring. This will give you some OS-level information which should help determine what your limiting factor really is. Look at your slow query logs and database stats - you may have some queries that are causing problems.
Once you understand the problem - which could be bad queries, I/O limits, or something else - then you can address them. RDS allows you to create multiple read replicas, so you can move some of your read load to slaves.
You could also move to Aurora, which should give you better I/O performance. Or use PIOPS (or allocate more disk, which should increase performance). You are using SSD storage, right?
One other suggestion - if your calculations (step 4 above) take a significant amount of time, you might want to look at breaking the work into two or more transactions.
A query_cache_size of more than 50M is bad news. You are writing often -- many times per second per table? That means the QC needs to be scanned many times/second to purge the entries for the table that changed. This is a big load on the system when the QC is 2.5GB!
query_cache_type should be DEMAND if you can justify it being on at all. And in that case, pepper the SELECTs with SQL_CACHE and SQL_NO_CACHE.
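For example (table names are made up):

SELECT SQL_CACHE    name FROM settings WHERE id = 1;         -- rarely-changing data: cache it
SELECT SQL_NO_CACHE *    FROM events   WHERE user_id = 42;   -- hot, frequently-invalidated data: skip the cache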
Since you have the slowlog turned on, look at the output with pt-query-digest. What are the first couple of queries?
Since your typical operation involves writing, I don't see an advantage in using read-only slaves.
Are the bots running at random times? Or do they all start at the same time? (The latter could cause terrible spikes in CPU, etc.)
How are you "archiving" "old" records? It might be best to use PARTITIONing and "transportable tablespaces". Use PARTITION BY RANGE and 21 partitions (plus a couple of extras).
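A minimal sketch of what that could look like; the table and columns are made up, and the real schema would need the partitioning column in every unique key:

CREATE TABLE events (
    id      BIGINT NOT NULL AUTO_INCREMENT,
    created DATE   NOT NULL,
    payload VARCHAR(255),
    PRIMARY KEY (id, created)
)
PARTITION BY RANGE (TO_DAYS(created)) (
    PARTITION p20230101 VALUES LESS THAN (TO_DAYS('2023-01-02')),
    PARTITION p20230102 VALUES LESS THAN (TO_DAYS('2023-01-03')),
    PARTITION pmax      VALUES LESS THAN MAXVALUE
);

-- Dropping a day's worth of old rows is then a metadata operation, not a huge DELETE:
ALTER TABLE events DROP PARTITION p20230101;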
Your typical transaction seems to work with one row. Can it be modified to work with 10 or 100 all at once? (More than 100 is probably not cost-effective.) SQL is much more efficient in doing lots of rows at once versus lots of queries of one row each. Show us the SQL; we can dig into the details.
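For example (hypothetical table):

-- one row per statement: one round trip and one commit each
INSERT INTO ledger (user_id, amount) VALUES (1, 10);
INSERT INTO ledger (user_id, amount) VALUES (2, 25);

-- versus one multi-row statement:
INSERT INTO ledger (user_id, amount) VALUES (1, 10), (2, 25), (3, 40);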
It seems strange to insert a new row, then update it, all in one transaction. Can't you completely compute it before doing the insert? Hanging onto the inserted_id for so long probably interferes with others doing the same thing. What is the value of innodb_autoinc_lock_mode?
Do the "users" interact with each other? If so, in what way?
I'm doing a SELECT from 3 joined tables on a MySQL 5.6 server running on an Azure instance, with InnoDB set to 2GB. I used to have a 14GB RAM, 2-core server, and I just doubled the RAM and cores hoping this would help my SELECT, but it didn't.
The 3 tables I'm selecting from are 90MB, 15MB, and 3MB.
I don't believe I'm doing anything crazy in my query, where I filter on a few booleans; however, the SELECT is hanging the server pretty badly and I can't get my data. I do see traffic increasing to something like 500MB/s in MySQL Workbench, but I can't figure out what to do with this.
Is there anything I can do to get my SQL queries working? I don't mind waiting 5 minutes to get the data, but I need to figure out how to get it.
==================== UPDATE ===============================
I was able to get it done by cloning the 90 MB table and filling the clone with a filtered copy of the original. It ended up being ~15 MB; then I just selected from all 3 tables, joining them via IDs. Now the request completes in a tenth of a second.
What did I do wrong in the first place? I feel like there is a way to increase the size of some packets to get such queries to work. Any suggestions on what I should Google?
Just FYI, my SELECT query looked like this:
SELECT
text_field1,
text_field2,
text_field3 ,..
text_field12
FROM
db.major_links,db.businesses, db.emails
where bool1=1
and bool2=1
and text_field is not null or text_field!=''
and db.businesses.major_id=major_links.id
and db.businesses.id=emails.biz_id;
So bool1, bool2, and the text_field I'm filtering on are the fields from that 90 MB table.
I know this might be a bit late, but I have some suggestions.
First, take a look at max_allowed_packet in your my.ini file. On Windows it is usually found here:
C:\ProgramData\MySQL\MySQL Server 5.6
This controls the packet size, and usually causes errors in large queries if it isn't set correctly. I have mine set to 100M
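You can check it and, if you have the privilege, change it at runtime; editing my.ini and restarting makes it permanent:

SHOW VARIABLES LIKE 'max_allowed_packet';
SET GLOBAL max_allowed_packet = 104857600;   -- 100M; new connections pick this up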
Here is some documentation for you:
Official documentation
In addition, I've seen slow queries when there are a lot of conditions in the WHERE clause, and here you have several. Make sure you have indexes and compound indexes on the columns in your WHERE clause, especially the ones involved in the joins.
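As a sketch using the tables from the question (index names are arbitrary, and which columns to index depends on the real schema):

-- indexes on the join columns
CREATE INDEX idx_businesses_major_id ON db.businesses (major_id);
CREATE INDEX idx_emails_biz_id       ON db.emails (biz_id);
-- plus a compound index on (bool1, bool2) on whichever table holds them (the 90 MB one)

-- the same query with explicit JOINs; the parentheses keep the unparenthesized OR in the
-- original WHERE clause from detaching the join conditions (a likely cause of the hang)
SELECT text_field1, text_field2, /* ... */ text_field12
FROM db.major_links
JOIN db.businesses ON db.businesses.major_id = db.major_links.id
JOIN db.emails     ON db.emails.biz_id       = db.businesses.id
WHERE bool1 = 1
  AND bool2 = 1
  AND (text_field IS NOT NULL OR text_field != '');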
Over the last month I've done basically the impossible: I have a Debian server on an Intel Celeron 2.5GHz / 512 MB RAM / >40GB IDE hard drive with MySQL running smoothly. I managed to connect using MySQL Workbench, and then I realized that I hadn't stopped to think about the database model.
My current database is an Access 97 Database with 2 gigantic tables:
Tbl_Swift - 13 fields and one of them is a 'memo' field with a full page of information.
Tbl_Contr - 20 fields where FOUR of them are 'memo' fields with pages of information.
It's not like the database is heavy or slow on Access, but I wanted to make it available to most users... then I realized that I should optimize my database, but here's the problem:
WHY?
Will it make that much of a difference? I'll have less than 5 users connected to this database and NONE of them will have 'write' privilege, they'll just run some standard queries. The database itself is rather small, it's under 600MB and ~90K records.
So, should I really stop to think about making it more 'optimized'?
"When I say OPTMIZE I mean when people say I should have a lot of tables with little information"
What you are talking about is normalization, and recently there was a thread about normalization vs. performance here: Denormalization: How much is too much?
And yes, I believe that you should think about normalization before that DB gets too big.
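As a sketch of the kind of split people mean, with made-up key and column names:

-- Tbl_Contr keeps its short fields; the four memo fields move into a child table
CREATE TABLE Tbl_Contr_Memo (
    contr_id  INT NOT NULL,           -- points back to Tbl_Contr's primary key
    memo_type VARCHAR(30) NOT NULL,   -- which of the four memos this row holds
    memo_text TEXT,
    PRIMARY KEY (contr_id, memo_type),
    FOREIGN KEY (contr_id) REFERENCES Tbl_Contr (id)
);

Queries that don't touch the memos then never have to drag pages of text through memory.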
We deploy an (AJAX-based) instant messenger which is serviced by a Comet server. We have a requirement to store the sent messages in a DB for long-term archival purposes, in order to meet legal retention requirements.
Which DB engine provides the best performance for this write-once, read-never (with rare exceptions) requirement?
We need at least 5000 inserts/sec. I am assuming neither MySQL nor PostgreSQL can meet these requirements.
Any proposals for a higher performance solution? HamsterDB, SQLite, MongoDB ...?
Please ignore the earlier benchmark; we had a bug in it.
We inserted 1M records with the following columns: id (int), status (int), message (140 chars, random).
All tests were done with the C++ driver on a desktop i5 PC with a 500 GB SATA disk.
Benchmark with MongoDB:
1M Records Insert without Index
time: 23s, insert/s: 43478
1M Records Insert with Index on Id
time: 50s, insert/s: 20000
next we add another 1M records to the same table, which already has the index and 1M records
time: 78s, insert/s: 12820
All of that resulted in nearly 4 GB of files on the filesystem.
Benchmark with MySQL:
1M Records Insert without Index
time: 49s, insert/s: 20408
1M Records Insert with Index
time: 56s, insert/s: 17857
next we add another 1M records to the same table, which already has the index and 1M records
time: 56s, insert/s: 17857
Exactly the same performance; no loss on MySQL as the table grows.
We saw Mongo eat around 384 MB of RAM during this test and load 3 CPU cores; MySQL was happy with 14 MB and loaded only 1 core.
Edorian was on the right track with his proposal; I will do some more benchmarking, and I'm sure we can reach 50K inserts/sec on a 2x quad-core server.
I think MySQL will be the right way to go.
If you are never going to query the data, then I wouldn't store it in a database at all; you will never beat the performance of just writing it to a flat file.
What you might want to consider is the scaling issue: what happens when it becomes too slow to write the data to a flat file? Will you invest in faster disks, or something else?
Another thing to consider is how to scale the service so that you can add more servers without having to coordinate the logs of each server and consolidate them manually.
Edit: You wrote that you want to have it in a database, in which case I would also consider the security issues of having the data online. What happens when your service gets compromised? Do you want your attackers to be able to alter the history of what has been said?
It might be smarter to store it temporarily in a file, and then dump it to an off-site location that is not accessible if your Internet-facing front ends get hacked.
If you don't need to run queries, then a database is not what you need. Use a log file.
it's only stored for legal reasons.
And what about the detailed requirements? You mention the NoSQL solutions, but these can't promise the data is really stored on disk. In PostgreSQL everything is transaction-safe, so you're 100% sure the data is on disk and available. (Just don't turn off fsync.)
Speed has a lot to do with your hardware, your configuration, and your application. PostgreSQL can insert thousands of records per second on good hardware with a correct configuration, and it can be painfully slow on the same hardware with a plain stupid configuration and/or the wrong approach in your application. A single INSERT is slow, many INSERTs in a single transaction are much faster, prepared statements are faster still, and COPY does magic when you need speed. It's up to you.
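Roughly, in PostgreSQL syntax (table and file names are made up):

-- one commit per row: slowest
INSERT INTO messages (user_id, body) VALUES (1, 'hi');

-- many rows per transaction: much faster
BEGIN;
INSERT INTO messages (user_id, body) VALUES (1, 'hi');
INSERT INTO messages (user_id, body) VALUES (2, 'hello');
-- ...thousands more...
COMMIT;

-- COPY: the fastest bulk path
COPY messages (user_id, body) FROM '/tmp/messages.csv' WITH (FORMAT csv);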
I don't know why you would rule out MySQL. It can handle a high insert rate. If you really want high insert rates, use the BLACKHOLE table type with replication. It's essentially writing to a log file that eventually gets replicated to a regular database table. You could even query the slave without affecting insert speed.
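A rough sketch of that setup (the column definitions are made up):

-- On the master: writes to a BLACKHOLE table are discarded locally but still
-- recorded in the binary log, so they cost almost nothing on this server.
CREATE TABLE chat_log (
    sent_at DATETIME NOT NULL,
    user_id INT NOT NULL,
    body    VARCHAR(140)
) ENGINE = BLACKHOLE;

-- On the slave: the same table defined with ENGINE = InnoDB materializes every
-- replicated INSERT, so the history can be queried there.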
Firebird can easily handle 5000 inserts/sec if the table doesn't have indexes.
Depending on your system setup, MySQL can easily handle over 50,000 inserts per second.
In tests on a system I am currently working on, we got to over 200k inserts per second with 100 concurrent connections on 10 tables (just a few values per row).
I'm not saying this is the best choice, since other systems like Couch could make replication/backups/scaling easier, but dismissing MySQL solely on the grounds that it can't handle such minor amounts of data is a little too harsh.
I guess there are better (read: cheaper, easier to administer) solutions out there.
Use Event Store (https://eventstore.org); you can read (https://eventstore.org/docs/getting-started/which-api-sdk/index.html) that when using the TCP client you can achieve 15,000-20,000 writes per second. If you ever need to do anything with the data, you can use projections or do transformations based on streams to populate any other datastore you wish.
You can even create a cluster.
If money plays no role, you can use TimesTen.
http://www.oracle.com/timesten/index.html
A complete in memory database, with amazing speed.
I would use the log file for this, but if you must use a database, I highly recommend Firebird. I just tested the speed: it inserts about 10k records per second on quite average hardware (a 3-year-old desktop computer). The table has one compound index, so I guess it would work even faster without it:
milanb#kiklop:~$ fbexport -i -d test -f test.fbx -v table1 -p **
Connecting to: 'LOCALHOST'...Connected.
Creating and starting transaction...Done.
Create statement...Done.
Doing verbatim import of table: TABLE1
Importing data...
SQL: INSERT INTO TABLE1 (AKCIJA,DATUM,KORISNIK,PK,TABELA) VALUES (?,?,?,?,?)
Prepare statement...Done.
Checkpoint at: 1000 lines.
Checkpoint at: 2000 lines.
Checkpoint at: 3000 lines.
...etc.
Checkpoint at: 20000 lines.
Checkpoint at: 21000 lines.
Checkpoint at: 22000 lines.
Start : Thu Aug 19 10:43:12 2010
End : Thu Aug 19 10:43:14 2010
Elapsed : 2 seconds.
22264 rows imported from test.fbx.
Firebird is open source, and completely free even for commercial projects.
I believe the answer will also depend on the hard disk type (SSD or not) and the size of the data you insert. I was inserting single-field data into MongoDB on a dual-core Ubuntu machine and was hitting over 100 records per second. I introduced some quite large data into a field and it dropped to about 9 per second, with the CPU running at about 175%! The box doesn't have an SSD, so I wonder whether I'd have done better with one.
I also ran MySQL, and it was taking 50 seconds just to insert 50 records into a table with 20m rows (and about 4 decent indexes), so with MySQL too it will depend on how many indexes you have in place.