We have a service which sees several hundred simultaneous connections throughout the day, peaking at about 2000, for about 3 million hits a day, and growing. With each request I need to log 4 or 5 pieces of data to MySQL. We originally used the logging that came with the app we were using; however, it was terribly inefficient, would run my DB server at more than 3x the average CPU load, and would eventually bring the server to its knees.
At this point we are going to add our own logging to the application (PHP). The only option I have for logging the data is the MySQL DB, as that is the only common resource available to all of the HTTP servers. This data will be mostly writes; however, every day we generate reports based on it, then crunch and archive the old data.
What recommendations can be made to ensure that I don't take down our services with logging data?
The solution we took for this problem was to create an archive table, then regularly (every 15 minutes, from an app server) crunch the data and move it into the tables used to generate reports. The archive table of course has no indexes; the tables the reports are generated from have several.
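For illustration, a minimal sketch of this pattern, assuming hypothetical log_archive and report_stats tables (the real table and column names aren't given above):

    -- Unindexed landing table: appends stay cheap no matter how many rows pile up.
    CREATE TABLE log_archive (
        logged_at DATETIME     NOT NULL,
        user_id   INT          NOT NULL,
        action    VARCHAR(32)  NOT NULL,
        detail    VARCHAR(255) NULL
    ) ENGINE=InnoDB;

    -- Run every 15 minutes from an app server: aggregate into the indexed
    -- reporting table (assumed to have a UNIQUE key on (stat_date, action)),
    -- then clear the landing table. In practice you would swap the table with
    -- RENAME TABLE, or restrict the crunch to rows older than a cutoff, so that
    -- rows inserted mid-crunch are not lost.
    INSERT INTO report_stats (stat_date, action, hits)
    SELECT DATE(logged_at), action, COUNT(*)
    FROM log_archive
    GROUP BY DATE(logged_at), action
    ON DUPLICATE KEY UPDATE hits = hits + VALUES(hits);

    TRUNCATE TABLE log_archive;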
Some stats on this approach:
Short Version: >360 times faster
Long Version:
The original code/model did direct inserts into the indexed table, and the average insert took .036 seconds; with the new code/model, inserts took less than .0001 seconds (I was not able to get an accurate fix on a single insert time, so I had to measure 100,000 inserts and average them). The post-processing (crunch) took an average of 12 seconds for several tens of thousands of records. Overall we were greatly pleased with this approach, and so far it has worked incredibly well for us.
Based on what you describe, I recommend you leverage the fact that you don't need to read this data immediately and pursue a "periodic bulk commit" route. That is, buffer the logging data in RAM on the app servers and do periodic bulk commits (see the sketch after the list below). If you have multiple application nodes, some sort of randomized approach helps even more (e.g., commit the buffered data every 5 +/- 2 minutes).
The main drawback of this approach is that if an app server fails, you lose the buffered data. However, that's only bad if (a) you absolutely need all of the data and (b) your app servers crash regularly. There's a small chance both are true, but if they are, you can simply persist the buffer to local disk (temporarily) on the app server if that's really a concern.
The main idea is:
buffering the data
periodic bulk commits (leveraging some sort of randomization in a distributed system would help)
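As a rough sketch of what each periodic flush could send (illustrative event_log table and columns), the point is one multi-row INSERT per flush instead of one INSERT per request:

    -- One statement per flush interval instead of one statement per request.
    INSERT INTO event_log (logged_at, user_id, action, detail)
    VALUES
        ('2024-01-01 12:00:01', 101, 'click', 'link_x'),
        ('2024-01-01 12:00:01', 102, 'view',  'page_y'),
        ('2024-01-01 12:00:02', 103, 'click', 'link_x');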
Another approach is to stop opening and closing connections, if possible (e.g., keep longer-lived connections open). While that's likely a good first step, it may require a fair amount of work on a part of the system that you may not have control over. But if you do, it's worth exploring.
Related
Situation:
The client is running a web-based finance application, where the primary functionality includes a huge volume of financial transactions, both in and out.
The processes are automated.
We run several cron job tasks at midnight to split the payments for appropriate customers.
Monthly, on average, we have 2,000 to 3,000 new customers, with 30,000 customers in total currently.
Our transactional tables have almost 900,000 records so far, and we expect a drastic increase in the coming months.
Technologies: Initially we used a LAMP environment, with the CodeIgniter framework, Laravel's Eloquent ORM for querying, and MySQL.
Hosting: Hosted in AWS, T2 small instance, no load balancer implemented.
This application was developed three years ago.
Problem:
Currently our client faces downtime during peak hours, and their customers face load-time issues while reviewing their transaction archives and stats.
They also fear that if the cron job tasks fail, they would not be able to handle the situation (vast calculations are made and amounts are inserted across a huge number of customers).
Our plan:
So right now, we plan to rework the application from scratch with performance and fault tolerance as our primary goals. This application has to stay reliable for at least another six to eight years.
Technologies: Node (Sails.js), Angular 5, AWS with load balancer, AWS RDS (MySQL)
Our approach: From our analysis, we found a few straightforward reasons for the performance loss. Primarily, there are many customer stats that hit heavy tables.
Most of the stats are for the current month, so we plan to add log tables for those and keep only the current month's data in the main table.
So there are going to be many such log tables, which will only see read operations.
Queries:
Is it good to split the read-only tables into a separate database, or can we keep them within a single database?
How does the MySQL buffer cache differ from Redis/memcached? Is there any memory consumption problem when more traffic flows in?
What is the best approach to truncating a few tables at the end of every month (as I mentioned regarding the log tables)?
Am I proceeding in the right direction?
A million rows is a modest size, not "huge". Since you are having performance problems, I have to believe that it stems from poor indexing and/or poor query formulation.
Find out what queries are having the most trouble. See this for suggestions on using mysqldumpslow -s t or pt-query-digest to locate them.
Provide SHOW CREATE TABLE and EXPLAIN SELECT ... for discussion of how to improve them. It may be as simple as adding a "composite" index.
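For instance (illustrative table and column names), a query that filters on one column and sorts on another often just needs one composite index covering both:

    -- See how the query is executed today.
    EXPLAIN SELECT amount, created_at
    FROM transactions
    WHERE customer_id = 42
    ORDER BY created_at DESC
    LIMIT 20;

    -- A composite index lets MySQL filter and sort from the index
    -- instead of scanning the table and filesorting.
    ALTER TABLE transactions
        ADD INDEX idx_customer_created (customer_id, created_at);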
Another possible performance bottleneck may be repeatedly summarizing old data. If this is the case, then consider the Data Warehousing technique of building and maintaining Summary Tables.
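A minimal sketch of a summary table, with illustrative names: the raw rows stay in the transaction table, and a small statement run from cron keeps per-day totals current so the reports never have to scan the big table:

    -- One row per customer per day.
    CREATE TABLE daily_customer_totals (
        stat_date   DATE          NOT NULL,
        customer_id INT           NOT NULL,
        txn_count   INT           NOT NULL,
        txn_amount  DECIMAL(14,2) NOT NULL,
        PRIMARY KEY (stat_date, customer_id)
    ) ENGINE=InnoDB;

    -- Run periodically; it recomputes the current day's totals each time,
    -- so it is idempotent. Past days are already final.
    INSERT INTO daily_customer_totals (stat_date, customer_id, txn_count, txn_amount)
    SELECT DATE(created_at), customer_id, COUNT(*), SUM(amount)
    FROM transactions
    WHERE created_at >= CURDATE()
    GROUP BY DATE(created_at), customer_id
    ON DUPLICATE KEY UPDATE
        txn_count  = VALUES(txn_count),
        txn_amount = VALUES(txn_amount);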
As for your 4 questions, I tentatively say "no" to each.
The various frameworks tend to make small applications easy to develop, but they start to give trouble when you scale. Still, there are things that can be fixed without (yet) abandoning the frameworks.
AWS, etc, give you lots of reliability and read scaling. But, I repeat, the likely place to look is at the slow queries, not the various ideas you presented.
As for periodic truncation, let's discuss that after seeing what the data looks like and what the business requirements are for data retention.
Our mobile app tracks user events (events can be of many types).
Each mobile device reports the user's events and can later retrieve them.
I thought of writing to both Redis and MySQL.
When a user makes a request:
1. Look it up in Redis.
2. If it's not in Redis, look it up in MySQL.
3. Return the value.
4. Update Redis if the value wasn't there.
5. Set an expiry policy on each key in Redis to avoid running out of memory.
Problem:
1. Reads: If many users at once request information that isn't in Redis, MySQL is going to be overloaded with reads (latency).
2. Writes: I am going to have lots of writes into MySQL, since every event is going to be written to both data sources.
Facts:
1. Expecting 10M concurrent users that both write and read.
2. Need to serve each request with a maximum latency of one second.
3. Expecting a couple of thousand requests per second.
Are there any solutions for this kind of mechanism that provide good QoS?
Is this in any way a Lambda architecture solution?
Thank you.
Sorry, but such complex issues rarely have a ready answer here. There are too many unknowns: what is your budget, and how much hardware do you have? Since 10 million clients use your service concurrently, your question is about hardware, not software.
There is no word here about several important requirements:
What is more important: consistency or availability?
What is the read/write ratio?
Read/write ratio requirement
If you have 10,000,000 concurrent users, that is a problem in itself. But if most of the traffic is reads, it's not as terrible as it may seem. In that case you should take care to have the right indexes in MySQL. Also, buy servers with plenty of RAM so you can keep at least the index data in memory. A single server can then handle 3,000-5,000 concurrent SELECT queries without any problem meeting the 1-second latency requirement (one of our statistics projects sustains up to 7,000 SELECTs per second per server on 4-year-old, ordinary hardware).
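On the indexing point, a sketch with illustrative names: a "covering" index lets a hot read query be answered from the index alone (EXPLAIN shows "Using index"), which helps keep most reads in RAM:

    -- Hot query: recent events for one user.
    SELECT event_type, created_at
    FROM user_events
    WHERE user_id = 12345
    ORDER BY created_at DESC
    LIMIT 50;

    -- Covering index: the query above can be served entirely from the index.
    ALTER TABLE user_events
        ADD INDEX idx_user_created_type (user_id, created_at, event_type);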
If you have a lot of writes, everything becomes more complicated, and consistency becomes the main question.
Consistency vs availability
If consistency is important, go shopping for new servers with SSD drives and modern CPUs, and do not forget to buy as much RAM as possible. Why? If you have a lot of write requests, your SQL server has to update indexes on every write, and you can't simply drop the indexes, because your read requests need them to stay within the latency requirement. By consistency I mean: if you write something, it should complete within 1 second, and if you read that data right after the write, you get the actual written information within 1 second.
Your problem 1:
Reads: If many users at once request information that isn't in Redis, MySQL is going to be overloaded with reads (latency).
This is the well-known "cache miss" problem, and it has only a few solutions: horizontal scaling (buy more hardware) or precaching. Precaching in this case can be done in at least 3 ways:
Use a non-blocking read and wait up to one second for the data to be queried from the SQL server; if it doesn't arrive in time, return the data from Redis. Update Redis immediately or via a queue, as you prefer.
Use a blocking/non-blocking read and return the data from Redis as fast as possible, but for every read push a job to a queue to refresh the cached data in Redis (you can also inform the app that it should re-query the data after some time).
Always read/write from Redis, but for every write request register a job in a queue to update the data in SQL.
Each of them is a compromise:
1. High availability, but consistency suffers (Redis is an LRU cache).
2. High availability, but consistency suffers (Redis is an LRU cache).
3. High availability and consistency, but it requires a lot of RAM for Redis.
Writes: I am going to have lots of writes into MySQL, since every event is going to be written to both data sources.
This is the field of compromise again. Lots of writes come down to hardware, so buy more of it, or use queues for pending writes. So it's availability vs consistency again.
Event tracking usually means you can return data close to real time, but not in real time. For example, allow 1-10 seconds of latency for updating the data on disk (MySQL) while keeping the 1-second latency for serving write/read requests.
So, it's a combination of techniques 1/2/3 (or some others) for data processing:
Use LRU eviction in Redis and do not use expiry; lots of expiring keys are a problem in themselves, so we can't rely on them to save RAM.
Use a queue to warm up missing keys in Redis.
Use a queue to write data into the MySQL server from the Redis server.
Use additional requests from the client side to update data when a cache-miss situation occurs.
Is there any software that does "lazy" deletion of rows from a table? I would like to do maintenance on my tables when my server is idle, and ideally I should be able to define what "idle" means (number of database connections / system load / requests per second). Is there anything remotely similar to this?
If you are on a Linux server, you can make your table cleanup scripts run only based on the output of the command "w", which shows you the system load. If your system load is under, say, 0.25, you can run your script. Do this with shell scripting.
To some degree, from an internal perspective InnoDB already does this. Rows are initially marked as deleted, but only made free as part of a background operation.
My advice: you can get into needlessly complicated problems if you try to first check whether the server is idle, i.e.:
What if it was idle, but the cleanup takes 2 minutes, and during those 2 minutes the server load peaks?
What if the server never becomes idle enough? Now you just have an unlimited backlog.
If you just background the task you might improve performance enough, since at least no users will be sitting in front of web pages waiting for it to complete. Look at activity graphs to find the best time to schedule it (3am, 5am, etc.).
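One common way to keep a backgrounded cleanup from hurting foreground traffic (a sketch, assuming an InnoDB table with a created_at column and an auto-increment id) is to delete in small batches rather than in one huge statement:

    -- Delete old rows in small chunks; each statement holds locks only briefly.
    -- Run it in a loop (e.g. from the cron script) until ROW_COUNT() returns 0.
    DELETE FROM hit_log
    WHERE created_at < NOW() - INTERVAL 30 DAY
    ORDER BY id
    LIMIT 1000;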
I have to implement a tracking system backed by a MySQL database. The system will track many apps, with at least 5 events tracked for each app (e.g., how many users clicked on link x, how many users visited page y). Some apps will have millions of users, so a few thousand updates per second is not a far-fetched assumption.
Another component of the system will have to compute some statistical info that should be updated every minute. The system should also record past values of those statistics.
The approach a friend of mine suggested was to log every event in a log table and have a cron job that runs every minute, computes the desired info, and updates a stats table.
This sounds reasonable to me. Are there better alternatives?
Thanks.
I've logged to a mysql log table with a cron that crunches it.
I generally use InnoDB tables in my apps, but for the log table I made it MyISAM and used INSERT DELAYED . . . queries.
MyISAM doesn't provide all the goodies of InnoDB, but I believe it is slightly faster (for that reason).
The main thing to worry about is database locking while your cron is running, but using INSERT DELAYED gets around that problem for the most part.
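For reference, a sketch of that setup with illustrative names. Note that INSERT DELAYED was deprecated and is ignored by newer MySQL versions, so this only applies to older servers:

    -- MyISAM log table: cheap appends, no transactions.
    CREATE TABLE hit_log (
        logged_at DATETIME    NOT NULL,
        app_id    INT         NOT NULL,
        event     VARCHAR(32) NOT NULL
    ) ENGINE=MyISAM;

    -- The client gets control back immediately; the server queues the write.
    INSERT DELAYED INTO hit_log (logged_at, app_id, event)
    VALUES (NOW(), 7, 'click_x');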
If your hit rate is too high for even INSERT DELAYED into a MyISAM table to handle, you may want to keep recent hits in memory (memcached can come in handy, or a custom daemon you can write) and periodically process the hits from memory into the aggregated database stats table.
I would really recommend using an existing log analyzer on the logs your web server already produces. One example is Webalizer. Even better, in my opinion, is an external system such as Google Analytics. This works better since it keeps working with intermediate systems such as load balancers and caches in place.
I'm working on a program to automatically find optimal shift assignments, subject to lots of constraints. I'm using grails, i.e. the data about workers, shifts and assignments will be kept in a DBMS.
For the optimization itself, I'll have to work very intensively on a small subset of the data (about 600 rows total from about 5 different tables). I'll have to iterate over and search through various sub-subsets dozens of times to compute fitness functions, change some values, compute fitness again, lather, rinse, repeat, perhaps hundreds of times.
Now, while searching and iteration are exactly what a DBMS is for, I believe that in this case the overhead of hundreds of DB requests would dwarf the actual work being done, even for an in-memory DBMS like HSQLDB. So instead, I'm planning to slurp the entire subset into memory at the beginning, build my own indexes (HashMap, mainly) for the lookups I'll have to do, and then work only with those, staying away from the DB until I'm done and write my result to it.
Is this a sound approach? Any better ideas?
I'm assuming you must issue hundreds of commands to the database? There's no way to execute the code inside the DB?
The main thing I'd be worried about is integrity; make sure you handle locking correctly. You'd probably want a version number stored somewhere so you don't need to lock the entire set of data for the duration of processing. In the update transaction, you'd first ensure the version number is the same as when you started reading.
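A sketch of that optimistic check in SQL, with illustrative table and column names: read the version together with the data, then make the final write-back conditional on it:

    -- When loading the working set, remember its current version.
    SELECT version INTO @v FROM schedule_versions WHERE schedule_id = 1;

    -- When writing the computed assignments back, bump the version only if
    -- nobody changed it in the meantime; 0 rows affected means "reload and retry".
    START TRANSACTION;
    UPDATE schedule_versions
    SET version = version + 1
    WHERE schedule_id = 1 AND version = @v;
    -- ...if the UPDATE matched a row, write the new assignments here...
    COMMIT;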
Finally, benchmark it. I've done some apps over the last year or so that had a similarly intensive compute process per request. Using in-process objects to represent the data was orders of magnitude more efficient than hitting the database per request. But every app is different, and there might be things we haven't considered that will impact yours.