Lazy deletion of table rows - MySQL

Is there any software that does "lazy" deletion of rows from a table? I would like to do maintenance of my tables when my server is idle, and ideally I should be able to define what "idle" means (number of database connections, system load, requests per second). Is there anything remotely similar to this?

If you are on a Linux server, you can make your table cleanup scripts run based on the output of the command "w", which shows the system load. If the load is under, say, 0.25, run your script. Do this with shell scripting.

To some degree, InnoDB already does this internally. Rows are initially only marked as deleted; the space is freed later by a background purge operation.
My advice: you can get into needlessly complicated territory if you first try to check whether the server is idle. For example:
What if it was idle, but the cleanup takes 2 minutes, and during those 2 minutes the server load peaks?
What if the server never becomes idle enough? Now you have an unbounded backlog.
If you just background the task you might improve performance enough, since at least no users will be sitting in front of web pages waiting for it to complete. Look at activity graphs to find the best time to schedule it (3am, 5am, etc.).
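A minimal sketch of that approach in SQL, assuming a hypothetical table my_table with an is_deleted flag; the event name, start time, and batch size are all placeholders:

-- The event scheduler must be enabled for events to fire:
SET GLOBAL event_scheduler = ON;

-- Off-peak daily cleanup; the LIMIT keeps each run's locking short.
CREATE EVENT nightly_cleanup
ON SCHEDULE EVERY 1 DAY
STARTS '2024-01-01 03:00:00'
DO
DELETE FROM my_table WHERE is_deleted = 1 LIMIT 10000;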

How do I see which database is running faster after making changes to the my.cnf file?

Background
I have 5 servers all running essentially the same site, but I have had difficulties with database speed. Part of the process has led me to make changes to one of my my.cnf files to improve performance.
Problem
I am having difficulty telling whether the settings are making any difference at all. I have restarted the MySQL service and even rebooted the entire server; the variables show up as changed, but I don't see any noticeable difference when accessing my site. I would like a way to quantify how fast my database is without relying on the front end of the app, so I can show my boss real figures for database speed instead of looking at the Google console for load speeds.
Research
I thought there might be some tool in phpMyAdmin to help track speed, but after going through the different tabs I couldn't find anything. All of the other online resources I have looked at seem to just talk about "expected results" instead of how to test directly.
Question
Is there a way to get speed information directly from the database (or phpMyAdmin) instead of using the front end of the web app?
The optimal realistic benchmark goes something like this:
Capture a copy of the live dataset.
Turn on the general log.
Run for some period of time (say, an hour).
Turn off the general log.
That gives you two things: a starting point and a list of realistic instructions. Now to replay:
Load the data on a test machine.
Change some my.cnf setting.
Apply the captured general log, timing how long it takes.
Replay with another setting change; see if the timing is more than trivially faster or slower.
Even better would be to arrange for the replay to be multi-threaded like the real product.
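For the capture steps, the general log can be toggled at runtime; a sketch, where the log file path is an assumed example:

-- Route the general log to a file and start capturing:
SET GLOBAL log_output = 'FILE';
SET GLOBAL general_log_file = '/var/log/mysql/capture.log';  -- assumed path
SET GLOBAL general_log = 'ON';

-- ... let an hour of representative traffic run ...

SET GLOBAL general_log = 'OFF';

The replay timing itself happens outside SQL: feed the captured statements back through the client on the test machine and record the wall-clock time of each run.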
Caveat: some settings won't make any difference until the size of something (data, index, number of connections, etc.) exceeds some threshold. Only at that point will the setting show a difference. This benchmark method fails to predict such cases.
If you would like an independent review of your my.cnf, please provide these:
How much RAM you have
SHOW VARIABLES;
SHOW GLOBAL STATUS; -- after mysqld has been running at least a day.
I will compute a couple hundred formulas and judge which ones are red flags.

How expensive are MySQL events?

In my web app I use two recurring events that "clean up" one of the tables in the database, both executed every 15 minutes or so.
My question is, could this lead to performance problems in the future? I've read somewhere (I don't recall where exactly) that MySQL events are supposed to be scheduled to run once a month or so. Thing is, these same events keep the table at a pretty reduced size (as they delete records older than ~15 minutes), so maybe that compensates for the frequency of their execution, right?
Also, is it better to have one big MySQL event or many small ones if they are called at the same frequency?
I don't think the monthly figure is a performance guideline, just a suggestion of what you might do with events. So I think you're OK doing your cleanup using events.
In the end, the documentation describes events this way:
Conceptually, this is similar to the idea of the Unix crontab (also known as a “cron job”) or the Windows Task Scheduler.
And the concept for those is that you can run a task every minute if you wish to do so.
On the second part of that question:
Serialize them or spread them out. If you split the work into many events that all run at the same time, you will create spikes of possibly very high CPU usage that might slow down the application while the events are processed.
So either pack everything into one event so it runs in succession, or spread the individual events out so they execute at different times within the 15-minute timeframe. Personally I think the first option is preferable: pack them into a single event, since then they are guaranteed to run in succession, even if one of them runs longer than usual.
The same goes for cron jobs. If you schedule 30 long-running exports at a single time, your application is going to fail miserably during that timeslot (learned that the hard way).
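A sketch of the single packed event described above, with placeholder table and column names and a 15-minute retention window (the DELIMITER lines are for the mysql command-line client, and event_scheduler must be ON):

DELIMITER //
CREATE EVENT cleanup_expired_rows
ON SCHEDULE EVERY 15 MINUTE
DO
BEGIN
  -- Both cleanups run in succession inside the one event:
  DELETE FROM sessions_tmp WHERE created_at < NOW() - INTERVAL 15 MINUTE;
  DELETE FROM tokens_tmp WHERE created_at < NOW() - INTERVAL 15 MINUTE;
END//
DELIMITER ;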

MySQL online vs. batch processing - options for preventing MySQL cron jobs from blocking online queries?

Is there another way, other than setting query priority, to prevent nightly cron jobs that do batch processing against MySQL from impacting online webserver-to-MySQL queries? I'm thinking there may be a way to segment these, but I'm not sure if that is possible.
Try breaking the queries down: rather than processing lots of data in one go, process smaller batches more often. This way you lock tables for less time and leave gaps in which frontend queries can be executed.
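A sketch of that chunking idea as a stored procedure; the table name, flag column, batch size, and pause are all placeholders:

DELIMITER //
CREATE PROCEDURE purge_in_batches()
BEGIN
  DECLARE rows_deleted INT DEFAULT 1;
  WHILE rows_deleted > 0 DO
    -- Delete a small chunk so locks are held only briefly:
    DELETE FROM batch_table WHERE processed = 1 LIMIT 5000;
    SET rows_deleted = ROW_COUNT();
    -- Leave a gap for frontend queries to get in:
    DO SLEEP(0.5);
  END WHILE;
END//
DELIMITER ;

The nightly cron job can then run CALL purge_in_batches(); and trickle the work out instead of holding one long lock.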
Another solution would be to process more often, even during the day. My last project used an event system: when a user commented on something, an event would go into a queue. A background process (executed from The Fat Controller) would then take that event and insert rows so that all the user's friends' news feeds were updated about the comment. That way, feeds are updated by simple insert statements rather than rebuilt from scratch every x hours.
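A sketch of what that queue consumer might run, against an entirely hypothetical schema (event_queue, friends, news_feed) and assuming a single consumer:

-- Claim the pending batch first so rows arriving mid-run are untouched:
UPDATE event_queue SET state = 2 WHERE state = 0;  -- 0 = new, 2 = claimed

-- Fan each queued comment out to the commenter's friends' feeds:
INSERT INTO news_feed (user_id, event_type, comment_id, created_at)
SELECT f.friend_id, 'comment', q.comment_id, NOW()
FROM event_queue q
JOIN friends f ON f.user_id = q.user_id
WHERE q.state = 2;

UPDATE event_queue SET state = 1 WHERE state = 2;  -- 1 = done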

Most efficient method of logging data to MySQL

We have a service which sees several hundred simultaneous connections throughout the day, peaking at about 2000, for about 3 million hits a day, and growing. With each request I need to log 4 or 5 pieces of data to MySQL. We originally used the logging that came with the app we were using; however, it was terribly inefficient, ran my DB server at more than 3x the average CPU load, and would eventually bring the server to its knees.
At this point we are going to add our own logging to the application (PHP). The only option I have for logging data is the MySQL DB, as this is the only common resource available to all of the HTTP servers. This data will be mostly writes; however, every day we generate reports based on the data, then crunch and archive the old data.
What recommendations can be made to ensure that I don't take down our services with logging data?
The solution we took for this problem was to create an archive table, then regularly (every 15 minutes, on an app server) crunch the data and put it back into the tables used to generate the reports. The archive table of course did not have any indexes; the tables from which the reports are generated have several.
Some stats on this approach:
Short Version: >360 times faster
Long Version:
The original code/model did direct inserts into the indexed table, and the average insert took 0.036 seconds; using the new code/model, inserts took less than 0.0001 seconds (I could not get an accurate fix on a single insert, so I had to measure 100,000 inserts and average the time). The post-processing (crunch) took an average of 12 seconds for several tens of thousands of records. Overall we were greatly pleased with this approach, and so far it has worked incredibly well for us.
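A sketch of that archive-and-crunch pattern; every name is a placeholder, and the atomic RENAME TABLE swap is one common way to grab a 15-minute window of rows:

-- Fast path: writes land in an archive table with no secondary indexes.
CREATE TABLE log_archive (
  logged_at DATETIME NOT NULL,
  user_id INT NOT NULL,
  action VARCHAR(32) NOT NULL,
  detail VARCHAR(255)
) ENGINE=InnoDB;

-- Every 15 minutes: swap in an empty table atomically, then crunch
-- the captured rows into the indexed reporting table (assumed to have
-- a unique key on (action_date, action)):
CREATE TABLE log_archive_new LIKE log_archive;
RENAME TABLE log_archive TO log_archive_crunch, log_archive_new TO log_archive;

INSERT INTO report_actions (action_date, action, hits)
SELECT DATE(logged_at), action, COUNT(*)
FROM log_archive_crunch
GROUP BY DATE(logged_at), action
ON DUPLICATE KEY UPDATE hits = hits + VALUES(hits);

DROP TABLE log_archive_crunch;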
Based on what you describe, I recommend you leverage the fact that you don't need to read this data immediately and pursue a "periodic bulk commit" route. That is, buffer the logging data in RAM on the app servers and do periodic bulk commits. If you have multiple application nodes, some sort of randomized approach helps even more (e.g., commit the buffered info every 5 +/- 2 minutes).
The main drawback with this approach is that if an app server fails, you lose the buffered data. However, that's only bad if (a) you absolutely need all of the data and (b) your app servers crash regularly. Small chance that both are true, but in the event they are, you can simply persist your buffer to local disk (temporarily) on an app server if that's really a concern.
The main idea is:
buffer the data
do periodic bulk commits (leveraging some randomization in a distributed system would help)
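What one flushed commit might look like, with a hypothetical request_log table; a single multi-row INSERT replaces hundreds of per-hit round trips:

INSERT INTO request_log (logged_at, user_id, url, status, duration_ms) VALUES
('2024-01-01 12:00:01', 17, '/home', 200, 12),
('2024-01-01 12:00:01', 42, '/login', 200, 48),
('2024-01-01 12:00:02', 17, '/feed', 200, 31);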
Another approach is to stop opening and closing connections where possible (e.g., keep longer-lived connections open). That is likely a good first step, but it may require a fair amount of work on a part of the system that you may not have control over. If you do, though, it's worth exploring.

MySQL active connections at once, Windows Server

I have read every possible answer to this and searched via Google for the correct answer to the following question, but I am rather a novice and can't seem to get a clear understanding.
A lot of what I've read has to do with web servers, but I don't have a web server; I have an intranet database.
I have a MySQL database on a Windows server at work.
I will have many users accessing this database constantly to perform simple queries and write new records back to it.
The read/write load will not be that heavy (chances are 50-100 users will do so at exactly the same time, even if thousands could be connected).
The GUI will be via Excel forms and/or Access.
What I need to know is the maximum number of active connections I can have at any given time to the database.
I know I can change the number in MySQL Admin; however, I really need to know what will actually work.
I don't want to allow 1000 users if the system can only really handle 100 correctly (beyond that point, although connected, performance would be too slow, for example).
Any ideas or first-hand experiences would be appreciated.
This depends mainly on your server hardware (RAM, CPU, networking) and on the load from other processes, if the server is not dedicated to the database. There is no absolute answer; the best way to find out is testing.
I think something like 1000 should work fine, as long as you use a 64-bit MySQL server. With 32-bit, too many connections may create virtual-memory pressure: each connection has its own thread, and every thread needs a stack, so the stack memory reduces the possible size of the buffer pool and other buffers.
MySQL generally does not slow down if you have many idle connections; however, special commands that enumerate every connection, e.g. SHOW PROCESSLIST or KILL, will be somewhat slower.
If a connection stays idle for too long (its idle time exceeds the wait_timeout parameter), it is dropped by the server. If that could happen in your scenario, you might want to increase wait_timeout (its default value is 8 hours).
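A few statements for inspecting and adjusting those limits; note that SET GLOBAL lasts only until the next restart, so mirror any change in my.ini as well:

-- Current ceiling and live counters:
SHOW VARIABLES LIKE 'max_connections';
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SHOW GLOBAL STATUS LIKE 'Threads_connected';

-- Raise the connection ceiling and the idle timeout:
SET GLOBAL max_connections = 1000;
SET GLOBAL wait_timeout = 28800;  -- seconds; 28800 = the 8-hour default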