We are planning to rewrite a legacy system that uses a MySQL InnoDB database, and we are trying to analyse the main bottlenecks that should be avoided in the next version.
The system has many services/jobs that run overnight and generate data (inserts/updates); these are what mainly need to be optimized. The jobs currently run 2-3 hours on average.
We have already gathered the long-running queries that must be optimized.
But I am wondering whether it is possible to gather information and statistics about long-running transactions.
It would be very helpful to know which tables are locked by transactions the most: average locking time, lock type, and periods.
Could somebody recommend a tool or script that can gather such information?
Or maybe someone can share their own experience with database analysis and optimization?
MySQL has a built-in capability for capturing "slow" query statistics (but to get an accurate picture you need to set the slow threshold to 0). You can turn the log into useful information with mysqldumpslow (bundled with MySQL). I like the Percona Toolkit, but there are lots of other tools available.
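A minimal sketch of that setup at runtime (the log path is an assumption; adjust it to your server). The INNODB_TRX view also answers the original question about transactions that are currently running long:

-- log every statement (threshold 0); expect extra I/O while this is on
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 0;
SET GLOBAL slow_query_log_file = '/var/lib/mysql/slow.log';  -- assumed path

-- currently open transactions, oldest first, with how many rows each has locked
SELECT trx_id, trx_started, trx_rows_locked, trx_query
FROM information_schema.INNODB_TRX
ORDER BY trx_started;

Afterwards the log can be summarized with mysqldumpslow -s t /var/lib/mysql/slow.log (sorted by total time).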
We have the Percona monitoring tool for monitoring a MySQL DB; generating a slow-log report does not give us instant results. Is there a better approach using metrics/PromQL queries, Query Analytics, etc., where we can get the min, max, and average time of critical queries?
min,max,average -- These don't make sense until you have enough samples to take min,max,average against. Rethink the need for "instant results".
pt-query-digest could be run daily or hourly (or whatever) to get results for the "recent past".
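For example, a cron entry along these lines (the schedule, paths, and output file are assumptions, not from the post) digests only the last hour of the slow log:

# hypothetical hourly cron line: summarize only the last hour of slow-log entries
0 * * * * pt-query-digest --since 1h /var/lib/mysql/slow.log > /var/log/ptqd-hourly.txt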
A lot of metrics can be graphed by the various monitoring tools available from Percona, MariaDB, and Oracle, plus others. Some cost money. Some come "close" to "instant results" even for slow queries.
Please describe your goal in different words; we may be able to better direct you.
Metrics like SHOW GLOBAL STATUS LIKE 'Threads_running'; (or a graph monitoring that) can spot a spike in real time. But there is nothing actionable in knowing only that there is a spike.
I prefer looking at the slowlog afterward. The "worst" queries are readily identified by pt-query-digest. Spikes and bottlenecks can be identified, but not until they are "finished".
Deadlocks come from the hard-to-parse SHOW ENGINE INNODB STATUS;, but only one at a time and after the fact.
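If you need more than the single most recent deadlock, a small sketch (assuming MySQL 5.6+, where this variable exists):

-- write every detected deadlock to the error log, not just the latest one
SET GLOBAL innodb_print_all_deadlocks = ON;

-- the most recent deadlock is still shown in the LATEST DETECTED DEADLOCK section of
SHOW ENGINE INNODB STATUS;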
In a not-well-tuned system, the first entry in pt-query-digest is (sometimes) a query that consumes over 50% of the system resources. Fixing that one query makes a big difference. Very cost-effective.
Situation:
The client is running a web-based finance application, where the primary functionality involves a huge volume of financial transactions, both in and out.
The processes are automated.
We run several cron job tasks at midnight to split the payments for the appropriate customers.
Monthly, on average, we get 2,000 to 3,000 new customers, with a total of 30,000 customers currently.
Our transactional tables have almost 900,000 records so far, and we expect a drastic increase in the coming months.
Technologies: Initially we used a LAMP environment, with the CodeIgniter framework, the Laravel Eloquent ORM for querying, and MySQL.
Hosting: Hosted on AWS, a T2 small instance, with no load balancer implemented.
This application was developed three years back.
Problem:
Currently our client faces downtime during peak hours, and their customers face load-time issues while reviewing their transaction archives and stats.
They also fear that if the cron job tasks fail, they will not be able to handle the situation (vast calculations are made and amounts are inserted across a huge volume of customers).
Our plan:
So right now, we plan to rework the application from scratch, with performance and fault tolerance as our primary goals. The application has to be reliable for at least another six to eight years.
Technologies: Node (Sails.js), Angular 5, AWS with a load balancer, AWS RDS (MySQL)
Our approach: From our analysis, we identified a few straightforward reasons for the performance loss. Primarily, there are many customer stats that access heavy tables.
Most of the stats are for the current month, so we plan to add log tables for those and keep only the current month's data in the main table.
So there are going to be many such log tables, which will only receive read operations.
Queries:
Is it good to split the read-only tables into a separate database, or can we keep them within a single database?
How does the MySQL buffer cache differ from Redis/memcached? Is there any memory-consumption problem when more traffic flows in?
What is the best approach to truncating a few tables at the end of every month (as I mentioned about the log tables)?
Am I proceeding in the right direction?
A million rows is a modest size, not "huge". Since you are having performance problems, I have to believe that they stem from poor indexing and/or poor query formulation.
Find out what queries are having the most trouble. See this for suggestions on using mysqldumpslow -s t or pt-query-digest to locate them.
Provide SHOW CREATE TABLE and EXPLAIN SELECT ... for discussion of how to improve them. It may be as simple as adding a "composite" index.
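As a hedged illustration of that workflow (the table and column names here are hypothetical, not taken from the question):

-- inspect the current definition and the optimizer's plan
SHOW CREATE TABLE payments;
EXPLAIN SELECT customer_id, SUM(amount)
FROM payments
WHERE customer_id = 42 AND paid_at >= '2018-06-01'
GROUP BY customer_id;

-- if EXPLAIN shows a full scan, a composite index covering the WHERE columns often helps
ALTER TABLE payments ADD INDEX idx_customer_paid_at (customer_id, paid_at);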
Another possible performance bottleneck may be repeatedly summarizing old data. If this is the case, then consider the Data Warehousing technique of building and maintaining Summary Tables.
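A minimal sketch of such a summary table, again with hypothetical names, refreshed once per day (e.g. from a nightly job):

-- one row per customer per day instead of re-scanning the detail table
CREATE TABLE payments_daily_summary (
  day DATE NOT NULL,
  customer_id INT NOT NULL,
  payment_count INT NOT NULL,
  total_amount DECIMAL(14,2) NOT NULL,
  PRIMARY KEY (day, customer_id)
);

INSERT INTO payments_daily_summary (day, customer_id, payment_count, total_amount)
SELECT DATE(paid_at), customer_id, COUNT(*), SUM(amount)
FROM payments
WHERE paid_at >= CURDATE() - INTERVAL 1 DAY AND paid_at < CURDATE()
GROUP BY DATE(paid_at), customer_id
ON DUPLICATE KEY UPDATE
  payment_count = VALUES(payment_count),
  total_amount  = VALUES(total_amount);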
As for your 4 questions, I tentatively say "no" to each.
The various frameworks tend to make small applications easy to develop, but they start to give trouble when you scale. Still, there are things that can be fixed without abandoning (yet) the frameworks.
AWS, etc, give you lots of reliability and read scaling. But, I repeat, the likely place to look is at the slow queries, not the various ideas you presented.
As for periodic truncation, let's discuss that after seeing what the data looks like and what the business requirements are for data retention.
I want to check the performance of my database in MySQL. I googled and came to know about SHOW FULL PROCESSLIST and other such commands, but they are not very clear to me. I just want to know and calculate the performance of the database in terms of how much heap memory it is taking, and other such metrics.
Is there any way to know and assess the performance of the database, so that I can optimize and improve it?
Thanks in advance
The basic tool is MySQL Workbench which will work with any recent version of MySQL. It's not as powerful as the enterprise version, but is a great place to start.
The configuration can be exposed with SHOW VARIABLES and the current state of the system is exposed through SHOW STATUS. These status numbers are what ends up being graphed in most tools.
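For the memory question specifically, a couple of standard starting points (the InnoDB buffer pool is usually the biggest consumer):

-- how much memory the InnoDB buffer pool is allowed to use
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- how much of it currently holds data, and how often reads are served from memory
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_bytes_data';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';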
Don't forget that you can do a lot of monitoring on the application side, turning on database logs for instance. Barring that you can enable the "slow query" log in MySQL to check which queries are having the most impact. These can then be diagnosed with EXPLAIN.
Download mysql enterprise tools. They will allow you to monitor load on the server as well as performance of individual queries.
You can use the open-source tools from Percona, called Percona Toolkit, which include useful tools that can help you efficiently archive rows, find duplicate indexes, summarize MySQL servers, analyze queries from logs and tcpdump, and collect vital system information when problems occur.
You can try experimenting with the Performance Schema tables available from MySQL 5.6 onwards, which can give detailed query and database statistics.
http://www.markleith.co.uk/2012/07/04/mysql-performance-schema-statement-digests/
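For example, a sketch of the kind of statement-digest query that post describes (a standard Performance Schema table in 5.6+):

-- the statements that have consumed the most total time since the server started
SELECT digest_text,
       count_star,
       sum_timer_wait / 1e12 AS total_seconds
FROM performance_schema.events_statements_summary_by_digest
ORDER BY sum_timer_wait DESC
LIMIT 10;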
I'm really new to server-side work and I've been administering a dedicated server for my website.
At peak hours the website is very slow; in order to find out why, I've been monitoring it using htop.
Some really long-lasting (more than a few hours) MySQL processes use up to 95% of the CPU!
The thing is, I don't know what queries might be the cause of it, nor how to monitor it.
I do have a cron job every quarter of an hour that sometimes takes a long time to run, but the time of the slowdown does not always match the cron's.
I've heard of a solution using a cron job that kills overly long MySQL processes, but wouldn't that cause discrepancies in the DB?
Configure MySQL server to log slow queries, then look at them. You need to understand why these queries are so slow.
Most likely, these queries can be sped up by adding proper indexes, but this is not a hard and fast rule - you need to understand what is really happening.
You can kill long-running queries, but if you use the MyISAM engine it can corrupt your tables. In that case, seriously consider switching to the InnoDB engine. With a transactional engine like InnoDB, currently running transactions will likely be rolled back, but data should not be corrupted.
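If you do decide to kill runaway statements by hand, a hedged sketch of the usual procedure (on InnoDB, the killed statement's transaction is rolled back):

-- find statements that have been running for more than an hour
SELECT id, user, time, info
FROM information_schema.PROCESSLIST
WHERE command = 'Query' AND time > 3600;

-- terminate just the statement (the connection stays open); 12345 is the id reported above
KILL QUERY 12345;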
Is there any way to see an overview of which kinds of queries the most time is spent on every day in MySQL?
Yes, MySQL can create a slow query log. You'll need to start mysqld with the --log-slow-queries flag (in MySQL 5.6 and later, use --slow-query-log together with --slow-query-log-file instead):
mysqld --log-slow-queries=/path/to/your.log
Then you can parse the log using mysqldumpslow:
mysqldumpslow /path/to/your.log
More info is here (http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html).
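For example (the path is whatever you configured above; -s t sorts by total query time and -t 10 keeps the ten worst entries):

mysqldumpslow -s t -t 10 /path/to/your.log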
You can always set up query logging as described here:
http://dev.mysql.com/doc/refman/5.0/en/query-log.html
It depends on what you mean by 'most time'. There may be thousands if not hundreds of thousands of queries which take very little time each, but consume 90% of CPU/IO bandwidth. Or there may be a few huge outliers.
There are tools for performance monitoring and analysis, such as the built-in PERFORMANCE_SCHEMA, the enterprise tools from the Oracle/MySQL team, and online services like New Relic, which can track the performance of an entire application stack.