I'm a little confused about how to find the most frequently used queries in MySQL. I'm looking for these queries because I want to use memcache in a PHP application. Is the only way really to log every query to a log file and then work on that afterwards?
Using the General Query Log is the only way to do that; I couldn't find any other. If the purpose is to optimize slow queries, then you can use the Slow Query Log to find the slowest ones.
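Once the general log is enabled, a small script can surface the most frequent queries. Here is a minimal Python sketch; the log-line format is assumed from typical general-log output, and literals are collapsed so that queries differing only in values are counted together:

```python
from collections import Counter
import re

def normalize(query: str) -> str:
    """Collapse literals so similar queries group together."""
    query = re.sub(r"'[^']*'", "?", query)   # string literals
    query = re.sub(r"\b\d+\b", "?", query)   # numeric literals
    return re.sub(r"\s+", " ", query).strip().lower()

def top_queries(log_lines, n=10):
    """Count normalized Query entries from a MySQL general query log."""
    counts = Counter()
    for line in log_lines:
        # General-log lines look roughly like:
        # "2023-01-01T12:00:00Z  5 Query  SELECT * FROM users WHERE id = 7"
        m = re.search(r"\bQuery\s+(.*)", line)
        if m:
            counts[normalize(m.group(1))] += 1
    return counts.most_common(n)
```

The resulting counts tell you which query shapes are worth caching in memcache first.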
Related
I'm looking to optimize my SQL queries for a growing website based on CakePHP. I can optimize things using recursive = -1, for example, but before going further, I think it'd be helpful to know which queries are taking the most time.
Is there a simple way to log the time queries are taking on a production site? The idea of adding code around each find() makes me want to quit before I start, and it doesn't look like the beforeFind and afterFind functions carry enough information to track which "after" corresponds to which "before".
Thanks in advance!
Simply use the Debug Kit plugin for CakePHP, or use the logging of your DB server. MySQL can even be configured to log only slow queries.
https://github.com/cakephp/debug_kit
https://book.cakephp.org/3.0/en/debug-kit.html
https://dev.mysql.com/doc/refman/8.0/en/query-log.html
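As a concrete example, slow-query logging can be switched on in the server configuration. The path and threshold below are illustrative, not required values:

```ini
# my.cnf -- log queries slower than 1 second
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time     = 1
log_queries_not_using_indexes = 1
```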
I'm using the slow query log, and sometimes the general log, to debug problems, but so many of my queries are written similarly across my application that I can't tell where in the code a given query originates. Is there a way to append something to each query that I can use for identification?
For example, could I add a tag like "lookingforuserswithreferrals" to the query (one that doesn't actually do anything) and then search for that keyword in my application?
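Yes: SQL comments generally pass through to the general and slow logs when queries are sent from an application driver, so you can prepend an identifying comment to each query. A minimal sketch of such a tagging helper (the function name and tag convention are my own, not a standard API):

```python
def tag_query(sql: str, tag: str) -> str:
    """Prepend an identifying comment so the query can be traced in logs."""
    return f"/* {tag} */ {sql}"

# The comment is ignored by MySQL but shows up in the logged statement,
# so grepping the log (or the codebase) for the tag finds both ends.
tagged = tag_query("SELECT * FROM users WHERE referrer_id IS NOT NULL",
                   "lookingforuserswithreferrals")
```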
I want to scale a Rails application by distributing its database across machines based on its authorization rules (location and user roles), so that any resource attributed to a location sits in a database dedicated to that location.
Should I get down to writing raw SQL, use something like the Sequel gem, or keep the niceness and magic of ActiveRecord?
It is true that raw SQL executes faster than ActiveRecord's nice magical queries. However, when you talk about scaling, there is also the question of how manageable the queries will remain once the application grows really large.
By far, a lot of complicated database operations can be managed well by caching, proper indexing, and proper eager loading. In some cases MySQL views also help performance, and Rails handles MySQL views fairly well. After that, if you are able to corner the really slow queries, it might be worth converting them to raw SQL and saving some time. Rails also offers caching of database queries, and MySQL has a caching mechanism of its own. Before executing raw SQL directly, I would make sure these options (and actually many more, like avoiding unnecessary joins, since a join is a resource-intensive operation) cannot give me what I am looking for. Hope this helps.
It sounds like you are partitioning your database, which Sequel has built-in support for (http://sequel.rubyforge.org/rdoc/files/doc/sharding_rdoc.html). I recommend using Sequel, but as the lead developer, I'm biased.
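For reference, here is roughly what Sequel's sharding setup looks like; the connection URLs and shard names below are hypothetical, and the linked documentation describes the real options:

```ruby
require 'sequel'

# One default connection plus named per-location shards, which inherit
# the default options except where overridden.
DB = Sequel.connect('postgres://app@db-default/app',
  servers: {
    us_east: { host: 'db-us-east' },
    eu_west: { host: 'db-eu-west' }
  })

# Route a query to the shard for a given location:
# DB[:resources].server(:us_east).where(user_id: 42).all
```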
The question is about best practice.
How do you perform a reliable SQL query test?
That is, the question is about optimizing the DB structure and the SQL query itself, not about system and DB performance, buffers, or caches.
When you have a complicated query with a lot of joins etc., one day you need to understand how to optimize it, and you come to the EXPLAIN command (MySQL's EXPLAIN, PostgreSQL's EXPLAIN) to study the execution plan.
After tuning the DB structure, you execute the query to see any performance change, but at that point you are at the mercy of multiple levels of optimization, buffering, and caching. How do you avoid this? I need the pure query execution time and to be sure it is not affected.
If the practice differs between servers, please specify explicitly: MySQL, PostgreSQL, MSSQL, etc.
Thank you.
For Microsoft SQL Server you can use DBCC FREEPROCCACHE (to drop compiled query plans) and DBCC DROPCLEANBUFFERS (to purge the data cache) to ensure that you are starting from a completely uncached state. Then you can profile both uncached and cached performance, and determine your performance accurately in both cases.
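Put together, a timing session might look like the following. The CHECKPOINT and SET STATISTICS TIME steps are not mentioned above but are commonly paired with these commands; do not run this on a production server, since every subsequent query pays the warm-up cost:

```sql
CHECKPOINT;              -- flush dirty pages so the buffer purge is complete
DBCC DROPCLEANBUFFERS;   -- purge the data cache
DBCC FREEPROCCACHE;      -- drop compiled query plans
SET STATISTICS TIME ON;  -- report parse/compile and execution times
-- ... run the query under test ...
```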
Even so, a lot of the time you'll get different results at different times depending on how complex your query is and what else is happening on the server. It's usually wise to test performance multiple times in different operating scenarios to be sure you understand what the full performance profile of the query is.
I'm sure many of these general principles apply to other database platforms as well.
In the PostgreSQL world you need to flush the database cache as well as the OS cache as PostgreSQL leverages the OS caching system.
See this link for some discussions.
http://archives.postgresql.org/pgsql-performance/2010-08/msg00295.php
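On Linux that typically means stopping the server and dropping the kernel's page cache. A sketch, assuming a systemd-based system with root access (the service name may differ on your distribution):

```shell
sudo systemctl stop postgresql
sync                                          # write dirty pages to disk
echo 3 | sudo tee /proc/sys/vm/drop_caches    # drop page cache, dentries, inodes
sudo systemctl start postgresql
```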
Why do you need pure execution time? It depends on so many factors that it is almost meaningless on a live server. I would recommend collecting some statistics from the live server and analyzing query execution times with the pgFouine tool (for PostgreSQL), then making decisions based on that. You will see in its report exactly what you need to tune and how effective your changes were.
I am about to begin developing a logging system, for future implementation in a current PHP application, to get load and usage statistics from a MySQL database.
The statistics will later be used to get info about database calls per second, query times, etc.
Of course, this will only be used when the app is in its testing stage, since it will most certainly cause a bit of additional load itself.
However, my biggest question mark right now is whether I should use MySQL to log the queries or go for a file-based system. I'd guess it would be a bit of a headache to create something that allows writes from multiple locations when using a file-based system to handle the logs?
How would you do it?
Use the general log, which will show client activity, including all the queries:
http://dev.mysql.com/doc/refman/5.1/en/query-log.html
If you need very detailed statistics on how long each query is taking, use the slow log with a long_query_time of 0 (or some other sufficiently short time):
http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
Then use http://www.maatkit.org/ to analyze the logs as needed.
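For reference, both logs can also be enabled at runtime without restarting the server (variable names as in the MySQL manual; the SUPER privilege is assumed):

```sql
SET GLOBAL general_log = 'ON';
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 0;  -- log every query with its execution time
```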
MySQL already has logging built in; Chapter 5.2 of the manual describes the log files. You'll probably be interested in the General Query Log (all queries), the Binary Log (queries that change data), and the Slow Query Log (queries that take too long or don't use indexes).
If you insist on using your own solution, you will want to write a database middle layer that all your DB calls go through, which can handle the timing aspects. As to where you write the logs: in development it doesn't matter too much, but the idea of using a second database isn't bad. You don't need an entirely separate DB product; a different instance of MySQL (on a different machine, or just a second instance on a different port) is enough. I'd go for a second MySQL instance instead of the filesystem, because you'll get all the good SQL functions like SUM and AVG to analyze your data.
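As a sketch of such a middle layer, here is a minimal Python wrapper that times every query. The class and method names are made up for illustration, and sqlite3 stands in for the real MySQL driver:

```python
import sqlite3
import time

class TimedDB:
    """Minimal middle layer: every query goes through one method
    that records how long it took."""
    def __init__(self, conn):
        self.conn = conn
        self.log = []  # (sql, seconds) pairs; could instead write to a second DB

    def query(self, sql, params=()):
        start = time.perf_counter()
        rows = self.conn.execute(sql, params).fetchall()
        self.log.append((sql, time.perf_counter() - start))
        return rows

db = TimedDB(sqlite3.connect(":memory:"))
db.query("CREATE TABLE t (n INTEGER)")
db.query("INSERT INTO t VALUES (1), (2)")
rows = db.query("SELECT SUM(n) FROM t")
```

Because the timings end up in a queryable store, SUM/AVG-style analysis over them is straightforward.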
If all you are interested in is longer-term, non-real-time analysis, turn on MySQL's regular query logging. There are tons of tools for analyzing the query logs (both regular and slow-query), giving you information about run times, average rows returned, etc. Seems to be what you are looking for.
If you are doing tests on MySQL, you should store the results in a different database such as Postgres; this way you won't increase the load with your own operations.
I agree with macabail but would only add that you could couple this with a cron job and a simple script to extract and generate any statistics you might want.