MySQL log file performance / size

Last week, after a crash, I noticed that my MySQL log file had grown so large that it consumed the disk (not a massive disk, admittedly). I recently implemented a new helpdesk/ticketing system that was adopted by the entire company much faster than anticipated, hence a log file that is 99% SELECT statements.
So my question is this: can I retain MySQL logging but exclude SELECT statements? Furthermore, can I keep SELECT statements but exclude certain databases (i.e. the helpdesk database)?
Thanks for any response

You can't restrict the MySQL general log to certain databases or certain types of statements. It logs everything executed on your MySQL server, and of course that is an overhead on a production server.
I suggest you turn off the general log on the production server and enable the slow query log with appropriate settings, so that only problematic queries needing attention are logged; you can later optimize those queries to improve MySQL performance.
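A minimal sketch of the relevant settings, assuming a MySQL version (5.1 or later) where these variables can be changed at runtime; the file path and threshold are only examples:

    -- Turn the general query log off and use the slow query log instead.
    SET GLOBAL general_log = 'OFF';
    SET GLOBAL slow_query_log = 'ON';
    SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';  -- example path
    SET GLOBAL long_query_time = 2;  -- log only queries slower than 2 seconds

Put the same settings in my.cnf as well so they survive a restart.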
If you still need the general log enabled, make sure a logrotate script is applied to the general log file to keep its size within a limit.
http://www.thegeekstuff.com/2010/07/logrotate-examples/

Related

History of queries in MySql

Is there any way to see the queries that are executed against my MySQL database?
For example:
I have an application (OTRS) that lets me generate reports according to the criteria I choose. I would like to know which queries the application runs against the database.
I will use this to integrate with other reporting software.
Is this possible?
Yes, you can enable logging on your MySQL server. There are several types of logs you can use, depending on what you want to capture, ranging from errors only or slow queries up to logs that record everything done on your server.
See the full documentation here.
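As a minimal sketch (assuming MySQL 5.1 or later, where the general log can be toggled at runtime; the file path is just an example), you could switch on the general query log while you click through OTRS and read the file afterwards:

    -- Log every statement the server receives to a file you can inspect.
    SET GLOBAL general_log_file = '/tmp/otrs_queries.log';  -- example path
    SET GLOBAL general_log = 'ON';
    -- ... use the application, then look at the file ...
    SET GLOBAL general_log = 'OFF';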
Although, as Nir says, MySQL can log all queries (you should be looking at the general log, or at the slow log configured with a threshold of 0 seconds), this will show every query being run; on a production system it may prove difficult to match what you are doing in your browser with specific entries in the log.
The reason I suggest the slow query log is that there are tools available that strip the parameters from the queries, letting you see which SQL statements run most frequently.
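For example (a sketch, again assuming MySQL 5.1 or later), setting the threshold to zero makes every query a "slow" query, and the resulting file can be summarized with mysqldumpslow, which ships with MySQL and replaces literal values so that similar queries are grouped:

    SET GLOBAL slow_query_log = 'ON';
    SET GLOBAL long_query_time = 0;  -- 0 seconds: every statement is logged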
If you have some proficiency in Perl, it should be straightforward to output the queries from the application itself, since all queries are processed via an abstraction layer.
(Presumably you are aware that the schema is published)

Can we enable mysql binary logging for a specific table

I want to write a listener that detects DML changes on a table and performs some actions. This listener cannot be embedded in the application; it runs separately.
My idea was to let the application write to a BLACKHOLE table and detect the changes from the binary log file.
But in the docs I found that enabling binary logging slows MySQL performance slightly. That's why I was wondering whether there is a way to make the MySQL master log only the changes related to a specific table.
Thanks!
Doing it in SQL is the best way to track DML changes and act on them. But since you want to explore other options, you may try:
a cron job that parses the general query log, which however also includes SELECT / SHOW statements that you don't need;
the binary log, read with mysqlbinlog: it slows performance just a little, but it is needed anyway for point-in-time recovery and replication (see the sketch after the suggestions below).
Suggestions:
On a production environment, the MySQL binary log should be enabled and the general query log should be disabled, since the general log records almost everything, fills up very quickly, and can run out of disk space if not rotated properly.
On a dev/QA environment, the general query log can be enabled with a proper rotation policy.
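On the master, binary log filtering is only per database (binlog-do-db), not per table, so the usual approach is to enable the binary log and filter for your table while reading it. A rough sketch of inspecting the log from SQL; the log file name is hypothetical:

    -- Assumes the server was started with log-bin enabled.
    SHOW BINARY LOGS;                         -- list the available binlog files
    SHOW BINLOG EVENTS IN 'mysql-bin.000042'  -- decode events from one file (hypothetical name)
        LIMIT 50;
    -- Filter the output for your table, or use the mysqlbinlog client instead;
    -- with row-based logging, mysqlbinlog --verbose reconstructs the row changes.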

call graph for MySQL sessions

I am trying to create a valgrind (cachegrind) analysis of MySQL client connections.
I am running valgrind with --trace-children=yes.
What I want to find is one of the internal method calls, so I can see its call graph while it is being used.
After running valgrind --trace-children=yes ./bin/mysqld_safe,
I get many dump files written at that moment.
I wait 5 minutes (so that the new files I expect to be created will have a different "last modified" date).
After these 5 minutes I open 30 sessions, flood the system with small transactions, and when I am done, shut down MySQL.
Now the questions:
1. After running the 30 sessions and shutting down the system, only 3 files were modified. I expected to see 30 files, because I thought MySQL spawns a process per session. So first, can someone confirm that MySQL spawns threads, not processes, for each session?
2. I see three different database log calls: one to a DUMMY, one to the binlog, and one to the InnoDB log. Can someone explain why the binlog and the DUMMY are there, and what the difference between them is? (I guess the DUMMY is because of InnoDB, but I don't understand why the binlog is there if that guess is true.)
3. Is there a better way to do this analysis?
4. Is there a tool like kcachegrind that can open multiple files and show the summary of all of them? (Or is that somehow possible within kcachegrind itself?)
Thanks!!
By the way, for people who extend and develop MySQL: there are many interesting things in there that could be improved...
I can only help you with some of the issues. Yes, MySQL does not create processes but threads; see the manual on the command that lists what the server is currently doing:
When you are attempting to ascertain what your MySQL server is doing, it can be helpful to examine the process list, which is the set of threads currently executing within the server.
(Highlighting by me.)
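The command referred to is SHOW PROCESSLIST: every client connection appears there as a thread of the single mysqld process, which is consistent with seeing only a few cachegrind output files. A quick sketch:

    SHOW PROCESSLIST;  -- one row per connection thread, with its current statement
    SELECT id, user, command, state
    FROM information_schema.PROCESSLIST;  -- same information as a table (MySQL 5.1+)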
Concerning the logs: the binary log is the log used for replication. It contains all executed statements (or changed rows) and is propagated to the slaves.
The InnoDB log is independent of the binary log and is used to ensure that InnoDB behaves in an ACID-compliant way. Transactions are written there first, and this file is used if the server crashes and InnoDB runs a recovery.
It is entirely normal for both logs to be written to on a typical server.
I cannot help you with your other questions, though. Maybe you want to ask them on dba.stackexchange.com.

What is happening as my Sphinx search server warms up?

I have Sphinx Search running on a Linux server with 38GB of RAM. The sphinx index contains 35M full text documents plus meta data indexed from a MySQL table. When I launch a fresh server, I run a script that "warms up the sphinx cache" by sending my 10,000 most common queries through it. It takes about an hour to run the warm up script the first time, but the same script completes in just a few minutes if I run it again.
My confusion arises from the fact that Sphinx doesn't have any documented caching, other than a file based cache that I am not using. The index is loaded into memory when Sphinx starts, but individual queries take the same length of time each time they are run after the system has been "warmed up".
There is a clear warm-up period when I run my scripts. What is going on? Is Linux caching something that helps Sphinx run faster? Does the underlying MySQL system cache queries (I believe Sphinx is basically a custom MySQL storage engine)? How are queries that have never been run before being made faster by whatever is going on?
I realize there is likely a very complex explanation for this, but even a little direction should help me dig deeper.
(I believe Sphinx is basically a custom MySQL storage engine)
SphinxSE is a 'fake' storage engine: fake because it doesn't store any data, but rather takes requests for data from its 'table' and proxies them back to a running searchd instance in the background.
searchd itself doesn't have any caching, but as the indexes are read the OS may well start caching the files, so reads don't have to go all the way back to disk.
If you are using SphinxSE, queries may be cached by the normal MySQL query cache, so whole result sets are cached. In addition, the usual way to use SphinxSE is to join the search results back with the original dataset, so you get both returned to the app in one go. That means your queries also depend on the real MySQL data tables, and those are subject to the same OS caching: as MySQL reads data, it will be cached.
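To check whether the MySQL query cache is actually contributing (a quick sketch, assuming a MySQL version that still has the query cache, i.e. before 8.0):

    SHOW VARIABLES LIKE 'query_cache%';  -- is the cache enabled, and how large is it?
    SHOW STATUS LIKE 'Qcache%';          -- compare Qcache_hits with Qcache_inserts over time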
When I launch a fresh server
That suggests you are using a VM? If so, the virtual disk might actually be located on a remote SAN (or EBS on Amazon EC2),
which means loading a large Sphinx index via that route might well be slow.
Depending on where your VM is hosted, you might be able to get special high-performance disks, ideally local to the host, maybe even SSDs, which may well help.
Anyway, to trace the issue further you should almost certainly enable the Sphinx query log. Look at it to see whether queries are executing slowly there. There is also a startup option to searchd that enables iostats; this logs extra I/O statistics to the query log as queries are run, which can give you additional insight.
Sphinx doesn't cache your queries, but the file system does. So yes, queries execute faster the second time than the first.

Should I use mysql to keep logs, or just dump to a text file

I am creating a site which will make lots of searches and I need to log data about every search that is made for later analysis.
I anticipate ultimately having the load distributed among a number of servers; each month I will download and import all the logs into a single MySQL database at my end for analysis.
At the moment I've been looking at setting every server up as a MySQL 'master' that live-updates the slave analysis server and essentially also acts as a backup.
However, I'm aiming for efficiency. Obviously the benefit of MySQL replication is that I always have the logs centrally available and don't have to import and reset log files on each server every month.
How much more efficient would it be to log to a plaintext file, dump that log file every month, and import it into MySQL centrally? Is a plaintext dump much, if at all, more efficient/faster than MySQL?
Thanks for your thoughts!
Databases are built for doing much more than inserts. They are strong at locking, transaction management, fast searches, connection pooling, and the list goes on.
On the other hand, if all you need to do in general is writing a chunk of data to the disk, a database would be a huge overhead.
Given the above, and since all you do during the month is write, I would recommend using log files; once a month, take the logs, merge them, and analyze them. You can then decide whether to merge them all into a database (if that makes sense and gives you added value) or just to merge the text together.
By the way, you may want to write INSERT statements into this log and then use it as a script to load everything into the database. Give it a thought. :-)
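For the monthly import, a delimited plaintext log also loads very quickly with LOAD DATA; the table, columns, and file path below are hypothetical:

    -- Hypothetical analysis table matching the fields the search servers write out.
    CREATE TABLE search_log (
        logged_at  DATETIME,
        server     VARCHAR(64),
        query_text VARCHAR(255),
        results    INT
    );

    -- Bulk-load one month's tab-separated log file (example path).
    LOAD DATA LOCAL INFILE '/data/logs/searches-2013-01.tsv'
    INTO TABLE search_log
    FIELDS TERMINATED BY '\t'
    LINES TERMINATED BY '\n';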