how to speed up tshark processing / reporting - libpcap

As of tshark version 2.0.0, if I run a network throughput speed test like speedtest over a 50 Mbps down/up link, tshark keeps dequeueing and digesting packets "forever" before it can report anything to stdout. tcpdump doesn't appear to suffer from any major overhead (though I had to increase its buffer size to avoid dropping packets).
After several attempts to tweak tshark - increasing the buffer size, avoiding writing packets to a temp file (I tried /dev/null - no option for it - /run/out1.tmpfs, etc.), using -n -Tfields to show only a few fields - tshark still failed to report more than 20 Mbps in real time. I wonder if there's any technique I can use to improve its processing capacity. Thank you.
I still need some of the tshark features, like the TCP retransmission analysis (AFAIK tcpdump doesn't do much beyond dumping packets).
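For reference, the leanest invocation I've been trying looks roughly like this (eth0 and the field list are just examples; -l keeps stdout line-buffered so results show up as packets are dissected):
# leanest capture attempted so far: no name resolution, bigger capture buffer, only a few fields
tshark -i eth0 -l -n -B 64 -T fields -e ip.src -e ip.dst -e tcp.analysis.retransmission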

Related

MySQL heavy disk activity even with no queries running

Trying to troubleshoot an issue with a mysterious disk io bottleneck caused by MySQL.
I'm using the following commands to test disk read/write speed:
#write
dd if=/dev/zero of=/tmp/writetest bs=1M count=1024 conv=fdatasync,notrunc
#read
echo 3 > /proc/sys/vm/drop_caches; dd if=/tmp/writetest of=/dev/null bs=1M count=1024
I rebooted the machine, disabled cron so none of my usual processes are running queries, killed the web server which usually runs, and killed mysqld.
When I run the read test without mysqld running, I get 1073741824 bytes (1.1 GB) copied, 2.19439 s, 489 MB/s. Consistently around 450-500 MB/s.
When I start the mysql service back up and run the read test again, I get 1073741824 bytes (1.1 GB) copied, 135.657 s, 7.9 MB/s. Consistently around 5 MB/s.
Running show full processlist in mysql doesn't show any queries (and I disabled everything that would be running queries anyway). In MySQLWorkbench's Server Status tab, I can see InnoDB reads fluctuate between 30-200 reads per second, and 3-15 writes per second even when no queries are running.
If I run iotop -oPa I can see that mysqld is racking up about 1 MB of disk reads per second when no queries are running. That seems like a lot considering no queries are running, but at the same time it doesn't seem like enough to make my dd command take so long... The only other thing performing disk I/O is jbd2/sda3-8.
Not sure if it's related, but if I try to kill the mysql server with service mysql stop it says "Attempt to stop MySQL timed out", and the mysqld process continues running, but I can no longer connect to the DB. I have to use kill -9 to kill the mysqld process and restart the server.
All of this came out of the blue. This server had been doing heavy-duty log parsing and high-volume inserts and selects for months, until this last weekend when we started seeing this disk I/O bottleneck.
How can I find out why MySQL is doing so much disk reading when it's essentially idle?
Did you update/delete/insert a large number of rows? If so, consider these "delays" in writing to disk:
The block containing the data is not written back to disk immediately.
Ditto for UNIQUE keys.
Updates to secondary indexes go into the "change buffer". They get folded into the index blocks, often even later.
Updates/deletes leave behind a "history list" that needs to be cleaned up after the transaction is complete.
Those things are handled by background tasks that do not show up in the PROCESSLIST. They may be visible on mysqld process(es), mostly as I/O. (CPU is probably minimal.)
Was there a ROLLBACK? Transactions are "optimistic", so a ROLLBACK has to do a lot of work to "undo" what was optimistically already applied.
If you abruptly kill mysqld (or turn off the power), then the ROLLBACK occurs after restarting.
SSDs have no "seek" time. HDDs must move the read/write heads by a variable amount; this takes time. If your dd is working on one end of the disk, and mysqld is working on the other end, the "seeking" adds to the apparent I/O time.
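If you want to confirm that this background work (purge and change-buffer merging) is what is generating the reads, a quick check along these lines should show it, assuming you can run the mysql client on the box:
# history list length and insert/change buffer activity show up here even when the PROCESSLIST is empty
mysql -e "SHOW ENGINE INNODB STATUS\G" | grep -E -A 2 'History list length|INSERT BUFFER'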
This turned out, like many performance problems, to be a multifaceted issue.
Essentially, the issue turned out to be nightly system and DB backups writing to a separate HDD RAID array and running into the next day, at which point the master would send FLUSH TABLES and cause MySQL jobs and replication work to wait on it. On top of that, an unnecessary side process was copying many gigabytes of text files around the system a few times a day, so there was a ton of context switching as the system tried to copy data for backups while also doing MySQL work (replication and other jobs).
I ended up reducing the number of tables we were replicating (some were unnecessary), cutting down on the copying of text files around the system, increasing the memory and I/O allocated to the MySQL server, streamlining the MySQL and system backups, and limiting cron jobs running MySQL processes so the backups had more time to complete. Even with all that, the backups were barely finishing by 7 AM each morning, so I determined that we need to run the MySQL backups only on weekends instead of nightly, which is fine since this is all fairly static data.

Filter or Log MySQL log for a particular table

In MySQL we are getting the logs from the file /var/log/mysql/mysql.log, and we can monitor the live queries by tailing this file with the tail command.
The problem is that all queries are logged there. Is there any way to log, or filter from tail, only the queries fired against a particular table?
Thanks in advance
I'd use pt-query-digest.
Making a filter based on a table name is tricky, and depends on some undocumented features.
pt-query-digest --filter '$qr->distill($event->{arg}) =~ /\bMyTable\b/' \
/var/log/mysql/mysql-slow.log
Note I'm parsing the slow query log, not the general query log. I prefer to use the slow-query log because it has more information in it.
Also be cautious about running this on your production server. I've seen the script take a lot of resources, and it can add noticeable load if your log is large. I recommend you scp the log to some other host where the extra load won't interfere with your production app.
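If all you need is a rough live view rather than a digest report, a plain tail piped through grep on the general log gets you most of the way (my_table is a placeholder; this will also match the name when it appears in comments or unrelated queries):
# crude live filter; --line-buffered keeps grep flushing each matching line immediately
tail -f /var/log/mysql/mysql.log | grep -i --line-buffered 'my_table'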

mysqldump increase apache process threads?

We are running into a problem with the number of Apache processes drastically increasing at a specific time. On further investigation, we found that mysqldump was running on the MySQL server during that time.
We noticed that while mysqldump was running, the count of Apache instances (processes) shot up to the maximum. Since we limited MaxClients to 150, there was no increase beyond that.
My question is: is it possible that mysqldump would increase the number of Apache processes?
mysqldump itself does not cause Apache to do anything. mysqldump and the Apache server are only related in that, since they run on the same machine, they share that machine's resources.
I suspect that when you run mysqldump your server becomes resource-constrained (could be CPU, network or disk). When Apache is resource-constrained, an individual Apache process cannot complete a request as quickly and move on to the next one, so new processes are spawned instead.
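A quick way to see which resource it is, and to soften the impact, is something like the following (database name and dump path are placeholders; ionice only has an effect with some Linux I/O schedulers):
# watch CPU, I/O wait and disk utilization while the dump runs
vmstat 1
iostat -x 1
# run the dump at low CPU and I/O priority so Apache gets starved less
nice -n 19 ionice -c 2 -n 7 mysqldump --single-transaction mydb > /backup/mydb.sql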

How to limit bandwidth used by mysqldump

I have to dump a large database over a network pipe that doesn't have much bandwidth and that other people need to use concurrently. If I try it, it soaks up all the bandwidth, latency soars, and everyone else gets messed up.
I'm aware of the --compress flag to mysqldump, which helps somewhat.
How can I do this without soaking up all the bandwidth over this connection?
Update:
The suggestion to copy a dumpfile using scp with the -l flag is a good one, but I should note that I don't have SSH access to the database server.
trickle?
trickle is a portable lightweight userspace bandwidth shaper
You don't mention how you are actually transferring the DB dump, but if the transfer happens over TCP/IP, trickle should work. For example, if you use nc (listening on the receiving side with, say, nc -l 1234 > backup.sql), the following command will transfer the backup at no more than 20 KB/s:
mysqldump [database name] | trickle -u 20 nc backup.example.com 1234
You will have to have access to a linux machine (sorry I'm a linuxy sort of person).
An ingress policy can decrease the amount of incoming traffic, but the server on the other side needs to have a fairly well-behaved TCP/IP stack.
tc qdisc add dev eth0 handle ffff: ingress
tc filter add dev eth0 parent ffff: protocol ip prio 50 \
u32 match ip src server.ip.address/32 police rate 256kbit \
burst 10k drop flowid :1
tc qdisc add dev eth0 root tbf \
rate 256kbit latency 25ms burst 10k
You can find more information on ingress filters in the advanced routing howto.
http://www.linux.org/docs/ldp/howto/Adv-Routing-HOWTO/index.html
If you are doing it in linux, you can dump the file locally, compress it and use scp to copy the file with the -l switch to limit the bandwidth used:
-l limit
Limits the used bandwidth, specified in Kbit/s.
eg
scp -l 16 dumpfile remotehost:filepathandname
One trick I've used is to dump in CSV format rather than as INSERT statements. It doesn't change how much bandwidth you use per unit of time, but it can reduce the total number of bytes you pull out.
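A sketch of what that looks like with mysqldump (the directory has to be writable by the mysqld server itself, since -T / --tab uses SELECT ... INTO OUTFILE; the path and database name are placeholders):
# writes one .sql (schema) and one comma-separated .txt (data) file per table, on the server side
mysqldump --tab=/var/tmp/dump --fields-terminated-by=',' --fields-enclosed-by='"' mydb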
If you send it over TCP, the bandwidth will be shared equally between all parties. If you want to lower the speed even more, you need to shape your device to only allow a certain amount of data going out.
On the client, you can run a proxy that will limit the speed of the download.
You can also control # of connections etc.
If you are on windows, this should work nicely:
http://www.youngzsoft.net/ccproxy/index.html
Are you using a transactional table engine like InnoDB? Is this your master database? Be very careful! mysqldump will hold table locks and disrupt the production use of your database. Slowing down the backup will only cause this period of disruption to get longer. Always mysqldump to a local disc, and then copy the dump from there.
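For the copy step, something like rsync with its built-in rate limit works, assuming you can push the file off the DB host over ssh/rsync (hostname, paths and the --bwlimit value, in KBytes/s, are placeholders):
# dump locally first (--single-transaction avoids long table locks if the tables are InnoDB), then copy with a ~2 MB/s cap
mysqldump --single-transaction mydb > /backup/mydb.sql
rsync --bwlimit=2000 /backup/mydb.sql backuphost:/backups/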
One other approach might be to set up a replication slave at your remote site, and take your backups from that. Then database updates will trickle over your contended link instead of coming down in one big lump.
Another alternative: do your backups when no one else is using the network :)

What's the quickest way to dump & load a MySQL InnoDB database using mysqldump?

I would like to create a copy of a database with approximately 40 InnoDB tables and around 1.5GB of data with mysqldump and MySQL 5.1.
What are the best parameters (ie: --single-transaction) that will result in the quickest dump and load of the data?
As well, when loading the data into the second DB, is it quicker to:
1) pipe the results directly to the second MySQL server instance and use the --compress option
or
2) load it from a text file (ie: mysql < my_sql_dump.sql)
QUICKLY dumping a quiesced database:
Using the "-T " option with mysqldump results in lots of .sql and .txt files in the specified directory. This is ~50% faster for dumping large tables than a single .sql file with INSERT statements (takes 1/3 less wall-clock time).
Additionally, there is a huge benefit when restoring if you can load multiple tables in parallel, and saturate multiple cores. On an 8-core box, this could be as much as an 8X difference in wall-clock time to restore the dump, on top of the efficiency improvements provided by "-T". Because "-T" causes each table to be stored in a separate file, loading them in parallel is easier than splitting apart a massive .sql file.
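A rough sketch of that workflow (database name and dump directory are placeholders; the directory must be writable by the mysqld process, because -T uses SELECT ... INTO OUTFILE under the hood):
# dump: one .sql (schema) and one tab-separated .txt (data) file per table
mysqldump -T /var/tmp/dump mydb
# restore: recreate the tables, then load the data files with several parallel threads
# (--local reads the files from the client side and needs LOCAL INFILE enabled)
cat /var/tmp/dump/*.sql | mysql mydb
mysqlimport --use-threads=4 --local mydb /var/tmp/dump/*.txt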
Taking the strategies above to their logical extreme, one could create a script to dump a database widely in parallel. Well, that's exactly what the Maatkit mk-parallel-dump (see http://www.maatkit.org/doc/mk-parallel-dump.html) and mk-parallel-restore tools are: perl scripts that make multiple calls to the underlying mysqldump program. However, when I tried to use these, I had trouble getting the restore to complete without duplicate key errors that didn't occur with vanilla dumps, so keep in mind that your mileage may vary.
Dumping data from a LIVE database (w/o service interruption):
The --single-transaction switch is very useful for taking a dump of a live database without having to quiesce it or taking a dump of a slave database without having to stop slaving.
Sadly, -T is not compatible with --single-transaction, so you only get one.
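For reference, a live dump along those lines might look roughly like this (the database name is a placeholder):
# consistent InnoDB snapshot without holding table locks for the duration of the dump
mysqldump --single-transaction mydb > mydb.sql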
Usually, taking the dump is much faster than restoring it. There is still room for a tool that takes the incoming monolithic dump file and breaks it into multiple pieces to be loaded in parallel. To my knowledge, such a tool does not yet exist.
Transferring the dump over the Network is usually a win
To listen for an incoming dump on one host run:
nc -l 7878 > mysql-dump.sql
Then on your DB host, run
mysqldump $OPTS | nc myhost.mydomain.com 7878
This reduces contention for the disk spindles on the master, since the dump is not being written to its disk, slightly speeding up your dump (assuming the network is fast enough to keep up, a fairly safe assumption for two hosts in the same datacenter). Plus, if you are building out a new slave, this saves the step of having to transfer the dump file after it is finished.
Caveats - obviously, you need to have enough network bandwidth not to slow things down unbearably, and if the TCP session breaks, you have to start all over, but for most dumps this is not a major concern.
Lastly, I want to clear up one point of common confusion.
Despite how often you see these flags in mysqldump examples and tutorials, they are superfluous because they are turned ON by default:
--opt
--add-drop-table
--add-locks
--create-options
--disable-keys
--extended-insert
--lock-tables
--quick
--set-charset
From http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html:
Use of --opt is the same as specifying --add-drop-table, --add-locks, --create-options, --disable-keys, --extended-insert, --lock-tables, --quick, and --set-charset. All of the options that --opt stands for also are on by default because --opt is on by default.
Of those behaviors, "--quick" is one of the most important (it skips caching the entire result set in mysqld before transmitting the first row), and it can be used with "mysql" (which does NOT turn --quick on by default) to dramatically speed up queries that return a large result set (e.g. dumping all the rows of a big table).
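For example, when pulling a big result set with the plain mysql client (table and database names are placeholders):
# --quick makes the client stream rows as they arrive instead of buffering the whole result set first
mysql --quick -e "SELECT * FROM big_table" mydb > big_table.tsv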
Pipe it directly to another instance, to avoid disk overhead. Don't bother with --compress unless you're running over a slow network, since on a fast LAN or loopback the network overhead doesn't matter.
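A minimal version of that pipe, assuming the second instance is reachable as target-host and the target schema already exists (both names are placeholders):
# stream the dump straight into the second server; no intermediate file on disk
mysqldump --single-transaction mydb | mysql -h target-host mydb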
I think it will be a lot faster and save you disk space if you tried database replication as opposed to using mysqldump. Personally I use SQLyog Enterprise for my really heavy lifting, but there are also a number of other tools that can provide the same services - unless of course you would like to use only mysqldump.
For InnoDB, --order-by-primary --extended-insert is usually the best combo. If you're after every last bit of performance and the target box has many CPU cores, you might want to split the resulting dumpfile and do parallel inserts in many threads, up to innodb_thread_concurrency/2.
Also, tweak innodb_buffer_pool_size on the target to the maximum you can afford, and increase innodb_log_file_size to 128 or 256 MB (be careful with this one: you need to remove the old log files before restarting the mysql daemon, otherwise it won't restart).
Use mk-parallel-dump tool from Maatkit.
At least that would probably be faster. I'd trust mysqldump more.
How often are you doing this? Is it really an application performance problem? Perhaps you should design a way of doing this which doesn't need to dump the whole data (replication?)
On the other hand, 1.5G is quite a small database so it probably won't be much of a problem.
mydumper is a good choice, with parallel export, even parallel threads per table, and compressed files; see: