Couchbase nodes' RAM is getting full frequently

We have a 4-node cluster with 24 GB RAM per node, of which 18 GB is given to Couchbase, with zero replication.
We have approximately 10M records in this cluster, writing ~2.5M/hour and expiring old items.
Our RAM usage, which is ~72 GB across the cluster, fills up every ~12 days, and I need to restart the cluster to fix this. After a restart the RAM usage is back to ~20 GB.
Can someone please help me understand the reason for this?
FYI: auto-compaction is set to a 40% fragmentation level, and the metadata purge interval was set to 1 day, which we reduced to 2 hours, but that didn't help.

Under scenarios with very high memory-allocation churn, Couchbase can experience memory fragmentation, which would cause the effects you are describing. This was addressed in the 4.x release by switching to jemalloc on non-Windows operating systems and using tcmalloc with aggressive decommit on Windows. I would suggest you download the RC version of Couchbase 4 (http://www.couchbase.com/nosql-databases/downloads#Couchbase_Server) and give it a try to see if that fixes the issue.
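In the meantime, a rough way to confirm fragmentation is to compare the memory the bucket reports as used against the resident set size the OS reports for the data-service process; a large and steadily growing gap points at the allocator. A minimal sketch, assuming a Linux node, the default install path, and a bucket named mybucket (path and bucket name are placeholders for your setup):

# What Couchbase thinks the bucket is using, in bytes
# (if your build's cbstats lacks the memory subcommand, grep "cbstats ... all" instead)
/opt/couchbase/bin/cbstats localhost:11210 memory -b mybucket | grep mem_used
# What the OS sees for the data service: resident set size, in KB
ps -o rss= -C memcached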

Related

Aurora MySQL DB backup size increased drastically

I am running Aurora MySQL engine version 5.6.10a in the production environment. The automated DB snapshot size on 9th May was 120 GB, and it then increased by 27 GB to 147 GB. I have checked that the DB size did not increase by even 1 GB. I looked on the internet for a reason but found nothing.
The graph of snapshot size for the last two weeks is pretty consistent until 9th May, then jumps from 10th May onward. Does anyone have insight into this issue?
[Graphs: rate of increase in DB size; VolumeBytesUsed]
Your help will be much appreciated!!
Thanks,
Mayurkumar
There is a bug in Aurora storage provisioning; we have also faced the same issue for one of our clients.
In our case the storage grew from 17 GB to 22,000 GB in 15 days, while the DB size was just 16 GB. We were charged a lot for that period; you can contact AWS Support for a resolution of this problem.
How were you checking your snapshot size? One possibility is that you may have configured daily backups or similar, which might be why your total backup size is growing. I would love to help more if you can provide details on the number of snapshots and on how you were checking the size of the cluster and the snapshot.
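For reference, one way to pull the snapshot sizes yourself is the AWS CLI; a sketch assuming the CLI is configured and a hypothetical cluster name of my-cluster (AllocatedStorage is reported in GiB):

aws rds describe-db-cluster-snapshots \
    --db-cluster-identifier my-cluster \
    --snapshot-type automated \
    --query 'DBClusterSnapshots[*].[DBClusterSnapshotIdentifier,SnapshotCreateTime,AllocatedStorage]' \
    --output table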

MySQL freezing for 40 seconds every 30 minutes

We are running MySQL 5.6 on Windows Server 2008r2.
Every 30 minutes it runs very slowly for around 40 seconds and then goes back to normal for another 30 minutes. It is happening like clockwork with each ‘hang’ being 30 minutes after the last one finished.
Any ideas? We are stumped and don’t know where next to look.
Background / things we have ruled out below.
Thanks.
• Our initial thoughts were a locking query but we have eliminated this.
• The slow query log shows affected queries but with zero lock time.
• General logs show nothing. (As an aside, is there a way to increase the logging level to get it to log when it is flushing the cache etc.? What does MySQL run every 30 minutes? See the sketch after this list.)
• When it is running slowly, it is still running, but even simple queries like SELECT 'Hello World'; take over a second to run.
• All MySQL operations run slowly at the time in question including monitoring tools and especially making new connections. InnoDB and MyISAM are equally affected.
• We have switched from using the SAN array to using local SSD and it has made no difference ruling out disk / spindles.
• The machine has Sophos Endpoint Protection but this is not scanning anything on the database drives.
• It is as if the machine is maxed out, but local performance monitoring does not show any unusual system metrics. CPU, disk queue, disk throughput, memory, network activity etc. are all flat.
• The machine is a VM running on VMware. Hypervisor monitoring is not showing any performance issues, but I am not convinced it is granular enough to pick up a 40-second spike.
• We have tried adjusting MySQL settings like the InnoDB cache size, log size etc and this has made no difference.
• The server runs nothing other than a couple of MySQL instances.
• The other instances are unaffected - as far as we can tell.
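On the logging aside: a minimal sketch of turning the general query log on at runtime and checking for scheduled events, which covers the usual "what does MySQL run on a timer" suspects. The statements are standard MySQL 5.6; the log file path is an assumption, and the mysql invocations assume your credentials are configured:

mysql -e "SET GLOBAL general_log_file = 'C:/mysql-general.log';"
mysql -e "SET GLOBAL general_log = 'ON';"
mysql -e "SHOW VARIABLES LIKE 'event_scheduler';"
mysql -e "SELECT EVENT_NAME, INTERVAL_VALUE, INTERVAL_FIELD FROM information_schema.EVENTS;"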
There's some decent advice here on Server Fault:
https://serverfault.com/questions/733590/mysql-stops-responding-periodically
Have you monitored disk I/O? Is there an increase in I/O wait times or queued transactions? It's possible that requests are queueing up at the storage level due to an I/O limitation put on by your host. Also, have you checked if you're hitting your max allowable MySQL clients? If these queries are suddenly taking a lot longer to complete, it's also possible that they're not leaving enough available connections for normal site traffic because the other connections aren't closing fast enough.
I'd recommend using iostat and seeing if you're saturating your disks. It should show whether all your disks are at 100% usage, etc.
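By way of illustration, a minimal sketch of both checks; iostat needs the sysstat package and is Linux-only (on Windows Server the rough equivalent is Performance Monitor's PhysicalDisk counters), and the mysql calls assume configured credentials:

iostat -x 5    # watch %util and await during one of the 40-second stalls
mysql -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections';"
mysql -e "SHOW VARIABLES LIKE 'max_connections';"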

Getting very bad performance with Galera compared to a standalone MariaDB server

I am getting unacceptably low performance with the Galera setup I created. In my setup there are two nodes in active-active, and I am doing reads/writes on both nodes in round-robin fashion using an HAProxy load balancer.
I was easily able to get over 10,000 TPS on my application with a single MariaDB server with the below configuration:
36 vCPUs, 60 GB RAM, SSD, dedicated 10 Gb pipe
With Galera I am hardly getting 3,500 TPS, although I am using two DB nodes (36 vCPUs, 60 GB RAM) load balanced by HAProxy. For information, HAProxy is hosted as a standalone node on a different server. I have removed HAProxy for now, but there is no improvement in performance.
Can someone please suggest some my.cnf tuning parameters I should consider for this severely under-performing setup?
I am using the below my.cnf file:
I was easily able to get over 10,000 TPS on my application with a single MariaDB server with the below configuration: 36 vCPUs, 60 GB RAM, SSD, dedicated 10 Gb pipe.
With Galera I am hardly getting 3,500 TPS, although I am using two DB nodes (36 vCPUs, 60 GB RAM) load balanced by HAProxy.
Clusters based on Galera are not designed to scale writes as I see you intend to do. In fact, as Rick mentioned above, sending writes to multiple nodes for the same tables will end up causing certification conflicts, which surface as deadlocks for your application and add huge overhead.
I am getting unacceptably low performance with the Galera setup I created. In my setup there are two nodes in active-active, and I am doing reads/writes on both nodes in round-robin fashion using an HAProxy load balancer.
Please send all writes to a single node and see if that improves performance. There will always be some overhead due to the virtually synchronous replication Galera uses, which literally adds network overhead to each write you perform (although true clock-based parallel replication will offset this impact quite a bit, you are still bound to see slightly lower throughput).
Also make sure to keep your transactions short and COMMIT as soon as you are done with an atomic unit of work, since the replication certification process is single-threaded and will stall writes on the other nodes. (If your writer node shows transactions sitting in the wsrep pre-commit stage, it means the other nodes are doing certification for a large transaction, or that a node is suffering performance problems of some sort: swap, a full disk, abusively large reads, etc.)
Hope that helps, and let us know how it goes when you move to a single node.
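If you want to see whether certification or flow control is the bottleneck while you test, the standard wsrep counters are worth watching; a quick sketch, assuming configured credentials:

mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_flow_control_paused';"   # fraction of time paused by flow control
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cert_deps_distance';"    # potential parallelism in certification
mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_local_cert_failures';"   # write conflicts between nodes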
Turn off the QC:
query_cache_size = 0    # not 22 bytes
query_cache_type = OFF  # the QC is incompatible with Galera
Increase innodb_io_capacity
How far apart (ping time) are the two nodes?
Suggest you pretend that it is Master-Slave. That is, have HAProxy send all traffic to one node, leaving the other as a hot backup. Certain things can run faster in this mode; I don't know about your app.
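For the Master-Slave pretence, the HAProxy side can be as small as the sketch below, where the second node is only used if the first fails. Node names, addresses, and the check user are illustrative assumptions:

listen mysql_cluster
    bind *:3306
    mode tcp
    option mysql-check user haproxy_check
    server node1 10.0.0.1:3306 check
    server node2 10.0.0.2:3306 check backup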

Node.js high memory usage

I'm currently running a Node.js server that communicates with a remote MySQL database and performs web requests to various APIs. When the server is idle, CPU usage ranges from 0-5% and RAM usage sits at around 300 MB. Yet when the server is under load, RAM usage goes up linearly and CPU usage jumps all around, even up to 100% at times.
I set up a snapshot solution that would take a heap snapshot when a leak was detected using node-memwatch. I downloaded three different snapshots, when the server was at 1 GB, 1.5 GB, and 2.5 GB RAM usage, and attempted to analyze them, yet I have no idea where the problem is because the totals in the analysis add up to something much lower.
Here is one of the snapshots, when the server had a memory usage of 1107MB.
https://i.gyazo.com/e3dadeb727be3bdb4eeb833094291ebf.png
Does that match up? From what I see there is only a maximum of 500 MB allocated to objects there. Also, would anyone have any ideas about the crazy CPU usage I'm getting? Thanks.
What you need is a better tool to properly diagnose that leak. It looks like you can get some help using N|Solid (https://nodesource.com/products/nsolid); it will help you visualize and monitor your app, and it is free to use in a development environment.
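On the snapshot numbers not matching up: a V8 heap snapshot only covers the JavaScript heap, while the process RSS also includes Buffers, native allocations, and allocator overhead, so snapshot totals adding up to well under 1107 MB is expected. A minimal sketch for watching the split (process.memoryUsage() is a standard Node API; run the same setInterval inside your server to see its real numbers, and the 5-second interval is arbitrary):

node -e "setInterval(function () { var m = process.memoryUsage(); console.log('rss', Math.round(m.rss / 1048576) + ' MB', 'heapUsed', Math.round(m.heapUsed / 1048576) + ' MB'); }, 5000);"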

CPU usage PostgreSQL vs MySQL on Windows

Currently I have this server:
processor : 3
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 2.40GHz
stepping : 9
cpu MHz : 2392.149
cache size : 512 KB
My application causes 96%+ CPU usage for MySQL at 200-300 transactions per second.
Can anyone assist or provide links on:
• how to benchmark PostgreSQL
• whether PostgreSQL can improve CPU utilization compared to MySQL
• links or wikis that simply present a benchmark comparison
A common misconception for database users is that high CPU use is bad.
It isn't.
A database has exactly one speed: as fast as possible. It will always use up every resource it can, within administrator set limits, to execute your queries quickly.
Most queries require lots more of one particular resource than others. For most queries on bigger databases that resource is disk I/O, so the database will be thrashing your storage as fast as it can. While it is waiting for the hard drive it usually can't do any other work, so that thread/process will go to sleep and stop using the CPU.
Smaller databases, or queries on small datasets within big databases, often fit entirely in RAM. The operating system will cache the data from disk and have it sitting in RAM and ready to return when the database asks for it. This means the database isn't waiting for the disk and being forced to sleep, so it goes all-out processing the data with the CPU to get you your answers quickly.
There are two reasons you might care about CPU use:
• You have something else running on that machine that isn't getting enough CPU time; or
• You think that, given the 100% CPU use, you aren't getting enough performance from your database.
For the first point, don't blame the database. It's an admin issue. Set operating system scheduler controls like nice levels to re-prioritize the workload - or get a bigger server that can do all the work you require of it without falling behind.
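For instance, a concrete sketch of the first case on Linux (the niceness value and the PID lookup are illustrative):

# Lower mysqld's scheduling priority so other workloads get CPU time first
renice -n 10 -p $(pidof mysqld)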
For the second point you need to look at your database tuning, your queries, etc. It's not a "database uses 100% CPU" problem, it's an "I'm not getting enough throughput and seem to be CPU-bound" problem. Database and query tuning is a big topic and not one I'll get into here, especially since I don't generally use MySQL.