Couchbase 4.0 Community Edition benchmark

We are benchmarking Couchbase and observing some very strange behaviour.
Setup phase:
Couchbase cluster machines:
2 x EC2 r3.xlarge with General Purpose 80GB SSD (not EBS optimised), IOPS 240/3000.
Couchbase settings:
Cluster:
Data RAM Quota: 22407 MB
Index RAM Quota: 2024 MB
Index Settings (default)
Bucket:
Per Node RAM Quota: 22407 MB
Total Bucket Size: 44814 MB (22407 x 2)
Replicas enabled (1)
Disk I/O Optimisation (Low)
Each node runs all three services
Couchbase client:
1 x EC2 m4.xlarge with General Purpose 20 GB SSD (EBS optimised), IOPS 60/3000.
The client runs the YCSB benchmark tool:
ycsb load couchbase -s -P workloads/workloada -p recordcount=100000000 -p core_workload_insertion_retry_limit=3 -p couchbase.url=http://HOST:8091/pools -p couchbase.bucket=test -threads 20 | tee workloadaLoad.dat
P.S.: All the machines reside within the same VPC and subnet.
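If the disk write queue behaviour described below turns out to be ingest simply outpacing the disks, one quick experiment is to cap YCSB's throughput with its -target flag (total ops/sec across all threads); the 15000 below is an illustrative value, not a recommendation:
ycsb load couchbase -s -P workloads/workloada -p recordcount=100000000 -p core_workload_insertion_retry_limit=3 -p couchbase.url=http://HOST:8091/pools -p couchbase.bucket=test -target 15000 -threads 20 | tee workloadaLoadThrottled.dat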
Results:
While everything works as expected
The average ops/sec is ~21000
The 'disk write queue' graph floats between 200K and 600K (periodically drained).
The 'temp OOM per sec' graph stays at a constant 0.
When things start to get weird
After ~27M documents have been inserted, the 'disk write queue' starts rising constantly (not getting drained).
At a disk queue size of ~8M, temp OOM failures start to show themselves and the client receives 'Temporary failure' responses from Couchbase.
After 3 retries per YCSB thread, the client stops, having inserted only ~27% of the overall documents.
Even after the YCSB client has stopped running, the 'disk write queue' approaches 0 only asymptotically and is fully drained only after ~15 min.
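A minimal way to watch these counters from a node itself, assuming the bucket is named 'test' with no bucket password and a default install path:
# ep_queue_size = items awaiting persistence; ep_tmp_oom_errors = 'Temporary failure' responses sent to clients
/opt/couchbase/bin/cbstats localhost:11210 all -b test | grep -E 'ep_queue_size|ep_flusher_todo|ep_tmp_oom_errors|mem_used|ep_mem_high_wat'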
P.S.
When we benchmark locally on a MacBook with 16GB of RAM + an SSD (local client + single-node server) we do not observe this behaviour, and the 'disk write queue' is drained continuously in a predictable manner.
Thanks.

Related

CbBackup Tool gets interrupted after 30s of inactivity for specific buckets in a cluster

I have a cluster with 4 buckets in it. Whenever I try to take backups of the buckets individually, two of the buckets get backed up while the other two do not.
For the buckets which do not get backed up, I get this message in the console:
w0 no response for 30 seconds while there 1024 active streams
This is the command I’m running for each bucket:
./cbbackup http://localhost:8091 /datadrive/cb-backups/ -u <USERNAME> -p '<PASSWORD>' -b <BUCKET_NAME> -m full
These are the specs for the two buckets that are not getting backed up:
Bucket 1 - 4GB RAM, currently has around 400,000 documents.
Bucket 2 - 4GB RAM, currently has around 150,000 documents.
It’s worth noting that we first had 2GB RAM in both buckets. After increasing the RAM for both buckets, backups started working again, but the same error occurred from the next day.
Is there an inherent problem with the cbbackup tool? Does anyone know how the backups are actually taken? That would give more insight into why this error might occur.
Couchbase Server version: Community Edition 5.0.1 build 5003.
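One experiment worth trying, in case the stalled streams come from too much concurrency or oversized batches: run cbbackup with a single worker thread and a smaller batch size. The -t and -x batch_max_size values below are guesses to vary, not known fixes:
./cbbackup http://localhost:8091 /datadrive/cb-backups/ -u <USERNAME> -p '<PASSWORD>' -b <BUCKET_NAME> -m full -t 1 -x batch_max_size=500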
Thanks for your valuable time.

High CPU usage on couchbase server with moderate load

I am using Couchbase Server in a staging environment. Things were working fine until yesterday, but since today I have been observing high CPU usage when the load increases moderately.
Couchbase cluster configuration:
3-node cluster running 4.5.1-2844 Community Edition (build-2844),
each node an m4.2xlarge (8 cores, 32 GB RAM) AWS machine.
Data RAM quota: 25000 MB
Index RAM quota: 2048 MB
It has 9 buckets, and the bucket in use has a 9 GB RAM quota (i.e. 3 GB per node).
Note: since we are using the Community Edition, each node runs the Data, Full Text, Index, and Query services.
Let me know if I've misconfigured something or if any optimization is required.
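One way to see which service is actually burning the CPU is to snapshot the per-service processes on each node (memcached = Data, indexer = Index, cbq-engine = Query, cbft = Full Text, beam.smp = cluster manager):
# batch-mode snapshot; sort the output by the CPU column to find the hog
top -b -n 1 | grep -E 'memcached|indexer|cbq-engine|cbft|beam.smp'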

AWS RDS Concurrent Connections Issue

So I have an RDS MariaDB server running with the following specs:
Instance Class: db.m4.2xlarge
Storage Type: Provisioned IOPS (SSD)
IOPS: 4000
Storage: 500 GB
My issue is that when the SQL server experiences heavy load (connections in excess of 200), it starts to refuse new connections.
However, according to the monitoring stats, it should be able to handle connections well above that. At its peak load these are the stats:
CPU Utilization: 18%
DB Connections: 430
Write Operations: 175/sec
Read Operations: 1/sec (This Utilizes MemCache)
Memory Usage: 1.2GB
The DB has the following hardware specs:
8 vCPUs
32 GB Mem
1000 Mbps EBS Optimized
Also, from what I can tell, RDS has the 'Max_Connections' setting in MySQL set to 2,664.
So I can't understand why it is rejecting new connections at such a comparatively low count. Is there another setting that controls this, either in RDS or in MariaDB?
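One thing worth checking is what the server itself reports for the connection limits and counters. A minimal sketch, with placeholder endpoint and credentials:
# compare the configured limits against what has actually been reached
mysql -h <RDS_ENDPOINT> -u <USER> -p -e "SHOW VARIABLES LIKE 'max%connections'; SHOW GLOBAL STATUS WHERE Variable_name IN ('Threads_connected','Max_used_connections','Aborted_connects');"
In particular, a per-account max_user_connections limit (or a GRANT ... WITH MAX_USER_CONNECTIONS clause) will refuse new connections long before the global max_connections of 2,664 is reached.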

Docker uses all memory and crashes the system

I have an AWS t2.micro EC2 instance with Docker on it, and I bring up the following containers:
jwilder/nginx-proxy
mysql
wordpress
This results in docker stats output like the following:
CONTAINER MEM USAGE/LIMIT MEM %
wordpress 331.9 MB/1.045 GB 31.77%
nginx 18.32 MB/1.045 GB 1.75%
mysql 172.1 MB/1.045 GB 16.48%
Then I run siege's default of 15 concurrent connections against it, which spawns multiple Apache processes, hits the memory limit of the EC2 instance, and crashes Docker and bash due to lack of memory, requiring my intervention to get it all running again.
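For reference, the kind of invocation meant here (the host is a placeholder; -c sets the concurrency, -t the duration):
siege -c 15 -t 60S http://<EC2_HOST>/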
I have a couple of questions regarding this.
Am I expecting too much? Should this setup be able to handle 15 concurrent connections? If so, what changes* need to be made?
How can I automate recovery from this? Is there a way to detect that memory is reaching capacity and do something (like reject requests or similar) until memory usage decreases? Is there a way to keep the system stable during the high request volume so once it's over it does not require my intervention to bring it back up?
* I've already done this to drop mysql memory from 22% to 15%.
Given a t2.micro only has 1GB total, and each of those containers has a 1GB limit on its own, have you tried limiting the max memory usage on each container (as per http://docs.docker.com/engine/reference/run/#user-memory-constraints) such that the total memory limit doesn't exceed 1GB?
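A minimal sketch of that suggestion with illustrative caps (256m + 384m + 128m keeps the total at 768MB, leaving headroom for the host); the image names are the ones listed above, everything else is a placeholder:
# -m caps each container's memory; the wordpress image picks up the linked mysql container,
# and VIRTUAL_HOST is what jwilder/nginx-proxy uses for routing
docker run -d --name mysql -m 256m -e MYSQL_ROOT_PASSWORD=<PASSWORD> mysql
docker run -d --name wordpress -m 384m --link mysql:mysql -e VIRTUAL_HOST=<YOUR_DOMAIN> wordpress
docker run -d --name nginx-proxy -m 128m -p 80:80 -v /var/run/docker.sock:/tmp/docker.sock:ro jwilder/nginx-proxy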
The biggest impact, which stopped the EC2 instance from falling over, came from limiting the memory a Docker container can use with the -m option, per @palfrey's answer.
Some additional tweaks were required to reduce the memory footprint and have the service respond to 15 concurrent users, albeit somewhat slowly. These included:
MySQL
Disabling performance_schema
Using a minimal config (one way to apply these is sketched after this list)
WordPress
Disabling KeepAlive
Limiting servers:
<IfModule mpm_prefork_module>
StartServers 1
MinSpareServers 1
MaxSpareServers 3
MaxRequestWorkers 10
MaxConnectionsPerChild 3000
</IfModule>
Docker
I created some Docker images that extend the default images to include these optimisations:
mysql-minimal
wordpress-minimal
Further details in my blog post.
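For reference, one way to apply the MySQL tweaks without building a custom image is to pass mysqld flags straight through the official image (values illustrative; the official mysql image forwards extra arguments to mysqld):
docker run -d --name mysql -m 256m -e MYSQL_ROOT_PASSWORD=<PASSWORD> mysql --skip-performance-schema --innodb-buffer-pool-size=64M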
Probably; a t2.micro only has 1GB of RAM. You can run this configuration without Docker just fine, but you do have to adjust for the memory limitations. Docker probably adds some overhead. Is there a reason for running both nginx and Apache?
Generally you test and limit your threads to what the system can handle; there are probably things you can do with caching that will help improve performance. Apache, nginx, and php-fpm all have settings that control the number of threads or processes that are allowed to be created.

mysql server overload due to many queries at once

I have an application on my server that makes many database requests to a reasonably simple and small database (10MB in size).
The number of simultaneous requests can be around 500. I have an Apache & MySQL server running on Linux with 8GB RAM and 3 cores.
I've upgraded the server recently (from 512MB to 8GB), but this is not having an effect. It seems that the additional CPU and RAM are not being used. Before, the CPU hit 100%, but after the upgrade I still get status WARN at only 40% CPU usage:
Free RAM: 6736.94 MB
Free Swap: 1023.94 MB
Disk i/o: 194 io/s
In the process list, the mysqld CPU usage is 100%.
I can't figure out what the right settings are to make the hardware upgrade work for MySQL and the MyISAM engine.
I have little experience with setting up and configuring a server, so detailed comments or help are very welcome.
UPDATE #1
The MySQL requests are both reads and writes, coming from a large number of PHP scripts.
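Given the MyISAM engine, one likely culprit is table-level locking: every write locks the whole table, so a mixed load of ~500 concurrent reads and writes serializes inside mysqld no matter how many cores or how much RAM you add. A quick check, with placeholder credentials:
# a high Table_locks_waited relative to Table_locks_immediate indicates lock contention
mysql -u root -p -e "SHOW GLOBAL STATUS LIKE 'Table_locks%'; SHOW FULL PROCESSLIST;"
If that is the bottleneck, switching the hot tables to InnoDB (row-level locking) will do more than any amount of hardware.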