I need your help deciding the RAM size for a Couchbase server. Here are my requirements:
I have 1 TB disk space.
Expected load is 1K transactions per second.
We see more read volume than write volume.
Can you please suggest how much RAM I should allocate, and help me understand how to arrive at that figure?
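To show what I mean, here is the kind of back-of-the-envelope calculation I am hoping someone can confirm or correct. Every number in it (document count, key and value sizes, metadata bytes per document, resident ratio, high-water mark) is a placeholder I made up, not part of my actual requirements:

    // Rough sketch of how I imagine the RAM quota is derived - all inputs are made-up placeholders.
    public class CouchbaseRamSizing {
        public static void main(String[] args) {
            long documents       = 50_000_000L; // guessed document count
            int  keyBytes        = 40;          // average key length
            int  valueBytes      = 2_000;       // average document size
            int  metadataBytes   = 56;          // per-document metadata kept in RAM (approximate)
            int  copies          = 1 + 1;       // active + 1 replica
            double residentRatio = 0.30;        // fraction of values to keep cached (read-heavy working set)
            double highWaterMark = 0.85;        // ejection starts above this fraction of the quota

            double metadataRam = (double) documents * (metadataBytes + keyBytes) * copies;
            double workingSet  = (double) documents * valueBytes * copies * residentRatio;
            double ramQuota    = (metadataRam + workingSet) / highWaterMark;

            System.out.printf("Suggested cluster RAM quota: %.1f GB%n", ramQuota / (1L << 30));
        }
    }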
Thanks for the help.
Related
I'm currently running a Node.js server that communicates with a remote MySQL database and performs web requests to various APIs. When the server is idle, CPU usage ranges from 0-5% and RAM usage sits at around 300 MB. Yet when the server is under load, RAM usage goes up linearly and CPU usage jumps all around, even up to 100% at times.
I set up a snapshot solution that would take a snapshot of the heap when a leak was detected using node-memwatch. I downloaded 3 different snapshots when the server was at 1 GB, 1.5 GB, and 2.5 GB RAM usage and attempted to analyze them, yet I have no idea where the problem is, because the total amount of storage in the analysis seems to add up to something much lower.
Here is one of the snapshots, when the server had a memory usage of 1107MB.
https://i.gyazo.com/e3dadeb727be3bdb4eeb833094291ebf.png
Does that match up? From what I see, there is only a maximum of 500 MB allocated to objects there. Also, would anyone have any ideas about the crazy CPU usage I'm getting? Thanks.
What you need is a better tool to properly diagnose that leak. It looks like you can get some help from N|Solid (https://nodesource.com/products/nsolid); it will help you visualize and monitor your app, and it is free to use in a development environment.
Okay, since I don't have 10 reputation I'm unable to post images, but I will try to explain in text.
I have a 7 node Couchbase (Community) cluster with 4 buckets.
Recently I've been getting spammed (constantly) with metadata overhead warnings for one of the buckets.
The warning pops up and looks like this:
Metadata overhead warning. Over 62% of RAM allocated to bucket XXXX on node "xxx" is taken up by keys and metadata.
And I've read that this is usually a sign that the bucket needs more RAM. But I don't think that is the issue for me; I simply have a lot of metadata, I would guess.
When I look at the Data Buckets tab, this bucket has a RAM/Quota Usage of 64GB/75GB. So to me it looks like there is around 11 GB (75 - 64 GB) available.
If I look at the Bucket Analytics VBUCKET RESOURCES metrics, I see that there is 59 GB of user data in RAM and 46 GB of metadata in RAM. So to my understanding there should be 105 GB in RAM on a bucket that has a total of 75 GB?!
But that doesn't add up for me, so clearly there is something I don't understand here.
And yes, 46 GB of 75 GB is around 62%. But what about the 59 GB of user data that is supposedly in RAM?
EDIT:
A typical document can look like this:
ID=1:CAESEA---rldZ5PhdV4msSdEchI
CONTENT=z2TjZEzkZ84=
And to my question: what do I do? Is the situation acceptable in my circumstances? If so, do I change the threshold for that warning (which I read is not recommended, since the warning is set at 50% for a reason)?
Or do I assign more RAM? And if so, how does that help me if there is already 11 GB free?
Please help me clarify these numbers and suggest if I need to take any actions.
First of all, there isn't necessarily a problem with having a high percentage of memory used by metadata - it just means there's less RAM available for caching actual documents. If your application is working well, then it may be fine for your use case. Having said that, let me try to address your questions and what to change if you do want to improve things:
If I look at the Bucket Analytics VBUCKET RESOURCES metrics, I see that there is 59 GB of user data in RAM and 46 GB of metadata in RAM. So to my understanding there should be 105 GB in RAM on a bucket that has a total of 75 GB?!
IIRC "user data in RAM" is inclusive of "metadata in RAM" - so you have a total of 59 GB data used, of which 46 GB is metadata.
And to my question: what do I do? Is the situation acceptable in my circumstances? If so, do I change the threshold for that warning (which I read is not recommended, since the warning is set at 50% for a reason)?
Or do I assign more RAM? And if so, how does that help me if there is already 11 GB free?
So basically you are storing lots of very small documents, which means the per-document metadata overhead (~48 bytes plus the length of the key) is very high compared to the actual document size; there is a rough calculation after the list below.
The 11 GB free is mainly made up of the difference between the bucket quota and the high watermark.
Here are a few options to improve this:
Allocate more RAM to the bucket (as you mentioned) - if there's any unallocated RAM in the Server Quota.
Add more memory to the nodes (and allocate to the server quota and bucket).
Reduce the number of replicas (if that's acceptable to you) - at the moment you are essentially storing each object (and its metadata) three times: once for the active vBuckets and twice for the two replica vBucket sets.
Change your documents to have shorter keys - This will reduce the average metadata per document.
Consolidate multiple documents into one - This will reduce the number of documents, and hence the overall metadata overhead.
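To make the first point concrete, here is a rough sketch. The key and value lengths are only guesses based on the example document in the question, and 48 bytes is an approximation of the fixed per-document overhead:

    // Illustrative only - key/value sizes are guesses from the example document above.
    public class MetadataOverheadSketch {
        public static void main(String[] args) {
            int keyBytes      = 30; // e.g. "1:CAESEA---rldZ5PhdV4msSdEchI"
            int valueBytes    = 12; // e.g. "z2TjZEzkZ84="
            int metadataBytes = 48; // approximate fixed per-document overhead
            int copies        = 3;  // 1 active + 2 replica vBucket sets

            int metadataPerDoc = (metadataBytes + keyBytes) * copies; // the key counts as metadata
            int valuePerDoc    = valueBytes * copies;
            double metadataShare = 100.0 * metadataPerDoc / (metadataPerDoc + valuePerDoc);

            System.out.printf("Per document: %d bytes of metadata vs %d bytes of value (%.0f%% metadata)%n",
                    metadataPerDoc, valuePerDoc, metadataShare);
        }
    }

With documents this small, shorter keys or consolidating documents will move the needle far more than adding RAM.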
Some time ago our Couchbase cluster started to read data from disk because memory was full. We increased the amount of memory, but Couchbase still reads from disk. Disk reads greatly increase the number of errors in our software. I'm wondering: is there a way to copy data from disk to memory so that Couchbase can work normally again?
CentOS 5.6
Couchbase v1.8
As documented here: http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-introduction-architecture-diskstorage.html, Couchbase tries to keep the dataset in memory, so when you access a document it will be put into memory.
When adding physical memory, you will also need to increase the RAM Quota of your cluster/nodes.
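For example, here is a minimal sketch of raising the cluster RAM quota over the REST API; the host, credentials, and the 8192 MB value are placeholders, and you should double-check the endpoint against the manual for your Couchbase version:

    // Sketch: raise the cluster RAM quota via the REST API (POST /pools/default, quota in MB).
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class RaiseRamQuota {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:8091/pools/default");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("POST");
            String auth = Base64.getEncoder()
                    .encodeToString("Administrator:password".getBytes(StandardCharsets.UTF_8));
            conn.setRequestProperty("Authorization", "Basic " + auth);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            conn.setDoOutput(true);
            try (OutputStream out = conn.getOutputStream()) {
                out.write("memoryQuota=8192".getBytes(StandardCharsets.UTF_8)); // new quota in MB
            }
            System.out.println("HTTP " + conn.getResponseCode());
        }
    }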
Do you have information about the cache misses?
Do you want to put all the documents in memory? (Do you have enough RAM?)
Dumb question: I have 4 GB of RAM and my dataset is around 500 MB. How can I make sure MySQL/InnoDB is keeping my dataset in RAM?
MySQL Tuning Primer gives you lots of info and recommendations regarding your MySQL performance. Keep in mind (and it will warn you) that the instance should be running for a period of time to give you accurate feedback.
Set innodb_buffer_pool_size to 3G - InnoDB will load as much data as it can into the buffer pool.
Darhazer is right (I'd vote him up but don't have the rep points). It's generally recommended to set innodb_buffer_pool_size to 70-80% of memory, although it's really more complicated than that, since you need to account for how much RAM other parts of your system are actually using.
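One way to sanity-check whether the dataset is actually being served from RAM is to compare the buffer pool's logical reads with its physical (disk) reads. A minimal JDBC sketch - the connection URL and credentials are placeholders - could look like this:

    // Sketch: estimate the InnoDB buffer pool hit rate from the standard status counters.
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class BufferPoolHitRate {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:mysql://localhost:3306/", "root", "password");
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(
                         "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'")) {
                long diskReads = 0, logicalReads = 0;
                while (rs.next()) {
                    String name = rs.getString(1);
                    if (name.equals("Innodb_buffer_pool_reads")) diskReads = rs.getLong(2);
                    if (name.equals("Innodb_buffer_pool_read_requests")) logicalReads = rs.getLong(2);
                }
                double hitRate = 100.0 * (logicalReads - diskReads) / Math.max(logicalReads, 1);
                System.out.printf("Buffer pool hit rate: %.2f%% (reads that went to disk: %d)%n",
                        hitRate, diskReads);
            }
        }
    }

A hit rate close to 100% after a warm-up period means the 500 MB dataset is effectively resident in the buffer pool.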
Data to be cached:
100 GB of data
Objects of size 500-5000 bytes
1000 objects updated/inserted on average per minute (peak 5000)
Need a suggestion for a Coherence topology in production and test (distributed with backup):
number of servers
nodes per server
heap size per node
Questions
How much free memory is needed per node compared to the memory used by cached data (assuming 100% usage is not possible)?
How much overhead will 1-2 additional indexes per cache element generate?
We do not know how many read operations will be done; the solution will be used by clients for whom low response times are more critical than data consistency, and this depends on each use case. The cache will be updated from the DB by polling at a fixed frequency and populating the cache (since the cache is the data master, not the system using the cache).
The rule of thumb for sizing a JVM for Coherence is that the data is 1/3 the heap assuming 1 backup: 1/3 for cache data, 1/3 for backup, and 1/3 for index and overhead.
The biggest difficulty in sizing is that there are no good ways to estimate index sizes. You have to try with real-world data and measure.
A rule of thumb for JDK 1.6 JVMs is to start with 4 GB heaps, so you would need 75 cache server nodes. Some people have been successful with much larger heaps (16 GB), so it is worth experimenting. With large heaps (e.g., 16 GB) you should not need as much as 1/3 for overhead and can hold more than 1/3 of the heap as data. With heaps greater than 16 GB, garbage collector tuning becomes critical.
For maximum performance, you should have 1 core per node.
The number of server machines depends on the practical limits of manageability, capacity (cores and memory), and failure. For example, even if you have a server that can handle 32 nodes, what happens to your cluster when a machine fails? The cluster will be machine-safe (backups are not on the same machine), but recovery would be very slow given the massive amount of data to be moved to new backups. On the other hand, 75 machines are hard to manage.
I've seen Coherence achieve latencies of 250 microseconds (not milliseconds) for a 1K object put, including network hops and backup. So the number of inserts and updates you are looking for should be achievable. Test with multiple threads inserting/updating, and make sure your test client is not the bottleneck.
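Putting those rules of thumb into numbers for the data set in the question (100 GB of primary data, 4 GB heaps, and the 1/3 split described above - all taken from this thread rather than measured):

    // Worked version of the 1/3-heap rule of thumb for the 100 GB data set above.
    public class CoherenceSizingSketch {
        public static void main(String[] args) {
            double dataGb         = 100.0;     // primary copy of the cached data
            double heapGbPerNode  = 4.0;       // conservative JDK 1.6 starting point
            double usableFraction = 1.0 / 3.0; // 1/3 data, 1/3 backup, 1/3 indexes + overhead

            double totalHeapGb = dataGb / usableFraction;             // 300 GB of heap overall
            int nodes = (int) Math.ceil(totalHeapGb / heapGbPerNode); // 75 cache server nodes

            System.out.printf("Total heap needed: %.0f GB -> %d nodes with %.0f GB heaps%n",
                    totalHeapGb, nodes, heapGbPerNode);
        }
    }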
A few more "rules of thumb":
1) For high availability, three nodes is a good minimum.
2) With Java 7, you can use larger heaps (e.g. 27GB) and the G1 garbage collector.
3) For 100GB of data, using David's guidelines, you will want 300GB total of heap. On servers with 128GB of memory, that could be done with 3 physical servers, each running 4 JVMs with 27GB heap each (~324GB total).
4) Index memory usage varies significantly with data type and arity. It is best to test with a representative data set, both with and without indexes, to see what the memory usage difference is.