Couchbase 1.8.0 concurrency: number of concurrent requests supported by the Java client and server

Is there any limit on the number of requests per second, or the number of simultaneous requests, that the server can serve? [I mean a limit in configuration, not one due to hardware such as RAM or CPU.]
Is there any limit on the number of simultaneous requests on one instance of CouchbaseClient in a Java servlet?
Is it best to create a single CouchbaseClient instance and keep it open, or to create multiple instances and destroy them?
Is Moxi helpful with Couchbase 1.8.0 server and Couchbase Java client 1.0.2?
I need this information to set up the application in production.
Thank you.

The memcached instance that runs behind Couchbase has a hard
connection limit of 10,000 connections. Couchbase in general
recommends that you increase the number of nodes to spread
traffic at that level.
The client itself does not have a hardcoded limit in regards to how
many connections it makes to a Couchbase cluster.
Couchbase generally recommends that you create a connection pool
from your application to the cluster and re-use those
connections rather than creating and destroying them over and over. In
heavily loaded applications, repeatedly creating and destroying
connections can get very expensive from a resource
perspective.
Moxi is an integrated piece of Couchbase. However, it is generally
in place as an adapter layer that client developers can choose to
use, or to give legacy access to applications designed to talk
directly to a memcached interface. If you are using the Couchbase client
driver you won't need to use the Moxi interface.
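The recommendation above to re-use connections instead of creating and destroying them can be sketched as a small pool. This is only an illustration of the pattern, written in Python for brevity: FakeClient and ClientPool are hypothetical names, and FakeClient is a stand-in that counts constructions, not the real Couchbase client.

```python
import queue
from contextlib import contextmanager


class FakeClient:
    """Hypothetical stand-in for a real cluster client; counts how many were built."""
    created = 0

    def __init__(self):
        FakeClient.created += 1

    def get(self, key):
        return f"value-for-{key}"


class ClientPool:
    """Builds a fixed number of clients once and hands them out for reuse."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeClient())

    @contextmanager
    def borrow(self):
        client = self._idle.get()      # blocks if every client is in use
        try:
            yield client
        finally:
            self._idle.put(client)     # return it to the pool instead of destroying it


pool = ClientPool(size=2)
results = []
for i in range(100):
    with pool.borrow() as client:
        results.append(client.get(f"user:{i}"))
```

In this sketch, one hundred requests are served by the same two client objects, so the cost of opening a connection is paid twice rather than once per request.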

Related

Does RDS Proxy affect existing application-side pooling?

I have a SaaS application on AWS ECS and databases on AWS RDS. We are planning to implement AWS RDS Proxy for connection pooling. From the RDS Proxy documentation, I saw that we don't need to make any changes to the application code. Currently, we are using application-side connection pooling. When we implement RDS Proxy for pooling, will it have any impact on the current pooling?
Do we need to remove the application-side pooling to work with RDS Proxy effectively?
My main concern is this: if I choose 100% pooling in RDS Proxy, and the application's pooling configuration limits it to, say, 100 max connections, will that be a bottleneck?
TLDR: keep the connection pool in your application, and size it to the number of connections required by that one instance of your application (e.g. the ECS task or EKS pod).
With a database proxy in the middle, there are two separate legs to a "connection":
First, there is a connection from the application to the proxy. What you called the "application side pooling" is this type of connection. Since there's still overhead associated with creating a new instance of this type of connection, continuing to use a connection pool in your application probably is a good idea.
Second, there is a connection from the proxy to the database. These connections are managed by the proxy. The number of connections of this type is controlled by a proxy configuration. If you set this configuration to 100%, then you're allowing the proxy to use up to the database's max_connections value, and other clients may be starved for connections.
So, when your application wants to use a connection, it needs to get a connection from its local pool. Then, the proxy needs to pair that with a connection to the database. The proxy will reuse connections to the database where possible (this technique also is called multiplexing). Or, quoting the official docs: "You can open many simultaneous connections to the proxy, and the proxy keeps a smaller number of connections open to the DB instance or cluster. Doing so further minimizes the memory overhead for connections on the database server. This technique also reduces the chance of "too many connections" errors."
As your container orchestrator (e.g. ECS or EKS) scales your application horizontally, your application will open/close connections to the proxy, but the proxy will prevent your database from becoming overwhelmed by these changes.
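The two legs described above can be put into rough numbers. The following is a minimal sketch with made-up figures: the max_connections value, the task count, and the per-task pool size are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def proxy_budget(db_max_connections, max_connections_percent):
    """Database connections the proxy may open, given its percentage setting."""
    return db_max_connections * max_connections_percent // 100


def client_side_connections(task_count, pool_max_per_task):
    """Worst-case connections from all application tasks to the proxy."""
    return task_count * pool_max_per_task


db_max = 1000                                  # assumed max_connections on the database
budget_full = proxy_budget(db_max, 100)        # proxy may take every database connection
budget_capped = proxy_budget(db_max, 75)       # leaves 250 for other clients
to_proxy = client_side_connections(task_count=20, pool_max_per_task=100)
```

With these assumed numbers, the application tasks could hold up to 2000 connections to the proxy while the proxy opens at most 1000 (or 750) to the database. That gap is what multiplexing absorbs, and a setting below 100% leaves headroom for clients that connect to the database directly.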

Is it possible to run IoT-Agent for Ultralight 2.0 without a MongoDB link (holding data in memory)?

While configuring the IoT Agent for Ultralight 2.0, it is possible to set the Docker variable IOTA_REGISTRY_TYPE, which controls whether IoT device info is held in memory or in a database (mongodb by default). This comes from the documentation that I'm referencing.
Firstly I would like to have it set for memory and what would it imply?
Could the data be preserved only in some allocated part of memory within docker env.? Could I omit further variables within configuration file, like IOTA_MONGO_HOST (The hostname of mongoDB - used for holding device information).
The architecture of my system has a Raspberry Pi running the IoT Agent and a VM running Orion Context Broker and MongoDB. Both are reachable because they can see each other on the LAN. Is it necessary for MongoDB to be the same database for IoT Agent and Orion Context Broker if they are linked?
Is it possible to run IoT Agent with memory only type of device information persistence (instead of database type)? Will it have any effect on whole infrastructure running besides of obvious lack of device data holding?
Firstly I would like to have it set for memory and what would it imply?
There would be no need for a MongoDB database attached to the IoT Agent, but provisioned devices would not be persisted, so they could not be restored in the event of disaster recovery.
Could the data be preserved only in some allocated part of memory within docker env.?
No
Could I omit further variables within configuration file, like IOTA_MONGO_HOST (The hostname of mongoDB - used for holding device information).
The Docker Env parameters are merely overrides to the values found in the config.js within the Enabler itself, so all of the ENV variables can be omitted if you are using defaults.
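The override behaviour described above can be sketched as follows. This is not the IoT Agent's own code: the DEFAULTS table is a hypothetical stand-in for config.js, the "localhost" default for IOTA_MONGO_HOST is an assumption, and only the "mongodb" default for the registry type comes from the question.

```python
import os

# Hypothetical defaults standing in for the values in config.js.
DEFAULTS = {
    "IOTA_REGISTRY_TYPE": "mongodb",
    "IOTA_MONGO_HOST": "localhost",   # assumed value, for illustration only
}


def effective_config(env=None):
    """Start from the defaults and let any matching environment variable win."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


memory_only = effective_config({"IOTA_REGISTRY_TYPE": "memory"})
untouched = effective_config({})
```

In this sketch, setting only IOTA_REGISTRY_TYPE changes that one value, and every variable that is left out simply keeps its default.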
Is it necessary for MongoDB to be the same database for IoT Agent and Orion
Context Broker if they are linked?
The IoT Agent and Orion can run entirely separately and usually would use separate MongoDB instances. At least this would be the case in a properly architected production environment.
The Step-by-Step Tutorials are lumping everything together on one Docker engine for simplicity. A proper architecture has been sacrificed to keep the narrative focused on the learning goals. You don't need two MongoDB instances to handle fewer than 20 dummy devices.
When deploying to a production environment, try looking at the SmartSDK Recipes
in order to scale up to a proper architecture:
see: https://smartsdk.github.io/smartsdk-recipes/
Is it possible to run IoT Agent with memory only type of device information
persistence (instead of database type)? Will it have any effect on whole
infrastructure running besides of obvious lack of device data holding?
I haven't checked this, but there may be a slight difference in performance since memory access should be slightly faster. The trade-off is that you will lose the provisioned state of all devices if a failure occurs. If you need to invest in disaster recovery then MongoDB is the way to go; back up your database periodically so you can always return to the last known good state.

How does the intracluster replication on couchbase work?

How does the intracluster replication on couchbase work?
I understand that the buckets that contain the documents are subdivided into vbuckets.
The vbuckets also have replicas to provide high availability, and the master vbucket and its replicas are stored on different servers throughout the cluster. Now I want to understand how the copies are sent to the replicas. With MongoDB we have oplogs; what about Couchbase?
Couchbase Server uses the Database Change Protocol (DCP) for intracluster and intercluster replication.
From Couchbase Distributed Data Management documentation:
[DCP is] a high-performance streaming protocol that communicates the state of the data using an ordered change log with sequence numbers.
The Couchbase Forums have some commentary on the replication process in the face of node failures.
DCP facilitates many Couchbase integrations such as the Kafka Connector. See the Connector Guides for more examples.
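The idea of "an ordered change log with sequence numbers" can be illustrated with a toy model. This is not DCP itself and does not use any Couchbase API; the class and method names are hypothetical and the sketch only shows how a replica can catch up by asking for changes after the last sequence number it applied.

```python
class ActiveVBucket:
    """Toy active copy: every mutation gets the next sequence number."""

    def __init__(self):
        self.log = []          # ordered list of (seqno, key, value)

    def set(self, key, value):
        seqno = len(self.log) + 1
        self.log.append((seqno, key, value))
        return seqno

    def changes_since(self, seqno):
        """Mutations with a sequence number above `seqno`, in order."""
        return [entry for entry in self.log if entry[0] > seqno]


class ReplicaVBucket:
    """Toy replica: applies changes in order and remembers how far it got."""

    def __init__(self):
        self.data = {}
        self.last_seqno = 0

    def catch_up(self, active):
        for seqno, key, value in active.changes_since(self.last_seqno):
            self.data[key] = value
            self.last_seqno = seqno


active = ActiveVBucket()
replica = ReplicaVBucket()
active.set("user:1", "bob")
active.set("user:1", "bobby")
replica.catch_up(active)
active.set("user:2", "alice")
replica.catch_up(active)      # only the third mutation is sent this time
```

Because the replica tracks the last sequence number it applied, it can resume from that point after an interruption instead of copying everything again.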

AWS RDS read replicas interaction with application

I am very new to cloud computing. I have never worked with MySQL beyond a single instance. I am trying to understand how AWS RDS read replicas work with my application. For example, say I have 1 master and 2 read replicas. From my application server I then send this query to AWS:
SELECT * FROM users where username = 'bob';
How does this work now? Do I need to add something to my code to choose a particular read replica, or does AWS automatically reroute the request?
Amazon does not currently provide any sort of load balancing or other traffic distribution across RDS servers. When you send queries to the primary RDS endpoint, 100% of that traffic goes to the primary RDS server. You would have to architect your system to open connections to each server and distribute the queries across the different database servers.
To do this in a way that is transparent to your application, you could set up an HAProxy instance between your application and the database to manage the traffic distribution.
Using Elastic Load Balancers to distribute RDS traffic is an often-requested feature, but Amazon has given no indication that they are working on it at this time.
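Distributing queries in the application, as described above, can be sketched as a small router that picks an endpoint per statement. This is a minimal sketch only: ReadWriteRouter is a hypothetical name, the hostnames are placeholders, and the check for a leading SELECT is a simplification.

```python
import itertools


class ReadWriteRouter:
    """Sends reads to replicas in round-robin order and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter(
    primary="primary.example.internal",          # hypothetical hostnames
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)
first = router.endpoint_for("SELECT * FROM users WHERE username = 'bob';")
second = router.endpoint_for("SELECT * FROM users WHERE username = 'bob';")
write = router.endpoint_for("UPDATE users SET active = 1 WHERE username = 'bob';")
```

A real router would also have to account for replica lag, for example by sending reads that must see a just-committed write to the primary.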

Load testing an EC2 Node.js machine: how do I remotely load test 6500 QPS?

OK, I have my server built on EC2. My stack is Nginx as a load balancer, supervisord managing the Node.js processes (one process per CPU), and Redis, with master and slave on separate boxes. I have stress tested by testing failover and taking services offline. Using ApacheBench (ab) on the server itself, I can get up to 6500 QPS.
Now I need to load test remotely. What are the best open source tools to accomplish this, or the most cost-effective SaaS method? I expect 6500 QPS per server in production and need to move beyond the isolated ab test on the server to remote testing. E.g. I will have servers in Singapore and I need to test 6500 QPS from Japan, including the effect of latency. I am aware of Apache JMeter but am looking for a best-practice solution.
Thanks
I have successfully used JMeter for load testing at significant scale.
If a single load generation client cannot output enough load, you can configure JMeter with multiple load generation clients, with the load coordinated by a master instance.
Using "open source tools" implies that you have the ability to spin up servers in the zones you're interested in (e.g. Japan). If you find a cloud provider in that region, you can spin up as many load generation instances as needed. You may, however, need quite a few instances, depending on the network connectivity offered to individual instances. The nice thing about JMeter is that it can coordinate many load generation instances.
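To get a feel for how many load generation instances a remote test might need, Little's law (requests in flight = request rate x time per request) gives a rough estimate. The round-trip times and the per-generator connection capacity below are assumptions picked for illustration, not measurements, and the function names are hypothetical.

```python
import math


def required_concurrency(target_qps, round_trip_ms):
    """Little's law: requests in flight = arrival rate x time per request."""
    return math.ceil(target_qps * round_trip_ms / 1000)


def generators_needed(concurrency, connections_per_generator):
    """Load generation instances needed to keep that many requests in flight."""
    return math.ceil(concurrency / connections_per_generator)


local = required_concurrency(6500, 2)     # assumed 2 ms when testing on the same box
remote = required_concurrency(6500, 80)   # assumed 80 ms between regions
hosts = generators_needed(remote, connections_per_generator=200)  # assumed capacity
```

Under these assumptions, sustaining 6500 QPS takes about 13 requests in flight locally but about 520 across regions, which is why a remote test usually needs far more concurrent connections, and possibly several generators, to reach the same rate.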
You can use BlazeMeter as a SaaS solution. It is 100% JMeter compatible, and it offers the Japan (Tokyo) load origin location that you need.