I plan to use a Couchbase bucket for caching results from database calls. If one of the Couchbase servers in the cluster goes down and later comes back up, I want to force expiration of any documents persisted on that server. How can I do that? Also, how does the performance of a memcached bucket compare to a Couchbase bucket?
Couchbase persists the expiration if an item has one, so if your item expires while the server is down and you then restart the server, the item will be deleted during the warmup process.
There's no support for flushing just a single node's vBuckets, but you can flush the whole Bucket (across all nodes) by simply deleting and re-creating the Bucket.
This can be done using the REST API - see Deleting a Bucket and Creating and Editing a Bucket. You may also have this wrapped up in an SDK call, depending on which SDK you're using.
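For illustration, here is a rough Python sketch of the delete-and-recreate cycle against the Couchbase REST API on port 8091. The admin credentials, host and bucket settings are placeholders, and the exact parameters accepted can vary between server versions, so treat this as a starting point rather than a drop-in script.

```python
import time
import requests

# Placeholder values - substitute your cluster admin credentials,
# node address and bucket settings.
ADMIN = ("Administrator", "password")
BASE = "http://127.0.0.1:8091"
BUCKET = "cache"

# Delete the bucket - this drops all of its documents on every node.
resp = requests.delete(f"{BASE}/pools/default/buckets/{BUCKET}", auth=ADMIN)
resp.raise_for_status()

# Give the cluster a moment to finish tearing the bucket down.
time.sleep(5)

# Re-create a bucket with the same name and settings.
resp = requests.post(
    f"{BASE}/pools/default/buckets",
    auth=ADMIN,
    data={
        "name": BUCKET,
        "bucketType": "couchbase",  # or "memcached"
        "ramQuotaMB": 256,
        "authType": "sasl",
        "replicaNumber": 1,
    },
)
resp.raise_for_status()
```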
Related
I have been working on the server migration of a legacy ecommerce application using PHP 5.6.
The switch involved two Dedicated 32 servers from Linode.
One server is for NginX + PHP and the other is for MySQL only.
The legacy application leverages memcached.
After the switch, I can see heavy internal traffic from private inbound and outbound connections.
So far this hasn't caused any performance problems.
However, I was under the impression that query results would be cached on the local machine, not on the remote one.
Because if the result is cached on the remote host, it still has to be transmitted over the private network, instead of being read from local RAM or the local SSD.
Am I assuming this wrong?
It may be that I'm missing why going over the private network for a cache can still be better for overall performance than a local cache.
MySQL has a feature called the Query Cache, but this caches query result sets in the mysqld server process, not on the client. If you run the exact same query again after the result has been cached in the Query Cache, it will copy the result from the Query Cache and avoid the cost of running the query again. But this will not avoid the time to transfer the result across the network from mysqld to your PHP application.
Also keep in mind that the MySQL Query Cache is deprecated (as of MySQL 5.7.20) and removed entirely in MySQL 8.0.
Alternatively, your application may store data from query results in memcached, but typically this would be done by the application code (I know there are UDFs to read and write memcached from MySQL triggers, but this is a bad idea).
If your memcached service is not on the same host as your PHP code, the data crosses the network multiple times: once when querying it from MySQL the first time, again when writing it into memcached, and then every time you fetch the cached data out of memcached.
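What that application-side caching usually looks like is the classic cache-aside pattern: check memcached first, fall back to MySQL on a miss, then write the result into memcached so later requests skip the database. A minimal sketch, shown in Python rather than PHP and with placeholder hosts, table and key names:

```python
import json
import pymysql
from pymemcache.client.base import Client

# Placeholder endpoints - in this setup memcached lives on the web server
# and MySQL on the remote database server.
cache = Client(("127.0.0.1", 11211))
db = pymysql.connect(host="db.internal", user="app", password="secret",
                     database="shop", cursorclass=pymysql.cursors.DictCursor)

def get_product(product_id, ttl=300):
    key = f"product:{product_id}"

    # 1. Try the cache first (stays on the local machine here).
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)

    # 2. Cache miss: query MySQL over the private network.
    with db.cursor() as cur:
        cur.execute("SELECT id, name, price FROM products WHERE id = %s",
                    (product_id,))
        row = cur.fetchone()

    # 3. Populate the cache so subsequent requests avoid the database.
    if row is not None:
        cache.set(key, json.dumps(row, default=str), expire=ttl)
    return row
```

With memcached running locally on the web server, as in your setup, step 1 never leaves the machine; only cache misses pay the private-network round trip to MySQL.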
PHP also has some features to do in-memory caching, such as APCu. I don't have any experience with this, and it's not clear from a brief scan of the documentation where it stores cached data.
PHP is designed to be a "shared nothing" language. Every PHP request has its own data, and data doesn't normally last until the next request. This is why a cache is typically not kept in PHP memory. Applications rely on either memcached or the database itself, because those will hold data longer than a single PHP request.
If you have a fast enough network, it shouldn't be a high cost to fetch items out of a cache over a network. The performance architects at a past job of mine developed this wisdom:
"Remote memory is faster than local storage."
They meant that if the data is in RAM on a server, then reading it from RAM even with the additional overhead of transferring it across a network is usually better than reading the data from persistent (disk) storage on the local host.
We are building a small advertising platform that will be used on several client sites. We want to setup multiple servers and load balancing (using Amazon Elastic Load Balancer) to help prevent downtime.
Our basic functions include rendering HTML for ads, recording click information (IP, user-agent, location, etc.), redirecting traffic with their click ID as a tracking variable (?click_id=XX), and basic data analysis for clients. It is very important that two different clicks don't end up with the same click ID.
I understand the basics of load balancing, but I'm not sure how to setup the MySQL server(s).
It seems there are a lot of options out there: master-master, master-slave, clusters, shards.
I'm trying to figure out what is best for us. The most important aspects we are looking for are:
Uptime - if one server goes down, another automatically takes over.
Load sharing - keep CPU and RAM usage spread out.
From what I've read, it sounds like my best option might be a Master with 2 or more slaves. The Master would not handle any HTTP traffic; that would go to the slaves only. The Master server would therefore only be responsible for database writes.
Would this slow down our click script? Since we have to insert first to get a click ID before redirecting, the Slave would have to contact the Master and then return with the data. Right now our click script is lightning fast and I'd like to keep it that way.
Also, what happens if the Master goes down? Would a slave be able to serve as the Master until the Master was back online?
If you use Amazon's managed database service, RDS, this will take a lot of the pain out of managing your database.
You can select the multi-AZ option on your master database instance to provide a redundant, synchronously replicated standby in another availability zone. In the event of a failure of the instance or of the entire availability zone, Amazon will automatically flip the DNS record for your master instance to point at the standby in the backup AZ. This process, on vanilla MySQL or MariaDB, can take a couple of minutes, during which time your application will be unable to write to the database.
You can also provision up to 5 read replicas for a MySQL or MariaDB instance that will replicate from the master asynchronously. You could then use an Elastic Load Balancer (or other TCP load balancer such as HAProxy or MariaDB's MaxScale for a more MySQL aware option) to distribute traffic across the read replicas. By default each of these read replicas will have a full copy of the master's data set but if you wanted to you could attempt to manually shard the data across these. You'd have to have some more complicated logic in your application or the load balancer to work out where to find the relevant shard of the data set though.
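To make the read/write split concrete, here is a hedged Python sketch. The endpoint names, table and columns are hypothetical; the point is that the INSERT that generates the click ID always goes to the writer, while reporting queries can go to whatever DNS name or load balancer fronts the replicas:

```python
import pymysql

# Hypothetical endpoints - the RDS writer endpoint, plus a name that
# fans reads out across the replicas (ELB, HAProxy, MaxScale, ...).
WRITER = "ads-master.example.us-east-1.rds.amazonaws.com"
READER = "ads-replicas.example.internal"

def connect(host):
    return pymysql.connect(host=host, user="app", password="secret",
                           database="ads", autocommit=True)

writer = connect(WRITER)
reader = connect(READER)

def record_click(ip, user_agent):
    """Writes always go to the master; AUTO_INCREMENT keeps click IDs unique."""
    with writer.cursor() as cur:
        cur.execute(
            "INSERT INTO clicks (ip, user_agent) VALUES (%s, %s)",
            (ip, user_agent),
        )
        return cur.lastrowid  # the click_id used in ?click_id=XX

def get_click(click_id):
    """Reports and lookups can be served from the replicas."""
    with reader.cursor() as cur:
        cur.execute("SELECT * FROM clicks WHERE id = %s", (click_id,))
        return cur.fetchone()
```

Because the replicas replicate asynchronously, a row written a moment ago may not be visible on a replica yet, so keep the click INSERT and its immediate redirect on the writer connection.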
You can choose to promote a read replica into a standalone master, which breaks replication from the original master and gives you a standalone instance that can then be reconfigured to match your previous setup (or something different if you want, using the data set you had at the point of promotion). It doesn't sound like something you need here, though.
Another option would be to use Amazon's own flavour of MySQL, Aurora, on RDS. Aurora is completely MySQL wire-compatible, so you can use whatever MySQL driver your application already uses to talk to it. Aurora allows up to 15 read replicas and makes load balancing largely transparent: you give your application the Aurora cluster endpoints, writes go to the master via the writer endpoint, and reads are balanced across however many read replicas you have in the cluster via the reader endpoint. In my limited testing, Aurora's failover between instances is pretty much instant too, which minimises downtime in the event of a failure.
I'm running a two-server Couchbase 2.5.1 Enterprise Edition (build-1083-rel) cluster on Windows.
I created a test bucket to play with. After some experiments I decided to "purge" it by deleting and re-creating it.
Now I can't re-create a bucket with the same name - I keep getting "Bucket with given name still exists". I discovered that a folder with the bucket's name still exists on the secondary server in the cluster (30 minutes after deletion). I tried deleting this folder manually without restarting Couchbase. The deletion succeeded, but I still can't re-create a bucket with the same name (I still get "Bucket with given name still exists").
How do I fix this?
I don't know how the cluster got into this situation, but you may be able to fix it by rebalancing out the node which still has the 'old' bucket present (UI: Server Nodes -> Remove, then Rebalance) and then rebalancing it back in, assuming that your first node has enough disk space for all the Bucket data.
I have a cluster with four nodes.
If I stop one node and edit its configuration file (adding a new replicated cache), will the cluster have the new replicated cache when I start the node again?
Is it necessary to change the configuration file on the other three nodes?
a) Yes, the new replicated cache will be created on that node. However, if you have the same cache name with different configurations on different nodes, you're asking for trouble.
b) No, the configuration on the other nodes will not change. You have to change it manually, either by stopping the nodes or by doing a rolling upgrade.
You may also look into the JMX operations for starting/stopping caches, but these don't let you change the configuration (I am not 100% sure whether starting a cache with an unknown name would create a new cache with the default configuration).
If you have programmatic access to the CacheManager, you can start a cache with a configuration provided programmatically.
I am building an architecture on AWS with several EC2 instances as webservers and a central MySQL database (RDS). The EC2 instances have Redis installed for caching single db rows. When a row changes in MySQL, I want every instance to update the corresponding cache entries too.
What is the best way to do this in the AWS environment?
Don't use triggers for this. Ensure things are properly committed (as opposed to rolled back), and then flush from within the application layer.
If you don't, you can end up with concurrent requests re-filling the cache with the old data (since they don't see the new data yet) right after your SQL trigger deletes it from the cache.
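A minimal sketch of that order of operations, using hypothetical table and key names with pymysql and redis-py:

```python
import pymysql
import redis

db = pymysql.connect(host="mydb.rds.amazonaws.com", user="app",
                     password="secret", database="shop")
cache = redis.Redis(host="localhost", port=6379)  # Redis on this instance

def update_price(product_id, new_price):
    # 1. Write to MySQL and make sure the transaction actually committed.
    with db.cursor() as cur:
        cur.execute("UPDATE products SET price = %s WHERE id = %s",
                    (new_price, product_id))
    db.commit()

    # 2. Only then invalidate the cached row, so a concurrent reader cannot
    #    re-fill the cache with data that was about to be rolled back.
    cache.delete(f"product:{product_id}")
```

Note that this only clears the Redis on the instance that performed the write; to reach the Redis running on every EC2 instance you need some fan-out mechanism, such as the queue approach described below.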
If you are using a queue server (Amazon SQS, Redis pub/sub, etc.), you could put an entry onto a queue for each record you want expired, and have a worker listening to the queue; when it gets a message telling it which record to invalidate, it connects to the cache and expires that record.
This works whether you have one cache server or many: you just need one worker for each cache server, or one worker that can connect to each cache server. Multiple workers scale better.
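A rough sketch of such a worker with boto3 and redis-py, using a hypothetical queue URL and key scheme; one copy would run alongside each instance's Redis:

```python
import json
import boto3
import redis

# Hypothetical names - each EC2 instance runs one copy of this worker,
# pointed at its own local Redis.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/cache-invalidation"

sqs = boto3.client("sqs", region_name="us-east-1")
cache = redis.Redis(host="localhost", port=6379)

while True:
    # Long-poll the queue for invalidation messages.
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10,
                               WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        body = json.loads(msg["Body"])      # e.g. {"key": "product:42"}
        cache.delete(body["key"])           # expire the stale entry
        sqs.delete_message(QueueUrl=QUEUE_URL,
                           ReceiptHandle=msg["ReceiptHandle"])
```

Keep in mind that SQS delivers each message to only one consumer, so with multiple workers you would want one queue per worker (for example fanned out via SNS), whereas Redis pub/sub broadcasts each message to every subscriber.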