I am learning about couchbase. It's my first experience with NoSQL databases.
In the case of a central server and many users with mobile devices. I want every user on your database has different data.
I have doubts about the sync.
To synchronize, the server will have a database per user? A database per user sounds to many databases ... otherwise I do not understand how to differentiate user data.
Is it possible to communicate between server and device through couchdatabase?.
Is it good strategy for communication with the device to write to the replica that is in the server and Couchbase is responsible for making the communication? Where I can find an example of this?
Some of these questions are unclear, but I will attempt to answer:
The server will not have a database per user. It only has the databases (or "buckets") that you set up ahead of time. User data is separated by a mechanism called channels.
Communication between server and device is the exact point of Couchbase Lite. In this scenario, you must make all changes go through Sync Gateway in order for replication to function properly. This is handled for you by Couchbase Lite, so you just need to point it to a Sync Gateway instance and let it take care of replicating between various devices.
Related
I've got a very specific use case and because I'm not too familiar with database replication, I am open to suggestions and ideas about how to accomplish the following in the best possible way:
A web application + database is running on a remote server. Let's call this set-up R for remote.
Now suppose there are 3 separate geographical locations which need read+write access to the database. I will call these locations L1, L2 and L3.
The main problem: the remote server might be unavailable or the internet connection of one of the locations might not always work, rendering the remote application unavailable; but we want the application to work as a high availability solution (on-site) even when the remote server is down or when there is an internet connection problem.
Partial solution: So I was thinking about giving each geographical location its own server with a local copy of the web application. The web application itself can get updated when needed from a version control system automatically (for example using git hooks).
So far so good... (at least I believe so?)
But what about our data? The really tricky part seems to be the database replication. Let's assume no DNS or IP failover and assume that the user first tries to access the remote server directly and if this does not work, the user can still use the local server on-site instead. This all happens inside a web browser (or similar client).
One possible (but unsatisfactory) solution would be to use master-slave replication from R (master) to L1, L2 and L3 (slaves). When doing this asynchronously this should be quite fast? I think this is a viable solution for temporary local read-only database access when the main server is broken or can't be accessed.
But... what about read-write support? I suppose we would need multi-master replication in this case, but I am afraid that synchronous replication using something like (for example) MySQL Cluster or Galera would slow things down, especially since L1, L2 and L3 are on lower bandwidth connections. And they are connected through WAN. (Also, L1, L2 or L3 might not always be online.)
The real question: How would you tackle this specific use case? At the moment I am leaning towards multi-master replication if it doesn't slow down things too much. The application itself will mainly be used by employees on-site but by some external people over WAN as well. Would multi-master replication work well? What if for example L1 is down for 24 hours and suddenly comes back on-line? What if R can't be accessed?
EXTRA: not my main question, but I also need the synchronized data to be sent securely over SSL, if possible, please take this into account for your answer.
Perhaps I am still forgetting some necessary details; if so, please respond with some feedback and I will try to update my question accordingly.
Please note that I haven't decided on a database yet and the database schema will be developed from scratch, so ideas using other databases or database engines are welcome as well. (At the moment I have most experience with MySQL and PostgreSQL)
As you are still undecided, I would strongly recommand you to have a look at MS-SQL merge replication. It is strong, highly reliable, replicates through LAN and HTTPS (so called web replication), and not that expensive.
Terminology differs from the mySql Master\Slave idea. We are here talking about one publisher, and multiple subscribers. All changes done at subscriber's level are collected and sent to the publisher, then redistributed to all subscribers (with, if needed, fancy options like 'filtered subscriptions').
Standard architecture will then be:
a publisher, somewhere on a server, which collects and redistributes changes between subscribers. Publisher might not be accessed by end users.
other database subscribers servers, either for local or web access, replicating with the publisher. Subscribers are accessed by end users.
We have been using this architecture for years, including:
one subscriber for internet access
one subscriber for intranet access
tens of subscribers for local access: some subscribers are on our constructions projects, somewhere in the desert ....
Such an architecture is not available "from the shelf" with MySQL. I guess it could be built, but it would then certainly be a lot more expensive than just buying the corresponding MS-SQL licenses. Do not forget that the free SQLEXPRESS version of MS-SQL can be a subscriber.
Be careful: If you are planning to go through such a configuration, I would (really) strongly advise you to have all primary keys set to uniqueIdentifier data type, and randomly generated. This will avoid the typical replication pitfall, where PK's are set to int with automatic increment, and where independant servers generate identical primary keys between two replications (MS-SQL proposes a tool to avoid such problems, where you can allocate PK ranges per server, but this solution is a real PITA ...).
I have 3 Beagleboards that needs to share data between each other as fast as possible. They are running debian (with a real time kernel) and are connected to each other via wlan.
All beagleboards have different sensors attached. All Beagleboards need the sensor data of the others in real time (or at least as fast as possible -the data are used in control algorithms for actuators).
The system is supposed to be used for demonstrate a concept and does not need to be 100% fault proof but as close as possible.
What is the best way to design such a system?
Ideas:
Design program for UDP broadcast and some sql server or just an object/class on the receiver end.
Embedded MySQL/High Performance MySQL with replication or cluster.
SQLite - need some addons?
Any other solutions might be better, I have never designed such a system before. Any help is much appreciated.
If "as fast as possible" is your requirement you need to do the sharing of data yourself and use the database just to store shared data.
You can implement a publisher / subscriber mechanism. One of your nodes becomes master and each of other nodes subbscribes to this node at startup. Master node multiplies and routes messages from subscribers.
Another (faster) option is implementing the publisher / subscriber mechanism without a master node. Each node registers itself to other nodes, it is similar to broadcasting you mentioned.
I have been making some research in the domain of servers for a website I want to launch. I thought of a certain configuration of a server with RAID 10 implemented with a NAS doing the backup which has a RAID 10 configuration as well. This should keep data safe in 99.99+ of cases.
My problem appeared when I thought about the need of a second server. If I shall ever require more processing power and thus more storage for users, how can I connect a second server to my primary one and make them act as one what the database (mySQL) is regarded?
I mean, I don't want to replicate my first DB on the second server and load-balance the request - I want to use just one DB (maybe external) and let the servers use it both at the same time. Is this possible? And is the option of backing up mySQL data on a NAS viable?
The most common configuration (once scaling up from a single box) is to put the database on its own server. In many web applications, the database is the bottleneck (rather than the web server); so the first hardware scale-up step tends to be to put the DB on its own server.
This also allows you to put additional security between the database and web server - firewalls are common; different user accounts etc. are pretty much standard.
You can then add web servers to the load balancer, all talking to the same database, as long as your database can keep up.
Having more than one web server also helps with resilience - you can have a catastrophic hardware event on one webserver and the load balancer will direct the traffic to the remaining machines.
Scaling the database server performance is a whole different story - though typically you use very beefy machines for the database, and relative lightweights for the web servers.
To add resilience to the database layer, you can introduce clustering - this is a fairly complex thing to keep running, but protects you against catastrophic failure of a single machine.
Yes, you can back up MySQL to a NAS.
The networking team has flagged our Ruby on Rails application as one of the top producers of network traffic on our network, specifically from packet traffic between the app server and the database server (mysql).
What are the recommended best practices to reduce traffic between a Rails app and the database? Persistent database connections?
Is it an actual problem, or do they ding the top 3 db consumers no matter what? Check your logs or have them supply you with a log of queries that they think are problematic.
Beyond that, check to see if you're doing bad things like making model calls from your views in loops. Your logs should tell you what's going on here, if you see each partial paired with a query every time it's rendered, that's a big sign that your logic should be pulled back into the models and controllers.
Fire up Wireshark or another network scanner and look for the biggest packets or small packets that are too frequent - to identify the specific, troublesome queries.
Then, before even considering caching, check if that query can really be cached or if it just pulls too much data you are not using.
At this point, there are too many different possible causes - each with it's own recommended practices.
My partner and I are trying to start a website hosted in cloud. It has pretty heavy ajax traffic and the backend handles money transactions so we need ACID in some of the DB tables.
Currently everything is running off a single server. Some of the AJAX traffic are cached in text files.
Question:
What's the best way to scale the database server? I thought about moving mysql to separate instances and do master-master duplication. However this seems tough and I heard I might lose ACID properties even with InnoDB? Is Amazon RDS a good solution?
The web server is relatively stateless except for some custom log files and the ajax cache files. What's a good way to scale to multiple web servers? I guess the custom log files can be moved to a reliable shared file system or DB but not sure what to do about the AJAX cache file coherency across multiple servers. (I dont care about losing /var/log/* if web server dies)
For performance it might be cheaper to go with larger instance with more cores and memory but eventually I would need redundancy so wondering what's the best way to do this cheaply.
thanks
take a look at this post. there is plenty of presentations on the net discussing scalability. few things i suggest to keep in mind:
plan early for the data sharding [even if you are not going to do it immediately]
try using mechanisms like memcached to limit number of queries sent to the database
prepare to serve static content from other domain, in the longer run - from ngin-x-alike server and later CDN
redundancy - depends on your needs. is 'read-only' mode acceptable for your site? if so - go with mysql replication + rsync of static files and in case of failover have your site work in that mode till you recover the master node. if you need high availability - then take a look either at drbd replication [at least for mysql] or setup with automated promotion of slave server to become master node.
you might find following interesting:
http://yoshinorimatsunobu.blogspot.com/2011/08/mysql-mha-support-for-multi-master.html
http://mysqlperformanceblog.com
http://highscalability.com
http://google.com - search for scalability, lamp, failover... there are tones of case studies and horror stories from the trench lines :-]
Another option is using a scaleable platform such as Amazon Web Services. You can start out with a micro instance and configure load balancing to fire up more instances as needed.
Once you determine average resource requirements you can then resize your image to larger or smaller depending on your needs.
http://aws.amazon.com
http://tuts.pinehead.tv/2011/06/26/creating-an-amazon-ec2-instance-with-linux-lamp-stack/
http://tuts.pinehead.tv/2011/09/11/how-to-use-amazon-rds-relation-database-service-to-host-mysql/
Amazon allows you to either load balance or change instance size based off demand.