I have been doing some research on servers for a website I want to launch. I settled on a server with RAID 10, backed up by a NAS that also has a RAID 10 configuration. This should keep data safe in 99.99+% of cases.
My problem appeared when I thought about needing a second server. If I ever require more processing power, and thus more storage for users, how can I connect a second server to my primary one and make them act as one as far as the database (MySQL) is concerned?
I mean, I don't want to replicate my first DB on the second server and load-balance the requests - I want to use just one DB (maybe external) and let both servers use it at the same time. Is this possible? And is the option of backing up MySQL data on a NAS viable?
The most common configuration (once scaling up from a single box) is to put the database on its own server. In many web applications, the database is the bottleneck (rather than the web server); so the first hardware scale-up step tends to be to put the DB on its own server.
This also allows you to put additional security between the database and web server - firewalls are common; different user accounts etc. are pretty much standard.
You can then add web servers to the load balancer, all talking to the same database, as long as your database can keep up.
Having more than one web server also helps with resilience - you can have a catastrophic hardware event on one webserver and the load balancer will direct the traffic to the remaining machines.
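As a rough illustration of that failover behaviour, here is a minimal sketch of a round-robin balancer that skips servers marked as down. The class name, the server names and the idea of marking servers up or down by hand are assumptions made for the example; a real load balancer does its own health checks.

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips servers marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server in the rotation

    def mark_down(self, server):
        # Called when a (hypothetical) health check fails.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once, starting at the rotation position.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy web servers available")
```

All web servers in the rotation would point at the same database server, so losing one of them only reduces capacity.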
Scaling the database server performance is a whole different story - though typically you use very beefy machines for the database, and relative lightweights for the web servers.
To add resilience to the database layer, you can introduce clustering - this is a fairly complex thing to keep running, but protects you against catastrophic failure of a single machine.
Yes, you can back up MySQL to a NAS.
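A minimal sketch of such a backup, assuming the NAS is already mounted on the database server (the path /mnt/nas/mysql and the database name below are made up) and that credentials come from the MySQL client option file rather than the command line:

```python
import datetime
import pathlib
import subprocess


def build_dump_command(database, backup_dir, when=None):
    """Return the mysqldump argument list and the target file on the NAS."""
    when = when or datetime.datetime.now()
    target = pathlib.Path(backup_dir) / f"{database}-{when:%Y%m%d-%H%M%S}.sql"
    command = [
        "mysqldump",
        "--single-transaction",      # consistent snapshot for InnoDB tables
        f"--result-file={target}",   # write the dump straight to the NAS mount
        database,
    ]
    return command, target


def run_backup(database, backup_dir):
    """Run the dump; raises CalledProcessError if mysqldump fails."""
    command, target = build_dump_command(database, backup_dir)
    subprocess.run(command, check=True)
    return target
```

Calling something like run_backup("shop", "/mnt/nas/mysql") from a scheduled job would give one timestamped dump file per run.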
I've got a very specific use case and because I'm not too familiar with database replication, I am open to suggestions and ideas about how to accomplish the following in the best possible way:
A web application + database is running on a remote server. Let's call this set-up R for remote.
Now suppose there are 3 separate geographical locations which need read+write access to the database. I will call these locations L1, L2 and L3.
The main problem: the remote server might be unavailable or the internet connection of one of the locations might not always work, rendering the remote application unavailable; but we want the application to work as a high availability solution (on-site) even when the remote server is down or when there is an internet connection problem.
Partial solution: So I was thinking about giving each geographical location its own server with a local copy of the web application. The web application itself can get updated when needed from a version control system automatically (for example using git hooks).
So far so good... (at least I believe so?)
But what about our data? The really tricky part seems to be the database replication. Let's assume no DNS or IP failover and assume that the user first tries to access the remote server directly and if this does not work, the user can still use the local server on-site instead. This all happens inside a web browser (or similar client).
One possible (but unsatisfactory) solution would be to use master-slave replication from R (master) to L1, L2 and L3 (slaves). When doing this asynchronously this should be quite fast? I think this is a viable solution for temporary local read-only database access when the main server is broken or can't be accessed.
But... what about read-write support? I suppose we would need multi-master replication in this case, but I am afraid that synchronous replication using something like (for example) MySQL Cluster or Galera would slow things down, especially since L1, L2 and L3 are on lower bandwidth connections. And they are connected through WAN. (Also, L1, L2 or L3 might not always be online.)
The real question: How would you tackle this specific use case? At the moment I am leaning towards multi-master replication if it doesn't slow down things too much. The application itself will mainly be used by employees on-site but by some external people over WAN as well. Would multi-master replication work well? What if for example L1 is down for 24 hours and suddenly comes back on-line? What if R can't be accessed?
EXTRA: not my main question, but I also need the synchronized data to be sent securely over SSL, if possible, please take this into account for your answer.
Perhaps I am still forgetting some necessary details; if so, please respond with some feedback and I will try to update my question accordingly.
Please note that I haven't decided on a database yet and the database schema will be developed from scratch, so ideas using other databases or database engines are welcome as well. (At the moment I have most experience with MySQL and PostgreSQL)
As you are still undecided, I would strongly recommend that you have a look at MS-SQL merge replication. It is robust, highly reliable, replicates over LAN and HTTPS (so-called web replication), and is not that expensive.
The terminology differs from the MySQL master/slave idea. Here we are talking about one publisher and multiple subscribers. All changes made at the subscriber level are collected and sent to the publisher, then redistributed to all subscribers (with, if needed, fancy options like 'filtered subscriptions').
Standard architecture will then be:
a publisher, somewhere on a server, which collects and redistributes changes between subscribers. Publisher might not be accessed by end users.
other database servers acting as subscribers, either for local or web access, replicating with the publisher. Subscribers are accessed by end users.
We have been using this architecture for years, including:
one subscriber for internet access
one subscriber for intranet access
tens of subscribers for local access: some subscribers are on our construction projects, somewhere in the desert...
Such an architecture is not available "off the shelf" with MySQL. I guess it could be built, but it would then certainly be a lot more expensive than just buying the corresponding MS-SQL licenses. Do not forget that the free SQLEXPRESS version of MS-SQL can be a subscriber.
Be careful: if you are planning to go with such a configuration, I would (really) strongly advise you to set all primary keys to the uniqueIdentifier data type, randomly generated. This avoids the typical replication pitfall where PKs are set to int with automatic increment and independent servers generate identical primary keys between two replications (MS-SQL offers a tool to avoid such problems, where you can allocate PK ranges per server, but this solution is a real PITA...).
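The pitfall can be illustrated outside the database with a small Python sketch. It is only a simulation: the two "sites" are plain lists, the starting value 1001 is arbitrary, and uuid.uuid4() stands in for a randomly generated uniqueIdentifier.

```python
import uuid


def autoincrement_keys(start, count):
    """Keys an auto-increment int column would hand out, starting at `start`."""
    return [start + i for i in range(count)]


def uuid_keys(count):
    """Randomly generated keys, comparable to a random uniqueIdentifier."""
    return [uuid.uuid4() for _ in range(count)]


# Two disconnected subscribers insert three rows each between replications.
site_a_int = autoincrement_keys(1001, 3)
site_b_int = autoincrement_keys(1001, 3)
int_collisions = set(site_a_int) & set(site_b_int)

site_a_uuid = uuid_keys(3)
site_b_uuid = uuid_keys(3)
uuid_collisions = set(site_a_uuid) & set(site_b_uuid)

print("int key collisions:", len(int_collisions))
print("uuid key collisions:", len(uuid_collisions))
```

Both sites hand out the same integer keys, so every row inserted offline conflicts at the next merge, while a collision between random 128-bit keys is so unlikely that it can be ignored in practice.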
So, we want to move out from Air (Adobe stopping support and really bad implementation for the sqlite api, among other things).
I want to do 3 things:
1. Connect a Flash (not web) application to a local MySQL database.
2. Connect a Flash (not web) application to a remote MySQL database.
3. Connect a Flash (web) application to a remote MySQL database.
All of this can be done without any problem, however:
1 and 2 can be done (WITHOUT using a webserver) using for example this:
http://code.google.com/p/assql/
3 can also be done using the above, as far as I understand.
Questions are:
If you can connect to the MySQL server over a socket, why use a web server (for example with PHP) as an intermediate connection? Why not connect directly?
I have done this many times, using AMFPHP for example, but wouldn't it be faster to go directly?
In the case of accessing the local machine, deployment would be simpler: it would only require the Flash application plus the MySQL server, with no need to also install a web server.
Is this assumption correct?
Thanks a lot in advance.
The need for a separate data access layer usually stems from the way people build applications: layered architecture, distribution of the workload, etc. SQL servers usually don't provide a very robust API for user management, session management, etc., so one would use an intermediate layer between the database and the client application to handle the issues not directly related to storing the data. Security plays a significant role here too. There are other concerns as well. For example, sometimes you would like to close all access to the database for maintenance, but if you don't have an intermediate layer to notify users about your intention, you'd leave them wondering whether your application is still alive. The data access layer can also do a lot of caching, saving trips to the database that you would otherwise have to make from the client (of course, the client can do that too, but YMMV).
However, in some simple cases, having an intermediate layer is overhead. What's more, I'd say that if you can do without an intermediate layer, do so - less code makes better programs - but chances are that you will find yourself needing that layer for one reason or another.
Because connecting remotely over the internet poses huge, huge, huge security problems. You should never deploy an application that connects over the internet to a database directly. That's why AIR and Flex don't have remote MySQL drivers: they should never be used except for building development-type tools. And even if you did build a tool that could connect directly, any decent network admin is going to block access to the database from anywhere outside the DMZ and internal network.
First, in order for your application to connect to the database, the database port has to be exposed to the world. That means I won't have to hack your application to get your data. I just need to hack your database, and I can cut you out of the problem entirely because you were stupid enough to leave your database port open to me.
Second, most databases don't encrypt credentials or data traveling over the wire. While most databases support SSL connections, most people don't turn it on because applications want super fast data access and they don't want to pay for SSL encryption overhead, blah blah blah. Furthermore, most applications sit in the DMZ and their database is behind a firewall, so it is unlikely that something could be eavesdropping on the conversation between the server and the database. However, if you connected directly from an AIR app to the database, it would be very easy to insert myself in the middle and watch the traffic coming out of your database because you're not using SSL.
There is a whole host of privacy and data integrity problems with what you are suggesting, because you can't guarantee either when you allow a RIA direct access to the database it's using.
Then there are some smaller nagging issues: modern features such as publishing reports to a central server so users don't have to install your software to see them, sending out email, social features, web service integration, cloud storage, collaboration or real-time messaging are things you don't get if you don't use a web application. Middleware also gives you control over your database, so you can pool connections to handle larger load. Using a web application brings more to the table than just security.
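To give an idea of what "pooling connections" means in the middleware, here is a minimal sketch of a fixed-size pool. The `connect` argument is a placeholder for whatever function opens a database connection in your stack; nothing here is tied to a particular MySQL driver.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: many client requests share a few database connections."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable that returns a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the request failed.
            self._idle.put(conn)
```

With direct client connections, a thousand desktop clients mean a thousand database connections; behind a pool like this, the database only ever sees `size` of them.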
My partner and I are trying to start a website hosted in the cloud. It has pretty heavy AJAX traffic and the backend handles money transactions, so we need ACID in some of the DB tables.
Currently everything is running off a single server. Some of the AJAX traffic is cached in text files.
Question:
What's the best way to scale the database server? I thought about moving MySQL to separate instances and setting up master-master replication. However, this seems tough and I heard I might lose ACID properties even with InnoDB? Is Amazon RDS a good solution?
The web server is relatively stateless except for some custom log files and the AJAX cache files. What's a good way to scale to multiple web servers? I guess the custom log files can be moved to a reliable shared file system or DB, but I'm not sure what to do about AJAX cache file coherency across multiple servers. (I don't care about losing /var/log/* if a web server dies.)
For performance it might be cheaper to go with a larger instance with more cores and memory, but eventually I would need redundancy, so I'm wondering what's the best way to do this cheaply.
thanks
take a look at this post. there is plenty of presentations on the net discussing scalability. few things i suggest to keep in mind:
plan early for the data sharding [even if you are not going to do it immediately]
try using mechanisms like memcached to limit number of queries sent to the database
prepare to serve static content from another domain, in the longer run - from an nginx-like server and later a CDN
redundancy - depends on your needs. is 'read-only' mode acceptable for your site? if so - go with mysql replication + rsync of static files and in case of failover have your site work in that mode till you recover the master node. if you need high availability - then take a look either at drbd replication [at least for mysql] or setup with automated promotion of slave server to become master node.
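the memcached point above boils down to the cache-aside pattern. a minimal sketch, with a plain dict standing in for memcached and `load_from_db` as a placeholder for the real query function (both are assumptions for the example):

```python
class CachedReader:
    """Cache-aside: check the cache first, query the database only on a miss."""

    def __init__(self, load_from_db):
        self._load = load_from_db   # placeholder for the real database query
        self._cache = {}            # stand-in for a memcached client
        self.db_queries = 0         # counts how often the database was hit

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.db_queries += 1
        value = self._load(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Call this after writing to the database so readers see fresh data.
        self._cache.pop(key, None)
```

with a shared cache such as memcached instead of the dict, all web servers see the same cached entries, which also sidesteps the coherency problem of per-server cache files.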
you might find following interesting:
http://yoshinorimatsunobu.blogspot.com/2011/08/mysql-mha-support-for-multi-master.html
http://mysqlperformanceblog.com
http://highscalability.com
http://google.com - search for scalability, lamp, failover... there are tons of case studies and horror stories from the trench lines :-]
Another option is using a scalable platform such as Amazon Web Services. You can start out with a micro instance and configure load balancing to fire up more instances as needed.
Once you determine average resource requirements you can then resize your image to larger or smaller depending on your needs.
http://aws.amazon.com
http://tuts.pinehead.tv/2011/06/26/creating-an-amazon-ec2-instance-with-linux-lamp-stack/
http://tuts.pinehead.tv/2011/09/11/how-to-use-amazon-rds-relation-database-service-to-host-mysql/
Amazon allows you to either load balance or change instance size based off demand.
We currently have an application located on a remote server, and our call center uses this application to perform customer transactions.
We plan to set up Asterisk on a local server to help us with all the call routing and recording. For Asterisk to work smoothly we have to move our application from the remote server to the local one.
It will be easy to move all data to the local server and do transactions locally, but there is also an option for users to do transactions online, which will hit the remote server database.
The reason we still have the remote application is the reliable infrastructure and backup solution provided by Rackspace.
If we move the application to the local server, I am looking for a reliable solution for syncing the remote and local databases so that we can handle local as well as online transactions.
Why not use MySQL master-master replication and hold definitive data at both ends? (Note you'll have to do some reading on auto_increment_increment and auto_increment_offset)
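To see why those two settings matter, here is a small sketch of the ID sequences they produce. It only models the arithmetic (each generated value has the form offset + N x increment); the server names and the choice of two masters are assumptions for the example.

```python
def generated_ids(offset, increment, count):
    """First `count` auto-increment values for the given offset and increment."""
    return [offset + n * increment for n in range(count)]


# Two masters: both use an increment of 2, with offsets 1 and 2.
local_ids = generated_ids(offset=1, increment=2, count=5)
remote_ids = generated_ids(offset=2, increment=2, count=5)

print(local_ids)    # [1, 3, 5, 7, 9]
print(remote_ids)   # [2, 4, 6, 8, 10]
```

Because one server only ever hands out odd IDs and the other only even ones, rows inserted at both ends never collide on their auto-increment key when they are replicated across.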
symcbean's answer is basically correct. I'd add this article as a good starting place to understand master-master replication. I'd further recommend High Performance MySQL as a good reference for a deeper understanding of the techniques and issues.
There are some issues that you will have to face when doing writes to two non-colocated MySQL servers. You'll have replication lag to deal with, so the databases won't necessarily be completely in sync, but will only be "eventually consistent". Also, if you have both sides doing updates on the same content, you can end up with data integrity issues. If your system leans towards INSERTs more than UPDATEs for its write operations, it is less likely that you'll run into issues. Also, if the subset of data that is likely to be modified tends to be localized around one or the other of the servers, you'll run into fewer issues.
Otherwise, you'll probably want to roll your own solution that is designed towards the specific use cases of your application.
I'm working on a SaaS project and MySQL is our main database. Our application is written in C# .NET and runs on a Windows 2003 server.
Considering maintenance, cost, options and performance, which server platform should I choose for MySQL hosting: Windows or Unix/Linux/Ubuntu/Debian?
The scenario is as follows:
The server I run today has a moderate transaction volume. Databases grow 5MB daily, we expect that to increase to 50MB in a couple of months, and it is mission critical.
I don't know how big the database is going to be. We rent a VPS to host the application and database server.
Most of our queries are simple, but our ORM tool constantly makes use of subqueries. We also run reports, both simple and heavy ones. Some of them run when a user clicks, but most run in order from a queue.
Buying extra co-lo space will be nice as we get more clients. This is a SaaS project after all.
When developing, you can use your Windows box to also run a MySQL server. If and when you want to have your DBMS on a separate server, it can be either a Windows or a Linux server.
MySQL and supporting tools for backup etc. probably have more choices on Linux.
There are also 3rd party suppliers who will host your MySQL database on their servers. The benefit is they will handle backups, maintenance etc.
Also: look into phpMyAdmin for use as a great admin tool.
Larry
I think you need more information to make an informed decision. It's hard to just pull out a "best" answer based on no specific information.
What is your expected transaction volume?
How big will the database get?
How complex are your queries, ie are they long running or relatively quick?
Are you hosting the application on your own server at your own location? If you have to buy extra co-lo space maybe an extra server isn't the best option.
How "mission critical" is this database? Ie maybe you need replicated servers to ensure stability.
There is a server sizing tool online at http://www.sizinglounge.com/, so you should check that out. It sounds like your server could be smaller than their smallest tier, but it should be a good place to start.
If this is a mission critical application you need to do some kind of replication to an extra server in case the primary one fails, so you are definitely looking at two systems. This has to be in addition to a good backup plan.
Given that you are uncertain about how big it could get you might just continue renting a server. For your backup one idea would be to look at running MySQL on an Amazon EC2 instance. BTW it is important to have a remote replicated server. If you have two systems next to each other and an environmental problem comes up, they could both be out of commission at the same time. But with a remote copy your options are open to potentially working around it.
If you run a lot of read-only queries locally and have your site hosted somewhere, it might make sense to set up a local replicated database copy to query against. That could potentially improve both your website and local performance quite a bit. Plus it would give you some peace of mind having a local copy under your control.
HTH,
Brandon