Mesos & persistent storage using MySQL - mysql

This is going to be a generic question.
We are a young startup faced with the inevitable problem of scaling, and during our research Apache Mesos seemed like a good fit for our architecture, which is:
Core Scala-based microservices, each responsible for dealing with a part of our database, which is mainly MySQL
Middleware microservices, to deal with some other persistent data-storage systems like MongoDB, Elasticsearch etc.
Which basically means that we can containerise all of our services and ship them to a single datacenter, which can then deploy these containers in a topology-agnostic way.
What we are currently stumped by is –
Mesos doesn't seem to have any native support for MySQL
Container based persistence seems awfully tricky and hard to manage/maintain.
We'd like to continue using MySQL/MongoDB/ElasticSearch because migrating to Cassandra etc. at this stage (we are a small team) is too much of an overhead and hence not an option.
What are the best strategies for this problem?

Mesos provides persistent resource support (persistent volumes) for storage-like services.
If you want to use MySQL on Mesos, consider trying https://github.com/apache/incubator-cotton

After some research we decided not to try Cotton but we're still sticking to deploying our services across a Mesos cluster.
Instead of hosting our own MySQL database, we decided to outsource it to Amazon RDS. But we now face the problem of what to do about our other databases, such as MongoDB.

Related

What options do we have to build our own MySQL private cloud?

My company needs to build a private DB cloud service for internal use.
Requirements are:
Users can easily request a new MySQL instance, and terminate it.
Each MySQL instance is isolated from the others.
One solution we have is simply to create a different user and schema for each user, similar to what cPanel does.
But I wonder whether there is a better option available?
Honestly, I don't think putting everybody on a single big MySQL instance is a good idea.
First, we can't do much about resource management. And I am afraid that a problem in the database (it can't boot up, for example) is going to kill everybody.
To minimize the risk of a single point of failure, I am looking for something like Amazon RDS and Azure MySQL. What we want is very similar to that.
Does anybody know how they do that? Is there an open source or commercial version we can buy?
Thank you.
Without knowing the why it is hard to give you a good answer to your question. If Amazon RDS or Azure MySQL look like a good solution to you, I would suggest using those. Building such a service yourself and making sure it will scale well will probably cost a lot more money.
I mean, sure, you could set up Kubernetes or HashiCorp Nomad and deploy containers there, but you would need to figure out how these tools work, how to let MySQL run in a scalable fashion, and build some kind of UI to easily launch and stop MySQL instances.

Library for sharding MySQL in Java

Soon, we are going to shard our MySQL database to achieve horizontal scaling. Our technology stack is based on Spring and Hibernate.
However, I haven't been able to find any alternative to Hibernate that supports sharding at the application level.
I read about Hibernate Shards, but it is no longer maintained, so it would not be suitable for use in production.
Moreover, with companies like Facebook, Twitter and Digg using MySQL sharding, I am surprised that there is no GA Hibernate alternative for sharding.
I would appreciate it if someone could suggest a persistence framework in Java that supports sharding out of the box.
Thanks in advance!
Disclaimer: I work for ScaleBase, a provider of a complete MySQL scale-out solution, an "automatic sharding machine" if you like.
I'm a believer in a solution that sits outside the code. This way the entire ecosystem, including ad-hoc and administration queries from the MySQL command line and utilities like mysqldump, is also "aware" of the sharding.
This was the main disadvantage of Hibernate Shards, or of any sharding framework that is limited to the persistence layer inside the application code.
We have some good resources about key-based sharding (hash, range, list) and data distribution on our site: http://www.scalebase.com/products/database-sharding/ http://www.scalebase.com/resources/webinars/ - (search for "WEBINAR – 10.23.12: Benefits of Automatic Data Distribution")
I also invite you to look at my blog: http://database-scalability.blogspot.com
Hope I helped.
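To make the key-based strategies mentioned in this answer (hash, range, list) a bit more concrete, here is a minimal Python sketch of how each one maps a key to a shard. It is only an illustration, not how ScaleBase or Hibernate Shards work internally; the shard names, the range boundaries and the region map are all made-up assumptions.

```python
import bisect
import hashlib

# Hypothetical shard names; in practice these would be connection settings.
SHARDS = ["shard0", "shard1", "shard2", "shard3"]


def hash_shard(key: str) -> str:
    """Hash sharding: a stable digest of the key picks the shard."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# Range sharding: shard i holds ids below RANGE_BOUNDS[i];
# the last shard holds everything at or above the final bound.
RANGE_BOUNDS = [1_000_000, 2_000_000, 3_000_000]  # assumed boundaries


def range_shard(user_id: int) -> str:
    return SHARDS[bisect.bisect_right(RANGE_BOUNDS, user_id)]


# List sharding: an explicit value-to-shard table with a fallback shard.
LIST_MAP = {"EU": "shard0", "US": "shard1", "APAC": "shard2"}  # assumed regions


def list_shard(region: str) -> str:
    return LIST_MAP.get(region, "shard3")
```

Hash sharding spreads keys evenly but makes range scans touch every shard; range and list sharding keep related rows together at the cost of possible hot spots.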

How does database tiering work?

The only good reference that I can find on the internet is this whitepaper, which explains what database tiering is, but not how it works:
The concept behind database tiering is
the seamless co-existence of multiple
(legacy and new) database technologies
to best solve a business problem.
But how is it implemented? How does it work?
Any links regarding this would also be helpful. Thanks.
I think the idea of that document is for you to put "cheap" databases in front of the "expensive" databases to reduce costs.
For example. Let's assume you have an "expensive" db...something like Oracle, or DB2 or even MSSQL (more realistically it's probably more of an issue with a legacy DB system that is not supported much or you need specialized resources to maintain). A database engine that costs a lot to purchase and maintain (arguably these are not expensive when you take all factors into consideration. But let's use them for the example).
Now if you suddenly get famous and your server starts to get overloaded what do you do? Do you buy a bigger server and migrate all your data to that new server? That could be incredibly expensive.
With the tiering solution you put several "cheap" databases in front of your "expensive" database to take the brunt of the work. So your web servers (or app servers) talk to a bunch of MySQL servers, for example, instead of directly to your expensive server. Then these MySQL servers handle the majority of the calls. For example, they could handle all read-only calls completely on their own and only need to pass write calls back to the main database server. These MySQL servers are then kept in sync via standard replication practices.
Using methods like this you could in theory scale out your expensive server to dozens, if not hundreds, of "cheap" database servers and handle a much higher load.
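As a rough illustration of the read/write split described above, the following Python sketch routes read-only statements to the "cheap" front-tier servers in round-robin order and everything else to the main server. The class name, the server names and the simple "first keyword" test are assumptions made for the example; a real router would also have to deal with transactions, statements such as SELECT ... FOR UPDATE, and replication lag (reads from the front tier may be slightly stale).

```python
import itertools
from typing import Iterable


class TieredRouter:
    """Pick a server for a SQL statement: replicas for reads, primary for writes."""

    READ_VERBS = ("SELECT", "SHOW")

    def __init__(self, primary: str, replicas: Iterable[str]):
        replicas = list(replicas)
        if not replicas:
            raise ValueError("at least one replica is required")
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # round-robin over the front tier

    def route(self, sql: str) -> str:
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in self.READ_VERBS:
            return next(self._replicas)
        # Writes (and anything unrecognised) go to the main server.
        return self.primary
```

For example, `TieredRouter("main-db", ["mysql-1", "mysql-2"]).route("UPDATE t SET x = 1")` returns `"main-db"`, while successive SELECT statements alternate between the two MySQL servers.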
Database tiering is just a specific style of tiering. There are also application tiering and service tiering. It's a form of scalability.
What exactly are you asking? This question is rather vague.
This is a PDF from a course at Ohio State. What it discusses is a bit over my head, but hopefully you might understand it better.

Database choices

I have a prickly design issue regarding the choice of database technologies to use for a group of new applications. The final suite of applications would have the following database requirements...
Central databases (more than one database) using mysql (must be mysql due to justhost.com).
An application to be written which accesses the multiple MySQL databases on the web host. This application will also write to a local serverless database (sqlite/firebird/vistadb/whatever).
Different flavors of this application will be created for windows (.NET), windows mobile, android if possible, iphone if possible.
So, the design task is to minimise the quantity of code to achieve this. This is going to be tricky since the languages used are already c# / java (android) and objc (iphone). Not too worried about that, but can the work required to implement the various database access layers be minimised?
The serverless database will hold similar data to the mysql server, so some kind of inheritance in the DAL would be useful.
Looking at hibernate/nhibernate and there is linq to whatever. So many choices!
Get a better host. Seriously, SQL Server hosts don't cost that much more: possibly an hour of development time per month, and that is already a non-conservative estimate.
Otherwise, throw out the stuff you do not need. Reduce the languages to one. If this is about access over the internet, check out OData for exposing data; it is a nice, independent protocol.
The rest is architecture. And LINQ (2Sql) sucks compared to NHibernate ;)
but can the database access layer be reused?
Yes, it can be, but you have to carefully create a loosely coupled datalayer with no dependency on other parts.
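One way to read "loosely coupled" here: the application talks only to a small interface, and each storage backend implements it. The sketch below uses Python purely for brevity (the same shape works with C# or Java interfaces). The `NoteStore` interface, the `notes` table and the in-memory class standing in for a MySQL-backed implementation are all hypothetical names invented for the example.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import List


class NoteStore(ABC):
    """The only thing application code depends on."""

    @abstractmethod
    def add(self, text: str) -> None: ...

    @abstractmethod
    def all(self) -> List[str]: ...


class SqliteNoteStore(NoteStore):
    """Local serverless backend."""

    def __init__(self, path: str = ":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
        )

    def add(self, text: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute("INSERT INTO notes (text) VALUES (?)", (text,))

    def all(self) -> List[str]:
        rows = self._conn.execute("SELECT text FROM notes ORDER BY id")
        return [row[0] for row in rows]


class InMemoryNoteStore(NoteStore):
    """Stand-in for a central (e.g. MySQL-backed) implementation."""

    def __init__(self):
        self._items: List[str] = []

    def add(self, text: str) -> None:
        self._items.append(text)

    def all(self) -> List[str]:
        return list(self._items)


def copy_all(source: NoteStore, target: NoteStore) -> None:
    """Works with any pair of backends because it only uses the interface."""
    for text in source.all():
        target.add(text)
```

Because `copy_all` never mentions a concrete database, swapping the central or local backend does not touch the calling code.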

Is it possible to combine Cloud Computing and MYSQL?

The main bottleneck of a web server is usually the database, in my case MySQL.
More specifically, full-text search and master-slave replication.
Sphinx is a probable solution for full-text search, so master-slave replication is the final pain in the ass.
Is it possible to boost performance significantly with cloud computing, for instance with the services offered by Amazon?
Just a wild guess!
EDIT: what about MySQL and Google App Engine?
Of course. MySQL Enterprise for Amazon EC2 is one MySQL package for Amazon EC2. See also Setting Up MySQL on an EC2 AMI and this tutorial/blog post.
EDIT: App Engine is higher-level than EC2 and is really designed for BigTable/GQL only. However, look at approcket, which allows replicating between AppEngine and MySQL.
You may want to be careful about just switching your web app to an external database (i.e. Amazon et al.). You want to understand exactly where your bottleneck is, or you may end up introducing more performance problems. Remember that by going to an external DB, you're introducing more latency into each query compared to a local (box or net) query.
If your problem is performance, try to find out exactly where the problem lies first, and then you may want to explore other options like query optimization, caching, etc.
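As one example of the caching option, here is a minimal read-through cache sketch in Python: repeated lookups for the same key within a time window skip the database entirely, which matters more as per-query latency grows. The `TTLCache` class, its parameters and the `loader` callback (whatever function actually runs the query) are assumptions for illustration, not part of any particular library.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Read-through cache: call `loader` on a miss, reuse the result until it expires."""

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh hit: no database round trip
        value = self._loader(key)  # miss or expired: go to the database
        self._entries[key] = (now, value)
        return value
```

The trade-off is staleness: data can be up to `ttl_seconds` old, so this suits read-heavy data that tolerates a short delay.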
Possible - for sure. See for example, xeround, rightscale, Amazon and phpfog. There are probably at least a few more with more to come. They come in varying degrees of "freeness" (How's that for a made up word?) too.
The question, it seems to me, will be performance and reliability.
Who knows, localhost may become a thing of the past for development.