When does a project get too big for MySQL? [closed]

Closed. This question is opinion-based. It is not currently accepting answers. Closed 3 years ago.
I know Google uses its own Bigtable (is that right?) and Facebook and Twitter use Cassandra, but when does your everyday project outgrow MySQL (if ever)?
If you were starting out on a potentially massive-scale web application, would you use MySQL as the engine or begin with an alternative from the start?

I think the only way you can know when MySQL isn't good enough is when you start to see performance issues or you feel like you're fighting to keep it going. If you know your application is potentially huge, then you should implement the right tools from the start; otherwise it's a huge headache to migrate at a later date.

There is no simple metric that will tell you the answer - it depends not only on the amount of data and the number of transactions, but also on the nature of replication: the number of replicated sites, the required replication speed, and so on.
Yes, a large-scale NoSQL cluster can outperform a MySQL cluster built for the same budget at OLTP. However, it's called NoSQL for a reason: when you need to start doing something useful with the data, the relational model and the SQL language make slicing and dicing it much easier. OTOH, at some point OLAP overtakes the relational model in terms of performance - but I think it would be rather difficult to use a data warehouse for transaction processing.
So it's quite possible that the functional requirements of an application will outgrow the capabilities of a NoSQL database much faster than the performance requirements would outgrow a relational database.

I'd start with an alternative (PostgreSQL), not because of scaling issues but because MySQL's support for transactions and referential integrity is worthless.
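For readers unfamiliar with the term, referential integrity means the database itself rejects rows that point at non-existent parents. A minimal sketch of what that enforcement looks like, using Python's sqlite3 as a stand-in engine (the schema and names are hypothetical, not from the answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is opt-in in SQLite
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)
    )
""")
conn.execute("INSERT INTO users (id) VALUES (1)")
conn.execute("INSERT INTO orders (id, user_id) VALUES (1, 1)")  # parent exists: ok

try:
    # points at user 99, which does not exist
    conn.execute("INSERT INTO orders (id, user_id) VALUES (2, 99)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # the orphan row never makes it into the table
```

With MyISAM (MySQL's then-default engine), the second insert would have silently succeeded, which is the complaint the answer is making.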

Related

SQL vs. NoSQL for medium complexity search systems [closed]

Closed 4 years ago as opinion-based.
We're about to start developing a scheduling system, and we're motivated to migrate the backend from PHP to Node, so it seems to make sense to also migrate from MySQL to MongoDB (or something similar). I'm not a very technical person, but I'm trying to help my team make the choices here. All features of this system seem fine with either database, but one particular situation raises concerns about performance:
Let's assume I have several doctors in my database, each with their specialties and clinic locations, and also with the time spans they work in this system. They also already have some appointments scheduled at scattered hours during the week.
One user fills the search form with:
Their geolocation (x, y);
The search radius (e.g. 10 miles);
The specialization needed (e.g. dermatologist);
The desired hour (e.g. 11am);
This search, to my old-school mindset, seems fine for a relational database but a lot of work for a non-relational one, since each doctor's availability would live inside that doctor's JSON document rather than in a separate scheduling 'table'.
Do my concerns make any sense?
You can achieve the desired result with both SQL and NoSQL databases, but the project you are describing has an inherently relational design. For example, a doctor can work at multiple clinics, and a patient is related to both a clinic and a doctor. The best solution in this case is a hybrid approach: your primary database should be relational, and for read operations you can plug in a NoSQL database like MongoDB if required.
@Rafael Souza
You should go with a relational schema design.
If you use NoSQL, then in your case these are the points I want to convey:
NoSQL will not be utilized to its full potential.
Developers will have to learn NoSQL and its frameworks.
There is a vast body of community help for SQL problems compared to NoSQL.
The database will not be big, so SQL should do fine.
Here you need to manage the relationship between doctors and clinics, which is best handled in SQL.
I should say: don't go with the hybrid approach, as it adds overhead to your design; either database type alone is capable of handling all the features.
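A minimal sketch of the relational design both answers describe, using Python's sqlite3 as a stand-in engine (all table, column, and sample names are illustrative, not from the question):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE doctors   (id INTEGER PRIMARY KEY, name TEXT, specialty TEXT);
    CREATE TABLE clinics   (id INTEGER PRIMARY KEY, name TEXT, lat REAL, lon REAL);
    CREATE TABLE schedules (doctor_id INTEGER REFERENCES doctors(id),
                            clinic_id INTEGER REFERENCES clinics(id),
                            hour      INTEGER,            -- 0-23
                            booked    INTEGER DEFAULT 0);
""")
db.execute("INSERT INTO doctors VALUES (1, 'Dr. A', 'dermatologist')")
db.execute("INSERT INTO clinics VALUES (1, 'Downtown', 40.0, -74.0)")
db.execute("INSERT INTO schedules VALUES (1, 1, 11, 0)")

# One join answers "which dermatologists are free at 11am near me?".
# The radius filter is reduced to a crude bounding box for brevity; a
# real system would use a spatial index or a haversine distance check.
rows = db.execute("""
    SELECT d.name, c.name
    FROM schedules s
    JOIN doctors d ON d.id = s.doctor_id
    JOIN clinics c ON c.id = s.clinic_id
    WHERE d.specialty = 'dermatologist'
      AND s.hour = 11 AND s.booked = 0
      AND c.lat BETWEEN 39.8 AND 40.2
      AND c.lon BETWEEN -74.2 AND -73.8
""").fetchall()
print(rows)  # [('Dr. A', 'Downtown')]
```

The same query against a document store would require either duplicating clinic data inside every doctor document or issuing multiple round trips, which is the concern the question raises.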

MySQL Community - Scaling [closed]

Closed 4 years ago as needing more focus.
I am working in an environment that is under extreme load. It is a DB used by about a thousand users with one application. This application does thousands of queries against the DB. We have noticed significant performance degradation over time and are looking for a long-term solution to this problem. Of course, query optimization is one of the tasks we are working on, and we are also optimizing indexes; however, this will not be enough to see the performance gains we need.
I have worked in SQL Server for several years but my MySQL knowledge is limited. To start scaling MySQL, I've researched Sharding, but as we are using MySQL community edition, I'm nervous that this will cause more headaches than it's worth. The only other possibility is to re-design the application, specifically how it pulls data from the DB, but I'd rather not do that.
So my question is: is sharding worthwhile to pursue? Is it feasible without an enterprise edition of MySQL? Is there another possibility you could recommend?
Turn on the slowlog with long_query_time=1. Wait a day. Use pt-query-digest to identify the 'worst' couple of queries. Then let's discuss them. Sometimes the fix is the trivial addition of a 'composite' index.
That is, slow queries are almost always the cause of scaling problems.
If we eliminate that as a problem, then we can discuss sharding and other non-trivial approaches.
We must see SHOW CREATE TABLE and other clues of what is going on.
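A minimal sketch of the settings the answer refers to, as a my.cnf fragment (the variable names are standard MySQL server variables; the log path is illustrative):

```ini
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time     = 1    # log any query slower than 1 second
```

After a day of traffic, running `pt-query-digest /var/log/mysql/slow.log` (from Percona Toolkit) ranks queries by total time consumed, which is where composite-index candidates usually surface.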

What storage system to use for real-time messaging? [closed]

Closed 9 years ago as opinion-based.
I am developing a Real Time messaging application (such as WhatsApp and co) and I am facing a big question.
The application itself is not as complicated as what already exists on the market. However, I am not sure what storage system I should use. I have several ideas, but I don't know which one is better than the others:
A simple MySQL database with relations between users, conversations, and messages
A MongoDB database with a copy of each conversation for all users in the conversation
A Redis store with a copy of each conversation for all users in the conversation.
I don't know which one is better for what I want to do. If you have any advice, it would help me choose the right solution (or there may be a solution I haven't listed which is even better :) ).
Note: My API is developed in Ruby on Rails (if this can help make a decision).
Data volume and the number of reads/writes should be the key factors leading you to a decision. If neither is going to be huge, you can do it with MySQL. I believe a few TB of data with a few hundred reads/writes per minute is SQL-database territory; beyond that, it is the NoSQL world. However, if you choose a NoSQL solution, you should be ready to deal with the increased complexity of non-SQL data store design, query implementation, and achieving eventual consistency. All the best!
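A minimal sketch of the first option above (a plain relational model for conversations), using Python's sqlite3 as a stand-in for MySQL; all table and sample names are illustrative:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE conversations (id INTEGER PRIMARY KEY);
    CREATE TABLE participants  (conversation_id INTEGER, user_id INTEGER);
    CREATE TABLE messages      (id INTEGER PRIMARY KEY,
                                conversation_id INTEGER,
                                sender_id INTEGER,
                                body TEXT,
                                sent_at TEXT);
    -- this index makes "latest N messages in a conversation" a range scan
    CREATE INDEX idx_msg_conv ON messages (conversation_id, sent_at);
""")
db.execute("INSERT INTO conversations VALUES (1)")
db.executemany("INSERT INTO participants VALUES (1, ?)", [(10,), (20,)])
db.execute("INSERT INTO messages VALUES (1, 1, 10, 'hi', '2024-01-01T10:00')")
db.execute("INSERT INTO messages VALUES (2, 1, 20, 'hello', '2024-01-01T10:01')")

latest = db.execute("""
    SELECT sender_id, body FROM messages
    WHERE conversation_id = 1
    ORDER BY sent_at DESC LIMIT 50
""").fetchall()
print(latest)  # [(20, 'hello'), (10, 'hi')]
```

Note that each message is stored once and shared via the join table, whereas the MongoDB and Redis options in the question duplicate the conversation per participant, trading storage for read locality.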

Best practices while designing databases in MySQL [closed]

Closed 8 years ago as needing more focus.
I am developing an enterprise application in Java EE, and I think it will have a huge amount of stored data. It is similar to a university management application in which students from all colleges are registered and have their profiles.
I am using a MySQL database. I tried to explore on the internet and I found some tips on this link.
What are the best practices for developing huge databases so that performance does not degrade?
Thanks in advance.
First of all, your database is not huge but medium-to-small in size. A huge database is one where you need to deal with terabytes of data and millions of operations per second. In your case, MySQL (MyISAM) is enough, and rather than optimization you should focus on correct database design (optimization is the next step).
Let me share some tips with you:
scale your hardware (not so important for your case)
identify relations (normalize) and use correct datatypes (e.g. TINYINT instead of BIGINT where you can)
try to avoid NULL if possible
use VARCHAR instead of TEXT/BLOB if possible
index your tables (remember that indexes slow down UPDATE/DELETE/INSERT operations)
design your queries correctly (so that they use indexes)
always use transactions
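Two of the tips above, indexing and always using transactions, can be sketched with Python's sqlite3 standing in for MySQL/InnoDB (the schema is hypothetical):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE enrollments (student_id INTEGER, course_id INTEGER)")
# index the column your queries filter on
db.execute("CREATE INDEX idx_enroll_student ON enrollments (student_id)")

# a transaction makes a multi-step change all-or-nothing: either every
# row commits, or none does
try:
    with db:  # BEGIN; commits on success, rolls back on error
        db.execute("INSERT INTO enrollments VALUES (1, 101)")
        raise RuntimeError("simulated failure mid-batch")
except RuntimeError:
    pass

count = db.execute("SELECT COUNT(*) FROM enrollments").fetchone()[0]
print(count)  # 0 -- the partial insert was rolled back
```

Note that MyISAM, suggested above, does not support transactions; this tip implies InnoDB or another transactional engine.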
Once you have designed and developed your database, if the performance is not sufficient, think about optimization:
- check EXPLAIN plans and tune your SQL
- check hardware utilization and tune either system or MySQL parameters (e.g. the query cache).
Please check also this link:
http://dev.mysql.com/doc/refman/5.0/en/optimization.html

What database to use for a click-stream application: relational or NoSQL? [closed]

Closed 9 years ago as opinion-based.
I need to create a database for dealing with a click stream (from ~240 subdomains). I use a JavaScript snippet to grab information (Host, Page, Date, userID, Referer, HostName, RefererPath, uniqueUserID) for each click and then insert the data into the database through a Java dynamic web application. There are about 9 million new records each day, and I have to insert new records every minute. Another application needs to be able to retrieve information about pageviews/unique visitors/etc. for a certain article/subdomain over the last (10 min, 20 min, 30 min, 1 hour... 24 hours). I only need to keep records for the last 3 months.
Initially I thought about using MySQL, as I'm only interested in open source, but I'm also considering NoSQL solutions. The problem is that I've only had experience with relational databases and am not really able to tell whether NoSQL would be a better solution here. Also, which database should I use if I choose to go with NoSQL, and would a key-value store be the best way to go?
I'm guessing data consistency isn't critical here (statistics?), so you could indeed spare a bit of consistency. NoSQL seems a good choice, and a key-value store would also be my pick. Now the real question is: which one is the most suited?
I'd give consideration to Redis and Riak (basically the two most well-known ones):
Riak (AP system) :
Fault-tolerant (masterless with partitioning and replication)
Map reduce
Full text search
BASE
Redis (CP system) :
Really fast
In-memory: you need RAM! That also means you want replication so you don't lose everything on a crash. Redis can also use disk snapshots, I believe.
Master/slave with re-election
BASE
Both have many more features; you should read the documentation for gotchas. Redis is primarily used as a cache since it's so fast, whereas Riak focuses on fault tolerance. Given your scalability requirements, both can satisfy your needs, so choose according to the trade-offs above.
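A minimal sketch of how a key-value layout could serve the windowed pageview queries from the question: one counter per (article, minute) bucket, so "views in the last N minutes" is a sum over N keys. An in-memory dict stands in for Redis or Riak here, and the key names are illustrative; in Redis this would be INCR on each key, with EXPIRE providing the 3-month retention.

```python
from collections import defaultdict
from datetime import datetime, timedelta

store = defaultdict(int)  # in-memory stand-in for the key-value store

def record_click(article, ts):
    """Increment the counter for this article's current minute bucket."""
    bucket = ts.strftime("%Y-%m-%dT%H:%M")  # minute resolution
    store[f"views:{article}:{bucket}"] += 1

def views_last(article, now, minutes):
    """Sum the counters for the last `minutes` one-minute buckets."""
    total = 0
    for i in range(minutes):
        bucket = (now - timedelta(minutes=i)).strftime("%Y-%m-%dT%H:%M")
        total += store.get(f"views:{article}:{bucket}", 0)
    return total

now = datetime(2024, 1, 1, 10, 30)
record_click("article42", now)
record_click("article42", now - timedelta(minutes=5))
record_click("article42", now - timedelta(minutes=45))  # outside 30-min window
print(views_last("article42", now, 30))  # 2
```

This shape handles the insert rate well (each click is a single counter increment) and makes every window in the question a bounded number of key reads.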