Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 8 years ago.
I am developing an enterprise application in Java EE and I expect it to store a huge amount of data. It is similar to a university management application in which all college students are registered and have a profile.
I am using a MySQL database. I searched the internet and found some tips at this link.
What are the best practices for designing huge databases so that performance does not degrade?
Thanks in advance.
First of all, your database is not huge but small to medium sized. A huge database is one where you need to deal with terabytes of data and millions of operations per second. In your case MySQL (MyISAM) is enough, and rather than on optimization you should focus on correct database design (optimization is the next step).
Let me share some tips with you:
- scale your hardware (not so important in your case)
- identify relations (normalize) and choose correct data types (e.g. use TINYINT instead of BIGINT if you can)
- try to avoid NULL if possible
- use VARCHAR instead of TEXT/BLOB if possible
- index your tables (remember that indexes slow down UPDATE/DELETE/INSERT operations)
- design your queries correctly (so that they use the indexes)
- always use transactions (note that this requires a transactional storage engine such as InnoDB; MyISAM does not support transactions)
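As a rough illustration of the tips on NOT NULL columns, VARCHAR, indexes and transactions, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the `student` table, its columns, the index name and the `enroll` helper are all hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " college_id INTEGER NOT NULL,"   # NOT NULL instead of a nullable column
    " name VARCHAR(100) NOT NULL)"    # VARCHAR rather than TEXT/BLOB
)
# Index the column that queries are expected to filter on.
conn.execute("CREATE INDEX idx_student_college ON student (college_id)")


def enroll(conn, rows):
    """Insert all rows atomically: either every row is stored or none."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO student (id, college_id, name) VALUES (?, ?, ?)", rows
        )


enroll(conn, [(1, 10, "Ada"), (2, 10, "Linus")])

try:
    # The second row reuses id 1, so the whole batch is rolled back.
    enroll(conn, [(3, 20, "Grace"), (1, 20, "Duplicate")])
except sqlite3.IntegrityError:
    pass

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(student_count)
```

The failed batch leaves no partial data behind, which is the main reason for wrapping related writes in one transaction.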
Once you have designed and developed your database, and only if the performance is not sufficient, think about optimization:
- check explain plans and tune the SQL statements
- check hardware utilization and tune either system or MySQL parameters (e.g. the query cache).
Please also check this link:
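Checking an explain plan before and after adding an index could look roughly like the sketch below. It again uses SQLite through Python's sqlite3 module as a stand-in, where the statement is `EXPLAIN QUERY PLAN`; in MySQL the corresponding statement is `EXPLAIN SELECT ...` and the output format differs. The table, index and the `plan` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " id INTEGER PRIMARY KEY,"
    " college_id INTEGER NOT NULL,"
    " name VARCHAR(100) NOT NULL)"
)


def plan(conn, sql, params=()):
    """Return the planner's description of how it would run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


query = "SELECT name FROM student WHERE college_id = ?"

plan_before = plan(conn, query, (10,))  # no usable index: full table scan
conn.execute("CREATE INDEX idx_student_college ON student (college_id)")
plan_after = plan(conn, query, (10,))   # the planner can now use the index

print(plan_before)
print(plan_after)
```

Comparing the two plans is the quickest way to confirm that a query actually benefits from an index rather than assuming it does.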
http://dev.mysql.com/doc/refman/5.0/en/optimization.html
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
Our DBA deployed a standalone TiDB and a standalone MySQL, each handling about one million tables, but it seemed that TiDB could not perform as well as MySQL. Why? If it's because the data size is too small, how much data should I put in the database to get better performance from TiDB than from MySQL?
TiDB is designed for scenarios where sharding is used because the capacity of a standalone MySQL is limited, and where strong consistency and complete distributed transactions are required. One of the advantages of TiDB is that it pushes computation down to the storage nodes so that it can be executed concurrently.
TiDB is not suitable for small tables (for example, below the ten-million-row level), because its strength in concurrency cannot show with a small amount of data and a limited number of Regions. A typical example is a counter table, in which a few rows are updated very frequently. In TiDB, these rows become several key-value pairs in the storage engine and then settle into a Region located on a single node. The overhead of the background replication that guarantees strong consistency, plus the operations from TiDB to TiKV, leads to poorer performance than a standalone MySQL.
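The point about a small table settling into a single Region can be pictured with a toy model of range partitioning. This is only an illustration: real Regions are split by size, and TiDB's actual key encoding is different. The split points, row keys and the `region_of` helper below are all made up.

```python
from bisect import bisect_right


def region_of(key, split_keys):
    """Return the index of the range-partitioned Region that holds `key`."""
    return bisect_right(split_keys, key)


# Hypothetical split points dividing the key space into four Regions.
split_keys = [1_000_000, 2_000_000, 3_000_000]

# A small counter table: a handful of adjacent row keys.
counter_rows = [1, 2, 3, 4, 5]
# A large table: row keys spread over the whole key space.
large_table_rows = range(0, 4_000_000, 500_000)

counter_regions = {region_of(k, split_keys) for k in counter_rows}
large_regions = {region_of(k, split_keys) for k in large_table_rows}

print(counter_regions)  # every hot row lands in the same Region
print(large_regions)    # the work can be spread over several Regions
```

In this toy model all updates to the counter table hit one Region, so adding nodes does nothing for it, whereas the large table touches every Region.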
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am working in an environment that is under extreme load. It is a DB used by about a thousand users with one application. This application does thousands of queries against the DB. We have noticed significant performance degradation over time and are looking for a long-term solution to this problem. Of course, query optimization is one of the tasks we are working on, and we are also optimizing indexes; however, this will not be enough to see the performance gains we need.
I have worked in SQL Server for several years but my MySQL knowledge is limited. To start scaling MySQL, I've researched Sharding, but as we are using MySQL community edition, I'm nervous that this will cause more headaches than it's worth. The only other possibility is to re-design the application, specifically how it pulls data from the DB, but I'd rather not do that.
So my question is: is sharding worthwhile to pursue? Is it feasible without an enterprise edition of MySQL? Is there another possibility you could recommend?
Turn on the slow query log with long_query_time=1. Wait a day. Use pt-query-digest to identify the 'worst' couple of queries. Then let's discuss them. Sometimes the fix is the trivial addition of a 'composite' index.
That is, slow queries are almost always the cause of scaling problems.
If we eliminate that as a problem, then we can discuss sharding and other non-trivial approaches.
We will need to see the SHOW CREATE TABLE output and other clues about what is going on.
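The idea behind ranking the 'worst' queries can be sketched as follows. This is not how pt-query-digest is implemented and it does not parse a real slow log; it only illustrates grouping queries that differ in their literals and ranking the groups by total time. The `fingerprint` and `worst_queries` helpers and the sample entries are hypothetical.

```python
import re
from collections import defaultdict


def fingerprint(sql):
    """Normalize a query so that calls differing only in literals group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # replace quoted string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # replace numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def worst_queries(entries, top=2):
    """Rank query fingerprints by total execution time, highest first."""
    totals = defaultdict(lambda: [0.0, 0])  # fingerprint -> [seconds, calls]
    for sql, seconds in entries:
        stats = totals[fingerprint(sql)]
        stats[0] += seconds
        stats[1] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(fp, total, count) for fp, (total, count) in ranked[:top]]


# Hypothetical (query, seconds) pairs, as if already extracted from a slow log.
entries = [
    ("SELECT * FROM orders WHERE user_id = 42", 2.5),
    ("SELECT * FROM orders WHERE user_id = 7", 3.0),
    ("SELECT name FROM users WHERE email = 'a@example.com'", 1.2),
    ("UPDATE stock SET qty = 5 WHERE sku = 'X1'", 1.1),
]

report = worst_queries(entries)
for fp, total, count in report:
    print(f"{total:6.1f}s  {count:3d} calls  {fp}")
```

A query that is only moderately slow but runs thousands of times often tops such a ranking, which is why total time is a more useful sort key than the slowest single execution.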
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I will be writing a program in Delphi that will read from and write to MySQL database tables on a regular basis, about every 5 seconds. Is this going to be CPU intensive, or could it get to a point where the computer freezes completely? I know that reading from and writing to a hard drive nonstop can freeze everything on your computer, but I am not sure whether the same applies to a MySQL database.
Databases are designed to handle many frequent transactions, but it really depends on the queries you are using. A simple SELECT on a couple of rows is unlikely to cause an issue, but large-scale updates targeting many tables, or queries with multiple joins, can slow performance.
This all depends on the computer and the complexity of the query.
As David has said, it really does depend on the hardware and queries you are processing.
I would suggest measuring the processing time of each query to determine whether a write is still running when the next 5-second interval starts, so that the queries pile up on top of each other.
You can find information on how to measure your MySQL processes here.
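The measurement suggested above can be sketched like this. The question is about Delphi and MySQL; this sketch uses Python with the built-in sqlite3 module purely as a runnable stand-in, and the `reading` table, the `timed` helper and the `INTERVAL_SECONDS` constant are hypothetical.

```python
import sqlite3
import time

INTERVAL_SECONDS = 5.0  # polling period taken from the question


def timed(conn, sql, params=()):
    """Run a query and return its rows together with the elapsed wall time."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (id INTEGER PRIMARY KEY, value REAL NOT NULL)")
with conn:  # one transaction for the whole batch of inserts
    conn.executemany(
        "INSERT INTO reading (value) VALUES (?)",
        [(i * 0.5,) for i in range(1000)],
    )

rows, elapsed = timed(conn, "SELECT COUNT(*), AVG(value) FROM reading")

# If a query takes longer than the polling interval, runs start to overlap.
fits_in_interval = elapsed < INTERVAL_SECONDS
print(f"{elapsed:.6f}s, fits in interval: {fits_in_interval}")
```

Logging these timings over a realistic run shows whether the work per cycle stays comfortably below the interval or creeps towards it as the tables grow.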
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I need to create a database for handling a click stream (from ~240 subdomains). I use JavaScript to grab information such as Host, Page, Date, userID, Referer, HostName, RefererPath and uniqueUserID for each click, and then insert the data into the database through a Java dynamic web application. There are about 9 million new records each day and I have to insert new records every minute. Another application needs to be able to retrieve information about pageviews, unique visitors, etc. for a certain article/subdomain over the last 10 min, 20 min, 30 min, 1 hour ... 24 hours. I only need to keep records for the last 3 months.
Initially I thought about using MySQL, as I'm only interested in open source, but I'm now considering NoSQL solutions. The problem is that I have experience only with relational databases and cannot really tell whether NoSQL would be a better solution here. Also, which database should I use if I choose to go with NoSQL? And would a key-value store be the best way to go?
I'm guessing that consistency isn't critical for this data (statistics?), so you could indeed give up a bit of consistency. NoSQL seems a good choice and a key-value store would also be my pick. Now the real question is: which one is best suited?
I'd consider Redis and Riak (which are probably the best-known ones):
Riak (AP system):
- Fault-tolerant (masterless, with partitioning and replication)
- MapReduce
- Full-text search
- BASE
Redis (CP system):
- Really fast
- In-memory: you need RAM! That also means you want replication so you don't lose everything on a crash. Redis also supports disk snapshots, I believe.
- Master/slave with re-election
- BASE
Both have a lot more features, and you should read the documentation for gotchas. Redis is primarily used as a cache since it's fast, whereas Riak focuses on fault tolerance. Given your scalability requirements, both can satisfy your needs, so you must choose according to the points above.
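To make the key-value idea concrete for the click-stream case, here is a minimal sketch of per-minute counters keyed by (subdomain, page, minute). Plain Python dicts stand in for the key-value store, so this says nothing about Redis- or Riak-specific APIs; the key layout and the `record_click` and `stats` helpers are assumptions made for illustration.

```python
from collections import defaultdict

# Plain dicts stand in for the key-value store; the key layout is hypothetical.
pageviews = defaultdict(int)  # (subdomain, page, minute) -> click count
visitors = defaultdict(set)   # (subdomain, page, minute) -> set of user ids


def record_click(subdomain, page, user_id, epoch_seconds):
    """Add one click to the bucket of the minute it happened in."""
    key = (subdomain, page, epoch_seconds // 60)
    pageviews[key] += 1
    visitors[key].add(user_id)


def stats(subdomain, page, now_seconds, window_minutes):
    """Return (pageviews, unique visitors) over the last `window_minutes` buckets."""
    current = now_seconds // 60
    views, users = 0, set()
    for minute in range(current - window_minutes + 1, current + 1):
        key = (subdomain, page, minute)
        views += pageviews.get(key, 0)   # .get avoids creating empty buckets
        users |= visitors.get(key, set())
    return views, len(users)


record_click("news", "/a", "u1", 0)
record_click("news", "/a", "u2", 30)
record_click("news", "/a", "u1", 600)
record_click("news", "/b", "u3", 600)

last_10 = stats("news", "/a", 610, 10)
last_20 = stats("news", "/a", 610, 20)
print(last_10, last_20)
```

Because every bucket belongs to exactly one minute, the three-month retention requirement reduces to dropping buckets whose minute is older than the cutoff.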
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 3 years ago.
I know Google uses its own Bigtable (is that right?) and Facebook and Twitter use Cassandra, but when does your everyday project outgrow MySQL (if ever)?
If you were starting out on a potentially massive-scale web application, would you use MySQL as the engine or start with an alternative from the beginning?
I think the only way you can know that MySQL isn't good enough is when you start to see performance issues or you feel like you're fighting to keep it going. If you are aware that your application is potentially huge, then you should be using the right tools from the start; otherwise migrating at a later date is a huge headache.
There is no simple metric that will tell you the answer. It depends not only on the amount of data and the number of transactions but also on the nature of the replication: the number of replicated sites, the required speed of replication, etc.
Yes, a large-scale NoSQL cluster can outperform a MySQL cluster built for the same budget for OLTP. However, it's called NoSQL for a reason: when you need to start doing something useful with the data, the relational model and the SQL language make slicing and dicing the data much easier. OTOH, at some point OLAP overtakes the relational model in terms of performance, but I think it would be rather difficult to use a data warehouse for transaction processing.
So it's quite possible that the functional requirements of an application will outgrow the capabilities of a NoSQL database much faster than the performance requirements would outgrow a relational database.
I'd start with an alternative (PostgreSQL), not because of scaling issues but because MySQL's support for transactions and referential integrity is worthless.