In a MySQL database, I have one table with 330 columns; each column is either a float or an integer value. The entries are indexed by a millisecond timestamp column. Over the life of the application there are expected to be on the order of 100 - 200 million entries. This table is completely independent and has no relations to other tables. The only queries are ones filtering by the timestamp index.
Assuming I have a modern Intel server with 6 cores, 32 GB of RAM and enough disk storage for the data, will any size limits be hit, or will performance significantly degrade?
If there will be problems, what should be done to mitigate them?
I know similar questions have been asked, but the answer always seems to be it depends. Hopefully I've provided enough information so that a definitive answer can be determined.
Wow such huge hardware for such a small dataset!
You will not have any insurmountable problems with this dataset.
330 columns * 8 bytes = 2640 bytes (maximum) per row
2640 bytes * 200 million rows ≈ 528 GB (about 492 GiB)
It's big, but not huge. It really depends what you're going to do with the data. If you are only 'appending' to the data, never updating and never inserting out of order (in your case, inserting rows with earlier timestamps), then that eliminates two potential causes for concern.
If you are querying on the timestamp index, are you going to be using a RANGE or a specific timestamp?
Querying over ranges will be fine: make your timestamp the clustered index column. Since you are performing some out-of-order inserts (earlier timestamps), the table can fragment, but that won't be a big problem; if it does, you can defragment the table.
A big choice for performance is InnoDB vs. MyISAM. InnoDB gives you full transactional capability, but do you need it? It involves roughly twice as many writes; still, for a single table with no referential integrity, you're probably going to be OK.
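As a minimal sketch of what such a table and a range query could look like (the table and column names are made up, and it assumes no two readings share the same millisecond; otherwise the key needs an extra column):

CREATE TABLE sensor_readings (
    ts        BIGINT NOT NULL,   -- millisecond epoch timestamp
    value_001 DOUBLE,
    value_002 INT,
    -- ... the remaining measurement columns ...
    PRIMARY KEY (ts)             -- InnoDB clusters rows on the primary key
) ENGINE=InnoDB;

-- Range filters then read a contiguous slice of the clustered index:
SELECT * FROM sensor_readings
WHERE ts BETWEEN 1609459200000 AND 1609545600000;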
I'm building a training-point management function and need to save all those points in the database so they can be displayed when needed. I created a table with 60 columns for that function. Is that good, or can anyone suggest another way to handle it?
It is unusual but not impossible for a table to have that many columns, however...
It suggests that your schema might not be normalized. If that is the case, you will run into problems designing queries and/or making efficient use of the available resources.
Depending on how often each row is updated, the table could become fragmented. MySQL, like most DBMSs, does not simply add up the sizes of all the attributes in the relation to work out the space to allocate for the record (although this is an option with C-ISAM). It rounds that figure up so that there is some room for the data to grow, but at some point a record could outgrow the space available; it then has to be migrated elsewhere, which leads to fragmentation in the data.
Your queries are going to be very difficult to read and maintain. You may fall into the trap of writing "select * ...", which means that the DBMS needs to read the entire record into memory in order to resolve the query. That does not make efficient use of your memory.
We can't tell you whether what you have done is correct, nor whether you should be doing it differently, without a detailed understanding of the underlying data.
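To illustrate the "select *" trap mentioned above (table and column names here are hypothetical), naming only the columns you need keeps the rows the server has to materialize small:

-- Pulls all 60 columns into memory for every matching row:
SELECT * FROM training_points WHERE student_id = 42;

-- Reads and returns only what the page actually displays:
SELECT student_id, semester, total_points
FROM training_points
WHERE student_id = 42;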
I've worked with many tables that had dozens of columns. It's usually not a problem.
In relational database theory, there is no limit to the number of columns in a table, as long as it's finite. If you need 60 attributes and they are all properly attributes of the candidate key in that table, then it's appropriate to make 60 columns.
It is possible that some of your 60 columns are not proper attributes of the table, and need to be split into multiple tables for the sake of normalization. But you haven't described enough about your specific table or its columns, so we can't offer opinions on that.
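As an illustration only (your real columns may well all belong on one table), a normalized alternative to 60 point columns is one row per student per point category; all names below are hypothetical:

CREATE TABLE point_category (
    category_id INT PRIMARY KEY,
    name        VARCHAR(100) NOT NULL
);

CREATE TABLE student_point (
    student_id  INT NOT NULL,
    category_id INT NOT NULL,
    points      INT NOT NULL,
    PRIMARY KEY (student_id, category_id),
    FOREIGN KEY (category_id) REFERENCES point_category (category_id)
);

-- Adding a 61st category is now an INSERT into point_category,
-- not an ALTER TABLE on a 60-column table.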
There's a practical limit in MySQL for how many columns it supports in a given table, but this is a limit of the implementation (i.e. MySQL internal code), not of the theoretical data model. The actual maximum number of columns in a table is a bit tricky to define, since it depends on the specific table. But it's almost always greater than 60. Read this blog about Understanding the Maximum Number of Columns in a MySQL Table for details.
I am kind of new to SQL and web-programming right now, so I am doing a little project for myself to get to know every aspect of developing a website (yup, from frontend to backend and sql).
So the point is that I may potentially have a lot of data in my table, like over 3k rows [elements] (relatively a lot), with a bunch of columns [properties] as well. And I know from the beginning that it could be split into, say, four tables, for example by color.
Each element of each color has the same set of keys (the same columns).
So the question is how I could estimate the time:memory efficiency trade-off in this case. I do understand that it is much quicker to search a smaller table, but I have no idea how SQL tables are stored, for instance how much additional memory each extra table costs on its own.
3,000 rows is small for SQL. You don't want to split large tables, because SQL databases have strong capabilities for handling larger data. Three come to mind:
Sophisticated query optimizers.
Indexes.
Table partitions.
In addition, the way that data is stored incurs overhead for small tables -- not large ones. Rows are stored on data pages. And data pages are typically measured in thousands of bytes. So, a small row with 100 bytes still occupies one data page -- even if the page could store one hundred such records. And the overhead for reading the data page is the same.
In summary: your table isn't big, and SQL is optimized for larger tables, so there is no need to change your data model.
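If searching by color is the common case, an index on that column gives you most of the benefit of splitting without the extra tables. A rough sketch with invented names:

CREATE TABLE element (
    id    INT PRIMARY KEY,
    color VARCHAR(20) NOT NULL,
    -- ... other property columns ...
    INDEX idx_color (color)
) ENGINE=InnoDB;

-- The optimizer uses idx_color to touch only the matching rows:
SELECT * FROM element WHERE color = 'red';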
I have one table on a MySQL server which grows by about 2 GB every week. I currently take a backup of that table and delete the data. Please suggest whether there is a better solution to this problem.
A few doubts I have:
1) Should I move this particular table to some other NoSQL DB like MongoDB or Cassandra?
2) Will increasing the MySQL server's storage help with this problem?
1) Should I move this particular table to some other NoSQL DB like MongoDB or Cassandra?
A decision to move data to MongoDB or Cassandra is not a decision to be made just for dumping the data somewhere else. These are stores with different data models and operational concerns than an RDBMS like MySQL.
2) Will increasing the MySQL server's storage help with this problem?
You appear to be constrained by disk size, which is surprising, as 2 GB is nothing in terms of storage; your smartphone has much more than that. You should increase the available storage and then put an archival policy in place. Or, if (as others have pointed out) you are hitting a file-system size limit, you can go for partitioning, which has other benefits too, such as faster queries when the WHERE clause confines the data to one or a few partitions (a sketch follows below).
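A minimal sketch of monthly RANGE partitioning plus partition-based archival; the table and column names are invented, and note that the partition key has to be part of every unique key on the table:

CREATE TABLE event_log (
    id         BIGINT NOT NULL AUTO_INCREMENT,
    created_at DATETIME NOT NULL,
    payload    VARCHAR(255),
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE COLUMNS (created_at) (
    PARTITION p2023_01 VALUES LESS THAN ('2023-02-01'),
    PARTITION p2023_02 VALUES LESS THAN ('2023-03-01'),
    PARTITION pmax     VALUES LESS THAN (MAXVALUE)
);

-- Archiving old data becomes a near-instant metadata operation
-- instead of a huge DELETE:
ALTER TABLE event_log DROP PARTITION p2023_01;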
I'm working on a website which has a requirement to store a large amount of data in a single table. It will be over 100K entries per month, kept for a minimum of 5 years, i.e. approximately 100k × 60 months = 6 million entries.
My question is: which DBMS best handles this kind of data? MySQL, Oracle or PostgreSQL?
First of all, 6M records is not very much, so these days it should not be a problem for any mainstream DBMS. However, I see two aspects:
1) Space assessment - approximate how much space will be needed. For this you can insert into a table several records similar to yours and extrapolate to 6M records. E.g. (I used SQL Server, but an equivalent check is available in any other DBMS such as MySQL):
Record looks like this (4 integers and a varchar)
103 1033 15 0 The %S_MSG that starts with '%.*ls' is too long. Maximum length is %d.
I have inserted about 1M rows in a table and space usage returns something like:
rows        reserved
1008656     268232 KB
So, it will be about 1.5GB for 6M rows.
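The rough MySQL equivalent of that space check is to load a representative sample and then ask information_schema for the footprint; the schema and table names below are placeholders, and the figures are estimates for InnoDB:

SELECT table_rows,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.tables
WHERE table_schema = 'my_db'
  AND table_name   = 'my_test_table';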
2) Usage assessment - already specified by chanaka wije. If you do only SELECTs or INSERTs, no special features are required (like support for many transactions per time unit).
Also, in order to improve SELECT performance, you should look into partitioning (by time, in your case).
It depends on the usage of your table: whether you only insert, or also run selects frequently. I'm using a table to store web page views, about 4 million records per month, with MySQL, and I trim it every 6 months; no issues so far. If you run SELECT queries, choose the right storage engine: InnoDB has row-level locking, while MyISAM has table-level locking.
This is a good question. Apart from what has been suggested here, I think one issue to consider is how you connect to the database. Oracle itself only scales well if you are using a connection pool (a limited, fixed number of connections). If you are connecting all the time, fetching some data and disconnecting, don't use Oracle. Seriously, go for MySQL.
And if your application is very simple, consider the least expensive option. Don't throw Oracle at it just because it is "the best out there".
I have a MySQL database with around 30 GB of data. Some of the tables contain over 40 million rows. I am using InnoDB. A plain "select count(*) from table_name" on my local PC takes around 5 minutes, so I think joining the tables is impossible for me. I would like to ask whether there is anything I can do to improve the performance, or do I need to switch to another database? I have never encountered such large data in a DB. Please help.
I have run mysql instances with over 100 million entries, and delivering over 30 million queries per day. So it can be done.
The problems you are experiencing will occur with any other database system if similarly configured.
I can only give you a few tips, if this is mission critical consider hiring a professional to tweak your system.
Basics that you need to look at:
This size database is best run on a dedicated server with SSD disks, and at least 2 cores;
You're going to need a lot of RAM in your server: at least your total database size + 20% for other system resources;
Make sure MySQL has been configured with enough memory, around 80% of your total RAM. The primary setting that controls this is innodb_buffer_pool_size;
Optimize your queries and index where needed; this is a fine art but can drastically improve performance. Learn to use EXPLAIN ... on your queries (see the sketch below).
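A sketch of those last two points, sizing the buffer pool and using EXPLAIN; the table, column and index names are hypothetical, and resizing the buffer pool at runtime assumes MySQL 5.7 or later:

-- Check and, if needed, grow the InnoDB buffer pool (illustrative 24 GB value):
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SET GLOBAL innodb_buffer_pool_size = 24 * 1024 * 1024 * 1024;

-- See whether a slow query can use an index:
EXPLAIN SELECT order_id, total
FROM orders
WHERE customer_id = 123;

-- If the plan reports a full table scan (type: ALL), add an index:
ALTER TABLE orders ADD INDEX idx_customer (customer_id);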
MySQL InnoDB tables do not keep a count of rows, thus SELECT COUNT(*) can be slow. It's not an indication of how other queries might perform, but it is an indication of how slow a full table scan might be. Five minutes is really bad for just 40 million rows and might indicate a serious problem with your database or disk.
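If an exact count is not required, table statistics give a fast approximation; the schema, table and index names below are placeholders, and for InnoDB the estimate can be noticeably off:

SELECT table_rows
FROM information_schema.tables
WHERE table_schema = 'my_db'
  AND table_name   = 'big_table';

-- An exact COUNT(*) still has to scan an index, but scanning a narrow
-- secondary index (assumed to exist here) means far less I/O than the
-- wide clustered index:
SELECT COUNT(*) FROM big_table FORCE INDEX (idx_created_at);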
I have encountered the large data size problem before and hope my experience is useful for you.
First, you need to create indexes on your table; which kind of index to use depends on your query logic.
Second, after indexing, if the query is still slow, you'd better divide the data into a hierarchy, for example source tables, middle tables and report tables. The report table stores just the final data, so queries against it are fast; create indexes for it as well (see the sketch below).
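A minimal sketch of such a report table, with invented names, refreshed periodically (for example from cron) so requests read a few pre-aggregated rows instead of scanning millions:

CREATE TABLE daily_report (
    report_date DATE NOT NULL,
    category    VARCHAR(50) NOT NULL,
    row_count   BIGINT NOT NULL,
    total_value DECIMAL(18,2) NOT NULL,
    PRIMARY KEY (report_date, category)
);

-- Refresh yesterday's figures from the raw (source) table:
INSERT INTO daily_report (report_date, category, row_count, total_value)
SELECT DATE(created_at), category, COUNT(*), SUM(value)
FROM source_events
WHERE created_at >= CURDATE() - INTERVAL 1 DAY
  AND created_at <  CURDATE()
GROUP BY DATE(created_at), category
ON DUPLICATE KEY UPDATE
    row_count   = VALUES(row_count),
    total_value = VALUES(total_value);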
Third, try something like MemSQL if the above cannot meet your requirements.
Besides, learn commands like:
SET profiling = 1;
-- run the slow query
SHOW PROFILES;
SHOW PROFILE FOR QUERY N;