Proper database practices [closed] - mysql

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I'm fairly new to structuring databases. Say I have 38 different pieces of data that I want to store per record: is it better to break that up into a couple of different tables, or can I keep it all in one table?
In this case I have a table of energy usage data for accounts: monthly usage, monthly demand, and demand percentage, plus two identifying keys for each, which comes out to 38 pieces of data per record.
So is it good practice to break it up, or should I leave it all as one table? Also, are there any effects on query efficiency once this database accumulates a couple of thousand records at its peak?
Edit: I'm using Hibernate to query; I'm not sure whether that would have any effect on efficiency depending on how I end up breaking this data up.

First, check the normal forms:
1) Wiki
2) A Simple Guide to Five Normal Forms in Relational Database Theory
Second, aggregated data like "monthly sales" or "daily clicks" typically goes into separate tables. This is motivated not only by the normal forms, but also by how the database is implemented.
For example, MySQL offers the Archive storage engine, which is designed for exactly that.
If you're looking at the current month's data, it can live in the same table or be stored in a cache; the per-month aggregates in the separate table can then be computed on the first day of each month.
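To illustrate the separate-aggregate-table idea, here is a minimal MySQL sketch; all table and column names are hypothetical, not taken from the question:

```sql
-- Raw readings (hypothetical `usage_readings` table).
CREATE TABLE usage_readings (
    account_id  INT NOT NULL,
    read_at     DATETIME NOT NULL,
    usage_kwh   DECIMAL(10,2) NOT NULL,
    demand_kw   DECIMAL(10,2) NOT NULL,
    PRIMARY KEY (account_id, read_at)
);

-- Per-month aggregates live in their own table.
CREATE TABLE monthly_usage (
    account_id  INT NOT NULL,
    month       DATE NOT NULL,          -- first day of the month
    total_kwh   DECIMAL(12,2) NOT NULL,
    peak_kw     DECIMAL(10,2) NOT NULL,
    PRIMARY KEY (account_id, month)
);

-- Run on the 1st of each month to roll up the previous month:
INSERT INTO monthly_usage (account_id, month, total_kwh, peak_kw)
SELECT account_id,
       DATE_FORMAT(read_at, '%Y-%m-01'),
       SUM(usage_kwh),
       MAX(demand_kw)
FROM usage_readings
WHERE read_at >= DATE_FORMAT(CURDATE() - INTERVAL 1 MONTH, '%Y-%m-01')
  AND read_at <  DATE_FORMAT(CURDATE(), '%Y-%m-01')
GROUP BY account_id, DATE_FORMAT(read_at, '%Y-%m-01');
```

Queries for historical statistics then hit the small aggregate table instead of scanning the raw readings.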

When you read a record, do you often use all of the data? Or do you have different sections or views (loaded separately) to show energy usage data, monthly statistics, and so on?
How many records do you plan to have in this table? If they grow dramatically and continually, could you create tables with a suffix to group them by period (month, half year, year, ...)?

Related

What is more efficient: one big table or separate non-related smaller ones [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I am kind of new to SQL and web programming right now, so I am doing a little project for myself to get to know every aspect of developing a website (yup, from frontend to backend and SQL).
The point is that I may eventually have a lot of data in my table, like over 3k rows [elements] (relatively a lot) with a bunch of columns [properties] as well. And I know from the beginning that it could be split into, say, four tables, for example by color.
Elements of every color have exactly the same set of columns.
So the question is how I could estimate the time:memory efficiency trade-off in this case. I understand that it is much quicker to search for information in a smaller table, but I have no idea how SQL tables are stored; for instance, how much additional memory each extra table costs.
3,000 rows is small for SQL. You don't want to split large tables, because SQL has strong capabilities for handling larger data. Three come to mind:
Sophisticated query optimizers.
Indexes.
Table partitions.
In addition, the way that data is stored incurs overhead for small tables, not large ones. Rows are stored on data pages, and data pages are typically measured in thousands of bytes. So a table with a single 100-byte row still occupies a whole data page, even though that page could store one hundred such rows. And the overhead for reading a data page is the same either way.
In summary: your table isn't big, and SQL is optimized for larger tables, so there is no need to change your data model.
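As a sketch of the first two capabilities, assuming a hypothetical `elements` table shaped the way the question describes:

```sql
-- One table for all colors; an index keeps color lookups
-- from scanning the whole table. (Names are hypothetical.)
CREATE TABLE elements (
    id    INT PRIMARY KEY,
    color VARCHAR(20) NOT NULL,
    name  VARCHAR(100) NOT NULL
);

CREATE INDEX idx_elements_color ON elements (color);

-- With the index, this reads only the matching rows:
SELECT id, name FROM elements WHERE color = 'red';
```

At 3,000 rows the difference is barely measurable, which is the answer's point: the single table plus an index already behaves like the "small tables" you were considering.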

MySQL One table vs many tables (same data) [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
I'm building a website to monitor a bunch of IoT devices, e.g. the online/offline status of each device and some device-specific information it may report back (IP address, temperature, etc.; this will vary). FYI, these devices report back to my site via a processor/computer that polls them and then reports back (a maximum of 255 devices, but in most cases between 10 and 100 devices).
To date, my approach had been to create a new table for each processor, in which just that processor's devices would reside. However, in discussions with a colleague he suggested this might not be the best way to go, as it isn't particularly efficient and could be problematic later on, e.g. if you wanted to add another column you would have to add it to possibly 50+ different processor tables.
Instead, because all these tables would have the same structure (an identical set of columns; only the number of devices, i.e. rows, would vary), would one big table with all these rows be a better way to go?
I know that in MySQL terms "scanning" is an expensive operation, and with one big table I would argue there would be more scanning, as I would have to take one big data set each time and filter it down into a view, e.g. by processor or location against 5,000+ rows, versus lots of smaller tables of 100 rows. Also, the data in this table would be written to a lot, e.g. each time a device goes offline the offline flag is updated, so I'm not sure whether that makes it more suited to many small tables or one large table.
I appreciate there are many different ways of approaching this; I just don't want to go down one rabbit hole and regret it later on. The front end will be PHP, if that counts for anything.
Your friend is correct. Creating many tables to store very similar data would be a waste of configuration time and an inefficient way to store this information. Instead, create one table with columns that differentiate your machines from each other (machine ID, type, whatever), as well as columns for the information that all machines report (temperature, IP, etc.). You will have a much better organized database, and it will be much simpler when you want to update your table later on.
SQL is very well optimized for search queries, and unless you're storing millions of rows, I think you'll be just fine in terms of performance.
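A minimal sketch of that single-table layout; the table and status columns are assumptions for illustration, not from the question:

```sql
CREATE TABLE devices (
    id           INT AUTO_INCREMENT PRIMARY KEY,
    processor_id INT NOT NULL,          -- which processor polls this device
    ip_address   VARCHAR(45),
    temperature  DECIMAL(5,1),
    is_online    TINYINT(1) NOT NULL DEFAULT 0,
    updated_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP
                 ON UPDATE CURRENT_TIMESTAMP
);

-- One index replaces all the per-processor tables:
CREATE INDEX idx_devices_processor ON devices (processor_id);

-- "All devices for processor 7" is an indexed lookup, not a full scan:
SELECT * FROM devices WHERE processor_id = 7;

-- A status change touches a single row:
UPDATE devices SET is_online = 0 WHERE id = 42;
```

Adding a new column later is then one `ALTER TABLE` instead of fifty.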

Which is faster in sql: searching for a table, or searching for data in table? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Say there are 1,000 tables in a database, and each table has 1,000 rows. When I search for single table from these 1,000 tables, is the search time same as that required to search for data within one of the tables?
In other words, does SQL use the same search algorithm to find a table out of 1,000 tables as it does to get data from a table with 1,000 rows?
No, MySQL doesn't use the same search algorithm to find a table.
MySQL maintains an in-memory "data dictionary" so when you run a query that names a specific table, it looks up that table very quickly. It's much faster for MySQL to identify a table than to search for data within a table. For example, the database servers I maintain at my job have over 150,000 tables, and this isn't a problem.
Does this mean you should split up your data over many tables to make it run faster? No -- that's not usually a good tradeoff. It makes your application code more complex, since your code needs to pick which table to query. You may also find cases where you wish the data were in one table, if you have to search for results across many of your tables.
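To see the trade-off the answer describes, compare searching across split tables with searching one table; the table names here are hypothetical:

```sql
-- Split across per-year tables: application code must know every
-- table name, and a cross-table search needs a UNION ALL over all of them.
SELECT * FROM orders_2021 WHERE customer_id = 9
UNION ALL
SELECT * FROM orders_2022 WHERE customer_id = 9
UNION ALL
SELECT * FROM orders_2023 WHERE customer_id = 9;

-- One table: a single indexed query covers everything.
SELECT * FROM orders WHERE customer_id = 9;
```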
Here are a couple of principles to follow:
"Everything should be made as simple as possible, but not simpler." (attributed to Albert Einstein)
"First make it work, then make it right, and, finally, make it fast." (Stephen C. Johnson and Brian W. Kernighan, 1983)

database performance: how do more columns in a table impact design / development / performance [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I have a program that captures many different types of structured messages. I need to persist the messages to database. What is the forum's view on design and performance, between:
(a) using one big table for all message types, so that to handle any new message type, new columns are added to the big table; the database is then one table that may end up having hundreds of columns.
(b) using a table for each message type, so that for a new message type, a new table is added to the database.
By performance I mean searching all messages (i.e. searching one table versus searching across joined tables), development work (i.e. knowledge transfer between developers), and maintenance (i.e. when something goes wrong).
This sounds a bit like it's about normalisation, but I am not sure it is.
Thanks!
If I read you right, choice (a) amounts to what is called the "One True Lookup Table" (OTLT). OTLT is an antipattern; you can research it on the web.
Performance is degraded because the lookup has to be done on two fields, the type and the code; with separate tables for each type, the lookup is just on the code.
Queries become more complex, and therefore more likely to contain errors.
Data management is harder if you want separate entry forms for each type. If you are going to have just one entry form for every type, you need to be careful when entering new lookup values. Good luck.
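The contrast can be sketched like this, with hypothetical names standing in for the question's message types:

```sql
-- Antipattern (a): one generic table keyed by (type, code);
-- every lookup must filter on both fields.
CREATE TABLE lookup (
    type  VARCHAR(30) NOT NULL,   -- 'order_status', 'alarm_code', ...
    code  INT NOT NULL,
    label VARCHAR(100) NOT NULL,
    PRIMARY KEY (type, code)
);
SELECT label FROM lookup WHERE type = 'alarm_code' AND code = 3;

-- Alternative (b): one narrow table per type; the lookup is on the
-- code alone, and each column can carry a proper type and constraints.
CREATE TABLE alarm_code (
    code  INT PRIMARY KEY,
    label VARCHAR(100) NOT NULL
);
SELECT label FROM alarm_code WHERE code = 3;
```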

Most efficient way to store user profile information [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
Okay, so I have my user table ready, with columns for all the technical information such as username, profile picture, password, and so on. Now I'm at the point where I need to add superficial profile information such as location, age, self-description, website, Facebook account, Twitter account, interests, etc. In total, I calculated this would amount to 12 new columns, and since my user table already has 18 columns, I'm at a crossroads. Other questions I read about this didn't really give a bottom-line answer on which method is most efficient.
I need to find out whether there is a more efficient way, and what the most efficient way to store this kind of information is. The base assumption is that my website would in the future have millions of users, so an option is needed that can scale.
I have so far come up with two different options:
Option 1: Store the superficial data in the user table, taking the total column count in the users table up to 30.
Or
Option 2: Store the superficial data in a separate table connected to the users table.
Which of these scales better? Which is more efficient? Is there a third option that is better than these two?
A special extra question, too, if anyone has information about this: how do the biggest sites on the internet handle this? Thanks to anyone who participates with an answer; it is hugely appreciated.
My current database is MySQL with the mysql2 gem in Rails 4.
In your case, I would go with the second option. I suppose this would be more efficient because you would retrieve data from table 1 whenever the user logs in, and data from table 2 (the superficial data) only when you display or change their preferences; you would not have to retrieve all of the data every time you want to do something. The bottom line is that I would suggest modelling your data according to your usage scenarios (use cases), creating data entities (e.g. tables) matching your use-case entities. Then you should take the database normalization principles into account.
If you are interested in how these issues are handled by the biggest sites in the world, you should know that they largely do not rely on relational (SQL) databases. They use NoSQL databases, which run in a distributed fashion. This is a much more complicated scenario than yours; if you want to see related tools, you could start by reading about Cassandra and Hadoop.
Hope I helped!
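Option 2 can be sketched as a separate 1:1 profile table keyed by the user's id; all names here are hypothetical:

```sql
CREATE TABLE users (
    id            INT AUTO_INCREMENT PRIMARY KEY,
    username      VARCHAR(50) NOT NULL UNIQUE,
    password_hash VARCHAR(255) NOT NULL
);

CREATE TABLE user_profiles (
    user_id   INT PRIMARY KEY,
    location  VARCHAR(100),
    website   VARCHAR(255),
    bio       TEXT,
    FOREIGN KEY (user_id) REFERENCES users (id)
);

-- Login touches only the narrow users table; the profile page
-- joins in the wide row only when it is actually needed:
SELECT u.username, p.location, p.bio
FROM users u
JOIN user_profiles p ON p.user_id = u.id
WHERE u.id = 1;
```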
If you will need to access these 30 columns of information frequently, you could put all of them into the same table. That's what some widely used CMSes do, because even though the row is big, it's faster to retrieve one big row than plenty of small rows from various tables (more SQL requests, more searches, more indexes, ...).
Database normalization is also a good read for your problem.