This question has been asked many times regarding entity tables.
What about a many-to-many (aka cross, join, or pivot) table?
For instance, I have entity tables "courses" and "students", and a many-to-many table "students_taking_courses".
What are the pros and cons of deleting the joining record versus adding a flag column to the table and marking it as being deleted? What conditions would make one approach preferred over the other?
EDIT: Please assume the two entity tables have surrogate primary keys.
Assuming an RDBMS, when using a flag instead of a physical delete:
We keep a full trace of actions in the system. When having a history is crucial, this approach is beneficial.
On the other hand:
- Keeping the records requires more space.
- There will be more updates; in heavily loaded or highly concurrent systems that may be a problem.
- More complexity is imposed on the design. Think about enforcing a unique constraint on the course-student (many-to-many) table so that a course can be taken by a student only once per term: with the flag, this takes more effort than a plain DBMS unique constraint (see the sketch below).
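For that last point, here is a minimal MySQL sketch; the table and column names are my own illustration, not from the question:

    -- Hard delete: the DBMS constraint alone guarantees "one enrolment per term".
    CREATE TABLE students_taking_courses (
        student_id INT NOT NULL,
        course_id  INT NOT NULL,
        term       CHAR(6) NOT NULL,
        PRIMARY KEY (student_id, course_id, term)
    );

    -- Soft delete: the same key now also blocks re-enrolment after a soft
    -- delete, because the flagged row still occupies the key.
    CREATE TABLE students_taking_courses_flagged (
        student_id INT NOT NULL,
        course_id  INT NOT NULL,
        term       CHAR(6) NOT NULL,
        is_deleted TINYINT NOT NULL DEFAULT 0,
        PRIMARY KEY (student_id, course_id, term)  -- collides with soft-deleted rows
    );
    -- Typical workarounds (the "more effort"): move flagged rows to a history
    -- table, or replace is_deleted with a nullable "active" column that is 1
    -- for live rows and NULL for deleted ones and put it in a unique key,
    -- since MySQL allows repeated NULLs in unique indexes.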
I prefer an EAV approach for historical data over using the flag.
I am working on a project where I am using a table with a multi-valued attribute that has 5-10 values. Is it better to keep the multi-valued attribute, or should I normalize it into normal forms?
But I think that normalizing unnecessarily increases the number of rows. If an attribute has 10 values, each row (tuple) is replaced with 10 new rows, which might increase query running time.
Can anyone give suggestions on this?
The first normal form requires that each attribute be atomic.
I would say that the answer to this question hinges on the “atomic”: it is too narrow to define it as “indivisible”, because then no string would be atomic, as it can be split into letters.
I prefer to define it as “a single unit as far as the database is concerned”. So if this array (or whatever it is) is stored and retrieved in its entirety by the application, and its elements are never accessed inside the database, it is atomic in this sense, and there is nothing wrong with the design.
If, however, you plan to use elements of that attribute in WHERE conditions, if you want to modify individual elements with UPDATE statements or (worst of all) if you want the elements to satisfy constraints or refer to other tables, your design is almost certainly wrong. Experience shows that normalization leads to simpler and faster queries in that case.
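To make that concrete, here is a rough sketch of the normalized shape, using a hypothetical product/color example of my own (the question gives no schema):

    -- Unnormalized: the multi-valued attribute lives in one delimited string.
    CREATE TABLE product (
        product_id INT PRIMARY KEY,
        name       VARCHAR(100) NOT NULL,
        colors     VARCHAR(255)            -- e.g. 'red,green,blue'
    );

    -- Normalized: one small row per value (and the colors column is dropped);
    -- elements can now be indexed, constrained, and used directly in WHERE.
    CREATE TABLE product_color (
        product_id INT NOT NULL,
        color      VARCHAR(30) NOT NULL,
        PRIMARY KEY (product_id, color),
        FOREIGN KEY (product_id) REFERENCES product (product_id)
    );

    -- "All products available in red" becomes a plain indexed join:
    SELECT p.*
    FROM product p
    JOIN product_color pc ON pc.product_id = p.product_id
    WHERE pc.color = 'red';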
Don't try to get away with a few large table rows. Databases are optimized for dealing with many small table rows.
Say there are 1,000 tables in a database, and each table has 1,000 rows. When I search for a single table among these 1,000 tables, is the search time the same as that required to search for data within one of the tables?
In other words, does SQL use the same search algorithm to find a table out of 1,000 tables as it does to get data from a table with 1,000 rows?
No, MySQL doesn't use the same search algorithm to find a table.
MySQL maintains an in-memory "data dictionary" so when you run a query that names a specific table, it looks up that table very quickly. It's much faster for MySQL to identify a table than to search for data within a table. For example, the database servers I maintain at my job have over 150,000 tables, and this isn't a problem.
Does this mean you should split up your data over many tables to make it run faster? No -- that's not usually a good tradeoff. It makes your application code more complex, since your code needs to pick which table to query. You may also find cases where you wish the data were in one table, if you have to search for results across many of your tables.
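A rough sketch of that tradeoff, with hypothetical table names of my own:

    -- One table: a selective index means the server walks a B-tree,
    -- not all the rows, so the row count barely matters.
    CREATE TABLE event (
        id         INT PRIMARY KEY,
        event_type VARCHAR(30) NOT NULL,
        payload    TEXT,
        INDEX idx_event_type (event_type)
    );
    SELECT * FROM event WHERE event_type = 'login';

    -- Split over many tables: the application must pick the table, and any
    -- search across types turns into a UNION over every per-type table.
    SELECT * FROM event_login
    UNION ALL SELECT * FROM event_logout
    UNION ALL SELECT * FROM event_purchase;  -- ...and so on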
Here are a couple of principles to follow:
"Everything should be made as simple as possible, but not simpler." (attributed to Albert Einstein)
"First make it work, then make it right, and, finally, make it fast." (Stephen C. Johnson and Brian W. Kernighan, 1983)
Recently I've faced performance issues when binding Java objects to the database, especially when mapping data from the database to Java code with many FK-PK relationships involved. I identified the issue and solved the slowdown by creating database views and POJOs that map to them.
I did some research online but couldn't find a good answer to this: how does the database (I am using MySQL) keep data querying fast through views?
For example, if I create a view over 10 tables related by FK-PK relationships, the view is still pretty fast to query and displays the result quickly. What exactly happens behind the scenes in the database engine?
Indexes.
MySQL implicitly creates a foreign key index (i.e. an index on columns that compose the foreign key), unless one already exists. Not all database engines do so.
A view is little more than an aliased query. As such, any view, as trivial as it may seem, could kill the server if written poorly. Execution time is not proportional to the number of joined tables, but to the quality of the indexes*.
Side effect: the default index might not be the most efficient one.
*Table sizes also start to matter when the tables grow large, as in millions of records.
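To illustrate, a minimal sketch with hypothetical tables of my own (not the asker's schema):

    CREATE TABLE student (
        id   INT PRIMARY KEY,
        name VARCHAR(100) NOT NULL
    );
    CREATE TABLE enrolment (
        student_id INT NOT NULL,
        course     VARCHAR(50) NOT NULL,
        FOREIGN KEY (student_id) REFERENCES student (id)  -- InnoDB indexes this FK if no index exists
    );

    -- The view is just an aliased query over the join.
    CREATE VIEW student_course AS
    SELECT s.name, e.course
    FROM student s
    JOIN enrolment e ON e.student_id = s.id;

    -- EXPLAIN shows essentially the same plan as running the join directly:
    -- the speed comes from the indexes, not from the view itself.
    EXPLAIN SELECT * FROM student_course WHERE name = 'Ann';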
I have a program that captures many different types of structured messages. I need to persist the messages to database. What is the forum's view on design and performance, between:
(a) using one big table for all message types, so that handling a new message type means adding new columns to it; the database then has one table that may end up with hundreds of columns.
(b) using a table for each message type, so that a new message type means adding a new table to the database.
By performance I mean in terms of searching all messages (i.e. searching one table versus a search across joined tables) and in terms of development work (i.e. knowledge transfer between developers) and maintenance (i.e. when something goes wrong).
This sounds a bit like it's about normalisation, but I am not sure it is.
Thanks!
If I read you right, choice (a) amounts to what is called the "One True Lookup Table" (OTLT). OTLT is an antipattern. You can research it on the web.
Performance is degraded because the lookup has to be done on two fields, the type and the code. With separate tables for each type, the lookup is just on the code.
Queries are more complex, and therefore more likely to be in error.
Data management is harder if you want separate entry forms for each type. If you are going to have just one true type entry form, you need to be careful when entering new lookup values. Good luck.
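A quick sketch of the first point, with hypothetical names (the question gives no schema):

    -- One True Lookup Table: every lookup needs both the type and the code.
    CREATE TABLE lookup (
        lookup_type VARCHAR(30)  NOT NULL,
        code        VARCHAR(30)  NOT NULL,
        description VARCHAR(100) NOT NULL,
        PRIMARY KEY (lookup_type, code)
    );
    SELECT description FROM lookup
    WHERE lookup_type = 'message_status' AND code = 'SENT';

    -- One table per type: the lookup is on the code alone, and foreign keys
    -- can point at exactly the right table.
    CREATE TABLE message_status (
        code        VARCHAR(30)  PRIMARY KEY,
        description VARCHAR(100) NOT NULL
    );
    SELECT description FROM message_status WHERE code = 'SENT';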
I was wondering if there is some sort of standard regarding the naming of MySQL tables? I currently have two tables, one called Users and one called Trainings. Because I need a many-to-many relationship between them, I have to create a table that interconnects the two, but I am unable to think of a proper name other than Users/Trainings.
No official best practice, but some well tested heuristics...
If my ORM or something doesn't enforce its own standard, I like to use underscore-separated notation, always in alphabetical order. Also, I always use singular for my classes and table names to avoid the pluralization futz. With many-to-many, by its nature you can't tell which entity 'has' the other, so the alphabetical rule just makes it easier.
user
address
zebra
address_user
user_zebra
One simple set of rules, no confusion ever as to how or what to name things, and it's easy to explain.
Going a step further, I recommend unless a very specific reason compels against it:
Always use lower case for table names and column names (that way you won't be surprised moving from a case-sensitive file system to a case-insensitive one; it's also easier not having to remember camel case)
Name the primary key of a table (when it is not a composite key) id
Name foreign keys in the format tablename_id (like user_id, zebra_id)
Remember, these are just guidelines, not dogma.
These guidelines will make your life and your future DBA's life better.
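Putting the guidelines together for the asker's two tables, a minimal MySQL sketch (assuming simple integer surrogate keys):

    CREATE TABLE training (
        id   INT AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(100) NOT NULL
    );
    CREATE TABLE user (
        id   INT AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(100) NOT NULL
    );

    -- Join table: singular, lowercase, alphabetical (training before user),
    -- with foreign keys named tablename_id.
    CREATE TABLE training_user (
        training_id INT NOT NULL,
        user_id     INT NOT NULL,
        PRIMARY KEY (training_id, user_id),
        FOREIGN KEY (training_id) REFERENCES training (id),
        FOREIGN KEY (user_id)     REFERENCES user (id)
    );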