I've read the MySQL documentation on stored and virtual generated columns, but I'm still not clear on their benefits. What are the pros and cons compared with storing the same data in a regular column and indexing it? When is using a generated column more efficient, or otherwise the better choice?
Thank you!
A good reason to use a stored generated column is when the expression is costly enough that you want to calculate it only when you insert/update the row. A virtual generated column must recalculate the expression every time you run a query that reads that column.
The manual confirms this:
Stored generated columns can be used as a materialized cache for complicated conditions that are costly to calculate on the fly.
Besides that, some uses of generated columns require the column to be stored; they don't work with virtual generated columns. For example, you need a stored generated column if you want to create a foreign key or a fulltext index on it.
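For illustration, here is a minimal sketch of a stored generated column; the table and column names are made up:

```sql
-- Hypothetical table: "total" is computed once per INSERT/UPDATE
-- and persisted, so reads never re-evaluate the expression.
CREATE TABLE orders (
  id       INT PRIMARY KEY,
  price    DECIMAL(10,2) NOT NULL,
  quantity INT NOT NULL,
  total    DECIMAL(12,2) AS (price * quantity) STORED,
  KEY idx_total (total)        -- indexable like any regular column
);
```

Swapping STORED for VIRTUAL would make the column free at write time, but the expression would be re-evaluated whenever the column is read.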
I am working on a project where I am using a table with a multi-valued attribute that has 5-10 values. Is it good to keep the multi-valued attribute, or should I normalize it into normal forms?
My concern is that normalizing unnecessarily increases the number of rows: if an attribute has 10 values, each row is replaced by 10 new rows, which might increase query running time.
Can anyone give suggestions on this?
The first normal form requires that each attribute be atomic.
I would say that the answer to this question hinges on the “atomic”: it is too narrow to define it as “indivisible”, because then no string would be atomic, as it can be split into letters.
I prefer to define it as “a single unit as far as the database is concerned”. So if this array (or whatever it is) is stored and retrieved in its entirety by the application, and its elements are never accessed inside the database, it is atomic in this sense, and there is nothing wrong with the design.
If, however, you plan to use elements of that attribute in WHERE conditions, if you want to modify individual elements with UPDATE statements or (worst of all) if you want the elements to satisfy constraints or refer to other tables, your design is almost certainly wrong. Experience shows that normalization leads to simpler and faster queries in that case.
Don't try to get away with a few large table rows. Databases are optimized for dealing with many small table rows.
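If the elements do need to be visible to the database, a minimal sketch of the normalized alternative could look like this (table and column names are hypothetical):

```sql
-- One row per element instead of a multi-valued attribute packed
-- into one column; elements can now be indexed, constrained, and
-- referenced from other tables.
CREATE TABLE item (
  item_id INT PRIMARY KEY,
  name    VARCHAR(100) NOT NULL
);

CREATE TABLE item_tag (
  item_id INT NOT NULL,
  tag     VARCHAR(50) NOT NULL,
  PRIMARY KEY (item_id, tag),  -- no duplicate elements per item
  FOREIGN KEY (item_id) REFERENCES item (item_id)
);
```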
It might look like a simple question that has been answered countless times, but I could not find an optimal way to do this with any database.
I have a list of a few thousand keywords (say, abusive words). Whenever someone posts a message (a long sentence or a paragraph), I want to check whether it contains any of the keywords, so that I can block the user or take other action.
I am looking for a database/schema that can solve this and respond within a few milliseconds (<15 ms).
Many databases solve the reverse problem: given the keywords, find the documents containing them (text search).
Try ClickHouse for your workload.
According to the docs:
multiMatchAny(...) returns 0 if none of the regular expressions match and 1 if any of the patterns matches. It uses the hyperscan library. For searching substrings in a string, it is better to use multiSearchAny, since it works much faster.
The length of any haystack string must be less than 2^32 bytes.
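A minimal sketch of the faster substring variant, with a made-up message and word list:

```sql
-- multiSearchAny(haystack, [needles]) returns 1 if any needle occurs
-- as a substring of the haystack, 0 otherwise.
SELECT multiSearchAny(
    'this message contains badword2 somewhere',
    ['badword1', 'badword2', 'badword3']
) AS is_abusive;
-- is_abusive = 1
```

In practice you would keep the few thousand keywords in one array (or load them from a table) and run this check once per incoming message.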
I am trying to decide which of two MySQL designs will handle my data more efficiently. I can either:
1: Create around 200 tables, each having around 30 rows & 5 columns.
2: Create 1 table, having around 6000 rows & 5 columns.
I am using Laravel for this project and Eloquent will be handling this. Does anybody have any opinions on this matter? I appreciate any/all responses.
Option 2.
For such low row counts, the overhead of joining and managing 200(!) tables, in both programming effort and computation, far outweighs any benefit over the single "flat" table. Additionally, MySQL will attempt to cache the entire 6,000-row table in RAM, assuming you're not storing massive BLOBs.
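As a rough sketch, option 2 usually means replacing the 200 tables with one discriminator column (all names here are invented):

```sql
-- One table with a "category" column standing in for the 200 tables;
-- the index keeps per-category lookups fast.
CREATE TABLE items (
  id       INT AUTO_INCREMENT PRIMARY KEY,
  category VARCHAR(50) NOT NULL,  -- which of the ~200 groups this row belongs to
  col_a    VARCHAR(100),
  col_b    VARCHAR(100),
  col_c    VARCHAR(100),
  KEY idx_category (category)
);

-- Fetch the ~30 rows that would have lived in one of the 200 tables:
SELECT * FROM items WHERE category = 'group_42';
```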
In my experience so far, the data saved in a table's columns has rarely had a specific reason to be an integer, a boolean, and so on. Yet we are all advised to choose column types based on the data type, and I have been doing so for years.
I am thinking of dropping this idea completely and using tables with only TEXT columns. They're easier to create (I can copy/paste TEXT instead of picking a type), the compiler will warn me when a type conversion is needed, and there are dozens of reasons like these.
Is there a good, unbeatable reason why I should not switch to this practice?
No. We don't "complicate" database tables with non-TEXT columns; we let the database enforce a consistent format for the storage of the data.
By using number, boolean, and date fields, we get all the validation and retrieval methods the database has already implemented for those types.
Without specific datatypes, we end up reinventing the wheel whenever we need validation or particular display formats.
I once worked in a data warehouse group, and the experience frequently made me wish for better-defined datatypes, formats, required fields, and validations on most of the data we received.
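A small sketch of the difference, using hypothetical tables (MySQL in strict SQL mode, the default since 5.7):

```sql
CREATE TABLE payments_typed (
  id      INT PRIMARY KEY,
  amount  DECIMAL(10,2) NOT NULL,
  paid_on DATE NOT NULL
);

CREATE TABLE payments_text (
  id      TEXT,
  amount  TEXT,
  paid_on TEXT
);

-- Rejected by the database: 'not-a-date' is not a valid DATE.
INSERT INTO payments_typed VALUES (1, 19.99, 'not-a-date');

-- Silently accepted: every validation now has to live in application code.
INSERT INTO payments_text VALUES ('1', '19.99', 'not-a-date');
```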
I am working on a project where I need to calculate some average values based on users' interactions with a site.
The number of records whose average needs to be calculated can range from a few to thousands.
My question is: at what threshold would it be wise to store the aggregated data in a separate table, updated by a stored procedure every time a new record is generated, instead of just calculating it every time it is needed?
Thanks in advance.
Don't do it until you start having performance problems caused by the time it takes to aggregate your data.
Then do it.
If discovering this bottleneck in production is unacceptable, then run the system in a test environment that accurately matches your production environment and load in test data that accurately matches production data. If you hit a performance bottleneck in that environment that is caused by aggregation time, then do it.
You need to weigh the need for current data against the need for fast queries. If you absolutely need current data, you have to live with longer-running queries. If you absolutely need answers as fast as possible, you will have to accept slightly stale data.
You can time your queries and time the insertion into a separate table and evaluate which seems to best fit your needs.
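As one hedged sketch of the pre-aggregated approach (the interactions table and its columns are hypothetical), a trigger can maintain a running count and sum so the average stays cheap to read:

```sql
-- Summary table: avg = sum_rating / n at read time.
CREATE TABLE user_avg (
  user_id    INT PRIMARY KEY,
  n          INT NOT NULL,
  sum_rating DECIMAL(12,2) NOT NULL
);

-- Incrementally update the summary on every new interaction,
-- trading a little insert overhead for fast reads.
CREATE TRIGGER trg_interactions_ai AFTER INSERT ON interactions
FOR EACH ROW
  INSERT INTO user_avg (user_id, n, sum_rating)
  VALUES (NEW.user_id, 1, NEW.rating)
  ON DUPLICATE KEY UPDATE
    n = n + 1,
    sum_rating = sum_rating + NEW.rating;
```

The on-the-fly alternative is simply `SELECT AVG(rating) FROM interactions WHERE user_id = ?`, which is always current but whose cost grows with the number of rows: exactly the trade-off described above.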