Optimal Way to Store/Retrieve Array in Table - mysql

I currently have a table in MySQL that stores values normally, but I want to add a field to that table that stores an array of values, such as cities. Should I simply store that array as a CSV? Each row will need it's own array, so I feel uneasy about making a new table and inserting 2-5 rows for each row inserted in the previous table.
I feel like this situation should have a name, I just can't think of it :)
Edit
number of elements - 2-5 (a selection from a dynamic list of cities, the array references the list, which is a table)
This field would not need to be searchable, simply retrieved alongside other data.

The "right" way would be to have another table that holds each value but since you don't want to go that route a delimited list should work. Just make sure that you pick a delimiter that won't show up in the data. You can also store the data as XML depending on how you plan on interacting with the data this may be a better route.

I would go with the idea of a field containing your comma (or other logical delimiter) separated values. Just make sure that your field is going to be big enough to hold your maximum array size. Then when you pull the field out, it should be easy to perform an explode() on the long string using your delimiter, which will then immediately populate your array in the code.

Maybe the word you're looking for is "normalize". As in, move the array to a separate table, linked to the first by means of a key. This offers several advantages:
The array size can grow almost indefinitely
Efficient storage
Ability to search for values in the array without having to use "like"
Of course, the decision of whether to normalize this data depends on many factors that you haven't mentioned, like the number of elements, whether or not the number is fixed, whether the elements need to be searchable, etc.

Is your application PHP? It might be worth investigating the functions serialize and unserialize.
These two functions allow you to easily store an array in the database, then recreate that array at a later time.

As others have mentioned, another table is the proper way to go.
But if you really don't want to do that(?), assuming you're using PHP with MySQL, why not use the serialize() and store a serialized value?

Related

Store and query array or group of words in MYSQL and PHP

I am working on a project that uses PHP/MYSQL as the backend for an IOS app that makes a lot of use of dictionaries and arrays containing text or strings.
I need to store this text in MYSQL (coming from Arrays of srtrings on phone) and then query to see the text contains (case insensitive) a word or phrase in question.
For example, if the array consists of {Ford, Chevy, Toyota, BMW, Buick}, I might want to query it to see it contains Saab.
I know storing arrays in a field is not MYSQL friendly as it prevents optimization. However, it would be way too complicated to create individual tables for these collections of words which are created by users.
So I'm looking for a reasonable way to store them, perhaps delimited with spaces or with commas that makes possible reasonably efficient searches.
If they are stored separated by spaces, I gather you can do something with regex like:
SELECT
*
FROM
`wordgroups`
WHERE
wordgroup regexp '(^|[[:space:]])BLA([[:space:]]|$)';
But this seems funky.
Is there a better way to do this? Thanks for any insights
Consider using a FULLTEXT index. And use MATCH(...) AGAINST(... IN NATURAL LANGUAGE MODE).
FULLTEXT is very fast for "words", and IN NATURAL MODE may solve your Saab example.
Using regexp can achieve what you want, however, your query will be inefficient, since it cannot rely on any indexes.
If you want to store a list of words and their position within the array does not matter, then you may consider storing them in a single field, space delimited. But instead of using a regexp, use fulltext indexing and searching. This method has a clear advantage over searching with regexp: it uses an index. It has some drawbacks as well: there is a stopword list (these are excluded from searching) and there is a minimum word length as well. The good news is that these parameters are configurable. Also, you get all the drawbacks of storing data in a delimited field, as detailed in Is storing a delimited list in a database column really that bad? question here on SO.
However, if you want to use dictionaries (key - value pairs) or the position within the list may be important, then the above data structure will not do.
In this case, I would consider if mysql is the right choice for storing my data in the first place. If you have multi-dimensional lists, or lists containing lists, then I would definitely choose a different nosql solution.
If you only need simple, two-dimensional lists / dictionaries, then you can store all of them in a single table with a similar structure as below:
list_id - unique identifier of the list, primary key
user_id - id of the user the list belongs to
key - for dictionaries this is the lookup field (indexed), for other lists it may store the position of the element. String data type.
value - the field holding the value (indexed). Data type should be string, so that it could hold different data types as well.
A search to determine if a list holds a certain value would be fast and efficient lookup using the index on either the key or value fields.

Storing attributes with multiple integer values

I need to store a dynamic number of integer attributes (1-8). I'm storing them in individual columns in the database table, like:
attribute_1, attribute_2, ..., attribute_8
This makes for a fairly messy model when methods need to reference these, as well as an unwieldy database table and schema.
These are assigned default values (but are overridable on a form), and represent unique identifiers for the user.
For instance, a Brew is composed of up to eight batches before they are mixed together in a fermenter. The brewer might want to go back and refer to any one of these by its unique identifying number. I'm assigning these unique values based on the last highest value when creating a new Brew. However, the user may want to override these default values to some other numbers.
In most cases (smaller breweries), they'll probably only use the first two, but some larger breweries would use all eight.
There must be a better way to store these than having eight different attributes with the same name and a number at the end.
I'm using MySQL. Is there an easy/concise way to store an array or a JSON hash but still be able to edit these values on a form?
I would not store attributes like that. It will limit you in the future. Let say you want to know which brews have used attribute_4? You will have to scan the entire brews table, open the attributes field and deconstruct it to see if 4 is in there.
Much better to separate Brew and Attributes in two tables, and link them, like so:
Another benefit, is you can add attributes easily.
Storing JSON is ok, like #max pointed out. I just propose the normalized database way of doing it.

Set Data Type in mySQL

My knowledge of relational databases is more limited, but is there a SQL command that can be used to create a column that contains a set in each row?
I am trying to create a table with 2 columns. 1 for specific IDs and a 2nd for sets that correspond to these IDs.
I read about
http://dev.mysql.com/doc/refman/5.1/en/set.html
However, the set data type requires that you know what items may be in your set. However, I just want there to be a variable-number list of items that don't repeat.
It would be much better to create that list of items as multiple rows in a second table. Then you could have as many items in the list you want, you could sort them, search for a specific item, make sure they're unique, etc.
See also my answer to Is storing a delimited list in a database column really that bad?
No, there's no MySQL data type for arbitrary sets. You can use a string containing a comma-delimited list; there are functions like FIND_IN_SET() that will operate on such values.
But this is poor database design. If you have an open-ended list, you should store it in a table with one row per value. This will allow them to be indexed, making searching faster.
MySQL doesn't support arrays, lists or other data structures like that. It does however support strings so use that and FIND_IN_SET() function:
http://dev.mysql.com/doc/refman/5.7/en/string-functions.html#function_find-in-set
"SET" data type won't be a good choice here.
You can use the "VARCHAR" and store the values in CSV format. You handle them at application level.
Example: INSERT into my_table(id, myset) values(1, "3,4,7");

Best way to store an array in MySQL database?

Part of my app includes volume automation for songs.
The volume automation is described in the following format:
[[0,50],[20,62],[48,92]]
I consider each item in this array a 'data point' with the first value containing the position in the song and the second value containing the volume on a scale of 0-100.
I then take these values and perform a function client-side to interpolate this data with 'virtual' data points in order to create a bezier curve allowing smooth volume transition as an audio file is playing.
However, the need has arisen to allow a user to save this automation into the database for recall at a later date.
The datapoints can be unlimited (though in reality should never really exceed around 40-50 with most being less than 10)
Also how should I handle the data? Should it be stored as is, in a text field? Or should I process it in some way beforehand for optimum results?
What data type would be best to use in MySQL to store an array?
Definitely not a text field, but a varchar -- perhaps. I wouldn't recommend parsing the results and storing them in individual columns unless you want to take advantage of that data in database sense -- statistics etc.
If you never see yourself asking "What is the average volume that users use?" then don't bother parsing it.
To figure out how to store this data ask yourself "How will i use it later?" If you will obtain the array and need to utilize it with PHP you can use serialize function. If you will use the values in JavaScript then JSON encoding will probably be best for you (plus many languages know how to decode it)
Good luck!
I suggest you to take a look at the JSON data type. This way you can store your array in a more efficient way than text or varchar, and you can access your data directly form MySQL without having to parse the whole thing.
Take a look at this link : https://dev.mysql.com/doc/refman/5.7/en/json.html
If speed is the most important when retrieving the rows then make a new table and make it dedicated to holding the indices of your array. Use the data type of integer and have each row represent an index of the array. You'll have to create another numeric column which binds these together so you can re-assemble the array with an SQL query.
This way you help MySQL help you speed up access. If you only want certain parts of the array, you just change the range in the SQL query and you can reassemble the array however you want.
The best way to store array is JSON data type -
CREATE TABLE example (
`id` int NOT NULL AUTO_INCREMENT,
`docs` JSON,
PRIMARY KEY (`id`)
);
INSERT INTO example (docs)
VALUES ('["hot", "cold"]');
Read more - https://sebhastian.com/mysql-array/#:~:text=Although%20an%20array%20is%20one,use%20the%20JSON%20data%20type.

Does MYSQL stores it in an optimal way it if the same string is stored in multiple rows?

I have a table where one of the columns is a sort of id string used to group several rows from the table. Let's say the column name is "map" and one of the values for map is e.g. "walmart". The column has an index on it, because I use to it filter those rows which belong to a certain map.
I have lots of such maps and I don't know how much space the different map values take up from the table. Does MYSQL recognizes the same map value is stored for multiple rows and stores it only once internally and only references it with an internal numeric id?
Or do I have to replace the map string with a numeric id explicitly and use a different table to pair map strings to ids if I want to decrease the size of the table?
MySQL will store the whole data for every row, regardless of whether the data already exists in a different row.
If you have a limited set of options, you could use an ENUM field, else you could pull the names into another table and join on it.
I think MySQL will duplicate your content each time : it stores data row by row, unless you explicitly specify otherwise (putting the data in another table, like you suggested).
Using another table will mean you need to add a JOIN in some of your queries : you might want to think a bit about the size of your data (are they that big ?), compared to the (small ?) performance loss you may encounter because of that join.
Another solution would be using an ENUM datatype, at least if you know in advance which string you will have in your table, and there are only a few of those.
Finally, another solution might be to store an integer "code" corresponding to the strings, and have those code translated to strings by your application, totally outside of the database (or use some table to store the correspondances, but have that table cached by your application, instead of using joins in SQL queries).
It would not be as "clean", but might be better for performances -- still, this may be some kind of micro-optimization that is not necessary in your case...
If you are using the same values over and over again, then there is a good functional reason to move it to a separate table, totally aside from disk space considerations: To avoid problems with inconsistent data.
Suppose you have a table of Stores, which includes a column for StoreName. Among the values in StoreName "WalMart" occurs 300 times, and then there's a "BalMart". Is that just a typo for "WalMart", or is that a different store?
Also, if there's other data associated with a store that would be constant across the chain, you should store it just once and not repeatedly.
Of course, if you're just showing locations on a map and you really don't care what they are, it's just a name to display, then this would all be irrelevant.
And if that's the case, then buying a bigger disk is probably a simpler solution than redesigning your database just to save a few bytes per record. Because if we're talking arbitrary strings for place names here, then trying to find duplicates and have look-ups for them is probably a lot of work for very little gain.