Should I use relations or split the result - mysql

I'm creating a database that should contain coordinates, textsize, etc.
My first table looks like this
id, template_id, data_1, data_2, data_3, data_4, data_5, data_6, data_7, data_8
Every data_x field should have one of the following formats:
svg string;textsize
x;y;textheight;textwidth
x;y;imageheight;imagewidth
In the future, more formats could be added.
My question is: should I use those formats (and split them using e.g. PHP), or should I create a table for each format with relationships? Which is fastest, and what is best practice?
I hope I explained myself well enough.

First, you should not be storing these in separate columns. You should have another table with one row per table1 id and data item. It would have at least two columns:
Table1Id
DataColumn
It might also have an auto-incremented id, a sequential number to enumerate the data items, and so on.
As for your question on how to store the data, that depends on how you are going to access them. If the database is going to be blind to the contents, then you can store them all in a single field. If you need to query the individual values, then you might have to go to the next level and break things out into separate data tables, one for each type of value, that the above data column would refer to.
In any case, the more important change at this point is to put the "array" of data values into separate rows of another table.
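A minimal sketch of the layout described above. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table name table1_data and the position column are hypothetical, and only the idea of one row per data item comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a MySQL database
conn.executescript("""
CREATE TABLE table1 (
    id          INTEGER PRIMARY KEY,
    template_id INTEGER NOT NULL
);
-- One row per data item instead of the data_1 .. data_8 columns.
CREATE TABLE table1_data (
    id          INTEGER PRIMARY KEY,                     -- auto-incremented id
    table1_id   INTEGER NOT NULL REFERENCES table1(id),
    position    INTEGER NOT NULL,                        -- order of the item within its parent row
    data_column TEXT    NOT NULL                         -- e.g. 'x;y;textheight;textwidth', stored opaquely
);
""")

conn.execute("INSERT INTO table1 (id, template_id) VALUES (1, 7)")
conn.executemany(
    "INSERT INTO table1_data (table1_id, position, data_column) VALUES (?, ?, ?)",
    [(1, 1, "10;20;12;80"), (1, 2, "30;40;100;200")],
)

rows = conn.execute(
    "SELECT position, data_column FROM table1_data "
    "WHERE table1_id = ? ORDER BY position",
    (1,),
).fetchall()

# The split happens in the application, which is only reasonable while
# the database never needs to look inside the stored value.
parsed = [(pos, [int(part) for part in data.split(";")]) for pos, data in rows]
print(parsed)
```

Adding a ninth or tenth data item is then an INSERT rather than a schema change.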

Related

How to use Key-Value pair in Relational database(MySql)?

I want to use a relational database (MySQL) to store my data as key-value pairs.
The number of key-value pairs is dynamic.
I can create a simple table to store them in separate columns.
Values can be of type- int, varchar, text or date.
The problem I am facing is:
When I query on a key whose value should be an integer, I need to use a greater-than or less-than comparison with it. The same applies when I need a BETWEEN query on date fields.
How can I achieve it?
------------------------------------------------Edit---------------------------------------------------
For greater clarity, I am providing the background for this question which I have divided into three parts:
1. Data 2. Use Case 3. Possible Designs
1. Data
Suppose I'm creating a data store for the census of a country (just an example). The fields for storing data would be different for male, female, boy or girl, and they will also vary according to the person's profession. The number of fields depends on the requirements and can increase to 500 or more.
2. Use Case
Show a paginated list of persons whose monthly income is between $7000 and $10000. The user can click on any page number and the database should directly fetch the data for that page. For example, if we are showing 10 results per page and the user clicks on the 5th page, then we should show persons 41 to 50.
Some of the values belonging to a particular group store description which can have large data. So they should be stored as TEXT.
3. Possible Designs
I can create a separate table for each different type and store their data in respective fields. The problem I see with this approach is that a MySQL table has a maximum row size limit of 65,535 bytes. Storing all data horizontally might exceed that limit, since the number of fields is not fixed and can change with the requirements.
Instead of storing data horizontally I can store them vertically using an Entity Attribute Value design (key-value pairs). For now, the increase in the number of rows due to this design is not a problem. Using this I can store data of all male, female or child in the same table. But the problems with this approach are:
I will lose the data type of certain important fields. I cannot query and get the list of persons whose income is more than 1000.
To store the data of all fields in a single value column, I need to make it VARCHAR. But some fields store large data, which requires TEXT as the type.
Considering the above problem, I thought that instead of creating only one value field, I will create multiple value fields like value_int, value_varchar, value_date or value_text.
DB structure
For this problem, I will be using MySQL and cannot change the DB due to certain restrictions. So I am looking for a design with MySQL only.
Is the key-value approach a good idea, or is there another design I could use?
In very general terms, if you know the entities and attributes of your problem domain, and the data is relational, I'd use a relational schema (your "possible design 1"). If you actually encounter problems with maximum row width, your problem domain might contain logical subgroupings of attributes, so you can split them into separate tables.
For instance:
Person (id, name, ...)
Person_demographics (person_id, age, location, ...)
Person_finance (person_id, income, wealth...)
If you don't know the entities and attributes in advance, I recommend using MySQL's JSON support or XML support. This gives you access to much better query options than EAV.
The problem with EAV-like solutions in your scenario is that any non-trivial queries end up being incredibly complicated - "find all responses where salary is between x and y, and the age is z, in locations (a, b, c)" turns into a horrible mess of SQL, but with XPath this is pretty straightforward.
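To make the relational option concrete, here is a rough sketch of the paginated income query from the use case, run against the Person / Person_finance split shown above. It uses Python's sqlite3 module in place of MySQL; the sample rows, the index name and the income_page helper are made up for illustration. The LIMIT ... OFFSET clause is written the same way in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_finance (
    person_id INTEGER PRIMARY KEY REFERENCES person(id),
    income    INTEGER NOT NULL            -- typed column, so range queries work
);
CREATE INDEX idx_finance_income ON person_finance (income);
""")

# 100 hypothetical people with incomes 7000, 7030, 7060, ...
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(i, f"person{i}") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO person_finance VALUES (?, ?)",
    [(i, 7000 + 30 * (i - 1)) for i in range(1, 101)],
)

def income_page(conn, low, high, page, page_size=10):
    """Return one page of people whose income lies in [low, high]."""
    offset = (page - 1) * page_size  # page 5 with 10 per page skips 40 rows
    return conn.execute(
        """
        SELECT p.id, p.name, f.income
        FROM person AS p
        JOIN person_finance AS f ON f.person_id = p.id
        WHERE f.income BETWEEN ? AND ?
        ORDER BY f.income, p.id
        LIMIT ? OFFSET ?
        """,
        (low, high, page_size, offset),
    ).fetchall()

page5 = income_page(conn, 7000, 10000, page=5)
print([row[0] for row in page5])
```

The ORDER BY is what keeps the pages stable; without it the database is free to return rows in any order from one request to the next.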

What should I do if I am unable to have a different amount of columns per table row?

Basically, I keep records of client information, and sometimes they will have multiple addresses (properties), e.g.:
id, name, phone, primaryProperty, properties
Considering I cannot create a random number of columns for each entry, I currently just grab the string value from a JavaScript array I create, which is used to hold all of the client's properties, e.g.:
['123 Fakestreet Faketown QC A1A1A1', '555 Falsestreet Falsetown QC B2B2B2']
Then I convert it to a string and shove it into MySQL:
"123 Fakestreet Faketown QC A1A1A1, 555 Falsestreet Falsetown QC B2B2B2"
Unfortunately, as cool as arrays are, this makes it so that I can never properly query the above properties individually, unless I echo the value of the column out and use a for loop to change the contents around.
--
I realize that I could probably get away with increasing the total amount of columns on my clients table depending on the maximum amount of properties a client needs, but having a lot of unused fields for all of the other clients is a little strange, no?
I thought about creating a separate table for the properties, but then I would have a problem when it comes to updating the client's information efficiently, and without calling another update.php file. I have a fear of losing my internet connection in between updating tables, if that makes sense.
Normalization is the answer. Just split it into two tables. The first one contains the unique data of the customer (name, shoe size, etc.) and the ID. The second table contains the same ID and the addresses in separate rows. Then you just join the tables on the ID column as you need.
Cheers!
G.
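A minimal sketch of the two-table split, again with Python's sqlite3 standing in for MySQL. The table names, the is_primary flag, the save_client helper and the sample client are all hypothetical. Wrapping both inserts in one transaction is one way to address the worry about a dropped connection between the two updates: either everything is committed or nothing is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.executescript("""
CREATE TABLE clients (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    phone TEXT
);
CREATE TABLE properties (
    id         INTEGER PRIMARY KEY,
    client_id  INTEGER NOT NULL REFERENCES clients(id),
    address    TEXT NOT NULL,
    is_primary INTEGER NOT NULL DEFAULT 0
);
""")

def save_client(conn, name, phone, addresses):
    """Insert a client and all addresses atomically; the first address is primary."""
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute(
            "INSERT INTO clients (name, phone) VALUES (?, ?)", (name, phone)
        )
        client_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO properties (client_id, address, is_primary) VALUES (?, ?, ?)",
            [(client_id, addr, int(i == 0)) for i, addr in enumerate(addresses)],
        )
    return client_id

cid = save_client(conn, "Jane Doe", "555-0100", [
    "123 Fakestreet Faketown QC A1A1A1",
    "555 Falsestreet Falsetown QC B2B2B2",
])

# Each property is now its own row, so it can be queried individually.
rows = conn.execute(
    "SELECT c.name, p.address FROM clients AS c "
    "JOIN properties AS p ON p.client_id = c.id "
    "WHERE p.address LIKE ? ORDER BY p.id",
    ("%Falsetown%",),
).fetchall()
print(rows)
```

In MySQL the same all-or-nothing behaviour needs a transactional storage engine such as InnoDB.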

How to design the database when you need too many columns? [duplicate]

This question already has answers here:
How do you know when you need separate tables?
(9 answers)
Closed 9 years ago.
I have a table called cars but each car has hundreds of attributes and they keep on increasing over time (horsepower, torque, a/c, electric windows, etc...) My table has each attribute as a column. Is that the right way to do it when I have thousands of rows and hundreds of columns? Also, I made each attribute a column so I facilitate advanced searching / filtering.
Using MySQL database.
Thanks
This is an interesting question IMHO, and the answer may depend on your specific data model and implementation. The most important factor in this case is data density.
How much of each row is actually filled up, on average?
If most of your fields are always present, then data scope partition may be the way to go.
If most of your fields are empty, then a metadata-like structure (like @JayC suggested) may be more attractive.
Let's use the case you mentioned, and do some simulations.
In the first case, scope partition, the idea is to implement partitions based on scope or usage. As an example of partitioning by usage, let's say that the most retrieved fields are Model, Year, Maker and Color. These fields may compose your main [CAR] table, the owner of the ID field which will uniquely identify the vehicle.
Now let's say that Engine, Horsepower, Torque and Cylinders are also used for searches from time to time, but not so frequently. These may exist on a secondary table [CAR_INFO_1], which is tied to the first table by the presence of the CAR_ID field, a foreign key. Proceed by creating as many partitions as you need.
Advantage: Simpler queries. You may coalesce all information about a vehicle if you do a join query (for example inside a VIEW).
Downside: Maintenance. Each new field must be implemented in the model itself, and you need an updated data model to locate where the field you need is actually stored (or abstract it inside a view.)
Metadata format is much more elegant, but demands more of your database engine. Check @JayC's and @Nitzan Shaked's answers for details.
Advantages: 100% data density. You'll never have empty Data values. Also maintenance - a new attribute is created by adding it as a row to the metadata identifier table. Data structure is less complex as well.
Downside: Complex queries, together with more complex execution plans. Let's say you need all Ford cars made in 2010 that are blue. It would be trivial in the first case:
SELECT * FROM CAR WHERE Model='Ford' AND Year='2010' AND Color='Blue'
Now the same query on a metadata-structured model:
Assume the existence of these two tables,
CAR_METADATA_TYPE
ID DESC
1 'Model'
2 'Year'
3 'Color'
and
CAR_METADATA (CAR_ID, METADATA_TYPE_ID, VALUE)
The query itself would look something like this:
SELECT * FROM CAR, CAR_METADATA MP1, CAR_METADATA MP2, CAR_METADATA MP3
WHERE MP1.CAR_ID = CAR.ID AND MP1.METADATA_TYPE_ID = 1 AND MP1.Value='Ford'
AND MP2.CAR_ID = CAR.ID AND MP2.METADATA_TYPE_ID = 2 AND MP2.Value='2010'
AND MP3.CAR_ID = CAR.ID AND MP3.METADATA_TYPE_ID = 3 AND MP3.Value='Blue'
So, it all depends on your needs. But given your case, my suggestion would be the Metadata format.
(But do a model cleanup first - no repeated fields, 1:N data on their own table instead of inline fields like Color1, Color2, Color3, this kind of stuff ;) )
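For anyone who wants to try the metadata-structured query, here is a small runnable sketch using Python's sqlite3 in place of MySQL. The sample cars and the find_cars helper are made up, and the DESC column is renamed to description because DESC is a reserved word. The helper just generates one self-join per filtered attribute, which is exactly why these queries grow with every extra condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.executescript("""
CREATE TABLE car (id INTEGER PRIMARY KEY);
CREATE TABLE car_metadata_type (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE car_metadata (
    car_id           INTEGER NOT NULL REFERENCES car(id),
    metadata_type_id INTEGER NOT NULL REFERENCES car_metadata_type(id),
    value            TEXT NOT NULL,
    PRIMARY KEY (car_id, metadata_type_id)
);
INSERT INTO car_metadata_type VALUES (1, 'Model'), (2, 'Year'), (3, 'Color');
INSERT INTO car VALUES (1), (2);
INSERT INTO car_metadata VALUES
    (1, 1, 'Ford'), (1, 2, '2010'), (1, 3, 'Blue'),
    (2, 1, 'Ford'), (2, 2, '2010'), (2, 3, 'Red');
""")

def find_cars(conn, filters):
    """Return ids of cars matching every filter; filters maps type id -> value."""
    joins, params = [], []
    for n, (type_id, value) in enumerate(filters.items(), start=1):
        # Only the alias counter n is interpolated; all values are bound parameters.
        joins.append(
            f"JOIN car_metadata AS mp{n} ON mp{n}.car_id = car.id "
            f"AND mp{n}.metadata_type_id = ? AND mp{n}.value = ?"
        )
        params += [type_id, value]
    sql = "SELECT car.id FROM car " + " ".join(joins) + " ORDER BY car.id"
    return [row[0] for row in conn.execute(sql, params)]

blue_fords = find_cars(conn, {1: "Ford", 2: "2010", 3: "Blue"})
print(blue_fords)
```

Note that every value is stored as text here, so a range filter on Year would need a cast or a typed value column.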
I guess the obvious question is, then: why not have a table car_attrs(car, attr, value)? Each attribute is a row. Most queries can be re-written to use this form.
If it is all about features, create a features table, list all your features as rows and give them some sort of automatic id, and create a car_features table with foreign keys to both your cars table and your features table that associates cars with features, maybe along with any values associated with the relationship (one passenger electric seat, etc.).
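A rough sketch of that cars / features / car_features layout, with Python's sqlite3 standing in for MySQL. The sample data and the cars_with_all helper are hypothetical; the GROUP BY ... HAVING COUNT query is just one common way to ask for cars that have every feature in a list, and it assumes the list contains no duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.executescript("""
CREATE TABLE cars (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE features (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE car_features (
    car_id     INTEGER NOT NULL REFERENCES cars(id),
    feature_id INTEGER NOT NULL REFERENCES features(id),
    value      TEXT,                      -- optional detail about the relationship
    PRIMARY KEY (car_id, feature_id)
);
INSERT INTO cars VALUES (1, 'car A'), (2, 'car B');
INSERT INTO features VALUES (1, 'a/c'), (2, 'electric windows'), (3, 'electric seat');
INSERT INTO car_features VALUES
    (1, 1, NULL), (1, 3, 'passenger side only'),
    (2, 1, NULL), (2, 2, NULL);
""")

def cars_with_all(conn, feature_names):
    """Names of cars that have every feature in feature_names (assumed distinct)."""
    placeholders = ", ".join("?" for _ in feature_names)
    sql = f"""
        SELECT c.name
        FROM cars AS c
        JOIN car_features AS cf ON cf.car_id = c.id
        JOIN features AS f ON f.id = cf.feature_id
        WHERE f.name IN ({placeholders})
        GROUP BY c.id, c.name
        HAVING COUNT(DISTINCT f.id) = ?
        ORDER BY c.name
    """
    params = [*feature_names, len(feature_names)]
    return [row[0] for row in conn.execute(sql, params)]

print(cars_with_all(conn, ["a/c", "electric windows"]))
```

A new feature is then a new row in features, not a new column on cars.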
If you have ever changing attributes, then consider storing them in an XML blob or text structure in one column. This structure is not relational. The most important attributes will then be duplicated in additional columns so you can craft queries to search on them as the Blob will not be searchable from SQL queries. This will cut down on the amount of columns in that table and allow for expansion without changing the database schema.
As others have suggested, if you want all the attributes in a table, then use an attribute table to define them. Which approach fits best will then depend on the requirements and needs of your application.

How to find a list of rows that contain any of the given INTs in a column that contains INTs as comma separated values

I have a case where we are maintaining a table containing resources. This table has a varchar column that contains role ids as comma separated values (I know normalizing SHOULD have been the way to go, but can't change a long running working system). E.g. role_ids column contains '1,4,6,9,10' and another row contains '5,10,15'.
Then, for a user in system, I have the associated role ids as a list, e.g. 4,15. Now I need to find 'any in many', i.e. any resource that may have any of the role ids present in resource.role_ids column.
This question is similar to this one, but I am not looking for a Grails solution.
I'm looking for a MySQL solution, either a query or a stored procedure. Finding a set of resources could be achieved using FIND_IN_SET(), but I don't want to make multiple calls to the DB, one for each role id in the user's list.
Use a function like this one, to turn your lists into individual records, then join everything up normally.
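The idea of turning the lists into individual records and then joining can be sketched as follows. This is only an illustration: the splitting is done in Python rather than in a MySQL function, sqlite3 stands in for MySQL, and the resource names, the resource_role temp table and the resources_for_roles helper are made up. The original role_ids column is left untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.executescript("""
CREATE TABLE resource (id INTEGER PRIMARY KEY, name TEXT, role_ids TEXT);
INSERT INTO resource VALUES
    (1, 'report',      '1,4,6,9,10'),
    (2, 'admin panel', '5,10,15'),
    (3, 'audit log',   '2,3');
-- Derived table: one record per (resource, role) pair.
CREATE TEMP TABLE resource_role (resource_id INTEGER, role_id INTEGER);
""")

# Turn each comma-separated list into individual records.
pairs = [
    (res_id, int(role))
    for res_id, role_ids in conn.execute("SELECT id, role_ids FROM resource").fetchall()
    for role in role_ids.split(",")
    if role.strip()
]
conn.executemany("INSERT INTO resource_role VALUES (?, ?)", pairs)

def resources_for_roles(conn, user_roles):
    """Resources that carry at least one of the user's role ids, in one query."""
    placeholders = ", ".join("?" for _ in user_roles)
    sql = f"""
        SELECT DISTINCT r.id, r.name
        FROM resource AS r
        JOIN resource_role AS rr ON rr.resource_id = r.id
        WHERE rr.role_id IN ({placeholders})
        ORDER BY r.id
    """
    return conn.execute(sql, list(user_roles)).fetchall()

matches = resources_for_roles(conn, [4, 15])
print(matches)
```

Because the comparison is on integers, role 1 does not accidentally match '10', which is a typical pitfall of LIKE-based searches on the raw string.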

MySQL Database Design Questions

I am currently working on a web service that stores and displays money currency data.
I have two MySQL tables, CurrencyTable and CurrencyValueTable.
The CurrencyTable holds the names of the currencies as well as their description and so forth, like so:
CREATE TABLE CurrencyTable ( name VARCHAR(20), description TEXT, .... );
The CurrencyValueTable holds the values of the currencies during the day - a new value is inserted every 2 minutes when the market is open. The table looks like this:
CREATE TABLE CurrencyValueTable ( currency_name VARCHAR(20), value FLOAT, `datetime` DATETIME, ....);
I have two questions regarding this design:
1) I have more than 200 currencies. Is it better to have a separate CurrencyValueTable for each currency or hold them all in one table?
2) I need to be able to show the current (latest) value of the currency. Is it better to just insert such a field to the CurrencyTable and update it every two minutes or is it better to use a statement like:
SELECT value FROM CurrencyValueTable ORDER BY `datetime` DESC LIMIT 1
The second option seems slower.. I am leaning towards the first one (which is also easier to implement).
Any input would be greatly appreciated!!
p.s. - please ignore SQL syntax / other errors, I typed it off the top of my head..
Thanks!
To your questions:
I would use one table. Especially if you need to report on or compare data from multiple currencies, your queries will be much simpler if you stick to one table.
If you don't have a need to track the history of each currency's value, then go ahead and just update a single value -- but in that case, why even have a separate table? You can just add "latest value" as a field in the currency table and update it there. If you do need to track history, then you will need the two tables and the SQL you posted will work.
As an aside, instead of FLOAT I would use DECIMAL(10,2). Since MySQL 5.0, DECIMAL values are stored and calculated exactly, which avoids the rounding errors that FLOAT introduces when handling currency.
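The difference between binary floating point and exact decimals is easy to see outside the database as well. A small Python illustration, where Decimal plays the role of a DECIMAL column and the to_two_places helper and its rounding mode are chosen just for this example:

```python
from decimal import Decimal, ROUND_HALF_UP

# Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
float_total = 0.1 + 0.2

# Decimals keep the digits exactly as written.
decimal_total = Decimal("0.10") + Decimal("0.20")

def to_two_places(amount):
    """Round a Decimal to two places, similar to storing it in a DECIMAL(10,2) column."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(float_total == 0.3)                  # False
print(decimal_total == Decimal("0.30"))    # True
print(to_two_places(Decimal("1.005")))     # 1.01
```

Whether two decimal places are enough depends on the data; exchange rates are often quoted with more digits, in which case a wider scale would be needed.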
It is better to have one table holding all currencies.
If there is need for historical prices, then the table needs to hold them. A reasonable compromise in many situations is to split the price table into a full list of historical prices and another table which only has the current prices.
Using the FLOAT data type can be troublesome. Please be sure you know what you are doing. If not, use an exact numeric type suited to currency amounts, such as DECIMAL.
As your web service is transactional, it is better to access fewer tables at the same time. Since you will be reading and writing a lot, I would suggest having a single table.
It's better to add a field to the CurrencyTable and update it rather than hitting two tables for a single request.
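If the history table is kept, the "latest value" lookup does not have to be slow. A minimal sketch, assuming one table for all currencies and a composite index on (currency_name, recorded_at); Python's sqlite3 stands in for MySQL, and the table name, column names, index name and sample rates are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.executescript("""
CREATE TABLE currency_value (
    currency_name TEXT NOT NULL,
    value         TEXT NOT NULL,   -- text here; DECIMAL in MySQL
    recorded_at   TEXT NOT NULL    -- ISO-8601 text here; DATETIME in MySQL
);
-- Lets the database jump straight to the newest row of one currency.
CREATE INDEX idx_currency_time ON currency_value (currency_name, recorded_at);
INSERT INTO currency_value VALUES
    ('EUR', '1.10', '2024-01-02 10:00:00'),
    ('EUR', '1.12', '2024-01-02 10:02:00'),
    ('GBP', '1.27', '2024-01-02 10:00:00'),
    ('GBP', '1.26', '2024-01-02 10:02:00');
""")

def latest_value(conn, currency_name):
    """Latest stored value for one currency, or None if there is no row."""
    row = conn.execute(
        "SELECT value FROM currency_value WHERE currency_name = ? "
        "ORDER BY recorded_at DESC LIMIT 1",
        (currency_name,),
    ).fetchone()
    return row[0] if row else None

def latest_values(conn):
    """Latest value of every currency in a single query."""
    return conn.execute(
        """
        SELECT v.currency_name, v.value
        FROM currency_value AS v
        JOIN (
            SELECT currency_name, MAX(recorded_at) AS latest
            FROM currency_value
            GROUP BY currency_name
        ) AS m ON m.currency_name = v.currency_name AND m.latest = v.recorded_at
        ORDER BY v.currency_name
        """
    ).fetchall()

print(latest_value(conn, "EUR"))
print(latest_values(conn))
```

Note the WHERE currency_name = ? filter: without it, the query from the question returns the newest row of whichever currency was inserted last.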