mysql: how to split list field - mysql

I have a table which only contains id and a field whose data is a list of data. e.g.
--------------
| id | data |
| 1 | a,b,c,d|
| 2 | a,b,k,m|
---------------
I guess it's not good design to put a list of data in a single field, so I want to know how I can redesign it.

In my opinion, you need two tables (Master and Transaction) only when some details are the same for every record of an id and others vary. In your case, if nothing else is tied to the id field, you can carry on with one table with the following structure.
--------------
| id | data |
| 1 | a |
| 1 | b |
| 1 | c |
| 1 | d |
| 2 | a |
| 2 | b |
| 2 | k |
| 2 | m |
---------------
BUT if other attributes are tied to the id and stay the same across all records with that id, you will have to use two tables,
as in the following case. There are 3 fields: id, name and data.
Your current table looks something like this:
--------------------------
| id | name | data |
| 1 | testname | a,b,c,d|
| 2 | remy | a,b,k,m|
--------------------------
Your new table structure should look like this.
Table 1 Master
-----------------
| id | name |
| 1 | testname |
| 2 | remy |
-----------------
Table 2 Transaction
--------------
| id | data |
| 1 | a |
| 1 | b |
| 1 | c |
| 1 | d |
| 2 | a |
| 2 | b |
| 2 | k |
| 2 | m |
---------------
For better database management we might need to normalize the data.
Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency. Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships. You can find more at the links below:
3 Normal Forms Database Tutorial
Database normalization

If you have only those two fields in your table then you should have only 1 table as below
id | data
with composite primary key as PRIMARY KEY(id,data) so that there won't be any duplicate data for the respective ID.
The data would be like this
id | data
1 | a
1 | b
1 | c
1 | d
2 | a
2 | b
2 | k
2 | m
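As a rough illustration of moving from the list field to this one-row-per-value layout, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, and the table name item_data and the helper split_list_rows are made up for the example.

```python
import sqlite3


def split_list_rows(rows):
    """Turn (id, 'a,b,c') rows into one (id, value) pair per list item."""
    pairs = []
    for row_id, csv in rows:
        for item in csv.split(","):
            item = item.strip()
            if item:  # skip empty items such as those left by a trailing comma
                pairs.append((row_id, item))
    return pairs


conn = sqlite3.connect(":memory:")
# item_data is a hypothetical name; the composite key forbids duplicate (id, data) pairs.
conn.execute(
    "CREATE TABLE item_data ("
    " id INTEGER NOT NULL,"
    " data TEXT NOT NULL,"
    " PRIMARY KEY (id, data))"
)

old_rows = [(1, "a,b,c,d"), (2, "a,b,k,m")]  # the rows from the question
# INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
conn.executemany(
    "INSERT OR IGNORE INTO item_data (id, data) VALUES (?, ?)",
    split_list_rows(old_rows),
)

values_for_1 = [
    r[0] for r in conn.execute("SELECT data FROM item_data WHERE id = 1 ORDER BY data")
]
```

With the composite key in place, a plain INSERT of an already existing (id, data) pair is rejected by the database instead of silently creating a duplicate.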

You will need another table in a ONE to MANY relationship.
For example, you could have another table, datamapping, with a data column and an ID column, where the ID column is a FOREIGN KEY to the ID column of the data table.
So according to your example there would be 4 entries for ID = 1 in the datamapping table.

You will need two tables with a foreign key.
Table 1
id
Table 2
id
datavalue
So the data looks like:
Table 1:
id
1
2
3
Table 2:
id | datavalue
1 | a
1 | b
1 | c
1 | d
2 | a
2 | b
2 | k
2 | m
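A minimal sketch of this two-table layout, assuming an in-memory SQLite database (via Python's sqlite3) in place of MySQL. The names parent and child are placeholders for "Table 1" and "Table 2", and try_insert_orphan is a helper invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on (MySQL/InnoDB enforces them by default).
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY);
CREATE TABLE child (
    id        INTEGER NOT NULL REFERENCES parent(id),
    datavalue TEXT    NOT NULL,
    PRIMARY KEY (id, datavalue)
);
""")

conn.executemany("INSERT INTO parent (id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO child (id, datavalue) VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (1, "d"),
     (2, "a"), (2, "b"), (2, "k"), (2, "m")],
)


def try_insert_orphan(conn):
    """Return True if a child row with no matching parent is accepted."""
    try:
        conn.execute("INSERT INTO child (id, datavalue) VALUES (99, 'z')")
    except sqlite3.IntegrityError:
        return False
    return True


orphan_accepted = try_insert_orphan(conn)

# Parents that have no data rows at all (id 3 in the example data).
ids_without_data = [
    r[0]
    for r in conn.execute(
        "SELECT p.id FROM parent p LEFT JOIN child c ON c.id = p.id WHERE c.id IS NULL"
    )
]
```

The parent table is what lets an id such as 3 exist with no data rows, which the single-table design cannot express.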

You are correct, this is not a good database design. The data field violates the principle of atomicity and therefore 1NF, which can lead to problems in maintaining and querying the data.
To normalize your design, split the original table in two. There are 2 basic strategies for doing so: using a non-identifying relationship or using an identifying relationship.
NOTE: If you only have id in the parent table, and no other FKs on it, and parent cannot exist without at least one child (i.e. data could not have been empty in the original design), you can dispense with the parent table altogether.

Related

What Should be database structure to create excel sheet like view? mean should tables store in json format or create tables for cells,rows and columns

Hi everyone,
I want to create boards; each board will contain groups, and each group has tables (rows and columns). Should I save the tables (rows and columns) in JSON format, or create separate tables for rows, columns, cells, etc.?
I want to create a system like monday.com.
It depends on what kind of database you are using. If it's a document database, then JSON is the natural way. In a SQL database, the better way would be to have one table representing the rows and a separate table representing the column mapping; this gives you the flexibility to add columns at will.
For example:-
Row Table
| id | details |
|----|---------------|
| 1 | row_1_details |
| 2 | row_2_details |
Column Table
| id | column_name | column_value | row_id |
|----|-------------|--------------|--------|
| 1 | col_1 | skjdjks | 1 |
| 2 | col_2 | jslkds | 1 |
| 3 | col_1 | dhkshd | 2 |
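To show how such a row/column mapping can be read back as a grid, here is a small sketch using Python's sqlite3 with an in-memory database. The table names board_row and board_cell, the UNIQUE constraint, and the helper load_grid are assumptions added for the example; only the columns come from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE board_row (id INTEGER PRIMARY KEY, details TEXT);
CREATE TABLE board_cell (
    id           INTEGER PRIMARY KEY,
    column_name  TEXT    NOT NULL,
    column_value TEXT,
    row_id       INTEGER NOT NULL REFERENCES board_row(id),
    UNIQUE (row_id, column_name)  -- assumed: one value per row and column
);
INSERT INTO board_row VALUES (1, 'row_1_details'), (2, 'row_2_details');
INSERT INTO board_cell VALUES
    (1, 'col_1', 'skjdjks', 1),
    (2, 'col_2', 'jslkds',  1),
    (3, 'col_1', 'dhkshd',  2);
""")


def load_grid(conn):
    """Rebuild the sheet as {row_id: {column_name: value}}."""
    grid = {}
    query = (
        "SELECT row_id, column_name, column_value "
        "FROM board_cell ORDER BY row_id, column_name"
    )
    for row_id, name, value in conn.execute(query):
        grid.setdefault(row_id, {})[name] = value
    return grid


grid = load_grid(conn)

# Adding a new column needs no schema change, only a new cell row.
conn.execute("INSERT INTO board_cell VALUES (4, 'col_3', 'new', 2)")
grid_after = load_grid(conn)
```

Missing cells simply have no row, so the application decides how to display them (for instance as blanks).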

Good practice on saving properties in relational database

Let's assume I have two types of users in my system.
Those who can program and those who cannot.
I need to save both types of users in the same table.
The users who can program have lots of properties that differ from those who can't, defined in another table.
What are the advantages of each of the following solutions, and are there any better solutions?
Solution 1
One table containing a column with the corresponding property.
Table `users`:
----------------------------
| id | name | can_program |
----------------------------
| 1 | Karl | 1 |
| 2 | Ally | 0 |
| 3 | Blake | 1 |
----------------------------
Solution 2
Two tables related to each other via primary key and foreign key.
One of the tables containing the users and the other table only containing the id of those who can program.
Table users:
--------------
| id | name |
--------------
| 1 | Karl |
| 2 | Ally |
| 3 | Blake |
--------------
Table can_program:
---------------------
| id | can_program |
---------------------
| 1 | 1 |
| 3 | 1 |
---------------------
You have a 1-1 relationship between a user and the property that allows them to program. I would recommend storing this information as an additional column in the users table. Creating an additional table basically results in an extra storage structure with a 1-1 relationship to the original table.
Why not just have some kind of programmer_profiles table that the users table has a one-to-many relationship with?
If there's an associated record in programmer_profiles then they can program, otherwise it's presumed they can't.
This is more flexible since you can add in other x_profiles tables that provide different properties even if some of these have the same names.
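One way the profile-table idea could look, sketched with Python's sqlite3 and an in-memory database. The favorite_language column is invented purely to give the profile something to hold, and making user_id the primary key assumes at most one profile per user, which is only one reading of the suggestion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE programmer_profiles (
    user_id           INTEGER PRIMARY KEY REFERENCES users(id),
    favorite_language TEXT  -- hypothetical programmer-only property
);
INSERT INTO users VALUES (1, 'Karl'), (2, 'Ally'), (3, 'Blake');
INSERT INTO programmer_profiles VALUES (1, 'C'), (3, 'Python');
""")

# can_program is derived from the presence of a profile row, not stored.
rows = conn.execute("""
    SELECT u.name, p.user_id IS NOT NULL AS can_program
    FROM users u
    LEFT JOIN programmer_profiles p ON p.user_id = u.id
    ORDER BY u.id
""").fetchall()
```

Because the flag is computed from the join, it cannot drift out of sync with the profile data the way a separate can_program column could.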

MySql add relationships without creating dupes

I created a table (t_subject) like this
| id | description | enabled |
|----|-------------|---------|
| 1 | a | 1 |
| 2 | b | 1 |
| 3 | c | 1 |
And another table (t_place) like this
| id | description | enabled |
|----|-------------|---------|
| 1 | d | 1 |
| 2 | e | 1 |
| 3 | f | 1 |
Right now data from t_subject is used for each of t_place records, to show HTML dropdowns, with all the results from t_subject.
So I simply do
SELECT * FROM t_subject WHERE enabled = 1
Now just for one of t_place records, one record from t_subject should be hidden.
I don't want to simply delete it with javascript, since I want to be able to customize all of the dropdowns if anything changes.
So the first thing I thought was to add a place_id column to t_subject.
But this means I have to duplicate all of t_subject records, I would have 3 of each, except one that would have 2.
Is there any way to avoid this??
I thought adding an id_exclusion column to t_subject so I could duplicate records only whenever a record is excluded from another id from t_place.
How bad would that be?? This way I would have no duplicates, so far.
Hope all of this makes sense.
While you only need to exclude one course, I would still recommend setting up a full 'place-course' association. You essentially have a many-to-many relationship, despite not explicitly linking your tables.
I would recommend an additional 'bridging' or 'associative entity' table to represent which courses are offered at which places. This new table would have two columns - one foreign key for the ID of t_subject, and one for the ID of t_place.
For example (t_place_course):
| place_id | course_id |
|----------|-----------|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 1 |
| 3 | 3 |
As you can see in my example above, place 3 doesn't offer course 2.
From here, you can simply query all of the courses available for a place by querying the place_id:
SELECT * from t_place_course WHERE place_id = 3
The above will return both courses 1 and 3.
You can optionally use a JOIN to get the other information about the course or place, such as the description:
-- The courses are the rows of t_subject from the question.
SELECT `t_subject`.`description`
FROM `t_subject`
INNER JOIN `t_place_course`
ON `t_subject`.`id` = `t_place_course`.`course_id`
INNER JOIN `t_place`
ON `t_place`.`id` = `t_place_course`.`place_id`
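To avoid typing every pair by hand, the bridging table can be filled from the existing data and then trimmed. The sketch below is one possible way to do that, run against an in-memory SQLite database through Python's sqlite3 rather than MySQL; the helper dropdown_options is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_subject (id INTEGER PRIMARY KEY, description TEXT, enabled INTEGER);
CREATE TABLE t_place   (id INTEGER PRIMARY KEY, description TEXT, enabled INTEGER);
CREATE TABLE t_place_course (
    place_id  INTEGER NOT NULL REFERENCES t_place(id),
    course_id INTEGER NOT NULL REFERENCES t_subject(id),
    PRIMARY KEY (place_id, course_id)
);
INSERT INTO t_subject VALUES (1, 'a', 1), (2, 'b', 1), (3, 'c', 1);
INSERT INTO t_place   VALUES (1, 'd', 1), (2, 'e', 1), (3, 'f', 1);

-- Start from every place/subject pair, then remove the single exclusion.
INSERT INTO t_place_course (place_id, course_id)
    SELECT p.id, s.id FROM t_place p CROSS JOIN t_subject s;
DELETE FROM t_place_course WHERE place_id = 3 AND course_id = 2;
""")


def dropdown_options(conn, place_id):
    """Return (id, description) of the enabled subjects offered at one place."""
    sql = """
        SELECT s.id, s.description
        FROM t_subject s
        JOIN t_place_course pc ON pc.course_id = s.id
        WHERE pc.place_id = ? AND s.enabled = 1
        ORDER BY s.id
    """
    return conn.execute(sql, (place_id,)).fetchall()


options_place_3 = dropdown_options(conn, 3)
options_place_1 = dropdown_options(conn, 1)
```

No row of t_subject is duplicated; customizing a dropdown later means inserting or deleting one row in t_place_course.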

Is it possible to normalize the table so that it can contain one Value in one row?

I have a table with three columns, BusNo, BusRoute and BusStop, where the BusStop column contains multiple comma-separated values. I want to normalize it so that the table contains one stop per row. Example:
BusNo BusRoute BusStop
1 Rajendra Nagar to Noida Apsara,Shahadara,Shakarpur,Mother Dairy
I want to spread the stops over multiple rows. Would that be a good approach? I have more than 1000 BusNo values here.
My suggestion would be to have two new tables: BusStops and BusRouteBusStops.
BusStops will have one line for each bus stop, containing at least two columns: StopNumber and StopName.
BusRouteBusStops will be the table that links the BusRoute table with the BusStops table. Each line in this table will have a primary key from BusRoutes and from BusStops.
The idea is to keep the bus stops in a table, regardless of if and where they are used. That way you can use a single stop in however many routes you want. Also, if you decide to remove a stop from all the routes, it is still kept and is available for use for new routes.
If you want to represent the order of the bus stops in the route, it can be added as a column to the BusRouteBusStops table.
Tables example:
Table BusRoutes - primary-Key(BusNo)
===============
BusNo | BusRoute
1 | Rajendra Nagar to Noida
Table BusStops - primary-Key(StopNumber)
===============
StopNumber | StopName
1 | Apsara
2 | Shahadara
3 | Shakarpur
4 | Other Stop
5 | Mother Dairy
Table BusRouteBusStops - primary-Key(BusNo+StopNumber)
===============
BusNo | StopNumber | StopOrder
1 | 1 | 1
1 | 2 | 2
1 | 3 | 3
1 | 5 | 4
A query to get all the buses that go through a given stop (say, Apsara), using MySQL syntax, would be:
SELECT BR.*
FROM BusRoutes BR
JOIN BusRouteBusStops BRBS ON BR.BusNo = BRBS.BusNo
JOIN BusStops BS ON BS.StopNumber = BRBS.StopNumber
WHERE BS.StopName = 'Apsara'
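For the migration itself, a sketch along these lines could load one comma-separated route into the three tables while recording the stop order (the order column is spelled StopOrder here). It runs on an in-memory SQLite database through Python's sqlite3 instead of MySQL; the helpers add_route, stops_in_order and buses_through, the UNIQUE constraint on StopName, and the second route are all assumptions for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BusRoutes (BusNo INTEGER PRIMARY KEY, BusRoute TEXT);
CREATE TABLE BusStops  (StopNumber INTEGER PRIMARY KEY, StopName TEXT UNIQUE);
CREATE TABLE BusRouteBusStops (
    BusNo      INTEGER NOT NULL REFERENCES BusRoutes(BusNo),
    StopNumber INTEGER NOT NULL REFERENCES BusStops(StopNumber),
    StopOrder  INTEGER NOT NULL,
    PRIMARY KEY (BusNo, StopNumber)
);
""")


def add_route(conn, bus_no, route, stops_csv):
    """Split a comma-separated stop list into BusStops and BusRouteBusStops rows."""
    conn.execute("INSERT INTO BusRoutes VALUES (?, ?)", (bus_no, route))
    for order, name in enumerate(stops_csv.split(","), start=1):
        name = name.strip()
        # Reuse the stop if another route already created it.
        conn.execute("INSERT OR IGNORE INTO BusStops (StopName) VALUES (?)", (name,))
        stop_no = conn.execute(
            "SELECT StopNumber FROM BusStops WHERE StopName = ?", (name,)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO BusRouteBusStops VALUES (?, ?, ?)", (bus_no, stop_no, order)
        )


def stops_in_order(conn, bus_no):
    sql = """
        SELECT BS.StopName
        FROM BusRouteBusStops BRBS
        JOIN BusStops BS ON BS.StopNumber = BRBS.StopNumber
        WHERE BRBS.BusNo = ?
        ORDER BY BRBS.StopOrder
    """
    return [r[0] for r in conn.execute(sql, (bus_no,))]


def buses_through(conn, stop_name):
    sql = """
        SELECT BR.BusNo
        FROM BusRoutes BR
        JOIN BusRouteBusStops BRBS ON BR.BusNo = BRBS.BusNo
        JOIN BusStops BS ON BS.StopNumber = BRBS.StopNumber
        WHERE BS.StopName = ?
        ORDER BY BR.BusNo
    """
    return [r[0] for r in conn.execute(sql, (stop_name,))]


add_route(conn, 1, "Rajendra Nagar to Noida", "Apsara,Shahadara,Shakarpur,Mother Dairy")
add_route(conn, 2, "hypothetical second route", "Apsara,Other Stop")  # made-up data
```

Note that the stop numbers are assigned in insertion order here, so they will not necessarily match the numbers in the example tables.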
To resolve a m:n relation, you normally use an additional table. As you have everything in one table right now, that means two additional tables for you.
Table structure
bus_stop: id, name
bus_route: id, bus_no, description
stop_to_route_relation: bus_route, bus_stop
Example
bus_stop
--------------------
| id | name |
--------------------
| 1 | CityA |
--------------------
| 2 | CityB |
--------------------
| 3 | CityC |
--------------------
bus_route
-----------------------------
| id | bus_no | description |
-----------------------------
| 1 | 5 | CityA to B |
-----------------------------
| 2 | 5 | CityA to C |
-----------------------------
stop_to_route_relation
------------------------
| bus_route | bus_stop |
------------------------
| 1 | 1 |
------------------------
| 1 | 2 |
------------------------
| 2 | 1 |
------------------------
| 2 | 3 |
------------------------
Example query
select
br.bus_no,
bs.name
from
bus_route br
left join stop_to_route_relation str on (br.id = str.bus_route)
left join bus_stop bs on (str.bus_stop = bs.id);
If you want to normalize BusStop field then you need to make a new table for it. Like this:
Table: Bus
===================================
| BusNo | BusRoute
===================================
| 1 | Rajendra Nagar to Noida
===================================
Table: BusStop
--------------------------
| BusNo | BusStop
--------------------------
| 1 | Apsara
--------------------------
| 1 | Shahadara
--------------------------
| 1 | Shakarpur
--------------------------
| 1 | Mother Dairy
--------------------------
In the BusStop table the BusNo is the Foreign Key that links it to Bus table.
You mentioned that you have 1000 BusNo values, so I guess normalizing will take more storage, because each stop of each bus becomes its own row. For instance, if each BusNo has 5 stops, the new BusStop table will have approximately 1000 x 5 = 5000 rows. The advantage I see is that a normalized table supports more kinds of queries. Weigh the pros and cons when deciding. Good luck.

How to split CSVs from one column to rows in a new table in MSSQL 2008 R2

Imagine the following (very bad) table design in MSSQL2008R2:
Table "Posts":
| Id (PK, int) | DatasourceId (PK, int) | QuotedPostIds (nvarchar(255)) | [...]
| 1 | 1 | | [...]
| 2 | 1 | 1 | [...]
| 2 | 2 | 1 | [...]
[...]
| 102322 | 2 | 123;45345;4356;76757 | [...]
So, the column QuotedPostIds contains a semicolon-separated list of self-referencing PostIds (Kids, don't do that at home!). Since this design is ugly as hell, I'd like to extract the values from the QuotedPostIds column into a new n:m relationship table like this:
Desired new table "QuotedPosts":
| QuotingPostId (int) | QuotedPostId (int) | DatasourceId (int) |
| 2 | 1 | 1 |
| 2 | 1 | 2 |
[...]
| 102322 | 123 | 2 |
| 102322 | 45345 | 2 |
| 102322 | 4356 | 2 |
| 102322 | 76757 | 2 |
The primary key for this table could either be a combination of QuotingPostId, QuotedPostId and DatasourceID or an additional artificial key generated by the database.
It is worth noticing that the current Posts table contains about 6,300,000 rows but only about 285,000 of those have a value set in the QuotedPostIds column. Therefore, it might be a good idea to pre-filter those rows. In any case, I'd like to perform the normalization using internal MSSQL functionality only, if possible.
I have already read other posts on this topic, which mostly dealt with split functions, but I could find out neither how exactly to create the new table while also copying the appropriate value from the DatasourceId column, nor how to filter the rows to touch accordingly.
Thank you!
Edit: I thought it through and finally solved the problem using an external C# program instead of internal MSSQL functionality. Since it seems that it could have been done using Mikael Eriksson's suggestion, I will mark his post as an answer.
From the comments, you say you have a string split function that you don't know how to use with a table.
The answer is to use cross apply, something like this:
select P.Id,
S.Value
from Posts as P
cross apply dbo.Split(';', P.QuotedPostIds) as S
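For comparison, the external-program route mentioned in the question's edit can be sketched as follows. This is not T-SQL and not the asker's actual C# program; it is a minimal Python illustration on an in-memory SQLite database, with the helper normalize_quotes invented for the example. It pre-filters the rows that have a value, splits on semicolons, and carries DatasourceId along into the new table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Posts (
    Id            INTEGER NOT NULL,
    DatasourceId  INTEGER NOT NULL,
    QuotedPostIds TEXT,
    PRIMARY KEY (Id, DatasourceId)
);
CREATE TABLE QuotedPosts (
    QuotingPostId INTEGER NOT NULL,
    QuotedPostId  INTEGER NOT NULL,
    DatasourceId  INTEGER NOT NULL,
    PRIMARY KEY (QuotingPostId, QuotedPostId, DatasourceId)
);
INSERT INTO Posts VALUES
    (1, 1, ''),
    (2, 1, '1'),
    (2, 2, '1'),
    (102322, 2, '123;45345;4356;76757');
""")


def normalize_quotes(conn):
    """Copy every quoted id into QuotedPosts; return the number of rows written."""
    # Pre-filter: only touch rows that actually have a value set.
    source = conn.execute(
        "SELECT Id, DatasourceId, QuotedPostIds FROM Posts "
        "WHERE QuotedPostIds IS NOT NULL AND QuotedPostIds <> ''"
    ).fetchall()
    triples = [
        (post_id, int(quoted), datasource_id)
        for post_id, datasource_id, ids in source
        for quoted in ids.split(";")
        if quoted.strip()
    ]
    conn.executemany("INSERT INTO QuotedPosts VALUES (?, ?, ?)", triples)
    return len(triples)


inserted = normalize_quotes(conn)
quoted_by_102322 = [
    r[0]
    for r in conn.execute(
        "SELECT QuotedPostId FROM QuotedPosts "
        "WHERE QuotingPostId = 102322 ORDER BY QuotedPostId"
    )
]
```

The same filter (a WHERE clause on QuotedPostIds) and the extra DatasourceId column could presumably be added to the cross apply query above, but that depends on the split function's actual signature.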