How should I structure my database to store this kind of data? - mysql

I want to build a page like the one shown below, with all data retrieved from a database. The terms, subjects and sentences all come from the database, so there are three levels of data. Under each term (e.g. Spring 2017) I can pick and choose between all of these sentences.
Spring 2017
Subject1
Sentence 1
Sentence 2
Sentence 3
Subject2
Sentence 13
Sentence 12
Sentence 17
Subject3
Sentence 11
Sentence 14
Sentence 19
Autumn 2017
...
I want to present this kind of information from the database to the user and let the user choose between all these sentences. How should I structure my database to achieve this in the best and most efficient way?
One way is:
Table 'subject'
| id | subjects |
| 3 | Subject1 |
| 4 | Subject2 |
Table 'sentences'
| id | subjectid | name |
| 1 | 3 | Sentence 2 |
| 2 | 4 | Sentence 13 |
Table 'term'
| id | term | sentenceid |
| 1 | Spring 17 | 1,2,28 |
Another way is maybe using pivot-tables, something like this:
Table 'sentences'
| id | parentid | name |
| 1 | 0 | Subject2 |
| 2 | 3 | Sentence 2 |
| 3 | 0 | Subject1 |
| 4 | 1 | Sentence 13 |
Table 'term'
| id | term | sentenceid |
| 1 | Spring 17 | 2,4,28 |
Notice: Number of terms can be many more than just two in a year.
Do you recommend either of these structures, or is there another way I should build my database? Is one of them more efficient, less demanding, or easier to adjust?

You are doing relational analysis/design:
Find all substantives/nouns of your domain. These are candidates for tables.
Find any relationships/associations between those substantives. "Has", "consists of", "belongs to", "depends on" and so on. Divide them into 1:1, 1:n, n:m associations.
Look hard at the 1:1 ones and check if you can reduce two of your original tables into one.
The 1:n ones lead you to foreign keys in one of the tables.
The n:m ones give you additional association tables, possibly with their own attributes.
That's about it. I would strongly advise against optimizing for speed or space at this point. Any modern RDBMS will be totally indifferent to the number of rows you are likely to encounter in your example. All database-related software (ORMs etc.) expects such a clean model. Packing ids into comma-separated fields is an absolute no-no, as it defeats all the mechanisms your RDBMS has to deal with such data; it makes the application harder to program; it confuses GUIs and so on.
Making weird choices in your table setup so they deviate from a clean model of your domain is the #1 cause of trouble along the way. You can optimize for performance later, if and when you actually get into trouble. Except for extreme cases (huge data sets or throughput), such optimisation primarily takes place inside the RDBMS (indexes, storage parameters, buffer management etc.) or by optimizing your queries, not by changing the tables.
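As a rough sketch of where that analysis might lead for the question's data: terms, subjects and sentences become tables, a sentence belongs to one subject (1:n, so a foreign key), and a term offers many sentences while a sentence can be offered in many terms (n:m, so an association table). The table and column names below, in particular term_sentence, are assumptions made for illustration, and Python's built-in sqlite3 module stands in for MySQL only so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE subject (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- 1:n: each sentence belongs to exactly one subject
CREATE TABLE sentence (
    id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES subject(id),
    name TEXT NOT NULL
);
-- n:m: one row per (term, sentence) pair instead of a comma-separated list
CREATE TABLE term_sentence (
    term_id INTEGER NOT NULL REFERENCES term(id),
    sentence_id INTEGER NOT NULL REFERENCES sentence(id),
    PRIMARY KEY (term_id, sentence_id)
);
INSERT INTO term VALUES (1, 'Spring 2017'), (2, 'Autumn 2017');
INSERT INTO subject VALUES (3, 'Subject1'), (4, 'Subject2');
INSERT INTO sentence VALUES
    (1, 3, 'Sentence 2'), (2, 4, 'Sentence 13'), (3, 3, 'Sentence 1');
INSERT INTO term_sentence VALUES (1, 1), (1, 2), (2, 3);
""")


def sentences_for_term(term_name):
    """Return (subject, sentence) pairs offered in the given term."""
    return conn.execute(
        """
        SELECT su.name, se.name
        FROM term t
        JOIN term_sentence ts ON ts.term_id = t.id
        JOIN sentence se ON se.id = ts.sentence_id
        JOIN subject su ON su.id = se.subject_id
        WHERE t.name = ?
        ORDER BY su.name, se.name
        """,
        (term_name,),
    ).fetchall()


rows = sentences_for_term("Spring 2017")
print(rows)
```

Adding or removing a sentence for a term is then a single-row INSERT or DELETE on term_sentence, with no string parsing involved.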

If the data is hierarchical, consider representing it with a single table, with one column referencing a simple lookup for the "entry type".
Table AcademicEntry
================================
| ID | EntryTypeID | ParentAcademicEntryID | Description |
==========================================================
| 1 | 3 | 3 | Sentence 1 |
| 2 | 1 | <null> | Spring 2017 |
| 3 | 2 | 2 | Subject1 |
Table EntryType
================================
| ID | Description |
====================
| 1 | Semester |
| 2 | Subject |
| 3 | Sentence |
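One way to read such a self-referencing table back as a tree is a recursive common table expression. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the CTE name walk and the depth column are made up for the example, and on MySQL this form of query needs version 8.0 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EntryType (ID INTEGER PRIMARY KEY, Description TEXT NOT NULL);
CREATE TABLE AcademicEntry (
    ID INTEGER PRIMARY KEY,
    EntryTypeID INTEGER NOT NULL REFERENCES EntryType(ID),
    ParentAcademicEntryID INTEGER REFERENCES AcademicEntry(ID),
    Description TEXT NOT NULL
);
INSERT INTO EntryType VALUES (1, 'Semester'), (2, 'Subject'), (3, 'Sentence');
INSERT INTO AcademicEntry VALUES
    (2, 1, NULL, 'Spring 2017'),
    (3, 2, 2, 'Subject1'),
    (1, 3, 3, 'Sentence 1');
""")

# Start from the root rows (no parent) and follow parent links downwards,
# counting how deep each row sits in the hierarchy.
tree = conn.execute("""
WITH RECURSIVE walk (ID, Description, depth) AS (
    SELECT ID, Description, 0
    FROM AcademicEntry
    WHERE ParentAcademicEntryID IS NULL
    UNION ALL
    SELECT child.ID, child.Description, walk.depth + 1
    FROM AcademicEntry AS child
    JOIN walk ON child.ParentAcademicEntryID = walk.ID
)
SELECT Description, depth FROM walk ORDER BY depth
""").fetchall()

for description, depth in tree:
    print("  " * depth + description)
```

The trade-off of this layout is that nothing in the schema itself stops a sentence from being the parent of a semester; that rule would have to be enforced elsewhere.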

Start with the terms. Every term has subjects. Every subject has sentences. Then you may need the position of a subject within a term and probably the position of a sentence in a subject.
Table 'term'
id | term
---+------------
1 | Spring 2017
Table 'subject'
id | title | termid | pos
---+----------+--------+----
3 | Subject1 | 1 | 1
4 | Subject2 | 1 | 2
5 | Subject3 | 1 | 3
Table 'sentence'
id | name | subjectid | pos
---+-------------+-----------+-----
1 | Sentence 2 | 3 | 2
2 | Sentence 13 | 4 | 1
3 | Sentence 1 | 3 | 1
4 | Sentence 3 | 3 | 3
5 | Sentence 17 | 4 | 3
...
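With the pos columns in place, the whole page can be read with one ordered join and folded into a nested structure in application code. This is a minimal sketch using a subset of the sample rows above; Python's built-in sqlite3 module stands in for MySQL, and the page dictionary is just one possible shape for the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE term (id INTEGER PRIMARY KEY, term TEXT);
CREATE TABLE subject (id INTEGER PRIMARY KEY, title TEXT, termid INTEGER, pos INTEGER);
CREATE TABLE sentence (id INTEGER PRIMARY KEY, name TEXT, subjectid INTEGER, pos INTEGER);
INSERT INTO term VALUES (1, 'Spring 2017');
INSERT INTO subject VALUES (3, 'Subject1', 1, 1), (4, 'Subject2', 1, 2);
INSERT INTO sentence VALUES
    (1, 'Sentence 2', 3, 2), (2, 'Sentence 13', 4, 1),
    (3, 'Sentence 1', 3, 1), (4, 'Sentence 3', 3, 3);
""")

# One flat result set, already in display order.
rows = conn.execute("""
SELECT t.term, su.title, se.name
FROM term t
JOIN subject su ON su.termid = t.id
JOIN sentence se ON se.subjectid = su.id
ORDER BY t.id, su.pos, se.pos
""").fetchall()

# Fold the flat rows into {term: {subject: [sentences]}}.
page = {}
for term, title, name in rows:
    page.setdefault(term, {}).setdefault(title, []).append(name)

print(page)
```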

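This table design should meet your needs.
CREATE TABLE tblSeason
(
SeasonId INT PRIMARY KEY,
SeasonName VARCHAR(30)
);
CREATE TABLE tblSubject
(
SubjectId INT PRIMARY KEY,
SeasonId INT, -- FK to tblSeason
SubjectData TEXT, -- TEXT in MySQL (VARCHAR(MAX) is SQL Server syntax)
FOREIGN KEY (SeasonId) REFERENCES tblSeason (SeasonId)
);
CREATE TABLE tblSentences
(
SentencesId INT PRIMARY KEY,
SubjectId INT, -- FK to tblSubject
SentenceData TEXT,
FOREIGN KEY (SubjectId) REFERENCES tblSubject (SubjectId)
);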

Related

Is it possible to normalize the table so that it can contain one Value in one row?

I have a table containing three columns, BusNo, BusRoute and BusStop, where the BusStop column contains multiple comma-separated values. I want to normalize it so that the table contains one stop per row. For example:
BusNo | BusRoute | BusStop
1 | Rajendra Nagar to Noida | Apsara,Shahadara,Shakarpur,Mother Dairy
I want to put the stops in multiple rows. Would that be a good approach? I have more than 1000 BusNo values here.
My suggestion would be to have two new tables: BusStops and BusRouteBusStops.
BusStops will have one line for each bus stop, containing at least two columns: StopNumber and StopName.
BusRouteBusStops will be the table that links the BusRoutes table with the BusStops table. Each line in this table will have a primary key from BusRoutes and one from BusStops.
The idea is to keep the bus stops in a table, regardless of if and where they are used. That way you can use a single stop in however many routes you want. Also, if you decide to remove a stop from all the routes, it is still kept and is available for use for new routes.
If you want to represent the order of the bus stops in the route, it can be added as a column to the BusRouteBusStops table.
Tables example:
Table BusRoutes - primary-Key(BusNo)
===============
BusNo | BusRoute
1 | Rajendra Nagar to Noida
Table BusStops - primary-Key(StopNumber)
===============
StopNumber | StopName
1 | Apsara
2 | Shahadara
3 | Shakarpur
4 | Other Stop
5 | Mother Dairy
Table BusRouteBusStops - primary-Key(BusNo+StopNumber)
===============
BusNo | StopNumber | StopOrder
1 | 1 | 1
1 | 2 | 2
1 | 3 | 3
1 | 5 | 4
A query to get all the bus numbers that go through a given stop (say, Apsara), using MySQL syntax, would be:
SELECT BR.*
FROM BusRoutes BR
JOIN BusRouteBusStops BRBS ON BR.BusNo = BRBS.BusNo
JOIN BusStops BS ON BS.StopNumber = BRBS.StopNumber
WHERE BS.StopName = 'Apsara';
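Moving the existing comma-separated data into these three tables could be scripted roughly as follows. This is a sketch under assumptions: the old table is called OldBus here (a made-up name), Python's built-in sqlite3 module stands in for MySQL, and INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical name for the original, un-normalized table.
CREATE TABLE OldBus (BusNo INTEGER, BusRoute TEXT, BusStop TEXT);
INSERT INTO OldBus VALUES
    (1, 'Rajendra Nagar to Noida', 'Apsara,Shahadara,Shakarpur,Mother Dairy');

CREATE TABLE BusRoutes (BusNo INTEGER PRIMARY KEY, BusRoute TEXT);
CREATE TABLE BusStops (StopNumber INTEGER PRIMARY KEY, StopName TEXT UNIQUE);
CREATE TABLE BusRouteBusStops (
    BusNo INTEGER REFERENCES BusRoutes(BusNo),
    StopNumber INTEGER REFERENCES BusStops(StopNumber),
    StopOrder INTEGER,
    PRIMARY KEY (BusNo, StopNumber)
);
""")

for bus_no, route, stops in conn.execute("SELECT * FROM OldBus").fetchall():
    conn.execute("INSERT INTO BusRoutes VALUES (?, ?)", (bus_no, route))
    for order, name in enumerate(stops.split(","), start=1):
        name = name.strip()
        # Create the stop only if no earlier route has already created it.
        conn.execute("INSERT OR IGNORE INTO BusStops (StopName) VALUES (?)", (name,))
        (stop_no,) = conn.execute(
            "SELECT StopNumber FROM BusStops WHERE StopName = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT INTO BusRouteBusStops VALUES (?, ?, ?)", (bus_no, stop_no, order)
        )

# Read the stops of bus 1 back in route order.
stops_in_order = [
    name
    for (name,) in conn.execute("""
        SELECT BS.StopName
        FROM BusRouteBusStops BRBS
        JOIN BusStops BS ON BS.StopNumber = BRBS.StopNumber
        WHERE BRBS.BusNo = 1
        ORDER BY BRBS.StopOrder
    """)
]
print(stops_in_order)
```

Note that the (BusNo, StopNumber) primary key, taken from the table layout above, means a route cannot list the same stop twice; a circular route would need a key on (BusNo, StopOrder) instead.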
To resolve a m:n relation, you normally use an additional table. As you have everything in one table right now, that means two additional tables for you.
Table structure
bus_stop: id, name
bus_route: id, bus_no, description
stop_to_route_relation: bus_route, bus_stop
Example
bus_stop
--------------------
| id | name |
--------------------
| 1 | CityA |
--------------------
| 2 | CityB |
--------------------
| 3 | CityC |
--------------------
bus_route
-----------------------------
| id | bus_no | description |
-----------------------------
| 1 | 5 | CityA to B |
-----------------------------
| 2 | 5 | CityA to C |
-----------------------------
stop_to_route_relation
------------------------
| bus_route | bus_stop |
------------------------
| 1 | 1 |
------------------------
| 1 | 2 |
------------------------
| 2 | 1 |
------------------------
| 2 | 3 |
------------------------
Example query
select
br.bus_no,
bs.name
from
bus_route br
left join stop_to_route_relation str on (br.id = str.bus_route)
left join bus_stop bs on (str.bus_stop = bs.id);
If you want to normalize BusStop field then you need to make a new table for it. Like this:
Table: Bus
===================================
| BusNo | BusRoute
===================================
| 1 | Rajendra Nagar to Noida
===================================
Table: BusStop
--------------------------
| BusNo | BusStop
--------------------------
| 1 | Apsara
--------------------------
| 1 | Shahadara
--------------------------
| 1 | Shakarpur
--------------------------
| 1 | Mother Dairy
--------------------------
In the BusStop table the BusNo is the Foreign Key that links it to Bus table.
You mentioned that you have 1000 BusNo values, so I guess it will require more resources, since normalizing means more rows to store the BusStop values for each BusNo. For instance, if each BusNo has 5 BusStops, then your new BusStop table will have approximately 1000 x 5 rows (you're saving every BusStop of every Bus in the table). The advantage I see is that normalizing lets you run more kinds of queries. Weigh the pros and cons when deciding. Good luck.

Optimize SQL-Query that is using REGEXP in a JOIN

I have the following situation:
Table Words:
| ID | WORD |
|----|--------|
| 1 | us |
| 2 | to |
| 3 | belong |
| 4 | are |
| 5 | base |
| 6 | your |
| 7 | all |
| 8 | is |
| 9 | yours |
Table Sentence:
| ID | SENTENCE |
|----|-------------------------------------------|
| 1 | <<7>> <<6>> <<5>> <<4>> <<3>> <<2>> <<1>> |
| 2 | <<7>> <<8>> <<9>> |
I want to replace each <<(\d)>> placeholder with the corresponding word from the Words table.
So the result should be
| ID | SENTENCE |
|----|--------------------------------|
| 1 | all your base are belong to us |
| 2 | all is yours |
What I came up with is the following SQL code:
SELECT id, GROUP_CONCAT(word ORDER BY pos SEPARATOR ' ') AS sentence FROM (
SELECT sentence.id, words.word, LOCATE(words.id, sentence.sentence) AS pos
FROM sentence
LEFT JOIN words
ON (sentence.sentence REGEXP CONCAT('<<',words.id,'>>'))
) AS TEMP
GROUP BY id
I made a sqlfiddle for this:
http://sqlfiddle.com/#!2/634b8/4
The code basically works, but I'd like to ask you pros whether there is a way to do this without a derived table or without a filesort in the execution plan.
You should make a table with one entry per word, so your sentense (sic) can be made by joining on that table. It would look something like this
SentenceId, wordId, location
2, 7, 1
2, 8, 2
2, 9, 3
The way you have it set up, you are not taking advantage of your database; you are basically putting several points of data into one table field.
The location field (it is tempting to call it "order", but as this is an SQL keyword, don't do it, you'll hate yourself) can be used to 'sort' the sentence.
(and you might want to rename sentense to sentence?)
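A minimal sketch of that layout, filled with the question's data: the table name sentence_word is an assumption, and Python's built-in sqlite3 module stands in for MySQL. The words are joined in application code here; in MySQL the question's own GROUP_CONCAT(word ORDER BY ... SEPARATOR ' ') could do the same inside the query, ordered by location.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT);
-- One row per word occurrence; 'location' is the position in the sentence.
CREATE TABLE sentence_word (
    sentence_id INTEGER,
    word_id INTEGER REFERENCES words(id),
    location INTEGER,
    PRIMARY KEY (sentence_id, location)
);
INSERT INTO words VALUES
    (1,'us'),(2,'to'),(3,'belong'),(4,'are'),(5,'base'),
    (6,'your'),(7,'all'),(8,'is'),(9,'yours');
INSERT INTO sentence_word VALUES
    (1,7,1),(1,6,2),(1,5,3),(1,4,4),(1,3,5),(1,2,6),(1,1,7),
    (2,7,1),(2,8,2),(2,9,3);
""")

# A plain equality join: no REGEXP and no string searching needed.
rows = conn.execute("""
SELECT sw.sentence_id, w.word
FROM sentence_word sw
JOIN words w ON w.id = sw.word_id
ORDER BY sw.sentence_id, sw.location
""").fetchall()

# Rows arrive sorted by sentence, so groupby can assemble each sentence.
sentences = {
    sid: " ".join(word for _, word in group)
    for sid, group in groupby(rows, key=lambda r: r[0])
}
print(sentences)
```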

How to store multiple values in a single column using less memory?

I have a table of users where 1 column stores user's "roles".
We can assign multiple roles to particular user.
Then I want to store role IDs in the "roles" column.
But how can I store multiple values into a single column to save memory in a way that is easy to use? For example, storing using a comma-delimited field is not easy and uses memory.
Any ideas?
If a user can have multiple roles, it is probably better to have a user_role table that stores this information. It is normalised, and will be much easier to query.
A table like:
user_id | role
--------+-----------------
1 | Admin
2 | User
2 | Admin
3 | User
3 | Author
This will allow you to query for all users with a particular role, such as SELECT user_role.user_id, user.name FROM user_role JOIN user ON user.id = user_role.user_id WHERE role = 'Admin' (assuming the user table's key column is called id), rather than having to use string parsing to get details out of a column.
Amongst other things this will be faster, as you can index the columns properly, and it will take only marginally more space than any solution that puts multiple values into a single column, which is antithetical to what relational databases are designed for.
The reason multiple values shouldn't be stored in one column is that it is inefficient, for the reason DCoder states in the comment to this answer. To check if a user has a role, every row of the user table will need to be scanned, and then the "roles" column will have to be scanned using string matching. Regardless of how this action is exposed, the RDBMS will need to perform string operations to parse the content. These are very expensive operations, and not at all good database design.
If you need to have a single column, I would strongly suggest that you no longer have a technical problem, but a people management one. Adding additional tables to an existing database that is under development should not be difficult. If this isn't something you are authorised to do, explain why the extra table is needed to the right person, because munging multiple values into a single column is a bad, bad idea.
You can also use bitwise logic with MySQL. Each role_id must be a power of 2 (1, 2, 4, 8, 16, 32, ...) so that every role occupies its own bit.
role_id | label
--------+-----------------
1 | Admin
2 | User
4 | Author
user_id | name | role
--------+-----------------
1 | John | 1
2 | Steve | 3
3 | Jack | 6
Bitwise logic allows you to select all user roles
SELECT * FROM users WHERE role & 1
-- returns all Admin users
SELECT * FROM users WHERE role & 5
-- returns all users who are admin or Author because 5 = 1 + 4
SELECT * FROM users WHERE role & 6
-- returns all users who are User or Author because 6 = 2 + 4
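The queries above test for "any of" the roles in the mask. A variation not shown above is "all of", which compares the masked value with the mask itself. The sketch below runs both forms against the sample users; Python's built-in sqlite3 module stands in for MySQL (the & operator behaves the same for these values), and the helper names() is made up for the example.

```python
import sqlite3

ADMIN, USER, AUTHOR = 1, 2, 4  # one bit per role

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, role INTEGER);
INSERT INTO users VALUES (1, 'John', 1), (2, 'Steve', 3), (3, 'Jack', 6);
""")


def names(where, mask):
    """Run a fixed WHERE fragment with the given bitmask bound as :mask."""
    sql = f"SELECT name FROM users WHERE {where} ORDER BY user_id"
    return [n for (n,) in conn.execute(sql, {"mask": mask})]


# "Any of": at least one bit of the mask is set in role.
admin_or_author = names("role & :mask", ADMIN | AUTHOR)

# "All of": every bit of the mask is set in role.
user_and_author = names("(role & :mask) = :mask", USER | AUTHOR)

print(admin_or_author, user_and_author)
```

Keep in mind that a predicate like role & 5 generally cannot use an ordinary index on the role column, so on a large users table these queries tend to scan every row.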
From your question, here is what I understood.
Suppose you have two tables: one is a "meal" table and the other is a "combo_meal" table. I think you want to store multiple meal_id values for one combo_meal_id without separating them with commas [,], and you said that this would make your DB more standard.
If I have understood your question correctly, please read my suggestion below carefully. It may help you.
First of all, your concept is right. It will definitely give you a more standard DB.
For this you have to create one more table [example table: combo_meal_relation] that links the data of those two tables. A concrete example may make this clearer.
meal table
+------+--------+-----------+---------+
| id | name | serving | price |
+------+--------+-----------+---------+
| 1 | soup1 | 2 person | 12.50 |
+------+--------+-----------+---------+
| 2 | soup2 | 2 person | 15.50 |
+------+--------+-----------+---------+
| 3 | soup3 | 2 person | 23.00 |
+------+--------+-----------+---------+
| 4 | drink1 | 2 person | 4.50 |
+------+--------+-----------+---------+
| 5 | drink2 | 2 person | 3.50 |
+------+--------+-----------+---------+
| 6 | drink3 | 2 person | 5.50 |
+------+--------+-----------+---------+
| 7 | fruit1 | 2 person | 3.00 |
+------+--------+-----------+---------+
| 8 | fruit2 | 2 person | 3.50 |
+------+--------+-----------+---------+
| 9 | fruit3 | 2 person | 4.50 |
+------+--------+-----------+---------+
combo_meal table
+------+--------------+-----------+
| id | combo_name | serving |
+------+--------------+-----------+
| 1 | combo1 | 2 person |
+------+--------------+-----------+
| 2 | combo2 | 2 person |
+------+--------------+-----------+
| 4 | combo3 | 2 person |
+------+--------------+-----------+
combo_meal_relation
+------+--------------+-----------+
| id | combo_meal_id| meal_id |
+------+--------------+-----------+
| 1 | 1 | 1 |
+------+--------------+-----------+
| 2 | 1 | 2 |
+------+--------------+-----------+
| 3 | 1 | 3 |
+------+--------------+-----------+
| 4 | 2 | 4 |
+------+--------------+-----------+
| 5 | 2 | 2 |
+------+--------------+-----------+
| 6 | 2 | 7 |
+------+--------------+-----------+
Searching these tables will then give you results faster.
Search query:
SELECT m.*
FROM combo_meal_relation cmr
JOIN meal m
ON m.id = cmr.meal_id
WHERE cmr.combo_meal_id = 1
Hopefully you understand :)
You could do something like this
INSERT INTO table (id, roles) VALUES ('', '2,3,4');
Then to find it use FIND_IN_SET
As you might already know, storing multiple values in a cell goes against 1NF. If you're fine with that, using a JSON column type is a great way to do it, and it comes with good functions for querying the data properly.
SELECT * FROM table_name
WHERE JSON_CONTAINS(column_name, '"value 2"', '$')
Will return any entry with json data like
[
"value",
"value 2",
"value 3"
]
Remember, though, that with JSON your query performance will go down the drain.

mysql: how to split list field

I have a table which only contains id and a field whose data is a list of data. e.g.
--------------
| id | data |
| 1 | a,b,c,d|
| 2 | a,b,k,m|
---------------
I guess it's not a good design to put a list of data in one field, so I want to know how I can redesign it.
In my opinion, you need two tables (a Master and a Transaction table) only when some details are the same for every record with a given id and others change. In your case, if nothing else related to your id field stays the same, you can carry on with one table with the following structure.
--------------
| id | data |
| 1 | a |
| 1 | b |
| 1 | c |
| 1 | d |
| 2 | a |
| 2 | b |
| 2 | k |
| 2 | m |
---------------
BUT if there are other things related to the id field that are the same for all records with the same id, you will have to use two tables,
as in the following case. There are 3 fields: id, name and data,
and your current table looks something like
--------------------------
| id | name | data |
| 1 | testname | a,b,c,d|
| 2 | remy | a,b,k,m|
--------------------------
Your new table structure should look like this.
table 1 Master
-----------------
| id | name |
| 1 | testname |
| 2 | remy |
-----------------
Table 2 Transaction
--------------
| id | data |
| 1 | a |
| 1 | b |
| 1 | c |
| 1 | d |
| 2 | a |
| 2 | b |
| 2 | k |
| 2 | m |
---------------
For better database management we might need to normalize the data.
Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency. Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships. You can find more on below links
3 Normal Forms Database Tutorial
Database normalization
If you have only those two fields in your table then you should have only 1 table as below
id | data
with composite primary key as PRIMARY KEY(id,data) so that there won't be any duplicate data for the respective ID.
The data would be like this
id | data
1 | a
1 | b
1 | c
1 | d
2 | a
2 | b
2 | k
2 | m
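To illustrate what the composite primary key buys you, here is a small sketch. The table name item is made up, and Python's built-in sqlite3 module stands in for MySQL; MySQL likewise rejects a duplicate (id, data) pair, although the error it reports differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER, data TEXT, PRIMARY KEY (id, data))")
conn.executemany(
    "INSERT INTO item VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (1, "d"), (2, "a"), (2, "b"), (2, "k"), (2, "m")],
)

# Inserting the same (id, data) pair a second time violates the key.
try:
    conn.execute("INSERT INTO item VALUES (1, 'a')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Finding every id that has a given value is a plain equality test,
# with no string splitting.
ids_with_k = [i for (i,) in conn.execute("SELECT id FROM item WHERE data = 'k'")]

print(duplicate_rejected, ids_with_k)
```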
You will need another table which can be of the ONE to MANY type.
For e.g. you could have another table datamapping which would have data and ID column where the ID column is a FOREIGN KEY to the ID column of the data table.
So according to your example there would be 4 entries for ID = 1 in the datamapping table.
You will need two tables with a foreign key.
Table 1
id
Table 2
id
datavalue
So the data looks like:
Table 1:
id
1
2
3
Table 2:
id | data
1 | a
1 | b
1 | c
1 | d
2 | a
2 | b
2 | k
2 | m
You are correct, this is not a good database design. The data field violates the principle of atomicity and therefore the 1NF, which can lead to problems in maintaining and querying the data.
To normalize your design, split the original table in two. There are 2 basic strategies to do it: using non-identifying and using identifying relationship.
NOTE: If you only have id in the parent table, and no other FKs on it, and parent cannot exist without at least one child (i.e. data could not have been empty in the original design), you can dispense with the parent table altogether.

Data Entry Tracking (Database Design)

I have developed a website (PHP) that allows staff to add records to our system.
Staff will be adding thousands of records to our database.
I need a way to keep track of which records have been done and the process/status of each record.
Here are the teams I can think of:
Data Entry Team
Proof Reading Team
Admin Team
When a staff member (Data Entry Team) completes a record, he/she will click the Complete button. The record should then somehow be assigned to the 'Proof Reading Team' automatically.
A record needs to be checked twice by the Proof Reading Team. If StaffB finishes proof reading, then another member of the Proof Reading Team needs to check it again.
When proof reading is done, the Admin Team will then assign "Record Completed".
A few months later a record might need to be updated (spelling mistake, price change, etc.), and Admin might assign the record to the Data Entry Team again.
Is this a good data entry management solution? How do I put this into a database design perspective?
Here is what I tried:
mysql> select * from records;
+----+------------+----------------------+
| id | name | address |
+----+------------+----------------------+
| 1 | Bill Gates | Text 1 Text Text 1 |
| 2 | Jobs Steve | Text 2 Text 2 Text 2 |
+----+------------+----------------------+
mysql> select * from staffs;
+----+-----------+-----------+---------------+
| id | username | password | group |
+----+-----------+-----------+---------------+
| 1 | admin1 | admin1 | admin |
| 2 | DEntryA | DEntryA | data_entry |
| 3 | DEntryB | DEntryB | data_entry |
| 4 | PReadingA | PReadingA | proof_reading |
| 5 | PReadingB | PReadingB | proof_reading |
+----+-----------+-----------+---------------+
mysql> select * from data_entry;
+----+------------+-----------+------------------------+
| id | records_id | staffs_id | record_status |
+----+------------+-----------+------------------------+
| 1 | 2 | 3 | data_entry_processing |
| 2 | 2 | 3 | data_entry_completed |
| 3 | 2 | 4 | proof_read_processing |
| 4 | 2 | 4 | proof_read_completed |
| 5 | 2 | 5 | proof_read_processing |
| 6 | 2 | 5 | proof_read_completed |
+----+------------+-----------+------------------------+
Is there a better alternative database design?
I think the design is well done, but you may want to separate group into a groups table and record_status into a status table. If you're storing a lot of records, you would otherwise store a lot of redundant information; at the very least, use an enum type for the record_status and group fields.
table: groups
id - name
1 - admin
2 - data_entry
3 - proof_reading
...
table: status
id - name
1 - data_entry_processing
...
And if you want users to be in several groups at the same time, you could create a user_groups table:
table: user_groups
group_id - user_id
1 - 1
2 - 1
1 - 4
3 - 4
4 - 4
....
Hope this helps
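As a sketch of how the lookup-table variant could be queried, the example below keeps data_entry as an append-only log and reads each record's current status from its most recent row. The status_id column and the sample rows are assumptions for illustration, and Python's built-in sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE status (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE data_entry (
    id INTEGER PRIMARY KEY,   -- grows with every logged step
    records_id INTEGER NOT NULL,
    staffs_id INTEGER NOT NULL,
    status_id INTEGER NOT NULL REFERENCES status(id)
);
INSERT INTO status VALUES
    (1, 'data_entry_processing'), (2, 'data_entry_completed'),
    (3, 'proof_read_processing'), (4, 'proof_read_completed');
-- Hypothetical history: record 2 is in its second proof read,
-- record 1 has only just been started.
INSERT INTO data_entry (records_id, staffs_id, status_id) VALUES
    (2, 3, 1), (2, 3, 2), (2, 4, 3), (2, 4, 4), (2, 5, 3), (1, 2, 1);
""")

# The current status of a record is the row with the highest id for it.
current = conn.execute("""
SELECT de.records_id, s.name, de.staffs_id
FROM data_entry de
JOIN status s ON s.id = de.status_id
WHERE de.id = (SELECT MAX(id) FROM data_entry WHERE records_id = de.records_id)
ORDER BY de.records_id
""").fetchall()

print(current)
```

Because earlier rows are never overwritten, the same table also answers who handled a record at each step, which is useful when a record is sent back to data entry months later.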