What's the best way to normalise this database? - mysql

First of all thank you for reading my problem and I hope that you can help me.
So I'm creating an API to use later on in my Android App, but I'm having doubts about the correct way that I need to create the database that I'm working with. The way I've got the database setup now is like this:
But this does not look good and doesn't help with what I want to do in my API. I want my users to be able to input symptoms and then have my API output illnesses based on that input.
So, I think I need to change my database design but what should it look like? I've been struggling over this for the past day and I can't seem to find the correct answer.
Again, thanks for reading!

There are two ways you can improve the structure of your database. The first one is simpler but the second one is more strict and completely normalized:
Way 1
Create an illness table:
CREATE TABLE illness(
id INTEGER NOT NULL AUTO_INCREMENT,
illnessName VARCHAR(255) UNIQUE NOT NULL,
PRIMARY KEY(id)
);
Then create a table that uses each illness's unique id to match it with its symptoms in a 1:n relationship.
CREATE TABLE illness_symptom(
illnessId INTEGER NOT NULL,
symptom VARCHAR(255),
FOREIGN KEY (illnessId) REFERENCES illness(id) ON UPDATE CASCADE ON DELETE CASCADE,
PRIMARY KEY(illnessId, symptom)
);
The composite primary key ensures that no symptom is included twice for the same illness.
Because the symptom is stored as a free-form string, this design is less strict than the following method, which is the best:
Way 2
The illness table remains the same as in way 1:
CREATE TABLE illness(
id INTEGER NOT NULL AUTO_INCREMENT,
illnessName VARCHAR(255) UNIQUE NOT NULL,
PRIMARY KEY(id)
);
Create a whole separate table for storing every possible symptom:
CREATE TABLE symptom(
id INTEGER NOT NULL AUTO_INCREMENT,
symptomName VARCHAR(255) UNIQUE NOT NULL,
PRIMARY KEY(id)
);
Then create a third table that matches the id of the illness with the id of the symptom:
CREATE TABLE illness_symptom(
illnessId INTEGER NOT NULL,
symptomId INTEGER NOT NULL,
PRIMARY KEY(illnessId, symptomId),
FOREIGN KEY(illnessId) REFERENCES illness(id),
FOREIGN KEY(symptomId) REFERENCES symptom(id)
);
Again, the composite primary key ensures that an illness does not include the same symptom more than once.
EDIT
After creating the tables you can join them to match each illness with its symptoms like this:
SELECT i.id, i.illnessName AS illnessName, s.symptomName AS symptomName
FROM (illness AS i JOIN illness_symptom AS isy ON i.id=isy.illnessId) JOIN symptom AS s ON isy.symptomId=s.id
ORDER BY i.id;
An example output would look something like this:
1 | Bronchitis | stuffiness
1 | Bronchitis | fatigue
1 | Bronchitis | tightness in the chest
2 | Whiplash | headache
2 | Whiplash | dizziness
2 | Whiplash | concentration problems
You can read more about inner join here
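To connect this schema back to the goal in the question (the user enters symptoms, the API returns illnesses), here is a minimal sketch of a matching query. It is an illustration under assumptions, not part of the original answer: it runs against an in-memory SQLite database through Python's standard `sqlite3` module instead of MySQL, the helper name `find_illnesses` is hypothetical, and ranking by the number of matched symptoms is only one possible choice.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption); tables follow Way 2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE illness (id INTEGER PRIMARY KEY, illnessName TEXT UNIQUE NOT NULL);
CREATE TABLE symptom (id INTEGER PRIMARY KEY, symptomName TEXT UNIQUE NOT NULL);
CREATE TABLE illness_symptom (
    illnessId INTEGER NOT NULL REFERENCES illness(id),
    symptomId INTEGER NOT NULL REFERENCES symptom(id),
    PRIMARY KEY (illnessId, symptomId)
);
INSERT INTO illness VALUES (1, 'Bronchitis'), (2, 'Whiplash');
INSERT INTO symptom VALUES
    (1, 'stuffiness'), (2, 'fatigue'), (3, 'tightness in the chest'),
    (4, 'headache'), (5, 'dizziness'), (6, 'concentration problems');
INSERT INTO illness_symptom VALUES (1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (2, 6);
""")


def find_illnesses(conn, symptoms):
    """Return (illnessName, matched_count) rows, best match first."""
    if not symptoms:
        return []
    # One bound parameter per symptom keeps user input out of the SQL text.
    placeholders = ", ".join("?" for _ in symptoms)
    sql = f"""
        SELECT i.illnessName, COUNT(*) AS matched
        FROM illness AS i
        JOIN illness_symptom AS isy ON isy.illnessId = i.id
        JOIN symptom AS s ON s.id = isy.symptomId
        WHERE s.symptomName IN ({placeholders})
        GROUP BY i.id, i.illnessName
        ORDER BY matched DESC, i.illnessName
    """
    return conn.execute(sql, list(symptoms)).fetchall()


matches = find_illnesses(conn, ["fatigue", "stuffiness", "headache"])
print(matches)  # [('Bronchitis', 2), ('Whiplash', 1)]
```

The same SELECT should carry over to MySQL with `%s` placeholders, depending on the driver used.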

Actually you can have three tables:
1. Illness Table
2. Symptom Table
3. IllnessSymptom Table
1. Illness Table will have IllnessID, IllnessName
2. Symptom Table will have SymptomID, SymptomName
3. IllnessSymptom Table will have IllnessSymptomID, IllnessID, SymptomID, which relates Illness and Symptom
You can make your API fetch data by joining these tables.
So the query would be like
SELECT I.IllnessName
FROM IllnessSymptom ISY
INNER JOIN Illness I ON ISY.IllnessID=I.IllnessID
INNER JOIN Symptom S ON ISY.SymptomID=S.SymptomID
WHERE S.SymptomName=? -- bind the symptom entered by the user
Hope this answers your query! :)

Related

In MySQL, how do I join two fields in one table to the same table's primary key?

I am working with a MySQL backend (version 5.7.19), and a LibreOffice Base frontend(version 7.0.6.2 x64) on 64-bit Windows. I have a table that lists personnel with a primary key id. I also have a workorders table that has an "entered by" field and a "reviewed by" field, both of which need to store the id of the personnel who complete those tasks. If I wanted to have two foreign keys in one table pointing to the same table's primary key, what would my SELECT statement need to look like?
In my case, I have a table 'personnel' with two fields, with ID as the primary key, thus:
ID | Name
1 | John Smith
2 | John Adams
3 | Samuel Adams
which can be created and populated thus:
CREATE TABLE personnel(
id int(10) unsigned NOT NULL AUTO_INCREMENT,
name varchar(255) NOT NULL,
PRIMARY KEY (id)
);
ALTER TABLE personnel AUTO_INCREMENT = 1;
INSERT INTO personnel(name) VALUES('John Smith');
INSERT INTO personnel(name) VALUES('John Adams');
INSERT INTO personnel(name) VALUES('Samuel Adams');
Also, a table 'orders' with three fields, with entered_by and reviewed_by as foreign keys to personnel.id:
workorder | entered_by | reviewed_by
1 | 2 | 3
2 | 3 | 1
which can be created and populated thus:
CREATE TABLE orders(
workorder int(10) unsigned NOT NULL AUTO_INCREMENT,
entered_by int(10) unsigned NOT NULL,
reviewed_by int(10) unsigned NOT NULL,
PRIMARY KEY (workorder),
FOREIGN KEY (entered_by) REFERENCES personnel(id),
FOREIGN KEY (reviewed_by) REFERENCES personnel(id)
);
INSERT INTO orders(entered_by, reviewed_by) VALUES (2,3);
INSERT INTO orders(entered_by, reviewed_by) VALUES (3,1);
I know how to
SELECT workorder, personnel.name AS entered
FROM orders JOIN personnel
ON personnel.id = orders.entered_by
ORDER BY orders.workorder;
which results in
workorder | entered
1 | John Adams
2 | Samuel Adams
and how to
SELECT workorder, personnel.name AS reviewed
FROM orders JOIN personnel
ON personnel.id = orders.reviewed_by
ORDER BY orders.workorder;
which yields:
workorder | reviewed
1 | Samuel Adams
2 | John Smith
but I'm not sure how to put them into a single query (that I can use in a query form in Base), so that it will display:
workorder | entered | reviewed
1 | John Adams | Samuel Adams
2 | Samuel Adams | John Smith
Yes, according to relational algebra every pair of tables can have multiple relationships between them.
The typical illustration of this case is a money_transfer table that records money flowing from one account to another. This table will have two foreign keys against the account table: one to indicate where the money is coming from, and the other to indicate where the money is going to.
Other pairs of tables can have many more relationships between them. I've seen cases for authorization purposes and auditing, that have many FKs.
For example, the requirements stated that the app needed to record who entered the data, who verified it, who accepted it, and who executed the transaction; sometimes it even has "first-level of approval" (for amounts above US$10K) and "second-level of approval" (for amounts above $100K).
EDIT - Joining the Same Table Multiple Times
As requested, when joining the same table multiple times you need to assign different names to each "instance" of the table. Typically this is done by adding an alias to each table instance according to its role.
In this case the roles are "entered by" and "reviewed by", so the query can use the aliases e and r respectively. The query could take the form:
select o.workorder, e.name as entered, r.name as reviewed
from orders o
join personnel e on e.id = o.entered_by
join personnel r on r.id = o.reviewed_by
order by o.workorder
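As a quick check of the aliasing technique, the sketch below rebuilds the question's sample data and joins `personnel` twice, once per role. It is only an illustration under assumptions: it uses an in-memory SQLite database via Python's standard `sqlite3` module rather than MySQL 5.7, and the column types are simplified.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL backend (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE personnel (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    workorder INTEGER PRIMARY KEY,
    entered_by INTEGER NOT NULL REFERENCES personnel(id),
    reviewed_by INTEGER NOT NULL REFERENCES personnel(id)
);
INSERT INTO personnel (name) VALUES ('John Smith'), ('John Adams'), ('Samuel Adams');
INSERT INTO orders (entered_by, reviewed_by) VALUES (2, 3), (3, 1);
""")

# Each alias (e, r) is a separate "instance" of personnel, one per foreign key.
rows = conn.execute("""
    SELECT o.workorder, e.name AS entered, r.name AS reviewed
    FROM orders AS o
    JOIN personnel AS e ON e.id = o.entered_by
    JOIN personnel AS r ON r.id = o.reviewed_by
    ORDER BY o.workorder
""").fetchall()

for row in rows:
    print(row)
# (1, 'John Adams', 'Samuel Adams')
# (2, 'Samuel Adams', 'John Smith')
```

If either foreign key column were nullable, a `LEFT JOIN` on that alias would keep work orders that have not been reviewed yet.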

MySQL statement ON DUPLICATE KEY for not keys

I'm working on a game that requires the user (primarily kids) to combine a prefix and a suffix into a unique username, say, BlueBaron. Now there's only so many prefixes and suffixes, so if a user generates an existing one, a number is appended to it, say, BlueBaron2.
I have a table as follows:
| id | prefix_id | suffix_id | identifier_index | username | hashbrown | salt | coins | ranking | date_created | date_updated
The id is an auto-increment, unique, not-null primary key - I assume for this particular instance, I won't actually need to worry about the id. The prefix_id and suffix_id are not-null, but because they refer to common prefixes and suffixes, they are not unique. The rest of the columns are just not-nulls.
Ideally, I would like to check if a new user has the exact same prefix_id and suffix_id as another user, and increment the identifier_index.
I tried this with multiple (SELECT then INSERT) statements, but I fear the data might not be updated / unique (another user might have inserted between the time it took for you to insert, etc.).
Is this possible within a single insert statement? I've read of ON DUPLICATE KEY but I'm not sure that's applicable here.
UPDATE:
Per the comments and answers below, I've created a unique index for the three columns in question:
However, the identifier_index increments even when the prefix_id and suffix_id are different. And in the case of the last entry, wouldn't increment at all resulting in a duplicate entry error:
That's a good question. I'm no developer, but from a database admin's view, I'd say that you need to do it like this.
You definitely need a unique index spanning over the 3 columns.
CREATE TABLE `a` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`prefix_id` varchar(10) NOT NULL,
`suffix_id` varchar(10) NOT NULL,
`identifier_index` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `uidx_psi` (`prefix_id`,`suffix_id`,`identifier_index`)
) ENGINE=InnoDB;
This is a must, you want to guarantee data integrity!
Your insert statement would look like this:
insert into a (prefix_id, suffix_id, identifier_index)
select 'asdf', 'qwer', coalesce(max(identifier_index) + 1, 1)
from a
where prefix_id = 'asdf' and suffix_id = 'qwer';
Be aware, though, that you can run into deadlock issues. This happens when another transaction is trying to insert while this query is still running. Deadlocks are not a serious issue, though. Typically an application is built so that it simply tries again until the insertion is successful.
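The "try again until it succeeds" idea can be sketched as a small retry loop around the INSERT ... SELECT statement. This is a hedged illustration, not the answerer's code: it uses an in-memory SQLite database through Python's standard `sqlite3` module, where a collision on the unique index surfaces as `sqlite3.IntegrityError`. With MySQL the application would instead catch the driver's duplicate-key and deadlock errors. The function name `insert_username` and the `max_attempts` parameter are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL/InnoDB (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE a (
    id INTEGER PRIMARY KEY,
    prefix_id TEXT NOT NULL,
    suffix_id TEXT NOT NULL,
    identifier_index INTEGER NOT NULL,
    UNIQUE (prefix_id, suffix_id, identifier_index)
)""")


def insert_username(conn, prefix_id, suffix_id, max_attempts=5):
    """Insert the next free identifier_index for a prefix/suffix pair and return it."""
    for _ in range(max_attempts):
        try:
            with conn:  # commits on success, rolls back if an exception is raised
                cur = conn.execute(
                    """
                    INSERT INTO a (prefix_id, suffix_id, identifier_index)
                    SELECT ?, ?, COALESCE(MAX(identifier_index) + 1, 1)
                    FROM a
                    WHERE prefix_id = ? AND suffix_id = ?
                    """,
                    (prefix_id, suffix_id, prefix_id, suffix_id),
                )
                row = conn.execute(
                    "SELECT identifier_index FROM a WHERE id = ?", (cur.lastrowid,)
                ).fetchone()
                return row[0]
        except sqlite3.IntegrityError:
            # Another writer took the same index first; compute it again.
            continue
    raise RuntimeError("could not allocate a unique identifier_index")


first = insert_username(conn, "Blue", "Baron")
second = insert_username(conn, "Blue", "Baron")
other = insert_username(conn, "Red", "Baron")
print(first, second, other)  # 1 2 1
```

The unique index is what makes the retry safe: a lost race fails loudly instead of producing two identical usernames.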

Best way with relation tables

I have a question about tables and relations tables ...
Actually, I have these 3 tables
CREATE TABLE USER (
ID int(11) NOT NULL AUTO_INCREMENT,
NAME varchar(14) DEFAULT NULL,
PRIMARY KEY (ID)
);
CREATE TABLE COUNTRY (
ID int(11) NOT NULL AUTO_INCREMENT,
COUNTRY_NAME varchar(14) DEFAULT NULL,
PRIMARY KEY (ID)
);
CREATE TABLE USER_COUNTRY_REL (
ID int(11) NOT NULL AUTO_INCREMENT,
ID_USER int(11) NOT NULL,
ID_COUNTRY int(11) NOT NULL,
PRIMARY KEY (ID)
);
OK, so now one user can have one or more countries, so there are several entries in the table USER_COUNTRY_REL for ONE user.
But my table USER contains almost 130,000 entries ...
Even with one country per user, that is almost 10 MB for the USER_COUNTRY_REL table.
And I have several related tables in this style ...
My question is: is this the fastest, best way to do it?
Would it not be better to put a COUNTRY field directly in the USER table, containing the different IDs (like this: "2, 6, ...")?
Thanks guys ;)
The way you have it is the most optimal as far as time constraints go. Sure, it takes up more space, but that's part of space-time tradeoff - If you want to be faster, you use more space; if you want to use less space, it will run slower (on average).
Also, think of the future. Right now, you're probably selecting the countries for each user, but just wait. Thanks to the magic of scope creep, your application will one day need to select all the users in a given country, at which point scanning each user's "COUNTRY" field to find matches will be incredibly slow, as opposed to just going backwards through the USER_COUNTRY_REL table like you could do now.
In general, for a 1-to-1 or 1-to-many correlation, you can link by foreign key. For a many-to-many correlation, you want to have a relation table in between the two. This scenario is a many-to-many relationship, as each user has multiple countries, and each country has multiple users.
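To make the "both directions" argument concrete, here is a small sketch of the two lookups against a relation table. It rests on assumptions that are not in the original posts: an in-memory SQLite database via Python's standard `sqlite3` module instead of MySQL, a composite primary key on the relation table instead of the surrogate ID column, an extra index for the country-to-users direction, and made-up sample rows. The helper names `countries_of` and `users_in` are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE USER (ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE COUNTRY (ID INTEGER PRIMARY KEY, COUNTRY_NAME TEXT);
CREATE TABLE USER_COUNTRY_REL (
    ID_USER INTEGER NOT NULL REFERENCES USER(ID),
    ID_COUNTRY INTEGER NOT NULL REFERENCES COUNTRY(ID),
    PRIMARY KEY (ID_USER, ID_COUNTRY)
);
-- Second index so that lookups by country do not scan the whole table.
CREATE INDEX idx_rel_country ON USER_COUNTRY_REL (ID_COUNTRY, ID_USER);
INSERT INTO USER VALUES (1, 'alice'), (2, 'bob');
INSERT INTO COUNTRY VALUES (2, 'France'), (6, 'Japan');
INSERT INTO USER_COUNTRY_REL VALUES (1, 2), (1, 6), (2, 6);
""")


def countries_of(conn, user_id):
    """Names of all countries linked to one user."""
    rows = conn.execute("""
        SELECT c.COUNTRY_NAME
        FROM USER_COUNTRY_REL AS r
        JOIN COUNTRY AS c ON c.ID = r.ID_COUNTRY
        WHERE r.ID_USER = ?
        ORDER BY c.COUNTRY_NAME
    """, (user_id,)).fetchall()
    return [name for (name,) in rows]


def users_in(conn, country_id):
    """Names of all users linked to one country (the reverse direction)."""
    rows = conn.execute("""
        SELECT u.NAME
        FROM USER_COUNTRY_REL AS r
        JOIN USER AS u ON u.ID = r.ID_USER
        WHERE r.ID_COUNTRY = ?
        ORDER BY u.NAME
    """, (country_id,)).fetchall()
    return [name for (name,) in rows]


print(countries_of(conn, 1))  # ['France', 'Japan']
print(users_in(conn, 6))      # ['alice', 'bob']
```

With a comma-separated COUNTRY field in USER, the second lookup would need a string match over every user row, which no ordinary index can serve.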
Why not try like this: Create table country first
CREATE TABLE COUNTRY (
CID int(11) NOT NULL AUTO_INCREMENT,
COUNTRY_NAME varchar(14) DEFAULT NULL,
PRIMARY KEY (CID)
);
Then the table user:
CREATE TABLE USER (
ID int(11) NOT NULL AUTO_INCREMENT,
NAME varchar(14) DEFAULT NULL,
CID int(11) DEFAULT NULL,
PRIMARY KEY (ID),
FOREIGN KEY (CID) REFERENCES COUNTRY(CID)
);
Just create a foreign key relation between them.
If you put this in as an explicit relation table, there will be a lot of redundant data.
This is the better approach. You can also index that foreign key, so that database retrieval becomes fast during search operations.
hope this helps..
Note : Not sure about the exact syntax of the foreign key

Database schema for storing ints

I'm really new to databases so please bear with me.
I have a website where people can go to request tickets to an upcoming concert. Users can request tickets for either New York or Dallas. Similarly, for each of those locales, they can request either a VIP ticket or a regular ticket.
I need a database to keep track of how many people have requested each type of ticket (VIP and NY or VIP and Dallas or Regular and NY or Regular and Dallas). This way, I won't run out of tickets.
What schema should I use for this database? Should I have one row and then 4 columns (VIP&NY, VIP&Dallas, Regular&NY and Regular&Dallas)? The problem with this is it doesn't seem very flexible, thus I'm not sure if it's good design.
You should have one column containing a quantity, a column that specifies the type (VIP), and another that specifies the city.
To make it flexible you would do:
Table: location
Columns:
location_id integer
description varchar
Table: type
Columns:
type_id integer
description varchar
Table: purchases
Columns:
purchase_id integer
type_id integer
location_id integer
This way you can add more cities and more types, and you always insert the sales into purchases.
When you want to know how many you have sold, you count them.
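The counting step can be sketched as a grouped query over the three tables described above. This is an illustration under assumptions: it runs on an in-memory SQLite database through Python's standard `sqlite3` module rather than MySQL, and the sample purchases are invented. The `CROSS JOIN` plus `LEFT JOIN` is one way to also list combinations with zero purchases.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); purchase rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (location_id INTEGER PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE type (type_id INTEGER PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE purchases (
    purchase_id INTEGER PRIMARY KEY,
    type_id INTEGER NOT NULL REFERENCES type(type_id),
    location_id INTEGER NOT NULL REFERENCES location(location_id)
);
INSERT INTO location VALUES (1, 'New York'), (2, 'Dallas');
INSERT INTO type VALUES (1, 'VIP'), (2, 'Regular');
INSERT INTO purchases (type_id, location_id) VALUES (1, 1), (1, 1), (2, 1), (2, 2);
""")

# Every location/type pair appears once; COUNT skips the NULLs from the LEFT JOIN,
# so pairs without purchases are reported as 0.
counts = conn.execute("""
    SELECT l.description, t.description, COUNT(p.purchase_id)
    FROM location AS l
    CROSS JOIN type AS t
    LEFT JOIN purchases AS p
           ON p.location_id = l.location_id AND p.type_id = t.type_id
    GROUP BY l.location_id, t.type_id
    ORDER BY l.location_id, t.type_id
""").fetchall()

for row in counts:
    print(row)
# ('New York', 'VIP', 2)
# ('New York', 'Regular', 1)
# ('Dallas', 'VIP', 0)
# ('Dallas', 'Regular', 1)
```

Adding a third city is then a single INSERT into `location`, with no schema change.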
What you want to do is have one table with cities and one table with ticket types.
Then you create a weak association table with [city, ticket type, number of tickets].
That table will have 2 foreign keys, therefore "weak".
But this enables you to add or remove cities etc. You can add a table for concerts as well, in which case the weak table gets another foreign key, "concert".
I think this is the most correct way to do it.
CREATE TABLE `tickets` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`locale` varchar(45) NOT NULL,
`ticket_type` varchar(45) NOT NULL,
PRIMARY KEY (`id`)
);
This is a simple representation of your table. Ideally you would have separate tables for locale and type. And your table would look like this:
CREATE TABLE `tickets` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`locale_id` int(11) NOT NULL,
`ticket_type_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
);

Need help designing my invoice db structure

I have a website where customer can buy subscriptions.
The customer can at any time go to payment history and see what has been bought.
I'm trying to design the db for creating invoices, but something doesn't seem right for me.
My current setup looks like this:
+-----------+--------------+---------+
| Invoice | invoice_item | product |
+-----------+--------------+---------+
| id | id | id |
| fk_userID | desc | name |
| | quantity | price |
| | sum | |
| | fk_invoiceID | |
+-----------+--------------+---------+
It seems logical that invoice_item has a foreign key referenced to product.
But what happens if a product is deleted? If they are related, then the row in invoice_item will be deleted or set to null.
And that will not work if you want to look at an old invoice and the product is no longer available.
So, should product and invoice_item be related?
You cannot remove a product once it has been defined, so add a Status field to the product instead. In this example I'm using an enum, although it could easily be an INT or a set of bools (e.g. Archived). I use Parameter Enumeration Tables for this, but that's a separate answer.
The most important thing is to ensure that the invoice line has the pricing (and description) taken from the product at the point of order, to ensure that any future pricing changes or product name changes don't affect pre-existing invoices.
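A minimal sketch of that snapshot step: the invoice line copies the product's name and price when it is created, so later edits to the product leave the old invoice untouched. Everything here is an assumption for illustration: in-memory SQLite via Python's standard `sqlite3` module instead of MySQL, prices stored as integer cents, simplified column names, an invented sample product, and the hypothetical helper `add_invoice_item`.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); the product row is invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price_cents INTEGER NOT NULL
);
CREATE TABLE invoice (id INTEGER PRIMARY KEY);
CREATE TABLE invoice_item (
    id INTEGER PRIMARY KEY,
    fk_invoiceID INTEGER NOT NULL REFERENCES invoice(id),
    fk_productID INTEGER NOT NULL REFERENCES product(id),
    description TEXT NOT NULL,          -- copied from product.name at order time
    unit_price_cents INTEGER NOT NULL,  -- copied from product.price_cents at order time
    quantity INTEGER NOT NULL
);
INSERT INTO product VALUES (1, 'Monthly subscription', 999);
INSERT INTO invoice VALUES (1);
""")


def add_invoice_item(conn, invoice_id, product_id, quantity):
    """Create an invoice line that snapshots the product's current name and price."""
    with conn:  # commit on success, roll back on error
        conn.execute("""
            INSERT INTO invoice_item
                (fk_invoiceID, fk_productID, description, unit_price_cents, quantity)
            SELECT ?, p.id, p.name, p.price_cents, ?
            FROM product AS p
            WHERE p.id = ?
        """, (invoice_id, quantity, product_id))


add_invoice_item(conn, 1, 1, 2)

# A later rename and price change must not alter the invoice that was already raised.
conn.execute("UPDATE product SET name = 'Monthly plan', price_cents = 1299 WHERE id = 1")

item = conn.execute(
    "SELECT description, unit_price_cents, quantity FROM invoice_item"
).fetchone()
print(item)  # ('Monthly subscription', 999, 2)
```

The foreign key to `product` is still kept, so the line can be traced back to the product as long as products are archived rather than deleted.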
The other technique that I have used (quite successfully) is to introduce the concept of superseding entities in a database, so that the original record remains and a new version is inserted whenever data is changed. To do this I add the following fields:
currentID
supersededById
previousId
It makes the queries a little more cumbersome - but especially for addresses it is essential to ensure that the invoices remain constant and that address changes aren't reflected in the invoices - e.g. changing company name shouldn't change previously raised invoices.
CREATE TABLE `Invoice` (
`id` INTEGER NOT NULL AUTO_INCREMENT ,
PRIMARY KEY (`id`)
);
CREATE TABLE `Invoice Item` (
`id` INTEGER NOT NULL AUTO_INCREMENT ,
`desc` VARCHAR(200) NOT NULL ,
`value` DECIMAL(11,3) NOT NULL ,
`quantity` DECIMAL(11,3) NOT NULL ,
`total` DECIMAL(11,3) NOT NULL ,
`fk_id_Invoice` INTEGER NOT NULL ,
`fk_id_Product` INTEGER NOT NULL ,
PRIMARY KEY (`id`)
);
CREATE TABLE `Product` (
`id` INTEGER NOT NULL AUTO_INCREMENT ,
`Price` DECIMAL(11,3) NOT NULL ,
`Name` VARCHAR(200) NOT NULL ,
`Status` ENUM('Active', 'Archived') NOT NULL ,
PRIMARY KEY (`id`)
);
ALTER TABLE `Invoice Item` ADD FOREIGN KEY (fk_id_Invoice) REFERENCES `Invoice` (`id`);
ALTER TABLE `Invoice Item` ADD FOREIGN KEY (fk_id_Product) REFERENCES `Product` (`id`);
You just need an additional column, no_of_stock, in the product table. If the product is out of stock or is not currently in use, it should not be deleted. Instead, that column's value is set to 0, which means that the product cannot be sold currently but was sold previously, so its past existence is preserved.
If you want people to be able to see their old invoices, no matter how long ago they were generated, and including all information, then you shouldn't delete products. Just add a column containing an ENUM('active', 'inactive') NOT NULL or similar to distinguish those products you currently offer from those you only keep around for old references.