Thank you for opening this post. I need your help with MySQL DB Normalization.
I have 10 salespeople selling stuff, and a few people on the support team who do telemarketing sales for the salesmen. I was thinking about the best way to normalize the table, taking into consideration that salesmen often quit, so I have to transfer their data to another sales agent.
Currently I have only one table. Is it better to keep everything in that one table or to split it into several smaller tables? Currently my table looks like this:
id , mb , company_name , city , company_owner , phone_no1 , phone_no2 , app_status , sales_agent , cc_agent , three_options , exp_date , cc_comment , sales_comment , input_date , call_made , status
id = auto-increment primary key (unique)
mb = a unique key that I provide
company_name, city, company_owner, cc_comment, sales_comment, input_date, call_made, phone_no1 and phone_no2 should be in one table since they are different for every row. Right?
sales_agent = I have 10 people
cc_agent = I have 3 people
app_status = I have 3 statuses to select, so it has to be one out of three
status = I have 15 statuses to pick from, it has to be only one of those
So, maybe one table for all changeable stuff, another for sales_agent, another for cc_agent, app_status and status?
Thank you in advance.
Lots of smaller tables are better. The goal of normalization is to eliminate repeating data and to enforce referential integrity. You achieve that with lots of small tables and the use of foreign keys.
When creating your table structure, you want to look for nouns. When you use a noun-phrase approach, identifying possible tables is easier. In your case, I see columns called company_name, company_owner, etc. Create a table called 'Company' and give it columns id, name, owner, city, state, zip etc.
You have sales people, agents, and support people. But they are all employees. So another possible table is 'Employee' and job_title is a column.
Really spend the time and think out the database structure, and try to get the database into 3rd normal form.
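To make this concrete, here is a minimal sketch of how the single table from the question might be split. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs anywhere. The table layout, the sample rows, and the helper reassign_companies are my own assumptions for illustration, not part of the original schema. The point is that when a salesman quits, handing his companies to someone else becomes one UPDATE on a foreign key column.

```python
import sqlite3

# SQLite stands in for MySQL here; all names and rows below are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE employee (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    job_title TEXT NOT NULL            -- e.g. 'sales' or 'call_center'
);
CREATE TABLE status (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE          -- one row per allowed status
);
CREATE TABLE company (
    id             INTEGER PRIMARY KEY,
    mb             TEXT NOT NULL UNIQUE,
    name           TEXT NOT NULL,
    city           TEXT,
    sales_agent_id INTEGER REFERENCES employee(id),
    status_id      INTEGER REFERENCES status(id)
);
""")
conn.executemany(
    "INSERT INTO employee (id, name, job_title) VALUES (?, ?, ?)",
    [(1, "Ann", "sales"), (2, "Bob", "sales")],
)
conn.execute("INSERT INTO status (id, name) VALUES (1, 'new')")
conn.executemany(
    "INSERT INTO company (mb, name, city, sales_agent_id, status_id) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("MB001", "Acme", "Springfield", 1, 1),
        ("MB002", "Globex", "Shelbyville", 1, 1),
        ("MB003", "Initech", "Austin", 2, 1),
    ],
)


def reassign_companies(conn, from_agent_id, to_agent_id):
    """Move every company from one sales agent to another; return rows changed."""
    cur = conn.execute(
        "UPDATE company SET sales_agent_id = ? WHERE sales_agent_id = ?",
        (to_agent_id, from_agent_id),
    )
    conn.commit()
    return cur.rowcount


# Ann (id 1) quits, so Bob (id 2) takes over her companies.
moved = reassign_companies(conn, 1, 2)
agent2_count = conn.execute(
    "SELECT COUNT(*) FROM company WHERE sales_agent_id = 2"
).fetchone()[0]
```

The 3 app statuses and 15 statuses become rows in small lookup tables, so adding a new status is an INSERT rather than a schema change.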
To make you understand my question I'll give you an example:
I have a chat web app with many rooms, let's say 5 rooms.
People can choose to stay only in one room and they choose it at login.
When they choose the room I have to retrieve the people already in the room, so I can structure my db in two ways:
each room one table with the people being records;
all the rooms in one table, people are the records and a column indicating the room they are in;
In the first case the query would be:
SELECT * FROM `room_2` WHERE 1
In the second case the query would be:
SELECT * FROM `rooms` WHERE room = 'room_2'
Which is the best?
I think the only parameter to consider is performance, right?
In this example, no, because people are all 'like' objects and should therefore be in the same table.
In this simple example, keep all people and rooms in one table, with a primary key on the person.
Table Rooms(pk_person, personName, table_id)
But I want to talk about a structure that you will want to consider as your website grows. You’ll want three tables, one for each object (chat rooms, people) and one for the relationships.
Chat_Rooms(pk_ChatId, ChatName, MaxOccupants, other unique attributes of a chat room)
People(pk_PersonID, FirstName, LastName, other unique attributes of a person)
Room_People_Join(pk_JoinId, fk_ChatId, fk_PersonID, EnterDateTime, ExitDateTime)
This is a “highly normalized” structure. Each table is a collection of like objects, the join table allows for many-to-many relationships, and object rows are not duplicated. So, a person with all their attributes (name, gender, age) is never duplicated in the person table. Also, the person table never defines which chat rooms a person is in, because a person could be in one, many, or none, or may have entered and exited multiple times. The same concept applies to a chat room. A chat room's features, such as background color, max occupants, etc., have nothing to do with people.
The Room_People_Join is the important one. This has a unique primary key for which chat rooms a person is in and when they were there. This table grows indefinitely, but it tracks usage. Including the relationship table is what logically normalizes your database.
So how do you know which users are currently in chat room 1? You join your people and rooms to the join table with their respective Primary and Foreign keys in your FROM clause, ask for the columns you want in your SELECT clause, and filter for chat room 1 and people who haven’t yet left.
SELECT p.FirstName, p.LastName, r.ChatName
FROM Room_People_Join j
JOIN People p ON j.fk_PersonID = p.pk_PersonID
JOIN Chat_Rooms r ON j.fk_ChatId = r.pk_ChatId
WHERE j.ExitDateTime IS NULL
AND r.pk_ChatId = 1
Sorry that’s long winded, but I extrapolated your question for database growth.
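For anyone who wants to try the three-table layout, here is a small runnable sketch. It uses Python's built-in sqlite3 module in place of MySQL, and the sample people, rooms, and the helper people_in_room are made up for illustration. It assumes ExitDateTime stays NULL for as long as the person is still in the room.

```python
import sqlite3

# SQLite stands in for MySQL; sample data is invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Chat_Rooms (pk_ChatId INTEGER PRIMARY KEY, ChatName TEXT NOT NULL);
CREATE TABLE People (
    pk_PersonID INTEGER PRIMARY KEY,
    FirstName   TEXT,
    LastName    TEXT
);
CREATE TABLE Room_People_Join (
    pk_JoinId     INTEGER PRIMARY KEY,
    fk_ChatId     INTEGER NOT NULL REFERENCES Chat_Rooms(pk_ChatId),
    fk_PersonID   INTEGER NOT NULL REFERENCES People(pk_PersonID),
    EnterDateTime TEXT NOT NULL,
    ExitDateTime  TEXT               -- NULL while the person is still inside
);
INSERT INTO Chat_Rooms VALUES (1, 'Lobby'), (2, 'Games');
INSERT INTO People VALUES
    (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing'), (3, 'Grace', 'Hopper');
INSERT INTO Room_People_Join
    (fk_ChatId, fk_PersonID, EnterDateTime, ExitDateTime) VALUES
    (1, 1, '2024-01-01 10:00:00', NULL),
    (1, 2, '2024-01-01 10:05:00', '2024-01-01 10:30:00'),
    (2, 3, '2024-01-01 10:10:00', NULL);
""")


def people_in_room(conn, chat_id):
    """Return (first, last) for everyone who entered the room and has not left."""
    return conn.execute(
        """
        SELECT p.FirstName, p.LastName
        FROM Room_People_Join j
        JOIN People p     ON j.fk_PersonID = p.pk_PersonID
        JOIN Chat_Rooms r ON j.fk_ChatId   = r.pk_ChatId
        WHERE j.ExitDateTime IS NULL
          AND r.pk_ChatId = ?
        ORDER BY p.pk_PersonID
        """,
        (chat_id,),
    ).fetchall()


current = people_in_room(conn, 1)
```

Alan entered room 1 but already has an exit time, so only Ada comes back for that room.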
The answer is very simple and strongly recommended: one database table for all rooms, for sure! What if you later want to create rooms dynamically? You certainly would not want to create new tables dynamically.
I was thinking that I would have two tables in MySQL: one for storing login information and the other for shipping addresses. Is that the conventional way, or is everything stored in one table?
For two tables... is there a way to automatically copy a column from table A to table B, so that I can reference the same id to grab their shipping address?
If it's a single address that will be overwritten every time, then you can keep it in a single table, something like:
**Customer**
customer_id [pkey]
customer_name
login_id
password
shipping_address
Whereas if you want to store all the shipping addresses for a single customer (across multiple visits), then it would be a good design to have another table, customer_shipping_address:
**Customer**
customer_id [pkey]
customer_name
login_id
password
**Customer_Shipping_Address**
customer_id [fkey to customer]
shipping_address
This is my answer to your question about using one table or two. The decision depends on many factors, but I would suggest two separate tables, because the login information is something you will retrieve far more often than the shipping information. If you keep everything in one table, every row gets wider, and you will have to query that larger table every time you need a user's login information.
I think using two tables is the better way to go. Then just join them when you want to do the shipping.
The SQL for that would be like this.
SELECT
table1.id, table2.id, table2.somethingelse, table1.somethingelse
FROM
table1 INNER JOIN table2
ON table1.foreignkey = table2.primarykey
WHERE
(some condition is true)
The code above would need to be run on the shipping page itself.
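On the "automatically copy a column" part of the question: you normally don't copy anything. The address table just stores the customer's id as a foreign key, and you join on it. Here is a rough sketch of that, using Python's built-in sqlite3 module instead of MySQL so it is runnable as-is. The address_id column, the password_hash column name, the sample customer, and the helper functions are all my own assumptions for illustration.

```python
import sqlite3

# SQLite stands in for MySQL; names beyond the answer's table sketch are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    login_id      TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL          -- assumed: store a hash, not the password
);
CREATE TABLE customer_shipping_address (
    address_id       INTEGER PRIMARY KEY,
    customer_id      INTEGER NOT NULL REFERENCES customer(customer_id),
    shipping_address TEXT NOT NULL
);
""")


def add_customer(conn, name, login_id, password_hash):
    """Insert a customer and return the generated customer_id."""
    cur = conn.execute(
        "INSERT INTO customer (customer_name, login_id, password_hash) "
        "VALUES (?, ?, ?)",
        (name, login_id, password_hash),
    )
    return cur.lastrowid


def add_address(conn, customer_id, address):
    """Attach one more shipping address to an existing customer."""
    conn.execute(
        "INSERT INTO customer_shipping_address (customer_id, shipping_address) "
        "VALUES (?, ?)",
        (customer_id, address),
    )


def addresses_for_login(conn, login_id):
    """Join the two tables on customer_id to list a user's addresses."""
    rows = conn.execute(
        """
        SELECT a.shipping_address
        FROM customer c
        JOIN customer_shipping_address a ON a.customer_id = c.customer_id
        WHERE c.login_id = ?
        ORDER BY a.address_id
        """,
        (login_id,),
    ).fetchall()
    return [row[0] for row in rows]


cid = add_customer(conn, "Jane Doe", "jane", "not-a-real-hash")
add_address(conn, cid, "1 Main St")
add_address(conn, cid, "22 Oak Ave")
addresses = addresses_for_login(conn, "jane")
```

The login page only ever touches the customer table; the join runs only where the addresses are actually needed.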
I have a table that contains a lot of columns with ids (keys) corresponding to other tables.
For example, I have a table of cars that were sold
[table of cars that were sold]
(
car_make_id
, car_engine_id
, car_model_id
, car_radio_id
, buyer_id
, seller_id
, car_title_id
, sale_price
)
with each one of the id fields having another table containing the id and name like:
[another table]
(
car_make_id
, car_engine_id
, car_model_id
, car_radio_id
, buyer_id
, seller_id
, car_title_id
, sale_price
)
[and another table]
(
car_make
, car_make_id
)
[and another table]
(
car_title
, car_title_id
)
etc., with the tables named car_lookup, car_model_lookup, ...
Is there any way to join all these simply, without writing a million subqueries? There are millions of entries in this table, and each additional join costs a lot of time. I am looking for a fast and efficient way of comparing this data against another table that doesn't have ids, just the names. Let's say I have a list of compatible radios that has (make, model, engine, radio), and I want a list of the names of all the sellers who sold cars with incompatible radios, and how many incompatible sales each of them made.
I have been doing stuff like this in Perl, but it can take hours to run, so I am looking for something that can be done in MySQL.
PS: the car stuff is just an example. I don't actually work with cars, but it illustrates the problem I am having. I also cannot change the way the database is set up, because a large amount of code already queries the data.
Thanks
You need some way of telling the database which tables to pull names from for each ID.
If this kind of query is too slow, perhaps you can tune your database or MySQL server so that these JOINs run faster. Try increasing cache sizes (especially if your server has plenty of RAM) and make sure the id columns on those lookup tables are indexed.
SELECT car_make, car_engine, car_model, car_radio,
buyer, seller, car_title, sale_price FROM cars_sold
JOIN car_make_lookup USING (car_make_id)
JOIN car_engine_lookup USING (car_engine_id)
JOIN car_model_lookup USING (car_model_id)
JOIN car_radio_lookup USING (car_radio_id)
JOIN buyer_lookup USING (buyer_id)
JOIN seller_lookup USING (seller_id)
JOIN car_title_lookup USING (car_title_id)
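For the actual "incompatible radios per seller" report, one possible approach is to resolve the ids to names with joins and then LEFT JOIN against the names-only list, keeping the rows that find no match. Below is a small sketch of that idea. It runs on Python's built-in sqlite3 module rather than MySQL, and the compatible_radios table name, its column names, the index, and every sample row are assumptions I made up for the example.

```python
import sqlite3

# SQLite stands in for MySQL; compatible_radios and all rows are assumptions.
conn = sqlite3.connect(":memory:")
for name in ("car_make", "car_model", "car_engine", "car_radio", "seller"):
    # One lookup table per id column, e.g. car_make_lookup(car_make_id, car_make).
    conn.execute(
        f"CREATE TABLE {name}_lookup "
        f"({name}_id INTEGER PRIMARY KEY, {name} TEXT NOT NULL)"
    )
conn.executescript("""
CREATE TABLE cars_sold (
    car_make_id INTEGER, car_model_id INTEGER, car_engine_id INTEGER,
    car_radio_id INTEGER, seller_id INTEGER, sale_price REAL
);
CREATE TABLE compatible_radios (make TEXT, model TEXT, engine TEXT, radio TEXT);
CREATE INDEX idx_compat ON compatible_radios (make, model, engine, radio);
INSERT INTO car_make_lookup   VALUES (1, 'Ford');
INSERT INTO car_model_lookup  VALUES (1, 'Focus');
INSERT INTO car_engine_lookup VALUES (1, 'I4'), (2, 'V6');
INSERT INTO car_radio_lookup  VALUES (1, 'Basic'), (2, 'Premium');
INSERT INTO seller_lookup     VALUES (1, 'Sam'), (2, 'Sue');
INSERT INTO compatible_radios VALUES
    ('Ford', 'Focus', 'I4', 'Basic'),
    ('Ford', 'Focus', 'V6', 'Premium');
INSERT INTO cars_sold VALUES
    (1, 1, 1, 1, 1, 9000),    -- I4 + Basic: compatible
    (1, 1, 1, 2, 1, 9500),    -- I4 + Premium: incompatible
    (1, 1, 2, 1, 2, 11000),   -- V6 + Basic: incompatible
    (1, 1, 2, 1, 2, 11200),   -- V6 + Basic: incompatible
    (1, 1, 2, 2, 2, 12000);   -- V6 + Premium: compatible
""")

# Resolve ids to names, then keep only sales with no row in compatible_radios.
incompatible_by_seller = conn.execute("""
    SELECT s.seller, COUNT(*) AS incompatible_sales
    FROM cars_sold c
    JOIN car_make_lookup   mk ON mk.car_make_id   = c.car_make_id
    JOIN car_model_lookup  md ON md.car_model_id  = c.car_model_id
    JOIN car_engine_lookup en ON en.car_engine_id = c.car_engine_id
    JOIN car_radio_lookup  ra ON ra.car_radio_id  = c.car_radio_id
    JOIN seller_lookup     s  ON s.seller_id      = c.seller_id
    LEFT JOIN compatible_radios cr
           ON cr.make = mk.car_make AND cr.model = md.car_model
          AND cr.engine = en.car_engine AND cr.radio = ra.car_radio
    WHERE cr.make IS NULL
    GROUP BY s.seller
    ORDER BY s.seller
""").fetchall()
```

Whether this is fast enough on millions of rows depends on your indexes and server settings, so treat it as a starting point and check the plan with EXPLAIN on the real data.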
OK, I have a database with a table for storing classified posts, and each post belongs to a different city. For the purpose of this example we will call this table posts. This table has columns:
id (INT, +AI),
cityid (TEXT),
postcat (TEXT),
user (TEXT),
datatime (DATETIME),
title (TEXT),
desc (TEXT),
location (TEXT)
an example of that data would be:
'12039',
'fayetteville-nc',
'user#gmail.com',
'December 28th, 2010 - 11:55 PM',
'post title',
'post description',
'spring lake'
id is auto-incremented, cityid is in text format (this is where I think I will be losing performance once the database is large)...
Originally I planned on having a different table for each city and now since a user has to have the option of searching and posting through multiple cities, I think I need them all in one table. Everything was perfect when I had one city per table, where I could:
SELECT *
FROM `posts`
WHERE MATCH (`title`, `desc`, `location`)
AGAINST ('searchtext' IN BOOLEAN MODE)
AND `postcat` LIKE 'searchcatagory'
But then I ran into problems when trying to search multiple cities at one time, or when listing all of a user's posts so they can delete or edit them.
So it looks like I have to have one table with all the posts, and also match against another FULLTEXT field: cityid. I am guessing I need full-text because if a user chooses an entire state and my cityid is "fayetteville-nc", I would need to match cityid against "-nc". This is only an assumption, and I would love another way. This database could easily reach over a million rows within 6 months, and a full-text search against 4 columns is probably going to be slow.
My question is, is there a better way to do this more efficiently? The database has nothing in it now, except for some test posts made by me. So I can completely redesign the table structure if necessary. I am open to any and all suggestions, even if it is just a more efficient way to perform my query.
Yes, one table for all posts sounds sensible. It would also be normal design for the posts table to have a city_id, referring to the id in a city table. Each city would also have a state_id, referring to the id in a state table, and similarly each state would have a country_id referring to the id in a country table. So you could write:
SELECT $columns
FROM posts JOIN city ON city.id = posts.city_id
WHERE city.tag = 'fayetteville-nc'
Once you've brought the cities into a separate table, it might make more sense for you to do the city-to-city_id resolving up front. This happens fairly naturally if the city is chosen from a dropdown, for instance. But if you're entering free text into a search field, you may want to do it differently.
You can also search for all posts in a given state (or set of states) as:
SELECT $columns
FROM posts
JOIN city ON city.id = posts.city_id
JOIN state ON state.id = city.state_id
WHERE state.tag = 'NC'
If you're going to go more fancy or international, you may need a more flexible way of arranging locations into a hierarchy (e.g. you may want city districts, counties, multinational regions, intranational regions (Midwest, East Coast etc)) but stay easy for now :)
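Here is a small runnable sketch of the posts / city / state layout described above, so you can see the state-wide search working without any text matching on "-nc". It uses Python's built-in sqlite3 module as a stand-in for MySQL. The sample cities and posts, the index name, and the helper posts_in_state are invented for illustration, and the posts table is cut down to two columns to keep it short.

```python
import sqlite3

# SQLite stands in for MySQL; sample rows and helper names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE state (id INTEGER PRIMARY KEY, tag TEXT NOT NULL UNIQUE);
CREATE TABLE city (
    id       INTEGER PRIMARY KEY,
    tag      TEXT NOT NULL UNIQUE,
    state_id INTEGER NOT NULL REFERENCES state(id)
);
CREATE TABLE posts (
    id      INTEGER PRIMARY KEY,
    city_id INTEGER NOT NULL REFERENCES city(id),
    title   TEXT NOT NULL
);
-- An integer index replaces the text match on cityid.
CREATE INDEX idx_posts_city ON posts (city_id);
INSERT INTO state VALUES (1, 'NC'), (2, 'SC');
INSERT INTO city VALUES
    (1, 'fayetteville-nc', 1),
    (2, 'raleigh-nc', 1),
    (3, 'columbia-sc', 2);
INSERT INTO posts VALUES
    (1, 1, 'couch for sale'),
    (2, 3, 'bike wanted'),
    (3, 2, 'room for rent');
""")


def posts_in_state(conn, state_tag):
    """Return the titles of all posts whose city belongs to the given state."""
    rows = conn.execute(
        """
        SELECT posts.title
        FROM posts
        JOIN city  ON city.id  = posts.city_id
        JOIN state ON state.id = city.state_id
        WHERE state.tag = ?
        ORDER BY posts.id
        """,
        (state_tag,),
    ).fetchall()
    return [row[0] for row in rows]


nc_titles = posts_in_state(conn, "NC")
```

The full-text search can then stay on title, desc and location only, with the city or state filter handled by the joins.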
I'm new to database design, please give me advice about this.
1 When should I use a composite index?
I'm not sure what an index does, but I do know we should add one on columns that will be heavily used, like in WHERE verified = 1 and in searches like company.name = something. Am I right?
2 MySQL indexes - how many are enough?
Is what I have enough?
3 Database Normalization
Is it done right?
Thanks.
Edit: the rules.
Each user (company member or owner) can be a member of a company.
Each company has some users as members.
There are company admins (CEO, admins) and there are company members (who insert the products).
Each company can have products.
For number 3, I will add a bit column to users_company:
- 1 is for admin
- 0 is for members
Looks good, well normalised, to me at least.
I notice that each product can only belong to one company. If that's what you intended that's fine, otherwise you could have product have its own PID and have a product_company relation table, which would let more than one company sell a particular product. Depends who administers the products I guess.
I did notice that the user table is called 'users' (plural) and the others are singular ('company', 'product'). That's only a minor thing though.
The only comment I have is that you may want to consider adding a mapping_id column to your users_company table as its primary key, making CID and UID foreign keys, and adding a UNIQUE constraint on that pair.
This way you can have a distinct Primary Key for records in that table which isn't dependent on the structure of your other tables or any of your business logic.
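A quick sketch of that users_company suggestion, in case it helps. It uses Python's built-in sqlite3 module in place of MySQL. The column names CID, UID and mapping_id come from this thread, while is_admin (for the 1/0 admin bit), the cut-down users and company tables, and the helper add_membership are assumptions for the example.

```python
import sqlite3

# SQLite stands in for MySQL; is_admin and the helper are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE users   (UID INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE company (CID INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE users_company (
    mapping_id INTEGER PRIMARY KEY,              -- surrogate key for the row
    CID        INTEGER NOT NULL REFERENCES company(CID),
    UID        INTEGER NOT NULL REFERENCES users(UID),
    is_admin   INTEGER NOT NULL DEFAULT 0,       -- 1 = admin, 0 = member
    UNIQUE (CID, UID)                            -- one membership per pair
);
INSERT INTO users   VALUES (1, 'alice'), (2, 'bob');
INSERT INTO company VALUES (1, 'Acme');
""")


def add_membership(conn, cid, uid, is_admin):
    """Insert a membership; return False if that (CID, UID) pair already exists."""
    try:
        conn.execute(
            "INSERT INTO users_company (CID, UID, is_admin) VALUES (?, ?, ?)",
            (cid, uid, is_admin),
        )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_membership(conn, 1, 1, 1)      # alice becomes an admin of Acme
duplicate = add_membership(conn, 1, 1, 0)  # same pair again is rejected
second = add_membership(conn, 1, 2, 0)     # bob joins as a regular member
total = conn.execute("SELECT COUNT(*) FROM users_company").fetchone()[0]
```

The UNIQUE constraint also gives you an index on (CID, UID), which covers lookups like "all members of this company".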