Database design structure - mysql

I'm new to database design and have never taken a class on it. I have a problem structuring my database and assigning primary keys.
I have a list of cities, and each city has 5 types of public transport. In every city, each type of public transport has a different ticket price, main station, and CSV file with route coordinates etc. I then need to calculate daily the average cost of transportation in every city for each type of public transport, based on route coordinates (distances), price, the time it takes etc.
Table cities:
city (Primary key)
Table public transport:
city, type of transport, ticket price, main station, file1, file2
Table results:
city, type of transport, date, cost
How should I connect these tables (assuming their structure is right)? In the public transport table, I think city should be a foreign key, but type of transport will repeat for every city, so I don't think it can be the primary key of this table. The same goes for the results table.

The main idea is that you don't want to repeat yourself. Repetition is not only overhead, it is also quite error-prone when you need to change multiple entries that represent the same thing.
There are guidelines on database normalization that help you ensure your data is in a form that's easy to maintain and work with.
You don't need to become an expert in which normal form does what, but being able to identify what should be kept separate is a must when it comes to database design.
You should list what you know:
Different cities.
Different type of transport.
Different ticket prices.
Different stations.
If you create a separate table for each of those, it will be easy to link them together in rows of another table that represents something on a larger scale. Every entry should have a separate id that serves as its primary key. You need to allow for, e.g., multiple cities with the same name, so a city name cannot be guaranteed unique and should not be the primary key.
E.g. it is now easy to identify the routes for a city, and there can be multiple routes in a city:
route_id | city_id | route_name
1        | 2       | test1
2        | 2       | test2
You could then add another table that represents which kind of transport is tied to each specific route.
route_id | transport_id
1        | 3
2        | 4
You're then able to create a new table that holds the stations that are part of your route, and you can even flag whether an entry is a main route or not.
route_connected_id | route_id | station_id | main_route
1                  | 1        | 2          | 1   // a main route
2                  | 1        | 3          | 0   // not a main route
And it goes on and on: separating out the simplest entries allows you to create complex relationships where all you have to do is link ids.
This is the basic idea, which should hopefully get you started. Whether you find it helpful or not, I recommend that you take a look at the reading material I suggested, i.e. database normalization.
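Applied to the tables in the question, the idea can be sketched roughly as follows. This is a minimal sketch, not a definitive design: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table and column names (city, transport_type, city_transport, daily_result) are assumptions made for illustration. For brevity the main station and route file stay as plain columns instead of getting their own tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE city (
    city_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL              -- names may repeat, ids may not
);
CREATE TABLE transport_type (
    transport_type_id INTEGER PRIMARY KEY,
    name              TEXT NOT NULL UNIQUE
);
-- One row per (city, transport type): the pair is the primary key.
CREATE TABLE city_transport (
    city_id           INTEGER NOT NULL REFERENCES city (city_id),
    transport_type_id INTEGER NOT NULL REFERENCES transport_type (transport_type_id),
    ticket_price      REAL NOT NULL,
    main_station      TEXT,
    route_file        TEXT,
    PRIMARY KEY (city_id, transport_type_id)
);
-- One row per (city, transport type, day).
CREATE TABLE daily_result (
    city_id           INTEGER NOT NULL,
    transport_type_id INTEGER NOT NULL,
    result_date       TEXT NOT NULL,
    cost              REAL NOT NULL,
    PRIMARY KEY (city_id, transport_type_id, result_date),
    FOREIGN KEY (city_id, transport_type_id)
        REFERENCES city_transport (city_id, transport_type_id)
);
INSERT INTO city VALUES (1, 'Springfield'), (2, 'Springfield');
INSERT INTO transport_type VALUES (1, 'bus'), (2, 'tram');
INSERT INTO city_transport VALUES
    (1, 1, 2.5, 'Central', 'bus_1.csv'),
    (1, 2, 3.0, 'North', 'tram_1.csv'),
    (2, 1, 1.5, 'Main St', 'bus_2.csv');
INSERT INTO daily_result VALUES
    (1, 1, '2024-01-01', 4.5),
    (2, 1, '2024-01-01', 3.0);
""")

# The same transport type can appear in many cities, but only once per city.
try:
    conn.execute("INSERT INTO city_transport VALUES (1, 1, 9.0, 'Other', 'x.csv')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute("""
    SELECT c.city_id, c.name, t.name, r.result_date, r.cost
    FROM daily_result AS r
    JOIN city AS c ON c.city_id = r.city_id
    JOIN transport_type AS t ON t.transport_type_id = r.transport_type_id
    ORDER BY c.city_id
""").fetchall()
```

The composite key answers the worry in the question: type of transport alone repeats, but the pair (city, type of transport) does not, and the results table simply adds the date to that pair.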

Related

Entity relationship with one to many foreign keys

This is more of a question on best practice/efficiency optimization, while also helping me to unstick myself.
I need to make a database that simulates a video game company. I have a project entity along with an assigned employee roster for each specific project, and an employee entity with foreign keys relating to it from the project roster. I have to add different pay types, such as contractor, intern, and full-time employee. Along with this, I have to divide the employees themselves into separate groups to indicate whether they are a programmer, an artist, a designer, a tester, or a producer. What might be the most efficient way to handle this while maintaining database query flexibility?
An idea that I had was to attempt to make a bridge entity for both pay types and employee types, but I got stuck thinking I might need to have foreign keys to each respective type table from the pay type bridge, or the employee type bridge, and that would simply be redundant on the query side, along with it yielding the wrong individuals upon a query.
Is there a way I could make a composite key that might look at an attribute of an entity, and assign them as a member of the indicated employee type?
For example:
emp_ID lname fname assigned_project FK_pay_type_ID
123456 Shelly Mary 21 65(contractor)
emp_type
designer(this could be read and a script could assign her to the designer table?)
To give an idea of what I am thinking of doing, here are a couple of tables I mocked up in Visio:
Visio Image Here (I really have to have arbitrary site points just to post an image that clarifies my question? Really?)
Project Project_Roster Employee
FK_projectState ----> PK_projectRoster_ID <list employee credentials here>
PK_Project_ID FK_Employee_ID ------>FK_assignedRoster_ID
projectName FK_payType_ID (goes to bridge)
launchDate FK_EmployeeType_ID(goes to bridge)
Employee_PayType_BR
PK_payType_ID (pointed to specific type depending on assigned?)
FK_Employee_ID
description
amount_Offered
amount_paidToDate("salary" if full time)
#Not sure how to get here
PayType - Contractor
PK_Contractor_ID
FK_Employee_ID
parent_company
skill_set(list | array)
PayType - Intern
PK_Intern_ID
FK_Employee_ID
Perhaps there is a simpler way, and I am just missing it, and in that case I apologize for my ineptitude.

Putting an entity in hierarchy, or as an attribute with lookup table?

Let's say my company is producing medical products, these products are used in many different lab testing instruments. The business logic hierarchy goes like this:
A lab has multiple locations (Up to thousands)
A location has multiple departments (Chemistry, Hematology, 3-5 per location)
A department has multiple instruments (No more than 10-20 instruments per location)
An instrument has many products (No more than 1-5 product types per instrument)
The table structure currently mirrors the business logic, like displayed on the left. I suggested we make a small change, displayed on the right.
What are some pros and cons of each approach? I feel like the left-hand side approach might be a bit slower due to chaining so many Joins in a row.
The biggest "con" I see to the approach on the right-hand side is that you lose the association between Department and Location. For the relationships that you described atop your post, the structure on the left is correct from a design perspective.
HOWEVER...
The design that you have means that the Mass Spectrometer at your San Antonio facility will have a different ID than the one at your Denver facility. Is that intended?
------------------ revision after discussion in comments ------------------
You've described a couple of many-to-many relationships - a location will have multiple instruments and multiple locations can have the same instrument (e.g. Mass Spectrometer). To support that, you'll need cross-reference tables. Here's an initial sketch. My standard is to call the table's primary key "ID", and any field called "[table-name]_ID" is a foreign key to the corresponding table:
Lab
    ID
    Name
Location
    ID
    Lab_ID
    Street_Address
    City
    etc.
Department
    ID
    Name
Location_Department -- this lists the departments at a given location
    ID
    Department_ID
    Location_ID
Instrument -- Scale, Oscilloscope, Mass Spectrometer, etc.
    ID
    Name
    Description
Location_Department_Instrument -- inventory at a given location
    Location_Department_ID
    Instrument_ID
    Instrument_Serial_Number
Let me know if this makes sense.
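To make the cross-reference tables concrete, here is a minimal sketch of the schema above with one query that lists the inventory per location. It uses Python's built-in sqlite3 module purely so it can be run as-is; the sample data, the reduced Location columns, and the UNIQUE constraints are assumptions for illustration and not part of the original sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Lab (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Location (
    ID     INTEGER PRIMARY KEY,
    Lab_ID INTEGER NOT NULL REFERENCES Lab (ID),
    City   TEXT NOT NULL
);
CREATE TABLE Department (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Location_Department (
    ID            INTEGER PRIMARY KEY,
    Department_ID INTEGER NOT NULL REFERENCES Department (ID),
    Location_ID   INTEGER NOT NULL REFERENCES Location (ID),
    UNIQUE (Department_ID, Location_ID)      -- assumed: one row per pairing
);
CREATE TABLE Instrument (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Location_Department_Instrument (
    Location_Department_ID   INTEGER NOT NULL REFERENCES Location_Department (ID),
    Instrument_ID            INTEGER NOT NULL REFERENCES Instrument (ID),
    Instrument_Serial_Number TEXT NOT NULL UNIQUE  -- assumed unique per device
);
INSERT INTO Lab VALUES (1, 'Example Lab');
INSERT INTO Location VALUES (1, 1, 'San Antonio'), (2, 1, 'Denver');
INSERT INTO Department VALUES (1, 'Chemistry');
INSERT INTO Location_Department VALUES (1, 1, 1), (2, 1, 2);
INSERT INTO Instrument VALUES (1, 'Mass Spectrometer');
INSERT INTO Location_Department_Instrument VALUES
    (1, 1, 'SN-001'),
    (2, 1, 'SN-002');
""")

# The instrument type is stored once; each physical device is a row
# in the inventory table, identified by its serial number.
inventory = conn.execute("""
    SELECT l.City, d.Name, i.Name, ldi.Instrument_Serial_Number
    FROM Location_Department_Instrument AS ldi
    JOIN Location_Department AS ld ON ld.ID = ldi.Location_Department_ID
    JOIN Location   AS l ON l.ID = ld.Location_ID
    JOIN Department AS d ON d.ID = ld.Department_ID
    JOIN Instrument AS i ON i.ID = ldi.Instrument_ID
    ORDER BY l.City
""").fetchall()
```

With this layout the Mass Spectrometer has the same Instrument ID in San Antonio and Denver, and only the inventory rows differ.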

How should I design MS Access 2010 tables for road construction projects?

New to database design and admittedly over my head. Trying to create a small database that will allow me to find state highway construction project information along any given road. Relationships include:
One contract number to one project; One county to many projects; One road to many projects; One road to many counties and one county to many roads; One manager to many projects; One contractor to many projects; One contact list (phone, email) to one manager; One date each (bid, start, complete) to many projects.
Would be a small database, maybe 500 records total. There are only 6 counties. Right now I've broken roads down into 6 separate "roads by county" tables so while the route number may be the same in different counties each record will be unique because it's in a separate county table. Is this OK or is it better to keep one roads table and assign county values there? I created other tables listing the counties, contracts, contractors, managers, dates and project description. Just don't know what to do with them.
My purpose is to be able to search, mostly by road number and keyword, to find what projects are on what road at any given time. I'd also like to update this info via forms. The data will change frequently and it's just a little too unruly for a spreadsheet. I simply can't wrap my head around how to set up and relate the tables and individual records. Any thoughts would be phenomenally appreciated.
I imagine a database with a simple structure.
Many-to-many relations normally should, and can, be avoided.
Access doesn't have a direct implementation of this kind of relation; you need to create a cross table between the two main tables.
Anyway, I have created databases for projects more complex than this, so I tried to accomplish your request.
The "core" table is tblProjects, which refers to other tables for details.
Since projects are related to roads, I use the road as the main item, and a road can have a list of counties. If you want to know how many projects are in a county, that can be done simply with a query looking for all the roads that have that County_ID.
If you want to look up the projects for one road, simply find the Road_ID (e.g. using a combo box to select it) and filter (query) tblProjects by Road_ID.
tblManagers
    *IDManager
    LastName
    FirstName
    eMail
    Phone
    Mobile
tblContractors
    *IDContractor
    ContractorName
    Reference
tblCounties
    *IDCounty
    CountyName
    Road_ID
tblRoads
    *IDRoad
    RoadNum
    RoadName
tblProjects
    *IDProject
    ContractNum
    ProjectName
    StartDate
    EndDate
    BidDate
    Contractor_ID
    Road_ID
    Manager_ID
The fields marked with * are the key fields. They are related to the corresponding _ID fields (check referential integrity and cascade deletion when you create the relation).
Let me know
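The road lookup described above could look roughly like this. It is only a sketch under stated assumptions: Python's built-in sqlite3 module stands in for Access, tblProjects is reduced to a few of the listed fields, the sample rows are invented, and projects_on_road is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblRoads (
    IDRoad   INTEGER PRIMARY KEY,
    RoadNum  TEXT NOT NULL,
    RoadName TEXT
);
CREATE TABLE tblProjects (
    IDProject   INTEGER PRIMARY KEY,
    ContractNum TEXT NOT NULL UNIQUE,   -- one contract number per project
    ProjectName TEXT NOT NULL,
    Road_ID     INTEGER NOT NULL REFERENCES tblRoads (IDRoad)
);
INSERT INTO tblRoads VALUES (1, '101', 'Coast Hwy'), (2, '9', 'Mountain Rd');
INSERT INTO tblProjects VALUES
    (1, 'C-001', 'Bridge repair', 1),
    (2, 'C-002', 'Resurfacing', 1),
    (3, 'C-003', 'Bridge repair', 2);
""")


def projects_on_road(conn, road_num, keyword=""):
    """Return (ContractNum, ProjectName) for projects on one road number,
    optionally narrowed to project names containing a keyword."""
    return conn.execute(
        """
        SELECT p.ContractNum, p.ProjectName
        FROM tblProjects AS p
        JOIN tblRoads AS r ON r.IDRoad = p.Road_ID
        WHERE r.RoadNum = ? AND p.ProjectName LIKE ?
        ORDER BY p.ContractNum
        """,
        (road_num, f"%{keyword}%"),
    ).fetchall()


all_on_101 = projects_on_road(conn, "101")
bridges_on_101 = projects_on_road(conn, "101", "Bridge")
```

In Access itself the same filter would normally be a saved query with parameters bound to the form's combo box and search box.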

How to move SQL info from one DB to another

I'm trying to migrate SQL data from one script to another, namely only one part of it, but they use different "codes".
Basically, I'm trying to move the information for "states", which is held as a number inside both DBs. But the numbers are of course different. For instance, Arizona is 12 in one but 3425 in the other.
I'm trying to find a way to first match the customer ID, which is the same in both, then grab the number from one DB, convert it to the number it has in the new DB, and place it in the correct area of that DB.
But my SQL skills are very lacking and I can't find a reasonable solution.
Scripts in use:
Old - CRE Loaded 4.2.1a
New - OpenCart (Current Version)
EDIT
Here is my best attempt at explaining the DB structure; excuse me if it's not the best explanation of how the DB works.
Here is the old DB structure for the tables that are involved:
address_book - Holds customer information. The state number is held in the column entry_zone_id in number format, and the country number is held in entry_country_id, also in number format. The "zone ID" is based on the "country ID", depending on the country.
zones - Holds state and country ID information. Example:
for Arizona the zone_id is 4, the zone_country_id is 223 (for USA), the zone_code is AZ, and the zone_name is Arizona
customers - Holds the main customer information, not including address info. customers_id is of course the customer's ID number, which corresponds to the same customer ID in the table address_book. customers_default_address_id also corresponds to the ID in address_book.
Now as for the new DB:
oc_address - address_id and customer_id mark the customer's ID number. country_id tells which country they are in, in number format. zone_id holds the state's DB number.
oc_zone - "zone_id" holds the zone's ID, "country_id" holds the country's ID number, such as the one for the USA. "name" is the state name. "code" is the state or country code. Example:
Arizona in the DB shows like this:
zone_id=3616, country_id=223, name=Arizona, code=AZ, status=1 (shows that it's used)
One more table that may be needed is this one:
oc_country - Holds country info. country_id holds the country code, name is the name of the country, and iso_code_2 and iso_code_3 are the country ISO codes.
I hope this is of use.
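One possible way to express the mapping described above is to join the old and new zone tables on the state code and country ID, since those are the values that match across both systems in the Arizona example. The following is only a sketch under several assumptions: Python's built-in sqlite3 module stands in for MySQL, both sets of tables are reachable from one connection, country IDs are identical in both databases (223 in both examples), each customer has a single address, and the address_book columns address_book_id and customers_id are assumed names not given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Old (CRE Loaded) tables, reduced to the columns that matter here.
CREATE TABLE zones (
    zone_id INTEGER PRIMARY KEY,
    zone_country_id INTEGER,
    zone_code TEXT,
    zone_name TEXT
);
CREATE TABLE address_book (
    address_book_id INTEGER PRIMARY KEY,  -- assumed column name
    customers_id INTEGER,                 -- assumed column name
    entry_zone_id INTEGER,
    entry_country_id INTEGER
);
-- New (OpenCart) tables.
CREATE TABLE oc_zone (
    zone_id INTEGER PRIMARY KEY,
    country_id INTEGER,
    name TEXT,
    code TEXT,
    status INTEGER
);
CREATE TABLE oc_address (
    address_id INTEGER PRIMARY KEY,
    customer_id INTEGER,
    country_id INTEGER,
    zone_id INTEGER
);
INSERT INTO zones VALUES (4, 223, 'AZ', 'Arizona');
INSERT INTO address_book VALUES (1, 10, 4, 223);
INSERT INTO oc_zone VALUES (3616, 223, 'Arizona', 'AZ', 1);
-- Customer 10 exists in the old DB, customer 11 does not.
INSERT INTO oc_address VALUES (1, 10, 223, 0), (2, 11, 223, 0);
""")

# For one new address row: old address -> old zone -> new zone with the
# same state code and country.
MAPPED_ZONE = """
    SELECT nz.zone_id
    FROM address_book AS ab
    JOIN zones   AS oz ON oz.zone_id = ab.entry_zone_id
    JOIN oc_zone AS nz ON nz.code = oz.zone_code
                      AND nz.country_id = oz.zone_country_id
    WHERE ab.customers_id = oc_address.customer_id
"""

# Only touch rows for which a mapping exists; others keep their value.
conn.execute(
    f"UPDATE oc_address SET zone_id = ({MAPPED_ZONE}) WHERE EXISTS ({MAPPED_ZONE})"
)
migrated = conn.execute(
    "SELECT customer_id, zone_id FROM oc_address ORDER BY customer_id"
).fetchall()
```

Running the SELECT part on its own first, before any UPDATE, is a cheap way to check that every old zone finds exactly one new zone.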

Database Normalization with user input

I am developing a MySQL database that will contain the country, city and occupation of each user.
While I can use a "country" table and then insert the id of the country into the user table, I still have to find the best method for the other two.
The problem is that the city and occupation of each user are taken from an input field, meaning that users can type "NYC" or "New York" or "New York City" and millions of other combinations for each town, for example.
Is it a good idea to disregard this issue, create a separate "town" table containing all the towns entered by users, and then put the id of the town entry into the user table? Or would it be more appropriate to use a VARCHAR column "town" in the user table and not normalize the database for this relation?
I want to display the data from the three tables on user profile pages.
I am concerned about normalization because I don't want too much redundant data in my database: it consumes a lot of space, and queries will be slower if I use a varchar index instead of an integer index, for example (as far as I know).
Thanks
We had this problem. Our solution was to collect the various synonyms and typo-containing versions that people use and explicitly map them to a known canonical city name. This allowed us to correctly guess the name from user input in 99% of cases.
For the remaining 1%, we created a new city entry and marked it as non-canonical. Periodically we looked through the non-canonical entries. For recognizable known cities, we remapped the non-canonical entry to the canonical one (updating the FKs of linked records and adding a synonym). For a genuinely new city name that we didn't know about, we kept the created entry and marked it as canonical.
So we had something like this:
create table city (
  id integer primary key,
  name varchar(255) not null -- the canonical name
  -- ... (further columns)
);
create table city_synonym (
  name varchar(255) primary key, -- the primary key gives us the unique index
  city_id integer,
  foreign key (city_id) references city (id)
);
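The lookup flow described above could be sketched like this. It is a rough illustration rather than our actual code: Python's built-in sqlite3 module stands in for the real database, the is_canonical column and the resolve_city helper are hypothetical names, and the normalization (trim, collapse whitespace, lowercase) is an assumption about what counts as the same input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,               -- the canonical name
    is_canonical INTEGER NOT NULL DEFAULT 1   -- hypothetical review flag
);
CREATE TABLE city_synonym (
    name    TEXT PRIMARY KEY,                 -- normalized user input
    city_id INTEGER NOT NULL REFERENCES city (id)
);
INSERT INTO city (id, name) VALUES (1, 'New York');
INSERT INTO city_synonym VALUES
    ('new york', 1), ('nyc', 1), ('new york city', 1);
""")


def resolve_city(conn, raw_name):
    """Map free-text input to a city id, creating a non-canonical
    entry for later review when the input is not a known synonym."""
    key = " ".join(raw_name.split()).lower()
    row = conn.execute(
        "SELECT city_id FROM city_synonym WHERE name = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(
        "INSERT INTO city (name, is_canonical) VALUES (?, 0)",
        (raw_name.strip(),),
    )
    new_id = cur.lastrowid
    # Remember the spelling so the same input resolves to the same entry.
    conn.execute(
        "INSERT INTO city_synonym (name, city_id) VALUES (?, ?)", (key, new_id)
    )
    return new_id


known = resolve_city(conn, "  NYC ")
unknown = resolve_city(conn, "Now Yerk")
```

The periodic review then only has to look at rows where is_canonical is 0 and either repoint their synonyms at an existing city or promote them.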
Usually data normalization helps you work with data and keep it simple. If a normalized schema does not fit your needs, you can use denormalized data as well. So it depends on the queries you want to run.
There is no good way to group cities without creating a separate table where you keep all the names for each city under a single id. So it would be good to have 3 tables: user(user_id, city_id), city(city_id, correct name), city_alias(alias_id, city_id, name).
It would be better to store the data in a normalized design, containing the actual, government-recognized city names.
@Varela's suggestion of an 'alias' for the city would probably work well in this situation. But you have to return a message along the lines of "You typed in 'Now Yerk'. Did you perhaps mean 'New York'?". Actually, you want to get these kinds of corrections regardless...
Of course, what you should probably actually store isn't the city, but the postal/zip code. Table design is along these lines:
State:
Id  State
============
AL  Alabama
NY  New York
City:
Id  State_Id  City
========================
1   NY        New York
2   NY        Buffalo
Zip_Code:
Id  Code        City_Id
=========================
1   00001-0001  1
And then store a reference to Zip_Code.Id whenever you have an address. You want to know exactly which zip code a user has (claimed) to be a part of. Reasons include:
Taxes for retail (regardless of how Amazon plays out).
Addresses for delivery (There is a Bellevue in both Washington and New York, for example. Zip codes are different).
Social mapping. If you store it as 'user input' cities, you will not be able to (easily) analyze the data to find out things like which users live near each other, much less in the same city.
There are a number of other things that can be done about address verification, including geo-location, but this is a basic design that should help you in most of your needs (and prevent most of the possible 'invalid' anomalies).
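As a rough illustration of the "social mapping" point, the following sketch builds the three tables above and counts users per city by going through the zip code. Python's built-in sqlite3 module is used only to make it runnable; the App_User table, the user names, and the extra zip codes are placeholders invented for the example, not real postal data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE State (Id TEXT PRIMARY KEY, State TEXT NOT NULL);
CREATE TABLE City (
    Id       INTEGER PRIMARY KEY,
    State_Id TEXT NOT NULL REFERENCES State (Id),
    City     TEXT NOT NULL
);
CREATE TABLE Zip_Code (
    Id      INTEGER PRIMARY KEY,
    Code    TEXT NOT NULL UNIQUE,
    City_Id INTEGER NOT NULL REFERENCES City (Id)
);
-- Hypothetical user table: it stores only a reference to Zip_Code.Id.
CREATE TABLE App_User (
    Id          INTEGER PRIMARY KEY,
    Name        TEXT NOT NULL,
    Zip_Code_Id INTEGER NOT NULL REFERENCES Zip_Code (Id)
);
INSERT INTO State VALUES ('AL', 'Alabama'), ('NY', 'New York');
INSERT INTO City VALUES (1, 'NY', 'New York'), (2, 'NY', 'Buffalo');
INSERT INTO Zip_Code VALUES
    (1, '00001-0001', 1),
    (2, '00001-0002', 1),   -- placeholder code
    (3, '00002-0001', 2);   -- placeholder code
INSERT INTO App_User VALUES (1, 'ann', 1), (2, 'bob', 2), (3, 'cy', 3);
""")

# Users with different zip codes still group under the same city.
users_by_city = conn.execute("""
    SELECT c.City, s.State, COUNT(*) AS user_count
    FROM App_User AS u
    JOIN Zip_Code AS z ON z.Id = u.Zip_Code_Id
    JOIN City     AS c ON c.Id = z.City_Id
    JOIN State    AS s ON s.Id = c.State_Id
    GROUP BY c.Id, c.City, s.State
    ORDER BY c.City
""").fetchall()
```

With free-text city names the same grouping would first need the spelling variants reconciled, which is exactly the work the zip code reference avoids.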