Database performance: Split table or keep together - mysql

For our project we want to store employees and customers in one or two database tables.
The customers have the same columns as the employees (e.g. name, address, language, email, ...).
The employees, on the other hand, have additional columns such as social security number, bank account, ...
Since the two share many columns, it might make sense to merge them into a single 'person' table, especially since a customer may occasionally become an employee or vice versa.
But since these two 'roles' are strictly separated in the application (we have queries that fetch all customers, queries that fetch all employees, and searches for a person by email where the role is customer / employee), it might perform better to keep them separate.
Is there a big performance difference between these two solutions, or is there even a third, better one?

I would not make this decision based on performance. I would make the decision based on security considerations and access rules.
In almost all circumstances I can think of, you would want separate tables for customers and employees. You could have a third table, persons, for common attributes such as names and addresses. That said, there are so many differences between the two, that I'm not even sure that is a good idea:
Customers could be incorporated entities such as companies.
Customers could be from anywhere in the world, but it is reasonable in many situations to assume that employees are local.
Employees have dependents.
You may maintain information such as gender, race, and age that you would not want to maintain about customers.
Those are just a few items that immediately come to mind. There are many other differences.
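For what it's worth, here is a minimal sketch of the "third table" variant mentioned above, so the trade-off is easier to picture. It uses Python's built-in sqlite3 module instead of MySQL purely so it runs without a server; the table and column names (persons, customers, employees, person_id, ...) and the helper find_by_email are illustrative assumptions, not something from the question.

```python
import sqlite3

# Assumed layout: shared attributes in persons, one narrow table per role.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE persons (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        email TEXT NOT NULL
    );
    CREATE TABLE customers (
        person_id INTEGER PRIMARY KEY REFERENCES persons(id)
    );
    CREATE TABLE employees (
        person_id              INTEGER PRIMARY KEY REFERENCES persons(id),
        social_security_number TEXT NOT NULL,
        bank_account           TEXT NOT NULL
    );
    INSERT INTO persons VALUES (1, 'Ann', 'ann@example.com'),
                               (2, 'Bob', 'bob@example.com');
    INSERT INTO customers VALUES (1);
    INSERT INTO employees VALUES (2, '000-00-0000', 'XX00-0000');
    -- Ann, a customer, is hired: one new role row, nothing is copied.
    INSERT INTO employees VALUES (1, '000-00-0001', 'XX00-0001');
""")

def find_by_email(conn, role_table, email):
    """Search a person by email within one role (hypothetical helper)."""
    # The table name is checked against a fixed list, never taken from user input.
    if role_table not in ("customers", "employees"):
        raise ValueError(f"unknown role table: {role_table}")
    query = (
        "SELECT p.id, p.name FROM persons p "
        f"JOIN {role_table} r ON r.person_id = p.id "
        "WHERE p.email = ?"
    )
    return conn.execute(query, (email,)).fetchall()

print(find_by_email(conn, "customers", "ann@example.com"))
print(find_by_email(conn, "customers", "bob@example.com"))
```

The sensitive columns live only in employees, so access to that table can be restricted separately, which fits the security argument above. Whether the shared persons table is worth it still depends on the differences listed.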

Related

MySql table with potentially *very* many columns

A friend who is a recruiter for software engineers wants me to create an app for him.
He wants to be able to search candidates' CVs based on skills.
As you can imagine, there are potentially hundreds, possibly thousands of skills.
What's the best way to represent the candidate in a table? I am thinking skill_1, skill_2, skill_n, etc, but somewhere out there there is a candidate with more than n skills.
Also, it is possible that more skills will be added to the database in future.
So, what's the best way to represent a candidate's skills?
[Update] for #zohar, here's a rough first pass at the schema. Any comments?
You need three tables (at least):
One table for candidates, that will contain all the details such as name, contact information, the cv (or a link to it) and all other relevant details.
One table for skills - that will contain the skill name, and perhaps a short description (if that's relevant)
and one table to connect candidates to skills - candidatesToSkills - that sits on the "many" side of a one-to-many relationship with each of the other two tables, and has a primary key that is the combination of the candidate id and the skill id.
This is the relational way of creating a many to many relationship.
As a bonus, you can also add a column for skill level - beginner, intermediate, skilled, expert, etc.
You might also want to add a table for job openings and another table to connect that to the skills table, so that you can easily find the most suitable candidate for the job based on the required skills. (But please note that skills are not the only match needed - other points to match are geographic location, salary expectations, etc.)
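A rough sketch of those three tables and a skill search, in case it helps. It uses Python's sqlite3 so it runs as-is; the table name candidates_to_skills, the column names, the sample rows and the helper candidates_with_all_skills are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE candidates (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE skills     (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- Link table: one row per (candidate, skill) pair, no skill_1..skill_n columns.
    CREATE TABLE candidates_to_skills (
        candidate_id INTEGER NOT NULL REFERENCES candidates(id),
        skill_id     INTEGER NOT NULL REFERENCES skills(id),
        level        TEXT,  -- optional: beginner, intermediate, ...
        PRIMARY KEY (candidate_id, skill_id)
    );
    INSERT INTO candidates VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO skills VALUES (1, 'Python'), (2, 'SQL'), (3, 'Java');
    INSERT INTO candidates_to_skills VALUES
        (1, 1, 'expert'), (1, 2, 'skilled'),
        (2, 1, 'beginner'),
        (3, 2, 'expert'), (3, 3, 'intermediate');
""")

def candidates_with_all_skills(conn, skill_names):
    """Return names of candidates who have every skill in skill_names."""
    wanted = sorted(set(skill_names))
    if not wanted:
        return []
    placeholders = ",".join("?" for _ in wanted)
    query = f"""
        SELECT c.name
        FROM candidates c
        JOIN candidates_to_skills cs ON cs.candidate_id = c.id
        JOIN skills s ON s.id = cs.skill_id
        WHERE s.name IN ({placeholders})
        GROUP BY c.id, c.name
        HAVING COUNT(DISTINCT s.id) = ?
        ORDER BY c.name
    """
    rows = conn.execute(query, (*wanted, len(wanted))).fetchall()
    return [name for (name,) in rows]

print(candidates_with_all_skills(conn, ["Python", "SQL"]))
print(candidates_with_all_skills(conn, ["SQL"]))
```

Adding a new skill later is just a new row in skills, and a candidate can have as many rows in the link table as needed.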

How best to implement 3-dimensional SQL relationship?

This question is an extension of another question I asked regarding many-to-many relationships in MySQL.
I currently have 3 tables that I need to link with a 4th intermediary table:
Stores, Products, and States
My intermediary table, _stores_products_states, combines the id from the other three tables to determine which product is sold by which store and in which state.
Now, as I understand it, I would need to create an entry in _stores_products_states for every possible combination of the three, correct? This would lead to thousands of duplicated values in 1-2 of the columns (though never all 3).
For example, if Best Guy sells both GI Bros and Darbies in all 50 states, that would be 100 entries just for those two products. If those products are sold by another store, they too would have 100 entries.
Is this the correct way to implement this kind of relationship?
EDIT:
This whole setup is basically just to determine the availability of a particular product. A user will search for a product and receive a list of stores that sell that product in their state.
The 4th table is the way!
So if I got it right, your '_stores_products_states' table could even be called 'sale'.
You do not need to create a record for all possible combinations of product, state, and store. You only need to create a record for existing combinations, that is, availability of a product in a store in a state (maybe with things like local price and quantity bolted on).
You will have to store this information one way or another; a three-way link table, especially one whose composite primary key serves as the clustered index (as it does in MySQL's InnoDB), would be a pretty standard solution, with good performance characteristics.
One thing I wonder about is why you have stores separate from states. I'd expect a store to be associated with a state. With the 3-relation link table, you'd be able to associate the same store with a product in several different states. Is this what your business domain supposes?
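To make the "only existing combinations" point concrete, here is a small sketch. It uses Python's sqlite3 rather than MySQL so it can be run directly; the table name availability, the column names, the second store 'Corner Toys' and the helper stores_selling are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stores   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE states   (id INTEGER PRIMARY KEY, code TEXT NOT NULL UNIQUE);
    -- One row per combination that actually exists, nothing else.
    -- The key starts with product and state because that is what users search by.
    CREATE TABLE availability (
        product_id INTEGER NOT NULL REFERENCES products(id),
        state_id   INTEGER NOT NULL REFERENCES states(id),
        store_id   INTEGER NOT NULL REFERENCES stores(id),
        PRIMARY KEY (product_id, state_id, store_id)
    );
    INSERT INTO stores   VALUES (1, 'Best Guy'), (2, 'Corner Toys');
    INSERT INTO products VALUES (1, 'GI Bros'), (2, 'Darbies');
    INSERT INTO states   VALUES (1, 'CA'), (2, 'NY');
    INSERT INTO availability VALUES
        (1, 1, 1),  -- GI Bros, CA, Best Guy
        (1, 2, 1),  -- GI Bros, NY, Best Guy
        (2, 1, 1),  -- Darbies, CA, Best Guy
        (1, 1, 2);  -- GI Bros, CA, Corner Toys
""")

def stores_selling(conn, product_name, state_code):
    """List the stores that sell a product in a given state."""
    rows = conn.execute(
        """
        SELECT st.name
        FROM availability a
        JOIN products p ON p.id = a.product_id
        JOIN states   s ON s.id = a.state_id
        JOIN stores  st ON st.id = a.store_id
        WHERE p.name = ? AND s.code = ?
        ORDER BY st.name
        """,
        (product_name, state_code),
    ).fetchall()
    return [name for (name,) in rows]

print(stores_selling(conn, "GI Bros", "CA"))
print(stores_selling(conn, "Darbies", "NY"))
```

Darbies in NY has no row at all, so the lookup simply comes back empty; there is no need to store "not available" combinations.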

Creating the right Database table structure for address change + pricing

I'm a little stumped on whether I can make this process of changing addresses easier. I'll explain the situation:
Basically I have three entities: Students, Addresses, StudentsAddresses. Students have many addresses, since they can change a lot and rapidly (especially foster kids / homeless kids), so I'll be changing them a lot. However, for each address I want a user to attach (enter via the UI) the price it would cost to pick that student up via bus service. So my initial thought was: OK, let me attach a column to my join table 'StudentsAddresses' called 'dailyPrice', which is the cost for each day a student is picked up, and another column called 'adjustmentPrice', which is an additional cost for whatever special circumstance requires extra work to pick up a student. Is my thinking going to cause me problems the more students I have in the future? Will it get harder to manage?
Another option I thought about, was creating a new Table called Pricing. And another join-type table called StudentsAddressesPricing
StudentsAddressPricing has three columns,
studentId
addressId
pricingId
Each field connects the three together. So if I ever needed Students, with their addresses, and the pricing, I would query this table and eager load Students, Addresses, and Pricing. Does this approach seem much cleaner since I've abstracted pricing out a bit? I'm trying to determine the best way to go about this without having too many headaches in the future in case I want to add more pricing-related or address-related attributes.
And then I even thought, hey, what if pricing is just different for one day? How would I even handle that? Would I need a different kind of entity for it? Is doing a lot of joins going to hurt my application performance?
Just looking for some insight on how others would do it, and criticism on why I'm off the ball.
The main question you should ask yourself is: on what does the price depend?
If the price is determined by the address, you might as well add it to addresses. If the price also depends on the student (e.g., depending on their financial situation), it would make sense to put it into studentsaddresses.
In other words: The table where the price is stored should have foreign keys to everything outside the table that determines the price. If that makes it fit into one of the existing tables, keep it there.
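As a sketch of the case where the price depends on both the student and the address, so it lives in the join table: this uses Python's sqlite3 to stay runnable, and the snake_case names, the prices stored as integer cents, and the helper pickup_cost_cents are assumptions for the example, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
    CREATE TABLE students  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE addresses (id INTEGER PRIMARY KEY, street TEXT NOT NULL);
    -- The price row points at everything that determines it.
    CREATE TABLE students_addresses (
        student_id             INTEGER NOT NULL REFERENCES students(id),
        address_id             INTEGER NOT NULL REFERENCES addresses(id),
        daily_price_cents      INTEGER NOT NULL,
        adjustment_price_cents INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (student_id, address_id)
    );
    INSERT INTO students  VALUES (1, 'Sam');
    INSERT INTO addresses VALUES (1, '1 Main St'), (2, '9 Hill Rd');
    INSERT INTO students_addresses VALUES (1, 1, 1500, 0), (1, 2, 1500, 250);
""")

def pickup_cost_cents(conn, student_id, address_id):
    """Daily price plus adjustment for one student at one address, or None."""
    row = conn.execute(
        "SELECT daily_price_cents + adjustment_price_cents "
        "FROM students_addresses WHERE student_id = ? AND address_id = ?",
        (student_id, address_id),
    ).fetchone()
    return None if row is None else row[0]

# A price for a student that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO students_addresses VALUES (99, 1, 1000, 0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(pickup_cost_cents(conn, 1, 2), fk_rejected)
```

If the price turned out to depend on the address alone, the two price columns would move to addresses and the join table would go back to being a plain link.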

creating user profiles, each with personal mysql data, using php

I'm trying to figure out the best practices for storing user data on a php/mysql site.
let's say the website will host a service of saving people's input for items they have in their house.
I have set up tables that includes: kitchen, bathroom, bedroom, etc.
Sally adds her 6 kitchen items.
John adds his 3 kitchen items.
etc.
I'm just wondering what the common practice is for storing this kind of per-user information in a MySQL database. I've taken a class on databases, so I'm thinking of linking relationally by foreign key: John with his items in the lists, and Sally with hers.
Does that sound about right, or is there a better way? I can see the list getting really large quite quickly.
Would it be possible to set up a different table for each user, or would that be silly?
I would not set up a table for each user.
Definitely go relational. I am not sure I follow you completely around "john with his items.." and so on. So I interpret this as
user table
room table
item table
relational user->item (id, user_id, item_id, room_id) OR:
relational item->room
So you can pull a user, list the rooms they have related to them, then list the items in that room. Additionally, like this you do not need a new item entry for common things like tables, stoves, spatulas, etc.
Your list could get large, but if you scale properly and plan a back end based update migration when you absolutely need to (like millions of users) then you should be fine. Consider how many relations sites like facebook and ebay have to maintain. Large relations are normal for databases so I wouldn't let a couple million rows scare you.
I would use three tables:
rooms (id, room), to store values kitchen, bathroom, bedroom, etc.
users
items: assuming you have a common structure for your current kitchen, bathroom, bedroom tables, one table could replace all of them. This table should also contain two foreign keys, user_id and room_id.
With that structure, you can easily retrieve and filter your data.
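Something like this, as a minimal sketch of the three-table version. It is written against Python's sqlite3 so it runs without a MySQL server (the question uses PHP/MySQL); the column names, the sample items and the helper items_for are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rooms (id INTEGER PRIMARY KEY, room TEXT NOT NULL UNIQUE);
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One items table replaces the separate kitchen/bathroom/bedroom tables.
    CREATE TABLE items (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        room_id INTEGER NOT NULL REFERENCES rooms(id),
        name    TEXT NOT NULL
    );
    INSERT INTO rooms VALUES (1, 'kitchen'), (2, 'bathroom'), (3, 'bedroom');
    INSERT INTO users VALUES (1, 'Sally'), (2, 'John');
    INSERT INTO items (user_id, room_id, name) VALUES
        (1, 1, 'kettle'), (1, 1, 'toaster'), (1, 2, 'mirror'),
        (2, 1, 'pan');
""")

def items_for(conn, user_id, room_name):
    """Return one user's items in one room, sorted by name."""
    rows = conn.execute(
        """
        SELECT i.name
        FROM items i
        JOIN rooms r ON r.id = i.room_id
        WHERE i.user_id = ? AND r.room = ?
        ORDER BY i.name
        """,
        (user_id, room_name),
    ).fetchall()
    return [name for (name,) in rows]

print(items_for(conn, 1, "kitchen"))
print(items_for(conn, 2, "bedroom"))
```

Every user shares the same tables; the user_id column is what keeps Sally's rows apart from John's, so there is no need for a table per user.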

Organizational chart represented in a table

I have an Access application, in which I have an employee table. The employees are part of several different levels in the organization. The organization has 1 GM, 5 department heads, and under each department head are several supervisors, and under those supervisors are the workers.
Depending on the position of the employee, they will only have access to records of those under them.
I wanted to represent the organization in a table with some sort of level system. The problem I saw with that was that there are many people on the same level (for example supervisors), but they shouldn't have access to the records of a supervisor in another department. How should I approach this problem?
One common way of keeping this kind of hierarchical data in a database uses only a single table, with fields something like this:
userId (primary key)
userName
supervisorId (self-referential "foreign key", refers to another userId in this same table)
positionCode (could be simple like 1=lackey, 2=supervisor; or a foreign key pointing to another table of positions and such)
...whatever else you need to store for each employee...
Then your app uses SQL queries to figure out permissions. To figure out the employees that supervisor 'X' (whose userId is '3', for example) is allowed to see, you query for all employees where supervisorId=3.
If you want higher-up bosses to be able to see everyone underneath them, the easiest way is just to do a recursive search. I.e. query for everyone that reports to this big boss, and for each of them query who reports to them, all the way down the tree.
Does that make sense? You let the database do the work of sorting through all the users, because computers are good at that kind of thing.
I put the positionCode in this example in case you wanted some people to have different permissions... for example, you might have a code '99' for HR employees who have the right to see the list of all employees.
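Here is a sketch of the "everyone underneath a boss" lookup on that single table. Note that it swaps in a different technique from the query-by-query loop described above: a recursive common table expression, which walks the whole tree in one statement on engines that support it (SQLite here, via Python's sqlite3, and also MySQL 8.0 or later). The snake_case column names, the sample org chart and the helper subordinates are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (
        user_id       INTEGER PRIMARY KEY,
        user_name     TEXT NOT NULL,
        supervisor_id INTEGER REFERENCES employees(user_id),  -- NULL for the GM
        position_code INTEGER NOT NULL
    );
    INSERT INTO employees VALUES
        (1, 'GM',           NULL, 4),
        (2, 'Head A',       1,    3),
        (3, 'Head B',       1,    3),
        (4, 'Supervisor A', 2,    2),
        (5, 'Worker A',     4,    1),
        (6, 'Supervisor B', 3,    2),
        (7, 'Worker B',     6,    1);
""")

def subordinates(conn, boss_id):
    """Return the user_ids of everyone below boss_id, at any depth."""
    rows = conn.execute(
        """
        WITH RECURSIVE reports(user_id) AS (
            -- direct reports of the boss
            SELECT user_id FROM employees WHERE supervisor_id = ?
            UNION
            -- then whoever reports to someone already found
            SELECT e.user_id
            FROM employees e
            JOIN reports r ON e.supervisor_id = r.user_id
        )
        SELECT user_id FROM reports ORDER BY user_id
        """,
        (boss_id,),
    ).fetchall()
    return [user_id for (user_id,) in rows]

print(subordinates(conn, 2))
print(subordinates(conn, 1))
```

Head A sees only the people in their own branch, which covers the case of supervisors in other departments. Using UNION rather than UNION ALL also means the query still terminates if a bad row ever creates a reporting loop.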
Maybe I'll let some other people try to explain it better...
Here's an article from Microsoft's Access Cookbook that explains these queries rather well.
And here is a somewhat chunky explanation of the same.
Here's a completely different method (the "adjacency list model") that you might find useful, and his explanation is pretty good. He also points out some difficulties with both methods (when he talks about the tables being "denormalized").