School Site SQL Design - mysql

I was just handed a project to modify a school site for class registration. The current system is designed for each class to run one time per week, but I need to change things so that a class can occur on one or more nights per week. I am struggling with finding an efficient method to relate each of the classes, and allow the customers to view and select a single class when registering for the series.
My first thought is to add another field (groupid) that holds a unique value to tie corresponding classes together. Writing queries that sort this way is difficult: if I sort by day-of-week followed by groupid (for display, class selection, etc.), the classes in a group get separated. Sorting by groupid and then day-of-week produces a non-chronological order, which doesn't work either. Is there a way to keep the classes of a group together after sorting, without affecting the date order?
My second thought was to modify the table to support multiple classes per row. This would be the easier method, but less flexible, and even more problematic if the classes don't run at the same time of the week.
Anyway, I'm a little lost, and would appreciate any feedback on design, and/or a query to help with my sort problem.
Thanks!

A class is a single entity regardless of how many days it meets in a given week. Create a Schedule table. It would include a FK_ClassID and a ScheduleDate. If the class meets three days in a week, it would have three records. This way a student can register for multiple classes, and you can check that they do not overlap on the same day of the week.
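To make the Schedule idea concrete, here is a minimal sketch, not a definitive design. It uses Python's built-in sqlite3 only so it can run anywhere; the SELECT itself should carry over to MySQL. The Class table, the Title column, and the sample rows are assumptions for illustration; only FK_ClassID and ScheduleDate come from the answer above. The query orders classes by their first meeting and keeps each class's meetings together, which is the sort the question asks about.

```python
import sqlite3

# Hypothetical schema: one row per class, one Schedule row per meeting.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Class (
    ClassID INTEGER PRIMARY KEY,
    Title   TEXT NOT NULL
);
CREATE TABLE Schedule (
    ScheduleID   INTEGER PRIMARY KEY,
    FK_ClassID   INTEGER NOT NULL REFERENCES Class(ClassID),
    ScheduleDate TEXT NOT NULL          -- 'YYYY-MM-DD HH:MM', so text order is date order
);
INSERT INTO Class (ClassID, Title) VALUES (1, 'Pottery'), (2, 'Spanish');
INSERT INTO Schedule (FK_ClassID, ScheduleDate) VALUES
    (1, '2024-01-09 18:00'), (1, '2024-01-11 18:00'),
    (2, '2024-01-08 19:00'), (2, '2024-01-10 19:00');
""")

# Sort classes by their earliest meeting, then list each class's meetings in order.
rows = conn.execute("""
SELECT c.Title, s.ScheduleDate
FROM Schedule AS s
JOIN Class AS c ON c.ClassID = s.FK_ClassID
JOIN (SELECT FK_ClassID, MIN(ScheduleDate) AS FirstMeeting
      FROM Schedule
      GROUP BY FK_ClassID) AS f ON f.FK_ClassID = s.FK_ClassID
ORDER BY f.FirstMeeting, s.FK_ClassID, s.ScheduleDate
""").fetchall()

for title, when in rows:
    print(title, when)
```

Sorting by ScheduleDate alone would interleave the two classes; sorting by each class's first meeting keeps a class's rows adjacent while still listing classes chronologically.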

Related

Database Design for sub columns or many to many relations

I have a list of theatres, and in each theatre there are several classes of tickets, e.g. Rs.120, Rs.100, etc. These classes apply to the morning, noon and night shows, so all the classes of tickets are available for all the shows (a many-to-many relationship). I need to model this as a database. I am having trouble modelling the classes and show timings without making the database redundant.
[Image: Input Excel data]
A good rule of thumb: when you hit redundant data, make a new table.
Here is how I would break it down, though you could break it down further (also look up the term "normalization"):
Tables:
theater_tbl
ticket_tbl
classes_tbl
Relate each ticket to a class; each theater may sell one or more tickets of any given class.
Information such as the theater's address goes in theater_tbl.
Ticket pricing would go in the ticket table under the type of ticket, unless I misunderstand what a class of ticket is, in which case the pricing should go in classes_tbl.
The time of day a ticket relates to should go in the ticket table.
This should get you started. To go further, you could break down show times into another table, and relate classes/tickets to those show times.
It's hard to draw out without a solid example.
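For what it's worth, the breakdown above could be sketched like this. It is only one reading of the question: it assumes the price belongs to the ticket class, and showtime_tbl plus every column name are made up for illustration. sqlite3 is used just to make the sketch runnable. Because every class is sold for every show, the class/show combinations are produced by a join rather than stored as redundant rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theater_tbl (
    theater_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    address    TEXT
);
CREATE TABLE classes_tbl (              -- assumed: a ticket class carries the price
    class_id   INTEGER PRIMARY KEY,
    theater_id INTEGER NOT NULL REFERENCES theater_tbl(theater_id),
    price_rs   INTEGER NOT NULL
);
CREATE TABLE showtime_tbl (             -- hypothetical table for show timings
    showtime_id INTEGER PRIMARY KEY,
    theater_id  INTEGER NOT NULL REFERENCES theater_tbl(theater_id),
    slot        TEXT NOT NULL
);
CREATE TABLE ticket_tbl (               -- one row per ticket actually sold
    ticket_id   INTEGER PRIMARY KEY,
    class_id    INTEGER NOT NULL REFERENCES classes_tbl(class_id),
    showtime_id INTEGER NOT NULL REFERENCES showtime_tbl(showtime_id)
);
INSERT INTO theater_tbl VALUES (1, 'Example Theatre', NULL);
INSERT INTO classes_tbl VALUES (1, 1, 120), (2, 1, 100);
INSERT INTO showtime_tbl VALUES (1, 1, 'morning'), (2, 1, 'noon'), (3, 1, 'night');
""")

# Every class is on offer for every show of the same theater.
offers = conn.execute("""
SELECT s.slot, c.price_rs
FROM showtime_tbl AS s
JOIN classes_tbl AS c ON c.theater_id = s.theater_id
WHERE s.theater_id = 1
ORDER BY s.showtime_id, c.price_rs DESC
""").fetchall()

for slot, price in offers:
    print(slot, price)
```

Two classes and three shows are stored as five rows in total, yet the query lists all six combinations.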

having trouble normalizing this database

Currently, I have 48 fields.
I'm completely new to Access. This is how I decided to connect everything together.
It doesn't seem to be very effective. Could somebody help me understand how to normalize this database?
Should I try to put employee information in one table, job information in another table and then have an equipment lookup table?
The current job, last job, and previous job can all be in the SAME table. If you sort this table by descending job start date, then you have current, last and previous. You thus don’t need nor want a separate table for each of these, since they all amount to the concept of a “job”. If sorting by date is not enough, then you could add a column called Job Type (current, previous, etc.). Again, we are still only using the one table.
The same goes for Equipment. You really don’t care if the limit is the last 3 or the last 300. By building a normalized table, ONE form can edit all types, and you save MASSIVE amounts of coding: fewer tables to build, less user interface software, and fewer queries to retrieve and show the last 3 jobs in a form.
The fact that this design, with FAR LESS cost of development, allows 3 or 300 last jobs is really moot. More important, if some manager comes along and now wants you to save the last 4 jobs, you don’t have a massive re-design on your hands. And you can add new job types on the fly. So in place of current and, say, previous, you can also have un-completed or failed jobs. Adding new business rules again means you don’t add a new type of job table, but only a new “type” value in the one column you are already using to define the job as current or previous.
Identify like objects and make one table to store all of them. In your design you have three tables for equipment but each item of equipment has the same fields; they should be one table. Similarly for jobs, each job is pretty much the same; they should be one table. The same for departments.
Figure out one or more columns in each table that can uniquely identify the row in the table (that is, if you know the values for those columns it is impossible for there ever to be two rows with those values). These are your primary keys for your tables.
Identify cases in which an item in one table needs to "point to" (refer to) an item in another table. In this case, make sure that the referring table has a set of columns that match the referred-to table.
When you've done that, you'll have the beginnings of a correctly factored relational database design.
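As a rough illustration of the "one table for all jobs" advice, here is a minimal sketch. The table and column names are invented, and sqlite3 stands in for Access only so the example runs; in Access the query would use TOP n instead of LIMIT. The point is that "current", "last" and "previous" fall out of a sort on the start date, and that showing 3 jobs or 4 is a parameter, not a redesign.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE job (
    job_id      INTEGER PRIMARY KEY,
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
    title       TEXT NOT NULL,
    start_date  TEXT NOT NULL,   -- ISO 'YYYY-MM-DD', so text order is date order
    job_type    TEXT             -- optional label: 'current', 'previous', 'failed', ...
);
INSERT INTO employee VALUES (1, 'Example Employee');
INSERT INTO job (employee_id, title, start_date, job_type) VALUES
    (1, 'Site survey',   '2021-03-01', 'previous'),
    (1, 'Excavation',    '2022-06-15', 'previous'),
    (1, 'Paving',        '2023-01-10', 'previous'),
    (1, 'Bridge repair', '2024-02-20', 'current');
""")

def last_jobs(employee_id, how_many):
    """Return the titles of an employee's most recent jobs, newest first."""
    cursor = conn.execute(
        "SELECT title FROM job WHERE employee_id = ? "
        "ORDER BY start_date DESC LIMIT ?",
        (employee_id, how_many),
    )
    return [row[0] for row in cursor]

print(last_jobs(1, 3))  # ['Bridge repair', 'Paving', 'Excavation']
```

If a manager later wants the last 4 jobs, the call becomes last_jobs(1, 4) and nothing in the schema changes.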

Database model for a 24/7 Staff roster at a casino

We presently use a pen-and-paper roster to manage table games staff at the casino. Each row is an employee, each column is a 20-minute block of time, and each cell shows which table the employee is assigned to, or that they have been assigned a break. The start and end times of shifts vary between employees, as do the games/skills they can deal. We need to keep a copy of the rosters for 7 years, which is fairly easy with paper. I want to develop a digital application and am having difficulty working out how to store the data in a database for archiving.
I'm fairly new to working with databases, I think I understand how to model the data for a graph database like neo4j, but I had difficulty when it came to working with time. I've tried to learn about RDBMS databases like MySQL, below is how I think the data should be modelled. Please point out if I'm going in the wrong direction or if a different database type would be more appropriate, it would be greatly appreciated!
Basic Data
Here is some basic data to work with before we factor in scheduling/time.
Employee
- ID Number
- Name
- Skills (Blackjack, Baccarat, Roulette, etc)
Table
- ID Number
- Skill/Type (Can only be one skill)
It may be better to store the roster data as a file like JSON instead? Time sensitive data wouldn't be so much of a problem then. The benefit of going digital with a database would be queries, these could help assist time consuming tasks where human error is common.
Possible Queries
Note: Staff that are on shift are either on a break or on the floor (assigned to a table), Skills have a major or minor type based on difficulty to learn.
What staff have been on the floor for 80 minutes or more? (They are due for a break)
What open tables can I assign this employee to based on their skillset?
I need an employee who has the Baccarat skill but has not already been assigned to a Baccarat table.
Which employee(s) were on this table during this period of time?
Where was this employee at this point in time?
Who is on shift right now?
How many staff on shift can deal Blackjack?
How many staff have 3 major skills?
What staff have had the Baccarat skill for at least 3 months?
These queries could also be sorted by alphabetical order or time, skill etc.
I'm pretty sure I know how to perform these queries with cypher for neo4j provided I model the data right. I'm not as knowledgeable with SQL queries, I've read it can get a bit complicated depending on the query and structure.
----------------------------------------------------------------------------------------
MYSQL Specific
An employee table could contain properties such as their ID number and Name, but am I right that their skills and shifts would be separate tables that reference the employee by a unique integer (I think this is called a foreign key)?
Another table could store the gaming Tables, these would have their own ID and reference a skill/gametype with a foreign key.
To record data like the pen/paper roster, each day could have a table with columns starting from 0000 increasing by 20 in value going all the way to 2340? Prior to the time columns I could have one for staff where each employee is represented with their foreign key, the time columns would then have foreign keys to the assigned gaming Tables, the row data is bound to have many cells that aren't populated since the employee shift won't be 24/7. If I'm using foreign keys to reference gaming Tables I now have a problem when the employee is on break? Unless I treat say the first gaming Table entry as a break?
I may need to further complicate things though, management will over time try different gaming Table layouts, some of the gaming Tables can be converted from say Blackjack to Baccarat. this is bound to happen quite a bit over 7 years, would I want to be creating new gaming Table entries or add a column to use a foreign key and refer to a new table that stores the history of game types during periods of time? Employees will also learn to deal new games during their career, very rarely they may also have the skill removed.
----------------------------------------------------------------------------------------
Neo4j Specific
With this data would I have an Employee and a Table node that have "isA" relationship edges mapping to actual employees or tables?
I imagine with the skills for the two types I would be best with a Skill node and establish relationships like so?: Blackjack->isA->Skill, Employee->hasSkill->Blackjack, Table->typeIs->Blackjack?
TIME
I find difficulty when I want this database to now work with a timeline. I've come across the following suggestions for connecting nodes with time:
Unix Epoch seems to be a common recommendation?
Connecting nodes to a year/month/day graph?
Lucene timeline? (I don't know much about this or how to work with it, have seen some mention it)
And some cases with how time and data relate:
Staff have varied days and start/end times from week to week, this could be shift node with properties {shiftStart,shiftEnd,actualStart,actualEnd}, staff may arrive late or get sick during shift. Would this be the right way to link each shift to an employee? Employee(node)->Shifts(groupNode)->Shift(node)
Tables and Staff may have skill data modified, with archived data this could be an issue, I think the solution is to have time property on the relationship to the skill?
We open and close tables throughout the day, each table has open/close times for each day, this could change in a month depending on what management wants, in addition the times are not strict, for various reasons a manager may open or close tables during the shift. The open/closed status of a table node may only be relevant for queries during the shift, which confuses me as I'd want this for queries but for archiving with time it might not make sense?
It's with queries that I have trouble deciding when to use a node or add a property to a node. An Employee has a name and an ID number; if I wanted to find an employee by their ID number, would it be better to have that as a node of its own? It would be more direct, right, instead of going through all employees for that unique ID number?
I've also come across labels just recently, I can understand that those would be useful for typing employee and table nodes rather than grouping them under a node. With the shifts for an employee I think should continue to be grouped with a shifts node, If I were to do cypher queries for employees working shifts through a time period a label might be appropriate, however should it be applied to individual shift nodes or the shifts group node that links back to the employee? I might need to add a property to individual shift nodes or the relationship to the shifts group node? I'm not sure if there should be a shifts group node, I'm assuming that reducing the edges connecting to the employee node would be optimal for queries.
----------------------------------------------------------------------------------------
If there are any great resources I can learn about database development that'd be great, there is so much information and options out there it's difficult to know what to begin with. Thanks for your time :)
Thanks for spending the time to put a quality question together. Your requirements are great and your specifications of your system are very detailed. I was able to translate your specs into a graph data model for Neo4j. See below.
Above you'll see a fairly self-explanatory graph data model. In case you are unfamiliar with this, I suggest reading Graph Databases: http://graphdatabases.com/ -- From this website you can get a free digital PDF copy of the book, and if you want a hard copy you can find it on Amazon.
Let's break down the graph model in the image. At the top you'll see a time indexing structure that is (Year)->(Month)->(Day)->(Hour), which I have abbreviated as Y M D H. The ellipses indicate that the graph continues, but for the sake of space on the screen I've only shown a sub-graph.
This time index gives you a way to generate time series or ask certain questions on your data model that are time specific. Very useful.
The bottom portion of the image contains your enterprise data model for your casino. The nodes represent your business objects:
Game
Table
Employee
Skill
What's great about graph databases is that you can look at this image and semantically understand the language of your question by jumping from one node to another by their relationships.
Here is a Cypher query you can use to ask your questions about the data model. You can just tweak it slightly to match your questions.
MATCH (employee:Employee)-[:HAS_SKILL]->(skill:Skill),
(employee)<-[:DEALS]-(game:Game)-[:LOCATION]->(table:Table),
(game)-[:BEGINS]->(hour:H)<-[*]-(day:D)<-[*]-(month:M)<-[*]-(year:Y)
WHERE skill.type = "Blackjack" AND
day.day = 17 AND
month.month = 1 AND
year.year = 2014
RETURN employee, skill, game, table
The above query finds the sub-graph for all employees who have the skill Blackjack and their table and location on a specific date (1/17/14).
To do this in SQL would be very difficult. The next thing you need to think about is importing your data into a Neo4j database. If you're curious about how to do that, please look at other questions here on SO, and if you need more help, feel free to post another question or reach out to me on Twitter @kennybastani.
Cheers,
Kenny

Better to add separate tables for multiple users?

I am relatively new to database design, so I am still learning a lot. I am working on an online time card clock, mostly to learn more. My full-time job is operating heavy equipment for my uncle, and he has described some headaches to me. When he goes over time cards, several employees' handwriting is hard to read, several employees don't add up their hours correctly so he always has to double-check their math, and some people don't hand in their time sheets on time. Most of the employees have smartphones, so my solution is a simple website with buttons for "Clocking in" and "Clocking out". It would also contain several text fields to describe what the employee operated that day and the job site they were on. All of this will be added to a database, and the data will be emailed to the boss at the end of the work week. My question is: what would be the best way to set up a database for this? Should I add a separate table for each employee or keep it all in one table? About 20 employees will use the site. Thanks in advance for any help.
General database principles:
Think about object orientation. Classes of objects.
An "Employee" is one such class, therefore you should have one table that stores employees.
An "Event" such as clocking in or clocking out is a general class of two specific cases, e.g. ClockIn and ClockOut.
You could consider one table to store an Event, with a field for the date and time of the event, one field for the employee (a foreign key), and one field indicating whether it's in or out.
You could alternatively consider one table for ClockIn and one for ClockOut, but this may not be advantageous depending on how you wish to scan the data later when printing reports. I'd personally recommend against this approach; I just point out that it's an option.
Ideally, every table should have a numeric primary key.
Think of key-value pairs
Employee
1 Jon Doe
2 Juan Gomez
etc...
Event
1 2012-11-29 08:59 Clock In 2
This translates to Juan Gomez clocking in today just before 9am
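Here is a minimal sketch of the Employee/Event layout described above, with a query that totals hours for the weekly report. Treat it as an illustration under assumptions: the column names and the 'in'/'out' encoding are invented, and it runs on Python's built-in sqlite3. strftime('%s', ...) is SQLite-specific; in MySQL the same difference would be computed with something like TIMESTAMPDIFF. Each clock-in is paired with the next clock-out by the same employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE event (
    event_id    INTEGER PRIMARY KEY,
    event_time  TEXT NOT NULL,                         -- 'YYYY-MM-DD HH:MM'
    kind        TEXT NOT NULL CHECK (kind IN ('in', 'out')),
    employee_id INTEGER NOT NULL REFERENCES employee(employee_id)
);
INSERT INTO employee VALUES (1, 'Jon Doe'), (2, 'Juan Gomez');
INSERT INTO event (event_time, kind, employee_id) VALUES
    ('2012-11-29 08:59', 'in',  2),
    ('2012-11-29 17:29', 'out', 2),
    ('2012-11-29 08:00', 'in',  1),
    ('2012-11-29 12:00', 'out', 1),
    ('2012-11-29 13:00', 'in',  1),
    ('2012-11-29 16:30', 'out', 1);
""")

# For every clock-in, find the next clock-out of the same employee and
# add up the elapsed seconds; an unmatched clock-in contributes nothing.
hours = conn.execute("""
SELECT e.name,
       SUM(strftime('%s', (SELECT MIN(o.event_time)
                           FROM event AS o
                           WHERE o.employee_id = i.employee_id
                             AND o.kind = 'out'
                             AND o.event_time > i.event_time))
           - strftime('%s', i.event_time)) / 3600.0 AS hours_worked
FROM event AS i
JOIN employee AS e ON e.employee_id = i.employee_id
WHERE i.kind = 'in'
GROUP BY e.employee_id, e.name
ORDER BY e.employee_id
""").fetchall()

for name, worked in hours:
    print(name, worked)
```

With all 20 employees in the same two tables, the weekly email is a single query with a date filter rather than one query per employee table.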

Retention Tracking

Let’s say I have an Angry Birds game.
I want to know how many players buy the ‘mighty eagle’ weapon each month, out of the players who bought the mighty eagle weapon in the previous months of their LTV in the system.
I have the dates of all items bought per each client.
What I would practically like to have is a two-dimensional matrix that tells me what percentage of the players moved from LTV_month_X to LTV_month_Y, for each combination of X<Y, for a specific current month.
An example:
example_png
(It wouldn't let me put the picture inline, so please follow the link to see it.)
Now, I have found a way to get the number of players who actually moved from LTV_month_X to LTV_month_Y, where LTV_month_Y is their current month of activity within the system, using an SQL query and an Excel pivot table.
What I am mainly trying to find out is how to get the base number of those who potentially could make that transition.
A few definitions:
LTV_month_X = DATEDIFF(MONTH, first_eagle_month, specific_eagle_month)+1
Preferably I would like the solution in ANSI SQL; if not, then MySQL or MSSQL, but no Oracle functions should be used at all.
Since I’m looking for the percentage of the transition, a two-step plan could also work: first find the potential ones, and then find the actual ones who moved, to measure the retention from LTV_month_X to LTV_month_Y.
One last issue: I need for it to be possible to drill down and find the actual IDs of the clients who moved from any stage X to any other stage Y (>X).
The use of the term LTV here is not clear. I assume you mean the lifetime of the user.
If I understand the question, you are asking, based on a list of entities each with one or more events, how do I group (e.g. count) the entities by the month of the last event and the month of the one before last event.
In MySQL, you can use a variable to do that. I'm not going to explain the whole concept, but basically, when you write @var:=column within a SELECT statement, the variable is assigned the value of that column, and you can use it to compare values between consecutive rows, e.g.
LEAST(IF(@var=column,@same:=@same+1,@same:=0),@var:=column)
The use of LEAST is a trick to ensure execution order.
The two dimensions you are looking for are:
Actual purchase month
Relative purchase month
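SELECT
player_id,
-- DATE_FORMAT and PERIOD_DIFF are the MySQL counterparts of month truncation and month difference
DATE_FORMAT(first_purchase_date, '%Y-%m-01') AS first_month,
DATE_FORMAT(current_purchase_date, '%Y-%m-01') AS purchase_month,
PERIOD_DIFF(EXTRACT(YEAR_MONTH FROM current_purchase_date), EXTRACT(YEAR_MONTH FROM first_purchase_date))+1 AS relative_month,
SUM(purchase_amount) AS total_purchase,
COUNT(DISTINCT player_id) AS player_count
FROM ...
GROUP BY player_id, first_month, purchase_month, relative_month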
Now you can pivot purchase month against relative month and aggregate.
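To show what the pivot input might look like end to end, here is a small runnable sketch. It is an illustration under assumptions, not the exact query above: the purchase table, its columns, and the sample rows are invented, and it uses Python's sqlite3, so the month arithmetic is written with SQLite's strftime. relative_month follows the question's definition (months since the first eagle purchase, plus one).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchase (          -- hypothetical: one row per 'mighty eagle' purchase
    player_id     INTEGER NOT NULL,
    purchase_date TEXT NOT NULL  -- 'YYYY-MM-DD'
);
INSERT INTO purchase VALUES
    (1, '2014-01-05'), (1, '2014-02-10'), (1, '2014-03-01'),
    (2, '2014-01-20'), (2, '2014-03-15'),
    (3, '2014-02-02'), (3, '2014-03-03');
""")

# Count distinct players per (calendar month, month relative to first purchase).
cells = conn.execute("""
WITH first_buy AS (
    SELECT player_id, MIN(purchase_date) AS first_date
    FROM purchase
    GROUP BY player_id
),
rel AS (
    SELECT p.player_id,
           strftime('%Y-%m', p.purchase_date) AS purchase_month,
           (CAST(strftime('%Y', p.purchase_date) AS INTEGER) * 12
              + CAST(strftime('%m', p.purchase_date) AS INTEGER))
         - (CAST(strftime('%Y', f.first_date) AS INTEGER) * 12
              + CAST(strftime('%m', f.first_date) AS INTEGER)) + 1 AS relative_month
    FROM purchase AS p
    JOIN first_buy AS f ON f.player_id = p.player_id
)
SELECT purchase_month, relative_month, COUNT(DISTINCT player_id) AS player_count
FROM rel
GROUP BY purchase_month, relative_month
ORDER BY purchase_month, relative_month
""").fetchall()

for purchase_month, relative_month, player_count in cells:
    print(purchase_month, relative_month, player_count)
```

Each output row is one cell of the matrix. Selecting player_id from the rel step instead of counting gives the drill-down list of client IDs for any cell.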