Access Database theory/design - ms-access

I'm working on building a database for tracking details of a large collection (40k) of architectural drawings. One of the details we want to track about each drawing is the floor numbers represented. Since some drawings have multiple floors there needs to be a way to account for this. My database theory isn't top rate so Im not sure if my solution is the logical way to go.
My solution would be to create a one to many relationship between the drawings primary key and a new Floors table with a record for every floor, using a third table of Floor Names as a lookup source for consistency. I.E. if a drawing represented floors 1 and 2 there would be two records in my floors table related to the drawings primary ID.
I had considered a one to one relationship and boolean fields for every floor, but since some buildings have upwards of 20 floors there would be a lot of fields, and many would be NULL as very few buildings have more than 5 floors.
Is my approach here a suitable solution? Thanks!

To clarify you want to record floor(s)and name? I was in construction for many years. I would make a few new tables.
tblProject
Project#(pk), ProjectName, ...(project details)
tblSheetDetail
SHEETID(seq), Sheet#(A1), Project#(fk), SheetTitle,
tblTrades
TradeID, TradeName, SpecCode
tblFloor
FloorID, Floor#,
tblTradesSheetDetail
TSDetailID,SheetID(fk), TradeID(fk),
tblFloorSheetDetail
FSDetailID, SheetID(fk), FloorID(fk)
Now you can pull data almost anyway you could imagine. We found the TradeSheet detail table most helpful. You might also consider a table to record revisions to each sheet.
Good luck!

Related

Better way to organize lots of columns and data?

I'm creating a real-estate website and i was wondering if there was a better way of organizing my columns or tables, not sure what would be the best way to go about it, i currently have a lot of columns and im worried about performance issues.
The columns are as follows
5 for things like property id, add date, duration, owner/user id.
35 columns for things like title, description, price, energy rating, location, etc.
40 columns for features like swimming-pool, central heating, river front, garage, well, etc.
15 for image locations which are stored on server
15 for the image descriptions
Is 110+ columns bad practice in MySQL? Everything is lightning fast but i'm in localhost at the mo, wont the monstrous size of the tables slow queries? Especially if I have a couple hundred properties?
Am i ok with my current setup? What would best practice be? How do e-commerce websites that have many feature options go about this?
It is not a good practice since the data can be stored in separate tables. What would help you most would be to create an ERD to visualize how you can organize your tables. Even if you do not understand the ins and outs of ERDs, you can still use it to at least organize your thoughts.
It seems that you already have your tables separated based on the bullet points that you made within your question. One thing that I would add to your bullets is maybe breaking down your features into categories and creating a table for each.
For example, swimming pools and riverfront can be placed in a table called
LandscapeFeatures or OutdoorFeatures.
Most likely, the property features would be better stored in a separate table, with one row per property feature, rather than as columns in your main table. I understand this as a many-to-many relationship between a propery and its features, so this suggest two more tables:
properties (property_id (pk), date_added, title, description)
features (feature_id (pk), description)
property_features (property_id (fk), feature_id (fk))
Such structure is much more flexible and easier to query than having one column per feature. As examples:
easy to add features by creating new rows in the features table (while in the old structure you had to create a new column)
easy to aggregate the features, and answer a question like: count how many features each property has
As for images, they should have their ow table too. If an image maby belong to several user, then it's a many-to-many relationship, and you can follow the above pattern. If each image belongs to a single user, one more table is enough:
properties (property_id (pk), date_added, title, description)
features (feature_id (pk), description)
property_features (property_id (fk), feature_id (fk))
images (image_id, location, description, property_id (fk))
One table:
Columns for the dozen or so values that you are most likely to search on.
Devise several composite indexes that involve those columns, starting with the more commonly searched columns.
Devise a TEXT column and put "words" in it for a FULLTEXT index. If this is home sales, consider words like "swimming pool septic tank gazebo Eichler". This will help with certain "boolean" type queries. (If you like this idea, let's discuss how to make use of filtering with indexes and/or fulltext; it gets tricky.)
Put the rest into a JSON (or TEXT column). Do not plan on searching it; instead bring the row(s) into your app code for further filtering after searching by the actual INDEXes

Split similar data into two tables?

I have two sets of data that are near identical, one set for books, the other for movies.
So we have things such as:
Title
Price
Image
Release Date
Published
etc.
The only difference between the two sets of data is that Books have an ISBN field and Movies has a Budget field.
My question is, even though the data is similar should both be combined into one table or should they be two separate tables?
I've looked on SO at similar questions but am asking because most of the time my application will need to get a single list of both books and movies. It would be rare to get either books or movies. So I would need to lookup two tables for most queries if the data is split into two tables.
Doing this -- cataloging books and movies -- perfectly is the work of several lifetimes. Don't strive for perfection, because you'll likely never get there. Take a look at Worldcat.org for excellent cataloging examples. Just two:
https://www.worldcat.org/title/coco/oclc/1149151811
https://www.worldcat.org/title/designing-data-intensive-applications-the-big-ideas-behind-reliable-scalable-and-maintainable-systems/oclc/1042165662
My suggestion: Add a table called metadata. your titles table should have a one-to-many relationship with your metadata table.
Then, for example, titles might contain
title_id title price release
103 Designing Data-Intensive Applications 34.96 2017
104 Coco 34.12 2107
Then metadata might contain
metadata_id title_id key value
1 103 ISBN-13 978-1449373320
2 103 ISBN-10 1449373320
3 104 budget USD175000000
4 104 EIDR 10.5240/EB14-C407-C74B-C870-B5B6-C
5 104 Sound Designer Barney Jones
Then, if you want to get items with their ISBN-13 values (I'm not familiar with IBAN, but I guess that's the same sort of thing) you do this
SELECT titles.*, isbn13.value isbn13
FROM titles
LEFT JOIN metadata isbn13 ON titles.title_id = metadata.title_id
AND metadata.key='ISBN-13'
This is a good way to go because it's future-proof. If somebody turns up tomorrow and wants, let's say, the name of the most important character in the book or movie, you can add it easily.
The only difference between the two sets of data is that Books have an
IBAN field and Movies has a Budget field.
Are you sure that this difference that you have now will not be
extended to other differences that you may have to take into account
in the future?
Are you sure that you will not have to deal with any other type of
entities (other than books and movies) in the future which will
complicate things?
If the answer in both questions is "Yes" then you could use 1 table.
But if I had to design this, I would keep a separate table for each entity.
If needed, it's easy to combine their data in a View.
What is not easy, is to add or modify columns in a table, even naming them, just to match requirements of 2 or more entities.
You must be very sure about future requests/features for your application.
I can't image what type of books linked with movies you store thus a lot of movies have different titles than books which are based on. Example: 25 films that changed the name.
If you are sure that your data will be persistent and always the same for books and movies then you can create new table for example Productions and there store attributes Title, Price, Image, Release Date, Published. Then you can store foreign keys of Production entity in your tables Books and Movies.
But if any accident happen in the future you will need to rebuild structure or change your assumptions. But anyway it will be easier with entity Production. Then you just create new row with modified values and assign to selected Book or Movie.
Solution with one table for both books and movies is the worst, because if one of the parameters drive away you will add new row and you will have data for first set (real book and non-existing movie) and second set (non-existing book and real movie).
Of course everything is under condition they may be changes in the future. If you are 100% sure, then 1 table is enough solution, but not correct from the database normalization perspective.
I would personally create separate tables for books and movies.

How best to implent 3-dimensional SQL relationship?

This question is an extension of another question I asked regarding many-to-many relationships in MySQL.
I currently have 3 tables that I need to link with a 4th intermediary table:
Stores, Products, and States
My intermediary table, _stores_products_states, combines the id from the other three tables to determine which product is sold by which store and in which state.
Now, as I understand it, I would need to create an entry in _stores_products_states for every possible combination of the three, correct? This would lead to thousands of duplicated values in 1-2 of the columns (though never all 3).
For example, if Best Guy sells both GI Bros and Darbies in all 50 states, that would be 100 entries just for those two products. If those products are sold by another store, they too would have 100 entries.
Is this the correct way to implement this kind of relationship?
EDIT:
This whole setup is basically just to determine the availability of a particular product. A user will search for a product and receive a list of stores that sell that product in their state.
The 4th table is the way!
So if I got it right, your '_stores_products_states' table could even be called sale
You do not need to create a record for all possible combinations of product, state, and store. You only need to create a record for existing combinations, that is, availability of a product in a store in a state (maybe with things like local price and quantity bolted on).
You will have to store this information one way or another; a 3-relation link table, especially stored as a clustered MySQL index, would be a pretty standard solution, with good performance characteristics.
One thing I wonder about is why you have stores separate from states. I'd expect a store to be associated with a state. With the 3-relation link table, you'd be able to associate the same store with a product in several different states. Is this what your business domain supposes?

How should I design MS Access 2010 tables for road construction projects?

New to database design and admittedly over my head. Trying to create a small database that will allow me to find state highway construction project information along any given road. Relationships include:
One contract number to one project; One county to many projects; One road to many projects; One road to many counties and one county to many roads; One manager to many projects; One contractor to many projects; One contact list (phone, email) to one manager; One date each (bid, start, complete) to many projects.
Would be a small database, maybe 500 records total. There are only 6 counties. Right now I've broken roads down into 6 separate "roads by county" tables so while the route number may be the same in different counties each record will be unique because it's in a separate county table. Is this OK or is it better to keep one roads table and assign county values there? I created other tables listing the counties, contracts, contractors, managers, dates and project description. Just don't know what to do with them.
My purpose is to be able to search, mostly by road number and keyword, to find what projects are on what road at any given time. I'd also like to update this info via forms. The data will change frequently and it's just a little too unruly for a spreadsheet. I simply can't wrap my head around how to setup and relate the tables and individual records. Any thoughts would be phenomenally appreciated.
I imagine a database with a simple structure.
The relations many-to-many normally should and can be avoided.
Access doesn't have a direct implementation of this relations. You need to create a cross table between the two main tables.
Anyway I created a database for projects more complex than this so I tried to accomplish your request.
The "core" table is tblProjects that refers for details to other tables.
Since projects are related to roads, I use road as the main item and a road can have a list of counties. If you want to know how many projects are in a county it can be simply done with a query looking for all the roads that have that County_ID.
If you want to look up for project for one road, simply find the road_ID (e.g. using a combo-box to select it) and you can filter (query) the tblProjects by Road_ID.
tblManagers
*IDManager
LastName
FirstName
eMail
Phone
Mobile
tblContractors
*IDContractor
ContractorName
Reference
tblCounties
*IDCounty
CountyName
Road_ID
tblRoads
*IDRoad
RoadNum
RoadName
tblProjects
*IDProject
ContractNum
ProjectName
StartDate
EndDate
BidDate
Contractor_ID
Road_ID
Manager_ID
The fields with * are the Key Fields. They are in relations with the _ID corresponding fields (check referential integrity and cascade deletion when you create the relation).
Let me know

Database model for a 24/7 Staff roster at a casino

We presently use a pen/paper based roster to manage table games staff at the casino. Each row is an employee, each column is a 20 minute block of time and each cell represents what table the employee is assigned to, or alternatively they've been assigned to a break. The start and end time of shifts for employees vary as do the games/skills they can deal. We need to keep a copy of the rosters for 7 years, with paper this is fairly easy, I'm wanting to develop a digital application and am having difficulty how to store the data in a database for archiving.
I'm fairly new to working with databases, I think I understand how to model the data for a graph database like neo4j, but I had difficulty when it came to working with time. I've tried to learn about RDBMS databases like MySQL, below is how I think the data should be modelled. Please point out if I'm going in the wrong direction or if a different database type would be more appropriate, it would be greatly appreciated!
Basic Data
Here is some basic data to work with before we factor in scheduling/time.
Employee
- ID Number
- Name
- Skills (Blackjack, Baccarat, Roulette, etc)
Table
- ID Number
- Skill/Type (Can only be one skill)
It may be better to store the roster data as a file like JSON instead? Time sensitive data wouldn't be so much of a problem then. The benefit of going digital with a database would be queries, these could help assist time consuming tasks where human error is common.
Possible Queries
Note: Staff that are on shift are either on a break or on the floor (assigned to a table), Skills have a major or minor type based on difficulty to learn.
What staff have been on the floor for 80 minutes or more? (They are due for a break)
What open tables can I assign this employee to based on their skillset?
I need an employee that has Baccarat skill but is not already been assigned to a Baccarat table.
What employee(s) was on this table during this period of time?
Where was this employee at this point in time?
Who is on shift right now?
How many staff on shift can deal Blackjack?
How many staff have 3 major skills?
What staff have had the Baccarat skill for at least 3 months?
These queries could also be sorted by alphabetical order or time, skill etc.
I'm pretty sure I know how to perform these queries with cypher for neo4j provided I model the data right. I'm not as knowledgeable with SQL queries, I've read it can get a bit complicated depending on the query and structure.
----------------------------------------------------------------------------------------
MYSQL Specific
An employee table could contain properties such as their ID number and Name, but am I right that for their skills and shifts these would be separate tables that reference the employee by a unique integer(I think this is called a foreign key?).
Another table could store the gaming Tables, these would have their own ID and reference a skill/gametype with a foreign key.
To record data like the pen/paper roster, each day could have a table with columns starting from 0000 increasing by 20 in value going all the way to 2340? Prior to the time columns I could have one for staff where each employee is represented with their foreign key, the time columns would then have foreign keys to the assigned gaming Tables, the row data is bound to have many cells that aren't populated since the employee shift won't be 24/7. If I'm using foreign keys to reference gaming Tables I now have a problem when the employee is on break? Unless I treat say the first gaming Table entry as a break?
I may need to further complicate things though, management will over time try different gaming Table layouts, some of the gaming Tables can be converted from say Blackjack to Baccarat. this is bound to happen quite a bit over 7 years, would I want to be creating new gaming Table entries or add a column to use a foreign key and refer to a new table that stores the history of game types during periods of time? Employees will also learn to deal new games during their career, very rarely they may also have the skill removed.
----------------------------------------------------------------------------------------
Neo4j Specific
With this data would I have an Employee and a Table node that have "isA" relationship edges mapping to actual employees or tables?
I imagine with the skills for the two types I would be best with a Skill node and establish relationships like so?: Blackjack->isA->Skill, Employee->hasSkill->Blackjack, Table->typeIs->Blackjack?
TIME
I find difficulty when I want this database to now work with a timeline. I've come across the following suggestions for connecting nodes with time:
Unix Epoch seems to be a common recommendation?
Connecting nodes to a year/month/day graph?
Lucene timeline? (I don't know much about this or how to work with it, have seen some mention it)
And some cases with how time and data relate:
Staff have varied days and start/end times from week to week, this could be shift node with properties {shiftStart,shiftEnd,actualStart,actualEnd}, staff may arrive late or get sick during shift. Would this be the right way to link each shift to an employee? Employee(node)->Shifts(groupNode)->Shift(node)
Tables and Staff may have skill data modified, with archived data this could be an issue, I think the solution is to have time property on the relationship to the skill?
We open and close tables throughout the day, each table has open/close times for each day, this could change in a month depending on what management wants, in addition the times are not strict, for various reasons a manager may open or close tables during the shift. The open/closed status of a table node may only be relevant for queries during the shift, which confuses me as I'd want this for queries but for archiving with time it might not make sense?
It's with queries that I have trouble deciding when to use a node or add a property to a node. For an Employee they have a name and ID number, if I wanted to find an employee by their ID number would it be better to have that as a node of it's own? It would be more direct right, instead of going through all employees for that unique ID number.
I've also come across labels just recently, I can understand that those would be useful for typing employee and table nodes rather than grouping them under a node. With the shifts for an employee I think should continue to be grouped with a shifts node, If I were to do cypher queries for employees working shifts through a time period a label might be appropriate, however should it be applied to individual shift nodes or the shifts group node that links back to the employee? I might need to add a property to individual shift nodes or the relationship to the shifts group node? I'm not sure if there should be a shifts group node, I'm assuming that reducing the edges connecting to the employee node would be optimal for queries.
----------------------------------------------------------------------------------------
If there are any great resources I can learn about database development that'd be great, there is so much information and options out there it's difficult to know what to begin with. Thanks for your time :)
Thanks for spending the time to put a quality question together. Your requirements are great and your specifications of your system are very detailed. I was able to translate your specs into a graph data model for Neo4j. See below.
Above you'll see a fairly explanatory graph data model. In case you are unfamiliar with this, I suggest reading Graph Databases: http://graphdatabases.com/ -- This website you can get a free digital PDF copy of the book but in case you want to buy a hard copy you can find it on Amazon.
Let's break down the graph model in the image. At the top you'll see a time indexing structure that is (Year)->(Month)->(Day)->(Hour), which I have abbreviated as Y M D H. The ellipses indicate that the graph is continuing, but for the sake of space on the screen I've only showed a sub-graph.
This time index gives you a way to generate time series or ask certain questions on your data model that are time specific. Very useful.
The bottom portion of the image contains your enterprise data model for your casino. The nodes represent your business objects:
Game
Table
Employee
Skill
What's great about graph databases is that you can look at this image and semantically understand the language of your question by jumping from one node to another by their relationships.
Here is a Cypher query you can use to ask your questions about the data model. You can just tweak it slightly to match your questions.
MATCH (employee:Employee)-[:HAS_SKILL]->(skill:Skill),
(employee)<-[:DEALS]-(game:Game)-[:LOCATION]->(table:Table),
(game)-[:BEGINS]->(hour:H)<-[*]-(day:D)<-[*]-(month:M)<-[*]-(year:Y)
WHERE skill.type = "Blackjack" AND
day.day = 17 AND
month.month = 1 AND
year.year = 2014
RETURN employee, skill, game, table
The above query finds the sub-graph for all employees who have the skill Blackjack and their table and location on a specific date (1/17/14).
To do this in SQL would be very difficult. The next thing you need to think about is importing your data into a Neo4j database. If you're curious on how to do that please look at other questions here on SO and if you need more help, feel free to post another question or reach out to me on Twitter #kennybastani.
Cheers,
Kenny