Classpass.com like database design - mysql

I am trying to get my head around creating a ClassPass-like database design. I'm new to database design, and there are a few things I'm not sure how to implement.
You can check the classpass example:
https://classpass.com/classes
https://classpass.com/studios
EDIT 1: So here is the idea: each city has multiple neighbourhoods, each having multiple studios/venues.
After reading spencer7593's comment, here is what I came up with and the things that are still not quite clear:
So what I am not quite sure about is:
I am not sure how to store the venue/studio address and geolocation. Is it better to have a Region table which defines id | name | parent_id and stores the cities and the neighborhoods recursively? Or add foreign key constraints to city and neighborhood tables? Should I store the lat/lon in the venue table, in the address table, or even in a separate locations table? I would like to be able to perform searches like:
show me venues in that neighborhood or city
show me venues which are in radius XX from position
Each class should have a schedule and currently I am not sure how to design it. For example: Spinning class, Mo, We, Fr from 9 AM till 10 AM. I would like to be able to do queries like:
show me venues, which have spinning classes on Mo
or show me all classes in category Spinning, Boxing for example
or even show me venues offering spinning classes
Should I create an extra table schedules here? Or just create some kind of view which creates the schedule? If it's an extra table, how should I describe start, end of each day of the week?

@Dimitar,
Even though @rhavendc is correct that this question should be posted on Database Administrators, I will answer your questions in order, to the best of my knowledge.
I am not sure how to store the venue/studio address and geolocation. [...]
You can easily find geolocations by searching the web. Take MyGeoPosition, for example.
I would like to be able to perform searches like
show me venues in that neighborhood or city.
You can do this easily. There are a few ways to do it, and each way will require a bit of tweaking of your ERD design. With the example I attached below, you can run a query to list all the venues by joining on address_id and then on city_id. The yellow entities are the ones I added to ensure integrity.
For example:
-- venue.name is using the "[table].[field]" format to help
-- the engine recognize where the field is coming from.
-- This is useful if you are pulling the fields of the
-- same name from different tables.
select venue.name, city.name
from venue join
address using (address_id) join
city using (city_id);
NOTE: You don't have to include city.name. I just threw it in there so you can try it out and see all the venues matching it.
If you would like to do it by neighborhood, you would have to tweak the ERD I gave you by adding neighbor_id to the ADDRESS table. I have attached the example below. You would also have to add neighborhood_id. From there, you can run a query like this:
Using this ERD:
-- Remember the format from the previously mentioned code.
select venue.name, neighborhood.name
from venue join
address using (address_id) join
neighborhood using (neighbor_id);
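The two queries above list every venue; to answer "venues in that neighborhood or city" they need a WHERE clause. Here is a minimal sketch of that, assuming a hypothetical chain venue -> address -> neighborhood -> city (the attached ERD may link address to city directly). It uses Python's built-in sqlite3 module only so that it runs on its own; the table contents and the two helper functions are made up for illustration, and MySQL column types would differ.

```python
import sqlite3

# Hypothetical schema and sample rows; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE city (city_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE neighborhood (
    neighbor_id INTEGER PRIMARY KEY,
    city_id     INTEGER NOT NULL REFERENCES city(city_id),
    name        TEXT NOT NULL
);
CREATE TABLE address (
    address_id  INTEGER PRIMARY KEY,
    neighbor_id INTEGER NOT NULL REFERENCES neighborhood(neighbor_id),
    street      TEXT NOT NULL
);
CREATE TABLE venue (
    venue_id   INTEGER PRIMARY KEY,
    address_id INTEGER NOT NULL REFERENCES address(address_id),
    name       TEXT NOT NULL
);
INSERT INTO city VALUES (1, 'New York'), (2, 'Boston');
INSERT INTO neighborhood VALUES (1, 1, 'SoHo'), (2, 1, 'Chelsea'), (3, 2, 'Back Bay');
INSERT INTO address VALUES (1, 1, '1 Spring St'), (2, 2, '2 W 20th St'), (3, 3, '3 Newbury St');
INSERT INTO venue VALUES (1, 1, 'Spin Loft'), (2, 2, 'Box Club'), (3, 3, 'Yoga Bay');
""")

def venues_in_neighborhood(name):
    """Names of venues whose address lies in the given neighborhood."""
    rows = conn.execute("""
        SELECT venue.name
        FROM venue
        JOIN address USING (address_id)
        JOIN neighborhood USING (neighbor_id)
        WHERE neighborhood.name = ?
        ORDER BY venue.name""", (name,))
    return [r[0] for r in rows]

def venues_in_city(name):
    """Names of venues in any neighborhood of the given city."""
    rows = conn.execute("""
        SELECT venue.name
        FROM venue
        JOIN address USING (address_id)
        JOIN neighborhood USING (neighbor_id)
        JOIN city USING (city_id)
        WHERE city.name = ?
        ORDER BY venue.name""", (name,))
    return [r[0] for r in rows]
```

Because a neighborhood belongs to exactly one city in this sketch, the city never has to be stored on the address itself.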
show me venues which are in radius XX from position
You can calculate the distance in miles, kilometers, etc. between two latitude/longitude pairs using the haversine formula.
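As a rough sketch of the haversine calculation, here it is in plain Python. The function names, the (name, lat, lon) tuple layout and the mean Earth radius of 6371 km are assumptions made for this example; in practice you might compute the same expression inside the SQL query instead.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def venues_within(venues, lat, lon, radius_km):
    """Filter (name, lat, lon) tuples to those within radius_km of (lat, lon)."""
    return [name for name, vlat, vlon in venues
            if haversine_km(lat, lon, vlat, vlon) <= radius_km]
```

Checking every venue this way is a full scan. On larger tables it usually helps to first narrow the candidates with a simple bounding box on the lat/lon columns and apply the exact distance only to what remains.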
Each class should have a schedule and currently I am not sure how to design it. For example: Spinning class, Mo, We, Fr from 9 AM till 10 AM. I would like to be able to do queries like:
show me venues, which have spinning classes on Mo
or show me all classes in category Spinning, Boxing for example
or even show me venues offering spinning classes
This can be easily derived from either of the ERDs I attached here. In the CLASS table, I added a field called parent_class_id which gets the class_id from the same table. This uses recursion, and I know this is a bit of a headache to understand. This recursion will allow the classes with assigned parent class to show that the classes are also offered at different times.
You can get this result by doing so:
-- Remember the format from the previously mentioned code.
select class1.name, class1.class_id, class2.class_id
from class as class1,
class as class2
where class1.parent_class_id = class2.class_id;
or even show me venues offering spinning classes
This may be a tricky one... If you are wondering which venues are offering spinning classes, where spinning is either part of or the name of the class, not a category, it's simple.
Try this...
-- Remember the format from the previously mentioned code.
select venue_id
from venue join
class using (venue_id)
where class_name = 'spinning';
NOTE: Keep in mind that many SQL engines are case-sensitive when comparing string literals; in MySQL this depends on the column's collation, and the default collations are case-insensitive. To be safe, you could use where UPPER(class_name) = 'SPINNING'.
If the class name may include words other than "spinning", use this instead: where UPPER(class_name) like '%SPINNING%'.
If you are wondering which classes are spinning classes, where spinning is a category, that's where the tricky bit comes in. I believe you would have to use a subquery for this.
Try this:
-- Remember the format from the previously mentioned code.
select class_id
from class join
class_category using (class_id)
where cat_id = (select cat_id
from category
where name = 'spinning');
Again, depending on the engine and collation, literal searches may be case-sensitive. Make sure your literals use the correct upper or lower case.
Should I create an extra table schedules here? Or just create some kind of view which creates the schedule? If it's an extra table, how should I describe start, end of each day of the week?
Yes and no. You could, but if you can understand recursion in database systems, you don't have to.
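For the "you could" option, here is one possible sketch of an extra schedule table with one row per weekly time slot. The table name class_schedule, its columns and the 1 = Monday numbering are assumptions for this example, not part of the attached ERD. SQLite (via Python's built-in sqlite3) is used only to keep the sketch runnable; in MySQL the two time columns would more naturally be of type TIME.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE venue (venue_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE class (
    class_id INTEGER PRIMARY KEY,
    venue_id INTEGER NOT NULL REFERENCES venue(venue_id),
    name     TEXT NOT NULL
);
-- Hypothetical table: one row per weekly slot; day_of_week 1 = Monday ... 7 = Sunday.
CREATE TABLE class_schedule (
    class_id    INTEGER NOT NULL REFERENCES class(class_id),
    day_of_week INTEGER NOT NULL CHECK (day_of_week BETWEEN 1 AND 7),
    start_time  TEXT NOT NULL,   -- 'HH:MM'
    end_time    TEXT NOT NULL,   -- 'HH:MM'
    PRIMARY KEY (class_id, day_of_week, start_time)
);
INSERT INTO venue VALUES (1, 'Spin Loft'), (2, 'Box Club');
INSERT INTO class VALUES (1, 1, 'Spinning'), (2, 2, 'Boxing');
-- Spinning: Mo, We, Fr from 9 AM till 10 AM.
INSERT INTO class_schedule VALUES
    (1, 1, '09:00', '10:00'),
    (1, 3, '09:00', '10:00'),
    (1, 5, '09:00', '10:00'),
    (2, 2, '18:00', '19:00');
""")

def venues_with_class_on(class_name, day_of_week):
    """Venues that offer a class with this name on the given weekday."""
    rows = conn.execute("""
        SELECT DISTINCT venue.name
        FROM venue
        JOIN class ON class.venue_id = venue.venue_id
        JOIN class_schedule ON class_schedule.class_id = class.class_id
        WHERE class.name = ? AND class_schedule.day_of_week = ?
        ORDER BY venue.name""", (class_name, day_of_week))
    return [r[0] for r in rows]
```

With this layout, "Mo, We, Fr from 9 AM till 10 AM" is simply three rows, and a class that runs at different times on different days needs no special handling.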
Hope this helps. :)

Entity Relationship Modeling.
An entity is a person, place, thing, concept or event that can be uniquely identified, is important to the business, and we can store information about.
Based on information in the question, some candidates to consider as entities might be:
studio
class
rating
neighborhood
city
For each entity, what uniquely identifies it? Figure out the candidate keys.
And figure out the relationships between the entities, and the cardinalities. (What is related to what, and how many, required or optional?)
Is a studio related to a class?
Can a studio have more than one class?
Can a studio have zero classes?
Can a class be related to more than one studio?
Is a neighborhood related to zero, one or more cities?
Can a studio be related to more than one neighborhood?
Once you've got the entities and relationships, getting the attributes assigned to each entity is pretty straightforward. Just make sure every attribute is dependent on the key, the whole key, and nothing but the key.

FIRST
Your question is not well suited to Stack Overflow; I guess it would be better posted on Database Administrators.
SECOND
Here is some reading to give you a good start on building your database:
Data Modeling (It's kinda broad but it's for the better)
Logical Data Model (Short but comprehensive one)
THIRD
Basically, when designing your database you should first know all the data that would be needed in your system and group them (if needed) to make it small. Normalize it to reduce data redundancy.
EXAMPLE
Let's assume that table venue is your main table, the center of all the transactions in your system. venue may have subdata, for example branch, which holds the different branch locations. A branch may in turn have subdata of its own, for example schedule, teacher and/or class, which may also be related to each other (subdata getting data from other subdata), and so on for the dependent tables.
You can also create independent tables that still have connections to others. For example, the neighborhood table may contain the neighborhood location and the main venue location (so it should get the id of the selected venue from the venue table), and so on for related and independent tables.
NOTE
Just remember the "one-to-one, one-to-many" relationships. If a piece of data will hold many kinds of subdata, split them into different tables. If it will hold only one kind of subdata, put it all in one table.

Related

How to set up relational database tables for this many-to-many relationship?

I have a type of data called a chain. Each chain is made up of a specific sequence of another type of data called a step. So a chain is ultimately made up of multiple steps in a specific order. I'm trying to figure out the best way to set this up in MySQL that will allow me to do the following:
Look up all steps in a chain, and get them in the right order
Look up all chains that contain a step
I'm currently considering the following table set up as the appropriate solution:
TABLE chains
id date_created
TABLE steps
id description
TABLE chains_steps (this would be used for joins)
chain_id step_id step_position
In the table chains_steps, the step_position column would be used to order the steps in a chain correctly. It seems unusual for a JOIN table to contain its own distinct piece of data, such as step_position in this case. But maybe it's not unusual at all and I'm just inexperienced/paranoid.
I don't have much experience in all this so I wanted to get some feedback. Are the three tables I suggested the correct way to do this? Are there any viable alternatives and, if so, what are the advantages/drawbacks?
You're doing it right.
Consider a database containing the Employees and Projects tables, and how you'd want to link them in a many-to-many fashion. You'd probably come up with an Assignments table (or Project_Employees in some naming conventions).
At some point you'd decide you want not only to store each project assignment, but you'd also want to store when the assignment started, and when it finished. The natural place to put that is in the assignment itself; it doesn't make sense to store it either with the project or with the employee.
In further designs you might even find it necessary to store further information about the assignment, for example in an employee review process you may wish to store feedback related to their performance in that project, so you'd make the assignment the "one" end of a relationship with a Review table, which would relate back to Assignments with a FK on assignment_id.
So in short, it's perfectly normal to have a junction table that has its own data.
That looks fine, and it's not unusual for the join table to contain a position/rank field.
Look up all steps in a chain, and get them in the right order
SELECT * FROM chains_steps
LEFT JOIN steps ON steps.id = chains_steps.step_id
WHERE chains_steps.chain_id = ?
ORDER BY chains_steps.step_position ASC
Look up all chains that contain a step
SELECT DISTINCT chains_steps.chain_id FROM chains_steps
LEFT JOIN chains ON chains.id = chains_steps.chain_id
WHERE chains_steps.step_id = ?
I think that the plan you've outlined is the correct approach. Don't worry too much about the presence of step_position on your mapping table. After all the step_position is a bit of data that is directly related to a step in the context of a chain. So the chains_steps table is the right place for it IMHO.
Some things to think about:
Foreign keys - use 'em!
Unique key on the chains_steps table - can a step be present in more than one position in a single chain? What about in different chains?
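Both points can be sketched roughly as follows, using Python's built-in sqlite3 so the example runs on its own (MySQL syntax differs slightly). The choice of (chain_id, step_position) as the key is an assumption: it lets the same step appear several times in a chain but allows only one step per position. The sample data and helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
CREATE TABLE chains (id INTEGER PRIMARY KEY, date_created TEXT NOT NULL);
CREATE TABLE steps  (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
CREATE TABLE chains_steps (
    chain_id      INTEGER NOT NULL REFERENCES chains(id),
    step_id       INTEGER NOT NULL REFERENCES steps(id),
    step_position INTEGER NOT NULL,
    PRIMARY KEY (chain_id, step_position)  -- one step per position in a chain
);
INSERT INTO chains VALUES (1, '2024-01-01'), (2, '2024-01-02');
INSERT INTO steps VALUES (10, 'mix'), (20, 'bake'), (30, 'cool');
INSERT INTO chains_steps VALUES
    (1, 10, 1), (1, 20, 2), (1, 10, 3),
    (2, 30, 1), (2, 20, 2);
""")

def steps_in_chain(chain_id):
    """Step descriptions of a chain, in order."""
    rows = conn.execute("""
        SELECT steps.description
        FROM chains_steps
        JOIN steps ON steps.id = chains_steps.step_id
        WHERE chains_steps.chain_id = ?
        ORDER BY chains_steps.step_position""", (chain_id,))
    return [r[0] for r in rows]

def chains_containing(step_id):
    """Ids of all chains that use the given step at least once."""
    rows = conn.execute("""
        SELECT DISTINCT chain_id FROM chains_steps
        WHERE step_id = ? ORDER BY chain_id""", (step_id,))
    return [r[0] for r in rows]

def insert_is_rejected(chain_id, step_id, position):
    """True if the key or a foreign key refuses the row."""
    try:
        with conn:
            conn.execute("INSERT INTO chains_steps VALUES (?, ?, ?)",
                         (chain_id, step_id, position))
    except sqlite3.IntegrityError:
        return True
    return False
```

If a step must never repeat within a chain, a second UNIQUE (chain_id, step_id) constraint would express that.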
Good luck!

Database Architecture Many-to-Many-to-Many

I have a question about how to change a database model:
For now we have predefined table Categories
and let's say tables Places and People which can be assigned to categories so it looks like this:
People <=> PeopleCategories <=> Categories <=> PlaceCategories <=> Places
(People can have many categories, categories can have many people, places can have many categories, categories can have many places)
But now there is a new requirement:
On person profile show all corresponding places based on categories (so far no problem) and add a tick box modeling some attribute (for example show on front-end as favorite place). The same from the other side on Place profile mark people assigned to at least one same category with a tick box.
I wonder whether there is some nice way to model this - the only thing which came to my mind is to add a new PeoplePlaces table but then I have to manually control whether people or places did not change their categories and they are still assigned and so on - There will be quite a problem with consistency of data which I will have to manage on application layer.
The second thing I could probably do is to delete categories totally and make it only on PeoplePlaces level but I will lose some simplicity for user: there are like 10 predefined categories which user can select so the linking between People and Places is quite automatic on front-end and only admin should see which places are assigned to which people and manage that tick box I was talking about
What would you suggest for this architecture? Thanks in advance! (It is a MySQL db if it is important for some kind of solution but this is more a general architecture thing)
If I understood your question correctly, you need to ensure that a person can only favor a place that is connected to the same category as the person herself?
If so, take a look at the following model:
We don't link the "endpoints" directly, and instead "link the links". This allows us to migrate PERSON_CATEGORY.CATEGORY_ID and PLACE_CATEGORY.CATEGORY_ID into the FAVORED_PLACE table, and "merge" them there, producing a single FAVORED_PLACE.CATEGORY_ID field (note FK1,FK2 in the diagram above).
As a consequence, if a person is connected to a place, that must be done through a common category.
Furthermore, since CATEGORY_ID is outside FAVORED_PLACE's PK, a particular combination of person and place can be used only once, even if they match through multiple categories. Effectively, you pick one common category as "special". If a place (or person) is removed from the special category, you'll need to pick another common category to serve as special. If there are no common categories left, the corresponding row in FAVORED_PLACE will not be allowed to exist anymore.
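Since the diagram is not reproduced here, the following is only a rough sketch of how such a "link the links" model could look. It assumes the key of FAVORED_PLACE is (PERSON_ID, PLACE_ID) and that both composite foreign keys share the single CATEGORY_ID column; the PERSON, PLACE and CATEGORY tables are left out for brevity. It runs on Python's built-in sqlite3, and the try_favor helper and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person_category (
    person_id   INTEGER NOT NULL,
    category_id INTEGER NOT NULL,
    PRIMARY KEY (person_id, category_id)
);
CREATE TABLE place_category (
    place_id    INTEGER NOT NULL,
    category_id INTEGER NOT NULL,
    PRIMARY KEY (place_id, category_id)
);
CREATE TABLE favored_place (
    person_id   INTEGER NOT NULL,
    place_id    INTEGER NOT NULL,
    category_id INTEGER NOT NULL,  -- shared by both foreign keys below
    PRIMARY KEY (person_id, place_id),
    FOREIGN KEY (person_id, category_id)
        REFERENCES person_category (person_id, category_id),
    FOREIGN KEY (place_id, category_id)
        REFERENCES place_category (place_id, category_id)
);
-- Person 1 is in categories 100 and 200; place 7 is in 100, place 8 in 300.
INSERT INTO person_category VALUES (1, 100), (1, 200);
INSERT INTO place_category VALUES (7, 100), (8, 300);
""")

def try_favor(person_id, place_id, category_id):
    """Insert a favored place; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO favored_place VALUES (?, ?, ?)",
                         (person_id, place_id, category_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

In this sketch try_favor(1, 7, 100) succeeds because both links exist for category 100, while person 1 cannot favor place 8 through either 200 or 300, since one of the two links is always missing.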
I don't think deleting Categories is a good idea.
What you are doing is introducing a new entity - PersonsFavouritePlaces - which relates People and Place directly rather than via a Category. It is sensible that a PersonsFavouritePlace be limited to a Person and a Place linked by Category, so it should probably reference PeopleCategories and PlaceCategories rather than the People and Places tables.
The table would look like:
create table PeopleFavouritePlace
(
ID int not null, -- Primary key
PeopleCategoriesId int not null, -- FK to PK of PeopleCategories
PlaceCategoriesId int not null -- FK to PK of PlaceCategories
);
I don't know whether MySql supports cascading deletes, but if so the two FK's should have that turned on so when someone deselects a category (deleting the PeopleCategories row) if it linked to a favourite place in that category it too gets deleted.
However, if a person links to a place via multiple categories then it gets complicated....
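For what it's worth, MySQL's InnoDB engine does support ON DELETE CASCADE on foreign keys. Below is a small sketch of the behaviour described above, run against SQLite through Python's built-in sqlite3 so it is self-contained. The surrogate ID columns on PeopleCategories and PlaceCategories and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required in SQLite for FK enforcement
conn.executescript("""
CREATE TABLE PeopleCategories (
    ID INTEGER PRIMARY KEY, PersonId INTEGER NOT NULL, CategoryId INTEGER NOT NULL
);
CREATE TABLE PlaceCategories (
    ID INTEGER PRIMARY KEY, PlaceId INTEGER NOT NULL, CategoryId INTEGER NOT NULL
);
CREATE TABLE PeopleFavouritePlace (
    ID INTEGER PRIMARY KEY,
    PeopleCategoriesId INTEGER NOT NULL
        REFERENCES PeopleCategories(ID) ON DELETE CASCADE,
    PlaceCategoriesId INTEGER NOT NULL
        REFERENCES PlaceCategories(ID) ON DELETE CASCADE
);
INSERT INTO PeopleCategories VALUES (1, 1, 100);
INSERT INTO PlaceCategories VALUES (1, 7, 100);
INSERT INTO PeopleFavouritePlace VALUES (1, 1, 1);
""")

def favourite_count():
    return conn.execute("SELECT COUNT(*) FROM PeopleFavouritePlace").fetchone()[0]

before = favourite_count()
with conn:
    # The person deselects the category: the favourite row goes with it.
    conn.execute("DELETE FROM PeopleCategories WHERE ID = 1")
after = favourite_count()
```

Note that nothing in this table forces the two referenced rows to share the same category; that check would have to live elsewhere.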

How could the following database schema be drawn using E/R diagrams?

How could the following database schema be drawn using E/R diagrams? (A sketch or final image would be helpful). I would also appreciate it if you could guide me to an easy-to-understand tutorial on entity-relationships so I could learn how to draw them on paper first.
A CD has a title, a year of production and a CD type. (CD type could be anything: mini-CD, CD-R, CD-RW, DVD-R, DVD-RW...)
A CD usually has multiple songs on different tracks. Each song has a name, an artist and a track number. Entity set Song is considered to be weak and needs support from entity set CD.
A CD is produced by a producer which has a name and an address.
A CD may be supplied by multiple suppliers, each has a name and an address.
A customer may rent multiple CDs. Customer information such as Social Security Number (SSN), name, telephone needs to be recorded. The date and period of renting (in days) should also be recorded.
A customer may be a regular member and a VIP member. A VIP member has additional information such as the starting date of VIP status and percentage of discount.
Is this Entity diagram correct? This is so fracking confusing. I've built this diagram on intuition rather than the systematic approach they teach in a textbook. I still can't wrap my head around the many-to-one relation, weak entities, foreign keys.
There's a fair article on ERDs on Wikipedia.
When you're starting a new ERD - whether it's hand-drawn or computer-drawn - you should focus first on the entities (entity sets). Add the relationships in and then worry about fleshing out your non-key predicates. When you get some experience with ERDs you'll get to the point where you won't need much more work to achieve normalization. It will start to come naturally to you.
There are probably quite a few changes that you'll want to make to your diagram. Since this may be homework, I'll give you an alternative diagram to consider:
This model takes a more sophisticated view of your rules, for example:
Songs can appear many times on the same CD and on different CDs.
A song can be performed by multiple artists within a given track.
Producers can cooperate on a CD.
None of these are necessarily right for your model. It depends on your business rules.
Compare your model with this one and ask yourself what is different and why you might want to take one approach or the other.
take all the major concepts, draw a box for each
in the box put the name of the major concept, like SONG then an underline
under the major concept, list all the attributes like NAME
draw lines from one box to another where those concepts are linked (usually through an attribute) like line from CD to SONG
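Once the boxes and lines are on paper, mapping them to tables is fairly mechanical. As one possible reading of the question's rule that Song is a weak entity supported by CD, here is a sketch in which a song's key borrows the CD's key. The surrogate cd_id column and the sample rows are assumptions; Python's built-in sqlite3 is used only to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE cd (
    cd_id   INTEGER PRIMARY KEY,   -- hypothetical surrogate key
    title   TEXT NOT NULL,
    year    INTEGER NOT NULL,
    cd_type TEXT NOT NULL
);
-- Weak entity: a song is identified only together with its owning CD.
CREATE TABLE song (
    cd_id        INTEGER NOT NULL REFERENCES cd(cd_id) ON DELETE CASCADE,
    track_number INTEGER NOT NULL,
    name         TEXT NOT NULL,
    artist       TEXT NOT NULL,
    PRIMARY KEY (cd_id, track_number)
);
INSERT INTO cd VALUES (1, 'Album A', 1999, 'CD-R'), (2, 'Album B', 2004, 'DVD-R');
INSERT INTO song VALUES
    (1, 1, 'Intro', 'X'), (1, 2, 'Outro', 'X'),
    (2, 1, 'Intro', 'Y');
""")

def tracks(cd_id):
    """(track_number, name) pairs of a CD in track order."""
    rows = conn.execute(
        "SELECT track_number, name FROM song WHERE cd_id = ? ORDER BY track_number",
        (cd_id,))
    return [tuple(r) for r in rows]

def song_without_cd_is_rejected():
    """A song row pointing at a missing CD violates the foreign key."""
    try:
        with conn:
            conn.execute("INSERT INTO song VALUES (99, 1, 'Orphan', 'Z')")
    except sqlite3.IntegrityError:
        return True
    return False
```

Track number 1 can exist on both CDs because the track number is only unique within its CD, which is what makes Song weak.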

Implementing Comments and Likes in database

I'm a software developer. I love to code, but I hate databases... Currently, I'm creating a website on which a user will be allowed to mark an entity as liked (like in FB), tag it and comment.
I'm stuck on the database table design for handling this functionality. The solution is trivial if we only do this for one type of thing (e.g. photos). But I need to enable this for 5 different things (for now, but I also assume that this number can grow as the whole service grows).
I found some similar questions here, but none of them have a satisfying answer, so I'm asking this question again.
The question is, how to properly, efficiently and elastically design the database, so that it can store comments for different tables, likes for different tables and tags for them. Some design pattern as answer will be best ;)
Detailed description:
I have a table User with some user data, and 3 more tables: Photo with photographs, Articles with articles, Places with places. I want to enable any logged user to:
comment on any of those 3 tables
mark any of them as liked
tag any of them with some tag
I also want to count the number of likes for every element and the number of times that particular tag was used.
1st approach:
a) For tags, I will create a table Tag [TagId, tagName, tagCounter], then I will create many-to-many relationships tables for: Photo_has_tags, Place_has_tag, Article_has_tag.
b) The same goes for comments.
c) I will create a table LikedPhotos [idUser, idPhoto], LikedArticles[idUser, idArticle], LikedPlace [idUser, idPlace]. The number of likes will be calculated by queries (which, I assume, is bad). And...
I really don't like this design for the last part; it smells bad to me ;)
2nd approach:
I will create a table ElementType [idType, TypeName == some table name] which will be populated by the administrator (me) with the names of tables that can be liked, commented or tagged. Then I will create tables:
a) LikedElement [idLike, idUser, idElementType, idLikedElement] and the same for Comments and Tags with the proper columns for each. Now, when I want to make a photo liked I will insert:
SET @typeId = (SELECT idType FROM ElementType WHERE TypeName = 'Photo');
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (@userId, @typeId, @photoId);
and for places:
SET @typeId = (SELECT idType FROM ElementType WHERE TypeName = 'Place');
INSERT INTO LikedElement (idUser, idElementType, idLikedElement) VALUES (@userId, @typeId, @placeId);
and so on... I think that the second approach is better, but I also feel like something is missing in this design as well...
Lastly, I also wonder where the best place is to store the counter for how many times an element was liked. I can think of only two ways:
in element (Photo/Article/Place) table
by select count().
I hope that my explanation of the issue is more thorough now.
The most extensible solution is to have just one "base" table (connected to "likes", tags and comments), and "inherit" all other tables from it. Adding a new kind of entity involves just adding a new "inherited" table - it then automatically plugs into the whole like/tag/comment machinery.
Entity-relationship term for this is "category" (see the ERwin Methods Guide, section: "Subtype Relationships"). The category symbol is:
Assuming a user can like multiple entities, the same tag can be used for more than one entity, but a comment is entity-specific, your model could look like this:
BTW, there are roughly 3 ways to implement the "ER category":
All types in one table.
All concrete types in separate tables.
All concrete and abstract types in separate tables.
Unless you have very stringent performance requirements, the third approach is probably the best (meaning the physical tables match 1:1 the entities in the diagram above).
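Since the diagram is not reproduced here, the following is only a rough sketch of what the third approach could look like: one base table plus one table per concrete type, with likes attached to the base. All table and column names (entity, photo, article, app_user, user_like) and the helper functions are assumptions for this example; tags and comments would reference entity in the same way. It uses Python's built-in sqlite3 so it runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE app_user (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Abstract base type: every likeable thing gets exactly one row here.
CREATE TABLE entity (entity_id INTEGER PRIMARY KEY);
-- Concrete types reuse the base table's key as their own primary key.
CREATE TABLE photo (
    entity_id INTEGER PRIMARY KEY REFERENCES entity(entity_id),
    url       TEXT NOT NULL
);
CREATE TABLE article (
    entity_id INTEGER PRIMARY KEY REFERENCES entity(entity_id),
    title     TEXT NOT NULL
);
-- Likes point at the base table, so they work for any concrete type.
CREATE TABLE user_like (
    user_id   INTEGER NOT NULL REFERENCES app_user(user_id),
    entity_id INTEGER NOT NULL REFERENCES entity(entity_id),
    PRIMARY KEY (user_id, entity_id)
);
INSERT INTO app_user VALUES (1, 'ann'), (2, 'bob');
""")

def add_photo(url):
    """Create the base row first, then the photo row sharing its id."""
    with conn:
        entity_id = conn.execute("INSERT INTO entity DEFAULT VALUES").lastrowid
        conn.execute("INSERT INTO photo VALUES (?, ?)", (entity_id, url))
    return entity_id

def add_article(title):
    with conn:
        entity_id = conn.execute("INSERT INTO entity DEFAULT VALUES").lastrowid
        conn.execute("INSERT INTO article VALUES (?, ?)", (entity_id, title))
    return entity_id

def like(user_id, entity_id):
    with conn:
        conn.execute("INSERT INTO user_like VALUES (?, ?)", (user_id, entity_id))

def like_count(entity_id):
    return conn.execute(
        "SELECT COUNT(*) FROM user_like WHERE entity_id = ?", (entity_id,)
    ).fetchone()[0]

photo_id = add_photo("beach.jpg")
article_id = add_article("Hello")
like(1, photo_id)
like(2, photo_id)
like(1, article_id)
```

Adding a new kind of entity, say place, would then mean one new table keyed on entity_id; the like machinery stays untouched.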
Since you "hate" databases, why are you trying to implement one? Instead, solicit help from someone who loves and breathes this stuff.
Otherwise, learn to love your database. A well designed database simplifies programming, engineering the site, and smooths its continuing operation. Even an experienced d/b designer will not have complete and perfect foresight: some schema changes down the road will be needed as usage patterns emerge or requirements change.
If this is a one man project, program the database interface into simple operations using stored procedures: add_user, update_user, add_comment, add_like, upload_photo, list_comments, etc. Do not embed the schema into even one line of code. In this manner, the database schema can be changed without affecting any code: only the stored procedures should know about the schema.
You may have to refactor the schema several times. This is normal. Don't worry about getting it perfect the first time. Just make it functional enough to prototype an initial design. If you have the luxury of time, use it some, and then delete the schema and do it again. It is always better the second time.
This is a general idea
Please don't pay much attention to the styling of the field names, but more to the relations and structure.
This pseudocode will get all the comments of photo with ID 5
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'comment'
This pseudocode will get all the likes or users who liked photo with ID 5
(you may use count() to just get the amount of likes)
SELECT * FROM actions
WHERE actions.id_Stuff = 5
AND actions.typeStuff = 'photo'
AND actions.typeAction = 'like'
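The diagram for this answer is not reproduced here, so the following is only a guess at what a single actions table could look like. The columns id_action, id_user and content, the index, and the sample rows are assumptions; only id_Stuff, typeStuff and typeAction come from the queries above. It also shows the count() variant, using Python's built-in sqlite3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actions (
    id_action  INTEGER PRIMARY KEY,
    id_user    INTEGER NOT NULL,
    id_Stuff   INTEGER NOT NULL,  -- id of the photo/article/place acted on
    typeStuff  TEXT NOT NULL,     -- 'photo', 'article', 'place'
    typeAction TEXT NOT NULL,     -- 'like', 'comment', 'tag'
    content    TEXT               -- comment text or tag name; NULL for likes
);
-- Index matching the lookup pattern of the queries above.
CREATE INDEX idx_actions_target ON actions (typeStuff, id_Stuff, typeAction);
INSERT INTO actions (id_user, id_Stuff, typeStuff, typeAction, content) VALUES
    (1, 5, 'photo',   'like',    NULL),
    (2, 5, 'photo',   'like',    NULL),
    (2, 5, 'photo',   'comment', 'Nice shot'),
    (1, 5, 'article', 'like',    NULL);
""")

def count_actions(type_stuff, id_stuff, type_action):
    """Number of actions of one kind on one element, e.g. likes on photo 5."""
    row = conn.execute(
        "SELECT COUNT(*) FROM actions "
        "WHERE typeStuff = ? AND id_Stuff = ? AND typeAction = ?",
        (type_stuff, id_stuff, type_action)).fetchone()
    return row[0]
```

One trade-off of this layout is that id_Stuff cannot be a real foreign key, because it may point into any of several tables.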
As far as I understand, several tables are required. There is a many-to-many relation between them.
A table which stores the user data such as name, surname and birth date, with an identity field.
A table which stores data types. These types may be photos, shares, links. Each type must have its own table; therefore, there is a relation between their individual tables and this table.
Each different data type has its own table, for example status updates, photos, links.
The last table is for the many-to-many relation, storing an id, user id, data type and data id.
Look at the access patterns you are going to need. Do any of them seem to be made particularly difficult or inefficient by one design choice or the other?
If not, favour the one that requires fewer tables.
In this case:
Add Comment: you either pick a particular many/many table or insert into a common table with a known specific identifier for what is being liked, I think client code will be slightly simpler in your second case.
Find comments for item: here it seems using a common table is slightly easier - we just have a single query parameterised by type of entity
Find comments by a person about one kind of thing: simple query in either case
Find all comments by a person about all things: this seems a little gnarly either way.
I think your "discriminated" approach, option 2, yields simpler queries in some cases and doesn't seem much worse in the others so I'd go with it.
Consider using a table per entity for comments, etc. More tables mean better sharding and scaling. Managing many similar tables is not a problem in any framework I know.
One day you'll need to optimize reads from such a structure. You can easily create aggregating tables over the base ones and lose a bit on writes.
One big table with dictionary may become uncontrollable one day.
Definitely go with the second approach, where you have one table and store the element type for each row; it will give you a lot more flexibility. Basically, when something can logically be done with fewer tables, it is almost always better to go with fewer tables. One advantage that comes to mind in your particular case: if you want to delete all liked elements of a certain user, with your first approach you need to issue one query for each element type, but with the second approach it can be done with only one query. Likewise, when you want to add a new element type, the first approach involves creating a new table for each new type, but with the second approach you don't need any new tables.

Organizational chart represented in a table

I have an Access application in which I have an employee table. The employees are part of several different levels in the organization. The organization has 1 GM, 5 department heads, and under each department head are several supervisors, and under those supervisors are the workers.
Depending on the position of the employee, they will only have access to records of those under them.
I wanted to represent the organization in a table with some sort of level system. The problem I saw with that was that there are many people on the same level (for example supervisors), but they shouldn't have access to the records of a supervisor in another department. How should I approach this problem?
One common way of keeping this kind of hierarchical data in a database uses only a single table, with fields something like this:
userId (primary key)
userName
supervisorId (self-referential "foreign key", refers to another userId in this same table)
positionCode (could be simple like 1=lackey, 2=supervisor; or a foreign key pointing to another table of positions and such)
...whatever else you need to store for each employee...
Then your app uses SQL queries to figure out permissions. To figure out the employees that supervisor 'X' (whose userId is '3', for example) is allowed to see, you query for all employees where supervisorId=3.
If you want higher-up bosses to be able to see everyone underneath them, the easiest way is just to do a recursive search. I.e. query for everyone that reports to this big boss, and for each of them query who reports to them, all the way down the tree.
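The level-by-level search described above could be sketched like this. The example uses Python's built-in sqlite3 rather than Access so that it is self-contained; the employee rows, the position codes 3 and 4, and the subordinates helper are made up for illustration, and the data is assumed to contain no reporting cycles beyond what the seen set guards against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    userId       INTEGER PRIMARY KEY,
    userName     TEXT NOT NULL,
    supervisorId INTEGER REFERENCES employee(userId),  -- NULL for the GM
    positionCode INTEGER NOT NULL
);
INSERT INTO employee VALUES
    (1, 'GM',         NULL, 4),
    (2, 'Head A',     1,    3),
    (3, 'Head B',     1,    3),
    (4, 'Sup A1',     2,    2),
    (5, 'Sup B1',     3,    2),
    (6, 'Worker A1a', 4,    1),
    (7, 'Worker B1a', 5,    1);
""")

def subordinates(boss_id):
    """userIds of everyone below boss_id, found one reporting level at a time."""
    found = []
    seen = {boss_id}
    frontier = [boss_id]
    while frontier:
        placeholders = ",".join("?" * len(frontier))
        rows = conn.execute(
            "SELECT userId FROM employee "
            f"WHERE supervisorId IN ({placeholders}) ORDER BY userId",
            frontier).fetchall()
        # Skip anyone already visited so a bad supervisorId loop cannot spin forever.
        frontier = [r[0] for r in rows if r[0] not in seen]
        seen.update(frontier)
        found.extend(frontier)
    return found
```

A supervisor in department A then sees only their own branch: subordinates(2) contains the supervisor and worker under Head A, but nobody reporting up to Head B.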
Does that make sense? You let the database do the work of sorting through all the users, because computers are good at that kind of thing.
I put the positionCode in this example in case you wanted some people to have different permissions... for example, you might have a code '99' for HR employees which have the right to see the list of all employees.
Maybe I'll let some other people try to explain it better...
Here's an article from Microsoft's Access Cookbook that explains these queries rather well.
And here is a somewhat chunky explanation of the same.
Here's a completely different method (the "adjacency list model") that you might find useful, and his explanation is pretty good. He also points out some difficulties with both methods (when he talks about the tables being "denormalized").