I'm designing a simple database for a rental listings website,
sort of like classified ads but only for home/room rentals. This is what I've come up with thus far:
Question 1
For the "post" table, I actually wanted more information. For example, there would be a 'facilities' section where the users can select whether there's 'parking' available, do I need a separate table? Or just use 0 for no and 1 for yes?
Question 2
Here's what I did with the "category" table (sorry I don't know how to pretty print yet)
Category_ID 1 is Rent
Category_ID 2 is buildingType
For "categoryProperty" table
Category_ID 1 categoryPropertyID 1 House
Category_ID 1 categoryPropertyID 2 Room
Category_ID 2 categoryPropertyID 3 Apartment
Category_ID 2 categoryPropertyID 4 Condominium
Category_ID 2 categoryPropertyID 5 Detached
Does the above make sense?
Question 3
Users can post whether they are logged in or not. Just that logged in users/members have the advantage of tracking their ads/adjusting the availability.
How do I record the ads that a member has posted? Like their history.
Should I create a "postHistory" table and set the 'postHistory_ID' as FK to "member" table?
Thanks a lot in advance, I appreciate your help, especially just pointing me to the right direction.
Question 1:
make a separate table and make a One to One relation, that would be the simplest way:
POST -|-----|- EXTRAS
in EXTRAS you may have every extra field (parking=1/0, in_down_town=1/0, has_a_ghost=1/0)
Question 2:
This does not make sense; you have two options:
In the Post table, create a "type_of_operation" column that can have two values (building_type, rent). Or you can create different tables, but that would make this more complicated (you should analyse whether the same type can be in both states, etc.).
Question 3:
I recommend you to make your users register. Even with a really simple form (email+password) .
Seems to be on the right track -- with respect to your specific questions:
Question #1: Assuming there's more than one type of facility (parking; swimming pool; gym) then you have a many-to-many relationship and you want 2 new tables: Facilities and PropertyFacilities. Each Property (or I guess "post") could have multiple rows in the PropertyFacilities table.
Question #2: Not really clear on what you're getting at -- is it that each property type can either be rented whole or rented per room?
Question #3: Good question. What you want is an Active bit, or an ExpireDate, in your POST table. Anything that becomes inactive or expired is then automatically 'historical' data, with no need to marshal it into a history table. You'll have to archive eventually, of course.
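The many-to-many layout from Question #1 could be sketched roughly like this. It is only an illustration using Python's built-in sqlite3 module; the table and column names (post, facility, post_facility) and the sample rows are my own assumptions, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (
    post_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE facility (
    facility_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
-- One row per (post, facility) pair; the composite key prevents duplicates.
CREATE TABLE post_facility (
    post_id     INTEGER NOT NULL REFERENCES post(post_id),
    facility_id INTEGER NOT NULL REFERENCES facility(facility_id),
    PRIMARY KEY (post_id, facility_id)
);
-- Hypothetical sample data.
INSERT INTO post VALUES (1, 'Room near campus'), (2, 'Detached house');
INSERT INTO facility VALUES (1, 'parking'), (2, 'swimming pool'), (3, 'gym');
INSERT INTO post_facility VALUES (1, 1), (2, 1), (2, 3);
""")

def facilities_for(post_id):
    """Return the facility names attached to one post, sorted by name."""
    rows = conn.execute(
        """SELECT f.name
           FROM facility f
           JOIN post_facility pf ON pf.facility_id = f.facility_id
           WHERE pf.post_id = ?
           ORDER BY f.name""",
        (post_id,),
    )
    return [name for (name,) in rows]
```

Adding a new kind of facility is then an INSERT into facility rather than a new column on post.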
I have two sets of data that are near identical, one set for books, the other for movies.
So we have things such as:
Title
Price
Image
Release Date
Published
etc.
The only difference between the two sets of data is that Books have an ISBN field and Movies has a Budget field.
My question is, even though the data is similar should both be combined into one table or should they be two separate tables?
I've looked on SO at similar questions but am asking because most of the time my application will need to get a single list of both books and movies. It would be rare to get either books or movies. So I would need to lookup two tables for most queries if the data is split into two tables.
Doing this -- cataloging books and movies -- perfectly is the work of several lifetimes. Don't strive for perfection, because you'll likely never get there. Take a look at Worldcat.org for excellent cataloging examples. Just two:
https://www.worldcat.org/title/coco/oclc/1149151811
https://www.worldcat.org/title/designing-data-intensive-applications-the-big-ideas-behind-reliable-scalable-and-maintainable-systems/oclc/1042165662
My suggestion: Add a table called metadata. your titles table should have a one-to-many relationship with your metadata table.
Then, for example, titles might contain
title_id  title                                  price  release
103       Designing Data-Intensive Applications  34.96  2017
104       Coco                                   34.12  2017
Then metadata might contain
metadata_id  title_id  key             value
1            103       ISBN-13         978-1449373320
2            103       ISBN-10         1449373320
3            104       budget          USD175000000
4            104       EIDR            10.5240/EB14-C407-C74B-C870-B5B6-C
5            104       Sound Designer  Barney Jones
Then, if you want to get items with their ISBN-13 values (I'm not familiar with IBAN, but I guess that's the same sort of thing) you do this
SELECT titles.*, isbn13.value AS isbn13
FROM titles
LEFT JOIN metadata isbn13 ON titles.title_id = isbn13.title_id
AND isbn13.key = 'ISBN-13'
This is a good way to go because it's future-proof. If somebody turns up tomorrow and wants, let's say, the name of the most important character in the book or movie, you can add it easily.
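Here is a small runnable sketch of the titles/metadata idea, using Python's built-in sqlite3 module. The column names meta_key, meta_value and release_year are my own renaming (to stay clear of the SQL keywords KEY and RELEASE), and only a few of the sample rows are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE titles (
    title_id     INTEGER PRIMARY KEY,
    title        TEXT NOT NULL,
    price        REAL,
    release_year INTEGER
);
-- One row per extra attribute; a title can have any number of them.
CREATE TABLE metadata (
    metadata_id INTEGER PRIMARY KEY,
    title_id    INTEGER NOT NULL REFERENCES titles(title_id),
    meta_key    TEXT NOT NULL,
    meta_value  TEXT NOT NULL
);
INSERT INTO titles VALUES
    (103, 'Designing Data-Intensive Applications', 34.96, 2017),
    (104, 'Coco', 34.12, 2017);
INSERT INTO metadata VALUES
    (1, 103, 'ISBN-13', '978-1449373320'),
    (2, 103, 'ISBN-10', '1449373320'),
    (3, 104, 'budget',  'USD175000000');
""")

# The join condition must refer to the alias (isbn13), not the base table name.
rows = conn.execute("""
    SELECT titles.title, isbn13.meta_value AS isbn13
    FROM titles
    LEFT JOIN metadata AS isbn13
           ON titles.title_id = isbn13.title_id
          AND isbn13.meta_key = 'ISBN-13'
    ORDER BY titles.title_id
""").fetchall()
```

Because it is a LEFT JOIN, the movie still shows up in the list, just with no ISBN-13 value.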
The only difference between the two sets of data is that Books have an
IBAN field and Movies has a Budget field.
Are you sure that this difference that you have now will not be
extended to other differences that you may have to take into account
in the future?
Are you sure that you will not have to deal with any other type of
entities (other than books and movies) in the future which will
complicate things?
If the answer to both questions is "Yes", then you could use one table.
But if I had to design this, I would keep a separate table for each entity.
If needed, it's easy to combine their data in a View.
What is not easy, is to add or modify columns in a table, even naming them, just to match requirements of 2 or more entities.
You must be very sure about future requests/features for your application.
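To illustrate the point about combining separate tables in a View, here is a minimal sketch with Python's built-in sqlite3 module. The view name catalog, the kind column and the trimmed-down column lists are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (
    book_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    price   REAL,
    isbn    TEXT
);
CREATE TABLE movies (
    movie_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL,
    price    REAL,
    budget   INTEGER
);
-- The view exposes only the shared columns, plus a tag for the source table.
CREATE VIEW catalog AS
    SELECT 'book'  AS kind, book_id  AS item_id, title, price FROM books
    UNION ALL
    SELECT 'movie' AS kind, movie_id AS item_id, title, price FROM movies;
INSERT INTO books  VALUES (1, 'Designing Data-Intensive Applications', 34.96, '978-1449373320');
INSERT INTO movies VALUES (1, 'Coco', 34.12, 175000000);
""")

# A single query returns books and movies together.
catalog = conn.execute(
    "SELECT kind, title FROM catalog ORDER BY title"
).fetchall()
```

The application can query catalog for the common "list everything" case and still go to books or movies directly when it needs the ISBN or the budget.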
I can't imagine what kind of linked books and movies you store, since a lot of movies have different titles from the books they are based on. Example: 25 films that changed the name.
If you are sure that your data will be persistent and always the same for books and movies, then you can create a new table, for example Productions, and store the attributes Title, Price, Image, Release Date and Published there. Then you can store foreign keys to the Production entity in your Books and Movies tables.
But if anything unexpected happens in the future, you will need to rebuild the structure or change your assumptions. Even so, it will be easier with the Production entity: you just create a new row with the modified values and assign it to the selected Book or Movie.
A solution with one table for both books and movies is the worst, because if one of the attributes diverges you will have to add a new row, and you will end up with data for a first set (real book and non-existing movie) and a second set (non-existing book and real movie).
Of course, all of this assumes there may be changes in the future. If you are 100% sure there won't be, then one table is a sufficient solution, though not a correct one from the database normalization perspective.
I would personally create separate tables for books and movies.
The table below is from a tutorial where the tables are in 3rd normal form. But if I insert information into the table PROJECT as follows:
projectCode projectDescr customerNo
1 Apples 21
1 Apples 22
Didn't I lose 3NF, because projectCode and projectDescr end up repeating, since 2 customers could possibly have the same project?
So my question is whether the table in the image below is in 3NF. And does the above problem even exist, or am I looking at it wrongly? I am setting up my own table, but before that I am trying to get my understanding of 3NF right. Please help. Thanks.
The table from the tutorial:
The assumption in the example would be that the relationship between PROJECT and CUSTOMER is many to one. A customer may have multiple projects but each project applies to only one customer. If you want a project to apply to multiple customers, then you need to split out another project_customer table that just contains a project and customer key for each row.
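A rough sketch of that split, using Python's built-in sqlite3 module. The project and customer columns are cut down to the ones shown in the question, and the project_customer layout is just one reasonable way to write it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project (
    projectCode  INTEGER PRIMARY KEY,
    projectDescr TEXT NOT NULL
);
CREATE TABLE customer (
    customerNo INTEGER PRIMARY KEY
);
-- Only the two keys: one row per (project, customer) pairing.
CREATE TABLE project_customer (
    projectCode INTEGER NOT NULL REFERENCES project(projectCode),
    customerNo  INTEGER NOT NULL REFERENCES customer(customerNo),
    PRIMARY KEY (projectCode, customerNo)
);
INSERT INTO project  VALUES (1, 'Apples');
INSERT INTO customer VALUES (21), (22);
INSERT INTO project_customer VALUES (1, 21), (1, 22);
""")

# The description 'Apples' is stored once, however many customers share it.
project_rows = conn.execute("SELECT COUNT(*) FROM project").fetchone()[0]
customers = [c for (c,) in conn.execute(
    "SELECT customerNo FROM project_customer "
    "WHERE projectCode = 1 ORDER BY customerNo"
)]
```

With this layout projectDescr depends only on projectCode, so the repetition from the question goes away.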
I am creating a database for a publishing company. The company has around 1300 books and around 6-7 offices. Now I need a table that displays the stock of each item in all locations. The table should look like the following to the user:
Book Name   Location1   Location2   Location3   ......
History     20000       3000        4354
Computers   4000        688         344
Maths       3046        300         0
...
I already have a Books table which stores all the details of the books, and I also have an office table which has the office information. Now if I create a stock management table to back the display above, I will end up with a huge table with a lot of repetition if I store my data in the following way:
Book_ID   Location_ID   Quantity
1         1             20000
1         2             3000
1         3             4354
2         1             4000
2         2             688
...
So, I think this isn't the best way to store the data, as it would end up with 1300 (Books) x 7 (Locations) = 9100 rows. Is there a better way of storing the data? I could have 7 additional columns in the Books table, but if I create a new location, I will have to add another column to the Books table.
I would appreciate any advice or if you think that the above method is suitable or not.
Nope, that's the best way to do it.
What you have is a Many-to-Many relationship between Books and Locations. This is, in almost all cases, stored in the database as an "associative" table between the two main entities. In your case, you also have additional information about that association, namely its "stock" or "quantity" (or, if you think about it like a graph, the magnitude of the connection, or edge weight).
So, it might seem like you have a lot of "duplication", but you don't really. If you were to try to do it any other way, it would be much less flexible. For example, with the design you have now, it doesn't require any database schema change to add another thousand different books or another 20 locations.
If you were to try to put the book quantities inside the Locations table, or the Locations inside the Books table, it would require you to change the layout of the database, and then re-test any code that might use it.
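For what it's worth, here is a small sketch of the associative table together with the "one column per location" display built from it, using Python's built-in sqlite3 module. The table names (books, locations, stock) and the helper stock_grid are made up for the example; the quantities come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books     (book_id     INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE locations (location_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Associative table: one row per (book, location) pair, plus the quantity.
CREATE TABLE stock (
    book_id     INTEGER NOT NULL REFERENCES books(book_id),
    location_id INTEGER NOT NULL REFERENCES locations(location_id),
    quantity    INTEGER NOT NULL,
    PRIMARY KEY (book_id, location_id)
);
INSERT INTO books VALUES (1, 'History'), (2, 'Computers'), (3, 'Maths');
INSERT INTO locations VALUES (1, 'Location1'), (2, 'Location2'), (3, 'Location3');
-- No row is stored for Maths at Location3: a missing row means zero stock.
INSERT INTO stock VALUES
    (1, 1, 20000), (1, 2, 3000), (1, 3, 4354),
    (2, 1, 4000),  (2, 2, 688),  (2, 3, 344),
    (3, 1, 3046),  (3, 2, 300);
""")

def stock_grid():
    """Return {book name: [quantity per location, in location_id order]}."""
    location_ids = [i for (i,) in conn.execute(
        "SELECT location_id FROM locations ORDER BY location_id")]
    names = conn.execute("SELECT name FROM books ORDER BY book_id").fetchall()
    grid = {name: [0] * len(location_ids) for (name,) in names}
    rows = conn.execute("""
        SELECT b.name, s.location_id, s.quantity
        FROM stock s
        JOIN books b ON b.book_id = s.book_id
    """)
    for name, location_id, quantity in rows:
        grid[name][location_ids.index(location_id)] = quantity
    return grid
```

Opening a new office is one INSERT into locations; the grid picks up the extra column without any schema change.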
That's the most common (and effective) solution. Most frameworks, like Django, Modx and several others, implement many-to-many relations only via an intermediate table, using foreign key relations.
Make sure you index your table properly.
ALTER TABLE stock_management ADD INDEX (Book_ID), ADD INDEX (Location_ID);
That really is the best way to do it; you have 9100 independent data points to store, so you really do need 9100 rows (fewer, really; the rows where the quantity is 0 can be omitted). Any other way of arranging the data would require the structure of the table to change when a location is added.
Is it bad to implement supertype and subtype to the entire data in a database? I need some advice on this before moving to this direction...
For instance,
I have these tables as objects and they are related,
users
pages
images
entities table as the supertype
entity_id entity_type
1 page
2 page
3 user
4 user
5 image
6 image
users table
user_id entity_id
1 3
2 4
pages table
page_id entity_id
1 1
2 2
images table
image_id entity_id
1 5
2 6
Here is the table that maps the images table to the entities table, because some images belong to a certain page (and maybe to blog posts, etc. in the future):
map_entity_image table
entity_id image_id
1 1
1 2
So I will insert a row into the entities table whenever a page, an image, a user, etc. is created.
At the end of the day the number of rows in this table will grow a lot, so my worry is whether it can cope with a large number of rows. Will this database get slower and slower over time?
After all, is this a bad structure?
Or maybe I am doing supertype/subtype incorrectly?
edit:
I think the entity should have these data only,
entity_id entity_type
1 page
2 page
unless I want to attach images to users, etc. then it should be like this,
entity_id entity_type
1 page
2 page
3 user
4 user
maybe I am wrong...
EDIT:
So this is the query I use to find out how many images are attached to page id 1:
SELECT E.*, P.*, X.*, C.*
FROM entities E
LEFT JOIN pages P ON (P.entity_id = E.entity_id)
LEFT JOIN map_entity_image X ON (X.entity_id = E.entity_id)
LEFT JOIN images C ON (C.image_id = X.image_id)
WHERE P.page_id = 1
It returns 2 images.
If all you need is to attach images to users and pages, I'm not sure a full-blown category (aka. "subclass", "subtype", "inheritance") hierarchy would be optimal.
Assuming pages/users can have multiple images, and any given image can be attached to multiple pages/users, and assuming you don't want to attach images to images, your model should probably look like this:
You could use category hierarchy to achieve similar result...
...but with so few subclasses I'd recommend against it (due to potential maintainability and performance issues). On the other hand, if there is a potential for adding new subclasses in the future, this might actually be the right solution (ENTITY_IMAGE will automatically "cover" all these new subclasses, so you don't need to introduce a new "link" table for each and every one of them).
BTW, there are 3 major ways to implement the category hierarchy, each with its own set of tradeoffs.
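Since the diagrams are not reproduced here, the following is only my reading of the first model (one "link" table per owner type, no shared supertype), sketched with Python's built-in sqlite3 module. The names page_image and user_image are assumptions, not taken from the original diagrams.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pages  (page_id  INTEGER PRIMARY KEY);
CREATE TABLE users  (user_id  INTEGER PRIMARY KEY);
CREATE TABLE images (image_id INTEGER PRIMARY KEY);
-- One link table per owner type, instead of a shared entities supertype.
CREATE TABLE page_image (
    page_id  INTEGER NOT NULL REFERENCES pages(page_id),
    image_id INTEGER NOT NULL REFERENCES images(image_id),
    PRIMARY KEY (page_id, image_id)
);
CREATE TABLE user_image (
    user_id  INTEGER NOT NULL REFERENCES users(user_id),
    image_id INTEGER NOT NULL REFERENCES images(image_id),
    PRIMARY KEY (user_id, image_id)
);
-- Hypothetical sample data: page 1 has two images, user 1 shares one of them.
INSERT INTO pages  VALUES (1), (2);
INSERT INTO users  VALUES (1);
INSERT INTO images VALUES (1), (2);
INSERT INTO page_image VALUES (1, 1), (1, 2);
INSERT INTO user_image VALUES (1, 2);
""")

# Counting a page's images needs only the link table, no entities lookup.
images_on_page_1 = conn.execute(
    "SELECT COUNT(*) FROM page_image WHERE page_id = 1"
).fetchone()[0]
```

The trade-off described above shows up directly: every new owner type (blog posts, say) would need its own link table in this layout.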
Not exactly an answer to your question, but, what you are describing is not what most modelers would refer to as a "supertype".
This is analogous to super/sub classes in OOP. The supertype is a generic entity, and the subtype is a more specialized version of the generic entity.
The classic example is vehicles. A "vehicle" has a common set of attributes like "owner" , "price", "make", "model". It doesn't matter whether its a car, a bicycle or a boat. However cars have "wheels", "doors" "engine-size" and "engine-type", bicycles have "number-of-gears" and "terrain-type" (BMX, road etc.) and boats have "propellers", "sails" and "cabins".
There are two ways of implementing this.
Firstly there is a "rollup": you have one table which holds all the common attributes for a "vehicle" plus optional attributes for each type of vehicle.
Secondly there is a "rolldown": you have one table which holds only the common attributes for every vehicle, and one table for each vehicle type to hold the attributes specific to "cars", "bicycles" and "boats".
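As a rough illustration of the "rolldown" variant, here is a sketch using Python's built-in sqlite3 module. Only two subtypes and a handful of attributes are shown, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Supertype: the attributes every vehicle has.
CREATE TABLE vehicle (
    vehicle_id INTEGER PRIMARY KEY,
    owner      TEXT NOT NULL,
    price      REAL,
    make       TEXT,
    model      TEXT
);
-- Subtypes: the primary key doubles as a foreign key to the supertype row,
-- so each car or bicycle is exactly one vehicle.
CREATE TABLE car (
    vehicle_id  INTEGER PRIMARY KEY REFERENCES vehicle(vehicle_id),
    doors       INTEGER,
    engine_size REAL
);
CREATE TABLE bicycle (
    vehicle_id      INTEGER PRIMARY KEY REFERENCES vehicle(vehicle_id),
    number_of_gears INTEGER,
    terrain_type    TEXT
);
-- Invented sample rows.
INSERT INTO vehicle VALUES
    (1, 'Ann', 9500.0, 'Volvo', 'V40'),
    (2, 'Bob', 400.0,  'Trek',  'FX');
INSERT INTO car     VALUES (1, 5, 1.6);
INSERT INTO bicycle VALUES (2, 21, 'road');
""")

# Common and car-specific attributes are joined on the shared key.
cars = conn.execute("""
    SELECT v.owner, v.make, c.doors
    FROM vehicle v
    JOIN car c ON c.vehicle_id = v.vehicle_id
""").fetchall()
```

The "rollup" variant would instead put doors, engine_size, number_of_gears and so on into vehicle as nullable columns.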
I can't understand where Facebook really uses MySQL:
All the Database can be seen as a graph:
Account - Like -> Comment
Account <- friend -> Account2
Account - Like -> Link
And what is stored in MySQL?
the text of the posts and notes?
Does Facebook have all these entities (account, post, comment) in its graph DB?
Well, I assume that everything you mentioned is stored in MySQL. Every piece of data that is subject to change, including:
Users
Posts
Comments
Information about uploaded pictures (but not pictures themselves)
Likes
Data about users logging in
Ads
Data about users liking / not liking ads
User settings
etc.
Any data that is subject to change needs to be saved in a database for indexing and fast access. The filesystem is fine if you want write-only data, for example logging, or if you only need to access the whole data at once, not parts of it.
But if you need data to be structured and ready to be accessed quickly, then you need to use a database. You may want to read about binary trees: http://en.wikipedia.org/wiki/Binary_tree
About Facebook: if I had to guess, I would say that there are probably hundreds more databases. I don't have access to their servers, so I can't really comment on that :) But as another example, if you install WordPress, it creates 11 different tables. http://codex.wordpress.org/Database_Description
PS. There is no reason Facebook has to use MySQL, though. There are a lot of different databases out there.
EDIT: Thanks for pointing out that I misunderstood your question.
Let's take this case: Account <- friend -> Account2
As said before, they have table like "Users".
Users table will have columns:
ID (It has PRIMARY KEY. This is meant to give unique ID to each row.)
Username (Text field with some length, for example 64 characters)
...And many more...
Now there will be table "Friends". It will have fields:
ID (again, PRIMARY KEY)
Person1
Person2
Both fields Person1 and Person2 will be integers pointing to ID in table "users".
So if table users has three rows:
ID Username
1 rodi
2 rauni
3 superman
Then table "Friends" would be for example:
ID Person1 Person2
1 1 2
2 2 1
3 1 3
4 3 1
Here row 1 means "rodi is friend with rauni" and row 2 means that "rauni is friend with rodi". This is redundant, but I wanted to keep example simple.
Here is a good tutorial: http://www.tizag.com/mysqlTutorial/mysqltables.php
There are many pages there; just keep clicking Next to skip what you already know (I don't know how much you already know).
This one is about joining info from two tables: http://www.tizag.com/mysqlTutorial/mysqljoins.php
You could use this to select all of rodi's friends from our two tables in one query.
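Such a query might look roughly like this. The sketch uses Python's built-in sqlite3 module instead of MySQL so that it runs on its own; the tables and rows are the ones from the example above, and the helper friends_of is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    ID       INTEGER PRIMARY KEY,
    Username TEXT NOT NULL
);
CREATE TABLE friends (
    ID      INTEGER PRIMARY KEY,
    Person1 INTEGER NOT NULL REFERENCES users(ID),
    Person2 INTEGER NOT NULL REFERENCES users(ID)
);
INSERT INTO users   VALUES (1, 'rodi'), (2, 'rauni'), (3, 'superman');
INSERT INTO friends VALUES (1, 1, 2), (2, 2, 1), (3, 1, 3), (4, 3, 1);
""")

def friends_of(username):
    """Return the usernames that the given user is friends with."""
    rows = conn.execute(
        """SELECT other.Username
           FROM users AS me
           JOIN friends        ON friends.Person1 = me.ID
           JOIN users AS other ON other.ID = friends.Person2
           WHERE me.Username = ?
           ORDER BY other.Username""",
        (username,),
    )
    return [name for (name,) in rows]
```

The users table is joined twice under different aliases: once for the person we start from and once for the friend on the other side of the row.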