I'm designing a simple database (still learning).
Here are the entities:
User, Topic, Article
User to Topic is many-to-many (a user can be interested in many topics)
Topic to User is many-to-many (a topic can interest many users)
User to Blog is one-to-many (a user can write many blogs)
Blog to User is one-to-one (a blog can be authored by only one user)
Here's the question:
Should I still make a one-to-many relationship between Topic and Blog?
For instance, if I want to find all the latest blogs for certain topics, one way is to find all the users interested in those topics, then find all the blogs by those users and sort them by time.
Another way is to keep some redundancy by having a Topic to Blog relationship (one-to-many); then we can get all the blogs from the topic directly and sort them by time.
I'm kind of confused. Should I have this redundancy? What's the best practice here, in terms of ease of programming for backend programmers and query efficiency? (Estimated sizes: 100k users, 200k blogs, 20 topics.)
Thanks a lot!
For many-to-many relationships you should use an associative entity, which means adding a bridge table to store the relations. For one-to-many relationships, simply store the key of the "one" side as a foreign key column on the "many" side.
That said, your schema could look like this:
Users
Topic
Users_Topic (where the primary key consists of the columns identifying both Users and Topic)
Article (which has a foreign key to the Users table through a column, e.g. user_id)
Regarding the relation between Article and Topic: there's no need to store it in a separate table, so you can simply include a foreign key to Topic in Article. If you are worried about the cost of querying the Article table (which I don't think should be a concern right now), you can move the article content, which is what takes up most of the space, into its own table called Article_Content, whose primary key is also a foreign key to Article.
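As a rough sketch of the schema described above, here is one way it could look using Python's built-in sqlite3 module. The column names (topic_id, created_at, title), the index, and the sample rows are assumptions made for illustration, not something the question specifies:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE topic (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Bridge table for the many-to-many "user is interested in topic" relation.
CREATE TABLE users_topic (
    user_id  INTEGER NOT NULL REFERENCES users(id),
    topic_id INTEGER NOT NULL REFERENCES topic(id),
    PRIMARY KEY (user_id, topic_id)
);

-- One-to-many relations are plain foreign key columns on the "many" side.
CREATE TABLE article (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(id),
    topic_id   INTEGER NOT NULL REFERENCES topic(id),
    title      TEXT NOT NULL,
    created_at TEXT NOT NULL  -- ISO-8601 text, so it sorts chronologically
);

-- Assumed index to support "latest articles for a topic".
CREATE INDEX article_topic_time ON article (topic_id, created_at);
""")

conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany("INSERT INTO topic VALUES (?, ?)",
                 [(1, "databases"), (2, "cooking")])
conn.executemany(
    "INSERT INTO article VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, "Intro to joins", "2024-01-01"),
        (2, 2, 1, "Indexes explained", "2024-02-01"),
        (3, 1, 2, "Quick pasta", "2024-03-01"),
    ],
)


def latest_articles(conn, topic_id, limit=10):
    """Return article titles for one topic, newest first."""
    rows = conn.execute(
        "SELECT title FROM article WHERE topic_id = ? "
        "ORDER BY created_at DESC LIMIT ?",
        (topic_id, limit),
    ).fetchall()
    return [title for (title,) in rows]


print(latest_articles(conn, 1))
```

With the foreign key on article, the "latest blogs for a topic" query never has to go through users at all, which is the point of the suggestion above.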
Related
In a social networking site I'm making, I need some way to store which posts a user has 'liked', to ensure they can only 'like' each post one time. I have several ideas.
A separate table for each user, to store all of the different posts' IDs which they've liked as rows in said table.
A space-separated string of post IDs as a field in the table users.
A separate table for each post, storing all of the different users' IDs which have liked said post as rows in the table.
note, users is a table containing all of the site's users, with their ID, username, etc.
Initially I liked the idea of a separate table for each user, but I realised this might be more trouble than it's worth.
So I thought a space-separated string for each row in users might be a good idea, since I wouldn't have to start working with many more tables (which could complicate things), but I have a feeling that using space-separated strings would hurt performance much more than using additional tables, especially as the number of users grows.
Essentially my question is this: Which, out of the aforementioned methods of making sure a user can only like a post once, is the most practical?
None of these sound like particularly good ideas.
Generally, having to create tables on the fly, be it for users or posts, is a bad idea. It will complicate not only your SQL generation, but also clutter up the data dictionary with loads of objects and make maintaining the database much more complicated than it should be.
A delimited string also isn't a good idea. It breaks 1NF and will complicate your queries (or worse, make you write code!) to maintain it.
The sane approach is to use a single table to correlate between users and posts. Each row will hold a user ID and the ID of a post he liked, and creating a composite primary key over the two will ensure that a user can't like a post twice:
CREATE TABLE user_post_likes (
    user_id INT NOT NULL, -- Or whatever type you're using in the users table
    post_id INT NOT NULL, -- Or whatever type you're using in the posts table
    PRIMARY KEY (user_id, post_id), -- A given (user, post) pair can appear only once
    FOREIGN KEY (user_id) REFERENCES users(id),
    FOREIGN KEY (post_id) REFERENCES posts(id)
);
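To see the composite key doing its job, here is a small sketch using Python's built-in sqlite3 module. The like() helper and the minimal users and posts tables are made up for the example; other databases raise their own error type for a duplicate key, but the idea is the same:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE posts (id INTEGER PRIMARY KEY);
CREATE TABLE user_post_likes (
    user_id INTEGER NOT NULL REFERENCES users(id),
    post_id INTEGER NOT NULL REFERENCES posts(id),
    PRIMARY KEY (user_id, post_id)
);
INSERT INTO users (id) VALUES (1);
INSERT INTO posts (id) VALUES (10);
""")


def like(conn, user_id, post_id):
    """Record a like; return False if this user already liked this post."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO user_post_likes (user_id, post_id) VALUES (?, ?)",
                (user_id, post_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejected the duplicate row.
        return False


first = like(conn, 1, 10)   # new like
second = like(conn, 1, 10)  # same user, same post again
count = conn.execute("SELECT COUNT(*) FROM user_post_likes").fetchone()[0]
print(first, second, count)
```

The application never has to check for an existing like first; the database refuses the second row on its own.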
I have a newbie question about a database I am trying to create. I have a list of publications, with these general fields:
UID, Author, URL, Title, Publication, start page, end page, volume, year
Then I realized that a publication can have multiple authors, and I began trying to normalize the database for that. Then I realized that the order of the authors is important, and that a journal article could have anywhere from one author to dozens, or possibly even more.
Should I just create a table with multiple author columns (say 12 of them, mostly null)? Or is there a way to have a variable number of columns depending on the number of authors?
Database model
You basically need a many-to-many relationship between Authors and Publications, since one author can write many publications, and one publication can be written by more than one author.
This requires you to have 3 tables.
Author - general info about every author (without publications_id)
Publication - general info about every publication (without author_id)
AuthorPublication - columns author_id and publication_id that are references to tables Author and Publication.
This way you're not binding a specific author to a publication, but you can have more of them, and the same thing the other way around.
Additional notes
If you would like to distinguish an author's role in a particular publication, you could also add a column such as id_role to AuthorPublication, referencing a dictionary table that lists all possible roles for an author. This way you could distinguish between lead authors, co-authors, etc. You could also store information about people who translated the book, but in that case you should perhaps rename Author to something less specific.
Order of appearance
You can ensure a proper ordering of your authors by adding a column in AuthorPublication which you would increment separately for every Publication. This way you would be able to preserve the ordering as you need it.
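Here's a minimal sketch of the three tables with an ordering column, using Python's built-in sqlite3 module. The column name position, the UNIQUE constraint on it, and the sample data are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE publication (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE author_publication (
    publication_id INTEGER NOT NULL REFERENCES publication(id),
    author_id      INTEGER NOT NULL REFERENCES author(id),
    position       INTEGER NOT NULL,  -- 1 = first author, 2 = second, ...
    PRIMARY KEY (publication_id, author_id),
    UNIQUE (publication_id, position)  -- no two authors share a slot
);
""")

conn.executemany("INSERT INTO author VALUES (?, ?)",
                 [(1, "Ada"), (2, "Grace"), (3, "Edsger")])
conn.executemany("INSERT INTO publication VALUES (?, ?)",
                 [(1, "Paper A"), (2, "Paper B")])
# position restarts at 1 for every publication.
conn.executemany(
    "INSERT INTO author_publication VALUES (?, ?, ?)",
    [(1, 3, 1), (1, 1, 2), (2, 1, 1), (2, 2, 2)],
)


def authors_in_order(conn, publication_id):
    """Return the author names of one publication in byline order."""
    rows = conn.execute(
        "SELECT a.name FROM author_publication ap "
        "JOIN author a ON a.id = ap.author_id "
        "WHERE ap.publication_id = ? ORDER BY ap.position",
        (publication_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(authors_in_order(conn, 1))
print(authors_in_order(conn, 2))
```

A publication with one author and one with dozens are stored the same way: one row per author in author_publication, with no null columns.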
You have a many-to-many relationship between the entity Publication and the entity Author.
A publication can have many authors, and an author can have many publications.
So you should create a table for this relationship, for example Authors_Publications with the columns UID, author_id, publication_id, and order (ORDER is a reserved word in SQL, so in practice a name such as author_order avoids having to quote it).
You should create an author table that has a many-to-many relation with the publication table.
An author has some information of its own, and so does a publication, so you should have separate tables for author and publication.
Each table has its own primary key, such as author_id and publication_id, and the many-to-many relationship links those two keys.
Actually your scenario is even more complicated.
A publication can have more than one author. An author can write more than one published article or book. That is a Many-to-Many relationship.
We always(*) represent a many-to-many with a third table, sometimes called a bridge table. This third table, authorship_, is a child table with at least two columns, both foreign keys, each holding the primary key of one of its parent tables, pub_ and author_. We transform the many-to-many into a pair of one-to-many relationships.
By the way, this books-author scenario is the canonical example used when teaching relational database design.
You can have additional fields on this third table. In your case, we need a priority_ column of an integer type to sort the list of primary vs secondary authors.
Each author’s compensation fee or royalty would be additional columns on this bridge table. If you were tracking each author needing to sign a contract for their work on that publication, the authorship_ table would have a date, date-time, or boolean column contract_signed_. So you can see that the bridge table represents anything to do with one particular author’s involvement on one particular publication.
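To illustrate the bridge table carrying its own attributes, here is a small sketch using Python's built-in sqlite3 module. The column names other than priority_ and contract_signed_ (id_, name_, title_, pub_id_, author_id_) are hypothetical, as is the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author_ (id_ INTEGER PRIMARY KEY, name_ TEXT NOT NULL);
CREATE TABLE pub_ (id_ INTEGER PRIMARY KEY, title_ TEXT NOT NULL);

-- One row per (publication, author) pair, plus facts about that pairing.
CREATE TABLE authorship_ (
    pub_id_          INTEGER NOT NULL REFERENCES pub_(id_),
    author_id_       INTEGER NOT NULL REFERENCES author_(id_),
    priority_        INTEGER NOT NULL,            -- 1 = primary author
    contract_signed_ INTEGER NOT NULL DEFAULT 0,  -- boolean stored as 0/1
    PRIMARY KEY (pub_id_, author_id_)
);

INSERT INTO author_ VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO pub_ VALUES (1, 'Book A');
INSERT INTO authorship_ VALUES (1, 1, 1, 1), (1, 2, 2, 0);
""")

# Which authors of publication 1 still have to sign, in priority order?
unsigned = [
    name
    for (name,) in conn.execute(
        "SELECT a.name_ FROM authorship_ s "
        "JOIN author_ a ON a.id_ = s.author_id_ "
        "WHERE s.pub_id_ = ? AND s.contract_signed_ = 0 "
        "ORDER BY s.priority_",
        (1,),
    )
]
print(unsigned)
```

The contract flag belongs to neither the author nor the publication alone, which is why it sits on the bridge table.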
(*) Not merely an opinion or suggestion. Relational database design is proven by entire books filled with mathematical proofs. This includes the need to break up a many to many with a third table. Relational database design is the only case of true information engineering backed by mathematical description and proofs. Search for relation (a field of mathematics), and doctors E.F. Codd and Chris Date to learn more.
I'm trying to build my first complicated J2EE solution, and in every tutorial I find some use of intermediary tables, like here:
Tables : User, User_Roles, Roles
Logic would suggest simply adding a key to the User table referring to its role in the Roles table, so why use that intermediary table?
I thought it was the choice of one or two developers, but in every tutorial I look at I find this sort of SQL schema.
Is it better? Does it help with something in particular, such as speed or security? Because from a logical point of view, using one User table with a foreign key to Roles seems better.
Thank you
This is a common database relationship model called M-N (many-to-many). A User can have many Roles, and a Role can be assigned to many Users, so you need the intermediary table. Here's another example: a Teacher can teach many Classes, and each Class can be taught by many Teachers (during different semesters, for example). In this case you need a Teacher-Class intermediary table.
A different kind of relationship is 1-N (one to N). A User can have many Telephones, but each Telephone is owned by a single User. In this case, a User's primary key (PK) is exported as a foreign key (FK) into the Telephones table. No need for an intermediary table.
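The two kinds of relationship can be put side by side in a short sketch using Python's built-in sqlite3 module. The table and column names follow the examples above, but the exact names, the helper functions, and the sample data are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE roles (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);

-- M-N: the intermediary table holds one row per (user, role) pair.
CREATE TABLE user_roles (
    user_id INTEGER NOT NULL REFERENCES users(id),
    role_id INTEGER NOT NULL REFERENCES roles(id),
    PRIMARY KEY (user_id, role_id)
);

-- 1-N: the user's PK is exported as an FK; no intermediary table needed.
CREATE TABLE telephones (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    number  TEXT NOT NULL
);

INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
INSERT INTO roles VALUES (1, 'admin'), (2, 'editor');
INSERT INTO user_roles VALUES (1, 1), (1, 2), (2, 2);
INSERT INTO telephones VALUES (1, 1, '555-0100'), (2, 1, '555-0101');
""")


def roles_of(conn, user_id):
    """All role names held by one user."""
    rows = conn.execute(
        "SELECT r.name FROM user_roles ur JOIN roles r ON r.id = ur.role_id "
        "WHERE ur.user_id = ? ORDER BY r.name",
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]


def users_with(conn, role_name):
    """All user names that hold one role."""
    rows = conn.execute(
        "SELECT u.name FROM user_roles ur "
        "JOIN users u ON u.id = ur.user_id "
        "JOIN roles r ON r.id = ur.role_id "
        "WHERE r.name = ? ORDER BY u.name",
        (role_name,),
    ).fetchall()
    return [name for (name,) in rows]


print(roles_of(conn, 1), users_with(conn, "editor"))
```

A single role_id column on users could not represent alice being both admin and editor, which is what the intermediary table buys you.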
For example, I have a table of Users and a table of Blogs and a separate table called User_To_Blogs which contain only User and Blog IDs.
What are the benefits of such a table as opposed to having a User ID column in the Blog table?
These sorts of tables are useful for many-to-many relationships, where a user can have many blogs and a blog can have many authors, for example.
Perhaps a better example is the person -> employer relationship, where a person can have worked for many employers, and an employer can have many employees.
Let's say your blog table has information like blog name, blog start date, blog URL, etc. If the user ID lived in that table, a blog with several authors would need one row per author, each repeating the same blog information. If you define the blog record in its own table, you define it once and reference it as many times as you want in the user-blog table.
I am designing a database for student information. I wish to implement the best practices regarding separate tables and use of Primary and Foreign Keys.
Let's say I have the following tables (High Level):
Users
Student Information
Student Transcripts
Student Records
There will be different users with different levels. Also, the information in Student Info/Transcripts/Records will all have a Foreign Key with the ID that's in Users.
So, it would be dumb to just clump all the tables into one big table, wouldn't it? Is it a good idea to keep all this information separate and just use primary/foreign keys, and maybe joins, to link things together? I personally think one big table would be quite messy, and that separate tables let you keep each kind of data together with its own kind.
Thanks for all input on the matter!