I am trying to normalize my database, but I'm having trouble getting to grips with it. I am developing a CMS where Facebook users can create a page on my site. So far this is what I have:
page
----
uid - PK AI
slug - Slug URL
title - Page title
description - Page description
image - Page image
imageThumbnail - Thumbnail of image
owner - The ID of the user that created the page
views - Page views
timestamp - Date page was created
user
----
uid - PK AI
fbid - Facebook ID
(at a later date I may add profile options, e.g. name, website, etc.)
tags
----
uid - PK AI
tag - String (tag name)
page_tag
--------
pid - Page id (uid from page table)
tid - Tag id (uid from tag table)
page_user
---------
pid - Page id (uid from page table)
uid - User ID (uid from user table)
I've tried to separate as much information as needed without going over the top. I created a separate table for tags because I don't want tag names being repeated. If the database holds 100,000+ pages, the repeated tags will no doubt cost storage and speed.
Are there any problems with the design? Or anything I'm doing wrong? I remember learning this at university but I've done very little database design since then.
I'd rather get it right the first time than have the headache later on.
Looks fine to me. How bad can it be with five tables?
You have users, pages, and tags. Users can have many pages; pages can be referred to by many users. A page can have many tags; a tag can be associated with many pages.
Sums it up for me. I wouldn't worry about it.
Your next concern is indexes. You'll want an index on the columns used in each WHERE clause that you'll query with.
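To make the schema and the indexing advice concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for whatever database the CMS actually runs on (an assumption, since the question does not say). The index names, the UNIQUE constraints, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (
    uid  INTEGER PRIMARY KEY AUTOINCREMENT,
    fbid TEXT NOT NULL UNIQUE            -- assumption: one account per Facebook ID
);
CREATE TABLE page (
    uid   INTEGER PRIMARY KEY AUTOINCREMENT,
    slug  TEXT NOT NULL UNIQUE,          -- UNIQUE also gives slug lookups an index
    title TEXT NOT NULL,
    owner INTEGER NOT NULL REFERENCES user(uid)
);
CREATE TABLE tags (
    uid INTEGER PRIMARY KEY AUTOINCREMENT,
    tag TEXT NOT NULL UNIQUE             -- each tag name is stored once
);
CREATE TABLE page_tag (
    pid INTEGER NOT NULL REFERENCES page(uid),
    tid INTEGER NOT NULL REFERENCES tags(uid),
    PRIMARY KEY (pid, tid)
);
-- Hypothetical indexes for the WHERE clauses we expect to run
CREATE INDEX idx_page_owner   ON page(owner);
CREATE INDEX idx_page_tag_tid ON page_tag(tid);
""")

conn.execute("INSERT INTO user (fbid) VALUES ('fb-1001')")
conn.execute("INSERT INTO page (slug, title, owner) VALUES ('my-band', 'My Band', 1)")
conn.execute("INSERT INTO tags (tag) VALUES ('music')")
conn.execute("INSERT INTO page_tag (pid, tid) VALUES (1, 1)")


def pages_for_tag(tag_name):
    """Return the slugs of all pages carrying the given tag."""
    rows = conn.execute(
        """SELECT p.slug
           FROM page p
           JOIN page_tag pt ON pt.pid = p.uid
           JOIN tags t ON t.uid = pt.tid
           WHERE t.tag = ?
           ORDER BY p.slug""",
        (tag_name,),
    )
    return [r[0] for r in rows]


# Names of the indexes SQLite knows about for the page table
index_names = [r[1] for r in conn.execute("PRAGMA index_list('page')")]
```

The same CREATE INDEX statements should carry over to MySQL with little change; which columns deserve an index depends on the queries you actually end up writing.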
Related
I have some tables in the db which have either one image or several images associated with them. For instance:
# table 1
- id
- name
- created_at
# table 2
- id
- name
- created_at
Now each of these tables has either one or many images. A typical design would be like this:
# table 1
- id
- name
- image_path
- created_at
# table 2
- id
- name
- created_at
# images table
- id
- table_2_id
- image_path
- created_at
However, I have run into several problems with this design:
I have many tables associated with one or more images.
Images are going to be uploaded to different hosts for the sake of storage capacity.
More tables with the same needs might be added to my database.
Depending on domain changes, some tables' image paths might change as well.
So now I am wondering: is a multi-dimensional images table like the one below the right design choice for this problem, and is it also going to be future-proof?
# images
- id
- table_id
- table_name
- image_path
- created_at
Best regards. Thank you.
You are looking at the problem the wrong way around. You need one table with all your images, and each table that needs an image will have a link to the images table.
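A rough sketch of that idea, using Python's sqlite3 module as a stand-in for the real database (an assumption). The link table name table_2_images and the sample rows are hypothetical; table_1 and table_2 are the placeholder names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE images (
    id         INTEGER PRIMARY KEY,
    image_path TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- A table whose rows have exactly one image: a plain foreign key column
CREATE TABLE table_1 (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    image_id INTEGER REFERENCES images(id)
);
-- A table whose rows have many images: a link table (hypothetical name)
CREATE TABLE table_2 (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE table_2_images (
    table_2_id INTEGER NOT NULL REFERENCES table_2(id),
    image_id   INTEGER NOT NULL REFERENCES images(id),
    PRIMARY KEY (table_2_id, image_id)
);
""")

conn.executemany(
    "INSERT INTO images (id, image_path) VALUES (?, ?)",
    [(1, "a.jpg"), (2, "b.jpg"), (3, "c.jpg")],
)
conn.execute("INSERT INTO table_1 (id, name, image_id) VALUES (1, 'single', 1)")
conn.execute("INSERT INTO table_2 (id, name) VALUES (1, 'gallery')")
conn.executemany("INSERT INTO table_2_images VALUES (?, ?)", [(1, 2), (1, 3)])


def images_for_table_2(row_id):
    """Return the image paths linked to one table_2 row."""
    rows = conn.execute(
        """SELECT i.image_path
           FROM images i
           JOIN table_2_images l ON l.image_id = i.id
           WHERE l.table_2_id = ?
           ORDER BY i.id""",
        (row_id,),
    )
    return [r[0] for r in rows]
```

Because every link is a real foreign key, the database can reject a link to an image that does not exist, which a generic table_id/table_name pair cannot do.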
You might consider doing a quick install of a popular open source CMS like WordPress or Drupal and add a few images to see how they accomplish this. Many CMS may have thought through some issues you have not considered.
I want to organize articles written on my website. Currently, I have an author submit their work to me (via email) and I copy/paste their article into a .php file and upload their file with FTP. At the same time I need to update the links for the navigation menu based on the new article.
I've been reading that I can put everything into a mysql database.
Right now, I have 2 Columns (a music column and a college life column) - each column will have articles updated every two weeks by a different author. How do I organize my database?
What I was thinking...(after doing some reading)
Table Column:
Column_id
Name
Description
Create_date
Table Column_authors:
column_id
author_id
Table Articles:
Article_id
column_id
Title
Description/Summary
Body
create_date
Table Articles_authors:
article_id
author_id
Table Articles_keyword:
article_id
keyword_id
Table authors:
author_id
Name
Email
about
Table Keyword
keyword_id
name
?????
(I'm not sure how to organize with the keyword - each article can have multiple keywords)
I'm completely new to organizing with a database, so I have no idea what I'm doing!
Could someone point me in the direction of a good tutorial?
Please let me know if I need to be more specific.
You can do this with WordPress. WordPress is built on top of a MySQL database, but you don't really need to mess around with it too much other than setting it up initially, if that (some hosting sites have an automated WordPress install that sets up the database for you).
Once you are all set up, then you can use Posts in WordPress for your articles and the latest article is displayed first, with links to the old ones automatically generated. If you have any static content, you can use Pages in WordPress.
I'm creating a database for a photography website and I want it to allow three main things -
Allow the owner/admin to create client accounts and credentials,
Specifying which photos should go into three different portfolio galleries on the site, and,
Displaying a unique client's photos (and only their photos!) to them when they log in.
This is my first database design ever - based on responses below, I've added that emphasis ;) and edited the design as below.
IMAGES
image_id,
filename,
description,
client_id,
date_uploaded,
USERS/CLIENTS
client_id,
client_name
username,
password,
PORTFOLIO
portfolio_id,
portfolio_name,
PORTFOLIO_IMAGES
id,
image_id,
portfolio_id,
Am I correct in thinking that the final id in PORTFOLIO_IMAGES would allow me to display one image in multiple galleries?
Thanks
As this is your first DB design, and as was mentioned in the comments, something essential is missing here: an ER diagram. It helps a lot in understanding what's going on.
ER-Diagram
Synonyms: User=Account, Image=Photo, Gallery=Portfolio
Known Roles: "Admin", "Client"
Examples for Rights: "Create Account", "Delete Account", "Watch images", "Add Gallery", "Remove Gallery", "Upload image", "Delete image", ...
Table Design
User
id
name
password
Image
id
user_id
filename
description
upload_date
Image_Gallery
image_id
gallery_id
Gallery
id
name
User_Role
user_id
role_id
User_Right
user_id
right_id
Role
id
name
Role_Right
role_id
right_id
Right
id
name
You may want to remove all the things with Right if it is enough to separate user privileges by Role.
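As a sketch of how the role and right tables fit together, the following uses Python's sqlite3 module in place of the real database (an assumption). The sample users and the choice of which role gets which right are invented; the role and right names come from the lists above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT, password TEXT);
CREATE TABLE Role (id INTEGER PRIMARY KEY, name TEXT);
-- RIGHT is an SQL keyword, so the table name has to be quoted
CREATE TABLE "Right" (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE User_Role  (user_id INTEGER, role_id  INTEGER, PRIMARY KEY (user_id, role_id));
CREATE TABLE Role_Right (role_id INTEGER, right_id INTEGER, PRIMARY KEY (role_id, right_id));
CREATE TABLE User_Right (user_id INTEGER, right_id INTEGER, PRIMARY KEY (user_id, right_id));

INSERT INTO User VALUES (1, 'owner', 'hash-1'), (2, 'client-a', 'hash-2');
INSERT INTO Role VALUES (1, 'Admin'), (2, 'Client');
INSERT INTO "Right" VALUES (1, 'Create Account'), (2, 'Watch images'), (3, 'Upload image');
INSERT INTO Role_Right VALUES (1, 1), (1, 2), (1, 3), (2, 2);
INSERT INTO User_Role  VALUES (1, 1), (2, 2);
INSERT INTO User_Right VALUES (2, 3);  -- one extra right granted directly to client-a
""")


def rights_of(user_id):
    """Rights a user holds through roles plus rights granted directly."""
    rows = conn.execute(
        """SELECT r.name
           FROM "Right" r
           JOIN Role_Right rr ON rr.right_id = r.id
           JOIN User_Role ur ON ur.role_id = rr.role_id
           WHERE ur.user_id = ?
           UNION
           SELECT r.name
           FROM "Right" r
           JOIN User_Right ux ON ux.right_id = r.id
           WHERE ux.user_id = ?
           ORDER BY 1""",
        (user_id, user_id),
    )
    return [r[0] for r in rows]
```

If roles alone are enough, the User_Right table and the second half of the UNION simply go away.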
Within the tables images and users, you will be referencing the client's id, not the name.
I would create a separate table for the galleries, as clients tend to have new wishes every three months, so you may need to add more galleries.
table "galleries"
id
name
table "image_is_in_gallery"
image_id
gallery_id
PRIMARY(image_id, gallery_id)
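A small sketch of what the composite primary key buys you, written against Python's sqlite3 module rather than the real database (an assumption). The gallery names and image ids are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE galleries (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE image_is_in_gallery (
    image_id   INTEGER NOT NULL,
    gallery_id INTEGER NOT NULL,
    PRIMARY KEY (image_id, gallery_id)   -- one row per (image, gallery) pair
);
""")
conn.executemany("INSERT INTO galleries VALUES (?, ?)", [(1, "weddings"), (2, "portraits")])
# The same image (id 10) is placed in two galleries
conn.executemany("INSERT INTO image_is_in_gallery VALUES (?, ?)", [(10, 1), (10, 2)])


def add_to_gallery(image_id, gallery_id):
    """Return True if the link was added, False if it already existed."""
    try:
        conn.execute(
            "INSERT INTO image_is_in_gallery VALUES (?, ?)", (image_id, gallery_id)
        )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejects a duplicate pair
        return False


def gallery_count(image_id):
    """Number of galleries an image appears in."""
    row = conn.execute(
        "SELECT COUNT(*) FROM image_is_in_gallery WHERE image_id = ?", (image_id,)
    ).fetchone()
    return row[0]
```

An image can sit in as many galleries as you like, but never twice in the same one, and no extra surrogate id column is needed for that.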
You might want to consider normalization.
Assuming that usernames are unique - two people can't have the same username, come on - then you can eliminate "id" in the Users table in order to help prevent update/insert/delete anomalies (doing this would almost certainly put Users into BCNF, and likely DKNF - this is a good thing).
Clients is fine. What is the difference between Clients and Users, though? Really... seems similar to me.
Make sure that references are done using foreign key constraints, and I think that should be better.
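To show what a foreign key constraint does in practice, here is a minimal sketch using Python's sqlite3 module as a stand-in (an assumption; in MySQL you would typically need the InnoDB engine for the constraint to be enforced). The table layout follows the question; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE clients (
    client_id   INTEGER PRIMARY KEY,
    client_name TEXT NOT NULL
);
CREATE TABLE images (
    image_id  INTEGER PRIMARY KEY,
    filename  TEXT NOT NULL,
    client_id INTEGER NOT NULL REFERENCES clients(client_id)
);
""")
conn.execute("INSERT INTO clients VALUES (1, 'Alice')")
conn.execute("INSERT INTO images VALUES (1, 'a.jpg', 1)")


def try_insert_image(image_id, filename, client_id):
    """Insert an image; return 'ok' or the database's error message."""
    try:
        conn.execute(
            "INSERT INTO images VALUES (?, ?, ?)", (image_id, filename, client_id)
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)


def image_count():
    return conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
```

With the constraint in place, an image pointing at a client that does not exist never makes it into the table, so a logged-in client cannot end up seeing orphaned rows.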
EDIT:
Based on the new design, I have these suggestions:
Change Clients/Users into three tables:
ClientNames
- ClientID (PK)
- ClientName
ClientUsernames
- ClientID (PK)
- Username
UsernamePasswords
- Username (PK)
- Password
This is safe and says that one Client/User has one name, one Client/User has one Username, and one Username has one Password. I don't see another good decomposition (in the sense that it's going to be in a tight normal form).
You can eliminate one of these tables by eliminating the synthetic "ClientID" key, if you want. There are disadvantages to this, and it may not be possible (some people do have the same name!).
The problem here is that it is likely that ClientID, ClientName, and UserName determine each other in a way that isn't amenable to stuffing them in the same table.
Use client_id instead of client_name on the images and users tables.
Add another table, portfolio, with at least name and id columns.
Add another table, portfolio_images, with two columns, image_id and portfolio_id. This will allow the feature mentioned by @Alex in the comments.
response to edit
You can do the one image in multiple portfolios by querying PORTFOLIO_IMAGES and JOINing with images or portfolios as necessary. For example, if you want to display the wedding portfolio (pseudo-code):
SELECT filename,...
FROM images img
INNER JOIN portfolio_images pimg ON img.image_id = pimg.image_id
WHERE pimg.portfolio_id = <whatever the id is for wedding portfolio>
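For a version of that query you can actually run, here is a sketch using Python's sqlite3 module in place of the real database (an assumption). The filenames and portfolio names are invented, and the lookup is by portfolio name through an extra join, which is one possible way to avoid hard-coding the id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (image_id INTEGER PRIMARY KEY, filename TEXT);
CREATE TABLE portfolio (portfolio_id INTEGER PRIMARY KEY, portfolio_name TEXT);
CREATE TABLE portfolio_images (
    image_id     INTEGER,
    portfolio_id INTEGER,
    PRIMARY KEY (image_id, portfolio_id)
);
INSERT INTO images VALUES (1, 'kiss.jpg'), (2, 'cake.jpg'), (3, 'studio.jpg');
INSERT INTO portfolio VALUES (1, 'wedding'), (2, 'portrait');
-- image 1 appears in both portfolios
INSERT INTO portfolio_images VALUES (1, 1), (2, 1), (1, 2), (3, 2);
""")


def filenames_in(portfolio_name):
    """Return the filenames of all images in the named portfolio."""
    rows = conn.execute(
        """SELECT img.filename
           FROM images img
           INNER JOIN portfolio_images pimg ON img.image_id = pimg.image_id
           INNER JOIN portfolio p ON p.portfolio_id = pimg.portfolio_id
           WHERE p.portfolio_name = ?
           ORDER BY img.image_id""",
        (portfolio_name,),
    )
    return [r[0] for r in rows]
```

Calling filenames_in("wedding") and filenames_in("portrait") both return kiss.jpg, which is the one-image-in-several-galleries behaviour asked about.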
I have an app that will allow an admin to upload an article and share it with many users so they can edit it. The article is then broken down into sentences, which will be stored as individual rows in a MySQL DB. Each user can edit article sentences one at a time. How does one structure the database to allow admins to adjust the article sentences (merge, move, delete, edit, add) and still maintain the integrity of the user's relationship to the article sentences?
Here is the basic structure:
article_sentences
---------------
-id (auto_increment)
-article_id (FK)
-paragraph_id
-content
user_article_sentences
---------------
-user_id (FK)
-article_id (FK)
-article_sentence_id (FK)
-user_content
One problem I see is the change in article_sentence ID. If the admin moves an article around, the ID will need to change along with the paragraph_id possibly changing if we want the article content to be in the correct order. To solve this, maybe we can add an article_sentence_order column? That way the id will never change but the order of the content is dictated by the article_sentence_order column.
What about merging and deleting? Those will cause some problems as well because fragmentation of the different IDs will start to happen.
Any ideas on a new schema design that will help solve these issues? How does an app like Google Docs deal with this type of issue?
Edit:
To solve the issue of moving sentences around, we can use a new column called order_id, which can be either a varchar or an int. Some tradeoffs: if it is an int, then on every insert I will have to increment the order_id of each subsequent sentence by 1. If it is a varchar, the order_id can simply be something like '3a' if I want to insert between 3 and 4. The problem with this is that in my application code, using numeric indexes to traverse to the next and previous sentences will be a bit of a problem.
Are there other alternatives?
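For reference, the integer order_id variant described in the edit could look roughly like this. The sketch uses Python's sqlite3 module instead of MySQL (an assumption), and the helper names and sample sentences are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article_sentences (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    article_id INTEGER NOT NULL,
    order_id   INTEGER NOT NULL,
    content    TEXT NOT NULL
);
""")
for pos, text in enumerate(["First.", "Second.", "Third."], start=1):
    conn.execute(
        "INSERT INTO article_sentences (article_id, order_id, content) VALUES (1, ?, ?)",
        (pos, text),
    )


def insert_sentence(article_id, position, content):
    """Insert a sentence at `position`, shifting later sentences down by one."""
    with conn:  # run the shift and the insert as one transaction
        conn.execute(
            "UPDATE article_sentences SET order_id = order_id + 1 "
            "WHERE article_id = ? AND order_id >= ?",
            (article_id, position),
        )
        cur = conn.execute(
            "INSERT INTO article_sentences (article_id, order_id, content) "
            "VALUES (?, ?, ?)",
            (article_id, position, content),
        )
    return cur.lastrowid


def ordered(article_id):
    """Return (id, content) pairs in display order."""
    rows = conn.execute(
        "SELECT id, content FROM article_sentences "
        "WHERE article_id = ? ORDER BY order_id",
        (article_id,),
    )
    return [tuple(r) for r in rows]
```

The existing ids never change when a sentence is inserted, so rows in user_article_sentences that point at them stay valid; only order_id is rewritten.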
What about storing only the full version of the content, with a version number for each record, so that you have a complete history of how the article was edited and by whom?
User:
- id
- name
User_article:
- id
- user_id (fk on user, this is the current editor)
- article_id
- version_number
- article_content (the full content of the article)
Article:
- id
- created_date
- user_id (the creator, or main owner )
- category_id
This way, it is very easy to revert an article's content to a previous point in history, to see which user made which modifications, etc.
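A minimal sketch of this versioning scheme, using Python's sqlite3 module in place of MySQL (an assumption). The UNIQUE constraint, the helper names, and the decision to implement a revert as a new version (rather than deleting newer rows) are my own choices for the example, not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE article (
    id           INTEGER PRIMARY KEY,
    created_date TEXT,
    user_id      INTEGER,
    category_id  INTEGER
);
CREATE TABLE user_article (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id         INTEGER NOT NULL,
    article_id      INTEGER NOT NULL,
    version_number  INTEGER NOT NULL,
    article_content TEXT NOT NULL,
    UNIQUE (article_id, version_number)   -- assumption: versions are per article
);
""")


def save_version(user_id, article_id, content):
    """Store the full content as the next version and return its number."""
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version_number), 0) FROM user_article WHERE article_id = ?",
        (article_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO user_article (user_id, article_id, version_number, article_content) "
        "VALUES (?, ?, ?, ?)",
        (user_id, article_id, latest + 1, content),
    )
    return latest + 1


def content_at(article_id, version_number):
    """Return the article content at a given version, or None."""
    row = conn.execute(
        "SELECT article_content FROM user_article "
        "WHERE article_id = ? AND version_number = ?",
        (article_id, version_number),
    ).fetchone()
    return row[0] if row else None


def revert(user_id, article_id, version_number):
    """Revert by saving the old content as a new version, keeping the history."""
    return save_version(user_id, article_id, content_at(article_id, version_number))
```

Storing whole articles per version costs more space than storing sentences, but it sidesteps the merge, move, and delete bookkeeping entirely.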
I'm working on a project where I have the following (edited) table structures: (MySQL)
Blog
id
title
description
Episode
id
title
description
Tag
id
text
The idea is that tags can be applied to any Blog or Episode (and to other types of sources), and new tags can be created by the user if they don't already exist in the tag table.
The purpose of the tags is that a user will be able to search the site, and the results will search across all types of material on the site. Also, at the bottom of each blog article/episode description it would have a list of tags for that item.
I haven't thought too much about the search mechanism, but I guess it'd be flexible between OR and AND searches, if that has any impact on choices, and would probably allow the user to filter the results for particular types of sources.
Originally I was planning to create multiple tag mapping tables:
BlogTag
id
tag_id
blog_id
EpisodeTag
id
episode_id
tag_id
But now I wonder if I would be better off with:
TaggedStuff
id
source_type
source_id
tag_id
Where source_type would be an integer related to whether it was an Episode, Blog, or some other type that I've not included in the structures above, and source_id would be the reference in that particular table.
I'm just wondering what the optimum structure would be for this, the first choice or the second?
In a clean (academic) design you would often see a supertype Resource (or something similar) for Blog and Episode, with its own table. Another table holds the tags. And since it's an N:M relationship between Tag and Resource, you have an extra mapping table between them.
So in such a design you would associate the Tag entities with your resources through a relationship to their generalization.
After that you can move general attributes (e.g. title, description) to the generalization.
You can add attributes to the relationship between Tag and Resource, like a counter for how often a specific resource was tagged with a specific tag, or how often a tag was used, and so on (e.g. something like what you see on Stack Overflow in the upper right here).
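One possible shape for such a supertype design, sketched with Python's sqlite3 module instead of MySQL (an assumption). The ResourceTag table name, the type column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- Supertype: shared attributes live here
CREATE TABLE Resource (
    id          INTEGER PRIMARY KEY,
    type        TEXT NOT NULL CHECK (type IN ('blog', 'episode')),
    title       TEXT NOT NULL,
    description TEXT
);
-- Subtypes share the supertype's key; type-specific columns would go here
CREATE TABLE Blog    (id INTEGER PRIMARY KEY REFERENCES Resource(id));
CREATE TABLE Episode (id INTEGER PRIMARY KEY REFERENCES Resource(id));
CREATE TABLE Tag (id INTEGER PRIMARY KEY, text TEXT NOT NULL UNIQUE);
-- One mapping table for every kind of resource, with real foreign keys
CREATE TABLE ResourceTag (
    resource_id INTEGER NOT NULL REFERENCES Resource(id),
    tag_id      INTEGER NOT NULL REFERENCES Tag(id),
    PRIMARY KEY (resource_id, tag_id)
);
INSERT INTO Resource VALUES (1, 'blog', 'Hello', NULL), (2, 'episode', 'Pilot', NULL);
INSERT INTO Blog VALUES (1);
INSERT INTO Episode VALUES (2);
INSERT INTO Tag VALUES (1, 'intro'), (2, 'news');
INSERT INTO ResourceTag VALUES (1, 1), (2, 1), (1, 2);
""")


def search(tag_text, resource_type=None):
    """Find (type, title) of resources with a tag, optionally filtered by type."""
    sql = """SELECT r.type, r.title
             FROM Resource r
             JOIN ResourceTag rt ON rt.resource_id = r.id
             JOIN Tag t ON t.id = rt.tag_id
             WHERE t.text = ?"""
    params = [tag_text]
    if resource_type is not None:
        sql += " AND r.type = ?"
        params.append(resource_type)
    return conn.execute(sql + " ORDER BY r.id", params).fetchall()
```

A single query now searches across blogs and episodes, and adding a new kind of source means one more subtype table rather than one more mapping table.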
The biggest loss in going with structure 2 is loss of referential integrity. If you can say "whatever" to that, it might be easier to go with this structure.
When I say structure 2 I mean:
TaggedStuff
id
source_type
source_id
tag_id
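The loss of referential integrity can be seen in a short experiment. This sketch uses Python's sqlite3 module rather than MySQL (an assumption), and the sample rows and the MISSING_BLOG id are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Blog (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Tag  (id INTEGER PRIMARY KEY, text TEXT);
-- Structure 1: a per-type mapping table with real foreign keys
CREATE TABLE BlogTag (
    id      INTEGER PRIMARY KEY,
    tag_id  INTEGER NOT NULL REFERENCES Tag(id),
    blog_id INTEGER NOT NULL REFERENCES Blog(id)
);
-- Structure 2: source_id may point into several tables,
-- so it cannot be declared as a foreign key to any one of them
CREATE TABLE TaggedStuff (
    id          INTEGER PRIMARY KEY,
    source_type INTEGER NOT NULL,
    source_id   INTEGER NOT NULL,
    tag_id      INTEGER NOT NULL REFERENCES Tag(id)
);
INSERT INTO Blog VALUES (1, 'Hello');
INSERT INTO Tag  VALUES (1, 'intro');
""")


def accepts(sql, params):
    """Return True if the database accepts the statement, False if it rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


MISSING_BLOG = 999  # no Blog row has this id

structure_1_ok = accepts(
    "INSERT INTO BlogTag (tag_id, blog_id) VALUES (?, ?)", (1, MISSING_BLOG)
)
structure_2_ok = accepts(
    "INSERT INTO TaggedStuff (source_type, source_id, tag_id) VALUES (?, ?, ?)",
    (1, MISSING_BLOG, 1),
)
```

Structure 1 refuses the dangling reference; structure 2 stores it without complaint, so the application code has to guard against orphaned tag rows itself.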
If I understand you correctly, the point is to optimize the search mechanism.
So it makes sense to build some kind of index table and denormalize the data there.
I mean something like this:
Url, Type, Title, Search_Field, etc.
where Url is the path to the article or episode, Type is article or episode, Title is what users will see, and Search_Field is a list of tags and other data important for search.
With such a table doing the searching, both variants are quite good.
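A very small sketch of such a denormalized index table, using Python's sqlite3 module instead of MySQL (an assumption). The helper names and sample rows are invented, and matching whole tags with LIKE on a space-separated string is a simplification that only works for tags without spaces or LIKE wildcards; a real site would more likely use the database's full-text search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE search_index (
    url          TEXT PRIMARY KEY,
    type         TEXT NOT NULL,
    title        TEXT NOT NULL,
    search_field TEXT NOT NULL   -- space-separated, lower-cased tags
);
""")


def index_item(url, item_type, title, tags):
    """Insert or refresh the denormalized row for one article or episode."""
    search_field = " ".join(sorted(t.lower() for t in tags))
    conn.execute(
        "INSERT OR REPLACE INTO search_index VALUES (?, ?, ?, ?)",
        (url, item_type, title, search_field),
    )


def search(term, item_type=None):
    """Return URLs whose search_field contains `term` as a whole tag."""
    sql = "SELECT url FROM search_index WHERE ' ' || search_field || ' ' LIKE ?"
    params = ["% " + term.lower() + " %"]
    if item_type is not None:
        sql += " AND type = ?"
        params.append(item_type)
    rows = conn.execute(sql + " ORDER BY url", params)
    return [r[0] for r in rows]


index_item("/blog/hello", "article", "Hello", ["Intro", "News"])
index_item("/episodes/1", "episode", "Pilot", ["intro"])
```

The index table has to be refreshed whenever an item's tags change, which is the usual price of denormalizing.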