I'm building a study tool and I'm not sure of the best way to structure my database.
Basically, I have a simple but big table with around 50,000 pieces of information in it.
info (50,000 rows)
    id
    info_text
user
    id
    name
    email
    password
    etc
What I want is for the students to be able to mark each item as studied or to be studied (basically on and off), so that they can tick off each item when they have revised it.
I want to build the tool to cope with thousands of users and was wondering what the most efficient/easiest way of setting up the database and associated queries would be.
At the moment I would lean towards just having one huge table with a two-column primary key, made up of the user id and the id of the info they have studied, and then doing some sort of JOIN so I only pull back the items they have left to study.
user_info
    user_id
    info_id
Thanks in advance
Here is one way to model this situation:
The table in the middle (STUDIED) has a composite primary key on USER_ID and ITEM_ID, so each combination of the two must be unique, even though neither column has to be unique on its own.
A user (with given USER_ID) has studied a particular item (with given ITEM_ID) only if there is a corresponding row in the STUDIED table (with these same USER_ID and ITEM_ID values).
Conversely, the user has not studied the item, if and only if the corresponding row in STUDIED is missing. To pull all items a given user hasn't studied, you can do something like this:
SELECT * FROM ITEM
WHERE NOT EXISTS (
SELECT * FROM STUDIED
WHERE
USER_ID = <given_user_id>
AND ITEM.ITEM_ID = STUDIED.ITEM_ID
)
Or, alternatively:
SELECT ITEM.*
FROM ITEM LEFT JOIN STUDIED
ON ITEM.ITEM_ID = STUDIED.ITEM_ID AND STUDIED.USER_ID = <given_user_id>
WHERE STUDIED.ITEM_ID IS NULL
The good thing about this design is that you don't need to populate the STUDIED table in advance. When adding a new user or item, just leave STUDIED alone; you'll gradually fill it later as users progress with their studies.
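To make the "row present means studied" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the answer above; the helper names (make_db, mark_studied, unmark_studied, items_left), the INFO_TEXT column, and the sample rows are assumptions made up for illustration, and the USER table is left out to keep it short.

```python
import sqlite3

def make_db():
    """Create an in-memory database with ITEM and STUDIED tables (sample data is made up)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ITEM (ITEM_ID INTEGER PRIMARY KEY, INFO_TEXT TEXT);
        CREATE TABLE STUDIED (
            USER_ID INTEGER NOT NULL,
            ITEM_ID INTEGER NOT NULL REFERENCES ITEM(ITEM_ID),
            PRIMARY KEY (USER_ID, ITEM_ID)  -- composite key: one row per (user, item)
        );
    """)
    conn.executemany("INSERT INTO ITEM VALUES (?, ?)",
                     [(1, "fact one"), (2, "fact two"), (3, "fact three")])
    return conn

def mark_studied(conn, user_id, item_id):
    # INSERT OR IGNORE makes ticking an already-ticked item a no-op.
    conn.execute("INSERT OR IGNORE INTO STUDIED VALUES (?, ?)", (user_id, item_id))

def unmark_studied(conn, user_id, item_id):
    # Un-ticking an item is just deleting its row.
    conn.execute("DELETE FROM STUDIED WHERE USER_ID = ? AND ITEM_ID = ?",
                 (user_id, item_id))

def items_left(conn, user_id):
    """Return the ids of items this user has not studied yet."""
    rows = conn.execute("""
        SELECT ITEM_ID FROM ITEM
        WHERE NOT EXISTS (
            SELECT 1 FROM STUDIED
            WHERE STUDIED.USER_ID = ? AND STUDIED.ITEM_ID = ITEM.ITEM_ID
        )
        ORDER BY ITEM_ID
    """, (user_id,)).fetchall()
    return [r[0] for r in rows]
```

A new user starts with an empty STUDIED set, so items_left returns every item until mark_studied is called for that user.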
I would do something like this:
1) A users table with a uid primary key
2) An enrolled table (this table shows all courses that have enrolled students) with a primary key of (uid, cid)
3) An items (info) table holding all items to study, with a primary key of itemid
Then in the enrolled table just have one attribute (a binary flag): 1 means the item has been studied and 0 means they still need to study it.
Related
First and foremost, I want to say I do not have any code to back this at this point. I am trying to conceptualize an idea. So, apologies in advance.
Basic rundown: I have a database full of shows that can each have multiple genres; for example, show A can be action, adventure, and drama. Typical, right? Right now my database is set up with columns such as genre_1, genre_2, genre_3. This is terrible, I know, which is why I am redoing it.
I am wanting to create a table full of genres, then have a table with the show information, then have a table to relate those two. So, the primary keys in the genre and show tables would be foreign keys in the genre-show table.
I'm pretty sure this is the best way to go about this one-to-many relationship, but let me know if there is something I'm missing.
My problem is that I'm uncertain how I would, say, list all shows that are in the action OR adventure genres, or list all shows that are in both the action AND adventure genres.
I'm fairly familiar with joins, but with my current knowledge I can't figure out how I would query that.
Ultimately, what I am looking to do is be able to query my DB and say "Give me every show that has action and adventure genres" and then be on my way.
I hope this makes sense. Thank you in advance for your time and answers; I truly appreciate it.
One to many 101:
Main Table:
id (primary key, auto_increment),
name,
datecreated (datetimestamp),
dateupdated (datetimestamp)
Many-to-one A
id (primary key, auto_increment),
main_id (foreign key),
name,
datecreated (datetimestamp),
dateupdated (datetimestamp)
Many-to-one B
id (primary key, auto_increment),
main_id (foreign key),
name,
datecreated (datetimestamp),
dateupdated (datetimestamp)
Now you may join up as many, Many-to-one tables as you need thusly:
select
*
from
main left join
table_a on main.id = table_a.main_id left join
table_b on main.id = table_b.main_id
where
main.id = X
You will receive back many rows: each row repeats the main object's columns alongside one combination of its many-to-one rows. The result set is, in effect, denormalized.
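As a small illustration of how the rows multiply, here is a sketch with Python's sqlite3 module. The table names come from the query above; the sample data (one main row, two rows in table_a, three in table_b) is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, main_id INTEGER REFERENCES main(id), name TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, main_id INTEGER REFERENCES main(id), name TEXT);
    INSERT INTO main VALUES (1, 'main-1');
    INSERT INTO table_a (main_id, name) VALUES (1, 'a1'), (1, 'a2');
    INSERT INTO table_b (main_id, name) VALUES (1, 'b1'), (1, 'b2'), (1, 'b3');
""")

joined = conn.execute("""
    SELECT main.name, table_a.name, table_b.name
    FROM main
    LEFT JOIN table_a ON main.id = table_a.main_id
    LEFT JOIN table_b ON main.id = table_b.main_id
    WHERE main.id = ?
""", (1,)).fetchall()

# 2 rows from table_a x 3 rows from table_b = 6 result rows, each repeating 'main-1'.
print(len(joined))
```

Because every table_a row pairs with every table_b row for the same main row, the row count grows multiplicatively as more many-to-one tables are joined.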
Or you may prefer sub-loops, whereby you run one query to get your main objects and then, in a sub-loop for each many-to-one table, use main.id to find the rows whose main_id matches your object.
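For the specific "action OR adventure" versus "action AND adventure" question, one common approach with a show/genre link table is an IN filter, plus GROUP BY ... HAVING for the AND case. This is only a sketch under assumed names: the shows, genres, and show_genres tables, their columns, and the sample data are all hypothetical, since the question does not give a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shows  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE genres (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE show_genres (
        show_id  INTEGER REFERENCES shows(id),
        genre_id INTEGER REFERENCES genres(id),
        PRIMARY KEY (show_id, genre_id)
    );
    INSERT INTO shows  VALUES (1, 'Show A'), (2, 'Show B'), (3, 'Show C');
    INSERT INTO genres VALUES (1, 'action'), (2, 'adventure'), (3, 'drama');
    INSERT INTO show_genres VALUES (1, 1), (1, 2), (1, 3), (2, 1), (3, 3);
""")

def shows_with_any(conn, names):
    """Shows that have at least one of the named genres (OR)."""
    marks = ",".join("?" * len(names))
    sql = f"""
        SELECT DISTINCT s.title FROM shows s
        JOIN show_genres sg ON sg.show_id = s.id
        JOIN genres g ON g.id = sg.genre_id
        WHERE g.name IN ({marks})
        ORDER BY s.title"""
    return [r[0] for r in conn.execute(sql, list(names))]

def shows_with_all(conn, names):
    """Shows that have every one of the named genres (AND); names must be distinct."""
    marks = ",".join("?" * len(names))
    sql = f"""
        SELECT s.title FROM shows s
        JOIN show_genres sg ON sg.show_id = s.id
        JOIN genres g ON g.id = sg.genre_id
        WHERE g.name IN ({marks})
        GROUP BY s.id, s.title
        HAVING COUNT(DISTINCT g.id) = ?   -- matched as many genres as were asked for
        ORDER BY s.title"""
    return [r[0] for r in conn.execute(sql, [*names, len(names)])]
```

With this sample data, the OR query returns Show A and Show B for action/adventure, while the AND query returns only Show A.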
I have a pivot table for a Many to Many relationship between users and collected_guitars. As you can see a "collected_guitar" is an item that references some data in foreign tables (guitar_models, finish).
My users also have some foreign data in foreign tables (hand_types and genders)
I want to get a derived table that lists data if I look for a particular model_id in "collected_guitar_user"
Let's say "Fender Stratocaster" is model id = 200, where the make is Fender (id = 1 of makes table).
The same guitar could come in a variety of finishes, hence the use of another table, collected_guitars.
One user could have this item in his collection
Now, what I want to find by looking up a model_id (in this case 200) in the pivot table "collected_guitar_user" is the number of Fender Stratocasters collected by users who share the same genders.sex and hand_types.type as the logged-in user, and how those guitars divide across finishes (some percent in finish A, some in finish B, etc.).
That way, a user who is interested in what others are buying could see some statistics for the model.
What query can derive this kind of table?
You can do aggregate counts by using the GROUP BY syntax, and CROSS JOIN to compute a percentage of the total:
SELECT make.make, models.model_name AS model, finish.finish,
COUNT(*) AS number_of_users,
(COUNT(*) / u.total * 100) AS percent_owned
FROM owned_guitar
JOIN owned_guitar_users ON owned_guitar_users.owned_guitar_id = owned_guitar.id
JOIN users ON users.id = owned_guitar_users.user_id
JOIN models ON owned_guitar.model_id = models.id
JOIN make ON owned_guitar.make_id = make.id
JOIN finish ON owned_guitar.finish_id = finish.id
CROSS JOIN (SELECT COUNT(*) AS total FROM users) u
GROUP BY owned_guitar.id, make.make, models.model_name, finish.finish, u.total
Please note, though, that in cases where a user owns more than one guitar, the percentages will no longer necessarily sum to 100% (for example, Jack and John could both own all five guitars, so each of them owns "100%" of the guitars).
I'm also a little confused by your database design. Why do you have a finish_id and make_id associated directly in the owned_guitar table as well as in the models table?
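To address the "same gender and hand type as the logged-in user" part of the question, here is a rough sketch with Python's sqlite3 module. The schema is a deliberately simplified assumption: gender and hand_type are stored as plain columns on users rather than in the genders and hand_types lookup tables, finish is a text column, and the function name finish_breakdown and all sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, gender TEXT, hand_type TEXT);
    CREATE TABLE collected_guitars (id INTEGER PRIMARY KEY, model_id INTEGER, finish TEXT);
    CREATE TABLE collected_guitar_user (
        user_id INTEGER REFERENCES users(id),
        collected_guitar_id INTEGER REFERENCES collected_guitars(id),
        PRIMARY KEY (user_id, collected_guitar_id)
    );
    INSERT INTO users VALUES (1,'F','left'), (2,'F','left'), (3,'F','left'),
                             (4,'F','left'), (5,'M','right');
    INSERT INTO collected_guitars VALUES (10, 200, 'sunburst'), (11, 200, 'black'),
                                         (12, 300, 'white');
    INSERT INTO collected_guitar_user VALUES (2,10), (3,10), (4,11), (5,11), (1,12);
""")

def finish_breakdown(conn, model_id, logged_in_user_id):
    """Per-finish count and percentage for one model, restricted to users
    with the same gender and hand type as the logged-in user."""
    sql = """
        WITH peers AS (   -- one row per matching user/guitar pair for this model
            SELECT cg.finish
            FROM collected_guitar_user cgu
            JOIN collected_guitars cg ON cg.id = cgu.collected_guitar_id
            JOIN users u  ON u.id = cgu.user_id
            JOIN users me ON me.id = ?
            WHERE cg.model_id = ?
              AND u.gender = me.gender AND u.hand_type = me.hand_type
        )
        SELECT finish, COUNT(*) AS n,
               100.0 * COUNT(*) / (SELECT COUNT(*) FROM peers) AS pct
        FROM peers GROUP BY finish ORDER BY finish"""
    return conn.execute(sql, (logged_in_user_id, model_id)).fetchall()
```

Here the percentage is taken over the matching collections of that model rather than over all users, so the finish shares add up to 100 for the model being viewed.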
To make you understand my question I'll give you an example:
I have a chat web app with many rooms, let's say 5 rooms.
People can choose to stay only in one room and they choose it at login.
When they choose the room I have to retrieve the people already in the room, so I can structure my db in two ways:
each room one table with the people being records;
all the rooms in one table, people are the records and a column indicating the room they are in;
In the first case the query would be:
SELECT * FROM `room_2` WHERE 1
In the second case the query would be:
SELECT * FROM `rooms` WHERE room = 'room_2'
Which is the best?
I think the only parameter to consider is performance, right?
In this example, no, because people are all 'like' objects and should therefore be in the same table.
All people and rooms in one table with a primary key on people, in this simple example.
Table Rooms(pk_person, personName, table_id)
But I want to talk about a structure that you will want to consider as your website grows. You’ll want three tables, one for each object (chat rooms, people) and one for the relationships.
Chat_Rooms(pk_ChatId, ChatName, MaxOccupants, other unique attributes of a chat room)
People(pk_PersonID, FirstName, LastName, other unique attributes of a person)
Room_People_Join(pk_JoinId, fk_ChatId, fk_PersonID, EnterDateTime, ExitDateTime)
This is a “highly normalized” structure. Each table is a collection of like objects, the join table allows for many-to-many relationships, and object rows are not duplicated. So, a Person with all their attributes (name, gender, age) is never duplicated in the person table. Also, the person table never defines which chat rooms a person is in, because a person could be in one, many, or none, or may have entered and exited multiple times. The same concept applies to a chat room. A chat room’s features, such as background color, max occupants, etc., have nothing to do with people.
The Room_People_Join is the important one. This has a unique primary key for which chat rooms a person is in and when they were there. This table grows indefinitely, but it tracks usage. Including the relationship table is what logically normalizes your database.
So how do you know which users are currently in chat room 1? You join your people and rooms to the join table with their respective Primary and Foreign keys in your FROM clause, ask for the columns you want in your SELECT clause, and filter for chat room 1 and people who haven’t yet left.
SELECT p.FirstName, p.LastName, r.ChatName
FROM Room_People_Join j
JOIN People p ON j.fk_PersonID = p.pk_PersonID
JOIN Chat_Rooms r ON j.fk_ChatId = r.pk_ChatId
WHERE j.ExitDateTime IS NULL
AND r.pk_ChatId = 1
Sorry that’s long winded, but I extrapolated your question for database growth.
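Here is a small runnable sketch of the three-table structure using Python's sqlite3 module. Table and column names follow the answer above; the helper names current_occupants and leave_room and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Chat_Rooms (pk_ChatId INTEGER PRIMARY KEY, ChatName TEXT);
    CREATE TABLE People (pk_PersonID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Room_People_Join (
        pk_JoinId INTEGER PRIMARY KEY,
        fk_ChatId INTEGER REFERENCES Chat_Rooms(pk_ChatId),
        fk_PersonID INTEGER REFERENCES People(pk_PersonID),
        EnterDateTime TEXT NOT NULL,
        ExitDateTime TEXT            -- NULL while the person is still in the room
    );
    INSERT INTO Chat_Rooms VALUES (1, 'room_1'), (2, 'room_2');
    INSERT INTO People VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Poe');
    INSERT INTO Room_People_Join (fk_ChatId, fk_PersonID, EnterDateTime, ExitDateTime) VALUES
        (1, 1, '2024-01-01 10:00', NULL),
        (1, 2, '2024-01-01 10:05', '2024-01-01 10:30'),
        (2, 3, '2024-01-01 10:10', NULL);
""")

def current_occupants(conn, chat_id):
    """First names of people who entered the room and have not left yet."""
    rows = conn.execute("""
        SELECT p.FirstName
        FROM Room_People_Join j
        JOIN People p ON j.fk_PersonID = p.pk_PersonID
        WHERE j.fk_ChatId = ? AND j.ExitDateTime IS NULL
        ORDER BY p.FirstName
    """, (chat_id,)).fetchall()
    return [r[0] for r in rows]

def leave_room(conn, person_id, chat_id, when):
    # Close the open visit instead of deleting it, so usage history is kept.
    conn.execute("""
        UPDATE Room_People_Join SET ExitDateTime = ?
        WHERE fk_PersonID = ? AND fk_ChatId = ? AND ExitDateTime IS NULL
    """, (when, person_id, chat_id))
```

Leaving a room only fills in ExitDateTime, which is why the join table keeps growing but also doubles as a usage log.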
The answer is very simple and strongly recommended: one database table for all rooms, for sure! What if you later want to create rooms dynamically? You surely would not want to create new tables dynamically.
What's the best way to store "ordered lists" in a database, so that updating them (adding, removing and changing the order of entries) is easily done?
Consider a database where you have a table for users and movies. Each user has a list of favorite movies.
Since many users can like the same movie, I made users and movies separate tables and used a third table, usermovies, to connect them.
usermovies contains an id of a user and a movie and an "order number". The order number is used to order the list of movies for users.
For example, user Josh might have the following list:
Prometheus
Men in Black 3
The Dictator
and user Jack might have a list like:
The Dictator
Prometheus
Battleship
Snow White and the Huntsman
So, they share some favorites, but not necessarily in the same order.
I can get the list of movie IDs for each user using a query:
SELECT movie_id FROM usermovies WHERE user_id =? ORDER BY order_number
Then, with the ordered movie_ids, I can get the list of movies using another query
SELECT name FROM movies WHERE id in (?,?,?) ORDER BY FIELD (id, ?,?,?)
So queries work, but updating the lists seems really complex now - are there better ways to store this information so that it would be easy to get the list of movies for user x, add movies, remove them and change the order of the list?
If you want more than a "move up / move down" kind of solution that defaults to adding at the bottom of the list, here are a few more pointers:
Inserting new rows into a specific position can be done like this: (inserting at position 3)
UPDATE usermovies SET order_number = order_number + 1
WHERE order_number >= 3 AND user_id = ?;
INSERT INTO usermovies (user_id, movie_id, order_number) VALUES (?, ?, 3);
And you can delete in a similar fashion: (deleting position 6)
DELETE FROM usermovies WHERE order_number = 6 AND user_id = ?;
UPDATE usermovies SET order_number = order_number - 1
WHERE order_number > 6 AND user_id = ?;
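The shift-then-insert and delete-then-shift steps above can be wrapped in a transaction so a list is never left half-renumbered. This is a minimal sketch with Python's sqlite3 module; the column names come from the question, while the helper names insert_at, delete_at, and movie_list and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE usermovies (
    user_id INTEGER, movie_id INTEGER, order_number INTEGER,
    PRIMARY KEY (user_id, movie_id))""")
conn.executemany("INSERT INTO usermovies VALUES (?, ?, ?)",
                 [(1, 101, 1), (1, 102, 2), (1, 103, 3)])
conn.commit()

def insert_at(conn, user_id, movie_id, position):
    with conn:  # commits on success, rolls back if either statement fails
        # Make room: everything at or after the target position moves down by one.
        conn.execute("""UPDATE usermovies SET order_number = order_number + 1
                        WHERE user_id = ? AND order_number >= ?""", (user_id, position))
        conn.execute("""INSERT INTO usermovies (user_id, movie_id, order_number)
                        VALUES (?, ?, ?)""", (user_id, movie_id, position))

def delete_at(conn, user_id, position):
    with conn:
        conn.execute("DELETE FROM usermovies WHERE user_id = ? AND order_number = ?",
                     (user_id, position))
        # Close the gap: everything after the removed position moves up by one.
        conn.execute("""UPDATE usermovies SET order_number = order_number - 1
                        WHERE user_id = ? AND order_number > ?""", (user_id, position))

def movie_list(conn, user_id):
    rows = conn.execute("""SELECT movie_id FROM usermovies
                           WHERE user_id = ? ORDER BY order_number""", (user_id,))
    return [r[0] for r in rows]
```

Note that if you add a unique constraint on (user_id, order_number), the shifting UPDATE may need extra care in some databases, because intermediate values can collide while rows are being renumbered.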
A junction/link table with additional columns for the attributes of the association between movies and users is the standard way of realizing a many-many association with an association class - so what you have done seems correct.
Regarding the ease of insert/update/delete, you'll have to manage the entire association (all rows for the user-movie FKs) every time you perform an insert/update/delete.
There probably isn't a magical/simpler way to do this.
Having said this, you'll also need to run these operations in a transaction and more importantly have a 'version' column on this junction table if your application is multi-user capable.
To retrieve a user's favourite movies you could use a single query:
SELECT um.order_number, m.name FROM movies m
INNER JOIN usermovies um ON m.id = um.movie_id
WHERE um.user_id = ?
ORDER BY um.order_number
To add or remove a favourite movie, simply add or remove the related record in the usermovies table.
To alter the order of a user's movies, update the order_number field of that user's rows in the usermovies table.
In addition to what others have said, reordering existing favorites can be done in a single UPDATE statement, as explained here.
The linked answer explains reordering of two items, but can be easily generalized to any number of items.
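One common way to renumber several rows in a single UPDATE is a CASE expression keyed on movie_id. The sketch below uses Python's sqlite3 module and is not necessarily the technique from the linked answer; the function name reorder and the sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usermovies (user_id INTEGER, movie_id INTEGER, order_number INTEGER)")
conn.executemany("INSERT INTO usermovies VALUES (?, ?, ?)",
                 [(1, 101, 1), (1, 102, 2), (1, 103, 3), (2, 101, 1)])
conn.commit()

def reorder(conn, user_id, movie_ids):
    """Renumber a user's list to match movie_ids using one UPDATE with a CASE expression."""
    if not movie_ids:
        return
    whens = " ".join("WHEN ? THEN ?" for _ in movie_ids)
    marks = ",".join("?" * len(movie_ids))
    params = []
    for position, movie_id in enumerate(movie_ids, start=1):
        params += [movie_id, position]   # movie_id -> new 1-based position
    sql = f"""UPDATE usermovies
              SET order_number = CASE movie_id {whens} END
              WHERE user_id = ? AND movie_id IN ({marks})"""
    with conn:  # single statement, committed as one transaction
        conn.execute(sql, params + [user_id] + list(movie_ids))
```

The IN filter matters: without it, rows for movies not mentioned in the CASE would have their order_number set to NULL.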
For storing friend relationships in a social network, is it better to have another table with columns relationship_id, user1_id, user2_id, time_created, pending? Or should the confirmed friends' user_ids be serialized/imploded into a single long string and stored alongside the other user details (user_id, name, dateofbirth, address), with a limit of something like 5000 friends, similar to Facebook?
Are there any better methods? The first method will create a huge table! The second one has one column with a really long string...
On each user's profile page, all of their friends need to be retrieved from the database in order to show around 30 friends, similar to Facebook, so I think the first method of using a separate table will cause a huge number of database queries?
The most proper way to do this would be to have the table of Members (obviously), and a second table of Friend relationships.
You should never ever store foreign keys in a string like that. What's the point? You can't join on them, sort on them, group on them, or any other things that justify having a relational database in the first place.
If we assume that the Member table looks like this:
MemberID int Primary Key
Name varchar(100) Not null
--etc
Then your Friendship table should look like this:
Member1ID int Foreign Key -> Member.MemberID
Member2ID int Foreign Key -> Member.MemberID
Created datetime Not Null
--etc
Then, you can join the tables together to pull a list of friends
SELECT m.*
FROM Member m
INNER JOIN Friendship f ON f.Member2ID = m.MemberID
WHERE f.Member1ID = @MemberID
(This is specifically SQL Server syntax, but I think it's pretty close to MySQL. The @MemberID is a parameter.)
This will almost always be faster than splitting a string and making 30 extra SQL queries to pull the relevant data.
Separate table as in method 1.
Method 2 is bad because you would have to unserialize the string each time and won't be able to do JOINs on it; plus UPDATEs will be a nightmare if a user changes his name, email or other properties.
Sure, the table will be huge, but you can index it on Member1_id, set the foreign key back to your user table, and have static row sizes; you could even limit the number of friends a single user can have. I think it won't be an issue with MySQL if you do it right, even if you hit a few million rows in your relationship table.
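As a rough sketch of the separate-table approach, here is a version using Python's sqlite3 module that fetches up to 30 friends for a profile page in one query. The Member and Friendship names follow the earlier answer; storing each pair once with the smaller id first, the extra index on Member2ID, the helper names add_friend and friends_of, and the sample members are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (MemberID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Friendship (
        Member1ID INTEGER NOT NULL REFERENCES Member(MemberID),
        Member2ID INTEGER NOT NULL REFERENCES Member(MemberID),
        Created   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (Member1ID, Member2ID),
        CHECK (Member1ID < Member2ID)   -- store each pair once, smaller id first
    );
    -- The primary key covers lookups by Member1ID; this index covers Member2ID.
    CREATE INDEX idx_friendship_member2 ON Friendship (Member2ID);
    INSERT INTO Member VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
""")

def add_friend(conn, a, b):
    lo, hi = sorted((a, b))
    # INSERT OR IGNORE makes re-adding an existing pair a no-op.
    conn.execute("INSERT OR IGNORE INTO Friendship (Member1ID, Member2ID) VALUES (?, ?)",
                 (lo, hi))

def friends_of(conn, member_id, limit=30):
    """Names of up to `limit` friends, looking at both sides of each pair."""
    rows = conn.execute("""
        SELECT m.Name FROM Friendship f
        JOIN Member m ON m.MemberID =
             CASE WHEN f.Member1ID = :id THEN f.Member2ID ELSE f.Member1ID END
        WHERE f.Member1ID = :id OR f.Member2ID = :id
        ORDER BY m.Name LIMIT :limit
    """, {"id": member_id, "limit": limit}).fetchall()
    return [r[0] for r in rows]
```

The profile page then needs one query per page view rather than one query per friend, which is the main concern raised in the question.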