MySQL database design and efficient query - mysql

I have the following tables:
users (id, first_name, last_name)
category (id, name)
rank(id, user_id, rank)
Each user can belong to several categories. All users are in the rank table and have a value between 0.0 and 1.0, where 0 is the lowest rank and 1 is the highest. I'd like to set up additional tables to create the following webpage:
A visitor to the page (identified by either one of the recorded ids in the user table, or a numeric representation of their ip address) chooses a category and is presented with two randomly chosen users from the users table such that:
1) the visiting user_id has not seen this pairing in a period of 24 hours
2) the two users belong to the chosen category
3) the two users are within 1 rank position of each other. Let me explain that last criterion: if the ranks were sorted, the two chosen users would have adjacent ranks.
This is a hard one and I can't for the life of me figure out how to do this efficiently.
I truly appreciate any help on this front.
Thanks

You just need two more tables and the rest go in your website logic.
user_category(user_id, category_id)
user_pairing(first_user_id, second_user_id, last_seen)
The first table is to represent a ManyToMany relationship between the users and the category, and the second one is for the users pairing.

I agree with @Yasel. I want to add that you probably want another table:
candidate(first_user_id, second_user_id);
This table pre-calculates the candidate pairs for each user and is repopulated every hour or day. When a (first_user_id, second_user_id) pair is shown, it is removed from the candidate table and moved into the user_pairing table, so each request only needs to query the candidate table, which should be efficient.
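A rough sketch of how the candidate pairs could be computed from these tables, written with Python's sqlite3 module only so that it runs standalone. Several things here are assumptions rather than part of the answers above: the viewer_id column on user_pairing (added so that "seen in the last 24 hours" can be checked per visitor), ranks being distinct, and "adjacent" being read as adjacent among the members of the chosen category. In MySQL the time test would be written as last_seen > NOW() - INTERVAL 1 DAY.

```python
import sqlite3

def build_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE rank (id INTEGER PRIMARY KEY, user_id INTEGER, rank REAL);
        CREATE TABLE user_category (user_id INTEGER, category_id INTEGER,
                                    PRIMARY KEY (user_id, category_id));
        -- viewer_id is an assumption: it records which visitor saw the pair
        CREATE TABLE user_pairing (viewer_id INTEGER, first_user_id INTEGER,
                                   second_user_id INTEGER, last_seen TEXT);
    """)
    return conn

ADJACENT_PAIRS = """
SELECT a.user_id, b.user_id
FROM rank a
JOIN rank b ON b.rank > a.rank
JOIN user_category ca ON ca.user_id = a.user_id AND ca.category_id = :cat
JOIN user_category cb ON cb.user_id = b.user_id AND cb.category_id = :cat
WHERE NOT EXISTS (   -- nobody in the category ranks strictly between a and b
        SELECT 1 FROM rank m
        JOIN user_category cm ON cm.user_id = m.user_id AND cm.category_id = :cat
        WHERE m.rank > a.rank AND m.rank < b.rank)
  AND NOT EXISTS (   -- this viewer has not seen the pair in the last 24 hours
        SELECT 1 FROM user_pairing p
        WHERE p.viewer_id = :viewer
          AND p.first_user_id = a.user_id AND p.second_user_id = b.user_id
          AND p.last_seen > datetime('now', '-1 day'))
ORDER BY a.rank
"""

def candidate_pairs(conn, category_id, viewer_id):
    """Return all (lower-ranked, higher-ranked) pairs the viewer may be shown."""
    rows = conn.execute(ADJACENT_PAIRS, {"cat": category_id, "viewer": viewer_id})
    return [tuple(r) for r in rows]
```

The website logic can then pick one of the returned pairs at random (for example with random.choice) and insert a user_pairing row for it. The same query could also be run on a schedule to fill the candidate table.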

Related

Database Design for a system that has Facebook like groups

I'm creating a system that has Groups. These can be thought of like Facebook Groups. Users can create new groups. Currently I have the following types of groups:
City Group - Groups based on a certain city. For example "London Buy and Sell Group"
School Group - Groups based on schools. For example "London University Study Group"
Interest Group - Groups that are not tied to a place. For example "Over 50's Knitting Group"
In the future more group types will be added. Each group can have different types of options, but all groups have the same basic data:
An ID
A creator ID
A name
An optional description
I'm struggling on putting together a database design for this. My initial thought was to create different tables for the different groups.
For example have a single table called group. This table has an id, creator id, name, description, member count, timestamps.
Then have other tables to represent the other groups, and link them to group. So I have a city_group table that contains an id, group_id, city_id. And the same for the other group types.
The only problem I have with this is that interest_group doesn't have any extra data beyond what a normal group has. But for the purpose of being able to query only Interest Groups I thought it might make sense to create an interest_group table. It would only have the following columns: id, group_id, timestamps ... which seems a bit wasteful to have a table just for this purpose.
Here's a diagram to make things easier:
Are there any issues with my solution, or any better ways to solve this design problem?
I've got an idea, which is a workaround basically: have another table like: group_type in which you have id(the PK) and then you have tablename (the full table name of the type).
Then, you should have a FK from your Group table linking to this group_type table.
id tablename
--------------------
1 School Group
2 Interest Group
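After all this is done, you could build your queries based on the values from this table. Note that SQL cannot take a table name from a subquery, so the name has to be looked up first and the JOIN built from it (in application code or a prepared statement). As pseudocode:
JOIN (SELECT tablename FROM group_type WHERE id=group.group_type_id) ON ..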
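One way this lookup could work in practice is to resolve the table name in application code and only then build the JOIN. The sketch below uses Python's sqlite3 module so it runs standalone. The table name groups (instead of group, which is a reserved word), the school_group columns, the sample rows and the ALLOWED_TABLES whitelist are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE group_type (id INTEGER PRIMARY KEY, tablename TEXT);
    CREATE TABLE groups (id INTEGER PRIMARY KEY, name TEXT, group_type_id INTEGER);
    CREATE TABLE city_group (id INTEGER PRIMARY KEY, group_id INTEGER, city_id INTEGER);
    CREATE TABLE school_group (id INTEGER PRIMARY KEY, group_id INTEGER, school_id INTEGER);
    INSERT INTO group_type VALUES (1, 'city_group'), (2, 'school_group');
    INSERT INTO groups VALUES (10, 'London Buy and Sell Group', 1),
                              (11, 'London University Study Group', 2);
    INSERT INTO city_group VALUES (1, 10, 44);
    INSERT INTO school_group VALUES (1, 11, 7);
""")

# Table names cannot be bound as query parameters, so the value read from
# group_type is checked against a fixed list before it is put into the SQL.
ALLOWED_TABLES = {"city_group", "school_group"}

def group_details(conn, group_id):
    """Return the group's name plus the row from its type-specific table."""
    row = conn.execute(
        "SELECT t.tablename FROM groups g "
        "JOIN group_type t ON t.id = g.group_type_id WHERE g.id = ?",
        (group_id,),
    ).fetchone()
    if row is None or row[0] not in ALLOWED_TABLES:
        return None
    table = row[0]
    return conn.execute(
        f"SELECT g.name, x.* FROM groups g JOIN {table} x ON x.group_id = g.id "
        "WHERE g.id = ?",
        (group_id,),
    ).fetchone()
```

This costs two queries per lookup, which is the price of keeping the type-specific columns in separate tables.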

Non-unique many-to-many table design

I'm implementing a voting system for a php project which uses mysql. The important part is that I have to store every voting action separately for statistic reasons. The users can vote for many items multiple times, and every vote has a value (think of it like a donation kinda stuff).
So far I have a table votes in which I'm planning to store the votes with the following columns:
user_id - ID of the voting user, foreign key from users table
item_id - ID of the item which the user voted for, foreign key from items table
count - # of votes spent
created - date and time of voting
I'll need to get things out of the table like: the top X voters for an item, or all the items that a user has voted for.
My questions are:
Is this table design suitable for the task? If it is, how should I index it? If not, where did I go wrong?
Would it be more rewarding to create another table beside this one, which has unique rows for the user-item relationship (not storing every vote separately, but update the count row)?
Each base table holds the rows that make a true statement from some fill-in-the-(named-)blanks statement aka predicate.
-- user [user_id] has name ...
-- User(user_id, ...)
SELECT * FROM User
-- user [user_id] voted for item [item_id] spending [count] votes on [created]
-- Votes(user_id, item_id, count, created)
SELECT * FROM Votes
(Notice how the shorthand for the predicate is like an SQL declaration for its table. Notice how in the SQL query a base table predicate becomes the table's name.)
Top X voters for an item, all the items that a user has voted for.
Is this table design suitable for the task?
That query can be asked using that design. But only you can know what queries "like" that one are. You have to define sufficient tables/predicates to describe everything you care about in every situation. If Votes records the history of all relevant info about all events then it must be suitable. The query "all the items that user User has voted for" returns rows satisfying predicate
-- user User voted for item [item_id] spending some count on some date.
-- for some count & created,
--     user User voted for item [item_id] spending [count] votes on [created]
-- for some count & created, Votes(User, item_id, count, created)
-- for some user_id, count & created,
--     Votes(user_id, item_id, count, created) AND user_id = User
SELECT item_id FROM Votes WHERE user_id = User
(Notice how in the SQL the condition turns up in the WHERE and the columns you keep are the ones that you care about. More here and here on querying.)
If it is, how should I index it?
MySQL automatically indexes primary keys. Generally, index the column sets that you JOIN ON, otherwise test (for example in WHERE), GROUP BY or ORDER BY. See the MySQL 5.7 Reference Manual, section 8.3, "Optimization and Indexes".
Would it be more rewarding to create another table beside this one, which has unique rows for the user-item relationship
If you mean a user-item table for some count & created, [user_id] voted for [item_id] spending [count] votes on [created] and you still want all the individual votings then you still need Votes, and that user-item table is just SELECT user_id, item_id FROM Votes. But if you want to ask about people who haven't voted, you need more.
(not storing every vote separately, but update the count row)
If you don't care about individual votings then you can have a table with user, item and the sum of count for user-item groups. But if you want Votes then that user-item-sum table is expressible in terms of Votes using GROUP BY user_id, item_id & SUM(count).
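As a rough illustration of the points above, the two queries from the question can be run against the single Votes table, with the per-user totals derived by GROUP BY and SUM(count) instead of being stored separately. This is a sketch using Python's sqlite3 module so it runs standalone. The two index definitions and the sample rows are assumptions, and the same SQL should carry over to MySQL largely unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votes (user_id INTEGER, item_id INTEGER,
                        count INTEGER, created TEXT);
    -- one index per access path: by item (top voters) and by user (their items)
    CREATE INDEX votes_item_user ON votes (item_id, user_id);
    CREATE INDEX votes_user_item ON votes (user_id, item_id);
""")
conn.executemany(
    "INSERT INTO votes VALUES (?, ?, ?, ?)",
    [
        (1, 100, 3, "2024-01-01 10:00:00"),
        (2, 100, 5, "2024-01-01 11:00:00"),
        (1, 100, 4, "2024-01-02 09:00:00"),
        (3, 100, 1, "2024-01-02 12:00:00"),
        (1, 200, 2, "2024-01-03 08:00:00"),
    ],
)

def top_voters(conn, item_id, limit):
    """Top voters for one item, summing every individual vote row."""
    return conn.execute(
        "SELECT user_id, SUM(count) AS total FROM votes "
        "WHERE item_id = ? GROUP BY user_id "
        "ORDER BY total DESC, user_id LIMIT ?",
        (item_id, limit),
    ).fetchall()

def items_voted_by(conn, user_id):
    """All distinct items a user has voted for."""
    rows = conn.execute(
        "SELECT DISTINCT item_id FROM votes WHERE user_id = ? ORDER BY item_id",
        (user_id,),
    )
    return [r[0] for r in rows]
```

With the sample rows, top_voters(conn, 100, 2) returns user 1 with 7 votes and user 2 with 5, even though user 1 voted twice.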

Mysql: is it better to split tables if possible?

To make you understand my question I'll give you an example:
I have a chat web app with many rooms, let's say 5 rooms.
People can choose to stay only in one room and they choose it at login.
When they choose the room I have to retrieve the people already in the room, so I can structure my db in two ways:
each room one table with the people being records;
all the rooms in one table, people are the records and a column indicating the room they are in;
In the first case the query would be:
SELECT * FROM `room_2` WHERE 1
In the second case the query would be:
SELECT * FROM `rooms` WHERE room = 'room_2'
Which is the best?
I think the only parameter to consider is performance, right?
In this example, no, because people are all 'like' objects and should therefore be in the same table.
All people and rooms in one table with a primary key on people, in this simple example.
Table Rooms(pk_person, personName, table_id)
But I want to talk about a structure that you will want to consider as your website grows. You’ll want three tables, one for each object (chat rooms, people) and one for the relationships.
Chat_Rooms(pk_ChatId, ChatName, MaxOccupants, other unique attributes of a chat room)
People(pk_PersonID, FirstName, LastName, other unique attributes of a person)
Room_People_Join(pk_JoinId, fk_ChatId, fk_PersonID, EnterDateTime, ExitDateTime)
This is a “highly normalized” structure. Each table is a collection of like objects, the join allows for many-to-many relationships, and object rows are not duplicated. So, a Person with all their attributes (name, gender, age) is never duplicated in the person table. Also, the person table never defines which chat rooms a person is in, because a person could be in one, many, none, or may have entered and exited multiple times. The same concept applies to a chat room. A chat room's features, such as background color, max occupants, etc. have nothing to do with people.
The Room_People_Join is the important one. This has a unique primary key for which chat rooms a person is in and when they were there. This table grows indefinitely, but it tracks usage. Including the relationship table is what logically normalizes your database.
So how do you know which users are currently in chat room 1? You join your people and rooms to the join table with their respective Primary and Foreign keys in your FROM clause, ask for the columns you want in your SELECT clause, and filter for chat room 1 and people who haven’t yet left.
SELECT p.FirstName, p.LastName, r.ChatName
FROM Room_People_Join j
JOIN People p ON j.fk_PersonID = p.pk_PersonID
JOIN Chat_Rooms r ON j.fk_ChatId = r.pk_ChatId
WHERE j.ExitDateTime IS NULL
AND r.pk_ChatId = 1
Sorry that’s long winded, but I extrapolated your question for database growth.
The answer is very simple and strongly recommended: one database table for all rooms, for sure! What if you later want to create rooms dynamically? You certainly would not want to create new tables dynamically.

Mysql setup for multiple users with large number of individual options

I'm building a study tool and I'm not sure of the best way to go about structuring my database.
Basically, I have a simple but big table with around 50000 bits of information in it.
info (50'000 rows)
id
info_text
user
id
name
email
password
etc
What I want is for the students to be able to mark each item as studied or to be studied (basically on and off), so that they can tick off each item when they have revised it.
I want to build the tool to cope with thousands of users and was wondering what the most efficient/easiest way of setting up the database and associated queries would be.
At the moment I would lean towards just having one huge table with a two-column primary key, the user id and the id of the info they had studied, and then doing some sort of JOIN so I could pull back only the items that they had left to study.
user_info
user_id
info_id
Thanks in advance
Here is one way to model this situation:
The table in the middle has a composite primary key on USER_ID and ITEM_ID, so a combination of the two must be unique, even though individually they don't have to be.
A user (with given USER_ID) has studied a particular item (with given ITEM_ID) only if there is a corresponding row in the STUDIED table (with these same USER_ID and ITEM_ID values).
Conversely, the user has not studied the item, if and only if the corresponding row in STUDIED is missing. To pull all items a given user hasn't studied, you can do something like this:
SELECT * FROM ITEM
WHERE NOT EXISTS (
SELECT * FROM STUDIED
WHERE
USER_ID = <given_user_id>
AND ITEM.ITEM_ID = STUDIED.ITEM_ID
)
Or, alternatively:
SELECT ITEM.*
FROM ITEM LEFT JOIN STUDIED
ON ITEM.ITEM_ID = STUDIED.ITEM_ID AND STUDIED.USER_ID = <given_user_id>
WHERE STUDIED.ITEM_ID IS NULL
The good thing about this design is that you don't need to care about STUDIED table in advance. When adding a new user or item, just leave the STUDIED alone - you'll gradually fill it later as users progress with their studies.
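To make the "missing row means not studied" idea concrete, here is a small sketch using Python's sqlite3 module so it runs standalone. It shows both query forms side by side. The sample rows and function names are made up for illustration. Note that in the LEFT JOIN form the user filter sits in the ON clause: if it were in WHERE, the unmatched rows (whose USER_ID is NULL) would be filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, info_text TEXT);
    -- composite primary key: each (user, item) combination appears at most once
    CREATE TABLE studied (user_id INTEGER, item_id INTEGER,
                          PRIMARY KEY (user_id, item_id));
    INSERT INTO item VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO studied VALUES (1, 1), (1, 3), (2, 2);
""")

def unstudied_not_exists(conn, user_id):
    """Items with no matching STUDIED row for this user (NOT EXISTS form)."""
    rows = conn.execute(
        "SELECT item_id FROM item WHERE NOT EXISTS ("
        "  SELECT 1 FROM studied"
        "  WHERE studied.user_id = ? AND studied.item_id = item.item_id)"
        " ORDER BY item_id",
        (user_id,),
    )
    return [r[0] for r in rows]

def unstudied_left_join(conn, user_id):
    """Same result via LEFT JOIN; the user filter belongs in the ON clause."""
    rows = conn.execute(
        "SELECT item.item_id FROM item"
        " LEFT JOIN studied"
        "   ON studied.item_id = item.item_id AND studied.user_id = ?"
        " WHERE studied.item_id IS NULL"
        " ORDER BY item.item_id",
        (user_id,),
    )
    return [r[0] for r in rows]
```

A brand new user with no STUDIED rows at all simply gets every item back, which matches the point that nothing has to be filled in ahead of time.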
I would do something like this:
1) A users table with a uid primary key
2) An enrolled table (this table shows all courses that have enrolled students) with a primary key of (uid, cid)
3) A items (info) table holding all items to study, with a primary key of itemid
Then in the enrolled table just have one attribute (a binary flag): 1 means it has been studied and 0 means they still need to study it.

Best way to store ordered lists in a database?

What's the best way to store "ordered lists" in a database, so that updating them (adding, removing and changing the order of entries) is easily done?
Consider a database where you have a table for users and movies. Each user has a list of favorite movies.
Since many users can like the same movie, I made users and movies separate tables and use a third table to connect them, usermovies.
usermovies contains an id of a user and a movie and an "order number". The order number is used to order the list of movies for users.
For example, user Josh might have the following list:
Prometheus
Men in Black 3
The Dictator
and user Jack might have a list like:
The Dictator
Prometheus
Battleship
Snow White and the Huntsman
So, they share some favorites, but not necessarily in the same order.
I can get the list of movie IDs for each user using a query:
SELECT movie_id FROM usermovies WHERE user_id =? ORDER BY order_number
Then, with the ordered movie_ids, I can get the list of movies using another query
SELECT name FROM movies WHERE id in (?,?,?) ORDER BY FIELD (id, ?,?,?)
So queries work, but updating the lists seems really complex now - are there better ways to store this information so that it would be easy to get the list of movies for user x, add movies, remove them and change the order of the list?
If you are not looking for a "move up / move down" kinda solution, and then defaulting to adding at the bottom of the list, here are a few more pointers:
Inserting new rows into a specific position can be done like this: (inserting at position 3)
UPDATE usermovies SET order_number = order_number + 1
WHERE order_number >= 3 AND user_id = ?;
INSERT INTO usermovies (user_id, movie_id, order_number) VALUES (?, ?, 3);
And you can delete in a similar fashion: (deleting position 6)
DELETE FROM usermovies WHERE order_number = 6 AND user_id = ?;
UPDATE usermovies SET order_number = order_number - 1
WHERE order_number > 6 AND user_id = ?;
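The shift-then-insert and delete-then-shift steps belong together, so it seems sensible to run each pair in one transaction. A minimal sketch with Python's sqlite3 module, so that it runs standalone: the function names and sample rows are hypothetical, and the table deliberately has no unique constraint on (user_id, order_number), which could otherwise be violated while rows are being shifted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE usermovies (
    user_id INTEGER, movie_id INTEGER, order_number INTEGER,
    PRIMARY KEY (user_id, movie_id))""")
conn.executemany("INSERT INTO usermovies VALUES (1, ?, ?)",
                 [(10, 1), (20, 2), (30, 3)])
conn.commit()

def insert_at(conn, user_id, movie_id, position):
    """Open a gap at `position`, then insert the new movie into it."""
    with conn:  # both statements commit together or roll back together
        conn.execute(
            "UPDATE usermovies SET order_number = order_number + 1 "
            "WHERE user_id = ? AND order_number >= ?", (user_id, position))
        conn.execute(
            "INSERT INTO usermovies (user_id, movie_id, order_number) "
            "VALUES (?, ?, ?)", (user_id, movie_id, position))

def delete_at(conn, user_id, position):
    """Remove the movie at `position`, then close the gap it leaves."""
    with conn:
        conn.execute(
            "DELETE FROM usermovies WHERE user_id = ? AND order_number = ?",
            (user_id, position))
        conn.execute(
            "UPDATE usermovies SET order_number = order_number - 1 "
            "WHERE user_id = ? AND order_number > ?", (user_id, position))

def ordered_movies(conn, user_id):
    """Movie ids for one user in list order."""
    rows = conn.execute(
        "SELECT movie_id FROM usermovies WHERE user_id = ? "
        "ORDER BY order_number", (user_id,))
    return [r[0] for r in rows]
```

Starting from movies 10, 20, 30, calling insert_at(conn, 1, 40, 2) puts movie 40 in second place and pushes 20 and 30 down by one.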
A junction/link table with additional columns for the attributes of the association between movies and users is the standard way of realizing a many-many association with an association class - so what you have done seems correct.
Regarding the ease of insert/update/delete, you'll have to manage the entire association (all rows for the user-movie FKs) every time you perform an insert/update/delete.
There probably isn't a magical/simpler way to do this.
Having said this, you'll also need to run these operations in a transaction and more importantly have a 'version' column on this junction table if your application is multi-user capable.
To retrieve a user's favourite movies you could use a single query:
SELECT um.order_number, m.name FROM movies m
INNER JOIN usermovies um ON m.id = um.movie_id
WHERE um.user_id = ?
ORDER BY um.order_number
To add/remove a favourite movie, simply add/remove the related record in the usermovies table.
To alter the order, simply change the order_number field of the affected rows in the usermovies table for that user.
In addition to what others have said, reordering existing favorites can be done in a single UPDATE statement, as explained here.
The linked answer explains reordering of two items, but can be easily generalized to any number of items.
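The linked answer is not reproduced here, so the following is only one possible shape for such a statement: a single UPDATE with a CASE expression that moves one item from position src to position dst and shifts everything in between by one. It is a sketch using Python's sqlite3 module so that it runs standalone. The move function, the parameter names and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE usermovies (
    user_id INTEGER, movie_id INTEGER, order_number INTEGER,
    PRIMARY KEY (user_id, movie_id))""")
conn.executemany("INSERT INTO usermovies VALUES (1, ?, ?)",
                 [(10, 1), (20, 2), (30, 3), (40, 4)])
conn.commit()

MOVE_SQL = """
UPDATE usermovies
SET order_number = CASE
        WHEN order_number = :src THEN :dst        -- the item being moved
        WHEN :src < :dst THEN order_number - 1    -- moving down: others shift up
        ELSE order_number + 1                     -- moving up: others shift down
    END
WHERE user_id = :user
  AND order_number BETWEEN :lo AND :hi            -- only rows between src and dst
"""

def move(conn, user_id, src, dst):
    """Move the item at position `src` to position `dst` in one UPDATE."""
    params = {"user": user_id, "src": src, "dst": dst,
              "lo": min(src, dst), "hi": max(src, dst)}
    with conn:
        conn.execute(MOVE_SQL, params)

def ordered_movies(conn, user_id):
    """Movie ids for one user in list order."""
    rows = conn.execute(
        "SELECT movie_id FROM usermovies WHERE user_id = ? "
        "ORDER BY order_number", (user_id,))
    return [r[0] for r in rows]
```

Because every row's new value is computed from its old order_number, the statement does not need a temporary placeholder position.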