This is a bit hard to explain,
but I have built an app where users create what I like to call 'raffles', and other users then subscribe to them.
I have a table for the raffles. I could add a column of type text to it and store all the subscribed users in it, separated by commas (,),
or I could create a separate table where users are added and associated with a raffle via another field called 'raffle_id' or something like it.
I'm not sure how efficient either of these methods will be in the long run or for scaling.
Some advice would be appreciated.
I would recommend against storing your user information in CSV format. The main reason is that a comma-separated column makes querying the table by user difficult, and it makes updates difficult as well. SQL databases were designed to handle relational data using tables. So in your case I would design the raffles table to look like this:
raffles (raffle_id, user_id)
And the data might look like this:
1 1
1 3
1 7
2 1
2 2
2 3
2 6
In other words, each record corresponds to a single raffle-user relation. Assuming that you only have a few dozen users and raffles happen every so often, this should scale fine. And if this raffles table ever gets too large at a much later date, you can archive a portion of it.
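The table above can be tried out with a minimal sketch like the following. It uses Python's built-in sqlite3 module purely for illustration; the composite primary key and the helper names `users_in_raffle` and `raffles_for_user` are assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per raffle-user pair. The composite primary key is an assumption;
# it stops the same user from being subscribed to the same raffle twice.
conn.execute("""
    CREATE TABLE raffles (
        raffle_id INTEGER NOT NULL,
        user_id   INTEGER NOT NULL,
        PRIMARY KEY (raffle_id, user_id)
    )
""")

# The sample data from the answer.
rows = [(1, 1), (1, 3), (1, 7), (2, 1), (2, 2), (2, 3), (2, 6)]
conn.executemany("INSERT INTO raffles (raffle_id, user_id) VALUES (?, ?)", rows)


def users_in_raffle(raffle_id):
    """Hypothetical helper: all users subscribed to one raffle."""
    cur = conn.execute(
        "SELECT user_id FROM raffles WHERE raffle_id = ? ORDER BY user_id",
        (raffle_id,),
    )
    return [r[0] for r in cur]


def raffles_for_user(user_id):
    """Hypothetical helper: all raffles one user is subscribed to."""
    cur = conn.execute(
        "SELECT raffle_id FROM raffles WHERE user_id = ? ORDER BY raffle_id",
        (user_id,),
    )
    return [r[0] for r in cur]
```

Both directions of the lookup are a plain WHERE clause, which is what the comma-separated column cannot offer.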
See [What is the best way to add users to multiple groups in a database?][1]
Raffles are the "Groups". "UserInGroup" becomes UserInRaffle, your join table.
Related
I'm having trouble working out how to model something in a database. I have two solutions for a problem, and I was wondering which one is best.
Problem :
Let's picture a table speciality with 2 fields : speciality_id and speciality_name.
So for example :
1 - Mage
2 - Warrior
3 - Priest
Now, I have a table user with fields such as user_id, name, firstname etc ...
In this table, there is a field called speciality. The speciality stores an integer, corresponding to the speciality_id of the table speciality.
That would be acceptable for users that have only one speciality. I want to improve the model to be able to have multiple specialities for a user.
Here are my two solutions :
1. Create a table 'solution1' which links the user_id with the speciality_id, and remove the speciality field from the user table. So for a user who has 2 specialities, 2 rows will be created in the table 'solution1'.
2. Change the type of the speciality field in the user table so that I can write down all the specialities in it, separated by a delimiter.
For example: 2;3
The problem I have with the second solution is creating foreign keys between my user table and my speciality table to link them. I may also have more difficulty with the PHP in the future when I want to get the specialities for a user (I guess I will need a parser).
Which solution do you find is the best ?
Thanks.
Absolutely go with your first solution.
Create a third "Many-to-Many" table that allows you to relate a user to multiple specialties. This is the only way to go in your case.
When designing tables, you always want to have each column contain one and only one data element. Think about what querying your second solution would look like. What would you do when you wanted to see all users who had a given specialty?
You might try something like this:
select * from user where specialty like '%2%'
Well, what happens when you have specialties that go to 12? Now "2" matches multiple entities. You could devolve further and try to be tricky, but...you really should just make your data design as normal as possible to avoid all the mess, headache, and errors. Go with Solution 1.
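The false-match problem described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module for illustration; the table names `user_csv` and `user_speciality` and the sample rows are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Solution 2: specialities packed into one text column.
    CREATE TABLE user_csv (user_id INTEGER PRIMARY KEY, speciality TEXT);
    INSERT INTO user_csv VALUES (1, '2;3'), (2, '12'), (3, '1;12');

    -- Solution 1: one row per user-speciality pair (hypothetical table name).
    CREATE TABLE user_speciality (
        user_id       INTEGER NOT NULL,
        speciality_id INTEGER NOT NULL,
        PRIMARY KEY (user_id, speciality_id)
    );
    INSERT INTO user_speciality VALUES (1, 2), (1, 3), (2, 12), (3, 1), (3, 12);
""")

# Pattern match on the packed column: '%2%' also matches '12'.
like_matches = [
    r[0]
    for r in conn.execute(
        "SELECT user_id FROM user_csv WHERE speciality LIKE '%2%' ORDER BY user_id"
    )
]

# Exact match on the join table: only users who really have speciality 2.
join_matches = [
    r[0]
    for r in conn.execute(
        "SELECT user_id FROM user_speciality WHERE speciality_id = 2 ORDER BY user_id"
    )
]
```

With this data the LIKE query returns all three users, while the join-table query returns only user 1, the one who actually has speciality 2.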
I think the best way is to follow solution 1, because solution 2 will end up causing a lot of complexity later on.
I have a table Things and I want to add ownership relations to a table Users. I need to be able to quickly query the owners of a thing and the things a user owns. If I know that there will be at most 50 owners, and the pdf for the number of owners will probably look like this, should I rather
add 50 columns to the Things table, like CoOwner1Id, CoOwner2Id, …, CoOwner50Id, or
should I model this with a Ownerships table which has UserId and ThingId columns, or
would it better to create a table for each thing, for example Thing8321Owners with a row for each owner, or
perhaps a combination of these?
The second choice is the correct one; you should create an intermediate table between the table Things and the table Owners (that contains the details of each owner).
This table should have the thing_id and the owner_id as the primary key.
So in the end you will have 3 tables:
Things (the things details and data)
Owner (the owners details and data)
Ownerships (the assignment of each thing_id to an owner_id)
This matters because a relational DB should not hold any redundant data.
You should definitely go with option 2, because what you are trying to model is a many-to-many relationship. (Many owners can relate to a thing. Many things can relate to an owner.) This is commonly accomplished using what I call a bridging table, which is exactly what option 2 is. It is a standard technique in a normalized database.
The other two options are going to give you nightmares trying to query or maintain.
With option 1 you'll need to join the User table to the Thing table on 50 columns to get all of your results. And what happens when you have a really popular thing that 51 people want to own?
Option 3 is even worse. The only way to easily query the data is to use dynamic sql or write a new query each time because you don't know which Thing*Owners table to join on until you know the ID value of the thing you're looking for. Or you're going to need to join the User table to every single Thing*Owners table. Adding a new thing means creating a whole new table. But at least a thing doesn't have a limit on the number of owners it could possibly have.
Now isn't this:
SELECT Users.Name, Things.Name
FROM Users
INNER JOIN Ownerships ON Users.UserId = Ownerships.UserId
INNER JOIN Things ON Things.ThingId = Ownerships.ThingId
much easier than any of those other scenarios?
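For anyone who wants to run the bridging-table design, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows, the extra index, and the helper names `owners_of` and `things_owned_by` are assumptions made for illustration; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users  (UserId  INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Things (ThingId INTEGER PRIMARY KEY, Name TEXT NOT NULL);

    -- Bridging table: one row per (thing, owner) pair, no fixed owner limit.
    CREATE TABLE Ownerships (
        UserId  INTEGER NOT NULL REFERENCES Users(UserId),
        ThingId INTEGER NOT NULL REFERENCES Things(ThingId),
        PRIMARY KEY (ThingId, UserId)
    );
    -- Assumed extra index so lookups by user are as direct as lookups by thing.
    CREATE INDEX idx_ownerships_user ON Ownerships (UserId, ThingId);

    INSERT INTO Users  VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Things VALUES (10, 'Boat'), (11, 'Car');
    -- Column order is (UserId, ThingId): Ann owns both, Bob co-owns the boat.
    INSERT INTO Ownerships VALUES (1, 10), (2, 10), (1, 11);
""")


def owners_of(thing_id):
    """Hypothetical helper: names of everyone who owns a given thing."""
    cur = conn.execute(
        """SELECT Users.Name FROM Users
           INNER JOIN Ownerships ON Users.UserId = Ownerships.UserId
           WHERE Ownerships.ThingId = ? ORDER BY Users.Name""",
        (thing_id,),
    )
    return [r[0] for r in cur]


def things_owned_by(user_id):
    """Hypothetical helper: names of everything a given user owns."""
    cur = conn.execute(
        """SELECT Things.Name FROM Things
           INNER JOIN Ownerships ON Things.ThingId = Ownerships.ThingId
           WHERE Ownerships.UserId = ? ORDER BY Things.Name""",
        (user_id,),
    )
    return [r[0] for r in cur]
```

Adding a 51st owner is just one more row in Ownerships; no schema change is needed.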
Which one would be better (performance wise and maintenance), a database which creates table dynamically or just adding rows dynamically?
Suppose I am building a project in which I let users register. Say I have a table which stores only basic personal info, like name, dob, date of joining, address, phone, etc. Say 10 columns.
Now is the tricky part.
Scene 1: Creating multiple tables
When a user completes registration, a message table is created, so there is one message table per user. The number of rows in each message table varies from user to user.
In the same way, there is a cart table for each user, just like the message table.
So in scene 1, 2 tables are created with every registration.
Scene 2: Adding Rows
The scenario is same here as well, but in this case I have 2 tables for message and cart. Rows are added only when there is an activity.
Note:
You must assume that the number of users is more than 2000 and expect 50+ users to be active all the time. This means the message and cart tables are always busy in both cases: there are always update, delete, insert, and select queries running simultaneously.
Also, which scene will consume more disk space?
While writing this, it made me wonder what technique Facebook and others use, and whether they use the Scene 2 style (all users (billions) sharing the same big long message table)... Just wondering.
Database design has some basic rules, called "database normalization", that help us eliminate redundant data.
1st Normal Form
Store one piece of information in only one column; a column should hold only one piece of information.
2nd Normal Form
Every non-key column should depend on the table's whole key, so a table holds only columns that belong together, and all the related columns live in one table.
Now look at your proposed design: a separate table for each user splits the same columns, describing the same kind of information, across thousands of tables, which goes against the idea of keeping all related columns in one table.
You need to create one table and put all the related columns in it for all the users. You can then query your data with plain T-SQL. If you have a table for each user, my guess is that every query your application executes will have to be built dynamically, and dynamic SQL is one of the SQL devils that you want to avoid whenever possible.
My suggestion would be to read more about database design. Once you have a basic understanding of it, draw your design on a piece of paper and see whether it provides everything your business requires of this application. Spend some time on it now and it will save you a lot of pain later.
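As a rough illustration of the single-table approach (Scene 2), here is a sketch using Python's built-in sqlite3 module. The column names, the index, and the helper functions `register`, `add_message`, and `messages_for` are assumptions for this example, not something taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- One shared table for every user's messages; user_id says whose row it is.
    CREATE TABLE messages (
        message_id INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL REFERENCES users(user_id),
        body       TEXT NOT NULL
    );
    -- Assumed index so per-user lookups do not scan the whole table.
    CREATE INDEX idx_messages_user ON messages (user_id);
""")


def register(name):
    """Registration only adds a row; no new table is created."""
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    return cur.lastrowid


def add_message(user_id, body):
    conn.execute(
        "INSERT INTO messages (user_id, body) VALUES (?, ?)", (user_id, body)
    )


def messages_for(user_id):
    """The same parameterized query works for every user: no dynamic SQL."""
    cur = conn.execute(
        "SELECT body FROM messages WHERE user_id = ? ORDER BY message_id",
        (user_id,),
    )
    return [r[0] for r in cur]


alice = register("alice")
bob = register("bob")
add_message(alice, "hello")
add_message(bob, "hi")
add_message(alice, "bye")
```

The table count stays at two no matter how many users register, and the query text never has to be assembled from a table name. A cart table would follow the same pattern.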
I guess that title isn't very descriptive, so I will explain! I have a table called users_favs which stores all the info about which posts a user has liked and which posts he has favourited, and the same for comments. The info is stored as a serialized array or as JSON, it doesn't matter which.
Question: What is better? To stay like this, or to make 4 tables, one for each of the fields, and store the data not in serialized form but as rows like user_id => post_id?
What worries me about the second option is that after some time this will be GIGANTIC. Also, I will need to make 4 queries (or use JOINs) to get all of the info from these tables.
Keeping it in 1 table means that you'll only need 1 table access and 0 joins to get all the data. When storing it in 4 tables, you'll need at least 1 table access and n-1 joins when you need n fields of information. Your result set at the end of the query will probably be the same, so the amount of data sent over the network is independent of your table structure.
I presume a scenario where you will have data for fav_categories while the other columns are null, and similarly for the columns fav_posts, liked_posts, and liked_comments. So there is a high probability that in each row only three columns will have data most of the time (id, user_id, and any one of the rest). If my assumptions and the use cases are right, then I would definitely go for four tables.
To add to the above, you can always choose whether you want to make the design read-friendly or write-friendly.
I currently maintain a single DB table that has some info for images that are stored in a file system. This setup works well with the several hundred thousand photos I currently have recorded.
For a user's default image I maintain a separate folder that contains the photo, but this has become a maintenance nightmare. Should I create a second table that stores a reference to the default photo from table 1, or is it better to add a new field to table 1 that's a boolean I can set to indicate a default photo?
My table looks something like this:
image_table
id  user_id  file_name
1   6        xvy.jpeg
2   6        abc.jpeg
3   6        def.jpeg
Proposed solution:
image_table
id  user_id  file_name  default
1   6        xvy.jpeg   0
2   6        abc.jpeg   1
3   6        def.jpeg   0
In this proposed solution it seems as though I would need to make two SQL calls if a user changes their default photo: one to reset the old default and then a second to set the new one...
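One possible way around the two-call concern is a single UPDATE that rewrites the flag for all of a user's rows at once. The sketch below uses Python's built-in sqlite3 module; the column is named `is_default` here (an assumption, since DEFAULT is a reserved word in SQL) and `set_default` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE image_table (
        id         INTEGER PRIMARY KEY,
        user_id    INTEGER NOT NULL,
        file_name  TEXT NOT NULL,
        is_default INTEGER NOT NULL DEFAULT 0  -- assumed name for the flag
    );
    INSERT INTO image_table VALUES
        (1, 6, 'xvy.jpeg', 0),
        (2, 6, 'abc.jpeg', 1),
        (3, 6, 'def.jpeg', 0);
""")


def set_default(user_id, image_id):
    """Set one image as the default and clear the flag on the user's others."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            """UPDATE image_table
               SET is_default = CASE WHEN id = ? THEN 1 ELSE 0 END
               WHERE user_id = ?""",
            (image_id, user_id),
        )


set_default(6, 3)
```

Because the reset and the set happen in one statement, there is no moment at which the user has zero or two default photos.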
It is better to add a new field than a new table when the second table would have columns identical to the first.
Reasoning: If I need to get values from both tables, I would need to do a cumbersome UNION. What if you had three or more tables that all had the same kind of data, and I wanted all of them at once? It just gets clunkier and clunkier and more awkward to code against.
Well, I can see you are using some SQL database, but don't take me wrong: why don't you try a NoSQL database such as MongoDB? To me, neither creating a field as a flag nor creating a new table seems to be a good design.