How do I normalize this table? It has a tree-like structure that is expected to keep growing.
By tree-like structure I mean that new students, subjects, levels and chapters will constantly be added, updated or removed.
I want to store the results of a quiz in this table. The quiz has multiple subjects, each subject has multiple levels, and each level has multiple chapters. Every student can take different subjects.
So is this table good for storing the results, or do I need to change it?
In this particular case you need to create several independent tables:
Table "Student"
ID, Name
1, John
2, Jack
Table "Subject"
ID, Name
1, Math
2, Science
3, Geography
4, History
5, English
Table "Levels"
ID, Name
1, Intermediate
2, Moderate
3, Difficult
Table "Chapters"
ID, Name
1, Chapter 1
2, Chapter 2
3, Chapter 3
And so on and so on.
Then you define the relations between the tables, like this:
Table "student_subject_level"
ID, student_id, subject_id, level_id
1, 1, 1, 1 (John, Math, Intermediate)
2, 1, 2, 2 (John, Science, Moderate)
So far you have the student, the corresponding subject and the subject's level. Since we may have multiple chapters for each level, we need another relation:
Table "student_subject_level_chapter" (or use simpler name)
student_subject_level_id, chapter_id
1, 1 (John, Math, Intermediate, Chapter 1)
1, 2 (John, Math, Intermediate, Chapter 2)
2, 1 (John, Science, Moderate, Chapter 1)
And so on and so on. Start by isolating the individual tables and then figure out how you'd like to express the actual relation. For each new relation where you would otherwise have redundant data, add a new table that holds just the relation you need. It's much easier once you have IDs to refer to, so start with the individual tables and work your way through.
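As a rough sketch of the layout above, here is one way it could look using Python's built-in sqlite3 module. The table names follow the answer; the score column and the sample scores are made up for illustration (the answer does not say where the quiz result itself is stored), and the function names are hypothetical.

```python
import sqlite3

def build_quiz_schema(conn):
    """Create the lookup tables and the two relation tables."""
    conn.executescript("""
        CREATE TABLE student  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE subject  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE levels   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE chapters (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

        CREATE TABLE student_subject_level (
            id         INTEGER PRIMARY KEY,
            student_id INTEGER NOT NULL REFERENCES student(id),
            subject_id INTEGER NOT NULL REFERENCES subject(id),
            level_id   INTEGER NOT NULL REFERENCES levels(id),
            UNIQUE (student_id, subject_id, level_id)
        );

        CREATE TABLE student_subject_level_chapter (
            student_subject_level_id INTEGER NOT NULL
                REFERENCES student_subject_level(id),
            chapter_id INTEGER NOT NULL REFERENCES chapters(id),
            score      INTEGER,  -- assumption: the quiz result lives here
            PRIMARY KEY (student_subject_level_id, chapter_id)
        );
    """)

def load_sample_rows(conn):
    """Insert the sample rows from the answer (scores are invented)."""
    conn.executemany("INSERT INTO student VALUES (?, ?)",
                     [(1, "John"), (2, "Jack")])
    conn.executemany("INSERT INTO subject VALUES (?, ?)",
                     [(1, "Math"), (2, "Science")])
    conn.executemany("INSERT INTO levels VALUES (?, ?)",
                     [(1, "Intermediate"), (2, "Moderate"), (3, "Difficult")])
    conn.executemany("INSERT INTO chapters VALUES (?, ?)",
                     [(1, "Chapter 1"), (2, "Chapter 2"), (3, "Chapter 3")])
    conn.executemany("INSERT INTO student_subject_level VALUES (?, ?, ?, ?)",
                     [(1, 1, 1, 1), (2, 1, 2, 2)])
    conn.executemany("INSERT INTO student_subject_level_chapter VALUES (?, ?, ?)",
                     [(1, 1, 80), (1, 2, 65), (2, 1, 90)])

def results_for(conn, student_name):
    """Return (subject, level, chapter, score) rows for one student."""
    return conn.execute("""
        SELECT sub.name, lvl.name, ch.name, r.score
        FROM student_subject_level_chapter AS r
        JOIN student_subject_level AS ssl ON ssl.id = r.student_subject_level_id
        JOIN student  AS st  ON st.id  = ssl.student_id
        JOIN subject  AS sub ON sub.id = ssl.subject_id
        JOIN levels   AS lvl ON lvl.id = ssl.level_id
        JOIN chapters AS ch  ON ch.id  = r.chapter_id
        WHERE st.name = ?
        ORDER BY ssl.id, ch.id
    """, (student_name,)).fetchall()
```

Adding a new subject, level or chapter is then just a new row in one lookup table, and no existing rows need to change.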
I am trying to make a web app for learning Chinese characters, and I want to keep a record of the characters each user has learned in a database.
I thought of keeping the record in a words_learned column in the users table, holding an array with the character_id of each character the user already knows.
But I am a beginner, so I don't know if this is efficient. Is the design right? Should I use many columns instead of an array? Or is the whole design wrong?
Characters table
character_id character pinyin meaning
1 我 wo3 i
2 你 ni3 you
3 他 ta1 he
.
.
.
600 山 shan1 mountain
Users table
user_id user password words_learned
1 john 1234 {1, 5, 68, 599}
2 chuck passwd {2, 3, 5, 6, 8, 90, 160}
Generally this is normalized with another table that contains both a user_id and a character_id, one row per character a user has learned.
This can become a huge table, but that's OK.
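A minimal sketch of such a linking table, using Python's built-in sqlite3 module. The table name user_characters, the username column and the helper function names are assumptions for illustration; the password column is left out to keep the example short.

```python
import sqlite3

def create_learning_schema(conn):
    """Create users, characters and the table that links them."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE users (
            user_id  INTEGER PRIMARY KEY,
            username TEXT NOT NULL UNIQUE
        );
        CREATE TABLE characters (
            character_id INTEGER PRIMARY KEY,
            "character"  TEXT NOT NULL,
            pinyin       TEXT NOT NULL,
            meaning      TEXT NOT NULL
        );
        -- One row per (user, character) pair replaces the array column.
        CREATE TABLE user_characters (
            user_id      INTEGER NOT NULL REFERENCES users(user_id),
            character_id INTEGER NOT NULL REFERENCES characters(character_id),
            PRIMARY KEY (user_id, character_id)
        );
    """)

def mark_learned(conn, user_id, character_id):
    """Record that a user knows a character; repeated calls are harmless."""
    conn.execute(
        "INSERT OR IGNORE INTO user_characters (user_id, character_id) "
        "VALUES (?, ?)",
        (user_id, character_id),
    )

def learned_characters(conn, user_id):
    """Return the characters a user has learned, ordered by character_id."""
    rows = conn.execute("""
        SELECT c."character"
        FROM user_characters AS uc
        JOIN characters AS c ON c.character_id = uc.character_id
        WHERE uc.user_id = ?
        ORDER BY c.character_id
    """, (user_id,)).fetchall()
    return [row[0] for row in rows]
```

The composite primary key stops the same character being recorded twice for one user, and "who has learned character X" becomes an ordinary indexed lookup instead of a scan through arrays.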
Suppose I have the following columns for a csv that I read through a 'File Reader' node:
id, name, city, income
After reading it, I notice that the column 'city' contains a huge number of unique values. I want to:
Know which values are the 'k' most frequent for 'city'
Modify those which are not the 'k' most frequent to hold something like 'other'
Example:
id, name, city, income
1, Person 1, New York, 100.000
2, Person 2, Toronto, 90.000
3, Person 3, New York, 50.000
4, Person 4, Seattle, 60.000
Choosing k to be 1, I want to produce the following table:
id, name, city, income
1, Person 1, New York, 100.000
2, Person 2, Other, 90.000
3, Person 3, New York, 50.000
4, Person 4, Other, 60.000
This is because 'New York' is the single most frequent value for 'city' in the original table.
Do you know how I can do that using KNIME?
Thanks a lot!
You can use the CSV Reader to read the data. With the Statistics and Row Filter nodes you can find the k most frequent values. From those, you can create a collection cell using GroupBy. With that collection value, you can use the Rule Engine with a ruleset similar to this:
$city$ IN $most frequent cities$ => $city$
TRUE => "Other"
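For readers who want to check the logic outside KNIME, the same idea can be sketched in a few lines of plain Python. This is not a KNIME workflow, only an illustration of "count, keep the top k, replace the rest"; the function name keep_top_k is made up.

```python
from collections import Counter

def keep_top_k(values, k, other="Other"):
    """Replace every value that is not among the k most frequent with `other`."""
    counts = Counter(values)
    # most_common(k) returns the k (value, count) pairs with the highest counts.
    top_values = {value for value, _ in counts.most_common(k)}
    return [value if value in top_values else other for value in values]

cities = ["New York", "Toronto", "New York", "Seattle"]
print(keep_top_k(cities, 1))
```

Note that when several values are tied at the cut-off, which of them counts as "top k" is an arbitrary choice in this sketch.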
I have a table:
id, number, name, display_name
0001, 1, Category 1, null
0001-0002, 2, Category 2, null
0001-0002-0003, 3, Category 3, null
The id is the full path to the category, the number is just the final category number.
I'd like display_name updated to include the full names from all categories in the path, so they'd end up as
0001, 1, Category 1, Category 1
0001-0002, 2, Category 2, Category 1 > Category 2
0001-0002-0003, 3, Category 3, Category 1 > Category 2 > Category 3
I know I can generate these on the fly with lookups on the number column, but this table rarely changes and is read very often, so it seems wasteful not to calculate and store the data once. I can do this in PHP, but it's slow and I assume there's a better way. Or perhaps I'm going about this in completely the wrong way. I realise there's plenty of redundancy in the table... I'm happy for any input.
I got as far as
update categories set display_name=(select name from (select name from categories where number=1) t) where number=1
but that obviously just copies the name to the display name.
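One possible approach, sketched here with Python's built-in sqlite3 module rather than PHP: load the number-to-name mapping once, build every display name in memory, and write them all back in a single batch. It assumes that each segment of the id path is the zero-padded number of a category and that number is unique; the function name rebuild_display_names is made up.

```python
import sqlite3

def rebuild_display_names(conn, separator=" > "):
    """Fill display_name with the names of every category along the id path."""
    # One query to load the lookup: category number -> name.
    names = {
        number: name
        for number, name in conn.execute("SELECT number, name FROM categories")
    }
    paths = [row[0] for row in conn.execute("SELECT id FROM categories")]

    updates = []
    for path in paths:
        # "0001-0002" -> [1, 2] -> ["Category 1", "Category 2"]
        parts = [names[int(segment)] for segment in path.split("-")]
        updates.append((separator.join(parts), path))

    # One batched UPDATE instead of a lookup query per path segment.
    conn.executemany(
        "UPDATE categories SET display_name = ? WHERE id = ?", updates
    )
    conn.commit()
```

Because the table rarely changes, this only has to run again when a category is added or renamed.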
I have a database table that holds users' vehicles (cars, motorcycles). I want to get the most similar vehicles out of that table. Let's say the table holds the following columns (with some context to get the idea):
table: vehicles
vehicle_id (pk, auto-increment)
model_id (BMW 3er, Honda Accord)
fuel_type (gasoline, diesel)
body_style (sedan, coupe)
year
engine_size (2.0L)
engine_power (150hp)
So in short, I want to select N (usually 3) rows that have at least the same model_id and rank them by the number of similarities they share with the seed vehicle. For example, if the fuel_type matches, the row gets +3 rank points, but if the body_style matches, it gets +1. Ideally I would get N vehicles with the maximum points, but the idea is to still get something when I don't.
Since my table currently has only around 5k rows and is growing slowly, I decided to use the following simple approach (it came to me just after I wrote the question).
Let's say the seed is a Honda Accord (model_id 456), 2004, gasoline, 2.0L, 155hp, sedan, with auto-increment ID 123.
SELECT vehicles.*,
(IF(`fuel_type`='gasoline', 3, 0) +
IF(`body_style`='sedan', 1, 0) +
IF(`year` > 2001 AND `year` < 2007, 2, 0) +
IF(`engine_size` >= 1.8 AND `engine_size` <= 2.2, 1, 0) +
IF(`engine_power`=155, 3, IF(`engine_power`>124 AND `engine_power`<186, 1, 0))) AS `rank`
FROM vehicles
WHERE vehicle_id!=123 AND model_id=456
ORDER BY `rank` DESC
LIMIT 3
It will work as long as I don't have too many rows. If the table grows to 50-100k rows, I will probably have to switch to something like Lucene?
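A variation on the query above, sketched with Python's built-in sqlite3 module: instead of hard-coding the seed's values, it joins the table to the seed row so the same statement works for any seed. The weights and ranges mirror the ones in the question. The function name most_similar is made up, and SQLite's CASE stands in for MySQL's IF().

```python
import sqlite3

RANK_QUERY = """
SELECT v.vehicle_id,
       ( CASE WHEN v.fuel_type  = s.fuel_type  THEN 3 ELSE 0 END
       + CASE WHEN v.body_style = s.body_style THEN 1 ELSE 0 END
       -- within 2 years either side of the seed's year
       + CASE WHEN ABS(v.year - s.year) < 3 THEN 2 ELSE 0 END
       -- within 0.2 L of the seed's engine size
       + CASE WHEN v.engine_size BETWEEN s.engine_size - 0.2
                                     AND s.engine_size + 0.2
              THEN 1 ELSE 0 END
       -- exact power match scores 3, within 30 hp scores 1
       + CASE WHEN v.engine_power = s.engine_power THEN 3
              WHEN ABS(v.engine_power - s.engine_power) <= 30 THEN 1
              ELSE 0 END
       ) AS score
FROM vehicles AS v
JOIN vehicles AS s ON s.vehicle_id = :seed_id
WHERE v.vehicle_id != s.vehicle_id
  AND v.model_id = s.model_id
ORDER BY score DESC, v.vehicle_id
LIMIT :n
"""

def most_similar(conn, seed_id, n=3):
    """Return up to n (vehicle_id, score) pairs, best match first."""
    return conn.execute(RANK_QUERY, {"seed_id": seed_id, "n": n}).fetchall()
```

Since the WHERE clause first narrows the candidates to one model_id, an index on model_id would likely keep the scoring cheap well beyond a few thousand rows, though that is worth measuring on real data before reaching for a search engine.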
I want to make an app that allows users to add other users to a personal friends list. In my database there is a table called 'users'. Every user has a unique id and a unique username. Now every user needs to be able to have a list of friends.
I think the best option to save these friends lists is to create a separate table with two columns for every user: one column for the friends' ids and one for their usernames.
That way I can search and retrieve a friend's username and id at the same time. On the downside, I will need to create a huge number of tables (hundreds, thousands, perhaps millions), one for each user.
Will this make selecting a table from the database slow?
Will this unnecessarily cost a huge amount of space on the server?
Is there a better way to save a friends list for every user?
You should not do that.
Instead do something like
UserTable
* Id
* UserName
FriendsTable
* UserId
* FriendId
You may need to read a little about relational databases.
This way a user can be friends with a lot of people. Consider this example:
UserTable
1, Joey
2, Rachel
3, Chandler
4, Ross
5, Phoebe
6, Monica
FriendsTable
1, 2
1, 3
1, 4
1, 5
1, 6
2, 3
2, 4
2, 5
2, 6
3, 4
3, 5
3, 6
4, 5
4, 6
5, 6
Here all the people from Friends are friends with each other.
I don't think you need to go down that route. If you have a table of users (user_id, user_name), for example, and another table of friendships (friendship_id, user_id1, user_id2), then you can store all friendships in one table, with friendship_id as its unique id.
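A minimal sketch of that single friendships table, using Python's built-in sqlite3 module. Storing each pair with the smaller id first is an extra assumption on top of the answer, made so that a friendship is saved only once whichever side adds it; the function names are hypothetical.

```python
import sqlite3

def create_friend_schema(conn):
    """Create one users table and one friendships table for all users."""
    conn.executescript("""
        CREATE TABLE users (
            user_id   INTEGER PRIMARY KEY,
            user_name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE friendships (
            friendship_id INTEGER PRIMARY KEY,
            user_id1 INTEGER NOT NULL REFERENCES users(user_id),
            user_id2 INTEGER NOT NULL REFERENCES users(user_id),
            CHECK (user_id1 < user_id2),   -- assumption: smaller id first
            UNIQUE (user_id1, user_id2)
        );
    """)

def add_friendship(conn, user_a, user_b):
    """Store a friendship once, regardless of who added whom."""
    if user_a == user_b:
        raise ValueError("a user cannot befriend themselves")
    low, high = sorted((user_a, user_b))
    conn.execute(
        "INSERT OR IGNORE INTO friendships (user_id1, user_id2) VALUES (?, ?)",
        (low, high),
    )

def friends_of(conn, user_id):
    """Return (user_id, user_name) for every friend of the given user."""
    return conn.execute("""
        SELECT u.user_id, u.user_name
        FROM friendships AS f
        JOIN users AS u
          ON u.user_id = CASE WHEN f.user_id1 = :uid
                              THEN f.user_id2 ELSE f.user_id1 END
        WHERE :uid IN (f.user_id1, f.user_id2)
        ORDER BY u.user_id
    """, {"uid": user_id}).fetchall()
```

The join returns the friend's id and username together, which is what the per-user tables were meant to provide, without creating a table for each user.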