Is it ever best practice, or even recommended, to use a table like the following?
id, uid, fieldname, fieldvalue
4, 12, gender, male
5, 12, age, 21-30
6, 12, location, 5
7, 13, gender, female
8, 13, age, 31-40
9, 13, location, 5
10, 14, gender, female
11, 14, age, 31-40
12, 14, location, 6
13, 15, gender, male
14, 15, age, 21-30
15, 15, location, 7
It is not normalised and you cannot specify the data type of the field.
Would the following not be better?
id, uid, gender, age, location
4, 12, male, 21-30, 5
5, 13, female, 31-40, 5
6, 14, female, 31-40, 6
7, 15, male, 21-30, 7
I want to know if you can ever justify having such a table in a database. I know that the first method may make it easier to add more fields (instead of altering the database) and will probably remove all null values.
However, one cannot specify the data type, and you will have to convert from string every time you want to use the data.
So is there ever a scenario where the first table is considered the best practice or solution?
Working on a system that uses that method will make you lose your sanity. The complexity of the queries required in order to perform basic tasks is dreadful, and performance is a nightmare.
Here's one man's experience: https://www.simple-talk.com/opinion/opinion-pieces/bad-carma/
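To make the query complexity concrete, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. The table names user_fields and user_profile are hypothetical (the question never names its tables). The same question, "which users are male and aged 21-30?", needs one self-join per attribute on the first layout and a plain WHERE clause on the second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Layout 1: one row per (user, field) pair. Table name is hypothetical.
conn.execute(
    "CREATE TABLE user_fields ("
    "id INTEGER PRIMARY KEY, uid INTEGER, fieldname TEXT, fieldvalue TEXT)"
)
conn.executemany(
    "INSERT INTO user_fields VALUES (?, ?, ?, ?)",
    [
        (4, 12, "gender", "male"), (5, 12, "age", "21-30"), (6, 12, "location", "5"),
        (7, 13, "gender", "female"), (8, 13, "age", "31-40"), (9, 13, "location", "5"),
        (10, 14, "gender", "female"), (11, 14, "age", "31-40"), (12, 14, "location", "6"),
        (13, 15, "gender", "male"), (14, 15, "age", "21-30"), (15, 15, "location", "7"),
    ],
)

# Every attribute in the filter costs another self-join.
eav_query = """
SELECT g.uid
FROM user_fields AS g
JOIN user_fields AS a ON a.uid = g.uid
WHERE g.fieldname = 'gender' AND g.fieldvalue = 'male'
  AND a.fieldname = 'age'    AND a.fieldvalue = '21-30'
ORDER BY g.uid
"""
eav_result = [row[0] for row in conn.execute(eav_query)]

# Layout 2: one row per user, one typed column per field. Table name is hypothetical.
conn.execute(
    "CREATE TABLE user_profile ("
    "id INTEGER PRIMARY KEY, uid INTEGER, gender TEXT, age TEXT, location INTEGER)"
)
conn.executemany(
    "INSERT INTO user_profile VALUES (?, ?, ?, ?, ?)",
    [
        (4, 12, "male", "21-30", 5),
        (5, 13, "female", "31-40", 5),
        (6, 14, "female", "31-40", 6),
        (7, 15, "male", "21-30", 7),
    ],
)
wide_query = """
SELECT uid FROM user_profile
WHERE gender = 'male' AND age = '21-30'
ORDER BY uid
"""
wide_result = [row[0] for row in conn.execute(wide_query)]

print(eav_result, wide_result)
```

Both queries return the same users; the difference is how much SQL each filter condition costs.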
You can normalize the initial setup by adding another table:
fieldnames (fnid, name)
fieldvalues (id, uid, fnid, value, unique(uid,fnid))
However, I would recommend against it because of its complexity. It's much easier to use a single table unless you are going to be adding and/or removing fields very frequently, or there could be a large disparity in which rows get which fields (in which case you should probably rethink the design of your DB and application).
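For reference, the two-table variant could be declared roughly as follows. This is only a sketch in SQLite via Python's sqlite3 module: the column types, the foreign key, and the try_insert helper are assumptions, since the answer gives only the column names and the unique(uid, fnid) constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE fieldnames (
    fnid INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE fieldvalues (
    id    INTEGER PRIMARY KEY,
    uid   INTEGER NOT NULL,
    fnid  INTEGER NOT NULL REFERENCES fieldnames(fnid),
    value TEXT,
    UNIQUE (uid, fnid)  -- at most one value per user per field
);
""")
conn.executemany(
    "INSERT INTO fieldnames (fnid, name) VALUES (?, ?)",
    [(1, "gender"), (2, "age"), (3, "location")],
)

def try_insert(uid, fnid, value):
    """Hypothetical helper: return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO fieldvalues (uid, fnid, value) VALUES (?, ?, ?)",
            (uid, fnid, value),
        )
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = try_insert(12, 1, "male")        # accepted
duplicate_ok = try_insert(12, 1, "female")  # rejected by UNIQUE (uid, fnid)
unknown_field_ok = try_insert(12, 99, "x")  # rejected: no fieldnames row with fnid 99
print(first_ok, duplicate_ok, unknown_field_ok)
```

The constraints stop duplicate and misspelled field names, but the value column is still untyped text, so the data-type objection from the question remains.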
The first type of structure you describe is quite common in applications where attributes are not known in advance. For example, you might be developing a system where a user can save details about contacts. You might know some of the details that will be stored, but not others. So, you might allow the user to define custom-attributes for a contact.
Of course, this type of design means that one loses database-applied checks, and has to rely on application-applied checks. That's a con, but should not be a deal-breaker if the flexibility is really required.
I am trying to make a web app for learning Chinese characters, and I want to keep a record in a database of the characters each user has learned.
I thought of keeping this record in a words_learned column in the users table, holding an array with the character_id of every character the user already knows.
But I am a beginner, so I don't know if this is efficient. Is the design right? Should I use many columns instead of an array? Or is the whole design wrong?
Characters table
character_id character pinyin meaning
1 我 wo3 i
2 你 ni3 you
3 他 ta1 he
.
.
.
600 山 shan1 mountain
Users table
user_id user password words_learned
1 john 1234 {1, 5, 68, 599}
2 chuck passwd {2, 3, 5, 6, 8, 90, 160}
Generally this is normalized with another table, that would contain both a user_id and character_id.
This can be a huge table, but that's ok.
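A minimal sketch of that extra table, using SQLite through Python's sqlite3 module. Reusing the name words_learned for the new table, the column name hanzi (standing in for the question's "character" column), and the sample rows are assumptions; the password column is left out because it plays no part here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE characters (
    character_id INTEGER PRIMARY KEY,
    hanzi   TEXT NOT NULL,
    pinyin  TEXT,
    meaning TEXT
);
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    user    TEXT NOT NULL
);
-- One row per (user, character) pair replaces the array column.
CREATE TABLE words_learned (
    user_id      INTEGER NOT NULL REFERENCES users(user_id),
    character_id INTEGER NOT NULL REFERENCES characters(character_id),
    PRIMARY KEY (user_id, character_id)
);
""")
conn.executemany(
    "INSERT INTO characters VALUES (?, ?, ?, ?)",
    [(1, "我", "wo3", "i"), (2, "你", "ni3", "you"), (3, "他", "ta1", "he")],
)
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "john"), (2, "chuck")])
conn.executemany(
    "INSERT INTO words_learned VALUES (?, ?)",
    [(1, 1), (2, 2), (2, 3)],
)

def learned_by(user_name):
    """Return the characters a user has learned, ordered by character_id."""
    query = """
        SELECT c.hanzi
        FROM words_learned AS w
        JOIN users AS u      ON u.user_id = w.user_id
        JOIN characters AS c ON c.character_id = w.character_id
        WHERE u.user = ?
        ORDER BY c.character_id
    """
    return [row[0] for row in conn.execute(query, (user_name,))]

chuck_chars = learned_by("chuck")
john_chars = learned_by("john")
print(chuck_chars, john_chars)
```

Marking a character as learned becomes a single INSERT, and the composite primary key prevents recording the same character twice for one user.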
I was asked to create a report on adding up how many customers we have based on several different categories, like gender, age range, annual income range, and so on. These are distinct, unrelated partitions on the data, not just refinements. It appears that Access can only handle one partition in each query or report.
For example, if I use the grouping function in Reports, I will begin with (male) and (female), then proceed with subgroups (male, 18 - 35), (female, 18 - 35) and so on. Rather I need to count how many male and female customers there are. Then, forget about gender and count how many customers there are in each age group. Then, forget about age and proceed with a new partition and so on.
It will be awkward to write a new query or report each time. If nothing works, I am thinking to just export the counts into an Excel template.
There is no need for an Excel template; Access is perfectly capable of doing this by itself.
The most usual way to do this is with conditional sums, adding one Sum(IIf(...)) column for each count you need. E.g.:
SELECT Sum(IIf([sex] = "male", 1, 0)) AS CountMales,
       Sum(IIf([sex] = "female", 1, 0)) AS CountFemales,
       Sum(IIf([sex] = "male" And [age] Between 18 And 35, 1, 0)) AS CountMales18To35
FROM MyTable
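The same conditional-sum idea can be tried outside Access. The sketch below uses SQLite through Python's sqlite3 module, with CASE WHEN in place of Access's IIf; the customers table and its five rows are invented purely for illustration. Each output column counts one independent partition, all in a single pass over the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (sex TEXT, age INTEGER)")  # hypothetical table
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("male", 25), ("male", 40), ("female", 30), ("female", 52), ("male", 19)],
)

# Each SUM(CASE ...) adds 1 for rows matching its condition and 0 otherwise,
# so unrelated partitions (sex, age range) are counted side by side.
query = """
SELECT SUM(CASE WHEN sex = 'male'   THEN 1 ELSE 0 END) AS count_males,
       SUM(CASE WHEN sex = 'female' THEN 1 ELSE 0 END) AS count_females,
       SUM(CASE WHEN age BETWEEN 18 AND 35 THEN 1 ELSE 0 END) AS count_18_to_35,
       SUM(CASE WHEN age > 35 THEN 1 ELSE 0 END) AS count_over_35
FROM customers
"""
count_males, count_females, count_18_to_35, count_over_35 = conn.execute(query).fetchone()
print(count_males, count_females, count_18_to_35, count_over_35)
```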
Suppose I have the following columns for a csv that I read through a 'File Reader' node:
id, name, city, income
After reading it, I notice that the column 'city' contains a huge number of unique values. I want to:
Know which values are the 'k' most frequent for 'city'
Modify those which are not the 'k' most frequent to hold something like 'other'
Example:
id, name, city, income
1, Person 1, New York, 100.000
2, Person 2, Toronto, 90.000
3, Person 3, New York, 50.000
4, Person 4, Seattle, 60.000
Choosing k to be 1, I want to produce the following table:
id, name, city, income
1, Person 1, New York, 100.000
2, Person 2, Other, 90.000
3, Person 3, New York, 50.000
4, Person 4, Other, 60.000
This is because 'New York' is the single most frequent value for 'city' in the original table.
Do you know how I can do that using Knime?
Thanks a lot!
You can use the CSV Reader to read the data. With the Statistics and Row Filter nodes you can find the k most frequent values. From those, you can create a collection cell using GroupBy. With that collection value, you can use the Rule Engine with a ruleset similar to this:
$city$ IN $most frequent cities$ => $city$
TRUE => "Other"
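Outside KNIME, the logic of this workflow (count the values, keep the k most frequent, map everything else to "Other") can be checked with a few lines of plain Python. This is not KNIME code; the function name keep_top_k and its parameters are hypothetical, and the rows are the ones from the question.

```python
from collections import Counter

def keep_top_k(rows, column, k, other="Other"):
    """Replace every value in `column` that is not among the k most frequent with `other`."""
    counts = Counter(row[column] for row in rows)
    top_values = {value for value, _ in counts.most_common(k)}
    return [
        {**row, column: row[column] if row[column] in top_values else other}
        for row in rows
    ]

rows = [
    {"id": 1, "name": "Person 1", "city": "New York", "income": "100.000"},
    {"id": 2, "name": "Person 2", "city": "Toronto", "income": "90.000"},
    {"id": 3, "name": "Person 3", "city": "New York", "income": "50.000"},
    {"id": 4, "name": "Person 4", "city": "Seattle", "income": "60.000"},
]

result = keep_top_k(rows, "city", k=1)
cities = [row["city"] for row in result]
print(cities)
```

When several values tie for the last of the k places, which one is kept depends on the order in which they were first seen, so a real workflow may want an explicit tie-breaking rule.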
I have a table symptom_ratings containing the columns id, user_id, review_id, symptom_id, rate, and strain_id.
Each review can have multiple entries in symptom_ratings, one per symptom.
I would like to do a search for every strain_id that has all of the symptom_id's the user searches for.
That is, given the rows:
review: 2, strain_id: 3, symptom_id: 43
review: 2, strain_id: 3, symptom_id: 23
review: 2, strain_id: 3, symptom_id: 12
review: 6, strain_id: 1, symptom_id: 3
review: 6, strain_id: 2, symptom_id: 12
Searching for the symptom_id's 43 and 12 should only return results for strain_id 3.
I currently use the following WHERE condition:
Strain.id IN (SELECT strain_id
FROM symptom_ratings
WHERE symptom_id
IN ($symptoms))
where $symptoms is a comma-separated list of symptom_id values.
My problem is that this query currently performs an OR search (i.e. it finds strains that have any of the symptoms), whereas I'd prefer an AND search (i.e. finding strains that have all of the symptoms). How can I achieve that?
One way to do this would be to group the rows by the strain ID, count the number of distinct matching symptoms in each group, and return only those rows where the count equals the total number of symptoms searched for:
SELECT
strain_id,
COUNT(DISTINCT symptom_id) AS matched_symptoms
FROM symptom_ratings
WHERE symptom_id IN (43, 12)
GROUP BY strain_id
HAVING matched_symptoms = 2
Here's a quick online demo.
One potentially useful feature of this method is that it's trivial to extend it to support "all of these", "any of these" and "at least n of these" searches just by changing the condition in the HAVING clause. For the latter two cases, you can also sort the results by the number of matching symptoms (e.g. with ORDER BY matched_symptoms DESC).
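The grouping approach can be exercised against the sample rows from the question. The sketch below runs the query in SQLite through Python's sqlite3 module; the function name strains_with_symptoms and its min_matches parameter are hypothetical, and the HAVING clause repeats the COUNT expression rather than the alias so that it does not rely on alias support in HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE symptom_ratings (review_id INTEGER, strain_id INTEGER, symptom_id INTEGER)"
)
conn.executemany(
    "INSERT INTO symptom_ratings VALUES (?, ?, ?)",
    [(2, 3, 43), (2, 3, 23), (2, 3, 12), (6, 1, 3), (6, 2, 12)],
)

def strains_with_symptoms(conn, symptoms, min_matches=None):
    """Return (strain_id, matched_symptoms) rows.

    By default a strain must match all requested symptoms; pass
    min_matches=1 for "any of these" or n for "at least n of these".
    """
    wanted = sorted(set(symptoms))
    if not wanted:
        return []
    if min_matches is None:
        min_matches = len(wanted)
    # Bound parameters instead of interpolating the ID list into the SQL.
    placeholders = ", ".join("?" for _ in wanted)
    query = f"""
        SELECT strain_id, COUNT(DISTINCT symptom_id) AS matched_symptoms
        FROM symptom_ratings
        WHERE symptom_id IN ({placeholders})
        GROUP BY strain_id
        HAVING COUNT(DISTINCT symptom_id) >= ?
        ORDER BY matched_symptoms DESC, strain_id
    """
    return conn.execute(query, (*wanted, min_matches)).fetchall()

all_match = strains_with_symptoms(conn, [43, 12])
any_match = strains_with_symptoms(conn, [43, 12], min_matches=1)
print(all_match, any_match)
```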
I want to make an app that allows users to add other users to a personal friends list. In my database there is a table called 'users'. Every user has a unique id and a unique username. Now every user needs to be able to have a list of friends.
I think the best option for saving these friends lists is to create a separate table with two columns for every user: one column for the friends' ids and one for their usernames.
That way I can search for and retrieve a friend's username and id at the same time. On the downside, I will need to create a huge number of tables (hundreds, thousands, perhaps millions), one for each user.
Will this make selecting a table from the database slow?
Will this unnecessarily cost a huge amount of space on the server?
Is there a better way to save a friendslist for every user?
You should not do that.
Instead do something like
UserTable
* Id
* UserName
FriendsTable
* UserId
* FriendId
You may need to read a little about relational databases.
This way a user can be friends with a lot of people. Consider this example:
UserTable
1, Joey
2, Rachel
3, Chandler
4, Ross
5, Phoebe
6, Monica
FriendsTable
1, 2
1, 3
1, 4
1, 5
1, 6
2, 3
2, 4
2, 5
2, 6
3, 4
3, 5
3, 6
4, 5
4, 6
5, 6
Here the people from Friends are all friends with each other.
I don't think you need to go down that route. If you have a table of users (user_id, user_name), for example, and another table of friendships (friendship_id, user_id1, user_id2), then you will be able to store all friendships in one table, with friendship_id as its unique id.
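A minimal sketch of the single friendships table, using SQLite through Python's sqlite3 module and the six example users from the earlier answer. The CHECK and UNIQUE constraints, which store each friendship once with the smaller id first, and the friends_of helper are assumptions added for illustration; neither answer specifies them.

```python
import sqlite3
from itertools import combinations

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    user_id   INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL UNIQUE
);
CREATE TABLE friendships (
    friendship_id INTEGER PRIMARY KEY,
    user_id1 INTEGER NOT NULL REFERENCES users(user_id),
    user_id2 INTEGER NOT NULL REFERENCES users(user_id),
    CHECK (user_id1 < user_id2),   -- assumption: store each pair once, smaller id first
    UNIQUE (user_id1, user_id2)
);
""")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Joey"), (2, "Rachel"), (3, "Chandler"), (4, "Ross"), (5, "Phoebe"), (6, "Monica")],
)
# Everyone is friends with everyone else: 15 pairs for 6 users.
conn.executemany(
    "INSERT INTO friendships (user_id1, user_id2) VALUES (?, ?)",
    list(combinations(range(1, 7), 2)),
)

def friends_of(user_id):
    """Return the names of a user's friends, whichever side of the pair they are stored on."""
    query = """
        SELECT u.user_name
        FROM friendships AS f
        JOIN users AS u
          ON u.user_id = CASE WHEN f.user_id1 = ? THEN f.user_id2 ELSE f.user_id1 END
        WHERE f.user_id1 = ? OR f.user_id2 = ?
        ORDER BY u.user_id
    """
    return [row[0] for row in conn.execute(query, (user_id, user_id, user_id))]

chandler_friends = friends_of(3)
friendship_count = conn.execute("SELECT COUNT(*) FROM friendships").fetchone()[0]
print(chandler_friends, friendship_count)
```

One table with a row per friendship scales to millions of users without creating any new tables, and usernames are looked up through a join instead of being copied into every list.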