How to save language skill levels correctly in a database - mysql

I think I am before a problem where many of you were before. I have a registration form where a user can pick any language of the planet and then pick his skill level for the respective language from a selectbox.
So, for example:
Language1: German
Skill: Fluent
Language2: English
Skill: Basic
I'm thinking what's the best way to store these values in a MySQL database.
I thought of two ways.
First way: creating a column for each language and assigning a skill value to it.
--------------------------------------------------
| UserID | language_en | language_ge |
--------------------------------------------------
| 22 | 1 | 4 |
--------------------------------------------------
| 23 | 3 | 4 |
--------------------------------------------------
So the language is always the column's name and the number represents the skill level (1. Basic, 2. Average ... )
I believe this is a nice way to work with these things and it is also pretty fast. The problem starts when there are 50 languages or more. It doesn't sound like a good idea to make 50 columns where the script always have to check them all if a user have any skill in that language.
Second way: inserting an array in one of the table's column. The table will look like this:
----------------------------------
| UserID | languages |
----------------------------------
| 22 | "ge"=>"4", "en"=>"1" |
----------------------------------
This way the user with ID 22 has skill level 4 for Germany and skill level 1 for English. This is fine because we don't need to check 50 additional columns (or even more) but it's not the right way in my eyes anyway.
We have to parse a lot of results and find a user with, for example, has level 1 for Germany and level 2 for Spanish without looking for the English skill level - it will take the server's a longer time and when bigger data comes we are in trouble.
I bet many of you have experienced this kind of issue. Please, can someone advise me how to sort this out?
Thanks a lot.

I'd advise you to have a separate table with all the languages:
Table: Language
+------------+-------------------+--------------+
| LanguageID | LanguageNameShort | LanguageName |
+------------+-------------------+--------------+
| 1 | en | English |
| 2 | de | German |
+------------+-------------------+--------------+
And another table to link the users to the languages:
Table: LanguageLink
+--------+------------+--------------+
| UserID | LanguageID | SkillLevelID |
+--------+------------+--------------+
| 22 | 1 | 1 |
| 22 | 2 | 4 |
| 23 | 1 | 3 |
| 23 | 2 | 4 |
+--------+------------+--------------+
This is the normalised way to represent that kind of relations in a DB. All data is easily searchable and you don't have to change the DB scheme if you add a language.
To render a user's languages you could use a query like that. It will give you a row per lanugage a user speaks:
SELECT
LanguageLink.UserID,
LanguageLink.SkillLevelID,
Language.LanguageNameShort
FROM
LanguageLink,
Language
WHERE
LanguageLink.UserID = 22
AND LanguageLink.LanguageID = Language.LanguageID
If you want to go further, you could create another table fo the skill level:
Table: Skill
+--------------+-----------+
| SkillLevelID | SkillName |
+--------------+-----------+
| 1 | bad |
| 2 | mediocre |
| 3 | good |
| 4 | perfect |
+--------------+-----------+
What I've done here is called Database normalization. I'd recommend reading about it, it may help you design further databases.

Related

MS Access help needed forming a specific report

I have a table with a column for agent names and a column for each of the skills those agents could possibly have. Each skill the agent is assigned shows a 1 in the field under that skill.
Columns look like this:
+---------+----------+----------+----------+
| Name | 'Skill1' | 'Skill2' | 'Skill3' |
+---------+----------+----------+----------+
| John | 1 | | 1 |
| Sam | 1 | 1 | |
| Roberta | 1 | | 1 |
+---------+----------+----------+----------+
I would like to make a query that returns a list of all agent names that have a 1 for each particular skill. The query would return something like this:
+-----------+
| Skill 1 |
+-----------+
| John |
| Sam |
| Roberta |
+-----------+
Additionally I would like to be able to query a single name and retrieve all skills that agent has (all rows the Name column has a 1 in) like this:
+-----------+
| John |
+-----------+
| Skill 1 |
| Skill 3 |
+-----------+
I've done this in Excel using an index but I'm new to Access and not sure how to complete this task.
Thanks in advance.
One of the reasons that you are finding this task difficult is because your database is not normalised and so due to the way that your database is structured, you are working against MS Access, not with it.
Consequently, whilst a solution is still possible with the current data, the resulting queries will be painful to construct and will either be full of multiple messy iif statements, or several union queries performing the same operations over & over again, one for each 'skill'.
Then, if you every wish to add another Skill to the database, all of your queries have to be rewritten!
Whereas, if your database was normalised (as Gustav has suggested in the comments), the task would be a simple one-liner; and what's more, if you add a new skill later on, your queries will automatically output the results as if the skill had always been there.
Your data has a many-to-many relationship: an agent may have many skills, and a skill may be known by many agents.
As such, the most appropriate way to represent this relationship is using a junction table.
Hence, you would have a table of Agents such as:
tblAgents
+-----+-----------+----------+------------+
| ID | FirstName | LastName | DOB |
+-----+-----------+----------+------------+
| 1 | John | Smith | 1970-01-01 |
| ... | ... | ... | ... |
+-----+-----------+----------+------------+
This would only contain information unique to each agent, i.e. minimising the repeated information between records in the table.
You would then have a table of possible Skills, such as:
tblSkills
+-----+---------+---------------------+
| ID | Name | Description |
+-----+---------+---------------------+
| 1 | Skill 1 | Skill 1 Description |
| 2 | Skill 2 | Skill 2 Description |
| ... | ... | ... |
+-----+---------+---------------------+
Finally, you would have a junction table linking Agents to Skills, e.g.:
tblAgentSkills
+----+----------+----------+
| ID | Agent_ID | Skill_ID |
+----+----------+----------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
| 4 | 3 | 2 |
+----+----------+----------+
Now, say you want to find out which agents have Skill 1, the query is simple:
select Agent_ID from tblAgentSkills where Skill_ID = 1
What if you want to find out the skills known by an agent? Equally as simple:
select Skill_ID from tblAgentSkills where Agent_ID = 1
Of course, these queries will merely return the ID fields as present in the junction table - but since the ID uniquely identifies a record in the tblAgents or tblSkills tables, such ID is all you need to retrieve any other required information:
select
tblAgents.FirstName,
tblAgents.LastName
from
tblAgentSkills inner join tblAgents on
tblAgentSkills.AgentID = tblAgents.ID
where
tblAgentSkills.Skill_ID = 1
To get all agents with skill1, open the query designer and create the following query:
this will generate the following sql
SELECT Skills.AgentName
FROM Skills
WHERE (((Skills.Skill1)=1));
If you adjust the names you can also paste this query into the sql pane of the designer to get the query you want.
To get all the skills an agent has I chose a parameterized query. Open the query designer and create a new query:
When you run this query it will ask you for the name of the agent. Make sure to type the agent name exactly. Here is the resulting sql:
SELECT Skills.AgentName, Skills.Skill1, Skills.Skill2, Skills.Skill3
FROM Skills
WHERE (((Skills.AgentName)=[Agent]));
If you continue working with this query I would improve the table design by breaking your table into a skills table, agents table, skills&agents table. Then link the skills and agents tables to the skills&agents table in a many to many relationship. The query to get all an agents skills would then look like this in the designer:

MySQL - At what point should more than one table be used?

Edit for future viewers: Aside from the accepted answer which helped me I found some really good info here .
I've got a database with a single table for displaying inventory on a website (RVs). It stores the typical info: year, make, model, etc. I originally made it with 6 extra columns for storing "special features", but I don't like having such a hard limit on what options can be listed. Since I've never messed with more than a single table my gut instinct was to just add 24 or so more columns to cover everything, but something in my head told me that there might be a better way. So when do I decide N columns is too many? The data in these columns will commonly not be unique.
(Sorry for crappy diagram)
Current table design:
-----------------------------------------------------------------------
| id | year | make | model | price | ft_1 | ft_2 | ft_3 | ft_4 | ft_5 |
-----------------------------------------------------------------------
| | | | | | | | | | |
-----------------------------------------------------------------------
Possible better design:
table #1
------------------------------------
| id | year | make | model | price |
------------------------------------
| | | | | |
------------------------------------
table #2
---------------------------------------------
| unique_id(?) | feature | unit_ref |
---------------------------------------------
| 0 | "Diesel Pusher" | 2,6,14 |
---------------------------------------------
I feel like a bonus of the second table might be that I could more easily propagate a dropdown containing all the previously entered features to speed up adding new units to inventory.
Is this the right way to go about it, or should I just add more columns and be content?
Thanks.
Believe it or not, your best option would likely be to add a third table.
Since each record in your rvs table can be linked to multiple rows in the features table, and each feature can correspond to multiple rvs, you have a many-to-many relationship which is inherently difficult to maintain in a relational dbms. By adding a third "intersection" table you convert it to a one-to-many-to-one relationship which can be enforced declaratively by the dbms.
Your table structure would then become something like
rvs
------------------------------------
| id | year | make | model | price |
------------------------------------
| | | | | |
------------------------------------
features
--------------------------
| id | feature |
--------------------------
| 1192 | "Diesel Pusher" |
--------------------------
rv_features
----------------------
| rv_id | feature_id |
----------------------
| | |
----------------------
How do you make use of this? Suppose you want to record the fact that the 2016 Travelmore CampMaster has a 25kW diesel generator. You would first add a record to rvs like
--------------------------------------------------
| id | year | make | model | price |
--------------------------------------------------
| 0231 | 2016 | Travelmore | CampMaster | 750000 |
| 2101 | 2016 | Travelmore | Domestant | 650000 |
--------------------------------------------------
(Note the value in the id column is entirely arbitrary; its sole purpose is to serve as the primary key which uniquely identifies the record. It can encode meaningful information, but it must be something that will not change throughout the life of the record it identifies.)
You then add (or already have) the generator in the features table:
--------------------------------
| id | feature |
--------------------------------
| 1192 | Diesel Pusher 450hp |
| 3209 | diesel generator 25kW |
--------------------------------
Finally, you associate the rv to the feature with a record in rv_features:
----------------------
| rv_id | feature_id |
----------------------
| 0231 | 3209 |
| 0231 | 1192 |
| 2101 | 3209 |
----------------------
(I've added a few other records to each table for context.)
Now, to retrieve the features of the 2016 CampMaster, you use the following SQL query:
SELECT r.year, r.make, r.model, f.feature
FROM rvs r, features f, rv_features rf
WHERE r.id = rf.rv_id
AND rv.feature_id = f.id
AND r.id = '2031';
to get
----------------------------------------------------------
| year | make | model | feature |
----------------------------------------------------------
| 2016 | Travelmore | CampMaster | diesel generator 25kW |
| 2016 | Travelmore | CampMaster | Diesel Pusher 450hp |
----------------------------------------------------------
To see the rvs with a 25kW generator, change the query to
SELECT r.year, r.make, r.model, f.feature
FROM rvs r, features f, rv_features rf
WHERE r.id = rf.rv_id
AND rv.feature_id = f.id
AND f.id = '3209';
Sherantha's link to A Quick-Start Tutorial on Relational Database Design actually looks like a good intro to table design and normalization; you might find it useful.
There is a thing calles "third normal form" it says that everything without the unique ids shuld be unique. This means you need to make a table for year, a table for make a table for models etc and a table where you can combine all these ids to one connected dataset.
But this is not always practical, io think the best way to take this is something in between, like tables for entrys that repeat very often, but there dont need to be an extra table for price with unique ids, that would be overkill i think.
Based upon your scenario, if you believe no. of features columns remain same then no need for second table. And in case if there any possibility that features can be increased at any time in future then you should break up your table into two. (RVS & Features). Then create a third table that identify RVS & features as it seems there is many-to-many relationship. So I suggest you to use three tables.
I think it is better for you to be more familiar with relational database design. This is a short but great article I have found earlier.

Sum query for MySQL where field contain certain values

I need help with a Query, i have a table like this:
| ID | codehwos |
| --- | ----------- |
| 1 | 16,17,15,26 |
| 2 | 15,32,12,23 |
| 3 | 53,15,21,26 |
I need an outpout like this:
| codehwos | number_of_this_code |
| -------- | ---------------------- |
| 15 | 3 |
| 17 | 1 |
| 26 | 2 |
I want to sum all the time a code is used in a row.
Can anyone make a query for doing it for all the code in one time?
Thanks
You have a very poor data format. You should not store lists in strings and never store lists of numbers in strings. SQL has a great data structure for storing lists. Hint: it is called a "table" not a "string".
That said, sometimes one is stuck with other people's really poor design choices. We wouldn't make them ourselves, but we still need to get something done. Assuming you have a list of codes, you can do what you want with:
select c.code, count(*)
from codes c join
table t
on find_in_set(c.code, t.codehwos) > 0
group by c.code;
If you have any influence over the data structure, then advocate for a junction table, the right way to store this data in a relational database.

CRM Releationships MySQL

Our Company is developing a CRM and we came now to the point where we have to decide how we want to handle the releationships. This is an important point because there are going to be tons of them. And changing the structure later would be simply not cool..
I know 3 ways how we could do it:
One releationship table:
The way i would do this is creating one table holding all the releationships.
Table: releationships
+----+-------------+-----------+--------------+------------+
| id | record_type | record_id | belongs_type | belongs_id |
+----+-------------+-----------+--------------+------------+
| 1 | person | 42 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 2 | person | 43 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 3 | note | 23 | company | 12 |
+----+-------------+-----------+--------------+------------+
| 4 | attachment | 13 | company | 12 |
+----+-------------+-----------+--------------+------------+
Multiple releationship tables:
I think this is the way how it for example the SugarCRM does.
Table: company_realationships
+----+-----------+------------+--------+
| id | record_id | has_type | has_id |
+----+-----------+------------+--------+
| 1 | 12 | person | 42 |
+----+-----------+------------+--------+
| 2 | 12 | person | 43 |
+----+-----------+------------+--------+
| 3 | 12 | note | 23 |
+----+-----------+------------+--------+
| 2 | 12 | attachment | 13 |
+----+-----------+------------+--------+
All in the record table:
Table: person
+----+-----------+------------+
| id | name | company_id |
+----+-----------+------------+
| 42 | luke | 12 |
+----+-----------+------------+
| 43 | other guy | 12 |
+----+-----------+------------+
ect.
So my Question is wich is the Best way of handling lots of releationships?
Are there other ways to do it?
What are disadvantages / advantages?
Is there a special way how hightraffic sides handle their releationships?
Thanks for your help guys :)
So my Question is wich is the Best way of handling lots of releationships?
The third one or the variation of it (see below).
Every "M:N" relationship should be represented by its own junction table. OTOH, a "1:N" relationship doesn't need additional table - just a proper foreign key in the table on the side of the "N".
If I understand your description correctly, the third option models a 1:N relationship between company and person. If by any chance you wanted to model a M:N relationship between them, you'd have a junction table: company_person ( company_id, person_id, PK (company_id, person_id) ).
Are there other ways to do it?
Sometimes, inheritance (aka. category, subtype, generalization hierarchy etc.) can be used to lower the number of possible "relatable" combinations. In a nutshell, make a relationship to a parent, then every child inherited from that parent is automatically involved in that relationship.
For an example, take a look at this post.
What are disadvantages / advantages?
Enforcing constraints (including FKs) declaratively is better (less prone to errors and probably more performant) than enforcing them through triggers, which is again better than enforcing them in the client code.
Choose a design that better adheres to that principle. For example, your options 1 and 2 don't allow the DBMS to enforce FKs declaratively.
Is there a special way how hightraffic sides handle their releationships?
Good logical design followed by good physical implementation is the only solid basis for good performance. It's hard to "bolt-on" the performance on top of a bad design.
Perhaps, you'd like to take a look at:
ERwin Methods Guide
Use The Index, Luke!
And when it comes to performance, don't guess! Measure on realistic amounts of data.

How can I save semantic information in a MySQL table?

I wish to save some semantic information about data in a table. How can I save this information in MySQL, such that I can access data and also search for the articles using the semantic data.
For example, I have a article about Apple and Microsoft. The semantic data will be like
Person : Steve Jobs
Person : Steve Ballmer
Company : Apple
Company : Microsoft
I want to save the information without losing the info that Steve Jobs and Steve Ballmer are persons and Apple and Microsoft are companies. I also want to search for articles about Steve Jobs / Apple.
Person and Company are not the only possible types, hence adding new fields is not viable. Since the type of the data is to be saved, I cannot use FullText field type directly.
Update - These are two options that I am considering.
Save the data in a full text column as serialized php array.
Create another table with 3 columns
--
--------------------------------
| id | subject | object |
--------------------------------
| 1 | Person | Steve Ballmer |
| 1 | Person | Steve Jobs |
| 1 | Company | Microsoft |
| 1 | Company | Apple |
| 2 | Person | Obama |
| 2 | Country | US |
--------------------------------
You're working on a hard and interesting problem! You may get some interesting ideas from looking at the Dublin Core Metadata Initiative.
http://dublincore.org/metadata-basics/
To make it simple, think of your metadata items as all fitting in one table.
e.g.
Ballmer employed-by Microsoft
Ballmer is-a Person
Microsoft is-a Organization
Microsoft run-by Ballmer
SoftImage acquired-by Microsoft
SoftImage is-a Organization
Joel Spolsky is-a Person
Joel Spolsky formerly-employed-by Microsoft
Spolsky, Joel dreamed-up StackOverflow
StackOverflow is-a Website
Socrates is-a Person
Socrates died-on (some date)
The trick here is that some, but not all, your first and third column values need to be BOTH arbitrary text AND serve as indexes into the first and third columns. Then, if you're trying to figure out what your data base has on Spolsky, you can full-text search your first and third columns for his name. You'll get out a bunch of triplets. The values you find will tell you a lot. If you want to know more, you can search again.
To pull this off you'll probably need to have five columns, as follows:
Full text subject (whatever your user puts in)
Canonical subject (what your user puts in, massaged into a standard form)
Relation (is-a etc)
Full text object
Canonical object
The point of the canonical forms of your subject and object is to allow queries like this to work, even if your user puts in "Joel Spolsky" and "Spolsky, Joel" in two different places even if they mean the same person.
SELECT *
FROM relationships a
JOIN relationships b (ON a.canonical_object = b.canonical_subject)
WHERE MATCH (subject,object) AGAINST ('Spolsky')
You might want to normalize your data table by making 2 tables.
----------------
| id | subject |
----------------
| 1 | Person |
| 2 | Company |
| 3 | Country |
----------------
-----------------------------------
| id | subject-id | object |
-----------------------------------
| 1 | 1 | Steve Ballmer |
| 2 | 1 | Steve Jobs |
| 3 | 2 | Microsoft |
| 4 | 2 | Apple |
| 5 | 1 | Obama |
| 6 | 3 | US |
-----------------------------------
This allows you to more easily see all the different subject types you have defined.