Several separated tables vs one integrated table with an additional column? - mysql

I have 3 tables that all share the same structure:
// table1 // table2 // table3
+----+------+ +----+------+ +----+------+
| id | name | | id | name | | id | name |
+----+------+ +----+------+ +----+------+
| 1 | jack | | 1 | ali | | 1 | peter|
+----+------+ +----+------+ +----+------+
I want to know whether my current structure is better, or whether a single integrated table with one additional column would be. Something like this:
+----+------+-------+
| id | name | which |
+----+------+-------+
| 1 | jack | table1|
| 2 | ali | table2|
| 3 | peter| table3|
+----+------+-------+
Note: with the current structure (several tables), my query looks something like this:
select id, name from table1
union all
select id, name from table2
union all
select id, name from table3
So, is converting those several tables into one table with a new column better or not? (I think the new column adds some overhead; is that true?)

This has practical consequences and also philosophical consequences. From a practical point of view, it's very hard to say without knowing a lot more about how the data is going to be used. What's the read-to-write ratio for this data? How often is data from two or more tables going to be selected in a single query? If you have to do a UNION to gather all the data, it's both slower and more cumbersome.
I prefer the philosophical approach, starting with the subject matter. Is there only one kind of entity here, or are there three different entities that all happen to have the same attributes? That nearly always tells me whether to put them in the same table or not, and most of the time it also turns out to give the right answer to the practical issue.
I will say that I would be looking around for some better name for the values of the extra attribute. "table1", "table2" and "table3" seem terribly opaque to me. The subject matter should provide a clue here as well.
Edit:
Now that I understand the subject matter, I'm going to opine in favor of a single table. It is an opinion rather than a hard and fast rule. So it would be something like this:
+----+-----------+----------+--------------+
| id | word | language |translation |
+----+-----------+----------+--------------+
| 1 | butterfly | Spanish | mariposa |
| 2 | butterfly | French | papillon |
| 3 | butterfly | Italian | farfalla |
| 4 | chair | Spanish | silla |
+----+-----------+----------+--------------+
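To make the trade-off concrete, here is a minimal sketch of the single-table layout from the question. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL; the table name `names` and the column name `source` are assumptions chosen for illustration, not names from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "names" and "source" are illustrative names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " source TEXT NOT NULL)"  # plays the role of the 'which' column
)
conn.executemany(
    "INSERT INTO names (name, source) VALUES (?, ?)",
    [("jack", "table1"), ("ali", "table2"), ("peter", "table3")],
)

# Replaces the three-way UNION ALL with a plain SELECT on one table.
all_rows = conn.execute("SELECT id, name FROM names ORDER BY id").fetchall()

# Replaces "SELECT ... FROM table2": filter on the extra column instead.
only_table2 = conn.execute(
    "SELECT id, name FROM names WHERE source = ? ORDER BY id", ("table2",)
).fetchall()
```

With this layout, reading everything needs no UNION, and reading one former table becomes a WHERE clause on the extra column.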

If you are sure that all three tables will continue to share the same attributes, then a single table is fine; if that may not hold, don't merge them.
This thread may help you more.

Related

MS Access help needed forming a specific report

I have a table with a column for agent names and a column for each of the skills those agents could possibly have. Each skill the agent is assigned shows a 1 in the field under that skill.
Columns look like this:
+---------+----------+----------+----------+
| Name | 'Skill1' | 'Skill2' | 'Skill3' |
+---------+----------+----------+----------+
| John | 1 | | 1 |
| Sam | 1 | 1 | |
| Roberta | 1 | | 1 |
+---------+----------+----------+----------+
I would like to make a query that returns a list of all agent names that have a 1 for each particular skill. The query would return something like this:
+-----------+
| Skill 1 |
+-----------+
| John |
| Sam |
| Roberta |
+-----------+
Additionally I would like to be able to query a single name and retrieve all skills that agent has (all rows the Name column has a 1 in) like this:
+-----------+
| John |
+-----------+
| Skill 1 |
| Skill 3 |
+-----------+
I've done this in Excel using an index but I'm new to Access and not sure how to complete this task.
Thanks in advance.
One of the reasons that you are finding this task difficult is because your database is not normalised and so due to the way that your database is structured, you are working against MS Access, not with it.
Consequently, whilst a solution is still possible with the current data, the resulting queries will be painful to construct and will either be full of multiple messy iif statements, or several union queries performing the same operations over & over again, one for each 'skill'.
Then, if you ever wish to add another Skill to the database, all of your queries have to be rewritten!
Whereas, if your database was normalised (as Gustav has suggested in the comments), the task would be a simple one-liner; and what's more, if you add a new skill later on, your queries will automatically output the results as if the skill had always been there.
Your data has a many-to-many relationship: an agent may have many skills, and a skill may be known by many agents.
As such, the most appropriate way to represent this relationship is using a junction table.
Hence, you would have a table of Agents such as:
tblAgents
+-----+-----------+----------+------------+
| ID | FirstName | LastName | DOB |
+-----+-----------+----------+------------+
| 1 | John | Smith | 1970-01-01 |
| ... | ... | ... | ... |
+-----+-----------+----------+------------+
This would only contain information unique to each agent, i.e. minimising the repeated information between records in the table.
You would then have a table of possible Skills, such as:
tblSkills
+-----+---------+---------------------+
| ID | Name | Description |
+-----+---------+---------------------+
| 1 | Skill 1 | Skill 1 Description |
| 2 | Skill 2 | Skill 2 Description |
| ... | ... | ... |
+-----+---------+---------------------+
Finally, you would have a junction table linking Agents to Skills, e.g.:
tblAgentSkills
+----+----------+----------+
| ID | Agent_ID | Skill_ID |
+----+----------+----------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
| 4 | 3 | 2 |
+----+----------+----------+
Now, say you want to find out which agents have Skill 1, the query is simple:
select Agent_ID from tblAgentSkills where Skill_ID = 1
What if you want to find out the skills known by an agent? Equally as simple:
select Skill_ID from tblAgentSkills where Agent_ID = 1
Of course, these queries will merely return the ID fields as present in the junction table - but since the ID uniquely identifies a record in the tblAgents or tblSkills tables, such ID is all you need to retrieve any other required information:
select
    tblAgents.FirstName,
    tblAgents.LastName
from
    tblAgentSkills inner join tblAgents on
    tblAgentSkills.Agent_ID = tblAgents.ID
where
    tblAgentSkills.Skill_ID = 1
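The junction-table design above can be tried end to end with a short sketch. It uses Python's sqlite3 module with an in-memory database as a stand-in for Access. The junction rows are the ones shown in tblAgentSkills; the last names for Sam and Roberta are invented, and the DOB and Description columns are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblAgents (ID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE tblSkills (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE tblAgentSkills (
    ID INTEGER PRIMARY KEY,
    Agent_ID INTEGER REFERENCES tblAgents(ID),
    Skill_ID INTEGER REFERENCES tblSkills(ID)
);
-- Last names for agents 2 and 3 are made up for this example.
INSERT INTO tblAgents VALUES (1,'John','Smith'), (2,'Sam','Jones'), (3,'Roberta','Brown');
INSERT INTO tblSkills VALUES (1,'Skill 1'), (2,'Skill 2');
INSERT INTO tblAgentSkills VALUES (1,1,1), (2,1,2), (3,2,1), (4,3,2);
""")

def agents_with_skill(skill_id):
    """First names of all agents linked to the given skill."""
    rows = conn.execute(
        "SELECT a.FirstName FROM tblAgentSkills AS s"
        " INNER JOIN tblAgents AS a ON s.Agent_ID = a.ID"
        " WHERE s.Skill_ID = ? ORDER BY a.ID",
        (skill_id,),
    )
    return [row[0] for row in rows]

def skills_of_agent(agent_id):
    """Names of all skills linked to the given agent."""
    rows = conn.execute(
        "SELECT k.Name FROM tblAgentSkills AS s"
        " INNER JOIN tblSkills AS k ON s.Skill_ID = k.ID"
        " WHERE s.Agent_ID = ? ORDER BY k.ID",
        (agent_id,),
    )
    return [row[0] for row in rows]
```

Adding a third skill later only means inserting rows into tblSkills and tblAgentSkills; neither function needs to change.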
To get all agents with Skill1, open the query designer and create the following query:
This will generate the following SQL:
SELECT Skills.AgentName
FROM Skills
WHERE (((Skills.Skill1)=1));
If you adjust the names you can also paste this query into the sql pane of the designer to get the query you want.
To get all the skills an agent has I chose a parameterized query. Open the query designer and create a new query:
When you run this query it will ask you for the name of the agent. Make sure to type the agent name exactly. Here is the resulting sql:
SELECT Skills.AgentName, Skills.Skill1, Skills.Skill2, Skills.Skill3
FROM Skills
WHERE (((Skills.AgentName)=[Agent]));
If you continue working with this query, I would improve the table design by breaking your table into a skills table, an agents table, and a skills&agents table. Then link the skills and agents tables to the skills&agents table in a many-to-many relationship. The query to get all of an agent's skills would then look like this in the designer:

How to save language skill levels correctly in a database

I think I am facing a problem that many of you have faced before. I have a registration form where a user can pick any language on the planet and then pick their skill level for that language from a select box.
So, for example:
Language1: German
Skill: Fluent
Language2: English
Skill: Basic
I'm wondering what the best way is to store these values in a MySQL database.
I thought of two ways.
First way: creating a column for each language and assigning a skill value to it.
--------------------------------------------------
| UserID | language_en | language_ge |
--------------------------------------------------
| 22 | 1 | 4 |
--------------------------------------------------
| 23 | 3 | 4 |
--------------------------------------------------
So the language is always the column's name and the number represents the skill level (1. Basic, 2. Average ... )
I believe this is a nice way to work with these things, and it is also pretty fast. The problem starts when there are 50 languages or more. It doesn't sound like a good idea to make 50 columns that the script always has to check to see whether a user has any skill in each language.
Second way: inserting an array in one of the table's column. The table will look like this:
----------------------------------
| UserID | languages |
----------------------------------
| 22 | "ge"=>"4", "en"=>"1" |
----------------------------------
This way the user with ID 22 has skill level 4 for German and skill level 1 for English. This is fine because we don't need to check 50 additional columns (or even more), but it still doesn't seem like the right way to me.
We would have to parse a lot of results to find, for example, a user who has level 1 for German and level 2 for Spanish, regardless of their English skill level. That takes the server longer, and once the data grows we are in trouble.
I bet many of you have experienced this kind of issue. Please, can someone advise me how to sort this out?
Thanks a lot.
I'd advise you to have a separate table with all the languages:
Table: Language
+------------+-------------------+--------------+
| LanguageID | LanguageNameShort | LanguageName |
+------------+-------------------+--------------+
| 1 | en | English |
| 2 | de | German |
+------------+-------------------+--------------+
And another table to link the users to the languages:
Table: LanguageLink
+--------+------------+--------------+
| UserID | LanguageID | SkillLevelID |
+--------+------------+--------------+
| 22 | 1 | 1 |
| 22 | 2 | 4 |
| 23 | 1 | 3 |
| 23 | 2 | 4 |
+--------+------------+--------------+
This is the normalised way to represent that kind of relationship in a DB. All data is easily searchable and you don't have to change the DB schema if you add a language.
To render a user's languages you could use a query like this. It will give you one row per language a user speaks:
SELECT
LanguageLink.UserID,
LanguageLink.SkillLevelID,
Language.LanguageNameShort
FROM
LanguageLink,
Language
WHERE
LanguageLink.UserID = 22
AND LanguageLink.LanguageID = Language.LanguageID
If you want to go further, you could create another table for the skill level:
Table: Skill
+--------------+-----------+
| SkillLevelID | SkillName |
+--------------+-----------+
| 1 | bad |
| 2 | mediocre |
| 3 | good |
| 4 | perfect |
+--------------+-----------+
What I've done here is called database normalization. I'd recommend reading about it; it may help you design future databases.
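The question's hard case (find users with level 1 in German and level 2 in Spanish, ignoring English) becomes a single grouped query on the normalised tables. The sketch below is one way to write it, using Python's sqlite3 module as a stand-in for MySQL. The Spanish row, users 24 and 25, the composite primary key, and the helper name `users_with_levels` are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Language (
    LanguageID INTEGER PRIMARY KEY,
    LanguageNameShort TEXT UNIQUE,
    LanguageName TEXT
);
CREATE TABLE LanguageLink (
    UserID INTEGER,
    LanguageID INTEGER REFERENCES Language(LanguageID),
    SkillLevelID INTEGER,
    PRIMARY KEY (UserID, LanguageID)  -- one level per user and language
);
INSERT INTO Language VALUES (1,'en','English'), (2,'de','German'), (3,'es','Spanish');
INSERT INTO LanguageLink VALUES
    (22,1,1), (22,2,4), (23,1,3), (23,2,4),   -- rows from the answer
    (24,1,3), (24,2,1), (24,3,2), (25,2,1);   -- hypothetical extra users
""")

def users_with_levels(required):
    """Return IDs of users who have every (language code -> level) pair.

    `required` must be a non-empty dict such as {"de": 1, "es": 2}.
    """
    conditions = " OR ".join(
        "(l.LanguageNameShort = ? AND ll.SkillLevelID = ?)" for _ in required
    )
    params = [value for pair in required.items() for value in pair]
    sql = (
        "SELECT ll.UserID FROM LanguageLink AS ll"
        " JOIN Language AS l ON l.LanguageID = ll.LanguageID"
        f" WHERE {conditions}"
        " GROUP BY ll.UserID HAVING COUNT(*) = ?"  # all pairs must match
        " ORDER BY ll.UserID"
    )
    return [row[0] for row in conn.execute(sql, params + [len(required)])]

matches = users_with_levels({"de": 1, "es": 2})
```

The HAVING clause relies on the primary key: because a user has at most one row per language, a match count equal to the number of requirements means every requirement was met.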

Sum query for MySQL where field contain certain values

I need help with a query. I have a table like this:
| ID | codehwos |
| --- | ----------- |
| 1 | 16,17,15,26 |
| 2 | 15,32,12,23 |
| 3 | 53,15,21,26 |
I need an output like this:
| codehwos | number_of_this_code |
| -------- | ---------------------- |
| 15 | 3 |
| 17 | 1 |
| 26 | 2 |
I want to count how many rows each code appears in.
Can anyone write a query that does this for all the codes at once?
Thanks
You have a very poor data format. You should not store lists in strings and never store lists of numbers in strings. SQL has a great data structure for storing lists. Hint: it is called a "table" not a "string".
That said, sometimes one is stuck with other people's really poor design choices. We wouldn't make them ourselves, but we still need to get something done. Assuming you have a list of codes, you can do what you want with:
select c.code, count(*)
from codes c join
table t
on find_in_set(c.code, t.codehwos) > 0
group by c.code;
If you have any influence over the data structure, then advocate for a junction table, the right way to store this data in a relational database.
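As a rough illustration of the junction-table route, the sketch below splits the comma-separated strings once and stores one row per (row, code) pair, after which the count is a plain GROUP BY. It uses Python's sqlite3 module as a stand-in for MySQL; the table names `raw` and `row_codes` are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The original, denormalised table ("raw" is a made-up name).
conn.execute("CREATE TABLE raw (ID INTEGER PRIMARY KEY, codehwos TEXT)")
conn.executemany(
    "INSERT INTO raw VALUES (?, ?)",
    [(1, "16,17,15,26"), (2, "15,32,12,23"), (3, "53,15,21,26")],
)

# Junction table: one row per code used by a row of the original table.
conn.execute(
    "CREATE TABLE row_codes ("
    " row_id INTEGER REFERENCES raw(ID),"
    " code INTEGER,"
    " PRIMARY KEY (row_id, code))"
)

# One-off migration: split each string and insert the individual codes.
for row_id, csv in conn.execute("SELECT ID, codehwos FROM raw").fetchall():
    conn.executemany(
        "INSERT INTO row_codes VALUES (?, ?)",
        [(row_id, int(code)) for code in csv.split(",")],
    )

# Counting is now an ordinary aggregate with no string matching.
counts = dict(conn.execute("SELECT code, COUNT(*) FROM row_codes GROUP BY code"))
```

Here `counts` maps each code to the number of rows that use it, so `counts[15]` corresponds to the `number_of_this_code` value for code 15 in the question.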

Left join table until no parent and table structure

Following the table in the link below, I have a table "category" and another table named "package" that stores a category id.
http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/hierarchical-data.html
Category
+-------------+----------------------+--------+
| category_id | name | parent |
+-------------+----------------------+--------+
| 1 | ELECTRONICS | NULL |
| 2 | TELEVISIONS | 1 |
| 3 | TUBE | 2 |
| 4 | LCD | 2 |
| 5 | PLASMA | 2 |
| 6 | PORTABLE ELECTRONICS | 1 |
| 7 | MP3 PLAYERS | 6 |
| 8 | FLASH | 7 |
| 9 | CD PLAYERS | 6 |
| 10 | 2 WAY RADIOS | 6 |
+-------------+----------------------+--------+
Is there any way I can left join until there is no parent left, without knowing in advance how many times I have to join?
And a second question: my table "package" stores only the last/smallest category id, for example "7 - FLASH". Is it good practice to store only the last/smallest category id and look up the rest by joining the table? Will querying it back every time make the database slow?
Thanks in advance!
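It is not possible to do such queries in MySQL versions before 8.0, which have no recursive queries (MySQL 8.0 added recursive common table expressions via WITH RECURSIVE).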
If you need to keep this database structure, then the fastest approach is likely to select the relevant data from the table and then process the data client-side into the appropriate array/join.
The above may not work well if you cannot sufficiently narrow down the number of rows to SELECT out, in which case, recursively running multiple queries may be faster. On your second query, the best approach is to do something like WHERE id IN (list_of_parent_values) rather than 1 query per parent.
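The client-side approach can be sketched as follows: load the category rows with one SELECT, then follow the `parent` links in application code until a NULL parent is reached. This uses Python's sqlite3 module as a stand-in for MySQL, and the helper name `path_to_root` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT, parent INTEGER)"
)
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [
        (1, "ELECTRONICS", None), (2, "TELEVISIONS", 1), (3, "TUBE", 2),
        (4, "LCD", 2), (5, "PLASMA", 2), (6, "PORTABLE ELECTRONICS", 1),
        (7, "MP3 PLAYERS", 6), (8, "FLASH", 7), (9, "CD PLAYERS", 6),
        (10, "2 WAY RADIOS", 6),
    ],
)

# One query fetches the whole (small) table; the walk happens in memory.
nodes = {
    cid: (name, parent)
    for cid, name, parent in conn.execute(
        "SELECT category_id, name, parent FROM category"
    )
}

def path_to_root(category_id):
    """Names from the given category up to the root, leaf first."""
    path = []
    seen = set()  # guards against accidental cycles in the data
    while category_id is not None and category_id not in seen:
        seen.add(category_id)
        name, parent = nodes[category_id]
        path.append(name)
        category_id = parent
    return path

flash_path = path_to_root(8)
```

This only works well when the table is small enough to load in full; for larger trees the batched `WHERE id IN (...)` approach described above limits how much is fetched per round trip.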
Lastly, if you can change your data structure, there is a way of using special tree column values to efficiently select all of the nodes with a single SQL query. However, some more work is required to insert into and re-organise the tree.
There are a number of slightly differing implementations of this, see here for one such discussion:
http://web.archive.org/web/20110606032941/http://dev.mysql.com/tech-resources/articles/hierarchical-data.html
awesome_nested_set is also a ruby implementation of this pattern:
https://github.com/collectiveidea/awesome_nested_set
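To show what such tree column values can look like, here is a small sketch of the nested set pattern applied to the category data from the question. Each node stores a left and right bound; the `lft`/`rgt` numbers below come from a depth-first numbering of that tree and, like the table name `nested_category`, are assumptions of this example. It uses Python's sqlite3 module as a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nested_category ("
    " category_id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
)
# lft/rgt come from walking the tree depth-first and numbering each
# node once on the way down (lft) and once on the way back up (rgt).
conn.executemany(
    "INSERT INTO nested_category VALUES (?, ?, ?, ?)",
    [
        (1, "ELECTRONICS", 1, 20), (2, "TELEVISIONS", 2, 9), (3, "TUBE", 3, 4),
        (4, "LCD", 5, 6), (5, "PLASMA", 7, 8), (6, "PORTABLE ELECTRONICS", 10, 19),
        (7, "MP3 PLAYERS", 11, 14), (8, "FLASH", 12, 13), (9, "CD PLAYERS", 15, 16),
        (10, "2 WAY RADIOS", 17, 18),
    ],
)

def ancestors(name):
    """Path from the root down to the named node, in one query."""
    rows = conn.execute(
        "SELECT parent.name FROM nested_category AS node"
        " JOIN nested_category AS parent"
        "   ON node.lft BETWEEN parent.lft AND parent.rgt"
        " WHERE node.name = ? ORDER BY parent.lft",
        (name,),
    )
    return [row[0] for row in rows]

def subtree(name):
    """The named node and all of its descendants, in one query."""
    rows = conn.execute(
        "SELECT node.name FROM nested_category AS node"
        " JOIN nested_category AS parent"
        "   ON node.lft BETWEEN parent.lft AND parent.rgt"
        " WHERE parent.name = ? ORDER BY node.lft",
        (name,),
    )
    return [row[0] for row in rows]
```

Reads need no recursion at all, but every insert or move has to renumber the bounds of the affected nodes, which is the extra work mentioned above.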

How to make a field of a DataBase as array?

I want to have a column in a database that can contain multiple entries. Is it possible to define the type of the column as an array (a fixed-size array or some dynamic collection) so that it can store multiple entries?
If you require various values to be stored together, in a single field, then you will likely be best off storing them as a delimiter-separated string of values:
+----------------------------------+
| PRODUCTS |
+----------+-----------------------+
| Product | Colors |
+----------+-----------------------+
| Notebook | blue,red,green,orange |
+----------+-----------------------+
This is usually not what you want, though. Generally speaking, the ideal solution is to create relationships between tables. For instance:
+---------------+
| PRODUCT       |
+----+----------+
| ID | Product  |
+----+----------+
| 1  | Notebook |
+----+----------+
+---------------+
| COLORS        |
+----+----------+
| ID | Color    |
+----+----------+
| 1  | Blue     |
| 2  | Red      |
| 3  | Green    |
+----+----------+
+---------------------+
| PRODUCTCOLORS       |
+-----------+---------+
| ProductID | ColorID |
+-----------+---------+
| 1         | 1       |  Notebook, Blue
| 1         | 3       |  Notebook, Green
+-----------+---------+
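A short sketch of the three tables above shows what the relational layout buys over a delimited string: the colors of a product come back from a join, and the database itself can reject a link to a color that does not exist. It uses Python's sqlite3 module as a stand-in for whichever database is in use; the foreign key declarations and the `PRAGMA foreign_keys` switch (SQLite-specific) are additions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE PRODUCT (ID INTEGER PRIMARY KEY, Product TEXT);
CREATE TABLE COLORS (ID INTEGER PRIMARY KEY, Color TEXT);
CREATE TABLE PRODUCTCOLORS (
    ProductID INTEGER NOT NULL REFERENCES PRODUCT(ID),
    ColorID INTEGER NOT NULL REFERENCES COLORS(ID),
    PRIMARY KEY (ProductID, ColorID)
);
INSERT INTO PRODUCT VALUES (1, 'Notebook');
INSERT INTO COLORS VALUES (1, 'Blue'), (2, 'Red'), (3, 'Green');
INSERT INTO PRODUCTCOLORS VALUES (1, 1), (1, 3);
""")

# All colors of one product, resolved through the link table.
notebook_colors = [
    row[0]
    for row in conn.execute(
        "SELECT c.Color FROM PRODUCTCOLORS AS pc"
        " JOIN COLORS AS c ON c.ID = pc.ColorID"
        " JOIN PRODUCT AS p ON p.ID = pc.ProductID"
        " WHERE p.Product = ? ORDER BY c.ID",
        ("Notebook",),
    )
]

# Linking to a color ID that is not in COLORS is refused by the database.
try:
    conn.execute("INSERT INTO PRODUCTCOLORS VALUES (1, 99)")
    bad_link_rejected = False
except sqlite3.IntegrityError:
    bad_link_rejected = True
```

A delimited string such as `blue,red,green,orange` offers neither property: nothing stops a misspelled color from being stored, and finding every product with a given color requires string matching.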
Yes. In a typical relational design, you would have a 1:N (one-to-many) relationship between one table and another. Each row in the first table represents a collection; each row in the second table is an element of a collection and references the first table.
A comma-separated list, a serialized value, or a URL-encoded string is also a good solution, as the other answers point out.
No, but what server side language are you using?
If using PHP you can use
$serializedArray = serialize($myArray);
And then insert that value into the db. To get it back out use unserialize();
This is pretty much the same answer as above (have a delimited string), but you could also save the text in that column as XML. Depending on the database you are using, that could be easy or tedious.
As pointed out above, you obviously lose the ability to (easily) manage data integrity from your DB layer.