mysql select from 2 other columns in the same table

I have a table which looks like this but much longer...
| CategoryID | Category | ParentCategoryID |
+------------+----------+------------------+
| 23 | Screws | 3 |
| 3 | Packs | 0 |
I am aiming to retrieve one column from this which in this instance would give me the following...
| Category |
+--------------+
| Packs/Screws |
Please excuse me for not knowing exactly how to word this. So far I can only think to split the whole table into multiple tables and use LEFT JOIN, but this seems like a good learning opportunity.
I realise that CONCAT() will come into play when combining the two retrieved Category names, but beyond that I am stumped.

SELECT CONCAT(y.category,'/',x.category) Category
FROM my_table x
JOIN my_table y
ON y.categoryid = x.parentcategoryid
[WHERE y.parentcategoryid = 0]
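A minimal sketch of the self-join, run against an in-memory SQLite database from Python so it can be executed without a MySQL server. This is an illustration only: SQLite's `||` operator stands in for MySQL's CONCAT(), and the parent row (alias y) is placed first so the result reads Packs/Screws as requested in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (
        categoryid INTEGER PRIMARY KEY,
        category TEXT,
        parentcategoryid INTEGER
    );
    INSERT INTO my_table VALUES (23, 'Screws', 3), (3, 'Packs', 0);
""")

# x is the child row, y is its parent (joined on the parent id).
# In MySQL the select list would be CONCAT(y.category, '/', x.category).
rows = conn.execute("""
    SELECT y.category || '/' || x.category AS Category
    FROM my_table x
    JOIN my_table y ON y.categoryid = x.parentcategoryid
""").fetchall()

print(rows)
```

Because this is an inner join, top-level rows such as Packs (parent 0) do not appear on their own; only rows that actually have a parent are returned.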

Related

design review, extra selects to gain abstraction in MySQL

I hope that Stack Overflow is the correct place to ask this. I feel a bit on the fence, but I didn't find that it fit better on another Stack Exchange site.
So, the question is pretty much about "best practice" or design in MySQL. I don't see this done a lot in tutorials and resources, which is why I am a bit afraid that it is not a good way to do it, so I thought I'd try to get some feedback.
I tried to make a layout as an example (thanks for commenting)
https://www.db-fiddle.com/f/rBRUhX3DYiTgGyBPSgQfCm/2
I have a layout similar to this:
table: player
+----+------+------+
| id | name | data |
+----+------+------+
| 1 | foo | bar |
| 2 | test | test |
+----+------+------+
Then I have tables to pick specific information
table: user_external_name
+----+----------+
| id | nickname |
+----+----------+
| 1 | baz |
| 2 | qux |
+----+----------+
And I have a third table containing matches between players, something like:
table: matches
+---------+--------+--------+
| matchid | homeid | awayid |
+---------+--------+--------+
| 0 | 1 | 2 |
+---------+--------+--------+
And then I might do queries like this on matches:
SELECT
(SELECT nickname from user_external_name WHERE id = matches.homeid) as home,
(SELECT nickname from user_external_name WHERE id = matches.awayid) as away
FROM matches;
I also realized that I can use joins to write the query, and that way I get rid of the multiple selects. I am still not sure why the design is dumb, but I figured out that what I need to read about is pretty much relational databases. I will leave my original above for reference if someone else comes stumbling down this road.
SELECT
h.nickname home,
a.nickname away
FROM `matches` as m
join user_external_name as h on h.id = m.homeid
join user_external_name as a on a.id = m.awayid;
resulting in:
+------+------+
| home | away |
+------+------+
| baz | qux |
+------+------+
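As a quick check that the two formulations are equivalent on the sample data, here is a small sketch that runs both against an in-memory SQLite database from Python (SQLite is only a stand-in for MySQL here; the table and column names are the ones from the layout above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_external_name (id INTEGER PRIMARY KEY, nickname TEXT);
    CREATE TABLE matches (matchid INTEGER PRIMARY KEY, homeid INTEGER, awayid INTEGER);
    INSERT INTO user_external_name VALUES (1, 'baz'), (2, 'qux');
    INSERT INTO matches VALUES (0, 1, 2);
""")

# Variant 1: one correlated scalar subquery per looked-up column.
subquery_rows = conn.execute("""
    SELECT (SELECT nickname FROM user_external_name WHERE id = m.homeid) AS home,
           (SELECT nickname FROM user_external_name WHERE id = m.awayid) AS away
    FROM matches AS m
""").fetchall()

# Variant 2: join the lookup table twice under different aliases.
join_rows = conn.execute("""
    SELECT h.nickname AS home, a.nickname AS away
    FROM matches AS m
    JOIN user_external_name AS h ON h.id = m.homeid
    JOIN user_external_name AS a ON a.id = m.awayid
""").fetchall()

print(subquery_rows, join_rows)
```

One difference worth knowing: if a match referenced an id with no nickname, the subquery variant would return NULL for that column, while the inner join would drop the whole row (a LEFT JOIN would keep it).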
So the actual question
Is this a reasonable way of doing it, or is it dumb in some way? One of my main arguments is that this way I can reuse the id to get the specific information by id in other tables (i.e. I never have to copy the actual name). Could you point me to a better way of doing this, or some resources/suggestions on how to think in this situation?
Thanks for taking the time to read through and hopefully I can learn something good. :)

MySQL Table structure: Multiple attributes for each item

I wanted to ask what the best approach would be for structuring my MySQL database in the following case.
I've got a table of items, which I don't need to describe, as the only important field here is the ID.
Now, I'd like to be able to assign some attributes to each item - by its ID, of course. But I don't know exactly how to do it, as I'd like to keep it dynamic (so, I do not have to modify the table structure if I want to add a new attribute type).
What I think
I think (and, in fact, this is the structure that I have right now) that I can make a table items_attributes with the following structure:
+----+---------+----------------+-----------------+
| id | item_id | attribute_name | attribute_value |
+----+---------+----------------+-----------------+
| 1 | 1 | place | Barcelona |
| 2 | 2 | author_name | Matt |
| 3 | 1 | author_name | Kate |
| 4 | 1 | pages | 200 |
| 5 | 1 | author_name | John |
+----+---------+----------------+-----------------+
I put data as an example for you to see that those attributes can be repeated (it's not a relation 1 to 1).
The problem with this approach
I need to run some queries, some of them for statistical purposes, and if I have a lot of attributes for a lot of items, this can be a bit slow.
Furthermore, maybe because I'm not an expert on MySQL, every time I want to search for "those items that have 'place' = 'Barcelona' AND 'author_name' = 'John'", I end up having to make a separate JOIN for every condition.
Repeating the example above, my query would end up like:
SELECT *
FROM items its
JOIN items_attributes attr
ON its.id = attr.item_id
AND attr.attribute_name = 'place'
AND attr.attribute_value = 'Barcelona'
AND attr.attribute_name = 'author_name'
AND attr.attribute_value = 'John';
As you can see, this will return nothing, as an attribute_name cannot have two values at once in the same row, and an OR condition is not what I'm after, as the items MUST have both attribute values as stated.
So the only possibility is to JOIN the same table again for every search condition, which I think is very slow when there are a lot of terms to search for.
What I'd like
As I said, I'd like to keep the attribute types dynamic, so that adding a new value in 'attribute_name' is enough, without having to add a new column to a table. Also, as they are a 1-N relationship, they cannot be put in the 'items' table as new columns.
If this structure is, in your opinion, the only one that can achieve what I want, it would be great if you could suggest some ideas so that the search queries are not a ton of JOINs.
I don't know if this is hard to achieve; I've been racking my brain until now and I haven't come up with a solution. Hope you guys can help me with that!
In any case, thank you for your time and attention!
Kind regards.
You're thinking in the right direction, the direction of normalization. The normal form you would like to have in your database is the fifth normal form (or sixth, even). See Stack Overflow on this matter.
Table Attribute:
+----+----------------+
| id | attribute_name |
+----+----------------+
| 1 | place |
| 2 | author name |
| 3 | pages |
+----+----------------+
Table ItemAttribute
+--------+----------------+
| item_id| attribute_id |
+--------+----------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
+--------+----------------+
So for each property of an object (item in this case) you create a new table and name it accordingly. It requires lots of joins, but your database will be highly flexible and organized. Good luck!
In my opinion it should be something like this. I know there are a lot of tables, but it actually normalizes your DB.
Maybe that is because I can't understand where your att_value column comes from and what this column should contain.
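For the "must have both attribute values" search from the question, one commonly used alternative to one JOIN per condition is to filter with OR and then keep only the items that matched every condition, using GROUP BY and HAVING. The sketch below is not taken from either answer; it uses the asker's items_attributes table and sample rows, and runs on an in-memory SQLite database from Python as a stand-in for MySQL (the SQL itself should also be valid MySQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items_attributes (
        id INTEGER PRIMARY KEY,
        item_id INTEGER,
        attribute_name TEXT,
        attribute_value TEXT
    );
    INSERT INTO items_attributes VALUES
        (1, 1, 'place', 'Barcelona'),
        (2, 2, 'author_name', 'Matt'),
        (3, 1, 'author_name', 'Kate'),
        (4, 1, 'pages', '200'),
        (5, 1, 'author_name', 'John');
""")

# Match rows satisfying ANY condition, then keep the items for which
# the number of distinct matched attribute names equals the number of conditions.
rows = conn.execute("""
    SELECT item_id
    FROM items_attributes
    WHERE (attribute_name = 'place' AND attribute_value = 'Barcelona')
       OR (attribute_name = 'author_name' AND attribute_value = 'John')
    GROUP BY item_id
    HAVING COUNT(DISTINCT attribute_name) = 2
""").fetchall()

print(rows)
```

The number in the HAVING clause has to equal the number of search conditions, so the query text stays the same shape no matter how many attributes are searched. Whether this is faster than repeated JOINs depends on the data and indexes, so it is worth checking with EXPLAIN.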

mysql get table based on common column between two tables

While trying to learn SQL I came across "Learn SQL The Hard Way" and started reading it.
Everything was going fine, then I thought, as a way to practice, I would make something like the example given in the book (the example consists of 3 tables, pet, person and person_pet, and the person_pet table 'links' pets to their owners).
I made this:
report table
+----+-------------+
| id | content |
+----+-------------+
| 1 | bank robbery|
| 2 | invalid |
| 3 | cat on tree |
+----+-------------+
note table
+-----------+--------------------+
| report_id | content |
+-----------+--------------------+
| 1 | they had guns |
| 3 | cat was saved |
+-----------+--------------------+
wanted result
+-----------+--------------------+---------------+
| report_id | report_content | report_notes |
+-----------+--------------------+---------------+
| 1 | bank robbery | they had guns |
| 2 | invalid | null or '' |
| 3 | cat on tree | cat was saved |
+-----------+--------------------+---------------+
I tried a few combinations but no success.
My first thought was
SELECT report.id,report.content AS report_content,note.content AS note_content
FROM report,note
WHERE report.id = note.report_id
but this only returns the ones that have a match (it would not return the invalid report).
After this I tried adding IF conditions, but I just made it worse.
My question is, is this something I will figure out after getting past basic SQL,
or can this be done in a simple way?
Anyway, I would appreciate any help; I'm pretty much lost with this.
Thank you.
EDIT: I have looked into related questions but haven't yet found one that solves my problem.
I probably need to look into other statements such as join or something to sort this out.
You need to get to the chapter on OUTER JOINs, specifically a LEFT JOIN:
SELECT report.id,report.content AS report_content,note.content AS note_content
FROM report
LEFT JOIN note ON report.id = note.report_id
Note the ANSI-92 JOIN syntax as opposed to using WHERE x=y
(The older WHERE report.id *= note.report_id outer-join syntax comes from Sybase and SQL Server and is not supported by MySQL, so use the syntax above.)
You are doing a join. The kind of join you have is an inner join, but you want an outer join:
SELECT report.id,report.content AS report_content,note.content AS note_content
FROM report
LEFT JOIN note on report.id = note.report_id
Note that the LEFT table (report) is the one whose rows are all kept; where there is no match, the columns from note are filled with NULL.
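To see the LEFT JOIN behaviour on the sample data from the question, here is a small sketch using an in-memory SQLite database from Python (a stand-in for MySQL; the LEFT JOIN syntax is the same). The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE report (id INTEGER PRIMARY KEY, content TEXT);
    CREATE TABLE note (report_id INTEGER, content TEXT);
    INSERT INTO report VALUES (1, 'bank robbery'), (2, 'invalid'), (3, 'cat on tree');
    INSERT INTO note VALUES (1, 'they had guns'), (3, 'cat was saved');
""")

# Every report row is kept; note columns are NULL (None in Python) when no note exists.
rows = conn.execute("""
    SELECT report.id, report.content AS report_content, note.content AS note_content
    FROM report
    LEFT JOIN note ON report.id = note.report_id
    ORDER BY report.id
""").fetchall()

for row in rows:
    print(row)
```

Report 2 comes back with None in the note column, which is the "null or ''" row from the wanted result.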

MySQL Query to Match Unrelated Terms

I'm trying to construct a query that's driving me crazy. I had no idea where to start with solving it, but after searching around a bit I started playing with subqueries. Now I'm at the point where I'm not sure if that will solve my issue or, if it will, how to create one that does what I want.
Here's a very simplistic view of my current table (call it tbl_1):
---------------------------------
| row | name | other_names |
|-------------------------------|
| 1 | A | B, C |
| 2 | B | C |
| 3 | A | C |
| 4 | D | E |
| 5 | C | A, B |
---------------------------------
Some of the items I'm working with have multiple names (brand names, names in other countries, code names, etc.), but ultimately all of those different names refer to the same item. I originally was running a search query along the lines of:
SELECT * FROM tbl_1
WHERE name LIKE '%A%'
OR other_names LIKE '%A%';
Which would return rows 1 and 3. However, I quickly realized that my query should also return row 2, as A = B = C. How would I go about doing something like that? I'm open to alternative suggestions outside of a fancy query, such as constructing another table that somehow combines all the names into one row, but I figure something like that would be error prone or inefficient.
Additionally, I'm running MySQL 5.5.23 using InnoDB with other code written in PHP and Python.
Thanks!
Update 5/26/12:
I went back to my original thinking of using a subquery, but right when I thought I was getting somewhere I ran into a documented MySQL issue where the query is evaluated from the outside in and my subquery will be evaluated for every row and won't finish in a realistic amount of time. Here's what I was attempting to do:
SELECT * FROM tbl_1
WHERE name = ANY
(SELECT name FROM tbl_1 WHERE other_names LIKE '%A%' or name LIKE '%A%')
OR other_names = ANY
(SELECT name FROM tbl_1 WHERE other_names LIKE '%A%' or name LIKE '%A%')
Which returns what I want using the example table, but the aforementioned MySQL issue/bug causes the subquery to be considered a dependent query rather than an independent one. As a result, I haven't been able to test the query on my real table (~250,000 rows) as it eventually times out.
I've read that the main workaround for the issue is to use joins rather than subqueries, but I'm not sure how I would apply that to what I'm trying to do. The more I think about it, I might be better off running the subqueries independently using PHP/Python and using the resulting arrays to craft the main query that I want. However, I still think there is the potential to miss some results because the terms in the columns aren't nearly as nice as my example (some of the terms are multiple words, some have parentheses, the other names aren't necessarily comma-separated, etc).
Alternatively, I'm thinking about constructing a separate table that will build the necessary links, something like:
| 1 | A | B, C|
| 2 | B | C, A|
| 3 | C | A, B|
but I think that's a lot easier said than done considering the data I'm working with and the non-standardized format in which it exists.
The route that I'm strongly considering at this point is to build a separate table with the links that are easily constructed (i.e. 1:1 ratio for name:other_names) so I don't have to deal with the formatting issues that exist in the other_names column. I may also eliminate/limit the use of LIKE and require users to know at least one exact name in order to simplify the results and probably increase the overall performance.
In conclusion, I hate working with input data that I have no control over.
Stumbled on this question by accident, so I don't know if my suggestion is relevant, but this looks like a good use case for something like a "union-find".
The SELECT would be extremely easy and fast.
But the insert & update is relatively complex and you will probably need an in-code loop (while updated rows > 0)... and several database calls.
Example for the table:
---------------------------
| row | name | group |
|-------------------------|
| 1 | A | 1 |
| 2 | B | 1 |
| 4 | C | 1 |
| 5 | D | 2 |
| 6 | X | 1 |
| 7 | Z | 2 |
---------------------------
selecting:
SELECT name FROM tbl WHERE `group` = (SELECT `group` FROM tbl WHERE name LIKE '%A%')
inserting relation K = T: (pseudo-code-ish..)
SELECT `group` AS gk FROM tbl WHERE name = 'K';
SELECT `group` AS gt FROM tbl WHERE name = 'T';
if (gk empty result) and (gt empty result) insert both with new group
---------------------------
| row | name | group |
|-------------------------|
| 1 | A | 1 |
| 2 | B | 1 |
| 4 | C | 1 |
| 5 | D | 2 |
| 6 | X | 1 |
| 7 | Z | 2 |
| 8 | K | 3 |
| 9 | T | 3 |
---------------------------
if (gk empty result) and (gt NOT empty result) insert K with group = gt
---------------------------
| row | name | group |
|-------------------------|
| 1 | A | 1 |
| 2 | B | 1 |
| 4 | C | 1 |
| 5 | D | 2 |
| 6 | X | 1 |
| 7 | Z | 2 |
| 8 | K | 2 |
| 9 | T | 2 |
---------------------------
(the same in the other case)
and when both are not empty, update one group to be the other
UPDATE tbl SET `group` = gt WHERE `group` = gk
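The insert logic above can be sketched in Python roughly as follows. This is an illustration under some assumptions, not the answerer's code: it uses an in-memory SQLite database instead of MySQL, names the column grp to avoid quoting the reserved word group, and the helper names group_of, link and aliases are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (name TEXT PRIMARY KEY, grp INTEGER)")

def group_of(name):
    """Return the group id of a name, or None if the name is unknown."""
    row = conn.execute("SELECT grp FROM tbl WHERE name = ?", (name,)).fetchone()
    return row[0] if row else None

def link(k, t):
    """Record that names k and t refer to the same item."""
    gk, gt = group_of(k), group_of(t)
    if gk is None and gt is None:
        # Neither name is known: put both into a fresh group.
        new = conn.execute("SELECT COALESCE(MAX(grp), 0) + 1 FROM tbl").fetchone()[0]
        conn.executemany("INSERT INTO tbl VALUES (?, ?)", [(n, new) for n in {k, t}])
    elif gk is None:
        conn.execute("INSERT INTO tbl VALUES (?, ?)", (k, gt))
    elif gt is None:
        conn.execute("INSERT INTO tbl VALUES (?, ?)", (t, gk))
    elif gk != gt:
        # Both known but in different groups: merge group gk into gt.
        conn.execute("UPDATE tbl SET grp = ? WHERE grp = ?", (gt, gk))

def aliases(name):
    """Return every name in the same group as the given exact name."""
    rows = conn.execute(
        "SELECT name FROM tbl WHERE grp = (SELECT grp FROM tbl WHERE name = ?) ORDER BY name",
        (name,),
    ).fetchall()
    return [r[0] for r in rows]

link("A", "B")
link("D", "E")
link("B", "C")
print(aliases("C"))
```

With this layout, merging two groups is a single UPDATE, so no loop is needed for the merge itself; the lookup uses an exact name rather than LIKE, which keeps the scalar subquery to at most one row.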
I can't think of a query, that supports unlimited depth of name identity. But if you could work with a limited number of "recursions", you might consider using a query similar to this, starting with the query you provided, you retrieve all rows with name identities:
SELECT a.* FROM tbl_1 a
WHERE a.name='A'
OR a.other_names LIKE '%A%'
UNION
SELECT b.* FROM tbl_1 a
JOIN tbl_1 b ON a.other_names LIKE CONCAT('%', b.name, '%') OR b.other_names LIKE CONCAT('%', a.name, '%')
WHERE a.name='A'
OR a.other_names LIKE '%A%';
This query would return row 2, but it wouldn't return any additional rows having "B" as "other_name" in your example. So you would have to union another query:
SELECT a.* FROM tbl_1 a
WHERE a.name='A'
OR a.other_names LIKE '%A%'
UNION
SELECT b.* FROM tbl_1 a
JOIN tbl_1 b ON a.other_names LIKE CONCAT('%', b.name, '%') OR b.other_names LIKE CONCAT('%', a.name, '%')
WHERE a.name='A'
OR a.other_names LIKE '%A%'
UNION
SELECT c.* FROM tbl_1 a
JOIN tbl_1 b ON (a.other_names LIKE CONCAT('%', b.name, '%') OR b.other_names LIKE CONCAT('%', a.name, '%'))
JOIN tbl_1 c ON (b.other_names LIKE CONCAT('%', c.name, '%') OR c.other_names LIKE CONCAT('%', b.name, '%'))
WHERE a.name='A'
OR a.other_names LIKE '%A%';
As you can see, the query grows rapidly and gets slower with increasing depth, and it also isn't what I would call beautiful. But it might fit your needs. I'm not very experienced with MySQL functions, but I guess you could use those to create a more elegant solution that also works with unlimited depth. You might also consider solving the problem programmatically with Python.
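A rough sketch of what the programmatic route could look like in Python: keep expanding the set of known names until nothing new is found, then return every row that mentions any of them. It works on the example rows in memory and assumes other_names is a clean comma-separated list, which the asker says is not true of the real data; the function names are made up for this example.

```python
# Example rows from the question: (row, name, other_names).
ROWS = [
    (1, "A", "B, C"),
    (2, "B", "C"),
    (3, "A", "C"),
    (4, "D", "E"),
    (5, "C", "A, B"),
]

def names_in(row):
    """All names a row mentions: its name plus the comma-separated other names."""
    _, name, other = row
    return {name} | {n.strip() for n in other.split(",") if n.strip()}

def related_rows(term):
    """Row numbers reachable from term through any chain of shared names."""
    known = {term}
    changed = True
    while changed:  # repeat until a full pass adds no new names
        changed = False
        for row in ROWS:
            names = names_in(row)
            if names & known and not names <= known:
                known |= names
                changed = True
    return [row[0] for row in ROWS if names_in(row) & known]

print(related_rows("A"))
```

Unlike the UNION query, the loop has no fixed depth limit; it stops as soon as a pass over the rows discovers no additional names.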

Is this good Database Normalization?

I am a beginner at using MySQL and I am trying to learn the best practices. I have set up a structure similar to the one below.
(main table that contains all unique entries) TABLE = 'main_content'
+------------+---------------+------------------------------+-----------+
| content_id | (deleted) | title | member_id |
+------------+---------------+------------------------------+-----------+
| 6 | | This is a very spe?cal t|_st | 1 |
+------------+---------------+------------------------------+-----------+
(Provides the total of each difficulty and joins id --> actual name) TABLE = 'difficulty'
+---------------+-------------------+------------------+
| difficulty_id | difficulty_name | difficulty_total |
+---------------+-------------------+------------------+
| 1 | Absolute Beginner | 1 |
| 2 | Beginner | 1 |
| 3 | Intermediate | 0 |
| 4 | Advanced | 0 |
| 5 | Expert | 0 |
+---------------+-------------------+------------------+
(This table ensures that multiple values can be inserted for each entry. For example,
this specific entry indicates that there are 2 difficulties associated with the submission)
TABLE = 'lookup_difficulty'
+------------+---------------+
| content_id | difficulty_id |
+------------+---------------+
| 6 | 1 |
| 6 | 2 |
+------------+---------------+
I am joining all of this into a readable query:
SELECT group_concat(difficulty.difficulty_name) as difficulty, member.member_name
FROM main_content
INNER JOIN difficulty ON difficulty.difficulty_id
IN (SELECT difficulty_id FROM main_content, lookup_difficulty WHERE lookup_difficulty.content_id = main_content.content_id )
INNER JOIN member ON member.member_id = main_content.member_id
The above works fine, but I am wondering if this is good practice. I practically followed the structure laid out in Wikipedia's Database Normalization example.
When I run the above query using EXPLAIN, it says: 'Using where; Using join buffer' and also that I am using 2 DEPENDENT SUBQUERY(s). I don't see any way to NOT use sub-queries to achieve the same effect, but then again I'm a noob so perhaps there is a better way....
The DB design looks fine - regarding your query, you could rewrite it exclusively with joins like:
SELECT group_concat(difficulty.difficulty_name) as difficulty, member.member_name
FROM main_content
INNER JOIN lookup_difficulty ON main_content.content_id = lookup_difficulty.content_id
INNER JOIN difficulty ON difficulty.difficulty_id = lookup_difficulty.difficulty_id
INNER JOIN member ON member.member_id = main_content.member_id
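Here is a small sketch of the join-only version running on the sample rows, using an in-memory SQLite database from Python as a stand-in for MySQL. The member table is not shown in the question, so its layout and the name 'alice' are assumptions, as is the plain title text; a GROUP BY is added so that each content row gets its own concatenated list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_content (content_id INTEGER PRIMARY KEY, title TEXT, member_id INTEGER);
    CREATE TABLE difficulty (difficulty_id INTEGER PRIMARY KEY, difficulty_name TEXT);
    CREATE TABLE lookup_difficulty (content_id INTEGER, difficulty_id INTEGER);
    -- Hypothetical member table; only member_id and member_name appear in the question.
    CREATE TABLE member (member_id INTEGER PRIMARY KEY, member_name TEXT);

    INSERT INTO main_content VALUES (6, 'example title', 1);
    INSERT INTO difficulty VALUES
        (1, 'Absolute Beginner'), (2, 'Beginner'), (3, 'Intermediate'),
        (4, 'Advanced'), (5, 'Expert');
    INSERT INTO lookup_difficulty VALUES (6, 1), (6, 2);
    INSERT INTO member VALUES (1, 'alice');
""")

# Walk content -> lookup -> difficulty with plain joins, one output row per content item.
rows = conn.execute("""
    SELECT group_concat(d.difficulty_name) AS difficulty, m.member_name
    FROM main_content AS c
    JOIN lookup_difficulty AS l ON l.content_id = c.content_id
    JOIN difficulty AS d ON d.difficulty_id = l.difficulty_id
    JOIN member AS m ON m.member_id = c.member_id
    GROUP BY c.content_id, m.member_name
""").fetchall()

print(rows)
```

The order of names inside the concatenated string is not guaranteed here; in MySQL, GROUP_CONCAT accepts an ORDER BY inside the parentheses if a stable order matters.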
If the lookup_difficulty provides a link between content and difficulty I would suggest you take out the difficulty_id column from your main_content table. Since you can have multiple lookups for each content_id, you would need some extra business logic to determine which difficulty_id to put in your main_content table (or multiple entries in the main_content table for each difficulty_id, but that goes against normalization practices). For ex. the biggest value / smallest value / random value. In either case, it does not make much sense.
Other than that the table looks fine.
Update
Saw you updated the table :)
Just as a side-note. Using IN can slow down your query (IN can cause a table-scan). In any case, it used to be that way, but I'm sure that these days the SQL compiler optimizes it pretty well.