Getting parent/child/subchild relation in MySQL

I have a single table 'tags' with the following fields (id, parent_id, name). Now I've set a limit of 3 levels in the hierarchy, i.e.: parent > child > subchild. A subchild cannot have a further child. So I want a query to retrieve records such as:
Parent-data
(if parent has child) child-data
(if child has subchild) subchild-data

Try something like:
SELECT tparent.id AS parent_id,
tparent.name AS parent_name,
tchild1.id AS child_id,
tchild1.name AS child_name,
tchild2.id AS subchild_id,
tchild2.name AS subchild_name
FROM tags tparent
LEFT JOIN tags tchild1
ON tparent.id = tchild1.parent_id
LEFT JOIN tags tchild2
ON tchild1.id = tchild2.parent_id
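As a rough check of the self-join idea, here is a minimal Python sketch that runs the same two LEFT JOINs against an in-memory SQLite database (SQLite stands in for MySQL here). The sample rows are made up, and the `WHERE tparent.parent_id = 0` filter is an addition that assumes top-level rows are stored with `parent_id = 0`; without it, children would also show up in the parent column.

```python
import sqlite3

# Hypothetical sample data: (id, parent_id, name); parent_id = 0 marks a top-level tag.
ROWS = [
    (1, 0, "family"),
    (2, 1, "male"),
    (3, 2, "boy1"),
    (4, 2, "boy2"),
    (5, 1, "female"),
    (6, 5, "girl1"),
]

QUERY = """
SELECT tparent.name AS parent_name,
       tchild1.name AS child_name,
       tchild2.name AS subchild_name
FROM tags tparent
LEFT JOIN tags tchild1 ON tparent.id = tchild1.parent_id
LEFT JOIN tags tchild2 ON tchild1.id = tchild2.parent_id
WHERE tparent.parent_id = 0   -- assumption: start from top-level rows only
ORDER BY tparent.id, tchild1.id, tchild2.id
"""


def fetch_paths():
    """Return one (parent, child, subchild) tuple per branch of the hierarchy."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tags (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
    )
    conn.executemany("INSERT INTO tags VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


paths = fetch_paths()
for path in paths:
    print(path)
```

Each result row is one full branch, so a parent with several children is repeated once per child (or once per subchild).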
According to your comment, you're looking for the following output:
ID | PARENT | NAME
1 | 0 | family
2 | 1 | male
3 | 2 | boy1
4 | 2 | boy2
5 | 1 | female
6 | 5 | girl1
I will assume that the ids won't always be in this order, because if they are, the problem is solved :)
I'm not sure you can achieve this directly in SQL without adding some additional information that will be used for ordering. For instance, you could add another column where you'd concatenate the ids of parent-child-subchild. Something like:
-- parent
SELECT CONCAT(LPAD(id, 6, '0'), '-000000-000000') AS order_info,
id AS id,
parent_id AS parent,
name AS name
FROM tags
WHERE parent_id = 0
UNION
-- child
SELECT CONCAT_WS('-', LPAD(tparent.id, 6, '0'),
LPAD(tchild1.id, 6, '0'),
'000000'),
tchild1.id,
tparent.id,
tchild1.name
FROM tags tparent
INNER JOIN tags tchild1
ON tparent.id = tchild1.parent_id
WHERE tparent.parent_id = 0
UNION
-- subchild
SELECT CONCAT_WS('-', LPAD(tparent.id, 6, '0'),
LPAD(tchild1.id, 6, '0'),
LPAD(tchild2.id, 6, '0')),
tchild2.id,
tchild1.id,
tchild2.name
FROM tags tparent
INNER JOIN tags tchild1
ON tparent.id = tchild1.parent_id
INNER JOIN tags tchild2
ON tchild1.id = tchild2.parent_id
ORDER BY 1
See the fiddle illustrating this.
Here, I'm zero-padding the ids to keep the ordering coherent. That requires knowing the maximum length of the ids (I used a length of 6 here), which is easy to derive from the id field type.
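To see why the padded key sorts rows into tree order, here is a small Python sketch of the same idea outside the database. The function names, the `WIDTH` constant, and the sample ids are hypothetical; it assumes at most three levels and that no row has id 0.

```python
WIDTH = 6  # assumed maximum number of digits in an id


def order_key(row, by_id):
    """Build 'parent-child-subchild' from zero-padded ids; unused levels become zeros."""
    chain = []
    current = row
    while current is not None:
        chain.append(current["id"])
        current = by_id.get(current["parent_id"])  # None once we pass the top level
    chain.reverse()
    chain += [0] * (3 - len(chain))
    return "-".join(str(i).zfill(WIDTH) for i in chain)


def sort_hierarchy(rows):
    """Sort rows so that every node is directly followed by its descendants."""
    by_id = {r["id"]: r for r in rows}
    return sorted(rows, key=lambda r: order_key(r, by_id))


# Hypothetical data whose ids are deliberately not in tree order.
sample = [
    {"id": 1, "parent_id": 3, "name": "girl1"},
    {"id": 2, "parent_id": 4, "name": "boy2"},
    {"id": 3, "parent_id": 10, "name": "female"},
    {"id": 4, "parent_id": 10, "name": "male"},
    {"id": 7, "parent_id": 4, "name": "boy1"},
    {"id": 10, "parent_id": 0, "name": "family"},
]

ordered_names = [r["name"] for r in sort_hierarchy(sample)]
print(ordered_names)
```

The padding matters because the key is compared as a string: without it, an id of 10 would sort before an id of 9.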

Related

select one row multiple time when using IN()

I have this query :
select
name
from
provinces
WHERE
province_id IN(1,3,2,1)
ORDER BY FIELD(province_id, 1,3,2,1)
The number of values in IN() is dynamic.
How can I get all rows, including duplicates (in this example, 1), in the given ORDER BY order?
The result should look like this:
name1
name3
name2
name1
Also, I shouldn't use UNION ALL like this:
select * from provinces WHERE province_id=1
UNION ALL
select * from provinces WHERE province_id=3
UNION ALL
select * from provinces WHERE province_id=2
UNION ALL
select * from provinces WHERE province_id=1
You need a helper table here. On SQL Server that can be something like:
SELECT name
FROM (Values (1),(3),(2),(1)) As list (id) --< List of values to join to as a table
INNER JOIN provinces ON province_id = list.id
Update: In MySQL Split Comma Separated String Into Temp Table can be used to split string parameter into a helper table.
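The helper-table idea can be sketched in Python with SQLite standing in for the real database. The temporary table `wanted` and its `pos` column are hypothetical names; the point is only that joining against a table of (position, id) pairs returns one row per list entry, duplicates included, in list order.

```python
import sqlite3


def names_in_list_order(conn, ids):
    """Return province names once per occurrence in ids, in the order given."""
    # Helper table: one row per list entry, remembering its position.
    conn.execute("CREATE TEMP TABLE IF NOT EXISTS wanted (pos INTEGER, id INTEGER)")
    conn.execute("DELETE FROM wanted")
    conn.executemany("INSERT INTO wanted VALUES (?, ?)", list(enumerate(ids)))
    rows = conn.execute(
        "SELECT p.name FROM wanted w "
        "JOIN provinces p ON p.province_id = w.id "
        "ORDER BY w.pos"
    ).fetchall()
    return [name for (name,) in rows]


# Hypothetical sample data matching the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE provinces (province_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO provinces VALUES (?, ?)",
    [(1, "name1"), (2, "name2"), (3, "name3")],
)

result = names_in_list_order(conn, [1, 3, 2, 1])
print(result)
```

Because the ids are passed as bound parameters, the list can be any length without building SQL text by hand.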
To get the same row more than once you need to join in another table. I suggest to create, only once(!), a helper table. This table will just contain a series of natural numbers (1, 2, 3, 4, ... etc). Such a table can be useful for many other purposes.
Here is the script to create it:
create table seq (num int);
insert into seq values (1),(2),(3),(4),(5),(6),(7),(8);
insert into seq select num+8 from seq;
insert into seq select num+16 from seq;
insert into seq select num+32 from seq;
insert into seq select num+64 from seq;
/* continue doubling the number of records until you feel you have enough */
For the task at hand it is not necessary to add many records, as you only need to make sure you never have more repetitions in your in condition than in the above seq table. I guess 128 will be good enough, but feel free to double the number of records a few times more.
Once you have the above, you can write queries like this:
select province_id,
name,
@pos := instr(@in2 := insert(@in2, @pos+1, 1, '#'),
concat(',',province_id,',')) ord
from (select @in := '0,1,2,3,1,0', @in2 := @in, @pos := 10000) init
inner join provinces
on find_in_set(province_id, @in)
inner join seq
on num <= length(replace(@in, concat(',',province_id,','),
concat(',+',province_id,',')))-length(@in)
order by ord asc
Output for the sample data and sample in list:
| province_id | name | ord |
|-------------|--------|-----|
| 1 | name 1 | 2 |
| 2 | name 2 | 4 |
| 3 | name 3 | 6 |
| 1 | name 1 | 8 |
SQL Fiddle
How it works
You need to put the list of values in the assignment to the variable @in. For it to work, every valid id must be wrapped between commas, so that is why there is a dummy zero at the start and the end.
By joining in the seq table the result set can grow. The number of records joined in from seq for a particular provinces record is equal to the number of occurrences of the corresponding province_id in the list @in.
There is no out-of-the-box function to count the number of such occurrences, so the expression at the right of num <= may look a bit complex. But it just adds a character for every match in @in and checks how much the length grows by that action. That growth is the number of occurrences.
In the select clause the position of the province_id in the @in list is returned and used to order the result set, so it corresponds to the order in the @in list. In fact, the position is taken with reference to @in2, which is a copy of @in, but is allowed to change:
While this @pos is being calculated, the number at the previous found @pos in @in2 is destroyed with a # character, so the same province_id cannot be found again at the same position.
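The length-growth trick for counting occurrences is easier to follow in isolation. This Python sketch mirrors the `length(replace(...)) - length(...)` expression; the function name is hypothetical and the input is the comma-wrapped sample list from above.

```python
def count_occurrences(in_list, province_id):
    """Count how often an id appears by measuring how much the string grows.

    Every ',id,' match is replaced by ',+id,', which is one character longer,
    so the growth in length equals the number of matches.
    """
    needle = f",{province_id},"
    grown = in_list.replace(needle, f",+{province_id},")
    return len(grown) - len(in_list)


in_list = "0,1,2,3,1,0"  # dummy zeros wrap the real ids in commas
counts = {pid: count_occurrences(in_list, pid) for pid in (1, 2, 3, 7)}
print(counts)
```

One caveat of this Python version: matches cannot overlap, so two identical ids sitting next to each other (such as `1,1`) share a comma and are counted only once.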
It's unclear exactly what you want, but here's why it's not working the way you expect. The IN keyword is shorthand for a statement like ... WHERE province_id = 1 OR province_id = 3 OR province_id = 2 OR province_id = 1. Since province_id = 1 already evaluates to true at the beginning of that statement, it doesn't matter that it is included again later. Listing a value twice therefore has no bearing on whether the result contains a duplicate.

SQL: How to check "if this record exists then that record must also exist" for given ID set

my database table (DWInfo) looks like this:
InstanceID | AttributeID
1 | 1
1 | 2
1 | 3
2 | 1
2 | 4
3 | 1
3 | 2
There are several instances and every instance has multiple attributes.
What I want to achieve is this: for a given set/rule of IDs I want to get all InstanceIDs which violate the condition. For example, let the given IDs be 1 and 2, which means that if an instance has AttributeID=1, AttributeID=2 should also exist for it. In this case the result would be instance 2, because this instance violates the condition.
I tried it with JOINS but this only seemed effective for 2 attributes and not more.
Select * from DWInfo dw1 INNER JOIN DWInfo dw2 ON dw1.InstanceID = dw2.InstanceID where dw1.AttributeID != dw2.AttributeID and dw1.AttributeID = 1 AND dw2.AttributeID != 2
Is it possible to solve this problem with a SQL query?
Assuming that each InstanceId can have only one of each different AttributeId, i.e. a unique composite index (InstanceId, AttributeId):
SELECT InstanceID
FROM DWInfo
WHERE AttributeID IN (1,2)
GROUP BY InstanceID
HAVING SUM(AttributeId = 1) = 1
AND COUNT(*) < 2 /* Or SUM(AttributeId = 2) = 0 */
SQLFiddle DEMO
Note that if the rule also runs the other way, i.e. having an AttributeId of 2 means the instance requires an AttributeId of 1 as well, the logic is slightly different and the query is neater:
SELECT InstanceID
FROM DWInfo
WHERE AttributeID IN (1,2)
GROUP BY InstanceID
HAVING COUNT(*) < 2
Where there exists Attribute 1 find the ones that don't have Attribute 2.
select InstanceID
from DWInfo
group by InstanceID
having
count(case when AttributeID = 1 then 1 end) > 0
and count(case when AttributeID = 2 then 1 end) = 0
This answer is basically the same as Arth's. You might find it beneficial to filter the Attributes in the where clause but it's not strictly necessary. I prefer the standard syntax using case expressions even though the shorthand would be handy if it were portable. I also prefer count over sum in these scenarios.
It's not clear whether you can have duplicates (probably not) and whether Attribute 2 can appear alone. You might have to tweak the numbers a bit but you should be able to follow the pattern.
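The conditional-count pattern can be tried against the question's sample data with a short Python sketch (SQLite in place of the original database). The `violators` helper and its parameter names are hypothetical; it returns the instances that have the "if" attribute but lack the "then" attribute.

```python
import sqlite3

VIOLATORS_SQL = """
SELECT InstanceID
FROM DWInfo
GROUP BY InstanceID
HAVING COUNT(CASE WHEN AttributeID = ? THEN 1 END) > 0   -- has the 'if' attribute
   AND COUNT(CASE WHEN AttributeID = ? THEN 1 END) = 0   -- lacks the 'then' attribute
ORDER BY InstanceID
"""


def violators(conn, if_attr, then_attr):
    """Instances that have if_attr but do not have then_attr."""
    return [row[0] for row in conn.execute(VIOLATORS_SQL, (if_attr, then_attr))]


# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DWInfo (InstanceID INTEGER, AttributeID INTEGER)")
conn.executemany(
    "INSERT INTO DWInfo VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 4), (3, 1), (3, 2)],
)

print(violators(conn, 1, 2))
```

For a rule with more than one required attribute, one more `COUNT(CASE ...) = 0` condition per required attribute (combined with OR) would follow the same pattern.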
I think this does what you want:
select instanceid
from dwinfo
where attributeid in (1, 2)
group by instanceid
having count(*) = 2;
This guarantees that you have two matching rows for each instance. If you can have duplicates, then use:
having count(distinct attributeid) = 2
EDIT:
For the conditional version (if 1 --> 2):
having max(attributeid = 2) > 0
That is, if it has 1 or 2, then it has to have 2, and everything is ok.

MySQL: is the table relationship wrong?

I'm very new to databases, and to MySQL in particular. I use XAMPP + MySQL Workbench.
I made 3 tables using MySQL Workbench:
- tbStores with fields StoreID(PK-INT-AI), StoreName
- tbProducts with fields ProductID(PK-INT-AI), ProductName
- tbProductDetails with fields ProductDetailID(PK-INT-AI), Price, ProductID(FK), StoreID(FK)
*PK=Primary Key
*INT=Numeric Type Attributes
*AI=Auto Increments
In case you don’t understand the Relationships above:
1 to many From tbStores(StoreID) To tbProductDetails (StoreID)
1 to many From tbProducts(ProductID) To tbProductDetails (ProductID)
I add values to the fields:
- tbStores=> StoreName=> Store 1
- tbProducts=> ProductName=> Product 1, Product 2
- tbProductDetails=> Price=> 50, 30
- tbProductDetails=> ProductID=> 1, 2
- tbProductDetails=> StoreID=> 1, 1
To the Query:
SELECT tbStores.StoreName, tbProductDetails.Price, tbProducts.ProductName
FROM tbStores, tbProductDetails, tbProducts
Where ProductName = 'Product 1';
The Problem:
Query will return this
Store 1, 50, Product 1
Store 1, 30, Product 1
It is giving me the same product with 2 different prices.
What I was expecting to get was this:
Store 1, 50, Product 1
What am I doing wrong? I believe it has to do with relationships but I can't figure it out. Thanks
You need to join the tables together (specify how they are related) in the query, the query should look something like this:
SELECT tbStores.StoreName, tbProductDetails.Price, tbProducts.ProductName
FROM tbProductDetails
JOIN tbStores ON tbStores.StoreID = tbProductDetails.StoreID
JOIN tbProducts ON tbProducts.ProductID = tbProductDetails.ProductID
WHERE tbProducts.ProductName = 'Product 1';
If you want all products you have to remove the where clause. Note that I took the liberty of changing your implicit joins in the from clause to explicit joins using the join keyword.
Sample SQL Fiddle
Sample output:
| STORENAME | PRICE | PRODUCTNAME |
|-----------|-------|-------------|
| Store1 | 50 | Product1 |
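To make the difference concrete, this Python sketch rebuilds the three tables in an in-memory SQLite database and runs both the comma-separated FROM (no join condition) and the explicit JOIN version. The schema is reconstructed from the question's description, so the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbStores (StoreID INTEGER PRIMARY KEY, StoreName TEXT);
CREATE TABLE tbProducts (ProductID INTEGER PRIMARY KEY, ProductName TEXT);
CREATE TABLE tbProductDetails (
    ProductDetailID INTEGER PRIMARY KEY,
    Price INTEGER,
    ProductID INTEGER REFERENCES tbProducts(ProductID),
    StoreID INTEGER REFERENCES tbStores(StoreID)
);
INSERT INTO tbStores (StoreName) VALUES ('Store 1');
INSERT INTO tbProducts (ProductName) VALUES ('Product 1'), ('Product 2');
INSERT INTO tbProductDetails (Price, ProductID, StoreID) VALUES (50, 1, 1), (30, 2, 1);
""")

# No join condition: every store/detail/product combination is produced first,
# then only the product name is filtered, so both prices survive.
implicit = conn.execute("""
    SELECT tbStores.StoreName, tbProductDetails.Price, tbProducts.ProductName
    FROM tbStores, tbProductDetails, tbProducts
    WHERE ProductName = 'Product 1'
    ORDER BY tbProductDetails.Price DESC
""").fetchall()

# Explicit joins: each detail row is matched only to its own store and product.
joined = conn.execute("""
    SELECT tbStores.StoreName, tbProductDetails.Price, tbProducts.ProductName
    FROM tbProductDetails
    JOIN tbStores ON tbStores.StoreID = tbProductDetails.StoreID
    JOIN tbProducts ON tbProducts.ProductID = tbProductDetails.ProductID
    WHERE tbProducts.ProductName = 'Product 1'
""").fetchall()

print(implicit)
print(joined)
```

The tables and relationships themselves are fine; only the missing join conditions caused the extra row.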
What you want is to use JOIN combined with ON
SELECT StoreName, Price, ProductName
FROM tbStores
JOIN tbProductDetails ON tbStores.StoreID = tbProductDetails.StoreID
JOIN tbProducts ON tbProducts.ProductID = tbProductDetails.ProductID
WHERE ProductName = 'Product 1'
You may consider GROUP BY to identify the specific stores.

Nested Set Query to retrieve all ancestors of each node

I have a MySQL query that I thought was working fine to retrieve all the ancestors of each node, starting from the top node, down to its immediate node. However when I added a 5th level to the nested set, it broke.
Below are example tables, queries and SQL Fiddles:
Four Level Nested Set:
CREATE TABLE Tree
(title varchar(20) PRIMARY KEY,
`tree` int,
`left` int,
`right` int);
INSERT Tree
VALUES
("Food", 1, 1, 18),
('Fruit', 1, 2, 11),
('Red', 1, 3, 6),
('Cherry', 1, 4, 5),
('Yellow', 1, 7, 10),
('Banana', 1, 8, 9),
('Meat', 1, 12, 17),
('Beef', 1, 13, 14),
('Pork', 1, 15, 16);
The Query:
SELECT t0.title node
,(SELECT GROUP_CONCAT(t2.title)
FROM Tree t2
WHERE t2.left<t0.left AND t2.right>t0.right
ORDER BY t2.left) ancestors
FROM Tree t0
GROUP BY t0.title;
The returned result for node Banana is Food,Fruit,Yellow - Perfect. You can see this here SQL Fiddle - 4 Levels
When I run the same query on the 5 level table below, the 5th level nodes come back in the wrong order:
CREATE TABLE Tree
(title varchar(20) PRIMARY KEY,
`tree` int,
`left` int,
`right` int);
INSERT Tree
VALUES
("Food", 1, 1, 24),
('Fruit', 1, 2, 13),
('Red', 1, 3, 8),
('Cherry', 1, 4, 7),
('Cherry_pie', 1, 5, 6),
('Yellow', 1, 9, 12),
('Banana', 1, 10, 11),
('Meat', 1, 14, 23),
('Beef', 1, 15, 16),
('Pork', 1, 17, 22),
('Bacon', 1, 18, 21),
('Bacon_Sandwich', 1, 19, 20);
The returned result for Bacon_Sandwich is Bacon,Food,Meat,Pork which is not the right order, it should be Food,Meat,Pork,Bacon - You can see this here SQL Fiddle - 5 Levels
I am not sure what is happening because I don't really understand subqueries well enough. Can anyone shed any light on this?
EDIT AFTER INVESTIGATION:
Woah!! Looks like writing all this out and reading up about ordering with GROUP_CONCAT gave me some inspiration.
Adding ORDER BY to the actual GROUP_CONCAT function and removing from the end of the subquery solved the issue. I now receive Food,Meat,Pork,Bacon for the node Bacon_Sandwich
SELECT t0.title node
,(SELECT GROUP_CONCAT(t2.title ORDER BY t2.left)
FROM Tree t2
WHERE t2.left<t0.left AND t2.right>t0.right
) ancestors
FROM Tree t0
GROUP BY t0.title;
I still have no idea why though. Having ORDER BY at the end of the subquery works for 4 levels but not for 5?!?!
If someone could explain what the issue is and why moving the ORDER BY fixes it, I'd be most grateful.
First it's important to understand that you have an implicit GROUP BY
If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows.
To make the point more understandable I'll leave out subqueries and reduce the problem to the banana. Banana is the set [10, 11]. The correct sorted ancestors are those:
SELECT "banana" as node, GROUP_CONCAT(title ORDER by `left`)
FROM Tree WHERE `left` < 10 AND `right` > 11
GROUP BY node;
The ORDER BY must be in GROUP_CONCAT() as you want the aggregation function to sort. ORDER BY outside sorts by the aggregated results (i.e. the result of GROUP_CONCAT()). The fact that it worked until level 4 is just luck. ORDER BY has no effect on an aggregate function. You would get the same results with or without the ORDER BY:
SELECT GROUP_CONCAT(title)
FROM Tree WHERE `left` < 10 AND `right` > 11
/* ORDER BY `left` */
It might help to understand what
SELECT GROUP_CONCAT(title ORDER BY left) FROM Tree WHERE … ORDER BY left does:
Get a selection (WHERE) which results in three rows in an undefined order:
("Food")
("Yellow")
("Fruit")
Aggregate the result into one row (implicit GROUP BY) in order to be able to use an aggregate function:
(("Food","Yellow", "Fruit"))
Fire the aggregate function (GROUP_CONCAT(title ORDER BY left)) on it, i.e. order by left and then concatenate:
("Food,Fruit,Yellow")
And now finally it sorts that result (ORDER BY). As it's only one row, sorting changes nothing.
("Food,Fruit,Yellow")
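The same steps can be replayed in plain Python on the five-level tree from the question. This is only an illustration of where the sort has to happen, not of how MySQL works internally; the function names are hypothetical.

```python
# (title, left, right) rows of the five-level nested set from the question.
TREE = [
    ("Food", 1, 24), ("Fruit", 2, 13), ("Red", 3, 8), ("Cherry", 4, 7),
    ("Cherry_pie", 5, 6), ("Yellow", 9, 12), ("Banana", 10, 11),
    ("Meat", 14, 23), ("Beef", 15, 16), ("Pork", 17, 22),
    ("Bacon", 18, 21), ("Bacon_Sandwich", 19, 20),
]


def enclosing_rows(title, tree):
    """WHERE step: rows whose interval strictly contains the node's interval."""
    left, right = next((l, r) for t, l, r in tree if t == title)
    return [(t, l) for t, l, r in tree if l < left and r > right]


def ancestors_by_left(title, tree):
    """Sort inside the aggregation, like GROUP_CONCAT(title ORDER BY left)."""
    rows = sorted(enclosing_rows(title, tree), key=lambda row: row[1])
    return ",".join(t for t, _ in rows)


def ancestors_by_title(title, tree):
    """What you get if the rows happen to arrive in title order instead."""
    rows = sorted(enclosing_rows(title, tree), key=lambda row: row[0])
    return ",".join(t for t, _ in rows)


print(ancestors_by_left("Bacon_Sandwich", TREE))
print(ancestors_by_title("Bacon_Sandwich", TREE))
```

For Banana both orderings agree, which is why the problem only showed up once the fifth level was added.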
You can get the result using JOIN or SUB-QUERY.
Using JOIN:
SELECT t0.title node, GROUP_CONCAT(t2.title ORDER BY t2.left) ancestors
FROM Tree t0
LEFT JOIN Tree t2 ON t2.left < t0.left AND t2.right > t0.right
GROUP BY t0.title;
Check this SQL FIDDLE DEMO
Using SUB-QUERY:
SELECT t0.title node,
(SELECT GROUP_CONCAT(t2.title ORDER BY t2.left)
FROM Tree t2 WHERE t2.left<t0.left AND t2.right>t0.right) ancestors
FROM Tree t0
GROUP BY t0.title;
Check this SQL FIDDLE DEMO
OUTPUT
| NODE | ANCESTORS |
|----------------|-----------------------|
| Bacon | Food,Meat,Pork |
| Bacon_Sandwich | Food,Meat,Pork,Bacon |
| Banana | Food,Fruit,Yellow |
| Beef | Food,Meat |
| Cherry | Food,Fruit,Red |
| Cherry_pie | Food,Fruit,Red,Cherry |
| Food | (null) |
| Fruit | Food |
| Meat | Food |
| Pork | Food,Meat |
| Red | Food,Fruit |
| Yellow | Food,Fruit |
In your sub-query you put the ORDER BY after the WHERE clause, which does not affect the output of GROUP_CONCAT(). Without an ORDER BY inside the function, GROUP_CONCAT() concatenates the values in whatever order the rows happen to be read, and that order is not guaranteed. In your case the rows came back in ascending order of the title column (most likely because title is the primary key).
That is why your first query returned Food,Fruit,Yellow for node Banana: ascending order of title happens to match the order by left there.
In your second result, Bacon_Sandwich gets Bacon,Food,Meat,Pork because, in ascending order of title, Bacon comes before Food.
If you want to order the result by the left column, then you have to specify ORDER BY inside the GROUP_CONCAT() function as above. See both of my queries.
I would use the JOIN instead of the SUB-QUERY for better performance.

SQL query - tagging system, delimited string + unique values

I am working for a client that stores item tags in the MySQL DB like so (I know, I know - not ideal):
coats_and_jackets-Woven_Jacket-brand:Hobbs;
coats_and_jackets-Woven_Jacket-color:Black;
coats_and_jackets-Woven_Jacket-style:Boucle;
coats_and_jackets-Woven_Jacket-pattern:Plain;
dresses-Pinafore-brand:COS;
dresses-Pinafore-color:Blue _ Navy;
dresses-Pinafore-style:Wool;
dresses-Pinafore-pattern:Plain;
shoes-Ankle_Boot-brand:Topshop;
shoes-Ankle_Boot-color:Black;
shoes-Ankle_Boot-style:Leather;
shoes-Ankle_Boot-pattern:Plain;
bags-Tote-brand:Mulberry;
bags-Tote-color:Brown _ Tan;
bags-Tote-style:Leather;
bags-Tote-pattern:Plain;
shoes-Ballet_shoes-brand:Chanel;
shoes-Ballet_shoes-color:Black;
shoes-Ballet_shoes-style:Leather;
shoes-Ballet_shoes-pattern:Plain;
accessories-Scarf-brand:Zara;
accessories-Scarf-color:Brown _ Tan;
accessories-Scarf-style:Wool;
accessories-Scarf-pattern:Checked;
Each tag is broken down into 4 parts like so: category-type-brand, category-type-color, category-type-style, category-type-pattern
Not all 4 parts of a tag are required and can be omitted from the DB.
I have been tasked with finding out how many tags an item has, so in this example 6 tags have been used, each with all 4 parts.
The query I have so far counts all the tag parts, in this example 24, but I cannot assume that each tag will have all 4 parts stored, so I cannot divide the number of parts by 4 to get the number of tags.
In this example, the 6 tags used are as follows:
Coats & Jackets (Woven Jacket)
Dresses (Pinafore)
Shoes (Ankle boot)
Bags (Tote)
Shoes (Ballet Shoes)
Accessories (Scarf)
Now I'm not concerned about the category, type or parts (brand, color, style, pattern) - I'm just concerned about fetching the total amount of tags for this item.
Also, the data example above would be stored in a db row that looks like:
+----------+-------------+----------------------------+
| ID | meta_key | meta_value |
+----------+-------------+----------------------------+
| 1 | tags | coats_and_jackets-wove... |
+----------+-------------+----------------------------+
| 2 | item_desc | Fashion editor |
+----------+-------------+----------------------------+
Help structuring this query would be much appreciated.
The tags use hyphen as a separator. Here is a method for finding the number of tags used by a given item:
select it.*, length(it.tags) - length(replace(it.tags, '-', ''))+1
from itemtags it
This replaces the hyphen with an empty string, and measures the difference in lengths.
Assuming I'm understanding your requirement correctly, how about something like this (with CTE used to demonstrate assumed table structure)
WITH CTE1(tag) AS(
select 'coats_and_jackets-Woven_Jacket-brand:Hobbs' union
-- ...
select 'accessories-Scarf-color:Brown _ Tan' union
select 'accessories-Scarf-style:Wool' union
select 'accessories-Scarf-pattern:Checked'
)
, CTE2(tag_prefix) AS(
select LEFT(tag, CHARINDEX('-', tag, CHARINDEX('-', tag) + 1) - 1) from CTE1
)
select tag_prefix, COUNT(*) from CTE2 group by tag_prefix
This will give you results of...
accessories-Scarf 4
bags-Tote 4
coats_and_jackets-Woven_Jacket 4
dresses-Pinafore 4
shoes-Ankle_Boot 4
shoes-Ballet_shoes 4
... which gives you the tag prefix and number of parts used. From there you can count the individual rows or sum the number of parts or whatever else you need...
I've just realised that my solution is completely pointless given that I missed the 'mysql' tag ;) but I'll post it up here anyway. Hopefully it can give you a pointer on how to proceed.
WITH CTE1(ID, meta_key, meta_value) AS(
select 1, 'tags', 'coats_and_jackets-Wo...' union all
select 2, 'item_desc', 'Fashion editor'
)
, TagsCTE AS(
select t.ID, x.Item as tag_and_value
from CTE1 t
cross apply dbo.fn_SplitString(t.meta_value, ';') x
where meta_key = 'tags' and LEN(x.Item) > 0
)
select ID, COUNT(parts_count) from (
select ID, COUNT(*) as parts_count
from TagsCTE
group by ID, LEFT(tag_and_value, CHARINDEX('-', tag_and_value, CHARINDEX('-', tag_and_value) + 1) - 1)
) a group by ID
This gives results of:
1 6
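If the counting can be done in application code instead of SQL, the same prefix idea is short in Python. This is a sketch under the assumptions that parts are separated by `;`, that every part has the form `category-type-part:value`, and that the category and type contain no hyphens; `count_tags` is a hypothetical helper name.

```python
def count_tags(meta_value):
    """Count distinct (category, type) prefixes in a ';'-separated tag string."""
    prefixes = set()
    for part in meta_value.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after the trailing ';'
        # Split on the first two hyphens only: category, type, and the rest.
        category, item_type, _rest = part.split("-", 2)
        prefixes.add((category, item_type))
    return len(prefixes)


# Shortened sample in the question's format; not every tag has all 4 parts.
sample = (
    "coats_and_jackets-Woven_Jacket-brand:Hobbs;\n"
    "coats_and_jackets-Woven_Jacket-color:Black;\n"
    "shoes-Ankle_Boot-brand:Topshop;\n"
    "shoes-Ballet_shoes-color:Black;\n"
    "shoes-Ballet_shoes-pattern:Plain;\n"
)

tag_count = count_tags(sample)
print(tag_count)
```

Because the count is based on distinct prefixes, it does not matter how many of the 4 parts each tag actually stores.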
Good luck.