The idea is simple - I have two tables, categories and products.
Categories:
id | parent_id | name | count
1 NULL Literature 6020
2 1 Interesting books 1000
3 1 Horrible books 5000
4 1 Books to burn 20
5 NULL Motorized vehicles 1000
6 5 Cars 999
7 5 Motorbikes 1
...
Products:
id | category_id | name
1 1 Cooking for dummies
2 3 Twilight saga
3 5 My grandpa's car
...
When displayed, a parent category contains all the products of all its child categories, and any category may have child categories. The count field in the table structure contains (or at least I want it to contain) the count of all products displayed in that particular category. On the front end, I select all subcategories with a simple recursive function, but I'm not sure how to do this in a SQL procedure (yes, it has to be a SQL procedure). The tables contain about a hundred categories of all kinds and over 100,000 products.
Any ideas?
Bill Karwin made some nice slides about hierarchical data. The Adjacency List model you are currently using certainly has its pros, but it's not well suited for this (getting a whole subtree).
For my adjacency tables, I solve it by storing / caching the path (possibly in a script, or in a 'before update' trigger): whenever parent_id changes, a new path string is created. Your current table would look like this:
id | parent_id | path | name | count
1 NULL 1 Literature 6020
2 1 1:2 Interesting books 1000
3 1 1:3 Horrible books 5000
4 1 1:4 Books to burn 20
5 NULL 5 Motorized vehicles 1000
6 5 5:6 Cars 999
7 5 5:7 Motorbikes 1
(choose any delimiter not found in the id you like)
So, now to get all products from a category + subcategories:
SELECT p.*
FROM categories c_main
JOIN categories c_subs
ON c_subs.id = c_main.id
OR c_subs.path LIKE CONCAT(c_main.path,':%')
JOIN products p
ON p.category_id = c_subs.id
WHERE c_main.id = <id>
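To try the path idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the `path` column has already been filled in as shown above, and it uses SQLite's `||` operator where MySQL would use `CONCAT`; the helper name `products_in_subtree` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, parent_id INTEGER, path TEXT, name TEXT);
CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT);
INSERT INTO categories VALUES
  (1, NULL, '1',   'Literature'),
  (2, 1,    '1:2', 'Interesting books'),
  (3, 1,    '1:3', 'Horrible books'),
  (5, NULL, '5',   'Motorized vehicles'),
  (6, 5,    '5:6', 'Cars');
INSERT INTO products VALUES
  (1, 1, 'Cooking for dummies'),
  (2, 3, 'Twilight saga'),
  (3, 5, 'My grandpa''s car');
""")

def products_in_subtree(category_id):
    """Return product names in the category itself and in all of its descendants."""
    rows = conn.execute("""
        SELECT p.name
        FROM categories c_main
        JOIN categories c_subs
          ON c_subs.id = c_main.id
          OR c_subs.path LIKE c_main.path || ':%'   -- SQLite spelling of CONCAT(path, ':%')
        JOIN products p ON p.category_id = c_subs.id
        WHERE c_main.id = ?
        ORDER BY p.id
    """, (category_id,)).fetchall()
    return [name for (name,) in rows]

literature_products = products_in_subtree(1)
vehicle_products = products_in_subtree(5)
```

Because the delimiter follows the parent's path, a category with id 10 is not mistaken for a child of category 1.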
Take a look at this article on managing hierarchical trees in MySQL.
It explains the disadvantages of your current method and some more optimal solutions.
See especially the section towards the end headed 'Aggregate Functions in a Nested Set'.
There's a whole chapter in "SQL Antipatterns: Avoiding the Pitfalls of Database Programming" by Bill Karwin about managing hierarchical data in SQL.
As you haven't accepted an answer yet, I thought I'd post my method for handling trees in MySQL and PHP (a single DB call to a non-recursive sproc).
Full script here : http://pastie.org/1252426 or see below...
Hope this helps :)
PHP
<?php
$conn = new mysqli("localhost", "foo_dbo", "pass", "foo_db", 3306);
$result = $conn->query(sprintf("call product_hier(%d)", 3));
echo "<table border='1'>
<tr><th>prod_id</th><th>prod_name</th><th>parent_prod_id</th>
<th>parent_prod_name</th><th>depth</th></tr>";
while($row = $result->fetch_assoc()){
echo sprintf("<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>",
$row["prod_id"],$row["prod_name"],$row["parent_prod_id"],
$row["parent_prod_name"],$row["depth"]);
}
echo "</table>";
$result->close();
$conn->close();
?>
SQL
drop table if exists product;
create table product
(
prod_id smallint unsigned not null auto_increment primary key,
name varchar(255) not null,
parent_id smallint unsigned null,
key (parent_id)
)engine = innodb;
insert into product (name, parent_id) values
('Products',null),
('Systems & Bundles',1),
('Components',1),
('Processors',3),
('Motherboards',3),
('AMD',5),
('Intel',5),
('Intel LGA1366',7);
delimiter ;
drop procedure if exists product_hier;
delimiter #
create procedure product_hier
(
in p_prod_id smallint unsigned
)
begin
declare v_done tinyint unsigned default 0;
declare v_depth smallint unsigned default 0;
create temporary table hier(
parent_id smallint unsigned,
prod_id smallint unsigned,
depth smallint unsigned default 0
)engine = memory;
insert into hier select parent_id, prod_id, v_depth from product where prod_id = p_prod_id;
/* http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */
create temporary table tmp engine=memory select * from hier;
while not v_done do
if exists( select 1 from product p inner join hier on p.parent_id = hier.prod_id and hier.depth = v_depth) then
insert into hier
select p.parent_id, p.prod_id, v_depth + 1 from product p
inner join tmp on p.parent_id = tmp.prod_id and tmp.depth = v_depth;
set v_depth = v_depth + 1;
truncate table tmp;
insert into tmp select * from hier where depth = v_depth;
else
set v_done = 1;
end if;
end while;
select
p.prod_id,
p.name as prod_name,
b.prod_id as parent_prod_id,
b.name as parent_prod_name,
hier.depth
from
hier
inner join product p on hier.prod_id = p.prod_id
inner join product b on hier.parent_id = b.prod_id
order by
hier.depth, hier.prod_id;
drop temporary table if exists hier;
drop temporary table if exists tmp;
end #
delimiter ;
call product_hier(3);
call product_hier(5);
What you want is a common table expression. Unfortunately it looks like MySQL doesn't support them (recursive CTEs only arrived in MySQL 8.0).
Instead you will probably need to use a loop to keep selecting deeper levels of the tree.
I'll try to whip up an example.
To clarify, you're looking to be able to call the procedure with an input of say '1' and get back all the sub categories and subsub categories (and so on) with 1 as an eventual root?
like
id parent
1 null
2 1
3 1
4 2
?
Edited:
This is what I came up with; it seems to work.
Unfortunately I don't have MySQL, so I had to use SQL Server. I tried to check everything to make sure it will work with MySQL, but there may still be issues.
declare @input int
set @input = 1
--not needed, but informative
declare @depth int
set @depth = 0
--for breaking out of the loop
declare @break int
set @break = 0
--my table '[recursive]' is pretty simple, the results table matches it
declare @results table
(
id int,
parent int,
depth int
)
--Seed the results table with the root node
insert into @results
select id, parent, @depth from [recursive]
where ID = @input
--Loop through, adding nodes as we go
set @break = 1
while (@break > 0)
begin
set @depth=@depth+1 --Increase the depth counter each loop
--This checks to see how many rows we are about to add to the table.
--If we don't add any rows, we can stop looping
select @break = count(id) from [recursive]
where parent in
(
select id from @results
)
and id not in --Don't add rows that are already in the results
(
select id from @results
)
--Here we add the rows to the results table
insert into @results
select id, parent, @depth from [recursive]
where parent in
(
select id from @results
)
and id not in --Don't add rows that are already in the results
(
select id from @results
)
end
--Select the results and return
select * from @results
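On an engine that does support recursive common table expressions, the same traversal might be written without an explicit loop. The following is only a sketch, run against SQLite through Python's built-in sqlite3 module rather than MySQL or SQL Server; the table name `tree` and the function name `subtree` are made up for illustration, and the extra root row (id 5) is there to show that unrelated nodes are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tree (id INTEGER PRIMARY KEY, parent INTEGER);
INSERT INTO tree VALUES (1, NULL), (2, 1), (3, 1), (4, 2), (5, NULL);
""")

def subtree(root_id):
    """Return (id, parent, depth) for root_id and every node below it."""
    return conn.execute("""
        WITH RECURSIVE results(id, parent, depth) AS (
            -- seed row: the requested root
            SELECT id, parent, 0 FROM tree WHERE id = ?
            UNION ALL
            -- recursive step: children of rows found so far
            SELECT t.id, t.parent, r.depth + 1
            FROM tree t
            JOIN results r ON t.parent = r.id
        )
        SELECT id, parent, depth FROM results ORDER BY depth, id
    """, (root_id,)).fetchall()

rows = subtree(1)
```

Here `rows` holds the root at depth 0, its two children at depth 1 and the grandchild at depth 2, which matches what the loop accumulates in the results table.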
Try to get rid of the hierarchy that is implemented that way. Recursion in stored procedures isn't nice; for example, on MS SQL nested calls fail once they go past the 32-level nesting limit.
Also, to get, for example, everything from some category and its subcategories, you will have to recursively go all the way down, which is impractical in SQL, not to mention slow.
Instead, use this; create category_path field, and make it look like:
category_path name
1/ literature
1/2/ Interesting books
1/3/ Horrible books
1/4/ Books to burn
5/ Motorized vehicles
5/6/ Cars
5/7/ Motorbikes
By using that method, you will be able to SELECT categories and subcategories very fast. Updates will be slow, but I guess that they CAN be slow. Also, you can keep your old child-parent relationship fields, to help you maintain your tree structure.
For example, getting everything under Motorized vehicles (cars and motorbikes included), without any recursion, will be:
SELECT * FROM ttt WHERE category_path LIKE '5/%'
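The same prefix match can also refresh the cached per-category count from the original question. This is a rough sketch against SQLite via Python's sqlite3 module, not tested on MySQL; the column is called `product_count` here instead of `count`, and the product 'Some car' is invented so that a leaf category has something to count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (
    id INTEGER PRIMARY KEY,
    category_path TEXT,
    name TEXT,
    product_count INTEGER DEFAULT 0
);
CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT);
INSERT INTO categories (id, category_path, name) VALUES
  (1, '1/',   'Literature'),
  (2, '1/2/', 'Interesting books'),
  (3, '1/3/', 'Horrible books'),
  (5, '5/',   'Motorized vehicles'),
  (6, '5/6/', 'Cars');
INSERT INTO products VALUES
  (1, 1, 'Cooking for dummies'),
  (2, 3, 'Twilight saga'),
  (3, 5, 'My grandpa''s car'),
  (4, 6, 'Some car');

-- For every category, count the products whose own category path starts with this path.
UPDATE categories
SET product_count = (
    SELECT COUNT(*)
    FROM products p
    JOIN categories sub ON sub.id = p.category_id
    WHERE sub.category_path LIKE categories.category_path || '%'
);
""")

counts = dict(conn.execute("SELECT name, product_count FROM categories"))
```

Since every path ends with the delimiter, '1/%' cannot accidentally match a category whose id merely begins with 1, such as '10/'.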
I have a new column in my database and I need to fill it with the value of that same column from one specific row. I want to create a "copy to all" feature.
For example add the same price to all the products taken from the first row:
ID NAME PRICE
1 PROD1 5
2 PROD2 0
3 PROD3 0
4 PROD4 0
I am trying to select the PRICE of the first row (ID 1) and copy it to all the other rows.
I have tried:
UPDATE PRODUCTS SET PRICE = (select PRICE from PRODUCTS where ID = 1);
I want to end up with this
ID NAME PRICE
1 PROD1 5
2 PROD2 5
3 PROD3 5
4 PROD4 5
But I get this error:
Table 'PRODUCTS' is specified twice,
both as a target for 'UPDATE' and as a separate source for data
I tried specifying each table separately
UPDATE PRODUCTS as a SET a.PRICE = (select b.PRICE from PRODUCTS as b where b.ID = 1);
But I get the same error.
Table 'a' is specified twice,
both as a target for 'UPDATE' and as a separate source for data
Maybe I have to create a temporary table and copy from it?
Any hints on how to accomplish this?
Thanks.
You can do it by nesting the select query:
UPDATE PRODUCTS
SET PRICE = (
select PRICE from (select PRICE from PRODUCTS where ID = 1) t
);
See the demo.
Another way to do it, with a self CROSS JOIN:
UPDATE PRODUCTS p CROSS JOIN (
select PRICE from PRODUCTS where ID = 1
) t
SET p.PRICE = t.PRICE;
See the demo.
This wouldn't work logically, as SQL would try to fetch the data it is updating.
Try running your nested statement select PRICE from PRODUCTS where ID = 1 separately, saving the result, and then running your main statement: "UPDATE PRODUCTS SET PRICE = " + newPrice
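In application code, that two-step approach could look roughly like the sketch below. It uses Python's built-in sqlite3 module purely so it runs anywhere; SQLite does not raise the MySQL error from the question, so this only illustrates the read-then-write flow. The value is passed as a bound parameter rather than concatenated into the SQL string, which is a deliberate change from the string concatenation shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCTS (ID INTEGER PRIMARY KEY, NAME TEXT, PRICE INTEGER);
INSERT INTO PRODUCTS VALUES (1, 'PROD1', 5), (2, 'PROD2', 0), (3, 'PROD3', 0), (4, 'PROD4', 0);
""")

# Step 1: read the price of the source row into a variable.
(new_price,) = conn.execute("SELECT PRICE FROM PRODUCTS WHERE ID = ?", (1,)).fetchone()

# Step 2: write it to every row, binding the value instead of building the SQL by hand.
conn.execute("UPDATE PRODUCTS SET PRICE = ?", (new_price,))
conn.commit()

prices = [price for (price,) in conn.execute("SELECT PRICE FROM PRODUCTS ORDER BY ID")]
```

After the update, all four rows carry the price taken from the row with ID 1.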
If this is a SQL script, I suggest you break it into two queries using a variable:
select @var := PRICE from PRODUCTS where ID = 1;
UPDATE PRODUCTS SET PRICE = @var;
Variables are much easier than a temporary table, in my opinion.
I haven't fully tested this, but the syntax should be right.
After looking at many answers on other posts and websites (none of them had the precise answer), I found a solution with this query, and yes, it needs a temp table:
CREATE TEMPORARY TABLE tmptable SELECT ID, PRICE FROM PRODUCTS WHERE ID = 1;
UPDATE PRODUCTS SET `PRICE` = (select tmptable.`PRICE` from tmptable where tmptable.ID = 1);
BUT! @forpas's solution is really good and works without creating a temp table.
Enjoy.
NEW question: is the temp table removed automatically? Leave me a comment.
Cheers
I have a question for you guys.
I'm trying to write a procedure for my MySQL table, but I need some assistance ...
I'm completely stuck ...
I need to create a procedure that shows the parents' names, but the table only stores the parents' ids.
For example:
(DELIMITER //
CREATE PROCEDURE fetch_animal_parents (IN animal_id INT, OUT animal_name VARCHAR(10))
BEGIN
DECLARE animal_mom INT DEFAULT 0 ;
DECLARE animal_dad INT DEFAULT 0 ;
DECLARE animal_name_mom VARCHAR(10) ;
DECLARE animal_name_dad VARCHAR(10) ;
SELECT name INTO animal_name, (SELECT name FROM animal WHERE id = child.mother_id) INTO animal_name_mom,
(SELECT name FROM animal WHERE id = child.father_id) INTO animal_name_dad
FROM animal AS child ;
END //)
What am I doing wrong?
Any input is welcome.
1) Why do you select the mom's and dad's names when you are not using them anywhere?
2) I imagine your procedure should take a child animal id as input and give the mom's and dad's names as output (that's what your procedure name suggests). In that case you need either two output values, or you need to concatenate those names into one variable and return that.
3) @VMai suggested a two-join format, which I agree with. The query would be something like:
SELECT mom.name,dad.name INTO animal_name_mom, animal_name_dad
FROM (select mother_id,father_id from animal where id = <procedure_input>) AS `child`
INNER JOIN (select id,name from animal) AS `mom` ON (mom.id=child.mother_id)
INNER JOIN (select id,name from animal) AS `dad` ON (dad.id=child.father_id)
I see that you have at least tried something on your own (though it's quite confused). I'd suggest you start by learning some basic MySQL syntax, keywords and functions before trying procedures. Learn to use GROUP BY and the variations of JOIN, and you will be able to handle a lot of basic queries.
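To see the two-join idea from point 3 in action outside a stored procedure, here is a small sketch using Python's built-in sqlite3 module. The sample animals are invented, and the function name simply borrows the procedure name from the question. Note one swap: it uses LEFT JOIN instead of INNER JOIN so that an animal with unknown parents still comes back, with empty parent names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE animal (id INTEGER PRIMARY KEY, name TEXT, mother_id INTEGER, father_id INTEGER);
INSERT INTO animal VALUES
  (1, 'Daisy', NULL, NULL),
  (2, 'Rex',   NULL, NULL),
  (3, 'Buddy', 1,    2);
""")

def fetch_animal_parents(animal_id):
    """Return (animal name, mother name, father name) for one animal."""
    return conn.execute("""
        SELECT child.name, mom.name, dad.name
        FROM animal AS child
        LEFT JOIN animal AS mom ON mom.id = child.mother_id
        LEFT JOIN animal AS dad ON dad.id = child.father_id
        WHERE child.id = ?
    """, (animal_id,)).fetchone()

buddy = fetch_animal_parents(3)
daisy = fetch_animal_parents(1)
```

The WHERE clause on the child id is what keeps the result to a single row, which is also what a SELECT ... INTO inside a procedure needs.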
I have 2 tables: customer and category.
category:
category_id category_name vendor_id
1 laptops 10
2 bikes 10
3 cars 10
customer:
user_name password assigned_categories vendor_id
nag 12345 1,2,3 10
When I log in with user_name and password, I need to get all the category_id and category_name values from category, but I am getting only the first category's details, like
category_id=1,category_name=laptops
Your customer table does not abide by the First Normal Form because you are storing multiple values in the assigned_categories attribute. If you would create a new table customer_assignments, it could all be easily done with basic SQL commands.
Here's how the new table customer_assignments would look:
user_name (FK)
category_id (FK)
PRIMARY KEY(user_name, category_id)
Many-to-Many relationships should be handled like this, not by adding multiple values into one attribute.
You could then extract your required information with a query like:
SELECT category_id, category_name
FROM (category NATURAL JOIN customer_assignments) NATURAL JOIN customer
WHERE user_name = your_current_user_name
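Here is a minimal sketch of that normalized layout, run against SQLite through Python's built-in sqlite3 module. It spells the joins out explicitly instead of using NATURAL JOIN, which is a substitution on my part so the join columns are visible; the data is the sample from the question with the assigned_categories column moved into customer_assignments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT, vendor_id INTEGER);
CREATE TABLE customer (user_name TEXT PRIMARY KEY, password TEXT, vendor_id INTEGER);
CREATE TABLE customer_assignments (
    user_name   TEXT    REFERENCES customer(user_name),
    category_id INTEGER REFERENCES category(category_id),
    PRIMARY KEY (user_name, category_id)
);
INSERT INTO category VALUES (1, 'laptops', 10), (2, 'bikes', 10), (3, 'cars', 10);
INSERT INTO customer VALUES ('nag', '12345', 10);
-- one row per (user, category) pair instead of the '1,2,3' string
INSERT INTO customer_assignments VALUES ('nag', 1), ('nag', 2), ('nag', 3);
""")

assigned = conn.execute("""
    SELECT c.category_id, c.category_name
    FROM customer_assignments AS a
    JOIN category AS c ON c.category_id = a.category_id
    WHERE a.user_name = ?
    ORDER BY c.category_id
""", ("nag",)).fetchall()
```

With one row per assignment, all three categories come back for the user without any string splitting.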
If you are not going to change your structure, then the query below (SQL Server syntax) will get you the required result:
DECLARE @Test VARCHAR(MAX) = ''
SELECT @Test = assigned_categories FROM customer WHERE user_name = 'nag'
IF LEN(@Test) > 0
BEGIN
CREATE TABLE #Temp(category_id VARCHAR(MAX))
WHILE LEN(@Test) > 0
BEGIN
IF CHARINDEX(',',@Test) > 0
BEGIN
INSERT INTO #Temp VALUES(LEFT(@Test,CHARINDEX(',',@Test)-1))
SET @Test = SUBSTRING(@Test,CHARINDEX(',',@Test) + 1,LEN(@Test))
END
ELSE
BEGIN
INSERT INTO #Temp VALUES(@Test)
SET @Test = ''
END
END
SELECT * FROM category INNER JOIN #Temp ON category.category_id = #Temp.category_id
END
Maybe the SQL IN operator will help; it allows you to specify multiple values in a WHERE clause.
I have a table containing stages and sub-stages of certain projects, and a table with specific tasks and estimated costs.
I need some way to aggregate each level (stages/sub-stages), to see how much it costs, but to do it at a minimum performance cost.
To illustrate this, I will use the following data structure:
CREATE TABLE stage
(
id int not null,
fk_parent int
)
CREATE TABLE task
(
id int not null,
fk_stage int not null,
cost decimal(18,2) not null default 0
)
with the following data:
==stage==
id fk_parent
1 null
2 1
3 1
==task==
id fk_stage cost
1 2 100
1 2 200
1 3 600
I want to obtain a table containing the total costs on each branch. Something like this:
Stage ID Total Cost
1 900
2 300
3 600
But I also want it to perform well. I don't want to end up with an extremely bad solution like The worst algorithm in the world, and I think that is the risk here: if I request the data for all the items in the stage table, with the total costs, each total cost will be evaluated D times, where D is the depth (level) in the tree at which it is situated. I am afraid I'll hit extremely poor performance with large amounts of data and a lot of levels.
SO,
I decided to do something which made me ask this question here.
I decided to add 2 more columns to the stage table, for caching.
...
calculated_cost decimal(18,2),
date_calculated_cost datetime
...
So what I wanted to do is pass another variable within the code, a datetime value which equals to the time when this process was started (pretty much unique). That way, if the stage row already has a date_calculated_cost which equals to the one I'm carrying, I don't bother calculating it again, and just return the calculated_cost value.
I couldn't do it with Functions (updates are needed to the stage table, once costs are calculated)
I couldn't do it with Procedures (recursion within running cursors is a no-go)
I am not sure temporary tables are suitable because it wouldn't allow concurrent requests to the same procedure (which are least likely, but anyway I want to do it the right way)
I couldn't figure out other ways.
I am not expecting a definite answer to my question, but I will reward any good idea, and the best will be chosen as the answer.
1. A way to query the tables to get the aggregated cost.
Calculate the cost for each stage.
Use a recursive CTE to get the level for each stage.
Store the result in a temp table.
Add a couple of indexes to the temp table.
Update the cost in the temp table in a loop for each level
The first three steps are combined into one statement. It might be good for performance to store the first calculation, cteCost, in a temp table of its own and use that temp table in the recursive cteLevel.
;with cteCost as
(
select s.id,
s.fk_parent,
isnull(sum(t.cost), 0) as cost
from stage as s
left outer join task as t
on s.id = t.fk_stage
group by s.id, s.fk_parent
),
cteLevel as
(
select cc.id,
cc.fk_parent,
cc.cost,
1 as lvl
from cteCost as cc
where cc.fk_parent is null
union all
select cc.id,
cc.fk_parent,
cc.cost,
lvl+1
from cteCost as cc
inner join cteLevel as cl
on cc.fk_parent = cl.id
)
select *
into #task
from cteLevel
create clustered index IX_id on #task (id)
create index IX_lvl on #task (lvl, fk_parent)
declare @lvl int
select @lvl = max(lvl)
from #task
while @lvl > 0
begin
update T1 set
T1.cost = T1.cost + T2.cost
from #task as T1
inner join (select fk_parent, sum(cost) as cost
from #task
where lvl = @lvl
group by fk_parent) as T2
on T1.id = T2.fk_parent
set @lvl = @lvl - 1
end
select id as [Stage ID],
cost as [Total Cost]
from #task
drop table #task
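For comparison, the expected totals from the question can also be produced in a single statement by pairing every stage with each of its descendants and summing over those pairs. This is a different technique from the level-by-level loop above, shown here only as a sketch against SQLite via Python's built-in sqlite3 module; the CTE name `closure` is made up. It does touch each task once per ancestor, which is the repeated work the question worries about, so the temp-table loop may well scale better on deep trees.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stage (id INTEGER NOT NULL, fk_parent INTEGER);
CREATE TABLE task (id INTEGER NOT NULL, fk_stage INTEGER NOT NULL, cost NUMERIC NOT NULL DEFAULT 0);
INSERT INTO stage VALUES (1, NULL), (2, 1), (3, 1);
INSERT INTO task VALUES (1, 2, 100), (1, 2, 200), (1, 3, 600);
""")

totals = conn.execute("""
    WITH RECURSIVE closure(ancestor, descendant) AS (
        -- every stage counts as its own descendant
        SELECT id, id FROM stage
        UNION ALL
        -- extend each pair one level further down the tree
        SELECT c.ancestor, s.id
        FROM closure c
        JOIN stage s ON s.fk_parent = c.descendant
    )
    SELECT c.ancestor, COALESCE(SUM(t.cost), 0)
    FROM closure c
    LEFT JOIN task t ON t.fk_stage = c.descendant
    GROUP BY c.ancestor
    ORDER BY c.ancestor
""").fetchall()
```

With the sample data, `totals` pairs stage 1 with 900, stage 2 with 300 and stage 3 with 600, matching the table the question asks for.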
2. A trigger on table task that maintains a calculated_cost field in stage.
create trigger tr_task
on task
after insert, update, delete
as
-- Table to hold the updates
declare @T table
(
id int not null,
cost decimal(18,2) not null default 0
)
-- Get the updates from inserted and deleted tables
insert into @T (id, cost)
select fk_stage, sum(cost)
from (
select fk_stage, cost
from inserted
union all
select fk_stage, -cost
from deleted
) as T
group by fk_stage
declare @id int
select @id = min(id)
from @T
-- For each updated row
while @id is not null
begin
-- Recursive update of stage
with cte as
(
select s.id,
s.fk_parent
from stage as s
where id = @id
union all
select s.id,
s.fk_parent
from stage as s
inner join cte as c
on s.id = c.fk_parent
)
update s set
calculated_cost = s.calculated_cost + t.cost
from stage as s
inner join cte as c
on s.id = c.id
cross apply (select cost
from @T
where id = @id) as t
-- Get the next id
select @id = min(id)
from @T
where id > @id
end
Hello, I'm having a hard time with this stored procedure. I'm getting the error:
Result consisted of more than one row.
Here is my stored procedure:
DELIMITER $$
DROP PROCEDURE IF EXISTS `dss`.`COSTRET` $$
CREATE DEFINER=`dwadmin`@`192.168.%.%` PROCEDURE `COSTRET`( TDATE DATE)
BEGIN
DECLARE done INT DEFAULT 0;
DECLARE ls_id VARCHAR(8);
DECLARE ld_cost DECIMAL(10,4);
DECLARE ld_retail DECIMAL(10,4);
DECLARE cur1 CURSOR FOR SELECT DISTINCT `id` FROM `prod_performance` WHERE `psc_week` = TDATE;
DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done = 1;
-- Get the Cost
CREATE TEMPORARY TABLE IF NOT EXISTS `prod_itemcost`
SELECT DISTINCTROW `itemcode` ID, `mlist` COST
FROM (SELECT `itemcode`, `pceffdate`, `mlist`
FROM `purchcost` a
where `pceffdate` = (SELECT MAX(z.`pceffdate`) FROM `purchcost` z WHERE z.`itemcode` = a.`itemcode`
AND z.`pceffdate` <= TDATE)) tb
ORDER BY `itemcode`;
OPEN cur1;
REPEAT
FETCH cur1 INTO ls_id;
IF NOT done THEN
SELECT DISTINCTROW `cost` INTO ld_cost FROM `prod_itemcost` WHERE id = ls_id;
UPDATE LOW_PRIORITY `prod_performance` SET `current_cost` = ld_cost WHERE `psc_week` = TDATE and `id` = ls_id;
END IF;
UNTIL done END REPEAT;
CLOSE cur1;
-- Destroy Temporary Tables
DROP TEMPORARY TABLES IF EXISTS `prod_itemcost`;
END $$
DELIMITER ;
Any solutions and recommendations are much appreciated!
I'd say the problem is here :
SELECT DISTINCTROW `cost` INTO ld_cost FROM `prod_itemcost` WHERE id = ls_id;
and caused by this returning more than one row.
How you solve it depends on your requirements. Does the existence of multiple rows imply the database is in need of some cleaning, for example? Or should you be taking the first value of 'cost', or perhaps the sum of all 'cost' for id = ls_id?
Edit :
Your INTO clause is attempting to write multiple rows to a single variable. Looking at your SQL, I'd say the underlying problem is that your initial query to pull back just the latest cost for each ID is being hamstrung by duplicates of pceffdate. If this is the case, this SQL :
SELECT DISTINCTROW `itemcode` ID, `mlist` COST
FROM (SELECT `itemcode`, `pceffdate`, `mlist`
FROM `purchcost` a
where `pceffdate` = (SELECT MAX(z.`pceffdate`) FROM `purchcost` z WHERE z.`itemcode` = a.`itemcode`
AND z.`pceffdate` <= TDATE)) tb
will return more rows than just this :
SELECT DISTINCTROW `itemcode` ID
FROM (SELECT `itemcode`, `pceffdate`, `mlist`
FROM `purchcost` a
where `pceffdate` = (SELECT MAX(z.`pceffdate`) FROM `purchcost` z WHERE z.`itemcode` = a.`itemcode`
AND z.`pceffdate` <= TDATE)) tb
This line
SELECT MAX(z.`pceffdate`) FROM `purchcost` z WHERE z.`itemcode` = a.`itemcode`
AND z.`pceffdate` <= TDATE
has got to be the problem. It must be returning more than 1 row. So, the DBMS is trying to set multiple values to the same thing, which of course it cannot do.
Do you need something else in your WHERE clause there?
The problem is that
SELECT DISTINCTROW `itemcode` ID, `mlist` COST
could store multiple costs against each ID, and so
SELECT DISTINCTROW `cost` INTO ld_cost FROM `prod_itemcost` WHERE id = ls_id;
could return multiple rows for each id.
For example, if purchcost contained the following:
itemcode mlist pceffdate
1 10.99 10-apr-2009
1 11.99 10-apr-2009
1 9.99 09-apr-2009
Then temporary table prod_itemcost would contain:
itemcode mlist
1 10.99
1 11.99
These both being values that were in effect on the most recent pceffdate for that itemcode.
This would then cause a problem with selecting mlist into ld_cost for itemcode 1 because there are two matching values, and the scalar ld_cost can only hold one.
You really need to look at the data in purchcost. If it is possible for 1 item to have more than one entry with different mlist values for the same date/datetime, then you need to decide how that should be handled. Perhaps take the highest value, or the lowest value, or any value. Or perhaps this is an error in the data.
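The tie described above is easy to reproduce. The sketch below loads the three example rows into SQLite through Python's built-in sqlite3 module, runs the "latest effective date" filter from the procedure, and then shows one possible way out: picking the highest mlist per item with MAX. Whether highest, lowest or something else is right depends on the data, so treat the MAX choice and the cut-off date as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE purchcost (itemcode INTEGER, mlist REAL, pceffdate TEXT);
INSERT INTO purchcost VALUES
  (1, 10.99, '2009-04-10'),
  (1, 11.99, '2009-04-10'),
  (1,  9.99, '2009-04-09');
""")

tdate = "2009-04-30"  # hypothetical cut-off date, standing in for TDATE

# Rows whose effective date is the latest one on or before the cut-off.
latest_filter = """
    FROM purchcost a
    WHERE a.pceffdate = (
        SELECT MAX(z.pceffdate) FROM purchcost z
        WHERE z.itemcode = a.itemcode AND z.pceffdate <= ?
    )
"""

# Two rows share the latest date, so a scalar variable cannot hold the result.
ambiguous = conn.execute(
    "SELECT a.itemcode, a.mlist" + latest_filter + " ORDER BY a.mlist", (tdate,)
).fetchall()

# Collapsing to one row per item, here by taking the highest cost.
resolved = conn.execute(
    "SELECT a.itemcode, MAX(a.mlist)" + latest_filter + " GROUP BY a.itemcode", (tdate,)
).fetchall()
```

`ambiguous` comes back with two rows for itemcode 1, while `resolved` has exactly one row per item, which is what a SELECT ... INTO a scalar variable requires.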
There is another possibility: your parameter name "TDATE" may be the same as a table column name, in any mix of upper and lower case, such as 'tdate', 'tDate' or 'TDATE'.
So you should check that. I've hit this before.
You are selecting multiple rows into a variable that can hold only a single value; that is why the problem occurs.
For example:
DECLARE name VARCHAR(50);
select f_name into name from student;
Here name can accept only a single name, not multiple names, so the SELECT must return exactly one row.