I'm new to MySQL and not sure whether it's possible to have a loop statement in a SELECT.
My Table:
ID user_id parent
1 13 2
2 14 2
3 15 13
4 16 13
5 17 14
6 18 14
7 19 15
8 20 15
A parent value of 2 means the row has no parent of its own; 2 acts as the root.
user_id is equivalent to the child.
So if I run
SELECT * FROM my_table WHERE parent = 2
the output is basically:
ID user_id parent
1 13 2
2 14 2
Is there a way I can get the other children as well? Should I use a subquery? If 2 is selected, this is the result I want to achieve: http://awesomescreenshot.com/04b2y7qfe2
Here's how you can connect parent users with their child users:
This will display only the records that have parents:
SELECT c.user_id, c.parent, p.user_id, p.parent
FROM my_table c, my_table p
WHERE c.parent = p.user_id
Or, using a LEFT JOIN, display all records, whether or not they have a parent row (those without are the ones with 2 as their parent ID):
SELECT c.user_id as ChildID, c.parent as ChildParentID, p.user_id as ParentID, p.parent as ParentOfParentID
FROM my_table c
LEFT JOIN my_table p
ON c.parent = p.user_id
This looks like what is often called an adjacency tree list, where a tree structure is defined in list form by specifying IDs and parent IDs, with a given parent ID value indicating the root node of the tree (in this case, the value 2).
Amir's answer will give you the immediate children of each node. If you'd like to retrieve entire branches of the tree, from a given node, you could look at Modified Preorder Tree Traversal (MPTT). You just fetch all rows where the left field value falls between the root node's left and right values. The key drawback with this method is that on average 50% of the records in the table need to be updated when adding or removing nodes from the tree. If you've got a big table, that can be a bit of a performance hit.
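As a rough sketch of the MPTT idea, the snippet below stores the question's tree with hypothetical lft and rgt columns and fetches a whole branch with a single range condition. The column names, the hand-assigned numbering, the extra row for the root (user_id 2) and the use of SQLite through Python's sqlite3 module in place of MySQL are all assumptions made for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; lft/rgt are hypothetical columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (user_id INT PRIMARY KEY, lft INT, rgt INT)")
# lft/rgt come from walking the tree depth-first and numbering each node
# once on the way down (lft) and once on the way back up (rgt).
conn.executemany(
    "INSERT INTO tree VALUES (?, ?, ?)",
    [(2, 1, 18), (13, 2, 11), (15, 3, 8), (19, 4, 5), (20, 6, 7),
     (16, 9, 10), (14, 12, 17), (17, 13, 14), (18, 15, 16)],
)

def branch(conn, root_id):
    """Return the user_ids below root_id, in depth-first order."""
    rows = conn.execute(
        """
        SELECT node.user_id
        FROM tree AS node
        JOIN tree AS root ON node.lft > root.lft AND node.lft < root.rgt
        WHERE root.user_id = ?
        ORDER BY node.lft
        """,
        (root_id,),
    )
    return [r[0] for r in rows]

print(branch(conn, 2))   # every node below the root
print(branch(conn, 13))  # only the branch under user 13
```

Reading a branch is cheap, but inserting a node means shifting the lft and rgt values of every row to its right, which is the update cost mentioned above.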
Unfortunately, as far as I know, MySQL doesn't have a way of performing recursive queries, which would be another way of solving this problem. Some other database systems offer this functionality, but not MySQL (recursive common table expressions were only added in MySQL 8.0).
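For reference, a recursive query over this table could look like the sketch below. It uses the standard WITH RECURSIVE syntax, which MySQL accepts from version 8.0 onwards. It is run here against SQLite through Python's sqlite3 module purely so that the example is self-contained, and the depth column is an addition for illustration rather than part of the original table.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8.0+; the data is the question's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INT PRIMARY KEY, user_id INT, parent INT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 13, 2), (2, 14, 2), (3, 15, 13), (4, 16, 13),
     (5, 17, 14), (6, 18, 14), (7, 19, 15), (8, 20, 15)],
)

rows = conn.execute(
    """
    WITH RECURSIVE descendants (user_id, parent, depth) AS (
        -- anchor: direct children of the chosen parent
        SELECT user_id, parent, 1 FROM my_table WHERE parent = ?
        UNION ALL
        -- recursive step: children of rows found so far
        SELECT c.user_id, c.parent, d.depth + 1
        FROM my_table AS c
        JOIN descendants AS d ON c.parent = d.user_id
    )
    SELECT user_id, parent, depth FROM descendants
    ORDER BY depth, user_id
    """,
    (2,),
).fetchall()

for user_id, parent, depth in rows:
    print(user_id, parent, depth)
```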
In a MySQL database, I have a business unit table which maintains the hierarchy of a client's business units. Each business unit can have a parent and/or children.
products_client_1.business_units
id parent_id
1
2 1
3 1
4 1
8 1
14 3
17 2
31 1
35 4
36 1
37 4
38 2
39 31
40 8
41 3
42 31
43
44 43
Currently, I have a customerId table which maintains the customerId at a business unit level
contacts_client_1.buid_customer_id
global_id customer_id bu_id
ABC1000033 1812130 2
ABC1000033 1812130 54
ABC1000034 4049809 2
ABC1000035 5630631 2
ABC1000082 5707052 2
ABC1000082 1111116 54
ABC1000091 5813085 2
ABC1000091 5813085 54
ABC1000093 5208477 2
ABC1000115 5045891 2
ABC1000115 5045891 54
ABC1000117 6114245 2
ABC1000117 6114247 54
ABC1000117 6114247 1
ABC1000111 1234567 38
ABC1000100 9023456 43
ABC1000100 9023457 44
Going forward, I do not want to maintain the customer ID at the individual business unit level. It should be unique for a given globalId. To get there, I want to migrate the existing customer ID data based on the following conditions:
If a globalId has a customerId for only a single BU, migrate it as it is, without the bu_id.
If a globalId has customerIds for 2 BUs (they can be parent and child at any level), keep the customerId of the topmost available BU.
required table contacts_client_1.customer_id
global_id customer_id
ABC1000033 1812130
ABC1000034 4049809
ABC1000035 5630631
ABC1000082 5707052
ABC1000091 5813085
ABC1000093 5208477
ABC1000100 9023456
ABC1000111 1234567
ABC1000115 5045891
ABC1000117 6114247
PS:
globalIds do not overlap between different topmost BUs.
business_unit table is under products_client_1 schema and buid_customer_id table is under contacts_client_1 schema.
The same code should be executable for different clients.
This is a one time migration.
Need help in writing the query.
I'm not sure what exactly you are going to do with your data, but the following should help:
Show only rows which have no parent for the same global_id in the buid_customer_id table:
select child.*
from contacts_client_1.buid_customer_id child
left join products_client_1.business_units bu
on bu.id = child.bu_id
left join contacts_client_1.buid_customer_id parent
on parent.global_id = child.global_id
and parent.bu_id = bu.parent_id
where parent.global_id is null
Examples:
Row (ABC1000100 9023456 43) - The bu_id (43) has no parent_id in business_units, so the first LEFT JOIN finds the business unit but with a NULL parent_id, and the second LEFT JOIN therefore finds no match. Since all columns from the parent table will be NULL, parent.global_id is null is TRUE and the row will be selected.
Row (ABC1000100 9023457 44) - The bu_id (44) has a parent_id (43), so the first JOIN will find a match. The second JOIN will also find a match, because a row with the parent BU and the same global_id exists in the buid_customer_id table. Thus parent.global_id is not NULL and the row won't be selected.
Row (ABC1000033 1812130 2) - The bu_id (2) has a parent_id (1). The first JOIN will find a match. But there is no row in the buid_customer_id table with bu_id = 1 and global_id = ABC1000033, so there is no match for the second JOIN. Thus parent.global_id will be NULL and the row will be selected.
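To check these three cases, here is a small sketch that runs the query against just the rows discussed above. It assumes SQLite via Python's sqlite3 module as a stand-in for MySQL and drops the schema prefixes; everything else follows the query as written.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL and schema prefixes are omitted.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business_units (id INT PRIMARY KEY, parent_id INT);
INSERT INTO business_units VALUES (1, NULL), (2, 1), (43, NULL), (44, 43);

CREATE TABLE buid_customer_id (global_id TEXT, customer_id INT, bu_id INT);
INSERT INTO buid_customer_id VALUES
    ('ABC1000100', 9023456, 43),
    ('ABC1000100', 9023457, 44),
    ('ABC1000033', 1812130, 2);
""")

# Keep only rows whose parent BU has no row for the same global_id.
kept = conn.execute("""
    SELECT child.global_id, child.customer_id, child.bu_id
    FROM buid_customer_id AS child
    LEFT JOIN business_units AS bu
        ON bu.id = child.bu_id
    LEFT JOIN buid_customer_id AS parent
        ON parent.global_id = child.global_id
        AND parent.bu_id = bu.parent_id
    WHERE parent.global_id IS NULL
    ORDER BY child.global_id
""").fetchall()

for row in kept:
    print(row)
```

Note that the join only looks one level up. A row whose nearest recorded ancestor is a grandparent BU would still be selected, so deeper hierarchies may need the join repeated per level.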
Now you can use this statement to copy (migrate) the data to a new table with
insert into new_table
select child.*
[..]
You can also go the other way. If you replace the LEFT JOINs with INNER JOINs and remove the WHERE clause, you will get the opposite result (all rows which are not returned by the first query). You can use it to remove all those rows from the table.
Delete all rows which have a parent row for the same global_id:
delete child
from contacts_client_1.buid_customer_id child
join products_client_1.business_units bu
on bu.id = child.bu_id
join contacts_client_1.buid_customer_id parent
on parent.global_id = child.global_id
and parent.bu_id = bu.parent_id
Now the table buid_customer_id will contain the same rows which are selected by the first query. If this data needs to be in another table - just rename it. Then you can copy global_id and customer_id with
insert into customer_id (global_id, customer_id)
select global_id, customer_id
from new_table
I have a hierarchy that I have represented as a closure table, as described by Bill Karwin. I am trying to write a query that will return the nodes sorted as a depth-first traversal. This reply would solve my problem, except that in my structure some nodes appear more than once because they have multiple parents.
My sample data looks like this:
1
- 2
-- 5
- 3
-- 5
- 4
-- 6
-- 2
--- 5
As you can see, node 2 appears twice, both as a child and a grandchild of the root. Node 5 appears twice as a grandchild of the root (each time with a different parent), and then again as a great-grandchild because its parent, node 2, is repeated.
This will set up the data as a closure table:
CREATE TABLE ancestor_descendant (
ancestor int NOT NULL,
descendant int NOT NULL,
path_length int NOT NULL
);
INSERT INTO ancestor_descendant (ancestor, descendant, path_length) VALUES
(1,1,0),(2,2,0),(3,3,0),(4,4,0),(5,5,0),(6,6,0),(1,2,1),(1,3,1),(1,4,1),
(2,5,1),(3,5,1),(4,6,1),(4,2,1),(1,5,2),(1,6,2),(1,2,2),(1,5,3),(4,5,2);
or as an adjacency list:
CREATE TABLE parent_child (
parent int NOT NULL,
child int NOT NULL
);
INSERT INTO parent_child (parent, child) VALUES
(1,2),(1,3),(1,4),(2,5),(3,5),(4,2),(4,6);
I can produce a breadth-first traversal (although 5 only appears as a grandchild once):
SELECT CONCAT(LPAD('', path_length, '-'), ' ', descendant)
FROM ancestor_descendant
WHERE ancestor = 1
ORDER BY path_length;
1
- 2
- 3
- 4
-- 5
-- 6
-- 2
--- 5
but my attempt at a depth-first traversal using breadcrumbs fails (it shows the repeated nodes only once because of the GROUP BY a.descendant):
SELECT a.descendant, GROUP_CONCAT(b.ancestor ORDER BY b.path_length DESC) AS breadcrumbs
FROM ancestor_descendant a
INNER JOIN ancestor_descendant b ON (b.descendant = a.descendant)
WHERE a.ancestor = 1
GROUP BY a.descendant
ORDER BY breadcrumbs;
1 1
2 1,1,4,1,4,1,2,2
5 1,1,4,1,4,1,3,2,3,2,5,5
3 1,3
4 1,4
6 1,4,6
Is it possible to output a depth-first traversal using a closure table representation?
Should I use an alternative representation? I can't use recursive CTEs, because I'm restricted to MySql (which doesn't implement them).
I would suggest splitting the node id into two concepts. One would be a unique id that is used for the graph properties (i.e. ancestor_descendant list). The second is what you show on output.
1
- 2
-- 5
- 3
-- 50
- 4
-- 6
-- 20
--- 51
Then create a mapping table:
Id Value
1 1
2 2
20 2
3 3
4 4
5 5
50 5
51 5
6 6
You can then get what you want by joining back to the mapping table and using the value column instead of the id column.
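A minimal sketch of that idea follows. The table name node_value and the closure rows for the new ids 20, 50 and 51 are assumptions written out by hand from the tree in the question, and SQLite (through Python's sqlite3 module) stands in for MySQL. The breadcrumbs are assembled and sorted in Python only to keep the sketch portable.

```python
import sqlite3
from collections import defaultdict

# Assumption: SQLite stands in for MySQL; node_value is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node_value (id INT PRIMARY KEY, value INT);
INSERT INTO node_value VALUES
    (1,1),(2,2),(20,2),(3,3),(4,4),(5,5),(50,5),(51,5),(6,6);

CREATE TABLE ancestor_descendant (ancestor INT, descendant INT, path_length INT);
INSERT INTO ancestor_descendant VALUES
    (1,1,0),(2,2,0),(3,3,0),(4,4,0),(5,5,0),(6,6,0),(20,20,0),(50,50,0),(51,51,0),
    (1,2,1),(1,3,1),(1,4,1),(2,5,1),(3,50,1),(4,6,1),(4,20,1),(20,51,1),
    (1,5,2),(1,50,2),(1,6,2),(1,20,2),(4,51,2),(1,51,3);
""")

# Same self-join as the breadcrumbs query, plus the mapping table.
rows = conn.execute("""
    SELECT a.descendant, m.value, b.ancestor, b.path_length
    FROM ancestor_descendant AS a
    JOIN ancestor_descendant AS b ON b.descendant = a.descendant
    JOIN node_value AS m ON m.id = a.descendant
    WHERE a.ancestor = 1
""").fetchall()

crumbs = defaultdict(list)  # node id -> [(path_length, ancestor id)]
values = {}                 # node id -> value to display
for descendant, value, ancestor, path_length in rows:
    crumbs[descendant].append((path_length, ancestor))
    values[descendant] = value

# Breadcrumb = ancestor ids from the root down to the node itself.
breadcrumbs = {
    node: tuple(anc for _, anc in sorted(pairs, reverse=True))
    for node, pairs in crumbs.items()
}

# Sorting by breadcrumb gives a depth-first (pre-order) traversal.
traversal = [
    (len(path) - 1, values[node])
    for node, path in sorted(breadcrumbs.items(), key=lambda item: item[1])
]
for depth, value in traversal:
    print(("-" * depth + " " + str(value)).strip())
```

Because every id now has exactly one parent, each node has a single breadcrumb, so grouping by descendant no longer collapses the repeated values.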
I have a table with the following structure. I need to return all rows where the district of the record immediately preceding and immediately following the row are different than the district for that row. Is this possible? I was thinking of a join on the table itself but not sure how to do it.
id | zip_code | district
__________________________
20063 10169 12
20064 10169 9
20065 10169 12
Assuming that "preceding" and "following" are in the sense of the ID column, you can do:
select *
from zip_codes z1
inner join zip_codes z2 on z1.id=z2.id + 1
inner join zip_codes z3 on z1.id=z3.id - 1
where z1.district <> z2.district and z1.district <> z3.district
Because of the inner joins, this automatically filters out the first and last rows. If you need those to count, change the joins to left outer joins and let the WHERE clause accept a missing neighbour (for example, z2.id is null or z1.district <> z2.district), since a comparison with NULL never evaluates to true.
Also, this checks whether the district differs from both neighbours. To find rows that differ from either one (as is implied in the comment), change the and in the where clause to an or. Note, however, that all three rows in your example then fit the criterion, even if there are long runs of twelves above and below them.
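The difference between the two variants can be seen in a small sketch. It assumes SQLite via Python's sqlite3 module in place of MySQL, and it adds two hypothetical rows (20062 and 20066, both district 12) around the three rows from the question.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows 20062 and 20066 are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zip_codes (id INT PRIMARY KEY, zip_code TEXT, district INT)")
conn.executemany(
    "INSERT INTO zip_codes VALUES (?, ?, ?)",
    [(20062, "10169", 12), (20063, "10169", 12), (20064, "10169", 9),
     (20065, "10169", 12), (20066, "10169", 12)],
)

QUERY = """
    SELECT z1.id
    FROM zip_codes AS z1
    JOIN zip_codes AS z2 ON z1.id = z2.id + 1   -- z2 is the preceding row
    JOIN zip_codes AS z3 ON z1.id = z3.id - 1   -- z3 is the following row
    WHERE z1.district <> z2.district {op} z1.district <> z3.district
    ORDER BY z1.id
"""

differs_from_both = [r[0] for r in conn.execute(QUERY.format(op="AND"))]
differs_from_either = [r[0] for r in conn.execute(QUERY.format(op="OR"))]

print(differs_from_both)
print(differs_from_either)
```

With AND only row 20064 qualifies. With OR its two neighbours qualify as well, while the first and last rows are dropped by the inner joins in both cases.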
I have a table like so:
categoryID categoryName
----------------------------
1 A
2 B
3 C
Now I want the user to be able to order this data according to his will. I want to remember his preferred order for future. So I thought I'd add a column order to the table above and make it of type INT and AUTO_INCREMENT. So now I get a table like this:
categoryID categoryName order
-------------------------------------
1 A 1
2 B 2
3 C 3
4 D 4
My problem is - the user now decides, to bring categoryName with order 4 (D in example above) up to 2 (above B in example above) such that the table would now look like:
categoryID categoryName order
-------------------------------------
1 A 1
2 B 3
3 C 4
4 D 2
My question is - How should I go about assigning new values to the order column when a reordering happens. Is there a way to do this without updating all rows in the table?
One approach that comes to mind is to make the column a FLOAT and give a row an order of 1.5 if I want to place it between the rows with order 1 and 2. In this case I keep losing precision as I reorder items.
EDIT:
Another is to update all rows between (m, n) where m, n are the source and destination orders respectively. But this would mean running (m-n) separate queries wouldn't it?
Edit 2:
Assuming I take the FLOAT approach, I came up with this sql to compute the order value for an item that needs to be inserted after item with id = 2 (for example).
select ((
select `order` as nextHighestOrder
from `categories`
where `order` > (
select `order` as targetOrder
from `categories`
where `categoryID`=2)
order by `order`
limit 1) + (
select `order` as targetOrder
from `categories`
where `categoryID`=2)) / 2;
This gives me 3.5 which is what I wanted to achieve.
Is there a better way to write this? Notice that select order as targetOrder from categories where categoryID=2 is executed twice.
If the number of changes is rather small, you can generate a clumsy but rather efficient UPDATE statement, provided you know the ids of the involved items:
UPDATE categories
JOIN (
SELECT 2 as categoryID, 3 as new_order
UNION ALL
SELECT 3 as categoryID, 4 as new_order
UNION ALL
SELECT 4 as categoryID, 2 as new_order) orders
USING (categoryId)
SET `order` = new_order;
or (which I like less):
UPDATE categories
SET `order` = ELT (FIND_IN_SET (categoryID, '2,3,4'),
3, 4, 2)
WHERE categoryID in (2,3,4);
UPD:
Assuming that you know the current id of the category (or its name), its old position, and its new position you can use the following query for moving a category down the list (for moving up you will have to change the between condition and new_rank computation to rank+1):
-- `rank` is quoted because RANK is a reserved word in MySQL 8.0 and later
SET @id:=2, @cur_rank:=2, @new_rank:=4;
UPDATE t1
JOIN (
SELECT categoryID, (`rank` - 1) as new_rank
FROM t1
WHERE `rank` between @cur_rank + 1 AND @new_rank
UNION ALL
SELECT @id as categoryID, @new_rank as new_rank
) as r
USING (categoryID)
SET `rank` = new_rank;
The idea with FLOAT sounds reasonable, just don't show these numbers to a user :-)
Whenever the user moves an entry up or down, you can work out which entries end up above and below it. Take their order numbers and compute the mean: that is the new order value for the entry that has been moved.
You could keep order as an integer and renumber all the items between a drag's source index and destination index. Users can't drag that far, especially with only 20-odd categories. Multi-item drags make this more complicated, however.
FLOAT is easier, but since each move takes the midpoint of two neighbours, you could very quickly run out of precision. I would write a test to check that it doesn't eventually stop working if you keep moving the 3rd item to the 2nd position over and over.
Example:
1,2,3
Move 3rd to 2nd
1,1.5,2
Move 3rd to 2nd
1,1.25,1.5
Move 3rd to 2nd
1,1.125,1.25
Do that in an Excel spreadsheet and you'll find that the gap becomes too small for floating-point numbers to represent after a few dozen iterations.
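The test suggested above could be sketched as follows. It repeats the "move 3rd to 2nd" step from the example and counts how many moves it takes until the new midpoint is no longer distinct from a neighbour. The helper names are hypothetical, and single precision is simulated with Python's struct module on the assumption that it behaves like a 4-byte MySQL FLOAT column.

```python
import struct

def as_float32(x):
    """Round a Python float (a double) to single precision and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

def moves_until_collision(store):
    """Count 'move 3rd to 2nd' steps until the midpoint equals a neighbour."""
    low, high = 1.0, 2.0  # order values of the 1st and 2nd items
    moves = 0
    while True:
        moves += 1
        mid = store((low + high) / 2)  # new order value of the moved item
        if mid == low or mid == high:
            return moves
        high = mid  # the moved item is the upper neighbour next time

print(moves_until_collision(as_float32))   # single precision
print(moves_until_collision(lambda x: x))  # double precision
```

Under these assumptions the single-precision run collides on the 24th move and the double-precision run on the 53rd, so a DOUBLE column buys some headroom but does not remove the need to renumber occasionally.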
OK, here's the same thing that @newtover suggests, but these 2 simple queries can be understood much more easily by any other developer, even an inexperienced one.
Let's say we have a table t1:
id name position
-------------------------------------
1 A 1
2 B 2
3 C 3
4 D 4
5 -E- 5
6 F 6
Let's move item 'E' with id=5 to 2nd position:
1) Increase positions for all items between the old position of item 'E' and the desired position of 'E' (positions 2, 3, 4)
UPDATE t1 SET position=position+1 WHERE position BETWEEN 2 AND 4
2) Now there is no item at position 2, so 'E' can take its place
UPDATE t1 SET position=2 WHERE id=5
Results, ordered by 'position'
id name position
-------------------------------------
1 A 1
5 -E- 2
2 B 3
3 C 4
4 D 5
6 F 6
Just 2 simple queries, no subqueries.
Restriction: column 'position' cannot be UNIQUE. But perhaps with some modifications it should work as well.
Haven't tested this on large datasets.
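The two steps generalise to moves in either direction. Below is a minimal sketch, assuming SQLite via Python's sqlite3 module instead of MySQL and a hypothetical helper named move_item; the two UPDATE statements are wrapped in one transaction so the positions never end up half-shifted.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; move_item is a hypothetical helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INT PRIMARY KEY, name TEXT, position INT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, "A", 1), (2, "B", 2), (3, "C", 3), (4, "D", 4), (5, "E", 5), (6, "F", 6)],
)

def move_item(conn, item_id, new_pos):
    """Move one row to new_pos, shifting the rows in between by one."""
    with conn:  # both updates commit together or not at all
        (old_pos,) = conn.execute(
            "SELECT position FROM t1 WHERE id = ?", (item_id,)
        ).fetchone()
        if new_pos < old_pos:    # moving up: push the rows in between down
            conn.execute(
                "UPDATE t1 SET position = position + 1 "
                "WHERE position >= ? AND position < ?",
                (new_pos, old_pos),
            )
        elif new_pos > old_pos:  # moving down: pull the rows in between up
            conn.execute(
                "UPDATE t1 SET position = position - 1 "
                "WHERE position > ? AND position <= ?",
                (old_pos, new_pos),
            )
        conn.execute("UPDATE t1 SET position = ? WHERE id = ?", (new_pos, item_id))

def names_in_order(conn):
    return [r[0] for r in conn.execute("SELECT name FROM t1 ORDER BY position")]

move_item(conn, 5, 2)  # the move from the example: 'E' goes to position 2
print(names_in_order(conn))
```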
I have coded a watchlist system. In the overview of a user's watchlist, they see a list of records. However, the list shows duplicates, even though the database contains only the exact, correct number of rows.
I've tried GROUP BY watch.watch_id and GROUP BY rec.record_id, but none of the groupings I've tried seems to remove the duplicates. I'm not sure what I'm doing wrong.
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.user_id = 1
GROUP BY watch.watch_id
LIMIT 0, 25
The watchlist table looks like this:
+----------+---------+-----------+------------+
| watch_id | user_id | record_id | watch_date |
+----------+---------+-----------+------------+
| 13 | 1 | 22 | 1314038274 |
| 14 | 1 | 25 | 1314038995 |
+----------+---------+-----------+------------+
GROUP BY does not "remove duplicates". GROUP BY allows for aggregation. If all you want is to combine duplicated rows, use SELECT DISTINCT.
If you need to combine rows that are duplicate in some columns, use GROUP BY, but you need to specify what to do with the other columns. You can either omit them (by not listing them in the SELECT clause) or aggregate them (using functions like SUM, MIN, and AVG). For example:
SELECT watch.watch_id, COUNT(rec.street_number), MAX(watch.watch_date)
... GROUP by watch.watch_id
EDIT
The OP asked for some clarification.
Consider the "view" -- all the data put together by the FROMs and JOINs and the WHEREs -- call that V. There are two things you might want to do.
First, you might have completely duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 3
3 4 5
Then simply use DISTINCT
SELECT DISTINCT * FROM V;
a b c
- - -
1 2 3
3 4 5
Or, you might have partially duplicate rows that you wish to combine:
a b c
- - -
1 2 3
1 2 6
3 4 5
Those first two rows are "the same" in some sense, but clearly different in another sense (in particular, they would not be combined by SELECT DISTINCT). You have to decide how to combine them. You could discard column c as unimportant:
SELECT DISTINCT a,b FROM V;
a b
- -
1 2
3 4
Or you could perform some kind of aggregation on them. You could add them up:
SELECT a,b, SUM(c) "tot" FROM V GROUP BY a,b;
a b tot
- - ---
1 2 9
3 4 5
You could pick the smallest value:
SELECT a,b, MIN(c) "first" FROM V GROUP BY a,b;
a b first
- - -----
1 2 3
3 4 5
Or you could take the mean (AVG), the standard deviation (STD), and any of a bunch of other functions that take a bunch of values for c and combine them into one.
What isn't really an option is just doing nothing. If you just list the ungrouped columns, the DBMS will either throw an error (Oracle does that -- the right choice, imo) or pick one value more or less at random (MySQL). But as Dr. Peart said, "When you choose not to decide, you still have made a choice."
While SELECT DISTINCT may indeed work in your case, it's important to note why what you have is not working.
You're selecting fields that are outside of the GROUP BY. Although MySQL allows this (as long as the ONLY_FULL_GROUP_BY SQL mode is not enabled), the values it returns for the non-GROUP BY fields are undefined.
If you wanted to do this with a GROUP BY try something more like the following:
SELECT watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
FROM
(
watchlist watch
LEFT OUTER JOIN records rec ON rec.record_id = watch.record_id
LEFT OUTER JOIN members usr ON rec.user_id = usr.user_id
)
WHERE watch.watch_id IN (
SELECT w.watch_id FROM watchlist w WHERE w.user_id = 1
GROUP BY w.watch_id)
LIMIT 0, 25
I would never recommend using SELECT DISTINCT; it can be really slow on big datasets.
Try using things like EXISTS.
You are grouping by watch.watch_id and you have two results, which have different watch IDs, so naturally they would not be grouped.
Also, from the results displayed, they refer to different records. That looks like a perfectly valid, expected result. If you are trying to select only distinct values, then you don't want to GROUP; you want to select distinct values:
SELECT DISTINCT ...
If you say your watchlist table is unique, then one (or both) of the other tables either (a) has duplicates, or (b) is not unique by the key you are using.
To suppress duplicates in your results, either use DISTINCT as @Laykes says, or try
GROUP BY watch.watch_date,
rec.street_number,
rec.street_name,
rec.city,
rec.state,
rec.country,
usr.username
It sort of sounds like you expect all 3 tables to be unique by their keys, though. If that is the case, you are simply masking some other problem with your SQL by trying to retrieve distinct values.
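One way to look for that underlying problem is to ask each joined table which key values occur more than once. The sketch below assumes SQLite via Python's sqlite3 module in place of MySQL, and the contents of the records table (including the accidental second copy of record 22) are invented purely to reproduce the symptom.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the records rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE watchlist (watch_id INT, user_id INT, record_id INT, watch_date INT);
INSERT INTO watchlist VALUES (13, 1, 22, 1314038274), (14, 1, 25, 1314038995);

-- Hypothetical contents: record 22 was accidentally stored twice.
CREATE TABLE records (record_id INT, user_id INT, city TEXT);
INSERT INTO records VALUES
    (22, 7, 'Springfield'), (22, 7, 'Springfield'), (25, 8, 'Shelbyville');
""")

# The join returns one row per matching records row, so watch 13 shows twice.
joined = conn.execute("""
    SELECT watch.watch_id, rec.city
    FROM watchlist AS watch
    LEFT JOIN records AS rec ON rec.record_id = watch.record_id
    WHERE watch.user_id = 1
    ORDER BY watch.watch_id
""").fetchall()

# Key values that appear more than once in the joined table.
duplicate_keys = conn.execute("""
    SELECT record_id, COUNT(*) AS copies
    FROM records
    GROUP BY record_id
    HAVING COUNT(*) > 1
""").fetchall()

print(len(joined))
print(duplicate_keys)
```

Running the same HAVING COUNT(*) > 1 check against members on user_id would cover the second join.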