MySQL: Keeping unique records after removing parent child relationship

In a MySQL database, I have a business unit table which maintains the hierarchy of a client's business units. Each business unit can have a parent and/or a child.
products_client_1.business_units
id parent_id
1
2 1
3 1
4 1
8 1
14 3
17 2
31 1
35 4
36 1
37 4
38 2
39 31
40 8
41 3
42 31
43
44 43
Currently, I have a customerId table which maintains the customerId at a business unit level
contacts_client_1.buid_customer_id
global_id customer_id bu_id
ABC1000033 1812130 2
ABC1000033 1812130 54
ABC1000034 4049809 2
ABC1000035 5630631 2
ABC1000082 5707052 2
ABC1000082 1111116 54
ABC1000091 5813085 2
ABC1000091 5813085 54
ABC1000093 5208477 2
ABC1000115 5045891 2
ABC1000115 5045891 54
ABC1000117 6114245 2
ABC1000117 6114247 54
ABC1000117 6114247 1
ABC1000111 1234567 38
ABC1000100 9023456 43
ABC1000100 9023457 44
Going forward, I do not want to maintain the customer ID at the individual business unit level. It should be unique for a given globalId. To get there, I want to migrate the existing customer ID data based on the following conditions:
If a globalId has a customerId for only a single BU, migrate it as is, without the bu_id.
If a globalId has customerIds for 2 BUs (they can be parent and child at any level), keep the customerId of the parent-most available BU.
Required table contacts_client_1.customer_id:
global_id customer_id
ABC1000033 1812130
ABC1000034 4049809
ABC1000035 5630631
ABC1000082 5707052
ABC1000091 5813085
ABC1000093 5208477
ABC1000100 9023456
ABC1000111 1234567
ABC1000115 5045891
ABC1000117 6114247
PS:
globalIds do not overlap across different parent-most BUs.
The business_units table is under the products_client_1 schema and the buid_customer_id table is under the contacts_client_1 schema.
The same code should be executable for different clients.
This is a one-time migration.
I need help writing the query.

I'm not sure what exactly you are going to do with your data, but the following should help:
Show only rows which have no parent for the same global_id in the buid_customer_id table:
select child.*
from contacts_client_1.buid_customer_id child
left join products_client_1.business_units bu
on bu.id = child.bu_id
left join contacts_client_1.buid_customer_id parent
on parent.global_id = child.global_id
and parent.bu_id = bu.parent_id
where parent.global_id is null
Examples:
Row (ABC1000100 9023456 43) - The bu_id (43) has no parent_id in business_units. The first LEFT JOIN finds the BU row, but bu.parent_id is NULL, so the second LEFT JOIN cannot match anything. Since all columns from the parent alias will be NULL, parent.global_id is null is TRUE and the row will be selected.
Row (ABC1000100 9023457 44) - The bu_id (44) has a parent_id (43), so the first JOIN will find a match. The second JOIN will also find a match, because a row with the parent BU and the same global_id exists in the buid_customer_id table. Thus parent.global_id is not NULL and the row won't be selected.
Row (ABC1000033 1812130 2) - The bu_id (2) has a parent_id (1). The first JOIN will find a match. But there is no row in the buid_customer_id table with bu_id = 1 and global_id = ABC1000033, so there is no match for the second JOIN. Thus parent.global_id will be NULL and the row will be selected.
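The three example rows can be checked with a small runnable sketch. It uses Python's sqlite3 module with an in-memory database purely so it runs on its own (an assumption; the question targets MySQL, where the schema prefixes products_client_1 and contacts_client_1 would be kept). Only the BUs and rows needed for the three examples are loaded.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; table names are unqualified
# because SQLite has no products_client_1 / contacts_client_1 schemas.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business_units (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE buid_customer_id (global_id TEXT, customer_id INTEGER, bu_id INTEGER);
    INSERT INTO business_units VALUES (1, NULL), (2, 1), (43, NULL), (44, 43);
    INSERT INTO buid_customer_id VALUES
        ('ABC1000100', 9023456, 43),
        ('ABC1000100', 9023457, 44),
        ('ABC1000033', 1812130, 2);
""")

# Anti-join: keep a row only if no row exists for the same global_id
# on the immediate parent BU.
ANTI_JOIN = """
    SELECT child.global_id, child.customer_id, child.bu_id
    FROM buid_customer_id child
    LEFT JOIN business_units bu ON bu.id = child.bu_id
    LEFT JOIN buid_customer_id parent
           ON parent.global_id = child.global_id
          AND parent.bu_id = bu.parent_id
    WHERE parent.global_id IS NULL
    ORDER BY child.global_id, child.bu_id
"""
kept_rows = conn.execute(ANTI_JOIN).fetchall()
print(kept_rows)
# [('ABC1000033', 1812130, 2), ('ABC1000100', 9023456, 43)]
```

The row for bu_id 44 is dropped because its parent BU 43 has a row for the same global_id; the other two rows survive.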
Now you can use this statement to copy (migrate) the data to a new table with
insert into new_table
select child.*
[..]
You can also go the other way. If you replace the LEFT JOINs with INNER JOINs and remove the WHERE clause, you will get the opposite result (all rows which are not returned by the first query). You can use it to remove all those rows from the table.
Delete all rows which have a parent row for the same global_id:
delete child
from contacts_client_1.buid_customer_id child
join products_client_1.business_units bu
on bu.id = child.bu_id
join contacts_client_1.buid_customer_id parent
on parent.global_id = child.global_id
and parent.bu_id = bu.parent_id
Now the table buid_customer_id will contain the same rows which are selected by the first query. If this data needs to be in another table - just rename it. Then you can copy global_id and customer_id with
insert into customer_id (global_id, customer_id)
select global_id, customer_id
from new_table
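The queries in this answer compare each row only with its immediate parent BU, while the question asks for the parent-most available BU "at any level". A possible alternative, sketched here and not part of the original answer, is to compute each BU's depth with a recursive common table expression and keep the row with the smallest depth per global_id. Assumptions: recursive CTEs are available (MySQL 8.0 or later; SQLite through Python's sqlite3 is used here only to make the sketch runnable), BUs missing from business_units (bu_id 54 in the sample) rank last via the hypothetical sentinel depth 1000000, and all BUs of one global_id lie on a single ancestor chain, so ties do not occur.

```python
import sqlite3

BUSINESS_UNITS = [(1, None), (2, 1), (3, 1), (4, 1), (8, 1), (14, 3), (17, 2),
                  (31, 1), (35, 4), (36, 1), (37, 4), (38, 2), (39, 31),
                  (40, 8), (41, 3), (42, 31), (43, None), (44, 43)]
CUSTOMER_IDS = [
    ("ABC1000033", 1812130, 2), ("ABC1000033", 1812130, 54),
    ("ABC1000034", 4049809, 2), ("ABC1000035", 5630631, 2),
    ("ABC1000082", 5707052, 2), ("ABC1000082", 1111116, 54),
    ("ABC1000091", 5813085, 2), ("ABC1000091", 5813085, 54),
    ("ABC1000093", 5208477, 2), ("ABC1000115", 5045891, 2),
    ("ABC1000115", 5045891, 54), ("ABC1000117", 6114245, 2),
    ("ABC1000117", 6114247, 54), ("ABC1000117", 6114247, 1),
    ("ABC1000111", 1234567, 38), ("ABC1000100", 9023456, 43),
    ("ABC1000100", 9023457, 44),
]

# Depth 0 = BU without a parent; each child is one level deeper.
# Rows whose BU is unknown get a large sentinel depth (an assumption).
PARENT_MOST = """
WITH RECURSIVE bu_depth (id, depth) AS (
    SELECT id, 0 FROM business_units WHERE parent_id IS NULL
    UNION ALL
    SELECT bu.id, d.depth + 1
    FROM business_units bu
    JOIN bu_depth d ON bu.parent_id = d.id
),
scored AS (
    SELECT c.global_id, c.customer_id, COALESCE(d.depth, 1000000) AS depth
    FROM buid_customer_id c
    LEFT JOIN bu_depth d ON d.id = c.bu_id
)
SELECT s.global_id, s.customer_id
FROM scored s
JOIN (SELECT global_id, MIN(depth) AS depth FROM scored GROUP BY global_id) m
  ON m.global_id = s.global_id AND m.depth = s.depth
ORDER BY s.global_id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE business_units (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.execute("CREATE TABLE buid_customer_id (global_id TEXT, customer_id INTEGER, bu_id INTEGER)")
conn.executemany("INSERT INTO business_units VALUES (?, ?)", BUSINESS_UNITS)
conn.executemany("INSERT INTO buid_customer_id VALUES (?, ?, ?)", CUSTOMER_IDS)

migrated = conn.execute(PARENT_MOST).fetchall()
for row in migrated:
    print(row)
```

On the question's sample data this yields one row per global_id, matching the required customer_id table. If two BUs of the same global_id could share the minimum depth (siblings, for example), the final join would return both rows and an extra tie-break rule would be needed.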

Related

How to add a tag based on a column value

I'm trying to join two tables and select certain columns to display in the output including a 'flag' if a certain transaction amount is greater than or equal to 100. The flag would return a 1 if it is, else null.
I thought I could achieve this using a CASE in my SELECT but it only returns one record every time since it returns the first record that meets this condition. How do I just create this 'FLAG' column during my join easily?
SELECT payment_id, amount, type,
CASE
WHEN amount >= 100 THEN 1
ELSE NULL
END AS flag
FROM trans JOIN customers ON (user_id = cust_id)
JOIN bank ON (trans.bank = bank.id)
WHERE (error is false)
I expect an output such as:
payment_id amount type flag
1 81 3 NULL
2 104 2 1
3 150 2 1
4 234 1 1
However, I'm only getting the first record such as:
payment_id amount type flag
2 104 2 1
I tried your table structure locally and the query works as expected.
One thing I need to know: which table contains the error column?
If I comment out the WHERE condition, it works fine.
If you're getting fewer rows than you expect, it's due to one of two things:
1. Join condition
You're doing INNER JOINs to the customers and bank tables. If you have 4 source rows in your trans table, but only one row that matches in your customers table (condition user_id = cust_id), then only one row will be returned.
The same goes for the subsequent join to your bank table. If a transaction references a bank that is not defined in the bank table, you won't see a record for that row.
2. WHERE clause
You won't see any rows that don't meet the conditions specified here.
It's probably #1: check whether the rows with payment_id IN (1,3,4) have corresponding cust_id values in the customers table and corresponding id values in the bank table.
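To see the effect of the join condition in isolation, here is a minimal sketch using Python's sqlite3 module. The column layout of trans (user_id, bank, error) and the sample customers/bank rows are assumptions made up for illustration; the question does not show them. Only one customer matches, which reproduces the single-row result, and a LEFT JOIN diagnostic lists the payments the INNER JOINs drop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY);
    CREATE TABLE bank (id INTEGER PRIMARY KEY);
    CREATE TABLE trans (payment_id INTEGER, amount INTEGER, type INTEGER,
                        user_id INTEGER, bank INTEGER, error INTEGER);
    INSERT INTO customers VALUES (11);   -- hypothetical: only user 11 exists
    INSERT INTO bank VALUES (1);
    INSERT INTO trans VALUES
        (1,  81, 3, 10, 1, 0),
        (2, 104, 2, 11, 1, 0),
        (3, 150, 2, 12, 1, 0),
        (4, 234, 1, 13, 1, 0);
""")

# The CASE expression only computes a column; it never removes rows.
FLAG = "CASE WHEN t.amount >= 100 THEN 1 END AS flag"

def flagged(join_kind):
    """Run the flag query with either 'JOIN' or 'LEFT JOIN'."""
    sql = f"""
        SELECT t.payment_id, t.amount, t.type, {FLAG}
        FROM trans t
        {join_kind} customers c ON t.user_id = c.cust_id
        {join_kind} bank b ON t.bank = b.id
        WHERE t.error = 0
        ORDER BY t.payment_id
    """
    return conn.execute(sql).fetchall()

inner_rows = flagged("JOIN")       # rows without a customer or bank vanish
left_rows = flagged("LEFT JOIN")   # every trans row is kept

# Diagnostic: which payments lack a matching customer or bank?
dropped = [r[0] for r in conn.execute("""
    SELECT t.payment_id
    FROM trans t
    LEFT JOIN customers c ON t.user_id = c.cust_id
    LEFT JOIN bank b ON t.bank = b.id
    WHERE c.cust_id IS NULL OR b.id IS NULL
    ORDER BY t.payment_id
""")]

print(inner_rows)  # [(2, 104, 2, 1)]
print(left_rows)   # [(1, 81, 3, None), (2, 104, 2, 1), (3, 150, 2, 1), (4, 234, 1, 1)]
print(dropped)     # [1, 3, 4]
```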

Get related table's related record in mysql

I have three tables ClaimHeader, ResClaim and ResActivity. ClaimHeader table's primary key is used as foreign key in ResClaim table and ResClaim table's primary key is used as foreign key in ResActivity table.
Below is my tables
ClaimHeader:
HeaderID FileID FileName
1 fileid1 file1.xml
2 fileid2 file2.xml
3 fileid3 file3.xml
4 fileid4 file4.xml
--------------------------------------------------
ResClaim:
ClaimPKID HeaderPKID ClaimDateSettlement
1 1 2017-04-08
2 1 2017-03-08
3 2 2017-04-10
4 3 2017-05-08
--------------------------------------------------
ResActivity:
ActivityPKID ClaimPKID ActivityNet
1 1 400
2 2 3000
3 2 2030
4 3 5000
ResClaim table uses HeaderPKID as the foreign key from ClaimHeader table and ResActivity table uses ClaimPKID as the foreign key from ResClaim table
My scenario is that I should display related records from all three tables.
For example, I want to display FileID from the ClaimHeader table, the total claims count from the ResClaim table and the sum of ActivityNet from the ResActivity table for the matching condition.
My expected result would be:
FileID | Total Claim(s) | ActivityNet
--------------------------------------------------
fileid2 | 1 | 5030 (3000+2030)
--------------------------------------------------
I have tried below query:
SELECT
`ClaimHeader`.*,
count(ResClaim.ClaimPKID) as claims,
sum(ResActivity.ActivityNet) as net
FROM
`ClaimHeader`
RIGHT OUTER JOIN `ResClaim`
ON ResClaim.ClaimPKID = ClaimHeader.HeaderID
RIGHT OUTER JOIN `ResActivity`
ON ResClaim.ClaimPKID = ResActivity.ActivityPKID
The above query is not returning related record values - instead of that it's returning sum of all the columns and count from ResActivity and ResClaim table.
It is still not completely clear what you are asking. It seems you either want one result row per claim header or one result row per file. So group by the column in question.
By joining all records you get of course claim headers and claims multifold, e.g. with
claim headers: head1, head2
claims for head1: claim10, claim11
claims for head2: claim20, claim21
actions for claim 10: action100, action101
actions for claim 11: action110, action111
actions for claim 20: action200, action201
you get this intermediate result from the joins:
header | claim | action
-------+---------+----------
head1 | claim10 | action100
head1 | claim10 | action101
head1 | claim11 | action110
head1 | claim11 | action111
head2 | claim20 | action200
...
Now by grouping per header, you get one aggregated result row for head1 and one for head2. As the intermediate result contains one record per action, you can easily sum them. Claims are different: there are four records for head1, so count(*) or count(claimpkid) gives 4 (count(claimpkid) counts all claimpkid values that are not null, which is always the case in this example). What you want instead is to count distinct claims (namely two here: claim10 and claim11). So use COUNT(DISTINCT ...).
select
h.headerid,
count(distinct c.claimpkid) as claims,
coalesce(sum(a.activitynet), 0) as net
from claimheader h
left outer join resclaim c on c.headerpkid = h.headerid
left outer join resactivity a on a.claimpkid = c.claimpkid
group by h.headerid
order by h.headerid;
I am using left outer joins (and COALESCE) for the case that a header has no claims or a claim has no actions, for which we would show zeros rather than removing them from the results.
As mentioned, if you want this per file, then select fileid and group by it instead of headerid.
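The difference between COUNT and COUNT(DISTINCT ...) can be checked against the question's sample rows. This is a sketch using Python's sqlite3 module (an assumption for runnability; the aggregate syntax is the same in MySQL). It joins on the foreign keys as the question describes them: ResClaim.HeaderPKID references ClaimHeader, and ResActivity.ClaimPKID references ResClaim. The joined_rows column is added only to show the inflated plain count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ClaimHeader (HeaderID INTEGER, FileID TEXT, FileName TEXT);
    CREATE TABLE ResClaim (ClaimPKID INTEGER, HeaderPKID INTEGER, ClaimDateSettlement TEXT);
    CREATE TABLE ResActivity (ActivityPKID INTEGER, ClaimPKID INTEGER, ActivityNet INTEGER);
    INSERT INTO ClaimHeader VALUES
        (1, 'fileid1', 'file1.xml'), (2, 'fileid2', 'file2.xml'),
        (3, 'fileid3', 'file3.xml'), (4, 'fileid4', 'file4.xml');
    INSERT INTO ResClaim VALUES
        (1, 1, '2017-04-08'), (2, 1, '2017-03-08'),
        (3, 2, '2017-04-10'), (4, 3, '2017-05-08');
    INSERT INTO ResActivity VALUES (1, 1, 400), (2, 2, 3000), (3, 2, 2030), (4, 3, 5000);
""")

# Header 1 has two claims but three joined rows (claim 2 has two activities),
# so COUNT(c.ClaimPKID) over-counts while COUNT(DISTINCT ...) does not.
per_header = conn.execute("""
    SELECT h.HeaderID,
           COUNT(DISTINCT c.ClaimPKID)     AS claims,
           COUNT(c.ClaimPKID)              AS joined_rows,
           COALESCE(SUM(a.ActivityNet), 0) AS net
    FROM ClaimHeader h
    LEFT JOIN ResClaim c    ON c.HeaderPKID = h.HeaderID
    LEFT JOIN ResActivity a ON a.ClaimPKID = c.ClaimPKID
    GROUP BY h.HeaderID
    ORDER BY h.HeaderID
""").fetchall()

for row in per_header:
    print(row)
# (1, 2, 3, 5430)
# (2, 1, 1, 5000)
# (3, 1, 1, 0)
# (4, 0, 0, 0)
```

Headers 3 and 4 stay in the result with zeros because of the LEFT JOINs and COALESCE.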

Select multiple rows by choosing two indexed columns

I am trying to select ALL productIds from the product_parameter table by specifying conditions on the same columns. Here is a working example:
SELECT PROD.productId FROM product PROD
LEFT JOIN product_parameter PARAM ON PROD.`productId` = PARAM.`productId`
WHERE PROD.`publish` = 1 AND ( PARAM.`parameterId` = "264" AND PARAM.`valueId` = 10 )
It selects several products
productId parameterId valueId
3328 264 10
3514 264 10
3513 264 10
3512 264 10
When I display all parameters for productId 3328 or 3514, I see they have parameterId 247 and valueId 60 IN COMMON. Now here is the tricky part: I would like to create a query which selects those two productIds by adding another condition like
AND ( PARAM.`parameterId` = "247" AND PARAM.`valueId` = 60 )
But logically it cannot work, because I am trying to select a row which would have to contain two valueIds (which makes no sense). I have been looking into several product parameter systems; for example, joomshopping uses a separate column for every new parameter, which in my case would create 600+ columns for each row.
VERY IMPORTANT: I know I can put OR between every parameter GROUP, but this would give me all products that match any one of those conditions. My goal is to select EXACTLY the productIds which meet all of my conditions. Any IDEAS?
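No answer is recorded here. One common way to express "a product must match all of these parameter/value pairs" is to filter with OR and then require, per product, that the number of distinct matched parameters equals the number of pairs. The sketch below assumes a minimal table layout, a made-up extra row (product 3513 with parameter 247 and value 61) and Python's sqlite3 module for runnability; none of this comes from the question beyond the four product ids and the two pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (productId INTEGER PRIMARY KEY, publish INTEGER);
    CREATE TABLE product_parameter (productId INTEGER, parameterId INTEGER, valueId INTEGER);
    INSERT INTO product VALUES (3328, 1), (3514, 1), (3513, 1), (3512, 1);
    INSERT INTO product_parameter VALUES
        (3328, 264, 10), (3514, 264, 10), (3513, 264, 10), (3512, 264, 10),
        (3328, 247, 60), (3514, 247, 60),
        (3513, 247, 61);   -- hypothetical row: same parameter, different value
""")

# OR keeps every row that matches any wanted pair; HAVING then keeps only
# products for which both wanted parameters were matched.
matching = [r[0] for r in conn.execute("""
    SELECT p.productId
    FROM product p
    JOIN product_parameter pp ON pp.productId = p.productId
    WHERE p.publish = 1
      AND (   (pp.parameterId = 264 AND pp.valueId = 10)
           OR (pp.parameterId = 247 AND pp.valueId = 60))
    GROUP BY p.productId
    HAVING COUNT(DISTINCT pp.parameterId) = 2
    ORDER BY p.productId
""")]

print(matching)  # [3328, 3514]
```

With more pairs, the OR list grows and the constant in HAVING is set to the number of pairs.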

MYSQL SELECT LOOP dynamically

I'm a newbie in MySQL and not sure whether it is possible to have a loop statement in a SELECT.
My Table:
ID user_id parent
1 13 2
2 14 2
3 15 13
4 16 13
5 17 14
6 18 14
7 19 15
8 20 15
The parent with value 2 has no parent; it is something like the root.
user_id is equivalent to the child.
So if I do
SELECT * FROM my_table WHERE parent = 2
basically output is:
ID user_id parent
1 13 2
2 14 2
Is there a way I can get the other children? Should I use a subquery? If 2 is selected, this image shows what I want to achieve: http://awesomescreenshot.com/04b2y7qfe2
Here's how you can connect your parent users with their child users:
This will display only records that have parents:
SELECT c.user_id, c.parent, p.user_id, p.parent
FROM my_table c, my_table p
WHERE c.parent = p.user_id
Or, using a LEFT JOIN, display records whether or not they have parents (including the ones with 2 as parent ID):
SELECT c.user_id as ChildID, c.parent as ChildParentID, p.user_id as ParentID, p.parent as ParentOfParentID
FROM my_table c
LEFT JOIN my_table p
ON c.parent = p.user_id
This looks like what is often called an adjacency tree list, where a tree structure is defined in list form by specifying IDs and parent IDs, with a given parent ID value indicating the root node of the tree (in this case, the value 2).
Amir's answer will give you the immediate children of each node. If you'd like to retrieve entire branches of the tree, from a given node, you could look at Modified Preorder Tree Traversal (MPTT). You just fetch all rows where the left field value falls between the root node's left and right values. The key drawback with this method is that on average 50% of the records in the table need to be updated when adding or removing nodes from the tree. If you've got a big table, that can be a bit of a performance hit.
Unfortunately, as far as I know, MySQL versions before 8.0 have no way of performing recursive queries, which would be another way of solving this problem. Some other database systems have this functionality, and MySQL 8.0 and later support it through recursive common table expressions (WITH RECURSIVE).
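Where recursive common table expressions are available (MySQL 8.0 and later, to my knowledge, which postdates this answer), the whole branch under a node can be fetched in one statement. The sketch below runs the idea against the question's rows using Python's sqlite3 module, chosen only so it is self-contained; the CTE name tree and the level column are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ID INTEGER, user_id INTEGER, parent INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 13, 2), (2, 14, 2), (3, 15, 13), (4, 16, 13),
     (5, 17, 14), (6, 18, 14), (7, 19, 15), (8, 20, 15)],
)

# Anchor: direct children of the chosen node (2).
# Recursive step: children of rows already collected.
descendants = conn.execute("""
    WITH RECURSIVE tree (user_id, parent, level) AS (
        SELECT user_id, parent, 1 FROM my_table WHERE parent = ?
        UNION ALL
        SELECT t.user_id, t.parent, tree.level + 1
        FROM my_table t
        JOIN tree ON t.parent = tree.user_id
    )
    SELECT user_id, parent, level FROM tree
    ORDER BY level, user_id
""", (2,)).fetchall()

for row in descendants:
    print(row)
# (13, 2, 1)
# (14, 2, 1)
# (15, 13, 2)
# (16, 13, 2)
# (17, 14, 2)
# (18, 14, 2)
# (19, 15, 3)
# (20, 15, 3)
```

Unlike MPTT, this needs no extra left/right columns, at the cost of the database walking the tree at query time.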

Joining values of columns?

I have a query that fetches the list of user IDs and their corresponding user names on a board but from another table also gets a column that has a value (a name) on the row corresponding to the user ID if said user has changed their name. Using an outer join I got the three nicely displayed as in the following example of a few of the results:
member_id name dname_current
1 Blablabla1 blablabla2
2 Bla4444
3 RevZ
5 Herpaderp42
6 Lalalala
7 Kaboom
14 testtesttest21 Formula21
15 Alex Ethan
16 Bob Radio3
The SQL query to get the three columns is as follows:
SELECT
data_members.member_id,
data_members.name,
data_dnames_change.dname_current
FROM data_members LEFT OUTER JOIN data_dnames_change
ON data_members.member_id = data_dnames_change.dname_member_id
GROUP BY data_members.member_id
Is there a way to display this so that it merges the values which exist in the 'dname_current' column of that other table into the 'name' column, replacing any value that's already in the corresponding row of that column?
COALESCE() returns the first non-null value, so you can do the following to prefer dname_current over data_members.name unless it is NULL:
SELECT
data_members.member_id,
COALESCE(data_dnames_change.dname_current, data_members.name) AS name
FROM data_members LEFT OUTER JOIN data_dnames_change
ON data_members.member_id = data_dnames_change.dname_member_id
GROUP BY data_members.member_id
Should return:
member_id name
1 blablabla2
2 Bla4444
3 RevZ
5 Herpaderp42
6 Lalalala
7 Kaboom
14 Formula21
15 Ethan
16 Radio3
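A quick way to check the COALESCE behaviour is to run it on a few of the sample members. This sketch uses Python's sqlite3 module for runnability and assumes at most one row per member in data_dnames_change, so the GROUP BY is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data_members (member_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE data_dnames_change (dname_member_id INTEGER, dname_current TEXT);
    INSERT INTO data_members VALUES
        (1, 'Blablabla1'), (2, 'Bla4444'), (14, 'testtesttest21');
    INSERT INTO data_dnames_change VALUES (1, 'blablabla2'), (14, 'Formula21');
""")

# Members without a change row get NULL from the LEFT JOIN,
# so COALESCE falls back to the original name.
merged = conn.execute("""
    SELECT m.member_id,
           COALESCE(c.dname_current, m.name) AS name
    FROM data_members m
    LEFT JOIN data_dnames_change c ON m.member_id = c.dname_member_id
    ORDER BY m.member_id
""").fetchall()

print(merged)
# [(1, 'blablabla2'), (2, 'Bla4444'), (14, 'Formula21')]
```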