MySQL recursive update based on values in the same table - mysql

I am having trouble implementing the following structure in MySQL.
Table1:
ID | Val
1 | 10
2 | 20
Table2:
ID | LeftTableType | LeftID | LeftVal | RightTableType | RightID | RightVal | Operation | Result
1 | Table1 | 1 | (10) | Table1 | 2 | (20) | + | (30)
2 | Table2 | 1 | (30) | Table1 | 2 | (20) | + | (50)
I tried to use a trigger system where an update to Table1 would update the values of Table2. Unfortunately, I needed to then update subsequent values of Table2, which caused a recursive trigger system that MySQL did not like.
I have also been looking into nested sets and tree structures. It seems like they might be what I am looking for, or at least very close.
Is there something obvious that I am missing to implement something like this? This seems like it might lead me to a messy mixture of cursors, recursion, triggers, procedures, and tree structures.
Any hints would be greatly appreciated!
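One way to picture the structure without recursive triggers is to treat each Table2 row as an expression and recompute results on demand, resolving references first. The following is only a minimal sketch in Python, not MySQL; the dictionaries, the `OPERATIONS` map and the function names are hypothetical stand-ins for the tables above, and it assumes the references never form a cycle.

```python
import operator

# Hypothetical in-memory stand-ins for Table1 and Table2 from the question.
table1 = {1: 10, 2: 20}
# Each Table2 row: (left_table, left_id, right_table, right_id, operation)
table2 = {
    1: ("Table1", 1, "Table1", 2, "+"),
    2: ("Table2", 1, "Table1", 2, "+"),
}
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def resolve(table, ref_id, cache):
    """Look up a plain value in Table1, or evaluate the referenced Table2 row."""
    if table == "Table1":
        return table1[ref_id]
    return evaluate(ref_id, cache)


def evaluate(row_id, cache):
    """Return the Result of one Table2 row, computing its operands first."""
    if row_id not in cache:
        left_table, left_id, right_table, right_id, op = table2[row_id]
        left = resolve(left_table, left_id, cache)
        right = resolve(right_table, right_id, cache)
        cache[row_id] = OPERATIONS[op](left, right)
    return cache[row_id]


def recompute_all():
    """Recompute every Table2 result; the cache avoids repeated work."""
    cache = {}
    return {row_id: evaluate(row_id, cache) for row_id in table2}


results = recompute_all()
```

With the sample data, `results` holds 30 for row 1 and 50 for row 2, matching the values in parentheses above. After changing a Table1 value, calling `recompute_all()` again refreshes every dependent row in one pass.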

Related

Splitting a cell in mySQL into multiple rows while keeping the same "ID"

In my table I have two columns "sku" and "fitment". The sku represents a part and the fitment represents all the vehicles this part will fit on. The problem is, in the fitment cells, there could be up to 20 vehicles in there, separated by ^^. For example
**sku -- fitment**
part1 -- Vehicle 1 information ^^ Vehicle 2 information ^^ Vehicle 3 etc
I am looking to split the cells in the fitment column, so it would look like this:
**sku -- fitment**
part1 -- Vehicle 1 information
part1 -- Vehicle 2 information
part1 -- Vehicle 3 information
Is this possible to do? And if so, would a MySQL db be able to handle hundreds of thousands of items "splitting" like this? I imagine it would turn my db of around 250k rows into about 20 million rows. Any help is appreciated!
Also a little more background, this is going to be used for a drill down search function so I would be able to match up parts to vehicles (year, make, model, etc) so if you have a better solution, I am all ears.
Thanks
Possible duplicate of this: Split value from one field to two
Unfortunately, MySQL does not feature a split string function. As the link above indicates, there are user-defined split functions.
A more verbose version to fetch the data can be the following:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(fitment, '^^', 1), '^^', -1) AS fitmentvehicle1,
       SUBSTRING_INDEX(SUBSTRING_INDEX(fitment, '^^', 2), '^^', -1) AS fitmentvehicle2,
       -- ... repeat for each position up to n ...
       SUBSTRING_INDEX(SUBSTRING_INDEX(fitment, '^^', n), '^^', -1) AS fitmentvehiclen
FROM table_name;
Since your requirement is to retrieve the data in a normalized format (i.e. not separated by ^^), it is better to store it that way in the first place. As for the growth in table size, you might want to look into archiving older data and deleting it from the table.
Also, consider partitioning your table with a strategy that fits your requirement. It is easier to archive and truncate a whole partition than to delete row by row.
E.g.
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table (user_id INT NOT NULL PRIMARY KEY,stuff VARCHAR(50) NOT NULL);
INSERT INTO my_table VALUES (101,'1,2,3'),(102,'3,4'),(103,'4,5,6');
SELECT *
FROM my_table;
+---------+-------+
| user_id | stuff |
+---------+-------+
| 101 | 1,2,3 |
| 102 | 3,4 |
| 103 | 4,5,6 |
+---------+-------+
SELECT * FROM ints;
+---+
| i |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+
SELECT DISTINCT user_id
, SUBSTRING_INDEX(SUBSTRING_INDEX(stuff,',',i2.i*10+i1.i+1),',',-1) x
FROM my_table
, ints i1
, ints i2
ORDER
BY user_id,x;
+---------+---+
| user_id | x |
+---------+---+
| 101 | 1 |
| 101 | 2 |
| 101 | 3 |
| 102 | 3 |
| 102 | 4 |
| 103 | 4 |
| 103 | 5 |
| 103 | 6 |
+---------+---+
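To see why the query above needs DISTINCT, it can help to emulate the nested SUBSTRING_INDEX call outside the database. This is a rough Python sketch; `substring_index` and `split_rows` are hypothetical helpers that only approximate MySQL's behaviour for non-empty delimiters, and `max_items=100` mirrors the 0..99 positions produced by joining `ints` to itself.

```python
def substring_index(text, delim, count):
    """Rough emulation of MySQL's SUBSTRING_INDEX(text, delim, count)."""
    parts = text.split(delim)
    if count > 0:
        # Everything left of the count-th delimiter (whole string if too few).
        return delim.join(parts[:count])
    if count < 0:
        # Everything right of the count-th delimiter counted from the end.
        return delim.join(parts[count:])
    return ""


def split_rows(rows, delim=",", max_items=100):
    """Mimic the ints cross join: probe positions 1..max_items, keep distinct pairs."""
    result = set()
    for user_id, stuff in rows:
        for n in range(1, max_items + 1):
            item = substring_index(substring_index(stuff, delim, n), delim, -1)
            result.add((user_id, item))
    return sorted(result)


rows = [(101, "1,2,3"), (102, "3,4"), (103, "4,5,6")]
split = split_rows(rows)
```

Positions beyond the end of a list keep returning the last item, so the set (DISTINCT in the SQL) is what removes those repeats. A side effect worth knowing is that a value genuinely listed twice in one row also collapses to a single output row.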

Several separated tables vs one integrated table with an additional column?

I have 3 tables, all of which have the same structure:
// table1 // table2 // table3
+----+------+ +----+------+ +----+------+
| id | name | | id | name | | id | name |
+----+------+ +----+------+ +----+------+
| 1 | jack | | 1 | ali | | 1 | peter|
+----+------+ +----+------+ +----+------+
I want to know whether my current structure is better, or a single integrated table with one additional column, something like this:
+----+------+-------+
| id | name | which |
+----+------+-------+
| 1 | jack | table1|
| 2 | ali | table2|
| 3 | peter| table3|
+----+------+-------+
Note: in the current structure (several tables) my query is something like this:
select id, name from table1
union all
select id, name from table2
union all
select id, name from table3
Is converting those several tables into one table with a new column better or not? (I think the new column is a kind of overhead. Is that true?)
This has practical consequences and also philosophical consequences. From a practical point of view, it's very hard to know without knowing a lot more about how the data is going to be used. What's the read to write ratio for this data? How often is data from two or more tables going to be selected in a single query? If you have to do a UNION to get all the data gathered, it's both slower and more cumbersome.
I prefer the philosophical approach, starting with the subject matter. Is there only one kind of entity here, or are there three different entities that all happen to have the same attributes? That nearly always tells me whether to put them in the same table or not, and most of the time it also turns out to give the right answer to the practical issue.
I will say that I would be looking around for some better name for the values of the extra attribute. "table1", "table2" and "table3" seem terribly opaque to me. The subject matter should provide a clue here as well.
Edit:
Now that I get the subject matter, I'm going to opine in favor of a single table. It is an opinion rather than a hard and fast rule. So it would be something like this:
+----+-----------+----------+--------------+
| id | word | language |translation |
+----+-----------+----------+--------------+
| 1 | butterfly | Spanish | mariposa |
| 2 | butterfly | French | papillon |
| 3 | butterfly | Italian | farfalla |
| 4 | chair | Spanish | silla |
+----+-----------+----------+--------------+
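To make the comparison with the UNION ALL query concrete, here is a small sketch of the single-table layout. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for MySQL; the table name `translations` is an assumption, and the column names follow the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE translations (
        id          INTEGER PRIMARY KEY,
        word        TEXT NOT NULL,
        language    TEXT NOT NULL,
        translation TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO translations (word, language, translation) VALUES (?, ?, ?)",
    [
        ("butterfly", "Spanish", "mariposa"),
        ("butterfly", "French", "papillon"),
        ("butterfly", "Italian", "farfalla"),
        ("chair", "Spanish", "silla"),
    ],
)

# Everything at once: a plain SELECT replaces the three-way UNION ALL.
all_rows = conn.execute(
    "SELECT word, language, translation FROM translations ORDER BY id"
).fetchall()

# One "former table" at a time: filter on the extra column.
spanish = conn.execute(
    "SELECT word, translation FROM translations WHERE language = ? ORDER BY id",
    ("Spanish",),
).fetchall()
```

Reading all rows no longer needs a UNION, and what used to be a separate table becomes a WHERE filter on `language`. In a real database an index on that column would likely be worth considering.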
If you are sure that all three tables will continue to share the same attributes, then a single table is fine. If that may not hold in the future, keep them separate.
This thread may help you more.

SQL Trigger Multiple Tables

I want to trigger an Update on multiple sql tables without creating a loop.
Let's say I have 2 tables:
Table: User_Names
---------------
|Name | Clark |
|Gen | Male |
|id | 1 |
---------------
Table: User_Ages
---------------
|Age | 34|
|Gen | Male |
|id | 1 |
---------------
The ids are unique and refer to the same person. I want to update the column Gen in User_Names, and my trigger should then update it in the other table. I also want this to happen when I change it in the User_Ages table. But if both triggers update each other, I'm creating a loop on the update trigger in MySQL. How do I prevent this loop? The point here is creating a SQL trigger.
I'm not going to address your original question directly, given the nature of your example. This is a normalization issue much more than a trigger issue.
In this case you should normalize your data and store it in only one place. The example above also suggests that you have a slight misunderstanding of how to use rows and columns.
Given the example, a better layout would probably be:
Table: User_names
+----+---------+------+
| id | Name | gen |
+----+---------+------+
| 1 | Clark | Male |
+----+---------+------+
Table: User_Ages
+----+------+
| id | age |
+----+------+
| 1 | 34 |
+----+------+
When you want to retrieve both values, you'd just link them in your query, e.g.
SELECT User_names.id, name, gen, age FROM User_names JOIN User_Ages USING (id);
Would give you:
+----+---------+------+-----+
| id | Name | gen | age |
+----+---------+------+-----+
| 1 | Clark | Male | 34 |
+----+---------+------+-----+
Coming back to your original question: in a situation like that I'd question the original design. If it is really called for, then I'd pick one table that acts as a master and propagates the changes to the other table. E.g. define the trigger on the User_names table and use it to populate the User_Ages table as well.
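The master-table idea can be sketched as follows. This uses Python's sqlite3 module and SQLite trigger syntax only so the example is runnable on its own; MySQL's CREATE TRIGGER syntax differs (for example it requires FOR EACH ROW), and the trigger name `propagate_gen` is made up. It keeps the duplicated gen column from the question only to show the propagation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE User_Names (id INTEGER PRIMARY KEY, name TEXT, gen TEXT);
    CREATE TABLE User_Ages  (id INTEGER PRIMARY KEY, age INTEGER, gen TEXT);
    INSERT INTO User_Names VALUES (1, 'Clark', 'Male');
    INSERT INTO User_Ages  VALUES (1, 34, 'Male');

    -- Only the master table carries a trigger, so nothing can loop back.
    CREATE TRIGGER propagate_gen AFTER UPDATE OF gen ON User_Names
    BEGIN
        UPDATE User_Ages SET gen = NEW.gen WHERE id = NEW.id;
    END;
    """
)

# Writes go to the master table only; the copy follows automatically.
conn.execute("UPDATE User_Names SET gen = 'Female' WHERE id = 1")
copied = conn.execute("SELECT gen FROM User_Ages WHERE id = 1").fetchone()[0]
```

Because User_Ages has no trigger of its own, an update can never bounce back. The trade-off is that the application has to treat User_Names as the only place where gen is written.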

MySQL Multi Duplicate Record Merging

A previous DBA managed a non-relational table with 2.4M entries, all with unique IDs. However, there are duplicate records with different data in each record, for example:
+----+---------+--------------+-------+------------+-------------+
| id | Name    | Address      | Phone | Email      | LastVisited |
+----+---------+--------------+-------+------------+-------------+
| 1  | bob     | 12 Some Road | 02456 |            |             |
| 2  | bobby   |              | 02456 | bob#domain |             |
| 3  | bob     | 12 Some Rd   | 02456 |            | 2010-07-13  |
| 4  | sir bob |              | 02456 |            |             |
| 5  | bob     | 12SomeRoad   | 02456 |            |             |
| 6  | mr bob  |              | 02456 |            |             |
| 7  | robert  |              | 02456 |            |             |
+----+---------+--------------+-------+------------+-------------+
This isn't the exact table - the real table has 32 columns - this is just to illustrate.
I know how to identify the duplicates; in this case I'm using the phone number. I've extracted the duplicates into a separate table - there are 730k entries in total.
What would be the most efficient way of merging these records (and flagging the unneeded records for deletion)?
I've looked at using UPDATE with INNER JOINs, but there are several WHERE clauses needed, because I want to update the first record with data from subsequent records, where that subsequent record has additional data the former record does not.
I've looked at third-party software such as Fuzzy Dups, but I'd like a pure MySQL option if possible.
The end goal then is that I'd be left with something like:
+----+------+--------------+-------+------------+-------------+
| id | Name | Address      | Phone | Email      | LastVisited |
+----+------+--------------+-------+------------+-------------+
| 1  | bob  | 12 Some Road | 02456 | bob#domain | 2010-07-13  |
+----+------+--------------+-------+------------+-------------+
Should I be looking at looping in a stored procedure / function, or is there some really easy thing I've missed?
You have to create a PROCEDURE, but before that, create your own temp_table holding one row per phone number:
INSERT INTO temp_table (column1, column2, ...) SELECT column1, column2, ... FROM myTable GROUP BY phoneNumber;
This has to be a physical table so that you can run a cursor on it.
Then create a PROCEDURE (say myPROC) that does the following:
Open a cursor on temp_table.
For each row, fetch the phoneNumber and id of the current row from temp_table into local variables (L_id, L_phoneNumber).
Here you also need a second table, similar_tempTable, which is filled with every row that shares that phone number:
INSERT INTO similar_tempTable (column1, column2, ...) SELECT column1, column2, ... FROM myTable WHERE phoneNumber = L_phoneNumber;
The next step is to extract the value you want for each column from similar_tempTable, update the row of myTable where id = L_id with those values, and delete the remaining duplicate rows from myTable.
One more thing: truncate similar_tempTable after every iteration of the cursor.
Hope this helps.
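The merge rule itself (keep the first record, fill its blanks from later records with the same phone number, flag the rest) can be sketched outside the database. This is a minimal Python illustration, not the stored procedure described above; `merge_duplicates` is a hypothetical helper, and it assumes "first" means lowest id and that empty strings mark missing values.

```python
def merge_duplicates(rows, key="Phone"):
    """Merge rows sharing `key`; the lowest id survives and fills its blanks
    from later rows. Returns (merged_rows, ids_flagged_for_deletion)."""
    survivors = {}
    to_delete = []
    for row in sorted(rows, key=lambda r: r["id"]):
        keeper = survivors.get(row[key])
        if keeper is None:
            # First record seen for this phone number becomes the keeper.
            survivors[row[key]] = dict(row)
            continue
        for column, value in row.items():
            # Only fill columns the keeper is missing; never overwrite data.
            if not keeper[column] and value:
                keeper[column] = value
        to_delete.append(row["id"])
    return list(survivors.values()), to_delete


columns = ("id", "Name", "Address", "Phone", "Email", "LastVisited")
sample = [
    (1, "bob", "12 Some Road", "02456", "", ""),
    (2, "bobby", "", "02456", "bob#domain", ""),
    (3, "bob", "12 Some Rd", "02456", "", "2010-07-13"),
    (4, "sir bob", "", "02456", "", ""),
]
rows = [dict(zip(columns, values)) for values in sample]
merged, flagged = merge_duplicates(rows)
```

On this sample the surviving row is id 1 with the email from row 2 and the visit date from row 3, and ids 2 to 4 are flagged. In SQL the same rule would have to be expressed per column, which is why the cursor approach loops over one phone number at a time.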

How to split CSVs from one column to rows in a new table in MSSQL 2008 R2

Imagine the following (very bad) table design in MSSQL2008R2:
Table "Posts":
| Id (PK, int) | DatasourceId (PK, int) | QuotedPostIds (nvarchar(255)) | [...]
| 1 | 1 | | [...]
| 2 | 1 | 1 | [...]
| 2 | 2 | 1 | [...]
[...]
| 102322 | 2 | 123;45345;4356;76757 | [...]
So, the column QuotedPostIds contains a semicolon-separated list of self-referencing PostIds (kids, don't try this at home!). Since this design is ugly as hell, I'd like to extract the values from the QuotedPostIds column into a new n:m relationship table like this:
Desired new table "QuotedPosts":
| QuotingPostId (int) | QuotedPostId (int) | DatasourceId (int) |
| 2 | 1 | 1 |
| 2 | 1 | 2 |
[...]
| 102322 | 123 | 2 |
| 102322 | 45345 | 2 |
| 102322 | 4356 | 2 |
| 102322 | 76757 | 2 |
The primary key for this table could either be a combination of QuotingPostId, QuotedPostId and DatasourceID or an additional artificial key generated by the database.
It is worth noticing that the current Posts table contains about 6,300,000 rows but only about 285,000 of those have a value set in the QuotedPostIds column. Therefore, it might be a good idea to pre-filter those rows. In any case, I'd like to perform the normalization using internal MSSQL functionality only, if possible.
I have already read other posts on this topic, which mostly dealt with split functions, but I could not find out how exactly to create the new table while also copying the appropriate value from the DatasourceId column, nor how to filter the rows to touch accordingly.
Thank you!
Edit: I thought it through and finally solved the problem using an external C# program instead of internal MSSQL functionality. Since it seems that it could have been done using Mikael Eriksson's suggestion, I will mark his post as the answer.
From the comments, you say you have a string split function that you don't know how to use with a table.
The answer is to use cross apply, something like this:
select P.Id,
S.Value
from Posts as P
cross apply dbo.Split(';', P.QuotedPostIds) as S
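What cross apply does here is call the split function once per Posts row and pair every returned value with that row. A rough Python sketch of the same shape follows; `quoted_posts` is a hypothetical helper, and it also carries DatasourceId along and skips rows with an empty QuotedPostIds, which the query above does not do.

```python
def quoted_posts(posts, delim=";"):
    """Yield (QuotingPostId, QuotedPostId, DatasourceId) for each quoted id."""
    for post_id, datasource_id, quoted_ids in posts:
        if not quoted_ids:
            # Pre-filter: most rows quote nothing, so skip them early.
            continue
        for value in quoted_ids.split(delim):
            yield (post_id, int(value), datasource_id)


# Sample rows shaped like the Posts table: (Id, DatasourceId, QuotedPostIds).
posts = [
    (1, 1, ""),
    (2, 1, "1"),
    (2, 2, "1"),
    (102322, 2, "123;45345;4356;76757"),
]
result = list(quoted_posts(posts))
```

The result lines up with the desired QuotedPosts table in the question. In T-SQL the equivalent additions would presumably be selecting P.DatasourceId as a third column and filtering out empty QuotedPostIds in a WHERE clause.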