I have a table describing changes that have been made to an end_customers table. When someone changes an end_customer we create a new row in the end_customers table and add a row to the end_customer_history table, where end_customer_parent_id is the ID of the old end_customer and end_customer_child_id is the ID of the new end_customer.
End Customer Table:
CREATE TABLE `end_customers` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`reference_person` varchar(255) DEFAULT NULL,
`phone_number` varchar(255) NOT NULL,
`email` varchar(255) DEFAULT NULL,
`social_security_number` varchar(255) DEFAULT NULL,
`comment` longtext,
`token` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=101107 DEFAULT CHARSET=utf8;
End Customer History Table:
CREATE TABLE `end_customer_history` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`end_customer_parent_id` bigint(20) NOT NULL,
`end_customer_child_id` bigint(20) NOT NULL,
`user_id` bigint(20) NOT NULL,
`date` datetime NOT NULL,
PRIMARY KEY (`id`),
KEY `FK_end_customer_parent` (`end_customer_parent_id`),
KEY `FK_end_customer_child` (`end_customer_child_id`),
KEY `FK_user` (`user_id`),
CONSTRAINT `end_customer_history_old_ibfk_1` FOREIGN KEY (`end_customer_parent_id`) REFERENCES `end_customers` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `end_customer_history_old_ibfk_2` FOREIGN KEY (`end_customer_child_id`) REFERENCES `end_customers` (`id`) ON DELETE CASCADE ON UPDATE CASCADE,
CONSTRAINT `end_customer_history_old_ibfk_3` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=67 DEFAULT CHARSET=utf8;
We are now refactoring the schema so that changes made to the end_customers table edit the row directly instead of creating a new row, and put a copy of the old data in an end_customer_history_new table that has the same schema as end_customers.
I need to migrate all old data to this new table.
So for each end_customer I have, I need to check whether it has an entry in end_customer_history as an end_customer_child_id (it has been modified), then check whether that entry's parent is also present in end_customer_history as a child, then whether that entry's parent is also present as a child, and so forth until there are no more rows.
How do I do this in one migration SQL script?
Unlike Oracle, MySQL (before version 8.0) does not provide built-in functionality to recursively query a parent-child hierarchy. You can write a self-join query (or nested queries) if you know the depth beforehand. If not, you have to query level by level and handle the recursion in the application or in a stored procedure.
Here is an example of querying a parent-child hierarchy when you already know the depth, and here is how to do it in Oracle (for your reference).
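As an aside, if you are on MySQL 8.0 or later, WITH RECURSIVE can walk the parent/child chain directly. A minimal sketch against the end_customer_history table from the question; it assumes the history forms a tree (no cycles) and produces every ancestor/descendant pair, which could then feed an INSERT ... SELECT:
WITH RECURSIVE chain AS (
    -- direct parent -> child edges
    SELECT end_customer_parent_id, end_customer_child_id
    FROM end_customer_history
    UNION ALL
    -- extend each pair by one more ancestor level
    SELECT h.end_customer_parent_id, c.end_customer_child_id
    FROM end_customer_history AS h
    JOIN chain AS c ON h.end_customer_child_id = c.end_customer_parent_id
)
SELECT end_customer_parent_id AS ancestor_id,
       end_customer_child_id AS descendant_id
FROM chain;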
The solution I came up with doesn't use any stored procedures to loop over rows, because I could never get that to work. It takes the latest version of an end customer and then joins the related rows using a token that we know is always the same across versions and unique per customer. (In the script below, the old history table has been renamed to end_customer_history_old, and end_customer_history refers to the new table.)
-- Move data from old end customer history to the new schema
-- May remove some redundant history, this will be ok
-- If an end customer has been changed more than once, all changes will have the same date (latest change) and the same user responsible for change
-- This is also ok
-- Insert the latest version of changed end_customers into new history table
INSERT INTO end_customer_history (name, reference_person, phone_number, email, social_security_number, comment, end_customer_id, user_id, date)
SELECT
ec.name, ec.reference_person, ec.phone_number, ec.email, ec.social_security_number, ec.comment, ech.end_customer_child_id, ech.user_id, ech.date
FROM end_customer_history_old AS ech
JOIN end_customers AS ec ON ec.id = ech.end_customer_parent_id
WHERE ech.end_customer_child_id IN (SELECT end_customer_id FROM orders);
-- Remove all the old data
DROP TABLE end_customer_history_old;
-- Insert all versions of an end_customer based on the latest history, joined on token (always the same)
INSERT INTO end_customer_history (name, reference_person, phone_number, email, social_security_number, comment, end_customer_id, user_id, date)
SELECT
parent.name, parent.reference_person, parent.phone_number, parent.email, parent.social_security_number, parent.comment, ech.end_customer_id, ech.user_id, ech.date
FROM end_customer_history AS ech
JOIN end_customers AS ec ON ec.id = ech.end_customer_id
JOIN end_customers AS parent ON parent.token = ec.token AND parent.id <> ec.id;
-- This unfortunately creates duplicates of some rows, so it looks like changes have been made multiple times (with no new data)
-- Create a temporary table that copies over unique rows from end_customer_history
CREATE TABLE temp AS
SELECT DISTINCT
name, reference_person, phone_number, email, social_security_number, comment, end_customer_id, user_id, date
FROM end_customer_history;
-- Clear the end_customer_history table altogether
TRUNCATE TABLE end_customer_history;
-- Copy over filtered unique history rows
INSERT INTO end_customer_history (name, reference_person, phone_number, email, social_security_number, comment, end_customer_id, user_id, date)
SELECT
name, reference_person, phone_number, email, social_security_number, comment, end_customer_id, user_id, date
FROM temp;
-- Remove temporary table
DROP TABLE temp;
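As a final sanity check (just a sketch, using the columns inserted above), any group returned here means a duplicate survived the de-duplication step:
SELECT end_customer_id, name, reference_person, phone_number, email, social_security_number, comment, user_id, date, COUNT(*) AS copies
FROM end_customer_history
GROUP BY end_customer_id, name, reference_person, phone_number, email, social_security_number, comment, user_id, date
HAVING COUNT(*) > 1;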
I need to increment a value in a unique row, using only one request.
I don't want to SELECT first and then conditionally INSERT or UPDATE.
The columns (date, service) are used as the unique key.
Only one INSERT statement.
For example
CREATE TABLE `log` (
`date` date NOT NULL,
`service` varchar(255) NOT NULL,
`count` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I found one solution, but it does not work for me:
INSERT INTO log (`date`,`service`,`count`) VALUES ('2020-12-1','amazon',count+1)
ON DUPLICATE KEY UPDATE `date`='2020-12-01',`service`='amazon',`count`=`count`+1;
Are there any other solutions?
First, define id_key as a unique key in the table (or as the primary key):
create table log (
id_key int primary key,
dt date not null,
service varchar(255) not null,
cnt int not null
) engine=innodb default charset=utf8;
Then you can do:
insert into log (id_key, dt, service, cnt)
values (1, '2020-12-1', 'amazon', 1)
on duplicate key update
dt = values(dt),
service = values(service),
cnt = cnt + 1;
This attempts to insert a new row into the table with a cnt of 1. If id_key already exists, the row is updated instead: the date and service are updated, and the count is incremented.
If you want to use the date and service as unique keys, then do reflect that in the create table statement:
create table log (
id_key int primary key,
dt date not null,
service varchar(255) not null,
cnt int not null,
unique (dt, service)
) engine = innodb default charset = utf8;
And then, it does not really make sense to update these columns on duplicate keys, so:
insert into log (id_key, dt, service, cnt)
values (1, '2020-12-1', 'amazon', 1)
on duplicate key update cnt = cnt + 1;
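To see the behaviour, running that statement twice against an empty log table should leave a single row with cnt = 2 (a small sketch reusing the second table definition above):
insert into log (id_key, dt, service, cnt)
values (1, '2020-12-01', 'amazon', 1)
on duplicate key update cnt = cnt + 1;
-- second run hits the duplicate key and only increments the counter
insert into log (id_key, dt, service, cnt)
values (1, '2020-12-01', 'amazon', 1)
on duplicate key update cnt = cnt + 1;
select * from log; -- expected: 1, 2020-12-01, amazon, 2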
I have a user settings setup using what I guess is called a 'property bag'.
I want to store settings for my users. Most users won't change the default settings, so I thought I should give each setting a default value. This way I don't have to store a user_setting record for every setting for every user.
This is my (simplified) mysql database:
CREATE TABLE `user` (
`user_id` INT NOT NULL AUTO_INCREMENT,
`username` VARCHAR(45) NOT NULL,
PRIMARY KEY (`user_id`)
);
CREATE TABLE `setting` (
`setting_id` INT NOT NULL AUTO_INCREMENT,
`key` VARCHAR(100) NOT NULL,
`default_value` VARCHAR(100) NOT NULL,
PRIMARY KEY (`setting_id`)
);
CREATE TABLE `user_setting` (
`user_setting_id` INT NOT NULL AUTO_INCREMENT,
`user_id` INT NOT NULL,
`setting_id` INT NOT NULL,
`value` VARCHAR(100) NOT NULL,
PRIMARY KEY (`user_setting_id`),
CONSTRAINT `fk_user_setting_1`
FOREIGN KEY (`user_id`)
REFERENCES `user` (`user_id`)
ON DELETE CASCADE
ON UPDATE CASCADE,
CONSTRAINT `fk_user_setting_2`
FOREIGN KEY (`setting_id`)
REFERENCES `setting` (`setting_id`)
ON DELETE RESTRICT
ON UPDATE CASCADE
);
INSERT INTO `user` VALUES (1, 'username1'),(2, 'username2');
INSERT INTO `setting` VALUES (1, 'key1', 'somevalue'),(2, 'key2', 'someothervalue');
In my code I can easily do a lookup for each setting for each user. If a row exists in the user_setting table, I know the value differs from the default.
But is there a way to get an overview of all the settings for each user? Normally I would left-join the user -> user_setting -> setting tables for each user, but now I don't have a user_setting record for every user/setting combination. Is this possible with a single query?
If you do a cartesian join of user against setting, you'll get one row for every user/setting combination. Then simply left-join the user_setting table and you can pick up the overridden value where it exists.
So something like this:
SELECT u.user_id, s.key, s.default_value, us.value
FROM user u, setting s
LEFT JOIN user_setting us
ON(us.user_id=u.user_id AND us.setting_id=s.setting_id)
ORDER BY u.user_id, s.key
You could refine this further using IFNULL so that you get the value of the setting regardless of whether it's overridden or default:
SELECT u.user_id, s.key, IFNULL(us.value , s.default_value) AS value
FROM user u, setting s
LEFT JOIN user_setting us
ON(us.user_id=u.user_id AND us.setting_id=s.setting_id)
ORDER BY u.user_id, s.key
(Answering my own question isn't the way I normally work, but I'm not sure the query above is quite correct, and this is based on Paul Dixon's answer.)
As mentioned, a cartesian join is needed between user and setting, but it has to be written as an explicit CROSS JOIN: with the comma syntax above, the LEFT JOIN binds more tightly than the comma, so the ON clause cannot see the u alias and MySQL reports an unknown column. The correct query would be:
SELECT u.user_id, s.key, IFNULL(us.value , s.default_value) AS value
FROM user u
CROSS JOIN setting s
LEFT JOIN user_setting us ON
(us.user_id=u.user_id AND us.setting_id=s.setting_id)
ORDER BY u.user_id, s.key;
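Since the sample data has no user_setting rows yet, every user shows the defaults. To see an override picked up by the IFNULL, insert one row (the value 'custom' here is purely illustrative) and re-run the query above; user 2 then gets 'custom' for key1 while everything else stays on the defaults:
-- Hypothetical override for illustration: user 2 changes setting 1 (key1)
INSERT INTO user_setting (user_id, setting_id, value)
VALUES (2, 1, 'custom');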
I was up all last night trying to crack this with no luck, so I'm hoping you can help as I'm all out of ideas:
I have two parent tables that I want to populate a Junction table from:
Brides:
create table if not exists `Brides` (
`BrideID` INT not null auto_increment,
`MaidenName` varchar(10) unique,
primary key (`BrideID`)
) engine=InnoDB;
insert into Brides (MaidenName)
values ('Smith'),
('Jones')
;
Churches:
create table if not exists `Churches` (
`ChurchID` INT not null auto_increment,
`ChurchName` varchar(10) unique,
primary key (`ChurchID`)
) engine=InnoDB;
insert into Churches (ChurchName)
values ('St Marys'),
('St Albans')
;
I am trying to populate the ID columns of the junction table Marriages by indirectly referencing the unique names in each parent table. In addition, I'm including MarriedName to identify when a bride marries more than once:
Marriages:
create table if not exists `Marriages` (
`BrideID` INT not null,
`ChurchID` INT not null,
`MarriedName` varchar(10) not null,
primary key (`BrideID`,`ChurchID`,`MarriedName`),
INDEX `fk_Marriages_Brides1_idx` (`BrideID` ASC),
INDEX `fk_Marriages_Churches1_idx` (`ChurchID` ASC),
CONSTRAINT `fk_Marriages_Brides1`
FOREIGN KEY (`BrideID`)
REFERENCES `Brides` (`BrideID`)
ON DELETE NO ACTION
ON UPDATE NO ACTION,
CONSTRAINT `fk_Marriages_Churches1`
FOREIGN KEY (`ChurchID`)
REFERENCES `Churches` (`ChurchID`)
ON DELETE NO ACTION
ON UPDATE NO ACTION) engine=InnoDB
;
I'm trying to do something like the pseudo-code below (although I'm pretty sure it wouldn't be the smart way to do it anyway, as it must be slow with so many sub-queries):
insert into Marriages (BrideID, ChurchID, MarriedName)
select b.BrideID, c.ChurchID, m.MarriedName
from (values (Bride,Church,MarriedName)
('Smith','St Marys','Johnson'),
('Jones','St Albans','Peterson')
) m
join Brides b
on b.MaidenName=m.Bride
join Churches c
on m.Church=c.ChurchName;
Any help/insight/corrections you have would be greatly appreciated!
Give this a try:
INSERT INTO Marriages
SELECT b.BrideID, c.ChurchID, 'Johnson'
FROM Brides b, Churches c
WHERE b.MaidenName='Smith' AND c.ChurchName='St Marys'
Just for completeness, I wanted to share this solution in case it helps anyone else. The first and last steps require no changes; only the 'raw' data in the second step needs adding to:
create table if not exists `_marriages` (
`BrideName` varchar(10) null,
`ChurchName` varchar(10) null,
`MarriedName` varchar(10) NULL,
primary key (`BrideName`,`ChurchName`,`MarriedName`))
engine=InnoDB
;
insert into `_marriages` (BrideName,ChurchName,MarriedName)
values
('Smith','St Albans','Johnson'),
('Jones','St Marys','Peterson')
;
insert into `Marriages` (BrideID, ChurchID, MarriedName)
select distinct b.BrideID, c.ChurchID, a.MarriedName
from _marriages a
inner join Brides b
on (a.BrideName=b.MaidenName)
left join Churches c
on a.ChurchName=c.ChurchName
;
...as always though, if anyone has a better way to do it, please let us know!
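One variation on the pseudo-code from the question, for anyone who wants to skip the staging table: build the raw rows inline with a UNION ALL derived table (just a sketch, using the Marriages columns as defined above):
insert into Marriages (BrideID, ChurchID, MarriedName)
select b.BrideID, c.ChurchID, m.MarriedName
from (
    select 'Smith' as MaidenName, 'St Marys' as ChurchName, 'Johnson' as MarriedName
    union all
    select 'Jones', 'St Albans', 'Peterson'
) as m
join Brides b on b.MaidenName = m.MaidenName
join Churches c on c.ChurchName = m.ChurchName;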
There is a table:
CREATE TABLE `mytable` (
`user_id` INT(10) UNSIGNED NOT NULL,
`thing_id` VARCHAR(100) NOT NULL DEFAULT '',
`lock_date` DATETIME NOT NULL,
`lock_id` VARCHAR(36) NOT NULL,
PRIMARY KEY (`user_id`,`thing_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;
and some values there:
INSERT INTO mytable(user_id,thing_id,lock_date,lock_id)
VALUES
(51082,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2012-03-16 00:39:12','ec7b2008-6ede-11e1-aac2-5924aae99221'),
(108325,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2013-02-05 19:30:03','7c6de986-6edd-11e1-aac2-5924aae99221'),
(108325,'d90b354d-4b5f-11e0-9959-47117d41cf4b','2012-03-16 00:47:41','1c243032-6ee0-11e1-aac2-5924aae99221');
I want to delegate all records of user_id = 108325 to user_id = 51082, and if both users have the same thing_id, keep only the newer one (the row with the greater lock_date), so that I end up with the following result:
51082,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2013-02-05 19:30:03','7c6de986-6edd-11e1-aac2-5924aae99221'
108325,'d90b354d-4b5f-11e0-9959-47117d41cf4b','2012-03-16 00:47:41','1c243032-6ee0-11e1-aac2-5924aae99221'
Note that 51082 now has a newer record: lock_date = '2013-02-05 19:30:03' instead of '2012-03-16 00:39:12'.
So, how can I update a row, and on duplicate key leave the newer one (by some particular field)?
Thanks!
INSERT INTO
mytable(user_id,thing_id,lock_date,lock_id)
VALUES
(51082,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2012-03-16 00:39:12','ec7b2008-6ede-11e1-aac2-5924aae99221'),
(108325,'299ac9ff-2b2b-102d-8ff6-f64c971398c3','2013-02-05 19:30:03','7c6de986-6edd-11e1-aac2-5924aae99221'),
(108325,'d90b354d-4b5f-11e0-9959-47117d41cf4b','2012-03-16 00:47:41','1c243032-6ee0-11e1-aac2-5924aae99221')
ON DUPLICATE KEY UPDATE
user_id = VALUES(user_id),
lock_date = VALUES(lock_date),
lock_id = VALUES(lock_id)
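For the delegation itself, one way to get the result shown in the question is a self-join UPDATE followed by a self-join DELETE rather than an upsert. This is only a sketch against the mytable sample data above; rows of 108325 whose thing_id 51082 does not have at all are left untouched, which matches the expected result:
-- For thing_ids both users share, copy 108325's data onto 51082's row,
-- but only when 108325's version is newer.
UPDATE mytable AS dst
JOIN mytable AS src
  ON src.user_id = 108325
 AND dst.user_id = 51082
 AND src.thing_id = dst.thing_id
 AND src.lock_date > dst.lock_date
SET dst.lock_date = src.lock_date,
    dst.lock_id = src.lock_id;
-- Then remove 108325's copies of the shared thing_ids.
DELETE src
FROM mytable AS src
JOIN mytable AS dst
  ON dst.user_id = 51082
 AND src.user_id = 108325
 AND src.thing_id = dst.thing_id;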
I have two tables, locations and location_groups:
CREATE TABLE locations (
location_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(63) UNIQUE NOT NULL
);
INSERT INTO locations (name)
VALUES
('london'),
('bristol'),
('exeter');
CREATE TABLE location_groups (
location_group_id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
location_ids VARCHAR(255) NOT NULL,
user_ids VARCHAR(255) NOT NULL,
name VARCHAR(63) NOT NULL
);
INSERT INTO location_groups (location_ids, user_ids, name)
VALUES
('1', '1,2,4', 'south east'),
('2,3', '2', 'south west');
What I am trying to do is return all location_ids for all of the location_groups where the given user_id exists. I'm using CSV to store the location_ids and user_ids in the location_groups table. I know this isn't normalised, but this is how the database is and it's out of my control.
My current query is:
SELECT location_id
FROM locations
WHERE FIND_IN_SET(location_id,
(SELECT location_ids
FROM location_groups
WHERE FIND_IN_SET(2,location_groups.user_ids)) )
Now this works fine if the user_id = 1, for example (as only one location_group row is returned), but if I search for user_id = 2, I get an error saying the subquery returns more than one row, which is expected as user 2 is in two location_groups. I understand why the error is being thrown; I'm trying to work out how to solve it.
To clarify: when searching for user_id 1 in location_groups.user_ids, location_id 1 should be returned. When searching for user_id 2, location_ids 1, 2 and 3 should be returned.
I know this is a complicated query so if anything isn't clear just let me know. Any help would be appreciated! Thank you.
You could use GROUP_CONCAT to combine the location_ids in the subquery.
SELECT location_id
FROM locations
WHERE FIND_IN_SET(location_id,
(SELECT GROUP_CONCAT(location_ids)
FROM location_groups
WHERE FIND_IN_SET(2,location_groups.user_ids)) )
Alternatively, use the problems with writing the query as an example of why normalization is good. Heck, even if you do use this query, it will run more slowly than a query on properly normalized tables; you could use that to show why the tables should be restructured.
For reference (and for other readers), here's what a normalized schema would look like (some additional alterations to the base tables are included).
The compound fields in the location_groups table could simply be separated into additional rows to achieve 1NF, but this wouldn't be in 2NF, as the name column would be dependent on only the location part of the (location, user) candidate key. (Another way of thinking of this is that the name is an attribute of the regions, not of the relations between regions/groups, locations and users.)
Instead, these columns will be split off into two additional tables for 1NF: one to connect locations and regions, and one to connect users and regions. It may be that the latter should be a relation between users and locations (rather than regions), but that's not the case with the current schema (which could be another problem of the current, non-normalized schema). The region-location relation is one-to-many (since each location is in one region). From the sample data, we see the region-user relation is many-many. The location_groups table then becomes the region table.
-- normalized from `location_groups`
CREATE TABLE regions (
`id` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`name` VARCHAR(63) UNIQUE NOT NULL
);
-- slightly altered from original
CREATE TABLE locations (
`id` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`name` VARCHAR(63) UNIQUE NOT NULL
);
-- missing from original sample
CREATE TABLE users (
`id` INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
`name` VARCHAR(63) UNIQUE NOT NULL
);
-- normalized from `location_groups`
CREATE TABLE location_regions (
`region` INT UNSIGNED,
`location` INT UNSIGNED UNIQUE NOT NULL,
PRIMARY KEY (`region`, `location`),
FOREIGN KEY (`region`)
REFERENCES regions (id)
ON DELETE restrict ON UPDATE cascade,
FOREIGN KEY (`location`)
REFERENCES locations (id)
ON DELETE cascade ON UPDATE cascade
);
-- normalized from `location_groups`
CREATE TABLE user_regions (
`region` INT UNSIGNED NOT NULL,
`user` INT UNSIGNED NOT NULL,
PRIMARY KEY (`region`, `user`),
FOREIGN KEY (`region`)
REFERENCES regions (id)
ON DELETE restrict ON UPDATE cascade,
FOREIGN KEY (`user`)
REFERENCES users (id)
ON DELETE cascade ON UPDATE cascade
);
Sample data:
INSERT INTO regions (`name`)
VALUES
('South East'),
('South West'),
('North East'),
('North West');
INSERT INTO locations (`name`)
VALUES
('London'),
('Bristol'),
('Exeter'),
('Hull');
INSERT INTO users (`name`)
VALUES
('Alice'),
('Bob'),
('Carol'),
('Dave'),
('Eve');
------ Location-Region relation ------
-- temporary table used to map natural keys to surrogate keys
CREATE TEMPORARY TABLE loc_rgns (
`location` VARCHAR(63) UNIQUE NOT NULL,
`region` VARCHAR(63) NOT NULL
);
-- Hull added to demonstrate correctness of desired query
INSERT INTO loc_rgns (region, location)
VALUES
('South East', 'London'),
('South West', 'Bristol'),
('South West', 'Exeter'),
('North East', 'Hull');
-- map natural keys to surrogate keys for final relationship
INSERT INTO location_regions (`location`, `region`)
SELECT loc.id, rgn.id
FROM locations AS loc
JOIN loc_rgns AS lr ON loc.name = lr.location
JOIN regions AS rgn ON rgn.name = lr.region;
------ User-Region relation ------
-- temporary table used to map natural keys to surrogate keys
CREATE TEMPORARY TABLE usr_rgns (
`user` INT UNSIGNED NOT NULL,
`region` VARCHAR(63) NOT NULL,
UNIQUE (`user`, `region`)
);
-- user 3 added in order to demonstrate correctness of desired query
INSERT INTO usr_rgns (`user`, `region`)
VALUES
(1, 'South East'),
(2, 'South East'),
(2, 'South West'),
(3, 'North West'),
(4, 'South East');
-- map natural keys to surrogate keys for final relationship
INSERT INTO user_regions (`user`, `region`)
SELECT user, rgn.id
FROM usr_rgns AS ur
JOIN regions AS rgn ON rgn.name = ur.region;
Now, the desired query for the normalized schema:
SELECT DISTINCT loc.id
FROM locations AS loc
JOIN location_regions AS lr ON loc.id = lr.location
JOIN user_regions AS ur ON lr.region = ur.region
WHERE ur.user = 2  -- the given user
;
Result:
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
+----+
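For comparison with the original FIND_IN_SET approach, the per-user location lists can still be rebuilt as CSV strings from the normalized tables with GROUP_CONCAT (a sketch, using the schema above):
-- Location ids per user, collapsed back into a comma-separated list.
SELECT ur.user, GROUP_CONCAT(DISTINCT lr.location ORDER BY lr.location) AS location_ids
FROM user_regions AS ur
JOIN location_regions AS lr ON lr.region = ur.region
GROUP BY ur.user;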