Converting a MERGE statement to MySQL using ON DUPLICATE KEY - mysql

Please suggest how to convert this Teradata statement to MySQL. As we know, MySQL doesn't support the MERGE statement. The two tables below are also used in the SELECT query, and we have multi-column primary keys on these tables.
MERGE INTO XYZ USING (
SELECT
ITRR.WORKFLOW_NAME WORKFLOW_NAME
, ITRR.INSTANCE_NAME INSTANCE_NAME
, MIN(ITRR.START_TIME) EARLIEST_START_TIME
, ITRR.SUBJECT_AREA SUBJECT_AREA
, 'INFORMATICA' PLATFORM_NAME
FROM
ABC IWRR
, DEF ITRR
WHERE
IWRR.WORKFLOW_RUN_ID = ITRR.WORKFLOW_RUN_ID
AND IWRR.USER_NAME IN ('xyz')
AND ITRR.RUN_STATUS_CODE <> 2
GROUP BY
ITRR.WORKFLOW_NAME
, ITRR.INSTANCE_NAME
, ITRR.SUBJECT_AREA
) SRC
ON
XYZ.PARENT_JOB_NAME = SRC.WORKFLOW_NAME
AND XYZ.CHILD_JOB_NAME = SRC.INSTANCE_NAME
AND XYZ.SANDBOX = SRC.SUBJECT_AREA
WHEN MATCHED THEN UPDATE SET FIRST_EXECUTION = SRC.EARLIEST_START_TIME
WHEN NOT MATCHED THEN INSERT
(
PARENT_JOB_NAME
, CHILD_JOB_NAME
, FIRST_EXECUTION
, SANDBOX
, PLATFORM_NAME
)VALUES
(
SRC.WORKFLOW_NAME
, SRC.INSTANCE_NAME
, SRC.EARLIEST_START_TIME
, SRC.SUBJECT_AREA
, SRC.PLATFORM_NAME
);
I am trying the query below, but it is not working.
INSERT INTO XYZ (
PARENT_JOB_NAME
, CHILD_JOB_NAME
, FIRST_EXECUTION
, SANDBOX
, PLATFORM_NAME
)
(SELECT
ITRR.WORKFLOW_NAME WORKFLOW_NAME
, ITRR.INSTANCE_NAME INSTANCE_NAME
, MIN(ITRR.START_TIME) EARLIEST_START_TIME
, ITRR.SUBJECT_AREA SUBJECT_AREA
, 'INFORMATICA' PLATFORM_NAME
FROM
ABC IWRR
, DEF ITRR
WHERE
IWRR.WORKFLOW_RUN_ID = ITRR.WORKFLOW_RUN_ID
AND IWRR.USER_NAME IN ('XYZ')
AND ITRR.RUN_STATUS_CODE <> 2
GROUP BY
ITRR.WORKFLOW_NAME
, ITRR.INSTANCE_NAME
, ITRR.SUBJECT_AREA
) SRC
ON DUPLICATE KEY UPDATE
FIRST_EXECUTION = SRC.EARLIEST_START_TIME
Primary key of XYZ = PARENT_JOB_NAME
Primary key of ABC = SUBJECT_ID
Primary key of DEF = SUBJECT_ID, WORKFLOW_ID, WORKFLOW_RUN_ID, WORKLET_RUN_ID, INSTANCE_ID, TASK_ID, START_TIME

The correct syntax in MySQL is:
INSERT INTO XYZ (PARENT_JOB_NAME, CHILD_JOB_NAME, FIRST_EXECUTION, SANDBOX, PLATFORM_NAME)
SELECT ITRR.WORKFLOW_NAME, ITRR.INSTANCE_NAME,
MIN(ITRR.START_TIME), ITRR.SUBJECT_AREA, 'INFORMATICA'
FROM ABC IWRR JOIN
DEF ITRR
ON IWRR.WORKFLOW_RUN_ID = ITRR.WORKFLOW_RUN_ID
WHERE IWRR.USER_NAME IN ('XYZ') AND
ITRR.RUN_STATUS_CODE <> 2
GROUP BY ITRR.WORKFLOW_NAME, ITRR.INSTANCE_NAME, ITRR.SUBJECT_AREA
ON DUPLICATE KEY UPDATE FIRST_EXECUTION = VALUES(FIRST_EXECUTION);
Note the use of proper, explicit, standard, readable JOIN syntax. Use it.
The major changes are:
Replacing the archaic comma-join syntax with an explicit JOIN.
Removing the parentheses, which are not needed around the SELECT in an INSERT ... SELECT (although they are probably allowed).
Removing the table alias SRC, which is definitely not allowed.
Fixing the ON DUPLICATE KEY UPDATE clause so that it uses VALUES(FIRST_EXECUTION).

I believe the comment from @Akina is correct: the primary key on table XYZ is simply wrong.
The primary key on the XYZ table needs to include the columns PARENT_JOB_NAME, CHILD_JOB_NAME and SANDBOX for the MySQL INSERT ... ON DUPLICATE KEY UPDATE statement to work correctly.
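A rough sketch of that key change follows. It assumes XYZ currently has PARENT_JOB_NAME as its only primary-key column, that the three columns contain no NULLs, and that no two existing rows share the same combination. The point is that ON DUPLICATE KEY UPDATE fires on a collision with any unique key of the table, so the key has to match the three columns used in the original MERGE ... ON condition.

```sql
-- Assumption: XYZ currently has PRIMARY KEY (PARENT_JOB_NAME) only.
-- Redefine the key so it matches the MERGE ... ON columns.
ALTER TABLE XYZ
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (PARENT_JOB_NAME, CHILD_JOB_NAME, SANDBOX);
```

If the ALTER fails with a duplicate-entry error, existing rows already collide on the new key and need to be cleaned up first.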

Related

SQL Merge Statement Check Constraint Error

I have the following code. Table a has a CHECK constraint on column Denial.
CREATE TABLE Table a
(
[ID] int IDENTITY(1,1) NOT NULL ,
[EntityID] int ,
Denial nVarchar(20)
CONSTRAINT Chk_Denial CHECK (Denial IN ('Y', 'N')),
)
Merge statement
MERGE INTO Table a WITH (HOLDLOCK) AS tgt
USING (SELECT DISTINCT
JSON_VALUE(DocumentJSON, '$.EntityID') AS EntityID,
JSON_VALUE(DocumentJSON, '$.Denial') AS Denial
FROM Table1 bd
INNER JOIN table2 bf ON bf.FileUID = bd.FileUID
WHERE bf.Type = 'Payment') AS src ON tgt.[ID] = src.[ID]
WHEN MATCHED
)) THEN
UPDATE SET tgt.ID = src.ID,
tgt.EntityID = src.EntityID,
tgt.Denial = src.Denial,
WHEN NOT MATCHED BY TARGET
THEN INSERT (ID, EntityID, Denial)
VALUES (src.ID, src.EntityID, src.Denial)
THEN DELETE
I get this error when running my MERGE statement:
Error Message Msg 547, Level 16, State 0, Procedure storproctest1, Line 40 [Batch Start Line 0]
The MERGE statement conflicted with the CHECK constraint "Chk_Column". The conflict occurred in the database "Test", table "Table1", and column 'Denial'. The statement has been terminated.
This is due to the source files having "Yes" and "No" instead of 'Y' and 'N'. Hence, I'm getting the above error.
How can I use a CASE expression in the MERGE statement to handle the CHECK constraint error above? Alternative solutions are also welcome.
You can turn Yes into Y and No into N before merging your data. That belongs in the USING clause of the MERGE query:
USING (
SELECT Distinct
JSON_VALUE(DocumentJSON, '$.EntityID') AS EntityID,
CASE JSON_VALUE(DocumentJSON, '$.Denial')
WHEN 'Yes' THEN 'Y'
WHEN 'No' THEN 'N'
ELSE JSON_VALUE(DocumentJSON, '$.Denial')
END AS Denial
FROM Table1 bd
INNER JOIN table2 bf ON bf.FileUID = bd.FileUID
WHERE bf.Type = 'Payment'
) AS src
The CASE expression translates Yes and No values to Y and N, and leaves other values untouched. Since this applies to the source dataset, the whole rest of the query benefits (i.e. both the update and insert branches).
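Because the ELSE branch passes unknown values through unchanged, any source value other than Y, N, Yes or No would still violate the constraint. A quick check along the following lines (T-SQL, reusing the table and column names from the question; only a sketch) lists such values before the MERGE is run:

```sql
-- Lists Denial values that the CASE expression would pass through unchanged
-- and that the CHECK constraint would then reject.
-- NULLs do not show up here; a CHECK constraint accepts NULL anyway.
SELECT DISTINCT
       JSON_VALUE(bd.DocumentJSON, '$.Denial') AS Denial
FROM Table1 bd
INNER JOIN table2 bf ON bf.FileUID = bd.FileUID
WHERE bf.Type = 'Payment'
  AND JSON_VALUE(bd.DocumentJSON, '$.Denial') NOT IN ('Y', 'N', 'Yes', 'No');
```

If this returns no rows, the translated source data should satisfy the constraint.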

How to prevent duplicate entry key when update

Problem explanation
I want to update the last column of a three-column unique key. The problem is that the first and second columns are sometimes the same for multiple records. In that case, when I set my new value I get a duplicate entry error, even though I use a subquery to try to avoid it.
Some Code
Schemas
create table llx_element_contact
(
rowid int auto_increment
primary key,
datecreate datetime null,
statut smallint default 5 null,
element_id int not null,
fk_c_type_contact int not null,
fk_socpeople int not null,
constraint idx_element_contact_idx1
unique (element_id, fk_c_type_contact, fk_socpeople)
)
Update request
This request returns a duplicate key error:
update llx_element_contact lec
set lec.fk_socpeople = 64
where
-- Try to avoid the error by not including rows whose values would become the same
(select count(*)
from llx_element_contact ec
where ec.fk_socpeople = 64
and ec.element_id = lec.element_id
and ec.fk_c_type_contact = lec.fk_c_type_contact) = 0
Test data
rowid, datecreate, statut, element_id, fk_c_type_contact, fk_socpeople
65,2015-08-31 18:59:18,4,65,160,30
66,2015-08-31 18:59:18,4,66,159,12
67,2015-08-31 18:59:18,4,67,160,12
15283,2016-03-23 11:47:15,4,6404,160,39
15284,2016-03-23 11:51:30,4,6404,160,58
You should check only the other two members of the unique constraint, since you are assigning the same value to the third member. No more than one row with the same two members may exist.
update llx_element_contact lec
set lec.fk_socpeople = 64
where
-- Skip rows whose (element_id, fk_c_type_contact) pair occurs more than once
(select count(*)
from llx_element_contact ec
where ec.element_id = lec.element_id
and ec.fk_c_type_contact = lec.fk_c_type_contact) <=1
or
update llx_element_contact lec
set lec.fk_socpeople = 64
where
-- Skip rows that share their pair with a row holding a different fk_socpeople
not exists (select 1
from llx_element_contact ec
where ec.element_id = lec.element_id
and ec.fk_c_type_contact = lec.fk_c_type_contact
and lec.fk_socpeople != ec.fk_socpeople)
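To see which rows the conditions above are guarding against, a grouping query like the following sketch can list every (element_id, fk_c_type_contact) pair that occurs more than once. Setting fk_socpeople to the same value on all rows of such a pair is what triggers the duplicate entry error.

```sql
-- Pairs that appear on more than one row; updating all of their rows
-- to the same fk_socpeople would violate idx_element_contact_idx1.
SELECT element_id,
       fk_c_type_contact,
       COUNT(*) AS rows_in_group
FROM llx_element_contact
GROUP BY element_id, fk_c_type_contact
HAVING COUNT(*) > 1;
```

With the test data from the question, only the pair (6404, 160) comes back, with a count of 2.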
You can prevent the unique conflict by using a left join to check that the corresponding row doesn't already exist:
update llx_element_contact lec left join
(select element_id, fk_c_type_contact
from llx_element_contact lec2
where lec2.fk_socpeople = 64
group by element_id, fk_c_type_contact
) lec2
using (element_id, fk_c_type_contact)
set lec.fk_socpeople = 64
where lec2.element_id is null;
Your query has additional logic in it that is not explained. It is not necessary for what you are asking.
WITH cte AS (
SELECT rowid,
SUM(fk_socpeople = 64) OVER (PARTITION BY element_id, fk_c_type_contact) u_flag,
ROW_NUMBER() OVER (PARTITION BY element_id, fk_c_type_contact ORDER BY datecreate DESC) u_rn
FROM llx_element_contact
)
update llx_element_contact lec
JOIN cte USING (rowid)
set lec.fk_socpeople = 64
where cte.u_flag = 0
AND cte.u_rn = 1
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=08e20328ccc6187716084ce9d78816b0

MySQL Slow query with multiple joins and subqueries

I have 3 tables:
Pi - images
Pidl - image download log => Pi
Pirl - image resize log => Pidl
Basically an image is downloaded and a log record is created in Pidl. After that, it's resized and a record is created in Pirl. Said record being connected to the Pidl record.
I am writing a query to find which images need to be resized, and it basically queries Pidl. The algorithm I've devised is simple:
for each Image in Pi {
pidlA=newest_pidl(Image);
if(pidlA.status == success) {
pirlA=newest_pirl(Image);
if(pirlA.pidl.hash != pidlA.hash)
{
go;
}
else if(pirlA.status != success){
failed_attempts = failed_pirl_count(pirlA,newest_successful_pirl(Image))
decide based on pirlA.time and failed_attempts if go or not
}
else
{
dont go;
}
}
else
{
dont go;
}
}
And now my query (although it is not yet finished: the failed attempts part is missing, but it's already too slow, so first I'd like to fix that).
SELECT
pidl1A.pidl_id
FROM Pidl as pidl1A
LEFT JOIN Pidl as pidl2A
ON (
pidl1A.pidl_pi_id = pidl2A.pidl_pi_id AND
pidl2A.pidl_status = 1 AND
(pidl2A.pidl_time > pidl1A.pidl_time OR
(pidl2A.pidl_id > pidl1A.pidl_id and pidl1A.pidl_time=pidl2A.pidl_time)
)
)
LEFT JOIN (
#newest pirl subquery#
SELECT
pidl1B.pidl_pi_id as sub_pi_id,
pidl1B.pidl_hash as sub_pidl_hash,
pirl1B.pirl_id as sub_pirl_id,
pirl1B.pirl_status as sub_pirl_status
FROM Pirl as pirl1B
INNER JOIN Pidl as pidl1B ON (pirl1B.pirl_pidl_id = pidl1B.pidl_id)
LEFT JOIN (
SELECT
pidl2B.pidl_pi_id as sub_pi_id,
pirl2B.pirl_id as sub_pirl_id,
pirl2B.pirl_time as sub_pirl_time
FROM Pirl as pirl2B
INNER JOIN Pidl as pidl2B ON (pirl2B.pirl_pidl_id = pidl2B.pidl_id)
WHERE 1
) as pirl3B
ON (
pirl3B.sub_pi_id = pidl1B.pidl_pi_id and
(pirl3B.sub_pirl_time > pirl1B.pirl_time or
(pirl3B.sub_pirl_time = pirl1B.pirl_time and
pirl3B.sub_pirl_id > pirl1B.pirl_id)
)
)
WHERE
pirl3B.sub_pirl_id is null
) as pirl1A
ON (pirl1A.sub_pi_id = pidl1A.pidl_pi_id)
WHERE
pidl1A.pidl_status = 1 AND pidl2A.pidl_id IS NULL
AND (
pirl1A.sub_pirl_id IS NULL
OR (
pidl1A.pidl_hash != pirl1A.sub_pidl_hash
)
OR (
pirl1A.sub_pirl_status != 1
)
)
And this is my db schema:
CREATE TABLE Pi (
`pi_id` int,
PRIMARY KEY (`pi_id`)
)
;
CREATE TABLE Pidl
(
`pidl_id` int,
`pidl_pi_id` int,
`pidl_status` int,
`pidl_time` int,
`pidl_hash` varchar(16),
PRIMARY KEY (`pidl_id`)
)
;
alter table Pidl
add constraint fk1_branchNo foreign key (pidl_pi_id) references Pi (pi_id);
CREATE TABLE Pirl
(
`pirl_id` int,
`pirl_pidl_id` int,
`pirl_status` int,
`pirl_time` int,
PRIMARY KEY (`pirl_id`)
)
;
alter table Pirl
add constraint fk2_branchNo foreign key (pirl_pidl_id) references Pidl (pidl_id);
INSERT INTO Pi
(`pi_id`)
VALUES
(3),
(4),
(5);
INSERT INTO Pidl
(`pidl_id`, `pidl_pi_id`,`pidl_status`,`pidl_time`, `pidl_hash`)
VALUES
(1, 3, 1,100, 'hashA'),
(2, 3, 1,150,'hashB'),
(3, 4, 2, 200,'hashC'),
(4, 3, 1, 200,'hashA')
;
INSERT INTO Pirl
(`pirl_id`, `pirl_pidl_id`,`pirl_status`,`pirl_time`)
VALUES
(1, 2, 0,100),
(2, 3, 1,150),
(3, 1, 2, 200)
;
Of course with 3 records it's fast. But with around 10-30k it takes more than 5 seconds. What I've found is that the thing that makes it slow is the last part of the where:
AND (
pirl1A.sub_pirl_id IS NULL
OR (
pidl1A.pidl_hash != pirl1A.sub_pidl_hash
)
OR (
pirl1A.sub_pirl_status != 1
)
)
The other strange thing that I've found is that by using DISTINCT, the query got a bit faster but not fast enough.
When I read your requirements, I come up with a query like this:
select pidl.*
from pidl join
(select image, max(pidl_time) as pidl_time
from pidl
group by image
) maxpidl
on pidl.image = maxpidl.image and pidl.pidl_time = maxpidl.pidl_time left join
pirl
on pidl.hash = pirl.hash
where pirl.hash is null;
I think you have some other conditions that are not fully explained (such as the role of status). You should be able to incorporate that.
In MySQL, you should avoid subqueries in the from clause. These are materialized and -- as a result -- there is additional overhead for that work and the engine cannot subsequently use indexes.
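Your queries aren't using your indexes, and are instead joining against derived tables (subqueries in the FROM clause). This can be very slow. I would suggest making new, indexed tables holding the information that you need, or a materialized view (MySQL has no built-in materialized views, so in practice that also means a summary table you refresh yourself).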
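Before building extra tables, it may be worth trying composite indexes that cover the "newest row per image" comparisons in the question's joins. The following is only a sketch against the schema shown in the question; the index names are hypothetical, and whether the optimizer picks them up has to be confirmed with EXPLAIN on real data.

```sql
-- Hypothetical index names; columns taken from the question's schema.
-- Supports the Pidl self-join on pidl_pi_id with the time/id comparison.
CREATE INDEX idx_pidl_image_time ON Pidl (pidl_pi_id, pidl_time, pidl_id);

-- Supports joining Pirl to Pidl and comparing resize times.
CREATE INDEX idx_pirl_pidl_time ON Pirl (pirl_pidl_id, pirl_time, pirl_id);

-- Then prefix the slow query with EXPLAIN and check the "key" column
-- to see whether these indexes are actually chosen.
```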

mysql add key does not work

I have this query:
SELECT adressid, adressname FROM kino_adressen WHERE city ='Seattle'
I wanted to create an index like this
ALTER TABLE <tablename> ADD KEY index_abc(adressid, adressname(40))
But when I then check it by using:
EXPLAIN SELECT adressid, adressname FROM kino_adressen WHERE city ='Seattle'
It says
type = ALL
possible keys = NULL
key = NULL
...rows = 106
Can anyone give me some advice on how to do this properly?
// edit:
Another problem I do not understand:
SELECT DISTINCT
titel,
regie,
darsteller,
filmbild,
kino_filme.filmid,
kino_filme.beschreibung,
fsk,
filmlaenge,
verleih,
sprachfassung
FROM
kino_filme
LEFT JOIN kino_terminefilme ON (
kino_terminefilme.filmid = kino_filme.filmid
)
LEFT JOIN kino_termine ON (
kino_terminefilme.terminid = kino_termine.terminid
)
LEFT JOIN kino_kinos ON (
kino_kinos.kinoid = kino_termine.kinoid
)
LEFT JOIN kino_adressen ON (
kino_adressen.adressid = kino_kinos.adressid
)
WHERE
kino_adressen.adressid = 32038
And the result is like:
Why is kino_termine not using any index?
I set it as the primary key when creating the table and even added an index afterwards, but neither helped.
You added an index on the address columns but filter on city in the WHERE clause. Add an index on city and it will be used.
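A minimal sketch of that suggestion, assuming city is an ordinary VARCHAR column (a TEXT column would need a prefix length) and using a hypothetical index name:

```sql
-- Index the column that the WHERE clause actually filters on.
ALTER TABLE kino_adressen ADD KEY idx_city (city);

-- Re-check the plan: "possible_keys" should now list idx_city.
EXPLAIN SELECT adressid, adressname
FROM kino_adressen
WHERE city = 'Seattle';
```

On a table this small (about 106 rows) the optimizer may still prefer a full scan, so the EXPLAIN output is the thing to verify rather than assume.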

Can I use MySQL ifnull with a select statement?

I'm trying the following and cannot find out what is wrong:
IF( IFNULL(
SELECT * FROM vertreter AS ag
WHERE ag.iln = param_loginID
AND ag.verkaeufer = param_sellerILN
),
UPDATE vertreter AS agUp
SET agUp.vertreterkennzeichen
WHERE agUp.iln = param_loginID AND agUp.verkaeufer = param_sellerILN
,
INSERT INTO vertreter AS agIn
( agIn.iln, agIn.verkaeufer, agIn.vertreterkennzeichen, agIn.`status` )
VALUES
( param_loginID, param_sellerILN, param_agentID, 'Angefragt' )
);
Question:
Is this possible at all, to check if a SELECT returns NULL and then do A or B depending?
You need to create a unique composite index on (iln, verkaeufer).
CREATE UNIQUE INDEX vertreter_iln_verkaeufer ON vertreter (iln, verkaeufer)
http://dev.mysql.com/doc/refman/5.0/en/create-index.html
And then you can do this in one query:
INSERT INTO vertreter
(iln, verkaeufer, vertreterkennzeichen, `status`)
VALUES (param_loginID, param_sellerILN, param_agentID, 'Angefragt')
ON DUPLICATE KEY UPDATE vertreterkennzeichen = param_agentID
Documentation: http://dev.mysql.com/doc/refman/5.5/en/insert-on-duplicate.html
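As for the literal question (run A or B depending on whether a SELECT finds a row): inside a stored procedure this can be written with IF EXISTS. The sketch below assumes the param_ names are procedure parameters; the procedure name and the parameter types are hypothetical. Unlike the single-statement upsert above, it leaves a window between the check and the write, so the ON DUPLICATE KEY UPDATE form is usually the safer choice under concurrent access.

```sql
DELIMITER //

-- Hypothetical procedure; parameter types are assumed, adjust to the real columns.
CREATE PROCEDURE save_vertreter(
  IN param_loginID   VARCHAR(32),
  IN param_sellerILN VARCHAR(32),
  IN param_agentID   VARCHAR(32)
)
BEGIN
  IF EXISTS (SELECT 1
             FROM vertreter
             WHERE iln = param_loginID
               AND verkaeufer = param_sellerILN) THEN
    -- Row already exists: update it.
    UPDATE vertreter
       SET vertreterkennzeichen = param_agentID
     WHERE iln = param_loginID
       AND verkaeufer = param_sellerILN;
  ELSE
    -- No row yet: insert a new one.
    INSERT INTO vertreter (iln, verkaeufer, vertreterkennzeichen, `status`)
    VALUES (param_loginID, param_sellerILN, param_agentID, 'Angefragt');
  END IF;
END //

DELIMITER ;
```

DELIMITER is a command of the mysql command-line client, not part of SQL itself; other clients may need the procedure body submitted differently.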