I want to load the following two tables using SQL*Loader (sqlldr).
The structures of the two tables are as follows:
CREATE TABLE Customer
(ID varchar2(50), --PK
org_cd varchar2(50), --PK
NAME VARCHAR2 (255),
Address1 VARCHAR2(1000),
DOB TIMESTAMP(3),
cust_ref_col number ---used for all the future references to this record since this is a number. This is unique key.
);
CREATE TABLE Customer_contact
(ID varchar2(50), --PK
org_cd varchar2(50), --PK
Contact_id Number, --PK --Running serial # for a given Customer
contact_name varchar2(50),
cust_ref_col number ---foreign key from Customer table
);
Here is the data file, customer.dat (the last column value of 1 is a dummy, since I want to generate that number from the Oracle sequence partnersequence):
PTNR_78814824,ACCT,Tom,123 Church Road, 12-dec-99,1,Ralph,1
PTNR_78814825,FIN,Tom,124 Main Road, 12-dec-99,2,Jody,1
PTNR_78814826,ENGG,Tom,125 Station Road, 12-dec-99,3,Mardy,1
My control file looks like this:
LOAD DATA
INFILE test.dat
INTO TABLE Customer
APPEND
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(ID ,
org_cd ,
name ,
Address1 ,
DOB ,
cust_ref_col "partnersequence.nextval"
)
INTO TABLE Customer_contact
APPEND
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(ID ,
org_cd ,
Fill1 Filler,
Fill2 Filler,
Fill3 Filler,
Fill4 Filler,
cust_ref_col "partnersequence.nextval"
)
The issue here is that cust_ref_col in the Customer_contact table is getting a new sequence number. I want to use the same sequence number that was generated for the
Customer table. Can you please help?
EDIT: I misunderstood the OP's question. He was looking for a solution where both tables get the same sequence number for each row (answered in the comments, I believe). This answer shows how to get a single sequence value for an entire load. While it is not an answer to the question, I think the information is helpful, so I'll leave it.
As you found out already, calling your sequence.nextval in the control file causes it to increment for each row loaded. The trick is to call a package that sets the value once upon instantiation, then call package functions that return the values for each row loaded. Since I have set this up already for my loads, I'll share what I did, with an explanation. Our loads add a load_date and a load_sequence_id to the tables, so that's what you'll see in these examples. I assume the reader already understands the structure of a package, as explaining that here would muddy the main question.
You'll need to create a sequence (you have already), and a package. The package contains 2 variables to hold the load_date and load_seq_id, 2 "getter" functions to return them, and code to set them upon instantiation. Then your control file will call the "getter" functions to return the load_date and load_seq_id from the package which will be the same for every row.
So, upon starting the load, when the first row to load calls the function to get the sequence, the package is instantiated and the date and sequence are set, then returned. As long as the session is active the date/sequence will not change from then on and subsequent calls to the getter functions will continue to return the same values.
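The "set once, return many times" behaviour described above can be illustrated outside Oracle. This is only a sketch in Python, not the PL/SQL itself: the class name LoadSeq, the in-memory counter standing in for the sequence, and the sample row IDs are all assumptions made for illustration.

```python
import itertools
from datetime import datetime

# Stand-in for the Oracle sequence; each next() call mimics partnersequence.NEXTVAL.
_partner_sequence = itertools.count(start=1)


class LoadSeq:
    """Mimics a package whose variables are set once, on first use in a session."""

    def __init__(self):
        self._load_seq_id = None
        self._load_date = None

    def _instantiate(self):
        # Runs its body only once per "session", like the package initialization block.
        if self._load_seq_id is None:
            self._load_seq_id = next(_partner_sequence)
            self._load_date = datetime.now()

    def get_load_seq_id(self):
        self._instantiate()
        return self._load_seq_id

    def get_load_date(self):
        self._instantiate()
        return self._load_date


session = LoadSeq()
rows = ["PTNR_78814824", "PTNR_78814825", "PTNR_78814826"]
# Every row loaded in this session sees the same sequence value.
loaded = [(row_id, session.get_load_seq_id()) for row_id in rows]

# A new session (a new load) picks up the next sequence value.
next_session = LoadSeq()
next_load_id = next_session.get_load_seq_id()
```

The first getter call pays for the sequence fetch; every later call in the same session just returns the cached value, which is why all rows of one load share it.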
The package spec and body:
CREATE OR REPLACE PACKAGE SCHEMA.LOAD_SEQ AS
/******************************************************************************
NAME: LOAD_SEQ
PURPOSE: Sets unique load_date and Load_seq_id per session when
the package is instantiated. Package functions are
intended to be called from control files so all rows in a
file load will have the same load_date and
load_seq_id.
When the functions are called, the package is instantiated and
the code at the bottom is run once for the session, setting the
load_date and load_seq_id. The functions simply return the values
which will remain the same for that session.
load_date date "MM/DD/YYYY" "to_char(trunc(schema.load_seq.get_load_date), 'mm/dd/yyyy')",
load_seq_id decimal external "schema.load_seq.get_load_seq_id"
(each row then has the same load_seq_id).
REVISIONS:
Ver Date Author Description
--------- ---------- --------------- ------------------------------------
1.0 2/20/2017 Gary_W 1. Created this package.
******************************************************************************/
NEXT_LOAD_SEQ_ID NUMBER;
NEXT_LOAD_DATE DATE;
FUNCTION GET_LOAD_SEQ_ID
RETURN NUMBER;
FUNCTION GET_LOAD_DATE
RETURN DATE;
END LOAD_SEQ;
/
CREATE OR REPLACE PACKAGE BODY SCHEMA.LOAD_SEQ AS
/******************************************************************************
NAME: GET_LOAD_SEQ_ID
PURPOSE: Return the package variable LOAD_SEQ.NEXT_LOAD_SEQ_ID
which is set when the package is instantiated. It does not
change during the session.
REVISIONS:
Ver Date Author Description
--------- ---------- --------------- ------------------------------------
1.0 2/20/2017 Gary_W 1. Created this package.
******************************************************************************/
FUNCTION GET_LOAD_SEQ_ID
RETURN NUMBER IS
BEGIN
RETURN LOAD_SEQ.NEXT_LOAD_SEQ_ID;
END GET_LOAD_SEQ_ID;
/******************************************************************************
NAME: GET_LOAD_DATE
PURPOSE: Return the package variable LOAD_SEQ.NEXT_LOAD_DATE
which is set when the package is instantiated. It does not
change during the session.
REVISIONS:
Ver Date Author Description
--------- ---------- --------------- ------------------------------------
1.0 2/20/2017 Gary_W 1. Created this package.
******************************************************************************/
FUNCTION GET_LOAD_DATE
RETURN DATE IS
BEGIN
RETURN LOAD_SEQ.NEXT_LOAD_DATE;
END GET_LOAD_DATE;
BEGIN
-- Code outside of the procedures/functions defined in the spec runs
-- once on instantiation of the package, when the package is first called by the session.
-- It sets the package variables which then do not change during the life of the session.
SELECT SYSDATE, partnersequence.NEXTVAL
INTO LOAD_SEQ.NEXT_LOAD_DATE, LOAD_SEQ.NEXT_LOAD_SEQ_ID
FROM DUAL;
END LOAD_SEQ;
/
In your control file:
LOAD_DATE date "MM/DD/YYYY" "to_char(trunc(schema.load_seq.get_load_date), 'mm/dd/yyyy')",
cust_ref_col decimal external "schema.load_seq.get_load_seq_id"
I have a linking table between two tables, ja1_surveyors and ja1_stores. I'm trying to write a stored procedure that will take three arguments, the third being a JSON array of store_id values.
I've tried this, but I know it's not correct:
/*
========================================================================================
Set the list of stores for a surveyor in a survey. Used with template to create the list
a user sees to edit, copy and delete surveyors in a survey
Accepts three arguments:
arg_srvy_id Survey key
arg_srvr_id Surveyor key
STORE_LIST JSON value holding a list of store keys assigned to this survey/surveyor
STORE_LIST JSON should be in the form: '{store_id:val1},{store_id:val2}' etc.
========================================================================================
*/
DROP PROCEDURE IF EXISTS SURVEYOR_LINK_STORES;
DELIMITER //
CREATE PROCEDURE SURVEYOR_LINK_STORES( IN arg_srvy_id INT(11),IN arg_srvr_id INT(11),IN STORE_LIST JSON)
BEGIN
/* Remove all links for this surveyor to stores for this survey */
DELETE FROM `ja1_storesurveyor`
WHERE `lnk_strsrvr_srvy_id` = arg_srvy_id AND `lnk_strsrvr_srvr_id` = arg_srvr_id;
/* Add links between this survey and surveyor for each key in STORE_LIST */
INSERT INTO `ja1_store_surveyor`
(
`lnk_strsrvr_srvy_id`,
`lnk_strsrvr_srvr_id`,
`lnk_strsrvr_store_id`
)
SELECT
arg_srvy_id,
arg_srvr_id,
STORE_LIST->>`$.store_id`
FROM STORE_LIST;
END
DELIMITER ;
The problem seems to be the select part of the insert statement.
All of the columns are INT(11). And I'm using MySQL version 5.6.41-84.1
What am I missing?
The best way to do this is with JSON_TABLE() but it requires MySQL 8.0.
Edit: When I wrote this answer, your original question did not make it clear you were using an old version of MySQL Server.
CREATE PROCEDURE SURVEYOR_LINK_STORES(
IN arg_srvy_id INT,
IN arg_srvr_id INT,
IN arg_STORE_LIST JSON)
BEGIN
/* Remove all links for this surveyor to stores for this survey */
DELETE FROM `ja1_storesurveyor`
WHERE `lnk_strsrvr_srvy_id` = arg_srvy_id
AND `lnk_strsrvr_srvr_id` = arg_srvr_id;
/* Add links between this survey and surveyor for each key in STORE_LIST */
INSERT INTO `ja1_store_surveyor`
(
`lnk_strsrvr_srvy_id`,
`lnk_strsrvr_srvr_id`,
`lnk_strsrvr_store_id`
)
SELECT
arg_srvy_id,
arg_srvr_id,
j.store_id
FROM JSON_TABLE(
arg_STORE_LIST, '$[*]' COLUMNS(
store_id VARCHAR(...) PATH '$'
)
) AS j;
END
I'm guessing the appropriate data type for store_id is a varchar, but I don't know how long the max length should be.
Re your comment: MySQL 5.6 doesn't have any JSON data type, so your stored procedure won't work as you wrote it (the arg_STORE_LIST argument cannot use the JSON data type).
FYI, MySQL 5.6 passed its end-of-life in February 2021, so the version you are using won't get any more bug fixes or security fixes. You should really upgrade, regardless of the JSON issue.
The equivalent code to insert multiple rows in MySQL 5.6 is a lot of work and code to write. I'm not going to write an example for such an old version of MySQL.
You can find other examples on Stack Overflow that show the general principle. It involves taking the argument as a VARCHAR, not JSON, and writing a WHILE loop to pick apart substrings of the varchar.
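The loop logic behind that principle is easier to see outside SQL first. Here is a minimal sketch in Python, assuming the store keys arrive as a plain comma-separated string such as '3,7,12' (an assumed format, not the JSON form from the question). The helper names split_store_ids and build_link_rows are hypothetical. In a stored procedure, str.find would correspond to LOCATE() and the slicing to SUBSTRING().

```python
def split_store_ids(store_list):
    """Pick a comma-separated string apart one substring at a time."""
    ids = []
    remaining = store_list
    while remaining:
        comma_at = remaining.find(",")        # plays the role of LOCATE(',', remaining)
        if comma_at == -1:
            # No comma left: the rest of the string is the last token.
            token, remaining = remaining, ""
        else:
            token = remaining[:comma_at]      # plays the role of SUBSTRING(...)
            remaining = remaining[comma_at + 1:]
        token = token.strip()
        if token:                             # skip empty pieces such as a trailing comma
            ids.append(int(token))
    return ids


def build_link_rows(srvy_id, srvr_id, store_list):
    """One (survey, surveyor, store) tuple per key, ready for a multi-row INSERT."""
    return [(srvy_id, srvr_id, store_id) for store_id in split_store_ids(store_list)]


rows = build_link_rows(1, 42, "3, 7,12")
```

Each pass through the loop consumes one key and shortens the remaining string, so the loop ends when nothing is left to split.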
I want to modify standard SSIS SCD behavior.
EmployeeID is my business key and title, firstname, lastname are type 2 attributes.
I want BatchLogID to reflect when a change occurred - otherwise it remains unchanged.
BatchLogID is passed to the data flow as an int.
EmployeeID,title,firstname,lastname,BatchLogID,startdate,enddate
source data
101,Miss,Jane,Smith,101 -- inserted for first time
101,Miss,Jane,Smith,102 process runs
101,Miss,Jane,Smith,103 process runs
101,Miss,Jane,Smith,104 process runs
101,Mrs, Jane,Brown,105 process runs -- only when data has changed do I want the Batch number in target updated
target data
101,Miss,Jane,Smith,101,101,1 jan 2000,null-- inserted for first time
101,Miss,Jane,Smith,105,105,1 jan 2000,5 Jan 2000 -- as a change is detected the data is updated
101,Mrs, Jane,Brown,105,105 jan 2000,null-- only when data has changed do I want the Batch number updated
Any thoughts?
You may need to perform a delta load using one of:
Lookup and derived column
Merge join, Conditional Split and derived column
Change Data Capture
Temporal tables (if the RDBMS supports that)
To learn more:
https://www.c-sharpcorner.com/article/design-the-full-load-and-delta-load-patterns-in-ssis/
Used a sql merge command - had to wash through twice for
declare @batchLogID int = 1;
MERGE dbo.targetTable AS t
USING dbo.sourceTable AS s
ON (t.[key] = s.[key] and t.endDate is null)
-- current row exists but its value changed: close it as of the end of yesterday
WHEN MATCHED and s.[value] <> t.[value]
THEN UPDATE SET t.enddate = dateadd(ss,-1,cast(cast(getdate() as date) as datetime))
-- no current row for this key: insert one stamped with this batch
WHEN not MATCHED
THEN INSERT ([key],[col1], [col2], [value], [col3],startdate,BatchLogID)
VALUES (s.[key],s.[col1], s.[col2], s.[value], s.[col3],cast(getdate() as date),@batchLogID);
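The two passes can be made explicit in a small sketch. This one uses Python with an in-memory SQLite database rather than SQL Server, and the table names target and source, the column names, and the run dates are assumptions chosen to match the sample data in the question. Pass 1 closes the current row only when a type 2 attribute differs; pass 2 inserts a new current row for any key that has none. As in the MERGE above, the closed row keeps its original batch ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (employee_id INTEGER, title TEXT, firstname TEXT, lastname TEXT,
                         batch_log_id INTEGER, startdate TEXT, enddate TEXT);
    CREATE TABLE source (employee_id INTEGER, title TEXT, firstname TEXT, lastname TEXT);
""")


def run_batch(batch_log_id, run_date, source_rows):
    conn.execute("DELETE FROM source")
    conn.executemany("INSERT INTO source VALUES (?, ?, ?, ?)", source_rows)
    # Pass 1: close the current row only when a type 2 attribute differs.
    # (NULL attributes would need a null-safe comparison instead of <>.)
    conn.execute("""
        UPDATE target SET enddate = ?
        WHERE enddate IS NULL AND EXISTS (
            SELECT 1 FROM source s
            WHERE s.employee_id = target.employee_id
              AND (s.title <> target.title OR s.firstname <> target.firstname
                   OR s.lastname <> target.lastname))
    """, (run_date,))
    # Pass 2: insert a current row for keys that have none (new, or just closed).
    conn.execute("""
        INSERT INTO target
        SELECT s.employee_id, s.title, s.firstname, s.lastname, ?, ?, NULL
        FROM source s
        WHERE NOT EXISTS (SELECT 1 FROM target t
                          WHERE t.employee_id = s.employee_id AND t.enddate IS NULL)
    """, (batch_log_id, run_date))
    conn.commit()


run_batch(101, "2000-01-01", [(101, "Miss", "Jane", "Smith")])
for batch in (102, 103, 104):
    # Unchanged data: neither pass touches the target, so the batch ID stays 101.
    run_batch(batch, "2000-01-0%d" % (batch - 100), [(101, "Miss", "Jane", "Smith")])
run_batch(105, "2000-01-05", [(101, "Mrs", "Jane", "Brown")])

history = conn.execute(
    "SELECT title, lastname, batch_log_id, startdate, enddate FROM target ORDER BY startdate"
).fetchall()
```

Because the batch ID is written only by the insert in pass 2, runs 102 to 104 leave the target untouched and only run 105 produces a new row.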
I am trying to write a Spring Batch Starter job that reads a CSV file and inserts the records into a MySQL DB. When it begins I want to save the start time in a tracking table, and when it ends, the end time in that same table. The table structure is like:
TRACKING : id, start_time, end_time
DATA: id, product, version, server, fk_trk_id
I am unable to find an example project that does such a thing. I believe this needs to be a Spring Batch Starter project that can handle multiple queries, for example:
// insert start time
1. INSERT INTO tracking (start_time) VALUES (NOW(6));
// get last inserted id for foreign key
2. SET @last_id_in_tracking = LAST_INSERT_ID();
// read from CSV and insert data into 'data' DB table
3. INSERT INTO data (product, version, server, fk_trk_id) VALUES ('mysql', '5.1.42', 'Server1', @last_id_in_tracking);
4. INSERT INTO data (product, version, server, fk_trk_id) VALUES ('linux', '7.0', 'Server2', @last_id_in_tracking);
5. INSERT INTO data (product, version, server, fk_trk_id) VALUES ('java', '8.0', 'Server3', @last_id_in_tracking);
// insert end time
6. UPDATE tracking SET end_time = NOW(6) WHERE id = @last_id_in_tracking;
I'd like sample code and explanation on how to use those queries to multiple tables in the same Spring Batch Starter job.
start of edit section - additional question
I do have an additional question. In my entities I have the relationships set up with annotations (e.g. @ManyToOne, @JoinColumn)...
In your code, how would I get the trackingId from a referenced object? Let me explain:
My Code (Data.java):
@JsonManagedReference
@ManyToOne
@JoinColumn(name = "id")
private Tracking tracking;
Your code (Data.java):
@Column(name = "fk_trk_id")
private Long fkTrkId;
Your code (JobConfig.java):
final Data data = new Data();
data.setFkTrkId(trackingId);
How do I set the id with "setFkTrkId" when the relationship in my Entity is an object?
end of edit section - additional question
Here is an example app that does what you're asking. Please see the README for details.
https://github.com/joechev/examples/tree/master/csv-reader-db-writer
I have created a project for you as an example. Please refer to https://bigzidane.wordpress.com/2018/02/25/spring-batch-mysql-reader-writer-processor-listener/
This example simply has a Reader/Processor/Writer. The reader reads a CSV file, the processor processes each record, and the writer writes to the database.
We also have a listener to capture job start and job end. At job start, we insert an entry into the DB and get back a generated ID. We pass the same ID to the writer when it stores the entries.
Note: I'm sorry, I reused an example I already had, so it may not match your question 100%, but technically it should be the same.
Thanks,
Nghia
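The listener flow described in this answer (insert a tracking row at job start, hand its generated ID to the writer, stamp the end time at job end) can be sketched without Spring Batch. The following is only an illustration in Python with an in-memory SQLite database, not the linked project's code: the function name run_job, the in-memory CSV, and the use of SQLite instead of MySQL are assumptions. The table and column names come from the question.

```python
import csv
import io
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracking (id INTEGER PRIMARY KEY AUTOINCREMENT,
                           start_time TEXT, end_time TEXT);
    CREATE TABLE data (id INTEGER PRIMARY KEY AUTOINCREMENT, product TEXT,
                       version TEXT, server TEXT,
                       fk_trk_id INTEGER REFERENCES tracking(id));
""")

# Stand-in for the CSV file on disk.
csv_file = io.StringIO("mysql,5.1.42,Server1\nlinux,7.0,Server2\njava,8.0,Server3\n")


def run_job(csv_source):
    # "Before job": record the start time and keep the generated key.
    cur = conn.execute("INSERT INTO tracking (start_time) VALUES (?)",
                       (datetime.now().isoformat(),))
    tracking_id = cur.lastrowid
    # "Step": read each CSV record and write it with the tracking key attached.
    rows = [(product, version, server, tracking_id)
            for product, version, server in csv.reader(csv_source)]
    conn.executemany(
        "INSERT INTO data (product, version, server, fk_trk_id) VALUES (?, ?, ?, ?)", rows)
    # "After job": record the end time on the same tracking row.
    conn.execute("UPDATE tracking SET end_time = ? WHERE id = ?",
                 (datetime.now().isoformat(), tracking_id))
    conn.commit()
    return tracking_id


tracking_id = run_job(csv_file)
```

In Spring Batch terms, the first and last statements would live in a job execution listener and the middle part in the reader and writer, with the generated ID carried between them through the job's execution context or a similar shared holder.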
I am developing a PHP script and I have a table like this:
TABLE_CODE
code varchar 8
name varchar 30
The code column has to hold a code made of random letters from A to Z and digits from 0 to 9, all uppercase, and it has to be unique. Something like
A4RTX33Z
I have created a method to generate this code using PHP. But this is an intensive task, because I have to query the database to check that the generated code is unique before proceeding, and the table may have a lot of records.
I know MySQL is a bag of tricks, but I don't have advanced knowledge of it yet, so I wonder if there's some mechanism that could be built into a table to run a script (or something) every time a new record is created in that table, filling the code column with a unique value.
Thanks
Edit: What I wonder is whether there's a way to create the code on the fly, as the record is being added to the table, with that code being unique.
It is better to generate these codes in SQL. This is an 8-character random "promo code generator":
INSERT IGNORE INTO
TABLE_CODE(code, name)
VALUES(
UPPER(SUBSTRING(MD5(RAND()) FROM 1 FOR 8)), -- random 8 characters, fixed length (hex digits only: 0-9 and A-F)
'your code name'
)
Add a UNIQUE constraint on the code field as @JW suggested, and some error handling in PHP, because the generated value will occasionally collide with an existing code. With INSERT IGNORE the duplicate row is skipped with a warning (zero affected rows); a plain INSERT raises a duplicate-key error instead.
Adding a UNIQUE constraint on the code column is the first thing you would need to do. Then, to insert the code I would write a small loop like this:
// INSERT IGNORE will not generate an error if the code already exists
// rather, the affected rows will be 0.
$stmt = $db->prepare('INSERT IGNORE INTO table_code (code, name) VALUES (?, ?)');
$name = 'whatever name';
do {
$code = func_to_generate_code();
$stmt->execute(array($code, $name));
} while (!$stmt->rowCount()); // repeat until at least one row affected
As the table grows, the number of loop iterations may increase, so if you feel it should only try three times, you could add that as a loop condition and throw an error when the limit is reached.
By the way, I would suggest using transactions: if an error occurs after the code has been inserted, rolling back removes the code so it can be reused.
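The bounded retry mentioned above might look like the following. It is a sketch in Python with an in-memory SQLite database rather than PHP and MySQL, so INSERT OR IGNORE stands in for MySQL's INSERT IGNORE. The function names, the max_attempts default of three, and the make_code parameter are assumptions for illustration.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_uppercase + string.digits  # A-Z and 0-9

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_code (code TEXT UNIQUE, name TEXT)")


def generate_code(length=8):
    """Random uppercase alphanumeric code, e.g. of the form A4RTX33Z."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def insert_with_unique_code(name, max_attempts=3, make_code=generate_code):
    for _ in range(max_attempts):
        code = make_code()
        # INSERT OR IGNORE is SQLite's counterpart of MySQL's INSERT IGNORE:
        # a duplicate code affects zero rows instead of raising an error.
        cur = conn.execute(
            "INSERT OR IGNORE INTO table_code (code, name) VALUES (?, ?)", (code, name))
        if cur.rowcount == 1:
            conn.commit()
            return code
    raise RuntimeError("no unique code found after %d attempts" % max_attempts)


first = insert_with_unique_code("whatever name")
```

With 36 possible characters in 8 positions, a collision is rare until the table is very large, so the loop almost always succeeds on the first attempt and the UNIQUE constraint does the checking without a separate SELECT.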
I am using Microsoft SQL Server 2008. I tried all three solutions, but every time I get the same error:
Error at Data Flow Task[OLEDB source[449]]:No column information was returned by the sql command
I am using the following batch of SQL statements to retrieve the server-level configuration of all servers in my company. The table variable @tb1_SvrStng has 83 columns and is populated from different sources.
So I have abbreviated the SQL script below. I cannot use it as a stored procedure because this script is going to run against 14 servers (once for each server). If I store the procedure on one server, the other servers cannot execute it in their own context.
I would highly appreciate your help. I am not using any temporary tables in my script.
declare @tb1_SvrStng table
(
srvProp_MachineName varchar(50),
srvProp_BldClrVer varchar(50),
srvProp_Collation varchar(50),
srvProp_CNPNB varchar(100),
...
xpmsver_ProdVer varchar(50),
..... .
syscnfg_UsrCon_cnfgVal int,
.....
);
insert into @tb1_SvrStng
(
srvProp_BldClrVer,
srvProp_Collation,
srvProp_CNPNB , ........
........ .
)
select convert(varchar, serverproperty('BuildClrVer')),
convert(varchar, serverproperty('Collation'))
........
.......
declare @temp_msver1 table
(
id int, name varchar(100),
...........
);
insert into @temp_msver1 exec xp_msver
Update @tb1_SvrStng
set xpmsver_ProdVer =
(
select value from @temp_msver1 where name = 'ProductVersion'
),
xpmsver_Platform =
(
select value from @temp_msver1 where name = 'Platform'
),
.....
......
select
srvProp_SerName as srvProp_SerName,
getdate() as reportDateTime,
srvProp_BldClrVer as srvProp_BldClrVer,
srvProp_Collation as srvProp_Collation,
.....
.....
from @tb1_SvrStng
From what I can gather from your code and question, either the query cannot be processed outside runtime because you're doing something dynamic to it, or the component won't process it because it's doing something funky.
One trick would be to point the Source component in the data flow task at something "dummy", which you can fake with a query like this:
SELECT
CONVERT(DATATYPE,NULL) AS srvProp_SerName,
CONVERT(DATETIME,NULL) AS reportDateTime,
CONVERT(DATATYPE,NULL) AS srvProp_BldClrVer,
CONVERT(DATATYPE,NULL) AS srvProp_Collation
This way the source component should be able to read the metadata. You can then put your proper query (as long as it's within the limits of the length of the query text) into a variable, and then assign this as an expression to the source component.
At runtime it will then use the expression query and hopefully won't mind the metadata issue too much.
This may or may not work but it should be worth a try since it won't take long to confirm.