Insert rows with batch id using sqlldr - sql-loader

I am able to insert rows into a table using sqlldr with no worries. I would like to tag all rows of a file with a unique number so that I can treat them as one batch. I tried "my_db_seq.nextval" as the batch id, but it does not serve my purpose.
So please advise on how to create a unique batch id for the entire set of rows in a file while loading with sqlldr.

Wrap your call to the sequence in a function like this:
create or replace function get_batch_id return integer is
x exception;
-- ORA-08002: sequence %s.CURRVAL is not yet defined in this session
pragma exception_init (x, -8002);
begin
return my_db_seq.currval;
exception
when x then
return my_db_seq.nextval;
end;
Then call it from the control file:
...
batch_id "get_batch_id()"
...
From this post: https://www.orafaq.com/forum/t/187236/.

Selecting Into Lag (Window Function) in MySQL Stored Function

I am trying to automate mathematical formulas that I need to run many times over a dataset for a view page. I have used SELECT ... INTO successfully to create these values. Below is an example of code that has worked:
DELIMITER //
DROP FUNCTION IF EXISTS InvLevel //
CREATE FUNCTION InvLevel(
metric DECIMAL (65,4),
norm DECIMAL (65,10)
)
RETURNS DECIMAL(65,4)
BEGIN
SET @InvLevel = NULL;
SELECT
metric * norm
INTO @InvLevel;
RETURN @InvLevel;
END //
DELIMITER ;
I then call this with:
CREATE VIEW Mainfile AS
SELECT
Clients, Users, Normalization, ID, Year,
MNorm(Clients, Normalization) AS ClientsN,
MNorm(Users, Normalization) AS UsersN
FROM Dataset;
Does everything I want! I'm happy.
However, one of the important functions I need to run is a lag function, i.e. I'm interested in how clients, users, etc. have changed over time. So I wrote the following stored function:
DELIMITER //
DROP FUNCTION IF EXISTS Pace //
CREATE FUNCTION Pace(
metric DECIMAL(65,4),
id VARCHAR(20),
year INT
)
RETURNS DECIMAL(65,4)
BEGIN
SELECT
metric - (LAG(metric, 1) OVER (PARTITION BY id ORDER BY year))
INTO @paceco;
RETURN @paceco;
END //
DELIMITER ;
That I then call as:
Pace(Clients,ID,Year) AS ClientsP
However, this operation only returns NULL values in ClientsP. I know it's not a problem with the formula, since if I write out the math directly in the CREATE VIEW statement, I receive the correct values.
I know that I can just plug the original math into the view definition, but I have many metrics I will be repeating this formula on (for the next few years or so), so I would vastly prefer to automate it instead of having long, messy SQL files. Thanks in advance.

How to create random numbers at set time interval in mysql?

I want to insert random numbers between 23 and 31 into a MySQL table every 5 seconds.
I know that I have to use the RAND() function, but I don't know how to use it to insert random numbers at a set time interval.
You can use the event scheduler for this:
create table foo (id int primary key auto_increment, value int);
create event insert_random_value_every_5_sec on schedule every 5 second do insert into foo (value) select floor(23+rand()*9) as value;
If the event scheduler is disabled, you will need to enable it:
set global event_scheduler=on;
You can specify start and/or end times for the event in the CREATE EVENT statement, or later with ALTER EVENT.
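The expression FLOOR(RAND()*(31-23+1))+23 gives a random integer from 23 to 31, and the condition MOD(SECOND(NOW()),5)=0 is true only when the current second is a multiple of 5. (CURDATE() carries no time part, so NOW() is used here.) Note that this only filters at the moment the query runs; it does not schedule anything by itself. You can use this SQL:
SELECT
FLOOR(RAND()*(31-23+1))+23
FROM your_table
WHERE MOD(SECOND(NOW()),5)=0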
Alternatively, you can use a PHP file and a JavaScript interval timer with AJAX to insert the random value into the database. Here are the steps, without code.
Steps:
1. Create a PHP function to connect to your database (search for mysqli_connect).
2. Create a PHP function to handle the AJAX request and save the value in the database (search for mysqli_query).
3. Create a script function (for example, named ajax_query) that sends an AJAX request, using a RAND function to generate the random number you want. (You can read this question)
4. Create a script interval function that calls "ajax_query".
PS: Don't forget to include the jQuery library in your file.

Is there a way to change the default date input format

I have a source of data from where I extract some fields, among the fields there are some date fields and the source sends their values like this
#DD/MM/YYYY#
Almost all the fields can be sent into the query with no modification, except this one of course.
I have written a program that gets the data from an internet connection and sends it to the MySQL server, and it's sending everything as it should. I am sure because I enabled general logging in the MySQL server and I can see that all the queries are correct, except the ones with date fields.
I would like to avoid parsing the fields this way because it's a lot of work since it's all written in C, but if there is no way to do it, I understand and would accept that as an answer of course.
As an example suppose we had the following
INSERT INTO sometable VALUES ('#12/10/2015#', ... OTHER_VALUES ..., '#11/10/2015#');
in this case I send the whole thing as a query using mysql_query() from libmysqlclient.
In other cases I can split the parts of the message in something that is like an instruction and the parameters, something like this
iab A,B,C,#12/10/2015#,X,Y,#11/10/2015#
which could mean INSERT INTO table_a_something_b_whatever VALUES, and in this situation of course, I capture all the parameters and send a single query with a list of VALUES in it. Also in this situation, it's rather simple because I can handle the date like this
char date[] = "#11/10/2015#";
int day;
int month;
int year;
if (sscanf(date, "#%d/%d/%d#", &day, &month, &year) == 3)
{
/* it's fine, build a sane YYYY-MM-DD */
}
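A minimal sketch of that conversion step, assuming the token always has the shape #DD/MM/YYYY#. The function name source_date_to_iso is hypothetical, and the range check is only a coarse sanity test (it does not know month lengths), so MySQL still has the final say on whether the date is valid.

```c
#include <stdio.h>

/* Hypothetical helper: convert "#DD/MM/YYYY#" into "YYYY-MM-DD".
 * Returns 0 on success, -1 if the input does not parse, is out of
 * range, or does not fit. `out` needs room for at least 11 bytes. */
static int source_date_to_iso(const char *in, char *out, size_t outlen)
{
    int day, month, year, n;
    char tail; /* captures the closing '#', so a missing one is detected */

    if (sscanf(in, "#%2d/%2d/%4d%c", &day, &month, &year, &tail) != 4 || tail != '#')
        return -1;
    /* Coarse range check only; month lengths are not verified here. */
    if (day < 1 || day > 31 || month < 1 || month > 12 || year < 0)
        return -1;
    n = snprintf(out, outlen, "%04d-%02d-%02d", year, month, day);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return 0;
}
```

Called with "#11/10/2015#" and an 11-byte buffer, this writes 2015-10-11 into the buffer and returns 0.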
So the question is:
How can I tell the MySQL server in what format the date fields are?
Clarification to: Comment 1
Not necessarily INSERT, it's more complex than that. They are sometimes queries with all their parameters in it, sometimes they are just the parameters and I have to build the query. It's a huge mess but I can't do anything about it because it's a paid database and I must use it for the time being.
The real problem is when the query comes from the source and has to be sent as it is, because then there can be many occurrences. When I split the parameters one by one there is no real problem because parsing the above date format and generating the appropriate value of MySQL is quite simple.
You can use STR_TO_DATE() in MySQL:
SELECT STR_TO_DATE('#08/10/2015#','#%d/%m/%Y#');
Use this as part of your INSERT process:
INSERT INTO yourtable (yourdatecolumn) VALUES (STR_TO_DATE('#08/10/2015#','#%d/%m/%Y#'));
The only thing I can imagine at the moment would be to change your column type from DATETIME to VARCHAR and use a BEFORE INSERT trigger to fix "wrong" dates.
Something like this:
DELIMITER //
CREATE TRIGGER t1 BEFORE INSERT on myTable FOR EACH ROW
BEGIN
IF (NEW.myDate regexp '#[[:digit:]]+\/[[:digit:]]+\/[[:digit:]]+#') THEN
SET NEW.myDate = STR_TO_DATE(NEW.myDate,'#%d/%m/%Y#');
END IF;
END; //
DELIMITER ;
If you just need to run the import in question once, use the trigger to generate a "proper" DATETIME column out of the inserts, and drop the VARCHAR column afterwards:
('myDate' := VARCHAR column to be dropped afterwards; 'myRealDate' := DATETIME column to keep afterwards)
DELIMITER //
CREATE TRIGGER t1 BEFORE INSERT on myTable FOR EACH ROW
BEGIN
IF (NEW.myDate regexp '#[[:digit:]]+\/[[:digit:]]+\/[[:digit:]]+#') THEN
SET NEW.myRealDate = STR_TO_DATE(NEW.myDate,'#%d/%m/%Y#');
else
#assume a valid date representation
SET NEW.myRealDate = NEW.myDate;
END IF;
END; //
DELIMITER ;
Unfortunately, you cannot use a trigger to work on the DATETIME column itself, because MySQL will already have mangled the NEW.myDate value by the time the trigger runs.
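If neither STR_TO_DATE() nor a schema change is an option, the dates can also be rewritten on the client before the query is passed to mysql_query(). The following is only a sketch, assuming every date token has exactly the shape #DD/MM/YYYY# with a two-digit day and month. The names is_source_date and rewrite_source_dates are hypothetical. Because YYYY-MM-DD is two characters shorter than the token, the rewrite can be done in place without growing the buffer.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: does p start with a "#DD/MM/YYYY#" token? */
static int is_source_date(const char *p)
{
    static const char pattern[] = "#dd/dd/dddd#"; /* 'd' stands for one digit */
    size_t i;

    for (i = 0; pattern[i] != '\0'; i++) {
        if (pattern[i] == 'd') {
            if (!isdigit((unsigned char)p[i]))
                return 0; /* also stops at the end of the string */
        } else if (p[i] != pattern[i]) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical helper: rewrite every "#DD/MM/YYYY#" in query to
 * "YYYY-MM-DD", in place. Returns the number of tokens rewritten. */
static int rewrite_source_dates(char *query)
{
    char *src = query; /* read position */
    char *dst = query; /* write position, never ahead of src */
    int count = 0;

    while (*src != '\0') {
        if (is_source_date(src)) {
            char iso[10];
            memcpy(iso, src + 7, 4);     /* YYYY */
            iso[4] = '-';
            memcpy(iso + 5, src + 4, 2); /* MM */
            iso[7] = '-';
            memcpy(iso + 8, src + 1, 2); /* DD */
            memcpy(dst, iso, sizeof iso);
            dst += sizeof iso;
            src += 12;                   /* length of the original token */
            count++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return count;
}
```

Applied to the INSERT from the question, the two quoted values become '2015-10-12' and '2015-10-11'. Note the limits of this sketch: it does not check day or month ranges, and it rewrites any text that happens to match the pattern, including text inside unrelated string values.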

Hive External Table exclude records that violate data type

I have an external table in Hive that uses a serde to process json records. Occasionally there will be a value that does not match the table ddl data type, e.g. table field definition is int, json has a string value. During query execution Hive will correctly throw this error for metadata exception due to type mismatch:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException:
Hive Runtime Error while processing writable
Is there a way to set Hive to just ignore these records that have data type violations?
Note that the JSON is syntactically valid, so setting SerDe properties such as the one that ignores malformed JSON does not help.
Example DDL:
CREATE EXTERNAL TABLE IF NOT EXISTS test_tbl (
acd INT,
tzo INT
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
;
ALTER TABLE test_tbl SET SERDEPROPERTIES ( "ignore.malformed.json" = "true");
Example data - the TZO = alpha record will cause the error:
{"acd":6,"tzo":4}
{"acd":6,"tzo":7}
{"acd":6,"tzo":"alpha"}
You can set up Hive to tolerate a configurable amount of failures.
SET mapred.skip.mode.enabled = true;
SET mapred.map.max.attempts = 100;
SET mapred.reduce.max.attempts = 100;
SET mapred.skip.map.max.skip.records = 30000;
SET mapred.skip.attempts.to.start.skipping = 1;
This is not Hive specific and can be applied to ordinary MapReduce as well.
I don't think there is a way to handle this in Hive yet. I think you may need an intermediate step using MR, Pig, etc. to make sure the data is sound, and then load from that result.
There may be a configuration parameter here you could use
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-SerDes
I'm thinking you may be able to write your own exception handler to catch that and continue, by specifying your custom handler with hive.io.exception.handlers.
Alternatively, if you are OK with storing the data as ORC instead of text, you can specify the ORC file format with HiveQL statements such as these:
CREATE TABLE ... STORED AS ORC
ALTER TABLE ... [PARTITION partition_spec] SET FILEFORMAT ORC
And then when you run your jobs you can use the skip setting:
set hive.exec.orc.skip.corrupt.data=true;

Number of rows incorrectly retrieved with updateQuery() statement

I want to print the number of rows retrieved by an executeUpdate() call in JDBC (MySQL database). The code I have so far is this:
int rows=0;
//constructor sets this up like opening connection, etc...
String buildSelectQuery = buildSelectQueryForCode();
stmt = connection.createStatement();
rows= stmt.executeUpdate(buildSelectQuery); // <---- mismatch
System.out.println(rows);
Where buildSelectQuery is CREATE VIEW viewName AS (SELECT * FROM tableName WHERE gen-congruence>1). There is a getRows method as well in the class:
public String getRows(){
return Integer.toString(rows);
}
Now, this query should ideally pull out over 2000 records, and the view in the database does contain them, but getRows (which is called from the GUI) prints an incorrect number of rows for the view (0) and I have no idea why. Is there another way to set up the result set? Am I doing something wrong? Please help me.
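Your statement creates a view; it does not select from the view, so no rows are returned. executeUpdate() returns a row count only for DML statements (INSERT, UPDATE, DELETE) and 0 for statements that return nothing, such as CREATE VIEW. To get the number of rows, run a SELECT against the view with executeQuery() (for example, SELECT COUNT(*) FROM viewName) and set rows from that result.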