Copy previous values - Kettle (Pentaho Data Integration)

I have an issue and I'm going around in circles on it! :| I hope someone can help me..
So I have an input file (.xls) that is simple, but there is a column (let's say it's "ROW1") that looks like this:
ROW1 | ROW2 | ROW3 | ROW_N
765 | 1 | AAAA-MM-DD | ...
null | 1 | AAAA-MM-DD | ...
null | 1 | AAAA-MM-DD | ...
944 | 2 | AAAA-MM-DD | ...
null | 2 | AAAA-MM-DD | ...
088 | 7 | AAAA-MM-DD | ...
555 | 2 | AAAA-MM-DD | ...
null | 2 | AAAA-MM-DD | ...
There is no standard here, as you can see.. There are some null lines (in ROW1), and in ROW2 there are equal numbers with different associations to ROW1 (like in lines 5 and 6, then in lines 8 and 9).
My objective is to copy the value from ROW1 down into the rows after it while ROW1 is null, until it isn't null anymore. Basically it is to copy from the previous row when the current value is null...
I'm trying to use the "Formula" step, by using something like:
=IF(AND(ISBLANK([ROW1]);NOT(ISBLANK([ROW2]));ROW_n=ROW1;IF(AND(NOT(ISBLANK([ROW1]));NOT(ISBLANK([ROW2]));ROW_n=ROW1;ROW_n=""));
But nothing yet..
I've tried "Analytic Query" but nothing there either..
I'm using just a stream from an xls file input..
Thanks very much, any help is very much appreciated!!
Best Regards!

Well, I discovered a solution, adding a "User Defined Java Class" step with the code below:
import java.util.HashMap;
private FieldHelper output_field, card_field;
private RowSet out, log;
private String previous_card = null;
public boolean processRow(StepMetaInterface smi, StepDataInterface sdi) throws KettleException
{
if (first)
{
first = false;
out = findTargetRowSet("out");
output_field = get(Fields.Out, "previous_card");
} else {
Object[] r = getRow();
if (r == null) {
setOutputDone();
return false;
}
r = createOutputRow(r, data.outputRowMeta.size());
if (previous_card != null) {
output_field.setValue(r, previous_card);
}
if (card_field == null) {
card_field = get(Fields.In, "Grupo de Cartões");
}
String card = card_field.getString(r);
if (card != null && !card.isEmpty()) {
previous_card = card;
}
// Send the row on to the next step.
putRowTo(data.outputRowMeta, r, out);
}
return true;
}
After this I had to add a few more steps, but this helped very much.
Thank you mates!!

Finally I got the result. Please follow the steps below.
The image below shows the full transformation screen.
The Data Grid data will be like this. Sorry, on my local machine I don't have Microsoft Excel, which is why I used a Data Grid; instead of the Data Grid you can drag and drop a Microsoft Excel Input step.
Drag and drop one JavaScript (Modified Java Script Value) step and write the code below.
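Something along these lines should do the fill-down (a sketch only, assuming the field to fill is called ROW1 and that the state variable is initialised in the step's Start Script so it survives between rows; add filledRow1 as an output field in the Fields grid):
// Start Script (runs once, before the first row):
var previousValue = null;

// Transform Script (runs once per row):
var filledRow1 = ROW1;
if (filledRow1 == null || filledRow1 == "") {
    // empty cell: reuse the last non-empty value we saw
    filledRow1 = previousValue;
} else {
    // non-empty cell: remember it for the rows that follow
    previousValue = filledRow1;
}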
As the last step of the transformation, drag and drop a Select values step and select the columns. (This step is not necessary.)
The final result will be like this.
Hope this helps.

Related

Checking Discord messages with SQL

So I've been trying to check each message against an SQL table. My SQL table structure is below:
Example:
| id                 | triggervalue       | triggermessage |
| 633666515413237791 | hello, world, test | Trigger works! |
(the array-like string is something like: hello, world, test)
I want to check each message against each row's triggervalue column, to see if the message contains a string from that array-like string.
Here is what I've done:
I tried to merge every single array-like string into one array, check the message for each word, and then send(triggermessage) for the row the found word belongs to.
connection.query(`SELECT * FROM triggervalue`, (err, rows) => {
    let array = []
    for (let i = 0; i < rows.length; i++) {
        let received = rows[i].jsonstring;
        var intarray = received.replace(/^\[|\]$/g, "").split(", ");
        array = array.concat(intarray) // concat returns a new array, it does not modify in place
        // continue code here...
    }
})
However, I can't get the triggermessage of the same row as the found array entry. How would I go about it? I've been stuck here for quite a while... Sorry if this way of asking is wrong, thanks!
(Sorry if my English is bad)
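A minimal sketch of keeping that association, assuming a discord.js message object is in scope and the columns are named triggervalue / triggermessage as in the table above (adjust names to your schema): instead of merging everything into one array, check the message row by row so the matching triggermessage stays attached to its triggers:
connection.query(`SELECT * FROM triggervalue`, (err, rows) => {
    if (err) return console.error(err);
    for (const row of rows) {
        // turn "hello, world, test" into ["hello", "world", "test"]
        const triggers = row.triggervalue.replace(/^\[|\]$/g, "").split(", ");
        // if the incoming message contains any trigger of this row,
        // reply with that same row's triggermessage
        if (triggers.some(t => message.content.includes(t))) {
            message.channel.send(row.triggermessage);
            break; // stop after the first matching row
        }
    }
});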

storeText from a table: choosing the correct lines to store, for Selenium IDE / UI.Vision Kantu

I need to storeText all lines from a table where CODICE CATASTALE has a value.
I added an image to show it: I need to save every such line in a variable with storeText, with the characteristic that CODICE CATASTALE has a value; in the image I added 1 - 2 - 3 - 4 to mark the lines to store.
It is a conditional storeText: when CODICE CATASTALE has a value, store the line.
Here is the page:
nonsolocap.it/cap?k=56040
After executing the following script in Selenium IDE, the table variable will contain the data from the table. xpath_to_all_rows_with_CODICE_CATASTALE should be replaced with the corresponding XPath.
storeXpathCount | xpath=xpath_to_all_rows_with_CODICE_CATASTALE | n
store | 0 | j
while | ${j} < ${n} |
store |  | rowElement
store | 0 | i
while | ${i} < 7 |
storeText | xpath=xpath_to_all_rows_with_CODICE_CATASTALE[${j}]/td[${i}] | element
executeScript | if (${i} != 0) var arr = ${rowElement}; else var arr = []; var element = ${element}; arr.push(element); return arr; | rowElement
executeScript | return Number(${i}) + 1; | i
end |  |
executeScript | if (${j} != 0) var arr = ${table}; else var arr = []; var rowElement = ${rowElement}; arr.push(rowElement); return arr; | table
executeScript | return Number(${j}) + 1; | j
end |  |
Use the csvSave command and give a name for the target to divide the rows.

Loading quoted numbers into a Snowflake table from CSV with COPY INTO <table>

I have a problem with loading CSV data into a Snowflake table. Fields are wrapped in double quote marks and hence there is a problem with importing them into the table.
I know that COPY INTO has the CSV-specific option FIELD_OPTIONALLY_ENCLOSED_BY = '"' but it's not working at all.
Here are some pieces of the table definition and copy command:
CREATE TABLE ...
(
GamePlayId NUMBER NOT NULL,
etc...
....);
COPY INTO ...
FROM ...csv.gz'
FILE_FORMAT = (TYPE = CSV
STRIP_NULL_VALUES = TRUE
FIELD_DELIMITER = ','
SKIP_HEADER = 1
error_on_column_count_mismatch=false
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
)
ON_ERROR = "ABORT_STATEMENT"
;
The CSV file looks like this:
"3922000","14733370","57256","2","3","2","2","2019-05-23 14:14:44",",00000000",",00000000",",00000000",",00000000","1000,00000000","1000,00000000","1317,50400000","1166,50000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000",",00000000"
I get the error:
Numeric value '"3922000"' is not recognized
I'm pretty sure it's because the NUMBER value is interpreted as a string when Snowflake is reading the "" marks, but since I use
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
it shouldn't even be there... Does anyone have a solution to this?
Maybe something is incorrect with your file? I was just able to run the following without issue.
1. create the test table:
CREATE OR REPLACE TABLE
dbNameHere.schemaNameHere.stacko_58322339 (
num1 NUMBER,
num2 NUMBER,
num3 NUMBER);
2. create test file, contents as follows
1,2,3
"3922000","14733370","57256"
3,"2",1
4,5,"6"
3. create stage and put file in stage
4. run the following copy command
COPY INTO dbNameHere.schemaNameHere.STACKO_58322339
FROM @stageNameHere/stacko_58322339.csv.gz
FILE_FORMAT = (TYPE = CSV
STRIP_NULL_VALUES = TRUE
FIELD_DELIMITER = ','
SKIP_HEADER = 0
ERROR_ON_COLUMN_COUNT_MISMATCH=FALSE
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
)
ON_ERROR = "CONTINUE";
5. results
+-----------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
| file | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
|-----------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
| stageNameHere/stacko_58322339.csv.gz | LOADED | 4 | 4 | 4 | 0 | NULL | NULL | NULL | NULL |
+-----------------------------------------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
1 Row(s) produced. Time Elapsed: 2.436s
6. view the records
>SELECT * FROM dbNameHere.schemaNameHere.stacko_58322339;
+---------+----------+-------+
| NUM1 | NUM2 | NUM3 |
|---------+----------+-------|
| 1 | 2 | 3 |
| 3922000 | 14733370 | 57256 |
| 3 | 2 | 1 |
| 4 | 5 | 6 |
+---------+----------+-------+
Can you try a similar test to this?
EDIT: A quick look at your data shows many of your numeric fields appear to start with commas, so something is definitely amiss with the data.
Assuming your numbers are European-formatted (comma as the decimal separator and dot as the thousands separator), reading the numeric formatting help it seems Snowflake does not support this as input. I'd open a feature request.
But if you read the column in as text and then use REPLACE, like:
SELECT '100,1234'::text as A
,REPLACE(A,',','.') as B
,TRY_TO_DECIMAL(b, 20,10 ) as C;
gives:
A B C
100,1234 100.1234 100.1234000000
Safer would be to strip the thousands separators first, like:
SELECT '1.100,1234'::text as A
,REPLACE(A,'.') as B
,REPLACE(B,',','.') as C
,TRY_TO_DECIMAL(C, 20,10 ) as D;
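which should give something like:
A B C D
1.100,1234 1100,1234 1100.1234 1100.1234000000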

Why does reading a CSV file with empty values lead to an IndexOutOfBoundsException?

I have a CSV file with the following structure:
Name | Val1 | Val2 | Val3 | Val4 | Val5
John 1 2
Joe 1 2
David 1 2 10 11
I am able to load this into an RDD fine. I tried to create a schema and then a DataFrame from it, and I get an IndexOutOfBounds error.
Code is something like this ...
val rowRDD = fileRDD.map(p => Row(p(0), p(1), p(2), p(3), p(4), p(5), p(6)))
When I try to perform an action on rowRDD, it gives the error.
Any help is greatly appreciated.
This is not an answer to your question, but it may help to solve your problem.
From the question I see that you are trying to create a dataframe from a CSV.
Creating a dataframe from a CSV can be done easily using the spark-csv package.
With spark-csv, the Scala code below can be used to read a CSV:
val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load(csvFilePath)
For your sample data I got the following result
+-----+----+----+----+----+----+
| Name|Val1|Val2|Val3|Val4|Val5|
+-----+----+----+----+----+----+
| John| 1| 2| | | |
| Joe| 1| 2| | | |
|David| 1| 2| | 10| 11|
+-----+----+----+----+----+----+
You can also use inferSchema with the latest version. See this answer.
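For example, a sketch using the same spark-csv reader as above:
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .option("inferSchema", "true") // infer column types instead of reading everything as strings
  .load(csvFilePath)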
Empty values are not the issue if the CSV file contains a fixed number of columns and your CSV looks like this (note the empty field separated by its own commas):
David,1,2,10,,11
The problem is that your CSV file contains 6 columns, yet with:
val rowRDD = fileRDD.map(p => Row(p(0), p(1), p(2), p(3), p(4), p(5), p(6)))
you try to read 7 columns. Just change your mapping to:
val rowRDD = fileRDD.map(p => Row(p(0), p(1), p(2), p(3), p(4), p(5)))
And Spark will take care of the rest.
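If it helps, a sketch of the remaining step, assuming string columns and the field names from the sample (adjust the types as needed):
import org.apache.spark.sql.types.{StructField, StringType, StructType}

// build a schema matching the 6 columns, then turn the RDD[Row] into a DataFrame
val schema = StructType(Seq("Name", "Val1", "Val2", "Val3", "Val4", "Val5")
  .map(name => StructField(name, StringType, nullable = true)))
val df = sqlContext.createDataFrame(rowRDD, schema)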
A possible solution to that problem is replacing the missing values with Double.NaN. Suppose I have a file example.csv with these columns in it:
David,1,2,10,,11
You can read the CSV file as a text file as follows (this assumes numeric fields; a text column like Name would have to be handled separately):
val fileRDD = sc.textFile("example.csv").map(x => { val y = x.split(",", -1); y.map(k => if (k == "") Double.NaN else k.toDouble) })
And then you can use your code to create a dataframe from it.
You can do it as follows.
val rowRDD = sc.textFile(csvFilePath)
  .map(_.split(delimiter_of_file, -1))
  .map(p =>
    Row(
      p(0),
      p(1),
      p(2),
      p(3),
      p(4),
      p(5)))
Split using the delimiter of your file. When you set -1 as the limit, it considers all the empty fields.
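For example, with a row like John's (which ends in empty fields), the limit makes the difference:
"John,1,2,,,".split(",")      // Array(John, 1, 2)             trailing empty fields are dropped
"John,1,2,,,".split(",", -1)  // Array(John, 1, 2, "", "", "")  all six fields are kept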

MySQL recursive self join

create table test(
container varchar(1),
contained varchar(1)
);
insert into test values('X','A');
insert into test values('X','B');
insert into test values('X','C');
insert into test values('Y','D');
insert into test values('Y','E');
insert into test values('Y','F');
insert into test values('A','P');
insert into test values('P','Q');
insert into test values('Q','R');
insert into test values('R','Y');
insert into test values('Y','X');
select * from test;
mysql> select * from test;
+-----------+-----------+
| container | contained |
+-----------+-----------+
| X | A |
| X | B |
| X | C |
| Y | D |
| Y | E |
| Y | F |
| A | P |
| P | Q |
| Q | R |
| R | Y |
| Y | X |
+-----------+-----------+
11 rows in set (0.00 sec)
Can I find out all the distinct values contained under 'X' using a single self join?
EDIT
Like, here:
X contains A, B and C
A contains P
P contains Q
Q contains R
R contains Y
Y contains D, E and F...
So I want to display A,B,C,D,E,F,P,Q,R,Y when I query for X.
EDIT
Got it right by programming.
package com.catgen.helper;

import java.sql.Connection;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

import com.catgen.factories.Nm2NmFactory;

public class Nm2NmHelper {
    private List<String> fetched;
    private List<String> fresh;

    public List<String> findAllContainedNMByMarketId(Connection conn, String marketId) throws SQLException {
        fetched = new ArrayList<String>();
        fresh = new ArrayList<String>();
        fresh.add(marketId.toLowerCase());
        while (fresh.size() > 0) {
            fetched.add(fresh.get(0).toLowerCase());
            fresh.remove(0);
            List<String> tempList = Nm2NmFactory.getContainedNmByContainerNm(conn, fetched.get(fetched.size() - 1));
            if (tempList != null) {
                for (int i = 0; i < tempList.size(); i++) {
                    String current = tempList.get(i).toLowerCase();
                    if (!fetched.contains(current) && !fresh.contains(current)) {
                        fresh.add(current);
                    }
                }
            }
        }
        return fetched;
    }
}
Not the same table and fields though. But I hope you get the concept.
Thanks guys.
You can't get all the contained objects recursively using a single join with that data structure. You would need a recursive query, but MySQL doesn't yet support that.
You could, however, construct a closure table; then you can do it with a simple query. See Bill Karwin's slideshow Models for Hierarchical Data for more details and other approaches (for example, nested sets). Slide 69 compares the different designs for ease of implementing 'Query subtree'. Your chosen design (adjacency list) is the most awkward of all four designs for this type of query.
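For illustration, a sketch of what the closure-table design could look like for the test data above; the test_closure table is hypothetical and has to be populated (and kept in sync with test) with every container/contained pair, direct or indirect:
-- one row for every (ancestor, descendant) pair, direct or indirect
CREATE TABLE test_closure (
  ancestor   varchar(1),
  descendant varchar(1)
);

-- once populated, "everything contained under X" becomes a single lookup:
SELECT DISTINCT descendant
FROM test_closure
WHERE ancestor = 'X'
  AND descendant <> 'X';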
What about reading the whole table into a PHP array and determining the children via a function that calls itself?
But this is not a good solution if the table has more than 10,000 rows...