Converting string to integer with "%" special character in Pentaho - mysql

Hi, I am facing an issue while converting a string value to an integer.
I am reading data from a table, and some fields hold values like 39% stored as a string data type.
Now I want to convert them to the Integer data type and load them into another table.
I tried using the "Select values" step in PDI, but it gives me an error like "Couldn't convert String to Integer."
Please help me resolve this issue.

The percent sign isn't a valid character in a Java integer, so you first need to remove it before the type conversion can succeed.
Add a new "Replace in string" step between the data origin and "Select values".
Double-click the newly added step and, under "In stream field", select the field that needs to be cleaned.
Under "Search", type "%" (without the quotes), leave "Replace with" empty, and click OK to close the dialog.
That should do the trick.
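As a rough illustration of why the cleanup has to happen before the conversion, here is a minimal Python sketch. It is not PDI code, and the function name percent_to_int is made up for this example; it only mimics "Replace in string" followed by the integer conversion.

```python
def percent_to_int(raw: str) -> int:
    """Drop the percent sign and surrounding whitespace, then parse as int."""
    cleaned = raw.replace("%", "").strip()
    # int() raises ValueError if anything non-numeric is still present,
    # which is the same kind of failure "Select values" reports.
    return int(cleaned)


if __name__ == "__main__":
    print(percent_to_int("39%"))
```

Calling int("39%") directly fails with a ValueError, which is why the character is removed first.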

Related

How to convert date from csv file into integer

I have to send data from csv into SQL DB.
The problem starts when I try to convert the data into int. It wasn't my idea, and I really can't do much about this datatype. When I try to do this, the following problem pops up:
Data Conversion 2: Data conversion failed while converting column
"pr_czas" (387) to column "C pr_dCz_id" (14). The conversion returned
status value 2 and status text "The value could not be converted
because of a potential loss of data.".
I already tried to ignore this problem, but then other problems came up, so there is no way around solving it.
I have to convert this data from the csv file, which is str 50, into int 4.
It must be int4; that is one of the requirements. I don't know what to do.
This is the data I'm trying to put into int4. Look at pr_czas.
This is the data's datatype.
Before this, I tried to do the same thing with just DD.MM.YYYY but got the same result...
Given an input column named [pr_czas] that contains string values like 31.01.2020 00:00, which appears to be a date and time in the format "DD.MM.YYYY HH:mm", I would like to express that as the whole number DDMMYYYYHHmm.
Add a derived column to your data flow and call this new_pr_czas
The logic I'm going to use is a series of nested REPLACE calls, with the final result cast to an integer. Replace the period, the colon and the space, all with nothing.
(DT_I8)REPLACE(REPLACE(REPLACE([pr_czas], ".", ""), ":", ""), " ", "")
This is an easy case, but there are a few things to note.
An integer/int32/I4 has a maximum value of about 2.1 billion (2,147,483,647).
310120200000 is too large to fit into that space, so you would need to make that a bigint/int64/I8. If I remember your previous question, you were having trouble with a lookup task, so this data type mismatch might hurt you there.
The other thing to be aware of is that leading zeros are dropped when the value is converted to a number, because they are not significant. If you need to retain the leading zeros, then you're working with a string data type. This is an advantage of the ISO standard, which puts the year first, but if your data expects DD first, then far be it from me to say otherwise.
If you need to slice your date into another format, then you'll want a few derived columns. The first one generates a string column for each piece of pr_czas: year, month, day, hour and minute. You'll use the SUBSTRING function for this, and FINDSTRING to locate the period, space and colon.
The next derived column puts those string pieces back together in the new format and casts the result to I8. Why? Because you can't debug it when it's all done in one shot, but you can put a data viewer between two derived columns to figure out where a slice went awry.
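The two caveats above (the 32-bit limit and the dropped leading zero) can be checked with a small Python sketch. This is not SSIS; the helper name date_string_to_digits is invented for the example and simply mirrors the nested REPLACE calls.

```python
INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest value a DT_I4 can hold


def date_string_to_digits(value: str) -> str:
    """Remove '.', ':' and ' ' the same way the nested REPLACE calls do."""
    for ch in (".", ":", " "):
        value = value.replace(ch, "")
    return value


digits = date_string_to_digits("31.01.2020 00:00")
number = int(digits)
print(digits, number > INT32_MAX)

# A day below 10 loses its leading zero once the text becomes a number.
short_day = int(date_string_to_digits("01.02.2020 00:00"))
print(short_day)
```

The first value needs a 64-bit integer, and the second one ends up with eleven digits instead of twelve, which matters if anything downstream parses the number positionally.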

Change datatype of SSIS flat file data with string "NULL" values

In my SSIS project I have to retrieve my data from a flat csv file. The data itself looks something like this:
AccountType,SID,PersonID,FirstName,LastName,Email,Enabled
NOR,0001,0001,Test,Test0001,Test1#email.com,TRUE
NOR,1001,NULL,Test,Test1002,Test2#email.com,FALSE
TST,1002,NULL,Test,Test1003,Test3#email.com,TRUE
I need to read this data and make sure it has the correct datatypes for future checks. Meaning SID and PersonID should have a numeric datatype, Enabled should be a boolean. But I would like to keep the same columns and names as my source file.
It seems like the only correct way to read this data through the 'Flat File Source' task is as a string. Otherwise I keep getting errors because "NULL" is literally a string and not a NULL value.
Next I perform a Derived Column transformation to get rid of all "NULL" values. For example, I use the following expression for PersonId:
(TRIM(PersonID) == "" || UPPER(PersonID) == "NULL") ? (DT_WSTR,50)NULL(DT_WSTR,50) : PersonID
I would like to immediately convert it to the correct datatype by adding a cast to the expression above, but it seems impossible to select another datatype for the same column when I select 'Replace 'PersonId'' in the Derived Column dropdown box.
So next I thought of using the Data Conversion task to change the datatypes of these columns, but it only creates new columns, even when I set the output alias to the same name.
How could I alter my solution to efficiently and correctly read this data and convert its values to the correct datatypes?
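To make the intended result concrete, here is a minimal Python sketch of the conversion rules described above: blank or literal "NULL" text becomes a missing value, SID and PersonID become integers, and Enabled becomes a boolean. It is not an SSIS solution; the helper names to_int_or_none and to_bool are hypothetical, and the sample rows are copied from the question.

```python
import csv
import io
from typing import Optional

SAMPLE = """AccountType,SID,PersonID,FirstName,LastName,Email,Enabled
NOR,0001,0001,Test,Test0001,Test1#email.com,TRUE
NOR,1001,NULL,Test,Test1002,Test2#email.com,FALSE
"""


def to_int_or_none(text: str) -> Optional[int]:
    """Treat blank or literal "NULL" text as a missing value."""
    cleaned = text.strip()
    if cleaned == "" or cleaned.upper() == "NULL":
        return None
    return int(cleaned)


def to_bool(text: str) -> bool:
    """Map the text TRUE (any case) to True, everything else to False."""
    return text.strip().upper() == "TRUE"


rows = []
for record in csv.DictReader(io.StringIO(SAMPLE)):
    # Column names stay the same; only the value types change.
    record["SID"] = to_int_or_none(record["SID"])
    record["PersonID"] = to_int_or_none(record["PersonID"])
    record["Enabled"] = to_bool(record["Enabled"])
    rows.append(record)

print(rows[1])
```

Note that converting 0001 to a number drops the leading zeros, so a numeric type only fits if those zeros carry no meaning.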

Insert/update JSON into Postgresql column WHERE myvar = myval

I'm trying to insert JSON into a PostgreSQL column whose data type is JSON, but I'm having trouble finding out how to do this. This is as far as I've gotten, but it's not correct because it just overwrites the value every time instead of adding a new key pair.
I'm using pg-promise node module to perform these queries. Here's what I have so far:
db.query("UPDATE meditation_database SET completed=$1 WHERE user_id=$2", [{myVar : true}, user_id]);
Also, 'myVar' should be replaced by the variable's value, but instead it is treated as a string. How can I get the actual value of 'myVar' instead of it being treated literally?
Thanks,
I'm trying to insert JSON into a PostgreSQL column whose data type is JSON, but I'm having trouble finding out how to do this.
By executing this:
db.query("INSERT INTO meditation_database(completed, user_id) VALUES($1, $2)",
[{myVar : true}, user_id]);
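Also, 'myVar' should be replaced by the variable's value, but instead it is treated as a string. How can I get the actual value of 'myVar' instead of it being treated literally?
myVar is serialized into JSON as a string. That is the proper JSON format for property names, and it is the only format that PostgreSQL will accept. If you want the value of the variable myVar to be used as the key, build the object with a computed property name, {[myVar]: true}, which requires ES2015 or later.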
This is as far as I've gotten but it's not correct because it just overwrites it every time, instead of adding a new key pair.
If you are asking how to update JSON in PostgreSQL, this question has been answered previously, and in great detail: How do I modify fields inside the new PostgreSQL JSON datatype?
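The difference between overwriting the stored object and adding a key pair to it can be sketched outside the database. The following is a minimal Python illustration, not pg-promise or PostgreSQL code; the key names day_0 and day_1 are invented for the example, and the merge is done client-side purely to show the intended result.

```python
import json

my_var = "day_1"  # hypothetical key name held in a variable

# In a Python dict literal the variable's value becomes the key.
new_entry = {my_var: True}

# Pretend this text was read from the "completed" JSON column.
existing = json.loads('{"day_0": true}')

# Merge the new pair into the existing object instead of replacing it.
merged = {**existing, **new_entry}

print(json.dumps(merged))
```

Writing only new_entry back to the column would reproduce the overwrite described in the question; writing merged keeps the earlier keys.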

Talend Casting of JSON string to JSON or JSONB in PostgreSQL

I'm trying to use Talend to get JSON data that is stored in MySQL as a VARCHAR datatype and export it into PostgreSQL 9.4 table of the following type:
CREATE TABLE myTable (myJSON JSONB);
When I try running the job I get the following error:
ERROR: column "json_string" is of type json but expression is of type
character varying
Hint: You will need to rewrite or cast the expression. Position:
54
If I use python or just plain SQL with PostgreSQL insert I can insert a string such as '{"Name":"blah"}' and it understands it.
INSERT INTO myTable(myJSON) VALUES ('{"Name":"blah"}');
Any ideas how this can be done in Talend?
You can add a type cast by opening the "Advanced settings" tab on your "tPostgresqlOutput" component. Consider the following example:
In this case, the input row to "tPostgresqlOutput_1" has one column, data. This column is of type String and is mapped to the database column data of type VARCHAR (the default suggested by Talend):
Next, open the component settings for tPostgresqlOutput_1 and locate the "Advanced settings" tab:
On this tab, you can replace the existing data column by a new expression:
In the name column, specify the target column name.
In the SQL Expression column, do your type casting. In this case: "?::json". Note the usage of the placeholder character ?, which will be replaced with the original value.
In Position, specify Replace. This will replace the value proposed by Talend with your SQL expression (including the type cast).
As Reference Column use the source value.
This should do the trick.
Here is a sample schema in which the input row 'r' has the columns question_json and choice_json, which are JSON strings. I know the key I want to extract from each, and this is how I do it.
Look at the columns question_value and choice_value. Hope this helps.

String length at the start of a MySQL text field when using HYDRATE_NONE?

I'm using Symfony 1.4 with Doctrine.
I'm saving text as MySQL text type (Doctrine "array" type) into the database, and it goes in clean & correct.
When querying the data back, if I use Doctrine_Core::HYDRATE_ARRAY the data is returned as it should be. However, if I use HYDRATE_NONE, the data is returned with the text length prepended to it:
S:45"this is some text from the database" // where "45" is the length.
Is this expected behaviour or might I have defined the wrong type?
Thanks.
The text you are seeing is the serialized form of the array. If you choose not to hydrate, you will get the serialized form, as Doctrine converts the array into a serialized form in order to store it in a TEXT column in MySQL. PHP's serialize/unserialize function pairs should provide an example of the type of process used by Doctrine.
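To show where the length prefix comes from, here is a minimal Python sketch that imitates the text PHP's serialize() produces for a string and for a zero-indexed array of strings. It is an illustration of the format only, not Doctrine's code; the function names are made up, and values containing double quotes or other types are not handled.

```python
def php_serialize_string(value: str) -> str:
    """Imitate serialize() for one string: s:<byte length>:"<value>";"""
    byte_length = len(value.encode("utf-8"))  # PHP counts bytes, not characters
    return f's:{byte_length}:"{value}";'


def php_serialize_list(items: list) -> str:
    """Imitate serialize() for a zero-indexed array of strings."""
    body = "".join(
        f"i:{index};{php_serialize_string(item)}"
        for index, item in enumerate(items)
    )
    return f"a:{len(items)}:{{{body}}}"


print(php_serialize_string("hello"))
print(php_serialize_list(["hello"]))
```

With HYDRATE_NONE you receive this raw stored text; passing it through PHP's unserialize() gives back the original value.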