I have some sample data where I want to put the city in a separate column. The city is the part that comes after the comma. How can I do this with Talend? Which component should I use?
Here's the sample data. My input is on the left, and the expected output is on the right.
You can put the logic for extracting the city either in a tMap (create a variable in the tMap to do the parsing) or in a tJavaRow component.
Just search for indexOf and substring methods to do your parsing.
For example, in tJavaRow you can use:
output_row.city = input_row.addressfield.substring(input_row.addressfield.indexOf(",")+1).trim();
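You can use the split function in tMap. Use the code below in the city column, replacing dwetl_Address with your actual column name. Note that it assumes every non-null address contains a comma followed by some text; otherwise the [1] lookup will fail.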
!Relational.ISNULL(row7.dwetl_Address)?row7.dwetl_Address.split(",")[1]:"default City"
As mentioned above, you can create your own routine and make it more generic by using StringTokenizer (you may use split, but I prefer the latter). You can also pass the string separator to the routine as an argument and have it return the token at the position you want.
This makes the routine reusable, so you can use it again later. The routine can be called from tMap.
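A minimal sketch of such a routine, assuming it is saved as a Talend routine class. The names AddressRoutines and tokenAt are hypothetical, and this version uses String.split rather than StringTokenizer so that empty tokens keep their position:

```java
import java.util.regex.Pattern;

// Hypothetical routine class; in Talend this would live under Code > Routines.
public class AddressRoutines {

    /**
     * Returns the trimmed token at zero-based position index after splitting
     * value on the literal separator, or fallback when the value is null,
     * has too few tokens, or the token is blank.
     */
    public static String tokenAt(String value, String separator, int index, String fallback) {
        if (value == null || separator == null || separator.isEmpty() || index < 0) {
            return fallback;
        }
        // Pattern.quote makes the separator literal; limit -1 keeps trailing empty tokens.
        String[] parts = value.split(Pattern.quote(separator), -1);
        if (index >= parts.length) {
            return fallback;
        }
        String token = parts[index].trim();
        return token.isEmpty() ? fallback : token;
    }

    public static void main(String[] args) {
        // prints "Springfield"
        System.out.println(tokenAt("12 Main Street, Springfield", ",", 1, "default City"));
        // prints "default City" because there is no second token
        System.out.println(tokenAt("No comma here", ",", 1, "default City"));
    }
}
```

In a tMap expression this could then be called as AddressRoutines.tokenAt(row7.dwetl_Address, ",", 1, "default City"), assuming the routine was saved under that name.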
I'm trying to map a JSON document (sensor data) into a more meaningful representation using Mapping Data Flows. However, I'm having a hard time getting this to work and would really appreciate some insight or recommendations on how to solve the following:
The input is
What I would like to end up with is the following:
Any pointers as to how this can be implemented are more than welcome.
This can be accomplished using the Copy activity and then the split function in a Derived Column transformation in Azure Data Factory.
Use the Copy activity to read the JSON file as the source, and in the sink use a SQL database to store the data as a table. In the Mapping tab, import the schema and map the JSON records to the corresponding column names. Refer to this third-party tutorial for guidance - https://sqlkover.com/dynamically-map-json-to-sql-in-azure-data-factory/
Finally, use the Data Flow activity and choose as its source the SQL table that you used as the sink above.
Select the Derived Column transformation.
Use the split function.
Add the column that will hold the split values, as shown below.
Use the split(<column_name_to_split>, '_') function to split the column on the _ delimiter. Change <column_name_to_split> to the name of the column you want to split. Refer to the image below.
Preview the data to check the result.
I have a field which accepts a value such as "Patient_1077,ELLA(161st Pharmacy address)", i.e. patient ID, name and address, that I want to parameterize. CSV is not helpful in this case because the value itself contains a comma. Is there an alternative way to inject this kind of value from a file and parameterize it?
Delimiter is configurable. See this for an example:
http://ivetetecedor.com/how-to-use-a-csv-file-with-jmeter/
You can also quote the data, which is another solution in your case.
You can write the values in the CSV like this:
Patient_0154, ELLA(102st The Cave)
Patient_0155, ELLA(101st The Wall)
Then read it:
Patient_0154 -> user
ELLA(102st The Cave) -> adress
And when you need the full value, you do the simple trick of joining them back together: ${user},${adress}.
At least this is my approach when I don't want to code a lot for a simple task.
I am using the RapidMiner 5 GUI and I want to store all the values of an attribute in different text files. But if I use any write operator, such as Write or Write Document, it either overwrites the data or gives an error.
I want to store the values in different files, each with a sequence number or something similar attached to its name.
Is there any way?
With the Loop Attributes operator you can loop over the attributes, and with the Generate Macro operator you can build whatever file name you prefer from the macro that holds the attribute name.
I have a text field with data, something like:
[{"id":10001,"timeStarted":1355729600733,"projectId":10002,"issueId":"29732,","userName":"tester","assignee":"test","status":"STARTED","shared":True,"name":"Session 4","projectName":"IDS","assigneeDisplayName":"First1 Last1"},
{"id":10002,"timeStarted":1358354188010,"projectId":10002,"issueId":"","userName":"tester","assignee":"test","status":"CREATED","shared":True,"name":"asdf98798","projectName":"IDS","assigneeDisplayName":"First Last"}]
but with many more rows (maybe 30-40), and possibly 2 more statuses (4 in total).
Is it possible to extract some data from here having read-only access to DB and only using MySQL query?
For example, to count the number of items with status "STARTED" and with status "CREATED".
Additional conditions may apply, e.g. where id is within a given range.
Assuming you're using PHP, you're better off first correcting those unrecognized booleans. You have True where it should be true (JSON only accepts the lowercase form), otherwise json_decode() will not parse the data.
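// Lowercase any True/False literal that appears as a JSON value.
// A closure is used because create_function() was removed in PHP 8.
$jsStr = preg_replace_callback(
    '~(?<=[,{[])(".+?"\s*:\s*)(true|false)(?=\s*[,}\]])~i',
    function ($m) { return $m[1] . strtolower($m[2]); },
    $jsStr);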
Then to be able to process it you want to use the json_decode() function.
$parsed = json_decode($jsStr);
// see the result if you like:
// print_r($parsed);
Ultimately, if you want to extract some specific information on the client side (using JavaScript), you can use the Array filter() function or a loop if you're not using jQuery. Otherwise you can use the jQuery filter() function with the necessary conditions.
If you want to do this in PHP, once the string has been decoded you can apply the same approach as in JavaScript, for example with array_filter() or a loop.
Each parameter in a URL can have multiple values. How can I separate them? Here's an example:
http://www.example.com/search?queries=cars,phones
So I want to search for 2 different things: cars and phones (this is just a contrived example). The problem is the separator, a comma. A user could enter a comma in the search form as part of their query and then this would get screwed up. I could have 2 separate URL parameters:
http://www.example.com/login?name1=harry&name2=bob
There's no real problem there, in fact I think this is how URLs were designed to handle this situation. But I can't use it in my particular situation. Requires a separate long post to say why... I need to simply separate the values.
My question is basically, is there a URL encodable character or value that can't possibly be entered in a form (textarea or input) which I can use as a separator? Like a null character? Or a non-visible character?
UPDATE: thank you all for your very quick responses. I should've listed the same parameter name example too, but order matters in my case so that wasn't an option either. We solved this by using a %00 URL encoded character (UTF-8 \u0000) as a value separator.
The standard approach to this is to use the same key name twice.
http://www.example.com/search?queries=cars&queries=phones
Most form libraries will allow you to access it as an array automatically. (If you are using PHP (and making use of $_POST/$_GET and not reinventing the wheel) you will need to change the name to queries[].)
You can give them each the same parameter name.
http://www.example.com/search?query=cars&query=phones
The average server side HTTP API is able to obtain them as an array. As per your question history, you're using JSP/Servlet, so you can use HttpServletRequest#getParameterValues() for this.
String[] queries = request.getParameterValues("query");
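Outside a servlet container, the same idea can be sketched with plain Java. This is only an illustration, assuming Java 10 or later (for URLDecoder.decode with a Charset); the class name QueryParams is hypothetical. Repeated keys are collected into a list in the order they appear in the query string:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Servlet API.
public class QueryParams {

    /** Parses "a=1&a=2&b=3" into {a=[1, 2], b=[3]}, keeping the order of values. */
    public static Map<String, List<String>> parse(String query) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String rawKey = eq >= 0 ? pair.substring(0, eq) : pair;
            String rawValue = eq >= 0 ? pair.substring(eq + 1) : "";
            // Decode after splitting, so an encoded & or = inside a value is preserved.
            String key = URLDecoder.decode(rawKey, StandardCharsets.UTF_8);
            String value = URLDecoder.decode(rawValue, StandardCharsets.UTF_8);
            params.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return params;
    }

    public static void main(String[] args) {
        List<String> queries = parse("query=cars&query=phones").get("query");
        System.out.println(queries.size()); // prints 2
    }
}
```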
Just URL-encode the user input so that their commas become %2C.
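A rough sketch of that idea in Java, assuming Java 10 or later; the class and method names are hypothetical. Each value is encoded on its own before joining, so the only literal commas left in the joined string are separators. The split has to run on the raw, not yet decoded, value; if your framework hands you parameters that are already decoded, the joined string would need to be encoded once more when building the URL.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for packing several values into one comma-separated parameter.
public class CommaJoinedValues {

    /** Encodes each value separately, then joins the encoded values with commas. */
    public static String join(List<String> values) {
        List<String> encoded = new ArrayList<>();
        for (String value : values) {
            // A comma inside a value becomes %2C and a space becomes +.
            encoded.add(URLEncoder.encode(value, StandardCharsets.UTF_8));
        }
        return String.join(",", encoded);
    }

    /** Splits the raw joined string on commas, then decodes each part. Order is preserved. */
    public static List<String> split(String joined) {
        List<String> values = new ArrayList<>();
        if (joined == null || joined.isEmpty()) {
            return values;
        }
        for (String part : joined.split(",", -1)) {
            values.add(URLDecoder.decode(part, StandardCharsets.UTF_8));
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("cars, used");
        input.add("phones");
        String joined = join(input);
        System.out.println(joined);               // prints cars%2C+used,phones
        System.out.println(split(joined).size()); // prints 2
    }
}
```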
Come up with your own separator that is unlikely to get entered in a query. Two underscores '__' for example.
Why not just do something like "||"? Anyone who types that into a search area probably fell asleep on their keyboard :} Then just explode it on the backend.
The easiest thing to do would be to use a custom separator like [!!ValSep!!].