I have a JSON like
{
    "campaign_key": 316,
    "client_key": 127,
    "cpn_mid_counter": "24",
    "cpn_name": "Bopal",
    "cpn_status": "Active",
    "clt_name": "Bopal Ventures",
    "clt_status": "Active"
}
Expected output
1st JSON:
{
    "campaign_key": 316,
    "client_key": 127,
    "cpn_mid_counter": "24",
    "cpn_name": "Bopal",
    "cpn_status": "Active"
}
2nd JSON:
{
    "clt_name": "Bopal Ventures",
    "clt_status": "Active"
}
How do I achieve this using NiFi? Thanks.
You can do what 'user' suggested. The not-so-good thing about that approach is that as the number of fields increases, you are required to add that many JSON Path expression attributes to EvaluateJsonPath and subsequently reference that many attributes in ReplaceText.
Instead what I'm proposing is, use QueryRecord with Record Reader set to JsonTreeReader and Record Writer set to JsonRecordSetWriter. And add two dynamic relationship properties as follows:
json1 : SELECT campaign_key, client_key, cpn_mid_counter, cpn_name, cpn_status FROM FLOWFILE
json2 : SELECT clt_name, clt_status FROM FLOWFILE
This approach takes care of reading and writing the output in JSON format. Plus, if you want to include more fields, you just have to add the field names to the SQL SELECT statements.
The QueryRecord processor lets you execute SQL queries against the FlowFile content; more details can be found in its documentation.
Karthik,
Use the EvaluateJsonPath processor to extract all of the JSON values by their keys.
Example: $.campaign_key gets the campaign key value and $.clt_name gets the clt name.
In the same way you can extract all of the values as attributes.
Then use the ReplaceText processor to turn the single JSON into two JSONs:
{"campaign_key":${campaign_key},...etc}
{"clt_name":${clt_name}}
It will convert the single JSON into two JSONs.
Hope this is helpful, and let me know if you have issues.
Related
Is it possible to apply a filter on nested JSON fields with the help of Kafka Streams? If yes, how can those fields be addressed?
For example,
{
    "before": {
        "id": 1,
        "name": "abc"
    },
    "after": {
        "id": 1,
        "name": "xyz"
    }
}
Now, if only name is modified in the after field I do not want to filter that record, but if fields other than name are modified I want to filter it.
Thank you.
The deserializer of the configured stream Serde should return your object type. You can then filter just like a regular Java stream:
stream.filter(yourMessage -> compareCDCRecords(yourMessage.getBefore(), yourMessage.getAfter()))
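A minimal sketch of such a predicate, assuming a hypothetical POJO with only id and name per side (your real deserialized type would carry every CDC field). It returns true when a field other than name changed, so only those records pass the filter:

```java
// Hypothetical stand-in for the deserialized CDC message parts;
// the actual class comes from your configured Serde.
class ChangeEvent {
    static class Part {
        final int id;
        final String name;
        Part(int id, String name) { this.id = id; this.name = name; }
    }

    // true = keep the record: some field OTHER than "name" differs.
    // A change to "name" alone returns false, so that record is dropped.
    static boolean compareCDCRecords(Part before, Part after) {
        return before.id != after.id; // extend with further non-name fields
    }
}
```

In the topology this plugs into the filter line above; whether true means keep or drop is up to you, since Kafka Streams' filter retains records for which the predicate returns true.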
I'm trying to create a Hive external table for a JSON file in .txt format. I have tried several approaches, but I think I'm going wrong in how the external table should be defined:
My Sample JSON is:
[[
    {
        "user": "ron",
        "id": "17110",
        "addr": "Some address"
    },
    {
        "user": "harry",
        "id": "42230",
        "addr": "some other address"
    }
]]
As you can see, it's an array inside an array. It seems this is valid JSON returned by an API, although I have read posts saying that JSON should start with a '{'.
Anyway, I am trying to create an external table like this:
CREATE EXTERNAL TABLE db1.user(
array<array<
user:string,
id:string,
desc:string
>>)
PARTITIONED BY(date string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/tmp/data/addr'
This does not work. Nor does something like this work
CREATE EXTERNAL TABLE db1.user(
user string,
id string,
desc string
)PARTITIONED BY(date string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/tmp/data/addr'
After trying to modify the JSON text file (replacing [ with { etc.) and adding a partition, I still wasn't able to query it using select *. I'm missing a key piece in the table structure.
Can you please help me so that the table can read my JSON correctly?
If required, I can modify the input JSON, if the double [[ is a problem.
1st: A row in a table should be represented in the file as a single line; no multi-line JSON.
2nd: You can have array<some complex type> as a single column, but this is not convenient because you will need to explode the array to be able to access the nested elements. The only reason you may want such a structure is when there really are multiple rows with array<array<>>.
3rd: Everything in [] is an array. Everything in {} is a struct or a map; in your case it is a struct, and you have missed this rule. The fields user, id and desc are inside a struct, and the struct is nested inside an array. An array can have only a type in its definition: if it is a nested struct, then it is array<struct<...>>; if the array is of a simple type, then, for example, array<string>.
4th: Your JSON is not valid because it contains an extra comma after the address value; fix it.
If you prefer to have a single column colname containing array<array<struct<...>>>, then create the table like this:
CREATE EXTERNAL TABLE db1.user(
colname array<array<
struct<user:string,
id:string,
desc:string>
>>)...
And JSON file should look like this (single line for each row):
[[{"user": "ron","id": "17110","addr": "Some address"}, {"user": "harry","id": "42230","addr": "some other address"}]]
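With that single-column layout, reaching the nested fields means exploding twice, e.g. (a sketch against the table above; untested, and identifiers such as user and desc may need backtick-escaping as reserved words in some Hive versions):

```sql
SELECT s.user, s.id, s.desc
FROM db1.user u
LATERAL VIEW explode(u.colname) outer_view AS inner_array
LATERAL VIEW explode(inner_array) inner_view AS s;
```

This double explode is the inconvenience referred to above, and why the flat-struct layout below is usually preferable.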
If the file contains a single big array nested in another array, it is better to remove [[ and ]], remove the commas between structs, and remove the extra newlines inside structs. If a single row is a struct {}, you can define your table without the outer struct<>; only nested structs should be defined as struct<>:
CREATE EXTERNAL TABLE db1.user(
user string,
id string,
desc string
)...
Note that in this case you do not need : between the column name and the type; use : only inside nested structs.
And the JSON should look like this (the whole JSON object, as defined in the DDL, on a single line; no commas between structs; each struct on a separate line):
{"user": "ron","id": "17110","addr": "Some address"}
{"user": "harry","id": "42230","addr": "some other address"}
Hope you got how it works. Read more in the JSONSerDe manual.
I have converted some columns to JSON using the Columns to JSON node. The output from that is:
{
"Material" : 101,
"UOM" : "GRAM",
"EAN" : 7698,
"Description" : "CHALK BOX"
}
I would like to add the value of the material property as a key to each JSON object. So, my desired output is:
"101": {
"Material" : 101,
"UOM" : "GRAM",
"EAN" : 7698,
"Description" : "CHALK BOX"
}
I have tried entering the following expression in the JSON transformer node but all I get is a question mark in the new column it generates:
$Material$:{"Material":$Material$,"UOM":$UOM$,"EAN":$EAN$,"Description":$Description$}
I have also tried replacing the $Material$ with "Material" but got the same result.
How would I go about this, please?
If you convert the Material column to String first (for example with the String Manipulator node, using string($Material$)), you can easily configure Columns to JSON: the Data bound key setting is the important part; point it at the Material column.
I finally managed to solve this by a different method.
I split the JSON data into several columns, then used the join function to create a string in the required order. I put the resulting string through the String to JSON node to create the new JSON object.
Thanks for all your tips and comments!
How to read values from nested JSON without using any library like GSON or org.JSON?
JSON is :
{data: { "EV_TOT_AMT" : "12" , "EV_CURR" : "INR", "T_BASKET" : [{"ORDER" : "abc", "BASE" : "xyx"},{"ORDER" : "def", "BASE" : "mno"}] } }
I want to read specific values as EV_TOT_AMT , EV_CURR , ORDER.
As far as I know, Java doesn't include a JSON parser in its core classes... so if you don't want to use an external library, you'll need to build your own JSON parser.
Of course, you can just search the JSON string for your desired substrings and take the value between the next ":" and the next "," (if the first character after the ":" is not a "["). But this isn't a good approach unless your JSON input string will always have the same structure... well... actually, that's not a good approach, period.
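For illustration only, here is that naive substring approach as a sketch, assuming the keys and values are always quoted strings exactly as in the sample above (it breaks on nesting, arrays, unquoted numbers, and escaped quotes):

```java
// Fragile string-search extraction -- not a real parser.
class NaiveJson {
    // Returns the quoted value following "key", or null if the key is absent.
    static String value(String json, String key) {
        int k = json.indexOf("\"" + key + "\"");
        if (k < 0) return null;
        int colon = json.indexOf(':', k);          // the ':' after the key
        int start = json.indexOf('"', colon + 1);  // opening quote of the value
        int end = json.indexOf('"', start + 1);    // closing quote of the value
        return json.substring(start + 1, end);
    }
}
```

For instance, NaiveJson.value(input, "EV_CURR") would return INR for the sample above; but ORDER occurs twice inside T_BASKET and only the first would ever be found, which is exactly why a real parser is the better answer.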
I have an iOS project that uses RestKit 0.21.0 to get, parse, and store in Core Data the responses from a remote server. In one of the backend JSON responses I have something like this:
"response": [
    {
        "id": 1,
        "start_time": "10:00:00",
        "end_time": "14:00:00",
        "name": "Object name",
        "occurrences": [
            "2013-09-13T14:00:00",
            "2013-09-20T14:00:00",
            "2013-09-27T14:00:00"
        ]
    },
    ...
]
Generally I'm able to parse the received objects and store them in Core Data. My only problem is with the nested occurrences array.
Do you have any advice on how I should properly parse and store this collection?
I guess you want to map it to dates. To do that you generally need a container. You can also simply map to an array of strings and post-process.
1) Array of strings:
Just add an NSArray property to your destination object and map occurrences to it. This would be a transformable attribute in Core Data (could be transient). Now you can iterate the array and create the dates (could be done in willSave).
2) Relationship to dates:
Create a new entity, call it Occurrence. It has a single date property. Use a 'nil' keypath mapping to create instances of this Occurrence entity and map each of the dates to a new instance (the conversion to NSDate will be done for you). You have no identity so your only option would be to use the date as the unique identifier.