DataWeave 2.0: how to build a dynamically populated accumulator for reduce()

I'm trying to convert an array of strings into an object in which each string becomes a key whose value is initialized to 0. (Classic accumulator for Word Count, right?)
Here's the style of the input data:
%dw 2.0
output application/dw
var hosts = [
"t.me",
"thewholeshebang.com",
"thegothicparty.com",
"windowdressing.com",
"thegothicparty.com"
]
To get the accumulator, I need a structure in this style:
var histogram_acc = {
"t.me" : 1,
"thewholeshebang.com" : 1,
"thegothicparty.com" : 2,
"windowdressing.com" : 1
}
My thought was that this is a slam-dunk case for reduce(), right?
So to get the de-duplicated list of hosts, we can use this phrase:
hosts distinctBy $
Happy so far. But now for me, it turns wicked.
I thought this might be the gold:
hosts distinctBy $ reduce (ep,acc={}) -> acc ++ {ep: 0}
But the problem is that this didn't work out so well. The first argument to the lambda for reduce() represents the iterating element, in this case the endpoint or address. The lambda appends the new object to the accumulator.
Well, that's how I hoped it would happen, but I got this instead:
{
ep: 0,
ep: 0,
ep: 0,
ep: 0
}
I kind of need it to do better than that.

As you said, reduce is a good fit for this problem. Alternatively, you can use the "Dynamic elements" feature of objects to flatten an array of objects into a single object:
%dw 2.0
output application/dw
var hosts = [
"t.me",
"thewholeshebang.com",
"thegothicparty.com",
"windowdressing.com",
"thegothicparty.com"
]
---
{(
hosts
distinctBy $
map (ep) -> {"$ep": 0}
)}
See https://docs.mulesoft.com/mule-runtime/4.3/dataweave-types#dynamic_elements

Scenario 1:
The trick for this scenario, I think, is that you need to enclose the distinctBy ... map expression in {}.
Example:
Input:
%dw 2.0
var hosts = [
"t.me",
"thewholeshebang.com",
"thegothicparty.com",
"windowdressing.com",
"thegothicparty.com"
]
output application/json
---
{ // This opening brace will do the trick.
(hosts distinctBy $ map {($):0})
} // See Scenario 2 if you remove or comment out this pair of braces
Output:
{
"t.me": 0,
"thewholeshebang.com": 0,
"thegothicparty.com": 0,
"windowdressing.com": 0
}
Scenario 2: If you remove the enclosing {} from the expression, the output will be an array.
Example:
Input:
%dw 2.0
var hosts = [
"t.me",
"thewholeshebang.com",
"thegothicparty.com",
"windowdressing.com",
"thegothicparty.com"
]
output application/json
---
//{ // This is now commented
(hosts distinctBy $ map {($):0})
//} // This is now commented
Output:
[
{
"t.me": 0
},
{
"thewholeshebang.com": 0
},
{
"thegothicparty.com": 0
},
{
"windowdressing.com": 0
}
]
Scenario 3: If you want to count the total duplicates per item, you can use groupBy and sizeOf.
Example:
Input:
%dw 2.0
var hosts = [
"t.me",
"thewholeshebang.com",
"thegothicparty.com",
"windowdressing.com",
"thegothicparty.com"
]
output application/json
---
hosts groupBy $ mapObject (value,key) -> {
(key): sizeOf(value)
}
Output:
{
"t.me": 1,
"thewholeshebang.com": 1,
"thegothicparty.com": 2,
"windowdressing.com": 1
}
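Scenario 3 is the classic word-count fold. For comparison only (not part of the Mule flow), the same histogram in Python is just collections.Counter:

```python
from collections import Counter

hosts = [
    "t.me",
    "thewholeshebang.com",
    "thegothicparty.com",
    "windowdressing.com",
    "thegothicparty.com",
]

# Counter plays the role of groupBy + sizeOf: one count per distinct key.
histogram = dict(Counter(hosts))
```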

Hilariously (but perhaps only to me), I discovered the answer to this while I was writing my question. In the hope that someone will pose this same question, here is what I found.
In order to present the lambda argument in my example (ep) as the key in a structure, I must quote and interpolate it:
"$ep"
Once I did that, it was a quick passage to:
hosts distinctBy $ reduce (ep,acc={}) -> acc ++ {"$ep": 0}
...and then of course this:
{
"t.me": 0,
"thewholeshebang.com": 0,
"thegothicparty.com": 0,
"windowdressing.com": 0
}

Related

Change name of main row Rails in JSON

So I have this JSON:
{
"code": "Q0934X",
"name": "PIDBA",
"longlat": "POINT(23.0 33.0)",
"altitude": 33
}
And I want to change the key code to Identifier.
The desired output is this:
{
"Identifier": "Q0934X",
"name": "PIDBA",
"longlat": "POINT(23.0 33.0)",
"altitude": 33
}
How can I do this in the shortest way? Thanks.
It appears that both "the json" you have and your desired result are JSON strings. If the one you have is json_str you can write:
json = JSON.parse(json_str).tap { |h| h["Identifier"] = h.delete("code") }.to_json
puts json
#=> {"name":"PIDBA","longlat":"POINT(23.0 33.0)","altitude":33,"Identifier":"Q0934X"}
Note that Hash#delete returns the value of the key being removed.
Perhaps transform_keys is an option.
The following seems to work for me (ruby 2.6):
json = JSON.parse(json_str).transform_keys { |k| k === 'code' ? 'Identifier' : k }.to_json
But this may work for Ruby 3.0 onwards (if I've understood the docs). Note the string key ('code' => ...) rather than the symbol-key shorthand ('code': ...), since JSON.parse produces string keys:
json = JSON.parse(json_str).transform_keys({ 'code' => 'Identifier' }).to_json
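For comparison, the same rename sketched in Python; dict.pop plays the role of Hash#delete (it returns the removed value):

```python
import json

json_str = '{"code": "Q0934X", "name": "PIDBA", "longlat": "POINT(23.0 33.0)", "altitude": 33}'

h = json.loads(json_str)
h["Identifier"] = h.pop("code")  # pop returns the value being removed, like Hash#delete
result = json.dumps(h)
```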

Multiple JSON payload to CSV file

I have a task to generate a CSV file from multiple JSON payloads (2). Below is my sample data, provided for understanding purposes:
- Payload-1
[
{
"id": "Run",
"errorMessage": "Cannot Run"
},
{
"id": "Walk",
"errorMessage": "Cannot Walk"
}
]
- Payload-2 (**Source Input**) in flowVars
[
{
"Action1": "Run",
"Action2": ""
},
{
"Action1": "",
"Action2": "Walk"
},
{
"Action1": "Sleep",
"Action2": ""
}
]
Now, I have to generate a CSV file with one extra column, ErrorMessage, added to the source input on a conditional basis: where the id in Payload-1 matches a source input field, the errorMessage should be assigned to that row, and a CSV file generated as the output.
I had tried the DataWeave below:
%dw 1.0
%output application/csv header=true
---
flowVars.InputData map (val,index)->{
Action1: val.Action1,
Action2: val.Action2,
(
payload filter ($.id == val.Action1 or $.id == val.Action2) map (val2,index) -> {
ErrorMessage: val2.errorMessage replace /([\n,\/])/ with ""
}
)
}
But here I'm facing an issue: I'm able to generate the file with the data as expected, but the header ErrorMessage is missing/not appearing in the file with my real data (in production). Kindly assist me.
I'm expecting the CSV output below:
Action1,Action2,ErrorMessage
Run,,Cannot Run
,Walk,Cannot Walk
Sleep,
Hello, the best way to solve this kind of problem is using groupBy. The idea is that you group one of the two parts by the join key, and then you iterate over the other part and do a lookup. This way you avoid O(n^2) and turn it into O(n).
%dw 1.0
%var payloadById = payload groupBy $.id
%output application/csv
---
flowVars.InputData map ((value, index) ->
using(locatedError = payloadById[value.Action2][0] default payloadById[value.Action1][0]) (
(value ++ {ErrorMessage: locatedError.errorMessage replace /([\n,\/])/ with ""}) when locatedError != null otherwise value
)
)
filter $ != null
Assuming "Payload-1" is payload, and "Payload-2" is flowVars.actions, I would first create a key-value lookup with the payload. Then I would use that to populate flowVars.actions:
%dw 1.0
%output application/csv header=true
// Creates lookup, e.g.:
// {"Run": "Cannot run", "Walk": "Cannot walk"}
%var errorMsgLookup = payload reduce ((obj, lookup={}) ->
lookup ++ {(obj.id): obj.errorMessage})
---
flowVars.actions map ((action) ->
action ++ {ErrorMessage: errorMsgLookup[action.Action1] default errorMsgLookup[action.Action2] default ""})
Note: I'm also assuming the payload's id field is unique across the array.
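The same build-a-lookup-then-join pattern, sketched in Python for comparison (field names taken from the question; the empty-string default mirrors rows with no matching error):

```python
payload = [
    {"id": "Run", "errorMessage": "Cannot Run"},
    {"id": "Walk", "errorMessage": "Cannot Walk"},
]
actions = [
    {"Action1": "Run", "Action2": ""},
    {"Action1": "", "Action2": "Walk"},
    {"Action1": "Sleep", "Action2": ""},
]

# Build the id -> errorMessage lookup once (O(n)), then join in O(n).
error_msg_lookup = {row["id"]: row["errorMessage"] for row in payload}

rows = [
    {**action,
     "ErrorMessage": error_msg_lookup.get(action["Action1"])
                     or error_msg_lookup.get(action["Action2"], "")}
    for action in actions
]
```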

json key iteration in DW mule

I have the following requirement need to interate the dynamic json key
need to use this json key and iterate through it
This is my input
[
{
"eventType":"ORDER_SHIPPED",
"entityId":"d0594c02-fb0e-47e1-a61e-1139dc185657",
"userName":"educator#school.edu",
"dateTime":"2010-11-11T07:00:00Z",
"status":"SHIPPED",
"additionalData":{
"quoteId":"d0594c02-fb0e-47e1-a61e-1139dc185657",
"clientReferenceId":"Srites004",
"modifiedDt":"2010-11-11T07:00:00Z",
"packageId":"AIM_PACKAGE",
"sbsOrderId":"TEST-TS-201809-79486",
"orderReferenceId":"b0123c02-fb0e-47e1-a61e-1139dc185987",
"shipDate_1":"2010-11-11T07:00:00Z",
"shipDate_2":"2010-11-12T07:00:00Z",
"shipDate_3":"2010-11-13T07:00:00Z",
"shipMethod_1":"UPS Ground",
"shipMethod_3":"UPS Ground3",
"shipMethod_2":"UPS Ground2",
"trackingNumber_3":"333",
"trackingNumber_1":"2222",
"trackingNumber_2":"221"
}
}
]
I need output like following
{
"trackingInfo":[
{
"shipDate":"2010-11-11T07:00:00Z",
"shipMethod":"UPS Ground",
"trackingNbr":"2222"
},
{
"shipDate":"2010-11-12T07:00:00Z",
"shipMethod":"UPS Ground2",
"trackingNbr":"221"
},
{
"shipDate":"2010-11-13T07:00:00Z",
"shipMethod":"UPS Ground3",
"trackingNbr":"333"
}
]
}
The shipDate, shipMethod, and trackingNumber fields can run to n entries.
How do I iterate using the JSON key?
First map the array to iterate, and then use pluck to get a list of keys.
Then, as long as there is always the same number of shipDate, shipMethod, etc. fields, filter the list of keys so you only iterate as many times as those field combinations exist.
Then construct each output object by dynamically looking up the key using 'shipDate_' concatenated with the index (incremented by 1, because your example starts at 1 and DataWeave arrays start at 0):
%dw 2.0
output application/json
---
payload map ((item, index) -> item.additionalData pluck($$) filter ($ contains 'shipDate') map ((item2, index2) ->
using (incIndex = ((index2 + 1) as String)) {
"shipDate": item.additionalData[('shipDate_'++ incIndex)],
"shipMethod": item.additionalData[('shipMethod_'++ incIndex)],
"trackingNbr": item.additionalData[('trackingNumber_'++ incIndex)],
}
)
)
In DW 1.0 syntax:
%dw 1.0
%output application/json
---
payload map ((item, index) -> item.additionalData pluck ($$) filter ($ contains 'shipDate') map ((item2, index2) ->
using (incIndex = ((index2 + 1) as :string))
{
"shipDate": item.additionalData[('shipDate_' ++ incIndex)],
"shipMethod": item.additionalData[('shipMethod_' ++ incIndex)],
"trackingNbr": item.additionalData[('trackingNumber_' ++ incIndex)]
}))
It's mostly the same, except:
output => %output
String => :string
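The same suffix-matching idea, sketched in Python for comparison (the additionalData keys are copied from the question; this assumes every shipDate_N has matching shipMethod_N and trackingNumber_N keys):

```python
additional_data = {
    "shipDate_1": "2010-11-11T07:00:00Z",
    "shipDate_2": "2010-11-12T07:00:00Z",
    "shipDate_3": "2010-11-13T07:00:00Z",
    "shipMethod_1": "UPS Ground",
    "shipMethod_2": "UPS Ground2",
    "shipMethod_3": "UPS Ground3",
    "trackingNumber_1": "2222",
    "trackingNumber_2": "221",
    "trackingNumber_3": "333",
}

# pluck($$) filter ($ contains 'shipDate')  ~  collect the shipDate_* keys
ship_date_keys = [k for k in additional_data if k.startswith("shipDate_")]

# For each numeric suffix, look up the three related keys dynamically.
tracking_info = [
    {
        "shipDate": additional_data[f"shipDate_{n}"],
        "shipMethod": additional_data[f"shipMethod_{n}"],
        "trackingNbr": additional_data[f"trackingNumber_{n}"],
    }
    for n in sorted(k.split("_")[1] for k in ship_date_keys)
]
```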

Look for JSON example with all allowed combinations of structure in max depth 2 or 3

I've written a program which processes JSON objects. Now I want to verify that I haven't missed anything.
Is there a JSON example of all allowed JSON structure combinations? Something like this:
{
"key1" : "value",
"key2" : 1,
"key3" : {"key1" : "value"},
"key4" : [
[
"string1",
"string2"
],
[
1,
2
],
...
],
"key5" : true,
"key6" : false,
"key7" : null,
...
}
As you can see at http://json.org/ on the right-hand side, the grammar of JSON isn't very difficult, but I got several exceptions because I had forgotten to handle some structure combinations that are possible. E.g., inside an array there can be string, number, object, array, true, false, or null, but my program couldn't handle arrays inside an array until I ran into an exception. So everything was fine until I got a valid JSON object with arrays inside an array.
I want to test my program with a JSON object (which I'm looking for). After this test I want to feel certain that my program handles every possible valid JSON structure on earth without an exception.
I don't need nesting to depth 5 or so; I only need something nested to depth 2, or at most 3, with each base type nesting all the allowed base types.
Have you thought of escaped characters and objects within an object?
{
"key1" : {
"key1" : "value",
"key2" : [
"String1",
"String2"
]
},
"key2" : "\"This is a quote\"",
"key3" : "This contains an escaped slash: \\",
"key4" : "This contains accent charachters: \u00eb \u00ef",
}
Note: \u00eb and \u00ef are respectively the characters ë and ï.
Choose a programming language that supports JSON.
Try to load your JSON; on failure, the exception's message is descriptive.
Example:
Python:
import json, sys
json.loads(open(sys.argv[1]).read())
Generate:
import random, json, string

# Each helper takes the remaining nesting depth; the leaf generators ignore it.
def json_null(depth = 0):
    return None

def json_int(depth = 0):
    return random.randint(-999, 999)

def json_float(depth = 0):
    return random.uniform(-999, 999)

def json_string(depth = 0):
    return ''.join(random.sample(string.printable, random.randrange(10, 40)))

def json_bool(depth = 0):
    return random.randint(0, 1) == 1

def json_list(depth):
    # Containers recurse with a strictly smaller depth.
    lst = []
    if depth:
        for i in range(random.randrange(8)):
            lst.append(gen_json(random.randrange(depth)))
    return lst

def json_object(depth):
    obj = {}
    if depth:
        for i in range(random.randrange(8)):
            obj[json_string()] = gen_json(random.randrange(depth))
    return obj

def gen_json(depth = 8):
    # While depth remains, pick a container type; at depth 0, pick a leaf value.
    if depth:
        return random.choice([json_list, json_object])(depth)
    else:
        return random.choice([json_null, json_int, json_float, json_string, json_bool])(depth)

print(json.dumps(gen_json(), indent = 2))

Using RJSONIO and AsIs class

I am writing some helper functions to convert my R variables to JSON. I've come across this problem: I would like my values to be represented as JSON arrays, this can be done using the AsIs class according to the RJSONIO documentation.
x = "HELLO"
toJSON(list(x = I(x)), collapse="")
"{ \"x\": [ \"HELLO\" ] }"
But say we have a list
y = list(a = "HELLO", b = "WORLD")
toJSON(list(y = I(y)), collapse="")
"{ \"y\": {\n \"a\": \"HELLO\",\n\"b\": \"WORLD\" \n} }"
The value found in y -> a is NOT represented as an array. Ideally I would have
"{ \"y\": [{\n \"a\": \"HELLO\",\n\"b\": \"WORLD\" \n}] }"
Note the square brackets. Also I would like to get rid of all "\n"s, but collapse does not eliminate the line breaks in nested JSON. Any ideas?
Try writing it as:
y = list(list(a = "HELLO", b = "WORLD"))
test<-toJSON(list(y = I(y)), collapse="")
When you write it to a file, it appears as:
{ "y": [
{
"a": "HELLO",
"b": "WORLD"
}
] }
I guess you could remove the \n as
test<-gsub("\n","",test)
or use RJSON package
> rjson::toJSON(list(y = I(y)))
[1] "{\"y\":[{\"a\":\"HELLO\",\"b\":\"WORLD\"}]}"
The reason
> names(list(a = "HELLO", b = "WORLD"))
[1] "a" "b"
> names(list(list(a = "HELLO", b = "WORLD")))
NULL
Examining rjson::toJSON, you will find this snippet of code:
if (!is.null(names(x)))
return(toJSON(as.list(x)))
str = "["
so it would appear to need an unnamed list to treat it as a JSON array. Maybe RJSONIO is similar.
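This named-versus-unnamed distinction mirrors how most JSON serializers choose between object and array; for comparison only, in Python a dict serializes as a JSON object while a list serializes as a JSON array:

```python
import json

y = {"a": "HELLO", "b": "WORLD"}

as_object = json.dumps(y)    # a named structure serializes as a JSON object
as_array = json.dumps([y])   # wrapping it in a list yields the [ ... ] form
```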