Mule condense data based on a category - json

Example below. I've got a set of account numbers, each with an account attribute. Each account number can have balances in up to three categories, and I would like to sum the balances per category for each account number in DataWeave.
Data input
[
  {
    Account_Number: 1,
    Account: 5,
    Category: "A",
    Balance: 500
  },
  {
    Account_Number: 1,
    Account: 5,
    Category: "A",
    Balance: 700
  },
  {
    Account_Number: 1,
    Account: 5,
    Category: "B",
    Balance: 300
  },
  {
    Account_Number: 1,
    Account: 5,
    Category: "C",
    Balance: 100
  },
  {
    Account_Number: 2,
    Account: 10,
    Category: "B",
    Balance: 300
  },
  {
    Account_Number: 2,
    Account: 10,
    Category: "B",
    Balance: 800
  }
]
Data Output
[
  {
    Account_Number: 1,
    Account: 5,
    CategoryA_Balance: 1200,
    CategoryB_Balance: 300,
    CategoryC_Balance: 100
  },
  {
    Account_Number: 2,
    Account: 10,
    CategoryA_Balance: 0,
    CategoryB_Balance: 1100,
    CategoryC_Balance: 0
  }
]

I assume the categories are dynamic. If not, you can replace the categories variable with a static array.
%dw 2.0
output application/json
var byAcctNbr = payload groupBy ($.Account_Number)
var categories = payload..Category distinctBy $
---
keysOf(byAcctNbr) map ((acctNbr) ->
    do {
        var item = byAcctNbr[acctNbr]
        var outItem = (item[0] default {}) - "Balance" - "Category"
        var balances = categories reduce ((category, acc={}) ->
            do {
                var accounts = item filter ($.Category == category)
                ---
                acc ++ (
                    ("Category" ++ category ++ "_Balance"): if (isEmpty(accounts)) 0
                        else sum (accounts.Balance)
                )
            })
        ---
        outItem ++ balances
    }
)

A similar solution to sudhish's, broken down for easier understanding:
distinctBy: payload..Category returns every category present in the input; distinctBy removes the duplicates, leaving [A,B,C].
groupBy: groups the records by account number.
(item[0] - "Balance" - "Category"): Account_Number and Account are needed only once per group, so take item[0] and use "-" to drop Balance and Category, which are handled by the conditional logic further down.
pluck: converts the object keyed by account number into an array.
map: iterates over the records of each account number.
map over the categories: yields [A,B,C] for every account number.
filter: keeps the records in the current group whose Category matches the category being processed. If any match, the result is sum(Balance); otherwise it is 0.
sum: adds up the balances of the records selected by filter.
%dw 2.0
output application/json
var categories = payload..Category distinctBy $
---
payload groupBy $.Account_Number pluck $ map(item,index)->{
    (item[0] - "Balance" - "Category"),
    (categories map (cat)->{
        ("Category" ++ cat ++ "_Balance"):
            if (isEmpty(item filter ($.Category == cat)))
                0
            else
                sum((item filter ($.Category == cat)).Balance)
    })
}
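
For readers who want to check the expected output outside of Mule, the same distinct / group / filter / sum steps can be sketched in Python. This is only an illustration of the logic, not DataWeave; the function name condense and the variable sample are made up for this example, and the data is the input from the question.

```python
from collections import defaultdict


def condense(records):
    """Sum Balance per Category for each Account_Number."""
    # Distinct categories in first-seen order (the distinctBy step).
    categories = list(dict.fromkeys(r["Category"] for r in records))

    # Group the records by account number (the groupBy step).
    groups = defaultdict(list)
    for r in records:
        groups[r["Account_Number"]].append(r)

    result = []
    for items in groups.values():
        # Keep the shared fields once, dropping Balance and Category.
        out = {k: v for k, v in items[0].items()
               if k not in ("Balance", "Category")}
        for cat in categories:
            # sum() over no matches is 0, so no explicit empty check is needed.
            out[f"Category{cat}_Balance"] = sum(
                r["Balance"] for r in items if r["Category"] == cat
            )
        result.append(out)
    return result


sample = [
    {"Account_Number": 1, "Account": 5, "Category": "A", "Balance": 500},
    {"Account_Number": 1, "Account": 5, "Category": "A", "Balance": 700},
    {"Account_Number": 1, "Account": 5, "Category": "B", "Balance": 300},
    {"Account_Number": 1, "Account": 5, "Category": "C", "Balance": 100},
    {"Account_Number": 2, "Account": 10, "Category": "B", "Balance": 300},
    {"Account_Number": 2, "Account": 10, "Category": "B", "Balance": 800},
]

for row in condense(sample):
    print(row)
```

Account 2 has no records in categories A and C, so those two balances come out as 0, matching the desired output in the question.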


Neither 1D array, nor 2D array working for setValues() in google scripts

I'm trying to add column headings to a sheet if they are needed. In this test data, there are two new column headings I'm trying to add. I've tried using both a 1D array and a 2D array, but neither is working. I'm trying to use answers like this one on SO. I must be missing something simple.
Here's what I tried first:
163. console.log("aMissing_Contracts (1D Array): " + aMissing_Contracts);
164. let targetContractsRange = targetSheet.getRange(1,targetLastColumn + 1, 1, 2);
165. targetContractsRange.setValues(aMissing_Contracts)
166. exit();
But the console showed:
5:49:22 PM Notice Execution started
5:49:23 PM Info aMissing_Contracts (1D Array): extra contract 2,extra contract 1
5:49:23 PM Error
Exception: The parameters (number[]) don't match the method signature for SpreadsheetApp.Range.setValues.
controller @ GetBalances.gs:165
So, I tried:
// get range of current contracts
if (aMissing_Contracts.length >0) { // are there any missing contracts?
    /** turn aMissing_contracts into 2d array: aTwoDMissing_Contracts */
    var aTwoDMissing_Contracts = [];
    for (var d = aMissing_Contracts.length - 1; d >= 0; d--){
        var aTempArray = [];
        aTempArray[0] = aMissing_Contracts[d]
        aTwoDMissing_Contracts.push(aTempArray);
    };
    let targetContractsRange = targetSheet.getRange(1,targetLastColumn + 1, 1, 2);
    console.log(aTwoDMissing_Contracts)
    console.log("the range: " + targetContractsRange.getA1Notation());
    console.log("the missing contracts: " + aTwoDMissing_Contracts);
183. targetContractsRange.setValues(aTwoDMissing_Contracts);
};
and I got this in the console:
6:05:57 PM Info [ [ 'extra contract 1' ], [ 'extra contract 2' ] ]
6:05:57 PM Info the range: AZ1:BA1
6:05:57 PM Info the missing contracts: extra contract 1,extra contract 2
6:05:57 PM Error
Exception: The number of rows in the data does not match the number of rows in the range. The data has 2 but the range has 1.
controller @ GetBalances.gs:183
AZ1:BA1 is one row deep and the array only has two sub-arrays. What am I missing?
The array [ [ 'extra contract 1' ], [ 'extra contract 2' ] ] has two rows and one column, so instead of
let targetContractsRange = targetSheet.getRange(1,targetLastColumn + 1, 1, 2);
use
let targetContractsRange = targetSheet.getRange(1,targetLastColumn + 1, 2, 1);
Or change the array shape to
[ [ 'extra contract 1' , 'extra contract 2' ] ]
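
The point is that the outer array holds rows and each inner array holds the cells of one row, so the data's rows x columns must equal the numRows and numColumns passed to getRange(row, column, numRows, numColumns). A small Python sketch of the two shapes (Python is used only to illustrate the shapes; the helper names as_row, as_column and shape are invented for this example and are not part of Apps Script):

```python
def as_row(values):
    """Wrap a flat list as a single row: 1 row x len(values) columns."""
    return [list(values)]


def as_column(values):
    """Put each value in its own row: len(values) rows x 1 column."""
    return [[v] for v in values]


def shape(grid):
    """Return (rows, columns) of a rectangular list of lists."""
    return (len(grid), len(grid[0]) if grid else 0)


missing = ["extra contract 1", "extra contract 2"]

print(as_row(missing), shape(as_row(missing)))        # fits a 1 x 2 range such as AZ1:BA1
print(as_column(missing), shape(as_column(missing)))  # fits a 2 x 1 range
```

For column headings written across row 1, the single-row shape is the one that matches the original getRange(1, targetLastColumn + 1, 1, 2) call.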

Groovy: compare two lazy maps/jsons

I have two JSONs/lazy maps in the format shown below. I now need to compare them to find whether there is any difference between them. The reason I combine each set of values into a string is so that the comparison becomes faster, as my actual inputs (i.e. JSON messages) are going to be really large.
reqJson:
[["B1": 100, "B2": 200, "B3": 300, "B4": 400],["B1": 500, "B2": 600, "B3": 700, "B4": 800], ["B1": 900, "B2": 1000, "B3": 2000, "B4": 3000], ["B1": 4000, "B2": 5000, "B3": 6000, "B4": 7000]]
respJson:
[["B1": 100, "B2": 200, "B3": 300, "B4": 400],["B1": 500, "B2": 600, "B3": 700, "B4": 800], ["B1": 900, "B2": 1000, "B3": 2000, "B4": 3000], ["B1": 4000, "B2": 5000, "B3": 6000, "B4": 7000], ["B1": 8000, "B2": 9000, "B3": 10000, "B4": 11000]]
My code looks something like what is shown below, but somehow I am unable to get the desired result, and I cannot figure out what is going wrong. I take each value from the response JSON and compare it with any value in the request JSON to find whether there is a difference.
def diffCounter = 0
Set diffSet = []
respJson.each { respJ ->
    reqJson.any {
        reqJ ->
        if (respJ.B1+respJ.B2+respJ.B3+respJ.B4 != reqJ.B1+reqJ.B2+reqJ.B3+reqJ.B4) {
            diffCounter += 1
            diffSet << [
                "B1" : respJ.B1,
                "B2" : respJ.B2,
                "B3" : respJ.B3,
                "B4" : respJ.B4
            ]
        }
    }
}
println ("Difference Count: "+ diffCounter)
println ("Difference Set: "+ diffSet)
Actual Output:
Difference Count: 5
Difference Set: [[B1:100, B2:200, B3:300, B4:400], [B1:500, B2:600, B3:700, B4:800], [B1:900, B2:1000, B3:2000, B4:3000], [B1:4000, B2:5000, B3:6000, B4:7000], [B1:8000, B2:9000, B3:10000, B4:11000]]
Expected Output:
Difference Count: 1
Difference Set: [["B1": 8000, "B2": 9000, "B3": 10000, "B4": 11000]]
NOTE: It can also happen that the request-json is bigger than the response-json so in that case I need to store the difference obtained from request-json into the diffSet.
Any inputs/suggestions in this regard will be helpful.
As @daggett mentioned, if your JSONs become more nested/complicated, you will want to use a library to do this job for you.
In your use case of pure Lists of elements (with values that can be concatenated/added to form a unique key for that element) there is no problem with doing it 'manually'.
The problem with your code is that you check whether any reqJson entry has a different sum, which is always true as soon as reqJson contains 2+ different entries.
What you really want to check is whether there is any matching reqJson entry that has the same sum. If you can't find any matching entry, then you know that entry only exists in respJson.
def diffCounter = 0
Set diffSet = []
respJson.each { respJ ->
    def foundMatching = reqJson.any { reqJ ->
        respJ.B1 + respJ.B2 + respJ.B3 + respJ.B4 == reqJ.B1 + reqJ.B2 + reqJ.B3 + reqJ.B4
    }
    if (!foundMatching) {
        diffCounter += 1
        diffSet << [
            "B1" : respJ.B1,
            "B2" : respJ.B2,
            "B3" : respJ.B3,
            "B4" : respJ.B4
        ]
    }
}
println ("Difference Count: "+ diffCounter)
println ("Difference Set: "+ diffSet)
You mention that reqJson can become bigger than respJson and that in that case you want to switch the roles of the two arrays in the comparison, so that you always get the unmatched elements from the larger array. A trick to do this is to start by swapping the two variables around.
if (reqJson.size() > respJson.size()) {
    (reqJson, respJson) = [respJson, reqJson]
}
Note that the time complexity of this algorithm is O(m * n * 2i): it grows with the product of the sizes of the two arrays (m and n, here 5 and 4), times the number of property accesses made per comparison on both elements (i per element, here 4 because there are 4 Bs), because each element of the bigger array is potentially compared against every element of the smaller array.
So if the arrays are tens of thousands of elements long, this will become very slow. A simple way to speed it up to O(m * i + n * i) would be to:
make a Set smallArrayKeys out of the concatenated messages/added values of the smaller array
iterate through the bigger array, check whether its concatenated message is contained in the smallArrayKeys Set, and if not, then it only exists in the bigger array.
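
The two steps above can be sketched as follows. The sketch is in Python rather than Groovy, and the names key_of and unmatched are invented for this example. One deliberate difference from the code above: the key is a tuple of the four values instead of their sum, on the assumption that two different entries should never be treated as equal just because their values add up to the same number.

```python
def key_of(entry, fields=("B1", "B2", "B3", "B4")):
    """Build a hashable key from the entry's values (a tuple, not a sum)."""
    return tuple(entry[f] for f in fields)


def unmatched(req, resp):
    """Return the entries of the bigger list that have no match in the smaller one."""
    small, big = (req, resp) if len(req) <= len(resp) else (resp, req)
    # One pass over the smaller list to build the lookup set.
    small_keys = {key_of(e) for e in small}
    # One pass over the bigger list; each set lookup is O(1) on average.
    return [e for e in big if key_of(e) not in small_keys]


req_json = [
    {"B1": 100, "B2": 200, "B3": 300, "B4": 400},
    {"B1": 500, "B2": 600, "B3": 700, "B4": 800},
    {"B1": 900, "B2": 1000, "B3": 2000, "B4": 3000},
    {"B1": 4000, "B2": 5000, "B3": 6000, "B4": 7000},
]
resp_json = req_json + [{"B1": 8000, "B2": 9000, "B3": 10000, "B4": 11000}]

diff = unmatched(req_json, resp_json)
print("Difference Count:", len(diff))
print("Difference Set:", diff)
```

Because the function picks the smaller list itself, the result is the same whichever of the two lists is passed first.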

Querying to parent and children to a JSON format from MySQL 5.6?

I have a hierarchy of tables in a MySQL 5.6 database that I need to query into a JSON format for use by a JavaScript tree structure.
Just as a test in my Flask app, I did the following for just the top level:
import mysql.connector
from flask import jsonify

def get_all_customers():
    response_object = {'status': 'success'}
    cnx = mysql.connector.connect(user="", password="", database="", host="localhost", port=3306)
    cursor = cnx.cursor()
    cursor.execute('SELECT idx, name FROM listcustomers ORDER BY name')
    data = []
    for idx, name in cursor:
        data.append({'id': idx, 'label':name, 'otherProp': "Customer"})
    response_object['customers'] = data
    return jsonify(response_object)
which returns
[
  { id: 1,
    label: "customer 1",
    otherProp: "Customer"
  },
  ...
]
But each customer has locations, and each location has areas, and each area has assets, and each asset has projects, and I need to also query them into children of this json object. So, for example, just going one level deeper to locations, I would need something like this -
[
  { id: 1,
    label: "customer 1",
    otherProp: "Customer",
    children: [
      {
        id: 5,
        label: "location 5",
        otherProp: "Location"
      },
      ...
    ]
  },
  ...
]
where, in my database, listlocations links to listcustomers via its parentCustomerId column. How can I manage this? Eventually this tree will have about 13,000 objects, so I know just querying the data and then parsing it with Python would be far less efficient than if I am able to query properly to begin with.
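
For reference, the nesting described above can be assembled from flat rows in one pass per table, which may be acceptable for a tree of around 13,000 objects (that is an untested assumption, not a measurement). The sketch below takes plain tuples instead of a database cursor. The function name build_tree is invented, and it assumes the location rows come from a query such as SELECT idx, name, parentCustomerId FROM listlocations, where the idx and name columns of listlocations are guesses based on listcustomers.

```python
def build_tree(customers, locations):
    """Nest location rows under their customer rows.

    customers: iterable of (idx, name) rows
    locations: iterable of (idx, name, parent_customer_id) rows
    """
    nodes = {}  # customer id -> node, for O(1) parent lookup
    tree = []
    for idx, name in customers:
        node = {"id": idx, "label": name, "otherProp": "Customer", "children": []}
        nodes[idx] = node
        tree.append(node)

    for idx, name, parent_id in locations:
        parent = nodes.get(parent_id)
        if parent is not None:  # skip locations whose customer is missing
            parent["children"].append(
                {"id": idx, "label": name, "otherProp": "Location"}
            )
    return tree


customer_rows = [(1, "customer 1"), (2, "customer 2")]
location_rows = [(5, "location 5", 1), (6, "location 6", 1), (7, "location 7", 2)]

print(build_tree(customer_rows, location_rows))
```

Deeper levels (areas, assets, projects) could be attached the same way, by keeping a dictionary of the nodes of each level and looking up each row's parent in the dictionary of the level above.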

RoR, need to create recursive relationships in table associations

I'm very new to rails, and am a little stuck on the logic for this problem.
I have one table (using mysql) of employees, each of them with a manager_id key which refers to the employee they report to. So for example the employee with the title of "CEO" with an id of 1, has a manager_id of nil, and the employee with title of "CTO" has a manager_id of 1. So my records look like this
id: 1, first_name: "Bob", last_name: "Boss", title: "CEO", manager_id: null
id: 2, first_name: "Pat", last_name: "Guy", title: "CTO", manager_id: 1
id: 3, first_name: "John", last_name: "Dude", title: "VP of engineering", manager_id: 2
and my JSON structure should look like this
[
{id: 1, first_name: "Bob", last_name: "Boss", title: "CEO", manager_id: null, descendents: [
{id: 2, first_name: "Pat", last_name: "Guy", title: "CTO", manager_id: 1, descendents: [
{id: 3, first_name: "John", last_name: "Dude", title: "VP of engineering", manager_id: 2, descendents: [....]}
]},
{..more CEO descendents...}
]
I'm trying to create a nested JSON structure that starts at the CEO, lists all employees that report to them, and each of those employees' descendants. I was trying to write a script that creates this, but I keep getting infinite recursive calls. This is what I have:
#start at some root
@root = Employee.find_by title: 'CEO'
#convert to hash table
@results[0] = @root.attributes
#add direct_reports key
@results[0]["direct_reports"] = []
def getBelow(root=@root)
  @reports = Employee.where("manager_id = ?", @root[:id])
  if @reports.blank?
    return []
  else
    @reports = @reports.map(&:attributes)
    @reports.each do |person|
      person["direct_reports"] = []
      getBelow(person)
    end
    @reports = Employee.where("manager_id = ?", @root[:id])
    root["direct_reports"] = @reports
  end
  return @root
end
@list = getBelow(@results[0])
If I'm passing in each new person object, shouldn't they all eventually end when @reports.blank? becomes true?
An alternative I was thinking of was to use table associations inspired by this blog post
https://hashrocket.com/blog/posts/recursive-sql-in-activerecord
but that seems a little too complicated.
Some issues in the getBelow method:
You are always using @root instead of the param (root), so you always start again from the 'CEO'.
You are calling getBelow recursively, but you are not using the result.
You call @reports = Employee.where("manager_id = ?", @root[:id]) twice.
You return @root.
As Jorge Najera said, there are gems that handle a tree structure easily. If you want to build it on your own, this is my suggestion:
#defined first so that the call further down can find it
def getBelow(manager)
  branch = []
  Employee.where("manager_id = ?", manager.id).each do |employee|
    node = employee.attributes
    node["direct_reports"] = getBelow(employee)
    branch << node
  end
  branch
end

#start at some root
@root = Employee.find_by manager_id: nil
#convert to hash table
@tree = @root.attributes
#add direct_reports key
@tree["direct_reports"] = getBelow(@root)
This was not tested, so I think you'll get some errors, but I believe the idea is fine.
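
To see the shape of the recursion without a database, here is the same idea as a small Python sketch over plain dictionaries. It is not Rails code: the function name build_reports_tree is invented, the records are the three sample rows from the question, and it assumes the manager_id links contain no cycles (a cycle would recurse forever). Employees are indexed by manager_id once, so each level is a dictionary lookup rather than a new query.

```python
from collections import defaultdict


def build_reports_tree(employees):
    """Return the top-level employees, each with nested direct_reports."""
    # Index employees by the id of the person they report to.
    by_manager = defaultdict(list)
    for e in employees:
        by_manager[e["manager_id"]].append(e)

    def below(manager_id):
        # Use the function parameter, and use the result of the recursive call.
        return [
            dict(e, direct_reports=below(e["id"]))
            for e in by_manager.get(manager_id, [])
        ]

    # Employees with no manager (manager_id is None) are the roots.
    return below(None)


employees = [
    {"id": 1, "first_name": "Bob", "last_name": "Boss", "title": "CEO", "manager_id": None},
    {"id": 2, "first_name": "Pat", "last_name": "Guy", "title": "CTO", "manager_id": 1},
    {"id": 3, "first_name": "John", "last_name": "Dude", "title": "VP of engineering", "manager_id": 2},
]

tree = build_reports_tree(employees)
print(tree)
```

The recursion ends naturally at employees nobody reports to, because the lookup for their id returns an empty list.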

Python3 json output values to file line by line only if other fields are greater than value

I have retrieved remote JSON using urllib.request in Python 3 and would like to dump, line by line, only the IP addresses (i.e. ip:127.0.0.1 would be 127.0.0.1, with the next IP on the next line) if they match certain criteria. Other key values include a score (one integer value per category) and category (one or more string values possible).
I want to check whether the score is higher than, say, 10 AND the category matches one of a list of one OR more values. If an entry fits these params, I just need its IP address added, line by line, to a text file.
Here is how I retrieve the json:
ip_fetch = urllib.request.urlopen('https://testonly.com/ip.json').read().decode('utf8')
I have the json module loaded, but don't know where to go from here.
Example of json data I'm working with, more than one category:
"127.0.0.1" : {
"Test" : "10",
"Prod" : "20"
},
I wrote a simple example that should show you how to iterate through JSON objects and how to write to a file:
import json
j = json.loads(test)
threshold = 10
validCategories = ["Test"]
f=open("test.txt",'w')
for ip, categories in j.items():
    addToList = False
    for category, rank in categories.items():
        if category in validCategories and int(rank) >= threshold:
            addToList = True
    if addToList:
        f.write("{}\n".format(ip))
f.close()
I hope that helps you to get started. For testing I used the following json-string:
test = """
{
    "127.0.0.1" : {
        "Test" : "10",
        "Prod" : "20"
    },
    "127.0.0.2" : {
        "Test" : "5",
        "Prod" : "20"
    },
    "127.0.0.3" : {
        "Test" : "5",
        "Prod" : "5",
        "Test2": "20"
    }
}
"""